Note: I have commented out the /consul/data storage; configure an appropriate volume yourself if you need persistence.
The basic idea is to start three servers first, which automatically join each other through the consul-server Service, and to use pod anti-affinity so that each node runs at most one consul-server, giving real high availability.
The consul-client agents are then started and join the cluster through the same consul-server Service.
server
apiVersion: v1
kind: Service
metadata:
  namespace: $(namespace)
  name: consul-server
  labels:
    name: consul-server
spec:
  ports:
    - name: http
      port: 8500
    - name: serflan-tcp
      protocol: "TCP"
      port: 8301
    - name: serfwan-tcp
      protocol: "TCP"
      port: 8302
    - name: server
      port: 8300
    - name: consuldns
      port: 8600
  selector:
    app: consul
    consul-role: server
---
# kgpo -l app=consul
# kgpo -l app=consul -o wide -w
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: consul-server
spec:
  updateStrategy:
    rollingUpdate:
      partition: 0
    type: RollingUpdate
  serviceName: consul-server
  replicas: 3
  template:
    metadata:
      labels:
        app: consul
        consul-role: server
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                topologyKey: "kubernetes.io/hostname"
                namespaces:
                  - $(namespace)
                labelSelector:
                  matchExpressions:
                    - key: 'consul-role'
                      operator: In
                      values:
                        - "server"
      terminationGracePeriodSeconds: 10
      securityContext:
        fsGroup: 1000
      containers:
        - name: consul
          image: "consul:1.4.2"
          imagePullPolicy: Always
          resources:
            requests:
              memory: 500Mi
          env:
            - name: POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
          args:
            - "agent"
            - "-advertise=$(POD_IP)"
            - "-bind=0.0.0.0"
            - "-bootstrap-expect=3"
            - "-retry-join=consul-server"
            - "-client=0.0.0.0"
            - "-datacenter=dc1"
            - "-data-dir=/consul/data"
            - "-domain=cluster.local"
            - "-server"
            - "-ui"
            - "-disable-host-node-id"
            - '-recursor=114.114.114.114'
          # volumeMounts:
          #   - name: data
          #     mountPath: /consul/data
          lifecycle:
            preStop:
              exec:
                command:
                  - /bin/sh
                  - -c
                  - consul leave
          ports:
            - containerPort: 8500
              name: ui-port
            - containerPort: 8400
              name: alt-port
            - containerPort: 53
              name: udp-port
            - containerPort: 8301
              name: serflan
            - containerPort: 8302
              name: serfwan
            - containerPort: 8600
              name: consuldns
            - containerPort: 8300
              name: server
  # volumeClaimTemplates:
  #   - metadata:
  #       name: data
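Before moving on to the clients, it is worth checking that the three servers actually formed a cluster and that the anti-affinity spread them across nodes. A quick sanity check, assuming kubectl already points at the right cluster and namespace:

# confirm the server pods landed on different nodes
kubectl get pods -l app=consul,consul-role=server -o wide
# membership and raft peers as seen from the first server
kubectl exec consul-server-0 -- consul members
kubectl exec consul-server-0 -- consul operator raft list-peers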
client
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: consul-client
spec:
  updateStrategy:
    rollingUpdate:
      partition: 0
    type: RollingUpdate
  serviceName: consul-client
  replicas: 10
  template:
    metadata:
      labels:
        app: consul
        consul-role: client
    spec:
      terminationGracePeriodSeconds: 10
      securityContext:
        fsGroup: 1000
      containers:
        - name: consul
          image: "consul:1.4.2"
          imagePullPolicy: Always
          resources:
            requests:
              memory: 500Mi
          env:
            - name: podname
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
          args:
            - agent
            - -ui
            - -retry-join=consul-server
            - -node=$(podname)
            - -bind=0.0.0.0
            - -client=0.0.0.0
            - '-recursor=114.114.114.114'
          # volumeMounts:
          #   - name: data
          #     mountPath: /consul/data
          lifecycle:
            preStop:
              exec:
                command:
                  - /bin/sh
                  - -c
                  - consul leave
          readinessProbe:
            # NOTE(mitchellh): when our HTTP status endpoints support the
            # proper status codes, we should switch to that. This is temporary.
            exec:
              command:
                - "/bin/sh"
                - "-ec"
                - |
                  curl http://127.0.0.1:8500/v1/status/leader 2>/dev/null | \
                  grep -E '".+"'
          ports:
            - containerPort: 8301
              name: serflan
            - containerPort: 8500
              name: ui-port
            - containerPort: 8600
              name: consuldns
---
apiVersion: v1
kind: Service
metadata:
  namespace: $(namespace)
  name: consul-client
  labels:
    name: consul-client
    consul-role: consul-client
spec:
  ports:
    - name: serflan-tcp
      protocol: "TCP"
      port: 8301
    - name: http
      port: 8500
    - name: consuldns
      port: 8600
  selector:
    app: consul
    consul-role: client
UI
Every agent started with the -ui flag can serve the web UI; remember that it listens on port 8500. I will not walk through a full example here.
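For completeness, one easy way to reach the UI from a workstation is a port-forward (assuming kubectl access and the default namespace); the UI is then at http://localhost:8500/ui:

# forward the UI port of one server pod, or go through the Service
kubectl port-forward consul-server-0 8500:8500
kubectl port-forward svc/consul-server 8500:8500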
Shortcomings
The restart mechanism is not well thought out. A livenessProbe should be configured on the servers so that an agent restarts automatically once it has left the cluster; in practice this is not a big deal, though, because Consul itself is quite stable and rarely misbehaves.
The real gap is consul-client: once a client notices it has lost its server nodes it should simply restart and rejoin, but I never implemented that.
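For reference, the check such a livenessProbe (or a client-side restart) could run is essentially the same as the readinessProbe above: ask the local agent whether it still knows a raft leader. A sketch, assuming the agent's HTTP API is listening on 127.0.0.1:8500:

# exits non-zero (and would trigger a restart) once the agent has lost its servers
curl -sf http://127.0.0.1:8500/v1/status/leader 2>/dev/null | grep -E '".+"'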
Other issues
Encrypted communication
Consul can also encrypt the traffic between agents, but my earlier attempt to configure this on the clients failed, which is a pity. Encryption needs quite a bit of extra configuration and is fiddly, so I fell back to unencrypted communication.
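For the record, the gossip-encryption route I was attempting boils down to sharing one key across every agent; a rough sketch of that setup (the Secret name is my own placeholder, and this is exactly the part I never got working end to end):

# generate a gossip key once and keep it in a Secret
kubectl create secret generic consul-gossip-key --from-literal=key="$(consul keygen)"
# every agent, server and client alike, then needs the extra argument
#   -encrypt=<that key>, e.g. injected through an env var backed by the Secret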
Deregistration failures
I have run into this many times: some services have to be deregistered manually three times (possibly because of my server nodes), and some stubborn services refuse to deregister no matter how many attempts you make, which is rather frustrating.
Consul becomes very slow
With Consul's architecture, the servers really must be kept separate from the clients. If services register directly against the servers, the servers also end up running the health checks, and the whole cluster becomes extremely sluggish. I tried to lighten the load by deregistering services, but that failed too; in the end I migrated the configuration and rebuilt a fresh Consul cluster, which was quite painful.
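If you do route registrations through the client agents, the call goes to the HTTP API of the local agent rather than a server. A hedged sketch (service name, port and health endpoint are made up for illustration):

# register a service plus its health check against the local client agent
curl -X PUT -d '{
  "Name": "demo-svc",
  "Port": 8080,
  "Check": { "HTTP": "http://127.0.0.1:8080/health", "Interval": "10s" }
}' http://127.0.0.1:8500/v1/agent/service/register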
Common APIs
# Deregister a service
PUT /v1/agent/service/deregister/<serviceid>
# Fetch a configuration value
GET /v1/kv/goms/config/<config>
# List services registered on the agent
GET /v1/agent/services
# Check leader/node status
GET /v1/status/leader
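The same endpoints as curl commands against a local client agent (placeholders left as-is; the optional ?raw on the KV call returns just the stored value):

curl -X PUT http://127.0.0.1:8500/v1/agent/service/deregister/<serviceid>
curl http://127.0.0.1:8500/v1/kv/goms/config/<config>?raw
curl http://127.0.0.1:8500/v1/agent/services
curl http://127.0.0.1:8500/v1/status/leader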
Reference links
https://github.com/hashicorp/consul-helm
💬 Discussion
Have thoughts on this post? Start a discussion on GitHub.