Linux系统高可用之HA-Cluster

[TOC]

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563

HA Cluster:

集群类型:LB(lvs/nginx(http/upstream, stream/upstream))、HA、HP

SPoF: Single Point of Failure

系统可用性的公式:A=MTBF/(MTBF+MTTR)
(0,1), 95%
几个9(指标): 99%, ..., 99.999%,99.9999%;
99%: %1, 99.9%, 0.1%

系统故障:
硬件故障:设计缺陷、wear out、自然灾害、……
软件故障:设计缺陷、

提升系统高用性的解决方案之降低MTTR:

手段:冗余(redundant)

active/passive(主备),active/active(双主)
active --> HEARTBEAT --> passive
active <--> HEARTBEAT <--> active

高可用的是“服务”:
HA nginx service:
vip/nginx process[/shared storage]

资源:组成一个高可用服务的“组件”;

(1) passive node的数量?
(2) 资源切换?

shared storage:
NAS:文件共享服务器;
SAN:存储区域网络,块级别的共享;

Network partition:网络分区
隔离设备:
node:STONITH = Shooting The Other Node In The Head
资源:fence

quorum:
with quorum: > total/2
without quorum: <= total/2

TWO nodes Cluster?
辅助设备:ping node, quorum disk;

Failover:故障切换,即某资源的主节点故障时,将资源转移至其它节点的操作;
Failback:故障移回,即某资源的主节点故障后重新修改上线后,将转移至其它节点的资源重新切回的过程;

HA Cluster实现方案:
vrrp协议的实现
keepalived
ais:完备HA集群
RHCS(cman)
heartbeat
corosync

keepalived:

vrrp协议:Virtual Redundant Routing Protocol
术语:
虚拟路由器:Virtual Router
虚拟路由器标识:VRID(0-255)
物理路由器:
master:主设备
backup:备用设备
priority:优先级
VIP:Virtual IP
VMAC:Virutal MAC (00-00-5e-00-01-VRID)

通告:心跳,优先级等;周期性;

抢占式,非抢占式;

安全工作:
认证:
无认证
简单字符认证
MD5

工作模式:
主/备:单虚拟路径器;
主/主:主/备(虚拟路径器1),备/主(虚拟路径器2)

keepalived:
vrrp协议的软件实现,原生设计的目的为了高可用ipvs服务:
vrrp协议完成地址流动;
为vip地址所在的节点生成ipvs规则(在配置文件中预先定义);
为ipvs集群的各RS做健康状态检测;
基于脚本调用接口通过执行脚本完成脚本中定义的功能,进而影响集群事务;

组件:
核心组件:
vrrp stack
ipvs wrapper
checkers
控制组件:配置文件分析器
IO复用器
内存管理组件

HA Cluster的配置前提:
(1) 各节点时间必须同步;
ntp, chrony
(2) 确保iptables及selinux不会成为阻碍;
(3) 各节点之间可通过主机名互相通信(对KA并非必须);
建议使用/etc/hosts文件实现;
(4) 各节点之间的root用户可以基于密钥认证的ssh服务完成互相通信;(并非必须)

keepalived安装配置:
CentOS 6.4+

程序环境:
主配置文件:/etc/keepalived/keepalived.conf
主程序文件:/usr/sbin/keepalived
Unit File:keepalived.service
Unit File的环境配置文件:/etc/sysconfig/keepalived

配置文件组件部分:
TOP HIERACHY
GLOBAL CONFIGURATION
Global definitions
Static routes/addresses
VRRPD CONFIGURATION
VRRP synchronization group(s):vrrp同步组;
VRRP instance(s):每个vrrp instance即一个vrrp路由器;
LVS CONFIGURATION
Virtual server group(s)
Virtual server(s):ipvs集群的vs和rs;

单主配置示例:
! Configuration File for keepalived

global_defs {
notification_email {
root@localhost
}
notification_email_from keepalived@localhost
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id node1
vrrp_mcast_group4 224.0.100.19
}

vrrp_instance VI_1 {
state BACKUP
interface eno16777736
virtual_router_id 14
priority 98
advert_int 1
authentication {
auth_type PASS
auth_pass 571f97b2
}
virtual_ipaddress {
10.1.0.91/16 dev eno16777736
}
}


配置语法:
配置虚拟路由器:
vrrp_instance <STRING> {
....
}

专用参数:
state MASTER|BACKUP:当前节点在此虚拟路由器上的初始状态;只能有一个是MASTER,余下的都应该为BACKUP;
interface IFACE_NAME:绑定为当前虚拟路由器使用的物理接口;
virtual_router_id VRID:当前虚拟路由器的惟一标识,范围是0-255;
priority 100:当前主机在此虚拟路径器中的优先级;范围1-254;
advert_int 1:vrrp通告的时间间隔;
authentication {
auth_type AH|PASS
auth_pass <PASSWORD>
}
virtual_ipaddress {
<IPADDR>/<MASK> brd <IPADDR> dev <STRING> scope <SCOPE> label <LABEL>
192.168.200.17/24 dev eth1
192.168.200.18/24 dev eth2 label eth2:1
}
track_interface {
eth0
eth1
...
}
配置要监控的网络接口,一旦接口出现故障,则转为FAULT状态;
nopreempt:定义工作模式为非抢占模式;
preempt_delay 300:抢占式模式下,节点上线后触发新选举操作的延迟时长;

定义通知脚本:
notify_master <STRING>|<QUOTED-STRING>:当前节点成为主节点时触发的脚本;
notify_backup <STRING>|<QUOTED-STRING>:当前节点转为备节点时触发的脚本;
notify_fault <STRING>|<QUOTED-STRING>:当前节点转为“失败”状态时触发的脚本;

notify <STRING>|<QUOTED-STRING>:通用格式的通知触发机制,一个脚本可完成以上三种状态的转换时的通知;

双主模型示例:
! Configuration File for keepalived

global_defs {
notification_email {
root@localhost
}
notification_email_from keepalived@localhost
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id node1
vrrp_mcast_group4 224.0.100.19
}

vrrp_instance VI_1 {
state MASTER
interface eno16777736
virtual_router_id 14
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 571f97b2
}
virtual_ipaddress {
10.1.0.91/16 dev eno16777736
}
}

vrrp_instance VI_2 {
state BACKUP
interface eno16777736
virtual_router_id 15
priority 98
advert_int 1
authentication {
auth_type PASS
auth_pass 578f07b2
}
virtual_ipaddress {
10.1.0.92/16 dev eno16777736
}
}



示例通知脚本:
#!/bin/bash
#
contact='root@localhost'

notify() {
mailsubject="$(hostname) to be $1, vip floating"
mailbody="$(date +'%F %T'): vrrp transition, $(hostname) changed to be $1"
echo "$mailbody" | mail -s "$mailsubject" $contact
}

case $1 in
master)
notify master
;;
backup)
notify backup
;;
fault)
notify fault
;;
*)
echo "Usage: $(basename $0) {master|backup|fault}"
exit 1
;;
esac

脚本的调用方法:
notify_master "/etc/keepalived/notify.sh master"
notify_backup "/etc/keepalived/notify.sh backup"
notify_fault "/etc/keepalived/notify.sh fault"

虚拟服务器:
配置参数:
virtual_server IP port |
virtual_server fwmark int
{
...
real_server {
...
}
...
}

常用参数:
delay_loop <INT>:服务轮询的时间间隔;
lb_algo rr|wrr|lc|wlc|lblc|sh|dh:定义调度方法;
lb_kind NAT|DR|TUN:集群的类型;
persistence_timeout <INT>:持久连接时长;
protocol TCP:服务协议,仅支持TCP;
sorry_server <IPADDR> <PORT>:备用服务器地址;
real_server <IPADDR> <PORT>
{
weight <INT>
notify_up <STRING>|<QUOTED-STRING>
notify_down <STRING>|<QUOTED-STRING>
HTTP_GET|SSL_GET|TCP_CHECK|SMTP_CHECK|MISC_CHECK { ... }:定义当前主机的健康状态检测方法;
}

HTTP_GET|SSL_GET:应用层检测

HTTP_GET|SSL_GET {
url {
path <URL_PATH>:定义要监控的URL;
status_code <INT>:判断上述检测机制为健康状态的响应码;
digest <STRING>:判断上述检测机制为健康状态的响应的内容的校验码;
}
nb_get_retry <INT>:重试次数;
delay_before_retry <INT>:重试之前的延迟时长;
connect_ip <IP ADDRESS>:向当前RS的哪个IP地址发起健康状态检测请求
connect_port <PORT>:向当前RS的哪个PORT发起健康状态检测请求
bindto <IP ADDRESS>:发出健康状态检测请求时使用的源地址;
bind_port <PORT>:发出健康状态检测请求时使用的源端口;
connect_timeout <INTEGER>:连接请求的超时时长;
}

TCP_CHECK {
connect_ip <IP ADDRESS>:向当前RS的哪个IP地址发起健康状态检测请求
connect_port <PORT>:向当前RS的哪个PORT发起健康状态检测请求
bindto <IP ADDRESS>:发出健康状态检测请求时使用的源地址;
bind_port <PORT>:发出健康状态检测请求时使用的源端口;
connect_timeout <INTEGER>:连接请求的超时时长;
}

高可用的ipvs集群示例:
! Configuration File for keepalived

global_defs {
notification_email {
root@localhost
}
notification_email_from keepalived@localhost
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id node1
vrrp_mcast_group4 224.0.100.19
}

vrrp_instance VI_1 {
state MASTER
interface eno16777736
virtual_router_id 14
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 571f97b2
}
virtual_ipaddress {
10.1.0.93/16 dev eno16777736
}
notify_master "/etc/keepalived/notify.sh master"
notify_backup "/etc/keepalived/notify.sh backup"
notify_fault "/etc/keepalived/notify.sh fault"
}

virtual_server 10.1.0.93 80 {
delay_loop 3
lb_algo rr
lb_kind DR
protocol TCP

sorry_server 127.0.0.1 80

real_server 10.1.0.69 80 {
weight 1
HTTP_GET {
url {
path /
status_code 200
}
connect_timeout 1
nb_get_retry 3
delay_before_retry 1
}
}
real_server 10.1.0.71 80 {
weight 1
HTTP_GET {
url {
path /
status_code 200
}
connect_timeout 1
nb_get_retry 3
delay_before_retry 1
}
}
}


博客作业:第一部分
双主模式的lvs集群,拓扑、实现过程;

配置示例(一个节点):

! Configuration File for keepalived

global_defs {
notification_email {
root@localhost
}
notification_email_from kaadmin@localhost
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id node1
vrrp_mcast_group4 224.0.100.67
}

vrrp_instance VI_1 {
state MASTER
interface eno16777736
virtual_router_id 44
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass f1bf7fde
}
virtual_ipaddress {
172.16.0.80/16 dev eno16777736 label eno16777736:0
}
track_interface {
eno16777736
}
notify_master "/etc/keepalived/notify.sh master"
notify_backup "/etc/keepalived/notify.sh backup"
notify_fault "/etc/keepalived/notify.sh fault"
}

vrrp_instance VI_2 {
state BACKUP
interface eno16777736
virtual_router_id 45
priority 98
advert_int 1
authentication {
auth_type PASS
auth_pass f2bf7ade
}
virtual_ipaddress {
172.16.0.90/16 dev eno16777736 label eno16777736:1
}
track_interface {
eno16777736
}
notify_master "/etc/keepalived/notify.sh master"
notify_backup "/etc/keepalived/notify.sh backup"
notify_fault "/etc/keepalived/notify.sh fault"
}

virtual_server fwmark 3 {
delay_loop 2
lb_algo rr
lb_kind DR
nat_mask 255.255.0.0
protocol TCP
sorry_server 127.0.0.1 80

real_server 172.16.0.69 80 {
weight 1
HTTP_GET {
url {
path /
status_code 200
}
connect_timeout 2
nb_get_retry 3
delay_before_retry 3
}
}
real_server 172.16.0.6 80 {
weight 1
HTTP_GET {
url {
path /
status_code 200
}
connect_timeout 2
nb_get_retry 3
delay_before_retry 3
}
}
}









keepalived调用外部的辅助脚本进行资源监控,并根据监控的结果状态能实现优先动态调整;
分两步:(1) 先定义一个脚本;(2) 调用此脚本;
vrrp_script <SCRIPT_NAME> {
script ""
interval INT
weight -INT
}

track_script {
SCRIPT_NAME_1
SCRIPT_NAME_2
...
}

示例:高可用nginx服务

! Configuration File for keepalived

global_defs {
notification_email {
root@localhost
}
notification_email_from keepalived@localhost
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id node1
vrrp_mcast_group4 224.0.100.19
}

vrrp_script chk_down {
script "[[ -f /etc/keepalived/down ]] && exit 1 || exit 0"
interval 1
weight -5
}

vrrp_script chk_nginx {
script "killall -0 nginx && exit 0 || exit 1"
interval 1
weight -5
}

vrrp_instance VI_1 {
state MASTER
interface eno16777736
virtual_router_id 14
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 571f97b2
}
virtual_ipaddress {
10.1.0.93/16 dev eno16777736
}
track_script {
chk_down
chk_nginx
}
notify_master "/etc/keepalived/notify.sh master"
notify_backup "/etc/keepalived/notify.sh backup"
notify_fault "/etc/keepalived/notify.sh fault"
}

博客作业:以上所有内容;