| HA Cluster:
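Cluster types: LB (lvs/nginx (http/upstream, stream/upstream)), HA, HP

SPoF: Single Point of Failure

System availability formula: A = MTBF / (MTBF + MTTR)
    A lies in the range (0, 1) and is usually quoted as a percentage, e.g. 95%
    Number of nines (metric): 99%, ..., 99.999%, 99.9999%
        99% available means 1% downtime; 99.9% available means 0.1% downtime

System failures:
    Hardware failures: design defects, wear out, natural disasters, ...
    Software failures: design defects, ...

One way to improve system availability is to reduce MTTR:
    Means: redundancy (redundant)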
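To make the availability formula concrete, here is a minimal sketch that evaluates A = MTBF / (MTBF + MTTR) and converts an availability figure into expected downtime per year. The function names `availability` and `downtime_hours` are made up for this illustration, the input numbers are arbitrary, and a 365-day year is assumed.

```shell
#!/bin/bash
# availability MTBF MTTR
# Prints A = MTBF / (MTBF + MTTR); both arguments must use the same time unit.
availability() {
    awk -v mtbf="$1" -v mttr="$2" 'BEGIN { printf "%.4f\n", mtbf / (mtbf + mttr) }'
}

# downtime_hours AVAILABILITY
# Prints the expected downtime in hours over a 365-day year.
downtime_hours() {
    awk -v a="$1" 'BEGIN { printf "%.2f\n", (1 - a) * 365 * 24 }'
}

# Same MTBF, shorter repair time: availability goes up.
availability 1000 10    # prints 0.9901
availability 1000 1     # prints 0.9990

# What the "nines" mean in practice.
downtime_hours 0.99     # prints 87.60
downtime_hours 0.999    # prints 8.76
```

With MTBF unchanged, cutting MTTR from 10 to 1 moves the system from roughly two nines to three nines, which is why the notes focus on reducing MTTR through redundancy.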
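active/passive (master/backup), active/active (dual master)
    active --> HEARTBEAT --> passive
    active <--> HEARTBEAT <--> active

What is made highly available is the "service":
    HA nginx service: vip/nginx process[/shared storage]
    Resource: a "component" that makes up a highly available service;
        (1) How many passive nodes?
        (2) How are resources switched over?

shared storage:
    NAS: file-sharing server;
    SAN: storage area network, block-level sharing;

Network partition: the cluster network splits into isolated groups of nodes
    Isolation (fencing) devices:
        node level: STONITH = Shoot The Other Node In The Head
        resource level: fence

    quorum:
        with quorum: > total/2
        without quorum: <= total/2

    TWO nodes Cluster?
        Auxiliary devices: ping node, quorum disk;

Failover: when the primary node of a resource fails, the resource is moved to another node;
Failback: after the failed primary node of a resource has been repaired and brought back online, the resources that were moved to other nodes are switched back to it;

HA Cluster implementations:
    Implementations of the vrrp protocol
        keepalived
    ais: full-featured HA clusters
        RHCS (cman)
        heartbeat
        corosync

keepalived:

    vrrp protocol: Virtual Router Redundancy Protocol
        Terminology:
            Virtual router: Virtual Router
            Virtual router identifier: VRID (1-255)
            Physical routers:
                master: the master device
                backup: the backup device
                priority: the priority
            VIP: Virtual IP
            VMAC: Virtual MAC (00-00-5e-00-01-VRID)

        Advertisements: carry the heartbeat, priority, etc.; sent periodically;

        Preemptive and non-preemptive modes;

        Security:
            Authentication:
                no authentication
                simple text authentication
                MD5

        Working modes:
            master/backup: a single virtual router;
            master/master: master/backup (virtual router 1), backup/master (virtual router 2)

    keepalived:
        A software implementation of the vrrp protocol, originally designed to make the ipvs service highly available:
            uses the vrrp protocol to float the address between nodes;
            generates ipvs rules on the node that holds the vip (predefined in the configuration file);
            performs health checks on each RS of the ipvs cluster;
            provides a script-calling interface: the scripts it executes carry out whatever they define and thereby influence cluster behaviour;

        Components:
            Core components:
                vrrp stack
                ipvs wrapper
                checkers
            Control component: configuration file parser
            IO multiplexer
            Memory management component

    Prerequisites for configuring an HA Cluster:
        (1) Time must be synchronized across all nodes; ntp, chrony
        (2) Make sure iptables and selinux do not get in the way;
        (3) Nodes can reach each other by hostname (not required for KA); using the /etc/hosts file is recommended;
        (4) The root users on the nodes can communicate with each other over ssh with key-based authentication; (not required)

    Installing and configuring keepalived:
        CentOS 6.4+

        Program environment:
            Main configuration file: /etc/keepalived/keepalived.conf
            Main program file: /usr/sbin/keepalived
            Unit File: keepalived.service
            Environment file for the Unit File: /etc/sysconfig/keepalived

        Sections of the configuration file:
            TOP HIERARCHY
                GLOBAL CONFIGURATION
                    Global definitions
                    Static routes/addresses
                VRRPD CONFIGURATION
                    VRRP synchronization group(s): vrrp sync groups;
                    VRRP instance(s): each vrrp instance is one vrrp router;
                LVS CONFIGURATION
                    Virtual server group(s)
                    Virtual server(s): the vs and rs of the ipvs cluster;

Single-master configuration example:

! Configuration File for keepalived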
global_defs {
    notification_email {
        root@localhost
    }
    notification_email_from keepalived@localhost
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id node1
    vrrp_mcast_group4 224.0.100.19
}
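vrrp_instance VI_1 {
    state BACKUP
    interface eno16777736
    virtual_router_id 14
    priority 98
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 571f97b2
    }
    virtual_ipaddress {
        10.1.0.91/16 dev eno16777736
    }
}

Configuration syntax:
    Configuring a virtual router:
        vrrp_instance <STRING> {
            ....
        }

        Dedicated parameters:
            state MASTER|BACKUP: the initial state of this node in this virtual router; only one node may be MASTER, all the others should be BACKUP;
            interface IFACE_NAME: the physical interface bound to this virtual router;
            virtual_router_id VRID: the unique identifier of this virtual router, range 1-255;
            priority 100: the priority of this host in this virtual router; range 1-254;
            advert_int 1: the interval between vrrp advertisements, in seconds;
            authentication {
                auth_type AH|PASS
                auth_pass <PASSWORD>
            }
            virtual_ipaddress {
                <IPADDR>/<MASK> brd <IPADDR> dev <STRING> scope <SCOPE> label <LABEL>
                192.168.200.17/24 dev eth1
                192.168.200.18/24 dev eth2 label eth2:1
            }
            track_interface {
                eth0
                eth1
                ...
            }
                The network interfaces to monitor; if one of them fails, the instance moves to the FAULT state;
            nopreempt: run in non-preemptive mode;
            preempt_delay 300: in preemptive mode, how long (in seconds) a node waits after coming online before it triggers a new election;

        Notification scripts:
            notify_master <STRING>|<QUOTED-STRING>: script triggered when this node becomes the master;
            notify_backup <STRING>|<QUOTED-STRING>: script triggered when this node becomes a backup;
            notify_fault <STRING>|<QUOTED-STRING>: script triggered when this node enters the "fault" state;
            notify <STRING>|<QUOTED-STRING>: generic notification hook; a single script handles the notifications for all three state transitions above;

Dual-master model example:

! Configuration File for keepalived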
global_defs {
    notification_email {
        root@localhost
    }
    notification_email_from keepalived@localhost
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id node1
    vrrp_mcast_group4 224.0.100.19
}
vrrp_instance VI_1 {
    state MASTER
    interface eno16777736
    virtual_router_id 14
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 571f97b2
    }
    virtual_ipaddress {
        10.1.0.91/16 dev eno16777736
    }
}
vrrp_instance VI_2 {
    state BACKUP
    interface eno16777736
    virtual_router_id 15
    priority 98
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 578f07b2
    }
    virtual_ipaddress {
        10.1.0.92/16 dev eno16777736
    }
}
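In the dual-master model each node should normally hold exactly one of the two VIPs. One way to see which VIPs a node currently holds is to look at the addresses on its interface. The sketch below is only an illustration: `holds_vip` is a hypothetical helper, and it assumes the one-line-per-address format printed by `ip -o -4 addr show`, where the fourth field is `ADDRESS/PREFIX`.

```shell
#!/bin/bash
# holds_vip VIP
# Reads the output of `ip -o -4 addr show` on stdin and succeeds (exit 0)
# if VIP is one of the listed addresses, otherwise fails (exit 1).
holds_vip() {
    local vip="$1"
    awk -v vip="$vip" '
        { split($4, a, "/"); if (a[1] == vip) found = 1 }
        END { exit !found }
    '
}
```

On a node of the example above it could be used like this (the interface name and addresses are the ones from the example configuration):

```shell
ip -o -4 addr show dev eno16777736 | holds_vip 10.1.0.91 && echo "this node is master for VI_1"
ip -o -4 addr show dev eno16777736 | holds_vip 10.1.0.92 && echo "this node is master for VI_2"
```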
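Example notification script:

#!/bin/bash
#
# Recipient of the notification mails.
contact='root@localhost'

# Mail a short notice that this node has moved to the state given in $1.
notify() {
    local mailsubject="$(hostname) to be $1, vip floating"
    local mailbody="$(date +'%F %T'): vrrp transition, $(hostname) changed to be $1"
    echo "$mailbody" | mail -s "$mailsubject" "$contact"
}

case $1 in
master)
    notify master
    ;;
backup)
    notify backup
    ;;
fault)
    notify fault
    ;;
*)
    echo "Usage: $(basename "$0") {master|backup|fault}"
    exit 1
    ;;
esac

How to call the script:
    notify_master "/etc/keepalived/notify.sh master"
    notify_backup "/etc/keepalived/notify.sh backup"
    notify_fault "/etc/keepalived/notify.sh fault"

Virtual servers:
    Configuration parameters:
        virtual_server IP port |
        virtual_server fwmark int
        {
            ...
            real_server {
                ...
            }
            ...
        }

        Common parameters:
            delay_loop <INT>: interval between service polls;
            lb_algo rr|wrr|lc|wlc|lblc|sh|dh: the scheduling method;
            lb_kind NAT|DR|TUN: the cluster type;
            persistence_timeout <INT>: persistent connection duration;
            protocol TCP: the service protocol; only TCP is supported by the version described here;
            sorry_server <IPADDR> <PORT>: address of the fallback server;
            real_server <IPADDR> <PORT>
            {
                weight <INT>
                notify_up <STRING>|<QUOTED-STRING>
                notify_down <STRING>|<QUOTED-STRING>
                HTTP_GET|SSL_GET|TCP_CHECK|SMTP_CHECK|MISC_CHECK { ... }: the health check method for this host;
            }

            HTTP_GET|SSL_GET: application-layer checks

            HTTP_GET|SSL_GET {
                url {
                    path <URL_PATH>: the URL to monitor;
                    status_code <INT>: the response code that counts as healthy;
                    digest <STRING>: the checksum of the response body that counts as healthy;
                }
                nb_get_retry <INT>: number of retries;
                delay_before_retry <INT>: delay before each retry;
                connect_ip <IP ADDRESS>: which IP address of this RS the health check request is sent to
                connect_port <PORT>: which PORT of this RS the health check request is sent to
                bindto <IP ADDRESS>: source address used for the health check request;
                bind_port <PORT>: source port used for the health check request;
                connect_timeout <INTEGER>: timeout of the connection request;
            }

            TCP_CHECK {
                connect_ip <IP ADDRESS>: which IP address of this RS the health check request is sent to
                connect_port <PORT>: which PORT of this RS the health check request is sent to
                bindto <IP ADDRESS>: source address used for the health check request;
                bind_port <PORT>: source port used for the health check request;
                connect_timeout <INTEGER>: timeout of the connection request;
            }

Highly available ipvs cluster example:

! Configuration File for keepalived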
global_defs {
    notification_email {
        root@localhost
    }
    notification_email_from keepalived@localhost
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id node1
    vrrp_mcast_group4 224.0.100.19
}
vrrp_instance VI_1 {
    state MASTER
    interface eno16777736
    virtual_router_id 14
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 571f97b2
    }
    virtual_ipaddress {
        10.1.0.93/16 dev eno16777736
    }
    notify_master "/etc/keepalived/notify.sh master"
    notify_backup "/etc/keepalived/notify.sh backup"
    notify_fault "/etc/keepalived/notify.sh fault"
}
virtual_server 10.1.0.93 80 {
    delay_loop 3
    lb_algo rr
    lb_kind DR
    protocol TCP

    sorry_server 127.0.0.1 80

    real_server 10.1.0.69 80 {
        weight 1
        HTTP_GET {
            url {
                path /
                status_code 200
            }
            connect_timeout 1
            nb_get_retry 3
            delay_before_retry 1
        }
    }
    real_server 10.1.0.71 80 {
        weight 1
        HTTP_GET {
            url {
                path /
                status_code 200
            }
            connect_timeout 1
            nb_get_retry 3
            delay_before_retry 1
        }
    }
}
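The HTTP_GET check above combines a probe (status code 200 on `/`) with a retry policy (`nb_get_retry`, `delay_before_retry`). The following sketch mimics that idea in bash so the interaction of the parameters is easier to see. It is not how keepalived implements its checkers: `check_with_retry` and `http_ok` are hypothetical helpers, and treating the retry count as "one first attempt plus N retries" is an assumption of this sketch.

```shell
#!/bin/bash
# check_with_retry RETRIES DELAY CMD [ARGS...]
# Runs CMD once; on failure retries up to RETRIES more times, sleeping
# DELAY seconds before each retry. Succeeds as soon as CMD succeeds.
check_with_retry() {
    local retries="$1" delay="$2"
    shift 2
    local attempt=0
    while :; do
        "$@" && return 0
        [ "$attempt" -ge "$retries" ] && return 1
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# http_ok URL
# Succeeds only if the server answers with status code 200 within 1 second.
# Requires curl.
http_ok() {
    [ "$(curl -s -o /dev/null -m 1 -w '%{http_code}' "$1")" = "200" ]
}
```

A call such as `check_with_retry 3 1 http_ok http://10.1.0.69/` roughly corresponds to the settings `nb_get_retry 3`, `delay_before_retry 1` and `connect_timeout 1` used for the first real server.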
Blog assignment, part one: a dual-master lvs cluster, including the topology and the implementation steps;

Configuration example (one node):

! Configuration File for keepalived
global_defs {
    notification_email {
        root@localhost
    }
    notification_email_from kaadmin@localhost
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id node1
    vrrp_mcast_group4 224.0.100.67
}
vrrp_instance VI_1 {
    state MASTER
    interface eno16777736
    virtual_router_id 44
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass f1bf7fde
    }
    virtual_ipaddress {
        172.16.0.80/16 dev eno16777736 label eno16777736:0
    }
    track_interface {
        eno16777736
    }
    notify_master "/etc/keepalived/notify.sh master"
    notify_backup "/etc/keepalived/notify.sh backup"
    notify_fault "/etc/keepalived/notify.sh fault"
}
vrrp_instance VI_2 {
    state BACKUP
    interface eno16777736
    virtual_router_id 45
    priority 98
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass f2bf7ade
    }
    virtual_ipaddress {
        172.16.0.90/16 dev eno16777736 label eno16777736:1
    }
    track_interface {
        eno16777736
    }
    notify_master "/etc/keepalived/notify.sh master"
    notify_backup "/etc/keepalived/notify.sh backup"
    notify_fault "/etc/keepalived/notify.sh fault"
}
virtual_server fwmark 3 {
    delay_loop 2
    lb_algo rr
    lb_kind DR
    nat_mask 255.255.0.0
    protocol TCP
    sorry_server 127.0.0.1 80

    real_server 172.16.0.69 80 {
        weight 1
        HTTP_GET {
            url {
                path /
                status_code 200
            }
            connect_timeout 2
            nb_get_retry 3
            delay_before_retry 3
        }
    }
    real_server 172.16.0.6 80 {
        weight 1
        HTTP_GET {
            url {
                path /
                status_code 200
            }
            connect_timeout 2
            nb_get_retry 3
            delay_before_retry 3
        }
    }
}

keepalived can call external helper scripts to monitor resources and dynamically adjust the priority according to the monitoring result;

    Two steps: (1) define a script; (2) call the script;

        vrrp_script <SCRIPT_NAME> {
            script ""
            interval INT
            weight -INT
        }

        track_script {
            SCRIPT_NAME_1
            SCRIPT_NAME_2
            ...
        }

Example: a highly available nginx service

! Configuration File for keepalived
global_defs {
    notification_email {
        root@localhost
    }
    notification_email_from keepalived@localhost
    smtp_server 127.0.0.1
    smtp_connect_timeout 30
    router_id node1
    vrrp_mcast_group4 224.0.100.19
}
vrrp_script chk_down {
    script "[[ -f /etc/keepalived/down ]] && exit 1 || exit 0"
    interval 1
    weight -5
}
vrrp_script chk_nginx {
    script "killall -0 nginx && exit 0 || exit 1"
    interval 1
    weight -5
}
vrrp_instance VI_1 {
    state MASTER
    interface eno16777736
    virtual_router_id 14
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 571f97b2
    }
    virtual_ipaddress {
        10.1.0.93/16 dev eno16777736
    }
    track_script {
        chk_down
        chk_nginx
    }
    notify_master "/etc/keepalived/notify.sh master"
    notify_backup "/etc/keepalived/notify.sh backup"
    notify_fault "/etc/keepalived/notify.sh fault"
}

Blog assignment: everything above;
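The nginx example relies on simple arithmetic: each tracked script that fails adds its negative weight to the node's base priority, and the node loses the VIP once its effective priority drops below the peer's. The sketch below only reproduces that arithmetic. `effective_priority` is a hypothetical helper, and the peer priority of 98 is an assumption taken from the BACKUP nodes in the earlier examples.

```shell
#!/bin/bash
# effective_priority BASE [WEIGHT...]
# Adds the weight of every failed tracked script (negative numbers) to the
# base priority and prints the result.
effective_priority() {
    local prio="$1"
    shift
    local w
    for w in "$@"; do
        prio=$((prio + w))
    done
    echo "$prio"
}

effective_priority 100          # all checks pass: prints 100
effective_priority 100 -5       # chk_nginx fails: prints 95
effective_priority 100 -5 -5    # both checks fail: prints 90
```

With a peer at priority 98, a single failed check (95) is already enough to hand the VIP over. In general the absolute value of the weight has to exceed the priority gap between master and backup (here 100 - 98 = 2), otherwise a failed check would not trigger a failover.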
|