On the Versioning of the Pacemaker Cluster Configuration

In Pacemaker, the CIB carries a version composed of admin_epoch, epoch, and
num_updates. When a node joins the cluster, the version numbers are compared, and the configuration with the highest version is adopted as the merged configuration of the whole cluster.

Of these three fields, admin_epoch normally never changes; epoch is incremented on every "configuration" change, which also resets num_updates to 0; num_updates is incremented on every "status" change. "Configuration" means the persistent content under the configuration node of the CIB, including cluster properties, nodes' forever attributes, resource attributes, and so on. "Status" means dynamic things such as nodes' reboot attributes, whether a node is alive, and whether a resource is started.
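As a toy model of these increment rules (an illustration only, not Pacemaker source code), the behavior can be sketched as:

```python
# Toy model of CIB version bumping; field semantics follow the text above.
class CibVersion:
    def __init__(self, admin_epoch=0, epoch=0, num_updates=0):
        self.admin_epoch = admin_epoch
        self.epoch = epoch
        self.num_updates = num_updates

    def on_config_change(self):
        # A "configuration" change (cluster property, forever node
        # attribute, resource definition, ...) bumps epoch and resets
        # num_updates to 0.
        self.epoch += 1
        self.num_updates = 0

    def on_status_change(self):
        # A "status" change (reboot node attribute, node up/down,
        # resource started/stopped, ...) only bumps num_updates.
        self.num_updates += 1

    def as_tuple(self):
        return (self.admin_epoch, self.epoch, self.num_updates)
```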

"Status" can usually be re-acquired through monitor operations (unless the RA script is badly designed), but a wrong "configuration" may bring down the cluster, so we care more about changes to epoch and about the effect a joining node has on the cluster configuration. In particular, some master/slave-capable RA scripts modify the configuration dynamically (for example mysql's mysql_REPL_INFO
and pgsql's pgsql-data-status); if the configuration ends up in an inconsistent state, the cluster may fail.

1. What the Manual Says

3.2. Configuration Version

When a node joins the cluster, the cluster will perform a check to see who has the best configuration based on the fields below. It then asks the node with the highest (admin_epoch, epoch, num_updates) tuple to replace the configuration on all the nodes – which makes setting them, and setting them correctly, very important.

Table 3.1. Configuration Version Properties

Field       Description
admin_epoch Never modified by the cluster. Use this to make the configurations on any inactive nodes obsolete. Never set this value to zero; in such cases the cluster cannot tell the difference between your configuration and the "empty" one used when nothing is found on disk.
epoch       Incremented every time the configuration is updated (usually by the admin)
num_updates Incremented every time the configuration or status is updated (usually by the cluster)
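Because the three fields are compared as an ordered tuple, Python's lexicographic tuple comparison reproduces the "best configuration" selection directly. A sketch with made-up field values:

```python
# The node with the highest (admin_epoch, epoch, num_updates) tuple wins.
# Python compares tuples element by element, which matches this rule.
versions = {
    "node_a": (0, 48304, 4),
    "node_b": (0, 48305, 0),   # higher epoch beats any num_updates
    "node_c": (1, 0, 0),       # higher admin_epoch beats everything
}
winner = max(versions, key=versions.get)
print(winner)  # node_c
```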

2. Verification in Practice

2.1 Environment

3 machines: srdsdevapp69, srdsdevapp71 and srdsdevapp73
OS: CentOS 6.3
Pacemaker: 1.1.14-1.el6 (Build: 70404b0)
Corosync: 1.4.1-7.el6

2.2 Basic Verification

1. Initially epoch="48304", num_updates="4":

   # cibadmin -Q | grep epoch

2. Updating the cluster configuration increments epoch by 1 and resets num_updates to 0:

   # crm_attribute --type crm_config -s set1 --name foo1 -v "1"
   # cibadmin -Q | grep epoch

3. Updating an attribute to the value it already has leaves epoch unchanged:

   # crm_attribute --type crm_config -s set1 --name foo1 -v "1"
   # cibadmin -Q | grep epoch

4. Updating a node attribute whose lifetime is forever also increments epoch by 1:

   # crm_attribute -N `hostname` -l forever -n foo2 -v 2
   # cibadmin -Q | grep epoch

5. Updating a node attribute whose lifetime is reboot increments num_updates by 1:

   # crm_attribute -N `hostname` -l reboot -n foo3 -v 2
   # cibadmin -Q | grep epoch

2.3 Partition Verification

1. Artificially cut srdsdevapp69 off from the other two nodes to create a partition. The DC (Designated Controller) before the partition is srdsdevapp73. On srdsdevapp69:

   # iptables -A INPUT -j DROP -s srdsdevapp71
   # iptables -A OUTPUT -j DROP -s srdsdevapp71
   # iptables -A INPUT -j DROP -s srdsdevapp73
   # iptables -A OUTPUT -j DROP -s srdsdevapp73

The epoch in both partitions is unchanged, still 48306, but srdsdevapp69 has made itself the DC of its own partition.

Partition 1 (srdsdevapp69): without QUORUM

   # cibadmin -Q | grep epoch

Partition 2 (srdsdevapp71, srdsdevapp73): with QUORUM

   # cibadmin -Q | grep epoch
2. Make two configuration updates on srdsdevapp69 so that its epoch increases by 2:

   # crm_attribute --type crm_config -s set1 --name foo4 -v "1"
   # crm_attribute --type crm_config -s set1 --name foo5 -v "1"
   # cibadmin -Q | grep epoch

3. Make one configuration update on srdsdevapp71 so that its epoch increases by 1:

   # crm_attribute --type crm_config -s set1 --name foo6 -v "1"
   # cibadmin -Q | grep epoch

4. Restore the network and check the cluster configuration:

   # iptables -F
   # cibadmin -Q | grep epoch
   # crm_attribute --type crm_config -s set1 --name foo5 -q
   1
   # crm_attribute --type crm_config -s set1 --name foo4 -q
   1
   # crm_attribute --type crm_config -s set1 --name foo6 -q
   Error performing operation: No such device or address

The cluster adopted the configuration from the srdsdevapp69 partition because its version number was higher, and the updates made in the srdsdevapp71/srdsdevapp73 partition were lost.
This test exposes a real problem: the configuration of the partition that held QUORUM can be overwritten by the configuration of a partition that did not. If you develop your own RA, this is something to keep in mind.
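Plugging the numbers from this test into the version-tuple comparison makes the point explicit: quorum plays no part in the choice (epoch values taken from the transcript above; admin_epoch assumed to be 0):

```python
# The partition WITHOUT quorum made two config changes; the partition
# WITH quorum made one. The merge compares only the version tuples.
no_quorum = (0, 48306 + 2, 0)   # srdsdevapp69: two updates
quorum    = (0, 48306 + 1, 0)   # srdsdevapp71/73: one update
assert no_quorum > quorum        # the non-quorum partition's CIB wins
```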

2.4 Partition Verification 2

In the previous test, the pre-partition DC ended up in the partition that kept QUORUM. Now let's test the case where the pre-partition DC is in the partition without QUORUM.

1. Artificially cut the DC (srdsdevapp73) off from the other two nodes to create a partition. On srdsdevapp73:

   # iptables -A INPUT -j DROP -s srdsdevapp69
   # iptables -A OUTPUT -j DROP -s srdsdevapp69
   # iptables -A INPUT -j DROP -s srdsdevapp71
   # iptables -A OUTPUT -j DROP -s srdsdevapp71

The epoch on srdsdevapp73 does not change:

   # cibadmin -Q | grep epoch

But in the other partition (srdsdevapp69, srdsdevapp71) the epoch has increased by 1:

   # cibadmin -Q | grep epoch

After the network is restored, the cluster adopts the configuration with the higher version number, and the DC is still the pre-partition DC (srdsdevapp73):

   # iptables -F
   # cibadmin -Q | grep epoch

From this test we can see:

  • A DC election increments epoch by 1
  • After a partition heals, Pacemaker tends to keep the pre-partition DC as the new DC

3. Summary

Pacemaker's behavior:

  1. A CIB configuration change increments epoch by 1
  2. A DC election increments epoch by 1
  3. After a partition heals, Pacemaker adopts the configuration with the higher version number as the cluster configuration
  4. After a partition heals, Pacemaker tends to keep the pre-partition DC as the new DC

Points to note when developing an RA:

  1. Avoid modifying the cluster configuration dynamically if at all possible
  2. If that cannot be avoided, avoid using more than one dynamic cluster configuration parameter; for example, several values can be concatenated into a single parameter (this is what mysql's mysql_REPL_INFO does)
  3. Check for crm_attribute failures and retry on error (this is what pgsql does)
  4. When QUORUM is lost, avoid modifying the cluster configuration in the resource stop path (demote, stop)
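Point 2 above can be sketched as follows. The field names and the separator are illustrative only; they are not the actual format of mysql_REPL_INFO:

```python
# Sketch: pack several replication-state values into one attribute value,
# so the RA issues a single crm_attribute update instead of several and
# the CIB can never hold a half-updated set of values.
def pack_repl_info(master, log_file, log_pos, sep="|"):
    return sep.join([master, log_file, str(log_pos)])

def unpack_repl_info(value, sep="|"):
    master, log_file, log_pos = value.split(sep)
    return master, log_file, int(log_pos)

packed = pack_repl_info("srdsdevapp69", "mysql-bin.000012", 107)
# The RA would then store it with one call, e.g.:
#   crm_attribute --type crm_config --name repl_info -v "$packed"
assert unpack_repl_info(packed) == ("srdsdevapp69", "mysql-bin.000012", 107)
```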

