2017-03-21

The background comes from a technical discussion in a user group: in an environment with 10 ESXi hosts, after Host1's SSD fails (a degraded state), do the 50 VMs running on Host1 lose their cache?

Although the question is hypothetical, we can use it to build a simulated scenario based on vSAN's design principles.

First, there are 10 ESXi hosts, which comfortably covers the 8 hosts this FTT=3 scenario calls for (the strict minimum being (2 × FTT) + 1 = 7 hosts), so the simulation uses FTT=3.

Second, Host1 runs 50 VMs. Assuming HA and DRS are enabled and DRS keeps the cluster balanced, every host should be running roughly 50 VMs. With that many VMs per host, the simulation uses a stripe width of SW=3 to improve performance.

The simulated configuration is therefore FTT=3, SW=3. Let's build the scenario on that basis.

FTT=3 means the object can tolerate 3 host failures, so it keeps 4 replicas: RAID 1 = RAID 0 + RAID 0 + RAID 0 + RAID 0. SW=3 means each RAID 0 stripe set contains 3 components, and quorum among these components is expressed in vSAN through witnesses. Based on how the stripe components are laid out, we can work out the stripe-component and witness distribution.

The calculation uses three kinds of witnesses:

Primary Witnesses: Need at least (2 * FTT) + 1 nodes in a cluster to be able to tolerate FTT number of node / disk failures. If, after placing all the data components, we do not have the required number of nodes in the configuration, primary witnesses are placed on exclusive nodes until there are (2 * FTT) + 1 nodes in the configuration.

Secondary Witnesses: Secondary witnesses are created to make sure that every node has equal voting power towards quorum. This is important because every node failure should affect the quorum equally. Secondary witnesses are added so that every node gets an equal number of components; this includes the nodes that only hold primary witnesses. So the total count of data components + witnesses on each node is equalized in this step.

Tiebreaker Witness: If, after adding primary and secondary witnesses, we end up with an even number of total components (data + witnesses) in the configuration, then we add one tiebreaker witness to make the total component count odd.

Naming the hosts H1, H2, ... in order: with FTT=3 and SW=3 the number of stripe (data) components is (3 + 1) × 3 = 12. Primary witnesses require at least (2 × 3) + 1 = 7 nodes; the 12 data components already span 8 nodes, so no extra host has to be set aside as a witness-only node.

With SW=3, each RAID 0 holds 3 stripe components. Walking through the hosts for secondary witnesses, H5, H6, H7 and H8 each hold only one component and are missing their second vote, so secondary witnesses are added on H5, H6, H7 and H8. Because the environment has 10 ESXi hosts, two more than the 8 this layout needs, H9 and H10 are surplus and are initially treated as witness holders as well, so the witnesses sit on H5, H6, H7, H8, H9 and H10. The component count is now 12 + 6 = 18, which does not satisfy the requirement that the total be odd, so the system adds one tiebreaker witness; in this simulation we place it on H4. The resulting layout is shown below (a small sketch of the same arithmetic appears at the end of this post):

Witness: H4 H5 H6 H7 H8 H9 H10
Raid1
  Raid0: H1 H2 H3
  Raid0: H4 H5 H6
  Raid0: H7 H8 H1
  Raid0: H2 H3 H4

With the component and witness layout worked out, return to the original question. From the layout above, H1 contributes only 2 components, located in two different RAID 0 stripe sets. A vSAN object remains accessible as long as more than 50% of its votes are available (in vSAN 5.5 the rule was stated as more than 50% of the components). In addition, a VM is only failed over and restarted when heartbeats are lost, and here the vSAN kernel on Host1 is still alive. So in this simulation, when Host1's SSD fails, the 50 VMs running on Host1 do not change state; after the CLOM repair delay (ClomRepairDelay) expires, vSAN starts rebuilding the data that lived on Host1.

Two supplementary points:
1. vSAN has no data locality, for either the capacity tier or the cache tier.
2. On the vSAN SSD, reads go through a cache and writes go through a buffer.
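To make the walk-through concrete, here is a minimal Python sketch that reproduces the component and witness arithmetic above. The stripe placement, the choice to also give H9/H10 a witness, and the placement of the tiebreaker on H4 are the assumptions made in this article; the script only illustrates the counting, it is not vSAN's actual placement code.

```python
from collections import Counter

# Sketch of the witness walk-through for FTT=3, SW=3 on 10 hosts.
# The stripe placement below is the hypothetical layout used in this
# article, not output taken from a real vSAN cluster.
FTT, SW = 3, 3

raid0_sets = [               # RAID 1 mirror of (FTT + 1) = 4 RAID 0 stripe sets
    ["H1", "H2", "H3"],      # -> (FTT + 1) * SW = 12 data components
    ["H4", "H5", "H6"],
    ["H7", "H8", "H1"],
    ["H2", "H3", "H4"],
]

components = Counter()
for stripe in raid0_sets:
    components.update(stripe)            # count data components per host

# Primary witnesses are only needed when the data spans fewer than
# (2 * FTT) + 1 nodes; here the 12 components already sit on 8 nodes.
print(len(components), ">=", 2 * FTT + 1, "-> no primary witnesses")

# Secondary witnesses: give H5-H8 the second vote that H1-H4 already have.
for host in ("H5", "H6", "H7", "H8"):
    components[host] += 1

# The article additionally treats the two spare hosts as witness holders.
for host in ("H9", "H10"):
    components[host] += 1

total = sum(components.values())         # 12 + 6 = 18, an even number
if total % 2 == 0:
    components["H4"] += 1                # tiebreaker witness (placed on H4 here)
    total += 1

print("total components:", total)        # 19
print("votes per host:", dict(components))
```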
|
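Under the same assumptions, a small quorum check shows why losing Host1 does not make the object inaccessible: Host1 holds only 2 of the 19 votes. The vote table and the helper accessible_after_failure are purely illustrative names, not a vSAN API.

```python
# Quorum check for the 19-component layout produced above: an object
# stays accessible while more than 50% of its votes survive.
votes_per_host = {
    "H1": 2, "H2": 2, "H3": 2, "H4": 3,   # H4 = 2 data components + tiebreaker
    "H5": 2, "H6": 2, "H7": 2, "H8": 2,
    "H9": 1, "H10": 1,
}

def accessible_after_failure(votes, failed_hosts):
    """Return (accessible, surviving_votes, total_votes) after the failure."""
    total = sum(votes.values())
    surviving = sum(v for h, v in votes.items() if h not in failed_hosts)
    return surviving > total / 2, surviving, total

ok, surviving, total = accessible_after_failure(votes_per_host, {"H1"})
print(f"Host1 down: {surviving}/{total} votes left -> accessible: {ok}")
# Host1 holds only 2 of the 19 votes, so 17/19 > 50% and the object stays
# accessible; the VMs keep running and vSAN rebuilds after the repair delay.
```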