SAP HANA High Availability Within or Across Availability Zones (Based on Fence Agent)
- Overview
- SAP HANA HA Architecture
- Pre-installation Preparation
- VPC Network Planning
- ECS Instance Creation
- ECS Instance Configuration
- fence_aliyun and aliyun-vpc-move-ip Installation and Configuration
- SAP HANA Database Installation
- SLES HAE Installation and Configuration
- SAP HANA Integration with SUSE HAE
- SAP HANA High Availability Testing and Maintenance
Version History
Version | Revision Date | Change Type | Effective Date |
---|---|---|---|
1.0 | 2019/11/15 | ||
1.1 | 2020/1/18 | Updated the RAM role permission verification notes | 2020/1/18 |
Overview
This document describes how to deploy a highly available SAP HANA environment on Alibaba Cloud, within one availability zone or across availability zones, using the fence agent and resource agent shipped with SUSE Linux Enterprise Server.
It is based on SUSE Linux Enterprise Server for SAP Applications 12 SP4 and applies equally to later versions.
Compared with the SBD fence device and floating virtual IP used in traditional SAP high-availability solutions, the fence agent and resource agents (RAs) call Alibaba Cloud OpenAPI to schedule and manage cloud resources flexibly. They support SAP high-availability deployments both within one zone and across zones, meeting enterprise requirements for cross-zone deployment of core SAP applications.
fence_aliyun is a fence agent developed for the Alibaba Cloud environment; it isolates faulty nodes in an SAP high-availability cluster.
aliyun-vpc-move-ip is a resource agent (RA) developed for the Alibaba Cloud environment; it manages the floating IP and configures the overlay floating IP in an SAP high-availability cluster.
SUSE Linux Enterprise Server for SAP Applications 12 SP4 and later already integrate fence_aliyun and aliyun-vpc-move-ip natively, so they can be used directly for SAP high-availability deployments on Alibaba Cloud.
SAP HANA HA Architecture
In this example, SAP HANA is deployed in two different availability zones (C and G) of the Beijing region, with high availability implemented through HANA System Replication (HSR) plus SUSE HAE.
Pre-installation Preparation
SAP installation media
VPC Network Planning
Network planning
Network | Location | Purpose | CIDR Block |
---|---|---|---|
Business network | Beijing Zone C | For Business | 192.168.10.0/24 |
Heartbeat network | Beijing Zone C | For HA/Heartbeat | 192.168.11.0/24 |
Business network | Beijing Zone G | For Business | 192.168.30.0/24 |
Heartbeat network | Beijing Zone G | For HA/Heartbeat | 192.168.31.0/24 |
Hostname | Role | Business IP | Heartbeat IP | HA Virtual IP |
---|---|---|---|---|
hana-master | HANA primary node | 192.168.10.1 | 192.168.11.1 | 192.168.100.100 |
hana-slave | HANA secondary node | 192.168.30.1 | 192.168.31.1 | 192.168.100.100 |
When an HA failover is triggered, SUSE Pacemaker calls Alibaba Cloud OpenAPI through the resource agent to redirect the overlay IP.
The "HA virtual IP" must not be an address inside any existing subnet of the VPC that hosts the two SAP HANA ECS instances; it must be a virtual address outside those subnets. In this example, 192.168.100.100 is such a virtual address.
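As a quick sanity check, the planned virtual IP can be compared against the existing subnets. The sketch below assumes all vSwitch CIDRs are /24s, as in this plan; a general implementation would need real CIDR arithmetic:

```shell
# Check that the planned HA virtual IP is outside every existing /24 subnet.
# Simplified: only valid because all subnets in this plan are /24.
vip=192.168.100.100
conflict=no
for cidr in 192.168.10.0/24 192.168.11.0/24 192.168.30.0/24 192.168.31.0/24; do
    prefix=${cidr%.0/24}                 # e.g. 192.168.10
    if [ "${vip%.*}" = "$prefix" ]; then
        echo "conflict: $vip lies inside $cidr"
        conflict=yes
    fi
done
[ "$conflict" = no ] && echo "OK: $vip is outside all planned subnets"
```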
File system planning
In this example, the HANA file systems use LVM striping, and /usr/sap uses a separate cloud disk. The actual layout is as follows:
Type | Size | File System | VG | LVM Striping | Mount Point |
---|---|---|---|---|---|
Data disk | 800G | XFS | hanavg | Yes | /hana/data |
Data disk | 400G | XFS | hanavg | Yes | /hana/log |
Data disk | 300G | XFS | hanavg | Yes | /hana/shared |
Data disk | 50G | XFS | sapvg | No | /usr/sap |
VPC Network Creation
A Virtual Private Cloud (VPC) is an isolated network environment built on Alibaba Cloud; VPCs are completely isolated from one another at the logical level. A VPC is your own private network in the cloud, over which you have full control: you can choose its IP address range and configure route tables, gateways, and so on. For details, see the product documentation.
Create the VPC and the SAP HANA business and heartbeat subnets according to the plan. This example creates a VPC with the CIDR block 192.168.0.0/16 and the corresponding subnets as follows:
SAP HANA ECS Instance Creation
ECS purchase page
Go to the purchase page at https://www.aliyun.com/product/ecs, select an instance type, enter the configuration parameters, and click Confirm Order.
Select a billing method
Select the billing method: Subscription or Pay-As-You-Go.
Select the region and availability zone
Select a region and availability zone. By default a zone is assigned at random; you can choose a suitable zone yourself. For how to choose, see Regions and Zones.
This example requires one instance of the same specification in each of Beijing Zone C and Zone G.
Select an instance type
The instance types currently certified for SAP HANA on Alibaba Cloud are:
Instance Type | vCPU | Memory (GiB) | Architecture |
---|---|---|---|
ecs.r5.8xlarge | 32 | 256 | Skylake |
ecs.r6.13xlarge | 32 | 256 | Cascade Lake |
ecs.r5.16xlarge | 64 | 512 | Skylake |
ecs.se1.14xlarge | 56 | 480 | Broadwell |
ecs.re4.20xlarge | 80 | 960 | Broadwell |
ecs.re4.40xlarge | 160 | 1920 | Broadwell |
ecs.re4e.40xlarge | 160 | 3840 | Broadwell |
For details, see SAP Certified IaaS Platforms.
This example uses ecs.r5.8xlarge as the test instance.
Select an image
You can choose a public image, a custom image, a shared image, or an image from the marketplace.
For SAP HANA, choose the image type and version that match your actual requirements.
This example uses the "SUSE 12 SP4 for SAP" image.
Configure storage
System disk: required; used for the operating system. Specify the cloud disk type and capacity.
Data disks: optional. If you create cloud disks as data disks at this point, you must select the disk type, capacity, and quantity, and decide whether to enable encryption. You can create empty disks or create disks from snapshots. Up to 16 cloud disks can be attached as data disks.
Size the data disks according to the requirements of the SAP HANA instance.
In this example, /hana/data, /hana/log, and /hana/shared are placed on three SSD cloud disks of equal capacity, striped with LVM to meet SAP HANA performance requirements; the file system is XFS.
For SAP HANA storage requirements, see SAP HANA TDI-Storage Requirements.
Select the network type
Click Next: Network and Security Group and complete the network and security group settings:
1. Select the network type
Select the VPC and vSwitch according to the plan.
2. Set the public bandwidth
SAP HANA is a core enterprise application, so exposing it directly to the Internet is not recommended; here we uncheck "Assign Public IP Address". When Internet access is needed, use a NAT gateway instead.
Select a security group
If you have not created a security group yet, create one manually to manage inbound and outbound network access for the ECS instances. Tailor the default security group rules to your company's fine-grained management requirements by adding custom rules.
NIC configuration
Note: do not add the second elastic network interface yet; attach it after the ECS instance has been created.
Complete the system configuration and grouping settings to finish the ECS purchase.
Configure elastic network interfaces
An elastic network interface (ENI) is a virtual NIC that can be attached to a VPC-type ECS instance. With ENIs you can build high-availability clusters, implement low-cost failover, and manage the network at a fine-grained level. ENIs are supported in all regions. For details, see Elastic Network Interfaces.
Create the ENIs
In this example, create two ENIs for the heartbeat network according to the plan: log on to the ECS console, choose Network & Security > ENI in the left navigation pane, select the region, and click Create ENI.
After each ENI is created, bind it to the corresponding ECS instance.
After creating an ENI in the console, you still need to log on to the operating system and configure the NIC. This example runs SUSE Linux; the configuration command is: yast2 network
Configure the IP address, netmask, and other settings of the new ENI according to the plan:
Do not change the primary NIC's IP address directly in the operating system, and do not switch it to a static IP. If you must change the primary NIC's IP address, see Modify the private IP address. If you have already changed and saved the primary NIC's IP address, you can restore the defaults via Console > ECS > Connect: set the primary NIC back to DHCP and restart the ECS, or open a ticket for help.
After the heartbeat NIC on the secondary node has been configured in the same way, the interfaces look like this:
# primary node
hana-master:~ # ifconfig -a
eth0 Link encap:Ethernet HWaddr 00:16:3E:16:D5:78
inet addr:192.168.10.1 Bcast:192.168.10.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:10410623 errors:0 dropped:0 overruns:0 frame:0
TX packets:2351934 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:15407579259 (14693.8 Mb) TX bytes:237438291 (226.4 Mb)
eth1 Link encap:Ethernet HWaddr 00:16:3E:16:B3:CF
inet addr:192.168.11.1 Bcast:192.168.11.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:3 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 b) TX bytes:126 (126.0 b)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:3584482 errors:0 dropped:0 overruns:0 frame:0
TX packets:3584482 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:8501733500 (8107.8 Mb) TX bytes:8501733500 (8107.8 Mb)
# secondary node
hana-slave:~ # ifconfig -a
eth0 Link encap:Ethernet HWaddr 00:16:3E:16:D1:7B
inet addr:192.168.30.1 Bcast:192.168.30.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:10305987 errors:0 dropped:0 overruns:0 frame:0
TX packets:1281821 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:15387761017 (14674.9 Mb) TX bytes:176376293 (168.2 Mb)
eth1 Link encap:Ethernet HWaddr 00:16:3E:0C:27:58
inet addr:192.168.31.1 Bcast:192.168.31.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1 errors:0 dropped:0 overruns:0 frame:0
TX packets:4 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:42 (42.0 b) TX bytes:168 (168.0 b)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:3454980 errors:0 dropped:0 overruns:0 frame:0
TX packets:3454980 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:8244241992 (7862.3 Mb) TX bytes:8244241992 (7862.3 Mb)
SAP HANA ECS Configuration
Maintain hostnames
Maintain hostname resolution on both SAP HANA ECS instances in the HA cluster.
The /etc/hosts file in this example contains:
127.0.0.1 localhost
192.168.10.1 hana-master
192.168.11.1 hana-01
192.168.30.1 hana-slave
192.168.31.1 hana-02
Configure SSH mutual trust between the ECS instances
The two SAP HANA ECS instances in the HA cluster need SSH mutual trust, configured as follows.
Configure the authentication public key
Run the following commands on the SAP HANA primary node:
hana-master:~ # ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:Glxb95CX3CYmyvIWP1+dGNjTAKyOA6OQVsGawLB1Mwc root@hana-master
The key's randomart image is:
+---[RSA 2048]----+
|+ o.E.. .. |
|.+ + + ..o o |
|o = . o =.* o|
| * + . = oo*oo |
|. . . = S +. +.. |
| . = + o + o|
| . . o o. .o|
| . o . |
| . |
+----[SHA256]-----+
hana-master:~ # ssh-copy-id -i /root/.ssh/id_rsa.pub root@hana-slave
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
The authenticity of host 'hana-slave (192.168.30.1)' can't be established.
ECDSA key fingerprint is SHA256:FkjnE833pcHvtcDTFfOLDYzblmAp1wvBE5cT9xk69Po.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
Password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'root@hana-slave'"
and check to make sure that only the key(s) you wanted were added.
Run the following commands on the SAP HANA secondary node:
hana-slave:~ # ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:lYh/w8U+vRET/E5wCAgZx990sinegTAmHsech5CIF4M root@hana-slave
The key's randomart image is:
+---[RSA 2048]----+
| ooo+X.+.o.. |
| E oo=o%o. *.o|
| ....=o=o+oO |
| ..o o+.=oo|
| S +.ooo+ |
| . ....o.|
| . |
| |
| |
+----[SHA256]-----+
hana-slave:~ # ssh-copy-id -i /root/.ssh/id_rsa.pub root@hana-master
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
The authenticity of host 'hana-master (192.168.10.1)' can't be established.
ECDSA key fingerprint is SHA256:zi+gKx4IFe6Ea12thsdVW9L3J93ZwFymo0+YOLjLJ18.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
Password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'root@hana-master'"
and check to make sure that only the key(s) you wanted were added.
Verify the configuration
From each node, log on to the other node over SSH. If no password is required, mutual trust has been established.
hana-master:~ # ssh hana-slave
Last login: Wed Nov 13 10:16:35 2019 from 106.11.34.9
hana-slave:~ # ssh hana-master
Last login: Wed Nov 13 10:16:34 2019 from 106.11.34.9
ECS Metrics Collector for SAP monitoring agent
The ECS Metrics Collector is a monitoring agent through which SAP systems in the cloud collect the virtual machine configuration and the usage of the underlying physical resources, for later performance analysis and troubleshooting.
The Metrics Collector must be installed on every SAP application and database server. For deployment, see the ECS Metrics Collector for SAP deployment guide.
HANA file system layout
Following the file system plan above, use LVM to manage and configure the cloud disks.
For an introduction to LVM, see LVM HOW TO.
- Create the PVs and VGs
# pvcreate /dev/vdb /dev/vdc /dev/vdd /dev/vdg
Physical volume "/dev/vdb" successfully created
Physical volume "/dev/vdc" successfully created
Physical volume "/dev/vdd" successfully created
Physical volume "/dev/vdg" successfully created
# vgcreate hanavg /dev/vdb /dev/vdc /dev/vdd
Volume group "hanavg" successfully created
# vgcreate sapvg /dev/vdg
Volume group "sapvg" successfully created
- Create the LVs (stripe the three 500G SSD cloud disks)
# lvcreate -l 100%FREE -n usrsaplv sapvg
Logical volume "usrsaplv" created.
# lvcreate -L 800G -n datalv -i 3 -I 64 hanavg
Rounding size (204800 extents) up to stripe boundary size (204801 extents).
Logical volume "datalv" created.
# lvcreate -L 400G -n loglv -i 3 -I 64 hanavg
Rounding size (102400 extents) up to stripe boundary size (102402 extents).
Logical volume "loglv" created.
# lvcreate -l 100%FREE -n sharedlv -i 3 -I 64 hanavg
Rounding size (38395 extents) down to stripe boundary size (38394 extents)
Logical volume "sharedlv" created.
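The "Rounding size ... to stripe boundary" messages above come from LVM aligning the extent count of a striped LV to a multiple of the stripe count. With the default 4 MiB physical extents and 3 stripes, the rounding for the 800G and 400G volumes can be reproduced as a quick calculation (a worked sketch of the arithmetic, not LVM code):

```shell
# LVM allocates striped LVs in whole stripes, so the extent count is
# rounded up to the next multiple of the stripe count.
extent_mib=4                                   # default PE size: 4 MiB
stripes=3
data_extents=$((800 * 1024 / extent_mib))      # 800G -> 204800 extents
log_extents=$((400 * 1024 / extent_mib))       # 400G -> 102400 extents
round_up() { echo $(( ($1 + stripes - 1) / stripes * stripes )); }
echo "datalv: $data_extents -> $(round_up $data_extents) extents"
echo "loglv:  $log_extents -> $(round_up $log_extents) extents"
```

This reproduces the 204800 → 204801 and 102400 → 102402 messages in the lvcreate output above.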
- Create the mount points and format the file systems
# mkdir -p /usr/sap /hana/data /hana/log /hana/shared
# mkfs.xfs /dev/sapvg/usrsaplv
# mkfs.xfs /dev/hanavg/datalv
# mkfs.xfs /dev/hanavg/loglv
# mkfs.xfs /dev/hanavg/sharedlv
- Mount the file systems and enable mounting at boot
# vim /etc/fstab
Add the following entries:
/dev/mapper/hanavg-datalv /hana/data xfs defaults 0 0
/dev/mapper/hanavg-loglv /hana/log xfs defaults 0 0
/dev/mapper/hanavg-sharedlv /hana/shared xfs defaults 0 0
/dev/mapper/sapvg-usrsaplv /usr/sap xfs defaults 0 0
/dev/vdf swap swap defaults 0 0
# mount -a
# df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 32G 0 32G 0% /dev
tmpfs 48G 55M 48G 1% /dev/shm
tmpfs 32G 768K 32G 1% /run
tmpfs 32G 0 32G 0% /sys/fs/cgroup
/dev/vda1 99G 30G 64G 32% /
tmpfs 6.3G 16K 6.3G 1% /run/user/0
/dev/mapper/hanavg-datalv 800G 34M 800G 1% /hana/data
/dev/mapper/sapvg-usrsaplv 50G 33M 50G 1% /usr/sap
/dev/mapper/hanavg-loglv 400G 33M 400G 1% /hana/log
/dev/mapper/hanavg-sharedlv 300G 33M 300G 1% /hana/shared
fence_aliyun and aliyun-vpc-move-ip Installation and Configuration
Environment preparation
The configuration process needs Internet access to download and update components, so an EIP is requested and attached to each of the two ECS instances.
1. Install and check Python and pip. The fence agent currently supports only Python 2.x, so make sure the Python and pip versions are 2.x.
# check the Python version
python -V
Python 2.7.13
# install pip
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
python get-pip.py
# check the version of the pip package manager
pip -V
pip 19.3.1 from /usr/lib/python2.7/site-packages/pip (python 2.7)
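When scripting this environment check, a small guard can fail fast if the wrong Python is the default. This is an illustrative helper (the function name check_py is ours, not part of any tool):

```shell
# Guard: the fence agent setup here requires Python 2.x.
# check_py is a hypothetical helper used only for this illustration.
check_py() {
    case "$1" in
        "Python 2."*) echo "OK" ;;
        *)            echo "FAIL: need Python 2.x, got: $1" ;;
    esac
}
check_py "Python 2.7.13"     # -> OK
# In practice: check_py "$(python -V 2>&1)"
# (note that Python 2 prints its version on stderr)
```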
2. Install the Aliyun Python SDK
pip install aliyun-python-sdk-core # Alibaba Cloud SDK core library
pip install aliyun-python-sdk-ecs # Alibaba Cloud ECS library
# if an older version of the SDK is already installed on the ECS, upgrade to the latest version
pip install aliyun-python-sdk-core --upgrade
pip install aliyun-python-sdk-ecs --upgrade
# post-install check (versions used in this example)
pip list | grep aliyun-python
aliyun-python-sdk-core 2.13.10
aliyun-python-sdk-ecs 4.17.6
3. Install the latest version of the Aliyun CLI command-line tool: download link
3.1 Download and install the latest version, 3.0.32 at the time of writing:
# download the aliyun cli
wget https://github.com/aliyun/aliyun-cli/releases/download/v3.0.32/aliyun-cli-linux-3.0.32-amd64.tgz
# extract
tar -xvf aliyun-cli-linux-3.0.32-amd64.tgz
mv aliyun /usr/local/bin
3.2 Configure a RAM role
Using the Aliyun CLI with a RAM role reduces the security risk of AccessKey leakage.
3.2.1 Log on to the Alibaba Cloud console, choose Resource Access Management (RAM) > Permissions, and click Create RAM Role, e.g. SAP-HA-ROLE.
3.2.2 Create a permission policy, e.g. SAP-HA-ROLE-POLICY:
{
"Version": "1",
"Statement": [
{
"Effect": "Allow",
"Action": [
"ecs:StartInstance",
"ecs:StopInstance",
"ecs:RebootInstance",
"ecs:Describe*"
],
"Resource": [
"*"
],
"Condition": {}
},
{
"Effect": "Allow",
"Action": [
"vpc:CreateRouteEntry",
"vpc:DeleteRouteEntry",
"vpc:Describe*"
],
"Resource": [
"*"
],
"Condition": {}
}
]
}
3.2.3 Attach the newly created SAP-HA-ROLE-POLICY policy to the SAP-HA-ROLE role.
3.2.4 Grant the RAM role to the SAP HANA ECS instances
Console > ECS > More > Instance Settings > Grant/Revoke RAM Role, and grant the newly created SAP-HA-ROLE role.
If you are also using HBR (Hybrid Backup Recovery), note that each ECS instance can be granted only one RAM role. In that case, add the policy created above to the AliyunECSAccessingHBRRole role and grant only AliyunECSAccessingHBRRole to the ECS instances.
3.3 Configure the Aliyun CLI command-line tool
# aliyun configure --profile ecsRamRoleProfile --mode EcsRamRole
Configuring profile 'ecsRamRoleProfile' in 'EcsRamRole' authenticate mode...
Ecs Ram Role []: SAP-HA-ROLE -- enter the name of the RAM role created above
Default Region Id []: cn-beijing -- enter the region ID of the current ECS
Default Output Format [json]: json (Only support json)
Default Language [zh|en] en:
Saving profile[ecsRamRoleProfile] ...Done.
Configure Done!!!
..............888888888888888888888 ........=8888888888888888888D=..............
...........88888888888888888888888 ..........D8888888888888888888888I...........
.........,8888888888888ZI: ...........................=Z88D8888888888D..........
.........+88888888 ..........................................88888888D..........
.........+88888888 .......Welcome to use Alibaba Cloud.......O8888888D..........
.........+88888888 ............. ************* ..............O8888888D..........
.........+88888888 .... Command Line Interface(Reloaded) ....O8888888D..........
.........+88888888...........................................88888888D..........
..........D888888888888DO+. ..........................?ND888888888888D..........
...........O8888888888888888888888...........D8888888888888888888888=...........
............ .:D8888888888888888888.........78888888888888888888O ..............
To look up the region ID for each region, see this list.
fence_aliyun installation and configuration
1. Download the latest fence_aliyun
Make sure every step completes successfully.
curl https://raw.githubusercontent.com/ClusterLabs/fence-agents/master/agents/aliyun/fence_aliyun.py > /usr/sbin/fence_aliyun
chmod 755 /usr/sbin/fence_aliyun
chown root:root /usr/sbin/fence_aliyun
2. Adapt it to the local environment
# set the interpreter to python
sed -i "1s|@PYTHON@|$(which python)|" /usr/sbin/fence_aliyun
# set the fence agent lib directory
sed -i "s|@FENCEAGENTSLIBDIR@|/usr/share/fence|" /usr/sbin/fence_aliyun
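The two sed commands replace build-time placeholders in the downloaded agent. Their effect can be seen on literal copies of the placeholder lines (a self-contained illustration; it does not touch the real file):

```shell
# Demonstrate the two placeholder substitutions on literal text.
echo '#!@PYTHON@ -tt' | sed 's|@PYTHON@|/usr/bin/python|'
echo 'sys.path.append("@FENCEAGENTSLIBDIR@")' | sed 's|@FENCEAGENTSLIBDIR@|/usr/share/fence|'
```

This prints the rewritten shebang and library path line, which is what the real substitutions produce inside /usr/sbin/fence_aliyun.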
3. Verify the installation
stonith_admin -I |grep fence_aliyun
# output containing fence_aliyun means the configuration is correct
100 devices found
fence_aliyun
aliyun-vpc-move-ip installation and configuration
Make sure every step completes successfully.
1. Download the latest aliyun-vpc-move-ip
mkdir -p /usr/lib/ocf/resource.d/aliyun
curl https://raw.githubusercontent.com/ClusterLabs/resource-agents/master/heartbeat/aliyun-vpc-move-ip > /usr/lib/ocf/resource.d/aliyun/vpc-move-ip
chmod 755 /usr/lib/ocf/resource.d/aliyun/vpc-move-ip
chown root:root /usr/lib/ocf/resource.d/aliyun/vpc-move-ip
2. Verify the installation
# ll /usr/lib/ocf/resource.d/aliyun/vpc-move-ip
-rwxr-xr-x 1 root root 9983 Nov 14 19:44 /usr/lib/ocf/resource.d/aliyun/vpc-move-ip
Verify private-network resolution of the OpenAPI domains
This solution uses the VPC and ECS OpenAPI to monitor HA resources and orchestrate failover actions. The VPC and ECS OpenAPI can be called directly over the private network without additional configuration.
Confirm that both ECS instances in the cluster can reach the Alibaba Cloud VPC and ECS OpenAPI domains normally.
This example uses ECS instances on the private network in the Beijing region; for other regions, see the list below.
# ping vpc.cn-beijing.aliyuncs.com
PING popunify-vpc.cn-beijing.aliyuncs.com (100.100.80.162) 56(84) bytes of data.
64 bytes from 100.100.80.162: icmp_seq=1 ttl=102 time=0.065 ms
64 bytes from 100.100.80.162: icmp_seq=2 ttl=102 time=0.087 ms
64 bytes from 100.100.80.162: icmp_seq=3 ttl=102 time=0.106 ms
64 bytes from 100.100.80.162: icmp_seq=4 ttl=102 time=0.107 ms
--- popunify-vpc.cn-beijing.aliyuncs.com ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3058ms
rtt min/avg/max/mdev = 0.065/0.091/0.107/0.018 ms
# ping ecs.cn-beijing.aliyuncs.com
PING popunify-vpc.cn-beijing.aliyuncs.com (100.100.80.162) 56(84) bytes of data.
64 bytes from 100.100.80.162: icmp_seq=1 ttl=102 time=0.065 ms
64 bytes from 100.100.80.162: icmp_seq=2 ttl=102 time=0.093 ms
64 bytes from 100.100.80.162: icmp_seq=3 ttl=102 time=0.129 ms
64 bytes from 100.100.80.162: icmp_seq=4 ttl=102 time=0.102 ms
--- popunify-vpc.cn-beijing.aliyuncs.com ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3059ms
rtt min/avg/max/mdev = 0.065/0.097/0.129/0.023 ms
The regions that currently support private-network API calls are:
Region | Region ID |
---|---|
China East 1 (Hangzhou) | cn-hangzhou |
China South 1 (Shenzhen) | cn-shenzhen |
China North 5 (Hohhot) | cn-huhehaote |
China North 3 (Zhangjiakou) | cn-zhangjiakou |
China Southwest 1 (Chengdu) | cn-chengdu |
Germany (Frankfurt) | eu-central-1 |
Singapore | ap-southeast-1 |
Australia (Sydney) | ap-southeast-2 |
Malaysia (Kuala Lumpur) | ap-southeast-3 |
Indonesia (Jakarta) | ap-southeast-5 |
UK (London) | eu-west-1 |
Japan (Tokyo) | ap-northeast-1 |
India (Mumbai) | ap-south-1 |
US (Silicon Valley) | us-west-1 |
US (Virginia) | us-east-1 |
For regions that do not yet support private-network OpenAPI calls, you can use the SNAT function of the NAT Gateway product to call the OpenAPI over the Internet.
Install SAP HANA
In this example the HANA System ID is H01 and the Instance Number is 00.
For SAP HANA installation and configuration, see SAP HANA Platform.
Configure HANA System Replication
For SAP HANA System Replication configuration, see How To Perform System Replication for SAP HANA.
SLES Cluster HA Installation and Configuration
Install the SUSE HAE software
For the SUSE HAE operation manual, see: SUSE Linux Enterprise High Availability Extension 12
On both the primary and secondary nodes, check whether the HAE and SAPHanaSR components are already installed.
This example uses a SUSE CSP (Cloud Service Provider) image, which is preconfigured with the Alibaba Cloud SUSE SMT server, so the components can be checked and installed directly. If you use a custom or other image, first purchase a SUSE subscription, then either register with the official SUSE SMT server or configure the Zypper repositories manually.
The components to install are:
hana-master:~ # zypper in patterns-sle-gnome-basic patterns-ha-ha_sles SAPHanaSR sap_suse_cluster_connector saptune
Configure the cluster
Generate the cluster configuration file
This example uses VNC to open a graphical session and configures Corosync on the HANA primary node:
# yast2 cluster
Configure the communication channel
The primary node hana-master's business subnet is 192.168.10.x and its heartbeat subnet is 192.168.11.x.
- For Channel, select the heartbeat subnet; for Redundant Channel, select the business subnet
- Add the member addresses in the correct order (heartbeat address first, then business address)
- Expected Votes: 2
- Transport: Unicast
Configure security
Check "Enable Security Auth" and click Generate Auth Key File.
Configure Csync2
- Add the sync hosts
- Click Add Suggested Files
- Click Generate Pre-Shared-Keys
- Make sure csync2 is turned OFF
Leave the Configure conntrackd step at its defaults and click Next.
Configure the service
- Make sure the cluster service is NOT set to start at boot
Save and exit when the configuration is complete, then copy the Corosync configuration files to the SAP HANA secondary node.
# scp -pr /etc/corosync/authkey /etc/corosync/corosync.conf root@hana-slave:/etc/corosync/
- The secondary node is in a different availability zone from the primary node, so the copied configuration must be adjusted manually.
The secondary node hana-slave's business subnet is 192.168.30.x and its heartbeat subnet is 192.168.31.x.
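The parts that must differ on hana-slave are the totem interface bind addresses; the node list and auth key stay identical on both nodes. Below is a sketch of the relevant corosync.conf fragment on the secondary node (assuming ring0 is the heartbeat network and ring1 the redundant business network; the file generated by yast2 may order sections differently):

```
totem {
    ...
    interface {
        ringnumber: 0
        bindnetaddr: 192.168.31.0   # hana-slave heartbeat subnet (192.168.11.0 on hana-master)
        mcastport: 5405
    }
    interface {
        ringnumber: 1
        bindnetaddr: 192.168.30.0   # hana-slave business subnet (192.168.10.0 on hana-master)
        mcastport: 5407
    }
}
nodelist {                          # identical on both nodes
    node {
        ring0_addr: 192.168.11.1
        ring1_addr: 192.168.10.1
        nodeid: 1
    }
    node {
        ring0_addr: 192.168.31.1
        ring1_addr: 192.168.30.1
        nodeid: 2
    }
}
quorum {
    expected_votes: 2
}
```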
Start the cluster
Run the following command on both nodes:
systemctl start pacemaker
Check the cluster status
Both nodes are now online; the resources will be configured and integrated later.
crm_mon -r
Stack: corosync
Current DC: hana-slave (version 1.1.19+20181105.ccd6b5b10-3.13.1-1.1.19+20181105.ccd6b5b10) - partition with quorum
Last updated: Thu Nov 14 14:47:00 2019
Last change: Thu Nov 14 13:40:57 2019 by hacluster via crmd on hana-slave
2 nodes configured
0 resources configured
Online: [ hana-master hana-slave ]
No resources
Start the web-based graphical configuration
(1) Activate the Hawk2 service on both ECS instances
# reset the hawk administrator password
passwd hacluster
New password:
Retype new password:
passwd: password updated successfully
# restart the hawk service
systemctl restart hawk.service
# enable hawk at boot
systemctl enable hawk.service
(2) Access Hawk2
Open https://<HANA ECS IP address>:7630 and log on with the username hacluster and the password set above.
SAP HANA Integration with SUSE HAE
Configure the SAP HANA resources with SAPHanaSR
On either cluster node, create a script file and replace the parameters in the script:
In this example the script file is named HANA_HA_script.sh.
- In res_ALIYUN_STONITH_x: plug=[ECS instance ID], ram_role=[the RAM role configured above], region=[region ID]
- params SID=[HANA SID] InstanceNumber=[HANA instance number]
- In rsc_vip: address=[the planned HA virtual IP], routing_table=[route table ID of the VPC hosting the ECS instances], endpoint=vpc.[region ID].aliyuncs.com, interface=eth0
- location loc_[HANA primary hostname]_stonith_not_on_[HANA primary hostname] res_ALIYUN_STONITH_1 -inf: [HANA primary hostname]
- location loc_[HANA secondary hostname]_stonith_not_on_[HANA secondary hostname] res_ALIYUN_STONITH_2 -inf: [HANA secondary hostname]
primitive res_ALIYUN_STONITH_1 stonith:fence_aliyun \
op monitor interval=120 timeout=60 \
params plug=i-2ze2ujq5zpxxmaemlyfn ram_role=SAP-HA-ROLE region=cn-beijing \
meta target-role=Started
primitive res_ALIYUN_STONITH_2 stonith:fence_aliyun \
op monitor interval=120 timeout=60 \
params plug=i-2zefdluqx20n43jos4vj ram_role=SAP-HA-ROLE region=cn-beijing \
meta target-role=Started
### SAPHanaTopology is a resource agent that monitors and analyzes the HANA landscape and communicates the status between the two nodes ###
primitive rsc_SAPHanaTopology_HDB ocf:suse:SAPHanaTopology \
operations $id=rsc_SAPHanaTopology_HDB-operations \
op monitor interval=10 timeout=600 \
op start interval=0 timeout=600 \
op stop interval=0 timeout=300 \
params SID=H01 InstanceNumber=00
### This defines the SAPHana resource in the cluster, used together with the HA virtual IP ###
primitive rsc_SAPHana_HDB ocf:suse:SAPHana \
operations $id=rsc_SAPHana_HDB-operations \
op start interval=0 timeout=3600 \
op stop interval=0 timeout=3600 \
op promote interval=0 timeout=3600 \
op monitor interval=60 role=Master timeout=700 \
op monitor interval=61 role=Slave timeout=700 \
params SID=H01 InstanceNumber=00 PREFER_SITE_TAKEOVER=true DUPLICATE_PRIMARY_TIMEOUT=7200 AUTOMATED_REGISTER=false
## Overlay IP resource settings ##
primitive rsc_vip ocf:aliyun:vpc-move-ip \
op monitor interval=60 \
meta target-role=Started \
params address=192.168.100.100 routing_table=vtb-2zequdzq4luddui4voe6x endpoint=vpc.cn-beijing.aliyuncs.com interface=eth0
ms msl_SAPHana_HDB rsc_SAPHana_HDB \
meta is-managed=true notify=true clone-max=2 clone-node-max=1 target-role=Started interleave=true maintenance=false
clone cln_SAPHanaTopology_HDB rsc_SAPHanaTopology_HDB \
meta is-managed=true clone-node-max=1 target-role=Started interleave=true maintenance=false
colocation col_saphana_ip_HDB 2000: rsc_vip:Started msl_SAPHana_HDB:Master
location loc_hana-master_stonith_not_on_hana-master res_ALIYUN_STONITH_1 -inf: hana-master
#Stonith 1 should not run on the primary node because it controls the primary node
location loc_hana-slave_stonith_not_on_hana-slave res_ALIYUN_STONITH_2 -inf: hana-slave
order ord_SAPHana_HDB Optional: cln_SAPHanaTopology_HDB msl_SAPHana_HDB
property cib-bootstrap-options: \
have-watchdog=false \
cluster-infrastructure=corosync \
cluster-name=cluster \
stonith-enabled=true \
stonith-action=off \
stonith-timeout=150s \
no-quorum-policy=ignore
rsc_defaults rsc-options: \
migration-threshold=5000 \
resource-stickiness=1000
op_defaults op-options: \
timeout=600
Run the following command as the root user:
crm configure load update HANA_HA_script.sh
Verify the cluster status
A healthy cluster shows the following resource states:
- res_ALIYUN_STONITH_1 started on the secondary node
- res_ALIYUN_STONITH_2 started on the primary node
- rsc_vip started on the current primary node and shown green
- the SAPHana_HDB resource green on both the master and slave nodes
- the SAPHanaTopology resource green on both nodes
Log on to the Hawk2 web console at https://[IP address]:7630
Check the cluster Status and Dashboard.
You can also log on to either node and check the current cluster state with crmsh:
crm_mon -r
Stack: corosync
Current DC: hana-master (version 1.1.19+20181105.ccd6b5b10-3.13.1-1.1.19+20181105.ccd6b5b10) - partition with quorum
Last updated: Sun Jan 19 16:36:35 2020
Last change: Sun Jan 19 16:35:36 2020 by root via crm_attribute on hana-master
2 nodes configured
7 resources configured
Online: [ hana-master hana-slave ]
Full list of resources:
res_ALIYUN_STONITH_1 (stonith:fence_aliyun): Started hana-slave
res_ALIYUN_STONITH_2 (stonith:fence_aliyun): Started hana-master
rsc_vip (ocf::aliyun:vpc-move-ip): Started hana-master
Master/Slave Set: msl_SAPHana_HDB [rsc_SAPHana_HDB]
Masters: [ hana-master ]
Slaves: [ hana-slave ]
Clone Set: cln_SAPHanaTopology_HDB [rsc_SAPHanaTopology_HDB]
Started: [ hana-master hana-slave ]
SAP HANA High Availability Testing and Maintenance
- For SAP system high-availability testing, see SAP High Availability Test Best Practices
- For maintaining an SAP high-availability environment, see the SAP High Availability Environment Maintenance Guide