This article describes the prerequisite configuration required, and the actual configuration to
use IBM® PowerHA® 7.1 storage area network (SAN) heartbeat. In PowerHA 7.1, the disk
heartbeat has been replaced by a SAN heartbeat, which should be included in a resilient
PowerHA architecture.
Introduction
IBM PowerHA SystemMirror for AIX is clustering software that allows a resource or group of
resources (an application) to be moved, automatically or manually, to another IBM AIX® system
in the event of a system failure.
Heartbeat and failure detection are performed over all interfaces available to the cluster. These
can be network interfaces, Fibre Channel (FC) adapter interfaces, and the Cluster Aware AIX (CAA)
repository disk.
In PowerHA 6.1 and earlier versions, heartbeat over FC adapter interfaces was not supported, and
instead, a SAN-attached heartbeat disk was made available to both nodes, and this was used for
heartbeat and failure detection. In PowerHA 7.1, the use of heartbeat disks is no longer supported,
and configuring heartbeat over SAN is the supported method to use in place of heartbeat disks.
For this heartbeat over SAN to take place, the FC adapter in the AIX system needs to be
configured to act as both a target and an initiator. In most SAN environments, the initiator is
a host bus adapter (HBA) in the server, and the target is a storage device, such as a storage
controller or a tape device. The IBM AIX 7.1 Information Center contains a list of FC adapters
that support target mode. These adapters can be used for heartbeat over SAN.
Overview
In this article, I provide simple examples of how to set up the SAN heartbeat in two
scenarios: the first with two AIX systems using physical I/O, and the second with
two AIX logical partitions (LPARs) using Virtual I/O Server and N-Port ID Virtualization (NPIV).
In each of the examples, we have a two-node PowerHA 7.1 cluster, with each node residing on
a different IBM POWER® processor-based server. This article does not cover how to configure
shared storage, advanced network communications, or application controllers. This is a practical
example of how to build a very simple cluster, and get the SAN heartbeat working.
Requirements
The following minimum requirements must be met to ensure that we can create the cluster and
configure the SAN heartbeat:
• AIX 6.1 or preferably AIX 7.1 needs to be installed on both AIX systems, using the latest
technology level and service pack.
• PowerHA 7.1 needs to be installed on both AIX systems, using the latest service pack.
• The FC adapters in the servers must support target mode, and if NPIV is in use, they must be
8 Gbps adapters supporting NPIV. NPIV support is required for Scenario 2 that is explained in
this article.
• If Virtual I/O Server is in use, then the VIOS code should be the latest service pack of IOS 2.2.
This is required for Scenario 2 in this article.
• If NPIV is in use, then the fabric switches must have NPIV support enabled, and be on a
supported level of firmware. This is required for Scenario 2 that is explained in this article.
• There must be a logical unit number (LUN) allocated to both AIX systems for use as the CAA
repository disk.
• There must be a LUN allocated to both AIX systems for use as shared storage for the cluster.
The SAN zoning consists of two types of zones:
• Storage zones
• Heartbeat zones
To configure the zoning, first log in to each of the nodes, verify that the FC adapters are available,
and capture the worldwide port name (WWPN) of each adapter port, as shown in the following
example.
root@ha71_node1:/home/root# lsdev -Cc adapter |grep fcs
fcs0 Available 02-T1 8Gb PCI Express Dual Port FC Adapter
fcs1 Available 03-T1 8Gb PCI Express Dual Port FC Adapter
fcs2 Available 02-T1 8Gb PCI Express Dual Port FC Adapter
fcs3 Available 03-T1 8Gb PCI Express Dual Port FC Adapter
root@ha71_node1:/home/root# for i in `lsdev -Cc adapter |awk '{print $1}' |grep fcs`; do print ${i} - $(lscfg -vl $i |grep Network |awk '{print $2}' |cut -c21-50| sed 's/../&:/g;s/:$//'); done
fcs0 - 10:00:00:00:C9:CC:49:44
fcs1 - 10:00:00:00:C9:CC:49:45
fcs2 - 10:00:00:00:C9:C8:85:CC
fcs3 - 10:00:00:00:C9:C8:85:CD
root@ha71_node1:/home/root#
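The cut -c21-50 step in the loop above depends on the exact width of the lscfg label. As a minimal sketch, assuming the Network Address line has the form used in the sample below (label, padding dots, then 16 hexadecimal digits), a pattern-based extraction is less sensitive to spacing. The function name wwpn_from_lscfg is hypothetical and is not part of AIX.

```shell
#!/bin/sh
# Sketch: read `lscfg -vl fcsX` output on stdin and print the WWPN
# in colon-separated form.
# Assumption: the relevant line looks like
#   "Network Address.............<16 hex digits>"
wwpn_from_lscfg() {
    # First sed keeps only the 16 hex digits; second sed inserts a colon
    # after every pair and removes the trailing colon.
    sed -n 's/.*Network Address\.*\([0-9A-Fa-f]\{16\}\).*/\1/p' |
        sed 's/../&:/g;s/:$//'
}

# Sample line built from the fcs0 WWPN shown above.
printf '        Network Address.............10000000C9CC4944\n' | wwpn_from_lscfg
```

On a node, the function would be fed with real output, for example lscfg -vl fcs0 | wwpn_from_lscfg.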
After the WWPNs are known, zoning can be performed on the fabric switches. Zone the HBA
adapters to the storage ports on the storage controller used for the shared storage, and also
create zones that can be used for the heartbeat. The following diagram gives an overview of how
the heartbeat zones should be created.
Ensure that you zone one port from each FC adapter on the first node to another port on each FC
adapter on the second node.
For target mode to be enabled, both dyntrk (dynamic tracking) and fast_fail need to be enabled
on the fscsi device, and target mode needs to be enabled on the fcs device.
To enable target mode, perform the following steps on both nodes.
root@ha71_node1:/home/root# rmdev -l fcs0 -R
fscsi0 Defined
fcs0 Defined
root@ha71_node1:/home/root# rmdev -l fcs2 -R
fscsi2 Defined
fcs2 Defined
root@ha71_node1:/home/root# chdev -l fscsi0 -a dyntrk=yes -a fc_err_recov=fast_fail
fscsi0 changed
root@ha71_node1:/home/root# chdev -l fscsi2 -a dyntrk=yes -a fc_err_recov=fast_fail
fscsi2 changed
root@ha71_node1:/home/root# chdev -l fcs0 -a tme=yes
fcs0 changed
root@ha71_node1:/home/root# chdev -l fcs2 -a tme=yes
fcs2 changed
root@ha71_node1:/home/root# cfgmgr
root@ha71_node1:/home/root#
If the devices are busy, make the changes with the -P option at the end of the command, and
restart the server. The change is then applied at the next start of the server.
The settings can be verified by checking the attributes of the fscsi and fcs devices. The
following example shows how to check fscsi0 and fcs0 on one of the nodes. The same check
should be made for fcs0 and fcs2 (and their fscsi devices) on both nodes.
root@ha71_node1:/home/root# lsattr -El fscsi0
attach switch How this adapter is CONNECTED False
dyntrk yes Dynamic Tracking of FC Devices True
fc_err_recov fast_fail FC Fabric Event Error RECOVERY Policy True
scsi_id 0xbc0e0a Adapter SCSI ID False
sw_fc_class 3 FC Class for Fabric True
root@ha71_node1:/home/root# lsattr -El fcs0 |grep tme
tme yes Target Mode Enabled True
root@ha71_node1:/home/root#
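With several adapters on several nodes, reading each lsattr listing by eye becomes tedious. The following is a minimal sketch of how the check could be scripted, assuming the name/value column layout shown in the listing above. The function name fscsi_ready is hypothetical.

```shell
#!/bin/sh
# Sketch: read `lsattr -El fscsiX` output on stdin and succeed only if
# dyntrk is "yes" and fc_err_recov is "fast_fail".
fscsi_ready() {
    awk '
        $1 == "dyntrk"       && $2 == "yes"       { d = 1 }
        $1 == "fc_err_recov" && $2 == "fast_fail" { f = 1 }
        END { exit !(d && f) }   # exit status 0 only when both were seen
    '
}

# Sample taken from the fscsi0 listing above.
if fscsi_ready <<'EOF'
attach switch How this adapter is CONNECTED False
dyntrk yes Dynamic Tracking of FC Devices True
fc_err_recov fast_fail FC Fabric Event Error RECOVERY Policy True
scsi_id 0xbc0e0a Adapter SCSI ID False
sw_fc_class 3 FC Class for Fabric True
EOF
then
    echo "fscsi0: dynamic tracking and fast_fail are set"
else
    echo "fscsi0: settings need attention"
fi
```

On a node, the same function could be used as lsattr -El fscsi0 | fscsi_ready, once for each fscsi device that carries the heartbeat.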
After the target mode is enabled, we should next look for the available sfwcomm devices. These
devices are used for the PowerHA error detection and heartbeat over SAN.
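As a minimal sketch of that check, the function below filters lsdev-style output for sfwcomm devices in the Available state. The function name available_sfwcomm is hypothetical, and the sample lines are illustrative only: the location codes and description text are assumptions, not output captured from the systems in this article.

```shell
#!/bin/sh
# Sketch: read `lsdev -C` output on stdin and print the names of
# sfwcomm devices whose state is Available.
available_sfwcomm() {
    awk '$1 ~ /^sfwcomm[0-9]+$/ && $2 == "Available" { print $1 }'
}

# Illustrative sample (location codes and descriptions are assumptions).
available_sfwcomm <<'EOF'
fcs0 Available 02-T1 8Gb PCI Express Dual Port FC Adapter
sfwcomm0 Available 02-T1-01-FF Fibre Channel Storage Framework Comm
sfwcomm2 Defined 03-T1-01-FF Fibre Channel Storage Framework Comm
EOF
```

On a node, the function would be used as lsdev -C | available_sfwcomm. An empty result suggests that target mode is not yet active on any adapter.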
When using VIOS, what differs from the physical I/O scenario is that the FC ports of the Virtual I/O
Server must be zoned together. There is then a private virtual LAN (VLAN) with the port VLAN ID
of 3358 (3358 is the only VLAN ID that will work) used to carry the heartbeat communication over
the hypervisor from the Virtual I/O Server to the client LPAR, which is our PowerHA node.
• Storage zones
  • Contain the LPAR's virtual WWPNs
  • Contain the storage controller's WWPNs
• Heartbeat zones (contain the VIOS physical WWPNs)
  • The VIOS on each machine should be zoned together.
  • The virtual WWPNs of the client LPARs should not be zoned together.
When performing the zoning, log in to each of the VIOS (both VIOS on each managed system)
and verify that the FC adapters are available, and capture the WWPN information for zoning. The
following example shows how to perform this step on one VIOS.
The virtual WWPNs also need to be captured from the client LPARs for the storage zones. Perform
this step on both nodes; the following example shows one node.
root@ha71_node1:/home/root# lsdev -Cc adapter |grep fcs
fcs0 Available 02-T1 Virtual Fibre Channel Client Adapter
fcs1 Available 03-T1 Virtual Fibre Channel Client Adapter
fcs2 Available 02-T1 Virtual Fibre Channel Client Adapter
fcs3 Available 03-T1 Virtual Fibre Channel Client Adapter
root@ha71_node1:/home/root# for i in `lsdev -Cc adapter |awk '{print $1}' |grep fcs`; do print ${i} - $(lscfg -vl $i |grep Network |awk '{print $2}' |cut -c21-50| sed 's/../&:/g;s/:$//'); done
fcs0 - c0:50:76:04:f8:f6:00:40
fcs1 - c0:50:76:04:f8:f6:00:42
fcs2 - c0:50:76:04:f8:f6:00:44
fcs3 - c0:50:76:04:f8:f6:00:46
root@ha71_node1:/home/root#
After the WWPNs are known, zoning can be performed on the fabric switches. Zone the LPAR’s
virtual WWPNs to the storage ports on the storage controller used for the shared storage, and
also create zones containing the VIOS physical ports, which will be used for the heartbeat. The
following figure gives an overview of how the heartbeat zones should be created.
For target mode to be enabled, both dyntrk (dynamic tracking) and fast_fail need to be enabled
on the fscsi device, and target mode needs to be enabled on the fcs device.
To enable target mode, perform the following steps on both VIOS on each managed system.
$ chdev -dev fscsi0 -attr dyntrk=yes fc_err_recov=fast_fail -perm
fscsi0 changed
$ chdev -dev fcs0 -attr tme=yes -perm
fcs0 changed
$ chdev -dev fscsi2 -attr dyntrk=yes fc_err_recov=fast_fail -perm
fscsi2 changed
$ chdev -dev fcs2 -attr tme=yes -perm
fcs2 changed
$ shutdown -restart
A restart of each VIOS is required, and therefore, it is strongly recommended to modify one VIOS
at a time.
The VLAN ID must be 3358 for this to work. The following figure describes the virtual Ethernet
setup.
First, log in to each of the VIOS, and add an additional VLAN to each shared Ethernet bridge
adapter. This provides the VIOS connectivity to the 3358 VLAN.
The following figure shows how this additional VLAN can be added to the bridge adapter.
Next, create a virtual Ethernet adapter on the client partition, and set the port VLAN ID to
3358. This provides the client LPAR connectivity to the 3358 VLAN.
From AIX, run the cfgmgr command and pick up the virtual Ethernet adapter.
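Because only VLAN 3358 works, it can be worth confirming the port VLAN ID of the new adapter before creating the cluster. The following is a minimal sketch, assuming that the adapter statistics (for example, from entstat -d entX) contain a line of the form "Port VLAN ID: <number>". The function name port_vlan_id is hypothetical, and the sample lines are illustrative rather than captured from the systems in this article.

```shell
#!/bin/sh
# Sketch: read adapter statistics on stdin and print the port VLAN ID.
# Assumption: the output contains a line such as "Port VLAN ID:  3358".
port_vlan_id() {
    awk -F: '/Port VLAN ID/ { gsub(/[ \t]/, "", $2); print $2; exit }'
}

# Illustrative sample lines.
pvid=$(printf 'Port VLAN ID:  3358\nVLAN Tag IDs:  None\n' | port_vlan_id)
if [ "$pvid" = "3358" ]; then
    echo "adapter is on the heartbeat VLAN"
else
    echo "unexpected port VLAN ID: ${pvid:-none}"
fi
```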
After this is complete, we can create our PowerHA cluster, and the SAN heartbeat is ready for use.
NODE ha71_node1:
Network net_ether_01
ha71_node1 172.16.5.251
NODE ha71_node2:
Network net_ether_01
ha71_node2 172.16.5.252
….. etc…..
Retrieving data from available cluster nodes. This could take a few minutes.
root@ha71_node1:/home/root #
After creating the cluster definition, the next step is to check whether there is a free disk on each
node, so that we can configure the CAA repository.
root@ha71_node1:/home/root# lsdev -Cc disk
hdisk0 Available 00-00-01 IBM MPIO FC 2107
hdisk1 Available 00-00-01 IBM MPIO FC 2107
root@ha71_node1:/home/root# lspv
hdisk0 000966fa5e41e427 rootvg active
hdisk1 000966fa08520349 None
root@ha71_node1:/home/root#
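On nodes with many disks, the free ones can be picked out of the lspv listing automatically. The following is a minimal sketch, assuming the column layout shown above, where the third column holds the volume group name or None. The function name free_disks is hypothetical.

```shell
#!/bin/sh
# Sketch: read `lspv` output on stdin and print the disks that do not
# belong to any volume group (third column is "None").
free_disks() {
    awk '$3 == "None" { print $1 }'
}

# Sample taken from the lspv listing above.
free_disks <<'EOF'
hdisk0 000966fa5e41e427 rootvg active
hdisk1 000966fa08520349 None
EOF
```

For the sample above, this prints hdisk1. Disk names can differ between nodes, so it is safer to compare the physical volume identifier in the second column when confirming that both nodes see the same LUN.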
From the above example, it is clear that hdisk1 is a free disk, and in our setup it is free on
both nodes, so it can be used for the repository. Next, modify the cluster definition to include
hdisk1 as the cluster repository disk.
This can be performed using smitty hacmp or on the command line. The following example
shows how to perform this step on the command line.
root@ha71_node1:/home/root # clmgr modify cluster ha71_cluster REPOSITORY=hdisk1
NODE ha71_node1:
Network net_ether_01
ha71_node1 172.16.5.251
NODE ha71_node2:
Network net_ether_01
ha71_node2 172.16.5.252
root@ha71_node1:/home/root #
The next step is to verify and synchronize the cluster configuration. This can be performed using
smitty hacmp or on the command line. The following example shows how to synchronize the
cluster topology and resources on the command line.
root@ha71_node1:/home/root # cldare -rt
Timer object autoclverify already exists
Retrieving data from available cluster nodes. This could take a few minutes.
Node: Network:
---------------------------------- ----------------------------------
ha71_node1 net_ether_01
ha71_node2 net_ether_01
Now that a basic cluster has been configured, the last step is to verify that the SAN heartbeat is
up.
The lscluster -i command displays the cluster interfaces and their status. The sfwcom (Storage
Framework Communication) interface is the SAN heartbeat.
In the following example, we can check this from one of the nodes to ensure that the SAN
heartbeat is up. This is good news!
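The same check can be scripted. The following is a minimal sketch, assuming that lscluster -i introduces each interface with a line of the form "Interface number <n> <name>" and later reports "Interface state <STATE>". The function name sfwcom_state is hypothetical, and the sample is abbreviated and illustrative, not output captured from this cluster.

```shell
#!/bin/sh
# Sketch: read `lscluster -i` output on stdin and print the state of
# every sfwcom interface it describes.
# Assumption: lines of the form "Interface number <n> <name>" and
# "Interface state <STATE>" appear for each interface.
sfwcom_state() {
    awk '
        $1 == "Interface" && $2 == "number" { name = $4 }
        $1 == "Interface" && $2 == "state" && name == "sfwcom" { print $3 }
    '
}

# Illustrative, abbreviated sample.
sfwcom_state <<'EOF'
Interface number 1 en0
    Interface state UP
Interface number 2 sfwcom
    Interface state UP
EOF
```

On a node, the function would be used as lscluster -i | sfwcom_state. One line is printed for each sfwcom interface in the output, so a two-node cluster would normally produce two lines.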
The remaining steps for cluster configuration, such as configuring shared storage, mirror pools, file
collections, application controllers, monitors, and so on are not covered in this article.