
Stefan Gocke

Consultant
Gocke IT Solutions

PowerHA and PowerVM together


building a virtualized cluster

Technical University/Symposia materials may not be reproduced in whole or in part without the prior written permission of IBM.
Stefan Gocke

Gocke IT Solutions

stefan.gocke@t-online.de

In IT Industry since 1986 - Freelance since 2003


based south of Munich
Projects in POWER Systems
(LPAR, Virtualization, HACMP, TSA, etc.)
Projects in Storage
(DS8000, DS4000, SVC, V7K, SAN)
Projects with TSM and Tape
(AIX, Windows, Linux, Solaris, Netware, z/OS TSM SRV)
Trainings for ArrowECS, GlobalKnowledge, Avnet, FastLane, ....
(AIX, PowerHA, GPFS, LPAR, POWER7/8, TSM, SVC, Storwize, DS8000, SAN)

IBM CATE Power Systems for AIX V3, and many more.
Agenda
Differences between PowerHA V6.1 and V7.1
Identify and work around SPOFs
in a virtualized environment
(Live) Partition Mobility
and PowerHA SystemMirror (HACMP)
VIO Server functions and
PowerHA SystemMirror (HACMP)
POWER7 AME and PowerHA SystemMirror (HACMP)
SR-IOV / IVE - with PowerHA System Mirror (HACMP)
It's your session

If you don't ask, you can't get an answer

So please ask questions at all times

Or email later to : Stefan.Gocke@t-online.de


PowerHA SystemMirror and Virtual Devices

PowerHA SystemMirror will work with virtual devices.

PowerHA SystemMirror is supported with Live Partition Mobility

All IBM Education Services classes on PowerHA SystemMirror
run on fully virtualized LPARs.

Some restrictions apply when using virtual ethernet and/or
virtual disk access.

Using a virtual I/O server will add new SPOFs to the cluster which
need to be taken into account.
What's different with PowerHA V7.1(.x)
PowerHA SystemMirror Version 7.1.x

Change in the PowerHA cluster through the new use of
Cluster Aware AIX (CAA) instead of Topology Services:
no topsvcs any more
New SAN heartbeating, no disk heartbeat any more
Use of the AIX CAA repository
Different SAN zoning (may be) needed
Network needs multicast enabled
cthags instead of grpsvcs
rootvg system event monitoring through AHAFS

And don't use PowerHA V7.1.0 !!
(the CAA redesign afterwards makes migrations from it unusable)
PowerHA SystemMirror Version 7.1.x

The HACMP part is still the same!


HACMPODM is still there
Different GUI
Some more options in resource groups (not discussed here)

My personal testing viewpoints:


The failover is better / faster
The cluster is more stable
The repository in CAA is different ...
Cluster repository
There is only one repository to make sure no split brain situations can occur.

Starting with PowerHA SystemMirror V7.1.1 SP2 repository resiliency has been
implemented, so a failure of the repository will not fail the cluster and the failed disk
can be recovered.

Starting with PowerHA SystemMirror V7.1.3 you can replace the repository disk
clmgr replace repository hdisk<x>

You can prepare for the loss of a repository
clmgr add repository hdisk<x>
clmgr replace repository (when the first one failed)

You can add a second repository when building the cluster
clmgr add cluster <xxx> repository hdiskp,hdiskq
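A quick check after adding or replacing the repository (a sketch only; verify the exact clmgr syntax on your PowerHA level):

clmgr query repository (lists the configured repository disk(s))
lspv | grep caavg_private (the active repository disk carries the CAA private volume group)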



Multicast IP, SAN and Repository disk requirements

Cluster communication requires a multicast IP address to be used.


You can specify this address when you create the cluster, or you can let one be generated
automatically.
The ranges 224.0.0.0 through 224.0.0.255 and 239.0.0.0 through 239.255.255.255 are
reserved for administrative and maintenance purposes.
Also, make sure the multicast traffic generated by each of the cluster nodes is properly
forwarded by the network infrastructure toward any other cluster node.
Should you use SAN-based heartbeating, you must have zoning set up to ensure
connectivity between host FC adapters. You also need to activate the Target Mode Enabled
parameter on the involved FC adapters.
(see later charts for a better explanation)
Hardware redundancy at the storage subsystem level is mandatory for the Cluster
Repository disk. LVM mirroring of the repository disk is not supported. This disk should be at
least 1 GB in size.
Starting with PowerHA 7.1.3, unicast addresses can be used again.
(This is still untested as eGA will be mid December 2013)
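Multicast delivery between the future cluster nodes can be tested with the AIX mping tool before the cluster is created (a sketch; the multicast address is only an example, check the mping options on your AIX level):

node 2 (receiver): mping -r -v -c 5 -a 228.168.101.43
node 1 (sender): mping -s -v -c 5 -a 228.168.101.43

If the receiver does not report the sender's packets, the network infrastructure is not forwarding the multicast traffic.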

Differences PowerHA V6.1 and V7.1.x
PowerHA V6.1 uses RSCT Topology Services
PowerHA V7.1 uses CAA
Don't start/use V7.1.0, use 7.1.2 / 7.1.3

Repository disk considerations

Multicast instead of unicast heartbeating
SAN heartbeating using multicast

Both use RSCT Group Services

Be aware of a memory leak in RSCT (IV69760/IV66606)

IBM.ConfigRM daemon on the CAA group leader (lssrc -ls IBM.ConfigRM)


Information on IV69760
Be aware of a memory leak in RSCT (IV69760/IV66606)

rsct.core.rmc from 3.1.5.0 to 3.2.0.4, in IBM.ConfigRM (only the group leader)
in CAA, which can bring the CAA node down.

Time to failure after a new boot is estimated to be between 6 and 8 months.

A reboot of the node will remedy the situation.

"lssrc -ls IBM.ConfigRM" will show the active group leader.

Affects PowerHA V7 and VIO SSP. Fix is pending.


PowerHA SystemMirror V6.1 vs. V7.1.x architecture

PowerHA 6.1 stack: PowerHA 6.1 on top of RSCT (Resource Monitoring and Control, Resource Manager, Group Services, Topology Services), on top of AIX.

PowerHA 7.1.x stack: PowerHA 7.1.x on top of RSCT (Resource Monitoring and Control, Resource Manager, Group Services), on top of CAA and AIX.
The non-virtualized Cluster

Eliminating all single points of failure includes the following items:

Redundant access to the disks of the rootvg.
Redundant disks for the rootvg.
Redundant disks for the datavg.
Redundant adapters for the datavg.
Redundant network cards.

(Diagram: a client connected to Node 1 and Node 2, each node with mirrored rootvg disks hdisk0/hdisk1 and shared datavg disks hdisk2-hdisk5 on external SAN storage.)
The virtualized cluster node

In a virtualized environment, a VIO server is used for the disk access.

The following SPOFs are now added to the system:
- rootvg disks
- datavg disks
- VIO server partition

How can these SPOFs be eliminated from a storage (disk) point of view?

(Diagram: client LPARs A and B each with a vscsi0 adapter, mapped through the Hypervisor to vhost0/vhost1 and vtscsi0/vtscsi1 on a single VIOS 1 with scsi0.)
Eliminating SPOF for VIO storage (vSCSI)
Add a second VIO server

(Diagram: each AIX client LPAR uses LVM mirroring across two virtual SCSI adapters, vscsi0 served by VIOS 1 and vscsi1 served by VIOS 2; each VIOS presents its own copy of the disks via vhost/vtscsi devices.)
Redundancy Using Mirroring (as a reference only)

Define a second VIO server partition and allocate a second disk for the rootvg.

The backing device can be a whole disk (local or on the SAN) as well as a single LV.

Mirror the rootvg in the same way as always:

- extend the volume group


- mirror all LVs or use the mirrorvg command
- turn off quorum checking
- set the bootlist
- execute the bosboot command.

In case of a failure of one VIO server:

- The LPAR will see stale partitions


- Volume group needs to be resynchronized using syncvg or varyonvg

The procedure could be used for datavgs as well.

This procedure provides no automatic recovery, so it is not really usable.
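A minimal command sketch of the procedure above (hdisk0 comes from VIOS 1, hdisk1 from VIOS 2; names are only examples):

extendvg rootvg hdisk1
mirrorvg rootvg hdisk1
chvg -Qn rootvg (turn off quorum checking)
bosboot -ad /dev/hdisk1
bootlist -m normal hdisk0 hdisk1

After the failed VIO server is back, resynchronize the stale partitions with varyonvg rootvg or syncvg -v rootvg.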


Redundancy using MPIO

(Diagram: AIX client LPARs 1 and 2 use the MPIO default PCM over two virtual SCSI adapters, vscsi0 served by VIOS 1 and vscsi1 served by VIOS 2; both VIO servers see the same PV LUNs through their FC adapters (fcs0/fcs1) and map them via vhost/vtscsi devices.)
The Concept

The same LUN is mapped to both VIO servers in the SAN.

From both VIO servers, the LUN is mapped again to the same LPAR.

The LPAR will correctly identify the disk as an MPIO-capable device and create one hdisk device with two paths.

The device will work only in failover mode, other modes are not supported.

(Diagram: the same dual-VIOS MPIO setup with PV LUNs as on the previous chart.)

Disk Reservation

A VIO server will reserve a disk device on open. Opening means making the virtual SCSI target
available.
If the device driver is not able to ignore the reservation, the device cannot be mapped to a
second partition at the same time.
All devices accessed through a VIO server must support a no_reserve attribute.
This is not an issue if it is a real single-VIOS attachment (which means no redundancy and no
HACMP).
Consequences for PowerHA SystemMirror

The reservation held by a VIO server cannot be broken by PowerHA SystemMirror.

Reservation requests from the client partition cannot be passed to the real storage (so
far).

Only devices that will not be reserved on open are supported.

In this case PowerHA SystemMirror requires the use of
enhanced concurrent mode volume groups.

No problem for PowerHA V7.x, as only enhanced concurrent mode VGs are supported.

The traditional VG type is not supported in a virtualized environment due to the lack of
access control.

PowerHA SystemMirror V7.1.x will convert a VG automatically to an ecmvg.
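A sketch of checking and setting the reservation policy on the backing device in the VIOS (hdisk5 is only a placeholder; run as padmin):

lsdev -dev hdisk5 -attr reserve_policy
chdev -dev hdisk5 -attr reserve_policy=no_reserve

The same attribute can be verified from the client LPAR with lsattr -El hdiskX, as the next example shows.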


Example: Normal Operation
root@GITS-LPAR1:/home/root# lspv
hdisk0 00cd0e0ad3ff82a7 rootvg active
hdisk1 00cd0e0afb614ca2 rootvg active
hdisk2 00cd0e0aed8c400b dirvg active
hdisk3 00cd0e0aed8c40e2 dirvg active

root@GITS-LPAR1:/home/root# lsdev -Ccdisk


hdisk0 Available Virtual SCSI Disk Drive
hdisk1 Available Virtual SCSI Disk Drive
hdisk2 Available Virtual SCSI Disk Drive
hdisk3 Available Virtual SCSI Disk Drive

root@GITS-LPAR1:/home/root# lsattr -El hdisk2


PCM PCM/friend/vscsi Path Control Module False
algorithm fail_over Algorithm True
hcheck_cmd test_unit_rdy Health Check Command True
hcheck_interval 0 Health Check Interval True
hcheck_mode nonactive Health Check Mode True
max_transfer 0x40000 Maximum TRANSFER Size True
pvid 00cd0e0aed8c400b0000000000000000 Physical volume identifier False
queue_depth 3 Queue DEPTH True
reserve_policy no_reserve Reserve Policy True

root@GITS-LPAR1:/home/root# lspath
Enabled hdisk0 vscsi0
Enabled hdisk1 vscsi1
Enabled hdisk2 vscsi0
Enabled hdisk3 vscsi1
Enabled hdisk2 vscsi1
Enabled hdisk3 vscsi0
Failure Situation

root@GITS-LPAR1:/home/root# lspv
hdisk0 00cd0e0ad3ff82a7 rootvg active
hdisk1 00cd0e0afb614ca2 rootvg active
hdisk2 00cd0e0aed8c400b dirvg active
hdisk3 00cd0e0aed8c40e2 dirvg active

root@GITS-LPAR1:/home/root# lspath
Failed hdisk0 vscsi0
Enabled hdisk1 vscsi1
Failed hdisk2 vscsi0
Enabled hdisk3 vscsi1
Enabled hdisk2 vscsi1
Failed hdisk3 vscsi0

An errpt entry for the failure will be generated


Failure Recovery

Two error scenarios:

Device is Failed: needs activation, this can be automated
Device is Missing: needs cfgmgr

First step: Restart the failed VIOServer.


- Verify device access by the VIOServer and VSCSI mapping.
- lsdev type disk; lspv; lsmap -all

VSCSI does not (yet) provide the same level of comfort as software like SDDPCM. This
means:
- No daemon process for continuous path verification.
- No automatic recovery.

If the client LPAR has been rebooted while one VIOServer was down: cfgmgr

Use chpath to bring failed paths back:

chpath -l hdiskX -p vscsiN -s enable
Failure Recovery MPIO automatic recovery

Automatic path recovery is available (supported) since VIOS 1.3
in the following way:
Update the following attributes of each individual disk (hdisk) device:
assign a health check interval (given in seconds, check with the disk vendor)
* chdev -l hdisk3 -a hcheck_interval=60 -P
assign different priorities to each path:
* chpath -l hdisk3 -a priority=1 -p vscsi1
* chpath -l hdisk3 -a priority=2 -p vscsi2

If a path fails, it will be set to error. If a path returns, it will be set automatically back to the
enabled state. No need to call cfgmgr.

Be careful when performing maintenance of a VIO server partition !

lsattr -El <hdiskX> will only show the ODM value, it does not tell you whether a reboot has
been done to activate a change made with -P.
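To check what is actually configured for the paths, a short sketch (hdisk3 and vscsi1 are placeholders):

lspath -l hdisk3 (path status per parent adapter)
lspath -AHE -l hdisk3 -p vscsi1 (path attributes, e.g. priority)
lsattr -El hdisk3 -a hcheck_interval (ODM value of the health check interval)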
Scripts to help (from Brian Smith's blog)

If you set the parameters with -P, only the ODM is updated.

How do you see if a reboot has been made?

Check the settings with kdb, or use the scripts linked below:

https://www.ibm.com/developerworks/community/blogs/brian/entry/compare_hba_settings?lang=en
https://www.ibm.com/developerworks/community/blogs/brian?lang=en

Do at your own risk, as they use kdb.

(Screenshots: example output where a reboot had been done vs. where NO reboot has been made.)
MPIO Summary

Uses the simple fail_over mode of the VSCSI PCM only!

- Other algorithms may not be possible anyway.
- Can't guarantee the order of writes otherwise!
- At the moment this is the only mode available.
- Disadvantage: FC adapters in the second VIO server are underutilized.
- Can use chpath to turn a path off or on, but the primary path is
determined during cfgmgr automatically.
- If priority is configured for the devices, the lower value determines
the highest priority.

AIX CDs contain all you need in the client partitions:

- No special drivers required.


- Necessary filesets are installed automatically.
Tuning - Queue Depth

VIO Client Virtual Disk (Q1)
Queue depth: 1-256, default: 3
To avoid queuing on the virtual SCSI client adapter, the maximum number of LUNs per virtual client adapter should be:
Max LUNs = INT (510 / (Q1+3))

VIOC Virtual SCSI Client Adapter (Q2)
Command elements (CE): 512
2 CE for adapter use (512 - 2 = 510 remaining)
3 CE for each device for recovery
Q2 queue depth is fixed

VIOS Disk Adapter (Q3)
Default Q3 queue depth for FC is 200 (num_cmd_elems)

VIOS Physical Disk (Q4)
Queue depth varies by device (1-256)
Single queue per disk or LUN even if using LV-VSCSI disks

Note: There may also be queues for multi-path driver paths (e.g. SDD and PowerPath)

(Diagram: the four queues Q1-Q4 stacked from the VIO client virtual disk through the virtual SCSI client adapter, the PowerVM Hypervisor, the VIOS virtual adapter and target device, down to the VIOS disk adapter and the physical disk.)
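A worked example of the formula above: with the default virtual disk queue depth of 3, Max LUNs = INT(510 / (3+3)) = 85 LUNs per virtual client adapter; with a queue depth of 32, Max LUNs = INT(510 / (32+3)) = 14. To change the client disk queue depth (hdisk2 is a placeholder; -P writes the change to the ODM only, so it becomes active after the device is reconfigured or the LPAR is rebooted):

chdev -l hdisk2 -a queue_depth=20 -P
lsattr -El hdisk2 -a queue_depth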
MPIO using NPIV virtual fiberchannel devices

MPIO using NPIV requires the use of the MPIO device drivers supplied by the manufacturer of
the disk subsystem in the LPAR (again).

SDDPCM for IBM disk subsystems
(DS8000 / DS5000 / ...)
PowerPath for EMC disk subsystems
HDLM for Hitachi disk subsystems
...

Daemon controlled

Load balancing enabled per default

http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/FLASH10691
N_Port ID Virtualization (Virtual FC)
NPIV simplifies disk management

N_Port ID Virtualization
Multiple Virtual World Wide Port Names per FC port (PCIe 8 Gb adapter)
LPARs have direct visibility on SAN (Zoning/Masking)
I/O Virtualization configuration effort is reduced

(Diagram: with the Virtual SCSI model the AIX client LPAR sees only generic SCSI disks behind the VIOS; with N_Port ID Virtualization the AIX client LPAR sees the real DS8000 / HDS devices through the VIOS FC adapters and the SAN.)


N_Port ID Virtualization

Virtualizes FC adapters
Virtual WWPNs are attributes of the client virtual FC adapters, not of the physical adapters
64 WWPNs per FC port (128 per dual port HBA)

Customer Value
Use existing storage management
Allows common SAN managers, copy services, backup/restore, zoning, tape libraries, etc.
Load balancing across VIOS FC ports (8 Gb FC)
Allows mobility without manual management intervention

(Diagram: two VIOCs running multipath software over virtual FC client adapters; the Hypervisor connects them to virtual FC server adapters in the VIOSes, which own the physical 8 Gb FC ports attached to an NPIV-enabled SAN and tape.)
NPIV - Things to consider

A WWPN pair is generated EACH time you create a VFC.

It is NEVER re-created or re-used, just like a real HBA.
If you create a new VFC, you get a NEW pair of WWPNs.
Save the partition profile with the VFCs in it.
Make a copy, don't delete a profile with a VFC in it.
Make sure the partition profile is backed up for local and disaster recovery! Otherwise you'll
have to create new VFCs and map to them during a recovery.
Target Storage SUBSYSTEM must be zoned and visible from source and destination systems
for LPM to work.
Active/Passive storage controllers must BOTH be in the SAN zone for LPM to work
Do NOT include the VIOS physical 8G adapter WWPNs in the zone
You should NOT see any NPIV LUNs in the VIOS
Load multi-path code in the client LPAR, NOT in the VIOS
No passthru tunables in VIOS
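Two VIOS commands (run as padmin) that help when setting up and checking NPIV, a sketch:

lsnports (shows which physical FC ports are NPIV capable and how many WWPNs are still available)
lsmap -all -npiv (shows the vfchost-to-client and physical port mappings)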
PowerHA SystemMirror - NPIV setup

(Diagram: client LPARs 1 and 2 run the storage vendor's multi-path software over four virtual FC adapters fcs0-fcs3 each; the Hypervisor maps them to vfchost0-vfchost3 in VIOS 1 and VIOS 2, which share the PV LUNs through their physical FC ports.)
NPIV heterogeneous connectivity

Mix virtual fibre channel and physical fibre channel adapters for heterogeneous SAN connectivity
To fulfill service level agreements and provide redundancy
Cost effective redundancy
Tape support / LAN-free backup and restore support

(Diagram: an AIX client LPAR reaches disk through NPIV and/or vSCSI via the VIOS passthru module, and reaches tape drives and robotics through its own fibre HBA; a generic backup client uses the LAN to the backup server. Storage controller and tape library sit behind redundant SAN switches. Note: the tape attachment is NOT vSCSI.)


Considerations for SAN heartbeating
with PowerHA SystemMirror V7.1.x
Configure SAN heartbeating in a virtualized environment
Host zoning is not enough!
Target mode on the FC adapter(s) is needed

Enable the tme attribute on the (VIOS) FC adapters:

As padmin: chdev -dev fcsX -attr tme=yes -perm
(shutdown -restart needed)
On the HMC, Add a new virtual ethernet adapter to the profile of each PowerHA
virtual client node with a VLAN ID of 3358.
Reactivate the partition using the new profile.
When it boots, a new "entX" should show up, and there should be an sfwcommX
child of that entX.
Verify the configuration changes by running the following command:
[ lsdev -C | grep sfwcom ]
sfwcomm0 Available 01-00-02-FF Fiber Channel Storage Framework Comm
sfwcomm1 Available 01-01-02-FF Fiber Channel Storage Framework Comm

lscluster -i should list the interface as UP.


Notes on the VLAN 3358

VLAN 3358 only needs to be created on virtual client LPARs,


not VIOS. (?)

VLAN 3358 is the only value that CAA will use;
the VLAN tag of sfw0 should not be changed.

The entX adapter associated with VLAN 3358 does not require an
enX interface on top of it and does not require an IP address.
Eliminating SPOF for VIO LAN
(Virtual Ethernet Adapter)
Virtual Ethernet - the virtualized node

In a virtualized environment, a VIO server is needed
for access to the outside world.

The SEA acts as a layer-2 based ethernet bridge.

Now, the physical network devices in the VIO server
are new SPOFs.

How can these SPOFs be eliminated from a networking
point of view?

(Diagram: single VIOS, single NIC - the client's virtual NIC is bridged by the SEA in the VIOS to one physical NIC and one external switch.)
The Second VIO Server

(Diagram: VIOS 1 and VIOS 2 each bridge one physical and one virtual adapter through an SEA (ent2) onto VLAN 1 or VLAN 2; clients 1 and 2 run NIB link aggregation over two virtual adapters, one per VLAN, so each client has an active path through one VIOS and a passive path through the other, with the external ethernet switches carrying the untagged traffic.)
The Second VIO Server (using manual load balancing)

(Diagram: the same dual-VIOS NIB setup, but the active client interfaces are split across the two VIO servers.)

Note: If you split the active client interfaces across VIOSs,
those LPARs will talk to each other through the external
switches, not through the Hypervisor.
Etherchannel

The VIO server partitions can use an Etherchannel in link aggregation mode.

The VIO client partition can use an Etherchannel only in network interface backup (NIB) mode.
Link aggregation for client partitions is not supported/implemented.

The same PVID on both VIO servers is only allowed when using a second vSwitch (on
POWER6 and POWER7 firmware).

In case one VIOS fails, the Etherchannel in the client partition will automatically
switch to the second adapter.

The virtual server adapter must have the same PVIDs as the client adapters.
If this is true, all VLAN tags will be stripped off by the VIO server, which is a
requirement for PowerHA SystemMirror.
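In the client partition NIB is usually configured with smitty etherchannel; roughly the command-line equivalent (adapter names and the ping address are only placeholders):

mkdev -c adapter -s pseudo -t ibm_ech -a adapter_names=ent0 -a backup_adapter=ent1 -a netaddr=192.168.1.1

ent0 is the primary virtual adapter, ent1 the backup virtual adapter, and netaddr is the address pinged to detect a dead path.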
Dual VIO Server with Network Interface Backup (NIB)

Two virtual switches with the same VLAN ID

(Diagram: the PowerHA LPAR runs NIB over two virtual adapters (MAC Y on vSwitch0, MAC Z on vSwitch1, both VLAN 4); each vSwitch is bridged by the SEA of one VIOS, and each VIOS aggregates its four physical ports with IEEE 802.3ad link aggregation to its own external switch.)

Disadvantage:
More manual work on the LPARs
Should use an address to ping for the failover to work
(performance effect?)

Advantage:
Faster failover/failback than SEA failover (see next)

Still needs netmon.cf!


The second, the best-practice way! SEA failover

Starting with VIOS level 1.2 and HMC version 5.1,
the failover of an SEA device is supported in the system's firmware.

For this approach, two VIO servers with the same PVID for the SEA device have to be
configured.

A different trunk priority is assigned to each VIO server.

(Starting with firmware 720, an ARP storm cannot be created accidentally; the devices used
for the SEA will not become available should this configuration error accidentally have
been created.)

An additional heartbeat VLAN is used between the VIO servers for monitoring,
or on POWER7+/POWER8 a VLAN is used for this automatically.
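A sketch of creating the SEA with failover on each VIO server (device names are placeholders: ent0 = physical or link aggregation adapter, ent2 = virtual trunk adapter with PVID 1, ent3 = control channel adapter with PVID 99):

mkvdev -sea ent0 -vadapter ent2 -default ent2 -defaultid 1 -attr ha_mode=auto ctl_chan=ent3
lsdev -dev entX -attr ha_mode (verify on the newly created SEA; the trunk priorities themselves are set in the virtual adapter definitions on the HMC)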
Dual VIOs and SEA failover

(Diagram: VIOS 1 hosts the primary SEA (ent3 over the physical ent0 and the virtual trunk adapter ent2, PVID 1), VIOS 2 hosts the backup SEA; a control channel on PVID 99 connects the two VIOSes; clients 1 and 2 each have a single virtual adapter on VLAN 1 and need no special configuration.)
Failure Situation

(Diagram: when VIOS 1, its physical adapter, or its external switch fails, the SEA on VIOS 2 switches from backup to primary; the clients keep using their single virtual adapter unchanged.)
Virtual Ethernet Adapters Configuration (PowerHA V6.1)

(Diagram: one PowerHA network net_ether_0 with service IPs 9.19.51.20/9.19.51.21, persistent IPs 9.19.51.10/9.19.51.11 and base addresses 192.168.100.1/192.168.100.2 on en0 of PowerHA Node 1 (Frame 1) and Node 2 (Frame 2); topsvcs heartbeating also runs over serial_net_0. Each frame has two VIO servers whose SEAs (over link aggregation, with a control channel) bridge the client's virtual adapter.)
Single Interface PowerHA Recommendation (PowerHA V6.1)
and PowerHA V7.1.3 with unicast addresses

(Diagram: one network net_ether_0 with service IPs 9.19.51.20/9.19.51.21 and base addresses 9.19.51.10/9.19.51.11 on en0 of PowerHA Node 1 (Frame 1) and Node 2 (Frame 2); topsvcs heartbeating also runs over serial_net_0.)

In single adapter configurations using IPAT via aliasing, the base address
and the service IP address can be on the same routable subnet.

This change translates into no longer needing a persistent IP
address on each node.

This change came into effect with APAR IZ26020 (5.4) and IZ31675 (5.3).
The change basically removes the verification check.
PowerHA and Single Interface Networks

Generally, single interface networks are not allowed as this limits the error detection
capabilities of PowerHA SystemMirror (V6.1 and also V7.1).

If you decide to use a single interface network, consider the following:

add external IP addresses to netmon.cf for additional analysis.
losing the network connection will result in a failover of all resource groups using this interface
(selective failover).
verify that the failover of the Etherchannel works properly. Adjust heartbeat parameters if necessary,
otherwise you may experience a DGSP.

IZ01331: NEW NETMON FUNCTIONALITY TO SUPPORT HACMP ON VIO

!REQD !ALL 100.12.7.9


!REQD !ALL 110.12.7.9
!REQD !ALL host5.ibm.com
!REQD en1 9.12.11.10

This is also/still true for PowerHA Version 7.1.x


netmon.cf changes APAR IZ01331

Changes to netmon.cf for virtualized LPARs


The "traditional" format of the netmon.cf file has not changed -- one
hostname or IP address per line.
Any adapter matching one or more "!REQD" lines (as the owner) will
ignore any traditional lines.
Order from one line to the other is unimportant;

you can mix "!REQD" lines with traditional ones in any way. However, if using a full 32 traditional lines, do
not put them all at the very beginning of the file otherwise each adapter will read in all the traditional lines
(since those lines apply to any adapter by default), stop at 32 and quit reading the file there. The same
problem is not true in reverse, as "!REQD" lines which do not apply to a given adapter will be skipped over
and not count toward the 32 maximum.

Comments (lines beginning with "#") are allowed on or between lines and
will be ignored.
If more than 32 "!REQD" lines are specified for the same owner, any extra
will simply be ignored (just as with traditional lines).
Is netmon.cf needed for PowerHA Version 7.1.x ??
PowerHA SystemMirror Version 7.1.0 said it is not needed,
but the same problems still arise.

So PowerHA SystemMirror Version 7.1.1 had to
reintroduce the netmon.cf functionality.

netmon.cf was handled by Topology Services, which have moved to or are
replaced by CAA, so PowerHA 7.1.1+ reintroduced netmon.cf, handled now
by Group Services (cthags).
PowerHA Version 7.1.x netmon.cf

For all virtualized network adapters


(virtual ethernet, IVE, and SR-IOV)
and single adapter networks !!

use the netmon.cf


by adding IP-Addresses/hostnames

What about PowerHA SystemMirror Version 7.1.3 ?

The problem is still the same.

Have multiple virtual networks.
(service/backup/admin networks with separate SEAs/physical adapters)
The virtualized cluster
Virtualized PowerHA SystemMirror (HACMP) clusters

(Diagram: two managed systems, each with dual VIO servers providing SEA failover for the network (primary and backup SEAs with a control channel on PVID 99) and MPIO vSCSI for the disks; each HACMP client LPAR uses a single virtual ethernet adapter (en0) and two vscsi adapters, and the shared LUNs are presented through the multi-path drivers of both VIOSes.)
NPIV: PowerHA SystemMirror Configuration Example

(Diagram: on System 1 and System 2, each cluster node gets its rootvg (hdisk0) over vSCSI, a vscsi_vg (hdisk1, hdisk2) over MPIO across vscsi0/vscsi1 served by both VIOSes, and an npiv_vg (hdisk3, hdisk4) over MPIO across the NPIV virtual FC adapters fcs0/fcs1; the vSCSI and NPIV LUNs come from the same shared storage subsystem.)
The biggest problem areas Split Brain

Split brain situations - keeping data consistent

Redundancy has moved out of the cluster


(and into the VIO Servers / VIO Setup)

No direct connectivity between VIO and cluster

Hung cluster nodes: write post-event scripts


VIO server maintenance
Maintenance of the VIO Server Partition

Situation is different from a VIO server shutdown:

- A SW update may cause temporarily inconsistent code levels.
- Can't rely on continuous operation.
Solution: manual failover

- chpath -l hdiskXY -p vscsiXY -s disable
- smitty etherchannel: force a failover (at least AIX 530-02 needed)
- for LVM mirrored disks, set the virtual SCSI target devices
to the 'defined' state in the VIO server partition.

The VIO partition needs to be rebooted after maintenance!

Reintegration:
Same as after a failure for vSCSI
Reconfigure Etherchannels back to their previous settings
Switchback of the SEA failover is done automatically.
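For the LVM-mirrored vSCSI case, setting the virtual target device to the defined state and bringing it back could look roughly like this on the VIO server (as padmin; vtscsi0 is only a placeholder):

rmdev -dev vtscsi0 -ucfg (unconfigure only, the device goes to Defined)
perform the maintenance, then shutdown -restart the VIO server
cfgdev (re-configure the devices after the maintenance)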
Virtual or Dedicated Adapters?

The decision whether virtual or dedicated devices are used depends on the following factors (pick
any two :-) ):

- Costs
- Performance
- Flexibility

From a performance point of view, virtual ethernet inside of a managed system delivers
very high data transfer rates, but this will need CPU cycles.

The use of virtual disk I/O is quite cheap, nevertheless some latency will be added to
your I/O.

NPIV Virtual FiberChannel gets nearer to the original cluster setups


VIO Server Failure

What happens if a VIO server is down?


Simple answer: It depends!
If you only use the VIO server for networking, the network connections may be lost,
depending on the setup. Then a failover of the resource group will occur.
If I/O and networking are done through the VIO server, you will lose all connections between
your cluster nodes. In this case, the LPAR needs to be shut down immediately.

Otherwise you may receive a DGSP, which will halt one of your cluster nodes.
So: redundant VIO servers + SEA failover + MPIO
or
redundant VIO servers + SEA failover + NPIV
I would recommend NPIV / virtual Fibre Channel: easier to set up and maintain,
more active resiliency.
VIO Server Failure / Maintenance / Reboot

Be sure to check all LPARs at all times !!

This is essential !!

There is no guarantee that your automated recovery settings
(vSCSI / SEA failover / NIB failback) will work.
So !!
Be sure to check all LPARs afterwards !!
POWER6 and POWER7 - PowerVM
POWER6/POWER7 new virtualization options

Live Partition Mobility - next charts
Integrated Virtual Ethernet - like virtual ethernet
Multiple shared pools - works, no PowerHA effects
Shared Dedicated Capacity - supported/works
VIO 2.1 NPIV - discussed already
VIO 2.1.1 Active Memory Sharing - yes, special considerations
POWER7 Active Memory Expansion - supported/works
VIO 2.2 Shared Storage Pools - not usable
VIO 2.2.1 Shared Storage Pools - supported
VIO 2.3.x Shared Storage Pools - supported
PowerHA (HACMP) and Live Partition Mobility

Minimum requirements:
Firmware Level: 01EM320_40 (or higher) (Nov 2007)
VIOS Level: 1.5.1.1-FP-10.1 (w/Interim Fix 071116)

HMC Level: 7.3.2.

Live Partition Mobility minimum levels per HACMP release:

HACMP 5.3 with AIX V5.3: HACMP APAR IZ07791, AIX 5.3.7.1, RSCT 2.4.5.4
HACMP 5.3 with AIX V6.1: HACMP APAR IZ07791, AIX 6.1.0.1, RSCT 2.5.0.0
HACMP 5.4.1 with AIX V5.3: HACMP APAR IZ02620, AIX 5.3.7.1, RSCT 2.4.5.4
HACMP 5.4.1 with AIX V6.1: HACMP APAR IZ02620, AIX 6.1.0.1, RSCT 2.5.0.0

HACMP tolerates Live Partition Mobility

Live Partition Mobility will not cause unwanted cluster events. When using Live Partition
Mobility, error detection rates have to be set to normal or slow, not fast.

http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/FLASH10640
Live Partition Mobility Steps

(Diagram: the mobile partition runs on the POWER6 source system; mover service partitions on the source and destination systems, coordinated by the HMC over ethernet, transfer it into a partition shell on the POWER6 destination system.)

The HMC creates a compatible partition shell on the destination system


The HMC configures the mover service partitions on the source and destination systems
The HMC issues a prepare for migration event to the source operating system
The HMC creates the necessary vSCSI and/or NPIV devices in the destination systems VIOSes
The source mover starts sending partition state to the destination mover
Once sufficient pages have moved, the Hypervisor suspends the source partition
During the suspension, the source mover partition continues to send partition state information
The mobile partition resumes execution on the destination server
The destination partition retries all pending I/O requests that were not completed
When the destination mover partition receives the last memory page the migration is complete
Live Partition Mobility - the migration process

(Diagram: the memory pages of AIX Client 1 are copied between the two POWER6 systems by the mover service partitions (VASI) in the VIOSes, coordinated by the HMC; the client's vSCSI and virtual ethernet devices are recreated on the destination VIOS, the partition on the source side is suspended, and the data stays on the shared storage subsystem.)
Live Partition Mobility

(Diagram: HACMP LPAR x is migrated from System A to System B while its cluster partner LPAR y keeps running on System X; the VIOSes of both systems provide SEA networking and virtual FC access to the same SAN LUN, the mover service partitions (VASI/MSP) use a private network, and the HMC controls the migration through the service processors.)
Live Partition Mobility to the standby system - OK?

(Diagram: the same setup as on the previous chart; the question is whether migrating the LPAR onto the system that hosts its standby cluster node is OK.)
Live Partition Mobility and PowerHA SystemMirror

Normal maintenance with PowerHA SystemMirror means:

Failover time
No high availability

Maintenance using LPM means:

No downtime because there is no failover
High availability, at least for OS and application

Maybe a SPOF because all LPARs are in one system

What about errors during LPM !?!


Live Partition Mobility to the standby system - OK? YES

Maintenance using LPM means:
No downtime because there is no failover
High availability, at least for OS and application
Same SPOF as before, because all LPARs are in one system

What about errors during LPM !?!
That is what PowerHA SystemMirror is for!

(Diagram: the same dual-system LPM picture as on the previous charts.)
POWER7 Active Memory Expansion
See AME sessions and Redbook for more details
http://www.redbooks.ibm.com/abstracts/sg247590.html

Hardware feature - uses free CPU cycles to compress/uncompress memory

New POWER7+ has accelerator to get a higher expansion factor with less CPU overhead.

AME is supported with PowerHA SystemMirror (HACMP)

http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/FLASH10708
http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/FLASH10705
Active Memory Sharing (AMS)

For details see AMS Redbook


http://www.redbooks.ibm.com/abstracts/redp4470.html

Use on standby LPARs

Faster than DLPAR of memory

Start testing to find out,
I have no active experience

Supported
AMS Think about the paging

Spread paging I/O across many disk spindles


5 second glitch or 0.5 second glitch

Cache
Disk write caches

Solid State Disks


Other virtualization features

Multiple shared pools - used for licensing:
are you in the right pool?

Shared Dedicated Capacity:
performance gain possible when switching,
unused cycles of the standby LPAR are added to
the shared pool
Discussion - Questions
Now, or send your questions and/or remarks by email

Stefan.Gocke@t-online.de

The End

Thank You and enjoy the rest of the conference


Continue growing your IBM skills

ibm.com/training provides a
comprehensive portfolio of skills and career
accelerators that are designed to meet all
your training needs.

Training in cities local to you - where and


when you need it, and in the format you want
Use IBM Training Search to locate public training classes
near to you with our five Global Training Providers
Private training is also available with our Global Training Providers

Demanding a high standard of quality


view the paths to success
Browse Training Paths and Certifications to find the
course that is right for you

If you can't find the training that is right for you with our
Global Training Providers, we can help.
Contact IBM Training at dpmc@us.ibm.com
Global Skills Initiative

Copyright IBM Corporation 2015
