Stefan Gocke
Consultant, Gocke IT Solutions
stefan.gocke@t-online.de
Technical University/Symposia materials may not be reproduced in whole or in part without the prior written permission of IBM.
Agenda
Differences between PowerHA V6.1 and V7.1
Identify and work around SPOFs in a virtualized environment
(Live) Partition Mobility and PowerHA SystemMirror (HACMP)
VIO Server functions and PowerHA SystemMirror (HACMP)
POWER7 AME and PowerHA SystemMirror (HACMP)
SR-IOV / IVE with PowerHA SystemMirror (HACMP)
It's your session
Using a virtual I/O server will add new SPOFs to the cluster which need to be taken into account.
What's different with PowerHA V7.1(.x)
PowerHA SystemMirror Version 7.1.x
Starting with PowerHA SystemMirror V7.1.1 SP2, repository resiliency has been implemented, so a failure of the repository will not fail the cluster, and the failed disk can be recovered.
Starting with PowerHA SystemMirror V7.1.3 you can replace the repository disk:
clmgr replace repository hdisk<x>
You can prepare for the loss of a repository:
clmgr add repository hdisk<x>
clmgr replace repository (when the first one failed)
You can add a second repository when building the cluster:
clmgr add cluster <xxx> repository hdiskp,hdiskq
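As an illustration only, a possible sequence on a running cluster (the disk name hdisk5 is an assumption; the clmgr syntax is the one shown above):
lspv | grep caavg_private      # identify the disk currently holding the CAA repository
clmgr add repository hdisk5    # register a backup repository disk
clmgr replace repository       # switch over after the active repository disk has failed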
Differences PowerHA V6.1 and V7.1.x
PowerHA V6.1 uses RSCT Topology Services
PowerHA V7.1 uses CAA
Don't start with (or use) V7.1.0; use 7.1.2 / 7.1.3.
Software stacks (diagram):
PowerHA 6.1: PowerHA on RSCT (Resource Monitoring and Control, Resource Managers, Group Services, Topology Services) on AIX
PowerHA SystemMirror V7.1.x: PowerHA on RSCT (Resource Monitoring and Control, Resource Managers, Group Services) on CAA on AIX
The non-virtualized Cluster
[Diagram: Node 1 and Node 2, each with redundant disks for the rootvg (hdisk0) and redundant disks for the datavg (hdisk1); in the virtualized variant the rootvg and datavg disks (LUNs A and B) are served by VIOS 1 through scsi0, vhost0/vhost1, and vtscsi0/vtscsi1]
How can these SPOFs from a storage (disk) point of view be eliminated?
Redundancy Using Mirroring (as a reference only)
Define a second VIO server partition and allocate a second disk for the rootvg.
The backing device can be a whole disk (local or on the SAN) as well as a single LV.
[Diagram: VIOS 1 (fcs0) and VIOS 2 (fcs1) both attached to PV LUNs A and B]
The Concept
[Diagram: PV LUNs A and B mapped through both VIO servers (vhost0/vhost1) across the hypervisor to the same LPAR]
The same LUN is presented through both VIO servers to the same LPAR. The LPAR will correctly identify the disk as a single MPIO device with two paths.
A VIO server will reserve a disk device on open (opening means making the virtual SCSI target available).
If the device driver is not able to ignore the reservation, the device cannot be mapped to a second partition at the same time.
All devices accessed through a VIO server must support a no_reserve attribute.
This is not an issue if it is a real single-VIOS attachment (which means no redundancy and no HACMP).
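A minimal sketch of checking and setting that attribute on the VIO server (the disk name is an example; chdev and lsattr are standard AIX/VIOS commands):
lsattr -El hdisk2 -a reserve_policy           # check the current reservation policy (as root)
chdev -l hdisk2 -a reserve_policy=no_reserve  # disable SCSI reservations so both VIOS can map the disk
# or, from the padmin shell on the VIOS:
chdev -dev hdisk2 -attr reserve_policy=no_reserve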
Consequences for PowerHA SystemMirror
The reservation held by a VIO server cannot be broken by PowerHA SystemMirror.
Reservation requests from the client partition cannot be passed to the real storage (so far).
The traditional VG type is not supported in a virtualized environment due to the lack of access control.
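As a hedged illustration: shared volume groups are therefore typically created in enhanced concurrent mode instead (VG and disk names are assumptions):
mkvg -C -y datavg hdisk2 hdisk3   # -C creates an enhanced concurrent capable VG
lsvg datavg | grep -i concurrent  # verify the concurrent capability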
root@GITS-LPAR1:/home/root# lspath
Enabled hdisk0 vscsi0
Enabled hdisk1 vscsi1
Enabled hdisk2 vscsi0
Enabled hdisk3 vscsi1
Enabled hdisk2 vscsi1
Enabled hdisk3 vscsi0
Failure Situation
root@GITS-LPAR1:/home/root# lspv
hdisk0 00cd0e0ad3ff82a7 rootvg active
hdisk1 00cd0e0afb614ca2 rootvg active
hdisk2 00cd0e0aed8c400b dirvg active
hdisk3 00cd0e0aed8c40e2 dirvg active
root@GITS-LPAR1:/home/root# lspath
Failed hdisk0 vscsi0
Enabled hdisk1 vscsi1
Failed hdisk2 vscsi0
Enabled hdisk3 vscsi1
Enabled hdisk2 vscsi1
Failed hdisk3 vscsi0
VSCSI does not (yet) provide the same level of comfort as software like SDDPCM. This means:
- No daemon process for continuous path verification.
- No automatic recovery.
If the client LPAR has been rebooted while one VIO server was down, run cfgmgr to recover the missing paths.
If a path fails, it will be set to Failed. If a path returns, it will automatically be set back to the Enabled state; there is no need to call cfgmgr.
lsattr -El hdisk<x> will now show whether the setting is active or only stored in the ODM (i.e., a reboot has not yet been made).
Scripts to help (from Brian Smith's blog)
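Not Brian Smith's actual script, just a minimal sketch of the idea: report any vSCSI path that is not Enabled (lspath prints "state disk parent" per line):
#!/bin/ksh
# report_failed_paths.ksh - list all paths that are not in the Enabled state
lspath | grep -v Enabled | while read state disk parent; do
    echo "Path $disk via $parent is in state: $state"
done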
Virtual SCSI queue depths (diagram: VIOC virtual SCSI client adapter and VIOS disk adapter across the PowerVM Hypervisor):
- VIOC virtual SCSI client adapter (Q2): 512 command elements (CE); 2 CE for adapter use (512 - 2 = 510 remaining); 3 CE for each device for recovery. The Q2 queue depth is fixed.
- VIOS disk adapter (Q3): the default Q3 queue depth for FC is 200 (num_cmd_elems).
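From these numbers a common sizing rule of thumb follows (a sketch, assuming the default per-LUN queue depth of 3 for vSCSI disks):
# usable command elements per client adapter: 512 - 2 = 510
# cost per LUN: queue_depth + 3 recovery CEs
echo $(( 510 / (3 + 3) ))   # about 85 LUNs per virtual SCSI client adapter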
MPIO using NPIV requires the use of the MPIO device drivers supplied by the manufacturer of the disk subsystem in the LPAR (again).
These drivers are daemon controlled (continuous path verification).
http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/FLASH10691
N_Port ID Virtualization (Virtual FC)
NPIV simplifies disk management:
- Multiple virtual World Wide Port Names per FC port (PCIe 8 Gb adapter)
- LPARs have direct visibility on the SAN (zoning/masking)
- I/O virtualization configuration effort is reduced
- Virtualizes FC adapters
- Virtual WWPNs are attributes of the client virtual FC adapters, not of the physical adapters
- 64 WWPNs per FC port (128 per dual-port HBA)
[Diagram: two VIOS with FC adapters; the VIO clients run multipath software over virtual FC client adapters through the hypervisor to PV LUNs A and B on the SAN]
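As a sketch of the VIOS side (the adapter names are assumptions; vfcmap and lsmap are standard VIOS commands):
vfcmap -vadapter vfchost0 -fcp fcs0   # map the virtual FC server adapter to a physical NPIV-capable port
lsmap -all -npiv                      # verify the NPIV mappings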
NPIV heterogeneous connectivity
[Diagram: NPIV heterogeneous connectivity. An AIX client LPAR with multipath software attaches through two VIOS (NPIV passthru module, NPIV-capable FC HBAs) and two SAN switches to a storage controller (LUNs A, B, C, D) and a tape library (FC tape drives and robotics); a LAN-attached backup server is also shown]
The entX adapter associated with VLAN 3358 does not require an enX interface on top of it and does not require an IP address.
Eliminating SPOF for VIO LAN (Virtual Ethernet Adapter)
Virtual Ethernet: the virtualized node. In a virtualized environment, a VIO server is needed.
[Diagram: VIO client connected through a single VIOS to a single NIC and a single switch]
Now, how can the physical network devices in the VIO server be eliminated as SPOFs?
The Second VIO Server
[Diagram: two VIOS bridge VLAN 1 and VLAN 2 through untagged ports to two Ethernet switches; one VIOS is active, the other passive]
The Second VIO Server (using manual load balancing)
[Diagram: as above, but VLAN 1 is active on the first VIOS and VLAN 2 on the second, balancing the load manually; the respective other VIOS is passive]
Etherchannel
The VIO server partitions can use an EtherChannel in link aggregation mode.
The VIO client partition can use an EtherChannel only in network interface backup mode (see the sketch below); link aggregation for client partitions is not supported/implemented.
The same PVID for both VIO servers is only allowed using a second vSwitch (on POWER6 and POWER7 firmware).
In case one VIOS fails, the EtherChannel in the client partition will automatically switch to the second adapter.
The virtual server adapter must have the same PVIDs as the client adapters.
If this is true, all VLAN tags will be stripped off by the VIO server, which is a requirement for PowerHA SystemMirror.
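A minimal sketch of creating such a network interface backup EtherChannel on the client LPAR (adapter names and the ping address are assumptions):
mkdev -c adapter -s pseudo -t ibm_ech \
      -a adapter_names=ent0 -a backup_adapter=ent1 \
      -a netaddr=9.19.51.1    # address to ping for failure detection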
Dual VIO-Server with Network Interface Backup (NIB)
[Diagram: a PowerHA LPAR with an NIB EtherChannel over ent0 (MAC Y) and ent1 (MAC Z) on VLAN 4; two virtual switches with the same VLAN ID; each VIOS bridges through an 802.3ad aggregation (ent4) to Switch 1 and Switch 2]
Disadvantage:
- More manual work on the LPARs
- An address to ping should be used for failover to work (performance effect?)
Advantage:
- Faster failover/failback than SEA failover (see next)
For this approach, two VIO servers with the same PVID for the SEA device have to be configured.
(Starting with firmware 720, an ARP storm cannot be accidentally caused; the devices used for the SEA will not become available should this configuration error accidentally have been created.)
An additional heartbeat VLAN is used between the VIO servers for monitoring, or on POWER7+/POWER8 a VLAN is used automatically. A sketch of the SEA setup follows.
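A hedged sketch of creating the SEA with failover on each VIOS (all device names and the PVID are assumptions; mkvdev -sea is the standard VIOS command):
# ent0 = physical adapter, ent1 = virtual trunk adapter (PVID 1), ent2 = control channel adapter
mkvdev -sea ent0 -vadapter ent1 -default ent1 -defaultid 1 \
       -attr ha_mode=auto ctl_chan=ent2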
Dual VIOs and SEA failover
[Diagram: two VIOS, each with an SEA and an en3 interface on top; the primary SEA is on VIOS 1, the backup on VIOS 2; both bridge VLAN 1 untagged; active/passive]
Failure Situation
[Diagram: the physical adapter (ent0) behind the primary SEA fails (X); the former backup SEA on the second VIOS becomes primary and traffic switches over; same VLAN 1 untagged layout as before]
Virtual Ethernet Adapters Configuration (PowerHA V6.1)
[Diagram: network net_ether_0 spanning VIOS1, the AIX client LPAR, and VIOS2 across the hypervisor]
Single Interface PowerHA Recommendation (PowerHA V6.1) and PowerHA V7.1.3 with Unicast Addresses
[Diagram: net_ether_0 with base addresses 9.19.51.10 and 9.19.51.11 and Topsvcs heartbeating between them]
In single adapter configurations using IPAT via Aliasing, the base address and the service IP address can be on the same routable subnet.
This change came into effect with APAR IZ26020 (5.4) and IZ31675 (5.3). The change basically removes the verification check.
PowerHA and Single Interface Networks
Generally, single interface networks are not allowed, as this limits the error detection capabilities of PowerHA SystemMirror (V6.1 and also V7.1).
For such single-adapter networks, the netmon.cf file is used to improve error detection; its format rules are as follows (an example follows below):
You can mix "!REQD" lines with traditional ones in any way. However, if using a full 32 traditional lines, do not put them all at the very beginning of the file; otherwise each adapter will read in all the traditional lines (since those lines apply to any adapter by default), stop at 32, and quit reading the file there. The same problem is not true in reverse, as "!REQD" lines which do not apply to a given adapter will be skipped over and do not count toward the 32 maximum.
Comments (lines beginning with "#") are allowed on or between lines and will be ignored.
If more than 32 "!REQD" lines are specified for the same owner, any extra will simply be ignored (just as with traditional lines).
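A short illustrative netmon.cf (the interface name and addresses are assumptions; the file usually lives at /usr/es/sbin/cluster/netmon.cf):
# traditional line: any local adapter may ping this target
9.19.51.1
# !REQD line: en0 is only considered up if it can reach this specific target
!REQD en0 9.19.51.1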
Is netmon.cf needed for PowerHA Version 7.1.x?
For PowerHA SystemMirror Version 7.1.0 the answer was "not needed", but the same problems still arise.
[Diagram: two managed systems, each with dual VIOS; the client LPARs (A) use the MPIO default PCM over vscsi0/vscsi1 for disks and a virtual Ethernet adapter ent0 (PVID=99) with en0 on top; the VIOS own the physical fcs and ent adapters; active/passive paths]
NPIV: PowerHA SystemMirror Configuration Example
[Diagram: System 1 and System 2, each with VIOS 1 and VIOS 2 and a cluster node. Node 1 and Node 2 each boot from hdisk0 (rootvg) over vscsi0. Shared LUNs from the storage subsystem are presented in two ways: over VSCSI with MPIO across vscsi0/vscsi1 (vscsi_vg on hdisk1, hdisk2, hdisk3, mapped via vhost0) and over NPIV virtual FC on fcs1 with MPIO (npiv_vg on hdisk4)]
The biggest problem area: Split Brain
The decision whether virtual or dedicated devices are used depends on the following factors (pick any two ☺):
- Costs
- Performance
- Flexibility
From a performance point of view, virtual Ethernet inside of a managed system delivers very high data transfer rates, but this will need CPU cycles.
The use of virtual disk I/O is quite cheap; nevertheless, some latency will be added to your I/O.
Otherwise you may receive a DGSP message, which will halt one of your cluster nodes.
So: redundant VIO servers + SEA failover + MPIO, or redundant VIO servers + SEA failover + NPIV.
I would recommend NPIV / virtual Fibre Channel: easier to set up and maintain, more active resiliency.
VIO Server Failure / Maintenance / Reboot
This is essential!
Minimum requirements:
Firmware level: 01EM320_40 (or higher) (Nov 2007)
VIOS level: 1.5.1.1-FP-10.1 (w/ Interim Fix 071116)
http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/FLASH10640
Live Partition Mobility Steps
[Diagram: LPM steps 1 to 6, coordinated by the HMC: the memory pages (M) of the running AIX client partition on the source PowerVM system are copied to a shell partition on the target PowerVM system; the source partition is briefly suspended before the client resumes on the target with its network identity (en0, address A) intact]
Live Partition Mobility
[Diagram: an HACMP LPAR (en0, hdisk0) is migrated to another managed system. On each system a VIOS bridges a virtual adapter (ent1) and a physical adapter (ent0) through an SEA (ent2) to the Ethernet network; fcs0 attaches the shared disk. The service processors and the HMC communicate over private networks]
What about errors during LPM!?
That is what PowerHA SystemMirror is for!
Active Memory Expansion (AME): the new POWER7+ has an accelerator to get a higher expansion factor with less CPU overhead.
http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/FLASH10708
http://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/FLASH10705
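Not from the original deck, just a quick way to observe AME on a running LPAR (assuming AME is enabled in the partition profile):
lparstat -c 2 5   # report AME statistics (expansion factor, compressed pool usage) every 2 seconds, 5 times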
Active Memory Sharing (AMS)
AMS is supported. With AMS, think about the placement of the paging devices (cache, disk write caches).
Stefan.Gocke@t-online.de
The End
ibm.com/training provides a comprehensive portfolio of skills and career accelerators that are designed to meet all your training needs.
If you can't find the training that is right for you with our Global Training Providers, we can help.
Contact IBM Training at dpmc@us.ibm.com
Global Skills Initiative