An 313 Inst

V5.4.0.
cover
Front cover
Power Systems for AIX -

Virtualization II: Advanced
PowerVM and Performance
(Course code AN31)
Instructor Guide
ERC 3.1
Instructor Guide
Trademarks
IBM is a registered trademark of International Business Machines Corporation.
The following are trademarks of International Business Machines Corporation in the United
States, or other countries, or both:
Active Memory AIX 5L AIX 6
AIX BladeCenter DB
DB2 EnergyScale Express
HACMP IBM Systems Director Active i5/OS
Energy Manager
i5/OS Micro-Partitioning Power Architecture
POWER Hypervisor Power Systems POWER
PowerVM POWER4 POWER5
POWER5+ POWER6+ POWER6
POWER7 Systems POWER7 pSeries
Redbooks System i System p
System p5 System z Systems Director
VMControl
Tivoli Workload Partitions z/VM
Manager
400
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or
both.
Windows and Windows NT are trademarks of Microsoft Corporation in the United States,
other countries, or both.
UNIX is a registered trademark of The Open Group in the United States and other
countries.
Java and all Java-based trademarks and logos are trademarks or registered trademarks of
Oracle and/or its affiliates.
Other product and service names might be trademarks of IBM or other companies.
May 2011 edition

The information contained in this document has not been submitted to any formal IBM test and is distributed on an as is basis without
any warranty either express or implied. The use of this information or the implementation of any of these techniques is a customer
responsibility and depends on the customers ability to evaluate and integrate them into the customers operational environment. While
each item may have been reviewed by IBM for accuracy in a specific situation, there is no guarantee that the same or similar results will
result elsewhere. Customers attempting to adapt these techniques to their own environments do so at their own risk.
Copyright International Business Machines Corporation 2010, 2011.

This document may not be reproduced in whole or in part without the prior written permission of IBM.
Note to U.S. Government Users Documentation related to restricted rights Use, duplication or disclosure is subject to restrictions
set forth in GSA ADP Schedule Contract with IBM Corp.
V5.4.0.3
Instructor Guide
TOC Contents
Trademarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv
Instructor course overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvii
Course description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xix
Agenda . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxi
Unit 1. PowerVM features review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-1

Unit objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-2
Virtualization technologies on POWER systems . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-4
IBM PowerVM editions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-9
POWER6 processor architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-12
POWER7 processor architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-15
Micro-Partitioning: Shared processors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-18
Simultaneous multithreading and Micro-Partitioning . . . . . . . . . . . . . . . . . . . . . . . 1-21
Micro-Partitioning: Processor folding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-23
Shared dedicated capacity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-26
Multiple shared processor pools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-28
Virtual Ethernet adapters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-31
Virtual I/O Server example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-34
Virtual I/O Server: Shared Ethernet adapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-38
Integrated Virtual Ethernet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-41
Virtual I/O Server: Virtual SCSI architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-44
N_Port_ID Virtualization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-46
Virtual tape . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-48
PowerVM Live Partition Mobility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-50
PowerVM Active Memory Sharing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-52
Active Memory Expansion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-54
PowerVM Lx86 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-57
What is performance management? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-59
Performance methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-61
Performance analysis flowchart . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-63
AIX performance analysis tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-65
AIX performance tuning tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-69
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-72
Checkpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-74
Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-76
Unit summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-78
Unit 2. Processor virtualization tuning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-1

Unit objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-3
What is simultaneous multithreading? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-5
When to use simultaneous multithreading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-7
Copyright IBM Corp. 2010, 2011 Contents iii

Course materials may not be reproduced in whole or in part
without the prior written permission of IBM.
Instructor Guide
POWER7 intelligent threads . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-11

Turning on or off simultaneous multithreading (1 of 2) . . . . . . . . . . . . . . . . . . . . . .2-13
Turning on or off simultaneous multithreading (2 of 2) . . . . . . . . . . . . . . . . . . . . . .2-16
Traditional CPU utilization statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-18
POWER6: Processor Utilization Resource Register . . . . . . . . . . . . . . . . . . . . . . . .2-21
POWER7: Processor Utilization Resource Register . . . . . . . . . . . . . . . . . . . . . . . .2-24
Scaled Performance Utilization Resource Register . . . . . . . . . . . . . . . . . . . . . . . .2-26
CPU utilization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-29
CPU utilization metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-32
Additional CPU utilization metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-34
Commands supporting PURR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-37
Using sar with simultaneous multithreading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-40
Using mpstat: POWER6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-42
Using mpstat: POWER7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-44
topas: Example screens . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-46
POWER6 utilization view: SMT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-48
POWER7 utilization view: SMT4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-50
POWER7 utilization view: Dynamic SMT scheduling . . . . . . . . . . . . . . . . . . . . . . .2-52
2.1 Shared processor LPAR considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-55
Dedicated processors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-57
Shared processors (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-60
Shared processors (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-63
Shared processor pool . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-66
Virtual processors (1 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-68
Shared processor dispatch latency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-76
Shared processor affinity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-79
Scheduling affinity domains . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-81
Capped shared processor LPAR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-85
Uncapped shared processor LPAR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-87
Capacity and virtual processor relationship . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-89
Virtual processors: What to do? (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-92
Virtual processors: What to do? (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-95
Virtual processors: Cede, confer, or fold . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-98
Virtual processor folding (1 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-102
Simultaneous multithreading and SPLPARs . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-110
Metrics with simultaneous multithreading and SPLPAR . . . . . . . . . . . . . . . . . . . .2-112
AIX 6.1 SPLPAR tool impact (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-114
AIX 6.1 SPLPAR tool impact (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-117
Using sar with SPLPAR (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-120
Using sar with SPLPAR (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-123
topas: Example main screen . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-126
Partition data with topas - L . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-128
Cross partition data with topas -C . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-130
Understanding lbusy percentage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-133
iv PowerVM Virtualization II Copyright IBM Corp. 2010, 2011

V5.4.0.3
Instructor Guide
TOC Micro-Partitioning and applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-135

Micro-Partitioning and capacity planning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-137
Checkpoint (1 of 4) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-139
Checkpoint (2 of 4) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-141
Checkpoint (3 of 4) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-143
Checkpoint (4 of 4) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-145
Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-147
Unit summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-149
Unit 3. Dedicated shared capacity and multiple shared processor pools . . . . . . . . 3-1
Unit objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-2
Topic 1: Dedicated shared processors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-4
Dedicated processors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-6
Shared dedicated processors: Donating mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-8
POWER virtualization enhancement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-10
Dedicated processors: Enabling donating mode . . . . . . . . . . . . . . . . . . . . . . . . . . 3-12
Viewing the sharing mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-15
Working with sharing/donor mode from CLI (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . 3-17
Working with sharing/donor mode from CLI (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . 3-19
Viewing donating mode in AIX tools (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-21
Viewing donating mode in AIX tools (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-23
Viewing donating mode: HMC utilization data . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-26
Dedicated processors donating scenario . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-28
Dedicated idle cycles donation: New metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-31
Processor folding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-34
Processor folding: Maximizing the idle capacity (1 of 2) . . . . . . . . . . . . . . . . . . . . 3-37
Processor folding: Maximizing the idle capacity (2 of 2) . . . . . . . . . . . . . . . . . . . . 3-39
Checkpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-41
Topic 1: Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-43
Topic 2: Multiple shared processor pools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-45
What are multiple shared processor pools? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-47
Multiple shared processor pools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-49
Multiple shared processor pool: Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-51
CPU consumption for uncapped partitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-53
CPU usage in a user-defined shared processor pool . . . . . . . . . . . . . . . . . . . . . . 3-55
Virtual shared processor pools: Resolution level . . . . . . . . . . . . . . . . . . . . . . . . . 3-57
Hardware and software requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-59
Configuring multiple shared pools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-61
Managed system properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-63
Change attributes of shared processor pools (1 of 3) . . . . . . . . . . . . . . . . . . . . . . 3-65
Changing the LPAR shared pool assignment . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-71
Viewing shared pools in AIX tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-73
Viewing shared pools from HMC CLI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-75
Monitoring shared pools: AIX tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-77
Monitoring shared pools: HMC utilization data . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-79
Checkpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-81
Copyright IBM Corp. 2010, 2011 Contents v

Instructor Guide
Topic 2: Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-83

Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-85
Unit summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-87
Unit 4. Active Memory Sharing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-1

Unit objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-2
What is PowerVM Active Memory Sharing? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-4
Dedicated and shared memory types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-7
When to use PowerVM Active Memory Sharing . . . . . . . . . . . . . . . . . . . . . . . . . . .4-10
PowerVM Active Memory Sharing requirements . . . . . . . . . . . . . . . . . . . . . . . . . .4-13
AMS configuration restrictions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-15
Active Memory Sharing components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-17
Virtualization control point . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-20
Active Memory Sharing Manager (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-22
Active Memory Sharing Manager (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-24
Operating system support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-26
Paging Virtual I/O Server . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-28
I/O entitled memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-31
Logical to physical memory mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-33
POWER Hypervisor paging example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-36
Collaborative memory manager: Loaning policy . . . . . . . . . . . . . . . . . . . . . . . . . . .4-38
Memory subscription ratio . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-41
Shared memory partition configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-44
Shared memory pool management menu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-46
Shared memory pool management wizard . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-48
Shared memory pool size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-50
Paging Virtual I/O Server . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-52
Create the paging devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-54
Select the devices to be paging space devices . . . . . . . . . . . . . . . . . . . . . . . . . . .4-56
Paging space device filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-58
Paging space devices summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-60
Creating the shared memory pool . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-62
Pool properties: Paging space devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-64
Virtual I/O Server virtual devices for AMS (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . .4-66
Virtual I/O Server virtual devices for AMS (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . .4-68
Partition shared memory settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-70
Activate a shared memory LPAR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-72
Dynamically add or remove memory resource . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-74
Some performance guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-77
Monitoring tools for AMS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-79
lparstat -i . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-81
Active shared virtual memory in an LPAR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-83
vmstat command (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-85
vmstat command (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-88
lparstat -me . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-90
topas -L . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-93
topas -C (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-95
topas -C (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-97
vi PowerVM Virtualization II Copyright IBM Corp. 2010, 2011

V5.4.0.3
Instructor Guide
TOC HMC utilization data: Memory pool utilization . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-100

HMC utilization data: Partition memory utilization . . . . . . . . . . . . . . . . . . . . . . . . 4-103
Checkpoint (1 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-105
Checkpoint (2 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-107
Checkpoint (3 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-109
Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-111
Unit summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-113
Unit 5. Active Memory Expansion: Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-1

Unit objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-2
Topic 1: Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-4
Active Memory Expansion: Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-6
Active Memory Expansion: Concept . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-8
AME scenarios (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-10
AME scenarios (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-12
Logical memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-14
Logical memory pools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-16
Page faults . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-18
Expanded logical memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-20
Pool size (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-22
Pool size (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-24
CPU utilization for compression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-26
Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-28
AME economics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-30
AME deployment phases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-32
Memory utilization improvement technologies . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-34
Sample SAP ERP workload (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-36
Sample SAP ERP workload (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-38
Checkpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-40
Topic 1: Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-42
Topic 2: Planning for Active Memory Expansion . . . . . . . . . . . . . . . . . . . . . . . . . . 5-44
Workload characteristics (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-46
Workload characteristics (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-48
CPU resource consumption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-50
AME planning tool . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-52
Running the AME planning tool . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-54
Command usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-56
Command usage examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-58
Planning tool report . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-60
First example report (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-62
First example report (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-64
Second example report (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-66
Second example report (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-68
Checkpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-70
Topic 2: Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-72
Topic 3: Deploying Active Memory Expansion . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-74
AME system requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-76
AME 60 day trial . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-78
Copyright IBM Corp. 2010, 2011 Contents vii

Instructor Guide
Requesting a trial activation (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-80

Requesting a trial activation (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-82
Retrieving an activation code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-84
Enter activation code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-86
Managed system properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-88
Permanent activation of AME . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-90
Configuring an LPAR for AME (1 of 4) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-92
DLPAR operations with AME . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-100
Unconfiguring AME . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-102
Checkpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-104
Topic 3: Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-106
Topic 4: Monitoring AME partitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-108
Monitoring AME partitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-110
Expanded memory deficit (1 of 4) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-112
Basic AME monitoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-120
Basic monitoring example report . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-122
The lparstat command (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-124
The lparstat command (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-126
The vmstat command . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-128
The topas command . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-130
Fine tuning AME . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-132
Checkpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-134
Topic 4: Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-136
Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-138
Unit summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-140
Unit 6. N_Port ID Virtualization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-1

Unit objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-2
NPIV: Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-4
SAN disk access without NPIV . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-7
SAN access using NPIV . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-9
NPIV benefits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-12
VIOS with NPIV and vSCSI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-14
NPIV requirements and considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-16
NPIV configuration steps: HMC, VIOS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-19
NPIV configuration steps: HMC, client LPAR . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-22
NPIV configuration steps: HMC, SAN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-24
NPIV configuration steps: VIO Server . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-26
Listing the NPIV mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-28
DLPAR and NPIV considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-30
Heterogeneous multipathing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-32
AIX LPAR heterogeneous MPIO . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-34
viii PowerVM Virtualization II Copyright IBM Corp. 2010, 2011

V5.4.0.3
Instructor Guide
TOC Shared NPIV adapter for efficient path redundancy . . . . . . . . . . . . . . . . . . . . . . . 6-36

Dual VIOS without NPIV (vSCSI) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-38
Dual VIOS with NPIV . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-40
Live Partition Mobility and NPIV . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-42
NPIV useful commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-45
Checkpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-47
Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-49
Unit summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-51
Unit 7. I/O device virtualization performance and tuning . . . . . . . . . . . . . . . . . . . . . 7-1

Unit objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-3
Topic 1: Virtual device performance overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-5
Virtual I/O performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-7
Virtual devices: Overview (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-9
Virtual devices: Overview (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-11
Virtual device performance considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-13
VIOS performance tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-15
Monitoring Virtual I/O Server resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-17
Checkpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-19
Topic 1: Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-21
Topic 2: Virtual SCSI tuning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-23
Virtual SCSI devices example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-25
Virtual SCSI storage and LVM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-27
Performance factors with virtual SCSI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-29
Additional I/O latency for virtual SCSI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-31
I/O bandwidth . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-33
CPU needs for different I/O types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-35
Sizing the Virtual I/O Server for virtual SCSI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-37
Processor sizing methodologies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-39
Server sizing: Example 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-41
Server sizing: Example 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-43
Virtual SCSI queue depth (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-45
Virtual SCSI queue depth (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-48
Monitoring virtual SCSI devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-50
Is the system disk bound? (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-52
Is the system disk bound? (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-54
Find busiest logical volumes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-56
Virtual SCSI I/O analysis flowchart . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-58
Virtual SCSI performance summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-60
Checkpoint (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-62
Checkpoint (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-64
Topic 2: Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-66
Topic 3: Virtual Ethernet tuning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-68
Virtual Ethernet: Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-70
Virtual Ethernet adapters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-72
Performance factors: CPU entitlement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-74
Performance factors: MTU size example 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-76
Performance factors: MTU size example 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-78
Copyright IBM Corp. 2010, 2011 Contents ix

Instructor Guide
Performance factors: SMT impact . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-80

TCP checksum offload (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-82
TCP checksum offload (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-84
Virtual Ethernet I/O analysis flowchart . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-86
Network monitoring tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-88
Checkpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-90
Topic 3: Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-92
Topic 4: Shared Ethernet adapter tuning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-94
Shared Ethernet adapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-96
Configuring the interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-98
SEA configuration options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-100
Virtual I/O Server sizing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-102
Virtual I/O Server sizing: Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-104
Virtual I/O Server sizing: CPU (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-106
Virtual I/O Server sizing: CPU (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-108
Threading/non-threading (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-110
Threading/non-threading (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-112
TCP segmentation offload (largesend) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-114
Configuring largesend on SEA device . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-116
Configuring largesend on client LPARs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-118
SEA bandwidth apportioning (1 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-120
SEA performance summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-126
SEA I/O analysis flowchart . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-128
Viewing SEA configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-130
Monitor SEA activity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-132
Monitor with the entstat command . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-134
Using seastat to monitor SEA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-136
Checkpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-138
Topic 4: Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-140
Topic 5: Integrated Virtual Ethernet tuning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-142
IVE architecture (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-144
IVE architecture (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-146
IVE example: Dual- and quad-port gigabit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-148
IVE example: Dual-port 10 gigabit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-150
SEA versus dedicated HEA logical port . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-152
Multi-Core Scaling value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-154
Queue pairs (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-156
Queue pairs (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-158
Viewing number of QPs in AIX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-160
Viewing MCS and QP configuration in AIX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-162
MCS and QPs tuning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-164
Performance considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-166
Monitoring IVE traffic (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-168
Monitoring IVE traffic (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-170
Checkpoint (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-172
Checkpoint (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-174
x PowerVM Virtualization II Copyright IBM Corp. 2010, 2011

V5.4.0.3
Instructor Guide
TOC Topic 5: Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-176

Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-178
Unit summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-180
Unit 8. Partition mobility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-1

Unit objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-2
What is partition mobility? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-4
Overview of process (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-8
Overview of process (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-10
Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-14
Basic requirements (1 of 4) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-17
Validation (1 of 8) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-27
Validation (2 of 8) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-29
Validation (3 of 8) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-32
Validation (4 of 8) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-34
Validation (5 of 8) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-36
Validation (6 of 8) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-38
Validation (7 of 8) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-41
Validation (8 of 8) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-43
Migrating the LPAR (1 of 16) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-45
Migrating the LPAR (5 of 16)) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-55
Migrating LPAR: NPIV (10 of 16) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-65
Migrating the LPAR (11 of 16) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-68
Monitoring the migration process (1 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-80
Error log entries: Mobile client . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-86
Error log entries: Source VIOS (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-88
Error log entries: Source VIOS (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-90
Error log entries: Target VIOS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-92
Troubleshooting (1 of 5) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-94
Copyright IBM Corp. 2010, 2011 Contents xi

Instructor Guide
Troubleshooting (4 of 5) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-101
Troubleshooting (5 of 5) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-103
Dual VIO Server: Network topics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-105
Dual VIO Server: Virtual SCSI with dual HMC consideration . . . . . . . . . . . . . . . .8-108
Partition mobility with IVM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-112
Partition mobility with IVM: Monitor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-114
Checkpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-116
Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-118
Unit summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .8-120
Unit 9. PowerVM advanced systems maintenance . . . . . . . . . . . . . . . . . . . . . . . . . . .9-1

Unit objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .9-2
PowerVM firmware enablement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .9-4
Firmware management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .9-7
HMC and system firmware compatibility (1 of 4) . . . . . . . . . . . . . . . . . . . . . . . . . .9-10
Decoding the firmware level and name . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .9-18
Using the HMC to update system firmware . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .9-21
Managed system firmware update . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .9-24
Updating firmware without an HMC (1 of 5) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .9-27
VIO Server code updates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .9-39
VIO Server service strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .9-42
Upgrade to VIOS 2.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .9-44
VIO Server software updates: Single VIO Server . . . . . . . . . . . . . . . . . . . . . . . . . .9-46
VIO Server software updates: Dual VIO Servers (1 of 4) . . . . . . . . . . . . . . . . . . . .9-49
Acquiring VIO Server software updates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .9-61
updateios command . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .9-63
Applying updates from a local hard disk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .9-66
Applying updates from remote file system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .9-68
Applying updates from the CD/DVD drive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .9-70
Virtual I/O Server backup strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .9-72
VIO Server backup and restoration methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . .9-75
Backing up user-defined virtual devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .9-78
Backing up user-defined disk device structures . . . . . . . . . . . . . . . . . . . . . . . . . . .9-81
Backing up the Virtual I/O Server . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .9-84
Restoring the Virtual I/O Server (1 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .9-89
Using cron . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .9-101
xii PowerVM Virtualization II Copyright IBM Corp. 2010, 2011

V5.4.0.3
Instructor Guide
TOC Backing up the client LPAR to virtual DVD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-104

Managing virtual SCSI device capacity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-106
Processor recovery: Partition availability priority . . . . . . . . . . . . . . . . . . . . . . . . . 9-109
Power saver mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-112
Hot pluggable devices (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-116
Hot pluggable devices (2 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-119
Checkpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-121
Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-123
Unit summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-125
Unit 10. Virtualization management tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-1

Unit objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-2
Utilization data management (1 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-4
Utilization data management (2 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-7
Utilization data management (3 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-10
Normal monitoring tools (1 of 5) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-13
nmon analyzer and nmon consolidator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-28
VIO Server 2.1 monitoring commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-31
VIO monitoring using topas (1 of 4) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-34
AIX MPIO paths monitoring using topas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-42
VIO monitoring using Workload Manager . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-44
Freeware monitoring tools (1 of 3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-47
Performance toolbox . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-57
Management tools: IBM Systems Director . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-60
IBM Systems Director: Welcome . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-63
IBM Systems Director: Power Systems Management . . . . . . . . . . . . . . . . . . . . . 10-65
IBM Systems Director VMControl features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-68
IBM Systems Director VMControl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-71
IBM Tivoli Monitoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-74
IBM Tivoli Monitoring: AIX/System p architecture . . . . . . . . . . . . . . . . . . . . . . . . 10-77
IBM Tivoli Monitoring: Hypervisor view . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-80
IBM Tivoli Monitoring: AIX LPAR view . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-82
IBM Tivoli Monitoring: Network mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-84
IBM Tivoli Monitoring: Virtual storage mapping . . . . . . . . . . . . . . . . . . . . . . . . . . 10-86
IBM Tivoli Monitoring: Device status . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-88
Checkpoint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-90
Unit summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-92
Appendix A. Checkpoint solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-1
Copyright IBM Corp. 2010, 2011 Contents xiii

Instructor Guide
xiv PowerVM Virtualization II Copyright IBM Corp. 2010, 2011

V5.4.0.3
Instructor Guide
TMK Trademarks
The reader should recognize that the following terms, which appear in the content of this
training document, are official trademarks of IBM or other companies:
IBM is a registered trademark of International Business Machines Corporation.
The following are trademarks of International Business Machines Corporation in the United
States, or other countries, or both:
Active Memory AIX 5L AIX 6
AIX BladeCenter DB
DB2 EnergyScale Express
HACMP IBM Systems Director Active i5/OS
Energy Manager
i5/OS Micro-Partitioning Power Architecture
POWER Hypervisor Power Systems POWER
PowerVM POWER4 POWER5
POWER5+ POWER6+ POWER6
POWER7 Systems POWER7 pSeries
Redbooks System i System p
System p5 System z Systems Director
VMControl
Tivoli Workload Partitions z/VM
Manager
400
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or
both.
Windows and Windows NT are trademarks of Microsoft Corporation in the United States,
other countries, or both.
UNIX is a registered trademark of The Open Group in the United States and other
countries.
Java and all Java-based trademarks and logos are trademarks or registered trademarks of
Oracle and/or its affiliates.
Other product and service names might be trademarks of IBM or other companies.
Copyright IBM Corp. 2010, 2011 Trademarks xv

Instructor Guide
xvi PowerVM Virtualization II Copyright IBM Corp. 2010, 2011

V5.4.0.3
Instructor Guide
pref Instructor course overview

This course will teach the student how to manage the virtualization
performance on IBM POWER Systems for AIX. There is also a focus
on advanced virtualization features such as Active Memory Sharing
and Active Memory Expansion. Topics on how to use AIX monitoring
tools, options, and tuning techniques are also included.
With the exception of the virtualization management tools unit, all units
have an associated hands-on lab. Also, the NPIV exercise is provided
as an optional lab. This exercise can be performed if your systems
have the 8GBps NPIV capable Fibre Channel adapters. None of the
labs require POWER7 systems.
Copyright IBM Corp. 2010, 2011 Instructor course overview xvii

Instructor Guide
xviii PowerVM Virtualization II Copyright IBM Corp. 2010, 2011

V5.4.0.3
Instructor Guide
pref Course description

Power Systems for AIX - Virtualization II: Advanced PowerVM and
Performance
Duration: 4.5 days
Purpose
Students in this course will learn how to implement advanced
PowerVM features, such as Active Memory Sharing, Active Memory
Expansion, shared dedicated processors, multiple shared processor
pools, N_Port Virtualization, and Remote Live Partition Mobility.
Additionally, students will learn skills to implement, measure, analyze,
and tune PowerVM virtualization features for optimal performance on
IBM System p servers. This course focuses on the features that relate
to the performance of POWER6 and POWER7 processors, AIX 6.1,
and the special monitoring, configuring, and tuning needs of logical
partitions (LPARs). This course does not cover application monitoring
and tuning.
Students will also learn AIX 6.1 performance analysis and tuning tools
that help an administrator take advantage of the Micro-Partitioning and
other virtualization features of the System p servers.
Hands-on lab exercises reinforce each lecture and give the students
practical experience.
Audience
Anyone responsible for the system administrative duties implementing
and managing virtualization features on a System p server.
The audience for this training includes the following:
AIX technical support individuals
System administrators
Systems engineers
System architects
Copyright IBM Corp. 2010, 2011 Course description xix

Instructor Guide
Prerequisites
The LPAR prerequisite skills can be met by attending one of the
following classes or students can have equivalent LPAR skills.
AN11 Power Systems for AIX I: LPAR Planning and Configuration
AN30 PowerVM Virtualization II: Dual VIO Servers and IVE
Objectives
After completing this course, the students should be able to:
Describe the effect of the POWER6 virtualization features on
performance and monitoring, such as:
- Simultaneous multithreading (SMT), Micro-Partitioning, multiple
shared processor pools (MSPP), shared dedicated capacity,
Active Memory Sharing (AMS), Active Memory Expansion
(AME), and other virtualization features
Interpret the outputs of AIX 6.1 performance monitoring and tuning
tools used to view the impact of SMT, Micro-Partitioning, additional
shared processor pool activations, and device virtualization; these
tools include the following:
- vmstat, iostat, sar, topas, trace, curt, mpstat, lparstat, smtctl
List various sources of information and support related to AIX 6.1
performance tools, system sizing, system tuning, and AIX 6.1
enhancements and new features
Perform a Live Partition Mobility between two different
POWER6/POWER7 servers.
Describe the new features available with the Virtual I/O Server
Version 2.1 and Version 2.2, such as:
- N_port ID Virtualization, heterogeneous multithreading, virtual
tape devices, Active Memory Sharing
Describe and implement the Active Memory Sharing feature
Describe the Active Memory Expansion feature
Curriculum relationship
This course assumes that students have taken the prerequisite AIX
and virtualization training. This course is the third of the available
Power p virtualization courses.
xx PowerVM Virtualization II Copyright IBM Corp. 2010, 2011

V5.4.0.3
Instructor Guide
pref Agenda
Day 1
(00:30) Welcome
(01:00) Unit 1: PowerVM features review
(00:45) Exercise 1: Introduction to the lab environment
(02:00) Unit 2: Processor virtualization tuning
(02:00) Exercise 2: Processor virtualization tuning
Day 2
(01:30) Unit 3: Dedicated shared capacity and multiple shared
processor pools
(01:30) Exercise 3: Configuring multiple shared processor pools
(01:30) Unit 4: Active Memory Sharing
(02:00) Exercise 4: Configuring Active Memory Sharing
Day 3
(02:00) Unit 5: Active Memory Expansion: Overview
(00:35) Exercise 5: Active Memory Expansion
(01:00) Unit 6: N_Port ID Virtualization
(01:30) Exercise 6: Virtual Fibre Channel adapter configuration
(optional)
(02:00) Unit 7: I/O device virtualization performance and tuning
Day 4
(02:00) Unit 7: I/O device virtualization performance and tuning
(continued)
(01:00) Exercise 7: I/O device virtualization performance and tuning
(01:30) Unit 8: Partition mobility
Day 5
(01:00) Exercise 8: Implementing Live Partition Mobility
(01:30) Unit 9: PowerVM advanced systems maintenance
(01:00) Exercise 9: PowerVM system maintenance
(01:00) Unit 10: Virtualization management tools
(00:30) Wrap up/Evaluations
Copyright IBM Corp. 2010, 2011 Agenda xxi

Instructor Guide
xxii PowerVM Virtualization II Copyright IBM Corp. 2010, 2011

V5.4.0.3
Instructor Guide
Uempty Unit 1. PowerVM features review
Estimated time
01:00
What this unit is about

This unit reviews the virtualization features of the IBM Power Systems.
Each PowerVM feature is overviewed, but most are detailed in the
subsequent units.
What you should be able to do

After completing this unit, you should be able to:
List the key technologies associated with the IBM POWER
Systems
Describe the IBM PowerVM features
Discuss virtualization performance management
How you will check your progress

Checkpoint questions
Machine exercises
References
Redbooks and Redpapers related to the PowerVM that you can
download at http://www.redbooks.ibm.com
SG24-7940 PowerVM Virtualization on IBM System p: Introduction
and Configuration. (Fourth Edition Redbook)
SG24-7590 PowerVM Virtualization on IBM System p: Managing and
Monitoring, SG24-7590-00
REDP-4194 IBM System p Advanced POWER Virtualization
(PowerVM) Best Practices, REDP-4194-00
REDP-4638-00 IBM Power 750 and 755 Technical Overview and
Introduction
Copyright IBM Corp. 2010, 2011 Unit 1. PowerVM features review 1-1
Instructor Guide
Unit objectives

Systems
Copyright IBM Corporation 2011
Figure 1-1. Unit objectives AN313.1
Notes:
The objectives list what you should be able to do at the end of this unit.
1-2 PowerVM Virtualization II Copyright IBM Corp. 2010, 2011

V5.4.0.3
Instructor Guide
Uempty Instructor notes:

Purpose
Details All of IBMs current system models support virtualization in some way. Most of
the key components are hardware-based.
Additional information
Transition statement Let's examine the POWER Systems, key virtualization features.
Instructor Guide
Virtualization technologies on POWER systems

Multiple OSs
Multiple LPARs
Dynamic LPAR
Micro-partitioning
Virtual CPUs
Multiple shared pools*
Shared dedicated
processor*
IVE*
SMT
Virtual LANs
Virtual I/O
IVM
COD
PowerVM Lx86
Live Partition Mobility
Active Memory Sharing
Active Memory Expansion*
* POWER7 systems only Copyright IBM Corporation 2011
Figure 1-2. Virtualization technologies on POWER systems AN313.1
Notes:
The IBM System and IBM Power System technologies

The following technologies are the components of the PowerVM edition, which are
integrated into the IBM Power Systems:
POWER Hypervisor: Supports partitioning and dynamic resource movement across
multiple operating system environments.
Multiple operating system support: Logical partitioning allows a single server to run
multiple operating system images concurrently.
Dynamic logical partitioning: Allows adding, moving, or deleting physical and virtual
resources of an LPAR without shutdown.
Micro-Partitioning: Enables a logical partition to be allocated a percentage of processor
usage rather than a full physical processor. Multiple virtual shared processor pools are
available with POWER6 and POWER7 technology.

V5.4.0.3
Instructor Guide
Uempty Dedicated shared processor: Function provides the ability for partitions that normally
run as dedicated processor partitions to contribute unused processor capacity to the
shared processor pool. This support allows unneeded capacity to be donated to
uncapped micro-partitions instead of being wasted as idle cycles in the dedicated
partition.
Integrated Virtual Ethernet (IVE): Also called Host Ethernet Adapter (HEA), this is a
feature that provides network connectivity to the partitions. It is available on certain
types of POWER6 and POWER7-based systems and allows partition communication. It
uses a physical Integrated Virtual Ethernet adapter and it must not be considered as a
virtual feature.
Simultaneous Multithreading (SMT): On-chip hardware threads to improve resource
usage.
Virtual LAN: Provides network virtualization capabilities. It is purely firmware-based,
using the POWER Hypervisor, and does not require the purchase of the PowerVM
edition. Multiple virtual switches can be defined with POWER6 and POWER7
processor-based systems.
Virtual I/O: Allows the sharing of I/O adapters and devices between partitions.
Integrated Virtualization Manager (IVM) provides the virtualization capabilities for the
management of a server and LPARs without using a Hardware Management Console
(HMC).
Capacity on Demand (CoD): Allows system resources, such as processors and
memory, to be activated as needed. Utility CoD is new with POWER6 technology and
automates the usage of CoD processors.
The IBM PowerVM Lx86 feature of PowerVM editions is designed to allow you to run
most x86 Linux applications. This allows consolidation of AIX and Linux on POWER
and x86 Linux applications on the same server.
The PowerVM Enterprise edition includes the Partition Mobility virtualization feature that
allows migrating a virtualized logical partitions from a source system to a target. This
feature supports Live Partition Migration without partition shutdown.
The PowerVM Enterprise edition includes also the Active Memory sharing feature that
allows shared memory partitions to shared a common shared memory pool.
Active memory expansion feature requires POWER7 processor-based systems. The
purpose of AME is to reduce the memory footprint used by an LPAR by compressing
the memory. The Operating system running in the LPAR compress data in memory to
effectively expand the size of the memory by allowing more data to be packed into
memory. Active memory expansion is configurable on a per-LPAR basis.
Most of the hardware virtualization features listed above are managed using the HMC
V7 Web user interface. IVM allows management of some IBM Power Systems, and
blades without using an HMC. IVM does not support all of the HMC functions and has
limited capabilities.
Instructor Guide
The Workload Partition (WPAR) feature, not listed on the previous slide, provides a
software solution for creating virtualized operating environments to use when managing
multiple workloads. WPAR is a purely software partitioning solution that is provided by
the operating system. It has no dependencies on hardware features. It is strictly an AIX
6.1 feature.

V5.4.0.3
Instructor Guide

Purpose To review the Power Systems virtualization technologies.
Details simultaneous multithreading (SMT)
Enhancements in the IBM Power processors design allow for improved overall hardware
resource utilization. SMT technology allows two or four separate instruction streams
(threads) to run concurrently on the same physical processor, improving overall throughput.
LPAR and shared-processor partitions
A logical partition (LPAR) is not constrained to physical processor boundaries, and can be
allocated processor resources from a shared processor pool. An LPAR that utilizes
processor resources from the shared processor pool is known as a Micro-Partition LPAR.
The percentage of a physical processor that is allocated is known as processor entitlement.
Processor entitlement can range from 10% of a physical processor up to the maximum
installed processor capacity of the IBM Power Systems. Additional processor entitlement
can be allocated in increments of 1% of a physical processor.
Multiple virtual shared pools can be defined on IBM Power Systems. The physical shared
pool is the default (VSP 0) and all virtual shared pools are subsets of it.
Shared dedicated processor partitions
This is a POWER6 and POWER7 feature that improves the processor usage of the system
when having dedicated processor partitions. Idle CPU cycles are not wasted and can be
ceded to the shared processor pool for uncapped micro-partitions. It is based on a similar
AIX/Hypervisor mechanism using interrupts.
Dynamic reconfiguration
It is possible to dynamically move system resources, physical processors, virtual
processors, memory, and I/O slots, between partitions without rebooting. This is known as
dynamic reconfiguration (DR) or dynamic LPAR (DLPAR).
Integrated Virtual Ethernet
IVE is a physical feature available on some IBM Power Systems types. Logical Host
Ethernet Adapter (LHEA) ports can be defined from the IVE physical adapter, and can be
assigned to the LPARs as a network device. LHEA ports have higher bandwidth and lower
latency than virtual Ethernet adapters. No Hypervisor cycles are used by the IVE.
Virtual LAN
A function of the POWER Hypervisor, Virtual LAN allows secure communication between
logical partitions without the need for a physical I/O adapter. The ability to securely share
Ethernet bandwidth across multiple partitions increases hardware utilization.
Virtual I/O
Virtual I/O allows a single physical I/O adapter and disk to be used by multiple logical
partitions of the same server. This facilitates consolidation of I/O resources and
Instructor Guide
minimization of the number of I/O adapters required. File-backed devices are available with
VIOS version 1.5.
Multiple operating system support
The IBM Power Systems support IBM AIX Version 6.1, IBM AIX Version 5.3, i5/OS, and
Linux distributions from SUSE and Red Hat.
Integrated Virtualization Manager
The Integrated Virtualization Manager is a hardware management solution that inherits the
most basic of Hardware Management Console (HMC) features and removes the
requirement of an external HMC. It is limited to managing a single IBM Power System
server. Integrated Virtualization Manager (IVM) runs now on the Virtual I/O Server Version
1.5.
Additional information Active memory expansion is a separate priced feature from
PowerVM Editions. Active memory expansion is detailed in a separate unit.
Transition statement Lets discuss the different PowerVM editions and the supported
features.

V5.4.0.3
Instructor Guide
Uempty
IBM PowerVM editions

The virtualization platform for AIX, Linux, and IBM i clients
Express Standard Enterprise
Features
Edition Edition Edition
Micro-Partitioning
Shared Dedicated Capacity (1)
Multiple shared processor pools (1)
Virtual I/O Server
N_Port_ID Virtualization (NPIV) (1)
Live Partition Mobility (2)
Active Memory Sharing
Lx86
(1): POWER6 and POWER7-based systems only

(2): POWER6 and POWER7-based systems and requires PowerVM Enterprise Edition
Figure 1-3. IBM PowerVM editions AN313.1
Notes:
PowerVM Enterprise Edition

PowerVM Enterprise Edition is offered exclusively on POWER6 and POWER7
processor-based servers and includes all the features of PowerVM Standard Edition plus
two new industry-leading capabilities called Active Memory Sharing and Live Partition
Mobility.
PowerVM Standard Edition

For users ready to get the full value out of their server, IBM offers PowerVM Standard
Edition providing the most complete virtualization functionality for AIX, IBM i, and Linux
operating systems in the industry. PowerVM Standard Edition is supported on Power
Systems servers and includes features designed to allow businesses to increase system
utilization, while helping to ensure that applications continue to get the resources they
need.
Instructor Guide
PowerVM Express Edition

PowerVM Express Edition is offered exclusively on the Power 520 and Power 550 Express
servers and is designed for users looking for an introduction to more advanced
virtualization features at a highly affordable price. With PowerVM Express Edition, users
can create up to three partitions on the server, leverage virtualized disk and optical devices
VIOS, and even try out the shared processor pool.

V5.4.0.3
Instructor Guide

Purpose Highlight the different PowerVM Editions. Point out the features requiring the
Enterprise Edition and POWER6 or POWER7-based systems.
Details
Transition statement Next slide is just an introduction to the POWER6 processor and
its characteristics.
Instructor Guide
POWER6 processor architecture

Ultra-high frequency dual-core chip: 3.5 to 5.0 GHz (~ 2 times
POWER5)
Seven way superscalar:
Allows up to seven instructions to be executed in one
clock cycle Alti P6 P6 Alti
Nine execution units: Vec core core Vec
2 LS, 2 FP, 2 FX, 1 DFP, 1 BR/CR, and 1 VMX (AltiVec)
Two threads simultaneous execution per core
2x4 MB on-chip L2 with point of coherency L3
L3 4 MB 4 MB
Ctrl L2 L2
(L2 to L2 fast path)
On-chip L3 directory and controller
High-speed elastic bus interface
for memory and IOs
Fabric Bus
Dynamic power saving: Controller
More work per watt is performed
Advanced clock gating Memory GX Bus Cntrl
Various power-saving modes can be activated Cntrl
when the processor is idle
Full error checking and recovery
GX+ bridge
Recovery unit
Memory+
Figure 1-4. POWER6 processor architecture AN313.1
Notes:
The POWER6 processor is designed to provide improved performance. It uses the latest
65nm advanced technology with 10 levels of metal (low-k dielectric on first eight levels).
The distributed switch architecture (previous POWER5 generation) is enhanced for SMP
enhanced scaling and core parallelism. The memory path and caches have been modified
to increase memory bandwidth and reduce memory latencies.
L1 data cache is 64KB instead of 32KB and cache access bandwidth is increased (double
number of paths compared to POWER5). L2 cache capacity is increased to 2x4MB.
High-speed elastic bus interface is implemented, which allows all interface busses to the
POWER6 to operate at a range of higher frequencies. The bus speed scales with the
speed of the processor.
The following features have been implemented on the POWER6 core:
Advanced memory subsystem
Decimal floating point execution unit
VMX (Alti-Vec) execution unit

V5.4.0.3
Instructor Guide
Uempty Enhanced architecture for higher frequencies

Check point restart mechanism using recovery unit
POWER6 is a two way simultaneous multithreading (SMT) core. Each core (physical
processor) is capable of running two separate instruction pipelines corresponding to two
hardware threads. The SMT hardware feature is also improved and is capable of running
up to seven instructions per clock cycle.
The POWER6 is able to stop clocking on unused circuitry. This is called clock gating, which
lowers power consumption and thermal dissipation when there are few instructions to
execute. Also, a power-save mode is available to reduce the frequency, but this limits the
systems performance.
POWER6 processor compared to POWER5 processor
The POWER6 processor benefits from a new architecture design and from the experience
gained in the development of previous POWER processors. The design change was
necessary to get the maximum performance from the technology while maintaining a high
level of RAS and making more efficient use of electrical power.
When comparing POWER6 to the previous POWER 5 design, you should notice POWER6
no longer has a shared L2 cache, but two dedicated and larger L2s for each core. The
POWER5/POWER5+ systems included a 1.875 MB L2 cache that is shared between the
two on-chip cores. The logic that does the sharing and allows good scalability uses
transistors that consumes electrical power.
In the POWER6 design, each core has its own private 4 MB cache for a total of 8 MB on
the chip. The sharing occurs below the L2, at the L3. Because of the size of the POWER6
L2, the miss rates are significantly lower than the miss rates of the L1s on
POWER5/POWER5+. When it is not needed, the power to the circuits is turned off. The
design conserves power for the sharing logic while allowing better L2 cache hit rates
because of the increased size.
Additionally, in those cases where data that a core needs is in the L2 connected to the
other core, a fast path allows fast transfer from one L2 to the other L2 (L2 point of
coherency). By going to a private L2, the reduced logic allows for faster access and lower
latency. Also, the L3 associativity has been increased to 16-way (versus 12 for POWER5
L3 cache).
POWER 5+ has out-of-order execution; POWER6, however, uses mostly in-order
execution. An out-of-order execution core has some performance advantages but takes
significantly more logic to manage the execution. The extra logic consumes electrical
power. Because a key objective in the design of POWER6 systems was to conserve
electrical power, the decision was made to implement the core in-order. The potentially
lower performance is offset by the significant increase of processor frequency, more than
2X higher performance than POWER5/POWER5+ systems. TPC-C and other benchmarks
results for POWER6 show a performance improvement near 100% boost over POWER5
systems.
Instructor Guide
Instructor notes:
Purpose Explain that POWER6 system technology is a full redesign based on a new:
Processor
System architecture
PHYP microcode and new HMC
Virtualization component set
AIX version
Explain that the POWER processor is based on a strong roadmap, and the POWER6
design benefits from the past experience. POWER4 and AIX 5 brought LPAR and DLPAR
for server consolidation. POWER5 and AIX 5.3 brought virtualization for resource
optimization. POWER6 and AIX 6 brings enhancements to all previous features with a
emphasis on RAS.
All of the generations of POWER systems bring advancements in computing performance.
POWER6/AIX 6 is a UNIX main frame class of systems.
Details
Additional information IBM POWER systems scale significantly better than any other
systems. Part of the secret is the implementation of the storage hierarchy, including
management of the L2 and connections to it from other parts of the system.
Transition statement Next slide is just an introduction to the POWER6 processor and
its characteristics.

V5.4.0.3
Instructor Guide
Uempty
POWER7 processor architecture

Cores: Eight (four and six core options)
Frequency from 3.0 to 4.25Ghz
Eight processor cores P
Core Core Core Core
O S
12 execution units per core W L2 L2 L2 L2 M
Four-way SMT per core: up to four E
R P
threads per core
G F
32 threads per chip L3 cache A
X
L1: 32KB I cache/32KB D cache B
R
L2: 256KB per core B
L2 L2 L2 L2 I
U C
L3: Shared 32MB on chip eDRAM S
Core Core Core Core
Dual DDR3 memory controllers
90GB/s memory bandwidth per chip
Memory interface
Scalability up to 32 sockets
360GB/s SMP bandwidth/chip
20,000 coherent operations in flight
Up to 256 cores per server
Figure 1-5. POWER7 processor architecture AN313.1
Notes:
POWER7 processor is 45 nm technology. The basic processor is 8 cores, with 4 and 6
cores options. POWER7 provides significantly better performance per chip as we have
many more cores per chip than POWER6.
The L3 cache for an 8 core chip is 32 MB just like it was for POWER6, but POWER7
implements L3 caches on-chip. So we have 32 MB of cache but now it is on chip and uses
this new embedded DRAM technology. With this eDRAM technology, you get significant
bandwidth and latency improvements, up to six times faster access of data in this cache
versus external cache.
Each core has its own level 2 cache. The level 2 cache was 4 MB on POWER6 and now it
is 256 KB but it is about three times faster to access. In fact the cache acts more like it was
on L1 cache before.
The POWER7 processor has the same L3 cache size than POWER6 but much faster to
access and much higher bandwidth. Also much faster access to L2 cache than it was on
POWER6.
Instructor Guide
POWER7 processor has dual DDR 3 memory controllers with a new buffer chip to connect
to the memory to be able to move data faster and more efficiently. The memory bandwidth
just doubled from POWER6 to POWER7
POWER7 processor has 12 execution units compared to 9 on POWER6, and instead of
running two threads per core, the POWER7 implements four threads per core.
AIX 6.1 and AIX 7 support POWER7. AIX 6.1 supports 64 cores but allows to use four-way
multithreading for a maximum of 256 threads. AIX 7 will support 256 cores with up to 1024
threads.

V5.4.0.3
Instructor Guide

Purpose Give some characteristics of the POWER7 processor.
Details
Transition statement Lets discuss the shared processors.
Instructor Guide
Micro-Partitioning: Shared processors

Processor capacity is assigned in processing units from the
shared processor pool.
Partitions guaranteed amount is its entitled capacity (EC).
Each partition is configured with a percentage of execution dispatch
time for each 10 ms timeslice (dispatch window).
Each virtual processor provides access to a single physical
processor in the pool.
LPAR1 LPAR2
Entitled
capacity in
EC=2.4 EC=1.55
processing Virtual
units processors
Shared processors pool

Physical
processors
Figure 1-6. Micro-Partitioning: Shared processors AN313.1
Notes:
Shared processors
Shared processors are physical processors that are allocated to partitions on a timeslice
basis. Any physical processor in the shared processor pool can be used to meet the
execution needs of any partition using the shared processor pool.
A POWER system can contain a mix of shared and dedicated partitions. A partition must
be either all shared or all dedicated, and you cannot use dynamic LPAR commands to
change between the two. You need to bring down the partition and switch it from using
dedicated to shared, or vice versa.
Processing units
When a partition is configured, you assign it an amount of processing units. A partition
must have a minimum of one tenth of a processor, and after that requirement has been
met, you can configure processing units at the granularity of one hundredth of a processor.

V5.4.0.3
Instructor Guide
Uempty 10 ms timeslice (also known as a dispatch window)

It is important to think about execution capacity in terms of 10 millisecond (ms) timeslices.
A partition uses the timeslice, or a portion of the timeslice, based on the allocated
processing units. For example, 0.5 processing units ensures that for every 10 ms timeslice,
that partition will receive up to 5 ms of processor time. Up to is an important concept,
because the partition might not need the entire 5 ms timeslice because of waits, interrupts,
or lack of processing need, but the partition is guaranteed up to 5 ms of processing time.
Virtual processors
The virtual processor setting defines the way that a partitions entitlement can be spread
concurrently over physical processors. That is, you can think of the processing power
available to the operating system on the partition as being spread equally across these
virtual processors. The number of virtual processors is what the operating system thinks it
has for physical processors. The Hypervisor dispatches virtual processors onto physical
processors.
The example in the visual above shows four physical processors in the shared pool, and
each partition thinks it has three processors. The number of virtual processors can be
independently configured for each shared partition. The number of virtual processors can
be changed dynamically.
Instructor Guide
Instructor notes:
Purpose Review shared processors.
Details
Transition statement Lets look at how simultaneous multithreading and virtual
processors work together.

V5.4.0.3
Instructor Guide
Uempty
Simultaneous multithreading and
Micro-Partitioning
Simultaneous multithreading can
be used with Micro-Partitions. POWER7
With simultaneous multithreading, LPAR1
each virtual processor runs two
threads (POWER6) or four
threads (POWER7): Logical
processors
Each thread is called a
logical processor.
Virtual
LPAR1 example:
processors
1.6 processing units
Two virtual processors
Simultaneous
multithreading enabled
Eight logical processors
Shared processing pool
Figure 1-7. Simultaneous multithreading and Micro-Partitioning AN313.1
Notes:
Logical processors and virtual processors

Each of the simultaneous multithreading threads (logical processors) of a virtual processor
has a separate hardware state, but they are viewed as one entity for the purpose of a
dispatch of a virtual processor. The two logical processors are always assigned to the
same partition. The amount of time that each virtual processor runs is split between the two
logical processors. So the dispatch wheel works the same whether simultaneous
multithreading is enabled or not.
PURR
The Processor Utilization Resource Register (PURR) value, covered in the processor
virtualization unit of this course, is used to accumulate information only when the virtual
processor is dispatched on a physical processor. So PURR is utilized even if simultaneous
multithreading is disabled, because it provides accurate processor utilization statistics in a
shared processor environment.
Instructor Guide
Instructor notes:
Purpose Provide an overview of how logical processors and virtual processors work
together.
Details
Additional information Additional information is given later in this course.
Transition statement

V5.4.0.3
Instructor Guide
Uempty
Micro-Partitioning: Processor folding

Partitions physical CPU utilization is measured once every
second.
If the current number of unfolded (or active) VPs is more than needed,
then fold (deactivate) one.
If the current number of unfolded VPs is not enough, then unfold one.
Virtual processor folding provides the following benefits:
It improves processor affinity.
It increases the processor dispatch time of active virtual processors.
It provides better cache utilization and less work for hypervisor.
Can be disabled or enabled with the schedo option
vpm_fold_policy.
Can be tuned dynamically with the schedo tuning option
vpm_xvcpus. Default value is 0 (-1 disables it).
Projected #VPs = partition physical CPU utilization (Physc) +
vpm_xvcpus
Figure 1-8. Micro-Partitioning: Processor folding AN313.1
Notes:
Virtual processor (VP) folding

Starting with AIX 5L V5.3 maintenance level 3, the kernel scheduler has been enhanced to
dynamically increase and decrease the use of virtual processors based on CPU utilization
within the partition. This is a function of the AIX operating system, and not a Hypervisor
call.
If there are too many virtual processors for the load on the partition, the cede hcall works
well, but only within a dispatch cycle. At the next dispatch cycle, the Hypervisor distributes
entitled capacity and must cede the virtual processor again if there is no work. The VP
folding feature puts the virtual processor to sleep across dispatch cycles. This improves
performance by reducing the Hypervisor workload, by decreasing context switches, and by
improving cache affinity.
When virtual processors are deactivated, they are not dynamically removed from the
partition as with DLPAR. The virtual processor is no longer a candidate to run on or receive
unbound work. However, it can still run bound jobs. The number of online logical
Instructor Guide
processors and online virtual processors that are visible to the user or applications does
not change. The middleware or the applications running on the system are not affected,
because the active and inactive virtual processors are internal to the system.
Schedo parameter vpm_xvcpus

You can use the vpm_xvcpus tunable to enable and disable this feature of folding virtual
processors. The default value of the vpm_xvcpus tunable is 0, which signifies that the
functionality is enabled. You can use the schedo command to modify the tunable.
Each virtual processor can consume a maximum of one physical processor. The number of
virtual processors needed is determined by calculating the sum of the physical CPU
utilization and the value of the vpm_xvcpus tunable, as shown in the following equation:
Number of virtual processors needed = Physical CPU utilization + Number of additional
virtual processors to enable.

V5.4.0.3
Instructor Guide

Purpose Discuss virtual processor folding.
Details You should always round up the value that is calculated from the above
equation to the next integer. The following example describes how to calculate the number
of virtual processors to use: Over the last interval, partition A is utilizing two and a half
processors. The vpm_xvcpus tunable is set to 1. Using the above equation, the physical
CPU utilization = 2.5. The number of additional virtual processors to enable (vpm_xvcpus)
= 1; the number of virtual processors needed = 2.5 + 1 = 3.5; rounding up the value that
was calculated to the next integer equals 4. Therefore, the number of virtual processors
needed on the system is 4.
Additional information This processor folding has been implemented for dedicated
partitions when running in donating mode.
Transition statement Dedicated partitions running in donating mode are discussed in
the next slide.
Instructor Guide
Shared dedicated capacity

Will donate idle cycles to pool like virtual processors in shared
processor partitions
Excess capacity can be used by uncapped partitions.
This is also known as donating mode.
Enable if the active option is checked for a partition
LPAR
Partition properties (or profile)
Dedicated
processor
Shared processor pool

LPAR idle cycles
donated to pool
Figure 1-9. Shared dedicated capacity AN313.1
Notes:
Shared dedicated processors

This new feature is available only on POWER6 processor-based servers for partitions
configured with dedicated processors. This function allows idle dedicated processors to
donate their cycles to the shared processor pool. When there is more work to do for the
dedicated processor partition, its work takes priority over the work from the uncapped
shared partitions.
This feature is licensed as part of the PowerVM Standard and Enterprise editions. It is
supported only on POWER6 Systems and available for AIX 5.3, AIX 6.1, and Linux
enterprise distributions. Note that with AIX6.1, only 64-bit kernel is available.

V5.4.0.3
Instructor Guide

Purpose Give an overview of the shared dedicated processors, also known as
dedicated processors, running in donating mode.
Details
Transition statement Lets introduce the multiple shared processor pools.
Instructor Guide
Multiple shared processor pools
Up to 64
Shared processor pooln
Shared processor pool1 Set of micro-partitions shared
Set of micro-partitions
Shared processor pool0 processor
LPAR7 pools
LPAR1 LPAR3 LPAR8
LPAR2 LPAR4
LPAR5
LPAR6
Shared
Dedicated
Physical
processors
Physical shared processor pool

Figure 1-10. Multiple shared processor pools AN313.1
Notes:

Multiple shared processor pools feature has been introduced with POWER6 Systems. This
allows the system administrator to create a set of micro-partitions to control the processor
capacity they consume from the physical shared processor pool. Each shared processor
pool has an associated entitled pool capacity, which is consumed by the set of
micro-partitions in that shared processor pool.
This feature allows for automatic, non-disruptive balancing of processing power between
partitions assigned to shared pools. This results in increased throughput and the potential
to reduce processor-based software licensing costs. This feature is licensed by PowerVM
Standard or Enterprise edition along with a POWER6 or POWER7 processor-based
server.

V5.4.0.3
Instructor Guide
Uempty Default shared processor pool

All IBM Power Systems support the Multiple shared processor pools capability and have a
minimum of one (the default) shared processor pool and up to a maximum of 64 shared
processor pools.
Instructor Guide
Instructor notes:
Purpose Give an overview of MSPPs.
Details This feature is detailed later in another unit.
Transition statement Lets remind the students about virtual Ethernet adapters.

V5.4.0.3
Instructor Guide
Uempty
Virtual Ethernet adapters

Connects to an IEEE 802.1Q (VLAN) style virtual Ethernet switch
Can be used by LPAR to communicate with each other
HMC used to create adapter, assign VID, and generate MAC address
Assigns VIDs (VLAN ID)
Packets copied between partitions memory
LPAR 1 LPAR 2 LPAR 3 LPAR 4
Virtual Ethernet Virtual Ethernet Virtual Ethernet Virtual Ethernet

adapter adapter adapter adapter
VLAN 100 VLAN 200
Power Hypervisor VLAN 200

Figure 1-11. Virtual Ethernet adapters AN313.1
Notes:
Introduction
Virtual Ethernet enables inter-partition communication without the need for physical
network adapters assigned to each partition. It can be used in both shared and dedicated
POWER processor partitions provided the partition is running AIX 5.3, AIX 6.1 or Linux with
the 2.6 kernel or a kernel that supports virtualization. This technology enables IP-based
communication between logical partitions on the same system using a Virtual Local Area
Network (VLAN)-capable software switch (POWER Hypervisor) in POWER systems.
Because the number of partitions possible on many systems is greater than the number of
I/O slots, virtual Ethernet is a convenient and cost-saving option to enable partitions within
a single system to communicate with one another through a virtual Ethernet LAN.
The virtual Ethernet interfaces can be configured with both IPv4 and IPv6 protocols.
Instructor Guide
No additional purchase required

The virtual Ethernet capability is not part of the PowerVM feature. Virtual Ethernet is
different from shared Ethernet adapter in that there is no connection to a physical Ethernet
adapter, which connects to a physical Ethernet network. To use virtual Ethernet to connect
to a physical Ethernet adapter connected to a physical Ethernet network, you must
implement shared Ethernet adapter.
The POWER Hypervisor and the virtual switches

The POWER Hypervisor transmits packets by copying the packet directly from the memory
of the sender partition to the receive buffers of the receiver partition, without any
intermediate buffering of the packet. The POWER Hypervisor Ethernet switch function is
included as standard in all POWER5, POWER6 and POWER7 systems. It does not require
the purchase of any PowerVM edition.
Multiple virtual switches can be defined in POWER6 system from the HMC user interface.
A switch defined by default is named ETHERNET0 (Default). In case you have defined
multiple virtual switches in your IBM Power system, be careful to select the correct one
when creating a virtual adapter.

V5.4.0.3
Instructor Guide

Purpose Give an overview of the virtual Ethernet adapter.
Details
Transition statement Lets see an example of an environment with a Virtual I/O Server
and a client logical partition.
Instructor Guide
Virtual I/O Server example
Client Virtual I/O Server

partition partition
Physical
Virtual Virtual Virtual Layer 2 Physical network
Ethernet Ethernet bridge Ethernet
switch
Virtual
Ethernet Logical
Virtual Disk
Hypervisor disk
Client Server Device
adapter adapter mapping
SCSI RDMA protocol
DMA
buffer Device DM
A transfer
SCSI, SAS, USB, SATA

FC physical, files, tapes,
or logical disks
Figure 1-12. Virtual I/O Server example AN313.1
Notes:
Client/server relationship
Virtual I/O devices provide for sharing of physical resources, such as adapters and
devices, among partitions. Multiple partitions can share physical I/O resources and each
partition can simultaneously use virtual and physical I/O devices. When sharing adapters,
the client/server model is used to designate partitions as users or suppliers of adapters. A
server must make its physical adapter available and a client must configure the virtual
adapter.
If a server partition providing I/O for a client partition fails, the client partition might continue
to function, or it might fail, depending on the significance of the hardware it is using. For
example, if the server is providing the paging volume for another partition, a failure of the
server partition would be significant to the client.

V5.4.0.3
Instructor Guide
Uempty Virtual I/O Server

The IBM Virtual I/O Server software allows the creation of partitions that use the I/O
resources of another partition. It helps to maximize the utilization of physical resources on
POWER systems. Partitions can have native I/O, virtual I/O, or both. Physical resources
are assigned to the Virtual I/O Server partition in the same way physical resources are
assigned to other partitions.
Virtual I/O Server is a separate software product, and is included as part of the PowerVM
Editions. It supports AIX 5.3 AIX 6.1and Linux partitions as virtual I/O clients.
For the latest information on which devices and operating systems are supported by the
Virtual I/O Server, see the following Web site:
http://www14.software.ibm.com/webapp/set2/sas/f/vios/home.html
Virtual SCSI adapters

Virtual SCSI adapters enable a partition to use SCSI devices that are owned by another
partition. For example, one partition might provide disk storage space to other partitions. In
addition to disks, CD-ROM, DVD-RAM, and DVD-ROM devices are supported. Writing,
however, is limited to DVD-RAM.
The Virtual I/O Server software is required to configure virtual SCSI devices. In the visual
above, DMA is direct memory access and RDMA is remote direct memory access. Remote
DMA services enable a Virtual I/O Server partition to transfer data to another partitions
memory. This enables a device driver in a server partition to efficiently transfer data to and
from another partition. This is key to sharing a virtual I/O adapter in the Virtual I/O Server
partition.
Virtual Ethernet
Virtual Ethernet provides a network connection between partitions on the same managed
server. The Hypervisor provides the inter-partition virtual switch to provide support for
connecting up to 4,096 LANs. All partitions using a particular virtual LAN ID can
communicate with each other. The Virtual I/O Server software is not required to use virtual
Ethernet.
Shared Ethernet adapter

The shared Ethernet adapter bridges virtual LANs with external networks without the need
for TCP/IP routing. This function allows a partition with a virtual Ethernet to appear to be
connected directly to an external network.
The connections to the physical networks are through Virtual I/O Server partitions and this
implies a trusted environment for the Virtual I/O Server partition. The benefits to using this
feature include not needing a physical adapter for each partition and removal of some
network load. Since the release of version 1.2 of the Virtual I/O Server software, shared
Instructor Guide
Ethernet adapter failover functionality has been supported. The Virtual I/O Server software
is required to configure shared Ethernet adapters.
Virtual I/O Server variations

The Virtual I/O Server software is loaded in a special type of partition and uses a version of
AIX 6 (Virtual I/O Server Version 2.1) as its base.

V5.4.0.3
Instructor Guide

Purpose Give an overview of each virtual component in this client server model.
Details
Transition statement The shared Ethernet adapter is shown in the next slide.
Instructor Guide
Virtual I/O Server: Shared Ethernet adapter

Bridges external networks to internal VLANs.
The adapter forwards frames at OSI layer 2 and is transparent to IP layer.
Virtual I/O Server can give a higher priority to some packets.
Bandwidth apportioning feature for the shared Ethernet adapter distinguishes
more-important traffic from less-important traffic.
Higher-priority traffic is sent faster and uses more VIOS bandwidth than less-
important traffic.
Virtual I/O Server partition
Layer 2 bridge (shared Ethernet adapter)
Device driver Device driver Device driver
Physical Virtual Virtual

adapter adapter adapter
External To client
LANs partitions
Figure 1-13. Virtual I/O Server: Shared Ethernet adapter AN313.1
Notes:
Shared Ethernet adapter (SEA) technology (part of the Virtual I/O Server feature on
POWER hardware) enables the logical partitions to communicate with other systems
outside the managed system.
A shared Ethernet adapter is a bridge between a physical Ethernet adapter or aggregation

of physical adapters and one or more virtual Ethernet adapters on the Virtual I/O Server. A
SEA enables partitions on the virtual Ethernet to share access to the physical Ethernet and
communicate with standalone server and partitions on other systems. The shared Ethernet
provides this access by connecting the internal Hypervisor VLANs with the VLANs on the
external switches. This enables partitions on POWER5, POWER6 and POWER7 systems
to share the IP subnet with stand alone systems and other POWER partitions to allow for a
more flexible network configuration.

V5.4.0.3
Instructor Guide
Uempty Bridge
As the shared Ethernet processes packets at Layer 2, the original MAC address and VLAN
tags of the packet are visible to other systems on the physical network.
MTU issues
The virtual Ethernet adapters can transmit packets with a size up to 65408 bytes.
Therefore, the maximum MTU for the corresponding interface can be up to 65394 (65390
with VLAN tagging). Since the shared Ethernet adapter can only forward packets of a size
up to the MTU of the physical Ethernet adapters, a lower MTU or PMTU discovery should
be used when the network is being extended using the shared Ethernet.
Most packets including broadcast (for example, ARP) or multicast (for example, Network
Discovery Packet [NDP]) packets that pass through the shared Ethernet setup are not
modified. These packets retain their original MAC header and VLAN tag information. When
the MTU of the physical and virtual side do not match this can result in the shared Ethernet
receiving packets that cannot be forwarded because of MTU limitations. This situation is
handled by processing the packets at the IP layer, by either doing IP fragmentation or
reflecting ICMP errors (packet too big) to the source based on the IP flags in the packet. In
the case of IPv6, the packets ICMP errors are sent back to the source, as IPv6 allows
fragmentation only at the source host. These ICMP errors help the source host discover the
PMTU and therefore handle future packets appropriately.
VIOS network bandwidth apportioning

The bandwidth apportioning feature for the shared Ethernet adapter (SEA), allows the
VIOS to give a higher priority to some types of packets. In accordance with the IEEE
802.1q specification, VIOS administrators can instruct the SEA to inspect bridged
VLAN-tagged traffic for the VLAN priority field in the VLAN header. The 3-bit VLAN
priority field allows each individual packet to be prioritized with a value from 0 to 7 to
distinguish more important traffic from less important traffic. More important traffic is sent
faster and uses more VIOS bandwidth than less important traffic.
Instructor Guide
Instructor notes:
Purpose Overview of the shared Ethernet adapter.
Details The SEA network bandwidth apportioning feature is available with the Virtual I/O
Server version 1.5.2 and later.
Transition statement The Integrated Virtual Ethernet is an alternative to the shared
Ethernet adapter.

V5.4.0.3
Instructor Guide
Uempty
Integrated Virtual Ethernet

IVE is an integrated physical adapter with two to four physical ports.
IVE communicates directly to logical partitions, thus reducing the interaction
with the Power Hypervisor.
It allows up to 32 partitions to share an Ethernet adapter to directly connect to an external
network.
IVE ports as AIX devices:
IVE is presented logically to partitions as a Logical Host Ethernet Adapter.
Logical port appears as an ent# to AIX and eth# to Linux.
LPAR OS
ent Logical devices as

Logical they appear in AIX
port lhea
Logical Host
switch Port group Ethernet
Adapter (HEA)
Physical
port External switch
Figure 1-14. Integrated Virtual Ethernet AN313.1
Notes:
IBM Power Systems have an Integrated Virtual Ethernet (IVE). IBM Power 570 and Power
Systems 770 can have one IVE per system drawer. All operating systems supported on
IBM Power Systems support the use of IVE ports. The IVE allows multiple partitions to
share a single integrated Ethernet adapter to connect to an external network without a
Virtual I/O Server and without routing through another partition.
IVE terminology and description

AIX sees logical ports, which are the representation of the shared physical ports. (The OS
in the LPAR box in the visual above stands for operating system).
Logical ports are grouped in sets of 16, which is called a port group. Each port group will
have either one or two physical ports, depending on the IVE model. There are one or two
port groups for each IVE, depending on the model.
Instructor Guide
Hardware configuration options, such as speed, are set at the physical port level using the
HMC. The administrator chooses which logical ports to allocate to partitions and which
physical port to use for the logical ports.
There is one HEA in each IVE adapter. In the operating system, the HEA is represented
logically as an lhea device. If a partition uses two logical ports from the same HEA, they
must use different physical ports. In this case, there will be one lhea parent device and two
ent# devices.

V5.4.0.3
Instructor Guide

Purpose Describe the basic IVE terminology.
Details An important point is that AIX sees the HEA as the lhea device and the physical
port as an ent# device.
Transition statement The next slide is virtual SCSI.
Instructor Guide
Virtual I/O Server: Virtual SCSI architecture

Virtual SCSI supports Fibre Channel, SAS, parallel SCSI, SCSI RAID
devices, iSCSI, and optical devices, including DVD-RAM and DVD-
ROM.
Version 1.5 and later support file-backed virtual SCSI devices and file-
backed virtual optical devices.
Version 2 supports SAS tape devices.
External storage Micro-partitions

Shared VIOS AIX 5L Linux AIX 5L Linux
Fibre Chan v V6.1 V5.3
adapter S v
A1 A2 A3 A4 L
A5 C A A1 A2
Shared S N
I B1 B2 B3 A3
SCSI
adapter
B1 B2 B4 B5
B3
Virtual SCSI
POWER Hypervisor
Figure 1-15. Virtual I/O Server: Virtual SCSI architecture AN313.1
Notes:
The virtual SCSI architecture

Both AIX and i operating systems have virtual SCSI drivers. These server drivers adhere to
SCSI standards and allow interoperability between operating systems. Both operating
systems support virtual SCSI disks, optical media, and tape.
The visual shows how the virtual SCSI implementation allows the Virtual I/O Server to
share the Fibre Channel or SCSI attached disk resources with the Linux and AIX clients.
The resource could be whole disks (A1, A2, A3, B1, B4, and B5 in the figure, or logical
volumes A4, A5, B2, and B3 in the figure), or files contained in file storage pools. A file
storage pool is similar to an AIX file system.

V5.4.0.3
Instructor Guide

Purpose Overview the virtual SCSI architecture on POWER.
Details
Transition statement NPIV improves and simplifies the Fibre Channel SAN
management.
Instructor Guide
N_Port_ID Virtualization
Simplify management of
Fibre Channel SAN VIO client
EMC 5000 IBM 2105
environment with port LUN LUN
virtualization.
Fibre Channel industry Virtual FC
adapters
standard method for using
VIOS
virtualization to map multiple
Pass through
N_Port IDs to a physical Fibre module
Channel port
Allows LPARs to have PCIe 8Gbit
Fibre Channel adapters
dedicated N_Port IDs (just as
with a dedicated physical HBA). NPIV SAN
EMC 5000 LUN IBM 4700 LUN
Figure 1-16. N_Port_ID Virtualization AN313.1
Notes:
N Port ID Virtualization (NPIV)

NPIV provides direct access to Fibre Channel adapters from multiple client partitions,
simplifying the management of Fibre Channel SAN environments. NPIV support is included
with PowerVM Express, Standard, and Enterprise edition and supports AIX V5.3, AIX V6.1,
and SUSE Linux Enterprise Server 11 partitions on IBM Power Systems with an 8 GB Fibre
Channel Host Bus Adapter (HBA).
As shown in the figure, each client logical partition has a unique identity to the SAN. The
LUNs seen at the client partition are not seen as virtual SCSI disk, but as the storage
subsystem LUN type. At the Virtual I/O Server, there are not any virtual SCSI adapters or
virtual target devices to be defined, only virtual Fibre Channel adapters. This simplifies the
management of SAN-attached LUNs.

V5.4.0.3
Instructor Guide

Purpose Give an overview of NPIV.
Details NPIV is an industry standard and has been implemented already on IBM z9
Mainframes.
Transition statement Virtual tapes are supported by Virtual I/O Server 2.1.
Instructor Guide
Virtual tape
Simplify backup and restore operations with virtual tape.
Only SAS tape drives are supported.
SAN Fibre Channel tape drives are supported through N-port ID
Virtualization (NPIV).
VIO client
VIO client Generic

tape
Virtual device
tape
device vFC
adapters
vSCSI
adapters
NPIV FC
VIOS
adapter
SAS adapter VIOS

LAN
SAN
Tape library
drive robotics
SAS tape Copyright IBM Corporation 2011
Figure 1-17. Virtual tape AN313.1
Notes:
PowerVM has two virtualization methods for using tape devices on IBM Power Systems,
simplifying backup and restore operations. Both methods are supported with PowerVM
Express, Standard, or Enterprise Edition.
NPIV enables PowerVM LPARs to access SAN tape libraries using shared physical
HBA resources for AIX V5.3, AIX V6.1, and SUSE Linux Enterprise Server 11 partitions
on POWER6 processor-based servers.
Virtual tape support allows serial sharing of selected SAS tape devices for AIX V5.3,
AIX V6.1, IBM i 6.1, and SUSE Linux Enterprise Server 11 partitions.

V5.4.0.3
Instructor Guide

Purpose Highlight both possibilities when virtualizing tapes.
Details
Transition statement Next slide is live partition mobility.
Instructor Guide
PowerVM Live Partition Mobility

Move running AIX and Linux
operating system workloads from
one IBM Power server to another
Continuous availability
Eliminate many planned outages
Energy saving
During non-peak hours
Workload balancing
During peaks and to address
Virtualized SAN and network infrastructure
spikes in workload Virtualized SAN and network infrastructure
Requirements
POWER6 and POWER7 systems
Logical partition must only have
virtual adapters
Figure 1-18. PowerVM Live Partition Mobility AN313.1
Notes:
Live Partition Mobility allows for the movement of a running partition from one POWER6 or
POWER7 processor-based server to another without application downtime. This provides
better system utilization, improved application availability and energy savings. With live
partition mobility, planned application downtime due to regular server maintenance can be
a thing of the past.
As of this writing, all the resources of the moving partition must be virtualized (no dedicated
adapters). Also, you can perform a live partition mobility between two systems that are
managed by different HMCs.

V5.4.0.3
Instructor Guide

Purpose Give an overview an highlight the advantages of using the live partition
mobility feature.
Details
Transition statement Active memory sharing is new as of May 2009.
Instructor Guide
PowerVM Active Memory Sharing

AMS allows multiple logical partitions to share a common pool
of physical memory.
AMS intelligently assigns memory from one partition to another based
on memory page demands.
Optimizes memory utilization and provides flexible memory usage
LPAR1 LPAR2 LPAR3
POWER
Hypervisor
AMS shared Virtual

I/O
memory pool
Server
Paging devices
Physical memory
Figure 1-19. PowerVM Active Memory Sharing AN313.1
Notes:
The IBM Power Virtualization Manager (IBM PowerVM) Active Memory Sharing (AMS)
technology takes PowerVM virtualization to a new level of consolidation and virtualization
by optimizing memory utilization. AMS intelligently shares memory by dynamically moving
it from one partition to another on demand. This can optimizes memory utilization and
allows for flexible global memory usage.
Because memory utilization can be linked to processor utilization, this function
complements shared processors very well. Systems with low CPU requirements are very
likely to have low memory residency requirements.
The Virtual I/O Server is required as paging partition, owning the paging devices used
when the hypervisor pages out partitions memory to satisfy demands from other partitions.

V5.4.0.3
Instructor Guide

Purpose Give an overview of AMS.
Details Active memory sharing intelligently moves system memory from one partition to
another as workload demands change.
Transition statement The last feature to discuss is PowerVM Lx86.
Instructor Guide
Active Memory Expansion

Compresses in-memory data to fit System physical System physical
memory memory
more data into memory LPAR LPAR LPAR
LPAR LPAR LPAR
LPAR LPAR LPAR

The physical memory requirements of LPAR LPAR LPAR LPAR LPAR LPAR
AME
existing LPARs is reduced. LPAR LPAR LPAR
LPAR LPAR LPAR
LPAR LPAR LPAR

Free memory capacity can be used to LPAR LPAR LPAR LPAR .
create more LPARs.
Increases a LPARs effective LPARs effective LPARs effective

memory capacity memory capacity memory capacity
Physical memory Physical memory
Can increase the effective memory
capacity of a LPAR
AME
Increases the memory available to a
workload. Expanded
memory capacity
Figure 1-20. Active Memory Expansion AN313.1
Notes:
Active memory expansion is configurable on a per-LPAR basis. When memory expansion
is enabled for a LPAR, the operating system running in the LPAR will compress in-memory
data to effectively expand the size of memory by allowing more data to be packed into
memory.
Logically, the operating system will maintain two pools of memory - a compressed pool of
memory and an uncompressed pool of memory. The sizes of the pools will be controlled by
the operating system. With AIX, the sizes of the memory pools will vary based on load and
the target memory expansion factor.
Only pages in the uncompressed memory pool are directly accessible and usable. Pages
in the compressed pool must first be decompressed into the uncompressed pool in order to
be used. The OS will be responsible for moving pages between the compressed and
uncompressed pools based on workload.

V5.4.0.3
Instructor Guide
Uempty AME added value

Active memory expansion can make it possible to achieve higher levels of consolidation
on IBM Power servers and higher utilization of servers. For example, you have a
system with 10 LPAR's running at 40% CPU utilization and you would like to deploy
additional LPAR's on the system, but all of the physical memory is in-use on the system.
In this environment, you could turn memory expansion on for some (or all) of the
LPAR's to reduce their memory requirements. This would increase the CPU utilization
of the existing LPAR's but would also reduce the amount of physical memory required
by each LPAR. This allows you to free up physical memory capacity that can be used to
deploy additional LPAR's onto the system.
Active memory expansion can increase the effective amount of memory capacity
beyond the maximum physical DRAM configuration. For example, if your systems
physical memory capacity is limited to 512GB, enabling active memory compression
might increase the effective maximum amount of memory capacity beyond 512GB.
Active memory expansion can reduce the acquisition cost of memory. With active
memory expansion, you could choose to buy more CPU and less memory for a system.
The economic benefits would depend on the CPU and memory costs, as well as the
CPU utilization impacts caused by memory compression.
Instructor Guide
Instructor notes:
Purpose
Details
Transition statement Lets see PowerVM Lx86.

V5.4.0.3
Instructor Guide
Uempty
PowerVM Lx86
Run x86 Linux applications on Power Systems servers along with your AIX
and Linux on POWER applications
Simplifies migration of Linux on x86 applications
Runs most existing 32-bit x86 Linux applications with no application changes
Enables customers to realize the energy and administrative benefits associated with
consolidation
Is included with the purchase of PowerVM editions
PowerVM
x86 Install and run x86
x86
x86 Linux OS Linux OS
Linux OS No porting
Linux OS app app POWER
app AIX OS
app Linux OS
Linux OS No recompile PowerVM application
Linux OS Lx86 application
Linux OS
No changes
x86 platforms Supported Linux OS AIX OS
x86 Platforms
x86 Platforms
Power Systems platform
Figure 1-21. PowerVM Lx86 AN313.1
Notes:
This feature enables the dynamic execution of x86 Linux instructions by mapping them to
instructions on a POWER processor-based system and caching the mapped instructions to
optimize performance. PowerVM Lx86 software is designed with features that enable users
to easily install and run a wide range of x86 Linux applications on Power Systems
platforms, with a Linux on POWER operating system.
This allows the consolidation of AIX and Linux on POWER and x86 Linux applications on
the same server.
PowerVM Lx86 supports the installation and running of most 32-bit x86 Linux applications
on any IBM System p or BladeCenter model with POWER7, POWER6, POWER5+ or
POWER5 processors, or IBM POWER Architecture technology-based IBM BladeCenter
blade servers. It creates an x86 Linux application environment running on POWER
processor-based systems by dynamically translating x86 instructions to POWER
Architecture instructions and caching them to enhance performance, as well as mapping
x86 Linux system calls to Linux on POWER system calls. No native porting or application
upgrade is required for running most x86 Linux applications.
Instructor Guide
Instructor notes:
Purpose Lx86 is available with all the PowerVM Editions.
Details PowerVM Lx86 can help developers and ISVs reduce the effort required to
support Linux by reducing or eliminating the requirement to port, tune, recompile, release
new media or documentation, or maintain a unique product offering for POWER
technology.
Transition statement Lets introduce virtualization performance management.

V5.4.0.3
Instructor Guide
Uempty
What is performance management?

Have goals.
Design for performance, capacity, cost, availability, or some mix of
all?
Make configuration decisions before purchasing.
Know the performance baseline.
Analyze, tune, monitor (repeat)
What is tuning?
Could be a reset of goals, change of configuration, addition of
resources, or software configuration adjustment
What is performance management in a virtualized
environment?
Take into account unique system features, such as partitioning,
hardware, and AIX 6 enhancements
System view rather than an individual partition
Figure 1-22. What is performance management? AN313.1
Notes:
Measuring performance
Tools are used to measure performance in key areas such as:
CPU utilization
Memory utilization and paging
Disk I/O
Network I/O
Know your performance baseline over time so that performance issues can be recognized
and tuning activities can be evaluated.
Instructor Guide
Instructor notes:
Purpose Introduce why the course title refers to performance management and not
tuning.
Details Performance management encompasses more than tuning. It includes
understanding what is normal on your system, making adjustments as necessary, and
evaluating tuning adjustments that work.
Transition statement Lets review a methodology for analyzing the performance of a
system.

V5.4.0.3
Instructor Guide
Uempty
Performance methodology
Performance can be improved by using a methodical
approach.
1. Understand the factors that can affect performance.
2. Measure the current performance of the server.
3. Identify any performance bottlenecks.
4. Change the component causing the bottleneck.
5. Measure the new performance of the server to check for
improvement.
Figure 1-23. Performance methodology AN313.1
Notes:
Methodical approach
Using a methodical approach, you can improve server performance. For example:
Understanding the factors that can affect server performance for the specific server
functional requirements and for the characteristics of the particular system.
Measuring the current performance of the server.
Identifying performance bottlenecks.
Changing the component that is causing the bottleneck.
Measuring the new performance of the server to check for improvement.
Instructor Guide
Instructor notes:
Purpose Discuss the performance analysis methodology.
Details This is a typical scientific approach to performance management. Measure,
change one thing, measure again.
Transition statement Lets look at a general flowchart used to analyze performance.

V5.4.0.3
Instructor Guide
Uempty
Performance analysis flowchart
CPU bound?
Yes
Yes Actions
No
Is there a
performance Memory bound?
No problem? Yes
Actions
No
I/O bound?
Yes
Normal operations Actions
No
Monitor system performance
and check against requirements. Network bound?
Yes
Actions
No
No
Does performance Additional tests
meet stated
goals?
Actions
Yes
Figure 1-24. Performance analysis flowchart AN313.1
Notes:
This is a flowchart that some performance analysts use. Keep in mind that this is an
iterative process.
Instructor Guide
Instructor notes:
Purpose Use this flowchart to provide an overview of performance analysis.
Details Point out that continuous monitoring should be done, as well as satisfying
customer complaints.
Transition statement Lets look at some AIX tools that can be used for performance
monitoring.

V5.4.0.3
Instructor Guide
Uempty
AIX performance analysis tools
Memory I/O Network

CPU
system subsystem subsystem
vmstat, iostat vmstat iostat lsattr
ps lsps vmstat netstat, entstat
sar svmon lsps nfsstat
tprof, gprof, prof filemon lsattr netpmon
time, timex lsdev ifconfig
lspv, lslv,
netpmon iptrace, ipreport
lsvg
fileplace tcpdump
locktrace filemon
emstat, alstat lvmstat
topas, topas,
topas, performance topas, performance
performance performance
toolbox toolbox
toolbox toolbox
trace, trcrpt, curt, trace, trcrpt, trace, trcrpt,
trace, trcrpt, truss
splat, truss truss truss
procmon, lparstat, procmon,
nfs4cl
mpstat, smtctl lparstat
Figure 1-25. AIX performance analysis tools AN313.1
Notes:
CPU analysis tools

CPU metrics analysis tools include:
vmstat, iostat, sar, lparstat and mpstat which are packaged with bos.acct
ps which is in bos.rte.control and cpupstat which is part of bos.rte.commands
gprof and prof which are in bos.adt.prof
time (built into the various shells) or timex which is part of bos.acct
emstat and alstat - emulation and alignment tools from bos.perf.tools
netpmon, tprof, locktrace, curt, splat, and topas are in bos.perf.tools
Performance toolbox tools such as xmperf, 3dmon are part of perfmgr.network
trace and trcrpt which are part of bos.sysmgt.trace
procmon is in bos.perf.gtools
Instructor Guide
Memory subsystem analysis tools

Some of the memory metric analysis tools are:
vmstat which is packaged with bos.acct
lsps which is part of bos.rte.lvm
svmon and filemon are part of bos.perf.tools
Performance toolbox tools such as xmperf, 3dmon which are part of perfmgr
I/O subsystem analysis tools

I/O metric analysis tools include:
iostat and vmstat are packaged with bos.acct
lsps, lspv, lsvg, lslv and lvmstat in bos.rte.lvm
lsattr and lsdev in bos.rte.methods
filemon, fileplace in bos.perf.tools
Network subsystem analysis tools

Network metric analysis tools include:
lsattr and netstat which are part of bos.net.tcp.client
nfsstat and nfs4cl as part of bos.net.nfs.client
netpmon is part of bos.perf.tools
ifconfig as part of bos.net.tcp.client
iptrace and ipreport are part of bos.net.tcp.server
tcpdump which is part of bos.net.tcp.server
Enhanced commands for AIX 5.3 and AIX 6, which support the
virtualization features
The lparstat command reports logical partition CPU and memory-related information
and statistics.

V5.4.0.3
Instructor Guide
Uempty The mpstat command collects and displays performance statistics for all logical
processors in the system. The mpstat command shows SMT utilization (-s), interrupt
metrics (-i), detailed software and dispatcher metrics (-d), and other information.
The smtctl command controls the enabling and disabling of the processor
simultaneous multithreading mode.
The vmstat, iostat, sar, and topas commands are collecting statistics for virtual
processors.
The entstat command shows Ethernet device statistics including shared Ethernet
adapter statistics.
Instructor Guide
Instructor notes:
Purpose List the performance analysis tools.
Details Provide an overview of the available performance tools based on the metrics
they are used for.
Point out that the lparstat, vmstat, and topas tools have been enhanced to support the
active memory sharing feature.
Additional information Other performance tools not written by IBM also exist. Some
are for purchase and others are publicly available on the Internet.
Transition statement Next slide shows AIX tuning tools.

V5.4.0.3
Instructor Guide
Uempty
AIX performance tuning tools
Memory I/O Network

CPU
nice
vmo vmo no
renice
schedo ioo ioo nfso
bindprocesso chps
chdev chdev
r mkps
chdev fdpr migratepv ifconfig
setpri chdev chlv
bindintcpu rmss reorgvg
procmon
smtctl
Figure 1-26. AIX performance tuning tools AN313.1
Notes:
CPU tuning tools

CPU tuning tools include:
nice, renice, and setpri to modify priorities.
nice and renice are in the bos.rte.control fileset.
setpri is a command available with the perfpmr package.
schedo to modify scheduler algorithms (in the bos.perf.tune fileset).
bindprocessor to bind processes to CPUs (in the bos.mp fileset).
chdev to modify certain system tunables (in the bos.rte.methods fileset).
bindintcpu to bind an adapter interrupt to a specific CPU (in the devices.chrp.base.rte
fileset).
procmon is in bos.perf.gtools.
Instructor Guide
smtctl is used to enable and disable simultaneous multithreading and to view the
status.
Memory tuning tools

Memory tuning tools include:
vmo and ioo for various VMM, file system, and LVM parameters (in bos.perf.tune
fileset). Since AIX6, some tunable parameters are restricted and are not displayed
unless you specify the -F flag.
chps and mkps to modify paging space attributes (in bos.rte.lvm fileset)
fdpr to rearrange basic blocks in an executable so that memory footprints become
smaller and cache misses are reduced (in perfagent.tools fileset)
chdev to modify certain system tunables (in bos.rte.methods fileset)
rmss to reduce amount of available real memory (in bos.perf.tools fileset)
I/O tuning tools

I/O tuning tools include:
vmo and ioo to modify certain file systems and LVM parameters (in bos.perf.tune
fileset).
chdev to modify system tunables such as disk and disk adapter attributes (in
bos.rte.methods fileset).
migratepv to move logical volumes from one disk to another (in bos.rte.lvm fileset).
chlv to modify logical volume attributes (in bos.rte.lvm fileset).
reorgvg to move logical volumes around on a disk (in bos.rte.lvm fileset).
Network tuning tools

Network tuning tools include:
no to modify network options (in bos.net.tcp.client fileset).
nfso to modify network file system (NFS) options (in bos.net.nfs.client fileset).
chdev to modify network adapter attributes (in bos.rte.methods fileset).
ifconfig to modify network interface attributes (in bos.net.tcp.client fileset).

V5.4.0.3
Instructor Guide

Purpose List the performance tuning tools.
Details Provide an overview of the available performance tuning tools based on the
performance metrics.
Additional information Dont go into details about these commands. Mention that since
AIX6.1, some vmo, ioo, and schedo parameters are not displayed by default. These are
restricted parameters and can be seen using the -F flag. These restricted parameters can
be changed when asked by IBM support.
Transition statement The next page shows where to get more information on POWER
Systems and PowerVM features.
Instructor Guide
References
Information Center documents:
http://publib16.boulder.ibm.com/pseries/index.htm
Support for virtualization software:
http://www14.software.ibm.com/webapp/set2/sas/f/virtualization/home.html
IBM PowerVM Web portal:
http://www-03.ibm.com/systems/power/software/virtualization/index.html
Provides links to white papers, education resources, services, and so forth
Redbooks:
http://www.redbooks.ibm.com/
In particular:
SG24-7590 IBM PowerVM Virtualization Managing and Monitoring
Redp-4194 IBM System p PowerVM Best Practices
Redp-4470 PowerVM Virtualization Active Memory sharing
SG24-7559 IBM AIX Version 6.1 Differences Guide
SG24-6478 AIX 5L Practical Performance Tools and Tuning Guide
Figure 1-27. References AN313.1
Notes:
This list is a starting point to obtain documentation for your system. There is documentation
for your specific system model, for the HMC, for the operating systems, and for configuring
partitions. The Information Center is the access point to the IBM documentation.
There are new Redbooks released all the time, particularly as a product matures. Check
the www.redbooks.ibm.com Web site from time to time.
Technical support Web site links

Additional important web sites for information include the following:
The Virtual I/O Server technical services Web site:
http://www14.software.ibm.com/webapp/set2/sas/f/vios/home.html
The Support for IBM Systems and servers is at:
http://www-947.ibm.com/systems/support/

V5.4.0.3
Instructor Guide

Purpose Give some references for PowerVM and IBM POWER Systems.
Details
Transition statement Lets do a quick checkpoint.
Instructor Guide
Checkpoint
1. The PowerVM Enterprise Edition is required for which of the
following?
a. Shared Ethernet adapter
b. Partition mobility
c. Virtual SCSI Adapter
d. Integrated Virtual Ethernet
e. Active Memory Sharing
2. How many shared processor pools are supported on a

POWER6 processor-based system?
3. Which PowerVM feature allows sharing a pool of memory?
4. Which PowerVM feature allows LPAR migration from one

physical machine to another?
Figure 1-28. Checkpoint AN313.1
Notes:

V5.4.0.3
Instructor Guide

Purpose Checkpoint
Details
Checkpoint solution
1. The PowerVM Enterprise Edition is required for which of the following?
The answers are partition mobility and Active Memory Sharing.
2. How many shared processor pools are supported on a POWER6

processor-based system?
The answer is 64 (the default one plus 63 additional).

The answer is Active Memory Sharing.
4. Which PowerVM feature allows LPAR migration from one physical

machine to another?
The answer is partition mobility.
Instructor Guide
Exercise
Unit
exerc
ise
Figure 1-29. Exercise AN313.1
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Unit summary
Having completed this unit, you should be able to:

Systems
Figure 1-30. Unit summary AN313.1
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide

V5.4.0.3
Instructor Guide
Uempty Unit 2. Processor virtualization tuning
Estimated time
02:00

This unit describes the SMT concept and the partition shared
processor options that can be tuned for performance. AIX 6.1 tools are
used to monitor performance statistics.

Describe the simultaneous multithreading concept and its effect on
performance monitoring and tuning
Describe the function of the PURR/SPURR statistics
Describe the impact of simultaneous multithreading on tools such
as vmstat, iostat, sar, and topas
Discuss guidelines for systems running simultaneous
multithreading with various workloads
Use tools to view statistics related to the monitoring and tuning of
partitions that have simultaneous multithreading enabled
Describe how the POWER Hypervisor allocates processing power
from the shared processing pool
Discuss recommendations associated with the number of virtual
processors
Describe performance considerations associated with
implementing Micro-Partitioning
Use tools to monitor the statistics on a partition running a workload
with Micro-Partitioning configured

Machine exercises
Copyright IBM Corp. 2010, 2011 Unit 2. Processor virtualization tuning 2-1
Instructor Guide
References
SG247940 - PowerVM Virtualization on IBM System p Introduction
and Configuration (Fourth Edition)
SG247590 - PowerVM Virtualization Managing and Monitoring
Redbook
White Paper: POWER5 System micro architecture from IBM research
http://www.research.ibm.com/journal/rd/494/sinharoy.
pdf
White Paper: POWER6 System micro architecture from IBM research
http://researchweb.watson.ibm.com/journal/rd/516/le.p
df
White Paper: SPURR informations and descriptions: EnergyScale for
IBM POWER6 microprocessor based systems:
http://researchweb.watson.ibm.com/journal/rd/516/mc
creary.pdf
REDP-4638-00: IBM Power 750 and 755 Technical Overview and
Introduction redpaper

V5.4.0.3
Instructor Guide
Uempty
Unit objectives
Describe the impact of simultaneous multithreading on tools such as
vmstat, iostat, sar, and topas
Discuss guidelines for systems running simultaneous multithreading
with various workloads
Describe how the POWER Hypervisor allocates processing power from
the shared processing pool
processors
Describe performance considerations associated with implementing
Micro-Partitioning
Use tools to monitor the statistics on a partition running a workload with
Micro-Partitioning configured
Notes:
Instructor Guide
Instructor notes:
Purpose Review the objectives for this unit.
Details Explain what well cover and what the students should be able to do at the end
of the unit.
Transition statement Heres the performance flowchart that we will be following
throughout the course, with the CPU box highlighted.

V5.4.0.3
Instructor Guide
Uempty
What is simultaneous multithreading?

Two (on POWER5 and POWER6) or four (on POWER7) hardware
threads can run on one physical processor at the same time.
One processor appears as two (or four) logical processors to AIX and
applications.
Each hardware thread is seen as a logical CPU.
Simultaneous multithreading is a means of converting thread-level
parallelism (multiple CPUs) to instruction-level parallelism (same CPU).
Logical Logical Logical Logical

CPU0 CPU1 CPU0 CPU1
AIX layer
Physical layer
Hardware Hardware Hardware Hardware
thread0 thread1 thread2 thread3
Physical CPU
Figure 2-2. What is simultaneous multithreading? AN313.1
Notes:
Simultaneous multithreading (SMT) is the ability of a single physical processor to
concurrently dispatch instructions from more than one hardware thread. There are two
hardware threads per physical processor on POWER6 and four on POWER7, so additional
instructions can run at the same time. Because the processor can fetch instructions from
any of the threads in a given cycle, the processor is no longer limited by the
instruction-level parallelism of the individual threads.
Simultaneous multithreading also allows instructions from one thread to utilize all the
execution units if the other thread encounters a long latency event. For instance, when one
of the threads has a cache miss, the second thread can continue to execute.
The operating system supports each hardware thread as a separate logical processor. So,
the operating system configures a dedicated partition that is created with one physical
processor as a logical two-way or four-way when simultaneous multithreading is enabled.
This is independent of the partition type, so a shared partition with one virtual processor is
configured as a logical two-way. Starting in AIX 5L V5.3, simultaneous multithreading is
enabled by default.
Instructor Guide
Instructor notes:
Purpose Describe how simultaneous multithreading works on IBM POWER systems.
Details When in simultaneous multithreading mode, instructions from either thread can
use the eight instruction pipelines in a given clock cycle. By duplicating portions of logic in
the instruction pipeline and increasing the capacity of the register rename pool, the IBM
POWER processor can execute two or four instruction streams, or threads, concurrently.
Additional information Normally, AIX maintains sibling threads at the same priority,
but boosts or lower-thread priorities in a few key places to optimize performance. AIX
lowers thread priorities, when the thread is doing non-productive work spinning in the idle
loop or on a kernel lock. When a thread is holding a critical kernel lock, AIX boosts the
thread priorities. These priority adjustments do not persist into user mode. AIX does not
consider a software threads dispatching priority when choosing its hardware thread
priority. Several scheduling enhancements were also made to exploit simultaneous
multithreading. For example, work is distributed across all primary threads before it is
dispatched to secondary threads. The reason for this enhancement is that a thread
performs best when its sibling thread is idle. AIX also considers thread affinity in idle
stealing and periodic run queue load balancing.
Transition statement When is simultaneous multithreading beneficial?

V5.4.0.3
Instructor Guide
Uempty
When to use simultaneous multithreading

Simultaneous multithreading might be beneficial if:
There is random data access (where you must wait for data to be
loaded into cache)
The overall throughput is more important than the throughput of an
individual thread
Simultaneous multithreading might not be beneficial if:
Both threads use same execution units
Where simultaneous multithreading is not beneficial, POWER
systems support single-threaded execution mode:
Automatically by snoozing
Manually by disabling simultaneous multithreading
POWER7 SMT modes can be adapted autonomously
according to workload characteristics (intelligent threads)
Figure 2-3. When to use simultaneous multithreading AN313.1
Notes:
In general, the following rules can be summarized for application performance on
simultaneous multithreading environments.
Applications found in commercial environments showed higher simultaneous
multithreading gain, than scientific applications.
Experiments on different workloads have shown varying degrees of simultaneous
multithreading gain ranging from -11% to 43%.
On average, most of the workloads showed a positive gain when running in
Applications that showed a negative simultaneous multithreading gain can be attributed
to L2 cache thrashing and increased local latency under simultaneous multithreading.
When simultaneous multithreading might be beneficial

Simultaneous multithreading is a good choice when the overall throughput is more
important than the throughput of an individual thread. For example, Web servers and
database servers are good candidates for simultaneous multithreading.
Instructor Guide
Workloads that have a very high Cycles Per Instruction (CPI) count tend to utilize
processor and memory resources poorly and usually see the greatest simultaneous
multithreading benefit. These large CPIs are usually caused by high cache miss rates from
a very large working set. Large commercial workloads typically have this characteristic,
although it depends somewhat on whether the two hardware threads share instructions or
data or are completely distinct. Workloads that share instructions or data, which includes
those that run a lot in the operating system or within a single application, tend to have
better simultaneous multithreading benefit. Workloads with low CPI and low cache miss
rates tend to see a benefit, but a smaller one.
For high performance computing, try enabling simultaneous multithreading and monitor
performance. If the workload is data-intensive with tight loops, you might see more
contention for cache and memory which can reduce performance.
When simultaneous multithreading might not be beneficial

Simultaneous multithreading is not always advantageous. Any workload where the majority
of individual software threads highly utilize any resource in the processor or memory will
benefit very little from simultaneous multithreading. For example, workloads that are
heavily floating-point-intensive are likely to gain little from simultaneous multithreading and
are most likely to lose performance. They tend to heavily utilize either the floating-point
units or the memory bandwidth.
In addition, if one thread is slowed by many L2 cache misses, it could still be holding
resources on the processor (for example, issue queue slots) that cause the other thread to
slow down. In this case, the CPU detects the situation and throttles the rate of instruction
dispatch for the thread that is blocking the resources.
If simultaneous multithreading is not beneficial, it can be disabled.
Single-threaded execution mode

Where simultaneous multithreading is not beneficial, IBM POWER systems support
single-threaded execution mode. In this mode, the IBM POWER system gives all the
physical resources to the active thread. In single-threaded mode, the IBM POWER system
uses only one instruction fetch address register and fetches instructions for one thread
every cycle.
There are two ways your AIX partition can enter single-threaded mode. You can
dynamically disable it with the smtctl command, or in a dedicated processor partition. If the
second thread is idle, it will be marked as dormant, and single-threaded mode will be used
until there is work for the second thread. For shared processor partitions, a hardware
thread that can find no work to do (that is, it is running in the idle loop), is marked at a lower
priority until there is more work.

V5.4.0.3
Instructor Guide
Uempty Snoozing
The process of putting an active thread into a dormant state is known as snoozing. In
dedicated processor partitions, if there are not enough tasks available to run on both
hardware threads of a processor, the operating systems idle process will be selected to run
on the idle hardware thread. It is better for the operating system to snooze the idle process
thread and switch to single-threaded mode. Doing so enables all of the processor
resources to be available to the thread doing meaningful work.
To snooze a thread, the operating system will invoke the h_cede Hypervisor call. The
thread then goes to the dormant state. A snoozed thread is woken when a decrementer,
external interrupt, or an h_prod hypervisor call is received. When other tasks become
ready to run, the processor transitions from single-threaded mode to simultaneous
multithreading mode. It does not make sense to snooze a thread as soon as the idle
condition is detected. There could be another thread in the ready-to-run state in the run
queue by the time the snooze occurs, resulting in wasted cycles due to the thread start-up
latency. It is good for performance if the operating system waits for a short of time for work
to come in before snoozing a thread. This short idle spinning time is known as
simultaneous multithreading snooze delay. Both AIX and Linux provide snooze delay
tunables.
To view the current snooze delay value on AIX 6.1:
# schedo -o smt_snooze_delay
smt_snooze_delay = 0
The value represents the number of microseconds spent in the idle loop without useful
work before snoozing (calling h_cede). A value of -1 indicates to disable snoozing; a value
of 0 (the default) indicates to snooze immediately. The value can go as high as 100000000
(100 secs).
Certain workloads might see better performance with a larger snooze delay. To change the
delay, use schedo. For example, heres the command to change the delay to five
microseconds:
# schedo -o smt_snooze_delay=5
Setting smt_snooze_delay to 5
With POWER7, a new parameter was added: smt_tertiary_snooze_delay. This acts similar
to smt_snooze_delay except that it works on the third and fourth smt threads while
smt_snooze_delay works on the first and second thread.
Instructor Guide
Instructor notes:
Purpose Explain when simultaneous multithreading is beneficial or not.
Details Review types of workloads and whether simultaneous multithreading is
beneficial or not. The bottom line is that it is difficult to predict ahead of time whether it
would be beneficial or not for any given workload. The best option is to monitor
performance with it off, then monitor performance with it on and see if theres a difference.
Additional information In AIX 5.3, the schedo command had a rounding error that set
the number of microseconds to one less than you specify. It use to cause the following:
# schedo -o smt_snooze_delay=10
Setting smt_snooze_delay to 10
# schedo -o smt_snooze_delay
smt_snooze_delay = 9
However, this has been fixed by APAR IY85228.
Transition statement Lets overview POWER7 intelligent threads

V5.4.0.3
Instructor Guide
Uempty
POWER7 intelligent threads

POWER7 can run in single thread (ST), dual thread (SMT2), or
quad thread (SMT4) mode.
Dynamic runtime SMT scheduling
Spread work among cores to execute in appropriate threaded mode
Can dynamical shift between modes as required: SMT1(single thread)
/ SMT2 / SMT4
Happens automatically or the administrator can configure the system into
ST, SMT2, or SMT4
Intelligent threads capability utilizes more threads when
workload benefits.
Figure 2-4. POWER7 intelligent threads AN313.1
Notes:
POWER7 features Intelligent threads that can vary based on the workload demand. The
system will automatically determine whether a workload benefits from dedicating as much
capability as possible to a single thread of work, or benefits from having capability spread
across 2 or 4 threads of work.
With more threads, POWER7 can deliver more total capacity as more tasks are
accomplished in parallel. With fewer threads, those workloads that need very fast individual
tasks (like databases or transaction workloads) can get the performance they need for
maximum benefit.
Powers Intelligent threads capability lets the system dynamically switch from single thread
(ST) to dual thread (SMT2) to quad thread (SMT4) modes per core.
Instructor Guide
Instructor notes:
Purpose Describe POWER7 intelligent threads.
Details
Transition statement How can we enable or disable SMT?

V5.4.0.3
Instructor Guide
Uempty
Turning on or off simultaneous
multithreading (1 of 2)
Use the smtctl command or SMIT to enable, disable, or see status:
smtctl [ -m off | on [ -w boot | now]]
SMIT fastpath: smitty smt
To turn simultaneous multithreading off dynamically (for now):
# smtctl -m off -w now
smtctl: SMT is now disabled.
# bindprocessor -q
The available processors are: 0
To turn simultaneous multithreading on dynamically (now and reboot):

# smtctl -m on (defaults to both)
smtctl: SMT is now enabled. It will persist across reboots if you
run the bosboot command before the next reboot.
# bosboot -a
# bindprocessor -q
The available processors are: 0 1
Figure 2-5. Turning on or off simultaneous multithreading (1 of 2) AN313.1
Notes:
Modifying simultaneous multithreading with the smtctl command

The smtctl command provides privileged users and applications the ability to control
utilization of processors with simultaneous multithreading support. With this command, you
can turn simultaneous multithreading on or off system-wide, either immediately or the next
time the system boots. By default, AIX enables simultaneous multithreading.
The smtctl command syntax is:
smtctl [ -m off | on [ -w boot | now ]]
where:
-m off Sets simultaneous multithreading mode to disabled.
-m on Sets simultaneous multithreading mode to enabled.
Instructor Guide
-w boot Makes the simultaneous multithreading mode change effective on the

next and subsequent reboots. (You must run the bosboot command
before the next system reboot).
-w now Makes the simultaneous multithreading mode change immediately but
will not persist across reboot.
If neither the -w boot or the -w now options are specified, then the mode change is made
now and when the system is rebooted. (You must run the bosboot command before the
next system reboot.)
Note, the smtctl command does not rebuild the boot image. If you want your change to
persist across reboots, the bosboot command must be used to rebuild the boot image. The
boot image in AIX 6.1 has been extended to include an indicator that controls the default
Issuing the smtctl command with no options will display the current simultaneous
multithreading settings.
Modifying simultaneous multithreading with SMIT

The following path in SMIT will get you to the main simultaneous multithreading screen:
Performance & Resource Scheduling -> Simultaneous Multi-Threading Processor
Mode
The fastpath to this screen is smitty smt.
There are two options on this screen:
List SMT Mode Settings
Change SMT Mode
The Change SMT Mode screen gives the following options:
SMT Mode
- Options are:
enable
disable
SMT Change Effective:
- Options are:
Now and subsequent boots
Now
Only on subsequent boots

V5.4.0.3
Instructor Guide

Purpose Show how simultaneous multithreading can be turned on and off.
Details Simultaneous multithreading is a configuration option for AIX 6.1 and Linux
when running on POWER5 or above systems.
Transition statement The smtctl command has been modified to support POWER7
simultaneous multithreading.
Instructor Guide
Turning on or off simultaneous

multithreading (2 of 2)
Use the smtctl command or SMIT to set the number of simultaneous
threads per processor:
smtctl [ -t #SMT [ -w boot | now]]
ST, SMT2, SMT4 modes
To set simultaneous multithreading value to two dynamically (for now):

# smtctl t 2 -w now
smtctl: SMT is now enabled.
# bindprocessor -q
The available processors are: 0 1
To set simultaneous multithreading value to four dynamically (now and

reboot) on a system that supports four threads:
# smtctl t 4 (defaults to both)
smtctl: SMT is now enabled. It will persist across reboots if you
run the bosboot command before the next reboot
# bosboot -a
# bindprocessor -q
The available processors are: 0 1 2 3
A value of 1 disables simultaneous multithreading

Figure 2-6. Turning on or off simultaneous multithreading (2 of 2) AN313.1
Notes:
Starting with AIX6.1 TL4, the smtctl command has been enhanced to support POWER7
SMT2 and SMT4 modes.
The -t option of the smtctl command will set the number of the simultaneous threads per
processor. The value can be set to one to disable simultaneous multithreading. The value
can be set to two for systems that support 2-way simultaneous multithreading (POWER6)
and the value can be set to four, for the systems that support 4-way simultaneous
multithreading (POWER7). This option cannot be used with the -m flag.
To disable simultaneous multithreading you can perform:
# smtctl -t 1
or # smtctl -m off

V5.4.0.3
Instructor Guide

Purpose
Details
Transition statement Lets look at the traditional way to measure CPU utilization.
Instructor Guide
Traditional CPU utilization statistics

Data collection is sample-based.
100 samples per second, sorted into the following categories:
user
sys
iowait
idle
Each sample corresponds to a 10 ms clock tick.
Recorded in the kernel data structures: sysinfo and cpuinfo
Performance tools convert:
Tick counts from the sysinfo structure into utilization percentages for
the machine/partition (for example, vmstat, iostat, sar)
Tick counts from the cpuinfo structure into utilization percentages for a
processor/thread (for example, sar -P ALL, topas hot CPU section)
Figure 2-7. Traditional CPU utilization statistics AN313.1
Notes:
Traditionally, AIX processor utilization uses a sample-based approach to approximate the
percentage of processor time spent executing user programs, system code, waiting for disk
I/O, and idle time.
AIX produces 100 interrupts per second to take samples. At each interrupt, a local timer
tick (10 ms) is charged to the current running thread that is preempted by the timer
interrupt. One of the following utilization categories is chosen based on the state of the
interrupted thread:
user: Interrupted code outside AIX kernel
sys: Interrupted code inside AIX kernel and currently running thread is not waitproc
iowait: Currently running thread is waitproc and there is an I/O pending
idle: Currently running thread is waitproc and there is no I/O pending
If the thread was executing code in the kernel through a system call, the entire tick is
charged to the process system time. If the thread was executing application code, the
entire tick is charged to the process user time. Otherwise, if the current running thread was

V5.4.0.3
Instructor Guide
Uempty the operating systems idle process, the tick is charged in a separate variable. The problem
with this method is that the process receiving the tick most likely did not run for the entire
time period and happened to be executing when the timer expired.
Data structures
The processor utilization information is recorded in the sysinfo (system-wide) and cpuinfo
(per-processor) kernel data structures. These structures are documented in
/usr/include/sys/sysinfo.h. In order to preserve binary compatibility, this stays
unchanged with AIX 5L V5.3 or V6.1.
Performance tools such as vmstat, iostat or sar, convert tick counts from the sysinfo
structure into utilization percentages for a machine/partition. Other tools like
sar -P ALL and the topas hot CPU section, convert tick counts from the cpuinfo structure
into utilization percentages for a processor/thread.
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
POWER6: Processor Utilization Resource Register

Traditional utilization metrics are misleading:
They lead you to believe there are two physical processors.
Register used to measure the number of dispatch cycles for
each thread
Two PURR registers (one for each hardware thread)
Units are the same as the timebase register
Sum of the PURR values for both
Logical Logical
threads is equal to the timebase CPU0 CPU1
register.
Introduced in AIX 5
Thread0 Thread1
purr0 purr1
Physical CPU
timebase register
Figure 2-8. POWER6: Processor Utilization Resource Register AN313.1
Notes:
PURR
The Processor Utilization Resource Register (PURR) is a register, provided by the
POWER5, POWER6 and POWER7 processors, which is used to provide an actual count of
physical processing time units that a logical processor has used. All performance tools and
APIs utilize this PURR value to report CPU utilization metrics for Micro-Partitioning and
simultaneous multithreading systems. This register is a special purpose register that can
be read or written by the POWER Hypervisor but is read-only by the operating system
(supervisor mode). There are two registers, one for each hardware thread.
The PURR is used to approximate the time that a virtual or logical processor is actually
running on a physical processor. The register advances automatically so that the operating
system can always get the current, up-to-date value. The Hypervisor saves and restores
the register across virtual processor context switches.
Because there are many resources in the hardware, any one of which can be a bottleneck
that limits simultaneous multithreading gain, the use of the PURR is an approximation of
Instructor Guide
the time spent running. The execution time for a virtual processor can be calculated by
adding sibling thread PURRs.
Timebase register (TB)

The timebase register (TB) provides a long-period counter driven at 1/8 the processor clock
frequency. The timebase register is a 64-bit register and contains a 64-bit unsigned integer
that is incremented by one every eight processor clock cycles.
Incrementing the PURR

Like the timebase register, the PURR increments by one every eight processor clock cycles
when the processor is in single-threaded mode. When the processor is in simultaneous
multithreading mode, the thread that dispatches a group of instructions in a cycle
increments the counter by 1/8 in that cycle. If no group dispatch occurs in a given cycle,
both threads increment their PURR by 1/16. Over a period of time, the sum of the two
PURR registers when running in simultaneous multithreading mode, should be very close
to, but not greater than, the number of timebase ticks.
Linux PURR implementation

At the time of the writing of this course, some releases of Linux report PURR statistics,
which can be seen in the /proc/ppc64/lparcfg file. If you are not at a level that supports this,
there might still be a PURR line, but it could have a value of zero.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
POWER7: Processor Utilization Resource Register

Four PURR registers (one for each hardware thread)
Units are the same as the timebase register
Sum of the PURR values for four threads is equal to the
timebase register
Introduced in AIX 6.1 TL4
Logical Logical Logical Logical

CPU0 CPU1 CPU0 CPU1
Thread0 Thread1 Thread3 Thread4

purr0 purr1 purr3 purr4
Physical CPU
timebase register
Figure 2-9. POWER7: Processor Utilization Resource Register AN313.1
Notes:
POWER7 implements four PURR registers. One for each hardware thread.

V5.4.0.3
Instructor Guide

Purpose Point out four threads and four PURR on POWER7.
Details
Instructor Guide
Scaled Performance Utilization Resource Register

POWER6 enhancement
Similar to PURR introduced in AIX 5
Two SPURR registers (one for each hardware thread). Four on
POWER7
Processor use statistics scale with frequency or instruction
dispatch rate
Throttle number of CPU cycles needed to execute an instruction
Logical Logical
CPU0 CPU1
Thread0 Thread1
spurr0 spurr1
Physical CPU
timebase register
Figure 2-10. Scaled Performance Utilization Resource Register AN313.1
Notes:
The POWER6 and POWER7 Systems internal power management results in performance
variability. This implies as the power management implementation operates, it can change
the effective speed of the processor.
POWER6 and POWER7 processors contain an additional special-purpose register for
each hardware thread known as the Scaled Processor Utilization of Resources Register
(SPURR). The SPURR is used to compensate for the effects of performance variability in
the operating systems. The hypervisor virtualizes the SPURR for each hardware thread so
that each OS obtains accurate readings that reflect only the portion of the SPURR count
that is associated with its partition. Implementing virtualization for the SPURR is the same
as that for the PURR. Building on the functions provided by hypervisor, the operating
systems use SPURR to do the same type of accurate accounting that is available on
POWER5 processor-based machines. With the introduction of the EnergyScale
architecture for POWER6 processor-based machines, not all timebase ticks have the same
computational value; some represent more usable processor cycles than others. The
SPURR provides a scaled count of the number of timebase ticks assigned to a hardware

V5.4.0.3
Instructor Guide
Uempty thread, in which the scaling reflects the speed of the processor (taking into account
frequency changes and throttling) relative to its nominal speed.
The AIX tools reverted back to using PURR on POWER6 from 5.3 TL7 (SP9), 5.3 TL8
(SP7), 5.3 TL9 (SP4), and 5.3 TL10 (SP1). SPURR is used on Power6 from 6.1.0.0
through present.
The SPURR is supported on POWER6 and POWER7 processors. It is similar to the
Performance Utilization Resources Register (PURR), except that it scales as a function of
degree of processor throttling. If your hardware supports the SPURR, the processor use
statistics shown by the sar command are proportional to the frequency or the instruction
dispatch rate of the processor.
SPURR is used when Power Saver is activated.
System-wide tools have been modified for variable processor frequency. The EnergyScale
architecture might therefore affect some of the performance tools and metrics built with the
user-visible performance counters. Many of these counters count processor cycles, and
because the number of cycles per unit time varier, the values reported by unmodified
performance monitors are subject to some interpretation.
The lparstat command has been updated in AIX Version 6.1 to display new statistics if the
processor is not running at nominal speed. The %nsp metric shows the current average
processor speed as a percentage of nominal speed. This field is also displayed by the new
version of the mpstat command.
Instructor Guide
Instructor notes:
Purpose Describe the new PURR register for a simultaneous multithreading
environment.
Details Describe how the traditional sample-based utilization metrics are misleading if
used with simultaneous multithreading enabled. Show that there is a PURR for each logical
processor.
Additional information The decrementer (DEC) is a counter that is updated at the
same rate as the timebase register. It provides a means of signaling an interrupt after a
specified amount of time has elapsed unless the decrementer is altered by software in the
interim, or the frequency of the timebase update changes.
Transition statement How are CPU utilization metrics calculated in a simultaneous
multithreading environment?

V5.4.0.3
Instructor Guide
Uempty
CPU utilization
In a simultaneous multithreaded environment and or a
Micro-Partition, CPU utilization statistics:
Still collect 100 samples per second (for binary compatibility)
Collect additional state-based PURR-based metrics (in PURR
increments)
Utilization metrics:
Same categories are used: user, sys, iowait, and idle
Physical resource utilization metrics for a logical processor:
(delta PURR/delta TB) Represents the fraction of the physical processor
consumed by a logical processor
(delta PURR/delta TB)*100 over an interval represents the percentage of
dispatch cycles given to a logical processor
Delta PURR0 Delta PURR1
Delta timebase
Timebase
Figure 2-11. CPU utilization AN313.1
Notes:
Sample-based versus state-based

Each logical processor continues to collect 100 samples per second and stores the tick
information in the cpuinfo data structure. Starting in AIX 5L V5.3, the kernel collects metrics
that are state-based (rather than sample-based) and are stored in new structures.
State-based means we are collecting information of PURR increments rather than a set
time of 10 ms.
Utilization metrics
AIX uses the PURR for process accounting. Instead of charging the entire 10 ms clock tick
to the interrupted process as before, processes are charged based on the PURR delta for
the hardware thread since the last interval, which is an approximation of the computing
resource that the thread actually received. This results in a more accurate accounting of
processor time in the simultaneous multithreading environment.
Instructor Guide
At each interrupt:
The elapsed PURR is calculated for the current sample period.
This value is added to the appropriate utilization category, instead of the fixed-size
increment (10 ms) that was previously added.
The interval information is stored in the same four categories; user, sys, iowait, and idle.
There are metrics in AIX associated with simultaneous multithreading utilization. There are
two different ways to measure it: the threads processor time and the elapsed time. For the
first, the thread's PURR values are used and are now virtualized. To measure the elapsed
time, the timebase register (TB) is still used.
The physical resource utilization metrics for a logical processor are:
(delta PURR/delta TB) represents the fraction of the physical processor consumed by a
logical processor.
(delta PURR/delta TB)*100 over an interval represents the percentage of dispatch
cycles given to a logical processor.

V5.4.0.3
Instructor Guide

Purpose Explain why a new method to collect CPU utilization metrics is needed in a
simultaneous multithreading environment.
Details The old sample-based CPU utilization statistics are still kept, but they are
inaccurate when using simultaneous multithreading and shared processor partition.
Transition statement Lets look at an example that shows why we need a new way of
reporting CPU utilization statistics.
Instructor Guide
CPU utilization metrics

Physical CPU utilization metrics are calculated using PURR
statistics.
For example:
%sys = (delta PURR in system mode/delta PURR in all modes)*100
That is, if a logical processor runs for 4 ms, and it was in system mode for 1
ms, %sys would report 25.
%sys
Delta PURR0 Delta PURR1
Delta timebase
Timebase
Figure 2-12. CPU utilization metrics AN313.1
Notes:
Physical processor utilization metrics

The physical (or virtual) processor utilization metrics are calculated using PURR statistics.
The utilization metrics are user, system, iowait, and idle as seen in the output of commands
such as sar -P, lparstat, and topas.
For dedicated processor partitions, all of the physical processor utilization metrics will add
up to the total number of processors configured in the partition. Micro-Partitions give back
unused processing time, so utilization numbers are likely to add up to less than the full
entitlement.

V5.4.0.3
Instructor Guide

Purpose Describe how CPU utilization metrics are calculated in a simultaneous
multithreading or Micro-Partition environment.
Details This visual shows how the physical (or virtual) CPU metrics are calculated
when using PURR. The example on the visual is for system time, but it would be the same
for user, iowait, and idle.
Transition statement Lets look at some CPU statistical metrics for this environment.
Instructor Guide
Additional CPU utilization metrics

Physical Processor Consumed (PPC) = sum (delta
PURR/delta TB)
This is how much physical processor time was consumed for each
logical processor.
Dedicated partitions always show all processors consumed (because
even if idle, it is being consumed by the partition).
Micro-Partitions show actual portions of physical processors.
consumed because a virtual processor gives up its excess cycles.
Micro-Partitions only:
Percentage of entitlement consumed = (PPC/ENT)*100
Available physical processors = (delta PIC/delta TB)
PIC = Pool Idle Count (Delta PURR when no VPs are dispatched):
All partition entitlements satisfied
No partition to dispatch
Logical processor utilization:
Sum of %sys and %user
Figure 2-13. Additional CPU utilization metrics AN313.1
Notes:
The new registers used to track processor utilization also provide some new statistics.
Some statistics can only be viewed when the partition is using shared processors.
Physical Processor Consumed (PPC)

To find out how much physical processor is being consumed (PPC), use sum(delta
PURR/delta TB) for each logical processor in a partition. The result is a decimal number
representing the amount of physical processors consumed.
Percentage of entitlement consumed

The percentage of entitlement consumed is calculated as (PPC/ENT)*100. This is a
statistic that shows up only in a partition that uses shared processors. We will see this
statistic in example command outputs in the Micro-Partitioning unit of this course.

V5.4.0.3
Instructor Guide
Uempty Available physical processors

Another useful metric is the amount of available physical processors in the shared
processing pool. This is calculated as (delta PIC/delta TB).
The Pool Idle Count (PIC) represents the delta PURR of all shared processors when no
partitions were dispatched. This is when all partition entitlements are satisfied, and there is
no partition to dispatch. This results in a decimal number of processors. This is useful to
monitor to see if there are excess processing resources that could be utilized.
This is a statistic that shows up only in a partition that uses shared processors and that has
this statistic enabled in the partition profile. We will see this statistic in example command
outputs in the Micro-Partitioning unit of this course.
Logical processor utilization

Logical processor utilization helps you determine if simultaneous multithreading is
beneficial for the workload on the system or if your partition is CPU-bound. It is calculated
by summing the %sys and %user values that are stored in the sysinfo and cpuinfo data
structures. If all the logical processor numbers are very high, indicating the partition is
CPU-bound, then adding more physical processing power might be beneficial. If the logical
processor numbers are very unbalanced, such that one thread from each physical or virtual
processor is being utilized but the other is not, this could indicate that the workload is not
benefiting from simultaneous multithreading.
This metric shows up as lbusy in the lparstat command output in a Micro-Partition.
Instructor Guide
Instructor notes:
Purpose List the enhancements to the performance library API to support simultaneous
multithreading.
Details This visual shows some commands been updated for the new virtual
environment.
Transition statement Lets look at some of the performance monitoring commands
that had to be changed to support the simultaneous multithreading environment.

V5.4.0.3
Instructor Guide
Uempty
Commands supporting PURR

When simultaneous multithreading is enabled, vmstat, iostat,
topas, and sar:
Use the PURR-based data (%user, %sys, %wait, %idle).
Show the following columns:
Physical Processor Consumed (pc or physc) by the partition
Adds up to total number of processors for a dedicated partition
Percentage of Entitlement Consumed (pec or %entc) by the partition
(Micro-Partitions only).
Trace can optionally collect PURR register values at each
trace hook and trcrpt can display elapsed PURR.
The trace-based tools netpmon, pprof, gprof, curt, and splat
can optionally use PURR-based data.
Graphics tools like PTX 3dmon, PTX xmperf, and PTX
jtopas have also been updated to use PURR-based data.
Figure 2-14. Commands supporting PURR AN313.1
Notes:
When AIX is running in simultaneous multithreading mode or in a Micro-Partition,
commands that display CPU information, such as vmstat, iostat, topas, and sar, display
the PURR-based statistics rather than the traditional sample-based statistics.
In simultaneous multithreading mode, additional columns of information are displayed:
pc or physc - Physical Processor Consumed
pec or %entc - Percentage of Entitlement Consumed (Micro-Partitions only)
Simultaneous multithreading with trace and trace tools

Another tool that needed modification was trace/trcrpt and several other tools that are
based on the trace utility. The trace command can optionally collect PURR register values
at each trace hook and trcrpt can display elapsed PURR.
The following list shows the arguments to use for a simultaneous multithreading or
Micro-Partition environment:
Instructor Guide
trace -r PURR Collects the PURR register values. Only valid for a
trace run on a 64-bit kernel.
trcrpt -O PURR=[on|off] Tells trcrpt to show the PURR along with any
timestamps. The PURR is displayed following any
timestamps. If the PURR is not valid for the processor
traced, the elapsed time is shown instead of the PURR.
If the PURR is valid, or the CPUID is unknown, but
wasn't traced for a hook, the PURR field contains
asterisks (*).
netpmon -r PURR Uses the PURR time instead of timebase in percent
and CPU time calculation. Elapsed time calculations
are unaffected.
pprof -r PURR Uses the PURR time instead of timebase in percent
and CPU time calculation. Elapsed time calculations
are unaffected.
gprof New environment variable GPROF controls the gprof's
new mode that supports simultaneous multithreading.
curt -r PURR Uses the PURR register to calculate CPU times.
splat -p Specifies the use of the PURR register to calculate
CPU times. The output will show the message PURR
was used to calculate CPU times.

V5.4.0.3
Instructor Guide

Purpose List the changes to existing commands to support simultaneous
multithreading environments.
Details This visual provides an overview of the tools that have been enhanced in AIX
V5.3 or above to use PURR.
Point out the student notes where the various command flags for retrieving PURR
information are listed.
Additional information The trace command supports a list of registers (reglist; up to
8) as the parameter to the -r flag. Valid reglist values are:
PURR
The PURR register for this processor
SPURR
The SPURR register for this processor
MCR0, MCR1, MCRA - the MCR
Registers, 0, 1, and A
PMC1, PMC2, ... PMC8 - PMC
Registers 1 through 8.
Restriction: Not all registers are valid for all processors.
Transition statement Lets look at sar -P ALL output on a simultaneous
multithreading partition.
Instructor Guide
Using sar with simultaneous multithreading

# sar -P ALL 1 2
AIX rita82 3 5 00C35B804C00 04/27/10
System configuration: lcpu=2 mode=Capped
10:00:18 cpu %usr %sys %wio %idle physc

10:00:20 0 22 67 0 11 0.51
1 0 100 0 0 0.49
- 11 84 0 5 1.00
10:00:22 0 22 67 0 11 0.51
1 1 99 0 0 0.49
- 11 84 0 5 1.00
Average 0 22 67 0 11 0.51
1 0 100 0 0 0.49
- 11 84 0 5 1.00
Figure 2-15. Using sar with simultaneous multithreading AN313.1
Notes:
sar -P ALL [interval [number]]

The visual shows sar -P ALL output for a partition using dedicated processors with
simultaneous multithreading enabled. If the interval and iteration numbers are not used, the
statistics shown are since the last boot. Starting at AIX V5.3, this changed slightly.
The system configuration line shows the number of logical processors (lcpu).
The physical processor consumed physc column shows the relative simultaneous
multithreading split between logical processors, for example, shows the measurement of
fraction of time a logical processor was getting physical processor cycles.
On a dedicated partition, if there is nothing really running, then both logical CPUs will be
running the AIX wait thread (the idle loop), since the processor(s) are dedicated.

V5.4.0.3
Instructor Guide

Purpose Describe the simultaneous multithreading information displayed by the sar -P
ALL command.
Details Point out that the number of logical processors (lcpu) and the physical
processor consumed (physc) column.
For dedicated partitions, the physc column will always total the number of processors
allocated to the partition. Well see in the next unit that for shared partitions that cede their
idle cycles back to the shared processing pool, the physc column shows the actual work by
the processor(s) for this partition.
You can mention the lparstat command after describing the sar -P command. The lparstat
command output in a partition with dedicated processors shows the four physical CPU
utilization metrics like the output of the sar -P command: usr, sys, wio, and idle.
Interestingly, lparstat does not show physc in a dedicated processor partition, but sar
-P does. To see physc with lparstat, the partition must have shared processors. We will
see the lparstat command output in the next unit.
Transition statement Lets look at the mpstat -s command running on POWER6
processor.
Instructor Guide
Using mpstat: POWER6

# mpstat s 1 1
Proc0
cpu0 cpu1
65.23% 32.18%
# smtctl
This system is SMT capable.
SMT is currently enabled.
SMT boot mode is set to enabled.
SMT threads are bound to the same physical processor.
proc0 has 2 SMT threads.
Bind processor 0 is bound with proc0
# lsattr -El proc0

frequency 3504000000 Processor Speed False
smt_enabled true Processor SMT enabled False
smt_threads 2 Processor SMT threads False
state enable Processor state False
type PowerPC_POWER6 Processor type False
Figure 2-16. Using mpstat: POWER6 AN313.1
Notes:
The mpstat command collects and displays performance statistics for all logical CPUs in
the system.
If simultaneous multithreading is enabled, the mpstat -s command displays logical
processors usage as shown in the visual above. In the example, logical processor cpu0 is
65.23% busy and logical processor cpu1 is 32.18%. cpu0 and cpu1 are hardware threads
for proc0.
Because this example output is from a dedicated partition, the logical processors could
simply be running the AIX wait thread and not be doing any real work. The two logical
processor utilization metrics always add up to a whole processor. Well see later in this unit,
how idle cycles are ceded back to the shared processing pool if the partition was using
shared processors. With share processor LPARs, the mpstat output would show the time
that the logical processors were busy doing real work.

V5.4.0.3
Instructor Guide

Purpose Describe the simultaneous multithreading information displayed by the
mpstat -s, smtctl, and lsattr commands.
Details Point out the significant information in the output of these commands:
mpstat shows the number of logical processors and the split of work between them.
You can use this output to see if the workload in the partition is taking advantage of
multiple threads.
smtctl shows that SMT is enabled or disabled, and if the word physical was replaced
by virtual in the output, then the partition would be using shared processors rather
than dedicated.
lsattr shows that smt_enabled is true which means that it is enabled, and it also
shows the number of threads. Point out that the fourth column in the output, where it
shows False for everything simply means that these attributes cannot be changed.
Transition statement Lets see the mpstat command running on a simultaneous
multithreading POWER7 system.
Instructor Guide
Using mpstat: POWER7

# mpstat s 1 1
Proc0
cpu0 cpu1 cpu2 cpu3
35.23% 22.18% 25.23% 17.18%
# smtctl
This system is SMT capable.
SMT is currently enabled.
SMT boot mode is set to enabled.
SMT threads are bound to the same physical processor.
proc0 has 4 SMT threads.
# lsattr -El proc0

frequency 3000000000 Processor Speed False
smt_enabled true Processor SMT enabled False
smt_threads 4 Processor SMT threads False
state enable Processor state False
type PowerPC_POWER7 Processor type False
Figure 2-17. Using mpstat: POWER7 AN313.1
Notes:
On POWER7 processor-based systems, If simultaneous multithreading is enabled, the
mpstat -s command displays four logical processors usage as shown in the visual
above. In the example, logical processor cpu0 is 35.23% busy, logical processor cpu1 is
22.18%, logical processor cpu2 is 25.23% busy and logical processor cpu3 is 17.18%
busy. cpu0, cpu1, cpu2 and cpu3 are hardware threads for proc0.

V5.4.0.3
Instructor Guide

Purpose
Details
Transition statement Lets see the topas command running on a simultaneous
multithreading system.
Instructor Guide
topas: Example screens

Run topas and press L.
Interval: 2 Logical Partition: Tue Apr 4 07:16:13 2010
Dedicated SMT ON Online Memory: 1024.0
Partition CPU Utilization Online Virtual CPUs: 2 Online Logical CPUs: 4
%user %sys %wait %idle %hypv hcalls
50 0 0 50 50.0 270
===============================================================================
LCPU minpf majpf intr csw icsw runq lpa scalls usr sys _wt idl pc
Cpu0 0 0 360 96 50 1 100 146 14 0 0 86 0.69
Cpu1 0 0 20 0 0 0 0 0 0 0 0 100 0.32
Cpu2 0 0 211 21 12 1 99 20 94 0 0 6 0.97
Cpu3 0 0 20 0 0 0 0 0 0 0 0 100 0.03
Topas Monitor for host: sady102

Tue Apr 27 17:05:31 2010 Interval: 2 Run topas and
CPU User% Kern% Wait% Idle% press c twice.
2 87.4 0.0 0.0 12.5
0 25.0 0.2 0.0 74.8
1 0.0 0.0 0.0 100.0
3 0.0 0.0 0.0 100.0
Figure 2-18. topas: Example screens AN313.1
Notes:
The topas output shows statistics by logical processor. The metrics have been applied so
that processor utilization is calculated using the PURR-based register and formula when
running in simultaneous multithreading (or with shared processors).
The visual shows output from a system with two dedicated processors and simultaneous
multithreading enabled, which is why we see four logical processors.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
POWER6 utilization view: SMT

One busy thread makes the core appear fully utilized.
Available capacity of idle SMT thread in
spload SMT thread gets 100% PURR
SMT mode is not apparent. charge, Idle thread gets 0%.
Two-core POWER6, one spload thread

consumes 50% of two CPUs.
Figure 2-19. POWER6 utilization view: SMT AN313.1
Notes:
This slide shows the traditional POWER6 statistics behavior. When one process is running
with SMT enabled, the processor appears 100% busy. Looking at the mpstat -s output,
you can notice the logical processor cpu0 100% busy and the logical processor cpu1 idle.
The topas output shows the logical partition 50% busy because we have two processors
in the logical partition.

V5.4.0.3
Instructor Guide

Purpose Explain the traditional POWER6 statistics behavior, one thread make the
processor appears 100% busy.
Details
Instructor Guide
POWER7 utilization view: SMT4

One busy thread does not make the core appear fully utilized.
Weighting is added to idle SMT threads.
Shows that the core has more capacity to do work
spload SMT thread gets

64% PURR charge, Idle
threads gets 12% each.
Two-core POWER7, one spload thread

consumes 32% of two CPUs.
Figure 2-20. POWER7 utilization view: SMT4 AN313.1
Notes:
One of the new behaviors with POWER7 processor-based systems, is to provide a better
representation of the core capacity in the utilization metrics. The slide shows a topas
command output and an mpstat -s command output on a two cores system when in
SMT4 mode with only one process (spload program running a single thread) running.
Compared to POWER6 processor-based systems, you can notice in the mpstat -s
command output, that when a single thread is running on a POWER7 processor, the
statistics do not reflect the logical processor 100% percent busy but about 63 % instead.
Looking at the topas output, you can notice the logical partition is 32 % busy.
When MST-4 is enabled on POWER7 processor-based systems, the statistics reflect that
the capacity of the core is really more than what one thread can consume. So the idle
threads appear to be taxed more. The statistics of the different logical processors are
weighted if there are idle threads in the core. If the SMT mode is explicitly changed
(through smtctl), the weighting changes, since the potential capacity of the core changes.
While a single thread can get equivalent of single-thread performance mode (provided the
other 3 threads are idle), the weights reflect that there is still headroom in the core.

V5.4.0.3
Instructor Guide

Purpose Give an overview of the SMT-4 utilization metrics.
Details
Instructor Guide
POWER7 utilization view:

Dynamic SMT scheduling
Dynamic runtime SMT scheduling
Left shift work on 10ms intervals (Thread0 = Single Thread, Thread0/Thread1 =
SMT2, Thread0/Thread1/Thread2/Thread3 = SMT4
Four spload running on two POWER7 cores Four spload SMT threads,
two on each core..shifted
leftfor dynamic SMT2
Two-core POWER7, four spload threads

consume 96% of two CPUs.
Figure 2-21. POWER7 utilization view: Dynamic SMT scheduling AN313.1
Notes:
Here is a example of the dynamic SMT scheduling on POWER7. The slide shows four
processes running on a two core logical partition. SMT-4 is enabled and because only four
processes are running, the system switched automatically to SMT-2 by left shifting the work
to the two first logical processors of each virtual processor.

V5.4.0.3
Instructor Guide

Purpose Show the dynamic SMT-2 behavior.
Details
Instructor Guide

V5.4.0.3
Instructor Guide
Uempty 2.1 Shared processor LPAR considerations
Instructor Guide
Instructor notes:
Purpose Introduce section discussing SPLAR considerations.
Details In the following section, we will discuss the Micro-Partitioning or Shared
Processor LPAR (SPLPARs) considerations. First, we will briefly discuss the benefits of the
dedicated processor LPAR. We will then contrast this with the benefits of SPLPARs.

V5.4.0.3
Instructor Guide
Uempty
Dedicated processors
Performance benefits:
LPAR
Processor and memory affinity Dedicated
utilized for best performance
Whole
processors
Performance considerations: allocated to a
Unused capacity lost partition
Less granular units of

allocation
Physical processors
When partition is
stopped, dedicated
processors might (or
might not) go to shared
pool.
Figure 2-22. Dedicated processors AN313.1
Notes:
Dedicated processors were used exclusively on the POWER4 processor-based
LPAR-capable systems and it is one of the configuration options on the System p5 and
Sserver p5 platforms.
Dedicated processors are whole physical processors exclusively allocated to a particular
partition. When the partition is shut down, the processors can return to the shared
processing pool. When the dedicated processor partition starts again, it is allocated
dedicated processors, although the actual physical processors might be different than the
last time it was activated.
A checkbox in the partition profile indicates whether idle processors are returned to the
shared pool. The box is labelled Allow idle processors to be shared and if checked,
when the partition shuts down, its processors become part of the shared pool.
Instructor Guide
Processor affinity
There is performance overhead involved when jobs switch processors because of latency
due to the context switches and cache misses. With dedicated partitions (using dedicated
processors), because the partition uses the same physical processors, there is less
potential for this latency and for cache misses than there is with a shared processor
partition utilizing processors out of a shared processing pool. This is a function not of using
shared processors versus dedicated processors, but a function of the increased processor
utilization.
Processor affinity refers to how the overhead of processor context switches is reduced as
much as possible by scheduling work on the same processor if that processor is available.
Memory affinity
When processors are allocated to a partition, an attempt is made to allocate physical
memory that is local to the processors. (Local memory is physical memory that is on the
same node, for example, MCM or DCM, as the processors).
Unused capacity lost

Because dedicated processors are whole processors allocated exclusively to a partition, if
that partition does not utilize all of the processing power of its processor(s), then that
capacity will be unused.

V5.4.0.3
Instructor Guide

Purpose Describe the performance considerations of dedicated processors.
Details Students should be familiar with dedicated processors.
The term dedicated is used to differentiate these type of processors with the next topic,
shared processors.
Focus on the performance aspects of using dedicated processors as listed on the visual.
The processors are allocated whole; however, there will probably be unused capacity, and
other partitions cannot use this excess capacity. Processors and local memory, where
possible, are allocated to make best use of affinity.
Dynamic LPAR operations might or might not increase the overhead of cache misses.
Moving processor resources causes the cache to be lost, but if the cache hit rate is good,
then the cache is quickly re-established, and the effect on a commercial load should not be
able to be measured. Moving memory should not cause an issue either, unless the system
is paging.
Transition statement Now lets look at a feature introduced with the POWER5, shared
processors.
Instructor Guide
Shared processors (1 of 2)
Processor capacity assigned in processing units from the shared
processor pool
Partitions guaranteed amount is its entitled capacity (EC)
Performance benefits:
Excess processing capacity can be used by other partitions
Configuration flexibility
Performance considerations:
Context switches and cache misses
Entitled LPAR1 LPAR2

capacity in A partition uses either
processing
EC=2.4 EC=1.75
shared or dedicated
units processors.
This cannot be
Shared processor pool switched dynamically.
Figure 2-23. Shared processors (1 of 2) AN313.1
Notes:
Shared processors are physical processors which are allocated to partitions on a timeslice
basis. Any physical processor in the shared processor pool can be used to meet the
execution needs of any partition using the shared processor pool.
There can be a mix of shared and dedicated partitions on the same managed system. A
partition uses shared or dedicated processors, and you cannot use dynamic LPAR
commands to change between the two. You need to bring down the partition and switch it
from using dedicated to shared, or vice versa, by using a different partition profile or
altering the existing one.
Processing units
When a partition is configured, you assign it an amount of processing units. A partition
must have a minimum of one tenth of a processor and after that requirement has been met,
you can configure processing units at the granularity of one hundredth of a processor.

V5.4.0.3
Instructor Guide
Uempty Benefits of using shared processors

Some benefits of using shared processors are:
The processing power from a number of physical processors can be utilized
simultaneously, which can increase performance for multiple partitions (but, as well see
later, it could hurt performance)
Processing power can be allocated in sub-processor units in as little as one hundredth
of a processor
Uncapped partitions can be used which allows partitions to utilize excess processing
power not being used by other partitions
Partitions which do not need the entire capacity of a powerful processor can be
configured with as little as ten percent of a processor, and the excess capacity can be
used by other partitions
Disadvantages of using shared processors

A disadvantage of using shared processors is that because multiple partitions use the
same physical processors, there is overhead due to context switches on the processors. A
context switch is when a process or thread is running on a processor, it is interrupted (or
finishes), and a different process or thread runs on that processor. The overhead is in the
copying of each jobs data from memory into the processor cache. This overhead is normal
and even happens at the operating system level within a partition. There is added context
switch overhead, however, when the Hypervisor timeslices between partitions. When a
managed system is configured correctly, so as not to cause excessive context switching,
this overhead is typically in the 7-10% range. This overhead might affect partitions running
CPU-intensive workloads.
Micro-Partitions
The term Micro-Partition is used for partitions that take advantage of shared processing. A
system can be configured with many Micro-Partitions each running independently. The I/O
needs for many small Micro-Partitions can be supported by another partition called a
Virtual I/O server. This concept is covered later in this course.
Shared Processor Logical Partition (SPLPAR)

A partition using shared processors is often called simply a shared partition. You might also
see the acronym SPLPAR for Shared Processor Logical Partition, and it means the same
thing as a shared partition. A dedicated partition is one that uses dedicated processors.
Instructor Guide
Instructor notes:
Purpose This visual introduces shared processors.
Details Describe the concept of a shared processor. The discussion of how the shared
processor pool works is coming up in the next few visuals.
The term Micro-Partition can be used for any partition using shared processors because
you can use sub-processor allocations. Micro-Partitioning also means using virtual
processors, which is covered in a few pages.
Mention the acronym SPLPAR, because its used in some documentation.
Define the term entitled capacity.
Point out the Benefits and Disadvantage of using shared processors sections in the student
notes. In the next few pages of this unit, well be looking at the concepts brought up in the
Disadvantage... section, so just use this as a hint of things to come.
Remind students that there is just one shared processing pool and that the sum of all the
entitled capacity for running shared processor partitions cannot exceed the physical
processor capacity of the pool.
Transition statement Lets look at how shared processing units map to processing
time.

V5.4.0.3
Instructor Guide
Uempty
Shared processors (2 of 2)
Each partition is configured with a percentage of execution dispatch
time for each 10 ms timeslice (dispatch window).
Examples:
A partition with 0.2 processing units is entitled to 20% capacity during each
timeslice.
A partition with 1.8 processing units is entitled to 18 ms of processing time for
each 10 ms timeslice (using multiple processors).
The Hypervisor dispatches excess idle time back to pool.
Processor affinity algorithm takes into account hot cache.
0.2 = 20% of timeslice = 1.8 = 180% of timeslice =

2 ms per timeslice 18 ms per 10 ms timeslice
10 ms 20 ms 10 ms 20 ms
10 ms 20 ms
Figure 2-24. Shared processors (2 of 2) AN313.1
Notes:
10 ms timeslice (also known as a dispatch window)

Its important to think about execution capacity in terms of 10 millisecond (ms) timeslices. A
partition uses the timeslice, or a portion of the timeslice, based on the allocated processing
units. For example, 0.5 processing units ensures that for every 10 ms timeslice, that
partition will receive up to five ms of processor time. Up to is an important concept,
because the partition might not need the entire five ms timeslice because of waits,
interrupts, or lack of processing need, but the partition is guaranteed up to its five ms of
processing time.
Partitions with greater than 1.0 processing units

If a partition has more than 1.0 processing units, then it must utilize multiple physical
processors. For example, a partition with 1.8 processing units might, for its 10 ms timeslice,
utilize one processor for its entire 10 ms timeslice and another processor for 80% of its 10
ms timeslice. However, it is more likely to utilize multiple processors for parts of the 10 ms
Instructor Guide
timeslice, which add up to the equivalent of 1.8 of a processor. Another way to say this is
that 1.8 processing units is 18 ms of processing time that happens on multiple processors
during a 10 ms clock time period.
Excess idle time

There is no accumulation of unused cycles. If a partition doesnt use its entire entitled
processing capacity, the excess processing time is ceded back to the shared processing
pool. Later in this unit we cover uncapped partitions, which can use this excess processing
time which is over and above their configured entitled processing capacity.
Processor affinity
Hot cache refers to cache which still has data relevant to a current running process. If a
process is interrupted and another runs on that physical processor, and then the original
thread is ready to run again, its data might still be in the cache. If the time threshold has not
been reached, the original process will attempt to run on the same physical processor. This
is called processor affinity.

V5.4.0.3
Instructor Guide

Purpose Describe the timeslice aspect of shared processing.
Details Describe that you are not literally allocating a 0.01 piece of a physical processor
to a partition. The Hypervisor uses time slicing to implement the sub-processor allocation.
Describe how to convert from a decimal-based processing unit value to the 10 ms
timeslice. That is, if a processor has 0.2 processing units, it will use 20% of a 10 ms
timeslice.
A running partition might get interrupted (for example, blocked for I/O) and not use its entire
processing capacity. Any excess idle time returns to the shared processing pool.
Processor affinity refers to the tendency of processes returning to the same physical
processor if it returns within a certain amount of time. This increases the chance that its
data is still in the cache.
The student notes about processor affinity are simplified. There are a series of algorithms
in place to attempt to optimize performance.
Additional information Processor affinity: There is processor affinity happening within
the operating system within a partition and at the lower Hypervisor level between partitions.
These are separate but similar functions.
Transition statement Lets look at the shared processor pool in more detail.
Instructor Guide

This example shows one 10 ms timeslice, seven running partitions, and
four processors.
A partition can run on multiple processors depending on interrupts and its entitled
capacity.
Physical
processors
P P P P Partition 1
10 ms
Partition 2
Partition 3
Partition 4
Partition 5
Partition 6
Partition 7
Figure 2-25. Shared processor pool AN313.1
Notes:
This visual shows seven partitions running on four shared processors. Partition 1 is circled
in the example to show that within a single 10 ms timeslice it ran on two processors
simultaneously, then interrupted, and returned on a third processor. Any excess processing
time is returned to the shared processing pool.
Entitled capacity or execution entitlement

The entitled capacity or execution entitlement is the guaranteed number of processing units
per timeslice. Both terms are used for this amount. The performance monitoring tools tend
to use the term entitled capacity.
The entitled capacity would be the processor desired amount if there are enough
processing resources when the partition is activated, and there have not been any dynamic
operations to change this amount.

V5.4.0.3
Instructor Guide

Purpose Describe the shared processing pool.
Details Describe how multiple partitions can utilize processing cycles from multiple
shared processors.
Point out how Partition 1 starts out on two processors, then gets interrupted (or cedes the
processor), and then runs again in the same timeslice on a different processor.
Ask the students how many processing units Partition 1 has. The answer is that the visual
shows that the partition has been allocated around 0.7 or 0.8 processing units within the
timeslice that is shown. The partition might be configured as having an entitled capacity of
0.8, and it used that capacity during the timeslice. (Alternatively, the partition can be
configured with a smaller EC, but be uncapped, so it can consume more.) Another scenario
is that the partition might be configured with more than 0.8 EC but might have ceded idle
cycles back to the shared pool.
This page is a good lead-in for the concept of virtual processors, which is explained on the
next page. The visual on this page shows that Partition 1 must be configured with at least
two virtual processors, because we see that in one timeslice, it ran on two different physical
processors at the same instant.
Transition statement Notice that Partition 1 ran on two physical processors
simultaneously. The next few pages describe how to configure the number of physical
processors a partition thinks it has.
Instructor Guide
Virtual processors (1 of 3)
Virtual processors are used to tell the operating system how many
whole physical processors it thinks it has.
Operating system in LPAR2 does not see 1.75 processing units; it sees the
configured virtual processors.
In this example, each partition sees four processors.
LPAR1 LPAR2
Entitled
capacity in
EC=2.4 EC=1.75
processing Virtual
units processors
Physical
processors
Figure 2-26. Virtual processors (1 of 3) AN313.1
Notes:
The virtual processor setting defines the way that a partitions entitlement can be spread
concurrently over physical processors. That is, you can think of the processing power
available to the operating system on the partition as being spread equally across these
virtual processors. The number of virtual processors is what the operating system thinks it
has for physical processors. The Hypervisor dispatches virtual processors onto physical
processors.
The example in the visual above shows six physical processors in the shared pool, and
each partition thinks it has four processors.
The number of virtual processors can be configured independently for each shared
partition. The number of virtual processors can be changed dynamically.
Micro-Partitioning redefined
Previously, this course defined a Micro-Partition as a partition that takes advantage of
shared processors and sub-processor increments.

V5.4.0.3
Instructor Guide
Uempty Micro-Partitioning is that it also defined as the mapping of virtual processors to physical
processors. A partition is configured with a number of virtual processors, which are then
dispatched on the physical processors in the shared pool.
In Micro-Partitioning there is no fixed relationship between virtual processors and physical
processors. The POWER Hypervisor can use any physical processor in the shared
processor pool when it dispatches a virtual processor.
Instructor Guide
Instructor notes:
Purpose Describe what a virtual processor is.
Details Stay at the concept level on this visual. The next visual discusses the minimum
and maximum configuration options for virtual processors.
It might be a good idea at this point to turn back to the previous visual titled Figure 3-7
Shared Processor Pool. This visual illustrates an example Partition 1 running threads
simultaneously on two physical processors. If Partition 1s virtual processor number was
increased to four, then you might see Partition 1 running on all four processors at the same
time.
Transition statement The next visual discusses minimum and maximum settings for
virtual processors.

V5.4.0.3
Instructor Guide
Uempty
By default, for each 1.00 of a processor or part thereof, a
virtual processor will be allocated.
Example: 3.6 processing units would have four virtual processors.
Up to 10 virtual processors can be assigned per processing
unit.
Example: 3.6 processing units can have up to 36 virtual processors.
Number of virtual processors does not change the entitled
capacity.
Both entitled capacity and number of virtual processors can be
changed dynamically for tuning.
Maximum virtual processors per partition is 64
Example:
Partition with 4.2 entitled capacity
Minimum virtual processors = ______
Maximum virtual processors = ______
Notes:
The virtual processor setting does not change the total number of guaranteed processing
units (entitled capacity). For example, a partition with 1.5 capped processing units will still
have only 15 ms of processing time, whether that is on two physical processors or four.
With four virtual processors, the partition might consume its entitled capacity in a shorter
period than on two virtual processors.
If the partition with 1.5 processing units was uncapped, then with four virtual processors it
could consume as much as 40 ms per timeslice if there was sufficient spare capacity in the
shared processor pool. If the partition had two virtual processors, then it would only be able
to consume at most 20 ms per timeslice, even if there is more unused capacity. You want to
be sure to have enough virtual processors configured to take full advantage of all the
physical processors that can be used.
Instructor Guide
Minimum virtual processors

The minimum number of virtual processors is that for every equivalent of a whole or part of
a processor, you must have at least one virtual processor. The minimum is the default
when creating partitions.
Maximum virtual processors

The maximum number of virtual processors is ten times the amount of processing units,
with an upper limit of 64. For example, a partition with 1.8 processing units cannot have
more than 18 virtual processors.
You cannot allocate less than one ms of processing capacity per virtual processor. For
example, you cannot create a partition with 0.5 processing units and six virtual processors,
because this would result in trying to allocate less than one ms of processing time to each
of the six virtual processors.

V5.4.0.3
Instructor Guide

Purpose Emphasize that the number of virtual processors does not change the amount
of processing units.
Details Use this page to make sure students understand the difference between virtual
processors and the entitled capacity of a partition (that is, its assigned processing units).
The virtual processors setting allows you to affect the performance of a partition by allowing
it to run on multiple physical processors simultaneously. For example, if a partition has just
one virtual processor (so processing units are set to 1.0 or less), then that partition can only
run on one physical processor at a time.
There is a fill-in-the-blank activity on the visual.
For a partition with 4.2 EC (entitled capacity), what is the minimum number of virtual
processors? Answer: Five
What is the maximum number of virtual processors? Answer: 42 (Note: Often people
mistakenly multiply the minimum by 10 and answer 50, rather than taking the actual
processing unit number and multiplying by 10.)
Transition statement Lets see how entitled capacity is divided between virtual
processors.
Instructor Guide
Example:
Partition has 1.5 processing units (EC).
For each 10 ms timeslice, it is entitled to 15 ms of processing time.
Entitled capacity is distributed between all of the virtual processors.
Entitled
capacity
LPAR1 LPAR2
EC=1.5 EC=1.5
Virtual
processors
15 ms split on up to 15 ms split on up to
two physical four physical
processors processors
Notes:
The visual above shows how two partitions, each with 15 ms of processing time, divide up
the work between the virtual processors. In both cases, the work to be done in the partition
is split evenly between the virtual processors. So LPAR1 might have 7.5 ms on each virtual
processor, and LPAR2 might have 3.75 ms on each of the four virtual processors. This is
simplified, because of reality, a partition might not use its entire allotment of processing
time due to blocking for I/O, and so on.
Depending on the entitled capacity, a partition can be configured with up to 64 virtual
processors. By increasing the number of virtual processors, you decrease the amount of
processing time assigned to each virtual processor. The Hypervisor dispatches virtual
processors onto physical processors. If there are more virtual processors in a partition than
you have physical processors in the shared processing pool, then multiple virtual
processors will run on one physical processor. This causes context switches that incur a
performance cost.

V5.4.0.3
Instructor Guide

Purpose Describe the configuration limits for virtual processors.
Details Discuss the minimum and maximum settings when configuring virtual
processors.
Transition statement Lets look at how virtual processors are dispatched on physical
processors.
Instructor Guide
Shared processor dispatch latency

Virtual processors are dispatched onto physical processors.
Timeslicing and interrupts cause virtual processors to switch states.
Virtual processors return to same or different physical CPU.
Physical
CPUs
0 VP 1* VP VP Idle Idle VP 0*** VP 1* Idle
2*** 1*
1 VP 0** VP 0* VP 0***VP 1*** VP VP 0* VP 1*** VP 0**
Dispatch wheel pass 1 2*** Dispatch wheel pass 2
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
Time in milliseconds
LPAR1* LPAR2** LPAR3***

EC = 0.8 EC = 0.2 EC = 0.6
VP = 2 VP = 1 VP = 3
Figure 2-29. Shared processor dispatch latency AN313.1
Notes:
Virtual processors are dispatched in a time-sliced manner onto physical processors under
the control of the POWER Hypervisor, much like an operating system timeslices software
threads. Virtual processors have dispatch latency, because they are scheduled. When a
virtual processor is made runnable, it is placed on a run queue by the POWER Hypervisor,
where it sits until it is dispatched. The time between these two events is referred to as
dispatch latency.
Notice these scheduling points from the graphic in the visual above:
LPAR1s work is evenly divided over the physical CPUs. It is entitled to 80% of a
timeslice, and the workload is 40% on each physical processor.
The same virtual processor can be re-dispatched in the same dispatch wheel pass. In
the visual above, VP1 of LPAR1 is dispatched twice on CPU0 in the first dispatch wheel
pass.
LPAR2 has just one virtual processor, and it is dispatched on the same CPU (processor
affinity).

V5.4.0.3
Instructor Guide
Uempty LPAR3 has three virtual processors and runs on two physical processors. Notice that
the virtual processors context switch three times within six ms on physical CPU 1.
Dispatch wheel
The POWER Hypervisor uses the metaphor of a dispatch wheel with a fixed timeslice of 10
milliseconds (1/100 of a second) to guarantee that each virtual processor receives its share
of the entitlement in a timely fashion. When a partition is completely busy, the partition
entitlement is evenly distributed among its virtual processors.
Hypervisor decrementer facility

The most important extension in the POWER5, POWER6 and POWER7 chip made for
virtualization is the Hypervisor decrementer (HDECR) facility. This provides the Hypervisor
with a guaranteed timer interrupt regardless of the partition execution state. Unlike the
regular decrementer used by the partition for timer interrupts, the HDECR interrupt is
routed directly to the Hypervisor and uses only Hypervisor resources to capture the state of
the partition. The HDECR is used for fine-grained dispatching of multiple partitions on
shared processors, as well as for cycle stealing for special Hypervisor dispatches.
Instructor Guide
Instructor notes:
Purpose Illustrate how partitions share processors. Define what is meant by dispatch
latency and what causes it.
Details Show how the three partitions in the visual share two physical processors. To
do this, walk through all three partitions and how their virtual processors are dispatched
onto the physical processors. Do not describe the concept of uncapped yet, but students
might ask about using the idle capacity. If the students have black and white printouts of
this visual, the asterisks are there to show which boxes belong to which partitions.
Describe the concept of a dispatch wheel, and how the POWER Hypervisor uses this
wheel to ensure each partition can utilize its EC within the bounds of each 10 ms timeslice.
The point to emphasize on this visual is that as the number of virtual processors configured
for a partition increases, this will cause additional context switching. The Hypervisor
attempts to put virtual processors back on the physical CPU where it just ran, but
sometimes this is not possible, causing a cache miss. So, it is important to have just the
right amount of virtual processors, and not too many. Well be discussing this more in this
unit. This is the concept portion of this topic; well get to the performance management
portion in several pages.
Additional information One thing we dont show on any of these virtual processor
dispatch diagrams is that the Hypervisor uses Hypervisor decrementer interrupts to gain
control of a processor in order to dispatch a virtual processor of its hidden partition to
perform Hypervisor work. The Hypervisor has a layer of code (a small operating system)
that runs in a hidden partition and does not have any entitled capacity assigned to it. The
operating system in a partition is optimized to let the Hypervisor know when it has cycles
that the Hypervisor can use.
Transition statement Lets look in more detail at the concept of shared processor
affinity.

V5.4.0.3
Instructor Guide
Uempty
Shared processor affinity

Affinity:
When a waiting virtual processor is made ready-to-run, the hypervisor
tries to dispatch on the same core, the same chip, and the same
MCM.
If none is found, it dispatches on any waiting processor
Home node:
A virtual processor is assigned a home node.
MCM/DCM where most of the memory comes from
Virtual processor migrates back to home node whenever it has no
affinity left.
Figure 2-30. Shared processor affinity AN313.1
Notes:
Processor affinity policy

The POWER Hypervisor attempts to dispatch work in a way that maximizes processor
affinity. When the Hypervisor is dispatching a virtual processor, it first attempts to use the
same physical processor core on which the virtual processor previously ran. Otherwise, it
would be dispatched to the first available processor in the following order: the same chip,
the same MCM, or the same node.
When a physical processor becomes idle, priority is given to virtual processors in the
following order:
1. Virtual processors with affinity for a processor
2. Virtual processors with no affinity for any processor
3. Virtual processors that are uncapped
Instructor Guide
Instructor notes:
Purpose Describe how shared processor affinity works, and what a home node is.
Details This visual introduces more detail about shared processor affinity and how it
works.
Transition statement The next page describes scheduling affinity domains.

V5.4.0.3
Instructor Guide
Uempty
Scheduling affinity domains

Scheduling affinity domains:
Memory and processors that are directly connected:
A processor can access local memory faster (lower latency).
Dedicated partitions utilize memory affinity.
On POWER6, shared partitions are allocated memory in round-robin fashion:
Applications that would benefit from memory affinity should be in partitions that use
dedicated processors.
On POWER7, CPU/memory affinity is provided in shared partitions.
mpstat d displays affinity and migration statistics:
# mpstat d 2
System configuration: lcpu=4 ent=0.8
cpu cs ics bound rq push S3pull S3grd S0rd S1rd S2rd S3rd S4rd S5rd ilcs vlcs
0 697 619 0 0 0 0 0 100.0 0.0 0.0 0.0 0.0 0.0 0 954
1 0 0 0 0 0 0 0 - - - - - - 0 811
2 0 0 0 0 0 0 0 - - - - - - 0 21
3 0 0 0 0 0 0 0 - - - - - - 0 21
ALL 697 619 0 0 0 0 0 100.0 0.0 0.0 0.0 0.0 0.0 0 1807
ilcs = Involuntary logical vlcs = Voluntary logical context

context switches (timeslicing) switches (all other switches)
Figure 2-31. Scheduling affinity domains AN313.1
Notes:
The different scheduling affinity domains represent the different levels of affinity. As
previously stated, the POWER Hypervisor always tries first to dispatch the virtual processor
onto the same physical processor that it last ran on and, depending on resource utilization,
will broaden its search to the other processor on the POWER5 or POWER6 chip, then to
another chip on the same MCM, and then to a chip on another MCM.
POWER7 enhanced affinity

POWER7 starting with AIX6.1 TL5 implements an enhanced affinity. Compared to
POWER6, POWER7 has a three tier memory hierarchy (Local Memory, Near Memory and
Far memory). The enhanced CPU affinity is provided on POWER7 shared processors
logical partitions.
Instructor Guide
mpstat -d
Starting in IBM AIX 5L V5.3, the mpstat command using the -d flag displays detailed
affinity and migration statistics for AIX threads and dispatching statistics for logical
processors.
The mpstat -d command shows statistics since system boot. Use an interval value to
obtain periodic statistics. Large numbers of involuntary switches in a small interval could
mean that there are too many virtual processors.
Logical context switches

The two columns at the right side of the mpstat -d output only appear for partitions using
shared processors.
Involuntary logical context switches (ilcs) for a virtual processor means that it was
interrupted because its timeslice is over.
Voluntary logical context switches (vlcs) are due to the operating system detecting an
inability to utilize processor cycles and it cedes its cycles back to the POWER
Hypervisor, enabling it to schedule a different virtual processor on that physical
processor.
The context switches (cs) and involuntary context switches (ics) columns shown at the left
side of the output are AIX context switches within the partition.
Other mpstat -d columns

The mpstat -d columns show statistics for virtual processor dispatch across the affinity
domains and are defined as follows:
cpu Logical CPU (processor) number.
cs The number of context switches.
ics The number of involuntary context switches.
bound Total number of threads bound to a particular processor.
rq The number of threads on the run queue.
push The number of thread migrations to other processors due to starvation
load balancing.
S3pull Number of thread migrations outside the S3rd affinity domain due to idle
stealing.
S3grd Number of dispatches from the global run queue outside the S3rd affinity
domain.
S0rd The process redispatch (rd) occurs within the same logical processor.
This happens in the case of simultaneous multithreading enabled
systems.

V5.4.0.3
Instructor Guide
Uempty S1rd The process redispatch occurs within the same physical processor,
among different logical processors. This involves sharing of the L1, L2,
and L3 cache.
S2rd The process redispatch occurs within the same processor chip, but
among different physical processors. This involves sharing of the L2 and
L3 cache.
S3rd The process redispatch occurs within the same MCM module, but among
different processor chips.
S4rd The process redispatch occurs within the same central processing
complex (CPC) plane, but among different MCM modules. This involves
access to the main memory or L3-to-L3 transfer.
S5rd The process redispatch occurs outside of the CPC plane.
Tracing virtual processor dispatch

The dispatching of virtual processors by the Hypervisor does not involve the operating
system running in the partition. The operating system cannot directly monitor the rate or
characteristics of context switching from the Hypervisor. However, there is a
communication area that is shared between the Hypervisor and each virtual processor in a
partition so that an operating system such as AIX can implement tracing of the virtual
processor context switching.
The trace facility in AIX supports the trace hook value of 419, which represents the
information that is available about context switching.
Instructor Guide
Instructor notes:
Purpose Define schedule affinity domains and monitor key fields in mpstat output.
Details Point out the involuntary and voluntary logical context switch columns.
Transition statement Lets look at capped SPLPARs.

V5.4.0.3
Instructor Guide
Uempty
Capped shared processor LPAR
Processor capacity utilization
Pool Idle Capacity Available
Maximum Processor Capacity
Entitled Processor Capacity
LPAR Capacity
CededUtilization
capacity
Minimum processor capacity
Utilized capacity
Time
Capped LPAR cannot use more than entitled capacity.
Figure 2-32. Capped shared processor LPAR AN313.1
Notes:
The visual illustrates a capped partition that uses all of its entitled processor capacity twice
over the time shown, but it cannot use more.
Instructor Guide
Instructor notes:
Purpose Illustrate how a capped partition cannot use more than its entitled capacity.
Details The utilized capacity labeled on the visual (in orange) shows the partition using
processing resources. Twice, the partition reaches its total entitled capacity and it is not
allowed to use more.
Transition statement Now, lets look at a similar graph with an uncapped partition.

V5.4.0.3
Instructor Guide
Uempty
Uncapped shared processor LPAR
Pool Idle Capacity Available

Processor capacity utilization
Maximum Processor Capacity
Entitled Processor Capacity

Ceded Capacity
Minimum Processor Capacity
Utilized Capacity
Time
Uncapped LPAR takes advantage of idle capacity.
Figure 2-33. Uncapped shared processor LPAR AN313.1
Notes:
In the visual the uncapped partition reaches its entitled capacity and is allowed to utilize
capacity from the shared pool. Notice that the partition can use more than its maximum
processor capacity. The maximum setting limits dynamic LPAR operations only when
changing the entitled capacity; it has no relevance to uncapped partitions being allowed to
utilize idle processing resources.
Instructor Guide
Instructor notes:
Purpose Illustrate how an uncapped partition can use idle processing resources from
the shared pool.
Details The utilized capacity labeled on the visual (in orange) shows the partition using
processing resources.
Transition statement Now, lets look at the effect of having more or fewer virtual
processors.

V5.4.0.3
Instructor Guide
Uempty
Capacity and virtual processor relationship
Uncapped Potential Capacity

Processor capacity units
5
4
Maximum Processor
Capacity
3
Entitled Processor
Capacity
2
Capped Potential Capacity
Minimum processor capacity
1
Entitled Capacity Per Virtual CPU
1 2 3 4 5 6
Virtual processors
Uncapped LPAR can grow larger with more

virtual processors.
Figure 2-34. Capacity and virtual processor relationship AN313.1
Notes:
The visual compares a partition with one to six virtual processors. The visual compares the
partitions processor utilization when it is uncapped (first bar in each set) to when it is
capped (second bar in each set).
When capped, the partition can only utilize its entitled processor capacity.
When uncapped, the partition can not only use additional processor capacity, but its ability
to maximize the usage of the idle cycles is directly related to the number of virtual
processors it has. This is because a virtual processor can utilize a maximum of 10 ms in
each dispatch window. This visual illustrates that if you have too few virtual processors, you
would be limiting how much of the idle processing capacity can be used by the partition. It
also illustrates that having more virtual processors with a capped partition than it is capable
of using not only does not improve performance, but also increases the context switches
unnecessarily. Notice the line in the visual above labeled Entitled Capacity Per Virtual CPU.
This line shows that each virtual processor is doing less and less work.
Instructor Guide
Performance recap
Having dedicated processors provides improved performance over shared capped
processor performance because of reduced processor cache misses and reduced latency.
However, a partition using dedicated processors cannot take advantage of using excess
shared pool capacity as it could with an uncapped partition using the shared processing
pool. Performance could be better with the uncapped processors if there is excess capacity
in the shared pool that can be used.
Configuring the virtual processor number on shared processor partitions is one way to
increase (or reduce!) the performance for a partition.
The virtual processor setting for a partition can be changed dynamically. You can monitor
performance and change the virtual processor setting dynamically to see wether the
performance improves.

V5.4.0.3
Instructor Guide

Purpose Show how an uncapped partition might be limited in its utilization of idle
processing capacity if it is configured with two few processors.
Details A few things to note about the visual used above:
The first group of bars shown seems invalid with the second set in a practical scenario,
because you cannot have one virtual processor with an entitled capacity over 1.0
processing units. However, this first example shows what would happen if a dynamic
LPAR operation were to reduce the entitled capacity from 2.0 to 1.0, or if only 1.0
processing unit was available when the partition was activated.
For this example, assume that there are at least six physical processors in the shared
processing pool that have idle capacity to allow the uncapped partition to grow.
Transition statement Lets discuss what happens if you have either too many or too
few virtual processors.
Instructor Guide
Virtual processors: What to do? (1 of 2)

General rules for performance:
For uncapped partitions, increase number of virtual processors up to the number
of processors in the shared pool, if enough entitlement to support.
For capped partitions, start with minimum number of virtual processors and
monitor.
If there are many partitions running, overall performance could increase with
fewer virtual processors if:
There is lock contention
Applications are memory-intensive and there are a lot of cache misses
If you are not using VP folding, do not increase total VPs past four times the
number of physical CPUs
Dynamically change settings and monitor
If the virtual processor setting is:
Too low: Uncapped partition cannot take full advantage of excess cycles.
Too high: Could cause excessive context switching, resulting in cache misses and
excessive lock contention if virtual processor folding disabled.
Figure 2-35. Virtual processors: What to do? (1 of 2) AN313.1
Notes:
Virtual processors and capped partitions

With capped partitions, you still cannot exceed the processing unit capacity; however, the
virtual processor setting dictates how many simultaneous threads can be run. For example,
a partition with 0.4 processing units and one virtual processor will attempt to run the 0.4 in
a synchronous thread. If there are no interrupts, that 0.4 will translate to 40% of a 10 ms
timeslice on one processor. With two virtual processors, and no external influences, that
can translate to about 20% on two processors.
Virtual processors and uncapped partitions

For uncapped partitions, it is advantageous to increase the number of virtual processors to
more than the minimum because, the partition might grow to use more processing power if
it is available. To take advantage of the potential increased processing capacity, use a
number of virtual processors that is at least equal to the number of whole processors in the
shared processing pool.

V5.4.0.3
Instructor Guide
Uempty For example, if an uncapped partition is configured with 1.5 processing units and there are
eight processors in the shared processor pool, you could configure up to 15 virtual
processors because 15 is the maximum for 1.5 processing units. However, the
recommendation is to configure the uncapped partition with eight virtual processors and
check the performance. You can then increase the number of virtual processors until you
see performance degrade.
How many is too many virtual processors?

With multiple threads, your job is now split into pieces between processors. This might
make the processing time too short and cause unnecessary context switches. Too many
virtual processors can cause thrashing. Too few virtual processors might not take
advantage of the multithreading aspects of the system.
Monitor and dynamically make changes to virtual processor setting

Use trial and error, and monitor performance, when making changes to partitions. The
number of virtual processors can be changed with dynamic LPAR commands. If you wish
to increase the virtual processor settings for a partition dynamically, make sure the
maximum setting for virtual processors is high enough when you configure the partition.
An interesting aspect of monitoring performance in this environment is that given the
influences of workloads on other partitions, a performance test will undoubtedly have
different results at different times. Performance must therefore be tracked over time.
Instructor Guide
Instructor notes:
Purpose Discuss the configuration guidelines for virtual processors.
Details Trial and error is the main point on this visual, because partition workloads are
all different. Some applications benefit from more virtual processors, and for others, it just
introduces more overhead.
This visual mentions a new concept: virtual processor (VP) folding. Do not go into detail
about this feature here because the topic is covered in detail starting in two visuals. Simply
say here that VP folding is a feature with AIX 5L V5.3 ML3 or above and well discuss this
in a few pages.
Transition statement Lets look at the specific metrics to watch when tuning virtual
processors.

V5.4.0.3
Instructor Guide
Uempty
Virtual processors: What to do? (2 of 2)

Are there too few virtual processors?
Does your uncapped partition have as many virtual processors as
physical processors in the shared pool?
An uncapped partition can grow up to a total of 10 ms per virtual
processor.
Too few virtual processors will show with lparstat:
High user and sys CPU utilization (%user, %sys).
Low idle time and high entitled capacity consumed (%idle, %entc).
Are there idle processor cycles in the shared pool?
Are there available physical processors (app column in lparstat output)?
Are there too many virtual processors?
Because VP folding is enabled by default, virtual processors can be
folded (put to sleep) to avoid excessive involuntary context switches.
Figure 2-36. Virtual processors: What to do? (2 of 2) AN313.1
Notes:
Here is a summary of how to use some of the new performance management tools that can
be used to monitor and make management decisions about processing resources.
If the user and system CPU usage is consistently high for an uncapped partition, but there
are available physical processors in the shared pool, then increase the number of virtual
processors. Monitor performance of the partition to see wether it increases.
You can also use vmstat to determine the percent of user and system time (us and sy
columns). The vmstat command also shows percent of CPU idle time (id column) and
CPU idle time when there are outstanding I/O requests (wa column). For shared partitions,
vmstat also shows the amount of physical processors consumed (pc column) and the
entitled capacity consumed (ec column).
Example vmstat output (showing only the columns related to CPU):
# vmstat 2 2
System configuration: lcpu=4 mem=1024MB ent=0.80
cpu
-----------------------
Instructor Guide
us sy id wa pc ec
97 0 0 3 0.80 100.0
94 0 0 6 0.80 100.0

V5.4.0.3
Instructor Guide

Purpose Describe how to monitor AIX to make decisions about whether virtual
processor settings are too high or too low.
Details Describe what can be monitored and common tools that can be used. The
purpose is to determine if the virtual processor configuration is too high or too low.
If the students seem tired, this is a good time for a break as youve probably been teaching
for about 90 minutes.
Additional information The vmstat output shown in the notes is from a system
running the Tmmg -p 2 program to create a CPU load.
Transition statement The next topic is about virtual processor activity.
Instructor Guide
Virtual processors: Cede, confer, or fold

Virtual processors (VPs) can:
Cede cycles to the shared processing pool
Confer cycles to another virtual processor in that partition
Be folded if they are idle (as of AIX 5L V5.3 ML3)
Cede example: Idle VP (no work to do):
If it is not folded, the VP will cede cycles to the shared pool for the
remainder of the timeslice.
Confer example: VP cannot continue because it is waiting for a
resource owned by another VP in the same partition:
The VP confers its cycles to the other VP that has the resource (such
as a kernel lock).
Folding example: Idle VP (no work to do):
VPs can be folded (put to sleep until there is more work to do).
Performance tools still report original number of virtual and logical
processors.
Figure 2-37. Virtual processors: Cede, confer, or fold AN313.1
Notes:
Virtual processor activity

To optimize physical processor usage, a virtual processor yields a physical processor if it
either has no work to run or enters a wait-state such as waiting for a lock or for I/O to
complete. A virtual processor might yield a physical processor through a cede or confer
Hypervisor call. Another Hypervisor call (h_prod) activates a virtual processor that has
ceded or conferred processor cycles when there is more work to do.
cede hypervisor call (hcall)

The cede hcall is used when the virtual processor is in the idle loop, such as waiting for I/O
to complete. The cede hcall allows the POWER Hypervisor to dispatch other virtual
processors in a different partition.

V5.4.0.3
Instructor Guide
Uempty confer hypervisor call

The confer hcall is used to grant the remaining physical processor cycles in a dispatch
interval to another virtual processor in the same partition. It is used when one virtual
processor cannot make forward progress because it is waiting on an event to complete on
another virtual processor, such as a lock miss. There are three different confer hcalls
depending on the situation.
Virtual processor (VP) folding

Starting with AIX 5L V5.3 maintenance level 3, the kernel scheduler has been enhanced to
dynamically increase and decrease the use of virtual processors in conjunction with the
instantaneous load of the partition, as measured by the physical utilization of the partition.
This is a function of the AIX operating system, and not a Hypervisor call.
If there are too many virtual processors for the load on the partition, the cede hcall works
well, but it only works within a dispatch cycle. At the next dispatch cycle, the Hypervisor
distributes entitled capacity and must cede the virtual processor again if theres no work.
The VP folding feature, which puts the virtual processor to sleep across dispatch cycles,
improves performance by reducing the Hypervisor workload. It does this by decreasing
context switches, and by improving cache affinity.
When virtual processors are deactivated, they are not dynamically removed from the
partition as with DLPAR. The virtual processor is no longer a candidate to run on or receive
unbound work; however, it can still run bound jobs. The number of online logical processors
and online virtual processors that are visible to the user or applications does not change.
There are no impacts to the middleware or the applications running on the system because
the active and inactive virtual processors are internal to the system.
Monitoring hypervisor calls

The following lparstat -H example output has been reduced on this page to show only
cede, confer, and prod Hypervisor calls. You can also use an interval and a count to see
data for the last interval rather than from the last operating system boot.
# lparstat -H
System configuration: type=Shared mode=Capped smt=On lcpu=2 mem=2048 psize=2 ent=0.10
Detailed information on Hypervisor Calls
Hypervisor Number of %Total Time %Hypervisor Avg Call Max Call

Call Calls Spent Time Spent Time(ns) Time(ns)
cede 4 0.0 72.2 101937 171070

confer 0 0.0 0.0 1 0
prod 1 0.0 0.1 527 527
...
Instructor Guide
Monitoring VP folding activity

To monitor VP folding activity, look for idle virtual processors in the output of mpstat -s
with an interval and count. Well cover the mpstat -s command later in this unit.

V5.4.0.3
Instructor Guide

Purpose Describe how virtual processors are managed.
Details Describe the first two Hypervisor calls, cede and confer. Then, describe how
the newest AIX feature, VP folding, improves performance.
Point out the Hypervisor calls in the lparstat -H output shown in the student notes.
This VP folding feature can be tuned, as will be discussed on the next few pages.
Transition statement Lets look more closely at virtual processor folding.
Instructor Guide
Virtual processor folding (1 of 3)

VP folding benefits include:
Improves processor affinity
Increases the dispatch cycle for the remaining VPs for better cache
utilization and less work for Hypervisor
No benefit from this feature when partitions are busy.
Kernel checks status of VPs every second.
Can be disabled or enabled with the schedo option vpm_fold_policy
# schedo -o vpm_fold_policy=0 => disable for both shared
and dedicated processors
# schedo -o vpm_fold_policy=1 => enabled for shared
processors, disabled for dedicated processors
# schedo -o vpm_fold_policy=2 => disabled for shared
processors, enabled for dedicated processors
#schedo -o vpm_fold_policy=3 => enabled for both shared
and dedicated processors
Figure 2-38. Virtual processor folding (1 of 3) AN313.1
Notes:
Virtual processor (VP) folding benefits

This feature enhances the utilization of a shared processor pool by minimizing the use of
virtual processors that are idle most of the time. The important benefit of this feature is
improved processor affinity when there is a large number of largely idle shared processor
partitions resulting in effective use of processor cycles. It increases the average virtual
processor dispatch cycle, resulting in better cache utilization and reduced Hypervisor
workload.
Enable/Disable VP folding
The schedo command is used to dynamically enable, disable, or tune the VP folding
feature. The VP folding feature is configurable by changing the vpm_fold_policy parameter.
To enable or disable the AIX processor folding feature depending on the partition type:

V5.4.0.3
Instructor Guide
Uempty schedo -o vpm_fold_policy=0 => disable for both shared and dedicated
processors
schedo -o vpm_fold_policy=1 => enabled for shared processors, disabled for
dedicated processors
schedo -o vpm_fold_policy=2 => disabled for shared processors, enabled for
dedicated processors
schedo -o vpm_fold_policy=3 => enabled for both shared and dedicated
processors
1 is the default value
Typically, this feature should remain enabled. The disable function is available for
comparison reasons and in case any tools or packages encounter issues due to this
feature.
Instructor Guide
Instructor notes:
Purpose Explain the benefits of VP folding and how to disable or enable it.
Details Describe the benefits of the VP folding feature.
Describe how to use schedo to disable or enable this feature. The next visual shows how to
tune it.
We have not yet discussed why you might want to tune this feature, and that is coming up
soon.
Transition statement Lets look at how this feature can be set to different values.

V5.4.0.3
Instructor Guide
Uempty

VP folding can be tuned dynamically with the schedo tuning option
vpm_xvcpus
Folding activity is determined by this calculation:
Number of VPs needed = Physical CPU utilization (physc) + vpm_xvcpus value
If the number of VPs needed is less than the current number of enabled virtual
processors, one or more VPs are disabled:
Threads attached to a disabled virtual processor are still allowed to run on it.
If the number of VPs needed is greater than the current number of enabled virtual
processors, one or more disabled VPs are enabled.
Example of default VP folding activity:
Physc = 2.5 and vpm_xvcpus = 0
Number of VPs needed is three (2.5 + 0, rounded up to next integer).
If the partition is currently configured for six VPs, then three are disabled (one per
second).
Values for the entitled capacity and VP minimum are irrelevant (this is not DLPAR).
Notes:
Configuring vpm_xvcpus
Every second, the kernel scheduler evaluates the number of virtual processors in a
partition based on their utilization. If the number of virtual processors needed to
accommodate the physical utilization of the partition is less than the current number of
enabled virtual processors, one virtual processor is disabled. If the number of virtual
processors needed is greater than the current number of enabled virtual processors, one or
more (disabled) virtual processors are enabled. Threads attached to a disabled virtual
processor are still allowed to run on it.
Instructor Guide
Instructor notes:
Purpose Describe how the VP folding option to schedo is used to calculate the number
of activated virtual processors.
Details This visual shows an example when vpm_xvcpus has the default value of 0.
Well get to other values on the next visual.
Transition statement Next, well see how to tune this parameter and why.

V5.4.0.3
Instructor Guide
Uempty

Tuning vpm_xvcpus:
Besides simply disabling or enabling, you can set other values for vpm_xvcpus
# schedo -o vpm_xvcpus=1
Example:
If physc = 2.5, and vpm_xvcpus = 1, it means that the number of VPs needed is
four.
If the partition is currently using six VPs, then two are disabled (one per second).
If the partition is currently using three VPs, then one is enabled (if the VP setting is four
or more in the partition profile).
When should you increase value?
Increase: If you want to decrease dispatch latency and improve responsiveness.
For some workloads, changes once a second might be too long.
Increase: If utilization of shared pool is low, you can take advantage of more
physical processors.
Notes:
Tuning VP folding
The visual above shows how the vpm_xvcpus value set using schedo is used to determine
the number of VPs to fold. You can set this value to an integer to tune how the VP folding
feature will react to a decrease in workload.
Why increase the default value?

Increasing the virtual processor number could minimize dispatch latency and improve
responsiveness. The kernel scheduler adjusts the number of virtual processors each
second so the mechanism scales well, but for some workloads a second is too long.
Workloads that have sudden bursts of demand might benefit from having an extra couple of
virtual processors ready for use.
Another potential reason has to do with the utilization of separate physical processors. If
the utilization of the shared processor pool is very low, then the advantage of squeezing
Instructor Guide
work onto a fewer number of virtual processors is lost. In such environments, you might
want to configure a primary shared processor partition so that it has enough resources to
take over the entire shared processor pool, assuming its partition entitlement is large
enough, or it is uncapped. This enables more physical resources to be allocated to the
partition more quickly, with the additional benefit of being able to allocate essentially
dedicated processor resources to the partition. In this scenario, the assumption is that the
other shared processor partitions are mostly idle and are configured to utilize a fewer
number of virtual processors by default.

V5.4.0.3
Instructor Guide

Purpose Describe how to set the VP folding option using schedo to different values and
how this value is used.
Details Describe how to set other values for vpm_xvcpus and how this affects the
number of activated virtual processors.
Stress that the virtual processor number you will see in the performance tool remains the
same.
At the time of this writing, there is no known technical reason to disable the VP folding
feature.
Some quick-changing workloads might see improved responsiveness by having more
virtual processors, if there are adequate physical processors in the shared processing pool.
Transition statement Lets look at how simultaneous multithreading and virtual
processors work together.
Instructor Guide
Simultaneous multithreading and SPLPARs

Simultaneous multithreading
can be used with Micro-Partitions.
With simultaneous multithreading, POWER7
each virtual processor runs two LPAR1
threads (POWER6) or
four threads (POWER7). Logical
Each thread is called a processors
logical processor.
LPAR1 example: Virtual
1.6 processing units processors
Two virtual processors
Simultaneous
multithreading enabled
Eight logical processors

Figure 2-41. Simultaneous multithreading and SPLPARs AN313.1
Notes:
Logical processors and virtual processors

Each of the simultaneous multithreading threads (logical processors) of a virtual processor
has a separate hardware state, but they are viewed as one entity for the purpose of a
dispatch of a virtual processor. The logical processors are always assigned to the same
partition.
The amount of time that each virtual processor runs is split between the logical processors.
So the dispatch wheel works the same whether simultaneous multithreading is enabled or
not.
PURR
The PURR value, covered in the simultaneous multithreading unit of this course, is used to
accumulate information only when the virtual processor is dispatched on a physical
processor. So PURR is utilized even if simultaneous multithreading is disabled, because it
provides accurate processor utilization statistics in a shared processor environment.

V5.4.0.3
Instructor Guide

Purpose Describe how logical processors and virtual processors work together.
Details The last unit discussed logical processors when simultaneous multithreading is
enabled. This visual shows that everything discussed in this unit with regard to virtual
processors is still true, whether simultaneous multithreading is enabled or not.
Transition statement Lets look at one more visual on this topic.
Instructor Guide
Metrics with simultaneous multithreading and SPLPAR

Each virtual processor supports logical CPU5
two (POWER6) or four logical CPU6
LPAR 2
(POWER7) logical processors: virtual purr6
virtual purr6
Dispatched at the same time LPAR 3
virtual proc3
logical CPU1
virtual CPU logical CPU10
logical
splpar1
CPU0
virtual CPU Dispatch logical CPU1
virtual
splpar1 purr1
virtual timebase virtual purr10
wheel (10 ms) virtual timebase
virtual purr0
virtual timebase
virtual purr9
virtual proc0 logical CPU7 virtual proc5
logical CPU7
PURR statistic: LPAR 1 virtual purr7
Still measures fraction of time virtual purr6
virtual timebase
partition runs on a physical virtual proc3 LPAR 4
processor (the relative amount of
processing units consumed)
Dispatched
Thread1
Thread0
purr1
purr0
physical proc0
Figure 2-42. Metrics with simultaneous multithreading and SPLPAR AN313.1
Notes:
Dispatch wheel works the same way with simultaneous multithreading

enabled
With simultaneous multithreading enabled, the two logical processors (related to a single
virtual processor by simultaneous multithreading) are dispatched together (at the same
time) whenever the Hypervisor schedules the virtual processor to run. The amount of time
that each virtual processor runs is split between the two logical processors.
PURR statistic
The PURR statistic was described in the simultaneous multithreading unit in this course.
Notice that it still measures thread CPU elapsed time as was described in the
Simultaneous Multithreading unit.

V5.4.0.3
Instructor Guide

Purpose Describe how logical processors and virtual processors work with the dispatch
wheel.
Details The purpose of this slide is to bring together the logical processor concept from
the simultaneous multithreading unit in this course with the virtual processor concept from
this unit. Point out that when monitoring CPU utilization the PURR statistic keeps track of
thread utilization of logical processors which are related by simultaneous multithreading to
virtual processors.
Transition statement Lets look at how AIX tools have been changed to support all
these CPU concepts.
Instructor Guide
AIX 6.1 SPLPAR tool impact (1 of 2)

lparstat is an easy way to view partitions configuration.
# lparstat
System configuration: type=Shared mode=Capped smt=On lcpu=4
mem=1024 psize=2 ent=0.80
%user %sys %wait %idle physc %entc lbusy app vcsw phint
----- ---- ----- ----- ----- ----- ------ --- ---- -----
0.1 0.1 0.0 99.8 0.00 0.3 0.0 1.99 580 0
mpstat can be used to monitor activity.

Provides multiple metrics for per-logical/per-virtual processor
For example, phantom interrupts and context switches
topas L adds LPAR view.
Figure 2-43. AIX 6.1 SPLPAR tool impact (1 of 2) AN313.1
Notes:
These tools were shown in the simultaneous multithreading unit in this course. Some of the
columns are only shown with simultaneous multithreading or Micro-Partitioning.
/usr/bin/lparstat
The -i option provides detailed LPAR information, and -H and -h provide Hypervisor
specific information. You can run lparstat over time with interval and count arguments;
otherwise it shows statistics since the last operating system boot. For example, lparstat
-h 2 5 runs the command five times, with two second intervals.
The lparstat output, in the visual above, shows a system configured as shared, capped,
and with simultaneous multithreading enabled. It has four logical processors (lcpu), 1 GB
of memory (mem), and its entitled capacity (ent) is 0.80. The psize field shows there are
two physical processors in the shared pool.

V5.4.0.3
Instructor Guide
Uempty The following additional columns are displayed if the partition type is shared:
Field Description
physc Number of physical processors consumed.
%entc Percentage of the entitled capacity consumed.
Percentage of logical processor(s) utilization that occurred while
lbusy
executing at the user and system level.
app Available physical processors in the shared pool.
Number of virtual context switches which are the virtual processor
vcsw
hardware preemptions.
Number of phantom (targeted to another shared partition in this pool)
phint
interruptions received.
In the example output in the visual above, we see that the partition was mostly idle, at
99.8%. It used only 0.3 of its 0.80 entitled capacity (entc%). This consumed 0% of a
physical processor (physc), which because of the limit of two decimal places, shows as
zero.
Available physical processors (app)

The app column shows processors available in the shared processing pool. If this partition
was very busy and needed additional processing power, then you can use the app column
to see if there is any available processing power in the shared processing pool.
This app column only shows for partitions where the checkbox labeled Allow shared
processor pool utilization authority is checked. This checkbox can be found in the Partition
Properties screens on the HMC for existing partitions. To find the checkbox, open the
properties, go to the Hardware tab, then the Processors and Memory tab.
/usr/bin/mpstat
If using shared processors, and simultaneous multithreading is enabled, the mpstat -s
command displays physical as well as logical processor usage as shown in the example
below.
In the output shown below, the physical processor Proc0 is busy at 0.35%, which is made
up of on logical processor cpu0 (0.26%) and logical processor cpu1 (0.09%). cpu0 and
cpu1 are hardware threads for Proc0.
# mpstat -s 1 1
Proc0 Proc2
0.35% 0.02%
cpu0 cpu1 cpu2 cpu3
0.26% 0.09% 0.01% 0.01%
Instructor Guide
Instructor notes:
Purpose The AIX 5L V5.3 and above commands listed have enhancements to support
shared processors.
Details Point out the changes in lparstat and mpstat commands when a partition uses
shared processors.
Describe that the PURR statistics are also used for shared processor partitions even with
simultaneous multithreading disabled.
Transition statement The next page lists more commands modified in AIX 5L V5.3 to
support shared processor partitions.

V5.4.0.3
Instructor Guide
Uempty
AIX 6.1 SPLPAR tool impact (2 of 2)

vmstat, iostat, sar:
With simultaneous multithreading or SPLPAR:
Automatically use PURR-based metrics for %user, %sys, %wait, %idle
Two columns for SPLPAR:
Physical Processor Consumed (pc or physc) by the partition
Percentage of Entitlement Consumed (pec or %entc) by the partition:
Can go as high as 1000% for uncapped partitions
sar -P ALL (logical processors view):
With simultaneous multithreading or SPLPAR:
Physical Processor Consumed (physc) column added
For SPLPAR, another column:
Percentage of Entitlement Consumed (%entc) also added:
Gives relative entitlement consumption for each logical processor
Figure 2-44. AIX 6.1 SPLPAR tool impact (2 of 2) AN313.1
Notes:
/usr/bin/iostat
iostat was modified to add two additional metrics: Physical processor consumed (physc
column) and Percentage of entitlement granted (%entc column). These are shown only in
shared processor partitions or if simultaneous multithreading is enabled.
Physical processor consumed shows a measure of the fraction of time a logical processor
gets physical processor cycles.
Percentage of entitlement consumed gives the relative entitlement consumption for each
logical processor.
Instructor Guide
Example iostat with only the CPU columns showing:

# iostat -t 2 3
avg-cpu: % user % sys % idle % iowait physc %entc
0.1 0.2 99.7 0.0 0.0 0.9
0.1 0.4 99.5 0.0 0.0 1.1
0.1 0.2 99.7 0.0 0.0 0.9
/usr/bin/vmstat and /usr/sbin/sar

The vmstat output was shown earlier in this unit. The same two metrics were added to
both vmstat and sar as were added to iostat. We will look at sar next.

V5.4.0.3
Instructor Guide

Purpose Describe the enhancements to AIX 5L V5.3 and above commands to support
shared processor partitions.
Details Detail about these commands were shown in the previous unit in this course
and were shown in the context of simultaneous multithreading and logical processors.
Describe that the PURR statistics are also used for shared processor partitions even with
simultaneous multithreading disabled.
Transition statement Lets look at differences with sar -P when using a shared
partition. Well start by looking at an idle system.
Instructor Guide
Using sar with SPLPAR (1 of 2)

Id l
# sar -P ALL 1 2 es
AIX sys276_lpar1 1 6 00C35B804C00 04/28/10 ys
tem
System configuration: lcpu=2 ent=1.00 mode=Capped
20:20:57 cpu %usr %sys %wio %idle physc %entc

20:20:58 0 23 58 0 19 0.01 0.7
1 0 2 0 98 0.00 0.2
U - - 0 99 0.99 99.1
- 0 0 0 99 0.01 0.9
20:20:59 0 15 60 0 25 0.01 0.5
1 0 2 0 98 0.00 0.2
U - - 0 99 0.99 99.3
- 0 0 0 100 0.01 0.7
Average 0 20 59 0 21 0.01 0.6

1 0 2 0 98 0.00 0.2
U - - 0 99 0.99 99.2
- 0 0 0 100 0.01 0.8
Figure 2-45. Using sar with SPLPAR (1 of 2) AN313.1
Notes:
sar -P ALL
When running in a shared partition, sar displays the percentage of entitlement consumed
%entc which is ([PPC/ENT]*100). PPC is Physical Processor Consumed and ENT is
entitled capacity. This gives relative entitlement utilization for each logical processor and
allows system average utilization calculation from logical processor utilization.
Whenever the percentage of entitled capacity consumed is under 100%, a line beginning
with U is added to represent the unused capacity.
The physical processor consumed physc (delta PURR/delta TB) column shows the relative
simultaneous multithreading split between processors, that is, it shows the measurement of
fraction of time a logical processor was getting physical processor cycles.
Interpreting the lparstat output

The output in the visual can be read as follows:

V5.4.0.3
Instructor Guide
Uempty On average, for the time that it actually consumed (0.01), cpu0 spent 20% in user time
and 50% in system time.
Although it has 1.00 entitled capacity (ent), it only used an average of 0.8% (%entc) of
it.
This very little time on a CPU consumed about 0.01 of a physical CPU (physc). The
physc column shows the amount of time that a virtual processor actually ran on a
physical CPU. When these virtual processors arent doing anything, the partition cedes
its excess cycles back to the Hypervisor.
In summary, on shared partitions, it is important that you dont just look at the %usr and
%sys columns and state whether the processors are busy or not. The output in the visual
above shows that this partition is hardly doing any work at all. In fact, this output was taken
on a system simply running only the operating system. But, it looks like its 79% busy. On a
shared partition, you must look at the other columns to figure out if it was indeed busy. For
example, if the %entc was nearing 100% (or more than 100% in the case of uncapped
partitions), then the partition is busy.
Instructor Guide
Instructor notes:
Purpose Describe the differences in the sar -P ALL output when using a shared
processor partition.
Details In the visual, you can use the example of the first line for cpu0 by saying that for
0.01 of a physical processors time, it was 23% in user, 58% in system, and 19% idle. The
important point is that although on average this system looks like it was 79% busy, that is
only for the actual time that the virtual processor(s) spent on a physical CPU. Any excess
processing cycles are ceded back to the Hypervisor.
The output in the visual shows a partition that is only using 0.01 of a physical processor
and on average 0.8% of its entitled capacity.
The example on the visual shows a simple example with an entitled capacity of 1.00. The
next visual shows a more complex scenario.
Transition statement Lets see how sar -P looks on a busy system.

V5.4.0.3
Instructor Guide
Uempty
Using sar with SPLPAR (2 of 2)

# sar -P ALL 1 1
Bu
AIX sys276_lpar1 1 6 00C35B804C00 04/28/10 sy
sy
System configuration: lcpu=4 ent=1.00 mode=Capped ste
m
16:19:24 0 0 7 0 93 0.03 3.3
1 100 0 0 0 0.37 46.8
2 100 0 0 0 0.38 46.9
3 0 1 0 99 0.02 3.1
- 94 0 0 6 0.80 100.0
# mpstat -s 1 1
Proc0 Proc2
46.05% 53.66%
cpu0 cpu1 cpu2 cpu3
29.78% 16.27% 3.90% 49.76%
Figure 2-46. Using sar with SPLPAR (2 of 2) AN313.1
Notes:
sar on a busy shared processor partition

In the visual above, we see the sar -P ALL output on a busy system. There are some
differences between the example on the visual above and the previous visual. The entitled
capacity is now 0.80 rather than 1.0, and there are four logical processors.
Notice the following in the above sar -P ALL output:
There are two logical processors that are both about 100% busy with 50% of the
entitled capacity each.
Each is consuming about 0.40 of a physical processor, which is half of the 0.80 entitled
capacity.
There are two logical processors that are idle. Apparently, this workload has two
threads.
Instructor Guide
Another sar -P ALL example with uncapped partition

Heres another sar -P ALL 1 1 example on a busy, uncapped partition. Notice that %entc
can be more than 100%.
# sar -P ALL 1 1
AIX bud151 3 5 00CDEF8E4C00 03/21/06

15:21:04 0 1 9 0 90 0.08 26.1
1 95 5 0 0 0.89 296.8
2 5 78 1 16 0.00 1.6
3 0 53 0 47 0.00 0.9
- 86 6 0 7 0.98 325.4
mpstat -s output
Notice the following in the mpstat -s output in the visual above:
We can confirm with the mpstat -s command that two logical processors are busy and
two are idle.
The two logical processors that are busy are on different virtual processors. If you watch
over time, you might see the load bounce around to different logical processors.
By adding up the percent busy for both virtual processors, you reach the value of about
80% of a physical processor. This makes sense with the entitled capacity of 0.80 on an
extremely busy system.

V5.4.0.3
Instructor Guide

Purpose Describe how values might look for sar -P ALL and mpstat -s on a
CPU-constrained system.
Details The previous visual showed the basics of the sar -P ALL output with an easy
configuration of 1.0 entitled capacity on an idle system. This visual shows a more complex
configuration and a CPU-constrained system.
Describe how the mpstat -s values correspond to the sar -P ALL values.
Additional information The Tmmg command from the /home/p6perf/ex2 directory on
the training lab systems was used to generate the load on the system where these
example outputs were taken. The -p 2 option created exactly two threads for this load.
The output would look different with a different number of threads.
Transition statement Lets look at the changes that were implemented for the topas
tool.
Instructor Guide
topas: Example main screen

Topas Monitor for host: sys276_lpar1 EVENTS/QUEUES FILE/TTY
Thu Apr 29 16:20:18 2010 Interval: 2 Cswitch 219 Readch 244
Syscall 119 Writech 1398
CPU User% Kern% Wait% Idle% Physc Entc Reads 1 Rawin 0
ALL 99.0 0.5 0.0 0.5 0.80 99.9 Writes 2 Ttyout 244
Forks 0 Igets 0
Network KBPS I-Pack O-Pack KB-In KB-Out Execs 0 Namei 9
Total 0.4 0.5 1.0 0.0 0.3 Runqueue 1.0 Dirblk 0
Waitqueue 0.0
Disk Busy% KBPS TPS KB-Read KB-Writ MEMORY
Total 0.0 4.0 0.0 0.0 4.0 PAGING Real,MB 1024
Faults 0 % Comp 74
FileSystem KBPS TPS KB-Read KB-Writ Steals 0 % Noncomp 18
Total 1.1 2.0 0.2 0.8 PgspIn 0 % Client 18
PgspOut 0
Name PID CPU% PgSp Owner PageIn 0 PAGING SPACE
spload 290922 98.9 0.4 root PageOut 0 Size,MB 512
topas_nm 380942 0.1 2.6 root Sios 0 % Used 1
topas 438286 0.1 1.4 root % Free 99
sshd 503898 0.0 1.1 root NFS (calls/sec)
getty 307390 0.0 0.6 root SerV2 0 WPAR Activ 0
random 237698 0.0 0.4 root CliV2 0 WPAR Total 0
gil 57372 0.0 0.9 root SerV3 0 Press: "h"-help
java 299182 0.0 52.6 root CliV3 0 "q"-quit
Figure 2-47. topas: Example main screen AN313.1
Notes:
The topas output has been modified for Micro-Partitions. The new metrics have been
applied so processor utilization is calculated using new PURR-based values and formula
when running in simultaneous multithreading or Micro-Partitioning mode. Additional
information is:
Physc: The fractional number of processors consumed (shows for both dedicated and
shared partitions)
%Entc: The percentage of entitled capacity consumed (shown only for shared partitions)

V5.4.0.3
Instructor Guide

Purpose Show changes implemented for shared partitions in the topas tool.
Details Simply point out the %Entc data in the topas tool output. This was added for
shared processor partitions.
Transition statement You can use topas -L to look at partition-related statistics.
Instructor Guide
Partition data with topas L
# topas -L
Interval: 2 Logical Partition: LPAR1 Thu Apr 29 16:22:32 2010

Psize: - Shared SMT ON Online Memory: 1024.0
Ent: 0.80 Mode: Capped Online Logical CPUs: 4
Partition CPU Utilization Online Virtual CPUs: 2
%usr %sys %wait %idle physc %entc %lbusy app vcsw phint %hypv hcalls
99 0 0 1 0.8 99.68 24.94 - 438 0 0.6 500
==================================================================================
LCPU minpf majpf intr csw icsw runq lpa scalls usr sys _wt idl pc lcsw
Cpu0 0 0 101 2 2 0 100 0 100 0 0 0 0.67 101
Cpu1 0 0 90 0 0 0 0 0 0 17 0 83 0.00 90
Cpu2 0 0 95 140 70 0 100 19 85 7 0 8 0.01 134
Cpu3 0 0 182 79 40 1 99 47 98 1 0 1 0.11 113
Figure 2-48. Partition data with topas - L AN313.1
Notes:
topas -L output
This visual above shows the output when you press L while in topas, or when you invoke
topas with the -L option. This screen shows more partition-related statistics.
In this output, you can see the percentage of time the logical processors are busy (%lbusy),
the available processor pool (app), the number of voluntary context switches (vcsw), the
number of phantom interrupts (phint), the percentage of time processing Hypervisor calls
(%hypv), and the number of Hypervisor calls (hcalls).
There is also a break down of statistics for each logical processor.

V5.4.0.3
Instructor Guide

Purpose Describe the partition statistics screen for topas.
Details Explain the new system partition fields. These are similar statistics to other
tools.
Transition statement You can use topas to see lparstat-like statistics for multiple
partitions on the same managed system.
Instructor Guide
Cross partition data with topas -C
Topas CEC Monitor Interval: 10 Thu Apr 29

16:51:30 2010
Partitions Memory (GB) Processors
Shr: 3 Mon: 7.8 InUse: 6.0 Shr:1.8 PSz: 7 Don: 2.0 Shr_PhysB
0.03
Ded: 2 Avl: - Ded: 3 APP: 7.0 Stl: 0.0 Ded_PhysB
0.04
Host OS M Mem InU Lp Us Sy Wa Id PhysB Vcsw Ent %EntC PhI pmem

-------------------------------------shared-------------------------------
sys226_v2 A61 C 0.8 0.7 4 0 0 0 98 0.01 402 0.35 2.1 1 -
sys226_v3 A61 U 0.5 0.5 4 0 0 0 99 0.01 468 0.35 1.9 0 -
sys226_v4 A61 U 0.5 0.5 4 0 0 0 99 0.01 433 0.35 1.9 0 -
Shared processor partitions

Host OS M Mem InU Lp Us Sy Wa Id PhysB Vcsw %istl %bstl %bdon
%idon
------------------------------------dedicated-----------------------------
-
sys226_v5 A61 S 1.5 1.4 2 2 2 0 95 0.04 252 0.00 0.00 - -
sys226_v6 A61 D 1.5 0.8 2 0 0 0 99 0.00 214 0.00 0.07 - -
Dedicated processor partition

Figure 2-49. Cross partition data with topas -C AN313.1
Notes:
This visual above shows the output when you press C while in topas, or when you invoke
topas with the -C option.
As of AIX 5L 5.3 Maintenance Level 3, the topas command can also report some
performance metrics from remote partitions. This cross partition panel displays metrics
similar to the lparstat command for all of the AIX partitions it can identify as belonging to the
same hardware platform.
The example above shows a system with multiple partitions running at AIX 5L V5.3 ML3. In
this particular system, the partitions do not have any network interfaces configured and rely
on retrieving this data from the HMC, through the service processors network connection
to the HMC.

V5.4.0.3
Instructor Guide
Uempty topas -C output

The top section represents aggregated data from the partition set to show overall partition,
memory, and processor activity. Dedicated and shared partitions are displayed in separate
sections with appropriate metrics.
The topas command also collects remote data from partitions running versions of AIX prior
to V5.3 with Maintenance Level 3 that have the Performance Aide product
(perfagent.server) installed.
A connection to the HMC must be established for topas to retrieve certain configuration
information. If this connection is not available, command-line options allow the user to
specify the needed values.
Instructor Guide
Instructor notes:
Purpose Describe how to use topas to get information from other partitions.
Details Describe how to invoke topas and view partition-related information from other
partitions.
Additional information There does not appear to be any security built in to the topas
-C function. That is, theres no way to block one partition from retrieving performance data
from another except by not upgrading to maintenance level 3.
Transition statement The lbusy percentage can be confusing. Lets look at this more
closely.

V5.4.0.3
Instructor Guide
Uempty
Understanding lbusy percentage
# lparstat 2 3
System configuration: type=Shared mode=Capped smt=On lcpu=4

mem=3776 psize=2 ent=0.80
%user %sys %wait %idle physc %entc lbusy app vcsw phint
----- ---- ----- ----- ----- ----- ------ --- ---- -----
95.1 0.2 0.0 4.6 0.80 99.9 49.6 1.20 411 0
96.5 0.2 0.0 3.3 0.80 99.9 49.6 1.20 417 1
94.2 0.2 0.0 5.6 0.80 100.0 50.5 1.20 479 0
# mpstat -s 1 1
System configuration: lcpu=4 ent=0.8 2 out of 4 logical

processors are
Proc0 Proc2 busy (50%).
39.99% 39.90%
cpu0 cpu1 cpu2 cpu3
2.36% 37.64% 37.76% 2.14%
Figure 2-50. Understanding lbusy percentage AN313.1
Notes:
The visual shows the output of lparstat and mpstat on a busy system. You can see that
although the system is using all of its entitled capacity (100% of its 0.80 processing units),
the logical processors are only 50% busy. Because this system has four logical processors,
and the load on the system apparently has only two threads, only two logical processors
are being utilized. lbusy therefore reports that 50% of the logical processors are busy. If we
were to start a load on the system using four threads, then we should see lbusy at 100%.
Instructor Guide
Instructor notes:
Purpose Describe what the lbusy percentage value represents.
Details Use the example output on the visual to show how lbusy might be less than
100% even when the system is using 100% of its entitled capacity.
Transition statement We have two final topics to wrap up the Micro-Partitioning unit.
The first is a description of the types of applications best suited for Micro-Partitioning.

V5.4.0.3
Instructor Guide
Uempty
Micro-Partitioning and applications

Applications do not need to be aware of Micro-Partitioning.
Applications that might not benefit from Micro-Partitioning include:
Applications with a strong response time requirement for transactions might find
Micro-Partitioning detrimental:
Because virtual processors can be dispatched at various times during a timeslice
Could result in longer response time with too many virtual processors
Each virtual processor with a small entitled capacity is in effect a slower CPU.
Compensate with more entitled capacity (2-5% PUs over plan)
Applications with polling behavior
CPU intensive application examples: DSS, HPC
Applications that are good candidates for Micro-Partitioning include:
Those that have low average CPU utilization, with high peaks
Examples: OLTP, Web applications, mail server, directory servers
Figure 2-51. Micro-Partitioning and applications AN313.1
Notes:
This page reviews some types of applications that would do well or not so well with
Micro-Partitioning. Your results might vary.
Decision Support System (DSS) and High Performance Computing (HPC) are examples of
applications that might be CPU-intensive. Online Transaction Processing applications are
an example of a low average CPU utilization because it usually has a human interface with
high idle times.
Polling behavior
If an application is constantly polling for available resources or particular conditions, this
uses CPU resources that might be available for other partitions without actually doing any
work. If this interferes too much with other partitions, and if dedicated processors are an
option, you might wish to try using dedicated processors for this partition.
Instructor Guide
Instructor notes:
Purpose Give general guidelines of the types of applications that would do well or not
so well with Micro-Partitioning.
Details Discuss different types of applications and whether they would be good
candidates for using shared processors.
This discussion should start with the reminder that using shared versus dedicated
processors is a trade-off. Using dedicated results in the best performance if you have
plenty of processors. But, if you need to use shared processors because you need to make
more effective use of the processors that you have, then the discussion should focus on
what type of applications would fare better than others.
Transition statement As a final word on Micro-Partitioning, lets look at some
strategies for shared partition capacity planning.

V5.4.0.3
Instructor Guide
Uempty
Micro-Partitioning and capacity planning

Planning strategies:
Guaranteed capacity (that is, dedicated processors):
Generally used on new systems where there is no performance data or
random behavior
Smallest risk and not a great effort on performance management
Downside is that excess capacity is wasted
Harvested capacity:
Generally used on system where partitions have mixed types of
applications
Applications without response time requirements can provide excess
capacity to those that do
Planned over commit:
On average, all partitions consume their required resources:
All applications cannot peak at the same time.
Most efficient use of resources
Requires accurate planning, knowledge of applications behavior, and
performance management
Figure 2-52. Micro-Partitioning and capacity planning AN313.1
Notes:
This visual lists three strategies that you can use when planning your system.
Do you want all partitions to be guaranteed their CPU resources? If you do not know the
behavior of the applications on your system, having dedicated CPUs is the safest strategy
to use (performance-wise) if you have enough resources.
Do you have a wide range of applications on the system? The harvested capacity strategy
has some partitions that might have unused capacity, which will be harvested by others.
The third strategy is the most cost-effective and performance-risky option. With this
strategy, you must monitor the partitions closely. Its safe if some of the partitions peak, but
if most of the partitions peak simultaneously, then the resources are overcommitted.
Instructor Guide
Instructor notes:
Purpose Describe the three strategies for capacity planning on a system where
partitions will utilize shared processors.
Details Discuss the three strategies and explain how you each has its own trade-off.
Transition statement Next, well do some checkpoint questions.

V5.4.0.3
Instructor Guide
Uempty
Checkpoint (1 of 4)
1. Match the following processor terms to the statements that describe
them: Dedicated, shared, capped, uncapped, virtual, logical
a. ___________ These processors cannot be used in micro-partitions.
b. ___________ Partitions marked as this might use excess processing
capacity in the shared pool.
c. ___________ There are two or four of these for each virtual processor if
simultaneous multithreading is enabled.
d. ___________ This type of processor must be configured in whole
processor units.
e. ___________ These processors are configured in processing units as
small as one-hundredth of a processor.
f. ___________ Partitions marked as this might use up to their entitled
capacity but not more.
2. Can a running partition have both dedicated and shared processors?
Figure 2-53. Checkpoint (1 of 4) AN313.1
Notes:
Instructor Guide
Instructor notes:
Purpose
Details
Checkpoint solutions (1 of 4)
a. Dedicated These processors cannot be used in micro-partitions.
b. Uncapped Partitions marked as this might use excess processing
c. Logical There are two or four of these for each virtual processor if
d. Dedicated This type of processor must be configured in whole
processor units.
e. Shared These processors are configured in processing units as
f. Capped Partitions marked as this might use up to their entitled
The answers in the correct order are dedicated, uncapped, logical,
dedicated, shared, and capped.

The answer is no, a partition must use only one type of processor,
dedicated or shared, but not both.

V5.4.0.3
Instructor Guide
Uempty
Checkpoint (2 of 4)
3. True or False: By default, dedicated processors are returned to the
shared processor pool if the dedicated partition becomes inactive.
4. If a partition has 2.5 processing units, what is the minimum number of

virtual processors it must have?
a. One
b. Three
c. No minimum
5. If a partition has 2.5 processing units, what is the maximum number of

virtual processors it can have?
a. 25
b. 30
c. Total number of physical processors x 10
d. No maximum
Notes:
Instructor Guide
Instructor notes:
Purpose
Details
The answer is true.
a. One
b. Three
c. No minimum
The answer is three.
a. 25 (Maximum can be no more than 10 times processing units.)
b. 30
d. No maximum
The answer is 25 (maximum can be no more than 10 times processing
units).

V5.4.0.3
Instructor Guide
Uempty
Checkpoint (3 of 4)
6. What is the maximum amount of processing units that can be
allocated to a partition?
7. If an uncapped partition has an entitled capacity of 0.5 and

two virtual processors, what is the maximum amount of
processing units it can use?
8. If there are multiple uncapped partitions running, how are

excess shared processor pool resources divided between the
partitions?
9. True or False: If a capped partition is allocated 2.5

processing units, this means that it can use up to 25 ms of
processing time for every 10 ms of clock time.
Notes:
Instructor Guide
Instructor notes:
Purpose
Details
6. What is the maximum amount of processing units that can be allocated
to a partition?
The answer is all available processing units.
7. If an uncapped partition has an entitled capacity of 0.5 and two virtual
processors, what is the maximum amount of processing units it can
use?
The answer is 2.0 processing units because it is uncapped and has
two virtual processors (maximum of 1.0 units per virtual processor).
8. If there are multiple uncapped partitions running, how are excess
shared processor pool resources divided between the partitions?
The answer is the uncapped weight configuration value is used to
allocate excess resources.
9. True or False: If a capped partition is allocated 2.5 processing units,

this means that it can use up to 25 ms of processing time for every 10
ms of clock time.
The answer is true.

V5.4.0.3
Instructor Guide
Uempty
Checkpoint (4 of 4)
10. What is the maximum number of virtual processors that can
be configured for an individual partition?
11. True or False: The hypervisor divides a partitions entitled

capacity evenly between its virtual processors.
Notes:
Instructor Guide
Instructor notes:
Purpose
Details
The answer is up to ten times the amount of processing
units, with a maximum value of 64.

The answer is true.

V5.4.0.3
Instructor Guide
Uempty
Exercise
Unit
exerc
ise
Notes:
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Unit summary
Describe the impact of simultaneous multithreading on tools such as
vmstat, iostat, sar, and topas
Discuss guidelines for systems running simultaneous multithreading
with various workloads
Describe how the POWER Hypervisor allocates processing power from
the shared processing pool
processors
Describe performance considerations associated with implementing
Micro-Partitioning
Use tools to monitor the statistics on a partition running a workload with
Micro-Partitioning configured
Notes:
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty Unit 3. Dedicated shared capacity and multiple

shared processor pools
Estimated time
01:30

This unit describes the IBM Power Systems virtualization
enhancements. These features are called dedicated shared
processors and the multiple virtual shared processors pools.

Discuss the details associated with the following IBM Power
Systems features:
- Dedicated shared processors running in donating mode
- Multiple shared processors pools (MSPPS)
Discuss how these features can improve processor resource
utilization

Machine exercises
References
SC24-7940-03 PowerVM Virtualization on IBM System p Introduction
and Configuration (Fourth Edition)
Copyright IBM Corp. 2010, 2011 Unit 3. Dedicated shared capacity and multiple shared processor 3-1
Instructor Guide
Unit objectives
Systems features:
Dedicated shared processors running in donating mode
Multiple shared processor pools (MSPPs)
utilization
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Topic 1: Dedicated shared processors

After completing this topic, you should be able to:
Describe the donation of unused dedicated capacity
Activate this feature through the HMC GUI or CLI
Monitor the CPU donating activity
Describe how to maximize donated idle cycles in the shared

processor pool
Figure 3-2. Topic 1: Dedicated shared processors AN313.1
Notes:

V5.4.0.3
Instructor Guide

Purpose Review the objectives for this topic.
Details Explain what well cover and what the students should be able to do at the end
of the topic.
Transition statement The first visual starts with reviewing what a dedicated processor
is.
Instructor Guide
Dedicated processors
Allocated as whole processors to a specific partition.
Same physical processors are used for that partition while it is running
Idle cycles are effectively wasted.
When partition is stopped, dedicated processors will go to
shared pool if the inactive option is checked for a partition.
Partition properties (or profile) LPAR
Dedicated
processor
Physical processors
(not in the shared pool)
Figure 3-3. Dedicated processors AN313.1
Notes:
Dedicated processors are whole physical processors allocated to a particular partition.
When the partition is shut down, the processors might return to the shared processing pool.
When the dedicated processor partition starts again, it will be allocated dedicated
processors, although the actual physical processors might be different from the last time it
was activated.
As of HMC Version 7, check the Allow when partition is inactive checkbox in the Processor
Sharing section of the partitions profile or properties to configure dedicated processors to
go to the shared processor pool when the partition is shut down. In previous versions of the
HMC, this box is labelled Allow idle processors to be shared.

V5.4.0.3
Instructor Guide

Purpose Describe the concept of dedicated processors.
Details Describe the use of dedicated processors.
Dedicated processor is one of two types of processors for POWER5, POWER6 and
POWER7 processor-based systems. The second type is covered next: shared processors.
Transition statement Lets look at what shared dedicated processors are.
Instructor Guide
Shared dedicated processors: Donating mode

Processors are dedicated, but will donate idle cycles to pool.
Excess capacity can be used by uncapped partitions.
Also known as donating mode.
Enable if the active option is checked for a partition.
Partition properties (or profile) LPAR
Dedicated
processor

LPAR idle cycles
donated to pool
Figure 3-4. Shared dedicated processors: Donating mode AN313.1
Notes:
This feature is available only on POWER6 and POWER7 processor-based servers for
partitions configured with dedicated processors. This function allows idle dedicated
processors to donate their cycles to the shared processor pool.
This feature is licensed as part of the PowerVM Standard and Enterprise Editions. It is
supported only on POWER6 and POWER7 Systems and available for AIX 5.3, AIX 6.1,
and Linux enterprise distributions.

V5.4.0.3
Instructor Guide

Purpose Describe shared dedicated processors, also known as dedicated processors,
running in donating mode.
Details
Transition statement Lets look closer at the shared dedicated processors.
Instructor Guide
POWER virtualization enhancement

Dedicated processors are still committed to the dedicated
processor partition.
The dedicated processor partition can reclaim its dedicated
processors at any time it needs them.
User benefits
A dedicated partition gets absolute priority for these excess cycles.
Sharing can occur only when:
The dedicated partition has not consumed all the CPU resources.
The uncapped partitions have consumed all of their entitled capacity.
Cycles of a dedicated processor are borrowed for a short time.
Figure 3-5. POWER virtualization enhancement AN313.1
Notes:
There are cases where dedicated processors are more efficient than shared processors;
for example, with applications where the requirement is not on throughput but on execution
time of very short code. Also, consider the following:
Memory affinity of dedicated partitions compared to shared type partitions.
Guaranteed performance characteristics: the dedicated partition ceding idle cycles is
not at the mercy of other partitions in the shared processor pool. At higher CPU
utilization, there is no donation from the dedicated processor partition to the shared
processor pool.
The operating system might have optimizations for dedicated processor partitions (for
example, some VMM tuning parameters related to the memory affinity that apply only to
dedicated logical partitions).

V5.4.0.3
Instructor Guide

Purpose
Details Dedicated processors in donating mode: When are they useful using AIX 5.3 or
6? Are there any differences between four CPUs dedicated shared LPAR and a shared
capped LPAR with ent=4.0?
Additional information Answers to these two questions are in the student notes.
Transition statement Lets look how to enable the donating mode.
Instructor Guide
Dedicated processors: Enabling donating mode

Logical partition donates idle cycles to the shared processor
pool when active.
Change the sharing mode dynamically in the partitions properties or
change it in the profile for the next activation.
Processors return The partition is

to the pool when using
partition shuts dedicated
down processors
While partition is
running, idle cycles
are donated to the
pool
Figure 3-6. Dedicated processors: Enabling donating mode AN313.1
Notes:
Enable donating mode

By default, donating mode is disabled. To enable dynamically, open the partitions
properties and go to the Processor tab. Check the Allow when partition is active.
checkbox. The change happens immediately, you do not have to reactive that partition.
To enable donating mode so that it persists across partition activations, make the same
change in the partitions profile.
Allow when partition is active

Select this option to make the dedicated processors available to the shared processor pool
when this logical partition is active. This makes the dedicated processor partition behave
similar to a capped shared processor LPAR.
Deselect this option to reserve the dedicated processors so that they are not made
available to the shared processor pool when this logical partition is active.

V5.4.0.3
Instructor Guide
Uempty You can apply the partition profile settings to the logical partition by activating the logical
partition using this partition profile. You can also directly change how the logical partition
shares dedicated processors by changing the logical partition properties. Direct changes to
the logical partition properties take effect immediately.
Allow when partition is inactive

This is a profile attribute that indicates whether the dedicated processors are made
available to the shared processor pool when the logical partition that is associated with this
partition profile is shut down.
Select this option to make the dedicated processors available to the shared processor pool
when this logical partition is shut down. If you allow the dedicated processors to be shared,
the dedicated processors are still committed to this logical partition while they are in the
shared processor pool. When you reactivate this logical partition, the dedicated processors
are removed from the shared processor pool and returned to this logical partition.
Deselect this option to reserve the dedicated processors so that they are not made
available to the shared processor pool when this logical partition is shut down. When you
test the performance of the shared processor pool, deselecting this option ensures the
accuracy of the tests.
You can apply the partition profile settings to the logical partition by activating the logical
partition using this partition profile. You can also directly change how the logical partition
shares dedicated processors by changing the logical partition properties. Direct changes to
the logical partition properties take effect immediately.
Instructor Guide
Instructor notes:
Purpose Describe how to configure shared dedicated processors.
Details
Transition statement We can also check the partitions donating mode by using the
HMC command-line interface.

V5.4.0.3
Instructor Guide
Uempty
Viewing the sharing mode

Utilization Data -> View task of HMC GUI or from the HMC
command line:
# lshwres -m costieres -r proc --level lpar filter
"lpar_names=lpar2" F lpar_name,pend_sharing_mode
lpar2,share_idle_procs_active
HMC command-line terminology:
Sharing mode is sharing idle processors when partition is shut down.
Donor mode is sharing idle processors when partition is active.
SHARING_MODE values Description
Disable sharing mode and
keep_idle_procs donor mode
share_idle_procs Enable sharing mode
share_idle_procs_active Enable donor mode only
Enable both sharing mode

share_idle_procs_always and donor mode
Figure 3-7. Viewing the sharing mode AN313.1
Notes:
You can view the sharing mode for a partition in three places on the HMC: from the HMC
command line, from the Utilization Data GUI application, and from the partitions properties.
The first two use the terminology shown in the table in the visual to distinguish which mode,
or combination of modes, the partition is using.
Instructor Guide
Instructor notes:
Purpose This visual shows how to understand the terminology used for the different
shared dedicated processor modes and to use the HMC command to view the current
setting.
Details
Transition statement The donating mode can also be seen in AIX monitoring tools.

V5.4.0.3
Instructor Guide
Uempty
Working with sharing/donor
mode from CLI (1 of 2)
The sharing/donor mode attribute can be set from the command line
when creating the profile.
# mksyscfg -r prof -m system12 -i
"name=prof2,lpar_name=lpar1,min_mem=256,desired_mem=1024,
max_mem=1024,proc_mode=ded,min_procs=1,desired_procs=1,
max_procs=1,sharing_mode=keep_idle_procs
# lssyscfg -r prof -m system12 --filter "lpar_names=lpar1 \

-F name,sharing_mode
prof2,keep_idle_procs
The sharing/donor mode attribute of a profile can be changed from the
command line.
# chsyscfg -r prof -m system12 -i name=prof2,lpar_name=lpar1
,sharing_mode=share_idle_procs_always
# lssyscfg -r prof -m system12 --filter "lpar_names=lpar1 \

-F name,sharing_mode
prof2,share_idle_procs_always
Figure 3-8. Working with sharing/donor mode from CLI (1 of 2) AN313.1
Notes:
When creating a logical partition profile, the sharing/donor mode can be specified. The
example in the last bullet shows how to modify the profile (prof2) with the
sharing_mode=share_idle_procs_always.
Instructor Guide
Instructor notes:
Purpose To show how to specify the sharing/donor mode when creating or editing a
partition profile.
Details
Transition statement The sharing/donor mode of a partition can be changed using the
chhwres HMC command.

V5.4.0.3
Instructor Guide
Uempty
Working with sharing/donor
mode from CLI (2 of 2)
Sharing/donor mode attribute of a partition can be changed
from the command line.
# chhwres -m costieres -r proc -o s -a
"sharing_mode=share_idle_procs_active -p lpar1
# lshwres -r proc -m costieres --level lpar filter \

"lpar_names=lpar1" -F lpar_name,pend_sharing_mode
lpar1,share_idle_procs_active
Sharing/donor mode attribute of the partition can be changed

even when the partition is running.
Figure 3-9. Working with sharing/donor mode from CLI (2 of 2) AN313.1
Notes:
The sharing/donor mode attribute of the partition can be changed even when the partition is
running. The logical partition profile attribute overrides the sharing/donor mode attribute of
the partition (inactive partition) when the partition is activated.
Instructor Guide
Instructor notes:
Purpose The sharing/donor mode of a partition can be changed using the HCM CLI
command.
Details The change in the partition properties takes effect immediately; there is no need
to shutdown then restart the logical partition.
Transition statement AIX monitoring tools provides information about the partitions
donating mode.

V5.4.0.3
Instructor Guide
Uempty
Viewing donating mode in AIX tools (1 of 2)

lparstat command reports donating mode in dedicated partition:
# lparstat -i
Node Name : lpar1
Partition Name : lpar1
Partition Number : 1
Type : Dedicated-SMT
Mode : Donating
Entitled Capacity : 2.00
Partition Group-ID : 32770
Shared Pool ID : -
Online Virtual CPUs : 2
Maximum Virtual CPUs : 2
Minimum Virtual CPUs : 1
Online Memory : 640 MB
Maximum Memory : 1024 MB
Minimum Memory : 512 MB
Variable Capacity Weight : -
Minimum Capacity : 1.00
Maximum Capacity : 2.00
Capacity Increment : 1.00
Maximum Physical CPUs in system : 8
Active Physical CPUs in system : 8
Active CPUs in Pool : -
Shared Physical CPUs in system : 0
Maximum Capacity of Pool : 0
Entitled Capacity of Pool : 0
Unallocated Capacity : -
Physical CPU Percentage : 100.00%
Figure 3-10. Viewing donating mode in AIX tools (1 of 2) AN313.1
Notes:
The visual shows the lparstat -i command output reporting the donating mode in a
dedicated partition. For dedicated partitions not donating idle cycles, the mode would be
Capped.
Instructor Guide
Instructor notes:
Purpose The lparstat command shows the partition is set to donating mode.
Details
Transition statement Lets look at some more tools that shows the donating mode.

V5.4.0.3
Instructor Guide
Uempty
Viewing donating mode in AIX tools (2 of 2)

mpstat command reports donating mode in dedicated partition:
# mpstat s 2 1
System configuration: lcpu=2 mode=Donating
Proc0
cpu0 cpu1
0.20% 0.17%
sar P ALL command reports donating mode in dedicated partition:

# sar -P ALL 2 1
AIX lpar1 3 5 00C958AF4C00 09/06/07

22:10:22 0 0 0 0 100 0.50
1 0 0 0 100 0.50
- 0 0 0 100 1.00
Figure 3-11. Viewing donating mode in AIX tools (2 of 2) AN313.1
Notes:
mpstat command
If simultaneous multithreading is enabled, and a dedicated processor partition is configured
to donate cycles, the mpstat -s command displays logical processors usage as shown in
the visual above. In the example, logical processor cpu0 is 0.20% busy and logical
processor cpu1 is 0.17%. cpu0 and cpu1 are hardware threads for the physical processor
proc0.
When a dedicated processor partition is not donating cycles, the logical processors
percentages would add up to 100% because they are entirely dedicated to that partition,
and the mode is listed as Capped.
sar -P ALL command

The sar -P ALL command shows processor activity for all logical, virtual, or dedicated
processors (depending on the partitions configuration). In the visual above, the processor
Instructor Guide
mode shows Donating. If a dedicated partition is not donating cycles, the mode is listed as
Capped.
The phsyc field shows the amount consumed by the physical processor. You might see a
change in the physc distribution once the partition starts donating cycles if simultaneous
multithreading is enabled. The example sar -P ALL 2 1 output shown in the visual above
is from a shared dedicated partition that is not actively donating cycles. It has one
dedicated processor with simultaneous multithreading enabled. When this same partition
was donating all of its excess cycles, the commands output was:
AIX lpar1 3 5 00C958AF4C00 09/07/07

03:29:03 0 0 0 0 100 1.00
1 0 13 0 87 0.00
- 0 0 0 100 1.00

V5.4.0.3
Instructor Guide

Purpose Describe two commands that will show whether a dedicated processor
partition is configured to donate idle cycles.
Details There is information on the output of mpstat and sar -P ALL in the student
notes.
Try to keep the topic focused on whats different for shared dedicated processors.
Transition statement Lets follow a donating scenario.
Instructor Guide
Viewing donating mode: HMC utilization data
Select the partition

in dedicated
processor mode
Figure 3-12. Viewing donating mode: HMC utilization data AN313.1
Notes:
From the window that displays the utilization events, select an Utilization Sample event
type. From this periodic utilization sample, you can select the information to display by
using the View menu. The view option related to the sharing mode is:
Partitions: Displays information about the processor and memory utilization on each
logical partition in the managed system.
The Partition Processor Utilization sample window shows lpar1 with a sharing mode of
when active, which means that lpar1 can donate idle cpu cycles to the shared pool when
active.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Dedicated processors donating scenario
1 2
200 200
175 175
150 150
125 125
Wasted Dedicated
100 1 core Dedicated 100
1 Core Dedicated
75 75
50 50
25 25
0 0
3 4
200 200
175 175
150 150
125 0.5 Uncapped 2 125 0.5 Uncapped 2
0.5 Uncapped 1 0.5 Uncapped 1
100 100
Wasted Dedicated Wasted Dedicated
75 75 1 Core Dedicated
1 Core Dedicated
50 50
25 25
0 0
Figure 3-13. Dedicated processors donating scenario AN313.1
Notes:
Scenario description
1. Consider a two-core Power System with a dedicated partition having one physical cpu
assigned. A variable workload is running (between 0% and 100%).
2. The excess capacity of the dedicated processor is wasted.
3. Consider adding two evenly weighted shared uncapped partitions with a capacity
entitlement of 0.5 with a desired number of virtual processor = 1. The two uncapped
partitions are CPU-bound. Each uncapped partition shares the remaining physical
processor even though each can consume an entire physical processor (desired
#VP=1).
4. With the shared dedicated processors PowerVM feature, a dedicated partition donates
its excess cycles to the uncapped partitions. Each uncapped partition consumes an
entire processor if available (when the dedicated partition consumption is at 0%) and
shares a processor when the dedicated partition is fully utilized (when the dedicated

V5.4.0.3
Instructor Guide
Uempty partition consumption is at 100%). The total processor capacity in the system is better
utilized while the dedicated processor partition maintains the performance
characteristics and predictability of the dedicated environment when
resource-constrained.
Processor utilization threshold

If a dedicated processor's utilization is less than this threshold value specified by the
schedo parameter ded_cpu_donate_thresh, the dedicated processor is donated for use by
other partitions when the processor is idle. If a dedicated processor's utilization is equal to
or greater than this threshold, the dedicated processor is not donated for use by other
partitions. By default, if the dedicated processor utilization is 80% (default
ded_cpu_donate_thresh value), the remaining 20% idle cycles are not ceded to the shared
processor pool.
Instructor Guide
Instructor notes:
Purpose Explain how this feature works. Most of the information is in the student notes.
Details When the dedicated processor's utilization is equal to or greater than 80
percent, AIX stops the donation to preserve the level of performance of the dedicated
partition. Giving the processor to another partition at this point could risk invalidating the
cache, and can have an adverse affect on the performance of the dedicated partition.
Transition statement Lets see how the different metrics in the lparstat and mpstat
commands relate to the donation of idle cycles.

V5.4.0.3
Instructor Guide
Uempty
Dedicated idle cycles donation: New metrics

New metrics in lparstat / mpstat / topas / sar performance commands
/ # lparstat -d 2
System configuration: type=Dedicated mode=Donating smt=On lcpu=2 mem=2048
%user %sys %wait %idle physc %idon %bdon %istol %bstol
----- ----- ------ ------ ----- ------ ------ ------ ------
0.0 0.2 0.0 99.7 0.00 99.6 0.0 0.0 0.0
0.0 0.2 0.0 99.8 0.00 99.6 0.0 0.0 0.0
0.0 0.2 0.0 99.8 0.00 99.7 0.0 0.0 0.0
0.0 0.2 0.0 99.8 0.00 99.6 0.0 0.0 0.0
0.0 0.2 0.0 99.8 0.00 99.7 0.0 0.0 0.0
New physc column
displayed %idon : Percentage of
%bdon: Percentage of
idle cycles donated
busy time donated
/ # mpstat -h 2
cpu pc ilcs vlcs idon bdon istol bstol
0 0.00 0 151 99.4 0.0 0.0 0.0
1 0.00 0 106 99.8 0.0 0.0 0.0 %istol and %bstol
ALL 0.00 0 257 99.6 0.0 0.0 0.0 : stolen cycles by
---------------------------------------------------- PHYP to run
0 0.00 0 145 99.5 0.0 0.0 0.0 maintenance tasks
1 0.00 0 101 99.8 0.0 0.0 0.0
ALL 0.00 0 246 99.7 0.0 0.0 0.0
Figure 3-14. Dedicated idle cycles donation: New metrics AN313.1
Notes:
Some new metrics are added when a dedicated type partition is in donating mode.
physc column
While the %user, %kernel, %wait, and %idle columns stay relative to partition capacity, a
new physc column shows the actual physical processor consumption. The physc statistics
were displayed only when the partition type was shared on POWER5. It is now available for
dedicated LPARs on POWER6 and POWER7.
%idon and %bdon columns
Two new columns, %idon and %bdon, are related to the donated cycles.
%idon:
Shows the percentage of physical processor used while explicitly donating idle cycles.
This metric is applicable only for donating dedicated partitions.
%bdon:
Shows the percentage of physical processor used while busy cycles are being donated.
This metric is applicable only for donating dedicated partitions.
Instructor Guide
This donation occurs when the dedicated partition calls hcede and when the donation
is allowed. This time is seen as %idon.
It is also possible (but rather rare) to give some cycles when busy. This can happen
when the partition is blocked because all its logical processors are in the hypervisor
waiting for a page fault resolution. In this case, the PHYPE or Hypervisor can force a
donation. This time appears as %bdon.
Stolen cycles columns in performance monitoring commands lparstat and mpstat.
These cycles are stolen by POWER Hypervisor from a dedicated partition to run
maintenance tasks (hypervisor overhead). This can happen whether or not donation is
enabled.

V5.4.0.3
Instructor Guide

Purpose Discuss the new metrics, such as physc and %idon of the lparstat and mpstat
commands.
Details
Transition statement Lets discuss the processor folding feature, and the schedo
parameter vpm_fold_policy.
Instructor Guide
Processor folding
Processor folding can be enabled to improve the amount of
cycles that are donated.
New schedo parameter vpm_fold_policy
Enable/disable the AIX processor folding feature depends on the partition
type
There are 3 bits in vpm_fold_policy to control processor folding.
> Bit 0 (0x1): When set to 1, this bit indicates processor folding is enabled if the
partition is using shared processors.
> Bit 1 (0x2): When set to 1, this bit indicates processor folding is enabled if the
partition is using dedicated processors.
> Bit 2 (0x4): When set to 1, this bit disables the automatic setting of
processor folding when the partition is in static power-saving mode.
These bit values can be combined to form the desired value.
vpm_fold_policy=1 is default value
Figure 3-15. Processor folding AN313.1
Notes:
The processor folding feature has been available on shared processors since AIX 5.3 ML3
and has helped provide optimal virtual CPU scheduling. This feature is now available for
shared dedicated processors.
The tuning of this scheduling algorithm can be done using the AIX schedo command. The
value of the parameter vpm_xvcpus specifies the number of virtual processors to enable in
addition to the virtual processors required to satisfy the workload.
A new schedo parameter, vpm_fold_policy, affects of the virtual processor management
feature of processor folding in a logical partition. The virtual processor management feature
of processor folding can be enabled or disabled based on whether a partition has shared or
dedicated processors. When the partition is in static power- saving mode, processor folding
is automatically enabled for both shared or dedicated processor partitions.

V5.4.0.3
Instructor Guide
Uempty Some interesting values are:

0 = Processor folding feature is disabled.
1 = Default value: Processor folding is enabled for shared processors logical partition
and disabled for dedicated processors logical partition.
2 = Processor folding is disabled if shared processors logical partition and enabled for
dedicated processors logical partition.
3 = Processor folding feature is enabled regardless of the partition type.
Instructor Guide
Instructor notes:
Purpose Describe the new schedo parameter vmp_fold_policy.
Details The virtual processor folding feature on shared processors partitions is well
known and has been implemented starting at the end of 2005 in AIX. Dedicated processors
can also activate this feature to optimize the idle cycles given to the shared processor pool.
Transition statement Lets see an example of how we could maximize the idle cycles
donated to the shared processor pool.

V5.4.0.3
Instructor Guide
Uempty
Processor folding: Maximizing
the idle capacity (1 of 2)
Example: Looking at the dedicated partition CPU activity output of
topas -L
vpm_fold_policy = 0
Interval: 2 Logical Partition: Thu Feb 7 13:24:44 2008
Donating SMT OFF Online Memory: 40960.0
Partition CPU Utilization Online Logical CPUs: 4
%user %sys %wait %idle %hypv hcalls %istl %bstl %idon %bdon vcsw
0 2 0 52 0.0 6025 0.0 0.0 26.3 0.0 2181
=========================================================
LCPU minpf majpf intr csw icsw runq lpa scalls usr sys _wt idl All four
Cpu0 2805 0 119 51 311 0 99 222 1 49 0 50 physical CPUs
Cpu1 2806 0 117 59 422 0 99 173 0 44 0 56 busy
Cpu2 2803 0 107 39 737 0 97 124 0 55 0 44
Cpu3 2803 0 107 39 737 0 97 124 0 43 0 57
Looking at the shared processor pool cycles, output of lparstat 10

System configuration: type=Shared mode=Uncapped smt=off
lcpu=8 mem=46080 psize=8 ent=4.00 Shared partition :
app column : idle cycles
%user %sys %wait %idle physc %entc lbusy app vscw phint
available in shared pool
-------- ------- -------- ------- -------- -------- ------ ----- ------ -----
0.0 0.1 0.0 99.9 0.00 0.1 0.0 5.10 386 0
0.0 0.1 0.1 99.7 0.00 0.2 0.1 5.07 449 0
0.0 0.1 0.1 99.8 0.00 0.1 0.1 5.12 438 0
0.0 0.1 0.0 99.9 0.00 0.1 0.0 5.10 387 0
Figure 3-16. Processor folding: Maximizing the idle capacity (1 of 2) AN313.1
Notes:
Here is an example showing the impact on the amount of idle processing cycles available
in the shared processor pool when the processor folding feature in enabled or disabled in a
dedicated partition running in donating mode.
Consider two partitions. The first one is a dedicated partition running in donating mode with
4 physical processors assigned to it. The schedo parameter vpm_fold_policy is set to 0,
meaning that we disabled the folding policy.
The second partition is a shared-type partition running in uncapped mode, with an
entitlement of 4.0 and four virtual processors configured.
We have started a multi-threaded application (which means that lots of threads are
running) on the dedicated partition. Looking at the topas -L output, we can notice all four
processors loaded at about 50%.
Looking at the lparstat command output on the shared partition, we see the amount of idle
processing cycles available in the shared processor pool. This value can be seen in the
app column and is approximately 5.10.
Instructor Guide
Instructor notes:
Purpose Discuss the scenario in which a dedicated partition donates idle cycles to the
shared processor pool, while the processor folding feature is disabled.
Details Point out the activity is dispatched on the four processors.
Transition statement Lets see what happens when we activate the processor folding.

V5.4.0.3
Instructor Guide
Uempty
Processor folding: Maximizing
the idle capacity (2 of 2)
Example: Looking at the dedicated partition CPU activity output of
topas L
vpm_fold_policy = 2
Interval: 2 Logical Partition Thu Feb 7 13:24:44 2008
Donating SMT OFF Online Memory:40960.0
Partition CPU Utilization Online Logical CPUs: 4
%user %sys %wait %idle %hypv hcalls %istl %bstl %idon %bdon vcsw
0 2 0 52 0.0 6025 0.0 0.0 26.3 0.0 2181
=========================================================
LCPU minpf majpf intr csw icsw runq lpa scalls usr sys _wt idl Only three
Cpu0 1290 0 119 51 311 0 99 222 1 58 0 41 physical CPUs
Cpu1 0 0 117 0 0 0 0 0 0 0 0 100 busy
Cpu2 2803 0 107 39 737 0 97 124 0 56 0 44
Cpu3 2803 0 107 39 737 0 97 124 0 43 0 57
Looking at the shared processor pool cycles, output of lparstat 10

System configuration: type=Shared mode=Uncapped smt=off
lcpu=8 mem=46080 psize=8 ent=4.00 Shared partition :
app column : idle cycles
%user %sys %wait %idle physc %entc lbusy app vscw phint available increased in
-------- ------- -------- ------- -------- -------- ------ ----- ------ ----- shared pool
0.0 0.0 0.0 100 0.00 0.1 0.0 5.61 377 0
0.1 0.2 0.1 99.5 0.01 0.4 0.0 5.67 396 0
0.0 0.1 0.0 99.9 0.00 0.1 0.0 6.10 389 0
0.0 0.1 0.1 99.7 0.01 0.2 0.1 5.75 446 0
Figure 3-17. Processor folding: Maximizing the idle capacity (2 of 2) AN313.1
Notes:
On the dedicated partition, the folding mechanism is enabled by setting the processor
folding policy to 2 (schedo -o vpm_fold_policy=2).
Notice in the topas -L command output three of the four processors. One processors has
no thread dispatched on it. The folding mechanism estimated that only three processors
were enough to run the application and folded one processor.
Looking at the lparstat command output on the shared partition, you can see the amount
of idle cycles in the pool increased (5.60 average). The previous amount of idle cycles
available was about 5.10.
Instructor Guide
Instructor notes:
Purpose
Details
Transition statement Its time for a checkpoint.

V5.4.0.3
Instructor Guide
Uempty
Checkpoint
1. True or False: Dedicated processors can be shared only if
they are idle.
2. True or False: Only uncapped partitions can use idle cycles

donated by the dedicated processors.
3. Which of the following sharing_mode values allow excess

processor cycles to be donated to the shared processor pool
from an active partition only?
a. keep_idle_procs
b. share_idle_procs
c. share_idle_procs_active
d. share_idle_procs_always
Notes:
Instructor Guide
Instructor notes:
Purpose
Details
Checkpoint solutions
1. True or False: Dedicated processors can be shared only if they are
idle.
The answer is true.
2. True or False: Only uncapped partitions can use idle cycles donated
by the dedicated processors.
The answer is true.
3. Which of the following sharing_mode values allow excess processor

cycles to be donated to the shared processor pool from an active
partition only?
a. keep_idle_procs
b. share_idle_procs
The answer is share_idle_procs_active.

V5.4.0.3
Instructor Guide
Uempty
Topic 1: Summary
Having completed this topic, you should be able to:
Describe the donation of unused dedicated capacity concept
Activate this feature through the HMC GUI or CLI
Monitor the LPAR CPU activity when donating
Describe how to maximize donated idle cycles to the shared

processor pool
Figure 3-19. Topic 1: Summary AN313.1
Notes:
Instructor Guide
Instructor notes:
Purpose
Details
Transition statement Lets see Topic 2.

V5.4.0.3
Instructor Guide
Uempty
Topic 2: Multiple shared processor pools

Describe the multiple shared processor pool feature
Discuss how this feature lowers the cost for licensing
Monitor multiple shared processor pools
Configure an uncapped LPAR to be limited to a virtual shared

processor pool
Figure 3-20. Topic 2: Multiple shared processor pools AN313.1
Notes:
Instructor Guide
Instructor notes:
Purpose Give an overview of multiple shared processor pools (MSPPs).
Details
Transition statement Lets discuss the multiple shared processor pools feature.

V5.4.0.3
Instructor Guide
Uempty
What are multiple shared processor pools?

Provides a means to control the processor capacity that can be
consumed from the physical shared processor pool
Set resource limits on groups of LPARs
Reduce licensing costs
Up to 64 shared processor pools

(default plus up to 63 additional)
Figure 3-21. What are multiple shared processor pools? AN313.1
Notes:
Multiple shared processor pools are supported by POWER6 and POWER7 processor-
based systems. This allows the system administrator to create a set of micro-partitions and
control the processor capacity consumed from the physical shared processor pool. Each
shared-processor pool has an associated entitled pool capacity, which is consumed by the
set of micro-partitions in that shared processor pool.
This feature allows for automatic, non-disruptive balancing of processing power between
partitions assigned to the shared pools. The result is increased throughput and potentially a
reduction of processor-based software licensing costs. This feature is licensed through
PowerVM Standard or Enterprise Edition along with a POWER6 and POWER7
processor-based server.
Default shared processor pool
All IBM Power Systems support the multiple shared processor pools capability and have a
minimum of one (the default) shared processor pool and up to a maximum of 64 shared
processor pools.
Instructor Guide
Instructor notes:
Purpose Give an overview of MSPPs.
Details Introduce the default shared processor pool feature and its purpose.
Transition statement Lets discuss the details of the multiple shared processor pools
feature.

V5.4.0.3
Instructor Guide
Uempty

Up to 64
Shared processor pooln
Shared processor pool1 Set of micro-partitions shared
Shared processor P\pool0 processor
LPAR7 pools
LPAR1 LPAR3 LPAR8
LPAR2 LPAR4
LPAR5
LPAR6
Shared
Dedicated
Physical
processors

Figure 3-22. Multiple shared processor pools AN313.1
Notes:
Shared processor pools
A shared processor pool is primarily for the purpose of controlling the processor capacity
that micro-partitions can consume from the physical shared processor pool.
The set of micro-partitions form a unit through which processor capacity from the physical
shared-processor pool can be managed.
Each shared processor pool has a maximum capacity associated with it. This defines the
upper boundary of the processor capacity that can be utilized by the set of micro-partitions
in the shared processor pool.
The physical shared processor pool is a set of physical processors used to run a set of
micro-partitions. There is a maximum of one physical shared processor pool on IBM Power
Systems. All active physical processors are part of the physical-processor pool unless they
are assigned to a dedicated-processor partition where:
The LPAR is active and is not capable of capacity donation, or
The LPAR is inactive (powered-off) and the systems administrator has chosen not to
make the processors available for shared processor work.
Instructor Guide
Instructor notes:
Purpose Describe the terminology and concepts of multiple shared processor pools.
Details Discuss the physical shared processor pool and the maximum pool capacity.
Transition statement Lets see an example of how the multiple shared processor pools
can affect licensing.

V5.4.0.3
Instructor Guide
Uempty
Multiple shared processor pool: Example
Shared processor pool 1 Shared processor pool 2

Max Cap: 5 processors Max Cap: 6 processors
Only license the relevant software based on shared pool Max Cap.
DB2 cores to license:
One from dedicated partition n2 plus five from pool # 1 = 6
WebSphere cores to license:
Six from pool #2 = 6
Figure 3-23. Multiple shared processor pool: Example AN313.1
Notes:
In this example, the system has 12 CPUs and is configured with eight LPARs; three
dedicated-processor and five shared-processor LPARs. Two of the shared-processor
LPARs are assigned to pool #1 and the other three are assigned to pool #2.
The shared processor pool limits must be whole numbers. All are competing for the same
physical CPUs. The pools have equal priority, and there is not a way to change this. The
mechanism is simply limiting how many excess CPU cycles or how much excess capacity
a group of LPARs can use.
If we look at the DB2 cores to license, the shared processor pool 1 has a max capacity
value to 5, thus limiting the maximum processor cycles to be consumed to five processors.
This reduces the DB2 licensing needed in partitions n5 and n6 to five instead of nine
(without using an additional shared processor pool, we would have needed nine DB2
licenses to match with the number of shared processors in the pool).
Instructor Guide
Instructor notes:
Purpose
Details
Additional information This mechanism does not affect the processor affinity logic.
Transition statement The next slide compares a traditional CPU consumption of a
shared partition in the default pool with a shared partition in a user-defined shared pool.

V5.4.0.3
Instructor Guide
Uempty
CPU consumption for uncapped partitions

Shared processor pool (SPP) usage:
Default
Partition2
SPP
Partition1
SPP1
Max Cap Partition2
Partition1
Figure 3-24. CPU consumption for uncapped partitions AN313.1
Notes:
The first image depicts the utilization of the default physical shared processor pool. This
shows the pool utilization for two partitions with very demanding applications. The partitions
are uncapped and totally consuming all available CPU resources.
In the second image, Partition1 was moved to a separate Shared Processor Pool (SPP1)
with a defined maximum capacity value. This is possible with the new virtual shared
processor pools. This value will limit the amount of available CPU resources usable by the
partitions assigned to the pool. This is a way to set a cap for uncapped LPARs.
Instructor Guide
Instructor notes:
Purpose Discuss how the max capacity value limits the overall CPU consumption inside
a user-defined shared pool.
Details
Transition statement The following slides detail the CPU usage in a user-defined
shared pool.

V5.4.0.3
Instructor Guide
Uempty
CPU usage in a user-defined
shared processor pool
Max
capacity Remaining
Optional
Optional Optional
Entitled pool Reserved
capacity Partition1
Reserved
Reserved
Capacity
entitlement
Assigned partition(s)
Assigned partition(s)
entitlement
entitlement
Figure 3-25. CPU usage in a user-defined shared processor pool AN313.1
Notes:
Maximum pool capacity, reserved pool capacity, and entitled pool

capacity
Each processor pool has a maximum capacity associated with it that defines the upper
boundary of the processor capacity that can be utilized by the set of micro-partitions in the
shared processor pool. This must be a whole number of processing units.
The reserved pool capacity is an entitled capacity inside a shared processor pool reserved
for the express use of the micro-partitions in the shared processor pool. The reserved pool
capacity is distributed among uncapped micro-partitions in the shared processor pool
according to their uncapped weighting. This reserved pool capacity is in addition to the
processor capacity entitlements of the individual micro-partitions inside the pool.
The entitled capacity of a shared processor pool defines the guaranteed processor
capacity that is available to the group of micro-partitions in the shared processor pool. The
entitled pool capacity is the sum of the entitlement capacities of the micro-partitions in the
shared processor pool plus the reserved pool capacity.
Instructor Guide
Instructor notes:
Purpose Introduce the different concepts: The maximum pool capacity and reserved
pool capacity.
Details Inside a shared pool, only uncapped partitions can use the reserved pool
capacity. Each pool is a separate entity where cycles are ceded and then redistributed to
the logical partitions inside the pool. This is the level0 of capacity resolution.
Transition statement The two levels of capacity resolution are discussed in the next
slide.

V5.4.0.3
Instructor Guide
Uempty
Virtual shared processor pools: Resolution level
Capacity Global capacity resolution between pools
resolution
between all
pools
Max pool capacity

Uncapped
Reserved pool capacity
Capped
Capped
Uncapped
Uncapped
Capped
Capped
SPP 0 SPP 1
Capacity SPP0 capacity SPP1 capacity

resolution resolution
resolution
inside pool POWER Hypervisor
P P P P P P P P P P

Figure 3-26. Virtual shared processor pools: Resolution level AN313.1
Notes:
Pool capacity resolution

Unused capacity in the shared processor pool is redistributed to uncapped micro-partitions
within the shared processor pool if needed. This redistribution of idle cycles to any eligible
micro-partition within the same shared processor pool is also called the Level0 capacity
resolution.
When all level0 capacity have been resolved within each shared processor pool, the
POWER Hypervisor can redistribute unused processor cycles to any eligible
micro-partitions regardless of the multiple shared processor pools structure. For allocating
additional processor capacity, the POWER Hypervisor takes the uncapped weights of all
the micro-partitions in the system into account. This capacity resolution is also called
Level1 capacity resolution.
Instructor Guide
Instructor notes:
Purpose Introduce the two levels of capacity resolution.
Details If the logical partitions inside a shared pool do not consume all their cycles, or
even if cycles from the reserved pool capacity are not used, then these cycles can be given
to other partitions in other shared pools. The redistribution of these cycles is done by the
POWER Hypervisor, by taking into account all the weight of all the partitions in all the
shared pools. There is no weight associated to a specific shared pool.
Transition statement Lets see the requirements to configure multiple shared
processor pools.

V5.4.0.3
Instructor Guide
Uempty
Hardware and software requirements

Hardware
POWER6 or later
HMC-managed
IVM-managed
Software
eFW 3.2 or later
AIX 5.3 TL 7 or later

AIX 6.1
Figure 3-27. Hardware and software requirements AN313.1
Notes:
If your managed system's firmware is not at 01EM320_31_31 (as shown in the figure), it
cannot have more than one shared processor pool. You can download the required
firmware from http://www14.software.ibm.com/webapp/set2/firmware/gjsn.
Instructor Guide
Instructor notes:
Purpose Point out the minimum firmware level required for configuring MSPPs.
Details
Transition statement The next series of slides show how to configure MSPPs.

V5.4.0.3
Instructor Guide
Uempty
Configuring multiple shared pools

Verify system capability
Up to 64 pools
Default pool is pool 0
Attributes can be dynamically changed

Maximum capacity
Reserved entitled capacity
LPARs assigned dynamically
Figure 3-28. Configuring multiple shared pools AN313.1
Notes:
The managed system's Properties is the best place to verify that the system supports
multiple shared pools.
The maximum number of shared processor pools has a value of 64 if your system is
capable, but has a value of 1 if your system is not.
The maximum capacity of the pool is the maximum number of processing units available to
this LPAR's shared processor pool. Reserved entitled capacity of the pool is the number of
processing units that this LPAR's shared processor pool is entitled to receive.
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Managed system properties
Figure 3-29. Managed system properties AN313.1
Notes:
If the managed system supports multiple virtual shared pools, the Maximum number of
shared processor pools is 64 (instead of 1). Also, the Partition Processor Usage includes a
Shared Processor Pool (ID) column.
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Change attributes of shared
processor pools (1 of 3)
Select the
managed system.
Figure 3-30. Change attributes of shared processor pools (1 of 3) AN313.1
Notes:
From the HMC GUI, you must select Shared Processor Pool Management to configure the
shared pool attribute values. From here, you could also assign an LPAR to a specific
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Notes:
The 64 shared processor pools are already defined on your managed system, but only the
pool id 0 (the default pool) is activated.
To change the Reserve Processing Units or the Maximum Processing Units, click the link
associated with the pool name.
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Specify the Maximum processing units value and optionally
the Reserved processing units value.
Notes:
The Maximum processing units must be a whole number. The Reserved processing units is
optional to activate a user-defined shared processor pool.
Instructor Guide
Instructor notes:
Purpose
Details
Transition statement You can dynamically assign an LPAR to a new shared pool.

V5.4.0.3
Instructor Guide
Uempty
Changing the LPAR shared pool assignment

Can be changed dynamically
Select the shared

processor pool.
Figure 3-33. Changing the LPAR shared pool assignment AN313.1
Notes:
The LPARs pool assignment can be changed dynamically. Under the Manage Shared
Processor Pools, select the Partitions tab and click the link associated with the LPAR. You
then see an Assign Partition to a Pool dialogue box. Select the pool from the Pool Name
pull-down list.
Instructor Guide
Instructor notes:
Purpose
Details
Transition statement The next few visuals show how to view and monitor the multiple
shared processor pools.

V5.4.0.3
Instructor Guide
Uempty
Viewing shared pools in AIX tools

lparstat command reports shared pools information:
# lparstat -i
Node Name : lpar4
Partition Number : 1
Type : Shared-SMT
Mode : Uncapped
Shared Pool ID : 1
Online Virtual CPUs : 1
Minimum Virtual CPUs : 1
Variable Capacity Weight : 128
Maximum Physical CPUs in system : 4
Active Physical CPUs in system : 4
Active CPUs in Pool : 1
Shared Physical CPUs in system : 4
Unallocated Capacity : 0.00
Figure 3-34. Viewing shared pools in AIX tools AN313.1
Notes:
The lparstat -i command shows Pool ID, Maximum Capacity of Pool, and Entitled
Capacity of Pool.
The following are the new shared processor pool fields in the lparstat command output:
Shared Pool ID: Identifier of shared pool of physical processors that this LPAR is a
member.
Maximum Capacity of Pool: This is the maximum number of processing units
available to this LPARs shared processor pool.
Entitled Capacity of Pool: This is the number of processing units that this LPARs
shared processor pool is entitled to receive.
Active CPUs in Pool: This is the maximum number of CPUs available to this LPAR's
Instructor Guide
Instructor notes:
Purpose To identify the lparstat command output changes.
Details
Transition statement Lets view the shared pool configuration from the HMC CLI.

V5.4.0.3
Instructor Guide
Uempty
Viewing shared pools from HMC CLI
From HMC CLI

> lshwres -r procpool -m <managed_system>
hscroot@r11s1hmc:~> lshwres -r procpool -m costieres

name=DefaultPool,shared_proc_pool_id=0,"lpar_names=lpar3,AMS_I
5OSB,AMS_I5OSA,vios,lpar2","lpar_ids=6,3,2,1,4"
name=SharedPool01,shared_proc_pool_id=1,max_pool_proc_units=1.
0,curr_reserved_pool_proc_units=0.3,pend_reserved_pool_proc_un
its=0.3,lpar_names=lpar4,lpar_ids=7
> lshwres -r proc -m <managed_system> --level sys

hscroot@r11s1hmc:~> lshwres -r proc -m costieres --level sys
configurable_sys_proc_units=4.0,curr_avail_sys_proc_units=0.2,
pend_avail_sys_proc_units=0.2,installed_sys_proc_units=4.0,max
_capacity_sys_proc_units=deprecated,deconfig_sys_proc_units=0,
min_proc_units_per_virtual_proc=0.1,max_virtual_procs_per_lpar
=64,max_procs_per_lpar=64,max_shared_proc_pools=64
Figure 3-35. Viewing shared pools from HMC CLI AN313.1
Notes:
Using the HMC CLI, you can list the different shared processor pools with logical partitions
assigned. The example shows the default shared pool with the lpars id 6,3,2,1, and 4
assigned to it, and then it shows the shared processor pool id 1 with the lpar id 7 assigned
to it.
The second lshwres command output shows the maximum number of shared processor
pools configurable to 64.
Instructor Guide
Instructor notes:
Purpose To identify the lparstat command output changes.
Details
Transition statement topas can be used to monitor multiple shared processor pools on
managed systems.

V5.4.0.3
Instructor Guide
Uempty
Monitoring shared pools: AIX tools

topas -C
Shared pools section enabled with p on any CEC view
Cursor and f key focuses on one pool
Topas CEC Monitor Interval: 10 Tue May 12
12:47:22 2009
Shr: 1 Mon:3.0 InUse:1.7 Shr:0.2 PSz:4 Don: 0.0 Shr_PhysB 0.01
Ded: 1 Avl: - Ded: 0 APP:4.0 Stl: 0.0 Ded_PhysB 0.00
pool psize entc maxc physb app mem muse
------------------------------------------------
0 4.0 250.0 400.0 0.0 0.00 1.5 0.9
1 1.0 40.0 100.0 0.0 0.00 1.5 0.8
Host OS M Mem InU Lp Us Sy Wa Id PhysB Vcsw Ent %EntC PhI

pmem
-------------------------------shared--------------------------------
lpar4 A61 U 1.5 0.8 2 0 2 0 96 0.01 324 0.10 6.4 0
Host OS M Mem InU Lp Us Sy Wa Id PhysB Vcsw %istl %bstl

------------------------------dedicated------------------------------
Figure 3-36. Monitoring shared pools: AIX tools AN313.1
Notes:
topas pool section

pool: Pool ID number
psize: Pool size
entc: Entitlement
maxc: Pool maximum capacity (the value is the number of 1/100 processing units)
physc: Physical capacity consumed (how many physical processors are consumed)
physb: Physical processor busy (comparable to physc but excludes the idle time)
app: Available pool processors
mem: Memory
muse: Memory in use
The shared hosts list can be manipulated to show or focus on the host associated with a
particular pool. This is accomplished by using the up and down cursor keys to select the
pool number and then pressing the f key.
Instructor Guide
Instructor notes:
Purpose To examine the topas command changes.
Details
Transition statement HMC utilization data reports also the shared processor pools
utilization.

V5.4.0.3
Instructor Guide
Uempty
Monitoring shared pools: HMC utilization data
Physical processor pool

size and utilization %
Shared
processor
pools and
utilization %
Figure 3-37. Monitoring shared pools: HMC utilization data AN313.1
Notes:
From the window that displays the utilization events, select a Utilization Sample event
type. From this periodic utilization sample, you can select the information to display by
using the View menu. The two possible View options related to the processor pools are:
Physical Processor Pool, which displays information about the total processor utilization
within all shared processor pools on the managed system.
Shared Processor Pool, which displays information about the processor utilization
within each configured shared processor pool on the managed system.
The Shared Processor Pool Utilization window example shows two shared processor
pools. The SharedPool01 has 1 processing unit assigned (the max cap value is set to 1).
The processor utilization is about 100%, which means that the LPARs running in that
shared pool consume all of the processor capacity assigned.
The Physical Processor Pool Utilization window shows a Processing Unit value of three. If
you look at the Configurable processing units value in the System Utilization window, you
will see a value of four (we have four physical processors in the managed system). The
physical processor pool contains only three processors because one physical processor
has been assigned to a dedicated partition.
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Checkpoint
1. True or False: Each shared processor pool has a maximum
capacity associated with it.
2. True or False: The default shared processor pool does not

have a number.
3. What is the default value of the reserved pool capacity for a

shared processor pool?
4. Which command can be used to display the entitled capacity

of all shared processor pools?
Notes:
Instructor Guide
Instructor notes:
Purpose
Details
The answer is true.
have a number.
The answer is false (default shared pool ID = 0).
The answer is the default value is 0.

The answer is topas C (using p to enable display of pool
section).

V5.4.0.3
Instructor Guide
Uempty
Topic 2: Summary
Describe the multiple shared processor pool feature
Discuss how this feature lowers the cost for licensing
Monitor multiple shared processor pools
Configure an uncapped LPAR to be limited to a virtual shared

processor pool
Notes:
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Exercise
Unit
exerc
ise
Notes:
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Unit summary
Systems features:
Dedicated shared processors running in donating mode
utilization
Notes:
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty Unit 4. Active Memory Sharing
Estimated time
01:30

This unit describes the Active Memory Sharing feature concepts.

Describe the Active Memory Sharing (AMS) concepts and
components
Describe the POWER Hypervisor paging activity
Create a shared memory pool
Create and manage the AMS paging space devices
Create and activate a shared memory partition
Describe the Virtual I/O Server virtual devices involved in AMS
Monitor the shared memory partition using the AIX performance
tools vmstat, lparstat, and svmon
Monitor the shared memory pool usage using data utilization from
the HCM

Machine exercises
References
POW03026USEN.pdf PowerVM Active Memory Sharing: An
Overview
REDP-4470 PowerVM Virtualization Active Memory sharing
Copyright IBM Corp. 2010, 2011 Unit 4. Active Memory Sharing 4-1
Instructor Guide
Unit objectives
Describe the Active Memory Sharing concepts and
components
Monitor the shared memory partition using the AIX
performance tools vmstat, lparstat, and svmon
Monitor the shared memory pool usage using data utilization
from the HMC
Notes:
These are the unit objectives

V5.4.0.3
Instructor Guide

Purpose
Details
Transition statement What is AMS?
Instructor Guide
What is PowerVM Active Memory Sharing?

PowerVM virtualization technology added in May 2009.
A pool of physical memory is shared by a group of logical
partitions.
POWER Hypervisor dynamically allocates physical memory from the
pool to the partition based on demand.
Memory allocation is done at the page level granularity, not at the
system LMB.
It supports over-commitment of physical memory with overflow
going to a paging device.
It is analogous to the shared processor partitions that share
the processor resources available in a pool of processors.
Figure 4-2. What is PowerVM Active Memory Sharing? AN313.1
Notes:
As the number of CPUs available on a system increases and CPU virtualization techniques
such as Power Micro-Partitioning allow better utilization of processors across partitions, the
role of memory management in server virtualization is becoming more important.
IBM Power Virtualization Manager (IBM PowerVM) Active Memory Sharing (AMS)
technology takes PowerVM virtualization to a new level of consolidation and virtualization
by optimizing memory utilization. AMS intelligently shares memory by dynamically moving
it from one partition to another on demand, thereby optimizing memory utilization and
allowing flexibility of memory usage.
Because memory utilization can be linked to processor utilization, this function
complements shared processors very well. Systems with low CPU requirements are very
likely to have low memory residency requirements as well.

V5.4.0.3
Instructor Guide
Uempty Active Memory Sharing overview

PowerVM Active Memory Sharing (AMS) allows shared memory partitions to share
memory from a single pool of shared physical memory. Each partitions configured memory
(minimum, desired, and maximum) is the logical memory assigned to a partition. The
Hypervisor maps a range of physical memory to the partitions logical memory. This range
of physical memory is dedicated to the partition in a dedicated memory partition.
In an AMS environment, a partitions logical address can be mapped to any physical
memory that is part of the shared pool. As a result, physical memory assigned to one
partition at one time can be assigned to another partition at another time. That is, in an
AMS environment, memory is not dedicated to a single partition; it can be mapped to any
logical address of any shared memory partition.
Over-commitment of physical memory

Users can define a partition with a logical memory size larger that the available physical
memory. They can also activate a set of partitions whose aggregate logical memory size
exceeds the available physical memory.
Instructor Guide
Instructor notes:
Purpose Introduce the PowerVM Active Memory sharing concept.
Details The memory allocation is done at the memory page level (4k pages) instead of
at the LMB size when using a dynamic LPAR operation to add or remove memory to
partitions. These 4k pages movements are done automatically by AMS.

V5.4.0.3
Instructor Guide
Uempty
Dedicated and shared memory types

Logical partitions can have either dedicated or shared memory
assigned.
LPAR1 LPAR2 LPAR3 LPAR4 LPAR5
POWER Hypervisor
LPAR1 LPAR2 Virtual

Unused AMS shared
Dedicated Dedicated I/O
memory memory pool
memory memory Server
Physical memory
One paging device per

shared memory LPAR Paging devices
Figure 4-3. Dedicated and shared memory types AN313.1
Notes:
Dedicated memory and shared memory

The physical memory of an IBM Power System can now be assigned to multiple logical
partitions either in a dedicated or in a shared mode. In addition to traditional dedicated
memory assignments to single partitions, the system administrator can now assign some
physical memory to a pool that is shared among a set of logical partitions. In a shared
memory environment, at least one partition has to be using the dedicated memory mode;
this is the virtual I/O server.
When using a dedicated memory mode, it is the administrator who manually performs a
dynamic memory reconfiguration.
When using a shared memory mode, it is the system that automatically decides the optimal
distribution of the physical memory to logical partitions and adjusts the memory assignment
based on demand for memory pages. The administrator just reserves physical memory for
the shared memory pool and assigns logical partitions to the pool.
Instructor Guide
Shared memory pool

The shared memory pool is a collection of physical memory blocks that are managed as a
whole by the hypervisor. The memory in the pool is reserved upon creation and it is no
longer available for allocation to other dedicated memory partitions. The shared memory
pool is directly managed by the hypervisor for exclusive use by shared memory partitions.
If no shared memory pool is available on the system, it is not possible to define any new
shared memory partitions. Once the pool is available, it is designed to support up to 128
shared memory partitions.

V5.4.0.3
Instructor Guide

Purpose Point out to students the two different memory types. Introduce the shared
memory pool.
Details
Transition statement Lets see where AMS can help.
Instructor Guide
When to use PowerVM Active Memory Sharing

AMS can help with the following:
Partitions with low average memory
residency requirements
Retail headquarters and university
environments
Active and inactive partition
scenarios
Peak at different times (around the
world)
Day and night (day time Web
application and night batch)
AMS will not help with the following:
Environments that require high
transaction rates and high memory
residency
For example, high-performance
computing and scientific applications
AMS improves overall memory utilization.
Figure 4-4. When to use PowerVM Active Memory Sharing AN313.1
Notes:
The virtualization of the real memory through PowerVM Active memory sharing allows a
customer to activate several partitions at the same time, in a system with less physical
memory than the sum of all the partitions logical address spaces. The main benefit here is
the overall memory utilization, as opposed to increased performance.
The Active Memory Sharing feature is most beneficial for environments where logical
partitions have low average memory residency requirements, such as university
environments.
This function is not suitable for environments running high-performance applications
because these tend to have very high memory residency requirements.
AMS might not be a perfect solution for workloads that have high quality of service criteria
and predictable performance. However, AMS is an appropriate solution for time-variant
(around the world) applications, workloads that have variant load levels, file and print
servers, and other workloads that are tolerant to memory access latencies.

V5.4.0.3
Instructor Guide
Uempty Improves overall memory utilization

AMS feature through this on-demand memory allocation enables workloads that have
variable memory requirements to time share the same memory in a consolidated
environment leading to improved memory utilization as shown in the figures.
The first figure on the top right shows three partitions, each configured with shared
memory. As the memory usage peaks at different times, we should consider configuring
dedicated memory. Allocating dedicated memory to these partitions would have had to be
sized with the maximum memory consumed at peak time. We can estimate the overall
dedicated memory needed to be about 20 Gbytes in such a dedicated memory model. If we
consider that those partitions will not use all of the assigned physical memory all of the
time, this unused memory cannot be used to create other logical partitions. In the shared
memory model, we can easily configure an additional fourth partition with the same amount
of memory, thus improving the overall memory utilization.
Instructor Guide
Instructor notes:
Purpose Give some reasons to use AMS and some benefits it provides.
Details Workloads running in a partition have a working set size in memory which, if the
partitions activity level goes down. In either case, the freed or aged pages do not get used
for other partitions. Also, customer workloads have high peaks only infrequently, so
memory utilization levels are frequently low. AMS optimizes memory utilization by sharing
physical memory across multiple partitions, so memory pages that are not actively used by
one partition are given to another partition. Therefore, through consolidation of workloads
that do not peak concurrently, overall memory utilization levels can be increased in a
system.
AMS also enhances a customers ability to optimize the CPU and I/O resources along with
memory, as it allows them to consolidate a larger number of partitions on a system. Until
now, physical memory has limited the number of partitions a customer can configure on
IBM Power systems.

V5.4.0.3
Instructor Guide
Uempty
PowerVM Active Memory Sharing requirements

System requirements:
POWER6 hardware
System Firmware level at 340_75 or above
PowerVM Enterprise Edition
HMC v7.3.4 with service pack 2 or above
Virtual I/O Server 2.1.1.10-FP21 or above
Operating systems supported:

AIX 6.1 TL3 No AIX 5.3 support
IBM I 6.1 plus PTFs
SUSE Linux Enterprise Server 11
Figure 4-5. PowerVM Active Memory Sharing requirements AN313.1
Notes:
These are the system, operating system, and logical partition requirements when
implementing AMS. AMS is available with PowerVM Enterprise edition, and the per-core
pricing remains unchanged.
Instructor Guide
Instructor notes:
Purpose List of requirements for AMS.
Details

V5.4.0.3
Instructor Guide
Uempty
AMS configuration restrictions

One shared memory pool can be created.
Up to 128 shared memory partitions can use the pool.
Shared memory logical partitions must be configured as
shared processors only.
Dedicated processors are not supported.
Shared memory partitions must have virtual adapters only.
Virtual SCSI, virtual Fiber Channel (NPIV), virtual serial, and virtual
Ethernet only.
Cannot have dedicated physical adapters.
Cannot use logical ports on IVE/HEA.
Only 4KB memory pages supported.
Operating system cannot use 64 KB or large pages.
Figure 4-6. AMS configuration restrictions AN313.1
Notes:
At the time of writing, only one shared memory pool can be created per managed system,
and up to 128 shared memory partitions can be created. Shared memory partitions must be
created as the shared processor type, and all of the I/O adapters must be virtualized
through a virtual I/O server.
Also, pages larger than 4k and Barrier Synchronization register are not supported with
AMS.
Instructor Guide
Instructor notes:
Purpose Point out the actual restrictions you might encounter when implementing AMS.
Details The objective of AMS is to increase the number of partitions that can be
supported on a single CEC by sharing the limited resources such as I/O devices, memory,
and CPUs.
Also, supporting physical adapters requires changes in the physical adapter's device driver
to function properly in a shared memory environment. In AMS, the hypervisor guarantees
certain amount of memory to be available for I/O memory mapping operations, to make
sure DMA operations proceed without delays. The partition's I/O entitled memory
represents the maximum amount of memory device drivers can I/O map. The partition's I/O
entitled memory is communicated to the OS at boot time. The OS manages this memory
and distributes it across devices. Device drivers must be aware of this value because I/O
mapping operations might fail if all the I/O entitled memory has already being used.
To support physical adapters, IBM would have to work with the adapter providers to get the
changes required to support AMS in the device drivers.
Transition statement Lets overview the AMS components.

V5.4.0.3
Instructor Guide
Uempty
Active Memory Sharing components

Virtualization Control Point (VCP)
Shared memory pool: SM P1: Current Mem= 12 GB
16 GB Entitled Mem: 4 GB
Paging VIOS SM P2: Current Mem= 8 GB
Entitled Mem: 3 GB
Paging devices SM P3: Current Mem= 4 GB
Entitled Mem: 1 GB
VIOS Paging
Dedicated
partition (1 GB) Shared Shared Shared Memory
Memory Memory Memory Partition4
V
A
Partition1 Partition2 Partition3 (8 GB)
S vSCSI
FC server CMM CMM CMM
I
Page in/out Page

loaning
Active Memory Sharing Manager (AMSM)

POWER
POWER Hypervisor
Hypervisor
Dedicated Shared Free Hypervisor

memory memory pool memory memory
(9 GB) (16 GB) pool (5.5 GB) (1.5 GB)
Paging space
devices
Physical memory (32 GB)
Figure 4-7. Active Memory Sharing components AN313.1
Notes:
This visual shows the different components of the PowerVM Active Memory sharing.
Shared memory pool: Shared memory pool is a configured collection of physical memory
units managed by the PowerVM AMS manager. This shared pool holds the
memory-resident pages (memory resident pages are the physical memory pages that a
partition is actively using) pages of all the active shared memory partitions in a system. The
system administrator determines the amount of physical memory allocated to the shared
pool, in multiples of logical memory blocks (LMBs).
Shared memory partition: A partition that is associated with a shared memory pool.
Active Memory Sharing Manager (AMSM): The Active Memory Sharing Manager
(AMSM) is a hypervisor component that manages the shared memory pool and the
memory of the partitions associated with the shared memory pool. The AMSM allocates the
physical memory blocks that comprise the shared memory pool.
Collaborative Memory Manager (CMM): CMM is an operating system (kernel) feature
that gives hints on memory page usage (active, inactive, critical, and so on). The PowerVM
Instructor Guide
hypervisor uses this to select good victim pages to manage the physical memory of a
shared memory partition.
CMM allows an OS to page out the aged page contents, even when the working set is
within its configured memory limit, and loan pages to the hypervisor to use when expanding
another shared memory partitions physical memory usage.
VIO Server Paging partition: This partition is needed not only for AMS paging but also
for shared memory partitions I/O hosting.
Paging space devices: This is an area of non-volatile storage used to hold portions of a
shared memory partitions logical memory that are not resident in the shared memory pool.
The paging space is allocated in a paging space device assigned to the shared memory
pool. This paging space device can be a logical drive (hdisk in AIX) or logical volumes.
VASI: VASI stands for virtual asynchronous services interface. The VASI receives page in
and page out requests from the Active Memory Sharing Manager.

V5.4.0.3
Instructor Guide

Purpose This is a global view of the PowerVM Active Memory Sharing components.
Details
Transition statement We discuss the components in the next slides.
Instructor Guide
Virtualization control point

The VCP is provided by the HMC or IVM
User interface to manage shared memory
pool and shared memory partitions. Virtualization Control Point (VCP)
Shared memory SM P1: Current Mem= 12GB
Communicates with other major pool:
16 GB
Entitled Mem: 4 GB
SM P2: Current Mem= 8GB
Entitled Mem: 3 GB
elements to configure the shared Paging VIOS SM P3: Current Mem= 4GB
Paging devices Entitled Mem: 1 GB
memory environment
Paging VIOS interface to manage Paging VIOS
(1 GB) Dedicated
paging devices SM Partition
1
SM SM
Partition 2 Partition 3
Memory
Partition 4
V (8 GB)
A
Hypervisor interface to manage F
S
I
vSCSI
Server CMM CMM CMM
shared memory partitions. C Page in/out Page

loaning
Active Memory Sharing Manager

Client interface for DLPAR Hypervisor
Hypervisor
operation and partition
attributes changes Dedicated Shared memory Free memory Hypervisor
memory pool pool memory
(9 GB) (16 GB) (1.5 GB)
(5.5 GB)
Paging devices
Figure 4-8. Virtualization control point AN313.1
Notes:
The virtualization control point has a front end that provides the interface to the system
administrator and a back end that communicates with the rest of the firmware and software
components.
The front end is provided by the HMC or IVM and provides administrator functions to define
a shared memory pool, create shared memory partitions, specify shared memory pool
parameters for shared memory partitions, and manage the shared memory partitions.
The virtualization control point communicates with other major elements to manage the
shared memory pool function. It has a paging space partition interface to manage paging
spaces and an Active Shared Memory Manager (ASMM) interface to manage the different

V5.4.0.3
Instructor Guide

Purpose
Details
Transition statement Lets see the Active Memory Manager.
Instructor Guide
Active Memory Sharing Manager (1 of 2)

Manage the physical memory in the shared memory pool
Resides in the hypervisor
\
Shared memory SM P1: Current Mem= 12 GB

pool: Entitled Mem: 4 GB
Reserves entitled memory for 16 GB
SM P2: Current Mem= 8 GB
Entitled Mem: 3 GB
I/O mappings Paging VIOS SM P3: Current Mem= 4 GB
Allocates free pages to partitions Paging VIOS

when a page fault occurs (1 GB) SM Partition
1
SM SM
Dedicated
Memory
Partition 4
V (8 GB)
A
S vSCSI
I Server CMM CMM CMM
F
C Page
Can steal pages from one Page in/out
loaning

partition and make them Hypervisor
Hypervisor
available to another partition
Use the paging devices Dedicated Shared memory Free memory Hypervisor
(9 GB) (16 GB) (1.5 GB)
(5.5 GB)
Paging devices
Figure 4-9. Active Memory Sharing Manager (1 of 2) AN313.1
Notes:
The Active Memory Manager is a component of the POWER Hypervisor firmware that runs
on the managed system and manages the physical memory of the AMS pool. This
management is based on partition configuration parameters, such as entitled memory,
memory capacity weight, and partitions workload.
The primary purpose of the Active Memory Sharing Manager is to select which partition
pages are kept resident in physical memory at any point and to move partition pages in and
out of the system to a paging space device with the help of a specialized VIOS partition.
When a page fault occurs (this page fault is transparent to the OS), the AMSM allocates
free pages to logical partitions.

V5.4.0.3
Instructor Guide

Purpose
Details
Transition statement The Active Memory Manager maintains a free list.
Instructor Guide
Active Memory Sharing Manager (2 of 2)

AMSM keeps a list of free physical frames in the shared
memory pool.
Frames are assigned to LPARs on page faults until a low water mark
is reached.
When the water mark is reached, the hypervisor free list is
replenished through:
Page loaning mechanism: LPARs operating system loans pages to
hypervisor.
Page stealing (LPAR is not loaning pages to the hypervisor).
Selection is based on page usage status, memory weight, and page
usage statistics.
Hypervisor paging: With the help of the paging VIO Server partition.
Page out to dedicated paging device.
Page in to resolve hypervisor page faults.
Figure 4-10. Active Memory Sharing Manager (2 of 2) AN313.1
Notes:
The Active Memory Sharing Manager also keeps a list of free physical pages that are
assigned to the shared memory partitions as needed. When a page fault occurs, the AMSM
assigns a free page to handle that page fault. This is done until the AMSM free list reaches
a low water mark. At that point, the Active Memory Sharing Manager takes memory from
other partitions, using the page loaning mechanism if other partitions cooperate by loaning
pages to the Hypervisor, or through page stealing, which takes place when the partitions
are not cooperating. Page stealing is based upon the partitions:
Shared memory weight
Page usage status
Page usage statistics
The AMSM uses the hypervisor paging to try to keep all partition working sets (pages
needed by current workloads) resident in the shared memory pool in an over-committed
system.
Hypervisor page fault occurs when the partition wants to access its data that has been
paged out to disk.

V5.4.0.3
Instructor Guide

Purpose
Details
Transition statement Only modified operating systems support AMS.
Instructor Guide
Operating system support

Operating system
Manages I/O entitled memory (distributes it
among device drivers). Virtualization Control Point (VCP)
Shared memory SM P1: Current Mem= 12GB
Helps Hypervisor identify the best page pool: Entitled Mem: 4 GB
SM P2: Current Mem= 8GB
candidates for page out. 16 GB
Entitled Mem: 3 GB
Paging VIOS SM P3: Current Mem= 4GB
Device drivers Paging devices Entitled Mem: 1 GB
Communicates minimum I/O entitled

memory requirements to AIX kernel Paging VIOS
(1 GB) SM Partition SM SM
Dedicated
Memory
Collaborative Memory Manager (CMM) V
1 Partition 2 Partition 3
Partition 4
(8 GB)
A
S vSCSI
Dynamically changes partition memory F
C
I Server CMM CMM CMM
Page in/out Page

footprint in response to Hypervisor request loaning

for pages.
Hypervisor
Hypervisor
Marks pages as loaned
Provides information
Dedicated Shared memory Free memory Hypervisor
Hypervisor uses memory pool pool memory
(9 GB) (16 GB) (1.5 GB)
(5.5 GB)
when page stealing
Paging devices
Figure 4-11. Operating system support AN313.1
Notes:
AMS is not transparent to the partitions operating systems. AIX has been modified to
support Active Memory Sharing.
Device drivers support
An AMS enabled OS distributes the partitions entitled memory among its various device
drivers. Device drivers in turn handle failure of I/O map requests when partitions
entitlement is reached and delayed request execution when a physical frame is not
immediately available.
Collaborative Memory Manager (CMM)
The shared memory partition provides the classification of page usage (page hints) to the
hypervisor for page stealing. This pages classification is done by the Collaborative
Memory Manager. The operating system can also assist the Active Memory Sharing
Manager by providing page usage hints that identify page utilization. This could be unused
pages that are good candidates for page stealing; the active pages with contents that need
to be preserved if they are stolen, or critical pages. Hypervisor never steals I/O Mapped
(DMA) memory pages.

V5.4.0.3
Instructor Guide

Purpose Point out the operating system support for AMS, the Collaborative Memory
Manager runs in AIX and provides hints on page usage. These hints are used by
hypervisor when stealing non critical pages.
Details
Transition statement A paging Virtual I/O server is required with AMS.
Instructor Guide
Paging Virtual I/O Server

Manages paging space devices in
response to requests from the VCP
Paging devices are used by Hypervisor Shared memory SM P1: Current Mem= 12 GB
Entitled Mem: 4 GB
to back up excess memory pages that pool:
16 GB
SM P2: Current Mem= 8 GB
Entitled Mem: 3 GB
cannot be backed by physical Paging VIOS SM P3: Current Mem= 4 GB
memory.
Paging VIOS
(1 GB) Dedicated
Helps AMSM move partition page V
SM Partition
1
SM SM
Memory
Partition 4
(8 GB)
frames in and out of the shared F

A
S
I
vSCSI
Server CMM CMM CMM
C Page
memory pool through the VASI Page in/out
loaning

Initiates DMA operations in response Hypervisor
Hypervisor
to paging requests.
Dedicated Shared memory Free memory Hypervisor
(9 GB) (16 GB) (1.5 GB)
(5.5 GB)
Paging devices
Figure 4-12. Paging Virtual I/O Server AN313.1
Notes:
The paging virtual I/O server is a component of Active Memory Sharing responsible for
paging in and out memory pages to service memory requests of the partitions.
When the Hypervisor wants to free memory pages in the shared memory pool, the content
of the memory pages to be freed must be stored on a paging device in order to be restored
later when the data is accessed again.
The Paging Virtual I/O server copies the content of a physical frame to the specific paging
device of the logical partition. The memory page that has been freed in the shared memory
pool can then be safely allocated to the demanding logical partition by the Active Shared
Memory Manager. If the logical partition wants to access the memory page that has been
paged out to the VIO paging device, an Active Memory Sharing page fault is raised. This
mechanism is completely transparent to the operating system.

V5.4.0.3
Instructor Guide
Uempty Redundant path

Two virtual I/O servers can be used in an AMS configuration. One will be the primary
paging space partition, and the second one will be the secondary paging space partition. If
one virtual I/O server goes down, the paging devices can still be accessed using the
secondary virtual I/O server. If using dual virtual I/O servers for AMS, only physical
volumes are supported as paging space devices (no logical volume is supported).
Instructor Guide
Instructor notes:
Purpose
Details
Transition statement Lets see what the I/O memory entitlement is.

V5.4.0.3
Instructor Guide
Uempty
I/O entitled memory

I/O memory entitlement is required.
Automatically configured when the partition is activated.

I/O memory entitlement is communicated to AIX at boot time.
AIX divides I/O memory entitlement into pools for each device.
I/O memory entitlement requirements vary by virtual adapter type.
Virtual Device Virtual SCSI Virtual Ethernet Virtual Fiber Channel Virtual Serial
Default I/O Entitlement 17 MB 60 MB 137 MB 0
I/O memory entitlement can be monitored using lparstat

command.
Figure 4-13. I/O entitled memory AN313.1
Notes:
A portion of the shared memory pool has to be available for the I/O devices during I/O
operations.
I/O entitled memory is the maximum amount of physical memory (from the shared memory
pool) that is guaranteed to be available for I/O devices at any given time. If the minimum
amount of memory required by the device operation does not reside in the shared memory
pool, the device operation will fail.
I/O memory entitlement is required: When a partition is about to perform an I/O
operation (for example, disk read/writes or TCP/IP communications), it must ensure that a
portion of physical memory remains unmoved for the duration of these operations.
I/O entitlement default values
The HMC and IVM assign a default I/O entitlement memory value for each shared memory
partition. These default values are based on the number and type of I/O devices configured
for a typical partition and would work for all of the supported operating systems in most
cases. However, these values should be evaluated based on the workload, device
configuration, and adapter operations. The default I/O entitlement values for different
virtual adapter types are listed in the figure.
Instructor Guide
Instructor notes:
Purpose What is the I/O memory entitlement.
Details I/O entitled memory is not managed dynamically by the Hypervisor like logical
or physical memory is. The amount of I/O entitled memory allocated to the partition is
assigned by the HMC (or IVM), enforced by the Hypervisor, and obeyed by the partition
OS. Once assigned, this does not change without manual intervention. The amount of I/O
entitled memory automatically assigned to the partition by HMC/IVM is calculated using
default values for each type of virtual adapter configured in the partition. Depending on the
partition OS and workload, this default capacity might not provide acceptable performance.
For example, if a partition workload requires 64Mb of I/O capacity to complete its I/O
operations in an acceptable amount of time and only 48Mb is assigned to the partition, the
partition will need to queue I/O requests. So, I/O throughput will probably not be sufficient
for the partition to complete its processing in the amount of time desired. The user can
manually override the defaults and assign more (or less) I/O entitled memory capacity to
the partition to achieve the desired partition I/O performance.

V5.4.0.3
Instructor Guide
Uempty
Logical to physical memory mapping

The hypervisor virtualizes the shared memory pool to allocate memory
to partitions.
The partitions memory becomes logical memory.
The hypervisor maps a range of physical memory to the partitions logical
memory.
Page types:
Loaned pages Free pages or unused: Page contains no data
Active: Page data must be preserved
Critical: Used in critical performance path
I/O mapped: Used for DMA and cannot be stolen
Free pages
Active pages
Critical pages
I/O mapped pages
Logical memory Physical memory VIOS paging device
Operating system POWER Hypervisor

Figure 4-14. Logical to physical memory mapping AN313.1
Notes:
Logical memory
In a shared memory logical partition, the memory that is assigned as a result of the
configured minimum, desired, and maximum values is known as the logical memory.
In an AMS environment, the partitions real physical memory in a shared memory partition
becomes the partitions logical memory. The real physical memory is part of the AMS
shared memory pool, which is virtualized by the hypervisor.
The partitions logical address can be mapped to any physical memory that is part of the
shared pool. As a result, physical memory assigned to one partition at one time can be
assigned to another partition at another time.
The logical memory is the quantity of memory that the operating system manages and can
access. Logical memory pages that are in use can be backed up by either physical memory
or a pools paging device.
Instructor Guide
The figure shows an example of logical to physical mapping made by the hypervisor at a
given time. The shared memory partition owns the logical memory and provides the
classification of page usage (page hints) to the hypervisor (this classification is done by the
collaborative memory manager component).
While I/O mapped pages are always assigned physical memory, all other pages can be
placed either in physical memory or on the paging device. Free and loaned pages have no
content from the shared memory partitions point of view, and are not copied to the paging
device.

V5.4.0.3
Instructor Guide

Purpose
Details
Transition statement Lets see a hypervisor paging example.
Instructor Guide
POWER Hypervisor paging example

When memory is overcommitted and LPAR refuses to loan more
memory:
Hypervisor steals memory pages.
Asks the VIO Server to page out LPARs pages.
Once memory page is free, it gives the page to the high demanding LPAR.
High lpar1 Active memory

demanding lpar2 lpar3 pages need to
LPAR be saved on
paging device
Virtual
I/O
Server
Active Memory Sharing Manager (AMSM)
AMS shared
memory pool
Physical memory LPAR2 paging LPAR3 paging
devices devices
Figure 4-15. POWER Hypervisor paging example AN313.1
Notes:
Hypervisor paging example

The POWER Hypervisor monitors the memory demands in each shared memory partition
and proactively adjusts the number of pages assigned to each partition. If it detects
memory constraints in a shared memory partition, it will try to add free pages to its list by
borrowing or taking pages from other partitions that have lower memory demand.
Under heavy memory demands, if the physical pages loaned by a shared memory partition
are not enough to meet the memory demand, then the hypervisor can still take physical
pages from the partition and it would save its contents by paging it out to the partitions
paging device.
The POWER Hypervisor will use paging devices to page out the contents of a physical
page before allocating it to another shared memory partition. When the first shared
memory partition faults on a logical page whose contents are paged out, hypervisor will find
a free page and page in the contents to service the page fault.

V5.4.0.3
Instructor Guide

Purpose
Details The POWER Hypervisor keeps a list of free memory pages to handle partitions
page faults. These free pages are maintained in structures internal to the Hypervisor and
are not visible outside of the Hypervisor.
Instructor Guide
Collaborative memory manager: Loaning policy
Page loaning policy of 1 Page loaning policy of 2
16 GB 16 GB
Loaned Loaned
Loaned Can Loaned Can
(3 GB) (3 GB) be reclaimed
be reclaimed
13 GB 13 GB
File File
Cache Available to Cache
(4 GB) be loaned (4 GB)
9 GB 9 GB
Available to be
loaned
Working Working
Storage Unavailable Storage
(9 GB) for loaning (9 GB)
0 GB 0 GB
Figure 4-16. Collaborative memory manager: Loaning policy AN313.1
Notes:
Memory loaning
Active Memory Sharing uses a memory loaning concept. Memory loaning introduces a new
class of page frames, named loaned page frames. This memory loaning method is used to
respond to the hypervisor loan requests with memory frames that are least expensive (from
the OS point of view) to donate. Those pages can be unloaded to be used immediately as
the load increases in another LPARs operating system.
AIX collaborates with the hypervisor to help with hypervisor paging. In response to the
hypervisor requests, AIX checks once a second to determine if the hypervisor needs
memory. In the case where the hypervisor needs memory, AIX will free up logical memory
pages (which become loaned pages) and give them to the hypervisor. The policy to free up
logical memory is tunable through the vmo ams_loan_policy tunable in AIX.
The ams_loaning_policy value indicates the page loaning policy used by the Collaborative
Memory Manager (CMM).

V5.4.0.3
Instructor Guide
Uempty Default (ams_loan_policy=1) Only loan file cache pages; do not page out to its
OS paging space
Aggressive (ams_loan_policy=2) Loan file cache and working storage pages;
will page out to its OS paging space until paging space is low
Off (ams_loan_policy=0) Disables any type of loaning
Page loaning policy of 1: Default loaning
With the default loaning configuration, AIX first reduces the number of logical pages
assigned to the file cache and loans them to the hypervisor. When the Hypervisor needs to
reduce the number of physical memory pages assigned to the logical partition, it first
selects loaned pages and then selects free and used memory pages. The effect of loaning
is to reduce the number of hypervisor page faults because AIX reduces the number of
active logical pages and classifies them as loaned.
Page loaning policy of 2: Aggressive loaning
If page loaning is set to aggressive, AIX either reduces the file cache or frees additional
working storage pages by copying them into the AIX paging space. The number of loaned
pages is greater than the default loaning policy. The hypervisor uses the loaned pages and
might not need to perform any activity on its paging space. When AIX selects a working
storage page, it is first copied to the local paging space. This setup reduces the effort of the
hypervisor by moving paging activity to AIX. If the loaned pages are used pages, AIX has to
save the content to its paging space before loaning them to the hypervisor. This behavior
will especially occur if you have selected this aggressive loaning policy.
Page loaning policy to 0: Disabled
When page loaning is disabled, AIX stops adding pages in the loaning state, even if
requested by the hypervisor. When the hypervisor needs to reduce the logical partitions
memory footprint and free pages have been already selected, either file cache pages or
working storage can be moved on the hypervisors paging space.
Deciding on loan policy
Deciding on which loan policy to use depends on the configuration, consolidation factors,
and workloads. If aggressive loan policy is selected for a shared memory partition, then the
paging devices for that partition should be tuned. If AMS loan policy is disabled, only the
hypervisor paging devices should be tuned. OS paging is used in an AMS environment
only for loaning pages to the hypervisor. Therefore, if AMS loan policy is not enabled, the
OS paging device does not have to be optimized.
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Memory subscription ratio

Three ways to configure the memory subscription:
Non over-commit subscription ratio:
Enough physical memory is available for all logical memory used by all
Logical overcommit subscription ratio:
All logical memory used (referenced) at a point in time is equal to the
physical memory in the pool.
This is typical when the workloads peak at different times.
Physical overcommit subscription ratio:
All logical memory used (referenced) at a point in time exceeds the
physical memory in the pool.
This is typical when the workloads peak at the same time.
Workload performance is impacted.
Consider configuring logical over commit.
Figure 4-17. Memory subscription ratio AN313.1
Notes:
Because the applications running in your shared memory LPARs will exhibit varying
behaviors, you need to understand the different ways you can configure the memory
subscription ratio. The memory subscription ratio is determined by the level of physical
memory available and the logical memory needed. Autonomic memory configuration
decisions will be made based on this ratio. This attribute does not need to be set. The
subscription ratio is the result of your partition configuration.
Non overcommit
This subscription ratio means that the amount of physical memory available in the shared
pool is enough to cover the total configured logical memory of the shared memory
partitions. Because all the configured logical memory is backed by physical memory, this
mode does not provide any memory saving benefits.
Logical overcommit
This is the ratio of the logical memory in use to physical memory available in the shared
memory pool. The total logical configured memory can be higher than the physical
Instructor Guide
memory; however, the current memory working set can be backed up by the physical
shared memory pool. (Note: the working set almost never exceeds the physical memory.)
Applications that time multiplex are good candidates for this memory overcommit ratio. For
example, in AM/PM scenarios, peaks and valleys of multiple workloads overlap, leading to
logical overcommit levels without consuming more than the physical memory available in
the pool. Test and development environments also are good candidates.
Physical overcommit
This is a subscription ratio in which the sum of all shared memory partitions logical memory
not only exceeds the physical memory in the shared pool, but the total actual memory
referenced (the working set) by all shared memory partitions exceeds the physical memory.
The working set memory of the shared memory partitions has to be backed by both the
physical memory in the shared pool and by the paging space devices.
Good candidates for this are applications that use a lot of AIX file cache or are less
sensitive to I/O latency, such as file server, print servers, and network applications.

V5.4.0.3
Instructor Guide

Purpose
Details
Transition statement The next visuals describe how to configure AMS.
Instructor Guide
Shared memory partition configuration

Shared memory pool configuration
Create a set of paging devices

One paging device per shared memory partition
Partition logical memory configuration

AMS memory configuration mode
Non-over commit level
Logical over commit level
Physical over commit level
Partition memory weight
I/O memory entitlement
Collaborative Memory Managers loan policy

Figure 4-18. Shared memory partition configuration AN313.1
Notes:
Here are the tasks to perform to set up a shared memory pool. Also included are tasks
required to create and activate shared memory LPARs.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Shared Memory Pool Management menu
Select the
Server
A VIO Server must be

running or you will get
this message:
Figure 4-19. Shared memory pool management menu AN313.1
Notes:
The figure shows how to manage the share memory pool from the HMC. On the HMC,
select the managed system on which the shared memory pool should be created. Then
select Configuration > Virtual Resources > Memory Pool Management.
To get access to the memory pool management wizard, a virtual I/O server must be defined
and running on your managed system.

V5.4.0.3
Instructor Guide

Purpose
Details
Transition statement Lets see the Create Shared Memory Pool wizard.
Instructor Guide
Shared memory pool management wizard

Use the wizard to create the shared memory pool and paging
space devices.
Figure 4-20. Shared memory pool management wizard AN313.1
Notes:
This is the wizard to use to create a shared memory pool. If a pool is already configured,
the window will display the pool's configuration details.

V5.4.0.3
Instructor Guide

Purpose
Details If a shared memory pool is already configured on your managed system, you
will receive a pop-up window that displays the current shared memory pool configuration.
Instructor Guide
Shared memory pool size

Specify the Pool size and the Maximum pool size values
Figure 4-21. Shared memory pool size AN313.1
Notes:
Maximum pool size

The Maximum pool size is a soft limit that can be changed dynamically. Keep this value at
the minimum required because for each 16GB of pool memory, the hypervisor reserves
256MB of memory for page tracking. Setting this unnecessarily high results in over
allocating memory.
The pool size cannot be larger than the amount of available system memory. The memory
used by dedicated memory partitions cannot be used for the shared memory pool. In this
example, the pool is created on a managed system where we cannot create a pool that is
larger than 16GB.

V5.4.0.3
Instructor Guide

Purpose
Details Hypervisor consumes additional memory from the shared memory pool for
internal administration purposes.
Transition statement The shared memory pool requires a paging virtual I/O server.
Instructor Guide
Paging Virtual I/O Server

Select a Virtual I/O Server to be the paging space partition.
At least one must be assigned to the shared memory pool.
A secondary VIO Server can also be defined for redundant path.
Figure 4-22. Paging Virtual I/O Server AN313.1
Notes:
The shared memory pool configuration requires the definition of a set of paging devices
that are used to store excess memory pages on temporary storage devices. Access to the
paging devices associated with a shared memory partition is provided by a Paging Virtual
I/O Server on the same system. At the time of pool creation, the Virtual I/O Server that will
provide paging service to the pool must be identified.
The panel shown in the figure is provided for selecting paging VIOS partitions. You can also
provide a second paging VIOS to provide a redundant path and higher availability to the
paging space devices.

V5.4.0.3
Instructor Guide

Purpose
Details
Transition statement You create the paging space devices before including them in the
configuration.
Instructor Guide
Create the paging devices

Shared memory pool requires a set of paging devices.
One is required per shared memory partition.
Can be a logical volume or a whole physical volume.

The device must be bigger than or equal to the maximum logical
memory defined in the partition profile.
If these are to be logical volumes, you must create them first:

$ mklv -lv paginglpar1 -size 2G vg_data
$ lsvg -lv vg_data
vg_data:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
paginglpar1 jfs 16 16 1 closed/syncd N/A
Figure 4-23. Create the paging devices AN313.1
Notes:
A paging device is required for each shared memory partition. The size of the paging
device must be bigger than or equal to the maximum logical memory defined in the partition
profile. The paging devices are owned by a virtual I/O server. A paging device can be a
logical volume or a whole physical disk. The disks can be local or provided by an external
storage subsystem through a SAN.
If you are using whole physical disks, there are no actions required other than making sure
the disks are configured and available on the virtual I/O server. If you are using logical
volumes, then you need to create them before proceeding with the pool management
wizard.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Select the devices to be paging space devices

Select the devices to be used as paging devices.
Click to
select the
devices
Figure 4-24. Select the devices to be paging space devices AN313.1
Notes:
After selecting the VIOS paging partition, you can select the devices to be used as paging
devices for your logical partitions.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Paging space device filter
Filter the
device type
Then
Refresh
Select the devices to use as

paging device.
Figure 4-25. Paging space device filter AN313.1
Notes:
To list devices in the VIOS device list at the bottom of the window, you must first select a
Device Type and then select Refresh. In our example, we selected Logical in the device
type selection, because we had previously created logical volumes to use as paging
devices for our logical partitions.
Once refreshed, the device list appears. You must select one paging device per shared
memory logical partition.

V5.4.0.3
Instructor Guide

Purpose
Details
Transition statement The next slide is the wizard summary panel.
Instructor Guide
Paging space devices summary
Paging space
devices
Shared memory
pool size
Summary before
finish
Figure 4-26. Paging space devices summary AN313.1
Notes:
After the paging devices have been selected, click Next to get a summary of all the
selections that have been made. Review the selections and then select Finish to commit
and finish the creation of the shared memory pool.

V5.4.0.3
Instructor Guide

Purpose
Details
Transition statement Now the shared memory pool is created.
Instructor Guide
Creating the shared memory pool

Working window when the shared memory pool is being
created:
There is no specific message stating the shared memory pool has

been successfully created.
The hypervisor uses some of the pool memory for its own
administration purpose.
That is 256MB, plus a small amount per shared memory partition.
Figure 4-27. Creating the shared memory pool AN313.1
Notes:
A working window shows the memory pool creation, but you do not receive any specific
message stating the shared memory pool has been successfully created.
The POWER Hypervisor uses memory from the shared memory pool for its own
administration purpose. For each 16GB of memory assigned in the shared memory pool,
the Hypervisor uses 256MB. Looking at the shared memory pool properties of your
managed system, you will notice the Available Pool Memory value is lower than the
maximum pool size. The Available Pool Memory value is the amount of physical system
memory available after subtracting the amount of memory that the server firmware uses to
manage the shared memory pool and the total amount of I/O entitled memory of the shared
memory partitions.

V5.4.0.3
Instructor Guide

Purpose Highlight the lack of a message stating a successful shared pool creation. Also
point out the hypervisor memory consumption for managing AMS.
Details
Instructor Guide
Pool properties: Paging space devices
Inactive = not used

by a partition
For each paging device, a VRMPAGE device type is created.
$ lsdev -dev vrmpage*

name status description
vrmpage0 Defined Paging Device - Logical Volume
vrmpage1 Defined Paging Device - Logical Volume
Figure 4-28. Pool properties: Paging space devices AN313.1
Notes:
The paging device selection is made when a shared memory partition is activated. The
assignment is based on the availability and the size of the maximum logical memory
configuration of the logical partition. When listing the attributes of the vrmpage devices, you
can see which paging device is used by each shared memory partition.
In the following lsdev command output, we can see that partition ID 7 is using the paging
device named paginglpar1.
$lsdev -dev vrmpage0 -attr
attribute value description user_settable
LogicalUnitAddr0x8100000000000000 Logical Unit Address False
aix_tdev paginglpar1 Target Device Name False
partition_id 7 Client Partition ID False
redundant_usage no Redundant Usage True
storage_pool Storage Pool False
vasi_drc_name U8203.E4A.65D8032-V1-C3 VASI DRC Name True
vrm_state active Virtual Real Memory State True
vtd_handle 0x200001207f9b0297Virtual Target Device Handle False

V5.4.0.3
Instructor Guide

Purpose
Details
Transition statement The next two visuals show virtual devices and drivers used with
AMS on the virtual I/O server.
Instructor Guide
Virtual I/O Server virtual devices for AMS (1 of 2)
Virtual I/O Server
Pager VBSD
driver driver
Client
LPAR
VASI Native device

driver driver
POWER Hypervisor Physical Storage

adapter paging
devices
Figure 4-29. Virtual I/O Server virtual devices for AMS (1 of 2) AN313.1
Notes:
This figure shows the different drivers involved in the VIOS paging space partition.
VASI
- Accepts commands from the hypervisor and forwards them to appropriate kernel
extensions in the VIOS. The kernel extensions are responsible for executing the
commands on behalf of firmware.
Pager
- A kernel extension that is responsible for satisfying paging requests from firmware.
The paging requests are sent to the pager through the VASI kernel extension. It is a
layer of code between the VASI kernel extension and the VBSD driver.
VBSD
- VBSD is an acronym for virtual block storage device. It is a kernel extension
providing an interface for managing and accessing storage volumes. This driver
manages I/O requests made by other kernel extensions, such as a pager, within the
VIOS partition.

V5.4.0.3
Instructor Guide

Purpose
Details
Transition statement We can list the AMS virtual devices in the virtual I/O server.
Instructor Guide
Virtual I/O Server virtual devices for AMS (2 of 2)

$ lsdev -dev vasi*
vasi0 Available Virtual Asynchronous Services Interface (VASI)
$ lsdev -dev vbsd*
vbsd0 Available Virtual Block Storage Device (VBSD)
$ lsdev -dev pager*
pager0 Available Pager Kernel Extension
$lsmap all ams
VBSD
Paging Stream ID Client ID
---------------- ------------------ ---------------
vrmpage1 0x200001207f9b02b9 7
Status active
Redundancy Usage no
Backing device hdisk17
Pool ID 0
VASI vasi3
Pager pager3
VBSD vbsd3
Figure 4-30. Virtual I/O Server virtual devices for AMS (2 of 2) AN313.1
Notes:
After the shared memory pool is created, four new VASI and VBSD devices should be
visible on the Virtual I/O Server. In the figure above, you will see five VASI and VBSD
devices. This is because this virtual I/O server is also defined as a Mover Service Partition
for Live Partition Mobility for which one VASI and one VBSD device are required as well.
The virtualization control point dynamically adds VASI devices to a Virtual I/O Server in
order to enable the virtual I/O server to be a paging Virtual I/O Server.
The lsmap command can be used with the -ams option to list the characteristics of the
vrmpage device associated with your shared memory partition. The lsmap command
output in the visual shows the devices used by a particular communication stream and the
associated physical backing device that is the physical paging device.

V5.4.0.3
Instructor Guide

Purpose
Details Point out the different virtual adapters you can list in the virtual I/O server and
the new -ams option of the lsmap command showing characteristics of each vrmpage
device.
Transition statement After the shared memory pool is created, you can create shared
memory partitions.
Instructor Guide
Partition shared memory settings

Shared memory mode, minimum, desired, maximum, and the
memory weight
Memory weight is a relative value used by the hypervisor in a
calculation to determine which LPARs get more memory.
255 = highest priority
Figure 4-31. Partition shared memory settings AN313.1
Notes:
Minimum, desired, and maximum values
When you activate a partition profile that uses shared memory, the managed system does
not commit a memory amount to the logical partition. The memory for the logical partition is
set to the 'Total assigned logical memory'. This is different from the dedicated memory
allocation. If the managed system does not have the assigned memory amount available,
but has at least the minimum memory amount available, the managed system activates the
logical partition with the memory that is available.
Memory weight
The Memory weight setting is one of the factors used by the hypervisor to determine which
shared memory partition should receive more memory from the shared memory pool. This
field displays the relative value that is used in determining the allocation of physical system
memory from the shared memory pool to a logical partition that uses shared memory. A
higher value, relative to the values set for other shared memory partitions, increases the
probability of the hypervisor allocating more physical system memory from the shared
memory pool to the shared memory partition.

V5.4.0.3
Instructor Guide

Purpose
Details Highlight the memory mode that is now shared, and the different memory
settings.
Transition statement The next slide shows a shared memory partition.
Instructor Guide
Activate a shared memory LPAR

Before activating a shared memory partition:
The Virtual I/O Server used for paging must be active.
The HMC must establish an RMC connection with the VIO Server.
A paging device must be available.
Size must be at least the maximum memory setting for the partition.
The paging device is chosen automatically.
The smallest available device will be used.
Figure 4-32. Activate a shared memory LPAR AN313.1
Notes:
Each shared memory partition requires a dedicated paging device in order to be activated.
Paging device selection is made when a shared memory partition is activated, based on
the availability and the size of the maximum logical memory configuration of the logical
partition. If no suitable paging device is available, activation fails with an error message
providing the required size of the paging device.
There isnt a fixed relationship between a paging device and a shared memory partition
when a system is managed using the HMC. The smallest suitable paging device
automatically selected when the shared memory partition is activated for the first time.
Once a paging device has been selected for a partition, this device is used again as long as
it is available at the time the partition is activated. However, if the paging device is
unavailable for example, because it has been deconfigured or is in use by another partition,
then a new suitable paging device is selected.

V5.4.0.3
Instructor Guide

Purpose Highlight what has to be checked before activating a shared memory partition.
Details
Focus on the paging space device size required for each partition. Point out that there is no
relationship between a partition and a paging device before the LPAR activates. We do not
have a menu or option to specify which paging device will be used by which logical
partition.
Transition statement Dynamic operations are allowed on shared memory partitions.
Instructor Guide
Dynamically add or remove memory resource
Figure 4-33. Dynamically add or remove memory resource AN313.1
Notes:
Assigned memory
This field displays the amount of memory that is currently assigned to the logical partition.
The amount of logical memory of a shared memory partition can be changed within the
minimum and maximum boundaries defined in the partition profile. Increasing the logical
memory in a shared memory partition does not mean that the amount of physical memory
that is assigned through the hypervisor is changed. The amount of physical memory that a
shared memory partition actually gets depends on the availability of free memory in the
shared memory pool. The memory weight of a shared memory partition can also be
changed dynamically.
I/O entitled memory

This field displays the total amount of physical memory that is currently allocated from the
shared memory pool for I/O device of the shared memory partition. You can change the I/O
entitled memory value using a dynamic LPAR operation. Normally, you should not have to

V5.4.0.3
Instructor Guide
Uempty change this value unless monitoring tools are reporting excessive I/O mapping failure
operations.
When you dynamically change the I/O entitled memory, you also change the I/O entitled
memory mode from the auto mode to the manual mode. In manual mode, if you add or
remove a virtual adapter to or from the shared memory partition, the HMC does not
automatically adjust the I/O entitled memory. Therefore, you might need to dynamically
adjust the I/O entitled memory when you dynamically add or remove adapters to or from
the shared memory partition. When you want to change the I/O entitled memory mode from
the manual mode to the auto mode, reactivation of the shared memory partition is required.
You can specify the size in a combination of gigabytes (GB) plus megabytes (MB).
The HMC or IVM calculates the I/O entitled memory based on the I/O configuration. The
I/O entitled memory is the maximum amount of physical memory guaranteed to be
available for I/O mapping.
Partition activation and the I/O entitled memory

The activation of a shared memory partition fails if there is not enough physical memory in
the shared pool to satisfy the I/O entitled memory capacity needed for the partitions I/O
configuration. When you are adding or removing virtual SCSI or virtual Ethernet adapters,
the I/O entitled memory is automatically adjusted if you are in auto mode. The values used
are based on default values and normally do not need to be changed. Once they are
changed, the values will remain at their new setting.
Instructor Guide
Instructor notes:
Purpose Point out the I/O memory entitlement is automatically managed except when
you set an I/O entitled memory value. Setting a new value also changes the I/O entitled
memory mode from automatic to manual. In manual mode, it is your responsibility to
manage I/O entitled memory.
Details

V5.4.0.3
Instructor Guide
Uempty
Some performance guidelines

Hypervisor paging device
Physical volumes preferred over logical volumes
External storage subsystem LUN (4Kb stripe size)
AIX operating system
AIX paging device
Loaning policy
Logical memory
Should be sized based on the maximum amount of memory that the
logical partition is expected to use during peak hours
I/O entitled memory
Default values are appropriate for most configurations
Figure 4-34. Some performance guidelines AN313.1
Notes:
In an Active Memory Sharing environment, the performance of the operating system
paging device takes on a different important role. There are two levels of paging: the
partitions operating system paging and hypervisor paging.
If the page fault rate for one logical partition becomes too high, it is possible to increase the
memory weight assigned to the logical partition to improve the possibility of its receiving
physical memory.
It is recommended to have the partition devices separate from the hypervisor paging
devices, if possible. The I/O operations from shared memory partitions might compete with
I/O operations resulting from hypervisor paging. Page in and page out operation requests
are communicated to the paging virtual I/O server using this virtual devices. Each VASI
adapter can support multiple shared memory partitions.
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Monitoring tools for AMS

Get AMS configuration for LPAR
# lparstat -i
Get I/O memory entitlement usage for adapters
# lparstat -me
Monitor AMS paging statistics for LPAR
# vmstat -h
# vmstat -vh
# topas -L
Monitor AMS paging statistics for all LPARs on CEC
# topas -C
Memory in use
# svmon -G
Figure 4-35. Monitoring tools for AMS AN313.1
Notes:
The AIX operating system has enhanced its monitoring tools to show updates to AMS
specific resource consumption metrics. In a dedicated memory partition, the memory stat
tool svmon can be used to measure the working set size. The command svmon G shows
the inuse memory value. Even though this inuse value represents the working set, it does
not represent the actual memory that is being currently referenced. With AMS, the actual
working set of a partition (actual amount of pages that are being currently referenced) can
be monitored.
Existing tools such as topas and vmstat have been enhanced to report physical memory
in use, hypervisor paging rate, hypervisor paging rate latency, and the amount of memory
loaned by AIX to the hypervisor.
Instructor Guide
Instructor notes:
Purpose
Details
Transition statement Lets see a shared memory LPARs lparstat -i output.

V5.4.0.3
Instructor Guide
Uempty
lparstat -i
# lparstat -i
Node Name : lpar1
Partition Number :5
Type : Shared-SMT
Mode : Uncapped
Shared Pool ID :0
Online Virtual CPUs :1
Minimum Virtual CPUs :1
Variable Capacity Weight : 128
Maximum Physical CPUs in system :4
Active Physical CPUs in system :4
Active CPUs in Pool :4
Shared Physical CPUs in system :4
Unallocated Capacity : 0.00
Unallocated Weight :0
Memory Mode : Shared

Total I/O Memory Entitlement : 77.000 MB
Variable Memory Capacity Weight : 128
Memory Pool ID :0
Physical Memory in the Pool : 4.000 GB
Hypervisor Page Size : 4K
Unallocated Variable Memory Capacity Weight :0
Unallocated I/O Memory entitlement : 0.000 MB
Memory Group ID of LPAR : 32773
Figure 4-36. lparstat -i AN313.1
Notes:
The AIX lparstat command has been enhanced to display statistics about shared memory.
Using the -i flag, the shared memory configuration is shown. The details include the LPAR
configured memory mode, the I/O memory entitlement (the amount of physical memory
used for I/O operations), the memory capacity weight value specified at the partition
creation time, and the shared memory pool size.
Instructor Guide
Instructor notes:
Purpose
Details
Transition statement The next slide introduces the pmem and loan values that are
seen in the vmstat output.

V5.4.0.3
Instructor Guide
Uempty
Active shared virtual memory in an LPAR

In a dedicated memory partition, the logical memory would be the physical
memory assigned to the partition.
(stolen = memory pmem - loan)
Virtual memory
Logical memory
Not really here
Ac
tu
all
ys
Hypervisor
to
re
d
Physical Loaned memory
on
memory pages to Hypervisor
dis
k
VIOS VIOS
vmstat :
paging loan value
vmstat :
pmem value
VIOS AIX OS
paging device paging device
Figure 4-37. Active shared virtual memory in an LPAR AN313.1
Notes:
This slide introduces the virtual memory in a shared memory partition.
The logical memory shown in the visual are memory page frames backed by physical
memory and also loaned pages to the POWER Hypervisor. These values can be examined
using the vmstat command.
When the sum of the logical memory backed by physical memory frames (the pmem value)
and the pages loaned to the hypervisor (loan value) is less than the amount of logical
memory defined in the logical partition, the difference represents what has been stolen by
the hypervisor and resides on paging device.
Instructor Guide
Instructor notes:
Purpose
Details
Transition statement Lets see an example of vmstat command output.

V5.4.0.3
Instructor Guide
Uempty
vmstat command (1 of 2)
# vmstat -h 2
System configuration: lcpu=2 mem=1536MB ent=0.10 mmode=shared mpsz=4.00GB
kthr memory page faults cpu hypv-page

----- ---------------- ------------------------ ------------ ---------------------------- -----------------------------
r b avm fre re pi po fr sr cy in sy cs us sy id wa pc ec hpi hpit pmem loan
0 0 175841 130951 0 0 0 0 0 0 4 296 464 2 5 93 0 0.01 9.3 0 0 1.13 0.37
0 0 175841 134023 0 0 0 0 0 0 1 215 468 1 4 95 0 0.01 7.4 0 0 1.14 0.36
0 0 175841 136869 0 0 0 0 0 0 1 219 462 1 4 95 0 0.01 7.4 0 0 1.16 0.35
0 0 175841 139941 0 0 0 0 0 0 0 215 461 1 4 95 0 0.01 7.3 0 0 1.17 0.34
0 0 175841 143013 0 0 0 0 0 0 0 214 462 1 4 95 0 0.01 7.3 0 0 1.18 0.33
99 mmode:
mmode:Memory
Memorymode
modeofofLPAR
LPAR(dedicated
(dedicatedororshared)
shared)
99 mpsz:
mpsz: Amount of memory in shared memory pool(in
Amount of memory in shared memory pool (inGB)
GB)
99 hpi: Hypervisor page-ins / page faults
hpi: Hypervisor page-ins / page faults
99 hpit:
hpit:Time
Timewaiting
waitingfor
forhypervisor
hypervisorpage-ins
page-ins(in
(inmilliseconds)
milliseconds)
99 pmem:
pmem: Physical memory currently backingthe
Physical memory currently backing thelogical
logicalmemory
memoryassigned
assignedtotothe
theLPAR
LPAR
99 loan: Logical memory loaned
loan: Logical memory loaned
Figure 4-38. vmstat command (1 of 2) AN313.1
Notes:
With Active Memory Sharing, the vmstat command has been enhanced to report the
amount of physical memory that is backing the logical memory of the logical partition, as
well as the hypervisor paging information.
The mem field shows the amount of available logical memory. Unlike in a dedicated
memory partition where the logical memory is always backed by physical memory, this is
not the case in a shared memory partition. The command output shown in the figure has
1.5GB of logical memory. This does not mean that there is actually this amount of logical
memory available to the partition. To see how much physical memory that partition
currently has assigned, you have to look at the pmem column. In this case, it shows that
the partition only has 1.13GB of physical memory assigned at the time the output was
produced.
The hypv-page group in the vmstat command output shows physical memory statistics
and hypervisor paging activity. Looking at the loan column in this example, we see that the
partition has loaned 0.37GB of memory to the hypervisor. This loaned memory is included
in the free (amount of free memory) column under the memory section. If the workload
Instructor Guide
increases its load (memory usage), the loan and the free columns both come down and
pmem column will go up.
If the hpi and hpit values are non zero, or the pi and po values are non zero, then paging
to the paging device is occurring. This is an indication that the working set exceeds the
physical memory given to the partition, or physical pages are being loaned. Either of these
situations will impact workload performance
The following fields have been added for Active Memory Sharing:
mmode: Shows shared if the partition is running in shared memory mode.
mpsz: Shows the size of the shared memory pool.
hpi: Shows the number of hypervisor page-ins for the partition. A hypervisor page-in
occurs if a page is being referenced which is not available in real memory because it
has been paged out by the hypervisor previously. If no interval is specified when issuing
the vmstat command, the value shown is counted from boot time.
hpit: Shows the time spent in hypervisor paging in milliseconds for the partition. If no
interval is specified when issuing the vmstat command, the value shown is counted
from boot time.
pmem: Shows the amount of physical memory (in gigabytes) backing the logical
memory.
loan: Shows the amount of the logical memory in gigabytes that is loaned to the
hypervisor. The amount of loaned memory can be influenced through the vmo
ams_loan_policy tunable.

V5.4.0.3
Instructor Guide

Purpose Discuss the vmstat -h command output.
Details
Instructor Guide
vmstat command (2 of 2)
# vmstat -vh
393216 memory pages
365809 lruable pages
122435 free pages
1 memory pools
112636 pinned pages
80.0 maxpin percentage
3.0 minperm percentage
80.0 maxperm percentage
6.3 numperm percentage
23296 file pages
0.0 compressed percentage
0 compressed pages
6.3 numclient percentage
80.0 maxclient percentage
23296 client pages
0 remote pageouts scheduled
0 pending disk I/Os blocked with no pbuf
0 paging space I/Os blocked with no psbuf
1700 filesystem I/Os blocked with no fsbuf
0 client filesystem I/Os blocked with no fsbuf
0 external pager filesystem I/Os blocked with no fsbuf
8708 Virtualized Partition Memory Page Faults

6223 Time resolving virtualized partition memory page faults
103449 Number of 4k page frames loaned
26 Percentage of partition memory loaned
Figure 4-39. vmstat command (2 of 2) AN313.1
Notes:
The vmstat command, with the -v flag associated with the -h flag, displays additional
memory metrics related to AMS. The output now includes the number of AMS memory
faults and the time spent in milliseconds for hypervisor paging. It also shows the number of
4KB pages that AIX has loaned to the hypervisor and the percentage of partition logical
memory that has been loaned.

V5.4.0.3
Instructor Guide

Purpose
Details
Transition statement lparstat can be used to check I/O entitled memory statistics.
Instructor Guide
lparstat -me
# lparstat -me
System configuration: lcpu=2 ent=0.10 mem=1536MB mpsz=4.00GB iome=77.00MB iomp=9
physb %entc vcsw hpi hpit pmem iomin iomu iomf iohwm iomaf
-------- --------- -------- ------ ------ ---------- --------- -------- --------- --------- --------
0.90 12.1 441 0 0 1.34 31.7 12.0 45.3 12.7 0
iompn: iomin iodes iomu iores iohwm iomaf

ent0.txpool 2.12 16.00 2.00 2.12 2.00 0
ent0.rxpool__4 4.00 16.00 3.50 4.00 3.50 0
ent0.rxpool__3 4.00 16.00 2.00 4.00 2.00 0
ent0.rxpool__2 2.50 5.00 2.00 2.50 2.00 0
ent0.rxpool__1 0.84 2.25 0.75 0.84 0.75 0
ent0.rxpool__0 1.59 4.25 1.50 1.59 1.50 0
ent0.phypmem 0.10 0.10 0.09 0.10 0.09 0
vscsi0 16.50 16.50 0.13 16.50 0.89 0
sys0 0.00 0.00 0.00 0.00 0.00 0
9 iome: Amount of I/O memory entitlement configured to LPAR
9 iome: Amount of I/O memory entitlement configured to LPAR
9 iomp: Number of I/O memory entitlement pools
9 iomp: Number of I/O memory entitlement pools
9 iomin: Minimum required I/O memory entitlement
9 iomin: Minimum required I/O memory entitlement
9 iodes: Desired I/O memory entitlement
9 iodes: Desired I/O memory entitlement
9 ioinu / iomu: In-use I/O memory entitlement
9 ioinu / iomu: In-use I/O memory entitlement
9 iores: Reserved I/O memory entitlement
9 iores: Reserved I/O memory entitlement
9 iomf: Free I/O memory entitlement
9 iomf: Free I/O memory entitlement
9 iohwm: High-water mark of I/O memory entitlement usage
9 iohwm: High-water mark of I/O memory entitlement usage
9 ioafl / iomaf: I/O memory entitlement allocation failures
9 ioafl / iomaf:I/O memory entitlement allocation failures
Figure 4-40. lparstat -me AN313.1
Notes:
The lparstat command has been enhanced to display statistics about shared memory.
Most of the metrics show the I/O entitled memory statistics.
From the lparstat command output, we can see on the left, the different memory pool
names for each virtual adapter. A device might have multiple I/O memory entitlement pools.
The virtual Ethernet adapter could have several pools for different receive buffers, transmit
buffers, and other misc. memory. Virtual Fibre Channel adapters and virtual SCSI adapters
tend to have just a single main large pool and a few other smaller pools. The lparstat
me command reports consolidated I/O entitlement usage information across all pools for
an adapter. The exception is the virtual Ethernet adapter.
The iomaf value tells how many times the OS attempted to get a page frame for an I/O and
failed. If this value is non-zero, increase your I/O entitled memory. If I/O entitled memory in
use (iomu) is high relative to your configured I/O entitled memory, or iomf is consistently
low, increase your I/O entitled memory to improve performance.

V5.4.0.3
Instructor Guide
Uempty Here are the column descriptions:

iome: Amount of I/O mem entitlement configured to LPAR
iomp: Number of I/O memory entitlement pools
iomin: Minimum required I/O mem entitlement
iodes: Desired I/O mem entitlement
ioinu / iomu: I/O memory entitlement in use for the partition
iores: Reserved I/O mem entitlement
iomf: Free I/O memory entitlement for the partition
iohwm: High-water mark of I/O mem entitlement used by the pool
ioafl / iomaf: I/O mem entitlement allocation failures. This represents the total number
of times the vrme_alloc() calls have failed on the partition.
Instructor Guide
Instructor notes:
Purpose
Details
Transition statement The next visuals are about topas.

V5.4.0.3
Instructor Guide
Uempty
topas -L
Interval: 2 Logical Partition: lpar1 Tue Apr 28 16:07:12 2009

Psize: - Shared SMT ON Online Memory: 1536.0
Ent: 0.10 Mode: UnCapped Online Logical CPUs: 2
Mmode: shr IOME: 77.00 Online Virtual CPUs: 1
Partition CPU Utilization
%usr %sys %wait %idle physc %entc app vcsw phint hpi hpit pmem iomu
1 4 0 95 0.0 8.0 - 403 0 0 0 1.05 11.96
=======================================================================
LCPU minpf majpf intr csw icsw runq lpa scalls usr sys _wt idl pc lcsw
Cpu0 0 0 352 544 265 0 100 279 17 49 0 33 0.01 313
Cpu1 0 0 89 0 0 0 0 0 0 25 0 75 0.00 89
99 Mmode:
Mmode:Memory
Memorymode
mode
99 IOME:
IOME: I/Omemory
I/O memoryentitlement
entitlementofofthe
thepartition
partitionininMegabytes
Megabytes
Figure 4-41. topas -L AN313.1
Notes:
When using the topas -L command, the logical partition view with the Active Memory
Sharing statistics is displayed. The IOME field shows the I/O memory entitlement
configured for the partition, while the iomu column shows the I/O memory entitlement in
use. Detailed information about I/O memory entitlement can be obtained by pressing e in
the topas -L menu.
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
topas -C (1 of 2)
Topas CEC Monitor Interval: 10 Tue Apr 28 15:34:14 2009

Shr: 4 Mon: 6.0 InUse: 4.6 Shr:0.4 PSz: 4 Don: 0.0 Shr_PhysB 0.03
Ded: 0 Avl: - Ded: 0 APP: 4.0 Stl: 0.0 Ded_PhysB 0.00
Host OS M Mem InU Lp Us Sy Wa Id PhysB Vcsw Ent %EntC PhI pmem

-------------------------------------------------------------shared---------------------------------------------------------------
lpar1 A61 UM 1.5 1.2 2 0 8 0 90 0.01 334 0.10 11.9 0 0.88
lpar2 A61 UM 1.5 1.2 2 1 3 0 95 0.01 314 0.10 7.2 0 0.90
lpar3 A61 UM 1.5 1.2 2 0 3 0 95 0.01 291 0.10 6.3 0 0.94
lpar4 A61 uM 1.5 0.9 2 0 3 0 96 0.01 297 0.10 6.0 0 1.28
-----------------------------------------------------------dedicated------------------------------------------------------------
Host OS M Mem InU Lp Us Sy Wa Id PhysB Vcsw %istl %bstl
-----------------------------------------------------------dedicated------------------------------------------------------------
99 InU
InU - -Logical
Logicalpartition
partitionworking
workingset
set
99 CM
CM - -SMT
SMTenabled
enabledand
andcapped
cappedininshared-memory
shared-memorymode
mode
99 cM
cM - -SMT
SMTdisabled
disabledand
andcapped
cappedininshared-memory
shared-memorymode
mode
99 UM
UM - SMT enabled and uncapped inshared-memory
- SMT enabled and uncapped in shared-memorymode
mode
99 uM - SMT disabled and uncapped in shared-memory
uM - SMT disabled and uncapped in shared-memory modemode
Figure 4-42. topas -C (1 of 2) AN313.1
Notes:
In the figure, there are four shared memory partitions with host names amsaixa, amsaixb,
amsaixc, and amsaixd. Each line shows the partitions corresponding physical and logical
memory usage. The InU column displays the amount of logical memory (in GB) from the
AIX perspective, which will be the configured memory for the partition. The pmem column
shows the physical memory (in GB) allocated to the shared memory partitions from the
AMS at a given time. This reflects the actual working set of the shared memory partition.
pmem: Physical memory in Gbytes allocated to shared memory partitions from the
shared memory pool in GB at a given time.
InUse: Logical memory; from the AIX perspective, this is LPARs working set.
Instructor Guide
Instructor notes:
Purpose
Details
Transition statement Lets use topas to see the memory pool and the partitions
associated with that pool.

V5.4.0.3
Instructor Guide
Uempty
topas -C (2 of 2)
Press the m key to display the memory pool panel from the CEC panel.
Move the cursor on the memory pool then press the f key to display the
partitions associated with that pool.
Topas CEC Monitor Interval: 10 Mon May 25 15:14:10 2009
Partitions Memory (GB) Memory Pool(GB) I/O Memory(GB)
Mshr: 4 Mon: 6.0 InUse: 4.9 MPSz: 4.0 MPUse: 4.0 Entl: 308.0Use: 47.9
Mded: 0 Avl: 1.1 Pools: 1
mpid mpsz mpus mem memu iome iomu hpi hpit

--------------------------------------------------------------------------------
0 4.00 4.00 6.00 4.86 308.0 47.9 0 0
Host mem memu pmem meml iome iomu hpi hpit vcsw physb %entc
------------------------------------------------------------------------------------------------------------
lpar2 1.50 1.22 0.98 0.52 77.0 12.0 0 0 0 0.01 5.50
lpar4 1.50 1.17 1.05 0.45 77.0 12.0 0 0 277 0.01 5.11
lpar1 1.50 1.22 1.01 0.49 77.0 12.0 0 0 0 0.00 0.00
lpar3 1.50 1.25 0.97 0.53 77.0 12.0 0 0 733 0.01 10.11
99 memu:
memu:Logical
Logicalpartition
partitionworking
workingset
set
Figure 4-43. topas -C (2 of 2) AN313.1
Notes:
To display the memory pool panel from the CEC panel, press the m key. This panel
displays the statistics of all of the memory pools in the system. The example shows a
shared memory size of 4GB, and the size of the aggregate logical memory of all the
partitions in the pool is 6GB. Also, the following values of the pools are displayed:
mpid: The ID of the memory pool
mpsz: The size of the total physical memory of the memory pool in gigabytes
mpus: The total memory of the memory pool in use (this is the sum of the physical
memory allocated to all of the LPARs in the pool)
mem: The size of the aggregate logical memory of all the partitions in the pool in
gigabytes
memu: The aggregate logical memory that is used for all the partitions in the pool in
gigabytes
iome: The aggregate of I/O memory entitlement that is configured for all the LPARs in
the pool in gigabytes
Instructor Guide
iomu: The aggregate of the I/O memory entitlement that is used for all the LPARs in the
pool in gigabytes
hpi: The aggregate number of hypervisor page faults that have occurred for all of the
LPARs in the pool
hpit: The aggregate of time spent in waiting for hypervisor page-ins by all of the LPARs
in the pool in milliseconds
To display the partitions associated with a pool in the lower section of the panel, select a
particular memory pool and press the f key. The example shows four LPARs. Each of them
has 1.5GB of logical memory configured. The following values of the partitions in the pools
are displayed:
mem: The size of logical memory of the partition in gigabytes
memu: The logical memory that is used for the partition in gigabytes
meml: The logical memory loaned to hypervisor by the LPAR
pmem: The physical memory that is allocated to the partition from the memory pool in
gigabytes
iome: The amount of I/O memory entitlement that is configured for the LPAR in
gigabytes
iomu: The amount of I/O memory entitlement that is used for the LPAR in gigabytes
hpi: The number of hypervisor page faults
hpit: The time spent in waiting for hypervisor page-ins in milliseconds
vcsw: The virtual context switches average per second
physb: The physical processor that is busy
%entc: The percentage of the consumed processor entitlement

V5.4.0.3
Instructor Guide

Purpose Explain how to use the topas CEC panel to display the shared memory pool
and the logical partitions that correspond to that pool.
Details
Transition statement You can use HMC utilization used to retrieve shared memory
pool and logical partitions information.
Instructor Guide
HMC utilization data: Memory pool utilization
Figure 4-44. HMC utilization data: Memory pool utilization AN313.1
Notes:
You can use the HMC utilization data to retrieve informations about the shared memory
pool utilization. First, specify the utilization events that you want to see. Utilization events
are records that contain information about the memory and processor utilization on your
managed system at a particular time. You can select the record type you want to see. In our
example, the shared memory pool.
Shared memory pool utilization

The information displayed here allows you to see the utilization of the shared memory pool
at the indicated date and time. The information displayed is as follows:
Pool size (GB): Displays shared memory pool size in gigabytes (GB).
Memory over commitment (GB): The difference (in gigabytes) between the
aggregated logical memory over all partitions and the size of the shared memory pool,
in gigabytes.

V5.4.0.3
Instructor Guide
Uempty Memory over commitment (percent): The difference (percentage) between the
aggregated logical memory over all partitions and the size of the shared memory pool,
as a percent
Partition Logical Memory (GB): The amount of logical memory (in gigabytes)
assigned to all of the partitions in the shared memory pool
Partition I/O entitled memory (GB): The total amount of I/O entitled memory (in
gigabytes) currently mapped by the shared memory pool
Partition mapped I/O entitled memory (GB): The amount of I/O entitled memory (in
gigabytes) currently mapped by all of the partitions in the shared memory partition
System firmware pool memory (GB): Amount of memory, in gigabytes, in the shared
memory pool that is being used by system firmware
Page fault rate (faults/second): The number of page faults per second.
Page-in delay (microseconds): The total page-in delay, in microseconds, spent
waiting for page faults since the shared memory pool was created
Page-in delay (percent): The page-in delay, expressed as a percentage of total time.
Collect utilization data

You can set the Hardware Management Console (HMC) to collect utilization data and
create utilization events at the following times:
At periodic intervals (hourly, daily, monthly, and snapshot)
When you make system-level and partition-level state and configuration changes that
affect resource utilization
When you start, shut down, or change the local time on the HMC
Display the utilization information

You can change the information that displays in the lower half of this window by using the
View menu. The possible View menu settings are as follow:
System: Displays information about the processor and memory utilization on the
managed system as a whole
LPAR: Displays information about the processor and memory utilization on each logical
partition in the managed system.
Physical Processor Pool: Displays information about the total processor utilization
within all shared processor pools on the managed system
Shared Processor Pool: Displays information about the processor utilization within
each configured shared processor pool on the managed system
Shared Memory Pool: Displays information about the processor utilization within each
configured shared processor pool on the managed system
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
HMC utilization data: Partition memory utilization
Figure 4-45. HMC utilization data: Partition memory utilization AN313.1
Notes:
The information displayed in the visual shows the processor and memory utilization of each
logical partition in the managed system at the indicated date and time.
When you select the memory tab, the window displays the memory utilization status of a
selected partition at a specific date and time.
The following information is displayed:
Partition (ID): The partition name and ID number
Memory mode: The processor mode (dedicated or shared)
Logical memory (GB): The amount of logical memory configured for the partition, in
gigabytes (GB)
Virtual processors: The amount of physical memory configured for this partition, in
gigabytes (GB)
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Checkpoint (1 of 3)
1. True or False: PowerVM Active Memory Sharing feature allows
shared memory partitions to share memory from a single pool of
shared physical memory.
2. Which of the following are benefits of using AMS?

a. Reduces global memory requirements of logical partitions
b. Manages operational costs by consolidating resources
c. Allows logical partitions to increase their memory footprint during peak memory
demand periods
d. Allows over commitment of physical memory resources
3. True or False: The total logical memory of all shared memory LPARs
in a system is allowed to exceed the real physical memory allocated to
a shared memory pool in the system.
4. What is the name of a partitions memory in a shared memory LPAR?
Notes:
Instructor Guide
Instructor notes:
Purpose
Details
Checkpoint solution (1 of 3)
1. True or False: PowerVM Active Memory Sharing feature allows shared memory
partitions to share memory from a single pool of shared physical memory.
The answer is true.

c. Allows logical partitions to increase their memory footprint during peak memory demand periods
d. Allows over commitment of physical memory resources.
The answer is reduces global memory requirements of logical partitions, manages
operational costs by consolidating resources, allows logical partitions to increase
their memory footprint during peak memory demand periods, and allows over
commitment of physical memory resources.
3. True or False: The total logical memory of all shared memory LPARs is allowed to
exceed the real physical memory allocated to a shared memory pool in the
system.
The answer is true.

The answer is logical memory.

V5.4.0.3
Instructor Guide
Uempty
Checkpoint (2 of 3)
5. What requirements must be met by the LPAR in order to be
defined as a shared memory LPAR?
6. What is the purpose of an LPAR paging device?
7. True or False: Each shared memory partition requires a

dedicated paging device in order to be started.
8. The dedicated paging device must be what size to

successfully activate a shared memory LPAR?
Notes:
Instructor Guide
Instructor notes:
Purpose
Details
5. What requirements must be met by the LPAR in order to be defined
as shared memory LPAR?
The answer is the LPAR must use shared processors and use only
virtual I/Os.
6. What is the purpose of the LPAR paging space device?

The answer is it provides an amount of paging space that the partition
requires to back up its logical memory pages.
7. True or False: Each shared memory partition requires a dedicated

paging device in order to be started.
The answer is true.
8. The dedicated paging device must be what size to successfully

activate a shared memory LPAR?
The answer is in order to successfully activate, the minimum paging
space device size must be more than or equal to the maximum
memory setting for the shared memory LPAR.

V5.4.0.3
Instructor Guide
Uempty
Checkpoint (3 of 3)
9. True or False: The Collaborative Memory Manager is an
operating system feature that gives hints on memory page
usage to the hypervisor.
10. Which commands can be used to get Active Memory Sharing

statistics?
11. True or False: When AIX starts to loan logical memory

pages, by default it first selects pages used to cache file
data.
12. How can you tune the Collaborative Memory Managers loan
policy?
Notes:
Instructor Guide
Instructor notes:
Purpose
Details
9. True or False: The Collaborative Memory Manager is an operating
system feature that gives hints on memory page usage to the
hypervisor.
The answer is true.
statistics?
The answer is vmstat, lparstat, topas, and svmon.
11. True or False: When AIX starts to loan logical memory pages, by
default it first selects pages used to cache file data.
The answer is true.
12. How can you tune the Collaborative Memory Managers loan policy?
The answer is the policy is tunable through the AIX VMM vmo
command. The parameter ams_loan_policy has a default value of 1.
This enables the loaning of the file cache. When set to 2, loaning of
any type of data is enabled.

V5.4.0.3
Instructor Guide
Uempty
Exercise
Unit
exerc
ise
Notes:
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Unit summary
Describe the Active Memory Sharing concepts and
components
Monitor the shared memory partition using the AIX
performance tools vmstat, lparstat, and svmon
Monitor the shared memory pool usage using data utilization
from the HMC
Notes:
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty Unit 5. Active Memory Expansion: Overview
Estimated time
02:00

This unit will introduce you to the IBM POWER Systems Active
Memory Expansion features. An overview of features will be followed
by planning, deploying, and monitoring details.

Describe the Active Memory Expansion (AME) feature
List the benefits of using AME
Define the purpose of the memory expansion factor
List workload characteristics used to evaluate suitability for AME
Describe how to use the AME planning tool
Explain the output produced by the AME planning tool
Describe how to select a suitable memory expansion factor
List the hardware and software requirements for AME
Describe how to activate the AME feature on a managed system
Configure a partition to use AME
List the tools used to monitor AME performance
Determine the memory compression level achieved in a partition
Determine the CPU resources used for memory compression and
decompression

Machine exercises
References
http://www-03.ibm.com/systems/power/hardware/whitepapers/am_ex
p.html
Active Memory Expansion: Overview and Usage Guide
Copyright IBM Corp. 2010, 2011 Unit 5. Active Memory Expansion: Overview 5-1
Instructor Guide
Unit objectives
decompression
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Topic 1: Overview
Figure 5-2. Topic 1: Overview AN313.1
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Active Memory Expansion: Introduction

Active Memory Expansion (AME) is an optional feature for
HMC managed POWER7 processor-based systems.
AME effectively increases the amount of memory observed by
an operating system when compared with the actual memory
allocation.
AME allows more effective server consolidation.
AME can run more workload/users in a partition using the
existing memory allocation.
AME can run the same workload/users in a partition after
reducing the actual memory allocation.
This frees up physical memory which can be used to run additional
partitions on the system.
Figure 5-3. Active Memory Expansion: Introduction AN313.1
Notes:
Active Memory Expansion (AME) is an optional feature of POWER7 processor-based
systems that enables more effective server consolidation.
Partitions configured to use AME will observe an extended logical memory amount that is
greater than the allocated physical memory.
This technique can be used in two basic ways:
Retain the existing memory allocation of the LPAR, but use AME to present an
extended memory amount to the partition, and allow more throughput.
Reduce the memory allocation of the partition, and use AME such that the extended
memory amount presented to the application is the same as the original physical
memory allocation. This allows the LPAR to handle the same workload, but with a
reduced physical memory allocation.

V5.4.0.3
Instructor Guide

Purpose Introduce AME - a technique that allows more efficient server consolidation.
Details
Transition statement AME works by compressing data to squeeze more of it into the
memory allocated to a partition.
Instructor Guide
Active Memory Expansion: Concept

AME compresses and decompress in-memory data to increase
the effective memory amount.
It allows a partition to fit more data into the existing LPAR memory
footprint.
Data compression and decompression consumes CPU

resource from the amount allocated to the partition.
Often a small amount of CPU resource can yield a large

increase in the effective memory amount.
Actual increase depends on compressibility of data.
Figure 5-4. Active Memory Expansion: Concept AN313.1
Notes:
The basic concept of AME is to use available CPU resources in the partition to compress
data to squeeze more of it into the actual memory amount allocated to the partition. The
CPU resources used for compression and decompression are from the resources allocated
to the partition, since the work is being carried out by the operating system itself.
The actual amount of increase in the effective memory capacity of the partition depends on
the compressibility of the data being used.

V5.4.0.3
Instructor Guide

Purpose
Details Explain at a high level the concept of AME - squeezing more data into memory
by compressing it using spare CPU capacity.
Transition statement Lets look at some examples of using AME.
Instructor Guide
AME scenarios (1 of 2)
AME can be used to increase the effective memory capacity of a
partition, while retaining the existing physical memory allocation.
96GB 96GB
allocation allocation
MMMM MMMMG
MMMM MMMMG
MMMM Scenario 1: Expand effective
memory in constrained LPAR
MMMMG
MMMM 96GB 120GB MMMMG
MMMM MMMMG
MMMM MMMMG
96GB 120GB
effective effective
Assumes
25%
expansion
M Physical memory allocation

G Gained memory capacity
Figure 5-5. AME scenarios (1 of 2) AN313.1
Notes:
Scenario 1
The diagram in the visual shows an example of using AME to expand the effective memory
amount observed by a partition. The partition retains its existing physical memory
allocation, and uses AME to present an extended memory amount to the applications and
users on the system.
In the example in the visual, a partition configured with 96GB of physical memory will
present an extended memory amount of 120GB to applications and users, while retaining a
physical footprint of 96GB.

V5.4.0.3
Instructor Guide

Purpose Explain how AME can increase the effective memory capacity (and therefore
workload throughput) of an LPAR while retaining the existing actual memory allocation.
Details
Additional information Well mention it in a later unit, but if anyone asks, the partition
page table is still sized based on the maximum memory value listed in the partition profile,
even if this is smaller than the extended memory amount that is presented to applications.
Transition statement Another use for AME is to free up memory for use in other
LPARs.
Instructor Guide
AME scenarios (2 of 2)
AME can be used to reduce the physical memory allocation of a
partition while retaining the same effective memory capacity. This allows
more LPARs to be created.
96 GB 64 GB 32 GB
allocation allocation allocation
MMMM MMGG MM
MMMM MMGG MM
MMMM Scenario 2: Retain effective memory
in existing LPAR, but reduce physical
MMMG MG
MMMM allocation to use for second LPAR MMMG MG
MMMM MMMG MG
MMMM MMMG MG
96 GB 96 GB 48 GB
effective effective effective
Assumes Assumes
50% 50%
expansion expansion
M Physical memory allocation

G Gained memory capacity
Figure 5-6. AME scenarios (2 of 2) AN313.1
Notes:
Scenario 2
Another way in which AME can be used is to cope with the existing workload being handled
by a partition while allowing the physical memory allocation to be reduced. This scenario is
shown in the diagram on the visual above.
The original partition is reconfigured with a reduced memory allocation, but uses AME to
present an extended memory amount to applications that is identical to the original
non-AME configuration. The physical memory that has been freed by this reconfiguration
has been repurposed to enable the creation and activation of an additional LPAR, which is
also using AME to present an extended memory amount to the applications and users it is
running.

V5.4.0.3
Instructor Guide

Purpose
Details Explain how AME can be used to reduce the physical memory allocation of a
partition, yet allow the partition to maintain existing workload performance. This allows the
freed memory to be used for additional LPARs.
Transition statement The HMC allocates logical memory to partitions.
Instructor Guide
Logical memory
Logical memory is the virtualized physical memory presented
to a partition by the hypervisor.
Logical memory amount based on values specified in the

partition profile as minimum, desired, and maximum.
For a dedicated memory partition, the mapping from logical to

physical memory is one to one.
For a partition using active memory sharing (shared memory),

the logical memory may not all be mapped to physical
memory.
Some logical memory may be paged out on VIOS controlled paging
device
Figure 5-7. Logical memory AN313.1
Notes:
In order to understand the implementation details of AME, it is necessary to define a
number of terms that are used in the explanation.
The first of these terms is logical memory. The HMC allocates logical memory to a partition,
which is then managed by the virtual memory manager component of the operating
system. For a partition configured to use dedicated memory, there is a one to one mapping
between logical memory and physical memory. For a partition configured to use Active
Memory Sharing (AMS), another memory utilization improvement technology available on
Power Systems, the logical memory might not all be mapped to physical memory at the
same time. The actual amount of physical memory used by a shared memory partition will
depend on the workload of the partition, and the demands being placed on the shared
memory pool by the other partitions configured to use AMS.

V5.4.0.3
Instructor Guide

Purpose Explain the term logical memory, and how it maps to physical memory.
Details If students are unfamiliar with AMS, then simply tell them to consider logical
memory as being equivalent to physical memory, or simply memory.
Transition statement When AME is enabled, logical memory is managed as two
distinct pools.
Instructor Guide
Logical memory pools

When AME is enabled, logical memory is divided into two
pools.
Uncompressed logical memory
Compressed logical memory
Compressed memory pool is implemented like a RAM disk

paging device.
When the number of virtual pages in use exceeds size of
uncompressed pool, AIX will start compressing the least recently used
pages and move them to the compressed pool
Figure 5-8. Logical memory pools AN313.1
Notes:
When AME is enabled, the logical memory allocated to a partition is divided into two pools
for management purposes. There is a pool of uncompressed memory, and a pool of
compressed memory.
The compressed memory pool can be thought of as a special type of RAM disk paging
device that is handled internally by the virtual memory manager.

V5.4.0.3
Instructor Guide

Purpose Describe how logical memory is divided into two pools, and how the
compressed pool is implemented as if it were a RAM disk paging device.
Details
Transition statement If we reference a page thats currently stored in the compressed
pool, a page fault will be generated.
Instructor Guide
Page faults
When a referenced virtual page is not resident in the
uncompressed pool, a page fault is generated.
The VMM page fault handler uses the software page frame
table to determine current location of the referenced page.
If the virtual page is stored in the compressed pool, a

pseudo page-in operation will uncompress the page and place
it in the uncompressed pool.
Faulting thread will resume when the page-in operation completes.
The page-in of compressed data does not involve disk I/O, therefore is
considerably faster than from paging space.
If the virtual page is located on paging space, a regular page-in

operation is arranged.
Figure 5-9. Page faults AN313.1
Notes:
When the CPU references a virtual page for which it has no translation information (not
resident in the uncompressed pool), it generates a page fault. This will result in the VMM
page fault handler being invoked.
If the virtual page being referenced is located on a regular paging space device, then the
page fault handler will obtain a free frame from the uncompressed pool, and invokes an I/O
operation to copy the virtual page from paging space back into memory. The faulting thread
is put to sleep until the I/O operation completes.
If the virtual page being referenced is currently contained within the compressed page pool,
then a similar operation is performed. The VMM page fault handler will obtain a free frame
from the uncompressed pool, and use it to store the decompressed. This operation
completes much more quickly than a traditional page fault as there is no disk-based I/O
involved.

V5.4.0.3
Instructor Guide

Purpose Describe what happens when a page fault is encountered.
Details Hopefully most students will be familiar with the concept of a page fault. If not,
then they might have difficulty grasping some of the concepts covered later.
Transition statement Applications are presented with a view of the extended logical
memory capacity. This allows them to scale algorithms correctly.
Instructor Guide
Expanded logical memory

Expanded logical memory capacity is the inflated view of logical memory
AIX will present to applications and users.
Determined by the memory expansion factor, which is a multiplier of the actual
logical memory
LPARs expanded
logical memory LPARs actual
logical memory
Uncompressed
data Uncompressed
10GB memory
pool
10GB
20GB
30GB Compressed
Compressed Compression memory
ratio
data 2.0 pool
20GB 10GB
View presented to
firmware, HMC, OS
Memory expansion factor = 1.5

View presented to (30GB/20GB)
applications and user
Figure 5-10. Expanded logical memory AN313.1
Notes:
Memory expansion factor
Applications are presented with information about the expanded logical memory capacity of
the partition. This allows them to scale data structures, algorithms, and the number of
threads they will use appropriately for the total amount of data that can be handled by the
partition.
The firmware and operating system are presented with a view that represents the actual
logical memory amount allocated to the partition.
The memory expansion factor value is a multiplier of the actual logical memory amount that
determines the target expanded logical memory amount.

V5.4.0.3
Instructor Guide

Purpose Show the difference between the extended logical memory amount presented
to applications, and the actual logical memory amount presented to firmware and the
operating system.
Details In particular, point out and explain the definition of the memory expansion factor,
which determines the relationship between the actual and extended logical memory values.
Transition statement The compressed pool grows and shrinks dynamically as
required.
Instructor Guide
Pool size (1 of 2)
The size of the compressed memory pool (and also the
uncompressed pool) will vary dynamically over time depending
on LPAR workload.
When AME is enabled, initially the pool will be zero size.

If workload fits within the LPAR actual logical memory allocation, no
compression is required.
Compressed pool expanded when LPAR runs low on free

frames.
Eligible least recently used pages will be compressed and added to
the compressed pool, thus freeing frames in the uncompressed pool.
As the compressed pool expands, the uncompressed pool will

shrink.
Figure 5-11. Pool size (1 of 2) AN313.1
Notes:
Dynamic pool boundary
The boundary between the compressed and uncompressed memory pools will change
dynamically as required, based on LPAR workload.
When AME is enabled, the compressed pool is initially empty. As the memory workload of
the partition increases, pages will be allocated from the uncompressed pool. If the partition
workload fits within the actual logical memory allocation, no compression will be required.
As the number of free frames in the uncompressed pool is reduced, the VMM will start to
compress eligible least recently used pages and move them to the compressed pool. This
will free up frames in the uncompressed pool at a rate larger than the rate at which frames
are being added to the compressed pool.
Since the amount of logical memory available to the partition is constant at any given
moment, if the compressed pool is expanded this means the uncompressed pool will
shrink.

V5.4.0.3
Instructor Guide

Purpose Explain the dynamic nature of the compressed pool.
Details Describe the growing and shrinking of the pools verbally. The diagram on the
next visual should make things clearer.
Transition statement The compressed pool will stop growing when one of three
situations occur.
Instructor Guide
Pool size (2 of 2)
LPARs actual
logical memory
Pinned Uncompressed Compressed

memory pool pool
Minimum size of
uncompressed pool Grow Shrink
ame_min_ucpool_size
Compressed pool might grow in size until:

Total amount of data in memory reaches expanded logical memory value
determined by expansion factor
Uncompressed pool shrinks to minimum pool size
System gets low on pinnable memory
Only pages from the uncompressed pool will be paged out to paging
space.
Figure 5-12. Pool size (2 of 2) AN313.1
Notes:
Dynamic pool boundary
The boundary between the compressed and uncompressed memory pools will change
dynamically as required, based on LPAR workload.
As shown in the diagram on the visual above, in addition to the uncompressed pool, some
portion of the logical memory will be used to store pinned pages which cannot be
compressed. The compressed pool start off at zero size, and will grow as required until one
of the three conditions listed on the visual are met.
If the compressed pool cannot grow any more and there is a lack of free frames in the
uncompressed pool, then regular paging activity to a paging space device will occur. Only
pages from the uncompressed pool will be examined by the VMM and considered as
candidates to be paged out.

V5.4.0.3
Instructor Guide

Purpose Show how logical memory is managed as two pools, and the boundary
between the pools varies dynamically.
Details
Transition statement AME will consume more CPU resources as the amount of data
compression and decompression increases.
Instructor Guide
CPU utilization for compression

As the size of the compressed pool increases, this increases
the chance that a thread will reference a compressed page.
This in turn increases the amount of CPU resource spent on
compressing and decompressing data.
Aim for knee of curve

best expansion without
significant additional CPU
% CPU
utilization
for
expansion
Amount of memory expansion

Figure 5-13. CPU utilization for compression AN313.1
Notes:
CPU cost
There is an associated cost in CPU resources that will be consumed within a partition when
AME is configured. As the amount of memory expansion increases, there will be an
increase in the amount of CPU resources consumed. This relationship is not linear, as
shown on the diagram in the visual above.
In many cases, a significant amount of memory expansion can be obtained without
consuming a significant amount of additional CPU resources. There is however a point in
the curve at which significant additional CPU resources might be consumed to provide only
a small amount of additional memory expansion.
The actual location of the sweet spot of the curve will depend on the existing CPU load on
the partition, the memory workload on the partition, and the compressibility of the data
being used.

V5.4.0.3
Instructor Guide

Purpose Show the CPU utilization curve for AME.
Details Explain that ideally we want to use enough compression/decompression to
maximize the extended memory we can obtain, but without expending significant additional
CPU resource to achieve this. We should be aiming for the sweet spot, or knee of the
curve.
Transition statement AME does have some limits on the type of pages that can be
compressed.
Instructor Guide
Limitations
Only working storage pages can be compressed.
Pinned memory pages are not compressed.
File system cache pages are not compressed.
If memory expansion factor is too aggressive, there might be

an expanded memory deficit.
When actual logical memory is full, but total data has not yet reached
expanded logical memory value
Resolved by reducing expansion factor and adding additional logical
memory
Figure 5-14. Limitations AN313.1
Notes:
AME does have some limitations on functionality, some of which are listed on the visual
above. Other limitations will be discussed in later in this unit.
The impact of these limitations on the potential performance of a partition using AME can
be determined in advance by the use of the AME planning tool.

V5.4.0.3
Instructor Guide

Purpose Describe some of the limitations of the AME feature.
Details More on the limitations will be covered in later units of the course.
Transition statement AME has an associated monetary cost. How would decide when
to buy it versus buying more physical memory?
Instructor Guide
AME economics
AME is a chargeable feature.
Cost varies depending on system, just as cost of physical memory
varies by system.
Scenario 1: System has maximum possible physical memory
for current configuration.
AME is a simple choice, even with moderate expansion factors.
Scenario 2: System is not at maximum possible physical
memory, and CPU utilization is relatively low.
Need to choose between adding more physical memory or adding
AME.
Calculate cost of AME gained memory based on expansion factor,
and compare with cost of same amount of physical memory for
system.
Scenario 3: System is full of small DIMMs, and need to avoid
cost of replacing with larger DIMMs.
This is the same as scenario 2.
Figure 5-15. AME economics AN313.1
Notes:
Cost evaluation
The cost of the AME feature varies from system to system, just as the cost of physical
memory varies from system to system.
When evaluating whether AME is a sensible purchase, you should consider the cost of the
AME feature with the cost of buying an amount of physical memory equivalent to the logical
memory gained by each partition using AME. The calculation should also take into account
other factors, such as whether the system is currently at maximum physical memory
capacity. Another consideration is whether all DIMM slots are currently occupied, and
adding more memory capacity would require adding another CPU card, or replacing
existing DIMMs with larger capacity DIMMs.
Remember that the cost per Gigabyte of memory for a given system will vary depending on
the size of the DIMMs used in the memory feature.

V5.4.0.3
Instructor Guide

Purpose Explain the economics of deciding when to purchase the AME feature.
Details
Additional information Memory for 750/755 class systems is cheaper than memory for
770/780 class systems. The cost per GB for larger memory features is smaller than the
cost per GB for small memory features. All of this needs to be considered when deciding
whether to purchase AME.
Transition statement When considering using AME in a production environment, there
are a number of phases you should consider.
Instructor Guide
AME deployment phases

Phase 1: Planning
Use the AME planning tool (amepat) to measure existing workload
and get an estimate of data compressibility and CPU resource
required for performing data compression.
Phase 2: Trial
Obtain 60 day trial activation of AME.
Configure LPARs as desired with memory expansion factor, and
monitor performance.
Phase 3: Deployment into production

Obtain permanent activation of AME.
Deploy production workloads on LPARs configured with memory
expansion factor.
Monitor performance.
Figure 5-16. AME deployment phases AN313.1
Notes:
A customer would be very unwise to deploy any new feature directly into a production
environment. Typically, planning and testing tasks would be performed before production
deployment.
The second topic in this unit covers the planning phase, including use of the AME planning
tool to determine the amount of benefit that can be obtained for a particular workload by
using AME.
The third topic of this unit covers the deployment details, such as obtaining the AME
activation key, and configuring a partition with a memory expansion factor value.
The fourth topic of this unit covers the facilities available to monitor the performance of a
partition configured with AME.

V5.4.0.3
Instructor Guide

Purpose Cover the main phases of evaluating and deploying AME in a production
environment.
Details
Transition statement The Power Systems environment supports other memory
utilization improvement technologies.
Instructor Guide
Memory utilization improvement technologies

AME effectively provides more memory to a partition by
compressing and decompressing the contents of logical
memory.
AIX 6.1 and POWER7 only
Does not require VIOS to be configured
Active Memory Sharing creates a shared memory pool utilized
by multiple LPARs.
Hypervisor moves physical memory from one partition to another in
the pool
Best when not all LPARs are busy at the same time
POWER6 and POWER7
AIX 5.3 and above, Linux and IBM i
Requires VIOS to be configured
Active Memory Expansion and Active Memory Sharing can be
used concurrently.
Figure 5-17. Memory utilization improvement technologies AN313.1
Notes:
Other technologies
In addition to the Active Memory Expansion memory utilization improvement feature
available on POWER7 processor-based systems, IBM Power Systems support the Active
Memory Sharing (AMS) feature on systems using POWER6 or newer processors.
AME is a technology that can improve the memory utilization of a single partition, and can
be used by AIX 6 partitions running on POWER7 processor-based systems. It does not
require a Virtual I/O Server (VIOS) partition to be configured.
AMS is a feature which can be utilized by multiple operating systems on hardware with
POWER6 or newer processors. This feature creates a shared memory pool controlled by
the hypervisor, which allocates physical memory to the partitions most in need. This feature
requires a VIOS partition to be configured.
These two memory technologies can be used together on POWER7 processor-based
systems.

V5.4.0.3
Instructor Guide

Purpose Compare AME with AMS.
Details
Transition statement Lets look at a real world example of using AME to increase the
throughput of an individual partition.
Instructor Guide
Sample SAP ERP workload (1 of 2)

Single LPAR (database + appserver)
Without AME With AME
Partition utilization Partition utilization
Memory: 100% (18GB) Memory: 100% (18GB true)
CPU: 46% (12 cores in LPAR) CPU: 88% (12 cores in LPAR) Note: Most of the
CPU increase is due to additional work done by application
Memory capacity is the bottle-neck
Higher throughput enabled with the same
CPU is under-utilized amount of physical memory
Handles 1000 simulated users Gain 37% memory capacity
Handles 1700 simulated users
Max partition throughput: 99 tps + 65% Max partition throughput: 166 tps
12-core POWER7 partition 12-core POWER7 partition

18GB Memory 24.7GB Memory
18GB true . 18GB true .
0GB expanded 6.7GB expanded
Expanded memory
Note: This is an illustrative scenario based on using a sample workload. This data
represents measured results in a controlled lab environment. Your results might vary.
Figure 5-18. Sample SAP ERP workload (1 of 2) AN313.1
Notes:
The example described in the visual shows the measured improvement in application
throughput on a single partition that was configured to retain the same physical memory
allocation and use AME to expand the amount of memory presented to the application.

V5.4.0.3
Instructor Guide

Purpose Provide a real-world example of use of AME.
Details
Transition statement Lets consider another example.
Instructor Guide
Sample SAP ERP workload (2 of 2)

Enabled additional application partition
Without AME With AME
System utilization System utilization
Memory: 100% (48GB) Memory: 100% (48GB)
CPU: 76% on 24 cores CPU: 94% on 32 cores
25% (eight core) unused Note: Majority of CPU increase is due to additional work.
Memory capacity is the bottle-neck Higher throughput enabled
CPU under-utilized Enabled unused CPU resources (25% of server) with
Handles 2900 simulated users no additional physical memory
Partitions have reached physical memory Gain 30% in application server memory capacity
limitations by showing moderate paging Handles 5000 simulated users
System Throughput: 286 TPS + 60% System Throughput: 460 TPS
3 x 8-core POWER7 partitions 4 x 8-core POWER7 partitions

48GB true . 48GB true .
0GB expanded 14GB expanded
LPAR 2 (AppServer)
LPAR 3 (AppServer)
LPAR 2 (AppServer)
LPAR 3 (AppServer)
LPAR 4 (AppServer)
LPAR 1 (DB + App)
LPAR 1 (DB + App)
LPAR 4 (IDLE)
Note: This is an illustrative scenario based on using a sample workload. This data
represents measured results in a controlled lab environment. Your results might vary.
Figure 5-19. Sample SAP ERP workload (2 of 2) AN313.1
Notes:
\The example described in this visual shows the measured improvement in overall system
utilization and throughput on a managed system configured with multiple partitions. AME
was used on the existing partitions to free up sufficient physical memory to allow a fourth
application partition to be configured. This allowed additional CPU cores on the system to
be utilized effectively.
Disclaimer
The example described on this visual and the example on the previous visual show
memory expansion improvements that are at the high end of what is possible. Not every
application will be able to achieve similar improvements in throughput by using AME.

V5.4.0.3
Instructor Guide

Purpose Provide another real-world example of use of AME.
Details This example shows how AME can be used to run an additional partition on a
managed system by reducing the physical memory allocations to existing partitions. Point
out that the data compression (and this memory expansion) achieved for these examples is
at the high end of possible outcomes. Not all applications will demonstrate similar
performance improvements.
Additional information Some might point out that configuring a system with only 48 GB
of memory for 32 POWER7 cores seems a little on the light side. This example and the one
on the previous page was taken directly from a system seller presentation.
Transition statement Now its time for some checkpoint questions.
Instructor Guide
Checkpoint
1. True or False: Every POWER7 system comes with Active
Memory Expansion as standard.
2. True or False: Active Memory Expansion allows a partition to
effectively use more memory than the logical memory
amount allocated by the hypervisor.
3. True or False: The size of the compressed memory pool is

static.
4. True or False: The compression ratio achieved on a system

depends on the compressibility of data.
5. True or False: The AME feature costs the same on every

POWER7 system.
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
1. True or False: Every POWER7 system comes with Active Memory
Expansion as standard.
The answer is false.
effectively use more memory than the logical memory amount
allocated by the hypervisor.
The answer is true.
3. True or False: The size of the compressed memory pool is static.

4. True or False: The compression ratio achieved on a system depends

on the compressibility of data.
The answer is true.
5. True or False: The AME feature costs the same on every POWER7
system.
Instructor Guide
Topic 1: Summary
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Topic 2: Planning for Active Memory Expansion

List workload characteristics used to evaluate suitability for
AME
Figure 5-22. Topic 2: Planning for Active Memory Expansion AN313.1
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Workload characteristics (1 of 2)
Not all workloads will benefit from using AME.
Some will benefit to a greater extent than others
Workload characteristics to consider include:

Compressibility of in-memory data
Memory access patterns
Memory segment type
Pinned memory usage
These characteristics might not be well known even for existing

workloads.
The amepat planning tool can be used to estimate whether an existing
workload will benefit from AME
Figure 5-23. Workload characteristics (1 of 2) AN313.1
Notes:
Not all workloads will benefit greatly from using AME. This figure lists the workload
characteristics that have an impact on the potential benefit of using AME.
AIX provides the amepat planning and advisory tool to advise in planning and
implementing AME.

V5.4.0.3
Instructor Guide

Purpose List the major factors that can limit the benefit provided by AME, and introduce
the amepat planning tool.
Details
Transition statement How are the different memory types evaluated?
Instructor Guide
Workload characteristics (2 of 2)
Compressibility of in-memory data
Most data will compress reasonably well, resulting in high levels of
memory expansion.
The exception is data that is already compressed.
Memory access patterns
Workloads that tend to frequently access small areas of memory will
see most benefit.
Memory segment type
AME does not compress file pages cached in memory, therefore file
serving applications will have little benefit.
Pinned memory usage
AME does not compress pinned virtual memory pages.
Workloads that pin most of their memory will have a large memory
footprint and will not benefit from AME.
Figure 5-24. Workload characteristics (2 of 2) AN313.1
Notes:
This visual describes how the specified workload characteristics are evaluated to
determine suitability for use with AME.

V5.4.0.3
Instructor Guide

Purpose Explain how different memory considerations impact the benefit provided by
AME.
Details
Transition statement In essence, usage of AME means trading CPU resources to gain
additional memory space.
Instructor Guide
CPU resource consumption

In addition to workload characteristics, benefit from AME is
dependent on available CPU resource.
AME uses CPU resource allocated to the LPAR for

compression and decompression of data.
Amount of CPU resource used for AME depends on workload

memory activity and target memory expansion.
The amepat planning tool can be used to estimate the amount

of CPU resource needed for a workload running with AME.
Figure 5-25. CPU resource consumption AN313.1
Notes:
The data compression and decompression carried out when AME is enabled will consume
CPU resources allocated to the LPAR. The actual amount of memory expansion that can
be provided by AME will be limited by the amount of available CPU resource in addition to
the compressibility of the data being used.
The AME planning tool can estimate the additional amount of CPU resource that will be
consumed for a set of modeled AME configurations, based on data collected for an existing
workload.

V5.4.0.3
Instructor Guide

Purpose Remind students that AME will consume CPU resources.
Details
Transition statement The planning tool can be used to determine if an existing
workload will benefit from using AME.
Instructor Guide
AME planning tool

The AME planning tool should be run over a period of time with
an existing workload.
Monitors workload memory access patterns and determines suitability
for AME
Can also be used to monitor AME performance when AME is enabled
The amepat tool generates a report that lists a number of

possible AME configurations for the measured workload.
Includes an estimate of additional CPU resource consumption used by
AME
Tool added in AIX 6.1 TL4 SP2.

Runs on any system supported by AIX 6.1
Figure 5-26. AME planning tool AN313.1
Notes:
The amepat planning and advisory tool was added in AIX 6.1 TL4 SP2. It runs on any
system supported by AIX 6.1, and can be used to generate a report that provides advice on
the usage of AME. It can examine the memory access patterns of a running workload and
estimate the amount of CPU resource that would be required for a number of modeled
AME configurations.

V5.4.0.3
Instructor Guide

Purpose Explain the purpose of the amepat planning tool.
Details
Transition statement There are a few operational aspects to the tool we need to know.
Instructor Guide
Running the AME planning tool

Can use command line or SMIT to invoke the tool.
The amepat command should be run by the root user.
If run by non-root user, no AME modeling will be done
Two considerations:
When to run the tool
Duration to run the tool
Run the tool during workload peak utilization.
Need to measure existing memory and CPU utilization
Tool periodically scans workload memory to determine
compressibility.
Can run the tool in real time or collect data to a file and then
generate one or more reports from the recording.
Only one instance of the tool can be run at a time to collect data
Figure 5-27. Running the AME planning tool AN313.1
Notes:
Operational considerations
The amepat command can be run from the command line, or using SMIT with the
command smit amepat.
When invoked by the root user, AME modeling will be performed using the available data. If
invoked by a non-root user, modeling will be disabled.
The tool should be run for a duration of time when the existing workload is at peak
utilization. This will allow the tool to measure CPU and memory usage on the system, along
with information on data compressibility.
The command can be operated in two basic modes. In recording mode, it will gather
statistics and store the data in a recording file. In report generation mode, it will generate a
report, either from real time data, or from data supplied in a recording file.
Recorded data can be used to generate multiple reports, which will allow you to model
multiple different AME scenarios.

V5.4.0.3
Instructor Guide

Purpose Cover high-level operational details of using the tool.
Details Mention that recording the data and then generating reports is more flexible,
since it allows you to model multiple different AME configurations without having to wait for
a period of time to collect data again.
Additional information Modeling requires the user collecting the data to have the
aix.system.stat RBAC authorization. The man page simply mentions the user must have
the correct privilege, without specifying which one is needed. The command uses the
/tmp/.amepat_lock file as a check to ensure only one instance of the command collects
data.
Transition statement The tool can be operated in recording mode, or reporting mode.
Instructor Guide
Command usage
The command syntax is as follows:
amepat [ { { [ -c max_ame_cpuusage] | [ -C max_ame_cpuusage ] } |
[ -e startexpfactor [ :stopexpfactor [ :incexpfactor ] ] ] }]
[ { [ -t tgt_expmem_size ] | [ -a ] } ] [ -n num_entries ]
[ -m min_mem_gain ] [ -u minucomp_poolsize ] [ -v ] [ -N ]
[ { [ -P recfile ] | [ Duration ] | [ Interval <Samples> ] } ]
amepat [ -N ] [ -R recfile ] { [ Duration ] | [ Interval <Samples> ] }
Flag Description
-c max_ame_cpuusage Unit is percentage
-C max_ame_cpuusage Unit is physical processors
-e startexpfactor Use to specify expansion factors to model
-t tgt_expmem_size Use specified target expanded memory size in MB
-a Auto tune target expanded memory size based on workload
-n num_entries Number of modeled statistics entries to display
-m mim_mem_gain Minimum modeled memory gain in MB from using AME
-u minucomp_poolsize Model using specified minimum uncompressed pool size
Duration Duration in minutes (memory samples collected automatically)
Interval Samples Interval in minutes, number of memory samples to obtain
-N Disable workload modeling, only monitor resource usage
Figure 5-28. Command usage AN313.1
Notes:
Syntax
The syntax of the amepat command is detailed on this visual.
The amepat tool allows for workload planning for AME, and also monitoring when AME is
enabled.
For workload planning, the command can be invoked in recording mode where it will collect
data to a recording file, or it can be invoked in reporting mode, where it will generate a
report either from a recording file, or by collecting data in real time.
The -N flag disables the workload planning capability, and means that amepat will only
monitor current AME statistics (if AME is currently enabled). The -N flag is implied if the
command is invoked by a non-root user.
Refer to the online documentation or the man page for the amepat command for a
complete description of the available options.

V5.4.0.3
Instructor Guide

Purpose Cover syntax of the tool.
Details
Transition statement Lets look at some examples of using the tool.
Instructor Guide
Command usage examples

Example 1:
amepat 5
Monitor the system for a five minute duration, and generate a report
(the number of data compressibility samples will be determined
automatically).
Example 2:
amepat R recfile 5 4
Monitor the system for 20 minutes, taking a data compressibility
sample every five minutes, storing the raw data in recfile.
Example 3:
amepat P recfile e 2.0:4.0:0.5
Generate a report using the data in recfile, with modeled memory
expansion factor between 2.0 and 4.0 in increments of 0.5.
Figure 5-29. Command usage examples AN313.1
Notes:
This visual contains multiple examples of invoking the amepat command, along with a
description of each example.

V5.4.0.3
Instructor Guide

Purpose Give some examples of using the amepat tool.
Details
Transition statement The report generated by the tool contains multiple sections of
information.
Instructor Guide
Planning tool report

The report generated by the amepat tool contains multiple
sections.
The first few sections provide detail on the partition used to run
the tool.
AME Statistics section is displayed if AME is currently enabled
(used for monitoring purposes).
AME Modeled Statistics section provides options for AME
configurations on the partition.
Tool assumes that current LPAR logical memory is to remain
constant, and AME will be used to reduce physical memory allocation.
AME Recommendation section suggests an initial AME
configuration to try.
Figure 5-30. Planning tool report AN313.1
Notes:
Report structure
The report generated by the amepat command contains multiple sections. The AME
modeled statistics and AME recommendation sections will only be displayed if the
command is run by the root user. The AME statistics section will only be displayed if the
command was run on a system that has AME enabled.

V5.4.0.3
Instructor Guide

Purpose Describe the different output sections in the report generated by the tool.
Details
Transition statement Lets look at some example reports.
Instructor Guide
First example report (1 of 2)

Command Invoked : amepat 5
Date/Time of invocation : Sat May 15 14:59:16 PDT 2010

Total Monitored time : 7 mins 38 secs
Total Samples Collected : 3
System Configuration:
---------------------
Partition Name : cassini201
Processor Implementation Mode : POWER7
Number Of Logical CPUs : 16
Processor Entitled Capacity : 2.00
Processor Max. Capacity : 4.00
True Memory : 4.00 GB
SMT Threads : 4
Shared Processor Mode : Enabled-Uncapped
Active Memory Sharing : Disabled
Active Memory Expansion : Disabled
System Resource Statistics: Average Min Max

--------------------------- ----------- ----------- -----------
CPU Util (Phys. Processors) 0.35 [ 9%] 0.33 [ 8%] 0.36 [ 9%]
Virtual Memory Size (MB) 4052 [ 99%] 4052 [ 99%] 4053 [ 99%]
True Memory In-Use (MB) 4045 [ 99%] 4044 [ 99%] 4046 [ 99%]
Pinned Memory (MB) 677 [ 17%] 677 [ 17%] 677 [ 17%]
File Cache Size (MB) 48 [ 1%] 48 [ 1%] 49 [ 1%]
Available Memory (MB) 34 [ 1%] 34 [ 1%] 36 [ 1%]
Figure 5-31. First example report (1 of 2) AN313.1
Notes:
Report information
The first part of the report displays the command that was invoked to generate the report,
along with information on when the command was invoked, the monitoring time and the
number of samples of memory data that were taken.
The actual monitored time will likely be longer than the specified command duration
(indicated on the command line as duration in minutes, or interval in minutes and number
of samples to take) based on the memory usage and access patterns of the workload.
System Configuration section
This section of the report details the CPU and memory configuration of the partition.
System Resource Statistics section
This section of the report details the average, minimum and maximum CPU and memory
resource utilization of the partition during the monitoring period.

V5.4.0.3
Instructor Guide

Purpose Cover first part of an example report.
Details
Transition statement The remainder of the report is shown on the next visual.
Instructor Guide
First example report (2 of 2)

Active Memory Expansion Modeled Statistics:
-------------------------------------------
Modeled Expanded Memory Size : 4.00 GB
Average Compression Ratio : 5.62
Expansion Modeled True Modeled CPU Usage

Factor Memory Size Memory Gain Estimate
--------- ------------- ------------------ -----------
1.00 4.00 GB 0.00 KB [ 0%] 0.00 [ 0%]
1.24 3.25 GB 768.00 MB [ 23%] 0.45 [ 11%]
1.60 2.50 GB 1.50 GB [ 60%] 1.16 [ 29%]
1.78 2.25 GB 1.75 GB [ 78%] 1.39 [ 35%]
2.00 2.00 GB 2.00 GB [100%] 1.63 [ 41%]
2.29 1.75 GB 2.25 GB [129%] 1.87 [ 47%]
2.67 1.50 GB 2.50 GB [167%] 2.10 [ 53%]
Active Memory Expansion Recommendation:

---------------------------------------
The recommended AME configuration for this workload is to configure the LPAR
with a memory size of 3.25 GB and to configure a memory expansion factor
of 1.24. This will result in a memory gain of 23%. With this
configuration, the estimated CPU usage due to AME is approximately 0.45
physical processors, and the estimated overall peak CPU resource required for
the LPAR is 0.81 physical processors.
NOTE: amepat's recommendations are based on the workload's utilization level

during the monitored period. If there is a change in the workload's utilization
level or a change in workload itself, amepat should be run again.
The modeled Active Memory Expansion CPU usage reported by amepat is just an
estimate. The actual CPU usage used for Active Memory Expansion may be lower
or higher depending on the workload.
Figure 5-32. First example report (2 of 2) AN313.1
Notes:
Data compressibility
In this first example, the data being used by the application workload consisted of binary
data structures with mixed contents of double, long, and char[] data types. This normally
compresses reasonably well, and this is reflected in the observed average compression
ratio of 5.62 for the sample period.
Modeled statistics
The modeled statistics section of the report shows the possible true memory size that could
be allocated to the LPAR, and the memory expansion factor that would be used to retain an
expanded memory size of 4GB (the current memory size of the LPAR when the amepat
command was run). Each line also shows the estimated amount of CPU resource that
would be used by AME.
Recommendation section
The report recommends an initial configuration to use when enabling AME for the first time
with this workload. You should then monitor the actual performance obtained.

V5.4.0.3
Instructor Guide

Purpose Cover last part of the first example report.
Details By default, the memory granularity used for the modeled scenarios is the LMB
size of the managed system.
Transition statement Lets examine another example report.
Instructor Guide
Second example report (1 of 2)

Command Invoked : amepat 5
Date/Time of invocation : Sat May 15 15:16:34 PDT 2010

Total Monitored time : 8 mins 15 secs
Total Samples Collected : 3
---------------------
SMT Threads : 4
Active Memory Expansion : Disabled
System Resource Statistics: Average Min Max

--------------------------- ----------- ----------- -----------
CPU Util (Phys. Processors) 0.38 [ 9%] 0.35 [ 9%] 0.41 [ 10%]
Virtual Memory Size (MB) 4052 [ 99%] 4052 [ 99%] 4052 [ 99%]
True Memory In-Use (MB) 4046 [ 99%] 4046 [ 99%] 4046 [ 99%]
Pinned Memory (MB) 677 [ 17%] 677 [ 17%] 677 [ 17%]
File Cache Size (MB) 49 [ 1%] 49 [ 1%] 49 [ 1%]
Available Memory (MB) 34 [ 1%] 34 [ 1%] 34 [ 1%]
Figure 5-33. Second example report (1 of 2) AN313.1
Notes:
The second example report shown on this visual was generated on the same lab system
used for the first example. The workload running on the system was the exact same
application as used previously. The recorded CPU and memory utilization values during
the sample period are similar to those observed in the previous example.
The key difference this time is that the data being used by the application was binary data
similar to the contents of a compressed file.

V5.4.0.3
Instructor Guide

Purpose Cover first part of the second example report.
Details
Transition statement The remainder of the report is shown on the next visual.
Instructor Guide
Second example report (2 of 2)

Active Memory Expansion Modeled Statistics:
-------------------------------------------
Modeled Expanded Memory Size : 4.00 GB
Average Compression Ratio : 1.06
Expansion Modeled True Modeled CPU Usage

Factor Memory Size Memory Gain Estimate
--------- ------------- ------------------ -----------
1.00 4.00 GB 0.00 KB [ 0%] 0.00 [ 0%]
Active Memory Expansion Recommendation:

---------------------------------------
The recommended AME configuration for this workload is to configure the LPAR
with a memory size of 4.00 GB and to configure a memory expansion factor
of 1.00. This will result in a memory gain of 0%. With this
configuration, the estimated CPU usage due to AME is approximately 0.00
physical processors, and the estimated overall peak CPU resource required for
the LPAR is 0.41 physical processors.
NOTE: amepat's recommendations are based on the workload's utilization level

during the monitored period. If there is a change in the workload's utilization
level or a change in workload itself, amepat should be run again.
The modeled Active Memory Expansion CPU usage reported by amepat is just an
estimate. The actual CPU usage used for Active Memory Expansion may be lower
or higher depending on the workload.
Figure 5-34. Second example report (2 of 2) AN313.1
Notes:
Data compressibility
Note that in this second example, the average compression ratio detected during the
sampling period was only 1.06. As such, the workload would not benefit from using AME,
since the compressed data being added to the compressed pool would consume about the
same amount of space as the uncompressed copy of the data. If AME were enabled in this
situation, CPU resources would be utilized to compress and decompress the data, but
there would be almost no gain in effective memory capacity.
Because of the nature of the data being used in this sampling period, the modeled statistics
section only shows one situation, and that reflects the current configuration of the partition.
The recommendation section suggests a memory expansion factor of 1.0, essentially
meaning that no data compression will take place. In this situation, it would likely be better
to leave AME completely disabled (rather than enabling it with an expansion value of 1.0).

V5.4.0.3
Instructor Guide

Purpose Cover last part of second example report.
Details The two examples provided here are at the extreme ends of the scale of
possibilities. The second example workload was using compressed data, so obviously
AME would not be able to compress this again very effectively; therefore we see a ratio of
1.06. The first example was using data structures containing mixed data, i.e. some longs,
doubles, integers and string data, which compresses very well. In this case the
compression ratio was 5.62. In the real world, its likely that the observed compression ratio
will be somewhere between these two extreme values.
Transition statement Time for review questions.
Instructor Guide
Checkpoint
1. True or False: Any user can use the amepat command to
generate AME modeling information.
2. True or False: The amepat command can be used to

generate a report using recorded data.
3. True or False: The amepat command should be run when

the target workload is idle.
4. True or False: The amepat command can run on any system

running AIX 6.1 TL4 SP2 or above.
5. True or False: The amepat command should only be run

when AME is disabled.
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
1. True or False: Any user can use the amepat command to generate
AME modeling information.
2. True or False: The amepat command can be used to generate a
report using recorded data.
The answer is true.
3. True or False: The amepat command should be run when the target
workload is idle.
4. True or False: The amepat command can run on any system running
AIX 6.1 TL4 SP2 or above.
The answer is true.
5. True or False: The amepat command should only be run when AME
is disabled.
Instructor Guide
Topic 2: Summary
List workload characteristics used to evaluate suitability for
AME
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Topic 3: Deploying Active Memory Expansion

Describe how to activate the AME feature on a managed

system
Figure 5-37. Topic 3: Deploying Active Memory Expansion AN313.1
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
AME system requirements

Active Memory Expansion feature is enabled on a POWER7
system using a PowerVM activation code.
Can be configured with initial system order
Can be ordered as upgrade for existing system
60 day trial activation permitted once on each system
AME has minimum system and software requirements.

POWER7 system with AME feature (trial or permanent)
HMC V7R7.1.0 or above
eFW 7.1 or above
AIX V6.1 TL4 SP2 or above
POWER7 Blade systems cannot be managed by HMC.

Therefore cannot be configured with AME feature
Figure 5-38. AME system requirements AN313.1
Notes:
Minimum requirements
Active Memory Expansion can be enabled on AIX partitions running AIX 6.1 TL4 SP2 or
above when running on POWER7 processor-based systems that have a valid Active
Memory Expansion activation. The activation can be permanent or a trial activation that is
still within the 60 day validity period. The system must be HMC managed, and the HMC
must be running V7 R7.1.0 or above of the HMC software.

V5.4.0.3
Instructor Guide

Purpose Show the system requirements for implementing AME.
Details
Transition statement Each POWER7 system can have a one-time 60 day trial
activation of AME.
Instructor Guide
AME 60 day trial

A 60 day trial activation can be obtained once for each POWER7 system
Provides 60 power-on days of AME usage
Activation request made online
Activation code delivered online and optionally via email
Capacity on Demand website:
http://www-03.ibm.com/systems/power/hardware/cod/activations.html
Figure 5-39. AME 60 day trial AN313.1
Notes:
Trial activation
Since Active Memory Expansion is a new feature, and not all workloads might benefit from
its use, IBM has decided to allow a one-time 60 day free trial activation of AME to be
generated for each eligible POWER7 processor-based system.
The trial activation request is made online from the Capacity on Demand website at the
following URL:
http://www-03.ibm.com/systems/power/hardware/cod/activations.html
This website provides links for other useful information related to Capacity on Demand
features.

V5.4.0.3
Instructor Guide

Purpose Show the web interface for requesting a 60 day trial activation of AME.
Details
Transition statement Before we can request the trial activation, well need to collect
some information about the managed system.
Instructor Guide
Requesting a trial activation (1 of 2)

Need to gather system information to request trial activation.
Select managed system in HMC GUI.
Then select Capacity on Demand (COD) -> Other Advanced
Functions -> View Code Information.
Figure 5-40. Requesting a trial activation (1 of 2) AN313.1
Notes:
System information
In order to submit a trial request for AME, you will need to collect information about the
POWER7 processor-based managed system that the activation code will be used on.
This visual shows the menu path used to obtain this needed information.

V5.4.0.3
Instructor Guide

Purpose Show how to gather the information required to request a trial activation.
Details
Transition statement Once the managed system information has been collected, were
ready to request the trial activation.
Instructor Guide
Requesting a trial activation (2 of 2)

Proceed to Capacity on Demand website, and follow
instructions to request trial activation.
Enter system information and contact details.
Figure 5-41. Requesting a trial activation (2 of 2) AN313.1
Notes:
Request submission
Once the system information has been gathered, you can proceed with the trial AME
activation request. The visual above shows the web page obtained by following the link to
request a trial AME activation that is shown on the Capacity on Demand Web site shown in
a previous visual. This page contains multiple mandatory fields for the required system
information, along with customer contact details.

V5.4.0.3
Instructor Guide

Purpose Show the details required to request a 60 day trial activation of AME.
Details
Transition statement Once the code has been generated, we can retrieve it from
another web site.
Instructor Guide
Retrieving an activation code

Trial activation codes are delivered using Capacity on Demand
Activation Code website: http://www-912.ibm.com/pod/pod.
Can also be delivered using email if requested
Enter system information to retrieve code.
Figure 5-42. Retrieving an activation code AN313.1
Notes:
Once a trial request has been submitted, the appropriate Virtualization Engine Technology
(VET) code will be generated, and sent to the email address specified in the contact
details. The activation code can also be retrieved from the Capacity on Demand: Activation
Code website at http://www-912.ibm.com/pod/pod.
The screen capture on the left side of the visual shows the page displayed at the Activation
Code website. You can enter a system type and serial number to display the available
activation code information. A typical page is shown in the screen capture on the right side
of the visual. The VET code contains the information that will enable the AME feature on a
POWER7 processor-based system.

V5.4.0.3
Instructor Guide

Purpose Show how to retrieve an AME activation code.
Details
Transition statement Once the VET code has been retrieved, we need to use the HMC
to enter it on the managed system.
Instructor Guide
Enter activation code

Use HMC GUI to enter the VET activation code.
Select managed system in HMC GUI.
Then select Capacity on Demand (COD) -> Other Advanced
Functions -> Enter Activation Code.
Figure 5-43. Enter activation code AN313.1
Notes:
Once the required VET code has been obtained, it should be applied to the managed
system using the HMC interface as shown in this visual.

V5.4.0.3
Instructor Guide

Purpose Explain how to enter the VET activation code.
Details
Transition statement Once the code has been entered, check the managed system
properties to ensure the AME capability is shown as True.
Instructor Guide
Managed system properties

Activation of Active Memory Expansion on a managed system
is dynamic.
Check Capabilities tab of managed system properties after
code has been entered.
Figure 5-44. Managed system properties AN313.1
Notes:
System capabilities
Once the VET code for AME has been entered, you should check the Capabilities tab of
the managed system properties, as shown on the visual above. The value of the Active
Memory Expansion Capable property should be displayed as True.
Enabling of the AME capability on a managed system is dynamic. There is no need to
shutdown and then restart the managed system for the code to be recognized.

V5.4.0.3
Instructor Guide

Purpose Show how to check the managed system capabilities.
Details
Transition statement In addition to the 60 day trial version, theres also a permanent
activation available for AME.
Instructor Guide
Permanent activation of AME

Permanent activation of AME can be performed:
By including AME feature in system configuration when placing a new
order.
By placing MES upgrade order to add AME feature to existing system.
Permanent VET activation code delivery performed using

same delivery method as trial activation.
Figure 5-45. Permanent activation of AME AN313.1
Notes:
Upon completion of the 60 day trial, you might wish to permanently enable the AME
capability on a managed system. This is done by placing an MES upgrade order against
the system serial number. Delivery of the activation code for upgrade orders will be made
using the Activation Code website as shown earlier.
A permanent activation of AME can also be made as part of an initial system order. In this
case, the system will be delivered from the factory with the capability already enabled.
There will be no need to retrieve and then enter an activation code.

V5.4.0.3
Instructor Guide

Purpose Explain the options for obtaining a permanent AME activation.
Details In the student notes, MES = miscellaneous equipment supply - IBM terminology
for an upgrade order.
Transition statement Once the managed system is AME capable, how would we
configure AME on an individual partition?
Instructor Guide
Configuring an LPAR for AME (1 of 4)

Memory section of LPAR profile contains AME configuration.
Can be set when editing existing profile, creating new profile, or
creating new LPAR using wizard tool
Figure 5-46. Configuring an LPAR for AME (1 of 4) AN313.1
Notes:
Once a managed system has had the AME capability enabled, you can configure AME on
individual LPARs. The AME configuration of each LPAR is independent, and is made by
specifying a memory expansion factor value in the partition profile. This can be performed
when creating the partition (along with its default profile), by editing an existing profile, or by
creating a new profile.

V5.4.0.3
Instructor Guide

Purpose Show where the memory expansion factor is set using the HMC GUI.
Details
Transition statement The memory values listed in the profile are still interpreted as
logical memory values (physical in the case of a dedicated memory partition).
Instructor Guide

Memory values specified in profile are logical allocation values.
Actual expanded memory
amount achieved by AIX will
depend on compressibility
of data.
Partition page table
scaled based on
maximum memory
specified in profile.
Notes:
Memory values
The minimum, desired, and maximum memory values specified in the partition profile are
actual logical memory values that will be presented to the firmware and operating system
running in the partition. If the partition is configured to use dedicated memory, then when it
is activated, it will be allocated the desired amount of physical memory assuming there is
sufficient available physical memory. If there is insufficient available physical memory to
allocate the desired value, the partition will still be activated assuming it can be provided
with an amount of physical memory that is greater than or equal to the minimum memory
value.
The extended memory value presented to the applications and users will be calculated by
applying the memory expansion factor value as a multiplier to the actual memory value
currently presented to the operating system.

V5.4.0.3
Instructor Guide

Purpose Explain that the memory values (minimum, desired, and maximum) listed in
the profile are logical memory values. These will translate to actual physical memory
values for a partition configured for dedicated memory.
Details The partition page table will still be sized based on the maximum memory value
specified in the profile, even if this is smaller than the resulting expanded memory value
determined by the expansion factor.
Transition statement The HMC command line can also be used to configure AME.
Instructor Guide

HMC command line can also be used
Simply set the memory expansion factor value
To enable AME in the profile called normal for LPAR1 on managed
system cassini:
chsyscfg -r prof -m cassini i
"name=normal,lpar_name=LPAR1,mem_expansion=1.5"
To list the AME setting of the normal profile for LPAR1 on managed
system cassini:
lssyscfg r prof m cassini -filter \
"lpar_names=LPAR1,profile_names=normal" \
F lpar_name,mem_expansion
To list the current AME status of all LPARs on managed system cassini:
lshwres r mem m cassini -level lpar F
lpar_name,curr_mem_expansion
Notes:
HMC command line
In addition to the HMC GUI, the HMC command line can also be used to modify and list the
AME status of partition profiles, and to list the current AME status of partitions.
This visual contains multiple examples of using the HMC command line.

V5.4.0.3
Instructor Guide

Purpose Show how to use HMC CLI to set and view AME partition and profile values.
Details
Transition statement There are some operational aspects of implementing AME that
we should be aware of.
Instructor Guide

An LPAR must be reactivated using a suitable profile for AME
to be enabled.
Once AME is enabled, the memory expansion factor can be

changed dynamically.
Setting the expansion factor value to 1.0 effectively disables

memory compression.
Note that this is not the same as unconfiguring AME.
By default, a partition with AME enabled will only use 4KB

memory frames.
Notes:
Operational considerations
This visual lists additional operational considerations when using AME.
In particular, note that while AME is enabled dynamically on a managed system, a partition
must be reactivated using a suitable profile to enable (or disable) AME.
An AIX 6 partition running on POWER7 processor-based hardware will use 64KB pages for
many portions of the kernel address space when AME is not configured. When AME is
configured, even if the memory expansion factor is set to 1.0, the operating system will not
use 64KB pages by default.

V5.4.0.3
Instructor Guide

Purpose Explain some of the caveats of configuring AME.
Details
Transition statement How is AME handled during DLPAR operations?
Instructor Guide
DLPAR operations with AME

DLPAR memory add and remove operations are on the underlying
logical memory value presented to the OS.
New expanded memory amount will be recalculated by applying memory
expansion factor to new logical amount.
Expansion factor can also be changed dynamically.
Figure 5-50. DLPAR operations with AME AN313.1
Notes:
When performing memory DLPAR operations on a partition configured with AME, it is
important to understand the relationship between the values shown on the DLPAR dialog,
and the expanded memory value presented to applications and users in the partition.
Adding or removing memory is performed at the operating system level, and as such is
dealing with the actual logical memory blocks allocated to the operating system. Once the
specified number of logical memory blocks have been added to (or removed from) the
partition, a new expanded memory value will be calculated using the current memory
expansion factor value and the current logical memory amount.
The memory expansion factor value can also be changed dynamically.

V5.4.0.3
Instructor Guide

Purpose Describe how AME is handled during memory DLPAR operations.
Details
Transition statement How is AME removed from a partition configuration?
Instructor Guide
Unconfiguring AME
Using the GUI, simply clear the checkbox.
Expansion factor value automatically reset to 0.0
Using the CLI, set mem_expansion value in profile to 0.0.

chsyscfg -r prof -m cassini i \
"name=normal,lpar_name=LPAR1,mem_expansion=0.0
LPAR must then be reactivated using modified profile to disable AME.
If trial AME activation has expired, must use CLI to remove AME
configuration from LPAR profile due to bug in HMC GUI in V7R7.1.0.
Figure 5-51. Unconfiguring AME AN313.1
Notes:
AME removal
In order to remove AME from a partition configuration, the partition must be reactivated
using a profile that has a memory expansion factor value of 0.0.
When modifying a profile using the HMC GUI, simply clearing the AME checkbox will set
the expansion factor value to 0.0. When using the HMC CLI, set the mem_expansion value
in the profile to 0.0, as shown in the example on the visual above.
When a trial activation of AME expires, it will no longer be possible to activate partitions
using profiles that specify a memory expansion factor value greater than 0.0. A bug in HMC
V7 R7.1.0 prevents the GUI from clearing the AME checkbox when a trial activation of AME
has expired. The workaround in this case is to use the command line to set the expansion
factor to 0.0.

V5.4.0.3
Instructor Guide

Purpose Explain how to unconfigure AME for a partition.
Details The bug is expected to be fixed in the next release of HMC R7V7 code in June
2010.
Transition statement Time for review.
Instructor Guide
Checkpoint
1. True or False: A partition can use AME on any POWER7
system.
2. True or False: Multiple AME 60 day trial activations can be

used on a single system.
3. True or False: A managed system must be rebooted to

enable an AME activation code.
4. True or False: A partition must be reactivated to enable AME.
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
1. True or False: A partition can use AME on any POWER7 system.
The answer is false. (AME can only be configured on a system with
the correct activation code.)
2. True or False: Multiple AME 60 day trial activations can be used on a

single system.
The answer is false. (Only a single 60 day trial can be used on each
system.)
3. True or False: A managed system must be rebooted to enable an

AME activation code.
The answer is false. (The AME capability is enabled dynamically when
the activation code is entered.)

The answer is true.
Instructor Guide
Topic 3: Summary
Describe how to activate the AME feature on a managed

system
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Topic 4: Monitoring AME partitions

Determine the memory compression level achieved in a

partition
Determine the CPU resources used for memory compression

and decompression
Figure 5-54. Topic 4: Monitoring AME partitions AN313.1
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Monitoring AME partitions

Monitoring a partition with AME is not much different to
monitoring one without AME.
Existing tools will function as expected
When AME is enabled, two key metrics are used to determine
the efficiency of AME.
CPU utilization
Expanded memory deficit
A workload running AME will consume more CPU resources
than the same workload without AME.
If CPU becomes constrained, the workload will suffer
Use existing AIX tools to monitor overall CPU usage of the LPAR
Can use lparstat and amepat to monitor CPU usage of AME
compression and decompression activities
Use vmstat to monitor size of compressed memory pool
Figure 5-55. Monitoring AME partitions AN313.1
Notes:
Existing AIX tools function as expected when run in a partition configured with AME.
Monitoring of AME partitions is very similar to those without AME.
In particular, CPU resource utilization should be monitored, since lack of CPU resource can
affect the workload. This is no different than a partition without AME, however the CPU
resource consumption will be higher when AME is compressing and decompressing pages.
A new performance metric that will need to be monitored is the expanded memory deficit.

V5.4.0.3
Instructor Guide

Purpose Explain that monitoring AME partitions is almost like monitoring partitions
without AME.
Details
Transition statement What would happen on a partition where the chosen memory
expansion factor is to aggressive?
Instructor Guide
Expanded memory deficit (1 of 4)

A chosen memory expansion factor might be too aggressive.
Target expanded memory value might not be achievable based on
compressibility of data being used by the workload.
If data placed in compressed pool does not compress as well
as expected, there might be an expanded memory deficit.
This indicates the partition will not achieve the intended expanded
memory target.
In the examples that follow, a partition is configured as follows:
20GB physical (logical) memory allocation
Memory expansion factor value equals 1.5
Means applications and users are told partition has 30GB of memory
capacity
Figure 5-56. Expanded memory deficit (1 of 4) AN313.1
Notes:
A memory deficit is the name used to describe a situation where a partition configured for
AME is unable to compress sufficient data to meet the expanded memory target value.
Typically this is because the actual data compression ratio achieved is less than required.
The visuals that follow contain diagrams that help to explain the concept of a memory
deficit. The LPAR depicted in the diagrams is configured as described on the visual.

V5.4.0.3
Instructor Guide

Purpose Introduce the concept of an expanded memory deficit.
Details
Transition statement Lets look at a diagram that helps to explain what happens in a
deficit situation.
Instructor Guide

If data compresses on average at 1.56 ratio, and compressed pool can
grow to 18GB, expanded memory target will be met.
LPARs expanded
logical memory
Uncompressed
data Uncompressed
2GB memory
pool
2GB
20GB
30GB Compressed
Compressed Compression memory
ratio
data 1.56 pool
28GB 18GB
View presented to
firmware, HMC, OS

Notes:
Zero deficit
When there is a zero memory deficit, the partition will be able to compress sufficient data to
reach the expanded memory target. An example of this situation is shown in the diagram
on the visual.

V5.4.0.3
Instructor Guide

Purpose Show how the expanded memory target is achieved when the compression
ratio achieved is good enough.
Details
Transition statement What would happen if the compression ratio thats achieved isnt
quite good enough?
Instructor Guide

If instead data compresses on average at 1.4 ratio, and compressed
pool can grow to 18GB, there will be an expanded memory deficit of
almost 3GB.
LPARs expanded
logical memory
Uncompressed
data Uncompressed
2GB memory
pool
2GB
20GB
30GB Compressed
Compressed
data
Compression memory
ratio
25.2GB 1.4 pool
18GB
View presented to
Deficit 2.8GB firmware, HMC, OS

Notes:
In a memory deficit situation, the partition will not be able to compress sufficient data to
reach the expanded memory target. An example of this situation is shown in the diagram
on the visual.

V5.4.0.3
Instructor Guide

Purpose
Details
Transition statement What can we do if theres an expanded memory deficit?
Instructor Guide

If the expanded memory deficit for a partition is zero, the target
expanded memory can be achieved.
If the expanded memory deficit is non-zero, the target

expanded memory cannot be achieved.
Using previous example, if memory demands exceeded 27.2 GB, AIX
would start paging out pages from the uncompressed pool to paging
space.
Solution is to reduce memory expansion factor to reflect less

aggressive compression ratio.
This reduces expanded memory target.
Should also add an amount of memory to the partition so that
<expansion factor> X <memory> = expanded memory
Notes:
Correcting a deficit
Correcting a deficit will typically involve lowering the memory expansion factor to a less
aggressive value, and adding additional true memory to the partition. This allows the
reconfigured partition to still retain the target expanded memory value.

V5.4.0.3
Instructor Guide

Purpose Explain how to correct an expanded memory deficit.
Details
Transition statement The amepat command can be used for basic monitoring of an
AME configuration.
Instructor Guide
Basic AME monitoring

The amepat command is used to monitor AME statistics when
AME is enabled.
When invoked with no arguments it reports a snapshot of AME
configuration and performance metrics.
Modeling information not displayed
While amepat reports CPU utilization metrics (including those

consumed for AME activities), this is an average from last
boot.
Use lparstat and vmstat for interval monitoring.
Figure 5-60. Basic AME monitoring AN313.1
Notes:
The amepat command can be used to perform basic monitoring of AME statistics. When
invoked with no arguments, it provides a snapshot of AME performance information. No
modeling information is provided when the command is invoked in this way.

V5.4.0.3
Instructor Guide

Purpose Explain amepat usage for basic monitoring.
Details
Transition statement Lets look at an example report.
Instructor Guide
Basic monitoring example report

# amepat
Command Invoked : amepat
Date/Time of invocation : Sun May 16 22:45:10 CDT 2010

Total Monitored time : NA
Total Samples Collected : NA
---------------------
SMT Threads : 4
Active Memory Expansion : Enabled
Target Expanded Memory Size : 4.00 GB
Target Memory Expansion factor : 1.60
System Resource Statistics: Current

--------------------------- ----------------
CPU Util (Phys. Processors) 0.00 [ 0%]
Virtual Memory Size (MB) 3653 [ 89%]
True Memory In-Use (MB) 2556 [100%]
Pinned Memory (MB) 512 [ 20%]
File Cache Size (MB) 13 [ 1%]
Available Memory (MB) 356 [ 9%]
AME Statistics: Current

--------------- ----------------
AME CPU Usage (Phy. Proc Units) 0.00 [ 0%]
Compressed Memory (MB) 1917 [ 47%]
Compression Ratio 2.37
Figure 5-61. Basic monitoring example report AN313.1
Notes:
AME information
The amepat command will display AME related information when invoked with no
arguments. The visual above contains an example of the output format.
The System Resource Statistics section contains summarized CPU and memory resource
utilization information. The CPU resource information is from when the LPAR was last
booted. Other commands should be used for fine grained interval monitoring.
The AME Statistics section is displayed when the command is run on an LPAR that has
AME enabled.

V5.4.0.3
Instructor Guide

Purpose Show an example report generated by amepat when invoked with no
arguments on a partition configured with AME.
Details
Transition statement The lparstat command can display information about the AME
configuration of a partition.
Instructor Guide
The lparstat command (1 of 2)

AME configuration information included in lparstat i output
# lparstat I
. . . . .
Memory Mode : Dedicated-Expanded
Total I/O Memory Entitlement : -
Variable Memory Capacity Weight : -
Memory Pool ID : -
Physical Memory in the Pool : -
Hypervisor Page Size : -
Unallocated Variable Memory Capacity Weight: -
Unallocated I/O Memory entitlement : -
Memory Group ID of LPAR : -
Desired Virtual CPUs : 2
Desired Memory : 2560 MB
Desired Variable Capacity Weight : 128
Desired Capacity : 2.00
Target Memory Expansion Factor : 1.60
Target Memory Expansion Size : 4096 MB
Figure 5-62. The lparstat command (1 of 2) AN313.1
Notes:
AME information
The lparstat command will display AME related information when invoked with the -i flag. If
AME is currently not configured, the fields will contain a dash character. The visual contains
an example of the output format.

V5.4.0.3
Instructor Guide

Purpose Show AME information displayed by the lparstat command.
Details
Transition statement The -c flag can be used with the lparstat command to display
AME information.
Instructor Guide
The lparstat command (2 of 2)

The lparstat command with the c flag will display
compression information when AME is enabled.
# lparstat -c 2 4
System configuration: type=Shared mode=Uncapped mmode=Ded-E smt=4 lcpu=8

mem=4096MB tmem=2560MB psize=8 ent=2.00
%user %sys %wait %idle physc %entc lbusy vcsw phint %xcpu dxm
----- ----- ------ ------ ----- ----- ------ ----- ----- ------ ------
5.7 0.2 3.6 90.5 0.19 9.3 2.9 917 0 6.2 0
5.6 0.1 3.4 90.9 0.18 9.0 2.2 938 0 5.3 0
5.8 0.1 3.4 90.7 0.19 9.3 2.3 935 0 6.3 0
5.8 0.1 2.6 91.5 0.19 9.3 2.7 931 0 6.3 0
%xcpu: Percentage of CPU utilization used for AME

compression and decompression activity
dxm: Memory deficit in MB
Figure 5-63. The lparstat command (2 of 2) AN313.1
Notes:
AME information
The lparstat command will display AME information when invoked with the -c flag. The
information is only shown if AME is currently configured. The visual contains an example of
the output format.

V5.4.0.3
Instructor Guide

Purpose Show AME information displayed by the lparstat command.
Details
Transition statement The -c flag can be used with the vmstat command to display
AME information.
Instructor Guide
The vmstat command

The vmstat command with the c flag will display compression
information when AME is enabled.
Each line of information is very long (suggest using terminal window
with 132 characters per line).
# vmstat -c 2 4
System Configuration: lcpu=8 mem=4096MB tmem=2560MB ent=2.00 mmode=dedicated-E
kthr memory page faults cpu

------- ------------------------------------------------------ ----------------------- ------------------ -----------------------
r b avm fre csz cfr dxm ci co pi po in sy cs us sy id wa pc ec
1 0 555618 495161 10449 4853 0 0 0 0 0 4 16 75 12 0 88 0 0.37 18.6
1 0 555618 495160 10449 4853 0 0 0 0 0 1 12 74 12 0 88 0 0.38 18.9
1 0 555618 495160 10449 4853 0 0 0 0 0 3 9 66 12 0 87 0 0.39 19.3
1 0 555618 495160 10449 4853 0 0 0 0 0 0 13 66 13 0 87 0 0.39 19.6
memory columns are:

csz: Compressed pool size in 4KB page units
cfr: Free frames in compressed pool in 4KB page units
dxm: Expanded memory deficit in 4KB page units
page columns are:
ci: Page-ins per second from compressed pool
co: Page-outs per second to compressed pool
Figure 5-64. The vmstat command AN313.1
Notes:
AME information
The vmstat command will display AME information when invoked with the -c flag. The
information is only shown if AME is currently configured. The visual contains an example of
the output format.

V5.4.0.3
Instructor Guide

Purpose Show AME information displayed by the vmstat command.
Details
Transition statement If AME is enabled, the topas command displays AME information
on the main panel.
Instructor Guide
The topas command

The topas command will display AME statistics when AME is enabled.
Topas Monitor for host: cassini201 EVENTS/QUEUES FILE/TTY
Sun May 16 23:14:20 2010 Interval: 2 Cswitch 68 Readch 0
CPU User% Kern% Wait% Idle% Physc Entc Reads 0 Target
Rawin expanded
0
Forks 0
memory
Igets
amount
0
Waitqueue 0.0
Total 0.0 0.0 0.0 0.0 0.0 PAGING Real,MB 4096
Faults 0 % Comp 52
FileSystem KBPS TPS KB-Read KB-Writ Steals 0 % Noncomp 0
Total 0.0 0.0 0.0 0.0 PgspIn 0 % Client 0
PgspOut 0
Name PID CPU% PgSp Owner PageIn 0 PAGING SPACE
rmem 217308 4.9 512.3 root PageOut 0 Size,MB 512
rmem 286902 4.9 512.3 root Sios 0 % Used 2
rmem 291008 4.9 512.3 root % Free 98
topas 184386 0.0 1.4 root AME
True memory
gil 81960 0.0 0.1 root TMEM,MB 2560 WPAR Activ 0
rpc.lock Compressed
237746 pool size
0.0 0.2 root CMEM,MB 40 WPAR Total 0
Expansion
snmpmibd factor 0.0
209002 (target/actual)
0.7 root EF[T/A] 1.6/1.6 Press: "h"-help
hostmibd
Compressed 192532
page-ins0.0and0.8 root
page-outs CI:0.0 CO:0.0 "q"-quit
netm 77862 0.0 0.1 root
Figure 5-65. The topas command AN313.1
Notes:
AME information
The topas command will display AME information on the main panel if AME is currently
configured. The visual contains an example of the output format.

V5.4.0.3
Instructor Guide

Purpose Show AME information displayed by the topas command.
Details
Transition statement The amepat command can also be used for fine tuning the AME
configuration.
Instructor Guide
Fine tuning AME

The initial report generated by amepat is only a suggested
starting configuration for AME.
Once AME is enabled, you can use amepat again to fine tune
the configuration.
When AME CPU consumption is low, could possibly benefit from
using a more aggressive expansion factor.
Monitor workload at peak utilization.
Report is more accurate and useful, since the actual AME CPU usage
and achieved compression ratio will be known for current workload
configuration.
Figure 5-66. Fine tuning AME AN313.1
Notes:
The amepat command can also be used on a partition with AME configured to perform fine
tuning of the configuration. Invoke the command to gather data while the running workload
is at peak utilization. The generated report will be more accurate and useful, as actual AME
CPU resource consumption and achieved data compression rate information is available.

V5.4.0.3
Instructor Guide

Purpose Explain that amepat can be used for fine tuning the AME configuration.
Details
Transition statement Time for checkpoint questions.
Instructor Guide
Checkpoint
1. True or False: Monitoring a partition that has AME configured
is completely different to monitoring a partition without AME.
2. True or False: A memory deficit is resolved by lowering the

expansion factor value, and removing true memory.
3. True or False: The vmstat command will always report AME

statistics when AME is enabled.
4. True or False: The topas command will always show AME

statistics on the initial page when AME is enabled.
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
The answer is true.
Instructor Guide
Topic 4: Summary
Determine the memory compression level achieved in a

partition
Determine the CPU resources used for memory compression

and decompression
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Exercise
Unit
exerc
ise
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Unit summary
decompression
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide

V5.4.0.3
Instructor Guide
Uempty Unit 6. N_Port ID Virtualization
Estimated time
01:00

This unit describes the new PowerVM NPIV feature introduced in
Virtual I/O Server version 2.1.

Describe the NPIV PowerVM feature
Describe how to configure virtual Fibre Channel adapters on the
Virtual I/O Server and client partitions
Discuss how to use the HMC GUI and commands to work with the
World Wide Port Name (WWPN) pairs
Identify commands used to examine the NPIV configuration

Machine exercises
References
SG24-7590-01 IBM PowerVM Virtualization Managing and Monitoring
Redpaper: Implementing the Qlogic Intelligent Pass-thru Module for
IBM BladeCenter
Copyright IBM Corp. 2010, 2011 Unit 6. N_Port ID Virtualization 6-1

Instructor Guide
Unit objectives
Describe how to configure virtual Fibre Channel adapters on
the Virtual I/O Server and client partitions
Discuss how to use the HMC GUI and commands to work with
the World Wide Port Name (WWPN) pairs
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details

Instructor Guide
NPIV: Overview
Virtualization of a physical Fibre Channel port
Physical Fibre Channel port sharing
Provides dedicated N_Port IDs for LPARs
One F_Port to be associated with multiple N_Port IDs

F_Port: fabric port; port on a SAN switch
N_Port: node port; port on computer (or disk)
Figure 6-2. NPIV: Overview AN313.1
Notes:
With N_Port ID Virtualization (NPIV), you can configure the managed system so that
multiple logical partitions can access independent physical storage through the same
physical fiber channel adapter. To access physical storage in a typical storage area
network (SAN) that uses fiber channel, the physical storage is mapped to logical units
(LUNs) and the LUNs are mapped to the ports of physical fiber channel adapters. Each
physical port on each physical fiber channel adapter is identified using one worldwide port
name (WWPN). NPIV is a standard technology for fiber channel networks and enables you
to connect multiple logical partitions to one physical port of a physical fiber channel
adapter. Each logical partition is identified by a unique WWPN, and this allows you to
connect each logical partition to independent physical storage on a SAN.
Using their unique WWPNs and the virtual fiber channel connections to the physical fiber
channel adapter, the operating systems running in the client logical partitions discover,
instantiate, and manage their physical storage located on the SAN. With NPIV you have
multiple fiber Channel initiators occupying and using a single physical port; easing
hardware requirements in Storage area network design.

V5.4.0.3
Instructor Guide
Uempty NPIV also allows one F_Port (switch port) to be associated with multiple N_Port (node port)
IDs. Physical fiber channel HBA (Host Bus Attachment) can be shared across multiple
guest operating systems in a virtual environment. The combination of the ability of an
N_Port device, such as a host bus adapter (HBA), to have multiple N_Port IDs and the
ability of fabric switches to accept NPIV capable devices is the basic concept of transparent
switching.
Using the SAN tools of the SAN switch vendor, you zone your NPIV-enabled switch to
include WWPNs that are created by the HMC for any virtual fiber Channel client adapter
with the WWPNs from your storage device in a zone. This is the same as required in an
environment using physical fiber channel adapters The SAN uses zones to provide access
to the targets based on WWPNs.

Instructor Guide
Instructor notes:
Purpose Describe NPIV.
Details
Transition statement Lets take a look at an environment without NPIV.

V5.4.0.3
Instructor Guide
Uempty
SAN disk access without NPIV

Share a Fibre Channel adapter using virtual SCSI
Virtual target devices backed by physical devices
Physical
P
adapter Virtual I/O Server Client Client
VSCSI server Physical Device Virtual

S mapping
virtual adapter device target
device
C VSCSI client
virtual adapter Physical Device Virtual
Virtual Virtual
device mapping target
target target
Physical device
device device
storage NON NPIV S1 S2
FC adapter C2 C1
SAN
SAN
B
C
Hypervisor
A
Figure 6-3. SAN disk access without NPIV AN313.1
Notes:
Before NPIV, the only way to share a Fibre Channel adapter was by using the Virtual SCSI
protocol.
Virtual SCSI is based on a client-server relationship. The VIO Server owns the physical
resources as well as the virtual SCSI server adapter, and acts as a server, or SCSI target
device. The client logical partitions have a SCSI initiator, referred to as the virtual SCSI
client adapter, and access the virtual SCSI targets as standard SCSI LUNs. You configure
the virtual adapters by using the HMC or IVM. The configuration and provisioning of virtual
disk resources is performed by using the VIO Server. Physical disks owned by the VIO
Server can be either exported and assigned to a client logical partition as a whole or can be
partitioned into parts, such as logical volumes or files. The logical volumes and files can
then be assigned to different logical partitions. Therefore, using virtual SCSI, you can share
adapters as well as disk devices. To make a physical volume, logical volume, or files
available to a client logical partition requires that it be assigned to a virtual SCSI server
adapter on the Virtual I/O Server. The client logical partition accesses its assigned disks
through a virtual-SCSI client adapter. The virtual-SCSI client adapter recognizes standard
SCSI devices and LUNs through this virtual adapter.

Instructor Guide
Instructor notes:
Purpose
Details VSCSI requires the configuring of virtual target devices in the VIOS. NPIV does
not require this.
Additional information The following SCSI peripheral device types are supported:
Disk backed by logical volume
Disk backed by physical volume
Disk backed by file
Optical CD-ROM, DVD-RAM, and DVD-ROM
Optical DVD-RAM backed by file
Tape devices

V5.4.0.3
Instructor Guide
Uempty
SAN access using NPIV

Multiple N_Port IDs share a single physical N_Port
Fibre Channel pass through communications
Provides direct Fibre Channel connections from client partitions to
SAN resources
No virtual target device defined on the VIO Server
P Physical adapter Client Client

Virtual I/O Server
S Virtual Fibre Channel
server adapter
Passthru LUN LUN
Virtual Fiber Channel
C A B
client adapter
NPIV C2 C1
N_Port FC adapter S1 S2
SAN N_Port N_Port

B SAN
C
Hypervisor
A
Figure 6-4. SAN access using NPIV AN313.1
Notes:
With NPIV, the VIOS's role is fundamentally different. The VIOS facilitates adapter sharing
only. There is no device level abstraction or emulation. Rather than a storage virtualizer,
the VIOS serving NPIV is a passthru, providing a fiber channel pass-through connection
from the client to the SAN.
NPIV is a standard technology for fiber channel networks that enables you to connect
multiple logical partitions to one physical port of a physical fiber channel adapter. Each
logical partition is identified by a unique WWPN, which means that you can connect each
logical partition to independent physical storage on a SAN. To enable NPIV on the
managed system, you must create a Virtual I/O Server logical partition (version 2.1, or
later) that provides virtual resources to client logical partitions. You assign the physical fiber
channel adapters (with support for NPIV) to the Virtual I/O Server logical partition. Then,
you connect virtual fiber channel adapters on the client logical partitions to virtual fiber
channel adapters on the Virtual I/O Server logical partition. A virtual fiber channel adapter
is a virtual adapter that provides client logical partitions with a fiber channel connection to a
storage area network through the Virtual I/O Server logical partition. The Virtual I/O Server
cannot access and does not emulate the physical storage to which the client logical

Instructor Guide
partitions have access.The Virtual I/O Server logical partition provides the connection
between the virtual fiber channel adapters on the Virtual I/O Server logical partition and the
physical fiber channel adapters on the managed system.

V5.4.0.3
Instructor Guide

Purpose Describe an environment that uses NPIV for storage access.
Details
Transition statement Lets discuss the benefits of NPIV.

Instructor Guide
NPIV benefits
Optimizes FC HBA resource usage
Simplifies SAN-based resource assignments to client partitions
LUN assigned to the WWPNs of the client virtual adapter
Compatible with storage solutions
SAN managers, Copy Services, backup / restore
Supported platforms
POWER6 servers and blades
HMC-managed and IVM-managed servers
Enables access to other SAN devices like tape libraries
Compatible with Live Partition Mobility
Physical FC HBA port supports 64 virtual ports
VIOS can support NPIV and vSCSI simultaneously
Figure 6-5. NPIV benefits AN313.1
Notes:
Key benefits include the following:
Automatically adjusts to SAN fabric speed: 8Gbps, 4Gbps, or 2Gbps
- Optimizes resource usage since the physical fiber channel adapter is shared.
- Each physical NPIV capable FC HBA (Host Bus Adapter) will support 64 virtual
ports.
- NPIV simplifies the assignment of SAN-based resources to client partitions and SAN
zoning.
The LUN is assigned to the WWPNs of the client virtual adapter. The LPAR host
is defined as the disk sub-system.
You do not have to identify LUN numbers on the VIOS before mapping to
clients).
- Supported on POWER6 servers, blades, HMC-managed and IVM-managed
servers.
- Enables access to other SAN devices like tapes libraries.
- VIOS can support NPIV and vSCSI simultaneously.
- Compatible with LPM (Live Partition Mobility).

V5.4.0.3
Instructor Guide

Purpose Identify the key benefits.
Details
Transition statement NPIV can coexist with VSCSI.

Instructor Guide
VIOS with NPIV and vSCSI

LUNs assigned to the WWPNs of the physical NPIV FC adapter
Or LUNs mapped to virtual target devices and exported as VSCSI
devices
Physical Virtual I/O Server Client

P adapter
Physical Device Virtual
device mapping target Client
VSCSI server Client
S C device
virtual adapter
Physical
device
VSCSI client
C D VSCSI
virtual adapter LUN LUN
disk
C A B
Physical
NPIV S2 S3
storage S1
FC adapter C3 C2 C1
SAN
SAN
B
C
Hypervisor
D A
You cannot mix VSCSI and NPIV paths to the same LUN.
Figure 6-6. VIOS with NPIV and vSCSI AN313.1
Notes:
VIOS can support NPIV and vSCSI simultaneously. Some LUNs can be assigned to the
WWPNs of the Physical NPIV FC adapter. These include the WWPNs assigned to the
clients virtual fiber channel adapters. Simultaneously, the VIOS can also provide access to
LUNs that are mapped to Virtual Target Devices and exported as vSCSI devices. There
can be MPIO or vendor-supplied multi-pathing software used to manage the paths to the
LUNs. You cannot mix vSCSI and NPIV paths to the same LUN.
The client can have one or more Virtual I/O Servers (VIOS) providing the pass-through
function for NPIV. The client can also have one or more VIOS hosting vSCSI storage. The
administrator could configure the client to boot from internal disk, vSCSI disk, or NPIV disk.
The physical HBA in the VIOS can support both NPIV and vSCSI traffic.

V5.4.0.3
Instructor Guide

Purpose Describe an environment that supports NPIV and VSCSI simultaneously.
Details
Transition statement Lets take a look at the hardware and software requirements.

Instructor Guide
NPIV requirements and considerations

Requirements
Power6 hardware
5735 PCIe 8Gb Fibre Channel adapter
PCI express slot
VIOS V2.1 (PowerVM software)
Clients
AIX 5.3 TL09 SP 2
AIX 6.1 TL02 SP 2
Linux (kernel v2.26.27: SLES 10 SP2, RHEL 4.7/5.2),
IBM i
HMC 7.3.4 and minimum FW Ex340_036
Entry SAN switch must be NPIV capable
Considerations
Operates at link speeds of 2, 4, and 8 Gbps (not 1Gpbs)
Disk sub-system does not have to be NPIV capable
Unique WWPN generation (allocated in pairs)
Both must be zoned (2nd WWPN used with Live Partition Mobility)
Each virtual FC HBA has a unique and persistent identity
Figure 6-7. NPIV requirements and considerations AN313.1
Notes:
Only the first SAN switch which is attached to the Fibre Channel adapter in the Virtual I/O
Server needs to be NPIV capable. Other switches in your SAN environment do not need to
be NPIV capable.
An NPIV implementation requires two participating ports:
An N_Port that communicates with a Fibre Channel fabric for requesting port
addresses and subsequently registering with the fabric.
An F_Port (SAN switch port) that assigns the addresses and provides fabric services.
WWPNs are generated based on the range of names available for use with the prefix in the
vital product data on the managed system. This 6-digit prefix comes with the purchase of
the managed system and includes 32,000 pairs of WWPNs. When you remove the
connection between a logical partition and a physical port (for example, by deleting an
adapter), the hypervisor deletes the WWPNs that are assigned to the virtual Fibre Channel
adapter on the logical partition.

V5.4.0.3
Instructor Guide
Uempty The hypervisor does not reuse the WWPNs that are assigned to the virtual fiber Channel
client adapter on the client logical partition. If you create a new virtual fiber channel
adapter, you get a NEW pair of WWPNs. The pair is critical to proper operation, and both
must be zoned (2nd WWPN is used for Live Partition Mobility).
Power6 hardware, minimum firmware Ex340_041
Entry level systems: EL340_041
Midrange systems: EM340_041
Software:
HMC V7.3.4, or later
Virtual I/O Server Version 2.1 with Fix Pack 20.1, or later
AIX 5.3 TL9, or later
SDD 1.7.2.0 + PTF 1.7.2.2
IBM Multipath Software
- NPIV clients require the following versions:
SDD 1.7.2.2
SDDPCM 2.2.0.6 or 2.4.0.1
http://www-01.ibm.com/support/docview.wss?rs=540&context=ST52G7&uid=ssg
1S1003469
- For VIOS 2.1, follow the SDD/SDDPCM support matrix for AIX 6.1 versions
http://www-01.ibm.com/support/docview.wss?rs=540&uid=ssg1S7001350
EMC Power Path
- AIX 6.1clients require Power Path 5.3.0.0
- VIOS 2.1 would need Power Path 5.3.0.0
Hitachi Dynamic Link Manager
- AIX clients require HDLM 5.9.4
- VIOS 2.1 would need HDLM 5.9.4

Instructor Guide
Instructor notes:
Purpose Identify the requirements.
Details
Additional information If you reach the maximum number of WWPNs, you will need
to contact IBM and request a new activation code.
Transition statement The following discusses the task that must be performed when
configuring NPIV.

V5.4.0.3
Instructor Guide
Uempty
NPIV configuration steps: HMC, VIOS

Configure virtual Fibre Channel adapter on the VIO Server.
Edit the partition profile or use DLPAR.
Activate the VIO Server or run cfgdev. Check for a new vfchost# adapter
definition.
Figure 6-8. NPIV configuration steps: HMC, VIOS AN313.1
Notes:
The Virtual I/O Server cannot access and does not emulate the physical storage to which
the client logical partitions have access. The Virtual I/O Server provides the client logical
partitions with a connection to the physical fiber channel adapters on the managed system.
There is always a one-to-one relationship between virtual fiber channel adapters on the
client logical partitions and the virtual fiber channel adapters on the Virtual I/O Server
logical partition. That is, each virtual fiber channel adapter on a client logical partition must
connect to only one virtual fiber channel adapter on the Virtual I/O Server logical partition,
and each virtual fiber channel on the Virtual I/O Server logical partition must connect to
only one virtual fiber Fibre Channel adapter on a client logical partition.
Configuring a virtual Fibre Channel adapter using the HMC
You can configure a virtual fiber channel adapter dynamically for a running logical partition
using the Hardware Management Console (HMC). A Linux logical partition supports the
dynamic addition of virtual Fibre Channel adapters only if the DynamicRM tool package is
installed on the Linux logical partition. To download the DynamicRM tool package, see the
Service and Productivity Tools for Linux on POWER systems Web site.

Instructor Guide
When you dynamically add a virtual Fibre Channel adapter to a client logical partition, the
virtual Fibre Channel adapter (and the associated WWPNs) is lost when you restart the
logical partition. If you add the virtual fiber channel adapter to a partition profile after you
dynamically added it to the logical partition, the profile-based virtual fiber channel adapter
is assigned a different pair of worldwide port names (WWPNs) when the LPAR is started
with this profile. For this reason, the preferred way to add virtual Fibre Channel adapters is
by adding it to the partition profile.
Activate the VIO Server, or run cfgdev if virtual adapter was added using DLPAR.
Map the Virtual FC Adapter to an NPIV Physical Adapter
- vfcmap -vadapter vfchost2 -fcp fcs0
Check Virtual FC Mapping
- lsmap all npiv
Activate the LPAR, boot as SMS and install OS, or run cfgmgr if FC Virtual adapter
were added using DLPAR
Change the reserve policy attribute of the disk to no_reserve

V5.4.0.3
Instructor Guide

Purpose
Details

Instructor Guide
NPIV configuration steps: HMC, client LPAR

Configure virtual Fibre Channel adapter on client LPAR.
Edit the partition profile.
This will result in a new fcs# adapter in the client partition.
Figure 6-9. NPIV configuration steps: HMC, client LPAR AN313.1
Notes:
Server and client virtual Fibre Channel adapters are mapped one-to-one with the vfcmap
command in the VIOS.

V5.4.0.3
Instructor Guide

Purpose Describe how to configure virtual Fibre Channel client adapters.
Details The process is very similar to how VSCSI adapters are configured. There is a
server adapter at the VIOS and a client adapter and the client LPAR.
Transition statement After the client adapter is created you can view the associated
World Wide Port Names (WWPN).

Instructor Guide
NPIV configuration steps: HMC, SAN

Identify the clients worldwide port names (WWPN).
View properties of the Virtual FC client adapter.
Or use lshwres HMC command:

# lshwres --rsubtype fc -m <managed_sys_name> --
level lpar \ -r virtualio
Configure the SAN (zoning) and assign the LUN to both
WWPNs.
Figure 6-10. NPIV configuration steps: HMC, SAN AN313.1
Notes:
The WWPNs can also be displayed using the lscfg command at the client. For example:
# lscfg -vl fcs2
Using the SAN tools of the SAN switch vendor, you zone your NPIV-enabled switch to
include WWPNs that are created by the HMC for any virtual Fibre Channel client adapters.
You would put the WWPNs of the virtual adapters and the WWPNs from your storage
device in a zone; just as with a physical fiber channel adapter environment.
Some SAN switches require an optional license to activate NPIV capabilities.

V5.4.0.3
Instructor Guide

Purpose
Details

Instructor Guide
NPIV configuration steps: VIO Server

Map the virtual Fibre Channel adapter to a physical Fibre Channel
adapter using vfcmap command.
Check the mapping using lsmap.

$ lsmap -all npiv
Name Physloc ClntID ClntName ClntOS
------------- ---------------------- ------ -------------- -------
vfchost2 U8203.E4A.651E365-V1-C12 4 sys154_lpar2 AIX
Status:NOT_LOGGED_IN
FC name:fcs0 FC loc code:U789C.001.DQD1D71-P1-
C2-T1
Ports logged in:0
Flags:a<NOT_LOGGED>
VFC client name: VFC client DRC
Activate the client LPAR.

Figure 6-11. NPIV configuration steps: VIO Server AN313.1
Notes:
In this example, the client is not booting from the SAN LUN. In order for npiv client wwpn
to show up on the switch, the client must first, do an npiv login, which requires the client to
first do device discovery.

V5.4.0.3
Instructor Guide

Purpose
Details

Instructor Guide
Listing the NPIV mapping

When the client partition is activated:
Check the NPIV adapter in the client partition.
# lsdev -Cc adapter | grep fc
fcs0 Available C12-T1 Virtual Fibre Channel Client Adapter
Check the WWPN of the NPIV adapter in the client partition
# lscfg vpl fcs0 | grep Net
Network Address.............C050760032220006
Check the mapping on the VIO using lsmap.
$ lsmap -all npiv
Name Physloc ClntID ClntName ClntOS
------------- ---------------------- ------ -------------- -------
vfchost2 U8203.E4A.651E365-V1-C12 4 sys154_lpar2 AIX
Status:LOGGED_IN
FC name:fcs0 FC loc code:U789C.001.DQD1D71-P1-C2-T1
Ports logged in:3
Flags:a<LOGGED_IN, STRIP_MERGE>
VFC client name: fcs0 VFC client DRC:U8203.E4A.651E365-V3-
C12-T1
Figure 6-12. Listing the NPIV mapping AN313.1
Notes:
The lscfg and lsmap commands can be helpful when examining the details of the
configuration. In the lsmap command you can find the name of the NPIV clients, the status
of the connections (LOGGED_IN implies the SAN switch has identified and connected to
the client's n_port), and location codes for the associated devices.

V5.4.0.3
Instructor Guide

Purpose Identify commands used to examine the NPIV configuration.
Details
Transition statement Lets take a look at DLPAR considerations.

Instructor Guide
DLPAR and NPIV considerations

Delete and recreate (using DLPAR or in partition profile) uses
a new pair of WWPNs.
After change, LUN is no longer assigned to your LPAR and
requires SAN reconfiguration.
A new host must be created.
SAN switch and zoning could be affected.
To avoid SAN reconfiguration:
Set WWPNs of the new adapter to match the values from the original
adapter.
Can be done using HMC command line
chsyscfg -r prof -m sys154 -i name=mobility,
lpar_name=sys154c4,
\"virtual_fc_adapters=\"\"14/client/1/sys154v1/23/c05076
00667c0018,c0507600667c0019/1\"\"\
Figure 6-13. DLPAR and NPIV considerations AN313.1
Notes:
If you delete the Virtual Fibre Channel Adapter and recreate it from the HMC GUI (using
DLPAR or in Partition Profile), then you get a new pair of WWPNs. The LUN is not
assigned to your LPAR and a SAN reconfiguration is required.
A new host must be created.
SAN switch and zoning could be affected.
To avoid SAN reconfiguration, change the WWPNs of the newly created Virtual FC Client
adapter and define it with the recorded values from the original adapter. (This is what the
HMC does during LPM process with target Virtual client FC.)
You are able to change the WWPNs of the virtual adapter to match the original WWPNS by
using the HMC command line (be careful with HMC CLI syntax, backslash and double
quotes). Below is an example:
chsyscfg -r prof -m sys154 -i name=mobility, lpar_name=sys154c4,
\"virtual_fc_adapters=\"\"14/client/1/sys154v1/23/c0507600667c0018,c0507600
667c0019/1\"\"\

V5.4.0.3
Instructor Guide

Purpose Discuss DPLPAR considerations.
Details If you remove a virtual Fibre Channel connection from the LPAR profile, and add
it back, a new WWPN pair will be assigned. The old WWPN pair is deleted from the
Hypervisor.
Additional information To list the WWPN pairs, the following HMC command can be
used:
lshwres r virtualio m CEC -- rsubtype fc --level lpar
lpar_name=zlab019-Migration2-Client,lpar_id=1,slot_num=15,adapter_type=clie
nt,state=1
,is_required=0,remote_lpar_id=4,remote_lpar_name=zlab008-VIOS-Migration1,re
mote_sl
ot_num=15,"wwpns=c050760034110002,c050760034110003"
nt,state=1
mote_sl
ot_num=16,"wwpns=c050760034110004,c050760034110005"
nt,state=1
mote_sl
ot_num=14,"wwpns=c050760034110000,c050760034110001"
Transition statement MPIO configuration can have both virtual and physical fiber
channel adapters providing the paths.

Instructor Guide
Heterogeneous multipathing
Supported between virtual NPIV and physical Fibre Channel adapters
Delivers flexibility for Live Partition Mobility environments
VIOS#1 AIX
Passthru module A
NPIV
NPIV
Fiber
Fiber
HBA
HBA
POWER Hypervisor
Storage Controller
SAN Switch SAN Switch
A B C D
A B C D
Figure 6-14. Heterogeneous multipathing AN313.1
Notes:
Provides efficient path redundancy to SAN resources for several LPARs using a single
NPIV adapter. In the example above, the virtual fiber channel adapter is used as a backup
path. This configuration also provides Dynamic Heterogeneous Multi-Path I/O. During a
Partition Mobility operation the LPAR could temporarily use the virtual path. The
administrator would have to remove the physical path using DLPAR, migrate, and then add
(reconfigure) the physical adapter at the target system.

V5.4.0.3
Instructor Guide

Purpose
Details
Transition statement The following command can be useful in identifying a
heterogeneous multipathing environment.

Instructor Guide
AIX LPAR heterogeneous MPIO

# lspv
hdisk0 0002b032ee662a32 rootvg active
# lspath
Enabled hdisk0 fscsi0
# lscfg -l fcs0 ; lscfg -l fcs2

fcs0 U7311.D20.6516DFC-P1-C07-T1 FC Adapter
fcs2 U8204.E8A.652B022-V14-C31-T1 Virtual fiber Channel
Client Adapter
Figure 6-15. AIX LPAR heterogeneous MPIO AN313.1
Notes:
# lspath -l hdisk0 -s available -F"connection:parent:path_status:status"
50050763060b81c5,4050400000000000:fscsi0:Available:Enabled
50050763061881c5,4050400000000000:fscsi0:Available:Enabled
50050763060b81c5,4050400000000000:fscsi2:Available:Enabled
50050763061881c5,4050400000000000:fscsi2:Available:Enabled.

V5.4.0.3
Instructor Guide

Purpose
Details In this example, the LPAR is not in a dual VIOS environment. The output
contains four paths because each adapter can communicate with the two controllers of the
storage system.

Instructor Guide
Shared NPIV adapter for efficient path redundancy
Logical partitions
Physical
NPIV Fibre VIOS AIX Linux AIX AIX AIX AIX
Channel
Virtual
1 2 3 4 5 6
HBA
FC
server A1
adapter A7
A2 A3 A4 A5 A6 A8
Virtual client FC adapters
POWER Hypervisor
SAN Physical Physical

Fibre Fibre
Channel Channel
HBA HBA
A1 A2 A3 A4 A5 A6 A7 A8
Figure 6-16. Shared NPIV adapter for efficient path redundancy AN313.1
Notes:
Redundancy configurations help protect your network from physical adapter failures as well
as Virtual I/O Server failures. Similar to virtual SCSI redundancy, virtual Fibre Channel
redundancy can be achieved using Multi-path I/O (MPIO) and mirroring at the client
partition. The difference between traditional redundancy with SCSI adapters and the NPIV
technology using virtual Fibre Channel adapters is that the redundancy occurs on the client
because only the client recognizes the disk.
The physical Fibre Channel port is connected to a virtual Fibre Channel adapter on the VIO
Server. The virtual Fibre Channel adapter on the VIO Server is connected to ports on the
physical Fibre Channel adapter. A single adapter could have multiple ports.
This example uses host bus adapter (HBA) failover to provide a basic level of redundancy
for the client logical partitions number 5 and 6. Their primary paths are through the
assigned physical fiber channel adapters. The backup paths are the virtual Fibre Channel
adapters.
It is recommended that you configure virtual Fibre Channel adapters from multiple logical
partitions to the same HBA, or you configure virtual Fibre Channel adapters from the same
logical partition to different HBAs.

V5.4.0.3
Instructor Guide

Purpose
Details Since the pervious figure identified MPIO, in this example, identify LPARs 5 and
6 as being configured with mirroring instead of MPIO.

Instructor Guide
Dual VIOS without NPIV (vSCSI)

VIOS AIX
LVM LVM
LVM
Multipathing Multipathing
Multipathing
Disk driver Disk driver
Disk driver
Fibre Channel Fibre Channel

VSCSI VSCSI VSCSI
HBAs VSCSI HBAs
target target
HBA HBA
PHYP
SAN
Figure 6-17. Dual VIOS without NPIV (vSCSI) AN313.1
Notes:
This is a simple diagram to illustrate how the VSCSI configuration requires the VIOS's to
provide key components and devices.

V5.4.0.3
Instructor Guide

Purpose
Details

Instructor Guide
Dual VIOS with NPIV

VIOS AIX
LVM
Multipathing
Disk driver
Fibre Channel Fibre Channel

Passthru VFC VFC Passthru
HBAs HBAs
module module
HBA HBA
PHYP
SAN
Figure 6-18. Dual VIOS with NPIV AN313.1
Notes:
With NPIV, the VIOS does not have virtual targets configured. A virtual fiber server adapter
is created, but it serves as a connection to the pass-through module.

V5.4.0.3
Instructor Guide

Purpose
Details

Instructor Guide
Live Partition Mobility and NPIV

WWPNs are allocated in pairs.
VIOS VIOS
vio client vio client
WWPN WWPN
N N
P P
I I
V V
WWPN NPIV enabled WWPN
vio client SAN vio client
WWPN WWPN
N N
vio client WWPN P P WWPN
vio client
WWPN I I WWPN
V V
vio client vio client
WWPN WWPN
VIOS VIOS
Figure 6-19. Live Partition Mobility and NPIV AN313.1
Notes:
Target storage subsystem must be zoned and visible from source and destination systems
for LPM to work.
Active/passive storage controllers must BOTH be in the SAN zone for LPM to work.
The infrastructure must meet the following requirements for migrations with virtual Fibre
Channel adapters:
The destination Virtual I/O Server must contain an NPIV-capable physical Fibre
Channel adapter that is connected to the NPIV-enabled port on the switch that has
connectivity to a port on a SAN device that has access to the same targets as the client
is using on the source CEC.
On the source Virtual I/O Server partition, do not set the adapter as required when you
create a virtual Fibre Channel adapter. The virtual Fibre Channel adapter must be solely
accessible by the client adapter of the mobile partition.

V5.4.0.3
Instructor Guide
Uempty On the destination Virtual I/O Server partition, do not create any virtual Fibre Channel
adapters for the mobile partition. These are created automatically by the migration
function.
The mobile partitions virtual Fibre Channel WWPNs must be zoned on the switch with
the storage subsystem. You must include both WWPNs from each virtual Fibre Channel
adapter in the zone. The WWPN on the physical adapter on the source and destination
Virtual I/O Server does not have to be included in the zone.
The following components must be configured in the environment to support Live
partition Mobility:
- An NPIV-capable SAN switch
- An NPIV-capable physical Fibre Channel adapter on the source and destination
Virtual I/O Servers
- Each virtual Fibre Channel adapter on the Virtual I/O Server mapped to an
NPIV-capable physical Fibre Channel adapter
- Each virtual Fibre Channel adapter on the mobile partition mapped to a virtual Fibre
Channel adapter in the Virtual I/O Server
- At least one LUN mapped to the mobile partition's virtual Fibre Channel adapter
- Mobile partitions may have virtual SCSI and virtual Fibre Channel LUNs. Migration
of LUNs between virtual SCSI and virtual Fibre Channel is not supported at the time
of publication.

Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
NPIV useful commands

vfcmap -vadapter vfchostN -fcp fcsX
Maps the virtual FC to the physical FC port
vfcmap -vadapter vfchostN -fcp
Unmaps the virtual FC from the physical FC port
lsmap all npiv
Shows the mapping of virtual and physical adapters and current status
lsmap npiv vadapter vfchostN (shows same for one VFC)
lsdev -dev vfchost*
Lists all available virtual Fibre Channel server adapters
lsdev -dev fcs*
Lists all available physical Fibre Channel server adapters
lsdev dev fcs* -vpd
Shows all physical FC adapter properties
lsnports
Shows the Fibre Channel adapter NPIV readiness of the adapter and the SAN
switch
lscfg -vl fcsx
Shows virtual Fibre Channel properties (In AIX client LPAR)
Figure 6-20. NPIV useful commands AN313.1
Notes:
This visual shows a list of commands that are useful when managing an NPIV
environment.

Instructor Guide
Instructor notes:
Purpose
Details
Additional information Other commands in AIX:
lspath
lspath -l hdisk0 -s available
-F"connection:parent:path_status:status"

V5.4.0.3
Instructor Guide
Uempty
Checkpoint
1. As with SCSI, a server adapter must be created at the VIOS.
However, how does its function differ from VSCSI?
2. Can the VIOS support N_PIV and VSCSI simultaneously?
3. How are the LPARs pair of WWPNs used?
Notes:

Instructor Guide
Instructor notes:
Purpose
Details
The answer is with N_PIV, the VIOS provides a pass-thru
service.

The answer is yes.

The answer is the pair of numbers are associated with the
same LUN; One is use by the source system and the other
by the target system after Live Partition Mobility.

V5.4.0.3
Instructor Guide
Uempty
Exercise
Unit
exerc
ise
Notes:

Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Unit summary
Describe how to configure virtual Fibre Channel adapters on
the virtual I/O server and client partitions
Discuss how to use the HMC GUI and commands to work with
the World Wide Port Name (WWPN) pairs
Notes:

Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty Unit 7. I/O device virtualization performance and

tuning
Estimated time
04:00

This unit discusses the Virtual I/O Server, the virtual SCSI, virtual
Ethernet, shared Ethernet, and Integrated Virtual Ethernet (IVE)
adapters management and performances considerations.

Discover physical to virtual SCSI device configuration
Determine which client partitions and devices are affecting the
Virtual I/O Server performance
Describe the partition resource sizing guidelines for Virtual I/O
Servers used for virtual SCSI
Use performance analysis tools to monitor virtual SCSI device
performance
Describe how the following tuning options affect virtual Ethernet
performance:
- MTU sizes, CPU entitlement, TCP checksum offloading,
simultaneous multithreading
Monitor virtual Ethernet utilization statistics
Describe Virtual I/O Server sizing guidelines for hosting shared
Ethernet adapter services
- Physical adapters, memory, and processing resources
Configure shared Ethernet adapter threading
Configure TCP segmentation offload on the shared Ethernet
adapter
Configure SEA bandwidth apportioning and monitor with the
seastat utility
Copyright IBM Corp. 2010, 2011 Unit 7. I/O device virtualization performance and tuning 7-1
Instructor Guide
Monitor shared Ethernet adapter network traffic with Virtual I/O

Server utilities
Describe the Integrated Virtual Ethernet (IVE) adapter function
List performance and network availability considerations when
configuring IVE devices
Tune the MCS value and queue pairs for optimal performance or
scalability
View queue pair configuration from AIX
Monitor IVE port usage

Machine exercises

V5.4.0.3
Instructor Guide
Uempty
Unit objectives
Determine which client partitions and devices are affecting the Virtual I/O Server performance
Describe the partition resource sizing guidelines for Virtual I/O Servers used for virtual SCSI
Use performance analysis tools to monitor virtual SCSI device performance
Describe how the following tuning options affect virtual Ethernet performance:
MTU sizes, CPU entitlement, TCP checksum offloading, simultaneous multithreading
Describe Virtual I/O Server sizing guidelines for hosting shared Ethernet adapter services
Physical adapters, memory, and processing resources
Configure TCP segmentation offload on the shared Ethernet adapter
Configure SEA bandwidth apportioning and monitor with the seastat utility
Monitor shared Ethernet adapter network traffic with Virtual I/O Server utilities
List performance and network availability considerations when configuring IVE devices
Tune the MCS value and queue pairs for optimal performance or scalability
Notes:
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Topic 1: Virtual device performance overview

Describe the performance considerations when using virtual
I/O
Use a methodical approach when tuning virtual I/O
performance
Describe tools that can be used to analyze and tune virtual
configurations
Figure 7-2. Topic 1: Virtual device performance overview AN313.1
Notes:
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Virtual I/O performance

Virtual I/O devices have more complicated performance considerations.
Many virtual devices are backed by an underlying physical resource, therefore
there are both physical and virtual performance considerations.
Many virtual devices use one or more Virtual I/O Server (VIOS)
partitions.
This adds the performance of the VIOS partition itself as another consideration.
Example: A CPU resource constraint on the VIOS could show up as a disk
performance problem on a client partition.
Is the goal optimization, flexible and quick configuration, speed, or all of
the above?
Understand how the devices will be used and choose appropriate type.
Use dedicated Ethernet for LPARs with high network loads.
LPAR boot disks are a great use for virtual disks.
Figure 7-3. Virtual I/O performance AN313.1
Notes:
Performance considerations
Virtualization alters the way we look at system performance. We still follow the same rules
with respect to identifying existing or potential bottlenecks, but the remedy can be different
and more difficult to obtain. It is important to decide what the performance goal is, or what
the performance goals are. Understand how devices should be configured for the best
results and when virtual devices should be used instead of natively-attached, physical
devices. Clients can utilize a mix of directly-attached physical devices and virtual devices
depending on its requirements and availability of devices. Client partitions can use one or
more VIOS partitions to provide their virtual services for load-balancing or redundancy.
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Virtual devices: Overview (1 of 2)

Preconfigured special-purpose virtual devices:
Virtual serial adapter: for HMC to LPAR virtual console connection
VASI adapter: Used for Hypervisor to VIOS communication to
implement various specialized system services
Virtual devices that do not require a Virtual I/O Server:
Virtual Ethernet adapter:
Partitions on the same system communicate without using physical
Ethernet adapters
Integrated virtual Ethernet adapter/Host Ethernet Adapter (POWER6
only):
Ethernet virtualized in hardware (supports 16-32 LPARs)
Figure 7-4. Virtual devices: Overview (1 of 2) AN313.1
Notes:
Each partition has two virtual serial adapters to support virtual console access. Do not
remove these and there is no need to create additional virtual serial adapters. Every
POWER6 Virtual I/O Server partition as of server firmware level 01EL320, will have a
Virtual Asynchronous Services Interface (VASI) adapter. Four additional VASI adapters are
added if the VIOS is designated as a paging VIOS for a shared memory pool. The virtual
Ethernet adapter is supported on POWER5 and POWER6 processor-based server
partitions running AIX V5.3 or higher or Linux. The Integrated Virtual Ethernet (IVE)
adapter is available on most POWER6 processor-based systems and is also called the
Host Ethernet Adapter (HEA). It is an integrated physical Ethernet adapter which can be
shared between partitions. IVE logical ports are supported in partitions running AIX V5.2
and higher and Linux.
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Virtual devices: Overview (2 of 2)

Virtual devices that do require a Virtual I/O Server
Virtual SCSI
Virtual I/O Server partition uses logical volumes, physical volumes,
optical, or file devices to provide the backing storage for virtual SCSI
devices presented to client partitions
Shared Ethernet adapter (SEA)
Layer 2 bridge function to connect internal virtual Ethernet with external
physical network
Virtual Fibre Channel
NPIV provides direct access to Fibre Channel adapters from multiple
client partitions
Virtual I/O Server is installed in a special type of partition
Software provided as part of PowerVM Express, Standard, and
Enterprise Editions features
Client partition OS minimum levels
AIX 5L V5.3, Linux SLES 9, RHEL 3, IBM i 6.1
Figure 7-5. Virtual devices: Overview (2 of 2) AN313.1
Notes:
Virtual SCSI devices are backed by physical devices on the Virtual I/O Server that provide
disk storage or media devices to the client. Even though SCSI is the protocol used for the
virtualization, the actual backing storage devices do not need to be SCSI devices. The
shared Ethernet adapter is a network bridge device that connects virtual Ethernet traffic on
a managed system to an external network. Virtual Fibre Channel adapters use N_Port
Identifier Virtualization (NPIV) technology
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Virtual device performance considerations

Virtual I/O Server performance considerations
Supporting virtual devices uses more overall resources, particularly
processing resources.
Size VIOS partitions appropriately and monitor
While performance with virtual SCSI devices versus native SCSI
devices is comparable, native access is slightly faster.
Usually, function outweighs the slight performance hit.
SEA shares its bandwidth with all clients.
Sustained, large network transfers on SEA clients are not recommended.
Additional virtual device performance considerations
Virtual Ethernet connections are faster than physical.
Host Ethernet adapter ports can be faster than virtual Ethernet.
Figure 7-6. Virtual device performance considerations AN313.1
Notes:
The items in the visual are considerations for VIOS performance or other virtual device
performance. Most of these you will prove during the hands-on lab exercises throughout
this course.
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
VIOS performance tools

CPU Memory I/O Network
topas topas topas topas
vmstat svmon vasistat netstat
vmstat viostat entstat
fcstat optimizenet
seastat
VIOS commands might have the same or a different name as

their AIX version, and the flags often do not match.
Example: topas -C in AIX is topas -cecdisp in VIOS.
Once topas is running, many keyboard shortcuts behave as in AIX.
Use oem_setup_env to use other standard AIX analysis
commands.
Figure 7-7. VIOS performance tools AN313.1
Notes:
Monitoring resources on the Virtual I/O Server and AIX client
The Virtual I/O Server has a command line interface (CLI) with its own set of commands.
Use the help command at the CLI to see the available commands. AIX tools are available
by using the oem_setup_env command to access the root shell. The VIOS CLI has the
topas command which will help monitor all of the key resource areas. In addition, for CPU
and I/O usage statistics, you can use the viostat command which is like the iostat AIX
command. Use entstat and seastat for shared Ethernet adapter devices. The optimizenet
command is like the no AIX command.
Monitor system resources on the AIX client partitions as you normally would. If you find an
area with a bottleneck, be sure to determine whether the device is native or virtual. If
virtual, track it back to the physical device on the Virtual I/O Server.
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Monitoring Virtual I/O Server resources

Monitor memory, processors, physical disk controllers,
physical disks, and logical volumes on the Virtual I/O Server.
For device performance issues, determine the physical device
and tune.
If the Virtual I/O Server runs out of CPU, memory, or I/O
bandwidth, it can affect multiple clients.
Both VSCSI and SEA services use CPU resources on the
VIOS.
The VIOS needs memory for its operating system and devices.
After these requirements are met, accessing virtual devices do not
use much memory.
For memory errors when creating virtual devices, check that the
hypervisor can allocate free memory.
Figure 7-8. Monitoring Virtual I/O Server resources AN313.1
Notes:
There are two key points when monitoring virtual devices. First, if you run out of processing
or memory resources on the Virtual I/O Server then this affects all of the clients which are
using those resources. Careful monitoring and tuning of the Virtual I/O Server partition is
necessary. Second, if you discover a performance issue on a device, be sure to determine
the exact physical device used as the backing device. Then tune the physical device as
you normally would in a non-virtualized environment.
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Checkpoint
1. Once a CPU constraint is found as a bottleneck, what are
some steps that can be taken to solve the problem?
2. True or False: All AIX performance commands can be used

even on a VIOS partition.
Notes:
Instructor Guide
Instructor notes:
Purpose
Details
The answers are check process activity to determine errant
processes, add CPU resources, change configuration (for
example, capped to uncapped and dedicated to donate
cycles, and move workload.

The answer is true.

V5.4.0.3
Instructor Guide
Uempty
Topic 1: Summary
Describe the performance considerations when using virtual
I/O
Use a methodical approach when tuning virtual I/O
performance
Describe tools that can be used to analyze and tune virtual
configurations
Notes:
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Topic 2: Virtual SCSI tuning

performance
Figure 7-11. Topic 2: Virtual SCSI tuning AN313.1
Notes:
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Virtual SCSI devices example
A Physical Virtual I/O Server Client

adapter
VSCSI server vtscsi0 VTD
B
virtual adapter hdisk0
vtscsi1 VTD hdisk1
VSCSI client cd0
C
virtual adapter vtopt0 VTD
VTD Virtual target cd0 vscsi0

device A A B C
sissas0 ide0 vhost0

hdiskX and
lvX are here
Physical storage
Virtual SCSI devices can be backed by:

Physical volumes, logical volumes, optical media drives, tape drives,
and a file in a file system
Figure 7-12. Virtual SCSI devices example AN313.1
Notes:
This visual shows an example system with virtual devices. The virtual target devices
(VTDs) on the Virtual I/O Server associated with vhost0 are vtscsi0, vtsci1, and vtopt0.
Each one represents the association of a single backing device to the virtual SCSI server
adapter.
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Virtual SCSI storage and LVM

Virtual storage on the AIX client can be manipulated using
LVM just like a normal physical disk.
Caution: The virtual disk on the client might already be a logical
volume on the server.
LVM features, such as mirroring and striping, can be implemented on
the client.
Do not stripe or mirror backing devices on the VIOS using LVM.
Performance considerations using dedicated storage still apply
to using virtual storage, such as spreading out hot logical
volumes.
Figure 7-13. Virtual SCSI storage and LVM AN313.1
Notes:
In the client partition, the virtual storage can be manipulated using the Logical Volume
Manager (LVM) just like a physical volume. The virtual SCSI client adapter can use these
devices like any other physically connected hdisk device for boot, swap, mirror, or any
other supported AIX feature. Performance considerations from dedicated storage are still
applicable when using virtual storage, such as spreading hot logical volumes across
multiple disks on multiple adapters so that parallel access is possible.
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Performance factors with virtual SCSI

Performance factors include:
Virtual I/O latency: Time overhead to complete an I/O on a virtual disk
versus a native disk
Virtual I/O bandwidth: Difference in throughput when accessing a
virtual disk versus a native disk
Performance when accessing native disks versus virtual disks
is comparable.
Virtual SCSI uses more processing power than using native
I/O.
Processing to support the I/O occurs on both the client and the Virtual
I/O Server
Example: If one I/O requires 0.03 milliseconds of CPU time, this amount
is consumed on the client, and nearly this amount is consumed on the
VIOS.
A client does not consume more CPU resources with a virtual disk
versus a native disk
Figure 7-14. Performance factors with virtual SCSI AN313.1
Notes:
Virtual SCSI uses more processing power
Using Virtual SCSI (VSCSI) requires extra processing power compared to native disks.
This is due to the processing of extra Hypervisor calls and the paths involved for
exchanging I/O requests between the initiator and target adapters. The use of VSCSI will
roughly double the amount of processor time to perform each I/O when compared to using
directly attached storage. This processor load is split between the Virtual I/O Server and
the virtual SCSI client. Double the processor time sounds bad; however, the extra
processing time to process one 4KB I/O request is less than 50,000 CPU cycles. On a
1.65GHz processor core, this represents only 0.03 milliseconds. This is less than 1% of the
average seek time of any high performance 15,000 rpm SCSI disk.
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Additional I/O latency for virtual SCSI

Additional virtual SCSI I/O latency Virtual SCSI I/O latency is small
is typically 0.03 - 0.06 ms per I/O compared to the actual disk
operation. latency (response time).
Additional latency overhead per Native I/O average response time

I/O operation for VSCSI
0.07 2
0.06 pdisk backed Read
LV backed 1.5 Write
0.05
0.04
ms
ms
1
0.03
0.02 0.5
0.01
0 0
4KB 8KB 32KB 64KB 128KB 4KB 8KB 32KB 64KB 128KB
Block size Block size
Figure 7-15. Additional I/O latency for virtual SCSI AN313.1
Notes:
I/O latency when using VSCSI
I/O latency is the time it takes between the initiation of a disk I/O and completion as
observed by the thread. Latency is an important attribute of disk I/O. Applications which are
multi-threaded or use asynchronous I/O might be less sensitive to I/O latency, but under
most circumstances, lower latency is better for performance. Latency also varies with
different I/O block sizes. Consider a program which performs 1000 random disk I/Os one at
a time. If the time to complete an average I/O is six milliseconds, the program will take at
least six seconds to run; however, if the average I/O response time is reduced to three
milliseconds, the programs run time could be reduced by three seconds. The chart on the
right side of the visual shows average response times for an I/O using a typical disk used
natively in a partition. This chart is provided to illustrate the fact that 0.06 milliseconds is a
small fraction of an overall average I/O response time.
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
I/O bandwidth
Comparison of measured bandwidth using virtual SCSI disks

and native disks for reads with varying block sizes
140
In this test, a single 120

thread operates 100
sequentially on a
constant file which 80
is 256MB in size Virtual
with a dedicated 60 Native
MB/sec
processor Virtual
I/O Server. 40
20
0
4KB 8KB 32KB 64KB 128KB
Block size
Figure 7-16. I/O bandwidth AN313.1
Notes:
I/O bandwidth is the maximum amount of data which can be read or written to storage in a
unit of time. Bandwidth can be measured from a single thread, or from a set of threads
executing concurrently. Though many commercial applications are more sensitive to
latency than bandwidth, bandwidth is crucial for many typical operations such as backup
and restore. The chart in the visual shows a comparison of measured bandwidth using
VSCSI and native disks for reads with varying block sizes of operations. In these tests, a
single thread operates sequentially on a constant file which is 256MB in size. The
difference between virtual I/O and native I/O in these tests is attributable to the increased
latency using virtual I/O. Because of the larger number of operations, the bandwidth
measured with small block sizes is much lower than with large block sizes.
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
CPU needs for different I/O types

This chart shows the CPU cycles per byte needed for I/O on
the VIOS when:
VIOS accesses native disk
Client accesses logical volume used as backing storage
Client accesses physical volume used as backing storage
14
Cycles per byte (CPB)
12
10
8 Native
Native
6 lvPV-backed
back
pdisk back
LV-backed
4
2
0
4K 8K 32K 64K 128K
Block size
Figure 7-17. CPU needs for different I/O types AN313.1
Notes:
Cycles per byte comparison
The graph in the visual shows a comparison of the CPU cycles per byte for native I/O and
VSCSI I/O using both logical volume backed storage and physical volume backed storage.
In the visual above, PV-backed is physical disk backed storage and LV-backed is logical
volume backed storage. The VSCSI measurements are of the Virtual I/O Server only; the
client is not included in the comparison. The processor efficiency of I/O improves with
larger I/O block sizes. Effectively, there is a fixed latency to start and complete an I/O
transaction, with some additional cycle time based on the size of the I/O transaction.
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Sizing the Virtual I/O Server for virtual SCSI

Primary considerations for processor configuration on the VIOS are:
Quantity of processor entitlement.
Use dedicated or shared processors?
If shared, mark as uncapped with higher weight than clients.
Whether VIOS is also configured with shared Ethernet adapters.
Memory needs for Virtual I/O Server are typically 512MB to
1GB.
Needs are minimal since no data caching is done on the Virtual I/O Server for
clients.
For large configurations, plan for 512MB plus 4MB per disk.
More LPAR memory is needed when using Host Ethernet Adapter (HEA) ports.
Client partition sizing
The processor and memory needs on the client when using virtual I/O is equal to
using native I/O.
Once I/O processing needs are calculated, plan for this amount in both the VIOS
and the clients.
Figure 7-18. Sizing the Virtual I/O Server for virtual SCSI AN313.1
Notes:
Virtual I/O Server processor and memory sizing
Use of shared processors for VSCSI servers will slightly increase I/O response time but
might be worth the benefits of flexible processor entitlement sizing and the ability to mark
the partition as uncapped. Additional entitlement should be added when using shared
processors compared to dedicated processors on the VIOS. Tests have shown that with
low I/O loads and a small number of partitions, shared processors on the Virtual I/O Server
partition has little effect on performance. For more efficient virtual SCSI implementation
with larger loads, it might be advantageous to configure the Virtual I/O Server partition as a
dedicated processor partition. The memory requirements for the VSCSI server are modest
because there is no data caching in the VSCSI server. With large I/O configurations and
very high data rates, 1GB of memory for the VSCSI server is typically more than enough.
For configurations with low I/O rates with a small number of attached disks, 512MB of
memory is usually sufficient. If using IVE logical ports, configure an additional 103MB per
port.
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Processor sizing methodologies

The Virtual I/O Server processor sizing methodologies can be based on
either:
Knowledge of the specific needs of each virtual SCSI client partitions I/O rates
and I/O block sizes
Or
Plan for the maximum configuration for all clients by knowing:
The maximum number of disks to be used as backing storage
Speed of processors
The disks capabilities in terms of disk I/Os per second and how many processing
cycles per I/O are needed
Approximate CPU cycles per second for I/O using 1.65 GHz POWER5 processor at
different block sizes:
4KB 8KB 32KB 64KB 128KB
PV-backed 45000 47000 58000 81000 120000
LV-backed 49000 51000 59000 74000 105000
Figure 7-19. Processor sizing methodologies AN313.1
Notes:
CPU cycles chart
The chart in the visual shows the typical number of CPU cycles per operation for both
physical volume and logical volume backed operations on a 1.65GHz POWER5 processor
core. These numbers are measured at the physical processor with SMT enabled. For other
CPU frequencies, adjust the cycles in the table by multiplying the cycles per operation by
the ratio of the frequencies. For example, to adjust for a 4.2GHz CPU, 1.65GHz/4.2GHz =
0.39. Multiply the CPU cycles in the table by 0.39 to get the required cycles per operation.
For example, 45,000 cycles would become 17,550 cycles.
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Server sizing: Example 1

Example with knowledge of the specific I/O traffic:
Scenario: A Virtual I/O Server supports three client partitions
on physical disk backed storage.
First client partition requires a maximum of 7000 8KB operations per
second.
Second client partition requires a maximum of 10000 8KB operations
per second.
Third client partition requires a maximum of 5000 128KB operations
per second.
The number of 1.65 GHz processors for this requirement is
approximately 0.85 processors.
For dedicated processors, round up to a single processor.
(7,000 * 47,000 + 10,000 * 47,000 + 5,000 * 120,000) / 1,650,000,000)

= 0.85 processors
Figure 7-20. Server sizing: Example 1 AN313.1
Notes:
Sizing example based on knowledge of the I/O traffic
The formula to reach the 0.85 processing units is shown on the visual. The 47,000 number
is taken from the chart on the previous visual for 8KB blocks. The 120,000 number is taken
from the same chart for 128KB blocks. To customize the formula for a different processor
speed, use the ratio of the processor speeds to convert the number of cycles. For 4.2GHz
CPU you would convert the 47,000 CPU cycles into 18,330, and the 120,000 CPU cycles
into 46,800 so the result would be as follows: 128310000 + 183300000 + 234000000 /
1650000000 = 0.33 processors Alternatively, do the calculation using the 1.65GHz
information, then multiple the resulting processing units by the ratio between the 1.65GHz
processor speed and the speed of your processor. For example, for 4.2GHz, multiple the
0.85 processors by 0.39 = 0.33 processors.
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Server sizing: Example 2

Example when planning for the maximum I/O configuration:
Scenario:
A Virtual I/O Server manages 32 physical SCSI disks.
Plan for maximum workload using 128KB sequential operations.
Each disk is capable of approximately 390 disk I/Os per second (15
Krpm drives).
The number of 1.65 GHz processors for this requirement is
approximately 0.91 processors.
(32 disks * 390 I/Os * 120,000 Cycles) / 1,650,000,000 = 0.91
This planning method can waste processing resources.
Consider if the I/O configuration needs are known.
8KB random operations will use 47000 cycles and 200 disk I/Os per
second resulting in only 0.19 processors needed to support the same
32 disks.
Figure 7-21. Server sizing: Example 2 AN313.1
Notes:
Sizing based on planning for maximum bandwidth configuration
If the server is sized for maximum bandwidth (which assumes sequential I/O), the
calculation will result in a much higher processor requirement than what might actually be
needed. Since disks are much more efficient doing large sequential I/Os than small random
I/Os, we can drive a much higher number of I/Os per second. Assume that a Virtual I/O
Server has 32 disks capable of 50MB per second when doing 128KB I/Os. That implies
each disk could average 390 disk I/Os per second (50,000,000 / 128,000=390.625). Thus,
the entitlement necessary to support 32 disks, each doing 390 I/Os per second with an
operation cost of 120,000 cycles requires approximately 0.91 processors
((32*390*120,000)/1,650,000,000). More simply, a Virtual I/O Server running on a single
processor core should be capable of driving approximately 32 fast disks to maximum
throughput.
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Virtual SCSI queue depth (1 of 2)

A disk queue depth is the number of I/O requests that can be run in
parallel on a device.
The virtual SCSI queue depth is how many requests the disk controller
will queue to the virtual SCSI client driver at any one time.
Recommendations for performance:
Physical volumes used as backing storage: Make virtual SCSI queue depth equal
to the physical volume queue depth.
Logical volumes used as backing storage: The physical volume's queue depth
should be greater than or equal to the sum of queue depths for all the virtual disks
accessing that physical volume.
Adjusting the queue depth might improve disk performance.
Performance impact will depend on workload.
Example: Increase virtual disk queue depth to match physical volume.
Adjusting the queue depth might reduce wasted resources.
Example: If virtual disk queue depth is too high and it is reduced to match physical
volume.
Figure 7-22. Virtual SCSI queue depth (1 of 2) AN313.1
Notes:
Increasing the queue depth on a client virtual device reduces the number of supported
open devices on that virtual adapter, and the number of I/O requests that devices can have
active on the VIO Server. The VSCSI queue depth generally should not be any larger than
the queue depth on the physical LUN. A larger value wastes resources without additional
performance. If the virtual target device is a logical volume, the queue depth on all disks
included in that logical volume must be considered. If the logical volume is being mirrored,
the virtual SCSI client queue depth should not be larger than the smallest queue depth of
any physical device being used in a mirror. When mirroring, throughput is effectively
throttled to the device with the smallest queue depth. If a volume group on the client spans
virtual disks, keep the same queue depth on all the virtual disks in that volume group,
especially when using mirroring.
When increasing the VSCSI client queuing can be a useful optimization
The storage is Fibre Channel attached.
SCSI queue depth is already a limiting factor using the default setting of three.
Instructor Guide
The VSCSI device is attached using LUNs:

- The Virtual I/O Server does not allow striping LVs across multiple disks/LUNs.
- LVs could benefit when on a LUN with multiple disks and there is not great
contention on the LUN from requests from multiple clients.
The LUN contains multiple physical disks.
- The more disks in the LUN, the better the possibility of more I/Os in flight at the
same time.
The workload has enough processes or asynchronous I/O to drive a lot of outstanding
I/O requests.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Virtual SCSI queue depth (2 of 2)

View queue_depth value
On the VIO Server:
$ lsdev dev hdiskX attr queue_depth
On an AIX virtual SCSI client:
# lsattr El hdiskX -a queue_depth
Value can be changed with chdev command
On the VIO Server:
# chdev dev hdisk0 attr queue_depth=#
Valid values:
On an AIX virtual SCSI client:
1-256
# chdev l hdisk0 a queue_depth=#
The default value is 3 on virtual SCSI disks.
Default for physical disks depends on the type of storage.
Internal SAS disks are set to 16.
To change, disks must not already be in use.
If in use, use chdev -P to change at next reboot.
Figure 7-23. Virtual SCSI queue depth (2 of 2) AN313.1
Notes:
Examples for tuning the virtual SCSI queue depth
Example configuration 1: A physical volume on the VIOS has a queue depth of 16
and the default virtual SCSI queue depth is three. The entire physical volume is used as
the backing storage. Alter the virtual SCSI queue depth to 16.
Example configuration 2: A physical volume on the VIOS has a queue depth of 16. It
has eight logical volumes being used as backing storage for client LPARs and the
virtual SCSI queue depths are three. In this case, it might result in 24 pending I/Os on a
queue depth of 16. The physical volume in this case could be tuned to 24 for better
performance. For more information, read the section on virtual SCSI queue depth in
chapter 4 of the IBM System p Advanced POWER Virtualization Best Practices, an IBM
Redbooks document (REDP-4194).

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Monitoring virtual SCSI devices

Monitor memory, processors, physical disk controllers,
physical disks, and logical volumes on the Virtual I/O Server
Tools accessible from the Virtual I/O Server CLI to monitor
resources and system activity
topas, viostat, vmstat, svmon
Tools accessible from the root shell (oem_setup_env) to
monitor resources and system activity
All AIX tools, such as sar, ps, lparstat, mpstat, smtctl, filemon, and
so on
Figure 7-24. Monitoring virtual SCSI devices AN313.1
Notes:
Monitor system resources on the client partitions as you normally would. If you find an area
with a bottleneck, be sure to determine whether the device is native or virtual. If virtual,
track it back to the physical device on the Virtual I/O Server. Do not forget to check general
resource consumption on the Virtual I/O Server (CPU and memory in particular). Once you
find out the exact configuration and what the core problem is, then you can determine what
the tuning steps should be. For example, you do not want to perform disk tuning activities
when the problem is really a CPU starvation issue on the Virtual I/O Server.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Is the system disk bound? (1 of 2)

A system could be disk bound if it has:
At least one disk busy and cannot fulfill other requests
Processes blocked and waiting for the I/O operation to complete
Topas Monitor for host: VIOS1 EVENTS/QUEUES FILE/TTY

Wed Sep 2 23:44:04 2009 Interval: 2 Cswitch 162 Readch 160
CPU User% Kern% Wait% Idle% Physc Entc Reads 1 Rawin 0
Forks 0 Igets 0
Waitqueue 0.0
hdisk2 90.0 32.3K 282.0 18.3K 14.0K PAGING Real,MB 1024
hdisk1 25.0 9.4K 81.0 0.0 9.4K Faults 0 % Comp 56
Steals 0 % Noncomp 10
FileSystem KBPS TPS KB-Read KB-Writ PgspIn 0 % Client 10
Total 0.2 1.0 0.2 0.0 PgspOut 0
PageIn 0 PAGING SPACE
Name PID CPU% PgSp Owner PageOut 0 Size,MB 1536
topas 311462 0.4 1.3 root Sios 0 % Used 0
getty 188570 0.3 0.6 root % Free 100
gil 57372 0.1 0.9 root NFS (calls/sec)
sched 12294 0.1 0.4 root SerV2 0 WPAR Activ 0
seaproc 159824 0.0 1.0 root CliV2 0 WPAR Total 0
rpc.lock 123048 0.0 1.2 root SerV3 0 Press: "h"-help
sshd 295100 0.0 0.9 padmin CliV3 0 "q"-quit
Figure 7-25. Is the system disk bound? (1 of 2) AN313.1
Notes:
Disk bound system
A system might be disk bound if at least one disk is busy and cannot fulfill other requests,
and processes are blocked and are waiting for the I/O operation to complete. The limitation
can be either physical or logical. The physical limitation involves hardware like bandwidth
of disks, adapters and the system bus. The logical limitation involves the organization of
the logical volumes on disks and Logical Volume Manager (LVM) tuning and settings, such
as striping or mirroring. The example in the visual shows a Virtual I/O Server with one very
busy disk. Usually we also look at the Wait% and the Waitqueue for indication that the
processors were idle but there were still threads waiting for I/O. However, since this is the
VIOS, the VIOS operating system does not have threads waiting. If we were to look at the
client, then we might see threads waiting or wait time. You will not always see waits when
there is a disk performance bottleneck however.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Is the system disk bound? (2 of 2)

The iostat or sar -d on the client and viostat on the VIOS also show disk
activity.
Example iostat interval on the AIX client:
tty: tin tout avg-cpu: % user % sys % idle % iowait physc % entc
0.0 180.0 0.8 31.2 34.2 33.8 0.1 36.6
Disks: % tm_act Kbps tps Kb_read Kb_wrtn

hdisk1 0.0 0.0 0.0 0 0
hdisk0 96.0 101632.0 794.0 203264 0
Example viostat interval on the VIOS:

0.0 337.0 0.1 28.7 71.2 0.0 0.1 34.6
Disks: % tm_act Kbps tps Kb_read Kb_wrtn

hdisk0 0.0 0.0 0.0 0 0
hdisk1 0.0 0.0 0.0 0 0
hdisk2 80.0 101632.0 794.0 203264 0
hdisk3 0.0 12.0 2.5 0 24
hdisk4 0.0 0.0 0.0 0 0
hdisk5 0.0 0.0 0.0 0 0
cd0 0.0 0.0 0.0 0 0
Figure 7-26. Is the system disk bound? (2 of 2) AN313.1
Notes:
Comparing the examples
In the examples in the visual, the clients hdisk0 is the same as the VIOSs hdisk2.
% tm_act: Reports back the percentage of time that the physical disk was active or the
total time of disk requests. When utilization exceeds roughly 60 to 70 percent, it usually is
indicative that processes are starting to wait for I/O.
Kbps: Reports back the amount of data transferred to the drive in kilobytes per second.
tps: Reports back the number of transfers per second issued to the physical disk.
Kb_read: Reports back the total data (kilobytes) from your measured interval that is read
from the physical volumes.
Kb_wrtn: Reports back the amount of data (kilobytes) from your measured interval that is
written to the physical volumes.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Find busiest logical volumes

If the busy physical volume has logical volumes, determine the busiest logical
volumes.
Determine logical volumes: $ lspv -lv hdisk2
Use filemon or lvmstat to determine busy logical volumes.
Use lsmap to determine which clients are using these LVs.
Example of a filemon trace for 30 seconds:
$ oem_setup_env
# filemon -o /tmp/file.out -O lv,pv; sleep 30; trcstop
Example LV section of /tmp/file.out output file:
Most Active Logical Volumes

----------------------------------------------------------------
util #rblk #wblk KB/s volume description
----------------------------------------------------------------
0.46 3362816 18496 56337.8 /dev/lv3 N/A
0.41 2689792 204880 48229.7 /dev/lv4 N/A
0.00 0 16 0.3 /dev/lv1 N/A
0.00 0 16 0.3 /dev/lv2 N/A
Figure 7-27. Find busiest logical volumes AN313.1
Notes:
Find busiest logical volumes using lvmstat command output
The example output file in the visual only shows the Most Active Logical Volumes section
of the file. It shows that there are two busy logical volumes. Figure out on which physical
volumes these are located (lspv). If theyre on the same physical volume, move one to a
less busy disk. If one logical volume is causing the disk to be too busy, you could use a
faster disk, use LVM or SAN storage features to spread the load over multiple disks, or
work with the client LPAR to figure out how that LV is being used and whether some
functions can be moved to another disk. Example line from lvmstat output:

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Virtual SCSI I/O analysis flowchart

Disk performance
START
Actions
Yes VIO Server Yes
Tune CPU
Is the disk connected CPU or memory or memory
to VSCSI? bound?
No No
Actions Actions
Yes Yes Move data or disks

Move data or disks Adapter VIO Server
Adapter bound? to another adapter
to another adapter bound?
Add an adapter
No No
Actions Actions
Reorganize disk Yes Yes Reorganize disk

Disk VIO Server
Move data to Move data to
bound? Disk bound?
another disk another disk
No No
It might not be disk I/O bound.

Actions Re-evaluate.
Figure 7-28. Virtual SCSI I/O analysis flowchart AN313.1
Notes:
Disk I/O analysis
For a native device: Check the physical adapter and the physical disk
For a virtual device: Check the CPU & memory usage on the Virtual I/O Server in
addition to the physical adapter and disk When a system has been identified having
disk I/O performance problems, the next point is to find out where the problem comes
from. This visual shows the steps to follow on a disk I/O bound system. The items to
verify are:
Other tools to use to track down the precise disk and area on disk include: lspv, lslv
iostat/viostat, fileplace, and filemon. You will use these commands in the hands-on
exercise to determine where performance problems originate.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Virtual SCSI performance summary

The additional VSCSI I/O latency will vary with the machine utilization
and Virtual I/O Server topology.
If not constrained by CPU resources, virtual disk I/O throughput is

comparable to native I/O.
Total CPU cycles for the I/O client plus the Virtual I/O Server will be
higher than native I/O.
If a Virtual I/O Server has constrained resources, it will affect all clients
using those resources.
Memory requirements for the Virtual I/O Server are typically modest due
to the fact there is no data caching on the Virtual I/O Server.
Figure 7-29. Virtual SCSI performance summary AN313.1
Notes:
General performance considerations with virtual SCSI
If not constrained by processor performance, virtual disk I/O throughput is comparable
to native I/O.
Since VSCSI is a client/server model, the combined CPU cycles required on the I/O
client and the Virtual I/O Server will always be higher than native I/O. A reasonable
expectation is that it will take twice as many cycles to do VSCSI as native I/O (more or
less evenly distributed between the client and server).
If multiple partitions are competing for resources from a VSCSI server, care must be
taken to ensure enough server resources (processor, memory, and disk) are allocated
to do the job.
There is no data caching in memory on the server partition. Thus, all I/Os which it
services are essentially synchronous disk I/Os. Because there is no caching in memory
on the server partition, its memory requirements should be modest.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Checkpoint (1 of 2)
1. True or False: Memory requirements on the VIOS to support
VSCSI I/O operations are minimal because no data caching
is performed on the VIOS.
2. True or False: When sizing a new client partition, more

processing resources will need to be configured in the client
if it uses VSCSI disks rather than native disks.
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
The answer is true.

The answer is false. Clients use the same amount of
processing resources when using virtual or physical devices.
Instructor Guide
Checkpoint (2 of 2)
3. Which one of the following recommendations about sizing the
Virtual I/O Server for virtual SCSI is false:
a. For the best performance, dedicated processors can be used.
b. When using shared processors, use the uncapped mode.
c. When using shared processors, set the priority (weight value) of the
Virtual I/O Server partition equal to its client partitions.
4. If a physical volume has a queue_depth of four and it has

two logical volumes, each used as backing storage and each
has a large amount of I/Os, what should the client set the
queue_depth to on the two VSCSI disks?
5. True or False: The overall system processing resource needs

to support virtual disk I/O is nearly double that of using native
disks.
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
3. Which one of the following recommendations about sizing the Virtual
I/O Server for virtual SCSI is false:
c. When using shared processors, set the priority (weight value) of the Virtual I/O
Server partition equal to its client partitions.
The answer is when using shared processors, set the priority (weight
value) of the Virtual I/O Server partition equal to its client partitions.
4. If a physical volume has a queue_depth of four and it has two logical

volumes, each used as backing storage and each has a large amount
of I/Os, what should the client set the queue_depth to on the two
VSCSI disks?
The answer is two.
5. True or False: The overall system processing resource needs to

support virtual disk I/O is nearly double that of using native disks.
The answer is true.
Instructor Guide
Topic 2: Summary
performance
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Topic 3: Virtual Ethernet tuning

Describe how the following tuning options affect virtual
Ethernet performance:
MTU sizes, CPU entitlement, TCP checksum offloading, simultaneous
multithreading
Figure 7-33. Topic 3: Virtual Ethernet tuning AN313.1
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Virtual Ethernet: Overview

Virtual Ethernet:
Enables inter-partition communication without the need for physical
network adapters assigned to each partition
Requires a POWER5 or later system and HMC for configuration
Partitions can run either AIX V5.3 and up, Linux, or IBM i
Does not require the purchase of any additional features or software,
such as the Advanced POWER Virtualization or PowerVM features
Figure 7-34. Virtual Ethernet: Overview AN313.1
Notes:
Virtual Ethernet enables inter-partition communication without the need for physical
network adapters assigned to each partition. This technology enables IP-based
communication between logical partitions on the same system using a VLAN capable
software switch (POWER Hypervisor) in POWER5 and POWER6 systems. The virtual
Ethernet interfaces can be configured with both IPv4 and IPv6 protocols. To use virtual
Ethernet to connect to a physical Ethernet adapter which connects to a physical Ethernet
network, you must implement shared Ethernet adapter. This will be discussed in a later
unit.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Virtual Ethernet adapters

The POWER Hypervisor supports virtual Ethernet adapters that are
connected to an IEEE 802.1Q (VLAN) style virtual Ethernet switch.
LPARs can communicate with each other by using virtual Ethernet
adapters and assigning VIDs (VLAN ID).
Virtual Ethernet adapters are created using the HMC.
HMC generates MAC address.
Packets copied between partition's memory.
LPAR 1 LPAR 2 LPAR 3 LPAR 4 LPAR 5
VLAN 100 VLAN 200
VLAN 300
POWER Hypervisor (switch)
Figure 7-35. Virtual Ethernet adapters AN313.1
Notes:
The POWER Hypervisor provides a virtual Ethernet switch function based on the IEEE
802.1Q VLAN standard that allows partition communication within the same server. Using
this switch function, partitions can communicate with each other by using virtual Ethernet
adapters and assigning VIDs that enable them to share a common logical network. The
POWER Hypervisor Ethernet switch function is included as standard in all POWER5 and
POWER6 systems. It does not require the purchase of any of the PowerVM (or Advanced
POWER Virtualization) features. The virtual Ethernet adapters are created and the VID
assignments are performed using the HMC. The system allows virtual Ethernet adapters to
be configured with a PVID, that will be used to tag untagged packets.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Performance factors: CPU entitlement

Virtual Ethernet traffic uses more CPU than physical Ethernet.
Hypervisor performs memory transfers rather than offloading this processing to
the physical Ethernet adapter.
Throughput of virtual Ethernet is linear with CPU entitlement (if
constrained).
Shared processors might incur additional latency, decreasing throughput.
Monitor CPU and buffers rather than adapter throughput.
p ut
gh
r ou
Th
0.1 0.3 0.5 0.8 1.0

CPU entitlements
Figure 7-36. Performance factors: CPU entitlement AN313.1
Notes:
Comparing throughput of virtual Ethernet to physical Ethernet
CPU utilization: The virtual Ethernet adapter has a higher raw throughput than physical
Ethernet at all MTU sizes. With an MTU size of 9000 bytes, the throughput difference is
very large (four to five times) because the physical Ethernet adapter is running at wire
speed (989 Mbit/s user payload) while the virtual Ethernet can run much faster as it is
limited only by CPU and memory-to-memory transfer speeds.
If a partition is CPU constrained with a high virtual Ethernet workload, you can see linear
improvements in the throughput as more processor resources are added.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Performance factors: MTU size example 1

POWER5 example: Impact of MTU sizes (1500, 9000, 65390):
Throughput of MTU of 65390 is 6-7 times of MTU 1500.
Keep tcp_pmtu_discover and udp_pmtu_discover enabled.
700Mb
600Mb
Throughput
per second
500Mb
400Mb
300Mb
200Mb
100Mb
0
1500 1500 9000 9000 65394* 65394*
S D S D S D
Example simplex/duplex workloads

at different MTU sizes *Data from AIX V5
Figure 7-37. Performance factors: MTU size example 1 AN313.1
Notes:
Throughput at different MTU sizes
The data in the visual is from a test on a POWER5 system using AIX V5. The test data is
documented in IBM System p Advanced POWER Virtualization Best Practices, an IBM
Redpaper document (REDP-4194). * The actual tests from which the data in the visual was
obtained used an MTU of 65394. As of AIX 6, the maximum MTU for virtual Ethernet
adapters is 65390. When setting the MTU size to larger values, be sure to have the
network parameters tcp_pmtu_discover and udp_pmtu_discover enabled. They are
enabled by default. The tcp_sendspace and tcp_recvspace buffer settings can also have a
performance impact. See the IBM System p Advanced POWER Virtualization Best
Practices, an IBM Redpaper document (REDP-4194) for more information on buffers.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Performance factors: MTU size example 2

POWER6 example: Impact of MTU sizes (1500, 9000, 65390):
14 Gb
12 Gb
Throughput
per second
10 Gb
8 Gb
6 Gb
4 Gb
2 Gb
0
1500 1500 9000 9000 65390 65390
S D S D S D
Example simplex/duplex workloads

at different MTU sizes
Figure 7-38. Performance factors: MTU size example 2 AN313.1
Notes:
Example 2
The data in the visual was obtained with tests performed in the IBM UNIX Service
Enablement training lab on a POWER6 (4.2 GHz) processor-based server using a partition
running AIX 6. Workloads vary greatly and these numbers cannot be promised to
customers. However, notice that the affect of MTU size is the same in example 1 and
example 2.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Performance factors: SMT impact

Impact of simultaneous multithreading
Low workloads might perform slightly better with single-threaded.
High workloads might see up to 60% gain with SMT enabled.
70
Performance 60
gain %
with SMT 50
40
30
20
10
0
1500 1500 9000 9000 65394* 65394*
MTU
S D S D S D
simplex vs.
duplex
*Data from AIX V5
Figure 7-39. Performance factors: SMT impact AN313.1
Notes:
Examining the impact of simultaneous multithreading (SMT) on virtual Ethernet
performance
The virtual Ethernet performance observed by a partition typically benefits when
simultaneous multithreading is enabled, because the virtual Ethernet is not limited by
media speed and it can take advantage of the extra available processor cycles. However,
in the case of light workloads, performance can be better if simultaneous multithreading is
disabled. Depending on the partition type, the system acts in a certain way when the
workload is light. In a dedicated processor partition, the second simultaneous
multithreading thread is disabled, however the system checks periodically to determine if it
should be reactivated. In a shared processor partition, the second simultaneous
multithreading thread runs an idle loop at a low priority. The consumption of CPU cycles by
periodic disabling, checking, and enabling of the second thread or running of an idle loop
tends to affect the latency of the transactions on the virtual Ethernet, thus reducing
throughput.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
TCP checksum offload (1 of 2)

TCP checksums are calculated on sender and recalculated
and verified on receiver to detect corrupt packets.
Data transfer on virtual networks does not have potential for
link errors (like on physical networks).
Checksum calculation and verification can be offloaded to the
adapter on virtual Ethernet adapters to improve performance.
The adapter marks packets as having come from a virtual Ethernet
adapter.
The best performance is when checksum offload is enabled on
both sender and receiver.
Hypervisor keeps track of which virtual Ethernet adapters have it
enabled or disabled to generate the checksum when necessary.
Figure 7-40. TCP checksum offload (1 of 2) AN313.1
Notes:
The TCP checksum is calculated on sending systems and the value placed in a field in the
packet. When the receiving system receives the packet, it recalculates the checksum and
then verifies that it is the same as the one the sender put in the field in the packet. This is to
make sure the packet did not get corrupted as it traveled over the physical network,
through routers, and so on. On virtual networks, the Hypervisor copies the packets from the
memory of the sender partition to the memory of the receiving partition, so there is no
potential for the packet to get corrupted along the way. The checksum offload setting must
be enabled in order to disable the checksum calculation. For virtual Ethernet adapters,
checksum offload is enabled by default for performance and is configurable. This causes
the sending system TCP stack to not generate a checksum when sending a packet, as it
assumes the adapter will do it instead. The virtual Ethernet adapter marks the packet as
having come from a virtual Ethernet adapter, so that the receiving side does not expect a
real checksum value.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
TCP checksum offload (2 of 2)

To configure checksum offload:
To check ent0:
# lsattr -El ent0 -a chksum_offload
chksum_offload yes Checksum Offload Enable True
To change ent0:
# chdev l ent0 -a chksum_offload=[yes/no]
Or
# smit chgenet
Checksum offload is enabled for virtual Ethernet adapters by

default.
Virtual Ethernet adapters cannot alter this attribute with

ifconfig.
Figure 7-41. TCP checksum offload (2 of 2) AN313.1
Notes:
If the network traffic is going to be within a virtual Ethernet, then this feature should be
enabled on both the sending partitions and the receiving partitions for the best
performance. If there is a mismatch, that is, the sender and the receiver are not configured
the same way, the Hypervisor keeps track of how all of the adapters are configured and will
calculate the checksum if necessary (either on the sending side, or on the receiving side
where it will also perform the verification function). The ability to configure TCP checksum
offload for virtual Ethernet adapters is supported by AIX 5L V5.3 with maintenance level 3
and above. For physical adapters, it is also possible to enable or disable checksum
calculation at the TCP level with the ifconfig command. This method does not work for
virtual Ethernet adapters, and should not be used.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Virtual Ethernet I/O analysis flowchart
Virtual Ethernet I/O performance

(assuming no SEA connection)
START
Actions
On both source and
Is it a Yes destination: Check
Virtual Ethernet CPU resources, SMT,
Adapter? buffers, MTU
No
It might not be a network I/O problem.
Actions Re-evaluate
Figure 7-42. Virtual Ethernet I/O analysis flowchart AN313.1
Notes:
Virtual Ethernet adapter
If there are performance problems with a virtual Ethernet adapter, make sure there is
enough CPU resources. There can be an impact on virtual Ethernet throughput even at
60-70% CPU saturation. Verify that SMT is enabled, checksum offload is enabled, and you
are using the highest MTU allowed for the configuration. Virtual Ethernet adapters which
use a shared Ethernet adapter to connect to a physical network will need to use the MTU of
the physical network or use path MTU (PMTU) discovery to set the best MTU for the
network. Also, check for socket buffer failures with the netstat -c command.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Network monitoring tools

Common monitoring tools: topas, entstat, and netstat
Example: topas -E
Topas Monitor for host: rand211 Interval: 2 Mon Sep 7 17:56:19 2009
===============================================================================
Network KBPS I-Pack O-Pack KB-In KB-Out
en0 16.6K 7607.6 2251.4 5394.9 11.4K
lo0 0.1 0.5 0.5 0.0 0.0
Example: entstat en#

-------------------------------------------------------------
ETHERNET STATISTICS (en0) :
Device Type: Host Ethernet Adapter (l-hea)
Hardware Address: 00:21:5e:1d:09:81
Elapsed Time: 2 days 3 hours 10 minutes 23 seconds
Transmit Statistics: Receive Statistics:

-------------------- -------------------
Packets: 943708 Packets: 844059
Bytes: 1183490672 Bytes: 454157501
Interrupts: 0 Interrupts: 700451
Transmit Errors: 0 Receive Errors: 0
...
Figure 7-43. Network monitoring tools AN313.1
Notes:
Virtual Ethernet monitoring
To monitor virtual Ethernet traffic you can use topas to view the interface statistics, entstat
to view the adapter statistics and overall network statistics with netstat. The entstat values
can be reset to zero with the -r flag.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Checkpoint
1. True or False: Virtual Ethernet adapters are created and the
PVID assignments are performed using the Hardware
Management Console (HMC).
2. True or False: Virtual Ethernet performance can be improved

linearly by adding more CPU entitlement to a partition if it is
constrained.
3. True or False: An MTU size of 65390 for both virtual and

physical Ethernet will result in the best performance.
4. True or False: For average to high virtual Ethernet

workloads, simultaneous multithreading will improve
performance.
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
1. True or False: Virtual Ethernet adapters are created and the PVID
assignments are performed using the Hardware Management Console
(HMC).
The answer is true.
2. True or False: Virtual Ethernet performance can be improved by

adding more CPU entitlement to a partition if it is constrained.
The answer is true.
3. True or False (Physical Ethernet does not support an MTU of 65394.):

An MTU size of 65390 for both virtual and physical Ethernet will result
in the best performance.
The answer is false. Physical Ethernet does not support an MTU of
65394.
4. True or False: For average to high virtual Ethernet workloads,

simultaneous multithreading will improve performance.
The answer is true.
Instructor Guide
Topic 3: Summary
Describe how the following tuning options affect virtual
Ethernet performance:
MTU sizes, CPU entitlement, TCP checksum offloading, simultaneous
multithreading
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Topic 4: Shared Ethernet adapter tuning

adapter
seastat utility
Monitor shared Ethernet
Figure 7-46. Topic 4: Shared Ethernet adapter tuning AN313.1
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Shared Ethernet adapter

Shared Ethernet adapter bridges external networks to internal VLANs.
Forwards frames at OSI Layer 2 and is transparent to IP layer.
Virtual I/O Server does not reserve bandwidth on the physical adapter
for any VLAN clients.
One client can take advantage of full bandwidth, but if other clients are sending
data, it must share the bandwidth.
Thus, performance will vary!
Virtual I/O Server partition
Layer 2 bridge (shared Ethernet adapter)
Device driver Device driver Device driver
Physical Virtual Virtual

adapter adapter adapter
External To client
LANs partitions
Figure 7-47. Shared Ethernet adapter AN313.1
Notes:
In the visual, we see a representation of a Virtual I/O Server partition with one physical
adapter and two virtual adapters to connect to two separate VLANs on the managed
server. The shared Ethernet adapter (SEA) acts as an OSI Layer 2 bridge between the
virtual adapters and the physical adapters. The bridge performs the function of a MAC relay
and is independent of any higher layer protocol. With the default SEA configuration, if one
client partition sends data, it can take advantage of the full bandwidth of the adapter,
assuming the other client partitions do not send or receive data over the network adapter at
the same time. The Virtual I/O Server offers broadcast and multicast support, as well as
support for Address Resolution Protocol (ARP) and Neighborhood Discovery Protocol
(NDP).

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Configuring the interface

Ethernet interface configuration options for the Virtual I/O Server
Configure IP address on interface associated with the shared Ethernet adapter
(en3 in this example).
Configure IP address on optional second virtual Ethernet adapter (en2 in this
example).
Virtual Ethernet
Hypervisor traffic
Virtual Virtual ent1 Virtual ent2

I/O
Server
Shared Ethernet
adapter ent3
Physical ent0
External
LANs
Figure 7-48. Configuring the interface AN313.1
Notes:
Configuring the TCP/IP parameters with a shared Ethernet adapter
The visual shows the devices that can exist when implementing a single shared Ethernet
adapter. In this example, there are two interfaces on which the TCP/IP options can be
configured and there is no performance gain for either configuration. The first option is to
configure the TCP/IP parameters such as the IP address on the interface associated with
the SEA (en3 in this example). The second option is to configure a second, optional, virtual
Ethernet adapter and configure the TCP/IP parameters on its interface (en2 in this
example). You cannot configure the interface for the actual physical adapter associated
with a shared Ethernet adapter (en0 in this example). You also cannot configure the
interface associated with the shared Ethernet adapters virtual Ethernet adapter (en1 in this
example). You do not have to configure a TCP/IP address on the VIOS at all in order for the
SEA to bridge network traffic. Configuring the TCP/IP address as described here is simply
to allow the VIOS LPAR itself to communicate with other hosts on the network.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
SEA configuration options

Performance options
Use multiple physical Ethernet adapters in aggregation for increased
bandwidth for the SEA.
Clients can be distributed between multiple SEAs for load balancing.
Use the largest MTU size for the network.
Configure on the physical adapter before mapping to SEA.
Redundancy options
Configure SEA with a network interface backup adapter.
Use SEA as network interface backup adapter for clients.
Clients can use own dedicated physical adapter and use the SEA link as
a backup or clients can use two SEAs.
Configure dual VIOS partitions with SEA failover.
Figure 7-49. SEA configuration options AN313.1
Notes:
Changing the MTU on the physical adapter
More information on configuration choices: Use the chdev command to change the
MTU size on the physical adapter before the shared Ethernet adapter is created. This
example changes the MTU size to 9000 (jumbo frames) for the ent0 device which is the
physical adapter which will be associated with the shared Ethernet adapter: chdev -dev
ent0 -attr jumbo_frames=yes If your network load is made up of only small packets,
using the 9000 MTU will not decrease the processing load.
Besides the IBM redbooks documents, the IBM Unix Software Service Enablement course,
AHQV335 PowerVM Virtual I/O Server II: Advanced Configuration, covers how to
implement the redundancy and dual Virtual I/O Server configurations.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Virtual I/O Server sizing

Properly size resources of the Virtual I/O Server
Physical Ethernet adapters, memory, processors
Resources needs for SEA are in addition to those needed for
VSCSI services
Physical adapters
For demanding client network traffic, use dedicated, physical network
adapters in partitions
That is, do not use shared Ethernet adapter.
Purchase proper type of adapter that fits the network needs
Use Ethernet adapter aggregation to increase bandwidth
IVE (HEA) logical port can be used or use PCI adapter
Monitor and tune with regular network tuning options
If network latency is an issue to an application, do not use shared
Ethernet adapter due to interrupt latencies
Figure 7-50. Virtual I/O Server sizing AN313.1
Notes:
The most important aspects of sizing the Virtual I/O Server for SEA services are the
processing resources and the type and number of Ethernet adapters to use. If network
traffic is very high and performance is important, the best performance will be if the client
partition has its own physical Ethernet adapter connected to the external network. Tests
show that the shared Ethernet adapters stream data at media speed as long as the VIOS
has enough processing resources. The shared Ethernet adapter uses more processing
power than a physical adapter because of the bridging functionality. If you know the desired
throughput rate of your client partitions, then you can determine how many and what speed
adapters you need to install in your Virtual I/O Server partition. The Integrated Virtual
Ethernet Adapter (IVE) is also known as the Host Ethernet Adapter (HEA) and is the
integrated physical Ethernet adapter on most POWER6 processor-based servers.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Virtual I/O Server sizing: Memory

SEA services do not require much memory.
512MB memory on the Virtual I/O Server is sufficient to
support its data structures.
40MB is needed per processor for an mbuf buffer pool.
Most physical Ethernet adapters use 4MB for MTU 1500 or 16MB for
MTU 9000 for dedicated receive buffers.
More memory needed to support IVE ports (103MB each)
Each virtual Ethernet adapter needs 6MB for dedicated receive
buffers (varies based on workload).
Figure 7-51. Virtual I/O Server sizing: Memory AN313.1
Notes:
The memory requirements for a Virtual I/O Server are typically minimal. Plan for 40MB per
logical processor. If a partition has many virtual processors and SMT is enabled, this could
a significant amount.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Virtual I/O Server sizing: CPU (1 of 2)

Type of workload affects processor utilization
Simplex versus duplex, streaming versus transaction
(request/response), small packets versus large.
The key workload factor is the processor cycles per byte of
transmission.
Cycles per byte data is available in IBM Hardware Information Center.
Processor utilization for large packet workloads using jumbo frames is
approximately half that required for MTU 1500 (or small packet
workload).
Shared processors versus dedicated processors

Dedicated will remove any virtual processor latency.
Using shared provides configuration flexibility.
Configure as uncapped to allow for bursts in network load or general
workload inconsistency.
Configure with higher weight (priority) than its clients if they are also
uncapped.
Figure 7-52. Virtual I/O Server sizing: CPU (1 of 2) AN313.1
Notes:
Types of workloads
Simplex is single direction TCP communication. Duplex is two direction TCP workloads. A
duplex example would be an ftp running from machine A to B and another ftp running
between machine B and A concurrently. Some media cannot send and receive
concurrently, thus they will not perform any better (and usually worse) when running duplex
workloads. Duplex workloads will not scale up at a full two times the rate of a simplex
workload because the TCP Ack packets coming back from the receiver now have to
compete with data packets flowing in the same direction.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Virtual I/O Server sizing: CPU (2 of 2)

Procedure for sizing processing resources for SEA:
Look up the cycles per byte (CPB) value for the type of workload in
the IBM Systems Hardware Information Center.
Adjust for CPU GHz rating.
Information Center chart is based on 1.65 GHz processors.
Multiply maximum throughput (in bytes) needed by the cycles per byte
value then divide by maximum CPU processing cycles per processor.
Either plan for maximum workload for physical adapters or know how
much throughput the clients require.
Example CPB adjustment for 4.2 GHz processor:

Chart lists 11.2 CPB for 1.65 GHz streaming 1500 MTU workload.
Divide 1.65 by 4.2 then multiply by 11.2 which results in 4.4 CPB.
Example: 200MB of throughput with 4.4 CPB:

200 1024 1024 4.4 = 922,746,880 cycles / 1,650,000,000 cycles
per processor = 0.56 processors.
Figure 7-53. Virtual I/O Server sizing: CPU (2 of 2) AN313.1
Notes:
Cycles per byte charts
Search the IBM Hardware Information Center for Planning for shared Ethernet adapters to
find the CPB data based on type of workload and MTU size. In the example workload in the
visual, the resulting number of needed 4.2GHz processors is 0.56. If the original 1.65GHz
cycles per byte number was used, 1.42 processors would be needed to drive that
workload.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Threading/non-threading (1 of 2)
Threading ensures that CPU resources are shared fairly when a Virtual
I/O Server provides a mix of SEA and VSCSI services.
Like most device drivers, virtual Ethernet and shared Ethernet drivers
typically drive high interrupt rates and are CPU intensive.
Without threading, on a CPU constrained system, virtual network traffic will have a
higher priority than virtual SCSI interrupts resulting in worse virtual SCSI
performance.
With threading enabled, there is more consistent quality of service but at a lower
overall LAN throughput.
Threading is enabled by default for shared Ethernet adapters.
Disable threading when a Virtual I/O Server is not used for VSCSI.
Can use separate Virtual I/O Servers for shared Ethernet adapter and virtual
SCSI services.
Figure 7-54. Threading/non-threading (1 of 2) AN313.1
Notes:
How threading works
The threaded model helps to ensure that VSCSI and shared Ethernet adapter operations
share the Virtual I/O Server CPU resources fairly. However, threading adds more
instruction path length, thus using more CPU cycles. If the Virtual I/O Server will only be
running SEA services (no VSCSI) then the SEA device should be configured with threading
disabled in order to run in the most efficient mode. Threading causes incoming packets to
be queued to a buffer in memory. A special kernel thread is dispatched to process the
buffer, which uses more processing power and allows processing to be shared more evenly
with virtual SCSI. With non-threading, the virtual Ethernet and shared Ethernet adapter
driver forwards packets at the interrupt level which is more efficient and is why throughput
goes up. Note that we are not discussing simultaneous multithreading, but a configuration
option for the SEA device driver.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Threading/non-threading (2 of 2)
Disabling threading improves the cycles per byte of transmission.
Example for 1500 MTU streaming simplex workload with 4.2 GHz CPU:
Enabled is 4.4 cycles per byte and disabled is 3.65.
Potential performance gain by disabling threading:

16-20% gain at 1500 MTU
31-38% gain at 9000 MTU
Threading can be enabled or disabled on a per shared Ethernet adapter

basis on the Virtual I/O Server.
To display the current status where SEA device is entX:
$ lsdev dev entX attr

thread 1 Thread mode enabled (1) or disabled (0)

To disable on the SEA device:
For new SEA use: -attr thread=0 flag on mkvdev command
For existing SEA: $ chdev dev entX attr thread=0
Figure 7-55. Threading/non-threading (2 of 2) AN313.1
Notes:
Enabling and disabling threading
You can enable or disable threading using the -attr thread option of the mkvdev
command. To enable threading, use the -attr thread=1 option. To disable threading,
use the -attr thread=0 option. For example, the following command disables threading
for a new shared Ethernet adapter: mkvdev -sea ent1 -vadapter ent5 -default
ent5 -defaultid 1 -attr thread=0
Even though threading is enabled or disabled on an SEA device basis, if a VIOS had
multiple SEA devices, they should all be configured the same way.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
TCP segmentation offload (largesend)

With largesend enabled on the SEA, LPARs can transmit
large data, which will get segmented by the real adapter to fit
its MTU.
Saves system CPU load and increases network throughput.
Must be enabled on the physical adapter first before creating
the SEA device.
Typically enabled by default on physical adapters.
Attribute is large_send.
Not enabled by default on SEA devices.
Available on VIOS as of version 1.3.
Attribute is largesend.
Figure 7-56. TCP segmentation offload (largesend) AN313.1
Notes:
largesend: Overview
The largesend attribute (also known as segmentation offload) enables TCP largesend
capability from logical partitions to the physical adapter. For SEA devices, the attribute is
largesend. The attribute that is set on the physical adapter is called large_send. TCP will
send a big chunk of data to the adapter when TCP knows that adapter supports largesend.
The adapter will break this big TCP packet into multiple smaller TCP packets that will fit the
outgoing MTU of the adapter, saving system CPU load and increasing network throughput.
The TCP stack on a partition will determine if the Virtual I/O Server supports largesend. If
it does, then the partition will send big TCP packets directly to the Virtual I/O Server
partition. The largesend capability for SEA devices is supported with the Virtual I/O Server
version 1.3 and above. It is not enabled by default.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Configuring largesend on SEA device

To configure on the VIOS:
Physical adapter:
Example for enabling on physical adapter device ent0:
$ chdev -dev ent0 -attr large_send=1
SEA: Can be set with mkvdev or with chdev
Example for enabling on SEA device ent3 after it is created:
$ chdev -dev ent3 -attr largesend=1
Check the configuration:
Physical adapter example:
$ lsdev -dev ent0 -attr | grep large_send
large_send yes Enable hardware Transmit TCP segmentation True
SEA example: yes = Enabled
$ lsdev -dev ent3 -attr | grep largesend
largesend 1 Enable Hardware Transmit TCP Resegmentation True
1 = Enabled
Figure 7-57. Configuring largesend on SEA device AN313.1
Notes:
Configuring largesend
Be sure to enable the large_send attribute on the physical adapter before associating it
with an SEA. Typically, large_send is enabled by default on physical adapters. The
largesend attribute can be enabled on the SEA when it is created with the mkvdev
command or later by using the chdev command.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Configuring largesend on client LPARs

Enable largesend on virtual adapters in the client LPARS
which use an SEA device that uses largesend.
Enable on the interface in AIX:

# ifconfig en0 largesend
To disable:
# ifconfig en0 -largesend
Check the configuration with ifconfig.

$ ifconfig en0
en1:
flags=1e080863,4c0<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,6
4BIT,CHECKSUM_OFFLOAD(ACTIVE),LARGESEND,CHAIN>

Note: ifconfig changes do not persist across boots.

Figure 7-58. Configuring largesend on client LPARs AN313.1
Notes:
Configuring largesend on the clients
On AIX, largesend can be enabled on an LPARs virtual adapter using the ifconfig
command. It is not enabled by default. You cannot use chdev to configure largesend on
virtual adapters. Since ifconfig changes do not persist across operating system boots, you
must add the ifconfig command to an AIX startup script. If largesend is not enabled on a
client interface, then LARGESEND does not appear in the ifconfig output.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
SEA bandwidth apportioning (1 of 3)

As of VIOS V1.5, SEA outgoing packets can be prioritized to
provide more control over the bandwidth used by particular
clients.
Quality of Service (QoS) protocol in IEEE 802.1q
Eight priority queues
Priorities are set on each packet
Strict mode:
Always deplete higher queue first before moving to next lower queue
Might starve lower priority queues
Loose mode:
After a number of bytes are sent, move to next lower queue
Figure 7-59. SEA bandwidth apportioning (1 of 3) AN313.1
Notes:
The bandwidth apportioning feature for the SEA, also known as Virtual I/O Server quality of
service (QoS), allows the VIOS to give a higher priority to some types of outgoing packets.
In accordance with the IEEE 801.q specification, VIOS administrators can instruct the SEA
to inspect bridged VLAN-tagged traffic for the VLAN priority field in the VLAN header. The
3-bit VLAN priority field allows each individual packet to be prioritized with a value from 0 to
7 to distinguish more important traffic from less important traffic. More important traffic is
sent sooner and uses more of the physical Ethernet adapter (configured in the SEA)
bandwidth than less important traffic.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide

To configure SEA bandwidth apportioning on SEA:
SEA threading must be enabled (value of 1; default):
See the current status on VIOS CLI where ent3 is the SEA device:
$ lsdev -dev ent3 -attr thread
value
1
To enable, if necessary:
$ chdev dev ent3 attr thread=1
Set qos_mode attribute to strict, loose, or disabled:
$ chdev -dev ent3 -attr qos_mode=loose
ent3 changed
To view qos_mode:
$ lsdev -dev ent3 -attr qos_mode
value
loose
Notes:
The qos_mode attribute
The VIOS administrator can reset the SEA qos_mode attribute to either strict or loose
mode. The default is disabled mode.
Disabled mode: VLAN traffic is not inspected for the priority field.
Strict mode: more important traffic is bridged over less important traffic.
Loose mode: a cap is placed on each priority level so that after a number of bytes is
sent for each priority level, the next level is serviced. The cap from the lowest to highest
priority queue is 2MB, 4MB, 8MB, 16MB, 32MB, 64MB,128MB, 256MB.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide

When adding a VLAN, specify the priority (0-7).
Priority values in order:
1 (highest), 2, 0 (default), 3, 4, 5, 6, 7 (lowest)
Only VLAN tagged packets have priorities; untagged packets are put in 0 queue
The VIOS trunk virtual Ethernet adapter must be configured with additional VLANs.
Each client creates a VLAN device and sets its priority.
Can be set for additional VLANs only (not PVIDs)
$ smit addvlan
Add A VLAN
Type or select values in entry fields.

Press Enter AFTER making all desired changes.
[Entry Fields]
VLAN Base Adapter ent3
* VLAN Tag ID [100] +#
VLAN Priority [1] +#
Notes:
The priorities are set on the VLAN devices themselves, so only VLAN devices can have
priorities set. The VLAN ID that is used as the PVID cannot have a priority set and will be
set to the default (0) priority. To use this feature, when the VIOS trunk virtual Ethernet
adapter is configured on an HMC, the adapter must be configured with additional VLAN IDs
because only the traffic on these VLAN IDs is delivered to the VIOS with a VLAN tag.
Untagged traffic is always treated as though it belongs to the default priority class (for
example, as if it had a priority value of 0). To enable the SEA to prioritize traffic, client
partitions must insert a VLAN priority value in their VLAN header. For AIX clients, a VLAN
pseudo-device must be created over the Virtual I/O Ethernet Adapter, and the VLAN
priority attribute must be set (the default value is 0).

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
SEA performance summary

Type of workload will affect performance.
Simplex, duplex, streaming, transaction-oriented, packet size all affect the use of
CPU resources.
Use analysis tools to monitor resource usage (CPU, memory, physical adapter
utilization).
Performance might be reduced even at 70% CPU saturation.
Choose the highest MTU that makes sense (1500 or 9000).
The more packets, the more CPU cycles consumed.
Leave TCP checksum offload enabled on all virtual and physical
Ethernet adapters.
Enable TCP segmentation offload (largesend) on SEA and client virtual
Ethernet adapters if physical adapter supports it.
Leave SMT enabled unless performance tests show otherwise.
Disable SEA threading if the VIOS provides only SEA services.
Utilize SEA bandwidth apportioning if desired to prioritize client traffic.
Figure 7-62. SEA performance summary AN313.1
Notes:
TCP checksum offload
By default, the chksum_offload attribute is enabled for both physical and virtual Ethernet
adapters. This setting disables the TCP checksum calculation. When configuring a shared
Ethernet adapter, for best performance with typical configurations, do not change the
default value for the associated physical and virtual adapters.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
SEA I/O analysis flowchart

SEA I/O performance
START
Actions
Is it a Yes Check adapter

physical statistics and
adapter? tuning options
No
Actions
Is it a Yes Is the VIO Yes
Shared Ethernet Check VIO Server
server CPU CPU activity;
adapter? bound? Tune CPU
No
Check adapter
statistics for Actions
saturation;
Tune adapter
Figure 7-63. SEA I/O analysis flowchart AN313.1
Notes:
Virtual Ethernet adapter
If a virtual Ethernet adapter is bound, check the adapter statistics with the entstat
command. Check the adapter memory usage with the netstat command to validate if there
is enough buffer allocated to this adapter. Check to see if the partition is memory or CPU
bound. Check the clients virtual adapter utilization to see which client is using a lot of the
bandwidth. If one client has a lot of network traffic, which is affecting the network activity of
other client partitions, then perhaps it should get a dedicated Ethernet adapter.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Viewing SEA configuration

Which device is it?
$ lstcpip -adapters
Ethernet adapters:
ent0 Available 05-08 10/100/1000 Base-TX PCI-X Adapter (14106902)
ent1 Available Virtual I/O Ethernet Adapter (l-lan)
ent2 Available Virtual I/O Ethernet Adapter (l-lan)
ent3 Available Shared Ethernet Adapter
Which adapters are configured as part of the SEA?

Use lsdev or lsmap -net -all
$ lsdev -dev ent3 -attr
attribute value description
ctl_chan Control Channel adapter for SEA failover
ha_mode disabled High Availability Mode
largesend 0 Enable Hardware Transmit TCP Resegmentation
netaddr 0 Address to ping
pvid 1 PVID to use for the SEA device
pvid_adapter ent1 Default virtual adapter to use for non-VLAN-tagged packets
real_adapter ent0 Physical adapter associated with the SEA
thread 1 Thread mode enabled (1) or disabled (0)
virt_adapters ent1 List of virtual adapters associated with the SEA (comma separ
Figure 7-64. Viewing SEA configuration AN313.1
Notes:
An understanding of the topology of the network devices involved is important when
troubleshooting a shared Ethernet adapter bottleneck. The commands in the visual above
will list the devices involved in a shared Ethernet adapter configuration. Both of the
commands shown are Virtual I/O Server CLI commands. The lstcpip -adapters
command can list several shared Ethernet adapters and the lsdev command as shown in
the visual will help you determine which physical and virtual adapters are associated with
each shared Ethernet adapter.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Monitor SEA activity

When monitoring shared Ethernet adapter bottlenecks, check:
Is the physical adapter the bottleneck?
Monitor SEA-specific statistics with the topas, entstat, and seastat
commands.
Tune:
Add physical adapters: aggregation, change network design.
Use the optimizenet and chdev CLI commands to alter configurations (MTU
size, threading, TCP parameters).
Are there enough resources on Virtual I/O Server?
Use vmstat or topas to monitor CPU and memory usage:
Note: The viostat command monitors CPU but it averages across all CPUs.
On the client, monitor virtual Ethernet adapter, with topas, netstat,
and so on, as if it were a physical adapter.
Use the lsdev command to see if it is physical or virtual.
Figure 7-65. Monitor SEA activity AN313.1
Notes:
Monitoring shared Ethernet adapter
-lsnetsvc: Gives the status of a network service
-lstcpip: Displays the TCP/IP settings
-optimizenet: Changes the characteristics of network tunables
-snmp_info: Requests values of Management Information Base variables managed by
a Simple Network Management Protocol agent
-traceroute: Prints the route that IP packets take to a network host.
The output of the entstat command will show how many errors and collisions have been
detected on the shared Ethernet adapter. The topas command monitors only configured
interfaces and therefore is not a tool to use for monitoring the shared Ethernet adapter
device if another interface is configured.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Monitor with the entstat command

Use the entstat -all command with the shared Ethernet
adapter device (ent#) or interface (en#).
Multi-page output shows statistics for physical and virtual adapters.
Values can be reset with the entstat reset ent# command.
Have packets been dropped?
Are there a large number of errors?
Are there mbuf errors?
Check value of thread queue overflow packets (maximum is 8192;
add CPU if it is that high).
Figure 7-66. Monitor with the entstat command AN313.1
Notes:
Using the entstat command
The entstat command will show statistics for the shared Ethernet adapter device, and the
virtual adapter and the physical adapter to which it is associated. If you notice many errors
or collisions in the entstat output, check for CPU starvation on the Virtual I/O Server and
check the physical adapter for saturation. Besides overall packet numbers and the number
of packets dropped, monitor the Thread queue overflow packets line item. Once this
reaches 8192, packets will be dropped. If this value gets very large and packets are being
dropped, configure additional CPU resources. In addition to entstat, you can use topas
from the VIOS CLI to monitor SEA activity. As of VIOS V1.5, topas now lists the SEA
interface when it is the interface that is configured.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Using seastat to monitor SEA

The seastat command:
Provides statistics per MAC address of all SEA clients
Must enable advanced accounting on the SEA first:
$ chdev -dev ent# -attr accounting=enabled
To view reports, use -d followed by SEA device name:
$ seastat -d ent#
Example entry from seastat output:
==========================================================
MAC: 7E:DF:45:0E:31:04
----------------------
Notice that client VLAN priority
can be seen from VIOS
VLAN: 100
VLAN Priority: 7
IP: 9.47.87.82

-------------------- -------------------
Bytes: 14648 Bytes: 18425
Figure 7-67. Using seastat to monitor SEA AN313.1
Notes:
The seastat command to display MAC addresses
lshwres -m MSname -r virtualio --rsubtype eth --level lpar \
-F lpar_name,mac_addr
The seastat command is new as of Virtual I/O Server V1.5. To use seastat to see statistics
about network traffic, advanced accounting must be enabled on the SEA device. When
advanced accounting is enabled, the SEA keeps track of the hardware (MAC) addresses of
all of the packets it receives from the LPAR clients, and increments packet and byte counts
for each client independently. Command options: -n suppresses name resolution and -c
zeros out the statistics. You can use the HMC to quickly view which MAC addresses belong
to each LPAR.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Checkpoint
1. True or False: When using shared Ethernet adapters, set the MTU
size to 65390 on the physical adapter for the best performance.
2. True or False: Processor utilization for large packet workloads on

jumbo frames is approximately half that required for MTU 1500.
3. If you see many collisions or dropped packets for the SEA device,
what are the first two things to investigate?
4. True or False: You can configure a maximum amount of network

bandwidth for individual clients of a shared Ethernet adapter.
5. True or False: For mixed shared Ethernet adapter and VSCSI

services, leave threading enabled on the shared Ethernet adapter
device.
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
1. True or False: When using shared Ethernet adapters, set the MTU size to 65390
on the physical adapter for the best performance.
2. True or False: Processor utilization for large packet workloads on jumbo frames is
approximately half that required for MTU 1500.
The answer is true.
3. If you see many collisions or dropped packets for the SEA device, what are the
first two things to investigate?
The answers are VIOS CPU utilization and physical adapter saturation.
4. True or False: You can configure a maximum amount of network bandwidth for
individual clients of a shared Ethernet adapter.
The answer is false. You can only set priorities.
5. True or False: For mixed shared Ethernet adapter and VSCSI services, leave
threading enabled on the shared Ethernet adapter device.
The answer is true.
Instructor Guide
Topic 4: Summary
adapter
seastat utility
Monitor Shared Ethernet
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Topic 5: Integrated Virtual Ethernet tuning

Tune the MCS value and queue pairs for optimal performance
or scalability
Figure 7-70. Topic 5: Integrated Virtual Ethernet tuning AN313.1
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
IVE architecture (1 of 2)
Every POWER6 processor-based server I/O subsystem contains the
P5IOC2 chip
Dedicated controller that acts as the primary bridge for all PCI buses and all
internal I/O devices, including IVE adapter
Most (not all) POWER6 servers have an IVE adapter
IVE design provides high throughput and a great improvement of
latency for short packets
GX+ bus attachment for performance
P5IOC2 chip IVE also contains:

VPD chip (contains MAC
IVE addresses)
POWER6 One or two system ports
GX interface
chip (unused)
HEA
HEA is the major hardware

component of IVE:
Contains ports and Layer 2
switches
Two or four physical ports

Figure 7-71. IVE architecture (1 of 2) AN313.1
Notes:
IVE: Overview
The major component of the IVE is the Host Ethernet Adapter (HEA) which contains all of
the ports and switches. Other IVE components include the Vital Product Description (VPD)
chip with the media access control (MAC) addresses for the ports and, depending on the
model, one or two system ports. Since an HMC is required for p570 management, the IVE
system ports are not used. The IVE design provides a great improvement of latency for
small packets. The methods used to achieve low latency include attachment to the GX+
bus, immediate data in descriptors to reduce memory access, and direct user space
per-connection queueing (bypassing the operating systems). Additional acceleration
functions were designed into the IVE in order to reduce host code path length. This IVE
adapter provides three times the throughput over current 10Gbps solutions (when using the
10Gb IVE card). IVE relies exclusively on the system memory and system processing
cores to implement acceleration features.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
IVE architecture (2 of 2)
Logical ports are associated with a specific physical port.
Port group:
Set of 16 logical ports:
Logical ports can be split evenly between the two physical ports in a port group or unevenly.
One or two port groups per Host Ethernet Adapter (HEA), depending on model
One or two physical ports per port group, depending on model
Each physical ports has its own Layer 2 switch.
LPAR OS
ent Logical devices as

Logical they appear in AIX
port lhea
Logical IVE
HEA
switch Port group
Physical
External switch
port
Figure 7-72. IVE architecture (2 of 2) AN313.1
Notes:
IVE terminology and description
AIX sees logical ports which are the representation of the shared physical ports. The logical
port will be an ent# device in AIX (and eth# in Linux). The OS in the LPAR box in the visual
above stands for operating system. Logical ports are grouped in sets of 16 called a port
group. Each port group will have either one or two physical ports, depending on the IVE
model. There are one or two port groups for each IVE depending on the model. The
administrator chooses which logical ports to allocate to partitions and which physical port to
use for the logical ports. Each LPAR can have one logical port per physical port. There is
one HEA in each IVE adapter. In the operating system, the HEA is represented logically as
an lhea device. If a partition uses two logical ports from the same HEA, they must use
different physical ports. In this case, there will be one lhea parent device and two ent#
devices.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
IVE example: Dual- and quad-port gigabit
LPARs
Maximum 16
logical ports
per port group
Logical switch Logical switch
Port group
1Gb physical 1Gb physical
port port
External switch
Two IVE 1 gigabit models:

Dual-port gigabit: One port group, two physical 1Gb ports
Quad-port gigabit: Two port groups, two physical 1Gb ports per port group
Figure 7-73. IVE example: Dual- and quad-port gigabit AN313.1
Notes:
IVE gigabit adapter options
The dual-port gigabit IVE adapter has one port group and two physical 1Gb ports.
Because one port group supports a maximum of 16 ports, this adapter supports up to
16 logical ports, and therefore up to 16 partitions. This is the IVE that comes standard
on the p570 servers.
The quad-port gigabit IVE adapter has two identical port groups and two physical 1Gb
ports per port group. Because of the two port groups, it supports up to 32 ports, and
therefore up to 32 partitions.
For communications between partitions on the same server using the same physical port
(and thus the same logical switch), no access to an external switch is needed. For the best
performance between two LPARs on the same server, use IVE ports that share the same
HEA logical switch. This configuration will use more CPU because of increased throughput.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
IVE example: Dual-port 10 gigabit
LPARs
Maximum 16
logical ports
per port group
Logical switch Logical switch
Port group 1 Port group 2

10Gb physical 10Gb physical
port port
External Layer 2 or Layer 3 switch
One IVE 10 gigabit model:

Dual-port 10 gigabit: Two port groups, one physical 10Gb port per port
group
Figure 7-74. IVE example: Dual-port 10 gigabit AN313.1
Notes:
IVE adapter 10 gigabit option
The dual-port 10 gigabit IVE adapter has two port groups and one physical (optical) 10Gb
port per port group. Because of the two port groups, it supports up to 32 ports, and
therefore up to 32 partitions.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
SEA versus dedicated HEA logical port

Shared Ethernet adapter: Integrated Virtual Ethernet:
Hypervisor forwards packets (latency) No forwarding to VIOS needed
PCI-e/X bus speed (or GX if using IVE) GX bus speed
Generated MAC addresses Hard-coded MAC addresses
Maximum clients when bandwidth is reached Maximum 16 or 32 logical ports
VIOS uses entire physical IVE port for SEA Supported by AIX V5.2
VIOS LPAR must be set as promiscuous AIX V5.2 LPAR must be set as
for physical port. promiscuous for physical port.
Virtual I/O
Linux AIX AIX Linux AIX AIX
Server
Packet Virtual Virtual Virtual

Ethernet Ethernet Ethernet
forwarder Ethernet Ethernet Ethernet
driver driver driver
driver driver driver
Virtual Ethernet switch Hypervisor

Hypervisor
Ethernet IVE
Network
adapter
Figure 7-75. SEA versus dedicated HEA logical port AN313.1
Notes:
Comparison of shared Ethernet adapter and IVE
The visual above using a shared Ethernet adapter for a client with an IVE configuration.
With an IVE, packets destined for the external network are not bridged through a Virtual I/O
Server partition. An IVE logical port can be used as the physical adapter in a shared
Ethernet adapter configuration. In this case, the physical IVE port is used exclusively by the
VIOS, and can not be configured with other logical ports. The shared Ethernet adapter
configuration requires that clients create a virtual Ethernet adapter which is not supported
by AIX V5.2.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Multi-Core Scaling value

The MCS value:
Sets the number of parallel streams for network traffic in an LPAR to optimize
performance.
Sets the maximum number of logical ports for a port group.
Configured per port group (regardless of number of physical ports).
Scalability:
Use MCS to specify number of ports you need to support LPARs.
Default value is 4 on p570 (for example), therefore only four logical ports per port
group available.
Must power off/on managed system to change MCS
MCS = 1 MCS = 2 MCS = 4 MCS = 8 MCS = 16

Maximum
logical ports 16 8 4 2 1
allowed per
port group
Figure 7-76. Multi-Core Scaling value AN313.1
Notes:
There is a performance aspect and a scalability aspect to the Multi-Core Scaling (MCS)
configuration option. The MCS value controls the level of parallelism used by each
partitions operating system for network traffic and it controls the total number of logical
ports available for use for a particular port group. As you can see in the table in the visual,
the larger the MCS value, the fewer ports that can be configured per port group. The
default value on a p570 is four, therefore there will only be four available ports per port
group. Decreasing the MCS value will increase the number of available logical ports per
port group and is appropriate when, on average, the partitions utilizing the IVE have lower
network bandwidth requirements. Increasing the MCS value will provide for fewer ports per
port group and is appropriate when, on average, LPARs have higher network bandwidth
requirements.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Queue pairs (1 of 2)
The MCS value sets the number of queue pairs in AIX.
A queue pair is a pair of transmit and receive queues.
Having multiple queue pairs breaks network traffic into multiple streams that can
be dispatched to multiple cores to take advantage of parallel processing.
P6 P6 P6 P6 P6
core core core core core
Dispatched on
next available Port
Port
core (1 of 4)
(1 of 16)
lhea lhea
QP QP QP QP QP
1 Queue
4 Queue
pair
ent# ent# pairs
(1 stream)
(4 streams)
Figure 7-77. Queue pairs (1 of 2) AN313.1
Notes:
Breaking up the network traffic into multiple streams enables the traffic to be processed in
parallel by interrupt handlers running on different processors. This is beneficial when there
are enough processors so that each stream can be dispatched in parallel. The visual above
shows two configuration scenarios. The one on the left shows an example in which the
MCS value is 1 and this sets the number of queue pairs (QP) in each of the partitions which
use the same port group to 1. The example on the right shows a partition with four queue
pairs which can utilize four processor cores in parallel to process network traffic. This is
only beneficial for performance if the partition is configured with at least four processors
(dedicated, virtual, or logical). Having more QPs will use more CPU resources.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Queue pairs (2 of 2)
The number of queue pairs (QPs) in each LPAR is equivalent to the
MCS value for the port group.
MCS = 1 MCS = 2 MCS = 4 MCS = 8 MCS = 16

Max # of ports 16 8 4 2 1
per port group
# of QPs per 1 2 4 8 16
port
The best IVE port performance is when:

MCS value equals the number of processors in an LPAR
because
the number of QPs in the LPAR matches the number of
processors.
Example: LPAR has two virtual processors so its performance will be
best when MCS=2 on the port group it is using.
Figure 7-78. Queue pairs (2 of 2) AN313.1
Notes:
The number of queue pairs shown in the visual is per partition. If MCS is 4 and you have
four partitions using that port group, each of those partitions will have four queue pairs.
Each stream can be dispatched on the next available processor. One processor core can
handle two queue pairs with simultaneous multi-threading enabled. Therefore, in the
example in the visual, with SMT enabled and with one virtual processor (two logical
processors), two queue pairs would be ideal.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Viewing number of QPs in AIX

What is the MCS value (# of QPs) for this IVE port device?
# entstat -d ent0 | grep QP

Number of Default QPNs for Port: 4
Default QPN Array:
QP0 | num: 24 | bid: 90000300 | ISN/level: 308
QP1 | num: 25 | bid: 90000300 | ISN/level: 309
QP2 | num: 26 | bid: 90000300 | ISN/level: 30A
QP3 | num: 27 | bid: 90000300 | ISN/level: 30B
Figure 7-79. Viewing number of QPs in AIX AN313.1
Notes:
The AIX entstat command output will list the number of queue pairs (QPNs) for the IVE
logical port.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Viewing MCS and QP configuration in AIX

MCS can be disabled in AIX resulting in exactly one QP for the device.
Is MCS enabled in AIX (default)?
# lsattr -El ent1 | grep multicore

multicore yes Enable Multi-Core Scaling True
How do I disable MCS in this instance of AIX?

# ifconfig en1 down detach Equivalent of
# chdev -l ent1 -a multicore=no MCS=1
# lsattr -El ent1 | grep multicore

multicore no Enable Multi-Core Scaling True
Figure 7-80. Viewing MCS and QP configuration in AIX AN313.1
Notes:
The multicore attribute
The multicore attribute is enabled by default. By disabling the multicore attribute, the
partition will have one queue pair. If the MCS value on the port group is a higher number
than you want for an LPAR, one tuning strategy is to disable the multicore attribute.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
MCS and QPs tuning

Balance scalability with performance needs:
Changing the MCS value changes the quantity of QPs and the number of ports.
Tuning can provide near wire speed (even for 10 Gbps ports).
Tuning:
MCS value might be dictated by number of LPARs to support.
Can dynamically disable/enable multicore for the IVE port device.
Do not arbitrarily change number of virtual processors to fit QPs.
Have enough CPU resources:
Intra-physical port network communications will be faster and use more CPU.
Simultaneous multithreading should provide performance benefit.
Try one thing, monitor, and see if performance improves.
Configuration example:
Four processor system with four LPARs, set MCS = 4 for performance.
If one LPAR has four processors, and others have just one, disable MCS in the three
one-processor partitions.
If all LPARs use approximately one processor (VP or dedicated), disable multicore in
all.
Figure 7-81. MCS and QPs tuning AN313.1
Notes:
Using HEA ports and system resources
The higher the MCS value, the more CPU cycles per Gbps throughput due to less effective
interrupt coalescing. IVE ports use less CPU than using an SEA, because with an SEA,
there is CPU processing on the client and on the Virtual I/O Server. The 10Gb IVE card can
be driven at full bandwidth even using a maximum transmission unit (MTU) of 1500. For
10Gb adapter only, there is also some benefit to using an MCS greater than one to allow
spreading across the direct memory access (DMA) engine. In the past, the processing
power of the adapter card could not keep up with the volume of packets required at that
MTU to drive the full bandwidth.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Performance considerations
Purchase appropriately-sized IVE model for network needs.
Use IVE over other types of Ethernet adapters when possible.
IVE performance is faster on GX bus than a PCIe/X Ethernet card.
Distribute LPAR usage across IVE physical ports.
All LPARs sharing a physical port share the bandwidth of that port.
For fastest IVE communications between LPARs, use logical
ports on same internal IVE switch (that is, on the same
physical port).
Typically, SMT will increase performance.
Use flow control option in port groups, particularly for 10 Gb
IVE.
Overall recommended value (for performance) is to set MCS to
the number of logical processors for the partitions which will
use the port group.
Figure 7-82. Performance considerations AN313.1
Notes:
IVE performance considerations
The first statement on the visual refers to the fact that there are three different IVE models
that can be ordered for most POWER6 systems. Using IVE logical ports for communication
that are on the same physical port can be several times faster than using a different
physical port. The best improvements with SMT can be seen with configurations with
communications between two logical ports on the same physical port. Check the box for
the flow control on the physical port to have the HMC attempt to negotiate flow control in
both the transmit and receive directions. The HMC enables flow control in the directions for
which the HMC can negotiate flow control. It is recommended that this is enabled,
especially for the 10Gb IVE model.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Monitoring IVE traffic (1 of 2)

Use entstat to monitor LHEA port traffic.
# entstat d ent0 | more
-------------------------------------------------------------
ETHERNET STATISTICS (ent0) :
Device Type: Host Ethernet Adapter (l-hea)
Hardware Address: 00:14:5e:5f:64:a1
Elapsed Time: 1 days 9 hours 7 minutes 40 seconds

-------------------- -------------------
Bytes: 74671769301 Bytes: 79669316081

(page down)
Multi-Core Scaling Specific Statistics:
----------------------------------------------
Total / Average:
----------------
RX Interrupt: 11305868 / 2826467
RX Coalesce: 38356 0/ 95890
(Shows total, then information for each queue pair)
Figure 7-83. Monitoring IVE traffic (1 of 2) AN313.1
Notes:
Using the entstat command
The entstat AIX command is a tool to use for monitoring Ethernet traffic, including IVE
traffic. In the visual above, the ent0 device is the Logical Host Ethernet Port (lp-hea) device
as shown in the output of the lsdev c adapter AIX command. This is the device you
would use if the HEA port is used for communications on an AIX client or on a VIOS where
it is not used as part of an SEA configuration. On the VIOS, if the IVE port is the physical
adapter in an SEA configuration, use the SEA device with the entstat command.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Monitoring IVE traffic (2 of 2)

Logical Host Ethernet Port (l-port) Driver Properties:
------------------------------------------------------
HW TX TCP Segmentation Offload: On
HW TX Checksum Offload: On
HW RX Checksum Offload: On
TX and RX Jumbo Frames: Off
RX Interrupt Coalescing: On
Promiscuous Mode: Off
Logical Port Link State: Up
Physical Port Link State: Up
Media Speed Selected: Autonegotiate
Media Speed Running: 1000 Mbps / 1 Gbps, Full Duplex
Logical Host Ethernet Port (l-port) Specific Properties:

--------------------------------------------------------
Logical Port Number: 2
Port Operational State: Up
External-Network Switch-Port Operational State: Up
External-Network-Switch (ENS) Port Speed: 1000 Mbps / 1 Gbps, Full Duplex
Port Receive Control:

TX Pause Frame Negotiated: TRUE
RX Pause Frame Negotiated: TRUE
Number of Default QPNs for Port: 4

Figure 7-84. Monitoring IVE traffic (2 of 2) AN313.1
Notes:
More entstat command output
The visual shows that entstat command output also displays configuration properties.
Notice the promiscuous setting, logical port number, and the number of QPs.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Checkpoint (1 of 2)
1. True or False: The IVE allows partitions to connect to an external
network without the need for a Virtual I/O Server partition.
2. True or False: Partitions using IVE logical ports must be connected to

an external switch to communicate with each other.
3. True or False: The standard IVE adapter card on most POWER6

systems will connect 16 LPARs, but you can optionally order an IVE
adapter card which connects up to 32 LPARs.
4. True or False: An IVE logical port can be used as the physical adapter
in an SEA configuration.
5. You can see the number of QPs by looking at output from what
command?
a. lsattr (AIX)
b. entstat (AIX)
c. ifconfig (AIX)
d. lshwres (HMC)
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
1. True or False: The IVE allows partitions to connect to an external network without the need for a
Virtual I/O Server partition.
The answer is true.
2. True or False: Partitions using IVE logical ports must be connected to an external switch to
communicate with each other.
The answer is false. Partitions configured with logical ports on the same physical port do not need to
connect through an external switch to communication with each other.
3. True or False: The standard IVE adapter card on most POWER6 systems will connect 16 LPARs,
but you can optionally order an IVE adapter card which connects up to 32 LPARs.
The answer is true.
4. True or False: An IVE logical port can be used as the physical adapter in an SEA configuration.
The answer is true.
5. You can see the number of QPs by looking at output from what command?
a. lsattr (AIX)
b. entstat (AIX)
c. ifconfig (AIX)
d. lshwres (HMC)
The answer is entstat (AIX).
Instructor Guide
Checkpoint (2 of 2)
6. True or False: It is best to have the number of QPs equivalent to the
number of virtual, dedicated, or logical processors in a partition
(whichever is the highest number).
7. True or False: The best performance will be between logical ports

which share the same internal switch.
8. True or False: The MCS value sets the maximum number of available
logical ports per physical port.
9. True or False: The MCS value sets the number of queue pairs (QPs)
in each partition which is configured for that port group.
10. True or False: The MCS value can be changed dynamically.
11. What is the effect of disabling the multicore attribute for an LHEA
Ethernet device in an AIX LPAR?
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
6. True or False: It is best to have the number of QPs equivalent to the number of virtual, dedicated,
or logical processors in a partition (whichever is the highest number).
The answer is true.
7. True or False: The best performance will be between logical ports which share the same internal
switch.
The answer is true.
8. True or False: The MCS value sets the maximum number of available logical ports per physical
port.
The answer is false. MCS value sets the maximum number per port group.
9. True or False: The MCS value sets the number of queue pairs (QPs) in each partition which is
configured for that port group.
The answer is true.

The answer is false. You must completely power off the managed system and power it back on to
change the MCS value.
11. What is the effect of disabling the multicore attribute for an LHEA Ethernet device in an AIX LPAR?
The answer is when you disable the multicore attribute, the device has just one QP.
Instructor Guide
Topic 5: Summary
Tune the MCS value and queue pairs for optimal performance
or scalability
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Exercise
Unit
exerc
ise
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Unit summary
Determine which client partitions and devices are affecting the Virtual I/O Server performance
Describe the partition resource sizing guidelines for Virtual I/O Servers used for virtual SCSI
Use performance analysis tools to monitor virtual SCSI device performance
Describe how the following tuning options affect virtual Ethernet performance:
MTU sizes, CPU entitlement, TCP checksum offloading, simultaneous multithreading
Describe Virtual I/O Server sizing guidelines for hosting shared Ethernet adapter services
Configure TCP segmentation offload on the shared Ethernet adapter
Configure SEA bandwidth apportioning and monitor with the seastat utility
Monitor shared Ethernet adapter network traffic with Virtual I/O Server utilities
List performance and network availability considerations when configuring IVE devices
Tune the MCS value and queue pairs for optimal performance or scalability
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide

V5.4.0.3
Instructor Guide
Uempty Unit 8. Partition mobility
Estimated time
01:30

This unit describes the partition mobility feature. This includes
implementing the Remote Live Partition Mobility, dual Virtual I/O
Server considerations, and troubleshooting.

Describe active or inactive partition mobility
Identify the components required to successfully perform a partition
migration
Implement Live Partition Mobility
Troubleshoot common partition mobility problems

Machine exercises
References
SG24-7460-01 IBM System p Live Partition Mobility Redbook
IBM POWER6 partition mobility: Moving virtual servers seamlessly
between physical systems
http://researchweb.watson.ibm.com/journal/rd/516/ar
mstrong.pdf
Copyright IBM Corp. 2010, 2011 Unit 8. Partition mobility 8-1

Instructor Guide
Unit objectives
Identify the components required to successfully perform a

partition migration
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details

Instructor Guide
What is partition mobility?

Move running partitions from one POWER6/POWER7 system
to another POWER6/POWER7 system
Live (Active) Partition Mobility
Inactive partition mobility
POWER6 or POWER6+ compatibility modes for moving between
POWER6 and POWER7
Provides non-disruptive system maintenance flexibility
Reduce planned down time
Dynamically move applications from one server to another
Environment and system resource optimization option
Can balance workloads and resources
Additional control over the usage of resources in the data center
Can be used as a mechanism for server consolidation
Live Partition Mobility can be automated
Commands, scripts, and so on
Figure 8-2. What is partition mobility? AN313.1
Notes:
Partition mobility provides the ability to move a logical partition from one system to another.
Live (or active) partition mobility allows you to move a running logical partition, including its
operating system and applications, from one system to another. The applications do not
need to be shut down. Inactive partition mobility allows you to move a powered off (or
deactivated) logical partition from one system to another.
Live Partition Mobility
Live Partition Mobility allows you to migrate running AIX and Linux partitions and their
hosted applications from one physical server to another without disrupting the
infrastructure services. The migration operation, which takes just a few seconds, maintains
complete system transactional integrity. The migration transfers the entire system
environment, including processor state, memory, attached virtual devices, and connected
users.
As the number of hosted partitions increases, finding a maintenance window acceptable to
all becomes increasingly difficult. Live partition mobility allows you to move your partitions

V5.4.0.3
Instructor Guide
Uempty around so that you can perform previously disruptive operations on the machine when it
best suits you, rather than when it causes the least inconvenience to the users.
Live partition mobility helps you meet the increasingly stringent service-level agreements
(SLAs) because it allows you to proactively move running partitions and applications from
one server to another. The ability to move running partitions from one server to another
offers you the ability to balance workloads and resources. If a key application's resource
requirements peak unexpectedly to a point where there is contention for server resources,
you might move it to a larger server or move other, less critical, partitions to different
servers, and use the freed-up resources to absorb the peak.
Live partition mobility can also be used as a mechanism for server consolidation, as it
provides an easy path to move applications from individual, stand-alone servers to
consolidation servers. If you have partitions with workloads that have widely fluctuating
resource requirements over time (for example, with a peak workload at the end of the
month or the end of the quarter), you can use live partition mobility to consolidate partitions
to a single server during the off-peak period, allowing you to power-off unused servers.
Then move the partitions to their own, adequately configured servers, just prior to the peak.
Live partition mobility contributes to the continuous availability goal. It can:
Reduce planned down time by dynamically moving applications from one server to
another
Respond to changing workloads and business requirements by letting you move
workloads from heavily loaded servers to servers that have spare capacity
Reduce energy consumption by allowing you to easily consolidate workloads and
power off unused servers
Inactive migration moves the definition of a powered off logical partition from one system to
another along with its network and disk configuration. No additional change in network or
disk setup is required and the partition can be activated as soon as migration is completed.
The inactive migration procedure performs the reconfiguration of the systems involved,
including the following:
A new partition is created on the destination system with the same configuration
present on the source system.
Network access and disk data is preserved and made available to the new partition.
On the source system, the partition configuration is removed and all involved resources
are freed.
If a system is down due to scheduled maintenance or not in service for other reasons, an
inactive migration can be performed. It is executed in a controlled way and with minimal
administrator interaction so that it can be safely and reliably performed in a very short time
frame.

Instructor Guide
Partition mobility and POWER7

To move an LPAR between POWER6 and POWER7 processor-based servers, the LPAR
must be operating in POWER6 or POWER6+ compatibility modes. A partition that uses AIX
5.3 executes in POWER6 or POWER6+ compatibility mode. A partition using AIX 6.1 with
TL2, TL3 or TL4 up to SP2 executes in POWER6 or POWER6+ compatibility mode.
The possibility to move partitions between POWER6 and POWER7 processor-based
servers greatly facilitates the deployment of POWER7 processor-based servers, as
follows:
Installation of the new server can be performed while the application is executing on
POWER6 server. After the POWER7 processor-based server is ready, the application
can be migrated to its new hosting server without application down-time.
When adding POWER7 processor-based servers to a POWER6 environment, you get
the additional flexibility to perform workload balancing across the whole set of POWER6
and POWER7 processor-based servers.
When performing server maintenance, you get the additional flexibility to use POWER6
Servers for hosting applications usually hosted on POWER7 processor-based servers,
and vice-versa, allowing you to perform this maintenance with no application planned
down-time.

V5.4.0.3
Instructor Guide

Purpose Discuss partition mobility and its benefits.
Details
Transition statement The following is a quick overview of the process.

Instructor Guide
Overview of process (1 of 2)
Move running partitions from one system to another
POWER6 systems
Partition with only virtual devices
Virtual disks must be LUNs on external SAN storage
LUNs must be accessible to a VIO Server on each system.
Shared and not reserved
Invoked by HMC command or GUI
migrlpar
Can be performed using the Integrated Virtualization Manager (IVM)
Figure 8-3. Overview of process (1 of 2) AN313.1
Notes:
Partition mobility provides systems management flexibility and improves system
availability. For example:
You can avoid planned outages for hardware or firmware maintenance by moving
logical partitions to another server and then performing the maintenance. Partition
mobility can help lead to zero downtime maintenance because you can use it to work
around scheduled maintenance activities.
You can avoid downtime for a server upgrade by moving logical partitions to another
server and then performing the upgrade. This allows your end users to continue their
work without disruption.
If a server indicates a potential failure, you can move its logical partitions to another
server before the failure occurs. Partition mobility can help avoid unplanned downtime.
You can consolidate workloads running on several small, under-used servers onto a
single large server.
You can move workloads from server to server to optimize resource use and workload
performance within your computing environment. With active partition mobility, you can
manage workloads with minimal downtime.

V5.4.0.3
Instructor Guide

Purpose Overview of partition mobility.
Details There are a few ways to invoke the migration process. This is a quick view of the
most popular ways to initiate the migration.
Transition statement The following is a graphical depiction of two systems before
initiating the migration.

Instructor Guide
Overview of process (2 of 2)
hdisk0
Server Server
A B
vscsi0 en0
Service processor
Service processor
VASI VASI
vhost0 hdisk1 en1 MSP MSP en1 hdisk1

vtscsi0 en2 en2 en2
SEA en2 SEA
fcs0 en0 en0 fcs0
HMC
Ethernet network
Storage area network
LUN
Figure 8-4. Overview of process (2 of 2) AN313.1
Notes:
Active partition mobility lets you move a running logical partition, including its operating
system and applications, from one server to another without disrupting the operation of that
logical partition.
1. The user ensures that all requirements are satisfied and all preparation tasks are
completed.
2. The user initiates active partition mobility using the Partition Migration wizard (or
migrlpar command) on the HMC.
3. The HMC verifies the partition mobility environment.
4. The HMC prepares the source and destination environments for active partition
mobility.
5. The HMC initiates the transfer of the partition state from the source environment to the
destination environment. This includes all the logical partition profiles associated with
the mobile partition.

V5.4.0.3
Instructor Guide
Uempty - The source mover service partition (MSP) extracts the partition state information
from the source server and sends it to the destination mover service partition over
the network.
- The destination mover service partition receives the partition state information and
installs it on the destination server.
6. The HMC initiates the suspension of the mobile partition on the source server. The
source mover service partition continues to transfer the partition state information to the
destination mover service partition.
7. The Hypervisor resumes the mobile partition on destination server.
8. The HMC initiates completion of the migration. This means that all resources that were
consumed by the mobile partition on the source server are reclaimed by the source
server, including:
- The source Virtual I/O Server unlocks, unconfigures, or undefines virtual resources
on the source server.
- The HMC removes the hosting virtual adapter slots from the source Virtual I/O
Server logical partition profiles as required.
9. The user performs post requisite tasks, such as:
- Adding the mobile partition to a partition workload group
- Adding dedicated I/O adapters
Inactive partition mobility lets you move a powered off logical partition from one server to
another.
1. The user ensures that all requirements are satisfied and all preparation tasks are
completed.
2. The user shuts down the mobile partition.
3. The user initiates inactive partition mobility using the Partition Migration wizard on the
HMC.
4. The HMC verifies the partition mobility environment.
5. The HMC prepares the source and destination environments for inactive partition
mobility.
6. The HMC initiates the transfer of the partition state from the source environment to the
destination environment. This includes all the logical partition profiles associated with
the mobile partition.
7. The HMC initiates completion of the migration. This means that all resources that were
consumed by the mobile partition on the source server are reclaimed by the source
server, including:

Instructor Guide
- The source Virtual I/O Servers unlocks, unconfigures, or undefines virtual resources
on the source and destination servers.
- The HMC removes the hosting virtual adapter slots from the source Virtual I/O
Server logical partition profiles.
8. The user activates the mobile partition on the destination server.
9. The user performs post requisite tasks, such as:
- Establishing virtual terminal connections
- Adding the mobile partition to a partition workload group

V5.4.0.3
Instructor Guide

Purpose Overview the active and inactive partition mobility process.
Details Do not go into detail here. Use the following slides to discuss the process and
component details.
Transition statement These components must reside on the source and destination
systems.

Instructor Guide
Components
Partition
POWER6
Virtual disk hdisk0
Mapped through VIO Server to a LUN server
Virtual Ethernet
Mapped through SEA in VIO Server vscsi0 en0
Hypervisor
network
Private
VLAN
Service processor
vSCSI
Virtual Ethernet
VASI
Support for migration HMC
vhost0 hdisk1 en1 MSP
VIO Server vtscsi0 en2
en2
Provides the Virtual I/O SEA
VASI interface to Hypervisor Open
fcs0 en0
network
Mover Service Partition
HMC
Ethernet network
Configuration of required capabilities
Validation of configuration
Orchestrates the sequence of events Storage area network
LUN
VASI: Virtual Asynchronous Services Interface
Figure 8-5. Components AN313.1
Notes:
The candidate partition must be one that has only virtual devices. If there are any physical
devices in its allocation, they must be removed before the validation or migration is
initiated.
The Hypervisor must support the partition mobility functionality. POWER6 Hypervisors
have this capability. PowerVM Enterprise edition must be ordered for both source and
destination managed systems.
The Virtual I/O Server on the source system provides the access to the clients resources,
but also has a Virtual Asynchronous Services Interface (VASI) and is identified as a mover
service partition (MSP). The VASI device allows the mover service partition to
communicate with the Hypervisor. MSP must be configured on both the source and
destination Virtual I/O Servers designated as the mover service partitions for the mobile
partition to participate in active mobility. The MSP is a Virtual I/O Server logical partition
that has at least one VASI adapter configured to allow the MSP to communicate with the
Hypervisor.

V5.4.0.3
Instructor Guide
Uempty The HMC is used to configure, validate, and orchestrate the migration operation. You can
use the HMC to configure the Virtual I/O Server as an MSP. There is no need to create this
VASI adapter on the Virtual I/O Server. This device is automatically configured.
HMC includes a wizard or program that validates your configuration and identifies errors
that cause the migration to fail. During the migration, the HMC controls all phases of the
process.

Instructor Guide
Instructor notes:
Purpose Identify key elements of the migration process.
Details
Transition statement The following is an introduction to the requirements.

V5.4.0.3
Instructor Guide
Uempty
Basic requirements (1 of 4)
Two POWER6 or above systems
Managed by the same HMC or different HMCs
PowerVM Enterprise feature activated
Compatible CODE levels (with partition mobility enabled)
HMC, system firmware, OS, VIO Server
The same logical memory block (LMB) size on each system
Source and target systems must have:
VIOS providing mobile LPARs network and disk access
LUN access (external hdisk with no_reserve )
Both systems must bridge to clients networks
Operating, with VIO Server running
MSP
Target system:
No partition with the same name
Cannot be running on internal battery power
Must have sufficient resources (CPU and memory) available
Figure 8-6. Basic requirements (1 of 4) AN313.1
Notes:
Preparation
When you have created the Virtual I/O Servers, and configured mover service partition
devices, you must prepare the source and destination systems for migration by doing the
following:
1. Synchronize the time of day clocks on the mover service partitions using an external
time reference, such as the network time protocol (NTP). This is an optional step that
increases the accuracy of time measurement during migration. It is not required by the
migration mechanisms and even if this step is omitted, the migration process correctly
adjusts the partition time. Time never goes backwards on the mobile partition during a
migration.
2. Prepare the partition for migration.
- Use dynamic reconfiguration on the HMC to remove all dedicated I/O, such as PCI
slots, GX slots, and HEA, from the mobile partition.
- Remove the partition from a partition workload group (if assigned).

Instructor Guide
3. Prepare the destination Virtual I/O Server.

- Configure the shared Ethernet adapter as necessary to bridge the mobile LPAR's
VLANs.
- Configure the SAN so that requisite storage devices are available.
The hardware and software required to use partition mobility varies depending on whether
you are migrating an active or inactive partition. Make sure that your partition mobility
environment meets minimum requirements before you migrate your partition.
HMC requirement
PowerVM Live Partition Mobility can include one or more HMCs:
Both the source and destination servers are managed by the same HMC (or redundant
HMC pair). In this case, the HMC must be at version 7 release 3.2, or later.
The source server is managed by one HMC and the destination server is managed by a
different HMC. In this case, both the source HMC and the destination HMC must meet
the following requirements:
- The source HMC and the destination HMC must be connected to the same network
so that they can communicate with each other.
- The source HMC and the destination HMC must be at Version 7, Release 3.4.
Use the lshmc command to display the HMC version
hscroot@hmc1:~> lshmc -V
"version= Version: 7
Release: 3.4.0
Service Pack: 0
HMC Build level 20080929.1
","base_version=V7R3.4.0
- A secure shell (SSH) connection must be set up correctly between the two HMCs.
Run the following command from the source server HMC to configure the ssh
authentication to the destination server HMC (note that 9.3.5.180 is the IP address
of the destination HMC):
hscroot@hmc1:~> mkauthkeys -u hscroot --ip 9.3.5.180 -g
Enter the password for user hscroot on the remote host 9.3.5.180:
Run the following command from the source server HMC to verify the ssh
authentication to the destination server HMC:
hscroot@hmc1:~> mkauthkeys -u hscroot --ip 9.3.5.180 --test
Source and destination system requirements
Must be an IBM System p6 model
Must both be at firmware level eFW3.2 or above

V5.4.0.3
Instructor Guide
Uempty Destination system must not have a partition with the same name as the one that is to
be migrated
Source and destination Virtual I/O Server requirements
There must be at least one Virtual I/O Server logical partition installed and activated on
both the source and destination servers. The source and destination Virtual I/O Server
partitions must be at release level 1.5.
The mobile partition's network and disk access must be virtualized using one or more
Virtual I/O Servers.
The Virtual I/O Servers on both systems (source and target) must have a shared
Ethernet adapter configured to bridge to the same Ethernet network used by the mobile
partition.
The Virtual I/O Servers on both systems must be capable of providing virtual access to
all disk resources the mobile partition is using.
On the destination Virtual I/O Server partition, do not create any virtual SCSI adapters for
the mobile partition. These are created automatically by the migration function.
Mobile partition's operating system requirements
AIX version 5.3 Technology Level 7 or later
Red Hat Enterprise Linux version V5.1 or later
SUSE Linux Enterprise Services 10 (SLES 10) Service Pack 1 or later
Previous versions of AIX and Linux can participate in inactive partition mobility, if the
operating systems support virtual devices and IBM System p6 models.
Battery power
Ensure that the destination system is not running on battery power. If the destination
system is running on battery power, then you need to return the system to its regular power
source before moving a logical partition to it. However, the source system can be running
on battery power.

Instructor Guide
Instructor notes:
Purpose Identify key requirements.
Details Reference some of the details in the student notes.
Transition statement A couple of the VIOS requirements can be viewed or created at
the HMC.

V5.4.0.3
Instructor Guide
Uempty
Virtual
Asynchronous
Services
Interface
Notes:
Access the VIOS LPAR's properties to configure it as a mover service partition by selecting
the option in the General tab. Also, verify the VASI interface is created by viewing the
Virtual Adapters tab. With the code available before November 2007 (VIOS 1.4), the VASI
interface had to be manually created. With VIOS 1.5, the VASI interface is automatically
created.
A VASI device is a virtual device unique to active partition mobility that allows the mover
service partition to communicate with the hypervisor.

Instructor Guide
Instructor notes:
Purpose Identify how to set a VIOS LPAR as an MSP and verify that the VASI interface
is created.
Details

V5.4.0.3
Instructor Guide
Uempty
Partition migration support must be enabled by entering PowerVM-
Enterprise activation key.
System firmware and
HMC code must be at
the required levels
CEC
properties
Notes:
You can verify the source and destination systems support partition mobility by looking at
the properties of the managed systems. If your system is not capable, the Migration tab
would not be visible.
The base level code requirements
HMC version and release must be at V7R320 or later.
POWER6 firmware level eFW3.2.
Must have PowerVM Enterprise Edition feature.

Instructor Guide
Instructor notes:
Purpose Manual validation of the basic requirements for the VIO and managed system.
Details
Transition statement The following lists the LPARs requirements.

V5.4.0.3
Instructor Guide
Uempty
Partition to be migrated must satisfy the following
requirements:
Not using physical I/O
Only virtual devices (user-defined virtual devices must have a virtual
slot number higher than 10)
VSCSI not backed by LVs or files in VIOS
Not set for redundant error path reporting
No additional virtual serial ports
Not part of a workload group
Not using barrier synchronization register
Not using huge pages
Notes:
These are additional considerations for the mobile client.
BSR is a memory register that is located on certain POWER-based processors. A
parallel-processing application running on AIX can use a BSR to perform barrier
synchronization, which is a method for synchronizing the threads in the parallel-processing
application. For a logical partition to participate in active partition mobility, it cannot use
BSR arrays. If the mobile partition uses BSR, it can participate in inactive partition mobility.
Huge pages can improve performance in specific environments that require a high degree
of parallelism, such as in DB2 partitioned database environments. You can specify the
minimum, desired, and maximum number of huge pages to assign to a partition when you
create the partition or partition profile. For a logical partition to participate in active partition
mobility, it cannot use huge pages. If the mobile partition uses huge pages, it can
participate in inactive partition mobility.

Instructor Guide
Instructor notes:
Purpose Identify the client/mobile partition's basic requirements.
Details
Transition statement You can use the HMC GUI or commands to assist with checking
the environment.

V5.4.0.3
Instructor Guide
Uempty
Validation (1 of 8)
Validation check options
lslparmigr and migrlpar HMC commands
$ lslparmigr -r virtualio -m srcSystem -t destSystem \
filter lpar_names=myLPAR
$ lslparmigr -r msp -m srcSystem -t destSystem filter \

"lpar_names=myLPAR
$ migrlpar -o v -m srcSystem -t destSystem -p myLPAR \

-i source_msp_id=2,dest_msp_name=S2_VIOS2
HMC GUI
Figure 8-10. Validation (1 of 8) AN313.1
Notes:
Before performing the migration, you should perform a validation. Explicitly requesting this
is optional but is recommended to manage errors before invoking the migration.
The HMC provides an easy way to check the systems and HMC for most of the
requirements. The process could be invoked from the HMC GUI or from using the
lslparmigr and migrlpar commands. When performing the migrate' option from the HMC
GUI, the default action is to automatically run the validation process before performing the
migration process.

Instructor Guide
Instructor notes:
Purpose Identify how to perform a validation.
Details
Transition statement Lets see a validation wizard example.

V5.4.0.3
Instructor Guide
Uempty
Validation (2 of 8)
Validation wizard example
Required when source and

target systems are managed
by different HMCs (v7.3.4)
Source and Target

MSPs
Target VIO Servers

with the LUN
available
Click Migrate
when validation is
successful
Notes:
Click Validate to have the HMC examine the environment. Errors are displayed with
recommended resolutions.
The source and destination systems can be managed by the same or different HMCs.
Starting with HMC Version 7 Release 3.4, the Remote Live Partition Mobility feature is
available. This feature allows a user to migrate a client partition to a destination server that
is managed by a different HMC. The function relies on Secure Shell (SSH) to communicate
with the remote HMC for information such as the list of managed systems. SSH key
authentication to the remote HMC must be configured. To perform this, you should log into
the destination systems HMC and retrieve the authentication keys from the HMC currently
managing the mobile partition. As hscroot (or an account with hmcsuperadmin privileges),
use the mkauthkeys command.
For example, to configure ssh rsa key authentication to a remote system (in our case
10.31.204.31):

Instructor Guide
mkauthkeys --ip 10.31.204.31 -u hscroot -t rsa

Enter the password for user hscroot on the remote host
10.31.204.31:
You will be prompted to provide the password for the remote HMCs user.
If they are managed by different HMCs, you must provide the IP address (or resolvable
hostname) for the HMC of the destination system in the Remote HMC field of the
Validation Wizard. Clicking MSP Pairing displays a list of MSPs on the source and
destination systems. The administrator should select the MSPs to be used during the
migration process.
After the successful validation and MSP selection, you can click Migrate to initiate the
partition mobility process.
If errors or warnings occur, the Partition Validation Errors/Warnings window opens. Perform
the following steps:
1. Check the messages and identify the prerequisites for the migration:
- For error messages: You cannot perform the migration if errors exist. Eliminate any
errors.
- For warning messages: If only warnings occur (no errors), you can migrate the
partition after the validation.
2. Close the Partition Validation Errors/Warnings window. A validation window opens
again. If you had warning messages only (no error messages), you can click Migrate.

V5.4.0.3
Instructor Guide

Purpose
Details The Validate path can be used to validate and to launch the migration process.
When you reach this screen, simply click Validate and the HMC attempts to complete the
entry fields. If there are multiple choices, for example, two MSPs on the target, you can
click MSP Pairing to identify your choice.
To see the list associated with Destination system, the ssh authentication must be
configured as referenced in the student notes (the students will have an opportunity to
perform this in the Partition Mobility lab). You could then select from the drop down menu
for the field. If the menu is empty, select Refresh Destination System to update the fields
list.
Transition statement Identify what the HMC Migration Validation process checks.

Instructor Guide
Validation (3 of 8)
Validation process checks the following items:
RMC connections between VIO Servers
RMC connection to the partition to be migrated
LMB sizes on both systems
The partition to be migrated (partition readiness):
No physical adapters defined as required in the LPAR.
The LPAR uses only external LUNs.
VSCSI cannot be backed by LVs in VIOS.
The LPAR supports active migration (OS support).
The LPAR type is AIX Linux.
The LPAR is not a mover service partition.
The LPAR is not using barrier synchronization registers.
The LPAR is not using huge pages.
The LPAR state is active or running.
The LPAR is not in a partition workload group.
The LPAR MAC address is unique (across both servers).
The LPAR has a name that is not in use on the target system.
Not exceeding the supported number of active migrations
Notes:
This is not a complete list of checks.
For example, it first checks the source and destination systems, POWER Hypervisor,
Virtual I/O Servers, and mover service partitions for active partition migration capability and
compatibility.

V5.4.0.3
Instructor Guide

Purpose Identify what the HMC Migration Validation process checks.
Details If performing a Remote Live Partition Migration, the validation will require the
ssh key authentication is configured between the two HMCs.

Instructor Guide
Validation (4 of 8)
RMC connections
hdisk0
Server Server
A B
vscsi0 en0
Service Processor
Service processor
RM
VASI VASI
C
vhost0 hdisk1 en1 MSP MSP en1 hdisk1
vtscsi0 en2 en2 en2
SEA en2 SEA
fcs0 en0 RMC RMC en0
HMC fcs0
Ethernet network
LUN
Notes:
This is showing which RMC connections are checked and needed during the migration.
The validation process checks that the RMC connections to the mobile partition, the source
and destination Virtual I/O Servers, and the connection between the source and destination
mover service partitions are established.

V5.4.0.3
Instructor Guide

Purpose Identifying the RMC connections needed during migration.
Details
Transition statement The lspartition command shows if the RMC sessions are
established between the HMC and the partitions. It can also show information about the
environment capabilities.

Instructor Guide
Validation (5 of 8)
New RMC capabilities
Enabled by new code in the VIO Server and AIX partitions.
hscroot@r10s1hmc:~> lspartition dlpar
<#0> Partition:<4*9117-MMA*1035B90, , 10.31.203.66>

Active:<1>, OS:<AIX, 6.1, 6100-02-02-0849>, DCaps:<0x79f>,
CmdCaps:<0xb, 0xb>, PinnedMem:<279>
<#2> Partition:<7*9117-MMA*1035B90, , 10.31.203.69>
<#3> Partition:<6*9117-MMA*1035B90, , 10.31.203.68>
Suggestion: Always use lspartition -dlpar before trying a

migration.
Notes:
If the results for your partition are <Active 1>, the RMC connection is established. If the
results for your partition are <Active 0> or your partition does not appear in the command
results/output, you have an RMC problem with the client.
The Dcaps value of 0x79f is identifying a version 2.1 VIOS that is partition mobility capable.
The Dcaps value of 0x5f is identifying an AIX 6.1 client partition that is partition mobility
capable.

V5.4.0.3
Instructor Guide

Purpose Identifying the partition mobility associated Dcaps values of the lspartition
command.
Details The VIOS 1.5 partition would display a DCap value of 0x19f (instead of 0x79f).
Transition statement The validation process checks the mobile LPARs readiness.

Instructor Guide
Validation (6 of 8)
hdisk0
Server Server
A B
vscsi0 en0
Service processor
Pa
Service processor
rti
t io
nr
VASI ea VASI
din
hdisk1 en1 MSP es MSP en1
vhost0 s hdisk1
vtscsi0 en2 en2 en2
SEA en2 SEA
fcs0 en0 en0 fcs0
HMC
Ethernet network
LUN
Notes:
Partition readiness
Checks that none of the client virtual SCSI disks on the mobile partition are backed by
logical volumes and that no disks map to internal disks
Checks the mobile partition, its OS, and its applications for active migration capability.
Checks that the logical memory block size is the same on the source and destination
systems
Ensures that the type of the mobile partition is aixlinux and that it is not an alternate
error logging partition or a mover service partition
Ensures that the mobile partition is not configured with barrier synchronization registers
Ensures that the mobile partition is not configured with huge pages
Checks that the partition state is active and running
Checks that the mobile partition is not in a partition workload group

V5.4.0.3
Instructor Guide
Uempty Checks the uniqueness of the mobile partition's virtual MAC addresses
Checks that the mobile partition's name is not already in use on the destination server
Checks the number of current active migrations against the number of supported active
migrations
Checks that there are no physical adapters in the mobile partition and that there are no
required virtual serial slots higher than slot 2
Application migration awareness
A migration aware application is one that is designed to recognize and dynamically adapt to
changes in the underlying system hardware after being moved from one system to another.
Most applications will not require any changes to work correctly and efficiently with Live
Partition Mobility. Some applications might have dependencies on characteristics that
change between the source and destination servers and other applications might adjust
their behavior to facilitate the migration.
Applications that should probably be made migration aware include applications that use
processor and memory affinity characteristics to tune their behavior because affinity
characteristics might change as a result of migration. The externally visible behavior
remains the same, but performance variations, for better or worse, might be observed
because of different server characteristics.
Making applications migration-aware
An application registers its capability with AIX and might block migration during the check
phase.
Mobility awareness can be built in to an application using the standard AIX dynamic
reconfiguration notification infrastructure. This infrastructure offers two different
mechanisms for alerting applications to configuration changes. Using the SIGRECONFIG
signal and the dynamic reconfiguration APIs Registering scripts with the AIX dynamic
reconfiguration infrastructure. Using the SIGRECONFIG and dynamic reconfiguration APIs
requires additional code in your applications. The DLPAR scripts allow you to add
awareness to those applications for which you do not have the source code.

Instructor Guide
Instructor notes:
Purpose Describe the Partition Readiness check of the Migration Validation process.
Details
Transition statement The next step in the validation is checking the resources.

V5.4.0.3
Instructor Guide
Uempty
Validation (7 of 8)
hdisk0
Server Server
A B
vscsi0 en0
Service processor
Service processor
VASI VASI
vhost0 hdisk1 en1 MSP MSPesen1 hdisk1

s ourc
vtscsi0 en2 en2 r e en2
m en2
SEA yste SEA
S
fcs0 en0 en0 fcs0
HMC
Ethernet network
LUN
Notes:
After verifying system and partition configurations, the HMC then determines whether
sufficient resources are available on the destination server to host the inbound mobile
partition. The following steps are performed:
1. The HMC checks that the necessary resources (processors, memory, and virtual slots)
are available to create a shell partition on the destination system with the exact
configuration of the mobile partition.
2. The HMC generates a source-to-destination hosting virtual adapter migration map,
ensuring no loss of multipath I/O capability for virtual SCSI and virtual Ethernet. The
HMC fails the migration request if the device migration map is incomplete.

Instructor Guide
Instructor notes:
Purpose Describe the System Resource validation check.
Details
Transition statement The following is the last check performed before the migration.

V5.4.0.3
Instructor Guide
Uempty
Validation (8 of 8)
hdisk0
Server Server
A B
vscsi0 en0
Service processor
Service Processor
O
ap pera
pli ti
n
ca VASI
tiog sy VASI
n r ste
vhost0 hdisk1 en1 MSP ea m MSP en1 hdisk1
din an
es d
vtscsi0 en2 en2 s en2
SEA en2 SEA
fcs0 en0 en0 fcs0
HMC
Ethernet network
LUN
Notes:
The HMC instructs the operating system in the mobile partition to check its own capacity
and readiness for migration. AIX passes the check-migrate request to those applications
and kernel extensions that have registered to be notified of dynamic reconfiguration events.
The operating system either accepts or rejects the migration. In the latter case, the HMC
fails the migration request.
An API allows the kernel and applications to be notified of the migration operation. A
SIGRECONFIG Signal is sent to DR-Aware applications. These applications cooperate
and are notified of the different phases (Check-, Prepare-, and Post- Migrate). When
notified about the pending migration, the following actions could occur:
Application could be enabled to do some reconfiguration
- Reduce memory footprint
- Loosen heartbeats and other time-outs
- Throttle workloads and so on
- Quiesce or restart
Indicate that partition is not ready for migration (The HMC would cancel the migration.)

Instructor Guide
Instructor notes:
Purpose
Details This step gives the LPAR a chance to reject the migration request.
Transition statement The following page shows the command syntax and HMC GUI
for initiating the migration process.

V5.4.0.3
Instructor Guide
Uempty
Migrating the LPAR (1 of 16)

Start the operation from the HMC GUI.
Select Operations > Mobility > Migrate (or Validate)
HMC command
migrlpar -o m -m srcSystem -t destSystem \
-p myLPAR -d 5 v
(Suggest running the Validate procedure if this LPAR has not been
migrated before.)
Figure 8-18. Migrating the LPAR (1 of 16) AN313.1
Notes:
From the mobile LPARs context menu, click Operations > Mobility > Migrate. This
launches the Partition Migration wizard which guides you through the migration process.
Through this wizard, you can also learn about the partition migration process because it
provides a lot of process details in each step.

Instructor Guide
Suggest running the Validate procedure if this LPAR has not been migrated before.
migrlpar -o m | r | s | v
-m <managed system>
[-t <managed system>]
-p <partition name> | --id <partitionID>
[-n <profile name>]
[-f <input data file> | -i <input data>]
[-w <wait time>]
[--force]
[-d <detail level>]
[-v]
[--help]
-o The operation to perform
m - validate and migrate
r - recover
s - stop
v - validate
-m <managed system>: The source managed system's name.
-t <managed system>: The destination managed system's name.
-p <partition name>: The partition on which to perform the operation.
--id <partitionID>: The ID of the partition on which to perform the operation.
-n <profile name>: The name of the partition profile to be created on the destination.
-f <input data file>: The name of the file containing input data for this command. The
format is:
- attr_name1=value,attr_name2=value,...
- or
- attr_name1=value1,value2,...
-i <input data>: The input data for this command, typically the virtual adapter mapping
from the source to destination or the destination shared-processor pool. This follows the
same format as the input data file of the -f option.
-w <wait time>: The time, in minutes, to wait for any operating system command to
complete.
--force: Force the recovery. This option should be used with caution.
-d <detail level>: The level of detail requested from operating system commands;
values range from 0 (none) to 5 (highest).
-v: Verbose mode.
--help: Prints a help message.

V5.4.0.3
Instructor Guide

Purpose Introduce the Partition Migration wizard and show the migrlpar command.
Details

Instructor Guide
Notes:
Next, the wizard gives you a chance to avoid over-writing existing partition profiles.
As part of the migration process, the HMC creates a new migration profile containing the
partitions current state. Unless you specify a profile name when you start the migration,
this profile replaces the existing profile that was used to activate the LPAR. Also, if you
specify an existing profile name, the HMC replaces that profile with the new migration
profile. If you do not want to replace any of the existing profiles that are associated with the
mobile LPAR, you must specify a new, unique profile name.
The next step gives you a chance to identify the target system. You can only identify a
system currently managed by the HMC.

V5.4.0.3
Instructor Guide

Purpose Continue to show steps of the Migration wizard.
Details

Instructor Guide
mkauthkeys
Notes:
The Remote Live Partition Mobility refers to the migration of a logical partition between two
IBM Power Systems servers each managed by a separate Hardware Management
Console. This feature is available starting with HMC Version 7 Release 3.4. Remote
migrations require coordinated movement of a partitions state and resources over a secure
network channel to a remote HMC. The following list indicates the high-level prerequisites
for remote migration. If any of the following elements are missing, a migration cannot occur:
A ready source system that is migration-capable
A ready destination system that is migration-capable
Compatibility between the source and destination systems
Destination system managed by a remote HMC
Network communication between local and remote HMC
A partition that is ready to be moved from the source system to the destination system.
For an inactive migration, the partition must be turn off, but must be capable of booting

V5.4.0.3
Instructor Guide
Uempty on the destination system. For active migrations, an MSP on the source and destination
systems.
One or more SANs that provide connectivity to all of the mobile partitions disks to the
Virtual I/O Server partitions on both the source and destination servers. The mobile
partition accesses all migratable disks through devices (virtual Fibre Channel, virtual
SCSI, or both). The LUNs used for virtual SCSI must be zoned and masked to the
Virtual I/O Servers on both systems.
The mobile partitions virtual disks must be mapped to LUNs; they cannot be part of a
storage pool or logical volume on the Virtual I/O Server. One or more physical IP
networks (LAN) that provide the necessary network connectivity for the mobile partition
through the Virtual I/O Server partitions on both the source and destination servers. The
mobile partition accesses all migratable network interfaces through virtual Ethernet
devices.
An RMC connection to manage inter-system communication
SSH key authentication to the remote HMC
Remote migration operations require that each HMC has RMC connections to its individual
systems Virtual I/O Servers and a connection to its systems service processors. The HMC
does not have to be connected to the remote systems RMC connections to its Virtual I/O
Servers nor does it have to connect to the remote systems service processor.
The local HMC, which manages the source server in a remote migration, serves as the
controlling HMC. The remote HMC, which manages the destination server, receives
requests from the local HMC and sends responses over a secure network channel.
Use the mkauthkeys command in the CLI to retrieve authentication keys from the current
HMC managing the mobile partition. You must be logged in as a user with hmcsuperadmin
privileges, such as the hscroot user, and authenticate to the remote HMC by using a
remote user ID with hmcsuperadmin privileges.

Instructor Guide
Instructor notes:
Purpose
Details
Transition statement Next you might see errors or warnings.

V5.4.0.3
Instructor Guide
Uempty
Notes:
Errors must be resolved since they will prevent the mobility process from continuing.
Warning should be read, but will not prevent the process from succeeding.

Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Figure 8-22. Migrating the LPAR (5 of 16)) AN313.1
Notes:
The wizard identifies the MSPs on the target and lists the VLANs that are bridged. You get
an error if the target system does not have a VIOS configured to bridge the VLANs that the
mobile LPAR is configured to use on the source system. To resolve, you have to manually
configure an SEA to bridge the required VLANs at the target system.

Instructor Guide
Instructor notes:
Purpose Show how the wizard identifies the MSP and required VLAN bridging.
Details

V5.4.0.3
Instructor Guide
Uempty
Notes:
After you identify the target MSP, the wizard identifies the virtual SCSI server adapters that
appear to have access to the clients disks. This allows the administrator to identify with
either a dual VIOS configuration or single VIOS configuration at the destination system.
This determines where the vSCSI server adapters will be created to support the client
adapters.

Instructor Guide
Instructor notes:
Purpose Show how the wizard identifies the virtual SCSI server adapter at the target
systems MSP.
Details Identifies the virtual adapter that has access to the clients disks.

V5.4.0.3
Instructor Guide
Uempty
Notes:
The last step is to show the summary. If you do not like what is shown in the summary, you
can click Back to go back and change any of the values.
When you are satisfied with the values, click Finish to initiate the migration process.

Instructor Guide
Instructor notes:
Purpose Completing the Migration wizard.
Details
Transition statement The following slides graphically depicts what occurs when you
start the migration process.

V5.4.0.3
Instructor Guide
Uempty
hdisk0 hdisk0
Server Server
A B
vscsi0 en0 en0 vscsi0
Service Processor
Service processor
AR
VASI LPVASI
w
ne
vhost0 hdisk1 en1 MSP te MSP en1 hdisk1
r ea
vtscsi0 en2 en2 C en2
SEA en2 SEA
fcs0 en0 en0 fcs0
HMC
Ethernet network
LUN
Notes:
After validation checks pass, a new partition is created on the destination system to
accommodate the migrated environment.

Instructor Guide
Instructor notes:
Purpose Describe the start of the migration.
Details

V5.4.0.3
Instructor Guide
Uempty
hdisk0 hdisk0
Server Server
A B
Service processor
Service processor
VASI VASI
vhost0 hdisk1 en1 MSP MSP en1 hdisk1 vhost0

vtscsi0 en2 en2 en2 vtscsi0
SEA en2 SEA
l a d apter
Virtua
fcs0 en0 p in g en0 fcs0
HMC map
Ethernet network
LUN
Notes:
The HMC verifies that the target MSP has access to the mobile LPARs external storage. It
then creates the required virtual SCSI adapters in the MSP on the destination system and
completes the LUN to virtual adapter mapping.

Instructor Guide
Instructor notes:
Purpose Mention the HMC's role in configuring the virtual SCSI devices at the
destination system.
Details
Transition statement Next, the MSPs transfer the mobile LPARs state from the source
to the target system.

V5.4.0.3
Instructor Guide
Uempty
hdisk0 hdisk0
Server Server
A B
fcs0 en0 en0 fcs0
Service processor
Service processor
VASI VASI
vfchost0 hdisk1 en1 MSP MSP en1 hdisk1 vfchost0

en2 en2 en2
SEA en2 SEA
fcs0 en0 en0 fcs0
HMC
Ethernet network
NPIV
LUN
Figure 8-27. Migrating LPAR: NPIV (10 of 16) AN313.1
Notes:
The addition of NPIV and virtual Fibre Channel adapters reduces the number of
components and steps necessary to configure shared storage in a Virtual I/O Server
configuration:
With virtual Fibre Channel support, you do not map individual disks in the Virtual I/O
Server to the mobile partition. LUNs from the storage subsystem are zoned in a switch
with the mobile partitions virtual Fibre Channel adapter using its worldwide port names
(WWPNs), which greatly simplifies Virtual I/O Server storage management.
LUNs assigned to the virtual Fibre Channel adapter appear in the mobile partition as
standard disks from the storage subsystem. LUNs do not appear on the Virtual I/O
Server unless the physical adapters WWPN is zoned.
Standard multi-pathing software for the storage subsystem is installed on the mobile
partition. Multi-pathing software is not installed into the Virtual I/O Server partition to
manage virtual Fibre Channel disks. The absence of the software provides system
administrators with familiar configuration commands and problem determination
processes in the client partition.

Instructor Guide
Partitions can take advantage of standard multipath features, such as load balancing
across multiple virtual Fibre Channel adapters presented from dual Virtual I/O Servers.
Required components
In addition to the basic requirements, the following components must be configured in the
environment:
An NPIV-capable SAN switch
An NPIV-capable physical Fibre Channel adapter on the source and destination Virtual
I/O Servers
HMC Version 7 Release 3.4, or later
Virtual I/O Server Version 2.1 with Fix Pack 20.1, or later
AIX 6.1 TL2 SP2, or later
Each virtual Fibre Channel adapter on the Virtual I/O Server mapped to an
NPIV-capable physical Fibre Channel adapter
Each virtual Fibre Channel adapter on the mobile partition mapped to a virtual Fibre
Channel adapter in the Virtual I/O Server
At least one LUN mapped to the mobile partitions virtual Fibre Channel adapter

V5.4.0.3
Instructor Guide

Purpose
Details

Instructor Guide
hdisk0 hdisk0
Server Server
A B
Service Processor
Service Processor
VASI VASI
vhost0 hdisk1 en1 MSP Tr a te MSP en1 hdisk1 vhost0

an St
sfe R
en2 r LP PA en2
vtscsi0 en2 AR fe rL vtscsi0
SEA Sta a ns en2 SEA
te Tr
fcs0 en0 en0 fcs0

HMC
Ethernet network
LUN
Notes:
The HMC initiates the transfer of the partition state from the source environment to the
destination environment. This includes all the logical partition profiles associated with the
mobile partition. The source mover service partition extracts the partition state information
from the source server (through Hypervisor) and sends it to the destination mover service
partition over the network. The destination mover service partition receives the partition
state information and installs it on the destination server.

V5.4.0.3
Instructor Guide

Purpose Introduce how the LPARs state is transferred.
Details
Transition statement What exactly is the LPARs state?

Instructor Guide

Partition state information:
State information from the source system hypervisor
Source system MSP collects this through the VASI
MSP transfers the state information to the target MSP
Target MSP transfers the data to hypervisor
State information is then available to the new partition shell
State information includes:
Processor configuration: Dedicated/shared, processor count, and
processor entitlement (minimum/maximum/desired)
Memory configuration (minimum/maximum/desired)
Virtual adapter configuration (vSCSI and VLAN)
Note: The state information represents the LPARs current
characteristics (it is not based on any of the LPAR profiles).
Notes:

V5.4.0.3
Instructor Guide

Purpose Describe the partition state and how it is managed.
Details
Transition statement More information on the partition state information, which is
copied to the destination system.

Instructor Guide

State information also includes:
The partitions memory
The hardware page table (HPT)
The processor state
The virtual adapter state
The non-volatile RAM (NVRAM)
The time of day (ToD)
The partition configuration
The state of each resource
Notes:
Additional state information that is transferred from the source to destination system.

V5.4.0.3
Instructor Guide

Purpose Describe the partition state information.
Details

Instructor Guide
hdisk0 hdisk0
Server Server
A B
Service processor
Service processor
VASI VASI
Logical memory copy
vhost0 hdisk1 en1 MSP MSP en1 hdisk1 vhost0
vtscsi0 en2 en2 en2 vtscsi0
SEA en2 SEA
fcs0 en0 en0 fcs0
HMC
Memory copy over network
LUN
Notes:
During the transfer of the state information, the partition and its applications are active.
Sometime after more than half of the state has been transferred, the HMC initiates the
suspension of the mobile partition on the source server. The source mover service partition
continues to transfer the partition state information to the destination mover service
partition. The hypervisor resumes the mobile partition on destination server. The partition is
inactive for around two seconds between suspension (on source system) and reactivation
(on destination system).

V5.4.0.3
Instructor Guide

Purpose Additional state transfer phase details.
Details Mention how the migration process might take a while if there is a lot of state to
be transferred. However, there is only about a 2-second pause when the mobile LPAR is
suspended on the source system and started on the destination system.
Transition statement Finally, the HMC initiates some cleanup tasks.

Instructor Guide
hdisk0
Server Server
A B
en0 vscsi0
Service processor
Service processor
VASI
VASI
vhost0 hdisk1 en1 en1 hdisk1 vhost0
vtscsi0 Virtuen2 en2 en2 vtscsi0
al SC
SEA en2 SEA
SI re
mov
fcs0 en0 al en0 fcs0
HMC
Ethernet network
LUN
Notes:
The source Virtual I/O Servers unlock, unconfigure, or undefine virtual resources on the
source servers. After the transfer of the state is completed, the HMC removes the hosting
virtual adapter slots from the source Virtual I/O Server logical partition profiles as required.
However, you will find the slots that were defined as connect any (at the source Virtual I/O
Server) will not get removed.

V5.4.0.3
Instructor Guide

Purpose Describe what happens after the state transfers.
Details
Transition statement Discuss the completion of the migration process.

Instructor Guide
hdisk0
Server Server
A B
en0 vscsi0
Service processor
Service processor
VASI
VASI
hdisk1 en1 LPAR en1 hdisk1 vhost0
re
en2 en2 mova en2 vtscsi0
l en2
SEA SEA
fcs0 en0 en0 fcs0
HMC
Ethernet network
Storage Area Network
LUN
Notes:
Finally, the source LPAR is removed.

V5.4.0.3
Instructor Guide

Purpose Finish the migration discussion.
Details
Transition statement There are a few things that could be viewed to monitor the
migration process.

Instructor Guide
Monitoring the migration process (1 of 3)
Figure 8-34. Monitoring the migration process (1 of 3) AN313.1
Notes:
When using the HMC GUI, this output box is seen immediately after starting the migration
progress.

V5.4.0.3
Instructor Guide

Purpose Show what the migration process displays for monitoring the progress.
Details

Instructor Guide
On source machine
Notes:
On the source system, in the Server > Contents of pane, you can observe the status of the
LPAR. Migrating-Running implies the migration process is running.
In the managed system Properties, the Migration tab shows the number of migrations in
progress.

V5.4.0.3
Instructor Guide

Purpose Additional progress monitoring on the source system.
Details

Instructor Guide
On target machine
In progress
Finished
Notes:
On the target system, in the Server > Contents of pane, you can observe the status of the
LPAR. Migrating-Running implies the migration process is running. A status of Running
implies the migration has completed.

V5.4.0.3
Instructor Guide

Purpose Additional progress monitoring on the target system.
Details
Transition statement The migration process leaves entries in the AIX error logs.

Instructor Guide
Error log entries: Mobile client

Errpt
IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION

A5E6DB96 0211111508 I S pmig Client Partition Migration Completed
08917DC6 0211111508 I S pmig Client Partition Migration Started
errpt -a
---------------------------------------------------------------------------
LABEL: CLIENT_PMIG_DONE
IDENTIFIER: A5E6DB96
Date/Time: Mon Feb 11 11:15:16 CET 2008

Sequence Number: 128
Machine Id: 0000556AD400
Node Id: js22mob-1
Class: S
Type: INFO
Resource Name: pmig
Description
Client Partition Migration Completed
---------------------------------------------------------------------------
LABEL: CLIENT_PMIG_STARTED
IDENTIFIER: 08917DC6

Node Id: js22mob-1
Class: S
Type: INFO
Resource Name: pmig
Description
Client Partition Migration Started
Figure 8-37. Error log entries: Mobile client AN313.1
Notes:
The migrated LPAR has entries in its AIX error log. There should be two entries; one for the
start and another for the completion.

V5.4.0.3
Instructor Guide

Purpose Show the migrated LPARs error log entries.
Details
Transition statement There are also entries in the source and destination MSPs error
logs.

Instructor Guide
Error log entries: Source VIOS (1 of 2)

errlog
3EB09F5A 0211111508 I S Migration Migration completed successfully
6CB10B8D 0211111508 I S unspecified Client partition suspend issued
errlog -a
---------------------------------------------------------------------------
LABEL: MVR_MIG_COMPLETED
IDENTIFIER: 3EB09F5A

Node Id: js226
Class: S
Type: INFO
Resource Name: Migration
Description
Migration completed successfully
Probable Causes
UNDETERMINED
Failure Causes
UNDETERMINED
Recommended Actions
NONE
Detail Data
STREAM ID
96C4 D88B 13BE F250
SERVICES (Source or Target)
Source MSP
Figure 8-38. Error log entries: Source VIOS (1 of 2) AN313.1
Notes:
At the source MSP, there is an entry for the completion and also an entry for when the
migrated LPAR is suspended.

V5.4.0.3
Instructor Guide

Purpose Show the source MSPs error log entry.
Details
Transition statement The following slide shows the destination VIOS error log entry.

Instructor Guide
Error log entries: Source VIOS (2 of 2)

---------------------------------------------------------------------------
LABEL: MVR_FORCE_SUSPEND
IDENTIFIER: 6CB10B8D

Node Id: js226
Class: S
Type: INFO
Resource Name: unspecified
Description
Client partition suspend issued
Probable Causes
UNDETERMINED
Failure Causes
UNDETERMINED
Recommended Actions
NONE
Detail Data
STREAM ID
96C4 D88B 13BE F250
SOFT SUSPEND
1
TRIGGER PERCENTAGE
99
SUSPEND COUNT
300
MAX PERCENTAGE
99
REQUESTED SUSPEND TRIGGER
100
Figure 8-39. Error log entries: Source VIOS (2 of 2) AN313.1
Notes:
This shows when the migrated LPAR was suspended at the source system.

V5.4.0.3
Instructor Guide

Purpose
Details

Instructor Guide
Error log entries: Target VIOS

3EB09F5A 0211111508 I S Migration Migration completed successfully
---------------------------------------------------------------------------
LABEL: MVR_MIG_COMPLETED
IDENTIFIER: 3EB09F5A

Node Id: js222
Class: S
Type: INFO
Resource Name: Migration
Description
Migration completed successfully
Probable Causes
UNDETERMINED
Failure Causes
UNDETERMINED
Recommended Actions
NONE
Detail Data
STREAM ID
96C4 D88B 13BE F250
SERVICES (Source or Target)
Target MSP
Figure 8-40. Error log entries: Target VIOS AN313.1
Notes:
The destination MSPs error log shows when the migration process completed.

V5.4.0.3
Instructor Guide

Purpose Target MSPs error log entry.
Details

Instructor Guide
Troubleshooting (1 of 5)
Error log
Migration GUI or command line messages
Also stored in /var/hsc/log/cimserver.log
alog -t cfg -o > /tmp/cfglog
Provides details not included in HMC message
Run at source and destination VIOS
Contains error details
Methods and scripts called
Abbreviated descriptions
RMC return code
The OS command return code
PID
Locate the lines containing ERROR
Figure 8-41. Troubleshooting (1 of 5) AN313.1
Notes:
The migration process is fairly verbose as it provides error log entries, GUI popup
messages, and entries in a config log file. The cfg type alog (or config log) is a log that can
be invaluable when troubleshooting migration problems. Sometimes this log provides
details not found in the HMC error messages or in the AIX error log.

V5.4.0.3
Instructor Guide

Purpose Identify sources for migration error details.
Details
Transition statement Let see an example.

Instructor Guide
Example: Partition mobility validation fails with the following errors:
HSCLA29A The RMC command issued to partition VA_NET1 failed. The partition command
is: migmgr -f find_devices -t vscsi -d 1 The RMC return code is: 0 The OS command
return code is: 85 The OS standard out is: Running method
'/usr/lib/methods/mig_vscsi' 85 The OS standard err is:
Error[0]: HSCLA24E The migrating partition's virtual SCSI adapter 10 cannot be
hosted by the existing virtual I/O server (VIOS) partitions on the destination
managed system. To migrate the partition, set up the necessary VIOS hosts on the
destination managed system, then try the operation again.
First error code:

HSCLA29A
LPAR name:
VA_NET1
Partition command:
migmgr -f find_devices -t vscsi -d 1
RMC return code:
0
The OS command return code:
85
Second error Code:
HSCLA24E
Notes:
When a validation error occurs at the HMC, you are usually provided error details similar to
the information in the figure above. This will include an first associated error code, the
LPAR name, the command used by the validation process, the RMC return code, the OS
return code and sometimes additional related error codes. After reading the error message,
you might be wondering why the existing VIOS cannot host the migrating LPARs VSCSI
adapter. This information is not provided by the HMC error message, but might be provided
in the cfg log entries.

V5.4.0.3
Instructor Guide

Purpose Identify key information provided by the HMC error messages.
Details
Transition statement The following provides details regarding the 'cfg' alog.

Instructor Guide
Look for the return code in alog t cfg o output:
# grep "rc= 85" config.log
C0 663752 mig_vscsi.c 352 leaving mig_vscsi fn= find_devices, rc= 85
Process IDs
Return Codes
Notes:
As we examine the config log, we can search for information provided within the HMC error
message. One item to search for is the OS return code. In our example, this value is 85.
The C at the beginning of the entries indicate the beginning of a command execution. The
0 that follows indicate this is associated with an error. You should now search the log for
entries associated with one of the PIDs listed. In the following figure we will show an
example using 663758.
The config log is primarily for use by config commands, device methods, and dynamic
reconfig (DR) commands. There are three general forms of error log entries:
1. The start log. A command or device method that uses the error log will first log a "start"
entry. This will identify the command or method, command line parameters, its PID, and
parent's PID. The general format is as follows:
TS PID PPID [TIMESTAMP] [FILE] [LINE] CMD

V5.4.0.3
Instructor Guide
Uempty Where:
- T identifies a type of executable. The letter C indicates the start of a command and
the letter M indicates the start of a method.
- S is the letter S identifying the start of a command or method.
- PID is the process ID of the command or method. It is useful for finding other log
entries in the config log that are added by this command or method.
- PPID is the parent process ID. This is useful for correlating the start log entry of a
method with the start log entry of the command that invoked the method.
- TIMESTAMP is an optional timestamp of the format HH:MM:SS.
- FILE is an optional source file name.
- LINE is an optional line number within the source file of the line that generated the
log entry.
- CMD is the command and arguments specified when the command or method was
invoked.
2. The second form is for logging informational, error, and debug log entries. The general
format is:
T# PID [TIMESTAMP] [FILE] [LINE] STRING
Where:
- T indicates a type of executable. The letter C indicates the start of a command and
the letter M indicates the start of a method. The letter B is a special case that is used
to log special boot time information.
- # is a verbosity number in the range of 0 to 9. 0 denotes an error condition. 1
denotes an informational message. Other numbers denote different levels of debug
information. The typical value will be 4.
- PID is the process ID. It can be matched to an earlier start log to see what command
generated this log entry.
- TIMESTAMP is an optional timestamp of the format HH:MM:SS.
- FILE is an optional source file name. Will always be included for an error entry (# is
0).
- LINE is an optional line number within the source file of the line that generated the
log entry. Will always be included for an error entry (# is 0).
- STRING is the data actually logged.
3. The third form is for special informational messages logged during boot. These always
have the format:
B# STRING
Where:
- B is always the letter B.
- # is usually the number 1 but could be 0 through 9 as described above.
- STRING is the actual data. If you see a timestamp, it is because it was included in
the data.

Instructor Guide
Instructor notes:
Purpose
Details
Additional information If an error occurs, you might also see a log entry that starts with
C0 or M0. Except for DR operations, where additional levels of debug information will be
logged, these are the only entries we log by default. This might change in the near future.
There is an environment variable that can be created to change the level of detail that will
be logged. Examples:
export CFGLOG=timestamp
will result in timestamps being included in new log entries.
export CFGLOG=verbosity:4
will result in all entries of verbosity up to and including 4 to be logged.
export CFGLOG=detail
will result in all entries including source file name and line numbers. I believe it will also
change the verbosity to 4.
Any combination of the above can be used as long as the values are separated by a
comma. For example:
export CFGLOG=verbosity:4,timestamp

V5.4.0.3
Instructor Guide
Uempty
Take one of the PIDs shown and grep it:
# grep 663758 config.log
CS 663758 352486 /usr/sbin/migmgr -f find_devices -t vscsi -d 1
C4 663758 Running method '/usr/lib/methods/mig_vscsi'
CS 663758 352486 /usr/sbin/migmgr -f find_devices -t vscsi -d 1
C0 663758 vsmig_set.c 55 vsmig_dest_adapter is about to call
LIBXML_TEST_VERSION
C0 663758 vsmig_set.c 59 vsmig_dest_adapter called LIBXML_TEST_VERSION
C0 663758 vsmig_util.c 1482 original pipe_ctrl from RMC =0x0
C0 663758 vscsi_vtd.c 416 No attribute for node
[/virtDev/blockStorage/AIX/devID], cnt=0
C0 663758 vsmig_util.c 790 ERROR: virtual device name already exists,
root103
Notes:
In our example we searched for the PID 663758. An important detail is revealed in the
second from last entry. This migration validation failed because the virtual device name
(root103) used at the source system VIOS already exists on the destination systems VIOS.

Instructor Guide
Instructor notes:
Purpose Show config log details.
Details
Transition statement The following is a simple way to took for important error details in
the config log.

V5.4.0.3
Instructor Guide
Uempty
Another way to view the errors:
# grep ERROR config.log
C0 663752 vsmig_util.c 790 ERROR: virtual device name
already exists, root103
already exists, hb11vg103
already exists, hb11vg103
already exists, root103
Notes:
Most of the important error details will contain ERROR within the text entry.

Instructor Guide
Instructor notes:
Purpose
Details
Transition statement Let take a look at Dual VIOS considerations.

V5.4.0.3
Instructor Guide
Uempty
Dual VIO Server: Network topics

Destination system must bridge mobile LPARs networks
Shared Ethernet failover

Might be configured or not
Assign virtual I/O IP to alternate virtual Ethernet adapter
Mover service partition (MSP)

Can manage up to four concurrent active migrations
Distribute the load among multiple mover service partitions.
Consider creating Virtual I/O Server dedicated to mover services
Figure 8-46. Dual VIO Server: Network topics AN313.1
Notes:
Live Partition Mobility does not make any changes to the network setup on the source and
destination systems. It only checks that all virtual networks used by the mobile partition
have a corresponding shared Ethernet adapter on the destination system. Shared Ethernet
failover might or might not be configured on either the source or the destination systems.
If you are planning to use shared Ethernet adapter failover, remember not to assign the
Virtual I/O Servers IP address on the shared Ethernet adapter. Create another virtual
Ethernet adapter and assign the IP address on it. Partition migration requires network
connectivity through the RMC protocol to the Virtual I/O Server. The backup shared
Ethernet adapter is always offline as well as its IP address, if any.
The Virtual I/O Servers selected as mover service partitions are loaded by memory moves
and network data transfer. So, if multiple mover service partitions are available on either
the source or destination systems, we suggest distributing the load among them. This can
be done explicitly by selecting the mover service partitions either using the GUI, or the
command-line interface. Each mover service partition can manage up to four concurrent

Instructor Guide
active migrations and explicitly using multiple Virtual I/O Servers avoids queuing of
requests.
Network management can cause high CPU usage and usual performance considerations
apply: use uncapped Virtual I/O Servers and add virtual processors if the load increases.
Alternatively, create dedicated Virtual I/O Servers on the source and destination systems
that provide the mover service function separating the service network traffic from the
migration network traffic. You can combine or separate virtualization functions and mover
service functions to suit your needs.

V5.4.0.3
Instructor Guide

Purpose
Details Always perform a validation before performing a migration. The validation
checks the configuration of the involved Virtual I/O Servers and shows you the
configuration that will be applied.

Instructor Guide
Dual VIO Server: Virtual SCSI

with dual HMC consideration
Dual Virtual I/O Server and client mirroring:
Recommend using two independent storage subsystems.
Provides additional client LPAR resilience
Configure destination system with two VIOS and similar VSCSI
configuration.
Single to dual Virtual I/O Server migration is possible.
Dual Virtual I/O Server and multipath I/O:

Destination system must be configured with two Virtual I/O Servers
with the same multipath setup.
Dual HMC locking

Initiating HMC takes a lock on both managed systems.
Lock is released when migration is completed.
Figure 8-47. Dual VIO Server: Virtual SCSI with dual HMC consideration AN313.1
Notes:
Dual Virtual I/O Server and client mirroring
The migration process automatically detects which Virtual I/O Server has access to which
storage and configures the virtual devices to keep the same disk access topology. When
migration is complete, the logical partition has the same disk configuration it had on
previous system, still using two Virtual I/O Servers.
If the destination system has only one Virtual I/O Server, the migration is still possible and
the same virtual SCSI setup is preserved at the client side. The destination Virtual I/O
Server must have access to all disk spaces and the process creates two virtual SCSI
adapters on the same Virtual I/O Server.
Dual Virtual I/O Server and multipath I/O
When multiple Virtual I/O Servers are involved, multiple virtual SCSI combinations are
possible, because access to the same SAN disk can be provided on the destination system
by multiple Virtual I/O Servers. Live Partition Mobility automatically manages the virtual
SCSI configuration if an administrator does not provide specific mappings.

V5.4.0.3
Instructor Guide
Uempty With multipath I/O, the logical partition accesses the same disk data using two different
paths, each provided by a separate Virtual I/O Server. One path is active and the other is
standby. The migration is possible only if the destination system is configured with two
Virtual I/O Servers that can provide the same multipath setup.
The partition that is moving must keep the same number of virtual SCSI adapters after
migration and each virtual disk must remain connected to the same adapter or adapter set.
An adapters slot number might change after migration, but the same device name is kept
by the operating system for both adapters and disks.
To migrate the partition with only one Virtual I/O Server configured on the destination
system, you must first remove one path from the source configuration before starting the
migration. The removal can be performed without interfering with the running applications.
The configuration becomes a simple single Virtual I/O Server migration.
A logical partition that is using only one Virtual I/O Server for virtual disks can be migrated
to a system where multiple Virtual I/O Servers are available. Because the migration never
changes a partitions configuration, only one Virtual I/O Server is used on the destination
system.
Multiple concurrent migrations
While a migration is in progress, you can start another one. When the number of migrations
to be executed grows, the setup time using the GUI might become long and you should
consider using the command-line interface. The migrlpar command can be used in scripts
to start multiple migrations in parallel.
A migration might fail validation checks and therefore not be started if the moving partition
adapter and disk configuration cannot be preserved on the destination system. We suggest
that you always perform a validation before performing a migration. The validation checks
the configuration of the involved Virtual I/O Servers and shows you the configuration that
will be applied.
Dual HMC considerations
To avoid concurrent operations on the same system, a locking mechanism is activated
when migrating a partition. The HMC that initiates a migration takes a lock on both
managed systems and the lock is released when migration is completed. The other HMC
can show the status of migration but cannot issue any additional configuration changes on
the two systems. The lock can be manually broken, but this option should be considered
carefully.
Dual Virtual I/O Server and virtual Fibre Channel multi-pathing
With multipath I/O, the logical partition accesses the same storage data using two different
paths, each provided by a separate Virtual I/O Server. The migration is possible only if the
destination system is configured with two Virtual I/O Servers that can provide the same
multipath setup. They both must have access to the shared disk data.
When migration is complete on the destination system, the two Virtual I/O Servers are
configured to provide the two paths to the data. If the destination system is configured with

Instructor Guide
only one Virtual I/O Server, the migration cannot be performed. The migration process
would create two paths using the same Virtual I/O Server, but this setup of having one
virtual Fibre Channel host device mapping the same LUNs on different virtual Fibre
Channel adapters is not recommended.
To migrate the partition, you must first remove one path from the source configuration
before starting the migration. The removal can be performed without interfering with the
running applications. The configuration becomes a simple single Virtual I/O Server
migration.

V5.4.0.3
Instructor Guide

Purpose
Details
Transition statement For those using IVM, the migration process can be started and
monitored from the IVM GUI.

Instructor Guide
Partition mobility with IVM
Figure 8-48. Partition mobility with IVM AN313.1
Notes:
Under Virtual I/O Server version 1.5 (or above) and on a POWER6 system, the IVM can be
used to perform the migration process. From the main Partition Management panel, select
the LPAR and Migrate from the task menu. The Status menu option allows you to view
information about the current migration processes.
To see the Mobility options, your system must have the PowerVM Enterprise Edition
feature.

V5.4.0.3
Instructor Guide

Purpose
Details

Instructor Guide
Partition mobility with IVM: Monitor
Figure 8-49. Partition mobility with IVM: Monitor AN313.1
Notes:
This visual shows what could be used to monitor the migration process. The bottom
window is what is seen after clicking the Status option in the Mobility task list. To refresh
the Percent Complete column, you must click Refresh.

V5.4.0.3
Instructor Guide

Purpose Describe monitoring the IVM-based migration process.
Details
Transition statement The following are checkpoint questions.

Instructor Guide
Checkpoint
1. True or False: The VASI interface controls every phase of
the partition mobility process.
2. True or False: With partition mobility, you can move a

partition between any two IBM Power Systems.
3. What is included in the state that is transferred during the

migration?
4. What log usually provides details not found in the HMC

migration error message?
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
1. True or False: The VASI interface controls every phase of the partition
mobility process.
2. True or False: With partition mobility, you can move a partition

between any two IBM Power Systems.
3. What is included in the state that is transferred during the migration?

The answer is the partitions memory, hardware page table (HPT),
processor state, virtual adapter state, non-volatile RAM (NVRAM),
time of day (ToD), partition configuration, and state of each resource.
4. What log usually provides details not found in the HMC migration error
message?
The answer is the config log; alog t cfg.

Instructor Guide
Exercise
Unit
exerc
ise
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details

Instructor Guide
Unit summary
Identify the components required to successfully perform a

partition migration
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details

Instructor Guide

V5.4.0.3
Instructor Guide
Uempty Unit 9. PowerVM advanced systems maintenance
Estimated time
01:30

This unit describes the systems maintenance tasks that should be
performed in a POWER virtualized environment. These tasks include
managing the PowerVM system firmware, backup, and restore of the
Virtual I/O (VIO) server, and updating the VIO Server software.

Manage PowerVM system firmware
Update the Virtual I/O Server software
- Single and dual VIO configurations
- updateios command
Back up the Virtual I/O Server
- backupios command (file, tape, CD, and DVD)
Restore the Virtual I/O Server
- Backup tape, DVD, and tar file
Add disk space to a vSCSI client partition
Backup client partitions operating system to Virtual DVD
Change partition availability priority
Activate the power saver mode
Manage hot-pluggable devices in the Virtual I/O Server
- diagmenu command: VIOS diagnostic menu
Add hot-swap SCSI disk

Machine exercises
References
SG24-7940 PowerVM Virtualization on IBM System p (Redbook)
pSeries and AIX Information Center
Copyright IBM Corp. 2010, 2011 Unit 9. PowerVM advanced systems maintenance 9-1
Instructor Guide
Unit objectives
Single and dual VIO configurations
updateios command
backupios command (file, tape, CD, and DVD)
Backup tape, DVD, and tar file
Back up client partitions operating system to virtual DVD
diagmenu command: VIOS diagnostic menu
Notes:
The objectives list of what you should be able to do at the end of this unit.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
PowerVM firmware enablement

Must order feature
With initial order
After shipment
www.ibm.com/systems/p/advantages/cod
Activated with code (similar to COD)
Figure 9-2. PowerVM firmware enablement AN313.1
Notes:
PowerVM Editions features are activated with a code, similar to the way that Capacity on
Demand is activated on IBM Systems and IBM SServer hardware. If your system is
purchased without the feature, you can later purchase it by requesting the appropriate
feature code. The following are some of the PowerVM feature codes:

V5.4.0.3
Instructor Guide
Uempty
Table 1:
Machine Power Express Standard Enterprise
Type / systems edition edition edition
Model feature feature feature
code code code
9119 FHA 595 NA #7943 #8002
9125 F2A 575 NA #7949 #8024
9117 MMA 570 NA #7942 #7995
9406 MMA 570 NA #7942 #7995
8204 E8A 550 #7983 #7982 #7986
9409 M50 550 NA #7982 #7986
8203 E4A 520 #7983 #8506 #8507
9408 M25 520 NA #8506 #8507
9407 M15 520 NA #8506 NA
7998 61X JS22 NA #5409 #5649
7998 60X JS12 NA #5409 #5606
To activate PowerVM standard edition or PowerVM Enterprise edition, you must enter an
activation code from the Hardware Management Console (HMC) or using the ASMI menu
interface.
To activate these PowerVM features, on the HMC, you must have an HMC Super
Administrator user role. Then to enter a code, perform the following tasks:
1. Retrieve the Virtualization Technology activation code from the following:
www.ibm.com/systems/p/advantages/cod.
2. Look in the Activation tools list on the right side and click Activation codes by
machine serial number.
3. Enter the system type and serial number of your server.
4. Record the activation code that is displayed on the Web site. The activation code type is
VET (Virtualization Technology Code).
5. The easiest way to enter your activation code on your managed system is by using the
HMC. To enter your code, complete the following steps:
6. In the system management navigation area of the HMC, expand Servers.
7. In the working area select your managed system.
8. Select Capacity on Demand (CoD) > Advanced POWER Virtualization > Enter
Activation Code in the Tasks Pad.
9. Type your activation code in the Code field. If you have copied the code, click the
middle mouse button.
10. Click OK.
You can now begin using the Virtualization technologies, which include Micro-Partitioning,
Virtual SCSI, shared Ethernet adapter, multiple shared processor pools, Integrated
Virtualization Manager, and so on and Partition Mobility if you ordered a PowerVM
Enterprise edition code.
Instructor Guide
Instructor notes:
Purpose Describe the procedure to enable PowerVM.
Details You request the PowerVM firmware activation code in the same manner the
CoD activation code is requested. The code can be entered by way of the HMC or the
ASMI of the managed system. The HMC is the recommended option.
Transition statement Now that we know how to enable PowerVM, lets discuss
managing the firmware of the PowerVM enabled system.

V5.4.0.3
Instructor Guide
Uempty
Firmware management
Fix or update strategy
Managed system firmware (Licensed Internal Code) updates
Server firmware
Power subsystem firmware
I/O adapter and device firmware
Types of firmware maintenance
Concurrent (must use an HMC)
Disruptive
Reboot of managed system necessary
Acquiring the fix or update
http://www14.software.ibm.com/webapp/set2/firmware/gjsn
HMC
Performing the update
HMC
AIX or LINUX operating system
Stand-alone diagnostic CD
Figure 9-3. Firmware management AN313.1
Notes:
Fixes provide changes to your software, Licensed Internal Code, or machine code that fix
known problems, add new function, and keep your server or HMC operating efficiently. For
example, you might install fixes for your operating system in the form of a program
temporary fix (PTF). Or, you might install a server firmware (Licensed Internal Code) fix
with code changes that are needed to support either new hardware or new functions of the
existing hardware.
A good fix strategy is an important part of maintaining and managing your server. If you
have a dynamic environment that changes frequently, then you should install fixes on a
regular basis. If you have a stable environment, you do not have to install fixes as
frequently. However, you should consider installing fixes whenever you make any major
software or hardware changes in your environment.
You can get fixes using a variety of methods, depending on your service environment. For
example, if you use an HMC to manage your server, you can use the HMC interface to
download, install, and manage your HMC and firmware (Licensed Internal Code) fixes. If
you do not use an HMC to manage your server, you can use the functions specific to your
Instructor Guide
operating system to get and apply your fixes.You can also use the managed system's ASMI
interface to apply the update. In addition, you can download or order many fixes through
Internet Web sites.
The server firmware is the part of the Licensed Internal Code that enables hardware, such
as the service processor. When you install a server firmware fix, it is installed on the
temporary side of the service processor.
The power subsystem firmware is the part of the Licensed Internal Code that enables the
power subsystem hardware in the model IBM System p 575 and IBM System p 595
servers. You must use an HMC to update or upgrade power subsystem firmware.
You must install HMC fixes before you install server firmware or power subsystem firmware
fixes so that the HMC can handle any fixes or new function that you apply to the server.
After you install HMC fixes, either install the power subsystem firmware and server
firmware fixes together, or install the power subsystem firmware first (if you have a model
IBM System p 575 or IBM System p 595 server), and then the server firmware second.

V5.4.0.3
Instructor Guide

Purpose Discuss firmware management.
Details Describe the types of firmware updates. The updates that we are focusing on
are for the server firmware and power subsystem firmware
Additional information By default, the server firmware is installed on the temporary
side only after the existing contents of the temporary side is permanently installed on the
permanent side. (This process is performed automatically when you install a server
firmware fix.) If you want to preserve the contents of the permanent side, select Install and
Activate from the Advanced Features on the HMC interface, and indicate that you do not
want to automatically accept the firmware level.
There is a helpful site that assists in gathering the components necessary to update
firmware. Visit the Microcode downloads site before performing any updates.
Transition statement Before updating the system firmware, you must make sure the
HMC is at a supported and compatible level.
Instructor Guide
HMC and system firmware compatibility (1 of 4)

Entry-level IBM systems with POWER6 processors
IBM Power 520
IBM Power 550
Figure 9-4. HMC and system firmware compatibility (1 of 4) AN313.1
Notes:
You must install any necessary HMC fixes before you install server firmware or power
subsystem firmware fixes so that the HMC can handle any fixes or new function that you
apply to the server. The following table lists currently supported firmware (FW) Release
Levels for Entry-level IBM Systems with POWER6 processors, as well as the compatibility
of HMC FW levels with system FW levels (as of September 2008). You can find this matrix
at http://www14.software.ibm.com/webapp/set2/sas/f/power5cm3/eltablep6.html

V5.4.0.3
Instructor Guide

Purpose Identify HMC and firmware compatibility levels.
Details The chart shows what is valid at the time of this writing (September 2008). It
shows what is valid for the IBM Low-end POWER6 processors-based models such as the
IBM System p 520 and the IBM System p 550.
Transition statement Lets take a look at the table that applies to the Mid-Range
systems.
Instructor Guide

Mid-range IBM systems with POWER6 processors
IBM Power 570
Notes:
The following table lists currently supported firmware (FW) Release Levels for Mid-Range
IBM Systems with POWER6 processors, as well as the compatibility of HMC FW levels
with system FW levels (as of September 2008). You can find this matrix at
http://www14.software.ibm.com/webapp/set2/sas/f/power5cm3/emtablep6.html

V5.4.0.3
Instructor Guide

Purpose Show HMC and firmware compatibility levels for IBM System p 570.
Details
Transition statement Lets take a look at the table which applies to the High-end
systems.
Instructor Guide

High-end IBM systems with POWER6 processors
IBM Power 595
IBM Power 575
Supported code combinations for IBM Power systems

https://www14.software.ibm.com/webapp/set2/sas/f/power5cm/supportedcode.html
Notes:
The following table lists currently supported firmware (FW) release levels for POWER6
systems, as well as the compatibility of HMC FW levels with system FW levels for High-end
IBM Systems with POWER6 processors (IBM System p 595 and IBM System p 575).
To access this table online, go to:
https://www14.software.ibm.com/webapp/set2/sas/f/power5cm/ehtablep6.html
The different POWER code matrix that list supported code combinations for IBM Power
Systems can be accessed at
https://www14.software.ibm.com/webapp/set2/sas/f/power5cm/supportedcode.html

V5.4.0.3
Instructor Guide

Purpose
Details
Transition statement Lets take a look at the tables that apply to POWER7 systems.
Instructor Guide

Entry IBM Systems with POWER7 processors
IBM Power 710, 720, 730, 740, 750, 755
Mid-range IBM systems with POWER7 processors

IBM Power 770, 780
High-end IBM systems with POWER7 processors

IBM Power 795
https://www14.software.ibm.com/webapp/set2/sas/f/power5cm/supportedcodep7.html
Notes:
The following table lists currently supported firmware (FW) release levels for POWER7
systems, as well as the compatibility of HMC FW levels with system FW levels for Entry,
Mid-range, and High-end IBM Systems with POWER7 processors at the time of writing.
To access these tables online, go to
https://www14.software.ibm.com/webapp/set2/sas/f/power5cm3/eltablep7.html
https://www14.software.ibm.com/webapp/set2/sas/f/power5cm3/emtablep7.html
https://www14.software.ibm.com/webapp/set2/sas/f/power5cm3/ehtablep7.html

V5.4.0.3
Instructor Guide

Purpose
Details
Transition statement As you acquire the firmware update files, you will find that they
have a specific naming convention.
Instructor Guide
Decoding the firmware level and name

Associated update file name has the following form:
PPMMXXX_YYY_ZZZ where:
PP = 01 (managed system)
02 (power subsystem)
MM = EL (POWER6 Low-end)
EM (POWER6 Mid-range)
EH (POWER6 High-End)
EP (High-End Power System firmware 9125-F2A)
EB (High-End Power System firmware 9119-FHA)
XXX= the release level
YYY= the service pack level
ZZZ= the last disruptive service pack level
For example: 01EM320_061 01EM310_081_075
01EM320_031_031
Currently installed is EM320_031_031, new service pack is EM320_076_045. Is

this disruptive?
Figure 9-8. Decoding the firmware level and name AN313.1
Notes:
Firmware, also known as microcode, is Licensed Internal Code that fixes problems and
enables new system features as they are introduced. New features introduced are
supported by new firmware release levels. In between new hardware introductions, there
are fixes or updates to the supported features. These fixes are often bundled into service
packs. A service pack is referred to as an update level. A new release is referred to as an
upgrade level. Both levels are represented by the file name in the form of
PPMMXXX_YYY_ZZZ. PP, and MM are package and machine type identifiers. PP can be
01 for managed system or it can be 02 for power subsystem. The MM identifier is a EM, EL,
or EH for POWER6 systems, depending on the model and a EP or ES for Power System
firmware. The firmware version file applicable to POWER6 systems is in the form of
01ELXXX_YYY_ZZZ for Low-end servers, 01EMXXX_YYY_ZZZ for Mid-range servers and
01EHXXX_YYY_ZZZ for High-end servers.
The file naming convention for system firmware is 01ELXXX_YYY_ZZZ, where XXX is the
stream release level, YYY is the service pack level, and ZZZ is the last disruptive service
pack level.

V5.4.0.3
Instructor Guide
Uempty Using the previous example, the system firmware 01EL320_076 would be described as
release level 320, service pack 076.
Each stream release level supports new machine types, new features, or both.
Firmware updates can be disruptive or concurrent. A disruptive upgrade is defined as one
that requires the target system to be shutdown and powered off prior to activating the new
firmware level. A new release level upgrade will always be disruptive. All other upgrades
are defined as concurrent, meaning that they can be applied while the system is running.
Concurrent updates require an HMC but are not guaranteed to be non-disruptive.
In general, a firmware upgrade is disruptive if:
The release levels (XXX) are different. Example: Currently installed release is EM310,
new release is EM320.
The service pack level (YYY) and the last disruptive service pack level (ZZZ) are equal.
Example: EM320_120_120 is disruptive, no matter what level of EM310 is currently
installed on the system.
The service pack level (YYY) currently installed on the system is lower than the last
disruptive service pack level (ZZZ) of the new service pack to be installed. Example:
Currently installed service pack is EM310_120_120 and the new service pack is
EM310_152_130.
An installation is concurrent if:
The service pack level (YYY) is higher than the service pack level currently installed on
your system. Example: Currently installed service pack is EM310_126_120, new
service pack is EM310_143_120.
Instructor Guide
Instructor notes:
Purpose Decoding the fix/update file-naming convention.
Details Use the detail in the student notes to describe the naming convention.
Transition statement Before updating, you must identify the HMC and the managed
system code levels.

V5.4.0.3
Instructor Guide
Uempty
Using the HMC to update system firmware

Check HMC code level
lshmc -V
HMC code update application
Update HMC (if necessary)
Check the managed system firmware level

Updates node in the HMC navigation area
Check for available updates

Apply updates
Figure 9-9. Using the HMC to update system firmware AN313.1
Notes:
The HMC Version 7 navigation pane contains the primary navigation links for managing
your system resources and the Hardware Management Console. These include the
Updates link. Updates provides a way for you to access information on both HMC and
system firmware code levels at the same time without running a task. The Updates work
pane displays the Hardware Management Console code level, system code levels, and the
ability to install corrective service by clicking Update HMC.
The HMC version can also be checked as hscroot from the shell prompt as follows:
version= Version: 7
Release: 3.3.0
Service Pack: 1
HMC Build level 20080602.1
MH01113: Support for new T0 Synergy brand. (06-02-2008)
","base_version=V7R3.3.0"
You can check available updates at:
Instructor Guide
This site allows you to download updates for the system firmware as well as other system
components, such as devices, adapters, disks, and so on. You can also download ISO
images to use when updating the system firmware from the HMC or diagnostic CD.
From this Web page, you can select to download an RPM or ISO of your desired update.
Selecting Desc provides detailed information of the description and purpose, requirements,
and how to install the update.

V5.4.0.3
Instructor Guide

Purpose Describe how to use the HMC to update the system firmware.
Details If the HMC is not at a level that is compatible with the managed system firmware
you are about to install, then you must first update the HMC code. This course does not
cover the HMC update detail, because this is a basic skill we hope the student already has.
We look quickly at the firmware update process.
Transition statement Lets see how we could use the License Internal Code Updates
application to retrieve and apply the updates.
Instructor Guide
Managed system firmware update
To examine Firmware
current LIC update
levels From CD
only
Figure 9-10. Managed system firmware update AN313.1
Notes:
For each selected Managed system, you can launch five tasks. The first task is to perform
updates to the current LIC release. This is sometimes referred to as applying fixes to the
firmware. There are many options as to where you apply the update from, whether from an
IBM Web site, technical support system, CD, and so forth. These fixes can be concurrent
or disruptive.
If you need to upgrade to a whole new firmware release, this is done with the second
option, Upgrade Licensed Internal Code to a new release. If you are upgrading to a new
release, you can obtain images from an online source; however, they must be applied from
a CD-ROM. For example, you might obtain the CD from IBM, or you might download an
ISO image from an online source and create your own CD. This CD is used to upgrade the
release level of the managed system firmware.
The third task (Flash Side Selection) enables you to select which flash side will be active
after the next activation, t-side (temporary side) or p-side (permanent side). The Service
Pack maintains two copies of the server firmware. One copy is held in the t-side repository

V5.4.0.3
Instructor Guide
Uempty (temporary) and the other copy is held in the p-side repository (permanent) This Flash Side
Selection option is for IBM service use only.
The Check system readiness task checks for any errors on the internal code for the target
managed system.
View current firmware level

The Updates work pane displays the Hardware Management Console code level and the
system code levels (the current firmware level). But another way to determine your current
firmware level of the managed system is to select one managed system and then choose
the Change Licensed Internal Code for the current release task. In the next window, select
the View System Information option and click OK. A window will pop up and ask for the
location of the LIC repository. You can choose None, an IBM Web site, the IBM support
system, and so forth, as locations.
Instructor Guide
Instructor notes:
Purpose Discuss the update options available in the Updates HMC application.
Details This course does not cover the firmware update process details, because this is
a basic skill we hope the student already has, and this is discussed in the AU73 course.
Transition statement Lets see how to perform an update without HMC.

V5.4.0.3
Instructor Guide
Uempty
Updating firmware without an HMC (1 of 5)

ASMI (through serial cable access)
Welcome
Machine type-model: 8204-E8A
Serial number: 652AFE2
Date: 2008-7-21
Time: 20:12:48
Service Processor: Primary
User ID: admin
Password: *****
User ID to change: admin
Current password for user ID admin: *****
New password for user: ******
New password again: ******
Operation completed successfully.
PRESS ENTER TO CONTINUE:

Number of columns [80-255, Currently: 80]:
Number of lines [24-255, Currently: 24]:
Figure 9-11. Updating firmware without an HMC (1 of 5) AN313.1
Notes:
If you do not have an HMC attached to your managed system, your first step in the update
process is to determine the firmware level of the system. This can be identified through the
Advance System Management Interface (ASMI).
A system with no HMC is also known as an unmanaged system. The ASMI is used to
power on the system and perform other useful functions. Using a serial cable and a
terminal emulator program like HyperTerminal on Windows, the text-based ASMI and the
active console can be accessed. When the serial connection to the system is established,
press Enter to log in and to be presented with the following ASMI login screen:
Instructor Guide
Welcome
Machine type-model: 8204-E8A
Serial number: 652AFE2
Date: 2008-7-21
Time: 20:12:48
Service Processor: Primary
User ID: admin
Password: *****
User ID to change: admin
Current password for user ID admin: *****
New password for user: ******
New password again: ******
Operation completed successfully.
PRESS ENTER TO CONTINUE:

Number of columns [80-255, Currently: 80]:
Number of lines [24-255, Currently: 24]:
Log in as the admin user account. The default password is admin. If this is the first time,
you are prompted to change the password.
When you are logged in, you can see the managed systems firmware level.
If you know the IP address of the unmanaged systems service processor, you can use
your Web browser to display the following URL:
https://<your service processors IP address>

V5.4.0.3
Instructor Guide

Purpose Identify the steps required to update the system firmware without the HMC.
Details The first step is to identify the firmware level. Without the HMC, this can be
accomplished through the ASMI. The notes shows the early screens seen when accessing
the text-based ASMI.
Transition statement After login, you see the system firmware level as the version
value displayed.
Instructor Guide

Check firmware level
System name: Server-8204-E8A-SN652AFE2
Version: EL320_076
User: admin
Copyright 2002-2008 IBM Corporation. All rights reserved.
1. Power/Restart Control
2. System Service Aids
3. System Information
4. System Configuration
5. Network Services
6. Performance Setup
7. On Demand Utilities
8. Concurrent Maintenance
9. Login Profile
99. Log out
Notes:
The firmware level is the Version value in the upper left of the screen. The Power/Restart
Control option allows you to power on the system and observe the initialization process.
As the system powers up, you should make the system boot from the CD/DVD rom by
accessing the SMS mode to change the bootlist.

V5.4.0.3
Instructor Guide

Purpose Identify the system firmware on the text-based ASMI screen.
Details
Transition statement This value can also be seen on the Web-based ASMI.
Instructor Guide
Notes:
Access the ASMI from a Web browser. Direct your Web browser to https://<your service
processor IP address>.
Another way to get the system firmware level is by running the lsmcode command from
within one of the systems AIX LPARs.

V5.4.0.3
Instructor Guide

Purpose Identify the system firmware level using the Web-based ASMI.
Details
Transition statement Now that we have the firmware level, we can now go to the
service LPAR to start the update process.
Instructor Guide

FUNCTION SELECTION 801002
Move cursor to selection, then press Enter.
Diagnostic Routines
This selection will test the machine hardware. Wrap plugs and
other advanced functions will not be used.
Advanced Diagnostics Routines
other advanced functions will be used.
Task Selection (Diagnostics, Advanced Diagnostics, Service Aids, etc.) TASKS SELECTION LIST 801004
This selection will list the tasks supported by these procedures.
Once a task is selected, a resource menu may be presented showing
all resources supported by the task. From the list below, select a task by moving the cursor to
Resource Selection the task and pressing 'Enter'.
This selection will list the resources in the system that are supported To list the resources for the task highlighted, press 'List'.
by these procedures. Once a resource is selected, a task menu will
be presented showing all tasks that can be run on the resource(s). [MORE...24]
Format Media
Gather System Information
Hot Plug Task
F1=Help F10=Exit F3=Previous Menu
Identify and Attention Indicators
Local Area Network Analyzer
Log Repair Action
Microcode Tasks
RAID Array Manager
SSA Service Aids
This selection provides tools for diagnosing and resolving
problems on SSA attached devices.
Update and Manage system Flash
[BOTTOM]
F1=Help F4=List F10=Exit Enter

F3=Previous Menu
Notes:
The diagnostic program can be used to update the system firmware by accessing Task
Selection -> Update and Manage System Flash.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide

AIX (or Linux) operating system: Service partition
Steps required:
View system firmware level
ASMI Welcome panel
lsmcode command
The current permanent system firmware image is EL320_076.
The current temporary system firmware image is EL320_076.
The system is currently booted from the temporary image.
Download fix/update
Apply the fix

cd /tmp/fwupdate
/usr/lpp/diagnostics/bin/update_flash -f fwlevel
Where fwlevel is the specific server firmware fix file name, such as 01EL320_yyy_zzz.
Notes:
Installing server firmware fixes through the operating system is a disruptive process. The
permanent level is also known as the backup level. The temporary level is also known as
the installed level. The system was booted from the temporary side, so at this time the
temporary level is also the activated level.
From a computer or server with an Internet connection, go to the Microcode Downloads
website at http://www14.software.ibm.com/webapp/set2/firmware/gjsn. Select your
machine type and model from the drop-down list under Download microcode by machine
type and model. Click Go. An information window is opened. Click Continue. The available
firmware levels are displayed. Record the available firmware.
Select the check box associated with all of the fixes you want to download and then select
Continue which is at the bottom of the page. Perform the following in your LPAR which has
Service Authority (the Service Partition).
To unpack the RPM file, enter one of the following commands at the AIX or Linux command
prompt:

V5.4.0.3
Instructor Guide
Uempty If you want to unpack from a CD, enter rpm -Uvh --ignoreos
/mnt/filename.rpm
If you want to unpack from the server's hard drive, enter rpm -Uvh --ignoreos
/tmp/fwupdate/filename.rpm where filename is the name of the RPM file that
contains the server firmware; for example, 01EL3xx_yyy_zzz.rpm.
When you unpack the RPM file, the server firmware fix file is saved in the /tmp/fwupdate
directory on the server's hard drive in the following format: 01EL3xx_yyy_zzz.
You need the server firmware fix file name in the next step. To view the name, enter the
following at an AIX or Linux command prompt: ls /tmp/fwupdate
Note: To perform this task, you must have root user authority.
The name of the server firmware fix file is displayed. For example, you might see output
similar to the following: 01EL3xx_yyy_zzz
To install the server firmware fix
From an AIX command prompt enter the following:
cd /tmp/fwupdate
/usr/lpp/diagnostics/bin/update_flash -f fwlevel
Where fwlevel is the specific server firmware fix file name, such as 01EL3xx_yyy_zzz.
During the server firmware installation process, reference codes CA2799FD and
CA2799FF are alternately displayed on the control panel. After the installation is complete,
the system is automatically powered off and powered on.
Instructor Guide
Instructor notes:
Purpose Describe updating the system firmware from an AIX partition.
Details Use the student note content to describe the details. You can use this procedure
on a system that is managed by an HMC. The LPAR must be the Service Partition. If the
system was managed by an HMC, you would be able to assign the Service Partition
through the General tab of the Manage System Properties. Otherwise, with the
unmanaged system, the default partition is set up to be the service partition by IBM
manufacturing.

V5.4.0.3
Instructor Guide
Uempty
VIO Server code updates

Updated code adds new features and fixes to old problems.
VIOS 1.4 VIOS 1.5.x.x VIOS 2.1.x.x
DATE 05/2007 11/2007 05/2008 11/2008 05/2009
PRODUCT 1.4.1.1 1.5.1.1 1.5.2.1 2.1.0.0 2.1.1.0
-N port ID - Active memory
Power 6 blade support
Enablement for virtualization sharing
POWER6 Partition mobility
-Virtual tape -Electronic
Enablement of File-backed virtual Network support Service Agent
LDAP, SNMP SCSI devices and apportioning
virtual optical devices -Heterogeneous -cfgassist
Added storage Feature of the
support SAS .. Expansion pack multi-path I/O enhancement
COMMENTS SEA
IVE support IVM support for VIO CLI -IVM support on -Improved VIOs
GVRP protocol physical I/O on enhancements System i and HMC
POWER6 management
TSM Client -Partition mobility
Additional CLI between two -Solid state drive
ITUAM agent
monitoring commands
TADDM HMCs support
Fix Pack Fix Pack 9.1 Fix Pack 10.1 Fix Pack 11.1 Migration DVD Fixpak 21
5.3.0 TL06 or 5.3.0 TL08 or
AIX BASE 5.3.0 TL07 or later AIX 6.1 AIX 6.1 TL2 SP3
later later
All VIO code versions and updates are not shown.

Figure 9-16. VIO Server code updates AN313.1
Notes:
Existing VIOS installations can refresh to the latest VIOS level by applying the Fix Pack. If
your VIOS is installed with the previous VIOS install media, or running with a Fix Pack prior
to Fix Pack 21 you should update it by applying Fix Pack 21
If you are updating from VIOS level 1.5, you must perform an upgrade to version 2.1 This
upgrade preserves the virtual devices configuration. Then you can update to the 2.1.1.0
level by installing the Fix Pack 21.
VIOS Version 2.1

The Virtual I/O Server Version 2.1 provides many enhancements such as N_Port ID
virtualization, virtual tapes support, virtual SCSI devices backed by solid state drives and
some management enhancements. This version also supports the PowerVM Active
Memory Sharing feature. Refer to this link for more informations:
http://www14.software.ibm.com/webapp/set2/sas/f/vios/documentation/installreadme.html
Instructor Guide
VIOS Version 1.5

In VIOS Version 1.5, there were several new features in system management and
availability such as:
Support for Live Partition Mobility on PowerVM Enterprise Edition
Support for file-backed virtual SCSI devices
Virtual I/O Server Expansion Pack
Support for BladeCenter JS22
Monitoring solutions: Additional monitoring solutions are available to help manage and
monitor the VIOS and shared resources. New commands and views provide additional
metrics for memory, paging, processes, Fibre Channel HBA statistics, and
virtualization.
IBM Tivoli Application Dependency Discovery Manager (TADDM) enables System p
users to:
- Discover their System p server environment along with the rest of their IT
infrastructure.
- Learn how their System p servers are configured (and changing over time).
- Determine whether their System p servers are compliant with their configuration
policies. TADDM provides deep-dive discovery for the System p servers (POWER5
or later), including its dependencies on the network and applications, along with its
configuration data, subsystems, and virtualized LPARs. The discovery includes the
HMC, VIOS, IVM, System p servers, and each of the LPARs.
Virtual I/O Server Version 1.5.2
This VIOS Version 1.5.2 provided several key enhancements in the area of POWER
Virtualization.
VIOS network bandwidth apportioning. The bandwidth apportioning feature for the
shared Ethernet adapter, allows the VIOS to give a higher priority to some types of
packets. In accordance with the IEEE 802.1q specification, VIOS administrators can
instruct the SEA to inspect bridged VLAN-tagged traffic for the VLAN priority field in the
VLAN header. The 3-bit VLAN priority field allows each individual packet to be
prioritized with a value from 0 to 7 to distinguish more important traffic from less
important traffic. More important traffic is sent faster and uses more VIOS bandwidth
than less important traffic.
Virtual I/O Server CLI was enhanced to support Image Management commands. The
CLI is in a unique position to have access to virtual disks and their contents. Users will
be able to make a copy of virtual disks and install virtual disks using the Image
Management command, cpvdi. This command will allow other programs to create and
copy virtual disk images.
The Virtual I/O Server runs on Internet Protocol version 6 (IPv6) networks; therefore
users can configure IPv6 type IP addresses.
The updates to the Systems Planning and Deployment tool include updates to ensure
none of the existing VIOS mappings are changed during the deployment step.
If you want to know more information about the Virtual I/O Server, go to the following
website: http://www14.software.ibm.com/webapp/set2/sas/f/vios/documentation/home.html

V5.4.0.3
Instructor Guide

Purpose Describe the purpose of VIO Server updates.
Details Mention the details in the student notes.
Instructor Guide
VIO Server service strategy

VIOS service package definitions
Fix pack
Updates your VIOS release to the latest level
Can contain product enhancements, new function, and fixes
Service pack
Applies to only one (the latest) VIOS level
Critical fixes for issues found between fix pack releases
Can only be applied to the fix pack release for which it is specified
Interim fix
Applies to only one (the latest) VIOS level
Provides a fix for a specific issue
Figure 9-17. VIO Server service strategy AN313.1
Notes:
The service strategy for VIOS has changed. In addition to Fix Packs, Service Packs for
VIOS will be released, depending on the number of needed changes. VIOS Service Packs
consist of critical changes found between Fix Pack releases.
VIOS server problem/resolution

To ensure the reliability, availability, and serviceability of a computing environment using
the Virtual I/O Server, IBM recommends that you update the Virtual I/O Server to the latest
generally available release, Fix Pack or Service Pack. The latest level contains all
cumulative fixes for the VIOS. Interim fixes (iFixes) are only provided at the latest shipped
VIOS level. You need to upgrade to the latest service level or Fix Pack to install an interim
fix. The official fixes might be shipped in the next Fix Pack or Service Pack of VIOS.

V5.4.0.3
Instructor Guide

Purpose
Details
Transition statement Lets see how to migrate to Virtual I/O Server 2.1.
Instructor Guide
Upgrade to VIOS 2.1

If your VIOS level is at 1.3, 1.4, or 1.5:
You must migrate to version 2.1 by using the migration media.
You can then update your VIOS 2.1 to the latest 2.1 level by applying
the latest version 2.1 fix pack.
The Virtual I/O Server configuration is preserved.
Migrating the VIOS from the HMC:

Run a backupios before migration.
Run installios.
Run ioslevel (should be 2.1.0.0).
Migrating the VIOS from a migration DVD

Boot in SMS.
Check the title displayed is VIOS Migration Installation and Settings.
Perform the installation.
Figure 9-18. Upgrade to VIOS 2.1 AN313.1
Notes:
Before you start, ensure that the following statements are true:
An HMC is attached to the managed system.
A DVD optical device is assigned to the Virtual I/O Server logical partition.
The Virtual I/O Server migration installation media is required.
After the migration is complete, the Virtual I/O Server logical partition is restarted to its
preserved configuration prior to the migration installation. It is recommended that you verify
that migration was successful by checking results of the installp command and running the
ioslevel command. It should indicate that the ioslevel is now $ ioslevel 2.1.0.0. You can
restart previously running daemons and agents such as FTP and Telnet, and the previously
running agents, such as ITUAM.

V5.4.0.3
Instructor Guide

Purpose
Details Highlight that going from previous VIO Server versions to VIO Server 2.1 is a
migration process and not an update. The upgrade preserves all the virtual devices and
configuration of the Virtual I/O Server.
Instructor Guide
VIO Server software updates: Single VIO Server

Planned or unplanned virtual I/O client downtime
Steps required:
1. Shut down the associated virtual I/O clients.

2. Apply the update.
$ updateios
3. Reboot the Virtual I/O Server.
$ shutdown -restart
4. Log in as padmin and check VIOS level.
$ ioslevel
Figure 9-19. VIO Server software updates: Single VIO Server AN313.1
Notes:
When performing updates in a single Virtual I/O Server environment, you must plan for
virtual I/O client downtime. This is because you must bring down all of the associated
virtual I/O clients before you can start the update of the Virtual I/O Server.
To avoid complications during and after an update, you should verify the system is fully
operational as well as trouble free, and check the environment's configuration before
updating the Virtual I/O Server software. The following list is an example of useful
commands that can be used to document the configuration of the virtual I/O client and
Virtual I/O Server:
lsvg rootvg: On the Virtual I/O Server and virtual I/O client, check for stale PP's and
stale PV.
lsvg -pv rootvg: On the Virtual I/O Server, check for missing disks.
netstat -cdlistats: On the Virtual I/O Server, check that the Link status is Up on
all used interfaces.

V5.4.0.3
Instructor Guide
Uempty errpt: On the virtual I/O client, check for CPU, memory, disk, or Ethernet errors, and
resolve them before you continue.
lsvg -p rootvg: On the virtual I/O client, check for missing disks.
netstat -v: On the virtual I/O client, check that the Link status is Up on all used
interfaces.
If a current backup is not available, perform a backup of the Virtual I/O Server and the
virtual I/O client. Backup of the Virtual I/O Server is done by backupios (discussed later in
this unit), and the backup of the virtual I/O client can done by using mksysb, savevg, Tivoli
Storage Manager, or similar backup products.
To update or upgrade the Virtual I/O Server, use the following steps (in this case, from a
locally attached DVD/CD drive):
1. Bring down the virtual I/O clients connected to the Virtual I/O Server.
2. Apply the update with the updateios command (detailed later in this unit).
3. Reboot Virtual I/O Server.
4. Check the new level with ioslevel.
Instructor Guide
Instructor notes:
Purpose Introduce the steps required to update the VIO Server.
Details In a single VIO Server configuration, you must plan for the impact the update
process will have on the clients. Depending on the devices served, the clients might
experience a great amount or very little disruption. For example, if the client is being served
its boot disk from this VIO Server, the client must be shut down prior to the update
procedures.
Transition statement Lets look at how things differ in a dual VIO Server configuration.

V5.4.0.3
Instructor Guide
Uempty
VIO Server software updates:
Dual VIO Servers (1 of 4)
Dual VIO Server configuration is recommended.
Minimizes client disruptions
Check and document the virtual Ethernet and virtual SCSI disk
configurations.
In SAN environment, disable path to the VIO Server that

receives the update.
# chpath l hdiskX p vscsiX s disable
This assumes the best case scenario, where one VIOS is used for
vSCSI's primary path and the other VIOS is used for the primary path of
SEA failover.
Figure 9-20. VIO Server software updates: Dual VIO Servers (1 of 4) AN313.1
Notes:
When applying an update to the Virtual I/O Server in a dual Virtual I/O Server environment,
you can do so without having planned or unplanned downtime. However, if the Virtual I/O
Server is updated from 1.3 to 1.4 or higher, and you want to migrate from network interface
backup to shared Ethernet adapter fail over on the clients, you have planned downtime at
the virtual I/O client when changing the virtual network setup. Migrating from Network
Interface Backup to shared Ethernet adapter failover is optional.
It is always good practice to check the virtual Ethernet and virtual SCSI disk device
configurations on the Virtual I/O Server and virtual I/O client before starting the update.
Also, consider checking the physical adapter connections and the virtual device mappings.
Instructor Guide
As seen in this example, all of the virtual adapters on the virtual I/O client are up and
running:
#netstat -v
.. (Lines omitted for clarity)
.Virtual I/O Ethernet Adapter (l-lan) Specific Statistics:
---------------------------------------------------------
RQ Length: 4481
No Copy Buffers: 0
Filter MCast Mode: False
Filters: 255
Enabled: 1 Queued: 0 Overflow: 0
LAN State: Operational
Hypervisor Send Failures: 0
Receiver Failures: 0
Send Errors: 0
Hypervisor Receive Failures: 0
ILLAN Attributes: 0000000000003002 [0000000000002000]
The following is run on the VIOS number 1:
$ netstat -cdlistats
---------------------------------------------------------
RQ Length: 4481
No Copy Buffers: 0
Trunk Adapter: True
Priority: 1 Active: True
Filters: 255
The following might be seen on VIOS number 2:
---------------------------------------------------------
RQ Length: 4481
No Copy Buffers: 0
Trunk Adapter: True
Priority: 2 Active: False
Filters: 255

V5.4.0.3
Instructor Guide
Uempty LAN State: Operational

In the following outputs, the link status' shows the physical adapters
are down. This could be because the adapters do not have cables
connected:
$
VIOS-1
.
2-Port 10/100/1000 Base-TX PCI-X Adapter (14108902) Specific Statistics:
------------------------------------------------------------------------
Link Status: Down
Media Speed Selected: Auto negotiation
.
. (Lines omitted for clarity)
.
VIOS-2
$ netstat -cdlistat
.10/100/1000 Base-TX PCI-X Adapter (14106902) Specific Statistics:
-----------------------------------------------------------------
Link Status: Down
Media Speed Selected: Auto negotiation
.$
Instructor Guide
Instructor notes:
Purpose Introduce the steps required to update the VIOS in a dual VIO Server
configuration.
Details Details are in the notes.
Transition statement Lets look at items needing attention if you are using an MPIO
configuration.

V5.4.0.3
Instructor Guide
Uempty
SAN switch SAN switch
VIOS 1 VIOS 2
FC FC
vSCSI vSCSI
vSCSI vSCSI
MPIO
Notes:
How to check the disk status depends on how the disks are shared from the Virtual I/O
Server. You want to verify that everything is alright before you start the update.
If you have an MPIO setup similar to the example shown, you should run the following
commands before and after updating the first Virtual I/O Server. This allows you to check
the disk path status.
lspath: On the virtual I/O client, check all the paths to the disks, they should all be in
the enabled state.
lsattr -El hdisk0: On the virtual I/O client, look at the MPIO heartbeat values for
hdisk0. Verify the hcheck_mode attribute is set to nonactive and hcheck_interval
attribute is set to 60.
If you are using an IBM storage solution, then verify the reserve_policy attribute is set to
no_reserve.
Other storage vendors might require other values for reserve_policy. You should check this
attribute value at the Virtual I/O Server.
Instructor Guide
Instructor notes:
Purpose Identify things that need to be checked and documented in an MPIO
configuration.
Details Before performing the update, you should check your MPIO configuration.
Check to see that the client is not using the path that includes the VIO Server being
updated.
Transition statement What should you check if you are using LVM mirroring?

V5.4.0.3
Instructor Guide
Uempty
VIOS 1 VIOS 2
vSCSI vSCSI
vSCSI vSCSI
Client partition
LVM
mirroring
Notes:
If your LVM disk environment is similar to the figure, you should check the LVM status of
the disk shared by the Virtual I/O Server. Verify that everything is okay before performing
the update.
lsvg rootvg: On the virtual I/O client, check for stale PPs and the quorum must be
off.
lsvg -p rootvg: On the virtual I/O client, check for missing hdisk.
Instructor Guide
Instructor notes:
Purpose Things to check in an LVM mirrored disk environment.
Details This is a simple but very popular configuration.
Transition statement Lets look at what commands should be used when updating in a
dual VIO Server configuration.

V5.4.0.3
Instructor Guide
Uempty
1. Run updateios.
2. Reboot the standby Virtual I/O Server when update is done.
$ shutdown -restart
4. Verify updated environment.
5. Start the update on the other VIOS
6. If using SEA, change the standby/primary status.
$ chdev -dev ent4 -attr ha_mode=standby
ent4 changed
updateios
7. Reboot the Virtual I/O Server (shutdownrestart).
9. Reset the Virtual I/O Server SEA role back to primary using chdev.
$ chdev -dev ent4 -attr ha_mode=auto
ent4 changed
Notes:
If using shared Ethernet adapter failover, use the netstat command to see that the
interface is not active on the VIO Server you are about to update. In the following command
output, we can see the adapter is the standby adapter (Priority=2) and it is not active
(Active=false):
Trunk Adapter: True
Filters: 255
$
Instructor Guide
Verify that the standby Virtual I/O Server and the virtual I/O client are connected to the
Virtual I/O Server environment. If you have an MPIO environment, run the lspath on the
virtual I/O client and verify that the all paths are enabled. If you have an LVM environment,
You will have to run varyonvg and the volume group should begin to sync. If not, run
syncvg -v on the volume groups that use virtual disk from the Virtual I/O Server
environment so that all the volume groups are in sync.
# lsvg -p rootvg
rootvg:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk0 active 511 488 102..94..88..102..102
hdisk1 missing 511 488 102..94..88..102..102
# varyonvg rootvg
# lsvg -p rootvg
rootvg:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk0 active 511 488 102..94..88..102..102
hdisk1 active 511 488 102..94..88..102..102
# lsvg rootvg
VOLUME GROUP: rootvg VG IDENTIFIER: 00c478de00004c00000
00006b8b6c15e
VG STATE: active PP SIZE: 64 megabyte(s)
VG PERMISSION: read/write TOTAL PPs: 1022 (65408 megabytes)
MAX LVs: 256 FREE PPs: 976 (62464 megabytes)
LVs: 9 USED PPs: 46 (2944 megabytes)
OPEN LVs: 8 QUORUM: 1
TOTAL PVs: 2 VG DESCRIPTORS: 3
STALE PVs: 0 STALE PPs: 0
ACTIVE PVs: 2 AUTO ON: yes
MAX PPs per VG: 32512
MAX PPs per PV: 1016 MAX PVs: 32
LTG size (Dynamic): 256 kilobyte(s) AUTO SYNC: no
HOT SPARE: no BB POLICY: relocatable
Verify that the Ethernet connects to the Virtual I/O Server being used for the shared
Ethernet adapter failover scenario using netstat -cdlistat on Virtual I/O Server or
netstat -v on the virtual I/O client for network interface backup and check the link
status for the network interface backup adapters.

V5.4.0.3
Instructor Guide
Uempty If the shared Ethernet adapter failover is used, shift the standby/primary Virtual I/O
Server, using chdev and check with netstat -cdlistats that the state has changed.
$ chdev -dev ent4 -attr ha_mode=standby
ent4 changed
Trunk Adapter: True
Apply the update to the Virtual I/O Server, which now is the standby Virtual I/O Server
using updateios.
Reboot the Virtual I/O Server: shutdown -restart.
Check the new level with ioslevel.
Verify the environment of the standby Virtual I/O Server and the associated virtual I/O
clients.
If you have an MPIO environment, run the lspath on the virtual I/O client and verify all
paths is enabled. If you have an LVM environment, You will have to run varyonvg and
the volume group should begin to sync. If not, run syncvg -v on the volume groups
that use virtual disk from the Virtual I/O Server environment so all the volume groups
are in sync.
Verify the Ethernet connect to the Virtual I/O Server, using netstat -cdlistats and
netstat -v check for link status.
Reset the Virtual I/O Server role back to primary using chdev.
$ chdev -dev ent4 -attr ha_mode=auto
ent4 changed
The update is now done.
Instructor Guide
Instructor notes:
Purpose Identify the commands used to update the VIOS.
Details

V5.4.0.3
Instructor Guide
Uempty
Acquiring VIO Server software updates

Retrieve the latest fix from the VIO Server website.
http://www14.software.ibm.com/webapp/set2/sas/f/vios/download/home.html
Anonymous FTP at ftp.software.ibm.com
Log in to VIO Server. To get Fix Pack 11.1, do the following:
cd <directory>
ftp ftp.software.ibm.com
Name: anonymous
Password: ftp
ftp> cd /software/server/vios/fixes/fixpack111
ftp> binary
ftp> prompt
ftp> mget *
ftp> quit
ISO image or order fix pack CD-ROM
Figure 9-24. Acquiring VIO Server software updates AN313.1
Notes:
The update package is available as a set of downloadable files. As new updates become
available, the package name will change. New packages are cumulative. To obtain all
available fixes, download the latest package.
The Web site also allows you to download the package by way of a Java applet. This
enables you to download the entire package in one session. This applet can download files
to your system only if you grant the access. You are prompted for this. If you deny access,
the applet does not download files.
There is a link to downloading the ISO image. Also, you can order the CD-ROM through the
Delivery Service Center. The order site requires you to sign on with an IBM ID. You receive
the CD-ROM in several days.
Instructor Guide
Instructor notes:
Purpose Identify how to acquire the updated code.
Details
Transition statement When you have acquired the update, you can use the updateios
command.

V5.4.0.3
Instructor Guide
Uempty
updateios command
updateios syntax:
$ updateios -dev Media [-f] [-install] [-accept]
$ updateios -commit I -reject [-f]
$ updateios cleanup
$ updateios -remove {-file RemoveListFile I

RemoveList}
Figure 9-25. updateios command AN313.1
Notes:
The updateios command is used to install fixes, or updates the Virtual I/O Server to the
latest maintenance level. Before installing a fix or maintenance level, the updateios
command first runs a preview installation and displays the results. Upon completion of the
preview, the user is then prompted to either continue or exit. If the preview fails for any
reason, then the updates should not be installed.
The -install flag is used to install new file sets onto the Virtual I/O Server. This flag should
not be used to install fixes or maintenance levels.
The -cleanup flag cleans up after an interrupted installation and attempts to remove all
incomplete pieces of the previous installation. Cleanup should be performed whenever any
software product or update is in a state of either applying or committing and can be run
manually as needed.
The -commit flag commit all uncommitted updates to the Virtual I/O Server.
The -reject flag rejects all uncommitted updates to the Virtual I/O Server.
Instructor Guide
If the -remove flag is specified, the listed file sets are removed from the system. The file
sets to be removed must be listed on the command line or in the RemoveListFile file.
The log file, install.log in the user's home directory, is overwritten with a list of all file
sets that were installed.
Flags
-accept agrees to required software license agreements for software to be installed.
-cleanup cleans up after an interrupted installation or update.
-commit commits all specified updates.
-dev Media Specifies the device or directory containing the images to install.
-f forces all uncommitted updates to be committed prior to applying the new updates.
When combined with the -dev flag, it commits all updates prior to applying any new
ones. When combined with the -reject flag, it rejects all uncommitted updates without
prompting for confirmation.
-file specifies the file containing a list of entries to uninstall.
-install installs new file sets onto the Virtual I/O Server.
-reject rejects all specified uncommitted updates.
-remove performs an uninstall of the specified software.
To update the Virtual I/O Server to the latest level, where the updates are located on the
mounted file system /home/padmin/update, type updateios -dev
/home/padmin/update.
To update the Virtual I/O Server to the latest level, when previous levels are not committed,
type updateios -f -dev /home/padmin/update.
To reject installed updates, type updateios -reject.
To cleanup partial installed updates, type updateios -cleanup.
To commit the installed updates, type updateios -commit.

V5.4.0.3
Instructor Guide

Purpose Introduce the updateios command.
Details The updateios command can be used to update the VIOS from the local hard
disk, remote file system, or from CD/DVD media.
Transition statement Lets take a look at using the command to update from the VIO
Servers local disk.
Instructor Guide
Applying updates from a local hard disk

1. Log in to the Virtual I/O Server as the user padmin.
2. Type $ mkdir directory_name.
3. Transfer update files using FTP to directory created.
4. Type $ updateios -dev directory_name -install

-accept.
5. Reboot the VI/O server.
6. Type $ ioslevel.
Figure 9-26. Applying updates from a local hard disk AN313.1
Notes:
Log in to the Virtual I/O Server as the user padmin.
Create directory on the Virtual I/O Server:
$ mkdir directory_name
Transfer update files using ftp to the directory created.
Apply the update by running the updateios command:
$ updateios -dev directory_name -install -accept
Accept to continue installation after preview update is run.
Verify that the update was successful by checking results of the updateios command and
running the ioslevel command. The result of ioslevel should be equal to the level of the
package downloaded:
$ ioslevel

V5.4.0.3
Instructor Guide

Purpose Discuss using the updateios command to update the VIOS from a local
update file.
Details
Transition statement Lets look at updating from a fix file located on a remote file
system.
Instructor Guide
Applying updates from remote file system

1. Rename the fix pack file tableofcontents.txt to .toc.
2. Log in to the Virtual I/O Server as user padmin.
3. Type $ mount remote_machine_name:directory

/mnt.
4. Type $ updateios -dev /mnt -install accept

(if prompted to remove the .toc?, type no).
5. Reboot the VIO Server.
6. Type $ ioslevel.
Figure 9-27. Applying updates from remote file system AN313.1
Notes:
If the remote file system is to be mounted read-only, you must first rename the fix pack file
tableofcontents.txt to .toc. Failure to do this will prevent you from being able to install this fix
pack.
Log in to the Virtual I/O Server as user padmin.
Mount the remote directory onto the Virtual I/O Server:
$ mount remote_machine_name:directory /mnt
Apply the update by running the updateios command:
$ updateios -dev /mnt -install -accept
Verify update was successful by checking results of updateios command and running
ioslevel command. The result of ioslevel should be equal to the ioslevel of the package
downloaded:
$ ioslevel

V5.4.0.3
Instructor Guide

Purpose Discuss updating from a remote file system.
Details
Transition statement If you downloaded and burned an ISO image, you can use the
updateios command as follows.
Instructor Guide
Applying updates from the CD/DVD drive

1. Log in to the Virtual I/O Server as user padmin.
2. Place the update CD into the drive.
3. Type $ updateios -dev /dev/cd0 -install -

accept.
4. Reboot the VIO Server.
5. Type $ ioslevel.
Figure 9-28. Applying updates from the CD/DVD drive AN313.1
Notes:
This fix pack can be burned onto a CD using the ISO image. After the CD has been
created, the following steps need to be performed to apply the update:
Log in to the Virtual I/O Server as user padmin.
Place the update CD into the drive.
Apply the update by running the updateios command: $ updateios -dev /dev/cdX
-install -accept (where X is device number between 0 and N).
Verify update was successful by checking results of updateios command and running the
ioslevel command. The result of ioslevel should be equal to the level of the package
downloaded.
Refer to the Virtual I/O Server online publications for additional information on the
updateios, ioslevel, and mount commands. Information on these commands can be
obtained from the IBM Information Center.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Virtual I/O Server backup strategy

External device configuration
SAN and storage subsystem configuration
Memory, CPU, virtual devices, and physical devices defined at

the HMC
The Virtual I/O Server operating system
Virtual device backing storage configuration

Disk structures such as:
Storage pools or volume groups
Logical volumes and files (for example, the virtual media repository)
Linking or mapping of the virtual device to the physical devices
Raw logical volumes used as virtual devices by the VIO Server

must have their own backup policy
Figure 9-29. Virtual I/O Server backup strategy AN313.1
Notes:
A complete disaster recovery (DR) strategy for the Virtual I/O Server should include
backing up the four areas listed so that we can recover the virtual devices and their
physical backing devices. The reprovision of these four areas, if necessary, is followed by
our server backup strategy, which rebuilds the AIX or Linux logical partitions. If we just
want to back up the Virtual I/O Server, then external device configuration is the one we are
most interested in.
External device configuration

If there were to be a natural or man-made disaster that were to take out a complete site,
then this information should be included into the end-to-end backup strategy. This is
probably more of your Disaster Recovery strategy, but it should be considered in the
complete backup picture. The backup strategy for this depends on the make, model, and
vendor of the storage, networking equipment, and SAN devices to name but a few.
Examples of the type of information you need to record might be the network Virtual Local
Area Network (VLAN) or Logical Unit Number (LUN) information from a storage subsystem.

V5.4.0.3
Instructor Guide
Uempty This information is beyond the scope of this document but, we mention it here to make the
reader aware that a complete DR solution for a physical or virtual server environment has a
dependency on this information. The method to collect and record the information depends
not only on the vendor and model of the infrastructure systems at the primary site, but also
what is present at the DR site.
Memory, CPU, virtual devices, and physical devices defined on the HMC
The definition of the Virtual I/O Server logical partition on the HMC includes such things as
how much CPU, memory, and which physical adapters are to be used. In addition to this,
the virtual device configuration (for example, virtual Ethernet adapters and which virtual
LAN ID they belong to) needs to be captured. The backup and restore of this data is
beyond the scope of this document, but more information can be found in the IBM
Information Center under the Backing up partition profile data topic.
The Virtual I/O Server operating system

The virtual I/O operating system consists of the base virtual I/O code, along with any
custom device drivers, to support disk subsystems and user defined customizations.
After an initial setup, these settings probably do not change, apart from the application of
fix packs. So a sensible backup strategy for the Virtual I/O Server might be a weekly or
manual schedule after fixpacks have been applied.
The virtual I/O command supplied to perform backups is the backupios command, and
using this command we can perform a backup of the Virtual I/O Server to either a tape
device, a CD/DVD device, or to a file system (local, or a remotely mounted NFS one).
With the redundancy of dual Virtual I/O Servers comes the added system maintenance
requirement to keep the redundancy working. Just as with HACMP, all changes made to
the dual VIOS that are related to providing redundancy to a client partition should be
tested. While the strictness of HACMP testing does not need to be followed, testing after
major changes or additions should be performed to verify that the redundancy required is
actually achieved.
Backing up the dual VIOS also requires some thought. When backing up to an optical
device, where VIOS maintains control of the DVD, requires some planning, as the client
partitions also use that VIOS for access to the optical device. Assigning one VIOS as a
client to the other for the DVD is an option, but only DVD-RAM is supported for this
configuration.
Raw logical volumes used as virtual SCSI devices

Raw logical volumes used as virtual devices by the Virtual I/O Server must have their own
backup policy, as the backupios command does not back up raw logical volumes. When a
raw logical volume is used as virtual SCSI device, you need to backup the content from the
client logical partition.
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
VIO Server backup and restoration methods
Backup Media Restoration method

method
To tape Tape From tape
To DVD DVD-RAM From DVD
To remote file nim_resources.tar From an HMC using NIM on Linux facility

system image and the installios command
To remote file mksysb image From an AIX NIM server and a standard
system mksysb system installation
Tivoli Storage mksysb image Tivoli Storage Manager

Manager
Figure 9-30. VIO Server backup and restoration methods AN313.1
Notes:
You can back up the Virtual I/O Server and user-defined virtual devices using the
backupios command. You can also use IBM Tivoli Storage Manager to schedule backups
and store backups on another server. Different media can be used for performing a backup,
either DVD, tape, local or remote file systems. The restoration method depends on what
method was used for performing the backup.
Virtual I/O Server information you need to backup

The Virtual I/O Server contains the following types of information that you need to back up:
The Virtual I/O Server itself and user-defined virtual devices.
The Virtual I/O Server includes the base code, applied fix packs, custom device drivers
to support disk subsystems, and some user-defined metadata.
All of this information is backed up when you use the backupios command. In situations
where you plan to restore the Virtual I/O Server to the same system from which it was
backed up, then backing up only the Virtual I/O Server itself is usually sufficient.
Instructor Guide
User-defined virtual devices include metadata, such as virtual device mappings, that define
the relationship between the physical environment and the virtual environment. This data
can be saved to a location that is automatically backed up when you use the backupios
command.
Restoring on a new or different system

In situations where you plan to restore the Virtual I/O Server to a new or different system
(for example, in the event of a system failure or disaster), then you must back up both the
Virtual I/O Server and user-defined virtual devices. In these situations, you must also back
up the following components of your environment in order to fully recover your Virtual I/O
Server configuration:
External device configurations, such as Storage Area Network (SAN) devices.
Resources defined on the Hardware Management Console (HMC), such as processor
and memory allocations. This means backing up your HMC partition profile data for the
Virtual I/O Server and its client partitions.
The operating systems and applications running in the client logical partitions.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Backing up user-defined virtual devices

Back up user-defined virtual devices (such as virtual device
mappings) in preparation of a system failure or disaster.
1. List volume groups and storage pools with structures you want to
backup:
$ lsvg
2. Activate each volume group and storage pool that you want to back
up:
$ activatevg <volume_group>
3. Use the savevgstruct command to back up user-defined disk

structures.
In case the environment changed (new user-defined disk)
4. Save network settings, adapters, users and security settings to the

/home/padmin directory.
Figure 9-31. Backing up user-defined virtual devices AN313.1
Notes:
In situations where you plan to restore the Virtual I/O Server to a new or different system
(for example, in the event of a system failure or disaster), you need to back up both the
Virtual I/O Server and user-defined virtual devices.
The user-defined virtual devices include metadata, such as virtual device mappings that
define the relationship between the physical environment and the virtual environment, but
also any user-defined disk structure. This user-defined disk structures can change over
time if you add more clients or perform changes in the storage pools configuration.
In addition to backing up the Virtual I/O Server, you need to back up user-defined virtual
devices in preparation of a system failure or disaster. This can be accomplished using the
savevgstruct command.
Other informations to save

In addition to the lsmap command output, it is recommended that the following additional
information is gathered as this information would allow the Virtual I/O Server to be rebuilt
from the install media if necessary:

V5.4.0.3
Instructor Guide
Uempty Network settings

Commands: netstat -state, netstat -routinfo, netstat -routtable,
lsdev -dev, entX -attr, cfgnamesrv -ls, hostmap -ls, optimizenet
-list, and entstat -all entX
All physical and logical volume SCSI devices
Commands: lspv, lsvg, and lsvg -lv VolumeGroup
All physical and logical adapters
Commands: lsdev -type adapter
Code levels and users and security
Commands: ioslevel, viosecure -firewall -view, viosecure -view
-nonint, motd, loginmsg, and lsuser
lsmap command
The lsmap output does not gather information such as SEA adapter control channels (for
SEA failover), IP addresses to ping, and whether threading is enabled for the SEA devices.
These settings and any other changes that have been made (for example MTU settings)
must be documented separately. It is also vitally important to use the slot numbers as a
reference for the virtual SCSI and virtual Ethernet devices, not the vhost numbers or ent
numbers. The vhost and ent devices are assigned by the Virtual I/O Server as they are
found at boot time. If we add in more devices after subsequent boots, then these will be
sequentially numbered.
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Backing up user-defined disk device structures

User-defined disk structures are backed up by default when invoking the
backupios command (except with the -nosvg flag).
savevgstruct
Structure information goes to /tmp/vgdata
Automatically called by backupios (output is automatically backed up)
$ savevgstruct <volume group or storage pool>
restorevgstruct
Restore the volume group or storage pools structure.
$ restorevgstruct -ls
Usage: restorevgstruct {-ls | -vg VolumeGroupLabel [DiskName ...]}
Restores the user volume group.
-ls Displays a list of saved volume groups.
-vg Specifies the name of the volume group.
DiskName Specifies the names of disk devices to be
used instead of the disk devices listed in
the saved volume group.
Figure 9-32. Backing up user-defined disk device structures AN313.1
Notes:
This is good for disaster recovery or in situations where you are not restoring to the same
system or the disks. The user-defined volume group disk structures can be backed up
using the savevgstruct command. This command writes a backup of the structure of a
named volume group (and therefore storage pool) to the /tmp/vgdata directory and also
in /home/ios/backupvgs directory. For example, if we wanted to back up the structure
in the volgrp01 volume group, we would run the command as follows:
$ savevgstruct volgrp01
Creating information file for volume group volgrp01.
You must run the savevgstruct command for each volume group and storage pool present
on the system, and these must be active. Use the lsvg command to list all of the volume
groups on the system and the activatevg command if necessary.
The savevgstruct command is automatically called before the backup commences for all
active non-rootvg volume groups/storage pools on a Virtual I/O Server when the
backupios command is run. The data (a backup and restore format file) from this is written
in the /home/ios/vgbackups directory on the VIO Server. This information can be used
Instructor Guide
after a Virtual I/O Server restoration for rebuilding the non rootvg structure using the
restorevgstruct command.
In the previous version of the Virtual I/O Server (for example, version 1.3), the data was not
written correctly in the /home/ios/vgbackups directory but is still written in /tmp.
However, the /tmp file system is not backed up automatically when performing a
backupios. The /tmp directory is excluded by default from an mksysb backup, so you had
to enter the oem_setup_env shell and make a copy of the .data files into the
/home/ios/vgbackups directory. The following command could be used after all
savevgstruct command had been completed:
find /tmp/vgdata -name "*.data" -exec cp {} /home/ios/vgbackups/ \;
This is no longer needed if you are using Virtual I/O Server Version 1.4, 1.5, or Version 2.1.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Backing up the Virtual I/O Server

Backupios command
Tape backup
$ backupios -tape /dev/rmt0
DVD backup
$ backupios -cd /dev/cd0 -udf
File system backup

$ mount server5:/export/ios_backup /mnt
$ backupios -file /mnt (creates nim_resources.tar)
mksysb backup
$ backupios -file /mnt/VIOS_BACKUP.mksysb -mksysb
DVD-RAM backup
$ backupios -cd /dev/cd0 -udf -accept
Figure 9-33. Backing up the Virtual I/O Server AN313.1
Notes:
backupios command
The backupios command backs up the Virtual I/O Server operating system, and the user
defined virtual devices by default. The user-defined virtual devices also include the different
user-defined volume groups and storage pools. By default the backupios command
invokes the savevgstruct command to backup in /home/ios/vgbackups directory the
structure of any online user-defined volume group and storage pools. If you have setup a
virtual Media Repository on your Virtual I/O Server, then the savevgstruct will also backup
the content of your Media repository (the different virtual media). The savevgstruct invokes
the AIX savevg command.
Backing up on tape
The result of running the backupios as command with the -tape flag is shown.
Tape backup
$ backupios -tape /dev/rmt0

V5.4.0.3
Instructor Guide
Uempty
Creating information file (/image.data) for rootvg ..
Creating tape boot image .....
Creating list of files to back up.
Backing up 23622 files .
23622 of 23622 files (100%)
0512-038 mksysb: Backup Completed Successfully.
bosboot: Boot image is 26916 512 byte blocks.
bosboot: Boot image is 26916 512 byte blocks.
The result of this command is a bootable tape that allows an easy restore of the Virtual I/O
Server.
Backing up on DVD
There are two types of DVD media that can be used for backing up: DVD-RAM and DVD-R.
DVD-RAM media can support both -cdformat and -udf format, while DVD-R media only
supports the -cdformat. The DVD device cannot be virtualized and assigned to a client
partition when performing backups. Remove the device from the client and the virtual SCSI
mapping from the server before proceeding with the backup.
$ backupios -cd cd0 -udf
Creating information file for volume group data pool
Creating list of files to back up. Backing up six files
6 of 6 files (100%)
0512-038 savevg: Backup Completed Successfully.
Backup in progress. This command can take a considerable amount of time
to complete, please be patient
Initializing mkcd log: /var/adm/ras/mkcd.log ... Verifying command
parameters
Creating image.data file ...
Creating temporary file system: /mkcd/mksysb_image Creating mksysb image
Creating list of files to back up.
Backing up 27129 files
27129 of 27129 files (100%)
0512-038 mksysb: Backup Completed Successfully. Populating the CD or DVD
file system
Copying backup to the CD or DVD file system
Building chrp boot image
Removing temporary file system: /mkcd/mksysb_image
Backing up on a file system

The result of the backupios command is a backup image in the tar format. This file is
stored in the directory specified by the -file flag. Backing up to file.
Instructor Guide
Once again you use the backupios command, but the big difference here is that all of the
previous commands resulted in some form of bootable media that can be used to directly
recover the Virtual I/O Server. This command results in either a TAR file, which contains all
of the information needed for a restore, or a mksysb image, but both methods depend on
an installation server for restoration. The restoration server can be an HMC using the
Network Installation Manager on Linux facility and the installios command. Alternatively,
we would use an AIX Network Installation Management (NIM) server and a standard
mksysb system install. Both of these methods are covered later in the restore section.
If you are using the NIM server for the install, it must be running a level of AIX that can
support the Virtual I/O Server install. For this reason the NIM server should be running the
very latest technology level and service packs at all times.
You can use the backupios command to write to a local file on the Virtual I/O Server but
the more common scenario would be to perform a backup to a remote NFS file system, the
ideal situation might be onto the NIM server as the restore server. In the following example,
the NIM server has a host name of SERVER5 and the Virtual I/O Server is LPAR01.
The first step is to set up the NFS file system export on the NIM server. Here, we are going
to export a file system called /export/ios_backup, and in this case, the
/etc/exports looks similar to the following:
#more /etc/exports
/export/ios_backup
-sec=sys:krb5p:krb5i:krb5:dh,rw=lpar01.ilsvpn.atlanta.ibm.com,root=lpar0
1.ilsvpn.atlanta.ibm.com
#
The NFS server must have the root access (NFS attribute) set on the file system exported
to the Virtual I/O Server logical partition for the backup to succeed.
Make sure the name resolution is functioning between the NIM server and the Virtual I/O
Server for both IP and host name. To edit the name resolution on the Virtual I/O Server, use
the hostmap command to manipulate the /etc/hosts file or the cfgnamesrv command to
change the DNS parameters.
Examples
hostmap -addr 192.100.201.7 -host alpha bravo charlie
The IP address 192.100.201.7 is specified as the address of the host that has a primary
host name of alpha with synonyms of bravo and charlie.
To add a domain entry with a domain name of abc.aus.century.com, type:
cfgnamesrv -add -dname abc.aus.century.com
To add a name server entry with IP address 192.9.201.1, type:
cfgnamesrv -add -ipaddr 192.9.201.1

V5.4.0.3
Instructor Guide
Uempty The backup of the Virtual I/O Server can be fairly large, so make sure that the system
limits allow the creation of large enough files. With the NFS export and name resolution
set up, the file system needs to be mounted on the Virtual I/O Server.
$ mount server5:/export/ios_backup /mnt
$ mount
node mounted over vfs date options
-------- --------------- --------------- ------ ------------
/dev/hd4 / jfs2 Jun 27 10:48 rw,log=/dev/hd8
/dev/hd2 /usr jfs2 Jun 27 10:48 rw,log=/dev/hd8
/dev/hd9var /var jfs2 Jun 27 10:48 rw,log=/dev/hd8
/dev/hd3 /tmp jfs2 Jun 27 10:48 rw,log=/dev/hd8
/dev/hd1 /home jfs2 Jun 27 10:48 rw,log=/dev/hd8
/proc /proc procfs Jun 27 10:48 rw
/dev/hd10opt /opt jfs2 Jun 27 10:48 rw,log=/dev/hd8
server5.itsc.austin.ibm.com /export/ios_backup /mnt nfs3 Jun 27 10:57
$ backupios -file /mnt
Creating information file for volume group storage01.
to complete, please be patient...
$
The command above creates a full backup tar file package, including all of the resources
that the installios command needs to install a Virtual I/O Server (mksysb, bosinst.data,
network bootimage, and SPOT) from an HMC using the installios command. We cover the
restoration methods later in this unit, but it is possible to just create the mksysb backup of
the Virtual I/O Server as follows. At the current time the NIM server only supports the
mksysb restoration method. The mksysb backup of the Virtual I/O Server can be extracted
from the TAR file created in a full backup, so either method is appropriate if the restoration
method is to use a NIM server.
$ backupios -file /mnt/VIOS_BACKUP_27Jun2008_1205.mksysb -mksysb
/mnt/VIOS_BACKUP_27Jun2008_1205.mksysb doesn't exist.
Creating /mnt/VIOS_BACKUP_27Jun2008_1205.mksysb
Creating information file for volume group storage01.
to complete, please be patient...
Creating information file (/image.data) for rootvg.
Creating list of files to back up...
Backing up 45016 files...........................
45016 of 45016 files (100%)
0512-038 savevg: Backup Completed Successfully.
Both of these methods create a backup of the virtual I/O operating system, which we can
use to recover the Virtual I/O Server using either an HMC or a NIM server.
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Restoring the Virtual I/O Server (1 of 3)

Restoring HMC configuration
Restoring other IT infrastructure devices
Restoring the Virtual I/O Server operating system
Figure 9-34. Restoring the Virtual I/O Server (1 of 3) AN313.1
Notes:
Restoring HMC configuration

In the most extreme case of a natural or man-made disaster that has destroyed or
rendered unusable an entire data center, we might be forced to recover to a disaster
recovery site. In this case, we need another HMC and System p server located here to
recover our settings, too. If we are particularly organized, we also have a disaster recovery
server in place with our HMC profiles ready to start recovering our systems.
The detail of this is beyond the scope of this document but would, along with the next
section, be the first steps for a DR.
Restoring other IT infrastructure devices

All other IT infrastructure devices, such as network routers, switches, storage area
networks, and DNS servers, to name just a few, also need to be part of an overall IT
disaster recovery solution. You should be aware that restoring the Virtual I/O Server to full
Instructor Guide
functionality could include restoring configurations on external devices to which the VIO
Server communicates.
Restoring the Virtual I/O Server operating system

We should have a functioning IT infrastructure and also a System p server managed by an
HMC, which has a Virtual I/O Server defined with the same number of physical Ethernet
adapters and disks. At this point we need to restore the virtual I/O operating system. This is
the entry point for the restoration of the virtual I/O operating system, if it is required as part
of a standard backup and restore policy.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide

Restoring the Virtual I/O Server from CD or DVD backup
Restoring the Virtual I/O Server from tape backup
Restoring the Virtual I/O Server from the HMC

installios
Allows install from CD or remote path
Restoring the Virtual I/O Server using a NIM server

# nim -o define -t mksysb a server=master \
-a location=/export/ios_backup/VIOS_BACKUP_8AUG2008_1511.mksysb VIOS_mksysb
# nim -o define -t spot -a server=master -a location=/export/ios_backup/SPOT a

source=VIOS_mksysb VIOS_SPOT
Enable bos_inst operation
Network boot the client LPAR

Notes:
Restoring the Virtual I/O Server from CD or DVD backup
The backup procedure creates bootable media, which we can use to restore as
stand-alone backups. Insert the first disk from the set of backups into the optical drive and
boot the machine into SMS mode. Select to install from the optical drive and work through
the usual installation procedure. If the CD or DVD backup spanned multiple disks, then
during the install you are prompted to insert the next disk in the set with a similar message
to the following:
Please remove volume 1, insert volume 2, and press the Enter key.
For more details on this please see details on system backup and restore
in the IBM Information Center.
Restoring the Virtual I/O Server from tape backup
The procedure for the tape is very similar to the CD or DVD, as this is bootable media: just
place the backup media into the tape drive and follow the boot into SMS mode. Select to
install from the tape drive and follow the same procedure as shown in the visual.

V5.4.0.3
Instructor Guide
Uempty Restoring the Virtual I/O Server from the HMC

If we took a full backup to file (not a mksysb backup but a full one that creates the
nim_resources.tar file), then we can use the HMC to install these using the installios
command (currently the NIM server does not support the installios restore from the TAR file
and the mksysb restore method shown as follows should be used). The TAR file has to be
available on either a DVD or an NFS share; here we use the NFS method. Assuming the
directory that holds the nim_resources.tar file has been exported from an NFS file server,
we now log onto the HMC and run the installios command.
The trailing slash in the NFS location below - server5:/export/ios_backup/ must
be included in the command as shown below.
The configure the client's network setting must be set to No as shown below.
This is because the physical adapter in the backup could already be used by an SEA, and
the IP configuration fails if this is the case. Log in and configure the IP if necessary after the
install through a console session. If the command seems to be taking a long time to
restore, this is most commonly caused by a speed or duplex mis-configuration.
hscroot@server1:~> installios
The following objects of type managed system were found. Please select
one:
PowerVM520-01
Enter a number: 1
The following objects of type virtual I/O server partition were found.
Please select one:
VIOS_DR_Server
VIO_Server1
Enter a number (1-2): 1
The following objects of type profile were found. Please select one:
normal
Enter a number: 1
Enter the source of the installation images [/dev/cdrom]:
server5:/export/ios_backup/
Enter the client's intended IP address: 10.31.181.134
Enter the client's intended subnet mask: 255.255.252.0
Enter the client's gateway: 10.31.180.1
Enter the client's speed [100]:auto
Enter the client's duplex [full]:auto
Would you like to configure the client's network after the installation
[yes]/no? no
Retrieving information for available network adapters
This will take several minutes...
The following objects of type "ethernet adapters" were found. Please select
one:
Instructor Guide
1. ent U8204.E8A.652B042-V3-C31-T1 7e5717d18e1f /vdevice/l-lan@3000001f

virtual
2. ent U7311.D20.6516E1C-P1-C06-T1 001a6491857c
/pci@800000020000120/pci@2/ethernet@1 physical
3. ent U7311.D20.6516E1C-P1-C06-T2 001a6491857d
/pci@800000020000120/pci@2/ethernet@1,1 physical
Enter a number (1-3): 2
Here are the values you entered:
managed system = PowerVM520-01
virtual I/O server partition = VIOS_DR_Server
profile = normal
source = server5:/export/ios_backup/
IP address = 10.31.181.134
subnet mask = 255.255.255.0
gateway = 10.31.181.1
speed = 100
duplex = full
configure network = no
ethernet adapters = 00:09:6b:6e:84:58
Press enter to proceed or type Ctrl-C to cancel...
nimol_config MESSAGE: No NIMOL server hostname specified, using
server1.itsc.austin.ibm.com as the default.
Starting RPC portmap daemon
done
Starting kernel based NFS server
done
nimol_config MESSAGE: Added "REMOTE_ACCESS_METHOD /usr/bin/rsh" to the file
"/etc/nimol.conf"
nimol_config MESSAGE: Removed "disable = yes" from the file
"/etc/xinetd.d/tftp"
nimol_config MESSAGE: Added "disable = no" to the file "/etc/xinetd.d/tftp"
Shutting down xinetd:
done
Starting INET services. (xinetd)
done
nimol_config MESSAGE: Removed "SYSLOGD_PARAMS=" from the file
"/etc/sysconfig/syslog"
nimol_config MESSAGE: Added "SYSLOGD_PARAMS=-r " to the file
"/etc/sysconfig/syslog"
nimol_config MESSAGE: Removed "local2,local3.* -/var/log/localmessages" from
the file "/etc/syslog.conf"
nimol_config MESSAGE: Added "local3.* -/var/log/localmessages" to the file
"/etc/syslog.conf"
nimol_config MESSAGE: Added "local2.* /var/log/nimol.log" to the file

V5.4.0.3
Instructor Guide
Uempty "/etc/syslog.conf"
Shutting down syslog services
done
Starting syslog services
done
nimol_config MESSAGE: Executed /usr/sbin/nimol_bootreplyd -l -d -f
/etc/nimoltab -s server1.itsc.austin.ibm.com.
nimol_config MESSAGE: Successfully configured NIMOL.
nimol_config MESSAGE: target directory: /info/default5
nimol_config MESSAGE: Executed /usr/sbin/iptables -I INPUT 1 -s server5 -j
ACCEPT.
nimol_config MESSAGE: source directory: /mnt/nimol
nimol_config MESSAGE: Checking /mnt/nimol/nim_resources.tar for existing
resources.
nimol_config MESSAGE: Executed /usr/sbin/iptables -D INPUT -s server5 -j
ACCEPT.
nimol_config MESSAGE: Added "/info/default5 *(rw,insecure,no_root_squash)"
to the file "/etc/exports"
nimol_config MESSAGE: Successfully created "default5".
nimol_install MESSAGE: The hostname "lpar11.ilsvpn.atlanta.ibm.com" will be
used.
"/etc/nimol.conf"
nimol_install MESSAGE: Added
menu interface.
if_en: ns_alloc(en0) failed with errno = 19
Method error (/usr/lib/methods/chgif):
0514-068 Cause not known.
0821-510 ifconfig: error calling entry point for /usr/lib/drivers/if_en: The
specified device
does not exist.
0821-103 : The command /usr/sbin/ifconfig en0 inet 10.31.182.163 arp netmask
255.255.255.0 mtu 1500 up failed.
0821-007 cfgif: ifconfig command failed.
The status of "en0" Interface in the current running system is uncertain.
0821-103 : The command /usr/lib/methods/cfgif -len0 failed.
0821-510 ifconfig: error calling entry point for /usr/lib/drivers/if_en: The
specified device does not exist.
0821-103 : The command /usr/sbin/ifconfig en0 inet 10.31.182.163 arp netmask
255.255.255.0 mtu 1500 up failed.
0821-229 chgif: ifconfig command failed.
mktcpip: Problem with command: chdev , return code = 1
Instructor Guide

Restoring the Virtual I/O Server using a NIM server
The installios command is also available on the NIM server but at present, it only supports
installs from the base media of the Virtual I/O Server. The method we use from the NIM
server is to install the mksysb image, which can either be the mksysb image generated with
the -mksysb flag in the backupios command above, or we can extract the mksysb from the
nim_resources.tar file.
When we have the mksysb on the NIM server, we must create a NIM object that identifies it
as a NIM resource:
# nim -o define -t mksysb -a server=master
-a location=/export/ios_backup/VIOS_BACKUP_27Jun2008_1205.mksysb
VIOS_mksysb
# lsnim VIOS_mksysb
VIOS_mksysb resources mksysb
To avoid restore problems, you should create a spot from the VIOS mksysb.
When we have the mksysb identified in the NIM database as a resource, we can generate
a SPOT from the mksysb using the following:
# nim -o define -t spot -a server=master -a location=/export/ios_backup/SPOT
a source=VIOS_mksysb VIOS_SPOT
Creating SPOT in "/export/ios_backup/SPOT" on machine "master" from

"VIOS_mksysb"
...
Restoring files from BOS image. This may take several minutes...
# lsnim VIOS_SPOT
VIOS_SPOT resources spot

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide

Install the Base Operating System on Standalone Clients
Type or select values in entry fields.

Press Enter AFTER making all desired changes.
[TOP] [Entry Fields]

* Installation Target lpar1
* Installation TYPE mksysb
* SPOT VIOS_SPOT
LPP_SOURCE [] +
MKSYSB VIOS_mksysb
BOSINST_DATA to use during installation [] +

IMAGE_DATA to use during installation [] +
RESOLV_CONF to use for network configuration [] +
Customization SCRIPT to run after installation [] +
Customization FB Script to run at first reboot [] +
ACCEPT new license agreements? [no] +
Remain NIM client after install? [no]
PRESERVE NIM definitions for resources on [yes] +
this target?
FORCE PUSH the installation? [no] +
Initiate reboot and installation now? [yes] +

-OR-
Set bootlist for installation at the [no] +
next reboot?
Additional BUNDLES to install [] +

-OR-
Additional FILESETS to install [] +
(bundles will be ignored)
installp Flags
COMMIT software updates? [yes] +
SAVE replaced files? [no] +
AUTOMATICALLY install requisite software? [yes] +
EXTEND filesystems if space needed? [yes] +
OVERWRITE same or newer versions? [no] +
VERIFY install and check file sizes? [no] +
ACCEPT new license agreements? [no] +
(AIX V5 and higher machines and resources)
Preview new LICENSE agreements? [no] +
Group controls (only valid for group targets):

Number of concurrent operations [] #
Time limit (hours) [] #
Schedule a Job [no] +

YEAR [] #
MONTH [] +#
DAY (1-31) [] +#
HOUR (0-23) [] +#
MINUTES (0-59) [] +#
Notes:
With the SPOT and the mksysb image defined to NIM, we can now install the Virtual I/O
Server from the backup. If the machine or LPAR we are to install is not defined in NIM,
create a NIM machine object to identify it. Then use smitty nim_bosinst fastpath to enable
the base operating system installation process.
Note that the Remain NIM client after install field here is set to no. If this is not set to no,
then the last step for the NIM install is to configure an IP address onto the physical adapter
used to install the Virtual I/O Server from the NIM server. If this is the adapter used by the
shared Ethernet adapter, it will cause some error messages similar to those shown below.

V5.4.0.3
Instructor Guide
Uempty If this is the case, reboot the Virtual I/O Server, log on to the Virtual I/O Server through the
terminal and remove the IP address information and SEA adapter and recreate them:
inet0 changed
Method error (/usr/lib/methods/chgif):
0514-068 Cause not known.
0821-510 ifconfig: error calling entry point for /usr/lib/drivers/if_en:
The
specified device does not exist.
0821-103: The command /usr/sbin/ifconfig en0 inet 10.31.182.163 arp
netmask
255.255.255.0 mtu 1500 up failed.
0821-007 cfgif: ifconfig command failed.
0821-103 : The command /usr/lib/methods/cfgif -len0 failed.
0821-510 ifconfig: error calling entry point for /usr/lib/drivers/if_en:
The specified device does not exist.
0821-103 : The command /usr/sbin/ifconfig en0 inet 10.31.182.163 arp
netmask
255.255.255.0 mtu 1500 up failed.
0821-229 chgif: ifconfig command failed.
mktcpip: Problem with command: chdev
, return code = 1
Now that we have set up the NIM server to push out the backup image, the Virtual I/O
Server LPAR needs to have the remote IPL setup completed; the procedure for this can be
found in the AIX Installation in a Partitioned Environment guide found in the Infocenter at:
http://publib16.boulder.ibm.com/pseries/index.htm
The install of the Virtual I/O Server should complete, but in this case there is a big
difference between restoring to the existing server or restoring to a new disaster recovery
server. One of the NIM install options is to recover devices. With this option, any virtual
devices that were created on a server will be recreated exactly as they were, providing the
restoration occurs to the same server. This means that virtual target SCSI devices and
shared Ethernet adapters should all be recovered without any need to recreate them.
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Using cron
Available in Virtual I/O Server 1.3 (and above) to schedule
tasks
crontab e
/usr/ios/cli/ioscli
Example:
30 21 * * * /usr/ios/cli/ioscli mount
eservernim:/export/nim/mksysb /mnt
31 21 * * * /usr/ios/cli/ioscli backupios file
/mnt/vios_backup -mksysb
Figure 9-37. Using cron AN313.1
Notes:
The cron function was introduced to the padmin shell in Virtual I/O Server Version 1.3.
However, many commands fail when executed from within a cron job. Failing commands
include mount and backupios, even though they work fine when executed from the CLI.
/home/padmin/.profile is an important part of the CLI, because it changes the path
and aliases many commands. The commands fail because cron does not read the users
.profile.
You can successfully execute the commands from within a cron job by using the
/usr/ios/ioscli command.
With Virtual I/O Server Version 1.3 and later, the crontab command is available to allow
you to submit, edit, list, or remove cron jobs. A cron job is a command run by the cron
daemon at regularly scheduled intervals, such as system tasks, nightly security checks,
analysis reports, and backups.
Instructor Guide
With the Virtual I/O Server, a cron job can be submitted by specifying the crontab
command with the -e flag. The crontab command invokes an editing session that allows
you to modify the padmin users crontab file and create entries for each cron job in this file.
When you finish creating entries and exit the file, the crontab command copies it into the
/var/spool/cron/crontabs directory and places it in the padmin file.
When scheduling jobs, use the padmin users crontab file. The creation or editing of other
users crontab files is not supported.
The following syntax is available to be used by the crontab command:
crontab [ -e padmin | -l padmin | -r padmin | -v padmin ]
-e padmin: Edits a copy of the padmins crontab file. When editing is complete, the file
is copied into the crontab directory as the padmin's crontab file.
-l padmin: Lists padmin's crontab file.
-r padmin: Removes the padmins crontab file from the crontab directory.
-v padmin: Lists the status of the padmin's cron jobs.

V5.4.0.3
Instructor Guide

Purpose
Details
Transition statement Virtual optical devices can be used to backup client logical
partitions mksysb.
Instructor Guide
Backing up the client LPAR to virtual DVD

Can use virtual DVD to back up the client LPAR operating
system
Virtual DVD backed by a physical DVD drive
DVD RAM supported
Virtual DVD backed by a file
Much faster than backing up on physical drives
Need a virtual media repository on the Virtual I/O Server
Create a blank DVD in r/w mode using the mkvopt command
$ mkvopt -name <dvd name> -size <size>
Create a virtual optical device for your partition using the mkvdev command
$ mkvdev -fbo -vadapter vhostx
Load the blank DVD into the virtual optical device using the loadopt
command
$ loadopt -vtd vtoptx (vtoptx is the virtual target device - file backed
optical)
Perform the backup using mkcd command (or smitty mkdvd)
$ /usr/sbin/mkcd -d /dev/cd0 -v rootvg -A -U
Figure 9-38. Backing up the client LPAR to virtual DVD AN313.1
Notes:
Since the Virtual I/O Server Version 1.5, it is possible to create a single container to store
and manage file-backed virtual media files. This container is named the virtual media
repository. You can have only one virtual media repository per Virtual I/O Server.
You can create virtual media files and store them in this repository. These media stored can
be loaded into file-backed virtual optical devices for exporting to client logical partitions.
The slide describes the way to create a blank virtual optical media that can be used to
backup an client lpars mksysb image. From the client logical partition, you can use the
mkdvd or mkcd command to create a system backup image (mksysb) to DVD-RAM
backed by a file.
Restoring your bootable image from a file-backed virtual optical device
First you need to load the virtual DVD containing the bootable image into the virtual optical
device (this is the same action you would have done with physical DVD reader) using the
loadopt command.
Boot your logical partition in SMS mode and start restoring the image.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Managing virtual SCSI device capacity

For rootvg volume group
Increasing a root volume group by extending its associated Virtual I/O Server
logical volume is not supported.
Need to add a virtual SCSI disk then extend rootvg from the client partition
For non-rootvg (VIOS 1.3 or later)
Logical volumes based on volume groups:
$ extendlv
Logical volumes based on storage pools:
$ chbdsp -sp <storage pool> -bd <logical volume> -size <new size>
Files (VIOS 1.5 or later)
$ chbdsp -sp <storage pool> -bd <file> -size <new size>
On the client logical partition perform a chvg
# chvg g <volume group>
For non-root volume groups (pre VIOS 1.3)
Increasing a non-root volume group by extending its associated Virtual I/O Server
logical volume is a disruptive method.
Reconfigure the virtual device or restart the Virtual I/O Server.
Figure 9-39. Managing virtual SCSI device capacity AN313.1
Notes:
Increase the size of virtual SCSI disks

As storage demands increase for virtual client logical partitions, you can add physical
storage to increase the size of your virtual devices and allocate that storage to your virtual
environment. You can increase the capacity of your virtual SCSI devices by increasing the
size of physical or logical volumes. With Virtual I/O Server version 1.3 and later, you can do
this without disrupting client operations. To increase the size of files and logical volumes
based on storage pools, the Virtual I/O Server must be at version 1.5 or later.
Increasing the size of a volume group on a client logical partition can be done by adding an
extra virtual SCSI disk, then to extend the volume group to this new Virtual SCSI disk. This
method is not disruptive, as you dont need to varyoff the volume group on the client
partition, and it is the only supported method for root volume groups.

V5.4.0.3
Instructor Guide
Uempty Increasing the size of the backing device at the Virtual I/O Server (before
Version 1.3)
If you have a Virtual I/O Server Version 1.2 (pre VIOS 1.3) then you can increase the
associated logical volume of non-root volume group, but you must first varyoff the volume
group to unconfigure, and re-configure the associated virtual target device. When varying
on the non-root volume group on the client partition, you must execute a chvg command to
activate the new added space.
The chvg command can be used to set the characteristics of a volume group. The specific
-g parameter examines all of the disks in the volume group to see whether they have grown
in size. If any disks have grown try to add additional physical partitions to the physical
volume.
Increasing the size of the backing device at the Virtual I/O Server (VIOS
1.3 and later)
With Virtual I/O Server Version 1.3 (and later), changing the backing device logical
volumes size, or files size is non-disruptive. After you change its size, you must run chvg
-g <volume group name> at the client partition.
Instructor Guide
Instructor notes:
Purpose Show how to manage virtual SCSI disk.
Details Increasing non rootvg volume groups in the client partition is dynamic. A chvg
command has to be performed on the client lpar to recognize the new virtual SCSI disk
size. VIOS 1.5 is required for exporting files as virtual SCSI disk devices.

V5.4.0.3
Instructor Guide
Uempty
Processor recovery: Partition availability priority

In case of processor failure and no unassigned processors are
available:
The partition can acquire a replacement processor from logical partition with a
lower availability priority.
Allow the logical partition with the higher partition-availability priority to continue
running after a processor failure.
Server
configuration
Availability
priority update
Figure 9-40. Processor recovery: Partition availability priority AN313.1
Notes:
Partition availability introduction

When the server firmware detects that a processor is about to fail, it can deconfigure the
failing processor automatically. When the server firmware deconfigures a failing processor,
and there are no unassigned processors available on the managed system, the processor
deconfiguration can cause the logical partition to which the processor is assigned to shut
down.
To avoid shutting down mission-critical workloads when your server firmware deconfigures
a failing processor, you can use the Hardware Management Console (HMC) to set
partition-availability priorities for the logical partitions on your managed system. A logical
partition with a failing processor can acquire a replacement processor from logical
partitions with a lower partition-availability priority. The acquisition of a replacement
processor allows the logical partition with the higher partition-availability priority to continue
running after a processor failure.
Instructor Guide
Acquisition of a replacement processor mechanism

When a processor fails on a high-priority logical partition, the managed system follows
these steps to acquire a replacement processor for the high-priority logical partition.
If there are unassigned processors on the managed system, the managed system
replaces the failed processor with an unassigned processor.
If there are no unassigned processors on the managed system, the managed system
checks the logical partitions with lower partition-availability priorities, starting with the
lowest partition-availability priority.
If a lower-priority logical partition uses dedicated processors, the managed system
shuts down the logical partition and replaces the failed processor with one of the
processors from the dedicated-processor partition.
If a lower-priority logical partition uses shared processors, and removing a whole
processor from the logical partition would not cause the logical partition to go below its
minimum value, the managed system removes a whole processor from the
shared-processor partition using dynamic logical partitioning and replaces the failed
processor with the processor that the managed system removed from the
shared-processor partition.
If a lower-priority logical partition uses shared processors, but removing a whole
processor from the logical partition would cause the logical partition to go below its
minimum value, the managed system skips that logical partition and continues to the
logical partition with the next higher partition availability.
If the managed system still cannot find a replacement processor, the managed system
shuts down as many of the shared-processor partitions as it needs to acquire the
replacement processor. The managed system shuts down the shared-processor
partitions in partition-availability priority order, starting with the lowest
partition-availability priority.
A logical partition can take processors only from logical partitions with lower
partition-availability priorities. If all of the logical partitions on your managed system
have the same partition-availability priority, then a logical partition can replace a failed
processor only if the managed system has unassigned processors.

V5.4.0.3
Instructor Guide

Purpose To discuss Virtual I/O Server partition availability priority.
Details By default, the partition availability priority of Virtual I/O Server logical partitions
is set to 191. The partition-availability priority of all other logical partitions is set to 127 by
default. Do not set the priority of Virtual I/O Server logical partitions to be lower than the
priority of the logical partitions that use the resources on the Virtual I/O Server logical
partition.
Transition statement The Power Saver mode is new on POWER6 processor-based
systems.
Instructor Guide
Power saver mode

Power saving
Power saver mode drops voltage and frequency by a fixed percentage.
Under current implementation, this is 14%.
One possible use for power saver would be to enable it when workloads are
minimal.
It can also be used to reduce the peak energy consumption.
Firmware and hardware power saver mode capability
Power saver mode is only supported with 4.0 GHz processors and faster.
User can enable or disable power saver mode through HMC or ASMI
Alternatively through the chpwrmgmt command
To enable or disable power saver mode:
chpwrmgmt -m <managed system name> -r sys -o enabled | disabled
To list power saver mode configuration:
lspwrmgmt -m <managed system name> -r sys
Figure 9-41. Power saver mode AN313.1
Notes:
Power saver mode

Power saver mode provides a way to save power by dropping the voltage and frequency a
fixed percentage. This percentage is pre-determined to be within a safe operating limit and
is not user configurable. Under current implementation, this is a 14% frequency drop.
Active Energy Manager is the recommended user interface to enable/disable Power Saver
mode. One possible use for Power Saver would be to enable it when workloads are
minimal, such as at night, and then disable it in the morning.
Power Saver can be used to reduce the peak energy consumption, which can lower the
cost of all power used. At low CPU utilization, the use of Power Saver increases processor
utilization such that the workload notices no performance impact. Depending on workload,
this can reduce the processor power usage by 20-30%.
When the power saver mode is activated, the lparstat command output reports the current
average processor speed as a percentage of nominal speed.

V5.4.0.3
Instructor Guide
Uempty Here is an example:

{lpar1:root}/ # lparstat 2
System configuration: type=Shared mode=Uncapped smt=On lcpu=2 mem=1536MB
psize=4 ent=0.10
%user %sys %wait %idle physc %entc lbusy app vcsw phint %nsp
--------- --------- --------- -------- -------- --------- --------
------- ------- ------- --------
99.8 0.2 0.0 0.0 0.86 860.5 100.0 2.98 204 2 86
99.9 0.1 0.0 0.0 0.86 859.2 100.0 2.98 205 0 86
99.9 0.1 0.0 0.0 0.86 856.4 100.0 2.91 204 1 86
99.8 0.2 0.0 0.0 0.86 859.1 100.0 2.97 205 0 86
IBM EnergyScale technology

The Power Saver mode is available on POWER6 processor-based systems and is one of
the features available in the EnergyScale technology. EnergyScale provides functions that
help the user to understand and control IBM server power and cooling usage. This enables
better facility planning, provides energy and cost savings, enables peak energy usage
control, and increases system availability. The following EnergyScale features are
available:
Power saver mode: Power saver mode provides a way to save power by dropping the
voltage and frequency a fixed percentage.
Power Trending: provides continuous power usage data collection. The power usage
data can be displayed from the IBM Systems Director Active Energy Manager.
Power Capping: Power Capping enforces a user specified limit on power usage. The
user must set and enable a power cap from the Active Energy Manager user interface.
Processor Code Nap: The IBM POWER6 processor uses a low-power mode called Nap
that stops processor execution when there is no work to do on that processor core.
EnergyScale for I/O: IBM POWER6 processor-based System i and System p models
automatically power off pluggable, PCI adapter slots that are not being used to save
approximately 14 watts per slot.
Power saver mode capability

HMC offers a display of power saver mode capabilities, enable or disable of power saver
mode, and scheduling of power saver mode. Active Energy Manager and IBM Director can
also configure or query the EnergyScale related information through HMC. The HMC
provides two capabilities for Power Saver mode: firmware and hardware power saver mode
capabilities. Power saver mode is the only EnergyScale feature supported on ASMI and
HMC, as Active Energy Manager is the preferred user interface.
Instructor Guide
Firmware power saver mode capability: This capability indicates the firmware loaded on
the server is capable of performing Power Saver mode functions, but does not
necessarily imply the same for the underlying hardware.
Hardware power saver mode capability: This capability indicates whether the server
hardware supports the power saver mode function. For example, POWER6
processor-based systems with a nominal operating frequency of < 4.0 GHz do not
support power saver functions at the hardware level, even if the installed firmware does.
Enable and disable power saver mode

User can enable or disable power saver mode through the HMC enable/disable power
saver mode task. However, the new power saver mode might not take effect immediately.
Normally, if the operation is performed before the system is powered on, the desired mode
won't change to the current until the system is up and running. If the power saver mode is
in transition, any changes will be blocked. With the HMC GUI, users can reach this task by
selecting the managed system > Operations > Power Management.

V5.4.0.3
Instructor Guide

Purpose
Details EnergyScale enables better facility planning, provides energy and cost savings,
enables peak energy usage control, and increases system availability. The client can
leverage EnergyScale capability to customize the power consumption of their POWER6
processor-based system and tailor it to their particular datacenter needs.
To optimize savings by consolidating applications on fewer physical servers, energy
management features have been implemented to provide the most efficient computing
environment.
With these technologies, systems use less power, generate less heat, and use less energy
to cool the system.
For more information on power saver mode and EnergySacle technology, refer to:
http://www-03.ibm.com/systems/power/hardware/whitepapers/energyscale.html
For more informations on Active Energy Manager, refer to:
http://www.ibm.com/systems/management/director/extensions/actengmrg.html
Instructor Guide
Hot pluggable devices (1 of 2)

diagmenu command
FUNCTION SELECTION 801002
Move cursor to selection, then press Enter.
Diagnostic Routines
other advanced functions will not be used.
Advanced Diagnostics Routines
other advanced functions will be used.
TASKS SELECTION LIST 801004
Task Selection (Diagnostics, Advanced Diagnostics, Service Aids, etc.)
This selection will list the tasks supported by these procedures.
Once a task is selected, a resource menu may be presented showing
all resources supported by the task. From the list below, select a task by moving the cursor to
Resource Selection the task and pressing 'Enter'.
This selection will list the resources in the system that are supported
To list the resources for the task highlighted, press 'List'.
by these procedures. Once a resource is selected, a task menu will
be presented showing all tasks that can be run on the resource(s).
[MORE...20]
Display Resource Attributes
Display Service Hints
Display Software Product Data
F1=Help F10=Exit F3=Previous Menu
Display or Change Bootlist
Format Media
Gather System Information
Hot Plug Task
Identify and Attention Indicators
Local Area Network Analyzer
Log Repair Action
Microcode Tasks
RAID Array Manager
[MORE...4]
F1=Help F4=List F10=Exit Enter

F3=Previous Menu
Figure 9-42. Hot pluggable devices (1 of 2) AN313.1
Notes:
Hot pluggable devices

Similar to AIX, the VIOS includes a feature that accepts hot plugging devices, such as
disks and PCI adapters, into the server, and activating them for the partition without a
reboot. Prior to starting, an empty system slot must be assigned to the VIOS partition on
the HMC. This task can be done through dynamic LPAR operations, but the VIOS partition
profile must also be updated so that the new adapter is configured to the VIOS after a
reactivation.
To begin, use the diagmenu command to get into the VIOS diagnostic menu. This menu is
very similar to the AIX V5.3 diagnostic menu and gives you the same four options at the
beginning screen:
Diagnostic Routines
Advanced Diagnostic Routines
Task Selection
Resource Selection

V5.4.0.3
Instructor Guide
Uempty The Hot Plug Tasks selection is under the Task Selection option of the menu. Under this
menu selection, the choice of PCI hot plug tasks, RAID hot plug devices, and the SCSI and
SCSI RAID hot plug manager are presented.
The PCI Hot Plug Manager menu is used for adding, identifying, or replacing PCI adapters
in the system that are currently assigned to the VIOS. The RAID hot plug devices option is
used for adding RAID enclosures that are connected to a SCSI RAID adapter. The SCSI
and SCSI RAID manager menu is used for disk drive addition or replacement and SCSI
RAID configuration.
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Hot pluggable devices (2 of 2)
Hot Plug Task 801004
Move cursor to desired item and press Enter.
PCI Hot Plug Manager

RAID Hot Plug Devices
SCSI and SCSI RAID Hot Plug Manager
F1=Help F2=Refresh F3=Cancel F8=Image

F9=Shell F10=Exit Enter=Do
PCI Hot Plug Manager
Move cursor to desired item and press Enter.
List PCI Hot Plug Slots

Add a PCI Hot Plug Adapter
Replace/Remove a PCI Hot Plug Adapter
Identify a PCI Hot Plug Slot
Unconfigure a Device
Configure a Defined Device
Install/Configure Devices Added After IPL
F1=Help F2=Refresh F3=Cancel

F8=Image
F9=Shell F10=Exit Enter=Do
Figure 9-43. Hot pluggable devices (2 of 2) AN313.1
Notes:
Instructor Guide
Instructor notes:
Purpose
Details
Transition statement Lets review the key concepts of this unit.

V5.4.0.3
Instructor Guide
Uempty
Checkpoint
1. Which command can be used to display the code version of
the HMC?
2. Your currently installed firmware level is EL320_59_031, and

the new service pack is EL320_061_031. Is this disruptive?
3. In the service partition, what command is used to apply the

updated system firmware?
4. List the commands that can be used to back up and restore

the volume group data structures.
5. If you resize a backing device at the VIO Server version 1.3

or later, what commands must be executed at the client
partition to use the additional space?
Notes:
Instructor Guide
Instructor notes:
Purpose
Details
1. Which command can be used to display the code version of the HMC?
The answer is lshmc -V.
2. Your currently installed firmware level is EL320_59_031, and the new service pack
is EL320_061_031. Is this disruptive?
The answer is no.
3. In the service partition, what command is used to apply the updated system
firmware?
The answer is update_flash.
4. List the commands that can be used to back up and restore the volume group data
structures.
The answers are savevgstruct and restorevgstruct.
5. If you resize a backing device at the VIO Server version 1.3 or later, what
commands must be executed at the client partition to use the additional space?
The answer is chvg -g.

V5.4.0.3
Instructor Guide
Uempty
Exercise
Unit
exerc
ise
Notes:
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Unit summary
Single and dual VIO configurations
updateios command
backupios command (file, tape, CD, and DVD)
Backup tape, DVD, and tar file
Back up client partitions operating system to virtual DVD
diagmenu command: VIOS diagnostic menu
Notes:
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty Unit 10. Virtualization management tools
Estimated time
01:00

This unit describes different virtualization management tools that can
be used to manage and monitor a POWER virtualized environment.
The purpose of this unit is not to describe in detail all of the available
tools, but rather to give the students a list of tools.

Describe standard AIX/virtualization monitoring tools
Identify freeware monitoring tools
Describe virtualization management and monitoring tools, such as:
- IBM Systems Director
- IBM Tivoli Monitoring

References
SG24-7590-01 IBM PowerVM Virtualization Managing and Monitoring
redbook
Copyright IBM Corp. 2010, 2011 Unit 10. Virtualization management tools 10-1
Instructor Guide
Unit objectives
Describe virtualization management and monitoring tools, such
as
IBM Systems Director
IBM Tivoli Monitoring
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
Utilization data management (1 of 3)

Displays events related to resource utilization or state changes
Managed system must be configured for data to be sampled
Change the setting for utilization data

collection
Use chlparutil to change the
sampling rate for the managed system
at the command line
lslparutil can be used at the command line to view/gather data.

Figure 10-2. Utilization data management (1 of 3) AN313.1
Notes:
Utilization data management application
HMCs can collect utilization data for any managed system. To collect this information, you
need to select the managed system you want to monitor, then change the utilization
frequency for retrieving data. This can be every 30 seconds, 60 seconds, every 5 minutes,
every 30 minutes, or every hour. By default, the utilization retrieval rate is set to 0 and the
recording is set to disable.
Utilization data is collected into records called utilization events and include information
about the states of the HMC, partitions, and managed systems and utilization of processor
and memory resources. Events can be viewed at periodic intervals (hourly, daily, monthly,
and snapshot) by selecting Operations > Utilization Data > View.
Periodic intervals and maximum number of events
Hourly: Hourly periodic utilization events, system-level state and configuration change
utilization events, partition-level state and configuration change utilization events, and
HMC start, shutdown, and time-change utilization events

V5.4.0.3
Instructor Guide
Uempty Daily: Daily periodic utilization events

Monthly: Monthly periodic utilization events
Snapshot: Snapshot periodic utilization events, system-level state and configuration
change utilization events, partition-level state and configuration change utilization
events, and Hardware Management Console (HMC) start, shutdown, and time-change
utilization events
You can also indicate the maximum number of utilization events that you want to display
between the times indicated in a Beginning time/date and an Ending time/date fields.
Events view
Utilization events can contain HMC, managed system, or LPARs configuration changes but
also the utilization status of the following:
System: Displays information about the processor and memory utilization on the
managed system as a whole.
LPAR: Displays information about the processor and memory utilization on each logical
partition in the managed system.
Physical processor pool: Displays information about the total processor utilization
within all shared processor pools on the managed system.
Shared processor pool: Displays information about the processor utilization within
each configured shared processor pool on the managed system.
Shared memory pool: Displays information about the memory utilization within the
shared memory pool on the managed system.
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty

Sampling event example (logical partition utilization)
LPARs utilization
percentage
Sampling event example (physical processor pool utilization)

Physical pool
utilization
percentage
Notes:
Logical partition utilization table
This table displays the utilization data captured for each logical partition on the managed
system at the indicated date and time. Each line contains the information for a single logical
partition on the managed system.
The information contained in each column is as follows:
Partition (ID): Displays the name of each logical partition and the ID number.
Processor mode: Displays the processor mode of each logical partition. Valid values
are Dedicated or Shared).
Processing units: Displays the number of processing units committed to each logical
partition.
Current processors: Displays the number of dedicated processors committed to each
logical partition.
Instructor Guide
Utilized processing units: Displays the number of processing units that were utilized
by the logical partition since the previous sample. The utilization expressed as a
percentage of the number of processing units assigned to the logical partition is also
displayed. The utilization can be greater than the number of processing units assigned
to the logical partition (or 100%) if the sharing mode of the logical partition is uncapped.
This information is not available for logical partitions that use dedicated processors.
Physical processor poll utilization
Processing units: Displays the number of processing units in all shared processor
pools that were configurable for partition usage. This number includes processing units
that were assigned to all partitions in shared processor pools.
Processor utilization: Displays the number of processing units that were utilized by all
partitions in shared processor pools since the previous sample. The utilization
expressed as a percentage of the number of configurable processing units in all shared
processor pools is also displayed. The utilization can be greater than the number of
configurable processing units in all shared processor pools (or 100%) if shared
processor partitions are using processing cycles that belong to dedicated processor
partitions.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide

SharedPool01
utilization
percentage
Shared memory
pool utilization
Shared memory pool
Notes:
Shared processor pool utilization
This table displays the utilization data captured for each configured shared processor pool
on the managed system at the indicated date and time. Each row in the table contains the
information for a single configured shared processor pool on the managed system. The
information contained in each column is as follows:
Shared processor pool (ID): Displays the name and ID of the shared processor pool.
Processing units: Displays the number of processing units assigned to each
configured shared processor pool on the managed system.
Processor utilization: Displays the percent of entitled processing time that the logical
partitions using each configured shared processor pool have used since the last time
that the managed system was powered on or restarted. This is the utilized pool cycles
time divided by the total pool cycles time, and this is expressed as a percent value. This
figure can be greater than 100% if the sharing mode of the logical partition is uncapped

V5.4.0.3
Instructor Guide
Uempty and the logical partition is using processing time that belongs to dedicated logical
partitions or logical partitions that use other shared processor pools.
Shared memory pool utilization table
The information displayed here allows you to see the utilization of the shared memory pool
at the indicated date and time.
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Normal monitoring tools (1 of 5)

iostat, vmstat, sar, topas, and so on
Use Performance Utilization Resource Register (PURR)
Use the Scaled PURR (SPURR) if the hardware supports it
Support POWER6 features
topas cross partition monitor output
topas -C (topas -cecdisp on vio server)
topas D
topas -@
lparstat
Operational modes: Monitoring, Hypervisor summary, Hypervisor calls, operating
system configuration
Monitor multiple shared processor pools, dedicated partition in donating mode.
mpstat
Logical CPU statistics
Voluntary and involuntary logical context switching (ilcs and vlcs)
Multiple shared processor pools, dedicated partition in donating mode
Figure 10-5. Normal monitoring tools (1 of 5) AN313.1
Notes:
POWER6 provides a Scaled Processor Utilization Resource Register (SPURR).
Measurement of processor time is dynamically scaled, based on the current degree of
throttling or frequency skewing. AIX 5.3 TL7 and AIX 6.1 will support accurate process
accounting based on the SPURR in the face of processor throttling or TPMD induced
processor frequency slewing.
The POWER5/POWER6 family of processors implement a performance-specific register
called the Process Utilization Resource Register (PURR). The PURR tracks the real
processor resource usage on a per thread or per partition level. The AIX 5L performance
tools have been updated in AIX 5L V5.3 to reflect the new statistics.
The PURR is simply a 64-bit counter with the same units for the timebase and decrementer
registers that provide per-thread processor utilization statistics. Each POWER5/POWER6
processor (core) has two hardware threads associated. With SMT enabled, each hardware
thread is seen as a logical processor.
The timebase register is simply a hardware register that is incremented at each tic. The
decrementer register provides periodic interrupts. A simple way to look at it would be to
Instructor Guide
consider that at each processor clock cycle, one of the PURRs is incremented. It will be for
the thread dispatching instructions or the thread that last dispatched an instruction.
The sum of the two PURRs equals the value in the timebase register. This approach is an
approximation, as SMT allows both threads to run in parallel. It simply provides a
reasonable indication of which thread is making use of the POWER5 resources.
The AIX tools that provide system wide information, such as the iostat, vmstat, sar, and
time commands, use the PURR-based statistics whenever SMT is enabled for the %user,
%system, %iowait, and %idle figures.
When executing on a shared-processor partition, these commands add two extra columns
of information with:
Physical processor consumed by the partition, shown as pc or %physc.
Percentage of entitled capacity consumed by the partition, shown as ec or %entc.
# iostat -t 2 4
0.0 19.3 8.4 77.6 14.0 0.1 0.5 99.9
0.0 83.2 9.9 75.814.2 0.1 0.5 99.5
0.0 41.1 9.5 76.413.9 0.1 0.5 99.6
0.0 41.0 9.4 76.4 14.1 0.0 0.5 99.7
# sar -P ALL 2 2
AIX vio_client2 3 5 00CC489E4C00 08/17/05
20:13:50 0 19 71 0 9 0.31 61.1
1 2 75 0 23 0.19 38.7
- 13 73 0 15 0.50 99.8
20:13:52 0 21 69 0 9 0.31 61.1
1 2 75 0 23 0.20 39.0
- 14 71 0 15 0.50 100.2
Average 0 20 70 0 9 0.31 61.1
1 2 75 0 23 0.19 38.9
- 13 72 0 15 0.50 100.0
A cross-partition view of system resources is available with the topas C command. At this
time, this command will only see partitions running AIX 5L V5.3 TL3 or later; Virtual I/O
Server with version 1.3 or later is supported also.
The topas command also has a new -D switch or D command to show disk statistics that
take virtual SCSI disks into account.
The mpstat command collects and displays performance statistics for all logical CPUs in a
partition. When the mpstat command is invoked, it displays two sections of statistics. The
first section displays the system configuration, which is displayed when the command
starts and whenever there is a change in the system configuration. The second section
displays the utilization statistics, which will be displayed in intervals and at any time the
values of these metrics are deltas from a pervious interval.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide

topas recording and reporting:
topasrec can be used to record local system data and cross LPAR data in binary
format.
Start recording:
# topasrec L o <output file> (local data)
# topasrec -C o <output file> (cross partitions data)
Use smitty Stop_Recording to stop topas recording.
Listing running recording:
# topasrec l
Pid User Fmt Start_time Path Status
245850 root bin 12:58:23,Oct06,2009 /usr/lpp/perfagent/
Running
Summary report:
# topasout -R summary <output file>
Detailed report:
# topasout -R detailed /etc/perf/topas_cec.080816
Notes:
topasrec
topasrec is a new tool available since October 2008 that generates binary recording of
local system metrics and the CEC metrics. When you run the topasrec command for a
CEC recording, the topasrec command collects a set of metrics from the AIX partitions
running on the same CEC. The topasrec command collects dedicated and shared partition
data, and a set of aggregated values to provide an overview of the partition set on the
same CEC.
Start recording
To start a continuous local binary recording using the default output file:
# topasrec -L
The performance data is logged to:
/usr/lpp/perfagent/<hostname>_<date>_<time>.topas
To start a CEC binary recording using the default output file:

V5.4.0.3
Instructor Guide
Uempty # topasrec -C
The performance data is logged to:
/usr/lpp/perfagent/<hostname>_<date>_<time>.topas
Stop recording
It is recommended to stop topas recording using the smitty Stop_Recording fast path. The
command listtrec is run under the cover when stopping topas recording.
Summary report of a CEC recording in binary format
# topasout -R summary /usr/lpp/perfagent/sys044_vios1_cec_091006_1324.topas
#Report: CEC Summary --- hostname: sys044_vios1 version:1.2
Start:10/06/09 13:24:12 Stop:10/06/09 13:38:12 Int: 5 Min Range: 14 Min
Partition Mon: 6 UnM: 0 Shr: 6 Ded: 0 Cap: 6 UnC: 0
--CEC-------------- -Processors----------------- --Memory
(GB)------------
Time ShrB DedB Don Stl Mon UnM Shr Ded PSz APP Mon UnM Avl UnA InU
13:29 0.0 0.0 - - 1.3 0.0 1.3 0 8.0 8.0 6.6 0.0 0.0 0.0 0.0
13:34 0.0 0.0 - - 1.5 0.0 1.5 0 8.0 8.0 7.0 0.0 0.0 0.0 0.0
13:38 0.0 0.0 - - 1.5 0.0 1.5 0 8.0 8.0 7.0 0.0 0.0 0.0 0.0
Detailed report
# topasout -R detailed /usr/lpp/perfagent/sys044_vios1_cec_091006_1324.topas
#Report: CEC Detailed --- hostname: sys044_vios1 version:1.2
Start:10/06/09 13:24:12 Stop:10/06/09 13:28:12 Int: 5 Min Range: 4 Min
Time: 13:28:12
-----------------------------------------------------------------
Partition Info Memory (GB) Processors Avail Pool : 8.0
Monitored : 6 Monitored : 6.6 Monitored : 1.3 Shr Physcl Busy: 0.03
UnMonitored: 0 UnMonitored: 0.0 UnMonitored: 0.0 Ded Physcl Busy: 0.00
Shared : 6 Available : 0.0 Available : 0.0 Donated Phys. CPUs: 0.00
UnCapped : 0 UnAllocated: 0.0 Unallocated: 0.0 Stolen Phys. CPUs :
0.00
Capped : 6 Consumed : 0.0 Shared : 1.3 Hypervisor
Dedicated : 0 Dedicated : 0.0 Virt Cntxt Swtch:3205358
Donating : 0 Donated : 0 Phantom Intrpt : 4573
Pool Size : 8.0
Host OS M Mem InU Lp Us Sy Wa Id PhysB Vcsw Ent %EntC PhI
-------------------------------------shared--------------------------------
sys044_lpar1 A61 C 1.0 0.7 2 0 0 0 99 0.00 194 0.3 0.87 0
sys044_lpar2 A61 C 1.0 0.7 2 0 0 0 99 0.01 387899 0.3 1.69 549
sys044_vios1 A61 C 1.0 0.6 2 0 4 0 94 0.01 2816378 0.1 7.98 4023
sys044_vios4 A61 C 1.0 0.6 2 0 1 0 79 0.00 155 0.1 2.15 0
Instructor Guide
sys044_lpar3 A61 C 1.0 0.7 2 0 0 0 79 0.00 322 0.3 0.84 0

sys044_vios3 A61 C 1.0 0.6 2 0 1 0 98 0.00 280 0.1 3.66 0
sys044_vios2 A61 C 1.0 0.6 2 1 2 0 96 0.00 232 0.1 4.96 0
Topas cross-partition view and recording (-C | -R)
This panel displays metrics similar to the lparstat command for all the AIX partitions it can
identify as belonging to the same hardware platform. Dedicated and shared partitions are
displayed in separate sections with appropriate metrics. The top section represents
aggregated data from the partition set to show overall partition, memory, and processor
activity.
Remote enablement for this panel to collect from other partitions requires that latest
updates to the perfagent.tools and bos.perf.tools used to support this function. For earlier
versions of AIX, the topas command also collects remote data from partitions that have the
Performance Aide product (perfagent.server) installed. topas -C might not be able to
locate partitions residing on other subnets. To circumvent this, create a
$HOME/Rsi.hosts file containing the fully qualified host names for each partition
(including domains), one host per line.
Besides the display option, the topas -R option is now deprecated, and you must use the
topasrec command or use SMIT panels to start the recording. The topas -R command
starts the topasrec command that records the cross-partition data. Recordings are made
to the /etc/perf/ directory, and are of the form topas_cec.YYMMDD. The topasout
command or SMIT panels (smit topas) can be used to convert these recordings into
various text-based reports.
The following metrics display in the initial Cross-Partition panel. Additional metrics with full
descriptive labels can be displayed by using the key toggles identified in the Additional
Cross-Partition Panel Subcommands section in the AIX commands manual pages:
Partition totals
- Shr Number of shared partitions
- Ded Number of dedicated partitions
Memory (in GB)
- Mon Monitored partitions total memory
- Avl Memory available to partition set
- InUse Memory in use on monitored partitions
Processor
- Shr Number of shared processors
- Ded Number of dedicated processors
- PSz Active physical CPUs in the shared processor pool being used by this LPAR
- APP Available physical processors in the shared pool
- Shr_PhysB Shared Physical Busy
- Ded_PhysB Dedicated Physical Busy

V5.4.0.3
Instructor Guide
Uempty Individual partition data

- Host Hostname
- OS Operating system level
- M Mode
For shared partitions
- C - SMT enabled and capped
- c - SMT disabled and capped
- U - SMT enabled and uncapped
- u - SMT disabled and uncapped
For dedicated partitions
- S - SMT enabled
- \q \q (blank) - SMT disabled
- Mem Total memory in GB
- InU Memory in use in GB
- Lp Number of logical processors
- Us Percentage of CPU used by programs executing in user mode.
- Sy Percentage of CPU used by programs executing in kernel mode.
- Wa Percentage of time spent waiting for IO
- Id Percentage of time the CPU(s) is idle
- PhysB Physical Processors Busy
- Ent Entitlement granted (shared-only)
- %Entc Percent Entitlement consumed (shared-only)
- Vcsw Virtual context switches average per second (shared-only)
- PhI Phantom interrupts average per second (shared-only)
Note: At least one partition to be monitored must have Pool Utilization Authority (PUA)
configured for pool information metrics to be collected.
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty

SMIT panels introduced to operate on topas recording
The following options are provided:
To start/stop persistent recording (24x7)
To start/stop WLE data collection
To choose the type of recording
Binary/nmon style local recording
CEC recording (cross partitions data)
List available and completed recordings
Generate reports on the completed recording
Notes:
New smit panels have been introduced in AIX to operate on topas, to start and stop
recordings.
Persistent recording
Persistent recordings are those recordings that are started from SMIT with the option to
specify the cut and retention. You can specify the number of days of recording to be stored
per recording file (cut) and the number of days of recording to be retained (retention) before
it can be deleted. Not more than one instance of persistent recording of the same type
(CEC or local recording) can be run in a system. When a persistent recording is started, the
recording command will be invoked with user-specified options. The same set of command
line options used by this persistent recording will be added to inittab entries. This will
ensure that the recording is started automatically on reboot or restart of the system.
By default, a local persistent recording is already running on each AIX operating system.
The default persistent recording is based on a daily recording in /etc/perf/daily
directory. You can start a persistent local recording either in binary or nmon type.
Instructor Guide
WLE data collection

When starting a local binary recording, then you have an option to enable the IBM
Workload Estimator (WLE) report generation in the SMIT screen. The WLE report is
generated only on Sundays at 00:45 a.m. and requires local binary recording to always be
enabled for consistent data in the report. The data in the weekly report is correct only if the
local recordings are always enabled.
Type of recording
You can specify the type of recording such as binary or nmon style. CEC recording can
only be done in binary format. NMON comes with recording filters that help you customize
the nmon recording.
Generate a report
Using the smitty topas menus, you can generate reports from recorded files. Depending on
the type of record you started, different reporting formats can be selected. From a CEC
recording, you can generate report based on the following reporting format:
Comma_separated
Spreadsheet
Detailed
Summary
Pool_detail
Mempool
VIO_Server/Client
VIO_Client/Client_disk
If you started a local recording, the possible reporting format will be the following:
Comma_separated
Spreadsheet
Detailed
Summary
Disk_summary
Network_summary
nmon
Adapter
Virtual_Adapter

V5.4.0.3
Instructor Guide

Purpose
Details
Transition statement Here is a topas report example.
Instructor Guide

topas CEC reports
Summary
Detailed
Notes:
Here is a topasout report (summary and detailed) of a topas CEC (cross partitions)
recording in binary format.
The first output report is an extract of a summary report from a CEC recording file, and the
the secondary output report is an extract of a detailed report from a CEC recording file.

V5.4.0.3
Instructor Guide

Purpose
Details
Transition statement What about nmon?
Instructor Guide

nmon
Installed by default in AIX
Can be started by running command nmon or topas_nmon
Can be started by pressing ~ from topas screen
topas_nmon can also record local data in nmon format
Notes:
topas_nmon
The classical nmon freeware tool has been assimilated inside topas and is now fully
supported by IBM. Like topas, nmon is also a cursor-based tool for system performance
monitoring and also has recording capabilities. nmon is a new, by default, part of AIX and
can be started using the standard nmon command or topas_nmon.
Unlike topas, to start recording local data in nmon format, use the smitty
Start_Recording_Topas fast path menu. The output report file generated can be used with
nmon analyzer to create graphic views of recorded data.

V5.4.0.3
Instructor Guide

Purpose
Details
Transition statement nmon analyzer and nmon consolidator can be used to generate
graphs from nmon or topas output files.
Instructor Guide
nmon analyzer and nmon consolidator

nmon analyzer
Post processing tool to analyze performance.
Analyze files produced by topasout in nmon format
Cross partitions statistics cannot be analyzed
http://www.ibm.com/developerworks/wikis/display/Wikiptype/nmonanalyser
nmon consolidator
Need to use nmon consolidator if you want to get a report for the entire machine.
Figure 10-10. nmon analyzer and nmon consolidator AN313.1
Notes:
nmon analyzer is an Excel spreadsheet that takes an output file from nmon or
topas_nmon and produces some nice graphs to aid in analysis and report writing. It also
performs some additional analyses for ESS, EMC, and FAStT subsystems. It requires
Excel 2002 or later.
Using NMON_analyser
FTP the input file to your PC, ideally using the ASCII or TEXT options to make sure that
lines are terminated with the CRLF characters required by Windows applications.
Open the NMON_analyser spreadsheet and specify the options you want on the Analyser
and Settings sheets (see below). Save the spreadsheet if you want to make these options
your personal defaults.
Click Analyse nmon data and find and select the .nmon files to be processed. You can
select several files in the same directory. If you wish to process several files in different
directories you might wish to consider using the FILELIST option described below.

V5.4.0.3
Instructor Guide
Uempty You might see the message SORT command failed for filename if the file has >65K lines
and the filename (or directory name) contains blanks or special characters. Either rename
the file or directory or just pre-sort the file before using the Analyzer.
Using NMON_consolidator
NMON_consolidator reads in up to 255 nmon or topasout files to produce a consolidated
set of data in the form of an Excel spreadsheet (requires Excel 2002 or later).
A separate sheet is generated for each major performance statistic; graphs are
automatically generated showing summary data for each server and for the installation as a
whole. The tool allows nodes to be grouped together and will automatically calculate group
totals and group averages for each statistic. Because the graphs are pre-defined, the user
is free to edit the titles, colors, and fonts to suit their own requirements and can simply
delete unwanted charts or entire sheets to reduce the amount of output.
Administrators who tend partitioned servers will find this tool provides the ability to get an
overview of an entire machine at-a-glance and provides the opportunity for modelling
different partitioning scenarios (for example, moving dedicated partitions into the shared
pool).
nmon_consolidator can be obtained from
http://www.ibm.com/developerworks/wikis/display/WikiPtype/nmonconsolidator
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
VIO Server 2.1 monitoring commands

Virtual I/O Server version 2.1 monitoring commands
$ topas *
$ topas -cecdisp
$ fcstat *
$ svmon *
$ wklmgr
$ wklagent
$ wkldout
$ seastat
$ vmstat *
$ viostat
$ sysstat
* Denotes standard AIX commands
Figure 10-11. VIO Server 2.1 monitoring commands AN313.1
Notes:
Virtual I/O Server 2.1 monitoring commands
vmstat, fcstat, and svmon commands are standard AIX commands reporting
performance statistics. The fcstat command displays the statistics gathered by the
specified Fibre Channel device driver. The svmon command captures and analyzes a
snapshot of virtual memory.
wkldout: The wkldout command processes the data produced by running the
Workload Manger Agent. The Workload Manager Agent writes data files to the
/home/ios/perf/wlm directory.
topas: The topas command reports selected statistics about the activity on the local
system. To get a cross partition view, the cecdisp option can be used.
viostat: The viostat command reports CPU statistics, asynchronous input/output
(AIO), and input/output statistics for the entire system, adapters, tty devices, disks, and
CD-ROMs.
Instructor Guide
seastat: The seastat command generates a report to view, per client, shared Ethernet
adapter statistics. To gather network statistics at a per-client level, advanced
accounting can be enabled on the shared Ethernet to provide more information about
its network traffic. To enable per-client statistics, the VIOS administrator can set the
shared Ethernet adapter accounting attribute to enabled. The default value is disabled.
When advanced accounting is enabled, the shared Ethernet adapter keeps track of the
hardware (MAC) addresses of all of the packets it receives from the LPAR clients, and
increments packet and byte counts for each client independently. After advanced
accounting is enabled on the shared Ethernet adapter, the VIOS administrator can
generate a report to view per-client statistics by running the seastat command.

V5.4.0.3
Instructor Guide

Purpose
Details
Transition statement Here is a topas example running on a Virtual I/O Server.
Instructor Guide
VIO monitoring using topas (1 of 4)

From the VIO, run topas cecdisp and press v to show the
VIOs monitoring panel.
Figure 10-12. VIO monitoring using topas (1 of 4) AN313.1
Notes:
This menu can also be retrieved using the topas -C command (instead of the topas
cecdisp command) from any logical partition in the managed system.
In this example, we have one Virtual I/O Server and two virtual I/O client partitions.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide

From topas VIOS panel, move the cursor to a specific VIOS
and press d to get the detailed monitoring.
Notes:
From the previous topas menu, you can zoom on a specific Virtual I/O Server and clients
configuration and throughput. This can be accomplished by selecting a Virtual I/O Server
using the arrows keys and then pressing d to get the detailed monitoring.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide

From topas panel in the Virtual I/O Server, press E to display
the shared Ethernet adapter configuration and statistics.
Topas monitor for host: sys14_vios3 Interval: 2 Wed Sep 23 17:37:07 2009
===================================================================================
Network KBPS I-Pack O-Pack KB-In KB-Out
ent3 (SEA PRIM) 0.2 0.9 0.9 0.1 0.2
|\--ent0 (PHYS) 0.2 0.9 0.5 0.1 0.1
|\--ent4 (VETH CTRL) 0.1 0.0 2.8 0.0 0.1
\--ent2 (VETH) 0.1 0.0 0.5 0.0 0.0
lo0 0.0 0.0 0.0 0.0 0.0
The number of kilobytes

received per second over
the monitoring interval
The number of data packets

Device tree reflected in sent per second over the
the ASCII-art tree display monitoring interval
The number of data packets

received per second over the
monitoring interval
Total throughput=KB-In+KB-Out
Notes:
Network statistics can be seen in topas by pressing E. If you are running topas on a Virtual
I/O Server, the shared Ethernet adapter configuration and statistics will be shown.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide

From topas panel in the Virtual I/O Server, press D to display
the Disks panel; then press d to display the adapters statistics.
Topas Adapter View : sys044_vios1Interval: 2 Fri Oct 16 11:23:05 2009

===============================================================================
Adapter KBPS TPS KB-R KB-W
fcs0 204.3K 2.4K 204.3K 0.0
vhost1 0.0 0.0 0.0 0.0 Adapters throughput
vhost0 204.3K 4.8K 2.4K 2.4K
===============================================================================
Vtargets/Disks Busy% KBPS TPS KB-R ART MRT KB-W AWT MWT AQW AQD
hdisk2 95.7 204.3K 2.4K 204.3K 0.4 1.8 0.0 0.0 0.0 0.6 0.9
hdisk1 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
hdisk0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
Physical hdisk Busy% and statistics
Notes:
The figure shows a topas view of the adapters on the Virtual I/O Server. You can see
activity on the physical Fibre Channel adapter (fcs0) and on the virtual SCSI adapter
(vhost0). At the bottom of the output, hdisk2 is the physical disk device with I/O activity.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
AIX MPIO paths monitoring using topas

From topas Disk panel in an AIX client partition, press d to toggle
on/off Adapter panel.
Topas Adapter View : sys044_lpar1Interval: 2 Fri Oct 16 12:43:52 2009
===============================================================================
Adapter KBPS TPS KB-R KB-W
vscsi0 152.0K 2.4K 2.4K 0.0
vscsi1 0.0 0.0 0.0 0.0
===============================================================================
Vtargets/Disks Busy% KBPS TPS KB-R ART MRT KB-W AWT MWT AQW AQD
hdisk0 100.0 150.0K 2.3K 150.0K 1.2 72.2 0.0 0.0 113.80.0 0.0
Press m to toggle on/off Path panel.

Topas MPIO Monitor for host:sys044_lpar1 Interval 2 Fri Oct 16 11:10:51 2009
===============================================================================
Disk Busy% KBPS TPS KB-R ART MRT KB-W AWT MWT AQW AQD
hdisk0 100.0 214.0K 2.5K 214.0K 1.1 3.6 0.0 0.0 0.0 0.0 0.0
===============================================================================
Path Busy% KBPS TPS KB-R KB-W
Path1 0.0 0.0 0.0 0.0 0.0 Paths throughput
Path0 100.0 214.0K 2.5K 214.0K 0.0
Figure 10-16. AIX MPIO paths monitoring using topas AN313.1
Notes:
In this figure, the upper topas panel shows two virtual SCSI client adapters and a virtual
SCSI disk (hdisk0). We can see vscsi0 has activity and hdisk0 is 100% busy. Looking at the
bottom of the output, we see the two paths providing access to hdisk0 (virtual SCSI disk).
Only path0 has activity.
To collect input/output statistics for the disk, enable the collection by running the following
command:
chdev -l sys0 a iostat=true

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
VIO monitoring using Workload Manager

Maintenance commands available since Virtual I/O Server 1.3 (for long term
monitoring)
wkldmgr (workload manager)
wkldagent (provide recording capabilities)
wkldout (process workload manager-related recordings)
Tool used for monitoring is a subset of AIX WLM (workload manager)
The WLM is started in passive mode
Does not regulate the CPU, memory, or disk utilization but monitors them
wkldout [-report reportType] [-interval MM] [-beg HHMM] [-end
HHMM][-fmt[-mode modeType]] [-graph] filename
<xmwlm_recording_file> nmon
analyzer
$wkldout report summary filename /home/ios/perf/wlm/xmwlm output
Report: System Summary --- hostname: sys23_vio
Start:12/06/07 11:36:18 Stop:12/06/07 12:16:18 Int: 5 Min
Mem: 2.0GB Shared SMT: ON Logical CPUs: 4 Psize: 3
time InU Us Sy Wa Id PhysB Ent %Entc RunQ WtQ
--------------------------------------------------
11:41:17 2.0 0 1 0 99 0.21 0.25 1.4 1.0 0.0
11:46:17 2.0 0 1 0 99 0.30 0.25 1.8 1.1 0.0
11:51:17 2.0 0 1 0 99 0.27 0.25 1.7 0.8 0.0
11:56:17 2.0 0 1 0 40 56.2 88.2 28.2 1.0 0.0
wkldout
summary
report
Figure 10-17. VIO monitoring using Workload Manager AN313.1
Notes:
The Workload Manager Agent provides recording capability for a limited set of local system
performance metrics. These include common CPU, memory, network, disk, and partition
metrics typically displayed by the topas command.
The Workload Manager must be started using the wkldmgr command before the
wkldagent command is run. Daily recordings are stored in the /home/ios/perf/wlm
directory with filenames xmwlm.YYMMDD, where YY is the year, MM is the month, and DD
is the day.
The wkldout command can be used to process Workload Manager-related recordings. All
recordings cover 24-hour periods and are retained for only two days.
wkldout [-report reportType] [-interval MM] [-beg HHMM] [-end
HHMM][-fmt [-mode modeType]] [-graph] filename
<xmwlm_recording_file>
report: Detailed | summary | disk | lan

V5.4.0.3
Instructor Guide
Uempty interval MM: Split the recording reports into equal size time periods. Allowed values
(in min.) are 5, 10, 15, 30, and 60.
beg HHMM: Begin time in hours (HH) and minutes (MM). Range is between 0000 and
2400.
end HHMM: End time in hours (HH) and minutes (MM). Range is between 0000 and
2400 and is greater than the begin time.
fmt: Spreadsheet import format.
mode: min | max | mean |stdev | set
graph: Generate the .csv file under /home/ios/perf/wlm in the format
xmwlm.YYMMDD.csv, which can be input to nmon Analyzer to produce some nice
graphs to aid in analysis and report writing. nmon Analyzer requires Excel 2002 or later.
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
Freeware monitoring tools (1 of 3)

LPARMon
Simple graphical tool that shows what is going on in the managed
system
Figure 10-18. Freeware monitoring tools (1 of 3) AN313.1
Notes:
Graphical LPAR monitor for System p5 servers (LPARMon) is a graphical logical partition
(LPAR) monitoring tool that can be used to monitor the state of one or more LPARs on a
System p5 server. LPAR state information that can be monitored with this tool includes
LPAR entitlement/tracking, simultaneous multithreading (SMT) state, and processor and
memory use. In addition to monitoring the state of individual LPARS, the tool also includes
gauges that display overall shared processor pool use and total memory use. The LPARs
to be monitored can be running any mixture of AIX 5.3, AIX 5.2, or the Linux operating
systems. There is also a history feature that can be used to display an LPARs processor
use over seconds, minutes, hours, or days. This feature is helpful in determining an LPARs
processor resource requirements.
How does it work?
The LPARMon tool consists of two components. First, there are small agents that run in
AIX or Linux LPARs. These agents gather various LPAR information through several
operating system commands and API calls. The agents then pass this information using a
connected socket to the second component, which is the monitors graphical user
Instructor Guide
interface. This graphical user interface is a Java application and it is used as a collection
point for the server and LPAR status information, which it then displays in a graphical
format for the user.
Once LPARMon is installed and configured, starting up LPARMon is as simple as executing
the lparmon shell script or lparmon.bat file. The operation of LPARMon is very straight
forward and the various dialogs were covered in the Overview section. There are just a few
things you need to keep in mind when using LPARMon.
1. Only one LPARMon can be connected to a partition at any given time. If a partition is
specified in the LPARMon config file and an instance of LPARMon is running which
pointed to that config file, other instances of LPARMon who might also have specified
that same partition in their config file will hang on startup waiting for the partition to be
released by the active LPARMon instance.
2. If a partition is specified in the config file and LPARMon cannot connect to that partition,
the partition will be ignored and not be available when the LPARMon dialog comes up.
The most likely causes for this problem are:
a. The LPARMon agent is not running on the specified partition.
b. The machine where LPARMon is running cannot connect to the agent using the
specified or defaulted port. Make sure you can ping the machine or try another port.
c. Make sure you have used the correct version of the agent that corresponds with the
operating system that is running on the partition.
d. If the LPARMon agent is running and you still cannot connect to it, try killing all
instances of the LPARMon agent on the partition and restart the agent.
Once the communication problems with the partition have been resolved, it is necessary to
restart LPARMon in order to see the partition in the monitor.
3. The history information for processor usage is only held while LPARMon is running.
Restarting LPARMon will reset the history data.
Ganglia
Many large AIX high performance computing (HPC) clusters use this excellent tool to
monitor performance across large clusters of machines.
The data is displayed graphically on a Web site, and includes configuration and
performance statistics. This is also increasingly being used in commercial data centers to
monitor large groups of machines.
Ganglia can also be used to monitor a group of logical partitions (LPARs) on a single
machine - these just look like a cluster to Ganglia.
Ganglia is not limited to just the AIX, which makes it even more useful in heterogeneous
computer rooms.
For more information go to the Ganglia home Web site at http://ganglia.sourceforge.net/
For the Ganglia for AIX and Linux on POWER binaries go to http://www.perzl.org/ganglia/

V5.4.0.3
Instructor Guide
Uempty Briefly, a daemon runs on each node, machine, or LPAR and the data is collected by a
further daemon and placed in an rrdtool database. Ganglia then uses PHP scripts on a
Web server to generate the graphs as directed by the user. There is also an on-going
project to add POWER5 micro-partitions statistics.
Ganglia tool can also be found here:
http://www.ibm.com/developerworks/wikis/display/WikiPtype/ganglia
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty

Ganglia is an Open Source cluster performance monitoring tool.
Extended to include POWER5/6 features like shared processor LPARs,
entitlement, physical CPU usage and so on
Monitors LPARs (including VIO Server) on a single or multiple managed systems
Notes:
Ganglia
Many large AIX high performance computing (HPC) clusters use this excellent tool to
monitor performance across large clusters of machines.
The data is displayed graphically on a Web site, and includes configuration and
performance statistics. This is also increasingly being used in commercial data centers to
monitor large groups of machines.
Ganglia can also be used to monitor a group of logical partitions (LPARs) on a single
machine - these just look like a cluster to Ganglia. Ganglia is not limited to just the AIX,
which makes it even more useful in heterogeneous computer rooms.
For more information, go to the Ganglia home Web site at http://ganglia.sourceforge.net/.
For the Ganglia for AIX and Linux on POWER binaries go to http://www.perzl.org/ganglia/.
A wiki is also available here:
Instructor Guide
Ganglia components
A daemon runs on each node, machine, or LPAR and the data collected and placed in an
rrdtool database. Ganglia then uses PHP scripts on a Web server to generate the graphs
as directed by the user.
The components of Ganglia are as follows:
This data collector (G)
- The daemon is a single file and called gmond (Ganglia MONitor Daemon). It is a
monitoring daemon that collects the different metrics.
- Its configuration file is /etc/gmond.conf
- This goes on each node.
The data consolidator (G)
- This is a single file and called gmetad (Ganglia METAdata Daemon). It polls all the
gmond clients and stores the collected metrics in a round-robin databases (RRDs)
- Its configuration file is /etc/gmetad.conf
- You need one of these for each cluster. On massive clusters you can have more
than one and a hierarchy.
- This daemon collects the gmond data set using the network and saves it in a rrdtool
database.
The database
- Ganglia uses the well known and respected open source tool called rrdtool.
The Web GUI tools (G)
- These are a collection of PHP scripts started by the Web server to extract the
Ganglia data and generate the graphs for the Web site.
The Web server with PHP
- This could be any Web server that supports PHP, SSL, and XML.
- Everyone uses Apache2; you are on your own if you use anything else.
Addition advanced tools (G)
- gmetric is used to add extra stats. In fact, anything you like such as numbers or
strings, with units, and so on.
- gstat is used to get at the Ganglia data to do anything else you like.
Notice: The parts that are labeled with a (G) are part of Ganglia.The other parts you have
to get and install as prerequisites, namely Apache2, PHP, and rrdtool. These might also
have prerequisites.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide

lpar2rrd
LPAR CPU statistics and documentation tool for IBM POWER servers
Produces historical CPU utilization graphs of LPARs and shared CPU usage.
Collects complete physical (HW) and logical configuration of all managed systems
and their LPARs including changes in their state and configuration
http://www.ibm.com/developerworks/wikis/display/virtualization/lpar2rrd+tool
Notes:
lpar2rrd is a tool capable of producing historical CPU utilization graphs of logical partitions
and shared CPU usage.
It also collects complete physical (hardware) and logical configuration of all managed
systems and logical partitions. This includes all changes in their state and configuration.
This tool is not intended to be real-time monitoring tool.
This tool is intended only for HMC-based micro-partitioned systems with a shared CPU
pool and creates charts based on utilization data collected on HMC (lslparutil hmc
command). It is agent-less; no agent needs to be installed on any logical partition. It uses
ssh keys-based access to the HMC to get all of the data, so it does not cause any load on
monitored logical partitions.
It supports all types of logical partitions and operating systems: AIX, VIOS, Linux on Power,
i5/OS on IBM Power systems. It creates automatically a menu for viewing charts,
configuration, and logs. It creates a physical and logical configuration inventory of all
managed systems and their logical partitions (once a day).

V5.4.0.3
Instructor Guide
Uempty The lpar2rrd tool shows 100 of the last changes in the configuration and 100 of the last
changes in the state of all managed systems and their logical partitions. It shows the total
memory usage for each managed systems
This tool is simple to install, configure, and use (initial install and configuration together with
supporting tools like Apache/Perl/SSH should not take more than half an hour). Default
graphs can provide up to a year of historical data if available at the HMC.
More can be found here:
http://www.ibm.com/developerworks/wikis/display/virtualization/lpar2rrd+tool.
Instructor Guide
Instructor notes:
Purpose
Details
Transition statement Next is the Performance toolbox.

V5.4.0.3
Instructor Guide
Uempty
Performance toolbox
POWER systems and virtualized environments can be
monitored by Performance toolbox (PTX).
Figure 10-21. Performance toolbox AN313.1
Notes:
This X Windows performance monitoring tools support POWER systems and virtualization
statistics. From VIOS version 1.3, the daemon that Performance toolbox (PTX)
communicates with to extract data from remote machine is available.
You run a daemon on each AIX LPAR.
The PTX graphical user interface runs on a machine running X Windows and typically an
AIX workstation (although VNC works and you could use an other workstation running X
Windows remotely).
With PTX, you build up a monitor of what you want to capture dynamically on the screen
(CPU, disk, network, and so on, out of hundreds of statistics). You can also do the
following:
Automate the capture and saving of data to files
Replay the "monitor" - much like watching a video an zoom forward and back in time
Filter and modify the captured data to support other tools or performance databases
Instructor Guide
These two graphs show Entitlement (ent), Physical CPU use (physc), Shared, SMT, Cap
Status, and the Global CPU utilization in 2D and 3D modes. 3D allows multiple
machines/LPARs to be monitored at the same time.
The Performance toolbox for AIX consists of two components called the Manager and the
Agent. The Agent is also referred to as the Performance Aide and represents the
component that is installed on every network node in order to enable monitoring by the
manager.
The Agent component is available separately from the Performance toolbox for AIX
product.
The Local Performance Analysis and Control Commands fileset (perfagent.tools) is now a
prerequisite of the Performance Aide for AIX fileset (perfagent.server). The Local
Performance Analysis and Control Commands ship with the base operating system and
must be installed before proceeding with the Performance Aide for AIX installation.

V5.4.0.3
Instructor Guide

Purpose
Details
Transition statement Next is IBM System Director.
Instructor Guide
Management tools: IBM Systems Director

Integrated suite of systems management tools, providing:
Monitoring and management of systems across heterogeneous IT
environments
Simplify management of physical and virtual platform resources
Monitors and reports system health
LPAR
LPAR
LPAR
LPAR
LPAR
LPAR
LPAR VIO VIO
Discovers physical
and shared I/O
Figure 10-22. Management tools: IBM Systems Director AN313.1
Notes:
IBM Systems Director is an integrated suite of systems management tools, designed for
monitoring and managing heterogeneous IT environments. The design concept is to
provide one mechanism by which your entire IT architecture is managed. From one access
point, administrators can monitor system environmental, resources, inventory, events, task
management, core corrective actions, distributed commands, and hardware control for a
wide range of servers and storage. IBM Systems Director has an extendable and modular
foundation that enables it to be enhanced with additional plug-ins such as VMControl.
IBM Director is a management tool for virtual and physical environments in the data center.
IT administrators can view and track the hardware configuration of remote systems in
detail, and monitor the usage and performance of critical components such as processors,
disks, and memory.
Systems Director consists of three main components

1. The Systems Director server provides the management application logic and database
required to perform all key functions. It can run on AIX, Windows, and Linux systems.

V5.4.0.3
Instructor Guide
Uempty 2. The management console provides a Web-based user interface for all management
functions.
3. The information is passed to and from supported systems using the Systems Director
Agent or without an agent. This allows clients to trade off management functions for a
smaller footprint.
System Director agents

Common Agent: Single, shared incoming port (firewall friendly) for management.
Common authentication and credential management using a single agent manager.
The goal is to provide a single agent runtime shared by IBM systems and Tivoli
products like Tivoli Provisioning Manager, reduces agent footprint to support shared
credentials and drive discovery, inventory, and other common services. Common agent
must be installed to get the full IBM Director capabilities.
Platform Agent: Provides a subset of common agent functions used to communicate
with and administer the managed system, including hardware alerts and status
information, firmware and driver updates, and remote deployment. The Platform Agent
provides basic hardware monitoring, health monitoring, and status and event
notification, but more sophisticated tasks cannot be undertaken by IBM Systems
Director.
Agent-less management: Agent-less managed systems are best for environments that
require very small footprints and are used for specific tasks, such as one-time inventory
collection, firmware and driver updates, and remote deployment. Agent-less managed
systems have limited capabilities such as system discovery, and collection of limited
operating system inventory data.
You will find the link to the download page and more information at:
http://www-03.ibm.com/systems/management/director/
Instructor Guide
Instructor notes:
Purpose
Details
Additional information IBM Director is provided at no additional charge for use on IBM
systems. You can purchase additional IBM Director server licenses for installation on
non-IBM servers. In order to extend IBM Director capabilities, several extensions can be
optionally purchased and integrated.

V5.4.0.3
Instructor Guide
Uempty
IBM Systems Director: Welcome
Figure 10-23. IBM Systems Director: Welcome AN313.1
Notes:
User interface
Here is a preview of Systems Directors console.
IBM Systems Director utilizes an extensible Web-based user interface for the console.
From the console, the administrator can perform a number of tasks utilizing the various
foundation components provided. For certain legacy operations, the user interface provides
a launch in context capability.
It is designed to give IT administrators the flexibility to manage in a way that is intuitive to
them. If they only manage Power servers, they can go right to a Power Systems
management home page by clicking a button. If they want to get a quick status or update
servers, they can go right to that function. It is designed to minimize the number of clicks
required to perform a function and show status of each function along the way.
It includes context-sensitive help, tutorials, and even a link to guide users to understand
how to navigate the console.
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
IBM Systems Director:
Power Systems Management
Figure 10-24. IBM Systems Director: Power Systems Management AN313.1
Notes:
The IBM Systems Director Power Systems Management option, provides specific tasks
that can help you manage power systems and platform managers such as the Hardware
Management Console (HMC) and the Integrated Virtualization Manager (IVM).
IBM Power systems can all be completely managed by IBM Systems Director with
capabilities such as discovery, inventory, status, monitoring, power management, and so
on.
IBM Systems Director can manage the following Power systems environments that might
include POWER5 and POWER6 processor-based servers running AIX, IBM i, or Linux:
Power systems managed by the Hardware Management Console
Power systems managed by the Integrated Virtualization Manager
Power systems server with a single image (a nonpartitioned configuration)
A Power Architecture BladeCenter server under the control of a BladeCenter
management module
Instructor Guide
IBM Systems Director gives you an overall understanding of the Hardware Management
Consoles and Integrated Virtualization Managers in your environment, as well as the hosts
they manage and their associated virtual servers (logical partitions). You can access and
manage the logical partitions as you would any other managed system. In addition, IBM
Systems Director provides a launch-in-context feature to access additional tasks that are
available from the Hardware Management Console and the Integrated Virtualization
Manager. From IBM Systems Director, you can also access i5/OS management tasks, in
addition to the AIX management tasks.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
IBM Systems Director VMControl features

VMControl Express Edition
VMControl VMControl VMControl
Lifecycle management and Express Standard Enterprise
Edition Edition Edition
relocation of virtual machines
of all IBM servers platform Create and
manage virtual 9 9 9
Manage Power Systems machines
virtual server resources Virtual machine 9 9 9
relocation
Create, modify, and delete
Capture, import,
virtual machines create virtual 9 9
Relocate virtual machine images
Deploy virtual 9 9
VMControl Standard Edition images
Lifecycle management and Maintain virtual
images in a 9 9
deployment of virtual images centralized library
on Power Systems (AIX) and
Resource pool
System z (Linux) management 9
Optional upgrade from features (future)
Express to Standard edition

through purchased license key
activation
Figure 10-25. IBM Systems Director VMControl features AN313.1
Notes:
VMControl
IBM Systems Director VMControl enables clients to improve service delivery through faster
deployment of Power servers. Systems Director VMControl is a plug-in of IBM System
Director and is available in two editions: the Express Edition and the Standard Edition.
The Express Edition
The Express Edition provides lifecycle management of virtual machines, that is, the ability
create, modify, and delete them as well as move them to other locations. This function is a
no-charge download that supports a broad set of operating environments and Hypervisors
across IBM hardware platforms.
The Standard Edition
The Standard Edition includes the Express Edition functions and adds virtual-to-virtual
image management. This includes the ability to create, capture, import, and deploy virtual
images. This virtual image management solution configures new Power server AIX
systems or System z Linux systems, clone existing systems, and facilitates planning and

V5.4.0.3
Instructor Guide
Uempty deploying virtual images. These virtual server images can be maintained in a library and
encapsulate the operating system, middleware, and applications for deployment on
another server.
Industry standard virtual images can also be imported because the Systems Director
VMControl design is based on the Open Virtualization Format (OVF) standard. For Power
servers, it leverages the Network Installation Manager (NIM) function of AIX, so that clients
do not have to migrate these to use the Systems Director interface for maintaining AIX
images. The automation capabilities can reduce the time to deploy new services, especially
as compared to installing each operating environment, middleware, and applications
individually.
VMControl Standard Edition also helps reduce image sprawl. In other words, the
proliferation of multiple similar images maintained by many administrators. Instead,
administrators can choose from existing images that are consistently used throughout the
IT operation.
There is a 60-day free trial for your clients to download and try the image management
capabilities of Systems Director VMControl Standard Edition. At the end of the 60 days,
they can purchase a license key to continue managing virtual images or continue to use the
Express Edition features at no charge.
Instructor Guide
Instructor notes:
Purpose
Details

V5.4.0.3
Instructor Guide
Uempty
IBM Systems Director VMControl
Figure 10-26. IBM Systems Director VMControl AN313.1
Notes:
From this IBM Systems Director menu, you can manage the virtual appliances in your data
center.
Virtual appliances
Generally speaking, a virtual appliance is a virtual machine software image designed to run
on a virtualization platform. From this menu, you can rapidly deploy virtual appliances to
create virtual servers that are instantly configured with the operating system and software
applications that you desire. You can deploy virtual appliances to the following platforms:
IBM Power systems servers (POWER5 and POWER6) that are managed by Hardware
Management Console or Integrated Virtualization Manager
Linux on System z systems running on the z/VM Hypervisor
IBM Systems Director VMControl allows you to complete the following tasks:
Discover existing image repositories in your environment and import external,
standards-based images into your repositories as virtual appliances.
Instructor Guide
Capture a running virtual server that is configured just the way you want, complete with
guest operating system, running applications, and virtual server definition. When you
capture the virtual server, a virtual appliance is created in one of your image
repositories with the same definitions and can be deployed multiple times in your
environment.
Import virtual appliance packages that exist in the Open Virtualization Format (OVF)
from the Internet or other external sources. After the virtual appliance packages are
imported, you can deploy them within your data center.
Deploy virtual appliances quickly to create new virtual servers that meet the demands of
your ever-changing business needs.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide

Consolidated monitoring of physical and virtual resources
Improves mean-time-to-recovery by
visualizing the virtual world to
solve virtual performance problems
Real-time and historical data assists
in separating intermittent problems
from reoccurring problems within
peak workloads
Out-of-the-box reporting allows
clients to quickly provide executive
level reports and identify resource
bottlenecks
Figure 10-27. IBM Tivoli Monitoring AN313.1
Notes:
IBM Tivoli Monitoring manages and monitors system and network applications on a variety
of operating systems, tracks the availability and performance of your enterprise system,
and provides reports to track trends and troubleshoot problems.
IBM Tivoli Monitoring (ITM) Version 6.2 gives the Power systems administrator the ability to
be alerted or notified when something goes wrong. ITM uses agent technology, and has the
capability to determine the health and availability of the entire Power system; right down to
the network interface card.
ITM provides the administrator the ability to monitor both the physical and logical resources
of the Power system, including the disk and network that sits behind the Virtual I/O Server
(VIOS). It can do this because with every VIOS server shipped from IBM, there is an ITM
agent embedded in the VIOS to enable ITM to monitor the disks and network.
Included with ITM is a data warehouse tool that allows the customer to store as much
historical data as they desire. This DB information allows the customer to go back and
compare current system performance and utilization to past performance and utilization.

V5.4.0.3
Instructor Guide
Uempty In addition, ITM includes a vast number of reporting templates that can be easily
customized for individual customer requirements.
For additional technical resources, see the following Web sites:
IBM Tivoli Monitoring information center:
http://publib.boulder.ibm.com/infocenter/tivihelp/v15r1/index.jsp?topic=/com.ibm.itm.do
c/welcome.htm
IBM Tivoli Monitoring Web site:
http://www-306.ibm.com/software/sysmgmt/products/support/IBMTivoliMonitoring.html
IBM Tivoli Monitoring can be ordered as a separate product or can be part of the AIX
management edition and AIX Enterprise editions offering. AIX Enterprise edition is only
available with AIX 6. Consider AIX Management edition if running AIX 5.3.
AIX Enterprise Edition and ITM

AIX Enterprise Edition is designed to simplify the process of managing virtualized AIX
environments by including AIX 6 and enterprise management functions into a single, easy
to order offering. AIX Enterprise Edition consists of:
AIX 6
PowerVM Workload Partitions Manager for AIX (WPAR Manager)
IBM Tivoli Application Dependency Discovery Manager (TADDM)
IBM Tivoli Monitoring with real-time and advanced predictive monitoring
IBM Usage and Accounting Manager Virtualization Edition for Power systems
AIX Management Edition and ITM

Management Edition for AIX (ME for AIX) is an integrated systems management offering
created specifically for the System p platform that provides as primary functions, the
monitoring of the health and availability of the System p platform, the discovery of
configurations and relationships between System p service and application components
and the usage and accounting of System p IT resources.
Management edition for AIX is a bundled system management offering comprised of three
seamlessly integrated products from IBM Tivoli:
IBM Tivoli Application Dependency Discovery Manager
IBM Tivoli Usage and Accounting Manager, Virtualization edition for System p
The product integration provides System p clients with a platform management solution
that is easier to install and easier to implement.
Instructor Guide
Instructor notes:
Purpose
Details
Transition statement The next series of slides show different views in Tivoli to monitor
the virtualized environment.

V5.4.0.3
Instructor Guide
Uempty
IBM Tivoli Monitoring: AIX/System p architecture
TEP client
Console server ITM server

Console
TEPS database
Management server
TEMS Warehouse
Topology
availability
performance
AIX
VIOS availability
HMC OS availability health
health performance
performance
HMC/IVM
AIX premium AIX premium AIX premium

HMC agent CEC agent VIOS premium agent agent agent
agent
System P
Server VIOs AIX AIX AIX
AIX AIX
Figure 10-28. IBM Tivoli Monitoring: AIX/System p architecture AN313.1
Notes:
The basic installation of IBM Tivoli Monitoring requires the following components:
One or more Tivoli enterprise monitoring servers (TEMS), which act as a collection and
control point for alerts received from the agents, and collect their performance and
availability data. The monitoring server also manages the connection status of the
agents.
A Tivoli enterprise portal server (TEPS), which provides the core presentation layer for
retrieval, manipulation, analysis, and pre-formatting of data. The portal server retrieves
data from the hub monitoring server in response to user actions at the portal client, and
sends the data back to the portal client for presentation. The portal server also provides
presentation information to the portal client so that it can render the user interface views
suitably.
One or more Tivoli enterprise portal clients (TEP client) with a Java-based user
interface for viewing and monitoring your enterprise. Tivoli enterprise portal offers two
modes of operation: desktop and browser.
Instructor Guide
Tivoli enterprise monitoring agents, installed on the systems or subsystems you want to
monitor. These agents collect data from monitored or managed systems and distributes
it to a monitoring server. Four different System p agents are available for monitoring and
gathering information.
- Virtual I/O Server Premium Agent: The VIOS premium agent monitors the health of
the VIOS, provides mapping of storage and network resources to the client LPAR,
and provides storage and network utilization statistics. The VIOS premium agent is
preinstalled on a VIOS system. No further installation is required, but the agent must
be configured and bound to a TEMS and TEPS in order to be viewed from a TEP.
- CEC Agent: The CEC agent provides overall CPU and memory utilization of the
frame for monitored partitions and provides CPU and memory utilization by LPAR.
The CEC agent must be installed on an AIX partition that resides on the CEC to be
monitored.
- AIX Premium Agent: The AIX premium agent provides statistics for each LPAR
(entitled CPU, physical and logical CPUs), memory utilization, disk and network
utilization, and process data. This agent also provides usage statistics for WPARs
as well. An AIX agent can be installed on any AIX partition. All AIX agents on a CEC
are typically bound to the same TEMS and TEPS so they can be viewed from the
same TEP.
- HMC Agent: The HMC agent provides health and availability of the HMC. A HMC
agent can be installed on any AIX partition, but requires a ssh connection to the
HMC to monitor it, so it is generally convenient to install it on the same partition as
the CEC agent that already requires a ssh connection. Multiple instances of the
HMC agent can be invoked on the same partition to monitor multiple HMCs.

V5.4.0.3
Instructor Guide

Purpose Quickly describe ITM architecture and required components to manage and
monitor a System p environment.
Details
Additional information The VIOS premium agent requires an established ssh
connection to the HMC that controls the VIOS partition. The Virtual I/O Server is a closed
system and does not support the running of non-certified software. Currently the only ITM
agents certified to run on the VIOS are the ITM VIOS, ITM CEC (only in Integrated
Virtualization Manager (IVM) configurations), and ITM log alert agent.
The CEC agent requires the following to properly monitor AIX LPARs and frame utilization:
Network connectivity to other AIX LPARs on the same CEC using UPD port 2279. Each
AIX LPAR must have the xmservd/xmtopas agent running in order to be monitored
properly. A ssh connection between the CEC agents LPAR and the CECs managing HMC.
Instructor Guide
IBM Tivoli Monitoring: Hypervisor view

Global
CPU and
memory
allocation
Total CPU
and
memory
allocated
to LPARs
Figure 10-29. IBM Tivoli Monitoring: Hypervisor view AN313.1
Notes:
This figure shows a view of the overall CPU and memory resources allocated to the
different logical partitions. These information are provided by the CEC agent.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
IBM Tivoli Monitoring: AIX LPAR view
CPU,
memory, disk,
network info
per LPAR
Figure 10-30. IBM Tivoli Monitoring: AIX LPAR view AN313.1
Notes:
Monitoring can be done on individual or groups of logical partitions. The figure shows an
example of CPU, memory, disk, and network resources for an individual logical partition.
This information is provided by the AIX premium agent running on the logical partition.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
IBM Tivoli Monitoring: Network mapping

VIOS view of network information
Shows how
network
interfaces are
mapped to
LPARS
Figure 10-31. IBM Tivoli Monitoring: Network mapping AN313.1
Notes:
Tivoli Monitoring is virtualization aware it can show the relationships between virtual
resources and physical resources. The figure shows an example of virtual network
adapters and interfaces mapped to the different logical partitions. This virtual resource
information is provided by the VIOS premium agent installed on the virtual I/O server.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
IBM Tivoli Monitoring: Virtual storage mapping

VIOS view of disk mapping and utilization information
Shows virtual
storage
mapping for
VIO Server
Shows virtual
storage
mapping detail
for VIO Server
Figure 10-32. IBM Tivoli Monitoring: Virtual storage mapping AN313.1
Notes:
The figure shows an example of virtual disk mapping and utilization information of the
virtualized environment. This is also provided by the VIOS premium agent running on the
virtual I/O server.

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide
IBM Tivoli Monitoring: Device status
Figure 10-33. IBM Tivoli Monitoring: Device status AN313.1
Notes:
The VIOS premium agent also provides status information of the devices (virtual and
physical) on a Virtual I/O Server.

V5.4.0.3
Instructor Guide

Purpose
Details
Transition statement Next slide is a checkpoint.
Instructor Guide
Checkpoint
1. Which command can be used to check and monitor a shared
Ethernet adapter failover on a Virtual I/O Server?
2. Which command can be used to start recording with topas?
3. What IBM tool is an integrated suite of systems management

tools, designed for monitoring and managing heterogeneous
IT environments?
4. True or False: IBM Tivoli Monitoring monitors and manages

system and network applications on a variety of operating
systems, tracks the availability and performance of your
enterprise system, and provides reports to track trends and
troubleshoot problems.
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details Checkpoint solutions
1. Which command can be used to check and monitor a shared Ethernet
adapter failover on a Virtual I/O Server?
The answer is run topas and then press E.

The answer is topasrec.
3. What IBM tool is an integrated suite of systems management tools,

designed for monitoring and managing heterogeneous IT
environments?
The answer is IBM Systems Director.
4. True or False? IBM Tivoli Monitoring monitors and manages system
and network applications on a variety of operating systems, tracks the
availability and performance of your enterprise system, and provides
reports to track trends and troubleshoot problems.
The answer is true.
Instructor Guide
Unit summary
Describe virtualization management and monitoring tools, such
as
IBM Systems Director
Notes:

V5.4.0.3
Instructor Guide

Purpose
Details
Instructor Guide

V5.4.0.3
Instructor Guide
AP Appendix A. Checkpoint solutions

Unit 1
Checkpoint solution
1. The PowerVM Enterprise Edition is required for which of the following?
The answers are partition mobility and Active Memory Sharing.
2. How many shared processor pools are supported on a POWER6

processor-based system?
The answer is 64 (the default one plus 63 additional).

The answer is Active Memory Sharing.
4. Which PowerVM feature allows LPAR migration from one physical

machine to another?
The answer is partition mobility.
Copyright IBM Corp. 2010, 2011 Appendix A. Checkpoint solutions A-1

Instructor Guide
Unit 2
a. Dedicated These processors cannot be used in micro-partitions.
b. Uncapped Partitions marked as this might use excess processing
c. Logical There are two or four of these for each virtual processor if
d. Dedicated This type of processor must be configured in whole
processor units.
e. Shared These processors are configured in processing units as
f. Capped Partitions marked as this might use up to their entitled
The answers in the correct order are dedicated, uncapped, logical,
dedicated, shared, and capped.

The answer is no, a partition must use only one type of processor,
dedicated or shared, but not both.
A-2 PowerVM Virtualization II Copyright IBM Corp. 2010, 2011

V5.4.0.3
Instructor Guide
AP
The answer is true.
a. One
b. Three
c. No minimum
The answer is three.
a. 25 (Maximum can be no more than 10 times processing units.)
b. 30
d. No maximum
The answer is 25 (maximum can be no more than 10 times processing
units).

Instructor Guide
6. What is the maximum amount of processing units that can be allocated
to a partition?
The answer is all available processing units.
7. If an uncapped partition has an entitled capacity of 0.5 and two virtual
processors, what is the maximum amount of processing units it can
use?
The answer is 2.0 processing units because it is uncapped and has
two virtual processors (maximum of 1.0 units per virtual processor).
8. If there are multiple uncapped partitions running, how are excess
shared processor pool resources divided between the partitions?
The answer is the uncapped weight configuration value is used to
allocate excess resources.
9. True or False: If a capped partition is allocated 2.5 processing units,

this means that it can use up to 25 ms of processing time for every 10
ms of clock time.
The answer is true.

V5.4.0.3
Instructor Guide
AP
The answer is up to ten times the amount of processing
units, with a maximum value of 64.

The answer is true.

Instructor Guide
Unit 3
Topic 1
1. True or False: Dedicated processors can be shared only if they are
idle.
The answer is true.
2. True or False: Only uncapped partitions can use idle cycles donated
by the dedicated processors.
The answer is true.
3. Which of the following sharing_mode values allow excess processor

cycles to be donated to the shared processor pool from an active
partition only?
a. keep_idle_procs
b. share_idle_procs
The answer is share_idle_procs_active.

V5.4.0.3
Instructor Guide
AP Topic 2
The answer is true.
have a number.
The answer is false (default shared pool ID = 0).
The answer is the default value is 0.

The answer is topas C (using p to enable display of pool
section).

Instructor Guide
Unit 4
1. True or False: PowerVM Active Memory Sharing feature allows shared memory
partitions to share memory from a single pool of shared physical memory.
The answer is true.

c. Allows logical partitions to increase their memory footprint during peak memory demand periods
d. Allows over commitment of physical memory resources.
The answer is reduces global memory requirements of logical partitions, manages
operational costs by consolidating resources, allows logical partitions to increase
their memory footprint during peak memory demand periods, and allows over
commitment of physical memory resources.
3. True or False: The total logical memory of all shared memory LPARs is allowed to
exceed the real physical memory allocated to a shared memory pool in the
system.
The answer is true.

The answer is logical memory.

V5.4.0.3
Instructor Guide
AP
5. What requirements must be met by the LPAR in order to be defined
as shared memory LPAR?
The answer is the LPAR must use shared processors and use only
virtual I/Os.
6. What is the purpose of the LPAR paging space device?

The answer is it provides an amount of paging space that the partition
requires to back up its logical memory pages.
7. True or False: Each shared memory partition requires a dedicated

paging device in order to be started.
The answer is true.
8. The dedicated paging device must be what size to successfully

activate a shared memory LPAR?
The answer is in order to successfully activate, the minimum paging
space device size must be more than or equal to the maximum
memory setting for the shared memory LPAR.

Instructor Guide
9. True or False: The Collaborative Memory Manager is an operating
system feature that gives hints on memory page usage to the
hypervisor.
The answer is true.
statistics?
The answer is vmstat, lparstat, topas, and svmon.
11. True or False: When AIX starts to loan logical memory pages, by
default it first selects pages used to cache file data.
The answer is true.
12. How can you tune the Collaborative Memory Managers loan policy?
The answer is the policy is tunable through the AIX VMM vmo
command. The parameter ams_loan_policy has a default value of 1.
This enables the loaning of the file cache. When set to 2, loaning of
any type of data is enabled.

V5.4.0.3
Instructor Guide
AP Unit 5
Topic 1
1. True or False: Every POWER7 system comes with Active Memory
Expansion as standard.
effectively use more memory than the logical memory amount
allocated by the hypervisor.
The answer is true.
3. True or False: The size of the compressed memory pool is static.

4. True or False: The compression ratio achieved on a system depends

on the compressibility of data.
The answer is true.
5. True or False: The AME feature costs the same on every POWER7
system.

Instructor Guide
Topic 2
1. True or False: Any user can use the amepat command to generate
AME modeling information.
2. True or False: The amepat command can be used to generate a
report using recorded data.
The answer is true.
3. True or False: The amepat command should be run when the target
workload is idle.
4. True or False: The amepat command can run on any system running
AIX 6.1 TL4 SP2 or above.
The answer is true.
5. True or False: The amepat command should only be run when AME
is disabled.

V5.4.0.3
Instructor Guide
AP Topic 3
1. True or False: A partition can use AME on any POWER7 system.
The answer is false. (AME can only be configured on a system with
the correct activation code.)
2. True or False: Multiple AME 60 day trial activations can be used on a

single system.
The answer is false. (Only a single 60 day trial can be used on each
system.)
3. True or False: A managed system must be rebooted to enable an

AME activation code.
The answer is false. (The AME capability is enabled dynamically when
the activation code is entered.)

The answer is true.

Instructor Guide
Topic 4
The answer is true.

V5.4.0.3
Instructor Guide
AP Unit 6
The answer is with N_PIV, the VIOS provides a pass-thru
service.

The answer is yes.

The answer is the pair of numbers are associated with the
same LUN; One is use by the source system and the other
by the target system after Live Partition Mobility.

Instructor Guide
Unit 7
Topic 1
The answers are check process activity to determine errant
processes, add CPU resources, change configuration (for
example, capped to uncapped and dedicated to donate
cycles, and move workload.

The answer is true.

V5.4.0.3
Instructor Guide
AP Topic 2
The answer is true.

The answer is false. Clients use the same amount of
processing resources when using virtual or physical devices.

Instructor Guide
3. Which one of the following recommendations about sizing the Virtual
I/O Server for virtual SCSI is false:
c. When using shared processors, set the priority (weight value) of the Virtual I/O
Server partition equal to its client partitions.
The answer is when using shared processors, set the priority (weight
value) of the Virtual I/O Server partition equal to its client partitions.
4. If a physical volume has a queue_depth of four and it has two logical

volumes, each used as backing storage and each has a large amount
of I/Os, what should the client set the queue_depth to on the two
VSCSI disks?
The answer is two.
5. True or False: The overall system processing resource needs to

support virtual disk I/O is nearly double that of using native disks.
The answer is true.

V5.4.0.3
Instructor Guide
AP Topic 3
1. True or False: Virtual Ethernet adapters are created and the PVID
assignments are performed using the Hardware Management Console
(HMC).
The answer is true.
2. True or False: Virtual Ethernet performance can be improved by

adding more CPU entitlement to a partition if it is constrained.
The answer is true.
3. True or False (Physical Ethernet does not support an MTU of 65394.):

An MTU size of 65390 for both virtual and physical Ethernet will result
in the best performance.
The answer is false. Physical Ethernet does not support an MTU of
65394.
4. True or False: For average to high virtual Ethernet workloads,

simultaneous multithreading will improve performance.
The answer is true.

Instructor Guide
Topic 4
1. True or False: When using shared Ethernet adapters, set the MTU size to 65390
on the physical adapter for the best performance.
2. True or False: Processor utilization for large packet workloads on jumbo frames is
approximately half that required for MTU 1500.
The answer is true.
3. If you see many collisions or dropped packets for the SEA device, what are the
first two things to investigate?
The answers are VIOS CPU utilization and physical adapter saturation.
4. True or False: You can configure a maximum amount of network bandwidth for
individual clients of a shared Ethernet adapter.
The answer is false. You can only set priorities.
5. True or False: For mixed shared Ethernet adapter and VSCSI services, leave
threading enabled on the shared Ethernet adapter device.
The answer is true.

V5.4.0.3
Instructor Guide
AP Topic 5
1. True or False: The IVE allows partitions to connect to an external network without the need for a
Virtual I/O Server partition.
The answer is true.
2. True or False: Partitions using IVE logical ports must be connected to an external switch to
communicate with each other.
The answer is false. Partitions configured with logical ports on the same physical port do not need to
connect through an external switch to communication with each other.
3. True or False: The standard IVE adapter card on most POWER6 systems will connect 16 LPARs,
but you can optionally order an IVE adapter card which connects up to 32 LPARs.
The answer is true.
4. True or False: An IVE logical port can be used as the physical adapter in an SEA configuration.
The answer is true.
5. You can see the number of QPs by looking at output from what command?
a. lsattr (AIX)
b. entstat (AIX)
c. ifconfig (AIX)
d. lshwres (HMC)
The answer is entstat (AIX).

Instructor Guide
6. True or False: It is best to have the number of QPs equivalent to the number of virtual, dedicated,
or logical processors in a partition (whichever is the highest number).
The answer is true.
7. True or False: The best performance will be between logical ports which share the same internal
switch.
The answer is true.
8. True or False: The MCS value sets the maximum number of available logical ports per physical
port.
The answer is false. MCS value sets the maximum number per port group.
9. True or False: The MCS value sets the number of queue pairs (QPs) in each partition which is
configured for that port group.
The answer is true.

The answer is false. You must completely power off the managed system and power it back on to
change the MCS value.
11. What is the effect of disabling the multicore attribute for an LHEA Ethernet device in an AIX LPAR?
The answer is when you disable the multicore attribute, the device has just one QP.

V5.4.0.3
Instructor Guide
AP Unit 8
1. True or False: The VASI interface controls every phase of the partition
mobility process.
2. True or False: With partition mobility, you can move a partition

between any two IBM Power Systems.
3. What is included in the state that is transferred during the migration?

The answer is the partitions memory, hardware page table (HPT),
processor state, virtual adapter state, non-volatile RAM (NVRAM),
time of day (ToD), partition configuration, and state of each resource.
4. What log usually provides details not found in the HMC migration error
message?
The answer is the config log; alog t cfg.

Instructor Guide
Unit 9
1. Which command can be used to display the code version of the HMC?
The answer is lshmc -V.
2. Your currently installed firmware level is EL320_59_031, and the new service pack
is EL320_061_031. Is this disruptive?
The answer is no.
3. In the service partition, what command is used to apply the updated system
firmware?
The answer is update_flash.
4. List the commands that can be used to back up and restore the volume group data
structures.
The answers are savevgstruct and restorevgstruct.
5. If you resize a backing device at the VIO Server version 1.3 or later, what
commands must be executed at the client partition to use the additional space?
The answer is chvg -g.

V5.4.0.3
Instructor Guide
AP Unit 10
1. Which command can be used to check and monitor a shared Ethernet
adapter failover on a Virtual I/O Server?
The answer is run topas and then press E.

The answer is topasrec.
3. What IBM tool is an integrated suite of systems management tools,

designed for monitoring and managing heterogeneous IT
environments?
The answer is IBM Systems Director.
4. True or False? IBM Tivoli Monitoring monitors and manages system
and network applications on a variety of operating systems, tracks the
availability and performance of your enterprise system, and provides
reports to track trends and troubleshoot problems.
The answer is true.

Instructor Guide

V5.4.0.3
backpg
Back page

An 313 Inst

Enviado por

Dados do documento

Título original

Direitos autorais

Formatos disponíveis

Compartilhar este documento

Compartilhar ou incorporar documento

Opções de compartilhamento

Você considera este documento útil?

Este conteúdo é inapropriado?

Direitos autorais:

Formatos disponíveis

An 313 Inst

Enviado por

Direitos autorais:

Formatos disponíveis

V5.4.0.

Power Systems for AIX -

(Course code AN31)

May 2011 edition

Copyright International Business Machines Corporation 2010, 2011.

Instructor course overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvii

Course description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xix

Unit 1. PowerVM features review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-1

Unit 2. Processor virtualization tuning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-1

Copyright IBM Corp. 2010, 2011 Contents iii

POWER7 intelligent threads . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-11

iv PowerVM Virtualization II Copyright IBM Corp. 2010, 2011

TOC Micro-Partitioning and applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-135

Copyright IBM Corp. 2010, 2011 Contents v

Topic 2: Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-83

Unit 4. Active Memory Sharing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-1

vi PowerVM Virtualization II Copyright IBM Corp. 2010, 2011

TOC HMC utilization data: Memory pool utilization . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-100

Unit 5. Active Memory Expansion: Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-1

Copyright IBM Corp. 2010, 2011 Contents vii

Requesting a trial activation (1 of 2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-80

Unit 6. N_Port ID Virtualization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-1

viii PowerVM Virtualization II Copyright IBM Corp. 2010, 2011

TOC Shared NPIV adapter for efficient path redundancy . . . . . . . . . . . . . . . . . . . . . . . 6-36

Unit 7. I/O device virtualization performance and tuning . . . . . . . . . . . . . . . . . . . . . 7-1

Copyright IBM Corp. 2010, 2011 Contents ix

Performance factors: SMT impact . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-80

x PowerVM Virtualization II Copyright IBM Corp. 2010, 2011

TOC Topic 5: Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-176

Unit 8. Partition mobility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-1

Copyright IBM Corp. 2010, 2011 Contents xi

Unit 9. PowerVM advanced systems maintenance . . . . . . . . . . . . . . . . . . . . . . . . . . .9-1

xii PowerVM Virtualization II Copyright IBM Corp. 2010, 2011

TOC Backing up the client LPAR to virtual DVD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9-104

Unit 10. Virtualization management tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10-1

Appendix A. Checkpoint solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-1

Copyright IBM Corp. 2010, 2011 Contents xiii

xiv PowerVM Virtualization II Copyright IBM Corp. 2010, 2011

Copyright IBM Corp. 2010, 2011 Trademarks xv

xvi PowerVM Virtualization II Copyright IBM Corp. 2010, 2011

pref Instructor course overview

Copyright IBM Corp. 2010, 2011 Instructor course overview xvii

xviii PowerVM Virtualization II Copyright IBM Corp. 2010, 2011

pref Course description

Duration: 4.5 days

Copyright IBM Corp. 2010, 2011 Course description xix

xx PowerVM Virtualization II Copyright IBM Corp. 2010, 2011

Copyright IBM Corp. 2010, 2011 Agenda xxi

xxii PowerVM Virtualization II Copyright IBM Corp. 2010, 2011

Uempty Unit 1. PowerVM features review

What this unit is about

What you should be able to do

How you will check your progress

List the key technologies associated with the IBM POWER

Describe the IBM PowerVM features

Discuss virtualization performance management

Copyright IBM Corporation 2011

Figure 1-1. Unit objectives AN313.1

1-2 PowerVM Virtualization II Copyright IBM Corp. 2010, 2011

Uempty Instructor notes:

Virtualization technologies on POWER systems

Figure 1-2. Virtualization technologies on POWER systems AN313.1

The IBM System and IBM Power System technologies