
Solaris 10 Administration Topics Workshop

2 - Virtualization
By Peter Baer Galvin

For Usenix
Last Revision Apr 2009

Copyright 2009 Peter Baer Galvin - All Rights Reserved

Saturday, May 2, 2009


About the Speaker
Peter Baer Galvin - 781 273 4100
pbg@cptech.com
www.cptech.com
peter@galvin.info
My Blog: www.galvin.info
Bio
Peter Baer Galvin is the Chief Technologist for Corporate Technologies, Inc., a leading
systems integrator and VAR, and was the Systems Manager for Brown University's
Computer Science Department. He has written articles for Byte and other magazines. He
was contributing editor of the Solaris Corner for SysAdmin Magazine, wrote Pete's
Wicked World, the security column for SunWorld magazine, and Pete's Super Systems, the
systems administration column there. He is now the Sun columnist for the Usenix ;login:
magazine. Peter is co-author of the Operating System Concepts and Applied Operating
System Concepts textbooks. As a consultant and trainer, Mr. Galvin has taught tutorials
in security and system administration and given talks at many conferences and
institutions.



Objectives
Cover a wide variety of topics in Solaris 10

Useful for experienced system administrators

Save time

Avoid (my) mistakes

Learn about new stuff


Answer your questions about old stuff

Won't read the man pages to you

Workshop for hands-on experience and to reinforce concepts

Note – Security covered in separate tutorial



More Objectives
What makes a novice vs. an advanced administrator?
Bytes as well as bits, tactics and strategy
Knows how to avoid trouble
How to get out of it once in it
How to not make it worse
Has reasoned philosophy
Has methodology



Prerequisites

Recommend at least a couple of years of Solaris experience
Or at least a few years of other Unix experience
Best is a few years of admin experience, mostly on Solaris



About the Tutorial

Every SysAdmin has a different knowledge set


A lot to cover, but notes should make good reference
So some covered quickly, some in detail
Setting base of knowledge

Please ask questions


But let’s take off-topic off-line
Solaris BOF


Fair Warning
Sites vary
Circumstances vary
Admin knowledge varies
My goals
Provide information useful for each of you at
your sites
Provide opportunity for you to learn from
each other



Why Listen to Me
20 Years of Sun experience
Seen much as a consultant
Hopefully, you've used:
My Usenix ;login: column
The Solaris Corner @ www.samag.com
The Solaris Security FAQ
SunWorld “Pete's Wicked World”
SunWorld “Pete's Super Systems”
Unix Secure Programming FAQ (out of date)
Operating System Concepts (The Dino Book), now 8th ed
Applied Operating System Concepts



Slide Ownership

As indicated per slide, some slides copyright Sun Microsystems
Thanks to Jeff Victor for input
Feel free to share all the slides - as long as you don't charge for them or teach from
them for a fee



Overview
Lay of the Land



Schedule
Times and Breaks



Coverage

Solaris 10+, with some Solaris 9 where needed
Selected topics that are new, different, confusing, underused, overused, etc.



Outline

Overview
Objectives
Virtualization choices in Solaris
Zones / Containers
LDOMS and Domains
VirtualBox
xVM (aka Xen)



Polling Time
Solaris releases in use?
Plans to upgrade?
Other OSes in use?
Use of Solaris rising or falling?
SPARC and x86
OpenSolaris?



Your Objectives?



Your Lab Environment

Apple Macbook Pro


3GB memory
Mac OS X 10.4.10
VMware Fusion 1.0
Solaris Nevada
50 Containers



Lab Preparation
Have a device capable of telnet on the
USENIX network
Or have a buddy
Learn your “magic number”
Telnet to 131.106.62.100 + “magic number”
User “root”, password “lisa”
It’s all very secure
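The lab address above is just the base plus your magic number; a quick sketch of the arithmetic (the 131.106.62.100 base is from the slide, and MAGIC=7 is a made-up example — substitute your own):

```shell
# Compute the lab host address: last octet = 100 + your magic number.
# MAGIC=7 is a hypothetical example, not anyone's real assignment.
MAGIC=7
LAB_HOST="131.106.62.$((100 + MAGIC))"
echo "$LAB_HOST"    # the address you telnet to
```

For magic number 7 this prints 131.106.62.107.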



Lab Preparation

Or...
Use virtualbox
Use your own system
Use a remote machine you have legit
access to





Choosing Virtualization Technologies

(See separate “virtualization comparison” document)



(Sun slide: the spectrum of virtualization technologies)

Hard Partitions | Virtual Machines | OS Virtualization | Resource Management

Multiple OSes ................................ Single OS
Trend to flexibility ................ Trend to isolation

Hard Partitions: Dynamic System Domains
Virtual Machines: Logical Domains, Sun xVM, Xen, VMware, Hyper-V
OS Virtualization: Solaris Containers, Containers for Linux, Solaris 8 Containers, Solaris 9 Containers
Resource Management: Solaris Resource Manager (SRM); (Zones + SRM) = In Solaris 10

Sun Microsystems Proprietary - Copyright 2008


(Sun slide: positioning the technologies)

Solaris Containers:
Reduces kernel instances, memory footprint,
workload-driven CPU utilization, OS management
costs, fine-grained security
Single-kernel, heterogeneous application
environments

Dynamic Systems Domains:
Maximizes hardware isolation

Logical Domains, Virtual Machines:
Multiple kernels, full OS environments,
heterogeneous

Technologies are complementary


Solaris Containers and Virtual Machines

(Sun diagram: several Containers running inside Domain A, an LDom guest, and
Dom0, all on one computer - Containers layer on top of Dynamic Domains, LDoms,
and Sun xVM, so the technologies combine rather than compete.)


Zones, Containers, and
LDOMS



Overview

Cover details and use of Zones/Containers and LDOMS
Note that Xen (x64 only) and VirtualBox (open source, x64 only) are coming
No slides yet



Zones Overview
Think of them as chroot on steroids
Virtualized operating system services
Isolated and “secure” environment for running apps
Apps and users (and superusers) in a zone cannot see /
affect other zones
Delegated admin control

Virtualized device paths, network interfaces, network
ports, process space, resource use (via resource manager)
Application fault isolation
Detach and attach containers between systems
Cloning of a zone to create identical new zone


Zones Overview - 2
Low physical resource use
Up to 8192 zones per system!

Differentiated file systems
Multiple versions of an app installed and running on a given system

Inter-zone communication is only via network (but short-pathed
through the kernel)

No application changes needed – no API or ABI


Can restrict disk use of a zone via the loopback file driver (lofi) using
a file as a file system

Can dedicate an Ethernet port to a zone

Allowing snooping, firewalling, managing that port by the zone
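The “file as a file system” trick above starts with a fixed-size backing file, which becomes the zone's hard disk cap; a hedged sketch (only the portable dd step runs here - the Solaris-only lofiadm/newfs/mount steps are shown as comments, and the sizes and paths are illustrative):

```shell
# Create a fixed-size backing file; the zone can never consume more disk than this.
dd if=/dev/zero of=/tmp/zonefs.img bs=1048576 count=10 2>/dev/null

# On Solaris you would then (illustrative, not run here):
#   lofiadm -a /tmp/zonefs.img           # attach, returns e.g. /dev/lofi/1
#   newfs /dev/rlofi/1                   # build a UFS file system inside the file
#   mount /dev/lofi/1 /zones/app1-data   # mount it, then hand it to the zone via "add fs"

wc -c < /tmp/zonefs.img    # the 10MB ceiling
```

Whatever the zone writes there, it can never exceed the size of the backing file.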



Other Virtualization Options
Many virtualization options to consider

Containers is just one of them

Xen (xVM) - being integrated into Solaris Nevada

Run other OSes (Linux, Windows) with S10+ as the host

Industry semi-standard

Para-virtualization, x86 only

LDOMs - hard partitions, shipped in May 2007

Run multiple copies of Solaris on the same CoolThreads chip
(Niagara, Rock in the future)

Some resource management - move CPUs and mem

VMware - Solaris as a guest, not a host so far, x86 only

Traditional Sun Domains - SPARC only, Enterprise servers only



Solaris Zones

(Sun diagram: the global zone plus three non-global zones - a web zone, an app
zone, and a database zone - all sharing one kernel and one platform.)

global zone (zone root: /): system services; audit services (auditd);
security services (login, BSM); core services (inetd, rpcbind, sshd, ...);
remote admin/monitoring (SNMP, SunMC, WBEM); platform administration
(syseventd, devfsadm, ifconfig, ...); zone management (zonecfg, zoneadm,
zlogin); one zoneadmd per running zone; the console; the storage complex;
network devices (ce0, ce1, hme0)

web zone (zone root: /zone/web): web service project (Apache 1.3.22);
crypto project (ssl); proxy project (proxy)

app zone (zone root: /zone/app): jes project (j2se); app users project
(sh, bash, prstat); system project (inetd, sshd)

database zone (zone root: /zone/mysql): mysql project (mysqld); dba users
project (sh, bash, prstat); system project (inetd, sshd)

Each non-global zone gets its own zcons console, a virtual /usr, and a
virtual network interface (ce0:1, ce0:2, hme0:1) on the shared physical NICs.

Sun Microsystems Proprietary - Copyright 2008


(From the Solaris 10 Sun Net Talk about Solaris 10 Security)


Zone Limits
Only one OS installed on a system

One set of OS patches

Only one /etc/system


Although Sun working to move as many settings as possible out of /etc/
system

System crash / OS crash -> all zones crash

Each (sparse) zone uses
~ 100MB of disk
some VM and physical memory (for processes and daemons running in the zone)
- ~40MB of physical memory
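The per-zone figures above make zone-farm sizing easy to estimate; a back-of-envelope sketch (the ~100MB disk and ~40MB RAM numbers are the slide's rough figures, and the 50-zone count matches the lab machine):

```shell
# Rough footprint of N sparse zones, using the slide's per-zone costs.
awk 'BEGIN {
    n = 50                       # zone count (the lab laptop runs 50 Containers)
    disk_mb = n * 100            # ~100MB of disk per sparse zone
    ram_mb  = n * 40             # ~40MB of physical memory per booted zone
    print n " zones: ~" disk_mb " MB disk, ~" ram_mb " MB physical memory"
}'
```

This prints "50 zones: ~5000 MB disk, ~2000 MB physical memory" - which is why 50 zones fit on a 3GB laptop.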



Sparse vs. Whole Root Zone
Sparse:
Loop-back mount of system directories (/usr, etc)
Little disk space use
Each zone shares global-zone system binaries -> shared memory
Apps may not be supported
Cannot change system files
Inter-zone communication only via network

Whole-Root:
Full install of all system files
Lots of disk space
Each binary independent -> memory use
Apps may not be supported (but more likely)
Can change system files
Inter-zone communication only via network


Zone File Systems

(Sun diagram: the global-zone view vs. the zone's own view of a zone's file tree.)

Global view - Zone root: /zones/zonea
The global zone's / holds /zones, /usr, /dev, etc.; each zone's files live
under its zone root (e.g. /zones/zonea, /zones/zoneb)

Zone view - Zone root: /
Inside the zone, the same tree appears as an ordinary root with its own
/bin, /usr, /dev, /etc

Sun Microsystems Proprietary - Copyright 2008


Zone File Systems, Sparse Root

(Sun diagram: as on the previous slide, but in a sparse-root zone directories
such as /usr are lofs loopback mounts from the global zone, not private copies.)

Global view - Zone root: /zones/zone1; the /usr inside the zone root points
back, read-only, into the global zone's /usr

Zone view - Zone root: /; the zone still sees /bin, /usr, /dev, /etc as normal


Global Zone
Aka the usual system
Is assigned ID 0 by the system
Provides the single instance of the Solaris kernel
that is bootable and running on the system
Contains a complete installation of the Solaris
system software packages
Can contain additional software packages or
additional software, directories, files, and other
data not installed through packages



Global Zone - 2
Provides a complete and consistent product
database that contains information about all
software components installed in the global
zone
Holds configuration information specific to the
global zone only, such as the global zone host
name and file system table
Is the only zone that is aware of all devices and
all file systems



Global Zone - 3
Is the only zone with knowledge of non-global
zone existence and configuration
Is the only zone from which a non-global zone
can be configured, installed, managed, or
uninstalled
Can see the file systems of the non-global
zones (i.e. can copy files into the non-global
zone roots for the non-global zones to see)



Non-global Zones
Is assigned a zone ID by the system when the
zone is booted
Shares operation under the Solaris kernel booted from the
global zone
Contains an installed subset of the complete Solaris
Operating System software packages
Contains Solaris software packages shared from the global
zone (“sparse zone”)
Can contain additional installed software packages not
shared from the global zone



Non-global Zones -2
Can contain additional software, directories, files, and other data
created on the non-global zone that are not installed through
packages or shared from the global zone
Has a complete and consistent product database that contains
information about all software components installed on the zone,
whether present on the non-global zone or shared read-only
from the global zone
Is not aware of the existence of any other zones
Cannot install, manage, or uninstall other zones, including itself
Has configuration information specific to that non-global zone
only, such as the non-global zone host name and file system table



“Sparse” and “Whole Root” Zones
By default /lib, /platform, /sbin, /usr are LOFS read-only mounted
from global zone into child zone
Ergo those can’t be modified by child zone
Packages installed in the child zone install only non-(/lib, /platform, /sbin, /usr)
components into the child zone's file systems
Saves disk space
Saves memory

Whole root zone removes those mounts


Packages install entirely
Ergo child zone can modify its /lib, /platform, /sbin, /usr

Some apps not supported in zones, some only in whole root, some in
sparse root
Per app check with app vendor!
Note that ZFS clone use for zone builds may mean that sparse root is no
longer useful!



Non-global Zone States
Configured - The zone’s configuration is complete and committed to
stable storage, not initially booted
Incomplete - During an install or uninstall operation
Installed - The zone’s configuration is instantiated on the system but
no virtual platform. Files copied into zoneroot.
Ready - The virtual platform for the zone is established. The kernel
creates the zsched process, network interfaces are plumbed, file
systems are mounted, and devices are configured. A unique zone ID
is assigned by the system, no processes associated with the zone
have been started.
Running - User processes associated with the zone application
environment are running.
Shutting down and Down - These states are transitional states that
are visible while the zone is being halted. However, a zone that is
unable to shut down for any reason will stop in one of these states.


(From System Administration Guide: N1Grid Containers, Resource Management, and Solaris Zones)


Zone boot

Note that zoneadm allows “boot”, “reboot”,
“halt”, and “shutdown”. Only “shutdown”
and “boot” execute the smf commands
Also note that there are many options to
these commands (such as zoneadm boot
-- -m verbose)



Zone Configuration
Data from the following are not referenced or copied when a zone is
installed:
Non-installed packages
Patches
Data on CDs and DVDs
Network installation images
Any prototype or other instance of a zone
In addition, the following types of information, if present in the global zone,
are not copied into a zone that is being installed:
New or changed users in the /etc/passwd file
New or changed groups in the /etc/group file
Configurations for networking services such as DHCP address assignment,
UUCP, or sendmail
Configurations for network services such as naming services
New or changed crontab, printer, and mail files
System log, message, and accounting files


Zone Configuration
zlogin –C logs in to a just-booted virgin zone
Only root can zlogin – normal zone access is via network

The usual sysidconfig questions are asked
(hostname, name service, timezone, kerberos)
The zone root directory must exist prior to zone
installation
Zone reboots to put configuration changes into effect (a
few seconds)
Messages look like a system reboot (within your window)



sysidcfg
Create to shorten first boot questions
File gets copied into <zonehome>/root/etc
Sample contents:
name_service=DNS
{domain_name=petergalvin.info
name_server=63.240.76.19
search=arp.com}
network_interface=PRIMARY
{hostname=zone00.petergalvin.info}
timezone=US/Eastern
terminal=vt100
system_locale=C
timeserver=localhost
root_password=aMG0YPkgZQPqo <obviously change this>
security_policy=NONE
nfsv4_domain=dynamic
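A sysidcfg is just a flat answer file dropped into <zonehome>/root/etc before first boot; a sketch that writes one (the values are copied from the sample above, the root_password line is deliberately left out, and the /tmp path is only for illustration):

```shell
# Write a sysidcfg answer file; on a real system it goes in <zonehome>/root/etc.
# Values copied from the slide's sample; add your own root_password= line.
cat > /tmp/sysidcfg <<'EOF'
name_service=DNS
{domain_name=petergalvin.info
name_server=63.240.76.19
search=arp.com}
network_interface=PRIMARY
{hostname=zone00.petergalvin.info}
timezone=US/Eastern
terminal=vt100
system_locale=C
timeserver=localhost
security_policy=NONE
nfsv4_domain=dynamic
EOF
grep timezone /tmp/sysidcfg    # sanity-check one answer
```

At the zone's first boot the sysid tools read this file and skip the matching interactive questions.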


Zone Configuration - 2
# zonecfg -z app1
app1: No such zone configured
Use 'create' to begin configuring a new zone.
zonecfg:app1> create
zonecfg:app1> set zonepath=/opt/zone/app1
zonecfg:app1> set autoboot=false
zonecfg:app1> add net
zonecfg:app1:net> set physical=pnc0
zonecfg:app1:net> set address=192.168.118.140
zonecfg:app1:net> end
zonecfg:app1> add fs
zonecfg:app1:fs> set dir=/export/home
zonecfg:app1:fs> set special=/export/home
zonecfg:app1:fs> set type=lofs
zonecfg:app1:fs> end
zonecfg:app1> add inherit-pkg-dir
zonecfg:app1:inherit-pkg-dir> set dir=/opt/sfw
zonecfg:app1:inherit-pkg-dir> end
zonecfg:app1> verify
zonecfg:app1> commit
zonecfg:app1> exit



Zone Configuration - 3
# df -k
Filesystem kbytes used avail capacity Mounted on
/dev/dsk/c0d0s0 5678823 2689099 2932936 48% /
/devices 0 0 0 0% /devices
/dev/dsk/c0d0p0:boot 10296 1401 8895 14% /boot
proc 0 0 0 0% /proc
mnttab 0 0 0 0% /etc/mnttab
fd 0 0 0 0% /dev/fd
swap 600780 28 600752 1% /var/run
swap 600776 24 600752 1% /tmp
/dev/dsk/c0d0s7 4030684 32853 3957525 1% /export/home
# zoneadm -z app1 verify
WARNING: /opt/zone/app1 does not exist, so it cannot be verified.
When 'zoneadm install' is run, 'install' will try to create
/opt/zone/app1, and 'verify' will be tried again,
but the 'verify' may fail if:
the parent directory of /opt/zone/app1 is group- or other-writable
or
/opt/zone/app1 overlaps with any other installed zones.
could not verify net address=192.168.118.140 physical=pnc0: No such device or address
zoneadm: zone app1 failed to verify



Zone Configuration - 4
# ls -l /opt/zone
total 2
drwx------ 4 root other 512 Aug 21 12:44 test
# mkdir /opt/zone/app1
# chmod 700 /opt/zone/app1
# ls -l /opt/zone
total 4
drwx------ 2 root other 512 Sep 16 15:14 app1
drwx------ 4 root other 512 Aug 21 12:44 test
# zoneadm -z app1 verify
could not verify net address=192.168.118.140
physical=pnc0: No such device or address
zoneadm: zone app1 failed to verify
# zonecfg -z app1
zonecfg:app1> info
zonepath: /opt/zone/app1
autoboot: false


Zone Configuration - 5
net:
address: 192.168.118.140
physical: pnc0
zonecfg:app1> remove physical=pnc0
zonecfg:app1> add net
zonecfg:app1:net> set physical=pcn0
zonecfg:app1:net> set address=192.168.118.140
zonecfg:app1:net> end
zonecfg:app1> exit
# zoneadm -z app1 verify
# zoneadm -z app1 install
Preparing to install zone <app1>.
Creating list of files to copy from the global zone.
Copying <2199> files to the zone.
Initializing zone product registry.
Determining zone package initialization order.
Preparing to initialize <779> packages on the zone.
Initializing package <0> of <779>: percent complete: 0%
. . .


Zone Configuration -6
Zone <app1> is initialized.
The file </opt/zone/app1/root/var/sadm/system/logs/install_log> contains a
log of the zone installation.

# zoneadm list -v
ID NAME STATUS PATH
0 global running /
1 test running /opt/zone/test

# df -k
Filesystem kbytes used avail capacity Mounted on
/dev/dsk/c0d0s0 5678823 2766177 2855858 50% /
/devices 0 0 0 0% /devices
/dev/dsk/c0d0p0:boot 10296 1401 8895 14% /boot
proc 0 0 0 0% /proc
mnttab 0 0 0 0% /etc/mnttab
fd 0 0 0 0% /dev/fd
swap 594332 32 594300 1% /var/run
swap 594500 200 594300 1% /tmp
/dev/dsk/c0d0s7 4030684 32853 3957525 1% /export/home


Zone Configuration -7
# zoneadm -z app1 boot
zoneadm: zone 'app1': WARNING: pcn0:2: no matching subnet found in netmasks(4) for 192.168.118.131; using default of
192.168.118.131.
# zoneadm list -v
ID NAME STATUS PATH
0 global running /
1 test running /opt/zone/test
2 app1 running /opt/zone/app1
# telnet 192.168.118.140
Trying 192.168.118.140...
telnet: Unable to connect to remote host: Connection refused

# zlogin -C app1
[Connected to zone 'app1' console]

Select a Locale

0. English (C - 7-bit ASCII)


1. U.S.A. (UTF-8)
2. Go Back to Previous Screen

Please make a choice (0 - 2), or press h or ? for help: 0

. . .



Zone Configuration -8
rebooting system due to change(s) in /etc/default/init

[NOTICE: Zone rebooting]

SunOS Release 5.10 Version s10_63 32-bit


Copyright 1983-2004 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
Hostname: zone-app1
The system is coming up. Please wait.
starting rpc services: rpcbind done.
syslog service starting.
Sep 16 15:48:24 zone-app1 sendmail[7567]: My unqualified host
name (zone-app1) unknown; sleeping for retry
Sep 16 15:49:24 zone-app1 sendmail[7567]: unable to qualify my
own domain name (zone-app1) -- using short name
WARNING: local host name (zone-app1) is not qualified; see cf/
README: WHO AM I?
/etc/mail/aliases: 12 aliases, longest 10 bytes, 138 bytes total


Zone Configuration -9
Creating new rsa public/private host key pair
Creating new dsa public/private host key pair
The system is ready.
zone-app1 console login: root
Password:
Sep 16 15:51:08 zone-app1 login: ROOT LOGIN /dev/console
Sun Microsystems Inc. SunOS 5.10 s10_63 May 2004
# cat /etc/passwd
root:x:0:1:Super-User:/:/sbin/sh
daemon:x:1:1::/:
bin:x:2:2::/usr/bin:
. . .
noaccess:x:60002:60002:No Access User:/:
nobody4:x:65534:65534:SunOS 4.x NFS Anonymous Access
User:/:



Zone Configuration -10
# useradd -u 101 -g 14 -d /export/home/pbg -s /bin/bash pbg
# passwd pbg
New Password:
Re-enter new Password:
passwd: password successfully changed for pbg
# zoneadm list -v
ID NAME STATUS PATH
3 app1 running /
# exit
zone-app1 console login: ~.
[Connection to zone 'app1' console closed]



Zone Configuration - 11
# zoneadm list -v
ID NAME STATUS PATH
0 global running /
1 test running /opt/zone/test
3 app1 running /opt/zone/app1
# uptime
3:53pm up 5:14, 1 user, load average: 0.23, 0.34, 0.43
# telnet 192.168.118.140
Trying 192.168.118.140...
Connected to 192.168.118.140.
Escape character is '^]'.
Login: pbg
Password:



Zones and ZFS
Installing a zone with its root on ZFS is not supported as
the system then lacks the ability to be upgraded.
Note that “add fs” can be used to add access to a ZFS file
system to a zone
Beyond that, “add dataset” delegates a ZFS file system to
a zone, removes it from the global zone
The zone can manage the file system, except where management
would affect other file systems / the parent file system
Filesystem contents can still be seen from global zone via zonepath
+mountpoint (i.e. /zones/zone00/zfs/zonefs/zone00)
# zfs create zfs/zonefs/zone00
# zonecfg -z zone00
zonecfg:zone00> add dataset
zonecfg:zone00:dataset> set name=zfs/zonefs/zone00
zonecfg:zone00:dataset> end


Zone Script
create -b
set zonepath=/opt/zones/zone0
set autoboot=false
add inherit-pkg-dir
set dir=/lib
end
add inherit-pkg-dir
set dir=/platform
end
add inherit-pkg-dir
set dir=/sbin
end

Copyright 2009 Peter Baer Galvin - All Rights Reserved 58

Saturday, May 2, 2009


Zone Script
add inherit-pkg-dir
set dir=/usr
end
add inherit-pkg-dir
set dir=/opt/sfw
end
add net
set address=192.168.128.200
set physical=pcn0
end
add rctl
set name=zone.cpu-shares
add value (priv=privileged,limit=1,action=none)
end
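The two “Zone Script” slides are one zonecfg(1M) command file; feeding it to zonecfg replays the whole configuration non-interactively. A sketch that assembles such a file (contents abridged from the slides - the /tmp path is illustrative, and the final zonecfg step is Solaris-only):

```shell
# Assemble a zone-creation command file (abridged from the slides above).
cat > /tmp/zone0.cfg <<'EOF'
create -b
set zonepath=/opt/zones/zone0
set autoboot=false
add inherit-pkg-dir
set dir=/lib
end
add inherit-pkg-dir
set dir=/usr
end
add net
set address=192.168.128.200
set physical=pcn0
end
add rctl
set name=zone.cpu-shares
add value (priv=privileged,limit=1,action=none)
end
EOF
# On Solaris (illustrative, not run here):
#   zonecfg -z zone0 -f /tmp/zone0.cfg
grep -c '^end$' /tmp/zone0.cfg    # one "end" closes each resource block
```

Keeping the configuration in a file like this makes zone creation repeatable and diffable.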



Life in a Zone
# ifconfig -a
lo0: flags=1000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
lo0:1: flags=1000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4> mtu 8232 index 1
zone test
inet 127.0.0.1 netmask ff000000
lo0:2: flags=1000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4> mtu 8232 index 1
zone app1
inet 127.0.0.1 netmask ff000000
pcn0: flags=1004843<UP,BROADCAST,RUNNING,MULTICAST,DHCP,IPv4> mtu 1500 index 2
inet 192.168.80.128 netmask ffffff00 broadcast 192.168.80.255
ether 0:c:29:44:a9:df
pcn0:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
zone test
inet 192.168.80.139 netmask ffffff00 broadcast 192.168.80.255
pcn0:2: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
zone app1
inet 192.168.80.140 netmask ffffff00 broadcast 192.168.80.255



Life in a Zone - 2
$ telnet 192.168.80.140
. . .
$ df -k
Filesystem kbytes used avail capacity Mounted on
/ 9515147 1894908 7525088 21% /
/dev 9515147 1894908 7525088 21% /dev
/export/home 10076926 10369 9965788 1% /export/home
/lib 9515147 1894908 7525088 21% /lib
/platform 9515147 1894908 7525088 21% /platform
/sbin 9515147 1894908 7525088 21% /sbin
/usr 9515147 1894908 7525088 21% /usr
proc 0 0 0 0% /proc
mnttab 0 0 0 0% /etc/mnttab
fd 0 0 0 0% /dev/fd
swap 1043072 16 1043056 1% /var/run
swap 1043056 0 1043056 0% /tmp
$ touch /usr/foo
touch: /usr/foo cannot create

Note that virtual memory (and therefore swap) are global resources


Life in a Zone - 3
$ ps -ef
UID PID PPID C STIME TTY TIME CMD
root 11120 11120 0 11:00:35 ? 0:00 zsched
pbg 11377 11347 0 11:01:28 pts/8 0:00 ps -ef
root 11229 11120 0 11:00:40 ? 0:00 /usr/sbin/cron
root 11341 11120 0 11:00:46 ? 0:00 /usr/sfw/sbin/snmpd
root 11266 11120 0 11:00:41 ? 0:00 /usr/lib/im/htt -port 9010 -s
yslog -message_locale C
root 11339 11336 0 11:00:46 ? 0:00 /usr/lib/saf/ttymon
root 11250 11120 0 11:00:41 ? 0:00 /usr/lib/utmpd
root 11264 11261 0 11:00:41 ? 0:00 /usr/sadm/lib/smc/bin/smcboot
root 11261 11120 0 11:00:41 ? 0:00 /usr/sadm/lib/smc/bin/smcboot
root 11227 11120 0 11:00:40 ? 0:00 /usr/sbin/nscd
root 11218 11120 0 11:00:40 ? 0:00 /usr/lib/autofs/automountd
root 11325 11120 0 11:00:45 ? 0:00 /usr/lib/dmi/snmpXdmid -s zon
e-app1
root 11239 11120 0 11:00:40 ? 0:00 /usr/lib/sendmail -bd -q15m
root 11265 11261 0 11:00:41 ? 0:00 /usr/sadm/lib/smc/bin/smcboot
root 11230 11120 0 11:00:40 ? 0:00 /usr/sbin/inetd -s
root 11273 11266 0 11:00:42 ? 0:00 htt_server -port 9010 -syslog
-message_locale C
root 11129 11120 0 11:00:36 ? 0:00 init



Life in a Zone - 4
# mount -p
/ - / ufs - no rw,intr,largefiles,logging,xattr,onerror=panic
/dev - /dev lofs - no zonedevfs
/export/home - /export/home lofs - no
/lib - /lib lofs - no ro,nodevices,nosub
/platform - /platform lofs - no ro,nodevices,nosub
/sbin - /sbin lofs - no ro,nodevices,nosub
/usr - /usr lofs - no ro,nodevices,nosub
proc - /proc proc - no nodevices,zone=app1
mnttab - /etc/mnttab mntfs - no nodevices,zone=app1
fd - /dev/fd fd - no rw,nodevices,zone=app1
swap - /var/run tmpfs - no nodevices,xattr,zone=app1
swap - /tmp tmpfs - no nodevices,xattr,zone=app1
# hostname
zone-app1
# zonename
app1



Zone Clone
As of S10 8/07, zones are “cloneable”
Much faster than installing a zone

As of 10/08 zones on ZFS -> ZFS clone - instantaneous

Usable only if the zones have similar configs

Configure a zone, e.g. zone00

Install the zone

Configure a new zone, e.g. zone01

Then rather than zoneadm install, with zone00 halted, do


# zoneadm –z zone01 clone –m copy zone00



Zone Clone (cont)
A cloned zone is unconfigured and must be
configured
When ZFS used as clone file system
# zoneadm -z <newzone> clone <oldzone>
Can clone a zone’s previously-taken
snapshot via
# zoneadm -z <newzone> clone -s \
<snapshot name> <oldzone>



Zone Clone (cont)
So to clone zone1 to make zone2
# zonecfg -z zone1 export -f configfile
Edit configfile to change zonepath and address (at
least)
Create zone2 via zonecfg -z zone2 -f configfile
Halt zone1 via zoneadm -z zone1 halt
Clone zone1 via zoneadm -z zone2 clone zone1
Use “-m copy” if zone1 on UFS
Boot up both zones
Check status via zoneadm list -iv
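The “edit configfile” step above is ordinary text surgery on the exported config; a sketch using sed (the sample export and zone names are illustrative stand-ins for a real “zonecfg -z zone1 export -f” output):

```shell
# A minimal stand-in for an exported zone1 config (illustrative contents).
cat > /tmp/zone1.cfg <<'EOF'
create -b
set zonepath=/opt/zones/zone1
set autoboot=false
add net
set address=192.168.128.201
set physical=pcn0
end
EOF

# Rewrite zonepath and address for the clone target, then this file would
# feed "zonecfg -z zone2 -f /tmp/zone2.cfg" on Solaris.
sed -e 's|zonepath=/opt/zones/zone1|zonepath=/opt/zones/zone2|' \
    -e 's|192\.168\.128\.201|192.168.128.202|' \
    /tmp/zone1.cfg > /tmp/zone2.cfg
grep -E 'zonepath|address' /tmp/zone2.cfg
```

Forgetting either substitution is the classic clone mistake - two zones with the same zonepath or IP will fail verify or fight on the network.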


Zone Migration
Zones can be moved between like systems
Available S10 8/07
Separate the zone from its current system
# zoneadm –z <zone> detach
Note zone must be halted first
Attach a detached zone to a different system (assuming its
file system is now visible there, send a tarball, etc)
# zoneadm –z <zone> attach [-F]
Note zone must be configured before this can work
Note new system is validated to assure the zone can function there
To create a config for a zone that is detached rather than
having to zonecfg it from scratch
# zonecfg –z <zone> create -a zonepath
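Shipping the detached zone's files (“send a tarball”) is plain tar of the zonepath; a sketch with throwaway directories standing in for the two systems (all paths hypothetical - the detach/attach commands themselves are Solaris-only and shown as comments):

```shell
# Stand-ins for the zonepath on the old and new systems (illustrative paths).
SRC=/tmp/oldhost/zones/web
DST=/tmp/newhost/zones/web
mkdir -p "$SRC/root/etc" "$DST"
echo 'web' > "$SRC/root/etc/zonename"

# After "zoneadm -z web detach" on the old system, copy the zonepath across:
tar cf - -C "$SRC" . | (cd "$DST" && tar xf -)

# On the new system (illustrative, not run here):
#   zonecfg -z web create -a /tmp/newhost/zones/web
#   zoneadm -z web attach
cat "$DST/root/etc/zonename"    # the zone's files arrived intact
```

In practice NFS, SAN re-zoning, or ZFS send/receive can replace the tar pipe; the attach step is the same either way.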


Zone Migration (cont)
Can dry-run an attach / detach via the “-n” option to
see if the attach will work
Can upgrade the attaching zone on the attaching
system via “-u” but only if all packages on the
attaching system are as new or newer than the
detaching system
Can force an attach if a detach could not be done
(dead system for example)
Best to save your zone cfg files for use on the
attach system (or you have to recreate them)
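The migration flow above can likewise be sketched as a dry-run script; it only prints the detach/attach sequence (zone name and zonepath are illustrative), so it is safe to review before running on real source and target hosts:

```shell
# Dry-run sketch of a detach/attach migration; prints the commands for
# review only. The zonepath must be visible on the target system.
migrate_zone() {
  zone=$1
  echo "zoneadm -z $zone halt"
  echo "zoneadm -z $zone detach"
  echo "# make the zonepath visible on the target (shared storage, tarball)"
  echo "zonecfg -z $zone create -a /zones/$zone   # on the target system"
  echo "zoneadm -z $zone attach -n                # dry-run validation first"
  echo "zoneadm -z $zone attach"
}
migrate_zone zone1
```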


Other Cool Zone Stuff

ps –Z shows zone in which each process is running


Can use resource manager with zones
Zones can use global naming services
Use features to enable or disable accounts per zone
Interzone networking executed via loopback for
performance



Labs
Create a “simple” zone
Install it
Boot it
Configure it
Look around in it - file systems, processes,
resource use, users, etc
Halt it



Zones and DTrace

Zones can get some DTrace privileges (starting 11/06)


# zonecfg -z my-zone
zonecfg:my-zone> set
limitpriv="default,dtrace_proc,dtrace_user"
zonecfg:my-zone> exit

DTrace can use zonename as a predicate to filter
results
# dtrace -n 'syscall:::/zonename=="zone1"/
{@[probefunc]=count()}'
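A related sketch (run from the global zone, or from a zone granted the DTrace privileges above): a one-line D script that counts syscalls per zone by aggregating on the zonename built-in variable. It is written to a file here for review; the path is illustrative:

```shell
# Stage a small D script; run it later on a Solaris host as:
#   dtrace -s /tmp/per-zone-syscalls.d
cat <<'EOF' > /tmp/per-zone-syscalls.d
syscall:::entry { @[zonename] = count(); }
EOF
cat /tmp/per-zone-syscalls.d
```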



Fair-share Scheduling
Solaris has many scheduler classes available

A thread has priority 0-169, user threads are 0-59

The higher the priority, the sooner scheduled on CPU

Scheduler class decides how the priority is modified over time

Default user-land is Time-sharing

Time-sharing dynamically changes the priority of each thread


based on its activity

If a thread used its time quantum, its priority decreases

(The quantum is the scheduling interval)

Kernel uses “sys” class

Have a look via ps -elfc



Fair-share Scheduling
Fair share scheduler example (diagram): four containers
with CPU shares: Backup = 1, AppServer = 2, Web = 3,
Database = 4
Database gets 4 / (4+3+2+1) = 40% of all CPU
time available to container
Sun Microsystems Proprietary - Copyright 2008



Zones and Fair Share Scheduling
FSS allows all CPU to be used if needed, but overuse to
be limited based on “shares” given to CPU users
Shares are given to projects et al, and/or to containers
Load the fair share schedule as the default schedule
class
dispadmin -d FSS
Move all processes into the FSS class
priocntl -s -c FSS -i class TS
Give the global zone some (2) shares
prctl -n zone.cpu-shares -v 2 -r -i zone global
Note this is not persistent across reboots!
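The entitlement a share allocation implies is simple arithmetic. A sketch using numbers that appear in this deck (global = 2 shares here; the Web1/Web2 zones configured later in the DRP example get 3 and 2 shares):

```shell
# A zone's guaranteed CPU fraction under FSS is its shares divided by the
# total shares of all active CPU consumers (idle zones give up their slice).
fss_pct() {    # usage: fss_pct <zone_shares> <total_shares>
  echo $(( $1 * 100 / $2 ))
}
total=$((2 + 3 + 2))       # global + Web1-zone + Web2-zone
fss_pct 2 "$total"         # global zone: prints 28 (roughly 28%)
```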



Zones and Fair-share scheduling (2)

Check the shares of the global zone


prctl -n zone.cpu-shares -i zone global
Add a zone-wide resource control (1 share) to a zone
(within zonecfg) (before S10U5)
zonecfg:my-zone> add rctl
zonecfg:my-zone:rctl> set name=zone.cpu-shares
zonecfg:my-zone:rctl> add value \
(priv=privileged,limit=1,action=none)
zonecfg:my-zone:rctl> end
How many total shares are given out on a given
machine?
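One way to answer that question is to walk the zone list from the global zone. A sketch that just prints the prctl invocations you would run (the zone names are illustrative; on a real host replace the list with the output of zoneadm list):

```shell
# Print the per-zone share query commands; sum the reported
# zone.cpu-shares values to get the machine-wide total.
list_share_cmds() {
  for z in "$@"; do
    echo "prctl -n zone.cpu-shares -i zone $z"
  done
}
list_share_cmds global zone1 zone2
```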



FX Scheduler
Time-share is heavy weight scheduler
Has to calculate for every thread that ran
in the last quantum, every quantum
Plus decreases priority on CPU hogs
Instead consider “FX” - fixed scheduler class
All priorities stay the same
Light weight schedule can gain back a few
percent of CPU


Dynamic Resource Pools
• Designed to group chosen resources such as CPUs,
memory, I/O connections
• A pool can be associated with CPUs and a scheduler
• CPUs can be assigned:
> dynamically, by configuring a minimum and maximum
number of CPUs that a zone or pool should use
> by Solaris when it decides to transfer CPUs among
existing pools with "threshold" and "importance"
parameters
> statically, by "pinning" a CPU to a pool - useful to
ensure that a process stays on a CPU and doesn't
share the CPU's cache
> A CPU is moved between pools when an "important"
workload surpasses its utilization threshold for a
sufficient period of time


Dynamic Resource Pools (cont)
• There is one pool configuration per Solaris instance
• By default, one pool exists (pool_default)
• These can be bound to a pool:
> Process, task, project, Container
• A Container can be statically assigned to an
existing "shared" pool when the Container boots
> Multiple Containers can share that pool
> Such a Container only uses resources when it is
running
• A Container can be assigned to a temporary pool
> Pool only exists while Container runs
> That pool cannot be shared with other Containers


DRPs
You can make “DRP”s non-dynamic by not including
a variation in the range (i.e. 2 to 2 rather than 1 to 2)
Probably preferred rather than real dynamic
With pools, interrupts and I/O only occur in the
default pool
This can help pin a process to a set of CPUs
Cache stays hot, less context switching
So consider a DRP config with the kernel in the
default pool and all apps in another pool
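That layout can be expressed with poolcfg. A dry-run sketch that prints the commands for a fixed-size, non-dynamic application pool (the pool/pset names and CPU count are illustrative):

```shell
# Print poolcfg commands for a non-dynamic pool: min == max pins the
# pset size, so nothing moves even though DRPs are enabled.
mk_fixed_pool() {
  name=$1; ncpu=$2
  echo "poolcfg -c 'create pset ${name}-pset (uint pset.min=${ncpu}; uint pset.max=${ncpu})'"
  echo "poolcfg -c 'create pool ${name}-pool'"
  echo "poolcfg -c 'associate pool ${name}-pool (pset ${name}-pset)'"
  echo "pooladm -c"
}
mk_fixed_pool apps 2
```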



Zones and Dynamic Resource Pools
Assign zones to dedicated CPU resources
Used to assign zone to processor set

Can be dynamically created, deleted, modified

Can be used with FSS


Can be used to reduce Oracle (and other?) costs!
Consider two DRPs, one with an email container
and one with 2 X web server containers (and
global) (from http://www.sun.com/software/solaris/
howtoguides/containersLowRes.jsp):



Zones and DRPs (cont)



Zones and DRPs (cont)
Create a pool (from global zone) via
# # enable DRPs
# pooladm -e
# # save current config
# pooladm -s
# # show current state, at start only pool_default exists
global# pooladm
system my_system
string system.comment
int system.version 1
boolean system.bind-default true
int system.poold.pid 638
pool pool_default
int pool.sys_id 0
boolean pool.active true
boolean pool.default true
int pool.importance 1
string pool.comment
pset pset_default



Zones and DRPs (cont)
pset pset_default
int pset.sys_id -1
boolean pset.default true
uint pset.min 1
uint pset.max 65536
string pset.units population
uint pset.load 7
uint pset.size 8
string pset.comment
cpu
int cpu.sys_id 1
string cpu.comment
string cpu.status on-line
cpu
int cpu.sys_id 0
string cpu.comment
string cpu.status on-line
cpu
int cpu.sys_id 3
string cpu.comment
string cpu.status on-line
cpu
int cpu.sys_id 2
string cpu.comment
string cpu.status on-line



Zones and DRPs (cont)
Create a new one-CPU processor set called email-pset
# poolcfg -c 'create pset email-pset (uint
pset.min=1; uint pset.max=1)'

Create a resource pool for the processor set


# poolcfg -c 'create pool email-pool'

Link the pool to the processor set


# poolcfg -c 'associate pool email-pool (pset
email-pset)'

Set an objective (if including a range of processors, i.e. min <> max)
# poolcfg -c 'modify pset email-pset (string
pset.poold.objectives="wt-load")'
Activate the configuration
# pooladm -c



Zones and DRPs (cont)
Check the config
# pooladm
system my_system
string system.comment
int system.version 1
boolean system.bind-default true
int system.poold.pid 638
pool email-pool
int pool.sys_id 1
boolean pool.active true
boolean pool.default false
int pool.importance 1
string pool.comment
pset email
pool pool_default
int pool.sys_id 0
boolean pool.active true
boolean pool.default true
int pool.importance 1
string pool.comment
pset pset_default
pset email-pset
int pset.sys_id 1
boolean pset.default false
uint pset.min 1
uint pset.max 1
string pset.units population
uint pset.load 0
uint pset.size 1
string pset.comment
cpu
int cpu.sys_id 0
string cpu.comment
string cpu.status on-line



Zones and DRPs (cont)
Check the config
pset pset_default
int pset.sys_id -1
boolean pset.default true
uint pset.min 1
uint pset.max 65536
string pset.units population
uint pset.load 7
uint pset.size 7
string pset.comment
cpu
int cpu.sys_id 1
string cpu.comment
string cpu.status on-line
cpu
int cpu.sys_id 3
string cpu.comment
string cpu.status on-line
cpu
int cpu.sys_id 2
string cpu.comment
string cpu.status on-line


DRPs
Note that you can give ranges of CPUs to
be used in DRPs
If you do be sure to set an “objective” else
nothing will be dynamic
Note that some software licenses allow
licensing of the app for only those CPUs in
the DRP that the zone is attached to (i.e.
only pay for your DRP CPUs, not all
CPUs)(!)


Zones and DRPs (cont)
Now enable FSS, make it default for pool_default

# poolcfg -c 'modify pool pool_default (string pool.scheduler="FSS")'

Create an instance of the configuration

# pooladm -c

Move all the processes in the default pool and its associated zones under the FSS.

# priocntl -s -c FSS -i class TS

# priocntl -s -c FSS -i pid 1

Now have the zones use the DRPs


# zonecfg -z email-zone
zonecfg:email-zone> set pool=email-pool
# zonecfg -z Web1-zone
zonecfg:Web1-zone> set pool=pool_default
zonecfg:Web1-zone> add rctl
zonecfg:Web1-zone:rctl> set name=zone.cpu-shares
zonecfg:Web1-zone:rctl> add value (priv=privileged,limit=3,action=none)
zonecfg:Web1-zone:rctl> end
# zonecfg -z Web2-zone
zonecfg:Web2-zone> set pool=pool_default
zonecfg:Web2-zone> add rctl
zonecfg:Web2-zone:rctl> set name=zone.cpu-shares
zonecfg:Web2-zone:rctl> add value (priv=privileged,limit=2,action=none)
zonecfg:Web2-zone:rtcl> end



Zones, Resources, and S10 8/07
Much simpler now if you just want a zone to have dedicated
CPUs, memory limits

(From http://blogs.sun.com/jerrysblog/feed/entries/atom?cat=%2FSolaris)
zonecfg:my-zone> set scheduling-class=FSS
zonecfg:my-zone> add dedicated-cpu
zonecfg:my-zone:dedicated-cpu> set ncpus=1-4
zonecfg:my-zone:dedicated-cpu> set importance=10
zonecfg:my-zone:dedicated-cpu> end

zonecfg:my-zone> add capped-memory


zonecfg:my-zone:capped-memory> set physical=50m
zonecfg:my-zone:capped-memory> set swap=128m
zonecfg:my-zone:capped-memory> set locked=10m
zonecfg:my-zone:capped-memory> end
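The same interactive session can be replayed non-interactively from a command file. A sketch (the file path and zone name are illustrative):

```shell
# Write the dedicated-cpu / capped-memory settings as a zonecfg command
# file; apply later with: zonecfg -z my-zone -f /tmp/my-zone.cfg
cat <<'EOF' > /tmp/my-zone.cfg
set scheduling-class=FSS
add dedicated-cpu
set ncpus=1-4
set importance=10
end
add capped-memory
set physical=50m
set swap=128m
set locked=10m
end
EOF
```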

You have to enable poold via svcadm if "importance" is used

Still use dispadmin to set system-wide scheduling




Zones, Resources, and S10 8/07 (cont)

Can use zonecfg for the global zone to persistently


set resource management settings in global
Now can set other zone-wide resource limits easily
zone.cpu-shares
zone.max-locked-memory (locked property of the capped-memory
resource is preferred)
zone.max-lwps
zone.max-msg-ids
zone.max-sem-ids
zone.max-shm-ids
zone.max-shm-memory
zone.max-swap (The swap property of the capped-memory resource
is the preferred way to set this control)



Zones and Networking S10 8/07
Can now create exclusive-IP zones (i.e. dedicate an HBA port to a zone) known as
“IP Instances”

Need this if you want advanced networking features in a zone (firewalls, snooping,
DHCP client, traffic shaping)

Each zone get its own IP stack (and soon xVM will too)
zonecfg:my-zone>set ip-type=exclusive
zonecfg:my-zone> add net
zonecfg:my-zone:net> set physical=e1000g1
zonecfg:my-zone:net> end
Now the zone can set its own IP address et al, can do IPMP within a zone

“zonecfg set physical=” to one of the interfaces in an IPMP group

Project Crossbow will allow virtual NICs to be IP instance entity (no longer tying up
Ethernet port)
Limited to Ethernet devices that use GLDv3 drivers (dladm show-link not reporting
“legacy”)



Zones, Resources and 5/08
CPU caps can limit the aggregate amount of CPU that a container
can consume
Although it is possible to use the prctl(1M) command to manage CPU caps, the capctl
Perl script simplifies it
# capctl <-P project> <-p pid> <-Z zone> <-n name> <-v value>
* -P proj: Specify project id
* -p pid: Specify pid
* -Z zone: Specify zone name
* -n name: Specify resource name
* -v value: Specify resource value
For example, to set a cap for project foo to 50% you can say:
# capctl -P foo -v 50
To change the cap to 80%:
# capctl -P foo -v 80
To see the cap value:
# capctl -P foo
To remove the cap:
# capctl -P foo -v 0



prctl vs zonecfg

prctl can read resource settings in the


global or child zones
Not persistent for setting variables
Can’t set variables in the child zone
zonecfg is persistent, but only runs in
global zone



Zone Issues
Zone cannot reside on NFS
But zone can be NFS client
Each zone normally has a “sparse” installation of a
package, if package is from “inherit-package-dir” directory
tree
By default, a package installed in global zone is installed in
all existing non-global zones
Unless the pkgadd –G or –Z options are used
See also SUNW_PKG_ALLZONES and SUNW_PKG_HOLLOW
package parameters
Patches installed in the global zone are installed in all non-global
zones
If any zone does not match patch dependencies, patch not
installed


Zone issues - cont
Upgrading the global zone to a new Solaris release
upgrades the non-global zones but depends on which
upgrade method is used (hint - use live upgrade)
Best practice is to keep packages and patches synced
between global and all non-global zones
Watch out for giving users root in a zone – could
violate policy or regulations
Flash Archive (flar) can be used to capture system
containing zones and clone it, but only if zones are
halted.
Details at http://www.opensolaris.org/os/community/zones/
faq/flar_zones


Zones and Packages
# pkgadd -d screen*

The following packages are available:


1 SMCscreen screen
(intel) 4.0.2

Select package(s) you wish to process (or 'all' to process


all packages). (default: all) [?,??,q]:
## Not processing zone <zone10>: the zone is not running and cannot be booted
## Booting non-running zone <zone0> into administrative state
## waiting for zone <zone0> to enter single user mode...
## Verifying package <SMCscreen> dependencies in zone <zone0>
## Restoring state of global zone <zone0>
## Booting non-running zone <zone1> into administrative state
## waiting for zone <zone1> to enter single user mode...
. . .
## Booting non-running zone <zone0> into administrative state
## waiting for zone <zone0> to enter single user mode...
## waiting for zone <zone0> to enter single user mode...
## Installing package <SMCscreen> in zone <zone0>



Sparse Zones vs. Whole Root Zones
When should you use “sparse”, when should you use
“whole root”
Check per-application support and/or requirements
sparse zones don’t allow writes into /, /usr, etc by default, some apps
don’t like that
Can intermix sparse and whole-root on the same system

Create a whole root zone (rather than the default sparse
root) by configuring it with no inherited package directories
# zonecfg -z <zone>
zonecfg:<zone>> create -b
In the future, likely that the world will use whole root
zones and ZFS cloning
But zone roots on ZFS not supported until U6
because not upgradeable


Upgrading a System Containing Containers

Supported methods vary, depending on


OS release being upgraded from
Generally liveupgrade is best, but many
details to consider
Well documented at http://docs.sun.com/app/docs/
doc/820-4041/gdzlc?a=view



Zone Best Practices
Note that global zone root can copy files directly into zones via their
zonepath directory

Consider building at least one container per system


Put all users and apps in there

Fast to copy for testing

Fast reboot

Put it on shared storage for future attach / detach


But watch out for limits

dtrace

app support in a zone


Surprisingly, a global-zone mount within the zone file system is
immediately seen in the zone


Zone Best Practices (2)
Use zonecfg export to save each zone’s
config settings - store on a different system
For every zone created, in its “virgin state”,
create a clone of it and store it on a
different system
Put zones on ZFS for best feature set
Consider configuring child zones to send
syslog output to central syslog server


Zones and /etc/system
Variables no longer honored in /etc/system are now resource controls, set
per project (rctladm(1M) manages their logging actions). This example is from
the Sun installation guide for Weblogic on Solaris 10…
Modify /etc/project in each zone the app will run in to contain the following
additions to the resource controls for user.root (assuming the application will run
as root):
bash-3.00# cat /etc/project
system:0::::
user.root:1::::
process.max-file-descriptor=(privileged,1024,deny);
process.max-sem-ops=(privileged,512,deny);
process.max-sem-nsems=(privileged,512,deny);
project.max-sem-ids=(privileged,1024,deny);
project.max-shm-ids=(privileged,1024,deny);
project.max-shm-memory=(privileged,4294967296,deny)
noproject:2::::
default:3::::
group.staff:10::::
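Rather than hand-editing /etc/project, the same controls can be attached with projmod(1M). A dry-run sketch that prints two of the commands for review (run them inside the zone, as in the listing above):

```shell
# Print projmod commands that substitute (-s) resource-control key/value
# pairs (-K) onto the user.root project, matching the listing above.
projmod_cmds() {
  echo "projmod -s -K 'process.max-file-descriptor=(privileged,1024,deny)' user.root"
  echo "projmod -s -K 'project.max-shm-memory=(privileged,4294967296,deny)' user.root"
}
projmod_cmds
```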


Zones and /etc/system (cont)

Note that /etc/project is read at login


Also, to enable warnings via syslog if the resource limits
are approached, execute the following commands once
in each zone the app will run in (they update the
/etc/rctladm.conf file)
Do this in the global zone, not persistent so script it:
#rctladm -e syslog process.max-file-descriptor
#rctladm -e syslog process.max-sem-ops
#rctladm -e syslog process.max-sem-nsems
#rctladm -e syslog project.max-sem-ids
#rctladm -e syslog project.max-shm-ids
#rctladm -e syslog project.max-shm-memory
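Since these settings are not persistent, one option is to re-apply them from a boot script. A sketch (staged under /tmp here; on a real host it would be placed in /etc/rc3.d/ in the global zone):

```shell
# Generate a small rc-style script that re-enables the syslog actions
# for the resource controls at every boot.
cat <<'EOF' > /tmp/S99rctl-syslog
#!/sbin/sh
for rc in process.max-file-descriptor process.max-sem-ops \
          process.max-sem-nsems project.max-sem-ids \
          project.max-shm-ids project.max-shm-memory; do
  rctladm -e syslog $rc
done
EOF
chmod +x /tmp/S99rctl-syslog
```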


Branded Zones
Shipped in S10 8/07
Allows native binary execution of bins from other
operating systems
CentOS first
Install a brandz zone, install the "guest" OS, then install
binaries (RPMs et al) and run them
Currently limited to CentOS and other Linux 2.4-kernel-based distros
Result - can use DTrace to analyze Linux perf problems
See man pages for brands(5), lx(5)



brandz
Example install given at http://milek.blogspot.com/2006/10/brandz-
integrated-into-snv49.html
# zonecfg -z linux
linux: No such zone configured
Use 'create' to begin configuring a new zone.
zonecfg:linux> create -t SUNWlx
zonecfg:linux> set zonepath=/home/zones/linux
zonecfg:linux> add net
zonecfg:linux:net> set address=192.168.1.10/24
zonecfg:linux:net> set physical=bge0
zonecfg:linux:net> end
zonecfg:linux> add attr
zonecfg:linux:attr> set name="audio"
zonecfg:linux:attr> set type=boolean
zonecfg:linux:attr> set value=true
zonecfg:linux:attr> end
zonecfg:linux> exit



brandz (cont)
# zoneadm -z linux install -d /mnt/iso/
centos_fs_image.tar.bz2
A ZFS file system has been created for this zone.
Installing zone 'linux' at root directory '/home/zones/
linux'
from archive '/mnt/iso/centos_fs_image.tar.bz2'

This process may take several minutes.

Setting up the initial lx brand environment.

System configuration modifications complete!
Installation of zone 'linux' completed successfully.
Details saved to log file:
"/home/zones/linux/root/var/log/linux.install.10064.log"



Solaris 8 and 9 Containers
Now available as a commercial product ($) from Sun
Uses brandz
Capture a Solaris 8 or Solaris 9 system via Archiver (aka
P2V)
Updater Tool processes the Solaris 8 image and prepares it
for its new, virtualized environment
Create it as a container under S10
Apps think they are on S8 or S9
Sun “guarantees” compatibility
SPARC only


Solaris 8 and 9 Containers - cont
http://www.sun.com/software/solaris/pdf/
solaris8and9containers_datasheet.pdf
# zonecfg -z zone8
zonecfg:zone8> create -t SUNWsolaris8
zonecfg:zone8> set zonepath = /export/home/zones/zone8
zonecfg:zone8> add net
zonecfg:zone8:net> set address = <IP Address>
zonecfg:zone8:net> set physical = e1000g1
zonecfg:zone8:net> end
zonecfg:zone8> verify
zonecfg:zone8> commit
zonecfg:zone8> exit
# zoneadm -z zone8 install -a <FLAR_image_location> {-u|-p}

Try for 90 days via http://www.sun.com/software/solaris/


containers/getit.jsp



zonestat
Tool to monitor entire system performance, including per-zone

More information than prstat -Z

Download from http://opensolaris.org/os/project/zonestat/


# ./zonestat
|--Pool--|Pset|-------Memory-----|
Zonename| IT|Size|Used| RAM| Shm| Lkd| VM|
------------------------------------------
global 0D 2 0.1 556M 0.0 0.0 331M
zone1 0D 2 0.0 26M 0.0 0.0 24M
==TOTAL= === 2 0.1 608M 0.0 0.0 355M
# ./zonestat -l
|----Pool-----|------CPU-------|----------------Memory----------------|
|---|--Size---|-----Pset-------|---RAM---|---Shm---|---Lkd---|---VM---|
Zonename| IT| Max| Cur| Cap|Used|Shr|S%| Cap|Used| Cap|Used| Cap|Used| Cap|Used
-------------------------------------------------------------------------------
global 0D 2 0.0 0.1 5 83 556M 18E 0.0 18E 0.0 18E 331M
zone1 0D 2 0.0 0.0 1 16 26M 18E 0.0 18E 0.0 18E 24M
==TOTAL= --- ---- 2 ---- 0.1 --- -- 3.1G 608M 3.1G 0.0 3.0G 0.0 4.0G 355M



Zone Futures

Live migration
Improved networking via project crossbow
Not just ip-exclusive. Virtual network stack for
each container
S10 containers?



Labs
Create a container with resource management (your choice)

What is your view of file systems?


What file systems are yours, what are shared?

What do the file systems look like from the global zone?

Test the resource management if possible

What does your networking look like?

What is your life like in a zone?

How are zones different from domains? From vmware?

What scheduler is in use in your zone?


If fair share, how many shares does your zone have?



Labs (cont)

If you are not fair share scheduled, turn it


on and enable shares for your container
Clone the zone
Detach and attach the zone (to the same
system if necessary)



LDOMS



LDOMs
Logical domains
Released April ’07
Only on Niagara and future CMT chips (Niagara
II, Rock)
Like enterprise-system domains but within one
chip
Slice the chip into multiple LDOMs, each with its
own OS root, booting independently, etc.
Now can run multiple OSes on 1 SPARC chip




LDOMs - Details
Can create up to 1 LDOM per thread(!)
Best practice seems to be max one LDOM per
core
i.e. 8 LDOMs on Niagara I and II
Nice intro blog
http://blogs.sun.com/ash/entry/ultrasparc_t2_launched_today

And nice flash demo


http://www.sun.com/servers/coolthreads/ldoms/
Community cookbooks
http://wikis.sun.com/display/SolarisLogicalDomains/
LDoms+Community+Cookbook


LDOMs Introduction and Hands-On Training

Peter Baer Galvin
Chief Technologist, Corporate Technologies

With thanks to: Tom Gendron
SPARC Systems Technical Specialist, Sun Microsystems


Agenda
• Virtualization Comparisons
• Concepts of LDOMs
• Requirements of LDOMs
• Examples
• Best Practices



The Data Center Today
• Single application per server
• Server sprawl is hard to manage
• Average server utilization between 5 to 15%
• Energy costs continue to rise
(Diagram: clients, developers, and mail/database/application
services connected via the network to data center servers,
OSs, and storage)



A widely understood problem



Virtualization: Who and Why

InformationWeek: Feb 12, 2007 http://www.informationweek.com/news/showArticle.jhtml?articleID=197004875




Server Virtualization
Four approaches, from hard partitions (multiple OSs)
to resource management (single OS):
Hard Partitions:
> Very High RAS capability
> Very Scalable
> Mature Technology
> Ability to run different OS versions
> Complete Isolation
Virtual Machines:
> Live OS migration
> Improved Utilization
> Ability to run different OS versions and types
> De-couples OS and HW versions
OS Virtualization:
> Very scalable and low overhead
> Single OS to manage
> Cleanly divides system and application administration
> Fine grained resource management
Resource Mgmt.:
> Very scalable and low overhead
> Single OS to manage
> Fine grained resource management



Para vs. Full Virtualization
• Para-virtualization:
> OS ported to special architecture
> Uses generic "virtual" device drivers
> More efficient since it is "hypervisor" aware
> "almost" native performance
• Full virtualization:
> OS has no idea it is running virtualized
> Must emulate real i/o devices
> Can be slow/need help from hardware
> May use traps, emulation or rewriting
(Diagram: in para-virtualization the guest OSes run directly
on the hypervisor; in full virtualization a control domain
mediates between the guest OSes and the server)


What is an LDOM?
• It is a virtual server
• Has its own console and OBP instance
• A configurable allocation of CPU, FPU, Disk, Memory and I/O components
• Runs a unique OS/patch image and configuration
• Has the capability to stop, start and reboot independently
• Utilizes a Hypervisor to facilitate LDOMs



Requirements for LDOMs
• Sun T-Series server
> T1/2000 T5x20 rack servers
> T6100, T6120 blade
> Any future CMT based server
• Up to date Firmware on service processor
http://sunsolve.sun.com/handbook_pub/validateUser.do?target=index
• minimum Solaris 10 11/06 on T1/2000, T6100
• minimum Solaris 10 08/07 T5x20, T6120
• Ldom Manager Software 1.0.1 + patches



Hypervisor
• A thin interface between the Hardware and Solaris
• The interface is called sun4v
• Solaris calls the sun4v interface to use hardware
specific functions
• It is very simple and is implemented in firmware
• It allows for the creation of ldoms
• It creates communication channels between ldoms



Key LDOMs Components
• The Hypervisor
• The Control Domain
• The Service Domain
• Multiple Guest Domains
• Virtualised devices
(Diagram: the primary domain, acting as control and service
domain, runs Solaris 10 08/07 plus the ldmd, drd, and vntsd
daemons and provides a virtual disk service (primary-vds0,
backed by vol1) and a virtual switch (primary-vsw0, vsw0);
guest domains ldom1 and ldom2 run their own Solaris images
and patches, seeing virtual disks vdisk0/vdisk1 as
/dev/dsk/c0d0s0 and virtual network devices vnet0/vnet1;
the hypervisor partitions the hardware's CPUs, memory, and
crypto units among the domains, holding unallocated
resources in reserve, while I/O devices (72GB disk, PCI-E,
network) are owned by the service domain)



LDOMs types
• Different Ldom Types
- Control Domain - Hosts the Logical Domain Manager (LDM)

- Service Domains - Provides virtual services to other domains

- I/O Domains - Has direct access to physical devices

- Guest Domains - Used to run user environments


• Control, Service and I/O domains can be combined or separate
> One of the I/O domains must be the control domain
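To make the domain types concrete, here is a sketch of the usual ldm(1M) sequence for building a guest domain from the control domain. The names, sizes, and disk-image path are illustrative; the commands are staged in a file for review rather than executed:

```shell
# Stage the typical guest-domain build commands: create the domain, give
# it vCPUs and memory, a vnet on the primary vswitch, and a vdisk served
# by the primary virtual disk service, then bind and start it.
cat <<'EOF' > /tmp/mk-ldom1.sh
ldm add-domain ldom1
ldm add-vcpu 4 ldom1
ldm add-memory 2g ldom1
ldm add-vnet vnet0 primary-vsw0 ldom1
ldm add-vdsdev /ldoms/ldom1.img vol1@primary-vds0
ldm add-vdisk vdisk0 vol1@primary-vds0 ldom1
ldm bind ldom1
ldm start ldom1
EOF
cat /tmp/mk-ldom1.sh
```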





'Control' Domain
• Creates and manages other LDOMs
• Runs the LDOM Manager software
• Allows monitoring and reconfiguration of domains
• Recommendation:
> Make this Domain as secure as possible





'Service' Domain
• Provides services to other domains
– virtual network switch
– virtual disk service
– virtual console service
• Multiple Service domains can exist with shared or
sole access to system facilities
• Allows for IO load separation and redundancy
within domains deployed on a platform
• Often Control and Service Domains are one and
the same

131



IO Domain
• IO Domain has direct access to physical input and
output devices.
• The number of IO domains is hardware dependent
> currently limited to 2
> limited by PCI-E switch configuration
• One IO domain must also be the control domain

132



Key LDOMs components
• The Hypervisor
• The Control Domain
• The Service Domain
• Multiple Guest Domains
• Virtualised devices

[Overview diagram: Control/Service domain "primary", guest domains ldom1 and ldom2, virtual devices (primary-vds0, primary-vsw0, vdisk, vnet), unallocated resources, and shared hardware over the Hypervisor]

133



'Guest' Domains
• Contain the targeted applications the LDOMs were
created to service.
• Multiple Guest domains can exist
> Constrained only by hardware limitations
• May use one or more Service domains to obtain IO
> Various redundancy mechanisms can be used
• Can be independently 'powered' and rebooted and
without affecting other domains

134



Key LDOMs components
• The Hypervisor
• The Control Domain
• The Service Domain
• Multiple Guest Domains
• Virtualised devices

[Overview diagram: Control/Service domain "primary", guest domains ldom1 and ldom2, virtual devices (primary-vds0, primary-vsw0, vdisk, vnet), unallocated resources, and shared hardware over the Hypervisor]

135



Virtual devices
• Virtual devices are hardware resources abstracted by the hypervisor
and made available for use by the other domains
• Virtual devices are :
> CPU's - VCPU
> Memory -
> Crypto cores - MAU
> Network switches - VSW
> NICs - VNET
> Disk servers - VDSDEV
> Disks - VDISK
> Consoles - VCONS

136



Example 1
Install Ldom Manager &
Setting up the Control Domain

137



Example 1 steps
• Update firmware to latest release
• Install Supported version of Solaris
• Install Logical Domain Manager (LDM) software
• Configure the control domain
• Save initial domain config
• Reboot Solaris

138



A note on system interfaces
• Provide out-of-band management
• Two types (iLOM and ALOM)
• T1/2000 uses ALOM interface
• T5x20 uses iLOM
• iLOM "CLI" has an ALOM compatibility shell
> ALOM shell used in the examples
• A web based interface is available
• (SC = system controller, SP = service processor)
> essentially the same thing.

139



Web based iLOM interface

140



ALOM compatibility shell
• login to SP as root/changeme
• -> create /SP/users/admin
• -> set /SP/users/admin role=Administrator
• -> set /SP/users/admin cli_mode=alom
– Creating user ...
– Enter new password: ********
– Enter new password again: ********
– Created /SP/users/admin
• exit
• login as admin
141



Step 1
Firmware verification and update

142



System Identification and Update
• Check the Service Processor of your system for firmware levels
• using alom mode (showhost not available in BUI)
sc> showhost
Sun System Firmware 7.0.1 2007/09/14 16:31

Host flash versions:
Hypervisor 1.5.1 2007/09/14 16:11
OBP 4.27.1 2007/09/14 15:17
POST 4.27.1 2007/09/14 15:43

(Check the SC firmware version - here 7.0.1)

• Upgrade your system firmware if needed...
> flashupdate command
> sysfwdownload (via Solaris on platform)
> BUI

143



Firmware update example
sc> showkeyswitch
Keyswitch is in the NORMAL position.
sc> flashupdate -s 10.8.66.15 -f /incoming//Sun_System_Firmware-6_4_6-Sun_Fire_T2000.bin
Username: tgendron
Password: ********

SC Alert: System poweron is disabled.


Update complete. Reset device to use new software.
sc>
sc> resetsc

telnet and login back in once up.

sc> showhost
Sun-Fire-T2000 System Firmware 6.5.5 2007/10/28 23:09

144



Firmware update example 2
Step 1:
From Solaris running on T5120 with the SP to update
Download the patch from Sun Solve 127580-05.zip

Step 2:
unzip and cd into 127580-05
Step 3:

run sysfwdownload [image].pkg


Step 4:
reboot solaris
sc> resetsc

145



Installing LDOM manager software
• T5x20 requires Solaris 10 8/07 or greater
• T1/2000 requires Solaris 10 11/06 or greater +
* 124921-02 at a minimum
* 125043-01 at a minimum
* 118833-36 at a minimum
• 11/06 is minimum for guests
• ldm 1.0.2 is current
> includes Solaris Security Toolkit (optional)‫‏‬

146



Install the LDM Software
• Unzip and install w/installation script
• Security of Control Domain is important
> Recommend selecting the JASS secure configuration
• Once complete entire system is one LDOM
• LDOM software installed in /opt/SUNWldm
# [cmt1/root] ldm list
NAME STATE FLAGS CONS VCPU MEMORY UTIL UPTIME
primary active -n-cv SP 64 8064M 0.0% 3h 19m
[cmt1/root]

All the system resources are in domain “primary”


* Follow the Administration Guide to install required OS and patches
147



Flag Definitions
# ldm list

NAME STATE FLAGS CONS VCPU MEMORY UTIL UPTIME


primary active -n-cv SP 32 32640M 0.1% 6d 20h 24m
#

- placeholder
c control domain
d delayed reconfiguration
n normal
s starting or stopping
t transition
v virtual I/O domain
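The flag letters above can be decoded mechanically. A minimal sketch, assuming POSIX sh; `decode_ldm_flags` is a hypothetical helper of mine, not part of the ldm toolset:

```shell
# Decode the FLAGS column of `ldm list` output into words.
# decode_ldm_flags is a hypothetical helper, not an ldm command.
decode_ldm_flags() {
  flags=$1
  out=""
  case $flags in *c*) out="$out control" ;; esac
  case $flags in *d*) out="$out delayed-reconf" ;; esac
  case $flags in *n*) out="$out normal" ;; esac
  case $flags in *s*) out="$out starting/stopping" ;; esac
  case $flags in *t*) out="$out transition" ;; esac
  case $flags in *v*) out="$out virtual-io" ;; esac
  echo "$out" | sed 's/^ //'
}

decode_ldm_flags "-n-cv"   # control normal virtual-io
```

The same idea works inline with awk if you are post-processing `ldm list` output in scripts.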
148



Example 1
Part 2
Setting up the Control Domain

149



On naming things...
• Choose LDOM component names carefully
> Names are used to manage the devices
> Bad choices can be very confusing later on...
> Keep names short and specific...

• You need names for ...


> Disk Servers, and disk device instances
> Network Virtual Switches, and network device instances
> Domains

• Service and device names are only known to the


Control and Service domains
– Guest domains just see virtual devices.

150



Control/Service Domain
• On our 'Primary' Domain do the following ...
• In this example Control and Service are combined
> Control domain runs the LDM daemons (ldmd, drd, vntsd)
• Set up the basic services needed:
> vds - virtual disk service
> vcc - virtual console concentrator
> vsw - virtual network switch
• The service names in this example are:
> primary-vds0
> primary-vcc0
> primary-vsw0
• Allocate resources
> CPU, Memory, Crypto, IO devices

[Diagram: the combined Control & Service domain "primary" exposing vcc0, vds0, and vsw0, with unallocated resources and shared hardware (CPU, memory, crypto, PCI-E I/O, network) over the Hypervisor]

151



Control/Service Domain set-up (1)‫‏‬
# Add services to the control domain
# The mac address taken from a physical interface, e.g., e1000g0.
ldm add-vds primary-vds0 primary
ldm add-vcc port-range=5000-5100 primary-vcc0 primary
ldm add-vsw mac-addr=0:14:4f:6a:9e:dc net-dev=e1000g0 primary-vsw0 primary
# Activate the virtual network terminal server
svcadm enable vntsd
# Allocate resources to the control domain and save
ldm set-mau 1 primary
ldm set-vcpu 8 primary
ldm set-memory 2G primary
ldm add-spconfig my-initial
# Reboot required to have the configuration take effect.
init 6

152



Crypto Note
Note - If you have any cryptographic devices (MAUs) bound to the control domain, you
cannot dynamically reconfigure its CPUs. So if you are not using cryptographic
devices, set the MAU count to 0 (ldm set-mau 0 primary).

153



Control/Service Domain set-up (2)‫‏‬
# Verify the primary domain configuration
ldm list-domain
NAME STATE FLAGS CONS VCPU MEMORY UTIL UPTIME
primary active -n-cv SP 8 2G 6.3% 6m
# Enable Networking
ifconfig vsw0 plumb
ifconfig e1000g0 down unplumb
ifconfig vsw0 10.8.66.208 netmask 255.255.255.0 broadcast + up
ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
vsw0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
inet 10.8.66.208 netmask ffffff00 broadcast 10.8.66.255
ether 0:14:4f:6a:9e:dc

154



Ldom Service details

155



Reconfiguration
• Dynamic reconfiguration
> Resource changes that take effect w/out reboot of domain
• Delayed reconfiguration
> Resource changes that take effect after a reboot
• Resource examples:
> VCPU, Memory, IO devices
• Currently only VCPUs are dynamic

156



Virtual Disk Server device (vds) - Delayed Reconfiguration

• VDS runs in a service domain


• Performs disk I/O on corresponding raw devices
• Device types can be
> An entire physical disk or LUN (can be SAN based)
> Single slice of a disk or LUN
> Disk image in a filesystem (e.g. ufs, zfs)
> Disk volumes (zfs, svm, VxVM)
> lofi devices NOT supported
• Virtual Disk Client (vdc drivers)‫‏‬
> Requests standard block IO via the VDS
> Classic client/server architecture

157



Virtual Disk devices
• Physical LUNS perform best
• Disk image files efficient use of space
• ZFS snapshots and clones give rapid provisioning
• Network install not supported with
> zfs volumes
> single slice
• Network install requires
> entire disk
> disk image file

158



Virtual Network Switch service (vswitch) - Delayed Reconfiguration

• Implements a layer-2 network switch


• Connects virtual network devices to
> To the physical network
> or to each other (internal private network)‫‏‬
• vswitch not automatically used by service domain
> must be plumbed

159



Virtual Console Concentrator (vcc) - Delayed Reconfiguration

• Provides console access to LDoms


• Service domain VCC driver communicates with all guest console
drivers over the Hypervisor
> No changes required in guest console drivers (qcn)‫‏‬
• Makes each console available as a tty device on the Control/
Service domain
• usage: telnet localhost <port>

160



Virtual Network Terminal Server daemon (vntsd) - Delayed Reconfiguration

• VCC implemented by vntsd


• Runs in the Control/Service domain
• Aggregates the VCC tty devices and makes them available over
network sockets
> Accessible once a domain is configured and bound
> Attach prior to domain start to watch domain OBP boot sequence
• Only one user at a time can view a serial console
• Flexible support of port groups, IP's, port numbers etc
> Not visible outside the Control/Service domain by default

161



Example 2
Setting up the Guest Domain

162



Guest Domain

In the control domain:

ldm add-domain ldm1
ldm add-mau 1 ldm1
ldm add-vcpu 4 ldm1
ldm add-memory 4G ldm1
ldm add-vnet vnet0 primary-vsw0 ldm1
ldm add-vdsdev /dev/dsk/c0t1d0s2 ldm1-vol1@primary-vds0
ldm add-vdisk ldm1-vdisk1 ldm1-vol1@primary-vds0 ldm1
ldm set-var auto-boot\?=false ldm1
ldm set-var boot-device=vdisk ldm1
ldm set-var nvramrc-devalias vnet0 /virtual-devices@100/channel-devices@200/network@0 ldm1
ldm bind-domain ldm1
ldm start-domain ldm1

• Watch the console of ldom1 using ...
> telnet localhost 5000

[Diagram: on a T2000, guest ldm1 receives vnet0 via primary-vsw0 and ldm1-vdisk1 backed by /dev/dsk/c0t1d0s2 via ldm1-vol1 on primary-vds0]
163



Disk Service Setup
• Establish a Virtual Disk Service
– 'primary-vds'
• Associate it with some form of media:
– A real device or slice, e.g. /dev/dsk/c0t1d0s2
– or a disk image, e.g. '/ldmzpool/ldg1'
• Create a disk server device instance to be exported to guest domains
– 'ldm1-vol1@primary-vds'

ldm add-vdsdev /dev/dsk/c0t1d0s2 ldm1-vol1@primary-vds0
ldm add-vdisk ldm1-vdisk1 ldm1-vol1@primary-vds0 ldm1
(The disk device name can vary - find it via “ok show-devs”)

[Diagram: primary exports ldm1-vol1 through primary-vds to guest ldom1, which sees it as ldm1-vdisk1]
164



Virtual Disk Client (vdc) - Delayed Reconfiguration

• vdc's are the objects passed to OBP and the Operating System in
guest systems
• Guest domain OBP and Solaris sees normal SCSI devices
• Domain administrators may setup devaliases or use raw vdisk
devices
• vdc’s provide Guest domains with virtual disk devices (vdisks) via
device instances from Virtual Disk Servers running in the Service
Domains(s)
• A future release will provide virtualised access to DVD/CD-ROM in
service domains

165



Network Setup
• Establish a Virtual Network Switch Service
– 'primary-vsw0'
> Automatically associated with a vsw device instance
– 'vsw0'
• May or may not choose to associate it with media:
– 'e1000g0' a real NIC
– or no NIC (in-memory only)
• Create a network device instance to provide to guest domains
– 'vnet0@ldm1'

[Diagram: primary-vsw0 bridges the physical e1000g0 NIC to vnet0 in guest ldom1]
166



Virtual Network Device (vnet) - Delayed Reconfiguration

• Implements an ethernet device in a domain


> Communicates with other vnets or the outside world over vswitch
devices
• If the vSwitch is suitably configured, packets can be routed out of
the server.
• vnet exports a GLDv3 interface
> A simple virtual Ethernet NIC
> Enumerates as a 'vnetx' device
> For domain-domain transfers, vnets connect 'directly'.

167



Memory - Delayed Reconfiguration

• Memory is configured through the Control Domain


• Minimum allocatable chunk is 8kB
> Minimum size is 12MB (for OBP)‫‏‬
> Though most OS deployments will need > 512M
• If memory is added over time to a domain
> Memory device bindings within a domain may appear to show that
memory fragmentation is occuring
> Not a problem, all handled in HW by the MMU
> No performance penalty
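Since the allocatable chunk is 8 kB, any requested size is effectively rounded up to that boundary. A quick arithmetic sketch; the helper name is mine:

```shell
# Round a requested size in kB up to the 8 kB allocation granule.
round_to_granule() { echo $(( (($1 + 7) / 8) * 8 )); }

round_to_granule 12300   # 12304
round_to_granule 16      # 16 (already aligned)
```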

168



vCPUs - Immediate Reconfiguration

• Each UltraSPARC T1 has up to 8 physical cores with 4 threads each


> Each thread is considered a vCPU, so up to 32 vCPUs or Domains
• Each UltraSPARC T2 has up to 8 physical cores with 8 threads each
> Each thread is considered a vCPU, so up to 64 vCPUs or Domains
• Allocation granularity is 1 vCPU per domain
• vCPU's can only be allocated to one Domain at a time.
• Can be dynamically allocated with the Domain running,
> Take care if removing a vcpu from a running domain, will there be
enough compute power left in the domain ?
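The vCPU counts follow directly from cores times threads; a sketch of the arithmetic:

```shell
# Total vCPUs available = physical cores x hardware threads per core.
max_vcpus() { echo $(( $1 * $2 )); }

max_vcpus 8 4   # UltraSPARC T1: 32
max_vcpus 8 8   # UltraSPARC T2: 64
```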

169



Example 3
Guest Domains and ZFS

170



Using ZFS (1) – setup zfs
1. Remove the disk from the service domain
ldm stop-domain ldm1
LDom ldm1 stopped
ldm unbind-domain ldm1
ldm remove-vdsdev ldm1-vol1@primary-vds0

2. Create ZFS filesystems and a backing disk image (pool 'mypool', mounted at /export, already exists)
root@cmt1 > zfs create mypool/ldoms
root@cmt1 > zfs create mypool/ldoms/ldm1
root@cmt1 > cd /export/ldoms/ldm1
root@cmt1 > ls
root@cmt1 > mkfile 12G `pwd`/rootdisk

171



Using ZFS (2) – setup guest domain
3. Configure the guest domain
root@cmt1 > ldm add-domain ldm1
root@cmt1 > ldm add-vcpu 8 ldm1
root@cmt1 > ldm add-memory 1G ldm1
root@cmt1 > ldm add-vnet vnet0 primary-vsw0 ldm1
root@cmt1 > ldm add-vdsdev /export/ldoms/ldm1/rootdisk ldm1-vol1@primary-vds0
root@cmt1 > ldm add-vdisk ldm1-vdisk1 ldm1-vol1@primary-vds0 ldm1

root@cmt1 > ldm set-var auto-boot\?=false ldm1


root@cmt1 > ldm set-var boot-device=ldm1-vdisk1 ldm1
root@cmt1 > ldm set-var nvramrc-devalias vnet0 /virtual-devices@100/channel-
devices@200/network@0 ldm1

172



Using ZFS (3) – setup guest domain
4. Start the guest domain
root@cmt1 > ldm bind-domain ldm1
root@cmt1 > ldm start-domain ldm1
LDom ldm1 started

5. Inspect the domain


root@cmt1 > ldm list-domain
NAME STATE FLAGS CONS VCPU MEMORY UTIL UPTIME
primary active -n-cv SP 8 2G 0.7% 17h 12m
ldm1 active -t--- 5000 8 1G 13% 7s

telnet localhost 5000


{ok} boot vnet0 - install
installation goes forward

173



Provision the guest
6. Set up for jumpstart
Determine the mac address
root@cmt1 > ldm list-bindings ldm1
[snip]
NETWORK
NAME SERVICE DEVICE MAC
vnet0 primary-vsw0@primary network@0 00:14:4f:f8:2a:c4
PEER MAC
primary-vsw0@primary 00:14:4f:46:41:b4

telnet localhost 5000


{0} ok banner
SPARC Enterprise T5120, No Keyboard
[snip]
Ethernet address 0:14:4f:fb:7:42, Host ID: 83fb0742.

174



Provision the guest (2)‫‏‬
{0} ok boot vnet0 - install
Boot device: /virtual-devices@100/channel-devices@200/network@0ile and args: - install
Requesting Internet Address for 0:14:4f:f8:2a:c4
SunOS Release 5.10 Version Generic_120011-14 64-bit
...
How to break
telnet> send brk
Debugging requested; hardware watchdog suspended.

c)ontinue, s)ync, r)eboot, h)alt? r


Resetting...

{0} ok

175



Guest Domain (zfs) login
{0} ok boot
Boot device: ldm1-vdisk1 File and args:
SunOS Release 5.10 Version Generic_120011-14 64-bit
Copyright 1983-2007 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
Hostname: ldm1

ldm1 console login:

176



Using ZFS (2) – cloning domains
Snapshot and Clone the installed boot disk
tgendron@cmt1 > zfs list
NAME USED AVAIL REFER MOUNTPOINT
mypool 12.0G 54.9G 27.5K /export
mypool/ldoms 12.0G 54.9G 25.5K /export/ldoms
mypool/ldoms/ldm1 12.0G 54.9G 12.0G /export/ldoms/ldm1

root@cmt1 > zfs snapshot mypool/ldoms/ldm1@initial

Create the clones


root@cmt1 > zfs clone mypool/ldoms/ldm1@initial mypool/ldoms/ldm2
root@cmt1 > zfs clone mypool/ldoms/ldm1@initial mypool/ldoms/ldm3
root@cmt1 > zfs clone mypool/ldoms/ldm1@initial mypool/ldoms/ldm4
root@cmt1 > zfs clone mypool/ldoms/ldm1@initial mypool/ldoms/ldm5
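The clone fan-out above is easy to script. A sketch that only prints the commands for review (dataset names follow the example; `clone_cmds` is a hypothetical helper and nothing is executed):

```shell
# Print the zfs clone commands for guests ldm2..ldm5 from the ldm1 snapshot.
clone_cmds() {
  for n in 2 3 4 5; do
    echo "zfs clone mypool/ldoms/ldm1@initial mypool/ldoms/ldm$n"
  done
}

clone_cmds
```

Piping the output to sh (once reviewed) performs the actual cloning on the service domain.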
177



Using ZFS (2) – Leverage the clones
4. Create the new guest domains (should be easy to script this)
ldm add-domain ldm2
ldm add-vcpu 8 ldm2
ldm add-memory 1G ldm2
ldm add-vnet vnet0 primary-vsw0 ldm2

ldm add-vdsdev /export/ldoms/ldm2/rootdisk ldm2-vol1@primary-vds0


ldm add-vdisk ldm2-vdisk1 ldm2-vol1@primary-vds0 ldm2

ldm set-var auto-boot\?=false ldm2


ldm set-var boot-device=vdisk ldm2
ldm set-var nvramrc-devalias vnet0 /virtual-devices@100/channel-devices@200/network@0 ldm2
ldm bind-domain ldm2
ldm start-domain ldm2
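As the slide notes, this sequence is scriptable. A sketch that prints the per-guest command list; `new_guest_cmds` is a hypothetical helper of mine, with sizes and paths mirroring the example above:

```shell
# Print the ldm commands needed to build a cloned guest named $1.
new_guest_cmds() {
  g=$1
  echo "ldm add-domain $g"
  echo "ldm add-vcpu 8 $g"
  echo "ldm add-memory 1G $g"
  echo "ldm add-vnet vnet0 primary-vsw0 $g"
  echo "ldm add-vdsdev /export/ldoms/$g/rootdisk $g-vol1@primary-vds0"
  echo "ldm add-vdisk $g-vdisk1 $g-vol1@primary-vds0 $g"
  echo "ldm bind-domain $g"
  echo "ldm start-domain $g"
}

new_guest_cmds ldm3
```

Reviewing the printed commands before piping them to sh avoids typos in domain and volume names, which are painful to undo.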

178



Boot the cloned ldom
{0} ok boot
Boot device: vdisk File and args:
SunOS Release 5.10 Version Generic_120011-14 64-bit
Copyright 1983-2007 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
WARNING: vnet0 has duplicate address 010.030.019.178 (in use by 00:14:4f:f8:2a:c4); disabled
Feb 13 19:55:29 svc.startd[7]: svc:/network/physical:default:
Method "/lib/svc/method/net-physical" failed with exit status 96.
Feb 13 19:55:29 svc.startd[7]: network/physical:default misconfigured:
transitioned to maintenance (see 'svcs -xv' for details)‫‏‬
Hostname: ldm1...

179



Example 4
Split Service Domains

180



Sun Fire T2000 Block Diagram

181



Split IO Example
• Setting up a second Service domain with split PCI busses...
-bash-3.00# ldm list-bindings primary
Name: primary
...
IO: pci@780 (bus_a)‫‏‬
pci@7c0 (bus_b)‫‏‬
...
-bash-3.00# df /
/ (/dev/dsk/c1t0d0s0 ):28233648 blocks 3450076 files
-bash-3.00# ls -l /dev/dsk/c1t0d0s0
lrwxrwxrwx 1 root root 65 Apr 11 13:25 /dev/dsk/c1t0d0s0 -> ../../devices/
pci@7c0/pci@0/pci@1/pci@0,2/LSILogic,sas@2/sd@0,0:a
-bash-3.00# grep e1000g /etc/path_to_inst
"/pci@780/pci@0/pci@1/network@0" 0 "e1000g"
"/pci@780/pci@0/pci@1/network@0,1" 1 "e1000g"
"/pci@7c0/pci@0/pci@2/network@0" 2 "e1000g"
"/pci@7c0/pci@0/pci@2/network@0,1" 3 "e1000g"
-bash-3.00# ldm remove-io pci@780 primary
..
-bash-3.00# shutdown -i6 -y -g0
..
-bash-3.00# ldm add-io pci@780 second-srvc-dom
-bash-3.00# ldm start second-srvc-dom
-bash-3.00# ldm list-bindings
..
-bash-3.00#

Check which PCI bus ports we own and are currently using, and be sure to only give away
unused ones; i.e. we need to retain the Control Domain boot disk controller and network device.
Providing a PCI bus to a Guest makes the selected Domain a Service domain, by definition:
access to physical IO = Service Domain.
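Which bus a device lives on can be read straight off /etc/path_to_inst. A parsing sketch; `bus_of` is a helper of mine, and the input format is taken from the grep output above:

```shell
# Extract the root PCI bus (e.g. pci@780 or pci@7c0) from a path_to_inst line.
bus_of() { echo "$1" | sed 's#^"/\(pci@[0-9a-f]*\)/.*#\1#'; }

bus_of '"/pci@780/pci@0/pci@1/network@0" 0 "e1000g"'   # pci@780
bus_of '"/pci@7c0/pci@0/pci@2/network@0" 2 "e1000g"'   # pci@7c0
```

Running it over every e1000g and disk-controller line shows at a glance which bus can be given away safely.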
182



Sun Fire T5x20 Block Diagram
[Block diagram: UltraSPARC T2 with 16 FB-DIMMs; PCI-E switches (PLX 8533/8517) fanning out to the LSI 1068E SAS disk controller, DVD, USB 2.0 hub, dual Intel GbE NICs, 10GbE (BCM8704/XFP), and PCI-E slots; MPC885 ILOM service processor; front and rear panel connectors]

183



MPxIO considerations
• MPxIO can be used in the Service/Control domain
• Very straightforward to configure with defaults...
> Ensure you have two FC-AL HBAs in a single service domain attached to the same SAN array
> Check that you have two paths to the same SAN devices ('ls /dev/dsk/')‫‏‬
> Enable MPxIO by running the command 'stmsboot -e' and rebooting the control/
service domain
> Check that you now have only a single path to the SAN devices...

184



IPMP considerations
• IPMP has several options for configurations
> Refer to the Administration Guide for worked examples...
> Options are Multipathing in the Service Domain or Multipathing in the Guest
Domain

185



Ldom 1.0.1
Best Practice Guidance

186



Ldom Best Practice (1)‫‏‬
• Control Domain
> Runs LDM daemon processes
> Must have adequate CPU and memory
> Start w/ 1 core (4 or 8 threads) 1GB Memory
> Make this domain as secure as possible

187



Ldom Best Practice (2)‫‏‬
• I/O and service domains
> Runs IO for other domains
> Resources will be sized based on IO load
> Start w/ 1 core and 1GB memory
> 4GB of memory if zfs used for virtual disks images
> Add complete cores as I/O loads grow heavier

188



Ldom Best Practice (3)‫‏‬
• Core/Thread Affinity
> Core resources are shared by threads
> E.g. L1 cache and MAU, FPU
• Best to avoid allocating the threads of a core to
separate domains
• Create larger Ldoms first using complete cores
• Smaller domains last

189



Ldom Best Practice (4) - Delayed Reconfiguration

• Crypto Units
• Each T1/T2 physical CPU Core has a Crypto Unit
> 8 in total on a 8 core system
> referred to as (MAU)‫‏‬
• Crypto cores can only be allocated to domains that have at least
one vcpu(thread) on the same physical Core as the crypto unit
• Crypto cores cannot be shared, they are owned by exactly one (or
no) Domain
• Probably best to allocate all four/eight threads on a Core to a
domain that wants to use the Crypto core

190



More on Crypto Units
• For example, we define three domains in order: LDOM1, then LDOM2, then LDOM3...
• LDOM1 has 3 threads (vCPUs) on Core 0
> Only has access to MAU0 since it only has threads on Core 0
• LDOM2 has 6 vCPUs spread across Cores 0, 1 & 2
> Potentially has access to MAUs 0, 1 & 2
> BUT.. LDOM1 already binds MAU0
> So it can only take MAU1 and MAU2
• LDOM3 has 3 vCPUs on Core 2
> But can't access any MAUs since LDOM2 has already taken MAU2
• Adding and removing vCPUs can cause access to previously accessible MAUs to be lost; currently you can't elect specific vCPUs, the framework does that itself
• When MAUs are allocated to Domains, vCPUs become delayed reconfiguration properties in those domains
[Diagram: T1 Cores 0-2, each with its own MAU (0-2), with the three domains' vCPUs spread across them]
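Which MAU a domain can reach follows from which core each of its vCPUs sits on, and that mapping is simple integer division, assuming vCPU ids map sequentially onto cores as on these parts. The helper name is mine:

```shell
# Map a vCPU id to its physical core: core = vcpu_id / threads_per_core.
vcpu_core() { echo $(( $1 / $2 )); }

vcpu_core 13 4   # vCPU 13 on a T1 (4 threads/core) sits on core 3
vcpu_core 13 8   # the same id on a T2 (8 threads/core) sits on core 1
```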
191



Ldom Best Practice (5)‫‏‬
• Plan your LDOM configuration carefully, reconfiguration may become awkward
• Use easy to understand names
> Try not to overload vds, vsw, ldom, vdisk,vnic etc...
• Use MPxIO or VxVM, VxFS, Sun Cluster on service domains (only VxFS in
Guests) for resilient storage devices
• Use IPMP on Guest or Service Domains for resilient network connections

192



Ldom Best Practice (6)‫‏‬
• For hi-speed inter-domain comms use device-less/in-memory VSW configs
• For high disk performance, allocate a whole real device via a dedicated, properly
sized Virtual Disk Server and Service domain
• Look at the server architecture when configuring devices to ensure you get the
bandwidth you expect
• For critical applications consider hot/warm standby domains across multiple
physical servers, never rely on multiple instances within a single server.

193



LDOM's v1.0.1 Notes
• All domains can be Stopped and Started independently
> Beware, Guest domains attempting to perform IO using a rebooting Service
domain will stall until the Service domain returns.
• LDOM SNMP MIB available now with traps and requests to the LDOM framework
• MAC address on the banner differs from the vnet MAC used for jumpstart
• Only vcpu's can be dynamically reallocated
> BUT... if the domain has crypto cores this becomes a delayed reconfiguration
> You cannot choose which vCPU's are allocated to a domain
• By default the Control/Service domain cannot network with Guest domains
> Plumb the vSwitch vsw device to enable communications
> Give the vsw device the e1000g devices MAC address
• Check you have the latest versions of the documents, Software & Firmware

194



SVM, VxVM, ZFS Volume managers
• SVM, VxVM and ZFS volumes can be exported from a Service Domain to Guest
domains and appear as virtual disks to the Guest Domains
> Always appear as a disk with only one s0 slice
> Can't be used as Solaris Install targets...yet, just use for data storage
• Can export a disk image file placed in one of these volumes as a full disk image to
Guest domains
> Allows use of the disk as Solaris Install Target
> Doing this with ZFS allows very efficient re-use of images using ZFS
Snapshotting and Cloning and Compression
> Invisibly bestows the benefits of the underlying Volume manager on the disks
available to the Guest domains
> Using SVM allows either Guest or Service domain to access the disk image, allowing for
off-line maintenance of the guest domain filesystems (only one at a time can mount the filesystem)
• VxVM can only be used in the Service domain, not Guest domains
195



Solaris Cluster 3.2 Support

• Sun Cluster 3.2 is now supported in IO Domains


> i.e domains with real physical devices, PCI busses or NIU devices
• Please check the web site here for more info on deployment scenarios
> http://blogs.sun.com/SC/entry/announcing_solaris_cluster_support_in

196



Logical Domains (LDoms) Roadmap
• LDoms 1.0
> Niagara support
> Up to 32 LDOMs per system; guest domains may be rebooted independently
> Virtualized console, ethernet, disk & cryptographic acceleration
> Live re-configuration of virtual CPUs
> FMA diagnosis for each domain
> Control domain hardening
• LDoms 1.0.1 - CURRENT
– Niagara2 support
– I/O domain reboot support
– Control domain minimization
– SNMP MIB
– Web management tool (freeware/unsupported)

* Requiring new Solaris 10 update

197



References for further information
• http://www.sun.com/ldoms
• Sun Blueprints relating to LDOMs
– http://www.sun.com/blueprints/0207/820-0832.html
– http://www.sun.com/blueprints/0807/820-3023.html
• SDLC Release of LDOMs
– http://www.sun.com/download/products.xml?id=46e5ba66
• Official Documentation for the SDLC release
– http://www.sun.com/servers/coolthreads/ldoms/get.jsp

• LDOMs Blogs
– http://blogs.sun.com/hlsu/entry/logincal_domains_1_0_1

• OpenSolaris LDOMs community


– http://www.opensolaris.org/os/community/ldoms/
198



LDOMS Introduction
and
Hands-On-Training

Peter Baer Galvin, Chief Technologist, Corporate Technologies
With Thanks to: Tom Gendron, SPARC Systems Technical Specialist, Sun Microsystems

199



Agenda
• Virtualization Comparisons
• Concepts of LDOMs
• Requirements of LDOMs
• Examples
• Best Practices

200



Single application per server
The Data Center Today
• Server sprawl is hard to manage
• Average server utilization between 5 to 15%
• Energy costs continue to rise
[Diagram: clients, developers, and services (app, mail, database servers) connected over the network to data center servers and storage]

201



A widely understood problem

202



Virtualization: Who and Why

InformationWeek: Feb 12, 2007 http://www.informationweek.com/news/showArticle.jhtml?articleID=197004875


203



Server Virtualization
• Hard Partitions (Multiple OSs)
> Very High RAS
> Very Scalable
> Mature Technology
> Ability to run different OS versions
> Complete Isolation
• Virtual Machines (Multiple OSs)
> Live OS migration capability
> Improved Utilization
> Ability to run different OS versions and types
> De-couples OS and HW
• OS Virtualization (Single OS)
> Very scalable and low overhead
> Single OS to manage
> Cleanly divides system and application administration
> Fine grained resource management
• Resource Mgmt. (Single OS)
> Very scalable and low overhead
> Single OS to manage
> Fine grained resource management



Para vs. Full Virtualization
• Para-virtualization:
> OS ported to special architecture
> Uses generic “virtual” device drivers
> More efficient since it is “hypervisor” aware
> “almost” native performance
• Full virtualization:
> OS has no idea it is running virtualized
> Must emulate real i/o devices
> Can be slow/need help from hardware
> May use traps, emulation or rewriting
[Diagram: para-virtualized guests run hypervisor-aware OS stacks; fully virtualized guests run unmodified OSes beside a control domain]

205



What is an LDOM?
• It is a virtual server
• Has its own console and OBP instance
• A configurable allocation of CPU, FPU, Disk, Memory and I/O components
• Runs a unique OS/patch image and configuration
• Has the capability to stop, start and reboot independently
• Utilizes a Hypervisor to facilitate LDOMs

206

Saturday, May 2, 2009


Requirements for LDOMs
• Sun T-Series server
> T1/2000 and T5x20 rack servers
> T6100, T6120 blades
> Any future CMT based server
• Up to date Firmware on service processor
http://sunsolve.sun.com/handbook_pub/validateUser.do?target=index
• minimum Solaris 10 11/06 on T1/2000, T6100
• minimum Solaris 10 08/07 T5x20, T6120
• Ldom Manager Software 1.0.1 + patches

207

Saturday, May 2, 2009


Hypervisor
• A thin interface between the Hardware and Solaris
• The interface is called sun4v
• Solaris calls the sun4v interface to use hardware-specific functions
• It is very simple and is implemented in firmware
• It allows for the creation of ldoms
• It creates communication channels between ldoms

208

Saturday, May 2, 2009


Key LDOMs components

• The Hypervisor
• The Control Domain
• The Service Domain
• Multiple Guest Domains
• Virtualised devices

[Figure: the primary domain (Control & Service, Solaris 10 08/07) runs ldmd, vntsd and drd and exports virtual disks (vdisk0/vdisk1, served by primary-vds0) and virtual networks (vnet0, via vsw0 on primary-vsw0) to guest domains ldom1 and ldom2; the hypervisor partitions the hardware's CPU threads, memory, crypto units and I/O devices (72GB disks, PCI-E, network) among the domains, with unallocated resources held in reserve]

209

Saturday, May 2, 2009


LDOMs types
• Different Ldom Types
- Control Domain - Hosts the Logical Domain Manager (LDM)

- Service Domains - Provides virtual services to other domains

- I/O Domains - Has direct access to physical devices

- Guest Domains - Used to run user environments


• Control, Service and I/O domains can be combined or separate
> One of the I/O domains must be the control domain

210

Saturday, May 2, 2009




'Control' Domain
• Creates and manages other LDOMs
• Runs the LDOM Manager software
• Allows monitoring and reconfiguration of domains
• Recommendation:
> Make this Domain as secure as possible

212

Saturday, May 2, 2009




'Service' Domain
• Provides services to other domains
– virtual network switch
– virtual disk service
– virtual console service
• Multiple Service domains can exist with shared or sole
access to system facilities
• Allows for IO load separation and redundancy within
domains deployed on a platform
• Often Control and Service Domains are one and the
same

214

Saturday, May 2, 2009


IO Domain
• IO Domain has direct access to physical input and
output devices.
• The number of IO domains is hardware dependent
> currently limited to 2
> limited by PCI-E switch configuration
• One IO domain must also be the control domain

215

Saturday, May 2, 2009




'Guest' Domains
• Contain the targeted applications the LDOMs were
created to service.
• Multiple Guest domains can exist
> Constrained only by hardware limitations
• May use one or more Service domains to obtain IO
> Various redundancy mechanisms can be used
• Can be independently 'powered' and rebooted without
affecting other domains

217

Saturday, May 2, 2009




Virtual devices
• Virtual devices are hardware resources abstracted by the hypervisor and
made available for use by the other domains
• Virtual devices are :
> CPUs - VCPU
> Memory
> Crypto cores - MAU
> Network switches - VSW
> NICs - VNET
> Disk servers - VDSDEV
> Disks - VDISK
> Consoles - VCONS

219

Saturday, May 2, 2009


Example 1
Install Ldom Manager &
Setting up the Control Domain

220

Saturday, May 2, 2009


Example 1 steps
• Update firmware to latest release
• Install Supported version of Solaris
• Install Logical Domain Manager (LDM) software
• Configure the control domain
• Save initial domain config
• Reboot Solaris

221

Saturday, May 2, 2009


A note on system interfaces
• Provide out-of-band management
• Two types (iLOM and ALOM)
• T1/2000 uses ALOM interface
• T5x20 uses iLOM
• iLOM “CLI” has an ALOM compatibility shell
> ALOM shell used in the examples
• A web based interface available
• (SC = system controller, SP = service processor)
> essentially the same thing.

222

Saturday, May 2, 2009


Web based iLOM interface

223

Saturday, May 2, 2009


ALOM compatibility shell
• login to SP as root/changeme
• -> create /SP/users/admin
• -> set /SP/users/admin role=Administrator
• -> set /SP/users/admin cli_mode=alom
– Creating user ...
– Enter new password: ********
– Enter new password again: ********
– Created /SP/users/admin
• exit
• login as admin

224

Saturday, May 2, 2009


Step 1
Firmware verification and update

225

Saturday, May 2, 2009


System Identification and Update
• Check the Service Processor of your system for firmware levels
• using alom mode (showhost not available in bui)

sc> showhost
Sun System Firmware 7.0.1 2007/09/14 16:31

Host flash versions:

Hypervisor 1.5.1 2007/09/14 16:11
OBP 4.27.1 2007/09/14 15:17
POST 4.27.1 2007/09/14 15:43

(Check the SC firmware version: 7.0.1 here)

• Upgrade your system firmware if needed...
> flashupdate command
> sysfwdownload (via Solaris on platform)
> BUI

226

Saturday, May 2, 2009


Firmware update example
sc> showkeyswitch
Keyswitch is in the NORMAL position.
sc> flashupdate -s 10.8.66.15 -f /incoming//Sun_System_Firmware-6_4_6-Sun_Fire_T2000.bin
Username: tgendron
Password: ********

SC Alert: System poweron is disabled.


Update complete. Reset device to use new software.
sc>
sc> resetsc

telnet and login back in once up.

sc> showhost
Sun-Fire-T2000 System Firmware 6.5.5 2007/10/28 23:09

227

Saturday, May 2, 2009


Firmware update example 2
Step 1:
From Solaris running on T5120 with the SP to update
Download the patch from Sun Solve 127580-05.zip

Step 2:
unzip and cd into 127580-05
Step 3:

run sysfwdownload [image].pkg


Step 4:
reboot solaris
sc> resetsc

228

Saturday, May 2, 2009


Installing LDOM manager software
• T5x20 requires Solaris 10 8/07 or greater
• T1/2000 requires Solaris 10 11/06 or greater +
* 124921-02 at a minimum
* 125043-01 at a minimum
* 118833-36 at a minimum
• 11/06 is minimum for guests
• ldm 1.0.2 is current
> includes Solaris Security Toolkit (optional)
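The patch floor listed above is easy to check mechanically. A minimal sketch, assuming `showrev -p` output is piped in; it only checks that each patch is present at some revision, a real check would also compare the `-NN` revision numbers:

```shell
# check_patches: read `showrev -p` output on stdin and report any of the
# minimum T1/2000 patches listed above that are absent entirely.
# (Presence-only check; revision comparison is left out for brevity.)
check_patches() {
    rev=$(cat)
    for p in 124921 125043 118833; do
        case $rev in
            *"Patch: $p"*) ;;                 # found at some revision
            *) echo "missing $p" ;;
        esac
    done
}
```

Usage on the control domain: `showrev -p | check_patches`.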

229

Saturday, May 2, 2009


Install the LDM Software
• Unzip and install w/installation script
• Security of Control Domain is important
> Recommend selecting the JASS secure configuration
• Once complete entire system is one LDOM
• LDOM software installed in /opt/SUNWldm
# [cmt1/root] ldm list
NAME STATE FLAGS CONS VCPU MEMORY UTIL UPTIME
primary active -n-cv SP 64 8064M 0.0% 3h 19m
[cmt1/root]

All the system resources are in domain “primary”


* Follow the Administration Guide to install required OS and patches
230

Saturday, May 2, 2009


Flag Definitions
# ldm list

NAME STATE FLAGS CONS VCPU MEMORY UTIL UPTIME


primary active -n-cv SP 32 32640M 0.1% 6d 20h 24m
#

- placeholder
c control domain
d delayed reconfiguration
n normal
s starting or stopping
t transition
v virtual I/O domain
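As a quick illustration, the FLAGS column can be decoded mechanically from the table above. A toy helper, not part of the ldm toolset:

```shell
# decode_flags: expand an ldm FLAGS string (e.g. "-n-cv") into the
# meanings tabulated above. "-" is a placeholder and prints nothing.
decode_flags() {
    printf '%s\n' "$1" | fold -w1 | while read -r c; do
        case $c in
            c) echo "control domain" ;;
            d) echo "delayed reconfiguration" ;;
            n) echo "normal" ;;
            s) echo "starting or stopping" ;;
            t) echo "transition" ;;
            v) echo "virtual I/O domain" ;;
        esac
    done
}
```

`decode_flags "-n-cv"` prints "normal", "control domain" and "virtual I/O domain", matching the primary domain in the listing above.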

231

Saturday, May 2, 2009


Example 1
Part 2
Setting up the Control Domain

232

Saturday, May 2, 2009


On naming things...
• Choose LDOM component names carefully
> Names are used to manage the devices
> Bad choices can be very confusing later on...
> Keep names short and specific...

• You need names for ...


> Disk Servers, and disk device instances
> Network Virtual Switches, and network device instances
> Domains

• Service and device names are only known to the Control


and Service domains
– Guest domains just see virtual devices.

233

Saturday, May 2, 2009


Control/Service Domain

• On our 'Primary' Domain do the following ...
• In this example Control and Service are combined
> Control domain runs the LDM
> Service domain has these services set up:
• Set up the basic services needed:
> vds - virtual disk service
> vcc - virtual console concentrator
> vsw - virtual network switch
• The service names in this example are:
> primary-vds0
> primary-vcc0
> primary-vsw0
• Allocate resources
> CPU, Memory, Crypto, IO devices

[Figure: the primary domain (Control & Service, Solaris 10 08/07) runs ldmd, drd and vntsd and hosts the vcc0, vds0 and vsw0 service instances; the hypervisor shares the remaining CPU, memory, crypto and I/O resources]


234

Saturday, May 2, 2009


Control/Service Domain set-up (1)
# Add services to the control domain
# The mac address taken from a physical interface, e.g., e1000g0.
ldm add-vds primary-vds0 primary
ldm add-vcc port-range=5000-5100 primary-vcc0 primary
ldm add-vsw mac-addr=0:14:4f:6a:9e:dc net-dev=e1000g0 primary-vsw0 primary
# Activate the virtual network terminal server
svcadm enable vntsd
# Allocate resources to the control domain and save
ldm set-mau 1 primary
ldm set-vcpu 8 primary
ldm set-memory 2G primary
ldm add-spconfig my-initial
# Reboot required to have the configuration take effect.
init 6

235

Saturday, May 2, 2009


Crypto Note
Note: If you have any cryptographic units bound to the control domain, you cannot
dynamically reconfigure its CPUs. So if you are not using cryptographic devices,
set the MAU count to 0.

236

Saturday, May 2, 2009


Control/Service Domain set-up (2)
# Verify the primary domain configuration
ldm list-domain
NAME STATE FLAGS CONS VCPU MEMORY UTIL UPTIME
primary active -n-cv SP 8 2G 6.3% 6m
# Enable Networking
ifconfig vsw0 plumb
ifconfig e1000g0 down unplumb
ifconfig vsw0 10.8.66.208 netmask 255.255.255.0 broadcast + up
ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
vsw0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
inet 10.8.66.208 netmask ffffff00 broadcast 10.8.66.255
ether 0:14:4f:6a:9e:dc
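The ifconfig changes above do not survive a reboot. One way to persist them, an assumption based on the standard /etc/hostname.* boot convention rather than anything shown in the slide, is to move the interface file from the physical NIC to the vswitch; sketched as a function so the target directory can be pointed somewhere safe for a dry run:

```shell
# persist_vsw: carry the physical NIC's address configuration over to the
# virtual switch by renaming hostname.<nic> to hostname.<vsw>.
# $1 = etc directory (use /etc for real), $2 = physical NIC, $3 = vsw name.
persist_vsw() {
    etc=$1 nic=$2 vsw=$3
    if [ -f "$etc/hostname.$nic" ]; then
        mv "$etc/hostname.$nic" "$etc/hostname.$vsw"
        echo "boot will now plumb $vsw instead of $nic"
    else
        echo "no $etc/hostname.$nic to migrate" >&2
        return 1
    fi
}
```

For real use on the control domain: `persist_vsw /etc e1000g0 vsw0`.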

237

Saturday, May 2, 2009


Ldom Service details

238

Saturday, May 2, 2009


Reconfiguration
• Dynamic reconfiguration
> Resource changes that take effect w/out reboot of domain
• Delayed reconfiguration
> Resource changes that take effect after a reboot
• Resource examples:
> VCPU, Memory, IO devices
• Currently only VCPUs are dynamic

239

Saturday, May 2, 2009


Virtual Disk Server device (vds)
Delayed Reconfiguration

• VDS runs in a service domain


• Performs disk I/O on the corresponding raw devices
• Device types can be
> An entire physical disk or LUN (can be SAN based)
> Single slice of disk or LUN
> Disk image in a filesystem (e.g. ufs, zfs)
> Disk volumes (zfs, svm, VxVM)
> lofi devices NOT supported
• Virtual Disk Client (vdc drivers)
> Requests standard block IO via the VDS
> Classic client/server architecture

240

Saturday, May 2, 2009


Virtual Disk devices
• Physical LUNs perform best
• Disk image files make efficient use of space
• ZFS snapshots and clones give rapid provisioning
• Network install not supported with
> zfs volumes
> single slice
• Network install requires
> entire disk
> disk image file

241

Saturday, May 2, 2009


Virtual Network Switch service (vswitch)
Delayed Reconfiguration

• Implements a layer-2 network switch
• Connects virtual network devices
> To the physical network
> or to each other (internal private network)
• vswitch not automatically used by the service domain
> must be plumbed

242

Saturday, May 2, 2009


Virtual Console Concentrator (vcc)
Delayed Reconfiguration

• Provides console access to LDoms
• Service domain VCC driver communicates with all guest console
drivers over the Hypervisor
> No changes required in guest console drivers (qcn)
• Makes each console available as a tty device on the Control/Service
domain
• usage: telnet localhost <port>

243

Saturday, May 2, 2009


Virtual Network Terminal Server
daemon (vntsd)
Delayed
Reconfiguration

• VCC implemented by vntsd


• Runs in the Control/Service domain
• Aggregates the VCC tty devices and makes them available over
network sockets
> Accessible once a domain is configured and bound
> Attach prior to domain start to watch domain OBP boot sequence
• Only one user at a time can view a serial console
• Flexible support of port groups, IP's, port numbers etc
> Not visible outside the Control/Service domain by default

244

Saturday, May 2, 2009


Example 2
Setting up the Guest Domain

245

Saturday, May 2, 2009


Guest Domain

In the control domain:

ldm add-domain ldm1
ldm add-mau 1 ldm1
ldm add-vcpu 4 ldm1
ldm add-memory 4G ldm1
ldm add-vnet vnet0 primary-vsw0 ldm1
ldm add-vdsdev /dev/dsk/c0t1d0s2 ldm1-vol1@primary-vds0
ldm add-vdisk ldm1-vdisk1 ldm1-vol1@primary-vds0 ldm1
ldm set-var auto-boot\?=false ldm1
ldm set-var boot-device=vdisk ldm1
ldm set-var nvramrc-devalias vnet0 /virtual-devices@100/channel-devices@200/network@0 ldm1
ldm bind-domain ldm1
ldm start-domain ldm1

• Watch the console of ldom1 using ...
> telnet localhost 5000

[Figure: on a T2000, guest ldm1 (Solaris 10 11/06 +app+patches) boots from ldm1-vdisk1, backed by /dev/dsk/c0t1d0s2 exported as ldm1-vol1 through primary-vds0, and reaches the network through vnet0 on primary-vsw0]

246

Saturday, May 2, 2009


Disk Service Setup

• Establish a Virtual Disk Service
– 'primary-vds'
• Associate it with some form of media
– A real device or slice, e.g. /dev/dsk/c0t1d0s0 or /dev/c0t1d0s2
– or a disk image, e.g. '/ldmzpool/ldg1'
• Create a disk server device instance to be exported to guest domains
– 'ldm1-vol1@primary-vds'

ldm add-vdsdev /dev/dsk/c0t1d0s2 ldm1-vol1@primary-vds0
ldm add-vdisk ldm1-vdisk1 ldm1-vol1@primary-vds0 ldm1

(The disk device name can vary - find it via “ok show-devs”)

247

Saturday, May 2, 2009


Virtual Disk Client (vdc) Delayed
Reconfiguration

• vdc's are the objects passed to OBP and the Operating System in guest
systems
• Guest domain OBP and Solaris sees normal SCSI devices
• Domain administrators may setup devaliases or use raw vdisk devices
• vdc’s provide Guest domains with virtual disk devices (vdisks) via device
instances from Virtual Disk Servers running in the Service Domains(s)
• A future release will provide virtualised access to DVD/CD-ROM in
service domains

248

Saturday, May 2, 2009


Network Setup

• Establish a Virtual Network Switch service
– 'primary-vsw0'
> Automatically associated with a vsw device instance
– 'vsw0'
• May or may not choose to associate it with media
– 'e1000g0', a real NIC
– or no NIC, i.e. an in-memory-only switch
• Create a network device instance to provide to guest domains
– 'vnet0@ldm1'

[Figure: vnet0 in guest ldom1 connects through primary-vsw0 in the service domain, which can be backed by the physical e1000g0 NIC]

249

Saturday, May 2, 2009


Virtual Network Device (vnet) Delayed
Reconfiguration

• Implements an ethernet device in a domain


> Communicates with other vnets or the outside world over vswitch devices
• If the vSwitch is suitably configured, packets can be routed out of the
server.
• vnet exports a GLDv3 interface
> A simple virtual Ethernet NIC
> Enumerates as a 'vnetx' device
> For domain-domain transfers, vnets connect 'directly'.

250

Saturday, May 2, 2009


Memory Delayed
Reconfiguration

• Memory is configured through the Control Domain


• Minimum allocatable chunk is 8kB
> Minimum size is 12MB (for OBP)
> Though most OS deployments will need > 512M
• If memory is added over time to a domain
> Memory device bindings within a domain may appear to show that
memory fragmentation is occurring
> Not a problem, all handled in HW by the MMU
> No performance penalty

251

Saturday, May 2, 2009


vCPU's Immediate
Reconfiguration

• Each UltraSPARC T1 has up to 8 physical cores with 4 threads each


> Each thread is considered a vCPU, so up to 32 vCPUs or Domains
• Each UltraSPARC T2 has up to 8 physical cores with 8 threads each
> Each thread is considered a vCPU, so up to 64 vCPUs or Domains
• Maximum Granularity is 1 vCPU per domain
• vCPU's can only be allocated to one Domain at a time.
• Can be dynamically allocated with the Domain running,
> Take care if removing a vcpu from a running domain, will there be
enough compute power left in the domain ?
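The thread arithmetic above is worth keeping straight when planning how many domains a box can host; a trivial illustrative helper:

```shell
# vcpus: total vCPU budget = physical cores x threads per core.
vcpus() { echo $(( $1 * $2 )); }

vcpus 8 4   # UltraSPARC T1: 8 cores x 4 threads = 32 vCPUs
vcpus 8 8   # UltraSPARC T2: 8 cores x 8 threads = 64 vCPUs
```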

252

Saturday, May 2, 2009


Example 3
Guest Domains and ZFS

253

Saturday, May 2, 2009


Using ZFS (1) – setup zfs
1. Remove the disk from the service domain
ldm stop-domain ldm1
LDom ldm1 stopped
ldm unbind-domain ldm1
ldm remove-vdsdev ldm1-vol1@primary-vds0

2. Create ZFS filesystems to hold the disk image
root@cmt1 > zfs create mypool/ldoms
root@cmt1 > zfs create mypool/ldoms/ldm1
root@cmt1 > cd /export/ldoms/ldm1
root@cmt1 > ls
root@cmt1 > mkfile 12G `pwd`/rootdisk

254

Saturday, May 2, 2009


Using ZFS (2) – setup guest domain
3. Configure the guest domain
root@cmt1 > ldm add-domain ldm1
root@cmt1 > ldm add-vcpu 8 ldm1
root@cmt1 > ldm add-memory 1G ldm1
root@cmt1 > ldm add-vnet vnet0 primary-vsw0 ldm1
root@cmt1 > ldm add-vdsdev /export/ldoms/ldm1/rootdisk ldm1-vol1@primary-vds0
root@cmt1 > ldm add-vdisk ldm1-vdisk1 ldm1-vol1@primary-vds0 ldm1

root@cmt1 > ldm set-var auto-boot\?=false ldm1


root@cmt1 > ldm set-var boot-device=ldm1-vdisk1 ldm1
root@cmt1 > ldm set-var nvramrc-devalias vnet0 /virtual-devices@100/channel-
devices@200/network@0 ldm1

255

Saturday, May 2, 2009


Using ZFS (3) – setup guest domain
4. Start the guest domain
root@cmt1 > ldm bind-domain ldm1
root@cmt1 > ldm start-domain ldm1
LDom ldm1 started

5. Inspect the domain


root@cmt1 > ldm list-domain
NAME STATE FLAGS CONS VCPU MEMORY UTIL UPTIME
primary active -n-cv SP 8 2G 0.7% 17h 12m
ldm1 active -t--- 5000 8 1G 13% 7s

telnet localhost 5000


{ok} boot vnet0 - install
installation goes forward

256

Saturday, May 2, 2009


Provision the guest
6. Set up for jumpstart
Determine the mac address
root@cmt1 > ldm list-bindings ldm1
[snip]
NETWORK
NAME SERVICE DEVICE MAC
vnet0 primary-vsw0@primary network@0 00:14:4f:f8:2a:c4
PEER MAC
primary-vsw0@primary 00:14:4f:46:41:b4

telnet localhost 5000


{0} ok banner
SPARC Enterprise T5120, No Keyboard
[snip]
Ethernet address 0:14:4f:fb:7:42, Host ID: 83fb0742.
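Pulling the vnet MAC out of the listing can be scripted. A sketch that assumes the column layout shown above (device name in the first column, MAC in the last):

```shell
# vnet_mac: read `ldm list-bindings <dom>` output on stdin and print the
# MAC of the named vnet device (default vnet0), i.e. the address to
# register for jumpstart rather than the banner MAC.
vnet_mac() {
    awk -v dev="${1:-vnet0}" '$1 == dev { print $NF; exit }'
}
```

Usage: `ldm list-bindings ldm1 | vnet_mac vnet0`.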

257

Saturday, May 2, 2009


Provision the guest (2)
{0} ok boot vnet0 - install
Boot device: /virtual-devices@100/channel-devices@200/network@0 File and args: - install
Requesting Internet Address for 0:14:4f:f8:2a:c4
SunOS Release 5.10 Version Generic_120011-14 64-bit
...

How to break
telnet> send brk
Debugging requested; hardware watchdog suspended.

c)ontinue, s)ync, r)eboot, h)alt? r


Resetting...

{0} ok

258

Saturday, May 2, 2009


Guest Domain (zfs) login
{0} ok boot
Boot device: ldm1-vdisk1 File and args:
SunOS Release 5.10 Version Generic_120011-14 64-bit
Copyright 1983-2007 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
Hostname: ldm1

ldm1 console login:

259

Saturday, May 2, 2009


Using ZFS (2) – cloning domains
Snapshot and Clone the installed boot disk
tgendron@cmt1 > zfs list
NAME USED AVAIL REFER MOUNTPOINT
mypool 12.0G 54.9G 27.5K /export
mypool/ldoms 12.0G 54.9G 25.5K /export/ldoms
mypool/ldoms/ldm1 12.0G 54.9G 12.0G /export/ldoms/ldm1

root@cmt1 > zfs snapshot mypool/ldoms/ldm1@initial

Create the clones


root@cmt1 > zfs clone mypool/ldoms/ldm1@initial mypool/ldoms/ldm2
root@cmt1 > zfs clone mypool/ldoms/ldm1@initial mypool/ldoms/ldm3
root@cmt1 > zfs clone mypool/ldoms/ldm1@initial mypool/ldoms/ldm4
root@cmt1 > zfs clone mypool/ldoms/ldm1@initial mypool/ldoms/ldm5

260

Saturday, May 2, 2009


Using ZFS (2) – Leverage the clones
4. Create the new guest domains (it should be easy to script this)
ldm add-domain ldm2
ldm add-vcpu 8 ldm2
ldm add-memory 1G ldm2
ldm add-vnet vnet0 primary-vsw0 ldm2

ldm add-vdsdev /export/ldoms/ldm2/rootdisk ldm2-vol1@primary-vds0


ldm add-vdisk ldm2-vdisk1 ldm2-vol1@primary-vds0 ldm2

ldm set-var auto-boot\?=false ldm2


ldm set-var boot-device=vdisk ldm2
ldm set-var nvramrc-devalias vnet0 /virtual-devices@100/channel-devices@200/network@0 ldm2
ldm bind-domain ldm2
ldm start-domain ldm2
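As the slide says, this is easy to script. A sketch that emits the command sequence for a named clone so it can be reviewed before running; names follow the ldm2/rootdisk conventions above, and the output can be piped to sh on the control domain:

```shell
# mkguest: print the ldm commands to create a guest from a cloned
# /export/ldoms/<name>/rootdisk image, following the naming used above.
mkguest() {
    name=$1
    cat <<EOF
ldm add-domain $name
ldm add-vcpu 8 $name
ldm add-memory 1G $name
ldm add-vnet vnet0 primary-vsw0 $name
ldm add-vdsdev /export/ldoms/$name/rootdisk $name-vol1@primary-vds0
ldm add-vdisk $name-vdisk1 $name-vol1@primary-vds0 $name
ldm set-var auto-boot\?=false $name
ldm set-var boot-device=vdisk $name
ldm bind-domain $name
ldm start-domain $name
EOF
}
```

Usage: `for d in ldm3 ldm4 ldm5; do mkguest $d | sh; done`.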

261

Saturday, May 2, 2009


Boot the cloned ldom
{0} ok boot
Boot device: vdisk File and args:
SunOS Release 5.10 Version Generic_120011-14 64-bit
Copyright 1983-2007 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
WARNING: vnet0 has duplicate address 010.030.019.178 (in use by 00:14:4f:f8:2a:c4); disabled
Feb 13 19:55:29 svc.startd[7]: svc:/network/physical:default:
Method "/lib/svc/method/net-physical" failed with exit status 96.
Feb 13 19:55:29 svc.startd[7]: network/physical:default misconfigured:
transitioned to maintenance (see 'svcs -xv' for details)‫‏‬
Hostname: ldm1...

262

Saturday, May 2, 2009


Example 4
Split Service Domains

263

Saturday, May 2, 2009


Sun Fire T2000 Block Diagram

264

Saturday, May 2, 2009


Split IO Example
• Setting up a second Service domain with split PCI busses...
-bash-3.00# ldm list-bindings primary
Name: primary
...
IO: pci@780 (bus_a)
pci@7c0 (bus_b)
...
-bash-3.00# df /
/ (/dev/dsk/c1t0d0s0 ):28233648 blocks 3450076 files
-bash-3.00# ls -l /dev/dsk/c1t0d0s0
lrwxrwxrwx 1 root root 65 Apr 11 13:25 /dev/dsk/c1t0d0s0 -> ../../devices/
pci@7c0/pci@0/pci@1/pci@0,2/LSILogic,sas@2/sd@0,0:a
-bash-3.00# grep e1000g /etc/path_to_inst
"/pci@780/pci@0/pci@1/network@0" 0 "e1000g"
"/pci@780/pci@0/pci@1/network@0,1" 1 "e1000g"
"/pci@7c0/pci@0/pci@2/network@0" 2 "e1000g"
"/pci@7c0/pci@0/pci@2/network@0,1" 3 "e1000g"
-bash-3.00# ldm remove-io pci@780 primary
..
-bash-3.00# shutdown -i6 -y -g0
..
-bash-3.00# ldm add-io pci@780 second-srvc-dom
-bash-3.00# ldm start second-srvc-dom
-bash-3.00# ldm list-bindings
..
-bash-3.00#

Check which PCI bus ports we own and are currently using and be sure to only give away unused
ones... i.e need to retain the Control Domain boot disk controller and network device...
Providing a PCI bus to a Guest makes the selected Domain a Service domain, by definition – access
to physical IO = Service Domain.
265

Saturday, May 2, 2009


Sun Fire T5x20 Block Diagram

[Figure: T5x20 block diagram: 16 FB-DIMMs; LSI 1068E SAS controller (x8) to the disk chassis (1RU, 2RU/8) and DVD; PLX 8533/8517 PCI-E switches fanning out to the PCI-E slots; dual Intel GbE and 10GbE (BCM8704, XFP fibre plugin) networking; USB 2.0 hub; MPC885/FPGA-based ILOM service processor; rear panel: USB, quad GbE, PCI-E x16/x8 slots, serial and network management ports, DB-9 serial]

266

Saturday, May 2, 2009


MPxIO considerations
• MPxIO can be used in the Service/Control domain
• Very straightforward to configure with defaults...
> Ensure you have two FC-AL HBAs in a single service domain attached to the same SAN array
> Check that you have two paths to the same SAN devices ('ls /dev/dsk/')
> Enable MPxIO by running the command 'stmsboot -e' and rebooting the control/service domain
> Check that you now have only a single path to the SAN devices...

267

Saturday, May 2, 2009


IPMP considerations
• IPMP has several options for configurations
> Refer to the Administration Guide for worked examples...
> Options are Multipathing in the Service Domain or Multipathing in the Guest Domain

268

Saturday, May 2, 2009


Ldom 1.0.1
Best Practice Guidance

269

Saturday, May 2, 2009


Ldom Best Practice (1)
• Control Domain
> Runs LDM daemon processes
> Must have adequate CPU and memory
> Start w/ 1 core (4 or 8 threads) 1GB Memory
> Make this domain as secure as possible

270

Saturday, May 2, 2009


Ldom Best Practice (2)
• I/O and service domains
> Runs IO for other domains
> Resources will be sized based on IO load
> Start w/ 1 core and 1GB memory
> 4GB of memory if zfs is used for virtual disk images
> Add complete cores as I/O loads grow heavier

271

Saturday, May 2, 2009


Ldom Best Practice (3)
• Core/Thread Affinity
> Core resources are shared by threads
> E.g. L1 cache and MAU, FPU
• Best to avoid allocating the threads of a core to
separate domains
• Create larger Ldoms first using complete cores
• Smaller domains last

272

Saturday, May 2, 2009


Ldom Best Practice (4)
Delayed Reconfiguration

• Crypto Units
• Each T1/T2 physical CPU core has a Crypto Unit
> 8 in total on an 8 core system
> referred to as a MAU
• Crypto cores can only be allocated to domains that have at least one
vcpu(thread) on the same physical Core as the crypto unit
• Crypto cores cannot be shared, they are owned by exactly one (or no)
Domain
• Probably best to allocate all four/eight threads on a Core to a domain
that wants to use the Crypto core

273

Saturday, May 2, 2009


More on Crypto Units

• For example, we define three domains in order: LDOM1, then LDOM2, then LDOM3...
• LDOM1 has 3 threads (vCPUs) on Core 0
> Only has access to MAU0 since it only has threads on Core 0
• LDOM2 has 6 vCPUs spread across Cores 0, 1 & 2
> Potentially has access to MAUs 0, 1 & 2
> BUT... LDOM1 already binds MAU0
> So it can only take MAU1 and MAU2
• LDOM3 has 3 vCPUs on Core 2
> But can't access any MAUs since LDOM2 has already taken MAU2
• Adding and removing vCPUs can cause access to previously accessible MAUs to be lost;
currently you can't elect specific vCPUs, the framework does that itself
• When MAUs are allocated to Domains, vCPUs become delayed reconfiguration properties
in those domains

[Figure: LDOM1, LDOM2 and LDOM3 spread across T1 Cores 0-2, each core with its own MAU]

274

Saturday, May 2, 2009


Ldom Best Practice (5)
• Plan your LDOM configuration carefully, reconfiguration may become awkward
• Use easy to understand names
> Try not to overload vds, vsw, ldom, vdisk,vnic etc...

• Use MPxIO or VxVM, VxFS, Sun Cluster on service domains (only VxFS in Guests) for
resilient storage devices
• Use IPMP on Guest or Service Domains for resilient network connections

275

Saturday, May 2, 2009


Ldom Best Practice (6)
• For hi-speed inter-domain comms use device-less/in-memory VSW configs
• For high disk performance, allocate a whole real device via a dedicated, properly sized
Virtual Disk Server and Service domain
• Look at the server architecture when configuring devices to ensure you get the
bandwidth you expect
• For critical applications consider hot/warm standby domains across multiple physical
servers, never rely on multiple instances within a single server.

276

Saturday, May 2, 2009


LDOM's v1.0.1 Notes
• All domains can be Stopped and Started independently
> Beware, Guest domains attempting to perform IO using a rebooting Service domain
will stall until the Service domain returns.
• LDOM SNMP MIB available now with traps and requests to the LDOM framework
• MAC address on banner is different from what rarpd sees for jumpstart
• Only vcpu's can be dynamically reallocated
> BUT... if the domain has crypto cores this becomes a delayed reconfiguration
> You cannot choose which vCPU's are allocated to a domain
• By default the Control/Service domain cannot network with Guest domains
> Plumb the vSwitch vsw device to enable communications
> Give the vsw device the e1000g device's MAC address
• Check you have the latest versions of the documents, Software & Firmware

277

Saturday, May 2, 2009


SVM, VxVM, ZFS Volume managers
• SVM, VxVM and ZFS volumes can be exported from a Service Domain to Guest domains
and appear as virtual disks to the Guest Domains
> Always appear as a disk with only one s0 slice
> Can't be used as Solaris Install targets...yet, just use for data storage
• Can export a disk image file placed in one of these volumes as a full disk image to Guest
domains
> Allows use of the disk as Solaris Install Target
> Doing this with ZFS allows very efficient re-use of images using ZFS Snapshotting and
Cloning and Compression
> Invisibly bestows the benefits of the underlying Volume manager on the disks available
to the Guest domains
> Using SVM allows either Guest or Service domain to access the disk image, allowing for off-line maintenance of the guest domain filesystems (only one at a time can mount the filesystem)
• VxVM can only be used in the Service domain, not Guest domains
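As a sketch of the disk-image approach described above, the backing file can be created sparse and then exported to a guest (paths, volume, and domain names are illustrative; the ldm steps run only on a service domain, so they appear as comments):

```shell
# Create a sparse 8 GB backing file for a guest's virtual boot disk.
# Paths and names are illustrative.
mkdir -p /tmp/ldoms/images
dd if=/dev/zero of=/tmp/ldoms/images/guest1.img bs=1 count=0 seek=8G 2>/dev/null
ls -l /tmp/ldoms/images/guest1.img

# On the service domain, export it as a full disk (Solaris-only):
#   ldm add-vdsdev /ldoms/images/guest1.img vol1@primary-vds0
#   ldm add-vdisk vdisk1 vol1@primary-vds0 guest1
# With a ZFS-backed file, "zfs snapshot" + "zfs clone" of the containing
# dataset gives cheap re-use of installed images.
```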

278



Solaris Cluster 3.2 Support

• Sun Cluster 3.2 is now supported in IO Domains


> i.e. domains with real physical devices, PCI busses, or NIU devices
• Please check the web site here for more info on deployment scenarios
> http://blogs.sun.com/SC/entry/announcing_solaris_cluster_support_in

279



Logical Domains (LDoms) Roadmap

• LDoms 1.0
> Niagara support
> Up to 32 LDoms per system; guest domains may be rebooted independently
> Virtualized console, ethernet, disk & cryptographic acceleration
> Live re-configuration of virtual CPUs
> FMA diagnosis for each domain
> Control domain hardening

• LDoms 1.0.1 - CURRENT
– Niagara2 support
– I/O domain reboot support
– Control domain minimization
– SNMP MIB
– Web management tool (freeware/unsupported)

* Requiring new Solaris 10 update


280



References for further information

• http://www.sun.com/ldoms
• Sun Blueprints relating to LDOMs
– http://www.sun.com/blueprints/0207/820-0832.html
– http://www.sun.com/blueprints/0807/820-3023.html
• SDLC Release of LDOMs
– http://www.sun.com/download/products.xml?id=46e5ba66
• Official Documentation for the SDLC release
– http://www.sun.com/servers/coolthreads/ldoms/get.jsp

• LDOMs Blogs
– http://blogs.sun.com/hlsu/entry/logincal_domains_1_0_1
• OpenSolaris LDOMs community
– http://www.opensolaris.org/os/community/ldoms/

281



Domains

Copyright 2009 Peter Baer Galvin - All Rights Reserved 282



Overview
Long-standing Sun server feature
E10Ks and all servers since then
Hard partition of system resources (bus, CPU,
memory, I/O)
Options vary depending on hardware (how many
domains, CPUs per domain)
Sometimes used in conjunction with Dynamic
Reconfiguration (DR)
Controlled via firmware commands (XSCF on M-servers)
Copyright 2009 Peter Baer Galvin - All Rights Reserved 283



Prep Work
Do this before installing Solaris / moving to production

Determine number of domains, resources per domain (CPU, memory, I/O)

Make sure I/O is redundant between allocation units (so, for example, a system board can be taken out of service without disabling I/O to a device)

PCI cards must support DR (per device)

Leave “kernel cage memory” enabled to minimize the number of system boards that kernel memory is allocated to

Enabled by default in S10 (but costs a little performance)

Disable via set kernel_cage_enable=0 in /etc/system
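As a sketch, the change can be staged against a copy of /etc/system first (the copy lives in /tmp here; the setting only takes effect after a reboot of the real system):

```shell
# Stage the kernel cage tunable in a copy of /etc/system before editing
# the real file (takes effect only after reboot).
cp /etc/system /tmp/system.staged 2>/dev/null || touch /tmp/system.staged
cat >> /tmp/system.staged <<'EOF'
* Disable kernel cage memory (frees the DR constraint, costs some performance)
set kernel_cage_enable=0
EOF
grep kernel_cage_enable /tmp/system.staged
```

Note that comments in /etc/system start with `*`, not `#`.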
Copyright 2009 Peter Baer Galvin - All Rights Reserved 284



Prep Work (cont)

Copyright 2009 Peter Baer Galvin - All Rights Reserved 285



M-Servers

Copyright 2009 Peter Baer Galvin - All Rights Reserved 286



Server model    Max system boards    Max domains
------------    -----------------    -----------
M9000+EU        16                   24
M9000           8                    24
M8000           4                    16
M5000           2                    4
M4000           1                    2

Copyright 2009 Peter Baer Galvin - All Rights Reserved 287



Implementation

For M-servers, see http://docs.sun.com/source/819-3601-13
setupfru, showfru, setdcl, addboard, showdcl, showboards commands configure resources into domains in XSCF

Copyright 2009 Peter Baer Galvin - All Rights Reserved 288



DR
Can install, remove, add, delete, move, register, configure, unconfigure, etc. system boards
A system board is in one domain at a time
Move resources as needed between domains
Movement can be automated or manual
And I/O devices
While Solaris remains running
Good details in http://docs.sun.com/source/819-5992-12

Copyright 2009 Peter Baer Galvin - All Rights Reserved 289



Implementation
XSCF used to configure DR
Shell and Web interfaces
Add to Domain command set: showdcl, setdcl, addboard, deleteboard, moveboard, showdomainstatus
cfgadm and cfgadm_pci configures DR on I/O
devices
Be sure to configure and implement all of this before
going production - don’t plan on adding a domain to
a production system without practice and experience
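A board move using the XSCF command set above might look like the following sequence (board ID 00-0 and domain ID 1 are illustrative; this runs only on an XSCF console, so it is shown for reference):

```shell
XSCF> showboards -a                      # list boards and their domain assignments
XSCF> showdcl -d 1                       # review target domain 1's configuration
XSCF> moveboard -c configure -d 1 00-0   # DR board 00-0 into domain 1
XSCF> showdomainstatus -a                # confirm domain state afterwards
```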
Copyright 2009 Peter Baer Galvin - All Rights Reserved 290



xVM Virtualbox

Copyright 2009 Peter Baer Galvin - All Rights Reserved 291



Overview
Sun has a suite of xVM products
xVM ops center - patching x86 and
SPARC (Linux too) plus provisioning
xVM virtualbox - desktop virtualization
x86
xVM server (aka Xen) - hypervisor-like
virtualization for x86

Copyright 2009 Peter Baer Galvin - All Rights Reserved 292



Virtualbox
Open source (GPL) virtualization environment
for x86 (and closed source commercial version)
(Sun bought the independent developer)
Completes Sun’s virtualization picture by
adding desktop / workstation
virtualization tool
Competes with VMWare workstation,
Parallels, Fusion

Copyright 2009 Peter Baer Galvin - All Rights Reserved 293



Platform Support
Runs on Windows, Linux, MacOS X, and
OpenSolaris
Guest support is extensive, including
Windows (NT 4.0, 98, 2000, XP, Server
2003, Vista), DOS/Windows 3.x, Linux (2.4
and 2.6), OpenBSD, Solaris, OpenSolaris
Full list at http://www.virtualbox.org/wiki/Guest_OSes

Copyright 2009 Peter Baer Galvin - All Rights Reserved 294



Features
Modular design
Active community
VM descriptions in XML
Guest tools to add functionality to some guests
Shared folders
Multiple snapshots of VM states
Supports VT-x and AMD-V (enable per-VM)
Seamless windows on Windows guests, Linux, Solaris
Import of guest VMs in VMDK format
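Many of these features are exposed through the VBoxManage CLI; creating a VM with hardware virtualization enabled and taking a snapshot might look like this (the VM name and settings are illustrative, and exact flags vary between VirtualBox versions, so treat this as a sketch rather than exact syntax):

```shell
# Create and register a VM, enable VT-x/AMD-V per-VM, take a snapshot.
# Name, memory size, and flag spellings are illustrative.
VBoxManage createvm --name "sol10-test" --register
VBoxManage modifyvm "sol10-test" --memory 1024 --hwvirtex on
VBoxManage snapshot "sol10-test" take "clean-install"
```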
Copyright 2009 Peter Baer Galvin - All Rights Reserved 295



Closed-source Features
Virtual USB controller
Remote Desktop Protocol (RDP) server support
Can connect to Virtualbox client from other
systems, thin clients
USB over RDP works - guest can access local
resources while displaying remotely
iSCSI initiator (can use iSCSI targets as virtual disks)
SATA controller (faster and less overhead than IDE)

Copyright 2009 Peter Baer Galvin - All Rights Reserved


296



XVM Server

Copyright 2009 Peter Baer Galvin - All Rights Reserved 297



xVM Server
Solaris-based bare-metal hypervisor based on
Xen
Complete vm management
Goal is to be similar to VMWare ESX
Brand-new
Server itself is open source, is free to try
xVM Infrastructure Enterprise - multinode
management of VMs
xVM Infrastructure Datacenter - multinode
management of physical servers and physical and
virtual nodes
Copyright 2009 Peter Baer Galvin - All Rights Reserved 298



Features
MS 2003, 2008, RedHat 4.6 / 5.2, Solaris and OpenSolaris guests

Live migration

Guest cloning / templating

xVM Ops Center integration

Java-based KVM access to guest OS consoles

Management is browser-based

VMDK-formatted guest OSes supported

Paravirtualized device drivers

NAS / CIFS storage support

Least privilege security model of services, management

DTrace integration (just how much?)

ZFS supported (guest OS file systems)

Copyright 2009 Peter Baer Galvin - All Rights Reserved 299



Implementation

TBD

Copyright 2009 Peter Baer Galvin - All Rights Reserved 300



References
You Are Now Free to Move About
Solaris

Copyright 2009 Peter Baer Galvin - All Rights Reserved 301



References
 [Kozierok] Kozierok, The TCP/IP Guide, No Starch Press, 2005
 [Nemeth] Nemeth et al, Unix System Administration Handbook, 3rd edition, Prentice Hall, 2001
 [SunFlash] The SunFlash announcement mailing list run by John J. Mclaughlin. News and a whole lot more. Mail sunflash-info@sun.com
 Sun online documents at docs.sun.com
 [Kasper] Kasper and McClellan, Automating Solaris Installations, SunSoft Press, 1995

Copyright 2009 Peter Baer Galvin - All Rights Reserved 302



References (continued)

 [O’Reilly] The Networking CD Bookshelf, Version 2.0, O’Reilly, 2002
 [McDougall] Richard McDougall et al, Resource Management, Prentice Hall, 1999 (and other "Blueprint" books)
 [Stern] Stern, Eisler, Labiaga, Managing NFS and NIS, 2nd Edition, O’Reilly and Associates, 2001

Copyright 2009 Peter Baer Galvin - All Rights Reserved 303



References (continued)
 [Garfinkel and Spafford] Simson Garfinkel and Gene Spafford, Practical Unix & Internet Security, 3rd Ed, O’Reilly & Associates, Inc, 2003 (best overall Unix security book)
 [McDougall, Mauro, Gregg] McDougall, Mauro, and Gregg, Solaris Internals and Solaris Performance and Tools, 2007 (great Solaris internals, DTrace, mdb books)

Copyright 2009 Peter Baer Galvin - All Rights Reserved 304



References (continued)
 Subscribe to the Firewalls mailing list by sending "subscribe firewalls <mailing-address>" to Majordomo@GreatCircle.COM
 USENIX membership and conferences. Contact the USENIX office at (714) 588-8649 or office@usenix.org
 Sun Support: Sun’s technical bulletins, plus access to the bug database: sunsolve.sun.com
 Solaris 2 FAQ by Casper Dik: ftp://rtfm.mit.edu/pub/usenet-by-group/comp.answers/Solaris2/FAQ

Copyright 2009 Peter Baer Galvin - All Rights Reserved 305



References (continued)
 Sun Managers Mailing List FAQ by John DiMarco: ftp://ra.mcs.anl.gov/sun-managers/faq
 Sun's unsupported tool site (IPv6, printing): http://playground.sun.com/
 Sunsolve STBs and Infodocs: http://www.sunsolve.com

Copyright 2009 Peter Baer Galvin - All Rights Reserved 306



References (continued)
 comp.sys.sun.* FAQ by Rob Montjoy: ftp://rtfm.mit.edu/pub/usenet-by-group/comp.answers/comp-sys-sun-faq
 “Cache File System” White Paper from Sun: http://www.sun.com/sunsoft/Products/Solaris-whitepapers/Solaris-whitepapers.html
 “File System Organization, The Art of Automounting” by Sun: ftp://sunsite.unc.edu/pub/sun-info/white-papers/TheArtofAutomounting-1.4.ps
 Solaris 2 Security FAQ by Peter Baer Galvin: http://www.sunworld.com/common/security-faq.html
 Secure Unix Programming FAQ by Peter Baer Galvin: http://www.sunworld.com/swol-08-1998/swol-08-security.html

Copyright 2009 Peter Baer Galvin - All Rights Reserved 307



References (continued)
 Firewalls mailing list FAQ: ftp://rtfm.mit.edu/pub/usenet-by-group/comp.answers/firewalls-faq
 There are a few Solaris-helping files available via anonymous ftp at ftp://ftp.cs.toronto.edu/pub/darwin/solaris2
 Peter’s Solaris Corner at SysAdmin Magazine: http://www.samag.com/solaris
 Marcus and Stern, Blueprints for High Availability, Wiley, 2000
 Privilege Bracketing in Solaris 10: http://www.sun.com/blueprints/0406/819-6320.pdf

Copyright 2009 Peter Baer Galvin - All Rights Reserved 308



References (continued)

 Peter Baer Galvin's Sysadmin Column (and old Pete's Wicked World security columns, etc): http://www.galvin.info
 My blog at http://pbgalvin.wordpress.com
 Operating Environments: Solaris 8 Operating Environment Installation and Boot Disk Layout by Richard Elling: http://www.sun.com/blueprints (March 2000)
 Sun’s BigAdmin web site, including Solaris and Solaris x86 tools and information: http://www.sun.com/bigadmin

Copyright 2009 Peter Baer Galvin - All Rights Reserved 309



References (continued)

DTrace
http://users.tpg.com.au/adsln4yb/dtrace.html
http://www.solarisinternals.com/si/dtrace/index.php
http://www.sun.com/bigadmin/content/dtrace/

Copyright 2009 Peter Baer Galvin - All Rights Reserved 310

