
Virtualization

System Virtualization and OS Virtual Machines

Ivan Boule

Plan

History
Virtualization Usages
Virtualization Taxonomy
Process-Level Virtualization
Transparent Hardware Emulation
Transparent Hardware Virtualization
Paravirtualization
Hardware-Assisted Virtualization
Virtualization and Embedded Systems
Evolution of Virtualization

History of Virtual Machines
VM introduced in the sixties on the IBM System/360 (CP-67), then carried over to the System/370 series (VM/370)
Co-designed VM: IBM AS/400
High-level ISA including I/Os
Proprietary CISC → PowerPC

Application VMs
Sun Java, Microsoft Common Language Infrastructure

OS VMs
VMware (Windows/Linux on Intel)
Connectix (Windows/PC emulation on MacOS)


Goals of System Virtualization
Reduction of Total Cost of Ownership (TCO)
Increase utilisation of server resources

Reduction of Total Cost of Functioning
Energy consumption
Cooling
Occupied space

Hardware consolidation
Reduction of Bill of Materials (BOM) for high-volume low-end products

Virtualization in high-throughput network equipment

Virtualization in Multimedia devices
Reduction of Bill of Materials (BOM) for high-volume low-end products
No need for a General Purpose Processor
~20 to 25% BOM reduction
Run Linux together with the OS supporting codecs on a single TI DSP
Leverage the Linux environment
Reuse existing DSP software

Usages of Virtual Machines
Server virtualization
Web sites hosting
OS fault recovery

OS kernel development
Test machine = development host

OS/kernel education & training
Keep backward compatibility of legacy software
Hardware no longer available

Run applications not supported by the host OS

Recovery Servers

Multi-Core CPU Issues (1)
CPU power gain
No longer achieved through frequency/speed increase
But obtained with higher density & multi-core chips

Many RTOSes designed with a mono-processor assumption
Adding multi-processor support is complex & costly
Scaling requires time, at best...

Legacy RT applications also designed for mono-processor
Adaptation to multi-processor even more difficult than for the RTOS

Multi-Core CPU Issues (2)
OS virtualization allows multiple [instances of] mono-processor OSes to run simultaneously on a multi-core CPU
Each OS instance runs in a [mono-processor] Virtual Machine assigned to a single CPU core
No need to change legacy software
Scalability managed at virtualization level


System Virtualization Principles
Run multiple OSes on the same machine
By design, an OS assumes it has full control over all physical resources of the machine
Manage sharing/partitioning of machine resources between Guest OSes
CPU
Physical memory & MMU
I/O devices

Machine Interfaces
[Figure: application / OS / hardware layer stacks showing the ISA interface (User ISA + System ISA) and the ABI interface (User ISA + System Calls)]

ISA = Instruction Set Architecture
System-level interface
All CPU instructions, memory architecture, I/O

ABI = Application Binary Interface
Process-level interface
User-level non-privileged ISA instructions + OS system calls

Virtualization Taxonomy
Process-level virtualization
Emulation of Operating System ABI
Virtual Servers

System-level virtualization
Standalone / Hosted Virtualization
Machine Emulation / Machine Virtualization

Hosted versus Standalone Virtualization

Hosted Virtualization
Hosted VM Monitor (VMM) runs on top of a native OS
VMware Workstation, Microsoft Virtual PC, QEMU, UML

Standalone Virtualization
VMM runs directly on bare hardware
VMware ESX, IBM/VM, Xen, VLX, KVM

An OS run in a VM is called a Guest OS

Hosted Virtualization
[Figure: three VMs, each running applications on a Guest OS on top of its own VMM; all VMMs run on one native OS above the hardware]

Example: VMware Workstation
Hosted VM
Unmodified OSes
Specific device drivers
x86 only
Guest OS executed in user mode

Standalone Virtualization
[Figure: three VMs, each running applications on a Guest OS, on top of a single VMM that runs directly on the hardware]

VMware ESX
Standalone VM
Supports unmodified OS binaries
Configuration with appropriate device drivers

x86 only
OS hosted in user mode


Process-Level ABI Emulation
Goal: execute binary applications of a given system X on the ABI of another system Y
Emulate system X ABI on top of system Y ABI
Emulation done by application-level code

System Y must provide services equivalent to those of system X (file system, sockets, etc.)

Process-Level (ABI) Emulators
Wine - Windows Emulator on Unix/Linux
Windows API in userland
Adobe Photoshop, Google Picasa, ...

Cygwin
Unix emulation on Windows
POSIX library
Bash shell + many Unix commands
GNU development tool chain (gcc, gdb)
X Window, GNOME, Apache, sshd, ...

Virtual Servers
Single OS kernel / Multiple resource instances
Isolated kernel execution environments
Root file system
IP tables
Process for signals

Solaris 10 Containers
Linux VServer
FreeBSD Jail

Virtual Servers
[Figure: processes P1-P9 grouped into three virtual servers, each with its own IP network (10.16.0.0/16, 10.17.0.0/16, 10.18.0.0/16) and root file system (/roots/vm1, /roots/vm2, /roots/vm3), next to the 74.125.0.0/16 network, all sharing the same kernel code]

Virtual Servers
Pros
CPU independent
Lightweight
Low memory footprint
Low CPU overhead
Scalable

Cons
No OS heterogeneity (no GPOS/RTOS combination)
Single OS binary instance (common point of failure)
Intrusive: must modify OS & follow OS updates


Transparent Hardware Emulation
Run unmodified OS binaries

Includes emulation of physical devices
Cross-ISA emulation
QEMU

Same-ISA emulation
VirtualBox (Intel x86)

Transparent Hardware Emulation
Emulate machine X on top of machine Y
Interpretation
1 instruction of X executed by N instructions of Y
Huge slowdown (1/1000 if X ≠ Y)

Dynamic Binary Translation
Convert blocks of X instructions into Y instructions

Application-level emulator runs on a native OS
One VM running a single Guest OS
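
The difference between interpretation and binary translation can be sketched on a toy guest instruction set. Everything below is hypothetical (the three-instruction ISA, the register names, the use of generated Python source as "host code"); it only illustrates that an interpreter decodes each instruction every time it runs, while a translator converts a block once and reuses the cached result.

```python
# Toy guest ISA (hypothetical): each instruction is a tuple.
PROGRAM = [
    ("li", "r0", 0),      # r0 = 0
    ("li", "r1", 5),      # r1 = 5
    ("add", "r0", "r1"),  # r0 += r1
    ("add", "r0", "r1"),  # r0 += r1
    ("halt",),
]


def interpret(program):
    """Interpretation: decode and dispatch every instruction at run time."""
    regs = {"r0": 0, "r1": 0}
    for insn in program:
        op = insn[0]
        if op == "li":
            regs[insn[1]] = insn[2]
        elif op == "add":
            regs[insn[1]] += regs[insn[2]]
        elif op == "halt":
            break
    return regs


translation_cache = {}


def translate_block(program):
    """Translation: convert the whole block once into host code and cache it."""
    key = tuple(program)
    if key in translation_cache:
        return translation_cache[key]
    lines = ["def block(regs):"]
    for insn in program:
        op = insn[0]
        if op == "li":
            lines.append(f"    regs[{insn[1]!r}] = {insn[2]!r}")
        elif op == "add":
            lines.append(f"    regs[{insn[1]!r}] += regs[{insn[2]!r}]")
        elif op == "halt":
            break
    lines.append("    return regs")
    namespace = {}
    exec("\n".join(lines), namespace)  # "host code" is generated Python here
    translation_cache[key] = namespace["block"]
    return namespace["block"]
```

Calling `translate_block(PROGRAM)` twice returns the same cached function, which is the role the translation cache plays in a real dynamic binary translator.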

QEMU Architecture
[Figure: three QEMU processes hosted on Solaris (native OS, Sun SPARC) through the Solaris ABI: one QEMU PowerPC instance exposing the PowerPC ISA to an RTOS with real-time applications, and two QEMU PC x86 instances exposing the x86 ISA to Linux and Windows with their applications]

QEMU: Hosted Hardware Emulator
Cross-ISA emulation
Emulate machine X on top of machine Y

Interpretation + translation when X ≠ Y
Intel x86, PowerPC, ARM, SPARC architectures
Emulation of SMP architectures
Emulates physical I/O devices
Hard disk drives, CD-ROM, network controllers, USB controllers
Synchronous emulation of device I/O operations


Transparent Hardware Virtualization
Share machine resources among multiple VMs
Execute native/unmodified OS binary images
Provide in each VM a complete simulation of hardware
Full CPU instruction set
Interrupts, exceptions
Memory access and MMU
I/O devices

Full CPU Virtualization
Present the same functional CPU to all Guest OSes
VMM manages a CPU context for each VM
saved copy of CPU registers
representation of software-emulated CPU context

VMM shares physical CPUs among all VMs
VMM includes a VM scheduler
round-robin
priority-based
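
A round-robin VM scheduler with a saved CPU context per VM can be sketched as follows. This is a minimal model under stated assumptions: the class names, the single `pc` register and the "advance `pc` by the slice length" stand-in for guest execution are all invented for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class VirtualMachine:
    name: str
    # Saved copy of the virtual CPU registers (only a program counter here).
    context: dict = field(default_factory=lambda: {"pc": 0})


class RoundRobinVMM:
    def __init__(self, vms):
        self.run_queue = deque(vms)

    def run_slice(self, steps):
        """Run the next VM for one time slice, then put it back in the queue."""
        vm = self.run_queue.popleft()
        cpu = dict(vm.context)   # restore the saved context onto the "CPU"
        cpu["pc"] += steps       # stand-in for executing guest instructions
        vm.context = cpu         # save the context when the slice ends
        self.run_queue.append(vm)
        return vm.name
```

A priority-based scheduler would replace the `deque` with a priority queue; the context save/restore around each slice stays the same.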

Full CPU Virtualization
Relationships between a VMM and VMs similar to relationships between a native OS and applications
Guarantee mutual isolation between all VMs
Protect VMM from all VMs

Directly execute native binary images of Guest OSes in non-privileged mode
VMM emulates access to protected resources performed by Guest OSes

CPU Virtualization
Run each Guest OS in non-privileged mode
Ex: ring compression on Intel x86
[Figure: applications in ring 3, Guest OSes of VM0, VM1 and VM2 in ring 1, Virtual Machine Monitor (VMM) in ring 0]

Hardware-Sensitive Instructions
Interact with protected hardware resources
Privileged instructions
Critical instructions

Cannot be directly executed by Guest OSes
Must be detected and faked by VMM
Dynamic Binary Translation of kernel code
Done once, saved in Translation Cache
Example: VMware

Privileged Instructions Virtualization
Only allowed in supervisor mode
Ex: cli/sti to mask/unmask interrupts on Intel x86

When executed in non-privileged mode
CPU automatically detects a privilege violation
Triggers a privilege violation exception

Caught by VMM, which fakes the expected effect of the privileged instruction
Ex: cli/sti
VMM does not mask/unmask CPU interrupts
records interrupt mask status in context of VM
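
The trap-and-emulate handling of cli/sti described above can be sketched as a small model. The exception class, the `cpu_execute` function and the `virtual_if` field are illustrative names, not part of any real VMM; the point is that the real interrupt mask is never touched and only the per-VM context records what the guest asked for.

```python
class PrivilegeViolation(Exception):
    """Raised by the modelled CPU when a privileged instruction runs unprivileged."""

    def __init__(self, insn):
        super().__init__(insn)
        self.insn = insn


PRIVILEGED = {"cli", "sti"}


def cpu_execute(insn, privileged_mode):
    # The CPU itself detects the violation; other instructions just run.
    if insn in PRIVILEGED and not privileged_mode:
        raise PrivilegeViolation(insn)


class VMContext:
    def __init__(self):
        self.virtual_if = True  # the guest's view of the interrupt flag


def vmm_run_guest(vm, instructions):
    for insn in instructions:
        try:
            cpu_execute(insn, privileged_mode=False)
        except PrivilegeViolation as fault:
            # Fake the expected effect in the VM context only.
            if fault.insn == "cli":
                vm.virtual_if = False
            elif fault.insn == "sti":
                vm.virtual_if = True
```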

Critical Instructions Virtualization (1)
Hardware-sensitive instructions
Ex: Intel IA-32 pushf/popf
pushf /* save EFLAGS reg. to stack */
cli /* mask interrupts => clear EFLAGS.IF */

popf /* restore EFLAGS reg. => unmask interrupts */

When executed in non-privileged mode
The cli instruction triggers an exception caught by VMM
=> VMM records interrupts as masked for the current VM
But no exception for popf => VMM not aware of the Guest OS action (unmask interrupts)

Critical Instructions Virtualization (2)
Must be detected and emulated by VMM
VMM dynamically analyses Guest OS binary code to find critical instructions
VMM replaces critical instructions by a trap instruction to enter the VMM
VMM emulates expected effect of critical instruction, if any.
Ex: pushf/popf combined with cli/sti instructions
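
The pushf/cli/popf problem and its fix can be modelled in a few lines. This sketch assumes a simplified machine in which an unprivileged `pushf` always pushes an interrupt flag of 1 and an unprivileged `popf` silently drops any change to that flag; the instruction strings, the `("trap", ...)` encoding and the `virtual_if` key are invented for illustration.

```python
CRITICAL = {"pushf", "popf"}


def patch_critical(code):
    """Scan guest code and replace each critical instruction by a trap."""
    return [("trap", insn) if insn in CRITICAL else insn for insn in code]


def run(code, vm):
    """Run guest code in non-privileged mode on the modelled CPU."""
    stack = []
    for insn in code:
        if insn == "cli":
            vm["virtual_if"] = False         # privileged: traps, VMM records it
        elif insn == "pushf":
            stack.append(True)               # real flags pushed, no trap
        elif insn == "popf":
            stack.pop()                      # IF change dropped, VMM unaware
        elif insn == ("trap", "pushf"):
            stack.append(vm["virtual_if"])   # VMM pushes the guest's view
        elif insn == ("trap", "popf"):
            vm["virtual_if"] = stack.pop()   # VMM restores the guest's view
    return vm


GUEST = ["pushf", "cli", "popf"]
```

Running `GUEST` unpatched leaves the VM recorded as "interrupts masked" even though the guest believes it has unmasked them; running `patch_critical(GUEST)` restores the flag correctly.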

Full Memory Virtualization
CPU includes a Memory Management Unit (MMU)
Isolated memory addressing spaces
Independent of underlying physical memory layout
Run mutually protected applications in parallel

Virtual memory managed by OS kernel
Provides a virtual address space to each process
4 GB on most 32-bit architectures (Intel x86, PowerPC)

Manages virtual page → physical frame mappings
Manages swap space to extend physical memory

MMU & Virtual Address Space
[Figure: pages of 4 GB virtual address spaces translated by the MMU, through page table entries (pte) cached in the Translation Lookaside Buffer, to physical memory frames 0 to N]

Intel x86 MMU
[Figure: two-level page table walk]
32-bit virtual address = 10-bit Directory Index (bits 31-22) + 10-bit Page Table Index (bits 21-12) + 12-bit Page Offset (bits 11-0)
CR3 holds the Directory Address
The Directory Page and each Page Table hold 1024 entries (0 to 1023) of one 32-bit word each
Each entry carries cr/st bits (cr/st = control & status)
A Page Table Entry (PTE) points to a 4 KB page of physical memory
Translation Lookaside Buffer (TLB) = cache for PTEs
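
The 10/10/12-bit split and the two-level walk can be written out as a short sketch. The `memory` dictionary (physical frame address to a list of 1024 entries) and the use of bit 0 as the only control/status bit are simplifying assumptions made for this example.

```python
PRESENT = 0x1  # bit 0 of an entry: the "present" control bit


def split_virtual_address(va):
    """Split a 32-bit virtual address into its three fields."""
    directory_index = (va >> 22) & 0x3FF  # bits 31-22
    table_index = (va >> 12) & 0x3FF      # bits 21-12
    offset = va & 0xFFF                   # bits 11-0
    return directory_index, table_index, offset


def translate(va, memory, cr3):
    """Walk the directory page, then the page table, like the MMU would."""
    directory_index, table_index, offset = split_virtual_address(va)
    pde = memory[cr3][directory_index]
    if not pde & PRESENT:
        raise LookupError("page directory entry not present")
    pte = memory[pde & ~0xFFF][table_index]
    if not pte & PRESENT:
        raise LookupError("page not present")
    return (pte & ~0xFFF) | offset
```

A TLB would simply cache the `(directory_index, table_index) -> pte` result so that the two memory reads are skipped on a hit.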

Memory Virtualization
Machine Physical Memory
Physical memory available on the machine

Guest OS Physical Memory
Part of machine memory assigned to a VM by VMM
Guest Physical Memory can be > Machine Memory
VMM uses swap space

Guest OS Virtual Memory
Guest OS manages virtual address spaces of its processes

Memory Virtualization
Guest OS manages Guest Physical Pages
Manages MMU with its own page entries
Translates Virtual Addresses into Guest Physical Addresses (GPA)

VMM transparently manages Machine Physical Pages
Guest Physical Address → Machine Physical Address
VMM dynamically translates Guest Physical Pages into Machine Physical Pages

Memory Virtualization
[Figure: virtual pages of processes P1.1 and P1.2 (VM1) and P2.1 (VM2) mapped to pages of each VM's Guest Physical Memory, themselves mapped to machine memory pages. Legend: mapped / unmapped virtual page, mapped / unmapped Guest page, machine physical page]

Memory Virtualization
VMM maintains Shadow Page Tables
Copies of Guest OS translation tables

VMM catches update operations on translation tables performed by a Guest OS
Write-protects all Guest OS page tables
Emulates the operation in the shadow page table
Updates the effective MMU page table entry, if needed
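
Shadow page table maintenance can be sketched as below. The model works at page granularity with plain dictionaries, and `on_guest_pte_write` stands for the handler the VMM would run on the write-protection fault; all names are hypothetical.

```python
class ShadowPaging:
    def __init__(self, gpa_to_mpa):
        self.gpa_to_mpa = gpa_to_mpa  # VMM-owned: guest physical page -> machine page
        self.guest_table = {}         # guest-owned: virtual page -> guest physical page
        self.shadow_table = {}        # used by the real MMU: virtual page -> machine page

    def on_guest_pte_write(self, vpage, gpage):
        """Handle the fault raised when the guest updates a write-protected page table."""
        self.guest_table[vpage] = gpage              # emulate the guest's own write
        if gpage is None:                            # guest unmapped the page
            self.shadow_table.pop(vpage, None)
        else:                                        # mirror it with the machine page
            self.shadow_table[vpage] = self.gpa_to_mpa[gpage]
```

The guest only ever sees `guest_table`; the hardware only ever walks `shadow_table`, which is why every guest update has to be intercepted.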

Memory Virtualization
PTE entries can be tagged with a context ID
Avoids flushing the TLB when switching the current address space upon scheduling of a new process
usually PTE tag = OS process identifier

Processes of different Guest OSes can be assigned the same Process ID
=> VMM must flush TLB when switching VMs

Memory Virtualization
VMM must respect Guest OS virtual page faults
Not map virtual pages unmapped by Guest OS
When Guest OS unmaps a virtual page:
VMM must delete the associated real page/physical page mapping, if any.

Conversely, VMM can transparently:
Introduce & resolve real page faults for Guest OSes
Share physical pages between Guest OSes
Pages with the same content (e.g. zeroed pages)

Memory Virtualization
VMM can swap real pages of a VM to swap space managed by VMM

VMM can dynamically distribute physical memory among VMs
Needs specific support in Guest OS (Linux module)
VMM asks Guest OS to release memory
Guest OS self-allocates [real] pages
=> no longer available for the normal [kernel] allocation service
VMM assigns the same amount of physical pages to other VMs
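
The memory redistribution steps above (often called ballooning) can be sketched as follows. This is a toy model: page numbers stand for machine pages, the `GuestOS` class and `rebalance` function are invented names, and a real VMM would remap the released pages rather than hand over a Python list.

```python
class GuestOS:
    def __init__(self, pages):
        self.free_pages = list(pages)  # pages the kernel allocator can hand out
        self.balloon = []              # pages pinned by the balloon module

    def inflate_balloon(self, count):
        """Self-allocate pages so the kernel can no longer use them."""
        count = min(count, len(self.free_pages))
        taken = [self.free_pages.pop() for _ in range(count)]
        self.balloon.extend(taken)
        return taken

    def receive_pages(self, pages):
        self.free_pages.extend(pages)


def rebalance(donor, receiver, count):
    """VMM side: ask one guest to release memory, give the same amount to another."""
    released = donor.inflate_balloon(count)
    receiver.receive_pages(released)
    return len(released)
```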


Paravirtualization
OS adaptation to avoid binary translation overhead
Requires access to OS source code
Includes drivers of virtual devices
Examples
Xen
User Mode Linux (UML)

Paravirtualization Principles
Still run each Guest OS in non-privileged mode
But with minimal virtualization overhead
=> Modified Guest OS kernel
Remove hardware-sensitive instructions
Use fast VMM system calls instead, if needed

Minimise usage of privileged instructions

Only affects the machine/CPU-dependent part of the OS
OS port to a new architecture with the same CPU
Without system ISA

Paravirtualization (2)
Guest OS only uses Virtual I/O Devices
Front-end driver in Guest OS
Back-end driver in VMM

Data transfer through asynchronous I/O rings
Avoids the extra I/O data copies of Full Virtualization
VMM multiplexes VM Virtual Devices on physical devices
Virtual Ethernet
Virtual Disks
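
An asynchronous I/O ring shared by a front-end and a back-end driver can be sketched as a fixed-size producer/consumer ring. This is a minimal single-threaded model with invented names; real rings live in shared memory and add notifications and memory barriers, which are left out here.

```python
class IORing:
    """Fixed-size ring: the front-end pushes requests, the back-end pops them."""

    def __init__(self, size):
        self.slots = [None] * size
        self.size = size
        self.producer = 0  # advanced by the front-end (Guest OS)
        self.consumer = 0  # advanced by the back-end (VMM)

    def push(self, request):
        if self.producer - self.consumer == self.size:
            return False  # ring full: the front-end must retry later
        self.slots[self.producer % self.size] = request
        self.producer += 1
        return True

    def pop(self):
        if self.consumer == self.producer:
            return None  # ring empty
        request = self.slots[self.consumer % self.size]
        self.consumer += 1
        return request
```

Because each side only writes its own index, requests can describe buffers by reference, which is how the extra data copies are avoided.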

Virtual I/O Devices
[Figure: each Guest OS holds front-end (fe) Vdisk and Veth drivers; the VMM holds the matching back-end (be) Vdisk and Veth drivers, connected through net bridging to the NIC driver (Ethernet NIC) and to the disk driver]

Paravirtualization Example: Xen
Objectives
Support more than 100 VMs
Share resources of server machines

Intel IA-32, x86-64 and ARM architectures
Special first Guest OS called Domain 0
Runs in privileged mode
Has access to (and manages) all physical devices
Modified version of Linux, FreeBSD


Hardware-Assisted Virtualization
Support of Virtualization in Hardware
Run unmodified OS binaries
With minimal virtualization overhead
Simplify VMM development
Examples
KVM (Intel-VT, AMD-V)
VMware (Intel-VT)

Hardware-Assisted Virtualization
CPU virtualization
AMD-V
Intel VT-x (x86), Intel VT-i (Itanium) architectures
ARM Cortex-A15

MMU virtualization
Intel Extended Page Tables (EPT)
AMD Nested Page Tables (NPT)

Hardware-Assisted Virtualization
DMA virtualization
IOMMU

I/O device virtualization
Self-virtualizing devices
Single Root I/O Virtualization and Sharing Specification (SR-IOV)
Extensions to the PCIe (PCI Express) bus standard

Intel VT-x Architecture
Support unmodified Guest OS with no need for paravirtualization and/or binary code translation
Simplify VMM tasks & improve VMM performance
Minimize VMM memory footprint
Suppress shadowing of Guest OS page tables
Enable Guest OS to directly manage I/O devices
Without performance loss
While enforcing VM isolation and mutual protection

Intel VT-x Architecture Overview
[Figure: VM1, VM2 and VM3 run in VMX non-root mode, each with applications in ring 3 and a Guest OS kernel in ring 0; the VMM runs in VMX root mode (rings 0-3) on Intel VT hardware; VM Enter and VM Exit switch between the two modes]

Intel VT-x CPU Virtualization
Virtual Machine eXtension (VMX)
Two new meta-modes of CPU operation
VMX root mode
Behaviour similar to IA-32 without VT
Intended for VMM execution

VMX non-root mode
Alternative IA-32 execution environment
Controlled by a VMM
Designed to run unchanged Guest OS in a VM
Both modes support rings 0-3 privilege levels
Allow VMM to use several privilege levels

Intel VT-x CPU Virtualization
Two additional CPU mode transitions
From VMX root mode to VMX non-root mode
Named VM Enter

From VMX non-root mode to VMX root mode
Named VM Exit

VM entries & VM exits use a new data structure
Virtual Machine Control Structure (VMCS) per VM
Referenced with a memory physical address
Format and layout hidden
New VT-x instructions to access a VMCS
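
The VM Enter / VM Exit cycle can be pictured as the main loop of a VMM. The sketch below is a toy model, not the real VMX interface: the `VMCS` class holds only a guest instruction pointer and an exit reason, and the set of exiting instructions (`cpuid`, `hlt`) is an assumption chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class VMCS:
    # Per-VM control structure, reduced to two illustrative fields.
    guest_state: dict = field(default_factory=lambda: {"rip": 0})
    exit_reason: str = ""


EXITING = {"cpuid", "hlt"}  # instructions assumed to force a VM Exit


def vm_enter(vmcs, guest_code):
    """Non-root execution: run guest code until something forces a VM Exit."""
    while True:
        insn = guest_code[vmcs.guest_state["rip"]]
        vmcs.guest_state["rip"] += 1
        if insn in EXITING:
            vmcs.exit_reason = insn  # guest state stays saved in the VMCS
            return


def vmm_loop(vmcs, guest_code):
    """Root-mode side: enter the VM, handle each exit, re-enter."""
    handled = []
    while True:
        vm_enter(vmcs, guest_code)
        handled.append(vmcs.exit_reason)
        if vmcs.exit_reason == "hlt":
            return handled
```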

Intel VT-x CPU Virtualization
Guest State Area
Saved value of registers loaded by VM Exits (e.g., Segment Registers, CR3, IDTR)
Hidden CPU state (e.g., CPU Interruptibility State)

Host State Area

VM Control Fields
Interrupt Virtualization
Exceptions bitmaps
I/O bitmaps
Model Specific Register R/W bitmaps
Execution rights for CPU Privileged Instructions

Intel VT-x Interrupt Virtualization
VMCS External Interrupt Exiting
All external interrupts cause VM Exits
Guest OS cannot mask external interrupts when executing Interrupt Masking instructions

VMCS Interrupt Window Exiting
VM Exit occurs whenever Guest OS is ready to serve external interrupts

Used by VMM to control VM interrupts

Intel VT-x MMU Virtualization
Extended Page Tables (EPT)
Second level of Page Tables in MMU
Translate Guest OS Physical Address into Machine Physical Address
Controlled by VMM

Virtual Processor IDentifier (VPID)
Used to tag TLB entries
Avoids flushing the TLB upon VM switch

Virtual Memory Virtualization
[Figure: Guest OS Virtual Memory of Process 1 and Process 2 in VM1 and VM2, mapped to each Guest OS Physical Memory, itself mapped to Machine Physical Memory]

Intel VT-x Extended Page Tables
VMM controls Extended Page Tables
EPT used in VMX non-root operation
Activated on VM Enter
Deactivated on VM Exit

EPTP register points to Extended Page Tables
Instantiated by VMM
Saved in VMCS
Loaded from VMCS on VM entry

Intel VT-x Extended Page Tables
[Figure: Guest CR3 and the Guest Page Table translate a Guest VA into a Guest PA; the EPT Base Pointer and the EPT Page Table translate the Guest PA into a Machine PA]

Translation Lookaside Buffer (TLB)
Guest PTEs (Guest VA → Guest PA)
Extended PTEs (Guest PA → Host PA)
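
The two translation stages can be sketched as two chained table lookups. The model below is a simplification: each table is a flat page-number dictionary instead of a multi-level tree, and it ignores the fact that, on real hardware, the addresses of the guest page tables themselves also go through the EPT. Function and variable names are illustrative.

```python
PAGE_SHIFT = 12
PAGE_MASK = (1 << PAGE_SHIFT) - 1


def walk(table, address):
    """Translate one address through a flat page-number table."""
    page = address >> PAGE_SHIFT
    if page not in table:
        raise LookupError(f"page {page:#x} not mapped")
    return (table[page] << PAGE_SHIFT) | (address & PAGE_MASK)


def guest_va_to_machine_pa(guest_va, guest_page_table, ept):
    guest_pa = walk(guest_page_table, guest_va)  # stage 1: owned by the Guest OS
    return walk(ept, guest_pa)                   # stage 2: owned by the VMM
```

Unlike shadow page tables, the VMM does not need to intercept guest page table updates here: the guest edits stage 1 freely and the VMM only maintains stage 2.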

TLB Flush Issue
[Figure: 4 GB virtual address spaces translated by the MMU, through page table entries (pte) cached in the Translation Lookaside Buffer, to physical memory frames 0 to N]

Intel VT-x Virtual Processor Identifier
16-bit VPID used to tag TLB entries
Enabled by VMM in VMCS
Unique VPID is assigned by VMM to each VM
VPID 0 reserved for VMM

Current VPID is 0x0000 when
Outside VMX operation
In VMX root mode operation
In VMX non-root mode if VPID disabled in VMCS
VPID loaded from VMCS on VM Enter
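
Tagging TLB entries with a VPID can be illustrated with a dictionary keyed by `(vpid, virtual page)`. This is only a model of the idea with invented names; it shows why two VMs that use the same virtual page no longer collide, so no flush is needed on a VM switch.

```python
class TaggedTLB:
    def __init__(self):
        self.entries = {}  # (vpid, virtual page) -> machine page

    def insert(self, vpid, vpage, mpage):
        self.entries[(vpid, vpage)] = mpage

    def lookup(self, vpid, vpage):
        """Return the cached machine page, or None on a TLB miss."""
        return self.entries.get((vpid, vpage))

    def flush_vpid(self, vpid):
        """Drop only the entries of one VM, e.g. when it is destroyed."""
        self.entries = {k: v for k, v in self.entries.items() if k[0] != vpid}
```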

DMA Virtualization
Enable Guest OS to manage I/O devices
I/O devices assigned by VMM to Guest OSes

Transparent mode
Use native device driver of Guest OS
Unaware of physical memory virtualization

Enforce isolation between Guest OSes
Guest OS only sees the hardware resources assigned by VMM (memory, devices)

DMA Principles
[Figure: CPU and memory on the system bus; a bus controller links the system bus to the I/O bus (PCI) carrying Device 1, Device 2 and Device 3; a device issues a DMA request to access memory]

DMA Virtualization
[Figure: VM1, VM2 and VM3 (applications over a Guest OS) above the VMM; Device 1 and Device 2 access Machine Physical Memory only within the isolation domain of the VM they are assigned to]

DMA Virtualization Issue
Guest OS driver sets up I/O registers of the device with the Guest Physical Address of I/O buffers
Guest Physical Address must be translated into its corresponding Machine Physical Address when used for DMA operations by the device
GPA translation cannot be done by VMM
VMM cannot catch device-specific driver operations that set up I/O buffer addresses

Intel VT-d Protection Domains
Intel VT-d provides DMA Protection Domains
Extension of IOMMU translation mechanism
Isolated context of a subset of the Machine Physical Memory (MPA)
Corresponds to the portion of Machine Physical Memory allocated to a VM

I/O devices assigned by VMM to a DMA Protection Domain
Achieves DMA isolation by restricting the memory view of I/O devices through DMA address translation

Intel VT-d DMA Translation
VT-d hardware treats the address specified in a DMA request as a DMA Virtual Address (DVA)
DVA = GPA of the VM to which the I/O device is assigned
VT-d translates the DVA into its corresponding Machine Physical Address
Support of multiple Protection Domains
DVA to MPA translation table per Protection Domain
Must identify the device issuing a DMA request

VT-d PCI Express North Bridge
[Figure: CPU on the system bus connected to the North Bridge, which integrates VT-d between memory and the PCIe root ports; Device 1, Device 2 and Device 3 sit on the PCI Express bus]

PCI DMA Requester Identification
Mapping between PCI Devices and Protection Domains
16-bit PCI DMA Requester Identifier
Bits 15-8: PCI Bus #, bits 7-3: Device #, bits 2-0: Function #

Assigned by PCI configuration software
Bus # indexes the Bus Context Table in the Root Context Table
(Device #, Function #) indexes the Device Protection Domain in the Bus Context Table
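
Decoding a 16-bit requester identifier into its bus, device and function numbers is a matter of shifts and masks. The function name below is illustrative; the field widths (8, 5 and 3 bits) follow the standard PCI bus/device/function layout.

```python
def decode_requester_id(rid):
    """Split a 16-bit PCI requester ID into (bus, device, function)."""
    bus = (rid >> 8) & 0xFF      # bits 15-8: 256 buses
    device = (rid >> 3) & 0x1F   # bits 7-3: 32 devices per bus
    function = rid & 0x7         # bits 2-0: 8 functions per device
    return bus, device, function
```

For example, `decode_requester_id(0x0119)` yields bus 1, device 3, function 1.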

Device / Protection Domain Mapping
[Figure: the Root Context Table (Bus 0 to Bus 255) points to one Context Table per bus, indexed from (Dev 0, Func 0) to (Dev 31, Func 7); each context entry points to the VDA → MPA translation tables of a Protection Domain (Protection Domain 0, Protection Domain 1)]

Virtual DMA Address Translation
VDA → MPA VT-d Page Tables similar to IA-32 processor Page Tables
4 KB or larger page size granularity
Read/Write permissions
Protection Domains managed by VMM
Initialized at VM creation time
With the same translations as the VM Extended Page Table
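
The lookup chain from a DMA request to a machine address (root context table, per-bus context table, protection domain, translation table) can be sketched as nested dictionaries. This is a model with hypothetical names and a flat page table per domain; real VT-d structures are multi-level tables in memory.

```python
PAGE_SHIFT = 12
PAGE_MASK = (1 << PAGE_SHIFT) - 1


def remap_dma(requester_id, dma_address, root_table, is_write=False):
    """Translate the address of a DMA request issued by one PCI function."""
    bus = (requester_id >> 8) & 0xFF
    devfn = requester_id & 0xFF          # (Device #, Function #) as one index
    domain = root_table[bus][devfn]      # context entry -> protection domain
    entry = domain.get(dma_address >> PAGE_SHIFT)
    if entry is None:
        raise PermissionError("DMA outside the protection domain")
    machine_page, writable = entry
    if is_write and not writable:
        raise PermissionError("DMA write to a read-only page")
    return (machine_page << PAGE_SHIFT) | (dma_address & PAGE_MASK)
```

If the VMM fills each domain with the same translations as the VM's Extended Page Table, a device programmed with Guest Physical Addresses reaches exactly the memory its VM owns.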

Device Virtualization
Share an I/O device among multiple VMs
With no performance loss
While enforcing VM isolation and protection

Move device virtualization from the VMM to the device itself
Requires support from the device
Example of Ethernet controllers

Ethernet Device Virtualization
[Figure: Guest OS 1 (VM1) and Guest OS 2 (VM2) each run a vNIC driver bound to its own Virtual Function of the NIC; the VMM runs the pNIC driver; the NIC connects to the LAN]

Intel Single Root I/O Virtualization
SR-IOV capable PCI Device can be partitioned into multiple Virtual Functions
SR-IOV Device appears in PCI configuration space as multiple PCI Virtual Functions
Each Device Virtual Function includes
PCI configuration registers
DMA streams
Interrupts

Requires VT-d for DMA virtualization

Intel SR-IOV
VMM manages the physical PCI device
Creates a PCI Virtual Function for each VM
Includes it in the VM PCI configuration space to be probed by the VM Guest OS kernel
Maps it to the Protection Domain of the VM

Programs the sharing of physical device resources between VFs
PCI Device Virtual Functions directly managed by specific VF-aware Guest OS drivers (a kind of paravirtualization)

Intel SR-IOV
[Figure: VM1, VM2 ... VMn each run an OS vNIC driver bound to a Virtual Function (VF1, VF2 ... VFn) with its own packet queue; above the VMM, the Intel Niantic 10 GbE port applies Layer-2 filtering (MAC address) between the packet queues and the MAC/PHY connected to the LAN]

Intel SR-IOV - Ethernet example
Intel Kawela (1 GbE) / Niantic (10 GbE) Ethernet NICs
Multiple RX/TX packet queues per port
Virtual Machine Device Queues (VMDq)
1 RX packet queue per VF
Filters multiple unicast Ethernet addresses
Layer-2 packet filtering based on the Ethernet Destination Address
Duplicates Broadcast / Multicast packets for all VFs
Load balancing between TX packets sent by VFs
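
The receive-side Layer-2 filtering can be sketched as a dispatch on the destination address. The function and the dictionary shapes below are invented for illustration; the behaviour modelled is the one listed above: unicast frames go to the queue of the VF owning the address, broadcast and multicast frames are duplicated to every VF.

```python
def dispatch_rx(dst_mac, payload, vf_by_mac, rx_queues):
    """Place a received frame in the RX queue(s) chosen by its destination MAC."""
    first_octet = int(dst_mac.split(":")[0], 16)
    if first_octet & 1:                 # group bit set: broadcast or multicast
        targets = list(rx_queues)
    elif dst_mac in vf_by_mac:          # unicast address owned by one VF
        targets = [vf_by_mac[dst_mac]]
    else:
        targets = []                    # not for any VF: dropped
    for vf in targets:
        rx_queues[vf].append(payload)
    return targets
```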


Old Embedded Systems
Relatively simple architecture
Single-purpose devices
Dominated by hardware constraints
Memory, battery charge

Dedicated functionalities, with moderate software size and complexity
Real-time constraints

Old Embedded Systems (2)
Closed environment (black boxes)
Fixed hardware configuration
Full software provided by device vendor
No dynamic loading of applications
Software updates are rare

Embedded Systems Now (1)
Take on features of general-purpose OSes
Growing functionalities
=> growing complexity and size
Run applications originally developed for PCs
Sophisticated Human-Machine Interfaces (HMI)
Safari Web browser on iPhones

Dynamic loading of applications
iPhone
Google Android

Embedded Systems Now (2)
Dynamically load device owner's specific applications
Games

Applications developed by engineers with no expertise in embedded systems
Java applications

Need for exchanges with the external world
USB, Bluetooth, WiFi
TCP/IP

Need for open APIs, and openness in general
Need for high-level systems (Linux, Windows)

Embedded Systems Challenges
Still Real-Time systems (part of it)
Baseband stack of mobile phones

Still hardware constraints
Battery
Memory (to minimize device's cost)

Also used in mission/life-critical situations
Weapons
Cars

High requirements on reliability and security

Mobile Handsets
[Figure: modem side (telephony services, wireless stacks: GSM/GPRS, EDGE/UMTS) on an RTOS, and applications side (HMI, PIM, Internet, games, multimedia) on Android/Linux, both running above an OS Virtualization Layer on the phone hardware (single CPU)]

Run Android/Linux applications on the baseband processor
Reuse existing legacy modem software stack with its RTOS (no changes)
Support of Linux at a minimal development cost
Operating System independence for future evolutions
Security & protection through OS isolation

HMI: Human-Machine Interface
PIM: Personal Information Management

Virtualization in Embedded Systems
Support for heterogeneous OS environments
Real-time OS
Legacy software
Dedicated applications whose real-time constraints cannot be achieved by General Purpose systems
Licence issues (GPL contamination)

General Purpose OS
Openness
HMI

Virtualization in Embedded Systems
Concurrent execution of RTOS and GPOS on the same CPU
Reduces cost (Bill of Materials)
Requires the underlying VMM to provide
Memory isolation between OSes
CPU scheduling among OSes, with higher priority to the RTOS
Device partitioning
Communication mechanism between OSes

Virtualization in Embedded Systems
Leverage multi-core support with the virtual machine abstraction
1 core per OS => no need for CPU scheduling
2 low-performance cores consume less power than a single high-performance CPU
=> simplify power management
New model of software distribution, shipping an application with its own OS
No OS configuration/version inconsistency

Security Through Virtualization
Notion of Trusted Computing Base (TCB)
Part of the system that provides security foundations
Should only include hardware and VMM
May also include RTOS, for performance/legacy reasons

Run GPOS in an isolated Virtual Machine
Prevent a damaged GPOS from compromising the secure parts (data, services) of the system

Embedded + Virtualization Challenges (1)
Full isolation of VMs does not fit cooperation requirements between OSes
Efficient communication mechanisms between VMs
Global scheduling, with interleaved priorities
Global Energy Management

Embedded + Virtualization Challenges (2)
Efficient communication mechanisms between VMs
Virtual Ethernet device not adapted
Need VMM-controlled shared memory transfers

Example: Video streaming on a Smartphone
Video data received via the baseband managed by RTOS
Video data displayed by a Media Player running on GPOS
Avoid copying the video data transferred between the 2 OSes!

Task Scheduling Issues
Standard server-oriented Virtualization model
The VMM schedules VMs on the CPU
The OS on each VM runs its own scheduler

Interleaved priorities in Embedded Systems
Baseband task of RTOS with a high priority
But GPOS Media Player must have a higher priority than some low-priority tasks of RTOS
Enable a VM to yield the CPU
Use an RT task as a proxy of the GPOS application, and make it yield the CPU

Multi-User Devices
Mobile phone has 3 types of users, each with specific private data to protect from the others
The person owning the device, with address book, emails, documents, etc.
Different wireless providers, for example private and professional: network access properly authenticated, ensure correct billing!
Third-party service providers, for instance multimedia providers.

Owner and third parties must be guaranteed secure financial transactions

Virtualization in Hardware
Only way to build a real TCB
Without penalizing performance

Should include support for
Memory partitioning
Physical Memory / Machine Memory mapping
Coupled with multi-cores
Device partitioning
Interrupt routing
I/O DMA coupled with memory partitioning & Physical Memory / Machine Memory mapping


Evolutions of Virtualization
VMM shipped with hardware
By CPU/motherboard manufacturer
With extensions from the machine manufacturer
Plays BIOS role

VMM still a boot option?
VMM available as free software?
VMM available as Open Source?

Evolutions of Virtualization
Public & Open VMM API
Ease the port of an OS on top of a VMM

Device Virtualization
Included in VMM API
Allow generic device drivers in OSes
Increase OS portability
