
INF-BCO2382

VMware vSphere HA Recommendations to Maximize Virtual Machine Uptime

Josh Gray, VMware, Inc. Jeff Hunter, VMware, Inc.

#vmworldinf

Disclaimer

This session may contain product features that are currently under development.

This session/overview of the new technology represents no commitment from VMware to deliver these features in any generally available product.

Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.

Technical feasibility and market demand will affect final delivery.

Pricing and packaging for any new technologies or features discussed or presented have not been determined.

High Availability is Part of IT Business Continuity

Just a Few Clicks to Higher Availability

Turn ON vSphere HA
OK

Global Support Services (GSS)

Support offices: Burlington, Canada; Palo Alto, CA; Cork, Ireland; Tokyo, Japan; Broomfield, CO; Bangalore, India

Local language support: Spanish, Portuguese, French, German, Japanese, Chinese

Global coverage: 24x7, 365 days/year
6 Support Centers
1000+ Support Engineers

Follow-the-sun support for Severity 1 issues

Support relationships with 100% of the Fortune 100; 99% of Fortune 500

Recent Enhancements

vSphere 5.0 Major Redesign
Fault Domain Manager (FDM)

vSphere 5.1 Minor Updates

Recommendations: Networking

Redundant Management Network
Fewest hops possible
Route based on originating port ID
Failback policy = No
Enable PortFast, Edge, etc.
MTU size the same
Keep things simple

Recommendations: Networking

Consistent portgroup names, network labels
Suspend Host Monitoring during network maintenance
Use Maintenance Mode
Separate subnet for vSphere HA
Specify additional network isolation address
Each host can communicate with all other hosts
Keep things simple


Recommendations: Networking

Advanced Configuration Options


das.allowNetwork[0-9]=
das.isolationAddress[0-9]=
das.useDefaultIsolationAddress= (true/false)
das.failuredetectiontime (not supported in vCenter 5.x)
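
As a rough illustration of how the options above might be sanity-checked before they are entered in the cluster's advanced settings, here is a minimal Python sketch. The option names come from the slide; the validation rules (isolation addresses given as IP literals, allowNetwork values being non-empty port group names), the function name and the example addresses are assumptions made for illustration, not VMware tooling.

```python
import ipaddress
import re

# Option-name patterns as written on the slide; the checks are illustrative assumptions.
_INDEXED = re.compile(r"^das\.(allowNetwork|isolationAddress)([0-9])$")


def validate_ha_options(options):
    """Return a list of human-readable problems found in a dict of HA options."""
    problems = []
    for name, value in options.items():
        match = _INDEXED.match(name)
        if match and match.group(1) == "isolationAddress":
            # Assumption: isolation addresses are supplied as IP literals.
            try:
                ipaddress.ip_address(value)
            except ValueError:
                problems.append(f"{name}: {value!r} is not an IP address")
        elif match:
            # Assumption: allowNetwork values are port group names.
            if not value.strip():
                problems.append(f"{name}: port group name is empty")
        elif name == "das.useDefaultIsolationAddress":
            if value not in ("true", "false"):
                problems.append(f"{name}: expected 'true' or 'false'")
        elif name == "das.failuredetectiontime":
            problems.append(f"{name}: not supported in vCenter 5.x")
        else:
            problems.append(f"{name}: not one of the options listed on the slide")
    return problems


example = {
    "das.isolationAddress0": "192.168.10.1",  # hypothetical address
    "das.isolationAddress1": "192.168.20.1",  # hypothetical address
    "das.useDefaultIsolationAddress": "false",
    "das.failuredetectiontime": "30000",      # legacy option, flagged below
}
print(validate_ha_options(example))
```

Only the legacy das.failuredetectiontime entry is flagged in this example.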


Recommendations: Storage

Implement multiple paths
HBAs, storage processors (SPs), NICs, switches
Appropriate multipathing policy


Recommendations: Storage

Storage Heartbeats
HA selects two datastores by default

Override auto-selected datastores if necessary


HA Events
(How to Avoid Problems)


Possible HA Events:
Host failure
Network partition
Host isolation
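
The three events above differ in which signals are still present. The following Python sketch is a simplified model of that distinction, assuming the commonly described inputs: network heartbeats, datastore heartbeats, reachability of other HA hosts, and reachability of the isolation address. The function and parameter names are hypothetical, and this is not the actual FDM algorithm.

```python
from enum import Enum


class HostState(Enum):
    HEALTHY = "healthy"
    FAILED = "host failure"
    PARTITIONED = "network partition"
    ISOLATED = "host isolation"


def classify_host(network_heartbeat, datastore_heartbeat,
                  reaches_other_hosts, reaches_isolation_address):
    """Simplified classification of a host as the rest of the cluster sees it."""
    if network_heartbeat:
        return HostState.HEALTHY
    if not datastore_heartbeat:
        # No sign of life on the network or on the heartbeat datastores.
        return HostState.FAILED
    if reaches_other_hosts or reaches_isolation_address:
        # Alive, with some management connectivity left, but cut off from the master.
        return HostState.PARTITIONED
    # Alive, but no management connectivity at all.
    return HostState.ISOLATED


print(classify_host(False, True, False, False).value)
```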


HA Events:
Host Failures


HA Events:
Network Partition

Recommendations: Network Partition

Symptoms: Network Partition
[Diagram sequence; labels: Master, New Master]

HA Events:
Host Isolation


Host Isolation Policies:
Leave Powered On
Power Off
Shutdown


Which Policy?
(How to Avoid Problems)


Depends.
(on HOW You Want to Avoid Problems)


Likelihood.


Recommendations: Isolation Response

Host will retain access to datastores? | VMs will retain access to VM network? | Recommended Isolation Policy | Rationale
Likely | Likely | Leave Powered On | VM is running fine, why power it off
Likely | Unlikely | Leave Powered On or Shutdown | Allow HA to restart on hosts that are not isolated, likely to have access to storage
Unlikely | Likely | Power Off | Avoid having two instances of the same VM on the network
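
The isolation response recommendations can be read as a simple lookup keyed on the two likelihood questions. A minimal Python sketch follows, assuming "Likely" maps to True and "Unlikely" to False; the dictionary and function names are hypothetical. The slide gives no row for the case where both are unlikely, so the sketch returns None there rather than guessing.

```python
# Keys: (host likely keeps datastore access, VMs likely keep VM network access)
RECOMMENDED_ISOLATION_RESPONSE = {
    (True, True): "Leave Powered On",
    (True, False): "Leave Powered On or Shutdown",
    (False, True): "Power Off",
}


def recommend_isolation_response(datastores_likely, vm_network_likely):
    """Return the recommended policy, or None if the slide has no matching row."""
    return RECOMMENDED_ISOLATION_RESPONSE.get((datastores_likely, vm_network_likely))


print(recommend_isolation_response(True, False))
```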


Admission Control
(How to Avoid Problems)


Admission Control Policies:
Static number of hosts
Percentage of cluster resources
Dedicated failover hosts


Static Number of Hosts
Admission Control Policy

Recommendations: Admission Control

Number of Hosts (Host Failures Cluster Tolerates)

Each Host: 4 CPU x 2.40 GHz, 16 GB memory
Cluster: 38 GHz CPU, 64 GB memory

Reservation: 2 GHz CPU, 1024 MB memory
Reservation: 1 GHz CPU, 2048 MB memory
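
The figures above can be run through the slot arithmetic usually described for this policy: the slot size is the largest CPU reservation and the largest memory reservation, each host holds as many slots as its scarcer resource allows, and the largest hosts are set aside to cover the tolerated failures. The Python sketch below is an approximation under stated assumptions: a four-host cluster (inferred from 64 GB / 16 GB per host), no memory overhead, and no capacity consumed by the hypervisor. Function names are hypothetical.

```python
def slot_size(reservations):
    """Largest CPU (MHz) and largest memory (MB) reservation among the VMs."""
    return (max(cpu for cpu, _ in reservations),
            max(mem for _, mem in reservations))


def slots_per_host(host_cpu_mhz, host_mem_mb, slot):
    """Whole slots that fit on one host, limited by the scarcer resource."""
    slot_cpu, slot_mem = slot
    return min(host_cpu_mhz // slot_cpu, host_mem_mb // slot_mem)


def usable_slots(per_host_slots, host_failures_tolerated):
    """Slots left after setting aside the largest hosts for failover."""
    keep = len(per_host_slots) - host_failures_tolerated
    return sum(sorted(per_host_slots)[:keep])


# VM reservations from the slide: (CPU MHz, memory MB)
reservations = [(2000, 1024), (1000, 2048)]
slot = slot_size(reservations)

# Each host: 4 x 2.40 GHz = 9600 MHz, 16 GB = 16384 MB
per_host = slots_per_host(4 * 2400, 16 * 1024, slot)

# Assumption: four identical hosts, one host failure tolerated
available = usable_slots([per_host] * 4, host_failures_tolerated=1)

print(slot, per_host, available)
```

With these inputs the slot is 2000 MHz by 2048 MB, each host holds 4 slots (CPU is the limit), and 12 of the 16 slots remain usable once one host is held back.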


Recommendations: Admission Control

Number of Hosts (Host Failures Cluster Tolerates)

Windows Client

vSphere Web Client


Recommendations: Admission Control

Number of Hosts (Host Failures Cluster Tolerates)

Override default behavior
vSphere Windows Client: sets a cap on the slot size
vSphere Web Client: sets the exact slot size. Important difference.
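
To make the cap-versus-exact distinction concrete, here is a small Python sketch. It assumes that a cap can only lower the computed slot size while an exact setting replaces it outright, and that a VM whose reservation exceeds the slot size occupies several slots. The function names and the sample values are hypothetical.

```python
import math


def effective_slot(largest_reservation, setting, mode):
    """Slot size for one resource after applying an override.

    mode "cap":   the setting is an upper bound (Windows Client behavior on the slide)
    mode "exact": the setting is used as-is (Web Client behavior on the slide)
    """
    if mode == "cap":
        return min(largest_reservation, setting)
    if mode == "exact":
        return setting
    raise ValueError(f"unknown mode: {mode!r}")


def slots_needed(reservation, slot):
    """Assumed rule: a VM takes as many slots as it needs to cover its reservation."""
    return max(1, math.ceil(reservation / slot))


largest_cpu_reservation = 2000  # MHz, from the earlier example
print(effective_slot(largest_cpu_reservation, 4000, "cap"))    # cap above the reservation
print(effective_slot(largest_cpu_reservation, 4000, "exact"))  # exact value wins
print(slots_needed(2000, effective_slot(largest_cpu_reservation, 500, "cap")))
```

The same number entered in the two clients can therefore produce different slot sizes whenever it is larger than the biggest reservation.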


Recap: Static Number of Hosts
Admission Control Policy

% of Cluster Resources
Admission Control Policy


Recommendations: Admission Control

Percentage of cluster resources
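
For comparison with the slot-based policy, the percentage policy is usually described as a check that the unreserved share of cluster CPU and memory stays at or above the configured failover percentage. The Python sketch below illustrates that check under simplifying assumptions: it ignores per-VM memory overhead and the defaults HA applies to VMs without reservations. The function names, the 25% setting and the sample reservations are hypothetical.

```python
def failover_capacity_pct(total, reserved):
    """Share of a resource that is still unreserved, in percent."""
    return (total - reserved) / total * 100


def admits(total_cpu, reserved_cpu, total_mem, reserved_mem,
           new_cpu, new_mem, configured_pct):
    """True if powering on the new VM keeps both resources above the threshold."""
    cpu_ok = failover_capacity_pct(total_cpu, reserved_cpu + new_cpu) >= configured_pct
    mem_ok = failover_capacity_pct(total_mem, reserved_mem + new_mem) >= configured_pct
    return cpu_ok and mem_ok


# Hypothetical cluster: 4 hosts x 9600 MHz and 4 x 16384 MB, 25% held for failover
TOTAL_CPU, TOTAL_MEM = 38400, 65536
print(failover_capacity_pct(TOTAL_CPU, 3000))
print(admits(TOTAL_CPU, 3000, TOTAL_MEM, 3072, new_cpu=2000, new_mem=1024, configured_pct=25))
print(admits(TOTAL_CPU, 3000, TOTAL_MEM, 3072, new_cpu=30000, new_mem=1024, configured_pct=25))
```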


Dedicated Failover Hosts
Admission Control Policy


Which Do I Use?!?!


Recommendations: Admission Control

"Basic design principle: Do the math, and take customer requirements into account. If you need flexibility a Percentage is the way to go."
Frank Denneman & Duncan Epping, VMware vSphere 5 Clustering Technical Deepdive

vSphere HA VM Monitoring

VM Monitoring restarts a VM if:
VMware Tools heartbeat is not received, and
No network or disk activity occurs within the I/O stats interval
I/O stats interval default: 120 seconds; customize in vSphere Web Client
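
The restart condition combines two checks, which the following Python sketch spells out. It assumes both conditions must hold before a restart, uses the 120-second I/O stats default from the slide, and takes a hypothetical 30-second heartbeat failure interval. The function and parameter names are illustrative only.

```python
def should_restart_vm(seconds_since_heartbeat, failure_interval,
                      seconds_since_io, io_stats_interval=120):
    """True when the heartbeat is overdue and the VM has shown no recent I/O."""
    heartbeat_lost = seconds_since_heartbeat > failure_interval
    no_recent_io = seconds_since_io > io_stats_interval
    return heartbeat_lost and no_recent_io


# Heartbeat missing for 45 s (hypothetical 30 s failure interval), but disk I/O 10 s ago
print(should_restart_vm(45, 30, 10))
# Heartbeat missing and no network or disk activity for 150 s
print(should_restart_vm(45, 30, 150))
```

The I/O check is what keeps a busy VM with a stalled VMware Tools service from being restarted unnecessarily.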


vSphere HA Application Monitoring

3rd-Party Solutions
Symantec ApplicationHA
Neverfail vAppHA

Application Awareness API open with vSphere 5.0
Download VMware GuestAppMonitor SDK with 5.0
Download VMware Guest SDK for vSphere 5.1

vSphere HA Futures

VMware vSphere HA Today
Storage interconnect most commonly queried KB issue
Assumes storage connected on other hosts
Improvements with vSphere 5.0 U1 and 5.1

Virtual Machine Component Protection (VMCP)
Fine-grained controls for VM restart policy
Queries destination host(s) for storage health
Demo in VMware booth on show floor

vSphere HA Futures

VMware vSphere Fault Tolerance (FT) Today
Protects only VMs with 1 vCPU
Many mission-critical apps require multiple vCPUs

SMP Fault Tolerance (FT)
Protect VMs that have more than one vCPU

Customer Support Day Events


Coming to a location near you: sharing of VMware best practices!

Support Days are a collaboration between VMware Support, Sales, and customers: you learn directly from the experts.

Topics are driven by customer input, and typically include:
Best practices
Tips/tricks
Top issues
Product roadmaps/demos
Certification offerings

http://www.vmware.com/go/supportdays

VMware GSS: Important Links


Support and Downloads: vmware.com/support
Get Support via My VMware: my.vmware.com/group/vmware/get-help
Knowledge Base: kb.vmware.com
Renewals: vmware.com/go/renew
Product Support Centers: vmware.com/support/product-support

Blogs
Support Insider: blogs.vmware.com/kb
KBTV: blogs.vmware.com/kbtv
KB Digest: blogs.vmware.com/kbdigest

Twitter
@vmwarecares: twitter.com/vmwarecares
@vmwarekb: twitter.com/vmwarekb

Facebook: https://www.facebook.com/vmwkb

Technical Support Welcome Guide: vmware.com/go/supportguide
Licensing Help Center: vmware.com/support/licensing
Customer Support Days: vmware.com/go/supportdays
Customer Advocacy: customerfeedback@vmware.com

Communities: communities.vmware.com

YouTube KBTV: youtube.com/user/vmwarekb
