
Implementation and Planning Best Practices for EMC VPLEX

VS2 Hardware with GeoSynchrony v5.x

Technical Note
November 2013

These technical notes describe various EMC VPLEX configurations and the
recommended best practices for each configuration.

Contents

EMC VPLEX Overview
VPLEX Components
Requirements vs. Recommendations
Back-end/Storage Array Connectivity
Host Connectivity
Host Cluster cross-connect
Path loss handling semantics (PDL and APD)
VBLOCK and VPLEX Front End Connectivity Rules
Storage View Considerations
Fan In / Fan Out Ratios
Network Best Practices
VPLEX Witness
Consistency Groups
Rule Sets
System Volumes
Migration of Host/Storage to a VPLEX Environment
Storage Volume Considerations
Export Considerations
LUN Expansion
Data Migration
Scale Up Scale Out and Hardware Upgrades
VPLEX and RecoverPoint Integration
Monitoring for Performance Best Practices
Storage Resource Management Suite VPLEX Solution Pack
VPLEX Administration Recommendations
Summary
Appendix A: VS1
Glossary



Audience
These technical notes are for EMC field personnel, partners, and customers who will be configuring, installing, and supporting VPLEX. An understanding of these technical notes requires familiarity with the following:

SAN technology and network design

Fibre Channel block storage concepts

VPLEX concepts and components

The next section presents a brief review of VPLEX.


EMC VPLEX Overview


EMC VPLEX represents the next-generation architecture for data mobility and
continuous availability. This architecture is based on EMC's 20+ years of expertise in
designing, implementing, and perfecting enterprise-class intelligent cache and
distributed data protection solutions.
VPLEX addresses two distinct use cases:

Data Mobility: The ability to move applications and data across different storage
installations, whether within the same data center, across a campus, or within a
geographical region

Continuous Availability: The ability to create a continuously available storage
infrastructure across these same varied geographies with unmatched resiliency

Figure 1 Access anywhere, protect everywhere


VPLEX Platform Availability and Scaling Summary


VPLEX addresses continuous availability and data mobility requirements and scales
to the I/O throughput required for the front-end applications and back-end storage.
Continuous availability and data mobility features are characteristics of VPLEX Local,
VPLEX Metro, and VPLEX Geo.
A VPLEX cluster consists of one, two, or four engines (each containing two directors)
and a management server. A dual-engine or quad-engine cluster also contains a pair
of Fibre Channel switches for communication between directors within the cluster.
Each engine is protected by a standby power supply (SPS), and each Fibre Channel
switch gets its power through an uninterruptible power supply (UPS). (In a dual-engine or quad-engine cluster, the management server also gets power from a UPS.)
The management server has a public Ethernet port, which provides cluster
management services when connected to the customer network. This Ethernet port
also provides the point of access for communications with the VPLEX Witness.
VPLEX scales both up and out. Upgrades from a single-engine to a dual-engine cluster,
as well as from a dual-engine to a quad-engine cluster, are fully supported and are
accomplished non-disruptively. This is referred to as scale up. Scale-out upgrades
from a VPLEX Local to a VPLEX Metro or VPLEX Geo are also supported non-disruptively.
Note: Scale-out also supports connecting clusters of any size to clusters of any size,
meaning that it is not required that both clusters contain the same number of engines.

For access to all VPLEX-related collateral, interaction, and information from EMC,
please visit the customer-accessible resources:

EMC Community Network (ECN): Space: VPLEX

support.emc.com (formerly Powerlink.emc.com)


Data Mobility
EMC VPLEX enables connectivity to heterogeneous storage arrays, providing
seamless data mobility and the ability to manage storage provisioned from multiple
heterogeneous arrays from a single interface within a data center. Data Mobility and
Mirroring are supported across different array types and vendors.
VPLEX Metro or Geo configurations enable migrations between locations over
synchronous/asynchronous distances. In combination with, for example, VMware
and Distance vMotion or Microsoft Hyper-V, this allows you to transparently relocate
Virtual Machines and their corresponding applications and data over synchronous
distance. This provides you with the ability to relocate, share, and balance
infrastructure resources between data centers. Geo is currently supported with
Microsoft Hyper-V only.
The EMC Simple Support Matrix (ESSM) is a simplified version of the E-Lab Navigator.

Note: Please refer to the ESSM for additional support updates.

Figure 2 Application and data mobility


All Directors in a VPLEX cluster have access to all Storage Volumes, making this
solution what is referred to as an N-1 architecture. This type of architecture allows
for multiple director failures, down to a single surviving director, without loss of
access to data.



During a VPLEX Mobility operation, any jobs in progress can be paused or stopped
without affecting data integrity. Data Mobility creates a mirror of the source and
target devices, allowing the user to commit or cancel the job without affecting the
actual data. A record of all mobility jobs is maintained until the user purges the list
for organizational purposes.

Figure 3 Migration process comparison


One of the first and most common use cases for storage virtualization in general is that
it provides a simple, transparent approach to array replacement. Standard migrations
off an array are time consuming because planned outages must be coordinated for all
affected applications that cannot have new devices provisioned and copied to them
without taking an outage. Additional host remediation may be required to support the
new array, which may also require an outage.
VPLEX eliminates all these problems and makes the array replacement completely
seamless and transparent to the servers. The applications continue to operate
uninterrupted during the entire process. Host remediation is not necessary as the host
continues to operate off the Virtual Volumes provisioned from VPLEX and is not
aware of the change in the backend array. All host level support requirements apply
only to VPLEX and there are no necessary considerations for the backend arrays as
that is handled through VPLEX.
If the solution incorporates RecoverPoint, and the RecoverPoint Repository, Journal,
and Replica volumes reside on VPLEX virtual volumes, then array replacement is also
completely transparent even to RecoverPoint. This solution results in no interruption
in the replication, so there is no requirement to reconfigure or resynchronize the
replication volumes.

Continuous Availability
Virtualization Architecture
Built on a foundation of scalable and continuously available multi-processor engines,
EMC VPLEX is designed to seamlessly scale from small to large configurations.
VPLEX resides between the servers and heterogeneous storage assets, and uses a
unique clustering architecture that allows servers at multiple data centers to have
read/write access to shared block storage devices.
Unique characteristics of this new architecture include:

Scale-out clustering hardware lets you start small and grow big with
predictable service levels

Advanced data caching utilizes large-scale SDRAM cache to improve
performance and reduce I/O latency and array contention

Distributed cache coherence for automatic sharing, balancing, and failover of
I/O across the cluster

Consistent view of one or more LUNs across VPLEX clusters (within a data
center or across synchronous distances) enabling new models of continuous
availability and workload relocation

With its unique scale-up and scale-out architecture, VPLEX advanced data caching and
distributed cache coherency provide workload resiliency, automatic sharing,
balancing, and failover of storage domains, and enable both local and remote data
access with predictable service levels.
EMC VPLEX has been architected for multi-site virtualization, enabling federation
across VPLEX Clusters. VPLEX Metro supports a maximum of 5 ms RTT over FC or
10 GigE connectivity. VPLEX Geo supports a maximum of 50 ms RTT over 10 GigE.
The nature of the architecture will enable more than two sites to be connected in the future.
EMC VPLEX uses a VMware Virtual Machine located within a separate failure domain
to provide a VPLEX Witness between VPLEX Clusters that are part of a
distributed/federated solution. This third site needs only IP connectivity to the
VPLEX sites, and a 3-way VPN is established between the VPLEX management
servers and the VPLEX Witness.
Many other solutions require a third site with an FC LUN acting as the quorum disk.
This LUN must be accessible from the solution's node in each site, resulting in
additional storage and link costs.
Please refer to the section in this document on VPLEX Witness for additional details.

Storage/Service Availability
Each VPLEX site has a local VPLEX Cluster, and physical storage and hosts are
connected to that VPLEX Cluster. The VPLEX Clusters themselves are interconnected
across the sites to enable federation. A device is taken from each of the VPLEX
Clusters to create a distributed RAID 1 virtual volume. Hosts connected in Site A
actively use the storage I/O capability of the storage in Site A, and hosts in Site B
actively use the storage I/O capability of the storage in Site B.


Figure 4 Continuous availability architecture

VPLEX distributed volumes are available from either VPLEX cluster and have the
same LUN and storage identifiers when exposed from each cluster, enabling true
concurrent read/write access across sites.
When using a distributed virtual volume across two VPLEX Clusters, if the storage in
one of the sites is lost, all hosts continue to have access to the distributed virtual
volume, with no disruption. VPLEX services all read/write traffic through the remote
mirror leg at the other site.


VPLEX Components
VPLEX Engine VS2
VS2 refers to the second version of hardware for the VPLEX cluster. The VS2
hardware is based on 2U engines and is detailed below. For information on VS1
hardware, please see the appendix.

The following figures show the front and rear views of the VPLEX VS2 Engine.

Figure 5 Front and rear view of the VPLEX VS2 Engine

Connectivity and I/O paths


This section covers the hardware connectivity best practices for connecting to the
SAN. The practices mentioned below are based on a dual fabric SAN, which is
industry best practice. We'll discuss host and array connectivity. The VPLEX
hardware is designed with a standard preconfigured port arrangement that is not
reconfigurable. The VS2 hardware must be ordered as a Local, Metro, or Geo. VS2
hardware is preconfigured with FC or 10 Gigabit Ethernet WAN connectivity from
the factory and does not offer both solutions in the same configuration.


Figure 6 VPLEX preconfigured port arrangement VS2 hardware


Director A and Director B each have four I/O modules. I/O modules A0 and B0 are
configured for host connectivity and are identified as front-end, while A1 and B1
are configured for array connectivity and identified as back-end. The front-end ports
log in to the fabrics and present themselves as targets for zoning to the host initiators,
and the back-end ports log in to the fabrics as initiators to be used for zoning to the
array targets. Each director connects to both SAN fabrics with both front-end and
back-end ports. Array direct connect is also supported, however it is limiting; special
consideration must be used if this option is required.
The I/O modules in A2 and B2 are for WAN connectivity. This slot may be populated
by a four-port FC module or a two-port 10 GigE module for VPLEX Metro or VPLEX Geo
configurations. VPLEX Local configurations ship with filler blanks in slots A2
and B2; WAN I/O modules may be added in the field for connecting to another net new
cluster for Metro or Geo upgrades. The I/O modules in slots A3 and B3 are populated
with FC modules for Local COM and will only use the bottom two ports.
The FC WAN COM ports will be connected to dual separate backbone fabrics or
networks that span the two sites. This allows for data flow between the two VPLEX
clusters in a Metro configuration without requiring a merged fabric between the two
sites. Dual fabrics that currently span the two sites are also supported but not
required. The 10 GigE I/O modules will be connected to dual networks of the same
QoS. All A2-FC00 and B2-FC00 ports (or A2-XG00 and B2-XG00 ports) from both
clusters will connect to one fabric or network, and all A2-FC01 and B2-FC01 ports
(or A2-XG01 and B2-XG01 ports) will connect to the other fabric or network. This
provides a redundant network capability where each director on one cluster
communicates with all the directors on the other site even in the event of a fabric or
network failure. For both VPLEX Metro and VPLEX Geo, each director's WAN COM
ports on one cluster must see all of the directors' WAN COM ports within the same
port group on the other cluster across two different pipes. This applies in both
directions.
When configuring the VPLEX Cluster cabling and zoning, the general rule is to use a
configuration that provides the best combination of simplicity and redundancy. In
many instances connectivity can be configured to varying degrees of redundancy.
However, there are some minimal requirements that must be adhered to for support
of features like NDU. Various requirements and recommendations are outlined below
for connectivity with a VPLEX Cluster.
Frontend (FE) ports provide connectivity to the host adapters also known as host
initiator ports. Backend (BE) ports provide connectivity to storage arrays ports known
as target ports or FAs.
Do not confuse the usage of ports and initiator ports within documentation. Any
general reference to a port should be a port on a VPLEX director. All references to
HBA ports on a host should use the term initiator port. VPLEX Metro and VPLEX
Geo sections have a more specific discussion of cluster-to-cluster connectivity.
General information (applies to both FE and BE)
Official documents, such as those found in the VPLEX Procedure Generator, refer to a
minimal config and describe how to connect it to bring up the least possible host
connectivity. While this is adequate to demonstrate the features within VPLEX for the
purpose of a Proof of Concept or for use within a test/dev environment, it should not be
implemented in a full production environment. As clearly stated within the
documentation, the minimal config is not a continuously available solution, and
solutions that are not HA should not be introduced into production environments.
Also, the minimal config documentation is specific to host connectivity. Please do
not try to apply this concept to back-end array connectivity. The back-end requirements
must allow for connectivity to both fabrics for dual-path connectivity to all back-end
storage volumes from each director.
The following are recommended:

Dual fabric designs for fabric redundancy and HA should be implemented to
avoid a single point of failure. This provides data access even in the event of a
full fabric outage.

Each VPLEX director will physically connect to both fabrics for both host
(front-end) and storage (back-end) connectivity. Hosts will connect to both an
A director and a B director from both fabrics, and across engines, for the
supported HA level of connectivity as required by the NDU pre-checks.

Figure 7 Continuous availability front-end configurations (dual-engine)

Back-end connectivity checks verify that there are two paths to each LUN
from each director. This assures that the number of combined active and
passive paths (reported by the ndu pre-check command) for the LUN is
greater than or equal to two. This check also assures that there are at least two
unique initiators and two unique targets in the set of paths to a LUN from
each director. These back-end paths must be configured across both fabrics as
well. No volume is to be presented over a single fabric to any director, as this
is a single point of failure. (A CLI sketch of these checks follows this list.)

Fabric zoning should consist of a set of zones, each with a single initiator and
up to 16 targets.

Avoid incorrect FC port speed between the fabric and VPLEX. Use the highest
possible bandwidth to match the VPLEX maximum port speed, and use
dedicated port speeds (i.e., do not use oversubscribed ports on SAN switches).

Each VPLEX director has the capability of connecting both FE and BE I/O
modules to both fabrics with multiple ports. The ports connected to on the
SAN should be on different blades or switches so a single blade or switch
failure won't cause loss of access on that fabric overall. A good design will
group VPLEX BE ports with array ports that will be provisioning groups of
devices to those VPLEX BE ports in such a way as to minimize traffic across
blades.
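As a quick way to confirm the back-end path rules above from the VPLEX CLI, the following is a minimal sketch. It assumes the connectivity validate-be and ndu pre-check commands available in GeoSynchrony 5.x; verify the exact command names and output against the CLI guide for your release.

VPlexcli:/> connectivity validate-be
(summarizes back-end path status per director and flags storage volumes with missing or single-fabric paths)

VPlexcli:/> ndu pre-check
(runs the NDU pre-checks, including the validation of active/passive path counts per storage volume per director)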


Requirements vs. Recommendations


The table below summarizes the requirements and recommendations for Production and Test/Proof of Concept environments.

Dual Fabric
  Production Environment: Requirement for High Availability.
  Test or Proof of Concept Environment: Requirement if the tests involve High Availability.
  Notes: Dual Fabrics are a general best practice.

Dual HBA
  Production Environment: Required.
  Test or Proof of Concept Environment: Required.
  Notes: Single-initiator hosts are not supported, and a dual-port HBA is also a single point of failure.

Initiator connected to both an A and B Director
  Production Environment: Required.
  Test or Proof of Concept Environment: Recommended.
  Notes: For a Production Environment, it is also required that the connectivity for each initiator span engines in a dual- or quad-engine VPLEX Cluster.

Four "active" backend paths per Director per Storage Volume
  Production Environment: Recommended; it is also a requirement not to have more than four active paths per director per storage volume.
  Test or Proof of Concept Environment: Recommended.
  Notes: This is the maximum number of "active" paths. An active/passive or ALUA array will have a maximum of four active and four passive or non-preferred paths, making eight in all.

Two "active" backend paths per Director per Storage Volume
  Production Environment: Required.
  Test or Proof of Concept Environment: Required.
  Notes: This is a minimum requirement in NDU, which dictates that two VPLEX Director backend ports be connected to two array ports per Storage Volume. Depending on workload, size of environment, and array type, four "active" path configurations have proven to alleviate performance issues and are therefore recommended over the minimum of two active paths per director per storage volume. Try to avoid only two-path connectivity in production environments.

Host Direct Connected to VPLEX Directors
  Production Environment: Not Supported.
  Test or Proof of Concept Environment: Not Supported.
  Notes: Host direct connect to a VPLEX Director is never supported.

Arrays direct connected to VPLEX Directors
  Production Environment: Not Recommended but Supported.
  Test or Proof of Concept Environment: Supported.
  Notes: Array direct connect is supported but extremely limited in scale, which is why it is not recommended for a production environment.

WAN COM single port connectivity
  Production Environment: Not Supported.
  Test or Proof of Concept Environment: Not Supported.
  Notes: Two ports on the WAN COM for each Director must be configured, each in their separate port groups. Fibre Channel WAN COM (Metro only) is also supported with all four ports, each in their own port group.

Metadata and Logging Volume
  Production Environment: It is required that metadata and metadata backups are configured across arrays at the local site if more than one array is present. If the site were built originally with a single array and another array were added at a later time, then it is required to move one leg of the metadata volume and one backup to the new array.
  Test or Proof of Concept Environment: It is required that metadata and metadata backups are configured across arrays at the local site if more than one array is present. A standard test during a POC is to perform an NDU; it would be undesirable to have to use the --skip option if not needed.
  Notes: While it is a requirement for metadata and metadata backups to be configured across arrays, it is highly recommended to mirror logging volumes across arrays as well. Loss of the array that contains the logging volumes would result in the additional overhead of full rebuilds after the array came back up.

Host Cross-Cluster Connect
  Production Environment: Supported, with VPLEX Witness required.
  Test or Proof of Concept Environment: Supported, with VPLEX Witness required.
  Notes: VPLEX Witness is a hard requirement for Host Cross-Cluster Connect regardless of the type of environment. The auto-resume attribute on all consistency groups must be set to true as an additional requirement.

Array Cross-Cluster Connect
  Production Environment: Only supported if both sites are within 1 ms latency of each other, and strictly for the purpose of adding protection to metadata and logging volumes. Storage Volumes are not supported connected to both clusters.
  Test or Proof of Concept Environment: Same as Production: only supported within 1 ms latency, and strictly for the purpose of adding protection to metadata and logging volumes. Storage Volumes are not supported connected to both clusters.
  Notes: Connecting an array to both sides of a VPLEX Metro or Geo is not supported if the sites exceed 1 ms latency from each other. If done, then extreme caution must be taken not to share the same devices to both clusters. Also, be cautious if evaluating performance or fault injection tests with such a configuration.

VPLEX Witness
  Production Environment: Requirement for High Availability.
  Test or Proof of Concept Environment: Optional, but should mirror what will be in production.
  Notes: VPLEX Witness is designed to work with a VPLEX Metro or Geo. It is not implemented with a VPLEX Local. VPLEX Witness has proven to be such a valuable enhancement that it should be considered a requirement. VPLEX Witness must never be co-located with either of the VPLEX Clusters that it is monitoring.


Back-end/Storage Array Connectivity


The best practice for array connectivity is to use A/B fabrics for redundancy; however,
VPLEX is also capable of back-end direct connect. This approach is immediately
recognizable as being extremely limited. The best practices for fabric connect below
should also be followed for direct connect where applicable.
Direct connect is intended for Proof of Concept, test/dev, and specific sites that have
only one array. This allows for back-end connectivity while reducing the overall cost of
switch ports. Sites with multiple arrays or large implementations should utilize SAN
connectivity, as that provides the optimal solution overall.

Note: Direct connect applies only to backend connectivity. Frontend direct connect is
not supported.

Active/Active Arrays
Each director in a VPLEX cluster must have a minimum of two I/O paths to every
local back-end storage array and to every storage volume presented to that cluster
(required). This is referred to as an ITL or Initiator/Target/LUN.
Each director will have redundant physical connections to the back-end storage across
dual fabrics (required). Each director is required to have redundant paths to every
back-end storage array across both fabrics.
Each storage array should have redundant controllers connected to dual fabrics, with
each VPLEX Director having a minimum of two ports connected to the back-end
storage arrays through the dual fabrics (required).
VPLEX recommends a maximum of 4 active paths per director to a given LUN
(Optimal). This is considered optimal because each director will load balance across
the four active paths to the storage volume.
High quantities of storage volumes or entire arrays provisioned to VPLEX should be
divided up into appropriately sized groups (i.e. masking views or storage groups) and
presented from the array to VPLEX via groups of four array ports per VPLEX director
so as not to exceed the four active paths per VPLEX director limitation. As an example,
following the rule of four active paths per storage volume per director (also referred to
as ITLs), a four engine VPLEX cluster could have each director connected to four array
ports dedicated to that director. In other words, a quad engine VPLEX cluster would
have the ability to connect to 32 ports on a single array for access to a single device
presented through all 32 ports and still meet the connectivity rules of 4 ITLs per
director. This can be accomplished using only two ports per backend I/O module
leaving the other two ports for access to another set of volumes over the same or
different array ports.
Appropriateness would be judged based on things like the planned total IO workload
for the group of LUNs and limitations of the physical storage array. For example,
storage arrays often have limits around the number of LUNs per storage port, storage
group, or masking view they can have.
Maximum performance, environment wide, is achieved by load balancing across the
maximum number of ports on an array while staying within the ITL limits.
Performance is not based on a single host but on the overall impact of all resources being
utilized. Proper balancing of all available resources provides the best overall
performance.
Load balancing via host multipath between VPLEX directors, and then from the four
paths on each director, balances the load equally between the array ports:

1. Zone VPLEX director A ports to one group of four array ports.

2. Zone VPLEX director B ports to a different group of four array ports.

3. Repeat for additional VPLEX engines.

4. Create a separate port group within the array for each of these logical path groups.

5. Spread each group of four ports across array engines for redundancy.

6. Mask devices to allow access to the appropriate VPLEX initiators for both port groups.
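As an illustration of steps 1 and 2, a single-initiator zone on a Brocade fabric might look like the following sketch. The alias names, zone name, configuration name, and WWNs are hypothetical placeholders; cfgadd assumes an existing zone configuration on that fabric, an equivalent can be built with Cisco device-aliases and zones, and naming should follow your own fabric conventions.

alicreate "vplex_e1_dirA_A1FC00", "50:00:14:42:xx:xx:xx:10"
(alias for one VPLEX director A back-end port; the WWN is a placeholder)
alicreate "vmax_fa_7e0", "50:00:09:7x:xx:xx:xx:01"
alicreate "vmax_fa_8e0", "50:00:09:7x:xx:xx:xx:05"
alicreate "vmax_fa_9e0", "50:00:09:7x:xx:xx:xx:09"
alicreate "vmax_fa_10e0", "50:00:09:7x:xx:xx:xx:0d"
(aliases for the group of four array ports dedicated to this director)
zonecreate "z_vplex_e1_dirA_A1FC00__vmax_grp1", "vplex_e1_dirA_A1FC00; vmax_fa_7e0; vmax_fa_8e0; vmax_fa_9e0; vmax_fa_10e0"
(a single initiator zoned to its group of four targets, per the zoning recommendation above)
cfgadd "fabric_A_cfg", "z_vplex_e1_dirA_A1FC00__vmax_grp1"
cfgenable "fabric_A_cfg"

Director B back-end ports would be zoned the same way to a different group of four array ports, repeating for additional engines.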


Figure 8 Active/Active Array Connectivity


This illustration shows the physical connectivity to a VMAX array. Similar
considerations should apply to other active/active arrays. Follow the array best
practices for all arrays including third party arrays.
The devices should be provisioned in such a way as to create digestible chunks,
provisioned for access through specific FA ports and VPLEX ports.
within this device grouping should restrict access to four specific FA ports for each
VPLEX Director ITL group.
The VPLEX initiators (backend ports) on a single director should spread across
engines to increase HA and redundancy. The array should be configured into
initiator groups such that each VPLEX director acts as a single host per four paths.
This could mean four physical paths or four logical paths per VPLEX director
depending on port availability and whether or not VPLEX is attached to dual fabrics
or multiple fabrics in excess of two.
For the example above, the following basic VMAX limits apply:

Initiator Groups (HBAs): maximum of 32 WWNs per IG; maximum of 8,192 IGs on a VMAX; port flags are set on the IG; an individual WWN can belong to only one IG. Cascaded Initiator Groups have other IGs (rather than WWNs) as members.

Port Groups (FA ports): maximum of 512 PGs; the ACLX flag must be enabled on the port; ports may belong to more than one PG.

Storage Groups (LUNs/SymDevs): maximum of 4,096 SymDevs per SG; a SymDev may belong to more than one SG; maximum of 8,192 SGs on a VMAX.

Masking View = Initiator Group + Port Group + Storage Group
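To make the masking view relationship concrete, the following Solutions Enabler sketch builds one masking view of the kind described in the next paragraph from an initiator group, port group, and storage group. The SID, object names, WWN, director:port values, and device range are all hypothetical, and the exact symaccess options should be verified against your Solutions Enabler documentation.

symaccess -sid 1234 create -name VPLEX_C1_DirA_IG1 -type initiator
symaccess -sid 1234 -name VPLEX_C1_DirA_IG1 -type initiator add -wwn 5000144260xxxx10
(repeat the add for each VPLEX back-end port WWN belonging to this path group)
symaccess -sid 1234 create -name VPLEX_PG1 -type port -dirport 7E:0,8E:0
(the FA ports dedicated to this director path group)
symaccess -sid 1234 create -name VPLEX_SG1 -type storage devs 0100:013F
(the SymDevs being provisioned to VPLEX in this group)
symaccess -sid 1234 create view -name VPLEX_MV1 -ig VPLEX_C1_DirA_IG1 -pg VPLEX_PG1 -sg VPLEX_SG1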
We have divided the backend ports of the VPLEX into two groups allowing us to
create four masking views on the VMAX. Ports FC00 and FC01 for both directors are
zoned to two FAs each on the array. The WWNs of these ports are the members of
the first Initiator Group and will be part of Masking View 1. The Initiator Group
created with this group of WWNs will become the member of a second Initiator
Group which will in turn become a member of a second Masking View. This is called
Cascading Initiator Groups. This was repeated for ports FC02 and FC03 placing them
in Masking Views 3 and 4. This is only one example of attaching to the VMAX and
other possibilities are allowed as long as the rules are followed.
VPLEX virtual volumes should be added to Storage Views containing initiators from an
A director and initiators from a B director. This translates to a single host with two
initiators connected to dual fabrics and having four paths into two VPLEX directors.
VPLEX would access the back-end array's storage volumes via eight FAs on the array
through two VPLEX directors (an A director and a B director). The VPLEX A director
and B director each see four different FAs, spread across at least two VMAX engines if
available.
This is an optimal configuration that spreads a single host's I/O over the maximum
number of array ports. Additional hosts will attach to different pairs of VPLEX
directors in a dual-engine or quad-engine VPLEX cluster. This helps spread the
overall environment I/O workload over more switch, VPLEX, and array resources.


This would allow for the greatest possible balancing of all resources resulting in the
best possible environment performance.

Figure 9 Show ITLs per Storage Volume


This illustration shows the ITLs per Storage Volume. In this example the VPLEX
Cluster is a single engine and is connected to an active/active array with four paths
per Storage Volume per Director giving us a total of eight logical paths. The Show
ITLs panel displays the ports on the VPLEX Director from which the paths originate
and which FA they are connected to.
The proper output in the Show ITLs panel for an active/passive or ALUA supported
array would have double the count as it would also contain the logical paths for the
passive or non-preferred SP on the array.


Active/Passive and ALUA Arrays


Some arrays have architecture and implementation requirements that necessitate
special consideration. When using an active-passive or ALUA supported array, each
director needs to have logical (zoning and masking) and physical connectivity to both
the active and passive or non-preferred controllers. That way you will not lose access
to storage volumes if an active controller should fail. Additionally, arrays like the
CLARiiON have limitations on the size of initiator or storage groups. It may be
necessary to have multiple groups to accommodate provisioning storage to the
VPLEX. Adhere to logical and physical connectivity guidelines discussed earlier.

Figure 10 VS2 connectivity to Active/Passive and ALUA Arrays


Points to note: for each CLARiiON, each SP has a connection to each fabric, through
which each SP has connections to all VPLEX directors. The above example shows
Fabric A with SPA0 and SPB0 (even ports) and Fabric B with SPA3 and SPB3 (odd
ports) for dual fabric redundancy.
ALUA support allows for connectivity similar to Active/Passive arrays. VPLEX will
recognize the non-preferred path and refrain from using it under normal conditions.
A director with proper maximum path connectivity will show eight ITLs per director
but will only report four active paths.



When provisioning storage to VPLEX, ensure that mode 4 (ALUA) or mode 1 is set
during VPLEX initiator registration, prior to device presentation. Don't try to change it
after devices are already presented.
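As a hedged sketch of what that registration might look like on a CLARiiON/VNX using naviseccli: the SP address, initiator WWN, and port are hypothetical, and the exact flags (some releases also expect options such as -host) should be confirmed against the naviseccli documentation for your block OE release.

naviseccli -h <SPA_address> storagegroup -setpath -hbauid 50:00:14:42:xx:xx:xx:10 -sp a -spport 0 -failovermode 4 -arraycommpath 1 -o
(registers one VPLEX back-end port against SP A port 0 with failover mode 4, i.e. ALUA; use -failovermode 1 for active/passive behavior, and repeat for each VPLEX back-end port and SP port)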
Proper connectivity for active/passive and ALUA arrays can be handled in a couple of
ways. You have the option of a minimum configuration or a maximum configuration,
which amount to two or four active paths per Director per LUN, plus two or four
passive or non-preferred paths per Director per LUN. Each path is known as an ITL,
or Initiator/Target/LUN nexus. A minimum configuration is four paths (two active
and two passive/non-preferred), logical or physical, per Director for any given LUN,
and a maximum supported configuration is eight paths per Director per LUN for
active/passive and ALUA arrays.

Note: VNX2 will support Active/Active connectivity on classic LUNs, i.e. those not
bound in a pool. Pool LUNs are not supported in symmetrical Active/Active.
VPLEX does not support mixed-mode configurations.

The next set of diagrams depicts both a four-active-path-per-Director-per-LUN logical
configuration and a four-active-path-per-Director-per-LUN physical configuration.
Both are supported configurations, as are two-active-path-per-Director-per-LUN
configurations.
The commands used in the VPlexcli to determine the port WWNs and the ITLs used in
the following diagram are:

VPlexcli:/> ll **/hardware/ports

VPlexcli:/clusters/cluster-<cluster number>/storage-elements/storage-volumes/<storage volume name>> ll --full

As an example:


Figure 11 Backend port WWN identification


Running the long listing on the hardware/ports allows you to determine which WWN
is associated with which VPLEX backend port.


Figure 12 ITL association (highlighting the array port WWN and the VPLEX back-end port WWN in the listing)

From the storage-volumes context you can select a sample volume and cd to that
context. Running the ll --full command will show the ITLs.
In this example there are sixteen entries for this volume. This is a single-engine VPLEX
cluster connected to a VNX array. Even though this gives eight paths per director
for this volume, only four paths go to the array SP that owns the volume. In either
mode 1 or mode 4 (ALUA), the paths going to the other SP will not be used for I/O;
only in the case of a trespass will they become active.

Note: All paths, whether active or not, will perform device discovery during
an array rediscover. Over-allocating the number of paths beyond the
supported limits will have detrimental effects on performance and/or
back-end LUN provisioning.


Figure 13 1:1 Physical path configuration



This drawing was developed from the output from the two commands shown above.

Figure 14 Logical path configuration



A slight modification from the previous drawing helps illustrate the same concept
using only two VPLEX back-end ports per director. This gives the exact same
number of ITLs, and meets the maximum supported limit, as spreading across all four
ports.

Significant Bit
The above two illustrations show the significant bit, but there are other bit
considerations for identifying all possible ports. The following will help explain the
bit positions as they apply to the various modules on a CLARiiON/VNX.
The CLARiiON CX4 series supports many more SP ports. As such, the original method
of specifying the ports would cause an overlap between SP A high-end ports and SP B
low-end ports. That is, SPA9 would have the significant byte pair 69, which is SPB1.
The new method is as follows:
SPA0-7 and SPB0-7 are the same as the old method.
Port  SP A                     SP B
00    50:06:01:60:BB:20:02:07  50:06:01:68:BB:20:02:07
01    50:06:01:61:BB:20:02:07  50:06:01:69:BB:20:02:07
02    50:06:01:62:BB:20:02:07  50:06:01:6A:BB:20:02:07
03    50:06:01:63:BB:20:02:07  50:06:01:6B:BB:20:02:07
04    50:06:01:64:BB:20:02:07  50:06:01:6C:BB:20:02:07
05    50:06:01:65:BB:20:02:07  50:06:01:6D:BB:20:02:07
06    50:06:01:66:BB:20:02:07  50:06:01:6E:BB:20:02:07
07    50:06:01:67:BB:20:02:07  50:06:01:6F:BB:20:02:07



For the higher port numbers, byte 12 (the 12th hex digit) is changed to represent the higher ports:
0  ports 0-7
4  ports 8-15
8  ports 16-23
C  ports 24-31
The 8th hex digit cycles back through 0-7 for SP A and 8-F for SP B. So for ports 8-11 on SP A
and SP B we have:
Port  SP A                     SP B
08    50:06:01:60:BB:24:02:07  50:06:01:68:BB:24:02:07
09    50:06:01:61:BB:24:02:07  50:06:01:69:BB:24:02:07
10    50:06:01:62:BB:24:02:07  50:06:01:6A:BB:24:02:07
11    50:06:01:63:BB:24:02:07  50:06:01:6B:BB:24:02:07
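The pattern in the two tables above can be expressed as a small calculation. The following shell sketch reproduces it for this example array; the trailing array-specific bytes (BB:2x:02:07 here) are taken from the tables above, so substitute the values reported by your own array.

# Sketch: derive the CX4/VNX SP port WWN for SP "a" or "b" and port 0-31,
# following the hex-digit pattern shown in the tables above.
sp=a
port=9
group=$(( port / 8 ))                      # 0 -> ports 0-7, 1 -> 8-15, 2 -> 16-23, 3 -> 24-31
hi=$(printf '%X' $(( group * 4 )))         # 12th hex digit: 0, 4, 8 or C
if [ "$sp" = "a" ]; then base=0x60; else base=0x68; fi
lo=$(printf '%X' $(( base + port % 8 )))   # hex digits 7-8: 60-67 for SP A, 68-6F for SP B
echo "50:06:01:${lo}:BB:2${hi}:02:07"      # prints 50:06:01:61:BB:24:02:07 for SP A port 9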

Additional Array Considerations


Arrays, such as the Symmetrix, that do in-band management may require a direct
path from some hosts to the array. Such a direct path should be solely for the
purposes of in-band management. Storage volumes provisioned to the VPLEX should
never simultaneously be masked directly from the array to the host; otherwise there is
a high probability of data corruption. It may be best to dedicate hosts for in-band
management and keep them outside of the VPLEX environment.
Storage volumes provided by arrays must have a capacity that is a multiple of 4 KB.
Any volume that is not a multiple of 4 KB will not show up in the list of available
volumes to be claimed. If storage volumes that contain data are not a multiple of 4 KB,
those devices must first be migrated to a volume that is a multiple of 4 KB, and that
device then presented to VPLEX. The alternative is to use a host-based copy utility to
move the data to a new and unused VPLEX device.
Remember to reference the EMC Simple Support Matrix, Release Notes, and online
documentation for specific array configuration requirements. Remember to follow
array best practices for configuring devices to VPLEX.


Host Connectivity
Front-end/host initiator port connectivity

Dual fabric designs are considered a best practice.

The front-end I/O modules on each director should have a minimum of two physical connections, one to each fabric (required).

Each host should have at least one path to an A director and one path to a B director on each fabric, for a total of four logical paths (required for NDU).

Multipathing or path failover software is required at the host for access across the dual fabrics.

Each host should have fabric zoning that provides redundant access to each LUN from a minimum of an A and a B director from each fabric.

Four paths are required for NDU.

Observe director CPU utilization and schedule NDU for times when average director CPU utilization is below 50% (use the GUI Performance Dashboard in GeoSynchrony 5.1 or newer).

Skipping the NDU pre-checks would be required for host connectivity with less than four paths and is not considered a best practice.

NOTE: An RPQ will be required for single-attached hosts and/or environments that do not have redundant dual fabric configurations. More information is available in the Export Considerations section.

Note: Each initiator/target connection is called an IT nexus. Each VPLEX front-end
port supports up to 400 IT nexuses and, on VS2, each engine has a total of 8
front-end target ports. Dual- and quad-engine clusters provide additional
redundancy but do not increase the total number of initiator ports supported on a
per-cluster basis. For that matter, all listed limits in the Release Notes apply to all
VPLEX Cluster configurations equally, regardless of whether it is a single-, dual-, or
quad-engine configuration.


Additional recommendations
If more than one engine is available, spread I/O paths across engines as well as
directors.

Note: For cluster upgrades, when going from a single-engine to a dual-engine
cluster or from a dual- to a quad-engine cluster, you must rebalance
the host connectivity across the newly added engines. Adding additional
engines and then not connecting host paths to them is of no benefit.
Additionally, until further notice, the NDU pre-check will flag host
connectivity as a configuration issue if it has not been rebalanced. A
dual-to-quad upgrade will not flag an issue provided there were no issues
prior to the upgrade. You may choose to rebalance the workload across the
new engines or add additional hosts to the pair of new engines.
Complete physical connections to the VPLEX before commissioning/setup.
Use the same FE/BE ports on each director to avoid confusion, that is, B0-FC00 and
A0-FC00. Please refer to hardware diagrams for port layout.


Figure 15 Host connectivity for Single Engine Cluster meeting NDU pre-check requirements

This illustration shows dual HBAs connected to two fabrics, with each connecting to
two VPLEX directors on the same engine in the single-engine cluster. This is the
minimum configuration that would meet NDU requirements.
This configuration increases the chance of a read cache hit, improving performance.
Please refer to the Release Notes for the total FE port IT nexus limit.

Pros:

Meets the NDU requirement for single-engine configurations only

Cons:

A single engine failure could cause a DU (data unavailability) event

Figure 16 Host connectivity for HA requirements for NDU pre-checks dual or quad engine


The previous illustration shows host connectivity with dual HBAs connected to four
VPLEX directors. This configuration offers the increased level of HA required by the
NDU pre-checks. It could be expanded to additional ports on the VPLEX directors, or
to additional directors in a quad-engine configuration. This configuration still only
counts as four IT nexuses against the limit identified in the Release Notes for that
version of GeoSynchrony.
Pros:

Offers HA for hosts running load balancing software

Good design for hosts using load balancing multipath instead of path failover software

Director failure only reduces availability by 25%

Cons:

Reduces the probability of a read cache hit, potentially impacting performance initially until the cache chunk is duplicated in all servicing directors (applies to hosts using load balancing software)

Duplicating cache chunks into too many different directors consumes proportional amounts of overall system cache, reducing total cache capacity (applies to hosts using multipath load balancing software)

Figure 17 Host connectivity for HA quad engine

The previous illustration shows host connectivity with dual HBAs connected to four
VPLEX engines (eight directors). This configuration counts as eight IT nexuses
against the total limit defined in the Release Notes for that version of
GeoSynchrony. Hosts using active/passive path failover software should connect a
path to all available directors and manually load balance by selecting a different
director for the active path on different hosts.
Most host connectivity for hosts running load balancing software should follow the
recommendations for a dual-engine cluster. The hosts should be configured across
two engines, and the hosts should alternate between pairs of engines, effectively
load balancing the I/O across all engines.
Pros:

Offers the highest level of Continuous Availability for hosts running load balancing software

Best practice design for hosts using path failover software

Director failure only reduces availability by 12.5%

Allows for N-1 director failures while always providing access to data as long as 1 director stays online

Cons:

Reduces the probability of a read cache hit, potentially impacting performance initially until the cache chunk is duplicated in all servicing directors (applies to hosts using load balancing software)

Duplicating cache chunks into too many different directors consumes proportional amounts of overall system cache, reducing total cache capacity (applies to hosts using multipath load balancing software)

Consumes double the IT nexus count against the system limit identified in the Release Notes as compared to the dual-engine configuration; most hosts should be attached to four directors at most unless absolutely necessary


Host Cluster cross-connect

Figure 18 Host Cluster connected across sites to both VPLEX Clusters

Cluster cross-connect applies only to the specific host OS and multipathing configurations listed in the VPLEX ESSM.

Host initiators are zoned to both VPLEX clusters in a Metro.

Host multipathing software can be configured for active path/passive path, with the active path going to the local VPLEX cluster. When feasible, configure the multipathing driver to prefer all local cluster paths over remote cluster paths.

Separate HBA ports should be used for the remote cluster connection to avoid merging of the local and remote fabrics.

Connectivity at both sites follows the same rules as single host connectivity.

Supported stretch clusters can be cluster cross-connected (please refer to the VPLEX ESSM).

Cluster cross-connect is limited to a VPLEX cluster separation of no more than 1 ms latency.

Cluster cross-connect requires the use of VPLEX Witness.

VPLEX Witness works with Consistency Groups only.

Cluster cross-connect must be configured when using VPLEX Distributed Devices only.

Cluster cross-connect is supported in a VPLEX Metro synchronous environment only.

At least one back-end storage array is required at each site, with redundant connections to the VPLEX cluster at that site. Arrays are not cross-connected to each VPLEX cluster.

All Consistency Groups used in a cluster cross-connect are required to have the auto-resume attribute set to true.
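A sketch of setting that attribute from the VPLEX CLI follows. The consistency group name is hypothetical, and the exact attribute name and context path (shown here as the advanced context's auto-resume-at-loser property) should be confirmed against the CLI guide for your GeoSynchrony release.

VPlexcli:/> cd /clusters/cluster-1/consistency-groups/cg_crossconnect_vms
VPlexcli:/clusters/cluster-1/consistency-groups/cg_crossconnect_vms> set advanced::auto-resume-at-loser true
VPlexcli:/clusters/cluster-1/consistency-groups/cg_crossconnect_vms> ll advanced
(verify that auto-resume-at-loser now reports true)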

The unique solution provided by cluster cross-connect requires that hosts have access to
both data centers. The latency requirements for cluster cross-connect can be achieved
using an extended fabric or fabrics that span both data centers. The use of backbone
fabrics and LSAN zones may introduce additional latency, preventing a viable use of
cluster cross-connect. The RTT for cluster cross-connect must be within 1 ms.
PowerPath/VE 5.8 supports auto-standby; it is documented in the release notes
and CLI guide.

Other PowerPath operating systems that support the auto-standby feature are:

Windows

Linux (SUSE and RHEL)

Solaris

HP-UX

The only thing the customer has to do is enable the auto-standby feature:
#powermt set autostandby=on trigger=prox host=xxx
PowerPath will take care of setting to auto-standby those paths associated with the
remote/non-preferred VPLEX cluster.
PowerPath groups the paths by VPLEX cluster, and the one with the lowest minimum path
latency is designated as the local/preferred cluster.
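Once enabled, the result can be checked from the host with the standard PowerPath display commands; a minimal sketch (output layout varies by PowerPath release):

#powermt display options
(confirm that auto-standby is enabled with the proximity trigger)
#powermt display dev=all
(paths to the remote, non-preferred VPLEX cluster should now report an auto-standby mode)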

HP-UX MPIO
The HP-UX MPIO policy least-command-load performs better than simple round-robin.



Cluster cross-connect for VMWare ESXi
Note: We encourage any customer moving to a VPLEX Metro to move
to ESXi 5.0 Update 1 to benefit from all the HA enhancements in ESX 5.0,
as well as the APD/PDL handling enhancements provided in Update 1.

Applies to vSphere 4.1 and newer and the VPLEX Metro Spanned SAN configuration.

The HA/DRS cluster is stretched across the sites. This is a single HA/DRS cluster with ESXi hosts at each site.

A single standalone vCenter will manage the HA/DRS cluster.

The vCenter host will be located at the primary data center.

The HA/VM/Service Console/vMotion networks should use multiple NIC cards on each ESX host for redundancy.

The latency limitation of 1 ms is applicable to both the Ethernet networks and the VPLEX FC WAN networks.

The ESXi servers should use internal disks or local SAN disks for booting. The Distributed Device should not be used as a boot disk.

All ESXi host initiators must be registered as the default type in VPLEX.

VPLEX Witness must be installed at a third location, isolating it from failures that could affect the VPLEX clusters at either site.

It is recommended to place the VM in the preferred site of the VPLEX distributed volume (that contains the datastore).

In case of a Storage Volume failure or a BE array failure at one site, VPLEX will continue to operate with the site that is healthy. Furthermore, if a full VPLEX failure or WAN COM failure occurs and the cluster cross-connect is operational, then these failures will be transparent to the host.

Create a common storage view for ESX nodes on site 1 on VPLEX cluster-1.

Create a common storage view for ESX nodes on site 2 on VPLEX cluster-2.

All Distributed Devices common to the same set of VMs should be in one consistency group.

All VMs associated with one consistency group should be co-located at the same site, with the bias set on the consistency group to that site.

If using ESX Native Multi-Pathing (NMP), make sure to use the Fixed policy, and make sure the path(s) to the local VPLEX is the primary path(s) and the path(s) to the remote VPLEX are stand-by only (a CLI sketch follows this list).

vMSC is supported for both non-uniform and uniform (cross-connect) configurations.
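For the NMP Fixed-policy item above, the following esxcli sketch (ESXi 5.x syntax; the naa identifier and the vmhba path are hypothetical placeholders) shows how the policy and the preferred path can be set per device:

esxcli storage nmp device set --device naa.6000144000000010xxxxxxxxxxxxxxxx --psp VMW_PSP_FIXED
(assign the Fixed path selection policy to the VPLEX distributed volume)
esxcli storage nmp psp fixed deviceconfig set --device naa.6000144000000010xxxxxxxxxxxxxxxx --path vmhba2:C0:T1:L10
(make a path to the local VPLEX cluster the preferred path; the remaining paths act as stand-by)
esxcli storage nmp device list --device naa.6000144000000010xxxxxxxxxxxxxxxx
(verify the policy and the configured preferred path)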

For additional information please refer to the white paper Using VMware vSphere with
EMC VPLEX Best Practices Planning, found on PowerLink.


Path loss handling semantics (PDL and APD)


vSphere can recognize two different types of total path failures to an ESXi 5.0 u1
server. These are known as "All Paths Down" (APD) and "Persistent Device Loss"
(PDL). Either of these conditions can be declared by the ESXi server depending on the
failure condition.

Persistent device loss (PDL)


This is a state that is declared by an ESXi server when a SCSI sense code (2/4/3+5) is
sent from the underlying storage array (in this case a VPLEX) to the ESXi host,
informing the ESXi server that the paths can no longer be used. This condition can be
caused if the VPLEX suffers a WAN partition, causing the storage volumes at the
non-preferred location to suspend. If this happens, the VPLEX will send the PDL
SCSI sense code (2/4/3+5) to the ESXi server from the site that is suspending (i.e., the
non-preferred site).

Figure 19 Persistent device loss process flow

All paths down (APD)



This is a state where all the paths to a given volume have gone away (for whatever
reason) but no SCSI sense code can be sent by the array (e.g., VPLEX), or alternatively
nothing is received by the ESXi server. An example of this would be a dual fabric
failure at a given location causing all of the paths to be down. In this case no SCSI
sense code will be generated or sent by the underlying storage array, and even if it
were, the signal would not be received by the host since there is no connectivity.
Another example of an APD condition is a full VPLEX cluster failure (unlikely, as there
are no SPOFs). In this case a SCSI sense code cannot be generated since the storage
hardware is unavailable, and thus the ESXi server will detect the problem on its own,
resulting in an APD condition. ESXi versions prior to vSphere 5.0 Update 1 could not
distinguish between an APD and a PDL condition, causing VMs to become
non-responsive rather than automatically invoking an HA failover (i.e., if the VPLEX
suffered a WAN partition and the VMs were running on the non-preferred site).
Clearly this behavior is not desirable when using vSphere HA with VPLEX in a
stretched cluster configuration. This behavior changed in vSphere 5.0 Update 1, since
the ESXi server is now able to receive and act on a 2/4/3+5 sense code if it is received
and declare PDL; however, additional settings are required to ensure the ESXi host acts
on this condition.
The settings that need to be applied to vSphere 5.0 Update 1 deployments (and
beyond, including vSphere 5.1) are:
1. Use vSphere Client and select the cluster, right-click and select Edit Settings. From
the pop-up menu, click to select vSphere HA, then click Advanced Options. Define and
save the following option:
das.maskCleanShutdownEnabled=true
2. On every ESXi server, create and edit (with vi) /etc/vmware/settings with the
content below, then reboot the ESXi server. The following output shows the correct
setting applied in the file:
~ # cat /etc/vmware/settings
disk.terminateVMOnPDLDefault=TRUE
Refer to the ESXi documentation for further details and the whitepaper found here:

http://www.vmware.com/files/pdf/techpaper/vSPHR-CS-MTROSTOR-CLSTR-USLET-102-HI-RES.pdf

Note: vSphere and ESXi 5.1 introduce a new feature called APD timeout.
This feature is automatically enabled in ESXi 5.1 deployments and, while not
to be confused with PDL states, does carry an advantage: if both fabrics to the
ESXi host or an entire VPLEX cluster fails, the host (which would normally
hang, also known as a VM zombie state) is now able to respond to non-storage
requests, since "hostd" will effectively disconnect the unreachable storage.
However, this feature does not cause the affected VM to die. Please see this
article for further details:
http://www.vmware.com/files/pdf/techpaper/Whats-New-VMware-vSphere-51-Storage-Technical-Whitepaper.pdf
It is expected that, since VPLEX uses a non-uniform architecture, this
situation should never be encountered on a VPLEX Metro cluster.
As discussed above, vSphere HA does not automatically recognize a SCSI PDL (Persistent Device Loss) state as one that should cause a VM to invoke an HA failover.
Clearly, this may not be desirable when using vSphere HA with VPLEX in a stretched
cluster configuration. Therefore, it is important to configure vSphere so that if the
VPLEX WAN is partitioned and a VM happens to be running at the non-preferred site
(i.e., the storage device is put into a PDL state), the VM recognizes this condition and
invokes the steps required to perform a HA failover.
ESX and vSphere versions prior to version 5.0 update 1 have no ability to act on a SCSI
PDL status and will therefore typically hang (i.e., continue to be alive but in an
unresponsive state). However, vSphere 5.0 update 1 and later do have the ability to
act on the SCSI PDL state by powering off the VM, which in turn will invoke a HA
failover. To ensure that the VM behaves in this way, additional settings within the
vSphere cluster are required.
At the time of this writing the settings are:
1. Use vSphere Client and select the cluster, right-click and select Edit Settings. From the pop-up menu, click to select vSphere HA, then click Advanced Options. Define and save the option:
das.maskCleanShutdownEnabled=true
2. On every ESXi server, edit /etc/vmware/settings with the content below, then reboot the ESXi server.
The following output shows the correct setting applied in the file:
~ # cat /etc/vmware/settings
disk.terminateVMOnPDLDefault=TRUE
Refer to the ESX documentation for further details.


VBLOCK and VPLEX Front End Connectivity Rules


Note: All rules in BOLD cannot be broken. Rules in italics can be adjusted depending on customer requirements; if there are no special requirements, simply use the suggested rule.
1. Physical FE connectivity
a. Each VPLEX Director has 4 front-end ports: 0, 1, 2 and 3. In all cases even ports connect to fabric A and odd ports to fabric B.
   i. For a single VBLOCK connecting to a single VPLEX:
      - Only ports 0 and 1 are used on each director; ports 2 and 3 are reserved.
      - Connect even VPLEX front-end ports to fabric A and odd ports to fabric B.
   ii. For two VBLOCKs connecting to a single VPLEX:
      - Ports 0 and 1 are used for VBLOCK A.
      - Ports 2 and 3 are used for VBLOCK B.
      - Connect even VPLEX front-end ports to fabric A and odd ports to fabric B.
2. ESX Cluster Balancing across VPLEX Frontend
All ESX clusters are evenly distributed across the VPLEX front end in the following patterns:

Single Engine
Engine #    Director    Cluster #
Engine 1    A           1,2,3,4,5,6,7,8
            B           1,2,3,4,5,6,7,8

Dual Engine
Engine #    Director    Cluster #
Engine 1    A           1,3,5,7
            B           2,4,6,8
Engine 2    A           2,4,6,8
            B           1,3,5,7

Quad Engine
Engine #    Director    Cluster #
Engine 1    A           1,5
            B           2,6
Engine 2    A           3,7
            B           4,8
Engine 3    A           4,8
            B           3,7
Engine 4    A           2,6
            B           1,5

3. Host / ESX Cluster rules
a. Each ESX cluster must connect to a VPLEX A and a B director.

b. For dual and quad configs, A and B directors must be picked from different engines (see table above for recommendations).
c. The minimum number of directors that an ESX cluster connects to is 2 VPLEX directors.
d. The maximum number of directors that an ESX cluster connects to is 2 VPLEX directors.
e. Any given ESX cluster connecting to a given VPLEX cluster must use the same VPLEX front-end ports for all UCS blades, regardless of host / UCS blade count.
f. Each ESX host should see four paths to the same datastore:
   i. 2 across fabric A
      - A VPLEX A Director port 0 (or 2 if second VBLOCK)
      - A VPLEX B Director port 0 (or 2 if second VBLOCK)
   ii. 2 across fabric B
      - The same VPLEX A Director port 1 (or 3 if second VBLOCK)
      - The same VPLEX B Director port 1 (or 3 if second VBLOCK)

4. Pathing policy
a. For non-cross-connected configurations, the adaptive pathing policy is recommended in all cases. Round robin should be avoided, especially for dual and quad systems.
b. For cross-connected configurations, fixed pathing should be used and preferred paths set per datastore to the local VPLEX path only, taking care to alternate and balance across the whole VPLEX front end (i.e. so that all datastores are not sending I/O to a single VPLEX director).
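
To illustrate the balancing described in rule 4b, the following minimal Python sketch (illustrative only; it is not part of any VPLEX or vSphere tooling) spreads per-datastore preferred paths round-robin across the local VPLEX directors so that no single director carries all of the fixed-path I/O. The director and datastore names are hypothetical.

from itertools import cycle

def assign_preferred_directors(datastores, local_directors):
    # Round-robin each datastore's preferred (fixed) path onto a different
    # local director, per the balancing guidance in rule 4b above.
    rotation = cycle(local_directors)
    return {ds: next(rotation) for ds in datastores}

datastores = ["datastore-%d" % i for i in range(1, 9)]
directors = ["director-1-1-A", "director-1-1-B", "director-1-2-A", "director-1-2-B"]
for ds, director in assign_preferred_directors(datastores, directors).items():
    print(ds, "->", director)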


Storage View Considerations


Storage Views provide the framework for masking Virtual Volumes to hosts. They
contain three sets of objects:
1. Virtual Volumes
2. VPLEX Frontend Ports
3. Host Initiators

The combination of a VPLEX Frontend Port and a Host Initiator is called an IT Nexus (Initiator/Target Nexus). An IT Nexus can only access a single Storage View. If a host requires access to another Storage View with the same set of initiators, then the initiators must be zoned to other VPLEX frontend ports, and those IT Nexus combinations would be used for access to the new Storage View. The NDU requirements must be met for each Storage View independently, even if the Host Initiator and frontend connectivity meet the requirements for a different Storage View. Virtual Volumes can be placed in multiple Storage Views, which offers additional options for architecting different solutions based on unique customer needs.

Best practices

Single Host
- Create a separate storage view for each host and then add the volumes for that host to only that view.
- For redundancy, at least two initiators from a host must be added to the storage view.

Clustered Hosts
- Create a separate storage view for each node for private volumes such as boot volumes.
- Create new pairs of IT Nexus for node HBA initiators on different VPLEX Frontend Ports.
- Create a Storage View for the cluster that contains the shared Virtual Volumes and add IT Nexus pairs that are different than those for the private volume Storage Views.

As mentioned above, there are additional options for Clustered Hosts. This specific option may consume more IT Nexuses against the system limits, but it allows for a single Storage View for shared volumes. This minimizes the possibility of user error when adding or removing shared volumes.


Fan In / Fan Out Ratios


The published system limits are based on what has been tested and qualified to work for that specific code level and GeoSynchrony feature (Local, Metro or Geo). Always refer to the Release Notes for the code level that the solution is being architected to. All cluster limits identified in the Release Notes apply equally to the single, dual or quad engine cluster. Upgrading a cluster to increase engine count applies to performance considerations only and does not increase the defined limits.

Note: ESX Initiators apply to the physical server, not the individual
VMs.

Fan out (Backend connectivity):


Rule number one, all directors must see all storage equally. This means that all devices
provisioned from all arrays must be presented to all directors on the VPLEX cluster
equally. Provisioning some storage devices to some directors and other storage
devices to other directors on VPLEX is not supported. All VPLEX directors must have
the exact same view of all the storage.
Rule number two, all directors must have access to the same storage devices over both
fabrics. Presenting to both fabrics provides the redundancy necessary
to survive a fabric event that could take an entire fabric down without causing the
host to lose access to data. Additionally, the NDU process tests for redundancy across
two backend ports for each director per Storage Volume.
Rule number three, a device must not be accessed by more than four active paths on a
given director. The limit is based on logical count of 4 paths per Storage Volume per
Director not physical port count. An ALUA supported array would have eight paths
per Storage Volume per Director as only four of those paths would be active at any
given time. Four paths are optimal as VPLEX will Round Robin across those four
paths on a per director basis.

Note: It is very important not to exceed the 4 paths per Storage Volume per Director, as exceeding it could cause a DU (data unavailability) event under certain fabric failures while VPLEX tries to time out the paths. Host platforms will weather the storm if connectivity is within supported limits, but an excessive number of paths per Storage Volume per Director could cause extreme timeouts on the host, causing the host to fail the device and resulting in a DU event.
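
To make the path accounting concrete, here is a small Python sketch (illustrative arithmetic only, not VPLEX tooling) that checks a planned back-end layout against the 4-active-paths-per-Storage-Volume-per-Director rule above, including the ALUA case where only half of the presented paths are active at any given time.

def active_paths_per_director(paths_presented, alua=False):
    # For an ALUA array, only half of the presented paths are active at any
    # given time (e.g. 8 presented -> 4 active), as described above.
    return paths_presented // 2 if alua else paths_presented

def within_backend_limit(paths_presented, alua=False, limit=4):
    return active_paths_per_director(paths_presented, alua) <= limit

print(within_backend_limit(4))              # True  - optimal layout, round-robined by VPLEX
print(within_backend_limit(8, alua=True))   # True  - 8 presented, 4 active
print(within_backend_limit(8, alua=False))  # False - exceeds the limit; risks a DU event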



Theoretically, based on rule number three of 4 paths per Storage Volume per Director and the Storage Volume limit of 8,000, each port would have to support 8,000 devices on the VS2 hardware, but only a fraction of the I/O going to a given device will be on any one port for that director. This is only an example to illustrate the concept.
The access to these devices for I/O is controlled by the host attach rules, which means that even though you could have up to 8,000 devices per port on the VS2 hardware, as per this example, you will only be accessing a fraction of those devices at any given time. That ratio of total device attach vs. access is directly proportional to the number of engines in the cluster and which directors the hosts are actually attached to.
Whether you have a single, dual or quad engine configuration, the rules mentioned
previously still apply. The different number of engines provides increasing points of
access for hosts thereby spreading the workload over more directors, and possibly,
more engines.
The VPLEX limits for IT nexus (Initiator - Target nexus) per port are as follows. You
cannot exceed 256 IT nexus per backend port or 400 per frontend port. (Please refer to
the Release Notes for the specific limits based on GeoSynchrony level). This means a
total of 256 array ports can be zoned to the initiator of the backend port on VPLEX and
a total of 400 host initiators can be zoned to any individual frontend port on VPLEX
(VPLEX Local or Metro only as an example as limits for Geo are much less).

Note: The IT nexus limit was increased to 400 with GeoSynchrony 5.1 patch 2
for frontend port connectivity in conjunction with increasing the IT nexus
limit to 3,200 for the VPLEX Local and VPLEX Metro solutions. Please refer to
Release Notes for specific support limits.
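
As a simple illustration of applying these per-port limits when planning zoning, here is a minimal Python sketch; the numbers are the VPLEX Local/Metro figures quoted above, and the Release Notes for your GeoSynchrony level remain the authority on the actual limits.

# Per-port IT-nexus limits quoted above; confirm against the Release Notes
# for the GeoSynchrony level being deployed.
BACKEND_IT_NEXUS_LIMIT_PER_PORT = 256   # array ports zoned to one back-end port
FRONTEND_IT_NEXUS_LIMIT_PER_PORT = 400  # host initiators zoned to one front-end port

def backend_port_within_limit(array_ports_zoned):
    return array_ports_zoned <= BACKEND_IT_NEXUS_LIMIT_PER_PORT

def frontend_port_within_limit(host_initiators_zoned):
    return host_initiators_zoned <= FRONTEND_IT_NEXUS_LIMIT_PER_PORT

print(backend_port_within_limit(64))    # True  - well within the back-end limit
print(frontend_port_within_limit(450))  # False - exceeds the front-end limit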

Fan in (Host connectivity)


Rule number one: host HBAs should be connected to fabric A and fabric B for HA. The dual fabric connectivity rule for storage applies here for the exact same reason.
Rule number two: host HBAs should be connected to an A director and a B director on both fabrics, as required by NDU pre-checks. This totals a minimum of four paths altogether. For dual or quad engine configurations, the initiators must span engines for Director A and Director B connectivity.
VPLEX caching algorithms allow "chunks" of cache to be transferred between directors via the internal Local COM. In addition to cache transfers within the cluster, cache chunks are also transferred between clusters in a VPLEX Metro or VPLEX Geo configuration over the WAN COM. With NDU requirements in mind, you can
potentially optimize performance by selecting directors on the same engine for the
director A and B connections or optimize HA by selecting an A director and a B
director on separate engines, if available, so that you will survive a complete engine
failure.
When considering attaching hosts running load balancing software to more than two
engines in a large VPLEX configuration, performance and scalability of the VPLEX
complex should be considered. This caution is provided for the following reasons:

- Having a host utilize more than the required number of directors can increase cache update traffic among the directors
- It decreases the probability of read cache hits
- Considerations for multipath software limits must be observed
- Impact to availability is proportional to path provisioning excess:
  o More paths increase device discovery and handshaking
  o High fabric latencies increase the chance that the host will fail

Based on the reliability and availability characteristics of VPLEX hardware, attaching a host to two engines provides a continuously available configuration without unnecessarily impacting performance and scalability of the solution.
Hosts running multipath failover software such as ESX with NMP should connect to every director in the VPLEX cluster, selecting a different director for the active path for each node in the ESX cluster. The exception might be Host Cross-Connect in a Metro utilizing quad engine clusters at both sites; this would result in a total of 16 paths from the server to the LUN. ESX allows 8 paths of connectivity to a maximum of 256 LUNs. If 16 paths were configured, the total number of supported LUNs would be reduced to 128.
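
A quick back-of-the-envelope check of that path budget, as a Python sketch. The 2,048-path budget used here is simply what the figures above imply (8 paths x 256 LUNs = 16 paths x 128 LUNs); always confirm the real maximums in the VMware configuration limits for your ESXi release.

# Implied host path budget from the figures quoted above.
HOST_PATH_BUDGET = 8 * 256  # 2048 paths

def max_luns(paths_per_lun):
    # Maximum LUNs presentable to the host for a given number of paths per LUN.
    return HOST_PATH_BUDGET // paths_per_lun

print(max_luns(8))   # 256 LUNs with 8 paths each
print(max_luns(16))  # 128 LUNs with 16 paths each (e.g. cross-connect to quad engine clusters)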

DMP Tuning Parameters


dmp_lun_retry_timeout
Specifies a retry period for handling transient errors. When all paths to a disk fail, there may be certain paths that have a temporary failure and are likely to be restored soon. If I/Os are not retried for a period of time, the I/Os may be failed to the application layer even though some paths are only experiencing a transient failure. The DMP tunable dmp_lun_retry_timeout can be used for more robust handling of such transient errors. If the tunable is set to a non-zero value, I/Os to a disk with all failed paths will be retried until the specified dmp_lun_retry_timeout interval or until the I/O succeeds on one of the paths, whichever happens first.


The default value of the tunable is 0, which means that the paths are probed only once.
You should also apply the following settings if they are not already the defaults on your hosts. Set the vxdmp settings:
vxdmpadm setattr enclosure emc-vplex0 recoveryoption=throttle iotimeout=30
vxdmpadm setattr enclosure emc-vplex0 dmp_lun_retry_timeout=60
Port utilization must be the deciding factor in determining which port to attach to.
VPLEX provides performance monitoring capabilities which will provide information
for this decision process. Please refer to the GeoSynchrony Release Notes for the
specific limits for that level of code.
GeoSynchrony 5.0 and later also supports cross-connect in a Metro environment. This
is a unique HA solution utilizing distributed devices in consistency groups controlled
by a VPLEX Witness. Please refer to the Cluster Cross-connect section for
additional details.
Host environments that predominantly use path failover software instead of load
balancing software should observe system loads based on the active path and balance
the active paths across directors so the overall VPLEX director CPU utilization is
balanced.
In conclusion, it is more important to observe port utilization and defined limits than
to simply say X number of connections per port.


Network Best Practices


This section provides guidance around the network best practices for all components
within the VPLEX family of products. For the purpose of clarification, there are
several parts that will be covered.

- Management network
- Management server to management server Virtual Private Network (VPN)
- VPLEX Witness to management servers VPN
- Director to Director communications
- Local cluster communications between directors
- Remote cluster communications between directors


Note: The network best practices section does not cover host cluster network
considerations as VPLEX does not provide solutions directly for host stretch cluster
networks. Host stretch clusters may require layer 2 network connectivity so please
refer to host requirements and network vendor products that provide the ability to
separate cluster nodes across datacenters. Even though this document does not
provide for host cluster network requirements they are still an integral part of the
overall solution.
For the purpose of defining the terminology used, director-to-director communication refers to the intra-cluster connectivity (within the cluster) as well as the inter-cluster connectivity (between the clusters). For the purpose of clarification, VPLEX-Local only has intra-cluster communications to be considered. VPLEX-Metro and VPLEX-Geo have, in addition, inter-cluster communications that use different carrier media. VPLEX-Metro uses Fibre Channel connectivity and supports switched fabric, DWDM and FCIP protocols. The inter-cluster communications for VPLEX-Metro will be referred to as FC WAN COM, FC WAN, or WAN COM. The product CLI displays both as WAN COM but identifies the ports with FC or XG for Fibre Channel or 10 Gigabit Ethernet.

Note: VPLEX Metro supports 10 GigE WAN COM connectivity but will
continue to follow all current limits of latency and redundancy for synchronous
write through mode.

VPLEX-Geo utilizes Ethernet communications for inter-cluster communications
referred to as WAN COM. VS2 hardware has to be ordered with the appropriate
hardware for VPLEX-Metro utilizing the FC WAN COM and four port Fibre Channel
I/O modules or configured for VPLEX-Geo utilizing the dual port 10 GigE modules
for WAN COM.
Note: Reconfiguration of the hardware I/O module of the WAN COM after installation
is currently not supported. Please submit an RPQ if necessary.

VPLEX Local
The only network requirements for a VPLEX Local are for management access. Access
requires configuring the IP Address for eth3 on the management server and
connecting the management server to the customer network.
The eth3 port on the management server is configured for Auto Negotiate and is 1
GigE capable. Even though the primary use is for management, file transfer is very
important, and proper speed and duplex connectivity is imperative for transferring collect-diagnostics or performing NDU package transfers. If file transfers appear extremely slow, it would be prudent to check the network switch to make sure that the connected port isn't set to 100/full duplex, as no auto-negotiation would happen and the network connectivity would default to 100/half duplex. This situation could cause file transfers to take hours. Please don't assume that you won't use this port for file transfer, as it will be used in a Metro or Geo NDU, as well as when performing the NDU remotely via ESRS.
For some very specific installations such as government dark sites, it may be required
that the VPLEX Cluster not be connected to the network for security reasons. In this
situation, it is still required to configure an IP Address for eth3 but a non-routable
address such as 192.168.x.x can be used in order to proceed with the installation. The
IP Address can be changed at a later date if necessary but a specific procedure is
required to reconfigure ipsec when attempting to change the IP Address.
Management of the cluster can be performed via the service port. This only applies to
the VPLEX Local solution as both VPLEX Metro and VPLEX Geo require VPN
connectivity between clusters over the management server's eth3 port.

Note: VPLEX only supports IPv4 for GeoSynchrony releases 5.2 and older. IPv6 is
available with GeoSynchrony 5.3 and newer.


Intra-Cluster Local Communications (Local COM)


Intra-Cluster Local Communications apply to all variations of VPLEX. This is the
director to director communications within the VPLEX cluster. Intra-Cluster Local
Communications are completely private to the Cluster and will never be connected to
the customer SAN. The I/O module ports are referred to as Local COM. The Local
COM communications are based on dual redundant Fibre Channel networks that
include two SAN switches internal to the cluster for dual-engine and quad-engine
configurations. A single-engine cluster simply connects corresponding Local COM
ports together with Fibre Channel cables direct connected.

Virtual Private Network Connectivity (VPN)


VPN connectivity applies to VPLEX Metro and VPLEX Geo. The VPN is configured between management servers as well as to the optional VPLEX Witness server. There are two distinct VPN networks to consider: management-server to management-server VPN, and VPLEX Witness to management-server VPN. These VPNs serve different purposes and therefore have different requirements.
The VPLEX Witness VPN has the sole purpose of establishing and maintaining communications with both management servers. This is for the purpose of proper failure identification. For additional details on VPLEX Witness, please see the VPLEX Witness section.
Management server to management server VPN is used to establish a secure routable connection between management servers so both clusters can be managed from either management server as a single entity.
VPN connectivity for management communications
- Requires a routable/pingable connection between the management servers for each cluster.
- The best practice for configuring the VPN is to follow the installation guide and run the automated VPN configuration script.
- Network QoS requires that the link latency does not exceed 1 second (not millisecond) for management server to VPLEX Witness server.
- Network QoS must be able to handle file transfers during the NDU procedure.
- Static IP addresses must be assigned to the public ports on each Management Server (eth3) and the public port in the Cluster Witness Server. If these IP addresses are in different subnets, the IP Management network must be able to route packets between all such subnets.

NOTE: The IP Management network must not be able to route to the following
reserved VPLEX subnets: 128.221.252.0/24, 128.221.253.0/24, and 128.221.254.0/24.

The following protocols need to be allowed on the firewall (both in the outbound and inbound filters):
- Internet Key Exchange (IKE): UDP port 500
- NAT Traversal in the IKE (IPsec NAT-T): UDP port 4500
- Encapsulating Security Payload (ESP): IP protocol number 50
- Authentication Header (AH): IP protocol number 51

The following ports are used as well:
Port                     Function                                           Service
Public port TCP/22       Log in to the management server OS, copy files     SSH
Service port TCP/22      to and from the management server using the SCP
                         sub-service, and establish SSH tunnels
Public port TCP/50       IPsec VPN                                          ESP
Public port UDP/500                                                         ISAKMP
Public port UDP/4500                                                        IPsec NAT traversal
Public port UDP/123      Time synchronization service                       NTP
Public port TCP/161      Get performance statistics                         SNMP
Public port UDP/161
Public port TCP/443      Web access to the VPLEX and RP Management          HTTPS
Service port TCP/443     Consoles graphical user interface
Localhost TCP/5901       Access to the management server's desktop. Not     VNC
                         available on the public network. Must be
                         accessed through the SSH tunnel.
Localhost TCP/49500      VPlexcli. Not available on the public network.     Telnet
                         Must be accessed through SSH.
Public port 389          Lightweight Directory Access Protocol              LDAP
Public port 636          LDAP using TLS/SSL (was sldap)                     ldaps
Public port 7225         Protocol for communicating with the RecoverPoint   RecoverPoint
                         functional API
Public port 80           Web server for RecoverPoint management (TCP)       HTTP

Table 1 VPLEX and RecoverPoint port usage


The IP Management network must be capable of transferring SSH traffic between the Management Servers and the Cluster Witness Server. The following port must be open on the firewall:
- Secure Shell (SSH) and Secure Copy (SCP): TCP port 22


VPLEX Metro
Cluster connectivity
VPLEX Metro connectivity is defined as the communication between clusters in a
VPLEX Metro. The two key components of VPLEX Metro communication are FC
(FCIP, DWDM) or 10 GigE and VPN between management servers. Please refer to
the VPN section for details. FC WAN is the Fibre Channel connectivity and 10 GigE is
the Ethernet connectivity options between directors of each cluster. Choose one or the
other but not both.

FC WAN connectivity for inter-cluster director communication
- Latency must be less than or equal to 5 ms RTT.
- Each director's FC WAN ports must be able to see at least one FC WAN port on every other remote director (required).
- The director's local COM port is used for communications between directors within the cluster.
- Independent FC WAN links are strongly recommended for redundancy.
- Each director has two FC WAN ports that should be configured on separate fabrics (or FC/IP external hardware) to maximize redundancy and fault tolerance, so that a single external VPLEX failure does not cause a full WAN communications failure to a single director.
- Configure FC WAN links between clusters like ISLs between FC switches. This does not require a merged fabric between locations.
- Logically isolate VPLEX Metro traffic from other traffic using zoning, VSANs or LSAN zones. Additionally, physical isolation from the SAN can be achieved by connecting to separate switches used to connect the data centers, without requiring connection to the local SAN.

Metro over IP (10Gb/E)
- Latency must be less than or equal to 5 ms RTT.
- Must follow all the network requirements outlined in the Geo section.
- Configured and monitored using the same tools as the Geo solution.
- Cache must be configured for synchronous write-through mode only.

FC WAN connectivity utilizes Fibre Channel with standard synchronous distance
limitations in the initial release. Considerations for Fibre Channel include
latency/round trip conditions and buffer-to-buffer credits as well as the BB_credits
applied to distance. An excellent source for additional information is the EMC
Symmetrix Remote Data Facility (SRDF) Connectivity Guide or EMC Networked Storage
Topology Guide, available through E-Lab Interoperability Navigator at:
http://elabnavigator.EMC.com.
Latency/roundtrip conditions
Latency is generally referred to in milliseconds (ms) as the combined roundtrip time (RTT) between the local and remote clusters. An FC frame by itself takes approximately 1 ms to traverse a one-way distance of 100 km from primary transmitter to secondary receiver.
For example, if two locations are 100 km apart, since standard Fibre Channel protocol
requires two roundtrips for a write I/O, then 4 ms of latency (2 x RTT) will be added
to the write operation. As more network components are attached to the
configuration for pure Fibre Channel environments, latency will naturally increase.
This latency can be caused by network components such as host initiators, switches,
fiber optics, and distance extension devices, as well as factors such as cable purity.
The VPLEX application layer will contribute additional delays on top of the network.
The supported network round trip latency is <=5 ms for VPLEX Metro.
Using performance monitoring tools, the roundtrip time can be determined for
troubleshooting any WAN latency issues within the network.
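
As a worked example of this arithmetic, here is a minimal Python sketch (illustrative only) using the approximations quoted above: roughly 1 ms of one-way propagation per 100 km and two protocol round trips per write. It ignores switch, distance-extension, and VPLEX application-layer overheads.

def added_write_latency_ms(distance_km, one_way_ms_per_100km=1.0, round_trips_per_write=2):
    # One-way propagation time, per the ~1 ms per 100 km approximation above.
    one_way_ms = (distance_km / 100.0) * one_way_ms_per_100km
    rtt_ms = 2 * one_way_ms
    # Standard Fibre Channel write protocol requires two round trips, as stated above.
    return round_trips_per_write * rtt_ms

print(added_write_latency_ms(100))  # 4.0 ms, matching the 100 km example above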
Buffer-to-buffer credits
Fibre Channel uses the BB_Credits (buffer-to-buffer credits) mechanism for hardware-based flow control. This means that a port has the ability to pace the frame flow into its processing buffers. This mechanism eliminates the need for switching hardware to discard frames due to high congestion. On VPLEX, during fabric login, the 8 Gb FC ports advertise a BB credit value of up to 41. The switch will respond to the login with its maximum supported BB credit value. The BB credit value returned by the switch is used by the FC port, up to a maximum of 41. EMC testing has shown this mechanism to be extremely effective in its speed and robustness. The BB credits between the two clusters must be configured for long distance.
Refer to the EMC Networked Storage Topology Guide for proper calculations and settings
for your WAN connectivity.
DWDM/SONET configuration
When using DWDM or SONET connectivity between sites, be sure to determine whether the two rings have diverse pathing. If the two rings do have diverse pathing, measure the latency over both paths. VPLEX directors will load balance using round robin over both links, so any large discrepancy in latency between the two paths will cause VPLEX to operate at speeds based on the slower path. If a big enough difference is measured, VPLEX will issue call-homes, but will take no action to take a path offline.
Dual fabrics for inter-cluster communication
A VPLEX Metro should be set up with redundant and completely independent Fibre
Channel connectivity between clusters. This provides maximum performance, fault
isolation, fault tolerance, and availability.
Redundant fabrics are of critical importance due to the fact that when the directors in
one cluster have inconsistent connectivity with the directors in the remote cluster, the
two clusters will be logically split until the connectivity issues are resolved. This is by
design. The firmware requires full connectivity among all directors for protocols such
as cache coherence and inter-cluster communication. Without full connectivity, the
director will continue to run but will bring the inter-cluster link down. The net result
is that all volumes at the non preferred (or losing) site will become unavailable as per
the pre-defined per volume cluster detach rules. Recovery is simple, but manual. It
requires that connectivity be restored between all directors prior to the resumption of
I/O operations.
The following is an example fabric configuration:
Site-1 has switches, switch-1A and switch-1B.
Site-2 has switches, switch-2A and switch-2B.
At Site-1:

a) Connect "A4-FC02" and "B4-FC02" to switch-1A.

b) Connect "A4-FC03" and "B4-FC03" to switch-1B.

At Site-2:

a) Connect "A4-FC02" and "B4-FC02" to switch-2A.

b) Connect "A4-FC03" and "B4-FC03" to switch-2B.

Place ISLs between switch-1A and switch-2A and between switch-1B and switch-2B.
The best practice is to have your ISL traffic travel through independent links (and/or
carrier) between your sites.



Zoning inter-cluster connectivity
The Fibre Channel WAN COM ports are configured into two different port groups.
Ports on the WAN COM module labeled FC00 will be in port group 0 and the ports
labeled FC01 will be in port group 1. The inter-cluster connectivity will be spread
across two separate fabrics for HA and resiliency. Port group 0 will be connected to
one fabric and port group 1 will be connected to the other fabric.
A Fibre Channel WAN COM port in one cluster will be zoned to all of the FC WAN
ports on the same port group at the remote site. This is roughly equivalent to one
initiator zoned to multiple targets. This is repeated for each and every Fibre Channel WAN COM port at both clusters.
This zoning provides additional fault tolerance and error isolation in the event of
configuration error or a rogue fabric device (when compared to a single large zone).

Figure 20 Fibre Channel WAN COM connections on VS2 VPLEX hardware

Figure 21 IP WAN COM connections on VS2 VPLEX hardware


Though this requires more setup than a single zone, it is worth the effort and should
not be considered out of the norm for a SAN administrator.

Assuming two fabrics and dual-engine systems for Cluster A and Cluster B, each fabric would be zoned as follows:
Key:
Director-1-1-A = cluster 1, engine 1, director A: D11A
Director-2-1-B = cluster 2, engine 1, director B: D21B
WAN COM ports are FC00 or FC01
All combined (example): D11A:FC00

Zoning using director FC WAN COM modules A2 and B2:
Zone 1: D11A:FC00 -> D21A:FC00, D21B:FC00, D22A:FC00, D22B:FC00
Zone 2: D11A:FC01 -> D21A:FC01, D21B:FC01, D22A:FC01, D22B:FC01
Zone 3: D12B:FC00 -> D21A:FC00, D21B:FC00, D22A:FC00, D22B:FC00
Zone 4: D12B:FC01 -> D21A:FC01, D21B:FC01, D22A:FC01, D22B:FC01
Zone 5: D21A:FC00 -> D11A:FC00, D11B:FC00, D12A:FC00, D12B:FC00
Zone 6: D21A:FC01 -> D11A:FC01, D11B:FC01, D12A:FC01, D12B:FC01
Zone 7: D21B:FC00 -> D11A:FC00, D11B:FC00, D12A:FC00, D12B:FC00
Zone 8: D21B:FC01 -> D11A:FC01, D11B:FC01, D12A:FC01, D12B:FC01
There would be 16 zones for the quad engine configuration. Please extrapolate from this example.

NOTE: If VPLEX is deployed with an IP inter-cluster network (FCIP), the inter-cluster network must not be able to route to the following reserved VPLEX subnets: 128.221.252.0/24, 128.221.253.0/24, and 128.221.254.0/24.


Checking cluster connectivity


To check for FC WAN connectivity, log in to the VPLEX CLI and run the following
command:
connectivity director <director_name>

An example is as follows:

VPlexcli:/> connectivity director --director director-1-1-A/
Device VPD83T3:50060160bce00a99 is a default LUN_0.
StorageVolumes discovered - sorted by: name
StorageVolume Name                        WWN                 LUN                 Ports
----------------------------------------  ------------------  ------------------  -------
VPD83T3:60000970000192601707533031353237  0x50000972081aadd4  0x0000000000000000  A2-FC00
                                          0x50000972081aadd5  0x0000000000000000  A2-FC00
VPD83T3:60000970000192601707533031353238  0x50000972081aadd4  0x0001000000000000  A2-FC00
                                          0x50000972081aadd5  0x0001000000000000  A2-FC00

Initiators discovered
Node WWN            Port WWN            Ports
------------------  ------------------  -------
0x20000000c98ce6cd  0x10000000c98ce6cd  A0-FC00
0x20000000c98ce6cc  0x10000000c98ce6cc  A0-FC00

Directors discovered by director-1-1-A, UUID 0x000000003b201e0b:
Director UUID       Protocol  Address             Ports
------------------  --------  ------------------  -------
0x000000003b301fac  COMSCSI   0x50001442901fac43  A4-FC03
                    COMSCSI   0x50001442901fac42  A4-FC02
0x000000003b201fac  COMSCSI   0x50001442801fac43  A4-FC03
                    COMSCSI   0x50001442801fac42  A4-FC02
0x000000003b301f80  COMSCSI   0x50001442901f8041  A4-FC01
                    COMSCSI   0x50001442901f8040  A4-FC00
0x000000003b201f80  COMSCSI   0x50001442801f8040  A4-FC00
                    COMSCSI   0x50001442801f8041  A4-FC01
0x000000003b301e07  COMSCSI   0x50001442901e0743  A4-FC03
                    COMSCSI   0x50001442901e0742  A4-FC02
0x000000003b201e07  COMSCSI   0x50001442801e0743  A4-FC03
                    COMSCSI   0x50001442801e0742  A4-FC02
0x000000003b301e0b  COMSCSI   0x50001442901e0b41  A4-FC01
                    COMSCSI   0x50001442901e0b40  A4-FC00

In the "Directors discovered by" section, check to make sure that the director has
connectivity to the remote directors using the ports A4-FC02 and A4-FC03.
Repeat this process for all the remaining directors in your system and check to make
sure that they can reach the remote directors using both the FC WAN ports.



VPLEX Geo
Cluster connectivity
VPLEX Geo connectivity is defined as the communication between clusters in a
VPLEX Geo. The two key components of VPLEX Geo communication are Ethernet
(IP) WAN and VPN. (Please see the VPN section for details.) Ethernet WAN is the
network connectivity between directors of each cluster and the VPN is connectivity
between management servers for management purposes.

IP WAN connectivity
- Each director's IP WAN ports must be able to see at least one WAN port on every other remote director (required).
- The director's local COM port is used for communications between directors within the cluster.
- Independent WAN links are strongly recommended for redundancy.
- Each director has two WAN ports that should be configured on separate hardware to maximize redundancy and fault tolerance.
- Configure WAN links between clusters on network components that offer the same Quality of Service (QoS). VPLEX uses round-robin load balancing and is at the mercy of the slowest pipe.
- Logically isolate VPLEX Geo traffic from other WAN traffic.
- Multicast must be enabled on the network switches.
- The supported network round trip latency is <50 ms.

Dual networks for inter-cluster communication


A VPLEX Geo should be set up with redundant and completely independent
networks between clusters located over geographically different paths. This provides
maximum performance, fault isolation, fault tolerance, and availability.
Redundant networks are of critical importance due to the fact that when the directors
in one cluster have inconsistent connectivity with the directors in the remote cluster,
the two clusters will be logically split until the connectivity issues are resolved. This is
by design. The firmware requires full connectivity among all directors for protocols
such as cache coherence and inter-cluster communication. Without full connectivity,
the director will continue to run but will bring the inter-cluster link down. The net result is that all volumes at the losing site will become unavailable as per the pre-defined per-volume cluster detach rules, or through the control based on VPLEX Witness.
Recovery requires that connectivity be restored between all directors prior to the resumption of I/O operations. In the case of active/active stretch clusters, application roll-back may be required, as uncommitted delta sets will be discarded. Installations configured as active/passive host clusters will avoid this condition, as VPLEX recognizes single-site access through the awareness of empty delta sets and therefore will not discard the uncommitted deltas. This of course assumes that all active nodes are active at the same cluster in the same consistency groups. If one of the LUNs in the same consistency group is also active at the remote cluster, then the configuration is effectively active/active and roll-back applies. Also, this is only true when the rule is set to active writer wins.
Maximum Transmission Unit (MTU) size is also a very important attribute. The
default MTU size for network switches is 1500 for IPv4. Jumbo Packet support is
provided by an ever-increasing assortment of network switches; however, all switches from end to end must support jumbo packets. By increasing the size of the MTU, there will be an increase in performance over the WAN. If there is a switch anywhere inline that does not support jumbo packets, or is not configured to utilize jumbo packets, it will break the jumbo packet down into multiple smaller packets with an MTU of 1500, which will have an overall negative impact on performance.
VPLEX supports an MTU size up to a maximum of 9000, and it is a recommended best practice to use the highest MTU size supported on that network. It is very important to do an assessment of the network first to determine the MTU size to configure for VPLEX.

NOTE: If VPLEX is deployed with an IP inter-cluster network, the inter-cluster network must not be able to route to the following reserved VPLEX subnets: 128.221.252.0/24, 128.221.253.0/24, and 128.221.254.0/24.

Performance Tuning VPLEX Geo

How-To Set the maximum-queue-depth:


A VPLEX tunable parameter you should be familiar with is the maximum queue depth on a consistency group (found in the advanced context of each consistency group context). By default it is set to 6, and it can be increased up to a maximum of 64. The queue depth controls how many delta sets are allowed to queue before write pacing (artificial host I/O throttling) is invoked. The net impact of increasing the queue depth is that virtual volumes in asynchronous consistency groups can withstand longer-duration spikes in write I/O before write pacing is invoked. This has a smoothing effect in short write-burst environments. The downside is increased RPO: since VPLEX Geo is asynchronous at the array layer (but not the host cache layer), as you add deltas you increase RPO. It's important to note that delta sets only buffer temporary spikes in write I/O and cannot help in situations where the sustained write I/O over time exceeds the link bandwidth.

Verify current max-queue-depth setting:

VPlexcli:/> ll /clusters/<cluster_name>/consistency-groups/<cg_name>/advanced
/clusters/<cluster_name>/consistency-groups/<cg_name>/advanced:
Name                        Value
--------------------------  --------
auto-resume-at-loser        false
current-queue-depth         1
current-rollback-data       0B
default-closeout-time       0.5min
delta-size                  16M
local-read-override         true
max-possible-rollback-data  992M
maximum-queue-depth         6
potential-winner
write-pacing                inactive

The default value for the queue depth is 6, and it may be adjusted as high as 64. Decreasing the maximum-queue-depth will decrease the maximum RPO in the event of link and/or cluster failures. Increasing the maximum-queue-depth will enhance the ability of the VPLEX system to handle bursts of traffic.

NOTE: Increasing the maximum-queue-depth will increase the amount of data that must roll back in the case of a cluster failure or a link outage.


Setting the max-transfer-size for rebuilds:


The max-transfer-size parameter is used to determine the size of the region on a
virtual volume that is locked while data is read for a distributed (or local) raid-1
rebuild as well as Data Mobility jobs when migrating devices and/or extents. This is
set on a per virtual volume basis. VPLEX uses the rebuild function when you first
create a distributed raid-1 or local raid-1 device to synchronize data on both mirror
legs. During normal production operations a rebuild would be fairly uncommon. This setting can be adjusted dynamically at any time, even during a migration or rebuild. The 128K size is very conservative in terms of front-end host impact; the appropriate size is a function of when you start to see impact from moving it higher and of your overall time objectives during a rebuild or migration. In effect, max-transfer-size is a tunable throttle that lets the user control both the speed and the host impact of distributed/local RAID-1 rebuilds and migrations.
Under normal host read/write activities, the max-transfer-size setting will not affect your replication speed across the wire. VPLEX will take all the bandwidth that is allocated to it, provided it has sufficient write traffic. A suggested next step would be to start some host I/O against the DR1 and test various levels of throughput per your normal IOPS testing procedures.

VPlexcli:/> rebuild show-transfer-size --devices *
device name                                transfer-size
-----------------------------------------  -------------
device_6006016018641e000e933e974fbae111_1  128K
device_6006016018641e000f933e974fbae111_1  128K
device_6006016018641e0010933e974fbae111_1  128K
device_6006016018641e0011933e974fbae111_1  128K
device_6006016018641e0012933e974fbae111_1  128K

VPlexcli:/> rebuild set-transfer-size --devices * --limit 1M

VPlexcli:/> rebuild show-transfer-size --devices *
device name                                transfer-size
-----------------------------------------  -------------
device_6006016018641e000e933e974fbae111_1  1M
device_6006016018641e000f933e974fbae111_1  1M
device_6006016018641e0010933e974fbae111_1  1M
device_6006016018641e0011933e974fbae111_1  1M
device_6006016018641e0012933e974fbae111_1  1M

VPlexcli:/> rebuild status
device                                     rebuild  rebuilder    rebuilt/total  % finished  throughput  ETA
-----------------------------------------  -------  -----------  -------------  ----------  ----------  -------
device_6006016018641e000e933e974fbae111_1  full     s2_0908_spa  5.59G/10.1G    55.47%      31.8M/s     2.4min
device_6006016018641e000f933e974fbae111_1  full     s2_0908_spa  5.59G/10.1G    55.47%      31.8M/s     2.4min
device_6006016018641e0010933e974fbae111_1  full     s1_0a69_spb  3.87G/10.1G    38.40%      42.4M/s     2.48min
device_6006016018641e0011933e974fbae111_1  full     s2_0908_spb  620M/10.1G     6.01%       16.8M/s     9.58min
device_6006016018641e0012933e974fbae111_1  full     s1_0a69_spa  7.99G/10.1G    79.32%      34.5M/s     61sec

Set the MTU Size for Jumbo Packets:



In order to use jumbo frames, you need to set each and every hop along the path to jumbo frames enabled.
The VPlexcli command director tracepath will help here. When you set the end points to 9000 MTU and run the tracepath command, it will tell you what it thinks each point on the path can support. You have to set the end points to something large in order to find the biggest MTU size; unfortunately, you can't just find this out with the end points set to 1500. From the CLI Guide, it will look something like this:

Also, in the firmware log (in /var/log/VPlex/cli/ on the management station), you can look for specific messages regarding what a director thinks the MTU size on the path is:
ipc/36 example path MTU from B2-XG00 (slic2eth0 mtu 1500) to 153.7.45.24 is likely to be 128 or slightly higher.
Be sure to check the firmware log for ipc and comUdt related messages when you are trying jumbo frames; this will give a clue as to why the connection isn't working. You may see things like "accepted" or "losing connection". Tail the firmware log when you make the MTU changes.
After determining the maximum MTU, and prior to setting that value on the VPLEX port groups, you must determine whether any WAN optimization/acceleration gear will also be added inline. Optimization typically adds a slight overhead to the packet during transmission. An example of this would be the RapidPath nodes: if you set them to use UDP optimization and multihoming, the overhead added has a value of 124. This value must be subtracted from the maximum MTU value, i.e. 1500 - 124 = 1376. The MTU value that would be set on VPLEX in this case would be 1376. Likewise, if you determined that the MTU size should be 9000, then you would apply a value of 8876 on VPLEX (8876 + 124 = 9000).
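
The same arithmetic as a small Python sketch (illustrative only); the 124-byte overhead is the RapidPath UDP-optimization-with-multihoming example quoted above, so substitute the overhead of whatever WAN optimization gear is actually inline.

def vplex_mtu(path_mtu, optimization_overhead=124):
    # MTU to configure on the VPLEX port groups so that, once the WAN
    # optimization overhead is added, packets still fit the path MTU.
    return path_mtu - optimization_overhead

print(vplex_mtu(1500))  # 1376, matching the example above
print(vplex_mtu(9000))  # 8876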
Note: Only modify one port group at a time, so as not to lose connectivity between clusters.



Verify connectivity:
VPlexcli:/> connectivity validate-wan-com

Go to the port-group-0 context:
VPlexcli:/> cd /clusters/cluster-1/cluster-connectivity/port-groups/port-group-0
VPlexcli:/> cd /clusters/cluster-2/cluster-connectivity/port-groups/port-group-0

Disable port-group-0 on Cluster-1 & 2:
VPlexcli:/clusters/cluster-1/cluster-connectivity/port-groups/port-group-0> set enabled all-disabled
VPlexcli:/clusters/cluster-2/cluster-connectivity/port-groups/port-group-0> set enabled all-disabled

Modify the MTU size in the subnet:
VPlexcli:/> cd /clusters/cluster-1/cluster-connectivity/subnets/cluster-1-SN00
VPlexcli:/clusters/cluster-1/cluster-connectivity/subnets/cluster-1-SN00> set mtu 9000
VPlexcli:/> cd /clusters/cluster-2/cluster-connectivity/subnets/cluster-2-SN00
VPlexcli:/clusters/cluster-2/cluster-connectivity/subnets/cluster-2-SN00> set mtu 9000

Note: mtu takes an integer between 1024 and 9000.

Enable the port group:
VPlexcli:/> cd /clusters/cluster-1/cluster-connectivity/port-groups/port-group-0
VPlexcli:/clusters/cluster-1/cluster-connectivity/port-groups/port-group-0> set enabled all-enabled
VPlexcli:/> cd /clusters/cluster-2/cluster-connectivity/port-groups/port-group-0
VPlexcli:/clusters/cluster-2/cluster-connectivity/port-groups/port-group-0> set enabled all-enabled

Validate connectivity:
VPlexcli:/> connectivity validate-wan-com

Go to the port-group-1 context:
VPlexcli:/> cd /clusters/cluster-1/cluster-connectivity/port-groups/port-group-1
VPlexcli:/> cd /clusters/cluster-2/cluster-connectivity/port-groups/port-group-1

Disable port-group-1 on Cluster-1 & 2:
VPlexcli:/clusters/cluster-1/cluster-connectivity/port-groups/port-group-1> set enabled all-disabled
VPlexcli:/clusters/cluster-2/cluster-connectivity/port-groups/port-group-1> set enabled all-disabled

Modify the MTU size in the subnet:
VPlexcli:/> cd /clusters/cluster-1/cluster-connectivity/subnets/cluster-1-SN01
VPlexcli:/clusters/cluster-1/cluster-connectivity/subnets/cluster-1-SN01> set mtu 9000
VPlexcli:/> cd /clusters/cluster-2/cluster-connectivity/subnets/cluster-2-SN01
VPlexcli:/clusters/cluster-2/cluster-connectivity/subnets/cluster-2-SN01> set mtu 9000

Enable the port group:
VPlexcli:/> cd /clusters/cluster-1/cluster-connectivity/port-groups/port-group-1
VPlexcli:/clusters/cluster-1/cluster-connectivity/port-groups/port-group-1> set enabled all-enabled
VPlexcli:/> cd /clusters/cluster-2/cluster-connectivity/port-groups/port-group-1
VPlexcli:/clusters/cluster-2/cluster-connectivity/port-groups/port-group-1> set enabled all-enabled

Validate connectivity:
VPlexcli:/> connectivity validate-wan-com


Set the Socket-Buf-Size:

The formula for setting the socket-buf-size is:
bandwidth (BW) x RTT latency / 2, for each of the two connections
Example:
100MB/s x 0.1 sec / 2 = 5MB socket buffer setting each (the default is 1MB)
Syntax:
VPlexcli:/clusters/cluster-1/cluster-connectivity/option-sets/optionset-com-0> set socket-buf-size 5M
VPlexcli:/clusters/cluster-1/cluster-connectivity/option-sets/optionset-com-1> set socket-buf-size 5M
VPlexcli:/clusters/cluster-2/cluster-connectivity/option-sets/optionset-com-0> set socket-buf-size 5M
VPlexcli:/clusters/cluster-2/cluster-connectivity/option-sets/optionset-com-1> set socket-buf-size 5M
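
As a quick sanity check on the sizing rule above, here is a minimal Python sketch (illustrative only, not a VPLEX tool) that reproduces the arithmetic; the bandwidth and RTT values are the ones from the example.

def socket_buf_size_mb(bandwidth_mb_per_s, rtt_seconds, connections=2):
    # Bandwidth-delay product split across the two WAN COM connections,
    # per the formula quoted above.
    return bandwidth_mb_per_s * rtt_seconds / connections

print(socket_buf_size_mb(100, 0.1))  # 5.0 MB per connection, matching the example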

Make sure that you set the Maximum Transmission Unit (MTU) and the socket buffers according to your anticipated workloads. Figures 22 and 23 offer some suggested values to start your baselining process.

Note: The VPLEX 5.2 default socket-buffer-size is 1MB.

Our VPLEX 5.2 MetroIP Performance Overview whitepaper advises:
- 1MB optimal for MTU 1500 with an RTT of 1ms
- 5MB optimal for MTU 1500 with an RTT of 10ms
- 5MB optimal for MTU 9000 with RTTs of 1ms and 10ms
Our VPLEX 5.2 Geo Performance Overhead whitepaper advises:
- 15MB optimal for MTU 1500
- 20MB optimal for MTU 9000

Figure 22 OC24 - 1Gb

Figure 23 OC192 - 10Gb

VPLEX Witness

Figure 24 VPLEX Witness


VPLEX Witness will be provided as a VM and will be supported on ESX 4.0, ESX 4.1,
ESXi 5.0 / 5.1, and VMware FT on customer supplied ESX hardware that resides in a
third failure domain. Please refer to the ESSM for future ESX support levels. VPLEX
Witness communicates with both management servers over a secure three way VPN.
VPLEX Witness is intended to provide a third point of view of various failure
scenarios in order to ensure proper device access attributes in the event of a failure. It
is imperative that VPLEX Witness is installed in a third failure domain such that no
single failure will affect multiple components involving VPLEX Witness.
VPLEX Witness is critical to continuous availability of the VPLEX Metro.

Note: If VPLEX Witness fails and will be down for an extended period of time
then it is very important that it is disabled within vplexcli so that VPLEX will
respond to additional failures based on the detach rules. Failure to do so will
cause a dual failure resulting in cluster isolation and suspension of all devices
on both clusters.
Please reference the TechBook: EMC VPLEX Metro Witness Technology and High
Availability for additional details.


Consistency Groups
There are three types of consistency groups: asynchronous, synchronous and
RecoverPoint. Asynchronous consistency groups are used for distributed volumes in
a VPLEX Geo to ensure that I/O to all volumes in the group is coordinated across both
clusters, and all directors in each cluster. All volumes in an asynchronous group share
the same detach rule and cache mode, and behave the same way in the event of an
inter-cluster link failure. Only distributed volumes can be included in an
asynchronous consistency group.
Synchronous consistency groups, on the other hand, simply provide a convenient way
to apply rule sets and other properties to a group of volumes at a time, simplifying
system configuration and administration on large systems. Volumes in a synchronous
group behave as volumes in a VPLEX Local or Metro, and can have global or local
visibility. Synchronous groups can contain local, global, or distributed volumes.
RecoverPoint Consistency Groups contain the Journal and Replica volumes for the corresponding data volumes found in a RecoverPoint-enabled consistency group.
VPLEX Witness controls rule sets for consistency groups only. Distributed volumes
not placed within consistency groups will simply apply the detach rules without
guidance from VPLEX Witness.

Note: VPLEX Witness will not provide guidance for Consistency Groups that do not
have the detach rule set to make one of the clusters the winner.

For additional information, please refer to VPLEX with GeoSynchrony CLI Guide
found on support.emc.com or the online help found within the GUI.


Rule Sets
As a minimum, set the detach timer to 5 seconds. Setting the detach delay lower than 5
seconds can result in unnecessary or numerous storage detaches during periods of
network instability. Multiple detaches in a short period of time can also result in many
unnecessary data rebuilds and subsequently in reduced performance.
Configure detach rules based on the cluster/site that you expect to continue I/O
during any network outage.
Avoid conflicting detach situations. Each distributed device (or group of devices)
must have a rule set assigned to it. When a cluster's distributed device detaches
during a link outage or other communications issue with the other members of a
distributed device, the detached device can resume I/O. Therefore, it is important to
understand the nature of the outage and which cluster is set to automatically detach.
It is a recommendation that the rule set configuration for each distributed device or
group of devices be documented as well as plans for how to handle various outage
types.
It is important to note that rule sets are applied on a distributed device basis or to a
number of devices within a consistency group. It is within normal parameters for
different distributed devices to resume I/O on different clusters during an outage.
However, if a host application uses more than one distributed device, most likely all of
the distributed devices for that application should have the same rule set to resume
I/O on the same cluster.

Note: It is recommended that Consistency Groups be used when setting detach rules.


System Volumes
There are two types of system volumes. Each cluster must have a metadata volume.
Each VPLEX Metro or Geo cluster must also have sufficient logging volumes to
support its distributed devices.

Metadata volume
Metadata volume consistency should be checked periodically using the following command:
meta-volume verify-on-disk-consistency --style short -c <cluster>

A metadata volume contains information specific to a VPLEX Cluster such as virtual-to-physical mapping information, data about the devices and virtual volumes, system
configuration settings, and other system information. Metadata volumes are created
during system installation. However, you may need to create a metadata volume if,
for example, you are migrating to a new array.
Note the following:


A metadata volume is the only object you create on a storage volume without claiming
it first

You create metadata volumes directly on storage volumes, not extents or devices

Metadata volumes should be on storage volumes that have underlying redundant properties such as RAID 1 or RAID 5

Metadata volumes must be mirrored between different storage arrays whenever more than one array is configured to a VPLEX Cluster (NDU requirement)

Two Metadata volume backups must be configured for every VPLEX Cluster and must be placed on different arrays whenever more than one array is configured to a VPLEX Cluster (NDU requirement)

Installations that were initially configured with a single array per VPLEX cluster must
move a Metadata mirror leg and a backup copy to the second array once provisioned

Metadata volumes are written to at the time of a configuration change and read from
only during the boot of each director

Metadata volumes are supported on Thin/Virtually Provisioned devices. It is required for those devices to be fully allocated


Meta-volume names can be a maximum of 39 characters


Metadata volumes provisioned on Thin devices have the additional requirement to fully allocate the device so that over-allocation of the thin pool won't have adverse effects on the metadata volume should additional updates be required.

Note: Renaming Metadata Backup volumes is not supported.

Note: Metadata volumes are required to be zeroed out. Follow the procedure
outlined in VPLEX Procedure Generator to zero the disks to be used for the
Metadata Volumes and Metadata volume backups. Additionally, writing
zeros to thin devices will force them to be fully allocated which is necessary
for VPLEX supportability with both Metadata volumes and logging volumes.

Metadata backup policies and planning


You need to have a metadata backup you can recover from. Plan on the following:

Spare volumes for each cluster to hold backups: you need to rotate a minimum of two backups per VPLEX Cluster

A system-wide scheduled backup done at regular times: a single-cluster backup for a VPLEX Metro or VPLEX Geo is not useful

On-demand backups before/after major reconfigurations and/or migrations


For additional information, please refer to VPLEX with GeoSynchrony CLI Guide
found on support.emc.com.


Logging volumes
Note: Single-cluster systems (VPLEX Local) and systems that do not have
distributed devices do not require logging volumes.

A prerequisite to creating a distributed device, or a remote device, is that you must have a logging volume at each cluster. Logging volumes keep track of any blocks written during an inter-cluster link failure. After a link is restored, the system uses the information in logging volumes to synchronize the distributed devices by sending only changed block regions across the link. Each DR1 requires two bitmap logs at each site: one for writes initiated remotely that fail locally (which protects against local array failure) and one for writes that succeed locally but fail remotely. Upon restore (of the array or the link), one log is used to push data and one is used to pull data.
1. The first bitmap log is used if the clusters partition, for writes on the winning site. You will use this log to send the blocks for the log rebuild.

2. The second bitmap log is used if your local array fails. Since writes to the local leg will fail, you mark the second bitmap log. You then use this in a log rebuild to pull data from the remote cluster.

If a logging volume is not created, every inter-cluster link failure could cause a full resynchronization of every distributed device in the system. The logging volume must be large enough to contain one bit for every page of distributed storage space. The calculation for determining logging volume size is: 10 GB protects 320 TB / N, where N is the number of clusters. Consequently, you need approximately 10 GB of logging volume space for every 160 TB of distributed devices in both a VPLEX Metro and a VPLEX Geo. Logging volumes are not used for VPLEX Local configurations and are not used for local mirrors.
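
The sizing rule above can be expressed as a small calculation. The following sketch is illustrative only; the helper function is hypothetical and simply encodes the 10 GB per (320 TB / N) rule stated above.

# Illustrative sizing helper based on the rule above:
# 10 GB of logging volume protects 320 TB / N of distributed capacity,
# where N is the number of clusters (2 for a Metro or Geo).

def logging_volume_size_gb(distributed_capacity_tb, num_clusters=2):
    """Return the approximate logging volume capacity (in GB) needed."""
    protected_tb_per_10gb = 320 / num_clusters  # 160 TB per 10 GB with two clusters
    return 10 * (distributed_capacity_tb / protected_tb_per_10gb)

# Example: 480 TB of distributed devices in a two-cluster Metro
print(logging_volume_size_gb(480))  # -> 30.0 (GB, approximately)
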
The logging volume receives a large amount of I/O during and after link outages.
Consequently, it must be able to handle I/O quickly and efficiently. EMC
recommends that you stripe it across several disks to accommodate the I/O volume,
and that you also mirror it, since this is important data. EMC also recommends
placing the logging volume on separate physical spindles from the storage volumes
that it is logging against.
Because logging volumes are critical to the functioning of the system, the best practice
is to mirror them across two or more back-end arrays to eliminate the possibility of
data loss on these volumes. In addition, they can be striped internally on the back-end
arrays. The RAID level within the array should be considered for high performance.
When a failure occurs that invokes the use of the logging volumes, they will experience very high read/write I/O activity during the outage. During normal, healthy system activity, logging volumes are not used.

If one array's data may, in the future, be migrated to another array, then the arrays
used to mirror the logging volumes should be chosen such that they will not be
required to migrate at the same time.
You can have more than one logging volume, and can select which logging volume is
used for which distributed device.
Logging volumes are supported on thin devices; however, it is a requirement that the thin device be fully allocated. For similar reasons as the metadata volume
requirement, logging volumes are active during critical times and must have the
highest level of availability during that time.


Migration of Host/Storage to a VPLEX Environment


Always remember that a host should never be able to do I/O to a storage volume
directly while also being able to do I/O to the virtualized representation of that storage
volume from a VPLEX Cluster. It must be one or the other but never both.
The process of migrating hosts with existing storage over to the VPLEX virtualized
environment includes the following methodology suggestions:

Host grouping (initiator group basis) (recommended)

Migration by application and/or volume. This way any necessary driver updates can
happen on a host-by-host basis

Virtualize an entire cluster of hosts (requirement)

Select back-end ports for specific arrays/initiator groups (on those arrays)
Note: Please refer to the VPLEX Procedure Generator for various step by step
host and clustered host encapsulation procedures. Please note that work is
being done to get away from the use of the --appc flag. Some of the newer
procedures have already removed this step in favor of performing the
encapsulation via the GUI. The 1:1 Virtual Volume creation within the GUI
Wizard avoids the problems that the --appc flag was intended to protect against.

Storage volume encapsulation


When claiming and using storage volumes with existing data, it is recommended that special attention be paid to the process of constructing a virtual volume so that the integrity of the existing data is maintained and the data remains available through the virtual volume.

You must create a single extent across the entire capacity of each storage volume

You must protect the data when creating devices

Special considerations with encapsulation


Before attempting to encapsulate existing storage devices, please refer to the ESSM to
verify host support and please refer to the VPLEX Procedure Generator for proper
procedures.
Unlike normal migrations, encapsulations include operations that present the existing device to VPLEX. Host clusters complicate this process by applying reservations to the devices; therefore, special procedures must be followed to remove the reservations prior to exposing the devices to VPLEX.

VPLEX Non-Disruptive Insertion (NDI) Options

Overview
Because VPLEX changes the addressing that hosts use to access storage, there is a
common misconception that inserting a VPLEX into a running environment must be a
disruptive process. In fact, the vast majority of IT environments already have the tool
sets for enabling VPLEX to be inserted without ANY downtime for the end user
community. This document outlines the wide variety of processes that users can use
to insert a VPLEX non-disruptively. For additional details, or if you have questions,
please contact the VPLEX Deal Support Desk.

Background
In the process of virtualizing back end storage LUNs, VPLEX creates new WWNs /
Volume IDs to present to the hosts. These WWNs / Volume IDs are usually mapped
one-to-one with the LUN(s) of the storage behind the VPLEX. Because the WWN /
Volume ID is different than what the host originally was using, the traditional
assumption has been that the host must be re-booted to recognize the new WWNs.
This re-boot downtime is often a minor objection to adopting VPLEX technology.
Avoiding these disruptions is possible using a number of common alternatives. Tools already existing in typical IT environments can be leveraged to non-disruptively migrate any number of user data sets, whether the whole environment or a subset of applications that make up the most mission-critical suite. The choice of which servers to move non-disruptively is totally up to the IT administrative team.
VPLEX Non-Disruptive Insertion (NDI) supported environments include:

Virtualized servers (VMware, HyperV, etc.)

Nearly all physical server Operating Systems

Nearly all physical server clusters

Any host running PowerPath Migration Enabler (PPME)

Any host that has an upcoming scheduled outage


Prerequisites
The following sections outline the process of NDI for each of these mainstream use
cases. In each example, it is assumed that the user has:

Installed the VPLEX

Chosen hosts targeted for non-disruptive insertion

Verified support of each of the targeted hosts by VPLEX and remediated any that are
not currently on the VPLEX ESM

With VPLEX, created a one-to-one mapping of the existing LUNs to new VPLEX LUNs for each of the targeted hosts' LUNs

Associated the targeted hosts with their respective new VPLEX LUNs

NDI in virtualized environments


Virtualized environments, particularly VMware, are ideal for NDI. Both VMware and MS Hyper-V enable migrations without server re-booting. During the data transfer phase, Hyper-V does suspend the VM.

VMware:
1) Perform a Storage vMotion from the original LUN to the new VPLEX LUN
2) Perform a vMotion to the VPLEX LUN
3) Repeat as necessary; multiple Storage vMotions and vMotions can be executed
simultaneously (just not the SAME LUN)
4) OPERATION COMPLETE
MS Hyper-V
1) Perform a Quick Storage Migration from the original LUN to the new VPLEX LUN (Hyper-V will suspend the VM during this time)
2) Perform a Live Migration to the VPLEX LUN
3) Repeat as necessary; multiple Quick Storage Migrations and Live Migrations can be executed
simultaneously (just not the SAME LUN)
4) OPERATION COMPLETE

NDI with Physical Servers


Physical servers with host-based mirroring are ideal candidates for NDI. The
following combinations are potential solutions:
IBM AIX - AIX Logical Volume Manager
HP/UX - HP Storage Mirroring Software
Solaris - Solaris Volume Manager
Etc. - TBD
1) Create mirrored relationship between the existing host-based LUNs and the new VPLEX LUNs
2) Initiate mirroring
3) Transition primary access to the VPLEX LUNs and dis-associate the two sets of LUNs
4) OPERATION COMPLETE

NDI with Physical Server Clusters


All mainstream physical server clusters offer some form of data replication which can be used for NDI. The following combinations are potential solutions:
IBM AIX PowerHA / HACMP - Logical Volume Manager
HP/UX MC/Service Guard - HP Storage Mirroring Software
Solaris SunCluster - Solaris Volume Manager
VCS - Veritas Volume Manager
Etc. - TBD
1) Create mirrored relationship between the existing host-based LUNs and the new VPLEX LUNs
2) Initiate mirroring
3) Transition primary access to the VPLEX LUNs and dis-associate the two sets of LUNs
4) OPERATION COMPLETE

NDI with PowerPath Migration Enabler
Any host running PPME can perform a VPLEX NDI. The process is equivalent to the host-based mirroring process:
1) Create mirrored relationship between the existing host-based LUNs and the new VPLEX LUNs
2) Initiate mirroring
3) Transition primary access to the VPLEX LUNs and dis-associate the two sets of LUNs
4) OPERATION COMPLETE

Insertion of VPLEX during scheduled downtime


Unlike the other processes outlined above, the insertion of VPLEX during already
scheduled downtime is NOT a non-disruptive process. BUT, since the outage was
already scheduled for other reasons (server maintenance, operating system patches,
power/cooling updates, etc.) there is no INCREMENTAL disruption to insert the
VPLEX.

FAQ:
Q: To encapsulate one LUN on an ESX cluster, do we need to stop only the VMs on that LUN or do we need to shut down the servers? The procedure "Encapsulate arrays on ESX without boot from SAN" generated by the Procedure Generator says that we must stop the ESX server for an encapsulation (it's not a cluster).
A: You would need to stop access by all VMs to the Datastore (LUN), remove access by the ESX cluster to the LUN (masking/zoning on the native array), rescan/refresh the ESX storage, present the encapsulated LUN via VPLEX, rescan/refresh the ESX storage, persistently mount the Datastore (choose keep signature while mounting, as the new signature for the virtual volume will be different), and restart your VMs.


Storage Volume Considerations


Storage volume claiming
Claiming storage volumes with pre-existing data was previously recommended to be done using the application-consistent (--appc) flag, to protect against building VPLEX devices upon them in such a way as to make the data unavailable or to corrupt it. Instead of using the --appc flag, it is now our recommendation to perform this operation in the GUI. This is part of the storage volume encapsulation process. The section Migration of Host/Storage to a VPLEX Environment has more information.

Note: VPLEX only supports block-based storage devices that use a 512-byte sector for allocation and addressing, so ensure that the storage array connecting to VPLEX supports or emulates the same. Storage devices that do not use a 512-byte sector may be discovered by VPLEX but cannot be claimed for use within VPLEX and cannot be used to create a meta-volume. When a user tries to utilize a discovered storage volume with an unsupported block size within VPLEX, either by claiming it or by creating a meta-volume using the appropriate VPLEX CLI commands, the command fails with this error: "the disks has an unsupported disk block size and thus can't be moved to a non-default spare pool". Similarly, a storage device whose capacity is not divisible by 4 KB (4096 bytes) may be discovered by VPLEX but cannot be claimed for use within VPLEX. When a user tries to claim such a discovered storage volume using the appropriate VPLEX CLI commands, the command fails with this error: "The storage-volume <StorageVolumeName> does not have a capacity divisible by the system block size (4K) and cannot be claimed."
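
As a quick illustration of the 4 KB capacity rule above, the following sketch checks whether a reported device capacity (in bytes) is evenly divisible by the 4 KB system block size. The helper function and example capacities are hypothetical.

# Hypothetical helper illustrating the 4 KB (4096-byte) capacity rule above.
BLOCK_SIZE = 4096  # VPLEX system block size in bytes

def is_claimable_capacity(capacity_bytes):
    """Return True if the capacity is evenly divisible by the 4 KB block size."""
    return capacity_bytes % BLOCK_SIZE == 0

print(is_claimable_capacity(500 * 1024**3))  # True: 500 GiB is a multiple of 4 KB
print(is_claimable_capacity(500 * 10**9))    # False: this byte count is not a 4 KB multiple
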

Although it is possible to claim individual storage volumes, it is preferable to use the claiming wizard to claim dozens or hundreds of storage volumes in one operation.
The claiming wizard will assign meaningful names to claimed storage volumes,
including an array identifier and a device number or name. The array identifier
included in the meaningful storage volume names will let you quickly identify storage
volumes from a given array (useful, for example, when you want to migrate virtual
volumes off a storage array you want to decommission). The device number or name
in the meaningful storage volume name lets you correlate that VPLEX storage volume
with a given LUN visible in the array's management interface. This will come in
handy when you want to troubleshoot performance issues starting at the array.
The claiming wizard also provides a mechanism to include a storage tier identifier in
the storage volume names, which can be used in capacity reports, as well as form the
basis of a tiered block storage federation solution.
Some storage arrays such as EMC Symmetrix and HDS USP report the array serial
number and device name in their responses to standard SCSI inquiries; the claiming
wizard can claim their storage volumes without requiring any additional files. For
other storage arrays, the storage administrator must use the array's command-line
tools to generate a hints file that declares the device names and their World Wide
Names. This file is then input to the claiming wizard. In addition, you can also run
the claiming wizard using the --dry-run option and use the output as a source to
create a custom hints file. Also, note that the hints file can be used to selectively add
more control over the claiming wizard behavior for arrays like the EMC Symmetrix
and HDS USP.

Extent sizing
Extents should be sized to match the desired virtual volume's capacity. If the storage
volume you want to use for an extent is larger than the desired virtual volume, create
an extent the size of the desired virtual volume. Do not create smaller extents and then
use devices to concatenate or stripe the extents.
Creating smaller extents on the same storage volume and using devices to concatenate
or stripe these extents may create spindle contention on the underlying storage
volume and not provide any protection from storage volume failure. Creating smaller
extents on different storage volumes and using devices to concatenate or stripe these
extents will distribute the virtual volume's I/O over multiple storage volumes, which
may be beneficial for throughput and responsiveness in some cases, but it also creates
additional management complexity. You should only do this when you know the I/O
pattern will benefit from this.
When disk capacities are smaller than desired volume capacities, best practice is to
create a single slice per disk, and use RAID structures to concatenate or stripe these
slices into a larger RAID.

Volume Expansion
Volume Expansion refers to the expansion operation performed from within VPLEX and is supported in GeoSynchrony 4.2 and later. With GeoSynchrony 5.2 and later, expansion is also supported non-disruptively when performed from the back-end array.
This discussion has no bearing on Thin Provisioned device expansion because, just like a host, VPLEX already projects the full capacity regardless of the actual used capacity. Expansion beyond the original full capacity, however, does apply.
You can non-disruptively expand the capacity of a virtual volume by entering the
capacity needed, and then selecting from a list of storage volumes with available
capacity. The virtual volume you are expanding can be created on a RAID-0, RAID-C
or RAID-1 device. When you expand a volume, VPLEX automatically creates the
extent and concatenates the additional capacity to the virtual volume. These
components are visible when you view the component map.
You can expand a virtual volume as many times as necessary. The Expandable column
on the Virtual Volumes screen indicates whether a volume can be expanded.

Note: Virtual volume expansion of distributed devices is supported from GeoSynchrony 5.2. Expansion of RecoverPoint-enabled volumes is not supported.

Selecting storage volumes


When selecting storage volumes to expand a virtual volume, note the following:

For volumes created on RAID-0 and RAID-C devices:

You can select only one storage volume. This storage volume can be the same storage
volume from which the virtual volume is created, or it can be a different storage
volume

For volumes created on RAID-1 devices:

You must select two or more storage volumes. We recommend that you select a
number of storage volumes greater than or equal to the number of components (legs)
in the device. At a minimum, you should have the same redundancy as the original
RAID-1 device on which the virtual volume is created

Although you can select the same storage volumes on which the virtual volume was
created, we recommend that you select storage volumes from different arrays to avoid
any single point of failure

Considerations for database storage provisioning


Historically, disk alignment has always been a concern; it refers to the start of the data region on a disk, based on the header and master boot record location and size. The issue arises when the start of the data region does not align with a 4K block boundary. VPLEX writes in 4K blocks, and if the random writes from a database are not aligned properly, VPLEX may have to write two 4K blocks in order to write a single 4K block of data.
In order to preserve DB block atomicity, VPLEX back-end writes are split along DB
block boundaries. DB blocks of interest range in size from 4 KB to 64 KB, and are
powers-of-two in size.
For optimal performance and availability in an application or database environment, it is important to ensure alignment of your host's operating system partitions to a 64 KB block boundary and to use VPLEX RAID 1 or encapsulated volume configurations.
On many storage platforms (such as CLARiiON) you can create a LUN with an alignment offset, which means zero host-based alignment activities. However, this is a situation where the back-end I/O can be aligned but the VPLEX I/O is misaligned. We would avoid this approach, and suggest as a VPLEX best practice that all alignment corrections happen via the host.
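
To illustrate the host-side check implied above, the sketch below verifies that a partition's starting offset falls on a 64 KB boundary; the offsets shown are hypothetical examples, not values from any particular host.

# Illustrative check: is a partition's starting offset aligned to a 64 KB boundary?
ALIGNMENT_BOUNDARY = 64 * 1024  # 64 KB, as recommended above

def is_aligned(start_offset_bytes, boundary=ALIGNMENT_BOUNDARY):
    """Return True if the partition start offset is a multiple of the boundary."""
    return start_offset_bytes % boundary == 0

print(is_aligned(63 * 512))    # False - classic misaligned 63-sector MBR partition start
print(is_aligned(2048 * 512))  # True  - a 1 MiB offset is a multiple of 64 KB
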

The recommendations are as follows:

Databases should be aligned to the beginning of VPLEX's virtual volumes (or some
integral number of database blocks from LBA 0).

Database alignment is important for performance.

If RAID 0 or RAID-C is used, two precautions are necessary:

o The device boundaries must be at a multiple of 64 KB. For RAID 0 this means a stripe depth that is a multiple of 64 KB. For RAID-C this means concatenating devices whose total size is a multiple of 64 KB

o The database must be aligned to a 64 KB offset in the virtual volume

The challenge comes in when performing a Data-in-Place Encapsulation of existing devices. Correcting the offset on a disk containing data is nearly impossible without migrating the data to a properly aligned disk.
Any site running the GeoSynchrony 4.0 code that believes it is experiencing degraded performance due to misaligned writes, and that is operating in a VPLEX Local or Metro, should understand that GeoSynchrony 4.0 has reached EOSL (End of Service Life) and must upgrade to the latest GeoSynchrony version.

Device creation and storage provisioning


The following are some tips for device creation and storage provisioning:

Use smaller extents to virtualize a large storage volume into a smaller virtual volume

Use device geometries consisting of either RAID 0 or RAID-C types to virtualize smaller extents into a larger virtual volume

Remember that you can create one to 128 extents on a single storage volume. The
default is one extent comprised of the whole storage volume

Avoid creating RAID 0 structures on storage volumes that are constructed from
stripes in the storage array. This could create hot spots in the array

Stacking of RAID 0s
When creating a device the underlying storage should be taken into account. If the
underlying storage volume has RAID 0 or stripe properties from the storage array, the
VPLEX administrator should use that storage in a device with RAID 1 or RAID-C
properties (not RAID 0 properties). This is to avoid reverse mapping of RAID 0 or stripe, which could create hot spots on the spindles underneath it all.

Proper construction of a mirror/RAID 1 device


This section applies to both local and distributed device creation. When creating
either type of mirror device it is important to understand how to work with existing
data. You should not try to mirror two devices or extents with existing data. In
general there are two ways to create local and distributed mirror devices:


Create the RAID 1/mirror device with two extents or devices

Create a device and attach a mirror extent or device


Export Considerations
Host multipathing drivers, OS, application considerations

Multipathing should be set up for adaptive and not round-robin (recommended)

Avoid multipathing software that does excessive round-robin and/or splits I/O

Avoid excessive subpage writes (writes not on 4 KB boundaries). Please see the section on database storage provisioning for more information

Make sure host I/O paths include redundancy across the first and second upgraders
(director A and B) as required by NDU pre-checks

Avoid connecting a single host or host cluster to multiple A directors and multiple B directors. Limiting the number of directors a host connects to increases the chance of a read cache hit, improving performance


LUN Expansion
New to GeoSynchrony 5.2:
Note: In Release 5.2, VPLEX supports the OS2007 bit on Symmetrix arrays. This setting is
vital to detect LUN swap situations and storage volume expansion automatically on a
Symmetrix array. The Symmetrix section of the Configuring Arrays procedure in the
generator contains instructions on setting the OS2007 bit.

Storage array based volume expansion enables storage administrators to expand the
size of any virtual volume by expanding the underlying storage-volume.
The supported device geometries include virtual volumes mapped 1:1 to storage
volumes, virtual volumes on multi-legged RAID-1, and distributed RAID-1, RAID-0,
and RAID-C devices under certain conditions. The expansion operation is supported
through expanding the corresponding Logical Unit Numbers (LUNs) on the back-end
(BE) array.
Storage array based volume expansion might require that you increase the capacity of
the LUN on the back-end array.
Procedures for doing this on supported third-party LUNs are available with the storage array based volume expansion procedures in the generator.

Note: Virtual volume expansion is not supported on RecoverPoint enabled volumes.


A new attribute called expansion-method has been added to the virtual-volumes context. There are two types of expansion available: storage-volume and concatenation. Storage-volume expansion will be the expansion method for most distributed devices. See the VPLEX 5.2 Administration Guide, starting on page 99, for more details: https://support.emc.com/docu47973_VPLEX-Administration-Guide.pdf

Pre GeoSynchrony 5.2
VPLEX Distributed Volume Expansion
This document walks the user through the steps of expanding a VPLEX distributed volume. This is an online process that can be done primarily through either the GUI or the VPLEXcli; both GUI and CLI procedures are shown. Utilizing the PowerPath "standby" option for stopping I/O is discussed in the Appendix.

Note: This procedure is no longer necessary with GeoSynchrony 5.2 or newer.

http://one.emc.com/clearspace/docs/DOC-57694


Data Migration
Extent migration
Extent migration is the process of non-disruptively moving an extent from one storage
volume to another. An extent migration should be used when:

Relocating an extent from a hot storage volume shared by other busy extents

Defragmenting a storage volume to create contiguous free space

Source and target arrays have a similar configuration, that is, the same number of
storage volumes, identical capacities, and so on

Device migration
Device migration is the process of non-disruptively moving a device from one set of
extents to another. A device migration should be used when:

Migrating data between dissimilar arrays. For example, a storage administrator might need to slice or combine extents on a target array's storage volumes to create devices that match the capacities of existing devices on the source array

Relocating a hot device from one type of storage to another

Relocating a device from an array behind one VPLEX in a VPLEX Metro cluster to an
array behind a different VPLEX Cluster (a VPLEX exclusive)

Batch migration
A batch migration is a group of extent or device migrations that are executed as a
single migration job. A batch migration should be used for:

Non-disruptive technology refreshes and lease rollovers

Non-disruptive cross VPLEX Metro device migration, that is, moving data to an array
at a different site (a VPLEX exclusive)

Migration jobs
The best practice is to monitor the migration's effect on the host application and to
adjust down the transfer size if it is too high.
Consider pausing migrations during the day and resuming them at night or during
off-peak hours to reduce the potential performance impact.
Consider committing migration jobs shortly after they complete to avoid double
writes to both the source and target RAIDs, which could potentially affect
performance.

Performance notes
Migration and front-end performance will primarily depend on:

Back-end storage layout (physical disks, RAID type, number of connections)

Migration transfer-size setting

Rebuild-type setting

Bandwidth available on the WAN COM link

Up to 25 local and global migrations can be in progress at any given time

o 25 is a shared limit for local and global combined

o A local migration occurs within a cluster

o A global migration occurs between clusters (Metro or Geo only)

o Other migrations will be queued and started once a rebuild slot opens up

Application write block size should match or be a factor of 4K

Host HBA limits

Amount of competing host traffic

Backend array configuration (disks and RAID striping)

Array's write limit to the durable media/disk (not the deferred write speed!)

PCI port bandwidth configured for backend and frontend I/O on VPLEX

In the best case (no FE or BE bottlenecks), a VS2 engine can do upwards of 3 GB/s sequential reads and 4 GB/s sequential writes. At 3 GB/s, that would be over 5 TB/hour for a producer/consumer read/write stream. Of course, that assumes all things are optimal.
What is transfer size? (as referenced in the vplexcli)
Transfer size is the region of a source element that is temporarily locked, read, and
written on the target. The default value is 128 KB. It can be as small as 4 KB (the block size of devices) and as large as 32 MB, although it is not recommended to exceed 2 MB. The
size can be changed during a migration, will take effect immediately, and is persistent
for future migrations.
How does transfer size affect performance?
A larger transfer size:


Results in a higher-performance migration but the tradeoff is that there will be more
performance impact on FE I/O, especially for VPLEX Metro migrations

Is set for devices where the priorities are data protection or migration performance
A smaller transfer size:

Results in the migration taking longer but will have lower impact on FE I/O in terms
of the response time to the host

Is set for devices where the priority is FE storage response time

Job transfer speed (as referenced in the VPLEX GUI)


The transfer speed determines the maximum number of bytes of data transferred at a
time from the source to the target. When creating a mobility job, you can control this
transfer speed. The higher the speed, the greater the impact on host I/O. A slower
transfer speed results in the mobility job taking longer to complete, but has a lower
impact on host I/O. Monitor the mobility job's progress and its effect on host I/O. If
the job is progressing too slowly, or I/O is greatly impacted, adjust the transfer speed
accordingly. You can change the transfer speed of a job while the job is in the queue
or in progress. The change takes effect immediately.
By default, the GUI transfer speed is set to lowest, which translates to 128 KB. This
transfer speed provides good throughput while maintaining the best front-end
performance in most environments. The following table shows the mapping between
the GUI transfer speed and the transfer-size attribute used in the CLI.

Transfer Speed (GUI)    transfer-size attribute (CLI)
Lowest (default)        128 KB
Low                     2 MB
Medium                  8 MB
High                    16 MB
Highest                 32 MB
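
For scripted migrations, the GUI-to-CLI mapping above can be kept in a small lookup. The sketch below is illustrative only; the dictionary and helper function are hypothetical and simply encode the table above.

# Hypothetical lookup reflecting the GUI transfer speed to CLI transfer-size mapping above.
TRANSFER_SIZE_BY_GUI_SPEED = {
    "lowest": "128KB",   # default; lowest impact on front-end response time
    "low": "2MB",
    "medium": "8MB",
    "high": "16MB",
    "highest": "32MB",   # fastest migration, largest front-end impact
}

def transfer_size_for(speed):
    """Return the CLI transfer-size value for a GUI transfer speed setting."""
    return TRANSFER_SIZE_BY_GUI_SPEED[speed.lower()]

print(transfer_size_for("Lowest"))  # -> 128KB
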

It is advisable to avoid the use of higher transfer speeds during times of heavy
application activity.


Scale Up Scale Out and Hardware Upgrades


Scale up refers to adding engines to an existing cluster, and scale out refers to attaching another cluster to a local cluster in either a Metro or Geo configuration. Hardware upgrades are upgrades between hardware versions, such as VS1 to VS2.
Note: Prior to any upgrades, please ensure to consult with services.

Scale Up Recommendations
Note: For dual or quad engine clusters, the NDU requirements require spreading the host initiators across engines. This may pose a problem when upgrading from a single engine to a dual engine cluster, as the customer would be required to reconfigure host connectivity. The NDU pre-check requirements for host connectivity can be mitigated under these circumstances by using the --skip-group-fe-checks option. This skips all NDU pre-checks related to front-end validation, including the unhealthy storage views and storage view configuration pre-checks. If choosing this option, the recommended best practice is to run the normal pre-check first, which will flag all the unhealthy Storage Views, allowing the customer to verify host connectivity for those Storage Views. The recommendation is to correct connectivity issues for all hosts running critical apps. Customers may also wish to perform an NDU after upgrading but may not have had the opportunity to reconfigure all the host connectivity yet.

Scale Out Recommendations


Scale out allows you to upgrade an existing VPLEX Local cluster to a VPLEX Metro or
Geo. Switching from a Metro to a Geo or vice versa is currently not supported.
Changes of this type would be very disruptive and require engineering assistance.
Combining two active VPLEX Local clusters to form a Metro or Geo is also not supported.
VPLEX Local clusters with the VS2 hardware are shipped without the WAN COM
I/O modules. In order to upgrade you would need to order the appropriate module,
upgrade to the appropriate supported code level and follow the procedure in VPLEX
Procedure Generator. The new cluster should be ordered with the correct I/O module
pre-installed from factory and once received, verify that GeoSynchrony revisions
match between the two clusters.



Attaching another cluster for a Metro or Geo configuration allows for the addition of a
cluster with a different engine count than the first cluster as long as all requirements
are met. All directors on one cluster must communicate with all directors on the other
cluster over the two different port groups. Configuring a single port group on one
fabric or network for the WAN COM is not supported.

Hardware Upgrades
The hardware upgrade procedure will be found in the VPLEX Procedure Generator.
This is a complicated procedure and requires a significant amount of preparation. The
best practice recommendation is to review the online training just prior to going to a
customer site to perform the procedure. Opening a proactive SR is also required prior
to performing the procedure.


VPLEX and RecoverPoint Integration


The addition of the RecoverPoint splitter to VPLEX with the 5.1 GeoSynchrony code
opens the door for a long list of new opportunities. The integration work that went
into the product provides for internal code checks that prevent improper
configurations but are limited in some areas. The limitations within these areas are
due to dependencies and user choices outside the control of VPLEX which could
inadvertently cause a DU/DL situation.
One possibility is based on user error, such as when adding volumes at a later date to existing Consistency Groups (CG). VPLEX and RecoverPoint integration checks for conditions when a volume is added to the data volume's CG, to flag the user to add a volume to the corresponding RP CG. The code checks cannot account for user error if the user places the data volume in the wrong CG and corresponding RP CG. The problem comes in when a restore operation from RP restores all the volumes in the expected CG but fails to restore the volume accidentally placed in another CG. Proper naming conventions should be used to help identify which application the CGs belong to.
Please refer to the white paper or technote:

EMC RECOVERPOINT: ADDING APPLICATION RECOVERY TO VPLEX LOCAL AND METRO

EMC RecoverPoint Deploying with VPLEX

CGs for Production and Replica volumes


We recommend that a pair of consistency groups be set up for each set of volumes to be protected: one for the production volumes and one for the replicas. This allows production and replica volumes to have different topologies (i.e. distributed vs. local virtual volumes). The validation and alignment checks in the product assume this type of configuration.

Local Volumes for Replica/Journal/Repository


Local virtual volumes should be used for replica, journal, and repository volumes. In the case of distributed replica volumes, access to the replica at the VPLEX cluster that does not have RecoverPoint deployed is blocked in all cases, even when Image Access is enabled. The only time distributed replica volumes are accessible at the non-RecoverPoint VPLEX cluster is if a Production Failover operation is done, in which case the replica volumes become production volumes.
Journal and Repository volumes should be local to the VPLEX cluster with
RecoverPoint, as access at the non-RecoverPoint cluster provides no value.
Regarding the use of faster storage for journals, refer to the RecoverPoint Best Practice documentation, as that guidance applies to all splitter types.


Monitoring for Performance Best Practices

Storage Resource Management Suite VPLEX Solution Pack


Per-object monitoring within VPLEX is now possible, and the Solution Pack makes it much easier. Please note that the Storage Resource Management Suite is not part of the VPLEX package and must be purchased separately.
During a customer POC, a critical success criterion may be to show response times on VPLEX storage volumes. This information is not included in the standard VPLEX Solution Pack, but it is possible to show it in the Storage Resource Management Suite. The steps required to provide this information are below.
One thing that should be mentioned will affect this procedure when the Storage Resource Management Suite officially supports VPLEX GeoSynchrony 5.2: several of the latency-related monitor statistic names have changed. This will be added to the 5.2 release notes.
In the 5.2 release, the following monitor statistics were renamed:
Old name -> New name
--------------------
cg.closure -> cg.closure-avg
cg.drain-lat -> cg.drain-avg-lat
cg.exch-lat -> cg.exch-avg-lat
cg.inter-closure -> cg.inter-closure-avg
cg.write-lat -> cg.write-avg-lat
fe-director.caw-lat -> fe-director.caw-avg-lat
fe-director.read-lat -> fe-director.read-avg-lat
fe-director.write-lat -> fe-director.write-avg-lat
fe-lu.read-lat -> fe-lu.read-avg-lat

fe-lu.write-lat -> fe-lu.write-avg-lat
fe-prt.caw-lat -> fe-prt.caw-avg-lat
fe-prt.read-lat -> fe-prt.read-avg-lat
fe-prt.write-lat -> fe-prt.write-avg-lat
rp-spl-node.write-lat -> rp-spl-node.write-avg-lat
rp-spl-vol.write-lat -> rp-spl-vol.write-avg-lat
storage-volume.per-storage-volume-read-latency -> storage-volume.per-storage-volume-read-avg-lat
storage-volume.per-storage-volume-write-latency -> storage-volume.per-storage-volume-write-avg-lat
storage-volume.read-latency -> storage-volume.read-avg-lat
storage-volume.write-latency -> storage-volume.write-avg-lat
Notes:
- Despite the monitor statistic name change, the CSV column output in the monitor
sink file is the same as before. This was done purposefully to ensure backwards
compatibility.
- In the CSV sink file, the histogram and average columns output as part of the bucket type now print zeros for all histogram entries and the average column. The only column that should be used is the "recent-average" column. The histogram and average column outputs will be removed in the next major GeoSynchrony release.
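
If you maintain scripts that create monitors by statistic name, a simple translation table like the sketch below (derived from the rename list above) can help move them to the 5.2 names. The helper function is hypothetical and purely illustrative.

# Illustrative mapping of pre-5.2 monitor statistic names to their 5.2 names,
# taken from the rename list above (the per-storage-volume latency renames are
# omitted here for brevity; see the full list above).
STAT_RENAMES_5_2 = {
    "cg.closure": "cg.closure-avg",
    "cg.drain-lat": "cg.drain-avg-lat",
    "cg.exch-lat": "cg.exch-avg-lat",
    "cg.inter-closure": "cg.inter-closure-avg",
    "cg.write-lat": "cg.write-avg-lat",
    "fe-director.caw-lat": "fe-director.caw-avg-lat",
    "fe-director.read-lat": "fe-director.read-avg-lat",
    "fe-director.write-lat": "fe-director.write-avg-lat",
    "fe-lu.read-lat": "fe-lu.read-avg-lat",
    "fe-lu.write-lat": "fe-lu.write-avg-lat",
    "fe-prt.caw-lat": "fe-prt.caw-avg-lat",
    "fe-prt.read-lat": "fe-prt.read-avg-lat",
    "fe-prt.write-lat": "fe-prt.write-avg-lat",
    "rp-spl-node.write-lat": "rp-spl-node.write-avg-lat",
    "rp-spl-vol.write-lat": "rp-spl-vol.write-avg-lat",
    "storage-volume.read-latency": "storage-volume.read-avg-lat",
    "storage-volume.write-latency": "storage-volume.write-avg-lat",
}

def to_5_2_name(stat):
    """Return the 5.2 statistic name, or the input unchanged if it was not renamed."""
    return STAT_RENAMES_5_2.get(stat, stat)
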

How to add response time metrics


1- Replace these two files with the attached ones: textoutputcollector-vplex.xml and vplex-create-w4n-collection.py, located on the collector host under /opt/APG/Collecting/Text-Collector/emc-vplex-rdc/conf. The changes in the files are:


a. textoutputcollector-vplex.xml
Parse the collected virtual volume .csv files, adding additional metrics called FeReadLat and FeWriteLat:

<value-list name="FeReadLat" leaf="fe-lu.read-lat .*recent.*" unit="us"/>
<value-list name="FeWriteLat" leaf="fe-lu.write-lat .*recent.*" unit="us"/>

b. vplex-create-w4n-collection.py
Add the following text to add the latency metrics in the monitor creation script:

cli.make_monitor(director, "W4N_VIRTUAL",
    "fe-lu.read-lat,fe-lu.write-lat,fe-lu.ops,fe-lu.read,fe-lu.write,"
    "virtual-volume.dirty,virtual-volume.ops,virtual-volume.read,virtual-volume.write",
    "/clusters/" + cluster + "/virtual-volumes/*",
    logdir + "/virt-volumes/" + cluster + "_" + director + ".volumes.csv")
2- Restart the collector: /opt/APG/bin/manage-modules service <collector_instance_name> restart
3- Run the Import DB Properties task under Centralized Management to add the new properties to the DB
4- Log out and log back in to get an up-to-date report display
5- Do a search to make sure you have the new properties in the DB: filter on parttype is VDisk and expand on device part name



6- Click on a director, e.g. director-1-2-A

7- Add these new parameters in the VPLEX Solution Pack:

a. Go to All >> SolutionPack for EMC VPLEX >> Clusters >> serialnb,cluster >> Virtual Volumes >> part >> Throughput

b. Copy and paste the Throughput node and change the filter to name is FeReadLat

c. Copy and paste the Throughput node and change the filter to name is FeWriteLat

8- You will see a report structure like this

9- Do not be surprised when you see multiple lines in the graph. Ultimately, the latency that the host sees depends upon which director it is accessing, but generally the latency is the average across all directors. The latency is affected by the I/O load on the director, queue depths, port usage, and a variety of other factors. In your particular graphs, the variability (less than 1 msec or so) should not be of concern. If you start seeing major differences (like 10-20 msec or more) between directors, then that is probably something worth investigating.


VPLEX Administration Recommendations


Use Access Control Lists to restrict administrator actions on LUNs under VPLEX
management
A serious DU or DL potentially exists if the administrator of the back-end array
accidentally or purposely exposes a LUN that has been claimed by VPLEX directly to
a non-VPLEX initiator (or to a different VPLEX system, for that matter). Under no
circumstances should a volume that is virtualized by VPLEX be presented directly to
another initiator. In all circumstances, this is a configuration error.
To prevent the above scenario, it is a best practice to put in place barriers that would
prevent or make difficult the above scenarios. One such barrier that can be used on a
Symmetrix is to configure Access Control Lists (ACLs) that prevent the administrator
from changing the LUN masking for any volume that is masked to VPLEX. Note that Symmetrix ACLs are only available on recent versions of Symmetrix firmware.
The strongest recommendation for the use of access control lists is for protection of the metadata volumes.

Naming conventions
For device/volume naming, users should decide early on whether they will name
volumes after underlying storage or prefer to have volumes named after the data they
contain. This is because it will become important on their first migration when they
have to decide whether to rename the volume after the target device or keep the
current name.
During the course of managing your virtualized environment you will create various
virtual storage objects (extents, devices, virtual volumes, storage views, and so on).
Each of these objects has a name. Some commands create default names. The
following name rules are enforced for all names:
Names can consist of:

Upper and lowercase letters

Digits

Underscores

Dashes

Spaces are not permitted


Names must start with either a letter or underscore.



The maximum name length is 63 characters. Some automated processes like
migrations rename devices by appending date stamp information to an object name.
If the original object name is close to the 63-character limit, this process will fail
because it won't be able to set the new name. It is best to keep names closer to a
maximum of 40 characters.
If you use the CLI more often and take advantage of tab completion, you may want to
keep the unique part of a name closer to the beginning to cut down on typing.
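
As a hedged illustration of the naming rules above, the sketch below validates a proposed object name against them; the function is illustrative only and is not part of the VPLEX CLI.

import re

# Illustrative validator for the VPLEX object-name rules described above:
# letters, digits, underscores, and dashes only; no spaces; must start with a
# letter or underscore; maximum length 63 characters (about 40 recommended).
NAME_PATTERN = re.compile(r"^[A-Za-z_][A-Za-z0-9_-]*$")

def check_name(name):
    """Return a list of rule violations for a proposed VPLEX object name."""
    problems = []
    if not NAME_PATTERN.match(name):
        problems.append("must start with a letter or underscore and use only "
                        "letters, digits, underscores, and dashes")
    if len(name) > 63:
        problems.append("exceeds the 63-character maximum")
    elif len(name) > 40:
        problems.append("longer than the recommended 40 characters")
    return problems

print(check_name("Sym1234_0ABC_oraprod_vol"))  # -> [] (valid)
print(check_name("ora prod vol"))              # -> invalid (contains spaces)
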
More important are the naming conventions used for the storage objects. The following are some naming convention suggestions. Remember that the naming convention should be decided based on the needs of the environment.

Storage volumes - Indicate the storage array and other identifying info

Extents - Keep consistent with storage volumes (default)

Devices - May reference information from the storage volume, but it is more important to make some reference to the host/application/purpose

Virtual volumes - The default is named after the top-level device, appending _vol

Additionally, try not to load names with too much information.

Log or capture CLI sessions


It is recommended that VPLEX administrators use the capture command to log
activities. This has various advantages that become more valuable if there are
multiple administrators. Captured sessions help with:

Accountability/Auditing

Ease of repeating tasks

Note taking

Support calls
Capture sessions can also be used to document best practices and procedures that you
develop specifically to your environment.
It is highly recommended that you start a capture session before any important admin
tasks, especially before NDU.
Captured sessions may also be the proof needed to show you didn't cause a problem.



Changing the GUI Timeout Value
Note: The timeout value must be a positive number. For security reasons, it is not recommended to set it higher than 30 minutes.

Note: Later versions of the GUI allow for changing the GUI Timeout
Value via the gear shaped icon in the upper right corner of the interface.

1.) Log in to the management server shell via SSH
2.) Change to the smsflex directory via the following command:
'cd /opt/emc/VPlex/apache-tomcat-6.0.x/webapps/smsflex'
3.) The GUI idle timeout value is stored in a property in the VPlexConsole.ini file (if the file does not exist, you will need to create it with the following command -> 'touch VPlexConsole.ini')
a. Open the VPlexConsole.ini file with an editor (i.e. emacs/vim) and add the 'AUTO_LOGOUT_MINUTES=x' property to the configuration file (or simply change the value of 'x' if it exists). To set the GUI logout to 20 minutes, replace the value of 'x' with 20.
b. Save the file and quit.
4.) Restart the management console with the following command:
'sudo /etc/init.d/VPlexManagementConsole restart'
5.) Log out of the management server
6.) You will need to establish a new GUI session for the new setting to be effective
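
If this setting has to be applied on more than one management server, the file edit in step 3 can be scripted. The sketch below is a hedged illustration only, assuming it runs locally on the management server with permission to write the file; it does not perform the service restart from step 4, and the tomcat directory name may differ on your system.

import os

# Illustrative sketch: set AUTO_LOGOUT_MINUTES in VPlexConsole.ini (see the steps above).
# The path uses the directory quoted in step 2; the tomcat version component may differ.
INI_PATH = "/opt/emc/VPlex/apache-tomcat-6.0.x/webapps/smsflex/VPlexConsole.ini"

def set_gui_timeout(minutes, ini_path=INI_PATH):
    lines = []
    if os.path.exists(ini_path):
        with open(ini_path) as f:
            # Drop any existing AUTO_LOGOUT_MINUTES line so it can be rewritten below.
            lines = [line for line in f if not line.startswith("AUTO_LOGOUT_MINUTES=")]
    lines.append("AUTO_LOGOUT_MINUTES=%d\n" % minutes)
    with open(ini_path, "w") as f:
        f.writelines(lines)

set_gui_timeout(20)  # 20-minute idle timeout, as in the example above
# Remember to restart the management console afterwards (step 4).
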



Changing the GUI Timeout Value from the GUI

Figure 25 GUI Timeout option button

Figure 26 Configuring Idle Timeout


Monitoring VPLEX
The EMC VPLEX CLI Guide has more information on monitoring VPLEX.
Make regular use of summary and reporting commands to monitor the VPLEX. These
commands can provide a quick overview of how the system is configured and its
general health. It is recommended that you become familiar with and develop your
own set of routine commands to do a quick check of the system.
The following commands are used to monitor clusters and system health:
health-check:
Displays a report indicating overall hardware/software health.
If no arguments are given, it performs a high level check for major subcomponents
with error conditions. Warnings are ignored. It is used for an instantaneous, high-level view of the health of the VPLEX.
Please refer to the CLI Guide for additional information about specific arguments and
their behaviors.

NDU pre-check
The ndu pre-check command should be run before you run a non-disruptive upgrade
on a system to upgrade GeoSynchrony. This command runs through a number of
checks to see if the non-disruptive upgrade would run into any errors in upgrading
GeoSynchrony. The checks performed by ndu pre-check are listed in the Upgrade
procedure for each software release. This procedure can be found in the generator.
Validate commands:
connectivity validate-be
This provides a summary analysis of the back-end connectivity information displayed
by connectivity director if connectivity director was executed for every director in the
system. It checks the following:


All directors see the same set of storage volumes.

All directors have at least two paths to each storage-volume.

The number of active paths from each director to a storage volume does not
exceed 4.



connectivity validate-wan-com
This command assembles a list of expected WAN COM connectivity, compares it to
the actual WAN COM connectivity and reports any discrepancies (i.e. missing or extra
connections). This command verifies IP or FC based WAN COM connectivity. If no
option is specified, displays a list of ports that are in error: either missing expected
connectivity or have additional unexpected connectivity to other ports. The expected
connectivity is determined by collecting all ports with role wan-com and requiring
that each port in a group at a cluster have connectivity to every other port in the same
group at all other clusters. When both FC and IP ports with role wan-com are present,
the smaller subset is discarded and the protocol of the remaining ports is assumed as
the correct protocol.
rp validate-configuration
This command checks the system configuration with respect to RecoverPoint and
displays errors or warnings if errors are detected. For VPLEX Metro configurations,
run this command on both management servers.
Please refer to the VPLEX CLI Guide for information about the checks that are
performed.
validate-system-configuration
This command performs the following checks:

Validates cache mirroring

Validates the logging volume

Validates the meta-volume

Validates back-end connectivity

In GeoSynchrony 5.0 and later, the perpetual monitors are kicked off and controlled by the CLI. These are rolled to ensure they don't get too big. They contain a standard set of categories that engineering is interested in. Do not delete these files! Users can still create their own monitors if they like.
Once the monitors start and are periodically updating, the process is to go to the
VPLEX management station and grab the CSV files, which are one per director, and
are typically stored in /var/log/VPlex/cli/. Import the CSV data into Excel, but this
is very manual. This data is usually collected in response to a perceived performance
problem and would normally be requested by engineering. Engineering has
developed a series of tools that help in the analysis of the performance data.
For the command report create-monitors, this grabs only port, virtual-volume, and storage-volume information. The best practice would be to kick these off with a polling frequency.
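
As an alternative to the manual Excel import mentioned above, the per-director CSV files can be summarized with a short script. The sketch below is illustrative only; it assumes the pandas library is available and that the monitor sink files follow the default CSV layout with a header row (column names vary by monitor, so inspect them first).

import glob
import pandas as pd

# Summarize each per-director perpetual monitor CSV found in the default location.
for path in glob.glob("/var/log/VPlex/cli/*.csv"):
    df = pd.read_csv(path)
    print(path)
    print(df.columns.tolist()[:10])   # first few column names
    print(df.describe().T.head(10))   # quick numeric summary of the first columns
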



Management server date/time across VPLEX Metro and Geo
Keep clocks on VPLEX Metro and Geo management servers in sync for log event
correlation. This will make troubleshooting or auditing the system easier.
Email Home Recommendation
Adding and removing email recipients may be required after the initial install. The recommendation is to configure the recipient as an email distribution list so that adding or removing recipients is managed through the email application.
You can also use the configuration event-notices-reports config and configuration event-notices-reports reset commands to import/apply/remove customized call-home event files.


Summary
VPLEX provides an abstraction layer between the host and the storage array. VPLEX inherently adds HA attributes, but it also requires careful planning to take full advantage of them. The VPLEX best practices we've reviewed, along with traditional SAN architecture and planning best practices, must be observed to fully leverage the availability, mobility, and collaboration capabilities of the EMC VPLEX platform.

Appendix A: VS1

Table 2 VS1 and VS2 Component Comparison


VPLEX Engine VS1


The following figure shows the front and rear views of the VPLEX VS1 Engine.

Figure 27 Front and rear view of the VPLEX VS1 Engine


Figure 28 VPLEX preconfigured port arrangement VS1 hardware

Director A and Director B each have four I/O modules for SAN connectivity. The
IOM carriers are for inter-director communications over Local COM or WAN COM.
Two of the four I/O modules on each director are configured for host connectivity and
are identified as front end, while the other two modules are configured for array
connectivity and are identified as back end. The front-end ports log in to the fabrics
and present themselves as targets for zoning to the host initiators, while the back-end
ports log in to the fabrics as initiators to be used for zoning to the array targets.
Each director will connect to both fabrics with both front-end and back-end ports.
Those connections should span both I/O modules for both the front end and back end
so that the failure of a single I/O module won't create unnecessary Data Unavailability
events.
VPLEX offers separate port connectivity for WAN COM (director-to-director
communication between clusters). FC WAN COM ports will be connected to
separate backbone fabrics that span the two sites. This allows data to flow between
the two VPLEX clusters without requiring a merged fabric between the two sites.
Dual fabrics that already span the two sites are also supported but not required. All
A4-FC02 and B4-FC02 ports from both clusters will connect to one fabric and all
A4-FC03 and B4-FC03 ports will connect to the other fabric. This provides a redundant
network capability in which all directors communicate with all the directors at the other
site even in the event of a fabric failure.

For GigE, all A5-GE00 and B5-GE00 ports will be connected through one Ethernet
network and all A5-GE01 and B5-GE01 ports will be connected through a second
Ethernet network over a different geographical path to prevent a single environmental
problem from taking out both networks.

Figure 29 VPLEX fabric connectivity
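The port-to-network pairing described above reduces to a simple rule: all *-FC02
WAN COM ports share one fabric and all *-FC03 ports share the other, and likewise
*-GE00 and *-GE01 ports use separate Ethernet networks. The sketch below (Python,
purely illustrative) checks a hypothetical planned assignment against that rule; the
fabric and network labels and the planned assignment are made up for the example.

# Illustrative check of WAN COM port-to-network assignments (VS1 naming).
# Rule from the text: FC02 ports share one fabric, FC03 ports the other;
# GE00 ports share one Ethernet network, GE01 ports the other.
EXPECTED_NETWORK = {
    "FC02": "fabric-A",
    "FC03": "fabric-B",
    "GE00": "ethernet-1",
    "GE01": "ethernet-2",
}

# Hypothetical planned assignment (one deliberate mistake for the example).
planned = {
    "A4-FC02": "fabric-A",
    "B4-FC02": "fabric-A",
    "A4-FC03": "fabric-B",
    "B4-FC03": "fabric-A",
    "A5-GE00": "ethernet-1",
    "B5-GE00": "ethernet-1",
    "A5-GE01": "ethernet-2",
    "B5-GE01": "ethernet-2",
}

for port, network in sorted(planned.items()):
    suffix = port.split("-")[1]          # e.g. "FC02" or "GE01"
    expected = EXPECTED_NETWORK[suffix]
    status = "OK" if network == expected else f"MISMATCH (expected {expected})"
    print(f"{port}: {network} -> {status}")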

Glossary
This glossary contains terms related to VPLEX federated storage systems. Many of
these terms are used in this manual.

A
AccessAnywhere The breakthrough technology that enables VPLEX clusters to
provide access to information between clusters that are separated by distance.
active/active A cluster with no primary or standby servers, because all servers can
run applications and interchangeably act as backup for one another.
active/passive A powered component that is ready to operate upon the failure of a
primary component.
array A collection of disk drives where user data and parity data may be stored.
Devices can consist of some or all of the drives within an array.
asynchronous Describes objects or events that are not coordinated in time. A process
operates independently of other processes, being initiated and left for another task
before being acknowledged. For example, a host writes data to the blades and then
begins other work while the data is transferred to a local disk and across the WAN
asynchronously. See also synchronous.

B
bandwidth The range of transmission frequencies a network can accommodate,
expressed as the difference between the highest and lowest frequencies of a
transmission cycle. High bandwidth allows fast or high-volume transmissions.
bias When a cluster has the bias for a given DR1, it will remain online if connectivity
is lost to the remote cluster (in some cases this may be overruled by VPLEX Cluster
Witness). This is now known as preference.
bit A unit of information that has a binary digit value of either 0 or 1.
block The smallest amount of data that can be transferred following SCSI standards,
which is traditionally 512 bytes. Virtual volumes are presented to users as contiguous
lists of blocks.
block size The actual size of a block on a device.
byte Memory space used to store eight bits of data.

C
cache Temporary storage for recent writes and recently accessed data. Disk data is
read through the cache so that subsequent read references are found in the cache.
cache coherency Managing the cache so data is not lost, corrupted, or overwritten.
With multiple processors, data blocks may have several copies, one in the main
memory and one in each of the cache memories. Cache coherency propagates the
blocks of multiple users throughout the system in a timely fashion, ensuring the data
blocks do not have inconsistent versions in the different processors' caches.
cluster Two or more VPLEX directors forming a single fault-tolerant cluster,
deployed as one to four engines.
cluster ID The identifier for each cluster in a multi-cluster deployment. The ID is
assigned during installation.
cluster deployment ID A numerical cluster identifier, unique within a VPLEX cluster.
By default, VPLEX clusters have a cluster deployment ID of 1. For multi-cluster
deployments, all but one cluster must be reconfigured to have different cluster
deployment IDs.
clustering Using two or more computers to function together as a single entity.
Benefits include fault tolerance and load balancing, which increases reliability and up
time.
COM The intra-cluster communication (Fibre Channel). The communication used for
cache coherency and replication traffic.
command line interface (CLI) A way to interact with a computer operating system or
software by typing commands to perform specific tasks.
continuity of operations (COOP) The goal of establishing policies and procedures to
be used during an emergency, including the ability to process, store, and transmit data
before and after.
controller A device that controls the transfer of data to and from a computer and a
peripheral device.

D
data sharing The ability to share access to the same data with multiple servers
regardless of time and location.
detach rule A rule set applied to a DR1 to declare a winning and a losing cluster in
the event of a failure.
device A combination of one or more extents to which you add specific RAID
properties. Devices use storage from one cluster only; distributed devices use storage
from both clusters in a multi-cluster plex. See also distributed device.

director A CPU module that runs GeoSynchrony, the core VPLEX software. There
are two directors in each engine, and each has dedicated resources and is capable of
functioning independently.
dirty data The write-specific data stored in the cache memory that has yet to be
written to disk.
disaster recovery (DR) The ability to restart system operations after an error,
preventing data loss.
disk cache A section of RAM that provides a cache between the disk and the CPU.
RAM's access time is significantly faster than disk access time; therefore, a disk-caching
program enables the computer to operate faster by placing recently accessed
data in the disk cache.
distributed device A RAID 1 device whose mirrors are in Geographically separate
locations.
distributed file system (DFS) Supports the sharing of files and resources in the form
of persistent storage over a network.
Distributed RAID 1 device (DR1) A cache-coherent VPLEX Metro or Geo volume that
is distributed between two VPLEX clusters.

E
engine Enclosure that contains two directors, management modules, and redundant
power.
Ethernet A Local Area Network (LAN) protocol. Ethernet uses a bus topology,
meaning all devices are connected to a central cable, and supports data transfer rates
of between 10 megabits per second and 10 gigabits per second. For example, 100Base-T supports data transfer rates of 100 Mb/s.
event A log message that results from a significant action initiated by a user or the
system.
extent A slice (range of blocks) of a storage volume.

F
failover Automatically switching to a redundant or standby device, system, or data
path upon the failure or abnormal termination of the currently active device, system,
or data path.
fault domain A concept where each component of an HA solution is separated by a
logical or physical boundary so that if a fault happens in one domain it will not transfer
to the other. The boundary can represent any item that could fail (for example, a
separate power domain would mean that power would remain available in the second
domain if it failed in the first domain).
fault tolerance Ability of a system to keep working in the event of hardware or
software failure, usually achieved by duplicating key system components.
Fibre Channel (FC) A protocol for transmitting data between computer devices.
Longer distances require the use of optical fiber; however, FC also works using coaxial
cable and ordinary telephone twisted-pair media. Fibre Channel offers point-to-point,
switched, and loop interfaces. Used within a SAN to carry SCSI traffic.
field replaceable unit (FRU) A unit or component of a system that can be replaced on
site as opposed to returning the system to the manufacturer for repair.
firmware Software that is loaded on and runs from the flash ROM on the VPLEX
directors.

G
Geographically distributed system A system physically distributed across two or
more Geographically separated sites. The degree of distribution can vary widely,
from different locations on a campus or in a city to different continents.
Geoplex A DR1 device configured for VPLEX Geo.
gigabit (Gb or Gbit) 1,073,741,824 (2^30) bits. Often rounded to 10^9.
gigabit Ethernet The version of Ethernet that supports data transfer rates of 1 Gigabit
per second.
gigabyte (GB) 1,073,741,824 (2^30) bytes. Often rounded to 10^9.
global file system (GFS) A shared-storage cluster or distributed file system.

H
host bus adapter (HBA) An I/O adapter that manages the transfer of information
between the host computer's bus and memory system. The adapter performs many
low-level interface functions automatically or with minimal processor involvement to
minimize the impact on the host processor's performance.

I
input/output (I/O) Any operation, program, or device that transfers data to or from a
computer.
internet Fibre Channel protocol (iFCP) Connects Fibre Channel storage devices to
SANs or the Internet in Geographically distributed systems using TCP.
intranet A network operating like the World Wide Web but with access restricted to a
limited group of authorized users.

internet small computer system interface (iSCSI) A protocol that allows SCSI
commands to travel through IP networks, carrying data from storage units to servers
anywhere in a computer network.
I/O (input/output) The transfer of data to or from a computer.

K
kilobit (Kb) 1,024 (2^10) bits. Often rounded to 10^3.
kilobyte (K or KB) 1,024 (2^10) bytes. Often rounded to 10^3.

L
latency The amount of time required to fulfill an I/O request.
load balancing Distributing the processing and communications activity evenly
across a system or network so no single device is overwhelmed. Load balancing is
especially important when the number of I/O requests issued is unpredictable.
local area network (LAN) A group of computers and associated devices that share a
common communications line and typically share the resources of a single processor
or server within a small Geographic area.
logical unit number (LUN) Used to identify SCSI devices, such as external hard
drives, connected to a computer. Each device is assigned a LUN number which serves
as the device's unique address.

M
megabit (Mb) 1,048,576 (2^20) bits. Often rounded to 10^6.
megabyte (MB) 1,048,576 (2^20) bytes. Often rounded to 10^6.
metadata Data about data, such as data quality, content, and condition.
metavolume A storage volume used by the system that contains the metadata for all
the virtual volumes managed by the system. There is one metadata storage volume
per cluster.
Metro-Plex Two VPLEX Metro clusters connected within metro (synchronous)
distances, approximately 60 miles or 100 kilometers.
metroplex A DR1 device configured for VPLEX Metro.
mirroring The writing of data to two or more disks simultaneously. If one of the disk
drives fails, the system can instantly switch to one of the other disks without losing
data or service. RAID 1 provides mirroring.
miss An operation where the cache is searched but does not contain the data, so the
data instead must be accessed from disk.


N
namespace A set of names recognized by a file system in which all names are unique.
network System of computers, terminals, and databases connected by communication
lines.
network architecture Design of a network, including hardware, software, method of
connection, and the protocol used.
network-attached storage (NAS) Storage elements connected directly to a network.
network partition When one site loses contact or communication with another site.

P
parity The even or odd number of 0s and 1s in binary code.
parity checking Checking for errors in binary data. Depending on whether the byte
has an even or odd number of bits, an extra 0 or 1 bit, called a parity bit, is added to
each byte in a transmission. The sender and receiver agree on odd parity, even parity,
or no parity. If they agree on even parity, a parity bit is added that makes each byte
even. If they agree on odd parity, a parity bit is added that makes each byte odd. If
the data is transmitted incorrectly, the change in parity will reveal the error.
partition A subdivision of a physical or virtual disk, which is a logical entity only
visible to the end user, not any of the devices.
plex A single VPLEX cluster.
preference When a cluster has the bias for a given DR1, it will remain online if
connectivity is lost to the remote cluster (in some cases this may be overruled by
VPLEX Cluster Witness).

R
RAID (Redundant Array of Independent Disks) The use of two or more storage
volumes to provide better performance, error recovery, and fault tolerance.
RAID 0 A performance-oriented striped or dispersed data-mapping technique.
Uniformly sized blocks of storage are assigned in regular sequence to all of the array's
disks. Provides high I/O performance at low inherent cost. No additional disks are
required. The advantages of RAID 0 are a very simple design and ease of
implementation.
RAID 1 Also called mirroring; this has been used longer than any other form of
RAID. It remains popular because of simplicity and a high level of data availability.
A mirrored array consists of two or more disks. Each disk in a mirrored array holds
an identical image of the user data. RAID 1 has no striping. Read performance is
improved since either disk can be read at the same time. Write performance is lower
than single disk storage. Writes must be performed on all disks, or mirrors, in the
RAID 1. RAID 1 provides very good data reliability for read-intensive applications.
RAID leg A copy of data, called a mirror, that is located at a user's current location.
rebuild The process of reconstructing data onto a spare or replacement drive after a
drive failure. Data is reconstructed from the data on the surviving disks, assuming
mirroring has been employed.
redundancy The duplication of hardware and software components. In a redundant
system, if a component fails then a redundant component takes over, allowing
operations to continue without interruption.
reliability The ability of a system to recover lost data.
remote direct memory access (RDMA) Allows computers within a network to
exchange data using their main memories and without using the processor, cache, or
operating system of either computer.
Recovery Point Objective (RPO) The amount of data that can be lost before a given
failure event.
Recovery Time Objective (RTO) The amount of time the service takes to fully
recover after a failure event.

S
scalability Ability to easily change a system in size or configuration to suit changing
conditions, to grow with your needs.
simple network management protocol (SNMP) Monitors systems and devices in a
network.
site ID The identifier for each cluster in a multi-cluster plex. In a Geographically
distributed system, one cluster's ID is 1, the next is 2, and so on, each number
identifying a physically separate cluster. These identifiers are assigned during
installation.
small computer system interface (SCSI) A set of evolving ANSI standard electronic
interfaces that allow personal computers to communicate faster and more flexibly than
previous interfaces with peripheral hardware such as disk drives, tape drives, CD-ROM drives, printers, and scanners.
split brain Condition when a partitioned DR1 accepts writes from both clusters. This
is also known as a conflicting detach.
storage RTO The amount of time taken for the storage to be available after a failure
event (in all cases this will be a smaller time interval than the RTO, since the storage is
a prerequisite).
stripe depth The number of blocks of data stored contiguously on each storage
volume in a RAID 0 device.
striping A technique for spreading data over multiple disk drives. Disk striping can
speed up operations that retrieve data from disk storage. Data is divided into units
and distributed across the available disks. RAID 0 provides disk striping.
storage area network (SAN) A high-speed special purpose network or subnetwork
that interconnects different kinds of data storage devices with associated data servers
on behalf of a larger network of users.
storage view A combination of registered initiators (hosts), front-end ports, and
virtual volumes, used to control a host's access to storage.
storage volume A LUN exported from an array.
synchronous Describes objects or events that are coordinated in time. A process is
initiated and must be completed before another task is allowed to begin. For example,
in banking two withdrawals from a checking account that are started at the same time
must not overlap; therefore, they are processed synchronously. See also
asynchronous.

T
throughput:
1. The number of bits, characters, or blocks passing through a data communication system or portion of that system.
2. The maximum capacity of a communications channel or system.
3. A measure of the amount of work performed by a system over a period of time. For example, the number of I/Os per day.

tool command language (TCL) A scripting language often used for rapid prototypes
and scripted applications.
transmission control protocol/Internet protocol (TCP/IP) The basic communication
language or protocol used for traffic on a private network and the Internet.

U
uninterruptible power supply (UPS) A power supply that includes a battery to
maintain power in the event of a power failure.
universal unique identifier (UUID) A 64-bit number used to uniquely identify each
VPLEX director. This number is based on the hardware serial number assigned to
each director.

virtualization A layer of abstraction implemented in software that servers use to
divide available physical storage into storage volumes or virtual volumes.
virtual volume A virtual volume looks like a contiguous volume, but can be
distributed over two or more storage volumes. Virtual volumes are presented to
hosts.
VPLEX Cluster Witness A feature in VPLEX V5.x that can augment and improve
upon the failure handling semantics of Static Bias Rules.

W
wide area network (WAN) A Geographically dispersed telecommunications network.
This term distinguishes a broader telecommunication structure from a local area
network (LAN).
world wide name (WWN) A specific Fibre Channel Name Identifier that is unique
worldwide and represented by a 64-bit unsigned binary value.
write-through mode A caching technique in which the completion of a write request
is communicated only after data is written to disk. This is almost equivalent to non-cached systems, but with data protection.


Copyright 2013 EMC Corporation. All Rights Reserved.


EMC believes the information in this publication is accurate as of its publication date. The
information is subject to change without notice.
THE INFORMATION IN THIS PUBLICATION IS PROVIDED "AS IS." EMC CORPORATION
MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE
INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Use, copying, and distribution of any EMC software described in this publication requires an
applicable software license.
For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on
EMC.com.
All other trademarks used herein are the property of their respective owners.
