to VMware Disaster
Recovery and Data
Protection
Mike Preston
IT Professional and Blogger
Contents
Chapter 1 Modern Data Centers Call for Modern Data Protection . . . . . . . . . . . . . . . . . . . . . . . . 5
Data growth . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Always-On availability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
What is the modern data center? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
The modern data center will be highly virtualized . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
The modern data center will leverage the cloud, be it private, public or hybrid. . . . . . . . . . . . . . 8
The modern data center will have modern storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Tying it all together . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Data recoverability in the modern data center . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Transforming from backup to data recoverability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Chapter 2 Achieving Availability Through Backup and Replication . . . . . . . . . . . . . . . . . . . . . 13
Traditional backup methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Modern data protection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Consistent backups and quiescence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Modern snapshot methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Backup from VMware snapshot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
Backup from Storage Snapshots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
The importance of cleaning up . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Integration with vStorage APIs for Data Protection (VADP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Change Block Tracking (CBT) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Backup transport modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Backup targets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Backup media . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Backup retention . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Full backup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Incremental backup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Chapter 1
Modern Data Centers Call
for Modern Data Protection
Businesses, both large and small, are all in, or approaching, a transformation phase right now. The
24/7 Always-On demands from consumers and employees have placed us all into a fast-paced,
competitive landscape, one that requires enterprises and small to mid-sized businesses (SMBs)
all over the world to shift focus and change the way they deliver services, both operationally and
technically. In fact, if we compare our data centers of today to what they were like a mere ten years ago,
we can see drastic changes. The way we deploy and configure hardware has changed. Virtualization
has replaced racks upon racks of single-application servers with small compute clusters and high
consolidation ratios. Our storage area networks (SANs) and storage arrays are no longer serving up
logical unit numbers (LUNs) on a one-to-one basis, but providing storage in a shared fashion to our
hypervisors. In fact, a technology known as hyper-convergence is on the rise and is placing
its footprint in our data centers by eliminating the need for these centralized storage arrays. It's doing
this by presenting local storage in a distributed, scalable, building-block node architecture, all while
keeping it next to the compute and providing great performance improvements.
All of these changes over the years are due to a fundamental shift in the way customers and employees
perceive their access to information. This causes organizations to modernize their data centers and
make drastic changes in the way they deliver services and applications. Before we get too far into the
components of a modern data center, let's first discuss a couple of challenges that the modern data
center helps organizations solve: data growth and availability.
Data growth
There's no doubt that over the last few years businesses, both small and large, have made data their
number one protected asset. Data is now seen as a strategic corporate initiative, giving companies
competitive advantages and essentially becoming the lifeblood of the organization. For these reasons,
organizations have begun to create, analyze and store more and more data. The amount of data
growth over the last 10 years has been astronomical, with the world's digital footprint doubling almost
every two years. According to an IDC study, back in 2011 we had created 1.8 zettabytes (1.8 trillion
gigabytes) of data. In 2012 that number grew to 2.8 zettabytes. The next year, 2013, we sat at 4.4
zettabytes. The prediction for 2020: a whopping 44 zettabytes of information.
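To make the doubling claim concrete, here is a toy projection in Python. The two-year doubling period and the 4.4 ZB anchor for 2013 come from the text; the model itself is a simplification, not IDC's methodology.

```python
# Sketch: projecting worldwide data volume, assuming it doubles roughly
# every two years, anchored at the IDC figure of 4.4 ZB for 2013.

def projected_zettabytes(year, anchor_year=2013, anchor_zb=4.4, doubling_years=2):
    """Return projected data volume in zettabytes for the given year."""
    periods = (year - anchor_year) / doubling_years
    return anchor_zb * (2 ** periods)

# Strict two-year doubling slightly overshoots IDC's 44 ZB forecast for 2020:
print(round(projected_zettabytes(2020), 1))  # 49.8
```

The gap between 49.8 and 44 simply means the forecast growth rate is a bit slower than a clean doubling every two years.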
A large contributing factor to this exponential data growth is a concept dubbed the
Internet of Things (IoT). To understand the IoT, we have to look no further than our own house. Think
about the number of devices you have transmitting data inside your home. Sure, the usual suspects
will always be present: laptops, desktops, phones, tablets. But we are seeing an increase of other
smart devices making their way inside our four walls: items such as smart televisions, thermostats,
home automation, security systems, automobiles, etc. They are quickly taking a strong hold on the
way we live and the amount of bandwidth we require. Although these devices all perform vastly
different functions, they all have one thing in common: they all generate data. Data that is collected,
processed, analyzed and treated as a Tier 1 asset by an organization somewhere in the world. In fact,
IDC reports that in the year 2020, data from embedded systems like those included in the IoT will
account for 10% of the data we store, roughly the same amount as all the data available on the
planet today. This data, which provides companies with real-time information about the way we live
and operate, is extremely important to business because it opens doors in terms of customer response
times and meeting the personalized service levels that employees and customers have come to expect.
2015 Veeam Software
Data growth is perhaps the biggest challenge presented inside data centers today. We need to
find ways to store it, analyze it, process it, and use it in a way that can further our company objectives.
With data taking the forefront, we also need to be able to protect it, essentially duplicating it, and with
the amount of generated data doubling almost every two years, this can pose quite a challenge to
organizations moving forward.
Always-On availability
Another challenge we are seeing within current data centers comes in terms of availability. Gone are
the days when we could shut down our key systems and lock our doors every night at 5 p.m. The Internet
has truly changed the way consumers perceive technology and service. Could you imagine browsing
to an online store and not being able to purchase an item because it was 5:30 p.m.? Companies today
must operate 24 hours a day, 7 days a week in order to meet customer demands.
An organization's around-the-clock availability of services applies not only to external customers, but also to its
internal employees. We have seen an increase of workers departing from their normal nine-to-five workday.
Employees working from home or remotely are on the rise. The cloud and the Internet have made it possible
for companies to expand their global reach, spanning multiple time zones and generating more revenue.
Devices transmitting data in an IoT fashion do so in real time, 24/7; they don't wait for services to be
available. All of these factors are creating new Tier 1 workloads, workloads which are not just contributing to
our data growth, but also requiring that the services are available and Always-On.
Data growth and data availability are truly a couple of the most challenging issues we face today in our
data centers. To put the 2020 estimated growth into perspective, we can simply look to the sky: by 2020
we will have almost as many digital bits of data stored as there are stars in the universe. And businesses
and corporations will have a large responsibility to make that data available and put the proper
protection mechanisms in place around it. But are the technology advancements we've made in the
last 10 years enough to overcome the challenges of today's world? Will they be enough to protect us
from tomorrow's challenges? The answer to these challenges is being built into the architecture of
what's called the modern data center.
[Table: average RTO/RPO figures and estimated downtime costs, from the Veeam Availability Report]
and/or ignored. In order to get the level of availability and avoidance they need, organizations need to
focus on a data recoverability strategy, rather than just implementing a simple backup solution.
Data recoverability strategies differ from backup in many ways, but for the most part they contain five
key characteristics that must be met.
1. High-Speed Recovery: This is the key feature in any data recoverability strategy. With the cost of
downtime being so high, recovering services and applications faster means less of an impact in lost
revenue, lost productivity and business disruption. On average, the recovery time objective (RTO) of
a mission-critical application is 2.86 hours; the modern data center aims at moving this RTO down
to 15 minutes or less. Data recoverability solutions need to leverage the storage and automation
features within the modern data center to provide a number of ways to recover from failures, and
they need to leverage both backup and replication, along with virtualization features, to restore
services in an almost-instant manner.
2. Data Loss Avoidance: In order to provide a low RPO, data recoverability solutions need to make
frequent backups of our production data. As we mentioned before, the average RPO of a mission-critical
application is 4.81 hours. Modern data recoverability solutions strive to bring this number
down to less than 15 minutes. To do this, they must leverage many virtualization features, such as
Change Block Tracking and wide area network (WAN) acceleration, in order to physically move the
data at a speed that can support the desired RPO, and leverage deduplication and compression on
the backend to ensure that capacity is available to house the many restore points being created.
Modern data protection solutions need to be smart about how they back up and how they store our
data in order to lower RPO, thus minimizing the amount of data loss during an outage.
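The bandwidth side of a sub-15-minute RPO comes down to simple arithmetic: the changes produced in one RPO window must be shippable within that window. The change rate below is an illustrative number, not a figure from the text.

```python
# Sketch: minimum sustained throughput needed to meet an RPO, before any
# savings from CBT, deduplication or compression are factored in.

def required_mbps(changed_gb_per_hour, rpo_minutes):
    """Minimum sustained throughput (megabits/s) to ship one RPO window's changes."""
    changed_gb = changed_gb_per_hour * (rpo_minutes / 60)   # GB changed per window
    megabits = changed_gb * 8 * 1000                        # GB -> megabits (decimal units)
    return megabits / (rpo_minutes * 60)                    # spread across the window

# A VM churning 20 GB/hour needs roughly 44 Mb/s of sustained throughput:
print(round(required_mbps(20, 15), 1))  # 44.4
```

This is exactly why features like CBT and WAN acceleration matter: they shrink `changed_gb` so the same link can support a much tighter RPO.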
3. Verified Recoverability: Let's face it, our backups are no good to us if they fail or are corrupt when we go
to restore them. Unfortunately, only 5.26% of backups are tested each and every quarter, and an average
of 17% of backups fail to recover (Source: Veeam Availability Report). The modern data protection
strategy must provide a means by which we can test, validate and ensure that our backups are indeed
restorable, and it must do this in a way that is both efficient and automated. With a highly virtualized
environment, a data recoverability solution can leverage the power of virtualization, essentially powering
these backups on in an isolated environment, just as they would be if they were restored, to ensure that
they are 100% restorable in the event they need to be.
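The verification loop described above can be sketched as follows. The `LabVM` class and its probes are simplified stand-ins, not any vendor's API; a real solution boots each VM directly from its backup file on an isolated network.

```python
# Sketch: boot each backup in an isolated lab, probe it, report pass/fail.

class LabVM:
    """Stand-in for a VM powered on in an isolated lab from a backup file."""
    def __init__(self, backup, healthy):
        self.backup, self.healthy = backup, healthy
    def heartbeat(self):   # hypervisor-tools heartbeat probe
        return self.healthy
    def ping(self):        # network reachability probe
        return self.healthy
    def app_probe(self):   # application-level test, e.g. a database query
        return self.healthy

def verify_backups(backups, boot):
    """Boot each backup in isolation and record whether all probes pass."""
    results = {}
    for name, backup in backups.items():
        vm = boot(backup)  # power on from the backup file; nothing is written back
        results[name] = all([vm.heartbeat(), vm.ping(), vm.app_probe()])
    return results

backups = {"sql01": {"restorable": True}, "web02": {"restorable": False}}
boot = lambda b: LabVM(b, b["restorable"])
print(verify_backups(backups, boot))  # {'sql01': True, 'web02': False}
```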
4. Leveraging Data: For the most part, our data recoverability solutions are an insurance policy,
meaning we have terabytes and terabytes of production data that has been backed up sitting
stagnant on disk and tapes within our organizations and is very seldom used. A solid modern data
recoverability solution will take this once-stagnant data and leverage it in ways that are beneficial
to disaster avoidance. Using the same isolated Virtual Lab technology that is implemented in
testing our backups, data recoverability solutions can provide a virtual sandbox environment to
administrators and application owners in which they can test different software updates and
patches. This allows us to perform upgrades or patch applications and operating systems on a
near-exact copy of our virtual machine, ensuring that there are no surprises when the same updates
are applied within the production environment. With 87% of organizations reporting that they
experience more-than-expected downtime when applying patches and upgrades (Source: Veeam),
this is a huge value add within a data recoverability solution.
5. Complete Visibility: Data recoverability solutions need to do more than just report on whether
jobs have failed or succeeded. We need to have complete visibility into our backup environments
in order to properly plan for the future and manage the now. Knowing where bottlenecks may be
within your environment can help you to remediate issues which could possibly increase your RPO
and RTO. Knowing the size of incremental backups and the deduplication ratios being applied can
help you properly perform capacity planning as it pertains to your backup storage. Having a fully
customizable reporting feature with which you can perform compliance auditing on what is being
backed up, or more importantly what isn't being protected, is key to ensuring there are no surprises
in the future that may have a significant impact on the business. Modern data recoverability
solutions must provide complete visibility into your backup environment, taking into account
resource usage, capacity planning and performance monitoring, all while eliminating the guesswork
that sometimes comes with managing a backup environment.
With these five key components of a data recoverability strategy in place, organizations can put
themselves in a great position to close in on the availability gap that they face and eventually fully
modernize their data center. In the next few chapters we will dive deeper into how traditional backup
solutions are lacking in terms of new technologies, and how modern data recoverability solutions
containing the above five key components, coupled with VMware virtualization technology, integrate
with each other to provide us with Always-On Availability for the modern data center.
Chapter 2 Achieving
Availability Through Backup
and Replication
Just as we have seen the modernization of the data center over the years, great strides have been made
in the backup technologies and methodologies that surround it. Although we still see some older target
media within the data center, mainly tape, the methods we use to obtain and transfer the data to
those media are much different. The introduction of virtualization into our data centers has opened up
many doors in the backup world, allowing new and innovative vendors to provide solutions that are more
efficient, agentless and have less overhead than traditional backup solutions could provide.
Another major difference between traditional and modern solutions is the method they use to capture
and transport the data from source to target. Traditional methods essentially have one transport mode,
the network. Data is processed by an agent within the OS and sent across the production network to
its final destinations. Because modern solutions are highly virtualized, backup vendors have the option
to take advantage of APIs and functionality provided by VMware to direct-attach or hot-add the source
VM's disks to a proxy. The proxy can then handle the processing of the data, as opposed to utilizing an
agent within the production workload. These modern methods not only offload the backup processing
from your production VMs, but can also offload traffic from your production networks when coupled
with a dedicated backup network.
Most modern data protection solutions follow a similar process, depending on the type of backup
being performed. First the VM is quiesced, invoking Microsoft's Volume Shadow Copy Service (VSS),
or VMware Tools if need be, in order to ensure an application-consistent backup. Then, a snapshot is
taken of the VM, which allows live data to be redirected to a delta disk while the original is copied.
From there, a transport mode is determined for the backup, the source and target are mapped, and
the backup type, whether full or incremental, is performed. Incremental backups can take
advantage of VMware's Change Block Tracking (CBT) to determine changed blocks, or the changes can
be calculated by the backup software itself. Backups are then stored compressed and deduplicated,
whether inline or post-process, and retention policies are applied.
A lot happens during this process, so let's take a look at each section in more detail.
As you can imagine, the longer a VM snapshot exists, the higher the number of changed blocks, which
produces larger redo logs. In fact, a redo log has the possibility of growing to the same size
as the original VM's disk if each and every block were to incur a change. When multiple snapshots
exist concurrently, capacity can quickly become a problem. Perhaps the biggest issue with running
a VM containing snapshots is the performance penalty. Due to the nature of reading and updating the
metadata or redo log with each read and write, a VM running on a snapshot will drive more IOPS than
one without, resulting in more stress on the underlying storage. With both performance penalties and
potential capacity issues, it's best not to keep snapshots around for long periods of time. Due to these
concerns, backup solutions have explored other methods to free up the VM's source data, including
Backup from Storage Snapshots, which is performed on the array itself.
Backup from Storage Snapshots
Backup vendors like Veeam have recognized the problems around native VMware snapshots and have
provided solutions to help minimize the amount of time that a snapshot exists on a VM. They've done
so by building support into their applications that allows a backup to be processed from a VMware
snapshot that resides on a storage snapshot (sometimes called a SAN or LUN snapshot) while the
production VMware snapshot has already been committed back to the original VM. Aside from the
benefit of being able to quickly commit the VMware snapshot, storage snapshots offer the ability to
speed up your backup time in general: we don't have to commit long-running production snapshots
at the end of the job, and direct communication between the storage and the backup software
eliminates the need for the backup data to cross the hypervisor stack.
The process for using Backup from Storage Snapshots is as follows:
1. A VM-level snapshot is requested, redirecting all writes to the newly created redo logs and placing
the original disks in a read-only state. Although the backup is performed at the storage snapshot
level, the VM-level snapshot is still required in order to ensure application consistency within the OS.
2. A storage-level snapshot is triggered on the LUN/storage volume that holds both the read-only
original virtual disks and the newly created VMware snapshot.
3. The VMware snapshot on the original LUN/storage volume is immediately deleted (committed) back to
the VM. At this point, the VM is running at its normal production capacity and performance.
4. The backup server can now process the cloned VM on the storage snapshot that was created and
complete the backup.
5. Finally, after the backup has been completed, the storage snapshot created in the second step is discarded.
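The five steps above can be sketched as an orchestration sequence. The `Recorder` stubs below stand in for hypothetical vSphere and storage-array clients; only the ordering of the calls mirrors the text, and none of the method names are a real API.

```python
# Sketch: ordering of Backup from Storage Snapshots, with call-recording stubs.

class Recorder:
    """Minimal stub that logs every method call so the ordering is visible."""
    def __init__(self, name, log):
        self.name, self.log = name, log
    def __getattr__(self, method):
        def call(*args, **kwargs):
            self.log.append(f"{self.name}.{method}")
            return f"{self.name}-{method}-handle"
        return call

def backup_from_storage_snapshot(vsphere, array, vm, lun, log):
    vm_snap = vsphere.create_snapshot(vm, quiesce=True)  # 1. app-consistent VM snapshot
    lun_snap = array.snapshot(lun)                       # 2. storage snapshot of the LUN
    vsphere.delete_snapshot(vm_snap)                     # 3. commit the VM snapshot immediately
    try:
        log.append("process_backup")                     # 4. back up from the storage snapshot
    finally:
        array.delete_snapshot(lun_snap)                  # 5. discard the storage snapshot

log = []
backup_from_storage_snapshot(Recorder("vsphere", log), Recorder("array", log),
                             "vm01", "lun0", log)
print(log)
```

The key property the sketch shows: the VM-level snapshot is committed in step 3, before the backup processing even begins, so it lives only as long as the (fast) storage snapshot takes to create.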
As we can conclude, VMware snapshots exist for much less time when using the Backup from
Storage Snapshots method, because they only need to exist for the time it takes to create the storage
snapshot, which is normally quite fast. The VMware snapshot can then be committed and does not
need to be present for the duration of the backup job.
The importance of cleaning up
Given the performance penalties and capacity issues surrounding snapshots, and the ability to
create up to 32 snapshots per VM, it's important for backup solutions to ensure that they have cleaned
up after themselves by verifying that any snapshots they created are removed or deleted. It's not
uncommon for the vCenter API to report success on a snapshot deletion when it has not completed.
Whether this is a fault within the API or a disruption in communication, the end result is similar: we
are sometimes left with rogue or orphaned snapshots that aren't reported through vCenter. For these
reasons, it's important to monitor your environment for orphaned snapshots that may have been left
behind. To help mitigate these scenarios, companies like Veeam have implemented processes to poll
your environment to ensure that any failed consolidations or snapshot deletions have successfully
completed. In Veeam's case, a process called Snapshot Hunter is spawned every time a snapshot
commit or deletion is invoked within its backup operations. If orphaned or phantom snapshots
are found, Snapshot Hunter will make a first pass at removing the snapshot once the process of
backing up the VM in question has completed. If the first pass at removing the snapshot fails, Snapshot
Hunter will then automate the best practices set forth by VMware to safely remove the snapshots. If the
snapshots still exist after attempting a soft consolidation and a hard consolidation, with and without
quiesce, Snapshot Hunter will notify the user that action needs to be taken.
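In that spirit, a minimal orphan-detection pass might compare delta files actually present on the datastore against the snapshot files the vCenter inventory knows about. The inputs here are plain lists of file names; a real tool would query the datastore and the vCenter snapshot tree.

```python
# Sketch: flag delta/redo files on disk that no registered snapshot accounts for.

def find_orphans(datastore_delta_files, registered_snapshot_files):
    """Return delta files present on the datastore that vCenter does not know about."""
    return sorted(set(datastore_delta_files) - set(registered_snapshot_files))

on_disk = ["vm01-000001.vmdk", "vm01-000002.vmdk", "vm02-000001.vmdk"]
in_vcenter = ["vm01-000001.vmdk"]
print(find_orphans(on_disk, in_vcenter))  # ['vm01-000002.vmdk', 'vm02-000001.vmdk']
```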
Network
Network mode in most cases should be a last-resort transfer mode because it usually results in slower
data transfer speeds and increased usage of your production network. Network mode sends
requests between the backup server or proxy and the ESXi host housing the source VM, leveraging
the TCP/IP network as the means of transport for all communication and data transfer. Network mode is
really only an optimal solution when environments are running a 10 Gb isolated backup network, when
your backup server or proxy is a physical machine and you have no shared storage in your virtualized
environment, if you're backing up VMs with a very low change rate, or as a last resort when Direct
SAN Access or Virtual Appliance (hot-add) transport fails.
The data transport mode you select greatly depends on the setup and configuration of your
environment. For example, Direct SAN Access mode will not work if environments utilize local storage
or non-block storage. Instead, you will need to use Network mode or install a backup proxy locally on
each host for Virtual Appliance mode. Thankfully, backup solutions like Veeam Backup & Replication
have the intelligence built in to automatically select the best and most efficient mode for each
and every backup, as well as fail over through the list if needed. In addition to the backup modes
supported by VADP, Veeam Backup & Replication also provides a means to perform direct mounts on
NFS storage as well.
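The selection-and-fallback idea can be sketched as a simple preference list. The environment flags below are illustrative stand-ins; real products probe these conditions (storage connectivity, proxy type) automatically rather than taking them as input.

```python
# Sketch: choose transport modes in preference order, always ending at Network.

def pick_transport(env):
    """Return candidate transport modes, best first, for this environment."""
    order = []
    if env.get("shared_block_storage") and env.get("physical_proxy"):
        order.append("direct-san")   # read VMDKs straight off the SAN
    if env.get("virtual_proxy"):
        order.append("hot-add")      # attach source disks to a proxy VM
    order.append("network")          # NBD over TCP/IP as the last resort
    return order

print(pick_transport({"shared_block_storage": True, "physical_proxy": True}))
# ['direct-san', 'network']
```

A job would then try each mode in order, failing over down the list, which mirrors the automatic-selection behavior described above.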
Backup targets
So far, we have discussed how we can utilize snapshots to exclusively connect to our backup source,
the VM. We've talked about how we can achieve application-aware consistency using quiescence.
We've also discussed the different transport modes available to move our data. What we haven't
discussed is where exactly we plan on moving our data to, and which medium is best to store it on.
Backup media
In the past, organizations that had to deal with a large-scale, high-capacity data protection strategy
really only had one option when looking for a final target for their backups: tape. While tape has many
advantages, mostly price, it can be slower than other options at times, which is why we have lately seen
a surge of organizations leveraging spinning disk to host backups. With that said, there are benefits
and drawbacks to each and every piece of media that organizations choose to house their backups. Let's
have a look at a few of the most popular backup media utilized today: tape, disk and cloud.
Tape
As mentioned earlier, tape is probably one of the most effective solutions if you look at everything in
terms of cost. However, tape does have some downfalls. Due to the sheer cost of disk, tape used to be
the only option organizations had to store their backups. It provided a high-capacity, cost-effective
target for businesses to store their mission-critical data. With that said, tape lacks in the areas of speed
and efficiency. While tape can be just as fast as disk when it comes to sequential operations, it suffers in
terms of random I/O. Because most restore operations require a great deal of random I/O, tape becomes
a less viable option because it greatly increases an organization's RTO. As we mentioned in Chapter 1,
increasing both RPO and RTO can result in a pretty hefty cost during an unplanned outage, which is
certainly not a desirable outcome for any organization.
Disk
During the last 10 years, we have seen a huge increase in the use of backup-to-disk solutions. This is mainly
because the price of disk space has gone down while the capacity that disk provides continues to go
up. Using disk solves the speed issues that we see with tape. Companies like Veeam are able to leverage
the speed and efficiency of disk in order to create very fast recovery feature sets, such as Instant VM
Recovery and Virtual Labs, and power up VMs directly from their backup files. Although disk does give
us the speed we need to maintain low RPO/RTO policies, it does present challenges when it comes to
failures. Excluding solid state, disks contain many moving parts that can be prone to failure, and with a
single disk holding more and more capacity, recovery of these disks and the RAID rebuild process can take a
long time, all the while affecting performance and speed in your backup environment.
Cloud
Utilizing cloud services as backup storage can be an effective on-ramp for introducing cloud into the
modern data center. Over the last five or so years, we have seen backup providers like Veeam offer hooks
or connections into popular cloud-based storage services, allowing customers to back up or copy backups
directly into a cloud infrastructure. Utilizing cloud comes with many benefits. First and foremost, you only
pay for what you actually use. Organizations don't have to foot the bill for large gobs of storage that is
normally over-provisioned on site. Cloud also provides an easy onboarding solution to help fulfill the 3-2-1
Rule by providing off-site storage for an organization's backup files. With that said, the cloud does present its
own challenges. Ensuring you have enough bandwidth to get your data to and from your cloud service can
be costly, and once it is there, concerns around encryption and security can become cloudy.
The 3-2-1 Rule
Out of the three most popular media available to organizations today, how do we determine which
one is the right fit? For the most part, we see businesses adopt all of them, mainly to support what is
called the 3-2-1 Rule: 3 copies of your data, on 2 different media, with at least 1 copy located in a
remote location away from your production data center. The 3-2-1 Rule provides us with a variety of
options when we need to restore data from backup, placing the organization in a scenario where it
is very likely to be able to recover. Some backup providers like Veeam have built solutions around the
3-2-1 Rule, allowing organizations to back up to on-site storage once and then run copy jobs to transfer
those backups to either tape, disk or the cloud. This means we only need to touch our production
workloads once for the initial backup, and the backup copy jobs ensure we have moved those backups
to other media. Media like disk and tape give us the reliability and speed that we need within our
production data centers, while cloud helps us save money instead of duplicating infrastructure and
maintenance costs at a secondary site.
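The 3-2-1 Rule reduces to three checks that are easy to express in code. The copy records below are illustrative; a real implementation would pull this inventory from the backup catalog.

```python
# Sketch: validate the 3-2-1 Rule over a list of backup-copy records.

def satisfies_321(copies):
    """3+ copies, on 2+ different media types, with at least 1 copy off site."""
    media = {c["media"] for c in copies}
    offsite = any(c["offsite"] for c in copies)
    return len(copies) >= 3 and len(media) >= 2 and offsite

copies = [
    {"media": "disk",  "offsite": False},  # primary on-site backup
    {"media": "tape",  "offsite": False},  # copy job to tape
    {"media": "cloud", "offsite": True},   # copy job to cloud storage
]
print(satisfies_321(copies))  # True
```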
WAN acceleration
As mentioned earlier, moving data off site or to the cloud does present challenges pertaining to bandwidth.
Most organizations utilize the same pipe for shipping backups off site as they do for sending their
production traffic. This means we need to ensure that we have enough bandwidth available to support
both traffic types. Since additional network bandwidth and links can be very costly to organizations, backup
providers like Veeam have provided solutions such as built-in WAN acceleration to help reduce the
amount of data that needs to traverse the WAN, thus lowering an organization's bandwidth requirements.
By placing a software-based, purpose-built WAN accelerator at both the source and target locations,
organizations can leverage caching and data compression techniques to help get their VM backups off site
faster while minimizing the amount of bandwidth used at the same time.
Caching works by storing metadata about a particular block in what's called a global cache. When a file is
first sent across the WAN, the complete file will be sent. If that file needs to be sent again, whether it is
in the same VM backup or a different one, the data can simply be read locally from the global cache, rather
than having to traverse the WAN a second time. This effectively trades expensive network bandwidth for
cheaper disk I/O operations. When an organization deploys many VMs from the same template,
this can equate to quite a lot of saved WAN bandwidth.
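The global-cache idea can be sketched at the block level: fingerprint each block, and let only fingerprints the target has never seen cross the WAN. Fixed-size blocks and SHA-256 digests are simplifications here, not a description of any vendor's wire format.

```python
# Sketch: send a block over the WAN only the first time its digest is seen.

import hashlib

def ship(blocks, global_cache, wan_log):
    """Append to wan_log only blocks whose digest is not already cached."""
    for block in blocks:
        digest = hashlib.sha256(block).hexdigest()
        if digest not in global_cache:
            wan_log.append(block)        # full block crosses the WAN once
            global_cache[digest] = block
        # otherwise the target rebuilds the block from its local cache

cache, wan = {}, []
ship([b"OS-block", b"app-block", b"OS-block"], cache, wan)  # first backup
ship([b"OS-block", b"new-data"], cache, wan)                # later backup, same template
print(len(wan))  # 3
```

Across both runs, only the three unique blocks ever crossed the WAN; the repeated OS block, the common case for VMs cloned from one template, was served from the cache.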
Data protection strategies within the modern data center need to leverage many pieces in order to
be effective in today's environments. At times, we will see them leverage tape, disk and the cloud for
backup storage in order to effectively implement the 3-2-1 Rule. Components like WAN acceleration are
key features that provide the level of protection and Availability that modern data centers need.
Backup retention
As our data footprint inside the modern data center constantly grows, so does the capacity required
to back it all up. Providing the level of data protection that organizations expect can sometimes be
challenging. Backup vendors use a number of different models when laying backups down to disk
to help mitigate this issue of data growth and save capacity. Below are a few of the
most common backup models used today for storing backup data.
Full backup
A full backup is a complete, image-level backup that contains the VM's disks and respective content
bundled into one complete backup. As you can see in the illustration below, this takes up a lot of
capacity because it needs to gather all of the VM's data and configuration in each and every backup
cycle. With a retention policy set to keep three restore points, we can see that we would have three full
copies of the backup for each and every backup event. Each backup would be completely independent
and would not require the presence of any other backup file in order to perform a restore. However, what
we gain with independence, we pay for in capacity because each backup contains all the data from the
VM, whether it has changed or not. Full backups are not normally the only backup mode used inside a
data protection strategy, but they are usually coupled with other modes like incremental backups.
Incremental backup
Incremental backups solve the issue of capacity that full backups create. By leveraging incremental
backups, organizations are able to simply copy the blocks that have changed since the previous
backup operation was performed. As you can see below, incremental backups require much less space
on the backup target and are much faster to perform compared to running a full backup.
While incremental backups help decrease the amount of capacity an organization needs to store backups,
they always need to have a complete full backup available in order to perform a complete VM restore.
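The changed-block idea behind incremental backups can be sketched as follows; the block size and data format are simplified assumptions, not any vendor's on-disk layout.

```python
def changed_blocks(previous: bytes, current: bytes, block_size: int = 4) -> dict:
    """Return only the blocks that differ from the previous state,
    keyed by byte offset (the essence of an incremental backup)."""
    changes = {}
    for offset in range(0, len(current), block_size):
        cur = current[offset:offset + block_size]
        if cur != previous[offset:offset + block_size]:
            changes[offset] = cur
    return changes

def restore(full: bytes, increments: list) -> bytes:
    """A restore needs the full backup plus every increment up to the desired
    point in time, which is why the full must stay available."""
    data = bytearray(full)
    for inc in increments:
        for offset, block in inc.items():
            data[offset:offset + len(block)] = block
    return bytes(data)

full = b"AAAABBBBCCCC"
monday = b"AAAAXXXXCCCC"            # one block changed since the full
inc = changed_blocks(full, monday)  # only that block is stored
```

Note that the restore path walks the chain from the full forward, which is exactly why a missing or corrupt full breaks every increment behind it.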
Incremental with active/synthetic full
While having one full backup with many respective incrementals attached is a viable backup strategy,
some organizations prefer to take other periodic full backups in the event that an original full backup
becomes corrupt. Backup solutions like Veeam Backup & Replication have implemented options in order
to help with this strategy by introducing yet another mode that combines periodic full backups with
incremental backups. This gives organizations the peace of mind that multiple full backups have been
processed. As shown below, we can see that a full backup has been taken on both Sunday and Thursday,
leaving the Tuesday and Wednesday backups relying on the Monday full, and the Friday and Saturday
backups relying on the Thursday full. This is what we call an incremental backup with an active full.
Although this mode does give organizations the peace of mind of having more than one full backup
on disk, it does require the backup software to perform multiple full backups on a VM within the
production environment. In order to address the problems of processing your production environment
multiple times, a similar backup mode, incremental with synthetic full, was introduced.
Incremental with synthetic full is similar to an active full in that it creates another full backup, but
instead of grabbing this from our production VM, backup solutions can construct the full backup by
merging a current incremental with previous incrementals already stored on disk. As shown below,
we can see that by simply using our Monday full, coupled with our Tuesday, Wednesday and Thursday
incremental backups, we are able to synthetically create a full backup for Thursday without
having to read the production VM again, leaving the backup solution to do most of the I/O and heavy lifting.
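Conceptually, a synthetic full is just the last full with each subsequent increment replayed over it on the backup target. The sketch below assumes increments are maps of changed blocks keyed by byte offset, a simplification of any real backup format.

```python
def synthesize_full(last_full: bytes, increments: list) -> bytes:
    """Build a new full backup on the target by replaying incremental
    changes over the previous full; no production VM I/O is involved."""
    data = bytearray(last_full)
    for inc in increments:  # replay oldest to newest
        for offset, block in inc.items():
            data[offset:offset + len(block)] = block
    return bytes(data)

monday_full = b"AAAABBBBCCCC"
tuesday = {0: b"XXXX"}    # first block changed on Tuesday
wednesday = {8: b"YYYY"}  # third block changed on Wednesday
thursday_full = synthesize_full(monday_full, [tuesday, wednesday])
```

All of the reads and writes here touch backup storage only, which is the point: the production environment is never processed a second time.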
Reverse incremental
Reverse incremental backups are almost a polar opposite to incremental backups when we look at
how they are laid out on disk. In terms of restore points, incremental backups start with a full and add
incremental changes, while reverse incremental backups end with a full and begin with incrementals.
Reverse incremental backup is perhaps the best solution if you are looking to save on disk capacity
because there is only ever one full backup present on disk. However, this can stress the backup storage
more than the other backup modes because it requires more IOPS when generating the reversed
incremental blocks.
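A reverse incremental run can be modeled as follows: before each changed block overwrites the full, the old block is saved into a reversed increment, so earlier points in time can be rebuilt by rolling the full backward. The block-map format is invented for illustration.

```python
def reverse_incremental_update(current_full: dict, new_blocks: dict) -> dict:
    """Apply the latest changes to the on-disk full (which always holds the
    most recent state) and return the reversed increment of displaced blocks.
    Note the extra I/O: every changed block is both read out and rewritten,
    which is why this mode stresses backup storage more than the others."""
    reversed_inc = {}
    for offset, block in new_blocks.items():
        reversed_inc[offset] = current_full.get(offset)  # preserve the old block
        current_full[offset] = block                     # full now reflects latest state
    return reversed_inc

full = {0: b"AAAA", 4: b"BBBB"}
rev = reverse_incremental_update(full, {0: b"ZZZZ"})  # yesterday's block is displaced
```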
Incremental with active/synthetic full and transform
This backup mode is somewhat of a combination of incremental backup, synthetic full and reverse
incremental, and it utilizes a transformation phase that will convert any incremental and full backups
that occurred before the latest full backup into reversed incrementals. As shown below, we are left with
only one full backup taken on a Thursday, with incremental backups following for Friday/Saturday and
reversed incremental backups behind it on Sunday through Wednesday.
Grandfather-father-son
The backup modes explained above are great for day-to-day or recent restores, but they don't
support any strategies around long-term storage or archiving of backups. Many organizations are
required or want to store weekly, monthly, quarterly or yearly backups of their data in order to preserve
certain points in time due to security or compliance reasons. In order to achieve this, organizations
often deploy their primary storage on fast and small targets, while utilizing a secondary location to host
those long-term, older restore points. The process of storing these long-term, older restore points is
commonly referred to as grandfather-father-son (GFS).
GFS is a tiered retention cycle that allows an organization to ensure that x number of weekly backups,
x number of monthly backups, and x number of yearly backups are also maintained and archived in
addition to their short-term retention policies with weekly being the son, monthly being the father,
and yearly being the grandfather.
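A GFS scheme can be reduced to a tagging rule applied to each restore point. The promotion rules below (Sunday backups are weekly sons, the first Sunday of a month a monthly father, the first Sunday of January a yearly grandfather) are one possible configuration for illustration, not a fixed standard.

```python
import datetime

def gfs_tags(date: datetime.date) -> list:
    """Tag a restore point under a simple GFS scheme; real products make
    the promotion day and retention counts configurable."""
    tags = []
    if date.weekday() == 6:              # Sunday: weekly restore point (son)
        tags.append("son")
        if date.day <= 7:                # first Sunday of the month (father)
            tags.append("father")
            if date.month == 1:          # first Sunday of the year (grandfather)
                tags.append("grandfather")
    return tags
```

A point tagged son, father and grandfather is retained under all three tiers' counters, while an untagged daily point simply ages out with the short-term policy.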
Backup providers like Veeam have introduced support for the GFS retention policy into their
solutions utilizing their backup copy technology. A backup copy job gives organizations the ability to
synthetically create backup files based on the data from the primary job but at a secondary location.
Coupling backup copy with GFS allows organizations to ensure that they are able to meet their desired
retention policies to keep certain numbers of weekly, monthly and yearly backups separate from the
storage hosting their short-term restore points, if desired.
The method you use greatly depends on the requirements of your organization. Certainly reversed
incremental will allow you to save much more disk capacity than most of the others, but it is not very
compatible with deduplication because the blocks in the full backup are always changing, resulting in
issues when trying to move the data off site or to another target media. Incremental mode might be the
best option if you are looking to offload your data to tape or another site, because only small incremental
changes are backed up each and every backup run. Depending on the organization's requirements, a
mixture of all the backup modes, coupled with the GFS retention policies, is often used.
Deduplication and compression
Multiple retention policies aren't the only way backup solutions are helping organizations with capacity
issues within their modern data centers: Organizations are also leveraging technologies that physically
decrease the amount of data that they need to store on disk, mainly deduplication and compression.
Deduplication
Deduplication is a technology that eliminates the need to store duplicate or identical blocks of data
and is used in many backup solutions today. There are many types of deduplication deployed in data
centers, but source-side deduplication is the type Veeam has chosen to implement. With source-side
deduplication, just one block of data is written to disk, and when that block is discovered in the source
again, a simple piece of metadata is written, pointing the backup to the first block instead of laying
down the exact same data. The illustration below is a simplified version of deduplication. We can see
that our job source contains 18 separate blocks of data, but once it's deduplicated, only 5 blocks need
to actually be written to disk.
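The 18-blocks-to-5 example can be reproduced with a short sketch: each unique block is written once and repeats become metadata pointers. Hash-based block identity is an assumption here; implementations differ in how they fingerprint blocks.

```python
import hashlib

def dedupe(blocks: list):
    """Write each unique block once; record a digest pointer per source block."""
    stored = {}    # digest -> block actually written to disk
    pointers = []  # one metadata pointer per source block
    for block in blocks:
        digest = hashlib.sha256(block).hexdigest()
        if digest not in stored:
            stored[digest] = block
        pointers.append(digest)
    return stored, pointers

# 18 source blocks drawn from only 5 distinct patterns, as in the illustration.
source = [bytes([i]) * 4096 for i in (1, 2, 3, 1, 4, 2, 5, 1, 3, 2, 4, 5, 1, 2, 3, 4, 5, 1)]
stored, pointers = dedupe(source)  # 5 blocks written, 18 pointers kept
```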
With the modern data center being highly virtualized, it only makes sense that organizations take
advantage of something called templates: The ability to create and manage one master, gold image
VM from which all of the other VMs are deployed. The nature of templates results in many identical
blocks getting created on your production storage because each VM has essentially spawned from the
exact same image to begin with. This makes deduplication a must in any data protection solution that
organizations implement in their modern data centers.
Compression
Compression is another technique that backup solutions have implemented in order to conserve
expensive backup storage. Compression is a technology that has been around for ages and basically
squeezes and decreases the size of files, including backup files. Certainly some file types are
compressed more than others, and the level of compression seen can vary depending on content
within those files. With that said, having compression within your modern data protection solution is a
must in today's world. Moreover, some backup solution providers like Veeam offer a user-configurable
compression level per job, allowing organizations the choice to either use the highest level of
compression possible or tune their compression levels lower in order to maximize deduplication ratios
and provide better support for WAN acceleration.
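The per-job compression trade-off can be demonstrated with Python's standard zlib module, standing in here for whatever codec a given backup product actually uses.

```python
import zlib

def compress_backup(data: bytes, level: int = 6) -> bytes:
    """Higher levels shrink the file more at the cost of CPU time; lower
    levels can be preferable when a dedupe appliance or WAN accelerator
    sits downstream of the backup job."""
    return zlib.compress(data, level)

sample = b"virtual machine backup data " * 1000
light = compress_backup(sample, level=1)  # fast, modest ratio
heavy = compress_backup(sample, level=9)  # slower, best ratio
```

Either output decompresses back to the original data; only the size/CPU balance changes with the level.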
With the exponential data growth occurring inside organizations showing no signs of slowing down,
it's important that a wide variety of backup modes, deduplication and compression features be
available when choosing a modern data protection strategy. This is a must in order to ensure that the
level of protection can meet the defined RPO/SLA requirements within the modern data center.
Replication
So far, we've only talked about backup within the modern data center, but it's important that data
protection solutions offer some sort of replication within their portfolio as well. Replication is a process
that performs similarly to backup in that it utilizes snapshots and proxies to move the data.
However, the end result is very different. Instead of storing the data within a compressed, deduplicated
file, the data is stored in the form of a .vmdk and a snapshot, attached to a complete, duplicated
copy of the source VM. Because of this, replication has a different use case than backup and is often
used to support functions like disaster recovery (DR). Before we delve into exactly how it works, it's best
to understand the differences between backup and replication.
The differences between backup and replication
The processes of backing up a VM and replicating a VM are very similar in nature: Both leverage
consistency quiescing, utilize VMware's snapshot technology and move the production data from a
source to a target. The main differences between the two are how the data is stored within the target
and the use cases around how we recover using this data. Our backup targets, mainly disk, tape
and the cloud, store our data in a compressed, deduplicated format that requires us to extract and
restore our VMs. Replication targets, while still disk, are disks utilized for an ESXi
datastore and mounted to an ESXi host. When we replicate a VM, we essentially create a copy of that
VM in its native VMware format on another host or cluster, with separate storage located either in the
same site or a different site than the source VM.
In the event of a disaster or an outage, replication would simply require that VMs are failed over or
turned on at their target location. This process can be done utilizing the data protection solution's
tool sets. However, it's important to ensure that your data protection solution allows for this to be
done outside of itself and natively within VMware, because our protection solutions go down with
the ship all too often. If we were to apply a backup methodology to this same situation, without the
use of instant recovery, we would be looking at a complete restore of those VMs and files back into
production, resulting in a much higher RTO.
While replication provides organizations with a smaller RTO, it is lacking a little in terms of the number
of restore points you can maintain. Because most backup solutions store replicas as VMs and restore
points as snapshots attached to that VM, they are limited to the maximum number of snapshots
supported by VMware, which is 32 at this time. Backups do not have this limitation and can essentially
provide much more flexibility in terms of implementing GFS and increasing the number of restore points.
Replication is much different from backup and serves a fundamentally different purpose. Before we
explore where and when to use replication over backup, let's take a look at the replication process in
most modern data protection solutions like Veeam Backup & Replication.
Replica seeding
As previously discussed, all of the VM's data needs to be transferred across to the replication target
during the first run of a replication job, which can be quite a bit of data, depending on the size of the
VM. Replica seeding allows you to seed, or create your initial replica, by pulling the data directly out
of any backup that may reside on the same site as your replication target, skipping the initial,
time-consuming task of replicating the entire VM. Subsequent jobs on that VM do not access the backup
files, and changed data is obtained and stored as it normally would be during a replication process.
Replica seeding gives us the advantages of saving bandwidth and time on the first replication run, plus
the lower RPO that replication can provide.
Resume on disconnect
Whether there's a rogue admin, a power blip or a spanning tree event, the network can sometimes
experience brief outages. The problem is that replication relies heavily on the network being up,
available and stable. When it seems to disappear, sometimes jobs will fail. This might not be a big
deal for smaller jobs, but imagine a larger job failing 75% of the way through a 2 TB VM and being
forced to start all over again due to a brief network outage. It's for reasons like this that modern data
protection solutions like Veeam have implemented measures that can resume operations from the
point of failure upon the restoration of the network, rather than starting the job all over again.
WAN acceleration
We spoke about the importance and benefits of utilizing WAN acceleration within our backup
scenarios, and they remain the same when applied to replication. In fact, WAN acceleration is even
more important when looking at it in terms of replication because we tend to replicate across WANs
more often and more likely during production work hours in order to provide a lower RPO. Data
protection providers like Veeam provide WAN acceleration technology that is effective for both backup
and replication solutions, allowing organizations to save bandwidth and increase WAN efficiency.
So backup or replication?
The question is often, "Do we use backup or do we use replication within our data protection strategy?"
The answer is almost always to use both. Backup and replication serve fundamentally different purposes,
and in order to get the level of protection and Availability that a modern data center needs, organizations
often need to leverage both. Backup takes care of the day-to-day file recovery and provides a means for
long-term, archival storage because it's compressed and deduplicated on disk. Replication takes on more
of an Availability role, giving organizations the ability to quickly recover from a partial or full outage and
skipping the need and time it takes to restore data because the replica is already mounted and registered
within their data center. Replication, especially when implemented on a per-VM basis, gives organizations
a cost-effective way to provide a DR solution for their tier-1 workloads.
Organizations frequently use a strategy that utilizes backup as they always have, with nightly or twice-a-day jobs running, in which backups are created on site and then shipped off site. Backup jobs tend
to encompass the complete environment, meaning that every VM is normally backed up in some
fashion. Replication, however, occurs at various intervals during business hours, depending on the
organization's defined RPO (usually once an hour, once every 30 minutes, etc.). Because replication
requires a bit of an investment on the target end and needs more ESXi hosts, storage, etc., we usually
see organizations apply it to only their most critical, tier-1 workloads: The applications for which losing
24 hours of data would not be acceptable.
Organizations that leverage a modern data protection solution and implement both backup and
replication can fully ensure they are providing a complete level of protection, delivering Availability and
access to applications in their Always-On modern data center.
So far, we have discussed the importance, the benefits and features of both backup and replication
in the modern data center, but these features are not much use to organizations without a simple
and efficient recovery process. In the next chapter, we will explore just how modern data protection
solutions leverage virtualization and other current technology to provide timely recovery from backups
and complete failover for replicas.
File-level restore
A file-level restore is perhaps one of the oldest restore methods available. It's still highly utilized today,
and it allows organizations to simply restore individual files from their backup servers back to their
production servers. Traditionally, these restores are performed with an agent on the source, installed on
top of the operating system. With that said, most modern solutions are agentless and perform image-level backups at the hypervisor layer, so the way they perform file-level restores differs slightly from
the traditional method. Instead of directly accessing a file from the backup storage, modern solutions
must first mount the virtual disk or .vmdk from the backed-up VM. When this mount is performed, the
virtual disk that contains the target files is mounted as read-only so that it doesn't affect the integrity
of the backed up VM for subsequent backup jobs. From there, the backup server is able to access the
desired file, perform a copy operation and transfer the file either back to its original location or to a local
disk attached to the backup server. When the operation has completed, the virtual disk can simply be
unmounted and re-attached to the backup copy of the VM in its compressed and deduplicated format.
VM file/disk-level restores
Along with files located within the guest operating systems, Availability solutions should also be able to
perform restores on the files that comprise a VM. Within VMware, a VM is encapsulated into a group of
files: We have a .vmx file, which stores the configuration and specifications about the VM itself; one or
more .vmdk files, which hold the actual data represented in the VMs disks; and various other files, which
store information around the VMs memory, BIOS, etc. Backup solutions like Veeam Backup & Replication
process these files during the backup process and provide a means to restore them as well. This is
important for a couple of reasons. Imagine a scenario where your .vmx file becomes corrupt: Rather
than attempting to recreate a new .vmx file, we could simply restore one from our backups, saving us
the risk of recreating one manually and the time of having to restore an entire VM. Perhaps the biggest
use case for a VM file-level restore is the ability to restore individual .vmdk files. Essentially, this allows
organizations to restore entire VM disks on a disk-by-disk basis. This may be useful in a scenario in which
a data disk gets corrupted and the OS disk is still functional. Organizations would be able to simply
restore the one data disk, rather than the alternatives of restoring an entire VM or processing multiple
file-level restores.
VM-/image-level restores
There comes a time, usually during a disaster-type scenario, when a file-level restore is just not
enough to recover a VM and we need to perform an image- or VM-level restore. A VM-level restore is
where most modern solutions shine. By leveraging virtualization, backup providers are able to take
a point-in-time backup of an entire VM and restore it to either the same location or a different one
than the original. This gives organizations the flexibility to restore to a different storage array, restore
to a different host, or even leverage a different hypervisor product altogether, such as VMware
Workstation. Because virtualization encapsulates all of the disks, files, BIOS, etc. of a server into a
grouping of files, data protection solutions are able to simply process these files and restore them as a
direct copy of the server, all the while abstracting away the underlying hardware.
Application-level restores
If you've ever gone through the process of restoring individual messages or mailboxes from Microsoft
Exchange, you know that it can be complex and extremely time-consuming. Mounting mailbox
databases and exporting items is not an ideal way to restore these types of items. Providers like Veeam
have taken notice of this and developed technologies wrapped around virtualization that allow us
to power on our backup files in an isolated, virtual lab. From there, organizations can utilize unique
explorer technology to simply discover individual mailboxes, mailbox items, etc. and choose to restore
them back to production or export them out to a different location. Other popular technologies
that benefit from application-level recovery include Microsoft Active Directory (users, groups, and
computers) and Microsoft SQL Server (databases, tables and records). However, as long as the VM
can be powered on in a virtual lab, almost any application can take advantage of this technology. By
utilizing an application-level restore, organizations save time and avoid the hassle of recovering entire
VMs just to recover one piece of an application.
Quick rollback
Often, when we restore VMs and/or their disks back to their original location, we are overwriting the
current VM we are restoring. In many situations, though certainly not all of them, it's only a portion
of the data or a single disk that needs to be restored. In these scenarios, performing a full restore of
the entire VM is not necessarily warranted. To help combat these issues, modern solutions can utilize
the vSphere APIs for Data Protection (VADP) that VMware provides, more specifically Changed Block
Tracking (CBT). Just as a modern solution needs to leverage CBT when doing backups in order to
reduce backup time, it must also do the same in terms of restore. By doing so, the solution can query
data from the current VMware CBT file and compare that to the CBT file that has been backed up,
making a note of which blocks must be transported back to the original VM in order to bring it to the
proper point in time specified. From there, data protection solutions only need to migrate or restore
those blocks that are different between the source and target, which drastically reduces the time spent
recovering the VM and has less of an impact on the production environment's storage performance.
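The comparison step can be sketched as a diff of two block-fingerprint maps, one for the running VM and one captured at backup time. The map format here is hypothetical, a stand-in for what CBT actually tracks internally.

```python
def blocks_to_restore(current_cbt: dict, backup_cbt: dict) -> list:
    """Return the offsets whose fingerprints differ between the running VM
    and the backup; only these blocks need to move during the rollback."""
    return sorted(
        offset
        for offset in backup_cbt
        if current_cbt.get(offset) != backup_cbt[offset]
    )

backup_map = {0: "h1", 4096: "h2", 8192: "h3"}
current_map = {0: "h1", 4096: "hX", 8192: "h3"}  # one block diverged since backup
dirty = blocks_to_restore(current_map, backup_map)  # only that block is restored
```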
Traditionally, we would require multiple sets of tools in order to utilize all of the restore methods
explained above, which increased complexity and hindered interoperability. A modern solution like Veeam
Backup & Replication, coupled with a highly virtualized data center, gives organizations one huge
benefit: All of the restore methods above can be implemented through one tool that processes the
same backup file. This means that organizations can back up a VM once and perform image-level,
file-level, application-level and VM file-level restores on the same backup. When paired with technologies
such as Quick Rollback, this decreases the amount of backup storage we need and removes the
complexity of managing multiple tools, all while decreasing the time to recover and increasing the
amount of Availability IT departments can provide to their Always-On, 24/7 business.
To preserve the integrity of the actual backup files, a snapshot is taken at the desired point in time
within the newly presented VM. This means that organizations can present the VM to their users in a
production-type environment, and when the VM is finally migrated or restored back to production, the
integrity of the backup files containing that VM is preserved. This allows organizations to commit only
the changes made during Instant VM Recovery to production, leaving the backup files stored as they
were before the instant recovery had taken place.
Because most backup storage tends to be tier-2 type storage, which runs on slower spinning disk than
production, you can imagine that performance may suffer a little during this process. It's certainly not
meant to be a permanent solution, but what Instant VM Recovery does is get the VM back up and
running and allow organizations to access the data immediately in a limited fashion until the VM can
be migrated back to your production storage array. To help mitigate the performance issues, changes
and writes to the VM can be redirected to a different datastore. Those redo logs hosting the VMware
snapshot changes, along with the metadata files pointing to them, can be placed on a datastore other
than the vPower NFS datastore, specifically one with faster underlying storage.
When it's time to finalize the Instant VM Recovery, committing changes and moving the VM back into
production, organizations have a few options available in various levels of Availability: Replication, quick
migration and Storage vMotion.
The replication functionality within Veeam or the cloning functionality within VMware can be used
to essentially fail over an instantly recovered VM into production storage. However, both of these
options require a scheduled maintenance window and certainly some downtime when it's
time to actually perform the failover from the recovered VM to the replicated VM. At that time, the
instantly recovered VM is turned off and final changes are replicated or cloned to the new VM, which
is then powered on.
Quick Migration is another technology that has been introduced in Veeam in order to fail over
instantly recovered VMs to production. This is essentially the same as replication and cloning, but
it is performed in such a manner that it minimizes the amount of disruption to your production
environment during the switchover. Quick Migration simply restores your instantly recovered VM on
the production storage and then utilizes fast background processes to copy the changes from the
running VM to its production copy. When both VMs are in sync, the instantly recovered VM is simply
suspended and the workload is resumed on the target host.
Another option, preferable because it requires no downtime, is VMware's Storage vMotion.
Because the VMs are presented to ESXi in their native fashion, those organizations that are licensed
for it can simply leverage Storage vMotion to quickly migrate the restored VM to their production
storage, without incurring any downtime. This process basically pulls the VM from the NFS store
presented by Veeam and places it back onto your production storage, with absolutely no disruption
to the services provided by the VM.
Instant VM Recovery is really a game changer when it comes to providing 24/7 Availability inside of
the modern data center. With the ability to get services up and running in a limited fashion almost
immediately, organizations can save both time and money when it comes to lost productivity and revenue.
Replication Failovers
For the most part, we've only discussed recovering from backups. However, as I mentioned earlier,
organizations that are looking to provide a modern strategy into their data centers need to leverage
both backup and replication to ensure Availability. Although solutions like Veeam provide the means to
perform application- and file-level restores directly from the replicas they create, that type of recovery
is commonly left to backups. Failing over replicas is a function that is normally used during some sort of
outage or disaster. The failover process within Veeam Backup & Replication consists of the following:
1. The replica VM is rolled back to the desired restore point or snapshot.
2. Another snapshot of the replica VM is then taken in order to redirect any changes made to the VM
while it is running in a failed-over state. This snapshot is used in step five of the failback scenario in
order to move our VM back to our production site. At this point, the VM is also excluded from all
replication jobs that may still be happening to ensure the integrity of our failed-over VM.
3. If you choose to do so, you can also perform a permanent failover. Permanent failovers first delete or
discard all restore points on the failed-over VM, depending on where you are in the snapshot tree.
After this, the delta snapshot created in step two is committed to the VM, which then assumes
the role of your production VM.
The failover process from the outside appears simple in nature: The production VM is powered off, with
the replica VM being powered on in the latest state or a specified known good restore point. With that
said, there are a lot of items, both operationally and technically, that have to be taken into account
when initiating a failover. Items such as run books, processes, Re-IPing, network re-mapping, failover
testing, and failback are all important pieces of a solid failover plan. A good modern solution should, at
the very least, solve the technical aspects of it for you.
Failover network options
Many organizations choose to implement replication both on and off site to ensure they can provide
a very high level of protection. While performing a failover on site may not require any further
configuration, failing over off site presents challenges in terms of the IP addresses of the VMs. Most
often, the secondary or off-site locations that organizations replicate to are in a different IP space or
subnet and contain different VM networks, as defined in vSphere, than their production environments.
Simply powering on replicas inside the site will result in the VMs booting without network connectivity.
Modern solutions like Veeam give organizations the ability to both map a target network to the VM and
Re-IP the VM at the target off-site location, ensuring we always boot to a state where the VM will be
able to communicate.
Network mapping
Network mapping is essentially a mapping table that is created between the source and target sites and
holds all the information regarding how the networks match up on either end. When a replication job
occurs, the networks are compared, and if a match is found within the mapping table, the network settings
on the target side are then reconfigured to the proper VM network. This ensures it is always connected to the
correct network and, at the same time, allows organizations to have different network names at each site.
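In code, the mapping table amounts to a simple lookup applied to each replica NIC at job time; the port-group names below are invented examples.

```python
def map_network(source_network: str, mapping: dict, default=None):
    """Return the target-site network for a source port group, or a default
    when no mapping entry exists (the replica would boot disconnected)."""
    return mapping.get(source_network, default)

site_mapping = {
    "PROD-VLAN10": "DR-VLAN110",
    "PROD-VLAN20": "DR-VLAN120",
}
target = map_network("PROD-VLAN10", site_mapping)  # replica NIC lands on DR-VLAN110
```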
2015 Veeam Software
Re-IP Rules
In terms of IP space and subnets, it's also important to ensure that your data protection solution
provides some sort of automation for changing addresses within your replicated VMs. Solutions like
Veeam provide the ability for end users to create Re-IP rules that are attached to their replication
jobs. When VMs are failed over, they are first mounted and the IP address is changed before they are
powered on. This ensures that communication will occur correctly once the VM is fully initialized.
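A Re-IP rule can be thought of as a subnet-to-subnet rewrite that preserves the host portion of the address. The sketch below uses Python's ipaddress module and invented subnets; real products attach such rules to the replication job itself.

```python
import ipaddress

def re_ip(address: str, rules: list) -> str:
    """Rewrite an address that falls inside a source subnet into the matching
    target subnet, keeping the host offset; otherwise leave it unchanged."""
    ip = ipaddress.ip_address(address)
    for src_cidr, dst_cidr in rules:
        src = ipaddress.ip_network(src_cidr)
        dst = ipaddress.ip_network(dst_cidr)
        if ip in src:
            host_offset = int(ip) - int(src.network_address)
            return str(ipaddress.ip_address(int(dst.network_address) + host_offset))
    return address  # no rule matched

rules = [("192.168.10.0/24", "10.50.10.0/24")]
failover_ip = re_ip("192.168.10.42", rules)  # VM boots in DR as 10.50.10.42
```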
Failback
A properly planned and executed failover should go off without a hitch, and data protection solutions make
this very easy for organizations today. Usually, it only takes a few clicks of the mouse to initiate. However,
when the production environment is back up, there tends to be many issues around how to fail back, or
move the newly failed-over VMs back to a production site. A modern solution like Veeam can utilize both the
benefits of encapsulation and virtualization to help alleviate some of the problems that are associated with
failback. The process that Veeam Backup & Replication uses for a failback attempt is as follow:
1. If the original VM is available, whether it's the exact original or a backed-up copy of the original, then
only the differences between the VMs will need to be restored to production. If the original VM is
not available, or we are failing back to a completely new data center, then the entire VM will need
to be transferred.
2. A snapshot is taken on the replica VM in order to preserve a pre-failback state, in case we wish to return
to it at a later time.
3. Depending on the scenario outlined in step one, we either utilize delta copies or transfer the
entire VM back to the production site. The VM is brought to a point in time just before the failover
snapshot (performed in step two of the failover process) was taken.
4. Once the VM has been brought to the pre-failover state, the replica is shut down and another
snapshot is taken in case the failback is cancelled or fails.
5. The changes that have occurred after the failover, which are stored in the failover snapshot, are then
transferred and committed to the failed-back VM. At this point, the VM is processed through network
mapping and Re-IP, and then powered on.
6. Once the failback is committed, the snapshots taken on the replica are removed and replication can
continue as normal once again.
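The six-step flow above can be summarized as an ordered plan. The sketch below only records that sequence for illustration; every step name is a paraphrase of the text, not a product API.

```python
def failback_plan(original_available: bool) -> list:
    """Record (not execute) the failback sequence described above.
    Step one's transfer mode depends on whether the original VM exists."""
    return [
        "delta transfer" if original_available else "full VM transfer",
        "snapshot replica (preserve pre-failback state)",
        "roll production VM to pre-failover point",
        "shut down replica + protective snapshot",
        "commit post-failover changes; network mapping + Re-IP; power on",
        "on commit: remove replica snapshots, resume replication",
    ]
```

The key branch is in step one: having the original VM (or a backup of it) available turns a full-VM copy into a much smaller delta transfer.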
As you can see, failover and failback are certainly more than just powering VMs off and on at different locations.
In order for organizations to meet the Availability needs of today, they must ensure they implement a highly
automated modern solution like Veeam Backup & Replication, which provides orchestration for failover and
failback and ensures that data is secure and recoverable during all phases of the restoration.
Application Groups
In some cases, testing a backup or replica on its own may be enough; however, the complex modern
data center has a lot of moving parts, which results in many VMs having dependencies on one
another. Think of a simple three-tier web application: a front-end web server and a middle-tier
application server, backed by a database server of some sort. Without all three of these
functions present, we cannot truly test the restoration of the application. By bundling these three VMs
into an application group and running that through a SureBackup/SureReplica job, organizations can
verify the restorability of the entire application instead of just singular VMs. We can think of
an application group as a way to create the surroundings that the VM under test needs to work in a
production capacity: for example, supporting services such as Active Directory, DHCP and DNS.
With the application group set up as a whole, we can then drill down into a more VM-centric setup,
allowing us to define which VMs within the group are powered on first, what boot delays, if any, occur between
the power-on operations, and which tests/scripts are run against which VMs within the application group.
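The per-VM boot order and delays can be pictured as an ordered list that is walked sequentially. The group below (a domain controller, a database and a web front end) is purely illustrative; the names, ports and delays are invented.

```python
import time

# Illustrative application group: boot order, boot delay in seconds, and the
# verification tests to run per VM. Names, ports and delays are made up.
APP_GROUP = [
    {"vm": "dc01",  "delay": 0,  "tests": ["heartbeat", "ping", "port:389"]},
    {"vm": "db01",  "delay": 60, "tests": ["heartbeat", "ping", "port:1433"]},
    {"vm": "web01", "delay": 30, "tests": ["heartbeat", "ping", "port:80"]},
]

def boot_sequence(group, power_on, sleep=time.sleep):
    """Power on each VM in order, waiting the configured delay first."""
    for entry in group:
        sleep(entry["delay"])
        power_on(entry["vm"])
```

Injecting `power_on` and `sleep` as callables keeps the walk testable; in a real harness they would call the hypervisor and actually wait.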
Virtual labs
Once we have our application groups defined, we need to create an area where we can safely power
on these VMs without affecting our production environment. This is where virtual labs come in.
A virtual lab is essentially an area that is isolated and fenced off from our production network
while still mirroring our production network setup.
As we can see above, we have applications assigned to two different logical networks attached
to a virtual switch within our production environment. Within the virtual lab we have those same
applications, organized into two separate application groups, all while maintaining the same IP
and network information as in production. The proxy appliance assumes the IP of our
production default gateways on the isolated NICs and creates a fenced-off, isolated network in which
the VMs are able to run and communicate with each other, all while restricting access from the
virtual lab to the production network. The proxy appliance also has the ability to masquerade the IPs of
the VMs within the virtual lab, allowing network access from certain IPs on our production network
back through to our fenced-off network.
With all the pieces in place, organizations can now run the jobs that ensure their backups and
replicas are indeed restorable. Within Veeam Backup & Replication, these jobs are called SureBackup and
SureReplica jobs, respectively. The process of running one of these jobs is as follows:
1. The backup solution utilizes the vPower technology to publish an application group into an isolated
virtual lab. Depending on the type of the source VM (replica or backup), it is either powered on
normally or powered on directly from the backup files.
2. Before each VM is processed, a snapshot is taken in order to divert all changes during the tests away
from the backup or replica disks. This ensures the validity of our backups and gives Veeam a way to
discard any changes to the disks during testing.
3. The VMs are powered on individually in accordance with their power-on order, and a series of
recovery verification tests are performed on each VM, including:
a. Heartbeat test: This test simply waits for a heartbeat signal from the VMware Tools software
installed within the OS of the VM. If the signal comes at its normal specified intervals, the test is
considered a success.
b. Ping test: The Veeam Backup & Replication server sends a set of pings to the VM through its
masqueraded IP address. If the VM responds to the pings, the test is a success.
c. Application test: Depending on the type of VMs being tested, Veeam can run a set of
automated tests that are application-specific. For example, if a VM is an Active Directory
controller, Veeam will probe for port 389. If there is a response, the test is considered a success.
d. Custom scripts: The recovery verification also allows for a series of custom scripts to be run
against the target VMs. For example, you may want to run a SELECT TOP query against a SQL Server,
a REST operation against a web server, etc.
4. If we are running a SureBackup job, we have the additional option of running backup file validation, which
essentially compares checksums against the data that is on disk to ensure the integrity of the backup files.
5. Once the verification process has completed, the VMs are powered down and unpublished, and the
snapshots taken in step two are discarded. A report outlining the success of the job is then sent off.
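The application test in step 3c amounts to a TCP port probe. A minimal stand-in using only Python's standard library might look like the following; the real product's tests are richer, this just illustrates the idea.

```python
import socket

def port_probe(host: str, port: int, timeout: float = 3.0) -> bool:
    """Succeed if the service accepts a TCP connection on the given port,
    e.g. port 389 for an Active Directory domain controller."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

In a SureBackup-style harness, the probe would be aimed at the VM's masqueraded IP inside the virtual lab rather than at its production address.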
Having strict, regimented backup and replication testing is a must within the modern data center.
Using automated solutions like Veeam Backup & Replication can give organizations the peace of mind
they require because they know that when it comes time to restore from backup or replica, the data
they need is there, intact and 100% restorable.
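Backup file validation, conceptually, is just recomputing a checksum over the file on disk and comparing it against the value recorded when the backup was written. A minimal sketch follows; the real on-disk format and checksum scheme are product internals.

```python
import hashlib

def verify_file(path: str, expected_sha256: str, chunk_size: int = 1 << 20) -> bool:
    """Recompute a SHA-256 over the file in chunks and compare it against
    the checksum stored at backup time. A mismatch means corruption."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest() == expected_sha256
```

Reading in fixed-size chunks keeps memory usage flat even for multi-terabyte backup files.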
To meet the demands of the Always-On, 24/7 modern data center, organizations need to ensure that
their data protection solution has many levels of recovery available. Solutions like Veeam Backup &
Replication focus heavily on recovery and restore, which is probably the most important piece inside
of any data protection product in the end. It's important that data protection solutions leverage the
power of virtualization when performing restores and utilize VMware APIs in order to provide a faster,
more efficient recovery experience. Doing so minimizes downtime and helps organizations achieve the
level of Availability that their customers and users have come to expect.
Scalability
When choosing an Availability solution, organizations must consider how scalable that solution is.
Although your infrastructure might seem static today, it is almost impossible to predict what it may
look like in the future. Two or three years from now, an organization's infrastructure can double or triple,
so it is important that the solution you select today can scale to meet the needs of tomorrow. Without
support for scalability, organizations would have to perform a rip-and-replace on their data protection
solutions every time they grew, which is not an ideal situation to be in, especially when dealing with
large amounts of data that they need to move around.
Veeam Backup & Replication provides scalability in terms of backup proxies. When the demands exceed
the current configuration, Veeam gives customers the ability to simply deploy more backup proxies,
which essentially take the load off the existing backup infrastructure. By deploying or removing
proxies, which are the workers or data movers, organizations can scale both up and down to meet the
demands of their ever-changing, dynamic modern data center and guarantee that required RPO and
RTO parameters will always be met, regardless of the size of the environment.
Aside from proxies, Veeam customers are also able to install multiple backup consoles. The
backup console is essentially the keeper of the jobs and the point of reference for all management.
There are situations, such as multiple data centers and remote/branch (ROBO) offices, in which
organizations may wish to have multiple backup consoles. Veeam is licensed per CPU socket on the
ESXi hosts running the protected VMs, leaving the number of consoles and proxies deployed up to
the customer, with no additional licensing fees. Although having multiple backup consoles solves a
number of issues around multiple data centers and ROBO deployments, there are some complexities
around management. To help alleviate the issues of having to log into multiple consoles to manage
and monitor backup jobs, Veeam has introduced another component called Veeam Enterprise
Manager. Veeam Enterprise Manager is a solution intended for managing distributed enterprise
environments, like those described above. It offers a consolidated view of all the backup consoles
deployed within an organization through a web interface, allowing organizations to centrally monitor
and manage their backup and replication jobs and search for and restore VM guest OS files stored
within the corresponding backup files.
However, there are still times when an organization wishes to only have one backup console but still
minimize the amount of traffic that traverses their WAN during a backup process. Even if the source
and target are self-contained, or both located within the same remote site, a centralized console will
still need to communicate with the guest that is getting backed up in order to process certain guest
interactions, such as application quiescence. Although this may not be an issue on a small scale, it can
quickly become one as the number of concurrent processes and VMs increases. To alleviate this, Veeam
presents the option for organizations to place a guest interaction proxy at each remote site. The guest
interaction proxy then handles all the interactions with the guest OS being backed up, eliminating the
need to traverse the WAN and only requiring control commands from the centralized backup console.
Self-service restore
Another must-have feature of any modern Availability solution is self-service restore. Self-service restore
is essentially delegation: granting the task of restoring files within the guest OS to the users who
administer those guests. In many organizations, there are users who are responsible for the applications
running within the VMs. These application owners are the primary administrators for the files and
folders within the VM, so why should they have to go to a backup administrator simply to restore
a file? Self-service restore allows these application owners to log into a portal, search through the
content of the backups for their VMs only and perform any necessary restores of those files back to
their original or new locations. This process is all completed without any intervention from the backup
administrator, freeing up the admin to work on more important projects and guaranteeing application
users a faster restore by skipping any restore workflows they may have to run through.
With IT departments today facing challenges juggling declining budgets and increased service
expectations, modernizing a data center becomes an uphill battle. Protecting that modern data center
is a key component within that challenge. Companies need to take a close look at every purchase,
including that of their data protection solutions. A modern Availability solution should not only meet
the demands and Availability that end users expect, but it should also provide organizations with
value-add. Using virtualization technologies, such as Veeam's vPower, can help bring increased benefits
to organizations. This not only increases Availability while decreasing data loss and providing a way to
failover during disaster, but it also delivers a powerful platform that puts those once stagnant backup files
and resources to work. Utilizing your backups to test application updates, perform patch management
and further secure your production environment are just a few of the added benefits that a modern
data protection solution like Veeam Backup & Replication can bring to an organizations IT department.
However, there is much more: If you need to test anything on a production VM, Veeam, vPower and virtual
labs can provide you with the foundation, creating a duplicate copy of your production workloads.
The Always-On, 24/7 modern data center is achievable, but in order to get there, organizations need to be
sure they are protecting their workloads the best they can. With virtualization becoming the predominant
infrastructure delivery method inside our data centers, organizations need to ensure that they are
utilizing a solution that has been purpose-built for virtualization. Modern Availability solutions like Veeam
Backup & Replication leverage virtualization to its fullest, providing innovative and unique feature sets to
organizations in order to prevent data loss, adhere to SLAs and protect organizational assets. A modern
solution is the first step in meeting the Availability needs of the Always-On, 24/7 modern data center.