
Innovators are directing their focus on turning the vast amounts of data available into actionable intelligence.


Big Data
In Government
3 Big Data, Big Issues, Big Efforts
4 Set Your Data Free!
6 Geospatial Is Big Data In Action
8 The Cyber Big Data Equation
10 8 Steps To Move Up The Curve
12 Privacy In The New Big Data Era
14 The Proven Approach
16 Executive Interview With NGA's David L. Bottom
18 The Geospatial Platform At a
Glance
20 Let's Talk Tech: Listen to the
Experts
30 My Big Data Top Ten
SME One-On-Ones
22 Ciena: Renee Reinke
23 Software AG Government
Solutions: Michael Ho
24 NetApp: Dale Wickizer
25 The GovConnection Team
26 HP Enterprise Services:
Diana Zavala
27 Cloudera Government Solutions:
Mercedes Westcott
28 EMC Isilon: Audie Hittle
29 General Dynamics Advanced
Information Systems: Jay Mork
Volume 6 Number 2 March 2014
Inside Big Data
Published by
Download Your Digital Edition at www.OnTheFrontLines.net.
By Jeff Erlichman, Editor, On The FrontLines
It Occurs To Me...
Big Data, Big Issues, Big Efforts
Big Data success stories and practical implementation strategies are two staples of the 2014 federal conference circuit.
At these events, agency implementers talk about getting your organization culturally prepared, eliminating legacy systems without compromising the mission, and the best acquisition strategies.
They speak about the criticality of data governance and recount past and ongoing success stories, for example Project Open Data and Data.gov.
There are big privacy issues for Big Data as well. The White House currently has a 90-day study underway. In collaboration with the National Science Foundation, the Council for Big Data, Ethics, and Society is researching critical social and cultural perspectives on Big Data initiatives.
Researchers from diverse disciplines, from anthropology and philosophy to economics and law, "will address issues such as security, privacy, equality and access in order to help guard against the repetition of known mistakes and inadequate preparation," said the NSF press release.
Meanwhile, the technology community is addressing the problems of scaling to handle large amounts of data, including scaling with cloud applications, along with the specific need to provide user-friendly applications for driving complex information to analysts.
They are tackling issues such as the use of natural language processing (NLP) on novel unstructured data for concept extraction and relationship derivation, and the use of Hadoop and open source tools in a NoSQL environment.
Many of these issues were addressed at NIST's recent 2014 Data Science workshop. The NIST Big Data Working Group is continuing its work on how to interpret and convert raw data into a visual, actionable product.
They are looking at the latest advancements and research in visualization, learning how to describe new data and create user-friendly narratives, and developing techniques for reshaping predisposed, non-visual thoughts about data.
Of course, none of the leaders are predicting any new money for the big transformation to Big Data.
That Someday Is Today
With all that's going on, there's a lot for an agency to absorb. While big data is not new, the concept of Big Data and its value is just taking hold.
"Its value is found in the questions we can ask and the answers we can derive," Michael Ho from Software AG told me in a recent interview.
"What are the things you have thought about and said, 'you know what, we'll get to that someday'? Well, that someday is today! Think about those questions now and you'll lay the proper foundation."
"Now is the time to ask the harder questions and embrace solutions that can push the envelope," Mr. Ho explained.
"Now is the time, with the evolution of technologies, to step back in and say today is the day we want to get the most results out of our data; here is what we need, here is how to do it," Mr. Ho said.
"You don't have to buy the whole thing all at once," he counseled.
"There is no failure; there are lots of ways to approach the next generation of technologies without having to really just grind your enterprise to a halt. There is a way to do things in real time."
The message for government Big Data implementers is: it's not about having the 100% solution now, but putting in place the means to get to that 100% solution as quickly as possible while still meeting the demands of your enterprise.
Copyright 2014 Trezza Media Group, Public Sector Communications, LLC
Joe Klimavicz
CIO
National Oceanic and Atmospheric Administration (NOAA)
Dan Doney
Chief Innovation Officer
Defense Intelligence Agency (DIA)
Impact of Big Data
Set Your Data Free!
Data.gov is the home of the U.S. Government's open data. Here you will find data, tools, and resources to conduct research, develop web and mobile applications, design data visualizations, and more.
For those looking to turn Big Data into action, there's plenty of raw material to get started with at Data.gov.
Researchers and developers can choose from 127,091* open datasets, along with models and tools. Or search more than 20 topics, such as Agriculture, Health, Manufacturing, Weather and Oceans. Every topic is supported by a community of interest: people who are working to find solutions and collaborating through social media, events and platforms. Concurrently, developers are using the tools and open data sets to design a variety of apps.
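If you want to explore the catalog programmatically rather than through the website, a script along the following lines is one way to start. This is a minimal sketch, assuming the Data.gov catalog exposes the standard CKAN search API at catalog.data.gov; the endpoint path, parameters, and response fields shown here are assumptions to verify against the current API documentation.

# Minimal sketch: search the Data.gov catalog for open datasets by keyword.
# Assumes the catalog exposes the standard CKAN action API at catalog.data.gov;
# the endpoint path and response structure are assumptions, not official guidance.
import json
import urllib.parse
import urllib.request

def search_datasets(query, rows=5):
    url = ("https://catalog.data.gov/api/3/action/package_search?"
           + urllib.parse.urlencode({"q": query, "rows": rows}))
    with urllib.request.urlopen(url) as resp:
        payload = json.load(resp)
    # Standard CKAN responses nest the hits under result -> results.
    return [ds.get("title", "") for ds in payload["result"]["results"]]

if __name__ == "__main__":
    for title in search_datasets("severe weather"):
        print(title)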
"Open data is an ecosystem. And what we are really trying to accomplish is a sea change in government and the expectations people have of their government," explained Jeanne Holm, Evangelist, Data.gov, GSA.
"The change, albeit very small, is to make sure the data we gather on behalf of citizens in order to get our work done is accessible and understandable to citizens and our government agencies," she told OTFL in a recent interview.
Ms. Holm said people normally capture information and data in order to create reports for making policy or doing analysis.
"The change is to really open it up so citizens can look at information transparently; so citizens can comment and offer constructive criticism of government based on data, not opinion; and be able to look at that data in new ways and use it so they can create innovation in their businesses, or deliver better services."
"At the highest level we are trying to do that, but there are lots of steps in between," she noted. "We try to make all of that information integrated and accessible through Data.gov."
Jeanne Holm
Data.gov
Wicked Problems
Ms. Holm said most people get involved with Big Data for practical reasons.
They have a wicked problem they are trying to solve: like the Navy trying to figure out where they should invest next year's R&D money; like NIH deciding what types of clinical trials they should be running; or NASA looking for the best landing spot for the next rover on Mars.
That causes them to seek out platforms to run their apps and analytical tools so analysts can perform their missions.
And for the actual answers to problems and out-of-the-box innovative solutions, Ms. Holm said some are turning to gamification activities or exploring the use of challenge competitions found at Challenge.gov.
"Many in government don't yet understand the power of these analytic tools to give them interesting information," she noted. "That is changing."
"We are at a place where machine learning is getting sophisticated; we are getting to a point where we can start to automate some of those simpler decisions that come out of predictive analytics. This is not being adopted quite as proactively everywhere in government as it could be, but it is gaining some traction."
Popular Speaker
Ms. Holm is also a much sought-after speaker at Big Data events.
She has spoken on "Operational Challenges and Considerations in Large-Scale Data Analytics," where she discussed the best tools and strategies for large-scale data management, data retention policies beyond keeping everything forever, and solving the data quality issue.
At another event Ms. Holm spoke on "Federating Big Data for Big Innovation," where she discussed how Data.gov federates vast data resources, providing analytic tools that allow game-changing innovation.
"The Open Data movement is changing how government and business can drive innovation and how we can predict what to do in the future based on data-driven decisions," she said.
"We are looking at open source in data not just for software, but open source models for the models behind the decisions. Different models can make a difference in outcomes."
As the Evangelist for Data.gov, Ms. Holm has a keen interest in growing the number of both Data.gov participants and communities. "I would say we have had sparky growth in number and participation."
Go to www.data.gov, create your own spark and set your data free.
* as of March 10, 2014
Project Open Data
The White House developed Project Open Data, a collection of code, tools, and case studies to help agencies adopt the Open Data Policy and unlock the potential of government data.
According to the website, Project Open Data will evolve over time as a community resource to facilitate broader adoption of open data practices in government. Anyone (government employees, contractors, developers, the general public) can view and contribute.
Learn more about Project Open Data governance, dive right in, and help to build a better world through the power of open data. Visit http://project-open-data.github.io/.
Below is a list of ready-to-use solutions or tools that will help agencies jump-start their open data efforts. These are real, implementable, coded solutions that were developed to significantly reduce the barrier to implementing open data at your agency. Many of these tools are hosted at Labs.Data.gov.
1. Database to API - Dynamically generate RESTful APIs
from the contents of a database table. Provides JSON, XML,
and HTML. Supports most popular databases. - Hosted
2. CSV to API - Dynamically generate RESTful APIs from static CSVs. Provides JSON, XML, and HTML. - Hosted
3. Spatial Search - A RESTful API that allows the user to query geographic entities by latitude and longitude, and extract data.
4. Kickstart - A WordPress plugin to help agencies kick-start their open data efforts by allowing citizens to browse existing datasets and vote for suggested priorities.
5. PDF Filler - PDF Filler is a RESTful service (API) to aid in the completion of existing PDF-based forms and empower web developers to use browser-based forms and modern web standards to facilitate the collection of information. - Hosted
6. Catalog Generator - Multi-format tool to generate and maintain agency.gov/data catalog files. - Hosted
7. Data.json Validator - A data.json validator can help you check compliance with the POD schema (a minimal completeness-check sketch follows this list). There is one hosted at Project Open Data; another written by Dave Caraway; and another one written by HHS.
8. Project Open Data Dashboard - A dashboard to check the status of /data and /data.json at each agency. This also includes a validator.
9. Data.json File Merger - Allows the easy combination of multiple data.json files from component agencies or bureaus into one combined file.
10. API Sandbox - Interactive API documentation systems.
11. CFPB Project Qu - The CFPB's in-progress data publishing platform, created to serve public data sets.
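To make item 7 concrete, here is a minimal completeness-check sketch in the spirit of those validators. The required-field list is an assumption based on the Project Open Data metadata schema; the hosted validators remain the authoritative compliance check.

# Minimal data.json completeness check, in the spirit of the validators above.
# The required-field list is an assumption, not the authoritative POD schema;
# use the hosted validators for real compliance testing.
import json
import sys

ASSUMED_REQUIRED = ["title", "description", "keyword", "modified",
                    "publisher", "contactPoint", "identifier", "accessLevel"]

def check_catalog(path):
    with open(path, encoding="utf-8") as f:
        catalog = json.load(f)
    # Some catalogs are a bare list of datasets; others wrap them in "dataset".
    datasets = catalog.get("dataset", []) if isinstance(catalog, dict) else catalog
    problems = []
    for i, entry in enumerate(datasets):
        missing = [field for field in ASSUMED_REQUIRED if not entry.get(field)]
        if missing:
            problems.append((i, entry.get("title", "<untitled>"), missing))
    return problems

if __name__ == "__main__":
    for idx, title, missing in check_catalog(sys.argv[1]):
        print("dataset %d (%s): missing %s" % (idx, title, ", ".join(missing)))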
Geospatial Is Big Data In Action
NSF's Dr. Suzi Iacono and NOAA's Zach Goldstein talk about the role of Big Data.
Geospatial data is being collected, produced and disseminated at faster and faster speeds. That's nothing new. The sheer amount of location-based data is growing exponentially. For example, NOAA archives 30 petabytes per year, every year. There's nothing new about that either.
What is new, however, is that the complexity of the data is increasing, and so is the challenge of linking all this location-based, often unstructured data together to create actionable intelligence in real time.
Experts know that to be successful you must harness the power of the cloud and Big Data analytics.
Scaling Up
This is one of the areas Dr. Suzi Iacono, Senior Science Advisor for the Directorate for Computer and Information Science and Engineering (CISE) at the National Science Foundation (NSF), is focused on.
"We really have to figure out how to scale from the kind of small data sets we have today to very large data sets and very heterogeneous data sets," she explained. "Heterogeneity is one of the biggest challenges. There are data archives in universities across the US that cannot share information."
Dr. Iacono described one example of bringing communities together: a project called EarthCube, which allows NSF's three geoscience divisions (Ocean, Earth and Atmospheric) to share data.
"If you are a geoscientist who only worked with Oceans, you are using only the Ocean community database. But what if you could get information about the Earth and atmosphere?" Dr. Iacono asked. "Then you could really pose some much more interesting research questions."
Saving Lives
Dr. Iacono said another example is using real-time data to evacuate people in a big storm: being able to take the data that FEMA, NOAA and USGS have, integrate it in real time, and get real actionable knowledge of what to do.
"This will change the course of how we handle public safety in this country, because being able to get people going down the right road rather than the wrong one will save lives."
Zach Goldstein, Deputy Chief Information Officer (CIO) at NOAA, said he sees more opportunities and examples of information sharing among agencies, and that NOAA can't do its job or save lives without Big Data.
For example, during Superstorm Sandy NOAA and the Census Bureau worked together to figure out where to place evacuation shelters before the storm hit. This process will become routine in the future, said Mr. Goldstein.
NOAA's mission spans from the surface of the sun to the bottom of the oceans, everything from monitoring marine mammals to forecasting the direction of hurricanes; what ties it all together is Big Data.
Mr. Goldstein said they have hundreds of thousands of sensors gathering that Big Data, and to handle the massive amounts of traffic they built a 10 gigabit network.
"We are trying to deliver as much technology as securely as possible, and one of the things we are really focusing on is to take advantage of cloud solutions and cloud services," he said.
"It is much more economical to take advantage of these public clouds. And so we have been working across the geospatial community to build a cloud solution so that everybody can put their data out there."
Platform Please
NOAA has also been a key player in the development of the Geospatial Platform.
Using the "build it once, use it many times" premise, the Platform leverages current interagency coordination efforts. It also uses best practices, new technologies and open standards to provide more accessible data and services while realizing efficiencies through shared infrastructure and economies of scale.
It's also something the public and private sector can take advantage of to attack critical problems and come up with innovative solutions. The federal data has been quality controlled and checked. Users can then take advantage of not only that information, but information that other users have posted there as well.
Users can mash it up and create additional map-based products, and use them for enhanced decision making within their operations or post them out again for sharing.
OTFL@The Federal Executive Forum
Zach Goldstein
Deputy Chief Information Officer (CIO)
National Oceanic and Atmospheric Administration (NOAA)
Dr. Suzi Iacono
Senior Science Advisor for the Directorate for Computer
and Information Science and Engineering (CISE)
National Science Foundation (NSF)
Because agencies have unique mission needs, there is no one-size-fits-all Big Data solution. General Dynamics Advanced Information Systems' open architecture approach allows agencies to quickly plug-and-play technologies to extract meaning and value, accelerating operational capability and time-to-mission. When you combine this flexible, forward-leaning approach with our in-depth mission understanding, finding the needle in your Big Data haystack is now possible.
www.gd-ais.com
CAN YOU FIND THE NEEDLE IN YOUR BIG DATA HAYSTACK?
You can now.
In June 2013, 70% of Feds surveyed believed that in five years successfully leveraging Big Data will be critical to fulfilling Federal mission objectives. That's according to MeriTalk's Smarter Uncle Sam: The Big Data Forecast research.
Five years is not long in government time.
In that time, Big Data technologies are going to mature. Agencies will have honed their Big Data strategies. However, all the maturing and honing will not make a real difference until risk-based and risk-driven security policies and technologies are embraced and in place.
So, as cyber goes, Big Data goes. To examine this symbiotic relationship between Big Data and cyber security and where it's going, MeriTalk released new research on Balancing the Cyber Big Data Equation.
In the research, 18 federal Big Data and cyber leaders (including Jeanne Holm) were interviewed. They spoke about the emerging interplay between the two disciplines. They explored steps agencies should take, and are taking, to balance access and risk.
Two-Way Street
According to MeriTalk, Big Data presents new opportunities to enhance cyber missions, but more security controls are needed to ensure agencies can protect data, especially as data sets grow.
The report examines this two-way street and explores what agencies are doing today, how they are balancing access and risk, and what's next as technologies and policies mature.
Experts note that agencies lack both infrastructure and policy to enable Big Data correlation, dissemination, and protection. They point out that while existing information security principles still apply, the product of data analytics is the element that becomes more sensitive.
As data sets grow and come together, so do the data sensitivity and the risks of unintended consequences. Most execs note there is much work to do, including initial steps such as filtering and characterizing data. Agencies are also facing severe budget cuts, limiting their ability to evolve and take full advantage of the Big Data opportunity while also managing the associated threats.
Looking ahead, the report found that as Big Data and security
infrastructures and policies evolve, agencies need dashboards
that can aggregate input from different analytical tools to deliver
business insight and value. Some Feds believe that intelligent
analytics will one day reduce the need for highly trained data
scientists.
To achieve success, agencies should consider information sharing from an organization and policy perspective and, at the same time, focus on the technical issues. Government-wide, agencies need a cohesive enterprise security infrastructure, coupled with a sound risk management strategy.
The report recommends executives on the front lines:
Consider the Full Equation: Develop a comprehensive enterprise information architecture strategy that encompasses both Big Data and cyber
Re-think Risk: Ensure agencies adequately classify the risk level of their data analytics capability and take appropriate steps to mitigate risks, including considering threats within and beyond perimeter borders
Put Data To Work: Invest in the tools to apply analytics to continuously monitor data, enabling the predictive and automated capabilities that will ultimately save time and money (a simple illustration follows this list)
Show Me The Money: Pilot dashboards that provide mission owners with new insights and help track the ROI for Big Data/cyber efforts.
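As an illustration of the "Put Data To Work" idea, the sketch below flags spikes in a stream of security-event counts against a rolling baseline. It is not drawn from the MeriTalk report; the event type, window size, and threshold are invented assumptions, and a production continuous-monitoring pipeline would be far richer.

# Illustrative only (not from the MeriTalk report): flag hourly event counts
# that exceed a multiple of the rolling average, the kind of simple automated
# check a continuous-monitoring dashboard might surface.
from collections import deque

def flag_anomalies(counts, window=24, threshold=3.0):
    """Yield (index, count) pairs that exceed threshold x the rolling mean."""
    history = deque(maxlen=window)
    for i, count in enumerate(counts):
        if len(history) == window and count > threshold * (sum(history) / window):
            yield i, count
        history.append(count)

if __name__ == "__main__":
    hourly_failed_logins = [12, 9, 11, 10, 13, 8, 10, 11] * 3 + [95]
    for hour, count in flag_anomalies(hourly_failed_logins, window=8):
        print("hour %d: %d failed logins looks anomalous" % (hour, count))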
Update Big Data Definition
Peter Mell, Computer Scientist for the National Institute of Standards and Technology (NIST), defines Big Data as: "Where the data volume, acquisition velocity, or data representation limits the ability to perform effective analysis using traditional relational approaches or requires the use of significant horizontal scaling for efficient processing."
The report notes that while technology will rightfully dominate the discussion of Big Data, in truth the definition needs to extend beyond just those limits to embrace both the challenge and the ultimate value.
Based upon the panel insights, the report amends the NIST definition to: "Big Data is the set of technical capabilities and management processes for converting vast, fast, or varied data into useful knowledge."
To download the report visit: http://www.meritalk.com/Balancing-Cyber-BigData
Balancing The Cyber Big Data Equation
"The whole is greater than the sum of its parts." (Aristotle)
The Cyber Big Data Equation: 18 Federal IT Big Data and cyber security experts talk about the emerging interplay between the two disciplines in this MeriTalk research report.
Jeanne Holm
Data.gov
BIG DATA: IT'S ON ISILON
Store & analyze your data. Protect your data. Secure your network.
Game-changing technologies for ISR & FMV.
www.emc.com/federal
8 Steps To Move
Up The Curve
At HUD, turning Big Data into actionable
information requires a lot of moving parts
to work in harmony.
There are those in government actively pursuing the Big Data quest. They actually enjoy doing the hard stuff.
Some of these government implementers spoke of their experiences at the recent Big Data for Government and Defense Conference in Washington, DC. The conference was presented by the Institute for Defense & Government Advancement (IDGA).
HUD Vision
Tarrazzia Martin is the Director of Future IT Environment at the Department of Housing and Urban Development (HUD). She told conference attendees that HUD is keenly aware of today's IT trends that focus on transforming legacy infrastructures into agile environments using cloud and shared services.
"The one constant: the trend that binds all IT transformation efforts is the data," Ms. Martin said.
Ms. Martin noted that HUD will consider the impact of new trends, studying how these trends will affect the portfolio and what the budgetary implications will be.
"Each trend has been analyzed using a low, medium, and high rating on four key areas," Ms. Martin explained.
1. Strategic impact to HUD: how this trend will bring new
services and capabilities to customers, employees, and
stakeholders
2. Benefit rating
3. Years to mainstream adoption
4. Business values.
She described the current HUD environment as one where it is sometimes difficult to deliver information to staff in a timely and seamless way. To remedy that means moving HUD up the Big Data implementation curve.
"The lack of consistent, reliable, authoritative common data is a challenge to success," said Ms. Martin. "This is not a should-we-or-shouldn't-we proposition, it's not a question of timing, it's happening now."
"You need to understand where your agency is on the implementation curve and develop a plan for moving up the curve as quickly as possible," Ms. Martin urged. She provided eight steps you can take to move up the curve.
8 Steps Up The Curve
1. Data Creation: Smartphones, social media apps, online shopping, video streaming, and web surfing all create Big Data. These digital trails can be measured in terabytes, and when you multiply that by a thousand consumers or knowledge workers, you reach a petabyte. Voila, you've got Big Data.
2. Tiering: Tiering is all about data accumulation. IT departments invest in tape storage, disk storage, flash storage, desktop storage, and cloud storage as a way of taking it all in. Best practices involve tiered storage, where data is moved to the most cost-effective medium. As terabytes become petabytes, you must have a well-conceived storage strategy to survive.
3. Optimization: The influx of petabytes (1,000 terabytes) and exabytes (1,000 petabytes) has upped the game for data managers (see the sizing sketch after this list). Exponential growth requires IT teams to rethink what's worked in the past and bring in new tools to optimize database workloads that are orders of magnitude greater than before.
4. Analytics: Once your Big Data house is in order, the next step is to run algorithms against the data in search of insights. Keep in mind that big data isn't just more data; it is new types of data, coming in faster and from new sources.
5. Share The Wealth: Make this an enterprise-wide initiative, not the exclusive realm of a few highly specialized data analysts. Put big data tools and access into the hands of as many employees as possible, and mobilize those capabilities the same way that you have other enterprise apps. You will also want to share big data even more broadly by using APIs to make data sets available outside your organization, as in the case of Data.gov.
6. Run The Business: Now that your terabytes of raw data have been transformed into information, customer and operational data can be used to better run all aspects of your business, making it a case study in Big Data implementation.
7. Change The Business: This is the step that gives your agency a sharper competitive edge. Big Data lets agencies get lightning-fast answers to new questions that only they think to ask, accelerate decision making and actions, predict outcomes, and create new data-driven products and services.
8. Disrupt Your Industry: This is the height of big data execution, and where you want to be.
"Big Data is an instrument of change, and innovation is coming," Ms. Martin said. The takeaway: Federal agencies must create a cohesive, integrated strategy to manage our #1 asset, which is data.
Source: IDGA
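To put the storage units in steps 1 through 3 into perspective, here is a back-of-the-envelope sizing sketch. The per-user data figure and the hot/archive split are made-up assumptions for illustration; only the unit conversions come from the list above.

# Back-of-the-envelope sizing sketch for the tiering and optimization steps.
# The per-user figure and tier split are invented assumptions; only the unit
# conversions (1,000 TB = 1 PB, 1,000 PB = 1 EB) come from the list above.
TB_PER_PB = 1000
PB_PER_EB = 1000

users = 1000                      # a thousand consumers or knowledge workers
tb_per_user_per_year = 1.0        # assumed digital trail per person per year

total_tb = users * tb_per_user_per_year
total_pb = total_tb / TB_PER_PB   # 1,000 TB -> 1 PB

hot_share = 0.15                  # assumed share kept on fast disk/flash tiers
print("total volume: %.2f PB (%.4f EB)" % (total_pb, total_pb / PB_PER_EB))
print("hot tier:     %.2f PB" % (total_pb * hot_share))
print("archive tier: %.2f PB" % (total_pb * (1 - hot_share)))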
Tarrazzia Martin
HUD
WE PUT THE IT IN ABILITY
Keep All the Data You Need
Massive amounts of information are being collected at the federal level to help solve some of the nation's most pressing challenges. The need to store, access, and turn that data into information that guides decisions is a top priority.
GovConnection's expertise helps you:
Protect structured and unstructured data
Accommodate user demands
Meet security regulations
Determine disaster recovery goals
We can optimize your storage so you have the input necessary for every big data initiative.
2014 GovConnection, Inc. All rights reserved. GovConnection is a registered trademark of
PC Connection, Inc. or its subsidiaries. All copyrights and trademarks remain the property
of their respective owners. #2312 FMC0314
1.800.800.0019
www.govconnection.com/federal
Embrace the data revolution
with integrated storage solutions
we solve IT.
How We Do IT
DATA MANAGEMENT SOLUTIONS
GovConnection has the tools and resources to
discern, design, and deliver the right solution for
your environment.
Storage Assessment
Storage Healthcheck
Backup & Recovery Services
www.govconnection.com/datamanagement
White House Wants Public Comments
by March 31
On March 5, EPIC, the Electronic Privacy Information Center, announced the White House is requesting public comments on the Big Data and the Future of Privacy review.
EPIC, joined by 24 consumer privacy, public interest, scientific, and educational organizations, petitioned the Office of Science and Technology Policy last month to accept public comments.
The petition stated, "The public should be given the opportunity to contribute to the OSTP's review of Big Data and the Future of Privacy since it is their information that is being collected and their privacy and their future that is at stake."
The letter sets out several important questions, including whether current laws are adequate and whether it is possible to maximize the benefits of Big Data while minimizing the risks to privacy.
Comments are due by March 31, 2014. For more informa-
tion, see EPIC: Big Data and the Future of Privacy.
On January 23, 2014, at the direct request of the President, John Podesta began a comprehensive review of how the intersection of Big Data and privacy is affecting the way we live and work.
"While we don't expect to answer all these questions, or produce a comprehensive new policy in 90 days, we expect this work to serve as the foundation for a robust and forward-looking plan of action," wrote the Counselor to the President. A preliminary report is due at the end of April.
The goal is to study the Big Data relationship between government and citizens. The immense volume, diversity and potential value of data will have profound implications for privacy, the economy, and public policy.
The working group will consider all those issues, and "specifically how the present and future state of these technologies might motivate changes in our policies across a range of sectors," he said.
Mr. Podesta is joined in this effort by Secretary of Commerce Penny Pritzker, Secretary of Energy Ernie Moniz, the President's Science Advisor John Holdren and the President's Economic Advisor Gene Sperling, among others.
Mr. Podesta expects to deliver a report that anticipates future technological trends and frames the key questions that the collection, availability, and use of Big Data raise for both our government and the nation as a whole.
Mr. Podesta said this is going to be a collaborative effort. The President's Council of Advisors on Science and Technology (PCAST) will conduct a study to explore in depth the technological dimensions of the intersection of Big Data and privacy, which will feed into this broader effort.
The working group will consult with industry, civil liberties groups, technologists, privacy experts, international partners, and other national and local government officials. It will also work with think tanks and academic institutions, which are convening stakeholders to discuss these very issues and questions.
Quick Follow-Up Action
One of those institutions is MIT, which hosted the first "Big Data Privacy: Advancing the State of the Art in Technology and Practice" workshop on March 3, 2014. OSTP is co-hosting a series of public events to hear from technologists, business leaders, civil society and the academic community.
Organized by the MIT Big Data Initiative at CSAIL and the MIT Information Policy Project, the workshop explored core technical challenges associated with Big Data apps and the theoretical grounding for privacy considerations in large-scale information systems. The state of the art in privacy-protecting technologies, and how they can be applied to a diversity of Big Data applications, was also discussed.
Among the speakers at the event were:
MIT President Rafael Reif
White House Counselor John Podesta (Keynote Speaker)
Secretary of Commerce Penny Pritzker (Keynote Speaker)
Cynthia Dwork, Microsoft Research
At the event, Mr. Podesta talked about taking a holistic view of
Big Data and how the Internet of Things is going to require us
to look hard at our current policies to make sure they are in sync
with the real world.
In addition to the MIT event, the OSTP will be co-hosting at least two additional events: one with the Data & Society Research Institute and New York University, and one with the School of Information and the Berkeley Center for Law and Technology at the University of California, Berkeley.
Visit the White House at http://www.whitehouse.gov/blog/issues/Technology for further details.
Privacy In
The New Big Data Era
You can help the White House study in depth the technological dimensions of the intersection of Big Data and privacy. Here's how.
John Podesta
White House
Helping Agencies Prove IT First and Prove IT Fast
Software AG Government Solutions is a leading software solutions company, delivering
massive-scale, complex and real-time solutions for:
Business Process Management
Integration
Analytics & Visualization
Application Optimization
Put our team to the test.
Learn how our "special forces" approach can get fast results for your most complex integration and
process challenges. Visit: www.SoftwareAGgov.com
GET THERE FASTER
Take a Special Forces Approach to Enterprise IT
The path is a circle, not a straight line.
You know the data you steward could do more to improve the lives of citizens, or improve social and welfare services, or save the government money. You know you are a prime candidate for Big Data. You also know that unless you are getting part of the $200 million in announced grants, new money is not an option.
But that doesn't mean you can't get started.
NIST's Ashit Talukder told OTFL you must first start with fundamental questions, including: What is Big Data? Having a consensus definition allows people to speak in a common language, use a shared set of reference architectures, and is the key to a more uniform way of approaching Big Data.
At the moment, there is no consensus definition, and NIST and others are working hard to formulate one. But in the meantime, Big Data projects need to go forward. So, how do you proceed?
TechAmerica's Big Data Commission, in its Demystifying Big Data report, presents a proven five-step cyclical approach to take advantage of the Big Data opportunity.
Review each step continually along the way. To think Big Data is to think in cyclical, not serial, terms.
"These steps are iterative versus serial in nature, with a constant closed feedback loop that informs ongoing efforts," the Big Data Commission writes.
Start from a simple definition of the business and operational imperatives you want to address and a set of specific business requirements and use cases that each phase of deployment will support.
At each phase, review progress, the value of the investment, the key lessons learned, and the potential impacts on governance, privacy, and security policies. In this way, the organization can move tactically to address near-term business challenges, but operate within the strategic context of building a robust Big Data capability.
How To Succeed
Specifically, the Big Data Commission says successful Big Data implementations:
1. Define business requirements: Start with a set of specific and well-defined mission requirements, versus a plan to deploy a universal and unproven technical platform to support perceived future requirements. The approach is not "build it and they will come," but fit for purpose.
2. Plan to augment and iterate: Augment current IT investments rather than building entirely new enterprise-scale systems. The new integrated capabilities should be focused on initial requirements but be part of a larger architectural vision that can include far wider use cases in subsequent phases of deployment.
3. Big Data entry point: Successful deployments are characterized by three patterns of deployment, underpinned by the selection of one Big Data entry point that corresponds to one of the key characteristics of Big Data.
Velocity: Use cases requiring both a high degree of velocity in data processing and real-time decision making tend to require Streams as an entry point.
Volume: Those struggling with the sheer volume of the data they seek to manage often select a database or warehouse architecture that can scale out without predefined bounds.
Variety: Use cases requiring an ability to explore, understand and analyze a variety of data sources, across a mixture of structured, semi-structured and unstructured formats, horizontally scaled for high performance while maintaining low cost, imply Hadoop or Hadoop-like technologies as the entry point (a plain-Python stand-in for this pattern follows this list).
4. Identify gaps: Once an initial set of business requirements has been identified and defined, government IT leaders assess their technical requirements and ensure consistency with their long-term architecture. Leaders should identify gaps and then plan the investments to close them.
5. Iterate: From Phase I deployments you can then expand to adjacent use cases, building out a more robust and unified Big Data platform. This platform begins to provide capabilities that cut across the expanding list of use cases, and provides a set of common services to support an ongoing initiative.
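The sketch below is a plain-Python stand-in for the "Variety" entry point: map, then reduce, over a mix of structured and semi-structured records. Hadoop and similar tools run this same pattern in parallel across a cluster; the sample records and field names here are invented for illustration only.

# Plain-Python stand-in for the "Variety" entry point: normalize mixed record
# shapes in a map step, then aggregate in a reduce step. Hadoop-style tools
# parallelize this pattern across a cluster; the sample data is invented.
from collections import Counter

structured_rows = [
    {"agency": "NOAA", "topic": "weather"},
    {"agency": "HUD", "topic": "housing"},
]
semi_structured_logs = [
    "agency=NOAA topic=oceans",
    "agency=NASA topic=weather",
]

def map_record(record):
    """Normalize either record shape to (topic, 1) pairs."""
    if isinstance(record, dict):
        yield record["topic"], 1
    else:
        fields = dict(part.split("=", 1) for part in record.split())
        yield fields["topic"], 1

def reduce_counts(records):
    counts = Counter()
    for record in records:
        for topic, n in map_record(record):
            counts[topic] += n
    return counts

if __name__ == "__main__":
    print(reduce_counts(structured_rows + semi_structured_logs))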
The Proven Approach
To think Big Data is to think in cyclical, not serial, terms.
Chart courtesy of TechAmerica Foundation: 1. Define. 2. Assess. 3. Plan. 4. Execute. 5. Review.
www.dlt.com/STaaS | edm-solutions@dlt.com
Storage-as-a-Service.
Visit us at www.DLT.com/NetApp-Dummies-eBook for more information
On-premise. On-demand. Secure.
Elastic pay-as-you-use model.
No capital investment.
OTFL: NGA is two years into embracing the vision Mr. Griffith described. Can you provide an update of your efforts?
David Bottom, NGA: We have made significant progress. To follow up on Jim Griffith's points, I would emphasize that the NGA Strategy establishes the strategic goals and objectives that will guide our efforts to fulfill NGA's mission and vision and, in so doing, ensure that NGA continues to lead the community in providing timely, relevant, and accurate geospatial intelligence in support of national security (our mission).
One major effort we have undertaken to improve the integration of data and intelligence information is to comprehensively transform how we manage and use content, something we call the Map of the World (MoW), which will serve as NGA's major service for visualizing geospatially-enabled content through a spatially accurate four-dimensional virtual model of the world.
It is a centralized, seamless visualization of all geospatial-related content, and it is online, on-demand. MoW will provide users with a single visualization environment allowing them to intuitively identify, retrieve, display, and manipulate the content they choose for any area of the Earth at any time, as well as allow users to integrate their own content and otherwise tailor displays in a variety of ways to provide the specific content and visualization solutions they need for their mission.
By providing a central focal point for visualizing content, MoW will enable the integrated display of foundation GEOINT and intelligence content across IC disciplines and, as an added benefit, will provide access to the specific analytical tools they need as well.
OTFL: What are your top priorities?
David Bottom, NGA: My top priorities are:
To provide a world-class GEOINT IT platform that integrates intelligence for our customers: the war fighter, analysts, policy makers, and first responders who are responsible for executing military and intelligence operations, ensuring homeland defense, developing and enforcing our nation's policies on issues such as counterterrorism and counter-proliferation, humanitarian missions, and disaster relief.
To reduce associated costs. We face an austere fiscal environment that demands better governance and deliberate planning to maintain critical community capabilities while preserving resources for future mission development. We must leverage our resources in the most efficient manner possible to meet mission demands through diligent risk assessment and careful balancing of our investments.
OTFL: Is there one NGA Big Data product/service success
story that stands out?
David Bottom, NGA: Activity-Based Intelligence or ABI.
ABI is a set of methods or workflows that enable better intelligence through knowledge discovery and capture. It uses high-performance technologies to quickly ingest, preprocess, and analyze Big Data.
ABI focuses on analyzing events and transactions in a spatial
environment to address intelligence problems by discovering
or resolving unknown entities and their associated networks.
David L. Bottom
Director
IT Services
National Geospatial-Intelligence Agency (NGA)
OTFL ExpertViews
In 2012, OTFL interviewed Jim Griffith, NGA Deputy Director of the Vision Integration Team, where he spoke about the NGA Vision: to fundamentally change the user experience by providing online, on-demand access to NGA GEOINT knowledge and data, and to create new value by broadening and deepening the analytic expertise.
At the time, OTFL asked Mr. Griffith why NGA was embarking on this mission.
Mr. Griffith replied:
"I think the technology is now to a place where we can advance it. (Senior leaders) talk about how we need to integrate and collaborate; and that's been a process that's been at work over the last several years. We are now reaching a critical mass within the community to do this; and we are now at a point where we can really take advantage of the new technology to enable our analysts to do what they were doing already, but do it in a more intuitive and collaborative manner.
"The national security community is looking for integrated and collaborative analysis and assessments. NGA's part of that is providing that foundational layer of GEOINT that helps folks to visualize what they are thinking and talking about in time and in space. And that's a very powerful tool. We are developing this tradecraft to make better use of our people and their time. Now we've reached the stage where technology and the amount of content allow us to spread it out to a wider audience.
"Now in the field, our analysts aren't just imagery analysts, they are also geospatial analysts and they are doing high-level GEOINT. And they are sitting right next to someone from the Army or the Navy or the Air Force, NSA or DIA; they are all sitting there concentrating on the same issue and they are developing new ways of doing GEOINT to support everything that's going on around them."
Now in 2014 NGA is steadily progressing towards the goals Mr. Griffith outlined. To follow up on NGA's progress and to learn more about what NGA is doing with Big Data, OTFL interviewed NGA Director of IT, David L. Bottom. What follows are his Big Data thoughts and observations.
It focuses on capturing activities as they occur and, based on an understanding of patterns of life, analyzing those activities to distinguish normal from abnormal, to determine relationships, and to discover networks.
Our endeavors with ABI reflect some key tenets of managing and utilizing Big Data. The increasing volume, velocity, and variety of data drive fundamentally different requirements for the storage, processing, analysis, dissemination, and security of information systems and associated content.
They require: scalable storage and distributed processing; agile and flexible methods for handling multiple-source formats; and streaming data processing (process before storage). To optimize performance, processing should be as close as possible to the data, and we must distribute the analytics and not the data.
It is now easier and faster to move software than the more massive data sets. We also emphasize driving cost efficiency through use of Open Source software, enabling reuse and flexibility (plug and play) of components with non-proprietary frameworks and Service-Oriented Architectures, and we promote interoperability through implementation of Open Standards.
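As a generic illustration of the "process before storage" idea described above (not NGA's actual pipeline), the sketch below streams events through filter and summarize steps as Python generators, so only a reduced result ever reaches the data store. The event format and threshold are invented assumptions.

# Generic "process before storage" sketch (not NGA's actual pipeline): events
# flow through generator stages, and only the reduced summary is persisted,
# so the raw stream never has to land in storage first.
def read_events(lines):
    for line in lines:                        # stand-in for a live sensor feed
        ts, sensor, value = line.split(",")
        yield {"ts": ts, "sensor": sensor, "value": float(value)}

def keep_interesting(events, threshold=50.0):
    for event in events:
        if event["value"] >= threshold:       # drop routine readings early
            yield event

def summarize(events):
    summary = {}
    for event in events:
        summary[event["sensor"]] = summary.get(event["sensor"], 0) + 1
    return summary

if __name__ == "__main__":
    feed = [
        "2014-03-01T00:00,alpha,12.0",
        "2014-03-01T00:01,alpha,61.5",
        "2014-03-01T00:02,bravo,72.3",
    ]
    print(summarize(keep_interesting(read_events(feed))))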
OTFL: What advice would you offer government managers
about how to organize their own infrastructures to take
advantage of Big Data?
David Bottom, NGA: The overarching question that needs to be answered is, "What mission outcome am I trying to effect?" Taking an outcome-based approach drives capability definition and development of the solution that is optimal for your given environment.
I would encourage an in-depth study of best practices that are being codified as more and more Big Data solutions are implemented. For example, utilization of open standards is key to an open IT environment and promoting interoperability. I would emphasize the need for having a good data governance strategy and developing a firm understanding of what technology is available to make informed choices; become familiar with the tools and techniques that are emerging as part of the big data trend. And it is critical to ensure you have the required skill sets in your workforce to operate and maintain the solution you implement.
OTFL: Opportunities often bring challenges. Can you describe some of the Big Data challenges you still face?
David Bottom, NGA: We still find ourselves dealing with stovepipes: a large amount of data, not viewed as important during initial product production, that is not released to the community. We are making strides in this area, and our Director of National Intelligence (DNI)-sponsored Unified Intelligence Strategies (UIS) emphasize achieving intelligence integration, a key DNI objective for the Intelligence Community.
We must develop a common strategy to separate data from
applications and optimize for the cloud environment.
We must push forward on selection of a common service platform for Big Data analytics that allows easy implementation of new analytic applications and data access.
OTFL: Some say the term Big Data will be gone by the end of the decade. The term may go away, but Big Data itself is not.
Looking down the road, what opportunities do you see for
growth of Big Data?
David Bottom, NGA: Big Data carries the promise of cost savings, flexibility in our IT architectures, improved interoperability and increased performance. In the area of increased efficiencies we can look for simplified infrastructure management, faster development of tools, use of a common query language, and the decline of duplicative data stores. We can look forward to smarter products with access to community data and processing services; spending less time discovering data and more time building complex products; and, with more automated portions of the workflow, we should increase the speed and efficiency of our efforts.
OTFL: If you had the opportunity to write this piece, what would be your story lines?
David Bottom, NGA: 1. NGA is working hard to deliver a robust, safe, secure, and agile data framework and interfaces that foster community sharing of data and application development.
NGA is committed to the development and promulgation of applications within the GEOINT community. NGA will build an open, agile, and resilient data framework to invite community participation in developing, deploying, and sharing their own applications for use by the entire GEOINT community. It is imperative that the GEOINT enterprise architecture operates in the most efficient and effective manner possible to enable secure and responsive exploitation, analysis, and conception of solutions.
2. Emerging science and technology brings both challenges
and opportunities.
NGA's success will depend on how we embrace change, especially that which is enabled by advances in technology. As our
adversaries adopt new denial and deception techniques, NGA
must use innovative sources, tools, techniques, and processes to
maintain our strategic advantage. NGA will be guided by our vi-
sion, as expressed in our strategic goals and objectives, in taking
GEOINT to the next level.
3. NGA will create and make easily accessible and usable
GEOINT content that addresses key intelligence questions and
anticipates the entire range of our consumers' needs.
Recognizing that GEOINT data, products, services, and knowledge are most relevant when the information is easily accessible,
NGA is committed to making its content discoverable, accessible,
and usable in multiple security domains.
We will develop and implement standards for GEOINT content
creation, sharing, and storage. NGA will work with the community
to develop and evolve common standards to permit the sharing
of GEOINT content, both to enrich the entirety of our collective
GEOINT holdings and to reduce duplication and cost.
Through community engagement, NGA will make every effort to ensure that current and future systems that use, produce, or enable GEOINT are interoperable and adhere to applicable standards.
Create A Map Now!
Click on the link to visit the Geospatial Platform, where it's easy to make your own map. Just follow these steps.
1. Choose an area.
From the overall U.S. map, pan and zoom the map to an area
or search by its name or address.
2. Decide what to show.
Choose a Basemap. Examples are street, topographic, satellite, or terrain view. Then add layers on top of it.
3. Add more to your map.
Create an editable layer, either from your own files or the web, to draw features on the map.
Display descriptive text, images, and charts for map fea-
tures in a pop-up.
4. Save and share your map.
Give your map a name and description then share it with
other people.
The Geospatial Platform At a Glance
The Geospatial Platform initiative is a critical component of the
NSDI Strategic Plan.
The Platform is a web-based service environment that provides access to a suite of well-managed, highly available, and trusted geospatial data, services, applications and tools. In addition, the FGDC and its partners will utilize common cloud computing and enterprise acquisition approaches as mechanisms to leverage technology, close productivity gaps, and combine their buying power for similar needs.
The Geospatial Platform is:
A one-stop shop to deliver trusted, nationally consistent
data and services.
A portal for discovery of geospatial data, services, and applications (www.geoplatform.gov).
A publishing framework for geospatial assets.
A place where partners can host data and analytical services.
A forum for communities to form, collaborate and share
common geospatial assets.
Built within the federal cloud computing infrastructure.
The Geospatial Platform Supports:
Open and interoperable standards to facilitate sharing and
use of geospatial assets.
Organization and maintenance of a federated network of
organizations that contribute content.
OMB Circular A-16 Portfolio Management to ensure trusted content and funding for priority data, services, and applications.
Brokering of enterprise geospatial acquisitions and service
level agreements for the Federal Government.
Identifying requirements for and encouraging the development of shared geospatial data, services, and applications for the Federal Government.
Acquisition of standards-based cloud services with Federal Information Security Management Act (FISMA) security accreditation.
The future tracking of Federal geospatial investments.
Source: FGDC
Join MeriTalk's Big Data Exchange community to engage
in discussions on the opportunities and challenges presented
by big data, risk management, data sharing, and more.
Stay tuned and get involved at www.meritalk.com/bdx
BIG DATA
is more than a buzzword.
On The FrontLines Magazine (OTFL) is dedicated to profiling the people, programs and policies advancing innovation and best practices in government IT.
Each OTFL is laser-focused on one content area, with 2014 issues dedicated to:
Cybersecurity
Cloud Computing
Big Data
Mobility
Data Center Consolidation
To further serve government IT buyers, OTFL also
publishes annual contract guides on these GWACs:
NITAAC (NIH)
SEWP (NASA)

Three Ways To Read
Each digital edition can be read online using flip page technology, on your device of choice as an interactive PDF, or in print.
1. The digital edition is published using flip page technology, which allows readers to enjoy the full print magazine experience.
They turn paper-like pages, link directly
to websites, white papers, and other
information hosted on the Internet,
while watching videos inside the
actual articles and feature stories.
To experience it yourself, click here.
2. After reading the issue using flip page, readers then click on
the direct link to download the
interactive PDF from
www.OnTheFrontLines.net.
The interactive PDF contains all
the same direct live links and videos,
so they can read the digital edition at
their convenience on their computer,
mobile device or share easily with
colleagues.
3. The print edition is a great
portable resource you can distribute
at conferences and tradeshows, mail
to clients or use as collateral during
sales calls.
Past issue: Big Data in Government (Volume 5 Number 2, March 2013). Cover image courtesy National Science Foundation: Fuqing Zhang and Yonghui Weng, Pennsylvania State University; Frank Marks, NOAA; Gregory P. Johnson, Romy Schneider, John Cazes, Karl Schulz, Bill Barth, The University of Texas at Austin.
Dedicated to
Advancing Innovation
& Best Practices
In Government
Reserve your sponsorship today! Save with multiple sponsorships!
Contact Tom Trezza at 201-670-8153 or email ttrezza@trezzamediagroup.com.
Past issues: Using Mobile Technologies To Power Today's Digital Government (Volume 5 Number 4, May/June 2013); Volume 5 Number 6, July/August 2013.
Past issue: Government Data Center Consolidation (Volume 5 Number 7, September/October 2013). Success is now measured by TCO incorporating metrics for energy, facility, labor, and storage, among other things, not the number of closures.
Read online using flip page technology.
Print edition is
great collateral.
Download the
fully linked PDF.
Every OTFL is available
at http://digital.onthefrontlines.net!
Serving the government IT community since 2009.
OTFL 2014 Schedule
January Cybersecurity Solutions in Government
February Big Data in Government
March Mobility in Government
March Special Issue! Profiles in Excellence
April Cloud Computing in Government
April Records Management
May NIH NITAAC Contract Guide
June Data Center Consolidation
July Cybersecurity in Government
August NASA SEWP Contract Guide
September Geospatial Trends in Government
November Cloud Computing in Government
December Health IT in Government
Subject to change. View every OTFL at http://digital.onthefrontlines.net
By Jeff Erlichman, Editor, On The FrontLines
Let's Talk Tech
Listen To The Experts
8 industry SMEs provide insights into implementing your Big Data efforts.
Processing the insights of 8 Big Data SMEs was a Big Data process in itself.
So what did I learn? I chronicled the details in the OTFL SME One-On-Ones on pages 18-25. But first, here's a cut and paste of what I found.
First, the obvious: Data is becoming more valuable. It is being
used to reveal insights that can boost competitiveness. The goal
is to turn data into knowledge or actionable information.
Speed counts. Data analytics is moving from batch to real time. Processing real-time or near-real-time information is moving past hype into the early stages of maturity. Get closer to your data.
Real-time predictive analytics. Predictive analytics enables organizations to move to a future-oriented view of what's ahead. Real-time data provides the prospect for fast, accurate, and flexible predictive analytics.
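As a rough illustration of that batch-to-real-time shift, the hypothetical Java sketch below maintains an exponentially weighted running estimate that is updated as each record arrives, instead of recomputing over the full data set in a batch job. The readings and the weight are invented for the example and do not come from any of the SMEs quoted here.

import java.util.List;

public class StreamingAverageSketch {
    public static void main(String[] args) {
        // Hypothetical stream of sensor readings arriving one at a time.
        List<Double> incoming = List.of(10.0, 12.0, 11.5, 30.0, 12.2);

        double alpha = 0.3;      // weight given to the newest reading
        double estimate = 0.0;
        boolean first = true;

        for (double reading : incoming) {
            // Update the estimate per record rather than re-running a batch job.
            estimate = first ? reading : alpha * reading + (1 - alpha) * estimate;
            first = false;
            System.out.printf("new reading %.1f -> current estimate %.2f%n", reading, estimate);
        }
    }
}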
The scope of Big Data analytics continues to expand. Interest
in applying Big Data analytics to data from sensors and intelligent
systems continues to increase to gain faster, richer insight more
cost-effectively.
We are moving from an information approach to an information strategy, one that entails decoupling data from applications and hardware.
Drive Better Outcomes
Diana Zavala from HP Enterprise Services crystallized the SMEs' view of Big Data's future when she told OTFL, "It's all about analytics and creating information and knowledge that leads to better decision-making and desirable outcomes."
To get there, Ms. Zavala urges IT to ask, "How can I build on top of what I have? It is not a rip and replace. We believe these things are evolutionary in nature and you don't have to start from scratch.
"You have to have the infrastructure to capture, store and scale the data. You have to be able to secure the information and provide tools to derive the value from the data," Ms. Zavala explained. "It's all about turning the data into actionable intelligence. Build on what you have today."
Michael Ho from Software AG Government Solutions told OTFL that Big Data implementers need to keep an open mind.
"Don't get boxed in to a particular concept of what Big Data is," he noted.
"Big Data to me is all about getting the most out of your data, no matter what size, what speed or what type it is, in the moment it is the most relevant to you. Every enterprise will have a different mix of size, speed and types of data, and how they see it will be different."
The questions Mr. Ho says to ask are: Am I getting the maximum value from my enterprise infrastructure and data today? If not, what are my needs? Is it the need to support more volume? Is it the need to support more speed?
Whatever the issues, he advises you to break things into chunks. That will make it easier to modernize your infrastructure in logical steps and lay the foundation for Big Data.
Finally, the challenge is to figure out the best way for each customer to achieve what they want to achieve, because there is no one-size-fits-all in Big Data.
SMEs Speak
22 Renee Reinke, Ciena
23 Michael Ho, Software AG Government Solutions
24 Dale Wickizer, NetApp
25 The GovConnection Team
26 Diana Zavala, HP Enterprise Services
27 Mercedes Westcott, Cloudera Government Solutions
28 Audie Hittle, EMC Isilon
29 Jay Mork, General Dynamics Advanced Information
Systems
SOLUTION BRIEF
Cloudera and Gazzang Partner to Deliver
Enterprise Data Security to Hadoop Users
> Does your Big Data platform contain sensitive or regulated information about your customers, employees, or business?
> Are you contractually obligated to keep customer information secure?
> Do you use hosted services, where data is stored beyond your control?
A positive answer to any of these questions requires a discussion of data security. Cloudera and Gazzang are partnering to ensure our customers' sensitive data is protected from unauthorized access or attack and meets strict compliance requirements for HIPAA, PCI-DSS, SOX, FERPA, and European data privacy regulations.
Security that scales with your big data environment
Cloudera delivers unparalleled performance, scalability and flexibility for Hadoop. Gazzang adds enterprise-strength data security that scales along with your Big Data implementation and runs at peak performance (with low single-digit overhead) whether the data resides in the cloud, on-premises or in a hybrid environment.
Gazzang zNcrypt
Gazzang zNcrypt helps ensure the security and confidentiality of your Cloudera Enterprise data, whether in the cloud or on-premises. The high-performance solution transparently encrypts and secures data at rest without any changes to your applications, storage or data. In fact, the applications utilizing the encrypted file system will be completely unaware of zNcrypt's presence. Process-based access controls provide an added layer of protection to prevent unauthorized processes and systems from accessing the encrypted data.
Gazzang zTrustee
Gazzang zTrustee is a software-based
secure vault for Gazzang zNcrypt keys and
any other digital artifact (SSL certificates,
SSH keys, configurations, etc.) that must be
secure and policy controlled. The solution
functions like a virtual safe-deposit box,
supporting a variety of robust, configurable,
and easy-to-implement policies governing
access to these artifacts.
Why Encrypt?
Mounting U.S. industry regulations like HIPAA and PCI-DSS and strict data privacy regula-
tions in the European Union require organizations to secure data wherever it is written to
disk and vigorously protect the encryption keys. Organizations that encrypt are often able to
claim safe harbor, meaning they are exempt from reporting a data breach should one occur.
GAZZANG
INDUSTRY
Data Security
WEBSITE
www.gazzang.com
COMPANY OVERVIEW
Gazzang provides transparent data encryption and advanced key management to help enterprises secure sensitive information in Big Data environments.
PRODUCT OVERVIEW
Meet compliance initiatives and
protect sensitive data at rest with
Gazzang zNcrypt and zTrustee. Gazzang
zNcrypt is a Cloudera-certified trans-
parent encryption solution for the Linux
file system, leveraging the industry-
standard AES-256 algorithm. Gazzang
zTrustee manages the zNcrypt keys and
enforces access policies.
SOLUTION HIGHLIGHTS
> Rapid, automated deployment at each
node, with no changes to the Cloudera
application
> High-performance data protection
that meets standards for compliance
and safeguarding PII and scales with
your Big Data environment
> Process-based access controls
prevent unauthorized parties from ac-
cessing sensitive data
> Software-based key manager adds
multiple layers of policy and protection
for encryption keys and other digital
artifacts
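zNcrypt and zTrustee operate transparently at the file-system and key-management layers, so applications never call an encryption API directly. For readers who want to see what AES-256 encryption of data at rest looks like in principle, here is a minimal, hypothetical Java sketch using the standard javax.crypto classes. It is an illustration of the concept only, not the Gazzang API; the sample record and the in-process key handling are invented for the example.

import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;

public class AtRestEncryptionSketch {
    public static void main(String[] args) throws Exception {
        // Generate an AES-256 data key. In a zNcrypt/zTrustee-style deployment the key
        // would be held by the key manager, never stored alongside the encrypted data.
        KeyGenerator gen = KeyGenerator.getInstance("AES");
        gen.init(256);
        SecretKey key = gen.generateKey();

        // Random 96-bit IV for AES-GCM authenticated encryption.
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);

        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
        byte[] ciphertext = cipher.doFinal(
                "sensitive citizen record".getBytes(StandardCharsets.UTF_8));

        // Only a holder of the key (and IV) can recover the plaintext.
        cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(128, iv));
        System.out.println(new String(cipher.doFinal(ciphertext), StandardCharsets.UTF_8));
    }
}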
Join MeriTalk's Big Data Exchange community to engage
in discussions on the opportunities and challenges presented
by big data, risk management, data sharing, and more.
Stay tuned and get involved at www.meritalk.com/bdx
BIG DATA
is more than a buzzword.
OTFL SME One-On-One
Renee Reinke
Senior Advisor
Global Industry Marketing
Ciena
The Crucial Cornerstone
On www.data.gov, everyone can access data, tools, and resources to conduct research, develop web and mobile applications and design data visualizations.
In fact, there are 127,000 data sets, covering everything from Agriculture to Consumer to Energy to Public Safety to Science & Research, and more. No single data center can store all these data sets; eventually the data has to traverse a network.
"Government needs agile, on-demand, high-performance networks to realize Big Data's big promise," Renee Reinke told OTFL.
"Data center compute and storage resources are undeniably critical, but so too is the network that transports those large data sets in a safe, error-free and timely manner," she noted. "The network is a crucial cornerstone to leveraging Big Data for improved decision-making."
Ciena provides the ideal high-performance optical and packet network infrastructure to support on-demand Big Data file transfers.
"Ciena understands that the network must transform to become just as flexible and on-demand as many compute and storage resources are today," Ms. Reinke added.
Organize Infrastructure for Big Data
Government executives need to understand that current network infrastructures were designed to connect users to data through various locally based applications. Those networks weren't designed to scale to 10G/40G/100G+, the bandwidth levels required for Big Data analytics.
To better leverage Big Data, IT planners must address two key challenges, Ms. Reinke counseled.
The first involves designing inter-data center networks that scale on demand to enable compute and storage resources to be shared among data centers for backup and recovery, and to virtualize the management of storage and the processing of larger data files.
Second, those planners must also address how to scale the network between data centers and users to speed the transfer of large data sets to researchers, application developers and other analysts.
In both challenges, the network infrastructure must cost-effectively support large volumes of information, while minimizing latency and ensuring secure data transport.
OPn Architecture
Ciena's OPn architecture delivers three key benefits: scalability, programmability and application awareness. OPn stands for optical packet networks at exponential scale, and delivers an open framework for converged optical transport, switching and service delivery.
Agencies are using Ciena packet optical networks on a national and global span to scale capacity from the Mbps range, to 10 Gbps, and to 100 Gbps and beyond.
"For example, agencies with a 10 Gbps inter-data center connection for everyday traffic could use Ciena's OPn architecture to burst throughput much higher (even to 100 Gbps) for periodic or unexpected workload spikes, and return the capacity back to the pool when finished," she noted.
This dynamic model, driven by users or applications, dramatically increases performance while maintaining efficiencies and control.
Reach Coveted Goals
As inter-data center networking moves from agency-owned premises to cloud-based facilities, the network becomes even more strategic.
"Government agencies are already working toward leveraging Big Data analytics to accomplish their most coveted goals," Ms. Reinke said.
Goals include everything from predicting the impact of global climate change, to improving situational awareness for warfighters, to finding a cure for cancer.
Of course, collecting, storing, accessing and analyzing vast quantities of data is no easy challenge. And agencies see non-traditional data sources as one of the biggest drivers of data growth in the future.
Industry surveys indicate the use of video and audio feeds, for example, is most likely to grow in the next two years. High-definition imagery would play a role in defense applications for drones, satellites, and battlefield sensors, and in border surveillance and facial recognition programs on the civilian side.
Ultimately, government executives realize there's much to be gained from the Big Data flowing through their organizations today. "Accurately measuring the effectiveness of citizen interactions, for example, brings greater legitimacy and accountability to key services," Ms. Reinke said.
Leveraging data-driven analytics, government executives expect to better inform future decisions, improve constituent interactions and streamline critical processes.
OPn NETWORK ARCHITECTURE
Whitepaper
What is OPn?
OPn (pronounced "Open") is Ciena's new network architecture. It derives from Ciena's vision of how networks will evolve, and guides both the company's product strategy and its advocacy to customers and the market.
The aim of OPn is to bend the cost curve of networking down in the face of rapidly increasing bandwidth and evolving service demands, and to enable network operators to monetize new applications and superior user experiences. To do this, OPn offers three key elements:
> Intelligent optical and lean packet capabilities to scale at lower cost than alternative architectures
> Software automation and orchestration to speed and reduce the cost of operations
> Software linkages to allow the network to better integrate with the data center, transforming connect, compute, and storage functions into a programmable platform for a wide variety of new services
The name OPn reflects these attributes:
> Exponential scale at lower cost, made possible by Optical Packet networks
> Software-enabled orchestration and automation, made possible by open interfaces both within the network, as well as between the network and the applications that use it
Why is OPn important?
Ciena's OPn network architecture is vital for networks to be able to scale economically to great capacity over wide areas. It is especially important in light of the changes wrought by cloud (particularly the evolution of the data center), mobile broadband, and the rise of the application model for new services. Because of the general growth of network traffic, network operators have been cost-challenged since the mainstreaming of the Internet. The cost-at-scale problem is exacerbated by the growth of content- and compute-centric services, which create greater variability and unpredictability of traffic, requiring greater overcapacity to compensate for traffic uncertainties.
OPn reduces cost at scale by:
> Using the lowest-cost, easiest-to-administer function possible for a given need. OPn optimizes network cost curves by concentrating high-touch, high-cost functions such as IP/MPLS connection engineering and fine-grained traffic shaping into a smaller number of locations, while optimizing a larger number of locations for low-touch, streamlined forwarding and transport. In particular, OPn introduces hardware systems optimized for low-touch packet forwarding, leaving high-touch functions to more centralized, higher-cost routers. In this sense, the OPn networking architecture is congruent with the trend in both content and compute for consolidation into a relatively small number of data centers.
> Enhancing scale economics for optical packet transport by innovating coherent optics for the highest possible capacity across a given optical link, delivering the richest and most reliable optical switching, and taking advantage of high-volume data center economics for packet-switching hardware
> Using open software techniques to decouple software scaling from hardware scaling and enable the network infrastructure to become a programmable platform that supports rapid and flexible service and application development. This will also allow the network to understand and react to shifting capacity needs, providing greater efficiency through higher resource utilization.
In addition to lowering capital and operational costs for large-scale networks, OPn is also important because it improves the competitiveness of network owners. Communications providers and IT providers are attacking one another's markets with the introduction of cloud services. Ciena believes the orchestration and software-definable network capability inherent to the OPn architecture allow network owners to prototype and release services and applications faster, differentiate their offerings, and/or deliver better customer experiences than would be possible with a "dumb" network.
OTFL SME One-On-One
Michael Ho
Vice President
Software AG Government Solutions
Maximize Your Windows Of Relevance
When it comes to the four Vs of Big Data (volume, velocity, variety and veracity), most attention has gone to volume, the sheer size of Big Data.
"In the past few years Big Data has been equated to the growth of data. Now when people speak of terabytes and petabytes, it is common," Michael Ho told OTFL recently.
"Traditionally, when people hear Big Data, they think, 'I have a giant data warehouse full of information, and I am generating tons and tons of information,' like financial sector data."
But massive amounts of information are just one aspect of Big Data; there are others, such as a massive demand for the same data.
"For example, at Healthcare.gov, there are thousands of people trying to access that information at one time; it's not massive amounts of data, but massive demand for the same data over and over again," he said. "People don't want to wait for information."
Two things make people wait: the time for processing and the time to access and retrieve information used in processing.
Closer To The Customer
At Software AG, all solutions are built with a common denominator and differentiator: speed.
And when your users need it, they need it fast. With the Terracotta In-Memory Data Management Platform, featuring BigMemory, the data your users need isn't buried deep in a database. It's stored in-memory, where it is quickly retrievable by multiple users with multiple apps.
"In-Memory is not just about the size of data, but the speed to access and analyze that data. What In-Memory does is allow you to move your data closer to your decision-makers," he explained.
Mr. Ho compared the evolution of In-Memory to the evolution in PC processors going from single thread to multicore, and storage going from disk to solid state.
"How did we make PCs faster for users? By letting them process more things. Take solid state vs. disk: the disk read/write time is longer. After you get past the processing problem, next is the access and writing of that data," he said.
In-Memory is the next evolution of the Big Data suite, with multiprocessing capabilities that give you faster access to data and put information closer to the customer.
In-Memory computing is about speed and has a real-time focus on data.
Focus On Relevance
Mr. Ho noted the last 10 years were spent saying, "Wow, we have all this data, now what do we do with it?" The next 10 will be about how we maximize our use of it. It's not just about size, but about the relevance of the data.
The question is: Can you get all the information you want out of your data set in the period of time in which it is relevant to you, your users and the people you serve?
"Every piece of data has a window of relevance in which it has value. It can be days, weeks, or months; maybe it never expires," Mr. Ho explained.
"But sometimes the window of relevance is seconds or sub-seconds, because information is moving by you so quickly and the things you are watching are changing so quickly that information generated a minute ago may no longer be relevant to decision making."
Need For Speed
"What happens is, because that data is moving so quickly and collecting in real time, your Big Data needs change," Mr. Ho noted.
"It's not that you have a massive amount of data; rather, it is that you have enough data that traditional methods of analytics, storage and processing are insufficient to meet your need for speed.
"That is where In-Memory comes in. We can write and read data much faster than before, because in-memory computing keeps data closer to the user and can allow us to do analytics in real time as it is coming by us or as we are clicking through the In-Memory store.
"We can achieve levels of analytics and general interaction with data at speeds that we wouldn't have conceived of five years ago. We are talking about 10x to 100x gains in efficiency by adding In-Memory computing to your architecture."
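As a rough illustration of that window-of-relevance idea, here is a minimal, hypothetical sketch using the open source Ehcache 2.x API described in the accompanying Terracotta datasheet. The cache name, size and 30-second time-to-live are invented for the example and are not a Software AG recommendation; once the window passes, the entry simply expires from the in-memory store.

import net.sf.ehcache.Cache;
import net.sf.ehcache.CacheManager;
import net.sf.ehcache.Element;
import net.sf.ehcache.config.CacheConfiguration;

public class RelevanceWindowCache {
    public static void main(String[] args) {
        CacheManager manager = CacheManager.create();   // default in-memory configuration

        // Hypothetical cache: entries expire 30 seconds after creation,
        // mirroring a 30-second "window of relevance" for sensor readings.
        CacheConfiguration config = new CacheConfiguration("sensorReadings", 10000)
                .timeToLiveSeconds(30);
        Cache cache = new Cache(config);
        manager.addCache(cache);

        cache.put(new Element("sensor-42", 98.6));       // write at in-memory speed

        Element hit = cache.get("sensor-42");            // returns null once the window has passed
        if (hit != null) {
            System.out.println("Still relevant: " + hit.getObjectValue());
        } else {
            System.out.println("Reading expired; fetch or recompute.");
        }

        manager.shutdown();
    }
}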
Mr. Ho stressed that Software AG solutions are meant to be complementary to a customer's existing architecture.
"A lot of customers hear modernization and all they see are dollar signs. The solutions are not meant to rip and replace," he said. "The idea is to speed up your ability to get to your end goal. Everything we do is bolted on to your architecture and helps evolve your enterprise."
Terracotta, Inc.
575 Florida St. Suite 100
San Francisco, CA 94110
Contact us to learn more about Terracotta products, services and support. Please visit
www.terracottatech.com, e-mail us at info@terracottatech.com or contact one of our
sales professionals at +1 (415) 738-4000. Terracotta is a Software AG company.
[Architecture diagram: applications using the Hibernate/Ehcache API and a low-latency memory cache connect through the Terracotta driver and a built-in communications layer to a terabyte-scale cache server array, with durability, mirroring, striping, developer console plug-in monitoring, and an operations center.]
Enterprise Ehcache is an easy-to-deploy caching solution for hard-to-solve scale and
throughput problems. Based on the de facto caching standard for Java, Enterprise
Ehcache snaps into enterprise applications for an instant, 10x speed increase and
on-demand, unlimited scale out.
Now with Search for Greater Value and Versatility
Enterprise Ehcache now offers a powerful native Search capability. Through a simple
API, Search lets you search and analyze in-memory data, providing a fast, cost-
effective alternative to querying overloaded databases. By leveraging a searchable
cache, you can solve a wider range of data challenges, at terabyte magnitudes, with
a single, highly scalable platform.
NEW! Powerful Search Capability
Search TBs of data at in-memory speeds
Avoid slow, expensive database queries
Simple, but flexible API
Snap-In Performance and Scale
Store more data in memory
Speed application response times
Gain unlimited linear scalability
Offload database or mainframe
Deploy with just two lines of configuration
Plug in BigMemory for TB-scale caches
High Availability
Fully redundant architecture
No single point of failure
Highly available, disk-backed cache
Full Platform Support
Tomcat
WebLogic
WebSphere
JBoss
ColdFusion
Jetty
Enterprise Ehcache
GlassFish
Resin
Hotspot
JRockit
IBM JDK
Enterprise Management and Control
Management console
Third-party monitoring tool integration
Asynchronous write-behind
Bulk loader APIs
Flexible TTI and TTL eviction
JTA support
Online backups
Faster Performance, Sustained Gains
Enterprise Ehcache provides immediate relief from performance bottlenecks to speed
application response times. It maintains performance at any scale, enabling dynamic,
no-compromise growth:
10,000+ transactions per second
30%-90% database load reduction
90% application latency improvement
Linear scale
In-memory performance
Rapid, Easy Deployment
Enterprise Ehcache solves hard scalability problems without forcing new concepts or
development paradigms:
Scales out with just two lines of configuration, not a full application rewrite
Reduces development time, makes scalability available to mainstream developers
PERFORMANCE AT ANY SCALE
OTFL SME One-On-One
Dale Wickizer
Chief Technology Officer, U.S. Public Sector
NetApp
Hedge Your Bets
The term "Big Data" will probably go away, since it is not a place one goes to, like the Cloud. But although the term might go away, the underlying problem of how to get the most value from the data does not.
"New analytic solutions may allow you to solve problems your existing traditional tools cannot," Dale Wickizer told OTFL recently.
"But since Big Data applications, particularly in the analytics realm, are an emerging market, the advice I have given to my colleagues in government agencies is pretty simple: Hedge your bets."
Mr. Wickizer advocated that agencies use pilot programs to test new technologies and determine whether there is a compelling ROI that justifies retooling the organization, hiring the right skill sets, and expanding the role of these technologies.
FlexPod Select
As a data storage provider, NetApp enables numerous Big Data solutions in high performance computing (HPC); intelligence, surveillance and reconnaissance (ISR); big analytics (e.g., Hadoop and NoSQL/NewSQL solutions); video surveillance; cybersecurity; and very large archives.
"If I had to pick one solution I am personally excited about, it would be our new FlexPod Select solution running OpenStack," Mr. Wickizer exclaimed. "It combines two great product lines, Clustered Data ONTAP running on our Fabric Attached Storage, and our E-Series, in the same solution.
"Clustered Data ONTAP is best-of-breed in supporting hypervisors, being able to quickly provision out server instances using space-efficient FlexClones, while being hypervisor agnostic. E-Series is best-of-breed for high-bandwidth, sequential data I/O. OpenStack is an exciting technology that is gaining tremendous momentum," Mr. Wickizer noted.
When you combine that new technology with those in the FlexPod Select, you get a solution which can be used to provision and run many of the new Big Data applications in a multi-tenant way.
In other words, you can run those applications on an enterprise-class, shared infrastructure solution using familiar operations architecture and processes (similar to those you use to run other enterprise applications).
"You can use software to completely carve up that hardware, any way you need to, in a software-defined manner. For example, you can create workflows between the new analytics applications and the traditional ones," said Mr. Wickizer. "You also get an infrastructure that can natively run applications developed on Amazon frameworks. That is very exciting."
Internet of Things
Mr. Wickizer also said he sees issues with viewing Big Data as another set of ETL technologies. The "database brains" out there look at Hadoop, as well as the new stream processing applications, as just another set of extract, transform and load (ETL) technologies.
"You can see this in the way the more traditional database companies have reacted to it," he noted.
What they fail to realize is that in the face of the Internet of Things, most data being stored and processed is non-relational in nature (most people mistakenly call it unstructured): it does not readily lend itself to being stuffed into a relational database.
"Furthermore, the expectation of getting actionable intelligence out of that data is near real-time. In other words, we no longer have the luxury of sitting back and developing a static schema for a database to answer a fixed set of business questions," Mr. Wickizer added.
"Now, questions are coming so fast that analysts have to pore through terabytes and petabytes of data very rapidly, develop schemas on the fly, get the answers we need, then reset for the next question.
"Much of that will need to be automated, as humans in the loop slow things down. And did I mention that most of that data is not sitting in some database somewhere?" he said.
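A small, hypothetical sketch of that schema-on-the-fly (often called schema-on-read) idea is shown below, using the widely used open source Jackson JSON library rather than any NetApp product. The record and field names are invented; the point is simply that no table schema has to exist before a question is asked of the data.

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class SchemaOnRead {
    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper();

        // Hypothetical sensor record; no relational schema was defined in advance.
        String record = "{\"sensor\":\"drone-7\",\"lat\":38.9,\"lon\":-77.0,\"temp_c\":21.4}";

        JsonNode node = mapper.readTree(record);   // parse without a fixed schema
        // Pull out only the fields this question needs; unknown fields are simply ignored.
        System.out.printf("%s reported %.1f C%n",
                node.path("sensor").asText(),
                node.path("temp_c").asDouble());
    }
}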
Lastly, this data is being kept around forever, which means organizations have a huge archival problem on their hands.
"The sheer number of files being stored is putting pressure on file systems, in that many cannot scale to accommodate the number of inode pointers required to organize it all. This is where I believe object-based storage can really help," Mr. Wickizer explained.
"Many files, along with their metadata, can be packed into a single, self-describing object. I am encouraged by what I see in this area, both with our own StorageGrid solution, but also in the Open Source arena with OpenStack Swift, CEPH and other technologies."
Datasheet
Clustered Data ONTAP
Operating System
Revolutionize your storage software, remove IT
constraints, and speed response to business changes
KEY BENEFITS
Nondisruptive Operations
Perform storage maintenance, hardware lifecycle operations, and software upgrades without interrupting your business
Eliminate planned and unplanned downtime for continuous business availability
Proven Efficiency
Drive storage cost reductions with comprehensive storage efficiency
Consolidate and share the same infrastructure for workloads or tenants with different performance, capacity, and security requirements
Grow efficiency as scale increases
Seamless Scalability
Scale capacity, performance, and operations without compromise
Scale SAN and NAS from terabytes to tens of petabytes without reconfiguring running applications
Combine different generations of storage hardware for seamless expansion
The Challenge
Businesses today struggle with the increasing amount of data that they need to store, manage, and back up. Growing competitive pressure and 24-hour business cycles require that your mission-critical business processes and data remain accessible around the clock. With your business environment in a constant state of evolution, you need a more agile approach to storage that can eliminate downtime, improve the efficiency of your infrastructure and your IT staff, scale nondisruptively as your business grows, and quickly adapt to changing business requirements.
The Solution
Clustered Data ONTAP addresses the challenges facing your growing and dynamic business by extending the innovation of NetApp Data ONTAP, the world's number one branded storage operating system.1 Our unified cluster architecture scales and adapts to your changing needs, reducing risk and cost.
Clustered Data ONTAP is designed to eliminate downtime, allowing you to service your infrastructure without disrupting access to user data and applications, even during regular business hours.
Proven operational efficiency helps you simplify your overall storage environment and manage storage infrastructure at scale by automating important processes and increasing productivity. You can add capacity as you grow across both SAN and NAS environments without reconfiguring running applications. We let you start small and grow big without the disruptive hardware upgrades required by other storage vendors.
Clustered Data ONTAP provides up to 24 storage controllers, or nodes, managed as a single logical pool so your operations scale more easily. NetApp supports the broadest set of storage protocols and is the only provider to deliver both SAN and NAS data access from a single, unified scale-out platform.
Prevent Business Disruptions
With IT now integral to your business operations, the impact of downtime goes beyond dollars or productivity lost. Your company's reputation might be at stake. Clustered Data ONTAP eliminates sources of downtime and protects your critical data against disaster.
Nondisruptive operations
Our nondisruptive operations capabilities allow you to perform critical tasks without interrupting your business. The ability to dynamically assign, promote, and retire storage resources lets you
1. IDC Worldwide Quarterly Disk Storage Systems Tracker Q4 2012, March 2013 (Open Networked Disk Storage Systems revenue).
OTFL SME One-On-One
The GovConnection Team
Double The Data: Are You Prepared?
Is Big Data becoming a big challenge? At many agencies, the amount of data collected, stored and disseminated will double in size this year. Are you prepared? Do you have the necessary storage and network capabilities to keep you from drowning in data?
"Now is a tough time to be managing data. The amount of information you're generating is exploding, while you're trying to do more with less," the GovConnection team told OTFL.
Agencies need to figure out how to capture, store, transfer and analyze data, so that Big Data is viewed as an agency asset, rather than an overwhelming burden.
So, what can you do to protect your organization and ensure you have the tools to meet evolving requirements and the constant growth of data?
Start with a Storage Assessment, urged the GovConnection team.
"Our storage assessment provides easy-to-understand information about your data growth and usage. We can help you make changes and improvements that reduce costs, shorten backup and recovery times, and accelerate performance and operational efficiency."
How a Storage Assessment Works
GovConnection uses an agent-less data collection tool that gathers information about your Microsoft Windows environment. The tool collects information on the rate of growth of your storage, file aging, file duplication, and file types. It also compiles information about your available storage space, wasted space, and storage inventory.
"This agent-less data collection method is both affordable and timely. The analyzed data offers valuable insights into the current state of your storage and gives our engineers the information they need to design a more efficient storage solution that best fits your environment and budget," the GovConnection team explained.
The finalized Storage Assessment Report includes:
Inventory of storage servers, arrays, and infrastructure
Capacity analysis
Storage usage and alerts
File categorization data (by type, by most frequently used, by user, by age, etc.).
The team said that organizations can use this information to:
Measure Total Cost of Ownership (TCO) and Return on Investment (ROI) expectations
Make decisions regarding regulatory compliance and data retention
Make immediate changes that reduce costs, shorten backup/recovery times, and improve overall operational efficiencies
Strategically schedule follow-on projects, like storage consolidation, storage virtualization, and tiered storage design and implementation, that yield additional benefits.
The Storage Assessment process is very easy. After a discovery call led by a GovConnection services engineer, an agent-less data collection tool is remotely installed and runs for 30 days. The tool generates detailed reports, from which the engineers provide recommendations and present a final report.
"Our engineers are certified and skilled at developing storage solutions that build efficiencies in your unique IT environment," the GovConnection team said. "They utilize a discern, design, and deliver methodology for each and every customer environment."
Network Foundation
Getting big value from Big Data applications requires a network that can support real-time video, dynamic content, server, storage and client virtualization, and other bandwidth-intensive applications.
"Scalability, security, and management are three key areas to focus on when upgrading your network. Investments here will add value and open doors to new technologies without consuming your entire budget," the GovConnection team said.
Implementing virtualization allows you to throttle supply to meet demand. This technology requires a network capable of delivering high availability and low-latency performance.
Optimizing your network also allows you to extend the life of your existing IT equipment while improving service levels, delivering significant cost savings.
"Converting your existing network into the infrastructure of tomorrow may seem daunting, but it's a lot easier when you know where to start," the GovConnection team said.
The GovConnection team can guide you with expert advice and offers a wide selection of products and services designed to reduce your operating costs and increase your network's performance.
The result is transforming your infrastructure into a secure, scalable, and efficient foundation for Big Data applications.
2013 GovConnection, Inc. All rights reserved. GovConnection is a registered trademark of PC Connection, Inc. or its subsidiaries. 24944 | 0513
1.800.800.0019
www.govconnection.com/storageassessment
Call an Account Manager to schedule a Storage Assessment today.
Is Big Data Becoming
a Big Challenge?
We Have the Tools You Need to Maintain Control
It's a tough time for data. The amount of information you're generating is exploding while you're trying to do more with less. Plus, with tighter regulations, records retention, and the need for better management, you are tasked with figuring out how to capture, store, transfer, and analyze your data so it doesn't become an overwhelming burden. What can you do to protect your organization and ensure you have the tools to meet evolving requirements and constant growth?
GovConnection is ready to help. Our Storage Assessment provides easy-to-understand information about your data growth and usage. We can help you make changes and improvements that reduce costs, shorten backup and recovery times, and accelerate performance and operational efficiency.
OTFL SME One-On-One
Diana Zavala
Director
Analytics & Data Management
HP Enterprise Services, U.S. Public Sector
Experience Big Data Discovery
As agencies deal with Big Data volume, variety and velocity issues, they also face new business and mission expectations to actually derive the most value from their structured and unstructured data.
So how can an agency actually test its Big Data hypotheses? How can it find out which use cases are really of use? HP's innovative Big Data Discovery Experience can provide the answers.
"HP has really invested in the area of Big Data. We believe information is the lifeblood of the organization," Diana Zavala told OTFL recently. "In doing everything from performing the mission to managing a data center, data is the common denominator.
"We have a range of infrastructure hardware and software capabilities and tools to securely manage and scale data. We also have the data management, business intelligence and the analytical services on the other end to realize the value of the information and support the mission."
Prove Your Use Case
Ms. Zavala said the Big Data Discovery Experience allows HP to bring together the right hardware, software, people and resources to help you quickly prove your analytic business use cases and then accelerate them into production.
The Experience helps you explore your data beyond the limits of the technology and practices you currently have in place.
"Our services provide a scalable discovery environment in which you can test, explore, and evaluate new insights, concepts, patterns, models, and hypotheses gleaned from your own data," added Ms. Zavala.
You can drill down into specific use cases, developed in conjunction with the customer, to look at the data and actually test the hypotheses you come up with.
Are they true? Do you gain the insights you are looking for? Are there low-hanging fruits you hadn't expected? Or did you not gain the insight?
"Being able to do that without investing in a huge infrastructure or a long-term project is really key in avoiding unnecessary expenses," Ms. Zavala said.
"It helps reduce or eliminate any of the barriers that might be stopping you from starting on that Big Data journey. We bring data scientists and information architects together with the customer to develop use cases and work through them."
The Big Data Discovery Experience service makes your journey more manageable and successful, enabling you to move quickly from exploring high-level concepts to driving business innovation.
Offered as a service, the HP Big Data Discovery Experience suite of services leverages HP's scalable, open and secure HAVEn platform.
HAVEn: A Heaven for Big Data
HAVEn is a Big Data analytics platform which leverages HP's analytics software, hardware and services to create the next generation of Big Data-ready analytics applications and solutions.
"HAVEn stands for Hadoop, Autonomy, Vertica, Enterprise Security, and 'n' solutions, the number of apps that you can create on top of that Big Data platform," Ms. Zavala explained.
"HAVEn is one of the industry's first comprehensive, scalable and open platforms for Big Data that is secure," she said. "It's unique because it handles 100% of your information: structured information like rows and columns, as well as unstructured information and human information, such as video, email or social media."
Ms. Zavala described how a platform like HAVEn can help you address huge volume with Hadoop, real time with Vertica, and the human side, or unstructured data, with Autonomy.
The insights you gain can prepare you to take action to improve operational efficiencies, improve risk management, lower operational costs, and expand revenue streams.
New Style of IT
The expectation of instant connection to information anytime and from anywhere is driving a shift to what HP calls a New Style of IT. This new style is business-led, with IT supporting organizational goals.
Ms. Zavala said the New Style of IT is not only about providing customers with the right enabling technologies, but truly supporting these new technologies and ways to consume and deliver services in the public sector.
"It's really critical to understand that objectives will change," she counseled, "so organizations need to start thinking less about machines and more about how technology is delivered and consumed. It's more about how users engage with technology than the technology itself."
Ms. Zavala added, "These evolving objectives help drive the right tools needed in your roadmap of enabling technologies."
Business white paper
Data-driven government
HP HAVEn turns Big Data into efficient and effective government
Get started
OTFL SME One-On-One
Mercedes Westcott
Vice President, Public Sector
Cloudera Government Solutions
The EDH Landing Zone
From efficiency to impact, Big Data and Hadoop are transforming the way we view data management.
"Until Hadoop came along, organizations, including government, simply had no way to store and process the volume and variety of data needed in the era of Big Data," Cloudera's Mercedes Westcott told OTFL recently.
A lot has been said of Hadoop and the role it plays in Big Data strategies these days. Because of all its benefits, Hadoop has to some degree become synonymous with Big Data. Yet Hadoop is just part of the answer.
Once organizations start storing all this information, they quickly realize that there is a wealth of insight within it that they can tap into for things like improved decision making or effective risk management, both within Hadoop and in concert with Hadoop.
"Those organizations that have figured this out are pushing the boundaries and framing a new data management paradigm based on an enterprise data hub, or EDH," Ms. Westcott explained.
An EDH, powered by Hadoop, is a single platform to efficiently and securely store, process, analyze, and serve all data, and to complement and extend existing systems. It allows agencies to capture and preserve terabytes of data for several years, according to each agency's mandate.
Unique Characteristics
Ms. Westcott pointed out that an EDH has several key characteristics that make it unique.
First, an EDH provides a landing zone for all new enterprise data, structured and unstructured, retained cost-effectively and in its full fidelity, for any period of time: an automatic archive that is both deep and wide, yet immediately accessible to all forms of computing.
Second, an EDH also offers one place to transform data from its raw format into structured formats for consumption by existing systems, while retaining the original data for future use and reuse, so nothing is lost.
Third, it provides business users direct, agile access to specific data sets to explore and analyze in place, with no data duplication to other systems, so these users can get at the data, reducing the BI backlog of business user requests and freeing capacity on existing systems.
Lastly, it allows organizations to bring new workloads directly to the EDH, not the other way around, thus reducing the need to invest in and move large volumes of data to external systems just to ask new questions.
At the same time, an EDH has the same capabilities expected in any data management system, such as security, governance, reliability and openness, so it can be used for all an organization's enterprise data with confidence.
"With an EDH at the center of an organization's Big Data strategy, programs and agencies can realize several key benefits: increased business visibility, better risk and compliance management, and reduced operating costs," Ms. Westcott summarized.
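As a rough sketch of the landing-zone idea described above, the hypothetical Java snippet below uses the standard Apache Hadoop FileSystem API to land a raw file in HDFS in its original form. The namenode address and paths are invented for the example, and landing the data is only the first step an EDH would follow with in-place processing and analysis.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LandRawData {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Hypothetical namenode address; use your cluster's own fs.defaultFS setting.
        conf.set("fs.defaultFS", "hdfs://namenode.example.gov:8020");

        try (FileSystem fs = FileSystem.get(conf)) {
            // Land the raw file in full fidelity; transform later without losing the original.
            fs.copyFromLocalFile(new Path("/tmp/claims-2014-03.csv"),
                                 new Path("/data/raw/claims/2014/03/claims.csv"));
            System.out.println("Raw data landed for future processing and reuse.");
        }
    }
}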
Future View
Its denitely safe to say that the term Big Data will soon be
behind us said Ms. Westcott.
At Cloudera, were already going down this path and are
talking about all data rather than categorizing it differently.
That said certain dynamics will persist she said.
We will still struggle with unforeseen volume, variety and
velocity of data, which, while initially intimidating, will have real
promise to do some amazing things, Ms. Westcott noted.
There are already organizations doing great things with Big
Data. At Cloudera, weve recently worked with a company called
Patterns and Predictions thats doing just that.
She described how by leveraging a wide range of data, they
have created an analytics network that assesses mental health
risk, in real time, to help with veterans suicide prevention.
Examples like this are only the beginning. I think well
quickly see a wealth of examples of putting data to good use,
Ms. Westcott added.
But she stressed to keep in mind that Big Data is still just data.
Align traditional data management strategies to all your
data and dont approach Big Data any differently.
Make sure youre not heading down a path selecting a so-
lution thats not going to meet your entire data management
needs and will require an entirely new approach for key pro-
cesses, Ms. Westcott urged, especially when dealing with Big
Data and all the sources, uses, and insight that can be shared
among all the actors. n
Cloudera Enterprise is a single data management solution that's specifically designed for Big Data workloads. By natively combining storage, processing and exploration with both batch and real-time query, it provides an economical and powerful purpose-built foundation to gain insight from all your data. Cloudera Enterprise, The Platform for Big Data, is for organizations that want to transform their business through data: improving operational efficiency by solving end-to-end problems and creating competitive advantage by asking bigger questions.
Solve End-to-end Big Data Problems
With Cloudera Enterprise, you can perform end-to-end workflows on a single platform, eliminating the need to copy data between multiple specialized systems. This, in turn, simplifies and accelerates processes, improves the quality of output and reduces hardware and software expenditures.
As a key component of Cloudera Enterprise, CDH delivers the core elements of Hadoop, scalable storage and distributed computing, as well as necessary enterprise capabilities such as security, high availability, and integration with a broad range of hardware and software solutions. Ideal for enterprises seeking a stable, tested, open source Hadoop solution without proprietary vendor lock-in, CDH is the bridge allowing organizations to use Hadoop in production while leveraging the continuous innovations from the open source community. CDH is the most widely deployed distribution of Hadoop and is currently run at scale in production environments across a broad range of industries and use cases.
Cloudera Enterprise: The Platform for Big Data
Solve end-to-end Big Data problems with a central solution:
• POWERFUL: Ask Bigger Questions to drive competitive advantage
• ECONOMICAL: Simplified infrastructure improves operational efficiency
• OPEN: 100% open source platform distribution based on Apache Hadoop
• SIMPLE: Wizard-based deployment; centralized management and administration
• STABLE: Tested, enterprise-ready, subscription-based offering
• COMPATIBLE: Seamless integration with all leading RDBMS, EDW, data integration and BI solutions
"Cloudera Enterprise gives us an easy way to manage multiple clusters."
Paul Perry, Director of Software, Experian Marketing Services
OTFL SME One-On-One
Audie Hittle
CTO of the Federal Market Segment
EMC Isilon
Storage: Software Defined
Every agency may be unique, but they all want to transform their data storage and information management operations to achieve levels of efficiency previously not considered possible.
Today those levels are achievable, EMC Isilon's Audie Hittle told OTFL in a recent interview, thanks to intelligent data storage technologies built on sophisticated software-defined storage capabilities.
Mr. Hittle invests a significant percentage of his time and energy helping agencies understand how they can transform their operations to achieve the desired operational efficiencies.
He noted that by some estimates, up to 80% of all new data created is unstructured, such as imagery, video, massive home directories, or network log files.
"We've found that scale-out network attached storage (NAS) solutions are extremely well suited to effectively and efficiently address these needs."
Validated Solution
"EMC provides the Big Data technology of OneFS and the Isilon product portfolio of scale-out network attached storage (NAS) to immediately help government cut costs, increase efficiencies, and take advantage of their Big Data potential," explained Mr. Hittle.
"I have personally validated a Big Data example with a senior government executive who was briefed by his own operations team. They reduced their IT data storage staffing requirements by 90%, enabling reallocation of resources to other, higher-priority missions. This is thanks to the intelligence and automation of the EMC Isilon operating system, OneFS."
Isilon does not use the industry-standard Redundant Array of Independent Disks (RAID) for data storage and protection, but rather uses advanced software algorithms to distribute and protect the data across an entire cluster of disks (a simplified sketch of the idea follows the list below).
By doing this, it can offer sophisticated capabilities such as:
• Role-Based Access Control (RBAC) to help ensure separation of data access from administrative access, and to differentiate appropriate levels of administrative access
• Auto-balancing and SyncIQ for remote synchronization
• SmartPools for next-generation tiering
• SmartLock for data protection and retention, fundamentally changing the way data storage is managed
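To illustrate the general idea of protecting data in software across a cluster rather than within a RAID group, here is a deliberately simplified, hypothetical sketch using single-parity XOR. It is not OneFS's actual protection scheme, which relies on more sophisticated erasure coding, but it shows why striping data blocks plus parity across nodes lets a cluster rebuild a failed node's data from its peers.

```python
# Illustrative only: single-parity (XOR) protection across cluster nodes.
# This is a teaching sketch, not the algorithm OneFS actually implements.

from functools import reduce

def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

def stripe_with_parity(data: bytes, data_nodes: int, block_size: int = 4):
    """Split data into fixed-size blocks across nodes and append one parity block."""
    padded = data.ljust(data_nodes * block_size, b"\0")
    blocks = [padded[i * block_size:(i + 1) * block_size] for i in range(data_nodes)]
    return blocks + [xor_blocks(blocks)]      # last "node" holds the parity block

def recover(blocks, lost_index):
    """Rebuild the block on a failed node from the surviving blocks."""
    survivors = [b for i, b in enumerate(blocks) if i != lost_index]
    return xor_blocks(survivors)

stripe = stripe_with_parity(b"archive-data", data_nodes=3)
assert recover(stripe, lost_index=1) == stripe[1]   # lost block rebuilt from peers
```

Because protection is computed in software across the whole cluster, adding nodes adds both capacity and protection, which is part of what makes the automation Mr. Hittle describes possible.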
"The example cited above is just one of many, and reinforcing data from industry analysts like IDC and Gartner substantiates dramatic capital expense and operating expense efficiencies, on the order of 40% to 50% overall," Mr. Hittle said.
Three Questions To Ask
"Certainly, no one will deny operational efficiency is a prominent part of the discussion with IT buyers today. But all too often, the discussion is focused on minimizing the up-front or initial cost of the procurement of the product or solution," Mr. Hittle explained.
"However, one of the questions that could or should be asked, and frequently is not, deals with Total Cost of Ownership, or TCO. The real value is in understanding what the TCO is going to be over the life of the product. This includes everything from initial procurement costs to energy to full life-cycle operations and management costs," he said.
"Data storage efficiency improvements have enabled 40% operating expense (OPEX) savings and the ability to automate many of the traditionally human-resource- and staff-intensive functions, which goes a long way in reducing a client's TCO."
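One way to make the TCO question concrete is to roll acquisition, energy, operations and migration into a single life-cycle figure before comparing bids. The helper and dollar figures below are hypothetical placeholders for illustration, not EMC or customer numbers:

```python
# Hypothetical TCO comparison; all dollar figures are illustrative placeholders.
# Life-cycle cost = acquisition + migration + years * (energy + operations).

def total_cost_of_ownership(acquisition, annual_energy, annual_ops,
                            migration=0.0, years=5):
    """Full life-cycle cost rather than just the up-front purchase price."""
    return acquisition + migration + years * (annual_energy + annual_ops)

# Option A: cheaper to buy, but staff-intensive to run and to migrate later.
option_a = total_cost_of_ownership(acquisition=400_000, annual_energy=60_000,
                                   annual_ops=250_000, migration=150_000)

# Option B: higher up-front price, but automation trims operating expense ~40%.
option_b = total_cost_of_ownership(acquisition=550_000, annual_energy=45_000,
                                   annual_ops=150_000)

print(f"Option A 5-year TCO: ${option_a:,.0f}")   # $2,100,000
print(f"Option B 5-year TCO: ${option_b:,.0f}")   # $1,525,000
```

The point of the exercise is simply that the lower sticker price is not always the lower life-cycle cost, which is the distinction Mr. Hittle is drawing.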
A second question might deal with the timelines and costs of migrating data, he continued.
"Traditionally, data migration is one of the most staff-intensive processes, and yet advanced concepts such as 'never migrate again' enable organizations to completely avoid costs associated with any future data storage migrations."
Mr. Hittle also urged buyers to ask prospective data storage vendors: "How do you plan to deal with data migrations, both from the existing storage to the new, and for any future expansions or enhancements?"
On how to organize infrastructure to take advantage of Big Data, Mr. Hittle follows the advice of that famous philosopher and Hall of Famer, Yogi Berra, who is recognized for saying, "Predictions are always tough, especially when they are about the future."
"When it comes to organizing your infrastructure, therefore, it seems it would be best to avoid making any big predictions or bets, where possible. Buy what you need now, and scale out, or pay as you grow, into the future."
"The goal is to keep it simple, both in terms of architecture and operational interfaces. In other words, look for things that offer ease of architectural planning and integration as well as significant automation to minimize or eliminate unnecessary operator intervention," Mr. Hittle said.
OTFL SME One-On-One
Jay Mork
Strategic Technology and Chief Technology Officer
General Dynamics Advanced Information Systems
The Sum Of Its Parts
Big Data can be a big problem in today's data-driven world.
"For some agencies, Big Data, and how to effectively manage it, can be a big problem," Jay Mork told OTFL recently.
"At General Dynamics Advanced Information Systems, we understand that there is no one-size-fits-all solution when it comes to Big Data," he explained.
"Instead, each Big Data solution is the sum of its parts: how we innovatively combine relevant tools and technologies, tailoring and fine-tuning them for each customer to help advance their mission, quickly and effectively."
"General Dynamics has a long history and rich heritage in the management of critical mission systems for customers in the Intelligence Community (IC). We have an in-depth understanding of what it takes to help our customers derive intelligence from the prolific amounts of data they are inundated with each and every day."
No Silver Bullet
"When we think about Big Data, there is not one silver bullet that comes to mind," Mr. Mork noted. "Instead, we think about the right combination of relevant solutions and approaches. That is where true innovation happens."
A key component of Big Data is effectively breaking down data silos and reorganizing the data to be more accessible, anytime, anywhere.
"We have found that by creating a unified environment where the raw data resides, we are then able to leverage our proven open architecture (OA) approach," said Mr. Mork.
"By placing the OA on top of the raw data, we are able to harness the power of our Big Data analytics to help our customers extract strategic meaning from these large volumes of raw data. Our OA also allows our customers to rapidly integrate and plug in commercial-off-the-shelf (COTS) products, saving valuable time and money."
Leverage Existing Infrastructures
An essential element of successfully putting Big Data to work is the architecture, Mr. Mork noted, though some customers' existing infrastructure is not built upon a foundation of OA.
"When this is the case, we help our customers make a smart migration to OA, actively leveraging existing infrastructures," he said.
At General Dynamics, it's all about the fiscally smart migration to an OA design that will provide increased efficiencies, reduce dependency on hardware and enhance capabilities for more users; it's not simply about flipping a switch.
Mr. Mork added that "the beauty of our OA solutions is that they are flexible and enable us to design, develop and deploy custom Big Data solutions based on our customers' mission-specific needs."
He also counseled government managers to keep in mind that a successful Big Data solution should take a full lifecycle approach.
"We build the architecture that allows customers to buy, design and implement what they need based on their dynamic mission. Our OA approach allows customers to access their big data, delving into the unknown and extracting value, manipulating it with technologies and tools. This flexible and scalable approach means customers can do whatever they need to do with their Big Data."
Activity Based Intelligence (ABI)
"As the volume of data continues to increase, activity-based intelligence (ABI) will remain a top priority for many IC decision makers," Mr. Mork said.
"For instance, in the IC, our customers need to sift through mounds of geospatial, signals, communications and human intelligence data to derive actionable intelligence in support of a variety of missions."
"With intelligence pulled from a variety of sources across these equally varied domains, ABI goes beyond simply collecting data and storing it," he explained.
"ABI enhances the dimensions and context of mission-critical information for analysts, while highlighting areas where more information is required. ABI also allows analysts to connect the dots faster and more efficiently and, most importantly, to provide decision makers with real-time intelligence. And with our flexible OA serving as the foundation to allow for rapid insertion of needed technology, customers can truly harness the power of Big Data."
Additionally, Mr. Mork said General Dynamics will continue to invest in internal research and development efforts, including GDNexus.
"From innovative concept to operations, GDNexus helps our customers solve their Big Data challenges and reduce acquisition risk, increase operational capability and leverage proven technology to take their Big Data solutions to the next level."
My Big Data Top Ten
By Jeff Erlichman, Editor, On The FrontLines
After collecting volumes of data from a variety of formats and sources and then analyzing them using my human processor, I've come up with my Big Data Top Ten.
1. Big Data needs a consensus definition.
Go to any tech event and Big Data talk is on the lips of attendees. While most say they know Big Data when they see it, when asked to give a specific definition, answers vary widely.
But we are getting closer.
NIST defines Big Data as: "Where the data volume, acquisition velocity, or data representation limits the ability to perform effective analysis using traditional relational approaches or requires the use of significant horizontal scaling for efficient processing."
The MeriTalk research would amend the NIST definition to: "Big Data is the set of technical capabilities and management processes for converting vast, fast, or varied data into useful knowledge."
2. Big Data is, well, big!
In 1978, I sold Big Data storage to the Naval Research Lab. My format was an 8-inch floppy disk storing a robust 8 kilobytes. Hard multiple-platter disk packs stored maybe 1 megabyte. Today, hard disk capacities of 1.5 terabytes are commonplace.
According to the TechAmerica report, in 2009 the government produced 848 petabytes of data and healthcare data alone reached 150 exabytes. Five exabytes (an exabyte is 10^18 bytes) would contain all words ever spoken by human beings on earth.
3. Big Data is new? Not true.
While the term Big Data, with initial caps, is new, big data itself is not new.
For example, the NOAA/National Weather Service has been processing it since the 1950s. Today NOAA manages over 30 petabytes of new data per year. (How many 8 KB floppies is that? See the quick calculation below.)
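As a rough, back-of-the-envelope answer, using decimal units and taking the 8-kilobyte figure above at face value:

```python
# How many 8 KB floppies would hold 30 petabytes? (decimal units, rough figure)
petabyte = 10 ** 15                 # bytes
floppy = 8 * 10 ** 3                # the 8 KB capacity cited above, in bytes

floppies = 30 * petabyte // floppy
print(f"{floppies:,}")              # 3,750,000,000,000 -- about 3.75 trillion
```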
4. Big Data characteristics: The 4 Vs
Dealing with Big Data means dealing with:
(a) Volume: The sheer amount of data generated, or data intensity, that must be ingested, analyzed, and managed.
(b) Velocity: The speed at which data is being produced and changed, and the speed with which data must be received, understood, and processed.
(c) Variety: Structured and unstructured data in a variety of formats creates integration, management, governance, and architectural pressures on IT.
(d) Veracity: The quality and provenance of received data must be verified.
5. Why Big Data now?
Technology finally has the power to handle the 4 Vs. We now have the tools to really ask "what if" and to explore data sets that either weren't accessible before or didn't exist at all. Now it is possible to really think about the art of the possible. We are witnessing a true democratization of data.
6. Big Data is now affordable.
You don't need to start from scratch with new IT investments. You can use your existing capabilities, technologies and infrastructure to begin pilots. No single technology is required; there are no must-haves.
7. The Big Data path is a circle, not a straight line.
Define. Assess. Plan. Execute. Review. TechAmerica recommends a five-step cyclical approach that is iterative rather than serial in nature, with a constant, closed feedback loop that informs ongoing efforts.
8. The Help Wanted sign is up!
We have to grow the next generation of data scientists. The McKinsey Global Institute analysis predicts a shortfall of 200,000 data scientists over the next five or so years, and a shortfall of 1 million managers in organizations where data will be critical to success, such as government.
9. Government is funding foundational Big Data R&D.
Projects are moving forward via the $200 million investment announced by the administration in March 2012. In October 2012, NSF/NIH announced 8 awards covering three areas: data management, data analytics and collaboration/data sharing. New solicitations, competitions and prizes are in the offing, with opportunities for anyone who has a good idea.
10. The Big Data market is growing bigger.
Deltek forecasts that demand for Big Data solutions by the U.S. government will increase from $4.9 billion in FY 2012 to $7.2 billion in FY 2017, a compound annual growth rate (CAGR) of 8.2%.
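As a quick sanity check on that growth rate, assuming five compounding years from FY 2012 to FY 2017:

```python
# CAGR sanity check; the small gap versus the quoted 8.2% likely reflects
# rounding of the $4.9B and $7.2B endpoint figures.
cagr = (7.2 / 4.9) ** (1 / 5) - 1
print(f"{cagr:.1%}")   # 8.0%
```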
Big Data means big opportunities. That's truly a big deal.
Assured IT. Delivered by Cloud.
> Lower your data center resource costs by up to 35%
> Reduce inter-data center transaction bandwidth requirements by 50%
> Reduce cloud access and backbone network costs by 40%
Learn how Ciena's trusted, reliable, and secure cloud networking solutions enable the highest-performance, most efficient and cost-effective cloud service architectures for Government.
http://www.ciena.com/solutions/cloud
CREATE A DATA CENTER WITHOUT WALLS
Welcome To The Cloud