A 10gen White Paper

Big Data:
Examples and Guidelines for the
Enterprise Decision Maker

May 2013

Contents

Executive Summary

Introduction
Big Data Landscape
Defining Big Data

What Businesses Are Doing with Big Data
1. Build New Applications That Were Not Possible Before
2. Adapt and Develop Competitive Advantages
3. Make Customers Happy
4. Reduce Costs

Considerations for Decision Makers
1. Online vs. Offline Big Data
2. Software License Model
3. Community
4. Developer Appeal
5. Agility
6. General Purpose vs. Niche Solutions

Getting Started with Big Data and MongoDB

About 10gen and MongoDB

Resources

big data \big 'dā-tə\ noun
referring to technologies and initiatives that
involve data that is too diverse, fast-changing or
massive for conventional technologies, skills and
infrastructure to address efficiently.
Executive Summary
Despite the hype, Big Data is more than just a buzzword. Big Data is enabling organizations to create new products, to outpace their competitors and to save tens of millions of dollars. In this paper, we begin with a description of Big Data and the data management landscape. Next, we describe examples of customers innovating with Big Data using MongoDB, the leading NoSQL database, which has been a catalyst of the Big Data movement. With Big Data, these organizations:

1. Build new applications that were not possible before, like a major US city that uses MongoDB to cut crime with real-time data aggregation and analysis.
2. Adapt and develop competitive advantages, like a global telco that runs an online streaming video service on MongoDB to compete with over-the-top (OTT) competitors like Netflix and Amazon.
3. Make customers happy, like a top 5 insurance company that improves customer service by generating a 360-degree view of over 100 million customers using MongoDB.
4. Reduce costs, like a Tier 1 bank that saves tens of millions of dollars and meets new compliance standards by replacing legacy data infrastructure with MongoDB.

Finally, given the nascent state of the market, we provide guidance to organizations selecting technologies for their Big Data projects. Decision makers should consider the following dimensions:

1. Online vs. Offline Big Data. Do you need a product for an online, operational application or an offline, batch analytics application?
2. Software Licensing Models. How do you pay for the product?
3. Community. Who uses and supports the product?
4. Developer Appeal. Do your developers want to use the product?
5. Agility. Is it easy to get started quickly, to adapt and to grow with the product?
6. General Purpose vs. Niche Solutions. Does the product solve one niche problem or many problems for your organization?

Introduction

Big Data Landscape
Over the past decade, major web companies like Google, Amazon and Facebook pioneered businesses built on monetizing massive data volumes. In the process, they invented new paradigms not only for extracting value from data, but also for managing data and compute resources, from data center design, to hardware, to software, to application provisioning. In the same way that the mission to the moon spawned a wave of innovation across multiple industries, Big Data has pushed information technology a quantum leap forward.

For organizations of all sizes, data management has shifted from an important competency to a critical differentiator that can determine market winners and has-beens. Fortune 1000 companies and government bodies are starting to benefit from the innovations of the web pioneers. These organizations are defining new initiatives and reevaluating existing strategies to examine how they can transform their businesses using Big Data. In the process, they are learning that Big Data is not a single technology, technique or initiative. Rather, it is a trend across many areas of business and technology.

Defining Big Data
Big Data refers to technologies and initiatives that involve data that is too diverse, fast-changing or massive for conventional technologies, skills and infrastructure to address efficiently. Said differently, the volume, velocity or variety of data is too great.

But today, new technologies make it possible to realize value from Big Data. For example, retailers can track user web clicks to identify behavioral trends that improve campaigns, pricing and stocking. Utilities can capture household energy usage levels to predict outages and to incent more efficient energy consumption. Governments and even Google can detect and track the emergence of disease outbreaks via social media signals. Oil and gas companies can take the output of sensors in their drilling equipment to make more efficient and safer drilling decisions.

What Businesses Are Doing with Big Data
It can be hard to talk about Big Data in concrete terms. In this section, we briefly tell the stories of over a dozen organizations and four ways in which Big Data is changing their businesses. With Big Data, these organizations:

1. Build new applications that were not possible before
2. Adapt and develop competitive advantages
3. Make customers happy
4. Reduce costs

1. Build New Applications that Were Not Possible Before
By enabling new types of applications and features, Big Data helps organizations generate new revenue streams and achieve strategic goals.

A global telco built a next-generation machine-to-machine (M2M) platform using MongoDB, generating new revenue streams and taking a leading position in a nascent market. As consumer wireless and fixed growth slows in most mature markets, telcos are looking to alternative sources of growth. One highly promising area is M2M communication, a type of service in which devices (e.g., sensors, meters) capture events (e.g., temperature, inventory level), which are relayed over a network (wireless, wired or hybrid) to
applications or other devices, which then automatically translate the captured events into meaningful information (e.g., items need to be restocked). But telcos and software vendors have failed to capitalize on this opportunity given the complexity of ingesting, managing and analyzing data from large numbers of sensors in real-time. With MongoDB, this telco is offering an industry-leading service that can take in up to 10 billion sensor readings for a single customer and can scale reliably as the business grows.

A major US city is using MongoDB to cut crime and improve municipal services by collecting and analyzing geospatial data in real-time from over 30 different departments. For instance, in a given area, it might evaluate the number of 911 calls and complaints, broken lights, stolen garbage cans, liquor permits and abandoned buildings, determining that an uptick in crime is more likely than usual. It needs to marry structured and unstructured data, to do so at scale and to conduct in-place, online analysis. With legacy technologies, this would be challenging at best, infeasible at worst. MongoDB makes it possible.

[Figure: Analyzing geospatial data. A major US city collects and analyzes geospatial data in real-time from over 30 different departments. Panel labels: Metrics, 911 Calls, View.]

A European mobile operator is using MongoDB to monetize underused legacy data from wireless towers. Like many telcos, this operator collects mounds of data on the locations of its customers. Rational network investment hinges on knowing which cell sites require more capacity, or where more cell sites are needed. This operator uses MongoDB to power a new service through which a business can push location-specific offers to the telco's subscribers in real-time when they are in the vicinity of that business. This provides a new revenue stream for the telco and brings its legacy systems to life.

In a matter of years, a social networking company grew to serve tens of millions of users and over a million businesses. Using MongoDB, it started small and was able to scale the infrastructure to support the growth in user base and user activity. Doing so with legacy technologies would have been operationally challenging (even infeasible) and financially taxing. For this company, the Big Data problem was core to the business, and being able to scale reliably and predictably meant the difference between success and failure.

A top industrial equipment manufacturer is using MongoDB to power a cloud-based analytics platform that ingests, stores and analyzes readings (e.g., temperature, location) from its customers' equipment. It then presents the readings back to customers via a web interface (including visualization, key metrics and time series analysis) to help them make better
decisions about their businesses, such as where to provision equipment and how to increase facility efficiency. This company is the first to offer a service of this kind. It stands out in an industry that has seen little innovation in the last half century and can drive new revenue streams from its new application.

2. Adapt and Develop Competitive Advantages
Governments pass new regulations. Disruptive technologies challenge business models. Customers and employees make new demands. Big Data is helping organizations stay nimble so that they can adapt to these changes and develop competitive advantages.

Today's consumer prefers to use mobile for a growing number of activities. One of the largest Human Capital Management (HCM) solution providers saw this, and responded by building iPhone and Android mobile apps on MongoDB. MongoDB enabled the provider to pull a variety of data from myriad sources to create a single view of services (such as payroll, benefits and company policies) and to bring a mobile app to market within 3 months. This HCM provider adapted to its users' preferences and offers a service that differentiates it from a host of competitors still shipping legacy, on-premise solutions.

Telcos and cable providers have long been under attack from over-the-top (OTT) app and content providers like Netflix, Amazon and Apple. While the infrastructure providers expected to monetize their networks by offering their own content and value-added services, these OTT entrants have challenged that business model and have garnered significant traction. While many struggle to adapt, one telco is responding by creating its own OTT video service to compete with Netflix. The project entails storing, managing and serving thousands of titles and even more associated metadata in various formats. The company is using MongoDB because of its ability to handle the volume and variety of data involved; because it can perform in-place analysis to offer real-time recommendations; and because it substantially accelerates time-to-market, a crucial factor in the race to compete for new subscribers.

A leading consumer electronics (CE) vendor observed the rise of cloud-based syncing technologies like Dropbox, and set out to build an integrated consumer cloud service on MongoDB. As the race for CE market share rages on and the industry becomes increasingly commoditized, vendors are trying to develop value-added services that help them stand out from the pack and increase the stickiness of their products. This vendor adapted to major market developments (namely, increasing competition and consumer desire for cloud-based syncing services), helping it remain competitive in the tight race for consumer spend.

3. Make Customers Happy
Big Data can help increase customer satisfaction, which can reduce churn and increase revenue, by opening up access to information and by empowering customers.

A top 5 global insurance provider has over 100 million customers and over 100 different products. Its back-office systems comprise a patchwork of siloed technologies that make it challenging for customers and representatives to access the right information. Using MongoDB, this company built an application that provides a 360-degree view of the customer, aggregating customer and product information from over 70 existing systems and making it available to customers and representatives. And it built the application in just three months. As a result, the provider decreases the time to resolve customer issues, making its customers happy; and it provides representatives opportunities to cross-sell and upsell using real-time analytics.

Telcos also manage a variety of OSS/BSS systems and offer hundreds of products across wireless and wireline, prepaid and postpaid, consumer and enterprise businesses. A major wireless operator built a unified Subscriber Identity Management system on MongoDB to increase customer satisfaction and to reduce churn, a key metric for telcos. This system
improves call center efficiency by reducing the amount of time customer service representatives need to pull data on customers. MongoDB's support for real-time analytics enables a live dashboard that shows trending customer service issues, which can also help customer service representatives determine whether a customer complaint is an isolated issue or part of a larger pattern.

A leading newspaper increases user engagement and ad revenue by serving custom content and interactive features to over 20 million readers. As the publication has evolved from a traditional publishing model, it has determined that user engagement is directly correlated with revenue: the more content users interact with, the higher the revenue. Legacy technologies were incapable of supporting the number of users, variety of metadata and real-time requirements for what it wanted to build. Using MongoDB, it is now able to deliver relevant content and interactive features to users more quickly, driving more revenue.

4. Reduce Costs
In many cases, Big Data projects emerge from existing applications that have not been able to cope with the growing volume, variety and velocity of data. Additionally, legacy technologies are not only burdened by limited capabilities and poor scalability, but also bring with them higher costs and legacy business models. Many organizations are pursuing Big Data projects that solve old problems with new solutions and help them reduce costs. Cost reductions are typically driven by decreased development effort using developer-friendly technologies, cost-efficient software and commodity hardware.

A Tier 1 bank is saving $40 million over 5 years by migrating its reference data management application to MongoDB. The application previously ran on a proprietary relational database. Not only did this database carry high license costs and expensive hardware requirements, but it also could not handle the availability and replication requirements that the business demanded. It would take 24 to 36 hours for the data to replicate across 12 global data centers, which meant international locations had out-of-date information. This cost the bank a number of fines for failing to meet regulatory requirements. With MongoDB, that data is now replicated globally in minutes. Through the project, this bank decreased development time, increased availability and realized substantial cost savings. Additionally, by migrating away from a license-heavy software model, the bank was able to shift a major portion of expenses from CapEx to OpEx.

One of the largest photo-sharing websites globally reduced costs by 80% using MongoDB. The website safeguards more than six billion images for millions of customers. As the only photo-sharing site that does not down-sample, compress or force delete photos, it faced massive data growth that pushed the performance and budgetary limits of its existing relational database. Switching to MongoDB not only saved the company 80% in costs (CapEx and OpEx), but as a result of its migration the company also realized a 900% performance improvement.

A global company in the travel industry is reducing infrastructure costs by orders of magnitude using MongoDB. It captures hundreds of millions of data points from web, mobile and social platforms, then analyzes them to enhance customer experience and to drive additional revenue. Using a legacy database required the company to buy expensive hardware and storage solutions, but with MongoDB, it can use commodity, scale-out hardware that yields massive cost savings. Additionally, MongoDB's built-in high availability and data replication allow the company to reduce backup and disaster recovery costs.

Considerations for Decision Makers
While many Big Data technologies are mature enough to be used for mission-critical, production use cases, the market is still nascent in some regards. Accordingly, the way forward is not always clear. As organizations develop Big Data strategies, there are a number of dimensions to consider when selecting technology partners, including:

1. Online vs. Offline Big Data
2. Software Licensing Models
3. Community
4. Developer Appeal
5. Agility
6. General Purpose vs. Niche Solutions

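
One way to make the six dimensions above concrete during an evaluation is a simple weighted scorecard. The sketch below is only an illustration and is not a method prescribed by this paper: the dimension keys mirror the list above, while the function names, the weights, the candidate names and the 1-to-5 scores are all hypothetical placeholders that each organization would replace with its own judgment.

```python
from typing import Dict, List, Tuple

# The six dimension names mirror the list above. Everything else here
# (weights, candidate names, 1-5 scores) is a hypothetical placeholder.
DIMENSIONS = (
    "online_vs_offline_fit",
    "licensing_model",
    "community",
    "developer_appeal",
    "agility",
    "general_purpose",
)


def weighted_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Return the weighted average of one candidate's per-dimension scores."""
    missing = [d for d in DIMENSIONS if d not in scores or d not in weights]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    total_weight = sum(weights[d] for d in DIMENSIONS)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[d] * weights[d] for d in DIMENSIONS) / total_weight


def rank_candidates(
    candidates: Dict[str, Dict[str, float]], weights: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Return (name, score) pairs sorted from highest to lowest score."""
    scored = [(name, weighted_score(s, weights)) for name, s in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical example: agility counts double, all else equal.
    weights = {d: 1.0 for d in DIMENSIONS}
    weights["agility"] = 2.0
    candidates = {
        "candidate_a": {d: 4.0 for d in DIMENSIONS},
        "candidate_b": {d: 3.0 for d in DIMENSIONS},
    }
    for name, score in rank_candidates(candidates, weights):
        print(f"{name}: {score:.2f}")
```

A scorecard like this does not replace the qualitative discussion in the sections that follow; it only records, in one place, how much weight a team gives each dimension and why one option ranks above another.
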
1. Online vs. Offline Big Data
Big Data can take both online and offline forms. Online Big Data refers to data that is created, ingested, transformed, managed and/or analyzed in real-time to support operational applications and their users. Big Data is born online. Latency for these applications must be very low and availability must be high in order to meet SLAs and user expectations for modern application performance. This includes a vast array of applications, from social networking news feeds, to analytics, to real-time ad servers, to complex CRM applications. Examples of online Big Data technologies include MongoDB and other NoSQL databases.

Offline Big Data encompasses applications that ingest, transform, manage and/or analyze data in a batch context. They typically do not create new data. For these applications, response time can be slow (up to hours or days), which is often acceptable for this type of use case. Since they usually produce a static (vs. operational) output, such as a report or dashboard, they can even go offline temporarily without impacting the overall goal or end product. Examples of offline Big Data applications include Hadoop-based workloads; modern data warehouses; extract, transform, load (ETL) applications; and business intelligence tools.

[Figure: Online vs. offline Big Data. Online Big Data is created, ingested, transformed, managed and/or analyzed in real-time to support operational applications and their users. Offline Big Data encompasses applications that ingest, transform, manage and/or analyze data in a batch context; they typically do not create new data.]

Organizations evaluating which Big Data technologies to adopt should consider how they intend to use their data. Those looking to build applications that support real-time, operational use cases will need an operational data store like MongoDB. For those that need a place to conduct long-running analysis offline, perhaps to inform decision-making processes, offline solutions like Hadoop can be an effective tool.

Organizations pursuing both use cases can do so in tandem, and they will sometimes find integrations between online and offline Big Data technologies. For instance, MongoDB provides integration with Hadoop.

A major web company, for instance, uses MongoDB and Hadoop together for user data management and analysis. MongoDB is the operational data store, storing and tracking rich user data (i.e., not just login information, but also online behavior). Additionally, it performs real-time analytics to dictate automated machine behavior. By contrast, the company uses Hadoop to perform more complex analysis offline. It pipes the data from MongoDB into Hadoop, where it groups user sessions, segments users by behavior and performs regression analyses to determine correlation and improve predictive models. Finally, it pipes the enriched data back into MongoDB, which informs the abovementioned real-time analytics.

2. Software License Model
There are three general types of licenses for Big Data software technologies:

Proprietary. The software product is owned and controlled by a software company. The source code is not available to licensees. Customers typically license the product through a perpetual license that entitles them to indefinite use, with annual maintenance fees for support and software upgrades. Examples of this model include databases from Oracle, IBM and Teradata.

Open-Source. The software product and source code are freely available to use. Companies monetize the software product by selling subscriptions and adjacent products with value-added components, such as management tools and support services. Examples of this model include MongoDB (by 10gen) and Hadoop (by Cloudera and others).

Cloud Service. The service is hosted in a cloud-based environment outside of customers' data centers and delivered over the public Internet. The predominant business model is metered (i.e., pay-per-use) or subscription-based. Examples of this model include Google App Engine and Amazon Elastic MapReduce.

For many Fortune 1000 companies, regulations and internal policies around data privacy limit their ability to leverage cloud-based solutions. As a result, most Big Data initiatives are driven with technologies deployed on-premise. Most of the Big Data pioneers are web companies that developed powerful software and hardware, which they open-sourced to the larger community. Accordingly, most of the software used for Big Data projects is open-source. There are many advantages to open-source software, like MongoDB, including lower total cost of ownership, transparency of the product, ease of adoption and the ability to engage in a community of organizations collaborating to improve the project.

3. Community
In these early days of Big Data, there is an opportunity to learn from others. Organizations should consider how many other initiatives are being pursued using the same technologies and with similar objectives. To understand a given technology's adoption, organizations should consider the following:

The number of users
The prevalence of local, community-organized events
The health and activity of online forums such as Google Groups and StackOverflow
The availability of conferences, how frequently they occur and whether they are well-attended

The global MongoDB community is large and growing, with over 4 million downloads, 50,000 MongoDB online education registrants, 15,000 MongoDB User Group (MUG) members and 10,000 annual MongoDB Days attendees.

4. Developer Appeal
The market for Big Data talent is tight. The nation's top engineers and data scientists often flock to companies like Google and Facebook, which are known havens for the brightest minds and places where one will be exposed to leading edge technology. If enterprises want to compete for this talent, they have to offer more than money.

By offering developers the opportunity to work on tough problems, and by using a technology that has strong developer interest, a vibrant community, and an auspicious long-term future, organizations can attract the brightest minds. They can also increase the pool of candidates by choosing technologies that are easy to learn and use, which are often the ones that appeal most to developers. Furthermore, technologies that have strong developer appeal tend to make for more productive teams who feel they are empowered by their tools rather than encumbered by poorly-designed, legacy technology. Productive developer teams also reduce time to market for new initiatives and reduce development costs.

One of the reasons that MongoDB is the leading NoSQL database is its appeal to developers, who find it easy and natural to use.

5. Agility
Organizations should use Big Data products that enable them to be agile. They will benefit from technologies that get out of the way and allow teams to focus on what they can do with their data, rather than how to deploy new applications and infrastructure. This will make it easy to explore a variety of paths and hypotheses for extracting value from the data and to iterate quickly in response to changing business needs.

In this context, agility comprises three primary components:

Ease of Use. A technology that is easy for developers to learn and understand (because of the way it is architected, the availability of tools and information, or both) will enable teams to get Big Data projects started and to realize value quickly. Technologies with steep learning curves and fewer resources to support education will make for a longer road to project execution.

Technological Flexibility. The product should make it relatively easy to change requirements on the fly (such as how data is modeled, which data is used, where data is pulled from and how it gets processed) as teams develop new findings and adapt to internal and external needs. Dynamic data models (also known as dynamic schemas) and scalability are capabilities to seek out.

Licensing Freedom. Open-source products are typically easier to adopt, as teams can get started quickly with free community versions of the software. They are also usually easier to scale from a licensing standpoint, as teams can buy more licenses as requirements increase. By contrast, in many cases proprietary software vendors require large, upfront license purchases, which make it harder for teams to get moving quickly and to scale in the future.

MongoDB's ease of use, dynamic data model and open-source licensing model make it the most agile Big Data solution available.

6. General Purpose vs. Niche Solutions
Organizations are constantly trying to standardize on fewer technologies to reduce complexity, to improve their competency in the selected tools and to make their vendor relationships more productive. Organizations should consider whether adopting a Big Data technology helps them address a single initiative or many initiatives. If the technology is general purpose, the expertise, infrastructure, skills, integrations and other investments of the initial project can be amortized across many projects. Organizations may find that a niche technology is a better fit for a single project, but that a more general purpose tool is the better option for the organization as a whole. For this reason, Fortune 500 companies are standardizing on MongoDB, a general purpose database, and deploying it on dozens of projects throughout their organizations.

Getting Started with Big Data and MongoDB
10gen has worked closely with the MongoDB community to create some of the largest, most innovative and successful MongoDB systems. For organizations beginning to develop Big Data strategies, 10gen offers a unique workshop to help business owners and technology management explore how to get started. These two-day events are delivered onsite and are facilitated by MongoDB experts. 10gen staff will work with your teams to explore initiatives that will impact the business, what types of resources should be involved, example project plans, education strategies, infrastructure design and other best practices for Big Data and MongoDB.

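
Before or alongside a workshop of this kind, teams can get a feel for the dynamic data model mentioned under Agility with nothing more than plain data structures. The sketch below is only an illustration in Python and makes several assumptions: dictionaries stand in for documents, a list stands in for a collection, and `find`, `readings` and all field names are hypothetical. None of it is a MongoDB API.

```python
from typing import Any, Dict, Iterable, List

Document = Dict[str, Any]


def find(collection: Iterable[Document], **criteria: Any) -> List[Document]:
    """Return documents whose fields equal every criterion.

    A document that lacks a queried field simply does not match, so
    documents with different shapes can share one collection.
    """
    return [
        doc
        for doc in collection
        if all(key in doc and doc[key] == value for key, value in criteria.items())
    ]


# Hypothetical sensor readings; plain dicts stand in for stored documents.
readings: List[Document] = [
    {"device": "meter-1", "temperature": 21.5},
    {"device": "meter-2", "temperature": 19.0, "location": {"lat": 40.7, "lon": -74.0}},
]

# A new requirement arrives: track inventory. No up-front schema change is
# needed in this sketch; the new document just carries different fields.
readings.append({"device": "shelf-9", "inventory_level": 3, "needs_restock": True})

if __name__ == "__main__":
    print(find(readings, needs_restock=True))
    print(len(find(readings, temperature=19.0)))
```

The point of the sketch is the shape of the data, not the query helper: records with different fields coexist, so a change in what is captured does not force a rewrite of what is already stored.
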
About 10gen and MongoDB
10gen is the company behind MongoDB, the leading NoSQL database. MongoDB (named from "huMONGOus," meaning "extremely large") is reinventing data management and powering Big Data. Designed for how we build and run applications today, MongoDB empowers organizations to be more agile and scalable. It enables new types of applications, better customer experience, faster time to market and lower costs. MongoDB has a thriving global community with 4 million downloads, 50,000 Online Education registrations, 15,000 MongoDB User Group (MUG) members and 10,000 annual MongoDB Days attendees. The company has more than 600 customers, including many of the world's largest organizations.

Resources
For more information on 10gen and MongoDB, please visit 10gen.com or mongodb.org, or contact us at sales@10gen.com.

Resource                       Website URL
MongoDB Enterprise Download    10gen.com/download
Free Online Training           education.10gen.com
Webinars and Events            10gen.com/events
White Papers                   10gen.com/white-papers
Case Studies                   10gen.com/customers
Presentations                  10gen.com/presentations
Documentation                  docs.mongodb.org

New York Palo Alto Washington, D.C. London Dublin Barcelona Sydney

US (866) 237-8815 INTL +1 (650) 440-4474 info@10gen.com

Copyright 2013 10gen, Inc. All Rights Reserved.
