
Career in Database Field?

Speakers: Anne Regina Nancy Toar


Head of FemaleGeek

Female Development Program PHPIndonesia

2013-2014 Former Web Designer


2014-2015 Backend Developer
2015-now Database Consultant
Oracle Database SQL Certified Expert
Oracle Database 11g Administration Certified Associate
Oracle Database 11g Administration Certified Professional

www.linkedin.com/in/annetoar

History of databases

Timeline: precomputer era, 1940-50, 1950-60, 1960-70, 1970-80, 1980-90, 1990-2000, 2000-2010

Precomputer technologies: printing press, Dewey decimal system, punched cards
Storage and file access: magnetic tape, flat (sequential) files, magnetic disk, Indexed Sequential Access Method (ISAM)
Pre-relational systems and models: IMS, IDMS, ADABAS, hierarchical model, network model
Relational era: relational model defined, System R, Ingres, Oracle V2, DB2, dBase, Informix, Sybase, SQL Server, Access, Postgres, MySQL
Newer systems: Hadoop, HBase, Dynamo, Cassandra, Riak, MongoDB, Redis, Neo4J, Vertica, VoltDB, Hana, Aerospike

Big Data and Hadoop

2005

2013

Pioneers of big data

Google Software Architecture (2005)

Google Applications
MapReduce
Google File System
BigTable

Hadoop 1.0: Open Source MapReduce Stack
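As a rough illustration of the MapReduce programming model named on this slide, here is a single-process Python sketch of a word count. The function names (map_phase, shuffle, reduce_phase, word_count) and the sample documents are assumptions made for this example. A real Hadoop job distributes these phases across many nodes, which this sketch does not attempt.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as the framework does between the two phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values collected for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of documents."""
    pairs = []
    for document in documents:
        pairs.extend(map_phase(document))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


# Hypothetical input documents, chosen only for illustration.
counts = word_count(["big data and Hadoop", "Hadoop at Yahoo"])
```

The map and reduce functions never share state, which is what lets a framework run many copies of each in parallel.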

Hadoop at Yahoo

2010 (biggest cluster):
4000 nodes
16 PB disk
64 TB of RAM
32,000 cores

2014:
16 clusters
32,500 nodes

Big Data Analytics, aka Data Science

Machine learning: programs that evolve with experience
Collective intelligence: programs that use inputs from crowds to simulate intelligence
Predictive analytics: programs that extrapolate from past to future

NoSQL

Web servers
Memcached servers
Database servers
Read-only slaves
Shard (A-F)
Shard (G-O)
Shard (P-Z)
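The three letter ranges on this slide suggest range-based sharding on the first letter of a key. A minimal sketch of that routing, assuming the key is a name and that shards are numbered 0 to 2 (the SHARDS list and the shard_for function are hypothetical names, not part of any product):

```python
# Letter ranges taken from the slide: A-F, G-O, P-Z.
SHARDS = [("A", "F"), ("G", "O"), ("P", "Z")]


def shard_for(key):
    """Return the index of the shard whose letter range covers the key's first letter."""
    first = key[0].upper()
    for index, (low, high) in enumerate(SHARDS):
        if low <= first <= high:
            return index
    raise ValueError(f"no shard covers key {key!r}")
```

For example, shard_for("Dick") returns 0, shard_for("Jane") returns 1, and shard_for("Robert") returns 2. Range-based routing like this is simple, but the shards fill unevenly when keys are not spread evenly across the alphabet.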

Major influences on non-relational databases

Amazon Dynamo: eventually consistent transaction model; consistent hashing
Google BigTable: column family model for sparse, distributed columnar data
OODBMS and XML DBs: paved the way for the JSON document database
Graph: databases that can navigate social and other networks
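Consistent hashing, listed above as one of Dynamo's contributions, can be sketched as a hash ring. The version below is a simplified illustration and not Dynamo's actual implementation: the HashRing class, the choice of MD5, the replica count of 100, and the node and key names are all assumptions made for this example.

```python
import bisect
import hashlib


def ring_position(value):
    """Map a string to a position on the ring (an integer derived from an MD5 digest)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes, replicas=100):
        self.replicas = replicas   # virtual positions per node
        self._positions = []       # sorted ring positions
        self._owner = {}           # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        """Place the node at several positions around the ring."""
        for i in range(self.replicas):
            position = ring_position(f"{node}#{i}")
            bisect.insort(self._positions, position)
            self._owner[position] = node

    def remove(self, node):
        """Take all of the node's positions off the ring."""
        for i in range(self.replicas):
            position = ring_position(f"{node}#{i}")
            self._positions.remove(position)
            del self._owner[position]

    def node_for(self, key):
        """Walk clockwise from the key's position to the first node position."""
        index = bisect.bisect(self._positions, ring_position(key)) % len(self._positions)
        return self._owner[self._positions[index]]


# Hypothetical node and key names.
ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"user{i}" for i in range(200)]
before = {key: ring.node_for(key) for key in keys}
ring.remove("node-c")
after = {key: ring.node_for(key) for key in keys}
```

Only the keys that lived on the removed node change owner. Keys on the surviving nodes stay where they were, which is the property that makes this scheme attractive for clusters whose membership changes.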

No means yes!

NewSQL

Row orientation vs column orientation

ID    Name    DOB       Salary  Sales  Expenses
1001  Dick    21/12/60  67,000  78980  3244
1002  Jane    12/12/55  55,000  67840  2333
1003  Robert  17/02/80  22,000  67890  6436
1004  Dan     15/03/75  65,200  98770  2345
1005  Steven  11/11/81  76,000  43240  3214

Row oriented database

Block  ID    Name    DOB       Salary  Sales  Expenses
1      1001  Dick    21/12/60  67,000  78980  3244
2      1002  Jane    12/12/55  55,000  67840  2333
3      1003  Robert  17/02/80  22,000  67890  6436
4      1004  Dan     15/03/75  65,200  98770  2345
5      1005  Steven  11/11/81  76,000  43240  3214

Column oriented database

Block  Dick      Jane      Robert    Dan       Steven
1      Dick      Jane      Robert    Dan       Steven
2      21/12/60  12/12/55  17/02/80  15/03/75  11/11/81
3      67,000    55,000    22,000    65,200    76,000
4      78980     67840     67890     98770     43240
5      3244      2333      6436      2345      3214

Analytical queries

SELECT SUM(salary) FROM salesperson

Row oriented database: every block must be read, because each block stores complete rows.
Column oriented database: only the block that stores the Salary column must be read.
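The difference for SUM(salary) can be sketched with the salesperson data from these slides. The layout below assumes one block per row in the row store and one block per column in the column store, which is a simplification of how real engines pack blocks. The function and variable names are made up for this example.

```python
COLUMNS = ("id", "name", "dob", "salary", "sales", "expenses")

rows = [
    (1001, "Dick", "21/12/60", 67000, 78980, 3244),
    (1002, "Jane", "12/12/55", 55000, 67840, 2333),
    (1003, "Robert", "17/02/80", 22000, 67890, 6436),
    (1004, "Dan", "15/03/75", 65200, 98770, 2345),
    (1005, "Steven", "11/11/81", 76000, 43240, 3214),
]

# Row-oriented layout: each block holds one complete row.
row_blocks = [list(row) for row in rows]

# Column-oriented layout: each block holds all values of one column.
column_blocks = {name: [row[i] for row in rows] for i, name in enumerate(COLUMNS)}


def sum_salary_row_store(blocks):
    """Return (total, blocks_read); every block is read to reach its salary field."""
    salary_index = COLUMNS.index("salary")
    total = 0
    blocks_read = 0
    for block in blocks:
        blocks_read += 1
        total += block[salary_index]
    return total, blocks_read


def sum_salary_column_store(blocks):
    """Return (total, blocks_read); only the salary block is read."""
    return sum(blocks["salary"]), 1
```

Both functions return the same total, but the row store reads five blocks where the column store reads one.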

Compression

Row oriented database: poor compression ratio (low repetition)
Column oriented database: good compression ratio (high repetition)
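One way to see why repetition matters is run-length encoding, used here only as a stand-in for whatever compression scheme a given database actually applies. The region column and its values are hypothetical and do not appear in the slide data. They are added because a low-cardinality column shows the effect most clearly.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


# Hypothetical (name, region) rows; the region values are made up for this example.
rows = [
    ("Dick", "East"),
    ("Jane", "East"),
    ("Robert", "East"),
    ("Dan", "West"),
    ("Steven", "West"),
]

# Row layout: fields of different columns are interleaved, so equal values rarely touch.
row_layout = [field for row in rows for field in row]

# Column layout: all values of one column sit next to each other.
region_column = [region for _, region in rows]

row_runs = run_length_encode(row_layout)
column_runs = run_length_encode(region_column)
```

The interleaved row layout produces ten runs of length one, so nothing is saved. The region column collapses to two runs.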

Inserts

INSERT INTO salesperson

Row oriented database: the new row is written to a single block.
Column oriented database: the new row touches every column block.
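The cost of an insert in each layout can be sketched by counting block writes. This assumes one block per row in the row store and one block per column in the column store, which simplifies real storage engines. The new row (ID 1006) and its values are made up for this example, as are the function names.

```python
COLUMNS = ("id", "name", "dob", "salary", "sales", "expenses")


def insert_row_store(blocks, row):
    """Append the whole row as one block; return the number of blocks written."""
    blocks.append(list(row))
    return 1


def insert_column_store(blocks, row):
    """Append each field to its column block; return the number of blocks written."""
    writes = 0
    for name, value in zip(COLUMNS, row):
        blocks[name].append(value)
        writes += 1
    return writes


row_blocks = []
column_blocks = {name: [] for name in COLUMNS}

# Hypothetical new salesperson, not part of the slide data.
new_row = (1006, "Alice", "01/01/90", 50000, 60000, 1000)

row_writes = insert_row_store(row_blocks, new_row)
column_writes = insert_column_store(column_blocks, new_row)
```

Here row_writes is 1 and column_writes is 6, one write per column. This is the usual reason column stores favor bulk loads over row-at-a-time inserts.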

SSD and in-memory databases

5MB HDD 1956

The more that things change....

In-memory databases

Cost of RAM falling 50% every 18 months.
Some databases can fit entirely within the RAM of a single server or cluster of servers.
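Taking the slide's figure at face value, a cost that halves every 18 months follows cost(t) = cost(0) * 0.5^(t / 18), with t in months. A small sketch of that arithmetic, where the function name and the starting price of 100 units are arbitrary assumptions:

```python
def projected_cost(initial_cost, months, halving_period=18):
    """Cost after `months`, assuming it halves once every `halving_period` months."""
    return initial_cost * 0.5 ** (months / halving_period)


# Hypothetical starting price of 100 units per GB.
after_18_months = projected_cost(100.0, 18)   # one halving
after_36_months = projected_cost(100.0, 36)   # two halvings
after_72_months = projected_cost(100.0, 72)   # four halvings
```

Under that assumption the price falls to a quarter after three years and to one sixteenth after six.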

Examples of NewSQL and in-memory systems

Aerospike: non-relational DB designed to take advantage of SSD technologies
HANA: in-memory database which exploits columnar technology for DW workloads
VoltDB: completely in-memory clustered database for high OLTP throughput
Redis: in-memory key-value store
Spark: in-memory MapReduce to supplement Hadoop

What will you choose?


DBA
SQL Developer
Database Consultant
Data Scientist

Data Scientist

Thank you
