QCon 2012
Progressive Architectures at the Royal Bank of Scotland
Mark Atwell
Farzad Fuzz Pezeshkpour
Ben Stopford
rbs.com/gbm
Overview
History: Front-Office-Centric
The City has been the source of industry innovation
FPML
AMQP
OpenAdaptor
LMAX Disruptor
RBS00000
New Drivers
Change: even more than before!
Regulation
Competition: The Race to Zero
Utilisation: data centres, Just Enough hardware
All in the face of even tighter budgets
Agenda
Ben:
Fuzz:
Mark:
Data Virtualisation
Collaboration
Questions
Development Areas
UX
Data & Resource Consolidation
GPGPUs, FPGA
Big Data (for us)
Data Visualisation
Data Virtualisation
We're driving the shape and features of products, particularly in the last two
The Problem:
615p
595p
Apple announces
Financial Results
above expectations
RBS
Full GC
Buy @ 595p
Short @ 615p
Stop-the-World happens more than you may think, even with CMS
Eden collections
Mark/Remark in CMS
Full GC (really hurts)
CMS falls back to Full GC on Concurrent Mode Failure
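To see how often collections actually occur in a running JVM, the standard management beans can be polled. The sketch below is illustrative only: the class name `GcPauseProbe` is hypothetical, and the reported collection time is the JVM's accumulated figure per collector, which for a concurrent collector is not necessarily identical to stop-the-world pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper: reads collection counters from the JVM's management beans. */
public final class GcPauseProbe {

    /** Total collections reported by all collectors (undefined counters count as 0). */
    public static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionCount());
        }
        return total;
    }

    /** Approximate accumulated collection time in milliseconds across all collectors. */
    public static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionTime());
        }
        return total;
    }

    public static void main(String[] args) {
        // One line per collector, e.g. the young-generation and old-generation collectors.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: count=%d timeMs=%d%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

Sampling these counters before and after a busy period gives a rough view of how many collections of each kind took place.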
Eden
Old Generation
Old Generation
Old Generation
We use Model 2
Old Generation
Manhattan Processor:
Concurrent processing with a
very low garbage footprint.
Manhattan Processor
In-house non-allocating queue specialised for
[Diagram: Event Processor. Events pass through pre-processors (Filtering PreProcessor, Conflating PreProcessor) via prePost(Event); events are placed on the Queue via post(Event^); Thread #2 calls Queue.poll and dispatch(Event^) to the targets (labels include Enriching, Target C, Target D). Annotation: Minimal Garbage.]
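The general idea of a non-allocating queue can be sketched as a ring of event objects that are allocated once and then reused. This is not the Manhattan Processor itself: the class `PreallocatedRing`, its `Event` fields and the single-producer/single-consumer restriction are all assumptions made for illustration, with method names borrowed loosely from the diagram labels.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: single-producer/single-consumer ring of reusable event slots. */
public final class PreallocatedRing {

    /** Mutable event, allocated once per slot and reused for every message. */
    public static final class Event {
        public int instrumentId;
        public long priceTicks;
    }

    /** Receives an event while its slot is still owned by the consumer. */
    public interface Target {
        void dispatch(Event e);
    }

    private final Event[] slots;
    private final int mask;
    private final AtomicLong head = new AtomicLong(); // next slot to read
    private final AtomicLong tail = new AtomicLong(); // next slot to write

    public PreallocatedRing(int capacity) {
        if (capacity <= 0 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a positive power of two");
        }
        slots = new Event[capacity];
        for (int i = 0; i < capacity; i++) {
            slots[i] = new Event(); // all allocation happens here, up front
        }
        mask = capacity - 1;
    }

    /** Producer thread only: copies the fields into a free slot; false when full. */
    public boolean post(int instrumentId, long priceTicks) {
        long t = tail.get();
        if (t - head.get() == slots.length) {
            return false;
        }
        Event e = slots[(int) (t & mask)];
        e.instrumentId = instrumentId;
        e.priceTicks = priceTicks;
        tail.set(t + 1); // volatile write publishes the slot contents
        return true;
    }

    /** Consumer thread only: dispatches one event if available; false when empty. */
    public boolean poll(Target target) {
        long h = head.get();
        if (h == tail.get()) {
            return false;
        }
        target.dispatch(slots[(int) (h & mask)]);
        head.set(h + 1); // volatile write hands the slot back to the producer
        return true;
    }
}
```

Because the steady-state path only mutates existing objects, nothing new reaches Eden per event. The target must not retain the `Event` reference after `dispatch` returns, since the slot will be overwritten.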
Result
processing
Predictable pauses < 1 ms
No full-collection pauses in a 15-hour trading day
scenarios
Measuring Risk is not simple
- Not as difficult as predicting the weather, but not far off
Disclaimer:
The numbers quoted in this presentation are indicative of magnitude only.
Linear approximation
Single-step Monte Carlo
Multi-step Monte Carlo
Monte-Carlo: Example
(x_i, y_i)
Average
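A minimal Monte Carlo sketch: draw random points (x_i, y_i) and average an indicator over them. The choice of problem here, estimating pi from the fraction of points in the unit square that fall inside the quarter circle, is an assumption for illustration and may not be the example the slide showed; the class name `MonteCarloPi` is hypothetical.

```java
import java.util.Random;

/** Hypothetical illustration of Monte Carlo averaging over random points (x_i, y_i). */
public final class MonteCarloPi {

    /** Estimates pi as 4 times the fraction of samples inside the unit quarter circle. */
    public static double estimatePi(int samples, long seed) {
        if (samples <= 0) {
            throw new IllegalArgumentException("samples must be positive");
        }
        Random rng = new Random(seed); // fixed seed makes runs repeatable
        long inside = 0;
        for (int i = 0; i < samples; i++) {
            double x = rng.nextDouble();
            double y = rng.nextDouble();
            if (x * x + y * y <= 1.0) {
                inside++;
            }
        }
        // The average of the indicator approximates the quarter-circle area, pi / 4.
        return 4.0 * inside / samples;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1_000, 10_000, 1_000_000}) {
            System.out.printf("paths=%d estimate=%.5f%n", n, estimatePi(n, 42L));
        }
    }
}
```

The statistical error shrinks roughly with the square root of the number of samples, which is why moving from 1,000 to 10,000 paths multiplies the compute cost by ten for only about a threefold gain in accuracy.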
1000 paths
10000 paths
per path
~50 observables
1000s observables
Updated quarterly
Updated monthly
App Servers
Directors / Brokers
Compute Grid
~ 10 TB
~ 2 MB
Storage ....
Database?
Storage Appliance?
Distributed Filesystem?
[Diagram: App Servers; Directors / Brokers; Compute Grid]
Requirements
Speed
Scalability
Robustness
Interoperability
Support
Infrastructure
Option 2
Offline simulation
each day generates
the underlying factor
data
Market observables
Simulated factors
Option 3a
Offline simulation
generates sets of
correlated random
numbers to be used
in the factor
simulation
Option 3b
Evolution of factors
and generation of
market observables
Correlated random
numbers
Distributed calculation nodes access the
correlated random numbers on demand. The
factors are then evolved and the market
observables generated. The storage and
remote access technology is not defined, but
the nature of the correlated random numbers
may lend itself to a technology that supports
some type of sub-element access.
Generation of
market observables
Offline simulation
generates sets of
uncorrelated random
numbers to be used
in the factor
simulation
Option 4
Correlation of
random numbers,
evolution of factors
and generation of
market observables
Generation of
correlated random
numbers, evolution of
factors and
generation of market
observables
Uncorrelated random
numbers
Distributed calculation nodes access the
correlated random numbers on demand. The
factors are then evolved and the market
observables generated. The storage and
remote access technology is not defined, but
the nature of the correlated random numbers
may lend itself to a technology that supports
some type of sub-element access.
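The options above differ in where uncorrelated random numbers are turned into correlated ones. One common way to do that step, assumed here purely for illustration since the slides do not name a method, is to multiply a vector of independent standard normals by the Cholesky factor of the correlation matrix. The class `CorrelatedNormals` is hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical sketch: correlating independent normals with a Cholesky factor. */
public final class CorrelatedNormals {

    /** Lower-triangular L with L * L^T = c, for a symmetric positive-definite c. */
    public static double[][] cholesky(double[][] c) {
        int n = c.length;
        double[][] l = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j <= i; j++) {
                double sum = c[i][j];
                for (int k = 0; k < j; k++) {
                    sum -= l[i][k] * l[j][k];
                }
                if (i == j) {
                    if (sum <= 0.0) {
                        throw new IllegalArgumentException("matrix is not positive definite");
                    }
                    l[i][i] = Math.sqrt(sum);
                } else {
                    l[i][j] = sum / l[j][j];
                }
            }
        }
        return l;
    }

    /** Returns L * z, a correlated vector built from the uncorrelated vector z. */
    public static double[] correlate(double[][] l, double[] z) {
        double[] out = new double[z.length];
        for (int i = 0; i < z.length; i++) {
            for (int j = 0; j <= i; j++) {
                out[i] += l[i][j] * z[j];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] corr = {{1.0, 0.5}, {0.5, 1.0}}; // illustrative two-factor correlation
        double[][] l = cholesky(corr);
        Random rng = new Random(7L);
        double[] z = {rng.nextGaussian(), rng.nextGaussian()};
        System.out.println(Arrays.toString(correlate(l, z)));
    }
}
```

Storing the uncorrelated numbers and applying the factor on the calculation nodes moves this multiplication from the offline simulation to the grid, which is the trade-off between the options that store correlated and uncorrelated numbers.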
3 of 14 = ~ 50MB/s
14 of 14 = ~ 100MB/s
[Diagram: App Servers; Directors / Brokers; Name Nodes; Compute and Data Grid of co-located Grid Engines and HDFS Nodes; 10 GE switched LAN]
Where Next?
Software Optimisations
Data Structures
Distribution Algorithms
Uncorrelated and Correlated Random number generation
Data Compression
Novel Hardware Architectures
/ Compute
Summary
Thank you
ODC
One Version of the Truth For Everyone
Latency
Throughput
Throughput
Latency
ODC is a balance
InMemory
Plentiful
hardware
Latency
Throughput
Why Joins?
Objects need to be versioned as
example bi-temporally)
Fat objects (Distributed Cache) /
Select Entity
Select Party
Select Cash
Recursively hits the network
[Chart: All Data vs. Used Data, by category: Party Alias, Parties, Ledger Book, Source Book, Cost Centre, Product, Business Unit, HCS Entity, Set of Books]
So ODC provides
Scalability
Real-time interface
Low latency access
High throughput (throttled)
Data Virtualisation
Why do Data Virtualisation?
Resisting the YAS ("Yet Another System", and database & feeds & code &&&) Syndrome.
Aspirations:
[Diagram: Combine & Transform; queries Q1, Q2, Q3]
Data Virtualisation
and if synergies are identified, transitions can be straightforward:
[Diagram: Data Virtualisation layer with Transform; looks like a single unified database (subset); queries Q2, Q3]
Collaboration is Key
What we've learnt:
Questions?