
Copyright 2013 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.

Anurag Singla
Sr. Manager, R&D

Statistical anomaly detection
with ESM Correlation
Agenda
Anomaly detection primer
Use case 1: data monitors
Review of recent ESM correlation features
Use case 2: financial fraud detection
Use case 3: time-sequence anomalies
Caveats and conclusions
Anomaly detection primer
Baselining
Record statistics of a certain behavior/event flow
Typically involves trending over long time periods
Detect deviations from this baseline in order to discover anomalies

Threat (anomaly) score
Amount of statistical deviation from baseline for a new event / pattern
User logged in 5x longer than usual
Unusually high volume data transfers from application host
Large bank account transfers compared to monthly average
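As a rough illustration of a threat score, the sketch below measures how many standard deviations a new observation lies from its baseline. The z-score is one common choice and an assumption here; the slides do not prescribe a formula, and the function name and sample data are hypothetical.

```python
from statistics import mean, pstdev

def threat_score(history, new_value):
    """Deviation of new_value from the baseline, in standard deviations."""
    baseline = mean(history)
    spread = pstdev(history)
    if spread == 0:
        # Flat baseline: any change is a deviation, but it cannot be scaled.
        return 0.0 if new_value == baseline else float("inf")
    return abs(new_value - baseline) / spread

# Hypothetical login durations in minutes for one user.
usual_sessions = [10, 12, 9, 11, 8]
score = threat_score(usual_sessions, 50)  # a login 5x longer than usual
```

A larger score means a larger deviation; where to set the alert threshold is a tuning decision.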

Some types of anomalies are well understood and can be captured using static rules
Login from multiple geo regions in short time period
Large number of sources connecting to the same target
Known virus/worm definitions

Others are more dynamic and vary too much across individual actors to be captured by static rules.
Use case 1: data monitors
Data Monitors
Track real-time event streams.
Perform automatic short term baselining.
Generate audit events if a statistic (e.g. moving average) deviates from recent history by more than a set threshold. A rule can fire and take action when the threshold is exceeded.

Not suited to analysis over long periods.
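The short-term baselining idea can be sketched as a moving-average check. This is only an illustration of the concept, not how ESM data monitors are implemented; the class name, window size and relative threshold are assumptions.

```python
from collections import deque

class MovingAverageMonitor:
    """Flags values that deviate from the recent moving average."""

    def __init__(self, window, threshold):
        self.values = deque(maxlen=window)  # short-term history only
        self.threshold = threshold          # allowed relative deviation

    def observe(self, value):
        """Return True if value deviates from the current moving average."""
        deviates = False
        if len(self.values) == self.values.maxlen:
            average = sum(self.values) / len(self.values)
            if average and abs(value - average) / average > self.threshold:
                deviates = True
        self.values.append(value)  # the new value joins the baseline
        return deviates

monitor = MovingAverageMonitor(window=3, threshold=0.5)
flags = [monitor.observe(v) for v in [100, 110, 90, 105, 300]]
```

Only the last value is flagged. Because every value joins the window, an outlier also shifts the baseline that later values are compared against.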
Use case 2:
financial fraud detection
Behaviour-based fraud detection
Methodology: compare an account holder's recent transactions with the account holder's historical behavior patterns, looking for anomalies.

Fire an alert if the total amount of money a customer has transferred out of the account today is greater than the account's historical average monthly total.
If the customer transfers out $1200 per month on average, then as soon as a transaction arrives that puts the daily transfer total (since 12 a.m. today) above $1200, we should fire an alert and add the account to a watch list for investigation.
Conditions for detecting fraud can be fine tuned to avoid false positives.
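A minimal sketch of the alert condition follows, with two tuning knobs of the kind that could reduce false positives. The function name and both knobs (`factor`, `minimum`) are hypothetical and not taken from the slides.

```python
def exceeds_baseline(daily_total, monthly_average, factor=1.0, minimum=0.0):
    """True if today's outgoing total is above the tuned baseline.

    factor and minimum are hypothetical tuning knobs: factor scales the
    monthly average, minimum sets a floor below which no alert fires.
    """
    return daily_total > max(monthly_average * factor, minimum)

# The slide's example: the customer averages $1200 per month.
alert_default = exceeds_baseline(1250.0, 1200.0)            # above $1200
alert_tuned = exceeds_baseline(1250.0, 1200.0, factor=1.5)  # below $1800
```

With the default settings the $1250 daily total triggers an alert; with `factor=1.5` it does not.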

ESM enhancements for anomaly detection
Baselining:
Interval queries on active lists
Lightweight rules
Timestamp granularity variables

Real-time detection:
Cumulative active list fields
Lightweight rules
Time-partitioned active lists

Cumulative active list fields
Problem: atomic column increment
Solution: cumulative numeric columns
For numeric column types (Integer, Long, Double), subtypes SUM, MIN, MAX.
The value from the AddToAL action is combined atomically with the existing value.
To implement a counter, use an Integer(SUM) field and add the value 1 each time.
The mean value can be obtained by dividing a value SUM field by the counter.
Cumulative active list fields: example
Three AddToList actions are applied in turn (CustomerId and TimeKey same for all).

Values inserted by AddToList action:
Num transactions   Total amount   Max Amount   Min Amount
1                  200.00         200.00       200.00
1                  50.00          50.00        50.00
1                  875.00         875.00       875.00

Resulting values in AL entry after each action:
Num transactions   Total amount   Max Amount   Min Amount
1                  200.00         200.00       200.00
2                  250.00         200.00       50.00
3                  1125.00        875.00       50.00
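The way SUM, MAX and MIN columns combine inserted values can be sketched in a few lines. This is a single-threaded illustration with hypothetical names; it does not model the atomicity that the active list provides.

```python
from dataclasses import dataclass

@dataclass
class CumulativeEntry:
    """One active list entry with SUM, SUM, MAX and MIN subtype columns."""
    num_transactions: int = 0          # Integer(SUM), used as a counter
    total_amount: float = 0.0          # Double(SUM)
    max_amount: float = float("-inf")  # Double(MAX)
    min_amount: float = float("inf")   # Double(MIN)

    def add(self, amount):
        """Combine one inserted row with the stored values."""
        self.num_transactions += 1
        self.total_amount += amount
        self.max_amount = max(self.max_amount, amount)
        self.min_amount = min(self.min_amount, amount)

    @property
    def mean_amount(self):
        """Mean derived from the SUM column and the counter."""
        return self.total_amount / self.num_transactions

entry = CumulativeEntry()
for amount in (200.00, 50.00, 875.00):
    entry.add(amount)
```

After the three inserts the entry holds 3 transactions, a total of 1125.00, a maximum of 875.00 and a minimum of 50.00, and the derived mean is 375.00.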
Lightweight rules

Designed for data list maintenance
No correlation or audit event when rule fires
No aggregation (stateless)
Can match a large number of events without being disabled

Allow separation of data maintenance and risk analysis logic.
Processed earlier than regular rules.
Lightweight rule actions are executed before regular rule conditions are evaluated.

Timestamp granularity variables
Convert a timestamp to the beginning of its time period (hour, day, month, etc.)
Use the result in an AL key field
Input arguments are a timestamp field (e.g. EndTime, DeviceCustomDate) plus a granularity selection.
Output is the timestamp value shifted to the desired boundary.
Example: transaction time = 2012-09-11-14:32
Start of hour: 2012-09-11-14:00
Start of day: 2012-09-11-00:00
Start of month: 2012-09-01-00:00
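The truncation can be illustrated with Python's standard `datetime` module. The function name and the granularity strings are hypothetical; this is a sketch of the idea, not of the ESM variable itself.

```python
from datetime import datetime

def truncate(timestamp, granularity):
    """Shift a timestamp back to the start of its hour, day or month."""
    if granularity == "hour":
        return timestamp.replace(minute=0, second=0, microsecond=0)
    if granularity == "day":
        return timestamp.replace(hour=0, minute=0, second=0, microsecond=0)
    if granularity == "month":
        return timestamp.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    raise ValueError(f"unsupported granularity: {granularity}")

transaction_time = datetime(2012, 9, 11, 14, 32)
time_key = truncate(transaction_time, "day")  # usable as an AL key field
```

All transactions from the same day map to the same `time_key`, so they accumulate in a single active list entry.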

Time-partitioned partially cached active lists

Keep the most recent entries (latest timestamp key value) in memory for fast correlation
Evict entries with older timestamp values
Partition the cache into buckets based on time values.
PLEASE do not insert random time values (like EndTime) into the time field!
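To see why the time field should hold a coarse, truncated value, consider this toy cache that keeps one bucket per time key and evicts the oldest bucket first. It is an assumption-laden sketch (class name, bucket limit and eviction policy are all hypothetical), not a description of ESM internals.

```python
class TimePartitionedCache:
    """Keeps only the newest time buckets in memory."""

    def __init__(self, max_buckets):
        self.max_buckets = max_buckets
        self.buckets = {}  # time key -> {entry key: value}

    def put(self, time_key, entry_key, value):
        self.buckets.setdefault(time_key, {})[entry_key] = value
        # Evict whole buckets, oldest time key first.
        while len(self.buckets) > self.max_buckets:
            del self.buckets[min(self.buckets)]

    def get(self, time_key, entry_key):
        return self.buckets.get(time_key, {}).get(entry_key)

cache = TimePartitionedCache(max_buckets=2)
# ISO date strings sort chronologically, so min() finds the oldest day.
for day in ("2012-09-09", "2012-09-10", "2012-09-11"):
    cache.put(day, "46201", 100.0)
```

With day-level keys the cache holds two buckets. If raw event times were used as keys, nearly every event would open its own bucket and push recent data out of memory.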
Interval query on active list
Problem: query the entries of an active list based on a time interval.
Previously, queries on an active list were snapshot-based: all the entries in the AL were considered.

Solution: interval queries on AL
Make the query type interval.
Enter the start time and end time of the query.
Select the field on which the time interval will be evaluated.
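Conceptually, an interval query filters entries on a chosen time field. The sketch below assumes a half-open interval and uses hypothetical sample rows (including the year); how ESM treats the interval boundaries is not stated in the slides.

```python
from datetime import datetime

def interval_query(entries, time_field, start, end):
    """Return entries whose time_field lies in [start, end)."""
    return [e for e in entries if start <= e[time_field] < end]

# Hypothetical active list rows.
entries = [
    {"CustId": 46201, "TimeKey": datetime(2012, 8, 5), "Total": 250},
    {"CustId": 46201, "TimeKey": datetime(2012, 8, 14), "Total": 700},
    {"CustId": 46201, "TimeKey": datetime(2012, 9, 2), "Total": 90},
]
august = interval_query(
    entries, "TimeKey", datetime(2012, 8, 1), datetime(2012, 9, 1)
)
```

A snapshot query would return all three rows; the interval query returns only the two August rows.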
Workflow: baselining and detection
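[Figure: workflow diagram. Recoverable flow:]
Transaction events -> Lightweight rules (update daily transaction stats) -> Daily active list
Daily active list -> Monthly trend (runs every 30 days) -> Monthly active list
Monthly active list -> Stats trend (runs every 30 days over 180 days) -> Historical stats active list
Transaction events -> Rules (read daily values and historical values) -> Anomaly?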
Daily transactions active list
Cumulative Fields
TimePartitioned
Historical stats active list
Active list roll up using trends
Daily transactions:
CustId   TimeKey    Total   Max
46201    Aug 5th    $250    $150
46201    Aug 14th   $700    $700
50532    Aug 7th    $100    $100
50532    Aug 28th   $3290   $3000

Trend Query (Interval: 1 mo, Freq: 1 mo)

Monthly transactions:
Cust     TimeKey   Total   Max daily
46201    Aug       $950    $700
50532    Aug       $3390   $3290
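The daily-to-monthly roll-up can be reproduced with a short aggregation. The sketch uses the slide's August figures; the function name is hypothetical, and in ESM this step is performed by a trend query rather than by code like this.

```python
from collections import defaultdict

# (CustId, day of August, daily total, largest single transaction) from the slide.
daily = [
    (46201, 5, 250, 150),
    (46201, 14, 700, 700),
    (50532, 7, 100, 100),
    (50532, 28, 3290, 3000),
]

def roll_up_month(rows):
    """Return {CustId: (monthly total, max daily total)}."""
    totals = defaultdict(int)
    max_daily = defaultdict(int)
    for cust_id, _day, total, _max_txn in rows:
        totals[cust_id] += total
        max_daily[cust_id] = max(max_daily[cust_id], total)
    return {cust: (totals[cust], max_daily[cust]) for cust in totals}

monthly = roll_up_month(daily)
```

Note that "Max daily" in the monthly table is the largest daily total, not the largest single transaction.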
Active list roll up using trends
Monthly transactions:
CustId   TimeKey   Total   Max daily
46201    May       $2500   $1800
46201    Jun       $800    $300
46201    Aug       $950    $700
50532    Apr       $1600   $1000
50532    Jul       $810    $600
50532    Aug       $3390   $3290

Trend Query (Interval: 6 mos, Frequency: 1 mo)

Historical stats:
Cust     Max daily   Max monthly   Mean monthly
46201    $1800       $2500         $1417
50532    $3290       $3390         $1933
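The historical statistics follow from the monthly rows by simple aggregation. The sketch below uses the slide's figures; the function name is hypothetical. It averages over the months that have an entry, which is what reproduces the slide's mean values.

```python
# {CustId: {month: (monthly total, max daily total)}} from the slide.
monthly = {
    46201: {"May": (2500, 1800), "Jun": (800, 300), "Aug": (950, 700)},
    50532: {"Apr": (1600, 1000), "Jul": (810, 600), "Aug": (3390, 3290)},
}

def historical_stats(months):
    """Return (max daily, max monthly, mean monthly) for one customer."""
    totals = [total for total, _ in months.values()]
    max_daily = max(daily for _, daily in months.values())
    return max_daily, max(totals), round(sum(totals) / len(totals))

stats = {cust: historical_stats(months) for cust, months in monthly.items()}
```

For customer 46201 the mean is (2500 + 800 + 950) / 3, which rounds to 1417. Months without transactions are not counted as zero.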
Data maintenance rule
Fraud detection rule
Fraud detection rule in action

Snapshot in time for CustomerId 46201:
Historical Stats AL: mean monthly total = $1200
Daily Transactions: cumulative total = $1150

Transaction event received: CustomerId = 46201, transaction value = $100
1. Data Maintenance rule (lightweight): updates daily total to $1150 + $100 = $1250
2. Fraud Detection rule (standard): looks up Daily AL, finds updated cumulative value ($1250)
3. Condition is matched, rule fires and adds customer account to Suspicious Accounts AL
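The three steps above can be traced in a small simulation. Plain dictionaries and a set stand in for the active lists, and the function names are hypothetical. The point of the sketch is the ordering: the lightweight rule updates the daily total before the standard rule reads it.

```python
daily_totals = {46201: 1150.0}     # stands in for the Daily Transactions AL
historical_mean = {46201: 1200.0}  # stands in for the Historical Stats AL
suspicious_accounts = set()        # stands in for the Suspicious Accounts AL

def data_maintenance_rule(event):
    """Lightweight rule: add the transaction to the daily total."""
    cust = event["CustomerId"]
    daily_totals[cust] = daily_totals.get(cust, 0.0) + event["value"]

def fraud_detection_rule(event):
    """Standard rule: compare the updated daily total to the baseline."""
    cust = event["CustomerId"]
    baseline = historical_mean.get(cust)
    if baseline is not None and daily_totals[cust] > baseline:
        suspicious_accounts.add(cust)

def process(event):
    data_maintenance_rule(event)  # lightweight rule actions run first
    fraud_detection_rule(event)   # then regular rule conditions

process({"CustomerId": 46201, "value": 100.0})
```

If the two rules ran in the opposite order, the detection rule would see $1150 and the $100 transaction that crosses the threshold would not trigger an alert.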
Use case 3:
sequence anomaly detection
What do users normally do?
Baselining process
[Figure: historical user data is processed with Vertica / R and Hadoop / Mahout to produce user profiles (Profile 1, Profile 2, Profile 3), a transition probability matrix, and a user-profile mapping.]
Transition diagram online banking
[Figure: state transition diagram. States: Login, Main Page, Add Account, Check Balance, Transfer Page, Transfer Init, Transfer Commit, Transfer Failed, Transfer Success, Remove Account, Help. Edges are labeled with transition probabilities.]
State transition probabilities for specific user profile (101)
[The following slides repeat the diagram, each highlighting one group of edge probabilities: (0.4, 0.4, 0.2); (0.12, 0.72, 0.06, 0.08, 0.02); (0.80, 0.05, 0.15); (0.80, 0.05, 0.15); (0.52, 0.46, 0.02); (0.55, 0.18, 0.27). Other edge labels in the full diagram: 1.0 (three edges), (0.1, 0.7, 0.2), (0.36, 0.64). The mapping of labels to edges is not recoverable from the extracted text.]
Transition probability (baseline) active list
Real-time anomaly detection process

Online session in progress
Receive incoming event containing: {UserId, SessionId, OldState, NewState}
Lightweight (maintenance) rule does the following:
1. Look up the profile for the user
2. Look up the transition probability between the old and new state
3. Replace a missing transition with a low probability value (e.g. 0.000001)
4. Compute the negative log (JME variable) and add it to the session anomaly score (SUM field)

Standard rule checks anomaly score against defined threshold
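The scoring steps can be sketched as follows. Dictionaries stand in for the active lists, and the user, the two edge probabilities and the threshold are hypothetical. Base-10 logarithms are assumed because they match the figures quoted in the session examples (for instance, 0.4 maps to about 0.40).

```python
import math

MISSING_PROB = 0.000001  # low probability used for unseen transitions

def score_event(event, profiles, baseline, session_scores):
    """Add -log10(transition probability) to the session's running score."""
    profile = profiles[event["UserId"]]              # step 1
    key = (profile, event["OldState"], event["NewState"])
    prob = baseline.get(key, MISSING_PROB)           # steps 2 and 3
    session = event["SessionId"]
    previous = session_scores.get(session, 0.0)
    session_scores[session] = previous - math.log10(prob)  # step 4
    return session_scores[session]

# Hypothetical data: user, edge probabilities and threshold are assumed.
profiles = {"user-1": 101}
baseline = {
    (101, "Login", "Main Page"): 1.0,
    (101, "Main Page", "Transfer Page"): 0.4,
}
THRESHOLD = 5.0
session_scores = {}
path = ["Login", "Main Page", "Transfer Page", "Remove Account"]
for old, new in zip(path, path[1:]):
    total = score_event(
        {"UserId": "user-1", "SessionId": "s1", "OldState": old, "NewState": new},
        profiles, baseline, session_scores,
    )
is_anomalous = total > THRESHOLD  # the standard rule's check
```

Summing negative logs is equivalent to multiplying the probabilities along the path, so a single unseen transition (worth 6.0 here) dominates the score.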
Normal user session
[Figure: transition diagram with the path of a normal session highlighted.]
Transition probabilities along the path, with negative log10 values in parentheses:
0.4 (0.40), 1.0 (0), 0.72 (0.14), 0.80 (0.097), 0.80 (0.097), 0.52 (0.28)
Anomaly Score (sum of the values in parentheses) = 1.02
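The score for the normal session can be checked directly from the six probabilities on the slide. The variable names are illustrative.

```python
import math

# Transition probabilities along the normal session's path (from the slide).
path_probabilities = [0.4, 1.0, 0.72, 0.80, 0.80, 0.52]

# Each transition contributes -log10(p); certain transitions (p = 1.0) add 0.
anomaly_score = sum(-math.log10(p) for p in path_probabilities)
```

Rounded to two decimals the sum is 1.02, matching the slide, which also confirms that the logarithm is base 10.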
Fraudulent user session, part 1
[Figure: transition diagram with the first part of a fraudulent session highlighted. Not all edge labels survive in the extracted text.]
Labels recovered from the figure (probability, with negative log10 in parentheses):
0.4 (0.40), 1.0 (0), 0.80 (0.097), 1.0 (0)
Anomaly Score (part 1) = 5.20
Fraudulent user session, part 2
[Figure: transition diagram with the second part of the fraudulent session highlighted. Not all edge labels survive in the extracted text.]
Labels recovered from the figure (probability, with negative log10 in parentheses):
0.80 (0.097), 0.18 (0.74), 1.0, 0.80 (0.097)
Anomaly Score (part 2) = 3.27, total = 5.20 + 3.27 = 8.47 (vs 1.02 for the normal session)
Anomaly detection - caveats
Statistical baseline issues (hat tip: John Petropoulos)
Baselines are prone to being skewed.
During an attack, short-term stats in particular are skewed quickly.
How do we deal with this? Remove the offending entries?

False positives can be reduced, never eliminated
Users sometimes behave strangely
Seasoned fraudsters can appear eerily natural
Thank you
Security for the new reality
