I.
INTRODUCTION
II.
RELATED WORK
III.
BACKGROUND
P(I|A) = P(I)P(A|I) / (P(I)P(A|I) + P(~I)P(A|~I))        (1)

where P(A|I) is the detection (true positive) rate, P(A|~I) is the false positive rate, P(I) is the base rate of intrusion, and P(I) + P(~I) = 1.

When P(I) is very low (of the order of 2 × 10^-5), even for an unrealistically high detection rate P(A|I) of 100%, the false positive rate P(A|~I) has to be extremely low (of the order of 10^-5) for the Bayesian detection rate P(I|A) to reach 66%.
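The effect of the base rate can be checked numerically. The short sketch below evaluates (1) directly; the function name is illustrative:

```python
def bayesian_detection_rate(base_rate, tpr, fpr):
    """Equation (1): P(I|A) = P(I)P(A|I) / (P(I)P(A|I) + P(~I)P(A|~I))."""
    return (base_rate * tpr) / (base_rate * tpr + (1.0 - base_rate) * fpr)

# A base rate of 2e-5, a perfect 100% detection rate, and a 1e-5 false
# positive rate still yield a Bayesian detection rate of only about 66%.
print(round(bayesian_detection_rate(2e-5, 1.0, 1e-5), 3))  # 0.667
```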
Local Alerts: The alerts generated by the IDS are fed back to the reputation system. This information includes the source and destination IP addresses and a suspicion score (corresponding to the signature priority). The reputation of the concerned IP addresses increases with each generated alert.
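A minimal sketch of this feedback loop, assuming a simple additive reputation store (the class and method names are hypothetical, not taken from the system described here):

```python
from collections import defaultdict

class ReputationStore:
    """Accumulates per-IP suspicion from IDS alert feedback."""
    def __init__(self):
        self.suspicion = defaultdict(float)

    def feed_alert(self, src_ip, dst_ip, score):
        # Each generated alert raises the suspicion of both the source and
        # destination IP addresses by the signature-priority-derived score.
        self.suspicion[src_ip] += score
        self.suspicion[dst_ip] += score

store = ReputationStore()
store.feed_alert("10.0.0.5", "192.168.1.2", score=2.0)
store.feed_alert("10.0.0.5", "192.168.1.9", score=1.0)
print(store.suspicion["10.0.0.5"])  # 3.0
```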
VI.
P(A|~I) = FP / (FP + TN)

P(A|I) = TP / (TP + FN)

The Bayesian detection rate was calculated based on the false positive rate, the true positive rate and the base rate of intrusion using (1).
Evaluation datasets
DARPA Dataset: The first set, DARPA 1999 IDS evaluation
dataset [14], is an artificially created dataset. Although the
dataset is not current, we used it mainly because it is labeled
(i.e. ground truth is known), publicly available, and widely
used as a standard for IDS evaluation purposes [9]. The dataset
is divided into five weeks, two of which are training sets (attack free), while the remaining three weeks are labeled test sets (containing attacks). For the evaluations, we used the week 1 dataset, which is attack free, and the week 4 and 5 datasets, which have labeled attacks. Together, weeks 4 and 5 contain 201 instances of attacks. Of these, 8 had the console as the attacker instead of an IP address; these were not considered for the evaluation, since correlating them with the generated alerts based on IP address was not possible. This left a total of 193 attacks, as shown in Table II. The total number of connections (including TCP, UDP and ICMP) was identified by counting the distinct 5-tuples (protocol, source IP address/port, destination IP address/port), and the base rate of intrusion was calculated by dividing the total number of attacks by the total number of connections.
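The connection counting and base-rate computation described above can be sketched as follows (the packet records are illustrative):

```python
def count_connections(packets):
    """Count distinct 5-tuples (protocol, src IP, src port, dst IP, dst port)."""
    return len({(p["proto"], p["src"], p["sport"], p["dst"], p["dport"])
                for p in packets})

packets = [
    {"proto": "tcp", "src": "10.0.0.1", "sport": 4001, "dst": "10.0.0.2", "dport": 80},
    {"proto": "tcp", "src": "10.0.0.1", "sport": 4001, "dst": "10.0.0.2", "dport": 80},  # same connection
    {"proto": "udp", "src": "10.0.0.3", "sport": 53,   "dst": "10.0.0.2", "dport": 53},
]
print(count_connections(packets))  # 2

# DARPA weeks 4-5: 193 attacks over 530571 connections
print(round(193 / 530571, 5))  # 0.00036, matching Table II
```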
TABLE I. CLASSIFICATION

Classification     Description        Priority
Successful-admin   -                  -
Attempted-admin    -                  -
Successful-user    -                  -
Attempted-user     -                  -
Successful-dos     Denial of service  -
Attempted-dos      -                  -
Bad-unknown        -                  -

TABLE II.

Dataset          Conns    Attacks  Base rate
DARPA Week 4-5   530571   193      0.00036375
Private set      243570   1167     0.00479
Evaluation approach
The evaluation follows a trace-driven approach. We selected datasets for which the ground truth is known, i.e. which data (packets) are benign and which are malicious. Based on this information, we calculated the base rate of each dataset, as shown in Table II. After the IDS under evaluation has processed a dataset, the generated alerts are compared against the ground truth in order to identify true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN). The false positive rate P(A|~I) and the true positive rate P(A|I) were then calculated from these counts.
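This scoring step can be sketched as below (the counts are illustrative, not from the tables):

```python
def rates(tp, fp, tn, fn):
    """True positive rate P(A|I) and false positive rate P(A|~I)."""
    tpr = tp / (tp + fn)  # P(A|I)
    fpr = fp / (fp + tn)  # P(A|~I)
    return tpr, fpr

# Illustrative counts from comparing generated alerts against ground truth:
tpr, fpr = rates(tp=80, fp=200, tn=99700, fn=20)
print(tpr, round(fpr, 4))  # 0.8 0.002
```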
Results and Discussion:
DARPA Dataset: The DARPA week 1 dataset is attack free.
We evaluated open source Snort (default configuration) using
this dataset. Since the dataset was completely attack free, all
the generated alerts were false positives and the number of true
positives is zero, as shown in Table III. The top 5 high false
positive alerts were identified (Table IV). The corresponding
signatures and modules were allotted to an appropriate signature level based on their relative false positive rate as part of the RAPID configuration. This may be considered a tuning step for the RAPID system. Using this configuration, we evaluated RAPID with the week 1 dataset. The results are shown in Table III; as expected, they showed a decrease in false positives.
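The tuning step amounts to ranking signatures by false positive count and moving the top N to a stricter signature level; a sketch with hypothetical signature IDs:

```python
def tune(fp_counts, top_n=5, new_level=2):
    """fp_counts: {signature_id: false positive count}.
    Return a new signature-level assignment for the top-N offenders."""
    top = sorted(fp_counts, key=fp_counts.get, reverse=True)[:top_n]
    return {sig: new_level for sig in top}

counts = {"sid:1": 62251, "sid:2": 62251, "sid:3": 495,
          "sid:4": 209, "sid:5": 301, "sid:6": 9}
moved = tune(counts)
print(sorted(moved))  # the five highest-count signatures; "sid:6" stays put
```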
TABLE III.

IDS    Total Alerts  FP      TP
Snort  125619        125619  0
RAPID  281           281     0
TABLE IV.

Count   Sig level
62251   -
62251   -
495     -
209     -
301     -
As the next step, the systems were evaluated using the week 4-5 datasets, which contain labeled attacks. Snort was evaluated using both the default and disabled configurations. For RAPID, the previous configuration was used for the initial evaluation. Following the initial evaluation, the top 5 high false positive alerts were identified and the corresponding signatures were allotted to an appropriate signature level. After this 2nd tuning step, the RAPID system was re-evaluated (referred to as RAPID tuned in the results tables). The results are shown in Tables V and VI. They show that, using RAPID, the false positives dropped by 72% for the initial run, and by 88% after re-tuning. Completely disabling the high false positive signatures (disabled configuration) produced the fewest false positives.
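The quoted 72% drop can be reproduced from the false-alert counts in Table V:

```python
snort_fp = 21066  # Snort, default configuration (Table V)
rapid_fp = 5900   # RAPID, initial run (Table V)
reduction = 100 * (snort_fp - rapid_fp) / snort_fp
print(round(reduction))  # 72
```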
TABLE V.

IDS                    Total Alerts  False Alerts  True Alerts  Attacks Detected  Attacks Missed
Snort                  29784         21066         8718         80                113
RAPID                  12482         5900          6582         69                124
RAPID - tuned          3965          2680          1285         57                136
Snort-disabled config  574           417           157          22                171
TABLE VI.

The false negatives, i.e. missed attacks, are highest for Snort-disabled-config, where the high false positive signatures and/or modules were completely disabled. In the case of RAPID, the high false positive signatures were not completely disabled but placed in an appropriate signature level, so that only connections with a certain suspicion level were analyzed by those signatures. After the 2nd tuning step (RAPID-tuned), yet another set of top 5 high false positive signatures was placed in higher signature levels. In these cases, corresponding to the reduction in false positives, we see an increase in false negatives as well; but the false negatives are not as high as in the case of Snort-disabled-config.
IDS                    P(A|I)  P(A|~I)   P(I|A)
Snort                  0.414   0.02173   0.0068
RAPID                  0.357   0.00608   0.0209
RAPID tuned            0.295   0.00276   0.0374
Snort-disabled config  0.113   0.00043   0.0872
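The P(I|A) column follows from (1) together with the week 4-5 base rate of Table II; for example, for RAPID tuned (a sketch):

```python
base = 0.00036375  # week 4-5 base rate (Table II)

def p_i_given_a(tpr, fpr):
    # Equation (1): P(I|A) = P(I)P(A|I) / (P(I)P(A|I) + P(~I)P(A|~I))
    return base * tpr / (base * tpr + (1.0 - base) * fpr)

print(round(p_i_given_a(0.295, 0.00276), 4))  # 0.0374, as in Table VI
```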
TABLE VII.
IDS    TP    FP    P(A|I)  P(A|~I)    P(I|A)
Snort  1229  2358  -       0.00968    0.3320
RAPID  1229  -     -       0.0000328  0.9932
In this case, the traffic from all machines behind the NAT device, say a firewall, will have the same source IP address. Attack traffic from one of the machines behind the NAT device can therefore affect the reputation of all the devices behind it, and the benign and suspicious traffic from those machines will end up being inspected by the high false positive signatures as well.

DHCP:

REFERENCES
[1]
[2]
[3]
[4]
[5]
[6]
[7]
[8]
[9]
[10]
[12]
[13]
[14]
[15]