
2014 IEEE 28th International Conference on Advanced Information Networking and Applications

IP Spoofing Detection Using Modified Hop Count


Ayman Mukaddam

Imad Elhajj

Ayman Kayssi

Ali Chehab

Electrical and Computer Engineering Department


American University of Beirut
Beirut 1107 2020, Lebanon
{agm10, ie05, ayman, chehab}@aub.edu.lb
1550-445X/14 $31.00 © 2014 IEEE
DOI 10.1109/AINA.2014.62

Abstract: With the global widespread usage of the Internet, more and
more cyber-attacks are being performed. Many of these attacks utilize
IP address spoofing. This paper describes IP spoofing attacks and the
proposed methods currently available to detect or prevent them. In
addition, it presents a statistical analysis of the Hop Count
parameter used in our proposed IP spoofing detection algorithm. We
propose an algorithm, inspired by the Hop Count Filtering (HCF)
technique, that changes the learning phase of HCF to include all the
possible available Hop Count values. Compared to the original HCF
method and its variants, our proposed method increases the true
positive rate by at least 9% and consequently increases the overall
accuracy of an intrusion detection system by at least 9%. In general,
our proposed method performs better than the HCF method and its
variants.

Keywords: IP spoofing, hop count, hop count filtering, statistical
analysis.

I. INTRODUCTION

Internet access, in today's world, can no longer be considered a
commodity but rather a human right [1]. Many critical services like
banking, online shopping, e-commerce, distance learning, remote
surgery, searching, and social media are based on the Internet.
According to [2], there were more than 2.4 billion Internet users as
of June 30, 2012. Therefore, any disruption to this service is
considered problematic and can result in drastic financial losses to
several businesses. Unfortunately, the Internet was not designed with
security as a primary concern; rather, it was designed for
scalability. This has allowed attackers to exploit several of the
design weaknesses that are inherent to the protocols used in today's
Internet.

A particularly interesting weakness in the protocols used in today's
Internet lies in the IP protocol. This weakness allows attackers to
"spoof" (masquerade) the source IP address and thus perform several
attacks such as session hijacking, packet spoofing, denial of service,
advanced scanning techniques, and distributed attacks.

By design, the IP protocol does not offer any form of authentication
of the source IP address. Therefore, an attacker can send an IP packet
with a "spoofed" source IP address. An attacker can thus benefit from
this ability to remain anonymous, to launch targeted attacks, and to
circumvent some security restrictions that are based solely on
verifying the source IP address [3]. There are many variations of
attacks that utilize IP spoofing, such as Non-Blind Spoofing, Blind
Spoofing, Man in the Middle, Denial of Service, and Decoy Scan.

There are two well-known methods to prevent IP spoofing: address
filtering and IPsec. Other methods address specific cases, like the
Generalized TTL Security Mechanism [4].

This work was inspired by the Hop-Count Filtering (HCF) technique
proposed by Wang et al. [5], [6] to detect IP spoofing. Their
algorithm is based on the idea that although an attacker can spoof the
source IP address, the attacker cannot spoof the number of hops a
packet traverses to reach the destination. Therefore, the algorithm
first learns the IP to Hop Count (HC) mapping and stores the mapping
in an IP2HC table. Once a packet arrives, its HC is compared to the HC
stored for its source IP. If the HC values match, then the packet is
considered legitimate. Otherwise, the packet is discarded.

The main strength of the HCF technique lies in its simplicity. This
paper proposes a variation of the HCF technique that enhances its
accuracy by including in the IP2HC table all valid HCs seen in the
learning phase. This modification enhances the overall accuracy
compared to the original HCF and its variations [6].

The remainder of this paper is organized as follows: Section II
discusses the previous work related to the HCF technique and its
variations. Section III presents statistical analysis of HC and RTT.
In Section IV we describe our proposed algorithm. Section V presents
the results of the proposed algorithm. Finally, we conclude the paper
in Section VI.

II. LITERATURE REVIEW

This section provides a literature review of several methods that
detect spoofed IP packets, such as the Hop Count Filtering technique
and Reverse Path Forwarding.

Hop Count (HC) is defined as the number of hops a packet traverses as
it moves from the sender to the receiver. HC is not sent in the IP
packet but is rather inferred from the IP Time-to-Live (TTL) field.
The receiver can estimate the HC by subtracting the received TTL value
from the closest initial TTL value bigger than the received packet's
TTL. Usually, these initial TTL values are operating system dependent
and are limited to a few possibilities, which include 30, 32, 60, 64,
128, and 255 [1]. Therefore, guessing the initial TTL set by the OS is
possible without explicitly knowing what the OS is, especially since
the number of hops between two hosts is relatively limited, as the
statistics presented in this paper will show.
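The TTL-based estimate described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `infer_hop_count` is hypothetical, the candidate list is the one quoted in the text, and a received TTL equal to a candidate value is treated as zero hops.

```python
# Candidate initial TTL values listed in the text above.
INITIAL_TTLS = (30, 32, 60, 64, 128, 255)


def infer_hop_count(received_ttl: int) -> int:
    """Estimate the hop count from the TTL observed at the receiver."""
    if not 0 <= received_ttl <= 255:
        raise ValueError("TTL must fit in one byte (0..255)")
    # Assumption: a TTL equal to a candidate initial value means zero hops,
    # so the comparison is >= rather than the strictly "bigger" wording.
    initial_ttl = min(t for t in INITIAL_TTLS if t >= received_ttl)
    return initial_ttl - received_ttl


if __name__ == "__main__":
    for ttl in (50, 113, 240):
        print(ttl, "->", infer_hop_count(ttl), "hops")
```

Because some candidate values are close together (30 and 32, 60 and 64), the inferred HC can be ambiguous for senders that are only a few hops away; the sketch simply picks the nearest candidate.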

A. Hop Count Filtering Technique & Its Variants

Wang et al. proposed in [6] the HCF algorithm to detect IP spoofing.
The algorithm is based on the idea that although an attacker can spoof
the source IP address, the attacker cannot spoof the number of hops a
packet traverses to reach the destination. Therefore, the algorithm
first learns the IP to HC mapping and stores the mapping in an IP2HC
table. Once a packet arrives, its HC is compared to the HC stored for
its source IP. If the HC values match, then the packet is considered
legitimate. Otherwise, the packet is discarded. Wang et al. presented
an analysis of the HC distribution and its stability, and of how one
or multiple attackers can change the initial Time to Live (TTL) values
to bypass their detection method. Their analysis showed that their
method captures roughly 90% of spoofed packets. In addition, they
explained how clustering can be used to create a mapping between a
subnet and HC. Moreover, they proposed three methods of filtering,
namely strict filtering, +1 filtering, and +2 filtering. Strict
filtering drops a packet when the HC differs from the HC profile. The
+1 filtering drops a packet when the HC differs by more than one from
the HC profile. Finally, +2 filtering drops a packet when the HC
differs by more than two from the HC profile.

However, the authors did not study the case where there are multiple
routes between the sender and destination, which results in multiple
allowed HC values. In that case, the 90% detection probability will be
lower since the attacker will have more HC values that are considered
legitimate. Moreover, even though the authors stated that their method
works solely on layer 3, this statement is not accurate since they
restricted the HC learning phase to established TCP connections.

Xia Wang et al. [7] proposed a variation of the HC Filtering
technique. Instead of applying the HC filtering at the end hosts, they
suggested applying the filtering at intermediate routers. In this
manner, not only are the end systems protected, but the whole network
is protected from traffic congestion. Their simulation results showed
that the proposed algorithm outperforms HCF.

Krishna Kumar et al. [8] proposed to detect IP spoofing by checking
both the HC and the path identification (PID) at every router. The PID
is inserted in the identification field of each IP packet. If both the
HC and PID match, then the packet is considered legitimate. Otherwise,
the routers start an attack-detection process. The algorithm requires
a shared key between every pair of adjacent routers.

Wu and Chen [9] presented a three-layer defense against DDoS on web
servers. At the network layer, the standard HCF is used, while at the
transport layer, a proxy firewall is used to handle TCP SYN flooding
attacks. At the application layer, traffic limiting is used to
distribute the available bandwidth among all the clients.

Swain and Sahoo [10] presented a probabilistic theoretical model based
on HCF. Unlike the HCF technique, which checks every packet for its
legitimacy, their technique checks packets until n malicious packets
have been received. Then, m packets are allowed to go unchecked. Their
analysis is based on the probability of packet arrival p, the number
of malicious packets n, and the number of legitimate packets m.

Mopari et al. [11] implemented HCF inside the Linux kernel and
demonstrated that it is indeed a simple and effective solution against
DoS attacks. They used a hash table to construct the IP2HC table so
that the machine does not need to know the IP to HC mapping of every
single IP address.

B. Reverse Path Forwarding (RPF)

Reverse Path Forwarding is a way to implement Ingress Filtering and it
has many variations, which include 1) Strict Reverse Path Forwarding,
2) Feasible Path Reverse Path Forwarding, 3) Loose Reverse Path
Forwarding, and 4) Loose Reverse Path Forwarding Ignoring Default
Routes.

In strict reverse path forwarding (RPF), the source address is looked
up in the forwarding information base (FIB) table of the router. If
the packet is received on the interface that would be used to forward
traffic to the source address, the packet passes the check. Otherwise,
the packet is discarded [12]. However, this method only applies to the
cases where the routing is symmetrical. Otherwise, RPF checks can
fail. In Feasible Reverse Path Forwarding, all the feasible
alternative paths, if any, are considered legitimate. If the packet is
received on any of the interfaces that would be used to forward
traffic to the source address, the packet passes the RPF check.
Otherwise, the packet is discarded [12]. Removing the rigid
restriction allows Feasible RPF to work better in cases where routes
are asymmetric. Loose RPF checks if a route to the source actually
exists, even if it is a default route. If a route exists, the RPF
check passes; otherwise, the RPF check fails [12]. However, since this
method considers default routes, it can be considered very loose: the
RPF check will always pass if a default route exists. In Loose Reverse
Path Forwarding Ignoring Default Routes, the default route is not
considered while performing the RPF check [12]. This method only
allows packets from sources to which the router has a specific route.

The SAVE Protocol [13] operates on each router by mapping each
interface to a set of valid incoming network addresses. If a packet
arrives on the correct interface, then it is considered legitimate.
Otherwise, the packet is considered spoofed and thus should be
dropped.

III. STATISTICAL ANALYSIS OF HC

In order to study the HC statistics, we used one month of skitter data
collected from the Cooperative Association for Internet Data Analysis
(CAIDA) in January 2008 [14]. Six geographically spread IP sources
were selected. For each IP source, a program we developed calculates
the HC distribution as seen by that source and the distribution of the
number of different HCs obtained for each IP destination. The code
finally computes the HC overlap (defined below) between IPs in the
same Autonomous System (AS), IPs in the same country, and IPs in
different ASs and different countries.

Then, the HC distribution from the 6 sources to all the destinations
is calculated. Although the HC values vary between 6 and 30, more than
83% of the HC values are located between 11 and 21 hops only. In
accordance with [6], the mode has less than 10% probability of
occurrence. The best-fit distribution that models this HC distribution
is a Poisson distribution with a mean of 15.47 [15].
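As a rough way to explore the fitted model, the sketch below evaluates a Poisson distribution with the reported mean of 15.47. It only illustrates the fitted curve, not the underlying CAIDA measurements, and the helper names `poisson_pmf` and `poisson_mass` are hypothetical.

```python
import math

MEAN_HC = 15.47  # fitted Poisson mean reported in the text


def poisson_pmf(k: int, lam: float) -> float:
    """Probability that a Poisson(lam) variable equals k."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def poisson_mass(lo: int, hi: int, lam: float) -> float:
    """Probability that a Poisson(lam) variable lies in [lo, hi]."""
    return sum(poisson_pmf(k, lam) for k in range(lo, hi + 1))


if __name__ == "__main__":
    mode = math.floor(MEAN_HC)  # mode of a Poisson with non-integer mean
    print("mode:", mode, "with probability", round(poisson_pmf(mode, MEAN_HC), 4))
    print("mass in 11..21 hops:", round(poisson_mass(11, 21, MEAN_HC), 4))
```

Comparing these model values with the empirical figures quoted above gives a quick feel for how closely a single-parameter Poisson fit can track the measured HC distribution.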

Next, the number of different HC values obtained between each IP
source and every destination is calculated. Although this number
varies between 1 and 9, more than 98% of the values are less than 6.
The mode is 2 and it has a probability of around 0.61 [16].

The HC overlap is defined as follows: for each destination IP, the HC
probability density function (PDF) is calculated. Then, each IP's PDF
is compared with every other IP's PDF. Assume that [A, B, ..., Z]
represent the probabilities for the HC distribution that IP1 obtains
when communicating with some destinations, and similarly we have
[A', B', ..., Z'] for IP2 when communicating with the same
destinations. To find the HC overlap between IP1 and IP2, we do a
conditional sum of [A', B', ..., Z']. This means that we add A' if
A ≠ 0, B' if B ≠ 0, ..., Z' if Z ≠ 0. Note that this probability of
overlap is different from the probability of HC overlap between IP2
and IP1 because in the latter case we do a conditional sum of
[A, B, ..., Z]. This means we add A if A' ≠ 0, B if B' ≠ 0, ..., Z if
Z' ≠ 0.

After calculating the HC overlap between each IP and every other IP,
we calculate the average of the HC overlap for IPs in the same AS and
same country, IPs in the same country and different ASs, and IPs in
different countries and different ASs. Note that we did not have
enough data to study the case of different countries but same AS.

The results showed that 71.82% of the IPs in the same AS had an HC
overlap greater than 0.5. More than 23% of the IPs in the same AS had
an HC overlap between 0.9 and 1. The average HC overlap within the
same AS was 0.618.

In the same country with different ASs, 78.85% of the IPs had an HC
overlap smaller than 0.5. More than 47.35% of the IPs had an HC
overlap between 0 and 0.1. The average HC overlap in the same country
was 0.217.

In different countries, 82.32% of the IPs had an HC overlap smaller
than 0.5. More than 53.89% of the IPs had an HC overlap between 0 and
0.1. The average HC overlap in different countries was 0.182.

Although the HC distribution has a wide range, the middle values are
more likely than the values at the upper and lower extremities of the
range. The probabilities of HC values between 11 and 19 are less than
0.02 apart. On average, the number of hops between each sender and
destination is approximately 15.

We expected different results for the distribution of the number of
different HC values obtained between each IP source and every
destination. The mode was expected to be one; however, the mode was
actually two, which means that two different routes existed between
most source/destination pairs. This was also confirmed by the
traceroute experiment we describe later.

An interesting finding is the amount of HC overlap between IPs in the
same AS. The results showed that IPs in the same AS have a higher
chance of appearing alike in terms of HC. There is around a 61% chance
that the number of hops from each of two IPs in the same AS to a
random destination IP is the same. This might be used against HC
filtering techniques in general. Attackers might have a higher chance
of being able to spoof a victim if they have an IP in the same AS as
the victim, since most likely their HC values as seen by the
destination will be similar.

Surprisingly, IPs in the same country are different in terms of HC.
More than 47% of the IPs in the same country have less than 10%
overlap. This means the HC distribution of the country as a whole is
spread across its range. Attackers might not benefit much if they get
an IP in the same country as the victim, since the destination will
most likely notice a difference in their HC values.

IPs in different countries and ASs seem very different in terms of HC.
There is on average less than a 19% chance that two IPs would have the
same HC values as seen by a random destination.

In order to test the validity of the results, which were based on the
skitter data collected in 2008, we conducted traceroutes from two
different IP sources, one located in Lebanon and one in the United
Arab Emirates (UAE), to around 200 IPs spread over 40 ASs. The
traceroute to each IP was repeated every hour for 10 consecutive days.

The results (Figures 1 and 2) showed that IPs in the same AS have a
high probability of HC overlap. More than 66% of IPs in the same AS
have an HC overlap higher than 0.9. Also, less than 15% of IPs that
are in different ASs and different countries have a probability of HC
overlap greater than 30%. The results confirmed the findings that were
based on skitter data.

Figure 1: HC overlap in Lebanon (decreasing cumulative probability,
from 0 to 1, plotted for three series: Same AS, Same Country, and
Different Country & AS)

Based on these results, a random destination IP has a high probability
of viewing two IPs in the same AS as similar in terms of HC. This HC
overlap or similarity decreases as the IPs are chosen in the same
country or chosen at random.
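The HC overlap defined in this section can be sketched as follows. This is a minimal illustration, assuming each IP's observations are available as a plain list of hop counts; the function names `hc_pdf` and `hc_overlap` and the sample values are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def hc_pdf(hop_counts: Iterable[int]) -> Dict[int, float]:
    """Empirical probability of each HC value in a list of observations."""
    counts = Counter(hop_counts)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one hop count observation is required")
    return {hc: n / total for hc, n in counts.items()}


def hc_overlap(pdf_ip1: Dict[int, float], pdf_ip2: Dict[int, float]) -> float:
    """HC overlap between IP1 and IP2: conditional sum over IP2's PDF.

    IP2's probability for an HC value is added only when IP1 has a
    non-zero probability for that same HC value.
    """
    return sum(p for hc, p in pdf_ip2.items() if pdf_ip1.get(hc, 0.0) > 0.0)


if __name__ == "__main__":
    ip1 = hc_pdf([14, 14, 15, 15])  # illustrative observations only
    ip2 = hc_pdf([15, 16, 16, 16])
    print("overlap(IP1, IP2) =", hc_overlap(ip1, ip2))
    print("overlap(IP2, IP1) =", hc_overlap(ip2, ip1))
```

The two printed values differ for this sample, which mirrors the remark that the overlap between IP1 and IP2 is not the same as the overlap between IP2 and IP1.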

Figure 2: HC overlap in UAE (decreasing cumulative probability versus
HC overlap, plotted for three series: Same AS, Same Country, and
Different AS & Country)

IV. PROPOSED ALGORITHM: MODIFIED HCF

In this section we first describe the attacker models we adopted and
then we describe our proposed algorithm.

We believe that IP spoofing attacks can be used in two main attacker
models. In the first model, the attacker spoofs the IP address and is
in the traffic path, and is thus able to send spoofed IP packets and
receive their corresponding replies. This is the case of
man-in-the-middle or impersonation attacks. In the second model, the
attacker spoofs the IP address but is not interested in the replies.
This is the case of most DoS attacks.

Like HCF, our proposed algorithm has two phases, a learning phase and
a filtering phase. In the learning phase, each destination creates a
profile for each IP it communicates with. Unlike HCF, the profile is
not made of <IP, HC> but of <IP, HC-List>. In this manner, the
destination can learn all the valid HCs seen from the source in case
multiple routes with varying hop counts exist. This will enhance the
performance of HCF by correctly classifying the legitimate traffic and
thus lowering the false alarm rate. However, this leniency will come
at the cost of a lower attack detection rate, since an attacker has
more valid HC options to bypass this detection method.

In the filtering phase, the packet is considered legitimate if its HC
belongs to the HC-List created in the learning phase. Otherwise, it is
considered an attack.

The proposed model could be deployed in two forms. It can be deployed
at end hosts to protect them from undetected IP spoofing attempts. The
second form is deployment near the periphery of an organization. All
the traffic into and out of the organization usually passes through
periphery routers followed by a firewall for inspection. This
detection algorithm can be placed in-line between the router and the
firewall. It will be able to perform all the calculations since all
the traffic will pass through it. In case of an attack, it can filter
the traffic and thus protect end servers. However, the device will not
be able to protect against DoS attacks that target the bandwidth of
the Internet links used by the organization, since that requires
filtering from upstream Internet Service Providers (ISPs). However, if
all ISPs implement a similar solution, this limitation can be
lessened.

V. TESTING AND RESULTS

The CAIDA data is used to perform the testing and analysis. From each
collection source, and after performing the necessary data preparation
and normalizations, we did the following:

1) For each IP destination, we divide the data into two groups and we
   perform 3-fold cross validation. The data is divided into 3 sets.
   The first two sets are used to learn while the third set is used to
   test the performance; thus we calculate the true negative (TN) and
   the false positive (FP) rates. The choice of which set is used for
   testing is done sequentially using all sets. In total, 3 models are
   built and evaluated for the three possible cases.

2) We randomly select 3 IPs in the same AS, 3 IPs in the same country,
   and 3 IPs in different ASs and countries. We pass each of these
   sets to the algorithm and record the TP and FN cases.

In order to assess the performance of our modified HCF (M-HCF)
algorithm, we calculate the percentage of HCs in the testing data that
are present in the training data. In this manner, we can calculate the
TN, FP, TP, and FN rates for the modified HCF technique. We similarly
calculate the percentage of HCs in the testing data that have strict
HC matching, +1 HC matching, or +2 HC matching relative to the
training data. In this manner, we can calculate the TN, FP, TP, and FN
rates for the original HCF technique and its variants.

Table I shows the results. The second column shows the results based
on our proposed modified HCF. Columns three to five show the three
variations proposed in [6].

Table I shows that if the traditional HCF is based on exact HC match,
then around 40% of false positives will exist. To lessen the amount of
false positives, the HC match criteria can be relaxed and the false
positive rate can be reduced to around 6.45%. However, this gain comes
at the cost of decreasing the true positive rate from around 89% to
58%. In contrast, M-HCF is able to outperform the traditional HCF in
most of the cases. For example, comparing columns two and five, we see
that our M-HCF detects attacks better by at least 24% and generates
fewer false alarms by around 5%.

TABLE I. PERFORMANCE RESULTS (%)

Testing Scenario               M-HCF    HCF      HCF+1    HCF+2
True Negative                  98.5     59.84    89.39    93.55
Same AS (TP)                   42.35    62.86    29.47    16.09
Same Country (TP)              81.81    89.80    72.21    56.76
Different AS & Country (TP)    84.18    89.39    72.66    58.70
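The learning and filtering phases of Section IV, together with the strict, +1, and +2 HCF variants used for comparison, can be sketched as below. This is a hedged illustration rather than the authors' implementation: all function names are hypothetical, the single HC stored per IP for the original HCF is assumed to be the most frequently observed value, unknown source IPs are assumed to be rejected, and the sample addresses are documentation addresses.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, Iterable, List, Set, Tuple

Sample = Tuple[str, int]  # (source IP, observed hop count)


def learn_mhcf(samples: Iterable[Sample]) -> Dict[str, Set[int]]:
    """M-HCF learning phase: keep every HC seen for each source IP."""
    table: Dict[str, Set[int]] = defaultdict(set)
    for ip, hc in samples:
        table[ip].add(hc)
    return dict(table)


def learn_hcf(samples: Iterable[Sample]) -> Dict[str, int]:
    """HCF learning phase; assumption: store the most frequent HC per IP."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for ip, hc in samples:
        counts[ip][hc] += 1
    return {ip: c.most_common(1)[0][0] for ip, c in counts.items()}


def mhcf_accepts(table: Dict[str, Set[int]], ip: str, hc: int) -> bool:
    """M-HCF filtering phase: legitimate if the HC is in the IP's HC-List."""
    # Assumption: an IP never seen during learning is not accepted.
    return hc in table.get(ip, set())


def hcf_accepts(profile: Dict[str, int], ip: str, hc: int, tolerance: int = 0) -> bool:
    """HCF filtering: tolerance 0 is strict, 1 is +1 and 2 is +2 filtering."""
    stored = profile.get(ip)
    return stored is not None and abs(hc - stored) <= tolerance


def acceptance_rate(accepts: Callable[[str, int], bool], packets: List[Sample]) -> float:
    """Fraction of packets accepted; on legitimate traffic this is the TN rate."""
    return sum(accepts(ip, hc) for ip, hc in packets) / len(packets)


if __name__ == "__main__":
    training = [("192.0.2.1", 14), ("192.0.2.1", 14), ("192.0.2.1", 17)]
    legitimate = [("192.0.2.1", 14), ("192.0.2.1", 17)]
    table, profile = learn_mhcf(training), learn_hcf(training)
    print("M-HCF:", acceptance_rate(lambda ip, hc: mhcf_accepts(table, ip, hc), legitimate))
    for tol in (0, 1, 2):
        rate = acceptance_rate(lambda ip, hc: hcf_accepts(profile, ip, hc, tol), legitimate)
        print(f"HCF+{tol}:", rate)
```

Running the same `acceptance_rate` helper on spoofed packets gives the fraction that slips through, so one minus that value corresponds to the TP rate reported in Table I.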

VI. CONCLUSION AND FUTURE WORK

In this paper, we presented an IP Spoofing detection algorithm based
on HCF technique, M-HCF, which includes in the learning phase all
training hop count values. This will help improve the overall
performance. Results show that our M-HCF outperforms the strict HCF,
+1 HCF filtering and +2 HCF filtering proposed in [6].

Further testing with larger and more diverse data sets is needed to
strengthen our findings. The modified HCF method's strength is that it
is able to generate higher true negatives and thus lower false
positives when the traffic is legitimate.

In addition, it is possible to train and conduct detection based on
subnets instead of individual IPs, similar to the clustering method
suggested in [6]. This can help reduce the storage requirements, since
a few IPs can be used to learn the profile of the subnet.
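A subnet-based variant of the learning phase could look like the sketch below. It is only an assumption-laden illustration of the idea: the fixed /24 prefix length, the function names, and the sample addresses are all hypothetical and are not taken from the paper or from [6].

```python
import ipaddress
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def subnet_key(ip: str, prefix_len: int = 24) -> str:
    """Map an IP address to its enclosing subnet (assumed /24 by default)."""
    network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    return str(network)


def learn_subnet_hc_lists(
    samples: Iterable[Tuple[str, int]], prefix_len: int = 24
) -> Dict[str, Set[int]]:
    """Learn one HC-List per subnet instead of one per individual IP."""
    table: Dict[str, Set[int]] = defaultdict(set)
    for ip, hc in samples:
        table[subnet_key(ip, prefix_len)].add(hc)
    return dict(table)


if __name__ == "__main__":
    samples = [("192.0.2.10", 14), ("192.0.2.77", 15), ("198.51.100.5", 20)]
    for subnet, hcs in learn_subnet_hc_lists(samples).items():
        print(subnet, sorted(hcs))
```

With this grouping the table holds one entry per subnet, which is where the storage saving would come from; the cost is that hosts in the same subnet share one set of accepted HC values.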

ACKNOWLEDGMENT

This research was supported by Telus Corporation, Canada.

REFERENCES

[1] Kravets D. (2011, March) U.N. Report Declares Internet Access a
    Human Right, (accessed April 28, 2012) Available:
    http://www.wired.com/threatlevel/2011/06/internet-a-human-right
[2] Internet World Stats, (accessed May 14, 2012) Available:
    http://www.internetworldstats.com/stats.htm
[3] Spoofer Project (accessed September 15, 2012) Available:
    http://spoofer.cmand.org/index.php
[4] Gill V., Heasley J., Meyer D. (2004, February) The Generalized TTL
    Security Mechanism (GTSM), (accessed October 14, 2012) Available:
    http://www.ietf.org/rfc/rfc3682.txt
[5] Jin C., Wang H., Shin K. G., "Hop-count filtering: an effective
    defense against spoofed DDoS traffic," 10th ACM Conference on
    Computer and Communication Security, New York, USA, 2003.
[6] Wang H., Jin C., and Shin K. G., "Defense Against Spoofed IP
    Traffic Using Hop-Count Filtering," IEEE/ACM Trans. Networking,
    vol. 15, no. 1, pp. 40-53, 2007.
[7] Wang, Xia, Li, Ming, Li, Muhai, "A scheme of distributed hop-count
    filtering of traffic," IET International Communication Conference
    on Wireless Mobile and Computing (CCWMC), pp. 516-521, 7-9 , 2009.
[8] KrishnaKumar, B., Kumar, P.K., Sukanesh, R., "Hop Count Based
    Packet Processing Approach to Counter DDoS Attacks," International
    Conference on Recent Trends in Information, Telecommunication and
    Computing (ITC), pp. 271-273, 2010.
[9] Zhijun W., Zhifeng C., "A Three-Layer Defense Mechanism Based on
    WEB Servers Against Distributed Denial of Service Attacks," First
    International Conference on Communications and Networking in
    China, pp. 1-5, 2006.
[10] Swain, B.R., Sahoo, B., "Mitigating DDoS attack and Saving
    Computational Time using a Probabilistic approach and HCF method,"
    in Advance Computing Conference, 2009. IACC 2009. IEEE
    International, pp. 1170-1172, 2009.
[11] Mopari, I.B., Pukale, S.G., Dhore, M.L., "Detection and defense
    against DDoS attack with IP spoofing," International Conference on
    Computing, Communication and Networking, 2008.
[12] Baker F., Savola P. (2004, March) Ingress Filtering for
    Multihomed Networks [Online] Available:
    http://www.ietf.org/rfc/rfc3704.txt (accessed November 9, 2012)
[13] Li J., Mirkovic J., Wang M., Reiher P., and Zhang L., "SAVE:
    source address validity enforcement protocol," IEEE INFOCOM,
    pp. 1557-1566, 2002.
[14] The Cooperative Association for Internet Data Analysis Available:
    http://www.caida.org/home/ (accessed October 7, 2012)
[15] Mukaddam A., Elhajj I., "Hop Count Variability," International
    Conference for Internet Technology and Secured Transactions
    (ICITST), Abu Dhabi, UAE, 2011.
[16] Mukaddam A., Elhajj I., "Round Trip Time to Improve Hop Count
    Filtering," The Third Symposium on Broadband Networks and Fast
    Internet, Baabda, Lebanon, May 28-29, 2012.
