
ABSTRACT

This project is intended to develop a tool called Packet Sniffer. The Packet
Sniffer allows the computer to examine and analyze all the traffic passing by its network
connection. It decodes the network traffic and makes sense of it.

When it is set up on a computer, the network interface of the computer is set to
promiscuous mode, listening to all the traffic on the network rather than just those
packets destined for it. Packet Sniffer is a tool that sniffs without modifying the
network's packets in any way. It merely makes a copy of each packet flowing through
the network interface and finds the source and destination Ethernet addresses of the
packets. It decodes the following protocols in the packets:

IP (Internet Protocol), TCP (Transmission Control Protocol), UDP (User
Datagram Protocol).

The output is appended to a plain text file, so that the network administrator
can understand the network traffic and analyze it later.

Introduction

A Packet Sniffer is a program that can see all of the information passing over
the network it is connected to. It acts like a wire-tapping device that plugs
into computer networks and eavesdrops on the network traffic.

A packet sniffer (also known as a network analyzer or protocol analyzer or, for
particular types of networks, an Ethernet sniffer or wireless sniffer) is computer
software that can intercept and log traffic passing over a digital network or part of a
network. As data streams flow across the network, the sniffer captures each packet
and eventually decodes and analyzes its content.

Most Ethernet networks used to be of a common bus topology, using either coax
cable or twisted pair wire and a hub. All of the nodes (computers and other devices) on
the network could communicate over the same wires and take turns sending data using
a scheme known as carrier sense multiple access with collision detection (CSMA/CD).
Think of CSMA/CD as being like a conversation at a loud party: you may have to wait
quite a while for a lull in everybody else’s conversation before you can get your
words in. All of the nodes on the network have their own unique MAC (media
access control) address that they use to send packets of information to each other.
Normally a node would only look at the packets that are destined for its MAC address.
However, if the network card is put into what is known as “promiscuous mode”, it will
look at all of the packets on the wires it is hooked to.
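
The filtering decision described above can be sketched in a few lines of Python. This is only an illustration of the rule, not the project's code: the function name nic_accepts and the string form of the MAC addresses are assumptions made for the example, and multicast filtering is left out.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


def nic_accepts(frame_dst: str, own_mac: str, promiscuous: bool = False) -> bool:
    """Return True if the interface would pass the frame up to the host.

    Hypothetical helper: a real NIC makes this decision in hardware.
    """
    if promiscuous:
        # Promiscuous mode: every frame seen on the wire is accepted.
        return True
    dst = frame_dst.lower()
    # Normal mode: only frames for this node or for everyone (broadcast).
    return dst == own_mac.lower() or dst == BROADCAST
```

With promiscuous set to False, a frame addressed to another node is dropped; with it set to True, the same frame is accepted, which is what a sniffer relies on.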

TCP/IP Protocols

Background:

The Internet protocols are the world's most popular open-system
(nonproprietary) protocol suite because they can be used to communicate across any
set of interconnected networks and are equally well suited for LAN and WAN
communications. The Internet protocols consist of a suite of communication protocols,
of which the two best known are the Transmission Control Protocol (TCP) and the
Internet Protocol (IP). The Internet protocol suite not only includes lower-layer
protocols (such as TCP and IP), but it also specifies common applications such as
electronic mail, terminal emulation, and file transfer. This document provides a broad
introduction to specifications that comprise the Internet protocols.

Internet protocols were first developed in the mid-1970s, when the Defense
Advanced Research Projects Agency (DARPA) became interested in establishing a
packet-switched network that would facilitate communication between dissimilar
computer systems at research institutions. With the goal of heterogeneous connectivity
in mind, DARPA funded research by Stanford University and Bolt, Beranek, and
Newman (BBN). The result of this development effort was the Internet protocol suite,
completed in the late 1970s.

Documentation of the Internet protocols (including new or revised protocols)
and policies is specified in technical reports called Requests for Comments (RFCs),
which are published and then reviewed and analyzed by the Internet community.
Protocol refinements are published in new RFCs. To illustrate the scope of the
Internet protocols, Figure 1 maps many of the protocols of the Internet protocol suite
to their corresponding OSI layers.

OSI model                              Internet protocol suite

Application, Presentation, Session     FTP, Telnet, SMTP, SNMP; NFS, RPC, XDR
Transport                              TCP, UDP
Network                                IP, ICMP, ARP, RARP
Data Link, Physical                    Not specified

Fig 1: Internet protocols span the complete range of OSI model layers.

The Internet Protocol (IP) is a network-layer (Layer 3) protocol that contains
addressing information and some control information that enables packets to be routed.
IP is the primary network-layer protocol in the Internet protocol suite. Along with the
Transmission Control Protocol (TCP), IP represents the heart of the Internet protocols.
IP has two primary responsibilities: providing connectionless, best-effort delivery of
datagrams through an internetwork; and providing fragmentation and reassembly of
datagrams to support data links with different maximum-transmission unit (MTU)
sizes.

IP Packet Format:

An IP packet contains several types of information, as illustrated in Figure 2.

Version | IHL | Type of Service | Total Length
Identification | Flags | Fragment Offset
Time to Live | Protocol | Header Checksum
Source Address
Destination Address
Options
Data

Fig 2: Fourteen fields comprise an IP packet.

The following discussion describes the IP packet fields illustrated in Figure 2:

• Version — Indicates the version of IP currently used.

• IP Header Length (IHL) — Indicates the datagram header length in 32-bit words.

• Type-of-Service — Specifies how an upper-layer protocol would like the current
datagram to be handled, and assigns datagrams various levels of importance.

• Total Length — Specifies the length, in bytes, of the entire IP packet, including
the data and header.

• Identification — Contains an integer that identifies the current datagram. This
field is used to help piece together datagram fragments.

• Flags — Consists of a 3-bit field of which the two low-order (least-significant)
bits control fragmentation. The middle bit (Don't Fragment) specifies whether the
packet can be fragmented. The low-order bit (More Fragments) specifies whether the
packet is the last fragment in a series of fragmented packets. The third or
high-order bit is not used.

• Fragment Offset — Indicates the position of the fragment's data relative to the
beginning of the data in the original datagram, which allows the destination IP
process to properly reconstruct the original datagram.

• Time-to-Live — Maintains a counter that gradually decrements down to zero,
at which point the datagram is discarded. This keeps packets from looping
endlessly.

• Protocol — Indicates which upper-layer protocol receives incoming packets
after IP processing is complete.

• Header Checksum — Helps ensure IP header integrity.

• Source Address — Specifies the sending node.

• Destination Address — Specifies the receiving node.

• Options — Allows IP to support various options, such as security.

• Data — Contains upper-layer information.
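
As an illustration of how a sniffer might decode the fixed part of this header, the following Python sketch unpacks the first 20 bytes of an IPv4 packet. The function name parse_ipv4_header, the dictionary keys, and the sample bytes are assumptions made for this example; options and header checksum verification are left out.

```python
import socket
import struct


def parse_ipv4_header(raw: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header (hypothetical helper)."""
    if len(raw) < 20:
        raise ValueError("an IPv4 header needs at least 20 bytes")
    (ver_ihl, tos, total_length, ident, flags_frag,
     ttl, protocol, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    ihl = ver_ihl & 0x0F                      # header length in 32-bit words
    return {
        "version": ver_ihl >> 4,
        "ihl_words": ihl,
        "header_bytes": ihl * 4,
        "type_of_service": tos,
        "total_length": total_length,
        "identification": ident,
        "dont_fragment": bool(flags_frag & 0x4000),
        "more_fragments": bool(flags_frag & 0x2000),
        "fragment_offset": flags_frag & 0x1FFF,   # counted in 8-byte units
        "time_to_live": ttl,
        "protocol": protocol,                 # e.g. 1 = ICMP, 6 = TCP, 17 = UDP
        "header_checksum": checksum,
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
    }


# Sample header bytes made up for the example (checksum value not verified).
sample = bytes.fromhex("4500003c1c4640004006b1e6ac100a63ac100a0c")
fields = parse_ipv4_header(sample)
```

For the sample bytes, fields["version"] is 4, fields["header_bytes"] is 20, and fields["protocol"] is 6, so the data that follows would be handed to a TCP decoder.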

IP Addressing:

As with any other network-layer protocol, the IP addressing scheme is integral
to the process of routing IP datagrams through an internetwork. Each IP address has
specific components and follows a basic format. These IP addresses can be subdivided
and used to create addresses for sub-networks.

Each host on a TCP/IP network is assigned a unique 32-bit logical address that
is divided into two main parts: the network number and the host number. The network
number identifies a network and must be assigned by the Internet Network Information
Center (InterNIC) if the network is to be part of the Internet. An Internet Service
Provider (ISP) can obtain blocks of network addresses from the InterNIC and can itself
assign address space as necessary. The host number identifies a host on a network and
is assigned by the local network administrator.
IP Address Format:

The 32-bit IP address is grouped eight bits at a time, separated by dots, and
represented in decimal format (known as dotted decimal notation). Each bit in the octet
has a binary weight (128, 64, 32, 16, 8, 4, 2, 1). The minimum value for an octet is 0,
and the maximum value for an octet is 255. Figure.3 illustrates the basic format of an
IP address.

Fig 3: An IP address consists of 32 bits, grouped into four octets.

|<------------ 32 bits ------------>|
|     Network     |      Host       |
| 8 bits | 8 bits | 8 bits | 8 bits |
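
The dotted decimal notation can be illustrated with a short Python sketch that converts between a 32-bit integer and its four octets. The function names to_dotted_decimal and from_dotted_decimal are chosen for this example only.

```python
def to_dotted_decimal(value: int) -> str:
    """Split a 32-bit address into four octets and join them with dots."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("an IPv4 address must fit in 32 bits")
    octets = [(value >> shift) & 0xFF for shift in (24, 16, 8, 0)]
    return ".".join(str(octet) for octet in octets)


def from_dotted_decimal(text: str) -> int:
    """Rebuild the 32-bit value from dotted decimal notation."""
    parts = [int(part) for part in text.split(".")]
    if len(parts) != 4 or any(not 0 <= part <= 255 for part in parts):
        raise ValueError("expected four octets in the range 0 to 255")
    value = 0
    for part in parts:
        value = (value << 8) | part   # shift in one octet at a time
    return value
```

For example, the 32-bit value 0xAC1F0102 corresponds to the address 172.31.1.2.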

IP Address Classes:

IP addressing supports five different address classes: A, B, C, D, and E. Only
classes A, B, and C are available for commercial use. The left-most (high-order) bits
indicate the network class.

Fig 4: IP address formats A, B, and C are available for commercial use.

Class     Leading bits   Network bits   Host bits   Octet layout
Class A   0              7              24          Network.Host.Host.Host
Class B   1 0            14             16          Network.Network.Host.Host
Class C   1 1 0          21             8           Network.Network.Network.Host

The class of address can be determined easily by examining the first octet of the
address and mapping that value to a class range in the following table. In an IP address
of 172.31.1.2, for example, the first octet is 172. Because 172 falls between 128 and
191, 172.31.1.2 is a Class B address. Figure 5 summarizes the range of possible values
for the first octet of each address class.

Fig 5: A range of possible values exists for the first octet of each address class.

Address Class   First Octet in Decimal   High-order Bits
Class A         1 to 126                 0
Class B         128 to 191               10
Class C         192 to 223               110
Class D         224 to 239               1110
Class E         240 to 254               1111
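
The class lookup can be written as a test on the high-order bits of the first octet. The following Python sketch is one possible way to do it; the function name address_class is an assumption for this example. It classifies purely by leading bits, so first octets 0, 127 and 255, which fall outside the decimal ranges listed above, are still assigned a letter.

```python
def address_class(address: str) -> str:
    """Return the class letter implied by the high-order bits of the first octet."""
    first_octet = int(address.split(".")[0])
    if not 0 <= first_octet <= 255:
        raise ValueError("first octet must be in the range 0 to 255")
    if first_octet & 0x80 == 0x00:   # leading bit   0
        return "A"
    if first_octet & 0xC0 == 0x80:   # leading bits  10
        return "B"
    if first_octet & 0xE0 == 0xC0:   # leading bits  110
        return "C"
    if first_octet & 0xF0 == 0xE0:   # leading bits  1110
        return "D"
    return "E"                       # leading bits  1111
```

Calling address_class("172.31.1.2") returns "B", matching the worked example in the text.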


IP Subnet Addressing:

IP networks can be divided into smaller networks called sub-networks (or
subnets). Subnetting provides the network administrator with several benefits,
including extra flexibility, more efficient use of network addresses, and the capability
to contain broadcast traffic (a broadcast will not cross a router).

Subnets are under local administration. As such, the outside world sees an
organization as a single network and has no detailed knowledge of the organization's
internal structure.

A given network address can be broken up into many subnetworks. For
example, 172.16.1.0, 172.16.2.0, 172.16.3.0, and 172.16.4.0 are all subnets within
network 172.16.0.0. (All 0s in the host portion of an address specify the entire
network.)
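
The subnet relationship in this example can be checked with Python's standard ipaddress module. The text does not state the subnet mask, so the 24-bit subnet prefix used below is an assumption made for illustration; the 16-bit prefix of the parent network follows from 172.16.0.0 being a Class B address.

```python
import ipaddress

# Parent Class B network from the example.
network = ipaddress.ip_network("172.16.0.0/16")

# Assumed /24 subnets (the mask is not given in the text).
subnets = [ipaddress.ip_network(f"172.16.{n}.0/24") for n in (1, 2, 3, 4)]

# Every subnet lies inside the parent network.
all_inside = all(subnet.subnet_of(network) for subnet in subnets)

# A host address belongs to exactly one of these subnets.
host = ipaddress.ip_address("172.16.3.25")
home_subnets = [subnet for subnet in subnets if host in subnet]
```

Here all_inside is True, and home_subnets contains only 172.16.3.0/24.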

Address Resolution Protocol (ARP) Overview:

For two machines on a given network to communicate, each must know the
other machine's physical (or MAC) address. By broadcasting Address Resolution
Protocol (ARP) requests, a host can dynamically discover the MAC-layer address
corresponding to a particular IP network-layer address.

After receiving a MAC-layer address, IP devices create an ARP cache to store
the recently acquired IP-to-MAC address mapping, thus avoiding having to broadcast
ARP requests when they want to re-contact a device. If the device does not respond
within a specified time frame, the cache entry is flushed.
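
A cache of this kind can be sketched as a dictionary with a timestamp per entry. The Python class below is a simplified illustration, not an operating-system implementation: the class name ArpCache, the 60-second lifetime, and the purely age-based expiry (instead of tracking whether the device still responds) are all assumptions made for the example.

```python
import time


class ArpCache:
    """Toy IP-to-MAC cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float = 60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the behaviour can be tested
        self._entries = {}           # ip -> (mac, time the mapping was learned)

    def learn(self, ip: str, mac: str) -> None:
        """Store or refresh a mapping, e.g. after an ARP reply is seen."""
        self._entries[ip] = (mac, self._clock())

    def lookup(self, ip: str):
        """Return the cached MAC address, or None if unknown or expired."""
        entry = self._entries.get(ip)
        if entry is None:
            return None
        mac, learned_at = entry
        if self._clock() - learned_at > self._ttl:
            del self._entries[ip]    # flush the stale entry
            return None
        return mac
```

A lookup that returns None is the point at which a host would have to broadcast a new ARP request.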

In addition, the Reverse Address Resolution Protocol (RARP) is used to map
MAC-layer addresses to IP addresses. RARP, which is the logical inverse of ARP,
might be used by diskless workstations that do not know their IP addresses when they
boot. RARP relies on the presence of a RARP server with table entries of
MAC-layer-to-IP address mappings.

Internet Routing:

Internet routing devices traditionally have been called gateways. In today's
terminology, however, the term gateway refers specifically to a device that performs
application-layer protocol translation between devices. Interior gateways refer to
devices that perform these protocol functions between machines or networks under the
same administrative control or authority, such as a corporation's internal network.
These are known as autonomous systems. Exterior gateways perform protocol
functions between independent networks.

Routers within the Internet are organized hierarchically. Routers used for
information exchange within autonomous systems are called interior routers, which use
a variety of Interior Gateway Protocols (IGPs) to accomplish this purpose. The Routing
Information Protocol (RIP) is an example of an IGP.

Routers that move information between autonomous systems are called exterior
routers. These routers use an exterior gateway protocol to exchange information
between autonomous systems. The Border Gateway Protocol (BGP) is an example of
an exterior gateway protocol.

IP Routing:

IP routing protocols are dynamic. Dynamic routing calls for routes to be
calculated automatically at regular intervals by software in routing devices. This
contrasts with static routing, where routes are established by the network administrator
and do not change until the network administrator changes them.

An IP routing table, which consists of destination address/next hop pairs, is used
to enable dynamic routing. An entry in this table, for example, would be interpreted as
follows: to get to network 172.31.0.0, send the packet out Ethernet interface 0 (E0).

IP routing specifies that IP datagrams travel through internetworks one hop at a
time. The entire route is not known at the outset of the journey, however. Instead, at
each stop, the next destination is calculated by matching the destination address within
the datagram with an entry in the current node's routing table.
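
The per-hop lookup can be sketched as a search through destination/next-hop pairs. In the Python example below, the table contents other than the 172.31.0.0 entry, the interface name E1, the 16-bit prefix length, and the longest-prefix tie-break are assumptions made for illustration; they are not taken from the project.

```python
import ipaddress

# Hypothetical routing table: (destination network, outgoing interface).
ROUTES = [
    (ipaddress.ip_network("172.31.0.0/16"), "E0"),   # entry from the text
    (ipaddress.ip_network("0.0.0.0/0"), "E1"),       # assumed default route
]


def next_hop(destination: str, routes=ROUTES):
    """Return the interface of the most specific matching entry, or None."""
    address = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in routes if address in net]
    if not matches:
        return None
    # Prefer the entry with the longest prefix (the most specific network).
    best_network, best_hop = max(matches, key=lambda match: match[0].prefixlen)
    return best_hop
```

A datagram for 172.31.5.9 is sent out E0, while any other destination falls through to the assumed default route on E1.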

Each node's involvement in the routing process is limited to forwarding packets
based on internal information. The nodes do not monitor whether the packets get to
their final destination, nor does IP provide for error reporting back to the source when
routing anomalies occur. This task is left to another Internet protocol, the Internet
Control-Message Protocol (ICMP), which is discussed in the following section.

Internet Control Message Protocol (ICMP):

The Internet Control Message Protocol (ICMP) is a network-layer Internet
protocol that provides message packets to report errors and other information regarding
IP packet processing back to the source. ICMP is documented in RFC 792.

ICMP Messages:

ICMP generates several kinds of useful messages, including Destination
Unreachable, Echo Request and Reply, Redirect, Time Exceeded, and Router
Advertisement and Router Solicitation. If an ICMP message cannot be delivered, no
second one is generated. This is to avoid an endless flood of ICMP messages.

When an ICMP destination-unreachable message is sent by a router, it means
that the router is unable to send the packet to its final destination. The router then
discards the original packet. Two reasons exist for why a destination might be
unreachable. Most commonly, the source host has specified a nonexistent address. Less
frequently, the router does not have a route to the destination.

Destination-unreachable messages include four basic types: network
unreachable, host unreachable, protocol unreachable and port unreachable.
Network-unreachable messages usually mean that a failure has occurred in the routing or
addressing of a packet. Host-unreachable messages usually indicate delivery failure,
such as a wrong subnet mask. Protocol-unreachable messages generally mean that the
destination does not support the upper-layer protocol specified in the packet.
Port-unreachable messages imply that the TCP socket or port is not available.

An ICMP echo-request message, which is generated by the ping command, is
sent by any host to test node reachability across an internetwork. The ICMP echo-reply
message indicates that the node can be successfully reached.

An ICMP Redirect message is sent by the router to the source host to stimulate
more efficient routing. The router still forwards the original packet to the destination.
ICMP redirects allow host routing tables to remain small because it is necessary to
know the address of only one router, even if that router does not provide the best path.
Even after receiving an ICMP Redirect message, some devices might continue using
the less-efficient route.

The router sends an ICMP Time-exceeded message if an IP packet's Time-to-Live
field (expressed in hops or seconds) reaches zero. The Time-to-Live field prevents
packets from continuously circulating the internetwork if the internetwork contains a
routing loop. The router then discards the original packet.

Transmission Control Protocol (TCP):

TCP provides reliable transmission of data in an IP environment. TCP
corresponds to the transport layer (Layer 4) of the OSI reference model. Among the
services TCP provides are stream data transfer, reliability, efficient flow control,
full-duplex operation, and multiplexing.

With stream data transfer, TCP delivers an unstructured stream of bytes
identified by sequence numbers. This service benefits applications because they do not
have to chop data into blocks before handing it off to TCP. Instead, TCP groups bytes
into segments and passes them to IP for delivery.

TCP offers reliability by providing connection-oriented, end-to-end reliable
packet delivery through an internetwork. It does this by sequencing bytes with a
forwarding acknowledgment number that indicates to the destination the next byte the
source expects to receive. Bytes not acknowledged within a specified time period are
retransmitted. The reliability mechanism of TCP allows devices to deal with lost,
delayed, duplicate, or misread packets. A time-out mechanism allows devices to detect
lost packets and request retransmission.

TCP offers efficient flow control, which means that, when sending
acknowledgments back to the source, the receiving TCP process indicates the highest
sequence number it can receive without overflowing its internal buffers.

Full-duplex operation means that TCP processes can both send and receive at the same
time.

Finally, TCP's multiplexing means that numerous simultaneous upper-layer
conversations can be multiplexed over a single connection.

TCP Connection Establishment:

To use reliable transport services, TCP hosts must establish a
connection-oriented session with one another. Connection establishment is performed
by using a "three-way handshake" mechanism.

A three-way handshake synchronizes both ends of a connection by allowing
both sides to agree upon initial sequence numbers. This mechanism also guarantees that
both sides are ready to transmit data and know that the other side is ready to transmit
as well. This is necessary so that packets are not transmitted or retransmitted during
session establishment or after session termination.

Each host randomly chooses a sequence number used to track bytes within the
stream it is sending and receiving. Then, the three-way handshake proceeds in the
following manner:

The first host (Host A) initiates a connection by sending a packet with the initial
sequence number (X) and SYN bit set to indicate a connection request. The second host
(Host B) receives the SYN, records the sequence number X, and replies by
acknowledging the SYN (with an ACK = X + 1). Host B includes its own initial
sequence number (SEQ = Y). An ACK = 20 means the host has received bytes 0
through 19 and expects byte 20 next. This technique is called forward
acknowledgment. Host A then acknowledges all bytes Host B sent with a forward
acknowledgment indicating the next byte Host A expects to receive (ACK = Y + 1).
Data transfer then can begin.
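
The exchange of sequence and acknowledgment numbers in the handshake can be traced with a small Python sketch. It only models the numbers carried by the three segments, under the assumption that no segment is lost; the function name three_way_handshake and the dictionary layout are chosen for this example.

```python
SEQ_SPACE = 2 ** 32   # TCP sequence numbers are 32 bits wide and wrap around


def three_way_handshake(x: int, y: int) -> list:
    """Return the three segments exchanged when Host A connects to Host B.

    x and y are the initial sequence numbers chosen by Host A and Host B.
    """
    return [
        # 1. Host A asks for a connection.
        {"from": "A", "flags": "SYN", "seq": x, "ack": None},
        # 2. Host B acknowledges A's SYN and sends its own SYN.
        {"from": "B", "flags": "SYN+ACK", "seq": y, "ack": (x + 1) % SEQ_SPACE},
        # 3. Host A acknowledges B's SYN; data transfer can begin.
        {"from": "A", "flags": "ACK", "seq": (x + 1) % SEQ_SPACE,
         "ack": (y + 1) % SEQ_SPACE},
    ]


segments = three_way_handshake(x=100, y=300)
```

With x = 100 and y = 300, Host B replies with ACK = 101 and Host A finishes with ACK = 301.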
Positive Acknowledgment and Retransmission (PAR):

A simple transport protocol might implement a reliability-and-flow-control
technique where the source sends one packet, starts a timer, and waits for an
acknowledgment before sending a new packet. If the acknowledgment is not received
before the timer expires, the source retransmits the packet. Such a technique is called
positive acknowledgment and retransmission (PAR).

By assigning each packet a sequence number, PAR enables hosts to track lost
or duplicate packets caused by network delays that result in premature retransmission.
The sequence numbers are sent back in the acknowledgments so that the
acknowledgments can be tracked.

PAR is an inefficient use of bandwidth, however, because a host must wait for
an acknowledgment before sending a new packet, and only one packet can be sent at a
time.

TCP Sliding Window:

A TCP sliding window provides more efficient use of network bandwidth than
PAR because it enables hosts to send multiple bytes or packets before waiting for an
acknowledgment.

In TCP, the receiver specifies the current window size in every packet. Because
TCP provides a byte-stream connection, window sizes are expressed in bytes. This
means that a window is the number of data bytes that the sender is allowed to send
before waiting for an acknowledgment. Initial window sizes are indicated at connection
setup, but might vary throughout the data transfer to provide flow control. A window
size of zero, for instance, means "Send no data."

In a TCP sliding-window operation, for example, the sender might have a
sequence of bytes to send (numbered 1 to 10) to a receiver who has a window size of
five. The sender then would place a window around the first five bytes and transmit
them together. It would then wait for an acknowledgment.

The receiver would respond with an ACK = 6, indicating that it has received
bytes 1 to 5 and is expecting byte 6 next. In the same packet, the receiver would indicate
that its window size is 5. The sender then would move the sliding window five bytes to
the right and transmit bytes 6 to 10. The receiver would respond with an ACK = 11,
indicating that it is expecting sequenced byte 11 next. In this packet, the receiver might
indicate that its window size is 0 (because, for example, its internal buffers are full). At
this point, the sender cannot send any more bytes until the receiver sends another packet
with a window size greater than 0.
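
The worked example can be replayed with a short Python simulation. It is a simplified sketch that assumes no loss and that the receiver acknowledges every burst in full; the function name sliding_window_transfer and the list of advertised windows are assumptions made for the example.

```python
def sliding_window_transfer(first_byte: int, last_byte: int, advertised_windows):
    """Return (first_sent, last_sent, ack) for each burst the sender transmits.

    advertised_windows lists the window sizes announced by the receiver:
    the initial one, then one per acknowledgment.
    """
    log = []
    windows = iter(advertised_windows)
    window = next(windows)
    next_byte = first_byte
    while next_byte <= last_byte and window > 0:
        # Send as many bytes as the current window allows.
        end = min(next_byte + window - 1, last_byte)
        ack = end + 1                  # receiver expects this byte next
        log.append((next_byte, end, ack))
        next_byte = ack                # slide the window to the right
        window = next(windows, 0)      # window announced with the ACK
    return log


bursts = sliding_window_transfer(1, 10, advertised_windows=[5, 5, 0])
```

The result is [(1, 5, 6), (6, 10, 11)]: bytes 1 to 5 are acknowledged with ACK = 6, bytes 6 to 10 with ACK = 11, and the final window of 0 stops the sender.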

TCP Packet Format:

Figure 6 illustrates the fields and overall format of a TCP packet.

Fig 6: Twelve fields comprise a TCP packet.

Source Port | Destination Port
Sequence Number
Acknowledgment Number
Data Offset | Reserved | Flags | Window
Checksum | Urgent Pointer
Options (+ padding)
Data (variable)

TCP Packet Field Descriptions:

The following descriptions summarize the TCP packet fields illustrated in Figure 6.

Source Port and Destination Port — Identifies points at which upper-layer source and
destination processes receive TCP services.

Sequence Number — Usually specifies the number assigned to the first byte of data in
the current message. In the connection-establishment phase, this field also can be used
to identify an initial sequence number to be used in an upcoming transmission.

Acknowledgment Number — Contains the sequence number of the next byte of data
the sender of the packet expects to receive.

Data Offset — Indicates the number of 32-bit words in the TCP header.

Reserved—Remains reserved for future use.

Flags — Carries a variety of control information, including the SYN and ACK bits
used for connection establishment, and the FIN bit used for connection termination.

Window — Specifies the size of the sender's receive window (that is, the buffer space
available for incoming data).

Checksum — Indicates whether the header or the data was damaged in transit.

Urgent Pointer — Points to the first urgent data byte in the packet.

Options — Specifies various TCP options.

Data — Contains upper-layer information.
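
A decoder for the fixed 20-byte part of the TCP header might look like the following Python sketch. The function name parse_tcp_header and the returned dictionary keys are assumptions for this example; options are not decoded and the checksum is not verified.

```python
import struct

# Bit positions of the six classic control flags in the flags field.
TCP_FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
                 "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_tcp_header(raw: bytes) -> dict:
    """Decode the fixed part of a TCP header (hypothetical helper)."""
    if len(raw) < 20:
        raise ValueError("a TCP header needs at least 20 bytes")
    (src_port, dst_port, seq, ack, offset_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", raw[:20])
    data_offset = offset_flags >> 12          # header length in 32-bit words
    flag_bits = offset_flags & 0x3F
    return {
        "source_port": src_port,
        "destination_port": dst_port,
        "sequence_number": seq,
        "acknowledgment_number": ack,
        "header_bytes": data_offset * 4,
        "flags": [name for name, bit in TCP_FLAG_BITS.items() if flag_bits & bit],
        "window": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
    }


# A made-up SYN segment header: port 1234 to port 80, sequence number 100.
sample = struct.pack("!HHIIHHHH", 1234, 80, 100, 0, (5 << 12) | 0x02, 65535, 0, 0)
header = parse_tcp_header(sample)
```

For the sample, header["flags"] is ["SYN"] and header["header_bytes"] is 20, so the segment carries no options.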

User Datagram Protocol (UDP):

The User Datagram Protocol (UDP) is a connectionless transport-layer protocol
(Layer 4) that belongs to the Internet protocol family. UDP is basically an interface
between IP and upper-layer processes. UDP protocol ports distinguish multiple
applications running on a single device from one another.

Unlike TCP, UDP adds no reliability, flow-control, or error-recovery
functions to IP. Because of UDP's simplicity, UDP headers contain fewer bytes and
consume less network overhead than TCP.

UDP is useful in situations where the reliability mechanisms of TCP are not
necessary, such as in cases where a higher-layer protocol might provide error and flow
control.

UDP is the transport protocol for several well-known application-layer
protocols, including Network File System (NFS), Simple Network Management
Protocol (SNMP), Domain Name System (DNS), and Trivial File Transfer Protocol
(TFTP).

The UDP packet format contains four fields, as shown in Figure 7. These
include source and destination ports, length, and checksum fields.

Fig 7: A UDP packet consists of four fields.

|<---------- 32 bits ---------->|
Source Port | Destination Port
Length | Checksum

Source and destination ports contain the 16-bit UDP protocol port numbers used
to de-multiplex datagrams for receiving application-layer processes. A length field
specifies the length of the UDP header and data. Checksum provides an (optional)
integrity check on the UDP header and data.
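
Because the UDP header has only four 16-bit fields, decoding it takes a single unpack call. The Python sketch below is an illustration only; the function name parse_udp_header and the sample datagram are assumptions, and the checksum is not verified.

```python
import struct


def parse_udp_header(raw: bytes) -> dict:
    """Decode the 8-byte UDP header and slice out the payload."""
    if len(raw) < 8:
        raise ValueError("a UDP header needs 8 bytes")
    src_port, dst_port, length, checksum = struct.unpack("!HHHH", raw[:8])
    return {
        "source_port": src_port,
        "destination_port": dst_port,
        "length": length,              # header plus data, in bytes
        "checksum": checksum,          # 0 means the sender did not compute one
        "payload": raw[8:length],
    }


# A made-up datagram: port 53 to port 40000 carrying four bytes of data.
sample = struct.pack("!HHHH", 53, 40000, 8 + 4, 0) + b"abcd"
datagram = parse_udp_header(sample)
```

For the sample, datagram["length"] is 12 and datagram["payload"] is b"abcd".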

Internet Protocols: Application Layer Protocols

The Internet protocol suite includes many application-layer protocols that
represent a wide variety of applications, including the following:

File Transfer Protocol (FTP)—Moves files between devices

Simple Network-Management Protocol (SNMP)—Primarily reports anomalous
network conditions and sets network threshold values

Telnet—Serves as a terminal emulation protocol

X Windows—Serves as a distributed windowing and graphics system used for
communication between X terminals and UNIX workstations

Network File System (NFS), External Data Representation (XDR), and Remote
Procedure Call (RPC)—Work together to enable transparent access to remote network
resources

Simple Mail Transfer Protocol (SMTP)—Provides electronic mail services

Domain Name System (DNS)—Translates the names of network nodes into network
addresses

The higher-layer protocols and the applications that they support are as follows:

Application                  Protocols
File transfer                FTP
Terminal emulation           Telnet
Electronic mail              SMTP
Network management           SNMP
Distributed file services    NFS, XDR, RPC, X Windows

Software Requirements Specification:

This is the requirements document for the project. The system to be developed
captures the packets flowing in the network and analyzes them. The information
in the various headers of the packets is to be extracted and saved into the output file.

Introduction:

Purpose:

To develop a tool that easily analyzes the network traffic flow on a particular
system and shows the information to the administrator in human-readable format.

Scope:

This tool can be used by network administrators, organizations, and ordinary users
who want to know the network flow in and out of the system, and who want to save it
to a file for later analysis such as load on the system, network intrusion detection, etc.

Developer’s Responsibilities Overview:

The developer is responsible for:

• Developing the system.

• Installing the software on the client’s hardware.

General Description:

Product Functions Overview:

In a computer network every system can see all the packets flowing in the
network, but normally captures only the packets that are addressed to that particular
system. The product, however, must be able to make a copy of all the packets flowing
in the network, whether or not they are addressed to it. The copied packets must be
stored in a buffer. Each packet has headers in which information about the packet is
stored in a specified format. This information must be extracted, converted into
human-readable form if necessary, and stored in the output files.

User Characteristics:

The user of the system will be the systems administrator who controls and
configures the network traffic through the server.

General Constraints:

The system should have WinPcap and PacketX installed.

General assumptions and dependencies:


The assumption is that the packets moving in the network come only from
Ethernet and not from any other link type such as FDDI.

Specific Requirements:

Inputs and Outputs:

Inputs: Raw packets flowing in the network of the system on which the Packet
Sniffer is installed.

Outputs: The output is stored in a file.

Functional Requirements:

Capture the packets in the network at the data link layer before they are passed
to the protocols implemented in the kernel.

Strip off the various headers in each packet and analyze the information in it.

Append the information in the headers of the packet to the output file in a specified
format.

Performance Constraints:

The maximum size of the buffer to hold a packet is 2000 bytes. The speed of
the network should not exceed 100 Mbps; if it exceeds this speed, some packets may
not be analyzed.

Requirements Specification:

Software Environment:

The system will run under the .NET Framework, which must be installed on the system.

Operating Platform : Windows XP

Front End : Visual Studio .NET

Back End : SQL Server 2000

Language : C# .NET

Hardware Environment:

Processor : Pentium IV

RAM : 512MB RAM

HDD : 5 GB

LAN : Enabled

Acceptance Criteria:

Before accepting the system, the developer will have to demonstrate how the system
works on the given data. The developer will have to show by suitable test cases that all
conditions are satisfied.

Architecture Diagrams

Data Flow Diagrams:

Data flow diagrams are made up of a number of symbols, which represent
system components. Most data flow modeling methods use four kinds of symbols.
These symbols are used to represent four kinds of system components: processes, data
stores, data flows and external entities.

The data flow diagrams for the current project are shown in the following figures.
They cover the entire process and specify the major transform centers
in the approach to be followed for producing the software. This is the first step in the
structured design method. In the project, the inputs are the packets that are flowing in
the network interface that is set to promiscuous mode. The output is the information
contained in the packets in human-readable form, which is stored in the output file.

The context diagram and data flow diagram of the proposed system are given
as follows:

Network interface card (promiscuous mode) --Packets--> Protocol Analyzer --Reports--> Administrator

Fig: Context Diagram

packets --> Separate ip & dtl headers
    dtl header --> dtl header analysis --> dtl header info --> output
    ip header --> ip header analysis --> info --> output
    transport layer (tl) header --> tl header analysis --> info --> output

Fig: DFD for the Protocol Analyzer process


Get packets --packets--> Buffer with packets --packets--> Separate headers --headers--> Analyze headers --info in hdr’s--> Update output file

Fig: Data flow diagram for Packet Sniffer

Explanation:

In the diagram, the input is obtained as packets from the network interface by
the ‘Get packets’ process. To do this, the process defines a packet socket, obtains the
raw packets from the network interface, and stores them in a buffer. The buffer
containing the packets is passed to the ‘Separate headers’ process, which strips off
the various headers of each packet and passes them to the ‘Analyze headers’ process,
where they are analyzed and the information is passed on to the ‘Update output file’
process. Here the output file is updated with the latest information obtained from the
preceding processes. The most abstract inputs are the stripped-off headers and the most
abstract output is the information in the headers in human-readable form.
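
The four processes can be sketched as a chain of small Python functions. This is an illustration of the data flow only, not the project's C# code: the function names are assumptions, frames are supplied as byte strings instead of being captured from an adapter, and only the 14-byte Ethernet header is separated and analyzed.

```python
import io
import struct


def separate_headers(frame: bytes):
    """'Separate headers': split off the 14-byte Ethernet header."""
    if len(frame) < 14:
        raise ValueError("an Ethernet frame needs at least 14 bytes")
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return {"dst": dst, "src": src, "ethertype": ethertype}, frame[14:]


def format_mac(raw: bytes) -> str:
    """Render six raw bytes as a colon-separated MAC address."""
    return ":".join(f"{byte:02x}" for byte in raw)


def analyze_headers(header: dict) -> str:
    """'Analyze headers': turn the header into human-readable text."""
    return (f"src={format_mac(header['src'])} "
            f"dst={format_mac(header['dst'])} "
            f"type=0x{header['ethertype']:04x}")


def update_output(out, info: str) -> None:
    """'Update output file': append one line per packet."""
    out.write(info + "\n")


def run_pipeline(frames, out) -> None:
    """Drive the chain; 'frames' stands in for the 'Get packets' process."""
    for frame in frames:
        header, _payload = separate_headers(frame)
        update_output(out, analyze_headers(header))


# A made-up broadcast frame with EtherType 0x0800 (IP) and a dummy payload.
frame = bytes.fromhex("ffffffffffff" "001122334455" "0800") + bytes(20)
report = io.StringIO()
run_pipeline([frame], report)
```

After the call, report.getvalue() holds the single line "src=00:11:22:33:44:55 dst=ff:ff:ff:ff:ff:ff type=0x0800". Keeping each process in its own function mirrors the diagram, so a real capture source or further protocol decoders could be substituted without changing the other stages.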

Structure Charts:

For a function-oriented design, structure charts can represent the design
graphically. The structure of a program is made up of the modules of the program together
with the interconnections between modules. The structure chart of a program is a
graphic representation of its structure. In a structure chart, a box represents a module
with the module name written in the box. The parameters returned as output by a
module are shown as labels on the connecting arrows.
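[Figure: structure chart. MAIN invokes three subordinate modules: Get Input, which returns hdr’s; Protocol Analysis, which receives hdr’s and returns info; and Output, which receives info.]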


In the structure chart, there are three modules: one for input, one for output, and a
third, called the central transform module, which performs the basic transformation for the
system, taking the most abstract input and transforming it into the most abstract output.
The main module’s job is to invoke the subordinates.

Here, there is one input module, which returns the headers in the packet to the
main module. The main module passes these headers to the protocol analysis module,
which transforms them into human readable information. This information is passed to
the main module. The main module passes this information to the output module, which
updates the output files.

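[Figure: Get Input returns hdr’s to the main module and invokes three subordinates: Adapter object, which supplies packetfd; receive packets, which takes packetfd and returns buf; and Strip off headers, which takes buf and returns hdr’s.]

Fig. Input Module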


In the input module, the network interface is switched to promiscuous mode so
that all packets can be captured even if they are not addressed to this host. This is done
by defining an adapter object and reading all the packets into a buffer. Then each packet
is taken in turn, and its various headers are stripped off and sent to the main module.
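As a rough sketch of the input module on Linux, the following Python code opens a packet socket, requests promiscuous mode, and strips the headers from a received frame. It rests on several assumptions: the numeric constants are taken from the Linux headers `<linux/socket.h>` and `<linux/if_packet.h>`, the function names are hypothetical, and opening the socket requires root privileges. Only `strip_headers` can be tried without a live interface.

```python
import socket
import struct

ETH_P_ALL = 0x0003          # Linux: receive frames of every protocol
SOL_PACKET = 263            # assumption: value from <linux/socket.h>
PACKET_ADD_MEMBERSHIP = 1   # assumption: value from <linux/if_packet.h>
PACKET_MR_PROMISC = 1       # assumption: value from <linux/if_packet.h>


def open_packet_socket(interface):
    """Open a raw packet socket bound to `interface` (Linux only, needs root)."""
    sock = socket.socket(socket.AF_PACKET, socket.SOCK_RAW,
                         socket.htons(ETH_P_ALL))
    sock.bind((interface, 0))
    return sock


def enable_promiscuous(sock, interface):
    """Ask the kernel to put `interface` into promiscuous mode for this socket."""
    # struct packet_mreq: int ifindex, unsigned short type, unsigned short
    # address length, 8-byte address (unused for promiscuous mode).
    mreq = struct.pack("iHH8s", socket.if_nametoindex(interface),
                       PACKET_MR_PROMISC, 0, b"")
    sock.setsockopt(SOL_PACKET, PACKET_ADD_MEMBERSHIP, mreq)


def receive_packet(sock):
    """Read one raw frame into a buffer."""
    buf, _address = sock.recvfrom(65535)
    return buf


def strip_headers(frame):
    """Split a frame into its Ethernet header and, for IPv4, the IP header."""
    headers = {"eth": frame[:14]}
    ethertype = struct.unpack("!H", frame[12:14])[0]
    if ethertype == 0x0800 and len(frame) >= 34:
        ihl = (frame[14] & 0x0F) * 4      # IP header length in bytes
        headers["ip"] = frame[14:14 + ihl]
        headers["rest"] = frame[14 + ihl:]
    return headers
```

With a real interface name such as `eth0` (an assumption about the host), the three socket functions would be called in the order shown before passing each buffer to `strip_headers`.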

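[Figure: Protocol Analyzer receives hdr’s and returns info. It passes the ip hdr to the ip module and the arp or rarp hdr to the ARP & RARP module; the ip module in turn invokes the ICMP, IGMP, TCP, and UDP modules. Each module returns info.]

Fig: Protocol Analysis Module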

The central transform module is shown in the figure. In the central transform
module, i.e. the protocol analysis system, the process is split into three modules: ip, arp,
and rarp. The central module decides which of them to invoke based on the type of header
sent by the main module. The ip module makes a further decision about which module to
invoke, again based on the type of header passed to it: it is divided into icmp, igmp, tcp,
and udp modules. The modules are named after the type of header they handle. Each module
knows the format in which the information in its particular header is stored, and converts
it into a readable form from which the packets can be understood in detail. This
information is passed to the main module.
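The two-level decision described above can be sketched as a pair of lookup tables. The EtherType and IP protocol numbers used here are the standard assigned values; the function names and the output format are assumptions made for illustration, and only the tcp and udp modules are filled in.

```python
import struct

# Standard EtherType values for the first-level decision.
ETHERTYPES = {0x0800: "ip", 0x0806: "arp", 0x8035: "rarp"}
# Standard IP protocol numbers for the second-level decision.
IP_PROTOCOLS = {1: "icmp", 2: "igmp", 6: "tcp", 17: "udp"}


def analyze_tcp(segment):
    """Decode the source and destination ports of a TCP header."""
    src, dst = struct.unpack("!HH", segment[:4])
    return f"TCP {src} -> {dst}"


def analyze_udp(datagram):
    """Decode the ports and length field of a UDP header."""
    src, dst, length = struct.unpack("!HHH", datagram[:6])
    return f"UDP {src} -> {dst} len={length}"


def analyze_ip(packet):
    """ip module: decode the addresses, then pick the transport-level module."""
    ihl = (packet[0] & 0x0F) * 4
    name = IP_PROTOCOLS.get(packet[9], f"proto-{packet[9]}")
    src = ".".join(map(str, packet[12:16]))
    dst = ".".join(map(str, packet[16:20]))
    info = f"IP {src} -> {dst} [{name}]"
    handler = {"tcp": analyze_tcp, "udp": analyze_udp}.get(name)
    if handler is not None:
        info += " " + handler(packet[ihl:])
    return info


def analyze(ethertype, payload):
    """Central module: choose ip, arp, or rarp from the frame's EtherType."""
    name = ETHERTYPES.get(ethertype)
    if name == "ip":
        return analyze_ip(payload)
    if name in ("arp", "rarp"):
        return f"{name.upper()} packet, {len(payload)} bytes"
    return f"unhandled EtherType 0x{ethertype:04x}"
```

Using tables rather than nested conditionals keeps each module named after the header it handles, as in the structure chart, and a new protocol only needs a new table entry and handler.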

[Figure: Output receives info from the main module and invokes two subordinates, Append into file and Print reports; an ofstream is exchanged with each.]

Fig: Output Module

This module gets the information stored in the headers of the packets as input
from the main module. The output module is split into two sub-modules. The first
updates the output files with the input obtained from the main module and passes
the file streams back to the ‘Output’ module. These file streams are then passed to the
‘Print reports’ module, where the reports are printed.
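A minimal sketch of the two sub-modules is given below. It departs from the description in one respect, stated here as an assumption: instead of passing an open file stream between modules, the Python version passes the file path. The report format (a count of lines per protocol keyword) is also an assumption, since the report layout is not specified.

```python
from collections import Counter


def append_into_file(path, info_lines):
    """'Append into file': add the new information to the end of the output file."""
    with open(path, "a", encoding="utf-8") as out:
        for line in info_lines:
            out.write(line + "\n")
    return path


def print_report(path):
    """'Print reports': summarize the output file as a count per first keyword."""
    with open(path, encoding="utf-8") as src:
        counts = Counter(line.split()[0] for line in src if line.strip())
    report = "\n".join(f"{proto}: {n}" for proto, n in sorted(counts.items()))
    print(report)
    return report
```

Because the file is opened in append mode, earlier captures are preserved and the report always reflects everything logged so far.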

UML Diagrams:

Unified Modeling Language


Models in the UML can be mapped to a programming language. This mapping permits
forward engineering: the generation of code from a UML model into a programming
language. The reverse is also possible: you can
reconstruct a model from an implementation back into the UML. Reverse engineering
is not magic. Unless you encode that information in the implementation, information is
lost when moving forward from models to code. Reverse engineering thus requires tool
support with human intervention. Combining these two paths of forward code
generation and reverse engineering yields round-trip engineering, meaning the ability
to work in either a graphical or a textual view, while tools keep the two views
consistent.

In addition to this direct mapping, the UML is sufficiently expressive and
unambiguous to permit the direct execution of models, the simulation of systems, and
the instrumentation of running systems.

The UML is a Language for Documenting:

A healthy software organization produces all sorts of artifacts in addition to raw
executable code. These artifacts include (but are not limited to)

• Requirements
• Architecture
• Design
• Source code
• Project plans
• Tests
• Prototypes
• Releases

Depending on the development culture, some of these artifacts are treated more
or less formally than others. Such artifacts are not only the deliverables of a project,
they are also critical in controlling, measuring, and communicating about a system
during its development and after its deployment.

The UML addresses the documentation of a system's architecture and all of its
details. The UML also provides a language for expressing requirements and for tests.
Finally, the UML provides a language for modeling the activities of project planning
and release management.

Applications:

The UML is intended primarily for software-intensive systems. It has been used
effectively for such domains as

• Enterprise information systems
• Banking and financial services
• Telecommunications
• Transportation
• Defense/aerospace
• Retail
• Medical electronics
• Scientific
• Distributed Web-based services

The UML is not limited to modeling software. In fact, it is expressive enough
to model non-software systems, such as workflow in the legal system, the structure and
behavior of a patient healthcare system, and the design of hardware.

A Conceptual Model of the UML:

To understand the UML, you need to form a conceptual model of the language,
and this requires learning three major elements: the UML's basic building blocks, the
rules that dictate how those building blocks may be put together, and some common
mechanisms that apply throughout the UML. Once you have grasped these ideas, you
will be able to read UML models and create some basic ones. As you gain more
experience in applying the UML, you can build on this conceptual model, using more
advanced features of the language.

Building Blocks of the UML:

The vocabulary of the UML encompasses three kinds of building blocks:

• Things
• Relationships
• Diagrams

Things are the abstractions that are first-class citizens in a model; relationships
tie these things together; diagrams group interesting collections of things.

Things in the UML:

There are four kinds of things in the UML:

• Structural things
• Behavioral things
• Grouping things
• Annotational things

These things are the basic object-oriented building blocks of the UML. You use
them to write well-formed models.

Structural Things:

Structural things are the nouns of UML models. These are the mostly static
parts of a model, representing elements that are either conceptual or physical. In all,
there are seven kinds of structural things.

First, a class is a description of a set of objects that share the same attributes,
operations, relationships, and semantics. A class implements one or more interfaces.
Graphically, a class is rendered as a rectangle, usually including its name, attributes,
and operations.

CLASS:
Second, an interface is a collection of operations that specify a service of a class
or component. An interface therefore describes the externally visible behavior of that
element. An interface might represent the complete behavior of a class or component
or only a part of that behavior. An interface defines a set of operation specifications
(that is, their signatures) but never a set of operation implementations. Graphically, an
interface is rendered as a circle together with its name. An interface rarely stands alone.
Rather, it is typically attached to the class or component that realizes the interface.

INTERFACE:

Third, a collaboration defines an interaction and is a society of roles and other
elements that work together to provide some cooperative behavior that's bigger than
the sum of all the elements. Therefore, collaborations have structural, as well as
behavioral, dimensions. A given class might participate in several collaborations.
These collaborations therefore represent the implementation of patterns that make up a
system. Graphically, a collaboration is rendered as an ellipse with dashed lines, usually
including only its name.

Fourth, a use case is a description of a set of sequences of actions that a system
performs that yields an observable result of value to a particular actor. A use case is
used to structure the behavioral things in a model. A use case is realized by a
collaboration. Graphically, a use case is rendered as an ellipse with solid lines, usually
including only its name.

The remaining three things (active classes, components, and nodes) are all
class-like, meaning they also describe a set of objects that share the same attributes,
operations, relationships, and semantics. However, these three are different enough and
are necessary for modeling certain aspects of an object-oriented system, and so they
warrant special treatment.

Fifth, an active class is a class whose objects own one or more processes or
threads and therefore can initiate control activity. An active class is just like a class
except that its objects represent elements whose behavior is concurrent with other
elements. Graphically, an active class is rendered just like a class, but with heavy lines,
usually including its name, attributes, and operations.

The remaining two elements (components and nodes) are also different. They
represent physical things, whereas the previous five things represent conceptual or
logical things.

Sixth, a component is a physical and replaceable part of a system that conforms
to and provides the realization of a set of interfaces. In a system, you'll encounter
different kinds of deployment components, such as COM+ components or Java Beans,
as well as components that are artifacts of the development process, such as source code
files. A component typically represents the physical packaging of otherwise logical
elements, such as classes, interfaces, and collaborations. Graphically, a component is
rendered as a rectangle with tabs, usually including only its name.

COMPONENT:
Seventh, a node is a physical element that exists at run time and represents a
computational resource, generally having at least some memory and, often, processing
capability. A set of components may reside on a node and may also migrate from node
to node. Graphically, a node is rendered as a cube, usually including only its name.

Behavioral Things:

Behavioral things are the dynamic parts of UML models. These are the verbs of
a model, representing behavior over time and space. In all, there are two primary kinds
of behavioral things.

First, an interaction is a behavior that comprises a set of messages exchanged
among a set of objects within a particular context to accomplish a specific purpose. The
behavior of a society of objects or of an individual operation may be specified with an
interaction. An interaction involves a number of other elements, including messages,
action sequences (the behavior invoked by a message), and links (the connection
between objects). Graphically, a message is rendered as a directed line, almost always
including the name of its operation.

Second, a state machine is a behavior that specifies the sequences of states an
object or an interaction goes through during its lifetime in response to events, together
with its responses to those events. The behavior of an individual class or a collaboration
of classes may be specified with a state machine.

A state machine involves a number of other elements, including states,
transitions (the flow from state to state), events (things that trigger a transition), and
activities (the response to a transition). Graphically, a state is rendered as a rounded
rectangle, usually including its name and its substates, if any.

STATE:
These two elements—interactions and state machines—are the basic behavioral
things that you may include in a UML model. Semantically, these elements are usually
connected to various structural elements, primarily classes, collaborations, and objects.
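To make the vocabulary of states, events, transitions, and activities concrete, here is a small sketch of a state machine in Python. The capture session it models, with its three states and two events, is a hypothetical example chosen to fit this project and is not taken from the project's own state chart.

```python
class CaptureSession:
    """A hypothetical capture session modeled as a state machine."""

    # (current state, event) -> next state; each entry is one transition.
    TRANSITIONS = {
        ("idle", "start"): "capturing",
        ("capturing", "stop"): "stopped",
        ("stopped", "start"): "capturing",
    }

    def __init__(self):
        self.state = "idle"
        self.log = []

    def handle(self, event):
        """Apply an event; reject it if no transition is defined for it."""
        next_state = self.TRANSITIONS.get((self.state, event))
        if next_state is None:
            raise ValueError(f"event {event!r} not allowed in state {self.state!r}")
        # Activity performed in response to the transition: record it.
        self.log.append(f"{self.state} -> {next_state}")
        self.state = next_state
        return self.state
```

For example, a new session that handles "start" and then "stop" ends in the "stopped" state, while "stop" on an idle session raises an error because no such transition exists.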

Grouping Things:

Grouping things are the organizational parts of UML models. These are the
boxes into which a model can be decomposed. In all, there is one primary kind of
grouping thing, namely, packages.

A package is a general-purpose mechanism for organizing elements into
groups. Structural things, behavioral things, and even other grouping things may be
placed in a package. Unlike components (which exist at run time), a package is purely
conceptual (meaning that it exists only at development time). Graphically, a package
is rendered as a tabbed folder, usually including only its name and, sometimes, its
contents.

PACKAGE:

Packages are the basic grouping things with which you may organize a UML
model. There are also variations, such as frameworks, models, and subsystems (kinds
of packages).

Annotational Things:

Annotational things are the explanatory parts of UML models. These are the
comments you may apply to describe, illuminate, and remark about any element in a
model. There is one primary kind of annotational thing, called a note. A note is simply
a symbol for rendering constraints and comments attached to an element or a collection
of elements. Graphically, a note is rendered as a rectangle with a dog-eared corner,
together with a textual or graphical comment.

NOTES:

This element is the one basic annotational thing you may include in a UML
model. You'll typically use notes to adorn your diagrams with constraints or comments
that are best expressed in informal or formal text. There are also variations on this
element, such as requirements (which specify some desired behavior from the
perspective of outside the model).

Relationships in the UML :

There are four kinds of relationships in the UML:

1. Dependency

2. Association

3. Generalization

4. Realization

These relationships are the basic relational building blocks of the UML. You
use them to write well-formed models.

First, a dependency is a semantic relationship between two things in which a
change to one thing (the independent thing) may affect the semantics of the other thing
(the dependent thing). Graphically, a dependency is rendered as a dashed line, possibly
directed, and occasionally including a label.

DEPENDENCY:

ASSOCIATION:

Employer employee

Second, an association is a structural relationship that describes a set of links, a
link being a connection among objects. Aggregation is a special kind of association,
representing a structural relationship between a whole and its parts. Graphically, an
association is rendered as a solid line, possibly directed, occasionally including a label,
and often containing other adornments, such as multiplicity and role names.

Third, a generalization is a specialization/generalization relationship in which
objects of the specialized element (the child) are substitutable for objects of the
generalized element (the parent). In this way, the child shares the structure and the
behavior of the parent. Graphically, a generalization relationship is rendered as a solid
line with a hollow arrowhead pointing to the parent.

GENERALIZATION:

Fourth, a realization is a semantic relationship between classifiers, wherein
one classifier specifies a contract that another classifier guarantees to carry out. You'll
encounter realization relationships in two places: between interfaces and the classes or
components that realize them, and between use cases and the collaborations that realize
them. Graphically, a realization relationship is rendered as a cross between a
generalization and a dependency relationship.

REALIZATION:
These four elements are the basic relational things you may include in a UML
model. There are also variations on these four, such as refinement, trace, include, and
extend (for dependencies).
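The four relationships can also be seen in code. The following Python sketch is one possible mapping, offered as an illustration only: all class names are hypothetical and chosen to fit the packet sniffer, and an abstract base class stands in for a UML interface.

```python
from abc import ABC, abstractmethod


class HeaderAnalyzer(ABC):
    """Plays the role of a UML interface: a contract with no implementation."""

    @abstractmethod
    def analyze(self, header: bytes) -> str:
        ...


class IpAnalyzer(HeaderAnalyzer):
    """Realization: IpAnalyzer carries out the HeaderAnalyzer contract."""

    def analyze(self, header: bytes) -> str:
        return f"IP header, {len(header)} bytes"


class VerboseIpAnalyzer(IpAnalyzer):
    """Generalization: a child that is substitutable for its parent."""

    def analyze(self, header: bytes) -> str:
        return super().analyze(header) + f", TTL={header[8]}"


class Report:
    """Collects the readable lines produced during analysis."""

    def __init__(self):
        self.lines = []

    def add(self, line: str) -> None:
        self.lines.append(line)


class Sniffer:
    def __init__(self, analyzer: HeaderAnalyzer):
        # Association: the sniffer holds a lasting link to its analyzer.
        self.analyzer = analyzer

    def process(self, header: bytes, report: Report) -> None:
        # Dependency: Report is only used as a parameter, so a change to
        # Report may affect this method without any stored link existing.
        report.add(self.analyzer.analyze(header))
```

Because `VerboseIpAnalyzer` is substitutable for `IpAnalyzer`, either can be handed to `Sniffer` without changing `Sniffer` itself.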

The five views of architecture are discussed in the following section.

Rules of the UML:

The UML's building blocks can't simply be thrown together in a random
fashion. Like any language, the UML has a number of rules that specify what a
well-formed model should look like. A well-formed model is one that is semantically
self-consistent and in harmony with all its related models.

The UML has semantic rules for

Table 2.1 Semantic rules

• Names: what you can call things, relationships, and diagrams
• Scope: the context that gives specific meaning to a name
• Visibility: how those names can be seen and used by others
• Integrity: how things properly and consistently relate to one another
• Execution: what it means to run or simulate a dynamic model

Models built during the development of a software-intensive system tend to
evolve and may be viewed by many stakeholders in different ways and at different
times. For this reason, it is common for the development team to build not only models
that are well-formed, but also models that are

Table 2.2 Models

• Elided: certain elements are hidden to simplify the view
• Incomplete: certain elements may be missing
• Inconsistent: the integrity of the model is not guaranteed

These less-than-well-formed models are unavoidable as the details of a system
unfold and churn during the software development life cycle. The rules of the UML
encourage you, but do not force you, to address the most important analysis, design,
and implementation questions that push such models to become well-formed over time.

Diagrams in the UML:

A diagram is the graphical presentation of a set of elements, most often rendered
as a connected graph of vertices (things) and arcs (relationships). You draw diagrams
to visualize a system from different perspectives, so a diagram is a projection into a
system. For all but the most trivial systems, a diagram represents an elided view of the
elements that make up a system. The same element may appear in all diagrams, only a
few diagrams (the most common case), or in no diagrams at all (a very rare case). In
theory, a diagram may contain any combination of things and relationships. In practice,
however, a small number of common combinations arise, which are consistent with the
five most useful views that comprise the architecture of a software-intensive system.
For this reason, the UML includes nine such diagrams:

• Use case diagram
• Class diagram
• Sequence diagram
• Collaboration diagram
• State chart diagram
• Activity diagram
• Component diagram
• Deployment diagram
• Object diagram

A class diagram shows a set of classes, interfaces, and collaborations and their
relationships. Class diagrams are the most common diagrams found in modeling
object-oriented systems. Class diagrams address the static design view of a system. Class
diagrams that include active classes address the static process view of a system.

A use case diagram shows a set of use cases and actors (a special kind of class)
and their relationships. Use case diagrams address the static use case view of a system.
These diagrams are especially important in organizing and modeling the behaviors of a
system.

Both sequence diagrams and collaboration diagrams are kinds of interaction
diagrams. An interaction diagram shows an interaction, consisting of a set of objects and
their relationships, including the messages that may be dispatched among them. Interaction
diagrams address the dynamic view of a system.

A sequence diagram is an interaction diagram that emphasizes the time-ordering
of messages; a collaboration diagram is an interaction diagram that
emphasizes the structural organization of the objects that send and receive messages.
Sequence diagrams and collaboration diagrams are isomorphic, meaning that you can
take one and transform it into the other.

A state chart diagram shows a state machine, consisting of states, transitions,
events, and activities. State chart diagrams address the dynamic view of a system. They
are especially important in modeling the behavior of an interface, class, or collaboration
and emphasize the event-ordered behavior of an object, which is especially useful in
modeling reactive systems.

An activity diagram is a special kind of state chart diagram that shows the
flow from activity to activity within a system. Activity diagrams address the dynamic
view of a system. They are especially important in modeling the function of a system
and emphasize the flow of control among objects.

A component diagram shows the organization of and dependencies among a set
of components. Component diagrams address the static implementation view of a
system. They are related to class diagrams in that a component typically maps to one
or more classes, interfaces, or collaborations.

A deployment diagram shows the configuration of run-time processing nodes
and the components that live on them. Deployment diagrams address the static
deployment view of an architecture. They are related to component diagrams in that a node
typically encloses one or more components.

An object diagram shows a set of objects and their relationships. Object
diagrams represent static snapshots of instances of the things found in class diagrams.
These diagrams address the static design view or static process view of a system as do
class diagrams, but from the perspective of real or prototypical cases.

This is not a closed list of diagrams. Tools may use the UML to provide other
kinds of diagrams, although these nine are by far the most common you will encounter
in practice.

USE CASE DIAGRAMS:

A use case describes a sequence of actions that provide something of measurable
value to an actor and is drawn as a horizontal ellipse. An actor is a person, organization,
or external system that plays a role in one or more interactions with your system.

Add/remove IPs: Administrator can add or remove IP addresses from the database to
block or allow receiving packets from that particular system on the domain.

View report: Administrator can view the information of the decoded packets in the
user interface of the tool.

Save/print reports: Administrator can save the information to the file system or can
print the information through the printer, if one is connected.

Use case scenario:

Use case name: Admin functions

Participant actor: Administrator

Flow of events: 1. System Starts Monitoring.

2. User presses the capture button.


Class diagram:

A class diagram describes the static structure of the symbols in your new
system. It is a graphical presentation of the static view that shows a collection of
declarative (static) model elements, such as classes, types, and their contents and
relationships.
Sequence diagram:

UML sequence diagrams model the flow of logic within your system in a
visual manner, enabling you both to document and validate your logic, and are
commonly used for both analysis and design purposes. Sequence diagrams are the most
popular UML artifacts for dynamic modeling, which focuses on identifying the
behavior within your system.

Sequence diagram for Administrator:


Collaboration diagram for Administrator:

A collaboration diagram is an interaction diagram that emphasizes the structural
organization of the objects that send and receive messages. Sequence diagrams and
collaboration diagrams are isomorphic, meaning that you can take one and transform it
into the other.
State chart Diagram:
Project Methodology

Getting Wireshark

Download Wireshark from the given website

https://www.wireshark.org/download.html
Starting Wireshark
When you run the Wireshark program, the Wireshark graphical user interface will be
shown as in Figure 5. Initially, the program is not capturing packets.

Then, you need to choose an interface. If you are running Wireshark on your
laptop, you will typically select the Wi-Fi interface. If you are at a desktop, select
the Ethernet interface being used. Note that there could be multiple interfaces. In
general, you can select any interface, but that does not mean that traffic will flow
through that interface. The network interfaces (i.e., the physical connections) that
your computer has to the network are shown in Figure 6, which was taken from
my computer. After you select the interface, you can click Start to capture the packets,
as shown in Figure 7.
Figure 6: Capture Interfaces in Wireshark

Figure 7: Capturing Packets in Wireshark


Figure 8: Wireshark Graphical User Interface on Microsoft Windows

The packet-header details window provides details about the packet selected
(highlighted) in the packet-listing window. (To select a packet in the packet-listing
window, place the cursor over the packet’s one-line summary in the packet-listing
window and click with the left mouse button.). These details include information
about the Ethernet frame and IP datagram that contains this packet. The amount of
Ethernet and IP-layer detail displayed can be expanded or minimized by clicking on
the right-pointing or down-pointing arrowhead to the left of the Ethernet frame or IP
datagram line in the packet details window. If the packet has been carried over TCP
or UDP, TCP or UDP details will also be displayed, which can similarly be expanded
or minimized. Finally, details about the highest-level protocol that sent or received
this packet are also provided.

The packet-contents window displays the entire contents of the captured frame, in
both ASCII and hexadecimal format.

Towards the top of the Wireshark graphical user interface is the packet display filter
field, into which a protocol name or other information can be entered in order to filter
the information displayed in the packet-listing window (and hence the packet-header
and packet-contents windows). In the example below, we’ll use the packet-display
filter field to have Wireshark hide (not display) packets except those that correspond
to HTTP messages.

Capturing Packets

After downloading and installing Wireshark, you can launch it and click the name of
an interface under Interface List to start capturing packets on that interface. For
example, if you want to capture traffic on the wireless network, click your wireless
interface.

Test Run
Do the following steps:

1. Start up the Wireshark program (select an interface and press start to capture
packets).

2. Start up your favorite browser (Iceweasel in Kali Linux).

3. In your browser, go to the Wayne State homepage by typing www.wayne.edu.

4. After your browser has displayed the http://www.wayne.edu page, stop
Wireshark packet capture by selecting stop in the Wireshark capture window.
This will cause the Wireshark capture window to disappear and the main
Wireshark window to display all packets captured since you began packet
capture (see image below):

5. Color Coding: You’ll probably see packets highlighted in green, blue, and
black. Wireshark uses colors to help you identify the types of traffic at a
glance. By default, green is TCP traffic, dark blue is DNS traffic, light blue is
UDP traffic, and black identifies TCP packets with problems — for example,
they could have been delivered out-of-order.

6. You now have live packet data that contains all protocol messages exchanged
between your computer and other network entities! However, as you will
notice the HTTP messages are not clearly shown because there are many other
packets included in the packet capture. Even though the only action you took
was to open your browser, there are many other programs in your computer
that communicate via the network in the background. To filter the connections
to the ones we want to focus on, we have to use the filtering functionality of
Wireshark by typing “http” in the filtering field as shown below:

Notice that we now view only the packets that are of protocol HTTP. However, we
still do not have the exact communication we want to focus on, because using
HTTP as a filter is not descriptive enough to allow us to find our connection to
http://www.wayne.edu. We need to be more precise if we want to capture the correct
set of packets.

7. To further filter packets in Wireshark, we need to use a more precise filter. By
setting the filter to http.host==www.wayne.edu, we restrict the view to packets
whose HTTP host is the www.wayne.edu website. Notice that we need
two equal signs (“==”) to perform the match, not just one. See the screenshot
below:
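Conceptually, the http.host filter keeps only those HTTP requests whose Host header equals the given name. The Python sketch below imitates that test on a raw request payload. It is an illustration of the idea only and makes no claim about how Wireshark implements its filters; the function names are hypothetical.

```python
def http_host(payload: bytes):
    """Return the value of the Host header in an HTTP request, or None."""
    head = payload.split(b"\r\n\r\n", 1)[0]      # header section only
    for line in head.split(b"\r\n")[1:]:          # skip the request line
        name, sep, value = line.partition(b":")
        if sep and name.strip().lower() == b"host":
            return value.strip().decode("ascii", "replace")
    return None


def matches_host(payload: bytes, host: str) -> bool:
    """Imitate the comparison performed by a filter such as http.host==host."""
    return http_host(payload) == host
```

A request for www.wayne.edu would pass `matches_host(payload, "www.wayne.edu")`, while background HTTP traffic to other hosts would be hidden.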
8. Now, we can try another protocol. Let’s use the Domain Name System (DNS) protocol
as an example here.

9. Let’s now find out what those packets contain by following one of the
conversations (also called network flows). Select one of the packets and press the right
mouse button (on a Mac, hold the Control key and click); you should see
something similar to the screen below:
Click on Follow UDP Stream, and then you will see the following screen.

10. If we close this window, change the filter back to
“http.host==www.wayne.edu”, and then follow a packet from the list of packets that
match that filter, we should get something similar to the following screens. Note
that we click on Follow TCP Stream this time.
Screenshots
Conclusion

In practice, there is hardly a typical network problem that cannot be discovered and
diagnosed using packet sniffer technology. Sniffers can be used as the first line of attack
on a number of issues that vary from overloaded networks to unresponsive switches to lost
packets. As the number of networks and nodes continues to grow and as network speeds
accelerate, it becomes more and more difficult to monitor a LAN by using traditional
tools, such as RMON (Remote Monitoring) probes. Packet sniffers, by contrast,
monitor traffic on the network right down to the header information of each packet.
This means that you can actually track data from its starting point to its end point.
Packet sniffers can also be used to identify the types of packets on a network and
discover whether or not a specific packet has any errors.

Data is sent across the internet in the form of packets. Packet sniffing can be used for
the benefit of a network or for malicious purposes. It can monitor and analyze traffic
and help with network research. It can also be used by adversaries in order to steal
plaintext data or watch a user’s actions. Software exists to help detect sniffers on a
network. Business systems often set these in place in order to keep data safe. Without
using modern defenses and best practices, data sent across the network can be easily
seen by attackers. It’s important to verify that the sites you access are using the
available safeguards, namely encryption, and to avoid the sites that are not.

The above experiment points to the need for IDS/IPS devices in any typical network. We
have also highlighted the capabilities of Wireshark in packet data interpretation and
data handling. Wireshark, in this experiment, has been used primarily for ACL
(Access Control List) filtering. Many other variations of filtering are available in the
Wireshark utility, such as filtering based on packet size, filtering based on the protocols
used, filtering on substrings, etc. Thus, with proper use of filtering commands and
complementary utilities, Wireshark could be developed into comprehensive intrusion
detection software.

Future Scope

Wireshark as a network protocol analyzer has already proven its mettle in all
necessary realms. However, it still has scope for improvement as far as alert
generation and heuristic development are concerned. We are working to introduce
certain utilities into the source code of Wireshark to overcome the above shortcomings
by making Wireshark capable of generating alerts.

As part of future work for this project, the ability to detect attacks can be
tested against many other Snort rules. With good test criteria and proper network
logs, all the Snort rules can be examined and tested in order to determine the
performance of the system in detecting threats. Therefore, this project sheds light
on the scope of security policy design and network
