
THREE WAY HANDSHAKE

 Three-Way Handshake
 The following scenario occurs when a TCP connection is established:

 The server must be prepared to accept an incoming connection. This is normally
done by calling socket, bind, and listen and is called a passive open.

 The client issues an active open by calling connect. This causes the client TCP to
send a "synchronize" (SYN) segment, which tells the server the client's initial
sequence number for the data that the client will send on the connection.
Normally, there is no data sent with the SYN; it just contains an IP header, a
TCP header, and possible TCP options (which we will talk about shortly).

 The server must acknowledge (ACK) the client's SYN and the server must also
send its own SYN containing the initial sequence number for the data that the
server will send on the connection. The server sends its SYN and the ACK of
the client's SYN in a single segment.

 The client must acknowledge the server's SYN. The minimum number of
packets required for this exchange is three; hence, this is called TCP's three-way
handshake.
Figure: Packet exchange for TCP connection
THE INTERNET (IPV4) SOCKET ADDRESS STRUCTURE
struct in_addr {
    in_addr_t s_addr;           /* 32-bit IPv4 address */
                                /* network byte ordered */
};

struct sockaddr_in {
    uint8_t sin_len;            /* length of structure (16) */
    sa_family_t sin_family;     /* AF_INET */
    in_port_t sin_port;         /* 16-bit TCP or UDP port number */
                                /* network byte ordered */
    struct in_addr sin_addr;    /* 32-bit IPv4 address */
                                /* network byte ordered */
    char sin_zero[8];           /* unused */
};
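As a minimal sketch of how these fields are typically filled, the helper below (fill_ipv4 is a hypothetical name, not part of the slides) takes an address and port in host byte order and stores them in network byte order. It leaves out sin_len, which is assumed here to exist only on BSD-derived systems.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fill *sa from a host-byte-order address and port.
 * Both values are converted to network byte order before being stored. */
void fill_ipv4(struct sockaddr_in *sa, uint32_t host_addr, uint16_t host_port)
{
    memset(sa, 0, sizeof(*sa));              /* also clears sin_zero */
    sa->sin_family = AF_INET;
    sa->sin_port = htons(host_port);         /* 16-bit host -> network */
    sa->sin_addr.s_addr = htonl(host_addr);  /* 32-bit host -> network */
}
```

After fill_ipv4(&sa, INADDR_LOOPBACK, 0x1234), the two bytes of sa.sin_port are 0x12 followed by 0x34 in memory on any host, which is what "network byte ordered" means in the comments above.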
THE GENERIC SOCKET ADDRESS STRUCTURE

struct sockaddr {
    uint8_t sa_len;
    sa_family_t sa_family;  /* address family: AF_xxx value */
    char sa_data[14];       /* protocol-specific address */
};
BYTE ORDERING
BYTE ORDERING FUNCTIONS
INET_ATON, INET_ADDR, AND INET_NTOA FUNCTIONS
 We will describe two groups of address conversion
functions in this section and the next. They convert
Internet addresses between ASCII strings (what humans
prefer to use) and network byte ordered binary values
(values that are stored in socket address structures).

 inet_aton and inet_addr convert an IPv4 address from a dotted-decimal
string (e.g., "206.168.112.96") to its 32-bit network byte ordered
binary value; inet_ntoa performs the reverse conversion.

 The newer functions, inet_pton and inet_ntop, handle
both IPv4 and IPv6 addresses.
 We describe these two functions in the next section and
use them throughout the text.
INET_PTON AND INET_NTOP FUNCTIONS
 These two functions are new with IPv6 and work with
both IPv4 and IPv6 addresses.
 We use these two functions throughout the text. The
letters "p" and "n" stand for presentation and numeric.
The presentation format for an address is often an ASCII
string and the numeric format is the binary value that
goes into a socket address structure.
READN, WRITEN, AND READLINE FUNCTIONS
 Stream sockets (e.g., TCP sockets) exhibit a behavior with the
read and write functions that differs from normal file I/O. A
read or write on a stream socket might input or output fewer
bytes than requested, but this is not an error condition. The
reason is that buffer limits might be reached for the socket in
the kernel.
 All that is required to input or output the remaining bytes is
for the caller to invoke the read or write function again. Some
versions of Unix also exhibit this behavior when writing more
than 4,096 bytes to a pipe.
 This scenario is always a possibility on a stream socket with
read, but is normally seen with write only if the socket is
nonblocking.
 Nevertheless, we always call our writen function instead of
write, in case the implementation returns a short count.
 We provide the following three functions that we use whenever
we read from or write to a stream socket:
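A minimal sketch of readn and writen is given here, following the usual loop-until-done pattern; it is one plausible implementation, not necessarily the exact code the slides had in mind, and readline is omitted for brevity.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Read exactly n bytes unless EOF or an error comes first.
 * Returns the number of bytes read (less than n only at EOF), or -1. */
ssize_t readn(int fd, void *buf, size_t n)
{
    char *p = buf;
    size_t left = n;

    while (left > 0) {
        ssize_t r = read(fd, p, left);
        if (r < 0) {
            if (errno == EINTR)
                continue;            /* interrupted: just call read again */
            return -1;
        }
        if (r == 0)
            break;                   /* EOF */
        p += r;
        left -= (size_t)r;
    }
    return (ssize_t)(n - left);
}

/* Write exactly n bytes, retrying after short counts. Returns n or -1. */
ssize_t writen(int fd, const void *buf, size_t n)
{
    const char *p = buf;
    size_t left = n;

    while (left > 0) {
        ssize_t w = write(fd, p, left);
        if (w <= 0) {
            if (w < 0 && errno == EINTR)
                continue;            /* interrupted: call write again */
            return -1;
        }
        p += w;
        left -= (size_t)w;
    }
    return (ssize_t)n;
}
```

The same functions work on any descriptor, so they can be tried on a pipe before being used on a socket.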
ELEMENTARY SOCKET FUNCTION
SOCKET FUNCTION
 To perform network I/O, the first thing a process must
do is call the socket function, specifying the type of
communication protocol desired (TCP using IPv4, UDP
using IPv6, Unix domain stream protocol, etc.).
 #include <sys/socket.h>
 int socket (int family, int type, int protocol);
 Returns: non-negative descriptor if OK, -1 on error

 family specifies the protocol family and is one of the AF_xxx constants
(e.g., AF_INET or AF_INET6).
 This argument is often referred to as domain instead of family.
 The protocol argument to the socket function should be set to
the specific protocol or 0 to select the system's default for the
given combination of family and type.
Family       Description
AF_INET      IPv4 protocol family
AF_INET6     IPv6 protocol family

Type         Description
SOCK_STREAM  Stream socket
SOCK_DGRAM   Datagram socket
SOCK_RAW     Raw socket

Protocol     Description
IPPROTO_TCP  TCP transport protocol
IPPROTO_UDP  UDP transport protocol
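To make the family/type/protocol combinations concrete, here is a small sketch. The helper name socket_type_of is hypothetical; it creates a socket and asks the kernel, through the SO_TYPE option, which type it recorded.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a socket and report the type the kernel recorded for it.
 * Returns the SO_TYPE value (e.g., SOCK_STREAM), or -1 on any error. */
int socket_type_of(int family, int type, int protocol)
{
    int fd = socket(family, type, protocol);
    if (fd < 0)
        return -1;                   /* unsupported combination, etc. */

    int so_type = -1;
    socklen_t len = sizeof(so_type);
    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &so_type, &len) < 0)
        so_type = -1;

    close(fd);
    return so_type;
}
```

For example, socket_type_of(AF_INET, SOCK_STREAM, 0) selects the default protocol for that pair (TCP), while a mismatched request such as SOCK_STREAM with IPPROTO_UDP is expected to be rejected by socket.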
CONNECT FUNCTION
 The connect function is used by a TCP client to establish a
connection with a TCP server.
 #include <sys/socket.h>
 int connect(int sockfd, const struct sockaddr *servaddr, socklen_t addrlen);
 Returns: 0 if OK, -1 on error
 sockfd is a socket descriptor returned by the socket function. The
second and third arguments are a pointer to a socket address
structure and its size.
 The socket address structure must contain the IP address and port
number of the server.
 The client does not have to call bind (which we will describe in the
next section) before calling connect: the kernel will choose both an
ephemeral port and the source IP address if necessary.
 In the case of a TCP socket, the connect function initiates TCP's three-way
handshake (Section 2.6). The function returns only when the connection is
established or an error occurs. There are several different error returns
possible.
 If the client TCP receives no response to its SYN segment, ETIMEDOUT
is returned.
 4.4BSD, for example, sends one SYN when connect is called, another 6 seconds
later, and another 24 seconds later (p. 828 of TCPv2). If no response is
received after a total of 75 seconds, the error is returned.
 If the server's response to the client's SYN is a reset (RST), this indicates
that no process is waiting for connections on the server host at the port
specified (i.e., the server process is probably not running). This is a hard
error and the error ECONNREFUSED is returned to the client as soon as
the RST is received.
 An RST is a type of TCP segment that is sent by TCP when something is
wrong. Three conditions that generate an RST are:
 when a SYN arrives for a port that has no listening server (what we just
described), when TCP wants to abort an existing connection, and when
TCP receives a segment for a connection that does not exist.
 If the client's SYN elicits an ICMP "destination unreachable" from some
intermediate router, this is considered a soft error. The client kernel saves
the message but keeps sending SYNs with the same time between each
SYN as in the first scenario.
 If no response is received after some fixed amount of time (75 seconds for
4.4BSD), the saved ICMP error is returned to the process as either
EHOSTUNREACH or ENETUNREACH.
 It is also possible that the remote system is not reachable by any route in
the local system's forwarding table, or that the connect call returns
without waiting at all.
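The error cases above can be told apart by inspecting errno after connect fails. The following is a sketch only; try_connect and connect_error_hint are hypothetical helper names, and the hint strings simply restate the cases described above.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Attempt a blocking TCP connect to ip:port (port in host byte order).
 * Returns 0 on success, -1 for a bad address string, otherwise the errno
 * value left by socket or connect. The socket is closed before returning. */
int try_connect(const char *ip, unsigned short port)
{
    struct sockaddr_in servaddr;
    int result = 0;

    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &servaddr.sin_addr) != 1)
        return -1;

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return errno;
    if (connect(fd, (struct sockaddr *)&servaddr, sizeof(servaddr)) < 0)
        result = errno;              /* save before close can change it */
    close(fd);
    return result;
}

/* Map the errors discussed above to a short explanation. */
const char *connect_error_hint(int err)
{
    switch (err) {
    case 0:            return "connected";
    case ETIMEDOUT:    return "no response to SYN";
    case ECONNREFUSED: return "RST received: no listener on that port";
    case EHOSTUNREACH:
    case ENETUNREACH:  return "ICMP destination unreachable (soft error)";
    default:           return "other error";
    }
}
```

Connecting to a local port that is bound but not listening is a quick way to observe the ECONNREFUSED case, since the local TCP answers the SYN with an RST.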
BIND FUNCTION
 The bind function assigns a local protocol address to a socket. With the
Internet protocols, the protocol address is the combination of either a 32-bit
IPv4 address or a 128-bit IPv6 address, along with a 16-bit TCP or
UDP port number.
 #include <sys/socket.h>
 int bind (int sockfd, const struct sockaddr *myaddr, socklen_t addrlen);
 Returns: 0 if OK, -1 on error
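A short sketch of bind in use follows. The helper bind_loopback is a hypothetical name; it relies on the common convention that a port of 0 asks the kernel to choose an ephemeral port, and it uses getsockname to learn which port was assigned.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Bind fd to 127.0.0.1 and the given port (0 lets the kernel pick one).
 * Returns the bound port in host byte order, or -1 on error. */
int bind_loopback(int fd, unsigned short port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        return -1;                   /* e.g., EADDRINUSE */
    if (getsockname(fd, (struct sockaddr *)&addr, &len) < 0)
        return -1;
    return ntohs(addr.sin_port);
}
```

Binding a second socket to the port returned for the first one is expected to fail with EADDRINUSE unless an option such as SO_REUSEADDR is set.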
LISTEN FUNCTION
 The listen function is called only by a TCP server and it performs two
actions:
 1. When a socket is created by the socket function, it is assumed to be an
active socket, that is, a client socket that will issue a connect. The listen
function converts an unconnected socket into a passive socket, indicating
that the kernel should accept incoming connection requests directed to this
socket.
In terms of the TCP state transition diagram (Figure 2.4), the call to listen
moves the socket from the CLOSED state to the LISTEN state.
 2. The second argument to this function specifies the maximum number of
connections the kernel should queue for this socket.
 #include <sys/socket.h>
 int listen (int sockfd, int backlog);
 Returns: 0 if OK, -1 on error
 This function is normally called after both the socket and bind functions
and must be called before calling the accept function.
 To understand the backlog argument, we must realize that for a given
listening socket, the kernel maintains two queues:
 1. An incomplete connection queue, which contains an entry for each SYN
that has arrived from a client for which the server is awaiting completion
of the TCP three-way handshake.
These sockets are in the SYN_RCVD state.
 2. A completed connection queue, which contains an entry for each client
with whom the TCP three-way handshake has completed. These sockets
are in the ESTABLISHED state.
Figure: The two queues maintained by TCP for a listening socket.
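One consequence of the completed connection queue is that a client's connect can succeed before the server has called accept. The sketch below demonstrates this within a single process over the loopback interface; the function name and the backlog of 5 are illustrative choices.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if connect completed before accept was called and accept then
 * returned a new descriptor, 0 if accept returned the listening descriptor
 * (not expected), -1 on any error. */
int connect_completes_before_accept(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int connfd, ok = -1;
    int listenfd = socket(AF_INET, SOCK_STREAM, 0);
    int clifd = socket(AF_INET, SOCK_STREAM, 0);

    if (listenfd < 0 || clifd < 0)
        goto done;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                        /* kernel picks the port */
    if (bind(listenfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(listenfd, 5) < 0 ||
        getsockname(listenfd, (struct sockaddr *)&addr, &len) < 0)
        goto done;

    /* No accept yet: the kernel finishes the three-way handshake and
     * places the connection on the completed connection queue. */
    if (connect(clifd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto done;

    /* accept now just removes the queued connection. */
    connfd = accept(listenfd, NULL, NULL);
    if (connfd >= 0) {
        ok = (connfd != listenfd);
        close(connfd);
    }
done:
    if (clifd >= 0)
        close(clifd);
    if (listenfd >= 0)
        close(listenfd);
    return ok;
}
```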
ACCEPT FUNCTION
 accept is called by a TCP server to return the next completed
connection from the front of the completed connection queue. If the
completed connection queue is empty, the process is put to sleep
(assuming the default of a blocking socket).
#include <sys/socket.h>
int accept (int sockfd, struct sockaddr *cliaddr, socklen_t *addrlen);
Returns: non-negative descriptor if OK, -1 on error
 If accept is successful, its return value is a brand-new descriptor
automatically created by the kernel. This new descriptor refers to
the TCP connection with the client.
 When discussing accept, we call the first argument to accept the
listening socket (the descriptor created by socket and then used as
the first argument to both bind and listen), and we call the return
value from accept the connected socket. It is important to
differentiate between these two sockets.
 A given server normally creates only one listening socket, which
then exists for the lifetime of the server. The kernel creates one
connected socket for each client connection that is accepted (i.e., for
which the TCP three-way handshake completes).
 When the server is finished serving a given client, the connected
socket is closed.
FORK AND EXEC FUNCTIONS
#include <unistd.h>
pid_t fork(void);
Returns: 0 in child, process ID of child in parent, -1 on error
 If you have never seen this function before, the hard part in understanding
fork is that it is called once but it returns twice. It returns once in the calling
process (called the parent) with a return value that is the process ID of the
newly created process (the child).
 It also returns once in the child, with a return value of 0. Hence, the return
value tells the process whether it is the parent or the child.

 The reason fork returns 0 in the child, instead of the parent's process ID, is
because a child has only one parent and it can always obtain the parent's
process ID by calling getppid.
 A parent, on the other hand, can have any number of children, and there is
no way to obtain the process IDs of its children. If a parent wants to keep
track of the process IDs of all its children, it must record the return values
from fork.
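The two return values of fork can be seen in a sketch like the one below. The helper name fork_and_wait is hypothetical; the parent keeps the child's process ID from fork's return value, as described above, and uses it with waitpid.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits with child_status; the parent waits for it.
 * Returns the child's exit status as seen by the parent, or -1 on error. */
int fork_and_wait(int child_status)
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;                   /* fork failed */
    if (pid == 0)
        _exit(child_status);         /* child: fork returned 0 */

    /* Parent: fork returned the child's process ID, which we must keep
     * because there is no call that lists our children for us. */
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```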
 There are two typical uses of fork:
 A process makes a copy of itself so that one copy can handle one operation
while the other copy does another task. This is typical for network servers.
We will see many examples of this later in the text.
 A process wants to execute another program. Since the only way to create a
new process is by calling fork, the process first calls fork to make a copy of
itself, and then one of the copies (typically the child process) calls exec to
replace itself with the new program. This is typical for programs such as
shells.
CONCURRENT SERVERS
pid_t pid;
int listenfd, connfd;

listenfd = Socket( ... );
/* fill in sockaddr_in{} with server's well-known port */
Bind(listenfd, ... );
Listen(listenfd, LISTENQ);
for ( ; ; ) {
    connfd = Accept(listenfd, ... );  /* probably blocks */
    if ( (pid = Fork()) == 0) {
        Close(listenfd);  /* child closes listening socket */
        doit(connfd);     /* process the request */
        Close(connfd);    /* done with this client */
        exit(0);          /* child terminates */
    }
    Close(connfd);        /* parent closes connected socket */
}
Figure: Status of client/server before call to accept returns
Figure: Status of client/server after accept and fork return
Figure: Status of client/server after the child closes the listening socket
CLOSE FUNCTION
#include <unistd.h>
int close (int sockfd);
Returns: 0 if OK, -1 on error
 The default action of close with a TCP socket is to mark the socket as closed
and return to the process immediately. The socket descriptor is no longer
usable by the process: It cannot be used as an argument to read or write.
However, TCP will try to send any data that is already queued to be sent to the
other end, and after this occurs, the normal TCP connection termination
sequence takes place.
TCP ECHO SERVER
/* Capitalized calls (Socket, Bind, Listen, Accept, Fork, Close) are assumed
   to be error-checking wrappers around the lowercase system calls. */
int main(int argc, char **argv)
{
    int listenfd, connfd;
    pid_t childpid;
    socklen_t clilen;
    struct sockaddr_in cliaddr, servaddr;

    listenfd = Socket(AF_INET, SOCK_STREAM, 0);

    bzero(&servaddr, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_addr.s_addr = htonl(INADDR_ANY);  /* any local interface */
    servaddr.sin_port = htons(5555);

    Bind(listenfd, (struct sockaddr *) &servaddr, sizeof(servaddr));
    Listen(listenfd, 5);

    for ( ; ; ) {
        clilen = sizeof(cliaddr);  /* value-result argument */
        connfd = Accept(listenfd, (struct sockaddr *) &cliaddr, &clilen);
        if ( (childpid = Fork()) == 0) {  /* child process */
            Close(listenfd);   /* close listening socket */
            str_echo(connfd);  /* process the request */
            exit(0);
        }
        Close(connfd);  /* parent closes connected socket */
    }
}
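void str_echo(int sockfd)
{
    ssize_t n;
    char buf[MAXLINE];

    /* read returns 0 when the client closes the connection */
    while ( (n = read(sockfd, buf, MAXLINE)) > 0)
        writen(sockfd, buf, n);  /* writen retries after a short write */
}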
TCP ECHO CLIENT
int main(int argc, char **argv)
{
    int sockfd;
    struct sockaddr_in servaddr;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <IPaddress>\n", argv[0]);
        exit(1);
    }
    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    bzero(&servaddr, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(5555);
    if (inet_pton(AF_INET, argv[1], &servaddr.sin_addr) != 1) {
        fprintf(stderr, "invalid IPv4 address: %s\n", argv[1]);
        exit(1);
    }
    if (connect(sockfd, (struct sockaddr *) &servaddr, sizeof(servaddr)) < 0) {
        perror("connect");
        exit(1);
    }
    str_cli(stdin, sockfd);  /* do it all */
    exit(0);
}
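void str_cli(FILE *fp, int sockfd)
{
    char sendline[100], recvline[100];

    while (fgets(sendline, sizeof(sendline), fp) != NULL) {
        writen(sockfd, sendline, strlen(sendline));
        if (readline(sockfd, recvline, sizeof(recvline)) == 0) {
            /* EOF from the server before the echoed line arrived */
            fprintf(stderr, "str_cli: server terminated prematurely\n");
            exit(1);
        }
        fputs(recvline, stdout);
    }
}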
SOCKET FUNCTIONS FOR UDP CLIENT/SERVER.
 recvfrom and sendto Functions
#include <sys/socket.h>
ssize_t recvfrom(int sockfd, void *buff, size_t nbytes, int flags,
                 struct sockaddr *from, socklen_t *addrlen);
ssize_t sendto(int sockfd, const void *buff, size_t nbytes, int flags,
               const struct sockaddr *to, socklen_t addrlen);
 Both return: number of bytes read or written if OK, -1 on error
UDP ECHO SERVER
int main(int argc, char **argv)
{
    int sockfd;
    struct sockaddr_in servaddr, cliaddr;

    sockfd = socket(AF_INET, SOCK_DGRAM, 0);

    bzero(&servaddr, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_addr.s_addr = htonl(INADDR_ANY);
    servaddr.sin_port = htons(SERV_PORT);

    if (bind(sockfd, (struct sockaddr *) &servaddr, sizeof(servaddr)) < 0) {
        perror("bind");
        exit(1);
    }
    /* dg_echo loops forever, so main never returns from this call */
    dg_echo(sockfd, (struct sockaddr *) &cliaddr, sizeof(cliaddr));
}
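void dg_echo(int sockfd, struct sockaddr *pcliaddr, socklen_t clilen)
{
    ssize_t n;
    socklen_t len;
    char mesg[MAXLINE];

    for ( ; ; ) {
        len = clilen;  /* value-result argument: reset before each call */
        n = recvfrom(sockfd, mesg, MAXLINE, 0, pcliaddr, &len);
        if (n < 0)
            continue;  /* skip a failed read instead of echoing garbage */
        sendto(sockfd, mesg, n, 0, pcliaddr, len);
    }
}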
UDP ECHO CLIENT
int main(int argc, char **argv)
{
    int sockfd;
    struct sockaddr_in servaddr;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <IPaddress>\n", argv[0]);
        exit(1);
    }
    bzero(&servaddr, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(SERV_PORT);
    if (inet_pton(AF_INET, argv[1], &servaddr.sin_addr) != 1) {
        fprintf(stderr, "invalid IPv4 address: %s\n", argv[1]);
        exit(1);
    }
    sockfd = Socket(AF_INET, SOCK_DGRAM, 0);
    dg_cli(stdin, sockfd, (struct sockaddr *) &servaddr, sizeof(servaddr));
    exit(0);
}
void dg_cli(FILE *fp, int sockfd, const struct sockaddr *pservaddr,
            socklen_t servlen)
{
    ssize_t n;
    char sendline[MAXLINE], recvline[MAXLINE + 1];

    while (fgets(sendline, MAXLINE, fp) != NULL) {
        sendto(sockfd, sendline, strlen(sendline), 0, pservaddr, servlen);
        n = recvfrom(sockfd, recvline, MAXLINE, 0, NULL, NULL);
        if (n < 0)
            break;  /* recvfrom failed; never index recvline with -1 */
        recvline[n] = 0;  /* null terminate */
        fputs(recvline, stdout);
    }
}
READ DATAGRAM, ECHO BACK TO SENDER
 This function is a simple loop that reads the next datagram arriving at the
server's port using recvfrom and sends it back using sendto.
 Despite the simplicity of this function, there are numerous details to consider.
First, this function never terminates. Since UDP is a connectionless protocol,
there is nothing like an EOF as we have with TCP.
 Next, this function provides an iterative server, not a concurrent server as we
had with TCP.
 There is no call to fork, so a single server process handles any and all clients.
In general, most TCP servers are concurrent and most UDP servers are
iterative.
 There is implied queuing taking place in the UDP layer for this socket. Indeed,
each UDP socket has a receive buffer and each datagram that arrives for this
socket is placed in that socket receive buffer. When the process calls recvfrom,
the next datagram from the buffer is returned to the process in a first-in, first-
out (FIFO) order. This way, if multiple datagrams arrive for the socket before
the process can read what's already queued for the socket, the arriving
datagrams are just added to the socket receive buffer. But, this buffer has a
limited size. We discussed this size and how to increase it with the
SO_RCVBUF socket option in Section
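The queuing in the socket receive buffer can be observed with a sketch like the following, which sends three one-byte datagrams over the loopback interface before reading any of them. The function name udp_fifo_demo is hypothetical, and the 2-second SO_RCVTIMEO is only a safety net added here so the demo cannot block forever if a datagram were lost.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Queue datagrams "A", "B", "C" on a UDP socket, then read them back.
 * order[i] receives the first byte of the i-th datagram read.
 * Returns 0 on success, -1 on error. */
int udp_fifo_demo(char order[3])
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    struct timeval tv = { 2, 0 };
    const char *msgs = "ABC";
    char buf[16];
    int i, ok = -1;
    int rx = socket(AF_INET, SOCK_DGRAM, 0);
    int tx = socket(AF_INET, SOCK_DGRAM, 0);

    if (rx < 0 || tx < 0)
        goto done;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(rx, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        getsockname(rx, (struct sockaddr *)&addr, &len) < 0 ||
        setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0)
        goto done;

    /* All three datagrams arrive before the receiver reads anything,
     * so they wait in the socket receive buffer. */
    for (i = 0; i < 3; i++)
        if (sendto(tx, &msgs[i], 1, 0,
                   (struct sockaddr *)&addr, sizeof(addr)) != 1)
            goto done;

    /* Each recvfrom returns exactly one datagram, oldest first. */
    for (i = 0; i < 3; i++) {
        if (recvfrom(rx, buf, sizeof(buf), 0, NULL, NULL) != 1)
            goto done;
        order[i] = buf[0];
    }
    ok = 0;
done:
    if (rx >= 0)
        close(rx);
    if (tx >= 0)
        close(tx);
    return ok;
}
```

Each recvfrom returns a single 1-byte datagram even though the buffer has room for all three, which shows that datagram boundaries are preserved in the queue.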
SUMMARY
SOCKET OPTION
 getsockopt and setsockopt functions
#include <sys/socket.h>
int getsockopt(int sockfd, int level, int optname, void *optval,
               socklen_t *optlen);
int setsockopt(int sockfd, int level, int optname, const void *optval,
               socklen_t optlen);
• sockfd: open socket descriptor
• level: identifies the code in the system that interprets the option
(generic, IPv4, IPv6, TCP)
 – level is either the general socket level (SOL_SOCKET)
or a protocol-specific level (IP, TCP, etc.).
• optval: pointer to a variable from which the new value of
option is fetched by setsockopt, or into which the current
value of the option is stored by getsockopt.
– Option value can be of different types : int, in_addr,
timeval, … – that is the reason we use the void pointer.
• optlen: the size of the option variable.
Two types of options
Binary option :-
Used to enable or disable certain feature (flag option).
optval is integer type.
In getsockopt, a returned optval of 0 means the option is
disabled and a nonzero value means it is enabled.
In setsockopt, an optval of 0 disables the option and
a nonzero value enables it.
Value option :-
• Uses specific values.
• Used to pass/fetch values, structures, etc.
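The two kinds of options can be sketched as follows. The helper names set_flag_option and get_rcvbuf are hypothetical; SO_KEEPALIVE serves as the binary option in the usage note and SO_RCVBUF as the value option.

```c
#include <sys/socket.h>

/* Binary option: enable (on != 0) or disable a SOL_SOCKET flag, then read
 * it back. Returns 1 if the option is now enabled, 0 if disabled, -1 on
 * error. */
int set_flag_option(int fd, int optname, int on)
{
    int val = on ? 1 : 0;
    socklen_t len = sizeof(val);

    if (setsockopt(fd, SOL_SOCKET, optname, &val, sizeof(val)) < 0)
        return -1;
    if (getsockopt(fd, SOL_SOCKET, optname, &val, &len) < 0)
        return -1;
    return val != 0;   /* treat any nonzero value as "enabled" */
}

/* Value option: fetch the receive buffer size in bytes, or -1 on error. */
int get_rcvbuf(int fd)
{
    int size = 0;
    socklen_t len = sizeof(size);

    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &size, &len) < 0)
        return -1;
    return size;
}
```

For instance, set_flag_option(fd, SO_KEEPALIVE, 1) turns keep-alive on for a TCP socket, and get_rcvbuf(fd) reports whatever receive buffer size the system currently uses for that socket.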
General socket options (level = SOL_SOCKET)
• Optnames:
– SO_BROADCAST (int): permits sending of broadcast
datagrams.
– SO_ERROR: can only be fetched (not set); fetching it resets the
pending error.
– SO_KEEPALIVE: for TCP only; automatically sends a
keep-alive probe when the connection has been idle for 2 hours
(this period can be modified).
– SO_LINGER: for TCP only; determines the behavior
when close is called.
– SO_RCVBUF, SO_SNDBUF: receive and send buffer
sizes.
 IP options:
– Allows packets sent through a socket to have certain
behavior,
– Level = IPPROTO_IP
– E.g., manipulating IP header fields
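 TCP options:
• Level = IPPROTO_TCP
• Optnames:
– TCP_KEEPALIVE: sets the keep-alive idle time for the socket
– TCP_NODELAY: when enabled, disables the Nagle algorithm, so small
segments are sent without waiting for outstanding data to be acknowledged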
The following socket options are inherited by a connected
TCP socket from the listening socket.
– SO_KEEPALIVE
– SO_LINGER
– SO_RCVBUF
– SO_SNDBUF

• When to set these options?
– The connected socket is not returned to a server by accept
until the three-way handshake is completed by the TCP
layer.
– To ensure that one of these options is already in effect for the connected
socket when the handshake completes, set it on the listening socket.
Generic Socket Options
• These options are protocol independent.
However, certain options apply to only certain types of
sockets (E.g., SO_BROADCAST).
• They are handled by a protocol independent code within
the kernel. (Not by one particular protocol module)
Generic Socket Options (SO_BROADCAST)
• It enables or disables the ability of the process to send
broadcast messages.
– Works only with datagram sockets.
– Works only on networks that support the concept of
broadcast (e.g., Ethernet, token ring).
• Example:
int broadcastFlag = 1;
setsockopt(sd, SOL_SOCKET, SO_BROADCAST, &broadcastFlag, sizeof(broadcastFlag));
• If the destination address is a broadcast address and
this socket option is not set, EACCES is returned.
 Generic Socket Options (SO_ERROR)
• when error occurs on a socket, the protocol module in a
Berkeley-derived kernel sets one of standard UNIX Exxx
values to a variable named so_error for that socket.
• It is called pending error for the socket.
• The process can be notified of the error immediately (for example,
through select or signal-driven I/O).
• The process can then obtain the value of so_error by
fetching the SO_ERROR socket option.
• If so_error is nonzero when the process calls read and
there is no data to return, read returns -1 with errno set
to the value of so_error. The value of so_error is then
reset to 0.
– If there is data queued for the socket, that data is
returned by read instead of error condition.
• If so_error is nonzero when the process calls write, -1 is
returned with errno set to the value of so_error. The
so_error is reset to 0.
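A common place to fetch SO_ERROR is after a nonblocking connect. The sketch below is illustrative only: both function names are hypothetical, and it waits with poll rather than select simply to keep the code short.

```c
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>

/* Fetch (and thereby clear) the socket's pending error.
 * Returns the so_error value (0 = no pending error), or -1 on failure. */
int fetch_pending_error(int fd)
{
    int err = 0;
    socklen_t len = sizeof(err);

    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return -1;
    return err;
}

/* Start a nonblocking connect on fd and wait up to timeout_ms for the
 * outcome. Returns 0 if connected, otherwise an errno value. */
int nonblocking_connect(int fd, const struct sockaddr_in *dst, int timeout_ms)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return errno;
    if (connect(fd, (const struct sockaddr *)dst, sizeof(*dst)) == 0)
        return 0;                    /* connected immediately */
    if (errno != EINPROGRESS)
        return errno;                /* failed immediately */

    struct pollfd pfd = { .fd = fd, .events = POLLOUT };
    int n = poll(&pfd, 1, timeout_ms);
    if (n == 0)
        return ETIMEDOUT;            /* our own timeout, not TCP's */
    if (n < 0)
        return errno;

    /* The handshake either succeeded (0) or left a pending error. */
    int err = fetch_pending_error(fd);
    return err < 0 ? errno : err;
}
```

Because fetching the option resets so_error, a second call to fetch_pending_error on the same socket is expected to return 0.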
 Generic Socket Options (SO_KEEPALIVE)
• It is used to detect peer host crash.
• When the keep-alive option is set for a TCP socket
– and no data is exchanged in either direction for 2 hours,
– TCP automatically sends a keep-alive probe to the
peer.
• Possible peer responses
– ACK(everything OK)
– RST(peer crashed and rebooted):ECONNRESET
– no response to keep alive probe.
• Peer response-1: ACK
– The peer TCP responds with the expected ACK.
– Everything OK, and application is not notified.
– TCP will send another probe following another 2 hours of
inactivity.
• Peer response-2: RST
– The peer TCP responds with RST.
– It indicates that peer host has crashed and rebooted.
– Socket’s pending error is set to ECONNRESET.
– And the socket is closed.
• Peer response-3: No response to the keep-alive probe.
– Berkeley-derived TCPs send eight additional probes, 75
seconds apart, trying to elicit a response.
– TCP gives up if there is no response within 11 minutes and 15
seconds after sending the first probe.
– If there is no response at all to TCP's keep-alive probes,
the socket's pending error is set to ETIMEDOUT and the
socket is closed.
– If instead the socket receives an ICMP error in response to one of
the keep-alive probes, the corresponding error (e.g., EHOSTUNREACH)
is returned.
Can we change inactivity duration (i.e., rather than 2
hours, can we specify some other period)?
– Most kernels maintain these parameters on a per-kernel
basis.
– So, changing the inactivity period from 2 hours to 15
minutes will affect all sockets on the host.
• Use of SO_KEEPALIVE to detect peer host crash.
• If the peer process crashes?
– Its TCP will send a FIN across the connection, which we
can easily detect with select (through I/O multiplexing).
• If there is no response to any keep-alive probes?
– We are not guaranteed that the peer host has crashed.
– It is possible that some intermediate router failed
for 15 minutes, for example, and that this outage completely
overlapped our host's 11-minute-and-15-second probe period.
• Practical usage
• This option is normally used by servers.
• Servers use the option as they spend most of their time
for waiting for input across the TCP connection.
• If the client host crashes, the server process will never
know about it.
• And the server will continually wait for input that can
never arrive.
• This is called a half-open connection. The keep-alive
option will detect these half-open connections and
terminate them.
 Generic Socket Options (SO_LINGER)
• Linger means gradually dying.
• Using this option, we can specify how the close function
operates for a connection oriented protocol.
• By default, close returns immediately, but if there is any
data still remaining in the socket send buffer, the system
will try to deliver the data to the peer.
• We can change this default behavior by passing properly
initialized struct linger structure to setsockopt function.
struct linger {
    int l_onoff;   /* 0 = off, nonzero = on */
    int l_linger;  /* linger time, in seconds */
};
• l_onoff = 0: the option is turned off and l_linger is ignored.
– We get the default behavior.
• l_onoff = nonzero and l_linger = 0:
– TCP aborts the connection when close is called.
– TCP discards any data remaining in the socket send buffer and
sends an RST to the peer, not the normal four-packet connection
termination sequence.
– Moreover, TCP's TIME_WAIT state is avoided. This leaves open the
possibility that another incarnation of this connection is created within
2MSL seconds and that old duplicate segments from the earlier connection
are delivered to the new one.
• l_onoff = nonzero and l_linger = nonzero:
– The kernel lingers when the socket is closed.
– If there is any data still remaining in the socket send buffer,
the process is put to sleep until either all the data is sent and
acknowledged by the peer TCP or the linger time expires.
– If the socket has been set nonblocking, close does not wait,
even if the linger time is nonzero.
• The application should check the return value from close
while using linger socket option.
• If the linger time expires before the remaining data is
sent and acknowledged, close returns -1 with errno set to EWOULDBLOCK
and any remaining data in the send buffer is discarded.
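Setting the option comes down to filling a struct linger and passing it to setsockopt. The helpers below, set_linger and get_linger, are hypothetical names for a minimal sketch covering the three cases described above.

```c
#include <sys/socket.h>

/* Configure SO_LINGER on fd.
 *   on == 0               : default close behavior (seconds is ignored)
 *   on != 0, seconds == 0 : abortive close, an RST is sent on close
 *   on != 0, seconds >  0 : close blocks for up to that many seconds
 * Returns 0 if OK, -1 on error. */
int set_linger(int fd, int on, int seconds)
{
    struct linger lg;

    lg.l_onoff = on ? 1 : 0;
    lg.l_linger = seconds;
    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}

/* Read the current SO_LINGER setting into *lg. Returns 0 if OK, -1 on error. */
int get_linger(int fd, struct linger *lg)
{
    socklen_t len = sizeof(*lg);

    return getsockopt(fd, SOL_SOCKET, SO_LINGER, lg, &len);
}
```

A caller that enables lingering with a nonzero time should still check the return value of close, since that is the only place a linger timeout is reported.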
• With the default behavior of close, two problems exist.
– It is possible that the client’s close can return before
the server reads the remaining data in its socket
receive buffer.
– It is possible that the server host crashes before server
application reads this remaining data, and the client
application will never know this.
• Successful return from close, with SO_LINGER option set, only tells
us that the data we sent and our FIN have been acknowledged by
the peer TCP.
• This does not tell us whether the peer application has read the data.
• How to make sure that the peer application has read our data?
– Use of shutdown
– Use of application level acknowledgement of data.
• shutdown
#include <sys/socket.h>
int shutdown(int s, int how);
(On success, zero is returned. On error, -1 is returned, and errno is set
appropriately. )
– If how = SHUT_RD, further receiving will not be allowed.
– If how = SHUT_WR, further sending will not be allowed.
– If how = SHUT_RDWR, further receiving and sending will not be
allowed.
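The effect of SHUT_WR can be sketched in one process with a connected pair of stream sockets. As an assumption made for the demo, socketpair with AF_UNIX stands in for a TCP connection; the half-close semantics shown (peer sees EOF, reverse direction stays open) are the point of interest, and the function name is hypothetical.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Half-close one end of a stream connection and check what each side sees.
 * Returns 0 if everything behaved as described, -1 otherwise. */
int half_close_demo(void)
{
    int sv[2];
    char buf[8];
    int ok = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    if (write(sv[0], "data", 4) != 4)
        goto done;
    if (shutdown(sv[0], SHUT_WR) < 0)   /* sv[0] will send nothing more */
        goto done;

    if (read(sv[1], buf, sizeof(buf)) != 4)   /* peer still gets the data */
        goto done;
    if (read(sv[1], buf, sizeof(buf)) != 0)   /* ... and then sees EOF */
        goto done;

    /* The other direction is still open, so the peer can reply, for
     * example with a one-byte application-level acknowledgment. */
    if (write(sv[1], "k", 1) != 1)
        goto done;
    if (read(sv[0], buf, sizeof(buf)) != 1 || buf[0] != 'k')
        goto done;
    ok = 0;
done:
    close(sv[0]);
    close(sv[1]);
    return ok;
}
```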
• Use of application level acknowledgement of data to
know the peer application has read our data.
• Use ack of 1 byte.
• Client
char ack;
write(sockfd, data, nbytes);  /* data from client to server */
n = read(sockfd, &ack, 1);    /* wait for application-level ACK */
• Server
nbytes = read(sockfd, buff, sizeof(buff));  /* data from client */
/* server verifies it received the correct amount of data
   from the client */
write(sockfd, " ", 1);  /* server's ACK back to client */
TCP AND UDP OUTPUT
 Given all these terms and definitions, the figure shows
what happens when an application writes data to a TCP
socket.
 Every TCP socket has a send buffer and we can change the
size of this buffer with the SO_SNDBUF socket option. When
an application calls write, the kernel copies all the data from
the application buffer into the socket send buffer.
 If there is insufficient room in the socket buffer for all the
application's data (either the application buffer is larger than
the socket send buffer, or there is already data in the socket
send buffer), the process is put to sleep.
 This assumes the normal default of a blocking socket. The
kernel will not return from the write until the final byte in the
application buffer has been copied into the socket send buffer.
 Therefore, the successful return from a write to a TCP socket
only tells us that we can reuse our application buffer. It does
not tell us that either the peer TCP has received the data or
that the peer application has received the data.
 TCP takes the data in the socket send buffer and sends it
to the peer TCP based on all the rules of TCP data
transmission. The peer TCP must acknowledge the data,
and as the ACKs arrive from the peer, only then can our
TCP discard the acknowledged data from the socket
send buffer. TCP must keep a copy of our data until it is
acknowledged by the peer.
 UDP output: the figure shows what happens when an
application writes data to a UDP socket.
 This time, we show the socket send buffer as a dashed
box because it doesn't really exist. A UDP socket has a
send buffer size, but this is simply an upper limit on the
maximum-sized UDP datagram that can be written to
the socket.
 If an application writes a datagram larger than the
socket send buffer size, EMSGSIZE is returned.
 Since UDP is unreliable, it does not need to keep a copy
of the application's data and does not need an actual
send buffer. (The application data is normally copied into
a kernel buffer of some form as it passes down the
protocol stack, but this copy is discarded by the datalink
layer after the data is transmitted.)
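The EMSGSIZE case can be provoked with a sketch like the one below. The helper udp_send_result is a hypothetical name; the exact size limit is implementation dependent, so the usage note picks 70,000 bytes, which exceeds the largest payload an IPv4 UDP datagram can carry regardless of the send buffer size.

```c
#include <errno.h>
#include <netinet/in.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Try to send one UDP datagram of nbytes zero bytes to 127.0.0.1:port.
 * Returns 0 if sendto accepted the datagram, otherwise the errno value. */
int udp_send_result(size_t nbytes, unsigned short port)
{
    struct sockaddr_in dst;
    int result = 0;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return errno;
    char *payload = calloc(nbytes ? nbytes : 1, 1);
    if (payload == NULL) {
        close(fd);
        return ENOMEM;
    }

    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    dst.sin_port = htons(port);

    if (sendto(fd, payload, nbytes, 0,
               (struct sockaddr *)&dst, sizeof(dst)) < 0)
        result = errno;              /* save before free/close run */

    free(payload);
    close(fd);
    return result;
}
```

With these assumptions, udp_send_result(70000, 9) is expected to report EMSGSIZE, while a 100-byte datagram to the same destination is accepted by sendto even if nothing is listening, because UDP does not wait for any response.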
 UDP simply prepends its 8-byte header and passes the
datagram to IP. IPv4 or IPv6 prepends its header, determines
the outgoing interface by performing the routing function, and
then either adds the datagram to the datalink output queue
(if it fits within the MTU) or fragments the datagram and
adds each fragment to the datalink output queue.
 If a UDP application sends large datagrams (say 2,000-byte
datagrams), there is a much higher probability of
fragmentation than with TCP, because TCP breaks the
application data into MSS-sized chunks, something that has
no counterpart in UDP.
 The successful return from a write to a UDP socket tells us
that either the datagram or all fragments of the datagram
have been added to the datalink output queue.
 If there is no room on the queue for the datagram or one of its
fragments, ENOBUFS is often returned to the application.
