The TCP Connection (Revisited)

In the last three lectures, the discussion of how an application protocol operates has invariably started with the phrase:
...the client process opens a TCP connection to a server, at port number xx...
In this lecture, we examine what this really means from the perspective of a programmer writing a TCP client or (more complex) a server.

Warning: In this lecture, some things will be assumed:


The Socket Abstraction

The socket was introduced in BSD Unix as a way of extending the Unix file I/O model to handle network communications. Sockets are implemented in all current versions of Unix as system calls; that is, they are part of the operating system kernel. In other systems, they are often implemented as libraries (e.g., Winsock for PCs, GUSI for the Macintosh).

Socket operations resemble file operations in many respects:


Socket Creation

The system call to create a socket looks like:
result = socket(pf, type, protocol);
The pf argument specifies the protocol family of the socket. For TCP (Internet) sockets this is PF_INET.

The type argument specifies the type of communication desired. For a TCP reliable stream connection, it has value SOCK_STREAM.

The third (protocol) argument can, on some systems, further define the type of socket. For TCP, this is normally zero, which selects the default protocol for the given family and type.
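As a minimal sketch of the call described above, the following function creates a TCP socket and checks the result. The helper name make_tcp_socket is hypothetical (it does not come from the lecture), and the sketch assumes a Unix-like system with the BSD socket headers.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not from the lecture): create a TCP socket.
 * Returns the socket descriptor (a small integer), or -1 on failure. */
int make_tcp_socket(void)
{
    /* PF_INET: Internet protocol family.
     * SOCK_STREAM: reliable byte-stream communication.
     * 0: use the default protocol for this family and type. */
    int sock = socket(PF_INET, SOCK_STREAM, 0);

    if (sock < 0) {
        perror("socket");
        return -1;
    }
    return sock;
}
```

When the socket is no longer needed, it is released with close(sock), just as a file descriptor would be.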

Once a socket has been created, the programmer can optionally call the bind system call to define a local address for it. This is normally used only in servers, not clients (see later in this lecture).

bind(socket, localaddr, addrlen);
The socket argument is the small integer returned by the socket system call.

The localaddr argument is a pointer to a sockaddr structure containing the desired port number, and the IP address (see later lecture) of the local host. The addrlen argument is the length of the structure in bytes.
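The bind call might be used as in the sketch below. It assumes the Internet-specific form of the address structure, struct sockaddr_in, and uses INADDR_ANY to mean "any IP address of the local host". Both are assumptions of this example rather than details given in the lecture, and the helper name bind_local_port is hypothetical.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: give an existing TCP socket a local address.
 * Returns 0 on success, -1 on failure (as bind itself does). */
int bind_local_port(int sock, unsigned short port)
{
    struct sockaddr_in local;

    memset(&local, 0, sizeof local);
    local.sin_family = AF_INET;                 /* Internet address family */
    local.sin_port = htons(port);               /* port, in network byte order */
    local.sin_addr.s_addr = htonl(INADDR_ANY);  /* assumption: any local IP address */

    /* bind takes a generic sockaddr pointer plus the structure length. */
    return bind(sock, (struct sockaddr *)&local, sizeof local);
}
```

Passing a port of 0 asks the system to choose an unused port, which can then be discovered with getsockname.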


Socket Addresses and Connections

The sockaddr structure contains the following fields for a TCP connection:

The same structure is used by a client to establish a connection to a server, thus:

connect(socket, destaddr, addrlen);
In this case, the fields have the following meaning:
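A client-side connection could be sketched as follows. The example assumes struct sockaddr_in as the concrete address structure and a dotted-decimal IPv4 address string for the server. The helper name tcp_connect is hypothetical.

```c
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: open a TCP connection to the server at the given
 * IPv4 address (dotted decimal) and port.
 * Returns the connected socket, or -1 on failure. */
int tcp_connect(const char *ip, unsigned short port)
{
    struct sockaddr_in dest;
    int sock;

    memset(&dest, 0, sizeof dest);
    dest.sin_family = AF_INET;        /* Internet address family */
    dest.sin_port = htons(port);      /* server's port, network byte order */
    if (inet_pton(AF_INET, ip, &dest.sin_addr) != 1)
        return -1;                    /* not a valid IPv4 address string */

    sock = socket(PF_INET, SOCK_STREAM, 0);
    if (sock < 0)
        return -1;

    /* No bind is needed: the system picks the client's local port. */
    if (connect(sock, (struct sockaddr *)&dest, sizeof dest) < 0) {
        close(sock);
        return -1;
    }
    return sock;
}
```

A call such as tcp_connect("127.0.0.1", 7) would attempt a connection to port 7 on the local host; the address and port here are placeholders only.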

Socket Communications

Once a connection is established, a process uses standard file I/O system calls to perform communication, thus:
read(socket, buffer, length);
and
write(socket, buffer, length);
The first argument is the integer socket identifier. The second is a pointer to an area of memory: for a write, it holds the data to be sent to the connected socket; for a read, it receives the data arriving from it. The length argument specifies, for a write, the number of bytes to be written to the socket. For a read, it specifies the maximum number of bytes which may be returned; fewer may actually arrive, and the return value of the call gives the actual count.
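Because read and write may transfer fewer bytes than requested, programs commonly wrap them in loops. The sketch below shows one way to do this; the helper names write_all and read_full are hypothetical, and retrying after EINTR is a design choice of this example rather than something the lecture prescribes.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write exactly len bytes, retrying after short writes.
 * Returns 0 on success, -1 on error. */
int write_all(int sock, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(sock, p, len);
        if (n <= 0) {
            if (n < 0 && errno == EINTR)
                continue;             /* interrupted by a signal: try again */
            return -1;
        }
        p += n;                       /* skip past the bytes already written */
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: read until len bytes have arrived or the peer closes.
 * Returns the number of bytes read (less than len at end of stream),
 * or -1 on error. */
ssize_t read_full(int sock, void *buf, size_t len)
{
    char *p = buf;
    size_t got = 0;

    while (got < len) {
        ssize_t n = read(sock, p + got, len - got);
        if (n < 0) {
            if (errno == EINTR)
                continue;             /* interrupted by a signal: try again */
            return -1;
        }
        if (n == 0)
            break;                    /* peer closed the connection */
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

Since these helpers use only read and write, they work on any descriptor, which reflects the point that sockets extend the ordinary file I/O model.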

Note that there are many other operations (outside the scope of this subject) which can be performed on a TCP connected socket. Some of these include:


Server Socket Operations

Until now, the socket operations we have seen relate to client processes (except bind). Several further operations are used by server processes to manage the fact that a server waits for a connection to be established.
listen(socket, qlength);
This call defines the maximum number of incoming connections which the operating system will queue while the server process is not yet ready to accept them.
newsock = accept(sock, sockaddr, addrlen);
This is the call which defines "waiting for connection". By default, it does not return until a connection is established, at which time it "fills in" the sockaddr structure with the details of the client from whom the connection was accepted. Unlike in bind and connect, the addrlen argument here is a pointer to the length, since accept updates it to the size of the address actually returned. The call also creates a new socket which the server can use to perform communications with the client, either consecutively or concurrently.

The new socket is bound to the same port as the original one used in the argument list to accept, allowing the server to continue to accept new connections at the same port number.
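The server-side sequence of socket, bind, listen and accept can be sketched as below. The helper names make_listener and accept_client are hypothetical, and the use of struct sockaddr_in and INADDR_ANY is an assumption of this example.

```c
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: create a TCP socket, bind it to the given local
 * port and mark it as listening with a queue of qlength connections.
 * Returns the listening socket, or -1 on failure. */
int make_listener(unsigned short port, int qlength)
{
    struct sockaddr_in local;
    int sock = socket(PF_INET, SOCK_STREAM, 0);

    if (sock < 0)
        return -1;

    memset(&local, 0, sizeof local);
    local.sin_family = AF_INET;
    local.sin_port = htons(port);
    local.sin_addr.s_addr = htonl(INADDR_ANY);  /* assumption: any local address */

    if (bind(sock, (struct sockaddr *)&local, sizeof local) < 0 ||
        listen(sock, qlength) < 0) {
        close(sock);
        return -1;
    }
    return sock;
}

/* Hypothetical helper: wait for one client and fill in its address.
 * Returns a new connected socket, or -1 on failure. The listening
 * socket remains open and can accept further connections. */
int accept_client(int listener, struct sockaddr_in *client)
{
    /* In: size of the buffer; out: size of the address filled in. */
    socklen_t addrlen = sizeof *client;

    return accept(listener, (struct sockaddr *)client, &addrlen);
}
```

A simple server would call accept_client in a loop, using the returned socket for read and write with that client and closing it when the exchange is finished.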

How this works is a very subtle point...


This lecture is also available in PostScript format. The tutorial for this lecture is Tutorial #06.
Phil Scott