System Management Guide: Communications and Networks

TCP/IP Protocols

The topics discussed in this section are:

Protocols are sets of rules for message formats and procedures that allow machines and application programs to exchange information. These rules must be followed by each machine involved in the communication in order for the receiving host to be able to understand the message.

The TCP/IP suite of protocols can be understood in terms of layers (or levels).

Figure 3. TCP/IP Suite of Protocols. This illustration depicts the layers of the TCP/IP protocol. From the top they are, Application Layer, Transport Layer, Network Layer, Network Interface Layer, and Hardware.

TCP/IP carefully defines how information moves from sender to receiver. First, application programs send messages or streams of data to one of the Internet Transport Layer Protocols, either the User Datagram Protocol (UDP) or the Transmission Control Protocol (TCP). These protocols receive the data from the application, divide it into smaller pieces called packets, add a destination address, and then pass the packets along to the next protocol layer, the Internet Network layer.

The Internet Network layer encloses the packet in an Internet Protocol (IP) datagram, puts in the datagram header and trailer, decides where to send the datagram (either directly to a destination or else to a gateway), and passes the datagram on to the Network Interface layer.

The Network Interface layer accepts IP datagrams and transmits them as frames over a specific network hardware, such as Ethernet or Token-Ring networks.

Figure 4. Movement of Information from Sender Application to Receiver Host. This illustration shows the flow of information down the TCP/IP protocol layers from the Sender to the Host.

Frames received by a host go through the protocol layers in reverse. Each layer strips off the corresponding header information, until the data is back at the application layer.

Figure 5. Movement of Information from Host to Application. This illustration shows the flow of information up the TCP/IP protocol layers from the Host to the Sender.

Frames are received by the Network Interface layer (in this case, an Ethernet adapter). The Network Interface layer strips off the Ethernet header, and sends the datagram up to the Network layer. In the Network layer, the Internet Protocol strips off the IP header and sends the packet up to the Transport layer. In the Transport layer, the TCP (in this case) strips off the TCP header and sends the data up to the Application layer.

Hosts on a network send and receive information simultaneously. The "Host Data Transmission and Reception" figure below more accurately represents a host as it communicates.

Figure 6. Host Data Transmissions and Receptions. This illustration shows data flowing both ways through the TCP/IP layers.

Internet Protocol (IP) Version 6 Overview

Internet Protocol (IP) Version 6 (IPv6 or IPng) is the next generation of IP and has been designed to be an evolutionary step from IP Version 4 (IPv4). While IPv4 has allowed the development of a global Internet, it is not capable of carrying much farther into the future because of two fundamental factors: limited address space and routing complexity. The IPv4 32-bit addresses do not provide enough flexibility for global Internet routing. The deployment of Classless InterDomain Routing (CIDR) has extended the lifetime of IPv4 routing by a number of years, but the effort to better manage the routing will continue. Even if IPv4 routing could be scaled up, the Internet will eventually run out of network numbers.

The Internet Engineering Task Force (IETF) recognized that IPv4 would not be able to support the phenomenal growth of the Internet, so the IETF IPng working group was formed. Of the proposals that were made, Simple Internet Protocol Plus (SIPP) was chosen as an evolutionary step in the development of IP. This was renamed to IPng, and RFC1883 was finalized in December of 1995.

IPv6 extends the maximum number of Internet addresses to handle the ever increasing Internet user population. As an evolutionary change from IPv4, IPv6 has the advantage of allowing the new and the old to coexist on the same network. This coexistence enables an orderly migration from IPv4 (32 bit addressing) to IPv6 (128 bit addressing) on an operational network.

This overview is intended to give the reader a general understanding of the IPng protocol. For detailed information, please see RFCs 2460, 2373, 2465, 1886, 2461, 2462, and 2553.

Expanded Routing and Addressing

IPv6 increases the IP address size from 32 bits to 128 bits, thereby supporting more levels of addressing hierarchy, a much greater number of addressable nodes, and simpler autoconfiguration of addresses.

IPv6 has three types of addresses:

unicast	A packet sent to a unicast address is delivered to the interface identified by that address. A unicast address has a particular scope: link-local, site-local, global. There are also two special unicast addresses: ::/128 (unspecified address) ::1/128 (loopback address)
multicast	A packet sent to a multicast address is delivered to all interfaces identified by that address. A multicast address is identified by the prefix ff::/8. As with unicast addresses, multicast addresses have a similar scope: node-local, link-local, site-local, and organization-local.
anycast	An anycast address is an address that has a single sender, multiple listeners, and only one responder (normally the "nearest" one, according to the routing protocols' measure of distance). For example, several web servers listening on an anycast address. When a request is sent to the anycast address, only one responds. An anycast address is indistinguishable from a unicast address. A unicast address becomes an anycast address when more than one interface is configured with that address.

Note: There are no broadcast addresses in IPv6. Their function has been superseded by the multicast address.

Autoconfiguration

The primary mechanisms available that enable a node to start up and communicate with other nodes over an IPv4 network are hard-coding, BOOTP, and DHCP.

IPv6 introduces the concept of scope to IP addresses, one of which is link-local. This allows a host to construct a valid address from the predefined link-local prefix and its local identifier. This local identifyer is typically derived from the medium access control (MAC) address of the interface to be configured. Using this address, the node can communicate with other hosts on the same subnet and, for a fully-isolated subnet, might not need any other address configuration.

Meaningful Addresses

With IPv4, the only generally recognizable meaning in addresses are broadcast (typically all 1s or all 0s), and classes (for example, a class D is multicast). With IPv6, the prefix can be quickly examined to determine scope (for example, link-local), multicast versus unicast, and a mechanism of assignment (provider-based or geography-based).

Routing information might be explicitly loaded into the upper bits of addresses as well, but this has not yet been finalized by the IETF (for provider-based addresses, routing information is implicitly present in the address).

Duplicate Address Detection

When an interface is initialized or reinitialized, it uses autoconfiguration to tentatively associate a link-local address with that interface (the address is not yet assigned to that interface in the traditional sense). At this point, the interface joins the all-nodes and solicited-nodes multicast groups, and sends a neighbor discovery message to these groups. By using the multicast address, the node can determine whether that particular link-local address has been previously assigned, and choose an alternate address. This eliminates accidentally assigning the same address to two different interfaces on the same link. (It is still possible to create duplicate global-scope addresses for nodes that are not on the same link.)

Neighbor Discovery/Stateless Address Autoconfiguration

Neighbor Discovery Protocol (NDP) for IPv6 is used by nodes (hosts and routers) to determine the link-layer addresses for neighbors known to reside on attached links, and maintain per-destination routing tables for active connections. Hosts also use NDP to find neighboring routers that are willing to forward packets on their behalf and detect changed link-layer addresses. NDP uses the Internet Control Message Protocol (ICMP) Version 6 with its own unique message types. In general terms, the IPv6 Neighbor Discovery protocol corresponds to a combination of the IPv4 Address Resolution Protocol (ARP), ICMP Router Discovery (RDISC), and ICMP Redirect (ICMPv4), but with many improvements over these IPv4 protocols.

IPv6 defines both a stateful and a stateless address autoconfiguration mechanism. Stateless autoconfiguration requires no manual configuration of hosts; minimal, if any, configuration of routers; and no additional servers. The stateless mechanism allows a host to generate its own addresses using a combination of locally available information and information advertised by routers. Routers advertise prefixes that identify the subnets associated with a link, while hosts generate an interface token that uniquely identifies an interface on a subnet. An address is formed by combining the two. In the absence of routers, a host can only generate link-local addresses. However, link-local addresses are sufficient for allowing communication among nodes attached to the same link.

Routing Simplification

To simplify routing issues, IPv6 addresses are considered in two parts: a prefix and an ID. This might seem the same as the IPv4 net-host address breakdown, but it has two advantages:

no class	No fixed number of bits for prefix or ID, which allows for a reduction in loss due to over-allocation
nesting	An arbitrary number of divisions can be employed by considering different numbers of bits as the prefix.

Case 1

128 bits
node address

Case 2

`n bits`	`128-n bits`
`Subnet prefix`	`Interface ID`

Case 3:

`n bits`	`80-n bits`	`48 bits`
`Subscriber prefix`	`Subnet ID`	`Interface ID`

Case 4:

`s bits`	`n bits`	`m bits`	`128-s-n-m bits`
`Subscribe prefix`	`Area ID`	`Subnet ID`	`Interface ID`

Generally, IPv4 cannot go beyond Case 3, even with Variable Length Subnet Mask (VLSM is a means of allocating IP addressing resources to subnets according to their individual need rather than some general network-wide rule). This is as much an artifact of the shorter address length as the definition of variable length prefixes, but is worth noting nonetheless.

Header Format Simplification

IPv6 simplifies the IP header, by removing entirely or by moving to an extension header, some of the fields found in the IPv4 header. It defines a more flexible format for optional information (the extension headers). Specifically, note the absence of:

header length (length is constant)
identification
flags
fragment offset (moved into fragmentation extension headers)
header checksum (upper-layer protocol or security extension header handles data integrity).

Table 1. IPv4 header
`Version`	`IHL`	`Type of Service`	`Total Length`
`Identification`			`Flags`	`Fragment Offset`
`Time to Live`		`Protocol`	`Header Checksum`
`Source Address`
`Destination Address`
`Options`				`Padding`

Table 2. IPv6 header
`Version`	`Prio`	`Flow Label`
`Payload Length`			`Next Header`	`Hop Limit`
`Source Address`
`Destination Address`

IPng includes an improved options mechanism over IPv4. IPv6 options are placed in separate extension headers that are located between the IPv6 header and the transport-layer header in a packet. Most extension headers are not examined or processed by any router along a packet delivery path until it arrives at its final destination. This mechanism facilitates a major improvement in router performance for packets containing options. In IPv4 the presence of any options requires the router to examine all options.

Another improvement is that, unlike IPv4 options, IPv6 extension headers can be of arbitrary length and the total amount of options carried in a packet is not limited to 40 bytes. This feature, plus the manner in which it is processed, permits IPv6 options to be used for functions that were not practical in IPv4, such as the IPv6 Authentication and Security Encapsulation options.

To improve the performance when handling subsequent option headers and the transport protocol that follows, IPv6 options are always an integer multiple of eight octets long to retain this alignment for subsequent headers.

By using extension headers instead of a protocol specifier and options fields, newly defined extensions can be integrated more easily.

Current specifications define extension headers in the following ways:

Hop-by-hop options that apply to each hop (router) along the path
Routing header for loose/strict source routing (used infrequently)
A fragment defines the packet as a fragment and contains information about the fragment (IPv6 routers do not fragment)
Authentication (See IP Security AIX 5L Version 5.2 Security Guide
Encryption (See IP Security AIX 5L Version 5.2 Security Guide
Destination options for the destination node (ignored by routers).

Improved Quality-of-Service/Traffic Control

While quality of service can be controlled by use of a control protocol such as RSVP, IPv6 provides for explicit priority definition for packets by using the priority field in the IP header. A node can set this value to indicate the relative priority of a particular packet or set of packets, which can then be used by the node, one or more routers, or the destination to make choices concerning the packet (that is, dropping it or not).

IPv6 specifies two types of priorities, those for congestion-controlled traffic, and those for non-congestion-controlled traffic. No relative ordering is implied between the two types.

Congestion-controlled traffic is defined as traffic that responds to congestion through some sort of "back-off" or other limiting algorithm. Priorities for congestion-controlled traffic are:

0	uncharacterized traffic
1	"filler" traffic (for example, netnews)
2	unattended data transfer (for example, mail)
3	(reserved)
4	attended bulk transfer (for example, FTP)
5	(reserved)
6	interactive traffic (for example, Telnet)
7	control traffic (for example, routing protocols)

Non-congestion-controlled traffic is defined as traffic that responds to congestion by dropping (or simply not resending) packets, such as video, audio, or other real-time traffic. Explicit levels are not defined with examples, but the ordering is similar to that for congestion-controlled traffic:

The lowest value that the source is most willing to have discarded should be used for traffic.
The highest value that the source is least willing to have discarded should be used for traffic.

This priority control is only applicable to traffic from a particular source address. Control traffic from one address is not an explicitly higher priority than attended bulk transfer from another address.

Flow Labeling

Outside of basic prioritization of traffic, IPv6 defines a mechanism for specifying a particular flow of packets. In IPv6 terms, a flow is defined as a sequence of packets sent from a particular source to a particular (unicast or multicast) destination for which the source desires special handling by the intervening routers.

This flow identification can be used for priority control, but might also be used for any number of other controls.

The flow label is chosen randomly, and does not identify any characteristic of the traffic other than the flow to which it belongs. This means that a router cannot determine that a packet is a particular type by examining the flow label. It can, however, determine that it is part of the same sequence of packets as the last packet containing that label.

Note: Until IPv6 is in general use, the flow label is mostly experimental. Uses and controls involving flow labels have not yet been defined nor standardized.

Tunneling

The key to a successful IPv6 transition is compatibility with the existing installed base of IPv4 hosts and routers. Maintaining compatibility with IPv4 while deploying IPv6 streamlines the task of transitioning the Internet to IPv6.

While the IPv6 infrastructure is being deployed, the existing IPv4 routing infrastructure can remain functional, and can be used to carry IPv6 traffic. Tunneling provides a way to use an existing IPv4 routing infrastructure to carry IPv6 traffic.

IPv6 or IPv4 hosts and routers can tunnel IPv6 datagrams over regions of IPv4 routing topology by encapsulating them within IPv4 packets. Tunneling can be used in a variety of ways:

Router-to-Router	IPv6 or IPv4 routers interconnected by an IPv4 infrastructure can tunnel IPv6 packets between themselves. In this case, the tunnel spans one segment of the end-to-end path that the IPv6 packet takes.
Host-to-Router	IPv6 or IPv4 hosts can tunnel IPv6 packets to an intermediary IPv6 or IPv4 router that is reachable through an IPv4 infrastructure. This type of tunnel spans the first segment of the packet's end-to-end path.
Host-to-Host	IPv6 or IPv4 hosts that are interconnected by an IPv4 infrastructure can tunnel IPv6 packets between themselves. In this case, the tunnel spans the entire end-to-end path that the packet takes.
Router-to-Host	IPv6/IPv4 routers can tunnel IPv6 packets to their final destination IPv6 or IPv4 host. This tunnel spans only the last segment of the end-to-end path.

Tunneling techniques are usually classified according to the mechanism by which the encapsulating node determines the address of the node at the end of the tunnel. In router-to-router or host-to-router methods, the IPv6 packet is being tunneled to a router. In host-to-host or router-to-host methods, the IPv6 packet is tunneled all the way to its final destination.

The entry node of the tunnel (the encapsulating node) creates an encapsulating IPv4 header and transmits the encapsulated packet. The exit node of the tunnel (the decapsulating node) receives the encapsulated packet, removes the IPv4 header, updates the IPv6 header, and processes the received IPv6 packet. However, the encapsulating node needs to maintain soft state information for each tunnel, such as the maximum transmission unit (MTU) of the tunnel, to process IPv6 packets forwarded into the tunnel.

IPv6 Security

For details about IP Security, versions 4 and 6, see IP Security in AIX 5L Version 5.2 Security Guide.

IPv6 Multihomed Link-Local and Site-Local Support

A host can have more than one interface defined. A host with two or more active interfaces is called multihomed. Each interface has a link-local address associated with it. Link-local addresses are sufficient for allowing communication among nodes attached to the same link.

A multihomed host has two or more associated link-local addresses. The AIX IPv6 implementation has 4 options to handle how link-layer address resolution is resolved on multihomed hosts. Option 1 is the default.

Option 0	No multihomed actions are taken. Transmissions will go out on the first link-local interface. When the Neighbor Discovery Protocol (NDP) must perform address resolution, it multicasts a Neighbor Solicitation message out on each interface with a link local address defined. NDP queues the data packet until the first Neighbor Advertisement message is received. The data packet is then sent out on this link.
Option 1	When the NDP must perform address resolution, that is, when sending a data packet to a destination and the the link-layer information for the next hop is not in the Neighbor Cache, it multicasts a Neighbor Solicitation message out on each interface with a link-local address defined. NDP then queues the data packet until it gets the link-layer information. NDP then waits until a response is received for each interface. This guarantees that the data packets are sent on the appropriate outgoing interfaces. If NDP did not wait, but responded to the first Neighbor Advertisement received, it would be possible for a data packet to be sent out on a link not associated with the packet source address. Because NDP must wait, a delay in the first packet being sent occurs. However, the delay occurrs anyway in waiting for the first response.
Option 2	Multihomed operation is allowed, but dispatching of a data packet is limited to the interface specified by main_if6. When the NDP must perform address resolution, it multicasts a Neighbor Solicitation message out on each interface with a link-local address defined. It then waits for a Neighbor Advertisement message from the interface specified by main_if6 (see the no command). Upon receiving a response from this interface, the data packet is sent out on this link.
Option 3	Multihomed operation is allowed, but dispatching of a data packet is limited to the interface specified by main_if6 and site-local addresses are only routed for the interface specified by main_site6 (see the no command). The NDP operates just as it does for Option 2. For applications that route data packets using site-local addresses on a multihomed host, only the site-local address specified by main_site6 are used.

Packet Tracing

Packet tracing is the process by which you can verify the path of a packet through the layers to its destination. The iptrace command performs network interface level packet tracing. The ipreport command issues output on the packet trace in both hexadecimal and ASCII format. The trpt command performs transport protocol level packet tracking for the TCP. The trpt command output is more detailed, including information on time, TCP state, and packet sequencing.

Network Interface Packet Headers

At the Network Interface layer, packet headers are attached to outgoing data.

Figure 7. Packet Flow through Network Interface Structure. This illustration shows bi-directional data flow through the layers of the Network Interface Structure. From the top (software) they are the Network Layer, Network Interface Layer, Device Driver, and the (hardware) Network Adapter Card or Connection.

Packets are then sent through the network adapter to the appropriate network. Packets can pass through many gateways before reaching their destinations. At the destination network, the headers are stripped from the packets and the data is sent to the appropriate host.

The following section contains packet header information for several of the more common network interfaces.

Ethernet Adapter Frame Headers

An Internet Protocol (IP) or Address Resolution Protocol (ARP) frame header for the Ethernet adapter is composed of three fields as shown in the following table..

Ethernet adapter frame header
Field	Length	Definition
DA	6 bytes	Destination address.
SA	6 bytes	Source address. If bit 0 of this field is set to 1, it indicates that routing information (RI) is present.
Type	2 bytes	Specifies whether the packet is IP or ARP. The type number values are listed below.

Type field numbers:

IP	0800
ARP	0806

Token-Ring Frame Headers

The medium access control (MAC) header for the token-ring adapter is composed of five fields, as shown in the following table.

Token-ring MAC header
Field	Length	Definition
AC	1 byte	Access control. The value in this field x`00' gives the header priority 0.
FC	1 byte	Field control. The value in this field x`40' specifies the Logical Link Control frame.
DA	6 bytes	Destination address.
SA	6 bytes	Source address. If bit 0 of this field is set to 1, it indicates that routing information (RI) is present.
RI	18 bytes	Routing information. The valid fields are discussed below.

The MAC header consists of two routing information fields of two bytes each: routing control (RC) and segment numbers. A maximum of eight segment numbers can be used to specify recipients of a limited broadcast. RC information is contained in bytes 0 and 1 of the RI field. The settings of the first two bits of the RC field have the following meanings:

bit (0) = 0	Use the nonbroadcast route specified in the RI field.
bit (0) = 1	Create the RI field and broadcast to all rings.
bit (1) = 0	Broadcast through all bridges.
bit (1) = 1	Broadcast through limited bridges.

The logical link control (LLC) header is composed of five fields, as shown in the following LLC header table.

802.3 LLC header
Field	Length	Definition
DSAP	1 byte	Destination service access point. The value in this field is x`aa'.
SSAP	1 byte	Source service access point. The value in this field is x`aa'.
CONTROL	1 byte	Determines the LLC commands and responses. The three possible values for this field are discussed below.
PROT_ID	3 bytes	Protocol ID. This field is reserved. It has a value of x`0'.
TYPE	2 bytes	Specifies whether the packet is IP or ARP.

Control Field Values

x`03'	Unnumbered Information (UI) frame. This is the normal, or unsequenced, way in which token-ring adapter data is transmitted through the network. TCP/IP sequences the data.
x`AF'	Exchange identification (XID) frame. This frame conveys the characteristics of the sending host.
x`E3'	Test frame. This frame supports testing of the transmission path, echoing back the data that is received.

802.3 Frame Headers

The MAC header for the 802.3 adapter is composed of two fields, as shown in the following MAC header table.

802.3 MAC header
Field	Length	Definition
DA	6 bytes	Destination address.
SA	6 bytes	Source address. If bit 0 of this field is set to 1, it indicates that routing information (RI) is present.

The LLC header for 802.3 is the same as for Token-Ring MAC header.

Internet Network-Level Protocols

The Internet network-level protocols handle machine-to-machine communication. In other words, this layer implements TCP/IP routing. These protocols accept requests to send packets (along with the network address of the destination machine) from the Transport layer, convert the packets to datagram format, and send them down to the Network Interface layer for further processing.

Figure 8. Network Layer of the TCP/IP Suite of Protocols. This illustration shows the various layers of the TCP/IP Suite of Protocols. From the top, the application layer consists of the application. The transport layer contains UDP and TCP. The network layer contains the network (hardware) interface. And finally, the hardware layer contains the physical network.

TCP/IP provides the protocols that are required to comply with RFC 1100, Official Internet Protocols, as well as other protocols commonly used by hosts in the Internet community.

Note: The use of Internet network, version, socket, service, and protocol numbers in TCP/IP also complies with RFC 1010, Assigned Numbers.

Address Resolution Protocol

The first network-level protocol is the Address Resolution Protocol (ARP). ARP dynamically translates Internet addresses into the unique hardware addresses on local area networks.

To illustrate how ARP works, consider two nodes, X and Y. If node X wishes to communicate with Y, and X and Y are on different local area networks (LANs), X and Y communicate through bridges, routers, or gateways, using IP addresses. Within a LAN, nodes communicate using low-level hardware addresses.

Nodes on the same segment of the same LAN use ARP to determine the hardware address of other nodes. First, node X broadcasts an ARP request for node Y's hardware address. The ARP request contains X's IP and hardware addresses, and Y's IP address. When Y receives the ARP request, it places an entry for X in its ARP cache (which is used to map quickly from IP address to hardware address), then responds directly to X with an ARP response containing Y's IP and hardware addresses. When node X receives Y's ARP response, it places an entry for Y in its ARP cache.

Once an ARP cache entry exists at X for Y, node X is able to send packets directly to Y without resorting again to ARP (unless the ARP cache entry for Y is deleted, in which case ARP is reused to contact Y).

Unlike most protocols, ARP packets do not have fixed-format headers. Instead, the message is designed to be useful with a variety of network technologies, such as:

Ethernet LAN adapter (supports both Ethernet and 802.3 protocols)

Token-ring network adapter

Fiber Distributed Data Interface (FDDI) network adapter

However, ARP does not translate addresses for Serial Line Interface Protocol (SLIP) or Serial Optical Channel Converter (SOC), since these are point-to-point connections.

The kernel maintains the translation tables, and the ARP is not directly available to users or applications. When an application sends an Internet packet to one of the interface drivers, the driver requests the appropriate address mapping. If the mapping is not in the table, an ARP broadcast packet is sent through the requesting interface driver to the hosts on the local area network.

Entries in the ARP mapping table are deleted after 20 minutes; incomplete entries are deleted after 3 minutes. To make a permanent entry in the ARP mapping tables, use the arp command with the pub parameter:

arp -s 802.3 host2 0:dd:0:a:8s:0 pub

When any host that supports ARP receives an ARP request packet, the host notes the IP and hardware addresses of the requesting system and updates its mapping table, if necessary. If the receiving host IP address does not match the requested address, the host discards the request packet. If the IP address does match, the receiving host sends a response packet to the requesting system. The requesting system stores the new mapping and uses it to transmit any similar pending Internet packets.

Internet Control Message Protocol

The second network-level protocol is the Internet Control Message Protocol (ICMP). ICMP is a required part of every IP implementation. ICMP handles error and control messages for IP. This protocol allows gateways and hosts to send problem reports to the machine sending a packet. ICMP does the following:

Tests whether a destination is alive and reachable

Reports parameter problems with a datagram header

Performs clock synchronization and transit time estimations

Obtains Internet addresses and subnet masks

Note: ICMP uses the basic support of IP as if it were a higher-level protocol. However, ICMP is actually an integral part of IP and must be implemented by every IP module.

ICMP provides feedback about problems in the communications environment, but does not make IP reliable. That is, ICMP does not guarantee that an IP packet is delivered reliably or that an ICMP message is returned to the source host when an IP packet is not delivered or is incorrectly delivered.

ICMP messages might be sent in any of the following situations:

When a packet cannot reach its destination

When a gateway host does not have the buffering capacity to forward a packet

When a gateway can direct a host to send traffic on a shorter route

TCP/IP sends and receives several ICMP message types. ICMP is embedded in the kernel, and no application programming interface (API) is provided to this protocol.

Internet Control Message Protocol Message Types

ICMP sends and receives the following message types:

echo request	Sent by hosts and gateways to test whether a destination is alive and reachable.
information request	Sent by hosts and gateways to obtain an Internet address for a network to which they are attached. This message type is sent with the network portion of IP destination address set to a value of 0.
timestamp request	Sent to request that the destination machine return its current value for time of day.
address mask request	Sent by host to learn its subnet mask. The host can either send to a gateway, if it knows the gateway address, or send a broadcast message.
destination unreachable	Sent when a gateway cannot deliver an IP datagram.
source quench	Sent by discarding machine when datagrams arrive too quickly for a gateway or host to process, in order to request that the original source slow down its rate of sending datagrams.
redirect message	Sent when a gateway detects that some host is using a nonoptimum route.
echo reply	Sent by any machine that receives an echo request in reply to the machine which sent the request.
information reply	Sent by gateways in response to requests for network addresses, with both the source and destination fields of the IP datagram specified.
timestamp reply	Sent with current value of time of day.
address mask reply	Sent to machines requesting subnet masks.
parameter problem	Sent when a host or gateway finds a problem with a datagram header.
time exceeded	Sent when the following are true: Each IP datagram contains a time-to-live counter (hop count), which is decremented by each gateway. A gateway discards a datagram because its hop count has reached a value of 0.
Internet Timestamp	Used to record the time stamps through the route.

Internet Protocol

The third network-level protocol is the Internet Protocol (IP), which provides unreliable, connectionless packet delivery for the Internet. IP is connectionless because it treats each packet of information independently. It is unreliable because it does not guarantee delivery, meaning, it does not require acknowledgments from the sending host, the receiving host, or intermediate hosts.

IP provides the interface to the network interface level protocols. The physical connections of a network transfer information in a frame with a header and data. The header contains the source address and the destination address. IP uses an Internet datagram that contains information similar to the physical frame. The datagram also has a header containing Internet addresses of both source and destination of the data.

IP defines the format of all the data sent over the Internet.

Figure 9. Internet Protocol Packet Header. This illustration shows the first 32 bits of a typical IP packet header. Table below lists the various entities.

IP Header Field Definitions

Version	Specifies the version of the IP used. The current version of the IP protocol is 4.
Length	Specifies the datagram header length, measured in 32-bit words.
Type of Service	Contains five subfields that specify the type of precedence, delay, throughput, and reliability desired for that packet. (The Internet does not guarantee this request.) The default settings for these five subfields are routine precedence, normal delay, normal throughput, and normal reliability. This field is not generally used by the Internet at this time. This implementation of IP complies with the requirements of the IP specification, RFC 791, Internet Protocol.
Total Length	Specifies the length of the datagram including both the header and the data measured in octets. Packet fragmentation at gateways, with reassembly at destinations, is provided. The total length of the IP packet can be configured on an interface-by-interface basis with the Web-based System Manager, wsm, the ifconfig command, or the System Management Interface Tool (SMIT) fast path, smit chinet. Use Web-based System Manager or SMIT to set the values permanently in the configuration database; use the ifconfig command to set or change the values in the running system.
Identification	Contains a unique integer that identifies the datagram.
Flags	Controls datagram fragmentation, along with the Identification field. The Fragment Flags specify whether the datagram can be fragmented and whether the current fragment is the last one.
Fragment Offset	Specifies the offset of this fragment in the original datagram measured in units of 8 octets.
Time to Live	Specifies how long the datagram can remain on the Internet. This keeps misrouted datagrams from remaining on the Internet indefinitely. The default time to live is 255 seconds.
Protocol	Specifies the high-level protocol type.
Header Checksum	Indicates a number computed to ensure the integrity of header values.
Source Address	Specifies the Internet address of the sending host.
Destination Address	Specifies the Internet address of the receiving host.
Options	Provides network testing and debugging. This field is not required for every datagram. End of Option List Indicates the end of the option list. It is used at the end of the final option, not at the end of each option individually. This option should be used only if the end of the options would not otherwise coincide with the end of the IP header. End of Option List is used if options exceed the length of the datagram. No Operation Provides alignment between other options; for example, to align the beginning of a subsequent option on a 32-bit boundary. Loose Source and Record Route Provides a means for the source of an Internet datagram to supply routing information used by the gateways in forwarding the datagram to a destination and in recording the route information. This is a loose source route: the gateway or host IP is allowed to use any route of any number of other intermediate gateways in order to reach the next address in the route. Strict Source and Record Route Provides a means for the source of an Internet datagram to supply routing information used by the gateways in forwarding the datagram to a destination and in recording the route information. This is a strict source route: In order to reach the next gateway or host specified in the route, the gateway or host IP must send the datagram directly to the next address in the source route and only to the directly connected network that is indicated in the next address. Record Route Provides a means to record the route of an Internet datagram. Stream Identifier Provides a way for a stream identifier to be carried through networks that do not support the stream concept. Internet Timestamp Provides a record of the time stamps through the route.

Outgoing packets automatically have an IP header prefixed to them. Incoming packets have their IP header removed before being sent to the higher-level protocols. The IP protocol provides for the universal addressing of hosts in the Internet network.

Internet Transport-Level Protocols

The TCP/IP transport-level protocols allow application programs to communicate with other application programs.

Figure 10. Transport Layer of the TCP/IP Suite of Protocols. This illustration shows the various layers of the TCP/IP Suite of Protocols. From the top, the application layer consists of the application. The transport layer contains UDP and TCP. The network layer contains the network (hardware) interface. And finally, the hardware layer contains the physical network.

User Datagram Protocol (UDP) and the TCP are the basic transport-level protocols for making connections between Internet hosts. Both TCP and UDP allow programs to send messages to and receive messages from applications on other hosts. When an application sends a request to the Transport layer to send a message, UDP and TCP break the information into packets, add a packet header including the destination address, and send the information to the Network layer for further processing. Both TCP and UDP use protocol ports on the host to identify the specific destination of the message.

Higher-level protocols and applications use UDP to make datagram connections and TCP to make stream connections. The operating system sockets interface implements these protocols.

User Datagram Protocol

Sometimes an application on a network needs to send messages to a specific application or process on another network. The UDP provides a datagram means of communication between applications on Internet hosts. Because senders do not know which processes are active at any given moment, UDP uses destination protocol ports (or abstract destination points within a machine), identified by positive integers, to send messages to one of multiple destinations on a host. The protocol ports receive and hold messages in queues until applications on the receiving network can retrieve them.

Since UDP relies on the underlying IP to send its datagrams, UDP provides the same connectionless message delivery as IP. It offers no assurance of datagram delivery or duplication protection. However, UDP does allow the sender to specify source and destination port numbers for the message and calculates a checksum of both the data and header. These two features allow the sending and receiving applications to ensure the correct delivery of a message.

Figure 11. User Datagram Protocol (UDP) Packet Header. This illustration shows the first 32 bits of the UDP packet header. The first 16 bits contain the source port number and the length. The second 16 bits contain the destination port number and the checksum.

Applications that require reliable delivery of datagrams must implement their own reliability checks when using UDP. Applications that require reliable delivery of streams of data should use TCP.

UDP Header Field Definitions

Source Port Number	Address of the protocol port sending the information.
Destination Port Number	Address of the protocol port receiving the information.
Length	Length in octets of the UDP datagram.
Checksum	Provides a check on the UDP datagram using the same algorithm as the IP.

The applications programming interface (API) to UDP is a set of library subroutines provided by the sockets interface.

Transmission Control Protocol

TCP provides reliable stream delivery of data between Internet hosts. Like UPD, TCP uses Internet Protocol, the underlying protocol, to transport datagrams, and supports the block transmission of a continuous stream of datagrams between process ports. Unlike UDP, TCP provides reliable message delivery. TCP ensures that data is not damaged, lost, duplicated, or delivered out of order to a receiving process. This assurance of transport reliability keeps applications programmers from having to build communications safeguards into their software.

The following are operational characteristics of TCP:

Basic Data Transfer	TCP can transfer a continuous stream of 8-bit octets in each direction between its users by packaging some number of bytes into segments for transmission through the Internet system. TCP implementation allows a segment size of at least 1024 bytes. In general, TCP decides when to block and forward packets at its own convenience.
Reliability	TCP must recover data that is damaged, lost, duplicated, or delivered out of order by the Internet. TCP achieves this reliability by assigning a sequence number to each octet it transmits and requiring a positive acknowledgment (ACK) from the receiving TCP. If the ACK is not received within the time-out interval, the data is retransmitted. The TCP retransmission time-out value is dynamically determined for each connection, based on round-trip time. At the receiver, the sequence numbers are used to correctly order segments that may be received out of order and to eliminate duplicates. Damage is handled by adding a checksum to each segment transmitted, checking it at the receiver, and discarding damaged segments.
Flow Control	TCP governs the amount of data sent by returning a window with every ACK to indicate a range of acceptable sequence numbers beyond the last segment successfully received. The window indicates an allowed number of octets that the sender may transmit before receiving further permission.
Multiplexing	TCP allows many processes within a single host to use TCP communications facilities simultaneously. TCP receives a set of addresses of ports within each host. TCP combines the port number with the network address and the host address to uniquely identify each socket. A pair of sockets uniquely identifies each connection.
Connections	TCP must initialize and maintain certain status information for each data stream. The combination of this information, including sockets, sequence numbers, and window sizes, is called a connection. Each connection is uniquely specified by a pair of sockets identifying its two sides.
Precedence and Security	Users of TCP may indicate the security and precedence of their communications. Default values are used when these features are not needed.

The TCP Packet Header figure illustrates these characteristics.

Figure 12. Transmission Control Protocol (TCP) Packet Header. This illustration shows what is contained in the TCP packet header. The individual entities are listed in the text below.

TCP Header Field Definitions

Source Port	Identifies the port number of a source application program.
Destination Port	Identifies the port number of a destination application program.
Sequence Number	Specifies the sequence number of the first byte of data in this segment.
Acknowledgment Number	Identifies the position of the highest byte received.
Data Offset	Specifies the offset of data portion of the segment.
Reserved	Reserved for future use.
Code	Control bits to identify the purpose of the segment: URG Urgent pointer field is valid. ACK Acknowledgement field is valid. PSH Segment requests a PUSH. RTS Resets the connection. SYN Synchronizes the sequence numbers. FIN Sender has reached the end of its byte stream.
Window	Specifies the amount of data the destination is willing to accept.
Checksum	Verifies the integrity of the segment header and data.
Urgent Pointer	Indicates data that is to be delivered as quickly as possible. This pointer specifies the position where urgent data ends.
Options	End of Option List Indicates the end of the option list. It is used at the final option, not at the end of each option individually. This option needs to be used only if the end of the options would not otherwise coincide with the end of the TCP header. No Operation Indicates boundaries between options. Can be used between other options; for example, to align the beginning of a subsequent option on a word boundary. There is no guarantee that senders will use this option, so receivers must be prepared to process options even if they do not begin on a word boundary. Maximum Segment Size Indicates the maximum segment size TCP can receive. This is only sent in the initial connection request.

The applications programming interface to TCP consists of a set of library subroutines provided by the sockets interface.

Internet Application-Level Protocols

TCP/IP implements higher-level Internet protocols at the application program level.

Figure 13. Applicaton Layer of the TCP/IP Suite of Protocols. This illustration shows the various layers of the TCP/IP Suite of Protocols. From the top, the application layer consists of the application. The transport layer contains UDP and TCP. The network layer contains the network (hardware) interface. And finally, the hardware layer contains the physical network.

When an application needs to send data to another application on another host, the applications send the information down to the transport level protocols to prepare the information for transmission.

The official Internet application-level protocols include:

Domain Name Protocol (DOMAIN)

Exterior Gateway Protocol (EGP)

File Transfer Protocol (FTP)

Name/Finger Protocol (FINGER)

Telnet Protocol (TELNET)

Trivial File Transfer Protocol (TFTP)

TCP/IP implements other higher-level protocols that are not official Internet protocols but are commonly used in the Internet community at the application program level. These protocols include:

Distributed Computer Network (DCN) Local-Network Protocol (HELLO)

Remote Command Execution Protocol (EXEC)

Remote Login Protocol (LOGIN)

Remote Shell Protocol (SHELL)

Routing Information Protocol (RIP)

Time Server Protocol (TIMED).

TCP/IP does not provide APIs to any of these application-level protocols.

Domain Name Protocol

The Domain Name Protocol (DOMAIN) allows a host in a domain to act as a name server for other hosts within the domain. DOMAIN uses UDP or TCP as its underlying protocol and allows a local network to assign host names within its domain independently from other domains. Normally, the DOMAIN protocol uses UDP. However, if the UDP response is truncated, TCP can be used. The DOMAIN protocol in TCP/IP supports both.

In the DOMAIN hierarchical naming system, local resolver routines can resolve Internet names and addresses using a local name resolution database maintained by the named daemon. If the name requested by the host is not in the local database, the resolver routine queries a remote DOMAIN name server. In either case, if the name resolution information is unavailable, the resolver routines attempt to use the /etc/hosts file for name resolution.

Note: TCP/IP configures local resolver routines for the DOMAIN protocol if the local file /etc/resolv.conf exists. If this file does not exist, the TCP/IP configures the local resolver routines to use the /etc/hosts database.

TCP/IP implements the DOMAIN protocol in the named daemon and in the resolver routines and does not provide an API to this protocol.

Exterior Gateway Protocol

Exterior Gateway Protocol (EGP) is the mechanism that allows the exterior gateway of an autonomous system to share routing information with exterior gateways on other autonomous systems.

Autonomous Systems

An autonomous system is a group of networks and gateways for which one administrative authority has responsibility. Gateways are interior neighbors if they reside on the same autonomous system and exterior neighbors if they reside on different autonomous systems. Gateways that exchange routing information using EGP are said to be EGP peers or neighbors. Autonomous system gateways use EGP to provide access information to their EGP neighbors.

EGP allows an exterior gateway to ask another exterior gateway to agree to exchange access information, continually checks to ensure that its EGP neighbors are responding, and helps EGP neighbors to exchange access information by passing routing update messages.

EGP restricts exterior gateways by allowing them to advertise only those destination networks reachable entirely within that gateway's autonomous system. Thus, an exterior gateway using EGP passes along information to its EGP neighbors but does not advertise access information about its EGP neighbors outside its autonomous system.

EGP does not interpret any of the distance metrics that appear in routing update messages from other protocols. EGP uses the distance field to specify whether a path exists (a value of 255 means that the network is unreachable). The value cannot be used to compute the shorter of two routes unless those routes are both contained within a single autonomous system. Therefore, EGP cannot be used as a routing algorithm. As a result, there will be only one path from the exterior gateway to any network.

In contrast to the Routing Information Protocol (RIP), which can be used within an autonomous system of Internet networks that dynamically reconfigure routes, EGP routes are predetermined in the /etc/gated.conf file. EGP assumes that IP is the underlying protocol.

EGP Message Types

Neighbor Acquisition Request	Used by exterior gateways to request to become neighbors of each other.
Neighbor Acquisition Reply	Used by exterior gateways to accept the request to become neighbors.
Neighbor Acquisition Refusal	Used by exterior gateways to deny the request to become neighbors. The refusal message includes reasons for refusal, such as `out of table space`.
Neighbor Cease	Used by exterior gateways to cease the neighbor relationship. The cease message includes reasons for ceasing, such as `going down`.
Neighbor Cease Acknowledgment	Used by exterior gateways to acknowledge the request to cease the neighbor relationship.
Neighbor Hello	Used by exterior gateways to determine connectivity. A gateway issues a `Hello` message and another gateway issues an `I Heard You` message.
I Heard You	Used by exterior gateways to reply to a `Hello` message. The `I Heard You` message includes the access of the answering gateway and, if the gateway is unreachable, a reason for lack of access, such as `You are unreachable because of problems with my network interface`.
NR Poll	Used by exterior gateways to query neighbor gateways about their ability to reach other gateways.
Network Reachability	Used by exterior gateways to answer the `NR Poll` message. For each gateway in the message, the `Network Reachability` message contains information on the addresses that gateway can reach through its neighbors.
EGP Error	Used by exterior gateways to respond to EGP messages that contain bad checksums or have fields containing incorrect values.

TCP/IP implements the EGP protocol in the gated server command and does not provide an API to this protocol.

File Transfer Protocol

File Transfer Protocol (FTP) allows hosts to transfer data among dissimilar hosts, as well as files between two foreign hosts indirectly. FTP provides for such tasks as listing remote directories, changing the current remote directory, creating and removing remote directories, and transferring multiple files in a single request. FTP keeps the transport secure by passing user and account passwords to the foreign host. Although FTP is designed primarily to be used by applications, it also allows interactive user-oriented sessions.

FTP uses reliable stream delivery (TCP/IP) to send the files and uses a Telnet connection to transfer commands and replies. FTP also understands several basic file formats including NETASCII, IMAGE, and Local 8.

TCP/IP implements FTP in the ftp user command and the ftpd server command and does not provide an applications programming interface (API) to this protocol.

When creating anonymous ftp users and directories please be sure that the home directory for users ftp and anonymous (for example, /u/ftp) is owned by root and does not allow write permissions (for example, dr-xr-xr-x). The script /usr/samples/tcpip/anon.ftp can be used to create these accounts, files and directories.

Telnet Protocol

The Telnet Protocol (TELNET) provides a standard method for terminal devices and terminal-oriented processes to interface. TELNET is commonly used by terminal emulation programs that allow you to log into a remote host. However, TELNET can also be used for terminal-to-terminal communication and interprocess communication. TELNET is also used by other protocols (for example, FTP) for establishing a protocol control channel.

TCP/IP implements TELNET in the tn, telnet, or tn3270 user commands. The telnetd daemon does not provide an API to TELNET.

TCP/IP supports the following TELNET options which are negotiated between the client and server:

BINARY TRANSMISSION (Used in tn3270 sessions)	Transmits characters as binary data.
SUPPRESS GO_AHEAD (The operating system suppresses GO-AHEAD options.)	Indicates that when in effect on a connection between a sender of data and the receiver of the data, the sender need not transmit a GO_AHEAD option. If the GO_AHEAD option is not desired, the parties in the connection will probably suppress it in both directions. This action must take place in both directions independently.
TIMING MARK (Recognized, but has a negative response)	Makes sure that previously transmitted data has been completely processed.
EXTENDED OPTIONS LIST	Extends the TELNET option list for another 256 options. Without this option, the TELNET option allows only 256 options.
ECHO (User-changeable command)	Transmits echo data characters already received back to the original sender.
TERM TYPE	Enables the server to determine the type of terminal connected to a user TELNET program.
SAK (Secure Attention Key)	Establishes the environment necessary for secure communication between you and the system.
NAWS (Negotiate About Window Size)	Enables client and server to negotiate dynamically for the window size. This is used by applications that support changing the window size.

Note: TELNET must allow transmission of eight bit characters when not in binary mode in order to implement ISO 8859 Latin code page. This is necessary for internationalization of the TCP/IP commands.

Trivial File Transfer Protocol

The Trivial File Transfer Protocol (TFTP) can read and write files to and from a foreign host. Because TFTP uses the unreliable User Datagram Protocol to transport files, it is generally quicker than FTP. Like FTP, TFTP can transfer files as either NETASCII characters or as 8-bit binary data. Unlike FTP, TFTP cannot be used to list or change directories at a foreign host and it has no provisions for security like password protection. Also, data can be written or retrieved only in public directories.

The TCP/IP implements TFTP in the tftp and utftp user commands and in the tftpd server command. The utftp command is a form of the tftp command for use in a pipe. TCP/IP does not provide an API to this protocol.

Name/Finger Protocol

The Name/Finger Protocol (FINGER) is an application-level Internet protocol that provides an interface between the finger command and the fingerd daemon. The fingerd daemon returns information about the users currently logged in to a specified remote host. If you execute the finger command specifying a user at a particular host, you will obtain specific information about that user. The FINGER Protocol must be present at the remote host and at the requesting host. FINGER uses Transmission Control Protocol (TCP) as its underlying protocol.

Note: TCP/IP does not provide an API to this protocol.

Distributed Computer Network Local-Network Protocol

Local-Network Protocol (HELLO) is an interior gateway protocol designed for use within autonomous systems. (For more information, see Autonomous Systems.) HELLO maintains connectivity, routing, and time-keeping information. It allows each machine in the network to determine the shortest path to a destination based on time delay and then dynamically updates the routing information to that destination.

The gated daemon provides the Distributed Computer Network (DCN) local network protocol.

Remote Command Execution Protocol

The rexec user command and the rexecd daemon provide the remote command execution protocol, allowing users to run commands on a compatible remote host.

Remote Login Protocol

The rlogin user command and the rlogind daemon provide the remote login protocol, allowing users to log in to a remote host and use their terminals as if they were directly connected to the remote host.

Remote Shell Protocol

The rsh user command and the rshd daemon provide the remote command shell protocol, allowing users to open a shell on a compatible foreign host for running commands.

Routing Information Protocol

Routing Information Protocol (RIP) and the routed and gated daemons that implement it keep track of routing information based on gateway hops and maintain kernel-routing table entries.

Time Server Protocol

The timed daemon is used to synchronize one host with the time of other hosts. It is based on the client/server concept.

Assigned Numbers

For compatibility with the general network environment, well-known numbers are assigned for the Internet versions, networks, ports, protocols, and protocol options. Additionally, well-known names are also assigned to machines, networks, operating systems, protocols, services, and terminals. TCP/IP complies with the assigned numbers and names defined in RFC 1010, Assigned Numbers.

The Internet Protocol (IP)defines a 4-bit field in the IP header that identifies the version of the general Internetwork protocol in use. For IP, this version number in decimal is 4. For details on the assigned numbers and names used by TCP/IP, see /etc/protocols and /etc/services files included with TCP/IP. For further details on the assigned numbers and names, refer to RFC 1010 and the /etc/services file.