How the Internet Works: the Internet Layer

IP Addresses

If MAC addresses were used to send data between networks, each network would have to keep track of every MAC address in the world, since MAC addresses are not structured hierarchically. This is where the internet protocol (IP) comes in; each node on the internet also has an IP address. IP addresses are structured hierarchically, so routers (a router is a device that sends traffic between networks) can use them to drill down from high-level matches to the specific address needed.

IP addresses come in two versions. The older and universally adopted version is IPv4, first deployed in 1983. An IPv4 address has 32 bits, allowing for 4,294,967,296 (2 to the 32nd power) possible IP addresses.

IPv4

An IPv4 address’s 32 bits are organized into four bytes. The address is written as four decimal numbers separated by periods, each with a value from 0 to 255 (for example, 169.254.190.93).
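To make the notation concrete, here is a minimal Python sketch of the conversion between the dotted form and the underlying 32-bit number. The function names are made up for this illustration; the standard library's ipaddress module does the same job and is used only as a cross-check.

```python
import ipaddress


def dotted_quad_to_int(address: str) -> int:
    """Pack the four bytes of a dotted-quad string into one 32-bit integer."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= octet <= 255 for octet in octets):
        raise ValueError(f"not a valid IPv4 address: {address!r}")
    value = 0
    for octet in octets:
        # Shift the bytes collected so far one byte left, then append the next.
        value = (value << 8) | octet
    return value


def int_to_dotted_quad(value: int) -> str:
    """Split a 32-bit integer back into four decimal bytes."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


address = "169.254.190.93"
as_int = dotted_quad_to_int(address)
print(as_int, format(as_int, "032b"))

# Cross-check against the standard library's own parser.
assert as_int == int(ipaddress.IPv4Address(address))
assert int_to_dotted_quad(as_int) == address
```

The dotted form is only a convenience for people; routers work with the 32-bit value.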

An IPv4 packet has this structure:

  1. Version. Four bits; identifies the version of the IP packet. For IPv4 packets, this is always 0100, or 4.
  2. IHL, or Internet Header Length. Since an IPv4 header may or may not include optional parameters (see 14 below), the header size can vary, so the IHL specifies the size, counted in 32-bit words.
  3. DSCP, or Differentiated Services Code Point. Some network hops allow for the use of different types of transmissions, such as choosing between low-latency and low-loss transmissions. The DSCP field is a way of specifying these.
  4. ECN, or Explicit Congestion Notification. Allows end-to-end notification of network congestion, so that senders can slow down before packets have to be dropped.
  5. Total Length. Total size of the packet, including both headers and payload.
  6. Identification. If a Transport Layer PDU is larger than the maximum packet limit size, it gets broken up into a set of smaller fragments, each with the same ID number. It can then be reassembled at the other end.
  7. Flags. Three bits: 1) reserved, always 0. 2) DF (don’t fragment). If this flag is set and the packet would have to be fragmented to continue, the packet gets dropped instead. 3) MF (more fragments). If this flag is not set, then the fragment is the last one; otherwise there are more to come.
  8. Fragment offset. Essentially, this lets the receiver put fragments back in order. The offset is the position, counted in units of eight bytes, at which the current fragment starts relative to the beginning of the original unfragmented payload.
  9. TTL, or Time to Live. In theory, this field sets how much time should be allowed to elapse before the packet either reaches its destination or is dropped. In practice, the field contains the maximum number of hops a packet can take before it reaches its destination or is dropped. Each router decrements the TTL field by one, and the packet is dropped if the field reaches zero. (IPv6 replaces TTL with a Hop Limit field, reflecting this practice.)
  10. Protocol. Eight bits, specifying the protocol of the payload, using numbers assigned by IANA (the Internet Assigned Numbers Authority, which maintains the number registries for internet protocols). For example, 6 is TCP, 17 is UDP, and 41 is IPv6 encapsulation, which is used to carry an IPv6 packet across an IPv4 hop.
  11. Checksum. Used for identifying corrupt headers. It covers only the header, not the payload.
  12. Source IP address.
  13. Destination IP address.
  14. Options. Allows the specification of a number of optional parameters. (Usually not used.)
  15. Payload (a Transport Layer PDU).
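The field list above can be read straight out of the raw bytes. The following Python sketch unpacks the fixed 20-byte part of an IPv4 header with the standard struct module. The function name, the dictionary keys, and the sample packet are invented for illustration, and the sample's checksum is left at zero as a placeholder.

```python
import socket
import struct


def parse_ipv4_header(packet: bytes) -> dict:
    """Split an IPv4 packet into the fields listed above."""
    if len(packet) < 20:
        raise ValueError("an IPv4 header is at least 20 bytes long")
    (version_ihl, dscp_ecn, total_length, identification, flags_offset,
     ttl, protocol, checksum, source, destination) = struct.unpack(
        "!BBHHHBBH4s4s", packet[:20])
    ihl = version_ihl & 0x0F   # header length in 32-bit words
    header_bytes = ihl * 4
    return {
        "version": version_ihl >> 4,
        "ihl": ihl,
        "dscp": dscp_ecn >> 2,
        "ecn": dscp_ecn & 0x03,
        "total_length": total_length,
        "identification": identification,
        "df": bool(flags_offset & 0x4000),
        "mf": bool(flags_offset & 0x2000),
        # The offset field counts in units of eight bytes.
        "fragment_offset_bytes": (flags_offset & 0x1FFF) * 8,
        "ttl": ttl,
        "protocol": protocol,
        "checksum": checksum,
        "source": socket.inet_ntoa(source),
        "destination": socket.inet_ntoa(destination),
        "options": packet[20:header_bytes],
        "payload": packet[header_bytes:total_length],
    }


# Hypothetical sample: no options, DF set, TTL 64, TCP, 4-byte payload.
sample = struct.pack(
    "!BBHHHBBH4s4s", 0x45, 0, 24, 0x1C46, 0x4000, 64, 6, 0,
    socket.inet_aton("192.0.2.1"), socket.inet_aton("198.51.100.7")) + b"data"

for name, value in parse_ipv4_header(sample).items():
    print(f"{name}: {value}")
```

Note that Version and IHL share the first byte, and the three flags share two bytes with the fragment offset, which is why the sketch uses shifts and masks to separate them.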

IPv6

With the rapid growth of the internet in the 1990s, it became clear by around 2000 that being limited to about four billion IP addresses would cause us to run out of them. (That has indeed happened: IANA handed out its last unallocated address blocks in 2011, and the regional registries have been exhausting their own pools since, with the European registry running out in 2019.) Since then, there has been an ongoing initiative to adopt a newer version of IP address, called IPv6. (For example, most mobile devices support IPv6 addressing.) IPv6 addresses increase the number of bits to 128, which allows for enough addresses for everyone in the world to have about 48 octillion of them. In other words, more than enough to cover the needs of the foreseeable future.
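The arithmetic behind those figures is short enough to check. This is a rough sketch; the population figure is an assumed round number, so the per-person result is only an order-of-magnitude estimate.

```python
# Assumption: a round world population figure, used only for scale.
WORLD_POPULATION = 7_000_000_000

ipv4_total = 2 ** 32    # 32-bit addresses
ipv6_total = 2 ** 128   # 128-bit addresses
per_person = ipv6_total // WORLD_POPULATION

print(f"IPv4 addresses: {ipv4_total:,}")
print(f"IPv6 addresses: {ipv6_total:.3e}")
print(f"IPv6 addresses per person: {per_person:.3e}")
```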

An IPv6 packet has this structure:

  1. Version. Four bits: identifies the version of the IP packet. For IPv6 packets, this is always 0110, or 6.
  2. Traffic class. Analogous to the IPv4 DSCP and ECN fields combined.
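  3. Flow label. Lets a source mark a sequence of packets as belonging to the same flow, so that routers can treat them consistently. (The fixed IPv6 header has no counterpart to the IPv4 Identification field; fragmentation is handled by an extension header.)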
  4. Payload length. Size of the payload.
  5. Next header. Specifies the type of the next header, usually the protocol of the payload as with the IPv4 Protocol field. (“Usually” because the value could be the type of an *extension header*, which is the header of an extension that carries optional information. So, extension headers also replace the IPv4 Options field, and, as with that field, are not typically used.)
  6. Hop limit. Replaces IPv4 TTL field. The maximum number of hops a packet can take before it reaches its destination or is dropped. Each router decrements the Hop limit field by one, and the packet is dropped if the field reaches zero.
  7. Source address. The 128-bit IPv6 address of the sending node.
  8. Destination address. The 128-bit IPv6 address of the receiving node.
  9. Payload (a Transport Layer PDU).
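Because the IPv6 header has a fixed length of 40 bytes, it is simpler to unpack than its IPv4 counterpart. Here is a minimal Python sketch; the function name, dictionary keys, and sample packet are invented for illustration, and the sketch assumes there are no extension headers.

```python
import ipaddress
import struct


def parse_ipv6_header(packet: bytes) -> dict:
    """Split an IPv6 packet into the fields listed above."""
    if len(packet) < 40:
        raise ValueError("the fixed IPv6 header is 40 bytes long")
    first_word, payload_length, next_header, hop_limit = struct.unpack(
        "!IHBB", packet[:8])
    return {
        # Version, traffic class and flow label share the first 32 bits.
        "version": first_word >> 28,
        "traffic_class": (first_word >> 20) & 0xFF,
        "flow_label": first_word & 0xFFFFF,
        "payload_length": payload_length,
        "next_header": next_header,
        "hop_limit": hop_limit,
        "source": str(ipaddress.IPv6Address(packet[8:24])),
        "destination": str(ipaddress.IPv6Address(packet[24:40])),
        "payload": packet[40:40 + payload_length],
    }


# Hypothetical sample: flow label 0x12345, UDP payload of 4 bytes, hop limit 64.
first_word = (6 << 28) | (0 << 20) | 0x12345
sample = (struct.pack("!IHBB", first_word, 4, 17, 64)
          + ipaddress.IPv6Address("2001:db8::1").packed
          + ipaddress.IPv6Address("2001:db8::2").packed
          + b"data")

for name, value in parse_ipv6_header(sample).items():
    print(f"{name}: {value}")
```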

Notable in IPv6 is the absence of a Checksum field. In IPv4, the checksum needs to be recalculated on every hop, since decrementing the TTL alters the header. The original idea was that a corrupt packet needs to be detected as soon as possible and dropped, to avoid the overhead of sending it all the way to the destination before it can be detected. However, experience has shown that the overhead of recalculating the checksum for every good packet on every hop considerably outweighs the overhead of possibly adding several hops to corrupt packets, since only a small percentage of packets are corrupt to begin with. So IPv6 did away with the Checksum, instead expecting the Transport Layer protocol to check the integrity of its PDU upon arrival at the endpoint.
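To see why every hop costs a recalculation in IPv4, it helps to look at how the header checksum is computed: the header is summed as 16-bit words using one's complement arithmetic, and the result is inverted. The sketch below implements that calculation and a simplified forwarding step. The function names and the sample header are assumptions made for this illustration, and the sketch recomputes the whole checksum rather than updating it incrementally.

```python
import struct


def ipv4_header_checksum(header: bytes) -> int:
    """Inverted one's complement sum of the header's 16-bit words."""
    if len(header) % 2:
        raise ValueError("header length must be a multiple of two bytes")
    total = 0
    for (word,) in struct.iter_unpack("!H", header):
        total += word
    while total >> 16:
        # Fold any carry out of the low 16 bits back in.
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def with_checksum(header: bytes) -> bytes:
    """Return the header with its checksum field (bytes 10-11) filled in."""
    zeroed = header[:10] + b"\x00\x00" + header[12:]
    return zeroed[:10] + struct.pack("!H", ipv4_header_checksum(zeroed)) + zeroed[12:]


def forward_one_hop(header: bytes) -> bytes:
    """Decrement the TTL (byte 8), then recompute the checksum."""
    ttl = header[8]
    if ttl <= 1:
        raise ValueError("TTL expired; the packet is dropped")
    decremented = header[:8] + bytes([ttl - 1]) + header[9:]
    return with_checksum(decremented)


# Sample 20-byte header (UDP, TTL 64) with the checksum field zeroed.
sample = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
sent = with_checksum(sample)
hopped = forward_one_hop(sent)

print("checksum as sent:   ", sent[10:12].hex())
print("checksum after hop: ", hopped[10:12].hex())
# A header whose checksum is valid sums to zero when checked.
print("still valid:", ipv4_header_checksum(hopped) == 0)
```

Changing a single byte (the TTL) changes the checksum, so every router along the path has to write a new one.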

Routers

Routers are used to send traffic from one network to another. Every local network that is part of the internet has at least one router. There are also routers that are not part of any local network, typically those which handle longer internet trips. The router is responsible for opening (or “de-encapsulating”) a frame, looking at the enclosed packet’s destination IP address, determining where to send the packet based on that address, re-encapsulating the packet using the frame protocol of the link it is about to travel over (usually some form of Ethernet, but not always), and forwarding the frame.

Each router has a routing table, which is a list of the network addresses that it can send to, along with where to forward packets for each. When a router receives a packet, it searches its routing table to determine the closest match to the packet’s destination IP address, and forwards the packet there. To determine the closest match, the router goes from the specific to the general, using this (somewhat simplified) logic:

  1. If the routing prefix (see How Routers Evaluate IP Addresses below) on a table entry is the same as the IP address’s, then the destination node is local to that router. Send the packet there.
  2. Otherwise, send the packet to the closest match to the IP address in the routing table. The closest match uses a “longest match wins” principle: the more leading bits of the destination address that match a table entry’s address, the closer the match.
  3. If there are no IP addresses in the routing table that even partially match, send the packet to the default gateway. This is the IP address of another router, which will repeat this process. Eventually, the packet will find its way to a router that is more directly connected to the node with the destination IP address.

This is somewhat simplified because (among other reasons) the routers that are responsible for long-distance traffic (called core routers) don’t have local nodes or default gateways. Their routing tables only contain the addresses of other routers, so they work exclusively with step 2.
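The lookup logic above can be sketched in a few lines of Python using the standard ipaddress module. The routing table entries, the next-hop names, and the function name are all hypothetical. A real router uses specialized data structures rather than a linear scan, so treat this as an illustration of the “longest match wins” rule only.

```python
import ipaddress

# Hypothetical routing table: (destination network, where to forward).
ROUTES = [
    (ipaddress.ip_network("24.158.0.0/16"), "router-a"),
    (ipaddress.ip_network("24.158.12.0/24"), "router-b"),
    (ipaddress.ip_network("169.254.190.0/24"), "local"),
]
DEFAULT_GATEWAY = "default-gateway"


def next_hop(destination: str) -> str:
    """Pick the matching route with the longest routing prefix."""
    address = ipaddress.ip_address(destination)
    matches = [(network, hop) for network, hop in ROUTES if address in network]
    if not matches:
        # Nothing matches even partially: hand the packet to the default gateway.
        return DEFAULT_GATEWAY
    _, hop = max(matches, key=lambda match: match[0].prefixlen)
    return hop


for destination in ("24.158.12.9", "24.158.200.1", "169.254.190.93", "8.8.8.8"):
    print(destination, "->", next_hop(destination))
```

The address 24.158.12.9 matches both of the first two entries, and the 24-bit prefix wins over the 16-bit one because it is the more specific match.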

How Routers Evaluate IP Addresses

An IP address has two logical parts: the routing prefix (the first group), which identifies a network, and the host identifier (the second group), which identifies a node on that network. The more bits that are used for the routing prefix, the fewer can be used for individual host identifiers. So the larger the routing prefix, the smaller the network.

For example, one of the Charter Communications networks has all the addresses from 24.158.0.0 to 24.158.255.255. So, its routing prefix is 24.158, and there are 65,534 possible host identifiers. (But, you may ask, two bytes have 65,536 possible values, so why are there two missing identifiers? Because the lowest and highest addresses in the range are reserved: the lowest is the network address, which identifies the network itself, and the highest is the broadcast address. The broadcast address is used to send to every node on the network, typically for some form of resource discovery.)

A typical office network uses the first three bytes for the routing prefix, and therefore has 254 numbers that can be used for host identifiers.
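The host counts quoted above follow directly from the number of bits left over for the host identifier. A minimal sketch, assuming the usual convention of two reserved addresses per network (the function name is made up for this illustration):

```python
def usable_hosts(prefix_bits: int) -> int:
    """Number of host identifiers left once two addresses are reserved."""
    if not 0 <= prefix_bits <= 30:
        raise ValueError("this sketch only covers prefixes of 0 to 30 bits")
    host_bits = 32 - prefix_bits
    return 2 ** host_bits - 2


print(usable_hosts(16))  # two-byte prefix, as in the 24.158 example: 65534
print(usable_hosts(24))  # three-byte prefix, as in a typical office: 254
```

Each extra bit given to the routing prefix halves the space available for hosts.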

To distinguish the routing prefix from the host identifier, the router uses a subnet mask, also known as a netmask. The netmask uses the same format as an ordinary IP address, with bits set to 1 for the routing prefix, and set to 0 for the host identifier. Therefore, in the office network example above, the subnet mask is 255.255.255.0. It follows that the logical AND of the subnet mask and any IP address on the network will be the routing prefix. It further follows that the logical AND of the IP address and the one’s complement of the subnet mask (0.0.0.255, in our case) will be the host identifier. (The one’s complement of a number is the number with all its bits inverted.)

Let’s look at an example of how this works. Suppose one of the nodes on a 254-node network has the IP address 169.254.190.93. The routing prefix would be 169.254.190, and the host identifier would be 93. Now, suppose our router receives a packet with the destination address 169.254.190.93. The router will first apply the subnet mask:

Decimal               169        254        190        93
Binary                10101001   11111110   10111110   01011101
Netmask               11111111   11111111   11111111   00000000
Binary AND Netmask    10101001   11111110   10111110   00000000

The binary AND netmask row in this table is 169.254.190.0 in decimal, which is the network address of the router’s own network. So, the router knows that the destination IP address is in its own network, and ANDs the one’s complement of the subnet mask with it:

Decimal               169        254        190        93
Binary                10101001   11111110   10111110   01011101
One’s c. of Netmask   00000000   00000000   00000000   11111111
Binary AND One’s c.   00000000   00000000   00000000   01011101

The result in decimal is 0.0.0.93, which is the destination’s host identifier. Once the router has calculated this host identifier, it finds the MAC address of the host (there are various ways to do this, depending on the actual network configuration) and sends the packet to it.
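The two calculations in this example can be reproduced with Python's standard ipaddress module. This is a sketch for checking the arithmetic, not a description of how a router is implemented, and the helper name binary_octets is invented for it.

```python
import ipaddress


def binary_octets(value: int) -> str:
    """Format a 32-bit value as four space-separated binary bytes."""
    return " ".join(format((value >> shift) & 0xFF, "08b") for shift in (24, 16, 8, 0))


destination = ipaddress.IPv4Address("169.254.190.93")
network = ipaddress.IPv4Network("169.254.190.0/24")

netmask = int(network.netmask)    # 255.255.255.0
inverted = int(network.hostmask)  # 0.0.0.255, the one's complement of the netmask

prefix = int(destination) & netmask
host = int(destination) & inverted

print("Address        ", binary_octets(int(destination)))
print("Netmask        ", binary_octets(netmask))
print("AND netmask    ", binary_octets(prefix), "=", ipaddress.IPv4Address(prefix))
print("AND complement ", binary_octets(host), "=", ipaddress.IPv4Address(host))

# The destination is local if its routing prefix matches the network's.
is_local = prefix == int(network.network_address)
print("Local to this network:", is_local)
```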

The Internet Layer determines the destination for data and sends it there. However, it does not ensure that the data arrives intact. Also, a single node can have many different applications that use the internet to send and receive data (a common example is a browser and an email application), and the Internet Layer doesn’t do anything to distinguish between these. Data reliability and application-level communication are the responsibility of the Transport Layer.

The next article, How the Internet Works: the Transport Layer, will discuss the Transport Layer.


More articles by Robert Rodes
