TCP/IP Byte ordering problem

Host and network byte order

Consider a 16-bit integer that is made up of 2 bytes. There are two ways to store the two bytes in memory: with the low-order byte at the starting address, known as little-endian byte order, or with the high-order byte at the starting address, known as big-endian byte order.

Picture the 16-bit value with the most significant bit (MSB) as the leftmost bit and the least significant bit (LSB) as the rightmost bit. Little-endian order stores the value with memory addresses increasing from right to left, so the low-order byte sits at the starting address; big-endian order stores it with addresses increasing from left to right, so the high-order byte sits at the starting address. The terms "little-endian" and "big-endian" indicate which end of the multibyte value, the little end or the big end, is stored at the starting address of the value. Unfortunately, there is no standard between these two byte orderings and we encounter systems that use both formats. We refer to the byte ordering used by a given system as the host byte order.
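We can determine the host byte order with a short program. The sketch below, patterned after the classic test in Stevens' book, stores a known two-byte value in a union and looks at which byte ends up at the starting address:

```c
#include <stdio.h>

int main(void)
{
    union {
        short s;
        char  c[sizeof(short)];
    } un;

    un.s = 0x0102;
    if (sizeof(short) == 2) {
        if (un.c[0] == 1 && un.c[1] == 2)
            printf("big-endian\n");       /* high-order byte at starting address */
        else if (un.c[0] == 2 && un.c[1] == 1)
            printf("little-endian\n");    /* low-order byte at starting address */
        else
            printf("unknown\n");
    } else
        printf("sizeof(short) = %zu\n", sizeof(short));

    return 0;
}
```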

There are currently a variety of systems that can change between little-endian and big-endian byte ordering, sometimes at system reset, sometimes at run-time. We must deal with these byte ordering differences as network programmers because networking protocols must specify a network byte order. For example, in a TCP segment, there is a 16-bit port number and a 32-bit IPv4 address. The sending protocol stack and the receiving protocol stack must agree on the order in which the bytes of these multibyte fields will be transmitted. The Internet protocols use big-endian byte ordering for these multibyte integers.
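This is why the socket API makes us store port and address fields in network byte order before handing them to the kernel. A minimal sketch (the port number 9877 is purely illustrative):

```c
#include <string.h>
#include <sys/socket.h>   /* AF_INET */
#include <netinet/in.h>   /* struct sockaddr_in, INADDR_ANY */
#include <arpa/inet.h>    /* htons, htonl */

/* Fill an IPv4 socket address in network byte order. */
struct sockaddr_in make_listen_addr(void)
{
    struct sockaddr_in addr;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family      = AF_INET;
    addr.sin_port        = htons(9877);        /* 16-bit port: host -> network */
    addr.sin_addr.s_addr = htonl(INADDR_ANY);  /* 32-bit address: host -> network */
    return addr;
}
```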

We have not yet defined the term "byte." We use the term to mean an 8-bit quantity since almost all current computer systems use 8-bit bytes. Most Internet standards use the term octet instead of byte to mean an 8-bit quantity. This convention started in the early days of TCP/IP because much of the early work was done on systems such as the DEC-10, which did not use 8-bit bytes.

Another important convention in Internet standards is bit ordering. In many Internet standards, you will see "pictures" of packets similar to the following (this is the first 32 bits of the IPv4 header from RFC 791):
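```
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version|  IHL  |Type of Service|          Total Length         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
```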

This represents four bytes in the order in which they appear on the wire; the leftmost bit is the most significant. However, the numbering starts with zero assigned to the most significant bit. This is a notation that you should become familiar with to make it easier to read protocol definitions in RFCs.

Byte ordering problem

Now let's write a simple example in C/C++ to demonstrate byte ordering problems. We will pass binary values across the socket, instead of text strings. We will see that this does not work when the client and server are run on hosts with different byte orders, or on hosts that do not agree on the size of a long integer.

The client reads two integers from the console and sends them to the server; the server calculates their sum and sends it back to the client. Let's first define the structures with which we will work.
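Following Stevens' example, the two structures look like this (note the deliberate use of long, which is itself one of the pitfalls we will hit):

```c
/* args: the two operands the client sends to the server.
 * result: the sum the server sends back.
 * Using long here is deliberate; its size varies across systems,
 * which is one of the pitfalls discussed below. */
struct args {
    long arg1;
    long arg2;
};

struct result {
    long sum;
};
```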

Let's look at the client's code. sscanf converts the two arguments from text strings to binary, and we call Writen to send the structure to the server. We then call Readn to read the reply and print the result using printf.
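A sketch of the client loop, patterned after the book's str_cli; Fgets, Writen, Readn, err_quit, and MAXLINE come from the book's unp.h wrappers:

```c
void str_cli(FILE *fp, int sockfd)
{
    char          sendline[MAXLINE];
    struct args   args;
    struct result result;

    while (Fgets(sendline, MAXLINE, fp) != NULL) {
        /* convert the two text arguments to binary */
        if (sscanf(sendline, "%ld%ld", &args.arg1, &args.arg2) != 2) {
            printf("invalid input: %s", sendline);
            continue;
        }
        Writen(sockfd, &args, sizeof(args));       /* send the binary structure */

        if (Readn(sockfd, &result, sizeof(result)) == 0)
            err_quit("str_cli: server terminated prematurely");
        printf("%ld\n", result.sum);               /* print the binary reply */
    }
}
```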

Now let's look at the server's code. We read the arguments by calling Readn, calculate and store the sum, and call Writen to send back the result structure.
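And a sketch of the server side, patterned after the book's str_echo, with Readn and Writen again being the book's wrappers:

```c
void str_echo(int sockfd)
{
    ssize_t       n;
    struct args   args;
    struct result result;

    for ( ; ; ) {
        if ((n = Readn(sockfd, &args, sizeof(args))) == 0)
            return;                              /* connection closed by peer */

        result.sum = args.arg1 + args.arg2;      /* compute the sum */
        Writen(sockfd, &result, sizeof(result)); /* send back the binary result */
    }
}
```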

This code is full of pitfalls, so let's inspect them. If we run the client and server on two machines of the same architecture, say two SPARC machines, everything works fine. Here is the client interaction: 
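Something like the following, where the host and program names are illustrative:

```
sparc1 % client sparc2
1 2
3
-22 -77
-99
```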

But when the client and server are on two machines of different architectures (say the server is on the big-endian SPARC system freebsd and the client is on the little-endian Intel system linux), it does not work.
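The interaction now looks something like this (again with illustrative names; the printed values come from the analysis that follows):

```
linux % client freebsd
1 2
3                       <- appears correct
-22 -77
-16777314               <- wrong; the answer should be -99
```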

The problem is that the two binary integers are sent across the socket in little-endian format by the client, but interpreted as big-endian integers by the server. We see that it appears to work for these small positive integers, but fails for the negative ones. Why?

Our client was on a little-endian Intel system, where the 32-bit integer with a value of 1 was stored as shown below.
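```
increasing memory addresses -->
address:    A    A+1   A+2   A+3
contents:  01    00    00    00      (little-endian storage of 1)
```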

The four bytes are sent across the socket in the order A, A+1, A+2, A+3, and the server stores them in that same order. But the big-endian server treats the byte at the lowest address as the most significant, so it sees the value shown below.
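```
increasing memory addresses -->
address:    A    A+1   A+2   A+3
contents:  01    00    00    00      (read as big-endian: 0x01000000)
```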

This value of 0x01000000 is interpreted as 16,777,216. Similarly, the integer 2 sent by the client will be interpreted at the server as 0x02000000, or 33,554,432. The sum of these two integers is 50,331,648, or 0x03000000. When this big-endian value is sent back, the client interprets it as the integer value 3. The addition only appears to work because the byte swap is undone on the return trip and no carry crosses a byte boundary; the server actually computed 16,777,216 + 33,554,432, not 1 + 2.

The 32-bit integer value of -22 is represented on the little-endian system as shown below, assuming a two's-complement representation of negative numbers.
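```
increasing memory addresses -->
address:    A    A+1   A+2   A+3
contents:  ea    ff    ff    ff      (little-endian storage of -22, i.e., 0xffffffea)
```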

This is interpreted on the big-endian server as 0xeaffffff, or -352,321,537. Similarly, the little-endian representation of -77 is 0xffffffb3, but this is represented on the big-endian server as 0xb3ffffff, or -1,275,068,417. The addition on the server yields a binary result of 0x9efffffe, or -1,627,389,954. This big-endian value is sent across the socket to the client where it is interpreted as the little-endian value 0xfeffff9e, or -16,777,314, which is the value printed in our example.

You may think that we can solve the byte ordering problem by having the client convert the two arguments into network byte order using htonl, having the server then call ntohl on each argument before doing the addition, and then doing a similar conversion on the result. But there's another pitfall. The technique is correct (converting the binary values to network byte order), but the two functions htonl and ntohl cannot be used. Even though the l in these functions once meant "long," these functions operate on 32-bit integers. On a 64-bit system, a long will probably occupy 64 bits and these two functions will not work correctly. One might define two new functions, hton64 and ntoh64, to solve this problem, but this will not work on systems that represent longs using 32 bits.

You may think that now we are done and everything will work fine. But you are wrong. There's another pitfall that requires a closer look. What happens if the client is on a SPARC that stores a long in 32 bits, but the server is on a Digital Alpha that stores a long in 64 bits? Does this change if the client and server are swapped between these two hosts? In the first scenario, the server blocks forever in the call to Readn, because the client sends two 32-bit values but the server is waiting for two 64-bit values. Swapping the client and server between the two hosts causes the client to send two 64-bit values, but the server reads only the first 64 bits, interpreting them as two 32-bit values. The second 64-bit value remains in the server's socket receive buffer. The server writes back one 32-bit value and the client will block forever in its call to Readn, waiting to read one 64-bit value.
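The portable way out of both pitfalls is to combine explicit conversion with fixed-width types. Below is my sketch, not the book's code: declare the fields as int32_t, which is exactly 32 bits on every platform, and convert with htonl and ntohl. (Structure padding, discussed in the summary below, remains a separate concern.)

```c
#include <stdint.h>
#include <arpa/inet.h>   /* htonl, ntohl */

/* Fixed-width versions of the two structures: int32_t is exactly
 * 32 bits everywhere, unlike long. */
struct args   { int32_t arg1; int32_t arg2; };
struct result { int32_t sum; };

/* Client side, before sending: host -> network byte order. */
void args_hton(struct args *a)
{
    a->arg1 = (int32_t) htonl((uint32_t) a->arg1);
    a->arg2 = (int32_t) htonl((uint32_t) a->arg2);
}

/* Server side, after receiving: network -> host byte order. */
void args_ntoh(struct args *a)
{
    a->arg1 = (int32_t) ntohl((uint32_t) a->arg1);
    a->arg2 = (int32_t) ntohl((uint32_t) a->arg2);
}
```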

Summary

There are really three potential problems with this example:

  1.  Different implementations store binary numbers in different formats. The most common formats are big-endian and little-endian, as we described earlier.
  2.  Different implementations can store the same C datatype differently. For example, most 32-bit Unix systems use 32 bits for a long, but 64-bit systems typically use 64 bits for the same datatype. There is no guarantee that a short, int, or long is of any certain size.
  3. Different implementations pack structures differently, depending on the number of bits used for the various datatypes and the alignment restrictions of the machine. Therefore, it is never wise to send binary structures across a socket.

There are two common solutions to this data format problem:

  1. Pass all numeric data as text strings; a sketch follows this list. This assumes that both hosts have the same character set.
  2. Explicitly define the binary formats of the supported datatypes (number of bits, little or big-endian) and pass all data between the client and server in this format. RPC packages normally use this technique. RFC 1832 describes the External Data Representation (XDR) standard that is used with the Sun RPC package.
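A minimal sketch of the first solution; Writen is the Stevens-style wrapper used earlier, and the function names here are hypothetical:

```c
#include <stdio.h>
#include <string.h>

#define MAXLINE 4096

void Writen(int fd, void *ptr, size_t nbytes);   /* Stevens wrapper */

/* Client side: encode the two operands as one text line. */
void send_args_as_text(int sockfd, long arg1, long arg2)
{
    char line[MAXLINE];

    snprintf(line, sizeof(line), "%ld %ld\n", arg1, arg2);
    Writen(sockfd, line, strlen(line));
}

/* Server side: after parsing a received line with sscanf,
 * add the operands and reply in text as well. */
void send_sum_as_text(int sockfd, long arg1, long arg2)
{
    char line[MAXLINE];

    snprintf(line, sizeof(line), "%ld\n", arg1 + arg2);
    Writen(sockfd, line, strlen(line));
}
```

Because only characters cross the socket, byte order, datatype sizes, and structure packing no longer matter; the cost is a larger, slower encoding and the character-set assumption noted above.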


Author: Rusu Marin

Email: dimarusu2000@gmail.com

Bibliography: UNIX Network Programming, Volume 1: The Sockets Networking API, 3rd Edition, by W. Richard Stevens, Bill Fenner, and Andrew M. Rudoff
