Hands-On System Programming with C++

Structure packing

The standard integers provide a compiler-supported method for dictating the size of an integer type at compile time. Specifically, they map bit widths to default types so that the coder doesn't have to do this manually. The standard integer types do not, however, dictate how those integers are laid out inside a larger type, and structures are a good example of this. To better understand this issue, let's look at a simple example of a structure with some data in it:

#include <cstdint>
#include <iostream>

struct mystruct {
    uint64_t data1;
    uint64_t data2;
};

int main()
{
    std::cout << "size: " << sizeof(mystruct) << '\n';
}

// > g++ scratchpad.cpp; ./a.out
// size: 16

In the previous example, we created a structure with two 64-bit integers in it. We then, using the sizeof() operator, output the size of the structure to stdout using std::cout. As expected, the total size, in bytes, of the structure is 16. It should be noted that, like the rest of this book, the examples in this section are all being executed on a 64-bit Intel CPU.

Now, let's look at the same example, but with one of the data types changed to a 16-bit integer instead of a 64-bit integer, as follows:

#include <cstdint>
#include <iostream>

struct mystruct {
    uint64_t data1;
    uint16_t data2;
};

int main()
{
    std::cout << "size: " << sizeof(mystruct) << '\n';
}

// > g++ scratchpad.cpp; ./a.out
// size: 16

As shown in the preceding example, we have a structure with two members of different widths. We then output the size of the data structure to stdout using std::cout, and the reported size is 16 bytes. The problem is that we expect 10 bytes, as we defined the structure as being the combination of a 64-bit (8 bytes) and a 16-bit (2 bytes) integer.

Under the hood, the compiler is not changing the type of data2, which remains a 16-bit integer. Instead, the compiler is adding padding. On a 64-bit Intel CPU running Ubuntu, a uint64_t must be aligned on an 8-byte boundary, which gives the structure as a whole an alignment of 8 bytes. The size of a structure must be a multiple of its alignment so that every element in an array of these structures stays aligned, so the compiler appends 6 bytes of padding after data2, growing the structure from 10 bytes to 16. To explain this in other words, the use of uint16_t (a typedef for unsigned short int on this platform) guarantees the width of the member, but it does not guarantee the layout or size of the structure that contains it.
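
One way to see where the extra bytes come from is to ask the compiler for the alignment of the structure and the offset of its last member. The following is a minimal sketch rather than a definitive tool: the names mixed and trailing_padding are hypothetical, and the values in the final comment assume a 64-bit Intel CPU running Linux:

```cpp
#include <cstddef>
#include <cstdint>

struct mixed {
    std::uint64_t data1;
    std::uint16_t data2;
};

// Bytes between the end of the last member and the end of the structure.
constexpr std::size_t trailing_padding()
{
    return sizeof(mixed) - (offsetof(mixed, data2) + sizeof(std::uint16_t));
}

// The size of a type is always a whole multiple of its alignment, so that
// every element in an array of these structures is aligned.
static_assert(sizeof(mixed) % alignof(mixed) == 0,
              "size is a multiple of alignment");

// On x86-64 Linux: alignof(mixed) == 8, offsetof(mixed, data2) == 8,
// sizeof(mixed) == 16, and trailing_padding() == 6.
```

Printing alignof(mixed) and trailing_padding() from main() alongside sizeof(mixed) shows that the two declared members still account for only 10 of the 16 bytes.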

Changing the order in which we specify our integers does not change the reported size:

#include <cstdint>
#include <iostream>

struct mystruct {
    uint16_t data1;
    uint64_t data2;
};

int main()
{
    std::cout << "size: " << sizeof(mystruct) << '\n';
}

// > g++ scratchpad.cpp; ./a.out
// size: 16

As seen in the previous example, the compiler once again states that the total size of the structure is 16 bytes when, in fact, we expect 10. In this example, the padding is needed for a more obvious reason: there is an alignment issue inside the structure. Specifically, the CPU this code was compiled on was a 64-bit CPU, and without padding data2 would start 2 bytes into the structure, on a 16-bit boundary instead of a 64-bit boundary, which means it would span two 64-bit memory locations. To avoid this, the compiler inserts 6 bytes of padding between data1 and data2 so that data2 starts 8 bytes from the top of the structure.
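
The position of the padding can be checked with offsetof(). The following sketch uses hypothetical names (reordered, interior_padding), and the values in the final comment assume a 64-bit Intel CPU running Linux:

```cpp
#include <cstddef>
#include <cstdint>

struct reordered {
    std::uint16_t data1;
    std::uint64_t data2;
};

// Bytes the compiler placed between the end of data1 and the start of data2.
constexpr std::size_t interior_padding()
{
    return offsetof(reordered, data2) - sizeof(std::uint16_t);
}

// With the wide member last, all of the padding sits in front of it and none
// is needed at the end of the structure.
static_assert(offsetof(reordered, data2) + sizeof(std::uint64_t) ==
                  sizeof(reordered),
              "no trailing padding");

// On x86-64 Linux: offsetof(reordered, data2) == 8 and interior_padding() == 6.
```

The total is the same 16 bytes as before, but here the unused bytes sit in the middle of the structure rather than at the end.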

Structures are not the only place where the width of a type can differ from what we expect. Let's examine the following example:

#include <cstdint>
#include <iostream>

int main()
{
    int16_t s = 42;
    auto result = s + 42;
    std::cout << "size: " << sizeof(result) << '\n';
}

// > g++ scratchpad.cpp; ./a.out
// size: 4

In the previous example, we created a 16-bit integer and set it to 42. We then created another integer and set it to our 16-bit integer plus 42. The value 42 can be represented as an 8-bit integer, but it's not. Instead, the compiler represents 42 as an int, which on the system this code was compiled on is 4 bytes in size.

The compiler represents 42 as an int, and before the addition takes place it promotes the int16_t operand to an int as well, so the result of the addition is an int. In the previous example, we define our result variable using auto, which ensures that the resulting type reflects the type the compiler created as a consequence of this arithmetic. We could have defined result as another int16_t, which would have worked unless we turned on integer type conversion warnings. Doing so would have resulted in a conversion warning, as the compiler constructs an int as a consequence of adding s plus 42, and would then have to automatically convert the resulting int back to an int16_t. This is a narrowing conversion, which could lose data if the result does not fit in 16 bits (hence the warning).
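
The promotion can be confirmed at compile time, and the narrowing can be made explicit. The following is a small sketch in which add_narrow is a hypothetical helper, not part of any library:

```cpp
#include <cstdint>
#include <type_traits>

// The type of the expression is int, not int16_t, because the int16_t
// operand is promoted before the addition takes place.
static_assert(std::is_same<decltype(std::int16_t{} + 42), int>::value,
              "int16_t + int yields int");

// Hypothetical helper: the addition is performed as an int, and the result
// is then narrowed on purpose. The explicit cast documents the narrowing
// and avoids the conversion warning.
inline std::int16_t add_narrow(std::int16_t lhs, std::int16_t rhs)
{
    return static_cast<std::int16_t>(lhs + rhs);
}
```

The cast does not make the narrowing safe; a sum that does not fit in 16 bits is still truncated. It only states that the truncation is intended.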

These issues have two distinct causes. In a structure, the compiler adds padding so that each member is aligned, which improves memory access performance. In arithmetic, the compiler promotes operands narrower than an int to an int, which reduces the possibility of overflows in intermediate results. In the arithmetic case, a numeric literal is always an int unless the value requires more storage (for example, replace 42 with 0xFFFFFFFF00000000).
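
The type chosen for a literal can be inspected in the same way. In this sketch, small_literal and large_literal are hypothetical names; the exact type of the larger literal depends on the platform, so only its minimum size is checked:

```cpp
#include <type_traits>

// A literal that fits in an int has type int.
static_assert(std::is_same<decltype(42), int>::value, "42 is an int");

// A literal that does not fit in an int is given a wider type that can
// hold it.
static_assert(!std::is_same<decltype(0xFFFFFFFF00000000), int>::value,
              "the literal is wider than an int");
static_assert(sizeof(0xFFFFFFFF00000000) >= 8, "at least 64 bits wide");

constexpr auto small_literal = 42;
constexpr auto large_literal = 0xFFFFFFFF00000000;
```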

A structure does not always end up larger than the sum of its members. Consider the following example:

#include <cstdint>
#include <iostream>

struct mystruct {
    uint16_t data1;
    uint16_t data2;
};

int main()
{
    std::cout << "size: " << sizeof(mystruct) << '\n';
}

// > g++ scratchpad.cpp; ./a.out
// size: 4

In the previous example, we have a structure with two 16-bit integers. The total size of the structure is reported as 4 bytes, which is exactly what we would expect. In this case, both members only need to be aligned on a 2-byte boundary, so the compiler has no reason to add padding and leaves the layout alone.

Bit fields also do not prevent the compiler from padding a structure, as shown in the following example:

#include <cstdint>
#include <iostream>

struct mystruct {
    uint16_t data1 : 2, data2 : 14;
    uint64_t data3;
};

int main()
{
    std::cout << "size: " << sizeof(mystruct) << '\n';
}

// > g++ scratchpad.cpp; ./a.out
// size: 16

In the previous example, we created a structure with two integers (a 16-bit integer and a 64-bit integer), but instead of just defining the 16-bit integer, we also defined bit fields, giving us direct access to specific bits within the integer (a practice that should be avoided when system programming for reasons that are about to be explained). Defining these bit fields does not prevent the compiler from padding the structure: the 2-bit and 14-bit fields still share a single 16-bit storage unit, and 6 bytes of padding follow it so that data3 is aligned on a 64-bit boundary.

The problem with the previous example is that bit fields are often a pattern used by system programmers when interfacing directly with hardware. In the previous example, the 64-bit integer is expected to be 2 bytes from the top of the structure. In this case, however, the 64-bit integer is actually 8 bytes from the top of the structure. If we used this structure to interface directly with hardware, a hard-to-find logic bug would be the result.
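
A mismatch like this can be detected at compile time instead of at runtime. The following is a sketch under stated assumptions: device_regs and expected_data3_offset are hypothetical names, and the offset of 2 stands in for whatever the hardware documentation specifies:

```cpp
#include <cstddef>
#include <cstdint>

struct device_regs {
    std::uint16_t data1 : 2, data2 : 14;
    std::uint64_t data3;
};

// Offset that the (hypothetical) hardware documentation gives for data3.
constexpr std::size_t expected_data3_offset = 2;

// True only if the compiler placed data3 where the hardware expects it.
constexpr bool layout_matches_hardware =
    offsetof(device_regs, data3) == expected_data3_offset;

// Uncommenting the following line turns the silent layout bug into a
// compile-time error for this unpacked structure:
// static_assert(layout_matches_hardware, "data3 is not at offset 2");
```

A check of this kind fails the build as soon as the layout drifts from what the hardware expects, which is far easier to diagnose than corrupted register accesses.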

The way to overcome this problem is to pack the structure. The following example demonstrates how to do this:

#include <cstdint>
#include <iostream>

#pragma pack(push, 1)
struct mystruct {
    uint64_t data1;
    uint16_t data2;
};
#pragma pack(pop)

int main()
{
    std::cout << "size: " << sizeof(mystruct) << '\n';
}

// > g++ scratchpad.cpp; ./a.out
// size: 10

The previous example is similar to the second example in this section. A structure was created with a 64-bit integer and a 16-bit integer. In that earlier example, the resulting size of the structure was 16 bytes, as the compiler added 6 bytes of padding after the 16-bit integer. Here, to fix this issue, we wrap the structure with the #pragma pack(push, 1) and #pragma pack(pop) directives. These directives tell the compiler to pack the structure using a byte granularity (the 1 passed to the directive indicates 1 byte), which means it is not allowed to insert padding to align the members.

Using this method, changing the order of the variables, so that the 64-bit integer would otherwise need padding in front of it, still results in a structure that is not padded, as shown in the following example:

#include <cstdint>
#include <iostream>

#pragma pack(push, 1)
struct mystruct {
    uint16_t data1;
    uint64_t data2;
};
#pragma pack(pop)

int main()
{
    std::cout << "size: " << sizeof(mystruct) << '\n';
}

// > g++ scratchpad.cpp; ./a.out
// size: 10

As seen in the previous example, the size of the structure is still 10 bytes, regardless of the order of the integers. 
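
The packed layout can also be pinned down with static_assert, so that a change in compiler settings is caught at build time. In this sketch, wide_first and narrow_first are hypothetical names for the two orderings shown previously:

```cpp
#include <cstddef>
#include <cstdint>

#pragma pack(push, 1)
struct wide_first {
    std::uint64_t data1;
    std::uint16_t data2;
};

struct narrow_first {
    std::uint16_t data1;
    std::uint64_t data2;
};
#pragma pack(pop)

// Both orderings occupy exactly 8 + 2 bytes once packed.
static_assert(sizeof(wide_first) == 10, "no trailing padding");
static_assert(sizeof(narrow_first) == 10, "no interior padding");

// Each second member starts immediately after the first.
static_assert(offsetof(wide_first, data2) == 8, "data2 follows data1");
static_assert(offsetof(narrow_first, data2) == 2, "data2 follows data1");
```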

Combining structure packing with the standard integer types is sufficient (assuming endianness is not an issue) to directly interface with the hardware. This type of pattern is still discouraged, however, in favor of building accessors and leveraging bit masks. These give the user a means to ensure that direct access to hardware registers occurs in a controlled manner, without the compiler getting in the way or optimizations producing undesired results.
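
As a rough illustration of the accessor approach, the following sketch reads and writes the two fields of a 16-bit register value using masks and shifts. The register layout (data1 in bits 0-1, data2 in bits 2-15) and all of the names are assumptions made for this example; real hardware would define its own layout:

```cpp
#include <cstdint>

// Hypothetical register layout, assumed for illustration only:
// bits 0-1 hold data1, bits 2-15 hold data2.
constexpr std::uint16_t data1_mask = 0x0003;
constexpr std::uint16_t data2_mask = 0xFFFC;
constexpr unsigned data2_shift = 2;

// Extract data1 from a raw register value.
constexpr std::uint16_t get_data1(std::uint16_t reg)
{
    return static_cast<std::uint16_t>(reg & data1_mask);
}

// Extract data2 from a raw register value.
constexpr std::uint16_t get_data2(std::uint16_t reg)
{
    return static_cast<std::uint16_t>((reg & data2_mask) >> data2_shift);
}

// Return a copy of reg with data1 replaced; bits of val outside the field
// are discarded.
constexpr std::uint16_t set_data1(std::uint16_t reg, std::uint16_t val)
{
    return static_cast<std::uint16_t>((reg & ~data1_mask) | (val & data1_mask));
}

// Return a copy of reg with data2 replaced; bits of val outside the field
// are discarded.
constexpr std::uint16_t set_data2(std::uint16_t reg, std::uint16_t val)
{
    return static_cast<std::uint16_t>(
        (reg & ~data2_mask) | ((val << data2_shift) & data2_mask));
}
```

Because every access goes through an explicit mask and shift, the position of each field is stated in the code rather than left to the compiler's bit field layout, which the language does not fully specify.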

To explain why packed structures and bit fields should be avoided, let's look at an alignment issue with the following example:

#include <cstdint>
#include <iostream>

#pragma pack(push, 1)
struct mystruct {
    uint16_t data1;
    uint64_t data2;
};
#pragma pack(pop)

int main()
{
    mystruct s;
    std::cout << "addr: " << &s << '\n';
}

// > g++ scratchpad.cpp; ./a.out
// addr: 0x7fffd11069cf

In the previous example, we created a structure with a 16-bit integer and a 64-bit integer, and then packed the structure to ensure the total size of the structure is 10 bytes and each data field sits at its expected offset. The structure as a whole, however, is no longer aligned: a packed structure only needs to be aligned on a 1-byte boundary. This is demonstrated in the previous example by creating an instance of the structure on the stack and then outputting the structure's address to stdout using std::cout. As shown, the address is byte aligned, not cache aligned.
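
The effect of packing on alignment can be queried directly with alignof(). In the following sketch, natural, packed, and is_aligned are hypothetical names introduced for the comparison:

```cpp
#include <cstddef>
#include <cstdint>

struct natural {
    std::uint16_t data1;
    std::uint64_t data2;
};

#pragma pack(push, 1)
struct packed {
    std::uint16_t data1;
    std::uint64_t data2;
};
#pragma pack(pop)

// Returns true if the address is a multiple of the given alignment.
inline bool is_aligned(const void *ptr, std::size_t alignment)
{
    return reinterpret_cast<std::uintptr_t>(ptr) % alignment == 0;
}

// Packing lowers the alignment requirement of the whole structure to 1 byte,
// which is why an instance may be placed at an odd address.
static_assert(alignof(packed) == 1, "packed structures are byte aligned");
static_assert(alignof(natural) > 1, "unpacked structures keep their alignment");
```

Calling is_aligned(&s, 8) on a stack instance of the packed structure may return either true or false from one build to the next, since the compiler is only required to honor the 1-byte alignment.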

To align the structure on a 16-byte boundary, we will leverage the alignas specifier, which will be explained in Chapter 7, A Comprehensive Look at Memory Management:

#include <cstdint>
#include <iostream>

#pragma pack(push, 1)
struct alignas(16) mystruct {
    uint16_t data1;
    uint64_t data2;
};
#pragma pack(pop)

int main()
{
    mystruct s;
    std::cout << "addr: " << &s << '\n';
    std::cout << "size: " << sizeof(mystruct) << '\n';
}

// > g++ scratchpad.cpp; ./a.out
// addr: 0x7fff44ee3f40
// size: 16

In the previous example, we added the alignas specifier to the definition of the structure, which aligns the structure on a 16-byte boundary on the stack. We also output the total size of the structure as with previous examples, and as shown, the structure is no longer 10 bytes in size. In other words, the use of #pragma pack does not guarantee that the structure will, in fact, have its packed size. The size of a type must be a multiple of its alignment, so once alignas(16) is applied the compiler rounds the size up to 16 bytes, regardless of the packing request.

In the previous case, it should be noted that the compiler actually adds additional memory to the end of the structure, meaning that the data members in the structure are still in their correct locations, as follows:

#include <cstdint>
#include <iostream>

#pragma pack(push, 1)
struct alignas(16) mystruct {
    uint16_t data1;
    uint64_t data2;
};
#pragma pack(pop)

int main()
{
    mystruct s;
    std::cout << "addr data1: " << &s.data1 << '\n';
    std::cout << "addr data2: " << &s.data2 << '\n';
}

// > g++ scratchpad.cpp; ./a.out
// addr data1: 0x7ffc45dd8c90
// addr data2: 0x7ffc45dd8c92

In the previous example, the address of each data member is output to stdout, and as expected, the first data member is at offset 0 and the second data member is 2 bytes from the top of the structure, even though the total size of the structure is 16 bytes. This means that the compiler is getting the extra 6 bytes by adding padding to the bottom of the structure. Although this might seem benign, if an array of these structures were created and it was assumed the structures were 10 bytes in size due to the use of #pragma pack, a hard-to-find logic bug would be introduced.
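
The array problem comes down to the stride between elements, which is always sizeof() of the element type. The following sketch uses the hypothetical names aligned_regs and element_offset, and its assertions match the g++ output shown previously:

```cpp
#include <cstddef>
#include <cstdint>

#pragma pack(push, 1)
struct alignas(16) aligned_regs {
    std::uint16_t data1;
    std::uint64_t data2;
};
#pragma pack(pop)

// Byte offset of element `index` from the start of an array of aligned_regs.
constexpr std::size_t element_offset(std::size_t index)
{
    return index * sizeof(aligned_regs);
}

// The members are still packed together at the top of the structure...
static_assert(offsetof(aligned_regs, data2) == 2, "members remain packed");

// ...but each array element occupies 16 bytes, not the 10 that
// #pragma pack alone would suggest.
static_assert(sizeof(aligned_regs) == 16, "size is rounded up to alignment");
```

Code that walks such an array in 10-byte steps would read the second element from inside the padding of the first, since element_offset(1) is 16.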

To conclude this chapter, a note about pointers should be provided with respect to their size. Specifically, the size of a pointer depends entirely on the CPU architecture, operating system, and mode the application is running in. Let's examine the following example:

#include <cstdint>
#include <iostream>

#pragma pack(push, 1)
struct mystruct {
    uint16_t *data1;
    uint64_t data2;
};
#pragma pack(pop)

int main()
{
    std::cout << "size: " << sizeof(mystruct) << '\n';
}

// > g++ scratchpad.cpp; ./a.out
// size: 16

In the previous example, we stored a pointer and an integer and output the total size of the structure to stdout using std::cout. The resulting size of this structure is 16 bytes on a 64-bit Intel CPU running Ubuntu. The total size of this structure on a 32-bit Intel CPU running Ubuntu would be 12 bytes, as the pointer would only be 4 bytes in size. Worse, if the application were compiled as a 32-bit application, but executed on a 64-bit kernel, the application would see this structure as 12 bytes, and the kernel would see this structure as 16 bytes. Attempting to pass this structure to the kernel would result in a bug, as the application and kernel would see the structure differently.
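
One common way to sidestep this mismatch, offered here as a sketch rather than as the approach this book prescribes, is to store the address in a fixed-width integer so that the layout no longer depends on the pointer size. The names mystruct_abi, to_addr, and from_addr are hypothetical:

```cpp
#include <cstdint>

#pragma pack(push, 1)
struct mystruct_abi {
    std::uint64_t data1_addr;   // address of the uint16_t data, widened to 64 bits
    std::uint64_t data2;
};
#pragma pack(pop)

// The size is the same for 32-bit and 64-bit builds.
static_assert(sizeof(mystruct_abi) == 16,
              "layout must not depend on pointer width");

// Convert a pointer to the fixed-width representation stored in the structure.
inline std::uint64_t to_addr(const std::uint16_t *ptr)
{
    return static_cast<std::uint64_t>(reinterpret_cast<std::uintptr_t>(ptr));
}

// Convert the stored address back to a pointer in the current process.
inline std::uint16_t *from_addr(std::uint64_t addr)
{
    return reinterpret_cast<std::uint16_t *>(static_cast<std::uintptr_t>(addr));
}
```

With this layout, a 32-bit application and a 64-bit kernel both see a 16-byte structure, and the 32-bit side simply leaves the upper half of data1_addr set to zero.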