The Mechanics of Compiling a C Program
The C compiler takes source code and converts it into a form that a computer can execute. When you compile a source file with a program such as GCC, it passes through four stages to produce machine-readable output: preprocessing, compilation, assembly, and linking.
Preprocessing is the first step in compiling. The contents of every file named in an #include directive are inserted into the source, and conditional directives, such as those that select platform-specific behavior, are resolved. The preprocessor also expands macros and removes all comments from the source code. The result is a single, complete copy of all the code necessary to build the program. Once this step finishes, the compiler has everything it needs to begin generating assembly language.
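As a rough illustration of what the preprocessor handles, the sketch below defines a macro and selects a string with conditional directives. The names SQUARE, PLATFORM_NAME, square_of, and platform_name are hypothetical and chosen only for this example. The platform macros _WIN32, __APPLE__, and __linux__ are ones that compilers commonly predefine. With GCC, running gcc -E on a file like this should print the preprocessed text, in which the macro is already expanded and only one platform branch remains.

```c
/* Macro that the preprocessor replaces textually before compilation begins. */
#define SQUARE(x) ((x) * (x))

/* Conditional directives: only one of these branches survives preprocessing. */
#if defined(_WIN32)
#define PLATFORM_NAME "Windows"
#elif defined(__APPLE__)
#define PLATFORM_NAME "Apple"
#elif defined(__linux__)
#define PLATFORM_NAME "Linux"
#else
#define PLATFORM_NAME "unknown"
#endif

/* After preprocessing, the body reads: return ((n) * (n)); */
int square_of(int n)
{
    return SQUARE(n);
}

/* After preprocessing, the body returns whichever string literal was selected. */
const char *platform_name(void)
{
    return PLATFORM_NAME;
}
```

The extra parentheses in SQUARE matter because expansion is purely textual: without them, an argument such as a + 1 would be multiplied incorrectly.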
Compilation takes the output of preprocessing and translates it into assembly language source code. This set of instructions is much lower level and closer to machine language, referring to exact operations performed by the processor. Each line of assembly language code typically corresponds to a single machine instruction.
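To see this stage in isolation, a small function is enough. The sketch below assumes a hypothetical file named add.c. With GCC, running gcc -S add.c stops after compilation and writes the assembly to add.s. The exact instructions in that file depend on the target processor, the compiler version, and the optimization level, so none are reproduced here.

```c
/* add.c: a hypothetical one-function file for inspecting compiler output. */

/* Returns the sum of its two arguments. In the generated assembly this
   typically becomes a handful of instructions, one of which is an add. */
int add(int a, int b)
{
    return a + b;
}
```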
Assembly takes the output of compilation and turns it into object code. This is necessary because a processor cannot execute assembly language text directly; it only understands binary machine code, which is a sequence of 1s and 0s. These binary instructions are what the processor reads. The resulting object file can be inspected with the hexdump command.
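The following sketch shows, in miniature, the kind of view hexdump gives of an object file. It assumes the object file uses the ELF format, as is common on Linux; other systems use different formats, so the magic-number check would not apply there. The function names has_elf_magic and format_hex_row are hypothetical helpers written for this example. With GCC, gcc -c file.c produces an object file whose first bytes could be passed to these helpers.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if the buffer begins with the four-byte ELF magic number
   (0x7f 'E' 'L' 'F'), otherwise 0. Assumes an ELF-based system. */
int has_elf_magic(const unsigned char *bytes, size_t len)
{
    static const unsigned char magic[4] = {0x7f, 'E', 'L', 'F'};
    return len >= sizeof magic && memcmp(bytes, magic, sizeof magic) == 0;
}

/* Writes the bytes as two-digit hex values separated by spaces, stopping
   early if the output buffer is too small. Returns the number of
   characters written, not counting the terminating null. */
size_t format_hex_row(const unsigned char *bytes, size_t len,
                      char *out, size_t out_size)
{
    size_t used = 0;

    if (out_size == 0) {
        return 0;
    }
    out[0] = '\0';

    for (size_t i = 0; i < len; i++) {
        /* Two hex digits, plus a leading space for every byte but the first. */
        size_t needed = (i == 0) ? 2 : 3;
        if (used + needed + 1 > out_size) {
            break;
        }
        used += (size_t)snprintf(out + used, out_size - used,
                                 i == 0 ? "%02x" : " %02x",
                                 (unsigned)bytes[i]);
    }
    return used;
}
```

Reading the first few bytes of an object file with fread and passing them to format_hex_row gives a single row of hex values similar to what hexdump displays.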
Linking is the final step in converting C source code to an executable. Although the instructions now exist in binary form, the object code may be spread across several files and may refer to functions it does not define itself. During this phase the linker places the pieces in the correct sequence and fills in the missing code from libraries on the system. The output of this step is the final executable file.
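A call into the standard library is a simple case of code that the linker has to fill in. In the sketch below, text_length is a hypothetical function written for this example. The header string.h only declares strlen; the definition lives in the C standard library. When the call is not inlined by the compiler, the object file records strlen as an undefined symbol, and the linker resolves it against the library.

```c
#include <stddef.h>
#include <string.h> /* declares strlen; the definition is in the C library */

/* Returns the length of a null-terminated string by calling strlen,
   a function this file uses but does not define. */
size_t text_length(const char *text)
{
    return strlen(text);
}
```

Compiling this file alone with gcc -c succeeds because the declaration is enough for the compiler. Only a full link has to locate the definition of strlen.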
We now have a working program!