What does an object file contain?

During the various stages of compilation in C or C++, I know that an object file gets generated (i.e., any_name.o file). What does this .o file contain? I can't open it since it's a binary file.


Could anybody please help me? Are the contents of the object file mainly dependent on the compiler which we use on Unix?


8 answers



An object file can contain some or all of the following:


  • Symbol names
  • Compiled code
  • Constant data, e.g. strings
  • Imports: the symbols that the compiled code references (these get fixed up by the linker)
  • Exports: the symbols that the object file makes available to other object files

The linker turns a bunch of object files into an executable, by matching up all the imports and exports, and modifying the compiled code so the correct functions get called.
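
To make the list above concrete, here is a minimal sketch of a single translation unit. The file name unit.c and every identifier in it are hypothetical and chosen only for illustration; the comments describe what a typical Unix toolchain would be expected to put into the object file when the unit is built with cc -c and without optimization.

```c
/* unit.c: a hypothetical translation unit; `cc -c unit.c` would produce unit.o */
#include <string.h>   /* only declares strlen; its code lives in the C library */

/* Constant data: these bytes usually end up in a read-only data section. */
static const char greeting_text[] = "hello from unit.o";

/* Exported data symbol: other object files may refer to call_count. */
int call_count = 0;

/* Local symbol: `static` keeps this function out of the exports. */
static size_t text_length(const char *s)
{
    /* Import: without optimization this is an undefined reference to strlen
       that the linker has to fix up (an optimizer may fold the call away). */
    return strlen(s);
}

/* Exported function symbols: compiled code that other object files can call. */
const char *greeting(void)
{
    call_count++;
    return greeting_text;
}

size_t greeting_length(void)
{
    return text_length(greeting_text);
}
```

On such a toolchain, running nm on the resulting object file should list greeting, greeting_length and call_count as defined global symbols and strlen as undefined, though the exact output depends on the compiler and its options.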




There are several standardized formats (COFF, and ELF on Unix). They are essentially variants of the formats used for executables, but with some information missing. That missing information is filled in at link time.

Object file formats contain essentially the same information:


  • binary code resulting from compilation (for a target processor)
  • static data used by that part of the program (like constant strings, etc.). You can make a finer distinction between initialized data (.data), zero-initialized data (.bss, which takes no space in the file) and read-only data that the program will not modify (.rodata), but that matters mostly to the compiler and linker. Note that, like binary code, data also depends on the target (big-endian, little-endian, 32 bits, 64 bits).
  • tables of symbols exported by this part of the program (mostly function entry points)
  • tables of external symbols used by this part of the program
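
The point about data depending on the target can be illustrated with a small sketch. The function names below are made up for this example; they show how the same 32-bit value is laid out as bytes under the two byte orders, which is why a data section built for one target cannot simply be reused on another.

```c
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value least-significant byte first (little-endian layout). */
void store_u32_le(uint32_t v, unsigned char out[4])
{
    for (size_t i = 0; i < 4; i++)
        out[i] = (unsigned char)(v >> (8 * i));
}

/* Store a 32-bit value most-significant byte first (big-endian layout). */
void store_u32_be(uint32_t v, unsigned char out[4])
{
    for (size_t i = 0; i < 4; i++)
        out[i] = (unsigned char)(v >> (8 * (3 - i)));
}

/* Report the byte order of the machine this code was compiled for. */
int host_is_little_endian(void)
{
    const uint32_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}
```

For the value 0x11223344, store_u32_le produces the bytes 44 33 22 11 and store_u32_be produces 11 22 33 44.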

When object files are linked together, the parts of the code that refer to external symbols are replaced by actual values (that is still oversimplified, since a last part is done at load time when the program is run, but that's the idea).
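
As a rough illustration of that replacement step, here is a toy sketch. The toy_symbol and toy_reloc records and both functions are hypothetical and far simpler than the relocation entries of ELF or COFF: each relocation simply names a symbol and the offset of a 4-byte placeholder, which gets overwritten with the symbol's absolute address in little-endian order.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified records; real formats such as ELF or COFF differ. */
struct toy_symbol { const char *name; uint32_t address; };
struct toy_reloc  { size_t offset; const char *symbol; };

/* Look up a symbol's final address; returns 0 on success, -1 if undefined. */
static int toy_lookup(const struct toy_symbol *syms, size_t nsyms,
                      const char *name, uint32_t *address)
{
    for (size_t i = 0; i < nsyms; i++) {
        if (strcmp(syms[i].name, name) == 0) {
            *address = syms[i].address;
            return 0;
        }
    }
    return -1;
}

/* Patch each 4-byte placeholder in `code` with the address of its symbol.
   Returns 0 on success, -1 on an unresolved symbol or out-of-range offset. */
int toy_apply_relocs(unsigned char *code, size_t code_len,
                     const struct toy_reloc *relocs, size_t nrelocs,
                     const struct toy_symbol *syms, size_t nsyms)
{
    for (size_t i = 0; i < nrelocs; i++) {
        uint32_t address;
        if (code_len < 4 || relocs[i].offset > code_len - 4)
            return -1;
        if (toy_lookup(syms, nsyms, relocs[i].symbol, &address) != 0)
            return -1;                      /* the "undefined reference" case */
        for (size_t b = 0; b < 4; b++)      /* little-endian store */
            code[relocs[i].offset + b] = (unsigned char)(address >> (8 * b));
    }
    return 0;
}
```

Real linkers support many relocation kinds (for example PC-relative ones), so this only conveys the general idea of patching placeholders once addresses are known.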


The object file may also contain more symbol information than is strictly necessary for resolving imports and exports (useful for debugging). That information can be removed using the strip command.




First read the wiki page. You can use objdump to examine such a file :)




The object file is the compiled source.


This means that it's machine code, which is dependent on the target platform (you can compile for Unix on Windows if you really want to) and the compiler used. Different compilers will produce different machine code from the same source file.




Use the file command for things like this. On a modern Linux system the object file is an ELF file. For example, if it was compiled for 32-bit x86, file reports:


ELF 32-bit LSB relocatable, Intel 80386, version 1 (SYSV), not stripped

In contrast, a dynamically linked executable might look like:


ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.15, not stripped
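
The first words of those two descriptions come straight from the ELF header. The sketch below is a hypothetical helper (describe_elf is not part of any library) that reads the identification bytes and the e_type field from the first 18 bytes of a file and produces a similar summary; it is not how file itself is implemented.

```c
#include <stdio.h>
#include <string.h>

/* Summarize an ELF header from the first bytes of a file.
   `hdr` must hold at least 18 bytes. Returns 0 on success,
   -1 if the bytes are not a recognizable ELF header. */
int describe_elf(const unsigned char *hdr, size_t len, char *out, size_t outlen)
{
    static const unsigned char magic[4] = {0x7f, 'E', 'L', 'F'};
    const char *bits, *order, *kind;
    unsigned type;

    if (len < 18 || memcmp(hdr, magic, 4) != 0)
        return -1;

    switch (hdr[4]) {                         /* EI_CLASS */
    case 1: bits = "32-bit"; break;
    case 2: bits = "64-bit"; break;
    default: return -1;
    }
    switch (hdr[5]) {                         /* EI_DATA: byte order of the file */
    case 1: order = "LSB"; type = hdr[16] | ((unsigned)hdr[17] << 8); break;
    case 2: order = "MSB"; type = ((unsigned)hdr[16] << 8) | hdr[17]; break;
    default: return -1;
    }
    switch (type) {                           /* e_type */
    case 1: kind = "relocatable"; break;      /* ET_REL: an object file */
    case 2: kind = "executable"; break;       /* ET_EXEC */
    case 3: kind = "shared object"; break;    /* ET_DYN */
    case 4: kind = "core file"; break;        /* ET_CORE */
    default: kind = "unknown type"; break;
    }
    snprintf(out, outlen, "ELF %s %s %s", bits, order, kind);
    return 0;
}
```

Fed the first 18 bytes of a .o file built for 32-bit x86, this would be expected to yield "ELF 32-bit LSB relocatable".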

To see headers, including section names, you can use:


objdump -x any_name.o

To disassemble:


objdump -d any_name.o



First, binary files can be opened! Don't be scared of them; you just need the right tools. Since the data is binary, a text editor is of course not the right tool. A suitable tool could be a hex editor, an advanced editor like emacs, or a tool that knows the particular format and interprets the data properly at some level, instead of simply printing the bytes in hex and leaving the interpretation to you (e.g. GIMP interprets a PNG file as an image and shows it, while a PNG analyser decomposes the data into PNG sections and tells you the flags in certain bytes, etc.).


In your case, the general answer is that the object file contains your compiled code (and data), plus all the extra information needed by the linker, and possibly more.


How this information is organized, and in some cases what the "possibly more" consists of, depends on the specific object format. Wikipedia has articles on several of the possible formats.


Each of these may have its own tools for analysing the content, e.g. readelf for ELF, or objdump for several formats (try objdump -i to list them), depending on how the file was compiled.



The file contains binary data which must be run through a linker to generate an executable. It is essentially a bunch of machine code instructions with named sections (corresponding to your functions). From Wikipedia's "Object file" article:


In computer science, an object file is an organized collection of separate, named sequences of machine code. Each sequence, or object, typically contains instructions for the host machine to accomplish some task, possibly accompanied by related data and metadata (e.g. relocation information, stack unwinding information, comments, program symbols, debugging or profiling information). A linker is typically used to generate an executable or library by combining parts of object files.




In the GNU compilation environment you can use objdump to inspect both the executable and the object file.


As you can see, the object file contains only the code of the functions defined in the compiled file, plus references to the external functions it uses (here the source file contains only a main function with a scanf call and a printf call).


$ objdump -t scanf_sample.o

scanf_sample.o:     file format pe-i386

[  0](sec -2)(fl 0x00)(ty   0)(scl 103) (nx 1) 0x00000000 scanf_sample.c
[  2](sec  1)(fl 0x00)(ty  20)(scl   2) (nx 0) 0x00000000 _main
[  3](sec  1)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x00000000 .text
AUX scnlen 0x91 nreloc 9 nlnno 0
[  5](sec  2)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x00000000 .data
AUX scnlen 0x0 nreloc 0 nlnno 0
[  7](sec  3)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x00000000 .bss
AUX scnlen 0x0 nreloc 0 nlnno 0
[  9](sec  4)(fl 0x00)(ty   0)(scl   3) (nx 1) 0x00000000 .rdata
AUX scnlen 0x54 nreloc 0 nlnno 0
[ 11](sec  0)(fl 0x00)(ty  20)(scl   2) (nx 1) 0x00000000 ___main
AUX tagndx 0 ttlsiz 0x0 lnnos 0 next 0
[ 13](sec  0)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x00000000 __alloca
[ 14](sec  0)(fl 0x00)(ty  20)(scl   2) (nx 0) 0x00000000 _memset
[ 15](sec  0)(fl 0x00)(ty  20)(scl   2) (nx 0) 0x00000000 _scanf
[ 16](sec  0)(fl 0x00)(ty  20)(scl   2) (nx 0) 0x00000000 _printf

If you run objdump on the executable you will see many more functions than those found in the object file. This shows that the object file contains only the functions defined in the source file, plus references to other functions. Those references are resolved at link time.
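
The (sec ...) and (scl ...) columns in the dump above are enough to tell definitions from references. The following sketch uses a hypothetical struct (not the on-disk COFF layout) and assumes the usual COFF conventions that section 0 means "undefined" and storage class 2 means "external"; it ignores special cases such as common symbols.

```c
#include <stddef.h>

/* Fields mirror the (sec ...) and (scl ...) columns of the dump. */
struct dumped_symbol {
    const char *name;
    int section;        /* sec: 0 = undefined, >0 = section index, <0 = special */
    int storage_class;  /* scl: 2 = external, 3 = static, 103 = file */
};

enum symbol_role { ROLE_EXPORT, ROLE_IMPORT, ROLE_LOCAL };

/* External symbols are exports when defined in a section, imports otherwise. */
enum symbol_role classify_symbol(const struct dumped_symbol *s)
{
    if (s->storage_class == 2)
        return s->section == 0 ? ROLE_IMPORT : ROLE_EXPORT;
    return ROLE_LOCAL;
}

/* Count how many symbols in the table play the given role. */
size_t count_role(const struct dumped_symbol *syms, size_t n,
                  enum symbol_role role)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (classify_symbol(&syms[i]) == role)
            count++;
    return count;
}
```

Applied to the dump, _main (sec 1, scl 2) comes out as an export, while _scanf and _printf (sec 0, scl 2) come out as imports that the linker still has to resolve.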


Read more about linking, compilation and objects.



