COFF link editor

Section definition directives

The purpose of the SECTIONS directive is to describe how input sections in COFF files are to be combined, to direct where to place output sections (both in relation to each other and to the entire virtual memory space), and to permit the renaming of output sections.

In the default case where no SECTIONS directives are given, all input sections of the same name appear in an output section of that name. If two object files are linked, one containing sections s1 and s2 and the other containing sections s2 and s3, the output object file contains three output sections s1, s2, and s3. The input sections s1 and s3 appear in output sections of the same name; the two input sections named s2 appear in an output section named s2. The order of these output sections depends on the order in which the link editor sees the input files.
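The default combining rule described above can be modeled in a few lines of Python. This is only an illustrative sketch, not how ld is implemented; the function name default_output_sections and the file names a.o and b.o are hypothetical.

```python
def default_output_sections(input_files):
    """Model the default rule: input sections of the same name are combined
    into one output section of that name, in the order files are seen."""
    output = {}
    for filename, section_names in input_files:
        for name in section_names:
            # dicts keep insertion order, so output sections appear in the
            # order in which their names are first encountered
            output.setdefault(name, []).append((filename, name))
    return output


# Two hypothetical object files, matching the s1/s2/s3 example in the text.
layout = default_output_sections([("a.o", ["s1", "s2"]), ("b.o", ["s2", "s3"])])
for secname, members in layout.items():
    print(secname, members)
```

Reversing the order of the two input files in this model changes the order of the output sections, which mirrors the dependence on input order noted in the text.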

The basic syntax of the SECTIONS directive is:

   SECTIONS
   {
              secname1 :
              {
                 file_specifications,
                 assignment_statements
              }
              secname2 :
              {
                      file_specifications,
                      assignment_statements
              }
   .
   .
   .
   }

File specifications

Within a section definition (a SECTIONS directive), the files and sections of files to be included in the output section are listed in the order in which they are to appear in the output section. Sections from an input file are specified in a statement of the form:

   filename ( secname )

or:

   filename ( secnam1 secnam2 . . . )

White space or commas are used to separate file specifications and to separate input section names within file specifications.

The following is an example of a SECTIONS directive:

   SECTIONS
   {
           outsec1:
           {
                   file1.o (sec1)
                   file2.o
                   file3.o (sec1, sec2)
           }
   }

According to this directive, the order in which the input sections appear in the output section outsec1 would be:

section sec1 from file file1.o
all sections from file2.o, in the order they appear in the input file
section sec1 from file file3.o, and then section sec2 from file file3.o

If there are any additional input files that contain input sections also named outsec1, these sections are linked following the last section named in the definition of outsec1. If there are any other input sections in file1.o or file3.o, they will be placed in output sections with the same names as the input sections unless they are included in other file specifications.
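The ordering rules for outsec1 can be sketched as a small Python model. The section lists assumed for each input file (in particular the contents of file2.o) are hypothetical, as is the function name ordered_members; None stands for a file specification with no section list.

```python
# Hypothetical contents of the three input files.
INPUT_SECTIONS = {
    "file1.o": ["sec1", "sec2"],
    "file2.o": [".text", ".data"],
    "file3.o": ["sec1", "sec2", "sec3"],
}

# The file specifications of outsec1, in the order written in the directive.
OUTSEC1_SPECS = [
    ("file1.o", ["sec1"]),
    ("file2.o", None),            # no list: all sections, in file order
    ("file3.o", ["sec1", "sec2"]),
]


def ordered_members(specs, inputs):
    """Return (file, section) pairs in the order they are placed."""
    placed = []
    for filename, wanted in specs:
        names = inputs[filename] if wanted is None else wanted
        placed.extend((filename, name) for name in names)
    return placed


for member in ordered_members(OUTSEC1_SPECS, INPUT_SECTIONS):
    print(member)
```

In this model, sec2 of file1.o and sec3 of file3.o are never placed in outsec1; per the text they would go to output sections named after themselves.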

To refer to all the uninitialized, unallocated global symbols in a file, the following statement may be used in a file specification:

   filename [COMMON]

If a filename appears with no sections listed, then all sections from the file (but not the uninitialized, unallocated globals) are linked into the current output section.

The following code may be used in a file specification to refer to all previously unallocated input sections of the given name, regardless of what input file they are contained in:

   *(secname)

Loading a section at a specified address

To bind an output section to a specific virtual address, use a SECTIONS directive of the following form:

   SECTIONS
   {
           outsec addr:
           {
                   . . .
           }
   .
   .
   .
   }

The value addr is a C constant which specifies the binding address. If outsec does not fit at addr (perhaps because of holes in the memory configuration or because outsec is too large to fit without overlapping some other output section), ld issues an appropriate error message. addr may also be the word BIND, followed by a parenthesized expression. The expression may use the pseudo-functions SIZEOF, ADDR, or NEXT. NEXT accepts a constant and returns the first multiple of that value that falls into configured unallocated memory; SIZEOF and ADDR accept previously defined sections.

As long as output sections do not overlap and there is enough space, they can be bound anywhere in configured memory. The SECTIONS directives defining output sections need not be given to ld in any particular order, unless SIZEOF or ADDR is used.
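The behavior of the NEXT pseudo-function, as described above, can be approximated in Python. This is a sketch under stated assumptions: free memory is represented as a list of (origin, length) pairs, and the name next_multiple is hypothetical and not part of ld.

```python
def next_multiple(n, free_ranges):
    """Return the lowest multiple of n that lies inside one of the free
    (configured, unallocated) ranges, or None if there is none."""
    best = None
    for origin, length in free_ranges:
        candidate = ((origin + n - 1) // n) * n   # round origin up to a multiple of n
        if candidate < origin + length:           # must fall inside this range
            if best is None or candidate < best:
                best = candidate
    return best


# Hypothetical free memory: a small fragment and a larger range.
free = [(0x100, 0x200), (0x20000, 0x1000)]
print(hex(next_multiple(0x10000, free)))
```

The small fragment at 0x100 contains no multiple of 0x10000, so the model skips it and reports an address in the second range.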

ld does not ensure that the size of each section consists of an even number of bytes or that each section starts on an even byte boundary. The assembler ensures that the size (in bytes) of a section is evenly divisible by 4.

The ld directives can be used to force a section to start on an odd byte boundary, although this is not recommended. If a section starts on an odd byte boundary, the section's contents are either accessed incorrectly or are not executed properly. When a user specifies an odd byte boundary, ld issues a warning message.

Aligning an output section

An output section may be bound to a virtual address that falls on an n-byte boundary, where n is a power of 2. This may be done in order to take advantage of the underlying architecture. For example, it may be possible to reduce the number of instructions necessary to address a data object by aligning the data object on a word boundary. This is performed using an ALIGN in a SECTIONS directive. For example:

   SECTIONS
   {
           outsec  ALIGN(0x20000) :
           {
                   . . .
           }
   .
   .
   .
   }

The output section outsec is not bound to any specific address but is placed at some virtual address that is a multiple of 0x20000.
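The rounding implied by ALIGN can be illustrated with a short Python helper. The name align_up is hypothetical; the sketch models only the arithmetic of rounding to a power-of-2 boundary, not how ld chooses among free memory areas.

```python
def align_up(address, boundary):
    """Return the lowest multiple of boundary that is >= address.
    boundary must be a power of 2, as required for ALIGN."""
    if boundary <= 0 or boundary & (boundary - 1):
        raise ValueError("boundary must be a power of 2")
    return (address + boundary - 1) & ~(boundary - 1)


# A hypothetical next free address of 0x12345, aligned as in ALIGN(0x20000).
print(hex(align_up(0x12345, 0x20000)))
```

An address that already lies on the boundary is returned unchanged.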

Grouping sections together

The default allocation algorithm for COFF ld does the following:

Links all input .init sections followed by .text sections into one output section. This output section is called .text and is bound to the address 0x0 plus the size of all headers in the output file.
Links all input .data sections together into one output section. This output section is called .data and, in paging systems, is bound to an address aligned to a machine-dependent constant plus a number dependent on the size of headers and text.
Links all input .bss sections together with all uninitialized, unallocated global symbols, into one output section. This output section is called .bss and is allocated so as to immediately follow the output section .data.

If any SECTIONS directives are specified, they replace the default allocation algorithm. Rather than relying on the ld default algorithm when manipulating COFF files, the one certain way to determine address and order information is to take it from the file and section headers. The default allocation of ld is equivalent to the following directive, where align_value and sizeof_headers are machine-dependent constants:

   SECTIONS
   {
       .text sizeof_headers : { *(.init) *(.text) *(.fini)}
       GROUP BIND( NEXT(align_value) +
       ((SIZEOF(.text) + ADDR(.text)) % 0x2000)) :
       {
               .data    : { }
               .bss     : { }
       }
   }

The GROUP command ensures that the two output sections .data and .bss are allocated together. Binding or alignment information is supplied only for the group and not for the output sections contained within the group. The sections making up the group are allocated in the order listed in the directive.
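The BIND expression in the default directive can be worked through numerically. The values below are hypothetical (real values of align_value and sizeof_headers are machine-dependent), and the sketch assumes that NEXT(align_value) has already been evaluated to an address.

```python
def default_group_address(next_aligned, text_addr, text_size, modulus=0x2000):
    """Evaluate NEXT(align_value) + ((SIZEOF(.text) + ADDR(.text)) % 0x2000),
    given the already-computed result of NEXT(align_value)."""
    return next_aligned + ((text_size + text_addr) % modulus)


# Hypothetical values: .text bound at 0xA8 (header size), 0x3000 bytes long,
# and NEXT(align_value) assumed to yield 0x400000.
addr = default_group_address(next_aligned=0x400000, text_addr=0xA8, text_size=0x3000)
print(hex(addr))
```

With these numbers the end of .text is 0x30A8, its remainder modulo 0x2000 is 0x10A8, and the group containing .data and .bss is bound at 0x4010A8.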

For compatibility with UNIX System V Release 2, the addresses of these sections cannot change. Unfortunately, .init sections in the algorithm above will interfere with the placement of the signal recovery routines. Hence the .text sections are linked into the a.out .text section first. The .init sections (for shared libraries) and the .fini sections follow all of the .text sections. Routines in crt1.o (a C runtime startup routine) branch to the .init sections before calling the main() function of the program.

The following SECTIONS directive may be used to place .text, .data, and .bss in the same segment of memory:

   SECTIONS
   {
           GROUP                  :
           {
                   .text        : { }
                   .data        : { }
                   .bss         : { }
           }
   }

Note that there are still three output sections (.text, .data, and .bss), but now they are allocated into consecutive virtual memory.

This entire group of output sections could be bound to a starting address or aligned simply by adding a field to the GROUP directive in the above example. To bind the group to 0xC0000, add the address after the GROUP keyword:

   GROUP 0xC0000 : {
           ...
   }

This change causes the output section .text to be bound at 0xC0000, followed by the remaining members of the group in order of their appearance. To align the group to 0x10000, add an ALIGN after the GROUP keyword:

   GROUP ALIGN(0x10000) : {
           ...
   }

This change causes the output section .text to be aligned to 0x10000, followed by the remaining members of the group.

When the GROUP directive is not used, each output section is treated as an independent entity:

   SECTIONS
   {
           .text   : { }
           .data ALIGN(0x400000)  : { }
           .bss    : { }
   }

In this example, the .text section starts at virtual address 0x0 (provided that address is in configured memory) and the .data section starts at a virtual address aligned to 0x400000. The .bss section immediately follows the .text section if there is enough space. If not, it follows the .data section. The order in which output sections are defined to ld cannot be used to force a certain allocation order in the output file.

Creating holes within output sections

The special symbol dot (``.''), representing ld's location counter, appears only within section definitions and assignment statements. When it appears on the left side of an assignment statement, ``.'' causes the location counter to be reset, leaving a hole in the output section. Holes built into output sections in this manner take up physical space in the output file and are initialized using a fill character. The default fill character is 0x00; alternatively, the user may supply a fill character. See the discussion of filling holes in ``Initialized section holes or .bss sections''.

Consider the following section definition:

  1 outsec:
  2 {
  3 	. += 0x1000;
  4 	f1.o (.text)
  5 	. += 0x100;
  6 	f2.o (.text)
  7 	. = align (4);
  8 	f3.o (.text)
  9 }

The effect of this command is as follows:

line 3 increments the location counter by 0x1000, thereby leaving a 0x1000 byte hole, filled with the default fill character, at the beginning of the section
in line 4, the .text section of input file f1.o is linked after the hole that was just left
in line 5, the location counter is incremented by 0x100, leaving a 0x100 byte hole filled with the default fill character
in line 6, the .text section of input file f2.o is linked following the second hole; this section begins 0x100 bytes from the end of f1.o (.text).
line 7 causes the location counter to be aligned with the next 4-byte boundary (that is, the next full word boundary)
in line 8, the .text section of f3.o is linked; the effect of lines 7 and 8 is to cause this section to start at the next full word boundary following the .text section of f2.o. The boundary is determined relative to the beginning of the output section outsec.

For the purposes of allocating and aligning addresses within an output section, ld treats the output section as if it began at address zero. As a result, in the above example, if outsec ultimately is linked to start at an odd address, then the part of outsec built from f3.o (.text) also starts at an odd address, even though f3.o (.text) is aligned to a full word boundary. This may be prevented by specifying an alignment for the entire output section, as follows:

   outsec ALIGN(4) : {
           ...
   }

Expressions that decrement ``.'' are illegal. Subtracting a value from the location counter is not allowed, since this can cause memory to be overwritten.
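The effect of the outsec definition above, and of computing offsets as if the section began at address zero, can be modeled in Python. The input section sizes are hypothetical and deliberately not multiples of 4 so that the alignment step is visible; the function names are illustrative only.

```python
def align_up(value, boundary):
    """Round value up to the next multiple of boundary (a power of 2)."""
    return (value + boundary - 1) & ~(boundary - 1)


def layout_outsec(f1_size, f2_size, f3_size):
    """Follow lines 3 to 8 of the outsec definition, with '.' starting at 0."""
    dot = 0
    dot += 0x1000               # line 3: . += 0x1000;
    f1 = dot
    dot += f1_size              # line 4: f1.o (.text)
    dot += 0x100                # line 5: . += 0x100;
    f2 = dot
    dot += f2_size              # line 6: f2.o (.text)
    dot = align_up(dot, 4)      # line 7: . = align (4);
    f3 = dot
    dot += f3_size              # line 8: f3.o (.text)
    return {"f1.o": f1, "f2.o": f2, "f3.o": f3, "size": dot}


offsets = layout_outsec(f1_size=0x22, f2_size=0x13, f3_size=0x10)
for base in (0x20000, 0x20001):          # an even and an odd start address
    print(hex(base), hex(base + offsets["f3.o"]))
```

In this model the offset of f3.o (.text) is a multiple of 4, but when the hypothetical base address is odd the final address of f3.o (.text) is odd as well, which is the situation that ALIGN on the whole output section avoids.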

Creating and defining symbols at link-edit time

Assignment statements can be used to give symbols a value that is link-edit dependent. For example, we just saw that the ``.'' symbol can be used to adjust the location counter during allocation. It is possible to assign allocation-dependent values to other symbols. These can be symbols that were defined in an object file that is being linked, or they may be symbols that are used only in the ifile. This provides a way to assign to symbols addresses known only after allocation. For example:

   SECTIONS
   {
           outsc1: {...}
           outsc2:
           {
                   file1.o (s1)
                   s2_start = . ;
                   file2.o (s2)
                   s2_end = . - 1;
           }
   }

The symbol s2_start is defined to be the address of file2.o(s2), and s2_end is the address of the last byte of file2.o(s2). Consider the following example:

   SECTIONS
   {
           outsc1:
           {
                   file1.o (.data)
                   mark = .;
                   . += 4;
                   file2.o (.data)
           }
   }

In this example, the symbol mark is created and is equal to the address of the first byte beyond the end of file1.o's .data section. Four bytes are reserved for a future run-time initialization of the symbol mark. The type of the symbol is a long integer (32 bits).
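For the first example above (s2_start and s2_end), the values assigned to the two symbols can be traced with a short Python model. The start address of outsc2 and the section sizes are hypothetical, as is the function name.

```python
def define_s2_symbols(outsc2_addr, s1_size, s2_size):
    """Trace the location counter through the outsc2 definition."""
    dot = outsc2_addr
    dot += s1_size        # file1.o (s1)
    s2_start = dot        # s2_start = . ;
    dot += s2_size        # file2.o (s2)
    s2_end = dot - 1      # s2_end = . - 1;  (address of the last byte)
    return s2_start, s2_end


start, end = define_s2_symbols(outsc2_addr=0x8000, s1_size=0x40, s2_size=0x100)
print(hex(start), hex(end), hex(end - start + 1))
```

Because s2_end names the last byte rather than the first byte past the section, the size of file2.o(s2) is s2_end - s2_start + 1.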

Assignment instructions involving ``.'' must appear within SECTIONS definitions since they are evaluated during allocation. Assignment instructions that do not involve ``.'' can appear within SECTIONS definitions but typically do not. Such instructions are evaluated after allocation is complete. Reassignment of a defined symbol to a different address is dangerous. For example, if a symbol within .data is defined, initialized, and referenced within a set of object files being link-edited, the symbol table entry for that symbol is changed to reflect the new, reassigned physical address. However, the associated initialized data is not moved to the new address, and there may be references to the old address. ld issues warning messages for each defined symbol that is being redefined within an ifile. However, it is safe to assign absolute values to new symbols because there are no references or initialized data associated with these symbols.

Allocating a section into named memory

It is possible to specify that a section be linked somewhere within a named memory range, previously defined on a MEMORY directive.

For example:

      MEMORY
      {
              mem1:          o=0x000000    l=0x10000
              mem2 (RW):     o=0x020000    l=0x40000
              mem3 (RW):     o=0x070000    l=0x40000
              mem1:          o=0x120000    l=0x04000
      }
   
      SECTIONS
      {
              outsec1: { f1.o(.data) } > mem1
              outsec2: { f2.o(.data) } > mem3
      }

The '>' operator (analogous to the UNIX system redirection operator) directs ld to place outsec1 anywhere within the memory area named mem1 (that is, somewhere within the address range 0x0-0xFFFF or 0x120000-0x123FFF). The output section outsec2 is to be placed somewhere in the area named mem3, that is, the address range 0x70000-0xAFFFF.
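The address ranges quoted above follow directly from the origin and length values in the MEMORY directive. The Python sketch below reproduces that arithmetic; the dictionary layout and the function names are hypothetical and only model the directive.

```python
# Origin/length pairs copied from the MEMORY directive; mem1 has two areas.
MEMORY = {
    "mem1": [(0x000000, 0x10000), (0x120000, 0x04000)],
    "mem2": [(0x020000, 0x40000)],
    "mem3": [(0x070000, 0x40000)],
}


def address_ranges(name):
    """Return the inclusive (first, last) address of each area of a name."""
    return [(origin, origin + length - 1) for origin, length in MEMORY[name]]


def fits(name, addr, size):
    """True if [addr, addr + size) lies entirely inside one area of name."""
    return any(origin <= addr and addr + size <= origin + length
               for origin, length in MEMORY[name])


for name in MEMORY:
    print(name, [(hex(lo), hex(hi)) for lo, hi in address_ranges(name)])
```

A section that would straddle the end of one mem1 area does not fit in this model, since the two areas are not contiguous.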

Initialized section holes or .bss sections

When holes are created within a section, ld normally fills them with bytes of zero (0x00). By default, .bss sections are not initialized at all; that is, no initialized data is generated for any .bss section by the assembler nor supplied by the link editor.

SECTIONS directives may be used to initialize such holes or .bss output sections to an arbitrary 2-byte pattern. Such initialization options apply only to .bss sections or holes. For example, an application might want an uninitialized data table to be initialized to a constant value without recompiling the .o file, or a hole in the text area to be filled with a transfer to an error routine.

An entire output section may be initialized, or specific areas within an output section. However, since no text is generated for an uninitialized .bss section, if part of such a section is initialized, then the entire section is initialized. In other words, if a .bss section is to be combined with a .text or .data section (both of which are initialized) or if part of an output .bss section is to be initialized, then one of the following will apply:

Explicit initialization options may be used to initialize all .bss sections in the output section.
ld will use the default fill value to initialize all .bss sections in the output section.

Holes are filled using a statement of the form:

   section_name:
   {
       ....
   } = long_int

or:

   file_specification = long_int
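Filling a hole with a repeating 2-byte pattern can be sketched in Python as follows. The byte order in which ld writes the pattern is not stated in the text, so the most-significant-byte-first order used here is an assumption, as is the function name fill_hole.

```python
def fill_hole(length, pattern=0x0000, byteorder="big"):
    """Return `length` bytes made of a repeated 2-byte fill pattern.
    byteorder="big" (high byte first) is an assumption of this sketch."""
    unit = pattern.to_bytes(2, byteorder)
    repeats = (length + 1) // 2          # enough copies to cover odd lengths
    return (unit * repeats)[:length]


print(fill_hole(4, 0xDFFF).hex())
print(fill_hole(3).hex())                # default fill value of 0x00
```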

Consider the following ld file:

   SECTIONS
   {
           sec1:
           {
                   f1.o
                   . += 0x200;
                   f2.o (.text)
           } = 0xDFFF
           sec2:
           {
                   f1.o (.bss)
                   f2.o (.bss) = 0x1234
           }
           sec3:
           {
                   f3.o (.bss)
                   . . .
           } = 0xFFFF
           sec4: { f4.o (.bss) }
   }

In the example above, the 0x200 byte hole in section sec1 is filled with the value 0xDFFF. In section sec2, f1.o(.bss) is initialized to the default fill value of 0x00, and f2.o(.bss) is initialized to 0x1234. All .bss sections within sec3 as well as all holes are initialized to 0xFFFF. Section sec4 is not initialized; that is, no data is written to the object file for this section.

When unconfigured areas exist in the virtual memory, each application must assume responsibility for forming output sections that will fit into memory. For example, assume that memory is configured as follows:

   MEMORY
   {
        mem1:        o = 0x00000        l = 0x02000
        mem2:        o = 0x20000        l = 0x10000
        mem3:        o = 0x40000        l = 0x05000
   }

Suppose that the files f1.o, f2.o, . . . fn.o each contain three sections .text, .data, and .bss, with the combined .text section length being 0x12000 bytes. There is no configured area of memory in which this section can be placed, because the largest memory range that has been defined is only 0x10000 bytes long. Appropriate directives must be supplied to break up the .text output section so ld may do the allocation. The following set of directives groups the .text sections from the input files into a number of output sections txt1, txt2, and so on.

   SECTIONS
   {
           txt1:
           {
                   f1.o (.text)
                   f2.o (.text)
                   f3.o (.text)
           }
           txt2:
           {
                   f4.o (.text)
                   f5.o (.text)
                   f6.o (.text)
           }
   .
   .
   .
   }
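One way to decide which input files go into txt1, txt2, and so on is to pack them in order until the next file would exceed the size of the target memory area. The Python sketch below shows that idea; it is not something ld does automatically, the per-file .text sizes are hypothetical (chosen to total 0x12000 bytes), and the function name is illustrative.

```python
def split_text(files, limit):
    """Group (name, size) pairs, in order, into lists whose total size
    does not exceed limit."""
    groups, current, used = [], [], 0
    for name, size in files:
        if size > limit:
            raise ValueError(name + " does not fit in any group")
        if current and used + size > limit:
            groups.append(current)       # close the group that is full
            current, used = [], 0
        current.append(name)
        used += size
    if current:
        groups.append(current)
    return groups


# Hypothetical .text sizes; 0x10000 is the length of mem2 in the example.
sizes = [("f1.o", 0x3000), ("f2.o", 0x4000), ("f3.o", 0x5000),
         ("f4.o", 0x2000), ("f5.o", 0x2000), ("f6.o", 0x2000)]
for number, group in enumerate(split_text(sizes, 0x10000), start=1):
    print("txt%d" % number, group)
```

Each resulting group corresponds to one output section definition listing the .text sections of its files; with these sizes the grouping differs from the three-files-per-section split shown in the directive, which is only one possible choice.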