Skip to content

Latest commit

 

History

History
688 lines (489 loc) · 34.9 KB

2013-11-08-mach-o-executables.md

File metadata and controls

688 lines (489 loc) · 34.9 KB
title category date tags author
Mach-O Executables
6
2013-11-08 09:00:00
article
name url
Daniel Eggert

When we build an application in Xcode, part of what happens is that the sources files (.m and .h) get turned into an executable. This executable contains the byte code than will run on the CPU, the ARM processor on the iOS device, or the Intel processor on your Mac.

We'll walk through some of what the compiler does and what's inside such an executable. There's more to it than first meets the eye.

Let's put Xcode aside for this and step into the land of command-line tools. When we build in Xcode, it simply calls a series of tools. Florian discusses how this works in more detail. We'll call these tools directly and take a look at what they do.

Hopefully this will give you a better understanding of how an executable on iOS or OS X -- a so-called Mach-O executable -- works and is put together.

xcrun

Some infrastructure first: There's a command-line tool called xcrun which we'll use a lot. It may seem odd, but it's pretty awesome. This little tool is used to run other tools. Instead of running:

% clang -v

On the Terminal, we'll use:

% xcrun clang -v

What xcrun does is to locate clang and run it with the arguments that follow clang.

Why would we do this? It may seem pointless. But xcrun allows us to (1) have multiple versions of Xcode and use the tools from a specific Xcode version, and (2) use the tools for a specific SDK (software development kit). If you happen to have both Xcode 4.5 and Xcode 5, with xcode-select and xcrun you can choose to use the tools (and header files, etc.) from the iOS SDK from Xcode 5, or the OS X tools from Xcode 4.5. On most other platforms, that'd be close to impossible. Check out the man pages for xcrun and xcode-select for more details. And you can use the developer tools from the command line without installing the Command Line Tools.

Hello World Without an IDE

Back in Terminal, let's create a folder with a C file in it:

% mkdir ~/Desktop/objcio-command-line
% cd !$
% touch helloworld.c

Now edit this file in your favorite text editor -- even TextEdit.app will do:

% open -e helloworld.c

Fill in this piece of code:

#include <stdio.h>
int main(int argc, char *argv[])
{
    printf("Hello World!\n");
    return 0;
}

Save and return to Terminal to run this:

% xcrun clang helloworld.c
% ./a.out

You should now see a lovely Hello World! message on your terminal. You compiled a C program and ran it. All without an IDE. Take a deep breath. Rejoice.

What did we just do here? We compiled helloworld.c into a Mach-O binary called a.out. That is the default name the compiler will use unless we specify something else.

How did this binary get generated? There are multiple pieces to look at and understand. We'll look at the compiler first.

Hello World and the Compiler

The compiler of choice nowadays is clang (pronounced /klæŋ/). Chris writes in more detail about the compiler.

Briefly put, the compiler will process the helloworld.c input file and produce the executable a.out. This processing consist of multiple steps/stages. What we just did is run all of them in succession:

Preprocessing

  • Tokenization
  • Macro expansion
  • #include expansion

Parsing and Semantic Analysis

  • Translates preprocessor tokens into a parse tree
  • Applies semantic analysis to the parse tree
  • Outputs an Abstract Syntax Tree (AST)

Code Generation and Optimization

  • Translates an AST into low-level intermediate code (LLVM IR)
  • Responsible for optimizing the generated code
  • target-specific code generation
  • Outputs assembly

Assembler

  • Translates assembly code into a target object file

Linker

  • Merges multiple object files into an executable (or a dynamic library)

Let's see how these steps look for our simple example.

Preprocessing

The first thing the compiler will do is preprocess the file. We can tell clang to show us what it looks like if we stop after that step:

% xcrun clang -E helloworld.c

Wow. That will output 413 lines. Let's open that in an editor to see what's going on:

% xcrun clang -E helloworld.c | open -f

At the very top you'll see lots and lots of lines starting with a # (pronounced 'hash'). These are so-called linemarker statements that tell us which file the following lines are from. We need this. If you look at the helloworld.c file again, you'll see that the first line is:

#include <stdio.h>

We have all used #include and #import before. What it does is to tell the preprocessor to insert the content of the file stdio.h where the #include statement was. This is a recursive process: The stdio.h header file in turn includes other files.

Since there's a lot of recursive insertion going on, we need to be able to keep track of where the lines in the resulting source originate from. To do this, the preprocessor inserts a linemarker beginning with a # whenever the origin changes. The number following the # is the line number followed by the name of the file. The numbers at the very end of the line are flags indicating the start of a new file (1), returning to a file (2), that the following is from a system header (3), or that the file is to be treated as wrapped in an extern"C" block.

If you scroll to the very end of the output, you'll find our helloworld.c code:

# 2 "helloworld.c" 2
int main(int argc, char *argv[])
{
 printf("Hello World!\n");
 return 0;
}

In Xcode, you can look at the preprocessor output of any file by selecting Product -> Perform Action -> Preprocess. Note that it takes a few seconds for the Editor to load the preprocessed file -- it'll most likely be close to 100,000 lines long.

Compilation

Next up: parsing and code generation. We can tell clang to output the resulting assembly code like so:

% xcrun clang -S -o - helloworld.c | open -f

Let's take a look at the output. First we'll notice how some lines start with a dot .. These are assembler directives. The other ones are actual x86_64 assembly. Finally there are labels, which are similar to labels in C.

Let's start with the first three lines:

    .section    __TEXT,__text,regular,pure_instructions
    .globl  _main
    .align  4, 0x90

These three lines are assembler directives, not assembly code. The .section directive specifies into which section the following will go. More about sections in a bit.

Next, the .globl directive specifies that _main is an external symbol. This is our main() function. It needs to be visible outside our binary because the system needs to call it to run the executable.

The .align directive specifies the alignment of what follows. In our case, the following code will be 16 (2^4) byte aligned and padded with 0x90 if needed.

Next up is the preamble for the main function:

_main:                                  ## @main
    .cfi_startproc
## BB#0:
    pushq   %rbp
Ltmp2:
    .cfi_def_cfa_offset 16
Ltmp3:
    .cfi_offset %rbp, -16
    movq    %rsp, %rbp
Ltmp4:
    .cfi_def_cfa_register %rbp
    subq    $32, %rsp

This part has a bunch of labels that work the same way as C labels do. They are symbolic references to certain parts of the assembly code. First is the actual start of our function _main. This is also the symbol that is exported. The binary will hence have a reference to this position.

The .cfi_startproc directive is used at the beginning of most functions. CFI is short for Call Frame Information. A frame corresponds loosely to a function. When you use the debugger and step in or step out, you're actually stepping in/out of call frames. In C code, functions have their own call frames, but other things can too. The .cfi_startproc directive gives the function an entry into .eh_frame, which contains unwind information -- this is how exception can unwind the call frame stack. The directive will also emit architecture-dependent instructions for CFI. It's matched by a corresponding .cfi_endproc further down in the output to mark the end of our main() function.

Next, there's another label ## BB#0: and then, finally, the first assembly code: pushq %rbp. This is where things get interesting. On OS X, we have x86_64 code, and for this architecture there's a so-called application binary interface (ABI) that specifies how function calls work at the assembly code level. Part of this ABI specifies that the rbp register (base pointer register) must be preserved across function calls. It's the main function's responsibility to make sure the rbp register has the same value once the function returns. pushq %rbp pushes its value onto the stack so that we can pop it later.

Next, two more CFI directives: .cfi_def_cfa_offset 16 and .cfi_offset %rbp, -16. Again, these will output information related to generating call frame unwinding information and debug information. We're changing the stack and the base pointer and these two basically tell the debugger where things are -- or rather, they'll cause information to be output which the debugger can later use to find its way.

Now, movq %rsp, %rbp will allow us to put local variables onto the stack. subq $32, %rsp moves the stack pointer by 32 bytes, which the function can then use. We're first storing the old stack pointer in rbp and using that as a base for our local variables, then updating the stack pointer to past the part that we'll use.

Next, we'll call printf():

leaq    L_.str(%rip), %rax
movl    $0, -4(%rbp)
movl    %edi, -8(%rbp)
movq    %rsi, -16(%rbp)
movq    %rax, %rdi
movb    $0, %al
callq   _printf

First, leaq loads the pointer to L_.str into the rax register. Note how the L_.str label is defined further down in the assembly code. This is our C string "Hello World!\n". The edi and rsi registers hold the first and second function arguments. Since we'll call another function, we first need to store their current values. That's what we'll use the 32 bytes based off rbp we just reserved for. First a 32-bit 0, then the 32-bit value of the edi register (which holds argc), then the 64-bit value of the rsi register (which holds argv). We're not using those values later, but since the compiler is running without optimizations, it'll store them anyway.

Now we'll put the first function argument for printf(), rax, into the first function argument register edi. The printf() function is a variadic function. The ABI-calling convention specifies that the number of vector registers used to hold arguments need to be stored in the al register. In our case it's 0. Finally, callq calls the printf() function:

    movl    $0, %ecx
    movl    %eax, -20(%rbp)         ## 4-byte Spill
    movl    %ecx, %eax

This sets the ecx register to 0, saves (spills) the eax register onto the stack, then copies the 0 values in ecx into eax. The ABI specifies that eax will hold the return value of a function, and our main() function returns 0:

    addq    $32, %rsp
    popq    %rbp
    ret
    .cfi_endproc

Since we're done, we'll restore the stack pointer by shifting the stack pointer rsp back 32 bytes to undo the effect of subq $32, %rsp from above. Finally, we'll pop the value of rbp we'd stored earlier and then return to the caller with ret, which will read the return address off the stack. The .cfi_endproc balances the .cfi_startproc directive.

Next up is the output for our string literal "Hello World!\n":

    .section    __TEXT,__cstring,cstring_literals
L_.str:                                 ## @.str
    .asciz   "Hello World!\n"

Again, the .section directive specifies which section the following needs to go into. The L_.str label allows the actual code to get a pointer to the string literal. The .asciz directive tells the assembler to output a 0-terminated string literal.

This starts a new section __TEXT __cstring. This section contains C strings:

L_.str:                                 ## @.str
    .asciz     "Hello World!\n"

And these two lines create a null-terminated string. Note how L_.str is the name used further up to access the string.

The final .subsections_via_symbols directive is used by the static link editor.

More information about assembler directives can be found in Apple's OS X Assembler Reference. The AMD 64 website has documentation on the application binary interface for x86_64. It also has a Gentle Introduction to x86-64 Assembly.

Again, Xcode lets you review the assembly output of any file by selecting Product -> Perform Action -> Assemble.

Assembler

The assembler, simply put, converts the (human-readable) assembly code into machine code. It creates a target object file, often simply called object file. These files have a .o file ending. If you build your app with Xcode, you'll find these object files inside the Objects-normal folder inside the derived data directory of your project.

Linker

We'll talk a bit more about the linker later. But simply put, the linker will resolve symbols between object files and libraries. What does that mean? Recall the

callq   _printf

statement. printf() is a function in the libc library. Somehow, the final executable needs to be able to know where in memory the printf() is, i.e. what the address of the _printf symbol is. The linker takes all object files (in our case, only one) and the libraries (in our case, implicitly libc) and resolves any unknown symbols (in our case, the _printf). It then encodes into the final executable that this symbol can be found in libc, and the linker then outputs the final executable that can be run: a.out.

Sections

As we mentioned above, there's something called sections. An executable will have multiple sections, i.e. parts. Different parts of the executable will each go into their own section, and each section will in turn go inside a segment. This is true for our trivial app, but also for the binary of a full-blown app.

Let's take a look at the sections of our a.out binary. We can use the size tool to do that:

% xcrun size -x -l -m a.out 
Segment __PAGEZERO: 0x100000000 (vmaddr 0x0 fileoff 0)
Segment __TEXT: 0x1000 (vmaddr 0x100000000 fileoff 0)
    Section __text: 0x37 (addr 0x100000f30 offset 3888)
    Section __stubs: 0x6 (addr 0x100000f68 offset 3944)
    Section __stub_helper: 0x1a (addr 0x100000f70 offset 3952)
    Section __cstring: 0xe (addr 0x100000f8a offset 3978)
    Section __unwind_info: 0x48 (addr 0x100000f98 offset 3992)
    Section __eh_frame: 0x18 (addr 0x100000fe0 offset 4064)
    total 0xc5
Segment __DATA: 0x1000 (vmaddr 0x100001000 fileoff 4096)
    Section __nl_symbol_ptr: 0x10 (addr 0x100001000 offset 4096)
    Section __la_symbol_ptr: 0x8 (addr 0x100001010 offset 4112)
    total 0x18
Segment __LINKEDIT: 0x1000 (vmaddr 0x100002000 fileoff 8192)
total 0x100003000

Our a.out file has four segments. Some of these have sections.

When we run an executable, the VM (virtual memory) system maps the segments into the address space (i.e. into memory) of the process. Mapping is very different in nature, but if you're unfamiliar with the VM system, simply assume that the VM loads the entire executable into memory -- even though that's not what's really happening. The VM pulls some tricks to avoid having to do so.

When the VM system does this mapping, segments and sections are mapped with different properties, namely different permissions.

The __TEXT segment contains our code to be run. It's mapped as read-only and executable. The process is allowed to execute the code, but not to modify it. The code can not alter itself, and these mapped pages can therefore never become dirty.

The __DATA segment is mapped read/write but non-executable. It contains values that need to be updated.

The first segment is __PAGEZERO. It's 4GB large. Those 4GB are not actually in the file, but the file specifies that the first 4GB of the process' address space will be mapped as non-executable, non-writable, non-readable. This is why you'll get an EXC_BAD_ACCESS when reading from or writing to a NULL pointer, or some other value that's (relatively) small. It's the operating system trying to prevent you from causing havoc.

Within segments, there are sections. These contain distinct parts of the executable. In the __TEXT segment, the __text section contains the compiled machine code. __stubs and __stub_helper are used for the dynamic linker (dyld). This allows for lazily linking in dynamically linked code. __const (which we don't have in our example) are constants, and similarly __cstring contains the literal string constants of the executable (quoted strings in source code).

The __DATA segment contains read/write data. In our case we only have __nl_symbol_ptr and __la_symbol_ptr, which are non-lazy and lazy symbol pointers, respectively. The lazy symbol pointers are used for so-called undefined functions called by the executable, i.e. functions that are not within the executable itself. They're lazily resolved. The non-lazy symbol pointers are resolved when the executable is loaded.

Other common sections in the __DATA segment are __const, which will contain constant data which needs relocation. An example is char * const p = "foo"; -- the data pointed to by p is not constant. The __bss section contains uninitialized static variables such as static int a; -- the ANSI C standard specifies that static variables must be set to zero. But they can be changed at run time. The __common section contains uninitialized external globals, similar to static variables. An example would be int a; outside a function block. Finally, __dyld is a placeholder section, used by the dynamic linker.

Apple's OS X Assembler Reference has more information about some of the section types.

Section Content

We can inspect the content of a section with otool(1) like so:

% xcrun otool -s __TEXT __text a.out 
a.out:
(__TEXT,__text) section
0000000100000f30 55 48 89 e5 48 83 ec 20 48 8d 05 4b 00 00 00 c7 
0000000100000f40 45 fc 00 00 00 00 89 7d f8 48 89 75 f0 48 89 c7 
0000000100000f50 b0 00 e8 11 00 00 00 b9 00 00 00 00 89 45 ec 89 
0000000100000f60 c8 48 83 c4 20 5d c3 

This is the code of our app. Since -s __TEXT __text is very common, otool has a shortcut to it with the -t argument. We can even look at the disassembled code by adding -v:

% xcrun otool -v -t a.out
a.out:
(__TEXT,__text) section
_main:
0000000100000f30    pushq   %rbp
0000000100000f31    movq    %rsp, %rbp
0000000100000f34    subq    $0x20, %rsp
0000000100000f38    leaq    0x4b(%rip), %rax
0000000100000f3f    movl    $0x0, 0xfffffffffffffffc(%rbp)
0000000100000f46    movl    %edi, 0xfffffffffffffff8(%rbp)
0000000100000f49    movq    %rsi, 0xfffffffffffffff0(%rbp)
0000000100000f4d    movq    %rax, %rdi
0000000100000f50    movb    $0x0, %al
0000000100000f52    callq   0x100000f68
0000000100000f57    movl    $0x0, %ecx
0000000100000f5c    movl    %eax, 0xffffffffffffffec(%rbp)
0000000100000f5f    movl    %ecx, %eax
0000000100000f61    addq    $0x20, %rsp
0000000100000f65    popq    %rbp
0000000100000f66    ret

This is the same stuff, this time disassembled. It should look familiar -- it's what we looked at a bit further back when compiling the code. The only difference is that we don't have any of the assembler directives in the code anymore; this is the bare binary executable.

In a similar fashion, we can look at other sections:

% xcrun otool -v -s __TEXT __cstring a.out
a.out:
Contents of (__TEXT,__cstring) section
0x0000000100000f8a  Hello World!\n

Or:

% xcrun otool -v -s __TEXT __eh_frame a.out 
a.out:
Contents of (__TEXT,__eh_frame) section
0000000100000fe0    14 00 00 00 00 00 00 00 01 7a 52 00 01 78 10 01 
0000000100000ff0    10 0c 07 08 90 01 00 00 

Side Note on Performance

On a side note: The __DATA and __TEXT segments have performance implications. If you have a very large binary, you might want to check out Apple's documentation on Code Size Performance Guidelines. Moving data into the __TEXT segment is beneficial, because those pages are never dirty.

Arbitrary Sections

You can add arbitrary data as a section to your executable with the -sectcreate linker flag. This is how you'd add a Info.plist to a single file executable. The Info.plist data needs to go into a __info_plist section of the __TEXT segment. You'd pass -sectcreate segname sectname file to the linker by passing

-Wl,-sectcreate,__TEXT,__info_plist,path/to/Info.plist

to clang. Similarly, -sectalign specifies the alignment. If you're adding an entirely new segment, check out -segprot to specify the protection (read/write/executable) of the segment. These are all documented in the main page for the linker, i.e. ld(1).

You can get to sections using the functions defined in /usr/include/mach-o/getsect.h, namely getsectdata(), which will give you a pointer to the sections data and return its length by reference.

Mach-O

Executables on OS X and iOS are Mach-O executables:

% file a.out 
a.out: Mach-O 64-bit executable x86_64

This is true for GUI applications too:

% file /Applications/Preview.app/Contents/MacOS/Preview 
/Applications/Preview.app/Contents/MacOS/Preview: Mach-O 64-bit executable x86_64

Apple has detailed information about the Mach-O file format.

We can use otool(1) to peek into the executable's Mach header. It specifies what this file is and how it's to be loaded. We'll use the -h flag to print the header information:

% otool -v -h a.out           a.out:
Mach header
      magic cputype cpusubtype  caps    filetype ncmds sizeofcmds      flags
MH_MAGIC_64  X86_64        ALL LIB64     EXECUTE    16       1296   NOUNDEFS DYLDLINK TWOLEVEL PIE

The cputype and cpusubtype specify the target architecture this executable can run on. The ncmds and sizeofcmds are the load commands which we can look at with the -l argument:

% otool -v -l a.out | open -f
a.out:
Load command 0
      cmd LC_SEGMENT_64
  cmdsize 72
  segname __PAGEZERO
   vmaddr 0x0000000000000000
   vmsize 0x0000000100000000
...

The load commands specify the logical structure of the file and its layout in virtual memory. Most of the information otool prints out is derived from these load commands. Looking at the Load command 1 part, we find initprot r-x, which specifies the protection mentioned above: read-only (no-write) and executable.

For each segment and each section within a segment, the load command specifies where in memory it should end up, and with what protection, etc. Here's the output for the __TEXT __text section:

Section
  sectname __text
   segname __TEXT
      addr 0x0000000100000f30
      size 0x0000000000000037
    offset 3888
     align 2^4 (16)
    reloff 0
    nreloc 0
      type S_REGULAR
attributes PURE_INSTRUCTIONS SOME_INSTRUCTIONS
 reserved1 0
 reserved2 0

Our code will end up in memory at 0x100000f30. Its offset in the file is 3888. If you look at the disassembly output from before of xcrun otool -v -t a.out, you'll see that the code is, in fact, at 0x100000f30.

We can also take a look at which dynamic libraries the executable is using:

% otool -v -L a.out
a.out:
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 169.3.0)
    time stamp 2 Thu Jan  1 01:00:02 1970

This is where our executable will find the _printf symbol it's using.

A More Complex Sample

Let's look at a slightly more complex sample with three files:

Foo.h:

#import <Foundation/Foundation.h>

@interface Foo : NSObject

- (void)run;

@end

Foo.m:

#import "Foo.h"

@implementation Foo

- (void)run
{
    NSLog(@"%@", NSFullUserName());
}

@end

helloworld.m:

#import "Foo.h"

int main(int argc, char *argv[])
{
    @autoreleasepool {
        Foo *foo = [[Foo alloc] init];
        [foo run];
        return 0;
    }
}

Compiling Multiple Files

In this sample, we have more than one file. We therefore need to tell clang to first generate object files for each input file:

% xcrun clang -c Foo.m
% xcrun clang -c helloworld.m

We're never compiling the header file. Its purpose is simply to share code between the implementation files which do get compiled. Both Foo.m and helloworld.m pull in the content of the Foo.h through the #import statement.

We end up with two object files:

% file helloworld.o Foo.o
helloworld.o: Mach-O 64-bit object x86_64
Foo.o:        Mach-O 64-bit object x86_64

In order to generate an executable, we need to link these two object files and the Foundation framework with each other:

xcrun clang helloworld.o Foo.o -Wl,`xcrun --show-sdk-path`/System/Library/Frameworks/Foundation.framework/Foundation

We can now run our code:

% ./a.out 
2013-11-03 18:03:03.386 a.out[8302:303] Daniel Eggert

Symbols and Linking

Our small app was put together from two object files. The Foo.o object file contains the implementation of the Foo class, and the helloworld.o object file contains the main() function and calls/uses the Foo class.

Furthermore, both of these use the Foundation framework. The helloworld.o object file uses it for the autorelease pool, and it indirectly uses the Objective-C runtime in form of the libobjc.dylib. It needs the runtime functions to make message calls. This is similar to the Foo.o object file.

All of these are represented as so-called symbols. We can think of a symbol as something that'll be a pointer once the app is running, although its nature is slightly different.

Each function, global variable, class, etc. that is defined or used results in a symbol. When we link object files into an executable, the linker (ld(1)) resolves symbols as needed between object files and dynamic libraries.

Executables and object files have a symbol table that specify their symbols. If we take a look at the helloworld.o object file with the nm(1) tool, we get this:

% xcrun nm -nm helloworld.o
                 (undefined) external _OBJC_CLASS_$_Foo
0000000000000000 (__TEXT,__text) external _main
                 (undefined) external _objc_autoreleasePoolPop
                 (undefined) external _objc_autoreleasePoolPush
                 (undefined) external _objc_msgSend
                 (undefined) external _objc_msgSend_fixup
0000000000000088 (__TEXT,__objc_methname) non-external L_OBJC_METH_VAR_NAME_
000000000000008e (__TEXT,__objc_methname) non-external L_OBJC_METH_VAR_NAME_1
0000000000000093 (__TEXT,__objc_methname) non-external L_OBJC_METH_VAR_NAME_2
00000000000000a0 (__DATA,__objc_msgrefs) weak private external l_objc_msgSend_fixup_alloc
00000000000000e8 (__TEXT,__eh_frame) non-external EH_frame0
0000000000000100 (__TEXT,__eh_frame) external _main.eh

These are all symbols of that file. _OBJC_CLASS_$_Foo is the symbol as the Foo Objective-C class. It's an undefined, external symbol of the Foo class. External means it's not private to this object file, as opposed to non-external symbols which are private to the particular object file. Our helloworld.o object file references the class Foo, but it doesn't implement it. Hence, its symbol table ends up having an entry marked as undefined.

Next, the _main symbol for the main() function is also external because it needs to be visible in order to get called. It, however, is implemented in helloworld.o as well, and resides at address 0 and needs to go into the __TEXT,__text section. Then there are four Objective-C runtime functions. These are also undefined and need to be resolved by the linker.

If we turn toward the Foo.o object file, we get this output:

% xcrun nm -nm Foo.o
0000000000000000 (__TEXT,__text) non-external -[Foo run]
                 (undefined) external _NSFullUserName
                 (undefined) external _NSLog
                 (undefined) external _OBJC_CLASS_$_NSObject
                 (undefined) external _OBJC_METACLASS_$_NSObject
                 (undefined) external ___CFConstantStringClassReference
                 (undefined) external __objc_empty_cache
                 (undefined) external __objc_empty_vtable
000000000000002f (__TEXT,__cstring) non-external l_.str
0000000000000060 (__TEXT,__objc_classname) non-external L_OBJC_CLASS_NAME_
0000000000000068 (__DATA,__objc_const) non-external l_OBJC_METACLASS_RO_$_Foo
00000000000000b0 (__DATA,__objc_const) non-external l_OBJC_$_INSTANCE_METHODS_Foo
00000000000000d0 (__DATA,__objc_const) non-external l_OBJC_CLASS_RO_$_Foo
0000000000000118 (__DATA,__objc_data) external _OBJC_METACLASS_$_Foo
0000000000000140 (__DATA,__objc_data) external _OBJC_CLASS_$_Foo
0000000000000168 (__TEXT,__objc_methname) non-external L_OBJC_METH_VAR_NAME_
000000000000016c (__TEXT,__objc_methtype) non-external L_OBJC_METH_VAR_TYPE_
00000000000001a8 (__TEXT,__eh_frame) non-external EH_frame0
00000000000001c0 (__TEXT,__eh_frame) non-external -[Foo run].eh

The fifth-to-last line shows that _OBJC_CLASS_$_Foo is defined and external to Foo.o -- it has this class's implementation.

Foo.o also has undefined symbols. First and foremost are the symbols for NSFullUserName(), NSLog(), and NSObject that we're using.

When we link these two object files and the Foundation framework (which is a dynamic library), the linker tries to resolve all undefined symbols. It can resolve _OBJC_CLASS_$_Foo that way. For the others, it will need to use the Foundation framework.

When the linker resolves a symbol through a dynamic library (in our case, the Foundation framework), it will record inside the final linked image that the symbol will be resolved with that dynamic library. The linker records that the output file depends on that particular dynamic library, and what the path of it is. That's what happens with the _NSFullUserName, _NSLog, _OBJC_CLASS_$_NSObject, _objc_autoreleasePoolPop, etc. symbols in our case.

We can look at the symbol table of the final executable a.out and see how the linker resolved all the symbols:

% xcrun nm -nm a.out 
                 (undefined) external _NSFullUserName (from Foundation)
                 (undefined) external _NSLog (from Foundation)
                 (undefined) external _OBJC_CLASS_$_NSObject (from CoreFoundation)
                 (undefined) external _OBJC_METACLASS_$_NSObject (from CoreFoundation)
                 (undefined) external ___CFConstantStringClassReference (from CoreFoundation)
                 (undefined) external __objc_empty_cache (from libobjc)
                 (undefined) external __objc_empty_vtable (from libobjc)
                 (undefined) external _objc_autoreleasePoolPop (from libobjc)
                 (undefined) external _objc_autoreleasePoolPush (from libobjc)
                 (undefined) external _objc_msgSend (from libobjc)
                 (undefined) external _objc_msgSend_fixup (from libobjc)
                 (undefined) external dyld_stub_binder (from libSystem)
0000000100000000 (__TEXT,__text) [referenced dynamically] external __mh_execute_header
0000000100000e50 (__TEXT,__text) external _main
0000000100000ed0 (__TEXT,__text) non-external -[Foo run]
0000000100001128 (__DATA,__objc_data) external _OBJC_METACLASS_$_Foo
0000000100001150 (__DATA,__objc_data) external _OBJC_CLASS_$_Foo

We see that all the Foundation and Objective-C runtime symbols are still undefined, but the symbol table now has information about how to resolve them, i.e. in which dynamic library they're to be found.

The executable also knows where to find these libraries:

% xcrun otool -L a.out
a.out:
    /System/Library/Frameworks/Foundation.framework/Versions/C/Foundation (compatibility version 300.0.0, current version 1056.0.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1197.1.1)
    /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 855.11.0)
    /usr/lib/libobjc.A.dylib (compatibility version 1.0.0, current version 228.0.0)

These undefined symbols are resolved by the dynamic linker dyld(1) at runtime. When we run the executable, dyld will make sure that _NSFullUserName, etc. point to their implementation inside Foundation, etc.

We can run nm(1) against Foundation and check that these symbols are, in fact, defined there:

% xcrun nm -nm `xcrun --show-sdk-path`/System/Library/Frameworks/Foundation.framework/Foundation | grep NSFullUserName
0000000000007f3e (__TEXT,__text) external _NSFullUserName 

The Dynamic Link Editor

There are a few environment variables that can be useful to see what dyld is up to. First and foremost, DYLD_PRINT_LIBRARIES. If set, dyld will print out what libraries are loaded:

% (export DYLD_PRINT_LIBRARIES=; ./a.out )
dyld: loaded: /Users/deggert/Desktop/command_line/./a.out
dyld: loaded: /System/Library/Frameworks/Foundation.framework/Versions/C/Foundation
dyld: loaded: /usr/lib/libSystem.B.dylib
dyld: loaded: /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation
dyld: loaded: /usr/lib/libobjc.A.dylib
dyld: loaded: /usr/lib/libauto.dylib
[...]

This will show you all seventy dynamic libraries that get loaded as part of loading Foundation. That's because Foundation depends on other dynamic libraries, which, in turn, depend on others, and so forth. You can run

% xcrun otool -L `xcrun --show-sdk-path`/System/Library/Frameworks/Foundation.framework/Foundation

to see a list of the fifteen dynamic libraries that Foundation uses.

The dyld's Shared Cache

When you're building a real-world application, you'll be linking against various frameworks. And these in turn will use countless other frameworks and dynamic libraries. The list of all dynamic libraries that need to get loaded gets large quickly. And the list of interdependent symbols even more so. There will be thousands of symbols to resolve. This works takes a long time: several seconds.

In order to shortcut this process, the dynamic linker on OS X and iOS uses a shared cache that lives inside /var/db/dyld/. For each architecture, the OS has a single file that contains almost all dynamic libraries already linked together into a single file with their interdependent symbols resolved. When a Mach-O file (an executable or a library) is loaded, the dynamic linker will first check if it's inside this shared cache image, and if so, use it from the shared cache. Each process has this dyld shared cache mapped into its address space already. This method dramatically improves launch time on OS X and iOS.