Comments on article blog-intel-assembler-on-mac-os-x at Zathras.de
http://www.zathras.de/angelweb/blog-intel-assembler-on-mac-os-x.htm
Copyright 2004 by M. Uli Kusterer (witness_dot_of_dot_teachtext_at_gmx_dot_net)
Tue, 30 Dec 1969 07:58:58 GMT

Comment 31 by André http://zathras.de/angelweb/blog-intel-assembler-on-mac-os-x.htm#comment31
Just wanted to update the URLs to the official Assembler and Calling Convention docs by Apple. Seems they both got a very recent update too if you look at the update dates:

PPC/IA32 Assembler Intro and Reference:
http://developer.apple.com/mac/library/documentation/DeveloperTools/Reference/Assembler

LowLevel ABI Function Call Guide:
http://developer.apple.com/mac/library/documentation/DeveloperTools/Conceptual/LowLevelABI

As usual, just reading the docs provided by Apple can already get you quite far :)

Cheers!

André
Comment 30 by leffe http://zathras.de/angelweb/blog-intel-assembler-on-mac-os-x.htm#comment30
I tried to assemble a hello world example I found, written for FreeBSD, on my Intel Mac with nasm, knowing the similarities between OS X and BSD syscalls.
It worked, but only if I produced a Mach-O object file (obviously!); otherwise I would get this error:
"ld: could not find entry point "_start" (perhaps missing crt1.o) for inferred architecture i386"
When the correct object file is produced, "ld -e _start -o foo foo.o" should work.
Comment 29 by al http://zathras.de/angelweb/blog-intel-assembler-on-mac-os-x.htm#comment29
@hdx
If you're having compile problems, the syntax is
gcc youasmfile.as -o executablename
I assumed I should use "as" and "ld", but those turned up bus errors and linking problems and such.
Comment 28 by Mo http://zathras.de/angelweb/blog-intel-assembler-on-mac-os-x.htm#comment28
(In relation to my previous comment) - Apple deliberately doesn't publicise the system call mechanism, save for the Darwin sources, and similarly doesn't support static linking. I'm not sure if they've said as much, but presumably this is so that they can change the mechanism in a future OS release (or even a minor update) without having to recompile anything except libSystem.

@Ted Henry:

It's actually an ABI convention. Mac OS X uses leading underscores, as do (most) DOS compilers and Win32. No idea if Win64 does, but I'd guess so. Quite a few embedded systems do too. I don't know what the rationale was - I've a feeling that the underscore was (at least once upon a time) used to denote global symbols.
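
To make the underscore convention concrete, here is a minimal sketch in Python. The helper name `c_symbol` and its `user_label_prefix` parameter are made up for illustration; the only facts it encodes are that Mac OS X prepends "_" to C identifiers at the symbol level, while ELF systems such as Linux use no prefix.

```python
def c_symbol(name: str, user_label_prefix: str = "_") -> str:
    """Return the assembler-level symbol for a C identifier.

    The default prefix "_" models the Mac OS X (Mach-O) convention;
    pass "" to model an ELF system such as Linux.
    """
    return user_label_prefix + name


print(c_symbol("main"))        # symbol to define in a Mac OS X .s file
print(c_symbol("printf"))      # symbol to call from a Mac OS X .s file
print(c_symbol("main", ""))    # the same function on an ELF system
```

This is why the assembly listings in this thread say `_main` and `_printf` even though the C source would say `main` and `printf`.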
Comment 27 by Mo http://zathras.de/angelweb/blog-intel-assembler-on-mac-os-x.htm#comment27
Linux will (usually) only use INT 0x80 as a fallback system-call mechanism. If I remember rightly, there's a page present in every process (I can't honestly recall whether it's managed by the libc or by the kernel; I think the kernel, though) which contains stubs for the system calls: on modern systems these are SYSENTER instructions, whereas on old ones they'll be INT 0x80 (which is a lot slower than SYSENTER). This way, applications and libraries don't have to care what the system call calling convention (phew) is, which is important when it's changed a few times over the lifetime of the OS and backwards-compatibility needs to be preserved.

Somebody with more knowledge than I would have to chime in about whether Mac OS X/Darwin exhibits any kind of comparable behaviour :)
Comment 26 by me http://zathras.de/angelweb/blog-intel-assembler-on-mac-os-x.htm#comment26
@hdx
You have to link it first.
Use:
as file.s -o file.o
ld file.o -o file
Comment 25 by hdx http://zathras.de/angelweb/blog-intel-assembler-on-mac-os-x.htm#comment25
Hi, I'm trying to run my first assembly program on my new iMac Core 2 Duo, but I'm getting the following error:

ld: could not find entry point "_start" (perhaps missing crt1.o) for inferred architecture i386

The code:

# My first Assembly program

.data

HelloWorldString:
.ascii "Hello World\n"

.text

.globl _start

_start:
# Load all the arguments for write ()

movl $4, %eax
movl $1, %ebx
movl $HelloWorldString, %ecx
movl $12, %edx
int $0x80

# Need to exit the program

movl $1, %eax
movl $0, %ebx
int $0x80



The command:
ld -e _start -o Hello Hello.o



Can anybody help?
Comment 24 by David http://zathras.de/angelweb/blog-intel-assembler-on-mac-os-x.htm#comment24
Hi,
I just started playing around with assembler under OS X too and wrote my first little proggy. It basically does nothing but call system call 1, which ends the program. I had some difficulty getting it up and running until I realized I couldn't simply use the -arch x86_64 option for as and ld. So now I guess that I have a 32-bit program where I wanted a 64-bit program. Does anyone here know what is needed to create a 64-bit program?
Comment 23 by Cory B http://zathras.de/angelweb/blog-intel-assembler-on-mac-os-x.htm#comment23
Here's a nice Hello World program in OS X assembly language for the GNU assembler. Much of it is the same as FreeBSD assembly language. Notice I didn't calculate the absolute address of msg. The program works anyway.

Assemble and link like this:
$ as -o hello.o hello.S
$ ld -e _start -o hello hello.o
-

.data # data section, where we declare our constants
msg: .ascii "Hello, world!\n" # our message
len: .long . - msg # length of our message... the . means the
# current position pointer...

.text # text section, where we execute instructions
.globl _start # make _start visible to the linker

_syscall: # interrupt 80h, "calls" the kernel to perform a syscall
int $0x80
ret

_start: # our entry point
pushl len # push the length onto the stack
pushl $msg # push the (relative) address of msg onto the stack
pushl $1 # push 1 onto the stack (the file descriptor for stdout)
movl $4, %eax # load 4 into eax (the number for the write() syscall)
call _syscall # tell the kernel to write msg to stdout
addl $12, %esp # clean the three 4-byte arguments off the stack

pushl len # push the length onto the stack, to return it from exit()
mov $1, %eax # load 1 into eax (the number for the exit() syscall)
call _syscall # tell the kernel to exit our program

-- EOF --

Here's a version for nasm for comparison. It's a nearly one-to-one correspondence, so I didn't bother adding comments (assemble with `nasm -f macho hello.asm`, link the same as before):

section .data
msg db "Hello, world!",0xa
len equ $ - msg

section .text
global _start

_syscall:
int 0x80
ret

_start:
push dword len
push dword msg
push dword 1
mov eax, 4
call _syscall
add esp, 12

push dword len
mov eax, 1
call _syscall
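
Two numbers in the listings above are computed rather than typed in: the message length (`. - msg` in the GNU version, `$ - msg` in the nasm version) and the 12 bytes added back to the stack pointer after the write call. A small Python sketch of that arithmetic follows; the function names are hypothetical and only serve the illustration.

```python
WORD = 4  # bytes per pushed argument on IA-32


def message_length(message: bytes) -> int:
    """What `. - msg` / `$ - msg` evaluates to: bytes from the label to here."""
    return len(message)


def stack_cleanup_bytes(argument_count: int) -> int:
    """Bytes to add to the stack pointer after pushing that many arguments."""
    return argument_count * WORD


msg = b"Hello, world!\n"
print(message_length(msg))       # 14
print(stack_cleanup_bytes(3))    # 12, matching `addl $12, %esp`
```

The three arguments are the file descriptor, the address of msg and the length, which is why the clean-up is 12 and not the message length.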
Comment 22 by jan http://zathras.de/angelweb/blog-intel-assembler-on-mac-os-x.htm#comment22
Great page! An essential resource!
Comment 21 by alex http://zathras.de/angelweb/blog-intel-assembler-on-mac-os-x.htm#comment21
So here's what I keep getting... does anyone have any ideas?
$ cat asmtest.s
.text
.globl _main
_main:
pushl %ebp
movl %esp, %ebp
subl $8, %esp
movl $0, %eax
leave
ret
$ as -o asmtest asmtest.s
$ chmod +x asmtest
$ ./asmtest
-bash: ./asmtest: Malformed Mach-o file
Comment 20 by Ted Henry http://zathras.de/angelweb/blog-intel-assembler-on-mac-os-x.htm#comment20
> the underscore in front of "main" is a convention in C, so just accept it

Should "C" be "assembly"?
Comment 19 by Tony http://zathras.de/angelweb/blog-intel-assembler-on-mac-os-x.htm#comment19
Hi,
I am using the gcc-translated code as guidance as well, but I am having trouble making calls to Pthread library functions. I keep getting 'illegal instruction' at run time. Worst of all, debuggers (like gdb) cannot reveal at which line the program actually crashed.

The file attached below is the program I use to experiment with C library function calls. The program works if I remove the call to '_pthread_mutex_init'.
The function declaration looks like this:
_pthread_mutex_init(pointer to mutex object, NULL)

I am very very confused by Mac OSX.
Can you help me find the error here?
Thanks.
-----------------------------------------------------------------------
.cstring
.align 2
termination_msg:
.ascii "Program has terminated.\0"

.text

.globl _main
_main:

pushl %ebp
movl %esp, %ebp
#pushl %esi
pushl %ebx

call ___i686.get_pc_thunk.bx
"L12$pb":

# indirect addressing for Mac OSX
leal termination_msg-"L12$pb"(%ebx) , %eax
pushl %eax

call L_printf$stub
addl $4 , %esp

pushl $0 # NULL=$0
leal L_mtx$non_lazy_ptr-"L12$pb"(%ebx) , %eax
movl (%eax), %eax # addr(mtx) is in eax now
pushl %eax
call L_pthread_mutex_init$stub
addl $8 , %esp

#pushl $0
#call exit

popl %ebx
#popl %esi
movl %ebp, %esp
popl %ebp
ret

.comm _mtx , 44

.section __IMPORT,__pointers,non_lazy_symbol_pointers
L_mtx$non_lazy_ptr:
.indirect_symbol _mtx
.long 0

.section __IMPORT,__jump_table,symbol_stubs,self_modifying_code+pure_instructions,5
L_pthread_mutex_init$stub:
.indirect_symbol _pthread_mutex_init
hlt ; hlt ; hlt ; hlt ; hlt

L_printf$stub:
.indirect_symbol _printf
hlt ; hlt ; hlt ; hlt ; hlt

#--------------------------------------------------------------
.subsections_via_symbols
.section __TEXT,__textcoal_nt,coalesced,pure_instructions
.weak_definition ___i686.get_pc_thunk.bx
.private_extern ___i686.get_pc_thunk.bx
___i686.get_pc_thunk.bx:
movl (%esp), %ebx
ret
Comment 18 by André http://zathras.de/angelweb/blog-intel-assembler-on-mac-os-x.htm#comment18
Sorry, forgot to add there is also some info on Mac OS X IA-32 Assembly in the following Apple docs:

IA-32 Function Calling Conventions:

http://developer.apple.com/documentation/DeveloperTools/Conceptual/LowLevelABI/Articles/IA32.html
Comment 17 by André http://zathras.de/angelweb/blog-intel-assembler-on-mac-os-x.htm#comment17
I just wanted to point out that there is an Assembly reference included with the Xcode documentation set, namely the

Introduction to Mac OS X Assembler Guide:

http://developer.apple.com/DOCUMENTATION/DeveloperTools/Reference/Assembler/ASMIntroduction/chapter_1_section_1.html#//apple_ref/doc/uid/TP30000851-CH211-DontLinkElementID_14

Also the documentation for as, although sometimes cryptic:

http://sourceware.org/binutils/docs/as/index.html

And finally, there is the ".intel_syntax noprefix" directive which enables you to use Intel ASM syntax within the GNU Assembler.
noprefix means that you may leave out the %-sign in front of register names, to make it fully compatible with the Intel syntax (earlier versions of GAS also had Intel syntax support but still required the use of % to identify register names).

To switch back to AT&T syntax use ".att_syntax prefix"
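
The visible difference between the two syntaxes, for simple instructions, is the operand order (AT&T: source, destination; Intel: destination, source) plus the `%`/`$` prefixes and the size suffix on the mnemonic. Below is a toy Python sketch of that mapping. It is not how GAS or intel2gas work internally; the function names are made up, and it deliberately handles only a handful of mnemonics with register or numeric immediate operands.

```python
# Only these size-suffixed mnemonics are handled by this toy converter.
_MNEMONICS = {"movl": "mov", "addl": "add", "subl": "sub",
              "pushl": "push", "popl": "pop"}


def _operand(text: str) -> str:
    """Convert one AT&T operand (register or numeric immediate) to Intel form."""
    text = text.strip()
    if text.startswith("%"):
        return text[1:]                  # %esp -> esp
    if text.startswith("$"):
        int(text[1:], 0)                 # raises ValueError if not a number
        return text[1:]                  # $8 -> 8
    raise ValueError("unsupported operand: " + text)


def att_to_intel(line: str) -> str:
    """Translate e.g. 'movl %esp, %ebp' into 'mov ebp, esp'."""
    mnemonic, _, rest = line.strip().partition(" ")
    operands = [_operand(part) for part in rest.split(",")] if rest.strip() else []
    operands.reverse()                   # swap source/destination order
    return (_MNEMONICS[mnemonic] + " " + ", ".join(operands)).strip()


print(att_to_intel("movl %esp, %ebp"))   # mov ebp, esp
print(att_to_intel("subl $8, %esp"))     # sub esp, 8
```

Memory operands and symbol references are left out on purpose, since their translation differs between Intel-style assemblers.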
Comment 16 by pete http://zathras.de/angelweb/blog-intel-assembler-on-mac-os-x.htm#comment16
Hey Uli
Thanks for the guide - nice and simple!
Bye!
Comment 15 by Uli Kusterer http://zathras.de/angelweb/blog-intel-assembler-on-mac-os-x.htm#comment15
@ema: Not at all. You only have to get the anchor's address once. Then you can calculate any subsequent addresses using the value in EBX.

Not to mention that a CALL instruction isn't the same as a function call in a higher-level language. It does not do all the stack set-up and tear-down that a real function call in a high-level language would do, nor does it pass any parameters. So, as long as you don't clobber EBX (or save it somewhere and restore it later), this is completely efficient.

Would absolute addressing be faster? Sure, but it's only a few instructions, and the benefit of being able to dynamically load code at runtime without having to worry about address space collisions far outweighs the downside of those three additional instructions.

Stop thinking like a high-level programmer, already, this is machine language. ;-) A typical function call takes dozens of instructions.
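
The point about getting the anchor's address once can be sketched with plain arithmetic. This is a toy Python model with invented offsets and a hypothetical function name: the distance between a symbol and the anchor is fixed when the code is assembled, so adding it to the run-time anchor address in EBX finds the symbol wherever the code was loaded.

```python
def resolve(load_address: int, anchor_offset: int, symbol_offset: int) -> int:
    """Model of 'symbol - anchor(%ebx)' addressing.

    load_address is only known at run time; the two offsets (made-up
    values here) are positions inside the image, known at assembly time.
    """
    ebx = load_address + anchor_offset        # value obtained once via the CALL trick
    delta = symbol_offset - anchor_offset     # constant baked into the instruction
    return ebx + delta


# The same code finds its data no matter where the loader placed it.
for load_address in (0x1000, 0x8FE00000):
    print(hex(resolve(load_address, anchor_offset=0x20, symbol_offset=0x140)))
```

Every further global access reuses the same EBX value with a different constant, which is why only one anchor lookup is needed.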
Comment 14 by ema http://zathras.de/angelweb/blog-intel-assembler-on-mac-os-x.htm#comment14
So, basically every access to a global variable involves a function call to find out our current address.
And every call to a shared library function results in another indirect call.

This is disturbingly inefficient.
Comment 13 by adil http://zathras.de/angelweb/blog-intel-assembler-on-mac-os-x.htm#comment13
Excellent work! Keep it going and you will be the first one to write a tutorial on the topic :)
Comment 12 by Amade http://zathras.de/angelweb/blog-intel-assembler-on-mac-os-x.htm#comment12
I modified the simple hello world program as described here and it works pretty well; I even managed to add another .cstring and print it.

But there is some part of code not mentioned here without which it doesn't compile:

.section __TEXT,__textcoal_nt,coalesced,pure_instructions
.weak_definition _nextInstructionAddress
.private_extern _nextInstructionAddress

Could you tell me something about it?
Comment 11 by Uli Kusterer http://zathras.de/angelweb/blog-intel-assembler-on-mac-os-x.htm#comment11
@David: Honestly, I don't know. I've only just started to learn this stuff. I've heard in various places that assembler isn't even portable between different assemblers (i.e. between two manufacturers' "assembler compilers" for the same CPU), but there has to be a way. Whether there are problems with embedded or in-line assembly? No idea, I didn't even know the distinction existed.
Comment 10 by David Liontooth http://zathras.de/angelweb/blog-intel-assembler-on-mac-os-x.htm#comment10
A bunch of us are trying to compile transcode (http://www.transcoding.org) on OSX, both PPC and MacIntel. We're finding that inline assembly is not sufficiently supported for the code to compile, so we were hoping you could give us a pointer (this is being discussed on the transcode and macports mailing lists). Transcode 1.0.3 compiles on PPC but not MacIntel; transcode 1.1.0 causes problems with the asm code on both. A couple of guys working on Pre Make Kit (pmk, see http://sourceforge.net/tracker/index.php?func=detail&aid=1569713&group_id=94395&atid=607694) really push this and end up frustrated, concluding "Projects that mix assembly and C code must all fail on MacOS X Intel machines." They also say mplayer avoids the problem by using C embedded assembly. Could you comment on this whole situation? What's the most likely successful way out?

Cheers,
Dave
Comment 9 by Pete H http://zathras.de/angelweb/blog-intel-assembler-on-mac-os-x.htm#comment9
To answer Marc's question, on Mac OS you use the same 0x80 interrupt. For example, to write a 42 byte string labeled 'output' to stdout, the code would look something like:

pushl $42 # length
lea output, %eax
pushl %eax # string address
pushl $1 # file descriptor number
mov $4,%eax # system call number
push %eax
int $0x80 # make the system call

You can see mach/i386/syscall_sw.h for the different software interrupts available, and sys/syscall.h for the different system call numbers.
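
The snippet above pushes the arguments last-to-first and then one extra word (`push %eax`) before `int $0x80`. Here is a toy Python model of the resulting stack. It assumes the BSD-style convention in which the kernel skips one word at the top of the stack (the slot a return address would occupy if the interrupt sat inside a called function) and reads the arguments above it. `Stack`, `write_frame` and `syscall_args` are hypothetical names for this illustration only.

```python
WORD = 4  # bytes per stack slot on IA-32


class Stack:
    """A downward-growing stack: pushing decrements esp, then stores."""

    def __init__(self, top: int = 0x1000):
        self.esp = top
        self.memory = {}

    def push(self, value):
        self.esp -= WORD
        self.memory[self.esp] = value


def write_frame(fd, buffer, length) -> Stack:
    """Build the stack the way the snippet does before int $0x80."""
    stack = Stack()
    stack.push(length)         # pushl $42
    stack.push(buffer)         # pushl %eax (string address)
    stack.push(fd)             # pushl $1
    stack.push("placeholder")  # push %eax, standing in for a return address
    return stack


def syscall_args(stack: Stack, count: int) -> list:
    """What the kernel would read: skip one word, then `count` arguments."""
    return [stack.memory[stack.esp + WORD * (i + 1)] for i in range(count)]


print(syscall_args(write_frame(1, "output", 42), 3))  # [1, 'output', 42]
```

Without the extra word every argument would be read one slot too high, which is also why wrapping `int $0x80` in a small function reached with `call` works without a dummy push.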
Comment 8 by Uli Kusterer http://zathras.de/angelweb/blog-intel-assembler-on-mac-os-x.htm#comment8
Haven't played with that yet. It may be the same as BSD on OS X, or it may be different (because BSD doesn't use Mach)... I'd be interested in hearing what you find out.
Comment 7 by Marc Haisenko http://zathras.de/angelweb/blog-intel-assembler-on-mac-os-x.htm#comment7
Good writeup! Very nice! This is exactly how it works on Linux as well. The "Calling a system function" part is new to me and very interesting; I have to try that on Linux as well.

There's only one thing left which I want to figure out: how to make a system call. On Linux, you put the arguments in %eax, %ebx, ... (%eax contains the number of the system call). Then you call interrupt 0x80. E.g. to print a string on stdout, you do:

movl $4, %eax # syscall: write
movl $1, %ebx # file descriptor 1 (stdout)
movl $string, %ecx # address of the string
movl $len, %edx # length of the string
int $0x80

On FreeBSD, it seems that instead of passing the arguments via registers they get passed via the user stack and call a different interrupt. I wonder how it's done in Mac OS X.
Comment 6 by Uli Kusterer http://zathras.de/angelweb/blog-intel-assembler-on-mac-os-x.htm#comment6
Blake, you're right, that should have been "base" pointer in the first case. The second case is meant to be more general, but I've clarified it a bit since, after all, the example code actually messes with the base pointer. Thanks for catching those!
Comment 5 by Blake C. http://zathras.de/angelweb/blog-intel-assembler-on-mac-os-x.htm#comment5
"We also save the *back* pointer (the point where our caller can find its parameters on the stack) to the stack, and set it to the current stack pointer..."

"Yes, since the stack starts at the end of memory and grows towards the beginning, and you subtract from the stack pointer to make it larger, you need to add to the *stack* pointer to find something in it."

Should *back* and *stack* above instead say 'base'?

Thanks for the writeup :)
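
The quoted passages describe a stack that grows toward lower addresses and a base pointer used to find things in it. A toy Python model of the usual IA-32 function prologue (`pushl %ebp`, `movl %esp, %ebp`, `subl $8, %esp`) may make the offsets easier to see. The addresses and the name `enter_function` are invented for the illustration.

```python
WORD = 4  # bytes per stack slot on IA-32


def enter_function(argument, stack_top: int = 0x1000, caller_ebp: int = 0x2000):
    """Simulate a one-argument call plus the callee's prologue."""
    memory = {}
    esp = stack_top

    def push(value):
        nonlocal esp
        esp -= WORD              # the stack grows toward lower addresses
        memory[esp] = value

    push(argument)               # caller: pushl <argument>
    push("return address")       # call pushes the return address
    push(caller_ebp)             # callee: pushl %ebp
    ebp = esp                    # callee: movl %esp, %ebp
    esp -= 2 * WORD              # callee: subl $8, %esp (room for locals)
    return memory, ebp, esp


memory, ebp, esp = enter_function(42)
print(memory[ebp + 8])   # 42: the argument sits above the base pointer
print(memory[ebp + 4])   # return address
print(ebp - esp)         # 8 bytes reserved for locals below the base pointer
```

So parameters are found by adding to the base pointer, and locals by subtracting from it.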
Comment 4 by bxd http://zathras.de/angelweb/blog-intel-assembler-on-mac-os-x.htm#comment4
@Jan: Xcode->Preferences->Debugging->Disassembly Style from "AT&T" to "Intel".

If you're in bare gdb for some reason, it's "set disassembly-flavor intel" vs "att".
Comment 3 by Uli Kusterer http://zathras.de/angelweb/blog-intel-assembler-on-mac-os-x.htm#comment3
@Wolf: Thanks, Jon! :-) Coming from you, that means a lot to me.

@Jan: Jordan pointed me at http://www.niksula.hut.fi/~mtiihone/intel2gas/, which is a converter to convert Intel syntax (what a lot of tutorials on the web use) to AT&T syntax (what GCC uses and hence I use in this tutorial). Its features also mention it has preliminary support for the reverse, AT&T to Intel. Maybe that can help you? Let me know whether it works for you.
Comment 2 by Jan Patera http://zathras.de/angelweb/blog-intel-assembler-on-mac-os-x.htm#comment2
Hi,

Do you know if there is a way to make the Xcode disassembly feature use the MacroAssembler syntax commonly used on Wintel?
E.g. "mov ebp,esp" instead of "movl %esp, %ebp"

-- Jan
Comment 1 by rentzsch http://zathras.de/angelweb/blog-intel-assembler-on-mac-os-x.htm#comment1
Excellent write-up: very approachable. Well done!