
Funny thing about C parameter evaluation order...
I just explained this to a friend today, and thought this might make an interesting blog posting:
#include <stdio.h>

int main( int argc, const char * argv[] )
{
    char theText[2] = { 'A', 'B' };
    char* myString = theText;
    printf( "%c, %c\n", *(++myString), *myString );
    return 0;
}
The above code is platform-dependent in C. Yes, you read correctly: platform-dependent. And I'm not nitpicking about old compilers, or about some compiler not having printf() or not supporting the POSIX standard.
This code is platform-dependent, because the C standard says that there is no guarantee in which order the parameters of a function call get evaluated. So, if you run the above code, it could print B, B (which most of you probably expected because it corresponds to our left-to-right reading order) or it could print B, A.
If you want to test this and you own an Intel Mac, you can do the following thanks to Rosetta's PowerPC emulation: Create a new "Standard Tool" project in Xcode and paste the above code into the main.c file. Switch to "Release" and change "Architectures" in the build settings for the release build configuration to be "ppc". Build and Run. It'll print B, B. Now change the architecture to "i386" and build and run again. It'll print B, A.
So, why doesn't C define an order? Why did anyone think such odd behaviour was a good idea? Well, to explain that, we'll have to look at what your computer does under the hood to execute a function call. In general, there are two steps: First, the parameters are evaluated and stored in some standardized place where the called function can find them, and then the processor "jumps" to the first instruction in the new function and starts executing it.
Some CPUs have registers, which are little variables inside the CPU that can hold short values, and which can be accessed a lot quicker than actually going over to a RAM chip and fetching a value. There are different registers for different kinds of values. Many CPUs have separate registers for floating-point numbers and integers. And just like with RAM, it's sometimes faster to access these registers in a certain order.
So, it may be faster to first evaluate all integer-value parameters, and then those that contain floating-point values. Depending on what physical CPU your computer has (or in the case of Rosetta, what characteristics the emulated CPU your code is being run on has), these performance characteristics may be different. Some CPUs may have so few registers that the parameters will always have to be passed in RAM. Others may put larger parameters in RAM and smaller ones in registers; others again may put the first couple of parameters in registers (maybe even distributing a longer parameter across several registers) and the rest, which don't fit, in RAM, and so on.
So, to make sure C can be made to run that little bit faster on any of these CPUs, its designers decided not to enforce an order for the evaluation of parameters. And that's one of the dangers of writing code in C++ or Objective-C: It may look like a high-level language, but underneath it is still a portable assembler, with platform dependencies like this.
Drew Thaler writes: I don't think it's platform-dependent so much as it is compiler-dependent. gcc-x86 does one thing, gcc-ppc does something else. CodeWarrior-ppc or xlc-ppc might have another behavior.
This is where warnings are useful. gcc's -Wmost will say:
warning: operation on 'myString' may be undefined
Uli Kusterer replies: ★ @Drew: Good point, it really is compiler-dependent. Even though platform details like the CPU and the operating system's ABI may set the general guidelines by which compilers get implemented, different levels of optimization may be implemented in different compilers, and different compilers may use slightly different algorithms. As long as everything ends up on the stack in the correct order, or in the correct registers, most compilers can do this any way they want.
ezj writes: Thanks, great info. I assumed left-to-right processing in a collection of printf arguments, and fortunately noticed some funky results. Turns out your post nailed the issue.
Dan writes: Aha! Thanks for this useful information. I was trying to work out what this compiler warning was trying to warn me about, and your explanation clarified it straightforwardly.
JJ writes: This is NOT platform-specific OR compiler-specific. This is completely undefined behavior because you are reading and modifying 'myString' with no sequence point in between. Do not write code like this (it is wrong) and do not teach others to write code like this either.
Please read up on sequence points and understand what "undefined behavior" is (which is what gcc -Wmost is telling you). Undefined means you should never do it because it is always wrong. If this was merely "implementation-defined behavior" then yes, you could claim it was platform or compiler dependent, depending on which implementation defined the semantics for you.