Home > programming > Compiler optimization

Compiler optimization

I saw this a while back, but i still think that this is an impressive example of what modern compilers can do. Say, when writing C code you have the need to rotate bits. There is no rotate operator in C so the bit rotation has to be implemented like this:

$ cat rotate.c
unsigned long rotate_left(unsigned long a, unsigned int shift)
return a << shift | a >> (sizeof(a) * 8 - shift);
unsigned long rotate_right(unsigned long a, unsigned int shift)
return a >> shift | a << (sizeof(a) * 8 - shift);

Now compile it with gcc (version 4.4.1 here, but most other versions should do the same):

$ gcc -c -O rotate.c

The compiler knows that the C code only rotates the bits and that this can be done with the x86 rol and ror instructions:

$ objdump -d rotate.o

rotate.o: file format elf32-i386

Disassembly of section .text:

00000000 :
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 8b 4d 0c mov 0xc(%ebp),%ecx
6: 8b 45 08 mov 0x8(%ebp),%eax
9: d3 c0 rol %cl,%eax
b: 5d pop %ebp
c: c3 ret

0000000d :
d: 55 push %ebp
e: 89 e5 mov %esp,%ebp
10: 8b 4d 0c mov 0xc(%ebp),%ecx
13: 8b 45 08 mov 0x8(%ebp),%eax
16: d3 c8 ror %cl,%eax
18: 5d pop %ebp
19: c3 ret

This means that it is usually the best thing to write readable C code and let the compiler optimize things. Modern compilers can be pretty good at this.

Categories: programming Tags: ,
  1. August 19, 2010 at 7:59 pm

    I agree that compiler optimizations are awesome and what not. But, actually ROR/ROL is not the way to go. In this case GCC is wrong. If you read the intel optimization guide it will advise not to use these opcodes. These instructions are legacy instructions and they have ‘slow paths’ in the CPU (they are emulated).

    • August 20, 2010 at 6:36 am

      In http://www.intel.com/Assets/PDF/manual/248966.pdf i found this: “Avoid ROTATE by register or ROTATE by immediate instructions. If possible, replace with a ROTATE by 1 instruction.”

      Not being an expert for compilers, but just out of curiosity: What would be a better approach for compiling this to opcodes?

  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s