~ wrote:
A typical example is linux/arch/i386/lib/strstr.c:
It is a typical example (it even comes from an older Linux kernel), but it has a bug. To demonstrate, try this code:
Code:
#include <stdio.h>

char * strstr(const char * cs,const char * ct)
{
    int d0, d1;
    register char * __res;
    __asm__ __volatile__(
        "movl %6,%%edi\n\t"
        "repne\n\t"
        "scasb\n\t"
        "notl %%ecx\n\t"
        "decl %%ecx\n\t" /* NOTE! This also sets Z if searchstring='' */
        "movl %%ecx,%%edx\n"
        "1:\tmovl %6,%%edi\n\t"
        "movl %%esi,%%eax\n\t"
        "movl %%edx,%%ecx\n\t"
        "repe\n\t"
        "cmpsb\n\t"
        "je 2f\n\t" /* also works for empty string, see above */
        "xchgl %%eax,%%esi\n\t"
        "incl %%esi\n\t"
        "cmpb $0,-1(%%eax)\n\t"
        "jne 1b\n\t"
        "xorl %%eax,%%eax\n\t"
        "2:"
        :"=a" (__res), "=&c" (d0), "=&S" (d1)
        :"0" (0), "1" (0xffffffff), "2" (cs), "g" (ct)
        :"dx", "di");
    return __res;
}

int main()
{
    char string[]="Hello There";
    char search[]="here";
    return (strstr(string, search) ? 1 : 0);
}
The program should exit with status 1, since "here" occurs in "Hello There". Compile and run without optimizations:
Code:
gcc test.c -O0 -m32 -Wall
./a.out; echo $?
1
Good to go? Not quite. Build with optimizations on:
Code:
gcc test.c -O3 -m32 -Wall
./a.out; echo $?
0
So why does turning optimizations on cause this to fail? It is a subtle bug in the inline assembly. Passing pointers to memory through registers, as is done with the input constraints "2" (cs) (tied to "=&S" (d1)) and "g" (ct), doesn't tell the compiler that what those pointers point at is going to be read or written. In this case the code generator produced code that never stored the contents of search and string on the stack, because the optimizer never realized that the data in the character arrays was being accessed. We only told the compiler we were using the pointers (not what they point at). To get around this you can add a "memory" clobber to the inline assembly. It tells the compiler that the assembly may read or write arbitrary memory, so everything that belongs in memory (including the array contents) is written out before the inline assembly executes, and any values cached in registers are reloaded afterwards if need be. Adding a memory clobber can be done with this modification to the clobber list:
Code:
:"dx", "di", "memory");
This is discussed in the GCC inline assembly documentation, along with an alternate solution (the example there shows a proper repne scasb), in the section 6.47.2.6 Clobbers and Scratch Registers.
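For illustration, here is a rough sketch of that alternate solution, assuming GCC on x86 (32- or 64-bit). Instead of a blanket "memory" clobber, the bytes being read are described to the compiler through an "m" input operand, in the manner of the repne scasb example in the GCC documentation. The name scan_strlen is made up for this post and is not from the kernel source. On other compilers or architectures the sketch falls back to a plain C loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the kernel source): a strlen built on
 * repne scasb that uses a memory input operand instead of a "memory"
 * clobber. Assumes the direction flag is clear, as the x86 ABIs
 * guarantee on function entry. */
size_t scan_strlen(const char *s)
{
    assert(s != NULL);
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    size_t count;
    const char *p = s;
    __asm__ (
        "repne scasb"
        : "=c" (count),                  /* out: leftover count in ECX/RCX */
          "+D" (p)                       /* in/out: scan pointer in EDI/RDI */
        : "m" (*(const char (*)[]) s),   /* the bytes at s are read by the asm */
          "0" ((size_t)-1),              /* initial count: all ones */
          "a" (0)                        /* AL = 0, the byte to search for */
        : "cc");
    /* The count was decremented once per byte scanned, terminator included. */
    return ~count - 1;
#else
    /* Portable fallback so the sketch still builds elsewhere. */
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
#endif
}
```

With this form the compiler knows the assembly reads the memory at s, so it cannot discard the stores that initialize the array, while values unrelated to s may stay cached in registers across the asm. Calling scan_strlen(search) on the search array from the test program above should give 4.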