OSDev.org
https://forum.osdev.org/

Warning, gcc 10.2.0 incompatible changes!
https://forum.osdev.org/viewtopic.php?f=11&t=37252
Page 1 of 2

Author:  bzt [ Mon Sep 21, 2020 12:03 pm ]
Post subject:  Warning, gcc 10.2.0 incompatible changes!

Hi All,

You should know that gcc 10.2.0 is not compatible with previous gcc versions, and could easily miscompile previously working source.

1. an interesting thing, that memory addresses and builtins work differently. This will cause a segfault:
Code:
memcpy(&data + x, &data + y, ...);
However these compile correctly:
Code:
memcpy((void*)&data + x, (void*)&data + y, ...);   or
memcpy(&data[x], &data[y], ...);
This was reported to me here, you can try the code and all (single file example, mkboot.c).

2. strings are concatenated incorrectly, this is specially important for UEFI boot loader writers. gcc prior to 10.2.0 concatenated L"a" "b" "c" correctly into L"abc". But with 10.2.0 not any more!

I can't even describe this with C syntax, only in a dump: prior 10.2.0 that string constant was 'a' 0 'b' 0 'c' 0 0 0, but with 10.2.0: 'a' 0 'b' 'c' 0. This is particularly annoying if your code uses POSIX defines like PRIu64 and others, since they are defined as strings and not as widestrings. Watch out! For example L"%" PRIu64 " bytes" won't become L"%lu bytes" as expected (and how it was prior to 10.2.0), but you have to use L"%" L##PRIu64 L" bytes" or something! I run into this problem with my USBImager's Windows port here, which uses wchar_t and therefore needs L"" constants.

Cheers,
bzt

Author:  alexfru [ Mon Sep 21, 2020 12:20 pm ]
Post subject:  Re: Warning, gcc 10.2.0 incompatible changes!

Regular compiler bugs in unstable versions?

Author:  nullplan [ Mon Sep 21, 2020 12:37 pm ]
Post subject:  Re: Warning, gcc 10.2.0 incompatible changes!

Have you tried
Code:
memcpy(data + x, data + y, ...);

Even though I use C every day, and have done so for half my life, I don't know what &data + x will do in the original source. data is an array of 65536 unsigned chars, so &data is presumably a pointer to such an array. So what happens when you add to it? I honestly don't know.

The memcpy calls you changed are defined behavior, though. If they truly fail with GCC 10, then that is a bug, since source and destination don't overlap.

As for the other point you raised, I actually went to cppreference.com to see pretty much your example on display. Yes, that is a bug. The code L"a" "b" "c" is supposed to be equivalent to L"abc". If it is not, the compiler is broken.

Looks to me like the new compiler isn't quite ready for prime time yet.

Author:  bzt [ Mon Sep 21, 2020 1:38 pm ]
Post subject:  Re: Warning, gcc 10.2.0 incompatible changes!

alexfru wrote:
Regular compiler bugs in unstable versions?
Well, I've used the gcc of my distro and the latest MSYS2 mingw. Both installed 10.2.0.
nullplan wrote:
Looks to me like the new compiler isn't quite ready for prime time yet.
Yep, I got to the same conclusion. Unfortunately it is out as the latest stable. So OSDevers using gcc, watch out!

nullplan wrote:
Have you tried
Code:
memcpy(data + x, data + y, ...);
As you can see in the issue, I've added these to the code
Code:
printf("%lx %lx %lx %lx\n",(uint64_t)data, (uint64_t)(&data), (uint64_t)((void*)&data), (uint64_t)(&data[0]));
printf("%lx %lx %lx %lx\n",(uint64_t)(data+0x1EE), (uint64_t)(&data+0x1EE), (uint64_t)(&data[0]+0x1EE), (uint64_t)((void*)&data+0x1EE));
and it printed
Code:
7fff341045d0 7fff341045d0 7fff341045d0 7fff341045d0
7fff341047be 7fff35fe45d0 7fff341047be 7fff341047be
So I haven't tried "data + x" per se, but I guess it would work (being the 1st col in the 2nd row, printed the correct offset). It is just the 2nd col in the 2nd row that's incorrect (the (uint64_t)(&data+0x1EE) one).

Cheers,
bzt

Author:  Octocontrabass [ Mon Sep 21, 2020 4:19 pm ]
Post subject:  Re: Warning, gcc 10.2.0 incompatible changes!

bzt wrote:
1. an interesting thing, that memory addresses and builtins work differently. This will cause a segfault:
Code:
memcpy(&data + x, &data + y, ...);

Taking the address of an array results in a pointer to an object with the same type as the array, not one element of the array, so you're addressing well past the end of your array and the behavior is undefined.

bzt wrote:
2. strings are concatenated incorrectly, this is specially important for UEFI boot loader writers. gcc prior to 10.2.0 concatenated L"a" "b" "c" correctly into L"abc". But with 10.2.0 not any more!

Compiler Explorer says it works fine. (Yes, I tried with -fshort-wchar too.) Does running GCC with "-S" produce sensible output?

Author:  bzt [ Mon Sep 21, 2020 7:49 pm ]
Post subject:  Re: Warning, gcc 10.2.0 incompatible changes!

Octocontrabass wrote:
Taking the address of an array results in a pointer to an object with the same type as the array, not one element of the array,
Exactly, that's why it should work with memcpy.
Octocontrabass wrote:
so you're addressing well past the end of your array
You make absolutely no sense. First, how do you know "x" is bigger than the array length? (In the actual code it isn't). Second, If that were true, then how do you explain that in the printf both "&data" and "&data + x" printed the same address?
Octocontrabass wrote:
Does running GCC with "-S" produce sensible output?
IT'S WIN32. There's nothing sensible about it! I did grep for "memcpy" in objdump though:
- With "&data + x", there's a libc call to memcpy@PLT (builtin not used).
- With "&data[x]" and "(void*)&data + x" there isn't any (builtin inlined).

Cheers,
bzt

Author:  Octocontrabass [ Mon Sep 21, 2020 8:34 pm ]
Post subject:  Re: Warning, gcc 10.2.0 incompatible changes!

bzt wrote:
You make absolutely no sense. First, how do you know "x" is bigger than the array length? (In the actual code it isn't).

In C, pointer arithmetic goes by the size of the object pointed to. If you add 1 to a pointer to an array of 65536 char, you're adding 65536 to the underlying pointer value.

bzt wrote:
Second, If that were true, then how do you explain that in the printf both "&data" and "&data + x" printed the same address?

Take a closer look at your printf output: 0x7fff35fe45d0 - 0x7fff341045d0 = 0x1ee0000

bzt wrote:
IT'S WIN32. There's nothing sensible about it! I did grep for "memcpy" in objdump though:

I'm not asking about memcpy, I'm asking about the wchar_t strings.

Author:  bzt [ Tue Sep 22, 2020 5:40 am ]
Post subject:  Re: Warning, gcc 10.2.0 incompatible changes!

Octocontrabass wrote:
Take a closer look at your printf output: 0x7fff35fe45d0 - 0x7fff341045d0 = 0x1ee0000
Ah, you're right! I missed that! However that's not correct either, because the type of data is unsigned char. "&data" should give the address of the array, and "&data + x" should be that address plus x. And it should not influence whether the memcpy gets inlined or not, but it does.

Octocontrabass wrote:
I'm not asking about memcpy, I'm asking about the wchar_t strings.
I was a bit tired, sorry. For the wchar_t thing, I've used a hexeditor to take a look at the string constants in the binary. Nullplan says this is incorrect behavior, tbh I haven't checked that, I just noticed that the handling of string constants changed and caused wsprintfW to crash.

Cheers,
bzt

Author:  nullplan [ Tue Sep 22, 2020 7:51 am ]
Post subject:  Re: Warning, gcc 10.2.0 incompatible changes!

bzt wrote:
Ah, you're right! I missed that! However that's not correct either, because the type of data is unsigned char. "&data" should give the address of the array, and "&data + x" should be that address plus x. And it should not influence whether the memcpy gets inlined or not, but it does.
No, data is an array [65536] of unsigned char. So &data is a pointer to array [65536] of unsigned char. So &data + n is a pointer to the n'th array [65536] of unsigned char following data. If data were an int, then &data would be a pointer to int, and &data + n would be a pointer to the n'th int following data. There should be no difference if data happens to be an array. Also, this way you'd see that that syntax is obviously trouble.

Honestly, I never understood why people write pointers to arrays like that. Just use data + n if you mean data + n. Or maybe &data[n] (I have had reason to use that notation in the past, even to prefer it over data + n). Your "fix" uses a GCC extension which assigns a meaning to pointer arithmetic with void-pointers (namely that they behave like char-pointers).

Author:  Octocontrabass [ Tue Sep 22, 2020 4:58 pm ]
Post subject:  Re: Warning, gcc 10.2.0 incompatible changes!

bzt wrote:
And it should not influence whether the memcpy gets inlined or not, but it does.

Why shouldn't it? GCC can easily see that the behavior will be undefined, so it has no reason to emit sensible code.

bzt wrote:
For the wchar_t thing, I've used a hexeditor to take a look at the string constants in the binary.

That's perfectly reasonable, but it doesn't tell us if the bug is really in GCC. It could be a bug in binutils. Using "-S" allows us to examine the code generated by GCC before GAS does anything with it.

Author:  Solar [ Wed Sep 23, 2020 8:05 am ]
Post subject:  Re: Warning, gcc 10.2.0 incompatible changes!

Errr... no.

GCC produces assembly only if you pass -S. This is assembly generated from the in-memory representation of the source passed. Object code is generated by GCC from the in-memory representation, not by GAS from an intermediary ASM. GAS is never run by a GCC compile.

Author:  Solar [ Wed Sep 23, 2020 8:16 am ]
Post subject:  Re: Warning, gcc 10.2.0 incompatible changes!

nullplan wrote:
As for the other point you raised, I actually went to cppreference.com to see pretty much your example on display. Yes, that is a bug. The code L"a" "b" "c" is supposed to be equivalent to L"abc". If it is not, the compiler is broken.


Hm...

Two things, here:

[list=]
[*] The compiler must be running in C99+ mode (previous to that, mixing prefixes resulted in UB). I guess we can pretty much assume that.
[*] What, exactly, is the evaluation order of string literal concatenation? If right-to-left, the "b" "c" is evaluated first (to a narrow string), and perhaps the compiler doesn't re-evaluate that when concatenating it with the L"a"...
[/list]

Why didn't @bzt prefix each literal, or write it in one to begin with? Much less ambiguous that way...

Author:  bzt [ Wed Sep 23, 2020 8:48 am ]
Post subject:  Re: Warning, gcc 10.2.0 incompatible changes!

Solar wrote:
Why didn't @bzt prefix each literal, or write it in one to begin with? Much less ambiguous that way...
For three reasons:
1. PRIu64 is defined by the POSIX headers, not by me (and without the prefix).
2. It worked with gcc 7, 8, 9, so I did not see any reason not to use PRIx until now.
3. using one literal is not a portable solution, it's just a workaround. PRIx defines are there to support multiple environments, just like the uint64_t define in stdint.h.

Again, I don't want to say which one is correct the pre-10.2.0 one or the 10.2.0 one, all I'm saying is, gcc behaviour has changed.

Cheers,
bzt

Author:  Octocontrabass [ Wed Sep 23, 2020 4:10 pm ]
Post subject:  Re: Warning, gcc 10.2.0 incompatible changes!

bzt wrote:
Again, I don't want to say which one is correct the pre-10.2.0 one or the 10.2.0 one, all I'm saying is, gcc behaviour has changed.

Behavior of my copy of GCC 10.2 has not changed relative to previous versions. How can I get a copy of GCC that exhibits the wchar_t string bug?

Author:  bzt [ Wed Sep 23, 2020 4:35 pm ]
Post subject:  Re: Warning, gcc 10.2.0 incompatible changes!

Octocontrabass wrote:
Behavior of my copy of GCC 10.2 has not changed relative to previous versions. How can I get a copy of GCC that exhibits the wchar_t string bug?
As I wrote, the problem appeared with wsprintfW here. Btw it was reported here, and happened after I've upgraded MSYS2 to the latest, which comes with gcc 10.2.0. I had no issues with older MSYS. To trigger the error, create a disk backup by pressing the "Read" button, the app will crash (but first compile the commit e216859db725, because I've already fixed the source by removing PRIu64, so it's working now).

Cheers,
bzt

Page 1 of 2 All times are UTC - 6 hours
Powered by phpBB © 2000, 2002, 2005, 2007 phpBB Group
http://www.phpbb.com/