OSDev.org

The Place to Start for Operating System Developers
It is currently Thu Mar 28, 2024 3:54 am

All times are UTC - 6 hours




Post new topic Reply to topic  [ 25 posts ]  Go to page 1, 2  Next
Author Message
 Post subject: Warning, gcc 10.2.0 incompatible changes!
PostPosted: Mon Sep 21, 2020 12:03 pm 
Offline
Member
Member
User avatar

Joined: Thu Oct 13, 2016 4:55 pm
Posts: 1584
Hi All,

You should know that gcc 10.2.0 is not compatible with previous gcc versions, and could easily miscompile previously working source.

1. an interesting thing, that memory addresses and builtins work differently. This will cause a segfault:
Code:
memcpy(&data + x, &data + y, ...);
However these compile correctly:
Code:
memcpy((void*)&data + x, (void*)&data + y, ...);   or
memcpy(&data[x], &data[y], ...);
This was reported to me here, you can try the code and all (single file example, mkboot.c).

2. strings are concatenated incorrectly, this is specially important for UEFI boot loader writers. gcc prior to 10.2.0 concatenated L"a" "b" "c" correctly into L"abc". But with 10.2.0 not any more!

I can't even describe this with C syntax, only in a dump: prior 10.2.0 that string constant was 'a' 0 'b' 0 'c' 0 0 0, but with 10.2.0: 'a' 0 'b' 'c' 0. This is particularly annoying if your code uses POSIX defines like PRIu64 and others, since they are defined as strings and not as widestrings. Watch out! For example L"%" PRIu64 " bytes" won't become L"%lu bytes" as expected (and how it was prior to 10.2.0), but you have to use L"%" L##PRIu64 L" bytes" or something! I run into this problem with my USBImager's Windows port here, which uses wchar_t and therefore needs L"" constants.

Cheers,
bzt


Top
 Profile  
 
 Post subject: Re: Warning, gcc 10.2.0 incompatible changes!
PostPosted: Mon Sep 21, 2020 12:20 pm 
Offline
Member
Member

Joined: Tue Mar 04, 2014 5:27 am
Posts: 1108
Regular compiler bugs in unstable versions?


Top
 Profile  
 
 Post subject: Re: Warning, gcc 10.2.0 incompatible changes!
PostPosted: Mon Sep 21, 2020 12:37 pm 
Offline
Member
Member

Joined: Wed Aug 30, 2017 8:24 am
Posts: 1593
Have you tried
Code:
memcpy(data + x, data + y, ...);

Even though I use C every day, and have done so for half my life, I don't know what &data + x will do in the original source. data is an array of 65536 unsigned chars, so &data is presumably a pointer to such an array. So what happens when you add to it? I honestly don't know.

The memcpy calls you changed are defined behavior, though. If they truly fail with GCC 10, then that is a bug, since source and destination don't overlap.

As for the other point you raised, I actually went to cppreference.com to see pretty much your example on display. Yes, that is a bug. The code L"a" "b" "c" is supposed to be equivalent to L"abc". If it is not, the compiler is broken.

Looks to me like the new compiler isn't quite ready for prime time yet.

_________________
Carpe diem!


Top
 Profile  
 
 Post subject: Re: Warning, gcc 10.2.0 incompatible changes!
PostPosted: Mon Sep 21, 2020 1:38 pm 
Offline
Member
Member
User avatar

Joined: Thu Oct 13, 2016 4:55 pm
Posts: 1584
alexfru wrote:
Regular compiler bugs in unstable versions?
Well, I've used the gcc of my distro and the latest MSYS2 mingw. Both installed 10.2.0.
nullplan wrote:
Looks to me like the new compiler isn't quite ready for prime time yet.
Yep, I got to the same conclusion. Unfortunately it is out as the latest stable. So OSDevers using gcc, watch out!

nullplan wrote:
Have you tried
Code:
memcpy(data + x, data + y, ...);
As you can see in the issue, I've added these to the code
Code:
printf("%lx %lx %lx %lx\n",(uint64_t)data, (uint64_t)(&data), (uint64_t)((void*)&data), (uint64_t)(&data[0]));
printf("%lx %lx %lx %lx\n",(uint64_t)(data+0x1EE), (uint64_t)(&data+0x1EE), (uint64_t)(&data[0]+0x1EE), (uint64_t)((void*)&data+0x1EE));
and it printed
Code:
7fff341045d0 7fff341045d0 7fff341045d0 7fff341045d0
7fff341047be 7fff35fe45d0 7fff341047be 7fff341047be
So I haven't tried "data + x" per se, but I guess it would work (being the 1st col in the 2nd row, printed the correct offset). It is just the 2nd col in the 2nd row that's incorrect (the (uint64_t)(&data+0x1EE) one).

Cheers,
bzt


Top
 Profile  
 
 Post subject: Re: Warning, gcc 10.2.0 incompatible changes!
PostPosted: Mon Sep 21, 2020 4:19 pm 
Offline
Member
Member

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 5099
bzt wrote:
1. an interesting thing, that memory addresses and builtins work differently. This will cause a segfault:
Code:
memcpy(&data + x, &data + y, ...);

Taking the address of an array results in a pointer to an object with the same type as the array, not one element of the array, so you're addressing well past the end of your array and the behavior is undefined.

bzt wrote:
2. strings are concatenated incorrectly, this is specially important for UEFI boot loader writers. gcc prior to 10.2.0 concatenated L"a" "b" "c" correctly into L"abc". But with 10.2.0 not any more!

Compiler Explorer says it works fine. (Yes, I tried with -fshort-wchar too.) Does running GCC with "-S" produce sensible output?


Top
 Profile  
 
 Post subject: Re: Warning, gcc 10.2.0 incompatible changes!
PostPosted: Mon Sep 21, 2020 7:49 pm 
Offline
Member
Member
User avatar

Joined: Thu Oct 13, 2016 4:55 pm
Posts: 1584
Octocontrabass wrote:
Taking the address of an array results in a pointer to an object with the same type as the array, not one element of the array,
Exactly, that's why it should work with memcpy.
Octocontrabass wrote:
so you're addressing well past the end of your array
You make absolutely no sense. First, how do you know "x" is bigger than the array length? (In the actual code it isn't). Second, If that were true, then how do you explain that in the printf both "&data" and "&data + x" printed the same address?
Octocontrabass wrote:
Does running GCC with "-S" produce sensible output?
IT'S WIN32. There's nothing sensible about it! I did grep for "memcpy" in objdump though:
- With "&data + x", there's a libc call to memcpy@PLT (builtin not used).
- With "&data[x]" and "(void*)&data + x" there isn't any (builtin inlined).

Cheers,
bzt


Top
 Profile  
 
 Post subject: Re: Warning, gcc 10.2.0 incompatible changes!
PostPosted: Mon Sep 21, 2020 8:34 pm 
Offline
Member
Member

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 5099
bzt wrote:
You make absolutely no sense. First, how do you know "x" is bigger than the array length? (In the actual code it isn't).

In C, pointer arithmetic goes by the size of the object pointed to. If you add 1 to a pointer to an array of 65536 char, you're adding 65536 to the underlying pointer value.

bzt wrote:
Second, If that were true, then how do you explain that in the printf both "&data" and "&data + x" printed the same address?

Take a closer look at your printf output: 0x7fff35fe45d0 - 0x7fff341045d0 = 0x1ee0000

bzt wrote:
IT'S WIN32. There's nothing sensible about it! I did grep for "memcpy" in objdump though:

I'm not asking about memcpy, I'm asking about the wchar_t strings.


Top
 Profile  
 
 Post subject: Re: Warning, gcc 10.2.0 incompatible changes!
PostPosted: Tue Sep 22, 2020 5:40 am 
Offline
Member
Member
User avatar

Joined: Thu Oct 13, 2016 4:55 pm
Posts: 1584
Octocontrabass wrote:
Take a closer look at your printf output: 0x7fff35fe45d0 - 0x7fff341045d0 = 0x1ee0000
Ah, you're right! I missed that! However that's not correct either, because the type of data is unsigned char. "&data" should give the address of the array, and "&data + x" should be that address plus x. And it should not influence whether the memcpy gets inlined or not, but it does.

Octocontrabass wrote:
I'm not asking about memcpy, I'm asking about the wchar_t strings.
I was a bit tired, sorry. For the wchar_t thing, I've used a hexeditor to take a look at the string constants in the binary. Nullplan says this is incorrect behavior, tbh I haven't checked that, I just noticed that the handling of string constants changed and caused wsprintfW to crash.

Cheers,
bzt


Top
 Profile  
 
 Post subject: Re: Warning, gcc 10.2.0 incompatible changes!
PostPosted: Tue Sep 22, 2020 7:51 am 
Offline
Member
Member

Joined: Wed Aug 30, 2017 8:24 am
Posts: 1593
bzt wrote:
Ah, you're right! I missed that! However that's not correct either, because the type of data is unsigned char. "&data" should give the address of the array, and "&data + x" should be that address plus x. And it should not influence whether the memcpy gets inlined or not, but it does.
No, data is an array [65536] of unsigned char. So &data is a pointer to array [65536] of unsigned char. So &data + n is a pointer to the n'th array [65536] of unsigned char following data. If data were an int, then &data would be a pointer to int, and &data + n would be a pointer to the n'th int following data. There should be no difference if data happens to be an array. Also, this way you'd see that that syntax is obviously trouble.

Honestly, I never understood why people write pointers to arrays like that. Just use data + n if you mean data + n. Or maybe &data[n] (I have had reason to use that notation in the past, even to prefer it over data + n). Your "fix" uses a GCC extension which assigns a meaning to pointer arithmetic with void-pointers (namely that they behave like char-pointers).

_________________
Carpe diem!


Top
 Profile  
 
 Post subject: Re: Warning, gcc 10.2.0 incompatible changes!
PostPosted: Tue Sep 22, 2020 4:58 pm 
Offline
Member
Member

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 5099
bzt wrote:
And it should not influence whether the memcpy gets inlined or not, but it does.

Why shouldn't it? GCC can easily see that the behavior will be undefined, so it has no reason to emit sensible code.

bzt wrote:
For the wchar_t thing, I've used a hexeditor to take a look at the string constants in the binary.

That's perfectly reasonable, but it doesn't tell us if the bug is really in GCC. It could be a bug in binutils. Using "-S" allows us to examine the code generated by GCC before GAS does anything with it.


Top
 Profile  
 
 Post subject: Re: Warning, gcc 10.2.0 incompatible changes!
PostPosted: Wed Sep 23, 2020 8:05 am 
Offline
Member
Member
User avatar

Joined: Thu Nov 16, 2006 12:01 pm
Posts: 7612
Location: Germany
Errr... no.

GCC produces assembly only if you pass -S. This is assembly generated from the in-memory representation of the source passed. Object code is generated by GCC from the in-memory representation, not by GAS from an intermediary ASM. GAS is never run by a GCC compile.

_________________
Every good solution is obvious once you've found it.


Top
 Profile  
 
 Post subject: Re: Warning, gcc 10.2.0 incompatible changes!
PostPosted: Wed Sep 23, 2020 8:16 am 
Offline
Member
Member
User avatar

Joined: Thu Nov 16, 2006 12:01 pm
Posts: 7612
Location: Germany
nullplan wrote:
As for the other point you raised, I actually went to cppreference.com to see pretty much your example on display. Yes, that is a bug. The code L"a" "b" "c" is supposed to be equivalent to L"abc". If it is not, the compiler is broken.


Hm...

Two things, here:

[list=]
[*] The compiler must be running in C99+ mode (previous to that, mixing prefixes resulted in UB). I guess we can pretty much assume that.
[*] What, exactly, is the evaluation order of string literal concatenation? If right-to-left, the "b" "c" is evaluated first (to a narrow string), and perhaps the compiler doesn't re-evaluate that when concatenating it with the L"a"...
[/list]

Why didn't @bzt prefix each literal, or write it in one to begin with? Much less ambiguous that way...

_________________
Every good solution is obvious once you've found it.


Top
 Profile  
 
 Post subject: Re: Warning, gcc 10.2.0 incompatible changes!
PostPosted: Wed Sep 23, 2020 8:48 am 
Offline
Member
Member
User avatar

Joined: Thu Oct 13, 2016 4:55 pm
Posts: 1584
Solar wrote:
Why didn't @bzt prefix each literal, or write it in one to begin with? Much less ambiguous that way...
For three reasons:
1. PRIu64 is defined by the POSIX headers, not by me (and without the prefix).
2. It worked with gcc 7, 8, 9, so I did not see any reason not to use PRIx until now.
3. using one literal is not a portable solution, it's just a workaround. PRIx defines are there to support multiple environments, just like the uint64_t define in stdint.h.

Again, I don't want to say which one is correct the pre-10.2.0 one or the 10.2.0 one, all I'm saying is, gcc behaviour has changed.

Cheers,
bzt


Top
 Profile  
 
 Post subject: Re: Warning, gcc 10.2.0 incompatible changes!
PostPosted: Wed Sep 23, 2020 4:10 pm 
Offline
Member
Member

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 5099
bzt wrote:
Again, I don't want to say which one is correct the pre-10.2.0 one or the 10.2.0 one, all I'm saying is, gcc behaviour has changed.

Behavior of my copy of GCC 10.2 has not changed relative to previous versions. How can I get a copy of GCC that exhibits the wchar_t string bug?


Top
 Profile  
 
 Post subject: Re: Warning, gcc 10.2.0 incompatible changes!
PostPosted: Wed Sep 23, 2020 4:35 pm 
Offline
Member
Member
User avatar

Joined: Thu Oct 13, 2016 4:55 pm
Posts: 1584
Octocontrabass wrote:
Behavior of my copy of GCC 10.2 has not changed relative to previous versions. How can I get a copy of GCC that exhibits the wchar_t string bug?
As I wrote, the problem appeared with wsprintfW here. Btw it was reported here, and happened after I've upgraded MSYS2 to the latest, which comes with gcc 10.2.0. I had no issues with older MSYS. To trigger the error, create a disk backup by pressing the "Read" button, the app will crash (but first compile the commit e216859db725, because I've already fixed the source by removing PRIu64, so it's working now).

Cheers,
bzt


Top
 Profile  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 25 posts ]  Go to page 1, 2  Next

All times are UTC - 6 hours


Who is online

Users browsing this forum: No registered users and 30 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to:  
Powered by phpBB © 2000, 2002, 2005, 2007 phpBB Group