OSDev.org https://forum.osdev.org/ |
Problems with memset() implementation on GCC 10.2.0 https://forum.osdev.org/viewtopic.php?f=13&t=38672 |
Author: | kzinti [ Tue Dec 15, 2020 7:27 pm ] |
Post subject: | Problems with memset() implementation on GCC 10.2.0 |
I just upgraded my cross compiler to GCC 10.2.0 and my OS crashes early on memset(). I am sure I am doing something wrong and GCC 10.2.0 compiles it into something unexpected:

Code:
void* memset(void* ptr, int value, size_t num)
{
    for (unsigned char* p = ptr; num; --num)
    {
        *p++ = (unsigned char)value;
    }
    return ptr;
}

Code:
ffffffff80006360 <memset>:
ffffffff80006360:   48 85 d2          test   %rdx,%rdx
ffffffff80006363:   74 13             je     ffffffff80006378 <memset+0x18>
ffffffff80006365:   55                push   %rbp
ffffffff80006366:   40 0f b6 f6       movzbl %sil,%esi
ffffffff8000636a:   48 89 e5          mov    %rsp,%rbp
ffffffff8000636d:   e8 ee ff ff ff    callq  ffffffff80006360 <memset>
ffffffff80006372:   5d                pop    %rbp
ffffffff80006373:   c3                retq
ffffffff80006374:   0f 1f 40 00       nopl   0x0(%rax)
ffffffff80006378:   48 89 f8          mov    %rdi,%rax
ffffffff8000637b:   c3                retq
ffffffff8000637c:   0f 1f 40 00       nopl   0x0(%rax)

What happens is I call memset with a non-zero length (in %rdx)... so the code above ends up calling memset() recursively at address ffffffff8000636d until I run out of stack space.

Please help if you can. I refuse to believe the problem is with GCC, I must be missing something.
|
Author: | nexos [ Tue Dec 15, 2020 7:30 pm ] |
Post subject: | Re: Problems with memset() implementation on GCC 10.2.0 |
It might be better just to use __builtin_memset IMO. |
Author: | kzinti [ Tue Dec 15, 2020 7:33 pm ] |
Post subject: | Re: Problems with memset() implementation on GCC 10.2.0 |
Agreed. I would still like to understand why it is broken though. |
Author: | kzinti [ Tue Dec 15, 2020 7:35 pm ] |
Post subject: | Re: Problems with memset() implementation on GCC 10.2.0 |
Well what do you know, I am not the first to run into this: https://github.com/micropython/micropython/issues/6053 It looks like GCC detects that the loop is memset and optimizes the loop by calling... memset. Good times. |
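Read back into C, the disassembly in the first post behaves roughly like the sketch below. This is an illustration rather than compiler output: the names memset_as_compiled, calls and CALL_LIMIT are hypothetical, and the call limit exists only so that the demo terminates where the real kernel would overflow its stack.

```c
#include <stddef.h>

/* Hypothetical bookkeeping, for illustration only. */
static unsigned long calls;
enum { CALL_LIMIT = 1000 };

void *memset_as_compiled(void *ptr, int value, size_t num)
{
    if (num == 0)               /* test %rdx,%rdx ; je memset+0x18 */
        return ptr;             /* mov %rdi,%rax ; retq */

    if (++calls >= CALL_LIMIT)  /* NOT in the real code: guard so the demo stops */
        return ptr;

    /* movzbl %sil,%esi ; callq memset
     * Same ptr and same num are passed again, so no progress is ever made. */
    return memset_as_compiled(ptr, (unsigned char)value, num);
}
```

With num equal to zero the function returns at once; with any other length it recurses without writing a single byte, which matches the stack exhaustion described above.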
Author: | kzinti [ Tue Dec 15, 2020 7:53 pm ] |
Post subject: | Re: Problems with memset() implementation on GCC 10.2.0 |
Adding "-fno-builtin" when compiling the kernel fixes the issue, but that is clearly not what I want. |
Author: | Octocontrabass [ Tue Dec 15, 2020 8:16 pm ] |
Post subject: | Re: Problems with memset() implementation on GCC 10.2.0 |
GCC assumes it can emit calls to memcpy(), memmove(), memset(), and memcmp() at any point - including inside your attempt at implementing one of those four functions. As the optimizer gets smarter, it will get better at creating endless recursion loops.

Various GCC bug reports suggest the following function attribute:

Code:
__attribute__((optimize("no-tree-loop-distribute-patterns")))

You can also disable this optimization at a global level, although that seems like a poor choice.
You can also implement those four functions in assembly, to be sure GCC can never create an endless recursion loop.
You can also use Clang, which seems to automatically avoid infinite recursion and/or emitting C library calls in freestanding mode.

nexos wrote:
It might be better just to use __builtin_memset IMO.

No, __builtin_memset() is only an optimization hint. The optimizer may still translate __builtin_memset() into a memset() call, and then you'll have a link error due to the undefined function.
|
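As a minimal sketch of how the attribute might be applied to a single function: the macro NO_LOOP_IDIOM and the name kernel_memset are hypothetical (in a real kernel the function would be the memset symbol itself; it is renamed here so the snippet can be built and checked on a host without clashing with the C library). The attribute is compiled out for Clang on the assumption that it does not support optimize().

```c
#include <stddef.h>

/* Hypothetical helper macro: apply the attribute only when building with GCC. */
#if defined(__GNUC__) && !defined(__clang__)
#define NO_LOOP_IDIOM __attribute__((optimize("no-tree-loop-distribute-patterns")))
#else
#define NO_LOOP_IDIOM
#endif

/* Byte-wise fill. The attribute asks GCC not to replace this loop with a
 * library call, which in a kernel would be a call back into this function. */
NO_LOOP_IDIOM
void *kernel_memset(void *ptr, int value, size_t num)
{
    unsigned char *p = ptr;

    while (num--) {
        *p++ = (unsigned char)value;
    }
    return ptr;
}
```

Scoping the option to one function leaves the optimization enabled for the rest of the kernel, where turning a loop into a memset() call is harmless.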
Author: | kzinti [ Wed Dec 16, 2020 12:56 am ] |
Post subject: | Re: Problems with memset() implementation on GCC 10.2.0 |
Thanks, I went with the following at the top of my file:

Code:
#pragma GCC optimize "no-tree-loop-distribute-patterns"
|
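A sketch of what a file using this pragma could look like, assuming the file holds only the memory routines. The push_options/pop_options pair is an addition not mentioned in the thread; it limits the setting to the functions in between. The names kernel_memcpy and kernel_memmove are hypothetical stand-ins for memcpy and memmove so the snippet can be checked on a host.

```c
#include <stddef.h>
#include <stdint.h>

/* GCC-specific pragmas; skipped for other compilers. */
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC push_options
#pragma GCC optimize ("no-tree-loop-distribute-patterns")
#endif

void *kernel_memcpy(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    while (n--) {
        *d++ = *s++;
    }
    return dest;
}

void *kernel_memmove(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    if ((uintptr_t)d <= (uintptr_t)s) {
        /* Destination starts at or below the source: copy forward. */
        while (n--) {
            *d++ = *s++;
        }
    } else {
        /* Destination starts above the source: copy backward so that
         * overlapping bytes are read before they are overwritten. */
        d += n;
        s += n;
        while (n--) {
            *--d = *--s;
        }
    }
    return dest;
}

#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC pop_options
#endif
```

Placing the pragma at the top of a dedicated file, as in the post above, has the same effect for every function in that file.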
Author: | moonchild [ Fri Dec 18, 2020 5:31 pm ] |
Post subject: | Re: Problems with memset() implementation on GCC 10.2.0 |
You can also implement the string functions in assembly; this also gives you a pretty easy perf boost, at least on x86. Here are a couple:

Code:
memcpy:
    mov rcx, rdx
    mov rax, rdi
    rep movs byte ptr [rdi], byte ptr [rsi]
    ret

memmove:
    cmp rdi, rsi
    jbe memcpy
    mov rax, rdi
    mov rcx, rdx
    lea rdi, [rdi + rdx - 1]
    lea rsi, [rsi + rdx - 1]
    std
    rep movs byte ptr [rdi], byte ptr [rsi]
    cld
    ret

memset:
    mov rcx, rdx
    mov rdx, rdi
    mov al, sil
    rep stos byte ptr [rdi]
    mov rax, rdx
    ret
|
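For anyone who prefers to stay in a C file, the same rep stos idea can be sketched with GCC-style inline assembly. This is an assumption-laden illustration, not code from the thread: kernel_memset_rep is a hypothetical name, the asm path is taken only on x86-64 with GCC or Clang, and other targets fall back to a volatile byte loop that the compiler should not turn into a library call.

```c
#include <stddef.h>

void *kernel_memset_rep(void *dest, int value, size_t n)
{
    void *ret = dest;   /* rep stosb advances the pointer, so keep the original */

#if defined(__x86_64__) && defined(__GNUC__)
    /* rdi = destination, rcx = count, al = fill byte.
     * Relies on the direction flag being clear, as the System V ABI requires. */
    __asm__ volatile("rep stosb"
                     : "+D"(dest), "+c"(n)
                     : "a"(value)
                     : "memory");
#else
    /* Portable fallback: volatile stores are emitted one by one. */
    volatile unsigned char *p = dest;
    while (n--) {
        *p++ = (unsigned char)value;
    }
#endif
    return ret;
}
```

Because the fill is written as an asm statement, the optimizer has no loop to recognize, so it cannot reintroduce a memset() call here.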
All times are UTC - 6 hours |