OSDev.org

The Place to Start for Operating System Developers

All times are UTC - 6 hours




 Post subject: Problems with memset() implementation on GCC 10.2.0
Posted: Tue Dec 15, 2020 7:27 pm
Joined: Mon Feb 02, 2015 7:11 pm
Posts: 898
I just upgraded my cross compiler to GCC 10.2.0 and my OS now crashes early, inside memset().

I am sure I am doing something wrong, but GCC 10.2.0 compiles my implementation into something I did not expect:

Code:
void* memset(void* ptr, int value, size_t num)
{
    for (unsigned char* p = ptr; num; --num)
    {
        *p++ = (unsigned char)value;
    }

    return ptr;
}

Code:
ffffffff80006360 <memset>:
ffffffff80006360:   48 85 d2                test   %rdx,%rdx
ffffffff80006363:   74 13                   je     ffffffff80006378 <memset+0x18>
ffffffff80006365:   55                      push   %rbp
ffffffff80006366:   40 0f b6 f6             movzbl %sil,%esi
ffffffff8000636a:   48 89 e5                mov    %rsp,%rbp
ffffffff8000636d:   e8 ee ff ff ff          callq  ffffffff80006360 <memset>
ffffffff80006372:   5d                      pop    %rbp
ffffffff80006373:   c3                      retq   
ffffffff80006374:   0f 1f 40 00             nopl   0x0(%rax)
ffffffff80006378:   48 89 f8                mov    %rdi,%rax
ffffffff8000637b:   c3                      retq   
ffffffff8000637c:   0f 1f 40 00             nopl   0x0(%rax)

What happens is this: I call memset() with a non-zero length (in %rdx), so the je at the top is not taken, and the code above then calls memset() again at address ffffffff8000636d with the same pointer and length. It keeps recursing until I run out of stack space.

Please help if you can. I refuse to believe the problem is with GCC, I must be missing something.

_________________
https://github.com/kiznit/rainbow-os


 Post subject: Re: Problems with memset() implementation on GCC 10.2.0
Posted: Tue Dec 15, 2020 7:30 pm
Joined: Tue Feb 18, 2020 3:29 pm
Posts: 1071
It might be better just to use __builtin_memset IMO.

_________________
"How did you do this?"
"It's very simple — you read the protocol and write the code." - Bill Joy
Projects: NexNix | libnex | nnpkg


 Post subject: Re: Problems with memset() implementation on GCC 10.2.0
Posted: Tue Dec 15, 2020 7:33 pm
Joined: Mon Feb 02, 2015 7:11 pm
Posts: 898
Agreed. I would still like to understand why it is broken though.



 Post subject: Re: Problems with memset() implementation on GCC 10.2.0
Posted: Tue Dec 15, 2020 7:35 pm
Joined: Mon Feb 02, 2015 7:11 pm
Posts: 898
Well what do you know, I am not the first to run into this:

https://github.com/micropython/micropython/issues/6053

It looks like GCC detects that the loop is memset and optimizes the loop by calling... memset. Good times.
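
Reading the disassembly from the first post back into C makes the problem easier to see. The sketch below is only a hand translation, under the assumption that the listing is complete; the name memset_as_compiled is hypothetical. In the kernel the callee is the function itself, so the call never returns. Here the callee is the hosted libc memset(), which is the only reason this version terminates.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name: a hand-written C rendering of the GCC 10.2.0 output.
 * In the kernel the inner call targets this same function (endless recursion);
 * in this sketch it targets libc's memset so that it can actually run. */
void *memset_as_compiled(void *ptr, int value, size_t num)
{
    if (num == 0)
        return ptr;                                 /* test %rdx,%rdx ; je ... ; mov %rdi,%rax */

    return memset(ptr, (unsigned char)value, num);  /* movzbl %sil,%esi ; callq memset */
}
```

The loop is gone entirely: all that remains is the zero-length check and a call with the same pointer and length.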



 Post subject: Re: Problems with memset() implementation on GCC 10.2.0
Posted: Tue Dec 15, 2020 7:53 pm
Joined: Mon Feb 02, 2015 7:11 pm
Posts: 898
Adding "-fno-builtin" when compiling the kernel fixes the issue, but that is clearly not what I want.



 Post subject: Re: Problems with memset() implementation on GCC 10.2.0
Posted: Tue Dec 15, 2020 8:16 pm
Joined: Mon Mar 25, 2013 7:01 pm
Posts: 5099
GCC assumes it can emit calls to memcpy(), memmove(), memset(), and memcmp() at any point - including inside your attempt at implementing one of those four functions. As the optimizer gets smarter, it will get better at creating endless recursion loops.
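
One source-level workaround that is not mentioned in this thread, offered here only as a sketch, is to store through a volatile pointer. The assumption is that GCC will not merge or replace volatile accesses, so the loop should no longer be recognized as a memset idiom. The name memset_volatile_sketch is hypothetical; in a kernel the body would live in memset() itself.

```c
#include <stddef.h>

/* Hypothetical name so the sketch can be built next to a hosted libc. */
void *memset_volatile_sketch(void *ptr, int value, size_t num)
{
    /* Each volatile store has to be performed as written, so the compiler
     * should not collapse the loop into a library call. */
    volatile unsigned char *p = ptr;

    while (num--)
        *p++ = (unsigned char)value;

    return ptr;
}
```

The cost is that the same rule also blocks vectorization, so this version stays a plain byte-at-a-time loop.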

Various GCC bug reports suggest the following function attribute:
Code:
__attribute__((optimize("no-tree-loop-distribute-patterns")))
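
Applied to the loop from the first post, the attribute would look roughly like the sketch below. The name memset_attr_sketch is hypothetical and only there so the code can be compiled alongside a hosted libc; in the kernel the function would be memset() itself. The attribute is GCC-specific, other compilers may ignore it with a warning, and the GCC manual describes the optimize attribute as intended for debugging rather than production code, so treat this as a workaround and not a guarantee.

```c
#include <stddef.h>

/* GCC-specific: asks the optimizer not to turn this function's loops
 * into memset/memcpy library calls. */
__attribute__((optimize("no-tree-loop-distribute-patterns")))
void *memset_attr_sketch(void *ptr, int value, size_t num)
{
    unsigned char *p = ptr;

    for (; num; --num)
        *p++ = (unsigned char)value;

    return ptr;
}
```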


You can also disable this optimization globally with -fno-tree-loop-distribute-patterns, although that seems like a poor choice.

You can also implement those four functions in assembly, to be sure GCC can never create an endless recursion loop.

You can also use Clang, which seems to automatically avoid infinite recursion and/or emitting C library calls in freestanding mode.

nexos wrote:
It might be better just to use __builtin_memset IMO.

No, __builtin_memset() is only an optimization hint. The optimizer may still translate __builtin_memset() into a memset() call, and then you'll have a link error due to the undefined function.


 Post subject: Re: Problems with memset() implementation on GCC 10.2.0
Posted: Wed Dec 16, 2020 12:56 am
Joined: Mon Feb 02, 2015 7:11 pm
Posts: 898
Thanks, I went with the following at the top of my file:

Code:
#pragma GCC optimize "no-tree-loop-distribute-patterns"
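
A pragma at the top of the file affects every function defined after it. If the file holds more than the string functions, a narrower variant is to wrap just those functions in push_options / pop_options. The sketch below assumes GCC; the function names are hypothetical stand-ins for the kernel's real memset() and memcpy(), chosen so the code can be built next to a hosted libc.

```c
#include <stddef.h>

#pragma GCC push_options
#pragma GCC optimize ("no-tree-loop-distribute-patterns")

/* Everything between push_options and pop_options is compiled with
 * loop-to-library-call pattern recognition switched off. */
void *memset_pragma_sketch(void *ptr, int value, size_t num)
{
    unsigned char *p = ptr;

    for (; num; --num)
        *p++ = (unsigned char)value;

    return ptr;
}

void *memcpy_pragma_sketch(void *dst, const void *src, size_t num)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    for (; num; --num)
        *d++ = *s++;

    return dst;
}

#pragma GCC pop_options
/* Functions defined from here on use the normal optimization options again. */
```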



 Post subject: Re: Problems with memset() implementation on GCC 10.2.0
Posted: Fri Dec 18, 2020 5:31 pm
Joined: Wed Apr 01, 2020 4:59 pm
Posts: 73
You can also implement the string functions in assembly; this also gives you a pretty easy perf boost, at least on x86. Here are a couple:

Code:
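memcpy:
mov rcx, rdx            # byte count for rep
mov rax, rdi            # return value: original dst
rep movs byte ptr [rdi], byte ptr [rsi]   # forward copy, assumes DF = 0
ret

memmove:
cmp rdi, rsi
jbe memcpy              # dst <= src: a forward copy is safe
mov rax, rdi            # return value: original dst
mov rcx, rdx
lea rdi, [rdi + rdx - 1]   # point at the last byte of each buffer
lea rsi, [rsi + rdx - 1]
std                     # dst > src: copy backward (DF = 1)
rep movs byte ptr [rdi], byte ptr [rsi]
cld                     # restore DF = 0 before returning
ret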

memset:
mov rcx, rdx
mov rdx, rdi
mov al, sil
rep stos byte ptr [rdi]
mov rax, rdx
ret
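
The part of memmove that is easiest to get wrong is the direction test, so here is the same logic as a C sketch for comparison. The name memmove_sketch is hypothetical, and the pointers are compared through uintptr_t on the assumption of a flat address space. Note that a C version like this is still subject to the loop pattern recognition discussed earlier in the thread, which is one argument for the assembly versions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name so the sketch can be built next to a hosted libc. */
void *memmove_sketch(void *dst, const void *src, size_t num)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    if ((uintptr_t)d <= (uintptr_t)s) {
        /* Destination at or below source: copy forward, lowest byte first. */
        for (size_t i = 0; i < num; ++i)
            d[i] = s[i];
    } else {
        /* Destination above source: copy backward, highest byte first,
         * so overlapping source bytes are read before they are overwritten. */
        for (size_t i = num; i > 0; --i)
            d[i - 1] = s[i - 1];
    }

    return dst;
}
```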

