OSDev.org

The Place to Start for Operating System Developers
It is currently Tue Apr 23, 2024 5:56 am

All times are UTC - 6 hours




Post new topic Reply to topic  [ 7 posts ] 
Author Message
 Post subject: [solved]lock cmpxchg didn't happen atomically?
PostPosted: Sun Jan 17, 2021 10:18 pm 
Offline
Member
Member

Joined: Mon Dec 07, 2020 8:09 am
Posts: 212
solved, bug is elsewhere (described in this thread, but probably not of much interest as it is a "logic error" in the upper level), lock cmpxchg is doing what it should do, original post below:

I'm pretty sure it's my bug or misunderstanding of something, yet it looks like "lock cmpxchg" didn't happen atomically from 2 CPUs.

Context:

It's a simple reader/writer lock (lock for my inode cache) implemented using 1 DWORD, different states represented by values:

// 0xFFFFFFFF: write locked
// 1 to 0xFFFFFFFE: read locked, the number is ref count
// 0: not locked

Atomic instructions are used to modify the values.

Yet I was able to get this sequence to happen on 2 CPUs :

Code:
CPU 1: before lock 2, val 0
CPU 1: after lock 2, val 1
CPU 0: before lock 2, val 0
CPU 0: after lock 2, val 1


As you can see both CPU 0 and CPU 1 think it has successfully incremented the value at index 2 from 0 to 1 :shock:

Code doing the increment (read lock, which is what both the CPU ran between the pair of prints) is below. I can't seem to see what's wrong with it but something has to be off.

It is also quite ugly (especially the part that it reads in the value from *ecx and then having to do both copy and increment) so would also like some suggestions on improving.

r is pointer to the lock.
_R_LOCK_MAX is 0xFFFFFFFE, but both CPUs should have blasted through the initial spin part as the value is so far from reaching it anyway.

Code:
    asm volatile ("_r_keep_spinning:                        \n\
                   movl (%%ecx), %%ebx                      \n\
                   movl %%ebx, %%eax                        \n\
                   addl $1, %%ebx                           \n\
                   cmp $" EXSTR(_R_LOCK_MAX) ", %%eax       \n\
                   jb _try_get_r_lock                       \n\
                   pause                                    \n\
                   jmp _r_keep_spinning                     \n\
                   _try_get_r_lock:                         \n\
                   lock cmpxchg %%ebx, (%%ecx)              \n\
                   jne _r_keep_spinning                      "
                   :
                   : "c"(r)
                   : "eax", "ebx", "memory", "cc");


Last edited by xeyes on Mon Jan 18, 2021 10:39 pm, edited 3 times in total.

Top
 Profile  
 
 Post subject: Re: lock cmpxchg didn't happen atomically?
PostPosted: Mon Jan 18, 2021 12:39 am 
Offline
Member
Member

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 5143
xeyes wrote:
As you can see both CPU 0 and CPU 1 think it has successfully incremented the value at index 2 from 0 to 1 :shock:

You're not doing anything silly like setting it to zero on both CPUs before running your test code, right? (Or using different physical addresses?)

xeyes wrote:
It is also quite ugly (especially the part that it reads in the value from *ecx and then having to do both copy and increment) so would also like some suggestions on improving.

I suggest writing it without using any inline assembly.

Something like this:
Code:
void acquire_read( unsigned int * lock )
{
    unsigned int temp = __atomic_load_n( lock, __ATOMIC_RELAXED );
    do
    {
        while( temp >= _R_LOCK_MAX - 1 )
        {
            __builtin_ia32_pause();
            temp = __atomic_load_n( lock, __ATOMIC_RELAXED );
        }
    }
    while( !__atomic_compare_exchange_n( lock, &temp, temp + 1, 0, __ATOMIC_ACQUIRE, __ATOMIC_RELAXED ) );
}

(This code is untested and may be incorrect. Read the documentation carefully.)


Top
 Profile  
 
 Post subject: Re: lock cmpxchg didn't happen atomically?
PostPosted: Mon Jan 18, 2021 2:19 am 
Offline
Member
Member

Joined: Mon Feb 02, 2015 7:11 pm
Posts: 898
Yes please, do not write such code in assembly. Use compiler intrinsics as suggested by Octocontrabass.

_________________
https://github.com/kiznit/rainbow-os


Top
 Profile  
 
 Post subject: Re: lock cmpxchg didn't happen atomically?
PostPosted: Mon Jan 18, 2021 3:08 am 
Offline
Member
Member

Joined: Mon Dec 07, 2020 8:09 am
Posts: 212
Octocontrabass wrote:
xeyes wrote:
As you can see both CPU 0 and CPU 1 think it has successfully incremented the value at index 2 from 0 to 1 :shock:

You're not doing anything silly like setting it to zero on both CPUs before running your test code, right? (Or using different physical addresses?)


Thanks! You are right it was a silly issue and the 2 CPUs indeed aren't targeting the same address.

I had another instance of this same r/w lock on the whole cache, and obviously normal look up is read lock and write lock is used for 'cache maintenance' like read in new inodes (and potentially evict old ones).

Since I assumed that this kind of r/w lock's owner can't atomically change from r->w (on second thought, it might be possible, like spin for a 1 to 0xFFFFFFFF transition), I let the thread has to give up r and relock with w after a cache miss in order to read from disk and add to cache.

Thus there are situations where 2 threads (CPUs) both look up inode 2, both see that the cache missed, and both added it to the cache (in serial, but neither re-checked after relock with w thus they added 2 copies) They then proceeded to both "atomically change 0 to 1" on the inode's lock (targeting the 2 different copies, of course).

The code itself actually works fine, to isolate the issue I moved it to run in user space and with 8 threads (1 writer 7 readers) on 8 CPUs spinning lock/unlock things did seem to happen atomically, at least the check that found the case I'm asking about didn't trigger.

Also noticed an unexpected "unfairness" where the write thread wins vast majority of the time and itself does more iterations than 7 readers combined. Probably because the write lock path is much faster and it can go for cmpxchg right after reading *ecx and a single cmp, without wasting time doing the move and add like the readers.

Code:
   
asm volatile ("_w_init_spin:                            \n\
                   movl $" EXSTR(_RW_UNLOCKED) ", %%eax     \n\
                   _w_keep_spinning:                        \n\
                   cmp (%%ecx), %%eax                       \n\
                   je _w_try_get_lock                       \n\
                   pause                                    \n\
                   jmp _w_keep_spinning                     \n\
                   _w_try_get_lock:                         \n\
                   lock cmpxchg %%ebx, (%%ecx)              \n\
                   jne _w_init_spin                          "
                   :
                   : "c"(r), "b"(_W_LOCKED)
                   : "eax", "memory", "cc");


Quote:
I suggest writing it without using any inline assembly.


Will look into this, I always thought this kind of code is where inline is useful, but maybe I'm missing something important.


Last edited by xeyes on Mon Jan 18, 2021 3:21 am, edited 1 time in total.

Top
 Profile  
 
 Post subject: Re: lock cmpxchg didn't happen atomically?
PostPosted: Mon Jan 18, 2021 3:16 am 
Offline
Member
Member

Joined: Mon Dec 07, 2020 8:09 am
Posts: 212
kzinti wrote:
Yes please, do not write such code in assembly. Use compiler intrinsics as suggested by Octocontrabass.


Thanks, if multiple people are saying the same thing there must be some good reasons for it.


Top
 Profile  
 
 Post subject: Re: lock cmpxchg didn't happen atomically?
PostPosted: Mon Jan 18, 2021 1:08 pm 
Offline
Member
Member

Joined: Wed Aug 30, 2017 8:24 am
Posts: 1604
xeyes wrote:
Thanks, if multiple people are saying the same thing there must be some good reasons for it.
There are good reasons for it. Inline assembly is notoriously difficult to get right. Did you use the right qualifiers, the right constraints, and the right clobbers? You might not immediately notice if you did not. Now, my solution to that is to write complete functions in assembler and call them with normal calls, but some people dislike this since it inhibits compiler optimization. Essentially, the compiler only sees a black box where the function would be, and cannot do anything with it. Cannot inline it, either. So compiler intrinsics help with that. Plus they make it easier to get the message across.

_________________
Carpe diem!


Top
 Profile  
 
 Post subject: Re: lock cmpxchg didn't happen atomically?
PostPosted: Mon Jan 18, 2021 9:36 pm 
Offline
Member
Member

Joined: Mon Dec 07, 2020 8:09 am
Posts: 212
nullplan wrote:
xeyes wrote:
Thanks, if multiple people are saying the same thing there must be some good reasons for it.
There are good reasons for it. Inline assembly is notoriously difficult to get right. Did you use the right qualifiers, the right constraints, and the right clobbers? You might not immediately notice if you did not. Now, my solution to that is to write complete functions in assembler and call them with normal calls, but some people dislike this since it inhibits compiler optimization. Essentially, the compiler only sees a black box where the function would be, and cannot do anything with it. Cannot inline it, either. So compiler intrinsics help with that. Plus they make it easier to get the message across.


Good points!

Yes GCC somehow has the feature but doesn't really seem to play too nice with inline asm, very easy to get inlined label conflict with another copy of itself and lots of confusion on who clobbered who :roll:

Maybe it takes a real expert to use it well, I've replaced my ugly mov and add with lea now but probably still slower than GCC's output.


Top
 Profile  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 7 posts ] 

All times are UTC - 6 hours


Who is online

Users browsing this forum: Bing [Bot], Majestic-12 [Bot] and 127 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to:  
Powered by phpBB © 2000, 2002, 2005, 2007 phpBB Group