OSDev.org

The Place to Start for Operating System Developers
It is currently Thu Apr 18, 2024 6:52 am

All times are UTC - 6 hours




Post new topic Reply to topic  [ 2 posts ] 
Author Message
 Post subject: How about bitscan instruction on x86 ?
PostPosted: Fri Oct 28, 2016 6:15 pm 
Offline
Member
Member

Joined: Wed Dec 18, 2013 9:10 am
Posts: 84
Hi.
Have you even used 'bsf/bsr' instruction in your project ?
How about its performance?
I saw in a post(http://www.asmcommunity.net/forums/topic/?id=3771) that bitscan instruction is quite slow on pentium.
And another evidence, the do_softirq() function in kernel/softirq.c use repeatedly bit shift operation to traverse all 32 bits.
He didn't use 'bsr/bsf'.
Here is the code:
Code:
asmlinkage void do_softirq()
{
   int cpu = smp_processor_id();
   __u32 active, mask;

   if (in_interrupt())
      return;

   local_bh_disable();

   local_irq_disable();
   mask = softirq_mask(cpu);
   active = softirq_active(cpu) & mask;

   if (active) {
      struct softirq_action *h;

restart:
      /* Reset active bitmask before enabling irqs */
      softirq_active(cpu) &= ~active;

      local_irq_enable();

      h = softirq_vec;
      mask &= ~active;

      do {
         if (active & 1)
            h->action(h);
         h++;
         active >>= 1;
      } while (active);

      local_irq_disable();

      active = softirq_active(cpu);
      if ((active &= mask) != 0)
         goto retry;
   }

   local_bh_enable();

   /* Leave with locally disabled hard irqs. It is critical to close
    * window for infinite recursion, while we help local bh count,
    * it protected us. Now we are defenceless.
    */
   return;

retry:
   goto restart;
}


And I pick out the relevant code:
do {
if (active & 1)
h->action(h);
h++;
active >>= 1;
} while (active);


Top
 Profile  
 
 Post subject: Re: How about bitscan instruction on x86 ?
PostPosted: Fri Oct 28, 2016 8:11 pm 
Offline
Member
Member
User avatar

Joined: Sat Jan 15, 2005 12:00 am
Posts: 8561
Location: At his keyboard!
Hi,

miaowei wrote:
Have you even used 'bsf/bsr' instruction in your project ?


Yes. For one example:

Code:
   mov eax,[tempVector3+2*8+4]         ;eax = highest 32-bits of C
   mov eax,edx
   xor eax,[tempVector3+1*8+4]         ;eax = highest 32-bits of C ^ B
   xor eax,[tempVector3+0*8+4]         ;eax = highest 32-bits of C ^ B ^ A
   cdq                        ;edx = sign of A ^ B ^ C
   xor eax,edx                  ;eax = highest 32-bits of A ^ B ^ C ^ (sign of A ^ B ^ C)
   mov ecx,31                  ;ebx = number of times to shift A, B and C left - 32 (for best case)
   test eax,eax                  ;Is there anything we need to keep?
   je .gotShift1                  ; no
   mov ecx,30                  ;ebx = number of times to shift A, B and C left - 32 (for best case)
   bsr eax,eax                  ;eax = bit number for highest set bit
   sub ecx,eax                  ;ecx = number of times to shift A, B and C left - 32 (if it's not <= 0)
   ja .gotShift1
   clr ecx                     ;ecx = number of times to shift A, B and C left - 32 (for worst case)

.gotShift1:
   mov edx,[tempVector3+0*8+4]
   mov eax,[tempVector3+0*8]
   shld edx,eax,cl
   mov [tempVector3+0*4],edx

   mov edx,[tempVector3+1*8+4]
   mov eax,[tempVector3+1*8]
   shld edx,eax,cl
   mov [tempVector3+1*4],edx

   mov edx,[tempVector3+2*8+4]
   mov eax,[tempVector3+2*8]
   shld edx,eax,cl
   mov [tempVector3+2*4],edx


Note: The idea here is to convert a vector with 64-bit components into a vector with 32-bit components, while shifting/scaling it to keep the most precision possible. It's used in code to find the A, B, C and D for the plane equation ("A*x + B*y + C*z +D = 0") from 3 points on the plane. Without this scaling I don't get enough precision and end up with ugly visible glitches; and this crude scaling is much much faster than doing full conversion to unit vector (which would be more precise, but would involve finding "1/sqrt(A*A + B*B + C*C)" and doing multiplications).

miaowei wrote:
How about its performance?


Typically; BSR/BSF is a little slow compared to fast single instructions (like TEST, ADD, etc); but are still faster than many instructions in a loop on every 80x86 CPU ever made.

miaowei wrote:
I saw in a post(http://www.asmcommunity.net/forums/topic/?id=3771) that bitscan instruction is quite slow on pentium.


It is; but even on a Pentium they're probably still almost as fast as Agner Fog's "clever" (Pentium only) hack; and probably faster on a Pentium than Agner Fog's hack in some situations (e.g. when all FPU registers are in use, or when FPU is configured for MMX and not floating point, or when there's a cache miss writing to the temporary location, or when there's a branch misprediction, or when the bottleneck is instruction fetch/decode).

miaowei wrote:
And another evidence, the do_softirq() function in kernel/softirq.c use repeatedly bit shift operation to traverse all 32 bits.


This is only evidence that C doesn't provide an easily usable method of doing it (other than resorting to non-portable inline assembly), and/or evidence that C programmers aren't assembly language programmers. Unless the compiler is smart enough to optimise that loop into a BSR/BSF (which is very unlikely) that code will be about 10 times slower than it could be on almost all 80x86 CPUs.

Note: This isn't quite true. On most compilers there are intrinsics (e.g. "__builtin_ffs", "__builtin_clz" and "__builtin_ctz" for GCC) that end up being a single BSR/BSF instruction; however these instrinsics aren't part of the C standard and aren't portable between compilers.


Cheers,

Brendan

_________________
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.


Top
 Profile  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 2 posts ] 

All times are UTC - 6 hours


Who is online

Users browsing this forum: No registered users and 42 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to:  
Powered by phpBB © 2000, 2002, 2005, 2007 phpBB Group