OSDev.org

The Place to Start for Operating System Developers
 It is currently Fri Jan 19, 2018 7:48 am

 All times are UTC - 6 hours

 Page 1 of 2 [ 18 posts ] Go to page 1, 2  Next
 Print view Previous topic | Next topic
Author Message
 Post subject: And or double shift to select middle bits?Posted: Sat May 06, 2017 4:15 am
 Member

Joined: Sun Feb 12, 2017 1:50 am
Posts: 26
If I have 32 bits and want to grab some group from the middle is it faster to use an and followed by a shift, or two shifts?

Example, say I want bits 16-23.
Code:
uint32_t someBits;
uint8_t optionA = (someBits & 0xff0000) >> 16; //isolate the bits and shift
uint8_t optionB = (someBits << 8) >> 24; //double shift to clear unwanted bits

Which is faster?

Top

 Post subject: Re: And or double shift to select middle bits?Posted: Sat May 06, 2017 4:32 am
 Member

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 1216
Have you checked the code your compiler generates for those? You might be surprised by what you see.

Top

 Post subject: Re: And or double shift to select middle bits?Posted: Sat May 06, 2017 5:55 am
 Member

Joined: Sun Feb 12, 2017 1:50 am
Posts: 26
With no optimizations g++ did an and followed by a shift for the first case and two shifts for the second case.
With any optimizations on the compiler replaced the values with pre-calculated constants.

Top

 Post subject: Re: And or double shift to select middle bits?Posted: Sat May 06, 2017 6:08 am
 Member

Joined: Mon Mar 25, 2013 7:01 pm
Posts: 1216
I'm guessing you actually want to see what happens when GCC can't optimize it to a constant, so try making your test cases functions:
Code:
uint8_t optionA( uint32_t someBits )
{
return (someBits & 0xff0000) >> 16;
}
uint8_t optionB( uint32_t someBits )
{
return (someBits << 8) >> 24;
}
uint8_t optionC( uint32_t someBits )
{
return someBits >> 16;
}

Top

 Post subject: Re: And or double shift to select middle bits?Posted: Sat May 06, 2017 7:12 am
 Member

Joined: Thu Jul 12, 2012 7:29 am
Posts: 718
Location: Tallinn, Estonia
First shift right and then mask low bits could make it more efficient code (due to using smaller constants in an and).

_________________

Top

 Post subject: Re: And or double shift to select middle bits?Posted: Sat May 06, 2017 9:51 am
 Member

Joined: Fri Oct 27, 2006 9:42 am
Posts: 1057
Location: Athens, GA, USA
We really can't make a general reply to this, TBH, because it depends on too many factors to really say.
• Does the program or compiler know which bits you need to be selected at compile time?
• Is the final value one which can be pre-computed at compile time (as you saw, GCC will do this when optimizations are on - it is a pretty basic optimization strategy, but it isn't always going to be obvious to the client programmer when this is possible).
• Is the mask already loaded into a register (or can it be pre-loaded into one while the processor is executing some other operations - something which on an out-of-order superscalar like the current x86s can only be determined absolutely by the pipeline hardware, though the compiler can trying to re-order it to make it more likely).
• Are any memory operands going to be in cache when the load is made or does the CPU need to do a fetch from memory (which can take hundreds of cycles and probably cause a pipeline stall even with re-ordering)? On most modern architectures, this literally cannot be predicted until you have the instruction fetched and decoded (and it's dicey even then), because if a task switch occurs between the values being loaded and them being used, the cache manager can and will evict the memory cells to make room for memory used by the other process.
• Is the instruction preceded by a jump/branch of some kind (it almost certainly will be, especially since the only way the difference between the two approaches would matter is if the operation is in a tight loop)? If so, then a branch prediction miss will leave this question pointless, as the pipeline stalls will take several times longer than the difference between using one single cycle instruction or two could ever be.

This is just a few possible complications I came up with off the top of my head.

Unless this is some sort of bottleneck in the program - and you didn't mention it being one - this is the sort of pre-mature optimization that is literally only academic. There is just no way to guess which would work better on any x86 system more recent than the Pentium Pro (for reference, that's from 1995), and it is very likely that it will vary over different runs of the program because of cache and/or branch prediction misses.

_________________
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
μή εἶναι βασιλικήν ἀτραπόν ἐπί γεωμετρίαν
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.

Top

 Post subject: Re: And or double shift to select middle bits?Posted: Sat May 06, 2017 3:32 pm
 Member

Joined: Sun Jul 14, 2013 6:01 pm
Posts: 442
double shift will be faster IN MOST CASES IN THEORY becouse it results smaller code, and it needs fewer transistors so out of order superscalar almost always can kick in. not much difference, since the AND overhead can be hidden by pipelines - but with the power you asked it here as a question, you could benchmark it, maybe there will not be any difference. if yes, shift will be a bit faster (only a few percent, dont expect spectacular difference)

_________________
Operating system for SUBLEQ cpu architecture:
http://DawnOS.tk

Top

 Post subject: Re: And or double shift to select middle bits?Posted: Sat May 06, 2017 3:47 pm
 Member

Joined: Sat Jan 15, 2005 12:00 am
Posts: 8200
Location: At his keyboard!
Hi,

MuchLearning wrote:
Example, say I want bits 16-23.
Code:
uint32_t someBits;
uint8_t optionA = (someBits & 0xff0000) >> 16; //isolate the bits and shift
uint8_t optionB = (someBits << 8) >> 24; //double shift to clear unwanted bits
Which is faster?

For this specific example; fastest would probably be:
uint8_t optionC = (someBits >> 16) & 0xff;[/code]

This is because most CPUs have a relatively fast way to convert 8-bit integers into larger integers. For example, on 80x86 I'd expect this code to be compiled to "SHR" then "MOVZX" without much optimisation at all.

With good optimisation, I'd expect that the shift would also disappear most of the time - e.g. the compiler deciding to do something like "movzx eax,byte [ebp-4+(16/8)]" instead of doing something like "mov eax,[ebp-4]; shr eax,16; movzx eax,al".

Cheers,

Brendan

_________________
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.

Top

 Post subject: Re: And or double shift to select middle bits?Posted: Sat May 06, 2017 4:42 pm
 Member

Joined: Fri Oct 27, 2006 9:42 am
Posts: 1057
Location: Athens, GA, USA
Perhaps we need to start over from the top:

@MuchLearning, why do you want to know? I am not saying you shouldn't want to know, but the intent is relevant here, because (as I said earlier) this is the sort of micro--optimization that is entirely irrelevant in actual practice - any gain or loss from it is going be swamped by other factors unless it is a bottleneck in a very tight loop.

If you're just curious, fine; the answers already given are more than suitable, with Geri and Brendan's answers being pretty much equivalent in terms of cycles used (assuming that the compiler uses an 8-bit arguments for both the shift and mask, which I would surely hope it would but I wouldn't count on it).

If the intent is to break a bottleneck, then we'd need a lot more detail as to what the actual code is doing, rather than just a mocked-up test. The test serves to illustrate the question, but to fix the live code, we'd need to see it.

Beyond that, meh. It really isn't something that you get a real advantage on one way or the other. How you do it is up to you. That having been said, I would be curious as to what you are using it for, in case there is some aspect of it that could make it amenable to an entirely different solution (one which your question doesn't give traction for, making this a shoe or bottle problem).

_________________
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
μή εἶναι βασιλικήν ἀτραπόν ἐπί γεωμετρίαν
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.

Top

 Post subject: Re: And or double shift to select middle bits?Posted: Sat May 06, 2017 10:16 pm
 Member

Joined: Sun Dec 25, 2016 1:54 am
Posts: 204
for reference:

https://graphics.stanford.edu/~seander/bithacks.html

_________________
Plagiarize. Plagiarize. Let not one line escape thine eyes...

Top

 Post subject: Re: And or double shift to select middle bits?Posted: Fri May 12, 2017 2:14 am
 Member

Joined: Sun Feb 12, 2017 1:50 am
Posts: 26
Schol-R-LEA wrote:
Perhaps we need to start over from the top:

@MuchLearning, why do you want to know? I am not saying you shouldn't want to know, but the intent is relevant here, because (as I said earlier) this is the sort of micro--optimization that is entirely irrelevant in actual practice - any gain or loss from it is going be swamped by other factors unless it is a bottleneck in a very tight loop.

Pure and unbridled curiosity.

Top

 Post subject: Re: And or double shift to select middle bits?Posted: Fri May 12, 2017 2:29 am
 Member

Joined: Thu Nov 16, 2006 12:01 pm
Posts: 7184
Location: Germany
1) Pick an architecture, compiler, and optimization level.

2) Measure.

_________________
Every good solution is obvious once you've found it.

Top

 Post subject: Re: And or double shift to select middle bits?Posted: Mon Sep 04, 2017 10:56 am

Joined: Sun Sep 03, 2017 4:29 am
Posts: 2
dchapiesky wrote:

Thank you!

Top

 Post subject: Re: And or double shift to select middle bits?Posted: Mon Sep 04, 2017 12:35 pm
 Member

Joined: Fri Aug 19, 2016 10:28 pm
Posts: 199
I know that the question refers to C for clarity, but it has to be said, that you can't optimize in such detail using C. The goal in such a language is to make the intent as easy to assimilate by the compiler as possible, which requires more knowledge about the compiler than the machine. That's why you mostly aim for expression that can be suitably implemented, but without delving into the specifics. Since the instruction output may vary from one version of the compiler to the next.

That being said, I intuitively prefer different operations when possible, because on some architecture they may be implemented in different execution units. This is not the case here, as a matter of fact, but it is a general tendency for me. You also want to eliminate data dependencies, which fails in both cases here. As Brendan pointed out, extracting the data from the least significant byte is good for x86, possibly for arm. The expression can be rewritten without the masking:
Code:
uint8_t optionC = (uint8_t)(someBits >> 16);
I hate the readability though. The cast may be omitted, but it still messes up the clarity, because you have to think about twos complement roll over/modulo assignment in both cases. For cores with modern instruction caches, since the microops themselves are cached, the size and encoding complexity of the instructions in the inner loop should not matter that much. But this again is the kind of reasoning that requires at least some inline comments in the code, if used in C at all.

Top

 Post subject: Re: And or double shift to select middle bits?Posted: Tue Sep 05, 2017 6:20 am
 Member

Joined: Thu Nov 16, 2006 12:01 pm
Posts: 7184
Location: Germany
Encapsule the actual operation in a mnemonically-named inline function (or macro, if you must).

That way, you achieve

• readability / clarity of intention no matter how obscure the operation, and
• you can use #ifdef to select the fastest operation depending on platform / option switch (if you must).

You could even (gasp!) quickly compare the two different implementations, and see for yourself if they actually make any measurable difference in the big scheme...

_________________
Every good solution is obvious once you've found it.

Top

 Display posts from previous: All posts1 day7 days2 weeks1 month3 months6 months1 year Sort by AuthorPost timeSubject AscendingDescending
 Page 1 of 2 [ 18 posts ] Go to page 1, 2  Next

 All times are UTC - 6 hours

Who is online

Users browsing this forum: No registered users and 5 guests

 You cannot post new topics in this forumYou cannot reply to topics in this forumYou cannot edit your posts in this forumYou cannot delete your posts in this forumYou cannot post attachments in this forum

Search for:
 Jump to:  Select a forum ------------------ Operating System Development    OS Development    OS Design & Theory    Announcements, Test Requests, & Job openings Everything Else    General Programming    General Ramblings    Auto-Delete Forum OSDev.org    OSDev Wiki    About this site