OSDev.org

 Post subject: Why little endian!
PostPosted: Sat Dec 19, 2020 7:31 am 
Hello,
One thing that has always confused me in low-level programming is endianness, so today I decided to research it. According to a resource from IBM that I read, in little endian everything is backwards! So, if I have the number 1234, little-endian machines store it as 4321, correct? Big endian, however, stores it as 1234, also correct? Then I continued reading, and it states that data is always backwards at the byte level! Why is this?
Thanks,
nexos

_________________
Currently working on the Nexware project, an attempt to make a less bloated version of GNU. All repos for it can be found at https://github.com/Nexware-Project.


 
 Post subject: Re: Why little endian!
PostPosted: Sat Dec 19, 2020 8:41 am 
nexos wrote:
Hello,
One thing that has always confused me in low-level programming is endianness, so today I decided to research it. According to a resource from IBM that I read, in little endian everything is backwards! So, if I have the number 1234, little-endian machines store it as 4321, correct? Big endian, however, stores it as 1234, also correct? Then I continued reading, and it states that data is always backwards at the byte level! Why is this?
Thanks,
nexos


It is just the bytes that are stored in reverse order of what you would expect when reading and writing binary numbers.

So uint32_t 0xDEADBEEF would be stored as the bytes EF BE AD DE in memory, lowest address first.

While I inherently prefer Big Endian, Little Endian is quite useful, as you can access the same memory location with different type sizes and still get the same value, provided it fits in the smaller type.
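As a minimal sketch of the byte layout described above, the following Python snippet uses the standard `struct` module to model how a 32-bit value would sit in memory under each byte order. It models memory as a byte string rather than inspecting real hardware, and the variable names are only illustrative.

```python
import struct

value = 0xDEADBEEF

# "<I" packs an unsigned 32-bit integer little-endian, ">I" big-endian.
little = struct.pack("<I", value)
big = struct.pack(">I", value)

# Bytes are printed from the lowest address to the highest.
print(little.hex(" "))  # ef be ad de
print(big.hex(" "))     # de ad be ef
```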

_________________
Building a single address space Microkernel, as used in embedded applications, for the desktop... Download latest build bootable Disk Image: https://github.com/h5n1xp/CuriOS/blob/main/disk.img.zip


 
 Post subject: Re: Why little endian!
PostPosted: Sat Dec 19, 2020 8:57 am 
nexos wrote:
Then I continued reading, and it states that data is always backwards at the byte level! Why is this?
Without context, this is going to be difficult to answer. IBM is a bit weird about bit counting as well: All PowerPC documentation has bit 0 as the most significant bit, so converting the manuals for PPC64 to PPC32 is not as simple as with the Intel manuals.

Originally, Little Endian and Big Endian were terms from Gulliver's Travels. Gulliver happened upon an island on which a fierce war was about to start over whether to open your boiled egg from the little end or the big end. Not sure what Swift wanted to tell us with that; maybe that some of our wars actually were that petty? Also, I have it on the authority of Baron Münchausen that Swift was a liar, and the only thing in the South Sea is a tribe of people dancing the minuet in mid-air.

Anyway, yes, nowadays Little Endian and Big Endian refer to byte order. How do you save data in memory? Fun fact: If memory were only accessible in units of machine words, byte order would not matter. (Proof: Right now, memory is only accessible down to the byte level, and you never hear arguments about bit ordering, do you?) But oh well, that is not the world we live in. So if you have a data item that spans multiple bytes, do you store the bytes in ascending or descending order of significance? Then the PDP faction will pipe up and tell you to take a third option. You see, the PDP-11 was a little-endian machine with 16-bit words. So within a word, the less significant byte would come first, but for a 32-bit double word, the more significant word would come first. So the byte order would actually be 2143.
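To make the "2143" order concrete, here is a small Python sketch. It assumes the commonly described PDP-11 convention for 32-bit values (each 16-bit word little-endian internally, with the more significant word stored first) and numbers the bytes 1 to 4 from most to least significant. `pack_pdp32` is a hypothetical helper written for this illustration, not an existing library function.

```python
import struct

def pack_pdp32(value: int) -> bytes:
    """Lay out a 32-bit value in the mixed 'PDP-endian' byte order (illustrative)."""
    high_word = (value >> 16) & 0xFFFF
    low_word = value & 0xFFFF
    # More significant 16-bit word first; each word is little-endian inside.
    return struct.pack("<H", high_word) + struct.pack("<H", low_word)

# Byte 01 is the most significant, byte 04 the least significant.
print(pack_pdp32(0x01020304).hex(" "))  # 02 01 04 03
```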

But anyway, apart from historical curiosities, all machines save stuff in little endian or big endian byte order. Both have counter-intuitive consequences. Yes, if you read a memory dump on a little endian machine, and you know a couple of bytes are actually forming a number, you have to invert their order to understand the number. That might seem daft, but it has the advantage that if you save a low value into that memory slot, you can read the value without error without knowing the exact type. This came in handy just a few minutes ago, when I was implementing a new multiboot loader. See, the multiboot 1 spec doesn't tell me very clearly how big the "type" in the memory map is supposed to be. That is actually a big problem with a lot of their listings. But the "type" is a value definitely below 256, so I can just declare it as a byte on x86 and get the correct value out. Won't help me on PowerPC, but then I will need an entirely different loader on PowerPC, so who cares?

Little endian also has the related advantage that if you increase the size of a field, if the new bytes added in are zero, then the value stays the same. That cannot be said for Big Endian systems. Big Endian has the advantage of making memory dumps more easily understandable, but that is not a very valuable trait of a computing system. It is also the network byte order, so badly written code will get a speedup from running on a Big Endian machine. Well-written code will perform the same on either.
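The widening property can be sketched in Python by modelling a field as a byte string and appending zero bytes at the higher addresses. This is an illustration only; the 2-byte and 4-byte field sizes and the value 1234 are arbitrary choices.

```python
value = 1234

# A 2-byte field, widened to 4 bytes by adding zero bytes at higher addresses.
le_field = value.to_bytes(2, "little")
le_widened = le_field + b"\x00\x00"

be_field = value.to_bytes(2, "big")
be_widened = be_field + b"\x00\x00"

le_result = int.from_bytes(le_widened, "little")
be_result = int.from_bytes(be_widened, "big")

print(le_result == value)        # True: the little-endian value is unchanged
print(be_result == value << 16)  # True: the big-endian value moved up 16 bits
```

On the big-endian side, the zero bytes would have to be inserted in front of the existing bytes to preserve the value, which moves the original data.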

_________________
Thou hast outraged, not insulted me, sir; but for that I ask thee not to beware of Starbuck; thou wouldst but laugh; but let Ahab beware of Ahab; beware of thyself, old man.


 
 Post subject: Re: Why little endian!
PostPosted: Sat Dec 19, 2020 10:30 am 
Basically, little-endian specializes in typecasting sizes.

Take 0xDEADBEEF for example.
Now say you only wanted the first two bytes, 0xBEEF.
With little-endian you wouldn't have to change the address you were looking at at all, you would just have to change how much you look at. (hex, where V is the pointer location):
Code:
V
EFBEADDE
V
EFBE (ignore anything past what we want)
----------------------------------------------------------------
Now consider big-endian. You have your stored value and you want to take the word value of it. You would have to increment the pointer by two bytes (or bit-shift/mask) to get to the value
Code:
V
DEADBEEF
    V
DEADBEEF

Now of course, big-endian has the typecasting ease in the opposite way.
If you want to read the upper two bytes, 0xDEAD, you would just have to change how much you are reading, whereas in little-endian you'd have to do bit-shifting and masking and stuff (or increment the pointer by two).
However, from what I can tell, it is more common to need the lower bytes than the upper bytes.
Also, there's probably something in the electronics that makes it easier to build for. Idk about faster, but ¯\_(ツ)_/¯
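The pointer diagrams above can be sketched in Python, with a byte string standing in for memory and an offset standing in for the pointer. `read_u16` is a hypothetical helper for this illustration.

```python
import struct

value = 0xDEADBEEF
le_mem = struct.pack("<I", value)  # EF BE AD DE
be_mem = struct.pack(">I", value)  # DE AD BE EF

def read_u16(memory: bytes, offset: int, byteorder: str) -> int:
    """Read a 16-bit unsigned value at `offset`, interpreting it in `byteorder`."""
    return int.from_bytes(memory[offset:offset + 2], byteorder)

# Little-endian: the low half is at the original address, the high half at +2.
low_le = read_u16(le_mem, 0, "little")
high_le = read_u16(le_mem, 2, "little")

# Big-endian: the high half is at the original address, the low half at +2.
high_be = read_u16(be_mem, 0, "big")
low_be = read_u16(be_mem, 2, "big")

print(hex(low_le), hex(high_le))  # 0xbeef 0xdead
print(hex(low_be), hex(high_be))  # 0xbeef 0xdead
```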

_________________
My OS: TritiumOS
https://github.com/foliagecanine/tritium-os
void warranty(laptop_t laptop) { if (laptop.broken) return laptop; }
I don't get it: Why's the warranty void?


 
 Post subject: Re: Why little endian!
PostPosted: Sat Dec 19, 2020 2:02 pm 
Exactly, foliagecanine is 100% correct. Think about it like this: little-endian is arguably the more natural representation for machines because digits with low significance are stored at low memory addresses.

_________________
managarm: Microkernel-based OS capable of running a Wayland desktop (Discord: https://discord.gg/7WB6Ur3). My OS-dev projects: [mlibc: Portable C library for managarm, qword, Linux, Sigma, ...] [LAI: AML interpreter] [xbstrap: Build system for OS distributions].


 
 Post subject: Re: Why little endian!
PostPosted: Sat Dec 19, 2020 5:09 pm 
I like LE and also think it's the natural way. We spell numbers in BE and it's awkward. Europeans borrowed this system from the Arabs, but got it wrong: Arabs write numbers in LE. Their writing system runs from right to left, so if the number is, say, 123, the least significant decimal digit, 3, goes first. Europeans just copied this into the left-to-right writing system, forgetting to reverse the numbers, so when we write 123, the most significant digit comes first. It has become habitual, and because of this it seems "normal" to us, but in fact LE is the natural one. Imagine if we wrote words starting not from the letter for the first sound, but from the letter for the last. BE does exactly the same with numbers.

_________________
future big goal: ANT - NT-like OS for mips, arm and x86.
current smaller goal: efify - UEFI for a couple of boards (mips and arm).


 
 Post subject: Re: Why little endian!
PostPosted: Sat Dec 19, 2020 6:41 pm 
I think looking at big numbers makes the benefits of little-endian systems much less apparent. Let's look at small numbers instead, say... 1.

If we represent 1 as a 32-bit value in hexadecimal, we get 0x00000001.
As a 16-bit value, it's 0x0001.
And as an 8-bit value it's 0x01.

If we store these in big endian, they all look different:
0x00 0x00 0x00 0x01
0x00 0x01
0x01

And if we store them in little-endian, notice the pattern:
0x01 0x00 0x00 0x00
0x01 0x00
0x01

Consider what happens if we want to cast a 32-bit value to a 16-bit value - we can use the same memory address to get both of those in little-endian, but in big-endian we need to shift our pointer.
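The pointer shift mentioned above follows a simple rule, sketched here in Python: to narrow an integer in big-endian storage you skip `wide_size - narrow_size` bytes, while in little-endian storage the offset is always zero. `narrow_offset` and `narrow` are hypothetical helpers written for this illustration.

```python
def narrow_offset(wide_size: int, narrow_size: int, byteorder: str) -> int:
    """Offset of the low `narrow_size` bytes inside a `wide_size`-byte integer."""
    if byteorder == "little":
        return 0
    return wide_size - narrow_size

def narrow(memory: bytes, narrow_size: int, byteorder: str) -> int:
    """Read the low `narrow_size` bytes of the integer stored in `memory`."""
    offset = narrow_offset(len(memory), narrow_size, byteorder)
    return int.from_bytes(memory[offset:offset + narrow_size], byteorder)

one_le = (1).to_bytes(4, "little")  # 01 00 00 00
one_be = (1).to_bytes(4, "big")     # 00 00 00 01

print(narrow(one_le, 2, "little"), narrow_offset(4, 2, "little"))  # 1 0
print(narrow(one_be, 2, "big"), narrow_offset(4, 2, "big"))        # 1 2
```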

_________________
toaruos on github | toaruos.org | gitlab | twitter | bim - a text editor


 
 Post subject: Re: Why little endian!
PostPosted: Mon Dec 21, 2020 4:05 am 
bloodline wrote:
It is just the bytes that are stored in reverse order of what you would expect when reading and writing binary numbers.


That's not entirely true.

The bits are probably also in little endian (though there is no guarantee of this).

But this is a CPU implementation detail. Since the byte is the smallest addressable unit, there's no way for us programmers to tell the difference.


 
 Post subject: Re: Why little endian!
PostPosted: Mon Dec 21, 2020 4:08 am 
@klange I do agree the little endian representation is superior, but I don't agree with your argument. Patterns can be offset by arbitrary numbers of bits, and a repeating sequence can be recognized regardless of endianness.

Moreover, imagine we're comparing two same-sized integers; say, two pointers. It's probably interesting if those pointers have the same high bits, but it's completely irrelevant if just their low bits are the same. So in that case recognizing the high bits is more important.


 
 Post subject: Re: Why little endian!
PostPosted: Mon Dec 21, 2020 4:28 am 
I know a very smart and passionate person who really hates big-endian! I don't know why, but I imagine nullplan's arguments probably cover it.

There's another argument for little-endian, but it doesn't apply to many OSs: In arbitrary-precision calculations, adding or subtracting numbers needs to start with the low digits for the carry to propagate correctly. (Yes, even though you borrow in the other direction when doing subtraction longhand. Two's complement is awesome like that.) I don't know about multiplication and division.
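A rough Python sketch of the carry argument: two unsigned numbers are stored as little-endian byte strings (base-256 digits, least significant first), and the addition walks them from index 0 upward so each carry flows into the next digit. `add_le` is a hypothetical helper for this illustration, not taken from any particular bignum library.

```python
def add_le(a: bytes, b: bytes) -> bytes:
    """Add two unsigned numbers stored as little-endian byte strings."""
    result = bytearray()
    carry = 0
    for i in range(max(len(a), len(b))):
        digit_a = a[i] if i < len(a) else 0
        digit_b = b[i] if i < len(b) else 0
        total = digit_a + digit_b + carry
        result.append(total & 0xFF)  # low 8 bits become this digit
        carry = total >> 8           # overflow moves to the next digit
    if carry:
        result.append(carry)
    return bytes(result)

# 0xFFFF + 1 = 0x10000: the carry ripples through both low digits.
print(add_le(b"\xff\xff", b"\x01").hex(" "))  # 00 00 01
```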

In human language, big-endian makes rounding more natural. That's valuable, I think.

_________________
Kaph link pending. code pending. design pending. plans in a state of flux! everything pending! choice of language still up in the air! why is nothing coming together?!?


 
 Post subject: Re: Why little endian!
PostPosted: Mon Dec 21, 2020 4:52 am 
moonchild wrote:
@klange I do agree the little endian representation is superior, but I don't agree with your argument. Patterns can be offset by arbitrary numbers of bits, and a repeating sequence can be recognized regardless of endianness.

Moreover, imagine we're comparing two same-sized integers; say, two pointers. It's probably interesting if those pointers have the same high bits, but it's completely irrelevant if just their low bits are the same. So in that case recognizing the high bits is more important.

It's not about recognition, it's about casting between sizes. With a little-endian storage format, a small value stored in a large integer "starts" at the same location for smaller types.

And individual bytes do not have any concept of endianness; there is no meaningful 'ordering' of bits. They all exist together, equally. How they exist in hardware not only cannot be determined by software, it has no meaning and can even vary at different parts of the pipeline. In some forms of memory, individual bits of a single byte may even live on different chips.

 
 Post subject: Re: Why little endian!
PostPosted: Mon Dec 21, 2020 6:53 am 
eekee wrote:
In arbitrary-precision calculations [...] I don't know about multiplication and division.


As I happen to be engaged in just that @ PDCLib at the moment: Naive multiplication goes from least to most significant. A more advanced algorithm does a kind of divide-and-conquer with half the digits; if you do that recursively, LSB / MSB stops mattering at all.

Division "guesses" a result going by the most significant digits, and then adjusts that first guess when calculating the remainder reveals the guess being off by one.

Whether you store the digits LSB or MSB first doesn't really matter performance-wise, but I have to (reluctantly, as I was raised on MSB) admit that having data[0] being the least significant digit (i.e., LSB) made more sense to me.
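For illustration, here is a schoolbook multiplication sketch in Python with the digits stored least significant first, so that index 0 is the lowest digit as described above. This is only an assumed textbook formulation, not PDCLib's actual code, and `mul_le` is a hypothetical name.

```python
def mul_le(a: list[int], b: list[int], base: int = 256) -> list[int]:
    """Multiply two unsigned numbers given as digit lists, least significant first."""
    result = [0] * (len(a) + len(b))
    for i, digit_a in enumerate(a):
        carry = 0
        for j, digit_b in enumerate(b):
            # Digit i times digit j contributes to position i + j.
            total = result[i + j] + digit_a * digit_b + carry
            result[i + j] = total % base
            carry = total // base
        # The final carry of this row lands one position past the row.
        result[i + len(b)] += carry
    return result

a = list((1000).to_bytes(2, "little"))
b = list((1000).to_bytes(2, "little"))
product = int.from_bytes(bytes(mul_le(a, b)), "little")
print(product)  # 1000000
```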

_________________
Every good solution is obvious once you've found it.


 
 Post subject: Re: Why little endian!
PostPosted: Mon Dec 21, 2020 7:04 am 
moonchild wrote:
bloodline wrote:
It is just the bytes that are stored in reverse order of what you would expect when reading and writing binary numbers.


That's not entirely true.

The bits are probably also in little endian (though there is no guarantee of this).

But this is a CPU implementation detail. Since the byte is the smallest addressable unit, there's no way for us programmers to tell the difference.

Actually, when doing bitwise operations, bits being ordered in little endian can be quite confusing (e.g., when masking off the top four bits of a number, you must do "num & 0x0fff").

 
 Post subject: Re: Why little endian!
PostPosted: Mon Dec 21, 2020 7:13 am 
For arithmetic operations on words, endianness does not make a difference: even on a big endian system, you'd do "num & 0xFFF" to mask out the top bits.
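A small Python sketch of this point: the mask acts on the value, not on its storage, so the same `num & 0x0FFF` gives the same result whether the 16-bit word was stored little-endian or big-endian. The value 0xABCD is an arbitrary example.

```python
import struct

num = 0xABCD
results = {}

for name, fmt in (("little", "<H"), ("big", ">H")):
    stored = struct.pack(fmt, num)           # bytes as they would sit in memory
    (loaded,) = struct.unpack(fmt, stored)   # load the word back as a value
    results[name] = loaded & 0x0FFF          # clear the top four bits

print(hex(results["little"]), hex(results["big"]))  # 0xbcd 0xbcd
```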

 
 Post subject: Re: Why little endian!
PostPosted: Mon Dec 21, 2020 3:23 pm 
Little endian has made writing my assembler's assembled immediate words/bytes into memory very easy. I am glad it is used.
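As a hedged sketch of what emitting an immediate might look like: the post does not say which architecture the assembler targets, so this assumes x86 purely as an example, where `mov eax, imm32` is encoded as the opcode byte 0xB8 followed by the 32-bit immediate in little-endian order. `emit_imm` is a hypothetical helper, not part of any real assembler.

```python
def emit_imm(buffer: bytearray, value: int, size: int) -> None:
    """Append `value` to `buffer` as a `size`-byte little-endian immediate."""
    mask = (1 << (8 * size)) - 1  # also wraps negative values to two's complement
    buffer.extend((value & mask).to_bytes(size, "little"))

# Assumed encoding: mov eax, 0x12345678
code = bytearray([0xB8])
emit_imm(code, 0x12345678, 4)
print(code.hex(" "))  # b8 78 56 34 12
```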

