bzt wrote:
In other words, the compiler does not "require" nor "forbid" anything. Just as I said. You're free to use pointer casting.
And the compiler is free to silently generate code that does not work.
bzt wrote:
Here's another example. Zlib, which has been compiled on an armada of platforms for god knows how many OSes, also uses pointer casting to speed up CRC calculation. Is zlib a badly written code because of that? I don't think so.
Do you mean this code, which is documented to explicitly rely on correct pointer alignment and relaxed aliasing rules and is only conditionally enabled so it can be avoided when those conditions can't be met? No, because the author put in the work necessary to avoid relying on undefined behavior.
bzt wrote:
First, we can safely assume x86,
Why do you think this is a safe assumption on a forum where there are regular discussions about OS development on non-x86 platforms?
bzt wrote:
second this code was never designed to be portable
Why not? There's nothing about the PSF format that makes it architecture-specific, and there's nothing on the wiki page that says the code is architecture-specific.
bzt wrote:
third MOV is not a special instruction requiring alignment,
What about MOVDQA? MOVAPS?
bzt wrote:
fourth generated code accessing unaligned address is valid and does work.
"It works" does not mean "it will always work". "It works" does not mean "the behavior is defined".
bzt wrote:
By your argument you must never use paging nor variadic arguments in your C code because there are architectures that do not support those. Do not use open(), read(), write(), close() either, because under Windows you don't have those. And never use string literals longer than 509 bytes because there are C compilers that can't handle that. Don't declare a variable in a middle of a code block because there are C compilers that do not support those. Shall I continue...?
My argument is "you should not rely on undefined behavior." The only things you've listed that could be undefined behavior are string literals longer than 509 characters and mixed declarations and code, and only in the C89 standard. C99 has been around for 20 years; perhaps it's time for you to update.
bzt wrote:
You are reading in the middle of a bitchunk, not at the end. Are you suggesting copying memory over and over again?
No, of course not. You would add those three bytes to the end of the PSF file when including it into your kernel, not at runtime.
bzt wrote:
Homework for you: here's my RLE16 decoder using pointer casting and no memory copy for 16 bit elements (about 20 SLoC), compare its complexity and performance with any of the other existing RLE solutions not using pointer casting (for example this one, which uses memcpy to aligned temporary buffer to assure alignment). You'll be surprised to learn the difference. Seriously, I mean it, compile both and measure their performance!
Provide your own benchmarks first.
Here, I'll even give you a fair comparison by rewriting your implementation to remove the unaligned pointer casts. (Behavior may still be undefined if the input is malformed.)
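To show the general idea without the links, here is a minimal sketch of an RLE decoder for 16-bit elements that never casts the byte stream to a wider pointer type. This is not the linked code: the record format (one count byte followed by a little-endian 16-bit value) and the name `rle16_decode` are made up for illustration. Unlike a cast-based version, it also rejects truncated input and output overflow instead of running off the end of a buffer.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical record format, assumed only for this sketch:
 *   byte 0      repeat count (0..255)
 *   bytes 1..2  16-bit value, little-endian
 * Returns the number of 16-bit elements written,
 * or (size_t)-1 if the input is truncated or the output is too small.
 */
static size_t rle16_decode(const unsigned char *in, size_t in_len,
                           uint16_t *out, size_t out_cap)
{
    size_t i = 0; /* read position in bytes */
    size_t o = 0; /* write position in elements */

    while (i < in_len) {
        if (in_len - i < 3)
            return (size_t)-1; /* truncated record */

        size_t count = in[i];
        /* Assemble the value from bytes: no alignment requirement,
         * and the result does not depend on host byte order. */
        uint16_t value = (uint16_t)(in[i + 1] | (in[i + 2] << 8));
        i += 3;

        if (count > out_cap - o)
            return (size_t)-1; /* output buffer too small */

        while (count--)
            out[o++] = value;
    }
    return o;
}
```

The value sits at an odd offset in every record after the first byte, so a `uint16_t *` cast would be misaligned here; the byte-wise load sidesteps that entirely.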
bzt wrote:
Why not use a feature that all the potential target CPUs support and the compiler allows?
Where in the GCC documentation does it say that pointers do not have to be aligned?
bzt wrote:
That's your problem! You are not writing code for a beauty contest, this isn't IOCCC, but in a kernel you're writing low level code, which means your priority is the generated code. If you think otherwise, then you're just an application programmer, not a kernel programmer (no offense intended).
My priority is the generated code. That's why avoiding undefined behavior is so important: the compiler makes no guarantees about the code it will generate for undefined behavior.
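As a sketch of what that looks like in practice, these are two common ways to read a 32-bit value from an address of unknown alignment without casting the pointer. The helper names `load_u32` and `load_le32` are hypothetical. With optimization enabled, GCC and Clang typically compile the fixed-size `memcpy` to a plain load on targets that allow unaligned access, but that is an observation about current compilers, not a guarantee.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit value in host byte order.
 * memcpy has no alignment requirement on its arguments, so this is
 * defined for any valid address. */
static uint32_t load_u32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical helper: read a little-endian 32-bit value.
 * Built from single bytes, so the result is the same on any host. */
static uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}
```

Either helper replaces an expression like `*(uint32_t *)ptr`, where the behavior is undefined if `ptr` is not suitably aligned for `uint32_t`.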
bzt wrote:
Never gonna happen. Forbidding a feature that the target supports and more than 50 years of code relying on would be a suicide on the compiler part. If a compiler would really disallow unaligned access, ALL the FAT implementations would broke for example (because BPB struct must be packed and its fields aren't properly aligned).
Packed structs are implementation-defined behavior. Implementation-defined behavior is not undefined behavior. We are talking about undefined behavior.
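To make the distinction concrete, here is a minimal sketch of the first fields of a FAT BPB, assuming the GCC/Clang `__attribute__((packed))` syntax (other compilers spell this differently). The struct and function names are hypothetical. Because the compiler knows the struct is packed, it generates code that is valid for the misaligned members, which is exactly the guarantee a raw pointer cast does not give you.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* First 19 bytes of a FAT boot sector. Names are illustrative. */
struct bpb_head {
    uint8_t  jmp[3];
    uint8_t  oem[8];
    uint16_t bytes_per_sector;    /* offset 11: odd, so misaligned */
    uint8_t  sectors_per_cluster; /* offset 13 */
    uint16_t reserved_sectors;    /* offset 14 */
    uint8_t  fat_count;           /* offset 16 */
    uint16_t root_entries;        /* offset 17: odd, so misaligned */
} __attribute__((packed));

/* Verify the layout at compile time instead of assuming it. */
_Static_assert(offsetof(struct bpb_head, bytes_per_sector) == 11,
               "unexpected BPB layout");
_Static_assert(offsetof(struct bpb_head, root_entries) == 17,
               "unexpected BPB layout");
_Static_assert(sizeof(struct bpb_head) == 19, "unexpected BPB size");

/* Copy the header out of a raw sector buffer of any alignment. */
static struct bpb_head bpb_from_sector(const unsigned char *sector)
{
    struct bpb_head b;
    memcpy(&b, sector, sizeof b);
    return b;
}

/* Member access through the packed type: the compiler emits code
 * that is valid for the misaligned field. Values are in host byte
 * order, so a big-endian host would still need to swap them. */
static uint16_t bpb_root_entries(const struct bpb_head *b)
{
    return b->root_entries;
}
```

Note that taking the address of a packed member and passing it around as a plain `uint16_t *` loses that guarantee again.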