Tutorial: Printing FPU Floating Numbers in Pure Assembly

~ · **Joined:** Tue Mar 06, 2007 11:17 am **Posts:** 1225

Link (Public Domain License):
http://devel.archefire.org/forum/viewtopic.php?p=4143&hl=en#p4143

EXE:
http://devel.archefire.org/tmp/FPU_float_print_demo.zip
http://archive.org/download/x86_FPU_Programming_0002/FPU_float_print_demo.zip

Video description:
http://www.youtube.com/watch?v=Xrfri-hKYbg

This is a tutorial with a sample Win32 program and Assembly utility functions that convert a floating point number held in the FPU into a string so it can be printed. The function uses the FPU itself to convert each digit into ASCII, first the integer part and then the fractional part.

It also has an ASCII string reversal utility function, since the string for the integer part is generated backwards. The function reads the first and last characters into AL and AH respectively, swaps them with XCHG and writes them back, as if we flipped a transparent sheet and could read the string already reversed. Then it increments the start-of-string pointer and decrements the end-of-string pointer to continue the reversal. It stops after size divided by 2 swaps, so if the string has an odd size it never touches the center byte, because it makes no sense to try to reverse the center byte of, for example, a 3-byte string.
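The swap-from-both-ends idea described above can be modelled in a few lines. This is a minimal Python sketch rather than the tutorial's assembly; the name `reverse_in_place` is hypothetical, and the tuple swap stands in for loading AL/AH and using XCHG.

```python
def reverse_in_place(buf: bytearray) -> None:
    """Reverse buf in place by swapping bytes from both ends toward the middle."""
    start = 0                 # models the start-of-string pointer
    end = len(buf) - 1        # models the end-of-string pointer
    # len // 2 swaps: for an odd length the center byte is never touched.
    for _ in range(len(buf) // 2):
        buf[start], buf[end] = buf[end], buf[start]
        start += 1
        end -= 1


digits = bytearray(b"321")    # integer digits as generated, least significant first
reverse_in_place(digits)
print(digits.decode("ascii"))  # prints 123
```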

I have put it on my external website since I have full control there to edit the HTML and the links and to maintain it properly. I am posting it here because I think it will be a useful snippet for finally learning to use the FPU productively: being able to calculate and then print the results right away shows whether we calculated correctly.

matt11235 · **Posted:** Sat Feb 11, 2017 4:11 am

~ wrote:

Video description:
http://www.youtube.com/watch?v=Xrfri-hKYbg

I don't think that you understand the purpose of a video.

gerryg400 · **Posted:** Sat Feb 11, 2017 4:31 am

Hi ~, you should read this.

~ · **Joined:** Tue Mar 06, 2007 11:17 am **Posts:** 1225

It's always good to read the available conversion methods.

Here is what I did:

I used the full 80-bit precision of the FPU.

I turned off rounding in the FPU by selecting Truncate Mode (round toward zero) in the Control Word.

I used FPREM to get the modulo division by 10 to get each digit of the integer part.

I used FISTTP or FISTP to extract the integer part without any rounding at all.

(FISTTP, supposedly from SSE3, ignores the rounding mode from the Control Word and always truncates. FISTP, supposedly more compatible, uses the rounding mode from the Control Word. Since I set that mode to truncate anyway, it should work with FISTP too.)

I multiplied the fractional part by 10 to get each one of its digits.

To get the fractional part only, I used FCHS to change the sign of a copy of the integer part and then added it to the full number, which subtracts the integer part.

Finally, I added the value 48 to each digit with the FPU itself to convert it to an ASCII digit, saved the result to memory as an integer, and read its first byte with the CPU. Each digit converted this way is then stored in the final string.
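The steps listed above can be sketched as follows. This is a Python model using ordinary doubles, not the 80-bit assembly from the download; `float_to_decimal` and its `frac_digits` parameter are hypothetical names, `math.fmod` stands in for FPREM and `math.trunc` for FISTTP. The sketch assumes a finite input below 2**53 so that every intermediate integer is exact in a double.

```python
import math


def float_to_decimal(value: float, frac_digits: int = 6) -> str:
    """Convert value to decimal text: remainder by 10, then multiply by 10."""
    if not math.isfinite(value) or abs(value) >= 2.0 ** 53:
        raise ValueError("sketch only handles finite values below 2**53")
    sign = "-" if value < 0 else ""
    value = abs(value)

    int_part = float(math.trunc(value))   # truncate, no rounding
    frac_part = value + (-int_part)       # negate the copy, then add

    # Integer part: remainder modulo 10 gives the lowest digit first.
    int_digits = bytearray()
    n = int_part
    while True:
        digit = int(math.fmod(n, 10.0))
        int_digits.append(digit + 48)     # 48 is ASCII '0'
        n = (n - digit) / 10.0            # exact: n - digit is a multiple of 10
        if n == 0.0:
            break
    int_digits.reverse()                  # digits were generated backwards

    # Fractional part: each multiplication by 10 exposes the next digit.
    frac_out = bytearray()
    for _ in range(frac_digits):
        frac_part *= 10.0
        digit = math.trunc(frac_part)
        frac_out.append(digit + 48)
        frac_part -= digit

    return sign + int_digits.decode("ascii") + "." + frac_out.decode("ascii")


print(float_to_decimal(3.25))  # prints 3.250000
```

Subtracting the digit before dividing keeps the division exact, which avoids a rounded quotient spilling into the next digit.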

Brendan · **Posted:** Sun Feb 12, 2017 3:55 am

Hi,

~ wrote:

Here is what I did:

Tell me what happens when you try to print the number 9223372036854775808.0.

Cheers,

Brendan

~ · **Joined:** Tue Mar 06, 2007 11:17 am **Posts:** 1225

Brendan wrote:

Tell me what happens when you try to print the number 9223372036854775808.0.

Only garbage.

I can only seem to print numbers between 9223372036854775807.5 and -9223372036854775807.5.

JavaScript rounds it to 9223372036854776000, which seems like a big error.

The program also had a bug where I forgot to pop the FPU stack once, so it could only print 6 integer digits.

I corrected it and reuploaded it.

Thanks for the exercise.

gerryg400 · **Posted:** Sun Feb 12, 2017 6:29 pm

~ wrote:

JavaScript rounds it to 9223372036854776000, which seems like a big error.

Well, not really. It's correct to about 16 significant digits, while your function returns garbage. Did you actually read the paper I pointed to?
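A quick way to see why the shorter output is not an error: a double stores 2**63 exactly, and 9223372036854776000 is simply a shorter decimal string that reads back as the same double. The Python sketch below only illustrates that point; Python's `repr` prints the shortest string that round-trips, which appears to be the same idea behind the JavaScript result, though that link is an assumption here.

```python
value = float(2 ** 63)

# The stored value is exactly 2**63.
print(int(value))    # prints 9223372036854775808

# The shortest decimal string that converts back to the same double.
print(repr(value))   # prints 9.223372036854776e+18

# The "rounded" decimal parses to the identical double.
print(float("9223372036854776000") == value)   # prints True
```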

Brendan · **Posted:** Sun Feb 12, 2017 6:52 pm

Hi,

~ wrote:

Brendan wrote:

Tell me what happens when you try to print the number 9223372036854775808.0.

Only garbage.

I can only seem to print numbers between 9223372036854775807.5 and -9223372036854775807.5.

This is because the range of the 64-bit signed integer is -9223372036854775808 to +9223372036854775807; so the "FISTP" (to get the integer part) causes problems for anything outside that range.

Fortunately, for "double precision", the significand only has 52 bits (or 53 bits if you include the implied bit); which means that any number outside the range of a 64-bit signed integer must be an integer. More specifically; for "double precision", any number where the exponent is not less than 53 must be an integer. This means that you can check if the biased exponent is >= 53+1023 and use a faster/simpler "integers only" method.
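For double precision, the exponent check can be sketched like this in Python. The function names are hypothetical; the threshold `53 + 1023` is the one given above and is used here as a sufficient condition only. The all-ones exponent (infinity and NaN) is excluded, which is an addition of this sketch.

```python
import struct


def double_biased_exponent(value: float) -> int:
    """Return the 11-bit biased exponent field of an IEEE 754 double."""
    bits = struct.unpack("<Q", struct.pack("<d", value))[0]
    return (bits >> 52) & 0x7FF


def is_certainly_integer(value: float) -> bool:
    """True when the exponent alone guarantees there is no fractional part."""
    exponent = double_biased_exponent(value)
    # 0x7FF marks infinity/NaN, which are not numbers to print digit by digit.
    return 53 + 1023 <= exponent < 0x7FF


print(is_certainly_integer(2.0 ** 63))  # prints True
print(is_certainly_integer(1.5))        # prints False
```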

Unfortunately, for "extended precision", the significand has 63 bits (or 64 bits if you include the implied bit). In this case any number where the exponent is not less than 64 must be an integer; and if the exponent is 63 then the number may not be an integer and will be outside the range of a 64-bit signed integer. However, if the exponent is 63 then the least significant bit of the significand will indicate if there is no fraction or if the fraction is 0.5. This means that:

If the biased exponent is >= 64+16383; then the number is an integer and has no fraction and you can use a faster/simpler "integers only" method
If the biased exponent == 63+16383; then you can mask off the least significant bit, use a faster/simpler "integers only" method, then append the characters ".5" on the end afterwards if the least significant bit was set
If the biased exponent <= 62+16383; then the integer part will fit in a 64-bit signed integer.
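The extended-precision cases can be checked by decoding a value exactly. The Python sketch below is a model, not FPU code: it takes the sign, the 15-bit biased exponent and the 64-bit significand as plain integers (hypothetical parameters) and assumes the usual convention that the value is significand × 2^(exponent − 16383 − 63), with the integer bit stored in bit 63. Under that convention the last exponent whose truncated integer part still fits is 62+16383, and everything from 63+16383 upward is already an integer, so treat the exact thresholds as something to verify against your own decoding rather than taking either this sketch or the numbers above on trust.

```python
from fractions import Fraction

EXT_BIAS = 16383
INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1


def decode_extended(sign: int, biased_exp: int, significand: int) -> Fraction:
    """Exact value of a normal extended-precision number (assumed layout)."""
    magnitude = Fraction(significand) * Fraction(2) ** (biased_exp - EXT_BIAS - 63)
    return -magnitude if sign else magnitude


def describe(sign: int, biased_exp: int, significand: int) -> tuple:
    """Return (integer part fits in int64, value has a fractional part)."""
    value = decode_extended(sign, biased_exp, significand)
    truncated = int(value)                       # truncates toward zero
    fits_int64 = INT64_MIN <= truncated <= INT64_MAX
    has_fraction = value.denominator != 1
    return fits_int64, has_fraction


# 9223372036854775808.0, the value that printed garbage
print(describe(0, EXT_BIAS + 63, 1 << 63))        # prints (False, False)
# 9223372036854775807.5, the largest value reported as printable
print(describe(0, EXT_BIAS + 62, (1 << 64) - 1))  # prints (True, True)
```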

However...

Displaying numbers properly requires internationalisation. Programmers have been "perverted" by prolonged use of crappy programming languages and crappy standard libraries, which have traditionally failed to accept properly formatted numbers as input (in the source code itself, and in functions like "atoi()" used at run-time) and failed to convert numbers to strings (for display, etc) properly.

Large numbers are too error prone (the risk of miscounting the digits is quite high), so normal humans put "thousands separators" in their numbers. Unfortunately different people use a different "thousands separator character" and a different "decimal point character". For example, you shouldn't ever display "9223372036854775807.5" but should display either "9,223,372,036,854,775,807.5" or "9.223.372.036.854.775.807,5" instead.

It's relatively easy to split a number into groups using "number % 1000" and "number / 1000"; so that you have zero or more integer groups (from 0 to 999) and at most one group that contains the fractional bits (from 0.000.. to 999.99999..). This avoids all of the problems for large numbers (no group is >= 1000); and makes it easier to improve performance with SIMD (because you can do multiple groups in parallel); and makes it easier to generate the digits in the correct order (e.g. "temp = group % 100; digit1 = group/100; digit2 = temp/10; digit3 = temp % 10; ") and avoid reversing the characters after.
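As an illustration of the grouping idea, here is a minimal Python sketch for non-negative integers only. `split_group` follows the digit1/digit2/digit3 arithmetic quoted above; `format_grouped` and its `thousands_sep` parameter are hypothetical names, and the fractional group is left out to keep the sketch short.

```python
def split_group(group: int) -> str:
    """Turn one group (0..999) into three digits, already in display order."""
    temp = group % 100
    digit1 = group // 100
    digit2 = temp // 10
    digit3 = temp % 10
    return f"{digit1}{digit2}{digit3}"


def format_grouped(number: int, thousands_sep: str = ",") -> str:
    """Format a non-negative integer with a separator between groups of 1000."""
    groups = []
    while True:
        groups.append(number % 1000)   # lowest group first
        number //= 1000
        if number == 0:
            break
    groups.reverse()
    # The leading group is printed without zero padding.
    parts = [str(groups[0])] + [split_group(g) for g in groups[1:]]
    return thousands_sep.join(parts)


print(format_grouped(9223372036854775807))       # prints 9,223,372,036,854,775,807
print(format_grouped(9223372036854775807, "."))  # prints 9.223.372.036.854.775.807
```

The groups themselves are still produced lowest first, but each group is independent of the others, which is what makes the parallel approach mentioned above possible.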

Cheers,

Brendan

Brendan · **Posted:** Sun Feb 12, 2017 7:16 pm

Hi,

gerryg400 wrote:

Did you actually read the paper I pointed to?

I didn't - I glanced at it, and then realised how much I "dislike" mathematicians.

For fun...

Programming (as a field of expertise) has only really existed for about 60 years. It took less than 30 years for programmers to realise that (except for a few cases - e.g. the use of 'i' and 'j' for induction variables in simple loops, and 'x', 'y' and 'z' in coordinate systems) single letter variable names are unmaintainable/unreadable trash. In comparison, mathematics has been around for over 3000 years and in all that time they still haven't realised that single letter variable names are extremely bad. After so much time it's unreasonable to assume nobody has ever thought about making mathematics (the language) readable. You have to assume that their prolonged negligence is a deliberate and malicious attempt to annoy everyone. :roll:

Cheers,

Brendan

Schol-R-LEA · **Posted:** Sun Feb 12, 2017 9:30 pm

Some points to be made:

The fact that programming overloads the term 'variable' in a way inconsistent with the mathematical meaning, yet all too often tries to read program variables as if they actually were mathematical variables, is our fault, not that of the mathematicians. We should have called them something else from the beginning, because they are really something entirely different from mathematical variables and the analogy had already broken down before Backus had finished the first FORTRAN compiler. Even when talking in terms of 'functional programming' - at least if we are trying to take it as a literal pseudo-mathematical notation rather than a useful if often over-applied design discipline - assignable symbols are radically different beasts from mathematical variables, which are, in many ways, purely relational elements (in the sense that variables only have meaning as mappings of one set of values to another) rather than algorithmic or process-oriented ones. Even things like the indices of a summation don't ever actually have values at all - while the algorithmic process of stepping through the summation to compute its value might use values corresponding to an index at a given state, the index itself is part of a statement of a stateless mathematical assertion, not an assignable value that can be incremented. We chose poorly in borrowing a word that didn't really fit.
Mathematical variables and constants are defined on the whims of the mathematician using them - in the past often in a startlingly informal manner - and the notations (plural) used in mathematics are notoriously fluid and non-standard. This was even more true before Hilbert and Co. tried to beat consistency into the other mathematicians, but really, a 'standard mathematical symbol' is a contradiction in terms.
Mathematical notation is limited neither to the Latin alphabet nor to anything even remotely linear, except when the mathematician writing it finds it convenient to do it that way. A single equation or function definition might fill a whole page of scrawled handwriting, and have no relationship at all to an algorithm that could compute said function.
On a related note regarding scope: at least in those branches of mathematics it applies to at all (e.g., lambda calculus), one of the defining characteristics of a 'bound variable' versus a 'free variable' is that you can rename a bound variable without changing its meaning so long as you do so everywhere it is used. This is a common approach to removing ambiguity when trying to figure out a bunch of curried functions. In other words, the variables themselves can vary by context.
Scoping rules tend to be fast and loose when the 'interpreter' is the human brain - there's a lot of 'you should be able to tell what I mean here' reliant on the fact that the reader can tell that the i used here isn't the same i as the one used two pages back.
While some mathematicians have used multi-glyph names for variables (and somewhat more frequently, functions, such as the trig relations), doing so runs smack into one of the few mostly consistent rules in mathematics - that placing two variables next to each other, regardless of whitespace, implies either multiplication or functional application, depending on the context. If ρπ can be either the name of a single variable, or two variables meant to be multiplied or, even more likely, the functional application of a correlation coefficient (which can easily be a function - in the mathematical sense) to the constant π, then there's a lot of reason not to use multiple character variable names - all of those interpretations are valid, but only the last is likely to be clear.

Brendan · **Posted:** Sun Feb 12, 2017 11:04 pm

Hi,

Schol-R-LEA wrote:

The fact that programming overloads the term 'variable' in a way inconsistent with the mathematical meaning, yet all too often tries to read program variables as if they actually were mathematical variables, is our fault, not that of the mathematicians. We should have called them something else from the beginning, because they are really something entirely different from mathematical variables and the analogy had already broken down before Backus had finished the first FORTRAN compiler. Even when talking in terms of 'functional programming' - at least if we are trying to take it as a literal pseudo-mathematical notation rather than a useful if often over-applied design discipline - assignable symbols are radically different beasts from mathematical variables, which are, in many ways, purely relational elements (in the sense that variables only have meaning as mappings of one set of values to another) rather than algorithmic or process-oriented ones. Even things like the indices of a summation don't ever actually have values at all - while the algorithmic process of stepping through the summation to compute its value might use values corresponding to an index at a given state, the index itself is part of a statement of a stateless mathematical assertion, not an assignable value that can be incremented. We chose poorly in borrowing a word that didn't really fit.

Sure; in mathematics "variables" typically can't be varied (not in an "x = x + 1" way), and yet somehow it's programmers that got it wrong despite being able to distinguish between "constants" (that don't vary) and "variables" (that do vary).

Schol-R-LEA wrote:

Mathematical variables and constants are defined on the whims of the mathematician using them - in the past often in a startlingly informal manner - and the notations (plural) used in mathematics are notoriously fluid and non-standard. This was even more true before Hilbert and Co. tried to beat consistency into the other mathematicians, but really, a 'standard mathematical symbol' is a contradiction in terms.

I still consider mathematics to be a language, even though it might or might not be more technically correct to call it a group of languages (in the same way that "assembly language" applies to many different dialects of assembly). It's not a strictly defined language, but few languages are strictly defined (it's only really languages that need to be understood by machines, like programming languages, that are strictly defined; and even then they still evolve via non-standard extensions and standardised versions).

Schol-R-LEA wrote:

While some mathematicians have used multi-glyph names for variables (and somewhat more frequently, functions, such as the trig relations), doing so runs smack into one of the few mostly consistent rules in mathematics - that placing two variables next to each other, regardless of whitespace, implies either multiplication or functional application, depending on the context. If ρπ can be either the name of a single variable, or two variables meant to be multiplied or, even more likely, the functional application of a correlation coefficient (which can easily be a function - in the mathematical sense) to the constant π, then there's a lot of reason not to use multiple character variable names - all of those interpretations are valid, but only the last is likely to be clear.

There are existing symbols that could be used (and would be understood) for multiplication if mathematicians tried to correct the stupidity that's built into the design of their language/s. For example, "area" could unambiguously be a single (4-letter) name, while "a.r.e.a" and "a×r×e×a" (and whatever else these inconsistent fools may use for a multiplication sign) denotes multiplication of four single-letter variables.

Cheers,

Brendan

MichaelFarthing · **Posted:** Mon Feb 13, 2017 2:59 am

Brendan wrote:

For example, "area" could unambiguously be a single (4-letter) name, while "a.r.e.a" and "a×r×e×a" (and whatever else these inconsistent fools may use for a multiplication sign) denotes multiplication of four single-letter variables.

That's rich coming from a representative of a community that has added to that initial ambiguity by adopting yet another symbol for multiplication (*) :-)

Sik · **Joined:** Wed Aug 17, 2016 4:55 am **Posts:** 251

For the record, the whole issue of multiletter variables in maths could be worked around by using cursive (where letters are always joined together), then you'd know if two letters are separate variables or not by whether the letters themselves are separated. Heck, it already has multiletter stuff for some operators (e.g. sin).

That's the least of the problems with math though, especially since the meaning of each letter is usually explained in a paragraph next to the equation. The bigger problem is just how horribly complex they can get with all those symbols (while programming normally only uses the most basic operations, using actual words for more complex ones). That certainly can make any even remotely complex equation a pain to read unless you're a mathematician (and possibly even if you are).

Brendan wrote:

Displaying numbers properly requires internationalisation. Programmers have been "perverted" by prolonged use of crappy programming languages and crappy standard libraries, which have traditionally failed to accept properly formatted numbers as input (in the source code itself, and in functions like "atoi()" used at run-time) and failed to convert numbers to strings (for display, etc) properly.

As usual, this is a problem harder than it sounds, since numbers meant for computers (e.g. stored in a file) are different than numbers to be shown on screen. And when it comes to separators, different regions have different requirements (not just . vs , there's also other symbols, or how many digits to group, or how to show the sign, or even which characters are to be used for digits, etc.). And that's ignoring stuff like padding and such, or the fact your program may be using a different language than the system's (e.g. because it wasn't translated into it yet), so you may need to explicitly set the formatting locale.

All this really should just be part of a formatting string.

...incidentally, the standard C library already handles all this (look up the locale stuff, most of the lconv structure is dedicated to number formatting in fact). It's just that nobody bothers to use it.
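For reference, the same `lconv` fields are exposed by Python's `locale.localeconv()`, which makes them easy to inspect. The sketch below pins the "C" locale so the output is predictable; other locale names depend on what the system has installed, so none is assumed here.

```python
import locale

# The "C" locale is always available and gives fixed, minimal formatting.
locale.setlocale(locale.LC_ALL, "C")
conv = locale.localeconv()

# Number-formatting fields taken from the C library's lconv structure.
print(repr(conv["decimal_point"]))   # prints '.'
print(repr(conv["thousands_sep"]))   # prints ''
print(conv["grouping"])              # digit group sizes for this locale
```

Switching the locale with `locale.setlocale` changes these fields, and `locale.format_string("%d", n, grouping=True)` applies them when formatting.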

Brendan wrote:

It's relatively easy to split a number into groups using "number % 1000" and "number / 1000";

Eh, it depends on how long the number is. I suppose that for your average number it's OK, but if the number is huge then it may be better to just count characters as you generate the string. Either one works, I guess (use whichever is easier for the case).

MichaelFarthing wrote:

That's rich coming from a representative of a community that has added to that initial ambiguity by adopting yet another symbol for multiplication (*) :-)

Blame ASCII for that. And many early programming languages did use either × or · instead.

Kind of wishing some of the ASCII control codes were just more glyphs. Turning FS/GS/RS/US into ↑↓←→ would have been great (and we could be using ← instead of = for assignment, yes I know other languages use := but it looks awkward).

alexfru · **Joined:** Tue Mar 04, 2014 5:27 am **Posts:** 1108

Sik wrote:

For the record, the whole issue of multiletter variables in maths could be worked around by using cursive (where letters are always joined together), then you'd know if two letters are separate variables or not by whether the letters themselves are separated. Heck, it already has multiletter stuff for some operators (e.g. sin).

Cursive is cursed!

Haven't used it for 20+ years. Left it at school.

Brendan · **Posted:** Mon Feb 13, 2017 4:15 am

Hi,

Sik wrote:

Brendan wrote:

Displaying numbers properly requires internationalisation. Programmers have been "perverted" by prolonged use of crappy programming languages and crappy standard libraries, which have traditionally failed to accept properly formatted numbers as input (in the source code itself, and in functions like "atoi()" used at run-time) and failed to convert numbers to strings (for display, etc) properly.

As usual, this is a problem harder than it sounds, since numbers meant for computers (e.g. stored in a file) are different than numbers to be shown on screen.

If it's meant for computers it becomes easy - use binary instead of text. If you must use text, then go "concise and precise" with hexadecimal (e.g. maybe "0x1234567 >> 16").

Sik wrote:

And when it comes to separators, different regions have different requirements (not just . vs , there's also other symbols, or how many digits to group, or how to show the sign, or even which characters are to be used for digits, etc.). And that's ignoring stuff like padding and such, or the fact your program may be using a different language than the system's (e.g. because it wasn't translated into it yet), so you may need to explicitly set the formatting locale.

All this really should just be part of a formatting string.

Yes; multiple options (even using metric prefixes if you like); but the "split into groups" concept helps with most of them.

Sik wrote:

...incidentally, the standard C library already handles all this (look up the locale stuff, most of the lconv structure is dedicated to number formatting in fact). It's just that nobody bothers to use it.

I'm not sure that it's possible for someone to care enough about Internationalisation to be willing to use what C provides, while also caring so little about Internationalisation that they're willing to use what C provides.

Sik wrote:

MichaelFarthing wrote:

That's rich coming from a representative of a community that has added to that initial ambiguity by adopting yet another symbol for multiplication (*) :-)

Blame ASCII for that. And many early programming languages did use either × or · instead.

Kind of wishing some of the ASCII control codes were just more glyphs. Turning FS/GS/RS/US into ↑↓←→ would have been great (and we could be using ← instead of = for assignment, yes I know other languages use := but it looks awkward).

I'd be tempted to blame ASCII for "lower case x" (e.g. "1920x1200") too.

Cheers,

Brendan
