OSDev.org

The Place to Start for Operating System Developers
It is currently Fri Apr 16, 2021 12:42 pm

All times are UTC - 6 hours




Post new topic Reply to topic  [ 64 posts ]  Go to page Previous  1, 2, 3, 4, 5
Author Message
 Post subject: Re: Designs of microkernels
PostPosted: Tue Jan 19, 2021 2:17 pm 
Offline

Joined: Sat Oct 10, 2020 4:05 pm
Posts: 12
Perhaps I didn't quite make my meaning clear. My fault entirely.

As I said, a stack-based VM will use a single bytecode instruction that implicitly pops the operands and then pushes the result.
The implementation for the instruction (jit or interpreter, it doesn't matter) will, when it encounters that instruction, pop operands, execute, and then push the result.
The article linked does not make it clear that it is talking about implementation and not bytecode when it says
Quote:
four lines of instructions is needed to carry out an addition operation

Comparing this to the stated instruction for the register-based VM, which states that a single line of instruction is required, does indicate to me that the 4 lines in the stack-based version are intended as a representation of the bytecode.
This might simply be me misinterpreting the article, which would again be my fault entirely.
I do agree that the images are correct, though.

I'll also note that implementation for a register-based VM would have to carry out essentially the same steps.
It has to use the CPU stack or registers to pass arguments/values the same way.
Code:
; Takes r1, r2, r3 as add operand register indexes
vm_add:
    mov r1, [reg_struct + 8 * r1]
    mov r2, [reg_struct + 8 * r2]
    add r1, r1, r2
    mov [reg_struct + 8 * r3], r1
    ret


And your vm_add_v2 needs to push eax again after the add instruction ;)


Top
 Profile  
 
 Post subject: Re: Designs of microkernels
PostPosted: Tue Jan 19, 2021 4:54 pm 
Offline
Member
Member
User avatar

Joined: Mon Jun 05, 2006 11:00 pm
Posts: 2165
Location: USA (and Australia)
A little off-topic, but I like stack-based bytecode because it's super easy to write a compiler for (you can print out the bytecode while walking an abstract syntax tree) without doing register allocation.

Although your VM could load the bytecode, perform SSA and turn it into a register-based representation (among doing other optimizations), even if you still interpret the end results.

_________________
My OS is Perception.


Top
 Profile  
 
 Post subject: Re: Designs of microkernels
PostPosted: Wed Jan 20, 2021 11:12 am 
Offline
Member
Member
User avatar

Joined: Thu Oct 13, 2016 4:55 pm
Posts: 1464
Doctor5555 wrote:
Perhaps I didn't quite make my meaning clear. My fault entirely.
Have no worries, that's why I asked, and you explained, now I understand your confusion and I can answer :-)
Doctor5555 wrote:
The article linked does not make it clear that it is talking about implementation and not bytecode when it says
Quote:
four lines of instructions is needed to carry out an addition operation
Agreed, he should have made it clear that he meant 4 CPU instructions and not 4 VM instructions. As I've said in my previous posts, on some CPU you could optimize for less CPU instructions, but that's still performing the same 4 theoretical steps.
Doctor5555 wrote:
And your vm_add_v2 needs to push eax again after the add instruction ;)
Nope, no push needed! It has added one operand to the other already in memory (on top of the stack), meaning the result of the add placed in memory, exactly where push would have stored it should both operands were popped. That's the point of the optimization ;-) (I might have accidentally switched the operands in vmm_add_v1 though, 2nd supposed to be the result)

AndrewAPrice wrote:
A little off-topic, but I like stack-based bytecode because it's super easy to write a compiler for
Btw, I've created a toy for you to play with :-D :-D :-D (Let me know if you find any bugs, I've quickly put it together in less than a day)
https://gitlab.com/bztsrc/brainfuck

It is a Brainfuck interpreter, compiler and bytecode optimizer, with configurable optimization levels.
Code:
./bf -O0 mandelbrot.bf
Runs the unoptimizied bf through an interpreter. Therefore it's going to be slow, but still much much faster than simpleinterp.
Code:
./bf -O3 mandelbrot.bf
Will use all optimizations, and it's a lot faster than optinterp3.
Code:
./bf -O4 mandelbrot.bf -o mandelbrot
./mandelbrot
Will compile the bytecode into native code and will save an x86_64 ELF executable. This will use all the optimizations plus will remove the run-time boundary checks for further speed up. Easily outperforms optasmjit.

You can do performance measurements with the -p flag (valid for both the interpreter and the compiler, for the latter profiling code will be generated into the ELF).

One of the reasons why my implementation is faster, because it's in C and only uses arrays with pointers (in contrast the article's author is using C++ with the worst std::vector class). Also my bf parses the source only once, while the article's author goes through the source as well as on the generated tokens in the std::vector multiple times, in several iterations. My implementation has one drawback though, it has a compile time configured limitation on the maximum number of nesting allowed in the BF scripts (set to 256 ATM).

Fun thing, you can also make this compiler to generate ANSI C source, and then optimize that with 'gcc -O3':
Code:
./bf mandelbrot.bf -c mandelbrot.c
gcc -O3 mandelbrot.c -o mandelbrot
It's going to be about 4 or 5 times slower, than if you let bf optimize and generate the Assembly itself :-)

Finally, AST is evil (specially when it's implemented with generated parser code), don't use it! If you want performance, use a hand-crafted parser with a state machine. More difficult to code, granted, but a state machine performs much-much better in a compiler, and allows some optimizations that are impossible with AST. IMHO state machine based parsers are much better, but there's a big pile of subjectivity in this sentence, not an objective fact.

Cheers,
bzt


Top
 Profile  
 
 Post subject: Re: Designs of microkernels
PostPosted: Sat Jan 23, 2021 12:38 pm 
Offline
Member
Member
User avatar

Joined: Mon May 22, 2017 5:56 am
Posts: 548
-late reply-

moonchild wrote:
eekee wrote:
@AndrewAPrice: Lisp machines will never die! :mrgreen: I'm saying that because it was Lisp and Lisp machines which developed and promoted garbage collection years ago. Nothing else had it in quite the same way, I don't think. But... with your calculator example, you're talking about something which can easily be handled with manual memory allocation. When a program exits, free all its memory.


The problem is, what is ‘its’ memory? Which memory ‘belongs’ to a process, when memory can be shared? Determining that isn't simple if you want to allow cheap IPC.

(The answer is, you do some sort of GC. And that's not (very) hard, but you still have to do it.)

I see, good point. A simple reference counting GC would work. This reminds me not to be put off by the mention of garbage collection, because if you can get away with a reference-counting GC, it's really simple.

_________________
Kaph link pending. code pending. design pending. plans in a state of flux! everything pending! choice of language still up in the air! why is nothing coming together?!?


Top
 Profile  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 64 posts ]  Go to page Previous  1, 2, 3, 4, 5

All times are UTC - 6 hours


Who is online

Users browsing this forum: vvaltchev and 3 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to:  
Powered by phpBB © 2000, 2002, 2005, 2007 phpBB Group