OSDev.org

The Place to Start for Operating System Developers
It is currently Sat Jul 20, 2019 2:19 pm

All times are UTC - 6 hours




Post new topic Reply to topic  [ 14 posts ] 
Author Message
 Post subject: How to make a compiler
PostPosted: Thu Mar 28, 2019 5:25 pm 
Offline
Member
Member
User avatar

Joined: Sat Oct 16, 2010 3:38 pm
Posts: 635
It's literally just lots of graph problems; graph coloring, vertex reachability, etc. Prove me wrong.

_________________
Glidix: An x86_64 POSIX-compliant operating system, aiming to be as optimized as possible, especially in graphics.
https://glidix.madd-games.org/


Top
 Profile  
 
 Post subject: Re: How to make a compiler
PostPosted: Thu Mar 28, 2019 5:31 pm 
Offline
Member
Member
User avatar

Joined: Fri Oct 27, 2006 9:42 am
Posts: 1461
Location: Athens, GA, USA
Actually, in terms of the time spent, the dominant part of most compilers is the lexical analyzer - which in most cases, amounts to a Deterministic Finite State Automaton, whether formally constructed or as an ad-hoc equivalent. Either way, I would have a lot of trouble seeing that as a graph coloring problem.

_________________
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
μή εἶναι βασιλικήν ἀτραπόν ἐπί γεωμετρίαν
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.


Top
 Profile  
 
 Post subject: Re: How to make a compiler
PostPosted: Thu Mar 28, 2019 6:50 pm 
Offline
Member
Member

Joined: Tue Mar 04, 2014 5:27 am
Posts: 965
mariuszp wrote:
It's literally just lots of graph problems; graph coloring, vertex reachability, etc.

Only if your compiler is an optimizing one.


Top
 Profile  
 
 Post subject: Re: How to make a compiler
PostPosted: Fri Mar 29, 2019 12:15 am 
Offline
Member
Member

Joined: Thu May 17, 2007 1:27 pm
Posts: 590
Schol-R-LEA wrote:
Actually, in terms of the time spent, the dominant part of most compilers is the lexical analyzer - which in most cases, amounts to a Deterministic Finite State Automaton, whether formally constructed or as an ad-hoc equivalent. Either way, I would have a lot of trouble seeing that as a graph coloring problem.

That's only true if you either use an existing backend or don't care about optimization. Lexical analysis is solved and there are so many algorithms to do it easily: recursive descend, PEG, LR parsers generators, or even Earley parsers that can handle any CFG.

mariuszp wrote:
It's literally just lots of graph problems; graph coloring, vertex reachability, etc. Prove me wrong.

A LLVM-style Greedy RA is superior to graph coloring RAs in practice, prove me wrong. Graph coloring just cannot easily handle non-homogeneuos register files (and of course, it's also quite expensive). When I wrote an implementation (only a bit over 1k SLOC, can do live range splitting but not register spilling right now) inspired by the LLVM one, I was surprised (i) how natural it is, and (ii) how good the results are if you tune it a bit (for smallish functions with only a few basic blocks/variables, the result is often optimal). It's incredibly easy to tune -- when you see a missed optimization, it's usually straightforward to see how to extend the RA to handle it.

_________________
managarm: A microkernel-based OS that is capable of running a Wayland desktop


Top
 Profile  
 
 Post subject: Re: How to make a compiler
PostPosted: Fri Mar 29, 2019 9:08 am 
Offline
Member
Member
User avatar

Joined: Sat Oct 16, 2010 3:38 pm
Posts: 635
Schol-R-LEA wrote:
Actually, in terms of the time spent, the dominant part of most compilers is the lexical analyzer - which in most cases, amounts to a Deterministic Finite State Automaton, whether formally constructed or as an ad-hoc equivalent. Either way, I would have a lot of trouble seeing that as a graph coloring problem.


All problems in NP have a polynomial-time reduction to graph coloring.

_________________
Glidix: An x86_64 POSIX-compliant operating system, aiming to be as optimized as possible, especially in graphics.
https://glidix.madd-games.org/


Top
 Profile  
 
 Post subject: Re: How to make a compiler
PostPosted: Fri Mar 29, 2019 9:48 am 
Offline
Member
Member
User avatar

Joined: Fri Oct 27, 2006 9:42 am
Posts: 1461
Location: Athens, GA, USA
mariuszp wrote:
Schol-R-LEA wrote:
Actually, in terms of the time spent, the dominant part of most compilers is the lexical analyzer - which in most cases, amounts to a Deterministic Finite State Automaton, whether formally constructed or as an ad-hoc equivalent. Either way, I would have a lot of trouble seeing that as a graph coloring problem.


All problems in NP have a polynomial-time reduction to graph coloring.


facepalm I can't believe I fell for that. Yeah, that can describe more or less any practical algorithm, can't it (as P is a subset of NP).

_________________
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
μή εἶναι βασιλικήν ἀτραπόν ἐπί γεωμετρίαν
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.


Top
 Profile  
 
 Post subject: Re: How to make a compiler
PostPosted: Fri Mar 29, 2019 12:04 pm 
Offline
Member
Member

Joined: Thu May 17, 2007 1:27 pm
Posts: 590
Doesn't hold when you want to parse C++. (And context-sensitivity gives you a PSPACE-complete problem.)

_________________
managarm: A microkernel-based OS that is capable of running a Wayland desktop


Top
 Profile  
 
 Post subject: Re: How to make a compiler
PostPosted: Fri Mar 29, 2019 4:56 pm 
Offline
Member
Member
User avatar

Joined: Sat Oct 16, 2010 3:38 pm
Posts: 635
There is probably a pretty obvious reason why P is a strict subset of NP, which nobody thought of, because it's "too obvious".

_________________
Glidix: An x86_64 POSIX-compliant operating system, aiming to be as optimized as possible, especially in graphics.
https://glidix.madd-games.org/


Top
 Profile  
 
 Post subject: Re: How to make a compiler
PostPosted: Sat Jun 01, 2019 8:16 am 
Offline

Joined: Tue Jan 02, 2018 12:53 am
Posts: 19
Location: Australia
Schol-R-LEA wrote:
Actually, in terms of the time spent, the dominant part of most compilers is the lexical analyzer - which in most cases, amounts to a Deterministic Finite State Automaton, whether formally constructed or as an ad-hoc equivalent. Either way, I would have a lot of trouble seeing that as a graph coloring problem.

The lexer is actually the simplest and least time consuming part of writing a compiler. There's also pretty much only two ways to design a lexer; one which transforms the entire source file into a token array in a single step which is then handed off to the parser, or one which generates tokens for the parser on the fly. I personally prefer the former design for reasons of performance and code simplicity.


Top
 Profile  
 
 Post subject: Re: How to make a compiler
PostPosted: Sat Jun 01, 2019 4:54 pm 
Offline
Member
Member
User avatar

Joined: Fri Oct 27, 2006 9:42 am
Posts: 1461
Location: Athens, GA, USA
Qbyte wrote:
Schol-R-LEA wrote:
Actually, in terms of the time spent, the dominant part of most compilers is the lexical analyzer - which in most cases, amounts to a Deterministic Finite State Automaton, whether formally constructed or as an ad-hoc equivalent. Either way, I would have a lot of trouble seeing that as a graph coloring problem.

The lexer is actually the simplest and least time consuming part of writing a compiler. There's also pretty much only two ways to design a lexer; one which transforms the entire source file into a token array in a single step which is then handed off to the parser, or one which generates tokens for the parser on the fly. I personally prefer the former design for reasons of performance and code simplicity.

Sorry if I wasn't clear; I wasn't referring to the implementation time, but the running time during a compile. Depending on the number of potential transitions, and the amount of optimization the compiler performs, lexical analysis can consume up to 80% of the running time (the textbook Modern Compiler Design by Grune, et. al., explains why in detail).

Writing a lexical analyzer is pretty easy, comparatively. Writing an efficient one is another story, though as you said, it still is usually simpler and faster to implement than the other components.

_________________
Rev. First Speaker Schol-R-LEA;2 LCF ELF JAM POEE KoR KCO PPWMTF
μή εἶναι βασιλικήν ἀτραπόν ἐπί γεωμετρίαν
Lisp programmers tend to seem very odd to outsiders, just like anyone else who has had a religious experience they can't quite explain to others.


Top
 Profile  
 
 Post subject: Re: How to make a compiler
PostPosted: Sun Jun 02, 2019 11:21 pm 
Offline
Member
Member
User avatar

Joined: Thu Jun 04, 2009 11:12 pm
Posts: 253
Hi mariuszp,

With availability of lot of video courses online, you can do anything you set your mind to. You will need to invest some time and energy for it. With a combination of a book ( a good free one suggested already by others ) and online resources, you should be able to get it done. The classic text by Aho et el is another good reference. Hopefully you can find a low priced used copy online.

I think, these videos are very instructional.
https://courses.cs.washington.edu/cours ... index.html

--Thomas


Top
 Profile  
 
 Post subject: Re: How to make a compiler
PostPosted: Sun Jun 02, 2019 11:31 pm 
Offline
Member
Member
User avatar

Joined: Thu Jun 04, 2009 11:12 pm
Posts: 253
I remember enjoying reading this too.

https://holub.com/compiler/. The author has released the book and sources in public domain. Read this if you want to learn how to write a lex and yacc clone. However i did not like the design of the final compiler he came up with. There are far more modern and more modular ways to do it. But a good read for sure.

--Thomas


Top
 Profile  
 
 Post subject: Re: How to make a compiler
PostPosted: Mon Jun 03, 2019 1:44 am 
Offline
Member
Member
User avatar

Joined: Thu Nov 16, 2006 12:01 pm
Posts: 7382
Location: Germany
Boost.Spirit is (IMHO) a strong C++ alternative to the C (lex/flex, yacc/bison) way of going about things.

Upsides:

  • Integration. No additional syntaxes, no additional tools, no additional building steps.
  • Direct generation of C++ object hierarchy.
  • Modularity and re-usability of subparsers. (For example, I've created a Boost.Spirit parser for "quoted strings" that includes the usual C escapes, octal, hexadecimal, and Unicode character constants, and generates a UTF-8 std::string. And re-used it in multiple projects since.)
  • In combination with Boost iterators, can give very good error messages on failed parses with very little effort.

Downsides:

  • Compile times. (Can be alleviated significantly by modularization, see above.)
  • "Classic" C++ template error messages (when building the parser) can be daunting. (Might get better with Contracts.)

If you're willing to really tackle the learning curve, Boost.Spirit can be very rewarding, for both quick off-the-cuff oneliners or complex parse trees. But you have to dig deep to familiarize yourself with its intricacies. As with many other of my favourite tools (Vim, LaTeX, C++ in general), you won't see a payoff if you're only scratching the surface. The benefits are to be found at the deep end. ;-)

Additionally, let me mention Crafting Interpreters by Bob Nystrom as an introduction on how to make an interpreter (which isn't that different from making a compiler, really). I found that piece to be more enlightening than many more theoretical approaches, especially if you're a beginner in the art and can't wrap your head around all the related lingo yet.

_________________
Every good solution is obvious once you've found it.


Top
 Profile  
 
 Post subject: Re: How to make a compiler
PostPosted: Mon Jun 03, 2019 3:19 am 
Offline
Member
Member

Joined: Tue Mar 04, 2014 5:27 am
Posts: 965
Schol-R-LEA wrote:
Depending on the number of potential transitions, and the amount of optimization the compiler performs, lexical analysis can consume up to 80% of the running time (the textbook Modern Compiler Design by Grune, et. al., explains why in detail).

I presume, that figure of 80% corresponds to non- or nearly non-optimizing compilers.
The languages that allow writing code with exponential compile times (where the compiler needs to try this or that in order to resolve ambiguity, potentially many times nested) are probably a separate category of their own.


Top
 Profile  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 14 posts ] 

All times are UTC - 6 hours


Who is online

Users browsing this forum: No registered users and 3 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to:  
Powered by phpBB © 2000, 2002, 2005, 2007 phpBB Group