Making a programming language Part 1 - how to start
Table of contents
- Part 1 - how to start (this article)
- Part 2 - something that kinda works
- Part 3 - adding features
- Part 4 - Hello World
- Part 5 - variables and decisions
- Part 6 - user defined functions
- Part 7a - constructors and objects
- Part 7b - using objects
- Part 8 - going faster
Source version for this post
Lately I gained some interest in programming
languages
and compilers. Those
seem like quite some daunting monsters - just consider the amount of
different features and more importantly, the vast infinity of possible
programs.So where to start? I have a course about compilers at my
college, but I have to wait another year to enroll into that. So welcome
Coursera. It’s currently
available only in self-study but that’s enough for me. I should mention
the Dragon
Book,
but I didn’t read that(yet) so I can’t comment. Compiler is basicaly
lexer -> parser -> optimiser -> code generator
.
The regex approach
I made it through introduction and lesson on lexical analysis and recognized finite automata as something from college(thank your professor!) and finnaly understood the connection from them to regex in java and the like(professor mentioned regular expressions but no-one figured out what was the connection).
Feeling empowered by the newly obtained knowledge I set out to make a lexer for my language. I had no clear specification in mind since this is supposed to be a fun and creative project…I intended to just invent as I go.
My lexer was something like that(won’t post the real code since I’m embarrassed)
- create a bunch of java regex objects that all match to the start of the string
- take the string and try to match it against all regexes
- return an identifier object corresponding to the match
- remove matched part of the string
- loop
Yeah, pretty horrible.I kinda abandoned this idea.
The recursive descent parser approach
By now I was into functional programming and scala language. I also watched lesson on recursive descent on coursera. The code was fugly, bunch of c++ with pointer arithmetic and side effects. I wan’t pretty functions :(
I considered doing a framework for this in scala or perhaps java but…
Enter scala’s parser combinators. I suggest reading this if you arent familiar. One of the reasons scala is awesome. You get parser “generation” baked into the standard library. Granted, it’s a bit slow, but who cares - this is a project for fun not for general usage. On the PLUS side you get productions in the laguage itself and parsing is almost like magic.
And in scaladoc you even get a nice code snippet. What more can a geek ask for.
Well….an AST would be nice. Fortunately scala also provides action combinators that allow to pattern match the parsers themself and map them into an AST. And it even isn’t that complicated.
I recreated the grammar from the scaladoc example and added the action combinators. The code block is kinda long, but I promise there is a short explanation at the bottom.
Notice the ^^
symbol. It sends the output from the parser combinator
into my mapping function(anonymous). And I just create the case
class.There is also an evaluator at the and…but that’s in part 2.
next -> Part 2: something that kinda works
Last modified on 2012-08-29
Previous About meNext Making a programming language Part 2 - something that kinda works