Simple Scheme Part 1 – IntroductionPosted: February 19, 2011
A very early exercise for those who wish to write an interpreter is to write a Scheme interpreter in Scheme. At first this sound counter-intuitive and slightly crazy. Why would we re-write a language in itself? If we wanted to write a Scheme interpreter, why not use a language that other Scheme interpreters are written in – like C or C++?
I thought many of the same things. The idea did seem crazy at first, but there are quite a few good reasons to write a Scheme interpreter in Scheme.
- You already have a perfectly good Scheme implementation at your fingertips. This means it’s less inventing and more replacing parts of the language.
- Just replacing pieces creates a quicker feedback loop. You don’t have to spend days writing code to see results.
- Scheme is a language written to process lists. Scheme is written in lists. That makes it a natural choice for this exercise.
- Writing a Scheme interpreter in a lower level or more verbose language, such as C or C++, is time consuming.
- It is more of a learning exercise. Using Scheme keeps it simple.
- If you want to take it to the next level, you can always translate your Scheme code into a statically typed language for awesomeness.
Another question is why Scheme and not another language? Scheme is an extremely basic and simple language. Since it’s simple there is less to implement and therefore our goal is easier to achieve. According to the Wikipedia Article on Scheme you don’t need to implement a lot of things underneath the hood to get a basic but complete Scheme-like language. Since it has a powerful macro system, you can write many of the Scheme forms in Scheme itself. There are a few other languages that writing an interpreter for Scheme can be done in, such as Common Lisp, Clojure, Ruby, Python, and so forth. The further you get from a Lisp language though the more code you are going to write.
Starting With a Repl
The first thing to hack together is a simple Read Evaluate Print Loop or Repl. A Repl gives instant gratification – you type something in and the interpreter evaluates it. Here is an extremely naïve and simple version of what we’ll need.
(define (repl) (read-input) (vm-eval) (print-obj) (repl))
This is pretty straight forward. It reads the input, evaluates it, and prints the result. To evaluate input and print a result the Repl needs to capture information. Here is version 1 of the Repl algorithm.
(define (repl) (display prompt) (print-obj (vm-eval (read-input))) (repl))
This captures input from the user, evaluates it and prints it out. Lather, rinse, repeat! It’s also a lot more functional than our previous example.
The Bits and Pieces
Now to move on to the actual read, print and evaluate pieces. Right now the read is extremely simple.
(define (read-input) (read))
This code is using the underlying Scheme system. Second function and I’m already cheating. (read) will not only read input from the user, but it also parses it and returns an appropriate Scheme value. If I enter a number then (read) returns a Scheme number; (number? (read)) evaluates to #t. This is fine for now since parsing is complex. I don’t want to get into having to parse input at this point.
Here is the evaluation function.
(define (vm-eval input) (cond [(string? input) input] [(number? input) input] [(null? input) input] [(char? input) input] [(boolean? input) input] [else "Error!"]))
This isn’t much right now. It just patriots the input depending on the input’s type. Given a number, string, character etcetera it will return that same object. Symbols will drop down to the else cause. That’s fine for now. We can work on evaluating symbols in the future. It’s called (vm-eval) because (eval) is already claimed by the underlying Scheme system.
After evaluating we need to prints results.
(define (print-obj obj) (cond [(string? obj) (display (string-append doublequote-char obj doublequote-char))] [(char? obj) (display (string-append "#\\" (string obj)))] [else (display obj)]) (newline))
This has some simple beautification functionality. It will surround strings with double-quotes before printing. It will also print characters as they are entered in Scheme, preceded by #\. Note that doublequote-char is just a constant for ” in string form.
Trying it out, it will print all the simple types back when reading them in. If it encounters a symbol, a single quote, or anything with parenthesis it chokes and prints “Error!”.
> (load "repl.scm") "/home/kris/projects/lisp/repl.scm" > (repl) -: 1234 1234 -: 3-5i 3-5i -: .1234123e12 1.234123e11 -: "abc 123" "abc 123" -: #\c #\c -: #t #t -: nil "Error!" -:
And that’s how you write an extremely simple Repl in Scheme. The code can be found at GitHub tagged as Part.1.