How to create a (slow) interpreter - Introduction

Published on 2019-07-09


Have you ever considered creating your own language, with its own features and syntax? If the answer is yes, maybe you should read another article where you can learn an efficient way of doing so. If you're here out of curiosity, fasten your seatbelt and grab a drink, we're going for a ride.

How I got there

Several months ago, I was working with some friends on a project named Red Pineapple, where we had to choose between these 3 options:

We eventually chose the third option, as we had the motivation and the energy to complete it in 4 to 5 months. The goal was to generate math problems from templates written by teachers in a fairly simple language.

Back then, we didn't know there were tools that could do this job from your own design choices. We were young and ignorant (we still are, by the way), and nothing is more formative than doing it from scratch (keep repeating that to reassure your manager, 100% success rate).

I took on this task myself, both to let the others handle the front-end and back-end development of our service and because I was eager to create something of my own. After some months, here is what this language was capable of:

You can see the working interpreter on GitHub: Red-Juice.

I discovered a lot during this project, and I want to take this opportunity to share it with anyone interested.

Over the course of this blog post series, we will create a small but effective interpreter for a language with a minimal syntax. It will be close to the one mentioned above, but with far fewer features, to keep the posts simple.

First step, syntax

OK, first things first: we will be working on an interpreted language, not a compiled one. The difference is simple: we don't need to produce bytecode for the computer to read. In fact, running a program will be as simple as reading it the way a human would: a cursor jumps from line to line and decides where to go next, while a memory holds the variables defined so far during execution. There will be 2 major steps:

By doing so, we don't need to deal with the ugly task of throwing and handling exceptions.
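The cursor-and-memory idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the interpreter built in this series: the instruction names (`SET`, `PRINT`, `JUMP`, `JUMP_UNLESS`), the tuple layout and the `run` function are all hypothetical, and the program is given already decoded rather than parsed from text.

```python
def run(program):
    """Execute a list of pre-decoded instructions and return the printed lines."""
    memory = {}   # variable name -> current value
    output = []   # lines produced by PRINT
    cursor = 0    # index of the instruction to execute next

    while cursor < len(program):
        op, *args = program[cursor]
        if op == "SET":            # SET name, function(memory) -> new value
            name, compute = args
            memory[name] = compute(memory)
            cursor += 1
        elif op == "PRINT":        # PRINT a template with {NAME} placeholders
            output.append(args[0].format(**memory))
            cursor += 1
        elif op == "JUMP":         # unconditional jump to an instruction index
            cursor = args[0]
        elif op == "JUMP_UNLESS":  # continue if the condition holds, else jump
            condition, target = args
            cursor = cursor + 1 if condition(memory) else target
        else:
            raise ValueError(f"unknown instruction: {op}")
    return output


# Hypothetical program: count down from 3 (the index of each instruction is
# given in the trailing comment; jumping to 5 moves past the end and stops).
countdown = [
    ("SET", "N", lambda m: 3),                 # 0
    ("JUMP_UNLESS", lambda m: m["N"] > 0, 5),  # 1
    ("PRINT", "N = {N}"),                      # 2
    ("SET", "N", lambda m: m["N"] - 1),        # 3
    ("JUMP", 1),                               # 4
]

print("\n".join(run(countdown)))
```

The only state is the `cursor` index and the `memory` dictionary, so a loop is nothing more than a jump back to an earlier index.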

Looks like it's time to make some choices. For simplicity, we will impose the following:

To summarize

The following code:

VAR A = 169
VAR B = 585
PRINT GCD({A}, {B})
VAR D = 0
WHILE A != B
    IF A > B
        VAR A = A - B
    ELSE
        VAR B = B - A
    ENDIF
    PRINT = GCD({A}, {B})
ENDWHILE
PRINT = {A}

Will output:

GCD(169, 585)
= GCD(169, 416)
= GCD(169, 247)
...
= GCD(13, 26)
= GCD(13, 13)
= 13
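To check the lines elided above, the same subtraction-based loop can be written directly in Python. This is a plain Python equivalent of the example program, not the interpreter itself; the function name `gcd_trace` is made up for this sketch, and it assumes that `{A}` and `{B}` are simply replaced by the current values when printing, as the output above suggests.

```python
def gcd_trace(a, b):
    """Return the lines the example program prints for the given start values."""
    lines = [f"GCD({a}, {b})"]
    while a != b:
        # Subtract the smaller value from the larger one, as in the IF/ELSE branch.
        if a > b:
            a = a - b
        else:
            b = b - a
        lines.append(f"= GCD({a}, {b})")
    lines.append(f"= {a}")
    return lines


for line in gcd_trace(169, 585):
    print(line)
```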

I think we're good to go.

See you next time for How to create a (slow) interpreter - Part 1: Tokens.

Klemek
Junior software engineer