Building a Tiny Interpreter for a Custom Programming Language
source: chatgpt.com

Have you ever wondered how programming languages actually run your code? Broadly, there are two approaches: compilers and interpreters. In this article, we’ll first look at how the two differ, and then focus on interpreters by walking through the design of a minimal custom programming language.

Compiler vs Interpreter

  • Compiler: A compiler translates the entire source code into machine code (an executable file) before execution. Once compiled, the program can be run as many times as needed without recompilation. This usually results in faster execution at runtime, because the heavy translation work is done beforehand. Examples of languages that are typically compiled: C, C++, Rust, Go.
  • Interpreter: An interpreter, on the other hand, works step by step. It reads the program statement by statement, translates each one on the spot, and executes it immediately. No separate binary is produced; every time you run the program, the interpreter has to process the source code again. This shortens the edit-and-run cycle and lets errors be reported against the statement being executed, but it is often slower than compiled code. Examples of languages that are typically interpreted: Python, JavaScript, Ruby (although their mainstream implementations also compile to bytecode or machine code internally).

So, in short:

  • Compiler = translate first, run later.
  • Interpreter = translate and run immediately.

Scope of Our Custom Language

For simplicity, let’s design a minimal toy programming language with a very narrow scope. This helps us learn the principles of interpreters without getting lost in complexity. Here are the rules of our language:

  1. Every program must start with { and end with }.
  2. Variables are declared with &name;.
  3. Variables hold only integer values.
  4. To access a variable, we use @name.
  5. Supported operators: + - * / and = (assignment).
  6. Input is done using >>@var;.
  7. Output is done using <<@var;.
  8. Printing text is done using #. Everything after the # up to the end of the line is printed verbatim.
  9. The language is case-sensitive and space-sensitive where necessary.

Sample Program

Here’s a small program written in our language that converts Fahrenheit to Celsius:

{
    &f;
    &c;
    #Enter temperature in Fahrenheit:
    >>@f;
    @c=@f-32;
    @c=@c*5;
    @c=@c/9;
    #Temperature in Celsius is:
    <<@c;
}

Explanation:

  • Declare two variables, f and c.
  • Prompt the user and read the Fahrenheit value into f.
  • Convert Fahrenheit to Celsius, one arithmetic operation per statement.
  • Print the result.
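
To make the arithmetic concrete, here is a minimal C sketch of the three assignment statements. It assumes that variables behave like C long values and that / truncates toward zero, as C integer division does. The function name fahrenheit_to_celsius is hypothetical and only mirrors the sample program.

```c
/* Mirrors the three assignment statements of the sample program.
 * Assumption: values are C longs and '/' is C integer division. */
long fahrenheit_to_celsius(long f)
{
    long c = f - 32;  /* @c=@f-32; */
    c = c * 5;        /* @c=@c*5;  */
    c = c / 9;        /* @c=@c/9;  the fractional part is dropped */
    return c;
}
```

For an input of 100 the steps are 100 - 32 = 68, 68 * 5 = 340, and 340 / 9 = 37 (the fraction is dropped). Multiplying before dividing matters here: computing 5 / 9 first would give 0 in integer arithmetic.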

How Our Interpreter Works

Our interpreter (written in C) is modularised into different parts:

Lexing & Parsing

  • The interpreter scans the source code character by character.
  • It recognises tokens like @var, &var;, >>, <<, #, numbers, and operators.
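
The character-by-character scan described above could look roughly like the following C sketch. It is not the project's actual lexer: the Token type, the TOK_* names, and next_token are hypothetical, and the sketch assumes that names consist of letters, digits, and underscores and that ; is emitted as its own token.

```c
#include <ctype.h>
#include <stdlib.h>

typedef enum {
    TOK_LBRACE, TOK_RBRACE, TOK_DECL, TOK_VAR, TOK_INPUT, TOK_OUTPUT,
    TOK_TEXT, TOK_NUMBER, TOK_OP, TOK_SEMI, TOK_END, TOK_ERROR
} TokenType;

typedef struct {
    TokenType type;
    char text[64];   /* variable name, operator, or text after '#' */
    long number;     /* value when type == TOK_NUMBER */
} Token;

/* Copy a name into tok->text; names longer than the buffer are cut off. */
static void read_name(const char **p, Token *tok)
{
    size_t n = 0;
    while ((isalnum((unsigned char)**p) || **p == '_') && n < sizeof tok->text - 1)
        tok->text[n++] = *(*p)++;
    tok->text[n] = '\0';
}

/* Return the next token and advance *p past it. */
Token next_token(const char **p)
{
    Token tok = { TOK_ERROR, "", 0 };

    while (isspace((unsigned char)**p)) (*p)++;   /* skip blanks and newlines */

    char c = **p;
    if (c == '\0') { tok.type = TOK_END; return tok; }

    if (isdigit((unsigned char)c)) {              /* integer literal */
        char *end;
        tok.number = strtol(*p, &end, 10);
        *p = end;
        tok.type = TOK_NUMBER;
        return tok;
    }
    if ((c == '>' || c == '<') && (*p)[1] == c) { /* '>>' or '<<' */
        tok.type = (c == '>') ? TOK_INPUT : TOK_OUTPUT;
        *p += 2;
        return tok;
    }

    (*p)++;   /* every remaining token starts with one marker character */
    switch (c) {
    case '{': tok.type = TOK_LBRACE; break;
    case '}': tok.type = TOK_RBRACE; break;
    case ';': tok.type = TOK_SEMI;   break;
    case '&': tok.type = TOK_DECL; read_name(p, &tok); break;
    case '@': tok.type = TOK_VAR;  read_name(p, &tok); break;
    case '#': {                                   /* text runs to end of line */
        size_t n = 0;
        while (**p != '\0' && **p != '\n') {
            if (n < sizeof tok.text - 1) tok.text[n++] = **p;
            (*p)++;
        }
        tok.text[n] = '\0';
        tok.type = TOK_TEXT;
        break;
    }
    case '+': case '-': case '*': case '/': case '=':
        tok.type = TOK_OP; tok.text[0] = c; tok.text[1] = '\0'; break;
    default: break;                               /* unknown character: TOK_ERROR */
    }
    return tok;
}
```

Calling next_token repeatedly on the statement @c=@f-32; would yield a variable token (c), the operator =, a variable token (f), the operator -, the number 32, and a semicolon.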

Execution

  • Variable declarations are stored in a simple global array.
  • Arithmetic expressions are parsed and evaluated; each expression is either a single operand or two operands joined by one operator.
  • Input (>>) and output (<<) are handled with standard I/O (scanf and printf).
  • # prints plain text.
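
As an illustration of the "simple global array" approach to variables, here is a hedged C sketch. The helper names find_var and declare_var, the var_count counter, and the two size limits are assumptions made for this example and may differ from what utils.c actually does.

```c
#include <string.h>

#define MAX_VARS 64   /* assumed limit, chosen for illustration */
#define NAME_LEN 32   /* assumed limit, chosen for illustration */

typedef struct {
    char name[NAME_LEN];
    long value;
} Var;

static Var vars[MAX_VARS];   /* global variable storage */
static int var_count = 0;    /* number of slots in use */

/* Return the slot for `name`, or NULL if it was never declared. */
Var *find_var(const char *name)
{
    for (int i = 0; i < var_count; i++)
        if (strcmp(vars[i].name, name) == 0)
            return &vars[i];
    return NULL;
}

/* Declare a variable initialised to 0.
 * Returns 0 on success, -1 if the table is full, the name is too
 * long, or the name is already declared. */
int declare_var(const char *name)
{
    if (var_count >= MAX_VARS || strlen(name) >= NAME_LEN || find_var(name))
        return -1;
    strcpy(vars[var_count].name, name);
    vars[var_count].value = 0;
    var_count++;
    return 0;
}
```

A linear search is enough here because a toy program only has a handful of variables; a hash table would only pay off with far more names.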

We organised the project into multiple files for clarity: interpreter.c (entry point), parser.c (program execution), expression.c (arithmetic evaluation), utils.c (variable management and file I/O), and interpreter.h (shared definitions). This modular structure keeps the code clean and easy to extend. Here is the pseudocode of how it works:

Interpreter

typedef struct {
    char name[NAME_LEN];
    long value;
} Var;

Var vars[MAX_VARS]; // Variable storage


function execute_code(program):

    check program starts with '{' and ends with '}'

    for each line inside the braces:
        skip whitespace

        if line starts with '#':
            print the text after '#'

        else if line starts with '&':
            declare a variable with value 0

        else if line starts with '>>':
            read input from user and store in given variable

        else if line starts with '<<':
            print value of given variable

        else if line starts with '@' and contains '=':
            evaluate the expression on the right (see Expression Evaluation below)
            assign the result to the variable on the left

        else:
            show error "Invalid statement"
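
The dispatch in the pseudocode above can be sketched in C as two small helpers. This is a sketch under assumptions rather than the code in parser.c: the StmtKind enum and the names has_program_braces and classify_statement are hypothetical, and each statement is assumed to arrive as its own line.

```c
#include <ctype.h>
#include <string.h>

typedef enum {
    STMT_TEXT, STMT_DECLARE, STMT_INPUT, STMT_OUTPUT, STMT_ASSIGN, STMT_INVALID
} StmtKind;

/* Return 1 if the program, ignoring surrounding whitespace,
 * starts with '{' and ends with '}'; otherwise return 0. */
int has_program_braces(const char *program)
{
    const char *first = program;
    while (isspace((unsigned char)*first)) first++;
    const char *last = program + strlen(program);
    while (last > first && isspace((unsigned char)last[-1])) last--;
    return last - first >= 2 && first[0] == '{' && last[-1] == '}';
}

/* Decide what kind of statement a line holds from its first characters. */
StmtKind classify_statement(const char *line)
{
    while (isspace((unsigned char)*line)) line++;       /* skip whitespace */

    if (line[0] == '#') return STMT_TEXT;               /* print text */
    if (line[0] == '&') return STMT_DECLARE;            /* declare variable */
    if (strncmp(line, ">>", 2) == 0) return STMT_INPUT; /* read input */
    if (strncmp(line, "<<", 2) == 0) return STMT_OUTPUT;/* print variable */
    if (line[0] == '@' && strchr(line, '=') != NULL)
        return STMT_ASSIGN;                             /* assignment */
    return STMT_INVALID;                                /* "Invalid statement" */
}
```

An execution loop would then switch on the returned StmtKind and call the matching handler for each line between the braces.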
        

Expression Evaluation

function evaluate_expression(ptr):
    left = parse_operand(ptr)

    skip_spaces(ptr)

    if next character is + or - or * or /:
        op = read_operator()
        right = parse_operand(ptr)

        if op == '+': return left + right
        if op == '-': return left - right
        if op == '*': return left * right
        if op == '/':
            if right == 0: error("division by zero")
            return left / right

    else:
        // just a single value, no operator
        return left
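
For readers who prefer real code, the evaluator above might be written in C roughly as follows. This is a sketch, not the contents of expression.c: the signature, the error convention (0 on success, -1 on failure), and passing the variable table as a parameter are assumptions made to keep the example self-contained. Negative literals such as -5 are not handled.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char name[32];
    long value;
} Var;

static void skip_spaces(const char **p)
{
    while (isspace((unsigned char)**p)) (*p)++;
}

/* Parse one operand: either @name or a decimal integer.
 * Returns 0 on success, -1 on a malformed or undeclared operand. */
static int parse_operand(const char **p, const Var *vars, int nvars, long *out)
{
    skip_spaces(p);
    if (**p == '@') {
        char name[32];
        size_t n = 0;
        (*p)++;
        while ((isalnum((unsigned char)**p) || **p == '_') && n < sizeof name - 1)
            name[n++] = *(*p)++;
        name[n] = '\0';
        for (int i = 0; i < nvars; i++)
            if (strcmp(vars[i].name, name) == 0) { *out = vars[i].value; return 0; }
        return -1;                       /* variable was never declared */
    }
    if (isdigit((unsigned char)**p)) {
        char *end;
        *out = strtol(*p, &end, 10);
        *p = end;
        return 0;
    }
    return -1;
}

/* Evaluate "operand" or "operand op operand"; anything after that is ignored. */
int evaluate_expression(const char *expr, const Var *vars, int nvars, long *result)
{
    long left, right;
    if (parse_operand(&expr, vars, nvars, &left) != 0) return -1;

    skip_spaces(&expr);
    char op = *expr;
    if (op != '+' && op != '-' && op != '*' && op != '/') {
        *result = left;                  /* just a single value, no operator */
        return 0;
    }
    expr++;                              /* consume the operator */
    if (parse_operand(&expr, vars, nvars, &right) != 0) return -1;

    switch (op) {
    case '+': *result = left + right; break;
    case '-': *result = left - right; break;
    case '*': *result = left * right; break;
    default:                             /* '/' */
        if (right == 0) return -1;       /* division by zero */
        *result = left / right;
        break;
    }
    return 0;
}
```

Because only one operator is read, a longer formula has to be split across several assignment statements, which is exactly what the sample program does.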
        

Room to Grow

Currently, our toy language supports variables, arithmetic, input/output, and text printing. Future extensions could include:

  • Conditionals: if/else for branching logic.
  • Loops: while and for, to repeat instructions.
  • Functions: encapsulate reusable code.

Our toy language may be simple, but it demonstrates the core building blocks of interpreters: lexing, parsing, and execution. Starting small gives you the foundation to explore more advanced features down the road.

Congrats, we’ve just built and understood the core of a custom interpreter. Check out the code here: https://github.com/Harsh-git98/Void-Pointer/tree/main/Interpreter. The program may have bugs in some features; if you find any, please raise an issue on my GitHub.

References:

  • https://www.geeksforgeeks.org/compiler-design/difference-between-compiler-and-interpreter/
  • https://app.codecrafters.io/courses/interpreter/overview

More articles by Harsh Ranjan