Skip to content

toy compiler for a custom stack oriented language similar to OCaml

Notifications You must be signed in to change notification settings

Fadeleke57/ocaml_compiler

Repository files navigation

Project Overview

The project aims to compile a high-level, OCaml-like syntax into a stack-based intermediate representation, which can then be interpreted. It includes a lexer, parser, desugarer, translator, and a stack-based interpreter.


Directory Structure

  • ocaml_compiler/: The root directory of the project.
    • README.md: A brief introduction to the compiler.
    • base_interp.ml: An OCaml script for interpreting stack programs from standard input.
    • compile.ml: An OCaml script for compiling high-level programs from standard input into stack programs.
    • example-XX-compiled.my, fib-compiled.my, is-perfect-compiled.my, taxi-cab-compiled.my: These are likely the compiled output of the .myml example files. The [Non-text file] indicates they are not meant to be directly readable as text.
    • interp_03.ml: This file contains the core logic for parsing high-level OCaml-like syntax, defining the intermediate stack-based language, and providing an interpreter for it. It also includes the desugaring and translation logic from the high-level to the stack-based language, as well as a serialization function for the stack programs.
    • examples/: This directory holds example programs written in the toy OCaml-like language (.myml files) and some pre-compiled stack programs (.my files).
      • example-00.myml to example-04.myml: Various small example programs demonstrating different language features like function definitions, arithmetic, and boolean operations.
      • fib.myml: An example implementing the Fibonacci sequence.
      • is-perfect.myml: An example to check for perfect numbers.
      • simple.my: A pre-compiled stack program demonstrating basic stack operations and function calls.
      • sqdist.my: A pre-compiled stack program to calculate squared distance.
      • taxi-cab.myml: An example related to the taxi-cab number problem.

Core Files and Their Functionality

  • README.md: A placeholder markdown file.

  • base_interp.ml: This script reads a stack-based program from standard input, parses it using interp_03.ml's interp function, and then prints the trace of the program's execution.

  • compile.ml: This script reads a high-level program from standard input, compiles it using interp_03.ml's compile function, and prints the resulting stack program.

  • interp_03.ml: This is the most substantial file, containing:

    • Utility functions: Generic OCaml utilities for list manipulation, string conversion, and character checking.
    • Parser combinators: A set of functions to build parsers for context-free grammars (e.g., satisfy, char, str, map, seq, many, alt, bind).
    • High-Level Syntax Parsing: Defines the abstract syntax tree (AST) for the toy OCaml-like language (uop, bop, expr, top_prog) and implements parsers for expressions, function definitions, let bindings, if-then-else statements, and tracing.
    • Stack-Based Language: Defines the intermediate representation as a list of commands (const, value, bindings, stack_prog, command) and includes a parser for this stack language.
    • Evaluation: Implements an interpreter (eval_step, eval_stack_prog) for the stack-based language, which simulates a stack machine.
    • Project 3 Specifics (Compiler Logic):
      • expr_to_lexpr: Converts a high-level expression into a simpler, "lowered" expression (lexpr) by handling multi-argument functions.
      • desugar_fun_defs: Desugars a list of top-level function definitions into a single lexpr.
      • desugar: The main desugaring function for the top-level program.
      • translate: This is the core compilation function that takes a lexpr (desugared high-level expression) and translates it into a stack_prog (a list of stack commands). This function handles various operators, control flow, function applications, and variable lookups/assignments.
      • assign_name: A helper function to transform OCaml identifiers into a format suitable for the stack-based language.
      • serialize: Converts a stack_prog into a human-readable string representation, including indentation.
      • compile: The top-level compilation function that orchestrates parsing, desugaring, translation, and serialization.

How it Works (High-Level)

  1. Parsing: The parse_top_prog function in interp_03.ml takes the high-level OCaml-like code and converts it into an abstract syntax tree (top_prog).
  2. Desugaring: The desugar function (which calls expr_to_lexpr and desugar_fun_defs) transforms the initial AST into a simpler lexpr representation. This step typically flattens constructs like multi-argument functions into nested single-argument functions.
  3. Translation: The translate function converts the lexpr into a sequence of stack machine commands. This is where the core compilation happens, mapping high-level operations to low-level stack instructions.
  4. Serialization (for output): The serialize function takes the generated stack_prog and pretty-prints it into a string format that can be saved or displayed.
  5. Interpretation: The eval_stack_prog function executes the generated stack commands, simulating a stack machine and producing a trace of print statements.

About

toy compiler for a custom stack oriented language similar to OCaml

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •  

Languages