A virtual machine for writing programs to manipulate strings

mmower/mangle

Mangle

By Matt Mower self@mattmower.com v1.0 - Released 21 Jan 2026

Mangle is a virtual machine for writing string transformation and string generation programs. It is implemented in the Elixir programming language.

Concept

Mangle is a virtual CPU designed for working with strings and substrings. Its assembly language has a simple instruction set that includes operations for matching patterns and manipulating characters. The result of a Mangle program is an output word, which may be either a variation of an input word or generated from scratch.

Example Program

Here is a Mangle program that converts a word into its Pig Latin representation:

# Pig Latin: "string" -> "ingstray"
# If starts with vowel: "apple" -> "appleway"

start:
  load w, in
  load tv, "aeiou"
  load tc, "bcdfghjklmnpqrstvwxyz"

  # Check if starts with vowel
  load c, 0
  match w, c, tv
  jmp_ok vowel_start

consonant_start:
  load c, 0
  match w, c, tc
  jmp_fail move_done

  # Move first char to end
  move w, 0, w.li
  jmp consonant_start

move_done:
  # Append "ay"
  insert w, w.len, "ay"
  jmp done

vowel_start:
  # Just add "way"
  insert w, w.len, "way"

done:
  halt
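
As a cross-check on the expected outputs, the same logic can be sketched in plain Elixir. This is not part of Mangle: the module name `PigLatinSketch` is hypothetical, and the sketch assumes lowercase ASCII input. Like the Mangle program, it treats `y` as a consonant. It also caps the rotation at one full pass, an assumption of this sketch, so that a word with no vowels still terminates.

```elixir
defmodule PigLatinSketch do
  # Same character sets as the tv and tc registers in the Mangle program.
  @vowels ~c"aeiou"
  @consonants ~c"bcdfghjklmnpqrstvwxyz"

  # Vowel-initial words get "way"; otherwise leading consonants are
  # rotated to the end and "ay" is appended.
  def convert(word) when is_binary(word) do
    chars = String.to_charlist(word)

    case chars do
      [first | _] when first in @vowels -> word <> "way"
      _ -> List.to_string(rotate(chars, length(chars))) <> "ay"
    end
  end

  # Move the first character to the end while it is a consonant.
  # `budget` caps the loop at one full rotation (assumption of this sketch).
  defp rotate([first | rest], budget) when budget > 0 and first in @consonants do
    rotate(rest ++ [first], budget - 1)
  end

  defp rotate(chars, _budget), do: chars
end
```

Calling `PigLatinSketch.convert("string")` and `PigLatinSketch.convert("apple")` reproduces the two cases named in the program's header comments.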

Syntax

A # begins a comment that continues to the end of a line.

A label is an identifier followed by a :.

All programs must start with the label start:

By convention instructions are indented under their label.

An instruction is an opcode (load, match, jmp_ok, etc…) followed by zero or more comma-separated arguments on the same line.

An argument is a register name (w, in, c, etc…), a register property (w.len), a value (0, "ay", …), or an identifier (vowel_start, …).
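
The syntax rules above can be illustrated with a small line classifier. This is a rough sketch rather than Mangle's actual parser: the module `LineSketch` and its return shapes are hypothetical, and it naively assumes that `#` and `,` never appear inside string literals.

```elixir
defmodule LineSketch do
  # Classify one source line as :blank, {:label, name},
  # or {:instruction, opcode, args}.
  def classify(line) do
    # Drop everything from the first '#' onward, then trim whitespace.
    code = line |> String.split("#", parts: 2) |> hd() |> String.trim()

    cond do
      code == "" ->
        :blank

      String.ends_with?(code, ":") ->
        {:label, String.trim_trailing(code, ":")}

      true ->
        # The opcode is the first word; the rest is a comma-separated list.
        case String.split(code, ~r/\s+/, parts: 2) do
          [opcode] ->
            {:instruction, opcode, []}

          [opcode, rest] ->
            args = rest |> String.split(",") |> Enum.map(&String.trim/1)
            {:instruction, opcode, args}
        end
    end
  end
end
```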

VM State

The virtual machine maintains the following state:

| Field | Type | Description |
| --- | --- | --- |
| chars | list(char) | Lowercase characters of the word buffer |
| vm_flag | boolean | Result flag set by comparison/match operations |
| pc | integer | Program counter |
| c | integer | Cursor register (see Registers) |
| cf | boolean | VM condition flag |
| input_word | string | Input word (via read-only in register) |
| index_registers | integer | i0-i9 |
| string_registers | string | s0-s9 |
| flag_registers | boolean | f0-f9 |
| match_registers | FSM | m0-m9 |
| char_tables | CharTable | ta-tz (table registers) |
| rand_state | opaque | Random generator state |
| comment | string | Accumulated comment messages for explain mode |
| error | string or nil | |
| call_stack | list | Return addresses for subroutines |
| exit_code | integer | Exit code set by halt (0=success, >0=failure) |
| max_stack_depth | integer | Maximum call stack depth (default: 256) |

Registers

| Register | Type | Purpose |
| --- | --- | --- |
| in | string | input word to the program (read-only) |
| w | string | current working 'word' output at the end |
| cf | integer | VM condition flag (used by conditional jmp/call) |
| c | integer | 'cursor' register auto-clamped to [0, w.len] |
| i0-i9 | integer | general purpose registers for indexes & calculations |
| s0-s9 | string | registers for storing & manipulating substrings |
| f0-f9 | boolean | general condition registers |
| m0-m9 | FSM | Finite-State Machine based string matchers |
| ta-tz | table | Character table for matching and random selection |
| r | integer | Returns a random value on read, sets the seed on write |

Properties

| Property | Description |
| --- | --- |
| w.len | Word buffer length |
| w.li | Word buffer last index |
| s0.len - s9.len | String register lengths |
| s0.li - s9.li | String register last indices |
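
The cursor clamping and the relationship between `len` and `li` can be pictured with a short Elixir sketch. The module `CursorSketch` is hypothetical and is not Mangle's implementation. It assumes 0-based indexing, so the last index is the length minus one, and it leaves the last index of an empty word undefined because the source does not say what that should be.

```elixir
defmodule CursorSketch do
  # Clamp a requested cursor position into the range [0, word_length].
  def clamp(pos, word_length) when is_integer(pos) and word_length >= 0 do
    pos |> max(0) |> min(word_length)
  end

  # Last index of a non-empty word, assuming 0-based indexing.
  def last_index(word) when byte_size(word) > 0, do: String.length(word) - 1
end
```

Note that the upper bound is `w.len` rather than `w.li`, so the cursor may sit one position past the last character. That is consistent with `insert w, w.len, "ay"` in the example program, which appends to the end of the word.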

Instruction Reference

Control Flow

| Opcode | Operands | Effect |
| --- | --- | --- |
| halt | [N] | Stop program execution (optional exit code N) |
| jmp | LABEL | Unconditional jump to label |
| jmp_ok | LABEL | Jump if flag is :ok |
| jmp_fail | LABEL | Jump if flag is :fail |
| call | LABEL | Call subroutine |
| call_ok | LABEL | Call subroutine if flag is :ok |
| call_fail | LABEL | Call subroutine if flag is :fail |
| ret | | Return from subroutine |
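
One way to picture `call` and `ret` against the `call_stack` and `max_stack_depth` fields is the sketch below. It is an illustration only: the module `CallStackSketch`, the error atoms, and the behaviour on overflow or on an empty stack are all assumptions, since the source does not describe how Mangle reports those conditions.

```elixir
defmodule CallStackSketch do
  # Mirrors the documented default for max_stack_depth.
  @default_max_depth 256

  # call: push the return address (the instruction after the call),
  # then continue at the target address.
  def call(stack, pc, target, max_depth \\ @default_max_depth) do
    if length(stack) >= max_depth do
      {:error, :stack_overflow}
    else
      {:ok, [pc + 1 | stack], target}
    end
  end

  # ret: pop the most recent return address and resume there.
  def ret([return_pc | rest]), do: {:ok, rest, return_pc}
  def ret([]), do: {:error, :empty_stack}
end
```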

Documentation

| Opcode | Operands | Effect |
| --- | --- | --- |
| debug | "message" | Print message to stdout (for debugging) |
| comment | "message" | Append message to comment field (for explain mode) |

Character Operations

All operations require explicit target and position.

| Opcode | Operands | Effect |
| --- | --- | --- |
| insert | TARGET, INDEX, SOURCE | Insert char/string at INDEX |
| delete | TARGET, INDEX [, COUNT] | Delete chars at INDEX |
| swap | TARGET, POS1, POS2 | Swap characters (case stays positional) |
| move | TARGET, FROM, TO | Move character (case stays positional) |
| copy | SRC, START, LEN, DEST | Copy range to destination |
| replace | TARGET, INDEX, SOURCE | Replace character at INDEX |
| dup | TARGET, INDEX, COUNT | Duplicate character COUNT times |
| reverse | TARGET [, START, LEN] | Reverse characters |
| shuffle | TARGET [, START, LEN] | Shuffle characters |
| rotl | TARGET, COUNT | Rotate left |
| rotr | TARGET, COUNT | Rotate right |
| load_char | DEST, SRC, INDEX | Load char into register |
| load_case | DEST, w, INDEX | Load case (0/1 or false/true) |
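
To make a few of these operations concrete, here is a plain-Elixir sketch of `swap`, `move`, `rotl`, and `rotr` on an all-lowercase word. The module `CharOpsSketch` is hypothetical, it assumes valid 0-based indices, and it ignores the positional case handling mentioned in the table. The `move` semantics (remove at FROM, reinsert at TO) are inferred from `move w, 0, w.li` in the Pig Latin example.

```elixir
defmodule CharOpsSketch do
  # swap TARGET, POS1, POS2: exchange the characters at two positions.
  def swap(word, i, j) do
    chars = String.graphemes(word)
    a = Enum.at(chars, i)
    b = Enum.at(chars, j)
    chars |> List.replace_at(i, b) |> List.replace_at(j, a) |> Enum.join()
  end

  # move TARGET, FROM, TO: remove the character at `from`,
  # then reinsert it so that it ends up at index `to`.
  def move(word, from, to) do
    {char, rest} = word |> String.graphemes() |> List.pop_at(from)
    rest |> List.insert_at(to, char) |> Enum.join()
  end

  # rotl TARGET, COUNT: rotate left by `count` positions.
  def rotl(word, count) do
    chars = String.graphemes(word)

    case length(chars) do
      0 ->
        word

      n ->
        {head, tail} = Enum.split(chars, Integer.mod(count, n))
        Enum.join(tail ++ head)
    end
  end

  # rotr TARGET, COUNT: rotating right is rotating left by the negated count.
  def rotr(word, count), do: rotl(word, -count)
end
```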

Pattern Matching

| Opcode | Operands | Effect |
| --- | --- | --- |
| match | TARGET, POS, PATTERN... | Match pattern at POS. Sets flag and rl. |

Pattern elements: literal strings, table registers, general registers, string registers, or ? wildcard.

Register Operations

| Opcode | Operands | Effect |
| --- | --- | --- |
| load | DEST, SOURCE | Load value into register |
| inc | REG | Increment by 1 |
| dec | REG | Decrement by 1 |
| add | REG, VALUE | Add to register |
| sub | REG, VALUE | Subtract from register |
| mul | REG, VALUE | Multiply register |
| div | REG, VALUE | Integer divide register |
| mod | REG, VALUE | Modulo register |

Comparison (sets flag)

| Opcode | Operands | Effect |
| --- | --- | --- |
| eq | REG, VALUE or sX, sY | Equal |
| neq | REG, VALUE | Not equal |
| lt | REG, VALUE | Less than |
| lte | REG, VALUE | Less than or equal |
| gt | REG, VALUE | Greater than |
| gte | REG, VALUE | Greater than or equal |

Boolean Flags

| Opcode | Operands | Effect |
| --- | --- | --- |
| and | FLAG1, FLAG2 | Set FLAG1 if both are true |
| or | FLAG1, FLAG2 | Set FLAG1 if either is true |
| xor | FLAG1, FLAG2 | Set FLAG1 if exactly one is true |
| not | FLAG | Set FLAG if it is false, unset it otherwise |

Usage

Programs must explicitly load input into the word buffer:

start:
  load w, in          # Required: load input into word buffer
  swap w, 0, w.li     # Swap first and last characters
  halt

Command Line

# Build the escript
MIX_ENV=prod mix escript.build

# Run a program (given a specific word)
./mangle examples/vowel_rotate.mangle --word christmas

# Run a program on a file of words
./mangle examples/vowel_rotate.mangle --file word-list

# Run a program to generate 10 new words
./mangle examples/shivan.mangle --n 10

# Run a program and trace its output
./mangle examples/fremen.mangle --n 1 --trace

# Run a program in the interactive debugger
./mangle examples/bad_english.mangle --n 1 --debug

# Run a program setting the initial value of registers
./mangle examples/bad_english.mangle -i1 2 --word fluffy

CLI Options

  • --trace - Print execution trace for each word

  • --debug - Start in the interactive debugger

  • --explain - Print results with comments from comment instructions

  • --compare - Show transformations as original -> result

  • --status - Show success/failure prefix (T/F) from output

  • --ignore-empty - Skip empty lines from input

  • --stack-depth N - Set maximum call stack depth

  • --n N - Run program N times with empty input (generative mode)

  • -i<0-9> num - Set an i register to the given number

  • -s<0-9> "string" - Set an s register to the given string

  • -f<0-9> true|false - Set an f register to true or false

Examples

# Show transformations with comparison
./mangle examples/vowel_rotate.mangle --compare --word christmas
# Output: christmas -> chrastmis

# Trace execution
./mangle examples/vowel_rotate.mangle --trace --word vowels

Running from Elixir

# Parse a program
{:ok, %Mangle.Program{} = program} = Mangle.Parser.parse(source)

# Process words
[{:ok, "output"}, {:ok, "word"}] = Mangle.CLI.process_words(program, ["hello", "world"])
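
The result tuples can be paired back up with their inputs, for example to print them in the `original -> result` style of `--compare`. The helper below is a sketch, not part of Mangle: the module `CompareSketch` is hypothetical, and it assumes that results come back one per input word and in the same order. Only the `{:ok, output}` shape is taken from the example above; anything else is shown via `inspect/1` because the source does not document the failure shape.

```elixir
defmodule CompareSketch do
  # Pair each input word with its result and render one line per word.
  def format(words, results) do
    words
    |> Enum.zip(results)
    |> Enum.map(fn
      {word, {:ok, output}} -> "#{word} -> #{output}"
      # Undocumented result shapes are printed as-is.
      {word, other} -> "#{word} -> (no result: #{inspect(other)})"
    end)
  end
end
```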

Program Syntax

# This is a comment
label_name:
  instruction arg1, arg2
  instruction arg1
  instruction

Argument Types

| Type | Syntax | Examples |
| --- | --- | --- |
| Integer | digits | 42, 0 |
| Register | identifier | i0, c, w, s0 |
| Property | reg.prop | w.len, s0.li |
| Label | identifier | start:, loop: |
| String | "..." | "hello" |
| Character | 'c' | 'a', 'Z' |
| Frequency map | {c:n, ...} | {a:8.2, e:12.7} |
| Wildcard | ? | Used in match |

Development

mix deps.get
mix test
