Bytecode Format: A Real Compile Artifact
Lesson, slides, and applied problem sets.
Why this module exists
A compiler feels "real" when it produces an artifact that can be saved, shared, and executed later. We will define a simple, readable bytecode format and implement encode/decode for it.
This is not about compression or speed. It is about clarity.
1) The BC1 text format
We use a line-based text format with a small header:
BC1
FUNC add a b
LOAD a
LOAD b
ADD
RETURN
END
MAIN
PUSH_NUM 1
PUSH_NUM 2
CALL add 2
END
Rules:
- The first non-empty line is BC1
- FUNC <name> <param...> starts a function section
- MAIN starts the main section
- END ends the current section
- Empty lines and lines starting with # are ignored
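The rules above can be sketched as a small section decoder. This is a minimal illustration, not the lesson's reference implementation: the function name parse_bc1, the returned shapes, and the error messages are assumptions made for this example. It also assumes that comment lines may appear before the BC1 header and that sections cannot nest, neither of which the rules state explicitly.

```python
def parse_bc1(text):
    """Split BC1 text into (functions, main).

    functions maps a name to (params, instruction_lines);
    main is the MAIN section's instruction lines, or None if absent.
    (Hypothetical helper: names and return shape are assumptions.)
    """
    # Ignore empty lines and lines starting with '#'.
    lines = [ln.strip() for ln in text.splitlines()]
    lines = [ln for ln in lines if ln and not ln.startswith("#")]
    if not lines or lines[0] != "BC1":
        raise ValueError("expected BC1 on the first non-empty line")

    functions, main = {}, None
    current = None  # instruction list of the open section, if any
    for ln in lines[1:]:
        words = ln.split()
        head = words[0]
        if head in ("FUNC", "MAIN"):
            if current is not None:
                raise ValueError(f"{head} found before END of previous section")
            current = []
            if head == "FUNC":
                if len(words) < 2:
                    raise ValueError("FUNC needs a name")
                functions[words[1]] = (words[2:], current)
            else:
                main = current
        elif head == "END":
            if current is None:
                raise ValueError("END without an open section")
            current = None
        elif current is None:
            raise ValueError(f"instruction outside a section: {ln!r}")
        else:
            current.append(ln)  # keep the raw instruction line
    if current is not None:
        raise ValueError("missing END at end of input")
    return functions, main
```

Run on the sample program above, this returns the add function with parameters a and b and its four instruction lines, plus the three instruction lines of the main section. Instruction lines are kept as raw text here; decoding their operands is a separate step.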
2) Instruction encoding
Instructions are one per line. Examples:
PUSH_NUM 42
PUSH_STR "hello"
LOAD x
CALL add 2
JUMP 12
String literals use double quotes with escapes: \", \\, \n, \t.
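A possible encode/decode pair for single instruction lines, including the four string escapes, is sketched below. The helper names (quote, unquote, encode_instr, decode_instr) and the tuple representation of an instruction are assumptions for illustration. The sketch also assumes that only PUSH_STR takes a string literal and that numeric operands are integers; fractional numbers would need an extra rule.

```python
# Escape tables for the four escapes the format defines: \" \\ \n \t
ESCAPES = {'"': '\\"', "\\": "\\\\", "\n": "\\n", "\t": "\\t"}
UNESCAPES = {'"': '"', "\\": "\\", "n": "\n", "t": "\t"}

def quote(s):
    """Encode a Python string as a double-quoted BC1 literal."""
    return '"' + "".join(ESCAPES.get(ch, ch) for ch in s) + '"'

def unquote(tok):
    """Decode a double-quoted BC1 literal back into a Python string."""
    if len(tok) < 2 or tok[0] != '"' or tok[-1] != '"':
        raise ValueError(f"not a string literal: {tok!r}")
    body, out, i = tok[1:-1], [], 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            i += 1  # look at the character after the backslash
            if i >= len(body) or body[i] not in UNESCAPES:
                raise ValueError("bad escape sequence")
            out.append(UNESCAPES[body[i]])
        elif ch == '"':
            raise ValueError("unescaped quote inside literal")
        else:
            out.append(ch)
        i += 1
    return "".join(out)

def encode_instr(op, *args):
    """Turn ("CALL", "add", 2) into the line 'CALL add 2'."""
    if op == "PUSH_STR":
        return f"{op} {quote(args[0])}"
    return " ".join([op, *map(str, args)])

def decode_instr(line):
    """Turn an instruction line into a tuple (op, operand...)."""
    op, _, rest = line.strip().partition(" ")
    if op == "PUSH_STR":
        return (op, unquote(rest.strip()))
    operands = []
    for tok in rest.split():
        try:
            operands.append(int(tok))  # assumption: integer operands only
        except ValueError:
            operands.append(tok)  # names such as x or add stay as text
    return (op, *operands)
```

Because newlines and tabs are escaped, an encoded string literal always stays on one line, which keeps the one-instruction-per-line rule intact. Decoding an encoded instruction should give back the original tuple.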
3) Why text?
Text is easy to debug and inspect. Once you understand the pipeline, you can switch to a binary format later.