Bytecode Format
CrossBasic compiles your AST into a simple, two-part CodeChunk:
struct CodeChunk {
    std::vector<int> code;        // sequence of opcodes and operands
    std::vector<Value> constants; // constant pool of literal values, functions, classes, etc.
};
When you invoke your program, the VM runs runVM(vm, mainChunk), treating code as a flat array of int slots:
┌─────────────────────────────────────────────────────────────┐
│ CodeChunk │
│ ├ constants[0] = <Value nil> │
│ ├ constants[1] = <Value 123> │
│ ├ constants[2] = <Value "hello"> │
│ ├ constants[3] = <Value <ObjFunction "foo">> │
│ └ … │
│ │
│ code[] = [ │
│ OP_CONSTANT, 2, // push "hello" │
│ OP_PRINT, // print it │
│ OP_CONSTANT, 1, // push 123 │
│ OP_RETURN // return │
│ ] │
└─────────────────────────────────────────────────────────────┘
1. Constant Pool
- Indexed from 0 up to constants.size()-1.
- Stores any Value variant:
  - Scalars: nil, int, double, bool, string, Color
  - Heap objects: ObjFunction, ObjClass, ObjInstance, ObjArray, ObjModule, ObjEnum
  - Built-ins: BuiltinFn lambdas, property maps, overload vectors, raw pointers (void*)
- Accessed by OP_CONSTANT <idx>.
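As a rough illustration of the pool in use, the sketch below appends literals and emits the matching OP_CONSTANT instruction. It rests on several assumptions: Value is cut down to a small std::variant, the helper names addConstant and emitConstant are hypothetical, and the numeric value given to OP_CONSTANT is made up, since the real OpCode enum is not shown here.

```cpp
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Simplified stand-in for CrossBasic's Value (the real variant holds many more types).
using Value = std::variant<std::monostate, int, double, bool, std::string>;

struct CodeChunk {
    std::vector<int> code;        // opcodes and operands
    std::vector<Value> constants; // constant pool
};

// Illustrative numbering only; the real enum values are an assumption.
enum OpCode : int { OP_CONSTANT = 0 };

// Hypothetical helper: append a value to the pool and return its index.
int addConstant(CodeChunk& chunk, Value v) {
    chunk.constants.push_back(std::move(v));
    return static_cast<int>(chunk.constants.size()) - 1;
}

// Hypothetical helper: emit OP_CONSTANT followed by the pool index as its operand.
void emitConstant(CodeChunk& chunk, Value v) {
    int idx = addConstant(chunk, std::move(v));
    chunk.code.push_back(OP_CONSTANT);
    chunk.code.push_back(idx);
}
```

Calling emitConstant twice on an empty chunk leaves two pool entries and four words in code: the opcode and index 0, then the opcode and index 1. This sketch does not deduplicate equal constants; whether the real compiler does is not stated here.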
2. Instruction Stream
Each entry in code[] is either:
- A single-word opcode (the integer value of an OpCode enum), possibly followed by
- One or more operand words (each an int).
Operands are absolute: jump targets are positions in code[], and constant operands are indices into constants.
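Because opcodes and operands share one int array, a reader of the stream has to know how many operand words follow each opcode. The sketch below walks code[] that way and renders one line per instruction. The enum numbering, the operandCount table, and the function names are assumptions made for illustration, and only four opcodes are covered.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Illustrative numbering; the real OpCode values are an assumption.
enum OpCode : int { OP_CONSTANT, OP_PRINT, OP_JUMP, OP_RETURN };

// Number of operand words that follow each opcode in this sketch.
int operandCount(int op) {
    switch (op) {
        case OP_CONSTANT:
        case OP_JUMP:
            return 1;
        default:
            return 0;
    }
}

const char* opName(int op) {
    switch (op) {
        case OP_CONSTANT: return "OP_CONSTANT";
        case OP_PRINT:    return "OP_PRINT";
        case OP_JUMP:     return "OP_JUMP";
        case OP_RETURN:   return "OP_RETURN";
        default:          return "OP_UNKNOWN";
    }
}

// Walk the stream one instruction at a time and render each as text.
std::vector<std::string> disassemble(const std::vector<int>& code) {
    std::vector<std::string> lines;
    std::size_t ip = 0;
    while (ip < code.size()) {
        int op = code[ip++];
        std::string line = opName(op);
        // Consume the operand words that belong to this opcode.
        for (int i = 0; i < operandCount(op) && ip < code.size(); ++i) {
            line += " " + std::to_string(code[ip++]);
        }
        lines.push_back(line);
    }
    return lines;
}
```

Applied to the six-word example stream from the diagram above, this yields four lines, one per instruction.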
2.1 Common Opcodes
| Opcode | Operands | Semantics |
|---|---|---|
| OP_CONSTANT | <constIndex> | Push constants[constIndex] |
| OP_NIL | — | Push nil |
| OP_POP | — | Pop & discard top of stack |
| OP_DUP | — | Push a copy of the top of stack |
| Arithmetic & Logic | | Pop 1–2 values, compute result, push it |
| OP_ADD | — | + (int/double/string) |
| OP_SUB | — | - |
| OP_MUL | — | * |
| OP_DIV | — | / |
| OP_NEGATE | — | unary - |
| OP_POW | — | exponentiation ^ |
| OP_MOD | — | modulus % |
| OP_LT / OP_LE | — | <, <= |
| OP_GT / OP_GE | — | >, >= |
| OP_EQ / OP_NE | — | =, <> |
| OP_AND / OP_OR | — | logical And / Or |
| Variables & Globals | | |
| OP_DEFINE_GLOBAL | <nameConst> | Pop a value and define it globally under the string constants[nameConst] |
| OP_GET_GLOBAL | <nameConst> | Push the current value of that global |
| OP_SET_GLOBAL | <nameConst> | Pop and assign into an existing global |
| Control Flow | | |
| OP_JUMP_IF_FALSE | <targetIp> | Pop condition; if false ⇒ ip = targetIp |
| OP_JUMP | <targetIp> | ip = targetIp (unconditional) |
| OP_RETURN | — | Pop and return a Value to the caller (runVM returns) |
| Functions & Calls | | |
| OP_CALL | <argCount> | Pop argCount values + callee; invoke (built-in, scripted, overload, bound method, array) |
| OP_OPTIONAL_CALL | <argCount> | Like OP_CALL, but a nil callee becomes a no-op ⇒ pushes nil |
| Classes & Objects | | |
| OP_CLASS | <nameConst> | Push a fresh, empty ObjClass(name) |
| OP_METHOD | <methodNameConst> | Pop a function and class; register class method |
| OP_PROPERTIES | <propsConst> | Pop a PropertiesType; assign to a class on the stack |
| OP_NEW | — | Pop a class, allocate ObjInstance; push it; handles plugin vs. built-in constructors |
| OP_CONSTRUCTOR_END | — | Pop [instance, ctorResult]; if ctorResult≠nil push it, else re-push instance |
| Arrays & Props | | |
| OP_ARRAY | <elementCount> | Pop that many values, build an ObjArray, push it |
| OP_GET_PROPERTY | <propNameConst> | Pop an object (instance/module/enum/string/array), push field or bound method or event key |
| OP_SET_PROPERTY | <propNameConst> | Pop [newValue, object]; set property (instance or plugin), then push object |
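To make the stack discipline in the table concrete, here is a minimal dispatch loop covering a handful of these opcodes. It is a sketch under stated assumptions, not the actual runVM: Value is reduced to nil/int/bool, the enum numbering is invented, OP_ADD and OP_LT handle ints only, only a bool false counts as false for OP_JUMP_IF_FALSE, and runSketch is a hypothetical name.

```cpp
#include <cstddef>
#include <stdexcept>
#include <variant>
#include <vector>

// Simplified stand-in for Value: nil, int, bool only.
using Value = std::variant<std::monostate, int, bool>;

struct CodeChunk {
    std::vector<int> code;
    std::vector<Value> constants;
};

// Illustrative numbering; the real OpCode values are an assumption.
enum OpCode : int {
    OP_CONSTANT, OP_NIL, OP_POP, OP_DUP, OP_ADD, OP_LT,
    OP_JUMP_IF_FALSE, OP_JUMP, OP_RETURN
};

Value runSketch(const CodeChunk& chunk) {
    std::vector<Value> stack;
    std::size_t ip = 0;
    auto pop = [&stack]() {
        if (stack.empty()) throw std::runtime_error("stack underflow");
        Value v = stack.back();
        stack.pop_back();
        return v;
    };
    while (ip < chunk.code.size()) {
        int op = chunk.code[ip++];
        switch (op) {
            case OP_CONSTANT:
                // The operand is an index into the constant pool.
                stack.push_back(chunk.constants.at(chunk.code.at(ip++)));
                break;
            case OP_NIL: stack.push_back(std::monostate{}); break;
            case OP_POP: pop(); break;
            case OP_DUP: { Value v = pop(); stack.push_back(v); stack.push_back(v); break; }
            case OP_ADD: {
                int b = std::get<int>(pop());
                int a = std::get<int>(pop());
                stack.push_back(a + b);
                break;
            }
            case OP_LT: {
                int b = std::get<int>(pop());
                int a = std::get<int>(pop());
                stack.push_back(a < b);
                break;
            }
            case OP_JUMP_IF_FALSE: {
                int target = chunk.code.at(ip++);   // absolute index into code[]
                Value cond = pop();
                if (std::holds_alternative<bool>(cond) && !std::get<bool>(cond)) {
                    ip = static_cast<std::size_t>(target);
                }
                break;
            }
            case OP_JUMP:
                ip = static_cast<std::size_t>(chunk.code.at(ip));
                break;
            case OP_RETURN: return pop();
            default: throw std::runtime_error("unknown opcode");
        }
    }
    return std::monostate{}; // fell off the end without OP_RETURN
}
```

Note that the second operand is popped first in OP_ADD and OP_LT, since the right-hand value sits on top of the stack.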
3. Jump Offsets
- Absolute: OP_JUMP and OP_JUMP_IF_FALSE use the index into code[] where execution continues.
- Compiler fix-ups: Labels record code.size() when declared; gotos emit a placeholder and are patched after all statements compile.
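The fix-up scheme described above can be sketched as follows. The JumpPatcher type, its member names, the -1 placeholder, and the enum numbering are all hypothetical; the real compiler's data structures are not shown in this reference.

```cpp
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Illustrative numbering; the real OpCode values are an assumption.
enum OpCode : int { OP_JUMP = 0, OP_RETURN = 1 };

// Hypothetical helper that records labels and patches goto targets afterwards.
struct JumpPatcher {
    std::vector<int> code;
    std::unordered_map<std::string, int> labels;            // label -> absolute index
    std::vector<std::pair<int, std::string>> pendingGotos;  // operand slot -> label

    // A label resolves to the position of the next emitted word.
    void declareLabel(const std::string& name) {
        labels[name] = static_cast<int>(code.size());
    }

    // Emit OP_JUMP with a placeholder operand and remember where to patch it.
    void emitGoto(const std::string& name) {
        code.push_back(OP_JUMP);
        pendingGotos.emplace_back(static_cast<int>(code.size()), name);
        code.push_back(-1); // placeholder, overwritten by patchAll()
    }

    // After all statements are compiled, replace every placeholder with its target.
    void patchAll() {
        for (const auto& [slot, name] : pendingGotos) {
            auto it = labels.find(name);
            if (it == labels.end()) {
                throw std::runtime_error("undefined label: " + name);
            }
            code[slot] = it->second;
        }
        pendingGotos.clear();
    }
};
```

Deferring the patch until the end is what lets a goto refer to a label declared later in the source, since the label's position is unknown when the jump is emitted.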
4. On-Disk Embedding (Executable Bytecode)
While your VM runs the in-memory CodeChunk, CrossBasic also supports embedding compiled scripts inside the executable:
- Marker: 8-byte ASCII "BYTECODE".
- Length: 4-byte little-endian uint32 giving the size of the embedded data.
- Payload: the raw script or serialized chunk of size textLength.
At startup, retrieveData() scans backwards for "BYTECODE", reads the length, then pulls out the preceding textLength bytes for automatic use (e.g. reloading pre-compiled code or source).
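A backwards scan of this kind might look like the sketch below. It is not the actual retrieveData(): it works on an in-memory byte buffer rather than the executable file, the name findEmbeddedPayload is hypothetical, and it assumes the layout implied by the description above, namely payload bytes, then the marker, then the 4-byte length.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Assumed layout at the tail of the image: [payload][ "BYTECODE" ][ uint32 LE length ].
std::optional<std::string> findEmbeddedPayload(const std::vector<std::uint8_t>& image) {
    static const std::string marker = "BYTECODE";
    const std::size_t trailer = marker.size() + 4; // marker plus length field
    if (image.size() < trailer) return std::nullopt;

    // Scan backwards over every position where marker + length could still fit.
    for (std::size_t pos = image.size() - trailer + 1; pos-- > 0;) {
        bool match = true;
        for (std::size_t i = 0; i < marker.size(); ++i) {
            if (image[pos + i] != static_cast<std::uint8_t>(marker[i])) {
                match = false;
                break;
            }
        }
        if (!match) continue;

        // Decode the little-endian uint32 that follows the marker.
        std::uint32_t len = 0;
        for (int i = 0; i < 4; ++i) {
            len |= static_cast<std::uint32_t>(image[pos + marker.size() + i]) << (8 * i);
        }
        if (len > pos) continue; // not enough bytes before the marker; keep scanning

        // The payload is the len bytes immediately preceding the marker.
        auto first = image.begin() + static_cast<std::ptrdiff_t>(pos - len);
        auto last = image.begin() + static_cast<std::ptrdiff_t>(pos);
        return std::string(first, last);
    }
    return std::nullopt;
}
```

Scanning from the end means the last marker in the image wins, which avoids matching a stray "BYTECODE" string that happens to appear earlier in the executable.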
With this reference you have a map of how CrossBasic source ends up as a stream of small integer codes plus a shared constant pool, and of how the VM decodes and executes each instruction.