Adding Instructions¶
This guide shows how to add a new instruction to the IPU assembler and emulator.
Overview¶
The IPU uses a single source of truth for all instruction definitions: INSTRUCTION_SPEC in src/tools/ipu-common/src/ipu_common/instruction_spec.py.
Adding an instruction is a 3-step process:
- Add to instruction_spec — define the instruction with operands and documentation
- Implement execution handler — write the Python method in the emulator's
Ipuclass - Add tests — verify the instruction works correctly
No manual opcode management needed — opcodes are automatically derived from instruction position.
Step 1: Add Instruction to instruction_spec.py¶
Open src/tools/ipu-common/src/ipu_common/instruction_spec.py and find the INSTRUCTION_SPEC dictionary. Locate the slot type where your instruction belongs (e.g., "mult", "acc", "xmem", "lr", "cond", "break").
Add your instruction to the slot's dictionary:
INSTRUCTION_SPEC = {
"mult": {
# ... existing mult instructions ...
"my_new_instruction": {
"operands": [
{"name": "dest", "type": "MultStageReg"},
{"name": "src_a", "type": "MultStageReg", "read": "snapshot"},
{"name": "src_b", "type": "MultStageReg", "read": "snapshot"},
],
"doc": InstructionDoc(
title="My New Operation",
summary="Performs a custom operation on two source registers.",
syntax="my_new_instruction Rd Ra Rb",
operands=[
"Rd: Destination mult stage register (r0, r1, or mem_bypass)",
"Ra: First source mult stage register",
"Rb: Second source mult stage register",
],
operation=(
"for i in [0, R_REG_SIZE):\n"
" Rd[i] = custom_operation(Ra[i], Rb[i])"
),
example="my_new_instruction r0 r1 mem_bypass;;",
),
"execute_fn": "execute_my_new_instruction",
},
},
# ... other slot types ...
}
Key fields:
operands: List of operand definitions, each with:name: Meaningful name for the operand (used in handler signature)type: Operand type — one of:"MultStageReg"— r0, r1, or mem_bypass"LrIdx"— lr0-lr15"CrIdx"— cr0-cr15"LcrIdx"— lr0-lr15 or cr0-cr15"Immediate"— 32-bit signed integer"BreakImmediate"— 16-bit break condition"Label"— branch target label
-
read(optional):"snapshot"or"live"— marks source registers whose values are auto-resolved:"snapshot"— read from the VLIW snapshot (pre-write state, for read-before-write semantics)"live"— read from current register file (sees writes from earlier slots)- Omit for destination registers or indices passed as raw values
-
doc:InstructionDocwith: title— Short human-readable namesummary— What the instruction doessyntax— Assembly syntax exampleoperands— Description of each operand (type and meaning)operation— Pseudo-code showing the operation-
example— Code example -
execute_fn: Name of the Python method in theIpuclass that executes this instruction (e.g.,"execute_my_new_instruction")
Opcode assignment:
Opcodes are automatically derived from position within the slot's instruction list. The first instruction gets opcode 0, the second gets opcode 1, etc. If you want to preserve existing opcodes, append new instructions to the end. To change opcodes, reorder instructions (both assembler and emulator will update automatically).
Step 2: Implement Execution Handler¶
Open src/tools/ipu-emu-py/src/ipu_emu/ipu.py and find the section with handlers for your slot type (look for comments like # MULT Instruction Handlers).
Add your execution method to the Ipu class:
class Ipu:
# ... existing methods ...
# -----------------------------------------------------------------------
# MULT Instruction Handlers
# -----------------------------------------------------------------------
# ... existing mult handlers ...
def execute_my_new_instruction(self, *, dest: int, src_a: bytearray,
src_b: bytearray) -> None:
"""Execute my_new_instruction: performs custom operation.
Args:
dest: Destination register index (raw MultStageRegField value)
src_a: Source A register bytes (auto-resolved from snapshot)
src_b: Source B register bytes (auto-resolved from snapshot)
"""
result = bytearray(R_REG_SIZE)
dtype = DType(self.state.get_cr_dtype())
for i in range(R_REG_SIZE):
# Perform your custom operation
result[i] = (src_a[i] + src_b[i]) & 0xFF # Example: element-wise add
# Write result to destination register
reg_name, elem_idx = _MULT_STAGE_MAP[dest]
self.state.regfile.set_register_bytes(reg_name, elem_idx, result)
Key points:
- Method signature:
def execute_<instruction_name>(self, *, <operand_names>) -> None: - Use keyword-only arguments (note the
*before operand names) - Parameter names must match the
namefields ininstruction_spec -
Parameter order doesn't matter (they're passed as kwargs)
-
Auto-resolved operands (those with
"read"field): - Register operands with
"read": "snapshot"receivebytearrayvalues (register contents) - LR/CR operands with
"read"receiveintvalues (32-bit register value) -
The dispatcher automatically extracts these from the snapshot or live regfile
-
Raw operands (no
"read"field): - Destination registers receive the raw index as
int - Immediates receive the literal value as
int -
Labels receive the resolved address as
int -
Common patterns:
- Use
_MULT_STAGE_MAP[dest]to get(reg_name, elem_idx)from a MultStageReg index - Use
self.state.regfile.set_register_bytes()to write to a register - Use
DType(self.state.get_cr_dtype())to get the current data type - Use
ipu_mult(),ipu_add()fromipu_emu.ipu_mathfor typed arithmetic
Available helpers:
# Register access (via self.state.regfile)
self.state.regfile.get_lr(idx) -> int # Read LR register
self.state.regfile.set_lr(idx, value: int) # Write LR register
self.state.regfile.get_r_acc_bytes() -> bytearray # Read accumulator
self.state.regfile.set_r_acc_bytes(data: bytearray) # Write accumulator
# Register metadata
reg_name, elem_idx = _MULT_STAGE_MAP[idx] # Resolve MultStageReg index
R_REG_SIZE # Register size in bytes (128)
R_ACC_SIZE # Accumulator size (512)
# State access
self.state.xmem.read_address(addr, size) -> bytearray
self.state.xmem.write_address(addr, data: bytearray)
self.state.get_cr_dtype() -> int # Get dtype from CR register
self.snapshot # Pre-VLIW register snapshot
# Math operations
from ipu_emu.ipu_math import ipu_mult, ipu_add, DType
ipu_mult(a: int, b: int, dtype: DType) -> int
ipu_add(a: int, b: int, dtype: DType) -> int
Step 3: Add Tests¶
Create tests in src/tools/ipu-emu-py/test/test_<slot>_instructions.py (or create a new file if needed):
"""Tests for MULT instruction execution."""
import pytest
from ipu_emu.ipu_state import IpuState
from ipu_emu.ipu import Ipu
def test_my_new_instruction():
"""Test my_new_instruction performs custom operation correctly."""
state = IpuState()
ipu = Ipu(state)
# Set up initial register values
state.regfile.set_register_bytes("r", 0, bytearray(range(128)))
state.regfile.set_register_bytes("r", 1, bytearray([1] * 128))
# Create instruction word (see existing tests for encoding examples)
inst = {
"mult_inst_token_0_mult_inst_opcode": <opcode_value>,
"mult_inst_token_1_mult_stage_reg_field": 0, # dest = r0
"mult_inst_token_2_mult_stage_reg_field": 0, # src_a = r0
"mult_inst_token_3_mult_stage_reg_field": 1, # src_b = r1
# ... other fields set to NOP opcodes ...
}
# Execute instruction
state.inst_mem = [inst]
state.regfile.set_pc(0)
ipu.execute_vliw_cycle()
# Verify result
result = state.regfile.get_register_bytes("r", 0)
expected = bytearray((i + 1) & 0xFF for i in range(128))
assert result == expected
Run tests:
Step 4: Verify Integration¶
After adding your instruction, verify it works in both assembler and emulator:
# Verify assembler accepts the instruction
echo "my_new_instruction r0 r1 mem_bypass;;" | bazel run //src/tools/ipu-as-py:ipu-as -- assemble --format hex -
# Run all tests
bazel test //...
Summary¶
- Add to
instruction_spec.py— define operands, docs, andexecute_fn - Implement
execute_<name>inIpuclass — write the execution logic - Add tests — verify correctness with unit tests
- No opcode management — opcodes are derived from position automatically
The instruction will automatically work in: - Assembler (syntax highlighting, parsing, binary encoding) - Emulator (execution, operand resolution, register updates) - Documentation generation (help text, reference docs)
All from the single source of truth in instruction_spec.py.