1/1/2026
Inside Ferret: Compiler Architecture and Design
A deep dive into Ferret's multi-phase compiler architecture, from source code to native binary.
Ferret’s compiler is built from the ground up with a focus on correctness, performance, and maintainability. This post explores the architecture that powers Ferret’s compilation from .fer source files to native executables.
Design Principles
The Ferret compiler follows several key principles:
- Per-Module Phase Tracking: Each source file progresses through compilation phases independently
- Parallel Processing: Modules are parsed in parallel with unbounded goroutines
- Thread-Safe Diagnostics: Error collection is thread-safe with proper mutex synchronization
- Topological Ordering: Phases respect module dependencies for correct compilation
- Multi-IR Architecture: Three intermediate representations (AST → HIR → MIR) for progressive lowering
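To make the topological-ordering principle concrete, here is a minimal sketch of ordering modules so that every dependency is processed before the modules that import it. It uses Kahn's algorithm; the function name TopoOrder, the map-based dependency graph, and the choice to treat an import cycle as an error are illustrative assumptions, not Ferret's actual implementation.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// TopoOrder returns modules so that each one appears after everything it imports.
// deps maps a module to the modules it imports (hypothetical representation).
func TopoOrder(deps map[string][]string) ([]string, error) {
	pending := map[string]int{}         // module -> imports not yet processed
	dependents := map[string][]string{} // module -> modules that import it
	for m, imports := range deps {
		if _, ok := pending[m]; !ok {
			pending[m] = 0
		}
		for _, d := range imports {
			pending[m]++
			if _, ok := pending[d]; !ok {
				pending[d] = 0
			}
			dependents[d] = append(dependents[d], m)
		}
	}

	var ready []string
	for m, n := range pending {
		if n == 0 {
			ready = append(ready, m)
		}
	}
	sort.Strings(ready) // keep the output deterministic

	var order []string
	for len(ready) > 0 {
		m := ready[0]
		ready = ready[1:]
		order = append(order, m)
		for _, dep := range dependents[m] {
			pending[dep]--
			if pending[dep] == 0 {
				ready = append(ready, dep)
			}
		}
		sort.Strings(ready)
	}

	// Assumption: anything left over is part of an import cycle.
	if len(order) != len(pending) {
		return nil, errors.New("import cycle detected")
	}
	return order, nil
}

func main() {
	order, err := TopoOrder(map[string][]string{
		"main": {"a", "b"},
		"a":    {"util"},
		"b":    {"util"},
	})
	fmt.Println(order, err)
}
```

A phase that needs resolved imports (type checking, for example) could walk this order so that util is finished before a and b, and both before main.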
Architecture Overview
Here’s the complete architecture diagram showing all compilation phases and data flow:
Phase-by-Phase Breakdown
Phase 1: Lexing & Parsing (Parallel)
The compilation journey begins with parallel parsing:
- Regex-based Lexer: Tokenizes source with error continuation (doesn’t panic on bad input)
- Recursive Descent Parser: Builds AST with minimal error recovery
- Import Extraction: Discovers dependencies recursively
- Parallel Execution: Each module spawns its own goroutine
- Deduplication: Modules are parsed once per import path using sync.Map
Key Innovation: Unbounded goroutines work well for typical project sizes (up to a few hundred modules) and avoid the complexity of worker pools.
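The goroutine-per-module scheme with sync.Map deduplication can be sketched roughly as follows. The Loader and Module types, the ParseAll helper, and the in-memory sources map standing in for the file system are all hypothetical; only the use of sync.Map and one goroutine per module comes from the description above.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Module is a hypothetical stand-in for a parsed source file.
type Module struct {
	Path    string
	Imports []string
}

// Loader parses each import path at most once, one goroutine per module.
type Loader struct {
	seen    sync.Map            // import path -> *Module
	wg      sync.WaitGroup      // counts in-flight parse goroutines
	sources map[string][]string // fake file system: path -> imports (read-only)
}

// Load claims path and parses it in a new goroutine, unless it was already claimed.
func (l *Loader) Load(path string) {
	mod := &Module{Path: path}
	if _, loaded := l.seen.LoadOrStore(path, mod); loaded {
		return // another goroutine already owns this import path
	}
	l.wg.Add(1)
	go func() {
		defer l.wg.Done()
		mod.Imports = l.sources[path] // stand-in for lexing, parsing, import extraction
		for _, dep := range mod.Imports {
			l.Load(dep) // discover dependencies recursively
		}
	}()
}

// ParseAll loads entry and its transitive imports and returns the sorted paths.
func ParseAll(entry string, sources map[string][]string) []string {
	l := &Loader{sources: sources}
	l.Load(entry)
	l.wg.Wait()
	var paths []string
	l.seen.Range(func(k, _ any) bool {
		paths = append(paths, k.(string))
		return true
	})
	sort.Strings(paths)
	return paths
}

func main() {
	sources := map[string][]string{
		"main": {"a", "b"},
		"a":    {"util"},
		"b":    {"util"},
	}
	fmt.Println(ParseAll("main", sources))
}
```

LoadOrStore is what makes the claim atomic: when a and b both import util at the same time, exactly one of them wins and starts the parse.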
Phase 2: Symbol Collection
Building the symbol table hierarchy:
- Lexical Scoping: Module → Function → Block scope chain
- Name Declaration: Records all symbols (variables, functions, types)
- Import Validation: Prevents redeclaration conflicts with import aliases
- Method Validation: Ensures methods are defined in the same module as their types
Phase 3: Resolution
Binding names to their declarations:
- Scope Walking: Parent chain traversal for symbol lookup
- Qualified Access: Resolves module::symbol syntax
- Cross-Module: Links symbols across file boundaries
- Built-in Lookup: Resolves references to universe scope (built-in types)
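Symbol collection and resolution both rest on a chain of scopes. A minimal sketch, assuming a Scope type with a parent pointer and a plain map of names (the type, its methods, and the i32 built-in are illustrative, not taken from Ferret's source):

```go
package main

import "fmt"

// Scope is one level of the Module -> Function -> Block chain.
type Scope struct {
	parent  *Scope
	symbols map[string]string // name -> kind (simplified symbol record)
}

func NewScope(parent *Scope) *Scope {
	return &Scope{parent: parent, symbols: map[string]string{}}
}

// Declare records a symbol and rejects redeclaration within the same scope.
func (s *Scope) Declare(name, kind string) error {
	if _, exists := s.symbols[name]; exists {
		return fmt.Errorf("%q redeclared in this scope", name)
	}
	s.symbols[name] = kind
	return nil
}

// Lookup walks the parent chain, so inner declarations shadow outer ones.
func (s *Scope) Lookup(name string) (string, bool) {
	for cur := s; cur != nil; cur = cur.parent {
		if kind, ok := cur.symbols[name]; ok {
			return kind, true
		}
	}
	return "", false
}

func main() {
	universe := NewScope(nil) // built-in types live here
	universe.Declare("i32", "builtin type")
	module := NewScope(universe)
	module.Declare("x", "module variable")
	block := NewScope(NewScope(module)) // function scope, then block scope
	block.Declare("x", "local variable") // shadows the module-level x

	fmt.Println(block.Lookup("x"))
	fmt.Println(block.Lookup("i32"))
	fmt.Println(block.Lookup("missing"))
}
```

Placing built-ins in a root "universe" scope means the same parent-chain walk handles locals, module symbols, and built-in types without special cases.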
Phase 4: Type Checking
The most complex phase with multiple subsystems:
Type Inference
- Context-based inference from expressions
- Array type adoption from surrounding context
- Untyped literal resolution
Constant Evaluation
- Arbitrary Precision: Uses big.Int and big.Float
- Assignment Tracking: Follows constant values through local variables
- Conservative: Only tracks primitives, not struct fields or complex expressions
- Sound: Never gives wrong results, may miss some optimizations
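The "conservative but sound" idea can be illustrated with a small evaluator over big.Int: it either returns an exact value or reports that the value is unknown, and never guesses. The expression types (Lit, Ident, Binary), the Env map, and the supported operators are assumptions made for this sketch.

```go
package main

import (
	"fmt"
	"math/big"
)

// Hypothetical expression nodes for the sketch.
type Expr interface{}
type Lit struct{ Val string }  // decimal integer literal
type Ident struct{ Name string } // reference to a local variable
type Binary struct {
	Op   byte // '+', '-' or '*'
	L, R Expr
}

// Env tracks locals whose values are known constants.
type Env map[string]*big.Int

// Eval returns (value, true) only when the result is known exactly.
// Anything it does not understand yields (nil, false) instead of a guess.
func Eval(e Expr, env Env) (*big.Int, bool) {
	switch n := e.(type) {
	case Lit:
		return new(big.Int).SetString(n.Val, 10)
	case Ident:
		v, ok := env[n.Name]
		return v, ok
	case Binary:
		l, ok := Eval(n.L, env)
		if !ok {
			return nil, false
		}
		r, ok := Eval(n.R, env)
		if !ok {
			return nil, false
		}
		switch n.Op {
		case '+':
			return new(big.Int).Add(l, r), true
		case '-':
			return new(big.Int).Sub(l, r), true
		case '*':
			return new(big.Int).Mul(l, r), true
		}
	}
	return nil, false // unsupported node or operator: stay conservative
}

func main() {
	env := Env{}
	env["n"], _ = Eval(Lit{"18446744073709551616"}, env) // 2^64, overflows uint64
	v, ok := Eval(Binary{'*', Ident{"n"}, Lit{"2"}}, env)
	fmt.Println(v, ok)
	_, ok = Eval(Ident{"runtimeValue"}, env)
	fmt.Println(ok)
}
```

Because intermediate results are arbitrary precision, overflow can be diagnosed against the target type after evaluation rather than silently wrapping during it.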
Array Bounds Checking
- Compile-time validation for fixed-size arrays with constant indices
- Supports negative indexing (-1 to -N)
- Dynamic arrays ([]T) auto-grow without bounds checking
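For a fixed-size array and a constant index, the check itself is small. The sketch below assumes that a negative index -k refers to element N-k (reverse indexing); the function name CheckIndex and the normalization step are illustrative.

```go
package main

import "fmt"

// CheckIndex validates a constant index against a fixed array size and
// returns the equivalent zero-based index. Valid inputs are 0..size-1
// and -1..-size (assumed to count from the end).
func CheckIndex(idx, size int64) (int64, error) {
	switch {
	case idx >= 0 && idx < size:
		return idx, nil
	case idx < 0 && idx >= -size:
		return size + idx, nil
	}
	return 0, fmt.Errorf("index %d out of bounds for array of size %d", idx, size)
}

func main() {
	fmt.Println(CheckIndex(-1, 10)) // last element
	fmt.Println(CheckIndex(15, 10)) // rejected at compile time
}
```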
Dead Code Detection
- Detects unreachable code from constant conditions
- Warns about obvious infinite loops (no break/return)
- Works with constant expressions
Phase 5: HIR Generation
Lowering typed AST to High-Level IR:
- Source-Shaped: Preserves original structure for error reporting
- Type Annotations: All expressions have resolved types
- Expression Lowering: Simplifies complex expressions
- Function/Method Preservation: Maintains declaration structure
Phase 6: Control Flow Analysis
Building and analyzing control flow graphs:
- BasicBlock Construction: Breaks functions into basic blocks
- Successors/Predecessors: Tracks control flow edges
- Return Path Validation: Ensures all paths return a value
- Unreachable Detection: Finds dead code blocks
- Loop Analysis: Tracks break/continue for loop contexts
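Return-path validation and unreachable-block detection both fall out of a reachability walk over the graph. A minimal sketch, assuming a Block type with a successor list and a flag for "ends in a return" (the names are hypothetical):

```go
package main

import "fmt"

// Block is a simplified basic block.
type Block struct {
	ID      int
	Succs   []*Block
	Returns bool // the block's terminator is a return
}

// reachable collects every block that can be reached from entry.
func reachable(entry *Block) map[*Block]bool {
	seen := map[*Block]bool{}
	stack := []*Block{entry}
	for len(stack) > 0 {
		b := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if seen[b] {
			continue
		}
		seen[b] = true
		stack = append(stack, b.Succs...)
	}
	return seen
}

// AllPathsReturn reports false if some reachable block neither returns nor
// branches anywhere, i.e. control can fall off the end of the function.
func AllPathsReturn(entry *Block) bool {
	for b := range reachable(entry) {
		if !b.Returns && len(b.Succs) == 0 {
			return false
		}
	}
	return true
}

// Unreachable lists the IDs of blocks that no path from entry reaches.
func Unreachable(entry *Block, all []*Block) []int {
	seen := reachable(entry)
	var dead []int
	for _, b := range all {
		if !seen[b] {
			dead = append(dead, b.ID)
		}
	}
	return dead
}

func main() {
	thenB := &Block{ID: 1, Returns: true}
	elseB := &Block{ID: 2} // else branch forgets to return
	entry := &Block{ID: 0, Succs: []*Block{thenB, elseB}}
	dead := &Block{ID: 3, Returns: true} // code after an unconditional return

	fmt.Println(AllPathsReturn(entry))
	fmt.Println(Unreachable(entry, []*Block{entry, thenB, elseB, dead}))
}
```

Note that a loop with no exit passes this check, since no block falls off the end; that matches treating obvious infinite loops as a separate warning.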
Phase 7: HIR Lowering
Canonical transformation to prepare for MIR:
- Optional/Result Explicit: Makes implicit operations explicit
- Temp Variables: Splits complex expressions
- Desugaring: Simplifies language constructs
- Ready for MIR: Canonical form suitable for SSA
Phase 8: MIR Generation
Lowering to Mid-Level IR (SSA-like):
- SSA Form: Static Single Assignment with value/block IDs
- Basic Blocks: Structured control flow
- VTables: Interface dispatch tables
- Type IDs: Runtime type information
- Closure Lowering: Environment capture with upvalues
Phase 9: QBE Code Generation
Native code generation via QBE:
- QBE IR: SSA form with type annotations
- Register Allocation: Handled by QBE
- Optimization: QBE performs backend optimizations
- Multi-Architecture: x86_64, aarch64, riscv64 support
- Assembly Output: Platform-specific .s files
Assembly & Linking
Final binary generation:
- Platform Assembler: Uses system as or $AS/$FERRET_AS
- Object Files: ELF/Mach-O/COFF format
- Runtime Library: libferret_runtime.a with memory management, I/O, panic handlers
- CRT Objects: C runtime initialization
- Static Linking: Single native binary output
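Assembler selection might look something like the sketch below. The precedence shown here ($FERRET_AS first, then $AS, then the system as) is an assumption for illustration; the post only says that these three sources exist. PickAssembler is a hypothetical name, and it takes the environment lookup as a parameter so it can be exercised without touching the real environment.

```go
package main

import (
	"fmt"
	"os"
)

// PickAssembler returns the assembler command to run.
// Assumed precedence: FERRET_AS, then AS, then the system "as".
func PickAssembler(getenv func(string) string) string {
	for _, key := range []string{"FERRET_AS", "AS"} {
		if v := getenv(key); v != "" {
			return v
		}
	}
	return "as"
}

func main() {
	fmt.Println(PickAssembler(os.Getenv))
}
```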
Concurrency Model
Thread-safe compilation with proper lock hierarchy:
- CompilerContext.mu (RWMutex): Protects module registry and dependency graph
- Module.Mu (Mutex): Protects individual module field updates
- DiagnosticBag.mu (Mutex): Thread-safe error collection
- SourceCache.mu (RWMutex): Cached source for diagnostics
Lock Ordering: Always acquire locks in the order listed above to prevent deadlocks.
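Here is a rough sketch of what respecting that order looks like in practice. The lock names follow the list above (CompilerContext.mu, Module.Mu, DiagnosticBag.mu); everything else, including the Phase field, the AdvancePhase method, and the RunConcurrent helper, is hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// DiagnosticBag collects messages from many goroutines.
type DiagnosticBag struct {
	mu    sync.Mutex
	diags []string
}

func (b *DiagnosticBag) Add(msg string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.diags = append(b.diags, msg)
}

func (b *DiagnosticBag) Len() int {
	b.mu.Lock()
	defer b.mu.Unlock()
	return len(b.diags)
}

type Module struct {
	Mu    sync.Mutex
	Phase int
}

type CompilerContext struct {
	mu      sync.RWMutex
	modules map[string]*Module
	Diags   DiagnosticBag
}

// AdvancePhase takes locks in the documented order:
// context (read lock), then module, then diagnostics.
func (c *CompilerContext) AdvancePhase(path string, phase int) bool {
	c.mu.RLock()
	defer c.mu.RUnlock()
	m, ok := c.modules[path]
	if !ok {
		c.Diags.Add("unknown module: " + path)
		return false
	}
	m.Mu.Lock()
	defer m.Mu.Unlock()
	if phase < m.Phase {
		c.Diags.Add("phase regression in " + path)
		return false
	}
	m.Phase = phase
	return true
}

// RunConcurrent advances n modules from n goroutines; each goroutine also
// asks for a missing module. It returns (modules advanced, diagnostics).
func RunConcurrent(n int) (int, int) {
	ctx := &CompilerContext{modules: map[string]*Module{}}
	for i := 0; i < n; i++ {
		ctx.modules[fmt.Sprintf("m%d", i)] = &Module{}
	}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			ctx.AdvancePhase(fmt.Sprintf("m%d", i), 1)
			ctx.AdvancePhase("missing", 1)
		}(i)
	}
	wg.Wait()
	advanced := 0
	for _, m := range ctx.modules {
		if m.Phase == 1 {
			advanced++
		}
	}
	return advanced, ctx.Diags.Len()
}

func main() {
	fmt.Println(RunConcurrent(100))
}
```

No code path here takes a later lock and then reaches back for an earlier one, which is the property that rules out lock-order deadlocks.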
Error Diagnostics
Rust-style error output with:
- Primary/Secondary Labels: Multi-location error context
- Color-Coded Severity: Error, Warning, Info, Hint
- Source Context: Shows relevant source lines
- Error Codes: T-prefix for type errors, W-prefix for warnings
- Help Messages: Actionable suggestions for fixes
Example:
error[T0009]: array index out of bounds
  --> test.fer:5:9
   |
 5 |     arr[15]
   |         ~~ index 15 out of bounds for array of size 10
   |
   = note: array was defined with size 10
   = help: valid indices are 0 to 9, or -1 to -10 for reverse indexing
Design Decisions
Why Three IRs?
- AST: Preserves source structure for error reporting
- HIR: High-level IR maintains readability for optimization passes
- MIR: Mid-level SSA form, low-level enough for code generation
Why QBE?
- Mature Backend: Production-ready register allocation and optimization
- Multi-Architecture: Single IR for x86_64, aarch64, riscv64
- Simple Integration: Clean C API and straightforward IR format
- Fast Compilation: No LLVM overhead
Why Go for the Compiler?
- Concurrency: Built-in goroutines and channels for parallel compilation
- Memory Safety: GC eliminates memory bugs in compiler code
- Fast Compilation: Go compiles quickly, keeping development iteration fast
- Standard Library: Excellent regex, parsing, and file I/O support
Why Per-Module Phases?
- Incremental Compilation: Only recompile changed modules (future)
- Parallel Processing: Modules can progress independently
- Correct Dependencies: Topological ordering ensures correctness
- Cache Friendly: Each module’s artifacts can be cached
Performance Characteristics
- Parallel Parsing: Scales with available CPU cores
- Concurrent Reads: RWMutex lets multiple readers query phase state at once, blocking only while a writer holds the lock
- Minimal Allocations: Reuses symbol tables and scopes where possible
- Fast Type Checking: Constant evaluation uses big integers but is still fast
- Zero-Copy: Import paths used as map keys without string copying
Future Enhancements
- Incremental Compilation: Cache module artifacts between builds
- Language Server: IDE support with incremental reparsing
- Better Optimizations: More sophisticated MIR transformations
- LLVM Backend: Alternative to QBE for maximum optimization
- Build System Integration: First-class build tool support
Conclusion
Ferret’s architecture balances simplicity with sophistication. The multi-phase pipeline with proper concurrency control enables fast, correct compilation while maintaining clear separation of concerns. The three-IR design allows progressive lowering from high-level semantics to low-level machine code while preserving enough information for high-quality error messages.
The result is a compiler that’s fast, correct, and maintainable - just like the language it compiles.
Want to contribute to the compiler? Check out the GitHub repository and read the CONTRIBUTING.md guide.