Understanding the Mica compiler pipeline
Mica is designed to be a learning-sized compiler that you can read and understand in a weekend. The entire pipeline, from lexing to native code generation, is intentionally compact and well-documented, making it perfect for studying compiler implementation techniques.
Every stage of the compiler is exposed through CLI flags, allowing you to inspect intermediate representations and understand how your code transforms at each step.
The Mica compiler processes code through the following stages:
Lexing: Converts source text into a stream of tokens with precise span information. Handles keywords, operators, literals, and comments.
Output: Token stream with spans
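As a rough illustration of what "tokens with spans" can look like, here is a minimal sketch of a lexer that records byte offsets for every token. The `Span`, `TokenKind`, and `Token` types, the tiny operator set, and the `//` comment syntax are all assumptions made for this example; Mica's actual definitions may differ.

```rust
use std::iter::Peekable;
use std::str::CharIndices;

// Hypothetical types for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start: usize, // byte offset of the first character
    pub end: usize,   // byte offset one past the last character
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TokenKind { Ident, Int, Plus, Star, LParen, RParen, Unknown }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Token {
    pub kind: TokenKind,
    pub span: Span,
}

// Consume characters while `pred` holds.
fn eat_while(chars: &mut Peekable<CharIndices<'_>>, pred: impl Fn(char) -> bool) {
    while let Some(&(_, c)) = chars.peek() {
        if !pred(c) {
            break;
        }
        chars.next();
    }
}

pub fn lex(src: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut chars = src.char_indices().peekable();
    while let Some(&(start, c)) = chars.peek() {
        if c.is_whitespace() {
            chars.next();
            continue;
        }
        // Assumed comment syntax: `//` to end of line, producing no token.
        if src[start..].starts_with("//") {
            eat_while(&mut chars, |c| c != '\n');
            continue;
        }
        let kind = if c.is_ascii_digit() {
            eat_while(&mut chars, |d| d.is_ascii_digit());
            TokenKind::Int
        } else if c.is_alphabetic() || c == '_' {
            eat_while(&mut chars, |d| d.is_alphanumeric() || d == '_');
            TokenKind::Ident
        } else {
            chars.next();
            match c {
                '+' => TokenKind::Plus,
                '*' => TokenKind::Star,
                '(' => TokenKind::LParen,
                ')' => TokenKind::RParen,
                _ => TokenKind::Unknown,
            }
        };
        // The offset of the next character (or end of input) closes the span.
        let end = chars.peek().map_or(src.len(), |&(i, _)| i);
        tokens.push(Token { kind, span: Span { start, end } });
    }
    tokens
}
```

Because tokens carry only a kind and a span, the original text of any token can be recovered by slicing the source with `&src[span.start..span.end]`, which is also what diagnostics typically need.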
Parsing: Builds an Abstract Syntax Tree (AST) using recursive descent parsing with Pratt expression handling. Includes error recovery.
Output: AST with full source structure
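Pratt expression handling, mentioned above, assigns each operator a pair of binding powers and lets a single recursive function decide how far to extend the current expression. The sketch below shows the general technique on a toy arithmetic grammar. The `Tok` and `Expr` types and the binding-power values are assumptions for this example, and error recovery is omitted; it is not Mica's parser.

```rust
// Hypothetical token and AST types for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub enum Tok { Num(i64), Plus, Minus, Star, Slash, LParen, RParen }

#[derive(Debug, PartialEq)]
pub enum Expr {
    Num(i64),
    Neg(Box<Expr>),
    Binary(Box<Expr>, char, Box<Expr>),
}

// Prefix minus binds tighter than any binary operator in this sketch.
const PREFIX_BP: u8 = 5;

pub struct Parser {
    toks: Vec<Tok>,
    pos: usize,
}

impl Parser {
    pub fn new(toks: Vec<Tok>) -> Self {
        Parser { toks, pos: 0 }
    }

    fn peek(&self) -> Option<&Tok> {
        self.toks.get(self.pos)
    }

    fn bump(&mut self) -> Option<Tok> {
        let tok = self.toks.get(self.pos).cloned();
        self.pos += 1;
        tok
    }

    // Parse an expression whose operators all bind at least as tightly as `min_bp`.
    pub fn expr(&mut self, min_bp: u8) -> Result<Expr, String> {
        let mut lhs = match self.bump() {
            Some(Tok::Num(n)) => Expr::Num(n),
            Some(Tok::Minus) => Expr::Neg(Box::new(self.expr(PREFIX_BP)?)),
            Some(Tok::LParen) => {
                let inner = self.expr(0)?;
                match self.bump() {
                    Some(Tok::RParen) => inner,
                    other => return Err(format!("expected ')', found {:?}", other)),
                }
            }
            other => return Err(format!("expected expression, found {:?}", other)),
        };
        loop {
            // (left, right) binding powers; right > left gives left associativity.
            let (op, l_bp, r_bp) = match self.peek() {
                Some(Tok::Plus) => ('+', 1, 2),
                Some(Tok::Minus) => ('-', 1, 2),
                Some(Tok::Star) => ('*', 3, 4),
                Some(Tok::Slash) => ('/', 3, 4),
                _ => break,
            };
            if l_bp < min_bp {
                break;
            }
            self.bump();
            let rhs = self.expr(r_bp)?;
            lhs = Expr::Binary(Box::new(lhs), op, Box::new(rhs));
        }
        Ok(lhs)
    }
}
```

Calling `Parser::new(tokens).expr(0)` parses a full expression. Adding an operator means adding one row to the binding-power table rather than a new grammar rule, which is the main reason this technique pairs well with a recursive descent parser.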
Name resolution: Resolves names, builds symbol tables, and tracks capability usage. Handles imports and module graphs.
Output: Resolved symbols and capability metadata
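One common way to build symbol tables during name resolution is a stack of scopes, where lookups walk from the innermost scope outward. The sketch below assumes that approach. `Resolver` and `SymbolId` are hypothetical names, and imports, module graphs, and capability tracking are left out.

```rust
use std::collections::HashMap;

// Hypothetical identifier handed out for each declaration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SymbolId(pub u32);

pub struct Resolver {
    scopes: Vec<HashMap<String, SymbolId>>, // innermost scope is last
    next_id: u32,
}

impl Resolver {
    pub fn new() -> Self {
        // Start with a single global scope.
        Resolver { scopes: vec![HashMap::new()], next_id: 0 }
    }

    pub fn push_scope(&mut self) {
        self.scopes.push(HashMap::new());
    }

    pub fn pop_scope(&mut self) {
        // Never pop the global scope.
        if self.scopes.len() > 1 {
            self.scopes.pop();
        }
    }

    // Declare `name` in the innermost scope, shadowing any outer binding.
    pub fn declare(&mut self, name: &str) -> SymbolId {
        let id = SymbolId(self.next_id);
        self.next_id += 1;
        self.scopes
            .last_mut()
            .expect("the global scope always exists")
            .insert(name.to_string(), id);
        id
    }

    // Look up `name`, searching from the innermost scope outward.
    pub fn resolve(&self, name: &str) -> Option<SymbolId> {
        self.scopes.iter().rev().find_map(|scope| scope.get(name).copied())
    }
}
```

A `None` result from `resolve` is where an "undefined name" diagnostic would be reported.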
Type checking: Performs type inference, exhaustiveness checking, and effect verification. Ensures capability contracts are satisfied.
Output: Fully typed and checked program
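At its simplest, checking that a capability contract is satisfied can be framed as a set comparison: every capability a function body uses must appear in the set its signature declares. The sketch below illustrates only that idea. `FnSig` and the capability names `io` and `net` are made up for the example, and Mica's real checker presumably works on richer type and effect information.

```rust
use std::collections::BTreeSet;

// Hypothetical summary of one function, as a checker might see it.
pub struct FnSig<'a> {
    pub name: &'a str,
    pub declared: BTreeSet<&'a str>, // capabilities listed in the signature
    pub used: BTreeSet<&'a str>,     // capabilities the body actually uses
}

// Return every capability that is used but not declared, in sorted order.
// An empty result means the contract is satisfied.
pub fn undeclared_capabilities<'a>(sig: &FnSig<'a>) -> Vec<&'a str> {
    sig.used.difference(&sig.declared).copied().collect()
}
```

Each name in the returned list would correspond to one diagnostic pointing at the offending use.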
HIR lowering: Transforms AST to High-level Intermediate Representation. Desugars method calls, effects, and control flow.
Output: Simplified HIR
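To make "desugars method calls" concrete, the sketch below rewrites a method call `recv.method(args)` into a plain call with the receiver as the first argument. This is one common desugaring, assumed here for illustration. The `Ast` and `Hir` enums are hypothetical and much smaller than a real compiler's.

```rust
// Hypothetical surface syntax tree.
#[derive(Debug, Clone, PartialEq)]
pub enum Ast {
    Var(String),
    Int(i64),
    MethodCall { receiver: Box<Ast>, method: String, args: Vec<Ast> },
}

// Hypothetical HIR: no method calls, only plain function calls.
#[derive(Debug, Clone, PartialEq)]
pub enum Hir {
    Var(String),
    Int(i64),
    Call { func: String, args: Vec<Hir> },
}

pub fn lower(ast: &Ast) -> Hir {
    match ast {
        Ast::Var(name) => Hir::Var(name.clone()),
        Ast::Int(n) => Hir::Int(*n),
        Ast::MethodCall { receiver, method, args } => {
            // The receiver becomes the first argument of an ordinary call.
            let mut lowered = Vec::with_capacity(args.len() + 1);
            lowered.push(lower(receiver));
            lowered.extend(args.iter().map(lower));
            Hir::Call { func: method.clone(), args: lowered }
        }
    }
}
```

After this pass, later stages only need to handle one kind of call, which is the usual motivation for desugaring before building lower-level IR.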
Typed SSA IR: Converts HIR to typed SSA (Static Single Assignment) form with basic blocks and instructions.
Output: Typed SSA IR
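In SSA form every value is assigned exactly once, so lowering an expression tree amounts to emitting one instruction per node and giving each result a fresh value name. The following is a minimal sketch of that idea for a single basic block. All type names (`Ty`, `Value`, `Inst`, `Block`) are assumptions for the example, and control flow, which would require multiple blocks, is not shown.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Ty { Int }

// An SSA value name such as v0, v1, ...
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Value(pub u32);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Op { Add, Mul }

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Inst {
    Const { dst: Value, ty: Ty, value: i64 },
    Binary { dst: Value, ty: Ty, op: Op, lhs: Value, rhs: Value },
    Ret(Value), // terminator: always the last instruction of the block
}

// Hypothetical input expression tree.
pub enum Expr {
    Int(i64),
    Binary(Op, Box<Expr>, Box<Expr>),
}

#[derive(Debug, Default)]
pub struct Block {
    pub insts: Vec<Inst>,
    next_value: u32,
}

impl Block {
    // Each destination gets a fresh name, so no value is assigned twice.
    fn fresh(&mut self) -> Value {
        let v = Value(self.next_value);
        self.next_value += 1;
        v
    }

    pub fn lower(&mut self, expr: &Expr) -> Value {
        match expr {
            Expr::Int(n) => {
                let dst = self.fresh();
                self.insts.push(Inst::Const { dst, ty: Ty::Int, value: *n });
                dst
            }
            Expr::Binary(op, l, r) => {
                let lhs = self.lower(l);
                let rhs = self.lower(r);
                let dst = self.fresh();
                self.insts.push(Inst::Binary { dst, ty: Ty::Int, op: *op, lhs, rhs });
                dst
            }
        }
    }
}

// Lower a function body into one block that returns the body's value.
pub fn lower_body(body: &Expr) -> Block {
    let mut block = Block::default();
    let result = block.lower(body);
    block.insts.push(Inst::Ret(result));
    block
}
```

For `(1 + 2) * 3` this produces two constants, an add, a third constant, a multiply, and a return, with operands always defined before they are used.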
Code generation: Generates native code via LLVM or portable C backend. Links with runtime capability providers.
Output: Native binary
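A C backend can be surprisingly direct when the input is already in SSA form: each SSA value becomes one local variable that is written once. The sketch below emits C source text for a tiny, hypothetical instruction set. The `Inst` type, the `v<N>` naming scheme, and the choice of `int64_t` are assumptions for the example and say nothing about how Mica's own backend is organized.

```rust
// Hypothetical straight-line SSA instructions over 64-bit integers.
pub enum Inst {
    Const { dst: u32, value: i64 },
    Add { dst: u32, lhs: u32, rhs: u32 },
    Ret(u32),
}

// Emit a C function with no parameters whose body mirrors `insts`.
pub fn emit_c(fn_name: &str, insts: &[Inst]) -> String {
    let mut out = String::from("#include <stdint.h>\n\n");
    out.push_str(&format!("int64_t {}(void) {{\n", fn_name));
    for inst in insts {
        let line = match inst {
            // INT64_C is the standard <stdint.h> macro for 64-bit literals.
            Inst::Const { dst, value } => format!("int64_t v{} = INT64_C({});", dst, value),
            Inst::Add { dst, lhs, rhs } => format!("int64_t v{} = v{} + v{};", dst, lhs, rhs),
            Inst::Ret(v) => format!("return v{};", v),
        };
        out.push_str("    ");
        out.push_str(&line);
        out.push('\n');
    }
    out.push_str("}\n");
    out
}
```

The resulting text could then be handed to any C compiler. A real backend would also need to handle types other than integers, multiple basic blocks, and edge cases such as the most negative literal.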
Runtime: Executes the program with capability-aware runtime shims. Captures telemetry and enforces deterministic concurrency.
Output: Program results and telemetry
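One simple way to get deterministic concurrency is a single-threaded cooperative scheduler that always runs tasks in the order they were spawned, so the same program produces the same interleaving on every run. The sketch below assumes that model, which may not match Mica's actual scheduler. `Scheduler` and its `log` field (standing in for telemetry) are hypothetical.

```rust
use std::collections::VecDeque;

// A task runs once and may spawn further tasks through the scheduler.
type Task = Box<dyn FnOnce(&mut Scheduler)>;

pub struct Scheduler {
    queue: VecDeque<Task>,
    pub log: Vec<String>, // stand-in for captured telemetry
}

impl Scheduler {
    pub fn new() -> Self {
        Scheduler { queue: VecDeque::new(), log: Vec::new() }
    }

    // Tasks are appended to the back of the queue.
    pub fn spawn(&mut self, task: impl FnOnce(&mut Scheduler) + 'static) {
        self.queue.push_back(Box::new(task));
    }

    // Run tasks strictly in FIFO order until none remain.
    pub fn run(&mut self) {
        while let Some(task) = self.queue.pop_front() {
            task(self);
        }
    }
}
```

Because the run order depends only on spawn order, a recorded log can be compared across runs, which is what makes this style of scheduling convenient for testing.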
The repository is structured for easy navigation:
src/syntax/: Lexer, parser, and pretty-printer for the front-end
src/semantics/: Name resolution, type checking, and effect analysis
src/lower/: AST to HIR transformation and desugaring
src/ir/: Typed SSA intermediate representation
src/backend/: Code generation targets (LLVM, native C)
src/runtime/: Capability providers and task scheduler
src/diagnostics/: Error reporting and structured warnings
src/main.rs: Command-line interface and tooling
src/pretty/: Code formatting and CST handling
src/tests/: Integration and unit test suites
examples/: Runnable sample programs
docs/: Guides, roadmap, and module docs
Mica development follows a phased approach:
Lexer, parser, and basic tooling infrastructure
Name resolution, type checking, and effect system
HIR lowering, typed SSA IR, and purity analysis
Code generation, runtime shims, and deterministic scheduler
Formatter, LSP server, and developer experience tools
Standard library, package manager, and interop adapters
Community building, RFC process, and ecosystem expansion
The Mica compiler follows these core principles:
The entire codebase is designed to be readable in a weekend. Every module is intentionally compact and well-documented.
Every compiler stage is exposed through CLI flags, making it easy to understand transformations and debug issues.
Capabilities and effects are first-class, tracked through the entire pipeline from parsing to runtime.
Concurrency is structured and deterministic by default, with explicit capability requirements.
Comprehensive test suites cover every stage with golden snapshots and negative test cases.
Clean interfaces and modular design make it easy to add new features or experiment with alternatives.
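The golden snapshot tests mentioned in the principles above generally work by comparing a stage's printed output against a stored expected file. A minimal sketch of the comparison step follows. `check_golden` is a hypothetical helper, and how Mica's test suite stores or updates its snapshots is not shown here.

```rust
// Compare actual output with a golden snapshot, line by line.
// Returns the first differing line (1-based) on mismatch.
pub fn check_golden(actual: &str, golden: &str) -> Result<(), String> {
    let mut actual_lines = actual.lines();
    let mut golden_lines = golden.lines();
    let mut line_no = 1;
    loop {
        match (actual_lines.next(), golden_lines.next()) {
            (None, None) => return Ok(()),
            (a, g) if a == g => line_no += 1,
            (a, g) => {
                return Err(format!("line {}: expected {:?}, got {:?}", line_no, g, a));
            }
        }
    }
}
```

In a test, the golden text would typically come from a file checked into the repository (for example via `include_str!`), and a negative test case would assert that compiling an invalid program yields the expected diagnostic text.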
For detailed information about each compiler module: