Improvise Design Principles
1. Functional-First Architecture
Commands Are Pure, Effects Are Side-Effectful
Every user action flows through a two-phase pipeline:
- Command (Cmd trait): reads immutable context, returns a list of effects. The CmdContext is a read-only snapshot: model, layout, mode, cursor position. Commands never touch &mut App. All decision logic is pure.
- Effect (Effect trait): a small struct with an apply(&self, app: &mut App) method. Each effect is one discrete, debuggable state change. The app applies them in order.
This separation means:
- Commands are testable without a terminal or an App instance.
- Effects can be logged, replayed, or composed.
- The only place App is mutated is inside Effect::apply.
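The two-phase pipeline can be sketched as follows. This is a minimal illustration, not the project's actual definitions: the App and CmdContext fields, the MoveDown command, the SetCursorRow effect, and the apply_all helper are all hypothetical stand-ins chosen to keep the example small.

```rust
// Hypothetical, simplified stand-ins for the real App and CmdContext.
#[derive(Debug, Default, PartialEq)]
pub struct App {
    pub cursor_row: usize,
    pub row_count: usize,
}

/// Read-only snapshot handed to commands.
pub struct CmdContext {
    pub cursor_row: usize,
    pub row_count: usize,
}

/// One discrete state change.
pub trait Effect {
    fn apply(&self, app: &mut App);
}

/// Pure decision logic: context in, effects out.
pub trait Cmd {
    fn execute(&self, ctx: &CmdContext) -> Vec<Box<dyn Effect>>;
}

/// Hypothetical effect: move the cursor to an absolute row.
pub struct SetCursorRow(pub usize);

impl Effect for SetCursorRow {
    fn apply(&self, app: &mut App) {
        app.cursor_row = self.0;
    }
}

/// Hypothetical command: move the cursor down one row if possible.
pub struct MoveDown;

impl Cmd for MoveDown {
    fn execute(&self, ctx: &CmdContext) -> Vec<Box<dyn Effect>> {
        // Decide without mutating anything; on the last row, emit no effects.
        if ctx.cursor_row + 1 < ctx.row_count {
            vec![Box::new(SetCursorRow(ctx.cursor_row + 1))]
        } else {
            Vec::new()
        }
    }
}

/// The only place in this sketch where App is mutated.
pub fn apply_all(app: &mut App, effects: &[Box<dyn Effect>]) {
    for effect in effects {
        effect.apply(app);
    }
}
```

Because MoveDown only reads a CmdContext, a test can build the context by hand and assert on the number and kind of effects returned, with no App or terminal involved.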
Prefer Transformations to Mutation
Where possible, build new values rather than mutating in place:
- CellKey::with(cat, item) returns a new key with an added/replaced coordinate.
- CellKey::without(cat) returns a new key with a coordinate removed.
- Viewport positioning is computed as a pure function (viewport_effects) that returns a Vec<Effect>, not a method that pokes at scroll offsets directly.
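A rough sketch of the transformation style for keys, assuming a key is stored as a list of (category, item) string pairs. The real CellKey representation and method signatures may differ; only the names with, without, and new come from this document.

```rust
/// Hypothetical simplified key: (category, item) pairs kept sorted by category.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct CellKey(Vec<(String, String)>);

impl CellKey {
    /// Sorts on construction so that coordinate order never matters.
    pub fn new(mut coords: Vec<(String, String)>) -> Self {
        coords.sort_by(|a, b| a.0.cmp(&b.0));
        CellKey(coords)
    }

    /// Returns a new key with `cat` set to `item`; `self` is left untouched.
    pub fn with(&self, cat: &str, item: &str) -> Self {
        let mut coords: Vec<(String, String)> = self
            .0
            .iter()
            .filter(|(c, _)| c.as_str() != cat)
            .cloned()
            .collect();
        coords.push((cat.to_string(), item.to_string()));
        CellKey::new(coords)
    }

    /// Returns a new key with the coordinate for `cat` removed.
    pub fn without(&self, cat: &str) -> Self {
        // Removing an element from a sorted list keeps it sorted.
        CellKey(
            self.0
                .iter()
                .filter(|(c, _)| c.as_str() != cat)
                .cloned()
                .collect(),
        )
    }
}
```

Since both methods take &self and return a fresh value, the caller can keep the original key around, for example to compare the cell before and after a coordinate change.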
Compose Small Pieces
Commands compose via Binding::Sequence — a keymap entry can chain multiple
commands, each contributing effects independently. The o key (add row + begin
editing) is two commands composed at the binding level, not a monolithic handler.
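To illustrate composition at the binding level, here is a small sketch. The command names ("add-row", "begin-edit"), the effect names, and the run_command stand-in are invented for the example, and the Prefix variant of the real Binding enum is left out.

```rust
/// Hypothetical simplified binding; the real enum also has a Prefix variant.
pub enum Binding {
    Cmd(&'static str),
    Sequence(Vec<&'static str>),
}

/// Stand-in for executing one named command; returns the names of the
/// effects it would produce. Names here are illustrative only.
fn run_command(name: &str) -> Vec<String> {
    match name {
        "add-row" => vec!["InsertRow".to_string()],
        "begin-edit" => vec!["EnterEditMode".to_string()],
        _ => Vec::new(),
    }
}

/// A sequence contributes the effects of each command, in order.
pub fn dispatch(binding: &Binding) -> Vec<String> {
    match binding {
        Binding::Cmd(name) => run_command(name),
        Binding::Sequence(names) => names.iter().flat_map(|n| run_command(n)).collect(),
    }
}
```

Under these assumptions, binding the o key to Binding::Sequence(vec!["add-row", "begin-edit"]) yields both effects without either command knowing about the other.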
2. Polymorphism Over Conditionals
Dispatch Through Traits and Registries, Not Match Blocks
- Commands: 40+ types each implement Cmd, organized by concern across submodules in command/cmd/ (navigation, cell, commit, grid, mode, panel, search, text_buffer, tile, effect_cmds). A CmdRegistry maps names to constructor closures. Dispatching a key press looks up the binding, resolves the command name through the registry, and calls execute. No central match command_name { ... } block.
- Effects: 50+ types each implement Effect. Collected into a Vec<Box<dyn Effect>> and applied in order. No match effect_kind { ... }.
- Keymaps: Each mode has its own Keymap (a HashMap<KeyPattern, Binding>). Mode dispatch is one table lookup, not a nested match (mode, key).
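The registry idea can be sketched like this. The Cmd trait is reduced to a single name method, and the register and build method names are assumptions made for the example; the actual CmdRegistry API is not shown in this document.

```rust
use std::collections::HashMap;

/// Reduced stand-in for the real Cmd trait.
pub trait Cmd {
    fn name(&self) -> &'static str;
}

pub struct MoveUp;
pub struct MoveDown;

impl Cmd for MoveUp {
    fn name(&self) -> &'static str {
        "move-up"
    }
}

impl Cmd for MoveDown {
    fn name(&self) -> &'static str {
        "move-down"
    }
}

type Constructor = Box<dyn Fn() -> Box<dyn Cmd>>;

/// Maps command names to constructor closures.
#[derive(Default)]
pub struct CmdRegistry {
    constructors: HashMap<&'static str, Constructor>,
}

impl CmdRegistry {
    /// Hypothetical registration method.
    pub fn register(&mut self, name: &'static str, ctor: impl Fn() -> Box<dyn Cmd> + 'static) {
        self.constructors.insert(name, Box::new(ctor));
    }

    /// One table lookup takes the place of a central `match command_name { ... }`.
    pub fn build(&self, name: &str) -> Option<Box<dyn Cmd>> {
        self.constructors.get(name).map(|ctor| ctor())
    }
}
```

Adding a command in this scheme means one new type and one register call; no existing dispatch code has to be edited.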
Use Enums to Make Invalid States Unrepresentable
- BinOp is an enum (Add | Sub | Mul | ...), not a string. Invalid operators are caught at parse time, not silently ignored at eval time.
- Axis is Row | Column | Page | None. A category is on exactly one axis. Cycling is a four-state rotation: no boolean flags, no "row_or_column" ambiguity.
- Binding is Cmd | Prefix | Sequence. The keymap lookup returns one of these three shapes; dispatch pattern-matches exhaustively.
- CategoryKind is Regular | VirtualIndex | VirtualDim | VirtualMeasure | Label. Business rules (e.g., the 12-category limit counts only Regular) are enforced by matching on the enum, not by checking name prefixes. Virtual categories (_Index, _Dim, _Measure) always exist: _Index and _Dim support drill-down/records mode; _Measure holds numeric data fields and formula targets (added automatically by add_formula). Use Model::regular_category_names() when selecting a default category for prompts or other user-visible choices.
When You Add a Variant, the Compiler Finds Every Call Site
Prefer exhaustive match over if let or _ => wildcards. When a new Axis
variant or AppMode is added, non-exhaustive matches produce compile errors
that guide you to every place that needs updating.
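A small sketch of the Axis rotation written with an exhaustive match. The variant names come from this document; the cycle method name and the rotation order are assumptions.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Axis {
    Row,
    Column,
    Page,
    None,
}

impl Axis {
    /// Four-state rotation. The order shown here is an assumption.
    pub fn cycle(self) -> Axis {
        // No `_ =>` arm: adding a fifth variant makes this match fail to
        // compile until the new case is handled.
        match self {
            Axis::Row => Axis::Column,
            Axis::Column => Axis::Page,
            Axis::Page => Axis::None,
            Axis::None => Axis::Row,
        }
    }
}
```

Had the match ended in a wildcard arm, a new variant would silently fall into it and the compiler would have nothing to report.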
3. Correctness by Construction
Canonical Forms Prevent Equivalence Bugs
CellKey::new() sorts coordinates by category name. Two keys that name the same
intersection but in different order are identical after construction. Equality,
hashing, and storage all work correctly without callers needing to remember to
sort. Property tests verify this invariant.
Smart Constructors Enforce Invariants
- CellKey::new() is the only way to build a key; it always sorts.
- Category::add_item() deduplicates by name and auto-assigns IDs via a private counter. External code cannot fabricate an ItemId.
- Model::add_category() checks the 12-category limit before insertion.
- Formula::new() takes all required fields; there is no default/empty formula to accidentally leave half-initialized.
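As an illustration of the add_item behavior described above, here is a minimal sketch. It models ItemId as a newtype with a private field so that outside code cannot construct one; that choice, the field layout of Category, and the len helper are assumptions of the sketch, not a description of the real types.

```rust
/// Private field: code outside this module cannot build an ItemId by hand.
/// (Modeling ItemId as a newtype is an assumption of this sketch.)
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct ItemId(usize);

#[derive(Debug, Default)]
pub struct Category {
    items: Vec<(ItemId, String)>,
    next_id: usize, // private counter
}

impl Category {
    /// Returns the existing ID when the name is already present,
    /// otherwise assigns the next ID from the private counter.
    pub fn add_item(&mut self, name: &str) -> ItemId {
        if let Some((id, _)) = self.items.iter().find(|(_, n)| n.as_str() == name) {
            return *id;
        }
        let id = ItemId(self.next_id);
        self.next_id += 1;
        self.items.push((id, name.to_string()));
        id
    }

    /// Hypothetical helper used to observe deduplication.
    pub fn len(&self) -> usize {
        self.items.len()
    }
}
```

Callers never see the counter, so the invariant (one ID per distinct name) holds no matter how add_item is called.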
Type-Safe Identifiers
CategoryId and ItemId are typed aliases. While they are usize underneath,
using named types signals intent and makes it less likely that an item count
is passed where an item ID is expected.
Symbol Interning for Data Integrity
DataStore interns category and item names into Symbol values (small copyable
handles). This means:
- String comparison is integer comparison: fast and allocation-free.
- A secondary index maps (Symbol, Symbol) pairs to cell sets, enabling O(1) lookups for aggregation queries.
- Symbols can only be created through the SymbolTable, so misspelled names produce a distinct symbol rather than silently matching a wrong cell.
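A minimal interning sketch under simple assumptions: Symbol wraps a u32 index, and the table keeps a name-to-symbol map plus a reverse list. The intern and resolve method names are hypothetical; only Symbol and SymbolTable are named in this document.

```rust
use std::collections::HashMap;

/// Small copyable handle; comparing two symbols is an integer comparison.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct Symbol(u32);

#[derive(Debug, Default)]
pub struct SymbolTable {
    ids: HashMap<String, Symbol>,
    names: Vec<String>,
}

impl SymbolTable {
    /// Returns the existing symbol for `name`, or allocates a new one.
    pub fn intern(&mut self, name: &str) -> Symbol {
        if let Some(&sym) = self.ids.get(name) {
            return sym;
        }
        let sym = Symbol(self.names.len() as u32);
        self.names.push(name.to_string());
        self.ids.insert(name.to_string(), sym);
        sym
    }

    /// Looks up the text behind a symbol created by this table.
    pub fn resolve(&self, sym: Symbol) -> &str {
        &self.names[sym.0 as usize]
    }
}
```

A misspelled name such as "Revenu" gets its own symbol here, so it can never compare equal to the symbol for "Revenue".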
Parse-Time Validation
Formulas are parsed into a typed AST (Expr enum) at entry time. If the syntax
is invalid, the user gets an error immediately. The evaluator only sees
well-formed trees — it does not need to handle malformed input.
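The effect of validating at parse time can be shown on a single operator token. In this sketch the Div variant, the parse_op and apply_op function names, and the error text are assumptions; the document only names Add, Sub, and Mul.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BinOp {
    Add,
    Sub,
    Mul,
    Div, // assumed; the document lists "Add | Sub | Mul | ..."
}

/// Rejects unknown operators up front, at entry time.
pub fn parse_op(token: &str) -> Result<BinOp, String> {
    match token {
        "+" => Ok(BinOp::Add),
        "-" => Ok(BinOp::Sub),
        "*" => Ok(BinOp::Mul),
        "/" => Ok(BinOp::Div),
        other => Err(format!("unknown operator: {other}")),
    }
}

/// The evaluator has no error path for operators: every BinOp is valid.
pub fn apply_op(op: BinOp, lhs: f64, rhs: f64) -> f64 {
    match op {
        BinOp::Add => lhs + rhs,
        BinOp::Sub => lhs - rhs,
        BinOp::Mul => lhs * rhs,
        BinOp::Div => lhs / rhs,
    }
}
```

The error case lives only in parse_op; apply_op takes a BinOp and therefore cannot be handed an operator it does not understand.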
Grammar-Defined File Format
The .improv file format is defined by a PEG grammar (persistence/improv.pest)
and parsed by pest. The grammar is the single source of truth — the parser is a
tree-walker over the grammar's parse tree, not an ad-hoc line scanner. This means:
- Adding a new format feature means updating the grammar first, then the walker.
- The grammar can be read as a specification independent of the Rust code.
- A grammar-walking test generator reads the grammar AST at test time (via pest_meta) and produces random valid files, ensuring the parser accepts everything the grammar describes.
CL-Style Pipe Quoting for Names
Names in the .improv format use CL-style |...| pipe quoting. A name is bare
if it matches [A-Za-z_][A-Za-z0-9_-]*; everything else must be pipe-quoted.
Escapes inside pipes: \| (literal pipe), \\ (backslash), \n (newline).
This convention is shared between the .improv persistence format and the
formula parser's identifier syntax.
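The quoting rule above can be written as a short writer-side helper. The is_bare and quote_name function names are hypothetical, and the sketch covers only the bare-name pattern and the three escapes listed here; any further rules the real grammar imposes are not modeled.

```rust
/// True if `name` matches [A-Za-z_][A-Za-z0-9_-]*.
pub fn is_bare(name: &str) -> bool {
    let mut chars = name.chars();
    match chars.next() {
        Some(c) if c.is_ascii_alphabetic() || c == '_' => {
            chars.all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '-')
        }
        _ => false,
    }
}

/// Emits a bare name unchanged and pipe-quotes everything else.
pub fn quote_name(name: &str) -> String {
    if is_bare(name) {
        return name.to_string();
    }
    let mut out = String::from("|");
    for c in name.chars() {
        match c {
            '|' => out.push_str("\\|"),   // literal pipe
            '\\' => out.push_str("\\\\"), // backslash
            '\n' => out.push_str("\\n"),  // newline
            other => out.push(other),
        }
    }
    out.push('|');
    out
}
```

For example, quote_name("Total_2024") returns the name unchanged, while quote_name("Revenue (USD)") returns |Revenue (USD)| and a name starting with a digit is quoted because it fails the first-character rule.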
Formula Tokenizer: Identifiers and Quoting
Bare identifiers support multi-word names (e.g., Total Revenue) by
allowing spaces when followed by non-operator, non-keyword characters. Keywords
(WHERE, SUM, AVG, MIN, MAX, COUNT, IF) act as token boundaries.
Pipe-quoted identifiers (|...|) allow any characters — including spaces,
keywords, and operators — inside the delimiters. Use pipes when a category or
item name collides with a keyword or contains special characters:
|WHERE| — category named "WHERE"
|Revenue (USD)| — name with parens
|Cost + Tax| — name with operator chars
SUM(|Net Revenue| WHERE |Region Name| = |East Coast|)
Pipes produce Token::Ident (same as bare identifiers), so they work
everywhere an identifier is expected: expressions, aggregate arguments, WHERE
clause category names and filter values. Double-quoted strings ("...")
remain Token::Str and are used only for WHERE filter values in the
split_where pre-parse step.
4. Separation of Concerns
Four Layers
| Layer | Directory | Responsibility |
|---|---|---|
| Model | src/model/ | Categories, items, groups, cell data, formulas. Pure data, no rendering. |
| View | src/view/ | Axis assignments, page selection, hidden items, layout computation. Derived from model. |
| Command / Effect | src/command/, src/ui/effect.rs | Intent (commands) and state mutation (effects). Bridges user input to model changes. |
| Rendering | src/draw.rs, src/ui/ | Terminal drawing. Reads model + view, writes pixels. No mutation. |
Formulas Are Data, Not Code
A formula is a serializable struct: raw text, target name, category, AST, optional filter. It is stored in the model alongside regular data. The evaluator walks the AST at read time. Formulas never become closures or runtime-generated code.
Formula Evaluation Is Fixed-Point
recompute_formulas(none_cats) iterates formula evaluation until values
stabilize. Each pass evaluates all formula cells using the current cache
(for formula-derived values) and raw data aggregation (for data values).
This avoids recursive evaluation through evaluate_aggregated and
naturally handles chained formulas (Margin = Profit / Revenue where
Profit = Revenue - Cost). Circular references never stabilize; once
MAX_EVAL_DEPTH iterations have passed without convergence, the affected cells
resolve to CellValue::Error("circular").
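The fixed-point loop can be sketched on a toy model. Everything here is a simplification made for illustration: formulas are a single binary operation over named values, a not-yet-computed operand reads as 0.0, MAX_EVAL_DEPTH is given an arbitrary value of 16, and non-convergence is reported as one Err for the whole run rather than as a per-cell CellValue::Error.

```rust
use std::collections::HashMap;

const MAX_EVAL_DEPTH: usize = 16; // assumed value for this sketch

/// Hypothetical formula shape: target = lhs (op) rhs.
pub struct Formula {
    pub target: &'static str,
    pub lhs: &'static str,
    pub op: char,
    pub rhs: &'static str,
}

/// Formula-derived values come from the cache, raw values from `data`.
/// A value that is not available yet reads as 0.0 (a simplification).
fn lookup(
    cache: &HashMap<&'static str, f64>,
    data: &HashMap<&'static str, f64>,
    name: &str,
) -> f64 {
    cache.get(name).or_else(|| data.get(name)).copied().unwrap_or(0.0)
}

/// Re-evaluates every formula until a full pass changes nothing.
pub fn recompute(
    data: &HashMap<&'static str, f64>,
    formulas: &[Formula],
) -> Result<HashMap<&'static str, f64>, String> {
    let mut cache: HashMap<&'static str, f64> = HashMap::new();
    for _ in 0..MAX_EVAL_DEPTH {
        let mut changed = false;
        for f in formulas {
            let l = lookup(&cache, data, f.lhs);
            let r = lookup(&cache, data, f.rhs);
            let value = match f.op {
                '+' => l + r,
                '-' => l - r,
                '*' => l * r,
                _ => l / r, // any other op is treated as division here
            };
            // `insert` returns the previous value, if any.
            if cache.insert(f.target, value) != Some(value) {
                changed = true;
            }
        }
        if !changed {
            return Ok(cache);
        }
    }
    Err("circular".to_string())
}
```

With Margin listed before Profit, the first pass computes Margin from a missing Profit, the second pass corrects it, and the third pass sees no change and stops. No recursion or dependency ordering is needed. Two formulas that refer to each other keep changing on every pass and hit the iteration cap instead.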
Display Rounding Is View-Only
Number formatting (format_f64) rounds for display. Formula evaluation always
operates on full f64 precision. The rounding function is only called in
rendering paths — never in eval_formula or aggregation.
Drill State Isolates Edits
When editing aggregated (drill-down) cells, a DrillState snapshot freezes the
current cell set. Pending edits accumulate in a staging map. On commit,
ApplyAndClearDrill writes them all atomically. On cancel, the snapshot is
discarded. No partial writes reach the model.
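A sketch of the staging idea, assuming a toy model that maps integer cell IDs to values. The field names and the begin, stage, and commit methods are hypothetical; in the real code the commit step is the ApplyAndClearDrill effect.

```rust
use std::collections::HashMap;

/// Hypothetical simplified model: cell id -> value.
pub type Model = HashMap<u32, f64>;

pub struct DrillState {
    snapshot: Vec<u32>,         // frozen set of cells being edited
    pending: HashMap<u32, f64>, // staged edits, not yet visible in the model
}

impl DrillState {
    /// Freezes the current cell set.
    pub fn begin(model: &Model) -> Self {
        let mut snapshot: Vec<u32> = model.keys().copied().collect();
        snapshot.sort_unstable();
        DrillState { snapshot, pending: HashMap::new() }
    }

    /// Stages an edit; cells outside the snapshot are rejected.
    pub fn stage(&mut self, cell: u32, value: f64) -> bool {
        if self.snapshot.contains(&cell) {
            self.pending.insert(cell, value);
            true
        } else {
            false
        }
    }

    /// Commit: consumes the state and writes every staged edit in one step.
    pub fn commit(self, model: &mut Model) {
        model.extend(self.pending);
    }
    // Cancel needs no code in this sketch: dropping the DrillState
    // discards the snapshot and all pending edits.
}
```

Because commit takes self by value, the staging map cannot be reused after it has been applied, and the model is untouched until that single call.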
5. Guidelines for New Code
- Add a command, not a special case. If you need new behavior on a keypress, implement Cmd, register it, and bind it in the keymap. Do not add an if key == 'x' branch inside handle_key.
- Return effects, do not mutate. Your command's execute receives &CmdContext (immutable). Produce a Vec<Box<dyn Effect>>. If you need a new kind of state change, create a new Effect struct.
- Use the type system. If a value can only be one of N things, make it an enum. If an invariant must hold, enforce it in the constructor. If a field is optional, use Option; do not use sentinel values.
- Test the logic, not the wiring. Commands are pure functions of context; test them by building a CmdContext and asserting on the returned effects. You do not need a terminal.
- Keep Option/Result/Box at the boundaries. Core logic should work with concrete values. Wrap in Option at the edges (parsing, lookup, I/O) and unwrap early. Do not thread Option through deep call chains.
6. Testing
Coverage and ambition
Aim for ~80% line and branch coverage on logic code. This is a quality floor — go higher where the code is tricky or load-bearing, but don't pad coverage by testing trivial getters or chasing 100% on rendering widgets. The test suite should remain fast (under 2 seconds). Slow tests erode the habit of running them.
Demonstrate bugs before fixing them
Write a test that fails on the current code before writing the fix. Prefer a
small unit test targeting the broken function over an end-to-end test. After the
fix, the test stays as a regression guard. Document the bug in the test's
doc-comment (see model/types.rs → formula_tests for examples).
Use property tests judiciously
Property tests (proptest) are for invariants that must hold across all
inputs — not a replacement for example-based tests. Good candidates:
- Structural invariants: CellKey is always sorted, each category lives on exactly one axis, toggle-collapse is involutive, hide/show roundtrips.
- Serialization roundtrips: save/load identity.
- Determinism: evaluate returns the same result for the same inputs.
Keep case counts at the default (256). Don't crank them to thousands — if a property needs more cases to feel safe, constrain the input space with better strategies rather than brute-forcing. Property tests that take hundreds of milliseconds each are a sign something is wrong.
What to test
- Model, formula, view: the core logic. Unit tests for each operation and edge case. Property tests for invariants. These are the highest-value tests.
- Commands: build a CmdContext, call execute, assert on the returned effects. Pure functions, no terminal needed. Tests are colocated in each command submodule (command/cmd/<module>.rs → mod tests), with shared test helpers in command/cmd/mod.rs::test_helpers.
- Persistence: round-trip tests (save → load → save produces identical output) plus grammar-driven property tests. The generator walks the pest grammar AST to produce random valid files; proptests verify parse(generate()) succeeds and parse(format(parse(generate()))) is stable. Cover groups, formulas, views, hidden items, pipe quoting edges.
- Format: boundary cases for comma placement, rounding, negative numbers.
- Import: field classification heuristics, CSV quoting, multi-file merge.
What not to test
- Ratatui Widget::render implementations: pure drawing code that changes often. Test the data they consume (layout, cat_tree, format) instead.
- Trivial data definitions (ast.rs, axis.rs).
- Module re-export files.
Test the property, not the implementation
A test like "calling set_axis(cat, Row) sets the internal map entry to Row"
is brittle — it mirrors the implementation and breaks if the storage changes.
Instead test the observable contract: "after set_axis(cat, Row),
axis_of(cat) returns Row and categories_on(Row) includes cat." This
style survives refactoring and catches real bugs.
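A contract-style test might look like the following sketch. The View struct and its HashMap storage are hypothetical stand-ins; only the method names set_axis, axis_of, and categories_on come from the text above.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Axis {
    Row,
    Column,
    Page,
    None,
}

/// Hypothetical view; the storage is private and could change freely.
#[derive(Default)]
pub struct View {
    axes: HashMap<String, Axis>,
}

impl View {
    pub fn set_axis(&mut self, cat: &str, axis: Axis) {
        self.axes.insert(cat.to_string(), axis);
    }

    /// Unassigned categories report Axis::None (an assumption of this sketch).
    pub fn axis_of(&self, cat: &str) -> Axis {
        self.axes.get(cat).copied().unwrap_or(Axis::None)
    }

    pub fn categories_on(&self, axis: Axis) -> Vec<&str> {
        let mut cats: Vec<&str> = self
            .axes
            .iter()
            .filter(|(_, a)| **a == axis)
            .map(|(c, _)| c.as_str())
            .collect();
        cats.sort_unstable();
        cats
    }
}

#[test]
fn set_axis_is_observable_through_queries() {
    let mut view = View::default();
    view.set_axis("Region", Axis::Row);
    // Only public queries are used; the test never inspects `axes`.
    assert_eq!(view.axis_of("Region"), Axis::Row);
    assert!(view.categories_on(Axis::Row).contains(&"Region"));
    assert!(!view.categories_on(Axis::Column).contains(&"Region"));
}
```

If the storage were later replaced by, say, one list per axis, this test would keep passing unchanged as long as the three methods still honor the same contract.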