Extract the formula AST and parser into a dedicated `improvise-formula` crate and convert the project into a Cargo workspace. The root crate now re-exports `improvise-formula` as `crate::formula` to maintain backward compatibility for internal callers. The repository map is updated to reflect the new crate structure. Co-Authored-By: fiddlerwoaroof/git-smart-commit (gemma-4-31B-it-UD-Q4_K_XL.gguf)
Repository Map (LLM Reference)
Terminal pivot-table modeling app. Rust, Ratatui TUI, command/effect architecture.
Cargo workspace, Apache-2.0, edition 2024. Root package improvise v0.1.0-rc2.
Library + binary crate: src/lib.rs exports public modules, src/main.rs is the CLI entry.
Sub-crates live under crates/:
crates/improvise-formula/: formula parser, AST (Expr, BinOp, AggFunc, Formula, Filter), parse_formula. Re-exported as crate::formula from the main crate via pub use improvise_formula as formula;.
How to Find Things
| I need to... | Look in |
|---|---|
| Add a new keybinding | command/keymap.rs → default_keymaps() |
| Add a new user-facing command | command/cmd/ → implement Cmd in the relevant submodule, register in registry.rs |
| Add a new state mutation | ui/effect.rs → implement Effect |
| Change formula evaluation | model/types.rs → eval_formula(), eval_expr() |
| Change how cells are stored/queried | model/cell.rs → DataStore |
| Change category/item behavior | model/category.rs → Category |
| Change view axis logic | view/types.rs → View |
| Change grid layout computation | view/layout.rs → GridLayout |
| Change .improv file format | persistence/improv.pest (grammar), persistence/mod.rs → format_md(), parse_md() |
| Change number display formatting | format.rs → format_f64() |
| Change CLI arguments | main.rs → clap structs |
| Change import wizard logic | import/wizard.rs → ImportPipeline |
| Change grid rendering | ui/grid.rs → GridWidget |
| Change TUI frame layout | draw.rs → draw() |
| Change app state / mode transitions | ui/app.rs → App, AppMode |
| Write a test for model logic | model/types.rs → mod tests / mod formula_tests |
| Write a test for a command | command/cmd/<module>.rs → colocated mod tests |
Core Types and Traits
Command/Effect Pipeline (the central architecture pattern)
User keypress → Keymap lookup → Cmd::execute(&CmdContext) → Vec<Box<dyn Effect>> → Effect::apply(&mut App)
                (immutable)     (pure, read-only)                                  (state mutations)
// src/command/cmd/core.rs
pub trait Cmd: Debug + Send + Sync {
    fn name(&self) -> &'static str;
    fn execute(&self, ctx: &CmdContext) -> Vec<Box<dyn Effect>>;
}
pub struct CmdContext<'a> {
    pub model: &'a Model, // immutable
    pub layout: &'a GridLayout, // immutable
    pub registry: &'a CmdRegistry,
    pub mode: &'a AppMode,
    pub selected: (usize, usize), // (row, col) cursor
    pub row_offset: usize,
    pub col_offset: usize,
    pub search_query: &'a str,
    pub search_mode: bool,
    pub yanked: &'a Option<CellValue>,
    pub key_code: KeyCode, // the key that triggered this command
    pub buffers: &'a HashMap<String, String>,
    pub expanded_cats: &'a HashSet<String>,
    // panel cursors, tile cursor, visible dimensions...
}
// src/ui/effect.rs
pub trait Effect: Debug {
    fn apply(&self, app: &mut App);
    fn changes_mode(&self) -> bool { false } // override if effect changes AppMode
}
To add a command: implement Cmd in the appropriate command/cmd/ submodule, then register in command/cmd/registry.rs. Use the effect_cmd! macro (in effect_cmds.rs) for simple effect-wrapping commands. Bind it in default_keymaps().
To add an effect: implement Effect in effect.rs, add a constructor function.
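The pipeline above can be sketched in a few lines. This is a minimal illustration, not the project's code: App and CmdContext are cut down to a couple of fields, and SetStatus, MarkDirty, ReportCursor and run are hypothetical names invented for the example. Only the shape of the Cmd and Effect traits follows the definitions above.

```rust
use std::fmt::Debug;

// Simplified stand-ins for the real App and CmdContext (assumptions for illustration).
#[derive(Debug, Default)]
pub struct App {
    pub dirty: bool,
    pub status: String,
}

pub struct CmdContext {
    pub selected: (usize, usize), // (row, col) cursor
}

pub trait Effect: Debug {
    fn apply(&self, app: &mut App);
    fn changes_mode(&self) -> bool { false }
}

pub trait Cmd: Debug + Send + Sync {
    fn name(&self) -> &'static str;
    fn execute(&self, ctx: &CmdContext) -> Vec<Box<dyn Effect>>;
}

// Hypothetical effect: the only place where App is mutated.
#[derive(Debug)]
pub struct SetStatus(pub String);
impl Effect for SetStatus {
    fn apply(&self, app: &mut App) {
        app.status = self.0.clone();
    }
}

// Hypothetical effect that flags unsaved changes.
#[derive(Debug)]
pub struct MarkDirty;
impl Effect for MarkDirty {
    fn apply(&self, app: &mut App) {
        app.dirty = true;
    }
}

// Hypothetical command: reads the context, returns effects, mutates nothing.
#[derive(Debug)]
pub struct ReportCursor;
impl Cmd for ReportCursor {
    fn name(&self) -> &'static str {
        "report-cursor"
    }
    fn execute(&self, ctx: &CmdContext) -> Vec<Box<dyn Effect>> {
        let (row, col) = ctx.selected;
        vec![
            Box::new(SetStatus(format!("cursor at ({row}, {col})"))),
            Box::new(MarkDirty),
        ]
    }
}

// Hypothetical driver: execute the command, then apply its effects in order.
pub fn run(cmd: &dyn Cmd, ctx: &CmdContext, app: &mut App) {
    for effect in cmd.execute(ctx) {
        effect.apply(app);
    }
}
```

Because execute only sees a shared reference, a command can be tested by building a context and asserting on the returned effects, without constructing a full App.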
Data Model
// src/model/types.rs
pub struct Model {
    pub name: String,
    pub categories: IndexMap<String, Category>, // ordered
    pub data: DataStore,
    pub formulas: Vec<Formula>,
    pub views: IndexMap<String, View>,
    pub active_view: String,
    pub measure_agg: HashMap<String, AggFunc>, // per-measure aggregation override
}
// Key methods:
// add_category(&mut self, name) -> Result<CategoryId> [max 12 regular]
// category(&self, name) -> Option<&Category>
// category_mut(&mut self, name) -> Option<&mut Category>
// set_cell(&mut self, key: CellKey, value: CellValue)
// evaluate(&self, key: &CellKey) -> Option<CellValue> [formulas + raw data]
// evaluate_aggregated(&self, key, none_cats) -> Option<CellValue> [sums over hidden dims]
// recompute_formulas(&mut self, none_cats) [fixed-point formula cache]
// add_formula(&mut self, formula: Formula) [replaces same target+category]
// remove_formula(&mut self, target, category)
// measure_item_names(&self) -> Vec<String> [_Measure items + formula targets]
// effective_item_names(&self, cat) -> Vec<String> [_Measure dynamic, others ordered_item_names]
// category_names(&self) -> Vec<&str> [includes virtual]
// regular_category_names(&self) -> Vec<&str> [excludes _Index, _Dim, _Measure]
const MAX_CATEGORIES: usize = 12; // virtual categories don't count
// src/model/cell.rs
#[derive(Clone, PartialEq, Eq, Hash)]
pub struct CellKey(pub Vec<(String, String)>); // always sorted by category name
// CellKey::new(coords) — sorts on construction, enforcing canonical form
// CellKey::with(cat, item) -> Self — returns new key with coord added/replaced
// CellKey::without(cat) -> Self — returns new key with coord removed
// CellKey::get(cat) -> Option<&str>
#[derive(Clone, PartialEq)]
pub enum CellValue {
    Number(f64),
    Text(String),
    Error(String), // formula evaluation error (circular ref, div/0, etc.)
}
// CellValue::as_f64() -> Option<f64>
// CellValue::is_error() -> bool
pub struct DataStore {
    cells: HashMap<InternedKey, CellValue>,
    pub symbols: SymbolTable,
    index: HashMap<(Symbol, Symbol), HashSet<InternedKey>>, // secondary index
}
// DataStore::set(&mut self, key: &CellKey, value: CellValue)
// DataStore::get(&self, key: &CellKey) -> Option<&CellValue>
// DataStore::matching_values(&self, partial: &[(String,String)]) -> Vec<CellValue>
// DataStore::matching_cells(&self, partial) -> Vec<(CellKey, CellValue)>
// src/model/category.rs
pub struct Category {
    pub id: CategoryId, // usize
    pub name: String,
    pub kind: CategoryKind,
    pub items: IndexMap<String, Item>, // ordered
    pub groups: IndexMap<String, Group>,
    next_item_id: ItemId, // private, auto-increment
}
// Category::add_item(&mut self, name) -> ItemId [deduplicates by name]
// Category::ordered_item_names(&self) -> Vec<&str> [respects group order]
pub enum CategoryKind { Regular, VirtualIndex, VirtualDim, VirtualMeasure, Label }
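The canonical-form rule for CellKey (sorted by category name on construction) is what makes two keys with the same coordinates compare and hash equal regardless of the order in which they were written. A minimal sketch of that idea follows. The method names mirror the summary above, but the bodies are assumptions and not the project's implementation.

```rust
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct CellKey(pub Vec<(String, String)>);

impl CellKey {
    /// Sort by category name so equal coordinate sets produce equal keys.
    pub fn new(mut coords: Vec<(String, String)>) -> Self {
        coords.sort_by(|a, b| a.0.cmp(&b.0));
        CellKey(coords)
    }

    /// Return a new key with `cat` set to `item`, replacing any existing coordinate.
    pub fn with(&self, cat: &str, item: &str) -> Self {
        let mut coords: Vec<(String, String)> = self
            .0
            .iter()
            .filter(|(c, _)| c.as_str() != cat)
            .cloned()
            .collect();
        coords.push((cat.to_string(), item.to_string()));
        CellKey::new(coords) // re-sort to keep the canonical form
    }

    /// Return a new key with the `cat` coordinate removed (order is preserved).
    pub fn without(&self, cat: &str) -> Self {
        CellKey(
            self.0
                .iter()
                .filter(|(c, _)| c.as_str() != cat)
                .cloned()
                .collect(),
        )
    }

    /// Look up the item chosen for `cat`, if any.
    pub fn get(&self, cat: &str) -> Option<&str> {
        self.0
            .iter()
            .find(|(c, _)| c.as_str() == cat)
            .map(|(_, item)| item.as_str())
    }
}
```

Routing every construction path through new (including with) is the design point: callers never need to remember to sort.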
Formula System
// crates/improvise-formula/src/ast.rs
pub enum Expr {
    Number(f64),
    Ref(String), // reference to an item name
    BinOp(BinOp, Box<Expr>, Box<Expr>),
    UnaryMinus(Box<Expr>),
    Agg(AggFunc, Box<Expr>, Option<Filter>),
    If(Box<Expr>, Box<Expr>, Box<Expr>),
}
pub enum BinOp { Add, Sub, Mul, Div, Pow, Eq, Ne, Lt, Gt, Le, Ge }
pub enum AggFunc { Sum, Avg, Min, Max, Count }
pub struct Formula {
    pub raw: String, // "Profit = Revenue - Cost"
    pub target: String, // "Profit"
    pub target_category: String, // "Measure"
    pub expr: Expr,
    pub filter: Option<Filter>, // WHERE clause
}
// crates/improvise-formula/src/parser.rs
pub fn parse_formula(raw: &str, target_category: &str) -> Result<Formula>
Formula evaluation is in model/types.rs → eval_formula() / eval_expr(). Operates at full f64 precision. Display rounding in format.rs is view-only.
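To make the shape of expression evaluation concrete, here is a small recursive evaluator over a subset of the Expr tree. It is a sketch under stated assumptions, not the code in model/types.rs: Agg is omitted, references are resolved from a plain HashMap, comparisons yield 1.0 or 0.0, and the error strings are made up for the example.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum BinOp { Add, Sub, Mul, Div, Pow, Eq, Ne, Lt, Gt, Le, Ge }

// Subset of the AST above (no Agg variant in this sketch).
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Number(f64),
    Ref(String),
    BinOp(BinOp, Box<Expr>, Box<Expr>),
    UnaryMinus(Box<Expr>),
    If(Box<Expr>, Box<Expr>, Box<Expr>),
}

/// Evaluate `expr` at full f64 precision; `env` maps item names to values.
pub fn eval_expr(expr: &Expr, env: &HashMap<String, f64>) -> Result<f64, String> {
    match expr {
        Expr::Number(n) => Ok(*n),
        Expr::Ref(name) => env
            .get(name)
            .copied()
            .ok_or_else(|| format!("missing ref: {name}")),
        Expr::UnaryMinus(inner) => Ok(-eval_expr(inner, env)?),
        Expr::If(cond, then, otherwise) => {
            // Assumption: any non-zero condition counts as true.
            if eval_expr(cond, env)? != 0.0 {
                eval_expr(then, env)
            } else {
                eval_expr(otherwise, env)
            }
        }
        Expr::BinOp(op, lhs, rhs) => {
            let l = eval_expr(lhs, env)?;
            let r = eval_expr(rhs, env)?;
            let truth = |b: bool| if b { 1.0 } else { 0.0 };
            Ok(match op {
                BinOp::Add => l + r,
                BinOp::Sub => l - r,
                BinOp::Mul => l * r,
                BinOp::Div => {
                    if r == 0.0 {
                        return Err("div/0".to_string());
                    }
                    l / r
                }
                BinOp::Pow => l.powf(r),
                BinOp::Eq => truth(l == r),
                BinOp::Ne => truth(l != r),
                BinOp::Lt => truth(l < r),
                BinOp::Gt => truth(l > r),
                BinOp::Le => truth(l <= r),
                BinOp::Ge => truth(l >= r),
            })
        }
    }
}
```

No rounding happens anywhere in the evaluator, which matches the rule that display formatting is applied only when rendering.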
View and Layout
// src/view/axis.rs
pub enum Axis { Row, Column, Page, None }
// src/view/types.rs
pub struct View {
    pub name: String,
    pub category_axes: IndexMap<String, Axis>,
    pub page_selections: HashMap<String, String>,
    pub hidden_items: HashMap<String, HashSet<String>>,
    pub collapsed_groups: HashMap<String, HashSet<String>>,
    pub number_format: String, // e.g. ",.0" or ",.2f"
    pub prune_empty: bool,
    // scroll/selection state...
}
// View::set_axis(&mut self, cat, axis)
// View::axis_of(&self, cat) -> Axis
// View::cycle_axis(&mut self, cat) [Row→Column→Page→None→Row]
// View::transpose(&mut self) [swap Row↔Column]
// View::categories_on(&self, axis) -> Vec<&str>
// View::on_category_added(&mut self, cat) [auto-assigns axis]
// src/view/layout.rs
pub struct GridLayout { /* computed from Model + View */ }
// GridLayout::new(model, view) -> Self
// GridLayout::cell_key(row, col) -> Option<CellKey>
// GridLayout::cell_value(row, col) -> Option<CellValue>
// GridLayout::row_label(row) -> &str
// GridLayout::col_label(col) -> &str
// GridLayout::drill_records(row, col) -> Vec<(CellKey, CellValue)>
// Records mode: auto-detected when _Index on Row + _Dim on Column
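The axis operations listed above are small enough to sketch in full. The version below is illustrative only: it uses a std BTreeMap in place of IndexMap so the example needs no external crate, keeps just the category_axes field, and assumes that axis_of returns Axis::None for an unknown category.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Axis { Row, Column, Page, None }

impl Axis {
    /// Next axis in the documented cycle Row → Column → Page → None → Row.
    pub fn next(self) -> Axis {
        match self {
            Axis::Row => Axis::Column,
            Axis::Column => Axis::Page,
            Axis::Page => Axis::None,
            Axis::None => Axis::Row,
        }
    }
}

#[derive(Debug, Clone, PartialEq, Default)]
pub struct View {
    // Stand-in for IndexMap<String, Axis>; ordering differs from the real type.
    pub category_axes: BTreeMap<String, Axis>,
}

impl View {
    pub fn set_axis(&mut self, cat: &str, axis: Axis) {
        self.category_axes.insert(cat.to_string(), axis);
    }

    /// Assumption: categories the view has never seen are reported as Axis::None.
    pub fn axis_of(&self, cat: &str) -> Axis {
        self.category_axes.get(cat).copied().unwrap_or(Axis::None)
    }

    pub fn cycle_axis(&mut self, cat: &str) {
        let next = self.axis_of(cat).next();
        self.set_axis(cat, next);
    }

    /// Swap Row and Column; Page and None are left alone.
    pub fn transpose(&mut self) {
        for axis in self.category_axes.values_mut() {
            *axis = match *axis {
                Axis::Row => Axis::Column,
                Axis::Column => Axis::Row,
                other => other,
            };
        }
    }

    pub fn categories_on(&self, axis: Axis) -> Vec<&str> {
        self.category_axes
            .iter()
            .filter(|(_, a)| **a == axis)
            .map(|(cat, _)| cat.as_str())
            .collect()
    }
}
```

Since each category maps to exactly one Axis value, "each category on exactly one axis" holds by construction, and transposing twice returns the original view.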
App State
// src/ui/app.rs
pub enum AppMode {
    Normal,
    Editing { minibuf: MinibufferConfig },
    FormulaEdit { minibuf: MinibufferConfig },
    FormulaPanel,
    CategoryPanel,
    ViewPanel,
    TileSelect,
    CategoryAdd { minibuf: MinibufferConfig },
    ItemAdd { minibuf: MinibufferConfig },
    ExportPrompt { minibuf: MinibufferConfig },
    CommandMode { minibuf: MinibufferConfig },
    ImportWizard,
    Help,
    Quit,
}
// Note: SearchMode is Normal + search_mode:bool flag, not a separate variant.
pub struct App {
    pub model: Model,
    pub mode: AppMode,
    pub file_path: Option<PathBuf>,
    pub dirty: bool,
    pub help_page: usize,
    pub transient_keymap: Option<Arc<Keymap>>, // for prefix keys
    // layout cache, drill_state, wizard, buffers, panel cursors, etc.
}
// App::handle_key(&mut self, KeyEvent) -> Result<()> [main input dispatch]
// App::rebuild_layout(&mut self)
// App::is_empty_model(&self) -> bool [true when only virtual categories exist]
Keymap System
// src/command/keymap.rs
pub enum KeyPattern { Key(KeyCode, KeyModifiers), AnyChar, Any }
pub enum Binding {
    Cmd { name: &'static str, args: Vec<String> },
    Prefix(Arc<Keymap>), // Emacs-style sub-keymap
    Sequence(Vec<(&'static str, Vec<String>)>), // multi-command chain
}
pub enum ModeKey {
    Normal, Help, FormulaPanel, CategoryPanel, ViewPanel, TileSelect,
    Editing, FormulaEdit, CategoryAdd, ItemAdd, ExportPrompt, CommandMode,
    SearchMode, ImportWizard,
}
// Keymap::with_parent(parent: Arc<Keymap>) -> Self [Emacs-style inheritance]
// Keymap::lookup(&self, key, mods) -> Option<&Binding>
// Fallback chain: exact(key,mods) → Char with NONE mods → AnyChar → Any → parent
// Minibuffer modes: Enter and Esc use Binding::Sequence to include clear-buffer
// KeymapSet::default_keymaps() -> Self [builds all 14 mode keymaps]
// KeymapSet::dispatch(&self, ctx, key, mods) -> Option<Vec<Box<dyn Effect>>>
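The lookup fallback chain can be sketched with std-only stand-ins. Everything here is an assumption made for illustration: KeyCode and Mods replace the crossterm types, bindings resolve to a command name instead of a Binding, and "Char with NONE mods" is read as "for character keys, retry the lookup with no modifiers".

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Stand-ins for crossterm's KeyCode and KeyModifiers (hypothetical, simplified).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum KeyCode { Char(char), Enter, Esc }

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Mods { Empty, Shift, Control }

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum KeyPattern { Key(KeyCode, Mods), AnyChar, Any }

#[derive(Default)]
pub struct Keymap {
    bindings: HashMap<KeyPattern, &'static str>, // pattern → command name
    parent: Option<Arc<Keymap>>,
}

impl Keymap {
    pub fn with_parent(parent: Arc<Keymap>) -> Self {
        Keymap { bindings: HashMap::new(), parent: Some(parent) }
    }

    pub fn bind(&mut self, pattern: KeyPattern, cmd: &'static str) {
        self.bindings.insert(pattern, cmd);
    }

    /// exact(key, mods) → Char with no mods → AnyChar → Any → parent.
    pub fn lookup(&self, key: KeyCode, mods: Mods) -> Option<&'static str> {
        let is_char = matches!(key, KeyCode::Char(_));
        let get = |p: KeyPattern| self.bindings.get(&p).copied();
        get(KeyPattern::Key(key, mods))
            .or_else(|| if is_char { get(KeyPattern::Key(key, Mods::Empty)) } else { None })
            .or_else(|| if is_char { get(KeyPattern::AnyChar) } else { None })
            .or_else(|| get(KeyPattern::Any))
            .or_else(|| self.parent.as_ref().and_then(|p| p.lookup(key, mods)))
    }
}
```

Note the consequence of this ordering: a wildcard in a child keymap shadows an exact binding in its parent, which is what lets a minibuffer mode capture every character while still inheriting non-character keys.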
File Format (.improv)
Plain-text markdown-like, defined by a PEG grammar (persistence/improv.pest).
Parsed by pest; the grammar is the single source of truth for both the parser
and the grammar-walking test generator.
The format is not JSON: JSON is a legacy format, auto-detected by a leading { character.
v2025-04-09
# Model Name
Initial View: Default
## View: Default
Region: row
Measure: column
|Time Period|: page, Q1 ← pipe-quoted name, page with selection
hidden: Region/Internal
collapsed: |Time Period|/|2024|
format: ,.2f
## Formulas
- Profit = Revenue - Cost ← defaults to [_Measure]
- Tax = Revenue * 0.1 [CustomCat] ← explicit [TargetCategory] for non-_Measure
## Category: Region
- North, South, East, West ← bare items, comma-separated
- Coastal_East[Coastal] ← grouped item (one per line)
- Coastal_West[Coastal]
> Coastal ← group definition
## Category: Measure
- Revenue, Cost, Profit
## Data
Region=East, Measure=Revenue = 1200
Region=East, Measure=Cost = 800
Region=West, Measure=Revenue = |pending| ← pipe-quoted text value
Name quoting
Bare names match [A-Za-z_][A-Za-z0-9_-]*. Everything else uses CL-style
pipe quoting: |Income, Gross|, |2025|, |Name with spaces|.
Escapes inside pipes: \| (literal pipe), \\ (backslash), \n (newline).
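The quoting rule can be expressed as two small functions. This is a sketch of the rule as stated above, not the code in persistence/mod.rs. The names is_bare_name and quote_name are hypothetical.

```rust
/// True when `name` matches [A-Za-z_][A-Za-z0-9_-]* and can be written unquoted.
pub fn is_bare_name(name: &str) -> bool {
    let mut chars = name.chars();
    match chars.next() {
        Some(c) if c.is_ascii_alphabetic() || c == '_' => {}
        _ => return false, // empty, or starts with a digit/punctuation
    }
    chars.all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '-')
}

/// Write a name for the file: bare when possible, pipe-quoted otherwise.
pub fn quote_name(name: &str) -> String {
    if is_bare_name(name) {
        return name.to_string();
    }
    let mut out = String::with_capacity(name.len() + 2);
    out.push('|');
    for c in name.chars() {
        match c {
            '|' => out.push_str("\\|"),   // literal pipe
            '\\' => out.push_str("\\\\"), // backslash
            '\n' => out.push_str("\\n"),  // newline
            other => out.push(other),
        }
    }
    out.push('|');
    out
}
```

For example, quote_name("Region") stays bare, while "2025" and "Income, Gross" both come back wrapped in pipes because neither matches the bare-name pattern.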
Section order
format_md writes Views → Formulas → Categories → Data (smallest to largest).
The parser accepts sections in any order.
Key design choices
- Version line is exact match (v2025-04-09); the grammar enforces valid versions only.
- Initial View: is a top-level header, not embedded in view sections.
- Text cell values are always pipe-quoted to distinguish from numbers.
- Bare items are comma-separated on one line; grouped items get one line each.
Gzip variant: .improv.gz (same content, gzipped). Persistence code: persistence/mod.rs.
CLI
improvise [model.improv] # open TUI (default)
improvise import data.csv [--no-wizard] [-o out] # import CSV/JSON
improvise cmd 'add-cat Region' -f model.improv # headless command(s)
improvise script setup.txt -f model.improv # run script file
Import flags: --category, --measure, --time, --skip, --extract, --axis, --formula, --name.
Key Dependencies
| Crate | Purpose |
|---|---|
| ratatui 0.30 | TUI framework |
| crossterm 0.28 | Terminal backend |
| clap 4.6 (derive) | CLI parsing |
| serde + serde_json | Serialization |
| indexmap 2 | Ordered maps (categories, views) |
| anyhow | Error handling |
| chrono 0.4 | Date parsing in import |
| pest + pest_derive | PEG parser for .improv format |
| flate2 | Gzip for .improv.gz |
| csv | CSV parsing |
| enum_dispatch | CLI subcommand dispatch |
| dev: proptest, tempfile, pest_meta | Property testing, temp dirs, grammar AST for test generator |
File Inventory
Lines / tests / path — grouped by layer.
Model layer
1692 / 66t model/types.rs Model struct, formula eval, CRUD, MAX_CATEGORIES=12
621 / 28t model/cell.rs CellKey (canonical sort), CellValue, DataStore (interned)
216 / 6t model/category.rs Category, Item, Group, CategoryKind
79 / 3t model/symbol.rs Symbol interning (SymbolTable)
6 / 0t model/mod.rs
Formula layer (sub-crate improvise-formula under crates/)
776 / 35t crates/improvise-formula/src/parser.rs Recursive descent parser → Formula AST
77 / 0t crates/improvise-formula/src/ast.rs Expr, BinOp, AggFunc, Formula, Filter (data only)
5 / 0t crates/improvise-formula/src/lib.rs
View layer
1013 / 23t view/layout.rs GridLayout (pure fn of Model+View), records mode, drill
521 / 28t view/types.rs View config (axes, pages, hidden, collapsed, format)
21 / 0t view/axis.rs Axis enum {Row, Column, Page, None}
7 / 0t view/mod.rs
Command layer
command/cmd/ Cmd trait, CmdContext, CmdRegistry, 40+ commands
297 / 2t core.rs Cmd trait, CmdContext, CmdRegistry, parse helpers
586 / 0t registry.rs default_registry() — all command registrations
475 / 10t navigation.rs Move, EnterAdvance, PageNext/Prev
198 / 6t cell.rs ClearCell, YankCell, PasteCell, TransposeAxes, SaveCmd
330 / 7t commit.rs CommitFormula, CommitCategoryAdd/ItemAdd, CommitExport
437 / 5t effect_cmds.rs effect_cmd! macro, 25+ parseable effect-wrapper commands
409 / 7t grid.rs ToggleGroup, ViewNavigate, DrillIntoCell, TogglePruneEmpty
308 / 8t mode.rs EnterMode, Quit, EditOrDrill, EnterTileSelect, etc.
587 / 13t panel.rs Panel toggle/cycle/cursor, formula/category/view panel cmds
202 / 4t search.rs SearchNavigate, SearchOrCategoryAdd, ExitSearchMode
256 / 7t text_buffer.rs AppendChar, PopChar, CommandModeBackspace, ExecuteCommand
160 / 5t tile.rs MoveTileCursor, TileAxisOp
121 / 0t mod.rs Module declarations, re-exports, test helpers
1066 / 22t command/keymap.rs KeyPattern, Binding, Keymap, ModeKey, 14 mode keymaps
236 / 19t command/parse.rs Script/command-line parser (prefix syntax)
12 / 0t command/mod.rs
UI layer
942 / 41t ui/effect.rs Effect trait, 50+ effect types (all state mutations)
914 / 30t ui/app.rs App state, AppMode (15 variants), handle_key, autosave
1036 / 13t ui/grid.rs GridWidget (ratatui), col widths, rendering
617 / 0t ui/help.rs 5-page help overlay, HELP_PAGE_COUNT=5
347 / 0t ui/import_wizard_ui.rs Import wizard overlay rendering
165 / 6t ui/cat_tree.rs Category tree flattener for panel
113 / 0t ui/view_panel.rs View list panel
107 / 0t ui/category_panel.rs Category tree panel
95 / 0t ui/tile_bar.rs Tile bar (axis assignment tiles)
87 / 0t ui/panel.rs Generic panel frame widget
81 / 0t ui/formula_panel.rs Formula list panel
67 / 0t ui/which_key.rs Prefix-key hint popup
12 / 0t ui/mod.rs
Import layer
773 / 38t import/wizard.rs ImportPipeline + ImportWizard
292 / 9t import/analyzer.rs Field kind detection (Category/Measure/Time/Skip)
244 / 8t import/csv_parser.rs CSV parsing, multi-file merge
3 / 0t import/mod.rs
Top-level
400 / 0t draw.rs TUI event loop (run_tui), frame composition
391 / 0t main.rs CLI entry (clap): open, import, cmd, script
10 / 0t lib.rs Public module exports (enables examples)
228 / 29t format.rs Number display formatting (view-only rounding)
124 / 0t persistence/improv.pest PEG grammar — single source of truth for .improv format
2291 / 83t persistence/mod.rs .improv save/load (pest parser + format + gzip + legacy JSON)
Examples
examples/gen-grammar.rs Grammar-walking random file generator (pest_meta)
examples/pretty-print.rs Parse stdin, print formatted .improv to stdout
Context docs
context/design-principles.md Architectural principles
context/plan.md Show HN launch plan
context/repo-map.md This file
docs/design-notes.md Product vision & non-goals (salvaged from former SPEC.md)
Total: ~22,000 lines, 572 tests.
Testing Guidelines
Coverage target
Aim for ~80% line and branch coverage on logic code. This is a quality floor, not a
ceiling — go higher where the code warrants it, but don't chase 100% on rendering
widgets or write tests that just exercise trivial getters. Coverage should be run with
cargo llvm-cov (available via nix develop).
What to test and how
| Layer | Approach | Notes |
|---|---|---|
| Model (types, cell, category, symbol) | Unit tests + proptest | The data model is the foundation. Property tests catch invariant violations that hand-picked cases miss (see CellKey sort invariant, axis consistency). |
| Formula (parser, eval) | Unit tests per operator/construct | Cover each BinOp, AggFunc, IF, WHERE, unary minus, chained formulas, error cases (div-by-zero, missing ref). Ensure eval uses full f64 precision — never display-rounded values. |
| View (types, layout) | Unit tests + proptest | Property tests for axis assignment invariants (each category on exactly one axis, transpose is involutive, etc.). Unit tests for layout computation, records mode detection, drill. |
| Command (cmd, keymap, parse) | Unit tests | Test command execution by building a CmdContext and asserting on returned effects. Test keymap lookup fallback chain. Test script parser with edge cases (quoting, comments, dots). |
| Persistence | Round-trip + grammar-generated | save → load → save must be identical. Grammar-walking generator produces random valid files from the pest AST; proptests verify parse(generate()) and parse(format(parse(generate()))). Cover groups, formulas, views, hidden items, pipe quoting edge cases. |
| Format | Unit tests | Boundary cases: comma placement at 3/4/7 digits, negative numbers, rounding half-away-from-zero (not banker's), zero, small fractions. |
| Import (analyzer, csv, wizard) | Unit tests | Field classification heuristics, CSV quoting (RFC 4180), multi-file merge, date extraction. |
| UI rendering (grid, panels, draw, help) | Generally skip | Ratatui widgets are hard to unit-test and change frequently. Test the logic they consume (layout, cat_tree, format) rather than the rendering itself. |
| Effects | Test indirectly | Effects are thin apply methods. Test via integration: send a key through App::handle_key and assert on resulting app state. The complex ones (drill reconciliation, import) deserve targeted unit tests. |
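For the Format row, the boundary cases are easiest to see against a concrete formatter. The function below is a hypothetical stand-in for format_f64, written only to illustrate the cases in the table. It relies on f64::round, which rounds half-way cases away from zero, and inserts thousands separators by hand. The name format_grouped and its signature are assumptions.

```rust
/// Format `value` with `decimals` fraction digits and comma thousands separators,
/// rounding half away from zero (the behaviour of f64::round).
pub fn format_grouped(value: f64, decimals: usize) -> String {
    let scale = 10f64.powi(decimals as i32);
    let rounded = (value * scale).round() / scale;

    // Format the magnitude; the sign is re-attached after grouping.
    let fixed = format!("{:.*}", decimals, rounded.abs());
    let (int_part, frac_part) = match fixed.split_once('.') {
        Some((i, f)) => (i, Some(f)),
        None => (fixed.as_str(), None),
    };

    // Insert a comma before every group of three digits, counted from the right.
    let mut grouped = String::new();
    for (idx, ch) in int_part.chars().enumerate() {
        if idx > 0 && (int_part.len() - idx) % 3 == 0 {
            grouped.push(',');
        }
        grouped.push(ch);
    }

    // A value that rounds to zero is printed without a minus sign.
    let sign = if rounded < 0.0 { "-" } else { "" };
    match frac_part {
        Some(f) => format!("{sign}{grouped}.{f}"),
        None => format!("{sign}{grouped}"),
    }
}
```

Unit tests in the style the table describes would then pin 3, 4 and 7 digit inputs (999, 1000, 1234567), a negative half (-2.5 at zero decimals), and an exactly representable half such as 0.125 at two decimals.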
Property tests (proptest)
Use property tests for invariants that must hold across all inputs, not as a substitute for example-based tests. Good candidates:
- Structural invariants: CellKey always sorted, each category on exactly one axis, toggle-collapse is involutive, hide/show roundtrips.
- Serialization roundtrips: save/load identity.
- Determinism: evaluate returns the same value for the same inputs.
Keep proptest case counts reasonable. The defaults (256 cases) are fine for most properties. Don't crank them up to thousands — the test suite should complete in under 2 seconds. If a property needs more cases to feel confident, that's a sign the input space should be constrained with better strategies, not brute-forced.
Bug-fix workflow
Per CLAUDE.md: write a test that demonstrates the bug before fixing it. Prefer
a small unit test targeting the specific function over an integration test. The test
should fail on the current code, then pass after the fix. Mark regression tests
with a doc-comment explaining the bug (see model/types.rs formula_tests for
examples).
What not to test
- Trivial struct constructors and enum definitions (ast.rs, axis.rs).
- Ratatui Widget::render implementations: these are pure drawing code.
- Module re-export files (mod.rs).
- One-line delegation methods.
Patterns to Know
- Commands never mutate. They receive &CmdContext (read-only) and return Vec<Box<dyn Effect>>.
- CellKey is always sorted. Use CellKey::new(); never construct the inner Vec directly.
- category_mut() for adding items. Model has no add_item method; get the category first: m.category_mut("Region").unwrap().add_item("East").
- Virtual categories _Index, _Dim, and _Measure always exist. is_empty_model() checks whether any non-virtual categories exist. _Measure items come from two sources: explicit data items (in category) + formula targets (dynamically via measure_item_names()). add_formula does NOT add items to _Measure; use effective_item_names("_Measure") to get the full list. _Index and _Dim are never persisted to .improv files; _Measure only persists non-formula items.
- Display rounding is view-only. format_f64 (half-away-from-zero) is only called in rendering. Formula eval uses full f64.
- Formula evaluation is fixed-point. recompute_formulas(none_cats) iterates formula evaluation until values stabilize, using a cache. evaluate_aggregated checks the cache for formula results. Circular refs produce CellValue::Error("circular").
- Keybindings are per-mode. ModeKey::from_app_mode() resolves the current mode, then the corresponding Keymap is looked up. Normal + search_mode=true maps to SearchMode.
- effect_cmd! macro generates a command struct that just produces effects. Use for simple commands without complex logic.
- .improv format is defined by a PEG grammar (persistence/improv.pest). Parsed by pest. Names use CL-style |...| pipe quoting when they aren't valid bare identifiers. JSON is legacy only.
- IndexMap is used for categories and views to preserve insertion order.
- MAX_CATEGORIES = 12 applies only to CategoryKind::Regular. Virtual/Label categories are exempt.
- Drill into formula cells strips the _Measure coordinate from the drill key when it names a formula target, so matching_cells finds the raw data records that feed the formula instead of returning empty.
- App::new calls recompute_formulas before building the initial layout, so formula values appear on the first rendered frame.
- Minibuffer buffer clearing is handled by Binding::Sequence in keymaps: Enter and Esc sequences include clear-buffer to reset the text buffer. The clear-buffer command is registered in the registry.
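The fixed-point evaluation pattern from the list above can be illustrated with a deliberately tiny model. This is a sketch, not recompute_formulas itself: the hypothetical Formula here is just a weighted sum of named references, raw data is a flat map, and the sketch assumes every reference names either a raw value or another formula target, so anything still unresolved after the passes is treated as circular.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub enum CellValue {
    Number(f64),
    Error(String),
}

/// Hypothetical simplified formula: target = sum of (coefficient * reference).
pub struct Formula {
    pub target: String,
    pub terms: Vec<(f64, String)>,
}

/// Re-evaluate all formulas until the cache stops changing.
pub fn recompute(raw: &HashMap<String, f64>, formulas: &[Formula]) -> HashMap<String, CellValue> {
    let mut cache: HashMap<String, f64> = HashMap::new();

    // An acyclic chain resolves at least one more formula per pass,
    // so formulas.len() passes are enough to reach the fixed point.
    for _ in 0..formulas.len() {
        let mut changed = false;
        for f in formulas {
            // None if any reference is not yet available (raw data wins over cache).
            let value: Option<f64> = f
                .terms
                .iter()
                .map(|(coeff, name)| raw.get(name).or_else(|| cache.get(name)).map(|v| coeff * v))
                .sum();
            if let Some(v) = value {
                if cache.get(&f.target) != Some(&v) {
                    cache.insert(f.target.clone(), v);
                    changed = true;
                }
            }
        }
        if !changed {
            break; // values have stabilized
        }
    }

    // Targets that never resolved are reported as circular (see assumption above).
    formulas
        .iter()
        .map(|f| {
            let value = match cache.get(&f.target) {
                Some(n) => CellValue::Number(*n),
                None => CellValue::Error("circular".to_string()),
            };
            (f.target.clone(), value)
        })
        .collect()
}
```

With Margin = 0.5 * Profit declared before Profit = Revenue - Cost, the first pass resolves Profit and the second resolves Margin, while a pair such as A = B and B = A never resolves and ends up as an error value.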