From 09df7bf18158d41323e0f58cf6feba3a93534138 Mon Sep 17 00:00:00 2001
From: Edward Langley
Date: Wed, 8 Apr 2026 23:44:25 -0700
Subject: [PATCH] docs: update design principles and repo map after test audit
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Add virtual category boundary rule: use regular_category_names() for
  user-facing logic, never expose _Index/_Dim
- Document formula tokenizer keyword-aware identifier breaking
- Update repo-map test counts (356 → 510) and add regular_category_names

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 context/design-principles.md | 17 ++++++++++++++++-
 context/repo-map.md          | 15 ++++++++-------
 2 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/context/design-principles.md b/context/design-principles.md
index 4593b37..2bfe0b5 100644
--- a/context/design-principles.md
+++ b/context/design-principles.md
@@ -62,7 +62,10 @@ editing) is two commands composed at the binding level, not a monolithic handler

 - `CategoryKind` is `Regular | VirtualIndex | VirtualDim | Label`. Business rules
   (e.g., the 12-category limit counts only `Regular`) are enforced by matching
-  on the enum, not by checking name prefixes.
+  on the enum, not by checking name prefixes. Virtual categories (`_Index`,
+  `_Dim`) exist solely for drill-down mechanics and must never leak into
+  user-facing logic — use `Model::regular_category_names()` when selecting a
+  default category for formulas, prompts, or other user-visible choices.

 ### When You Add a Variant, the Compiler Finds Every Call Site

@@ -112,6 +115,18 @@ Formulas are parsed into a typed AST (`Expr` enum) at entry time. If the syntax
 is invalid, the user gets an error immediately. The evaluator only sees
 well-formed trees — it does not need to handle malformed input.

+### Formula Tokenizer: Multi-Word Identifiers and Keywords
+
+The formula tokenizer supports multi-word identifiers (e.g., `Total Revenue`)
+by allowing spaces within identifier tokens when followed by non-operator
+characters. However, keywords (`WHERE`, `SUM`, `AVG`, `MIN`, `MAX`, `COUNT`,
+`IF`) act as token boundaries — the tokenizer breaks an identifier when:
+1. The identifier collected **so far** is a keyword (e.g., `WHERE ` stops at `WHERE`).
+2. The **next word** after a space is a keyword (e.g., `Revenue WHERE` stops at `Revenue`).
+
+This ensures `SUM(Revenue WHERE Region = "East")` tokenizes correctly as
+separate tokens while `Total Revenue` remains a single identifier.
+
 ---

 ## 4. Separation of Concerns
diff --git a/context/repo-map.md b/context/repo-map.md
index e3e1075..0690065 100644
--- a/context/repo-map.md
+++ b/context/repo-map.md
@@ -95,7 +95,8 @@ pub struct Model {
 // evaluate_aggregated(&self, key, none_cats) -> Option [sums over hidden dims]
 // add_formula(&mut self, formula: Formula) [replaces same target+category]
 // remove_formula(&mut self, target, category)
-// category_names(&self) -> impl Iterator
+// category_names(&self) -> Vec<&str> [includes virtual]
+// regular_category_names(&self) -> Vec<&str> [excludes _Index, _Dim]
 const MAX_CATEGORIES: usize = 12; // virtual categories don't count
 ```

@@ -353,7 +354,7 @@ Lines / tests / path — grouped by layer.

 ### Formula layer
 ```
- 461 /  8t  formula/parser.rs  Recursive descent parser → Formula AST
+ 461 / 29t  formula/parser.rs  Recursive descent parser → Formula AST
   77 /  0t  formula/ast.rs     Expr, BinOp, AggFunc, Formula, Filter (data only)
    5 /  0t  formula/mod.rs
 ```
@@ -368,7 +369,7 @@ Lines / tests / path — grouped by layer.

 ### Command layer
 ```
-3373 / 21t  command/cmd.rs     Cmd trait, CmdContext, CmdRegistry, 40+ commands
+3373 / 74t  command/cmd.rs     Cmd trait, CmdContext, CmdRegistry, 40+ commands
 1068 / 22t  command/keymap.rs  KeyPattern, Binding, Keymap, ModeKey, 14 mode keymaps
  236 / 19t  command/parse.rs   Script/command-line parser (prefix syntax)
   12 /  0t  command/mod.rs
@@ -376,7 +377,7 @@ Lines / tests / path — grouped by layer.

 ### UI layer
 ```
- 942 /  0t  ui/effect.rs  Effect trait, 50+ effect types (all state mutations)
+ 942 / 41t  ui/effect.rs  Effect trait, 50+ effect types (all state mutations)
  914 / 30t  ui/app.rs     App state, AppMode (15 variants), handle_key, autosave
 1036 / 13t  ui/grid.rs    GridWidget (ratatui), col widths, rendering
  617 /  0t  ui/help.rs    5-page help overlay, HELP_PAGE_COUNT=5
@@ -393,7 +394,7 @@ Lines / tests / path — grouped by layer.

 ### Import layer
 ```
- 773 / 15t  import/wizard.rs      ImportPipeline + ImportWizard
+ 773 / 38t  import/wizard.rs      ImportPipeline + ImportWizard
  292 /  9t  import/analyzer.rs    Field kind detection (Category/Measure/Time/Skip)
  244 /  8t  import/csv_parser.rs  CSV parsing, multi-file merge
    3 /  0t  import/mod.rs
@@ -404,7 +405,7 @@ Lines / tests / path — grouped by layer.
  400 /  0t  draw.rs             TUI event loop (run_tui), frame composition
  391 /  0t  main.rs             CLI entry (clap): open, import, cmd, script
  228 / 29t  format.rs           Number display formatting (view-only rounding)
- 806 / 22t  persistence/mod.rs  .improv save/load (markdown format + gzip + legacy JSON)
+ 806 / 38t  persistence/mod.rs  .improv save/load (markdown format + gzip + legacy JSON)
 ```

 ### Context docs
@@ -415,7 +416,7 @@ context/plan.md Show HN launch plan
 context/plan.md      Show HN launch plan
 context/repo-map.md  This file
 ```
-**Total: ~15,875 lines, 356 tests.**
+**Total: ~16,500 lines, 510 tests.**

 ---