docs: update repo-map and design-principles for pest parser

- Document PEG grammar as single source of truth for .improv format
- Update file format section with v2025-04-09 syntax: version line, Initial View, pipe quoting, Views→Formulas→Categories→Data order
- Add pipe quoting convention and grammar-driven testing principles
- Update file inventory (persistence: 124+2291 lines, 83 tests)
- Add pest/pest_meta to dependency table
- Update persistence testing guidance for grammar-walking generator

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Executed-By: spot
@ -38,8 +38,10 @@ editing) is two commands composed at the binding level, not a monolithic handler
|
|||||||
|
|
||||||
### Dispatch Through Traits and Registries, Not Match Blocks
|
### Dispatch Through Traits and Registries, Not Match Blocks
|
||||||
|
|
||||||
- **Commands**: 40+ types each implement `Cmd`. A `CmdRegistry` maps names to
|
- **Commands**: 40+ types each implement `Cmd`, organized by concern across
|
||||||
constructor closures. Dispatching a key presses looks up the binding, resolves
|
submodules in `command/cmd/` (navigation, cell, commit, grid, mode, panel,
|
||||||
|
search, text_buffer, tile, effect_cmds). A `CmdRegistry` maps names to
|
||||||
|
constructor closures. Dispatching a key press looks up the binding, resolves
|
||||||
the command name through the registry, and calls `execute`. No central
|
the command name through the registry, and calls `execute`. No central
|
||||||
`match command_name { ... }` block.
|
`match command_name { ... }` block.
|
||||||
|
|
||||||
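The registry dispatch described in this hunk can be sketched as follows. This is a minimal illustration, not the project's code: the `Cmd` trait signature is taken from the repo map, but the `CmdContext` field, the `describe` method on `Effect`, the `MoveDown` and `SetCursorRow` types, the `dispatch` method, and `sketch_registry` are all hypothetical stand-ins.

```rust
use std::collections::HashMap;
use std::fmt::Debug;

// Simplified stand-in; the real CmdContext is assumed to carry much more state.
#[derive(Debug, Default)]
pub struct CmdContext {
    pub cursor_row: usize, // hypothetical field
}

pub trait Effect: Debug {
    // Hypothetical method, used here only so the result can be inspected.
    fn describe(&self) -> String;
}

// Trait signature as shown in the repo map.
pub trait Cmd: Debug + Send + Sync {
    fn name(&self) -> &'static str;
    fn execute(&self, ctx: &CmdContext) -> Vec<Box<dyn Effect>>;
}

#[derive(Debug)]
struct SetCursorRow(usize); // hypothetical effect

impl Effect for SetCursorRow {
    fn describe(&self) -> String {
        format!("set-cursor-row {}", self.0)
    }
}

#[derive(Debug)]
struct MoveDown; // hypothetical command

impl Cmd for MoveDown {
    fn name(&self) -> &'static str {
        "move-down"
    }
    // Pure: reads the context and returns effects, mutating nothing.
    fn execute(&self, ctx: &CmdContext) -> Vec<Box<dyn Effect>> {
        vec![Box::new(SetCursorRow(ctx.cursor_row + 1))]
    }
}

type CmdCtor = Box<dyn Fn() -> Box<dyn Cmd>>;

// Maps command names to constructor closures; no central `match` on the name.
#[derive(Default)]
pub struct CmdRegistry {
    ctors: HashMap<&'static str, CmdCtor>,
}

impl CmdRegistry {
    pub fn register(&mut self, name: &'static str, ctor: impl Fn() -> Box<dyn Cmd> + 'static) {
        self.ctors.insert(name, Box::new(ctor));
    }

    // Resolve the name through the registry, build the command, run it.
    // Returns None when no command is registered under `name`.
    pub fn dispatch(&self, name: &str, ctx: &CmdContext) -> Option<Vec<Box<dyn Effect>>> {
        let ctor = self.ctors.get(name)?;
        let cmd = ctor();
        Some(cmd.execute(ctx))
    }
}

pub fn sketch_registry() -> CmdRegistry {
    let mut r = CmdRegistry::default();
    r.register("move-down", || Box::new(MoveDown) as Box<dyn Cmd>);
    r
}
```

Adding a command under this scheme means adding one type and one `register` call; the dispatch path itself does not change.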
@ -117,6 +119,26 @@ Formulas are parsed into a typed AST (`Expr` enum) at entry time. If the syntax
|
|||||||
is invalid, the user gets an error immediately. The evaluator only sees
|
is invalid, the user gets an error immediately. The evaluator only sees
|
||||||
well-formed trees — it does not need to handle malformed input.
|
well-formed trees — it does not need to handle malformed input.
|
||||||
|
|
||||||
|
### Grammar-Defined File Format
|
||||||
|
|
||||||
|
The `.improv` file format is defined by a PEG grammar (`persistence/improv.pest`)
|
||||||
|
and parsed by pest. The grammar is the single source of truth — the parser is a
|
||||||
|
tree-walker over the grammar's parse tree, not an ad-hoc line scanner. This means:
|
||||||
|
|
||||||
|
- Adding a new format feature means updating the grammar first, then the walker.
|
||||||
|
- The grammar can be read as a specification independent of the Rust code.
|
||||||
|
- A grammar-walking test generator reads the grammar AST at test time (via
|
||||||
|
`pest_meta`) and produces random valid files, ensuring the parser accepts
|
||||||
|
everything the grammar describes.
|
||||||
|
|
||||||
|
### CL-Style Pipe Quoting for Names
|
||||||
|
|
||||||
|
Names in the `.improv` format use CL-style `|...|` pipe quoting. A name is bare
|
||||||
|
if it matches `[A-Za-z_][A-Za-z0-9_-]*`; everything else must be pipe-quoted.
|
||||||
|
Escapes inside pipes: `\|` (literal pipe), `\\` (backslash), `\n` (newline).
|
||||||
|
This convention is shared between the `.improv` persistence format and the
|
||||||
|
formula parser's identifier syntax.
|
||||||
|
|
||||||
### Formula Tokenizer: Identifiers and Quoting
|
### Formula Tokenizer: Identifiers and Quoting
|
||||||
|
|
||||||
**Bare identifiers** support multi-word names (e.g., `Total Revenue`) by
|
**Bare identifiers** support multi-word names (e.g., `Total Revenue`) by
|
||||||
@ -244,9 +266,14 @@ milliseconds each are a sign something is wrong.
|
|||||||
- **Model, formula, view**: the core logic. Unit tests for each operation and
|
- **Model, formula, view**: the core logic. Unit tests for each operation and
|
||||||
edge case. Property tests for invariants. These are the highest-value tests.
|
edge case. Property tests for invariants. These are the highest-value tests.
|
||||||
- **Commands**: build a `CmdContext`, call `execute`, assert on the returned
|
- **Commands**: build a `CmdContext`, call `execute`, assert on the returned
|
||||||
effects. Pure functions — no terminal needed.
|
effects. Pure functions — no terminal needed. Tests are colocated in each
|
||||||
|
command submodule (`command/cmd/<module>.rs` → `mod tests`), with shared
|
||||||
|
test helpers in `command/cmd/mod.rs::test_helpers`.
|
||||||
- **Persistence**: round-trip tests (`save → load → save` produces identical
|
- **Persistence**: round-trip tests (`save → load → save` produces identical
|
||||||
output). Cover groups, formulas, views, hidden items, legacy JSON.
|
output) plus grammar-driven property tests. The generator walks the pest
|
||||||
|
grammar AST to produce random valid files; proptests verify
|
||||||
|
`parse(generate())` succeeds and `parse(format(parse(generate())))` is
|
||||||
|
stable. Cover groups, formulas, views, hidden items, pipe quoting edges.
|
||||||
- **Format**: boundary cases for comma placement, rounding, negative numbers.
|
- **Format**: boundary cases for comma placement, rounding, negative numbers.
|
||||||
- **Import**: field classification heuristics, CSV quoting, multi-file merge.
|
- **Import**: field classification heuristics, CSV quoting, multi-file merge.
|
||||||
|
|
||||||
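The stability property named in the persistence bullet, `parse(format(parse(generate())))`, can be sketched generically. This is an assumption-laden illustration: `is_round_trip_stable`, `parse_items`, and `format_items` are hypothetical names, and the toy item-line parser stands in for the real pest-based parser and `format_md`.

```rust
/// Returns true when formatting is a fixed point after one parse:
/// format(parse(format(parse(input)))) == format(parse(input)).
pub fn is_round_trip_stable<T, P, F>(input: &str, parse: P, format: F) -> bool
where
    P: Fn(&str) -> Option<T>,
    F: Fn(&T) -> String,
{
    let first = match parse(input) {
        Some(v) => v,
        None => return false,
    };
    let once = format(&first);
    let second = match parse(&once) {
        Some(v) => v,
        None => return false,
    };
    format(&second) == once
}

/// Toy stand-in for a parser: splits a `- a, b, c` item line into names.
/// Rejects lines without the leading dash and lines with empty items.
pub fn parse_items(line: &str) -> Option<Vec<String>> {
    let rest = line.trim().strip_prefix('-')?;
    let items: Vec<String> = rest.split(',').map(|s| s.trim().to_string()).collect();
    if items.iter().any(|s| s.is_empty()) {
        None
    } else {
        Some(items)
    }
}

/// Toy stand-in for a formatter: writes items in one canonical layout.
pub fn format_items(items: &Vec<String>) -> String {
    format!("- {}", items.join(", "))
}
```

In a property test, the fixed `input` would be replaced by generated text, so the check runs over many random valid files rather than a handful of hand-written ones.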
|
|||||||
@ -10,14 +10,14 @@ Crate `improvise` v0.1.0, Apache-2.0, edition 2021.
|
|||||||
| I need to... | Look in |
|
| I need to... | Look in |
|
||||||
|---------------------------------------|----------------------------------------------|
|
|---------------------------------------|----------------------------------------------|
|
||||||
| Add a new keybinding | `command/keymap.rs` → `default_keymaps()` |
|
| Add a new keybinding | `command/keymap.rs` → `default_keymaps()` |
|
||||||
| Add a new user-facing command | `command/cmd.rs` → implement `Cmd`, register in `default_registry()` |
|
| Add a new user-facing command | `command/cmd/` → implement `Cmd` in the relevant submodule, register in `registry.rs` |
|
||||||
| Add a new state mutation | `ui/effect.rs` → implement `Effect` |
|
| Add a new state mutation | `ui/effect.rs` → implement `Effect` |
|
||||||
| Change formula evaluation | `model/types.rs` → `eval_formula()`, `eval_expr()` |
|
| Change formula evaluation | `model/types.rs` → `eval_formula()`, `eval_expr()` |
|
||||||
| Change how cells are stored/queried | `model/cell.rs` → `DataStore` |
|
| Change how cells are stored/queried | `model/cell.rs` → `DataStore` |
|
||||||
| Change category/item behavior | `model/category.rs` → `Category` |
|
| Change category/item behavior | `model/category.rs` → `Category` |
|
||||||
| Change view axis logic | `view/types.rs` → `View` |
|
| Change view axis logic | `view/types.rs` → `View` |
|
||||||
| Change grid layout computation | `view/layout.rs` → `GridLayout` |
|
| Change grid layout computation | `view/layout.rs` → `GridLayout` |
|
||||||
| Change .improv file format | `persistence/mod.rs` → `format_md()`, `parse_md()` |
|
| Change .improv file format | `persistence/improv.pest` (grammar), `persistence/mod.rs` → `format_md()`, `parse_md()` |
|
||||||
| Change number display formatting | `format.rs` → `format_f64()` |
|
| Change number display formatting | `format.rs` → `format_f64()` |
|
||||||
| Change CLI arguments | `main.rs` → clap structs |
|
| Change CLI arguments | `main.rs` → clap structs |
|
||||||
| Change import wizard logic | `import/wizard.rs` → `ImportPipeline` |
|
| Change import wizard logic | `import/wizard.rs` → `ImportPipeline` |
|
||||||
@ -25,7 +25,7 @@ Crate `improvise` v0.1.0, Apache-2.0, edition 2021.
|
|||||||
| Change TUI frame layout | `draw.rs` → `draw()` |
|
| Change TUI frame layout | `draw.rs` → `draw()` |
|
||||||
| Change app state / mode transitions | `ui/app.rs` → `App`, `AppMode` |
|
| Change app state / mode transitions | `ui/app.rs` → `App`, `AppMode` |
|
||||||
| Write a test for model logic | `model/types.rs` → `mod tests` / `mod formula_tests` |
|
| Write a test for model logic | `model/types.rs` → `mod tests` / `mod formula_tests` |
|
||||||
| Write a test for a command | `command/cmd.rs` → `mod tests` |
|
| Write a test for a command | `command/cmd/<module>.rs` → colocated `mod tests` |
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@ -39,7 +39,7 @@ User keypress → Keymap lookup → Cmd::execute(&CmdContext) → Vec<Box<dyn Ef
|
|||||||
```
|
```
|
||||||
|
|
||||||
```rust
|
```rust
|
||||||
// src/command/cmd.rs
|
// src/command/cmd/core.rs
|
||||||
pub trait Cmd: Debug + Send + Sync {
|
pub trait Cmd: Debug + Send + Sync {
|
||||||
fn name(&self) -> &'static str;
|
fn name(&self) -> &'static str;
|
||||||
fn execute(&self, ctx: &CmdContext) -> Vec<Box<dyn Effect>>;
|
fn execute(&self, ctx: &CmdContext) -> Vec<Box<dyn Effect>>;
|
||||||
@ -69,7 +69,7 @@ pub trait Effect: Debug {
|
|||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
**To add a command**: implement `Cmd`, then in `default_registry()` call `r.register(...)` or use the `effect_cmd!` macro for simple cases. Bind it in `default_keymaps()`.
|
**To add a command**: implement `Cmd` in the appropriate `command/cmd/` submodule, then register in `command/cmd/registry.rs`. Use the `effect_cmd!` macro (in `effect_cmds.rs`) for simple effect-wrapping commands. Bind it in `default_keymaps()`.
|
||||||
|
|
||||||
**To add an effect**: implement `Effect` in `effect.rs`, add a constructor function.
|
**To add an effect**: implement `Effect` in `effect.rs`, add a constructor function.
|
||||||
|
|
||||||
@ -273,40 +273,61 @@ pub enum ModeKey {
|
|||||||
|
|
||||||
## File Format (.improv)
|
## File Format (.improv)
|
||||||
|
|
||||||
Plain-text markdown-like. **Not JSON** (JSON is legacy, auto-detected by `{` prefix).
|
Plain-text markdown-like, defined by a PEG grammar (`persistence/improv.pest`).
|
||||||
|
Parsed by pest; the grammar is the single source of truth for both the parser
|
||||||
|
and the grammar-walking test generator.
|
||||||
|
|
||||||
|
**Not JSON** (JSON is legacy, auto-detected by `{` prefix).
|
||||||
|
|
||||||
```
|
```
|
||||||
|
v2025-04-09
|
||||||
# Model Name
|
# Model Name
|
||||||
|
Initial View: Default
|
||||||
|
|
||||||
## Category: Region
|
## View: Default
|
||||||
- North
|
Region: row
|
||||||
- South
|
Measure: column
|
||||||
- East [Coastal] ← item in group "Coastal"
|
|Time Period|: page, Q1 ← pipe-quoted name, page with selection
|
||||||
- West [Coastal]
|
hidden: Region/Internal
|
||||||
> Coastal ← group definition
|
collapsed: |Time Period|/|2024|
|
||||||
|
format: ,.2f
|
||||||
## Category: Measure
|
|
||||||
- Revenue
|
|
||||||
- Cost
|
|
||||||
- Profit
|
|
||||||
|
|
||||||
## Formulas
|
## Formulas
|
||||||
- Profit = Revenue - Cost [Measure] ← [TargetCategory]
|
- Profit = Revenue - Cost [Measure] ← [TargetCategory]
|
||||||
|
|
||||||
|
## Category: Region
|
||||||
|
- North, South, East, West ← bare items, comma-separated
|
||||||
|
- Coastal_East[Coastal] ← grouped item (one per line)
|
||||||
|
- Coastal_West[Coastal]
|
||||||
|
> Coastal ← group definition
|
||||||
|
|
||||||
|
## Category: Measure
|
||||||
|
- Revenue, Cost, Profit
|
||||||
|
|
||||||
## Data
|
## Data
|
||||||
Region=East, Measure=Revenue = 1200
|
Region=East, Measure=Revenue = 1200
|
||||||
Region=East, Measure=Cost = 800
|
Region=East, Measure=Cost = 800
|
||||||
Region=West, Measure=Revenue = "pending" ← text value in quotes
|
Region=West, Measure=Revenue = |pending| ← pipe-quoted text value
|
||||||
|
|
||||||
## View: Default (active)
|
|
||||||
Region: row
|
|
||||||
Measure: column
|
|
||||||
Time: page, Q1 ← page axis with selected item
|
|
||||||
hidden: Region/Internal
|
|
||||||
collapsed: Time/2024
|
|
||||||
format: ,.2f
|
|
||||||
```
|
```
|
||||||
|
|
||||||
|
### Name quoting
|
||||||
|
|
||||||
|
Bare names match `[A-Za-z_][A-Za-z0-9_-]*`. Everything else uses CL-style
|
||||||
|
pipe quoting: `|Income, Gross|`, `|2025|`, `|Name with spaces|`.
|
||||||
|
Escapes inside pipes: `\|` (literal pipe), `\\` (backslash), `\n` (newline).
|
||||||
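The quoting rule for the file format can be sketched as a small writer-side helper. This is a minimal sketch under stated assumptions: `is_bare_name` and `quote_name` are hypothetical names, not functions confirmed to exist in `persistence/mod.rs`, and only the three escapes listed above are handled.

```rust
/// Returns true if `name` matches `[A-Za-z_][A-Za-z0-9_-]*`.
pub fn is_bare_name(name: &str) -> bool {
    let mut chars = name.chars();
    match chars.next() {
        Some(c) if c.is_ascii_alphabetic() || c == '_' => {}
        _ => return false, // empty, or starts with a digit, space, etc.
    }
    chars.all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '-')
}

/// Writes `name` bare when allowed, otherwise wraps it in pipes and
/// escapes `|`, `\`, and newline.
pub fn quote_name(name: &str) -> String {
    if is_bare_name(name) {
        return name.to_string();
    }
    let mut out = String::with_capacity(name.len() + 2);
    out.push('|');
    for c in name.chars() {
        match c {
            '|' => out.push_str("\\|"),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            other => out.push(other),
        }
    }
    out.push('|');
    out
}
```

With this sketch, `Region` and `Coastal_East` stay bare, while `Income, Gross` becomes `|Income, Gross|` and `2025` becomes `|2025|`, because a leading digit is not allowed in a bare name.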
|
|
||||||
|
### Section order
|
||||||
|
|
||||||
|
`format_md` writes Views → Formulas → Categories → Data (smallest to largest).
|
||||||
|
The parser accepts sections in any order.
|
||||||
|
|
||||||
|
### Key design choices
|
||||||
|
|
||||||
|
- Version line (`v2025-04-09`) enables future format changes.
|
||||||
|
- `Initial View:` is a top-level header, not embedded in view sections.
|
||||||
|
- Text cell values are always pipe-quoted to distinguish from numbers.
|
||||||
|
- Bare items are comma-separated on one line; grouped items get one line each.
|
||||||
|
|
||||||
Gzip variant: `.improv.gz` (same content, gzipped). Persistence code: `persistence/mod.rs`.
|
Gzip variant: `.improv.gz` (same content, gzipped). Persistence code: `persistence/mod.rs`.
|
||||||
|
|
||||||
---
|
---
|
||||||
@ -335,10 +356,11 @@ Import flags: `--category`, `--measure`, `--time`, `--skip`, `--extract`, `--axi
|
|||||||
| indexmap 2 | Ordered maps (categories, views) |
|
| indexmap 2 | Ordered maps (categories, views) |
|
||||||
| anyhow | Error handling |
|
| anyhow | Error handling |
|
||||||
| chrono 0.4 | Date parsing in import |
|
| chrono 0.4 | Date parsing in import |
|
||||||
|
| pest + pest_derive | PEG parser for .improv format |
|
||||||
| flate2 | Gzip for .improv.gz |
|
| flate2 | Gzip for .improv.gz |
|
||||||
| csv | CSV parsing |
|
| csv | CSV parsing |
|
||||||
| enum_dispatch | CLI subcommand dispatch |
|
| enum_dispatch | CLI subcommand dispatch |
|
||||||
| **dev:** proptest, tempfile | Property testing, temp dirs |
|
| **dev:** proptest, tempfile, pest_meta | Property testing, temp dirs, grammar AST for test generator |
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@ -372,8 +394,21 @@ Lines / tests / path — grouped by layer.
|
|||||||
|
|
||||||
### Command layer
|
### Command layer
|
||||||
```
|
```
|
||||||
3373 / 74t command/cmd.rs Cmd trait, CmdContext, CmdRegistry, 40+ commands
|
command/cmd/ Cmd trait, CmdContext, CmdRegistry, 40+ commands
|
||||||
1068 / 22t command/keymap.rs KeyPattern, Binding, Keymap, ModeKey, 14 mode keymaps
|
297 / 2t core.rs Cmd trait, CmdContext, CmdRegistry, parse helpers
|
||||||
|
586 / 0t registry.rs default_registry() — all command registrations
|
||||||
|
475 / 10t navigation.rs Move, EnterAdvance, PageNext/Prev
|
||||||
|
198 / 6t cell.rs ClearCell, YankCell, PasteCell, TransposeAxes, SaveCmd
|
||||||
|
330 / 7t commit.rs CommitFormula, CommitCategoryAdd/ItemAdd, CommitExport
|
||||||
|
437 / 5t effect_cmds.rs effect_cmd! macro, 25+ parseable effect-wrapper commands
|
||||||
|
409 / 7t grid.rs ToggleGroup, ViewNavigate, DrillIntoCell, TogglePruneEmpty
|
||||||
|
308 / 8t mode.rs EnterMode, Quit, EditOrDrill, EnterTileSelect, etc.
|
||||||
|
587 / 13t panel.rs Panel toggle/cycle/cursor, formula/category/view panel cmds
|
||||||
|
202 / 4t search.rs SearchNavigate, SearchOrCategoryAdd, ExitSearchMode
|
||||||
|
256 / 7t text_buffer.rs AppendChar, PopChar, CommandModeBackspace, ExecuteCommand
|
||||||
|
160 / 5t tile.rs MoveTileCursor, TileAxisOp
|
||||||
|
121 / 0t mod.rs Module declarations, re-exports, test helpers
|
||||||
|
1066 / 22t command/keymap.rs KeyPattern, Binding, Keymap, ModeKey, 14 mode keymaps
|
||||||
236 / 19t command/parse.rs Script/command-line parser (prefix syntax)
|
236 / 19t command/parse.rs Script/command-line parser (prefix syntax)
|
||||||
12 / 0t command/mod.rs
|
12 / 0t command/mod.rs
|
||||||
```
|
```
|
||||||
@ -408,7 +443,8 @@ Lines / tests / path — grouped by layer.
|
|||||||
400 / 0t draw.rs TUI event loop (run_tui), frame composition
|
400 / 0t draw.rs TUI event loop (run_tui), frame composition
|
||||||
391 / 0t main.rs CLI entry (clap): open, import, cmd, script
|
391 / 0t main.rs CLI entry (clap): open, import, cmd, script
|
||||||
228 / 29t format.rs Number display formatting (view-only rounding)
|
228 / 29t format.rs Number display formatting (view-only rounding)
|
||||||
806 / 38t persistence/mod.rs .improv save/load (markdown format + gzip + legacy JSON)
|
124 / 0t persistence/improv.pest PEG grammar — single source of truth for .improv format
|
||||||
|
2291 / 83t persistence/mod.rs .improv save/load (pest parser + format + gzip + legacy JSON)
|
||||||
```
|
```
|
||||||
|
|
||||||
### Context docs
|
### Context docs
|
||||||
@ -419,7 +455,7 @@ context/repo-map.md This file
|
|||||||
docs/design-notes.md Product vision & non-goals (salvaged from former SPEC.md)
|
docs/design-notes.md Product vision & non-goals (salvaged from former SPEC.md)
|
||||||
```
|
```
|
||||||
|
|
||||||
**Total: ~16,500 lines, 510 tests.**
|
**Total: ~21,400 lines, 568 tests.**
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@ -440,7 +476,7 @@ widgets or write tests that just exercise trivial getters. Coverage should be ru
|
|||||||
| **Formula** (parser, eval) | Unit tests per operator/construct | Cover each BinOp, AggFunc, IF, WHERE, unary minus, chained formulas, error cases (div-by-zero, missing ref). Ensure eval uses full f64 precision — never display-rounded values. |
|
| **Formula** (parser, eval) | Unit tests per operator/construct | Cover each BinOp, AggFunc, IF, WHERE, unary minus, chained formulas, error cases (div-by-zero, missing ref). Ensure eval uses full f64 precision — never display-rounded values. |
|
||||||
| **View** (types, layout) | Unit tests + **proptest** | Property tests for axis assignment invariants (each category on exactly one axis, transpose is involutive, etc.). Unit tests for layout computation, records mode detection, drill. |
|
| **View** (types, layout) | Unit tests + **proptest** | Property tests for axis assignment invariants (each category on exactly one axis, transpose is involutive, etc.). Unit tests for layout computation, records mode detection, drill. |
|
||||||
| **Command** (cmd, keymap, parse) | Unit tests | Test command execution by building a `CmdContext` and asserting on returned effects. Test keymap lookup fallback chain. Test script parser with edge cases (quoting, comments, dots). |
|
| **Command** (cmd, keymap, parse) | Unit tests | Test command execution by building a `CmdContext` and asserting on returned effects. Test keymap lookup fallback chain. Test script parser with edge cases (quoting, comments, dots). |
|
||||||
| **Persistence** | Round-trip tests | `save → load → save` must be identical. Cover groups, formulas, views, hidden items, legacy JSON detection. |
|
| **Persistence** | Round-trip + grammar-generated | `save → load → save` must be identical. Grammar-walking generator produces random valid files from the pest AST; proptests verify `parse(generate())` and `parse(format(parse(generate())))`. Cover groups, formulas, views, hidden items, pipe quoting edge cases. |
|
||||||
| **Format** | Unit tests | Boundary cases: comma placement at 3/4/7 digits, negative numbers, rounding half-away-from-zero (not banker's), zero, small fractions. |
|
| **Format** | Unit tests | Boundary cases: comma placement at 3/4/7 digits, negative numbers, rounding half-away-from-zero (not banker's), zero, small fractions. |
|
||||||
| **Import** (analyzer, csv, wizard) | Unit tests | Field classification heuristics, CSV quoting (RFC 4180), multi-file merge, date extraction. |
|
| **Import** (analyzer, csv, wizard) | Unit tests | Field classification heuristics, CSV quoting (RFC 4180), multi-file merge, date extraction. |
|
||||||
| **UI rendering** (grid, panels, draw, help) | Generally skip | Ratatui widgets are hard to unit-test and change frequently. Test the *logic* they consume (layout, cat_tree, format) rather than the rendering itself. |
|
| **UI rendering** (grid, panels, draw, help) | Generally skip | Ratatui widgets are hard to unit-test and change frequently. Test the *logic* they consume (layout, cat_tree, format) rather than the rendering itself. |
|
||||||
@ -488,6 +524,6 @@ examples).
|
|||||||
5b. **Formula evaluation is fixed-point.** `recompute_formulas(none_cats)` iterates formula evaluation until values stabilize, using a cache. `evaluate_aggregated` checks the cache for formula results. Circular refs produce `CellValue::Error("circular")`.
|
5b. **Formula evaluation is fixed-point.** `recompute_formulas(none_cats)` iterates formula evaluation until values stabilize, using a cache. `evaluate_aggregated` checks the cache for formula results. Circular refs produce `CellValue::Error("circular")`.
|
||||||
6. **Keybindings are per-mode.** `ModeKey::from_app_mode()` resolves the current mode, then the corresponding `Keymap` is looked up. Normal + `search_mode=true` maps to `SearchMode`.
|
6. **Keybindings are per-mode.** `ModeKey::from_app_mode()` resolves the current mode, then the corresponding `Keymap` is looked up. Normal + `search_mode=true` maps to `SearchMode`.
|
||||||
7. **`effect_cmd!` macro** generates a command struct that just produces effects. Use for simple commands without complex logic.
|
7. **`effect_cmd!` macro** generates a command struct that just produces effects. Use for simple commands without complex logic.
|
||||||
8. **`.improv` format is markdown-like**, not JSON. See `persistence/mod.rs`. JSON is legacy only.
|
8. **`.improv` format is defined by a PEG grammar** (`persistence/improv.pest`). Parsed by pest. Names use CL-style `|...|` pipe quoting when they aren't valid bare identifiers. JSON is legacy only.
|
||||||
9. **`IndexMap`** is used for categories and views to preserve insertion order.
|
9. **`IndexMap`** is used for categories and views to preserve insertion order.
|
||||||
10. **`MAX_CATEGORIES = 12`** applies only to `CategoryKind::Regular`. Virtual/Label categories are exempt.
|
10. **`MAX_CATEGORIES = 12`** applies only to `CategoryKind::Regular`. Virtual/Label categories are exempt.
|
||||||
|
|||||||