# Improvise: Multi-Dimensional Data Modeling Terminal Application

## Context

Traditional spreadsheets conflate data, formulas, and presentation into a single flat grid addressed by opaque cell references (A1, B7). This makes models fragile, hard to audit, and impossible to rearrange without rewriting formulas.

We are building a terminal application that treats data as a multi-dimensional, semantically labeled structure, separating data, computation, and views into independent layers. The result is a tool where formulas reference meaningful names, views can be rearranged instantly, and the same dataset can be explored from multiple perspectives simultaneously.

The application compiles to a single static binary and provides a rich TUI experience.

---

## 1. Core Data Model

### 1.1 Categories and Items

- Data is organized into **categories** (dimensions) and **items** (members of a dimension).
  - Example: Category "Region" contains items "North", "South", "East", "West".
  - Example: Category "Time" contains items "Q1", "Q2", "Q3", "Q4".
- Items within a category can be organized into **groups** forming a hierarchy.
  - Example: Items "Jan", "Feb", "Mar" grouped under "Q1"; quarters grouped under "2025".
  - Groups are collapsible/expandable for drill-down.
- A model supports up to **12 categories** (virtual categories prefixed with `_` do not count).

### 1.2 Data Cells

- Each data cell is identified by the intersection of one item from each active category, not by grid coordinates.
- Cells hold numeric values, text, or empty/null.
- The underlying storage is a sparse multi-dimensional array (`HashMap`).

### 1.3 Models

- A **model** is the top-level container: it holds all categories, items, groups, data cells, formulas, and views.
- Models are saved to and loaded from a single `.improv` file (plain-text markdown-like format; see §7.1).

---

## 2. Formula System

### 2.1 Named Formulas

- Formulas reference categories and items by name, not by cell address.
  - Example: `Profit = Revenue - Cost`
  - Example: `Tax = Revenue * 0.08`
  - Example: `Margin = Profit / Revenue`
- A formula applies uniformly across all intersections of the referenced categories. No copying or dragging.

### 2.2 Formula Panel

- Formulas are defined in a **dedicated formula panel** (`F` key), separate from the data grid.
- All formulas are visible in one place for easy auditing.
- Formulas cannot be accidentally overwritten by data entry.
- Within the panel: `n`/`o` to create a new formula, `d` to delete.

### 2.3 Scoped Formulas (WHERE clause)

- A formula can be scoped to a subset of items:
  - Example: `Discount = 0.10 * Price WHERE Region = "West"`

### 2.4 Aggregation

- Built-in aggregation functions: `Sum`, `Avg`, `Min`, `Max`, `Count`.

### 2.5 Formula Language

- Expression-based (not Turing-complete).
- Operators: `+`, `-`, `*`, `/`, `^`, unary `-`.
- Comparisons: `=`, `!=`, `<`, `>`, `<=`, `>=`.
- Conditionals: `IF(condition, then, else)`.
- `WHERE` clause for filtering: `Sum(Sales WHERE Region = "East")`.
- Parentheses for grouping.
- Literal numbers and quoted strings.

---

## 3. View System

### 3.1 Views as First-Class Objects

- A **view** is a named configuration specifying:
  - Which categories are assigned to **rows**, **columns**, **pages** (filters/slicers), or **none** (hidden).
  - Which items/groups are visible vs. hidden per-view.
  - Collapsed group state per-view.
  - Number formatting (per-view format string, e.g. `,.2f`).
- Multiple views can exist per model. Each is independent.
- Editing data in any view updates the underlying model; all other views reflect the change.

### 3.2 Category Tiles

- Each category is represented as a **tile** displayed in the tile bar.
- Press `T` (or `Ctrl+Arrow`) to enter tile-select mode.
- `h`/`l` or `←`/`→` to select a tile.
- `Space`/`Enter` to cycle axis (Row → Column → Page).
- `r`/`c`/`p` to set axis directly to Row / Column / Page.
- Moving a tile triggers instant recalculation and re-render of the grid.

### 3.3 Page Axis (Slicing)

- Categories assigned to the page axis act as filters.
- The user selects a single item from a paged category using `[` and `]`.

### 3.4 Collapsing and Expanding

- Groups can be collapsed/expanded per-view with the `z` key.

### 3.5 Drill-Down and View History

- `>` drills into an aggregated cell, capturing a snapshot.
- `<` navigates back through view history.

### 3.6 Records Mode

- `R` toggles records mode: a long-format view activated when `_Index` is on Row and `_Dim` is on Column.
- `P` toggles "prune empty" to hide rows/columns with no data.

### 3.7 Transpose

- `t` swaps the Row and Column axes instantly.

---

## 4. Import Wizard

### 4.1 Purpose

- Users can import CSV or JSON files to bootstrap a model.
- Multiple CSV files can be merged with an automatic "File" category.

### 4.2 Wizard Flow (interactive TUI)

**Step 1: Preview.** Structural summary of the data.

**Step 2: Select Array Path.** (JSON only) If the JSON is not a flat array, the user selects which key path contains the primary record array.

**Step 3: Review Proposals.** Fields are analyzed and proposed as:

- Category (small number of distinct string values)
- Measure (numeric)
- Time Category (date-like strings, with optional date component extraction: Year, Month, Quarter)
- Skip (exclude from import)

**Step 4: Configure.** Set axis assignments, add formulas, name the model.

### 4.3 CLI Import

```bash
# Interactive wizard
improvise import data.json
improvise import sales.csv expenses.csv  # merge multiple CSVs

# Headless (skip wizard)
improvise import data.json --no-wizard -o model.improv

# With field overrides
improvise import data.csv \
  --category Region \
  --measure Revenue \
  --time Date \
  --extract Date:Month \
  --axis Region:row \
  --formula "Profit = Revenue - Cost" \
  --name "Sales Model" \
  -o output.improv
```

---

## 5. Terminal UI

### 5.1 Layout

```
+---------------------------------------------------------------+
| improvise · Sales 2025 (model.improv)       [+] ?:help :q quit|
+---------------------------------------------------------------+
| [Page: Region = East]                                         |
|              |   Q1    |   Q2    |   Q3    |   Q4    |        |
|--------------+---------+---------+---------+---------+--------|
| Shirts       |  1,200  |  1,450  |  1,100  |  1,800  |        |
| Pants        |    800  |    920  |    750  |  1,200  |        |
| ...                                                           |
|--------------+---------+---------+---------+---------+--------|
| Total        |  4,100  |  4,670  |  3,750  |  5,800  |        |
+---------------------------------------------------------------+
| Tiles: [Time ↔] [Product ↕] [Region ☰]     T to select        |
+---------------------------------------------------------------+
| NORMAL | Default                               ?:help :q quit |
+---------------------------------------------------------------+
```

### 5.2 Panels and Modes

- **Grid** (main): Scrollable table of the current view.
- **Tile bar**: Category tiles with axis symbols. `T` or `Ctrl+Arrow` enters tile-select mode.
- **Formula panel**: `F` to list and edit formulas. `Ctrl+F` toggles visibility without focus.
- **Category panel**: `C` to manage categories, items, and axis assignments. `Ctrl+C` toggles visibility.
- **View panel**: `V` to switch, create, delete views. `Ctrl+V` toggles visibility.
- **Status bar**: Mode indicator, active view name, keyboard hints.
- **Help overlay**: `?` or `F1` for the full key reference.

**Modes**: Normal, Insert (editing), Formula Edit, Formula Panel, Category Panel, View Panel, Tile Select, Category Add, Item Add, Export Prompt, Command (`:` prefix), Search, Import Wizard, Help, Quit.
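
The mode list above maps naturally onto a Rust enum. The following is a minimal sketch only: the variant names and the helper functions `mode_after_key` and `mode_after_escape` are hypothetical, not the application's actual identifiers, and the sketch covers only a subset of the single-key bindings documented in §5.3 (for instance, it ignores that `i`/`a` may drill into an aggregated cell instead of editing).

```rust
/// Hypothetical mode enum mirroring the mode list in §5.2.
/// Variant names are assumptions, not the application's real identifiers.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Mode {
    Normal,
    Insert,
    FormulaEdit,
    FormulaPanel,
    CategoryPanel,
    ViewPanel,
    TileSelect,
    CategoryAdd,
    ItemAdd,
    ExportPrompt,
    Command,
    Search,
    ImportWizard,
    Help,
    Quit,
}

/// Mode entered when `key` is pressed in Normal mode, for a subset of the
/// single-key bindings in §5.3. Keys that do not switch modes, or that are
/// not modeled in this sketch, leave the editor in Normal mode.
#[allow(dead_code)]
fn mode_after_key(key: char) -> Mode {
    match key {
        'i' | 'a' => Mode::Insert,
        'F' => Mode::FormulaPanel,
        'C' => Mode::CategoryPanel,
        'V' => Mode::ViewPanel,
        'T' => Mode::TileSelect,
        ':' => Mode::Command,
        '/' => Mode::Search,
        '?' => Mode::Help,
        _ => Mode::Normal,
    }
}

/// `Esc` cancels the current edit and returns to Normal (per §5.3).
#[allow(dead_code)]
fn mode_after_escape(_current: Mode) -> Mode {
    Mode::Normal
}
```

Keeping the mode as a plain `Copy` enum makes it cheap to show in the status bar and to match on exhaustively, so adding a new mode forces every `match` over `Mode` to be revisited.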

### 5.3 Navigation and Editing

| Key | Action |
|-----|--------|
| ↑↓←→ / hjkl | Move cursor |
| gg / G | Jump to first / last row |
| 0 / $ | Jump to first / last column |
| Ctrl+D / Ctrl+U | Scroll 5 rows down / up |
| PageDown / PageUp | Page scroll |
| [ / ] | Page axis prev / next |
| i / a | Enter insert mode (or drill into aggregated cell) |
| Enter | Advance to next cell while editing |
| o | Add record row and begin editing |
| Esc | Cancel edit / return to Normal |
| x | Clear cell |
| yy | Yank (copy) cell value |
| p | Paste yanked value |
| / | Search grid |
| n | Next search match |
| N | New category quick-add (or previous search match context) |
| t | Transpose (swap rows ↔ columns) |
| z | Toggle group collapse under cursor |
| H | Hide current row item |
| > | Drill into aggregated cell |
| < | Navigate back (view history) |
| R | Toggle records mode |
| P | Toggle prune-empty rows/columns |
| T / Ctrl+Arrow | Enter tile-select mode |
| F | Toggle formula panel (focus) |
| C | Toggle category panel (focus) |
| V | Toggle view panel (focus) |
| Ctrl+F / Ctrl+C / Ctrl+V | Toggle panel visibility (no focus change) |
| Tab | Cycle focus to next open panel |
| : | Enter command mode |
| Ctrl+S | Save |
| Ctrl+E | Export CSV prompt |
| ZZ | Save and quit |
| Ctrl+Q | Force quit |
| ? / F1 | Help overlay |

### 5.4 Command Mode (`:`)

| Command | Description |
|---------|-------------|
| `:q` | Quit (warns if unsaved) |
| `:q!` | Force quit |
| `:wq` / `ZZ` | Save and quit |
| `:w [path]` | Save (path optional) |
| `:export [path.csv]` | Export active view to CSV |
| `:import ` | Open import wizard |
| `:add-cat ` | Add a category |
| `:add-item ` | Add one item to a category |
| `:add-items a b c…` | Add multiple items at once |
| `:formula ` | Add a formula |
| `:add-view [name]` | Create a new view |
| `:show-item ` | Restore a hidden item |

---

## 6. Command Layer (Headless Mode)

All model mutations go through a typed command layer.
This enables:

- Scripting without the TUI
- Replay / audit log
- Testing without rendering

### 6.1 CLI Subcommands

```bash
# Open TUI (default)
improvise [model.improv]

# Single headless command(s)
improvise cmd 'set-cell Region/East Measure/Revenue 1200' -f model.improv

# Script file (one command per line, # or // comments)
improvise script setup.txt -f model.improv

# Import
improvise import data.json
```

### 6.2 Script Syntax

Commands use a prefix syntax (one per line). Multiple commands can be separated by `.` on a single line. Lines starting with `#` or `//` are comments.

### 6.3 Available Commands

Commands are registered internally and accessible via `:` in the TUI or via the `cmd`/`script` CLI subcommands. Key operations include:

- Cell manipulation: `set-cell`, `clear-cell`, `yank`, `paste`
- Navigation: `move-selection`, `scroll-rows`, `page-scroll`, `jump-first-row`, `jump-last-row`
- View: `transpose`, `toggle-records-mode`, `toggle-prune-empty`, `drill-into-cell`, `view-back`, `page-next`, `page-prev`
- Categories: `add-cat`, `add-item`, `add-items`, `delete-category-at-cursor`, `cycle-axis-at-cursor`, `filter-to-item`, `hide-selected-row-item`, `show-item`
- Formulas: `formula`, `enter-formula-edit`, `delete-formula-at-cursor`, `commit-formula`
- Views: `switch-view-at-cursor`, `create-and-switch-view`, `delete-view-at-cursor`, `add-view`
- Tiles: `enter-tile-select`, `move-tile-cursor`, `cycle-axis-for-tile`, `set-axis-for-tile`
- File: `save`, `wq`, `export`, `import`
- Modes: `enter-mode`, `search`, `enter-edit-mode`, `enter-export-prompt`

---

## 7. Persistence

### 7.1 File Format

Native format: plain-text markdown-like `.improv` file.
Structure:

```
# Model Name

## Category: Region
- North
- South
- East [Coastal]
- West [Coastal]
> Coastal

## Formulas
- Profit = Revenue - Cost [Measure]

## Data
Region=East, Measure=Revenue = 1200
Region=East, Measure=Cost = 800
Region=West, Measure=Revenue = "pending"

## View: Default (active)
Region: row
Measure: column
Time: page, Q1
hidden: Region/Internal
collapsed: Time/2024
format: ,.2f
```

Compressed variant: `.improv.gz` (gzip, same payload). Legacy JSON format is auto-detected (by `{` prefix) for backward compatibility.

### 7.2 Export

- `Ctrl+E` in TUI or `:export [path.csv]` command: exports active view to CSV.
- Respects current view axes and page filters.

### 7.3 Autosave

- Periodic autosave (every 30 seconds when dirty) to `.{filename}.autosave`.

---

## 8. Technology

| Concern | Choice |
|---------|--------|
| Language | Rust (stable) |
| TUI | [Ratatui](https://github.com/ratatui-org/ratatui) + Crossterm |
| Serialization | `serde` + `serde_json` (for legacy compat); native format is plain text |
| CLI | `clap` with subcommands (`open`, `import`, `cmd`, `script`) |
| Build | Standard `cargo build --release` (LTO, stripped); optional musl target via Nix |
| Dev environment | Nix flake with `rust-overlay` |
| No runtime deps | Single binary, no database, no network |

---

## 9. Non-Goals (v1)

- Scripting/macro language beyond the formula system.
- Collaborative/multi-user editing.
- Live external data sources (databases, APIs).
- Charts or graphical visualization.
- Multi-level undo history.

---

## 10. Verification

```bash
# Build
nix develop --command cargo build --release

# Import test (interactive wizard)
./improvise import sample.json

# Import test (headless)
./improvise import sample.json --no-wizard -o test.improv

# Headless commands
./improvise cmd 'add-cat Region' 'add-item Region East' -f new.improv

# Headless script
./improvise script tests/setup.txt -f new.improv

# TUI
./improvise model.improv
```
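
Alongside the end-to-end commands above, unit-level checks can exercise the file format directly. The following is a minimal sketch, under stated assumptions, of how a `## Data` line from §7.1 might be parsed into the sparse `HashMap` storage described in §1.2. The names `Value`, `CellKey`, `parse_data_line`, and `load_cells` are hypothetical, and the sketch assumes that the value follows the first ` = ` (with surrounding spaces) and that item names contain neither `,` nor `=`.

```rust
use std::collections::HashMap;

/// A cell value: numeric or text. Empty cells are simply absent from the map.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Number(f64),
    Text(String),
}

/// Cell address: (category, item) pairs, sorted so coordinate order is irrelevant.
type CellKey = Vec<(String, String)>;

/// Parse one data line such as `Region=East, Measure=Revenue = 1200`.
/// Returns `None` for lines that do not match the assumed shape.
fn parse_data_line(line: &str) -> Option<(CellKey, Value)> {
    // Assumption: coordinates use a bare `=`, the value is introduced by " = ".
    let (coords, raw) = line.split_once(" = ")?;

    let mut key: CellKey = Vec::new();
    for part in coords.split(',') {
        let (category, item) = part.trim().split_once('=')?;
        key.push((category.trim().to_string(), item.trim().to_string()));
    }
    key.sort();

    let raw = raw.trim();
    let value = if raw.len() >= 2 && raw.starts_with('"') && raw.ends_with('"') {
        // Quoted text: strip the surrounding double quotes.
        Value::Text(raw[1..raw.len() - 1].to_string())
    } else {
        Value::Number(raw.parse::<f64>().ok()?)
    };
    Some((key, value))
}

/// Build the sparse store; malformed lines are skipped in this sketch.
fn load_cells(lines: &[&str]) -> HashMap<CellKey, Value> {
    lines.iter().filter_map(|line| parse_data_line(line)).collect()
}
```

Sorting the coordinate pairs gives each intersection a single canonical key, so `Region=East, Measure=Revenue` and `Measure=Revenue, Region=East` address the same cell. A real loader would likely report malformed lines instead of skipping them.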