# Improvise — Multi-Dimensional Data Modeling Terminal Application

## Context

Traditional spreadsheets conflate data, formulas, and presentation into a single flat grid addressed by opaque cell references (A1, B7). This makes models fragile, hard to audit, and impossible to rearrange without rewriting formulas.

We are building a terminal application that treats data as a multi-dimensional, semantically labeled structure — separating data, computation, and views into independent layers. The result is a tool where formulas reference meaningful names, views can be rearranged instantly, and the same dataset can be explored from multiple perspectives simultaneously.

The application compiles to a single static binary (`x86_64-unknown-linux-musl`) and provides a rich TUI experience.

---

## 1. Core Data Model

### 1.1 Categories and Items

- Data is organized into **categories** (dimensions) and **items** (members of a dimension).
  - Example: Category "Region" contains items "North", "South", "East", "West".
  - Example: Category "Time" contains items "Q1", "Q2", "Q3", "Q4".
- Items within a category can be organized into **groups** forming a hierarchy.
  - Example: Items "Jan", "Feb", "Mar" grouped under "Q1"; quarters grouped under "2025".
- Groups are collapsible/expandable for drill-down.
- A model supports up to **12 categories**.

### 1.2 Data Cells

- Each data cell is identified by the intersection of one item from each active category — not by grid coordinates.
- Cells hold numeric values, text, or empty/null.
- The underlying storage is a sparse multi-dimensional array (`HashMap`).

### 1.3 Models

- A **model** is the top-level container: it holds all categories, items, groups, data cells, formulas, and views.
- Models are saved to and loaded from a single `.improv` file (JSON format).

---

## 2. Formula System

### 2.1 Named Formulas

- Formulas reference categories and items by name, not by cell address.
  - Example: `Profit = Revenue - Cost`
  - Example: `Tax = Revenue * 0.08`
  - Example: `Margin = Profit / Revenue`
- A formula applies uniformly across all intersections of the referenced categories. No copying or dragging.

### 2.2 Formula Panel

- Formulas are defined in a **dedicated formula panel**, separate from the data grid.
- All formulas are visible in one place for easy auditing.
- Formulas cannot be accidentally overwritten by data entry.

### 2.3 Scoped Formulas (WHERE clause)

- A formula can be scoped to a subset of items:
  - Example: `Discount = 0.10 * Price WHERE Region = "West"`

### 2.4 Aggregation

- Built-in aggregation functions: `SUM`, `AVG`, `MIN`, `MAX`, `COUNT`.

### 2.5 Formula Language

- Expression-based (not Turing-complete).
- Operators: `+`, `-`, `*`, `/`, `^`, unary `-`.
- Comparisons: `=`, `!=`, `<`, `>`, `<=`, `>=`.
- Conditionals: `IF(condition, then, else)`.
- `WHERE` clause for filtering: `SUM(Sales WHERE Region = "East")`.
- Parentheses for grouping.
- Literal numbers and quoted strings.

---

## 3. View System

### 3.1 Views as First-Class Objects

- A **view** is a named configuration specifying:
  - Which categories are assigned to **rows**, **columns**, and **pages** (filters/slicers).
  - Which items/groups are visible vs. hidden.
  - Sort order (future).
  - Number formatting.
- Multiple views can exist per model. Each is independent.
- Editing data in any view updates the underlying model; all other views reflect the change.

### 3.2 Category Tiles

- Each category is represented as a **tile** displayed in the tile bar.
- The user can move tiles between row, column, and page axes to instantly pivot/rearrange the view.
- Moving a tile triggers an instant recalculation and re-render of the grid.

### 3.3 Page Axis (Slicing)

- Categories assigned to the page axis act as filters.
- The user selects a single item from a paged category using `[` and `]`.
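
As a rough illustration of sections 3.1 to 3.3, the sketch below lays out a grid from sparse, name-addressed cells. It is not the application's actual implementation: the `View` struct, the `CellKey` and `Items` aliases, and the `key`, `render`, `sample`, and `view` helpers are hypothetical names, the view is simplified to one row category and one column category, and the sample values are made up.

```rust
use std::collections::{BTreeMap, HashMap};

/// Cell key: (category, item) pairs sorted by category name, so one
/// intersection always maps to one key regardless of argument order.
type CellKey = Vec<(String, String)>;
type Items = BTreeMap<String, Vec<String>>;

/// Hypothetical view: one row category, one column category, and one
/// selected item per paged category.
struct View {
    row: String,
    col: String,
    pages: BTreeMap<String, String>,
}

fn key(coords: &[(&str, &str)]) -> CellKey {
    let mut k: CellKey = coords
        .iter()
        .map(|(cat, item)| (cat.to_string(), item.to_string()))
        .collect();
    k.sort();
    k
}

/// Lay out the grid for a view. Each cell is found by name, not by position.
fn render(cells: &HashMap<CellKey, f64>, items: &Items, view: &View) -> Vec<Vec<Option<f64>>> {
    let mut grid = Vec::new();
    for r in &items[&view.row] {
        let mut line = Vec::new();
        for c in &items[&view.col] {
            let mut coords = vec![(view.row.as_str(), r.as_str()), (view.col.as_str(), c.as_str())];
            // Paged categories contribute their single selected item.
            coords.extend(view.pages.iter().map(|(cat, item)| (cat.as_str(), item.as_str())));
            line.push(cells.get(&key(&coords)).copied());
        }
        grid.push(line);
    }
    grid
}

/// Sample data: two products, two quarters, two regions (values are made up).
fn sample() -> (HashMap<CellKey, f64>, Items) {
    let mut items = Items::new();
    items.insert("Product".into(), vec!["Shirts".into(), "Pants".into()]);
    items.insert("Time".into(), vec!["Q1".into(), "Q2".into()]);
    items.insert("Region".into(), vec!["East".into(), "West".into()]);

    let mut cells = HashMap::new();
    cells.insert(key(&[("Product", "Shirts"), ("Time", "Q1"), ("Region", "East")]), 1200.0);
    cells.insert(key(&[("Product", "Shirts"), ("Time", "Q2"), ("Region", "East")]), 1450.0);
    cells.insert(key(&[("Product", "Pants"), ("Time", "Q1"), ("Region", "East")]), 800.0);
    cells.insert(key(&[("Product", "Shirts"), ("Time", "Q1"), ("Region", "West")]), 300.0);
    (cells, items)
}

/// Convenience constructor for a view with a single paged category.
fn view(row: &str, col: &str, page_cat: &str, page_item: &str) -> View {
    let mut pages = BTreeMap::new();
    pages.insert(page_cat.to_string(), page_item.to_string());
    View { row: row.into(), col: col.into(), pages }
}
```

Calling `render` with `view("Product", "Time", "Region", "East")` and then with `view("Time", "Product", "Region", "East")` returns the same values transposed, and switching the page item to `"West"` changes which cells are found. The stored cells are never rewritten, which is the property that makes pivoting cheap in this kind of design.
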
### 3.4 Collapsing and Expanding

- Groups can be collapsed/expanded per-view (future: keyboard shortcut in grid).

---

## 4. JSON Import Wizard

### 4.1 Purpose

- Users can import arbitrary JSON files to bootstrap a model.

### 4.2 Wizard Flow (interactive TUI)

**Step 1: Preview** — Structural summary of the JSON.

**Step 2: Select Array Path** — If the JSON is not a flat array, the user selects which key path contains the primary record array.

**Step 3: Review Proposals** — Fields are analyzed and proposed as:

- Category (small number of distinct string values)
- Measure (numeric)
- Time Category (date-like strings)
- Label/Identifier (skip)

**Step 4: Name the Model** — User names the model and confirms.

### 4.3 Headless Import

```
improvise --cmd '{"op":"ImportJson","path":"data.json"}'
```

---

## 5. Terminal UI

### 5.1 Layout

```
+---------------------------------------------------------------+
| Improvise | Model: Sales 2025 [*]          [F1 Help] [Ctrl+Q] |
+---------------------------------------------------------------+
| [Page: Region = East]                                         |
|              |   Q1    |   Q2    |   Q3    |   Q4    |        |
|--------------+---------+---------+---------+---------+--------|
| Shirts       |   1,200 |   1,450 |   1,100 |   1,800 |        |
| Pants        |     800 |     920 |     750 |   1,200 |        |
| ...                                                           |
|--------------+---------+---------+---------+---------+--------|
| Total        |   4,100 |   4,670 |   3,750 |   5,800 |        |
+---------------------------------------------------------------+
| Tiles: [Time ↔] [Product ↕] [Region ☰]        Ctrl+↑↓←→ tiles |
+---------------------------------------------------------------+
| NORMAL | Default | Ctrl+F:formulas Ctrl+C:categories ...      |
+---------------------------------------------------------------+
```

### 5.2 Panels

- **Grid panel** (main): Scrollable table of the current view.
- **Tile bar**: Category tiles with axis symbols. `Ctrl+Arrow` enters tile-select mode.
- **Formula panel**: `Ctrl+F` — list and edit formulas.
- **Category panel**: `Ctrl+C` — manage categories and axis assignments.
- **View panel**: `Ctrl+V` — switch, create, delete views.
- **Status bar**: Mode, active view name, keyboard hints.

### 5.3 Navigation and Editing

| Key | Action |
|-----|--------|
| ↑↓←→ / hjkl | Move cursor |
| Enter | Edit cell |
| Esc | Cancel edit |
| Tab | Focus next open panel |
| / | Search |
| [ / ] | Page axis prev/next |
| Ctrl+Arrow | Tile select mode |
| Enter/Space (tile) | Cycle axis (Row→Col→Page) |
| r / c / p (tile) | Set axis directly |
| Ctrl+F | Toggle formula panel |
| Ctrl+C | Toggle category panel |
| Ctrl+V | Toggle view panel |
| Ctrl+S | Save |
| Ctrl+E | Export CSV |
| F1 | Help |
| Ctrl+Q | Quit |

---

## 6. Command Layer (Headless Mode)

All model mutations go through a typed command layer. This enables:

- Scripting without the TUI
- Replay / audit log
- Testing without rendering

### 6.1 Command Format

JSON object with an `op` field:

```json
{"op": "CommandName", ...args}
```

### 6.2 Available Commands

| op | Required fields | Description |
|----|-----------------|-------------|
| `AddCategory` | `name` | Add a category/dimension |
| `AddItem` | `category`, `item` | Add an item to a category |
| `AddItemInGroup` | `category`, `item`, `group` | Add an item in a named group |
| `SetCell` | `coords: [[cat,item],...]`, `number` or `text` | Set a cell value |
| `ClearCell` | `coords` | Clear a cell |
| `AddFormula` | `raw`, `target_category` | Add/replace a formula |
| `RemoveFormula` | `target` | Remove a formula by target name |
| `CreateView` | `name` | Create a new view |
| `DeleteView` | `name` | Delete a view |
| `SwitchView` | `name` | Switch the active view |
| `SetAxis` | `category`, `axis` (`"row"/"column"/"page"`) | Set category axis |
| `SetPageSelection` | `category`, `item` | Set page-axis filter |
| `ToggleGroup` | `category`, `group` | Toggle group collapse |
| `Save` | `path` | Save model to file |
| `Load` | `path` | Load model from file |
| `ExportCsv` | `path` | Export active view to CSV |
| `ImportJson` | `path`, `model_name?`, `array_path?` | Import JSON file |

### 6.3 Response Format

```json
{"ok": true, "message": "optional message"}
{"ok": false, "message": "error description"}
```

### 6.4 Invocation

```bash
# Single command
improvise model.improv --cmd '{"op":"SetCell","coords":[["Region","East"],["Measure","Revenue"]],"number":1200}'

# Script file (one JSON object per line, # comments allowed)
improvise model.improv --script setup.jsonl
```

---

## 7. Persistence

### 7.1 File Format

Native format: JSON-based `.improv` file containing all categories, items, groups, data cells, formulas, and view definitions.

Compressed variant: `.improv.gz` (gzip, same JSON payload).

### 7.2 Export

- `Ctrl+E` in TUI or `ExportCsv` command: exports active view to CSV.

### 7.3 Autosave

- Periodic autosave (every 30 seconds when dirty) to `.model.improv.autosave`.

---

## 8. Technology

| Concern | Choice |
|---------|--------|
| Language | Rust (stable) |
| TUI | [Ratatui](https://github.com/ratatui-org/ratatui) + Crossterm |
| Serialization | `serde` + `serde_json` |
| Static binary | `x86_64-unknown-linux-musl` via `musl-gcc` |
| Dev environment | Nix flake with `rust-overlay` |
| No runtime deps | Single binary, no database, no network |

---

## 9. Non-Goals (v1)

- Scripting/macro language beyond the formula system.
- Collaborative/multi-user editing.
- Live external data sources (databases, APIs).
- Charts or graphical visualization.
- Multi-level undo history.

---

## 10. Verification

```bash
# Build
nix develop --command cargo build --release
file target/x86_64-unknown-linux-musl/release/improvise  # → statically linked

# Import test
./improvise --cmd '{"op":"ImportJson","path":"sample.json"}' --cmd '{"op":"Save","path":"test.improv"}'

# Formula test
./improvise test.improv \
  --cmd '{"op":"AddFormula","raw":"Profit = Revenue - Cost","target_category":"Measure"}'

# Headless script
./improvise new.improv --script tests/setup.jsonl

# TUI
./improvise model.improv
```
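
The headless checks above all pass through the command layer described in section 6. A minimal sketch of what such a typed layer could look like follows, covering only four of the listed ops. The `Model`, `Command`, `Value`, and `Response` types, the validation rules, and the error strings are assumptions for illustration. JSON parsing, which would presumably go through `serde_json`, is left out so the block depends only on the standard library.

```rust
use std::collections::{BTreeMap, HashMap};

/// (category, item) pairs, sorted before use as a storage key.
type CellKey = Vec<(String, String)>;

#[derive(Debug, Clone, PartialEq)]
enum Value {
    Number(f64),
    Text(String),
}

/// A subset of the `op` values from section 6.2, as a typed enum.
#[derive(Debug, Clone)]
enum Command {
    AddCategory { name: String },
    AddItem { category: String, item: String },
    SetCell { coords: CellKey, value: Value },
    ClearCell { coords: CellKey },
}

/// Mirrors the response shape `{"ok": ..., "message": ...}`.
#[derive(Debug, PartialEq)]
struct Response {
    ok: bool,
    message: Option<String>,
}

#[derive(Default)]
struct Model {
    categories: BTreeMap<String, Vec<String>>,
    cells: HashMap<CellKey, Value>,
}

impl Model {
    /// Single entry point for mutations, so a command log can be replayed.
    fn apply(&mut self, cmd: Command) -> Response {
        match self.try_apply(cmd) {
            Ok(()) => Response { ok: true, message: None },
            Err(e) => Response { ok: false, message: Some(e) },
        }
    }

    fn try_apply(&mut self, cmd: Command) -> Result<(), String> {
        match cmd {
            Command::AddCategory { name } => {
                // The 12-category limit comes from section 1.1.
                if self.categories.len() >= 12 && !self.categories.contains_key(&name) {
                    return Err("a model supports up to 12 categories".into());
                }
                self.categories.entry(name).or_default();
            }
            Command::AddItem { category, item } => {
                let items = self
                    .categories
                    .get_mut(&category)
                    .ok_or_else(|| format!("unknown category: {category}"))?;
                if !items.contains(&item) {
                    items.push(item);
                }
            }
            Command::SetCell { coords, value } => {
                let key = self.checked_key(coords)?;
                self.cells.insert(key, value);
            }
            Command::ClearCell { coords } => {
                let key = self.checked_key(coords)?;
                self.cells.remove(&key);
            }
        }
        Ok(())
    }

    /// Assumed rule: every (category, item) pair must already exist.
    fn checked_key(&self, mut coords: CellKey) -> Result<CellKey, String> {
        for (cat, item) in &coords {
            let known = self.categories.get(cat).map_or(false, |items| items.contains(item));
            if !known {
                return Err(format!("unknown item: {cat}/{item}"));
            }
        }
        coords.sort();
        Ok(coords)
    }
}
```

Because the TUI and the `--cmd` / `--script` paths would both call something like `apply`, tests can drive the model without rendering, which is the benefit section 6 lists.
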