improvise/SPEC.md
Ed L eae00522e2 Initial implementation of Improvise TUI
Multi-dimensional data modeling terminal application with:
- Core data model: categories, items, groups, sparse cell store
- Formula system: recursive-descent parser, named formulas, WHERE clauses
- View system: Row/Column/Page axes, tile-based pivot, page slicing
- JSON import wizard (interactive TUI + headless auto-mode)
- Command layer: all mutations via typed Command enum for headless replay
- TUI: Ratatui grid, tile bar, formula/category/view panels, help overlay
- Persistence: .improv (JSON), .improv.gz (gzip), CSV export, autosave
- Static binary via x86_64-unknown-linux-musl + nix flake devShell
- Headless mode: --cmd '{"op":"..."}' and --script file.jsonl

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-20 21:11:55 -07:00

Improvise — Multi-Dimensional Data Modeling Terminal Application

Context

Traditional spreadsheets conflate data, formulas, and presentation into a single flat grid addressed by opaque cell references (A1, B7). This makes models fragile, hard to audit, and difficult to rearrange without rewriting formulas. We are building a terminal application that treats data as a multi-dimensional, semantically labeled structure, separating data, computation, and views into independent layers. The result is a tool where formulas reference meaningful names, views can be rearranged instantly, and the same dataset can be explored from multiple perspectives simultaneously.

The application compiles to a single static binary (x86_64-unknown-linux-musl) and provides a full-screen terminal UI (TUI) built on Ratatui.


1. Core Data Model

1.1 Categories and Items

  • Data is organized into categories (dimensions) and items (members of a dimension).
    • Example: Category "Region" contains items "North", "South", "East", "West".
    • Example: Category "Time" contains items "Q1", "Q2", "Q3", "Q4".
  • Items within a category can be organized into groups forming a hierarchy.
    • Example: Items "Jan", "Feb", "Mar" grouped under "Q1"; quarters grouped under "2025".
  • Groups are collapsible/expandable for drill-down.
  • A model supports up to 12 categories.
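The category, item, and group relationships above can be sketched in Rust roughly as follows. This is a minimal illustration, not the actual implementation: the type names `Category` and `Model`, the `parent` map, and the helper methods are all hypothetical; only the limit of 12 categories comes from this spec.

```rust
use std::collections::BTreeMap;

/// Upper bound on categories per model, as stated in section 1.1.
const MAX_CATEGORIES: usize = 12;

/// A dimension such as "Region" or "Time" (hypothetical layout).
#[derive(Debug, Default)]
struct Category {
    name: String,
    /// Items in insertion order, e.g. ["Jan", "Feb", "Mar"].
    items: Vec<String>,
    /// Maps an item or group to its enclosing group: "Jan" -> "Q1", "Q1" -> "2025".
    parent: BTreeMap<String, String>,
}

impl Category {
    fn new(name: &str) -> Self {
        Category { name: name.to_string(), ..Default::default() }
    }

    /// Adds an item, optionally inside a group; a repeated item is not duplicated.
    fn add_item(&mut self, item: &str, group: Option<&str>) {
        if !self.items.iter().any(|i| i == item) {
            self.items.push(item.to_string());
        }
        if let Some(g) = group {
            self.parent.insert(item.to_string(), g.to_string());
        }
    }

    /// Nests one group under another, e.g. "Q1" under "2025".
    fn nest_group(&mut self, group: &str, parent: &str) {
        self.parent.insert(group.to_string(), parent.to_string());
    }

    /// Returns the enclosing groups of `name`, innermost first.
    fn ancestors(&self, name: &str) -> Vec<String> {
        let mut chain = Vec::new();
        let mut current = name;
        // Bounded by the number of links so a cyclic grouping cannot loop forever.
        for _ in 0..self.parent.len() {
            match self.parent.get(current) {
                Some(p) => {
                    chain.push(p.clone());
                    current = p.as_str();
                }
                None => break,
            }
        }
        chain
    }
}

#[derive(Debug, Default)]
struct Model {
    categories: Vec<Category>,
}

impl Model {
    /// Rejects duplicates and enforces the 12-category limit.
    fn add_category(&mut self, name: &str) -> Result<(), String> {
        if self.categories.len() >= MAX_CATEGORIES {
            return Err(format!("a model supports at most {} categories", MAX_CATEGORIES));
        }
        if self.categories.iter().any(|c| c.name == name) {
            return Err(format!("category '{}' already exists", name));
        }
        self.categories.push(Category::new(name));
        Ok(())
    }
}
```

Storing the hierarchy as child-to-parent links keeps drill-down lookups simple; a real implementation may well prefer an explicit tree.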

1.2 Data Cells

  • Each data cell is identified by the intersection of one item from each active category — not by grid coordinates.
  • Cells hold numeric values, text, or empty/null.
  • The underlying storage is a sparse multi-dimensional array (HashMap<CellKey, CellValue>).
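A sparse store of this shape might look like the sketch below. `HashMap<CellKey, CellValue>` is taken from the spec; the internal layout of `CellKey` (sorted category/item pairs) and the `CellStore` wrapper are assumptions made for illustration.

```rust
use std::collections::HashMap;

/// A cell address: (category, item) pairs, kept sorted so that the same
/// intersection always produces the same key regardless of argument order.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct CellKey(Vec<(String, String)>);

impl CellKey {
    fn new(coords: &[(&str, &str)]) -> Self {
        let mut pairs: Vec<(String, String)> = coords
            .iter()
            .map(|(c, i)| (c.to_string(), i.to_string()))
            .collect();
        pairs.sort();
        CellKey(pairs)
    }
}

/// Numeric or text content; an empty cell is simply absent from the map.
#[derive(Debug, Clone, PartialEq)]
enum CellValue {
    Number(f64),
    Text(String),
}

#[derive(Debug, Default)]
struct CellStore {
    cells: HashMap<CellKey, CellValue>,
}

impl CellStore {
    fn set(&mut self, key: CellKey, value: CellValue) {
        self.cells.insert(key, value);
    }

    fn get(&self, key: &CellKey) -> Option<&CellValue> {
        self.cells.get(key)
    }

    /// Clearing removes the entry, which keeps the store sparse.
    fn clear(&mut self, key: &CellKey) {
        self.cells.remove(key);
    }
}
```

Because only populated intersections are stored, memory use tracks the number of entered values rather than the product of all category sizes.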

1.3 Models

  • A model is the top-level container: it holds all categories, items, groups, data cells, formulas, and views.
  • Models are saved to and loaded from a single .improv file (JSON format), or its gzip-compressed .improv.gz variant.

2. Formula System

2.1 Named Formulas

  • Formulas reference categories and items by name, not by cell address.
    • Example: Profit = Revenue - Cost
    • Example: Tax = Revenue * 0.08
    • Example: Margin = Profit / Revenue
  • A formula applies uniformly across all intersections of the referenced categories. No copying or dragging.
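To illustrate what "applies uniformly" means, the sketch below evaluates `Profit = Revenue - Cost` once per region. It assumes a simplified two-category model keyed by (region, measure); the `Cells` alias and `apply_difference` function are hypothetical and hard-code subtraction rather than parsing the formula.

```rust
use std::collections::{BTreeSet, HashMap};

/// Cells keyed by (region, measure): a two-category stand-in for the real model.
type Cells = HashMap<(String, String), f64>;

/// Evaluates `target = lhs - rhs` for every region that has both inputs.
fn apply_difference(cells: &mut Cells, target: &str, lhs: &str, rhs: &str) {
    // Collect the regions first so the map is not mutated while iterating.
    let regions: BTreeSet<String> = cells.keys().map(|(r, _)| r.clone()).collect();
    for region in regions {
        let a = cells.get(&(region.clone(), lhs.to_string())).copied();
        let b = cells.get(&(region.clone(), rhs.to_string())).copied();
        if let (Some(a), Some(b)) = (a, b) {
            cells.insert((region, target.to_string()), a - b);
        }
    }
}
```

A region with Revenue but no Cost is skipped here; how missing operands should behave is not specified, so that choice is an assumption.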

2.2 Formula Panel

  • Formulas are defined in a dedicated formula panel, separate from the data grid.
  • All formulas are visible in one place for easy auditing.
  • Formulas cannot be accidentally overwritten by data entry.

2.3 Scoped Formulas (WHERE clause)

  • A formula can be scoped to a subset of items:
    • Example: Discount = 0.10 * Price WHERE Region = "West"
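A scoped formula can be pictured as the same per-intersection evaluation with a filter in front. The sketch below hard-codes `Discount = rate * Price WHERE Region = scope` over a simplified (region, measure) store; the `Cells` alias and `apply_scoped_discount` function are hypothetical.

```rust
use std::collections::HashMap;

/// Cells keyed by (region, measure): a two-category stand-in for the real model.
type Cells = HashMap<(String, String), f64>;

/// Writes `Discount = rate * Price`, but only for the region named by `scope`.
fn apply_scoped_discount(cells: &mut Cells, rate: f64, scope: &str) {
    // Gather matching prices first so the map is not mutated while iterating.
    let prices: Vec<(String, f64)> = cells
        .iter()
        .filter(|((region, measure), _)| measure.as_str() == "Price" && region.as_str() == scope)
        .map(|((region, _), price)| (region.clone(), *price))
        .collect();
    for (region, price) in prices {
        cells.insert((region, "Discount".to_string()), rate * price);
    }
}
```

Intersections outside the scope are left untouched, so they remain empty rather than being set to zero; that behavior is an assumption.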

2.4 Aggregation

  • Built-in aggregation functions: SUM, AVG, MIN, MAX, COUNT.
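The five aggregations could be sketched as below. The `Agg` enum and `aggregate` function are hypothetical names, and the treatment of empty cells (ignored, with AVG/MIN/MAX of nothing yielding no value) is an assumption, since the spec does not define it.

```rust
/// The built-in aggregation functions.
#[derive(Debug, Clone, Copy)]
enum Agg {
    Sum,
    Avg,
    Min,
    Max,
    Count,
}

/// Aggregates the non-empty values in `values`.
/// COUNT and SUM of nothing are 0; AVG, MIN, and MAX of nothing are `None`.
fn aggregate(agg: Agg, values: &[Option<f64>]) -> Option<f64> {
    let present: Vec<f64> = values.iter().flatten().copied().collect();
    match agg {
        Agg::Count => Some(present.len() as f64),
        Agg::Sum => Some(present.iter().sum::<f64>()),
        _ if present.is_empty() => None,
        Agg::Avg => Some(present.iter().sum::<f64>() / present.len() as f64),
        Agg::Min => Some(present.iter().copied().fold(f64::INFINITY, f64::min)),
        Agg::Max => Some(present.iter().copied().fold(f64::NEG_INFINITY, f64::max)),
    }
}
```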

2.5 Formula Language

  • Expression-based (not Turing-complete).
  • Operators: +, -, *, /, ^, unary -.
  • Comparisons: =, !=, <, >, <=, >=.
  • Conditionals: IF(condition, then, else).
  • WHERE clause for filtering: SUM(Sales WHERE Region = "East").
  • Parentheses for grouping.
  • Literal numbers and quoted strings.
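The arithmetic core of such a language fits a small recursive-descent evaluator with one method per precedence level. The sketch below covers only numbers, names, the operators +, -, *, /, ^, unary -, and parentheses; comparisons, IF, WHERE, and strings are left out. The `Parser` type, the `vars` lookup table, and the choice that `^` is right-associative and binds tighter than unary minus are assumptions, not taken from this spec.

```rust
use std::collections::HashMap;

/// Recursive-descent evaluator; names are resolved through `vars`.
struct Parser<'a> {
    chars: Vec<char>,
    pos: usize,
    vars: &'a HashMap<String, f64>,
}

impl<'a> Parser<'a> {
    /// Skips whitespace and returns the next character without consuming it.
    fn peek(&mut self) -> Option<char> {
        while self.pos < self.chars.len() && self.chars[self.pos].is_whitespace() {
            self.pos += 1;
        }
        self.chars.get(self.pos).copied()
    }

    fn eat(&mut self, c: char) -> bool {
        if self.peek() == Some(c) { self.pos += 1; true } else { false }
    }

    /// Consumes characters while `keep` holds and returns them as a string.
    fn take_while(&mut self, keep: fn(char) -> bool) -> String {
        let start = self.pos;
        while self.pos < self.chars.len() && keep(self.chars[self.pos]) {
            self.pos += 1;
        }
        self.chars[start..self.pos].iter().collect()
    }

    // expr := term (('+' | '-') term)*
    fn expr(&mut self) -> Result<f64, String> {
        let mut value = self.term()?;
        loop {
            if self.eat('+') { value += self.term()?; }
            else if self.eat('-') { value -= self.term()?; }
            else { return Ok(value); }
        }
    }

    // term := unary (('*' | '/') unary)*
    fn term(&mut self) -> Result<f64, String> {
        let mut value = self.unary()?;
        loop {
            if self.eat('*') { value *= self.unary()?; }
            else if self.eat('/') { value /= self.unary()?; }
            else { return Ok(value); }
        }
    }

    // unary := '-' unary | power
    fn unary(&mut self) -> Result<f64, String> {
        if self.eat('-') { Ok(-self.unary()?) } else { self.power() }
    }

    // power := atom ('^' unary)?   (right-associative)
    fn power(&mut self) -> Result<f64, String> {
        let base = self.atom()?;
        if self.eat('^') { Ok(base.powf(self.unary()?)) } else { Ok(base) }
    }

    // atom := number | name | '(' expr ')'
    fn atom(&mut self) -> Result<f64, String> {
        match self.peek() {
            Some('(') => {
                self.pos += 1;
                let value = self.expr()?;
                if self.eat(')') { Ok(value) } else { Err("expected ')'".to_string()) }
            }
            Some(c) if c.is_ascii_digit() || c == '.' => {
                let text = self.take_while(|c| c.is_ascii_digit() || c == '.');
                text.parse::<f64>().map_err(|e| format!("bad number '{}': {}", text, e))
            }
            Some(c) if c.is_alphabetic() => {
                let name = self.take_while(|c| c.is_alphanumeric() || c == '_');
                self.vars.get(&name).copied().ok_or_else(|| format!("unknown name '{}'", name))
            }
            other => Err(format!("unexpected input: {:?}", other)),
        }
    }
}

/// Evaluates `src`, rejecting any trailing input.
fn evaluate(src: &str, vars: &HashMap<String, f64>) -> Result<f64, String> {
    let mut p = Parser { chars: src.chars().collect(), pos: 0, vars };
    let value = p.expr()?;
    if p.peek().is_some() {
        return Err(format!("trailing input at position {}", p.pos));
    }
    Ok(value)
}
```

With `Revenue` and `Cost` bound in `vars`, `evaluate("Revenue - Cost", &vars)` computes the right-hand side of the Profit example. Because the grammar has no loops or user-defined recursion, every expression terminates, which matches the "not Turing-complete" requirement.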

3. View System

3.1 Views as First-Class Objects

  • A view is a named configuration specifying:
    • Which categories are assigned to rows, columns, and pages (filters/slicers).
    • Which items/groups are visible vs. hidden.
    • Sort order (future).
    • Number formatting.
  • Multiple views can exist per model. Each is independent.
  • Editing data in any view updates the underlying model; all other views reflect the change.

3.2 Category Tiles

  • Each category is represented as a tile displayed in the tile bar.
  • The user can move tiles between row, column, and page axes to instantly pivot/rearrange the view.
  • Moving a tile triggers an instant recalculation and re-render of the grid.

3.3 Page Axis (Slicing)

  • Categories assigned to the page axis act as filters.
  • Exactly one item of each paged category is selected at a time; the user steps through the items with [ (previous) and ] (next).
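Axis assignment and page stepping can be sketched as below. The `Axis` enum and `step_page` function are hypothetical. Two behaviors are assumptions rather than spec: cycling wraps from Page back to Row, and stepping past the last page item wraps around to the first.

```rust
/// The three axes a category tile can be assigned to.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Axis {
    Row,
    Column,
    Page,
}

impl Axis {
    /// Cycles Row -> Column -> Page, then back to Row (the wrap is an assumption).
    fn next(self) -> Axis {
        match self {
            Axis::Row => Axis::Column,
            Axis::Column => Axis::Page,
            Axis::Page => Axis::Row,
        }
    }
}

/// Moves the selected page item by `delta` (-1 for `[`, +1 for `]`),
/// wrapping around at either end. Returns 0 for an empty category.
fn step_page(selected: usize, item_count: usize, delta: isize) -> usize {
    if item_count == 0 {
        return 0;
    }
    (selected as isize + delta).rem_euclid(item_count as isize) as usize
}
```

Since a view only stores an axis per category and a selected index per paged category, pivoting changes no cell data; only the grid layout is recomputed.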

3.4 Collapsing and Expanding

  • Groups can be collapsed/expanded per-view (future: keyboard shortcut in grid).

4. JSON Import Wizard

4.1 Purpose

  • Users can import arbitrary JSON files to bootstrap a model.

4.2 Wizard Flow (interactive TUI)

Step 1: Preview — Structural summary of the JSON.

Step 2: Select Array Path — If the JSON is not a flat array, the user selects which key path contains the primary record array.

Step 3: Review Proposals — Fields are analyzed and proposed as:

  • Category (small number of distinct string values)
  • Measure (numeric)
  • Time Category (date-like strings)
  • Label/Identifier (skip)

Step 4: Name the Model — User names the model and confirms.
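The field analysis in Step 3 could follow a heuristic along these lines. Everything here is an assumption made for illustration: the `Scalar` stand-in for JSON values, the distinct-value threshold of 20, the recognized date shapes (YYYY-MM and YYYY-MM-DD), and the order in which the rules are tried.

```rust
use std::collections::HashSet;

/// The four proposals listed in Step 3.
#[derive(Debug, PartialEq)]
enum FieldKind {
    Category,
    Measure,
    TimeCategory,
    Label,
}

/// Simplified stand-in for a JSON scalar value.
#[derive(Debug, Clone)]
enum Scalar {
    Number(f64),
    Text(String),
}

/// Hypothetical cutoff for "small number of distinct string values".
const MAX_CATEGORY_DISTINCT: usize = 20;

/// Accepts YYYY-MM and YYYY-MM-DD; a real wizard would likely accept more shapes.
fn looks_like_date(s: &str) -> bool {
    let parts: Vec<&str> = s.split('-').collect();
    (parts.len() == 2 || parts.len() == 3)
        && parts[0].len() == 4
        && parts.iter().all(|p| !p.is_empty() && p.chars().all(|c| c.is_ascii_digit()))
}

/// Proposes a role for one field given all of its observed values.
fn classify(values: &[Scalar]) -> FieldKind {
    if values.is_empty() {
        return FieldKind::Label;
    }
    if values.iter().all(|v| matches!(v, Scalar::Number(_))) {
        return FieldKind::Measure;
    }
    let texts: Vec<&str> = values
        .iter()
        .filter_map(|v| match v {
            Scalar::Text(s) => Some(s.as_str()),
            Scalar::Number(_) => None,
        })
        .collect();
    if texts.len() != values.len() {
        // Mixed numbers and strings: too ambiguous to propose automatically.
        return FieldKind::Label;
    }
    if texts.iter().all(|s| looks_like_date(s)) {
        return FieldKind::TimeCategory;
    }
    let distinct: HashSet<&str> = texts.iter().copied().collect();
    if distinct.len() <= MAX_CATEGORY_DISTINCT {
        FieldKind::Category
    } else {
        FieldKind::Label
    }
}
```

These are proposals only; in the interactive flow the user reviews and can override them before the model is created.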

4.3 Headless Import

improvise --cmd '{"op":"ImportJson","path":"data.json"}'

5. Terminal UI

5.1 Layout

+---------------------------------------------------------------+
| Improvise | Model: Sales 2025 [*]          [F1 Help] [Ctrl+Q] |
+---------------------------------------------------------------+
|  [Page: Region = East]                                        |
|              | Q1      | Q2      | Q3      | Q4      |        |
|--------------+---------+---------+---------+---------+--------|
| Shirts       |   1,200 |   1,450 |   1,100 |   1,800 |        |
| Pants        |     800 |     920 |     750 |   1,200 |        |
| ...                                                           |
|--------------+---------+---------+---------+---------+--------|
| Total        |   4,100 |   4,670 |   3,750 |   5,800 |        |
+---------------------------------------------------------------+
| Tiles: [Time ↔] [Product ↕] [Region ☰]    Ctrl+↑↓←→ tiles   |
+---------------------------------------------------------------+
| NORMAL | Default | Ctrl+F:formulas  Ctrl+C:categories ...     |
+---------------------------------------------------------------+

5.2 Panels

  • Grid panel (main): Scrollable table of the current view.
  • Tile bar: Category tiles with axis symbols. Ctrl+Arrow enters tile-select mode.
  • Formula panel: Ctrl+F — list and edit formulas.
  • Category panel: Ctrl+C — manage categories and axis assignments.
  • View panel: Ctrl+V — switch, create, delete views.
  • Status bar: Mode, active view name, keyboard hints.

5.3 Navigation and Editing

| Key | Action |
|-----|--------|
| ↑↓←→ / hjkl | Move cursor |
| Enter | Edit cell |
| Esc | Cancel edit |
| Tab | Focus next open panel |
| / | Search |
| [ / ] | Page axis prev/next |
| Ctrl+Arrow | Tile select mode |
| Enter/Space (tile) | Cycle axis (Row→Col→Page) |
| r / c / p (tile) | Set axis directly |
| Ctrl+F | Toggle formula panel |
| Ctrl+C | Toggle category panel |
| Ctrl+V | Toggle view panel |
| Ctrl+S | Save |
| Ctrl+E | Export CSV |
| F1 | Help |
| Ctrl+Q | Quit |

6. Command Layer (Headless Mode)

All model mutations go through a typed command layer. This enables:

  • Scripting without the TUI
  • Replay / audit log
  • Testing without rendering

6.1 Command Format

JSON object with an op field:

{"op": "CommandName", ...args}

6.2 Available Commands

| op | Required fields | Description |
|----|-----------------|-------------|
| AddCategory | name | Add a category/dimension |
| AddItem | category, item | Add an item to a category |
| AddItemInGroup | category, item, group | Add an item in a named group |
| SetCell | coords: [[cat,item],...], number or text | Set a cell value |
| ClearCell | coords | Clear a cell |
| AddFormula | raw, target_category | Add/replace a formula |
| RemoveFormula | target | Remove a formula by target name |
| CreateView | name | Create a new view |
| DeleteView | name | Delete a view |
| SwitchView | name | Switch the active view |
| SetAxis | category, axis ("row"/"column"/"page") | Set category axis |
| SetPageSelection | category, item | Set page-axis filter |
| ToggleGroup | category, group | Toggle group collapse |
| Save | path | Save model to file |
| Load | path | Load model from file |
| ExportCsv | path | Export active view to CSV |
| ImportJson | path, model_name?, array_path? | Import JSON file |

6.3 Response Format

{"ok": true, "message": "optional message"}
{"ok": false, "message": "error description"}

6.4 Invocation

# Single command
improvise model.improv --cmd '{"op":"SetCell","coords":[["Region","East"],["Measure","Revenue"]],"number":1200}'

# Script file (one JSON object per line, # comments allowed)
improvise model.improv --script setup.jsonl
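Reading a script file reduces to filtering lines before each one is parsed as a command. The sketch below shows only that filtering step; the function name `script_commands` is hypothetical, and skipping blank lines is an assumption (the spec only mentions # comments).

```rust
/// Returns the command lines of a script paired with their 1-based line numbers,
/// skipping blank lines and lines that start with '#'.
fn script_commands(script: &str) -> Vec<(usize, &str)> {
    script
        .lines()
        .enumerate()
        .map(|(i, line)| (i + 1, line.trim()))
        .filter(|(_, line)| !line.is_empty() && !line.starts_with('#'))
        .collect()
}
```

Keeping the original line numbers lets an error response point at the offending line of the script.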

7. Persistence

7.1 File Format

Native format: JSON-based .improv file containing all categories, items, groups, data cells, formulas, and view definitions.

Compressed variant: .improv.gz (gzip, same JSON payload).

7.2 Export

  • Ctrl+E in TUI or ExportCsv command: exports active view to CSV.
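CSV export needs to quote fields that contain separators, which matters here because formatted numbers such as 1,200 contain a comma. The sketch below applies the usual RFC 4180 quoting rule; the function names are hypothetical, and whether the export writes formatted or raw numbers is not specified.

```rust
/// Quotes a field when it contains a comma, a double quote, or a line break;
/// embedded double quotes are doubled.
fn csv_field(field: &str) -> String {
    if field.contains(|c: char| matches!(c, ',' | '"' | '\n' | '\r')) {
        format!("\"{}\"", field.replace('"', "\"\""))
    } else {
        field.to_string()
    }
}

/// Joins already-stringified cells into one CSV record (without a line terminator).
fn csv_row(fields: &[&str]) -> String {
    fields
        .iter()
        .map(|f| csv_field(f))
        .collect::<Vec<String>>()
        .join(",")
}
```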

7.3 Autosave

  • Periodic autosave (every 30 seconds when dirty) to .model.improv.autosave.
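The autosave rule can be expressed as two small functions. This sketch assumes the autosave file sits next to the model file and is named by prefixing a dot and appending .autosave (so model.improv becomes .model.improv.autosave); the function names are hypothetical.

```rust
use std::path::{Path, PathBuf};
use std::time::{Duration, Instant};

/// Autosave period from section 7.3.
const AUTOSAVE_INTERVAL: Duration = Duration::from_secs(30);

/// Maps `dir/model.improv` to `dir/.model.improv.autosave`.
/// Returns `None` if the path has no file name or it is not valid UTF-8.
fn autosave_path(model_path: &Path) -> Option<PathBuf> {
    let name = model_path.file_name()?.to_str()?;
    Some(model_path.with_file_name(format!(".{}.autosave", name)))
}

/// True when there are unsaved changes and at least 30 seconds have passed
/// since the last save.
fn autosave_due(dirty: bool, last_save: Instant, now: Instant) -> bool {
    dirty && now.duration_since(last_save) >= AUTOSAVE_INTERVAL
}
```

Passing `now` in as a parameter, rather than calling `Instant::now()` inside, keeps the timing rule testable without waiting.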

8. Technology

| Concern | Choice |
|---------|--------|
| Language | Rust (stable) |
| TUI | Ratatui + Crossterm |
| Serialization | serde + serde_json |
| Static binary | x86_64-unknown-linux-musl via musl-gcc |
| Dev environment | Nix flake with rust-overlay |
| No runtime deps | Single binary, no database, no network |

9. Non-Goals (v1)

  • Scripting/macro language beyond the formula system.
  • Collaborative/multi-user editing.
  • Live external data sources (databases, APIs).
  • Charts or graphical visualization.
  • Multi-level undo history.

10. Verification

# Build
nix develop --command cargo build --release
file target/x86_64-unknown-linux-musl/release/improvise  # → statically linked

# Import test
./improvise --cmd '{"op":"ImportJson","path":"sample.json"}' --cmd '{"op":"Save","path":"test.improv"}'

# Formula test
./improvise test.improv \
  --cmd '{"op":"AddFormula","raw":"Profit = Revenue - Cost","target_category":"Measure"}'

# Headless script
./improvise new.improv --script tests/setup.jsonl

# TUI
./improvise model.improv