improvise/context/SPEC.md
2026-04-11 00:06:49 -07:00

Improvise — Multi-Dimensional Data Modeling Terminal Application

Context

Traditional spreadsheets conflate data, formulas, and presentation into a single flat grid addressed by opaque cell references (A1, B7). This makes models fragile, hard to audit, and impossible to rearrange without rewriting formulas. We are building a terminal application that treats data as a multi-dimensional, semantically labeled structure — separating data, computation, and views into independent layers. The result is a tool where formulas reference meaningful names, views can be rearranged instantly, and the same dataset can be explored from multiple perspectives simultaneously.

The application compiles to a single static binary and runs entirely in the terminal with a keyboard-driven TUI.


1. Core Data Model

1.1 Categories and Items

  • Data is organized into categories (dimensions) and items (members of a dimension).
    • Example: Category "Region" contains items "North", "South", "East", "West".
    • Example: Category "Time" contains items "Q1", "Q2", "Q3", "Q4".
  • Items within a category can be organized into groups forming a hierarchy.
    • Example: Items "Jan", "Feb", "Mar" grouped under "Q1"; quarters grouped under "2025".
  • Groups are collapsible/expandable for drill-down.
  • A model supports up to 12 categories (virtual categories prefixed with _ do not count).
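The category, item, and group relationships above can be pictured with a small data structure. This is a minimal sketch, not the application's actual types: the `Category` struct, its `parent` map, and the helper names are all hypothetical.

```rust
use std::collections::BTreeMap;

/// One dimension of the model, e.g. "Region" or "Time" (hypothetical type).
#[derive(Debug, Default)]
struct Category {
    name: String,
    /// Items in insertion order.
    items: Vec<String>,
    /// Maps an item or group name to its parent group, forming the hierarchy.
    parent: BTreeMap<String, String>,
}

impl Category {
    fn new(name: &str) -> Self {
        Category { name: name.to_string(), ..Default::default() }
    }

    /// Add an item, optionally placing it in a group.
    fn add_item(&mut self, item: &str, group: Option<&str>) {
        self.items.push(item.to_string());
        if let Some(g) = group {
            self.parent.insert(item.to_string(), g.to_string());
        }
    }

    /// Nest one group under another, e.g. "Q1" under "2025".
    fn nest_group(&mut self, group: &str, parent: &str) {
        self.parent.insert(group.to_string(), parent.to_string());
    }

    /// Ancestor groups of an item, nearest first (assumes no cycles).
    fn ancestors(&self, item: &str) -> Vec<&str> {
        let mut out = Vec::new();
        let mut cur = item;
        while let Some(p) = self.parent.get(cur) {
            out.push(p.as_str());
            cur = p.as_str();
        }
        out
    }
}

/// Virtual categories (prefixed with `_`) do not count toward the limit of 12.
fn counts_toward_limit(category_name: &str) -> bool {
    !category_name.starts_with('_')
}
```

With "Jan" added under "Q1" and "Q1" nested under "2025", `ancestors("Jan")` yields the drill-down path from the nearest group outward.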

1.2 Data Cells

  • Each data cell is identified by the intersection of one item from each active category — not by grid coordinates.
  • Cells hold numeric values, text, or empty/null.
  • The underlying storage is a sparse multi-dimensional array (HashMap<CellKey, CellValue>).
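A minimal sketch of the sparse storage described above. Only `HashMap<CellKey, CellValue>` comes from this spec; the internal shape of `CellKey` (a sorted list of category/item pairs), the `DataStore` wrapper, and the choice to model an empty cell as an absent key are assumptions made for illustration.

```rust
use std::collections::HashMap;

/// A cell address: one (category, item) pair per active category.
/// Pairs are kept sorted so the same intersection always compares equal.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct CellKey(Vec<(String, String)>);

impl CellKey {
    fn new(pairs: &[(&str, &str)]) -> Self {
        let mut v: Vec<(String, String)> = pairs
            .iter()
            .map(|(c, i)| (c.to_string(), i.to_string()))
            .collect();
        v.sort();
        CellKey(v)
    }
}

#[derive(Debug, Clone, PartialEq)]
enum CellValue {
    Number(f64),
    Text(String),
}

/// Sparse store: an absent key stands for an empty cell (assumed policy).
#[derive(Default)]
struct DataStore {
    cells: HashMap<CellKey, CellValue>,
}

impl DataStore {
    fn set(&mut self, key: CellKey, value: CellValue) {
        self.cells.insert(key, value);
    }

    fn get(&self, key: &CellKey) -> Option<&CellValue> {
        self.cells.get(key)
    }

    fn clear(&mut self, key: &CellKey) {
        self.cells.remove(key);
    }
}
```

Sorting the pairs means a key built from `Region/East, Measure/Revenue` and one built from `Measure/Revenue, Region/East` address the same cell, which matches the idea that cells are identified by intersection rather than by position.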

1.3 Models

  • A model is the top-level container: it holds all categories, items, groups, data cells, formulas, and views.
  • Models are saved to and loaded from a single .improv file (plain-text markdown-like format; see §7.1).

2. Formula System

2.1 Named Formulas

  • Formulas reference categories and items by name, not by cell address.
    • Example: Profit = Revenue - Cost
    • Example: Tax = Revenue * 0.08
    • Example: Margin = Profit / Revenue
  • A formula applies uniformly across all intersections of the referenced categories. No copying or dragging.

2.2 Formula Panel

  • Formulas are defined in a dedicated formula panel (F key), separate from the data grid.
  • All formulas are visible in one place for easy auditing.
  • Formulas cannot be accidentally overwritten by data entry.
  • Within the panel: n/o to create a new formula, d to delete.

2.3 Scoped Formulas (WHERE clause)

  • A formula can be scoped to a subset of items:
    • Example: Discount = 0.10 * Price WHERE Region = "West"
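The scoped example above can be sketched as a filter applied per intersection. The `Coords` alias, the `Where` struct, and the `discounts` function are hypothetical names; returning `None` for out-of-scope intersections is an assumed policy, since the spec does not say what an unscoped cell holds.

```rust
use std::collections::HashMap;

/// Coordinates of one intersection: category name -> item name.
type Coords = HashMap<String, String>;

/// Scope of a formula: WHERE <category> = "<item>".
struct Where {
    category: String,
    item: String,
}

impl Where {
    fn matches(&self, coords: &Coords) -> bool {
        coords.get(&self.category) == Some(&self.item)
    }
}

/// Discount = 0.10 * Price, computed only where `scope` matches.
/// `prices` pairs each intersection with its Price value.
fn discounts(prices: &[(Coords, f64)], scope: &Where) -> Vec<Option<f64>> {
    prices
        .iter()
        .map(|(coords, price)| scope.matches(coords).then(|| 0.10 * price))
        .collect()
}
```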

2.4 Aggregation

  • Built-in aggregation functions: Sum, Avg, Min, Max, Count.
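A minimal sketch of the five aggregation functions over the non-empty values of a group. The `Agg` enum and `aggregate` function are hypothetical names, and returning `None` for Avg, Min, and Max on empty input is an assumed policy rather than specified behavior.

```rust
#[derive(Debug, Clone, Copy)]
enum Agg {
    Sum,
    Avg,
    Min,
    Max,
    Count,
}

/// Aggregate the non-empty numeric values of a group.
/// Avg, Min, and Max return None on empty input (assumed policy).
fn aggregate(agg: Agg, values: &[f64]) -> Option<f64> {
    match agg {
        Agg::Count => Some(values.len() as f64),
        Agg::Sum => Some(values.iter().sum::<f64>()),
        Agg::Avg if values.is_empty() => None,
        Agg::Avg => Some(values.iter().sum::<f64>() / values.len() as f64),
        Agg::Min => values.iter().copied().reduce(f64::min),
        Agg::Max => values.iter().copied().reduce(f64::max),
    }
}
```

For the values 1200, 1450, 1100, and 1800, Sum gives 5550 and Avg gives 1387.5.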

2.5 Formula Language

  • Expression-based (not Turing-complete).
  • Operators: +, -, *, /, ^, unary -.
  • Comparisons: =, !=, <, >, <=, >=.
  • Conditionals: IF(condition, then, else).
  • WHERE clause for filtering: Sum(Sales WHERE Region = "East").
  • Parentheses for grouping.
  • Literal numbers and quoted strings.
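The expression language above maps naturally onto a small tree plus a recursive evaluator. The sketch below is illustrative only: the `Expr` and `Op` types are hypothetical, string literals, aggregation, and WHERE are left out, comparisons yield 1.0 or 0.0, and an unknown name or a division by zero produces `None`. All of those choices are assumptions.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone)]
enum Expr {
    Num(f64),
    Ref(String),
    Neg(Box<Expr>),
    Bin(Box<Expr>, Op, Box<Expr>),
    If(Box<Expr>, Box<Expr>, Box<Expr>),
}

#[derive(Debug, Clone, Copy)]
enum Op { Add, Sub, Mul, Div, Pow, Eq, Ne, Lt, Gt, Le, Ge }

/// Evaluate at one intersection; `env` maps item names to their values there.
/// Returns None for an unknown name or a division by zero (assumed policy).
fn eval(e: &Expr, env: &HashMap<String, f64>) -> Option<f64> {
    // Comparisons are encoded as 1.0 (true) or 0.0 (false).
    let truth = |t: bool| if t { 1.0 } else { 0.0 };
    Some(match e {
        Expr::Num(n) => *n,
        Expr::Ref(name) => *env.get(name)?,
        Expr::Neg(inner) => -eval(inner, env)?,
        Expr::If(c, t, f) => {
            if eval(c, env)? != 0.0 { eval(t, env)? } else { eval(f, env)? }
        }
        Expr::Bin(l, op, r) => {
            let (a, b) = (eval(l, env)?, eval(r, env)?);
            match op {
                Op::Add => a + b,
                Op::Sub => a - b,
                Op::Mul => a * b,
                Op::Div => {
                    if b == 0.0 {
                        return None;
                    }
                    a / b
                }
                Op::Pow => a.powf(b),
                Op::Eq => truth(a == b),
                Op::Ne => truth(a != b),
                Op::Lt => truth(a < b),
                Op::Gt => truth(a > b),
                Op::Le => truth(a <= b),
                Op::Ge => truth(a >= b),
            }
        }
    })
}
```

Because the tree has no loops or recursion of its own, every evaluation terminates, which is the practical meaning of "expression-based (not Turing-complete)".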

3. View System

3.1 Views as First-Class Objects

  • A view is a named configuration specifying:
    • Which categories are assigned to rows, columns, pages (filters/slicers), or none (hidden).
    • Which items/groups are visible vs. hidden per-view.
    • Collapsed group state per-view.
    • Number formatting (per-view format string, e.g. ,.2f).
  • Multiple views can exist per model. Each is independent.
  • Editing data in any view updates the underlying model; all other views reflect the change.
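A view as described above is pure configuration, so it can be sketched as a plain struct that holds no cell data. The field names and the `Axis::Hidden` variant (standing in for "none") are hypothetical.

```rust
use std::collections::{BTreeMap, BTreeSet};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Axis {
    Row,
    Column,
    Page,
    /// The "none (hidden)" assignment.
    Hidden,
}

/// Per-view configuration; the data itself lives in the model.
#[derive(Debug, Clone, Default)]
struct View {
    name: String,
    /// Category name -> axis assignment.
    axes: BTreeMap<String, Axis>,
    /// Selected item for each paged category.
    page_selection: BTreeMap<String, String>,
    /// (category, item or group) pairs hidden in this view.
    hidden: BTreeSet<(String, String)>,
    /// (category, group) pairs collapsed in this view.
    collapsed: BTreeSet<(String, String)>,
    /// Format string such as ",.2f".
    number_format: Option<String>,
}

impl View {
    /// Names of the categories assigned to `axis`.
    fn categories_on(&self, axis: Axis) -> Vec<&str> {
        self.axes
            .iter()
            .filter(|(_, a)| **a == axis)
            .map(|(c, _)| c.as_str())
            .collect()
    }
}
```

Keeping views free of cell data is what lets an edit made through one view show up in every other view: they all read from the same model.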

3.2 Category Tiles

  • Each category is represented as a tile displayed in the tile bar.
  • Press T (or Ctrl+Arrow) to enter tile-select mode.
    • h/l or ←/→ to select a tile.
    • Space/Enter to cycle axis (Row → Column → Page).
    • r/c/p to set axis directly to Row / Column / Page.
  • Moving a tile triggers instant recalculation and re-render of the grid.
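The tile keys above amount to a small state machine over the axis assignment. In this sketch the `Axis` type and both function names are hypothetical, and wrapping from Page back to Row is an assumption: the spec lists the cycle order but not what follows Page.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Axis {
    Row,
    Column,
    Page,
}

impl Axis {
    /// Space/Enter: Row -> Column -> Page, then back to Row (wrap assumed).
    fn cycle(self) -> Axis {
        match self {
            Axis::Row => Axis::Column,
            Axis::Column => Axis::Page,
            Axis::Page => Axis::Row,
        }
    }

    /// r / c / p set the axis directly; any other key is ignored here.
    fn from_key(key: char) -> Option<Axis> {
        match key {
            'r' => Some(Axis::Row),
            'c' => Some(Axis::Column),
            'p' => Some(Axis::Page),
            _ => None,
        }
    }
}
```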

3.3 Page Axis (Slicing)

  • Categories assigned to the page axis act as filters.
  • The user selects a single item from a paged category using [ and ].

3.4 Collapsing and Expanding

  • Groups can be collapsed/expanded per-view with the z key.

3.5 Drill-Down and View History

  • > drills into an aggregated cell, capturing a snapshot.
  • < navigates back through view history.

3.6 Records Mode

  • R toggles records mode: a long-format view activated when _Index is on Row and _Dim is on Column.
  • P toggles "prune empty" to hide rows/columns with no data.
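One way to picture "prune empty" is as a pass that keeps only the row and column indices with at least one non-empty cell. The `prune_empty` function and the `Option<f64>` grid representation are hypothetical; the real grid may be computed lazily from the sparse store.

```rust
/// Indices of the rows and columns that contain at least one non-empty cell.
fn prune_empty(grid: &[Vec<Option<f64>>]) -> (Vec<usize>, Vec<usize>) {
    let rows: Vec<usize> = grid
        .iter()
        .enumerate()
        .filter(|(_, row)| row.iter().any(|c| c.is_some()))
        .map(|(i, _)| i)
        .collect();

    // Rows may differ in length in this sketch, so use the widest one.
    let width = grid.iter().map(|r| r.len()).max().unwrap_or(0);
    let cols: Vec<usize> = (0..width)
        .filter(|&j| {
            grid.iter()
                .any(|row| row.get(j).map_or(false, |c| c.is_some()))
        })
        .collect();

    (rows, cols)
}
```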

3.7 Transpose

  • t swaps the Row and Column axes instantly.

4. Import Wizard

4.1 Purpose

  • Users can import CSV or JSON files to bootstrap a model.
  • Multiple CSV files can be merged with an automatic "File" category.

4.2 Wizard Flow (interactive TUI)

Step 1: Preview — Structural summary of the data.

Step 2: Select Array Path — (JSON only) If the JSON is not a flat array, the user selects which key path contains the primary record array.

Step 3: Review Proposals — Fields are analyzed and proposed as:

  • Category (small number of distinct string values)
  • Measure (numeric)
  • Time Category (date-like strings, with optional date component extraction: Year, Month, Quarter)
  • Skip (exclude from import)
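The proposal step above can be sketched as a per-field heuristic over sample values. Everything specific here is an assumption: the `MAX_CATEGORY_VALUES` threshold, recognizing only `YYYY-MM-DD` as date-like, and proposing Skip for high-cardinality text fields. The wizard's actual rules may differ.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq)]
enum FieldKind {
    Category,
    Measure,
    TimeCategory,
    Skip,
}

/// Hypothetical cutoff for "small number of distinct string values".
const MAX_CATEGORY_VALUES: usize = 20;

/// Accepts only the YYYY-MM-DD shape (an assumption; no calendar validation).
fn looks_like_date(s: &str) -> bool {
    let b = s.as_bytes();
    b.len() == 10
        && b[4] == b'-'
        && b[7] == b'-'
        && b.iter()
            .enumerate()
            .all(|(i, c)| i == 4 || i == 7 || c.is_ascii_digit())
}

/// Propose a role for one field from its sample values.
fn propose(values: &[&str]) -> FieldKind {
    if values.is_empty() {
        return FieldKind::Skip;
    }
    if values.iter().all(|v| v.parse::<f64>().is_ok()) {
        return FieldKind::Measure;
    }
    if values.iter().all(|v| looks_like_date(v)) {
        return FieldKind::TimeCategory;
    }
    let distinct: HashSet<&str> = values.iter().copied().collect();
    if distinct.len() <= MAX_CATEGORY_VALUES {
        FieldKind::Category
    } else {
        FieldKind::Skip
    }
}

/// Quarter (1-4) for a month number 1-12, as used by date component extraction.
fn quarter_of(month: u32) -> u32 {
    (month - 1) / 3 + 1
}
```

These are proposals only; the user can still override each one in the review step.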

Step 4: Configure — Set axis assignments, add formulas, name the model.

4.3 CLI Import

# Interactive wizard
improvise import data.json
improvise import sales.csv expenses.csv   # merge multiple CSVs

# Headless (skip wizard)
improvise import data.json --no-wizard -o model.improv

# With field overrides
improvise import data.csv \
  --category Region \
  --measure Revenue \
  --time Date \
  --extract Date:Month \
  --axis Region:row \
  --formula "Profit = Revenue - Cost" \
  --name "Sales Model" \
  -o output.improv

5. Terminal UI

5.1 Layout

+---------------------------------------------------------------+
| improvise  ·  Sales 2025 (model.improv) [+]     ?:help :q quit|
+---------------------------------------------------------------+
|  [Page: Region = East]                                        |
|              | Q1      | Q2      | Q3      | Q4      |        |
|--------------+---------+---------+---------+---------+--------|
| Shirts       |   1,200 |   1,450 |   1,100 |   1,800 |        |
| Pants        |     800 |     920 |     750 |   1,200 |        |
| ...                                                           |
|--------------+---------+---------+---------+---------+--------|
| Total        |   4,100 |   4,670 |   3,750 |   5,800 |        |
+---------------------------------------------------------------+
| Tiles: [Time ↔] [Product ↕] [Region ☰]       T to select     |
+---------------------------------------------------------------+
| NORMAL | Default                              ?:help  :q quit |
+---------------------------------------------------------------+

5.2 Panels and Modes

  • Grid (main): Scrollable table of the current view.
  • Tile bar: Category tiles with axis symbols. T or Ctrl+Arrow enters tile-select mode.
  • Formula panel: F — list and edit formulas. Ctrl+F toggles visibility without focus.
  • Category panel: C — manage categories, items, and axis assignments. Ctrl+C toggles visibility.
  • View panel: V — switch, create, delete views. Ctrl+V toggles visibility.
  • Status bar: Mode indicator, active view name, keyboard hints.
  • Help overlay: ? or F1 — full key reference.

Modes: Normal, Insert (editing), Formula Edit, Formula Panel, Category Panel, View Panel, Tile Select, Category Add, Item Add, Export Prompt, Command (: prefix), Search, Import Wizard, Help, Quit.

5.3 Navigation and Editing

Key                       Action
↑↓←→ / hjkl               Move cursor
gg / G                    Jump to first / last row
0 / $                     Jump to first / last column
Ctrl+D / Ctrl+U           Scroll 5 rows down / up
PageDown / PageUp         Page scroll
[ / ]                     Page axis prev / next
i / a                     Enter insert mode (or drill into aggregated cell)
Enter                     Advance to next cell while editing
o                         Add record row and begin editing
Esc                       Cancel edit / return to Normal
x                         Clear cell
yy                        Yank (copy) cell value
p                         Paste yanked value
/                         Search grid
n                         Next search match
N                         New category quick-add (or previous search match context)
t                         Transpose (swap rows ↔ columns)
z                         Toggle group collapse under cursor
H                         Hide current row item
>                         Drill into aggregated cell
<                         Navigate back (view history)
R                         Toggle records mode
P                         Toggle prune-empty rows/columns
T / Ctrl+Arrow            Enter tile-select mode
F                         Toggle formula panel (focus)
C                         Toggle category panel (focus)
V                         Toggle view panel (focus)
Ctrl+F / Ctrl+C / Ctrl+V  Toggle panel visibility (no focus change)
Tab                       Cycle focus to next open panel
:                         Enter command mode
Ctrl+S                    Save
Ctrl+E                    Export CSV prompt
ZZ                        Save and quit
Ctrl+Q                    Force quit
? / F1                    Help overlay

5.4 Command Mode (:)

Command                     Description
:q                          Quit (warns if unsaved)
:q!                         Force quit
:wq / ZZ                    Save and quit
:w [path]                   Save (path optional)
:export [path.csv]          Export active view to CSV
:import <path>              Open import wizard
:add-cat <name>             Add a category
:add-item <cat> <item>      Add one item to a category
:add-items <cat> a b c…     Add multiple items at once
:formula <cat> <Name=expr>  Add a formula
:add-view [name]            Create a new view
:show-item <cat> <item>     Restore a hidden item

6. Command Layer (Headless Mode)

All model mutations go through a typed command layer. This enables:

  • Scripting without the TUI
  • Replay / audit log
  • Testing without rendering
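The three benefits above follow from routing every mutation through a single typed entry point. A minimal sketch, assuming a hypothetical `Command` enum with just two variants and a `Model` that logs each command it applies successfully:

```rust
use std::collections::BTreeMap;

/// Two illustrative commands; the real command set is larger.
#[derive(Debug, Clone, PartialEq)]
enum Command {
    AddCat { name: String },
    AddItem { cat: String, item: String },
}

#[derive(Debug, Default)]
struct Model {
    /// Category name -> items, in insertion order.
    categories: BTreeMap<String, Vec<String>>,
    /// Every successfully applied command, in order (replay / audit log).
    log: Vec<Command>,
}

impl Model {
    /// The single mutation entry point: the TUI, scripts, and tests all call this.
    fn apply(&mut self, cmd: Command) -> Result<(), String> {
        match &cmd {
            Command::AddCat { name } => {
                self.categories.entry(name.clone()).or_default();
            }
            Command::AddItem { cat, item } => {
                let items = self
                    .categories
                    .get_mut(cat)
                    .ok_or_else(|| format!("unknown category: {}", cat))?;
                items.push(item.clone());
            }
        }
        // Only commands that succeeded are recorded.
        self.log.push(cmd);
        Ok(())
    }
}
```

Replaying `log` against an empty `Model` rebuilds the same state, and a test can assert on `categories` without rendering anything.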

6.1 CLI Subcommands

# Open TUI (default)
improvise [model.improv]

# Single headless command(s)
improvise cmd 'set-cell Region/East Measure/Revenue 1200' -f model.improv

# Script file (one command per line, # or // comments)
improvise script setup.txt -f model.improv

# Import
improvise import data.json

6.2 Script Syntax

Commands use a prefix syntax (one per line). Multiple commands can be separated by . on a single line. Lines starting with # or // are comments.
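A minimal sketch of reading such a script, covering only the one-command-per-line form and the two comment styles. The `parse_script` function is hypothetical; it splits arguments on whitespace, so it does not handle quoted arguments or several commands on one line.

```rust
/// Split a script into (command, args) pairs, skipping blank lines and
/// lines that start with `#` or `//`.
fn parse_script(src: &str) -> Vec<(String, Vec<String>)> {
    src.lines()
        .map(str::trim)
        .filter(|l| !l.is_empty() && !l.starts_with('#') && !l.starts_with("//"))
        .filter_map(|l| {
            let mut parts = l.split_whitespace().map(String::from);
            // The first token is the command name; the rest are its arguments.
            let name = parts.next()?;
            Some((name, parts.collect()))
        })
        .collect()
}
```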

6.3 Available Commands

Commands are registered internally and accessible via : in the TUI or via the cmd/script CLI subcommands. Key operations include:

  • Cell manipulation: set-cell, clear-cell, yank, paste
  • Navigation: move-selection, scroll-rows, page-scroll, jump-first-row, jump-last-row
  • View: transpose, toggle-records-mode, toggle-prune-empty, drill-into-cell, view-back, page-next, page-prev
  • Categories: add-cat, add-item, add-items, delete-category-at-cursor, cycle-axis-at-cursor, filter-to-item, hide-selected-row-item, show-item
  • Formulas: formula, enter-formula-edit, delete-formula-at-cursor, commit-formula
  • Views: switch-view-at-cursor, create-and-switch-view, delete-view-at-cursor, add-view
  • Tiles: enter-tile-select, move-tile-cursor, cycle-axis-for-tile, set-axis-for-tile
  • File: save, wq, export, import
  • Modes: enter-mode, search, enter-edit-mode, enter-export-prompt

7. Persistence

7.1 File Format

Native format: plain-text markdown-like .improv file. Structure:

# Model Name

## Category: Region
- North
- South
- East [Coastal]
- West [Coastal]
> Coastal

## Formulas
- Profit = Revenue - Cost [Measure]

## Data
Region=East, Measure=Revenue = 1200
Region=East, Measure=Cost = 800
Region=West, Measure=Revenue = "pending"

## View: Default (active)
Region: row
Measure: column
Time: page, Q1
hidden: Region/Internal
collapsed: Time/2024
format: ,.2f

Compressed variant: .improv.gz (gzip, same payload).

Legacy JSON format is auto-detected (by { prefix) for backward compatibility.
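A minimal sketch of two pieces of the loader: parsing one line of the Data section shown above, and detecting the legacy JSON format. The function and type names are hypothetical, and the parser assumes that item names contain no `=` or `, ` and that leading whitespace may precede the `{` of a legacy file.

```rust
#[derive(Debug, PartialEq)]
enum Value {
    Number(f64),
    Text(String),
}

/// Parse a line such as `Region=East, Measure=Revenue = 1200`.
/// The first " = " (with spaces) separates the coordinates from the value.
fn parse_data_line(line: &str) -> Option<(Vec<(String, String)>, Value)> {
    let (coords, raw) = line.split_once(" = ")?;
    let mut key = Vec::new();
    for pair in coords.split(", ") {
        let (cat, item) = pair.split_once('=')?;
        key.push((cat.trim().to_string(), item.trim().to_string()));
    }
    let raw = raw.trim();
    // Quoted values are text; anything else must parse as a number.
    let value = match raw.strip_prefix('"').and_then(|s| s.strip_suffix('"')) {
        Some(text) => Value::Text(text.to_string()),
        None => Value::Number(raw.parse::<f64>().ok()?),
    };
    Some((key, value))
}

/// Legacy JSON models are recognized by a leading `{`.
fn is_legacy_json(contents: &str) -> bool {
    contents.trim_start().starts_with('{')
}
```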

7.2 Export

  • Ctrl+E in TUI or :export [path.csv] command: exports active view to CSV.
  • Respects current view axes and page filters.

7.3 Autosave

  • Periodic autosave (every 30 seconds when dirty) to .{filename}.autosave.
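The autosave rule can be sketched as a path derivation plus a timing check. Both function names are hypothetical, and placing the autosave file in the same directory as the model is an assumption; the spec gives only the `.{filename}.autosave` pattern.

```rust
use std::path::{Path, PathBuf};
use std::time::Duration;

const AUTOSAVE_INTERVAL: Duration = Duration::from_secs(30);

/// `.{filename}.autosave` next to the model file (directory is assumed).
fn autosave_path(model_path: &Path) -> Option<PathBuf> {
    let name = model_path.file_name()?.to_str()?;
    Some(model_path.with_file_name(format!(".{}.autosave", name)))
}

/// Autosave only when there are unsaved changes and the interval has elapsed.
fn autosave_due(dirty: bool, since_last_save: Duration) -> bool {
    dirty && since_last_save >= AUTOSAVE_INTERVAL
}
```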

8. Technology

Concern          Choice
Language         Rust (stable)
TUI              Ratatui + Crossterm
Serialization    serde + serde_json (for legacy compat); native format is plain text
CLI              clap with subcommands (open, import, cmd, script)
Build            Standard cargo build --release (LTO, stripped); optional musl target via Nix
Dev environment  Nix flake with rust-overlay
No runtime deps  Single binary, no database, no network

9. Non-Goals (v1)

  • Scripting/macro language beyond the formula system.
  • Collaborative/multi-user editing.
  • Live external data sources (databases, APIs).
  • Charts or graphical visualization.
  • Multi-level undo history.

10. Verification

# Build
nix develop --command cargo build --release

# Import test (interactive wizard)
./improvise import sample.json

# Import test (headless)
./improvise import sample.json --no-wizard -o test.improv

# Headless commands
./improvise cmd 'add-cat Region' 'add-item Region East' -f new.improv

# Headless script
./improvise script tests/setup.txt -f new.improv

# TUI
./improvise model.improv