feat: replace ad-hoc .improv parser with pest grammar

- Add improv.pest PEG grammar as the single source of truth for the
  .improv file format (v2025-04-09)
- Replace hand-written line scanner with pest-derived parser that walks
  the grammar's parse tree
- Add grammar-walking test generator that reads improv.pest at test time
  via pest_meta and produces random valid files from the AST
- Fix 6 parser bugs: newlines in text, commas in names, brackets in
  names, float precision, view name ambiguity, group brackets
- New format: version line, Initial View header, pipe quoting (|...|),
  Views→Formulas→Categories→Data section order, comma-separated items
- Bare names restricted to [A-Za-z_][A-Za-z0-9_-]*, everything else
  pipe-quoted with \| \\ \n escapes
- Remove all unwrap() calls from production code, propagate errors
  with Result throughout parse_md
- Extract shared escape_pipe/unescape_pipe/pipe_quote helpers, deduplicate
  hidden/collapsed formatting, add w!() macro for infallible writeln
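
The pipe-quoting rules listed above can be sketched in plain Rust. This is a minimal illustration, not the code from this commit: the names `escape_pipe`, `unescape_pipe` and `pipe_quote` come from the message above, but their signatures are assumed, and `is_bare_name` and `format_name` are hypothetical helpers added only for the example.

```rust
/// Escape text for use between pipes: `\` -> `\\`, `|` -> `\|`, newline -> `\n`.
fn escape_pipe(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '\\' => out.push_str("\\\\"),
            '|' => out.push_str("\\|"),
            '\n' => out.push_str("\\n"),
            other => out.push(other),
        }
    }
    out
}

/// Reverse `escape_pipe`. Assumption: an unknown escape keeps the escaped
/// character, and a trailing lone backslash is kept as-is.
fn unescape_pipe(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    let mut chars = s.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        match chars.next() {
            Some('n') => out.push('\n'),
            Some(other) => out.push(other),
            None => out.push('\\'),
        }
    }
    out
}

/// Wrap text in pipes, escaping the contents.
fn pipe_quote(s: &str) -> String {
    format!("|{}|", escape_pipe(s))
}

/// Hypothetical helper: true if `s` matches [A-Za-z_][A-Za-z0-9_-]*.
fn is_bare_name(s: &str) -> bool {
    let mut chars = s.chars();
    match chars.next() {
        Some(c) if c.is_ascii_alphabetic() || c == '_' => {}
        _ => return false,
    }
    chars.all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '-')
}

/// Hypothetical helper: emit a name bare when allowed, pipe-quoted otherwise.
fn format_name(s: &str) -> String {
    if is_bare_name(s) {
        s.to_string()
    } else {
        pipe_quote(s)
    }
}
```

With these definitions, `format_name("Q1")` stays bare while `format_name("Income, Gross")` becomes `|Income, Gross|`, and `unescape_pipe(&escape_pipe(s))` returns `s` unchanged.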

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Executed-By: spot
Edward Langley
2026-04-09 02:53:13 -07:00
parent 4d7d91257d
commit d34e8eb313
2 changed files with 714 additions and 474 deletions

src/persistence/improv.pest (new file, 124 lines)

@@ -0,0 +1,124 @@
// ── .improv file grammar (v2025-04-09) ───────────────────────────────────────
//
// Line-oriented, markdown-flavoured format for multi-dimensional models.
// The grammar accepts sections in any order; the documented order is
// Views, Formulas, Categories, Data.
//
// Names: bare identifiers ([A-Za-z_][A-Za-z0-9_-]*) or pipe-quoted |like this|.
// Inside pipes, backslash escapes: \| for literal pipe, \\ for backslash,
// \n for newline.
// Values: numbers, pipe-quoted |text|, or bare text running to end of line.
file = {
    SOI ~
    blank_lines ~
    version_line ~
    model_name ~
    initial_view? ~
    section* ~
    EOI
}
version_line = { "v" ~ rest_of_line ~ NEWLINE ~ blank_lines }
model_name = { "# " ~ rest_of_line ~ NEWLINE ~ blank_lines }
initial_view = { "Initial View: " ~ rest_of_line ~ NEWLINE ~ blank_lines }
section = _{
    category_section
  | formulas_section
  | data_section
  | view_section
}
// ── Category ─────────────────────────────────────────────────────────────────
category_section = {
    "## Category: " ~ rest_of_line ~ NEWLINE ~ blank_lines ~
    category_entry*
}
category_entry = _{ group_hierarchy | grouped_item | item_list }
// Comma-separated bare items (no group): `- Food, Gas, Total`
item_list = {
    "- " ~ name ~ ("," ~ " "* ~ name)* ~ NEWLINE ~ blank_lines
}
// Single item with group bracket: `- Jan[Q1]`
grouped_item = {
    "- " ~ name ~ "[" ~ name ~ "]" ~ NEWLINE ~ blank_lines
}
group_hierarchy = {
    "> " ~ name ~ "[" ~ name ~ "]" ~ NEWLINE ~ blank_lines
}
// ── Formulas ─────────────────────────────────────────────────────────────────
formulas_section = {
    "## Formulas" ~ NEWLINE ~ blank_lines ~
    formula_line*
}
formula_line = {
    "- " ~ rest_of_line ~ NEWLINE ~ blank_lines
}
// ── Data ─────────────────────────────────────────────────────────────────────
data_section = {
    "## Data" ~ NEWLINE ~ blank_lines ~
    data_line*
}
data_line = {
    coord_list ~ " = " ~ cell_value ~ NEWLINE ~ blank_lines
}
coord_list = { coord ~ (", " ~ coord)* }
coord = { name ~ "=" ~ name }
cell_value = _{ number | pipe_quoted | bare_value }
number = @{
    "-"? ~ ASCII_DIGIT+ ~ ("." ~ ASCII_DIGIT+)? ~ (("e" | "E") ~ ("+" | "-")? ~ ASCII_DIGIT+)?
}
bare_value = @{ (!NEWLINE ~ ANY)+ }
// ── View ─────────────────────────────────────────────────────────────────────
view_section = {
    "## View: " ~ rest_of_line ~ NEWLINE ~ blank_lines ~
    view_entry*
}
view_entry = _{ format_line | hidden_line | collapsed_line | axis_line }
axis_line = {
    name ~ ": " ~ axis_kind ~ (", " ~ name)? ~ NEWLINE ~ blank_lines
}
axis_kind = @{ "row" | "column" | "page" | "none" }
format_line = { "format: " ~ rest_of_line ~ NEWLINE ~ blank_lines }
hidden_line = { "hidden: " ~ name ~ "/" ~ name ~ NEWLINE ~ blank_lines }
collapsed_line = { "collapsed: " ~ name ~ "/" ~ name ~ NEWLINE ~ blank_lines }
// ── Names ────────────────────────────────────────────────────────────────────
//
// A name is either pipe-quoted or a bare identifier.
// Pipe-quoted: |Income, Gross| — backslash escapes inside:
// \| = literal pipe, \\ = literal backslash, \n = newline
// Bare: [A-Za-z_][A-Za-z0-9_-]*; anything else must be pipe-quoted.
name = _{ pipe_quoted | bare_name }
pipe_quoted = { "|" ~ pipe_inner ~ "|" }
pipe_inner = @{ ("\\" ~ ANY | !"|" ~ ANY)* }
bare_name = @{ ('A'..'Z' | 'a'..'z' | "_") ~ ('A'..'Z' | 'a'..'z' | '0'..'9' | "_" | "-")* }
// ── Shared ───────────────────────────────────────────────────────────────────
rest_of_line = @{ (!NEWLINE ~ ANY)* }
blank_lines = _{ NEWLINE* }
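
The ordered choice in `cell_value` (`number | pipe_quoted | bare_value`) can be illustrated with a small hand-written matcher. This is only a sketch for reading the grammar: the real parser is generated by pest, and `CellKind`, `digits`, `match_number`, `match_quoted` and `classify` are hypothetical names. It assumes `value` is the text after ` = ` on a data line, without the line terminator.

```rust
#[derive(Debug, PartialEq)]
enum CellKind {
    Number,
    Quoted,
    Bare,
}

/// Consume one or more ASCII digits starting at `i`; return the index after them.
fn digits(b: &[u8], i: usize) -> Option<usize> {
    let mut j = i;
    while j < b.len() && b[j].is_ascii_digit() {
        j += 1;
    }
    if j > i { Some(j) } else { None }
}

/// Length of the prefix of `s` matched by the `number` rule, if any.
fn match_number(s: &str) -> Option<usize> {
    let b = s.as_bytes();
    let mut i = if b.first() == Some(&b'-') { 1 } else { 0 };
    i = digits(b, i)?;
    // Optional fraction: "." followed by at least one digit.
    if b.get(i) == Some(&b'.') {
        if let Some(j) = digits(b, i + 1) {
            i = j;
        }
    }
    // Optional exponent: e/E, optional sign, at least one digit.
    if matches!(b.get(i), Some(&b'e') | Some(&b'E')) {
        let sign = matches!(b.get(i + 1), Some(&b'+') | Some(&b'-')) as usize;
        if let Some(j) = digits(b, i + 1 + sign) {
            i = j;
        }
    }
    Some(i)
}

/// Length of the prefix of `s` matched by the `pipe_quoted` rule, if any.
fn match_quoted(s: &str) -> Option<usize> {
    let b = s.as_bytes();
    if b.first() != Some(&b'|') {
        return None;
    }
    let mut i = 1;
    while i < b.len() {
        match b[i] {
            b'|' => return Some(i + 1),         // unescaped closing pipe
            b'\\' if i + 1 < b.len() => i += 2, // escape: skip the next byte too
            _ => i += 1,
        }
    }
    None // no closing pipe
}

/// Mirror PEG ordered choice: the first alternative that matches a prefix
/// wins, and the rest of the line must then be empty (NEWLINE comes next).
fn classify(value: &str) -> Option<CellKind> {
    let (kind, len) = if let Some(n) = match_number(value) {
        (CellKind::Number, n)
    } else if let Some(n) = match_quoted(value) {
        (CellKind::Quoted, n)
    } else if !value.is_empty() {
        (CellKind::Bare, value.len())
    } else {
        return None;
    };
    if len == value.len() { Some(kind) } else { None }
}
```

Under this reading, `12.5` and `-3e10` classify as numbers, `|n/a|` as quoted and `hello` as bare. A value such as `12abc` is rejected rather than falling back to `bare_value`, because an ordered choice does not retry later alternatives once `number` has matched a prefix.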

File diff suppressed because it is too large