feat(formula): support pipe-quoted identifiers |...|

Add CL/SQL-style symbol quoting using pipe delimiters for formula identifiers. This allows category and item names that collide with keywords (WHERE, SUM, IF, etc.) or contain special characters (parens, operators, spaces) to be used unambiguously in formulas:

    |WHERE| + |Revenue (USD)|
    SUM(|Net Revenue| WHERE |Region Name| = |East Coast|)

Pipes produce Token::Ident (same as bare identifiers), so they work everywhere: expressions, aggregates, WHERE clauses. Double-quoted strings remain Token::Str for backward compatibility.

Also updates split_where and parse_where to skip/strip pipe delimiters.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -115,17 +115,28 @@ Formulas are parsed into a typed AST (`Expr` enum) at entry time. If the syntax
 is invalid, the user gets an error immediately. The evaluator only sees
 well-formed trees — it does not need to handle malformed input.

-### Formula Tokenizer: Multi-Word Identifiers and Keywords
+### Formula Tokenizer: Identifiers and Quoting

-The formula tokenizer supports multi-word identifiers (e.g., `Total Revenue`)
-by allowing spaces within identifier tokens when followed by non-operator
-characters. However, keywords (`WHERE`, `SUM`, `AVG`, `MIN`, `MAX`, `COUNT`,
-`IF`) act as token boundaries — the tokenizer breaks an identifier when:
-1. The identifier collected **so far** is a keyword (e.g., `WHERE ` stops at `WHERE`).
-2. The **next word** after a space is a keyword (e.g., `Revenue WHERE` stops at `Revenue`).
+**Bare identifiers** support multi-word names (e.g., `Total Revenue`) by
+allowing spaces when followed by non-operator, non-keyword characters. Keywords
+(`WHERE`, `SUM`, `AVG`, `MIN`, `MAX`, `COUNT`, `IF`) act as token boundaries.

-This ensures `SUM(Revenue WHERE Region = "East")` tokenizes correctly as
-separate tokens while `Total Revenue` remains a single identifier.
+**Pipe-quoted identifiers** (`|...|`) allow any characters — including spaces,
+keywords, and operators — inside the delimiters. Use pipes when a category or
+item name collides with a keyword or contains special characters:
+
+```
+|WHERE| — category named "WHERE"
+|Revenue (USD)| — name with parens
+|Cost + Tax| — name with operator chars
+SUM(|Net Revenue| WHERE |Region Name| = |East Coast|)
+```
+
+Pipes produce `Token::Ident` (same as bare identifiers), so they work
+everywhere an identifier is expected: expressions, aggregate arguments, WHERE
+clause category names and filter values. Double-quoted strings (`"..."`)
+remain `Token::Str` and are used only for WHERE filter values in the
+`split_where` pre-parse step.

 ---

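The keyword-boundary rules for bare identifiers described in the diff above (stop when the text collected so far is a keyword, or when the next word after a space is a keyword) can be sketched in Rust roughly as follows. This is an illustration only, not the project's tokenizer: `KEYWORDS`, `is_keyword`, and `read_bare_ident` are hypothetical names, keywords are assumed to match case-sensitively, and a "word" is assumed to be a run of alphanumerics and underscores.

```rust
// Hypothetical keyword table, taken from the list in the documentation.
const KEYWORDS: [&str; 7] = ["WHERE", "SUM", "AVG", "MIN", "MAX", "COUNT", "IF"];

// Assumption: keywords are matched exactly (case-sensitive).
fn is_keyword(word: &str) -> bool {
    KEYWORDS.iter().any(|k| *k == word)
}

/// Reads one bare identifier from the start of `input`.
/// Returns the identifier and the unconsumed remainder.
fn read_bare_ident(input: &str) -> (&str, &str) {
    // Assumption: identifier words consist of alphanumerics and underscores.
    let is_word_char = |c: char| c.is_alphanumeric() || c == '_';
    let word_len = |s: &str| s.find(|c: char| !is_word_char(c)).unwrap_or(s.len());

    // Consume the first word.
    let mut end = word_len(input);
    loop {
        // Rule 1: stop if the identifier collected so far is a keyword.
        if end == 0 || is_keyword(&input[..end]) {
            break;
        }
        // Look past a run of spaces for a following word.
        let rest = &input[end..];
        let spaces = rest.len() - rest.trim_start_matches(' ').len();
        if spaces == 0 {
            break;
        }
        let next = &rest[spaces..];
        let next_len = word_len(next);
        // Rule 2: stop if the next word is a keyword. Also stop when no word
        // follows the space (for example, an operator or parenthesis).
        if next_len == 0 || is_keyword(&next[..next_len]) {
            break;
        }
        // Otherwise the next word is part of the same multi-word identifier.
        end += spaces + next_len;
    }
    (&input[..end], &input[end..])
}
```

Under these assumptions, `read_bare_ident("Revenue WHERE Region")` stops at `Revenue`, while `read_bare_ident("Total Revenue + 1")` keeps `Total Revenue` together as one identifier.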
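To illustrate how pipe-quoted identifiers and double-quoted strings can yield different token kinds, here is a minimal Rust sketch. Only the variant names `Token::Ident` and `Token::Str` come from the commit; their `String` payloads, the `Token::Other` variant, and the `read_delimited` and `tokenize_quoted` functions are assumptions made for this example. The sketch assumes no escape sequences inside the delimiters and does not handle bare identifiers, which simply fall through as single characters.

```rust
// Hypothetical token type. `Ident` and `Str` are named in the commit;
// the payload types and `Other` are assumptions for this sketch.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Ident(String),
    Str(String),
    Other(char),
}

/// Reads text between a pair of `delim` characters. `start` must index the
/// opening delimiter. Returns the inner text and the index just past the
/// closing delimiter, or `None` if the delimiter is never closed.
fn read_delimited(chars: &[char], start: usize, delim: char) -> Option<(String, usize)> {
    let mut inner = String::new();
    let mut i = start + 1;
    while i < chars.len() {
        if chars[i] == delim {
            return Some((inner, i + 1));
        }
        inner.push(chars[i]);
        i += 1;
    }
    None
}

/// Tokenizes only the quoting forms: `|...|` becomes `Token::Ident`,
/// `"..."` becomes `Token::Str`. Everything else is passed through as
/// `Token::Other`, one character at a time, with whitespace skipped.
fn tokenize_quoted(input: &str) -> Result<Vec<Token>, String> {
    let chars: Vec<char> = input.chars().collect();
    let mut tokens = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if c.is_whitespace() {
            i += 1;
        } else if c == '|' || c == '"' {
            let (inner, next) = read_delimited(&chars, i, c)
                .ok_or_else(|| format!("unterminated {} at position {}", c, i))?;
            // Pipes yield an identifier; double quotes yield a string literal.
            tokens.push(if c == '|' { Token::Ident(inner) } else { Token::Str(inner) });
            i = next;
        } else {
            tokens.push(Token::Other(c));
            i += 1;
        }
    }
    Ok(tokens)
}
```

Because the pipe branch emits the same `Token::Ident` variant a bare identifier would, a parser built on such a token stream needs no separate case for quoted names, which is consistent with the commit's note that pipes "work everywhere an identifier is expected".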