feat(formula): support pipe-quoted identifiers |...|
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -115,17 +115,28 @@ Formulas are parsed into a typed AST (`Expr` enum) at entry time. If the syntax
is invalid, the user gets an error immediately. The evaluator only sees
well-formed trees; it does not need to handle malformed input.

-### Formula Tokenizer: Multi-Word Identifiers and Keywords
+### Formula Tokenizer: Identifiers and Quoting

-The formula tokenizer supports multi-word identifiers (e.g., `Total Revenue`)
-by allowing spaces within identifier tokens when followed by non-operator
-characters. However, keywords (`WHERE`, `SUM`, `AVG`, `MIN`, `MAX`, `COUNT`,
-`IF`) act as token boundaries; the tokenizer breaks an identifier when:
-1. The identifier collected **so far** is a keyword (e.g., `WHERE ` stops at `WHERE`).
-2. The **next word** after a space is a keyword (e.g., `Revenue WHERE` stops at `Revenue`).
+**Bare identifiers** support multi-word names (e.g., `Total Revenue`) by
+allowing spaces when followed by non-operator, non-keyword characters. Keywords
+(`WHERE`, `SUM`, `AVG`, `MIN`, `MAX`, `COUNT`, `IF`) act as token boundaries.

-This ensures `SUM(Revenue WHERE Region = "East")` tokenizes correctly as
-separate tokens while `Total Revenue` remains a single identifier.
+**Pipe-quoted identifiers** (`|...|`) allow any characters (including spaces,
+keywords, and operators) inside the delimiters. Use pipes when a category or
+item name collides with a keyword or contains special characters:
+
+```
+|WHERE|: category named "WHERE"
+|Revenue (USD)|: name with parens
+|Cost + Tax|: name with operator chars
+SUM(|Net Revenue| WHERE |Region Name| = |East Coast|)
+```
+
+Pipes produce `Token::Ident` (same as bare identifiers), so they work
+everywhere an identifier is expected: expressions, aggregate arguments, WHERE
+clause category names and filter values. Double-quoted strings (`"..."`)
+remain `Token::Str` and are used only for WHERE filter values in the
+`split_where` pre-parse step.

---
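
The tokenizer rules described in this hunk can be illustrated with a small Rust sketch. It is not the project's tokenizer: only `Token::Ident` and `Token::Str` come from the text above, while the `Keyword` and `Sym` variants, the `tokenize` function, and the treatment of every other non-word character as a one-character symbol are assumptions made for illustration. Numeric literals are not handled separately.

```rust
// Illustrative sketch only. `Token::Ident` and `Token::Str` are named in the
// doc; `Keyword`, `Sym`, and `tokenize` are hypothetical.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Ident(String),   // bare or pipe-quoted identifier
    Str(String),     // double-quoted string
    Keyword(String), // WHERE, SUM, AVG, MIN, MAX, COUNT, IF (assumed variant)
    Sym(char),       // any other single character (assumed variant)
}

const KEYWORDS: [&str; 7] = ["WHERE", "SUM", "AVG", "MIN", "MAX", "COUNT", "IF"];

fn is_keyword(word: &str) -> bool {
    KEYWORDS.contains(&word)
}

fn is_word_char(c: char) -> bool {
    c.is_alphanumeric() || c == '_'
}

fn tokenize(src: &str) -> Result<Vec<Token>, String> {
    let chars: Vec<char> = src.chars().collect();
    let mut out = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if c.is_whitespace() {
            i += 1;
        } else if c == '|' || c == '"' {
            // Quoted token: copy everything up to the matching delimiter verbatim,
            // so keywords, spaces, and operators inside are not interpreted.
            let start = i + 1;
            let len = chars[start..]
                .iter()
                .position(|&d| d == c)
                .ok_or_else(|| format!("unterminated {c} starting at index {i}"))?;
            let text: String = chars[start..start + len].iter().collect();
            out.push(if c == '|' { Token::Ident(text) } else { Token::Str(text) });
            i = start + len + 1;
        } else if is_word_char(c) {
            // Bare identifier: join space-separated words, stopping at keywords.
            let mut words: Vec<String> = Vec::new();
            let mut keyword: Option<String> = None;
            loop {
                let word_start = i;
                while i < chars.len() && is_word_char(chars[i]) {
                    i += 1;
                }
                let word: String = chars[word_start..i].iter().collect();
                if is_keyword(&word) {
                    if words.is_empty() {
                        keyword = Some(word); // the token itself is a keyword
                    } else {
                        i = word_start; // next word is a keyword: rewind and stop
                    }
                    break;
                }
                words.push(word);
                // Continue only if spaces are followed by another word character.
                let mut j = i;
                while j < chars.len() && chars[j] == ' ' {
                    j += 1;
                }
                if j > i && j < chars.len() && is_word_char(chars[j]) {
                    i = j;
                } else {
                    break;
                }
            }
            match keyword {
                Some(k) => out.push(Token::Keyword(k)),
                None => out.push(Token::Ident(words.join(" "))),
            }
        } else {
            out.push(Token::Sym(c));
            i += 1;
        }
    }
    Ok(out)
}
```

Under these assumptions, `tokenize("Total Revenue")` yields a single `Token::Ident`, `Revenue WHERE` splits at the keyword, and `|WHERE|` yields an identifier rather than a keyword, because quoted content is copied without inspection.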