YABNF
YABNF (Yet Another ABNF) is a superset of ABNF (RFC 5234 + RFC 7405) extended with @ annotations for representing tree-sitter grammar concepts.
Motivation
Standard ABNF can express concatenation, alternation, repetition, optional groups, and string literals — but tree-sitter grammars use additional concepts (precedence, fields, aliasing, tokens, patterns, external scanners) that have no ABNF equivalent. YABNF defines a minimal set of extensions to bridge this gap.
The tree-sitter2abnf tool converts tree-sitter grammar.json files to YABNF and back.
Extensions
YABNF adds two categories of extensions to standard ABNF:
Grammar-level Directives (7)
Emitted as structured ABNF comments:
| Directive | Purpose | Example |
|---|---|---|
| @grammar | Grammar name | ; @grammar "json" |
| @word | Keyword extraction rule | ; @word "identifier" |
| @extras | Auto-inserted tokens (whitespace, comments) | ; @extras (@pattern("\\s") / comment) |
| @inline | Rules inlined during parser generation | ; @inline (_semicolon) |
| @conflicts | GLR conflict declarations | ; @conflicts (call-expression member-expression) |
| @externals | External scanner tokens | ; @externals (_ternary-qmark / _template-chars) |
| @supertypes | Abstract supertype nodes | ; @supertypes (expression / statement) |
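Because directives are ordinary ABNF comments, a plain ABNF parser skips them, while a YABNF-aware tool can pick them out line by line. The sketch below is illustrative only and is not taken from tree-sitter2abnf: `parse_directive` and `KNOWN_DIRECTIVES` are hypothetical names, and the assumed line shape (`;`, optional whitespace, `@name`, then a payload) is inferred from the examples in the table rather than from a formal spec.

```python
import re

# Assumed line shape, inferred from the table above: "; @name payload".
DIRECTIVE_RE = re.compile(r"^;\s*@(?P<name>[a-z]+)\s+(?P<payload>.+?)\s*$")

# The seven grammar-level directives listed in the table.
KNOWN_DIRECTIVES = {
    "grammar", "word", "extras", "inline",
    "conflicts", "externals", "supertypes",
}


def parse_directive(line):
    """Return (name, payload) for a directive comment, or None for any other line."""
    match = DIRECTIVE_RE.match(line)
    if match is None or match.group("name") not in KNOWN_DIRECTIVES:
        return None
    # The payload is returned as raw text; parsing it is left out of this sketch.
    return match.group("name"), match.group("payload")
```

For example, `parse_directive('; @grammar "json"')` yields the name `grammar` with the payload still quoted, and an ordinary comment such as `; just a note` yields `None`.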
Rule-level Annotations (9)
Inline annotations within rule definitions:
| Annotation | Tree-sitter concept | Example |
|---|---|---|
| @prec(N) | Precedence (no associativity) | @prec(1) expr |
| @prec-left(N) | Left-associative precedence | @prec-left(2) a %s"+" b |
| @prec-right(N) | Right-associative precedence | @prec-right(1) a %s"=" b |
| @prec-dynamic(N) | Dynamic (runtime) precedence | @prec-dynamic(-1) expr |
| @field(name) | Named field on a node | @field(key) string |
| @alias(name) | Rename node (anonymous) | @alias(op) %s"+" |
| @alias(~name) | Rename node (named) | @alias(~statement-id) identifier |
| @token(...) | Token grouping | @token(%s"//" @pattern(".*")) |
| @immediate-token(...) | Token without preceding whitespace | @immediate-token(@prec(1) @pattern("[^\\\\\"]+")) |
| @pattern("re") | Regex pattern | @pattern("[a-zA-Z]+") |
Precedence values can be integers (including negative) or named strings (e.g. @prec-left(logical_and)).
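To make the mapping concrete, here is a minimal sketch that renders a few grammar.json node types as the annotations listed above. It is not the tree-sitter2abnf implementation: `render` is a hypothetical function, it covers only a subset of node types (no CHOICE, REPEAT, or grouping parentheses), it passes names through unchanged, and it assumes pattern text is escaped JSON-style, which matches the `@pattern("\\s")` example but is not confirmed by a spec.

```python
import json

# grammar.json wrapper node types and the annotation each one maps to.
PREC_ANNOTATIONS = {
    "PREC": "prec",
    "PREC_LEFT": "prec-left",
    "PREC_RIGHT": "prec-right",
    "PREC_DYNAMIC": "prec-dynamic",
}


def render(node):
    """Render one grammar.json node as YABNF text (illustrative subset only)."""
    kind = node["type"]
    if kind == "SYMBOL":
        return node["name"]
    if kind == "STRING":
        # RFC 7405 case-sensitive literal; assumes the value contains no '"'.
        return '%s"' + node["value"] + '"'
    if kind == "PATTERN":
        # Assumption: regex text is escaped the way JSON escapes strings.
        return "@pattern(" + json.dumps(node["value"]) + ")"
    if kind == "SEQ":
        return " ".join(render(member) for member in node["members"])
    if kind in PREC_ANNOTATIONS:
        # The value may be an integer or a named precedence string.
        return f"@{PREC_ANNOTATIONS[kind]}({node['value']}) {render(node['content'])}"
    if kind == "FIELD":
        return f"@field({node['name']}) {render(node['content'])}"
    if kind == "ALIAS":
        marker = "~" if node["named"] else ""  # "~" marks a named alias
        return f"@alias({marker}{node['value']}) {render(node['content'])}"
    if kind == "TOKEN":
        return f"@token({render(node['content'])})"
    if kind == "IMMEDIATE_TOKEN":
        return f"@immediate-token({render(node['content'])})"
    raise ValueError(f"unsupported node type: {kind}")
```

Integer and named precedence values go through the same code path, so a `PREC_LEFT` node with value `2` and one with value `"logical_and"` differ only in the text between the parentheses.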
Additional ABNF Extensions
- Underscores in rule names: YABNF allows "_" in rule names (RFC 5234 restricts rule names to ALPHA / DIGIT / "-"). A leading "_" marks a hidden rule (tree-sitter convention).
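The difference between the two naming rules can be sketched as a pair of regular expressions. The RFC 5234 pattern follows that RFC's `rulename` production; the YABNF pattern is an assumption (underscore permitted anywhere, including the first character), since the formal YABNF grammar is not yet published. All function names here are hypothetical.

```python
import re

# RFC 5234: rulename = ALPHA *(ALPHA / DIGIT / "-")
RFC5234_RULENAME = re.compile(r"[A-Za-z][A-Za-z0-9-]*")

# Assumed YABNF relaxation: "_" is also allowed, including as the first character.
YABNF_RULENAME = re.compile(r"[A-Za-z_][A-Za-z0-9_-]*")


def is_valid_abnf_name(name):
    """True if the name is a legal rule name under standard ABNF."""
    return RFC5234_RULENAME.fullmatch(name) is not None


def is_valid_yabnf_name(name):
    """True if the name is legal under the assumed YABNF rule."""
    return YABNF_RULENAME.fullmatch(name) is not None


def is_hidden(name):
    """True if the name follows the tree-sitter hidden-rule convention."""
    return name.startswith("_")
```

Under these assumptions `_semicolon` is rejected by the standard ABNF check, accepted by the YABNF check, and reported as hidden.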
Audit Results
All 16 extensions were validated against real tree-sitter grammars (json, javascript):
- All 16 are load-bearing — each maps to a distinct tree-sitter grammar.json concept with no standard ABNF equivalent
- None are redundant — removing any would lose information needed for round-trip conversion
- Gap identified: tree-sitter's top-level precedences key (ordered precedence tiers) has no @ directive yet; to be addressed in the formal spec
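One way to surface a gap like this is to compare a grammar's top-level keys against the keys that have a directive. The sketch below is only an illustration of that check, not the audit procedure actually used: `uncovered_keys` is a hypothetical helper, and the key-to-directive mapping is a reading of the directive table against the grammar.json format.

```python
# Assumed mapping from grammar.json top-level keys to YABNF directives.
DIRECTIVE_FOR_KEY = {
    "name": "@grammar",
    "word": "@word",
    "extras": "@extras",
    "inline": "@inline",
    "conflicts": "@conflicts",
    "externals": "@externals",
    "supertypes": "@supertypes",
}


def uncovered_keys(grammar):
    """Return the top-level keys of a grammar dict that no directive represents."""
    # "rules" is expressed as the rule definitions themselves, not as a directive.
    covered = set(DIRECTIVE_FOR_KEY) | {"rules"}
    return sorted(key for key in grammar if key not in covered)
```

Applied to a grammar that declares a `precedences` list, the function reports that key as uncovered.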
Status
The formal YABNF grammar specification is tracked in issue #1.
Links
- Source (Codeberg): https://codeberg.org/hum3/yabnf
- tree-sitter2abnf: https://codeberg.org/hum3/tree-sitter2abnf
References
- RFC 5234 — ABNF specification
- RFC 7405 — case-sensitive string literals in ABNF
- tree-sitter — parser generator
- tree-sitter grammar DSL — grammar.js authoring guide