Cleanup and documentation

Jonatan Kłosko 2021-09-28 16:00:35 +02:00
parent dad92d2b87
commit d229dddf66
30 changed files with 330857 additions and 388993 deletions

2
.gitignore vendored

@ -5,4 +5,6 @@
/tmp/ /tmp/
# Temporary files generated by Tree-sitter # Temporary files generated by Tree-sitter
/build/
log.html log.html
tree-sitter-elixir.wasm

9
README.md Normal file

@ -0,0 +1,9 @@
# tree-sitter-elixir
[![Test](https://github.com/jonatanklosko/tree-sitter-elixir/actions/workflows/test.yml/badge.svg)](https://github.com/jonatanklosko/tree-sitter-elixir/actions/workflows/test.yml)
Elixir grammar for [tree-sitter](https://github.com/tree-sitter/tree-sitter).
## Development
See [the docs](./docs.md) for development notes.

271
docs.md Normal file

@ -0,0 +1,271 @@
# Development notes
## Acknowledgements
While this parser is written from scratch, there were previous efforts that made
for a helpful reference:
* [tree-sitter-elixir](https://github.com/ananthakumaran/tree-sitter-elixir) developed
by [@ananthakumaran](https://github.com/ananthakumaran)
* [tree-sitter-elixir](https://github.com/wingyplus/tree-sitter-elixir) developed
by [@wingyplus](https://github.com/wingyplus) and [@Tuxified](https://github.com/Tuxified)
## The AST
When it comes to the AST, Elixir is a rather specific language due to its macro system.
From the perspective of our parser, the important implication is that seemingly
invalid code can be valid syntax when used in a macro (or just put in a `quote`
expression). For example:
```elixir
quote do
def Bar.foo(x), definitely_not_do: 1
%{a}
*/2
end
```
As opposed to other languages, core constructs like `def`, `if` and `for` are not
particularly special either, since they are themselves regular functions (or macros rather).
As a result, they can be used "improperly" in a quoted expression, as shown above.
Consequently, to correctly parse all Elixir code, we need the AST to closely match
the Elixir AST. See [Elixir / Syntax reference](https://hexdocs.pm/elixir/syntax-reference.html)
for more details.
Whenever possible, we try to use more specific nodes (like binary/unary operator),
but only to the extent that doesn't sacrifice generality. To get a sense of what the AST looks
like, have a look at the tests in `test/corpus/`.
## Getting started with Tree-sitter
For the official notes, see the guide on [Creating parsers](https://tree-sitter.github.io/tree-sitter/creating-parsers).
Essentially, we define relevant language rules in `grammar.js`, based on which
Tree-sitter generates parser code (under `src/`). In some cases, we want to write
custom C++ code for tokenizing specific character sequences (in `src/scanner.cc`).
The grammar rules may often conflict with each other, meaning that the given
sequence of tokens has multiple valid interpretations given one _token_ of lookahead.
In many conflicts we always want to pick one interpretation over the other and we can
do this by assigning different precedence and associativity to relevant rules, which
tells the parser which way to go.
For example, given `expression1 * expression2 • *`, the next token we _see_ ahead is `*`.
The parser needs to decide whether `expression1 * expression2` is a complete binary operator
node, or if it should await the next expression and interpret it as `expression1 * (expression2 * expression3)`.
Since the `*` operator is left-associative we can use `prec.left` on the corresponding
grammar rule, to inform the parser how to resolve this conflict.
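The two interpretations the parser has to choose between can be sketched in plain JavaScript. This is only an illustration of the resulting tree shapes, not grammar code: `groupLeft` and `groupRight` are hypothetical helpers, and the object layout is an assumption made for the example.

```javascript
// Hypothetical helpers (not part of the grammar): they build the two trees
// the parser could produce for `a * b * c`.

// Left-associative grouping, the shape `prec.left` selects: (a * b) * c
function groupLeft(operands, operator) {
  return operands
    .slice(1)
    .reduce((left, right) => ({ operator, left, right }), operands[0]);
}

// Right-associative grouping, the shape `prec.right` would select: a * (b * c)
function groupRight(operands, operator) {
  return operands
    .slice(0, -1)
    .reduceRight(
      (right, left) => ({ operator, left, right }),
      operands[operands.length - 1]
    );
}

console.log(JSON.stringify(groupLeft(["a", "b", "c"], "*")));
console.log(JSON.stringify(groupRight(["a", "b", "c"], "*")));
```

With left associativity the first two operands form the inner node, so the parser reduces `expression1 * expression2` as soon as it sees the second `*`.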
However, in some cases looking one token ahead isn't enough, in which case we can add
the conflicting rules to the `conflicts` list in the grammar. Whenever the parser stumbles
upon this conflict it uses its GLR algorithm, basically considering both interpretations
until one leads to a parsing error. If both paths parse correctly (there's a genuine ambiguity)
we can use dynamic precedence (`prec.dynamic`) to decide on the preferred path.
## Using the CLI
### tree-sitter
```shell
# See CLI usage
npx tree-sitter -h
# Generate the parser code based on grammar.js
npx tree-sitter generate
# Run tests
npx tree-sitter test
npx tree-sitter test --filter "access syntax"
# Parse a specific file
npx tree-sitter parse tmp/test.ex
npx tree-sitter parse -x tmp/test.ex
# Parse codebase to verify syntax coverage
npx tree-sitter parse --quiet --stat 'tmp/elixir/**/*.ex*'
```
Whenever you make a change to `grammar.js` remember to run `generate`,
before verifying the result. To test custom code, create an Elixir file
like `tmp/test.ex` and use `parse` on it. The `-x` flag prints out the
source grouped into AST nodes as XML.
### Additional scripts
```shell
# Format the grammar.js file
npm run format
# Run parser against the given repository
scripts/parse_repo.sh elixir-lang/elixir
```
## Implementation notes
This section covers some of the implementation decisions that have a more
elaborate rationale. The individual subsections are referenced in the code.
### Ref 1. External scanner for quoted content
We want to scan quoted content as a single token, but it requires lookahead.
Specifically, the `#` character may no longer be quoted content if followed by `{`.
Also, inside a heredoc string, tokenizing `"` (or `'`) requires lookahead to know
whether it's already part of the end delimiter.
Since we need to use an external scanner, we need to know the delimiter type.
One way to achieve this is using the external scanner to scan the start delimiter
and then storing its type on the parser stack. This approach requires the parser
to allocate enough memory upfront and implement serialization/deserialization,
which ideally would be avoided. To avoid this, we use a different approach!
Instead of having a single `quoted_content` token, we have specific tokens for
each quoted content type, such as `_quoted_content_i_single`, `_quoted_content_i_double`.
Once the start delimiter is tokenized, we know which quoted content should be
tokenized next, and from the token we can infer the end delimiter and whether
it supports interpolation. In other words, we extract the information from the
parsing state, rather than maintaining custom parser state.
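The idea of deriving everything from the requested token type can be sketched in plain JavaScript. The real scanner lives in `src/scanner.cc`. The table and the `scanQuotedContent` function below are hypothetical, cover only a few of the token types, and make simplifying assumptions (a backslash skips the next character, and a heredoc end delimiter is matched anywhere rather than only at the start of a line).

```javascript
// Hypothetical lookup: each token type implies the end delimiter and whether
// interpolation is supported, so no scanner state needs to be stored.
const QUOTED_CONTENT_INFO = {
  _quoted_content_i_single: { endDelimiter: "'", interpolation: true },
  _quoted_content_i_double: { endDelimiter: '"', interpolation: true },
  _quoted_content_i_heredoc_single: { endDelimiter: "'''", interpolation: true },
  _quoted_content_single: { endDelimiter: "'", interpolation: false },
  _quoted_content_double: { endDelimiter: '"', interpolation: false },
};

// Returns the index at which the quoted content token ends.
function scanQuotedContent(source, start, tokenType) {
  const { endDelimiter, interpolation } = QUOTED_CONTENT_INFO[tokenType];
  let pos = start;
  while (pos < source.length) {
    // Lookahead: stop before the end delimiter
    if (source.startsWith(endDelimiter, pos)) break;
    // Lookahead: `#` only ends the content when followed by `{`
    if (interpolation && source.startsWith("#{", pos)) break;
    // Simplification: a backslash consumes the following character as well
    pos += source[pos] === "\\" ? 2 : 1;
  }
  return Math.min(pos, source.length);
}

console.log(scanQuotedContent('ab#{c}"', 0, "_quoted_content_i_double"));
console.log(scanQuotedContent('ab#{c}"', 0, "_quoted_content_double"));
```

The same input yields a different token boundary depending only on which token type the parser asked for, which is what lets the scanner stay stateless.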
### Ref 2. External scanner for newlines
Generally newlines may appear in the middle of expressions and we ignore them
as long as the expression is valid, which is why we list newline under extras.
When a newline follows a complete expression, most of the time it should be
treated as a terminator. However, there are specific cases where the newline is
non-breaking and treated as if it were just a space. These cases are:
* call followed by newline and a `do end` block
* expression followed by newline and a binary operator
In both cases we want to tokenize the newline as non-breaking, so we use an external
scanner for lookahead.
Note that the relevant rules already specify left/right associativity, so if we
simply added `optional("\n")` the conflicts would be resolved immediately rather
than by using GLR.
Additionally, since comments may appear anywhere and don't change the context,
we also tokenize newlines before comments as non-breaking.
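The lookahead described in this section can be sketched as follows. This is not the actual scanner code: `classifyNewline`, `isWordAt` and the operator list are hypothetical, and the list is a small illustrative subset rather than the full set of binary operators the scanner recognizes.

```javascript
// Illustrative subset only; the real scanner handles more operators.
const SAMPLE_BINARY_OPERATORS = ["|>", "<>", "==", "&&", "||", "and", "or", "when"];

// True when `word` appears at `pos` and is not a prefix of a longer word
// (so `done` or `order` do not count as `do` or `or`).
function isWordAt(source, pos, word) {
  if (!source.startsWith(word, pos)) return false;
  const next = source[pos + word.length];
  return next === undefined || !/[\p{ID_Continue}?!:]/u.test(next);
}

// `newlineIndex` points at a "\n"; returns which token the newline becomes.
function classifyNewline(source, newlineIndex) {
  let pos = newlineIndex + 1;
  // Look ahead through any whitespace, including further newlines
  while (pos < source.length && /\s/.test(source[pos])) pos++;

  if (source[pos] === "#") return "_newline_before_comment";
  if (isWordAt(source, pos, "do")) return "_newline_before_do";

  const startsWithOperator = SAMPLE_BINARY_OPERATORS.some((op) =>
    /^[a-z]+$/.test(op) ? isWordAt(source, pos, op) : source.startsWith(op, pos)
  );
  if (startsWithOperator) return "_newline_before_binary_operator";

  return "newline";
}

console.log(classifyNewline("foo\n|> bar", 3));
console.log(classifyNewline("foo bar\n  do\nend", 7));
console.log(classifyNewline("a\nb", 1));
```

Only the last call yields a regular newline, which the grammar can then treat as a terminator.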
### Ref 3. External scanner for unary + and -
Plus and minus are either binary or unary operators, depending on the context.
Consider the following variants:
```
a + b
a+b
a+ b
a +b
```
In the first three expressions `+` is a binary operator, while in the last one
`+` is a unary operator referring to a local call argument.
To correctly tokenize all cases we use an external scanner to tokenize a special empty
token (`_before_unary_operator`) when the spacing matches `a +b`, which forces the
parser to pick the unary operator path.
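The spacing rule can be expressed as a small check. The function below is a hypothetical JavaScript sketch of that rule, not the scanner implementation: it assumes the operator position is already known and only looks at the characters directly around it.

```javascript
// Hypothetical check: the empty token is emitted only when there is inline
// whitespace before the operator and no whitespace after it (as in `a +b`).
function shouldEmitBeforeUnaryOperator(source, operatorIndex) {
  const before = source[operatorIndex - 1];
  const after = source[operatorIndex + 1];
  const isInlineSpace = (ch) => ch === " " || ch === "\t";
  return isInlineSpace(before) && after !== undefined && !/\s/.test(after);
}

for (const variant of ["a + b", "a+b", "a+ b", "a +b"]) {
  const index = variant.indexOf("+");
  console.log(variant, shouldEmitBeforeUnaryOperator(variant, index));
}
```

Of the four variants, only `a +b` satisfies the check, matching the description above.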
### Ref 4. External scanner for `not in`
The `not in` operator may have arbitrary inline whitespace between `not` and `in`.
We cannot use a regular expression like `/not[ \t]+in/`, because it would also match
in expressions like `a not inn` as the longest matching token.
A possible solution could be `seq("not", "in")` with dynamic conflict resolution, but
then we tokenize two separate tokens. Also, to properly handle `a not inn`, we would need
keyword extraction, which causes problems in our case (https://github.com/tree-sitter/tree-sitter/issues/1404).
In the end it's easiest to use an external scanner, so that we can skip inline whitespace
and ensure the token ends after `in`.
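A minimal sketch of that scanning logic, written in JavaScript for readability. The actual scanner is implemented in `src/scanner.cc`; `scanNotIn` is a hypothetical name, and the check for what may follow `in` is an assumption based on the identifier pattern used elsewhere in the grammar.

```javascript
// Returns the index right after `in` when a `not in` token starts at `start`,
// or -1 when there is no such token.
function scanNotIn(source, start) {
  if (!source.startsWith("not", start)) return -1;
  let pos = start + 3;

  // Skip inline whitespace; at least one space or tab is required
  const whitespaceStart = pos;
  while (source[pos] === " " || source[pos] === "\t") pos++;
  if (pos === whitespaceStart) return -1;

  if (!source.startsWith("in", pos)) return -1;
  pos += 2;

  // Ensure the token ends after `in`, so `not inn` is rejected
  const next = source[pos];
  if (next !== undefined && /[\p{ID_Continue}?!]/u.test(next)) return -1;
  return pos;
}

console.log(scanNotIn("a not   in b", 2));
console.log(scanNotIn("a not inn", 2));
```

The first call finds the operator despite the extra spaces, while the second returns `-1` because `inn` continues as an identifier.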
### Ref 5. External scanner for quoted atom start
For parsing the quoted atom `:` we could make the `"` (or `'`) token immediate, however this
would require adding immediate rules for single/double quoted content and listing them
in relevant places. We could definitely do that, but using an external scanner is actually
simpler.
### Ref 6. Identifier pattern
See [Elixir / Unicode Syntax](https://hexdocs.pm/elixir/unicode-syntax.html) for official
notes.
Tree-sitter already supports unicode properties in regular expressions, however character
class subtraction is not supported.
For the base `<Start>` and `<Continue>` we can use `[\p{ID_Start}]` and `[\p{ID_Continue}]`
respectively, since both are supported and according to the
[Unicode Annex #31](https://unicode.org/reports/tr31/#Table_Lexical_Classes_for_Identifiers)
they match the ranges listed in the Elixir docs.
For atoms this translates to a clean regular expression.
For variables however, we want to exclude uppercase (`\p{Lu}`) and titlecase (`\p{Lt}`)
categories from `\p{ID_Start}`. As already mentioned, we cannot use group subtraction
in the regular expression, so instead we need to create a suitable group of characters
on our own.
After removing the uppercase/titlecase categories from `[\p{ID_Start}]`, we obtain the
following group:
`[\p{Ll}\p{Lm}\p{Lo}\p{Nl}\p{Other_ID_Start}-\p{Pattern_Syntax}-\p{Pattern_White_Space}]`
At the time of writing the subtracted groups actually only remove a single character:
```elixir
Mix.install([{:unicode_set, "~> 1.1"}])
Unicode.Set.to_utf8_char(
"[[[:Ll:][:Lm:][:Lo:][:Nl:][:Other_ID_Start:]] & [[:Pattern_Syntax:][:Pattern_White_Space:]]]"
)
#=> {:ok, [11823]}
```
Consequently, by removing the subtraction we allow just one additional (not common) character,
which is perfectly acceptable.
It's important to note that JavaScript regular expressions don't support the `\p{Other_ID_Start}`
unicode category. Fortunately this category is a small set of characters introduced for
[backward compatibility](https://unicode.org/reports/tr31/#Backward_Compatibility), so we can
enumerate it manually:
```elixir
Mix.install([{:unicode_set, "~> 1.1"}])
Unicode.Set.to_utf8_char("[[[:Other_ID_Start:]] - [[:Pattern_Syntax:][:Pattern_White_Space:]]]")
|> elem(1)
|> Enum.flat_map(fn
n when is_number(n) -> [n]
range -> range
end)
|> Enum.map(&Integer.to_string(&1, 16))
#=> ["1885", "1886", "2118", "212E", "309B", "309C"]
```
Finally, we obtain this regular expression group for variable `<Start>`:
`[\p{Ll}\p{Lm}\p{Lo}\p{Nl}\u1885\u1886\u2118\u212E\u309B\u309C]`
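To check the resulting pattern quickly, it can be exercised in Node directly. The snippet below anchors the identifier regular expression used in `grammar.js` and tries a few inputs. The `isVariableName` helper and the sample strings are only illustrative.

```javascript
// The identifier pattern from grammar.js, anchored for whole-string matching
const IDENTIFIER =
  /^[\p{Ll}\p{Lm}\p{Lo}\p{Nl}\u1885\u1886\u2118\u212E\u309B\u309C][\p{ID_Continue}]*[?!]?$/u;

// Hypothetical helper for trying out inputs
function isVariableName(name) {
  return IDENTIFIER.test(name);
}

console.log(isVariableName("foo_bar?")); // lowercase start, trailing `?`
console.log(isVariableName("Foo")); // uppercase start (\p{Lu}) is excluded
console.log(isVariableName("\u2118x")); // Other_ID_Start character as start
console.log(isVariableName(String.fromCodePoint(11823))); // the one extra character
```

The last call shows the accepted tradeoff: code point 11823 matches because the subtraction of `\p{Pattern_Syntax}` was dropped.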
### Ref 7. Keyword token
We tokenize the whole keyword sequence like `do: ` as a single token.
Ideally we wouldn't include the whitespace, but since we use `token`
it gets included. However, this is an intentionally accepted tradeoff,
because using `token` significantly simplifies the grammar and avoids
conflicts.
The alternative approach would be to define keyword as `seq(alias(choice(...), $._keyword_literal), $._keyword_end)`,
where we list all other tokens that make for a valid keyword literal
and use a custom scanner for `_keyword_end` to look ahead without tokenizing
the whitespace. However, this approach generates a number of conflicts
because `:` is tokenized separately and phrases like `fun fun • do` or
`fun • {}` are ambiguous (interpretation depends on whether `:` comes next).
Resolving some of these conflicts (for instance special keywords like `{}` or `%{}`)
requires the use of an external scanner. Given the complexities this approach
brings to the grammar, and consequently the parser, we stick to the simpler
approach.
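The effect of including the whitespace in the token can be seen with a plain regular expression. The sketch below combines `ATOM_WORD_LITERAL` from `grammar.js` with the `/:\s/` suffix. It covers only word keywords (operator and special literals are left out), and `matchKeyword` is a hypothetical helper.

```javascript
// Word literal pattern as defined in grammar.js
const ATOM_WORD_LITERAL = /[\p{ID_Start}_][\p{ID_Continue}@]*[?!]?/u;

// Simplified keyword token: word literal followed by `:` and one whitespace
const KEYWORD = new RegExp(`^(?:${ATOM_WORD_LITERAL.source}):\\s`, "u");

// Returns the matched keyword token text, or null when there is no keyword
function matchKeyword(source) {
  const match = KEYWORD.exec(source);
  return match ? match[0] : null;
}

console.log(JSON.stringify(matchKeyword("do: 1")));
console.log(JSON.stringify(matchKeyword("do:1")));
console.log(JSON.stringify(matchKeyword("a::b")));
```

The first call returns the keyword together with its trailing space, while the other two return `null` because `:` is not followed by whitespace.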


@ -1,7 +1,5 @@
// Operator precedence:
// * https://hexdocs.pm/elixir/master/operators.html
// * https://github.com/elixir-lang/elixir/blob/master/lib/elixir/src/elixir_parser.yrl
const PREC = { const PREC = {
// See https://github.com/elixir-lang/elixir/blob/master/lib/elixir/src/elixir_parser.yrl
IN_MATCH_OPS: 10, IN_MATCH_OPS: 10,
WHEN_OP: 20, WHEN_OP: 20,
TYPE_OP: 30, TYPE_OP: 30,
@ -64,6 +62,9 @@ const ATOM_OPERATOR_LITERALS = ALL_OPS.filter(
// so it should be kept in sync // so it should be kept in sync
const ATOM_SPECIAL_LITERALS = ["...", "%{}", "{}", "%", "<<>>", "..//"]; const ATOM_SPECIAL_LITERALS = ["...", "%{}", "{}", "%", "<<>>", "..//"];
// See Ref 6. in the docs
const ATOM_WORD_LITERAL = /[\p{ID_Start}_][\p{ID_Continue}@]*[?!]?/u;
// Word tokens used directly in the grammar // Word tokens used directly in the grammar
const RESERVED_WORD_TOKENS = [ const RESERVED_WORD_TOKENS = [
// Operators // Operators
@ -82,31 +83,28 @@ const SPECIAL_IDENTIFIERS = [
"__STACKTRACE__", "__STACKTRACE__",
]; ];
// Numbers
const DIGITS = /[0-9]+/; const DIGITS = /[0-9]+/;
const BIN_DIGITS = /[0-1]+/; const BIN_DIGITS = /[0-1]+/;
const OCT_DIGITS = /[0-7]+/; const OCT_DIGITS = /[0-7]+/;
const HEX_DIGITS = /[0-9a-fA-F]+/; const HEX_DIGITS = /[0-9a-fA-F]+/;
const numberDec = sep1(DIGITS, "_"); const NUMBER_DEC = sep1(DIGITS, "_");
const numberBin = seq("0b", sep1(BIN_DIGITS, "_")); const NUMBER_BIN = seq("0b", sep1(BIN_DIGITS, "_"));
const numberOct = seq("0o", sep1(OCT_DIGITS, "_")); const NUMBER_OCT = seq("0o", sep1(OCT_DIGITS, "_"));
const numberHex = seq("0x", sep1(HEX_DIGITS, "_")); const NUMBER_HEX = seq("0x", sep1(HEX_DIGITS, "_"));
const integer = choice(numberDec, numberBin, numberOct, numberHex); const INTEGER = choice(NUMBER_DEC, NUMBER_BIN, NUMBER_OCT, NUMBER_HEX);
const floatScientificPart = seq(/[eE]/, optional(choice("-", "+")), integer); const FLOAT_SCIENTIFIC_PART = seq(/[eE]/, optional(choice("-", "+")), INTEGER);
const float = seq(numberDec, ".", numberDec, optional(floatScientificPart)); const FLOAT = seq(NUMBER_DEC, ".", NUMBER_DEC, optional(FLOAT_SCIENTIFIC_PART));
const aliasPart = /[A-Z][_a-zA-Z0-9]*/; const NEWLINE = /\r?\n/;
module.exports = grammar({ module.exports = grammar({
name: "elixir", name: "elixir",
// TODO describe stuff (also in the separate notes doc add clarification
// how we use this verbose tokens to avoid needing scanner state)
externals: ($) => [ externals: ($) => [
// See Ref 1. in the docs
$._quoted_content_i_single, $._quoted_content_i_single,
$._quoted_content_i_double, $._quoted_content_i_double,
$._quoted_content_i_heredoc_single, $._quoted_content_i_heredoc_single,
@ -117,7 +115,6 @@ module.exports = grammar({
$._quoted_content_i_angle, $._quoted_content_i_angle,
$._quoted_content_i_bar, $._quoted_content_i_bar,
$._quoted_content_i_slash, $._quoted_content_i_slash,
$._quoted_content_single, $._quoted_content_single,
$._quoted_content_double, $._quoted_content_double,
$._quoted_content_heredoc_single, $._quoted_content_heredoc_single,
@ -129,77 +126,62 @@ module.exports = grammar({
$._quoted_content_bar, $._quoted_content_bar,
$._quoted_content_slash, $._quoted_content_slash,
$._keyword_special_literal, // See Ref 2. in the docs
$._atom_start,
$._keyword_end,
$._newline_before_do, $._newline_before_do,
$._newline_before_binary_op, $._newline_before_binary_operator,
// TODO explain this, basically we use newline ignored for newline before comment,
// as after the comment there is another newline that we then consider as usual (so
// that comments are skipped when considering newlines) <- this is chaotic need a better one
$._newline_before_comment, $._newline_before_comment,
// TODO explain this, basically we use this to force unary + and - // See Ref 3. in the docs
// if there is no spacing before the operand
$._before_unary_op, $._before_unary_op,
// See Ref 4. in the docs
$._not_in, $._not_in,
// See Ref 5. in the docs
$._quoted_atom_start,
], ],
// TODO include in notes about why using extra for newline before binary op is fine
// TODO figure out how "\n" helps with the behaviour in
// [
// :a,
// ]
// and how it generally works with extras
extras: ($) => [ extras: ($) => [
NEWLINE,
/[ \t]|\r?\n|\\\r?\n/,
$.comment, $.comment,
/\s|\\\n/,
$._newline_before_binary_op,
$._newline_before_comment, $._newline_before_comment,
"\n", // Placing this directly in the binary operator rule leads
// to conflicts, but we can place it here without any drawbacks.
// If we detect binary operator and the previous line is not a
// valid expression, it's a syntax error either way
$._newline_before_binary_operator,
], ],
// TODO check if the parser doesn't compile without each conflict rule,
// otherwise it means we don't really use it (I think)
conflicts: ($) => [ conflicts: ($) => [
// [$._newline_before_binary_op], // Given `left • *`, `left` identifier can be either:
[$.binary_operator], // * expression in `left * right`
[$.keywords], // * call identifier in `left * / 2`
// [$.identifier, $.atom_literal], [$._expression, $._local_call_without_parentheses],
[$._expression, $._local_call_with_arguments],
[
$._expression,
$._local_call_with_arguments,
$._local_call_without_arguments,
],
[$._remote_call, $._parenthesised_remote_call], // Given `left • when`, `left` expression can be either:
// * binary operator operand in `left when right`
// * stab arguments item in `left when right ->`
//
// Given `arg1, left • when`, `left` expression can be either:
// * binary operator operand in `arg1, left when right, arg3`
// * stab arguments item in `arg1, left when right ->`
[$.binary_operator, $._stab_clause_arguments_without_parentheses],
// stab clause `(x` may be either `(x;y) ->` or `(x, y) ->` // Given `(-> • /`, stab can be either:
// [$.block, $._stab_clause_arguments], // * stab clause operator in `(-> / / 2)`
[$.block, $._stab_clause_parentheses_arguments], // * operator identifier in `(-> / 2)`
[$.block, $._stab_clause_arguments],
[$.block, $._stab_clause_arguments_expression],
// when in stab clause
[$.binary_operator, $._stab_clause_arguments_expression],
[$.tuple, $.map],
[$.tuple, $.map_content],
[$.operator_identifier, $.stab_clause], [$.operator_identifier, $.stab_clause],
// Given `& /`, ampersand can be either:
// * capture operator in `& / / 2`
// * operator identifier in `& / 1`
[$.unary_operator, $.operator_identifier], [$.unary_operator, $.operator_identifier],
// [$.alias],
// Given `(arg -> expression • \n`, the newline could be either:
// * terminator separating expressions in `(arg -> expression \n expression)`
// * terminator separating clauses in `(arg -> expression \n arg -> expression)`
[$.body], [$.body],
// [$.block, $._stab_clause_arguments],
// [$.block, $._stab_clause_parentheses_arguments],
// [$.block, $._stab_clause_parentheses_arguments],
[$.after_block],
[$.rescue_block],
[$.catch_block],
[$.else_block],
], ],
rules: { rules: {
@ -212,7 +194,8 @@ module.exports = grammar({
), ),
_terminator: ($) => _terminator: ($) =>
prec.right(choice(seq(repeat("\n"), ";"), repeat1("\n"))), // Right precedence, because we want to consume `;` after newlines if present
prec.right(choice(seq(repeat(NEWLINE), ";"), repeat1(NEWLINE))),
_expression: ($) => _expression: ($) =>
choice( choice(
@ -221,7 +204,10 @@ module.exports = grammar({
$.alias, $.alias,
$.integer, $.integer,
$.float, $.float,
$.atom, $.char,
$.boolean,
$.nil,
$._atom,
$.string, $.string,
$.charlist, $.charlist,
$.sigil, $.sigil,
@ -229,9 +215,6 @@ module.exports = grammar({
$.tuple, $.tuple,
$.bitstring, $.bitstring,
$.map, $.map,
$.char,
$.boolean,
$.nil,
$.unary_operator, $.unary_operator,
$.binary_operator, $.binary_operator,
$.dot, $.dot,
@ -241,54 +224,27 @@ module.exports = grammar({
), ),
block: ($) => block: ($) =>
prec( seq(
PREC.WHEN_OP, "(",
seq( optional($._terminator),
"(", optional(
seq( choice(
optional($._terminator), sep1(choice($.stab_clause), $._terminator),
optional( seq(
seq( sep1(choice($._expression), $._terminator),
sep1(choice($._expression, $.stab_clause), $._terminator), optional($._terminator)
optional($._terminator)
)
) )
), )
")" ),
) ")"
), ),
_identifier: ($) => _identifier: ($) =>
choice($.identifier, $.unused_identifier, $.special_identifier), choice($.identifier, $.unused_identifier, $.special_identifier),
// Note: Elixir does not allow uppercase and titlecase letters
// as a variable starting character, but this regex would match
// those. This implies we would happily parse those cases, but
// since they are not valid Elixir it's unlikely to stumble upon
// them. TODO reword
// Ref: https://hexdocs.pm/elixir/master/unicode-syntax.html#variables
// TODO see if we need this in custom scanner in the end, if we do,
// then we may use the generation script from the original repo instead
// and make this an external (though I'd check if these custom unicode
// functions are efficient, does compiler optimise such checks?)
// identifier: ($) => choice(/[\p{ID_Start}][\p{ID_Continue}]*[?!]?/u, "..."),
// identifier: ($) => choice(/[\p{Ll}\p{Lm}\p{Lo}\p{Nl}\p{Other_ID_Start}][\p{ID_Continue}]*[?!]?/u, "..."),
// identifier: ($) => choice(/[\p{Ll}\p{Lm}\p{Lo}\p{Nl}][\p{ID_Continue}]*[?!]?/u, "..."),
//
// TODO elaborate, but basically
//
// we remove uppercase/titlecase letters from ID_Start as elixir does
// we remove the subtractions (we cannot express group subtraction in regex),
// but it's fine becaues at the time of writing these groups only really subtract
// a single character
// Unicode.Set.to_utf8_char "[[[:L:][:Nl:][:Other_ID_Start:]] & [[:Pattern_Syntax:][:Pattern_White_Space:]]]"
// we use hardcoded codepoints for \p{Other_ID_Start} since treesitter/js regexp doesn't
// recognise this group
//
// Other_ID_Start \u1885\u1886\u2118\u212E\u309B\u309C
// (this the list at the time of writing, it's for backward compatibility, see https://unicode.org/reports/tr31/#Backward_Compatibility)
identifier: ($) => identifier: ($) =>
choice( choice(
// See Ref 6. in the docs
/[\p{Ll}\p{Lm}\p{Lo}\p{Nl}\u1885\u1886\u2118\u212E\u309B\u309C][\p{ID_Continue}]*[?!]?/u, /[\p{Ll}\p{Lm}\p{Lo}\p{Nl}\u1885\u1886\u2118\u212E\u309B\u309C][\p{ID_Continue}]*[?!]?/u,
"..." "..."
), ),
@ -297,36 +253,34 @@ module.exports = grammar({
special_identifier: ($) => choice(...SPECIAL_IDENTIFIERS), special_identifier: ($) => choice(...SPECIAL_IDENTIFIERS),
// We have a separate rule for single-part alias, so that we alias: ($) => token(sep1(/[A-Z][_a-zA-Z0-9]*/, /\s*\.\s*/)),
// can use it in the keywords rule
alias: ($) => choice($._alias_single, $._alias_multi),
_alias_single: ($) => aliasPart, integer: ($) => token(INTEGER),
_alias_multi: ($) => token(sep1(aliasPart, /\s*\.\s*/)), float: ($) => token(FLOAT),
integer: ($) => token(integer), char: ($) => /\?(.|\\.)/,
float: ($) => token(float), boolean: ($) => choice("true", "false"),
nil: ($) => "nil",
_atom: ($) => choice($.atom, $.quoted_atom),
atom: ($) => atom: ($) =>
seq( token(
$._atom_start, seq(
choice( ":",
alias($._atom_word_literal, $.atom_literal), choice(
alias($._atom_operator_literal, $.atom_literal), ATOM_WORD_LITERAL,
alias($._atom_special_literal, $.atom_literal), ...ATOM_OPERATOR_LITERALS,
$._quoted_i_double, ...ATOM_SPECIAL_LITERALS
$._quoted_i_single )
) )
), ),
// TODO comment on the unicode groups here quoted_atom: ($) =>
_atom_word_literal: ($) => /[\p{ID_Start}_][\p{ID_Continue}@]*[?!]?/u, seq($._quoted_atom_start, choice($._quoted_i_double, $._quoted_i_single)),
_atom_operator_literal: ($) => choice(...ATOM_OPERATOR_LITERALS),
_atom_special_literal: ($) => choice(...ATOM_SPECIAL_LITERALS),
// Defines $._quoted_content_i_{name} and $._quoted_content_{name} rules, // Defines $._quoted_content_i_{name} and $._quoted_content_{name} rules,
// content with and without interpolation respectively // content with and without interpolation respectively
@ -402,6 +356,82 @@ module.exports = grammar({
optional(alias(token.immediate(/[a-zA-Z]+/), $.sigil_modifiers)) optional(alias(token.immediate(/[a-zA-Z]+/), $.sigil_modifiers))
), ),
keywords: ($) =>
// Right precedence, because we want to consume next items as long
// as there is a comma ahead
prec.right(sep1($.pair, ",")),
_keywords_with_trailing_separator: ($) =>
seq(sep1($.pair, ","), optional(",")),
pair: ($) => seq($._keyword, $._expression),
_keyword: ($) => choice($.keyword, $.quoted_keyword),
keyword: ($) =>
// See Ref 7. in the docs
token(
seq(
choice(
ATOM_WORD_LITERAL,
...ATOM_OPERATOR_LITERALS.filter((op) => op !== "::"),
...ATOM_SPECIAL_LITERALS
),
/:\s/
)
),
quoted_keyword: ($) =>
seq(
choice($._quoted_i_double, $._quoted_i_single),
token.immediate(/:\s/)
),
list: ($) => seq("[", optional($._items_with_trailing_separator), "]"),
tuple: ($) => seq("{", optional($._items_with_trailing_separator), "}"),
bitstring: ($) =>
seq("<<", optional($._items_with_trailing_separator), ">>"),
map: ($) =>
// Precedence over tuple
prec(
1,
seq(
"%",
optional($.struct),
"{",
optional(alias($._items_with_trailing_separator, $.map_content)),
"}"
)
),
struct: ($) =>
// Left precedence, because if there is a conflict involving `{}`,
// we want to treat it as map continuation rather than tuple
prec.left(
choice(
$.alias,
$._atom,
$._identifier,
$.unary_operator,
$.dot,
alias($._call_with_parentheses, $.call)
)
),
_items_with_trailing_separator: ($) =>
seq(
choice(
seq(sep1($._expression, ","), optional(",")),
seq(
optional(seq(sep1($._expression, ","), ",")),
alias($._keywords_with_trailing_separator, $.keywords)
)
)
),
unary_operator: ($) => unary_operator: ($) =>
choice( choice(
unaryOp($, prec, PREC.CAPTURE_OP, "&", $._capture_expression), unaryOp($, prec, PREC.CAPTURE_OP, "&", $._capture_expression),
@ -413,9 +443,10 @@ module.exports = grammar({
_capture_expression: ($) => _capture_expression: ($) =>
choice( choice(
// TODO sholud parenthesised expression be generally used (?) // Note that block expression is not allowed as capture operand,
// Precedence over block expression // so we have an explicit sequence with the parentheses and higher
prec(PREC.WHEN_OP + 1, seq("(", $._expression, ")")), // precedence
prec(1, seq("(", $._expression, ")")),
$._expression $._expression
), ),
@ -466,13 +497,14 @@ module.exports = grammar({
operator_identifier: ($) => operator_identifier: ($) =>
// Operators with the following changes: // Operators with the following changes:
// * exclude "=>" since it's not a valid atom/operator identifier anyway (valid only in map) //
// * we exclude // since it's only valid after .. // * exclude "=>" since it's not a valid operator identifier
// * we remove "-" and "+" since they are both unary and binary // * exclude // since it's only valid after ..
// * exclude binary "-" and "+" as they are handled as unary below
// We use the same precedence as unary operators, so that a sequence //
// like `& /` is a conflict and is resolved via $.conflicts // For unary operator identifiers we use the same precedence as
// (could be be either `& / 2` or `& / / 2`) // operators, so that we get conflicts and resolve them dynamically
// (see grammar.conflicts for more details)
choice( choice(
// Unary operators // Unary operators
prec(PREC.CAPTURE_OP, "&"), prec(PREC.CAPTURE_OP, "&"),
@ -505,188 +537,63 @@ module.exports = grammar({
seq(choice($._expression), ".", choice($.alias, $.tuple)) seq(choice($._expression), ".", choice($.alias, $.tuple))
), ),
keywords: ($) => sep1($.pair, ","), call: ($) => choice($._call_without_parentheses, $._call_with_parentheses),
pair: ($) => seq($.keyword, $._expression), _call_without_parentheses: ($) =>
keyword: ($) =>
seq(
// Tree-sitter doesn't consider ambiguities within individual
// tokens (in this case regexps). So both in [a] and [a: 1] it
// would always parse "a" as the same node (based on whether
// $.identifier or $.atom_literal) is listed first in the rules.
// However, since identifiers and alias parts are valid atom
// literals, we can list them here, in which case the parser will
// consider all paths and pick the valid one.
// Also see https://github.com/tree-sitter/tree-sitter/issues/518
choice(
alias($._atom_word_literal, $.atom_literal),
alias($._atom_operator_literal, $.atom_literal),
alias($._keyword_special_literal, $.atom_literal),
alias($.identifier, $.atom_literal),
alias($.unused_identifier, $.atom_literal),
alias($.special_identifier, $.atom_literal),
alias($._alias_single, $.atom_literal),
alias(choice(...RESERVED_WORD_TOKENS), $.atom_literal),
$._quoted_i_double,
$._quoted_i_single
),
$._keyword_end
),
list: ($) => seq("[", optional($._items_with_trailing_separator), "]"),
tuple: ($) => seq("{", optional($._items_with_trailing_separator), "}"),
bitstring: ($) =>
seq("<<", optional($._items_with_trailing_separator), ">>"),
map: ($) => seq("%", optional($.struct), "{", optional($.map_content), "}"),
struct: ($) =>
prec.left(
choice(
$.alias,
$.atom,
$._identifier,
$.unary_operator,
$.dot,
alias($._parenthesised_call, $.call)
)
),
map_content: ($) => $._items_with_trailing_separator,
_items_with_trailing_separator: ($) =>
seq(
choice(
seq(sep1($._expression, ","), optional(seq(",", $.keywords))),
$.keywords
),
optional(",")
),
char: ($) => /\?(.|\\.)/,
boolean: ($) => choice("true", "false"),
nil: ($) => "nil",
call: ($) =>
choice( choice(
$._local_call_with_arguments, $._local_call_without_parentheses,
$._parenthesised_local_call_with_arguments, $._local_call_just_do_block,
$._local_call_without_arguments, $._remote_call_without_parentheses
$._remote_call,
$._parenthesised_remote_call,
$._anonymous_call,
$._call_on_call
), ),
_parenthesised_call: ($) => _call_with_parentheses: ($) =>
choice( choice(
$._parenthesised_local_call_with_arguments, $._local_call_with_parentheses,
$._parenthesised_remote_call, $._remote_call_with_parentheses,
$._anonymous_call, $._anonymous_call,
$._call_on_call $._double_call
), ),
_call_on_call: ($) => // Note, calls have left precedence, so that `do end` block sticks to
prec.left( // the outermost call
seq(
alias(
choice(
$._parenthesised_local_call_with_arguments,
$._parenthesised_remote_call,
$._anonymous_call
),
$.call
),
// arguments in parentheses
// alias($._local_or_remote_arguments, $.arguments),
// TODO just make nonimmediate/immediate in the name
alias($._anonymous_arguments, $.arguments),
optional(seq(optional($._newline_before_do), $.do_block))
)
),
_local_call_with_arguments: ($) => _local_call_without_parentheses: ($) =>
// Given `x + y` it can be interpreted either as a binary operator
// or a call with unary operator. This is an actual ambiguity, so
// we use dynamic precedence to penalize call
// prec.dynamic(
// TODO ideally we would penalize whitespace after unary op,
// so that x + y is binary op and x +y is unary op, to reflect
// Elixir ast
// -1,
prec.left( prec.left(
seq( seq(
$._identifier, $._identifier,
alias($._call_arguments, $.arguments), alias($._call_arguments_without_parentheses, $.arguments),
// TODO include this in notes:
// We use external scanner for _newline_before_do because
// this way we can lookahead through any whitespace
// (especially newlines). We cannot simply use repeat("\n")
// and conflict with expression end, because this function
// rule has left precedence (so that do-end sticks to the outermost
// call), and thus expression end would always be preferred
optional(seq(optional($._newline_before_do), $.do_block)) optional(seq(optional($._newline_before_do), $.do_block))
// optional($.do_block)
) )
// )
), ),
_parenthesised_local_call_with_arguments: ($) => _local_call_with_parentheses: ($) =>
// Given `x + y` it can be interpreted either as a binary operator
// or a call with unary operator. This is an actual ambiguity, so
// we use dynamic precedence to penalize call
// prec.dynamic(
// TODO ideally we would penalize whitespace after unary op,
// so that x + y is binary op and x +y is unary op, to reflect
// Elixir ast
// -1,
prec.left( prec.left(
seq( seq(
$._identifier, $._identifier,
alias($._parenthesised_call_arguments, $.arguments), alias($._call_arguments_with_parentheses_immediate, $.arguments),
// TODO include this in notes:
// We use external scanner for _newline_before_do because
// this way we can lookahead through any whitespace
// (especially newlines). We cannot simply use repeat("\n")
// and conflict with expression end, because this function
// rule has left precedence (so that do-end sticks to the outermost
// call), and thus expression end would always be preferred
optional(seq(optional($._newline_before_do), $.do_block)) optional(seq(optional($._newline_before_do), $.do_block))
// optional($.do_block)
) )
// )
), ),
_local_call_without_arguments: ($) => _local_call_just_do_block: ($) =>
// We use lower precedence, so given `fun arg do end` // Lower precedence than identifier, because `foo bar do` is `foo(bar) do end`
// we don't tokenize `arg` as a call prec(-1, seq($._identifier, $.do_block)),
// we actually need a conflict because of `foo bar do end` vs `foo bar do: 1` _remote_call_without_parentheses: ($) =>
// prec(-1,
prec.dynamic(-1, seq($._identifier, $.do_block)),
// )
_remote_call: ($) =>
prec.left( prec.left(
seq( seq(
alias($._remote_dot, $.dot), alias($._remote_dot, $.dot),
optional(alias($._call_arguments, $.arguments)), optional(alias($._call_arguments_without_parentheses, $.arguments)),
optional(seq(optional($._newline_before_do), $.do_block)) optional(seq(optional($._newline_before_do), $.do_block))
// optional($.do_block)
) )
), ),
_parenthesised_remote_call: ($) => _remote_call_with_parentheses: ($) =>
prec.left( prec.left(
seq( seq(
alias($._remote_dot, $.dot), alias($._remote_dot, $.dot),
alias($._parenthesised_call_arguments, $.arguments), alias($._call_arguments_with_parentheses_immediate, $.arguments),
optional(seq(optional($._newline_before_do), $.do_block)) optional(seq(optional($._newline_before_do), $.do_block))
// optional($.do_block)
) )
), ),
@ -696,9 +603,6 @@ module.exports = grammar({
seq( seq(
$._expression, $._expression,
".", ".",
// TODO can also be string, anything else?
// compare with the other parser
// TODO we don't want to support heredoc though
choice( choice(
$._identifier, $._identifier,
alias(choice(...RESERVED_WORD_TOKENS), $.identifier), alias(choice(...RESERVED_WORD_TOKENS), $.identifier),
@ -709,117 +613,154 @@ module.exports = grammar({
) )
), ),
_parenthesised_call_arguments: ($) =>
seq(token.immediate("("), optional($._call_arguments), ")"),
_anonymous_call: ($) => _anonymous_call: ($) =>
seq( seq(
alias($._anonymous_dot, $.dot), alias($._anonymous_dot, $.dot),
alias($._anonymous_arguments, $.arguments) alias($._call_arguments_with_parentheses, $.arguments)
), ),
_anonymous_dot: ($) => prec(PREC.DOT_OP, seq($._expression, ".")), _anonymous_dot: ($) => prec(PREC.DOT_OP, seq($._expression, ".")),
_anonymous_arguments: ($) => seq("(", optional($._call_arguments), ")"), _double_call: ($) =>
prec.left(
_call_arguments: ($) =>
// Right precedence ensures that `fun1 fun2 x, y` is treated
// as `fun1(fun2(x, y))` and not `fun1(fun2(x), y)
prec.right(
seq( seq(
choice( alias(
seq( choice(
sep1($._expression, ","), $._local_call_with_parentheses,
optional(seq(",", $.keywords, optional(","))) $._remote_call_with_parentheses,
$._anonymous_call
), ),
seq($.keywords, optional(",")) $.call
) ),
alias($._call_arguments_with_parentheses, $.arguments),
optional(seq(optional($._newline_before_do), $.do_block))
) )
), ),
_call_arguments_with_parentheses: ($) =>
seq("(", optional($._call_arguments_with_trailing_separator), ")"),
_call_arguments_with_parentheses_immediate: ($) =>
seq(
token.immediate("("),
optional($._call_arguments_with_trailing_separator),
")"
),
_call_arguments_with_trailing_separator: ($) =>
choice(
seq(
sep1($._expression, ","),
optional(
seq(",", alias($._keywords_with_trailing_separator, $.keywords))
)
),
alias($._keywords_with_trailing_separator, $.keywords)
),
_call_arguments_without_parentheses: ($) =>
// Right precedence, because `fun1 fun2 x, y` is `fun1(fun2(x, y))`
prec.right(
choice(
seq(sep1($._expression, ","), optional(seq(",", $.keywords))),
$.keywords
)
),
do_block: ($) =>
seq(
callKeywordBlock($, "do"),
repeat(
choice($.after_block, $.rescue_block, $.catch_block, $.else_block)
),
"end"
),
after_block: ($) => callKeywordBlock($, "after"),
rescue_block: ($) => callKeywordBlock($, "rescue"),
catch_block: ($) => callKeywordBlock($, "catch"),
else_block: ($) => callKeywordBlock($, "else"),
access_call: ($) => access_call: ($) =>
prec( prec(
PREC.ACCESS, PREC.ACCESS,
seq($._expression, token.immediate("["), $._expression, "]") seq($._expression, token.immediate("["), $._expression, "]")
), ),
do_block: ($) =>
seq(
sugarBlock($, "do"),
repeat(
choice($.after_block, $.rescue_block, $.catch_block, $.else_block)
),
"end"
),
after_block: ($) => sugarBlock($, "after"),
rescue_block: ($) => sugarBlock($, "rescue"),
catch_block: ($) => sugarBlock($, "catch"),
else_block: ($) => sugarBlock($, "else"),
// Specify right precedence, so that we consume as much as we can
stab_clause: ($) => stab_clause: ($) =>
// Right precedence, because we want to consume body if any
prec.right(seq(optional($._stab_clause_left), "->", optional($.body))), prec.right(seq(optional($._stab_clause_left), "->", optional($.body))),
_stab_clause_left: ($) => _stab_clause_left: ($) =>
choice( choice(
// Note the first option has higher precedence, TODO clarify alias($._stab_clause_arguments_with_parentheses, $.arguments),
alias($._stab_clause_parentheses_arguments, $.arguments),
// TODO naming/cleanup
alias( alias(
$._stab_clause_parentheses_arguments_with_guard, $._stab_clause_arguments_with_parentheses_with_guard,
$.binary_operator $.binary_operator
), ),
alias($._stab_clause_arguments, $.arguments), alias($._stab_clause_arguments_without_parentheses, $.arguments),
alias($._stab_clause_arguments_with_guard, $.binary_operator) alias(
$._stab_clause_arguments_without_parentheses_with_guard,
$.binary_operator
)
), ),
_stab_clause_parentheses_arguments: ($) => _stab_clause_arguments_with_parentheses: ($) =>
// `(1) ->` may be interpreted either as block argument // Precedence over block expression
// or argument in parentheses and we use dynamic precedence prec(
// to favour the latter 1,
seq(
"(",
optional(
choice(
seq(sep1($._expression, ","), optional(seq(",", $.keywords))),
$.keywords
)
),
")"
)
),
_stab_clause_arguments_without_parentheses: ($) =>
// We give the arguments and expression the same precedence as "when"
// binary operator, so that we get conflicts and resolve them dynamically
// (see the grammar.conflicts for more details)
prec( prec(
PREC.WHEN_OP, PREC.WHEN_OP,
prec.dynamic(1, seq("(", optional($._stab_clause_arguments), ")")) choice(
seq(
sep1(prec(PREC.WHEN_OP, $._expression), ","),
optional(seq(",", $.keywords))
),
$.keywords
)
), ),
_stab_clause_parentheses_arguments_with_guard: ($) =>
_stab_clause_arguments_with_parentheses_with_guard: ($) =>
seq( seq(
alias($._stab_clause_parentheses_arguments, $.arguments), alias($._stab_clause_arguments_with_parentheses, $.arguments),
"when", "when",
$._expression $._expression
), ),
_stab_clause_arguments_with_guard: ($) => _stab_clause_arguments_without_parentheses_with_guard: ($) =>
// `a when b ->` may be interpted either such that `a when b` is an argument // Given `a when b ->`, the left stab operand can be interpreted either
// or a guard binary operator with argument `a` and right operand `b`, // as a single argument item, or as binary operator with arguments on
// we use dynamic precedence to favour the latter // the left and guard expression on the right. Using dynamic precedence
// we favour the latter interpretation during dynamic conflict resolution
prec.dynamic( prec.dynamic(
1, 1,
seq(alias($._stab_clause_arguments, $.arguments), "when", $._expression)
),
_stab_clause_arguments: ($) =>
// TODO this is a variant of _items_with_trailing_separator, cleanup
choice(
seq( seq(
sep1($._stab_clause_arguments_expression, ","), alias($._stab_clause_arguments_without_parentheses, $.arguments),
optional(seq(",", $.keywords)) "when",
), $._expression
$.keywords )
), ),
_stab_clause_arguments_expression: ($) =>
// Note here we use the same precedence as when operator,
// so we get a conflict and resolve it dynamically
prec(PREC.WHEN_OP, $._expression),
body: ($) => body: ($) =>
seq( seq(
choice( optional($._terminator),
seq($._terminator, sep($._expression, $._terminator)), sep1($._expression, $._terminator),
sep1($._expression, $._terminator)
),
optional($._terminator) optional($._terminator)
), ),
@ -832,7 +773,7 @@ module.exports = grammar({
), ),
// A comment may be anywhere, we give it a lower precedence, // A comment may be anywhere, we give it a lower precedence,
// so it doesn't intercept sequences such as interpolation // so it doesn't intercept interpolation
comment: ($) => token(prec(-1, seq("#", /.*/))), comment: ($) => token(prec(-1, seq("#", /.*/))),
}, },
}); });
@ -846,15 +787,14 @@ function sep(rule, separator) {
} }
function unaryOp($, assoc, precedence, operator, right = null) { function unaryOp($, assoc, precedence, operator, right = null) {
return assoc( // Expression such as `x + y` falls under the "expression vs local call"
precedence, // conflict that we already have. By using dynamic precedence we penalize
// TODO clarify, we use lower precedence, so given `x + y`, // unary operator, so `x + y` is interpreted as binary operator (unless
// which can be interpreted as either `x + y` or `x(+y)` // _before_unary_op is tokenized and forces unary operator interpretation)
// we favour the former. The only exception is when return prec.dynamic(
// _before_unary_op matches which forces the latter interpretation -1,
// in case like `x +y` assoc(
prec.dynamic( precedence,
-1,
seq( seq(
optional($._before_unary_op), optional($._before_unary_op),
field("operator", operator), field("operator", operator),
@ -875,7 +815,7 @@ function binaryOp($, assoc, precedence, operator, left = null, right = null) {
); );
} }
function sugarBlock($, start) { function callKeywordBlock($, start) {
return seq( return seq(
start, start,
optional($._terminator), optional($._terminator),
@ -895,8 +835,7 @@ function defineQuoted(start, end, name) {
start, start,
repeat( repeat(
choice( choice(
// TODO rename the extenrals to _content alias($[`_quoted_content_i_${name}`], $.quoted_content),
alias($[`_quoted_content_i_${name}`], $.string_content),
$.interpolation, $.interpolation,
$.escape_sequence $.escape_sequence
) )
@ -909,9 +848,8 @@ function defineQuoted(start, end, name) {
start, start,
repeat( repeat(
choice( choice(
// TODO rename the extenrals to _content alias($[`_quoted_content_${name}`], $.quoted_content),
alias($[`_quoted_content_${name}`], $.string_content), // The end delimiter may always be escaped
// It's always possible to escape the end delimiter
$.escape_sequence $.escape_sequence
) )
), ),

4
package-lock.json generated
View File

@ -1,11 +1,11 @@
{ {
"name": "tree-sitter-elixir", "name": "tree-sitter-elixir",
"version": "1.0.0", "version": "0.19.0",
"lockfileVersion": 2, "lockfileVersion": 2,
"requires": true, "requires": true,
"packages": { "packages": {
"": { "": {
"version": "1.0.0", "version": "0.19.0",
"license": "ISC", "license": "ISC",
"dependencies": { "dependencies": {
"nan": "^2.15.0" "nan": "^2.15.0"

34
scripts/parse_repo.sh Executable file
View File

@ -0,0 +1,34 @@
#!/bin/bash
set -e
cd "$(dirname "$0")/.."
print_usage_and_exit() {
echo "Usage: $0 <github-repo>"
echo ""
echo "Clones the given repository and runs the parser against all Elixir files"
echo ""
echo "## Examples"
echo ""
echo " $0 elixir-lang/elixir"
echo ""
exit 1
}
if [ $# -ne 1 ]; then
print_usage_and_exit
fi
gh_repo="$1"
dir="tmp/gh/${gh_repo//[\/-]/_}"
if [[ ! -d "$dir" ]]; then
mkdir -p "$(dirname "$dir")"
git clone "https://github.com/$gh_repo.git" "$dir"
fi
echo "Running parser against $gh_repo"
npx tree-sitter parse --quiet --stat "$dir/**/*.ex*"
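
The clone directory in the script above comes from Bash pattern substitution in `dir="tmp/gh/${gh_repo//[\/-]/_}"`, which replaces every `/` and `-` in the repository slug with `_`. A minimal sketch of that single step, wrapped in a hypothetical `repo_dir` helper that is not part of the script itself:

```bash
#!/bin/bash
# Hypothetical helper (not in scripts/parse_repo.sh): mirrors how the script
# derives the local clone directory from an "owner/name" repository slug.
repo_dir() {
  gh_repo="$1"
  # ${var//pattern/replacement} replaces every match; [\/-] matches "/" or "-".
  echo "tmp/gh/${gh_repo//[\/-]/_}"
}

repo_dir "elixir-lang/elixir"  # prints tmp/gh/elixir_lang_elixir
```

Flattening the slug this way keeps every clone directly under `tmp/gh/`. Because `-` and `/` both map to `_`, two different slugs could in principle map to the same directory.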

File diff suppressed because it is too large

View File

@ -79,6 +79,10 @@
"type": "nil", "type": "nil",
"named": true "named": true
}, },
{
"type": "quoted_atom",
"named": true
},
{ {
"type": "sigil", "type": "sigil",
"named": true "named": true
@ -186,6 +190,10 @@
"type": "nil", "type": "nil",
"named": true "named": true
}, },
{
"type": "quoted_atom",
"named": true
},
{ {
"type": "sigil", "type": "sigil",
"named": true "named": true
@ -217,11 +225,6 @@
] ]
} }
}, },
{
"type": "alias",
"named": true,
"fields": {}
},
{ {
"type": "anonymous_function", "type": "anonymous_function",
"named": true, "named": true,
@ -321,6 +324,10 @@
"type": "nil", "type": "nil",
"named": true "named": true
}, },
{
"type": "quoted_atom",
"named": true
},
{ {
"type": "sigil", "type": "sigil",
"named": true "named": true
@ -348,38 +355,6 @@
] ]
} }
}, },
{
"type": "atom",
"named": true,
"fields": {},
"children": {
"multiple": true,
"required": false,
"types": [
{
"type": "atom_literal",
"named": true
},
{
"type": "escape_sequence",
"named": true
},
{
"type": "interpolation",
"named": true
},
{
"type": "string_content",
"named": true
}
]
}
},
{
"type": "atom_literal",
"named": true,
"fields": {}
},
{ {
"type": "binary_operator", "type": "binary_operator",
"named": true, "named": true,
@ -464,6 +439,10 @@
"type": "operator_identifier", "type": "operator_identifier",
"named": true "named": true
}, },
{
"type": "quoted_atom",
"named": true
},
{ {
"type": "sigil", "type": "sigil",
"named": true "named": true
@ -756,6 +735,10 @@
"type": "nil", "type": "nil",
"named": true "named": true
}, },
{
"type": "quoted_atom",
"named": true
},
{ {
"type": "sigil", "type": "sigil",
"named": true "named": true
@ -863,6 +846,10 @@
"type": "nil", "type": "nil",
"named": true "named": true
}, },
{
"type": "quoted_atom",
"named": true
},
{ {
"type": "sigil", "type": "sigil",
"named": true "named": true
@ -974,6 +961,10 @@
"type": "nil", "type": "nil",
"named": true "named": true
}, },
{
"type": "quoted_atom",
"named": true
},
{ {
"type": "sigil", "type": "sigil",
"named": true "named": true
@ -1081,6 +1072,10 @@
"type": "nil", "type": "nil",
"named": true "named": true
}, },
{
"type": "quoted_atom",
"named": true
},
{ {
"type": "sigil", "type": "sigil",
"named": true "named": true
@ -1118,7 +1113,7 @@
"fields": {}, "fields": {},
"children": { "children": {
"multiple": true, "multiple": true,
"required": false, "required": true,
"types": [ "types": [
{ {
"type": "access_call", "type": "access_call",
@ -1192,6 +1187,10 @@
"type": "nil", "type": "nil",
"named": true "named": true
}, },
{
"type": "quoted_atom",
"named": true
},
{ {
"type": "sigil", "type": "sigil",
"named": true "named": true
@ -1230,7 +1229,7 @@
"fields": {}, "fields": {},
"children": { "children": {
"multiple": true, "multiple": true,
"required": true, "required": false,
"types": [ "types": [
{ {
"type": "arguments", "type": "arguments",
@ -1343,6 +1342,10 @@
"type": "nil", "type": "nil",
"named": true "named": true
}, },
{
"type": "quoted_atom",
"named": true
},
{ {
"type": "sigil", "type": "sigil",
"named": true "named": true
@ -1391,7 +1394,7 @@
"named": true "named": true
}, },
{ {
"type": "string_content", "type": "quoted_content",
"named": true "named": true
} }
] ]
@ -1489,6 +1492,10 @@
"type": "nil", "type": "nil",
"named": true "named": true
}, },
{
"type": "quoted_atom",
"named": true
},
{ {
"type": "rescue_block", "type": "rescue_block",
"named": true "named": true
@ -1608,6 +1615,10 @@
"type": "operator_identifier", "type": "operator_identifier",
"named": true "named": true
}, },
{
"type": "quoted_atom",
"named": true
},
{ {
"type": "sigil", "type": "sigil",
"named": true "named": true
@ -1715,6 +1726,10 @@
"type": "nil", "type": "nil",
"named": true "named": true
}, },
{
"type": "quoted_atom",
"named": true
},
{ {
"type": "sigil", "type": "sigil",
"named": true "named": true
@ -1831,6 +1846,10 @@
"type": "nil", "type": "nil",
"named": true "named": true
}, },
{
"type": "quoted_atom",
"named": true
},
{ {
"type": "sigil", "type": "sigil",
"named": true "named": true
@ -1858,33 +1877,6 @@
] ]
} }
}, },
{
"type": "keyword",
"named": true,
"fields": {},
"children": {
"multiple": true,
"required": false,
"types": [
{
"type": "atom_literal",
"named": true
},
{
"type": "escape_sequence",
"named": true
},
{
"type": "interpolation",
"named": true
},
{
"type": "string_content",
"named": true
}
]
}
},
{ {
"type": "keywords", "type": "keywords",
"named": true, "named": true,
@ -1984,6 +1976,10 @@
"type": "nil", "type": "nil",
"named": true "named": true
}, },
{
"type": "quoted_atom",
"named": true
},
{ {
"type": "sigil", "type": "sigil",
"named": true "named": true
@ -2114,6 +2110,10 @@
"type": "nil", "type": "nil",
"named": true "named": true
}, },
{
"type": "quoted_atom",
"named": true
},
{ {
"type": "sigil", "type": "sigil",
"named": true "named": true
@ -2235,6 +2235,14 @@
"type": "nil", "type": "nil",
"named": true "named": true
}, },
{
"type": "quoted_atom",
"named": true
},
{
"type": "quoted_keyword",
"named": true
},
{ {
"type": "sigil", "type": "sigil",
"named": true "named": true
@ -2262,6 +2270,52 @@
] ]
} }
}, },
{
"type": "quoted_atom",
"named": true,
"fields": {},
"children": {
"multiple": true,
"required": false,
"types": [
{
"type": "escape_sequence",
"named": true
},
{
"type": "interpolation",
"named": true
},
{
"type": "quoted_content",
"named": true
}
]
}
},
{
"type": "quoted_keyword",
"named": true,
"fields": {},
"children": {
"multiple": true,
"required": false,
"types": [
{
"type": "escape_sequence",
"named": true
},
{
"type": "interpolation",
"named": true
},
{
"type": "quoted_content",
"named": true
}
]
}
},
{ {
"type": "rescue_block", "type": "rescue_block",
"named": true, "named": true,
@ -2342,6 +2396,10 @@
"type": "nil", "type": "nil",
"named": true "named": true
}, },
{
"type": "quoted_atom",
"named": true
},
{ {
"type": "sigil", "type": "sigil",
"named": true "named": true
@ -2389,6 +2447,10 @@
"type": "interpolation", "type": "interpolation",
"named": true "named": true
}, },
{
"type": "quoted_content",
"named": true
},
{ {
"type": "sigil_modifiers", "type": "sigil_modifiers",
"named": true "named": true
@ -2396,10 +2458,6 @@
{ {
"type": "sigil_name", "type": "sigil_name",
"named": true "named": true
},
{
"type": "string_content",
"named": true
} }
] ]
} }
@ -2484,6 +2542,10 @@
"type": "nil", "type": "nil",
"named": true "named": true
}, },
{
"type": "quoted_atom",
"named": true
},
{ {
"type": "sigil", "type": "sigil",
"named": true "named": true
@ -2556,7 +2618,7 @@
"named": true "named": true
}, },
{ {
"type": "string_content", "type": "quoted_content",
"named": true "named": true
} }
] ]
@ -2590,6 +2652,10 @@
"type": "identifier", "type": "identifier",
"named": true "named": true
}, },
{
"type": "quoted_atom",
"named": true
},
{ {
"type": "special_identifier", "type": "special_identifier",
"named": true "named": true
@ -2689,6 +2755,10 @@
"type": "nil", "type": "nil",
"named": true "named": true
}, },
{
"type": "quoted_atom",
"named": true
},
{ {
"type": "sigil", "type": "sigil",
"named": true "named": true
@ -2835,6 +2905,10 @@
"type": "nil", "type": "nil",
"named": true "named": true
}, },
{
"type": "quoted_atom",
"named": true
},
{ {
"type": "sigil", "type": "sigil",
"named": true "named": true
@ -2862,10 +2936,6 @@
] ]
} }
}, },
{
"type": "\n",
"named": false
},
{ {
"type": "!", "type": "!",
"named": false "named": false
@ -2894,10 +2964,6 @@
"type": "%", "type": "%",
"named": false "named": false
}, },
{
"type": "%{}",
"named": false
},
{ {
"type": "&", "type": "&",
"named": false "named": false
@ -2978,10 +3044,6 @@
"type": "...", "type": "...",
"named": false "named": false
}, },
{
"type": "..//",
"named": false
},
{ {
"type": "/", "type": "/",
"named": false "named": false
@ -3014,10 +3076,6 @@
"type": "<<<", "type": "<<<",
"named": false "named": false
}, },
{
"type": "<<>>",
"named": false
},
{ {
"type": "<<~", "type": "<<~",
"named": false "named": false
@ -3130,10 +3188,18 @@
"type": "after", "type": "after",
"named": false "named": false
}, },
{
"type": "alias",
"named": true
},
{ {
"type": "and", "type": "and",
"named": false "named": false
}, },
{
"type": "atom",
"named": true
},
{ {
"type": "catch", "type": "catch",
"named": false "named": false
@ -3182,6 +3248,10 @@
"type": "integer", "type": "integer",
"named": true "named": true
}, },
{
"type": "keyword",
"named": true
},
{ {
"type": "nil", "type": "nil",
"named": false "named": false
@ -3194,6 +3264,10 @@
"type": "or", "type": "or",
"named": false "named": false
}, },
{
"type": "quoted_content",
"named": true
},
{ {
"type": "rescue", "type": "rescue",
"named": false "named": false
@ -3206,10 +3280,6 @@
"type": "sigil_name", "type": "sigil_name",
"named": true "named": true
}, },
{
"type": "string_content",
"named": true
},
{ {
"type": "true", "type": "true",
"named": false "named": false
@ -3226,10 +3296,6 @@
"type": "{", "type": "{",
"named": false "named": false
}, },
{
"type": "{}",
"named": false
},
{ {
"type": "|", "type": "|",
"named": false "named": false

713836
src/parser.c

File diff suppressed because it is too large

File diff suppressed because it is too large

View File

@ -91,7 +91,7 @@ does not match inside a string
(source (source
(string (string
(string_content)) (quoted_content))
(string (string
(string_content) (quoted_content)
(interpolation (identifier)))) (interpolation (identifier))))

View File

@ -223,8 +223,7 @@ end
(identifier) (identifier)
(integer)) (integer))
(body (body
(atom (atom))))))
(atom_literal)))))))
===================================== =====================================
stab clause / arguments in parentheses stab clause / arguments in parentheses
@ -245,8 +244,7 @@ end
(identifier) (identifier)
(identifier)) (identifier))
(body (body
(atom (atom))))))
(atom_literal)))))))
===================================== =====================================
stab clause / many clauses stab clause / many clauses
@ -268,20 +266,17 @@ end
(arguments (arguments
(integer)) (integer))
(body (body
(atom (atom)))
(atom_literal))))
(stab_clause (stab_clause
(arguments (arguments
(integer)) (integer))
(body (body
(atom (atom)))
(atom_literal))))
(stab_clause (stab_clause
(arguments (arguments
(identifier)) (identifier))
(body (body
(atom (atom))))))
(atom_literal)))))))
===================================== =====================================
stab clause / multiline expression stab clause / multiline expression
@ -327,8 +322,7 @@ end
(call (call
(identifier) (identifier)
(arguments)) (arguments))
(atom (atom)))
(atom_literal))))
(body (body
(boolean)))))) (boolean))))))
@ -773,8 +767,7 @@ end
(arguments (arguments
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(integer)))) (integer))))
(body (body
(identifier))))))) (identifier)))))))

View File

@ -114,6 +114,5 @@ def Mod.fun(x), do: 1
(identifier))) (identifier)))
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(integer)))))) (integer))))))

View File

@ -142,20 +142,17 @@ end
(arguments (arguments
(integer)) (integer))
(body (body
(atom (atom)))
(atom_literal))))
(stab_clause (stab_clause
(arguments (arguments
(integer)) (integer))
(body (body
(atom (atom)))
(atom_literal))))
(stab_clause (stab_clause
(arguments (arguments
(identifier)) (identifier))
(body (body
(atom (atom)))))
(atom_literal))))))
===================================== =====================================
with guard / no arguments with guard / no arguments
@ -176,8 +173,7 @@ end
(call (call
(identifier) (identifier)
(arguments)) (arguments))
(atom (atom)))
(atom_literal))))
(body (body
(boolean))))) (boolean)))))
@ -305,8 +301,7 @@ end
(map_content (map_content
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(identifier)))))) (identifier))))))
(binary_operator (binary_operator
(identifier) (identifier)

View File

@ -33,12 +33,10 @@ fun([1, 2], option: true, other: 5)
(integer)) (integer))
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(boolean)) (boolean))
(pair (pair
(keyword (keyword)
(atom_literal))
(integer)))))) (integer))))))
===================================== =====================================
@ -69,20 +67,17 @@ fun +: 1
(integer)) (integer))
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(boolean)) (boolean))
(pair (pair
(keyword (keyword)
(atom_literal))
(integer))))) (integer)))))
(call (call
(identifier) (identifier)
(arguments (arguments
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(integer)))))) (integer))))))
===================================== =====================================
@ -104,12 +99,10 @@ fun [1, 2],
(integer)) (integer))
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(boolean)) (boolean))
(pair (pair
(keyword (keyword)
(atom_literal))
(integer)))))) (integer))))))
===================================== =====================================
@ -155,8 +148,7 @@ outer_fun inner_fun do: 1
(arguments (arguments
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(integer)))))))) (integer))))))))
===================================== =====================================
@ -270,12 +262,10 @@ Mod.fun([1, 2], option: true, other: 5)
(integer)) (integer))
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(boolean)) (boolean))
(pair (pair
(keyword (keyword)
(atom_literal))
(integer)))))) (integer))))))
===================================== =====================================
@ -304,12 +294,10 @@ Mod.fun [1, 2], option: true, other: 5
(integer)) (integer))
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(boolean)) (boolean))
(pair (pair
(keyword (keyword)
(atom_literal))
(integer)))))) (integer))))))
===================================== =====================================
@ -459,14 +447,14 @@ Mod.'fun'(a)
(dot (dot
(alias) (alias)
(string (string
(string_content))) (quoted_content)))
(arguments (arguments
(identifier))) (identifier)))
(call (call
(dot (dot
(alias) (alias)
(charlist (charlist
(string_content))) (quoted_content)))
(arguments (arguments
(identifier)))) (identifier))))
@ -482,15 +470,14 @@ remote call / atom literal module
(source (source
(call (call
(dot (dot
(atom (atom)
(atom_literal))
(identifier)) (identifier))
(arguments (arguments
(identifier))) (identifier)))
(call (call
(dot (dot
(atom (quoted_atom
(string_content)) (quoted_content))
(identifier)) (identifier))
(arguments (arguments
(identifier)))) (identifier))))
@ -533,12 +520,10 @@ fun.([1, 2], option: true, other: 5)
(integer)) (integer))
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(boolean)) (boolean))
(pair (pair
(keyword (keyword)
(atom_literal))
(integer)))))) (integer))))))
===================================== =====================================
@ -755,12 +740,10 @@ fun(option: true, other: 5,)
(arguments (arguments
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(boolean)) (boolean))
(pair (pair
(keyword (keyword)
(atom_literal))
(integer)))))) (integer))))))
===================================== =====================================
@ -812,8 +795,7 @@ map[:key]
(identifier)) (identifier))
(access_call (access_call
(identifier) (identifier)
(atom (atom)))
(atom_literal))))
===================================== =====================================
access syntax / does not allow whitespace access syntax / does not allow whitespace
@ -845,14 +827,12 @@ map[:mod].fun
(dot (dot
(identifier) (identifier)
(identifier))) (identifier)))
(atom (atom))
(atom_literal)))
(call (call
(dot (dot
(access_call (access_call
(identifier) (identifier)
(atom (atom))
(atom_literal)))
(identifier)))) (identifier))))
===================================== =====================================
@ -870,23 +850,19 @@ access syntax / precedence with operators
(unary_operator (unary_operator
(access_call (access_call
(identifier) (identifier)
(atom (atom)))
(atom_literal))))
(access_call (access_call
(unary_operator (unary_operator
(identifier)) (identifier))
(atom (atom))
(atom_literal)))
(unary_operator (unary_operator
(access_call (access_call
(identifier) (identifier)
(atom (atom)))
(atom_literal))))
(access_call (access_call
(unary_operator (unary_operator
(integer)) (integer))
(atom (atom)))
(atom_literal))))
===================================== =====================================
double parenthesised call double parenthesised call

View File

@ -589,15 +589,11 @@ not in[y]
(list (list
(identifier))) (identifier)))
(binary_operator (binary_operator
(atom (atom)
(atom_literal)) (atom))
(atom
(atom_literal)))
(binary_operator (binary_operator
(atom (atom)
(atom_literal)) (atom)))
(atom
(atom_literal))))
===================================== =====================================
multiline / unary over binary (precedence) multiline / unary over binary (precedence)

View File

@ -14,14 +14,14 @@ simple literal
--- ---
(source (source
(sigil (sigil_name) (string_content)) (sigil (sigil_name) (quoted_content))
(sigil (sigil_name) (string_content)) (sigil (sigil_name) (quoted_content))
(sigil (sigil_name) (string_content)) (sigil (sigil_name) (quoted_content))
(sigil (sigil_name) (string_content)) (sigil (sigil_name) (quoted_content))
(sigil (sigil_name) (string_content)) (sigil (sigil_name) (quoted_content))
(sigil (sigil_name) (string_content)) (sigil (sigil_name) (quoted_content))
(sigil (sigil_name) (string_content)) (sigil (sigil_name) (quoted_content))
(sigil (sigil_name) (string_content))) (sigil (sigil_name) (quoted_content)))
===================================== =====================================
@ -36,7 +36,7 @@ line 2"
(source (source
(sigil (sigil
(sigil_name) (sigil_name)
(string_content))) (quoted_content)))
===================================== =====================================
interpolation interpolation
@ -53,22 +53,22 @@ interpolation
(source (source
(sigil (sigil
(sigil_name) (sigil_name)
(string_content) (quoted_content)
(interpolation (interpolation
(identifier)) (identifier))
(string_content)) (quoted_content))
(sigil (sigil
(sigil_name) (sigil_name)
(string_content) (quoted_content)
(interpolation (interpolation
(identifier)) (identifier))
(string_content)) (quoted_content))
(sigil (sigil
(sigil_name) (sigil_name)
(string_content) (quoted_content)
(interpolation (interpolation
(identifier)) (identifier))
(string_content))) (quoted_content)))
===================================== =====================================
nested interpolation nested interpolation
@ -81,14 +81,14 @@ nested interpolation
(source (source
(sigil (sigil
(sigil_name) (sigil_name)
(string_content) (quoted_content)
(interpolation (interpolation
(sigil (sigil
(sigil_name) (sigil_name)
(string_content) (quoted_content)
(interpolation (interpolation
(integer)))) (integer))))
(string_content))) (quoted_content)))
===================================== =====================================
escape sequence escape sequence
@ -101,26 +101,26 @@ escape sequence
(source (source
(sigil (sigil
(sigil_name) (sigil_name)
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(escape_sequence) (escape_sequence)
(string_content))) (quoted_content)))
===================================== =====================================
escaped interpolation escaped interpolation
@ -134,7 +134,7 @@ escaped interpolation
(sigil (sigil
(sigil_name) (sigil_name)
(escape_sequence) (escape_sequence)
(string_content))) (quoted_content)))
===================================== =====================================
upper sigil / no interpolation upper sigil / no interpolation
@ -147,7 +147,7 @@ upper sigil / no interpolation
(source (source
(sigil (sigil
(sigil_name) (sigil_name)
(string_content))) (quoted_content)))
===================================== =====================================
upper sigil / no escape sequence upper sigil / no escape sequence
@ -160,7 +160,7 @@ upper sigil / no escape sequence
(source (source
(sigil (sigil
(sigil_name) (sigil_name)
(string_content))) (quoted_content)))
===================================== =====================================
upper sigil / escape terminator upper sigil / escape terminator
@ -175,19 +175,19 @@ upper sigil / escape terminator
(source (source
(sigil (sigil
(sigil_name) (sigil_name)
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(string_content)) (quoted_content))
(sigil (sigil
(sigil_name) (sigil_name)
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(string_content)) (quoted_content))
(sigil (sigil
(sigil_name) (sigil_name)
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(string_content))) (quoted_content)))
===================================== =====================================
heredoc delimiter heredoc delimiter
@ -208,10 +208,10 @@ with 'quotes'
(source (source
(sigil (sigil
(sigil_name) (sigil_name)
(string_content)) (quoted_content))
(sigil (sigil
(sigil_name) (sigil_name)
(string_content))) (quoted_content)))
===================================== =====================================
modifiers modifiers
@ -225,11 +225,11 @@ modifiers
(source (source
(sigil (sigil
(sigil_name) (sigil_name)
(string_content) (quoted_content)
(sigil_modifiers)) (sigil_modifiers))
(sigil (sigil
(sigil_name) (sigil_name)
(string_content) (quoted_content)
(sigil_modifiers))) (sigil_modifiers)))
===================================== =====================================
@ -244,4 +244,4 @@ modifiers
(sigil (sigil
(sigil_name) (sigil_name)
(ERROR) (ERROR)
(string_content))) (quoted_content)))


@ -184,8 +184,7 @@ def fun(x), do: x
(arguments)) (arguments))
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(integer))))) (integer)))))
(call (call
(identifier) (identifier)
@ -196,8 +195,7 @@ def fun(x), do: x
(identifier))) (identifier)))
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(identifier)))))) (identifier))))))
===================================== =====================================


@ -17,8 +17,7 @@ for n <- [1, 2], do: n * 2
(integer))) (integer)))
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(binary_operator (binary_operator
(identifier) (identifier)
(integer))))))) (integer)))))))
@ -46,8 +45,7 @@ end
(arguments))) (arguments)))
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(call (call
(dot (dot
(alias) (alias)
@ -77,18 +75,16 @@ for <<c <- " hello world ">>, c != ?\s, into: "", do: <<c>>
(binary_operator (binary_operator
(identifier) (identifier)
(string (string
(string_content)))) (quoted_content))))
(binary_operator (binary_operator
(identifier) (identifier)
(char)) (char))
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(string)) (string))
(pair (pair
(keyword (keyword)
(atom_literal))
(bitstring (bitstring
(identifier))))))) (identifier)))))))
@ -114,8 +110,7 @@ end
(integer))) (integer)))
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(map)))) (map))))
(do_block (do_block
(stab_clause (stab_clause


@ -35,8 +35,7 @@ end
(call (call
(identifier) (identifier)
(arguments (arguments
(atom (atom))
(atom_literal)))
(do_block))) (do_block)))
===================================== =====================================
@ -76,7 +75,7 @@ end
(identifier) (identifier)
(arguments (arguments
(string (string
(string_content))))) (quoted_content)))))
(call (call
(identifier) (identifier)
(arguments (arguments
@ -91,7 +90,7 @@ end
(identifier) (identifier)
(arguments (arguments
(string (string
(string_content))))) (quoted_content)))))
(unary_operator (unary_operator
(call (call
(identifier) (identifier)
@ -133,8 +132,7 @@ end
(identifier))) (identifier)))
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(binary_operator (binary_operator
(identifier) (identifier)
(identifier))))))))) (identifier)))))))))


@ -71,17 +71,14 @@ with literals
(map_content (map_content
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(identifier))))))) (identifier)))))))
(binary_operator (binary_operator
(tuple (tuple
(atom (atom)
(atom_literal))
(identifier)) (identifier))
(tuple (tuple
(atom (atom)
(atom_literal))
(identifier)))))))) (identifier))))))))
===================================== =====================================
@ -166,12 +163,10 @@ with type guard
(identifier))) (identifier)))
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(identifier)) (identifier))
(pair (pair
(keyword (keyword)
(atom_literal))
(identifier)))))))) (identifier))))))))
===================================== =====================================
@ -245,7 +240,6 @@ nonempty list
(identifier)) (identifier))
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(identifier))))))) (identifier)))))))
(ERROR)) (ERROR))


@ -82,17 +82,3 @@ __MODULE__.Child
(dot (dot
(special_identifier) (special_identifier)
(alias))) (alias)))
=====================================
[error] does not support characters outside ASCII
=====================================
Ólá
Olá
---
(source
(ERROR
(atom_literal)
(atom_literal)))


@ -11,16 +11,11 @@ simple literal
--- ---
(source (source
(atom (atom)
(atom_literal)) (atom)
(atom (atom)
(atom_literal)) (atom)
(atom (atom))
(atom_literal))
(atom
(atom_literal))
(atom
(atom_literal)))
===================================== =====================================
operators operators
@ -31,7 +26,7 @@ operators
--- ---
(source (source
(list (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)))) (list (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom)))
===================================== =====================================
special operator-like atoms special operator-like atoms
@ -43,18 +38,12 @@ special operator-like atoms
(source (source
(list (list
(atom (atom)
(atom_literal)) (atom)
(atom (atom)
(atom_literal)) (atom)
(atom (atom)
(atom_literal)) (atom)))
(atom
(atom_literal))
(atom
(atom_literal))
(atom
(atom_literal))))
===================================== =====================================
quoted atom quoted atom
@ -66,11 +55,11 @@ quoted atom
--- ---
(source (source
(atom (quoted_atom
(string_content) (quoted_content)
(escape_sequence)) (escape_sequence))
(atom (quoted_atom
(string_content) (quoted_content)
(escape_sequence))) (escape_sequence)))
===================================== =====================================
@ -83,13 +72,13 @@ interpolation
--- ---
(source (source
(atom (quoted_atom
(string_content) (quoted_content)
(interpolation (interpolation
(identifier)) (identifier))
(string_content)) (quoted_content))
(atom (quoted_atom
(string_content) (quoted_content)
(interpolation (interpolation
(identifier)) (identifier))
(string_content))) (quoted_content)))


@ -17,7 +17,7 @@ single item
(float)) (float))
(bitstring (bitstring
(string (string
(string_content)))) (quoted_content))))
===================================== =====================================
multiple items multiple items
@ -36,7 +36,7 @@ multiple items
(integer) (integer)
(float) (float)
(string (string
(string_content)))) (quoted_content))))
===================================== =====================================
size modifiers size modifiers
@ -77,21 +77,21 @@ multiple modifiers
(bitstring (bitstring
(binary_operator (binary_operator
(string (string
(string_content)) (quoted_content))
(binary_operator (binary_operator
(identifier) (identifier)
(identifier)))) (identifier))))
(bitstring (bitstring
(binary_operator (binary_operator
(string (string
(string_content)) (quoted_content))
(binary_operator (binary_operator
(identifier) (identifier)
(identifier)))) (identifier))))
(bitstring (bitstring
(binary_operator (binary_operator
(string (string
(string_content)) (quoted_content))
(binary_operator (binary_operator
(identifier) (identifier)
(identifier)))) (identifier))))
@ -136,7 +136,7 @@ multiple components with modifiers
(integer) (integer)
(identifier))) (identifier)))
(string (string
(string_content)) (quoted_content))
(binary_operator (binary_operator
(float) (float)
(identifier)) (identifier))


@ -8,7 +8,7 @@ single line
(source (source
(charlist (charlist
(string_content))) (quoted_content)))
===================================== =====================================
multiple lines multiple lines
@ -21,7 +21,7 @@ line 2'
(source (source
(charlist (charlist
(string_content))) (quoted_content)))
===================================== =====================================
interpolation interpolation
@ -37,20 +37,20 @@ interpolation
(source (source
(charlist (charlist
(string_content) (quoted_content)
(interpolation (interpolation
(identifier)) (identifier))
(string_content)) (quoted_content))
(charlist (charlist
(string_content) (quoted_content)
(interpolation (interpolation
(identifier)) (identifier))
(string_content)) (quoted_content))
(charlist (charlist
(string_content) (quoted_content)
(interpolation (interpolation
(identifier)) (identifier))
(string_content))) (quoted_content)))
===================================== =====================================
nested interpolation nested interpolation
@ -62,13 +62,13 @@ nested interpolation
(source (source
(charlist (charlist
(string_content) (quoted_content)
(interpolation (interpolation
(charlist (charlist
(string_content) (quoted_content)
(interpolation (interpolation
(integer)))) (integer))))
(string_content))) (quoted_content)))
===================================== =====================================
escape sequence escape sequence
@ -80,26 +80,26 @@ escape sequence
(source (source
(charlist (charlist
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(escape_sequence) (escape_sequence)
(string_content))) (quoted_content)))
===================================== =====================================
escaped interpolation escaped interpolation
@ -112,7 +112,7 @@ escaped interpolation
(source (source
(charlist (charlist
(escape_sequence) (escape_sequence)
(string_content))) (quoted_content)))
===================================== =====================================
heredoc / charlist heredoc / charlist
@ -127,7 +127,7 @@ with 'quotes'
(source (source
(charlist (charlist
(string_content))) (quoted_content)))
===================================== =====================================
heredoc / interpolation heredoc / interpolation
@ -141,10 +141,10 @@ hey #{name}!
(source (source
(charlist (charlist
(string_content) (quoted_content)
(interpolation (interpolation
(identifier)) (identifier))
(string_content))) (quoted_content)))
===================================== =====================================
heredoc / nested interpolation heredoc / nested interpolation
@ -162,14 +162,14 @@ this is #{
(source (source
(charlist (charlist
(string_content) (quoted_content)
(interpolation (interpolation
(charlist (charlist
(string_content) (quoted_content)
(interpolation (interpolation
(integer)) (integer))
(string_content))) (quoted_content)))
(string_content))) (quoted_content)))
===================================== =====================================
heredoc / escaped delimiter heredoc / escaped delimiter
@ -187,15 +187,15 @@ heredoc / escaped delimiter
(source (source
(charlist (charlist
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(string_content)) (quoted_content))
(charlist (charlist
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(escape_sequence) (escape_sequence)
(escape_sequence) (escape_sequence)
(string_content))) (quoted_content)))
===================================== =====================================
heredoc / escaped interpolation heredoc / escaped interpolation
@ -209,6 +209,6 @@ heredoc / escaped interpolation
(source (source
(charlist (charlist
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(string_content))) (quoted_content)))


@ -10,24 +10,19 @@ simple literal
(list (list
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(integer)) (integer))
(pair (pair
(keyword (keyword)
(atom_literal))
(integer)) (integer))
(pair (pair
(keyword (keyword)
(atom_literal))
(integer)) (integer))
(pair (pair
(keyword (keyword)
(atom_literal))
(integer)) (integer))
(pair (pair
(keyword (keyword)
(atom_literal))
(integer))))) (integer)))))
===================================== =====================================
@ -42,8 +37,7 @@ trailing separator
(list (list
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(integer))))) (integer)))))
===================================== =====================================
@ -58,17 +52,14 @@ with leading items
(list (list
(integer) (integer)
(tuple (tuple
(atom (atom)
(atom_literal))
(integer)) (integer))
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(integer)) (integer))
(pair (pair
(keyword (keyword)
(atom_literal))
(integer))))) (integer)))))
===================================== =====================================
@ -83,16 +74,13 @@ operator key
(list (list
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(integer)) (integer))
(pair (pair
(keyword (keyword)
(atom_literal))
(integer)) (integer))
(pair (pair
(keyword (keyword)
(atom_literal))
(integer))))) (integer)))))
===================================== =====================================
@ -107,28 +95,22 @@ special atom key
(list (list
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(integer)) (integer))
(pair (pair
(keyword (keyword)
(atom_literal))
(integer)) (integer))
(pair (pair
(keyword (keyword)
(atom_literal))
(integer)) (integer))
(pair (pair
(keyword (keyword)
(atom_literal))
(integer)) (integer))
(pair (pair
(keyword (keyword)
(atom_literal))
(integer)) (integer))
(pair (pair
(keyword (keyword)
(atom_literal))
(integer))))) (integer)))))
===================================== =====================================
@ -144,22 +126,18 @@ reserved token key
(list (list
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(integer)) (integer))
(pair (pair
(keyword (keyword)
(atom_literal))
(integer)))) (integer))))
(list (list
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(integer)) (integer))
(pair (pair
(keyword (keyword)
(atom_literal))
(integer))))) (integer)))))
===================================== =====================================
@ -177,13 +155,13 @@ quoted key
(list (list
(keywords (keywords
(pair (pair
(keyword (quoted_keyword
(string_content) (quoted_content)
(escape_sequence)) (escape_sequence))
(integer)) (integer))
(pair (pair
(keyword (quoted_keyword
(string_content) (quoted_content)
(escape_sequence)) (escape_sequence))
(integer))))) (integer)))))
@ -202,18 +180,18 @@ key interpolation
(list (list
(keywords (keywords
(pair (pair
(keyword (quoted_keyword
(string_content) (quoted_content)
(interpolation (interpolation
(identifier)) (identifier))
(string_content)) (quoted_content))
(integer)) (integer))
(pair (pair
(keyword (quoted_keyword
(string_content) (quoted_content)
(interpolation (interpolation
(identifier)) (identifier))
(string_content)) (quoted_content))
(integer))))) (integer)))))
===================================== =====================================
@ -226,15 +204,14 @@ key interpolation
(source (source
(list (list
(keywords
(pair
(keyword
(atom_literal))
(integer))
(pair
(keyword
(atom_literal))
(integer)))
(ERROR (ERROR
(keywords
(pair
(keyword)
(integer))
(pair
(keyword)
(integer))))
(binary_operator
(integer) (integer)
(integer)))) (integer))))


@ -22,12 +22,10 @@ from keywords
(map_content (map_content
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(integer)) (integer))
(pair (pair
(keyword (keyword)
(atom_literal))
(integer)))))) (integer))))))
===================================== =====================================
@ -42,12 +40,11 @@ from arrow entries
(map (map
(map_content (map_content
(binary_operator (binary_operator
(atom (atom)
(atom_literal))
(integer)) (integer))
(binary_operator (binary_operator
(string (string
(string_content)) (quoted_content))
(integer)) (integer))
(binary_operator (binary_operator
(identifier) (identifier)
@ -66,16 +63,14 @@ from both arrow entries and keywords
(map_content (map_content
(binary_operator (binary_operator
(string (string
(string_content)) (quoted_content))
(integer)) (integer))
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(integer)) (integer))
(pair (pair
(keyword (keyword)
(atom_literal))
(integer)))))) (integer))))))
===================================== =====================================
@ -91,7 +86,7 @@ trailing separator
(map_content (map_content
(binary_operator (binary_operator
(string (string
(string_content)) (quoted_content))
(integer))))) (integer)))))
===================================== =====================================
@ -110,24 +105,22 @@ update syntax
(identifier) (identifier)
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(string (string
(string_content))) (quoted_content)))
(pair (pair
(keyword (keyword)
(atom_literal))
(string (string
(string_content))))))) (quoted_content)))))))
(map (map
(map_content (map_content
(binary_operator (binary_operator
(identifier) (identifier)
(binary_operator (binary_operator
(string (string
(string_content)) (quoted_content))
(string (string
(string_content))))))) (quoted_content)))))))
===================================== =====================================
[error] ordering [error] ordering
@ -139,19 +132,18 @@ update syntax
(source (source
(map (map
(map_content (ERROR
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(integer)) (integer))
(pair (pair
(keyword (keyword)
(atom_literal))
(integer)))) (integer))))
(ERROR (map_content
(integer) (binary_operator
(integer)))) (integer)
(integer)))))
===================================== =====================================
[error] missing separator [error] missing separator
@ -166,9 +158,9 @@ update syntax
(map_content (map_content
(binary_operator (binary_operator
(string (string
(string_content)) (quoted_content))
(ERROR (integer)) (ERROR (integer))
(binary_operator (binary_operator
(string (string
(string_content)) (quoted_content))
(integer)))))) (integer))))))


@ -8,7 +8,7 @@ single line
(source (source
(string (string
(string_content))) (quoted_content)))
===================================== =====================================
multiple lines multiple lines
@ -21,7 +21,7 @@ line 2"
(source (source
(string (string
(string_content))) (quoted_content)))
===================================== =====================================
interpolation interpolation
@ -37,20 +37,20 @@ interpolation
(source (source
(string (string
(string_content) (quoted_content)
(interpolation (interpolation
(identifier)) (identifier))
(string_content)) (quoted_content))
(string (string
(string_content) (quoted_content)
(interpolation (interpolation
(identifier)) (identifier))
(string_content)) (quoted_content))
(string (string
(string_content) (quoted_content)
(interpolation (interpolation
(identifier)) (identifier))
(string_content))) (quoted_content)))
===================================== =====================================
nested interpolation nested interpolation
@ -62,13 +62,13 @@ nested interpolation
(source (source
(string (string
(string_content) (quoted_content)
(interpolation (interpolation
(string (string
(string_content) (quoted_content)
(interpolation (interpolation
(integer)))) (integer))))
(string_content))) (quoted_content)))
===================================== =====================================
escape sequence escape sequence
@ -80,26 +80,26 @@ escape sequence
(source (source
(string (string
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(escape_sequence) (escape_sequence)
(string_content))) (quoted_content)))
===================================== =====================================
escaped interpolation escaped interpolation
@ -112,7 +112,7 @@ escaped interpolation
(source (source
(string (string
(escape_sequence) (escape_sequence)
(string_content))) (quoted_content)))
===================================== =====================================
heredoc / string heredoc / string
@ -127,7 +127,7 @@ with "quotes"
(source (source
(string (string
(string_content))) (quoted_content)))
===================================== =====================================
heredoc / interpolation heredoc / interpolation
@ -141,10 +141,10 @@ hey #{name}!
(source (source
(string (string
(string_content) (quoted_content)
(interpolation (interpolation
(identifier)) (identifier))
(string_content))) (quoted_content)))
===================================== =====================================
heredoc / nested interpolation heredoc / nested interpolation
@ -162,14 +162,14 @@ this is #{
(source (source
(string (string
(string_content) (quoted_content)
(interpolation (interpolation
(string (string
(string_content) (quoted_content)
(interpolation (interpolation
(integer)) (integer))
(string_content))) (quoted_content)))
(string_content))) (quoted_content)))
===================================== =====================================
heredoc / escaped delimiter heredoc / escaped delimiter
@ -187,15 +187,15 @@ heredoc / escaped delimiter
(source (source
(string (string
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(string_content)) (quoted_content))
(string (string
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(escape_sequence) (escape_sequence)
(escape_sequence) (escape_sequence)
(string_content))) (quoted_content)))
===================================== =====================================
heredoc / escaped interpolation heredoc / escaped interpolation
@ -209,18 +209,6 @@ heredoc / escaped interpolation
(source (source
(string (string
(string_content) (quoted_content)
(escape_sequence) (escape_sequence)
(string_content))) (quoted_content)))
=====================================
[error] heredoc / no whitespace
=====================================
"""s"""
---
(source
(ERROR
(identifier)))


@ -26,12 +26,10 @@ from keywords
(map_content (map_content
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(integer)) (integer))
(pair (pair
(keyword (keyword)
(atom_literal))
(integer)))))) (integer))))))
===================================== =====================================
@ -48,12 +46,11 @@ from arrow entries
(alias)) (alias))
(map_content (map_content
(binary_operator (binary_operator
(atom (atom)
(atom_literal))
(integer)) (integer))
(binary_operator (binary_operator
(string (string
(string_content)) (quoted_content))
(integer)) (integer))
(binary_operator (binary_operator
(identifier) (identifier)
@ -74,16 +71,14 @@ from both arrow entries and keywords
(map_content (map_content
(binary_operator (binary_operator
(string (string
(string_content)) (quoted_content))
(integer)) (integer))
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(integer)) (integer))
(pair (pair
(keyword (keyword)
(atom_literal))
(integer)))))) (integer))))))
===================================== =====================================
@ -101,7 +96,7 @@ trailing separator
(map_content (map_content
(binary_operator (binary_operator
(string (string
(string_content)) (quoted_content))
(integer))))) (integer)))))
===================================== =====================================
@ -121,15 +116,13 @@ update syntax
(identifier) (identifier)
(keywords (keywords
(pair (pair
(keyword (keyword)
(atom_literal))
(string (string
(string_content))) (quoted_content)))
(pair (pair
(keyword (keyword)
(atom_literal))
(string (string
(string_content))))))) (quoted_content)))))))
(map (map
(struct (struct
(alias)) (alias))
@ -138,9 +131,9 @@ update syntax
(identifier) (identifier)
(binary_operator (binary_operator
(string (string
(string_content)) (quoted_content))
(string (string
(string_content))))))) (quoted_content)))))))
===================================== =====================================
unused struct identifier unused struct identifier
@ -212,8 +205,8 @@ with atom
(source (source
(map (map
(struct (struct
(atom (quoted_atom
(string_content))))) (quoted_content)))))
===================================== =====================================
with call with call


@ -13,20 +13,15 @@ atom
--- ---
(source (source
(atom (atom)
(atom_literal)) (quoted_atom
(atom (quoted_content))
(string_content)) (quoted_atom
(atom (quoted_content))
(string_content)) (atom)
(atom (atom)
(atom_literal)) (atom)
(atom (atom))
(atom_literal))
(atom
(atom_literal))
(atom
(atom_literal)))
===================================== =====================================
string string
@ -43,17 +38,17 @@ string
(source (source
(string (string
(string_content)) (quoted_content))
(string (string
(string_content)) (quoted_content))
(string (string
(string_content)) (quoted_content))
(string (string
(string_content)) (quoted_content))
(string (string
(string_content)) (quoted_content))
(string (string
(string_content))) (quoted_content)))
===================================== =====================================
charlist charlist
@ -69,17 +64,17 @@ charlist
(source (source
(charlist (charlist
(string_content)) (quoted_content))
(charlist (charlist
(string_content)) (quoted_content))
(charlist (charlist
(string_content)) (quoted_content))
(charlist (charlist
(string_content)) (quoted_content))
(charlist (charlist
(string_content)) (quoted_content))
(charlist (charlist
(string_content))) (quoted_content)))
===================================== =====================================
char char