Cleanup and documentation

2021-09-28 16:00:35 +02:00 · 2021-09-28 16:00:35 +02:00 · d229dddf66
commit d229dddf66
parent dad92d2b87
30 changed files with 330857 additions and 388993 deletions
--- a/.gitignore
+++ b/.gitignore
@ -5,4 +5,6 @@
 /tmp/

 # Temporary files generated by Tree-sitter
+/build/
 log.html
+tree-sitter-elixir.wasm
--- a/README.md
+++ b/README.md
@ -0,0 +1,9 @@
+# tree-sitter-elixir
+
+[![Test](https://github.com/jonatanklosko/tree-sitter-elixir/actions/workflows/test.yml/badge.svg)](https://github.com/jonatanklosko/tree-sitter-elixir/actions/workflows/test.yml)
+
+Elixir grammar for [tree-sitter](https://github.com/tree-sitter/tree-sitter).
+
+## Development
+
+See [the docs](./docs.md) for development notes.
--- a/docs.md
+++ b/docs.md
@ -0,0 +1,271 @@
+# Development notes
+
+## Acknowledgements
+
+While this parser is written from scratch, there were previous efforts that made
+for a helpful reference:
+
+* [tree-sitter-elixir](https://github.com/ananthakumaran/tree-sitter-elixir) developed
+  by [@ananthakumaran](https://github.com/ananthakumaran)
+* [tree-sitter-elixir](https://github.com/wingyplus/tree-sitter-elixir) developed
+  by [@wingyplus](https://github.com/wingyplus) and [@Tuxified](https://github.com/Tuxified)
+
+## The AST
+
+When it comes to the AST, Elixir is a rather specific language due to its macro system.
+From the perspective of our parser, the important implication is that seemingly
+invalid code can be valid syntax when used in a macro (or just put in a `quote`
+expression). For example:
+
+```elixir
+quote do
+  def Bar.foo(x), definitely_not_do: 1
+  %{a}
+  */2
+end
+```
+
+As opposed to other languages, core constructs like `def`, `if` and `for` are not
+particularly special either, since they are themselves regular functions (or macros rather).
+Consequently they can be used "improperly" in a quoted expression, as shown above.
+
+Therefore, to correctly parse all Elixir code, we need the AST to closely match
+the Elixir AST. See [Elixir / Syntax reference](https://hexdocs.pm/elixir/syntax-reference.html)
+for more details.
+
+Whenever possible, we try to use more specific nodes (like binary/unary operator),
+but only to the extent that it doesn't sacrifice generality. To get a sense of what the AST looks
+like, have a look at the tests in `test/corpus/`.
+
+## Getting started with Tree-sitter
+
+For detailed documentation, see the official guide on [Creating parsers](https://tree-sitter.github.io/tree-sitter/creating-parsers).
+
+Essentially, we define relevant language rules in `grammar.js`, based on which
+Tree-sitter generates parser code (under `src/`). In some cases, we want to write
+custom C++ code for tokenizing specific character sequences (in `src/scanner.cc`).
+
+The grammar rules may often conflict with each other, meaning that the given
+sequence of tokens has multiple valid interpretations given one _token_ of lookahead.
+In many conflicts we always want to pick one interpretation over the other and we can
+do this by assigning different precedence and associativity to relevant rules, which
+tells the parser which way to go.
+
+For example, given `expression1 * expression2 • *`, the next token we _see_ ahead is `*`.
+The parser needs to decide whether `expression1 * expression2` is a complete binary operator
+node, or if it should await the next expression and interpret it as `expression1 * (expression2 * expression3)`.
+Since the `*` operator is left-associative we can use `prec.left` on the corresponding
+grammar rule, to inform the parser how to resolve this conflict.
+
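The two groupings the parser has to choose between can be sketched outside of Tree-sitter. The helpers below (`groupLeft`, `groupRight`) are hypothetical names introduced only for illustration; they are not part of the Tree-sitter API and only show which nesting a left-associative rule corresponds to.

```javascript
// Hypothetical helpers (not part of Tree-sitter): render how a flat
// list of operands is nested under each associativity.
function groupLeft(operands, op) {
  // Fold from the left: ((e1 op e2) op e3)
  return operands.reduce((acc, next) => `(${acc} ${op} ${next})`);
}

function groupRight(operands, op) {
  // Fold from the right: (e1 op (e2 op e3))
  return operands.reduceRight((acc, prev) => `(${prev} ${op} ${acc})`);
}

const operands = ["e1", "e2", "e3"];
console.log(groupLeft(operands, "*"));  // ((e1 * e2) * e3)
console.log(groupRight(operands, "*")); // (e1 * (e2 * e3))
```

With `prec.left` on the rule, the parser completes `e1 * e2` as soon as it sees the second `*`, which yields the first nesting.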
+However, in some cases looking one token ahead isn't enough, in which case we can add
+the conflicting rules to the `conflicts` list in the grammar. Whenever the parser stumbles
+upon this conflict it uses its GLR algorithm, basically considering both interpretations
+until one leads to a parsing error. If both paths parse correctly (there's a genuine ambiguity)
+we can use dynamic precedence (`prec.dynamic`) to decide on the preferred path.
+
+## Using the CLI
+
+### tree-sitter
+
+```shell
+# See CLI usage
+npx tree-sitter -h
+
+# Generate the parser code based on grammar.js
+npx tree-sitter generate
+
+# Run tests
+npx tree-sitter test
+npx tree-sitter test --filter "access syntax"
+
+# Parse a specific file
+npx tree-sitter parse tmp/test.ex
+npx tree-sitter parse -x tmp/test.ex
+
+# Parse codebase to verify syntax coverage
+npx tree-sitter parse --quiet --stat 'tmp/elixir/**/*.ex*'
+```
+
+Whenever you make a change to `grammar.js` remember to run `generate`,
+before verifying the result. To test custom code, create an Elixir file
+like `tmp/test.ex` and use `parse` on it. The `-x` flag prints out the
+source grouped into AST nodes as XML.
+
+### Additional scripts
+
+```shell
+# Format the grammar.js file
+npm run format
+
+# Run parser against the given repository
+scripts/parse_repo.sh elixir-lang/elixir
+```
+
+## Implementation notes
+
+This section covers some of the implementation decisions that have a more
+elaborate rationale. The individual subsections are referenced in the code.
+
+### Ref 1. External scanner for quoted content
+
+We want to scan quoted content as a single token, but it requires lookahead.
+Specifically the `#` character may no longer be quoted content if followed by `{`.
+Also, inside a heredoc string, tokenizing `"` (or `'`) requires lookahead to know
+whether it's already part of the end delimiter or not.
+
+Since we need to use external scanner, we need to know the delimiter type.
+One way to achieve this is using external scanner to scan the start delimiter
+and then storing its type on the parser stack. This approach requires the parser
+to allocate enough memory upfront and implement serialization/deserialization,
+which ideally would be avoided. To avoid this, we use a different approach!
+Instead of having a single `quoted_content` token, we have specific tokens for
+each quoted content type, such as `_quoted_content_i_single`, `_quoted_content_i_double`.
+Once the start delimiter is tokenized, we know which quoted content should be
+tokenized next, and from the token we can infer the end delimiter and whether
+it supports interpolation. In other words, we extract the information from the
+parsing state, rather than maintaining custom parser state.
+
+### Ref 2. External scanner for newlines
+
+Generally newlines may appear in the middle of expressions and we ignore them
+as long as the expression is valid, that's why we list newline under extras.
+
+When a newline follows a complete expression, most of the time it should be
+treated as a terminator. However, there are specific cases where the newline is
+non-breaking and treated as if it was just a space. These cases are:
+
+  * call followed by newline and a `do end` block
+  * expression followed by newline and a binary operator
+
+In both cases we want to tokenize the newline as non-breaking, so we use external
+scanner for lookahead.
+
+Note that the relevant rules already specify left/right associativity, so if we
+simply added `optional("\n")` the conflicts would be resolved immediately rather
+than using GLR.
+
+Additionally, since comments may appear anywhere and don't change the context,
+we also tokenize newlines before comments as non-breaking.
+
+### Ref 3. External scanner for unary + and -
+
+Plus and minus are either binary or unary operators, depending on the context.
+Consider the following variants:
+
+```
+a + b
+a+b
+a+ b
+a +b
+```
+
+In the first three expressions `+` is a binary operator, while in the last one
+`+` is a unary operator and `+b` is the argument of a local call to `a`.
+
+To correctly tokenize all cases we use external scanner to tokenize a special empty
+token (`_before_unary_operator`) when the spacing matches `a +b`, which forces the
+parser to pick the unary operator path.
+
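The spacing rule can be sketched as a small standalone function. This is only an illustration under simplifying assumptions: `classifySign` is a hypothetical name, it looks at nothing but the characters around the operator, and the real external scanner (written in C++) also depends on which tokens the parser considers valid at that point.

```javascript
// Hypothetical helper: classify the `+` or `-` at `index` using only
// the surrounding spacing, mirroring the four variants listed above.
function classifySign(source, index) {
  const before = source[index - 1];
  const after = source[index + 1];
  const isInlineSpace = (ch) => ch === " " || ch === "\t";

  const spaceBefore = isInlineSpace(before);
  // Treat end of input like trailing whitespace: there is no operand.
  const spaceAfter = after === undefined || /\s/.test(after);

  // Only the `a +b` shape (space before, no space after) is unary.
  return spaceBefore && !spaceAfter ? "unary" : "binary";
}

for (const source of ["a + b", "a+b", "a+ b", "a +b"]) {
  console.log(JSON.stringify(source), classifySign(source, source.indexOf("+")));
}
```

Only the last variant is reported as `unary`, which is the case where the empty token would be emitted.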
+### Ref 4. External scanner for `not in`
+
+The `not in` operator may have an arbitrary inline whitespace between `not` and `in`.
+
+We cannot use a regular expression like `/not[ \t]+in/`, because it would also match
+in expressions like `a not inn` as the longest matching token.
+
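The problem with the plain regular expression can be reproduced with an ordinary JavaScript regex. This is a sketch outside of Tree-sitter; `naiveNotIn` is a name made up for the example.

```javascript
// The pattern from the paragraph above, used as a plain JavaScript regex.
const naiveNotIn = /not[ \t]+in/;

for (const source of ["a not in b", "a not   in b", "a not inn"]) {
  const match = source.match(naiveNotIn);
  console.log(JSON.stringify(source), "->", JSON.stringify(match && match[0]));
}
```

The last input still produces the match `"not in"`, even though `inn` is a different identifier, because nothing in the pattern requires the token to end after `in`.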
+A possible solution could be `seq("not", "in")` with dynamic conflict resolution, but
+then we tokenize two separate tokens. Also to properly handle `a not inn`, we would need
+keyword extraction, which causes problems in our case (https://github.com/tree-sitter/tree-sitter/issues/1404).
+
+In the end it's easiest to use external scanner, so that we can skip inline whitespace
+and ensure token ends after `in`.
+
+### Ref 5. External scanner for quoted atom start
+
+For parsing quoted atom `:` we could make the `"` (or `'`) token immediate, however this
+would require adding immediate rules for single/double quoted content and listing them
+in relevant places. We could definitely do that, but using external scanner is actually
+simpler.
+
+### Ref 6. Identifier pattern
+
+See [Elixir / Unicode Syntax](https://hexdocs.pm/elixir/unicode-syntax.html) for official
+notes.
+
+Tree-sitter already supports unicode properties in regular expressions, however character
+class subtraction is not supported.
+
+For the base `<Start>` and `<Continue>` we can use `[\p{ID_Start}]` and `[\p{ID_Continue}]`
+respectively, since both are supported and according to the
+[Unicode Annex #31](https://unicode.org/reports/tr31/#Table_Lexical_Classes_for_Identifiers)
+they match the ranges listed in the Elixir docs.
+
+For atoms this translates to a clean regular expression.
+
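As a quick check, the atom word pattern from `grammar.js` (`ATOM_WORD_LITERAL`) can be exercised in plain JavaScript. The anchored wrapper `wholeAtomWord` and the sample words are assumptions added for this sketch.

```javascript
// Pattern as defined in grammar.js.
const ATOM_WORD_LITERAL = /[\p{ID_Start}_][\p{ID_Continue}@]*[?!]?/u;

// Anchored copy so that the whole word has to match (illustration only).
const wholeAtomWord = new RegExp(`^(?:${ATOM_WORD_LITERAL.source})$`, "u");

for (const word of ["foo", "Foo", "_bar", "valid?", "a@b", "1abc", "foo-bar"]) {
  console.log(word, wholeAtomWord.test(word));
}
```

The first five words match, while `1abc` and `foo-bar` do not.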
+For variables however, we want to exclude uppercase (`\p{Lu}`) and titlecase (`\p{Lt}`)
+categories from `\p{ID_Start}`. As already mentioned, we cannot use group subtraction
+in the regular expression, so instead we need to create a suitable group of characters
+on our own.
+
+After removing the uppercase/titlecase categories from `[\p{ID_Start}]`, we obtain the
+following group:
+
+`[\p{Ll}\p{Lm}\p{Lo}\p{Nl}\p{Other_ID_Start}-\p{Pattern_Syntax}-\p{Pattern_White_Space}]`
+
+At the time of writing the subtracted groups actually only remove a single character:
+
+```elixir
+Mix.install([{:unicode_set, "~> 1.1"}])
+
+Unicode.Set.to_utf8_char(
+  "[[[:Ll:][:Lm:][:Lo:][:Nl:][:Other_ID_Start:]] & [[:Pattern_Syntax:][:Pattern_White_Space:]]]"
+)
+#=> {:ok, [11823]}
+```
+
+Consequently, by removing the subtraction we allow just one additional (not common) character,
+which is perfectly acceptable.
+
+It's important to note that JavaScript regular expressions don't support the `\p{Other_ID_Start}`
+unicode category. Fortunately this category is a small set of characters introduced for
+[backward compatibility](https://unicode.org/reports/tr31/#Backward_Compatibility), so we can
+enumerate it manually:
+
+```elixir
+Mix.install([{:unicode_set, "~> 1.1"}])
+
+Unicode.Set.to_utf8_char("[[[:Other_ID_Start:]] - [[:Pattern_Syntax:][:Pattern_White_Space:]]]")
+|> elem(1)
+|> Enum.flat_map(fn
+  n when is_number(n) -> [n]
+  range -> range
+end)
+|> Enum.map(&Integer.to_string(&1, 16))
+#=> ["1885", "1886", "2118", "212E", "309B", "309C"]
+```
+
+Finally, we obtain this regular expression group for variable `<Start>`:
+
+`[\p{Ll}\p{Lm}\p{Lo}\p{Nl}\u1885\u1886\u2118\u212E\u309B\u309C]`
+
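The resulting character class can be sanity-checked in plain JavaScript. The anchored regex `VARIABLE_START` and the sample characters below are assumptions chosen for this sketch, not part of the grammar.

```javascript
// The variable <Start> class from above, anchored to a single character.
const VARIABLE_START =
  /^[\p{Ll}\p{Lm}\p{Lo}\p{Nl}\u1885\u1886\u2118\u212E\u309B\u309C]$/u;

const samples = [
  ["a", "lowercase letter (Ll)"],
  ["\u00E9", "lowercase letter with accent (Ll)"],
  ["\u65E5", "CJK ideograph (Lo)"],
  ["\u2118", "enumerated Other_ID_Start character"],
  ["A", "uppercase letter (Lu)"],
  ["\u01C5", "titlecase letter (Lt)"],
  ["1", "digit"],
];

for (const [char, description] of samples) {
  console.log(description, VARIABLE_START.test(char));
}
```

The first four samples are accepted and the last three are rejected, which matches the intent of excluding uppercase and titlecase letters.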
+### Ref 7. Keyword token
+
+We tokenize the whole keyword sequence like `do: ` as a single token.
+Ideally we wouldn't include the whitespace, but since we use `token`
+it gets included. However, this is an intentionally accepted tradeoff,
+because using `token` significantly simplifies the grammar and avoids
+conflicts.
+
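The effect on the token text can be seen with a plain JavaScript regex. The sketch below covers only the word-literal branch of the keyword rule (operator and special literals are left out), and `KEYWORD_WORD` is a name made up for the example.

```javascript
// Word literal followed by `:` and one whitespace character,
// mirroring the shape of the `keyword` token in grammar.js.
const KEYWORD_WORD = /[\p{ID_Start}_][\p{ID_Continue}@]*[?!]?:\s/u;

const match = "do: 1".match(KEYWORD_WORD);
console.log(JSON.stringify(match[0])); // "do: "

// Without whitespace after the colon there is no keyword token.
console.log("a::b".match(KEYWORD_WORD)); // null
```

The matched text is `"do: "`, trailing space included, which is the tradeoff described above.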
+The alternative approach would be to define keyword as `seq(alias(choice(...), $._keyword_literal), $._keyword_end)`,
+where we list all other tokens that make for a valid keyword literal
+and use custom scanner for `_keyword_end` to look ahead without tokenizing
+the whitespace. However, this approach generates a number of conflicts
+because `:` is tokenized separately and phrases like `fun fun • do` or
+`fun • {}` are ambiguous (interpretation depends on whether `:` comes next).
+Resolving some of these conflicts (for instance special keywords like `{}` or `%{}`)
+requires the use of external scanner. Given the complexities this approach
+brings to the grammar, and consequently the parser, we stick to the simpler
+approach.
--- a/grammar.js
+++ b/grammar.js
@ -1,7 +1,5 @@
-// Operator precedence:
-// * https://hexdocs.pm/elixir/master/operators.html
-// * https://github.com/elixir-lang/elixir/blob/master/lib/elixir/src/elixir_parser.yrl
 const PREC = {
+  // See https://github.com/elixir-lang/elixir/blob/master/lib/elixir/src/elixir_parser.yrl
  IN_MATCH_OPS: 10,
  WHEN_OP: 20,
  TYPE_OP: 30,
@ -64,6 +62,9 @@ const ATOM_OPERATOR_LITERALS = ALL_OPS.filter(
 // so it should be kept in sync
 const ATOM_SPECIAL_LITERALS = ["...", "%{}", "{}", "%", "<<>>", "..//"];

+// See Ref 6. in the docs
+const ATOM_WORD_LITERAL = /[\p{ID_Start}_][\p{ID_Continue}@]*[?!]?/u;
+
 // Word tokens used directly in the grammar
 const RESERVED_WORD_TOKENS = [
  // Operators
@ -82,31 +83,28 @@ const SPECIAL_IDENTIFIERS = [
  "__STACKTRACE__",
 ];

-// Numbers
-
 const DIGITS = /[0-9]+/;
 const BIN_DIGITS = /[0-1]+/;
 const OCT_DIGITS = /[0-7]+/;
 const HEX_DIGITS = /[0-9a-fA-F]+/;

-const numberDec = sep1(DIGITS, "_");
-const numberBin = seq("0b", sep1(BIN_DIGITS, "_"));
-const numberOct = seq("0o", sep1(OCT_DIGITS, "_"));
-const numberHex = seq("0x", sep1(HEX_DIGITS, "_"));
+const NUMBER_DEC = sep1(DIGITS, "_");
+const NUMBER_BIN = seq("0b", sep1(BIN_DIGITS, "_"));
+const NUMBER_OCT = seq("0o", sep1(OCT_DIGITS, "_"));
+const NUMBER_HEX = seq("0x", sep1(HEX_DIGITS, "_"));

-const integer = choice(numberDec, numberBin, numberOct, numberHex);
+const INTEGER = choice(NUMBER_DEC, NUMBER_BIN, NUMBER_OCT, NUMBER_HEX);

-const floatScientificPart = seq(/[eE]/, optional(choice("-", "+")), integer);
-const float = seq(numberDec, ".", numberDec, optional(floatScientificPart));
+const FLOAT_SCIENTIFIC_PART = seq(/[eE]/, optional(choice("-", "+")), INTEGER);
+const FLOAT = seq(NUMBER_DEC, ".", NUMBER_DEC, optional(FLOAT_SCIENTIFIC_PART));

-const aliasPart = /[A-Z][_a-zA-Z0-9]*/;
+const NEWLINE = /\r?\n/;

 module.exports = grammar({
  name: "elixir",

-  // TODO describe stuff (also in the separate notes doc add clarification
-  // how we use this verbose tokens to avoid needing scanner state)
  externals: ($) => [
+    // See Ref 1. in the docs
    $._quoted_content_i_single,
    $._quoted_content_i_double,
    $._quoted_content_i_heredoc_single,
@ -117,7 +115,6 @@ module.exports = grammar({
    $._quoted_content_i_angle,
    $._quoted_content_i_bar,
    $._quoted_content_i_slash,
-
    $._quoted_content_single,
    $._quoted_content_double,
    $._quoted_content_heredoc_single,
@ -129,77 +126,62 @@ module.exports = grammar({
    $._quoted_content_bar,
    $._quoted_content_slash,

-    $._keyword_special_literal,
-    $._atom_start,
-    $._keyword_end,
-
+    // See Ref 2. in the docs
    $._newline_before_do,
-    $._newline_before_binary_op,
-    // TODO explain this, basically we use newline ignored for newline before comment,
-    // as after the comment there is another newline that we then consider as usual (so
-    // that comments are skipped when considering newlines) <- this is chaotic need a better one
+    $._newline_before_binary_operator,
    $._newline_before_comment,

-    // TODO explain this, basically we use this to force unary + and -
-    // if there is no spacing before the operand
+    // See Ref 3. in the docs
    $._before_unary_op,

+    // See Ref 4. in the docs
    $._not_in,
+
+    // See Ref 5. in the docs
+    $._quoted_atom_start,
  ],

-  // TODO include in notes about why using extra for newline before binary op is fine
-  // TODO figure out how "\n" helps with the behaviour in
-  // [
-  //   :a,
-  // ]
-  // and how it generally works with extras
  extras: ($) => [
+    NEWLINE,
+    /[ \t]|\r?\n|\\\r?\n/,
    $.comment,
-    /\s|\\\n/,
-    $._newline_before_binary_op,
    $._newline_before_comment,
-    "\n",
+    // Placing this directly in the binary operator rule leads
+    // to conflicts, but we can place it here without any drawbacks.
+    // If we detect binary operator and the previous line is not a
+    // valid expression, it's a syntax error either way
+    $._newline_before_binary_operator,
  ],

-  // TODO check if the parser doesn't compile without each conflict rule,
-  // otherwise it means we don't really use it (I think)
  conflicts: ($) => [
-    // [$._newline_before_binary_op],
-    [$.binary_operator],
-    [$.keywords],
-    // [$.identifier, $.atom_literal],
-    [$._expression, $._local_call_with_arguments],
-    [
-      $._expression,
-      $._local_call_with_arguments,
-      $._local_call_without_arguments,
-    ],
+    // Given `left • *`, `left` identifier can be either:
+    //   * expression in `left * right`
+    //   * call identifier in `left * / 2`
+    [$._expression, $._local_call_without_parentheses],

-    [$._remote_call, $._parenthesised_remote_call],
+    // Given `left • when`, `left` expression can be either:
+    //   * binary operator operand in `left when right`
+    //   * stab arguments item in `left when right ->`
+    //
+    // Given `arg1, left • when`, `left` expression can be either:
+    //   * binary operator operand in `arg1, left when right, arg3`
+    //   * stab arguments item in `arg1, left when right ->`
+    [$.binary_operator, $._stab_clause_arguments_without_parentheses],

-    // stab clause `(x` may be either `(x;y) ->` or `(x, y) ->`
-    // [$.block, $._stab_clause_arguments],
-    [$.block, $._stab_clause_parentheses_arguments],
-    [$.block, $._stab_clause_arguments],
-
-    [$.block, $._stab_clause_arguments_expression],
-
-    // when in stab clause
-    [$.binary_operator, $._stab_clause_arguments_expression],
-
-    [$.tuple, $.map],
-    [$.tuple, $.map_content],
+    // Given `(-> • /`, stab can be either:
+    //   * stab clause operator in `(-> / / 2)`
+    //   * operator identifier in `(-> / 2)`
    [$.operator_identifier, $.stab_clause],
+
+    // Given `& /`, ampersand can be either:
+    //   * capture operator in `& / / 2`
+    //   * operator identifier in `& / 1`
    [$.unary_operator, $.operator_identifier],
-    // [$.alias],
+
+    // Given `(arg -> expression • \n`, the newline could be either:
+    //   * terminator separating expressions in `(arg -> expression \n expression)`
+    //   * terminator separating clauses in `(arg -> expression \n arg -> expression)`
    [$.body],
-    // [$.block, $._stab_clause_arguments],
-    // [$.block, $._stab_clause_parentheses_arguments],
-    // [$.block, $._stab_clause_parentheses_arguments],
-    [$.after_block],
-    [$.rescue_block],
-    [$.catch_block],
-    [$.else_block],
  ],

  rules: {
@ -212,7 +194,8 @@ module.exports = grammar({
      ),

    _terminator: ($) =>
-      prec.right(choice(seq(repeat("\n"), ";"), repeat1("\n"))),
+      // Right precedence, because we want to consume `;` after newlines if present
+      prec.right(choice(seq(repeat(NEWLINE), ";"), repeat1(NEWLINE))),

    _expression: ($) =>
      choice(
@ -221,7 +204,10 @@ module.exports = grammar({
        $.alias,
        $.integer,
        $.float,
-        $.atom,
+        $.char,
+        $.boolean,
+        $.nil,
+        $._atom,
        $.string,
        $.charlist,
        $.sigil,
@ -229,9 +215,6 @@ module.exports = grammar({
        $.tuple,
        $.bitstring,
        $.map,
-        $.char,
-        $.boolean,
-        $.nil,
        $.unary_operator,
        $.binary_operator,
        $.dot,
@ -241,54 +224,27 @@ module.exports = grammar({
      ),

    block: ($) =>
-      prec(
-        PREC.WHEN_OP,
-        seq(
-          "(",
-          seq(
-            optional($._terminator),
-            optional(
-              seq(
-                sep1(choice($._expression, $.stab_clause), $._terminator),
-                optional($._terminator)
-              )
+      seq(
+        "(",
+        optional($._terminator),
+        optional(
+          choice(
+            sep1(choice($.stab_clause), $._terminator),
+            seq(
+              sep1(choice($._expression), $._terminator),
+              optional($._terminator)
            )
-          ),
-          ")"
-        )
+          )
+        ),
+        ")"
      ),

    _identifier: ($) =>
      choice($.identifier, $.unused_identifier, $.special_identifier),

-    // Note: Elixir does not allow uppercase and titlecase letters
-    // as a variable starting character, but this regex would match
-    // those. This implies we would happily parse those cases, but
-    // since they are not valid Elixir it's unlikely to stumble upon
-    // them. TODO reword
-    // Ref: https://hexdocs.pm/elixir/master/unicode-syntax.html#variables
-    // TODO see if we need this in custom scanner in the end, if we do,
-    // then we may use the generation script from the original repo instead
-    // and make this an external (though I'd check if these custom unicode
-    // functions are efficient, does compiler optimise such checks?)
-    // identifier: ($) => choice(/[\p{ID_Start}][\p{ID_Continue}]*[?!]?/u, "..."),
-    // identifier: ($) => choice(/[\p{Ll}\p{Lm}\p{Lo}\p{Nl}\p{Other_ID_Start}][\p{ID_Continue}]*[?!]?/u, "..."),
-    // identifier: ($) => choice(/[\p{Ll}\p{Lm}\p{Lo}\p{Nl}][\p{ID_Continue}]*[?!]?/u, "..."),
-    //
-    // TODO elaborate, but basically
-    //
-    // we remove uppercase/titlecase letters from ID_Start as elixir does
-    // we remove the subtractions (we cannot express group subtraction in regex),
-    //   but it's fine becaues at the time of writing these groups only really subtract
-    //   a single character
-    //   Unicode.Set.to_utf8_char "[[[:L:][:Nl:][:Other_ID_Start:]] & [[:Pattern_Syntax:][:Pattern_White_Space:]]]"
-    // we use hardcoded codepoints for \p{Other_ID_Start} since treesitter/js regexp doesn't
-    //   recognise this group
-    //
-    // Other_ID_Start \u1885\u1886\u2118\u212E\u309B\u309C
-    //   (this the list at the time of writing, it's for backward compatibility, see https://unicode.org/reports/tr31/#Backward_Compatibility)
    identifier: ($) =>
      choice(
+        // See Ref 6. in the docs
        /[\p{Ll}\p{Lm}\p{Lo}\p{Nl}\u1885\u1886\u2118\u212E\u309B\u309C][\p{ID_Continue}]*[?!]?/u,
        "..."
      ),
@ -297,36 +253,34 @@ module.exports = grammar({

    special_identifier: ($) => choice(...SPECIAL_IDENTIFIERS),

-    // We have a separate rule for single-part alias, so that we
-    // can use it in the keywords rule
-    alias: ($) => choice($._alias_single, $._alias_multi),
+    alias: ($) => token(sep1(/[A-Z][_a-zA-Z0-9]*/, /\s*\.\s*/)),

-    _alias_single: ($) => aliasPart,
+    integer: ($) => token(INTEGER),

-    _alias_multi: ($) => token(sep1(aliasPart, /\s*\.\s*/)),
+    float: ($) => token(FLOAT),

-    integer: ($) => token(integer),
+    char: ($) => /\?(.|\\.)/,

-    float: ($) => token(float),
+    boolean: ($) => choice("true", "false"),
+
+    nil: ($) => "nil",
+
+    _atom: ($) => choice($.atom, $.quoted_atom),

    atom: ($) =>
-      seq(
-        $._atom_start,
-        choice(
-          alias($._atom_word_literal, $.atom_literal),
-          alias($._atom_operator_literal, $.atom_literal),
-          alias($._atom_special_literal, $.atom_literal),
-          $._quoted_i_double,
-          $._quoted_i_single
+      token(
+        seq(
+          ":",
+          choice(
+            ATOM_WORD_LITERAL,
+            ...ATOM_OPERATOR_LITERALS,
+            ...ATOM_SPECIAL_LITERALS
+          )
        )
      ),

-    // TODO comment on the unicode groups here
-    _atom_word_literal: ($) => /[\p{ID_Start}_][\p{ID_Continue}@]*[?!]?/u,
-
-    _atom_operator_literal: ($) => choice(...ATOM_OPERATOR_LITERALS),
-
-    _atom_special_literal: ($) => choice(...ATOM_SPECIAL_LITERALS),
+    quoted_atom: ($) =>
+      seq($._quoted_atom_start, choice($._quoted_i_double, $._quoted_i_single)),

    // Defines $._quoted_content_i_{name} and $._quoted_content_{name} rules,
    // content with and without interpolation respectively
@ -402,6 +356,82 @@ module.exports = grammar({
        optional(alias(token.immediate(/[a-zA-Z]+/), $.sigil_modifiers))
      ),

+    keywords: ($) =>
+      // Right precedence, because we want to consume next items as long
+      // as there is a comma ahead
+      prec.right(sep1($.pair, ",")),
+
+    _keywords_with_trailing_separator: ($) =>
+      seq(sep1($.pair, ","), optional(",")),
+
+    pair: ($) => seq($._keyword, $._expression),
+
+    _keyword: ($) => choice($.keyword, $.quoted_keyword),
+
+    keyword: ($) =>
+      // See Ref 7. in the docs
+      token(
+        seq(
+          choice(
+            ATOM_WORD_LITERAL,
+            ...ATOM_OPERATOR_LITERALS.filter((op) => op !== "::"),
+            ...ATOM_SPECIAL_LITERALS
+          ),
+          /:\s/
+        )
+      ),
+
+    quoted_keyword: ($) =>
+      seq(
+        choice($._quoted_i_double, $._quoted_i_single),
+        token.immediate(/:\s/)
+      ),
+
+    list: ($) => seq("[", optional($._items_with_trailing_separator), "]"),
+
+    tuple: ($) => seq("{", optional($._items_with_trailing_separator), "}"),
+
+    bitstring: ($) =>
+      seq("<<", optional($._items_with_trailing_separator), ">>"),
+
+    map: ($) =>
+      // Precedence over tuple
+      prec(
+        1,
+        seq(
+          "%",
+          optional($.struct),
+          "{",
+          optional(alias($._items_with_trailing_separator, $.map_content)),
+          "}"
+        )
+      ),
+
+    struct: ($) =>
+      // Left precedence, because if there is a conflict involving `{}`,
+      // we want to treat it as map continuation rather than tuple
+      prec.left(
+        choice(
+          $.alias,
+          $._atom,
+          $._identifier,
+          $.unary_operator,
+          $.dot,
+          alias($._call_with_parentheses, $.call)
+        )
+      ),
+
+    _items_with_trailing_separator: ($) =>
+      seq(
+        choice(
+          seq(sep1($._expression, ","), optional(",")),
+          seq(
+            optional(seq(sep1($._expression, ","), ",")),
+            alias($._keywords_with_trailing_separator, $.keywords)
+          )
+        )
+      ),
+
    unary_operator: ($) =>
      choice(
        unaryOp($, prec, PREC.CAPTURE_OP, "&", $._capture_expression),
@ -413,9 +443,10 @@ module.exports = grammar({

    _capture_expression: ($) =>
      choice(
-        // TODO sholud parenthesised expression be generally used (?)
-        // Precedence over block expression
-        prec(PREC.WHEN_OP + 1, seq("(", $._expression, ")")),
+        // Note that block expression is not allowed as capture operand,
+        // so we have an explicit sequence with the parentheses and higher
+        // precedence
+        prec(1, seq("(", $._expression, ")")),
        $._expression
      ),

@ -466,13 +497,14 @@ module.exports = grammar({

    operator_identifier: ($) =>
      // Operators with the following changes:
-      //   * exclude "=>" since it's not a valid atom/operator identifier anyway (valid only in map)
-      // * we exclude // since it's only valid after ..
-      // * we remove "-" and "+" since they are both unary and binary
-
-      // We use the same precedence as unary operators, so that a sequence
-      // like `& /` is a conflict and is resolved via $.conflicts
-      // (could be be either `& / 2` or `& / / 2`)
+      //
+      //   * exclude "=>" since it's not a valid operator identifier
+      //   * exclude // since it's only valid after ..
+      //   * exclude binary "-" and "+" as they are handled as unary below
+      //
+      // For unary operator identifiers we use the same precedence as
+      // operators, so that we get conflicts and resolve them dynamically
+      // (see grammar.conflicts for more details)
      choice(
        // Unary operators
        prec(PREC.CAPTURE_OP, "&"),
@ -505,188 +537,63 @@ module.exports = grammar({
        seq(choice($._expression), ".", choice($.alias, $.tuple))
      ),

-    keywords: ($) => sep1($.pair, ","),
+    call: ($) => choice($._call_without_parentheses, $._call_with_parentheses),

-    pair: ($) => seq($.keyword, $._expression),
-
-    keyword: ($) =>
-      seq(
-        // Tree-sitter doesn't consider ambiguities within individual
-        // tokens (in this case regexps). So both in [a] and [a: 1] it
-        // would always parse "a" as the same node (based on whether
-        // $.identifier or $.atom_literal) is listed first in the rules.
-        // However, since identifiers and alias parts are valid atom
-        // literals, we can list them here, in which case the parser will
-        // consider all paths and pick the valid one.
-        // Also see https://github.com/tree-sitter/tree-sitter/issues/518
-        choice(
-          alias($._atom_word_literal, $.atom_literal),
-          alias($._atom_operator_literal, $.atom_literal),
-          alias($._keyword_special_literal, $.atom_literal),
-          alias($.identifier, $.atom_literal),
-          alias($.unused_identifier, $.atom_literal),
-          alias($.special_identifier, $.atom_literal),
-          alias($._alias_single, $.atom_literal),
-          alias(choice(...RESERVED_WORD_TOKENS), $.atom_literal),
-          $._quoted_i_double,
-          $._quoted_i_single
-        ),
-        $._keyword_end
-      ),
-
-    list: ($) => seq("[", optional($._items_with_trailing_separator), "]"),
-
-    tuple: ($) => seq("{", optional($._items_with_trailing_separator), "}"),
-
-    bitstring: ($) =>
-      seq("<<", optional($._items_with_trailing_separator), ">>"),
-
-    map: ($) => seq("%", optional($.struct), "{", optional($.map_content), "}"),
-
-    struct: ($) =>
-      prec.left(
-        choice(
-          $.alias,
-          $.atom,
-          $._identifier,
-          $.unary_operator,
-          $.dot,
-          alias($._parenthesised_call, $.call)
-        )
-      ),
-
-    map_content: ($) => $._items_with_trailing_separator,
-
-    _items_with_trailing_separator: ($) =>
-      seq(
-        choice(
-          seq(sep1($._expression, ","), optional(seq(",", $.keywords))),
-          $.keywords
-        ),
-        optional(",")
-      ),
-
-    char: ($) => /\?(.|\\.)/,
-
-    boolean: ($) => choice("true", "false"),
-
-    nil: ($) => "nil",
-
-    call: ($) =>
+    _call_without_parentheses: ($) =>
      choice(
-        $._local_call_with_arguments,
-        $._parenthesised_local_call_with_arguments,
-        $._local_call_without_arguments,
-        $._remote_call,
-        $._parenthesised_remote_call,
-        $._anonymous_call,
-        $._call_on_call
+        $._local_call_without_parentheses,
+        $._local_call_just_do_block,
+        $._remote_call_without_parentheses
      ),

-    _parenthesised_call: ($) =>
+    _call_with_parentheses: ($) =>
      choice(
-        $._parenthesised_local_call_with_arguments,
-        $._parenthesised_remote_call,
+        $._local_call_with_parentheses,
+        $._remote_call_with_parentheses,
        $._anonymous_call,
-        $._call_on_call
+        $._double_call
      ),

-    _call_on_call: ($) =>
-      prec.left(
-        seq(
-          alias(
-            choice(
-              $._parenthesised_local_call_with_arguments,
-              $._parenthesised_remote_call,
-              $._anonymous_call
-            ),
-            $.call
-          ),
-          // arguments in parentheses
-          // alias($._local_or_remote_arguments, $.arguments),
-          // TODO just make nonimmediate/immediate in the name
-          alias($._anonymous_arguments, $.arguments),
-          optional(seq(optional($._newline_before_do), $.do_block))
-        )
-      ),
+    // Note: calls have left precedence, so that a `do end` block attaches to
+    // the outermost call

-    _local_call_with_arguments: ($) =>
-      // Given `x + y` it can be interpreted either as a binary operator
-      // or a call with unary operator. This is an actual ambiguity, so
-      // we use dynamic precedence to penalize call
-      // prec.dynamic(
-      // TODO ideally we would penalize whitespace after unary op,
-      // so that x + y is binary op and x +y is unary op, to reflect
-      // Elixir ast
-      // -1,
+    _local_call_without_parentheses: ($) =>
      prec.left(
        seq(
          $._identifier,
-          alias($._call_arguments, $.arguments),
-          // TODO include this in notes:
-          // We use external scanner for _newline_before_do because
-          // this way we can lookahead through any whitespace
-          // (especially newlines). We cannot simply use repeat("\n")
-          // and conflict with expression end, because this function
-          // rule has left precedence (so that do-end sticks to the outermost
-          // call), and thus expression end would always be preferred
+          alias($._call_arguments_without_parentheses, $.arguments),
          optional(seq(optional($._newline_before_do), $.do_block))
-          // optional($.do_block)
        )
-        // )
      ),

-    _parenthesised_local_call_with_arguments: ($) =>
-      // Given `x + y` it can be interpreted either as a binary operator
-      // or a call with unary operator. This is an actual ambiguity, so
-      // we use dynamic precedence to penalize call
-      // prec.dynamic(
-      // TODO ideally we would penalize whitespace after unary op,
-      // so that x + y is binary op and x +y is unary op, to reflect
-      // Elixir ast
-      // -1,
+    _local_call_with_parentheses: ($) =>
      prec.left(
        seq(
          $._identifier,
-          alias($._parenthesised_call_arguments, $.arguments),
-          // TODO include this in notes:
-          // We use external scanner for _newline_before_do because
-          // this way we can lookahead through any whitespace
-          // (especially newlines). We cannot simply use repeat("\n")
-          // and conflict with expression end, because this function
-          // rule has left precedence (so that do-end sticks to the outermost
-          // call), and thus expression end would always be preferred
+          alias($._call_arguments_with_parentheses_immediate, $.arguments),
          optional(seq(optional($._newline_before_do), $.do_block))
-          // optional($.do_block)
        )
-        // )
      ),

-    _local_call_without_arguments: ($) =>
-      // We use lower precedence, so given `fun arg do end`
-      // we don't tokenize `arg` as a call
+    _local_call_just_do_block: ($) =>
+      // Lower precedence than identifier, because `foo bar do` is `foo(bar) do end`
+      prec(-1, seq($._identifier, $.do_block)),

-      // we actually need a conflict because of `foo bar do end` vs `foo bar do: 1`
-      // prec(-1,
-      prec.dynamic(-1, seq($._identifier, $.do_block)),
-    // )
-    _remote_call: ($) =>
+    _remote_call_without_parentheses: ($) =>
      prec.left(
        seq(
          alias($._remote_dot, $.dot),
-          optional(alias($._call_arguments, $.arguments)),
+          optional(alias($._call_arguments_without_parentheses, $.arguments)),
          optional(seq(optional($._newline_before_do), $.do_block))
-          // optional($.do_block)
        )
      ),

-    _parenthesised_remote_call: ($) =>
+    _remote_call_with_parentheses: ($) =>
      prec.left(
        seq(
          alias($._remote_dot, $.dot),
-          alias($._parenthesised_call_arguments, $.arguments),
+          alias($._call_arguments_with_parentheses_immediate, $.arguments),
          optional(seq(optional($._newline_before_do), $.do_block))
-          // optional($.do_block)
        )
      ),

@ -696,9 +603,6 @@ module.exports = grammar({
        seq(
          $._expression,
          ".",
-          // TODO can also be string, anything else?
-          // compare with the other parser
-          // TODO we don't want to support heredoc though
          choice(
            $._identifier,
            alias(choice(...RESERVED_WORD_TOKENS), $.identifier),
@ -709,117 +613,154 @@ module.exports = grammar({
        )
      ),

-    _parenthesised_call_arguments: ($) =>
-      seq(token.immediate("("), optional($._call_arguments), ")"),
-
    _anonymous_call: ($) =>
      seq(
        alias($._anonymous_dot, $.dot),
-        alias($._anonymous_arguments, $.arguments)
+        alias($._call_arguments_with_parentheses, $.arguments)
      ),

    _anonymous_dot: ($) => prec(PREC.DOT_OP, seq($._expression, ".")),

-    _anonymous_arguments: ($) => seq("(", optional($._call_arguments), ")"),
-
-    _call_arguments: ($) =>
-      // Right precedence ensures that `fun1 fun2 x, y` is treated
-      // as `fun1(fun2(x, y))` and not `fun1(fun2(x), y)
-      prec.right(
+    _double_call: ($) =>
+      prec.left(
        seq(
-          choice(
-            seq(
-              sep1($._expression, ","),
-              optional(seq(",", $.keywords, optional(",")))
+          alias(
+            choice(
+              $._local_call_with_parentheses,
+              $._remote_call_with_parentheses,
+              $._anonymous_call
            ),
-            seq($.keywords, optional(","))
-          )
+            $.call
+          ),
+          alias($._call_arguments_with_parentheses, $.arguments),
+          optional(seq(optional($._newline_before_do), $.do_block))
        )
      ),

+    _call_arguments_with_parentheses: ($) =>
+      seq("(", optional($._call_arguments_with_trailing_separator), ")"),
+
+    _call_arguments_with_parentheses_immediate: ($) =>
+      seq(
+        token.immediate("("),
+        optional($._call_arguments_with_trailing_separator),
+        ")"
+      ),
+
+    _call_arguments_with_trailing_separator: ($) =>
+      choice(
+        seq(
+          sep1($._expression, ","),
+          optional(
+            seq(",", alias($._keywords_with_trailing_separator, $.keywords))
+          )
+        ),
+        alias($._keywords_with_trailing_separator, $.keywords)
+      ),
+
+    _call_arguments_without_parentheses: ($) =>
+      // Right precedence, because `fun1 fun2 x, y` is `fun1(fun2(x, y))`
+      prec.right(
+        choice(
+          seq(sep1($._expression, ","), optional(seq(",", $.keywords))),
+          $.keywords
+        )
+      ),
+
+    do_block: ($) =>
+      seq(
+        callKeywordBlock($, "do"),
+        repeat(
+          choice($.after_block, $.rescue_block, $.catch_block, $.else_block)
+        ),
+        "end"
+      ),
+
+    after_block: ($) => callKeywordBlock($, "after"),
+    rescue_block: ($) => callKeywordBlock($, "rescue"),
+    catch_block: ($) => callKeywordBlock($, "catch"),
+    else_block: ($) => callKeywordBlock($, "else"),
+
    access_call: ($) =>
      prec(
        PREC.ACCESS,
        seq($._expression, token.immediate("["), $._expression, "]")
      ),

-    do_block: ($) =>
-      seq(
-        sugarBlock($, "do"),
-        repeat(
-          choice($.after_block, $.rescue_block, $.catch_block, $.else_block)
-        ),
-        "end"
-      ),
-
-    after_block: ($) => sugarBlock($, "after"),
-
-    rescue_block: ($) => sugarBlock($, "rescue"),
-
-    catch_block: ($) => sugarBlock($, "catch"),
-
-    else_block: ($) => sugarBlock($, "else"),
-
-    // Specify right precedence, so that we consume as much as we can
    stab_clause: ($) =>
+      // Right precedence, because we want to consume the body, if any
      prec.right(seq(optional($._stab_clause_left), "->", optional($.body))),

    _stab_clause_left: ($) =>
      choice(
-        // Note the first option has higher precedence, TODO clarify
-        alias($._stab_clause_parentheses_arguments, $.arguments),
-        // TODO naming/cleanup
+        alias($._stab_clause_arguments_with_parentheses, $.arguments),
        alias(
-          $._stab_clause_parentheses_arguments_with_guard,
+          $._stab_clause_arguments_with_parentheses_with_guard,
          $.binary_operator
        ),
-        alias($._stab_clause_arguments, $.arguments),
-        alias($._stab_clause_arguments_with_guard, $.binary_operator)
+        alias($._stab_clause_arguments_without_parentheses, $.arguments),
+        alias(
+          $._stab_clause_arguments_without_parentheses_with_guard,
+          $.binary_operator
+        )
      ),

-    _stab_clause_parentheses_arguments: ($) =>
-      // `(1) ->` may be interpreted either as block argument
-      // or argument in parentheses and we use dynamic precedence
-      // to favour the latter
+    _stab_clause_arguments_with_parentheses: ($) =>
+      // Precedence over block expression
+      prec(
+        1,
+        seq(
+          "(",
+          optional(
+            choice(
+              seq(sep1($._expression, ","), optional(seq(",", $.keywords))),
+              $.keywords
+            )
+          ),
+          ")"
+        )
+      ),
+
+    _stab_clause_arguments_without_parentheses: ($) =>
+      // We give the arguments and expression the same precedence as the "when"
+      // binary operator, so that we get conflicts and resolve them dynamically
+      // (see grammar.conflicts for more details)
      prec(
        PREC.WHEN_OP,
-        prec.dynamic(1, seq("(", optional($._stab_clause_arguments), ")"))
+        choice(
+          seq(
+            sep1(prec(PREC.WHEN_OP, $._expression), ","),
+            optional(seq(",", $.keywords))
+          ),
+          $.keywords
+        )
      ),
-    _stab_clause_parentheses_arguments_with_guard: ($) =>
+
+    _stab_clause_arguments_with_parentheses_with_guard: ($) =>
      seq(
-        alias($._stab_clause_parentheses_arguments, $.arguments),
+        alias($._stab_clause_arguments_with_parentheses, $.arguments),
        "when",
        $._expression
      ),

-    _stab_clause_arguments_with_guard: ($) =>
-      // `a when b ->` may be interpted either such that `a when b` is an argument
-      // or a guard binary operator with argument `a` and right operand `b`,
-      // we use dynamic precedence to favour the latter
+    _stab_clause_arguments_without_parentheses_with_guard: ($) =>
+      // Given `a when b ->`, the left stab operand can be interpreted either
+      // as a single argument item, or as a binary operator with arguments on
+      // the left and a guard expression on the right. Using dynamic precedence
+      // we favour the latter interpretation during dynamic conflict resolution
      prec.dynamic(
        1,
-        seq(alias($._stab_clause_arguments, $.arguments), "when", $._expression)
-      ),
-    _stab_clause_arguments: ($) =>
-      // TODO this is a variant of _items_with_trailing_separator, cleanup
-      choice(
        seq(
-          sep1($._stab_clause_arguments_expression, ","),
-          optional(seq(",", $.keywords))
-        ),
-        $.keywords
+          alias($._stab_clause_arguments_without_parentheses, $.arguments),
+          "when",
+          $._expression
+        )
      ),

-    _stab_clause_arguments_expression: ($) =>
-      // Note here we use the same precedence as when operator,
-      // so we get a conflict and resolve it dynamically
-      prec(PREC.WHEN_OP, $._expression),
    body: ($) =>
      seq(
-        choice(
-          seq($._terminator, sep($._expression, $._terminator)),
-          sep1($._expression, $._terminator)
-        ),
+        optional($._terminator),
+        sep1($._expression, $._terminator),
        optional($._terminator)
      ),

@ -832,7 +773,7 @@ module.exports = grammar({
      ),

    // A comment may be anywhere, we give it a lower precedence,
-    // so it doesn't intercept sequences such as interpolation
+    // so it doesn't intercept interpolation
    comment: ($) => token(prec(-1, seq("#", /.*/))),
  },
 });
@ -846,15 +787,14 @@ function sep(rule, separator) {
 }

 function unaryOp($, assoc, precedence, operator, right = null) {
-  return assoc(
-    precedence,
-    // TODO clarify, we use lower precedence, so given `x + y`,
-    // which can be interpreted as either `x + y` or `x(+y)`
-    // we favour the former. The only exception is when
-    // _before_unary_op matches which forces the latter interpretation
-    // in case like `x +y`
-    prec.dynamic(
-      -1,
+  // An expression such as `x + y` falls under the "expression vs local call"
+  // conflict that we already have. By using dynamic precedence we penalize
+  // the unary operator, so `x + y` is interpreted as a binary operator (unless
+  // _before_unary_op is tokenized and forces the unary operator interpretation)
+  return prec.dynamic(
+    -1,
+    assoc(
+      precedence,
      seq(
        optional($._before_unary_op),
        field("operator", operator),
@ -875,7 +815,7 @@ function binaryOp($, assoc, precedence, operator, left = null, right = null) {
  );
 }

-function sugarBlock($, start) {
+function callKeywordBlock($, start) {
  return seq(
    start,
    optional($._terminator),
@ -895,8 +835,7 @@ function defineQuoted(start, end, name) {
        start,
        repeat(
          choice(
-            // TODO rename the extenrals to _content
-            alias($[`_quoted_content_i_${name}`], $.string_content),
+            alias($[`_quoted_content_i_${name}`], $.quoted_content),
            $.interpolation,
            $.escape_sequence
          )
@ -909,9 +848,8 @@ function defineQuoted(start, end, name) {
        start,
        repeat(
          choice(
-            // TODO rename the extenrals to _content
-            alias($[`_quoted_content_${name}`], $.string_content),
-            // It's always possible to escape the end delimiter
+            alias($[`_quoted_content_${name}`], $.quoted_content),
+            // The end delimiter may always be escaped
            $.escape_sequence
          )
        ),
--- a/package-lock.json
+++ b/package-lock.json
@ -1,11 +1,11 @@
 {
  "name": "tree-sitter-elixir",
-  "version": "1.0.0",
+  "version": "0.19.0",
  "lockfileVersion": 2,
  "requires": true,
  "packages": {
    "": {
-      "version": "1.0.0",
+      "version": "0.19.0",
      "license": "ISC",
      "dependencies": {
        "nan": "^2.15.0"
--- a/scripts/parse_repo.sh
+++ b/scripts/parse_repo.sh
@ -0,0 +1,34 @@
+#!/bin/bash
+
+set -e
+
+cd "$(dirname "$0")/.."
+
+print_usage_and_exit() {
+  echo "Usage: $0 <github-repo>"
+  echo ""
+  echo "Clones the given repository and runs the parser against all Elixir files"
+  echo ""
+  echo "## Examples"
+  echo ""
+  echo "  $0 elixir-lang/elixir"
+  echo ""
+  exit 1
+}
+
+if [ $# -ne 1 ]; then
+  print_usage_and_exit
+fi
+
+gh_repo="$1"
+
+dir="tmp/gh/${gh_repo//[\/-]/_}"
+
+if [[ ! -d "$dir" ]]; then
+  mkdir -p "$(dirname "$dir")"
+  git clone "https://github.com/$gh_repo.git" "$dir"
+fi
+
+echo "Running parser against $gh_repo"
+
+npx tree-sitter parse --quiet --stat "$dir/**/*.ex*"
--- a/src/grammar.json
+++ b/src/grammar.json
--- a/src/node-types.json
+++ b/src/node-types.json
@ -79,6 +79,10 @@
          "type": "nil",
          "named": true
        },
+        {
+          "type": "quoted_atom",
+          "named": true
+        },
        {
          "type": "sigil",
          "named": true
@ -186,6 +190,10 @@
          "type": "nil",
          "named": true
        },
+        {
+          "type": "quoted_atom",
+          "named": true
+        },
        {
          "type": "sigil",
          "named": true
@ -217,11 +225,6 @@
      ]
    }
  },
-  {
-    "type": "alias",
-    "named": true,
-    "fields": {}
-  },
  {
    "type": "anonymous_function",
    "named": true,
@ -321,6 +324,10 @@
          "type": "nil",
          "named": true
        },
+        {
+          "type": "quoted_atom",
+          "named": true
+        },
        {
          "type": "sigil",
          "named": true
@ -348,38 +355,6 @@
      ]
    }
  },
-  {
-    "type": "atom",
-    "named": true,
-    "fields": {},
-    "children": {
-      "multiple": true,
-      "required": false,
-      "types": [
-        {
-          "type": "atom_literal",
-          "named": true
-        },
-        {
-          "type": "escape_sequence",
-          "named": true
-        },
-        {
-          "type": "interpolation",
-          "named": true
-        },
-        {
-          "type": "string_content",
-          "named": true
-        }
-      ]
-    }
-  },
-  {
-    "type": "atom_literal",
-    "named": true,
-    "fields": {}
-  },
  {
    "type": "binary_operator",
    "named": true,
@ -464,6 +439,10 @@
            "type": "operator_identifier",
            "named": true
          },
+          {
+            "type": "quoted_atom",
+            "named": true
+          },
          {
            "type": "sigil",
            "named": true
@ -756,6 +735,10 @@
            "type": "nil",
            "named": true
          },
+          {
+            "type": "quoted_atom",
+            "named": true
+          },
          {
            "type": "sigil",
            "named": true
@ -863,6 +846,10 @@
          "type": "nil",
          "named": true
        },
+        {
+          "type": "quoted_atom",
+          "named": true
+        },
        {
          "type": "sigil",
          "named": true
@ -974,6 +961,10 @@
          "type": "nil",
          "named": true
        },
+        {
+          "type": "quoted_atom",
+          "named": true
+        },
        {
          "type": "sigil",
          "named": true
@ -1081,6 +1072,10 @@
          "type": "nil",
          "named": true
        },
+        {
+          "type": "quoted_atom",
+          "named": true
+        },
        {
          "type": "sigil",
          "named": true
@ -1118,7 +1113,7 @@
    "fields": {},
    "children": {
      "multiple": true,
-      "required": false,
+      "required": true,
      "types": [
        {
          "type": "access_call",
@ -1192,6 +1187,10 @@
          "type": "nil",
          "named": true
        },
+        {
+          "type": "quoted_atom",
+          "named": true
+        },
        {
          "type": "sigil",
          "named": true
@ -1230,7 +1229,7 @@
    "fields": {},
    "children": {
      "multiple": true,
-      "required": true,
+      "required": false,
      "types": [
        {
          "type": "arguments",
@ -1343,6 +1342,10 @@
          "type": "nil",
          "named": true
        },
+        {
+          "type": "quoted_atom",
+          "named": true
+        },
        {
          "type": "sigil",
          "named": true
@ -1391,7 +1394,7 @@
          "named": true
        },
        {
-          "type": "string_content",
+          "type": "quoted_content",
          "named": true
        }
      ]
@ -1489,6 +1492,10 @@
          "type": "nil",
          "named": true
        },
+        {
+          "type": "quoted_atom",
+          "named": true
+        },
        {
          "type": "rescue_block",
          "named": true
@ -1608,6 +1615,10 @@
          "type": "operator_identifier",
          "named": true
        },
+        {
+          "type": "quoted_atom",
+          "named": true
+        },
        {
          "type": "sigil",
          "named": true
@ -1715,6 +1726,10 @@
          "type": "nil",
          "named": true
        },
+        {
+          "type": "quoted_atom",
+          "named": true
+        },
        {
          "type": "sigil",
          "named": true
@ -1831,6 +1846,10 @@
          "type": "nil",
          "named": true
        },
+        {
+          "type": "quoted_atom",
+          "named": true
+        },
        {
          "type": "sigil",
          "named": true
@ -1858,33 +1877,6 @@
      ]
    }
  },
-  {
-    "type": "keyword",
-    "named": true,
-    "fields": {},
-    "children": {
-      "multiple": true,
-      "required": false,
-      "types": [
-        {
-          "type": "atom_literal",
-          "named": true
-        },
-        {
-          "type": "escape_sequence",
-          "named": true
-        },
-        {
-          "type": "interpolation",
-          "named": true
-        },
-        {
-          "type": "string_content",
-          "named": true
-        }
-      ]
-    }
-  },
  {
    "type": "keywords",
    "named": true,
@ -1984,6 +1976,10 @@
          "type": "nil",
          "named": true
        },
+        {
+          "type": "quoted_atom",
+          "named": true
+        },
        {
          "type": "sigil",
          "named": true
@ -2114,6 +2110,10 @@
          "type": "nil",
          "named": true
        },
+        {
+          "type": "quoted_atom",
+          "named": true
+        },
        {
          "type": "sigil",
          "named": true
@ -2235,6 +2235,14 @@
          "type": "nil",
          "named": true
        },
+        {
+          "type": "quoted_atom",
+          "named": true
+        },
+        {
+          "type": "quoted_keyword",
+          "named": true
+        },
        {
          "type": "sigil",
          "named": true
@ -2262,6 +2270,52 @@
      ]
    }
  },
+  {
+    "type": "quoted_atom",
+    "named": true,
+    "fields": {},
+    "children": {
+      "multiple": true,
+      "required": false,
+      "types": [
+        {
+          "type": "escape_sequence",
+          "named": true
+        },
+        {
+          "type": "interpolation",
+          "named": true
+        },
+        {
+          "type": "quoted_content",
+          "named": true
+        }
+      ]
+    }
+  },
+  {
+    "type": "quoted_keyword",
+    "named": true,
+    "fields": {},
+    "children": {
+      "multiple": true,
+      "required": false,
+      "types": [
+        {
+          "type": "escape_sequence",
+          "named": true
+        },
+        {
+          "type": "interpolation",
+          "named": true
+        },
+        {
+          "type": "quoted_content",
+          "named": true
+        }
+      ]
+    }
+  },
  {
    "type": "rescue_block",
    "named": true,
@ -2342,6 +2396,10 @@
          "type": "nil",
          "named": true
        },
+        {
+          "type": "quoted_atom",
+          "named": true
+        },
        {
          "type": "sigil",
          "named": true
@ -2389,6 +2447,10 @@
          "type": "interpolation",
          "named": true
        },
+        {
+          "type": "quoted_content",
+          "named": true
+        },
        {
          "type": "sigil_modifiers",
          "named": true
@ -2396,10 +2458,6 @@
        {
          "type": "sigil_name",
          "named": true
-        },
-        {
-          "type": "string_content",
-          "named": true
        }
      ]
    }
@ -2484,6 +2542,10 @@
          "type": "nil",
          "named": true
        },
+        {
+          "type": "quoted_atom",
+          "named": true
+        },
        {
          "type": "sigil",
          "named": true
@ -2556,7 +2618,7 @@
          "named": true
        },
        {
-          "type": "string_content",
+          "type": "quoted_content",
          "named": true
        }
      ]
@ -2590,6 +2652,10 @@
          "type": "identifier",
          "named": true
        },
+        {
+          "type": "quoted_atom",
+          "named": true
+        },
        {
          "type": "special_identifier",
          "named": true
@ -2689,6 +2755,10 @@
          "type": "nil",
          "named": true
        },
+        {
+          "type": "quoted_atom",
+          "named": true
+        },
        {
          "type": "sigil",
          "named": true
@ -2835,6 +2905,10 @@
          "type": "nil",
          "named": true
        },
+        {
+          "type": "quoted_atom",
+          "named": true
+        },
        {
          "type": "sigil",
          "named": true
@ -2862,10 +2936,6 @@
      ]
    }
  },
-  {
-    "type": "\n",
-    "named": false
-  },
  {
    "type": "!",
    "named": false
@ -2894,10 +2964,6 @@
    "type": "%",
    "named": false
  },
-  {
-    "type": "%{}",
-    "named": false
-  },
  {
    "type": "&",
    "named": false
@ -2978,10 +3044,6 @@
    "type": "...",
    "named": false
  },
-  {
-    "type": "..//",
-    "named": false
-  },
  {
    "type": "/",
    "named": false
@ -3014,10 +3076,6 @@
    "type": "<<<",
    "named": false
  },
-  {
-    "type": "<<>>",
-    "named": false
-  },
  {
    "type": "<<~",
    "named": false
@ -3130,10 +3188,18 @@
    "type": "after",
    "named": false
  },
+  {
+    "type": "alias",
+    "named": true
+  },
  {
    "type": "and",
    "named": false
  },
+  {
+    "type": "atom",
+    "named": true
+  },
  {
    "type": "catch",
    "named": false
@ -3182,6 +3248,10 @@
    "type": "integer",
    "named": true
  },
+  {
+    "type": "keyword",
+    "named": true
+  },
  {
    "type": "nil",
    "named": false
@ -3194,6 +3264,10 @@
    "type": "or",
    "named": false
  },
+  {
+    "type": "quoted_content",
+    "named": true
+  },
  {
    "type": "rescue",
    "named": false
@ -3206,10 +3280,6 @@
    "type": "sigil_name",
    "named": true
  },
-  {
-    "type": "string_content",
-    "named": true
-  },
  {
    "type": "true",
    "named": false
@ -3226,10 +3296,6 @@
    "type": "{",
    "named": false
  },
-  {
-    "type": "{}",
-    "named": false
-  },
  {
    "type": "|",
    "named": false
--- a/src/parser.c
+++ b/src/parser.c
--- a/src/scanner.cc
+++ b/src/scanner.cc
--- a/test/corpus/comment.txt
+++ b/test/corpus/comment.txt
@ -91,7 +91,7 @@ does not match inside a string

 (source
  (string
-    (string_content))
+    (quoted_content))
  (string
-    (string_content)
+    (quoted_content)
    (interpolation (identifier))))
--- a/test/corpus/do_end.txt
+++ b/test/corpus/do_end.txt
@ -223,8 +223,7 @@ end
          (identifier)
          (integer))
        (body
-          (atom
-            (atom_literal)))))))
+          (atom))))))

 =====================================
 stab clause / arguments in parentheses
@ -245,8 +244,7 @@ end
          (identifier)
          (identifier))
        (body
-          (atom
-            (atom_literal)))))))
+          (atom))))))

 =====================================
 stab clause / many clauses
@ -268,20 +266,17 @@ end
        (arguments
          (integer))
        (body
-          (atom
-            (atom_literal))))
+          (atom)))
      (stab_clause
        (arguments
          (integer))
        (body
-          (atom
-            (atom_literal))))
+          (atom)))
      (stab_clause
        (arguments
          (identifier))
        (body
-          (atom
-            (atom_literal)))))))
+          (atom))))))

 =====================================
 stab clause / multiline expression
@ -327,8 +322,7 @@ end
            (call
              (identifier)
              (arguments))
-            (atom
-              (atom_literal))))
+            (atom)))
        (body
          (boolean))))))

@ -773,8 +767,7 @@ end
          (arguments
            (keywords
              (pair
-                (keyword
-                  (atom_literal))
+                (keyword)
                (integer))))
          (body
            (identifier)))))))
--- a/test/corpus/edge_syntax.txt
+++ b/test/corpus/edge_syntax.txt
@ -114,6 +114,5 @@ def Mod.fun(x), do: 1
          (identifier)))
      (keywords
        (pair
-          (keyword
-            (atom_literal))
+          (keyword)
          (integer))))))
--- a/test/corpus/expression/anonymous_function.txt
+++ b/test/corpus/expression/anonymous_function.txt
@ -142,20 +142,17 @@ end
      (arguments
        (integer))
      (body
-        (atom
-          (atom_literal))))
+        (atom)))
    (stab_clause
      (arguments
        (integer))
      (body
-        (atom
-          (atom_literal))))
+        (atom)))
    (stab_clause
      (arguments
        (identifier))
      (body
-        (atom
-          (atom_literal))))))
+        (atom)))))

 =====================================
 with guard / no arguments
@ -176,8 +173,7 @@ end
          (call
            (identifier)
            (arguments))
-          (atom
-            (atom_literal))))
+          (atom)))
      (body
        (boolean)))))

@ -305,8 +301,7 @@ end
            (map_content
              (keywords
                (pair
-                  (keyword
-                    (atom_literal))
+                  (keyword)
                  (identifier))))))
        (binary_operator
          (identifier)
--- a/test/corpus/expression/call.txt
+++ b/test/corpus/expression/call.txt
@ -33,12 +33,10 @@ fun([1, 2], option: true, other: 5)
        (integer))
      (keywords
        (pair
-          (keyword
-            (atom_literal))
+          (keyword)
          (boolean))
        (pair
-          (keyword
-            (atom_literal))
+          (keyword)
          (integer))))))

 =====================================
@ -69,20 +67,17 @@ fun +: 1
        (integer))
      (keywords
        (pair
-          (keyword
-            (atom_literal))
+          (keyword)
          (boolean))
        (pair
-          (keyword
-            (atom_literal))
+          (keyword)
          (integer)))))
    (call
      (identifier)
      (arguments
        (keywords
          (pair
-            (keyword
-              (atom_literal))
+            (keyword)
            (integer))))))

 =====================================
@ -104,12 +99,10 @@ fun [1, 2],
        (integer))
      (keywords
        (pair
-          (keyword
-            (atom_literal))
+          (keyword)
          (boolean))
        (pair
-          (keyword
-            (atom_literal))
+          (keyword)
          (integer))))))

 =====================================
@ -155,8 +148,7 @@ outer_fun inner_fun do: 1
        (arguments
          (keywords
            (pair
-              (keyword
-                (atom_literal))
+              (keyword)
              (integer))))))))

 =====================================
@ -270,12 +262,10 @@ Mod.fun([1, 2], option: true, other: 5)
        (integer))
      (keywords
        (pair
-          (keyword
-            (atom_literal))
+          (keyword)
          (boolean))
        (pair
-          (keyword
-            (atom_literal))
+          (keyword)
          (integer))))))

 =====================================
@ -304,12 +294,10 @@ Mod.fun [1, 2], option: true, other: 5
        (integer))
      (keywords
        (pair
-          (keyword
-            (atom_literal))
+          (keyword)
          (boolean))
        (pair
-          (keyword
-            (atom_literal))
+          (keyword)
          (integer))))))

 =====================================
@ -459,14 +447,14 @@ Mod.'fun'(a)
    (dot
      (alias)
      (string
-        (string_content)))
+        (quoted_content)))
    (arguments
      (identifier)))
  (call
    (dot
      (alias)
      (charlist
-        (string_content)))
+        (quoted_content)))
    (arguments
      (identifier))))

@ -482,15 +470,14 @@ remote call / atom literal module
 (source
  (call
    (dot
-      (atom
-        (atom_literal))
+      (atom)
      (identifier))
    (arguments
      (identifier)))
  (call
    (dot
-      (atom
-        (string_content))
+      (quoted_atom
+        (quoted_content))
      (identifier))
    (arguments
      (identifier))))
@ -533,12 +520,10 @@ fun.([1, 2], option: true, other: 5)
        (integer))
      (keywords
        (pair
-          (keyword
-            (atom_literal))
+          (keyword)
          (boolean))
        (pair
-          (keyword
-            (atom_literal))
+          (keyword)
          (integer))))))

 =====================================
@ -755,12 +740,10 @@ fun(option: true, other: 5,)
    (arguments
      (keywords
        (pair
-          (keyword
-            (atom_literal))
+          (keyword)
          (boolean))
        (pair
-          (keyword
-            (atom_literal))
+          (keyword)
          (integer))))))

 =====================================
@ -812,8 +795,7 @@ map[:key]
    (identifier))
  (access_call
    (identifier)
-    (atom
-      (atom_literal))))
+    (atom)))

 =====================================
 access syntax / does not allow whitespace
@ -845,14 +827,12 @@ map[:mod].fun
      (dot
        (identifier)
        (identifier)))
-    (atom
-      (atom_literal)))
+    (atom))
  (call
    (dot
      (access_call
        (identifier)
-        (atom
-          (atom_literal)))
+        (atom))
      (identifier))))

 =====================================
@ -870,23 +850,19 @@ access syntax / precedence with operators
  (unary_operator
    (access_call
      (identifier)
-      (atom
-        (atom_literal))))
+      (atom)))
  (access_call
    (unary_operator
      (identifier))
-    (atom
-      (atom_literal)))
+    (atom))
  (unary_operator
    (access_call
      (identifier)
-      (atom
-        (atom_literal))))
+      (atom)))
  (access_call
    (unary_operator
      (integer))
-    (atom
-      (atom_literal))))
+    (atom)))

 =====================================
 double parenthesised call
--- a/test/corpus/expression/operator.txt
+++ b/test/corpus/expression/operator.txt
@ -589,15 +589,11 @@ not in[y]
    (list
      (identifier)))
  (binary_operator
-    (atom
-      (atom_literal))
-    (atom
-      (atom_literal)))
+    (atom)
+    (atom))
  (binary_operator
-    (atom
-      (atom_literal))
-    (atom
-      (atom_literal))))
+    (atom)
+    (atom)))

 =====================================
 multiline / unary over binary (precedence)
--- a/test/corpus/expression/sigil.txt
+++ b/test/corpus/expression/sigil.txt
@ -14,14 +14,14 @@ simple literal
 ---

 (source
-  (sigil (sigil_name) (string_content))
-  (sigil (sigil_name) (string_content))
-  (sigil (sigil_name) (string_content))
-  (sigil (sigil_name) (string_content))
-  (sigil (sigil_name) (string_content))
-  (sigil (sigil_name) (string_content))
-  (sigil (sigil_name) (string_content))
-  (sigil (sigil_name) (string_content)))
+  (sigil (sigil_name) (quoted_content))
+  (sigil (sigil_name) (quoted_content))
+  (sigil (sigil_name) (quoted_content))
+  (sigil (sigil_name) (quoted_content))
+  (sigil (sigil_name) (quoted_content))
+  (sigil (sigil_name) (quoted_content))
+  (sigil (sigil_name) (quoted_content))
+  (sigil (sigil_name) (quoted_content)))


 =====================================
@ -36,7 +36,7 @@ line 2"
 (source
  (sigil
    (sigil_name)
-    (string_content)))
+    (quoted_content)))

 =====================================
 interpolation
@ -53,22 +53,22 @@ interpolation
 (source
  (sigil
    (sigil_name)
-    (string_content)
+    (quoted_content)
    (interpolation
      (identifier))
-    (string_content))
+    (quoted_content))
  (sigil
    (sigil_name)
-    (string_content)
+    (quoted_content)
    (interpolation
      (identifier))
-    (string_content))
+    (quoted_content))
  (sigil
    (sigil_name)
-    (string_content)
+    (quoted_content)
    (interpolation
      (identifier))
-    (string_content)))
+    (quoted_content)))

 =====================================
 nested interpolation
@ -81,14 +81,14 @@ nested interpolation
 (source
  (sigil
    (sigil_name)
-    (string_content)
+    (quoted_content)
    (interpolation
      (sigil
        (sigil_name)
-        (string_content)
+        (quoted_content)
        (interpolation
          (integer))))
-    (string_content)))
+    (quoted_content)))

 =====================================
 escape sequence
@ -101,26 +101,26 @@ escape sequence
 (source
  (sigil
    (sigil_name)
-    (string_content)
+    (quoted_content)
    (escape_sequence)
-    (string_content)
+    (quoted_content)
    (escape_sequence)
-    (string_content)
+    (quoted_content)
    (escape_sequence)
-    (string_content)
+    (quoted_content)
    (escape_sequence)
-    (string_content)
+    (quoted_content)
    (escape_sequence)
-    (string_content)
+    (quoted_content)
    (escape_sequence)
-    (string_content)
+    (quoted_content)
    (escape_sequence)
-    (string_content)
+    (quoted_content)
    (escape_sequence)
-    (string_content)
+    (quoted_content)
    (escape_sequence)
    (escape_sequence)
-    (string_content)))
+    (quoted_content)))

 =====================================
 escaped interpolation
@ -134,7 +134,7 @@ escaped interpolation
  (sigil
    (sigil_name)
    (escape_sequence)
-    (string_content)))
+    (quoted_content)))

 =====================================
 upper sigil / no interpolation
@ -147,7 +147,7 @@ upper sigil / no interpolation
 (source
  (sigil
    (sigil_name)
-    (string_content)))
+    (quoted_content)))

 =====================================
 upper sigil / no escape sequence
@ -160,7 +160,7 @@ upper sigil / no escape sequence
 (source
  (sigil
    (sigil_name)
-    (string_content)))
+    (quoted_content)))

 =====================================
 upper sigil / escape terminator
@ -175,19 +175,19 @@ upper sigil / escape terminator
 (source
  (sigil
    (sigil_name)
-    (string_content)
+    (quoted_content)
    (escape_sequence)
-    (string_content))
+    (quoted_content))
  (sigil
    (sigil_name)
-    (string_content)
+    (quoted_content)
    (escape_sequence)
-    (string_content))
+    (quoted_content))
  (sigil
    (sigil_name)
-    (string_content)
+    (quoted_content)
    (escape_sequence)
-    (string_content)))
+    (quoted_content)))

 =====================================
 heredoc delimiter
@ -208,10 +208,10 @@ with 'quotes'
 (source
  (sigil
    (sigil_name)
-    (string_content))
+    (quoted_content))
  (sigil
    (sigil_name)
-    (string_content)))
+    (quoted_content)))

 =====================================
 modifiers
@ -225,11 +225,11 @@ modifiers
 (source
  (sigil
    (sigil_name)
-    (string_content)
+    (quoted_content)
    (sigil_modifiers))
  (sigil
    (sigil_name)
-    (string_content)
+    (quoted_content)
    (sigil_modifiers)))

 =====================================
@ -244,4 +244,4 @@ modifiers
  (sigil
    (sigil_name)
    (ERROR)
-    (string_content)))
+    (quoted_content)))
--- a/test/corpus/integration/function_definition.txt
+++ b/test/corpus/integration/function_definition.txt
@ -184,8 +184,7 @@ def fun(x), do: x
        (arguments))
      (keywords
        (pair
-          (keyword
-            (atom_literal))
+          (keyword)
          (integer)))))
  (call
    (identifier)
@ -196,8 +195,7 @@ def fun(x), do: x
          (identifier)))
      (keywords
        (pair
-          (keyword
-            (atom_literal))
+          (keyword)
          (identifier))))))

 =====================================
--- a/test/corpus/integration/kernel.txt
+++ b/test/corpus/integration/kernel.txt
@ -17,8 +17,7 @@ for n <- [1, 2], do: n * 2
          (integer)))
      (keywords
        (pair
-          (keyword
-            (atom_literal))
+          (keyword)
          (binary_operator
            (identifier)
            (integer)))))))
@ -46,8 +45,7 @@ end
          (arguments)))
      (keywords
        (pair
-          (keyword
-            (atom_literal))
+          (keyword)
          (call
            (dot
              (alias)
@ -77,18 +75,16 @@ for <<c <- " hello world ">>, c != ?\s, into: "", do: <<c>>
        (binary_operator
          (identifier)
          (string
-            (string_content))))
+            (quoted_content))))
      (binary_operator
        (identifier)
        (char))
      (keywords
        (pair
-          (keyword
-            (atom_literal))
+          (keyword)
          (string))
        (pair
-          (keyword
-            (atom_literal))
+          (keyword)
          (bitstring
            (identifier)))))))

@ -114,8 +110,7 @@ end
          (integer)))
      (keywords
        (pair
-          (keyword
-            (atom_literal))
+          (keyword)
          (map))))
    (do_block
      (stab_clause
--- a/test/corpus/integration/module_definition.txt
+++ b/test/corpus/integration/module_definition.txt
@ -35,8 +35,7 @@ end
  (call
    (identifier)
    (arguments
-      (atom
-        (atom_literal)))
+      (atom))
    (do_block)))

 =====================================
@ -76,7 +75,7 @@ end
          (identifier)
          (arguments
            (string
-              (string_content)))))
+              (quoted_content)))))
      (call
        (identifier)
        (arguments
@ -91,7 +90,7 @@ end
          (identifier)
          (arguments
            (string
-              (string_content)))))
+              (quoted_content)))))
      (unary_operator
        (call
          (identifier)
@ -133,8 +132,7 @@ end
              (identifier)))
          (keywords
            (pair
-              (keyword
-                (atom_literal))
+              (keyword)
              (binary_operator
                (identifier)
                (identifier)))))))))
--- a/test/corpus/integration/spec.txt
+++ b/test/corpus/integration/spec.txt
@ -71,17 +71,14 @@ with literals
                (map_content
                  (keywords
                    (pair
-                      (keyword
-                        (atom_literal))
+                      (keyword)
                      (identifier)))))))
          (binary_operator
            (tuple
-              (atom
-                (atom_literal))
+              (atom)
              (identifier))
            (tuple
-              (atom
-                (atom_literal))
+              (atom)
              (identifier))))))))

 =====================================
@ -166,12 +163,10 @@ with type guard
              (identifier)))
          (keywords
            (pair
-              (keyword
-                (atom_literal))
+              (keyword)
              (identifier))
            (pair
-              (keyword
-                (atom_literal))
+              (keyword)
              (identifier))))))))

 =====================================
@ -245,7 +240,6 @@ nonempty list
            (identifier))
          (keywords
            (pair
-              (keyword
-                (atom_literal))
+              (keyword)
              (identifier)))))))
  (ERROR))
--- a/test/corpus/term/alias.txt
+++ b/test/corpus/term/alias.txt
@ -82,17 +82,3 @@ __MODULE__.Child
  (dot
    (special_identifier)
    (alias)))
-
-=====================================
-[error] does not support characters outside ASCII
-=====================================
-
-Ólá
-Olá
-
---
-
-(source
-  (ERROR
-    (atom_literal)
-    (atom_literal)))
--- a/test/corpus/term/atom.txt
+++ b/test/corpus/term/atom.txt
@ -11,16 +11,11 @@ simple literal
 ---

 (source
-  (atom
-    (atom_literal))
-  (atom
-    (atom_literal))
-  (atom
-    (atom_literal))
-  (atom
-    (atom_literal))
-  (atom
-    (atom_literal)))
+  (atom)
+  (atom)
+  (atom)
+  (atom)
+  (atom))

 =====================================
 operators
@ -31,7 +26,7 @@ operators
 ---

 (source
-  (list (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal)) (atom (atom_literal))))
+  (list (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom) (atom)))

 =====================================
 special operator-like atoms
@ -43,18 +38,12 @@ special operator-like atoms

 (source
  (list
-    (atom
-      (atom_literal))
-    (atom
-      (atom_literal))
-    (atom
-      (atom_literal))
-    (atom
-      (atom_literal))
-    (atom
-      (atom_literal))
-    (atom
-      (atom_literal))))
+    (atom)
+    (atom)
+    (atom)
+    (atom)
+    (atom)
+    (atom)))

 =====================================
 quoted atom
@ -66,11 +55,11 @@ quoted atom
 ---

 (source
-  (atom
-    (string_content)
+  (quoted_atom
+    (quoted_content)
    (escape_sequence))
-  (atom
-    (string_content)
+  (quoted_atom
+    (quoted_content)
    (escape_sequence)))

 =====================================
@ -83,13 +72,13 @@ interpolation
 ---

 (source
-  (atom
-    (string_content)
+  (quoted_atom
+    (quoted_content)
    (interpolation
      (identifier))
-    (string_content))
-  (atom
-    (string_content)
+    (quoted_content))
+  (quoted_atom
+    (quoted_content)
    (interpolation
      (identifier))
-    (string_content)))
+    (quoted_content)))
--- a/test/corpus/term/bitstring.txt
+++ b/test/corpus/term/bitstring.txt
@ -17,7 +17,7 @@ single item
    (float))
  (bitstring
    (string
-      (string_content))))
+      (quoted_content))))

 =====================================
 multiple items
@ -36,7 +36,7 @@ multiple items
    (integer)
    (float)
    (string
-      (string_content))))
+      (quoted_content))))

 =====================================
 size modifiers
@ -77,21 +77,21 @@ multiple modifiers
  (bitstring
    (binary_operator
      (string
-        (string_content))
+        (quoted_content))
      (binary_operator
        (identifier)
        (identifier))))
  (bitstring
    (binary_operator
      (string
-        (string_content))
+        (quoted_content))
      (binary_operator
        (identifier)
        (identifier))))
  (bitstring
    (binary_operator
      (string
-        (string_content))
+        (quoted_content))
      (binary_operator
        (identifier)
        (identifier))))
@ -136,7 +136,7 @@ multiple components with modifiers
        (integer)
        (identifier)))
    (string
-      (string_content))
+      (quoted_content))
    (binary_operator
      (float)
      (identifier))
--- a/test/corpus/term/charlist.txt
+++ b/test/corpus/term/charlist.txt
@ -8,7 +8,7 @@ single line

 (source
  (charlist
-    (string_content)))
+    (quoted_content)))

 =====================================
 multiple lines
@ -21,7 +21,7 @@ line 2'

 (source
  (charlist
-    (string_content)))
+    (quoted_content)))

 =====================================
 interpolation
@ -37,20 +37,20 @@ interpolation

 (source
  (charlist
-    (string_content)
+    (quoted_content)
    (interpolation
      (identifier))
-    (string_content))
+    (quoted_content))
  (charlist
-    (string_content)
+    (quoted_content)
    (interpolation
      (identifier))
-    (string_content))
+    (quoted_content))
  (charlist
-    (string_content)
+    (quoted_content)
    (interpolation
      (identifier))
-    (string_content)))
+    (quoted_content)))

 =====================================
 nested interpolation
@ -62,13 +62,13 @@ nested interpolation

 (source
  (charlist
-    (string_content)
+    (quoted_content)
    (interpolation
      (charlist
-        (string_content)
+        (quoted_content)
        (interpolation
          (integer))))
-    (string_content)))
+    (quoted_content)))

 =====================================
 escape sequence
@ -80,26 +80,26 @@ escape sequence

 (source
  (charlist
-    (string_content)
+    (quoted_content)
    (escape_sequence)
-    (string_content)
+    (quoted_content)
    (escape_sequence)
-    (string_content)
+    (quoted_content)
    (escape_sequence)
-    (string_content)
+    (quoted_content)
    (escape_sequence)
-    (string_content)
+    (quoted_content)
    (escape_sequence)
-    (string_content)
+    (quoted_content)
    (escape_sequence)
-    (string_content)
+    (quoted_content)
    (escape_sequence)
-    (string_content)
+    (quoted_content)
    (escape_sequence)
-    (string_content)
+    (quoted_content)
    (escape_sequence)
    (escape_sequence)
-    (string_content)))
+    (quoted_content)))

 =====================================
 escaped interpolation
@ -112,7 +112,7 @@ escaped interpolation
 (source
  (charlist
    (escape_sequence)
-    (string_content)))
+    (quoted_content)))

 =====================================
 heredoc / charlist
@ -127,7 +127,7 @@ with 'quotes'

 (source
  (charlist
-    (string_content)))
+    (quoted_content)))

 =====================================
 heredoc / interpolation
@ -141,10 +141,10 @@ hey #{name}!

 (source
  (charlist
-    (string_content)
+    (quoted_content)
    (interpolation
      (identifier))
-    (string_content)))
+    (quoted_content)))

 =====================================
 heredoc / nested interpolation
@ -162,14 +162,14 @@ this is #{

 (source
  (charlist
-    (string_content)
+    (quoted_content)
    (interpolation
      (charlist
-        (string_content)
+        (quoted_content)
        (interpolation
          (integer))
-        (string_content)))
-    (string_content)))
+        (quoted_content)))
+    (quoted_content)))

 =====================================
 heredoc / escaped delimiter
@ -187,15 +187,15 @@ heredoc / escaped delimiter

 (source
  (charlist
-    (string_content)
+    (quoted_content)
    (escape_sequence)
-    (string_content))
+    (quoted_content))
  (charlist
-    (string_content)
+    (quoted_content)
    (escape_sequence)
    (escape_sequence)
    (escape_sequence)
-    (string_content)))
+    (quoted_content)))

 =====================================
 heredoc / escaped interpolation
@ -209,6 +209,6 @@ heredoc / escaped interpolation

 (source
  (charlist
-    (string_content)
+    (quoted_content)
    (escape_sequence)
-    (string_content)))
+    (quoted_content)))
--- a/test/corpus/term/keyword_list.txt
+++ b/test/corpus/term/keyword_list.txt
@ -10,24 +10,19 @@ simple literal
  (list
    (keywords
      (pair
-        (keyword
-          (atom_literal))
+        (keyword)
        (integer))
      (pair
-        (keyword
-          (atom_literal))
+        (keyword)
        (integer))
      (pair
-        (keyword
-          (atom_literal))
+        (keyword)
        (integer))
      (pair
-        (keyword
-          (atom_literal))
+        (keyword)
        (integer))
      (pair
-        (keyword
-          (atom_literal))
+        (keyword)
        (integer)))))

 =====================================
@ -42,8 +37,7 @@ trailing separator
  (list
    (keywords
      (pair
-        (keyword
-          (atom_literal))
+        (keyword)
        (integer)))))

 =====================================
@ -58,17 +52,14 @@ with leading items
  (list
    (integer)
    (tuple
-      (atom
-        (atom_literal))
+      (atom)
      (integer))
    (keywords
      (pair
-        (keyword
-          (atom_literal))
+        (keyword)
        (integer))
      (pair
-        (keyword
-          (atom_literal))
+        (keyword)
        (integer)))))

 =====================================
@ -83,16 +74,13 @@ operator key
  (list
    (keywords
      (pair
-        (keyword
-          (atom_literal))
+        (keyword)
        (integer))
      (pair
-        (keyword
-          (atom_literal))
+        (keyword)
        (integer))
      (pair
-        (keyword
-          (atom_literal))
+        (keyword)
        (integer)))))

 =====================================
@ -107,28 +95,22 @@ special atom key
  (list
    (keywords
      (pair
-        (keyword
-          (atom_literal))
+        (keyword)
        (integer))
      (pair
-        (keyword
-          (atom_literal))
+        (keyword)
        (integer))
      (pair
-        (keyword
-          (atom_literal))
+        (keyword)
        (integer))
      (pair
-        (keyword
-          (atom_literal))
+        (keyword)
        (integer))
      (pair
-        (keyword
-          (atom_literal))
+        (keyword)
        (integer))
      (pair
-        (keyword
-          (atom_literal))
+        (keyword)
        (integer)))))

 =====================================
@ -144,22 +126,18 @@ reserved token key
  (list
    (keywords
      (pair
-        (keyword
-          (atom_literal))
+        (keyword)
        (integer))
      (pair
-        (keyword
-          (atom_literal))
+        (keyword)
        (integer))))
  (list
    (keywords
      (pair
-        (keyword
-          (atom_literal))
+        (keyword)
        (integer))
      (pair
-        (keyword
-          (atom_literal))
+        (keyword)
        (integer)))))

 =====================================
@ -177,13 +155,13 @@ quoted key
  (list
    (keywords
      (pair
-        (keyword
-          (string_content)
+        (quoted_keyword
+          (quoted_content)
          (escape_sequence))
        (integer))
      (pair
-        (keyword
-          (string_content)
+        (quoted_keyword
+          (quoted_content)
          (escape_sequence))
        (integer)))))

@ -202,18 +180,18 @@ key interpolation
  (list
    (keywords
      (pair
-        (keyword
-          (string_content)
+        (quoted_keyword
+          (quoted_content)
          (interpolation
            (identifier))
-          (string_content))
+          (quoted_content))
        (integer))
      (pair
-        (keyword
-          (string_content)
+        (quoted_keyword
+          (quoted_content)
          (interpolation
            (identifier))
-          (string_content))
+          (quoted_content))
        (integer)))))

 =====================================
@ -226,15 +204,14 @@ key interpolation

 (source
  (list
-    (keywords
-      (pair
-        (keyword
-          (atom_literal))
-        (integer))
-      (pair
-        (keyword
-          (atom_literal))
-        (integer)))
    (ERROR
+      (keywords
+        (pair
+          (keyword)
+          (integer))
+        (pair
+          (keyword)
+          (integer))))
+    (binary_operator
      (integer)
      (integer))))
--- a/test/corpus/term/map.txt
+++ b/test/corpus/term/map.txt
@ -22,12 +22,10 @@ from keywords
    (map_content
      (keywords
        (pair
-          (keyword
-            (atom_literal))
+          (keyword)
          (integer))
        (pair
-          (keyword
-            (atom_literal))
+          (keyword)
          (integer))))))

 =====================================
@ -42,12 +40,11 @@ from arrow entries
  (map
    (map_content
      (binary_operator
-        (atom
-          (atom_literal))
+        (atom)
        (integer))
      (binary_operator
        (string
-          (string_content))
+          (quoted_content))
        (integer))
      (binary_operator
        (identifier)
@ -66,16 +63,14 @@ from both arrow entries and keywords
    (map_content
      (binary_operator
        (string
-          (string_content))
+          (quoted_content))
        (integer))
      (keywords
        (pair
-          (keyword
-            (atom_literal))
+          (keyword)
          (integer))
        (pair
-          (keyword
-            (atom_literal))
+          (keyword)
          (integer))))))

 =====================================
@ -91,7 +86,7 @@ trailing separator
    (map_content
      (binary_operator
        (string
-          (string_content))
+          (quoted_content))
        (integer)))))

 =====================================
@ -110,24 +105,22 @@ update syntax
        (identifier)
        (keywords
          (pair
-            (keyword
-              (atom_literal))
+            (keyword)
            (string
-              (string_content)))
+              (quoted_content)))
          (pair
-            (keyword
-              (atom_literal))
+            (keyword)
            (string
-              (string_content)))))))
+              (quoted_content)))))))
  (map
    (map_content
      (binary_operator
        (identifier)
        (binary_operator
          (string
-            (string_content))
+            (quoted_content))
          (string
-            (string_content)))))))
+            (quoted_content)))))))

 =====================================
 [error] ordering
@ -139,19 +132,18 @@ update syntax

 (source
  (map
-    (map_content
+    (ERROR
      (keywords
        (pair
-          (keyword
-            (atom_literal))
+          (keyword)
          (integer))
        (pair
-          (keyword
-            (atom_literal))
+          (keyword)
          (integer))))
-    (ERROR
-      (integer)
-      (integer))))
+    (map_content
+      (binary_operator
+        (integer)
+        (integer)))))

 =====================================
 [error] missing separator
@ -166,9 +158,9 @@ update syntax
    (map_content
      (binary_operator
        (string
-          (string_content))
+          (quoted_content))
        (ERROR (integer))
        (binary_operator
          (string
-            (string_content))
+            (quoted_content))
          (integer))))))
--- a/test/corpus/term/string.txt
+++ b/test/corpus/term/string.txt
@ -8,7 +8,7 @@ single line

 (source
  (string
-    (string_content)))
+    (quoted_content)))

 =====================================
 multiple lines
@ -21,7 +21,7 @@ line 2"

 (source
  (string
-    (string_content)))
+    (quoted_content)))

 =====================================
 interpolation
@ -37,20 +37,20 @@ interpolation

 (source
  (string
-    (string_content)
+    (quoted_content)
    (interpolation
      (identifier))
-    (string_content))
+    (quoted_content))
  (string
-    (string_content)
+    (quoted_content)
    (interpolation
      (identifier))
-    (string_content))
+    (quoted_content))
  (string
-    (string_content)
+    (quoted_content)
    (interpolation
      (identifier))
-    (string_content)))
+    (quoted_content)))

 =====================================
 nested interpolation
@ -62,13 +62,13 @@ nested interpolation

 (source
  (string
-    (string_content)
+    (quoted_content)
    (interpolation
      (string
-        (string_content)
+        (quoted_content)
        (interpolation
          (integer))))
-    (string_content)))
+    (quoted_content)))

 =====================================
 escape sequence
@ -80,26 +80,26 @@ escape sequence

 (source
  (string
-    (string_content)
+    (quoted_content)
    (escape_sequence)
-    (string_content)
+    (quoted_content)
    (escape_sequence)
-    (string_content)
+    (quoted_content)
    (escape_sequence)
-    (string_content)
+    (quoted_content)
    (escape_sequence)
-    (string_content)
+    (quoted_content)
    (escape_sequence)
-    (string_content)
+    (quoted_content)
    (escape_sequence)
-    (string_content)
+    (quoted_content)
    (escape_sequence)
-    (string_content)
+    (quoted_content)
    (escape_sequence)
-    (string_content)
+    (quoted_content)
    (escape_sequence)
    (escape_sequence)
-    (string_content)))
+    (quoted_content)))

 =====================================
 escaped interpolation
@ -112,7 +112,7 @@ escaped interpolation
 (source
  (string
    (escape_sequence)
-    (string_content)))
+    (quoted_content)))

 =====================================
 heredoc / string
@ -127,7 +127,7 @@ with "quotes"

 (source
  (string
-    (string_content)))
+    (quoted_content)))

 =====================================
 heredoc / interpolation
@ -141,10 +141,10 @@ hey #{name}!

 (source
  (string
-    (string_content)
+    (quoted_content)
    (interpolation
      (identifier))
-    (string_content)))
+    (quoted_content)))

 =====================================
 heredoc / nested interpolation
@ -162,14 +162,14 @@ this is #{

 (source
  (string
-    (string_content)
+    (quoted_content)
    (interpolation
      (string
-        (string_content)
+        (quoted_content)
        (interpolation
          (integer))
-        (string_content)))
-    (string_content)))
+        (quoted_content)))
+    (quoted_content)))

 =====================================
 heredoc / escaped delimiter
@ -187,15 +187,15 @@ heredoc / escaped delimiter

 (source
  (string
-    (string_content)
+    (quoted_content)
    (escape_sequence)
-    (string_content))
+    (quoted_content))
  (string
-    (string_content)
+    (quoted_content)
    (escape_sequence)
    (escape_sequence)
    (escape_sequence)
-    (string_content)))
+    (quoted_content)))

 =====================================
 heredoc / escaped interpolation
@ -209,18 +209,6 @@ heredoc / escaped interpolation

 (source
  (string
-    (string_content)
+    (quoted_content)
    (escape_sequence)
-    (string_content)))
-
-=====================================
-[error] heredoc / no whitespace
-=====================================
-
-"""s"""
-
---
-
-(source
-  (ERROR
-    (identifier)))
+    (quoted_content)))
--- a/test/corpus/term/struct.txt
+++ b/test/corpus/term/struct.txt
@ -26,12 +26,10 @@ from keywords
    (map_content
      (keywords
        (pair
-          (keyword
-            (atom_literal))
+          (keyword)
          (integer))
        (pair
-          (keyword
-            (atom_literal))
+          (keyword)
          (integer))))))

 =====================================
@ -48,12 +46,11 @@ from arrow entries
      (alias))
    (map_content
      (binary_operator
-        (atom
-          (atom_literal))
+        (atom)
        (integer))
      (binary_operator
        (string
-          (string_content))
+          (quoted_content))
        (integer))
      (binary_operator
        (identifier)
@ -74,16 +71,14 @@ from both arrow entries and keywords
    (map_content
      (binary_operator
        (string
-          (string_content))
+          (quoted_content))
        (integer))
      (keywords
        (pair
-          (keyword
-            (atom_literal))
+          (keyword)
          (integer))
        (pair
-          (keyword
-            (atom_literal))
+          (keyword)
          (integer))))))

 =====================================
@ -101,7 +96,7 @@ trailing separator
    (map_content
      (binary_operator
        (string
-          (string_content))
+          (quoted_content))
        (integer)))))

 =====================================
@ -121,15 +116,13 @@ update syntax
        (identifier)
        (keywords
          (pair
-            (keyword
-              (atom_literal))
+            (keyword)
            (string
-              (string_content)))
+              (quoted_content)))
          (pair
-            (keyword
-              (atom_literal))
+            (keyword)
            (string
-              (string_content)))))))
+              (quoted_content)))))))
  (map
    (struct
      (alias))
@ -138,9 +131,9 @@ update syntax
        (identifier)
        (binary_operator
          (string
-            (string_content))
+            (quoted_content))
          (string
-            (string_content)))))))
+            (quoted_content)))))))

 =====================================
 unused struct identifier
@ -212,8 +205,8 @@ with atom
 (source
  (map
    (struct
-      (atom
-        (string_content)))))
+      (quoted_atom
+        (quoted_content)))))

 =====================================
 with call
--- a/test/corpus/unicode.txt
+++ b/test/corpus/unicode.txt
@ -13,20 +13,15 @@ atom
 ---

 (source
-  (atom
-    (atom_literal))
-  (atom
-    (string_content))
-  (atom
-    (string_content))
-  (atom
-    (atom_literal))
-  (atom
-    (atom_literal))
-  (atom
-    (atom_literal))
-  (atom
-    (atom_literal)))
+  (atom)
+  (quoted_atom
+    (quoted_content))
+  (quoted_atom
+    (quoted_content))
+  (atom)
+  (atom)
+  (atom)
+  (atom))

 =====================================
 string
@ -43,17 +38,17 @@ string

 (source
  (string
-    (string_content))
+    (quoted_content))
  (string
-    (string_content))
+    (quoted_content))
  (string
-    (string_content))
+    (quoted_content))
  (string
-    (string_content))
+    (quoted_content))
  (string
-    (string_content))
+    (quoted_content))
  (string
-    (string_content)))
+    (quoted_content)))

 =====================================
 charlist
@ -69,17 +64,17 @@ charlist

 (source
  (charlist
-    (string_content))
+    (quoted_content))
  (charlist
-    (string_content))
+    (quoted_content))
  (charlist
-    (string_content))
+    (quoted_content))
  (charlist
-    (string_content))
+    (quoted_content))
  (charlist
-    (string_content))
+    (quoted_content))
  (charlist
-    (string_content)))
+    (quoted_content)))

 =====================================
 char