Contributing to nvim-treesitter
The main parts of nvim-treesitter are
* a curated list of parsers;
* a collection of queries.
Before describing these in detail, some general advice:
* Some basic knowledge of how tree-sitter works is assumed; we recommend reading
- the upstream documentation;
- Neovim's documentation.
* There are dedicated Matrix channels for questions and general help:
- #nvim-treesitter for questions specific to Neovim's implementation and the queries here;
- #tree-sitter for general questions regarding treesitter queries and the tree-sitter CLI.
Parsers
To add a new parser, edit the following files:
- In
lua/parsers.lua, add an entry to the returned table of the following form:
zimbu = {
install_info = {
url = 'https://github.com/zimbulang/tree-sitter-zimbu', -- git repo; use `path` for local path
revision = 'v2.1', -- tag or commit hash
-- optional entries:
branch = 'develop', -- only needed if different from default branch
location = 'parser', -- only needed if the parser is in subdirectory of a "monorepo"
generate = true, -- only needed if repo does not contain pre-generated src/parser.c
},
maintainers = { '@me' }, -- the _query_ maintainers
tier = 1, -- stable: track versioned releases instead of latest commit
-- optional entries:
requires = { 'vim' }, -- if the queries inherit from another language
readme_note = "an example language",
}
[!IMPORTANT] If the repo does not contain a pre-generated
src/parser.c, it must at least containsrc/grammar.jsonso that the parser can be generated without havingnodeinstalled.[!IMPORTANT] The "maintainers" here refers to the person maintaining the queries in
nvim-treesitter, not the parser maintainers (who likely don't use Neovim). The maintainers' duty is to review issues and PRs related to the query and to keep them updated with respect to parser changes.[!IMPORTANT] Due to reliability issues with smaller codeforges, only Github-hosted parsers are currently eligible for inclusion. (The development may happen elsewhere, but there must at least exist a Github mirror to pull the source from.) We are monitoring the situation and hope to support more codeforges again in the future.
[!NOTE] To qualify for Tier 1 ("stable"), a parser needs to * make releases following semver (patch for fixes not affecting queries; minor for changes introducing new nodes or patterns; major for changes removing nodes or previously valid patterns); * provide WASM release artifacts; * include and maintain reference queries.
- If the parser name is not the same as the Vim filetype, add an entry to the
filetypestable inplugin/filetypes.lua:
zimbu = { 'zu' },
[!IMPORTANT] Only external scanners written in C are supported for portability reasons.
-
Update the list of supported languages by running
make docs(or./scripts/update-readme.luaif on Windows). -
Test if both
:TSInstall zimbuand:TSInstallFromGrammar zimbuwork without errors (:checkhealth treesitteror./scripts/check-parsers.lua zimbu).
[!IMPORTANT] You also need to add queries in order for the parser to actually be useful!
When you're done, open a Pull Request using the provided template, e.g. using gh pr create -B main -T new_language.
Queries
To add (or edit existing) queries, create a corresponding runtime/queries/zimbu/*.scm file:
highlights.scmused for syntax highlighting,injections.scmused to specify nodes whose content should be parsed as a different language;folds.scm; used to define folds;locals.scm: used to extract keyword definitions, scopes, references, etc. (not used in this plugin).indents.scm; used to control indentation.
See tree-sitter queries for a basic description of the query language. The following tools can be helpful when writing or editing queries:
* ts_query_ls is a language server for treesitter queries, which can validate, autocomplete, and format. This tool can also be used as an offline linter and formatter (accessible through make lintquery, make checkquery, make formatquery targets).
* Neovim's :InspectTree will show the parsed tree for a buffer and highlight the text corresponding to any given node (and vice versa).
* :EditQuery opens a "playground" where you can write query patterns and see which parts of the buffer are captured by each capture.
[!IMPORTANT] The valid captures that can be used in queries is different for each editor, so you cannot just copy them, e.g., from Helix or the parser repositories. For Neovim, all valid captures are listed below. You can verify that your changes adhere to this by running
make lintquery.[!IMPORTANT] Since grammars can change constantly, it is important to make sure that the patterns in a query are actually valid for the parser specified in nvim-treesitter's manifest. This can be verified using
make checkquery(which requires the parser to be installed in the default directory(!) throughnvim-treesitter). Opening the query in Neovim with the parser installed will also show all invalid patterns, either via ts_query_ls or Neovim's builtin query-linter.[!TIP] Before opening a PR, run
make queryto format, lint, and check all queries.
Inheriting languages
If your language is an extension of a language (TypeScript is an extension of JavaScript for example), you can include the queries from your base language by adding the following as the first line of your file:
; inherits: lang1,(optionallang)
If you want to inherit a language, but don't want the languages inheriting from yours to inherit it, you can mark the language as optional (by putting it between parenthesis).
Formatting
All queries are expected to follow a standard format, with every node on a single line and indented by two spaces for each level of nesting. You can automatically format the bundled queries by running make formatquery.
Should you need to preserve a specific format for a node, you can exempt it (and all contained nodes) by placing before it
; format-ignore
Highlights
Syntax highlighting is specified in a highlights.scm query, which assigns treesitter nodes to captures that can be assigned a highlight group. This feature is implemented in Neovim and documented at :h treesitter-highlight.
Note that your color scheme needs to define (or link) these captures as highlight groups. You can use Neovim's built-in :Inspect function to see exactly which highlight groups are applied at a given position.
The valid captures are listed below.
Identifiers
@variable ; various variable names
@variable.builtin ; built-in variable names (e.g. `this`)
@variable.parameter ; parameters of a function
@variable.parameter.builtin ; special parameters (e.g. `_`, `it`)
@variable.member ; object and struct fields
@constant ; constant identifiers
@constant.builtin ; built-in constant values
@constant.macro ; constants defined by the preprocessor
@module ; modules or namespaces
@module.builtin ; built-in modules or namespaces
@label ; GOTO and other labels (e.g. `label:` in C), including heredoc labels
Literals
@string ; string literals
@string.documentation ; string documenting code (e.g. Python docstrings)
@string.regexp ; regular expressions
@string.escape ; escape sequences
@string.special ; other special strings (e.g. dates)
@string.special.symbol ; symbols or atoms
@string.special.url ; URIs (e.g. hyperlinks)
@string.special.path ; filenames
@character ; character literals
@character.special ; special characters (e.g. wildcards)
@boolean ; boolean literals
@number ; numeric literals
@number.float ; floating-point number literals
Types
@type ; type or class definitions and annotations
@type.builtin ; built-in types
@type.definition ; identifiers in type definitions (e.g. `typedef <type> <identifier>` in C)
@attribute ; attribute annotations (e.g. Python decorators, Rust lifetimes)
@attribute.builtin ; builtin annotations (e.g. `@property` in Python)
@property ; the key in key/value pairs
Functions
@function ; function definitions
@function.builtin ; built-in functions
@function.call ; function calls
@function.macro ; preprocessor macros
@function.method ; method definitions
@function.method.call ; method calls
@constructor ; constructor calls and definitions
@operator ; symbolic operators (e.g. `+` / `*`)
Keywords
@keyword ; keywords not fitting into specific categories
@keyword.coroutine ; keywords related to coroutines (e.g. `go` in Go, `async/await` in Python)
@keyword.function ; keywords that define a function (e.g. `func` in Go, `def` in Python)
@keyword.operator ; operators that are English words (e.g. `and` / `or`)
@keyword.import ; keywords for including or exporting modules (e.g. `import` / `from` in Python)
@keyword.type ; keywords describing namespaces and composite types (e.g. `struct`, `enum`)
@keyword.modifier ; keywords modifying other constructs (e.g. `const`, `static`, `public`)
@keyword.repeat ; keywords related to loops (e.g. `for` / `while`)
@keyword.return ; keywords like `return` and `yield`
@keyword.debug ; keywords related to debugging
@keyword.exception ; keywords related to exceptions (e.g. `throw` / `catch`)
@keyword.conditional ; keywords related to conditionals (e.g. `if` / `else`)
@keyword.conditional.ternary ; ternary operator (e.g. `?` / `:`)
@keyword.directive ; various preprocessor directives & shebangs
@keyword.directive.define ; preprocessor definition directives
Punctuation
@punctuation.delimiter ; delimiters (e.g. `;` / `.` / `,`)
@punctuation.bracket ; brackets (e.g. `()` / `{}` / `[]`)
@punctuation.special ; special symbols (e.g. `{}` in string interpolation)
Comments
@comment ; line and block comments
@comment.documentation ; comments documenting code
@comment.error ; error-type comments (e.g. `ERROR`, `FIXME`, `DEPRECATED`)
@comment.warning ; warning-type comments (e.g. `WARNING`, `FIX`, `HACK`)
@comment.todo ; todo-type comments (e.g. `TODO`, `WIP`)
@comment.note ; note-type comments (e.g. `NOTE`, `INFO`, `XXX`)
Markup
Mainly for markup languages.
@markup.strong ; bold text
@markup.italic ; italic text
@markup.strikethrough ; struck-through text
@markup.underline ; underlined text (only for literal underline markup!)
@markup.heading ; headings, titles (including markers)
@markup.heading.1 ; top-level heading
@markup.heading.2 ; section heading
@markup.heading.3 ; subsection heading
@markup.heading.4 ; and so on
@markup.heading.5 ; and so forth
@markup.heading.6 ; six levels ought to be enough for anybody
@markup.quote ; block quotes
@markup.math ; math environments (e.g. `$ ... $` in LaTeX)
@markup.link ; text references, footnotes, citations, etc.
@markup.link.label ; link, reference descriptions
@markup.link.url ; URL-style links
@markup.raw ; literal or verbatim text (e.g. inline code)
@markup.raw.block ; literal or verbatim text as a stand-alone block
; (use priority 90 for blocks with injections)
@markup.list ; list markers
@markup.list.checked ; checked todo-style list markers
@markup.list.unchecked ; unchecked todo-style list markers
@diff.plus ; added text (for diff files)
@diff.minus ; deleted text (for diff files)
@diff.delta ; changed text (for diff files)
@tag ; XML-style tag names (and similar)
@tag.builtin ; builtin tag names (e.g. HTML5 tags)
@tag.attribute ; XML-style tag attributes
@tag.delimiter ; XML-style tag delimiters
Non-highlighting captures
@conceal ; captures that are only meant to be concealed
[!TIP] * See
:h tree-sitter-highlight-conceal. * The capture should be meaningful to allow proper highlighting whenset conceallevel=0. * A conceal can be restricted to part of the capture via the#offset!directive.
@spell ; for defining regions to be spellchecked
@nospell ; for defining regions that should NOT be spellchecked
[!TIP] The main types of nodes that should be spell checked are - comments - strings; where it makes sense. Strings that have interpolation or are typically used for non text purposes are not spell checked (e.g. bash).
Predicates
Captures can be restricted according to node contents using predicates.
[!IMPORTANT] For performance reasons, prefer earlier predicates in this list:
#eq?(literal match)#any-of?(one of several literal matches)#lua-match?(match against a Lua pattern)#match?/#vim-match?(match against a Vim regular expression
Besides those provided by Neovim, nvim-treesitter also implements
#kind-eq? ; checks whether a capture corresponds to a given set of nodes
#any-kind-eq? ; checks whether any of a list of captures corresponds to a given set of nodes
Directives
Nodes contain metadata that can be modified via directives.
Priority
Captures can be assigned a priority to control precedence of highlights via the
#set! priority <number> directive (see :h treesitter-highlight-priority). This is useful for controlling conflicts with injected languages or when inheriting queries from other languages.
[!NOTE] The default priority for treesitter highlights is
100; queries should only set priorities between90and120, to avoid conflict with other sources of highlighting (such as diagnostics or LSP semantic tokens).[!TIP] Precedence is also influenced by pattern order in a query file. If possible, try to achieve the correct result by reordering patterns before resorting to explicit priorities.
Injections
Language injections are controlled by injections.scm queries, which specify nodes that should be parsed as a different language. This feature is implemented in Neovim and documented at
`:h treesitter-language-injections.
The valid captures are:
@injection.language ; dynamic detection of the injection language (i.e. the text of the captured node describes the language)
@injection.content ; region for the dynamically detected language
@injection.filename ; indicates that the captured node’s text may contain a filename; the corresponding filetype is then looked-up up via vim.filetype.match() and treated as the name of a language that should be used to re-parse the `@injection.content`
[!TIP] When writing injection queries, try to ensure that each captured node is only matched by a single pattern.
Folds
You can define folds for a given language by adding a folds.scm query. This is implemented in Neovim. The only valid capture is @fold:
(function_definition) @fold ; fold this node
Folds should be given to nodes with defined start and end delimiters/patterns, or to consecutive nodes which are part of the same conceptual "grouping", such as consecutive line comments or import statements. The following items are valid fold candidates:
- Function/method definitions
- Class/interface/trait definitions
- Switch/match statements, and individual match arms
- Execution blocks (such as those found in conditional statements or loops)
- Parameter/argument lists
- Array/object/string expressions
- Consecutive import statements, consecutive line comments
The following items would not be valid fold candidates:
- Multiline assignment statements
- Multiline property access expressions
As a rule of thumb, these highlight captures usually reside in or around objects which should be folded:
@function,@function.method@keyword.import,@keyword.conditional,@keyword.repeat@comment,@comment.documentation@string,@string.documentation@markup.heading.x,@markup.list
Indents
[!WARNING] Treesitter-based indentation is still experimental and likely to have breaking changes in the future.
Indentation for a language is controlled by indents.scm queries. The following captures can be used to set the indentation for nodes, either relative or absolute
@indent.beginspecifies that the next line should be indented. Multiple indents on the same line get collapsed, e.g.,
(
(if_statement)
(ERROR "else") @indent.begin
)
You can also #set! indent.immediate to permit the next line to indent even when the block intended to be indented has no content yet. (This can improve interactive typing.)
For example for Python,
((if_statement) @indent.begin
(#set! indent.immediate 1))
will allow
if True:<CR>
# Auto indent to here
-
@indent.endis used to specify that the indented region ends and any text subsequent to the capture should be dedented. -
@indent.branchis used to specify that a dedented region starts at the line including the captured nodes. -
@indent.dedentspecifies dedenting starting on the next line. -
@indent.autobehaves like Vim'sautoindentbuffer option (copy whatever the indentation of previous line is when opening a new line after it). -
@indent.ignorespecifies that no indent should be added to this node. -
@indent.zerosets the indentation of this node to 0 (i.e., removes all indentation). -
@indent.aligncan be used to specify blocks that should have the same indentation. This allows
foo(a,
b,
c)
as well as
foo(
a,
b,
c)
and
foo(
a,
b,
c
)
To specify the delimiters to align at, #set! indent.open_delimiter and
indent.close_delimiter, e.g.,
((argument_list) @indent.align
(#set! indent.open_delimiter "(")
(#set! indent.close_delimiter ")"))
For some languages, the last line of an indent.align block must not be
the same indent as the natural next line.
For example in Python,
if (a > b and
c < d):
pass
is not correct, whereas
if (a > b and
c < d):
pass
would be correctly indented. This behavior may be selected by setting
indent.avoid_last_matching_next. For example,
(if_statement
condition: (parenthesized_expression) @indent.align
(#set! indent.open_delimiter "(")
(#set! indent.close_delimiter ")")
(#set! indent.avoid_last_matching_next 1)
)
specifies that the last line of an @indent.align capture
should be additionally indented to avoid clashing with the indent of the first
line of the block inside an if.
Locals
Locals are used to keep track of definitions and references in local or global scopes, see upstream documentation. Note that nvim-treesitter uses more specific subcaptures for definitions and does not use locals (for highlighting or any other purpose). These queries are only provided for limited backwards compatibility.
@local.definition ; various definitions
@local.definition.constant ; constants
@local.definition.function ; functions
@local.definition.method ; methods
@local.definition.var ; variables
@local.definition.parameter ; parameters
@local.definition.macro ; preprocessor macros
@local.definition.type ; types or classes
@local.definition.field ; fields or properties
@local.definition.enum ; enumerations
@local.definition.namespace ; modules or namespaces
@local.definition.import ; imported names
@local.definition.associated ; the associated type of a variable
@local.scope ; scope block
@local.reference ; identifier reference
Definition scope
You can set the scope of a definition by setting the scope property on the definition.
For example, a JavaScript function declaration creates a scope. The function name is captured as the definition. This means that the function definition would only be available WITHIN the scope of the function, which is not the case. The definition can be used in the scope the function was defined in.
function doSomething() {}
doSomething(); // Should point to the declaration as the definition
(function_declaration
((identifier) @local.definition.var)
(#set! definition.var.scope "parent"))
Possible scope values are:
parent: The definition is valid in the containing scope and one more scope above that scopeglobal: The definition is valid in the root scopelocal: The definition is valid in the containing scope. This is the default behavior
