diff options
| author | Christian Clason <c.clason@uni-graz.at> | 2025-05-05 11:00:16 +0200 |
|---|---|---|
| committer | Christian Clason <c.clason@uni-graz.at> | 2025-05-12 18:43:41 +0200 |
| commit | e8bfe271b0da136048cdf20128b758ec87318479 (patch) | |
| tree | 7ff4f564acf75128647e4d9dfb6c3287623acdea /CONTRIBUTING.md | |
| parent | feat(parsers): update t32, query, markdown, markdown_inline, meson, hyprlang,... (diff) | |
| download | nvim-treesitter-e8bfe271b0da136048cdf20128b758ec87318479.tar nvim-treesitter-e8bfe271b0da136048cdf20128b758ec87318479.tar.gz nvim-treesitter-e8bfe271b0da136048cdf20128b758ec87318479.tar.bz2 nvim-treesitter-e8bfe271b0da136048cdf20128b758ec87318479.tar.lz nvim-treesitter-e8bfe271b0da136048cdf20128b758ec87318479.tar.xz nvim-treesitter-e8bfe271b0da136048cdf20128b758ec87318479.tar.zst nvim-treesitter-e8bfe271b0da136048cdf20128b758ec87318479.zip | |
docs: update to rewrite
This updates
* README
* CONTRIBUTING
* the `:h nvim-treesitter` documentation
to the current state of `main`. It also adds a pull request template for
adding a new language.
Diffstat (limited to 'CONTRIBUTING.md')
| -rw-r--r-- | CONTRIBUTING.md | 308 |
1 files changed, 205 insertions, 103 deletions
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 845891459..983499e81 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -1,13 +1,16 @@ # Contributing to `nvim-treesitter` -First of all, thank you very much for contributing to `nvim-treesitter`. +The main parts of `nvim-treesitter` are +* a curated list of [parsers](#Parsers); +* a collection of [queries](#Queries). -If you haven't already, you should really come and reach out to us on our -[Matrix channel], so we can help you with any question you might have! - -The main goal of `nvim-treesitter` is to provide a framework to easily install parsers and queries. - -Depending on which part of the plugin you want to contribute to, please read the appropriate section. +Before describing these in detail, some general advice: +* Some basic knowledge of how tree-sitter works is assumed; we recommend reading + - the [upstream documentation](https://tree-sitter.github.io/tree-sitter/); + - [Neovim's documentation](https://neovim.io/doc/user/treesitter.html#treesitter). +* There are dedicated Matrix channels for questions and general help: + - [#nvim-treesitter](https://matrix.to/#/#nvim-treesitter:matrix.org) for questions specific to Neovim's implementation and the queries here; + - [#tree-sitter](https://matrix.to/#/#tree-sitter-chat:matrix.org) for general questions regarding treesitter queries and the `tree-sitter` CLI. ## Parsers @@ -34,12 +37,14 @@ zimbu = { } ``` -**Note:** The "maintainers" here refers to the person maintaining the **queries** in `nvim-treesitter`, not the parser maintainers (who likely don't use Neovim). The maintainers' duty is to review issues and PRs related to the query and to keep them updated with respect to parser changes. +>[!IMPORTANT] +> The "maintainers" here refers to the person maintaining the **queries** in `nvim-treesitter`, not the parser maintainers (who likely don't use Neovim). The maintainers' duty is to review issues and PRs related to the query and to keep them updated with respect to parser changes. -**Note:** To qualify for Tier 1 ("stable"), a parser needs to - * make releases following semver (_patch_ for fixes not affecting queries; _minor_ for changes introducing new nodes or patterns; _major_ for changes removing nodes or previously valid patterns); - * provide WASM release artifacts; - * include and maintain reference queries. +>[!NOTE] +> To qualify for Tier 1 ("stable"), a parser needs to +> * make releases following semver (_patch_ for fixes not affecting queries; _minor_ for changes introducing new nodes or patterns; _major_ for changes removing nodes or previously valid patterns); +> * provide WASM release artifacts; +> * include and maintain reference queries. 2. If the parser name is not the same as the Vim filetype, add an entry to the `filetypes` table in `plugin/filetypes.lua`: @@ -47,30 +52,41 @@ zimbu = { zimbu = { 'zu' }, ``` -**Note: We only support external scanners written in C for portability reasons.** +>[!IMPORTANT] +> Only external scanners written in C are supported for portability reasons. -## Queries +3. Update the list of [supported languages] by running `make docs` (or `./scripts/update-readme.lua` if on Windows). + +4. Test if both `:TSInstall zimbu` and `:TSInstallFromGrammar zimbu` work without errors (`:checkhealth treesitter` or `./scripts/check-parsers.lua zimbu`). + +>[!IMPORTANT] +> You also need to add queries in order for the parser to actually be useful! + +When you're done, open a Pull Request using the [provided template](.github/PULL_REQUEST_TEMPLATE/new_language.md), e.g. using `gh pr create -B main -T new_language`. -Contributing to queries for an existing parser is basically modifying one of the `runtime/queries/*/*.scm`. -Each of these `scheme` files contains a _tree-sitter query_ for a given purpose. -Before going any further, we highly suggest that you [read more about tree-sitter queries](https://tree-sitter.github.io/tree-sitter/using-parsers#pattern-matching-with-queries). +## Queries -Each query has an appropriate name, which is then used by modules to extract data from the syntax tree. -For now these are the types of queries provided by `nvim-treesitter`: +To add (or edit existing) queries, create a corresponding `runtime/queries/zimbu/*.scm` file: -- `highlights.scm`: used for syntax highlighting, using the `highlight` module. +- `highlights.scm` used for syntax highlighting, +- `injections.scm` used to specify nodes whose content should be parsed as a different language; +- `folds.scm`; used to define folds; - `locals.scm`: used to extract keyword definitions, scopes, references, etc. (not used in this plugin). -- `textobjects.scm`: used to define text objects. -- `folds.scm`: used to define folds. -- `injections.scm`: used to define injections. +- `indents.scm`; used to control indentation. + +See [tree-sitter queries] for a basic description of the query language. The following tools can be helpful when writing or editing queries: +* [ts_query_ls] is a language server for treesitter queries, which can validate, autocomplete, and format. This tool can also be used as an offline linter and formatter (accessible through `make lintquery`, `make checkquery`, `make formatquery` targets). +* Neovim's `:InspectTree` will show the parsed tree for a buffer and highlight the text corresponding to any given node (and vice versa). +* `:EditQuery` opens a "playground" where you can write query patterns and see which parts of the buffer are captured by each capture. + +>[!IMPORTANT] +> The valid captures that can be used in queries is different for each editor, so you cannot just copy them, e.g., from Helix or the parser repositories. For Neovim, all valid captures are listed below. You can verify that your changes adhere to this by running `make lintquery`. -For these types there is a _norm_ you will have to follow so that features work fine. -Here are some global advices: +>[!IMPORTANT] +> Since grammars can change constantly, it is important to make sure that the patterns in a query are actually valid for the parser specified in nvim-treesitter's manifest. This can be verified using `make checkquery` (which requires the parser to be installed in the default directory(!) through `nvim-treesitter`). Opening the query in Neovim with the parser installed will also show all invalid patterns, either via [ts_query_ls] or Neovim's builtin query-linter. -- Examples of queries can be found in [runtime/queries/](runtime/queries/) -- For highlights, all matching patterns are applied, with the last one determining the visible capture & highlight. -- For injections, try your best to ensure that each captured node is only matched by a single pattern. -- If the [parser is included in `nvim-treesitter`](https://github.com/nvim-treesitter/nvim-treesitter/SUPPORTED_LANGUAGES.md`) and installed with `:TSInstall`, you can use Neovim's developer tools (`:checkhealth`, `:InspectTree`, `:EditQuery`, `:Inspect`) to test your queries. +>[!TIP] +> Before opening a PR, run `make query` to format, lint, and check all queries. #### Inheriting languages @@ -96,7 +112,10 @@ Should you need to preserve a specific format for a node, you can exempt it (and ### Highlights -As languages differ quite a lot, here is a set of captures available to you when building a `highlights.scm` query. Note that your color scheme needs to define (or link) these captures as highlight groups. +Syntax highlighting is specified in a `highlights.scm` query, which assigns treesitter nodes to captures that can be assigned a highlight group. This feature is implemented in Neovim and documented at [`:h treesitter-highlight`](https://neovim.io/doc/user/treesitter.html#treesitter-highlight). +Note that your color scheme needs to define (or link) these captures as highlight groups. You can use Neovim's built-in `:Inspect` function to see exactly which highlight groups are applied at a given position. + +The valid captures are listed below. #### Identifiers @@ -255,45 +274,179 @@ Mainly for markup languages. #### Non-highlighting captures ```query -@none ; completely disable the highlight @conceal ; captures that are only meant to be concealed ``` +>[!TIP] +> * See [`:h tree-sitter-highlight-conceal`](https://neovim.io/doc/user/treesitter.html#treesitter-highlight-conceal). +> * The capture should be meaningful to allow proper highlighting when `set conceallevel=0`. +> * A conceal can be restricted to part of the capture via the [`#offset!` directive](https://neovim.io/doc/user/treesitter.html#treesitter-directive-offset%21). + ```query @spell ; for defining regions to be spellchecked @nospell ; for defining regions that should NOT be spellchecked ``` -The main types of nodes which are spell checked are: -- Comments -- Strings; where it makes sense. Strings that have interpolation or are typically used for non text purposes are not spell checked (e.g. bash). +>[!TIP] +> The main types of nodes that should be spell checked are +> - comments +> - strings; where it makes sense. Strings that have interpolation or are typically used for non text purposes are not spell checked (e.g. bash). #### Predicates -Captures can be restricted according to node contents using [predicates](https://neovim.io/doc/user/treesitter.html#treesitter-predicates). For performance reasons, prefer earlier predicates in this list: +Captures can be restricted according to node contents using [predicates](https://neovim.io/doc/user/treesitter.html#treesitter-predicates). -1. `#eq?` (literal match) -2. `#any-of?` (one of several literal matches) -3. `#lua-match?` (match against a [Lua pattern](https://neovim.io/doc/user/luaref.html#lua-pattern)) -4. `#match?`/`#vim-match?` (match against a [Vim regular expression](https://neovim.io/doc/user/pattern.html#regexp) +>[!IMPORTANT] +> For performance reasons, prefer earlier predicates in this list: +> +> 1. `#eq?` (literal match) +> 2. `#any-of?` (one of several literal matches) +> 3. `#lua-match?` (match against a [Lua pattern](https://neovim.io/doc/user/luaref.html#lua-pattern)) +> 4. `#match?`/`#vim-match?` (match against a [Vim regular expression](https://neovim.io/doc/user/pattern.html#regexp) -#### Conceal +Besides those provided by Neovim, nvim-treesitter also implements -Captures can be concealed by setting the [`conceal` metadata](https://neovim.io/doc/user/treesitter.html#treesitter-highlight-conceal), e.g.., ```query - (fenced_code_block_delimiter @markup.raw.block (#set! conceal "")) +#kind-eq? ; checks whether a capture corresponds to a given set of nodes +#any-kind-eq? ; checks whether any of a list of captures corresponds to a given set of nodes ``` -The capture should be meaningful to allow proper highlighting when `set conceallevel=0`. If the unconcealed capture should not be highlighted (e.g., because an earlier pattern handles this), you can use `@conceal`. -A conceal can be restricted to part of the capture via the [`#offset!` directive](https://neovim.io/doc/user/treesitter.html#treesitter-directive-offset%21). +#### Directives + +Nodes contain metadata that can be modified via [directives](https://neovim.io/doc/user/treesitter.html#treesitter-directives). #### Priority Captures can be assigned a priority to control precedence of highlights via the -`#set! priority <number>` directive (see `:h treesitter-highlight-priority`). -The default priority for treesitter highlights is `100`; queries should only -set priorities between `90` and `120`, to avoid conflict with other sources of -highlighting (such as diagnostics or LSP semantic tokens). +`#set! priority <number>` directive (see [`:h treesitter-highlight-priority`](https://neovim.io/doc/user/treesitter.html#treesitter-highlight-priority)). This is useful for controlling conflicts with injected languages or when inheriting queries from other languages. + +>[!NOTE] +> The default priority for treesitter highlights is `100`; queries should only +set priorities between `90` and `120`, to avoid conflict with other sources of highlighting (such as diagnostics or LSP semantic tokens). + +>[!TIP] +> Precedence is also influenced by pattern order in a query file. If possible, try to achieve the correct result by reordering patterns before resorting to explicit priorities. + +### Injections + +Language injections are controlled by `injections.scm` queries, which specify nodes that should be parsed as a different language. This feature is implemented in Neovim and documented at +[`:h treesitter-language-injections](https://neovim.io/doc/user/treesitter.html#treesitter-language-injections). + +The valid captures are: + +```query +@injection.language ; dynamic detection of the injection language (i.e. the text of the captured node describes the language) +@injection.content ; region for the dynamically detected language +@injection.filename ; indicates that the captured node’s text may contain a filename; the corresponding filetype is then looked-up up via vim.filetype.match() and treated as the name of a language that should be used to re-parse the `@injection.content` +``` + +>[!TIP] +> When writing injection queries, try to ensure that each captured node is only matched by a single pattern. + +### Folds + +You can define folds for a given language by adding a `folds.scm` query. This is implemented in Neovim. The only valid capture is + +```query +@fold ; fold this node +``` + +### Indents + +>[!WARNING] +> Treesitter-based indentation is still experimental and likely to have breaking changes in the future. + +Indentation for a language is controlled by `indents.scm` queries. The following captures can be used to set the indentation for nodes, either relative or absolute + +* `@indent.begin` specifies that the next line should be indented. Multiple +indents on the same line get collapsed, e.g., +```query + ( + (if_statement) + (ERROR "else") @indent.begin + ) +``` +You can also `#set! indent.immediate` to permit the next line to indent even when the block intended to be indented has no content yet. (This can improve interactive typing.) +For example for Python, +```query + ((if_statement) @indent.begin + (#set! indent.immediate 1)) +``` +will allow +```python + if True:<CR> + # Auto indent to here +``` + +* `@indent.end` is used to specify that the indented region ends and any text subsequent to the capture should be dedented. + +* `@indent.branch` is used to specify that a dedented region starts at the line _including_ the captured nodes. + +* `@indent.dedent` specifies dedenting starting on the _next_ line. + +* `@indent.auto` behaves like Vim's [`autoindent`](https://neovim.io/doc/user/options.html#'autoindent') buffer option (copy whatever the indentation of previous line is when opening a new line after it). + +* `@indent.ignore` specifies that no indent should be added to this node. + +* `@indent.zero` sets the indentation of this node to 0 (i.e., removes _all_ indentation). + +* `@indent.align` can be used to specify blocks that should have the same indentation. +This allows +``` + foo(a, + b, + c) +``` +as well as +``` + foo( + a, + b, + c) +``` +and +``` + foo( + a, + b, + c + ) +``` +To specify the delimiters to align at, `#set! indent.open_delimiter` and +`indent.close_delimiter`, e.g., +```query + ((argument_list) @indent.align + (#set! indent.open_delimiter "(") + (#set! indent.close_delimiter ")")) +``` +For some languages, the last line of an `indent.align` block must not be +the same indent as the natural next line. +For example in Python, + +```python + if (a > b and + c < d): + pass +``` +is not correct, whereas +```python + if (a > b and + c < d): + pass +``` +would be correctly indented. This behavior may be selected by setting +`indent.avoid_last_matching_next`. For example, +```query + (if_statement + condition: (parenthesized_expression) @indent.align + (#set! indent.open_delimiter "(") + (#set! indent.close_delimiter ")") + (#set! indent.avoid_last_matching_next 1) + ) +``` +specifies that the last line of an `@indent.align` capture +should be additionally indented to avoid clashing with the indent of the first +line of the block inside an `if`. ### Locals @@ -302,7 +455,7 @@ scopes, see [upstream documentation](https://tree-sitter.github.io/tree-sitter/syntax-highlighting#local-variables). Note that nvim-treesitter uses more specific subcaptures for definitions and **does not use locals** (for highlighting or any other purpose). These queries -are only provided for (limited) backwards compatibility. +are only provided for limited backwards compatibility. ```query @local.definition ; various definitions @@ -349,58 +502,7 @@ Possible scope values are: - `global`: The definition is valid in the root scope - `local`: The definition is valid in the containing scope. This is the default behavior -### Folds - -You can define folds for a given language by adding a `folds.scm` query : - -```query -@fold ; fold this node -``` - -If the `folds.scm` query is not present, this will fall back to the `@local.scope` captures in the `locals` -query. - -### Injections - -Some captures are related to language injection (like markdown code blocks). They are used in `injections.scm`. - -If you want to dynamically detect the language (e.g. for Markdown blocks) use the `@injection.language` to capture -the node describing the language and `@injection.content` to describe the injection region. - -```query -@injection.language ; dynamic detection of the injection language (i.e. the text of the captured node describes the language) -@injection.content ; region for the dynamically detected language -@injection.filename ; indicates that the captured node’s text may contain a filename; the corresponding filetype is then looked-up up via vim.filetype.match() and treated as the name of a language that should be used to re-parse the `@injection.content` -``` - -For example, to inject javascript into HTML's `<script>` tag - -```html -<script>someJsCode();</script> -``` - -```query -(script_element - (raw_text) @injection.content - (#set! injection.language "javascript")) ; set the parser language for @injection.content region to javascript -``` - -For regions that don't have a corresponding `@injection.language`, you need to manually set the language -through `(#set injection.language "lang_name")` - -To combine all matches of a pattern as one single block of content, add `(#set! injection.combined)` to such pattern - -### Indents - -```query -@indent.begin ; indent children when matching this node -@indent.end ; marks the end of indented block -@indent.align ; behaves like python aligned/hanging indent -@indent.dedent ; dedent children when matching this node -@indent.branch ; dedent itself when matching this node -@indent.ignore ; do not indent in this node -@indent.auto ; behaves like 'autoindent' buffer option -@indent.zero ; sets this node at position 0 (no indent) -``` -[Matrix channel]: https://matrix.to/#/#nvim-treesitter:matrix.org +[supported languages]: https://github.com/nvim-treesitter/nvim-treesitter/SUPPORTED_LANGUAGES.md +[tree-sitter queries]: https://tree-sitter.github.io/tree-sitter/using-parsers/queries/index.html +[ts_query_ls]: https://github.com/ribru17/ts_query_ls |
