Skip to content

GNAT Project major mode using tree-sitter for Emacs

License

Notifications You must be signed in to change notification settings

brownts/gpr-ts-mode

Repository files navigation

GNAT Project Major Mode using Tree-Sitter

MELPA CI

A major mode for GNAT Project (GPR) files, which utilizes the Emacs built-in support for tree-sitter, first available starting with Emacs 29. The tree-sitter functionality is used to build an in-memory concrete syntax tree of the parsed language, allowing operations such as syntax highlighting to be performed more accurately than historical methods (e.g., regular expressions).

This major mode provides support for syntax highlighting, indentation, navigation, Imenu, “which function” (i.e., displaying the current function name in the mode line), outlining (via outline-minor-mode), casing and completion.

Note: This major mode is based on the Emacs 29 (or newer) built-in tree-sitter support, not to be confused with the separate Emacs tree-sitter package. The two are not compatible with each other.

Note: Outlining is made available through Emacs 30 (or newer) built-in support for outlining within tree-sitter major modes, therefore it is required for outline-minor-mode to work with gpr-ts-mode.

Prerequisites

There are a couple of requirements which must be met in order to use tree-sitter powered major modes. The Emacs documentation should be consulted which will provide complete details. The following are the main points to consider:

  • Emacs must have been built with tree-sitter support. Versions of Emacs prior to Emacs 29 do not have built-in support. The built-in support is optionally enabled when Emacs is built. It is almost guaranteed that any pre-built Emacs 29 or newer executable supplied as part of an OS package manager (e.g., APT, Homebrew, MSYS2, etc.) will have been built with this capability enabled.
  • The tree-sitter shared library must be installed on your system. The specifics of how to do this will vary based on the Operating System. However, a pre-built Emacs 29 or newer executable that is installed by the OS package manager will likely have identified this library as a prerequisite and have installed it along with Emacs. Therefore, it is unlikely you will need to manually install this library if you installed Emacs with an OS package manager.

    The following command can be used to determine if tree-sitter support is enabled in Emacs and whether the tree-sitter library can be found on your system:

    • M-: (treesit-available-p) RET
    • Make sure it indicates t in the echo area instead of nil.

Installation

There are multiple ways in which a package can be installed in Emacs. The most convenient way is to use a package archive, however installation directly from the git repository is also possible. In addition, there are multiple third party package managers available, but installation instructions in this section will focus only on the built-in package manager (i.e., package.el). It is assumed that power-users will not need direction as to how to use other package managers.

In addition to package management, it is also common practice to perform package configuration. There are also multiple third party packages for managing your package configuration, however use-package is now built-in to Emacs. Refer to the example configuration section for ideas on how to utilize use-package to setup your own personal configuration.

From the MELPA Package Archive

This package can be installed from the MELPA package archive using the Emacs built-in package manager (i.e., package.el). MELPA is not configured in the package manager by default, but the following can be used to configure the use of the MELPA archive. Refer to Getting Started for additional details on configuring and using the MELPA package archive.

(add-to-list 'package-archives '("melpa" . "https://melpa.org/packages/") t)

Once configured as above, instruct the package manager to refresh the available packages and to perform the installation, as follows:

  • M-x package-refresh-contents RET
  • M-x package-install RET gpr-ts-mode RET

From the Git Repository

Installation directly from the source repository is possible using package-vc-install. The following command can be used to perform this installation:

M-x package-vc-install RET https://github.com/brownts/gpr-ts-mode RET

Grammar Installation

In order for gpr-ts-mode to be useful, it needs to have the specific tree-sitter GPR language grammar library installed. This library is different from the tree-sitter library mentioned in the prerequisites section (e.g., libtree-sitter.so vs libtree-sitter-gpr.so). The library is not bundled with gpr-ts-mode, but is maintained separately. With the default configuration, the first time gpr-ts-mode is loaded (in the absence of an existing installed library) it will prompt to download, build and install the grammar library. The following settings provide control over this activity.

User Option: gpr-ts-mode-grammar
Location of the tree-sitter GPR language grammar to be used by gpr-ts-mode.
User Option: gpr-ts-mode-grammar-install
Controls the level of automation in installing the grammar library (automatic, prompt first, etc).

In order to build the library, you will need to have a C compiler installed. Refer to the Emacs documentation surrounding treesit-install-language-grammar, as gpr-ts-mode uses the built-in Emacs functionality to perform the download, building and installation of the library.

It’s also possible to skip this step if you already have a pre-built library for the language. In which case, placing the pre-built library in the correct location will allow gpr-ts-mode to find and use the library. You can customize treesit-extra-load-path to add extra locations to search for libraries.

The README for the GPR language grammar, provides instructions on building the library from source if you’d rather perform that step manually or don’t have the correct toolchain installed in order for this to be performed automatically. You will only be prompted if the library can’t be found in one of the expected locations. The prompting can also be controlled by changing the gpr-ts-mode-grammar-install setting.

If manually installing, or troubleshooting the installation of the GPR language grammar, you can use the following to check whether Emacs can locate the library:

  • M-: (treesit-ready-p 'gpr t) RET
  • Make sure it indicates t in the echo area instead of nil.

Syntax Highlighting

There are 4 different levels of syntax highlighting available, providing an increasing amount of highlighting. By default in Emacs, level 3 (controlled by treesit-font-lock-level) is used to provide a compromise between providing too little and too much fontification. It should be noted that the levels are cumulative, meaning that each level also includes all of the fontification in the levels below it. The following provides the list of features and how they are mapped to the different font lock levels.

Level 1
comment, definition
Level 2
keyword, string, type
Level 3
attribute, function, number, operator, package, variable
Level 4
bracket, delimiter, error

The following user options can be customized to modify the syntax highlighting characteristics:

User Option: gpr-ts-mode-package-names
List of known package names to highlight within variable references. There are ambiguities in the syntax tree for variable references. For example, Foo'Bar could be a reference to the Bar attribute in the package Foo of the current project, or it could be a reference to the top-level Bar attribute in the project Foo. There are additional ambiguities when child packages are involved since references such as Foo.X'Bar exhibit the same problem. This ambiguity applies not just to variable attribute references but also to regular variable references since variables can exist at the project level as well as within a package. If package names in variable references are not properly highlighted, this causes an inconsistency between highlighting of package names in package declarations and not properly highlighting package names in variable references. Actively monitoring package names in the current project as well as all withed projects is considered too heavy handed and instead we settle for maintaining a list of package names. Since custom package names can be introduced (although uncommon), the list of names can be customized, but is initialized with the set of well-known names.

Indentation

Indentation follows the nesting structure of the language. Each nested level is indented a fixed amount. Thus the general indentation offset governs the amount of this indentation. Additional configurations address special cases, such as the indentation of a construct spanning multiple lines (i.e., broken indent). The following configurations can be used to customize these indentation levels to match your own style.

User Option: gpr-ts-mode-indent-offset
Indentation used for structural visualization
User Option: gpr-ts-mode-indent-when-offset
Indentation for case items and comments, relative to a case construction.
User Option: gpr-ts-mode-indent-broken-offset
Continuation indentation when item does not completely reside on a single line.
User Option: gpr-ts-mode-indent-exp-item-offset
Continuation indentation for partial expressions (i.e., terms, concatenation, etc).
User Option: gpr-ts-mode-indent-strategy
Indentation strategy to use, typically when RET or TAB are pressed during editing. Historically, this would drive a line-based indentation, however more complex indentation strategies can be used due to the availability of the syntax tree.

It should be noted that the offsets defined above are described in terms of each other, so that by customizing the standard indentation offset gpr-ts-mode-indent-offset, the other indentation offsets will be adjusted accordingly. In other words, those settings either use the same value or a value derived from it. Therefore, customizing the base value will have an affect on the remaining values as well. If this is not the desired outcome, the other offsets should be customized as well.

Stacking of a list of items on subsequent lines is supported and the indentation will align stacked items under each other. This applies to lists, function call parameters, import lists, discrete choice lists in case constructions, etc.

Indentation rules used in gpr-ts-mode, as in all tree-sitter based modes, are based on the syntax of the language. When what currently exists in the buffer is not syntactically correct, the in-memory syntax tree will contain errors, since the buffer doesn’t adhere to the grammar rules of the language (i.e., it contains syntax errors). To help combat this issue, specific indentation error recovery is used to maintain indentation even when portions of the syntax are missing, providing a best-effort approach to maintain accurate indentation. Furthermore, the indentation strategy can help recover from previously incorrect indentation that has occurred while the buffer was in a syntactically invalid state.

The default declaration setting for gpr-ts-mode-indent-strategy will re-indent at the declaration level once the declaration is in a syntactically valid state. When invalid syntax exists within a declaration, this strategy reverts back to “best effort” line-based indentation. Once the syntax becomes valid, indentation will be applied at the declaration level.

If none of the existing indentation strategies are sufficient, a custom strategy can be created and used. In order to create a strategy, a new strategy symbol should be specified in gpr-ts-mode-indent-strategy, and an implementation of gpr-ts-mode-indent should be created, specializing on the new strategy symbol name. Refer to existing instances of this function to understand how current strategy functions are implemented.

Navigation

The following specialized navigation functions exist and are applied to GPR projects. Since function declarations don’t exist for GPR project files, this is repurposed to navigate packages and projects instead. This re-purposing of function to project/package is also extended to which-function-mode support and will show the current project and package in the mode line, when enabled.

Key: C-M-a (treesit-beginning-of-defun)
Move backward to beginning of package or project
Key: C-M-e (treesit-end-of-defun)
Move forward to next end of package or project

Starting with Emacs 30, additional navigation functions are provided for S-expression and sentence traversal using the standard Emacs commands (i.e., forward-sexp, forward-sentence, etc).

Imenu

With the provided Imenu support, additional options are available for ease of navigation within a single GPR file. Imenu supports indexing of attributes, packages, projects, type declarations, variable declarations and with clauses. Custom categories can also be defined.

User Option: gpr-ts-mode-imenu-categories
The set of categories to be used for Imenu. Since there are a number of different categories supported, it may be a distraction to display categories that aren’t desired. Therefore, the set of categories can be customized to reduce clutter or to increase performance. The order in which the categories are listed will be respected when the Imenu indexing is performed. This is helpful if specific ordering of categories is desired.
User Option: gpr-ts-mode-imenu-category-name-alist
The mapping between categories and the displayed name for the category. This customization may be helpful if you are expecting a specific name for a category, use plural instead of singular nouns, or want to customize for internationalization.

The items in each Imenu category can be sorted for each nesting level. The specific ordering is controlled via imenu-sort-function, which can be customized to specify the desired sorting function. When imenu-sort-function is nil, items are listed in the order they appear in the buffer. It should be noted that while the mode applies the sorting as specified by imenu-sort-function as well as the category ordering specified in gpr-ts-mode-imenu-categories, downstream functionality (such as completion candidates in the minibuffer) may rearrange the order of items.

If none of the existing categories are sufficient, or an additional category is desired, a custom category can be created and used. In order to create a category, a new category symbol should be added to gpr-ts-mode-imenu-categories, a name mapping should be added to gpr-ts-mode-imenu-category-name-alist, and an implementation of gpr-ts-mode-imenu-index should be created, specializing on the new category symbol name. Refer to existing instances of this function to understand how current category index functions are implemented.

Casing

Functionality is provided to manage casing within a GNAT Project buffer. This includes a number of commands to apply casing rules to point, to a region or to the entire buffer. The commands can be used interactively, but may also be useful in other ways (such as calling the buffer casing command from the buffer-save-hook). An auto-casing minor mode is also provided which performs case formatting as you type.

User Option: gpr-ts-mode-case-formatting
A set of casing rules, used by the casing commands and modes, which apply category specific formatting and dictionaries.
Command: gpr-ts-auto-case-mode
A minor mode which applies casing corrections as you type at each word or subword boundary.
Command: gpr-ts-mode-case-format-region
Applies formatting to the specified region for all items in the region which match the case formatting categories specified in gpr-ts-mode-case-formatting. When the region doesn’t start or stop exactly on a category boundary, the range is expanded to the category boundary.
Command: gpr-ts-mode-case-format-at-point
Applies formatting to point, assuming the item at point matches a case formatting category specified in gpr-ts-mode-case-formatting. When no category is matched, no formatting is applied.
Command: gpr-ts-mode-case-format-buffer
Applies formatting to the entire buffer for all items in the buffer which match the case formatting categories specified in gpr-ts-mode-case-formatting.
Command: gpr-ts-mode-case-format-dwim
A convenience command which will select between “region” and “at point” formatting depending on whether a region is active in the buffer.

If none of the existing casing categories are sufficient, or an additional category is desired, a custom category can be created and used. In order to create a category, a new category symbol should be added to gpr-ts-mode-case-formatting along with the corresponding formatter and optional dictionary properties. Additionally, an implementation of gpr-ts-mode-case-category-p should be created, specializing on the new category symbol name. Refer to existing instances of this predicate function to understand how current category functions are implemented.

Completion

This major mode provides its own set of completions for GNAT Project buffers. The completions are organized by completion categories (i.e., package name completion, attribute name completion, etc). Individual categories can be enabled or disabled as desired through customization. Furthermore, the set of categories can be extended to provide categories not originally envisioned.

User Option: gpr-ts-mode-completion-categories
The set of categories to be used for completion. Categories can be removed from this set to tailor completion if desired. If no completions are desired, setting to nil will disable all gpr-ts-mode provided completions.
User Option: gpr-ts-mode-completion-definitions
Definitions used to populate package and attribute completions. Attributes are specified based on the package (or top-level project) in which they belong, providing package-specific attribute name completions.

If none of the existing categories are sufficient, or an additional category is desired, a custom category can be created and used. In order to create a category, a new category symbol should be added to gpr-ts-mode-completion-categories and an implementation of gpr-ts-mode-completion-at-point should be created, specializing on the new category symbol name. Refer to existing instances of this function to understand how current category functions are implemented.

Troubleshooting

Org Mode Source Code Blocks

When Org Mode doesn’t know the major mode for the language of a source block, it will guess by appending “-mode” to the end of the language name. If we use a language name of “gpr”, this means it will look for a major mode named “gpr-mode”. This default behavior doesn’t work if we want to use Tree-Sitter enabled modes. Maybe in the future it will be aware of these modes, but in the meantime, we can explicitly configure Org Mode to map to the Tree-Sitter major mode using the customization variable org-src-lang-modes.

The following can be added to your configuration to persist the setting:

(with-eval-after-load 'org-src
  (add-to-list 'org-src-lang-modes '("gpr" . gpr-ts)))

LSP Overriding Imenu

The mode’s Imenu support might be overridden if an LSP client is used with the major mode. The mode typically provides a more organized and configurable Imenu experience than that provided by the Language Server. In such cases, in order to use the mode’s built-in Imenu support rather than that provided via the Language Server, Imenu support must be disabled in the LSP client’s configuration. For Eglot, LSP-provided Imenu is disabled by adding the imenu symbol to the list in the eglot-stay-out-of variable.

(setq eglot-stay-out-of '(imenu))

For lsp-mode, LSP-provided Imenu is disabled by clearing the lsp-enable-imenu user option.

(setq lsp-enable-imenu nil)

LSP Overriding Completion

The mode’s completion support might be overridden if an LSP client is used with the major mode. This might be fine for cases where the completion functionality provided via LSP is equivalent to what is provided by the mode. However, if the mode provides additional completion capability not provided via LSP, this may be undesirable. When the LSP client completion support does not explicitly indicate that it’s completion support is “not exclusive”, Emacs may not query the mode’s completion support. To identify the completion provider, the first thing to do is examine the buffer-local completion-at-point-functions variable. It’s a list of completion functions. They will be queried in the order listed. If the gpr-ts-mode’s completion function is not listed first, it won’t be queried first (or maybe at all depending on the provider’s exclusivity setting). It should be obvious from the function name listed whether the provider is an LSP client.

For Eglot, LSP-provided completion can be disabled by adding :completionProvider to eglot-ignored-server-capabilities. For lsp-mode, completion can be configured through the lsp-completion-enable user option. Additionally, completion can be toggled manually with the lsp-completion-mode minor mode.

Example Configuration

The following is an example configuration using use-package to manage this configuration. It assumes that package.el is your package manager. This checks to make sure tree-sitter support is enabled in Emacs before attempting to install/configure the package, thus your configuration will remain compatible with versions of Emacs which don’t yet support tree-sitter, and will not install and configure this package in its absence.

(when (and (fboundp 'treesit-available-p)
           (treesit-available-p))
  (use-package gpr-ts-mode
    :ensure t
    :defer t ; autoload updates `auto-mode-alist'
    :init
    ;; Configure source blocks for Org Mode.
    (with-eval-after-load 'org-src
      (add-to-list 'org-src-lang-modes '("gpr" . gpr-ts)))
    ;; Enable auto-casing
    :hook (gpr-ts-mode . gpr-ts-auto-case-mode)))

;; Configure Imenu

(use-package imenu
  :ensure nil ; built-in
  :custom (imenu-auto-rescan t)
  :hook (gpr-ts-mode . imenu-add-menubar-index))

Resources

  • Recommended packages:
  • Extended example configuration:

Command Index

Keystroke Index

Variable Index

About

GNAT Project major mode using tree-sitter for Emacs

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published