Skip to content

Latest commit

 

History

History
490 lines (406 loc) · 22.1 KB

README.org

File metadata and controls

490 lines (406 loc) · 22.1 KB

Ada Major Mode using Tree-Sitter

MELPA CI

A major mode for Ada files, which utilizes the Emacs built-in support for tree-sitter, first available starting with Emacs 29. The tree-sitter functionality is used to build an in-memory concrete syntax tree of the parsed language, allowing operations such as syntax highlighting to be performed more accurately than historical methods (e.g., regular expressions).

This major mode provides support for syntax highlighting, navigation, Imenu, “which function” (i.e., displaying the current function name in the mode line), indentation (via LSP) and outlining (via outline-minor-mode).

Note: This major mode is based on the Emacs 29 (or newer) built-in tree-sitter support, not to be confused with the separate Emacs tree-sitter package. The two are not compatible with each other.

Note: Outlining is made available through Emacs 30 (or newer) built-in support for outlining within tree-sitter major modes, therefore it is required for outline-minor-mode to work with ada-ts-mode.

Prerequisites

There are a couple of requirements which must be met in order to use tree-sitter powered major modes. The Emacs documentation should be consulted which will provide complete details. The following are the main points to consider:

  • Emacs must have been built with tree-sitter support. Versions of Emacs prior to Emacs 29 do not have built-in support. The built-in support is optionally enabled when Emacs is built. It is almost guaranteed that any pre-built Emacs 29 or newer executable supplied as part of an OS package manager (e.g., APT, Homebrew, MSYS2, etc.) will have been built with this capability enabled.
  • The tree-sitter shared library must be installed on your system. The specifics of how to do this will vary based on the Operating System. However, a pre-built Emacs 29 or newer executable that is installed by the OS package manager will likely have identified this library as a prerequisite and have installed it along with Emacs. Therefore, it is unlikely you will need to manually install this library if you installed Emacs with an OS package manager.

    The following command can be used to determine if tree-sitter support is enabled in Emacs and whether the tree-sitter library can be found on your system:

    • M-: (treesit-available-p) RET
    • Make sure it indicates t in the echo area instead of nil.

Installation

There are multiple ways in which a package can be installed in Emacs. The most convenient way is to use a package archive, however installation directly from the git repository is also possible. In addition, there are multiple third party package managers available, but installation instructions in this section will focus only on the built-in package manager (i.e., package.el). It is assumed that power-users will not need direction as to how to use other package managers.

In addition to package management, it is also common practice to perform package configuration. There are also multiple third party packages for managing your package configuration, however use-package is now built-in to Emacs. Refer to the example configuration section for ideas on how to utilize use-package to setup your own personal configuration.

From the MELPA Package Archive

This package can be installed from the MELPA package archive using the Emacs built-in package manager (i.e., package.el). MELPA is not configured in the package manager by default, but the following can be used to configure the use of the MELPA archive. Refer to Getting Started for additional details on configuring and using the MELPA package archive.

(add-to-list 'package-archives '("melpa" . "https://melpa.org/packages/") t)

Once configured as above, instruct the package manager to refresh the available packages and to perform the installation, as follows:

  • M-x package-refresh-contents RET
  • M-x package-install RET ada-ts-mode RET

From the Git Repository

Installation directly from the source repository is possible using package-vc-install. The following command can be used to perform this installation:

M-x package-vc-install RET https://github.com/brownts/ada-ts-mode RET

Grammar Installation

In order for ada-ts-mode to be useful, it needs to have the specific tree-sitter Ada language grammar library installed. This library is different from the tree-sitter library mentioned in the prerequisites section (e.g., libtree-sitter.so vs libtree-sitter-ada.so). The library is not bundled with ada-ts-mode, but is maintained separately. With the default configuration, the first time ada-ts-mode is loaded (in the absence of an existing installed library) it will prompt to download, build and install the grammar library. The following settings provide control over this activity.

User Option: ada-ts-mode-grammar
Location of the tree-sitter Ada language grammar to be used by ada-ts-mode.
User Option: ada-ts-mode-grammar-install
Controls the level of automation in installing the grammar library (automatic, prompt first, etc).

In order to build the library, you will need to have a C compiler installed. Refer to the Emacs documentation surrounding treesit-install-language-grammar, as ada-ts-mode uses the built-in Emacs functionality to perform the download, building and installation of the library.

It’s also possible to skip this step if you already have a pre-built library for the language. In which case, placing the pre-built library in the correct location will allow ada-ts-mode to find and use the library. You can customize treesit-extra-load-path to add extra locations to search for libraries.

You will only be prompted if the library can’t be found in one of the expected locations. The prompting can also be controlled by changing the ada-ts-mode-grammar-install setting.

If manually installing, or troubleshooting the installation of the Ada language grammar, you can use the following to check whether Emacs can locate the library:

  • M-: (treesit-ready-p 'ada t) RET
  • Make sure it indicates t in the echo area instead of nil.

Syntax Highlighting

There are 4 different levels of syntax highlighting available, providing an increasing amount of highlighting. By default in Emacs, level 3 (controlled by treesit-font-lock-level) is used to provide a compromise between providing too little and too much fontification. It should be noted that the levels are cumulative, meaning that each level also includes all of the fontification in the levels below it. The following provides the list of features and how they are mapped to the different font lock levels.

Level 1
comment, definition
Level 2
keyword, preprocessor, string, type
Level 3
attribute, assignment, constant, control, function, number, operator
Level 4
bracket, delimiter, error, label

Indentation

Built-in tree-sitter support for indentation is not currently available. However, if ada-ts-mode is used in conjunction with the Ada Language Server, indentation support can be provided by the language server itself.

The following user options can be customized to modify the indentation as needed.

User Option: ada-ts-mode-indent-backend
Selects the backend used to provide indentation support. The “Default” setting uses the default settings for indent-line-function and indent-region-function. See their documentation for further details, although this usually performs indentation relative to the previous line. The “LSP” setting will use the Language Server indentation support when a LSP client is active for the buffer. If no LSP client is active this falls back to using the “Default” indentation.
User Option: ada-ts-mode-indent-offset
Indentation used for structural visualization.

Ada Language Server Indentation

Since the Ada Language Server provides code formatting, not just an indentation engine, it may be necessary to configure the settings of the code formatter to meet behavioral desires. Under the hood, Ada LS uses the GNAT pretty printer engine to format the line or region of the buffer. For minimal impact to the buffer, it’s likely desirable to use the --source-line-breaks switch to prevent the pretty printer from reformatting the buffer beyond indentation. Refer to the GNAT User’s Guide for the full set of options available for the pretty printer.

The value of ada-ts-mode-indent-offset is provided to the Ada Language Server as the LSP “tab size” parameter. This corresponds to the main indentation amount (i.e., the --indentation switch for the pretty printer).

The switches for the pretty printer should be configured in the project’s GPR file. GPR files support a Pretty_Printer package which is where the switches should reside. The Ada LS will read the pretty printer switches from the project’s GPR file to control formatting.

Navigation

The major mode implements the normal source navigation commands which can be used to move around the buffer (i.e., C-M-a, C-M-e, etc). It should also be noted that which-function-mode is also supported and will show the current package and/or subprogram in the mode line, when enabled.

Command: ada-ts-mode-find-other-file
Locate the corresponding body or specification file related to the file in the current buffer. When an LSP client is active, the Ada Language Server will be commanded to direct navigation to the other file. This is the most accurate method as the Language Server will take into consideration non-standard naming conventions that are described in the GNAT project file. When no LSP client is active, ff-find-other-file will be used to find the corresponding file.
User Option: ada-ts-mode-other-file-alist
File extension mapping used to direct ff-find-other-file to find the corresponding file. This customization is used to define the buffer local ff-find-other-file-alist.
Command: ada-ts-mode-find-project-file
Locate the GNAT project file related to the source file of the current buffer. When an LSP client is active, the Ada Language Server will be queried for the project file, otherwise Alire will be queried if an alire.toml file exists in the directory hierarchy and the Alire executable is found in the path. If neither of these two conditions occur, a single project file will be searched for in the root directory of the project if one exists. Finally, if the project file still is not found, a search will look for a defaut.gpr file somewhere in the directory hierarchy from the current buffer file’s directory.
User Option: ada-ts-mode-alire-program
The name of the Alire executable program that will be searched for on the path, and if found, may be invoked for certain ada-ts-mode commands (such as to query for the name and location of the GNAT project file).

Imenu

With the provided Imenu support, additional options are available for ease of navigation within an Ada source file. Imenu supports indexing of declarations, bodies and stubs for packages, subprograms, task units and protected units as well as type declarations and with clauses.

User Option: ada-ts-mode-imenu-categories
The set of categories to be used for Imenu. Since there are a number of different categories supported, it may be a distraction to display categories that aren’t desired. Therefore, the set of categories can be customized to reduce clutter or to increase performance. The order in which the categories are listed will be respected when the Imenu indexing is performed. This is helpful if specific ordering of categories is desired.
User Option: ada-ts-mode-imenu-category-name-alist
The mapping between categories and the displayed name for the category. This customization may be helpful if you are expecting a specific name for a category, use plural instead of singular nouns, or want to customize for internationalization.
User Option: ada-ts-mode-imenu-nesting-strategy-function
Function to use for constructing nested items within the Imenu data structure. The specific nesting function the user should use will depend on which user interface is used to consume Imenu data, as different interfaces behave differently with respect to how they handle nested items. Some interfaces will display both an entry for the item as well as an entry for items nested within that item. In that case, using “Place Before Nested Entries” is a good choice. Other user interfaces remove duplicate entries, so using “Place Within Nested Entries” will create a placeholder entry in the list of sub-items. If none of these are satisfactory, a custom function can be used to implement a different strategy.
User Option: ada-ts-mode-imenu-nesting-strategy-placeholder
Placeholder to use for the “Place Within Nested Entries” strategy. This placeholder could also be used with a custom function if it supports a placeholder. If this option is customized, it should be configured such that it doesn’t interfere with other valid item names. Therefore, it is suggested to choose a placeholder which is not a valid item name. For example, surrounding a string with parenthesis or brackets.
User Option: ada-ts-mode-imenu-sort-function
The items in each Imenu category can be sorted for each nesting level. The current options are to use “In Buffer Order” which will list the items as they appear in the buffer, or to sort the items “Alphabetically”. Optionally, a custom sort function can be used if neither of these are suitable. A custom sort function should be aware of the possible existence of a placeholder and to order that before any “sorted” items in order to put the placeholder at the top of the list.

Miscellaneous Commands

This section identifies mode-specific commands which are provided to the user. The expectation is that if the user finds these commands useful, they will bind them into the local mode map. The commands are not bound by default as it would be presumptuous to assume where the user would want these bound, if at all. Binding examples are provided in the example configuration section.

Command: ada-ts-mode-defun-comment-box
Creates a comment box above the defun enclosing point. A comment box is typically used to visually identify the start of a subprogram. Not only can this command be used in subprograms, but it can be used within anything considered a defun, which includes declarations, bodies, body stubs, generic instantiations and renaming declarations for packages, task units, protected units, subprograms, etc. The comment box will be indented to the same level as the enclosing defun. The following demonstrates the look of a defun comment box for a subprogram body.
-----------------
-- Hello_World --
-----------------

procedure Hello_World is
begin
   Put_Line ("Hello, world!");
end Hello_World;
    

LSP Client Support

In order to integrate functionality provided by the Ada Language Server (such as indentation), ada-ts-mode must interact with an active LSP client. Since multiple LSP clients exist, this interaction must be configurable such that additional LSP clients can be added when needed. Additionally, ada-ts-mode must be able to determine which LSP client is active in the buffer. If there is no active LSP client in the buffer, ada-ts-mode will not be able to make use of LSP provided capabilities and thus falls back on providing this capability without it’s support, which likely will be less precise.

It’s up to the user to configure the LSP client to be active in an ada-ts-mode buffer, typically through the use of the ada-ts-mode-hook to enable the LSP client minor mode, although those clients which utilize directory local variables for configuration (e.g., GPR project filename), may need to hook into the hack-local-variables-hook in order to initialize after the directory local variables have been initialized. Refer to the documentation for the specific LSP client used for details.

A generic interface is used to support different LSP clients. This interface is described in ada-ts-mode-lspclient.el. There are separate files for each supported LSP client which implement those interfaces. These separate files are configured to automatically load when both ada-ts-mode and the corresponding LSP client package have been loaded.

To add an unsupported LSP client, add client-specific methods for all of the interfaces described in ada-ts-mode-lspclient.el. In addition, a parameterless function for that client should be added to the ada-ts-mode-lspclient-find-functions hook variable (via add-hook). That function, when invoked, should determine if the LSP client is active in the current buffer, and when active, return an LSP client instance (which can be used to dispatch on the client-specific methods), else return nil.

Troubleshooting

Org Mode Source Code Blocks

When Org Mode doesn’t know the major mode for the language of a source block, it will guess by appending “-mode” to the end of the language name. If we use a language name of “ada”, this means it will look for a major mode named “ada-mode”. This default behavior doesn’t work if we want to use Tree-Sitter enabled modes. Maybe in the future it will be aware of these modes, but in the meantime, we can explicitly configure Org Mode to map to the Tree-Sitter major mode using the customization variable org-src-lang-modes.

The following can be added to your configuration to persist the setting:

(with-eval-after-load 'org-src
  (add-to-list 'org-src-lang-modes '("ada" . ada-ts)))

Function Calls not Highlighted Correctly

If you observe places in the syntax highlighting where functions calls are not being properly highlighted, such as an array being highlighted as a function call, or a parameterless function call not being highlighted, this is due to ambiguities in the syntax. From a pure syntax perspective, array accesses look the same as function calls. Also parameterless function calls look the same as variable accesses. In places where it can be determined from the syntax (such as a generic package instantiation) care is taken to avoid highlighting these places as function calls. In other places, it cannot be known from the syntax tree alone and that is where the syntax highlighting will become inaccurate.

One way to address this slightly inaccurate syntax highlighting of function calls, is simply to disable it. An easy way to perform this is through the use of the treesit-font-lock-recompute-features function. Using this function when loading the major mode will allow you to customize which features are enabled/disabled from the default settings.

(defun ada-ts-mode-setup ()
  (treesit-font-lock-recompute-features nil '(function)))

(add-hook 'ada-ts-mode-hook #'ada-ts-mode-setup)

If disabling function call highlighting is not sufficiently satisfying, another approach is to augment the syntax highlighting of ada-ts-mode with that of the Ada language server. The language server is capable of providing semantic highlighting, which is what is needed in this situation. Refer to the Ada language server and a corresponding Emacs Language Server Protocol (LSP) client. Not all LSP clients for Emacs support semantic highlighting, so investigate first before selecting one. When semantic highlighting is used with ada-ts-mode, inaccurate function call highlighting will be corrected. This includes both places where function calls are being highlighted, which aren’t real function calls, as well as places which are function calls but are not being highlighted. In addition to function call highlighting, semantic highlighting provides highlighting of other semantic information, therefore it is highly recommended.

If you do observe places where function call syntax highlighting is inaccurate and it can be determined from the syntax tree, this is considered a bug and should be reported by filing an issue against the package.

Example Configuration

The following is an example configuration using use-package to manage this configuration. It assumes that package.el is your package manager. This checks to make sure tree-sitter support is enabled in Emacs before attempting to install/configure the package, thus your configuration will remain compatible with versions of Emacs which don’t yet support tree-sitter, and will not install and configure this package in its absence. This also demonstrates how to configure indentation support through the Ada Language Server (assuming the user has configured an LSP client to be active in the buffer).

(when (and (fboundp 'treesit-available-p)
           (treesit-available-p))
  (use-package ada-ts-mode
    :ensure t
    :defer t ; autoload updates `auto-mode-alist'
    :custom (ada-ts-mode-indent-backend 'lsp) ; Use Ada LS indentation
    :bind (:map ada-ts-mode-map
                (("C-c C-b" . ada-ts-mode-defun-comment-box)
                 ("C-c C-o" . ada-ts-mode-find-other-file)
                 ("C-c C-p" . ada-ts-mode-find-project-file)))
    :init
    ;; Configure source blocks for Org Mode.
    (with-eval-after-load 'org-src
      (add-to-list 'org-src-lang-modes '("ada" . ada-ts)))))

;; Configure Electric Pair

(use-package elec-pair
  :ensure nil ; built-in
  :hook (ada-ts-mode . electric-pair-local-mode))

Command & Function Index

Variable Index