Skip to content

Commit

Permalink
Update README.md to refer to microsoft/multilspy
Browse files Browse the repository at this point in the history
  • Loading branch information
LakshyAAAgrawal committed Aug 8, 2024
1 parent 49d0a9b commit 5003004
Showing 1 changed file with 11 additions and 7 deletions.
18 changes: 11 additions & 7 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,9 +8,11 @@ This repository hosts the official code and data artifact for the paper ["Monito
1. [Datasets](#1-datasets): PragmaticCode and DotPrompts
2. [Evaluation scripts](#2-evaluation-scripts): Scripts to evaluate LMs by taking as input inferences (code generated by the model) for examples in DotPrompts and producing score@k scores for the metrics reported in the paper: Compilation Rate (CR), Next-Identifier Match (NIM), Identifier-Sequence Match (ISM) and Prefix Match (PM).
3. [Inference Results over DotPrompts](#3-inference-results-over-dotprompts): Generated code for examples in DotPrompts with various model configurations reported in the paper. The graphs and tables reported in the paper can be reproduced by running the evaluation scripts on the provided inference results.
4. [`multilspy`](#4-multilspy): A language server client, to easily obtain and use results of various static analyses provided by a large variety of language servers that communicate over the [Language Server Protocol](https://microsoft.github.io/language-server-protocol/). `multilspy` is intended to be used as a library to easily query various language servers, without having to worry about setting up their configurations and implementing the client-side of language server protocol. `multilspy` currently supports running language servers for Java, Rust, C# and Python, and we aim to expand this list with the help of the community.
4. [`multilspy`](#4-multilspy): A cross-platform library designed to simplify the process of creating language server clients to query and obtain results of various static analyses from a wide variety of language servers that communicate over the [Language Server Protocol](https://microsoft.github.io/language-server-protocol/). `multilspy` is intended to be used as a library to easily query various language servers, without having to worry about setting up their configurations and implementing the client-side of language server protocol. `multilspy` currently supports running language servers for Java, Rust, C# and Python, and we aim to expand this list with the help of the community.
5. [Monitor-Guided Decoding](#5-monitor-guided-decoding): Implementation of various monitors monitoring for different properties reported in the paper (for example: monitoring for type-valid identifier dereferences, monitoring for correct number of arguments to method calls, monitoring for typestate validity of method call sequences, etc.), spanning 3 programming languages.

**The `multilspy` library has now been migrated to [microsoft/multilspy](https://github.com/microsoft/multilspy).**

## Monitor-Guided Decoding: Motivating Example
For example, consider the partial code to be completed in the figure below. To complete this code, an LM has to generate identifiers consistent with the type of the object returned by `ServerNode.Builder.newServerNode()`. The method `newServerNode` and its return type, class `ServerNode.Builder`, are defined in another file. If an LM does not have information about the `ServerNode.Builder type`, it ends up hallucinating, as can be seen in the example generations with the text-davinci-003 and SantaCoder models. The completion uses identifiers `host` and `port`, which do not exist in the type `ServerNode.Builder`. The generated code therefore results in “symbol not found” compilation errors.

Expand Down Expand Up @@ -105,6 +107,8 @@ python3 evaluation_scripts/eval_results.py inference_results/dotprompts_results.
The above command creates a directory [results](results/) (already included in the repository), containing all the figures and tables provided in the paper along with extra details. The command also generates a report in the output directory which relates the generated figures to sections in the paper. In case of above command, the report is generated at [results/Report.md](results/Report.md).

## 4. `multilspy`
**The `multilspy` library has now been migrated to [microsoft/multilspy](https://github.com/microsoft/multilspy).**

`multilspy` is a cross-platform library that we have built to set up and interact with various language servers in a unified and easy way. [Language servers]((https://microsoft.github.io/language-server-protocol/overviews/lsp/overview/)) are tools that perform a variety of static analyses on source code and provide useful information such as type-directed code completion suggestions, symbol definition locations, symbol references, etc., over the [Language Server Protocol (LSP)](https://microsoft.github.io/language-server-protocol/overviews/lsp/overview/). `multilspy` intends to ease the process of using language servers, by abstracting the setting up of the language servers, performing language-specific configuration and handling communication with the server over the json-rpc based protocol, while exposing a simple interface to the user.

Since LSP is language-agnostic, `multilspy` can provide the results for static analyses of code in different languages over a common interface. `multilspy` is easily extensible to any language that has a Language Server and currently supports Java, Rust, C# and Python and we aim to support more language servers from the [list of language server implementations](https://microsoft.github.io/language-server-protocol/implementors/servers/).
Expand All @@ -120,15 +124,15 @@ Some of the analyses results that `multilspy` can provide are:
### Installation
To install `multilspy` using pip, execute the following command:
```
pip install https://github.com/microsoft/monitors4codegen/archive/main.zip
pip install https://github.com/microsoft/multilspy/archive/main.zip
```

### Usage
Example usage:
```python
from monitors4codegen.multilspy import SyncLanguageServer
from monitors4codegen.multilspy.multilspy_config import MultilspyConfig
from monitors4codegen.multilspy.multilspy_logger import MultilspyLogger
from multilspy import SyncLanguageServer
from multilspy.multilspy_config import MultilspyConfig
from multilspy.multilspy_logger import MultilspyLogger
...
config = MultilspyConfig.from_dict({"code_language": "java"}) # Also supports "python", "rust", "csharp"
logger = MultilspyLogger()
Expand Down Expand Up @@ -156,7 +160,7 @@ with lsp.start_server():

`multilspy` also provides an asyncio based API which can be used in async contexts. Example usage (asyncio):
```python
from monitors4codegen.multilspy import LanguageServer
from multilspy import LanguageServer
...
lsp = LanguageServer.create(...)
async with lsp.start_server():
Expand All @@ -166,7 +170,7 @@ async with lsp.start_server():
...
```

The file [src/monitors4codegen/multilspy/language_server.py](src/monitors4codegen/multilspy/language_server.py) provides the `multilspy` API. Several tests for `multilspy` present under [tests/multilspy/](tests/multilspy/) provide detailed usage examples for `multilspy`. The tests can be executed by running:
The file [microsoft/multilspy - src/multilspy/language_server.py](https://github.com/microsoft/multilspy/blob/main/src/multilspy/language_server.py) provides the `multilspy` API. Several tests for `multilspy` present under [microsoft/multilspy - tests/multilspy/](https://github.com/microsoft/multilspy/tree/main/tests/multilspy) provide detailed usage examples for `multilspy`. The tests can be executed by running:
```bash
pytest tests/multilspy
```
Expand Down

0 comments on commit 5003004

Please sign in to comment.