
Commit

carlrobertoh#178 - Add support for running local LLMs via LLaMA C/C++ port (carlrobertoh#249)

* Initial implementation of integrating llama.cpp to run LLaMA models locally

* Move submodule

* Copy llama submodule to bundle

* Support for downloading models from IDE

* Code cleanup

* Store port field

* Replace service selection radio group with dropdown

* Add quantization support + other fixes

* Add option to override host

* Fix override host handler

* Disable port field when override host enabled

* Design updates

* Fix llama settings configuration, design changes, clean up code

* Improve You.com coupon design

* Add new Phind model and help tooltip

* Fetch you.com subscription

* Add CodeBooga model, fix downloadable model selection

* Chat history support

* Code refactoring, minor bug fixes

* UI updates, several bug fixes, removed code llama python model

* Code cleanup, enable llama port only on macOS

* Change downloaded gguf models path

* Move some of the labels to codegpt bundle

* Minor fixes

* Remove ToRA model, add help texts

* Fix test

* Modify description
carlrobertoh authored Nov 3, 2023
1 parent ca2eb9b commit 45908e6
Showing 71 changed files with 2,748 additions and 533 deletions.
3 changes: 3 additions & 0 deletions .gitmodules
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
[submodule "llama.cpp"]
path = src/main/cpp/llama.cpp
url = https://github.com/ggerganov/llama.cpp
66 changes: 54 additions & 12 deletions DESCRIPTION.md
Original file line number Diff line number Diff line change
@@ -1,24 +1,66 @@
<!-- Plugin description -->

**ChatGPT as your copilot to level up your developer experience.**
## Introducing CodeGPT: Your Free, Open-Source AI Copilot for Coding

This is the perfect assistant for any programmer who wants to improve their coding skills
and make more efficient use of the time.
CodeGPT is your go-to AI assistant, designed to enhance your coding skills and optimize your programming time.
Access state-of-the-art LLMs like GPT-4, Code Llama, and more, all for free.

## Getting Started
## Quick Start Guide

### Prerequisites
1. **Download the Plugin**: Get started by downloading the plugin from the [JetBrains Marketplace](https://plugins.jetbrains.com/plugin/21056-codegpt?preview=true).

In order to use the extension, you need to have the API key configured. You can find the API key in
your [User settings](https://platform.openai.com/account/api-keys).
2. **Choose Your Preferred Service**:

### API Key Configuration
a) **OpenAI** - Requires authentication via OpenAI API key.

After the plugin has been successfully installed, the API key needs to be configured.
b) **Azure** - Requires authentication via Active Directory or API key.

You can configure the key by going to the plugin's settings via the **File | Settings/Preferences | Tools | CodeGPT**. On the settings panel simply
click
on the API key field, paste the key obtained from the OpenAI website and click **Apply/OK**.
c) **You.com** - A free, web-connected service with an optional upgrade to You⚡Pro for enhanced features.

d) **LLaMA C/C++ Port** - Run Code Llama, WizardCoder, and other state-of-the-art models locally for free.

3. **Start Using the Features**: You're all set! Start exploring the features of our plugin.

### OpenAI

After successful installation, configure your API key. Navigate to the plugin's settings via **File | Settings/Preferences | Tools | CodeGPT**. Paste your OpenAI API key into the field and click `Apply/OK`.

### Azure

For Azure OpenAI services, you'll need to input three additional fields:
* `Resource name`: The name of your Azure OpenAI Cognitive Services.
* `Deployment ID`: The name of your Deployment.
* `API version`: The most recent non-preview version.

Also, input one of the two provided API keys.
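To see how these three fields fit together, here is a minimal sketch of how an Azure OpenAI chat-completions request URL is typically assembled from the resource name, deployment ID, and API version. This is an illustration based on Azure's documented URL pattern, not the plugin's actual client code; the class and method names are made up for the example.

```java
// Illustrative only: shows how the three Azure settings combine into a
// request URL following Azure OpenAI's documented endpoint pattern.
public class AzureEndpointExample {

    // resourceName, deploymentId, and apiVersion correspond to the three
    // settings fields described above.
    static String buildUrl(String resourceName, String deploymentId, String apiVersion) {
        return String.format(
            "https://%s.openai.azure.com/openai/deployments/%s/chat/completions?api-version=%s",
            resourceName, deploymentId, apiVersion);
    }

    public static void main(String[] args) {
        System.out.println(buildUrl("my-resource", "gpt-4-deployment", "2023-05-15"));
    }
}
```

The API key (or Active Directory token) is then sent as a request header alongside this URL.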

### You.com (Free)

**You**.com is a search engine that summarizes the best parts of the internet for **you**, with private ads and privacy options.

**You⚡Pro**

Use the **CodeGPT** coupon for a free month of unlimited GPT-4 usage.

Check out the full [feature list](https://about.you.com/hc/youpro/what-features-are-included-in-youpro/) for more details.

### LLaMA C/C++ Port (Free, Local)

> **Note**: This feature is currently supported only on Linux and macOS.

The main goal of `llama.cpp` is to run the LLaMA model using 4-bit integer quantization on a MacBook.

#### Getting Started

1. **Select the Model**: Depending on your hardware capabilities, choose the appropriate model from the provided list. Once selected, click on the `Download Model` link. A progress bar will appear, indicating the download process.

2. **Start the Server**: After successfully downloading the model, initiate the server by clicking on the `Start Server` button. A status message will be displayed, indicating that the server is starting up.

3. **Apply Settings**: With the server running, you can now apply the settings to start using the features. Click on the `Apply/OK` button to save your settings and start using the application.

<img alt="animated" style="max-width: 100%; width: 600px;" src="https://github.com/carlrobertoh/CodeGPT/raw/master/docs/assets/llama_settings.png" />

> **Note**: If you're already running a server and wish to configure the plugin against it, simply select the port and click `Apply/OK`.

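For context on what the plugin sends to that port: the llama.cpp server accepts completion requests as a small JSON body containing the prompt and a token limit (`n_predict`). The sketch below hand-rolls such a body for illustration; it is an assumption about the wire format, not the plugin's actual client, which builds requests through its LLM client library.

```java
// Sketch (assumed request shape, not plugin code): builds a JSON body like
// the one a llama.cpp server's completion endpoint accepts.
public class LlamaRequestExample {

    // A real client would use a JSON library and escape the prompt properly;
    // this minimal version is only for showing the payload structure.
    static String buildBody(String prompt, int nPredict) {
        return "{\"prompt\": \"" + prompt + "\", \"n_predict\": " + nPredict + "}";
    }

    public static void main(String[] args) {
        System.out.println(buildBody("Write a quicksort in Java.", 512));
    }
}
```

The `n_predict` value of 512 mirrors the default used by the plugin's completion request builder.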
## Features

Expand Down
15 changes: 15 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -116,6 +116,21 @@ Expected a different answer? Re-generate any response of your choosing.
- **Seamless conversations** - Chat with the AI regardless of the maximum token limitations
- **Predefined Actions** - Create your own editor actions or override the existing ones, saving time rewriting the same prompt repeatedly

### Running locally

**Linux or macOS**
```shell
git clone https://github.com/carlrobertoh/CodeGPT.git
cd CodeGPT
git submodule update --init
./gradlew runIde
```

**Windows ARM64**
```shell
./gradlew runIde -Penv=win-arm64
```

## Issues

See the [open issues][open-issues] for a full list of proposed features (and known issues).
Expand Down
40 changes: 38 additions & 2 deletions build.gradle.kts
Original file line number Diff line number Diff line change
@@ -1,8 +1,24 @@
import org.gradle.api.tasks.testing.logging.TestExceptionFormat
import org.jetbrains.changelog.Changelog
import org.jetbrains.changelog.markdownToHTML
import java.io.FileInputStream
import java.util.*

val env = environment("env").getOrNull()

fun loadProperties(filename: String): Properties = Properties().apply {
load(FileInputStream(filename))
}

fun properties(key: String): Provider<String> {
if ("win-arm64" == env) {
val property = loadProperties("gradle-win-arm64.properties").getProperty(key)
?: return providers.gradleProperty(key)
return providers.provider { property }
}
return providers.gradleProperty(key)
}

fun environment(key: String) = providers.environmentVariable(key)

plugins {
Expand Down Expand Up @@ -53,6 +69,17 @@ dependencies {
testRuntimeOnly("org.junit.vintage:junit-vintage-engine:5.10.0")
}

tasks.register<Exec>("updateSubmodules") {
workingDir(rootDir)
commandLine("git", "submodule", "update", "--init", "--recursive")
}

tasks.register<Copy>("copyLlamaSubmodule") {
dependsOn("updateSubmodules")
from(layout.projectDirectory.file("src/main/cpp/llama.cpp"))
into(layout.buildDirectory.dir("idea-sandbox/plugins/CodeGPT/llama.cpp"))
}

tasks {
wrapper {
gradleVersion = properties("gradleVersion").get()
Expand Down Expand Up @@ -98,13 +125,22 @@ tasks {
})
}

prepareSandbox {
enabled = true
dependsOn("copyLlamaSubmodule")
}

signPlugin {
enabled = true
certificateChain.set(System.getenv("CERTIFICATE_CHAIN"))
privateKey.set(System.getenv("PRIVATE_KEY"))
password.set(System.getenv("PRIVATE_KEY_PASSWORD"))
}

buildPlugin {
enabled = true
}

publishPlugin {
enabled = true
dependsOn("patchChangelog")
Expand All @@ -125,4 +161,4 @@ tasks {
showStandardStreams = true
}
}
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,7 @@ intellij {
}

dependencies {
implementation("ee.carlrobert:llm-client:0.0.6")
implementation("ee.carlrobert:llm-client:0.0.7")
}

tasks {
Expand Down
Binary file added docs/assets/llama_settings.png
2 changes: 2 additions & 0 deletions gradle-win-arm64.properties
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
platformVersion = 2023.1
javaVersion = 17
2 changes: 2 additions & 0 deletions gradle.properties
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,8 @@ org.gradle.configuration-cache = true
# Enable Gradle Build Cache -> https://docs.gradle.org/current/userguide/build_cache.html
org.gradle.caching = true

# org.gradle.logging.level=debug

# Enable Gradle Kotlin DSL Lazy Property Assignment -> https://docs.gradle.org/current/userguide/kotlin_dsl.html#kotdsl:assignment
systemProp.org.gradle.unsafe.kotlin.assignment = true

Expand Down
1 change: 1 addition & 0 deletions src/main/cpp/llama.cpp
Submodule llama.cpp added at b8fe4b
10 changes: 10 additions & 0 deletions src/main/java/ee/carlrobert/codegpt/CodeGPTPlugin.java
Original file line number Diff line number Diff line change
Expand Up @@ -6,8 +6,10 @@
import com.intellij.openapi.application.PathManager;
import com.intellij.openapi.extensions.PluginId;
import com.intellij.openapi.project.Project;
import ee.carlrobert.codegpt.telemetry.core.util.Directories;
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;
import org.jetbrains.annotations.NotNull;

public final class CodeGPTPlugin {
Expand All @@ -33,6 +35,14 @@ private CodeGPTPlugin() {
return getPluginOptionsPath() + File.separator + "indexes";
}

public static @NotNull String getLlamaSourcePath() {
return getPluginBasePath() + File.separator + "llama.cpp";
}

public static @NotNull String getLlamaModelsPath() {
return Paths.get(System.getProperty("user.home"), ".codegpt/models/gguf").toString();
}

public static @NotNull String getProjectIndexStorePath(@NotNull Project project) {
return getIndexStorePath() + File.separator + project.getName();
}
Expand Down
1 change: 1 addition & 0 deletions src/main/java/ee/carlrobert/codegpt/Icons.java
Original file line number Diff line number Diff line change
Expand Up @@ -11,4 +11,5 @@ public final class Icons {
public static final Icon OpenAIIcon = IconLoader.getIcon("/icons/openai.svg", Icons.class);
public static final Icon AzureIcon = IconLoader.getIcon("/icons/azure.svg", Icons.class);
public static final Icon YouIcon = IconLoader.getIcon("/icons/you.svg", Icons.class);
public static final Icon LlamaIcon = IconLoader.getIcon("/icons/llama.svg", Icons.class);
}
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,7 @@ public static String[][] toArray(Map<String, String> actionsMap) {
}

public static void refreshActions() {
AnAction actionGroup = ActionManager.getInstance().getAction("action.editor.group.EditorActionGroup");
AnAction actionGroup = ActionManager.getInstance().getAction("project.label");
if (actionGroup instanceof DefaultActionGroup) {
DefaultActionGroup group = (DefaultActionGroup) actionGroup;
group.removeAll();
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -5,11 +5,14 @@
import ee.carlrobert.codegpt.credentials.OpenAICredentialsManager;
import ee.carlrobert.codegpt.settings.advanced.AdvancedSettingsState;
import ee.carlrobert.codegpt.settings.state.AzureSettingsState;
import ee.carlrobert.codegpt.settings.state.LlamaSettingsState;
import ee.carlrobert.codegpt.settings.state.OpenAISettingsState;
import ee.carlrobert.codegpt.settings.state.YouSettingsState;
import ee.carlrobert.llm.client.Client;
import ee.carlrobert.llm.client.ProxyAuthenticator;
import ee.carlrobert.llm.client.azure.AzureClient;
import ee.carlrobert.llm.client.azure.AzureCompletionRequestParams;
import ee.carlrobert.llm.client.llama.LlamaClient;
import ee.carlrobert.llm.client.openai.OpenAIClient;
import ee.carlrobert.llm.client.you.UTMParameters;
import ee.carlrobert.llm.client.you.YouClient;
Expand All @@ -33,8 +36,16 @@ public static YouClient getYouClient(String sessionId, String accessToken) {
utmParameters.setMedium("jetbrains");
utmParameters.setCampaign(CodeGPTPlugin.getVersion());
utmParameters.setContent("CodeGPT");
return new YouClient.Builder(sessionId, accessToken)
// FIXME
return (YouClient) new YouClient.Builder(sessionId, accessToken)
.setUTMParameters(utmParameters)
.setHost(YouSettingsState.getInstance().getBaseHost())
.build();
}

public static LlamaClient getLlamaClient() {
return new LlamaClient.Builder()
.setPort(LlamaSettingsState.getInstance().getServerPort())
.build();
}

Expand Down Expand Up @@ -65,10 +76,9 @@ private static Client.Builder addDefaultClientParams(Client.Builder builder) {
builder.setProxy(
new Proxy(advancedSettings.getProxyType(), new InetSocketAddress(proxyHost, proxyPort)));
if (advancedSettings.isProxyAuthSelected()) {
builder.setProxyAuthenticator(
new ProxyAuthenticator(
advancedSettings.getProxyUsername(),
advancedSettings.getProxyPassword()));
builder.setProxyAuthenticator(new ProxyAuthenticator(
advancedSettings.getProxyUsername(),
advancedSettings.getProxyPassword()));
}
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -80,6 +80,11 @@ private EventSource startCall(
var requestProvider = new CompletionRequestProvider(conversation);

try {
if (settings.isUseLlamaService()) {
return CompletionClientProvider.getLlamaClient()
.getChatCompletion(requestProvider.buildLlamaCompletionRequest(message), eventListener);
}

if (settings.isUseYouService()) {
var sessionId = "";
var accessToken = "";
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2,19 +2,23 @@

import static java.util.stream.Collectors.toList;

import com.intellij.openapi.application.ApplicationManager;
import com.intellij.openapi.diagnostic.Logger;
import ee.carlrobert.codegpt.CodeGPTPlugin;
import ee.carlrobert.codegpt.EncodingManager;
import ee.carlrobert.codegpt.completions.llama.LlamaModel;
import ee.carlrobert.codegpt.conversations.Conversation;
import ee.carlrobert.codegpt.conversations.ConversationsState;
import ee.carlrobert.codegpt.conversations.message.Message;
import ee.carlrobert.codegpt.settings.configuration.ConfigurationState;
import ee.carlrobert.codegpt.settings.state.LlamaSettingsState;
import ee.carlrobert.codegpt.settings.state.SettingsState;
import ee.carlrobert.codegpt.settings.state.YouSettingsState;
import ee.carlrobert.codegpt.telemetry.core.configuration.TelemetryConfiguration;
import ee.carlrobert.codegpt.telemetry.core.service.TelemetryService;
import ee.carlrobert.codegpt.telemetry.core.service.UserId;
import ee.carlrobert.codegpt.util.ApplicationUtils;
import ee.carlrobert.embedding.EmbeddingsService;
import ee.carlrobert.llm.client.llama.completion.LlamaCompletionRequest;
import ee.carlrobert.llm.client.openai.completion.chat.OpenAIChatCompletionModel;
import ee.carlrobert.llm.client.openai.completion.chat.request.OpenAIChatCompletionMessage;
import ee.carlrobert.llm.client.openai.completion.chat.request.OpenAIChatCompletionRequest;
Expand All @@ -36,17 +40,22 @@ public class CompletionRequestProvider {
"Follow the user's requirements carefully & to the letter.\n" +
"Your responses should be informative and logical.\n" +
"You should always adhere to technical information.\n" +
"If the user asks for code or technical questions, you must provide code suggestions and adhere to technical information.\n" +
"If the question is related to a developer, CodeGPT must respond with content related to a developer.\n" +
"First think step-by-step - describe your plan for what to build in pseudocode, written out in great detail.\n" +
"If the user asks for code or technical questions, you must provide code suggestions and " +
"adhere to technical information.\n" +
"If the question is related to a developer, CodeGPT must respond with " +
"content related to a developer.\n" +
"First think step-by-step - describe your plan for what to build in pseudocode, " +
"written out in great detail.\n" +
"Then output the code in a single code block.\n" +
"Minimize any other prose.\n" +
"Keep your answers short and impersonal.\n" +
"Use Markdown formatting in your answers.\n" +
"Make sure to include the programming language name at the start of the Markdown code blocks.\n" +
"Make sure to include the programming language name at the start of the " +
"Markdown code blocks.\n" +
"Avoid wrapping the whole response in triple backticks.\n" +
"The user works in an IDE built by JetBrains which has a concept for editors with open files, integrated unit test support, " +
"and output pane that shows the output of running the code as well as an integrated terminal.\n" +
"The user works in an IDE built by JetBrains which has a concept for editors " +
"with open files, integrated unit test support, and output pane that shows " +
"the output of running the code as well as an integrated terminal.\n" +
"You can only give one reply for each conversation turn.";

private final EncodingManager encodingManager = EncodingManager.getInstance();
Expand All @@ -60,6 +69,20 @@ public CompletionRequestProvider(Conversation conversation) {
this.conversation = conversation;
}

public LlamaCompletionRequest buildLlamaCompletionRequest(Message message) {
var settings = LlamaSettingsState.getInstance();
var promptTemplate = settings.isUseCustomModel() ?
settings.getPromptTemplate() :
LlamaModel.findByHuggingFaceModel(settings.getHuggingFaceModel()).getPromptTemplate();
var prompt = promptTemplate.buildPrompt(
COMPLETION_SYSTEM_PROMPT,
message.getPrompt(),
conversation.getMessages());
return new LlamaCompletionRequest.Builder(prompt)
.setN_predict(512)
.build();
}

public YouCompletionRequest buildYouCompletionRequest(Message message) {
var requestBuilder = new YouCompletionRequest.Builder(message.getPrompt())
.setUseGPT4Model(YouSettingsState.getInstance().isUseGPT4Model())
Expand All @@ -68,7 +91,8 @@ public YouCompletionRequest buildYouCompletionRequest(Message message) {
prevMessage.getPrompt(),
prevMessage.getResponse()))
.collect(toList()));
if (TelemetryConfiguration.getInstance().isEnabled()) {
if (TelemetryConfiguration.getInstance().isEnabled() &&
!ApplicationManager.getApplication().isUnitTestMode()) {
requestBuilder.setUserId(UUID.fromString(UserId.INSTANCE.get()));
}
return requestBuilder.build();
Expand Down
