
Commit

carlrobertoh#178 - Add support for running local LLMs via LLaMA C/C++ port (carlrobertoh#249)

* Initial implementation of integrating llama.cpp to run LLaMA models locally

* Move submodule

* Copy llama submodule to bundle

* Support for downloading models from IDE

* Code cleanup

* Store port field

* Replace service selection radio group with dropdown

* Add quantization support + other fixes

* Add option to override host

* Fix override host handler

* Disable port field when override host enabled

* Design updates

* Fix llama settings configuration, design changes, clean up code

* Improve You.com coupon design

* Add new Phind model and help tooltip

* Fetch you.com subscription

* Add CodeBooga model, fix downloadable model selection

* Chat history support

* Code refactoring, minor bug fixes

* UI updates, several bug fixes, removed code llama python model

* Code cleanup, enable llama port only on macOS

* Change downloaded gguf models path

* Move some of the labels to codegpt bundle

* Minor fixes

* Remove ToRA model, add help texts

* Fix test

* Modify description
carlrobertoh authored Nov 3, 2023
1 parent ca2eb9b commit 45908e6
Showing 71 changed files with 2,748 additions and 533 deletions.
3 changes: 3 additions & 0 deletions .gitmodules
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
[submodule "llama.cpp"]
path = src/main/cpp/llama.cpp
url = https://github.com/ggerganov/llama.cpp
66 changes: 54 additions & 12 deletions DESCRIPTION.md
Original file line number Diff line number Diff line change
@@ -1,24 +1,66 @@
<!-- Plugin description -->

**ChatGPT as your copilot to level up your developer experience.**
## Introducing CodeGPT: Your Free, Open-Source AI Copilot for Coding

This is the perfect assistant for any programmer who wants to improve their coding skills
and make more efficient use of the time.
CodeGPT is your go-to AI assistant, designed to enhance your coding skills and optimize your programming time.
Access state-of-the-art LLMs like GPT-4, Code Llama, and more, all for free.

## Getting Started
## Quick Start Guide

### Prerequisites
1. **Download the Plugin**: Get started by downloading the plugin from the [JetBrains Marketplace](https://plugins.jetbrains.com/plugin/21056-codegpt?preview=true).

In order to use the extension, you need to have the API key configured. You can find the API key in
your [User settings](https://platform.openai.com/account/api-keys).
2. **Choose Your Preferred Service**:

### API Key Configuration
a) **OpenAI** - Requires authentication via OpenAI API key.

After the plugin has been successfully installed, the API key needs to be configured.
b) **Azure** - Requires authentication via Active Directory or API key.

You can configure the key by going to the plugin's settings via the **File | Settings/Preferences | Tools | CodeGPT**. On the settings panel simply
click
on the API key field, paste the key obtained from the OpenAI website and click **Apply/OK**.
c) **You.com** - A free, web-connected service with an optional upgrade to You⚡Pro for enhanced features.

d) **LLaMA C/C++ Port** - Run Code Llama, WizardCoder, and other state-of-the-art models locally for free.

3. **Start Using the Features**: You're all set! Start exploring the features of our plugin.

### OpenAI

After successful installation, configure your API key. Navigate to the plugin's settings via **File | Settings/Preferences | Tools | CodeGPT**. Paste your OpenAI API key into the field and click `Apply/OK`.

### Azure

For Azure OpenAI services, you'll need to input three additional fields:
* `Resource name`: The name of your Azure OpenAI Cognitive Services.
* `Deployment ID`: The name of your Deployment.
* `API version`: The most recent non-preview version.

Also, input one of the two provided API keys.
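To see how these three fields fit together, here is a minimal sketch of how an Azure OpenAI chat-completions request URL is typically assembled from the resource name, deployment ID, and API version. This is an illustration based on Azure's documented URL pattern, not the plugin's actual client code; the class and method names are made up for the example.

```java
// Illustrative only: shows how the three Azure settings combine into a
// request URL following Azure OpenAI's documented endpoint pattern.
public class AzureEndpointExample {

    // resourceName, deploymentId, and apiVersion correspond to the three
    // settings fields described above.
    static String buildUrl(String resourceName, String deploymentId, String apiVersion) {
        return String.format(
            "https://%s.openai.azure.com/openai/deployments/%s/chat/completions?api-version=%s",
            resourceName, deploymentId, apiVersion);
    }

    public static void main(String[] args) {
        System.out.println(buildUrl("my-resource", "gpt-4-deployment", "2023-05-15"));
    }
}
```

The API key (or Active Directory token) is then sent as a request header alongside this URL.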

### You.com (Free)

**You**.com is a search engine that summarizes the best parts of the internet for **you**, with private ads and privacy options.

**You⚡Pro**

Use the **CodeGPT** coupon for a free month of unlimited GPT-4 usage.

Check out the full [feature list](https://about.you.com/hc/youpro/what-features-are-included-in-youpro/) for more details.

### LLaMA C/C++ Port (Free, Local)

> **Note**: This feature is currently supported only on Linux and macOS.

The main goal of `llama.cpp` is to run the LLaMA model using 4-bit integer quantization on a MacBook.

#### Getting Started

1. **Select the Model**: Depending on your hardware capabilities, choose the appropriate model from the provided list. Once selected, click on the `Download Model` link. A progress bar will appear, indicating the download process.

2. **Start the Server**: After successfully downloading the model, initiate the server by clicking on the `Start Server` button. A status message will be displayed, indicating that the server is starting up.

3. **Apply Settings**: With the server running, you can now apply the settings to start using the features. Click on the `Apply/OK` button to save your settings and start using the application.

<img alt="animated" style="max-width: 100%; width: 600px;" src="https://github.com/carlrobertoh/CodeGPT/raw/master/docs/assets/llama_settings.png" />

> **Note**: If you're already running a server and wish to configure the plugin against it, simply select the port and click `Apply/OK`.

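For context on what the plugin sends to that port: the llama.cpp server accepts completion requests as a small JSON body containing the prompt and a token limit (`n_predict`). The sketch below hand-rolls such a body for illustration; it is an assumption about the wire format, not the plugin's actual client, which builds requests through its LLM client library.

```java
// Sketch (assumed request shape, not plugin code): builds a JSON body like
// the one a llama.cpp server's completion endpoint accepts.
public class LlamaRequestExample {

    // A real client would use a JSON library and escape the prompt properly;
    // this minimal version is only for showing the payload structure.
    static String buildBody(String prompt, int nPredict) {
        return "{\"prompt\": \"" + prompt + "\", \"n_predict\": " + nPredict + "}";
    }

    public static void main(String[] args) {
        System.out.println(buildBody("Write a quicksort in Java.", 512));
    }
}
```

The `n_predict` value of 512 mirrors the default used by the plugin's completion request builder.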
## Features

Expand Down
15 changes: 15 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -116,6 +116,21 @@ Expected a different answer? Re-generate any response of your choosing.
- **Seamless conversations** - Chat with the AI regardless of the maximum token limitations
- **Predefined Actions** - Create your own editor actions or override the existing ones, saving time rewriting the same prompt repeatedly

### Running locally

**Linux or macOS**
```shell
git clone https://github.com/carlrobertoh/CodeGPT.git
cd CodeGPT
git submodule update --init
./gradlew runIde
```

**Windows ARM64**
```shell
./gradlew runIde -Penv=win-arm64
```

## Issues

See the [open issues][open-issues] for a full list of proposed features (and known issues).
Expand Down
40 changes: 38 additions & 2 deletions build.gradle.kts
Original file line number Diff line number Diff line change
@@ -1,8 +1,24 @@
import org.gradle.api.tasks.testing.logging.TestExceptionFormat
import org.jetbrains.changelog.Changelog
import org.jetbrains.changelog.markdownToHTML
import java.io.FileInputStream
import java.util.*

val env = environment("env").getOrNull()

fun loadProperties(filename: String): Properties = Properties().apply {
load(FileInputStream(filename))
}

fun properties(key: String): Provider<String> {
if ("win-arm64" == env) {
val property = loadProperties("gradle-win-arm64.properties").getProperty(key)
?: return providers.gradleProperty(key)
return providers.provider { property }
}
return providers.gradleProperty(key)
}

fun environment(key: String) = providers.environmentVariable(key)

plugins {
Expand Down Expand Up @@ -53,6 +69,17 @@ dependencies {
testRuntimeOnly("org.junit.vintage:junit-vintage-engine:5.10.0")
}

tasks.register<Exec>("updateSubmodules") {
workingDir(rootDir)
commandLine("git", "submodule", "update", "--init", "--recursive")
}

tasks.register<Copy>("copyLlamaSubmodule") {
dependsOn("updateSubmodules")
from(layout.projectDirectory.file("src/main/cpp/llama.cpp"))
into(layout.buildDirectory.dir("idea-sandbox/plugins/CodeGPT/llama.cpp"))
}

tasks {
wrapper {
gradleVersion = properties("gradleVersion").get()
Expand Down Expand Up @@ -98,13 +125,22 @@ tasks {
})
}

prepareSandbox {
enabled = true
dependsOn("copyLlamaSubmodule")
}

signPlugin {
enabled = true
certificateChain.set(System.getenv("CERTIFICATE_CHAIN"))
privateKey.set(System.getenv("PRIVATE_KEY"))
password.set(System.getenv("PRIVATE_KEY_PASSWORD"))
}

buildPlugin {
enabled = true
}

publishPlugin {
enabled = true
dependsOn("patchChangelog")
Expand All @@ -125,4 +161,4 @@ tasks {
showStandardStreams = true
}
}
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,7 @@ intellij {
}

dependencies {
implementation("ee.carlrobert:llm-client:0.0.6")
implementation("ee.carlrobert:llm-client:0.0.7")
}

tasks {
Expand Down
Binary file added docs/assets/llama_settings.png
2 changes: 2 additions & 0 deletions gradle-win-arm64.properties
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
platformVersion = 2023.1
javaVersion = 17
2 changes: 2 additions & 0 deletions gradle.properties
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,8 @@ org.gradle.configuration-cache = true
# Enable Gradle Build Cache -> https://docs.gradle.org/current/userguide/build_cache.html
org.gradle.caching = true

# org.gradle.logging.level=debug

# Enable Gradle Kotlin DSL Lazy Property Assignment -> https://docs.gradle.org/current/userguide/kotlin_dsl.html#kotdsl:assignment
systemProp.org.gradle.unsafe.kotlin.assignment = true

Expand Down
1 change: 1 addition & 0 deletions src/main/cpp/llama.cpp
Submodule llama.cpp added at b8fe4b
10 changes: 10 additions & 0 deletions src/main/java/ee/carlrobert/codegpt/CodeGPTPlugin.java
Original file line number Diff line number Diff line change
Expand Up @@ -6,8 +6,10 @@
import com.intellij.openapi.application.PathManager;
import com.intellij.openapi.extensions.PluginId;
import com.intellij.openapi.project.Project;
import ee.carlrobert.codegpt.telemetry.core.util.Directories;
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;
import org.jetbrains.annotations.NotNull;

public final class CodeGPTPlugin {
Expand All @@ -33,6 +35,14 @@ private CodeGPTPlugin() {
return getPluginOptionsPath() + File.separator + "indexes";
}

public static @NotNull String getLlamaSourcePath() {
return getPluginBasePath() + File.separator + "llama.cpp";
}

public static @NotNull String getLlamaModelsPath() {
return Paths.get(System.getProperty("user.home"), ".codegpt/models/gguf").toString();
}

public static @NotNull String getProjectIndexStorePath(@NotNull Project project) {
return getIndexStorePath() + File.separator + project.getName();
}
Expand Down
1 change: 1 addition & 0 deletions src/main/java/ee/carlrobert/codegpt/Icons.java
Original file line number Diff line number Diff line change
Expand Up @@ -11,4 +11,5 @@ public final class Icons {
public static final Icon OpenAIIcon = IconLoader.getIcon("/icons/openai.svg", Icons.class);
public static final Icon AzureIcon = IconLoader.getIcon("/icons/azure.svg", Icons.class);
public static final Icon YouIcon = IconLoader.getIcon("/icons/you.svg", Icons.class);
public static final Icon LlamaIcon = IconLoader.getIcon("/icons/llama.svg", Icons.class);
}
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,7 @@ public static String[][] toArray(Map<String, String> actionsMap) {
}

public static void refreshActions() {
AnAction actionGroup = ActionManager.getInstance().getAction("action.editor.group.EditorActionGroup");
AnAction actionGroup = ActionManager.getInstance().getAction("project.label");
if (actionGroup instanceof DefaultActionGroup) {
DefaultActionGroup group = (DefaultActionGroup) actionGroup;
group.removeAll();
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -5,11 +5,14 @@
import ee.carlrobert.codegpt.credentials.OpenAICredentialsManager;
import ee.carlrobert.codegpt.settings.advanced.AdvancedSettingsState;
import ee.carlrobert.codegpt.settings.state.AzureSettingsState;
import ee.carlrobert.codegpt.settings.state.LlamaSettingsState;
import ee.carlrobert.codegpt.settings.state.OpenAISettingsState;
import ee.carlrobert.codegpt.settings.state.YouSettingsState;
import ee.carlrobert.llm.client.Client;
import ee.carlrobert.llm.client.ProxyAuthenticator;
import ee.carlrobert.llm.client.azure.AzureClient;
import ee.carlrobert.llm.client.azure.AzureCompletionRequestParams;
import ee.carlrobert.llm.client.llama.LlamaClient;
import ee.carlrobert.llm.client.openai.OpenAIClient;
import ee.carlrobert.llm.client.you.UTMParameters;
import ee.carlrobert.llm.client.you.YouClient;
Expand All @@ -33,8 +36,16 @@ public static YouClient getYouClient(String sessionId, String accessToken) {
utmParameters.setMedium("jetbrains");
utmParameters.setCampaign(CodeGPTPlugin.getVersion());
utmParameters.setContent("CodeGPT");
return new YouClient.Builder(sessionId, accessToken)
// FIXME
return (YouClient) new YouClient.Builder(sessionId, accessToken)
.setUTMParameters(utmParameters)
.setHost(YouSettingsState.getInstance().getBaseHost())
.build();
}

public static LlamaClient getLlamaClient() {
return new LlamaClient.Builder()
.setPort(LlamaSettingsState.getInstance().getServerPort())
.build();
}

Expand Down Expand Up @@ -65,10 +76,9 @@ private static Client.Builder addDefaultClientParams(Client.Builder builder) {
builder.setProxy(
new Proxy(advancedSettings.getProxyType(), new InetSocketAddress(proxyHost, proxyPort)));
if (advancedSettings.isProxyAuthSelected()) {
builder.setProxyAuthenticator(
new ProxyAuthenticator(
advancedSettings.getProxyUsername(),
advancedSettings.getProxyPassword()));
builder.setProxyAuthenticator(new ProxyAuthenticator(
advancedSettings.getProxyUsername(),
advancedSettings.getProxyPassword()));
}
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -80,6 +80,11 @@ private EventSource startCall(
var requestProvider = new CompletionRequestProvider(conversation);

try {
if (settings.isUseLlamaService()) {
return CompletionClientProvider.getLlamaClient()
.getChatCompletion(requestProvider.buildLlamaCompletionRequest(message), eventListener);
}

if (settings.isUseYouService()) {
var sessionId = "";
var accessToken = "";
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2,19 +2,23 @@

import static java.util.stream.Collectors.toList;

import com.intellij.openapi.application.ApplicationManager;
import com.intellij.openapi.diagnostic.Logger;
import ee.carlrobert.codegpt.CodeGPTPlugin;
import ee.carlrobert.codegpt.EncodingManager;
import ee.carlrobert.codegpt.completions.llama.LlamaModel;
import ee.carlrobert.codegpt.conversations.Conversation;
import ee.carlrobert.codegpt.conversations.ConversationsState;
import ee.carlrobert.codegpt.conversations.message.Message;
import ee.carlrobert.codegpt.settings.configuration.ConfigurationState;
import ee.carlrobert.codegpt.settings.state.LlamaSettingsState;
import ee.carlrobert.codegpt.settings.state.SettingsState;
import ee.carlrobert.codegpt.settings.state.YouSettingsState;
import ee.carlrobert.codegpt.telemetry.core.configuration.TelemetryConfiguration;
import ee.carlrobert.codegpt.telemetry.core.service.TelemetryService;
import ee.carlrobert.codegpt.telemetry.core.service.UserId;
import ee.carlrobert.codegpt.util.ApplicationUtils;
import ee.carlrobert.embedding.EmbeddingsService;
import ee.carlrobert.llm.client.llama.completion.LlamaCompletionRequest;
import ee.carlrobert.llm.client.openai.completion.chat.OpenAIChatCompletionModel;
import ee.carlrobert.llm.client.openai.completion.chat.request.OpenAIChatCompletionMessage;
import ee.carlrobert.llm.client.openai.completion.chat.request.OpenAIChatCompletionRequest;
Expand All @@ -36,17 +40,22 @@ public class CompletionRequestProvider {
"Follow the user's requirements carefully & to the letter.\n" +
"Your responses should be informative and logical.\n" +
"You should always adhere to technical information.\n" +
"If the user asks for code or technical questions, you must provide code suggestions and adhere to technical information.\n" +
"If the question is related to a developer, CodeGPT must respond with content related to a developer.\n" +
"First think step-by-step - describe your plan for what to build in pseudocode, written out in great detail.\n" +
"If the user asks for code or technical questions, you must provide code suggestions and " +
"adhere to technical information.\n" +
"If the question is related to a developer, CodeGPT must respond with " +
"content related to a developer.\n" +
"First think step-by-step - describe your plan for what to build in pseudocode, " +
"written out in great detail.\n" +
"Then output the code in a single code block.\n" +
"Minimize any other prose.\n" +
"Keep your answers short and impersonal.\n" +
"Use Markdown formatting in your answers.\n" +
"Make sure to include the programming language name at the start of the Markdown code blocks.\n" +
"Make sure to include the programming language name at the start of the " +
"Markdown code blocks.\n" +
"Avoid wrapping the whole response in triple backticks.\n" +
"The user works in an IDE built by JetBrains which has a concept for editors with open files, integrated unit test support, " +
"and output pane that shows the output of running the code as well as an integrated terminal.\n" +
"The user works in an IDE built by JetBrains which has a concept for editors " +
"with open files, integrated unit test support, and output pane that shows " +
"the output of running the code as well as an integrated terminal.\n" +
"You can only give one reply for each conversation turn.";

private final EncodingManager encodingManager = EncodingManager.getInstance();
Expand All @@ -60,6 +69,20 @@ public CompletionRequestProvider(Conversation conversation) {
this.conversation = conversation;
}

public LlamaCompletionRequest buildLlamaCompletionRequest(Message message) {
var settings = LlamaSettingsState.getInstance();
var promptTemplate = settings.isUseCustomModel() ?
settings.getPromptTemplate() :
LlamaModel.findByHuggingFaceModel(settings.getHuggingFaceModel()).getPromptTemplate();
var prompt = promptTemplate.buildPrompt(
COMPLETION_SYSTEM_PROMPT,
message.getPrompt(),
conversation.getMessages());
return new LlamaCompletionRequest.Builder(prompt)
.setN_predict(512)
.build();
}

public YouCompletionRequest buildYouCompletionRequest(Message message) {
var requestBuilder = new YouCompletionRequest.Builder(message.getPrompt())
.setUseGPT4Model(YouSettingsState.getInstance().isUseGPT4Model())
Expand All @@ -68,7 +91,8 @@ public YouCompletionRequest buildYouCompletionRequest(Message message) {
prevMessage.getPrompt(),
prevMessage.getResponse()))
.collect(toList()));
if (TelemetryConfiguration.getInstance().isEnabled()) {
if (TelemetryConfiguration.getInstance().isEnabled() &&
!ApplicationManager.getApplication().isUnitTestMode()) {
requestBuilder.setUserId(UUID.fromString(UserId.INSTANCE.get()));
}
return requestBuilder.build();
Expand Down
