🛠️ RepairAgent

RepairAgent is an autonomous LLM-based agent designed for automated program repair. For a comprehensive understanding of its workings and development, you can check out our research paper here.


📋 I. Requirements

Before you start using RepairAgent, ensure that your system meets the following requirements:

  • Docker: Version 20.04 or higher. For installation instructions, see the Docker documentation.
  • VS Code: Not a hard requirement but highly recommended. VS Code provides an easy way to interact with RepairAgent using DevContainers (see the instructions below).
  • OpenAI Token and Credits:
    • Create an account on the OpenAI website and purchase credits to use the API.
    • Generate an API token on the same website.
  • Disk Space:
    • At least 40GB of available disk space on your machine. The code itself is small, but the dependencies can take up to 8GB, and the files generated while running on different bug instances add more, so 40GB is a safe estimate.
    • If you use VS Code DevContainers, you avoid pulling the heavy Docker image (~22GB).
  • Internet Access: Required while running RepairAgent to connect to OpenAI's API.
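
The disk-space and Docker requirements above can be checked before starting. The following is a minimal sketch using only the Python standard library; the function names `free_disk_gb` and `preflight` are illustrative and not part of RepairAgent, and the 40GB threshold is simply the safe estimate quoted above.

```python
import shutil

REQUIRED_FREE_GB = 40  # the "safe estimate" from the requirements list


def free_disk_gb(path="."):
    """Return the free space, in decimal GB, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / 1e9


def preflight(path=".", required_gb=REQUIRED_FREE_GB):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if shutil.which("docker") is None:
        problems.append("docker executable not found on PATH")
    free_gb = free_disk_gb(path)
    if free_gb < required_gb:
        problems.append(f"only {free_gb:.1f} GB free, {required_gb} GB recommended")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("WARNING:", problem)
```

The check does not verify the Docker version or the OpenAI credentials; it only flags the two requirements that are cheap to test locally.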

⚙️ II. How to Use RepairAgent?

You have two ways to use RepairAgent:

  1. Start a VS Code DevContainer: The easiest method, as it avoids pulling the large Docker image.
  2. Use the Docker Image: Suitable for users familiar with Docker.

🚀 Option 1: Using a VS Code DevContainer

STEP 1: Open RepairAgent in a DevContainer

  1. Ensure you have the Dev Containers extension installed in VS Code. You can install it from the Visual Studio Code Marketplace.

  2. Clone the RepairAgent repository:

    git clone https://github.com/sola-st/RepairAgent.git
    cd RepairAgent
    cd repair_agent
    rm -rf defects4j
    git clone https://github.com/rjust/defects4j.git
    cd ../..
  3. Open the repository folder in VS Code.

  4. When prompted by VS Code to "Reopen in Container," click it. If not prompted, open the Command Palette (Ctrl+Shift+P) and select "Dev Containers: Reopen in Container." VS Code will now build and start the DevContainer, setting up the environment for you.

  5. In the VS Code terminal, move to the repair_agent folder:

    cd repair_agent

STEP 2: Set the OpenAI API Key

Inside the DevContainer terminal, configure your OpenAI API key by running:

python3.10 set_api_key.py

The script will prompt you to paste your API token.

STEP 3: Start RepairAgent

By default, RepairAgent is configured to run on Defects4J bugs. To specify which bugs to run on:

  1. Create a text file named, for example, bugs_list. A sample file exists in the repository at experimental_setups/bugs_list.

  2. Run the following command:

    ./run_on_defects4j.sh experimental_setups/bugs_list hyperparams.json

You can open the hyperparams.json file to review or customize its parameters (explained further in the customization section).

If you chose this option, you can jump to section 4.1 What Happens When You Start RepairAgent? for more details on the results of running RepairAgent.


🚀 Option 2: Using the Docker Image

STEP 1: Pull the Docker Image

Run the following commands in your terminal to retrieve and start our Docker image:

# Pull the image from DockerHub
docker pull islemdockerdev/repair-agent:v1

# Run the image inside a container
docker run -itd --name apr-agent islemdockerdev/repair-agent:v1

# Start the container
docker start -i apr-agent

STEP 2: Attach the Container to VS Code

  • After starting the container, open VS Code and navigate to the Containers icon on the left panel. Ensure you have the Remote Explorer extension installed.
  • Under the Dev Containers tab, find the name of the container you just started (e.g., apr-agent).
  • Attach the container to a new window by clicking the "+" sign to the right of the container name, then navigate to the workdir folder in the VS Code window (the workdir is /app/AutoGPT).
  • Tutorial Reference: For detailed steps on attaching a Docker container in VS Code, check out this video tutorial (1 min 38 sec).

STEP 3: Set the OpenAI API Key

Inside the Docker container, configure your OpenAI API key by running:

python3.10 set_api_key.py

The script will prompt you to paste your API token.

STEP 4: Start RepairAgent

By default, RepairAgent is configured to run on Defects4J bugs. To specify which bugs to run on:

  1. Create a text file named, for example, bugs_list. A sample file exists in the repository and Docker image at experimental_setups/bugs_list.

  2. Run the following command:

    ./run_on_defects4j.sh experimental_setups/bugs_list hyperparams.json

You can open the hyperparams.json file to review or customize its parameters (explained further in the customization section).

4.1 What Happens When You Start RepairAgent?

  • RepairAgent checks out the project with the given bug ID.
  • It initiates the autonomous repair process.
  • Logs detailing each step performed will be displayed in your terminal.

4.2 Retrieve Repair Logs and History

RepairAgent saves the output in multiple files.

  • The primary logs are located in the folder experimental_setups/experiment_X, where X increments automatically with each run of run_on_defects4j.sh.

  • Within this folder, you may find several subfolders:

    • logs: Full chat history (prompts) and command outputs (one file per bug).
    • plausible_patches: Any plausible patches generated (one file per bug).
    • mutations_history: Suggested fixes derived by mutating prior suggestions (one file per bug).
    • responses: Responses from the agent (LLM) at each cycle (one file per bug).
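
To get a quick overview of what a run produced, the folder layout above can be inspected programmatically. This is a small sketch that assumes only the four subfolder names listed here; `summarize_experiment` is a hypothetical helper, and `experiment_1` is a placeholder for whichever folder your run created.

```python
from pathlib import Path

# Subfolder names as listed in this section.
SUBFOLDERS = ("logs", "plausible_patches", "mutations_history", "responses")


def summarize_experiment(experiment_dir):
    """Count the files in each known subfolder of one experiment folder.

    Missing subfolders are reported as 0, since a run may not produce all of them.
    """
    root = Path(experiment_dir)
    counts = {}
    for name in SUBFOLDERS:
        sub = root / name
        counts[name] = sum(1 for p in sub.iterdir() if p.is_file()) if sub.is_dir() else 0
    return counts


if __name__ == "__main__":
    # "experiment_1" is a placeholder; substitute the folder created by your run.
    print(summarize_experiment("experimental_setups/experiment_1"))
```

Because each subfolder holds one file per bug, the counts give a rough idea of how many bugs produced logs, responses, or plausible patches.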

4.3 Analyze Logs

Within the experimental_setups folder, several scripts are available to post-process the logs:

  • Collect Plausible Patches: Use the script collect_plausible_patches_files.py to gather the generated plausible patches across multiple experiments:

    python3.10 collect_plausible_patches_files.py 1 10

    A plausible patch is one that passes all test cases and is therefore a candidate for being the correct patch.

  • Get Fully Executed Runs: Use get_list_of_fully_executed.py to retrieve the runs that reached at least 38 out of 40 cycles. Runs that fall short of this threshold either terminated unexpectedly or called the exit function prematurely.

  • Analyze Experiment Results: Use analyze_experiment_results.py to produce a summary of all experiments executed so far. For each experiment, it generates a text file that lists all suggested patches per bug, followed by a table with the bug ID, the number of cycles, the number of suggested patches, and the number of plausible patches.

    python3.10 analyze_experiment_results.py

    An example of the output file would look like this:

    Experiment Results: experiment_60
    
    Number of Bugs: 2
    Correctly fixed bugs: 1
    Total Suggested Fixes: 4
    
    The list of suggested fixes:
    Cli_8
    
    ###Fix:
    Lines:['812', '813', '814', '815', '816', '817', '818', '819', '820'] from file /workspace/Auto-GPT/auto_gpt_workspace/cli_8_buggy/src/java/org/apache/commons/cli/HelpFormatter.java were replaced with the following:
    {'812': 'pos = findWrapPos(text, width, 0);', '813': 'if (pos == -1) { sb.append(rtrim(text)); return sb; }', '814': 'sb.append(rtrim(text.substring(0, pos))).append(defaultNewLine);', '815': 'final String padding = createPadding(nextLineTabStop);', '816': 'while (true) {', '817': 'text = padding + text.substring(pos).trim();', '818': 'pos = findWrapPos(text, width, nextLineTabStop);', '819': 'if (pos == -1) { sb.append(text); return sb; }', '820': 'sb.append(rtrim(text.substring(0, pos))).append(defaultNewLine);'}
    
    ###Fix:
    Lines:['812', '813', '814', '815', '816', '817', '818', '819', '820'] from file /workspace/Auto-GPT/auto_gpt_workspace/cli_8_buggy/src/java/org/apache/commons/cli/HelpFormatter.java were replaced with the following:
    {'812': 'pos = findWrapPos(text, width, nextLineTabStop);', '813': 'if (pos == -1) { sb.append(rtrim(text)); return sb; }', '814': 'sb.append(rtrim(text.substring(0, pos))).append(defaultNewLine);', '815': 'final String padding = createPadding(nextLineTabStop);', '816': 'while (true) {', '817': 'text = padding + text.substring(pos).trim();', '818': 'pos = findWrapPos(text, width, nextLineTabStop);', '819': 'if (pos == -1) { sb.append(text); return sb; }', '820': 'sb.append(rtrim(text.substring(0, pos))).append(defaultNewLine);'}
    
    ###Fix:
    Lines:['812', '813', '814', '815', '816', '817', '818', '819', '820'] from file /workspace/Auto-GPT/auto_gpt_workspace/cli_8_buggy/src/java/org/apache/commons/cli/HelpFormatter.java were replaced with the following:
    {'812': 'pos = findWrapPos(text, width, nextLineTabStop);', '813': 'if (pos == -1) { sb.append(rtrim(text)); return sb; }', '814': 'sb.append(rtrim(text.substring(0, pos))).append(defaultNewLine);', '815': 'final String padding = createPadding(nextLineTabStop);', '816': 'while (true) {', '817': 'text = padding + text.substring(pos).trim();', '818': 'pos = findWrapPos(text, width, nextLineTabStop);', '819': 'if (pos == -1) { sb.append(text); return sb; }', '820': 'sb.append(rtrim(text.substring(0, pos))).append(defaultNewLine);'}
    
    Chart_1
    
    ###Fix:
    Lines:['1797'] from file org/jfree/chart/renderer/category/AbstractCategoryItemRenderer.java were replaced with the following:
    {'1797': 'if (dataset == null) {'}
    
    +----------+-----------------+-----------------+-------------------+
    | Log File | Correctly Fixed | Suggested Fixes | Number of Queries |
    +----------+-----------------+-----------------+-------------------+
    | Cli_8    |        No       |        3        |         32        |
    | Chart_1  |       Yes       |        1        |         10        |
    +----------+-----------------+-----------------+-------------------+
    

✨ III. Customize RepairAgent

1. Modify hyperparams.json

  • Budget Control Strategy: Defines how the agent views the remaining cycles, suggested fixes, and minimum required fixes:

    • FULL-TRACK: Puts the maximum, consumed, and remaining budget in the prompt (default for our experiments).
    • NO-TRACK: Suppresses budget information.
    • FORCED: Experimental and buggy; avoid using it (we did not use this option).

    Example Configuration:

    "budget_control": {
        "name": "FULL-TRACK",
        "params": {
            "#fixes": 4 // The agent should suggest at least 4 patches within the given budget; the number is updated based on the agent's progress (4 is the default).
        }
    }
  • Repetition Handling: Default settings restrict repetitions.

    "repetition_handling": "RESTRICT",
  • Command Limit: Controls the maximum allowed cycles (budget).

    "commands_limit": 40 // default for our experiment
  • Request External Fixes: Experimental feature that allows requesting fixes from another LLM.

    "external_fix_strategy": 0, // default for our experiment

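The fragments above are annotated with // comments, which strict JSON parsers reject; whether RepairAgent's own loader tolerates them is not stated here. As a hedged sketch, the same settings can be assembled in Python and serialized to comment-free JSON. Only the keys shown in this section are included, the real hyperparams.json may contain others, and `validate` is a hypothetical helper.

```python
import json

# Keys and values as shown in this section; the repository's hyperparams.json
# may contain further keys that are not covered here.
hyperparams = {
    "budget_control": {
        "name": "FULL-TRACK",      # or "NO-TRACK"
        "params": {"#fixes": 4},   # minimum number of patches to suggest
    },
    "repetition_handling": "RESTRICT",
    "commands_limit": 40,          # maximum number of cycles
    "external_fix_strategy": 0,
}


def validate(params):
    """Light sanity checks based only on the options described above."""
    name = params["budget_control"]["name"]
    if name not in {"FULL-TRACK", "NO-TRACK", "FORCED"}:
        raise ValueError(f"unknown budget control strategy: {name}")
    if params["budget_control"]["params"]["#fixes"] < 1:
        raise ValueError("#fixes must be at least 1")
    if params["commands_limit"] < 1:
        raise ValueError("commands_limit must be at least 1")


validate(hyperparams)
# json.dumps emits strict JSON, so no explanatory comments end up in the output.
print(json.dumps(hyperparams, indent=4))
```

Redirect the printed output to a file of your choice if you want to pass it to run_on_defects4j.sh in place of the default configuration.
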
2. Switch Between GPT-3.5 and GPT-4

In the run_on_defects4j.sh file, locate the line:

./run.sh --ai-settings ai_settings.yaml --gpt3only -c -l 40 -m json_file --experiment-file "$2"
  • The --gpt3only flag enforces GPT-3.5 usage. Removing this flag switches RepairAgent to GPT-4.
  • Search the codebase for "gpt-3" and "gpt-4" to update version names accordingly.
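
For illustration, removing the flag amounts to dropping one token from the command line quoted above. The sketch below operates on that line as a plain string; `drop_flag` is a hypothetical helper, and in practice editing run_on_defects4j.sh by hand is just as easy.

```python
# The command line as quoted from run_on_defects4j.sh in this section.
ORIGINAL = './run.sh --ai-settings ai_settings.yaml --gpt3only -c -l 40 -m json_file --experiment-file "$2"'


def drop_flag(command, flag="--gpt3only"):
    """Return `command` without the stand-alone `flag` token."""
    # Splitting on single spaces leaves shell quoting such as "$2" untouched.
    tokens = command.split(" ")
    return " ".join(t for t in tokens if t != flag)


print(drop_flag(ORIGINAL))
# ./run.sh --ai-settings ai_settings.yaml -c -l 40 -m json_file --experiment-file "$2"
```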

📊 IV. Our Data

In our experiments, we utilized RepairAgent on the Defects4J dataset, successfully fixing 164 bugs. You can check our data under the folder data.

  • The list of fixed bugs is here. It allows comparison with prior and future work.

    • For example, we compare to ChatRepair, SelfAPR, and ITER. The Venn diagram in Figure 6 is produced using the command:
      python3.10 draw_venn_chatrepair_clean.py
    • The file d4j12.csv contains the list of bugs fixed by previous work. The script draw_venn_chatrepair_clean.py contains the list of fixes that we compare to.
  • The implementation details of the patches are in this file.

  • The folder data/root_patches contains the patches produced by RepairAgent in the main phase.

  • The folder data/derivated_pathces contains the patches obtained by mutating root_patches.
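
Comparing lists of fixed bugs across tools comes down to set arithmetic, which is what a Venn diagram visualizes. The sketch below shows that arithmetic only; it is not the logic of draw_venn_chatrepair_clean.py, `compare_fixed_bugs` is a hypothetical helper, and the bug IDs and tool names are placeholders rather than results from the paper.

```python
def compare_fixed_bugs(ours, others):
    """Split bug IDs into those fixed only by us, by both, and only by the other tools.

    `ours` is an iterable of bug IDs; `others` maps a tool name to its bug IDs.
    """
    ours = set(ours)
    fixed_elsewhere = set().union(*others.values()) if others else set()
    return {
        "only_ours": sorted(ours - fixed_elsewhere),
        "shared": sorted(ours & fixed_elsewhere),
        "only_others": sorted(fixed_elsewhere - ours),
    }


# Placeholder IDs for illustration only; these are NOT results from the paper.
example = compare_fixed_bugs(
    ours={"Proj_1", "Proj_2", "Proj_3"},
    others={"ToolA": {"Proj_2", "Proj_4"}, "ToolB": {"Proj_3"}},
)
print(example)
# {'only_ours': ['Proj_1'], 'shared': ['Proj_2', 'Proj_3'], 'only_others': ['Proj_4']}
```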

Note: RepairAgent encountered exceptions due to Middleware errors in 29 bugs, which were not re-run.


🧫 V. Replicate Experiments

This part covers running RepairAgent on the full evaluation datasets to replicate our experiments. The process is the same as above; we just provide ready-to-use input files and instructions for replication.

Replicate Defects4J experiments

  1. Create the execution batches for Defects4J, which produces the lists of bugs to run on:

    python3.10 get_defects4j_list.py

    The result of this command can be found in experimental_setups/batches.

  2. Run RepairAgent on each of the batches (either one at a time or concurrently):

    ./run_on_defects4j.sh experimental_setups/batches/0 hyperparams.json
    # replace 0 with the desired batch number
  3. Refer to sections 4.2 Retrieve Repair Logs and History and 4.3 Analyze Logs on how to analyze logs and summarize the results of the experiments.

  4. Furthermore, you can adapt the script experimental_setups/generate_main_table.py to generate the main comparative table (Table III in the paper).

    • You can also use experimental_setups/draw_venn_chatrepair_clean.py to draw a Venn diagram comparing different techniques (Figure 6 of the paper).
  5. You can use the script experimental_setups/calculate_tokens.py to calculate the costs of the agent (used to generate Figure 9).

  6. You can use the script experimental_setups/collect_plausible_patches_files.py to get the list of plausible patches to inspect.
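
Running every batch by hand is tedious, so the per-batch command from step 2 can be generated in a loop. The following is a sketch under the assumption that the batches folder directly contains one plain file per batch; `batch_commands` is a hypothetical helper, and the script only prints the commands rather than executing them.

```python
from pathlib import Path


def batch_commands(batches_dir="experimental_setups/batches", hyperparams="hyperparams.json"):
    """Build one run_on_defects4j.sh command per batch file, sorted by file name."""
    batch_files = sorted(p for p in Path(batches_dir).iterdir() if p.is_file())
    return [["./run_on_defects4j.sh", str(p), hyperparams] for p in batch_files]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    # To run them sequentially, each list could be passed to subprocess.run(cmd, check=True).
    if Path("experimental_setups/batches").is_dir():
        for cmd in batch_commands():
            print(" ".join(cmd))
```

Note that sorting is lexicographic, so a batch named 10 is listed before a batch named 2; this does not matter if the batches are independent.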

Replicate GitBugsJava Experiment

GitBugsJava is another dataset for program repair evaluation.

  1. First, prepare the GitBugsJava VM. Since this dataset requires a heavy VM (at least 140 GB of disk), we could not include it in this artifact. For step-by-step instructions on preparing the VM, see: https://github.com/gitbugactions/gitbug-java

  2. Copy the repository of RepairAgent inside the VM.

  3. Run RepairAgent on the list of bugs by specifying the file experimental_setups/gitbuglist as the target file.

  4. Use the same analysis scripts as in the Defects4J replication above to analyze the results of the experiments.

💬 VI. Help Us Improve RepairAgent

If you use RepairAgent, we encourage you to report any issues, bugs, or documentation gaps. We are committed to addressing your concerns promptly.

You can raise an issue directly in this repository, or for any queries, feel free to email me.