Fix Agent prompt and infra (#804)
Fixing some issues revealed by full agent experiments earlier:
1. [x] LLM-generated build scripts do not save fuzz target binary into
the correct path.
2. [x] Use the default build script in the code-fixing prompt in this
scenario:
    1. The default build script builds successfully but fails other checks
(i.e., reference), and
    2. The LLM-generated build script does not work.
3. [x] Selectively use the default build script or the LLM-generated
build script, depending on which is better.
4. [x] Use different code-fixing prompts based on which build script was used
and what the result was:
    * default or LLM built script
    * No reference, no binary, or compilation failure
5. [x] Back up the human-written `/src/build.sh` to `/src/build.bk.sh` in
the agent's containers in case the LLM wants to reuse it in the new build
script.
    * Create the same copy for fuzzing execution.
6. [x] Hide the compile command to prevent the LLM from reusing it in the
inspection tool and being distracted by irrelevant errors. E.g.:
    * The inspection container always runs compile before LLM analysis.
Rerunning it may fail in some projects due to an existing
/src/<project>/build directory.
7. [x] Prompts use an example fuzz target in the same language as the
generated fuzz target (not the project).
    * Also dynamically adjust instructions in priming. Do not leave the LLM to
judge which language the fuzz target is in.
8. [x] Remove the agent log when receiving fuzz targets.
9. [x] Do not restrict LLM to send one bash command per query.
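
Items 3 and 4 can be pictured with a small sketch. The diff of `agent/prototyper.py` is not rendered on this page, so everything below is hypothetical: the names `BuildOutcome`, `pick_better`, and `fix_prompt_key`, the ranking order, and the tie-break in favour of the default script are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class BuildOutcome:
  """Hypothetical summary of one build attempt (not the project's BuildResult)."""
  script: str  # 'default' (/src/build.bk.sh) or 'llm' (LLM-generated script)
  compiles: bool
  binary_exists: bool
  references_function: bool

  def score(self) -> int:
    # Each later check only counts if the earlier ones passed.
    if not self.compiles:
      return 0
    if not self.binary_exists:
      return 1
    if not self.references_function:
      return 2
    return 3


def pick_better(default: BuildOutcome, llm: BuildOutcome) -> BuildOutcome:
  """Prefers the higher score; ties go to the default script (an assumption)."""
  return llm if llm.score() > default.score() else default


def fix_prompt_key(outcome: BuildOutcome) -> str:
  """Maps (script, failure kind) to a hypothetical code-fixing prompt name."""
  failure = ['compilation-failure', 'no-binary', 'no-reference',
             'success'][outcome.score()]
  return f'{outcome.script}:{failure}'
```

Under these assumptions, a default script that builds but misses the reference check beats an LLM script that fails to compile, and the fixer would be given the prompt keyed `default:no-reference`.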


Also need to:
1. [ ] Use SemanticAnalyzer in agent workflow, at least to ensure the
last Result is Analysis Result.
2. [ ] Add an Enhancer in agent workflow.
3. [ ] Use a service account in GKE; hopefully this will solve the
[`Service Unavailable`
problem](google/oss-fuzz#13042).
DonggeLiu authored Feb 23, 2025
1 parent 16bed89 commit 262dff0
Showing 13 changed files with 652 additions and 135 deletions.
335 changes: 232 additions & 103 deletions agent/prototyper.py

Large diffs are not rendered by default.

20 changes: 9 additions & 11 deletions experiment/builder_runner.py
@@ -922,17 +922,15 @@ def build_and_run_cloud(
         f'--real_project={project_name}',
     ]

-    # Temporarily comment out due to error in cached images.
-    # TODO(dongge): Add this back when the cached image works again.
-    # if oss_fuzz_checkout.ENABLE_CACHING and (
-    # oss_fuzz_checkout.is_image_cached(project_name, 'address') and
-    # oss_fuzz_checkout.is_image_cached(project_name, 'coverage')):
-    # logger.info('Using cached image for %s', project_name)
-    # command.append('--use_cached_image')
-
-    # # Overwrite the Dockerfile to be caching friendly
-    # oss_fuzz_checkout.rewrite_project_to_cached_project_chronos(
-    # generated_project)
+    if oss_fuzz_checkout.ENABLE_CACHING and (
+        oss_fuzz_checkout.is_image_cached(project_name, 'address') and
+        oss_fuzz_checkout.is_image_cached(project_name, 'coverage')):
+      logger.info('Using cached image for %s', project_name)
+      command.append('--use_cached_image')
+
+      # Overwrite the Dockerfile to be caching friendly
+      oss_fuzz_checkout.rewrite_project_to_cached_project_chronos(
+          generated_project)

     if cloud_build_tags:
       command += ['--tags'] + cloud_build_tags
2 changes: 2 additions & 0 deletions experiment/evaluator.py
@@ -306,6 +306,8 @@ def create_ossfuzz_project(self,
os.path.basename('agent-build.sh')))

     # Add additional statement in dockerfile to overwrite with generated fuzzer
+    with open(os.path.join(generated_project_path, 'Dockerfile'), 'a') as f:
+      f.write('\nRUN cp /src/build.sh /src/build.bk.sh\n')
     with open(os.path.join(generated_project_path, 'Dockerfile'), 'a') as f:
       f.write('\nCOPY agent-build.sh /src/build.sh\n')

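
The ordering of the two appended Dockerfile statements in `experiment/evaluator.py` matters: the `RUN cp` backup has to come before the `COPY` that overwrites `/src/build.sh`, otherwise `/src/build.bk.sh` would hold the agent's script rather than the human-written one. A minimal sketch of that ordering, using the two statement strings from the diff; the helper names `append_build_script_overrides` and `read_instructions` are hypothetical and not part of the repository:

```python
def append_build_script_overrides(dockerfile_path: str) -> None:
  """Appends the backup statement first, then the overwrite statement."""
  with open(dockerfile_path, 'a') as f:
    # Order matters: back up the human-written script before replacing it.
    f.write('\nRUN cp /src/build.sh /src/build.bk.sh\n')
    f.write('\nCOPY agent-build.sh /src/build.sh\n')


def read_instructions(dockerfile_path: str) -> list[str]:
  """Returns the non-empty lines of the Dockerfile, in order."""
  with open(dockerfile_path) as f:
    return [line.strip() for line in f if line.strip()]
```

The sketch uses a single `open` call for both writes; the committed code opens the file twice, which produces the same file contents.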
2 changes: 1 addition & 1 deletion experiment/oss_fuzz_checkout.py
@@ -70,7 +70,7 @@ def _clone_oss_fuzz_repo():
   """Clones OSS-Fuzz to |OSS_FUZZ_DIR|."""
   clone_command = [
       'git', 'clone', 'https://github.com/google/oss-fuzz', '--depth', '1',
-      '--branch', 'target-exp-log-account', OSS_FUZZ_DIR
+      OSS_FUZZ_DIR
   ]
   proc = sp.Popen(clone_command,
                   stdout=sp.PIPE,
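
This hunk drops the pinned `target-exp-log-account` branch, so the shallow clone now tracks the default branch of OSS-Fuzz. The sketch below shows one way to keep the branch optional instead of hard-coding it. The function names and the `branch` parameter are hypothetical; only the `git clone` arguments come from the diff.

```python
import subprocess as sp
from typing import Optional


def build_clone_command(dest: str, branch: Optional[str] = None) -> list[str]:
  """Builds a shallow `git clone` command for OSS-Fuzz.

  `branch` is a hypothetical parameter: None clones the default branch, which
  matches the command after this change.
  """
  command = [
      'git', 'clone', 'https://github.com/google/oss-fuzz', '--depth', '1'
  ]
  if branch:
    command += ['--branch', branch]
  command.append(dest)
  return command


def clone(dest: str, branch: Optional[str] = None) -> bool:
  """Runs the clone and reports whether git exited successfully."""
  proc = sp.run(build_clone_command(dest, branch),
                stdout=sp.PIPE,
                stderr=sp.PIPE,
                check=False)
  return proc.returncode == 0
```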
72 changes: 65 additions & 7 deletions llm_toolkit/prompt_builder.py
@@ -28,6 +28,7 @@
 from experiment.benchmark import Benchmark, FileType
 from experiment.fuzz_target_error import SemanticCheckResult
 from llm_toolkit import models, prompts
+from results import BuildResult

 logger = logging.getLogger(__name__)

@@ -546,15 +547,22 @@ class PrototyperTemplateBuilder(DefaultTemplateBuilder):
   def __init__(self,
                model: models.LLM,
                benchmark: Benchmark,
-               template_dir: str = DEFAULT_TEMPLATE_DIR):
-    super().__init__(model)
-    self._template_dir = template_dir
+               template_dir: str = DEFAULT_TEMPLATE_DIR,
+               initial: Any = None):
+    super().__init__(model, benchmark, template_dir, initial)
     self.agent_templare_dir = AGENT_TEMPLATE_DIR
-    self.benchmark = benchmark

     # Load templates.
-    self.priming_template_file = self._find_template(self.agent_templare_dir,
-                                                     'prototyper-priming.txt')
+    if benchmark.is_c_target:
+      self.priming_template_file = self._find_template(
+          self.agent_templare_dir, 'prototyper-priming.c.txt')
+    elif benchmark.is_cpp_target:
+      self.priming_template_file = self._find_template(
+          self.agent_templare_dir, 'prototyper-priming.cpp.txt')
+    else:
+      self.priming_template_file = self._find_template(
+          self.agent_templare_dir, 'prototyper-priming.txt')
+
     self.cpp_priming_filler_file = self._find_template(
         template_dir, 'cpp-specific-priming-filler.txt')
     self.problem_template_file = self._find_template(template_dir,
@@ -568,11 +576,13 @@ def build(self,
             example_pair: list[list[str]],
             project_example_content: Optional[list[list[str]]] = None,
             project_context_content: Optional[dict] = None,
-            tool_guides: str = '') -> prompts.Prompt:
+            tool_guides: str = '',
+            project_dir: str = '') -> prompts.Prompt:
     """Constructs a prompt using the templates in |self| and saves it."""
     if not self.benchmark:
       return self._prompt
     priming = self._format_priming(self.benchmark)
+    priming = priming.replace('{PROJECT_DIR}', project_dir)
     final_problem = self.format_problem(self.benchmark.function_signature)
     final_problem += (f'You MUST call <code>\n'
                       f'{self.benchmark.function_signature}\n'
@@ -585,6 +595,54 @@ def build(self,
     return self._prompt


+class PrototyperFixerTemplateBuilder(PrototyperTemplateBuilder):
+  """Builder for prompts that fix a fuzz target that failed to build."""
+
+  def __init__(self,
+               model: models.LLM,
+               benchmark: Benchmark,
+               build_result: BuildResult,
+               compile_log: str,
+               template_dir: str = DEFAULT_TEMPLATE_DIR,
+               initial: Any = None):
+    super().__init__(model, benchmark, template_dir, initial)
+    # Load templates.
+    self.priming_template_file = self._find_template(self.agent_templare_dir,
+                                                     'prototyper-fixing.txt')
+    self.build_result = build_result
+    self.compile_log = compile_log
+
+  def build(self,
+            example_pair: list[list[str]],
+            project_example_content: Optional[list[list[str]]] = None,
+            project_context_content: Optional[dict] = None,
+            tool_guides: str = '',
+            project_dir: str = '') -> prompts.Prompt:
+    """Constructs a prompt using the templates in |self| and saves it."""
+    del (example_pair, project_example_content, project_context_content,
+         tool_guides)
+    if not self.benchmark:
+      return self._prompt
+
+    if self.build_result.build_script_source:
+      build_text = (f'<build script>\n{self.build_result.build_script_source}\n'
+                    '</build script>')
+    else:
+      build_text = 'Build script reuses `/src/build.bk.sh`.'
+
+    prompt = self._get_template(self.priming_template_file)
+    prompt = prompt.replace('{FUZZ_TARGET_SOURCE}',
+                            self.build_result.fuzz_target_source)
+    prompt = prompt.replace('{BUILD_TEXT}', build_text)
+    prompt = prompt.replace('{COMPILE_LOG}', self.compile_log)
+    prompt = prompt.replace('{FUNCTION_SIGNATURE}',
+                            self.benchmark.function_signature)
+    prompt = prompt.replace('{PROJECT_DIR}', project_dir)
+    self._prompt.append(prompt)
+
+    return self._prompt
+
+
 class DefaultJvmTemplateBuilder(PromptBuilder):
   """Default builder for JVM projects."""

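
The two behaviours added to `llm_toolkit/prompt_builder.py` can be sketched without the surrounding classes. This is a standalone illustration rather than the repository's code: `priming_template_name` and `render_fixing_prompt` are hypothetical free functions, while the template file names, the placeholders, and the `/src/build.bk.sh` fallback text are taken from the diff above.

```python
def priming_template_name(is_c_target: bool, is_cpp_target: bool) -> str:
  """Picks the priming template by the fuzz target's language."""
  if is_c_target:
    return 'prototyper-priming.c.txt'
  if is_cpp_target:
    return 'prototyper-priming.cpp.txt'
  return 'prototyper-priming.txt'


def render_fixing_prompt(template: str, fuzz_target_source: str,
                         build_script_source: str, compile_log: str,
                         function_signature: str, project_dir: str) -> str:
  """Fills the placeholders of a prototyper-fixing style template."""
  if build_script_source:
    build_text = f'<build script>\n{build_script_source}\n</build script>'
  else:
    # No LLM-generated script: the build reused the backed-up default script.
    build_text = 'Build script reuses `/src/build.bk.sh`.'

  replacements = {
      '{FUZZ_TARGET_SOURCE}': fuzz_target_source,
      '{BUILD_TEXT}': build_text,
      '{COMPILE_LOG}': compile_log,
      '{FUNCTION_SIGNATURE}': function_signature,
      '{PROJECT_DIR}': project_dir,
  }
  # Sequential replacement, in the same order as the diff.
  for placeholder, value in replacements.items():
    template = template.replace(placeholder, value)
  return template
```

One consequence of sequential `str.replace` calls is that a placeholder string appearing inside an earlier substituted value (for example a compile log that happens to contain `{PROJECT_DIR}`) is itself replaced by a later call. A single-pass substitution would avoid that if it ever matters.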
16 changes: 16 additions & 0 deletions prompts/agent/prototyper-fixing.txt
@@ -0,0 +1,16 @@
Failed to build fuzz target. Here is the fuzz target, build script, compilation command, and compilation output:
<fuzz target>\n{FUZZ_TARGET_SOURCE}\n</fuzz target>
{BUILD_TEXT}
<compilation log>\n{COMPILE_LOG}\n</compilation log>
YOU MUST first analyze the error messages with the fuzz target and the build script carefully to identify the root cause.
YOU MUST NOT make any assumptions about the source code or build environment. Always confirm assumptions with source code evidence obtained via Bash commands.
Once you are absolutely certain of the error root cause, output the FULL SOURCE CODE of the fuzz target (and FULL SOURCE CODE of build script, if /src/build.bk.sh is insufficient).
TIPS:
1. If needed, #include the necessary headers and #define the required macros or constants in the fuzz target.
2. Adjust compiler flags to link required libraries in the build script.
3. After collecting information and analyzing the error root cause, YOU MUST take at least one step to validate your theory with source code evidence.
4. Always use the source code from project source code directory `{PROJECT_DIR}/` to understand errors and how to fix them. For example, search for the key words (e.g., function name, type name, constant name) in the source code to learn how they are used. Similarly, learn from the other fuzz targets and the build script to understand how to include the correct headers.
5. Once you have verified the error root cause, output the FULL SOURCE CODE of the fuzz target (and FULL SOURCE CODE of build script, if /src/build.bk.sh is insufficient).
6. Focus on writing a compilable fuzz target that calls the function-under-test {FUNCTION_SIGNATURE}, don't worry about coverage or finding bugs. We can improve that later, but first try to ensure it calls the function-under-test {FUNCTION_SIGNATURE} and can compile successfully.
7. If an error happens repeatedly and cannot be fixed, try to mitigate it. For example, replace or remove the line.

141 changes: 141 additions & 0 deletions prompts/agent/prototyper-priming.c.txt
@@ -0,0 +1,141 @@
<system>
As a security testing engineer, you must write an `int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)` fuzz target in {LANGUAGE}.
Objective: Your goal is to modify an existing fuzz target `{FUZZ_TARGET_PATH}` to write a minimum fuzz target of a given function-under-test that can build successfully.
</system>

<steps>
Follow these steps to write a minimum fuzz target:

Step 1. Determine the information you need to write an effective fuzz target.
This includes:
* **Source code** of the function under test.
* **Custom Types and Dependencies** definitions and implementations.
* **Initialization and setup** requirements and steps.
* **Build details** and integration steps.
* Valid and edge-case input values.
* Environmental and runtime dependencies.

Step 2. Collect information using the Bash tool.
Use the bash tool (see <tool> section) and follow its rules to gather the necessary information. You can collect information from:
* The existing human written fuzz target at `{FUZZ_TARGET_PATH}`.
* The existing human written build script `/src/build.bk.sh`.
* The project source code directory `{PROJECT_DIR}/` cloned from the project repository.
* Documentation about the project, the function, and the variables/constants involved.
* Environment variables.
* Knowledge about OSS-Fuzz's build infrastructure: It will compile your fuzz target in the same way as the existing human written fuzz target with the build script.

Step 3. Analyze the function and its parameters.
Understand the function under test by analyzing its source code and documentation:
* **Purpose and functionality** of the function.
* **Input processing** and internal logic.
* **Dependencies** on other functions or global variables.
* **Error handling** and edge cases.

Step 4. Understand initialization requirements.
Identify what is needed to properly initialize the function:
* **Header files** and their relative paths used by include statements in the fuzz target.
* **Complex input parameters or objects** initialization.
* **Constructor functions** or initialization routines.
* **Global state** or configuration that needs to be set up.
* **Mocking** external dependencies if necessary.

Step 5. Understand Constraints and edge cases.
For each input parameter, understand:
* Valid ranges and data types.
* Invalid or edge-case values (e.g., zero, NULL, predefined constants, maximum values).
* Special values that trigger different code paths.

Step 6: Plan Fuzz Target Implementation.
Decide how to implement the fuzz target:
* **Extract parameters** from the `data` and `size` variable of `LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)`.
* Handle fixed-size versus variable-size data.
* **Initialize function's parameters** by appropriately mapping the raw input bytes.
* Ensure that the fuzz target remains deterministic and avoids side effects.
* Avoid `goto` statements.

Step 7: **Write** the fuzz target code.
Implement the `LLVMFuzzerTestOneInput` function:
* Header files:
  * Investigate how existing fuzz targets include headers.
  * Investigate where they are located in the project.
  * Collect all headers required by your fuzz target and their locations.
  * Include their relative path in the same way as the existing fuzz targets.
* Macros or Constants:
  * Include or define necessary macros or constants.
* Input Handling:
  * Check that the input size is sufficient.
  * Extract parameters from the input data.
  * Handle any necessary conversions or validations.
* Function Invocation:
  * Initialize required objects or state.
  * Modify the existing fuzz target at `{FUZZ_TARGET_PATH}` to fuzz the function under test with the fuzzed parameters.
  * Ensure proper error handling.
* Cleanup:
  * Free any allocated resources.
  * Reset any global state if necessary.

Step 8 (Optional): **Modify** the Build Script.
Write a new build script only if the existing one (`/src/build.bk.sh`) is insufficient:
* Decide if you need to modify the build script at `/src/build.bk.sh` to successfully build the new fuzz target.
* Include compilation steps for the project under test.
* Include compilation steps for the new fuzz target.
* Specify necessary compiler and linker flags.
* Ensure all dependencies are correctly linked.

Step 9: Providing Your Conclusion:
* Provide your conclusion on the FULL new fuzz target and build script **ONLY AFTER** you have gathered all necessary information.
* **DO NOT SEND** any other content (e.g., bash tool commands) in the conclusion message. ALWAYS send other commands individually and ONLY SEND conclusion after collecting all information.
* Conclusion Format:
  * Overall Description:
    * Summarize your findings and describe your fuzz target design.
    * Wrap this summary within <conclusion> and </conclusion> tags.
  * Modified Fuzz Target:
    * Provide the full code of the modified fuzz target.
    * Wrap the code within <fuzz target> and </fuzz target> tags.
  * Modified Build Script (if applicable):
    * If you need to modify the build script, provide the full code.
    * Wrap it within <build script> and </build script> tags.
  * Format Example:
<conclusion>
I determined that the fuzz target needs to include specific header files and adjust the `LLVMFuzzerTestOneInput` function to call the new function-under-test. Additionally, the build script requires modification to link against the necessary libraries.
</conclusion>
<fuzz target>
[Your FULL fuzz target code here.]
</fuzz target>
<build script>
[Your FULL build script code here, if applicable.]
</build script>

</steps>

{TYPE_SPECIFIC_PRIMING}

<instructions>
1. Methodical Approach:
  * Be systematic to cover all necessary aspects, such as:
    * Understanding the function's parameters and dependencies.
    * Identifying required header files and libraries.
    * Recognizing any special initialization or environmental requirements.
2. Utilizing Existing Examples:
  * Use the existing fuzz target at `{FUZZ_TARGET_PATH}` and other fuzz targets with `LLVMFuzzerTestOneInput` in its parent directory as references.
  * Pay special attention to:
    * How header files are included.
    * The structure and content of the `LLVMFuzzerTestOneInput` function.
  * Typically, you only need to modify the content of `LLVMFuzzerTestOneInput`.
3. Investigating Header Inclusions:
  * Use the bash tool to find required headers and libraries.
  * Examine library files built by `/src/build.bk.sh` to understand available functions and symbols.
4. Modifying the Build Script (if necessary):
  * Modify `/src/build.bk.sh` to build the necessary components or include required libraries if the function-under-test is not included.
  * The project's directory may contain a `README.md` with build instructions (e.g., at `/src/<project-name>/README.md`).
5. Do Not Compile:
  * **Do not compile** the fuzz target during your investigation.
  * Provide your conclusions based on the information gathered after you have a solution.
6. Formatting Code Snippets:
  * Do not wrap code snippets with triple backticks (```).
  * Use the specified XML-style tags for wrapping code and other content.
7. DO NOT send the <conclusion> early: Provide conclusions **only after** gathering all necessary information.
8. Focus on Final Goals:
  * Ensure that your fuzz target and build script aim to successfully build the fuzz target and fuzz the function-under-test.
</instructions>
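
The conclusion format requested in Step 9 of this prompt lends itself to simple tag extraction on the receiving side. The sketch below is an assumption about how such a response could be parsed, not the parser used in `agent/prototyper.py` (whose diff is not rendered here); `extract_tag` and `parse_conclusion` are hypothetical names.

```python
import re
from typing import Optional


def extract_tag(response: str, tag: str) -> Optional[str]:
  """Returns the stripped text between <tag> and </tag>, or None if absent."""
  pattern = rf'<{re.escape(tag)}>(.*?)</{re.escape(tag)}>'
  match = re.search(pattern, response, re.DOTALL)
  return match.group(1).strip() if match else None


def parse_conclusion(response: str) -> dict[str, Optional[str]]:
  """Collects the three sections the priming prompt asks the LLM to send."""
  tags = ('conclusion', 'fuzz target', 'build script')
  return {tag: extract_tag(response, tag) for tag in tags}
```

A missing `build script` entry (value `None`) would then correspond to the case where the existing `/src/build.bk.sh` is reused.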