Update GSoC 2025 ideas page (#80)
notxvilka authored Feb 10, 2025
1 parent 9655ea1 commit ac1fe1d
Showing 1 changed file with 87 additions and 21 deletions.
108 changes: 87 additions & 21 deletions content/gsoc/2025.md
@@ -7,6 +7,7 @@ ShowReadingTime: false
date: "2025-01-26"
---


**TL;DR** Jump to the [Ideas list](#project-ideas).

# Introduction
@@ -263,6 +264,55 @@ feature out of the box.

# Rizin

## Classes analysis for C++/ObjectiveC/Swift/Dlang/Java (350 hour project)

Analysis classes, accessible under the `ac` command, are a relatively new feature of Rizin.
They provide a way to manage and use information about classes in the binary, both manually and automatically. However, the support is still bare-bones: it lacks integration with other analysis passes, and class information is not displayed in the disassembly output.

Consider the following call: `call dword [eax + 0x6c]`
Let's assume eax is the base pointer of a vtable we have saved in class analysis and we want to find out the actual address of the called method.

So there should be a command that takes the offset (in this case 0x6c) and looks up the actual destination.
It should be possible to call this command with a specific class, so it only looks into its vtable, or without a class, so it gives a list of possible destinations for all vtables that are not too small for the offset.

When that is implemented, one could also add a command that does the same thing, but automatically takes the offset from the opcode at the current seek.
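
The lookup described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Rizin code: the `VTABLES` table, the function names, and the 4-byte pointer size (chosen to match the `dword` operand) are all assumptions made for the example.

```python
import re

POINTER_SIZE = 4  # assumption: 32-bit target, matching the `dword` operand

# Hypothetical vtables: class name -> method addresses in slot order.
VTABLES = {
    "Shape": [0x401000 + 0x10 * i for i in range(28)],  # 28 slots, covers offset 0x6c
    "Point": [0x402000, 0x402010],                      # too small for offset 0x6c
}


def offset_from_opcode(opcode):
    """Extract the displacement from an indirect call like `call dword [eax + 0x6c]`."""
    m = re.search(r"\[\s*\w+\s*\+\s*(0x[0-9a-fA-F]+|\d+)\s*\]", opcode)
    if m is None:
        raise ValueError(f"no register+offset operand in {opcode!r}")
    text = m.group(1)
    return int(text, 16) if text.lower().startswith("0x") else int(text)


def resolve(offset, class_name=None, vtables=VTABLES, pointer_size=POINTER_SIZE):
    """Return (class, address) pairs for every vtable that has a slot at `offset`.

    With `class_name` given, only that class's vtable is searched.
    """
    if offset % pointer_size:
        return []  # offset does not land on a slot boundary
    slot = offset // pointer_size
    candidates = {class_name: vtables[class_name]} if class_name else vtables
    return [(name, table[slot]) for name, table in candidates.items() if slot < len(table)]
```

Calling `resolve(offset_from_opcode("call dword [eax + 0x6c]"))` searches every known vtable, while passing a class name restricts the search to that class. Vtables that are too small for the offset are skipped.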

### Task

- Connecting classes with their methods
- Class inheritance - nesting data structs
- Constructors and destructors autorecognition
- `try`/`catch`/`finally` recognition and marking
- Argument recognition
- ASCII/graphviz graph of class inheritance/structure inheritance
- Tests with sources for C++, FreePascal, D language, ObjC and Swift, for rizin-testbins
- Classes list via `Vb`. It already supports browsing bin classes. The same thing should be implemented for classes from analysis.
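
As a rough illustration of the Graphviz item in the list above, the following sketch turns a class-to-bases mapping into DOT text. The input mapping and the function name are hypothetical; how Rizin would actually store or export inheritance data is not specified here.

```python
def inheritance_dot(bases):
    """Render a Graphviz DOT graph from a mapping of class name -> direct base classes."""
    lines = ["digraph classes {", "  rankdir=BT;"]  # bases are drawn above derived classes
    for cls in sorted(bases):
        lines.append(f'  "{cls}";')  # declare the node so classes without bases still appear
        for base in sorted(bases[cls]):
            lines.append(f'  "{cls}" -> "{base}";')  # edge points from derived to base
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(inheritance_dot({"Shape": [], "Circle": ["Shape"], "Square": ["Shape"]}))
```

The output can be piped to `dot -Tpng` to render the hierarchy.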

### Skills
* Good knowledge of the C language
* Good knowledge of the C++ language (other languages, such as ObjC, Swift, D, etc., are a plus)

### Difficulty
Hard

### Benefits for the participant
The participant will understand how OOP languages work under the hood and will master techniques for detecting high-level abstractions (classes, methods) in binary code.

### Benefits for the project
Efficient reverse engineering of programs written in C++ and other OOP languages will greatly benefit the project.

### Assess requirements for midterm/final evaluation
- 1st term: Integrate classes into the analysis and type inference; detect constructors, destructors, and `try`/`catch` blocks.
- Final term: Class inheritance recognition, virtual method detection, building class inheritance graphs, visual mode to inspect class and method relationships.

### Mentors
- xvilka
- deroad

### Links and resources
- [Improve vtable detection for C++, ObjectiveC, Dlang and Swift binaries - issue #416](https://github.com/rizinorg/rizin/issues/416)
- [Devirtualize method calls using class vtables - issue #414](https://github.com/rizinorg/rizin/issues/414)

## Debugger improvements and portability (175 hour project)

The Rizin debugger already supports most platforms, including both native and remote debugging.
@@ -294,6 +344,9 @@ Hard
### Benefits for the participant
The participant will understand how debugging works at a low level and will gain experience with a variety of platforms and operating systems.

### Benefits for the project
It will allow efficient low-level debugging on various supported platforms, not only the mainstream ones, greatly improving Rizin's usefulness in reversing some domain-specific software.

### Assess requirements for midterm/final evaluation
- 1st term: SystemZ, MIPS, HPPA support in Linux native and remote GDB debuggers
- Final term: ARM and SPARC support in *BSD debuggers, VAX support in NetBSD
@@ -329,6 +382,9 @@ Hard
### Benefits for the participant
The participant will learn how to use the Frida toolkit and will understand the internals of the debugging and instrumentation processes.

### Benefits for the project
It will allow easy dynamic code instrumentation right from a Rizin or Cutter session, enabling tracing and code inspection.

### Assess requirements for midterm/final evaluation
- 1st term: Implement the core of the Frida plugin, allowing local and remote debugging features
- Final term: Add support for extended features like calling functions or scripts within the context
@@ -390,7 +446,7 @@ This feature would greatly help during exploits development, and people would be
- [Slides](https://github.com/XVilka/hacklu) for the exploitation part of workshop at Hack.lu 2015
- [RzEgg related bugs](https://github.com/rizinorg/rizin/labels/RzEgg)

## Binary case reduce tool
## Binary case reduce tool (175 hour project)

Similar to [Csmith/Creduce](https://github.com/csmith-project/creduce), but operating on binary files: it reduces the size of a test case and avoids the need to share proprietary or classified files.

@@ -403,6 +459,32 @@ It can perform these operations:

Since it requires some knowledge of the file format, existing libraries like [LIEF](https://lief.re/) could be used.

### Task
1. Make a tool to reduce the size of ELF using specified operations
2. Extend it to other formats - PE, MachO
3. Create tests for this tool
4. Research the possibility of minimizing some testcases that are already in the Rizin repository.
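
To make the reduction idea more concrete, here is a minimal, format-agnostic sketch in the spirit of delta debugging. It is an assumption-laden illustration rather than the proposed tool: `reduce_bytes` and the `is_interesting` predicate are hypothetical names, and a real implementation would operate on ELF/PE/MachO structures (sections, symbols, and so on) instead of raw byte chunks, with a predicate that runs the analysis tool on each candidate.

```python
def reduce_bytes(data, is_interesting):
    """Greedily delete chunks of `data` while `is_interesting` keeps returning True.

    The chunk size starts at half the input and is halved after every pass,
    finishing with a pass that tries to delete single bytes.
    """
    if not is_interesting(data):
        raise ValueError("the original input must satisfy the predicate")
    chunk = max(len(data) // 2, 1)
    while chunk >= 1:
        pos = 0
        while pos < len(data):
            candidate = data[:pos] + data[pos + chunk:]
            if is_interesting(candidate):
                data = candidate  # chunk was not needed; retry at the same position
            else:
                pos += chunk      # chunk is needed; keep it and move on
        chunk //= 2
    return data


if __name__ == "__main__":
    # Toy predicate: the "bug" reproduces while both markers are still present.
    def still_fails(blob):
        return b"\x7fELF" in blob and b"BUG" in blob

    sample = b"\x7fELF" + b"A" * 50 + b"BUG" + b"B" * 40
    print(reduce_bytes(sample, still_fails))
```

The predicate is the only part that knows what "still reproduces the issue" means, so the same loop could be reused with structure-aware removal steps later.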

### Skills
The participant should be comfortable with the C language, as well as a high-level language of their choice (Python or Rust).

### Difficulty
Advanced

### Benefits for the participant
The participant will improve their understanding of file formats and their mutation.

### Benefits for the project
This feature would greatly help in minimizing and anonymizing test cases for Rizin and any other binary analysis tool.

### Assess requirements for evaluation
- 1st term: Create a simple tool to reduce the ELF size
- Final term: Extend it to other formats, improve the "compression" rate, and cover it with tests.

### Mentors
- xvilka
- deroad

### Links/Resources

- https://github.com/rizinorg/ideas/issues/52
@@ -448,30 +530,15 @@ The current code analysis has many caveats and issues which need addressing. Fix

Currently Rizin has support for heap exploration and analysis, but the feature is still basic and can be improved. Additionally, [other allocators can be added](https://github.com/rizinorg/rizin/issues/208) (macOS, tmalloc, etc.), but this should be done after a proper refactoring, because heap analysis shouldn't depend on the debugger backend, and we may be able to use different heap tools.

## Class analysis for C++/ObjectiveC/Swift/Dlang/Java [#416](https://github.com/rizinorg/rizin/issues/416)

Analysis classes, accessible under the `ac` command, are a relatively new feature of Rizin.
They provide a way to both manually and automatically manage and use information about classes in the binary.

### Devirtualize method calls using class vtables [#414](https://github.com/rizinorg/rizin/issues/414)

Consider the following call: `call dword [eax + 0x6c]`
Let's assume eax is the base pointer of a vtable we have saved in class analysis and we want to find out the actual address of the called method.

So there should be a command that takes the offset (in this case 0x6c) and looks up the actual destination.
It should be possible to call this command with a specific class, so it only looks into its vtable, or without a class, so it gives a list of possible destinations for all vtables that are not too small for the offset.

When that is implemented, one could also add a command that does the same thing, but automatically takes the offset from the opcode at the current seek.

### Add classes list to `Vb`

`Vb` already supports browsing bin classes. The same thing should be implemented for classes from
analysis.

## Rizin legacy code refactoring

## Miscellaneous

### `rz-ar` - code archives unpacking tool

https://github.com/rizinorg/rizin/issues/4866

### Improving regression suite and testing

It is required to solve [numerous issues](https://github.com/rizinorg/rizin/labels/rz-test), along with improving parallel execution and performance.
@@ -498,4 +565,3 @@ in the Rizin side.

Also note that fixes for most of these issues should be paired with a test to verify that they will not break in
the future.
