From ac1fe1dc8a646960507e7a38babccf3858f19a6b Mon Sep 17 00:00:00 2001
From: notxvilka
Date: Tue, 11 Feb 2025 00:16:48 +0800
Subject: [PATCH] Update GSoC 2025 ideas page (#80)

---
 content/gsoc/2025.md | 108 ++++++++++++++++++++++++++++++++++---------
 1 file changed, 87 insertions(+), 21 deletions(-)

diff --git a/content/gsoc/2025.md b/content/gsoc/2025.md
index 6a90e1a..51acad4 100644
--- a/content/gsoc/2025.md
+++ b/content/gsoc/2025.md
@@ -7,6 +7,7 @@ ShowReadingTime: false
 date: "2025-01-26"
 ---
+
 **TL;DR** Jump to the [Ideas list](#project-ideas).
 
 # Introduction
 
@@ -263,6 +264,55 @@ feature out of the box.
 
 # Rizin
 
+## Classes analysis for C++/ObjectiveC/Swift/Dlang/Java (350 hour project)
+
+Analysis classes, accessible under the `ac` command, is a relatively new feature of rizin.
+They provide a way to both manually and automatically manage and use information about classes in the binary. But their support is only bare bones, without supporting various analysis integration, as well as display in the disassembly output.
+
+Consider the following call: `call dword [eax + 0x6c]`
+Let's assume eax is the base pointer of a vtable we have saved in class analysis and we want to find out the actual address of the called method.
+
+So there should be a command that takes the offset (in this case 0x6c) and looks up the actual destination.
+It should be possible to call this command with a specific class, so it only looks into its vtable, or without a class, so it gives a list of possible destinations for all vtables that are not too small for the offset.
+
+When that is implemented, one could also add a command that does the same thing, but automatically takes the offset from the opcode at the current seek.
+
+### Task
+
+- Connecting classes with their methods
+- Class inheritance - nesting data structs
+- Constructors and destructors autorecognition
+- `try`/`catch`/`finally` recognition and marking
+- Arguments recognition
+- ASCII/graphviz graph of class inheritance/structure inheritance
+- Tests with sources for C++, FreePascal, D language, ObjC and Swift, for rizin-testbins
+- Classes list via `Vb`. It already supports browsing bin classes. The same thing should be implemented for classes from analysis.
+
+### Skills
+* Good knowledge of the C language
+* Good knowledge of the C++ language (other languages, like ObjC, Swift, D, etc. are a plus)
+
+### Difficulty
+Hard
+
+### Benefits for the participant
+The participant will understand how OOP languages work under the hood and will master techniques for detecting various high-level abstractions (classes, methods) in binary code.
+
+### Benefits for the project
+It will allow efficient reverse engineering of programs written in C++ and other OOP languages.
+
+### Assess requirements for midterm/final evaluation
+- 1st term: implement classes integration into the analysis, type inference; detect constructors and destructors, `try`/`catch` blocks.
+- Final term: Class inheritance recognition, virtual methods detection, building class inheritance graphs, visual mode to inspect classes and methods relationships.
+
+### Mentors
+- xvilka
+- deroad
+
+### Links and resources
+- [Improve vtable detection for C++, ObjectiveC, Dlang and Swift binaries - issue #416](https://github.com/rizinorg/rizin/issues/416)
+- [Devirtualize method calls using class vtables - issue #414](https://github.com/rizinorg/rizin/issues/414)
+
 ## Debugger improvements and portability (175 hour project)
 
 Rizin debugger already supports most of the platforms, including native and remote debugging.
@@ -294,6 +344,9 @@ Hard
 ### Benefits for the participant
 Participant will understand how debugging works on the low level, and will gain experience with variety of different platforms and operating systems.
 
+### Benefits for the project
+It will allow efficient low-level debugging on various supported platforms, not only the mainstream ones, greatly improving Rizin's usefulness in reversing some domain-specific software.
+
 ### Assess requirements for midterm/final evaluation
 - 1st term: `SystemZ, MIPS, HPPA support in Linux native, remote GDB debuggers
 - Final term: ARM and SPARC support in *BSD debuggers, VAX support in NetBSD
@@ -329,6 +382,9 @@ Hard
 ### Benefits for the participant
 Participant will understand and learn how to use Frida toolkit, also the internals of the debugging and instrumentation processes.
 
+### Benefits for the project
+It will allow easy dynamic code instrumentation right from a Rizin or Cutter session, enabling tracing and code inspection.
+
 ### Assess requirements for midterm/final evaluation
 - 1st term: Implement core of the FRIDA plugin, allowing local and remote debugging features
 - Final term: Add support for extended features like calling functions or scripts within the context
@@ -390,7 +446,7 @@ This feature would greatly help during exploits development, and people would be
 - [Slides](https://github.com/XVilka/hacklu) for the exploitation part of workshop at Hack.lu 2015
 - [RzEgg related bugs](https://github.com/rizinorg/rizin/labels/RzEgg)
 
-## Binary case reduce tool
+## Binary case reduce tool (175 hour project)
 
 Similar to [Csmith/Creduce](https://github.com/csmith-project/creduce) but operating on the binary files, to reduce the size of the test and to avoid sharing proprietary/classified files.
 
@@ -403,6 +459,32 @@ It can perform these operations:
 
 Since it requires some knowledge of the file format, existing libraries like [LIEF](https://lief.re/) could be used.
 
+### Task
+1. Make a tool to reduce the size of ELF files using the specified operations
+2. Extend it to other formats - PE, MachO
+3. Create tests for this tool
+4. Research the possibility of minimizing some test cases that are already in the Rizin repository.
+
+### Skills
+The participant should be comfortable with the C language, as well as a high-level language of their choice (Python or Rust).
+
+### Difficulty
+Advanced
+
+### Benefits for the participant
+The participant will improve their understanding of file formats and their mutation.
+
+### Benefits for the project
+This feature would greatly help in minimizing and anonymizing test cases for Rizin and any other binary analysis tool.
+
+### Assess requirements for evaluation
+- 1st term: Create a simple tool to reduce the size of ELF files
+- Final term: Extend it to other formats, improve the "compression" rate, cover it with tests.
+
+### Mentors
+- xvilka
+- deroad
+
 ### Links/Resources
 - https://github.com/rizinorg/ideas/issues/52
 
@@ -448,30 +530,15 @@ The current code analysis has many caveats and issues which need addressing. Fix
 
 Currently Rizin has support for heap exploration and analysis, but the feature is still basic and can be improved. Additionally, [other allocators can be added](https://github.com/rizinorg/rizin/issues/208) (MacOS, tmalloc, etc.), but this should be done after a proper refactoring, because heap analysis shouldn't depend on the debugger backend, and we may be able to use different heap tools.
 
-## Class analysis for C++/ObjectiveC/Swift/Dlang/Java [#416](https://github.com/rizinorg/rizin/issues/416)
-
-Analysis classes, accessible under the `ac` command, is a relatively new feature of rizin.
-They provide a way to both manually and automatically manage and use information about classes in the binary.
-
-### Devirtualize method calls using class vtables [#414](https://github.com/rizinorg/rizin/issues/414)
-
-Consider the following call: `call dword [eax + 0x6c]`
-Let's assume eax is the base pointer of a vtable we have saved in class analysis and we want to find out the actual address of the called method.
-
-So there should be a command that takes the offset (in this case 0x6c) and looks up the actual destination.
-It should be possible to call this command with a specific class, so it only looks into its vtable, or without a class, so it gives a list of possible destinations for all vtables that are not too small for the offset.
-
-When that is implemented, one could also add a command that does the same thing, but automatically takes the offset from the opcode at the current seek.
-
-### Add classes list to `Vb`
-
-`Vb` already supports browsing bin classes. The same thing should be implemented for classes from
-analysis.
 
 ## Rizin legacy code refactoring
 
 ## Miscellaneous
 
+### `rz-ar` - code archives unpacking tool
+
+https://github.com/rizinorg/rizin/issues/4866
+
 ### Improving regression suite and testing
 
 It is required to solve [numerous issues](https://github.com/rizinorg/rizin/labels/rz-test), along with improving parallel execution and performance.
@@ -498,4 +565,3 @@ in the Rizin side.
 
 Also note that most of these issues should be paired with the test to verify it will not break in the future.
 
-