Generate Automated Layouts for Pathways with Large-Language Model (e.g., ChatGPT) Planning #232

Open
cannin opened this issue Jan 12, 2024 · 6 comments

Comments

@cannin

cannin commented Jan 12, 2024

Background

Pathway diagrams help researchers understand complex biological processes (i.e., pathways). The Systems Biology Graphical Notation (SBGN, https://sbgn.github.io/) is a formalism with a set of interconnected tools and file formats (SBGNML) for generating diagrams of these processes. A lot of pathway content exists in textual databases, and automated layout of this content can be challenging. Manually laid out pathways tend to convey a specific narrative that is lost when using automated layout algorithms that lack an understanding of biology (https://academic.oup.com/bib/article/22/5/bbab103/6217719).
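
As a rough illustration of what "layout" means in SBGNML, the sketch below reads the bounding box stored on each glyph, using only the Python standard library. The XML fragment is hand-written for this example, the namespace URI assumes the 0.2 schema (other versions use a different URI), and `read_glyph_boxes` is a hypothetical helper name, not part of any SBGN tool.

```python
import xml.etree.ElementTree as ET

# Assumption: SBGN-ML 0.2 namespace; adjust for files written against another version.
SBGN_NS = {"sbgn": "http://sbgn.org/libsbgn/0.2"}

# Hand-written minimal fragment, for illustration only.
EXAMPLE_SBGNML = """<sbgn xmlns="http://sbgn.org/libsbgn/0.2">
  <map language="process description">
    <glyph class="macromolecule" id="g1">
      <label text="RAF"/>
      <bbox x="40" y="60" w="80" h="40"/>
    </glyph>
    <glyph class="macromolecule" id="g2">
      <label text="MEK"/>
      <bbox x="240" y="60" w="80" h="40"/>
    </glyph>
  </map>
</sbgn>"""


def read_glyph_boxes(sbgnml_text):
    """Return {glyph id: {"class", "label", "bbox"}} for glyphs that carry a bbox."""
    root = ET.fromstring(sbgnml_text)
    boxes = {}
    for glyph in root.iterfind(".//sbgn:glyph", SBGN_NS):
        bbox = glyph.find("sbgn:bbox", SBGN_NS)
        if bbox is None:
            continue  # skip glyphs without position information
        label = glyph.find("sbgn:label", SBGN_NS)
        boxes[glyph.get("id")] = {
            "class": glyph.get("class"),
            "label": label.get("text") if label is not None else None,
            # (x, y, w, h) as floats
            "bbox": tuple(float(bbox.get(key)) for key in ("x", "y", "w", "h")),
        }
    return boxes


boxes = read_glyph_boxes(EXAMPLE_SBGNML)
```

An automated layout step would, in this view, amount to choosing new `x` and `y` values for these boxes (and the corresponding arc endpoints) while leaving the biological content unchanged.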

Large Language Models (LLMs, e.g., ChatGPT, LLaMA) and multimodal models (e.g., GPT-4V, LLaVA) have been used for a variety of tasks, such as answering questions and writing content, thanks to the huge amount of text on which they have been trained. Using text-based formats, LLMs can also generate diagrams (https://www.mermaidchart.com/blog/posts/mermaid-chart-chatgpt-plugin-combines-generative-ai-and-smart-diagramming). Separately, the training data of ChatGPT and related models includes SBGN content, thanks to diagrams encoded in the SBGNML format.

Recent research has shown that LLMs can be leveraged to aid in diagram generation and layout (https://github.com/aszala/DiagrammerGPT) through a two-stage process (planning then generation).
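
A minimal sketch of how a planning-then-generation split could look in Python follows. It does not reproduce DiagrammerGPT; all function names are hypothetical, the JSON plan format is an assumption made for this example, and the model call is replaced by a canned stand-in (`canned_llm`) so the code runs without any API.

```python
import json


def build_planning_prompt(nodes, edges):
    """Stage 1 input: describe the pathway as text and ask for a JSON layout plan."""
    lines = [
        "Propose x/y center positions for each node of this pathway.",
        'Answer with JSON only: {"node_id": {"x": number, "y": number}}',
        "Nodes:",
    ]
    lines += [f"- {node_id}: {label}" for node_id, label in nodes.items()]
    lines.append("Edges:")
    lines += [f"- {src} -> {dst}" for src, dst in edges]
    return "\n".join(lines)


def parse_plan(llm_reply, nodes):
    """Validate the model's reply and return {node_id: (center_x, center_y)}."""
    plan = json.loads(llm_reply)
    missing = set(nodes) - set(plan)
    if missing:
        raise ValueError(f"plan is missing nodes: {sorted(missing)}")
    return {n: (float(plan[n]["x"]), float(plan[n]["y"])) for n in nodes}


def generate_layout(plan, width=80.0, height=40.0):
    """Stage 2: turn planned centers into (x, y, w, h) boxes with a top-left origin."""
    return {
        node_id: (cx - width / 2, cy - height / 2, width, height)
        for node_id, (cx, cy) in plan.items()
    }


def layout_pathway(nodes, edges, ask_llm):
    """Run planning, then generation. `ask_llm` is any callable: prompt -> reply text."""
    prompt = build_planning_prompt(nodes, edges)
    plan = parse_plan(ask_llm(prompt), nodes)
    return generate_layout(plan)


def canned_llm(prompt):
    """Stand-in for a real model call; returns a fixed plan for the two-node example."""
    return json.dumps({"raf": {"x": 100, "y": 100}, "mek": {"x": 300, "y": 100}})


nodes = {"raf": "RAF", "mek": "MEK"}
edges = [("raf", "mek")]
layout = layout_pathway(nodes, edges, canned_llm)
```

Keeping the plan as validated JSON between the two stages means a malformed or incomplete model reply fails early in `parse_plan`, before any diagram file is written.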

Goal

The goal is to build a pipeline that uses LLMs (e.g., ChatGPT) to aid in the automatic layout of SBGN diagrams.

Difficulty

Easy-Medium: easy to start, difficult to do well

Size and Length of Project

medium: 175 hours
12 weeks preferred

Skills

Python

Public Repository

Potential Mentors

Augustin Luna
Adrien Rougny

@Raya679

Raya679 commented Jan 13, 2024

Hello @cannin,
I am Raya Chakravarty, currently pursuing my BTech in Computer Science. I am particularly interested in this issue and would like to contribute to this project during the GSoC program.

I have prior experience with Large Language Models (LLMs) and have developed a Healthcare Chatbot by fine-tuning LLMs, specifically Llama.

I am going through the resources and links you have provided above. Currently, I am exploring the SBGN Documentation.
Are there any additional tasks you would like me to undertake apart from these?

@7070Shreyash

Hey @cannin, my name is Shreyash, and I'm a B.Tech CSE student proficient in Python and Machine Learning/Deep Learning. I also have experience with Large Language Models (LLMs).

Having reviewed the project goal and provided resources, I'm keenly interested in contributing to this issue through the GSoC program. I'm currently immersed in the documentation and links, and I'm eager to put my skills to use.

Thanks

@khanspers
Contributor

khanspers commented Feb 22, 2024

NRNB has been accepted as a mentoring organization for GSoC 2024. The contributor application period is March 18 – April 2. Here are some useful links:

GSoC contributor guide
NRNB project proposal template
Eligibility requirements
Full program timeline

@sumana-2705

Hello @cannin @adrienrougny

My name is Sumana Sree. I am currently pursuing my Master's at the Indian Institute of Technology (BHU) in the field of Machine Learning. I would love to contribute to this project and gain a deeper understanding of LLMs. Can you let me know whether this project is open for GSoC 2025?

@Anirudh2465

I am doing my bachelor's in Computer Science, specializing in AI. I would love to contribute to the project if it has not been completed yet. I have done many projects related to LLMs and DL, and I feel that I could make great contributions.
Please let me know if I can be of help, @adrienrougny @cannin.

@abanindra3

Subject: Request to Work on SBGN Diagram Layout Automation Project

Hi @adrienrougny @cannin,

I hope this message finds you well! I am Abanindra.M.Singh, a student/developer with experience in Python and a keen interest in leveraging Large Language Models (LLMs) for innovative applications in systems biology. I came across this project and am deeply fascinated by its potential to bridge automated diagram generation with biological pathway visualization.

Why I’m Interested

Understanding and representing complex biological pathways in a meaningful way is a critical challenge. The combination of SBGNML formats, LLM capabilities, and automated layout pipelines is an exciting frontier, and I’m eager to contribute to this field.

My Skills and Experience

Programming: Proficient in Python and have hands-on experience with libraries such as Matplotlib, NetworkX, and PyTorch.
LLM Familiarity: Worked with OpenAI’s API and open-source LLMs like LLaMA.
Diagrams: Have used tools like Mermaid and other visualization frameworks.
Biological Data: Familiar with working with biological databases and models.

What I Plan to Do

Implement a two-stage pipeline using LLMs for planning and generation of SBGN diagrams.
Explore integration with libraries like libsbgn-python for diagram parsing and layout refinement.
Test and refine outputs to ensure diagrams convey meaningful biological narratives.

Request

I would be honored if you could assign this project to me and guide me as I work through its development. I’m committed to contributing quality work and adhering to project timelines.

Looking forward to your response and any further instructions you might have. Thank you for this opportunity to contribute!

Best regards,
Abanindra.M.Singh

8 participants