
Commit

update web
jingyaogong committed Feb 20, 2025
1 parent f88f23f commit 6348f0e
Showing 13 changed files with 20 additions and 25 deletions.
Binary file removed images/1-pretrain-512.png
Binary file removed images/1-pretrain-768.png
Binary file removed images/2-sft-512.png
Binary file removed images/2-sft-768.png
Binary file removed images/3-eval_chat.png
Binary file removed images/llava-structure.png
Binary file removed images/minimind-v-input.png
Binary file added images/minimind2-v.gif
Binary file removed images/modelscope-demo.gif
Binary file removed images/web_server.gif
Binary file removed images/web_server1.png
Binary file removed images/web_server2.png
45 changes: 20 additions & 25 deletions index.html
@@ -68,7 +68,8 @@
<div align="center">
<img src="./images/logo.png" alt="Logo" class="logo mb-4"/>
<!--<br>-->
<img src="https://visitor-badge.laobi.icu/badge?page_id=jingyaogong/minimind-v" alt="Visitor Badge" class="me-2">
<img src="https://visitor-badge.laobi.icu/badge?page_id=jingyaogong/minimind-v" alt="Visitor Badge"
class="me-2">
<img src="https://img.shields.io/github/stars/jingyaogong/minimind-v?style=social" alt="GitHub Stars"
class="me-2">
<img src="https://img.shields.io/github/license/jingyaogong/minimind-v" alt="License" class="me-2">
@@ -85,46 +86,40 @@

<hr>
<ul>
-<li>This open-source project aims to train a small-parameter, visually-capable language model
-    <strong>MiniMind-V</strong> from scratch, with the goal of achieving this in as little as 3 hours.
+<li>This project aims to train a super-small multimodal vision-language model, <strong>MiniMind-V</strong>,
+    from scratch, for a cost of just 3 RMB and 2 hours of work!
</li>
-<li><strong>MiniMind-V</strong> is also extremely lightweight, with the smallest version being approximately
-    1/7000 the size of GPT3, striving to enable quick inference and even training on personal GPUs.
+<li>The smallest version of <strong>MiniMind-V</strong> is only about 1/7000 the size of GPT-3, designed to
+    enable fast inference and even training on personal GPUs.
</li>
-<li><strong>MiniMind-V</strong> provides the full-stage code for a simplified large model structure, dataset
-    cleaning and preprocessing, supervised pretraining, supervised instruction fine-tuning (SFT).
-    It also includes code for expanding to sparse models with mixed experts (MoE).
+<li><strong>MiniMind-V</strong> extends the <a
+    href="https://github.com/jingyaogong/minimind">MiniMind</a> pure language model with visual capabilities.
</li>
-<li>This is not just an implementation of an open-source model; it is also a tutorial for beginners to enter the
-    field of Vision-Language Models (VLM).
+<li>The project includes the full code for a minimalist VLM architecture, dataset cleaning, pretraining, and
+    supervised fine-tuning (SFT).
</li>
-<li>We hope this project can serve as a starting point for researchers, providing an introductory example that
-    helps everyone quickly get started and inspires more exploration and innovation in the VLM domain.
+<li>This is not only the smallest implementation of an open-source VLM but also a concise tutorial for
+    beginners in vision-language models.
</li>
<li style="color: #aaa;">To prevent misinterpretation, "from scratch" specifically refers to building upon the
pure language model MiniMind (which is a GPT-like model trained entirely from scratch) to further expand its
capabilities from 0 to 1 in terms of visual abilities.
For detailed information on the latter, please refer to the twin project <a
href="https://github.com/jingyaogong/minimind">MiniMind</a>.

+<li>The hope is that this project can provide a useful example to inspire others and share the joy of creation,
+    helping to drive progress in the wider AI community!
</li>
<li style="color: #aaa;">To avoid misinterpretation, "fastest 3 hours" means you need a machine with hardware
configuration superior
to mine. Detailed specifications will be provided below.
<li style="color: #aaa;">To avoid misunderstandings, the "2 hours" is based on testing (`1 epoch`) with an
NVIDIA 3090 hardware device (single GPU), and
the "3 RMB" refers to GPU server rental costs.
</li>
</ul>

<div align="center">

<div class="scroll-container">
<img src="./images/modelscope-demo.gif" alt="Streamlit Demo" class="img-fluid mb-4">
<img src="./images/minimind2-v.gif" alt="WebUI Demo" class="img-fluid mb-4">
<img src="./images/VLM-structure.png" alt="LLM Structure" class="img-fluid mb-4">
<img src="./images/VLM-structure-moe.png" alt="LLM Structure MOE" class="img-fluid mb-4">
</div>
<br/>
<a href="https://www.modelscope.cn/studios/gongjy/minimind-v">ModelScope Online</a> |
<a href="https://www.bilibili.com/video/BV1Sh1vYBEzY">Bilibili
Video Link</a>
<a href="https://www.modelscope.cn/studios/gongjy/minimind-v">🔗 Online Experience 🔗</a> |
<a href="https://www.bilibili.com/video/BV1Sh1vYBEzY">🔗 Video Introduction 🔗</a>
<br/>
<br/>
</div>
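
The VLM-structure figures referenced in the diff follow the familiar LLaVA-style recipe: a frozen vision encoder turns the image into patch embeddings, a lightweight projection maps those embeddings into the language model's token space, and the projected image tokens are consumed by the LLM alongside the text tokens. The PyTorch sketch below illustrates that wiring; the dimensions, patch count, and module names are illustrative assumptions, not MiniMind-V's actual code.

```python
import torch
import torch.nn as nn

class TinyVLM(nn.Module):
    """LLaVA-style wiring: project vision features into the LLM token space.
    All sizes below are hypothetical placeholders, not MiniMind-V's config."""

    def __init__(self, vision_dim=768, llm_dim=512, vocab_size=6400):
        super().__init__()
        self.proj = nn.Linear(vision_dim, llm_dim)       # the only new "glue"
        self.tok_emb = nn.Embedding(vocab_size, llm_dim)

    def forward(self, patch_feats, input_ids):
        # patch_feats: (B, num_patches, vision_dim) from a frozen ViT-like encoder
        img_tokens = self.proj(patch_feats)              # (B, num_patches, llm_dim)
        txt_tokens = self.tok_emb(input_ids)             # (B, seq_len, llm_dim)
        # Image tokens are prepended; a transformer backbone would consume this.
        return torch.cat([img_tokens, txt_tokens], dim=1)

feats = torch.randn(1, 196, 768)       # e.g. 14x14 patches from a ViT-B/16 at 224px
ids = torch.randint(0, 6400, (1, 32))
print(TinyVLM()(feats, ids).shape)     # torch.Size([1, 228, 512])
```

The appeal of this layout is that only the projection is new: the vision encoder and the language model can both be reused largely as-is.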
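
The page also names two training stages: pretraining, then supervised fine-tuning (SFT). A common way to run such a schedule is to train only the new projection first and unfreeze more of the model later; the toy loop below sketches that staging. The modules, random "data", and freezing policy here are assumptions for illustration, not the repository's actual scripts.

```python
import torch
import torch.nn as nn

# Toy two-stage schedule (pretrain -> SFT); purely illustrative.
model = nn.ModuleDict({
    "proj": nn.Linear(768, 512),   # new vision->LLM glue, trained from scratch
    "llm": nn.Linear(512, 512),    # stand-in for the transformer backbone
})

def run_stage(trainable, steps=3, lr=1e-3):
    """Train only the submodules named in `trainable`; freeze the rest."""
    for name, module in model.items():
        for p in module.parameters():
            p.requires_grad = name in trainable
    opt = torch.optim.AdamW(
        [p for p in model.parameters() if p.requires_grad], lr=lr)
    for _ in range(steps):
        feats = torch.randn(4, 768)                  # fake vision features
        loss = model["llm"](model["proj"](feats)).pow(2).mean()
        opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

run_stage({"proj"})           # stage 1: align the projection only
run_stage({"proj", "llm"})    # stage 2: fine-tune end to end
```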

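Finally, the headline numbers are easy to sanity-check from the page's own claims: taking GPT-3's published ~175B parameters, "about 1/7000 the size" puts the smallest MiniMind-V near 25M parameters, and "3 RMB for 2 hours" implies roughly 1.5 RMB/hour of GPU rental, which is plausible for a rented consumer GPU.

```python
# Sanity-checking the page's own numbers; GPT-3's ~175e9 parameter count is
# its published size, everything else follows from the claims above.
gpt3_params = 175e9
smallest_minimind_v = gpt3_params / 7000
print(f"~{smallest_minimind_v / 1e6:.0f}M parameters")             # -> ~25M

cost_rmb, train_hours = 3, 2
print(f"implied GPU rate: {cost_rmb / train_hours:.1f} RMB/hour")  # -> 1.5
```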