
Commit

update web
jingyaogong committed Feb 20, 2025
1 parent f88f23f commit 6348f0e
Showing 13 changed files with 20 additions and 25 deletions.
Binary file removed images/1-pretrain-512.png
Binary file removed images/1-pretrain-768.png
Binary file removed images/2-sft-512.png
Binary file removed images/2-sft-768.png
Binary file removed images/3-eval_chat.png
Binary file removed images/llava-structure.png
Binary file removed images/minimind-v-input.png
Binary file added images/minimind2-v.gif
Binary file removed images/modelscope-demo.gif
Binary file removed images/web_server.gif
Binary file removed images/web_server1.png
Binary file removed images/web_server2.png
45 changes: 20 additions & 25 deletions index.html
@@ -68,7 +68,8 @@
<div align="center">
<img src="./images/logo.png" alt="Logo" class="logo mb-4"/>
<!--<br>-->
<img src="https://visitor-badge.laobi.icu/badge?page_id=jingyaogong/minimind-v" alt="Visitor Badge" class="me-2">
<img src="https://visitor-badge.laobi.icu/badge?page_id=jingyaogong/minimind-v" alt="Visitor Badge"
class="me-2">
<img src="https://img.shields.io/github/stars/jingyaogong/minimind-v?style=social" alt="GitHub Stars"
class="me-2">
<img src="https://img.shields.io/github/license/jingyaogong/minimind-v" alt="License" class="me-2">
@@ -85,46 +86,40 @@

<hr>
<ul>
-<li>This open-source project aims to train a small-parameter, visually-capable language model
-    <strong>MiniMind-V</strong> from scratch, with the goal of achieving this in as little as 3 hours.
+<li>This project aims to train a super-small multimodal vision-language model, <strong>MiniMind-V</strong>,
+    from scratch, for a cost of just 3 RMB and 2 hours of work!
</li>
-<li><strong>MiniMind-V</strong> is also extremely lightweight, with the smallest version being approximately
-    1/7000 the size of GPT3, striving to enable quick inference and even training on personal GPUs.
+<li>The smallest version of <strong>MiniMind-V</strong> is only about 1/7000 the size of GPT-3, designed to
+    enable fast inference and even training on personal GPUs.
</li>
-<li><strong>MiniMind-V</strong> provides the full-stage code for a simplified large model structure, dataset
-    cleaning and preprocessing, supervised pretraining, supervised instruction fine-tuning (SFT).
-    It also includes code for expanding to sparse models with mixed experts (MoE).
+<li><strong>MiniMind-V</strong> extends the <a
+    href="https://github.com/jingyaogong/minimind">MiniMind</a> pure language model with visual capabilities.
</li>
-<li>This is not just an implementation of an open-source model; it is also a tutorial for beginners to enter the
-    field of Vision-Language Models (VLM).
+<li>The project includes the full code for a minimalist VLM architecture, dataset cleaning, pretraining, and
+    supervised fine-tuning (SFT).
</li>
-<li>We hope this project can serve as a starting point for researchers, providing an introductory example that
-    helps everyone quickly get started and inspires more exploration and innovation in the VLM domain.
+<li>This is not only the smallest implementation of an open-source VLM but also a concise tutorial for
+    beginners in vision-language models.
</li>
<li style="color: #aaa;">To prevent misinterpretation, "from scratch" specifically refers to building upon the
pure language model MiniMind (which is a GPT-like model trained entirely from scratch) to further expand its
capabilities from 0 to 1 in terms of visual abilities.
For detailed information on the latter, please refer to the twin project <a
href="https://github.com/jingyaogong/minimind">MiniMind</a>.

+<li>The hope is that this project can provide a useful example to inspire others and share the joy of creation,
+    helping to drive progress in the wider AI community!
</li>
<li style="color: #aaa;">To avoid misinterpretation, "fastest 3 hours" means you need a machine with hardware
configuration superior
to mine. Detailed specifications will be provided below.
<li style="color: #aaa;">To avoid misunderstandings, the "2 hours" is based on testing (`1 epoch`) with an
NVIDIA 3090 hardware device (single GPU), and
the "3 RMB" refers to GPU server rental costs.
</li>
</ul>

<div align="center">

<div class="scroll-container">
<img src="./images/modelscope-demo.gif" alt="Streamlit Demo" class="img-fluid mb-4">
<img src="./images/minimind2-v.gif" alt="WebUI Demo" class="img-fluid mb-4">
<img src="./images/VLM-structure.png" alt="LLM Structure" class="img-fluid mb-4">
<img src="./images/VLM-structure-moe.png" alt="LLM Structure MOE" class="img-fluid mb-4">
</div>
<br/>
<a href="https://www.modelscope.cn/studios/gongjy/minimind-v">ModelScope Online</a> |
<a href="https://www.bilibili.com/video/BV1Sh1vYBEzY">Bilibili
Video Link</a>
<a href="https://www.modelscope.cn/studios/gongjy/minimind-v">🔗 Online Experience 🔗</a> |
<a href="https://www.bilibili.com/video/BV1Sh1vYBEzY">🔗 Video Introduction 🔗</a>
<br/>
<br/>
</div>
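
The VLM-structure figures referenced in the diff follow the familiar LLaVA-style recipe: a frozen vision encoder turns the image into patch embeddings, a lightweight projection maps those embeddings into the language model's token space, and the projected image tokens are consumed by the LLM alongside the text tokens. The PyTorch sketch below illustrates that wiring; the dimensions, patch count, and module names are illustrative assumptions, not MiniMind-V's actual code.

```python
import torch
import torch.nn as nn

class TinyVLM(nn.Module):
    """LLaVA-style wiring: project vision features into the LLM token space.
    All sizes below are hypothetical placeholders, not MiniMind-V's config."""

    def __init__(self, vision_dim=768, llm_dim=512, vocab_size=6400):
        super().__init__()
        self.proj = nn.Linear(vision_dim, llm_dim)       # the only new "glue"
        self.tok_emb = nn.Embedding(vocab_size, llm_dim)

    def forward(self, patch_feats, input_ids):
        # patch_feats: (B, num_patches, vision_dim) from a frozen ViT-like encoder
        img_tokens = self.proj(patch_feats)              # (B, num_patches, llm_dim)
        txt_tokens = self.tok_emb(input_ids)             # (B, seq_len, llm_dim)
        # Image tokens are prepended; a transformer backbone would consume this.
        return torch.cat([img_tokens, txt_tokens], dim=1)

feats = torch.randn(1, 196, 768)       # e.g. 14x14 patches from a ViT-B/16 at 224px
ids = torch.randint(0, 6400, (1, 32))
print(TinyVLM()(feats, ids).shape)     # torch.Size([1, 228, 512])
```

The appeal of this layout is that only the projection is new: the vision encoder and the language model can both be reused largely as-is.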
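
The page also names two training stages: pretraining, then supervised fine-tuning (SFT). A common way to run such a schedule is to train only the new projection first and unfreeze more of the model later; the toy loop below sketches that staging. The modules, random "data", and freezing policy here are assumptions for illustration, not the repository's actual scripts.

```python
import torch
import torch.nn as nn

# Toy two-stage schedule (pretrain -> SFT); purely illustrative.
model = nn.ModuleDict({
    "proj": nn.Linear(768, 512),   # new vision->LLM glue, trained from scratch
    "llm": nn.Linear(512, 512),    # stand-in for the transformer backbone
})

def run_stage(trainable, steps=3, lr=1e-3):
    """Train only the submodules named in `trainable`; freeze the rest."""
    for name, module in model.items():
        for p in module.parameters():
            p.requires_grad = name in trainable
    opt = torch.optim.AdamW(
        [p for p in model.parameters() if p.requires_grad], lr=lr)
    for _ in range(steps):
        feats = torch.randn(4, 768)                  # fake vision features
        loss = model["llm"](model["proj"](feats)).pow(2).mean()
        opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

run_stage({"proj"})           # stage 1: align the projection only
run_stage({"proj", "llm"})    # stage 2: fine-tune end to end
```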

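Finally, the headline numbers are easy to sanity-check from the page's own claims: taking GPT-3's published ~175B parameters, "about 1/7000 the size" puts the smallest MiniMind-V near 25M parameters, and "3 RMB for 2 hours" implies roughly 1.5 RMB/hour of GPU rental, which is plausible for a rented consumer GPU.

```python
# Sanity-checking the page's own numbers; GPT-3's ~175e9 parameter count is
# its published size, everything else follows from the claims above.
gpt3_params = 175e9
smallest_minimind_v = gpt3_params / 7000
print(f"~{smallest_minimind_v / 1e6:.0f}M parameters")             # -> ~25M

cost_rmb, train_hours = 3, 2
print(f"implied GPU rate: {cost_rmb / train_hours:.1f} RMB/hour")  # -> 1.5
```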