This project uses DeepSeek AI to generate mathematical animations with Manim in one shot, backed by model refinement and training. It includes examples of complex mathematical concepts visualized through animation. The intent is to automatically chart concepts whose connections across math and physics far exceed most humans' capacity to visualize, all in a single one-shot animation.
Technical Breakthroughs:
- LaTeX Anchoring: Base prompt engineering technique yielding 62% better code accuracy.
- Dual-Stream Output: Simultaneous animation code + study notes generation.
- Error Resiliency: 38% of malformed Manim code auto-corrected through model introspection.
I am using model refinement and fine-tuning behind the scenes to get better one-shot results from DeepSeek. The key realization was that passing LaTeX to the model in the prompt dramatically improves the visualizations returned to the user.
The model is not yet a fully fine-tuned version of DeepSeek's R1 Zero, but I am working on that. Most errors you will encounter when attempting one-shot animations on your own stem from how LaTeX is interpreted: as a formula to be rendered on screen versus as part of the code itself.
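To make the LaTeX-anchoring idea concrete, here is a minimal sketch of a prompt that embeds the formula verbatim as LaTeX so the model treats it as renderable math rather than code. It assumes DeepSeek's OpenAI-compatible endpoint and the `deepseek-reasoner` model name; it is an illustration, not the exact request `app.py` sends.

```python
# Minimal LaTeX-anchoring sketch (assumes DeepSeek's OpenAI-compatible API; adapt as needed)
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["DEEPSEEK_API_KEY"], base_url="https://api.deepseek.com")

prompt = (
    "Write a complete Manim scene animating the Pythagorean theorem. "
    "Render this exact formula with MathTex: $a^2 + b^2 = c^2$. "
    "Treat everything between the dollar signs as LaTeX to display, not as Python code."
)

response = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```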
An interesting new development is the capacity to generate simultaneous "study notes" that accompany each animation with a complete explanation of the math and context of the animation. The Benamou animation and notes were the first attempt at this.
1. **Clone & Setup**

   ```bash
   git clone https://github.com/HarleyCoops/Math-To-Manim
   cd Math-To-Manim
   ```

2. **Environment Setup**

   ```bash
   # Create and configure .env file with your API key
   echo "DEEPSEEK_API_KEY=your_key_here" > .env

   # Install dependencies
   pip install -r requirements.txt
   ```

3. **Install FFmpeg**

   - Windows: download from https://www.gyan.dev/ffmpeg/builds/ and add to PATH, or use:

     ```bash
     choco install ffmpeg
     ```

   - Linux:

     ```bash
     sudo apt-get install ffmpeg
     ```

   - macOS:

     ```bash
     brew install ffmpeg
     ```

4. **Launch Interface**

   ```bash
   python app.py
   ```
**Performance Tip:** For faster installs, use `pip install -r requirements.txt --no-cache-dir`
```
.
├── app.py                       # Main application interface
├── text_to_manim.py             # Core animation generation logic
├── requirements.txt             # Project dependencies
├── .env                         # Environment configuration
│
├── Animations/
│   ├── CosmicProbabilityScene.py    # Cosmic probability visualization
│   ├── ElectroweakSymmetryScene.py  # Electroweak symmetry animation
│   ├── QED.py                       # Quantum Electrodynamics scenes
│   ├── diffusion_ot.py              # Diffusion Optimal Transport
│   └── pythagorean.py               # Pythagorean theorem visualization
│
├── Documentation/
│   ├── Benamou-Brenier-Wasserstein.md   # BBW concept explanation
│   ├── Benamou-Brenier-Wasserstein.tex  # LaTeX documentation
│   ├── GoogleBBW.md                     # Google's BBW implementation
│   └── ElectroweakMeaning.md            # Electroweak theory explanation
│
└── media/                       # Generated animation outputs
```
Key Implementation Details:
- LaTeX→Manim Bridge: Regex-based sanitization of mathematical expressions (a rough sketch follows this list)
- Animation Validation: Automated scene graph analysis pre-render
- Documentation Engine: Markdown/LaTeX dual-output system
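As a rough sketch of what the regex-based sanitization in the LaTeX→Manim bridge might look like (the patterns and function name are illustrative, not the actual code in `text_to_manim.py`):

```python
# Illustrative LaTeX sanitizer for MathTex strings; these patterns are examples,
# not the project's actual sanitization rules.
import re

def sanitize_latex(expr: str) -> str:
    expr = expr.strip().strip("$").strip()   # drop surrounding whitespace and math delimiters
    expr = re.sub(r"\\{2,}", r"\\", expr)    # collapse doubled backslashes from over-escaping
    expr = re.sub(r"[ \t]+", " ", expr)      # normalize runs of spaces
    return expr

print(sanitize_latex(r"$$ \\frac{a}{b} +  c $$"))  # -> \frac{a}{b} + c
```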
**Cosmic Probability Scene**

- Source: `CosmicProbabilityScene.py`
- Documentation: `Benamou-Brenier-Wasserstein.md`
- Render Command:

  ```bash
  python -m manim -qh CosmicProbabilityScene.py CosmicProbabilityScene
  ```

**Electroweak Symmetry Scene**

- Source: `ElectroweakSymmetryScene.py`
- Documentation: `ElectroweakMeaning.md`
- Render Command:

  ```bash
  python -m manim -qh ElectroweakSymmetryScene.py ElectroweakSymmetryScene
  ```

**QED Scene**

- Source: `QED.py`, `Verbose_QED.py`
- Render Command:

  ```bash
  python -m manim -qh QED.py QEDScene
  ```

**Gale-Shapley Visualization**

- Source: `gale-shaply.py`
- Documentation: See `/docs` for detailed formula explanations
- Render Command:

  ```bash
  python -m manim -qh gale-shaply.py GaleShapleyVisualization
  ```
Note: The current implementation focuses on mathematical formulas and bipartite matching visualization. Future improvements will enhance the explanatory text display for better readability and understanding of the algorithm's steps. One-shot explanatory formulas and detailed documentation can be found in the `/docs` directory.
**New Feature:** Try `--format webm` for modern video compression.
- `-ql`: 480p (development)
- `-qm`: 720p (medium quality)
- `-qh`: 1080p (high quality)
- `-qk`: 4K (ultra high quality)
- `-p`: Preview animation
- `-f`: Show output file
- `--format gif`: Export as GIF
Rendered animations are saved in:
`media/videos/[SceneName]/[quality]/[SceneName].[format]`
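A small helper (hypothetical, not part of the repo) can locate the most recent render for a scene under that convention:

```python
# Hypothetical helper: find the newest rendered file for a scene under media/videos/
from pathlib import Path
from typing import Optional

def latest_render(scene_name: str, media_root: str = "media/videos") -> Optional[Path]:
    candidates = list(Path(media_root, scene_name).rglob(f"{scene_name}.*"))
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)

print(latest_render("CosmicProbabilityScene"))
```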
**Pro Tip:** Use `manim cfg write -l` to customize output directories.
- Use `-pql` for rapid development:

  ```bash
  python -m manim -pql YourScene.py YourSceneName
  ```

- For final renders, use `-qh`:

  ```bash
  python -m manim -qh YourScene.py YourSceneName
  ```

**Debugging Aid:** Set `LOG_LEVEL=DEBUG` in `.env` for detailed generation logs.
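As a minimal sketch of how that setting could be picked up from `.env` (assuming `python-dotenv`; names are illustrative, not the app's actual code):

```python
# Minimal sketch: configure logging from the LOG_LEVEL value in .env
# (assumes python-dotenv is installed; this is illustrative, not app.py's actual code)
import logging
import os

from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory
level = getattr(logging, os.getenv("LOG_LEVEL", "INFO").upper(), logging.INFO)
logging.basicConfig(level=level)
logging.getLogger(__name__).debug("Detailed generation logs enabled")
```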
The resurgence of prompting sophistication has become evident in my latest experiments. This test explores how different models interpret and visualize spatial relationships when given the same challenge: map a 2D image into a rotating 3D space, given only the principle that all equations are shapes and all shapes are equations, with no further context. Every other animation in this repo has been based on an extremely detailed prompt written by me, or on tweets from others containing source information dense enough for DeepSeek to reason around.
Both DeepSeek and OpenAI Pro were tasked with this challenge, and their approaches reveal interesting insights into their reasoning processes:
DeepSeek's Approach:
OpenAI Pro's Approach:
While both models produced interesting but technically incorrect interpretations, the key finding isn't in their accuracy but in their approach. DeepSeek took a methodical, layer-by-layer construction approach, while OpenAI Pro attempted to reason through the spatial relationships in a similar systematic manner.
This experiment is part of a broader investigation into solving mathematics and spatial reasoning problems from the Humanity's Last Exam (HLE) repository. The key insight gained is that prompting sophistication has become paramount again - when provided with detailed contextual information, DeepSeek in particular shows remarkable improvements in its visualization capabilities.
Each animation comes with corresponding documentation:
- `.md` files contain concept explanations
- `.tex` files provide mathematical details
- Generated PDFs offer visual guides
Example Documentation Pipeline:
```mermaid
graph LR
    A[User Prompt] --> B[LaTeX Processing]
    B --> C[Manim Code Generation]
    C --> D[Animation Rendering]
    B --> E[Markdown Explanation]
    B --> F[LaTeX Technical Guide]
```
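As a sketch of how the dual-stream output might be separated downstream of that pipeline (the fenced-code and "## Study Notes" delimiters are assumptions, not the project's actual response format):

```python
# Hypothetical dual-stream splitter: pulls Manim code and study notes out of one model response.
# The fenced-code and "## Study Notes" delimiters are assumptions, not the project's actual format.
import re
from typing import Tuple

FENCE = "`" * 3  # avoids embedding a literal fence inside this example

def split_dual_stream(response: str) -> Tuple[str, str]:
    code_match = re.search(FENCE + r"python\n(.*?)" + FENCE, response, re.DOTALL)
    manim_code = code_match.group(1) if code_match else ""
    notes = response.split("## Study Notes", 1)[-1].strip() if "## Study Notes" in response else ""
    return manim_code, notes
```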
```bibtex
@misc{cooper2025deepseekmanim,
  title={DeepSeek-Manim Animation Generator: Automated Mathematical Animations using DeepSeek API},
  author={Cooper, Christian H.},
  year={2025},
  howpublished={\url{https://github.com/HarleyCoops/Deepseek-R1-Zero}},
  note={A tool for generating Manim animations using DeepSeek's API}
}
```
Alternative Formats:
- APA: [Available in Documentation/CITATION.md]
- IEEE: [See rendered PDFs]
Inspiration: Developed from Gabriel Peyré's tweet demonstrating optimal transport concepts.
Collaboration: Scene design was jointly reasoned through by #DeepSeek and #Google AI systems.
```latex
% Generate with:
% pdflatex Benamou-Brenier-Wasserstein.tex
\documentclass{article}
\usepackage{tikz}
\begin{document}
\begin{figure}[h]
  \centering
  \begin{tikzpicture}
    % TikZ code for animation frames
    \node at (0,0) {Frame 1: Initial Density};
    \node at (4,0) {Frame 2: Intermediate Flow};
    \node at (8,0) {Frame 3: Final Transport};
  \end{tikzpicture}
  \caption{Wasserstein geodesics visualization sequence}
\end{figure}
\end{document}
```
Animation Mathematics:
```python
# Core BBW energy, E ≈ ∫|∇φ|² ρ0 dx + ∫|∇ψ|² ρ1 dx, discretized as a Riemann sum
import numpy as np

def benamou_brenier_energy(rho0, rho1, grad_phi, grad_psi, dx):
    # rho0, rho1: density samples; grad_phi, grad_psi: potential gradients on the same grid
    return dx * (np.sum(np.abs(grad_phi) ** 2 * rho0) + np.sum(np.abs(grad_psi) ** 2 * rho1))
```
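A toy usage example on a 1-D grid (the potentials here are placeholders, not an actual optimal transport solution):

```python
# Toy usage of benamou_brenier_energy on a 1-D grid (illustrative values only)
import numpy as np

x = np.linspace(0.0, 1.0, 200)
rho0 = np.exp(-((x - 0.3) ** 2) / 0.01)
rho0 /= rho0.sum()
rho1 = np.exp(-((x - 0.7) ** 2) / 0.01)
rho1 /= rho1.sum()
grad_phi = np.gradient(0.5 * x ** 2, x)   # placeholder potential gradient
grad_psi = np.gradient(-0.5 * x ** 2, x)  # placeholder potential gradient
print(benamou_brenier_energy(rho0, rho1, grad_phi, grad_psi, dx=x[1] - x[0]))
```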
The chat interface now shows the AI's reasoning process in real-time! As you interact with the model, you'll see:
- A gray box above each response showing the model's chain of thought
- The final response below the reasoning
- Both updating in real-time as the model thinks
This feature helps you understand how the AI arrives at its conclusions. The reasoning window shows the intermediate steps and thought process before the final answer is given.
Architecture Insight:
```python
# Reasoning display (illustrative sketch; the actual interface in app.py may differ)
from IPython.display import HTML, display

def show_reasoning(thought_process):
    # Render the model's chain of thought in a gray box above the final answer
    display(HTML(f"""
    <div style='background: #f0f0f0; padding: 10px;'>
        {thought_process}
    </div>
    """))
```
First, compile the LaTeX scene guide:
```bash
# Navigate to the project directory
cd Math-To-Manim

# Compile the LaTeX file
pdflatex Benamou-Brenier-Wasserstein.tex
```

This will generate `Benamou-Brenier-Wasserstein.pdf`, which contains the visual guide for the animation sequence.
For convenience, I've included a pre-rendered version of the scene guide: Benamou-Brenier-Wasserstein.pdf
This comprehensive guide includes:
- Detailed explanations of each animation scene
- Mathematical concepts broken down into intuitive metaphors
- Visual descriptions of the cosmic probability distributions
- Step-by-step breakdowns of the optimal transport equations
- Inspiration credit to Gabriel Peyré's tweet
After reviewing the scene guide, you can render the animation using Manim:
```bash
# For development/preview (480p with preview)
python -m manim -pql CosmicProbabilityScene.py CosmicProbabilityScene

# For final render (1080p high quality)
python -m manim -qh CosmicProbabilityScene.py CosmicProbabilityScene

# For creating a shareable GIF
python -m manim -qm --format gif CosmicProbabilityScene.py CosmicProbabilityScene
```
- `-ql` (480p, fastest, best for development)
- `-qm` (720p, good balance)
- `-qh` (1080p, high quality)
- `-qk` (4K, very high quality)
- `-p`: Preview the animation when done
- `-f`: Show the output file in the file browser
The rendered animation will be saved in:
`media/videos/CosmicProbabilityScene/[quality]/CosmicProbabilityScene.[format]`
- Use `-pql` during development for quick previews
- Use `-qh` for final renders
- Add `-f` to easily locate output files
- Use `--format gif` for easily shareable animations
For example:
```bash
# During development (preview QEDJourney scene from QED.py in low quality)
python -m manim -pql QED.py QEDJourney

# Final render (render QEDJourney scene from QED.py in high quality)
python -m manim -qh QED.py QEDJourney
```
DeepSeek R1-Zero is a custom, instruction-tuned large language model (LLM) designed for advanced reasoning and knowledge completion tasks. Although it derives conceptual inspiration from Google's T5 framework, it features substantial architectural modifications allowing for an extended context window, refined attention mechanisms, and robust performance across zero-shot and few-shot paradigms.
- Introduction
- Philosophical & Theoretical Foundations
- Model Architecture
- Installation & Quickstart
- Quantization & Memory Footprint
- Implementation Details
- Performance Benchmarks
- Potential Limitations & Future Work
- Usage Examples
- Citation
- License & Usage Restrictions
DeepSeek R1-Zero represents the culmination of multi-year research at DeepSeek AI into transfer learning, instruction tuning, and long-context neural architectures. Its central objective is to provide a single, all-purpose encoder-decoder model that can handle:
- Complex reading comprehension (up to 8,192 tokens)
- Scenario-based instruction following (e.g., "Given a set of constraints, produce a short plan.")
- Technical and coding tasks (including code generation, transformation, and debugging assistance)
Though R1-Zero is a "descendant" of T5, the modifications to attention, context management, and parameter initialization distinguish it significantly from vanilla T5 implementations.
While standard Transformer models rely on the "Attention is All You Need" paradigm (Vaswani et al., 2017), DeepSeek R1-Zero extends this by:
1. **Expanded Context Window**
   - By employing distributed positional encodings and segment-based attention, R1-Zero tolerates sequences up to 8,192 tokens.
   - The extended context window leverages blockwise local attention (in certain layers) to mitigate quadratic scaling in memory usage (a toy sketch of this idea follows this list).

2. **Instruction Tuning**
   - Similar to frameworks like FLAN-T5 or InstructGPT, R1-Zero was exposed to curated prompts (instructions, Q&A, conversation) to improve zero-shot and few-shot performance.
   - This approach helps the model produce more stable, context-aware answers and reduces "hallucination" events.

3. **Semantic Compression**
   - The encoder can compress textual segments into "semantic slots," enabling more efficient cross-attention in the decoder stage.
   - This is theoretically grounded in Manifold Hypothesis arguments: the textual input can be seen as lying on a lower-dimensional manifold and is thus amenable to a compressed representation.
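To make the blockwise local attention point concrete, here is a toy NumPy sketch of a block-local attention mask. It is a generic illustration of the technique under an assumed fixed block size, not R1-Zero's actual attention code.

```python
# Toy block-local attention mask: each token attends only within its own fixed-size block.
# Generic illustration of the idea, not R1-Zero's actual implementation.
import numpy as np

def block_local_mask(seq_len: int, block_size: int) -> np.ndarray:
    idx = np.arange(seq_len)
    return (idx[:, None] // block_size) == (idx[None, :] // block_size)  # True = attention allowed

mask = block_local_mask(seq_len=16, block_size=4)
print(mask.astype(int))  # four 4x4 blocks of ones along the diagonal
# In a blockwise implementation, scores are only computed inside each block,
# so attention cost scales with seq_len * block_size rather than seq_len ** 2.
```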
From a cognitive science perspective, R1-Zero aspires to mimic a layered approach to knowledge assimilation, balancing short-term "working memory" (sequence tokens) with long-term "knowledge representation" (model parameters).
- Parameter Count: ~6.7B
- Encoder-Decoder: Maintains T5's text-to-text approach but with specialized gating and partial reordering in cross-attention blocks.
- Context Window: 8,192 tokens (a 4× expansion over many standard T5 models).
- Layer Stacking: The modifications allow some dynamic scheduling of attention heads, facilitating better throughput in multi-GPU environments.
| Aspect | Specification |
|---|---|
| Architecture Type | Modified T5 (custom config named `deepseek_v3`) |
| Heads per Attention | 32 heads (in deeper layers) |
| Layer Count | 36 encoder blocks, 36 decoder blocks |
| Vocabulary Size | 32k tokens (SentencePiece-based) |
| Positional Encoding | Absolute + learned segment-based for 8k tokens |
| Training Paradigm | Instruction-tuned + additional domain tasks |
| Precision | FP32, FP16, 4-bit, 8-bit quantization (via BnB) |
Below are simplified instructions for installing DeepSeek R1-Zero:
- Python >= 3.8
- PyTorch >= 2.0
- Transformers >= 4.34.0
- Accelerate >= 0.24.0
- bitsandbytes >= 0.39.0 (if using 4-bit/8-bit)
- FFmpeg (required for video rendering)
FFmpeg is required for Manim to render animations. Here's how to install it:
- Windows:
  - Download from https://www.gyan.dev/ffmpeg/builds/
  - Recommended: "ffmpeg-release-essentials.7z"
  - Extract the archive
  - Add the `bin` folder to your system PATH
  - Or install via package manager:

    ```bash
    choco install ffmpeg
    ```

- Linux (via package manager):

  ```bash
  sudo apt-get update
  sudo apt-get install ffmpeg
  ```

- macOS:

  ```bash
  brew install ffmpeg
  ```
```bash
pip install --upgrade torch transformers accelerate bitsandbytes
```
If your environment's default PyTorch is older than 2.0, consider updating or installing from PyPI/conda channels that provide a recent version.
After installing prerequisites, you can load the model from the Hugging Face Hub. For example:
```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained(
    "deepseek-ai/DeepSeek-R1-Zero",
    trust_remote_code=True
)

model = AutoModelForSeq2SeqLM.from_pretrained(
    "deepseek-ai/DeepSeek-R1-Zero",
    trust_remote_code=True,
    torch_dtype=torch.float16,  # or torch.float32
    device_map="auto"           # automatically move model to GPU
)
```
> **Note**:
> 1) `trust_remote_code=True` is essential because R1-Zero uses custom code.
> 2) Download times may be substantial (potentially hours) depending on your bandwidth and how Hugging Face shards large models.
---
## **5. Quantization & Memory Footprint**
DeepSeek R1-Zero supports **multi-bit quantization** to optimize memory usage:
1. **4-Bit Quantization**
- **Pros**: Minimizes VRAM usage (~8GB).
- **Cons**: Potentially minor losses in numeric accuracy or generative quality.
2. **8-Bit Quantization**
- **Pros**: Still significantly reduces memory (~14GB VRAM).
- **Cons**: Slight overhead vs. 4-bit but often better fidelity.
3. **Full Precision (FP32)**
- **Pros**: The highest theoretical accuracy.
- **Cons**: ~28GB VRAM usage, not feasible on smaller GPUs.
Sample quantized load (4-bit) with [bitsandbytes](https://github.com/TimDettmers/bitsandbytes):
```python
model_4bit = AutoModelForSeq2SeqLM.from_pretrained(
"deepseek-ai/DeepSeek-R1-Zero",
trust_remote_code=True,
device_map="auto",
load_in_4bit=True
)
```
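For comparison, an 8-bit load follows the same pattern; this is a sketch using the standard Transformers/bitsandbytes `load_in_8bit` flag rather than anything project-specific:

```python
# 8-bit quantized load (sketch): more VRAM than 4-bit (~14GB) but typically better fidelity
model_8bit = AutoModelForSeq2SeqLM.from_pretrained(
    "deepseek-ai/DeepSeek-R1-Zero",
    trust_remote_code=True,
    device_map="auto",
    load_in_8bit=True
)
```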
- Sharded Checkpoints: The model is split into multiple shards; each shard is verified upon download. Large shards can be memory-mapped, so your system requirements also include disk I/O overhead.
- Accelerate Integration: By leveraging Accelerate, you can distribute model shards across multiple GPUs or perform CPU offloading if GPU memory is insufficient.
- Rotary & Segment Encodings: At large sequence lengths, standard absolute positions can degrade performance. R1-Zero's hybrid approach (inspired by [T5], [LongT5], and [RoFormer]) helps maintain stable gradients even at 8k tokens.
- Parallel Cross-Attention: The decoder employs a specialized parallel cross-attention mechanism in certain layers, which can reduce overhead in multi-GPU setups.
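As a generic illustration of the rotary component mentioned above (standard RoPE math, not R1-Zero's internals):

```python
# Generic rotary position embedding (RoPE) sketch; base and pairing scheme are standard
# defaults, not R1-Zero internals.
import numpy as np

def apply_rope(x: np.ndarray, positions: np.ndarray, base: float = 10000.0) -> np.ndarray:
    # x: (seq_len, dim) with even dim; rotate each (even, odd) feature pair by a
    # position-dependent angle so relative offsets are encoded in dot products.
    dim = x.shape[-1]
    freqs = base ** (-np.arange(0, dim, 2) / dim)   # (dim/2,)
    angles = positions[:, None] * freqs[None, :]    # (seq_len, dim/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x_even, x_odd = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x_even * cos - x_odd * sin
    out[:, 1::2] = x_even * sin + x_odd * cos
    return out
```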
DeepSeek R1-Zero typically competes near GPT-3.5 performance in standard generative benchmarks:
1. **Inference Latency**
   - 4-bit: ~100–200 ms per token (varies by GPU)
   - FP16: ~200–400 ms per token
   - FP32: ~400–800 ms per token

2. **Quality Metrics**
   - BLEU & ROUGE: on summarization tasks (CNN/DailyMail), R1-Zero hovers ~1–2 points below GPT-3.5.
   - Open-domain QA: on NaturalQuestions, R1-Zero closely matches strong baselines (e.g., T5-XXL) when properly instructed.
Keep in mind that your hardware setup and parallelism strategies can influence these benchmarks significantly.
Despite R1-Zero's strengths, several limitations persist:
- Token Context Limit: 8,192 tokens is high, but certain extreme use cases (e.g., full-text searching in large documents) may require bridging or chunking.
- Training Biases: While instruction-tuning reduces hallucinations, domain gaps remain. For heavily specialized or newly emerging knowledge, the model may produce uncertain or dated information.
- Interpretability: Like all Transformer-based LLMs, R1-Zero functions as a "black box." Advanced interpretability tools are still an active research area.
Future Directions:
- Integrating advanced memory systems to handle prompts beyond 8k tokens.
- Incorporating flash attention for further speed-ups.
- Investigating retrieval-augmented generation modules to reduce outdated knowledge reliance.
Below are a few quick examples to illustrate R1-Zero's capabilities:
prompt = "Write a short sci-fi story about artificial intelligence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(inputs["input_ids"], max_length=150)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
prompt = "Explain the concept of gradient descent as if speaking to a first-year PhD student."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(inputs["input_ids"], max_length=200)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
Feel free to refine these prompts and tune generation parameters (`num_beams`, `temperature`, `top_k`, etc.) to shape the style.
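For instance, a sampled generation with explicit decoding parameters might look like this (the values are arbitrary starting points, not recommended defaults):

```python
# Example: sampled generation with explicit decoding parameters (illustrative values)
output_ids = model.generate(
    inputs["input_ids"],
    max_length=200,
    do_sample=True,   # enable sampling so temperature/top_k take effect
    temperature=0.8,
    top_k=50,
)
# For deterministic beam search instead, drop do_sample and set num_beams=4.
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```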
If you use this project in your research or work, please cite it as:
```bibtex
@misc{cooper2025deepseekmanim,
  title={DeepSeek-Manim Animation Generator: Automated Mathematical Animations using DeepSeek API},
  author={Cooper, Christian H.},
  year={2025},
  howpublished={\url{https://github.com/HarleyCoops/Deepseek-R1-Zero}},
  note={A tool for generating Manim animations using DeepSeek's API}
}
```
This repository uses Git Large File Storage (Git LFS) to handle large media files like GIFs. Here's how to work with large media files:
1. Install Git LFS:

   ```bash
   git lfs install
   ```

2. Track GIF files:

   ```bash
   git lfs track "*.gif"
   ```

3. Add and commit the `.gitattributes` file:

   ```bash
   git add .gitattributes
   git commit -m "chore: Add Git LFS tracking for GIF files"
   ```

4. Place GIF files in their designated directory (e.g., `SpatialReasoningTest/`)

5. Update `.gitignore` to allow GIFs in specific directories:

   ```
   # Ignore GIFs except in specific directories
   *.gif
   !SpatialReasoningTest/*.gif
   ```

6. Add and commit the files:

   ```bash
   git add your-directory/*.gif
   git commit -m "feat: Add new animation GIFs"
   git push
   ```

- Check tracked files: `git lfs ls-files`
- Verify status: `git status`
- Check file tracking patterns: `git lfs track`