zucchini-nlp

🦄

To code or not to code

Raushan Turganbay zucchini-nlp

ML Engineer at 🤗 | Generation & Multimodality

100 followers · 9 following

Lists (1)

🔮 Future ideas

Stars

Blaizzy / mlx-vlm

MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.

Python 935 84 Updated Mar 9, 2025

Deep-Agent / R1-V

Witness the aha moment of VLM with less than $3.

Python 3,140 245 Updated Mar 1, 2025

deepseek-ai / Janus

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 16,658 2,182 Updated Feb 1, 2025

haoliuhl / language-quantized-autoencoders

Language Quantized AutoEncoders

Python 101 5 Updated Feb 7, 2023

deepseek-ai / DeepSeek-R1

85,884 11,083 Updated Feb 24, 2025

NVIDIA / Cosmos

Cosmos is a world model development platform that consists of world foundation models, tokenizers and video processing pipeline to accelerate the development of Physical AI at Robotics & AV labs. C…

Jupyter Notebook 7,655 492 Updated Mar 7, 2025

thu-ml / SageAttention

Quantized Attention that achieves speedups of 2.1-3.1x and 2.7-5.1x compared to FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models.

Cuda 1,103 66 Updated Feb 28, 2025

Aniezka / ai-texts-detection

Investigating the Detection of ChatGPT-Generated Texts across Radiology Reports

Jupyter Notebook 1 Updated Dec 25, 2024

microsoft / VidTok

a family of versatile and state-of-the-art video tokenizers.

Python 350 20 Updated Jan 15, 2025

facebookresearch / blt

Code for BLT research paper

Python 1,432 110 Updated Mar 5, 2025

NVIDIA / kvpress

LLM KV cache compression made easy

Python 428 29 Updated Mar 5, 2025

uploadcare / pillow-simd

Forked from python-pillow/Pillow

The friendly PIL fork

Python 2,214 90 Updated Oct 7, 2024

imageio / imageio

Python library for reading and writing image data

Python 1,563 305 Updated Feb 21, 2025

pytorch / torchcodec

PyTorch video decoding

Python 255 23 Updated Mar 10, 2025

xjdr-alt / entropix

Entropy Based Sampling and Parallel CoT Decoding

Python 3,341 319 Updated Nov 13, 2024

bytedance / Shot2Story

A new multi-shot video understanding benchmark Shot2Story with comprehensive video summaries and detailed shot-level captions.

Python 122 7 Updated Jan 30, 2025

pytorch / torchtune

PyTorch native post-training library

Python 4,974 550 Updated Mar 10, 2025

sayakpaul / diffusers-torchao

End-to-end recipes for optimizing diffusion models with torchao and diffusers (inference and FP8 training).

Python 330 11 Updated Feb 19, 2025

leloykun / mmsg

Generate interleaved text and image content in a structured format you can directly pass to downstream APIs.

Python 26 3 Updated Oct 18, 2024

facebookresearch / unibench

Python Library to evaluate VLM models' robustness across diverse benchmarks

Jupyter Notebook 194 14 Updated Feb 28, 2025

huggingface / transformers.js

State-of-the-art Machine Learning for the web. Run 🤗 Transformers directly in your browser, with no need for a server!

JavaScript 13,170 872 Updated Mar 7, 2025

VITA-MLLM / VITA

✨✨VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Python 2,147 164 Updated Feb 13, 2025

OpenBMB / MiniCPM-o

MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone

Python 18,912 1,358 Updated Mar 3, 2025

bfshi / scaling_on_scales

When do we not need larger vision models?

Python 373 12 Updated Feb 8, 2025

mumtozee / MiptRL

MIPT RL Course HW Solutions Spring 2024

Jupyter Notebook 1 Updated Jul 2, 2024

vikhyat / moondream

tiny vision language model

Python 7,565 585 Updated Feb 25, 2025

CircleRadon / TokenPacker

The code for "TokenPacker: Efficient Visual Projector for Multimodal LLM".

Python 237 8 Updated Dec 26, 2024

GAIR-NLP / anole

Anole: An Open, Autoregressive and Native Multimodal Models for Interleaved Image-Text Generation

Python 730 42 Updated Aug 5, 2024

microsoft / XPretrain

Multi-modality pre-training

Python 486 37 Updated May 8, 2024

microsoft / i-Code

Jupyter Notebook 1,691 162 Updated Sep 27, 2024