Stars
Fully open reproduction of DeepSeek-R1
[NeurIPS 2024 Spotlight] Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought Reasoning
Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq] (a minimal call sketch appears after this list)
A curated list of awesome open-source libraries for production LLMs
Finetune Llama 3.3, DeepSeek-R1 & Reasoning LLMs 2x faster with 70% less memory! 🦥 (see the finetuning sketch after this list)
MINT-1T: A one trillion token multimodal interleaved dataset.
[ACL 2024]: TextBind: Multi-turn Interleaved Multimodal Instruction-following in the Wild
[CVPR 2024] Official implementation of the paper "Salience DETR: Enhancing Detection Transformer with Hierarchical Salience Filtering Refinement"
A Framework of Small-scale Large Multimodal Models
ModelScope-Agent: An agent framework connecting models in ModelScope with the world
CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts
Tile primitives for speedy kernels
[CVPR 2024] Official implementation of the paper "Visual In-context Learning"
Start building LLM-empowered multi-agent applications in an easier way.
🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)
[NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
A one-stop repository for generative AI research updates, interview resources, notebooks and much more!
Use PEFT or Full-parameter to finetune 450+ LLMs (Qwen2.5, InternLM3, GLM4, Llama3.3, Mistral, Yi1.5, Baichuan2, DeepSeek-R1, ...) and 150+ MLLMs (Qwen2.5-VL, Qwen2-Audio, Llama3.2-Vision, Llava, I…
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
Here we will keep track of the latest AI Game Development Tools, including LLM, Agent, Code, Writer, Image, Texture, Shader, 3D Model, Animation, Video, Audio, Music, Singing Voice and Analytics. 🔥
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Must-read Papers on Large Language Model (LLM) Planning.
This repository collects papers for "A Survey on Knowledge Distillation of Large Language Models". We break down KD into Knowledge Elicitation and Distillation Algorithms, and explore the Skill & V…
Multilingual Medicine: Model, Dataset, Benchmark, Code
The Cradle framework is a first attempt at General Computer Control (GCC). Cradle supports agents in acing any computer task by enabling strong reasoning abilities, self-improvement, and skill curatio…
Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)
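The LiteLLM entry above advertises calling 100+ provider APIs through a single OpenAI-format interface. As a rough illustration, a minimal sketch of such a call might look like the following; it assumes the litellm package is installed, an OPENAI_API_KEY is set in the environment, and the model string is only a placeholder.

```python
# Minimal sketch: one OpenAI-format call routed through LiteLLM.
# Assumes `pip install litellm` and an OPENAI_API_KEY in the environment;
# the model string below is a placeholder, not a recommendation.
from litellm import completion

response = completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "In one sentence, what is chain-of-thought prompting?"}],
)

# LiteLLM mirrors the OpenAI response shape, so the reply sits in choices[0].
print(response.choices[0].message.content)
```

Per the description above, switching providers should mostly be a matter of changing the model string (e.g. to a Bedrock or Anthropic identifier) while the calling code stays the same.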
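The Unsloth entry above claims roughly 2x faster finetuning with far less memory; the sketch below is a hedged guess at what a minimal LoRA setup with its FastLanguageModel helper might look like. It assumes a CUDA GPU, the unsloth package, and the checkpoint name shown, which is an illustrative assumption rather than the only supported option.

```python
# Hedged sketch of preparing a model for LoRA finetuning with Unsloth.
# Assumes a CUDA GPU and `pip install unsloth`; the checkpoint name is
# an illustrative assumption.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.2-1B-Instruct",  # assumed example checkpoint
    max_seq_length=2048,
    load_in_4bit=True,  # 4-bit loading is one source of the memory savings
)

# Attach LoRA adapters so only a small set of low-rank weights is trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_alpha=16,
)
# From here the model/tokenizer pair can be handed to a standard trainer
# (e.g. TRL's SFTTrainer), as the repository's example notebooks do.
```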