From e62030f43e253ee45203bbbfb9ae9923fe86ba66 Mon Sep 17 00:00:00 2001 From: GitHub Actions Bot <> Date: Sat, 28 Sep 2024 00:50:52 +0000 Subject: [PATCH] Update collated assets CSV. --- resources/all_assets.csv | 25 ------------------------- 1 file changed, 25 deletions(-) diff --git a/resources/all_assets.csv b/resources/all_assets.csv index 696bdb59..e6522311 100644 --- a/resources/all_assets.csv +++ b/resources/all_assets.csv @@ -89,9 +89,6 @@ model,Aya 23,Cohere,Aya 23 is an open weights research release of an instruction model,Deepseek,Deepseek AI,Deepseek is a 67B parameter model with Grouped-Query Attention trained on 2 trillion tokens from scratch.,2023-11-28,https://github.com/deepseek-ai/DeepSeek-LLM,https://huggingface.co/deepseek-ai/deepseek-llm-67b-base,text; text,"Deepseek and baseline models (for comparison) evaluated on a series of representative benchmarks, both in English and Chinese.",67B parameters (dense),[],unknown,unknown,unknown,Training dataset comprised of diverse data composition and pruned and deduplicated.,open,custom,,,unknown,https://huggingface.co/deepseek-ai/deepseek-llm-67b-base/discussions,,,,,,,,,, model,Deepseek Chat,Deepseek AI,Deepseek Chat is a 67B parameter model initialized from Deepseek and fine-tuned on extra instruction data.,2023-11-29,https://github.com/deepseek-ai/DeepSeek-LLM,https://huggingface.co/deepseek-ai/deepseek-llm-67b-chat,text; text,"Deepseek and baseline models (for comparison) evaluated on a series of representative benchmarks, both in English and Chinese.",67B parameters (dense),['Deepseek'],unknown,unknown,unknown,Training dataset comprised of diverse data composition and pruned and deduplicated.,open,custom,,,unknown,https://huggingface.co/deepseek-ai/deepseek-llm-67b-chat/discussions,,,,,,,,,, model,Deepseek Coder,Deepseek AI,"Deepseek Coder is composed of a series of code language models, each trained from scratch on 2T tokens, with a composition of 87% code and 13% natural language in both English and Chinese.",2023-11-03,https://github.com/deepseek-ai/DeepSeek-Coder,https://huggingface.co/deepseek-ai/deepseek-coder-33b-base,text; code,"Evaluated on code generation, code completion, cross-file code completion, and program-based math reasoning across standard benchmarks.",33B parameters (dense),[],unknown,unknown,8 NVIDIA A100 GPUs and 8 NVIDIA H800 GPUs,,open,custom,,,unknkown,https://huggingface.co/deepseek-ai/deepseek-coder-33b-base/discussions,,,,,,,,,, -model,Stable Diffusion 3 Medium,Stability AI,"Stable Diffusion 3 Medium is Stability AI’s advanced text-to-image open model. It's suitable for running on consumer PCs and laptops as well as enterprise-tier GPUs. The model is known for its overall Quality and Photorealism, prompt understanding, typography, being resource-efficient, and being fine-tuned. The model in collaboration with NVIDIA and AMD has enhanced performance.",2024-06-12,https://stability.ai/news/stable-diffusion-3-medium,unknown,text; image,The model was tested extensively internally and externally. It has developed and implemented numerous safeguards to prevent harms. They have also received user feedback to make continuous improvements.,2B parameters,[],Unknown,Unknown,unknown,"They have conducted extensive internal and external testing of this model and have implemented numerous safeguards to prevent harms. 
Safety measures were implemented from the start of training the model and continued throughout testing, evaluation, and deployment.",open,Stability Community License,"The model can be used by professional artists, designers, developers, and AI enthusiasts for creating high-quality image outputs from text inputs.",Large-scale commercial use requires contacting the organization for licensing details. The model should not be used for any purpose that does not adhere to the usage guidelines.,"Continuous collaboration with researchers, experts, and the community to ensure that the model is being used appropriately.","Feedback can be given through Twitter, Instagram, LinkedIn, or Discord Community.",,,,,,,,,, -model,Stable Video 4D,Stability AI,"Stable Video 4D is our latest AI model for dynamic multi-angle video generation. It allows users to upload a single video and receive novel-view videos of eight new angles/views. This advancement moves from image-based video generation to full 3D dynamic video synthesis. Users can specify camera angles, tailoring the output to meet specific creative needs. The model is currently available on Hugging Face and can generate 5-frame videos across the 8 views in about 40 seconds.",2024-07-24,https://stability.ai/news/stable-video-4d,unknown,video; video,"Consistency across the spatial and temporal axes greatly improves with this model. Stable Video 4D is able to generate novel view videos that are more detailed, faithful to the input video, and are consistent across frames and views compared to existing works.",Unknown,['Stable Video Diffusion Model'],Unknown,Unknown,Unknown,The Stability AI team is dedicated to continuous innovation and exploration of real-world use-cases for this model and others. They are actively working to refine and optimize the model beyond the current synthetic datasets it has been trained on.,open,Stability Community License,"This model can be used for creating dynamic multi-angle videos, with applications in game development, video editing, and virtual reality. It allows professionals in these fields to visualize objects from multiple angles, enhancing the realism and immersion of their products.",Unknown,Continuous monitoring by the Stability AI team for improvements and refinements.,"Feedback and reports about the progress should be shared via their social channels like Twitter, Instagram, LinkedIn or their Discord Community.",,,,,,,,,, -model,Stable Fast 3D,Stability AI,"Stable Fast 3D is a ground-breaking model in 3D asset generation technology. It can transform a single input image into a highly detailed 3D asset in around half a second, setting new standards in terms of speed and quality in the realm of 3D reconstruction. Users start the process by uploading an image of an object. Stable Fast 3D then swiftly generates a complete 3D asset, which includes, UV unwrapped mesh, material parameters, albedo colors with reduced illumination bake-in, and optional quad or triangle remeshing. This model has various applications, notably for game and virtual reality developers, as well as professionals in retail, architecture, design, and other graphic-intensive professions.",2024-08-01,https://stability.ai/news/introducing-stable-fast-3d,https://huggingface.co/stabilityai/stable-fast-3d,image; 3D,"The model was evaluated on its ability to quickly and accurately transform a single image into a detailed 3D asset. 
This evaluation highlighted the model's unprecedented speed and quality, marking it as a valuable tool for rapid prototyping in 3D work. Compared to the previous SV3D model, Stable Fast 3D offers significantly reduced inference times--0.5 seconds versus 10 minutes--while maintaining high-quality output.",unknown,['TripoSR'],Unknown,Unknown,unknown,Unknown,open,Stability Community License,"The model is intended for use in game development, virtual reality, retail, architecture, design and other graphically intense professions. It allows for rapid prototyping in 3D work, assisting both enterprises and indie developers. It's also used in movie production for creating static assets for games and 3D models for e-commerce, as well as fast model creation for AR/VR.",Use by individuals or organizations with over $1M in annual revenue without obtaining an Enterprise License.,Unknown,Information on any downstream issues with the model can be reported to Stability AI through their support request system.,,,,,,,,,, model,VARCO-LLM,NCSOFT,VARCO-LLM is NCSOFT’s large language model and is trained on English and Korean.,2023-08-16,https://github.com/ncsoft/ncresearch,,text; text,"Boasts the highest performance among the Korean LLMs of similar sizes that have been released to date, according to internal evaluations.",13B parameters,[],unknown,unknown,unknown,,closed,custom,"Developing various NLP-based AI services such as Q&A, chatbot, summarization, information extraction",,,,,,,,,,,,, model,BioMedLM,Stanford,,2022-12-15,https://crfm.stanford.edu/2022/12/15/pubmedgpt.html,,text; text,,2.7B parameters (dense),['The Pile'],,,,,open,bigscience-bloom-rail-1.0,,,,,,,,,,,,,, model,RoentGen,Stanford,RoentGen is a generative medical imaging model that can create visually convincing X-ray images.,2022-11-23,https://arxiv.org/pdf/2211.12737.pdf,,text; image,Evaluated on own framework that tests domain-specific tasks in medical field.,330M parameters (dense),"['Stable Diffusion', 'RoentGen radiology dataset']",unknown,60k training steps per day,64 A100 GPUs,,open,,,,,,,,,,,,,,, @@ -100,7 +97,6 @@ dataset,Alpaca dataset,Stanford,"Alpaca dataset consistes of 52,000 instruction- ",2023-03-13,https://crfm.stanford.edu/2023/03/13/alpaca.html,,text (English),,52K instruction-following demonstrations,['text-davinci-003'],,,,,open,CC BY-NC 4.0,Alpaca is intended and licensed for research use only.,,,Feedback can be provided on [[GitHub Issues]](https://github.com/tatsu-lab/stanford_alpaca/issues).,,,,,,,https://huggingface.co/datasets/tatsu-lab/alpaca,[],, model,Alpaca,Stanford,"Alpaca-7B is an instruction-following model fine-tuned from the LLaMA 7B model on 52K instruction-following demonstrations. ",2023-03-13,https://crfm.stanford.edu/2023/03/13/alpaca.html,,text (English),,7B parameters (dense model),"['LLaMa', 'Alpaca dataset']",unknown,,,,open,CC BY NC 4.0 (model weights),Alpaca is intended and licensed for research use only.,,,Feedback can be provided on [[GitHub Issues]](https://github.com/tatsu-lab/stanford_alpaca/issues).,,,,,,,,,, -model,Merlin,"Stanford Center for Artificial Intelligence in Medicine and Imaging, Stanford University","Merlin is a 3D Vision Language Model that's designed for interpretation of abdominal computed tomography (CT) scans. It uses both structured Electronic Health Record (EHR) and unstructured radiology reports for supervision without requiring additional manual annotations. 
The model was trained on a high-quality clinical dataset of paired CT scans, EHR diagnosis codes, and radiology reports and was evaluated on 6 task types and 752 individual tasks.",2024-09-08,https://arxiv.org/pdf/2406.06512,unknown,image; text,"Merlin has been comprehensively evaluated on 6 task types and 752 individual tasks. The non-adapted (off-the-shelf) tasks include zero-shot findings classification, phenotype classification, and zero-shot cross-modal retrieval, while model adapted tasks include 5-year chronic disease prediction, radiology report generation, and 3D semantic segmentation. It has undergone internal validation on a test set of 5,137 CTs, and external validation on 7,000 clinical CTs and on two public CT datasets (VerSe, TotalSegmentator).",Unknown,[],Unknown,Unknown,Single GPU.,The model has undergone extensive evaluations and also internal and external validation tests.,open,Unknown,"This model is intended for use in the interpretation of abdominal computed tomography (CT) scans, chronic disease prediction, radiology report generation, and 3D semantic segmentation.","The model should not be used outside of healthcare-related context, such as for personal or non-medical commercial purposes.",Unknown,"Feedback and reports for problems with the model should likely be routed to Stanford Center for Artificial Intelligence in Medicine and Imaging, or the corresponding author of the research (louis.blankemeier@stanford.edu).",,,,,,,,,, application,HyperWrite,OthersideAI,"HyperWrite is a writing assistant that generates text based on a user's request, as well as style and tone choices. ",,https://hyperwriteai.com/,,,,,['OpenAI API'],,,,unknown,limited,custom,"HyperWrite is intended to be used as a writing assistant. ",unknown,unknown,unknown,unknown,Generation,https://hyperwriteai.com/terms,unknown,unknown,unknown,,,, @@ -213,8 +209,6 @@ model,Gemini,Google,"As of release, Gemini is Google's most capable and flexible model,TimesFM,Google,TimesFM is a single forecasting model pre-trained on a large time-series corpus of 100 billion real world time-points.,2024-02-02,https://blog.research.google/2024/02/a-decoder-only-foundation-model-for.html,,,Evaluated on popular time-series benchmarks.,200M parameters (dense),[],unknown,unknown,unknown,,closed,unknown,,,unknown,,,,,,,,,,, model,Gemma,Google,"Gemma is a family of lightweight, state-of-the-art open models from Google, based on the Gemini models. 
They are text-to-text, decoder-only large language models, available in English.",2024-02-21,https://blog.google/technology/developers/gemma-open-models/,https://huggingface.co/google/gemma-7b,text; text,Evaluation was conducted on standard LLM benchmarks and includes internal red-teaming testing of relevant content policies.,7B parameters (dense),[],unknown,unknown,TPUv5e,"Multiple evaluations and red-teaming conducted, with particular focus on ethics, bias, fair use cases, and safety.",open,custom,"Text generation tasks including question answering, summarization, and reasoning; content creation, communication, research, and education.",Prohibited uses are specified in the Gemma Prohibited Use Policy here https://ai.google.dev/gemma/prohibited_use_policy,,https://huggingface.co/google/gemma-7b/discussions,,,,,,,,,, model,Med-Gemini,Google,"Med-Gemini is a family of highly capable multimodal models that are specialized in medicine with the ability to seamlessly integrate the use of web search, and that can be efficiently tailored to novel modalities using custom encoders.",2024-04-29,https://arxiv.org/pdf/2404.18416,,"image, text; text","Evaluated Med-Gemini on 14 medical benchmarks spanning text, multimodal and long-context applications, establishing new state-of-the-art (SoTA) performance on 10 of them, and surpassing the GPT-4 model family on every benchmark where a direct comparison is viable.",unknown,"['Gemini', 'MultiMedBench']",unknown,unknown,unknown,,closed,unknown,"To be used in areas of medical research including medical summarization, referral letter generation, and medical simplification tasks.",Unfit for real-world deployment in the safety-critical medical domain.,,,,,,,,,,,, -model,Imagen 3,Google DeepMind,"Imagen 3 is a high-quality text-to-image model capable of generating images with improved detail, richer lighting, and fewer distracting artifacts. It features improved prompt understanding and can be used to generate a wide array of visual styles from quick sketches to high-resolution images. The model is available in multiple versions, each optimized for particular types of tasks. Imagen 3 has been trained to capture nuances like specific camera angles or compositions in long, complex prompts, making it a versatile tool for image generation from textual inputs.",2024-09-05,https://deepmind.google/technologies/imagen-3/,unknown,text; image,Unknown,Unknown,[],Unknown,Unknown,Unknown,Unknown,open,Unknown,"Imagen 3 is intended to be used for generation of high-resolution images from textual prompts, from photorealistic landscapes to richly textured oil paintings or whimsical claymation scenes. It can also be used for stylized birthday cards, presentations, and more, due to its improved text rendering capabilities.",Unknown,Unknown,Unknown,,,,,,,,,, -model,Gemma 2,Google DeepMind,"Gemma 2 is an open model that offers best-in-class performance and runs at incredible speed across different hardware. It easily integrates with other AI tools. This model is built on a redesigned architecture engineered for exceptional performance and inference efficiency. It is available in both 9 billion (9B) and 27 billion (27B) parameter sizes. Gemma 2 is optimized to run at incredible speed across a range of hardware, from powerful gaming laptops and high-end desktops, to cloud-based setups.",2024-06-27,https://blog.google/technology/developers/google-gemma-2/,unknown,text; text,The 27B Gemma 2 model outperforms other open models in its size category offering cutting-edge performance. 
Specific details can be found in the provided technical report.,27B parameters (dense),"['Gemma', 'CodeGemma', 'RecurrentGemma', 'PaliGemma']",Unknown,Unknown,"Google Cloud TPU host, NVIDIA A100 80GB Tensor Core GPU, NVIDIA H100 Tensor Core GPU","Google DeepMind implemented a refined architecture for Gemma 2. The model has improvements in safety and efficiency over the first generation. The deployment of Gemma 2 on Vertex AI, scheduled for the next month, will offer effortless management of the model.",open,Gemma (commercially-friendly license given by Google DeepMind),Gemma 2 is designed for developers and researchers for various AI tasks. It can be used via the integrations it offers with other AI tools/platforms and can additionally be deployed for more accessible and budget-friendly AI deployments.,Not specified,Unknown,Unknown,,,,,,,,,, model,Animagine XL 3.1,Cagliostro Research Lab,"An open-source, anime-themed text-to-image model enhanced to generate higher quality anime-style images with a broader range of characters from well-known anime series, an optimized dataset, and new aesthetic tags for better image creation.",2024-03-18,https://cagliostrolab.net/posts/animagine-xl-v31-release,https://huggingface.co/cagliostrolab/animagine-xl-3.1,text; image,unknown,unknown,['Animagine XL 3.0'],unknown,"Approximately 15 days, totaling over 350 GPU hours.",2x A100 80GB GPUs,"The model undergoes pretraining, first stage finetuning, and second stage finetuning for refining and improving aspects such as hand and anatomy rendering.",open,Fair AI Public License 1.0-SD,"Generating high-quality anime images from textual prompts. Useful for anime fans, artists, and content creators.",Not suitable for creating realistic photos or for users who expect high-quality results from short or simple prompts.,unknown,https://huggingface.co/cagliostrolab/animagine-xl-3.1/discussions,,,,,,,,,, model,GodziLLa 2,Maya Philippines,"GodziLLa 2 is an experimental combination of various proprietary LoRAs from Maya Philippines and Guanaco LLaMA 2 1K dataset, with LLaMA 2.",2023-08-11,https://huggingface.co/MayaPH/GodziLLa2-70B,https://huggingface.co/MayaPH/GodziLLa2-70B,text; text,"Evaluated on the OpenLLM leaderboard, releasing at rank number 4 on the leaderboard.",70B parameters (dense),"['LLaMA 2', 'Guanaco LLaMA dataset']",unknown,unknown,unknown,,open,LLaMA 2,,,unknown,,,,,,,,,,, dataset,EXMODD,Beijing Institute of Technology,EXMODD (Explanatory Multimodal Open-Domain Dialogue dataset) is a dataset built off the proposed MDCF (Multimodal Data Construction Framework).,2023-10-17,https://arxiv.org/pdf/2310.10967.pdf,,"image, text",Models fine-tuned on EXMODD and earlier dataset Image-Chat and then evaluated on Image-Chat validation set.,unknown,"['YFCC100M', 'Image-Chat']",,,,,open,MIT,,,,Feedback can be sent to authors via poplpr@bit.edu.cn,,,,,,,,[],, @@ -297,7 +291,6 @@ application,Viable,Viable,"Viable analyzes qualitative consumer feedback and pro application,Reexpress One,Reexpress AI,"Reexpress One offers a means of document classification, semantic search, and uncertainty analysis on-device.",2023-03-21,https://re.express/index.html,,,,,[],,,,,limited,unknown,,,unknown,https://github.com/ReexpressAI/support,,data analyses,hhttps://re.express/tos.html,unknown,unknown,unknown,,,, model,360 Zhinao,360 Security,360 Zhinao is a multilingual LLM in Chinese and English with chat capabilities.,2024-05-23,https://arxiv.org/pdf/2405.13386,,text; text,"Achieved competitive performance on relevant benchmarks against 
other 7B models in Chinese, English, and coding tasks.",7B parameters,[],unknown,unknown,unknwon,,open,unknown,,,,,,,,,,,,,, dataset,YT-Temporal-1B,University of Washington,,2022-01-07,https://arxiv.org/abs/2201.02639,,video,,20M videos,['YouTube'],,,,,open,MIT,,,,,,,,,,,,[],, -model,Gen-3 Alpha,"Runway AI, Inc.","Gen-3 Alpha is a foundation model trained for large-scale multimodal tasks. It is a major improvement in fidelity, consistency, and motion over the previous generation, Gen-2. Gen-3 Alpha can power various tools, such as Text to Video, Image to Video, and Text to Image. The model excels at generating expressive human characters with a wide range of actions, gestures, and emotions, and is capable of interpreting a wide range of styles and cinematic terminology. It is also a step towards building General World Models. It has been designed for use by research scientists, engineers, and artists, and can be fine-tuned for customization according to specific stylistic and narrative requirements.",2024-06-17,https://runwayml.com/research/introducing-gen-3-alpha?utm_source=xinquji,unknown,"text, image, video; video",Unknown,Unknown,[],Unknown,Unknown,Unknown,"It will be released with a set of new safeguards, including an improved in-house visual moderation system and C2PA provenance standards.",open,"Terms of Use listed on Runway AI, Inc.'s website, specific license unknown","Can be used to create expressive human characters, interpret a wide range of styles and cinematic terminology, and power tools for Text to Video, Image to Video, and Text to Image tasks.",Unknown,The model includes a new and improved in-house visual moderation system.,"Companies interested in fine-tuning and custom models can reach out to Runway AI, Inc. using a form on their website.",,,,,,,,,, model,Xwin-LM,Xwin,"Xwin-LM is a LLM, which on release, ranked top 1 on AlpacaEval, becoming the first to surpass GPT-4 on this benchmark.",2023-09-20,https://huggingface.co/Xwin-LM/Xwin-LM-70B-V0.1,https://huggingface.co/Xwin-LM/Xwin-LM-70B-V0.1,text; text,Evaluated on AlpacaEval benchmark against SOTA LLMs.,70B parameters (dense),[],unknown,unknown,unknown,,open,LLaMA2,,,,https://huggingface.co/Xwin-LM/Xwin-LM-70B-V0.1/discussions,,,,,,,,,, application,Poe,Quora,"Poe lets people ask questions, get instant answers, and have back-and-forth conversations with several AI-powered bots. 
It is initially available on iOS, but we will be adding support for all major platforms in the next few months, along with more bots.",2023-02-03,https://quorablog.quora.com/Poe-1,,,,,"['ChatGPT API', 'GPT-4 API', 'Claude API', 'Dragonfly API', 'Sage API']",,,,,limited,,,,,,,,https://poe.com/tos,,,,,,, dataset,Jurassic-1 dataset,AI21 Labs,"The dataset used to train the Jurassic-1 models, based on publicly available data.",2021-08-11,https://uploads-ssl.webflow.com/60fd4503684b466578c0d307/61138924626a6981ee09caf6_jurassic_tech_paper.pdf,,text,,300B tokens,[],,,,,closed,unknown,unknown,,,,,,,,,,,[],unknown,unknown @@ -311,7 +304,6 @@ application,AI21 Summarization API,AI21 Labs,AI21 Studio's Summarize API offers application,Wordtune,AI21 Labs,"Wordtune, the first AI-based writing companion that understands context and meaning.",2020-10-27,https://www.wordtune.com/,,,,,['AI21 Paraphrase API'],,,,unknown,limited,Wordtune License,The Wordtune assistant is a writing assistant,,unknown,,unknown,text,https://www.wordtune.com/terms-of-use,unknown,unknown,unknown,,,, application,Wordtune Read,AI21 Labs,"Wordtune Read is an AI reader that summarizes long documents so you can understand more, faster.",2021-11-16,https://www.wordtune.com/read,,,,,['AI21 Summarize API'],,,,unknown,limited,Wordtune License,,,unknown,,unknown,text,https://www.wordtune.com/terms-of-use,unknown,unknown,unknown,,,, model,Jamba,AI21 Labs,"Jamba is a state-of-the-art, hybrid SSM-Transformer LLM. Jamba is the world’s first production-grade Mamba based model.",2024-03-28,https://www.ai21.com/blog/announcing-jamba,https://huggingface.co/ai21labs/Jamba-v0.1,text; text,Jamba outperforms or matches other state-of-the-art models in its size class on a wide range of benchmarks.,52B parameters (sparse),[],unknown,unknown,unknown,,open,Apache 2.0,"intended for use as a foundation layer for fine tuning, training",,,https://huggingface.co/ai21labs/Jamba-v0.1/discussions,,,,,,,,,, -model,Jamba 1.5,AI21,"A family of models that demonstrate superior long context handling, speed, and quality. Built on a novel SSM-Transformer architecture, they surpass other models in their size class. These models are useful for enterprise applications, such as lengthy document summarization and analysis. The Jamba 1.5 family also includes the longest context window, at 256K, among open models. They are fast, quality-focused, and handle long contexts efficiently.",2024-08-22,https://www.ai21.com/blog/announcing-jamba-model-family,unknown,text; text,"The models were evaluated based on their ability to handle long contexts, speed, and quality. They outperformed competitors in their size class, scoring high on the Arena Hard benchmark.",94B parameters,[],Unknown,Unknown,"For speed comparisons, Jamba 1.5 Mini used 2xA100 80GB GPUs, and Jamba 1.5 Large used 8xA100 80GB GPUs.","The models were evaluated on the Arena Hard benchmark. For maintaining long context performance, they were tested on the RULER benchmark.",open,Jamba Open Model License,"The models are built for enterprise scale AI applications. They are purpose-built for efficiency, speed, and ability to solve critical tasks that businesses care about, such as lengthy document summarization and analysis. 
They can also be used for RAG and agentic workflows.",Unknown,Unknown,Unknown,,,,,,,,,, model,Dolphin 2.2 Yi,Cognitive Computations,Dolphin 2.2 Yi is an LLM based off Yi.,2023-11-14,https://erichartford.com/dolphin,https://huggingface.co/cognitivecomputations/dolphin-2_2-yi-34b,text; text,,34B parameters (dense),"['Dolphin', 'Yi']",unknown,3 days,4 A100 GPUs,,open,custom,,,unknown,https://huggingface.co/cognitivecomputations/dolphin-2_2-yi-34b/discussions,,,,,,,,,, model,WizardLM Uncensored,Cognitive Computations,WizardLM Uncensored is WizardLM trained with a subset of the dataset - responses that contained alignment / moralizing were removed.,2023-06-01,https://huggingface.co/cognitivecomputations/WizardLM-30B-Uncensored,https://huggingface.co/cognitivecomputations/WizardLM-30B-Uncensored,text; text,Evaluated on OpenLLM leaderboard.,30B parameters (dense),['WizardLM'],unknown,unknown,unknown,,open,unknown,,,unknown,https://huggingface.co/cognitivecomputations/WizardLM-30B-Uncensored/discussions,,,,,,,,,, model,ChatGLM,ChatGLM,"ChatGLM is a Chinese-English language model with question and answer and dialogue functions, and is aimed at a Chinese audience.",2023-03-14,https://chatglm.cn/blog,,text; text,Performance evaluated on English and Chinese language benchmark tests.,6B parameters (dense),[],unknown,unknown,,,open,Apache 2.0,,,,,,,,,,,,,, @@ -371,7 +363,6 @@ model,Emu Edit,Meta,Emu Edit is a multi-task image editing model which sets stat model,MetaCLIP,Meta,MetaCLIP is a more transparent rendition of CLIP that aims to reveal CLIP's training data curation methods.,2023-10-02,https://arxiv.org/pdf/2103.00020.pdf,https://huggingface.co/facebook/metaclip-b32-400m,text; text,Evaluated in comparison to CLIP.,unknown,['Common Crawl'],unknown,unknown,unknown,,open,CC-BY-NC-4.0,,,,,,,,,,,,,, model,Llama 3,Meta,Llama 3 is the third generation of Meta AI's open-source large language model. It comes with pretrained and instruction-fine-tuned language models with 8B and 70B parameters that can support a broad range of use cases.,2024-04-18,https://llama.meta.com/llama3/,https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md,text; text,"The models were evaluated based on their performance on standard benchmarks and real-world scenarios. These evaluations were performed using a high-quality human evaluation set containing 1,800 prompts covering multiple use cases. The models also went through red-teaming for safety, where human experts and automated methods were used to generate adversarial prompts to test for problematic responses.",70B parameters,[],unknown,unknown,2 custom-built Meta 24K GPU clusters,"Extensive internal and external testing for safety, and design of new trust and safety tools.",open,Llama 3,"Llama 3 is intended for a broad range of use cases, including AI assistance, content creation, learning, and analysis.",unknown,Extensive internal and external performance evaluation and red-teaming approach for safety testing.,"Feedback is encouraged from users to improve the model, but the feedback mechanism is not explicitly described.",,,,,,,,,, model,Chameleon,Meta FAIR,Chameleon is a family of early-fusion token-based mixed-modal models capable of understanding and generating images and text in any arbitrary sequence.,2024-05-17,https://arxiv.org/pdf/2405.09818,,"image, text; image, text","Evaluated on a comprehensive range of tasks, including visual question answering, image captioning, text generation, image generation, and long-form mixed modal generation. 
Chameleon demonstrates broad and general capabilities, including state-of-the-art performance in image captioning tasks, outperforms Llama-2 in text-only tasks while being competitive with models such as Mixtral 8x7B and Gemini-Pro.",34B parameters,[],unknown,unknown,Meta's Research Super Cluster (powered by NVIDIA A100 80GB GPUs),,open,unknown,,,,,,,,,,,,,, -model,Llama 3.1 405B,Meta AI,"Llama 3.1 405B is the first openly available model that rivals the top AI models when it comes to state-of-the-art capabilities in general knowledge, steerability, math, tool use, and multilingual translation. With the release of the 405B model, the Llama versions support advanced use cases, such as long-form text summarization, multilingual conversational agents, and coding assistants. It is the largest and most capable openly available foundation model.",2024-07-23,https://ai.meta.com/blog/meta-llama-3-1/,https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/MODEL_CARD.md,text; text,"The model was evaluated on over 150 benchmark datasets that span a wide range of languages. An experimental evaluation suggests that the model is competitive with leading foundation models across a range of tasks. Also, smaller models of Llama 3.1 405B are competitive with closed and open models that have a similar number of parameters.",405B parameters (dense),['Unknown'],Unknown,Unknown,Over 16 thousand H100 GPUs,"The development process was focused on keeping the model scalable and straightforward. It adopted an iterative post-training procedure, where each round uses supervised fine-tuning and direct preference optimization. The model also underwent quality assurance and filtering for pre-and post-training data.",open,Unknown,"For advanced use cases, such as long-form text summarization, multilingual conversational agents, and coding assistants. May also be useful in the development of custom offerings and systems by developers.",Unknown,Unknown,Unknown,,,,,,,,,, model,CausalLM,CausalLM,CausalLM is an LLM based on the model weights of Qwen and trained on a model architecture identical to LLaMA 2.,2023-10-21,https://huggingface.co/CausalLM/14B,https://huggingface.co/CausalLM/14B,text; text,Evaluated on standard benchmarks across a range of tasks.,14B parameters (dense),"['Qwen', 'OpenOrca', 'Open Platypus']",unknown,unknown,unknown,,open,WTFPL,,,unknown,,,,,,,,,,, model,Midm,KT Corporation,Midm is a pre-trained Korean-English language model developed by KT. It takes text as input and creates text. 
The model is based on Transformer architecture for an auto-regressive language model.,2023-10-31,https://huggingface.co/KT-AI/midm-bitext-S-7B-inst-v1,https://huggingface.co/KT-AI/midm-bitext-S-7B-inst-v1,text; text,unknown,7B parameters,"['AI-HUB dataset', 'National Institute of Korean Language dataset']",unknown,unknown,unknown,"KT tried to remove unethical expressions such as profanity, slang, prejudice, and discrimination from training data.",open,CC-BY-NC 4.0,It is expected to be used for various research purposes.,It cannot be used for commercial purposes.,unknown,https://huggingface.co/KT-AI/midm-bitext-S-7B-inst-v1/discussions,,,,,,,,,, dataset,10k_prompts_ranked,Data is Better Together,"10k_prompts_ranked is a dataset of prompts with quality rankings created by 314 members of the open-source ML community using Argilla, an open-source tool to label data.",2024-02-27,https://huggingface.co/blog/community-datasets,,text,,10k examples,[],,,,,open,unknown,Training and evaluating language models on prompt ranking tasks and as a dataset that can be filtered only to include high-quality prompts. These can serve as seed data for generating synthetic prompts and generations.,"This dataset only contains rankings for prompts, not prompt/response pairs so it is not suitable for direct use for supervised fine-tuning of language models.",,https://huggingface.co/datasets/DIBT/10k_prompts_ranked/discussions,,,,,,,https://huggingface.co/datasets/DIBT/10k_prompts_ranked,[],, @@ -388,7 +379,6 @@ dataset,MineDojo,NVIDIA,,2022-06-17,https://arxiv.org/abs/2206.08853,,"text, vid dataset,VIMA dataset,"NVIDIA, Stanford",,2022-10-06,https://vimalabs.github.io/,,"image, text",,200M parameters (dense model),"['T5', 'Mask R-CNN', 'VIMA dataset']",,,,,open,MIT,,,,,,,,,,,,[],, model,VIMA,"NVIDIA, Stanford",,2022-10-06,https://vimalabs.github.io/,,"image, text; robotics trajectories",,200M parameters (dense),[],,,,,open,MIT,,,,,,,,,,,,,, model,Nemotron 4,Nvidia,Nemotron 4 is a 15-billion-parameter large multilingual language model trained on 8 trillion text tokens.,2024-02-27,https://arxiv.org/pdf/2402.16819.pdf,,"text; code, text","Evaluated on standard LLM benchmarks across a range of fields like reasoning, code generation, and mathematical skills.",15B parameters (dense),[],unknown,13 days,3072 H100 80GB SXM5 GPUs across 384 DGX H100 nodes,Deduplication and quality filtering techniques are applied to the training dataset.,open,unknown,,,unknown,,,,,,,,,,, -model,AstroPT,"Aspia Space, Instituto de Astrofísica de Canarias (IAC), UniverseTBD, Astrophysics Research Institute, Liverpool John Moores University, Departamento Astrofísica, Universidad de la Laguna, Observatoire de Paris, LERMA, PSL University, and Universit´e Paris-Cit´e.","AstroPT is an autoregressive pretrained transformer developed with astronomical use-cases in mind. The models have been pretrained on 8.6 million 512x512 pixel grz-band galaxy postage stamp observations from the DESI Legacy Survey DR8. They have created a range of models with varying complexity, ranging from 1 million to 2.1 billion parameters.",2024-09-08,https://arxiv.org/pdf/2405.14930v1,unknown,image; image,"The models’ performance on downstream tasks was evaluated by linear probing. 
The models follow a similar saturating log-log scaling law to textual models, their performance improves with the increase in model size up to the saturation point of parameters.",2.1B parameters,['DESI Legacy Survey DR8'],Unknown,Unknown,Unknown,The models’ performances were evaluated on downstream tasks as measured by linear probing.,open,MIT,"The models are intended for astronomical use-cases, particularly in handling and interpreting large observation data from astronomical sources.",Unknown,Unknown,Any problem with the model can be reported to Michael J. Smith at mike@mjjsmith.com.,,,,,,,,,, model,Nous Hermes 2,Nous Research,Nous Hermes 2 Mixtral 8x7B DPO is the new flagship Nous Research model trained over the Mixtral 8x7B MoE LLM.,2024-01-10,https://huggingface.co/NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO,https://huggingface.co/NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO,"text; code, text","Evaluated across standard benchmarks and generally performs better than Mixtral, which it was fine-tuned on.",7B parameters (dense),['Mixtral'],unknown,unknown,unknown,unknown,open,Apache 2.0,,,unknown,https://huggingface.co/NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO/discussions,,,,,,,,,, model,YaRN LLaMA 2,"Nous Research, EleutherAI, University of Geneva",YaRN LLaMA 2 is an adapted version of LLaMA 2 using the YaRN extension method.,2023-11-01,https://arxiv.org/pdf/2309.00071.pdf,https://huggingface.co/NousResearch/Yarn-Llama-2-70b-32k,text; text,Evaluated across a variety of standard benchmarks in comparison to LLaMA 2.,70B parameters (dense),['LLaMA 2'],unknown,unknown,unknown,,open,LLaMA 2,,,unknown,https://huggingface.co/NousResearch/Yarn-Llama-2-70b-32k/discussions,,,,,,,,,, model,Nous Capybara,Nous Research,The Capybara series is a series of LLMs and the first Nous collection of models made by fine-tuning mostly on data created by Nous in-house.,2023-11-13,https://huggingface.co/NousResearch/Nous-Capybara-34B,https://huggingface.co/NousResearch/Nous-Capybara-34B,text; text,,34B parameters (dense),['Yi'],unknown,unknown,unknown,,open,MIT,,,unknown,https://huggingface.co/NousResearch/Nous-Capybara-34B/discussions,,,,,,,,,, @@ -396,7 +386,6 @@ model,YaRN Mistral,"Nous Research, EleutherAI, University of Geneva",YaRN Mistra model,OpenHermes 2.5 Mistral,Nous Research,"OpenHermes 2.5 Mistral 7B is a state of the art Mistral Fine-tune, a continuation of OpenHermes 2 model, trained on additional code datasets.",2023-11-03,https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B,https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B,text; text,Evaluated on common LLM benchmarks in comparison to other Mistral derivatives.,7B parameters (dense),['Mistral'],unknown,unknown,unknown,,open,Apache 2.0,,,unknown,https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B/discussions,,,,,,,,,, model,Hermes 2 Pro-Mistral,Nous,"Hermes 2 Pro on Mistral 7B is an upgraded, retrained version of Nous Hermes 2. 
This improved version excels at function calling, JSON Structured Outputs, and several other areas, scoring positively on various benchmarks.",2024-03-10,https://huggingface.co/NousResearch/Hermes-2-Pro-Mistral-7B,https://huggingface.co/NousResearch/Hermes-2-Pro-Mistral-7B,text; text,"The model was examined across a range of benchmarks including GPT4All, AGIEval, BigBench, TruthfulQA and in-house evaluations of function calling and JSON mode.",7B parameters (dense),"['Mistral', 'OpenHermes 2.5 Dataset', 'Nous Hermes 2']",unknown,unknown,unknown,"The model was evaluated across multiple tasks, displaying notable scores in GPT4All, AGIEval, BigBench, and TruthfulQA. It also has a high score on function calling and JSON mode, indicating the robustness of its capabilities.",open,Apache 2.0,"The model is intended for general task and conversation capabilities, function calling, and JSON structured outputs.",unknown,unknown,https://huggingface.co/NousResearch/Hermes-2-Pro-Mistral-7B/discussions,,,,,,,,,, model,Genstruct,Nous,"Genstruct is an instruction-generation model, designed to create valid instructions given a raw text corpus. This enables the creation of new, partially synthetic instruction finetuning datasets from any raw-text corpus. This work was inspired by Ada-Instruct and the model is also trained to generate questions involving complex scenarios that require detailed reasoning.",2024-03-07,https://huggingface.co/NousResearch/Genstruct-7B,https://huggingface.co/NousResearch/Genstruct-7B,text; text,unknown,7B parameters (dense),[],unknown,unknown,unknown,unknown,open,Apache 2.0,"The model is intended for instruction-generation, creating questions involving complex scenarios and generating reasoning steps for those questions.",unknown,unknown,https://huggingface.co/NousResearch/Genstruct-7B/discussions,,,,,,,,,, -model,ChatGLM,"Team GLM, Zhipu AI, Tsinghua University","ChatGLM is an evolving family of large language models that have been developed over time. The GLM-4 language series, includes GLM-4, GLM-4-Air, and GLM-4-9B. They are pre-trained on ten trillions of tokens mostly in Chinese and English and are aligned primarily for Chinese and English usage. The high-quality alignment is achieved via a multi-stage post-training process, which involves supervised fine-tuning and learning from human feedback. 
GLM-4 All Tools model is further aligned to understand user intent and autonomously decide when and which tool(s) to use.",2023-07-02,https://arxiv.org/pdf/2406.12793,https://huggingface.co/THUDM/glm-4-9b,text; text,"Evaluations show that GLM-4, 1) closely rivals or outperforms GPT-4 in terms of general metrics such as MMLU, GSM8K, MATH, BBH, GPQA, and HumanEval, 2) gets close to GPT-4-Turbo in instruction following as measured by IFEval, 3) matches GPT-4 Turbo (128K) and Claude 3 for long context tasks, and 4) outperforms GPT-4 in Chinese alignments as measured by AlignBench.",9B parameters,[],Unknown,Unknown,Unknown,"High-quality alignment is achieved via a multi-stage post-training process, which involves supervised fine-tuning and learning from human feedback.",Open,Apache 2.0,"General language modeling, complex tasks like accessing online information via web browsing and solving math problems using Python interpreter.",Unknown,Unknown,Unknown,,,,,,,,,, model,TigerBot,TigerResearch,TigerBot is an open source multilingual multitask LLM.,2023-10-19,https://arxiv.org/pdf/2312.08688.pdf,https://huggingface.co/TigerResearch/tigerbot-180b-base-v2,text; text,Evaluated across a range of domain tasks across standard benchmarks in comparison to predecessor Llama 2.,180B parameters (dense),"['Llama 2', 'BLOOM']",unknown,unknown,32 A100-40G GPUs,Safety filtering performed to mitigate risk and remove toxic content.,open,Apache 2.0,,,unknown,https://huggingface.co/TigerResearch/tigerbot-180b-base-v2/discussions,,,,,,,,,, dataset,MassiveText,Google Deepmind,"The MassiveText dataset was used to train the Gopher model. ",2021-12-08,https://arxiv.org/pdf/2112.11446.pdf,,"code, text","MassiveText data was analyzed for toxicity, language distribution, URL breakdown, and tokenizer compression rates on the subsets [[Section A.2]](https://arxiv.org/pdf/2112.11446.pdf#subsection.A.2). @@ -478,13 +467,11 @@ model,Qwen 1.5 MoE,Qwen Team,"Qwen 1.5 is the next iteration in their Qwen serie model,SeaLLM v2.5,"DAMO Academy, Alibaba",SeaLLM v2.5 is a multilingual large language model for Southeast Asian (SEA) languages.,2024-04-12,https://github.com/DAMO-NLP-SG/SeaLLMs,https://huggingface.co/SeaLLMs/SeaLLM-7B-v2.5,text; text,"The model was evaluated on 3 benchmarks (MMLU for English, M3Exam (M3e) for English, Chinese, Vietnamese, Indonesian, and Thai, and VMLU for Vietnamese) and it outperformed GPT-3 and Vistral-7B-chat models across these benchmarks in the given languages.",7B parameters,['Gemma'],unknown,unknown,unknown,"Despite efforts in red teaming and safety fine-tuning and enforcement, the creators suggest, developers and stakeholders should perform their own red teaming and provide related security measures before deployment, and they must abide by and comply with local governance and regulations.",open,custom,"The model is intended for multilingual tasks such as knowledge retrieval, math reasoning, and instruction following. Also, it could be used to provide multilingual assistance.","The model should not be used in a way that could lead to inaccurate, misleading or potentially harmful generation. 
Users should comply with local laws and regulations when deploying the model.",unknown,https://huggingface.co/SeaLLMs/SeaLLM-7B-v2.5/discussions,,,,,,,,,, model,Grok-1,xAI,"Grok is an AI modeled after the Hitchhiker’s Guide to the Galaxy,",2023-11-04,https://grok.x.ai/,https://x.ai/model-card/,text; text,Grok-1 was evaluated on a range of reasoning benchmark tasks and on curated foreign mathematic examination questions.,314B parameters (dense),[],unknown,unknown,unknown,,open,Apache 2.0,"Grok-1 is intended to be used as the engine behind Grok for natural language processing tasks including question answering, information retrieval, creative writing and coding assistance.",,unknown,,,,,,,,,,, model,Grok-1.5V,xAI,"Grok-1.5V is a first-generation multimodal model which can process a wide variety of visual information, including documents, diagrams, charts, screenshots, and photographs.",2024-04-12,https://x.ai/blog/grok-1.5v,,"image, text; text","The model is evaluated in a zero-shot setting without chain-of-thought prompting. The evaluation domains include multi-disciplinary reasoning, understanding documents, science diagrams, charts, screenshots, photographs and real-world spatial understanding. The model shows competitive performance with existing frontier multimodal models.",unknown,[],unknown,unknown,unknown,,limited,unknown,"Grok-1.5V can be used for understanding documents, science diagrams, charts, screenshots, photographs. It can also translate diagrams into Python code.",unknown,unknown,,,,,,,,,,, -model,Grok-2,xAI,"Grok-2 is a state-of-the-art language model with advanced capabilities in both text and vision understanding. It demonstrates significant improvements in reasoning with retrieved content and tool use capabilities over its previous Grok-1.5 model. It also excels in vision-based tasks and delivers high performance in document-based question answering and visual math reasoning (MathVista). Grok-2 mini, a smaller version of Grok-2, is also introduced, offering a balance between speed and answer quality.",2024-08-13,https://x.ai/blog/grok-2,unknown,"text; text, vision","The Grok-2 models were evaluated across a series of academic benchmarks that included reasoning, reading comprehension, math, science, and coding. 
They showed significant improvements over the earlier model Grok-1.5 and achieved performance levels competitive to other frontier models in areas such as graduate-level science knowledge (GPQA), general knowledge (MMLU, MMLU-Pro), and math competition problems (MATH).",unknown,[],Unknown,Unknown,Unknown,Grok-2 models were tested in real-world scenarios using AI tutors that engaged with the models across a variety of tasks and selected the superior response based on specific criteria outlined in the guidelines.,limited,Unknown,"The model is intended to be used for understanding text and vision, answering questions, collaborating on writing, solving coding tasks, and enhancing search capabilities.",Unknown,Unknown,Issues with the model should be reported to xAI.,,,,,,,,,, application,Nextdoor Assistant,Nextdoor,AI chatbot on Nextdoor that helps users write more clear and conscientious posts.,2023-05-02,https://help.nextdoor.com/s/article/Introducing-Assistant,,,,,['ChatGPT'],,,,,open,unknown,to be used to help make the Nextdoor experience more positive for users,,,,,natural language text guidance,,,,,,,, dataset,Neeva dataset,Neeva,,,https://neeva.com/index,,text,,unknown,[],,,,,closed,unknown,,,,,,,,,,,,[],, model,Neeva model,Neeva,,,https://neeva.com/index,,text; text,,unknown,['Neeva dataset'],,,,,closed,unknown,,,,,,,,,,,,,, application,NeevaAI,Neeva,NeevaAI is an AI-powered search tool that combines the capabilities of LLMs with Neeva's independent in-house search stack to create a unique and transformative search experience.,2023-01-06,https://neeva.com/blog/introducing-neevaai,,,,,['Neeva model'],,,,,open,Custom,,,,,,,https://neeva.com/terms,,,,,,, application,Transformify Automate,Transformify,Transformify Automate is a platform for automated task integration using natural language prompts.,2023-05-30,https://www.transformify.ai/automate,,,,,['GPT-4'],,,,,open,,,,,,,text and code,https://www.transformify.ai/legal-stuff,,,,,,, -model,Re-LAION-5B,LAION e.V.,"Re-LAION-5B is an updated version of LAION-5B, the first web-scale, text-link to images pair dataset to be thoroughly cleaned of known links to suspected CSAM. It is an open dataset for fully reproducible research on language-vision learning. This model was developed in response to issues identified by the Stanford Internet Observatory in December 2023. The updates were made in collaboration with multiple organizations like the Internet Watch Foundation (IWF), the Canadian Center for Child Protection (C3P), and Stanford Internet Observatory.",2024-08-30,https://laion.ai/blog/relaion-5b/,unknown,text; image,"Re-LAION-5B aims to fix the issues as reported by Stanford Internet Observatory for the original LAION-5B. It is available for download in two versions, research and research-safe. In total, 2236 links that potentially led to inappropriate content were removed.","5.5B (text, image) pairs",['LAION-5B'],Unknown,Unknown,Unknown,The model utilized lists of link and image hashes provided by partner organizations. These were used to remove inappropriate links from the original LAION-5B dataset to create Re-LAION-5B.,open,Apache 2.0,Re-LAION-5B is designed for research on language-vision learning. 
It can also be used by third parties to clean existing derivatives of LAION-5B by generating diffs and removing all matched content from their versions.,"The dataset should not be utilized for purposes that breach legal parameters or ethical standards, such as dealing with illegal content.",unknown,Problems with the dataset should be reported to the LAION organization. They have open lines for communication with their partners and the broader research community.,,,,,,,,,, model,FuseChat,FuseAI,FuseChat is a powerful chat Language Learning Model (LLM) that integrates multiple structure and scale-varied chat LLMs using a fuse-then-merge strategy. The fusion is done using two stages,2024-02-26,https://arxiv.org/abs/2402.16107,https://huggingface.co/FuseAI/FuseChat-7B-VaRM,text; text,"The FuseChat model was evaluated on MT-Bench which comprises 80 multi-turn dialogues spanning writing, roleplay, reasoning, math, coding, stem, and humanities domains. It yields an average performance of 66.52 with specific scores for individual domains available in the leaderboard results.",7B parameters,"['Nous Hermes 2', 'OpenChat 3.5']",unknown,unknown,unknown,,open,Apache 2.0,"FuseChat is intended to be used as a powerful chat bot that takes in text inputs and provides text-based responses. It can be utilized in a variety of domains including writing, roleplay, reasoning, math, coding, stem, and humanities.",unknown,unknown,https://huggingface.co/FuseAI/FuseChat-7B-VaRM/discussions,,,,,,,,,, dataset,ToyMix,Mila-Quebec AI Institute,ToyMix is the smallest dataset of three extensive and meticulously curated multi-label datasets that cover nearly 100 million molecules and over 3000 sparsely defined tasks.,2023-10-09,https://arxiv.org/pdf/2310.04292.pdf,,"molecules, tasks",Models of size 150k parameters trained on ToyMix and compared to models trained on its dependencies across GNN baselines.,13B labels of quantum and biological nature.,"['QM9', 'TOX21', 'ZINC12K']",,,,,open,CC BY-NC-SA 4.0,"The datasets are intended to be used in an academic setting for training molecular GNNs with orders of magnitude more parameters than current large models. Further, the ToyMix dataset is intended to be used in a multi-task setting, meaning that a single model should be trained to predict them simultaneously.",,,,,,,,,,,[],, dataset,LargeMix,Mila-Quebec AI Institute,LargeMix is the middle-sized dataset of three extensive and meticulously curated multi-label datasets that cover nearly 100 million molecules and over 3000 sparsely defined tasks.,2023-10-09,https://arxiv.org/pdf/2310.04292.pdf,,"molecules, tasks",Models of size between 4M and 6M parameters trained for 200 epochs on LargeMix and compared to models trained on its dependencies across GNN baselines.,13B labels of quantum and biological nature.,"['L1000 VCAP', 'L1000 MCF7', 'PCBA1328', 'PCQM4M_G25_N4']",,,,,open,CC BY-NC-SA 4.0,"The datasets are intended to be used in an academic setting for training molecular GNNs with orders of magnitude more parameters than current large models. 
Further, the LargeMix dataset is intended to be used in a multi-task setting, meaning that a single model should be trained to predict them simultaneously.",,,,,,,,,,,[],, @@ -566,7 +553,6 @@ model,Orca 2,Microsoft,Orca 2 is a finetuned version of LLAMA-2 for research pur model,Phi-3,Microsoft,"Phi-3 is a 14 billion-parameter, lightweight, state-of-the-art open model trained using the Phi-3 datasets.",2024-05-21,https://arxiv.org/abs/2404.14219,https://huggingface.co/microsoft/Phi-3-medium-128k-instruct,text; text,"The model has been evaluated against benchmarks that test common sense, language understanding, mathematics, coding, long-term context, and logical reasoning. The Phi-3 Medium-128K-Instruct demonstrated robust and state-of-the-art performance.",14B parameters,[],unknown,unknown,unknown,The model underwent post-training processes viz. supervised fine-tuning and direct preference optimization to increase its capability in following instructions and aligning to safety measures.,open,MIT,The model's primary use cases are for commercial and research purposes that require capable reasoning in memory or compute constrained environments and latency-bound scenarios. It can also serve as a building block for generative AI-powered features.,"The model should not be used for high-risk scenarios without adequate evaluation and mitigation techniques for accuracy, safety, and fairness.","Issues like allocation, high-risk scenarios, misinformation, generation of harmful content and misuse should be monitored and addressed.",https://huggingface.co/microsoft/Phi-3-medium-128k-instruct/discussions,,,,,,,,,, model,Aurora,Microsoft,Aurora is a large-scale foundation model of the atmosphere trained on over a million hours of diverse weather and climate data.,2024-05-28,https://arxiv.org/pdf/2405.13063,,text; climate forecasts,Evaluated by comparing climate predictions to actual happened events.,1.3B parameters,[],unknown,unknown,32 A100 GPUs,,closed,unknown,,,,,,,,,,,,,, model,Prov-GigaPath,Microsoft,Prov-GigaPath is a whole-slide pathology foundation model pretrained on 1.3 billion 256 × 256 pathology image tiles.,2024-05-22,https://www.nature.com/articles/s41586-024-07441-w,,image; embeddings,"Evaluated on a digital pathology benchmark comprising 9 cancer subtyping tasks and 17 pathomics tasks, with Prov-GigaPath demonstrating SoTA performance in 25 out of 26 tasks.",unknown,['GigaPath'],unknown,2 days,4 80GB A100 GPUs,,closed,unknown,,,,,,,,,,,,,, -model,Phi-3.5-MoE,Microsoft,"Phi-3.5-MoE is a lightweight, state-of-the-art open model built upon datasets used for Phi-3 - synthetic data and filtered publicly available documents, with a focus on very high-quality, reasoning dense data. It supports multilingual and has a 128K context length in tokens. The model underwent a rigorous enhancement process, incorporating supervised fine-tuning, proximal policy optimization, and direct preference optimization to ensure instruction adherence and robust safety measures.",2024-09-08,https://huggingface.co/microsoft/Phi-3.5-MoE-instruct,https://huggingface.co/microsoft/Phi-3.5-MoE-instruct,text; text,"The model was evaluated across a variety of public benchmarks, comparing with a set of models including Mistral-Nemo-12B-instruct-2407, Llama-3.1-8B-instruct, Gemma-2-9b-It, Gemini-1.5-Flash, and GPT-4o-mini-2024-07-18. It achieved a similar level of language understanding and math as much larger models. It also displayed superior performance in reasoning capability, even with only 6.6B active parameters. 
It was also evaluated for multilingual tasks.",61B parameters (sparse); 6.6B active parameters,['Phi-3 dataset'],Unknown,Unknown,Unknown,"The model was enhanced through supervised fine-tuning, proximal policy optimization, and direct preference optimization processes for safety measures.",open,MIT,"The model is intended for commercial and research use in multiple languages. It is designed to accelerate research on language and multimodal models, and for use as a building block for generative AI powered features. It is suitable for general purpose AI systems and applications which require memory/computed constrained environments, latency bound scenarios, and strong reasoning.","The model should not be used for downstream purposes it was not specifically designed or evaluated for. Developers should evaluate and mitigate for accuracy, safety, and fariness before using within a specific downstream use case, particularly for high risk scenarios.",Unknown,Unknown,,,,,,,,,, model,CodeGen,Salesforce,CodeGen is a language model for code,2022-03-25,https://arxiv.org/abs/2203.13474,,"code, text; code, text",,16B parameters (dense),[],,,Unspecified Salesforce Compute (TPU-V4s),,open,"none (model weights), BSD-3-Clause (code)",,,,,,,,,,,,,, model,BLIP,Salesforce,,2022-01-28,https://arxiv.org/abs/2201.12086,,text; image,,unknown,"['ViT-B', 'BERT', 'COCO', 'Visual Genome', 'Conceptual Captions', 'Conceptual 12M', 'SBU Captions', 'LAION-115M']",,,,,open,BSD-3-Clause,,,,,,,,,,,,,, dataset,LAION-115M,Salesforce,,2022-01-28,https://arxiv.org/abs/2201.12086,,"image, text",,115M image-text pairs,['LAION-400M'],,,,,open,BSD-3-Clause,,,,,,,,,,,,[],, @@ -646,7 +632,6 @@ application,Bedrock,Amazon,"Bedrock is a new service that makes FMs from AI21 La model,FalconLite2,Amazon,"FalconLite2 is a fine-tuned and quantized Falcon language model, capable of processing long (up to 24K tokens) input sequences.",2023-08-08,https://huggingface.co/amazon/FalconLite2,https://huggingface.co/amazon/FalconLite2,text; text,Evaluated against benchmarks that are specifically designed to assess the capabilities of LLMs in handling longer contexts.,40B parameters (dense),['Falcon-40B'],unknown,unknown,unknown,,open,Apache 2.0,,,,https://huggingface.co/amazon/FalconLite2/discussions,,,,,,,,,, model,Chronos,Amazon,"Chronos is a family of pretrained time series forecasting models based on language model architectures. A time series is transformed into a sequence of tokens via scaling and quantization, and a language model is trained on these tokens using the cross-entropy loss. Once trained, probabilistic forecasts are obtained by sampling multiple future trajectories given the historical context.",2024-03-13,https://github.com/amazon-science/chronos-forecasting,https://huggingface.co/amazon/chronos-t5-large,time-series; time-series,Chronos has been evaluated comprehensively on 42 datasets both in the in-domain (15 datasets) and zero-shot settings (27 datasets). Chronos outperforms task specific baselines in the in-domain setting and is competitive or better than trained models in the zero-shot setting.,710M parameters (dense),['T5'],,63 hours on p4d.24xlarge EC2 instance,8 NVIDIA A100 40G GPUs,"Chronos was evaluated rigorously on 42 datasets, including 27 in the zero-shot setting against a variety of statistical and deep learning baselines.",open,Apache 2.0,"Chronos can be used for zero-shot time series forecasting on univariate time series from arbitrary domains and with arbitrary horizons. 
Chronos models can also be fine-tuned for improved performance on specific datasets. Embeddings from the Chronos encoder may also be useful for other time series analysis tasks such as classification, clustering, and anomaly detection.",,,https://github.com/amazon-science/chronos-forecasting/discussions,,,,,,,,,,
model,Orion,OrionStarAI,Orion series models are open-source multilingual large language models trained from scratch by OrionStarAI.,2024-01-20,https://github.com/OrionStarAI/Orion,https://huggingface.co/OrionStarAI/Orion-14B-Base,text; text,Evaluated on multilingual and NLP benchmarks in comparison with SoTA models of comparable size.,14B parameters (dense),[],unknown,unknown,unknown,unknown,open,custom,,,unknown,https://huggingface.co/OrionStarAI/Orion-14B-Base/discussions,,,,,,,,,,
-model,ESM3,EvolutionaryScale,"ESM3 is the first generative model for biology that simultaneously reasons over the sequence, structure, and function of proteins. It is trained across the natural diversity of Earth, reasoning over billions of proteins from diverse environments. It advances the ability to program and create with the code of life, simulating evolution, and making biology programmable. ESM3 is generative, and scientists can guide the model to create proteins for various applications.",2024-06-25,https://www.evolutionaryscale.ai/blog/esm3-release,unknown,"text; image, text","The model was tested in the generation of a new green fluorescent protein. Its effectiveness was compared to natural evolutionary processes, and it was deemed to simulate over 500 million years of evolution.",98B parameters (Dense),[],Unknown,Unknown,unknown,"The creators have put in place a responsible development framework to ensure transparency and accountability from the start. ESM3 was tested in the generation of a new protein, ensuring its quality and effectiveness.",open,Unknown,"To engineer biology from first principles. It functions as a tool for scientists to create proteins for various applications, including medicine, biology research, and clean energy.",Unknown,Unknown though specific measures are not specified.,Unknown,,,,,,,,,,
application,Cformers,Nolano,Cformers is a set of transformers that act as an API for AI inference in code.,2023-03-19,https://www.nolano.org/services/Cformers/,,,,,[],,,,,limited,MIT,,,,,,,,,,,,,,
dataset,Anthropic Helpfulness dataset,Anthropic,"One of the datasets used to train Anthropic RLHF models. The dataset was collected by asking crowdworkers to have open-ended conversations with Anthropic models, ""asking for help, advice, or for the model to accomplish a task"", then choose the model answer that was more helpful for their given task, via the Anthropic Human Feedback Interface [[Section 2.2]](https://arxiv.org/pdf/2204.05862.pdf#subsection.2.2). ",2022-04-12,https://arxiv.org/pdf/2204.05862.pdf,,text,"The authors found that the crowdworkers didn't exhaustively check for honesty in the model answers they preferred [[Section 2.1]](https://arxiv.org/pdf/2204.05862.pdf#subsection.2.1). 
@@ -675,8 +660,6 @@ model,Claude 2,Anthropic,"Claude 2 is a more evolved and refined version of Clau
model,Claude 2.1,Anthropic,"Claude 2.1 is an updated version of Claude 2, with an increased context window, less hallucination and tool use.",2023-11-21,https://www.anthropic.com/index/claude-2-1,,text; text,"Evaluated on open-ended conversation accuracy and long context question answering. 
In evaluations, Claude 2.1 demonstrated a 30% reduction in incorrect answers and a 3-4x lower rate of mistakenly concluding a document supports a particular claim.",unknown,[],unknown,unknown,unknown,,limited,unknown,,,,,,,,,,,,,, application,Claude for Sheets,Anthropic,Claude for Sheets is a Google Sheets add-on that allows the usage of Claude directly in Google Sheets.,2023-12-21,https://workspace.google.com/marketplace/app/claude_for_sheets/909417792257,,,,,['Anthropic API'],,,,,open,unknown,as an integrated AI assistant in Google Sheets,,unknown,Reviews on https://workspace.google.com/marketplace/app/claude_for_sheets/909417792257,,AI-generated text from prompt,https://claude.ai/legal,unknown,unknown,unknown,,,, model,Claude 3,Anthropic,The Claude 3 model family is a collection of models which sets new industry benchmarks across a wide range of cognitive tasks.,2024-03-04,https://www.anthropic.com/news/claude-3-family,https://www-cdn.anthropic.com/de8ba9b01c9ab7cbabf5c33b80b7bbc618857627/Model_Card_Claude_3.pdf,"image, text; text","Evaluated on reasoning, math, coding, reading comprehension, and question answering, outperforming GPT-4 on standard benchmarks.",unknown,[],unknown,unknown,unknown,Pre-trained on diverse dataset and aligned with Constitutional AI technique.,limited,unknown,"Claude models excel at open-ended conversation and collaboration on ideas, and also perform exceptionally well in coding tasks and when working with text - whether searching, writing, editing, outlining, or summarizing.","Prohibited uses include, but are not limited to, political campaigning or lobbying, surveillance, social scoring, criminal justice decisions, law enforcement, and decisions related to financing, employment, and housing.",,,,,,,,,,,, -model,Claude 3.5 Sonnet,Anthropic,"Claude 3.5 Sonnet is an AI model with advanced understanding and generation abilities in text, vision, and code. It sets new industry benchmarks for graduate-level reasoning (GPQA), undergrad-level knowledge (MMLU), coding proficiency (HumanEval), and visual reasoning. The model operates at twice the speed of its predecessor, Claude 3 Opus, and is designed to tackle tasks like context-sensitive customer support, orchestrating multi-step workflows, interpreting charts and graphs, and transcribing text from images.",2024-06-21,https://www.anthropic.com/news/claude-3-5-sonnet,unknown,"text; image, text","The model has been evaluated on a range of tests including graduate-level reasoning (GPQA), undergraduate-level knowledge (MMLU), coding proficiency (HumanEval), and standard vision benchmarks. In an internal agentic coding evaluation, Claude 3.5 Sonnet solved 64% of problems, outperforming the previous version, Claude 3 Opus, which solved 38%.",Unknown,[],Unknown,Unknown,Unknown,"The model underwent a red-teaming assessment, and has been tested and refined by external experts. 
It was also provided to the UK's AI Safety Institute (UK AISI) for a pre-deployment safety evaluation.",open,unknown,"The model is intended for complex tasks such as context-sensitive customer support, orchestrating multi-step workflows, interpreting charts and graphs, transcribing text from images, as well as writing, editing, and executing code.",Misuse of the model is discouraged, though specific use cases are not mentioned.,"Unknown, though policy feedback from external experts has been integrated to ensure robustness of evaluations.",Feedback on Claude 3.5 Sonnet can be submitted directly in-product to inform the development roadmap and improve user experience.,,,,,,,,,,
-model,Qwen2-Math,Qwen Team,"Qwen2-Math is a series of specialized math language models built upon the Qwen2 large language models, with a focus on enhancing the reasoning and mathematical capabilities. Their intended use is for solving complex mathematical problems. They significantly outperform both open-source and closed-source models in terms of mathematical capabilities.",2024-08-08,https://qwenlm.github.io/blog/qwen2-math/,https://huggingface.co/Qwen/Qwen2-Math-72B,text; text,"Models have been evaluated on a series of math benchmarks, demonstrating outperformance of the state-of-the-art models in both English and Chinese.",72B parameters,[],Unknown,Unknown,Unknown,The models were tested with few-shot chain-of-thought prompting and evaluated across mathematical benchmarks in both English and Chinese.,open,Tongyi Qianwen,These models are intended for solving complex mathematical problems.,Uses that go against the ethical usage policies of Qwen Team.,Unknown,Problems with the model should be reported to the Qwen Team via their official channels.,,,,,,,,,,
model,Inflection-1,Inflection AI,Inflection AI's first version of its in-house LLM. 
Available via Inflection AI's conversational API.,2023-06-22,https://inflection.ai/inflection-1,,text; text,"Evaluated on a wide range of language benchmarks like MMLU 5-shot, GSM-8K, and HellaSwag 10-shot among others.",unknown,[],,,unknown,,limited,unknown,,,,,,,,,,,,,,
application,Pi,Inflection AI,Personal AI chatbot designed to be conversational and specialized in emotional intelligence.,2023-05-02,https://inflection.ai/press,,,,,['Inflection-2.5'],,,,,limited,unknown,to be used as a personal assistant chatbot for everyday activities,,,,,natural language text responses,,,,,,,,
model,Inflection-2,Inflection AI,"Inflection-2 is the best model in the world for its compute class and the second most capable LLM in the world, according to benchmark evaluation, as of its release.",2023-11-22,https://inflection.ai/inflection-2,,text; text,"Evaluated against state of the art models on benchmarks, and found to be most performant model outside of GPT-4.",unknown,[],unknown,unknown,5000 NVIDIA H100 GPUs,,closed,unknown,,,,,,,,,,,,,,
@@ -688,8 +671,6 @@ model,Platypus,Boston University,Platypus is a family of fine-tuned and merged L
model,UFOGen,Boston University,"UFOGen is a novel generative model designed for ultra-fast, one-step text-to-image synthesis.",2023-11-14,https://arxiv.org/pdf/2311.09257.pdf,,text; image,UFOGen is evaluated on standard image benchmarks against other models fine-tuned with Stable Diffusion.,900M parameters (dense),['Stable Diffusion'],unknown,unknown,unknown,,open,unknown,,,,,,,,,,,,,,
model,Palmyra,Writer,Palmyra is a family of privacy-first LLMs for enterprises trained on business and marketing writing.,2023-01-01,https://gpt3demo.com/apps/palmyra,https://huggingface.co/Writer/palmyra-base,text; text,Evaluated on the SuperGLUE benchmark,20B parameters (dense),['Writer dataset'],unknown,unknown,,,open,Apache 2.0,generating text from a prompt,,,https://huggingface.co/Writer/palmyra-base/discussions,,,,,,,,,,
model,Camel,Writer,Camel is an instruction-following large language model tailored for advanced NLP and comprehension capabilities.,2023-04-01,https://chatcamel.vercel.app/,https://huggingface.co/Writer/camel-5b-hf,text; text,,5B parameters (dense),"['Palmyra', 'Camel dataset']",unknown,unknown,,,open,Apache 2.0,,,,https://huggingface.co/Writer/camel-5b-hf/discussions,,,,,,,,,,
-model,Palmyra-Med-70b-32k,Writer,"Palmyra-Med-70b-32k is a Language Model designed specifically for healthcare and biomedical applications. It builds upon the foundation of Palmyra-Med-70b and offers an extended context length. This model integrates the DPO dataset, a custom medical instruction dataset, and has been fine-tuned to meet the unique requirements of the medical and life sciences sectors. It is ranked as the leading LLM on biomedical benchmarks with an average score of 85.87%.",2024-09-08,https://huggingface.co/Writer/Palmyra-Med-70B-32K,https://huggingface.co/Writer/Palmyra-Med-70B-32K,text; text,"The model was evaluated across 9 diverse biomedical datasets where it achieved state-of-the-art results with an average score of 85.9%. It also demonstrated robust capability in efficiently processing extensive medical documents, as showcased by its near-perfect score in the NIH evaluation.",70B parameters,['Palmyra-X-004'],Unknown,Unknown,Unknown,The model has been refined using Policy Optimization and a finely crafted fine-tuning dataset. 
It contains watermarks to detect and prevent misuse and illegal use.,open,Writer open model,"Palmyra-Med-70b-32k is intended for non-commercial and research use in English. Specifically, it can be used for tasks like clinical entity recognition and knowledge discovery from EHRs, research articles, and other biomedical sources. It excels in analyzing and summarizing complex clinical notes, EHR data, and discharge summaries.","The model should not be used in any manner that violates applicable laws or regulations. It is not to be used in direct patient care, clinical decision support, or professional medical purposes. The model should not replace professional medical judgment.",Measures in place to monitor misuse include the addition of watermarks in all models built by Writer.com to detect and prevent misuse and illegal use.,Downstream problems with this model should be reported via email to Hello@writer.com.,,,,,,,,,, -model,Palmyra-Fin-70B-32K,Writer,"Palmyra-Fin-70B-32K is a leading LLM built specifically to meet the needs of the financial industry. It has been fine-tuned on an extensive collection of high-quality financial data and it is highly adept at handling the specific needs of the finance field. It outperforms other large language models in various financial tasks and evaluations, achieving state-of-the-art results across various financial datasets. Its strong performance in tasks like financial document analysis, market trend prediction, risk assessment underscores its effective grasp of financial knowledge.",2024-09-08,https://huggingface.co/Writer/Palmyra-Fin-70B-32K,https://huggingface.co/Writer/Palmyra-Fin-70B-32K,text; text,"The model has been evaluated internally, showing state-of-the-art results on various financial datasets. It has shown 100% accuracy in needle-in-haystack tasks and superior performance in comparison to other models in the organization's internal finance evaluations. It passed the CFA Level III test with a score of 73% and has shown superior performance compared to other models in the long-fin-eval, an internally created benchmark that simulates real-world financial scenarios.",70B parameters (dense),"['Palmyra-X-004', 'Writer in-house financial instruction dataset']",Unknown,Unknown,Unknown,"The model was trained with a proprietary internal database and a fine-tuning recipe to ensure a greater level of domain-specific accuracy and fluency. Still, the model may contain inaccuracies, biases, or misalignments and its usage for direct financial decision-making or professional financial advice without human oversight is not recommended. It has not been rigorously evaluated in real-world financial settings and it requires further testing, regulatory compliance, bias mitigation, and human oversight for more critical financial applications.",open,Writer open model license,"The model is intended for use in English for financial analysis, market trend prediction, risk assessment, financial report generation, automated financial advice, and answering questions from long financial documents. It can be used for entity recognition, identifying key financial concepts such as market trends, economic indicators, and financial instruments from unstructured text.","The model should not be used in manners that violate applicable laws or regulations, including trade compliance laws, use prohibited by Writer's acceptable use policy, the Writer open model license, and in languages other than English. 
It is advised not to use the model for direct financial decision-making or professional financial advice without human oversight. Always consult a qualified financial professional for personal financial needs.",Unknown,Downstream problems with this model should be reported to Hello@writer.com.,,,,,,,,,, dataset,Open X-Embodiment dataset,Open X-Embodiment,"The Open X-Embodiment dataset is a dataset of robot movements assembled from 22 different robots collected through a collaboration between 21 institutions, demonstrating 527 skills (160266 tasks)",2023-10-03,https://robotics-transformer-x.github.io/,,robot trajectories,"Analyzed on breakdown of types of robot trajectory in dataset, and overall coverage.",160K tasks,[],,,,unknown,open,Apache 2.0,Further research on X-embodiment models.,,unknown,,,,,,,,All data can be found at https://robotics-transformer-x.github.io/.,[],N/A,N/A model,RT-1-X,"Open X-Embodiment, Google Deepmind","RT-1-X is a model trained on the Open X-Embodiment dataset that exhibits better generalization and new capabilities compared to its predecessor RT-1, an efficient Transformer-based architecture designed for robotic control.",2023-10-03,https://robotics-transformer-x.github.io/,,"images, text; robot trajectories","Evaluated on in-distribution robotics skills, and outperforms its predecessor RT-1 by 50% in emergent skill evaluations.",35M parameters (dense),"['Open X-Embodiment dataset', 'ImageNet EfficientNet', 'USE']",unknown,unknown,unknown,unknown,open,Apache 2.0,Further research on X-embodiment models.,,unknown,,,,,,,,,,, model,RT-2-X,"Open X-Embodiment, Google Deepmind","RT-2-X is a model trained on the Open X-Embodiment dataset that exhibits better generalization and new capabilities compared to its predecessor RT-2, a large vision-language model co-fine-tuned to output robot actions as natural language tokens.",2023-10-03,https://robotics-transformer-x.github.io/,,"images, text, robot trajectories; robot trajectories","Evaluated on in-distribution robotics skills, and outperforms its predecessor RT-2 by 3x in emergent skill evaluations.",55B parameters (dense),"['Open X-Embodiment dataset', 'ViT (unknown size)', 'UL2']",unknown,unknown,unknown,unknown,closed,unknown,Further research on X-embodiment models.,,unknown,,,,,,,,,,, @@ -715,7 +696,6 @@ model,Llama-2-7B-32K-Instruct,Together,"Llama-2-7B-32K-Instruct is an open-sourc dataset,RedPajama-Data-v2,Together,"RedPajama-Data-v2 is a new version of the RedPajama dataset, with 30 trillion filtered and deduplicated tokens (100+ trillions raw) from 84 CommonCrawl dumps covering 5 languages, along with 40+ pre-computed data quality annotations that can be used for further filtering and weighting.",2023-10-30,https://together.ai/blog/redpajama-data-v2,,text,,30 trillion tokens,['Common Crawl'],,,,tokens filtered and deduplicated,open,Apache 2.0,"To be used as the start of a larger, community-driven development of large-scale datasets for LLMs.",,,Feedback can be sent to Together via https://www.together.ai/contact,,,,,,,,[],"documents in English, German, French, Spanish, and Italian.", model,StripedHyena,Together,"StripedHyena is an LLM and the first alternative model competitive with the best open-source Transformers in short and long-context evaluations, according to Together.",2023-12-08,https://www.together.ai/blog/stripedhyena-7b,https://huggingface.co/togethercomputer/StripedHyena-Hessian-7B,text; text,Model evaluated on a suite of short-context task benchmarks.,7B parameters (dense),"['Hyena', 
'RedPajama-Data']",unknown,unknown,unknown,,open,Apache 2.0,,,,https://huggingface.co/togethercomputer/StripedHyena-Hessian-7B/discussions,,,,,,,,,, model,StripedHyena Nous,Together,"StripedHyena Nous is an LLM and chatbot, along with the first alternative model competitive with the best open-source Transformers in short and long-context evaluations, according to Together.",2023-12-08,https://www.together.ai/blog/stripedhyena-7b,https://huggingface.co/togethercomputer/StripedHyena-Nous-7B,text; text,Model evaluated on a suite of short-context task benchmarks.,7B parameters (dense),"['Hyena', 'RedPajama-Data']",unknown,unknown,unknown,,open,Apache 2.0,,,,https://huggingface.co/togethercomputer/StripedHyena-Nous-7B/discussions,,,,,,,,,, -model,Dragonfly,Together,"A large vision-language model with multi-resolution zoom that enhances fine-grained visual understanding and reasoning about image regions. The Dragonfly model comes in two variants, the general-domain model (""Llama-3-8b-Dragonfly-v1"") trained on 5.5 million image-instruction pairs, and the biomedical variant (""Llama-3-8b-Dragonfly-Med-v1"") fine-tuned on an additional 1.4 million biomedical image-instruction pairs. Dragonfly demonstrates promising performance on vision-language benchmarks like commonsense visual QA and image captioning.",2024-06-06,https://www.together.ai/blog/dragonfly-v1,unknown,"image, text; text","The model was evaluated using five popular vision-language benchmarks that require strong commonsense reasoning and detailed image understanding, AI2D, ScienceQA, MMMU, MMVet, and POPE. It demonstrated competitive performance in these evaluations compared to other vision-language models.",8B parameters,['LLaMA'],unknown,unknown,unknown,The model employs two key strategies (multi-resolution visual encoding and zoom-in patch selection) that enable it to efficiently focus on fine-grained details in image regions and provide better commonsense reasoning. Its performance was evaluated on several benchmark tasks for quality assurance.,open,unknown,"Dragonfly is designed for image-text tasks, including commonsense visual question answering and image captioning. 
It is further focused on tasks that require fine-grained understanding of high-resolution image regions, such as in medical imaging.",Unknown,Unknown,Unknown,,,,,,,,,, model,DeepFloyd IF,Stability AI,A text-to-image cascaded pixel diffusion model released in conjunction with AI research lab DeepFloyd.,2023-04-28,https://stability.ai/blog/deepfloyd-if-text-to-image-model,https://huggingface.co/DeepFloyd/IF-I-XL-v1.0,text; image,Evaluated on the COCO dataset.,4.3B parameters (dense),['LAION-5B'],,,,,open,custom,,,,https://huggingface.co/DeepFloyd/IF-I-XL-v1.0/discussions,,,,,,,,,, model,StableLM,Stability AI,Large language models trained on up to 1.5 trillion tokens.,2023-04-20,https://github.com/Stability-AI/StableLM,,text; text,,7B parameters (dense),"['StableLM-Alpha dataset', 'Alpaca dataset', 'gpt4all dataset', 'ShareGPT52K dataset', 'Dolly dataset', 'HH dataset']",,,,,open,Apache 2.0,,,,,,,,,,,,,, application,Stable Diffusion,Stability AI,Stable Diffusion is a generative software that creates images from text prompts.,2022-08-22,https://stability.ai/blog/stable-diffusion-public-release,,,,,[],,,,,open,custom,,,,https://huggingface.co/CompVis/stable-diffusion/discussions,,image,,,,,,,, @@ -735,7 +715,6 @@ dataset,Luminous dataset,Aleph Alpha,The dataset used to train the Luminous mode model,Luminous,Aleph Alpha,Luminous is a family of multilingual language models,2022-04-14,https://twitter.com/Aleph__Alpha/status/1514576711492542477,,text; text,,200B parameters (dense),['Luminous dataset'],unknown,unknown,unknown,,limited,,,,,,,,,,,,,,, application,Aleph Alpha API,Aleph Alpha,The Aleph Alpha API serves a family of text-only language models (Luminous) and multimodal text-and-image models (Magma).,2021-09-30,https://www.aleph-alpha.com/,,,,,['Luminous'],,,,,limited,,unknown,unknown,unknown,unknown,,The text models provide text outputs given text inputs. The multimodal models provide text completions given text and image inputs.,https://www.aleph-alpha.com/terms-conditions,unknown,unknown,unknown,,,, model,MAGMA,Aleph Alpha,An autoregressive VL model that is able to generate text from an arbitrary combination of visual and textual input,2022-10-24,https://arxiv.org/pdf/2112.05253.pdf,,"image, text; text",Evaluated on the OKVQA benchmark as a fully open-ended generative task.,6B parameters (dense),"['GPT-J', 'CLIP']",,,32 A100 GPUs,,open,MIT,,,,,,,,,,,,,, -model,Pharia-1-LLM-7B,Aleph Alpha,"Pharia-1-LLM-7B is a model that falls within the Pharia-1-LLM model family. It is designed to deliver short, controlled responses that match the performance of leading open-source models around 7-8 billion parameters. The model is culturally and linguistically tuned for German, French, and Spanish languages. It is trained on carefully curated data in line with relevant EU and national regulations. The model shows improved token efficiency and is particularly effective in domain-specific applications, especially in the automotive and engineering industries. It can also be aligned to user preferences, making it appropriate for critical applications without the risk of shut-down behaviour.",2024-09-08,https://aleph-alpha.com/introducing-pharia-1-llm-transparent-and-compliant/#:~:text=Pharia%2D1%2DLLM%2D7B,unknown,text; text,"Extensive evaluations were done with ablation experiments performed on pre-training benchmarks such as lambada, triviaqa, hellaswag, winogrande, webqs, arc, and boolq. 
Direct comparisons were also performed with models like GPT and Llama 2.",7B parameters,[],Unknown,Unknown,Unknown,The model comes with additional safety guardrails via alignment methods to ensure safe usage. Training data is carefully curated to ensure compliance with EU and national regulations.,open,Aleph Open,"The model is intended for use in domain-specific applications, particularly in the automotive and engineering industries. It can also be tailored to user preferences.",Unknown,Unknown,Feedback can be sent to support@aleph-alpha.com.,,,,,,,,,,
model,PolyCoder,Carnegie Mellon University,"PolyCoder is a 2.7B parameter code model based on the GPT-2 architecture, trained on 249GB of code across 12 programming languages on a single machine.",2022-02-26,https://arxiv.org/abs/2202.13169,https://huggingface.co/NinedayWang/PolyCoder-2.7B,code,Reports results on standard code benchmarks across a variety of programming languages.,2.7B parameters (dense),['Github'],unknown,6 weeks,8 NVIDIA RTX 8000,"No specific quality control is mentioned in model training, though details on data processing and how the tokenizer was trained are provided in the paper.",open,MIT,unknown,None,None,https://huggingface.co/NinedayWang/PolyCoder-2.7B/discussion,,,,,,,,,,
model,Moment,"Carnegie Mellon University, University of Pennsylvania",Moment is a family of open-source foundation models for general-purpose time-series analysis.,2024-02-06,https://arxiv.org/pdf/2402.03885.pdf,,,Evaluated on nascent time-series datasets and benchmarks.,385M parameters (dense),[],unknown,unknown,Single A6000 GPU,,open,unknown,,,unknown,,,,,,,,,,,
dataset,HowTo100M,"École Normale Supérieure, Inria","HowTo100M is a large-scale dataset of narrated videos with an emphasis on instructional videos where content creators teach complex tasks with an explicit intention of explaining the visual content on screen. HowTo100M features a total of 136M video clips with captions sourced from 1.2M Youtube videos (15 years of video) and 23k activities from domains such as cooking, hand crafting, personal care, gardening or fitness.",2019-06-07,https://arxiv.org/pdf/1906.03327.pdf,,"text, video","Authors use the dataset to learn a joint text-video embedding by leveraging more than 130M video clip-caption pairs. They then evaluate the learned embeddings on the tasks of localizing steps in instructional videos of CrossTask and text-based video retrieval on YouCook2, MSR-VTT and LSMDC datasets. They show that their learned embedding can perform better compared to models trained on existing carefully annotated but smaller video description datasets.",136M video clips,['YouTube'],,,,,open,Apache 2.0,,"No uses are explicitly prohibited by the authors. They note the following limitations of the dataset: ""We note that the distribution of identities and activities in the HowTo100M dataset may not be representative of the global human population and the diversity in society. 
Please be careful of unintended societal, gender, racial and other biases when training or deploying models trained on this data."" @@ -744,9 +723,6 @@ model,Mistral,Mistral AI,Mistral is a compact language model.,2023-09-27,https:/ model,Mistral Large,Mistral AI,Mistral Large is Mistral AI’s new cutting-edge text generation model.,2024-02-26,https://mistral.ai/news/mistral-large/,,text; text,Evaluated on commonly used benchmarks in comparison to the current LLM leaders.,unknown,[],unknown,unknown,unknown,,limited,unknown,,,,,,,,,,,,,, application,Le Chat,Mistral AI,Le Chat is a first demonstration of what can be built with Mistral models and what can deployed in the business environment.,2024-02-26,https://mistral.ai/news/le-chat-mistral/,,,,,"['Mistral', 'Mistral Large']",,,,,limited,unknown,,,,,,,https://mistral.ai/terms/#terms-of-use,unknown,unknown,unknown,,,, model,Codestral,Mistral AI,"Codestral is an open-weight generative AI model explicitly designed for code generation tasks. It helps developers write and interact with code through a shared instruction and completion API endpoint. Mastering code and English, it can be used to design advanced AI applications for software developers. It is fluent in 80+ programming languages.",2024-05-29,https://mistral.ai/news/codestral/,,text; code,"Performance of Codestral is evaluated in Python, SQL, and additional languages, C++, bash, Java, PHP, Typescript, and C#. Fill-in-the-middle performance is assessed using HumanEval pass@1 in Python, JavaScript, and Java.",22B parameters,[],unknown,unknown,unknown,,open,Mistral AI Non-Production License,"Helps developers write and interact with code, design advanced AI applications for software developers, integrated into LlamaIndex and LangChain for building applications, integrated in VSCode and JetBrains environments for code generation and interactive conversation.",unknown,unknown,,,,,,,,,,, -model,Mistral NeMo,"Mistral AI, NVIDIA","The Mistral NeMo model is a state-of-the-art 12B model built in collaboration with NVIDIA, offering a large context window of up to 128k tokens. The model is suitable for multilingual applications and exhibits excellent reasoning, world knowledge, and coding accuracy. It's easy to use and a drop-in replacement in a system that uses Mistral 7B. The model uses a new tokenizer, Tekken, based on Tiktoken, which is trained on over 100 languages. It compresses natural language text and source code more efficiently than previously used tokenizers.",2024-07-18,https://mistral.ai/news/mistral-nemo/,unknown,text; text,"The model underwent an advanced fine-tuning and alignment phase. Its performance was evaluated using GPT4o as a judge on official references. It was compared to recent open-source pre-trained models Gemma 2 9B, Llama 3 8B regarding multilingual performance and coding accuracy. Tekken tokenizer's compression ability was compared with previous tokenizers like SentencePiece and the Llama 3 tokenizer.",12B parameters,[],Unknown,Unknown,"NVIDIA hardware, specifics unknown",The model underwent an advanced fine-tuning and alignment phase. 
Various measures such as accuracy comparisons with other models and instruction-tuning were implemented to ensure its quality.,open,Apache 2.0,"The model can be used for multilingual applications, understanding and generating natural language as well as source code, handling multi-turn conversations, and providing more precise instruction following.",Unknown,Unknown,"Problems should be reported to the Mistral AI team, though the specific method of reporting is unknown.",,,,,,,,,, -model,Codestral Mamba,Mistral AI,"Codestral Mamba is a Mamba2 language model that is specialized in code generation. It has a theoretical ability to model sequences of infinite length and offers linear time inference. This makes it effective for extensive user engagement and is especially practical for code productivity use cases. Codestral Mamba can be deployed using the mistral-inference SDK or through TensorRT-LLM, and users can download the raw weights from HuggingFace.",2024-07-16,https://mistral.ai/news/codestral-mamba/,unknown,text; text,"The model has been tested for in-context retrieval capabilities up to 256k tokens. It has been created with advanced code and reasoning capabilities, which enables it to perform on par with SOTA transformer-based models.",7.3B parameters,[],Unknown,Unknown,Unknown,Unknown,open,Apache 2.0,The model is intended for code generation and can be utilized as a local code assistant.,Unknown,Unknown,Problems with the model can be reported through the organization's website.,,,,,,,,,, -model,MathΣtral,Mistral AI,"MathΣtral is a 7B model designed for math reasoning and scientific discovery. It achieves state-of-the-art reasoning capacities in its size category across various industry-standard benchmarks. This model stands on the shoulders of Mistral 7B and specializes in STEM subjects. It is designed to assist efforts in advanced mathematical problems requiring complex, multi-step logical reasoning. It particularly achieves 56.6% on MATH and 63.47% on MMLU.",2024-07-16,https://mistral.ai/news/mathstral/,unknown,text; text,The model's performance has been evaluated on the MATH and MMLU industry-standard benchmarks. It scored notably higher on both these tests than the base model Mistral 7B.,7B parameters,['Mistral 7B'],Unknown,Unknown,Unknown,This model has been fine-tuned from a base model and its inference and performance have been tested on several industry benchmarks.,open,Apache 2.0,"The model is intended for use in solving advanced mathematical problems requiring complex, multi-step logical reasoning or any math-related STEM subjects challenges.",Unknown,Unknown,Feedback is likely expected to be given through the HuggingFace platform where the model's weights are hosted or directly to the Mistral AI team.,,,,,,,,,, application,Character,Character AI,Character allows users to converse with various chatbot personas.,2022-09-16,https://beta.character.ai/,,,,,[],,,,,limited,unknown,,,,,,AI-generated chat conversations,https://beta.character.ai/tos,unknown,unknown,unknown,,,, application,AI Dungeon,Latitude,"AI Dungeon is a single-player text adventure game that uses AI to generate content. 
",2019-12-17,https://play.aidungeon.io,,,,,['OpenAI API'],,,,,limited,custom,,,,,,,https://play.aidungeon.io/main/termsOfService,,,,,,, @@ -807,7 +783,6 @@ model,GLM-130B,Tsinghua University,GLM-130B is a bidirectional language model tr model,CogVLM,"Zhipu AI, Tsinghua University",CogVLM is a powerful open-source visual language foundation model,2023-11-06,https://arxiv.org/pdf/2311.03079.pdf,,"image, text; text",Evaluated on image captioning and visual question answering benchmarks.,17B parameters (dense),"['Vicuna', 'CLIP']",unknown,4096 A100 days,unknown,,open,custom,Future multimodal research,,,,,,,,,,,,, model,UltraLM,Tsinghua University,UltraLM is a series of chat language models trained on UltraChat.,2023-06-27,https://github.com/thunlp/UltraChat#UltraLM,https://huggingface.co/openbmb/UltraLM-13b,text; text,Evaluated on AlpacaEval Leaderboard benchmarks.,13B parameters (dense),['UltraChat'],unknown,unknown,unknown,,open,LLaMA 2,,,unknown,https://huggingface.co/openbmb/UltraLM-13b/discussions,,,,,,,,,, dataset,UltraChat,Tsinghua University,"UltraChat is an open-source, large-scale, and multi-round dialogue data powered by Turbo APIs.",2023-04-20,https://github.com/thunlp/UltraChat,,text,UltraLM evaluated off of UltraChat is evaluated on standard LLM benchmarks.,unknown,[],,,,,open,MIT,,,unknown,https://huggingface.co/datasets/stingning/ultrachat/discussions,,,,,,,https://huggingface.co/datasets/stingning/ultrachat,[],"Dialogue data of questions about the world, writing and creation tasks, and questions on existing materials.", -model,EXAONE 3.0 Instruction Tuned Language Model,LG AI Research,EXAONE 3.0 is an instruction-tuned large language model developed by LG AI Research. It demonstrates notably robust performance across a range of tasks and benchmarks. It has been fine-tuned to be capable of complex reasoning and has a particular proficiency in Korean. The released 7.8B parameter model is designed to promote open research and innovation.,2024-09-08,https://arxiv.org/pdf/2408.03541,unknown,text; text,The model was evaluated extensively across a wide range of public and in-house benchmarks. The comparative analysis showed that the performance of EXAONE 3.0 was competitive in English and excellent in Korean compared to other large language models of a similar size.,7.8B parameters (dense),['MeCab'],Unknown,Unknown,Unknown,"Extensive pre-training on a diverse dataset, and advanced post-training techniques were employed to enhance instruction-following capabilities. The model was also trained to fully comply with data handling standards.",open,Unknown,"The model was intended for non-commercial and research purposes. The capabilities of the model allow for use cases that involve advanced AI and language processing tasks, particularly in fields requiring proficiency in English and Korean.",Commercial use is not intended for this model. Its intended use is for non-commercial research and innovation.,Unknown,Unknown,,,,,,,,,, application,DuckAssist,DuckDuckGo,The first Instant Answer in DuckDuckGo search results to use natural language technology to generate answers to search queries using Wikipedia and other related sources,2023-03-08,https://spreadprivacy.com/duckassist-launch/,,,,,['Anthropic API'],,,,,open,unknown,,,,,,,,,,,,,, application,My AI for Snapchat,Snap,"My AI offers Snapchatters a friendly, customizable chatbot at their fingertips that offers recommendations, and can even write a haiku for friends in seconds. 
Snapchat, where communication and messaging is a daily behavior, has 750 million monthly Snapchatters.",2023-03-01,https://openai.com/blog/introducing-chatgpt-and-whisper-apis,,,,,['ChatGPT API'],,,,,open,custom,,,,,,,https://snap.com/terms,,,,,,, model,InternVideo,Shanghai AI Laboratory,,2022-12-06,https://arxiv.org/pdf/2212.03191.pdf,,"text, video; video",,1.3B parameters (dense),"['Kinetics-400', 'WebVid-2M', 'WebVid-10M', 'HowTo100M', 'AVA', 'Something-Something-v2', 'Kinetics-710']",,,,,open,Apache 2.0,,,,,,,,,,,,,,