2024 updates for index #259

Open
wants to merge 3 commits into base: main
Changes from 1 commit
39 changes: 39 additions & 0 deletions assets/amazon.yaml
Expand Up @@ -86,3 +86,42 @@
prohibited_uses: ''
monitoring: ''
feedback: https://github.com/amazon-science/chronos-forecasting/discussions
- type: model
name: Amazon Nova
organization: Amazon Web Services (AWS)
description: A new generation of state-of-the-art foundation models (FMs) that
deliver frontier intelligence and industry leading price performance, available
exclusively in Amazon Bedrock. You can use Amazon Nova to lower costs and latency
for almost any generative AI task.
created_date: 2024-12-03
url: https://aws.amazon.com/blogs/aws/introducing-amazon-nova-frontier-intelligence-and-industry-leading-price-performance/
model_card: unknown
modality:
explanation: Amazon Nova understanding models accept text, image, or video inputs
to generate text output. Amazon creative content generation models accept
text and image inputs to generate image or video output.
value: text, image, video; text, image, video
analysis: Amazon Nova Pro is capable of processing up to 300K input tokens and
sets new standards in multimodal intelligence and agentic workflows that require
calling APIs and tools to complete complex workflows. It achieves state-of-the-art
performance on key benchmarks including visual question answering (TextVQA)
and video understanding (VATEX).
size: unknown
dependencies: []
training_emissions: unknown
training_time: unknown
training_hardware: unknown
quality_control: All Amazon Nova models include built-in safety controls and creative
content generation models include watermarking capabilities to promote responsible
AI use.
access:
explanation: available exclusively in Amazon Bedrock
value: limited
license: unknown
intended_uses: You can build on Amazon Nova to analyze complex documents and videos,
understand charts and diagrams, generate engaging video content, and build sophisticated
AI agents, from across a range of intelligence classes optimized for enterprise
workloads.
prohibited_uses: unknown
monitoring: unknown
feedback: unknown
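Since the entry above notes that Nova is available exclusively through Amazon Bedrock, a minimal sketch of assembling a call may help; the model ID `amazon.nova-pro-v1:0`, the Converse request shape, and the boto3 call are assumptions based on typical Bedrock usage, not details from this entry.

```python
# Hedged sketch: building a Bedrock Converse API request for Amazon Nova.
# The model ID and request shape below are assumptions, not from the entry.

def build_converse_request(prompt: str, max_tokens: int = 512) -> dict:
    """Assemble keyword arguments for a bedrock-runtime converse() call."""
    return {
        "modelId": "amazon.nova-pro-v1:0",  # hypothetical Nova Pro model ID
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "inferenceConfig": {"maxTokens": max_tokens},
    }

request = build_converse_request("Summarize the attached earnings chart.")
# With AWS credentials configured, the call itself would look roughly like:
#   import boto3
#   client = boto3.client("bedrock-runtime")
#   response = client.converse(**request)
```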
35 changes: 35 additions & 0 deletions assets/anthropic.yaml
Expand Up @@ -637,3 +637,38 @@
integrated to ensure robustness of evaluations.
feedback: Feedback on Claude 3.5 Sonnet can be submitted directly in-product to
inform the development roadmap and improve user experience.
- type: model
name: Claude 3.5 Haiku
organization: Anthropic
description: Claude 3.5 Haiku is Anthropic's fastest model, delivering advanced
coding, tool use, and reasoning capability, surpassing the previous Claude 3
Opus in intelligence benchmarks. It is designed for critical use cases where
low latency is essential, such as user-facing chatbots and code completions.
created_date: 2024-10-22
url: https://www.anthropic.com/claude/haiku
model_card: unknown
modality:
explanation: Claude 3.5 Haiku is available...initially as a text-only model
and with image input to follow.
value: text; text
analysis: Claude 3.5 Haiku offers strong performance and speed across a variety
of coding, tool use, and reasoning tasks. Also, it has been tested in extensive
safety evaluations and exceeded expectations in reasoning and code generation
tasks.
size: unknown
dependencies: []
training_emissions: unknown
training_time: unknown
training_hardware: unknown
quality_control: During Claude 3.5 Haiku’s development, we conducted extensive
safety evaluations spanning multiple languages and policy domains.
access:
explanation: Claude 3.5 Haiku is available across Claude.ai, our first-party
API, Amazon Bedrock, and Google Cloud’s Vertex AI.
value: limited
license: unknown
intended_uses: Critical use cases where low latency matters, like user-facing
chatbots and code completions.
prohibited_uses: unknown
monitoring: unknown
feedback: unknown
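The Haiku entry above centers on low-latency use cases like code completion; a minimal sketch of a Messages API payload for it follows. The dated model name `claude-3-5-haiku-20241022` and the request shape are assumptions from common Anthropic API usage, not taken from this entry.

```python
# Hedged sketch: a Messages API payload for Claude 3.5 Haiku.
# The dated model name is an assumption, not stated in the entry.

def build_haiku_request(user_text: str, max_tokens: int = 1024) -> dict:
    """Assemble keyword arguments for an anthropic messages.create() call."""
    return {
        "model": "claude-3-5-haiku-20241022",  # assumed model identifier
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": user_text}],
    }

payload = build_haiku_request("Complete this function: def fib(n):")
# With the anthropic SDK installed and an API key set, roughly:
#   import anthropic
#   client = anthropic.Anthropic()
#   message = client.messages.create(**payload)
```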
43 changes: 43 additions & 0 deletions assets/genmo.yaml
@@ -0,0 +1,43 @@
---
- type: model
name: Mochi 1
organization: Genmo
description: Mochi 1 is an open-source video generation model designed to produce
high-fidelity motion and strong prompt adherence in generated videos, setting
a new standard for open video generation systems.
created_date: 2025-01-14
url: https://www.genmo.ai/blog
model_card: unknown
modality:
explanation: Mochi 1 generates smooth videos... Measures how accurately generated
videos follow the provided textual instructions
value: text; video
analysis: Mochi 1 sets a new best-in-class standard for open-source video generation.
It also performs very competitively with the leading closed models... We benchmark
prompt adherence with an automated metric using a vision language model as a
judge following the protocol in OpenAI DALL-E 3. We evaluate generated videos
using Gemini-1.5-Pro-002.
size:
explanation: featuring a 10 billion parameter diffusion model
value: 10B parameters
dependencies: [DDPM, DreamFusion, Emu Video, T5-XXL]
training_emissions: unknown
training_time: unknown
training_hardware: unknown
quality_control: Robust safety moderation protocols in the playground ensure that
all video generations remain safe and aligned with ethical guidelines.
access:
explanation: open state-of-the-art video generation model... The weights and
architecture for Mochi 1 are open
value: open
license:
explanation: We're releasing the model under a permissive Apache 2.0 license.
value: Apache 2.0
intended_uses: Advance the field of video generation and explore new methodologies.
Build innovative applications in entertainment, advertising, education, and
more. Empower artists and creators to bring their visions to life with AI-generated
videos. Generate synthetic data for training AI models in robotics, autonomous
vehicles and virtual environments.
prohibited_uses: unknown
monitoring: unknown
feedback: unknown
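The analysis field above describes benchmarking prompt adherence with a vision-language model as an automated judge. At its core that evaluation reduces to aggregating per-video verdicts into a score; the sketch below illustrates that aggregation with hypothetical verdict labels (the `"follows"`/`"violates"` labels are illustrative assumptions, not Genmo's actual protocol).

```python
# Hedged sketch: aggregating VLM-judge verdicts into a prompt-adherence
# score, loosely following the automated evaluation the entry describes.
# The verdict labels are hypothetical.

def adherence_score(verdicts: list[str]) -> float:
    """Fraction of generated videos the judge marked as following the prompt."""
    if not verdicts:
        return 0.0
    return sum(1 for v in verdicts if v == "follows") / len(verdicts)

# Hypothetical judge outputs for four generated videos:
score = adherence_score(["follows", "follows", "violates", "follows"])  # 0.75
```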
66 changes: 66 additions & 0 deletions assets/google.yaml
Expand Up @@ -1908,3 +1908,69 @@
monitoring: unknown
feedback: Encourages developer feedback to inform model improvements and future
updates.
- type: model
name: Veo 2
organization: Google DeepMind
description: Veo 2 is a state-of-the-art video generation model that creates videos
with realistic motion and high-quality output, up to 4K, with extensive camera
controls. It simulates real-world physics and offers advanced motion capabilities
with enhanced realism and fidelity.
created_date: 2024-12-16
url: https://deepmind.google/technologies/veo/veo-2/
model_card: unknown
modality:
explanation: Our state-of-the-art video generation model ... text-to-video model
Veo 2
value: text; video
analysis: Veo 2 outperforms other leading video generation models, based on human
evaluations of its performance.
size: unknown
dependencies: []
training_emissions: unknown
training_time: unknown
training_hardware: unknown
quality_control: Veo 2 includes features that enhance realism, fidelity, detail,
and artifact reduction to ensure high-quality output.
access: ''
license: unknown
intended_uses: Creating high-quality videos with realistic motion, different styles,
camera controls, shot styles, angles, and movements.
prohibited_uses: unknown
monitoring: unknown
feedback: unknown

- type: model
name: Gemini 2.0
organization: Google DeepMind
description: Google DeepMind introduces Gemini 2.0, a new AI model designed for
the 'agentic era.'
created_date: 2024-12-11
url: https://blog.google/technology/google-deepmind/google-gemini-ai-update-december-2024/#ceo-message
model_card: unknown
modality:
explanation: The first model built to be natively multimodal, Gemini 1.0 and
1.5 drove big advances with multimodality and long context to understand information
across text, video, images, audio and code...
value: text, video, image, audio, code; text, image, audio
analysis: unknown
size: unknown
dependencies: []
training_emissions: unknown
training_time: unknown
training_hardware:
explanation: It’s built on custom hardware like Trillium, our sixth-generation
TPUs.
value: custom hardware like Trillium, our sixth-generation TPUs
quality_control: Google is committed to building AI responsibly, with safety and
security as key priorities.
access:
explanation: Gemini 2.0 Flash is available to developers and trusted testers,
with wider availability planned for early next year.
value: limited
license: unknown
intended_uses: Develop more agentic models, meaning they can understand more about
the world around you, think multiple steps ahead, and take action on your behalf,
with your supervision.
prohibited_uses: unknown
monitoring: unknown
feedback: unknown
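The Gemini 2.0 entry above notes that Gemini 2.0 Flash is available to developers as a preview. A minimal sketch of a generateContent-style request follows; the model name `gemini-2.0-flash-exp`, the request shape, and the SDK call are assumptions from typical Gemini API usage, not from this entry.

```python
# Hedged sketch: a generateContent-style request for Gemini 2.0 Flash.
# The model name and request shape are assumptions, not from the entry.

def build_generate_request(parts: list[dict], temperature: float = 0.7) -> dict:
    """Assemble a hypothetical generateContent request body."""
    return {
        "model": "gemini-2.0-flash-exp",  # assumed preview model name
        "contents": [{"role": "user", "parts": parts}],
        "generationConfig": {"temperature": temperature},
    }

request = build_generate_request([{"text": "Plan the next three steps."}])
# With the google-generativeai SDK and an API key, roughly:
#   import google.generativeai as genai
#   model = genai.GenerativeModel("gemini-2.0-flash-exp")
#   response = model.generate_content("Plan the next three steps.")
```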
44 changes: 44 additions & 0 deletions assets/ibm.yaml
Expand Up @@ -75,3 +75,47 @@
prohibited_uses: ''
monitoring: ''
feedback: ''
- type: model
name: IBM Granite 3.0
organization: IBM
description: IBM Granite 3.0 models deliver state-of-the-art performance relative
to model size while maximizing safety, speed and cost-efficiency for enterprise
use cases.
created_date: 2024-10-21
url: https://www.ibm.com/new/ibm-granite-3-0-open-state-of-the-art-enterprise-models
model_card: unknown
modality:
explanation: IBM Granite 3.0 8B Instruct model for classic natural language
use cases including text generation, classification, summarization, entity
extraction and customer service chatbots
value: text; text
analysis: Granite 3.0 8B Instruct matches leading similarly-sized open models
on academic benchmarks while outperforming those peers on benchmarks for enterprise
tasks and safety.
size:
explanation: 'Dense, general purpose LLMs: Granite-3.0-8B-Instruct'
value: 8B parameters
dependencies: [Hugging Face’s OpenLLM Leaderboard v2]
training_emissions: unknown
training_time: unknown
training_hardware: unknown
quality_control: The entire Granite family of models are trained on carefully
curated enterprise datasets, filtered for objectionable content with critical
concerns like governance, risk, privacy and bias mitigation in mind
access:
explanation: In keeping with IBM’s strong historical commitment to open source,
all Granite models are released under the permissive Apache 2.0 license
value: open
license:
explanation: In keeping with IBM’s strong historical commitment to open source,
all Granite models are released under the permissive Apache 2.0 license
value: Apache 2.0
intended_uses: classic natural language use cases including text generation, classification,
summarization, entity extraction and customer service chatbots, programming
language use cases such as code generation, code explanation and code editing,
and for agentic use cases requiring tool calling
prohibited_uses: unknown
monitoring: IBM provides a detailed disclosure of training data sets and methodologies
in the Granite 3.0 technical paper, reaffirming IBM’s dedication to building
transparency, safety and trust in AI products.
feedback: unknown
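Since the Granite 3.0 entry above is released under Apache 2.0 with open weights, it would typically be loaded through Hugging Face; a minimal sketch follows. The repo id `ibm-granite/granite-3.0-8b-instruct` and the transformers calls are assumptions from common usage, not from this entry, and the heavy model load is left commented out.

```python
# Hedged sketch: preparing a chat for IBM Granite 3.0 8B Instruct.
# The Hugging Face repo id below is an assumption; the messages format
# follows the usual chat convention.

def build_chat(system_prompt: str, user_prompt: str) -> list[dict]:
    """Build a standard system+user chat message list."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

messages = build_chat(
    "You are an enterprise assistant.",
    "Extract the invoice number from this text: Invoice #4417.",
)
# With transformers installed, roughly:
#   from transformers import AutoTokenizer
#   tok = AutoTokenizer.from_pretrained("ibm-granite/granite-3.0-8b-instruct")
#   prompt = tok.apply_chat_template(messages, tokenize=False,
#                                    add_generation_prompt=True)
```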
35 changes: 35 additions & 0 deletions assets/microsoft.yaml
Expand Up @@ -1032,3 +1032,38 @@
use case, particularly for high risk scenarios.
monitoring: Unknown
feedback: Unknown
- type: model
name: Phi-4
organization: Microsoft
description: The latest small language model in the Phi family, offering high-quality
results at a small size (14B parameters).
created_date: 2024-12-13
url: https://techcommunity.microsoft.com/blog/aiplatformblog/introducing-phi-4-microsoft%E2%80%99s-newest-small-language-model-specializing-in-comple/4357090
model_card: unknown
modality:
explanation: Today we are introducing Phi-4, our 14B parameter state-of-the-art
small language model (SLM) that excels at complex reasoning in areas such
as math, in addition to conventional language processing.
value: text; text
analysis: Phi-4 outperforms comparable and larger models on math related reasoning.
size:
explanation: a small size (14B parameters).
value: 14B parameters
dependencies: []
training_emissions: unknown
training_time: unknown
training_hardware: unknown
quality_control: Building AI solutions responsibly is at the core of AI development
at Microsoft. We have made our robust responsible AI capabilities available
to customers building with Phi models.
access:
explanation: Phi-4 is available on Azure AI Foundry and on Hugging Face.
value: open
license: unknown
intended_uses: Specialized in complex reasoning, particularly good at math problems
and high-quality language processing.
prohibited_uses: unknown
monitoring: Azure AI evaluations in AI Foundry enable developers to iteratively
assess the quality and safety of models and applications using built-in and
custom metrics to inform mitigations.
feedback: unknown
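The analysis field above says Phi-4 outperforms comparable models on math-related reasoning benchmarks. Such claims are usually backed by exact-match scoring of final answers; the sketch below shows what a minimal grader of that kind might look like. The normalization rules here are illustrative assumptions, not any benchmark's official ones.

```python
# Hedged sketch: a minimal exact-match grader of the kind used to score
# models like Phi-4 on math benchmarks. Normalization rules are illustrative.

def normalize(answer: str) -> str:
    """Strip whitespace, a trailing period, and thousands separators."""
    return answer.strip().rstrip(".").replace(",", "").lower()

def exact_match(predicted: str, reference: str) -> bool:
    """True when the normalized prediction equals the normalized reference."""
    return normalize(predicted) == normalize(reference)

result = exact_match(" 1,024. ", "1024")  # True
```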
76 changes: 76 additions & 0 deletions assets/mistral.yaml
Expand Up @@ -192,3 +192,79 @@
monitoring: Unknown
feedback: Feedback is likely expected to be given through the HuggingFace platform
where the model's weights are hosted or directly to the Mistral AI team.
- type: model
name: Pixtral Large
organization: Mistral AI
description: Pixtral Large is the second model in our multimodal family and demonstrates
frontier-level image understanding. Particularly, the model is able to understand
documents, charts and natural images, while maintaining the leading text-only
understanding of Mistral Large 2.
created_date: 2024-11-18
url: https://mistral.ai/news/pixtral-large/
model_card: unknown
modality:
explanation: Pixtral Large is the second model in our multimodal family and
demonstrates frontier-level image understanding.
value: text, image; text
analysis: We evaluate Pixtral Large against frontier models on a set of standard
multimodal benchmarks, through a common testing harness.
size:
explanation: Today we announce Pixtral Large, a 124B open-weights multimodal
model.
value: 124B parameters
dependencies: [Mistral Large 2]
training_emissions: unknown
training_time: unknown
training_hardware: unknown
quality_control: unknown
access:
explanation: The model is available under the Mistral Research License (MRL)
for research and educational use; and the Mistral Commercial License for experimentation,
testing, and production for commercial purposes.
value: open
license:
explanation: The model is available under the Mistral Research License (MRL)
for research and educational use; and the Mistral Commercial License for experimentation,
testing, and production for commercial purposes.
value: Mistral Research License (MRL), Mistral Commercial License
intended_uses: RAG and agentic workflows, making it a suitable choice for enterprise
use cases such as knowledge exploration and sharing, semantic understanding
of documents, task automation, and improved customer experiences.
prohibited_uses: unknown
monitoring: unknown
feedback: unknown
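The Pixtral Large entry above highlights understanding documents, charts, and natural images alongside text. A minimal sketch of a multimodal chat request follows; the model name `pixtral-large-latest`, the content-part shapes, and the example URL are assumptions from common multimodal chat APIs, not from this entry.

```python
# Hedged sketch: a chat-completions style payload pairing text with an
# image for Pixtral Large. Model name and content shapes are assumptions.

def build_vision_request(question: str, image_url: str) -> dict:
    """Assemble a hypothetical multimodal chat request."""
    return {
        "model": "pixtral-large-latest",  # assumed model name
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url", "image_url": image_url},
            ],
        }],
    }

request = build_vision_request(
    "What trend does this chart show?",
    "https://example.com/chart.png",  # placeholder image URL
)
```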

- type: model
name: Codestral 25.01
organization: Mistral AI
description: Lightweight, fast, and proficient in over 80 programming languages,
Codestral is optimized for low-latency, high-frequency use cases and supports
tasks such as fill-in-the-middle (FIM), code correction and test generation.
created_date: 2025-01-13
url: https://mistral.ai/news/codestral-2501/
model_card: unknown
modality:
explanation: Codestral accepts and produces code and natural-language text,
supporting tasks such as fill-in-the-middle (FIM), code correction and test
generation.
value: text; text
analysis: We have benchmarked the new Codestral against the leading sub-100B
parameter coding models that are widely considered to be best-in-class for FIM
tasks.
size:
explanation: Codestral-2501 is listed with a 256k context window; its parameter
count is not disclosed.
value: unknown
dependencies: []
training_emissions: unknown
training_time: unknown
training_hardware: unknown
quality_control: unknown
access:
explanation: The API is also available on Google Cloud’s Vertex AI, in private
preview on Azure AI Foundry, and coming soon to Amazon Bedrock.
value: limited
license: unknown
intended_uses: Highly capable coding companion, regularly boosting productivity
several times over.
prohibited_uses: unknown
monitoring: unknown
feedback: Users are invited to try Codestral on Continue.dev with VS Code or
JetBrains and share their experience.
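The Codestral entry above names fill-in-the-middle (FIM) as a core task: the model receives the code before and after the cursor and fills the gap. A minimal sketch of such a request follows; the field names (`prompt`/`suffix`) and the model name `codestral-latest` are assumptions from typical FIM completion APIs, not from this entry.

```python
# Hedged sketch: a fill-in-the-middle (FIM) request for Codestral.
# Field names and the model name are assumptions, not from the entry.

def build_fim_request(prompt: str, suffix: str, max_tokens: int = 64) -> dict:
    """Assemble a hypothetical FIM completion request."""
    return {
        "model": "codestral-latest",  # assumed model name
        "prompt": prompt,             # code before the cursor
        "suffix": suffix,             # code after the cursor
        "max_tokens": max_tokens,
    }

request = build_fim_request(
    "def add(a, b):\n    return ",
    "\n\nprint(add(2, 3))",
)
```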