2024 updates for index #259

Open · wants to merge 3 commits into base: main
73 changes: 73 additions & 0 deletions assets/amazon.yaml
@@ -86,3 +86,76 @@
prohibited_uses: ''
monitoring: ''
feedback: https://github.com/amazon-science/chronos-forecasting/discussions
- type: model
name: Amazon Nova (Understanding)
organization: Amazon Web Services (AWS)
description: A new generation of state-of-the-art foundation models (FMs) that
deliver frontier intelligence and industry-leading price performance, available
exclusively in Amazon Bedrock. Amazon Nova understanding models excel in Retrieval-Augmented
Generation (RAG), function calling, and agentic applications.
created_date: 2024-12-03
url: https://aws.amazon.com/blogs/aws/introducing-amazon-nova-frontier-intelligence-and-industry-leading-price-performance/
model_card: unknown
modality:
explanation: Amazon Nova understanding models accept text, image, or video inputs
to generate text output.
value: text, image, video; text
analysis: Amazon Nova Pro is capable of processing up to 300K input tokens and
sets new standards in multimodal intelligence and agentic workflows that require
calling APIs and tools to complete complex tasks. It achieves state-of-the-art
performance on key benchmarks including visual question answering (TextVQA) and
video understanding (VATEX).
size: unknown
dependencies: []
training_emissions: unknown
training_time: unknown
training_hardware: unknown
quality_control: All Amazon Nova models include built-in safety controls and creative
content generation models include watermarking capabilities to promote responsible
AI use.
access:
explanation: available exclusively in Amazon Bedrock
value: limited
license: unknown
intended_uses: You can build on Amazon Nova to analyze complex documents and videos,
understand charts and diagrams, generate engaging video content, and build sophisticated
AI agents, across a range of intelligence classes optimized for enterprise workloads.
prohibited_uses: unknown
monitoring: unknown
feedback: unknown
- type: model
name: Amazon Nova (Creative Content Generation)
organization: Amazon Web Services (AWS)
description: A new generation of state-of-the-art foundation models (FMs) that
deliver frontier intelligence and industry-leading price performance, available
exclusively in Amazon Bedrock.
created_date: 2024-12-03
url: https://aws.amazon.com/blogs/aws/introducing-amazon-nova-frontier-intelligence-and-industry-leading-price-performance/
model_card: unknown
modality:
explanation: Amazon creative content generation models accept text and image
inputs to generate image or video output.
value: text, image; image, video
analysis: Amazon Nova Canvas excels on human evaluations and key benchmarks such
as text-to-image faithfulness evaluation with question answering (TIFA) and
ImageReward.
size: unknown
dependencies: []
training_emissions: unknown
training_time: unknown
training_hardware: unknown
quality_control: All Amazon Nova models include built-in safety controls and creative
content generation models include watermarking capabilities to promote responsible
AI use.
access:
explanation: available exclusively in Amazon Bedrock
value: limited
license: unknown
intended_uses: You can build on Amazon Nova to analyze complex documents and videos,
understand charts and diagrams, generate engaging video content, and build sophisticated
AI agents, across a range of intelligence classes optimized for enterprise workloads.
prohibited_uses: unknown
monitoring: unknown
feedback: unknown
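The compound `modality` values used throughout these entries (for example `text, image, video; text`) list input modalities before the semicolon and output modalities after it. A minimal Python sketch of splitting such a value into the two lists; the `parse_modality` helper is ours for illustration, not part of the index tooling:

```python
def parse_modality(value: str) -> tuple[list[str], list[str]]:
    """Split a compound modality string like 'text, image, video; text'
    into (input modalities, output modalities)."""
    def split(side: str) -> list[str]:
        # Trim whitespace around each comma-separated modality.
        return [m.strip() for m in side.split(",") if m.strip()]

    inputs, _, outputs = value.partition(";")
    return split(inputs), split(outputs)

print(parse_modality("text, image, video; text"))
# → (['text', 'image', 'video'], ['text'])
```

A value with no semicolon (e.g. a bare `unknown`) parses as inputs only, with an empty output list.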
47 changes: 42 additions & 5 deletions assets/anthropic.yaml
@@ -608,15 +608,17 @@
speed of its predecessor, Claude 3 Opus, and is designed to tackle tasks like
context-sensitive customer support, orchestrating multi-step workflows, interpreting
charts and graphs, and transcribing text from images.
created_date: 2024-06-21
url: https://www.anthropic.com/news/claude-3-5-sonnet
created_date:
explanation: The upgraded Claude 3.5 Sonnet was released on October 22, 2024;
the model was initially released on June 20, 2024.
value: 2024-10-22
url: https://www.anthropic.com/news/3-5-models-and-computer-use
model_card: unknown
modality: text; image, text
analysis: The model has been evaluated on a range of tests including graduate-level
reasoning (GPQA), undergraduate-level knowledge (MMLU), coding proficiency (HumanEval),
and standard vision benchmarks. In an internal agentic coding evaluation, Claude
3.5 Sonnet solved 64% of problems, outperforming the previous version, Claude
3 Opus, which solved 38%.
and standard vision benchmarks. Claude 3.5 Sonnet demonstrates state-of-the-art
performance on most benchmarks.
size: Unknown
dependencies: []
training_emissions: Unknown
@@ -637,3 +639,38 @@
integrated to ensure robustness of evaluations.
feedback: Feedback on Claude 3.5 Sonnet can be submitted directly in-product to
inform the development roadmap and improve user experience.
- type: model
name: Claude 3.5 Haiku
organization: Anthropic
description: Claude 3.5 Haiku is Anthropic's fastest model, delivering advanced
coding, tool use, and reasoning capability, surpassing the previous Claude 3
Opus in intelligence benchmarks. It is designed for critical use cases where
low latency is essential, such as user-facing chatbots and code completions.
created_date: 2024-10-22
url: https://www.anthropic.com/claude/haiku
model_card: unknown
modality:
explanation: Claude 3.5 Haiku is available...initially as a text-only model
and with image input to follow.
value: text; text
analysis: Claude 3.5 Haiku offers strong performance and speed across a variety
of coding, tool use, and reasoning tasks. Also, it has been tested in extensive
safety evaluations and exceeded expectations in reasoning and code generation
tasks.
size: unknown
dependencies: []
training_emissions: unknown
training_time: unknown
training_hardware: unknown
quality_control: During Claude 3.5 Haiku’s development, we conducted extensive
safety evaluations spanning multiple languages and policy domains.
access:
explanation: Claude 3.5 Haiku is available across Claude.ai, our first-party
API, Amazon Bedrock, and Google Cloud’s Vertex AI.
value: limited
license: unknown
intended_uses: Critical use cases where low latency matters, like user-facing
chatbots and code completions.
prohibited_uses: unknown
monitoring: unknown
feedback: unknown
22 changes: 22 additions & 0 deletions assets/cohere.yaml
@@ -592,3 +592,25 @@
prohibited_uses: unknown
monitoring: unknown
feedback: https://huggingface.co/CohereForAI/aya-23-35B/discussions
- type: model
name: Command R+
organization: Cohere
description: Command R+ is a state-of-the-art RAG-optimized model designed to
tackle enterprise-grade workloads, and is available first on Microsoft Azure.
created_date: 2024-04-04
url: https://cohere.com/blog/command-r-plus-microsoft-azure
model_card: unknown
modality: unknown
analysis: unknown
size: unknown
dependencies: []
training_emissions: unknown
training_time: unknown
training_hardware: unknown
quality_control: unknown
access: ''
license: unknown
intended_uses: unknown
prohibited_uses: unknown
monitoring: unknown
feedback: unknown
43 changes: 43 additions & 0 deletions assets/genmo.yaml
@@ -0,0 +1,43 @@
---
- type: model
name: Mochi 1
organization: Genmo
description: Mochi 1 is an open-source video generation model designed to produce
high-fidelity motion and strong prompt adherence in generated videos, setting
a new standard for open video generation systems.
created_date: 2025-01-14
url: https://www.genmo.ai/blog
model_card: unknown
modality:
explanation: Mochi 1 generates smooth videos... Measures how accurately generated
videos follow the provided textual instructions
value: text; video
analysis: Mochi 1 sets a new best-in-class standard for open-source video generation.
It also performs very competitively with the leading closed models... We benchmark
prompt adherence with an automated metric using a vision language model as a
judge following the protocol in OpenAI DALL-E 3. We evaluate generated videos
using Gemini-1.5-Pro-002.
size:
explanation: featuring a 10 billion parameter diffusion model
value: 10B parameters
dependencies: [DDPM, DreamFusion, Emu Video, T5-XXL]
training_emissions: unknown
training_time: unknown
training_hardware: unknown
quality_control: Genmo employs robust safety moderation protocols in the playground
to ensure that all video generations remain safe and aligned with ethical guidelines.
access:
explanation: open state-of-the-art video generation model... The weights and
architecture for Mochi 1 are open
value: open
license:
explanation: We're releasing the model under a permissive Apache 2.0 license.
value: Apache 2.0
intended_uses: Advance the field of video generation and explore new methodologies.
Build innovative applications in entertainment, advertising, education, and
more. Empower artists and creators to bring their visions to life with AI-generated
videos. Generate synthetic data for training AI models in robotics, autonomous
vehicles and virtual environments.
prohibited_uses: unknown
monitoring: unknown
feedback: unknown
66 changes: 66 additions & 0 deletions assets/google.yaml
@@ -1908,3 +1908,69 @@
monitoring: unknown
feedback: Encourages developer feedback to inform model improvements and future
updates.
- type: model
name: Veo 2
organization: Google DeepMind
description: Veo 2 is a state-of-the-art video generation model that creates videos
with realistic motion and high-quality output, up to 4K, with extensive camera
controls. It simulates real-world physics and offers advanced motion capabilities
with enhanced realism and fidelity.
created_date: 2024-12-16
url: https://deepmind.google/technologies/veo/veo-2/
model_card: unknown
modality:
explanation: Our state-of-the-art video generation model ... text-to-video model
Veo 2
value: text; video
analysis: Veo 2 outperforms other leading video generation models, based on human
evaluations of its performance.
size: unknown
dependencies: []
training_emissions: unknown
training_time: unknown
training_hardware: unknown
quality_control: Veo 2 includes features that enhance realism, fidelity, detail,
and artifact reduction to ensure high-quality output.
access: limited
license: unknown
intended_uses: Creating high-quality videos with realistic motion, different styles,
camera controls, shot styles, angles, and movements.
prohibited_uses: unknown
monitoring: unknown
feedback: unknown

- type: model
name: Gemini 2.0
organization: Google DeepMind
description: Google DeepMind introduces Gemini 2.0, a new AI model designed for
the 'agentic era.'
created_date: 2024-12-11
url: https://blog.google/technology/google-deepmind/google-gemini-ai-update-december-2024/#ceo-message
model_card: unknown
modality:
explanation: The first model built to be natively multimodal, Gemini 1.0 and
1.5 drove big advances with multimodality and long context to understand information
across text, video, images, audio and code...
value: text, video, image, audio; audio, image, text
analysis: unknown
size: unknown
dependencies: []
training_emissions: unknown
training_time: unknown
training_hardware:
explanation: It’s built on custom hardware like Trillium, our sixth-generation
TPUs.
value: custom hardware like Trillium, our sixth-generation TPUs
quality_control: Google is committed to building AI responsibly, with safety and
security as key priorities.
access:
explanation: Gemini 2.0 Flash is available to developers and trusted testers,
with wider availability planned for early next year.
value: limited
license: unknown
intended_uses: Develop more agentic models, meaning they can understand more about
the world around you, think multiple steps ahead, and take action on your behalf,
with your supervision.
prohibited_uses: unknown
monitoring: unknown
feedback: unknown
42 changes: 42 additions & 0 deletions assets/ibm.yaml
@@ -75,3 +75,45 @@
prohibited_uses: ''
monitoring: ''
feedback: ''
- type: model
name: IBM Granite 3.0
organization: IBM
description: IBM Granite 3.0 models deliver state-of-the-art performance relative
to model size while maximizing safety, speed and cost-efficiency for enterprise
use cases.
created_date: 2024-10-21
url: https://www.ibm.com/new/ibm-granite-3-0-open-state-of-the-art-enterprise-models
model_card: unknown
modality:
explanation: IBM Granite 3.0 8B Instruct model for classic natural language
use cases including text generation, classification, summarization, entity
extraction and customer service chatbots
value: text; text
analysis: Granite 3.0 8B Instruct matches leading similarly-sized open models
on academic benchmarks while outperforming those peers on benchmarks for enterprise
tasks and safety.
size:
explanation: 'Dense, general purpose LLMs: Granite-3.0-8B-Instruct'
value: 8B parameters
dependencies: []
training_emissions: unknown
training_time: unknown
training_hardware: unknown
quality_control: The entire Granite family of models are trained on carefully
curated enterprise datasets, filtered for objectionable content with critical
concerns like governance, risk, privacy and bias mitigation in mind
access:
explanation: In keeping with IBM’s strong historical commitment to open source,
all Granite models are released under the permissive Apache 2.0 license
value: open
license:
explanation: In keeping with IBM’s strong historical commitment to open source,
all Granite models are released under the permissive Apache 2.0 license
value: Apache 2.0
intended_uses: classic natural language use cases including text generation, classification,
summarization, entity extraction and customer service chatbots, programming
language use cases such as code generation, code explanation and code editing,
and for agentic use cases requiring tool calling
prohibited_uses: unknown
monitoring: ''
feedback: unknown
26 changes: 26 additions & 0 deletions assets/inflection.yaml
@@ -93,3 +93,29 @@
prohibited_uses: ''
monitoring: ''
feedback: none
- type: model
name: Inflection 3.0
organization: Inflection AI
description: Inflection 3.0 is the enterprise-grade AI system that powers Inflection
for Enterprise.
created_date: 2024-10-07
url: https://inflection.ai/blog/enterprise
model_card: unknown
modality: unknown
analysis: unknown
size: unknown
dependencies: []
training_emissions: unknown
training_time: unknown
training_hardware: unknown
quality_control: unknown
access:
explanation: Developers can now access Inflection AI’s Large Language Model
through its new commercial API.
value: limited
license: unknown
intended_uses: unknown
prohibited_uses: unknown
monitoring: unknown
feedback: So please drop us a line. We want to keep hearing from enterprises about
how we can help solve their challenges and make AI a reality for their business.
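The `access` field appears in two shapes across these entries: a bare string (`limited`, or an empty string for Command R+) and a mapping with `value` and `explanation` keys. A small sketch, assuming the entries have already been parsed into Python dicts (for example via `yaml.safe_load`), of normalizing the field before filtering; `access_value` is our name for the hypothetical helper:

```python
def access_value(entry: dict) -> str:
    """Return the access level of an asset entry, whether it is stored
    as a bare string or as a mapping with 'value'/'explanation' keys."""
    access = entry.get("access", "unknown")
    if isinstance(access, dict):
        return access.get("value", "unknown")
    # Treat missing or empty strings ('') as unknown.
    return access or "unknown"

models = [
    {"name": "Veo 2", "access": "limited"},
    {"name": "Mochi 1", "access": {"value": "open", "explanation": "weights are open"}},
    {"name": "Command R+", "access": ""},
]
open_models = [m["name"] for m in models if access_value(m) == "open"]
print(open_models)  # → ['Mochi 1']
```

Normalizing like this keeps downstream queries (e.g. counting open-weight models) from breaking on the mixed representations seen in this PR.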