Choosing the right model


Cloud-Based Models

LLM Vision is compatible with multiple providers, each offering different models. Some providers run in the cloud, while others are self-hosted. To see which model best fits your use case, check the figure below, which visualizes the averaged scores of the available cloud-based models. The higher the score, the more accurate the output.

Gemini 2.0 Flash is priced at just $0.175/1M input tokens, yet it outperforms GPT-4o on MMMU with a score of 72.7 versus 69.1.
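
Whichever cloud model you pick, you pass it to an LLM Vision action through the model field. Below is a minimal sketch of an Image Analyzer call using Gemini 2.0 Flash; the provider ID, image path, and prompt are placeholders, and the field names follow the Image Analyzer action described under Usage.

```yaml
# Sketch: Image Analyzer call with a cloud model (Gemini 2.0 Flash).
# Replace the provider ID and image path with values from your own setup.
action: llmvision.image_analyzer
data:
  provider: YOUR_GEMINI_PROVIDER_ID   # ID of your configured Google (Gemini) provider
  model: gemini-2.0-flash             # cloud model chosen from the comparison above
  message: Describe what is happening in this image.
  image_file: /config/www/snapshot.jpg
  max_tokens: 100
```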

Self-hosted Models

Gemma 3 with 12B parameters delivers performance comparable to GPT-4o Mini while remaining efficient enough to fit within 12GB of VRAM.
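
The call looks the same with a self-hosted provider; only the provider entry and the model tag change. A minimal sketch, assuming Gemma 3 12B has already been pulled into Ollama (for example with `ollama pull gemma3:12b`) and Ollama is configured as a provider:

```yaml
# Sketch: the same Image Analyzer call pointed at a self-hosted
# Ollama provider running Gemma 3 12B. IDs and entities are placeholders.
action: llmvision.image_analyzer
data:
  provider: YOUR_OLLAMA_PROVIDER_ID   # ID of your configured Ollama provider
  model: gemma3:12b                   # model tag as it appears in Ollama
  message: Is there a person in this image?
  image_entity: camera.front_door
  max_tokens: 100
```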

Data is based on the MMMU Leaderboard.