Choosing the right model

Cloud-based Models

LLM Vision is compatible with multiple providers, each offering a different set of models. Some providers run in the cloud, while others are self-hosted. To find the model best suited to your use case, see the figure below, which visualizes the averaged scores of the available cloud-based models. The higher the score, the more accurate the output.

gpt-5-mini is the recommended model due to its strong performance-to-price ratio.

Data is based on the MMMU Leaderboard

Self-hosted Models

gemma3:12b is the recommended model for self-hosting, offering performance comparable to gpt-4o-mini while fitting within 12GB of VRAM.
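To sanity-check the 12 GB claim, here is a rough back-of-the-envelope VRAM estimate. The assumptions (4-bit quantization at roughly 0.5 bytes per parameter, plus about 20% overhead for the KV cache and activations) are illustrative and not from the source:

```python
# Rough VRAM estimate for running a quantized 12B-parameter model locally.
# Assumptions (illustrative, not from the source): 4-bit (Q4) quantization
# at ~0.5 bytes per parameter, plus ~20% overhead for KV cache/activations.

def estimate_vram_gb(num_params: float, bytes_per_param: float = 0.5,
                     overhead: float = 0.2) -> float:
    """Return an approximate VRAM footprint in gigabytes."""
    weights_gb = num_params * bytes_per_param / 1e9
    return weights_gb * (1 + overhead)

if __name__ == "__main__":
    needed = estimate_vram_gb(12e9)  # gemma3:12b has ~12B parameters
    print(f"~{needed:.1f} GB needed, fits in 12 GB: {needed <= 12}")
```

Under these assumptions a 12B model needs roughly 7 GB, leaving headroom within a 12 GB card; actual usage varies with quantization level and context length.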

Data is based on the MMMU Leaderboard
