Llama
Meta's open model family and tooling foundation for a large share of the open-weight ecosystem.

Mistral
Mistral AI's open-weight and commercial model family used across local, hosted, and enterprise deployments.

Gemma
Google's lightweight open model family for local, cloud, and on-device AI applications.

Le Chat
Mistral's assistant workspace for chat, search, document analysis, Canvas, code interpreter, and custom agents.

La Plateforme
Mistral's developer console for API keys, playground testing, agents, fine-tuning, evaluation, and usage monitoring.
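
A minimal sketch of calling a model through La Plateforme's chat completions endpoint; the endpoint and response shape follow Mistral's public API, while the model alias and prompt are illustrative:

    import os
    import requests

    # Assumes a MISTRAL_API_KEY created in La Plateforme's console.
    resp = requests.post(
        "https://api.mistral.ai/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
        json={
            "model": "mistral-small-latest",  # illustrative model alias
            "messages": [{"role": "user", "content": "Say hello in one sentence."}],
        },
        timeout=30,
    )
    resp.raise_for_status()
    print(resp.json()["choices"][0]["message"]["content"])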

Mistral's terminal-native coding agent in the open-model coding stack.

Ollama
Popular local and cloud model runner for pulling, running, serving, and integrating open models.
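
A minimal sketch of hitting a locally running Ollama daemon over its HTTP API; it assumes the default port (11434) and that the model tag, here llama3, was pulled beforehand with "ollama pull llama3":

    import requests

    # Non-streaming generation against the local Ollama server.
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama3", "prompt": "Why is the sky blue?", "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    print(resp.json()["response"])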

LM Studio
Desktop app and developer runtime for chatting with, loading, serving, and connecting to local models privately.
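
A hedged sketch of talking to LM Studio's built-in local server through the OpenAI Python SDK; port 1234 is LM Studio's documented default, and the model identifier is a placeholder for whatever model is loaded in the app:

    from openai import OpenAI

    # LM Studio exposes an OpenAI-compatible endpoint; the API key is unused locally.
    client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")
    reply = client.chat.completions.create(
        model="local-model",  # placeholder; use the identifier shown in LM Studio
        messages=[{"role": "user", "content": "Summarize what a GGUF file is."}],
    )
    print(reply.choices[0].message.content)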

llama.cpp
Core C/C++ runtime for quantized local inference, GGUF models, and OpenAI-compatible local serving.
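
A minimal sketch of local GGUF inference through the llama-cpp-python binding (pip install llama-cpp-python); the model path is hypothetical, and any chat-tuned GGUF file works:

    from llama_cpp import Llama

    # Load a quantized GGUF model from disk; path and context size are illustrative.
    llm = Llama(model_path="models/llama-3-8b-instruct.Q4_K_M.gguf", n_ctx=4096)
    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Give one fact about llamas."}],
        max_tokens=64,
    )
    print(out["choices"][0]["message"]["content"])

Recent llama.cpp builds also ship a llama-server binary for the OpenAI-compatible local serving mentioned above.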

Hugging Face
Central hub for discovering, testing, hosting, and deploying open models and datasets.
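
A minimal sketch of the huggingface_hub client for discovery and downloads; the search string is illustrative, and the file fetch uses the public gpt2 repo:

    from huggingface_hub import HfApi, hf_hub_download

    # Search the Hub for model repos matching a query.
    api = HfApi()
    for model in api.list_models(search="gemma gguf", limit=5):
        print(model.id)

    # Fetch a single file from a public repo into the local cache.
    path = hf_hub_download(repo_id="gpt2", filename="config.json")
    print(path)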

Cloud platform for running, fine-tuning, and deploying open models via APIs.

vLLM
High-throughput open-source serving engine for production open model inference.
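
A minimal sketch of vLLM's offline batch API; the model id is illustrative and must fit in local GPU memory, and the same engine can be exposed as an OpenAI-compatible server with the vllm serve command:

    from vllm import LLM, SamplingParams

    # Load the model once, then run batched generation over a list of prompts.
    llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.3")
    params = SamplingParams(temperature=0.7, max_tokens=64)
    outputs = llm.generate(["Explain paged attention in one sentence."], params)
    for out in outputs:
        print(out.outputs[0].text)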

Open WebUI
Self-hosted AI interface for local and cloud models, commonly paired with Ollama and OpenAI-compatible APIs.

Jan
Open-source, local-first assistant and model platform with desktop apps, MCP support, Jan Hub, and a local API server.

LocalAI
MIT-licensed local AI stack with an OpenAI-compatible API for self-hosted language, image, audio, and agent workloads.
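
A hedged sketch of querying a LocalAI instance through its OpenAI-compatible endpoint; port 8080 is LocalAI's usual default, and the model name is a placeholder for whatever alias the instance has configured:

    import requests

    resp = requests.post(
        "http://localhost:8080/v1/chat/completions",
        json={
            "model": "my-local-model",  # placeholder for a configured model alias
            "messages": [{"role": "user", "content": "What can you do offline?"}],
        },
        timeout=120,
    )
    resp.raise_for_status()
    print(resp.json()["choices"][0]["message"]["content"])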

gpt-oss
OpenAI's open-weight reasoning models, which also belong in the wider self-hosted open model stack.
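
A hedged sketch of self-hosting gpt-oss with vLLM and querying it through the OpenAI SDK; it assumes the weights from the openai/gpt-oss-20b Hub repo are served locally with "vllm serve openai/gpt-oss-20b" on vLLM's default port (8000):

    from openai import OpenAI

    # Local vLLM server; no real API key is needed.
    client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")
    reply = client.chat.completions.create(
        model="openai/gpt-oss-20b",
        messages=[{"role": "user", "content": "Walk through 17 * 24 step by step."}],
    )
    print(reply.choices[0].message.content)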

Low-latency hosted inference option frequently used to serve supported open models.