Models

Mistral

Mistral: Mixtral 8x22B (base)

Mixtral 8x22B is a large-scale sparse mixture-of-experts language model from Mistral AI. It consists of 8 experts of 22 billion parameters each, with 2 experts routed per token. It was released via [X](https://twitter.com/MistralAI/status/1777869263778291896). #moe
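
To make the routing idea concrete, here is a minimal sketch of top-2 expert routing in PyTorch. The layer sizes, router, and expert MLPs are illustrative stand-ins, not Mixtral's actual implementation.

```python
import torch
import torch.nn.functional as F
from torch import nn

class Top2MoELayer(nn.Module):
    """Toy sparse mixture-of-experts block: 8 expert MLPs, 2 routed per token.

    Illustrative only -- the dimensions and the linear router are simplified
    stand-ins, not the real Mixtral 8x22B architecture.
    """
    def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):                        # x: (tokens, d_model)
        logits = self.router(x)                  # (tokens, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)     # renormalize over the 2 chosen experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e            # tokens whose k-th choice is expert e
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(4, 512)
print(Top2MoELayer()(tokens).shape)              # torch.Size([4, 512])
```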

Open Source

Mistral

Mistral: Mixtral 8x22B Instruct

Mistral's official instruct fine-tuned version of [Mixtral 8x22B](/models/mistralai/mixtral-8x22b). It uses 39B active parameters out of 141B, offering unparalleled cost efficiency for its size. Its strengths include:

- strong math, coding, and reasoning
- large context length (64k)
- fluency in English, French, Italian, German, and Spanish

See benchmarks on the launch announcement [here](https://mistral.ai/news/mixtral-8x22b/). #moe

Open Source

Mistral

Mistral Large

This is Mistral AI's closed-source flagship model, and it excels at reasoning, code, JSON, chat, and more. Read the launch announcement [here](https://mistral.ai/news/mistral-large/). It is fluent in English, French, Spanish, German, and Italian, with high grammatical accuracy, and its 32k-token context window allows precise information recall from large documents.
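
As a concrete example of the JSON strength, here is a hedged sketch of requesting structured JSON output through Mistral's chat API. It assumes the current `mistralai` Python SDK and its OpenAI-style `response_format` parameter; the model alias and the requested fields are illustrative.

```python
import os
from mistralai import Mistral

# Sketch only: assumes the v1 `mistralai` SDK and that the endpoint honors
# response_format={"type": "json_object"} for JSON-mode output.
client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

resp = client.chat.complete(
    model="mistral-large-latest",  # assumed alias for the latest Mistral Large
    messages=[
        {"role": "system", "content": "Reply with a JSON object containing 'title' and 'summary'."},
        {"role": "user", "content": "Summarize the following clause: the lessee shall ..."},
    ],
    response_format={"type": "json_object"},
)
print(resp.choices[0].message.content)  # a JSON string such as {"title": ..., "summary": ...}
```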

Mistral

Mistral Medium

This is Mistral AI's closed-source, medium-sized model. It's powered by a closed-source prototype and excels at reasoning, code, JSON, chat, and more. In benchmarks, it compares with many of the flagship models of other companies.

Mistral

Mistral Tiny

This model is currently powered by Mistral-7B-v0.2 and incorporates an improved fine-tuning over [Mistral 7B](/models/mistralai/mistral-7b-instruct), inspired by community work. It is best used for large batch-processing tasks where cost is a significant factor but reasoning capabilities are not crucial.
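
Since the stated sweet spot is cheap, high-volume work, here is a minimal batch-processing sketch. It assumes the `mistralai` Python SDK and the `mistral-tiny` model id; the tickets, labels, and loop are placeholders for whatever bulk task you actually run.

```python
import os
from mistralai import Mistral

# Sketch only: assumes the v1 `mistralai` SDK; `mistral-tiny` is the
# endpoint described above.
client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

documents = ["First support ticket ...", "Second support ticket ...", "Third support ticket ..."]
labels = []
for doc in documents:
    resp = client.chat.complete(
        model="mistral-tiny",
        messages=[
            {"role": "system", "content": "Classify the ticket as billing, bug, or other. Answer with one word."},
            {"role": "user", "content": doc},
        ],
        max_tokens=4,
    )
    labels.append(resp.choices[0].message.content.strip())

print(labels)
```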

Open Source

Mistral

Mistral-7B-Instruct-v0.3

The Mistral-7B-Instruct-v0.3 Large Language Model (LLM) is an instruct fine-tuned version of Mistral-7B-v0.3. Mistral-7B-v0.3 has the following changes compared to Mistral-7B-v0.2:

- extended vocabulary to 32768
- supports the v3 tokenizer
- supports function calling
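
A minimal way to try the model locally is via Hugging Face `transformers`. This sketch assumes the public `mistralai/Mistral-7B-Instruct-v0.3` checkpoint and enough GPU memory; `device_map="auto"` additionally requires `accelerate`.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-Instruct-v0.3"  # public Hugging Face checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "Write a haiku about tokenizers."}]
# The chat template applies the [INST] ... [/INST] wrapping the instruct model expects.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```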

Open Source

Mistral

Mistral-7B-Instruct-v0.1

The Mistral-7B-Instruct-v0.1 Large Language Model (LLM) is an instruct fine-tuned version of the Mistral-7B-v0.1 generative text model, fine-tuned on a variety of publicly available conversation datasets.

Open Source

Mistral

Mistral-7B-Instruct-v0.2

The Mistral-7B-Instruct-v0.2 Large Language Model (LLM) is an instruct fine-tuned version of Mistral-7B-v0.2. Mistral-7B-v0.2 has the following changes compared to Mistral-7B-v0.1:

- 32k context window (vs 8k context in v0.1)
- rope-theta = 1e6
- no sliding-window attention

Open Source

Mistral

Mistral-7B-v0.1

The Mistral-7B-v0.1 Large Language Model (LLM) is a pretrained generative text model with 7 billion parameters. It uses grouped-query attention (GQA), sliding-window attention, and a byte-fallback BPE tokenizer, and it outperforms Llama 2 13B on the benchmarks reported by Mistral AI.

Open Source

Mistral

Mistral: Mixtral 8x22B Instruct v0.1

Mistral's official instruct fine-tuned version of [Mixtral 8x22B](/models/mistralai/mixtral-8x22b). It uses 39B active parameters out of 141B, offering unparalleled cost efficiency for its size. Its strengths include:

- strong math, coding, and reasoning
- large context length (64k)
- fluency in English, French, Italian, German, and Spanish

See benchmarks on the launch announcement [here](https://mistral.ai/news/mixtral-8x22b/). #moe

Open Source

Mistral

Mistral NeMo

Mistral AI and NVIDIA have collaborated to develop Mistral NeMo, a new 12B language model that represents a significant advancement in AI technology. This model boasts a large context window of up to 128k tokens and delivers state-of-the-art performance in reasoning, world knowledge, and coding accuracy for its size category. Mistral NeMo utilizes a standard architecture, making it easily adaptable and a straightforward replacement for systems currently using Mistral 7B. In a move to promote widespread adoption, both pre-trained base and instruction-tuned checkpoints have been released under the Apache 2.0 license.
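
Because the architecture is standard, swapping it into an existing Mistral 7B pipeline is, in principle, just a checkpoint change. The sketch below assumes the `mistralai/Mistral-Nemo-Instruct-2407` checkpoint on Hugging Face and a recent `transformers` release with Mistral NeMo support.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Same loading code as a Mistral 7B setup -- only the checkpoint id changes.
model_id = "mistralai/Mistral-Nemo-Instruct-2407"  # assumed public HF checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")
```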

Open Source

Mistral

Mistral Large 2

Mistral AI's latest offering, Mistral Large 2, represents a significant advancement in language model technology. With 123 billion parameters and a 128k context window, it supports dozens of natural languages and a broad set of programming languages. The model sets a new benchmark in performance-to-cost ratio, achieving 84.0% accuracy on MMLU. It excels in code generation, reasoning, and multilingual tasks, competing with top-tier models like GPT-4 and Claude 3 Opus. Key improvements include enhanced instruction-following, reduced hallucination, and better handling of multi-turn conversations. The model's multilingual proficiency and advanced function-calling capabilities make it particularly suitable for diverse business applications. Mistral Large 2 is designed for single-node inference and long-context applications, balancing performance with practical usability.
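
To illustrate the function-calling capability, here is a hedged sketch using the `mistralai` Python SDK with an OpenAI-style tool schema. The model alias, tool name, and parameters are assumptions made for the example, not part of the announcement above.

```python
import os
from mistralai import Mistral

# Sketch only: assumes the v1 `mistralai` SDK and its `tools` / `tool_choice` parameters.
client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

tools = [{
    "type": "function",
    "function": {
        "name": "get_order_status",              # hypothetical tool for illustration
        "description": "Look up the shipping status of a customer order.",
        "parameters": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
        },
    },
}]

resp = client.chat.complete(
    model="mistral-large-latest",                # assumed alias for Mistral Large 2
    messages=[{"role": "user", "content": "Where is order 8142?"}],
    tools=tools,
    tool_choice="auto",
)
# When the model decides to call the tool, the arguments arrive as a JSON string.
print(resp.choices[0].message.tool_calls)
```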

Open Source