Models
QwQ 32B
QwQ-32B is an research model developed by the Qwen Team, focused on advancing AI reasoning capabilities. As a preview release, it demonstrates promising analytical abilities while having several important limitations: 1. Language Mixing and Code-Switching: The model may mix languages or switch between them unexpectedly, affecting response clarity. 2. Recursive Reasoning Loops: The model may enter circular reasoning patterns, leading to lengthy responses without a conclusive answer. 3. Safety and Ethical Considerations: The model requires enhanced safety measures to ensure reliable and secure performance, and users should exercise caution when deploying it. 4. Performance and Benchmark Limitations: The model excels in math and coding but has room for improvement in other areas, such as common sense reasoning and nuanced language understanding.
Open Source
Qwen2.5-Max
Qwen2.5-Max is a large-scale Mixture-of-Experts (MoE) model developed with over 20 trillion tokens in pretraining and further refined through Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF). Recognizing the benefits of scaling both data and model size, researchers have faced challenges in effectively managing extremely large models. With the recent release of DeepSeek V3 providing insights into this process, Qwen2.5-Max emerges as a strong competitor in the field. The model has been rigorously evaluated against leading proprietary and open-weight models using benchmarks such as MMLU-Pro (college-level knowledge), LiveCodeBench (coding capabilities), LiveBench (general intelligence), and Arena-Hard (human preference approximation). Results indicate that Qwen2.5-Max outperforms DeepSeek V3 in key benchmarks, including Arena-Hard, LiveBench, LiveCodeBench, and GPQA-Diamond, while maintaining competitive performance in MMLU-Pro. For base model comparisons, proprietary models like GPT-4o and Claude-3.5-Sonnet were not accessible. However, Qwen2.5-Max was benchmarked against top open-weight models such as DeepSeek V3 (MoE), Llama-3.1-405B (dense), and Qwen2.5-72B (dense). The results demonstrate notable advantages across most benchmarks, reinforcing the model’s strengths. Qwen2.5-Max is now available via API on Alibaba Cloud, and users can explore its capabilities through Qwen Chat. With ongoing advancements in post-training techniques, future iterations of the model are expected to achieve even greater performance.
Open Source
Qwen2 VL 72B
### What's New in Qwen2-VL? #### [](https://huggingface.co/Qwen/Qwen2-VL-72B#key-enhancements)Key Enhancements: - SoTA understanding of images of various resolution & ratio: Qwen2-VL achieves state-of-the-art performance on visual understanding benchmarks, including MathVista, DocVQA, RealWorldQA, MTVQA, etc. - Understanding videos of 20min+: Qwen2-VL can understand videos over 20 minutes for high-quality video-based question answering, dialog, content creation, etc. - Agent that can operate your mobiles, robots, etc.: with the abilities of complex reasoning and decision making, Qwen2-VL can be integrated with devices like mobile phones, robots, etc., for automatic operation based on visual environment and text instructions. - Multilingual Support: to serve global users, besides English and Chinese, Qwen2-VL now supports the understanding of texts in different languages inside images, including most European languages, Japanese, Korean, Arabic, Vietnamese, etc.
Vision
Open Source
QwQ 32B Preview
QwQ-32B-Preview is an experimental research model developed by the Qwen Team, focused on advancing AI reasoning capabilities. As a preview release, it demonstrates promising analytical abilities while having several important limitations: 1. Language Mixing and Code-Switching: The model may mix languages or switch between them unexpectedly, affecting response clarity. 2. Recursive Reasoning Loops: The model may enter circular reasoning patterns, leading to lengthy responses without a conclusive answer. 3. Safety and Ethical Considerations: The model requires enhanced safety measures to ensure reliable and secure performance, and users should exercise caution when deploying it. 4. Performance and Benchmark Limitations: The model excels in math and coding but has room for improvement in other areas, such as common sense reasoning and nuanced language understanding.
Open Source
Qwen2.5 Coder 32B Instruct
Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). As of now, Qwen2.5-Coder has covered six mainstream model sizes, 0.5, 1.5, 3, 7, 14, 32 billion parameters, to meet the needs of different developers. Qwen2.5-Coder brings the following improvements upon CodeQwen1.5: * Significantly improvements in code generation, code reasoning and code fixing. Base on the strong Qwen2.5, we scale up the training tokens into 5.5 trillion including source code, text-code grounding, Synthetic data, etc. Qwen2.5-Coder-32B has become the current state-of-the-art open-source codeLLM, with its coding abilities matching those of GPT-4o. * A more comprehensive foundation for real-world applications such as Code Agents. Not only enhancing coding capabilities but also maintaining its strengths in mathematics and general competencies. Long-context Support up to 128K tokens.
Open Source
Qwen2.5 7B Instruct
Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2: - Significantly more knowledge and has greatly improved capabilities in coding and mathematics, thanks to our specialized expert models in these domains. - Significant improvements in instruction following, generating long texts (over 8K tokens), understanding structured data (e.g, tables), and generating structured outputs especially JSON. More resilient to the diversity of system prompts, enhancing role-play implementation and condition-setting for chatbots. - Long-context Support up to 128K tokens and can generate up to 8K tokens. - Multilingual support for over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more. This repo contains the instruction-tuned 7B Qwen2.5 model, which has the following features: - Type: Causal Language Models - Training Stage: Pretraining & Post-training - Architecture: transformers with RoPE, SwiGLU, RMSNorm, and Attention QKV bias - Number of Parameters: 7.61B - Number of Paramaters (Non-Embedding): 6.53B - Number of Layers: 28 - Number of Attention Heads (GQA): 28 for Q and 4 for KV - Context Length: Full 131,072 tokens and generation 8192 tokens - Please refer to [this section](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct#processing-long-texts) for detailed instructions on how to deploy Qwen2.5 for handling long texts. For more details, please refer to our [blog](https://qwenlm.github.io/blog/qwen2.5/), [GitHub](https://github.com/QwenLM/Qwen2.5), and [Documentation](https://qwen.readthedocs.io/en/latest/).
Open Source
Qwen2.5 Turbo (1M Context)
Following the release of Qwen2.5, the team responded to the community's demand for handling longer contexts. Over the past few months, significant optimizations have been made to enhance the model's capabilities and inference performance for extremely long contexts. Now, the team is proud to introduce the new **Qwen2.5-Turbo** model, featuring the following advancements: - **Extended Context Support**: The context length has been increased from 128k to 1M tokens, equivalent to approximately 1 million English words or 1.5 million Chinese characters. This capacity corresponds to 10 full-length novels, 150 hours of speech transcripts, or 30,000 lines of code. Qwen2.5-Turbo achieves 100% accuracy in the 1M-token Passkey Retrieval task and scores 93.1 on the RULER long-text evaluation benchmark, outperforming GPT-4 (91.6) and GLM4-9B-1M (89.9). Moreover, the model retains strong performance in short sequence tasks, comparable to GPT-4o-mini. - **Faster Inference Speed**: Leveraging sparse attention mechanisms, the time to generate the first token for a 1M-token context has been reduced from 4.9 minutes to just 68 seconds, representing a 4.3x speed improvement. - **Cost Efficiency**: The pricing remains unchanged at $0.05 per 1M tokens. At this rate, Qwen2.5-Turbo processes 3.6 times more tokens than GPT-4o-mini for the same cost.
Qwen2.5 72B Instruct
Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters. Qwen2.5 brings the following improvements upon Qwen2: * Significantly more knowledge and has greatly improved capabilities in coding and mathematics, thanks to our specialized expert models in these domains. * Significant improvements in instruction following, generating long texts (over 8K tokens), understanding structured data (e.g, tables), and generating structured outputs especially JSON. More resilient to the diversity of system prompts, enhancing role-play implementation and condition-setting for chatbots. * Long-context Support up to 128K tokens and can generate up to 8K tokens. * Multilingual support for over 29 languages, including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, Arabic, and more.
Qwen2 VL 72B Instruct
Qwen2-VL is the latest iteration of multimodal large language models developed by the Qwen team at Alibaba Cloud. This advanced AI system represents a significant leap forward in the field of vision-language models, building upon its predecessor, Qwen-VL. Qwen2-VL boasts state-of-the-art capabilities in understanding images of various resolutions and aspect ratios, as well as the ability to comprehend videos exceeding 20 minutes in length. One of the most notable features of Qwen2-VL is its versatility as an agent capable of operating mobile devices, robots, and other systems based on visual input and text instructions. This makes it a powerful tool for a wide range of applications, from personal assistance to industrial automation. The model also offers robust multilingual support, enabling it to understand and process text in various languages within images, catering to a global user base.
Vision
Qwen2 VL 7B Instruct
Qwen2-VL is the latest iteration of multimodal large language models developed by the Qwen team at Alibaba Cloud. This advanced AI system represents a significant leap forward in the field of vision-language models, building upon its predecessor, Qwen-VL. Qwen2-VL boasts state-of-the-art capabilities in understanding images of various resolutions and aspect ratios, as well as the ability to comprehend videos exceeding 20 minutes in length. One of the most notable features of Qwen2-VL is its versatility as an agent capable of operating mobile devices, robots, and other systems based on visual input and text instructions. This makes it a powerful tool for a wide range of applications, from personal assistance to industrial automation. The model also offers robust multilingual support, enabling it to understand and process text in various languages within images, catering to a global user base.
Vision
Qwen2 Math 7B Instruct
Qwen2-Math is a series of specialized math language models built upon the Qwen2 LLMs, which significantly outperforms the mathematical capabilities of open-source models and even closed-source models (e.g., GPT4o).
Open Source
Qwen2 Math 1.5B Instruct
Qwen2-Math is a series of specialized math language models built upon the Qwen2 LLMs, which significantly outperforms the mathematical capabilities of open-source models and even closed-source models (e.g., GPT4o).
Open Source
Qwen2 Math 72B Instruct
Qwen2-Math is a series of specialized math language models built upon the Qwen2 LLMs, which significantly outperforms the mathematical capabilities of open-source models and even closed-source models (e.g., GPT4o).
Open Source
Qwen2 Audio 7B Instruct
Qwen2-Audio is the new series of Qwen large audio-language models. Qwen2-Audio is capable of accepting various audio signal inputs and performing audio analysis or direct textual responses with regard to speech instructions. We introduce two distinct audio interaction modes: * voice chat: users can freely engage in voice interactions with Qwen2-Audio without text input; * audio analysis: users could provide audio and text instructions for analysis during the interaction;
Open Source
Qwen 2 72B Chat
Qwen2 is the new series of Qwen large language models. For Qwen2, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters, including a Mixture-of-Experts model. This repo contains the instruction-tuned 72B Qwen2 model. Compared with the state-of-the-art opensource language models, including the previous released Qwen1.5, Qwen2 has generally surpassed most opensource models and demonstrated competitiveness against proprietary models across a series of benchmarks targeting for language understanding, language generation, multilingual capability, coding, mathematics, reasoning, etc. Qwen2-72B-Instruct supports a context length of up to 131,072 tokens, enabling the processing of extensive inputs.
Open Source
Qwen 1.5 110B Chat
Qwen1.5 110B is the beta version of Qwen2, a transformer-based decoder-only language model pretrained on a large amount of data. In comparison with the previous released Qwen, the improvements include: - Significant performance improvement in human preference for chat models - Multilingual support of both base and chat models - Stable support of 32K context length for models of all sizes For more details, see this [blog post](https://qwenlm.github.io/blog/qwen1.5/) and [GitHub repo](https://github.com/QwenLM/Qwen1.5). Usage of this model is subject to [Tongyi Qianwen LICENSE AGREEMENT](https://huggingface.co/Qwen/Qwen1.5-110B-Chat/blob/main/LICENSE).
Open Source
Qwen 1.5 14B Chat
Qwen1.5 14B is the beta version of Qwen2, a transformer-based decoder-only language model pretrained on a large amount of data. In comparison with the previous released Qwen, the improvements include: - Significant performance improvement in human preference for chat models - Multilingual support of both base and chat models - Stable support of 32K context length for models of all sizes For more details, see this [blog post](https://qwenlm.github.io/blog/qwen1.5/) and [GitHub repo](https://github.com/QwenLM/Qwen1.5). Usage of this model is subject to [Tongyi Qianwen LICENSE AGREEMENT](https://huggingface.co/Qwen/Qwen1.5-110B-Chat/blob/main/LICENSE).
Open Source
Qwen 1.5 32B Chat
Qwen1.5 32B is the beta version of Qwen2, a transformer-based decoder-only language model pretrained on a large amount of data. In comparison with the previous released Qwen, the improvements include: - Significant performance improvement in human preference for chat models - Multilingual support of both base and chat models - Stable support of 32K context length for models of all sizes For more details, see this [blog post](https://qwenlm.github.io/blog/qwen1.5/) and [GitHub repo](https://github.com/QwenLM/Qwen1.5). Usage of this model is subject to [Tongyi Qianwen LICENSE AGREEMENT](https://huggingface.co/Qwen/Qwen1.5-110B-Chat/blob/main/LICENSE).
Open Source
Qwen 1.5 4B Chat
Qwen1.5 4B is the beta version of Qwen2, a transformer-based decoder-only language model pretrained on a large amount of data. In comparison with the previous released Qwen, the improvements include: - Significant performance improvement in human preference for chat models - Multilingual support of both base and chat models - Stable support of 32K context length for models of all sizes For more details, see this [blog post](https://qwenlm.github.io/blog/qwen1.5/) and [GitHub repo](https://github.com/QwenLM/Qwen1.5). Usage of this model is subject to [Tongyi Qianwen LICENSE AGREEMENT](https://huggingface.co/Qwen/Qwen1.5-110B-Chat/blob/main/LICENSE).
Open Source
Qwen 1.5 72B Chat
Qwen1.5 72B is the beta version of Qwen2, a transformer-based decoder-only language model pretrained on a large amount of data. In comparison with the previous released Qwen, the improvements include: - Significant performance improvement in human preference for chat models - Multilingual support of both base and chat models - Stable support of 32K context length for models of all sizes For more details, see this [blog post](https://qwenlm.github.io/blog/qwen1.5/) and [GitHub repo](https://github.com/QwenLM/Qwen1.5). Usage of this model is subject to [Tongyi Qianwen LICENSE AGREEMENT](https://huggingface.co/Qwen/Qwen1.5-110B-Chat/blob/main/LICENSE).
Open Source
Qwen 1.5 7B Chat
Qwen1.5 7B is the beta version of Qwen2, a transformer-based decoder-only language model pretrained on a large amount of data. In comparison with the previous released Qwen, the improvements include: - Significant performance improvement in human preference for chat models - Multilingual support of both base and chat models - Stable support of 32K context length for models of all sizes For more details, see this [blog post](https://qwenlm.github.io/blog/qwen1.5/) and [GitHub repo](https://github.com/QwenLM/Qwen1.5). Usage of this model is subject to [Tongyi Qianwen LICENSE AGREEMENT](https://huggingface.co/Qwen/Qwen1.5-110B-Chat/blob/main/LICENSE).
Open Source
Qwen 1.5 1.8B Chat
Qwen1.5 1.8B is the beta version of Qwen2, a transformer-based decoder-only language model pretrained on a large amount of data. In comparison with the previous released Qwen, the improvements include: - Significant performance improvement in human preference for chat models - Multilingual support of both base and chat models - Stable support of 32K context length for models of all sizes For more details, see this [blog post](https://qwenlm.github.io/blog/qwen1.5/) and [GitHub repo](https://github.com/QwenLM/Qwen1.5). Usage of this model is subject to [Tongyi Qianwen LICENSE AGREEMENT](https://huggingface.co/Qwen/Qwen1.5-110B-Chat/blob/main/LICENSE).
Qwen 1.5 110B
Qwen1.5 110B is the beta version of Qwen2, a transformer-based decoder-only language model pretrained on a large amount of data. In comparison with the previous released Qwen, the improvements include: - Significant performance improvement in human preference for chat models - Multilingual support of both base and chat models - Stable support of 32K context length for models of all sizes For more details, see this [blog post](https://qwenlm.github.io/blog/qwen1.5/) and [GitHub repo](https://github.com/QwenLM/Qwen1.5). Usage of this model is subject to [Tongyi Qianwen LICENSE AGREEMENT](https://huggingface.co/Qwen/Qwen1.5-110B-Chat/blob/main/LICENSE).
Open Source
Qwen 1.5 14B
Qwen1.5 14B is the beta version of Qwen2, a transformer-based decoder-only language model pretrained on a large amount of data. In comparison with the previous released Qwen, the improvements include: - Significant performance improvement in human preference for chat models - Multilingual support of both base and chat models - Stable support of 32K context length for models of all sizes For more details, see this [blog post](https://qwenlm.github.io/blog/qwen1.5/) and [GitHub repo](https://github.com/QwenLM/Qwen1.5). Usage of this model is subject to [Tongyi Qianwen LICENSE AGREEMENT](https://huggingface.co/Qwen/Qwen1.5-110B-Chat/blob/main/LICENSE).
Open Source
Qwen 1.5 32B
Qwen1.5 32B is the beta version of Qwen2, a transformer-based decoder-only language model pretrained on a large amount of data. In comparison with the previous released Qwen, the improvements include: - Significant performance improvement in human preference for chat models - Multilingual support of both base and chat models - Stable support of 32K context length for models of all sizes For more details, see this [blog post](https://qwenlm.github.io/blog/qwen1.5/) and [GitHub repo](https://github.com/QwenLM/Qwen1.5). Usage of this model is subject to [Tongyi Qianwen LICENSE AGREEMENT](https://huggingface.co/Qwen/Qwen1.5-110B-Chat/blob/main/LICENSE).
Open Source
Qwen 1.5 4B
Qwen1.5 4B is the beta version of Qwen2, a transformer-based decoder-only language model pretrained on a large amount of data. In comparison with the previous released Qwen, the improvements include: - Significant performance improvement in human preference for chat models - Multilingual support of both base and chat models - Stable support of 32K context length for models of all sizes For more details, see this [blog post](https://qwenlm.github.io/blog/qwen1.5/) and [GitHub repo](https://github.com/QwenLM/Qwen1.5). Usage of this model is subject to [Tongyi Qianwen LICENSE AGREEMENT](https://huggingface.co/Qwen/Qwen1.5-110B-Chat/blob/main/LICENSE).
Open Source
Qwen 1.5 72B
Qwen1.5 72B is the beta version of Qwen2, a transformer-based decoder-only language model pretrained on a large amount of data. In comparison with the previous released Qwen, the improvements include: - Significant performance improvement in human preference for chat models - Multilingual support of both base and chat models - Stable support of 32K context length for models of all sizes For more details, see this [blog post](https://qwenlm.github.io/blog/qwen1.5/) and [GitHub repo](https://github.com/QwenLM/Qwen1.5). Usage of this model is subject to [Tongyi Qianwen LICENSE AGREEMENT](https://huggingface.co/Qwen/Qwen1.5-110B-Chat/blob/main/LICENSE).
Open Source
Qwen 1.5 7B
Qwen1.5 7B is the beta version of Qwen2, a transformer-based decoder-only language model pretrained on a large amount of data. In comparison with the previous released Qwen, the improvements include: - Significant performance improvement in human preference for chat models - Multilingual support of both base and chat models - Stable support of 32K context length for models of all sizes For more details, see this [blog post](https://qwenlm.github.io/blog/qwen1.5/) and [GitHub repo](https://github.com/QwenLM/Qwen1.5). Usage of this model is subject to [Tongyi Qianwen LICENSE AGREEMENT](https://huggingface.co/Qwen/Qwen1.5-110B-Chat/blob/main/LICENSE).
Open Source
Qwen 1.5 1.8B
Qwen1.5 1.8B is the beta version of Qwen2, a transformer-based decoder-only language model pretrained on a large amount of data. In comparison with the previous released Qwen, the improvements include: - Significant performance improvement in human preference for chat models - Multilingual support of both base and chat models - Stable support of 32K context length for models of all sizes For more details, see this [blog post](https://qwenlm.github.io/blog/qwen1.5/) and [GitHub repo](https://github.com/QwenLM/Qwen1.5). Usage of this model is subject to [Tongyi Qianwen LICENSE AGREEMENT](https://huggingface.co/Qwen/Qwen1.5-110B-Chat/blob/main/LICENSE).
Qwen 2 7B Chat
Qwen2 is the new series of Qwen large language models. For Qwen2, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters, including a Mixture-of-Experts model. This repo contains the instruction-tuned 72B Qwen2 model. Compared with the state-of-the-art opensource language models, including the previous released Qwen1.5, Qwen2 has generally surpassed most opensource models and demonstrated competitiveness against proprietary models across a series of benchmarks targeting for language understanding, language generation, multilingual capability, coding, mathematics, reasoning, etc. Qwen2-72B-Instruct supports a context length of up to 131,072 tokens, enabling the processing of extensive inputs.
Open Source