NVIDIA and Mistral AI Forge Partnership to Accelerate New Family of Open Models

NVIDIA and Mistral AI announced a partnership to accelerate a new family of open-source AI models, including the powerful Mistral Large 3.
The collaboration integrates NVIDIA's Blackwell attention and MoE kernels into the models, optimizing them for efficient deployment from data centers to edge devices.
All models are released under the permissive Apache 2.0 license, a significant move to democratize access to frontier-scale AI capabilities.

In a move that strengthens the open-source AI ecosystem, NVIDIA and French AI startup Mistral AI have entered a strategic partnership. The collaboration, announced this week, is centered on accelerating Mistral's newly unveiled family of models, known as Mistral 3, which are being released under the open Apache 2.0 license.

The crown jewel of the new family is Mistral Large 3, described by the company as its most capable model to date. It is a sparse mixture-of-experts (MoE) model with 675 billion total parameters, of which 41 billion are active for any given token. According to technical details released, the model was trained from scratch on a cluster of 3,000 of NVIDIA's H200 GPUs. Early benchmarks suggest it achieves performance parity with leading instruction-tuned open-weight models currently available.
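To put those parameter counts in perspective, the short calculation below works out what share of the model is active per token and roughly how much memory the weights alone would occupy at different precisions. It is a back-of-the-envelope sketch: the function names are illustrative, every weight is assumed to be stored at the same bit width, and scaling-factor metadata, KV cache and activations are ignored.

```python
# Back-of-the-envelope figures for a sparse MoE model.
# Parameter counts come from the article; everything else is an assumption.

TOTAL_PARAMS = 675_000_000_000   # total parameters (from the article)
ACTIVE_PARAMS = 41_000_000_000   # active parameters per token (from the article)


def active_fraction(total: int, active: int) -> float:
    """Share of the weights that take part in processing a single token."""
    return active / total


def weight_gb(params: int, bits_per_weight: int) -> float:
    """Approximate weight storage in decimal gigabytes.

    Assumes one uniform bit width and ignores quantization scale metadata.
    """
    return params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    share = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
    print(f"active share per token: {share:.1%}")
    for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
        print(f"{label} weights: ~{weight_gb(TOTAL_PARAMS, bits):,.1f} GB")
```

Under these assumptions roughly 6 percent of the weights are used per token, which is why a sparse model of this size can run with far less compute per token than a dense model with the same parameter count, even though all of the weights still have to be held in memory.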

Technical optimization is at the heart of the partnership. Mistral AI has integrated NVIDIA's Blackwell attention and specialized MoE kernels into its architecture. This deep co-design enables efficient inference for the entire Mistral 3 family using NVIDIA's TensorRT-LLM and the open-source SGLang serving framework. The result, according to people familiar with the technical integration, is a model family that can be deployed efficiently across a wide range of NVIDIA platforms, from high-performance Blackwell and Hopper systems in data centers down to RTX PCs and Jetson edge devices.

"What we are demonstrating is that frontier-class AI capabilities can be delivered through open models," a source close to the partnership said, framing the effort as a direct challenge to the dominance of proprietary, closed AI systems. The permissive licensing is a clear bid to attract a broad developer community and accelerate adoption.

At the recent VivaTech 2025 conference, the two companies presented a jointly developed sovereign high-performance computing infrastructure. This infrastructure, which combines NVIDIA's GB200 NVL72 systems with Mistral's sparse MoE architecture, is designed to help enterprises deploy and scale these massive models efficiently. It supports advanced serving techniques such as prefill/decode disaggregation and speculative decoding, which are intended to handle demanding, long-context workloads.
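For readers unfamiliar with speculative decoding, the toy example below illustrates the general idea in its simplest greedy form: a cheap draft model guesses several tokens ahead, and the large target model keeps the guesses it agrees with. It is a minimal sketch, not the implementation used by Mistral AI or NVIDIA. The two "models" are hypothetical stand-in functions over integer tokens, and the target is called once per position here, whereas a real serving stack would check all drafted tokens in a single batched forward pass.

```python
from typing import Callable, List

# A "model" here is any function that maps a token context to the next token.
NextToken = Callable[[List[int]], int]


def speculative_step(target: NextToken, draft: NextToken,
                     context: List[int], k: int) -> List[int]:
    """One round of greedy speculative decoding; returns the tokens to append."""
    # 1. The draft model proposes k tokens autoregressively.
    proposed: List[int] = []
    for _ in range(k):
        proposed.append(draft(context + proposed))

    # 2. The target model checks each proposal in order.
    accepted: List[int] = []
    for token in proposed:
        expected = target(context + accepted)
        if token != expected:
            # First disagreement: keep the target's token and stop.
            accepted.append(expected)
            return accepted
        accepted.append(token)

    # 3. Every proposal matched, so the target contributes one extra token.
    accepted.append(target(context + accepted))
    return accepted


# Hypothetical toy models: the target counts upward modulo 10;
# the draft agrees except that it wrongly predicts 0 after a 3.
def toy_target(ctx: List[int]) -> int:
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[int]) -> int:
    return 0 if ctx[-1] == 3 else (ctx[-1] + 1) % 10


if __name__ == "__main__":
    print(speculative_step(toy_target, toy_draft, [1], k=4))   # [2, 3, 4]
    print(speculative_step(toy_target, toy_target, [1], k=4))  # [2, 3, 4, 5, 6]
```

In the first call the draft is right twice and wrong on the third token, so the round yields three tokens. In the second, all four guesses are accepted and the target adds a fifth. Either way the output is identical to what the target would have produced on its own, which is the property that makes the technique attractive for serving large models.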

For NVIDIA, the partnership underscores the centrality of its hardware and software stack in the training and deployment of cutting-edge AI, even within the open-source domain. For Mistral AI, the backing and technical integration with the industry's leading AI computing platform provide formidable credibility and performance advantages. The move signals a growing industry trend where hardware manufacturers and model developers engage in deep collaboration from the outset, optimizing for efficiency and broad deployment rather than raw scale alone.

Mistral AI did not immediately respond to a request for additional comment on future model roadmaps. The Mistral 3 model family, including compressed checkpoints in formats like NVFP4, is expected to become available to developers in the coming weeks.