Nvidia just dropped a bombshell: its new AI model is open, massive, and ready to compete with GPT-4

Nvidia has released a powerful open source artificial intelligence model that competes with proprietary systems from industry leaders like OpenAI and Google.

The company’s new NVLM 1.0 family of multimodal large language models, led by the 72-billion-parameter NVLM-D-72B, demonstrates exceptional performance on vision and language tasks while also improving text-only capabilities.

“We introduce NVLM 1.0, a family of frontier-class multimodal large language models that achieve state-of-the-art results in vision-language tasks and compete with the leading proprietary models (e.g. GPT-4o) and open access models,” the researchers explain in their paper.

By releasing the model weights and promising to release the training code, Nvidia is breaking the trend of keeping advanced AI systems closed. This decision gives researchers and developers unprecedented access to cutting-edge technology.

Benchmark results compare NVIDIA’s NVLM-D model with AI giants such as GPT-4, Claude 3.5 and Llama 3-V, showing NVLM-D’s competitive performance on various visual and linguistic tasks. (Source: arxiv.org)

NVLM-D-72B: A versatile performer on visual and textual tasks

The NVLM-D-72B model demonstrates impressive adaptability when processing complex visual and textual inputs. The researchers provided examples that illustrate the model’s ability to interpret memes, analyze images, and solve mathematical problems step by step.

Notably, NVLM-D-72B improves its performance on text-only tasks after multimodal training. While many similar models see their text performance decline, NVLM-D-72B increased its accuracy on key text benchmarks by an average of 4.3 points.

“Our NVLM-D-1.0-72B shows significant improvements over its text backbone in text-only math and coding benchmarks,” the researchers note, highlighting a key advantage of their approach.

NVIDIA’s new AI model analyzes a meme that compares scientific abstracts to full papers, demonstrating its ability to interpret visual humor and scientific concepts. (Source: arxiv.org)

AI researchers respond to Nvidia’s open source initiative

The AI community reacted positively to the release. One AI researcher commented on social media: “Wow! Nvidia just released a 72B model that is on par with Llama 3.1 405B in math and coding tests and also has vision?”

Nvidia’s decision to make such a powerful model openly available could accelerate AI research and development across the industry. By providing access to a model that competes with proprietary systems from well-funded technology companies, Nvidia can enable smaller organizations and independent researchers to make greater contributions to AI advancements.

The NVLM project also presents innovative architectural designs, including a hybrid approach that combines various multimodal processing techniques. This development could determine the direction of future research in this area.

NVLM 1.0: A new chapter in open source AI development

Nvidia’s release of NVLM 1.0 marks a pivotal moment in AI development. By open-sourcing a model that competes with proprietary giants, Nvidia is not just sharing code, but challenging the structure of the AI industry itself.

This move could trigger a chain reaction. Other technology leaders may feel pressure to open up their research, potentially accelerating AI progress across the board. It also levels the playing field, allowing smaller teams and researchers to innovate using tools once reserved for tech giants.

However, the release of NVLM 1.0 is not without risks. As powerful AI becomes more accessible, concerns about misuse and ethical implications are likely to increase. The AI community now faces the complex task of promoting innovation while setting guidelines for responsible use.

Nvidia’s decision also raises questions about the future of AI business models. As cutting-edge models become freely available, companies may need to rethink how they create value and maintain competitive advantage in AI.

The true impact of NVLM 1.0 will unfold in the coming months and years. It could usher in an era of unprecedented collaboration and innovation in AI. Or it could force a reckoning with the unintended consequences of widespread, advanced AI.

One thing is certain: Nvidia has fired a shot across the bow of the AI industry. The question now is not whether the landscape will change, but how dramatically, and who will adapt quickly enough to succeed in this new world of open AI.