# DeepSeek: The New AI Sensation Competing with ChatGPT
In the fast-moving field of artificial intelligence, a new contender has emerged and unsettled the established players. DeepSeek, an AI chatbot developed in China, is being described as a potential turning point, with some experts dubbing it a “Sputnik moment” for AI. Former Microsoft employee and tech analyst Dave Plummer recently shared his view of DeepSeek’s capabilities on his YouTube channel, *Dave’s Garage*. Plummer argues that DeepSeek’s arrival marks a notable shift in the AI sector and poses a challenge to established products such as OpenAI’s ChatGPT.
## A Ferrari Assembled from Spare Parts
One of DeepSeek’s most remarkable features is its low development cost. While companies like OpenAI and Google have invested billions in training their large language models (LLMs), DeepSeek was reportedly created for less than $6 million, a small fraction of what its competitors have spent. Despite this limited budget, DeepSeek delivers performance comparable to leading models like ChatGPT.
Plummer compares DeepSeek to a Ferrari made from used parts: “Just as effective, but much more affordable.” The analogy highlights the ingenuity of DeepSeek’s developers, who built a high-performing AI without the latest Nvidia chips, hardware widely regarded as crucial for training cutting-edge models. Nvidia, whose GPUs power much of the AI boom, has become one of the most valuable companies in the world, yet DeepSeek’s results suggest that strong performance can be achieved with more modest means.
## The Key Ingredient: Distilled Models
DeepSeek’s efficiency derives from a technique known as model distillation, in which a smaller model is trained to reproduce the behavior of a larger, more complex one. The resulting distilled models are smaller and more efficient than conventional LLMs, which demand extensive computational power.
Plummer explains the idea with a master-apprentice metaphor: “It’s akin to a master training their apprentice: the apprentice doesn’t need to learn everything, yet they can perform the job equally well.” In Plummer’s account, DeepSeek’s “masters” included Meta’s open-source Llama model and OpenAI’s ChatGPT. By drawing on the knowledge of these larger models, DeepSeek achieves similar performance while needing considerably less hardware and energy.
This approach not only lowers costs but also makes AI more accessible. Unlike traditional LLMs that depend on large data centers with many GPUs, DeepSeek can run locally on high-end consumer machines. For example, the largest DeepSeek model can reportedly run on an AMD Threadripper with an Nvidia RTX 6000 GPU, while smaller versions can even run on a MacBook Pro.
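To make the master-apprentice idea concrete, here is a minimal sketch of one common way to measure how closely an apprentice matches its master: compare the two models’ softened output probabilities with a KL divergence. This is an illustration under stated assumptions, not DeepSeek’s actual training code. The function names, the temperature value, and the toy scores are all hypothetical, and a real system would minimize this loss over a large dataset using a deep learning framework.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between the teacher's and student's softened distributions.

    Hypothetical helper: the smaller the value, the more closely the
    student (apprentice) imitates the teacher (master) on this input.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Toy scores over three possible next words (made-up numbers).
teacher = [4.0, 1.0, 0.5]
close_student = [3.8, 1.1, 0.4]  # roughly agrees with the teacher
far_student = [0.5, 1.0, 4.0]    # prefers a different word entirely

loss_close = distillation_loss(teacher, close_student)
loss_far = distillation_loss(teacher, far_student)
print(f"close student loss: {loss_close:.4f}")
print(f"far student loss:   {loss_far:.4f}")
```

The student that ranks the candidate words the same way as the teacher receives the lower loss. Training nudges the smaller model’s parameters to shrink this gap across many examples, which is how the apprentice picks up the master’s behavior without matching its size.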
## A Change on the Horizon
Plummer compares DeepSeek’s progress to the start of the personal computer (PC) revolution. During the 1970s and 1980s, PCs were significantly less powerful than mainframe computers, yet they democratized computing and reshaped society. In a similar vein, DeepSeek’s ability to run on consumer-grade hardware could broaden access to advanced AI, allowing individuals and small businesses to use it without depending on centralized cloud services.
“It reminds me of the early days of PCs—they were not as powerful as mainframes, but they altered the world,” Plummer comments.
## Geopolitical Consequences: A Sputnik Moment
While the technological feats of DeepSeek are notable, its geopolitical ramifications are equally important. Plummer refers to its advent as a “Sputnik moment,” likening it to the Soviet Union’s launch of the Sputnik satellite in 1957. The success of Sputnik initiated the Space Race and escalated tensions between the Soviet Union and the United States during the Cold War.
In a similar fashion, DeepSeek’s development underscores the intensifying technological competition between China and the United States. As a Chinese AI model, DeepSeek directly contends with American tech giants like OpenAI and Google, symbolizing the broader rivalry between the two nations for global technological leadership.
This competition extends beyond innovation to questions of governance and values. While DeepSeek’s capabilities are remarkable, it operates under state censorship in China. The AI steers clear of sensitive subjects like the Tiananmen Square incident, the suppression of Uyghurs in Xinjiang, and Taiwan’s political status. Responses mentioning Chinese President Xi Jinping are also heavily filtered. Chinese regulators closely monitor this censorship, although users have reportedly found ways around the restrictions through clever phrasing or by running the model locally.
## What Does This Indicate for AI’s Future?
The success of DeepSeek prompts significant questions regarding the future of AI development. If a high-performing model can be constructed for a fraction of the cost using older hardware and clever techniques, what does that imply for the value of pricier models like ChatGPT? Plummer succinctly poses the question: “If you can build a Ferrari in your garage out of Chevy…