OpenAI, the developer of ChatGPT, has raised concerns that China's cost-efficient DeepSeek AI models may have been developed using OpenAI's proprietary data.
This week, Donald Trump described DeepSeek as a "wake-up call" for U.S. tech firms after Nvidia lost nearly $600 billion in market value. The rise of DeepSeek triggered significant stock declines for AI-focused companies. Nvidia, the dominant GPU supplier for AI systems, fell 16.86% in a single day, the largest one-day loss of market value in Wall Street history.
Other tech giants saw smaller but notable declines, with Microsoft, Meta, and Alphabet dropping between 2.1% and 4.2%, while AI server manufacturer Dell fell 8.7%.
DeepSeek promotes its R1 model as a budget-friendly alternative to Western AI services like ChatGPT. The company claims that R1, built on its open-source DeepSeek-V3 model, requires substantially less computing power and was trained for just $6 million, though some experts question these assertions.
This sudden competition has shaken investor confidence in the massive AI investments by American tech firms. DeepSeek's user base expanded rapidly, and its app briefly became the most downloaded free app in the U.S. amid growing interest in its capabilities.
Bloomberg now reports that OpenAI and Microsoft are investigating whether DeepSeek improperly utilized OpenAI's API to enhance its own models. "We're aware that PRC-based firms and others continuously attempt to extract knowledge from leading U.S. AI systems," an OpenAI spokesperson told Bloomberg.
The scrutiny focuses on a technique called "distillation," in which developers train a new model on the outputs of a more advanced system. OpenAI's terms of service prohibit using its models' outputs to develop competing models.
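For readers unfamiliar with the term, the following is a minimal sketch of the textbook form of distillation, in which a "student" model is trained to match the softened output probabilities of a "teacher" model. It is an illustration only, not a description of what DeepSeek or OpenAI actually did: the function names, the temperature value, and the example numbers are all assumptions. Distillation through a public API would more likely train on the teacher's generated text than on raw scores like these.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw model scores into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    Hypothetical helper: training the student to minimize this value pushes
    its predictions toward the teacher's.
    """
    p = softmax(teacher_logits, temperature)  # teacher "soft targets"
    q = softmax(student_logits, temperature)  # student predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Illustrative scores over three candidate next words (made-up numbers).
teacher = [4.0, 1.0, 0.5]
student = [2.0, 2.0, 1.0]
print(distillation_loss(teacher, student))  # positive: student disagrees with teacher
print(distillation_loss(teacher, teacher))  # zero: distributions are identical
```

The loss shrinks toward zero as the student's predictions approach the teacher's, which is why access to a stronger model's outputs can shortcut much of the cost of training from scratch.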
David Sacks, Trump's AI policy advisor, commented to Fox News: "Clear evidence suggests DeepSeek appropriated knowledge from OpenAI's models. I expect U.S. AI leaders will implement stronger protections against such practices."
Observers note the irony of these accusations, given OpenAI's own controversies regarding training data. Tech analyst Ed Zitron tweeted: "The company that built ChatGPT by scraping the entire internet now complains when someone might use its outputs? The hypocrisy is staggering."
OpenAI previously defended its use of copyrighted material, telling UK lawmakers in January 2024: "Modern AI development necessarily involves copyrighted content - without it, we could only train models on century-old public domain works, creating useless systems."
The debate over AI training data has intensified alongside generative AI's rapid growth. The New York Times sued OpenAI and Microsoft in December 2023 for alleged copyright infringement, following earlier litigation from authors including George R.R. Martin.
Legal uncertainty persists. In 2023, a U.S. court ruled that AI-generated art cannot receive copyright protection, upholding a 2018 Copyright Office position requiring human creative input for protection.