Why US Big Techs Have Embraced DeepSeek but Kept China at Arm’s Length

There is no stopping China. Following the release of DeepSeek-R1, DeepSeek has launched a new model called Janus Pro 7B, an open-source image generation model that outperforms OpenAI’s DALL·E 3 and Stability AI’s Stable Diffusion on benchmarks such as GenEval and DPG-Bench.

Janus Pro 7B excels in tasks beyond image generation, such as visual question answering and image captioning. This makes it an attractive option for businesses seeking to integrate AI into diverse operations without excessive infrastructure costs.

Meanwhile, DeepSeek’s domestic competitor Alibaba has launched two new models in the past two days, taking aim at DeepSeek-R1 as well as models from OpenAI and Anthropic.

The Chinese tech giant released Qwen2.5-Max, a large-scale mixture-of-experts (MoE) model pre-trained on over 20 trillion tokens and further post-trained using curated supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF). The model is available via API through Alibaba Cloud and on Qwen Chat.

It also announced the launch of its latest vision-language model, Qwen2.5-VL, the successor to Qwen2-VL. 

This model is built to “understand things visually”, including recognising objects, analysing text, charts, and graphics within images, and acting as a visual agent capable of directing tools. One of its key features is that it can also control mobile and computer screens, similar to Anthropic’s Computer Use and OpenAI’s Operator agent. 

Another Chinese company, Multimodal Art Projection (MAP), has launched an open-source model called YuE that can generate full songs, from lyrics to complete tracks lasting several minutes. It is compatible with Hugging Face and the Llama architecture for easy fine-tuning.

DeepSeek’s success has made US big tech companies rethink their AI strategies. For instance, Meta’s Llama models were once the go-to option for enterprises, thanks to their open-source nature and cost-effectiveness. However, multiple Chinese alternatives are now available in the market.

Although the Chinese models are effective, US companies will hesitate before adopting them, as they are concerned about data security and potential risks. “No one in the West is going to build an enterprise app and scaled consumer apps on a Chinese API,” said Antoine Blondeau, co-founder and managing partner at Alpha Intelligence Capital. 

Nonetheless, he added that the fully open-sourced DeepSeek AI model would be highly beneficial to many, as demonstrated by the traffic on Hugging Face, which already indicates its impact. 

“It was released just a few days ago and, already, more than 500 derivative models of DeepSeek have been created all over the world on Hugging Face with 2.5 million downloads (5x the original weights),” Clem Delangue, co-founder and CEO of Hugging Face, said.

Hugging Face has launched the integration of four powerful serverless inference providers – FAL, Replicate, SambaNova, and Together AI – directly on the Hub’s model pages. This allows developers to easily run DeepSeek-R1.
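As a rough sketch of what that integration looks like in practice, a developer might route a DeepSeek-R1 chat request through one of the providers using the official `huggingface_hub` client. The provider choice, model id, and the `HF_TOKEN` environment variable below are assumptions for illustration, not an official recipe:

```python
# Minimal sketch: querying DeepSeek-R1 via a serverless inference
# provider surfaced on the Hugging Face Hub.
import os

try:
    from huggingface_hub import InferenceClient  # pip install huggingface_hub
except ImportError:  # keep the sketch importable without the dependency
    InferenceClient = None


def ask_r1(prompt: str):
    """Send a single chat turn to DeepSeek-R1 through a Hub provider."""
    token = os.environ.get("HF_TOKEN")  # a Hub token with inference access
    if InferenceClient is None or token is None:
        return None  # no client library or credentials: skip the remote call
    client = InferenceClient(provider="together", api_key=token)
    resp = client.chat_completion(
        model="deepseek-ai/DeepSeek-R1",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=256,
    )
    return resp.choices[0].message.content


if __name__ == "__main__":
    print(ask_r1("Summarise mixture-of-experts in one sentence."))
```

Switching between FAL, Replicate, SambaNova, and Together AI is then just a change of the `provider` argument; the rest of the call stays the same.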

How US Biggies Are Reacting to DeepSeek

Meta’s chief AI scientist Yann LeCun argues that the market’s response to DeepSeek’s low $6 million training cost was “unjustifiable”. “Much of those billions are going into infrastructure for *inference*, not training. Running AI assistant services for billions of people requires a lot of compute,” he said in a post on Threads. 

“Once you put video understanding, reasoning, large-scale memory, and other capabilities in AI systems, inference costs are going to increase,” he added.

On the other hand, Microsoft and OpenAI are currently investigating whether a group linked to Chinese AI startup DeepSeek improperly obtained data from OpenAI. 

Meanwhile, AI startup Perplexity AI has made DeepSeek-R1 available on its search platform. CEO Aravind Srinivas has assured users that it will purchase additional capacity to continue serving DeepSeek-R1 in American data centres. He further said that those shorting NVIDIA are shortsighted.

Moreover, the R1 served on Perplexity is uncensored, unlike the version available through the DeepSeek app. “It’s probably the most neutral model right now with American politics,” he added. 

Srinivas defended DeepSeek, arguing that the claim that China “just cloned” OpenAI’s outputs is a misconception stemming from an incomplete understanding of how these models are trained. DeepSeek-R1, he noted, successfully implemented reinforcement learning (RL) fine-tuning.

Likening DeepSeek to TikTok is not inaccurate, given their rapid rise and geopolitical implications. While TikTok was compelled to partner with Oracle to host its US user data in American data centres, DeepSeek’s AI models have similarly sparked discussions about potential regulatory responses.

Meanwhile, Amazon Web Services recently announced that Amazon SageMaker AI now supports distilled versions of Llama, Qwen, and DeepSeek models, allowing users to deploy them efficiently. 

Amazon Bedrock’s Custom Model Import feature also allows seamless integration and utilisation of distilled Llama and DeepSeek models. 

In addition, AWS has deepened its collaboration with Hugging Face, making it possible to train DeepSeek models directly on Amazon SageMaker. 

DeepSeek-R1 models are available on IBM’s watsonx.ai as well. Considering that DeepSeek-R1 is also available as an open-source model, it is likely that other cloud service providers will host the model in their services, with a guarantee that customers’ data will remain safe and secure.

AIM reached out to Oracle to inquire whether they would be hosting DeepSeek models, but the company declined to comment.

What About India?

Rajeev Chandrasekhar, former Indian IT minister, took to X to ask whether DeepSeek was on the path to becoming the next TikTok. His remark hinted at growing concerns about DeepSeek AI’s potential impact on user data and its broader geopolitical implications.

Devilal Sharma, an alumnus of IIT Madras, said that since the model is open source, it can be run locally without an internet connection. “Deploy it on your own servers inside your own country, and the data won’t go anywhere,” he said.
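Sharma’s point can be illustrated with a minimal deployment sketch. Assuming a local runtime such as Ollama and its distilled DeepSeek-R1 builds (the model tag below is an assumption for illustration), the weights are fetched once and everything afterwards runs on-premises:

```shell
# One-time download of a distilled DeepSeek-R1 variant.
ollama pull deepseek-r1:7b

# From here on, prompts and responses never leave the local machine;
# the server can even be air-gapped after the initial pull.
ollama run deepseek-r1:7b "Summarise this quarter's sales figures."
```

The same property holds for any self-hosted serving stack: once the open weights are on the organisation’s own hardware, no user data needs to cross a border.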

There is a growing discussion that India should also focus on sovereign AI. “India can do a better job with AI compute resources. We need to invest aggressively in supercomputers, AI data centres, and GPU clusters. Our focus should be on building AI-specific infrastructure, AI research hubs, and innovation centres,” said Manu Jain, CEO of G42 India.

Similarly, Yotta chief Sunil Gupta told AIM that DeepSeek is a true game-changer, demonstrating how advanced AI can be developed with minimal resources. “Its open-source nature and low compute requirements are making AI more accessible than ever, significantly lowering costs and accelerating adoption,” he said.
