Thursday, January 30

Meanwhile, US AI developers are hurrying to analyse DeepSeek’s V3 model. DeepSeek in December published a research paper accompanying the model, the basis of its popular app, but many questions, such as total development costs, are not answered in the document.

China has now closed the gap from 18 months to six months behind state-of-the-art AI models developed in the US, one person said. Yet with DeepSeek’s free release strategy drumming up such excitement, the firm may soon find itself without enough chips to meet demand, this person predicted.

DeepSeek’s strides did not flow solely from a US$6 million shoestring budget, a tiny sum compared with the US$250 billion that analysts estimate big US cloud companies will spend this year on AI infrastructure. The research paper noted that this cost referred specifically to chip usage on its final training run, not the entire cost of development.

The training run is the tip of the iceberg in terms of total cost, executives at two top labs told Reuters. Determining how to design that training run can cost orders of magnitude more, they said.

The paper stated that the training run for V3 was conducted using 2,048 of Nvidia’s H800 chips, which were designed to comply with US export controls released in 2022, rules that experts told Reuters would barely slow China’s AI progress.

Sources at two AI labs said they expected earlier stages of development to have relied on a much larger quantity of chips. One of the people said such an investment could have cost north of US$1 billion.

Some American AI leaders lauded DeepSeek’s decision to launch its models as open source, which means other companies or individuals are free to use or change them.

“DeepSeek R1 is one of the most amazing and impressive breakthroughs I’ve ever seen – and as open source, a profound gift to the world,” venture capitalist Marc Andreessen said in a post on X on Sunday.

The acclaim garnered by DeepSeek’s models underscores the viability of open source AI technology as an alternative to costly and tightly controlled technology such as OpenAI’s ChatGPT, industry watchers said.

Wall Street’s most valuable companies have surged in recent years on expectations that only they had access to the vast capital and computing power necessary to develop and scale emerging AI technology. Those assumptions will come under further scrutiny this week and next, when many American tech giants report quarterly earnings.
