DeepSeek Releases V4 Open-Source AI Model with 1.6 Trillion Parameters
SCMP Tech · 24.04.2026 · Technology · 1 min read · China

Chinese AI startup's flagship model boasts 1M token context window, aims to compete with OpenAI and Google DeepMind

In brief

  • DeepSeek released its V4 foundational AI model in two versions: V4-pro, with 1.6 trillion parameters, and V4-flash, with 284 billion parameters.
  • Both feature a 1 million token context window, up from 128,000 in the previous model.
  • The open-source models aim to compete with US leaders OpenAI and Google DeepMind, with Huawei and Cambricon quickly announcing chip compatibility support.

AI-generated summary

Why it matters

DeepSeek is a Hangzhou-based AI startup that has been gaining recognition for its open-source models. The V4 release marks a significant upgrade from its previous flagship, which had a 128,000-token context window. The model architecture and training techniques are outlined in an extended technical report.

DeepSeek has finally released its much-anticipated next-generation foundational artificial intelligence model, the open-source V4, which it said was competitive with leading US closed-source models from the likes of OpenAI and Google DeepMind. The Hangzhou-based AI start-up released two versions of the model on Friday. The V4-pro model has 1.6 trillion parameters, making it the company's biggest-ever model by that metric, while the smaller V4-flash model has 284 billion parameters. A higher parameter count generally correlates with greater capabilities for a model, while also increasing the computational demands of training and serving it.

Both models have a context window of 1 million tokens, a critical feature that determines how much information an AI system can process at once, which DeepSeek said was achieved with "world-leading" cost efficiency. DeepSeek's previous flagship model had a context window of 128,000 tokens. Soon after DeepSeek's release, Huawei announced "full support" across its range of Ascend chips, along with its supernode systems, for running inference on V4 models. The Shenzhen-based tech giant is set to reveal more details about the collaboration in a livestream on Friday afternoon. AI chipmaker Cambricon Technologies also moved quickly to announce compatibility with DeepSeek's new models.

"The release of V4 explicitly mentions compatibility with domestic chips," said analysts from Huatai Securities in a note to clients. "We can look forward to a significant improvement in the capabilities of domestic graphics cards and their widespread adoption this year."

While the parameter size of V4-pro makes it prohibitively large to be run locally on consumer-grade hardware, the extended technical report outlining V4's model architecture and training techniques is likely to be beneficial for global AI developers. The V4-flash model is also one of the cheapest cutting-edge models available on the market, with token pricing identical to DeepSeek's V2 model released in June 2024.
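To make the point about consumer-grade hardware concrete, the following is a rough back-of-the-envelope sketch of how much memory the weights alone would occupy. The parameter counts come from the article; the bytes-per-parameter values are illustrative assumptions, since the article does not say what numeric precision the released weights use. The estimate also ignores activations and the memory needed to hold a long context.

```python
# Rough weight-memory estimate. Parameter counts are taken from the article;
# the precisions below are illustrative assumptions, not DeepSeek's published format.
PARAMS = {"V4-pro": 1.6e12, "V4-flash": 284e9}
BYTES_PER_PARAM = {"16-bit": 2.0, "8-bit": 1.0, "4-bit": 0.5}


def weight_gigabytes(num_params: float, bytes_per_param: float) -> float:
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def estimate_all() -> dict:
    """Return {(model, precision): gigabytes} for every combination above."""
    return {
        (model, precision): weight_gigabytes(count, width)
        for model, count in PARAMS.items()
        for precision, width in BYTES_PER_PARAM.items()
    }


if __name__ == "__main__":
    for (model, precision), gb in estimate_all().items():
        print(f"{model:9s} {precision:7s} ~{gb:,.0f} GB")
```

Under these assumptions, V4-pro's weights would take roughly 3,200 GB at 16-bit precision and about 800 GB even at 4-bit, far more than the tens of gigabytes of memory found on consumer graphics cards. V4-flash would come to roughly 568 GB at 16-bit and 142 GB at 4-bit.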

What to watch

AI perspective: possibilities, not facts

  • Huawei will reveal Ascend chip collaboration details in a Friday afternoon livestream

    Very likely · Within days

  • More Chinese semiconductor companies will announce DeepSeek V4 compatibility

    Likely · Within weeks

Open questions

  • What are the specific performance benchmarks for V4 compared to GPT-4o and Gemini?
  • What is the exact pricing for V4-flash tokens?
  • What are the computational requirements for running V4-flash?

This article was originally published by SCMP Tech.
