Last time it was the Spring Festival holiday; is it the May Day holiday this time? That's fine too; every holiday brings its own wonderful expectations.

Recently, various sources have reported that DeepSeek is about to release the next-generation large model DeepSeek R2.

According to leaks on social media, the core highlights are as follows:

Architectural Innovation: Adopts a self-developed Hybrid MoE 3.0 architecture with 1.2 trillion total parameters, of which only about 78 billion are activated per inference, significantly improving efficiency.

Hardware Localization: Trained on Huawei Ascend 910B chip clusters, achieving a computing-power utilization rate of 82%, compared with roughly 91% for comparable Nvidia A100 clusters.

Multi-modal Leap: Achieves 92.4% accuracy on the COCO image segmentation task, surpassing the CLIP model by 11.6 percentage points.

Vertical Domain Implementation: Medical diagnosis accuracy reportedly exceeds 98%, and the industrial quality-inspection false-positive rate falls to 7.2 per million, pushing applied AI to new levels.

It's worth noting that when DeepSeek R1 was released three months ago, Nvidia lost nearly $600 billion in market value in a single day. R2's "low cost + high performance" combination will likely put even more pressure on American tech giants that depend on high-premium chips.

Technological Breakthrough

The parameter scale of the DeepSeek R2 large model has been revealed to reach an astonishing 1.2 trillion, almost doubling from the 671 billion parameters of the previous generation R1. This number is close to the levels of international top models such as GPT-4 Turbo and Google's Gemini 2.0 Pro. The explosive growth in parameters means a significant enhancement in the model's learning and complex task processing capabilities.

DeepSeek R2 adopts a mixture of experts (MoE) architecture, a technique that allocates tasks to multiple 'small expert' modules. Simply put, the model automatically selects the most suitable 'expert' for different tasks, which can improve efficiency and reduce waste of computational resources.

According to reports, R2 activates 78 billion parameters per inference, only about 6.5% of its 1.2 trillion total. This sparse design lets the model maintain high performance while sharply reducing operating costs.
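The routing idea behind MoE can be shown in a few lines. This is a minimal illustrative sketch of top-k expert routing, not DeepSeek's actual architecture (which has not been published); all dimensions, the gate, and the "experts" here are toy placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def moe_forward(x, experts, gate_w, top_k=2):
    """Route input x to the top_k highest-scoring experts and combine
    their outputs, weighted by softmaxed gate scores."""
    scores = x @ gate_w                       # one score per expert
    top = np.argsort(scores)[-top_k:]         # indices of the chosen experts
    weights = np.exp(scores[top])
    weights /= weights.sum()                  # softmax over chosen experts only
    return sum(w * experts[i](x) for w, i in zip(weights, top))

d, n_experts = 16, 8
# Each "expert" is just a small linear layer in this toy example.
expert_mats = [rng.standard_normal((d, d)) for _ in range(n_experts)]
experts = [lambda x, W=W: x @ W for W in expert_mats]
gate_w = rng.standard_normal((d, n_experts))

x = rng.standard_normal(d)
y = moe_forward(x, experts, gate_w, top_k=2)
# Only 2 of 8 experts run per input, so 25% of expert parameters are
# "activated" here, analogous to R2's reported 78B active out of 1.2T (~6.5%).
```

The key point is that the gate runs all inputs, but each input touches only a small fraction of the expert weights, which is why activated parameters, not total parameters, drive compute cost.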

In terms of training data, DeepSeek R2 uses a 5.2PB (1PB = 1 million GB) high-quality corpus covering fields such as finance, law, and patents. Through multi-stage semantic distillation technology, the model's instruction-following accuracy has improved to 89.7%. This means it is better at understanding complex human instructions, such as analyzing legal documents or generating financial reports.

Cost reduction of 97.3%

The biggest claimed breakthrough of DeepSeek R2 is its dramatic cost reduction. According to reports, its unit inference cost is 97.3% lower than GPT-4's: generating a 5,000-word article costs about $1.35 with GPT-4 but only about $0.035 with DeepSeek R2.

The core reason for the cost reduction lies in the optimization of hardware adaptation. DeepSeek R2 is trained on a Huawei Ascend 910B chip cluster, achieving a chip utilization rate of 82%. In comparison, similar Nvidia A100 clusters have an efficiency of 91%.

This suggests that domestic chips are approaching internationally leading levels in AI training and may eventually allow Chinese firms to reduce their dependence on Nvidia.

Multi-modal Capability

Another highlight of DeepSeek R2 is the enhancement of multi-modal capabilities. In the visual understanding module, it adopts a ViT-Transformer hybrid architecture, achieving an accuracy of 92.4% in the COCO dataset object segmentation task, an improvement of 11.6 percentage points over the traditional CLIP model.

In simple terms, it can more accurately identify objects in images, such as distinguishing pedestrians, vehicles, and traffic signs in a street scene photo.

Additionally, R2 supports 8-bit quantization compression, reducing the model size by 83% with a precision loss of less than 2%. This means that in the future, mobile phones and smart home devices can also run high-performance AI locally without relying on cloud servers.
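A minimal sketch of what 8-bit quantization does, using symmetric per-tensor int8 scaling. Note that int8 alone shrinks fp32 weights by 75%, so the reported 83% presumably combines quantization with other compression; the figures below only illustrate the basic mechanism.

```python
import numpy as np

rng = np.random.default_rng(1)
weights = rng.standard_normal(10_000).astype(np.float32)

# Symmetric 8-bit quantization: map floats to int8 via a per-tensor scale.
scale = np.abs(weights).max() / 127.0
q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
dequant = q.astype(np.float32) * scale

size_fp32 = weights.nbytes   # 4 bytes per weight
size_int8 = q.nbytes         # 1 byte per weight: a 75% size reduction
# Worst-case rounding error is half a quantization step (scale / 2),
# i.e. about 0.4% of the largest weight in this tensor.
rel_err = np.abs(dequant - weights).max() / np.abs(weights).max()
```

This is why quantized models fit on phones and edge devices: the weights shrink to a quarter of their size while the reconstruction error stays small relative to the weight range.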

Global AI Competition

The leaks about DeepSeek R2 have triggered a strong reaction in the capital market. Due to cost advantages and technological autonomy, it could pose a threat to American tech companies reliant on Nvidia GPUs.

Analysts predict that if R2's performance is confirmed, Nvidia's stock price may face short-term fluctuations, while Chinese AI industry chain-related enterprises may welcome a new round of growth.

This event also reflects a new pattern in the global AI competition. DeepSeek R2 demonstrates that breakthroughs can also be achieved through architectural innovation and domestic hardware adaptation. The utilization data of Huawei’s Ascend chips (82%) indicates that domestic computing power infrastructure has attained international competitiveness.

Although the leaked information is exciting, some industry insiders point out that there are contradictions in the information. There have been instances of unofficial Chinese content being translated and spread on external networks, further increasing uncertainty.

DeepSeek has yet to officially confirm a release date, but combined with the Politburo's recent collective study session on artificial intelligence, the dual tailwinds of policy support and technological breakthroughs may accelerate R2's release.

From chips to algorithms, from data to applications, every link in China's AI industry chain is accelerating toward autonomy. Huawei's Ascend chips are replacing Nvidia's, and the 5.2PB Chinese corpus is building vertical-domain moats. These moves are part of a high-stakes race for the technological discourse power of the next decade.
