【DeepSeek releases Prover-V2 model with 671 billion parameters】Golden Finance reports that DeepSeek today released a new model, DeepSeek-Prover-V2-671B, on the AI open-source community Hugging Face. The model reportedly has 671 billion parameters and may be an upgraded version of the Prover-V1.5 mathematical model released last year. It uses the more efficient safetensors file format and supports multiple computational precisions, allowing faster and more resource-efficient training and deployment. Architecturally, the model is built on the DeepSeek-V3 architecture and adopts an MoE (Mixture of Experts) design, with 61 Transformer layers and a 7168-dimensional hidden layer. It also supports ultra-long context, with a maximum position embedding of 163,800, enabling it to handle complex mathematical proofs, and it employs FP8 quantization to reduce model size and improve inference efficiency. (Golden Ten)
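
For readers who want to try the checkpoint themselves, below is a minimal sketch of loading it from Hugging Face with the transformers library. The repository name deepseek-ai/DeepSeek-Prover-V2-671B, the example prompt, and the generation settings are assumptions for illustration and have not been checked against the official model card; in practice a 671B-parameter model needs multi-GPU or offloaded inference.

```python
# Minimal sketch (assumptions noted below): loading DeepSeek-Prover-V2-671B
# from Hugging Face with the transformers library.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-Prover-V2-671B"  # assumed repository name

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,   # DeepSeek-V3-style MoE models ship custom modeling code
    torch_dtype="auto",       # keep the checkpoint's native precision (FP8 weights)
    device_map="auto",        # shard the 671B parameters across available devices
)

# Hypothetical prompt: ask the prover for a formal proof of a simple statement.
prompt = "Prove in Lean 4 that for all natural numbers n, n + 0 = n."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```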