Before talking about ionet, let's talk about AI and machine learning.
There are several mainstream classes of AI models: Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Long Short-Term Memory networks (LSTM), Transformers, Generative Adversarial Networks (GAN), and so on. Their main application scenarios today are natural language processing and generative models.
So, which of the above does ionet belong to?
Unfortunately, none of the above. In ionet's white paper, the project is positioned as "building the world's largest AI computing DePIN (decentralized physical infrastructure network)". In the final analysis, it has nothing to do with AI or deep learning; it is simply a provider of hardware computing power.
So how does the concept of distributed computing perform in the web3 market?
Unfortunately, ionet is not the first project dedicated to distributed computing. The "decentralized physical infrastructure network" concept it advocates is essentially no different from what Golem Network does: both are web3 versions of a second-hand marketplace, renting out idle computing resources in exchange for compensation.
*As a reference point, Golem Network's token currently ranks 161st by market capitalization, at approximately US$600 million.
Since the concept is not new and there is not enough innovation to justify calling it an "AI" project, what about the distributed computing market itself?
Unfortunately, this too is a false proposition. As a long-time Zhihu user, I'll add the classic line here: "First ask whether it is true, then ask why."
When we discuss distributed computing, the products we usually have in mind are large language models. Small language models do not require much computing power, so there is no need to distribute them; it is better to handle them directly in a centralized way. Large language models, however, require enormous computing power, and they are currently at the start of an explosive growth phase.
Let's run the numbers for a large model with 175 billion parameters (GPT-3 scale). Because of the model's enormous size, many GPUs are required for training. Assume a centralized computer room with 100 GPUs, each with 32 GB of memory.
The training process inevitably involves a great deal of data transmission and synchronization, which becomes a bottleneck for training efficiency. Therefore, bandwidth and latency need to be optimized, and efficient parallelization and synchronization strategies need to be developed.
The GPT-3 model has 175 billion parameters. If we use single-precision floating-point numbers (4 bytes per parameter) to represent them, storing the parameters alone requires roughly 700 GB of memory. In distributed training, these parameters need to be transmitted and updated frequently between compute nodes. Assuming 100 compute nodes, each of which needs to update all parameters at every step, each step requires transmitting about 70 TB of data (700 GB × 100). If we further assume that one step takes 1 s (a very optimistic assumption), then 70 TB must move every second. This bandwidth demand far exceeds most networks; it is a question of feasibility, not merely performance.
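For readers who want to reproduce the arithmetic, here is a minimal back-of-envelope sketch in Python. The parameter count, node count, and one-second step time are the same assumptions as in the text (a naive scheme in which every node exchanges the full parameter set each step), not measurements of any real system.

```python
# Back-of-envelope estimate of per-step communication volume for
# data-parallel training of a GPT-3-scale model.
# Numbers mirror the assumptions in the text, not measurements.

PARAMS = 175e9           # 175 billion parameters
BYTES_PER_PARAM = 4      # single-precision float (FP32)
NODES = 100              # assumed number of compute nodes
STEP_TIME_S = 1.0        # optimistic assumption: one training step per second

param_bytes = PARAMS * BYTES_PER_PARAM              # ~700 GB of raw parameters
per_step_traffic = param_bytes * NODES              # naive full update per node: ~70 TB
required_bandwidth = per_step_traffic / STEP_TIME_S # bytes per second

print(f"Parameter size:      {param_bytes / 1e9:,.0f} GB")
print(f"Traffic per step:    {per_step_traffic / 1e12:,.0f} TB")
print(f"Aggregate bandwidth: {required_bandwidth * 8 / 1e12:,.0f} Tbit/s")
```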
And this is calculated assuming high-performance GPUs in a centralized computer room (such as the A100). With lower-end or even consumer-grade GPUs, the communication overhead grows by orders of magnitude. In practice, communication latency and network congestion can stretch data transmission far beyond 1 second, which means compute nodes spend most of their time waiting for data rather than computing. This does not merely reduce training efficiency, and it cannot be solved by waiting longer: it is the difference between feasible and infeasible, and it makes the entire training process infeasible.
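To illustrate why link speed is a feasibility question rather than an efficiency question, the sketch below estimates how long moving a single node's ~700 GB parameter update would take over a few representative links. The rates are approximate nominal line rates chosen for illustration, ignoring protocol overhead and congestion.

```python
# Rough per-step transfer time for one node's full parameter update (~700 GB)
# over different links. Rates are nominal line rates, ignoring overhead.

PARAM_BYTES = 175e9 * 4  # ~700 GB in FP32

links_gbit_s = {
    "NVLink-class intra-node link (~900 GB/s)": 900 * 8,
    "Datacenter InfiniBand (400 Gbit/s)": 400,
    "10 Gbit/s Ethernet": 10,
    "Home broadband (1 Gbit/s)": 1,
}

for name, gbit in links_gbit_s.items():
    seconds = PARAM_BYTES * 8 / (gbit * 1e9)
    print(f"{name:45s} ~{seconds:,.0f} s per step")
```

At home-broadband speeds, the transfer alone takes on the order of an hour and a half per step, which is exactly the gap between "slow" and "infeasible" described above.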
In addition: even in a centralized environment, training large models requires communication optimization.
Beyond that, distributed computing has several problems that currently have no good solution:
Communication overhead vs. computing efficiency: In distributed computing, nodes must communicate frequently to exchange data and coordinate tasks, which can produce substantial communication overhead. Yet to keep computation efficient, it is sometimes better to run tasks locally rather than ship data to other nodes over the network (a toy sketch of this trade-off follows the list below).
Consistency vs. performance: Ensuring data consistency in a distributed system is crucial, but in some cases consistency requirements have to be relaxed to gain performance, forcing a trade-off between the two.
Node failure vs. reliability: Nodes can fail due to hardware faults, network failures, or software errors, leading to failed tasks or lost data. Improving reliability requires fault-detection and recovery mechanisms, which in turn add complexity and overhead to the system.
Load balancing vs. performance: Tasks may be assigned to different nodes for execution, but the load across nodes can be uneven, leaving some nodes overloaded while others sit idle. Load-balancing techniques help even this out, but they introduce additional overhead and latency.
Security vs. openness: Distributed computing involves multiple parties sharing data and resources, so security is a central concern. Yet to allow data and resources to be shared freely, security requirements sometimes have to be relaxed, which can lead to data leakage or unauthorized access.
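As a toy illustration of the first trade-off above (communication overhead vs. computing efficiency), here is a hypothetical sketch of the decision to offload work to a remote node: it only pays off when the compute time saved exceeds the time spent shipping the data. The helper function and all of the figures are made up for illustration.

```python
# Toy model of the communication-vs-computation trade-off: offloading work
# to a faster remote node only helps if the compute time saved outweighs
# the time spent transferring the data. All figures are hypothetical.

def worth_offloading(data_bytes, work_flops, local_flops_s, remote_flops_s, link_bit_s):
    """Return True if remote execution (including transfer) beats local execution."""
    local_time = work_flops / local_flops_s
    remote_time = work_flops / remote_flops_s + data_bytes * 8 / link_bit_s
    return remote_time < local_time

# 10 GB of input data, 1e14 FLOPs of work, a 10 TFLOP/s local GPU,
# a 100 TFLOP/s remote GPU, and a 1 Gbit/s home uplink:
# local ~10 s vs. remote ~1 s of compute plus ~80 s of transfer.
print(worth_offloading(10e9, 1e14, 10e12, 100e12, 1e9))  # False: transfer dominates
```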
To sum up: regardless of whether the decentralized computing logic holds, ionet as a computing power marketplace will not deliver impressive market performance, and it cannot support the US$2 billion or even US$3 billion valuation the miners are boasting about.
Having home computers participate in AI model computation under the existing architecture is a fantasy, plain and simple: groundless nonsense.
This can also be seen from the load diagram posted on ionet's official website:
Except for professional-grade graphics cards, all others are running idle.
The market will vote with its feet.
Conflict of interest: former co-founder of a decentralized computing power platform built on a federated learning architecture; a web3 guy who studied finance and business and knows a little about AI.