Sahara is about to hold its TGE at a pre-market valuation above $1 billion, and the AI × data asset sector is heating up fast.

At the same time, several other AI projects are lining up to launch tokens, and the whole sector is entering a new warm-up phase.

Codatta @codatta_io, a data annotation platform that sits lower in the stack and is rarely discussed, is also expected to hold its TGE soon.

It has not launched a token yet and has so far only raised a $2.5 million seed round from OKX. Until now, I had only tried its tasks, the same way I did with Sahara's.

Some people doing the tasks treated it as a direct counterpart to Sahara. In my view, it actually comes at the problem from a completely different angle:

If Sahara is a "data trading platform", then Codatta is more like a "data asset factory". One solves circulation, the other focuses on producing the assets. The two divide the work and may well complement each other.

Right now, AI model evaluation looks more and more like a leaderboard-gaming show. Lately, many of my friends have been hooked on the Chatbot Arena leaderboard. Claude, GPT-4o, and Gemini trade places back and forth, and some people even set up group chats to bet on who will win. But have you ever really asked yourself: can this leaderboard still be trusted?

The answer is: less and less.

The organizers themselves have revealed that models can be tuned to user preferences, and even the anonymous evaluation mechanism can be polluted by paid vote brigades. You thought you were taking part in model evaluation, but you were really just watching a rehearsed script.

Amid this chaos, Codatta launched Arena. Going by the name, you might think it's just "another leaderboard", but what they're doing is rebuilding the foundation of trust in evaluation. Every question is submitted by users, the models answer blind, and the community votes anonymously for the best answer. Voting data, answer content, and scoring logic are all written to Kite, an Avalanche subnet, and stored on-chain where no one can tamper with them. In other words, whether you are OpenAI or some knockoff large model from next door, you are fully exposed here, with no back room.

Codatta Arena entrance: https://t.co/T0UmZiyVaF
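
To make the "no one can tamper with it" idea concrete, here is a minimal sketch of a hash-chained log in Python. It is only an illustration of tamper evidence in general, not Codatta's or Kite's actual implementation: the record fields, the function names (record_hash, append_record, verify_log), and the use of SHA-256 are all assumptions made for this example.

```python
import hashlib
import json

# Hypothetical starting value for the chain (an assumption for this sketch).
GENESIS = "0" * 64


def record_hash(prev_hash, payload):
    """Hash a record together with the hash of the record before it."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_record(log, payload):
    """Append a payload, linking it to the previous entry by hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {
        "payload": payload,
        "prev": prev_hash,
        "hash": record_hash(prev_hash, payload),
    }
    log.append(entry)
    return entry


def verify_log(log):
    """Recompute every hash; any edited record breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        expected = record_hash(prev_hash, entry["payload"])
        if entry["prev"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True


# Illustrative records only; the field names are made up for this sketch.
log = []
append_record(log, {"question": "q1", "answers": {"model_a": "...", "model_b": "..."}})
append_record(log, {"question": "q1", "vote": "model_a"})
```

Calling verify_log(log) on the untouched log returns True. Change a stored vote afterwards and it returns False, because the stored hash no longer matches the content. A public chain adds the part this sketch leaves out: many independent nodes hold the same log, so nobody can quietly rewrite the hashes as well.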

Arena is one of Codatta's highlights, but it is only one part of the project. Codatta's bigger ambition is to establish a new logic for "data".

Until now, the training data behind large models has been treated as free. Whether it was an open-source corpus or the "I agree" button you and I clicked, nobody ever asked whether you actually consented or whether you wanted a share of the money. The mechanism Codatta proposes is this: when you ask questions, vote, and review, that is no longer free labor but a data asset whose ownership can be confirmed, whose origin can be traced, and which can earn a share of the profits.

Every valid label you submit and every evaluation you take part in earns you points. When the token goes live (TGE), these points will be converted into airdrops. This is the incentive logic Web3 ought to have: data is not free, because behind it are human labor and human judgment.
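
How points turn into tokens has not been spelled out in this thread, so the following is purely a sketch under one assumption: a simple pro-rata split, where each user's share of a token pool equals their share of total points. The function name allocate_airdrop, the pool size, and the user names are all hypothetical.

```python
def allocate_airdrop(points, pool):
    """Split a token pool pro rata by points (an assumed rule, not Codatta's)."""
    total = sum(points.values())
    if total <= 0:
        # Nobody earned points, so nobody receives tokens.
        return {user: 0 for user in points}
    # Integer division rounds down; any leftover dust stays in the pool.
    return {user: pool * earned // total for user, earned in points.items()}


# Made-up numbers for illustration only.
example_points = {"alice": 600, "bob": 300, "carol": 100}
example_allocation = allocate_airdrop(example_points, 1_000_000)
```

Under this assumed rule, a user holding 60% of all points would receive 60% of the pool. The real conversion could just as well use tiers, caps, or vesting.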

So in this sense, Arena is not just an "evaluation feature". It is a demonstration of this data-rights logic: every interaction we have with a model becomes a process of producing data and accumulating assets. Just as Arweave turns storage into an asset and Render turns computing power into an asset, Codatta turns data behavior into an asset.

Beyond the technology, Codatta has recently added two ecosystem partners: @SaharaLabsAI and @irys_xyz. One is the most aggressive player in the privacy computing sector, and the other is a leading protocol focused on on-chain storage incentives. The former ensures that user data is not leaked while users take part in evaluations, and the latter ensures that on-chain data is archived permanently and in a distributed way. With these two on board, Codatta's core capability of "on-chain data ownership confirmation + labeling incentives" can later plug into deeper AI-native scenarios, such as private evaluations for enterprises and evaluation systems for sensitive industries.

So where does the project stand now?

By the numbers, it has accumulated more than 300,000 users and 2 million verified data records. Although it has not launched a token yet, the whole loop is already in place: the points system is live, the staking mechanism is available, and user behavior can be mapped to token shares. It is not a slide-deck project, nor a cyber pipe dream, but a protocol product you can watch being built step by step.

We often say that Web3's mission is to "assetize" user behavior, but most projects end up turning that "asset" into an airdrop-farming opportunity. Codatta does not. Its path is bound to be a quieter one. If you ask whether it is a future giant, I don't know. But if you ask whether it is one of the most original and verifiable evaluation protocols in AI + Web3 right now, the answer is yes.

As AI systems become more and more like black boxes and evaluation results are manipulated more and more, there has to be a place that tells us:

Is this model any good? You get to decide.

Many people have not heard of this project. With AI data this hot, Codatta's business lines cover a lot of ground. It doesn't have much of a voice yet and few people are paying attention, but let me make a bold prediction: it should land a spot on OKX or Binance. What do you think? Share your take in the comments.