Original title: a16z Leads $33 Million Seed Round: How Does Yupp Reshape AI Evaluation with Blockchain and Incentives?

Original author: ShenZhen, PANews

As AI applications spread across industries, accurately evaluating model performance and building user trust have become urgent problems. Traditional evaluation relies mostly on centralized mechanisms, which struggle to cover diverse scenarios and do not reflect real user preferences. Meanwhile, model "hallucinations" are common, and users often end up in information cocoons when making choices.

Against this backdrop, Yupp, a new platform, is using a crowdsourcing model and an incentive mechanism to reshape how AI models are discovered, compared, and used, with the aim of changing how AI is evaluated. This article examines Yupp's core mechanism, technical highlights, team background, and potential impact on the AI ecosystem.

Team Background and Financing: Experience from Tech Giants

Yupp aims to solve long-standing evaluation problems in AI by building a "trustless" AI feedback market, in which diverse user feedback flows freely under the protection of blockchain and crypto-economic incentives, forming a scalable, fair, and transparent model evaluation layer. By paying incentives for high-quality human-labeled data, Yupp can capture the needs and preferences of real users across scenarios in a timely manner and help AI developers improve their models iteratively.

The project was founded in June 2024 by Pankaj Gupta (co-founder and CEO) and Gilad Mishne (co-founder and head of AI); Chief Scientist Jimmy Lin, a professor at the University of Waterloo, is also part of the core team. The three worked together at Twitter as early as 2010, building and optimizing large-scale recommendation and search systems, and later gained further experience at Google and Coinbase.

Yupp has won strong backing from prominent technology figures and top venture capitalists, both because its vision of decentralization and transparent data value addresses AI vendors' twin needs for trusted evaluation and user participation, and because of the core team's track record.

Last week, Yupp announced a $33 million seed round led by a16z partner Chris Dixon. Other investors include Google Chief Scientist Jeff Dean; Twitter co-founder Biz Stone; Pinterest co-founder Evan Sharp; Perplexity CEO Aravind Srinivas; Stanford University’s Dan Boneh, Chris Ré, Nick McKeown, and Balaji Prabhakar; 45 well-known angels and corporate executives; and Coinbase Ventures.

Core Functions and User Experience: Building an "AI Parliament"

Yupp brings many AI models together on one platform under the motto "Every AI for everyone", letting users easily discover, compare, and use the latest models. Instead of the usual single response, Yupp returns answers from two (or more) models for each prompt, forming an "AI parliament". This design gives users a choice of answers and also helps expose model "hallucinations", because users can compare the responses before deciding which to trust. As Yupp CEO Pankaj Gupta said, side-by-side output is particularly useful for users who worry about generation errors, because they can use it to cross-validate the results.

The platform now supports more than 500 AI models covering text and image generation, including well-known models such as ChatGPT, Claude, Gemini, DeepSeek, Grok, and Llama, as well as many emerging ones. To further improve the experience, Yupp has also launched the "QuickTake" feature, which condenses a lengthy reply into a tweet-length summary.

In addition, Yupp attaches great importance to user privacy: all chat records are private by default unless the user actively makes them public; even if they are shared publicly, no personal information is revealed. Users can control the content and scope of sharing at any time.

Economic Model and Incentive Mechanism: Data Labor Valuation

Yupp combines free use with user feedback and meters model usage through the "Yupp Points" system. New users receive 5,000 points within seconds of signing up and can earn more by rating model replies, selecting a preferred answer, and explaining their reasons. The higher the quality of the feedback, the larger the reward, so users can keep using high-end models such as Claude Opus 4 or OpenAI o3 for free. The platform promises that points will only increase and not decrease, and all current models can be tried for free.

After each question, users receive two model answers and, by giving feedback, win a "digital scratch card" worth 0 to 250 Yupp points. Every 1,000 points can be redeemed for $1, with withdrawals capped at $10 per day and $50 per month. Points can be cashed out in more than 20 currencies, including US dollars and euros, through partners including Stripe, PayPal, and Coinbase. The platform also integrates stablecoins on Base (an Ethereum L2) and Solana to deliver instant, fee-free rewards to users worldwide.

As Pankaj Gupta said, the high-quality feedback users generate is worth far more to AI companies, for model fine-tuning and reinforcement learning, than the rewards paid out. Although a user's monthly earnings may only cover a few cups of coffee, this paid labeled data is crucial to AI iteration.
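The redemption arithmetic can be sketched in a few lines of Python. This is only an illustration built from the figures quoted in this article (1,000 points per dollar, a $10 daily cap, a $50 monthly cap); the function name and parameters are hypothetical and do not describe Yupp's actual payout logic.

```python
# Figures taken from the article; everything else is an illustrative assumption.
POINTS_PER_DOLLAR = 1_000
DAILY_CAP_USD = 10
MONTHLY_CAP_USD = 50


def max_withdrawal_usd(points, withdrawn_today_usd=0.0, withdrawn_this_month_usd=0.0):
    """Return the largest cash-out (in dollars) allowed under the stated caps."""
    balance_usd = points / POINTS_PER_DOLLAR
    daily_room = max(DAILY_CAP_USD - withdrawn_today_usd, 0.0)
    monthly_room = max(MONTHLY_CAP_USD - withdrawn_this_month_usd, 0.0)
    # The binding limit is whichever is smallest: balance, daily room, or monthly room.
    return min(balance_usd, daily_room, monthly_room)
```

Under these numbers, the 5,000 sign-up points are worth $5, a maximum-value scratch card (250 points) is worth $0.25, and reaching the $50 monthly cap requires 50,000 points, or at least 200 maximum-value cards.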

To encourage more people to participate, Yupp has also set up referral rewards: normally the referrer gets 5,000 points and the referred user gets 1,000 points. Under the current promotion, newly registered users get 5,000 points and referred users get an additional 2,500 points.

Yupp VIBE Scoring: A New Paradigm for AI Assessment

To address the opacity, unfairness, and uneven access to evaluation data in existing leaderboards, Yupp launched a beta AI leaderboard and the "Yupp VIBE (Vibe Intelligence Benchmark) Score". The system aggregates preference data generated by users worldwide in natural interactions and aims to provide robust, reliable evaluation results.

Yupp's evaluation principles include:

· Robustness: representative (covering a variety of scenarios), authentic (reflecting what users actually care about), and cheat-resistant (withstanding malicious behavior);

· Trustworthiness: fair and neutral (impartial across models), transparent and open (detailed disclosure of ranking algorithms), and rigorous and scientific (compliant with evaluation standards).

The platform not only collects binary preferences, but also encourages users to point out the advantages and disadvantages of replies (such as "getting to the point", "fast", "good style", etc.), and conducts group analysis based on users' age, education, occupation and other information to show the differences in preferences of different groups.
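To make the idea of group-level preference analysis concrete, here is a minimal Python sketch that computes per-segment win rates from pairwise votes. It is not Yupp's VIBE algorithm, which this article does not describe; the record layout, segment labels, and model names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (user_segment, preferred_model, other_model)
votes = [
    ("student", "model_a", "model_b"),
    ("student", "model_a", "model_b"),
    ("engineer", "model_b", "model_a"),
    ("engineer", "model_a", "model_b"),
]


def win_rates_by_segment(votes):
    """Map each segment to {model: fraction of its comparisons that it won}."""
    wins = defaultdict(lambda: defaultdict(int))
    games = defaultdict(lambda: defaultdict(int))
    for segment, winner, loser in votes:
        wins[segment][winner] += 1
        games[segment][winner] += 1
        games[segment][loser] += 1
    return {
        segment: {model: wins[segment][model] / n for model, n in models.items()}
        for segment, models in games.items()
    }


rates = win_rates_by_segment(votes)
```

With the sample votes, `model_a` wins every comparison among students but only half among engineers, which is the kind of difference between groups that such an analysis is meant to surface. A production system would also need vote weighting and fraud filtering, which this sketch omits.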

On the technical level, Yupp is exploring the use of blockchain, cryptographic primitives, and zero-knowledge proofs to make the evaluation process fair, transparent, and verifiable. The platform also works with professional AI data providers to calibrate raters through profile verification and multi-layer quality checks, filtering out malicious data.

The leaderboard was recently updated to show VIBE scores for models such as GPT‑4.5 Preview, Claude Opus 4, and Claude Sonnet 4, along with each model's win rate, dislike rate, speed, latency, context window, and cost.

Development History and Future Outlook

Yupp was officially launched on June 13, 2025, after six months of internal testing. Since its launch, the product has continued to iterate:

· Multimodal support: added models such as DALL‑E, Flux, Stable Diffusion, Luma Photon, and Google Imagen 4, and lets users upload images or PDFs to ask questions about them;

· Interaction mode expansion: added voice input and voice reading functions;

· Model updates: DeepSeek R1/V3, Mistral Small 3, OpenAI o3‑pro, Hermes 3, Amazon Nova Pro v1, the Microsoft Phi series, and the "MAX model" category have been introduced;

· Real-time information: online query requests are routed to Perplexity and Google Gemini Live with hyperlinked citations;

· Payment upgrade: added PayPal and Venmo withdrawals in the US, plus PayPal support in 24 currencies;

· Share and export: Supports copying with formatting preserved, PDF/text/Markdown export, and sharing of single replies or entire conversations as needed;

· Community activities: runs events such as the "AI Tips Challenge" with prizes of up to tens of thousands of points, and has added features such as personal profile pages and AI-generated chat names.

Yupp's mission is to "empower humans to shape the future of AI." Pankaj Gupta believes that the development of AI requires everyone's participation and contribution. Through multi-perspective AI responses and user feedback, Yupp not only helps users make better decisions, but also provides a continuous driving force for AI evolution.

It is worth mentioning that one of Yupp’s main competitors is LMArena (URL: https://lmarena.ai/), an open AI model evaluation platform that is very popular among AI industry insiders. However, LMArena is still exploring commercialization, and it does not use blockchain technology or offer users direct material rewards or a points-based incentive for participating.

In general, Yupp has opened up a new path for AI evaluation with its crowdsourcing model, incentive mechanism, and evaluation system driven by real user preferences. It gives users free and varied AI interactions and converts their feedback into high-value training data that supports continuous model improvement. With an experienced team and top-tier capital behind it, Yupp is expected to play a key role in the future AI ecosystem and to pursue its vision that "everyone enjoys AI, and everyone shapes AI."

For the newly launched Yupp, however, continuing to ensure data quality, resisting fraud as participation scales, and balancing commercialization against user incentives remain open challenges that it will need to keep working on.
