Original title: a16z Partners' Latest Consumer Insights: There Are No Moats in the AI Era, Only Speed

Original text source: Newin

From Facebook to TikTok, consumer products have pushed social evolution by connecting people. However, in the new AI-driven cycle, "completing tasks" is replacing "establishing relationships" as the main focus of products. Products like ChatGPT, Runway, and Midjourney represent a new entry point; they not only reshape content generation methods but also change the user payment structure and product monetization pathways. The five a16z partners focusing on consumer investments revealed in discussions that while current AI tools are powerful, they have yet to establish social structures and lack a "connectivity" platform fulcrum.

The absence of consumer-level blockbuster products reflects a disconnect between platforms and models. A truly AI-native social system has yet to emerge, and this gap could give rise to the next generation of super applications. Meanwhile, products such as AI avatars, voice agents, and digital personalities are beginning to take shape, and their significance goes far beyond companionship or tooling: they build new mechanisms of expression and psychological relationship. The core competitive advantage of future platforms may shift toward model capabilities, product evolution speed, and the depth of cognitive-system integration.

AI is rewriting the 2C business model.

Over the past twenty years, representative consumer products have emerged every few years, from Facebook and Twitter to Instagram, Snapchat, WhatsApp, Tinder, and TikTok, with each product driving an evolution in social paradigms. In recent years, this rhythm seems to have stagnated, raising an important question: Has innovation really paused, or are we facing a reconstruction of our definition of "consumer products"?

In the new cycle, ChatGPT is regarded as one of the most representative consumer-level products. Although it is not a traditional social network, it has profoundly changed the relationship people have with information, content, and even tools. Tools like Midjourney, ElevenLabs, Blockade Labs, Kling, and VEO have rapidly gained popularity in audio, video, and image fields, but most have yet to establish a connection structure between people and lack social graph attributes.

Currently, most AI innovations are still led by model researchers, possessing technical depth but lacking experience in building terminal products. With the popularization of APIs and open-source mechanisms, underlying capabilities are being released, and new consumer-level blockbusters may emerge as a result. The development of consumer internet over the past twenty years, and the success of Google, Facebook, and Uber, is rooted in three underlying waves: the internet, mobile devices, and cloud computing. The current evolution, however, comes from leaps in model capabilities; the rhythm of technology is no longer portrayed as functional updates, but is driven by remotely upgraded models.

The main line of consumer products has also shifted from "connecting people" to "completing tasks." Google was an information retrieval tool, and ChatGPT is gradually taking over that role. Tools like Dropbox and Box, while not establishing social graphs, still possess wide penetration on the consumer end. Despite the continuous rise in content generation demands, the connection structure of the AI era has yet to be established, and this gap may be the direction for the next breakthrough.

The moat of traditional social platforms is facing reevaluation. In the context of the rise of AI, platform dominance may be shifting from building relational graphs to building capabilities in models and task systems. Whether tech-driven companies like OpenAI are becoming the next-generation platform companies is worth noting.

From a business model perspective, AI products' monetization capabilities far exceed those of past consumer tools. In the past, even leading applications had relatively low average revenue per user. Today, top users can pay up to $200 per month, exceeding the limits of most traditional tech platforms. This means that companies can bypass advertising and lengthy monetization pathways, directly obtaining stable revenue through subscriptions. The earlier overemphasis on network effects and moats was essentially due to weak product monetization abilities. Today, as long as tools are valuable enough, users are naturally willing to pay.

This change has brought about a structural shift. The traditional "weak business model" forced founders to build narratives around user stickiness, lifecycle value, and other metrics, while AI products, with their direct charging capabilities, can close the business logic loop early in their launch. Although models like Claude, ChatGPT, and Gemini may seem similar at the functional level, the actual user experience reveals significant differences. This preference divergence has spawned independent user groups. The market has not only avoided price wars but has also shown a trend of leading products continuously raising prices, indicating that a differentiated competitive structure is gradually being established.

AI is also reconstructing the definition of "retention rate." In traditional subscription products, user retention determines revenue retention. Now, users may continue using basic services but choose to upgrade subscriptions due to more frequent usage, larger credit allocations, or higher-quality models. Revenue retention significantly exceeds user retention, a phenomenon unseen in the past. The pricing model of AI products is undergoing fundamental changes. Traditional consumer subscriptions generally cost around $50 per year, while now many users are willing to pay $200 per month or even higher. This price structure's acceptability stems from the essential change in the actual value experienced by users.
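
The claim that revenue retention can exceed user retention is simple arithmetic once a cohort is written out. The following sketch uses purely hypothetical numbers (a $20 base tier, a $200 upgrade tier) to show how a cohort can lose 30% of its users yet retain far more than 100% of its revenue:

```python
# Illustrative only: why revenue retention can exceed user retention
# when the users who stay upgrade to pricier tiers. All numbers are
# hypothetical, not taken from any real product.

def user_retention(start_users, retained_users):
    return retained_users / start_users

def net_revenue_retention(start_revenue, retained_revenue):
    return retained_revenue / start_revenue

# Hypothetical cohort: 1,000 subscribers on a $20/month plan.
start_users = 1000
start_mrr = start_users * 20                      # $20,000 MRR

# A year later: 700 users remain, and 200 of them
# upgraded to a hypothetical $200/month tier.
retained_users = 700
retained_mrr = 500 * 20 + 200 * 200               # $10,000 + $40,000

print(f"user retention:    {user_retention(start_users, retained_users):.0%}")    # 70%
print(f"revenue retention: {net_revenue_retention(start_mrr, retained_mrr):.0%}") # 250%
```

The cohort shrinks, yet monthly revenue grows 2.5x, which is exactly the "revenue retention exceeds user retention" pattern described above.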

AI products can command high premiums because they no longer just "assist with improvements" but genuinely "complete tasks for users." For example, a report that once took ten hours to compile manually can now be generated by research tools in just a few minutes. Even if used only a few times a year, such a service justifies a reasonable payment expectation. In the field of video generation, Runway's Gen-3 model is regarded as representative of the next generation of AI tools' experiential leap. It can generate videos in varying styles from natural-language prompts, supporting voice and action customization. Some users have used the tool to create personalized videos featuring friends' names, while others generate complete animated works to upload to social platforms. This "generate in seconds, use immediately" interaction experience is unprecedented.

From a consumer structure perspective, future user spending will be highly concentrated in three categories: meals, rent, and software. As a universal tool, the penetration speed of software is continuously increasing, with spending proportion steadily rising, beginning to consume budget space that previously belonged to other categories.

A true AI social network has yet to emerge.

Entertainment, creation, and even interpersonal relationships themselves are gradually being mediated by AI tools. Many things that previously relied on offline communication or social interaction can now be accomplished through subscription models, from video generation to writing assistance, even replacing some emotional expression. Under this trend, the mechanism of connection between people is also facing a necessary rethinking. Although users remain active on traditional platforms like Instagram and Twitter, a truly new generation of connection methods has yet to emerge.

The essence of social products has always revolved around "status updates." From text to images to short videos, the medium has continuously evolved, but the underlying logic remains "What am I doing," aimed at establishing presence and obtaining feedback. This structure formed the foundation of the previous generation of social platforms. The current question is whether AI can foster a completely new way of connecting. Model interactions have become deeply integrated into users' lives: in their daily conversations with AI tools, users pour in highly personal emotions and needs. A system built on this long-term input is likely to understand users better than any search engine, and if that understanding were systematically extracted and externalized as a "digital self," the logic of connection between people could be reconstructed.

Some early phenomena have begun to emerge. For example, on TikTok, AI feedback-based personality tests, comic generation, and content imitation have started to appear. These behaviors are no longer merely content generation but a form of "digital mapping" social expression. Users not only generate but also actively share, triggering imitation and interaction, showing a high interest in "digital self-expression." However, all of this remains confined within the structures of old platforms. Whether on TikTok or Facebook, although the content is smarter, the information flow structure and interaction logic have hardly changed. Platforms have not genuinely evolved due to the model explosion; they have merely become hosting containers for generated content.

The leap in generative capabilities has yet to find a matching platform paradigm. A large amount of content lacks structured presentation and interactive organization, instead being dissolved into information noise by the existing content architecture of platforms. Old platforms serve a content-carrying function rather than acting as engines for reconstructing social paradigms. Current platforms resemble "old systems in new skins." Short videos, Reels, and similar forms, while appearing modern and youthful, still do not escape the constraints of feed-driven push and like-based distribution.

An unresolved core question is: What will the first truly "AI-native" social product look like? It should not be a collage of model-generated images or a visual refresh of information flows but a system capable of carrying real emotional fluctuations, sparking connections and resonance. The essence of social interaction has never been about perfect performances but about uncertainty—awkwardness, failure, and humor constitute the emotional tension structure. Today, many AI tools output the "most ideal user version," always positive, always smooth, yet rendering real social experiences singular and hollow.

Current products referred to as "AI social" are essentially still modeled replicas of old logic. Common practices include reusing old platform interface structures and treating models as content sources, yet without fundamentally changing product paradigms and interaction structures. Truly breakthrough products should reconstruct platform systems from the underlying logic of "AI + human."

Technical limitations remain a significant barrier. Almost all consumer blockbuster products have emerged from mobile platforms, yet deploying today's large models on mobile devices still faces challenges. Real-time response, multimodal generation, and similar capabilities place extremely high demands on on-device computing power. Until model compression and computational efficiency improve, "AI-native" social products will struggle to be fully realized. The individual matching mechanism is another area that has yet to be fully activated. Despite social platforms holding vast amounts of user data, there has never been systematic progress on "actively recommending appropriate connections." If dynamic matching systems can be built from user behavior, intent, and language interaction patterns, the underlying logic of social connections will be reshaped.
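
The "dynamic matching" idea above can be made concrete with a toy model. This sketch scores candidate connections by overlap between users' observed behavioral and topical signals; the data model, the signal names, and the Jaccard scoring are all illustrative assumptions, not any platform's actual system:

```python
# Hypothetical sketch of dynamic matching: recommend connections by
# similarity of observed interaction/interest signals. Names and
# signal sets are invented for illustration.

def jaccard(a, b):
    """Similarity of two sets of behavioral/topic signals (0.0 to 1.0)."""
    return len(a & b) / len(a | b) if a | b else 0.0

users = {
    "alice": {"generative-video", "late-night", "long-form-chat"},
    "bob":   {"generative-video", "late-night", "voice-notes"},
    "carol": {"spreadsheets", "morning", "short-replies"},
}

def recommend(user, pool, top_n=1):
    """Rank everyone else by signal overlap with `user` and return the top matches."""
    scored = [(jaccard(pool[user], signals), name)
              for name, signals in pool.items() if name != user]
    return [name for _, name in sorted(scored, reverse=True)[:top_n]]

print(recommend("alice", users))  # ['bob']
```

A real system would replace the static sets with continuously updated signals from language and behavior, which is precisely what makes the matching "dynamic."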

AI can not only capture "who you are" but also depict "what you know," "how you think," and "what you can bring." These capabilities are no longer limited to static label-based "identity profiles" but form dynamic, semantically rich "personality modeling." Traditional platforms like LinkedIn build static self-indexes, while AI can generate a knowledge-driven living personality interface.

In the future, people may even communicate directly with a "synthesized self," gaining experiences, judgments, and values from digital personalities. This is no longer an optimization of information flow structures but a fundamental reconstruction of the mechanisms for personality expression and social connection.

In the AI era, there are no moats, only speed.

In addition to social platforms not yet experiencing paradigm shifts, the user diffusion paths of AI tools are also reversing. Unlike the past, where growth started on the consumer end and gradually penetrated the business end, AI tools now show a reverse diffusion pattern in which enterprise clients adopt them first, followed by consumer diffusion. Taking voice generation tools as an example, initial users were concentrated in niche circles such as geeks, creators, and game developers, with uses including voice cloning, voice-over videos, and game mods. However, the true driving force behind growth comes from large-scale, systematic adoption by enterprise clients, applied in entertainment production, media content, voice synthesis, and other fields; many businesses have embedded these tools into their workflows, completing enterprise penetration earlier than expected.

This path is no longer an isolated example. Multiple AI products exhibit similar trajectories: they initially generate attention through viral spread in the consumer sphere, and then enterprise clients become the main drivers of monetization and scaling. Unlike traditional consumer products that struggle to convert to the enterprise side, many businesses today discover AI tools through communities like Reddit, X, and newsletters and actively pilot them, with consumer enthusiasm becoming the entry point for enterprise AI deployment. This logic is being productized and engineered into systematic strategies, with some companies establishing mechanisms that proactively trigger enterprise sales processes when the platform detects multiple employees from the same organization registering and using a tool. The shift from consumer to enterprise is no longer an incidental event but a replicable business path.
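
The trigger mechanism described above, detecting several employees of one organization among consumer signups, reduces to grouping registrations by corporate email domain. The following is a minimal sketch under stated assumptions; the consumer-domain list, the threshold of five, and the function names are all illustrative, not any company's actual implementation:

```python
# Hypothetical sketch of the "consumer signups trigger enterprise sales"
# mechanism: group new registrations by email domain, ignore consumer
# providers, and flag organizations that cross a threshold.
from collections import Counter

CONSUMER_DOMAINS = {"gmail.com", "outlook.com", "yahoo.com", "qq.com"}
TRIGGER_THRESHOLD = 5  # illustrative: 5+ users from one company starts outreach

def enterprise_leads(signup_emails):
    """Return {domain: signup_count} for corporate domains above the threshold."""
    domains = Counter(
        email.split("@")[-1].lower()
        for email in signup_emails
        if "@" in email
    )
    return {
        domain: count
        for domain, count in domains.items()
        if domain not in CONSUMER_DOMAINS and count >= TRIGGER_THRESHOLD
    }

signups = ["a@acme.com"] * 6 + ["b@gmail.com"] * 9 + ["c@example.org"] * 2
print(enterprise_leads(signups))  # {'acme.com': 6}
```

The nine Gmail signups are ignored as consumer traffic, while six registrations from one corporate domain surface as a lead, which is the "bottom-up" signal the article describes.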

This "bottom-up" diffusion mechanism also raises larger questions: are these hot AI products the foundational platforms of the future, or are they transitional products similar to MySpace and Friendster?

Current judgments tend to be cautiously optimistic. AI tools have the potential to evolve into long-term platforms, but they must navigate the technical pressures brought by continuous evolution at the model level. For example, the new generation of multimodal models not only supports role-playing, graphic-text collaboration, and real-time audio generation, but also rapidly enhances the depth of expression and interaction modes. Even in the relatively stable text domain, there remains substantial room for model optimization. As long as they can continue to iterate, whether through in-house development or efficient integration, tool-based products can maintain their position at the forefront without being rapidly replaced.

"Don't fall behind" has become the most practical competitive proposition today. In an increasingly segmented market, image generation is no longer judged by a single standard of "who is the strongest" but by competitive positioning: who best serves illustrators, photographers, or light users. As long as updates continue and users remain engaged, products have the potential for long-term sustainability.

Similar professional differentiation is also emerging in video tools. Different products excel in different content forms; some focus on e-commerce advertising, others emphasize narrative pacing, and some highlight structural editing. The market capacity is large enough to support coexistence of multiple positioning, with clarity and stability of structural positioning being key.

The discussion about whether the concept of "moat" still applies to the AI era is undergoing fundamental changes. Traditional logic emphasizes network effects, platform binding, and process integration, yet many projects once considered to have "deep moats" ultimately failed to become winners. Instead, those small teams that frequently experiment on the edges, iterating rapidly on models and products, eventually enter the center of the main track.

The most noteworthy "moat" currently is speed: first, distribution speed, meaning who can enter the user's field of vision first; second, iteration speed, meaning who can launch new features and build usage habits the fastest. In an era where attention is scarce and cognition is highly fragmented, the product that appears first and keeps changing is the one most likely to accumulate revenue, channels, and market scale. "Continuous updates" are replacing "steady-state defense" as the more realistic strategy of the AI era.

"Speed brings mindshare, and mindshare drives revenue loops" has become one of the most important growth logics today. Capital resources can feed back into R&D, enhancing technological advantages and ultimately forming a snowball effect. This mechanism aligns more closely with the cyclical dynamics of AI products and is more adaptable to rapidly evolving market demands.

"Dynamic leadership" is replacing "static barriers" as the essence of the new generation of moats. The standard for measuring whether an AI product can exist long-term is no longer static market share ownership but whether it can consistently appear at the forefront of technology or user perception.

The traditional "network effect" has not fully manifested in AI scenarios. Most products are still in the "content creation" phase and have yet to form a closed-loop ecosystem of "generation-consumption-interaction." User relationships have not yet solidified into structural networks, and platforms with social-level network effects are still in the making.

However, in some vertical categories, new barrier structures have begun to emerge. Taking voice synthesis as an example, certain products have established process bindings in multiple enterprise scenarios, building a dual barrier of "efficiency + quality" through frequent iteration and high-quality output. This mechanism may become one of the practical pathways to building product moats today.

In terms of experience, some voice platforms have already shown the initial signs of network effects. By continually expanding their databases through user-uploaded corpora and character voice samples, platform models receive ongoing training feedback, creating user reliance and a positive content cycle. For instance, for a targeted voice need like "elderly wizard," leading platforms can provide over twenty high-quality versions while generic products offer only two or three, reflecting the disparity in training depth and content breadth. This accumulation has already begun to establish new user stickiness and platform dependency in voice generation; it has yet to reach platform-level scale, but signs of a closed loop are emerging.

Whether voice can become the underlying interactive interface for AI is also moving from technical imagination to product reality. Voice, the most primitive form of human interaction, has undergone multiple rounds of failed attempts over the past few decades, from VoiceXML to voice assistants, yet never became an efficient human-computer interaction channel. Only with the rise of generative models has voice gained the technological foundation to support a "universal interactive entry point." Voice AI is also penetrating rapidly from consumer applications into enterprise scenarios. Although initial concepts revolved around AI coaches, psychological assistants, and companionship products, adoption is fastest in industries that naturally rely on voice, such as financial services and customer support. High turnover in customer service, inconsistent service quality, and high compliance costs are beginning to reveal the systemic value of AI voice's controllability and automation advantages.

Some tools, such as Granola, have already begun to enter enterprise usage scenarios. While a "mass-market voice product" has yet to appear, the path has been opened.

More importantly, AI voice is entering key scenarios of high trust cost and high-value information transfer. This includes sales conversion, customer management, collaboration negotiations, and internal cultural communication, all relying on high-quality dialogue and judgment transfer. Generative voice models have already demonstrated more consistent, uninterrupted, and controllable execution capabilities in these complex dialogue scenarios than humans.

Once these systems continue to evolve in the future, businesses will have to reassess the fundamental understanding of "who is the most important conversationalist in the organization." Behind all these trends, a new structural judgment is taking shape: the moats of the AI era no longer come from user numbers or ecological binding, but from the depth of model training, the speed of product evolution, and the breadth of system integration. Companies that have early accumulation, continuous updates, and high-frequency delivery capabilities are reshaping technological barriers with "engineering rhythm." The new generation of product infrastructure may gradually take shape in these seemingly vertical small tracks.


The AI avatar that understands you best.

The evolution of voice technology is just the prologue; the concept of AI avatars is gradually moving out of the lab and into productization pathways. More and more teams are starting to ask: under what circumstances will people establish long-term interactions with "synthesized selves"? The core of AI avatars is no longer about amplifying the influence of top creators, but about empowering every ordinary person to express and extend themselves. Many individuals with unique knowledge, experiences, and personal appeal exist in reality, but they have long gone unseen due to barriers of expression and medium. The popularization of AI cloning provides the foundational infrastructure for these individuals to be "recorded, invoked, and inherited."

Knowledge-based personality agents are one of the typical paths currently realized. For example, in voice course systems, the lecturer's voice is constructed as an interactive role, combined with retrieval-augmented generation technology, allowing users to ask any questions about the course, with the system generating answers in real-time from a vast corpus. Courses are no longer just passive content playback, but rather active participation of knowledge personalities, transforming a content that originally required hours of viewing into a personalized Q&A experience completed in minutes.
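
The retrieval-augmented Q&A loop described above has a simple skeleton: retrieve the most relevant course passages for a question, then hand them to a generator prompted to answer in the lecturer's voice. This sketch uses a toy corpus and naive keyword-overlap retrieval in place of real embeddings; the corpus, the scoring, and the prompt wording are all illustrative assumptions, not any product's actual pipeline:

```python
# Minimal retrieval-augmented generation (RAG) skeleton for a
# lecturer-persona Q&A system. Toy corpus and keyword-overlap
# retrieval stand in for a real vector store and embedding model.

COURSE_CORPUS = [
    "Lecture 1: network effects arise when each new user adds value for others.",
    "Lecture 2: subscription pricing works when the tool completes a task outright.",
    "Lecture 3: distribution speed matters more than static moats in fast markets.",
]

def retrieve(question, corpus, k=2):
    """Rank passages by naive keyword overlap with the question."""
    q_words = set(question.lower().split())
    ranked = sorted(corpus,
                    key=lambda p: -len(q_words & set(p.lower().split())))
    return ranked[:k]

def build_prompt(question, passages):
    """Assemble the grounded prompt that a generative model would answer."""
    context = "\n".join(passages)
    return ("Answer in the lecturer's voice, using only this course material:\n"
            f"{context}\n\nStudent question: {question}")

question = "Why do moats matter less than speed?"
passages = retrieve(question, COURSE_CORPUS)
print(build_prompt(question, passages))
```

In a production system the overlap scoring would be replaced by embedding similarity and the prompt sent to a speech-capable model cloned from the lecturer's voice, but the retrieve-then-generate structure is the same.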

This marks the rise of digital personalities from the "content performance layer" to the "cognitive interaction entry point." When AI avatars can continuously present a familiar, ideal, or even transcendent social experience in terms of semantics, rhythm, and emotional structure, users' trust and reliance on them will exceed the tool level, entering the realm of "psychological relationships."

This evolutionary path also promotes the updating of cognitive concepts. Future digital interactions may diverge into two core forms: one is the extended personality built around real individuals (such as mentors, idols, and friends); the other is the "virtual ideal other" generated based on user preferences and idealized settings. Although the latter has never truly existed, it can form a highly effective companionship and feedback relationship.

In the creator field, this trend is also starting to manifest. Some individuals with publicly available corpuses are being "cloned" into callable digital personality assets, which may participate in content production, social interaction, and commercial licensing as part of personal IP in the future, reshaping "individual boundaries" and "expression methods."

The "AI Celebrity" was born from this. One type is completely fictional image idols, constructed comprehensively in terms of image, voice, and behavior by generative models; the other type is multiple digital avatars of real stars, interacting with users in different personality states across different platforms. These "AI cultural personalities" have already been extensively tested in social networks, evaluated based on image realism, behavioral consistency, and semantic modeling depth.

In the content ecosystem, AI tools lower the barriers to creation, yet do not change the scarcity of high-quality content. Compelling content still depends on the creator's aesthetic judgment, emotional tension, and continuous expressiveness. AI plays more of a supportive role in "realizing logic" rather than replacing the "creative impetus."

The group of "creators liberated by tools" is emerging. They may not have traditional artistic backgrounds but have used AI tools to express their intentions. AI provides an entry point, not the endpoint of a channel; whether one can stand out still depends on individual capability, thematic uniqueness, and narrative structure.

This form of expression has already manifested in content products. For example, video content in the form of "virtual street interviews" essentially involves structured interaction with AI-generated characters. The characters can be elves, wizards, or fantastical creatures, and the platform can generate entire dialogues and scenes with a single click, automating the entire process from character setting, language logic to video rendering. This mechanism has gained significant attention across multiple platforms and indicates that narrative AI product forms are taking shape.

Similar trends are also observed in the music field, but model output still faces challenges in expressiveness and stability. The biggest issue with AI music currently is the tendency towards "averageness." Models naturally tend toward central fitting, whereas truly impactful artistic content often stems from "non-average" cultural conflicts, emotional extremes, and resonance with the times.

This is not due to insufficient model capabilities but rather because the algorithmic goals do not cover the tension logic of art. Art is not about being "accurate" but about "new meanings in conflict." This also prompts a rethinking: Can AI participate in generating culturally deep content rather than merely being an accelerator of repetitive expressions?

This exploration ultimately lands on the value of "AI companionship." The relationship layer between AI and humans may be one of the earliest mature and commercially viable scenarios. In early companionship products, many users expressed that even simulated responses create a psychological safe space. AI does not need to truly "understand"; as long as it can build a subjective experience of "being heard," it can alleviate loneliness, anxiety, and social fatigue. For some groups, this simulated interaction might even serve as a prerequisite mechanism for rebuilding real social abilities.

AI relationships are not merely amplifiers of comfort zones. On the contrary, the most valuable companionship may stem from the cognitive challenges they present. If AI can moderately ask questions, guide conflicts, and challenge established perceptions, it could become a guide on the path to psychological growth, rather than merely a confirmer. This adversarial interaction logic is the direction truly worth developing in future AI avatar systems.

This trend also indicates a new functional positioning of technology: shifting from interaction tools to "psychological infrastructure." When AI can participate in emotional regulation, relationship support, and cognitive updating, what it carries is no longer just text or voice capabilities but an extension mechanism of social behavior.

The ultimate proposition of AI companionship is not to simulate relationships but to provide conversational scenarios that are difficult to construct in human experience. In various contexts such as family, education, psychology, and culture, the value boundaries of AI avatars are being expanded—not just as responders, but as conversational partners and relationship shapers.

The next step for AI terminals is social interaction itself.

After AI avatars, virtual companionship, and voice agents, industry attention is further returning to hardware and platform levels—will there be a possibility of disruptive reconstruction in future human-computer interaction forms? a16z believes that, on one hand, the position of smartphones as the main interactive platform remains highly stable, with over 7 billion smartphones deployed globally, and their popularity, ecological stickiness, and usage habits are unlikely to change in the short term. On the other hand, new possibilities are brewing in personal and continuously interactive devices.

One path is the "evolution within smartphones": models are moving toward on-device deployment, with significant room for optimization around privacy protection, intent recognition, and system integration. Another path is to develop new device forms, such as "always-on" headphones, glasses, and pin-style wearables, focusing on effortless activation, voice-driven interaction, and proactive outreach.

The truly decisive variable may still be breakthroughs in model capabilities rather than changes in hardware appearance. Hardware forms provide boundary carriers for model capabilities, while model capabilities define the upper limits of device value. AI should not just be an input box on the web but rather an existence that "coexists with you." This view is increasingly becoming an industry consensus. Many early attempts have begun to explore pathways for "presence-type AI": AI can see user behavior, hear real-time voice, understand interaction environments, and actively intervene in decision-making processes. Transitioning from a suggestion provider to a behavioral participant is one of the key transitional directions for AI implementation.

Some devices can now record user behavior and language data in real-time for retrospective analysis and behavioral pattern recognition. Some products even attempt to proactively read user screen information and provide operational suggestions or even execute commands directly. AI is no longer just a reactive tool but a part of daily life processes.

A further question is: Can AI help users understand themselves? In daily life, where external feedback systems are lacking, most people have a limited systematic understanding of their abilities, cognitive biases, and habitual behaviors. An AI avatar that accompanies users for a sufficient amount of time and understands their paths could become an intelligent mechanism for guiding cognitive awakening and potential release. For example, it could point out to users: "If you invest 5 hours a week in an activity, you will have an 80% chance of becoming a professional in that field in three years"; or recommend networking resources that align most closely with their interest structure and behavioral patterns, thereby creating a more accurate social map.

The core of such intelligent relationship systems is that AI is no longer a function tool used intermittently but is structurally embedded in users' lives. It accompanies work, assists growth, and provides feedback, forming a continuous "digital companion" relationship. On the device side, headphones are seen as the terminal form most likely to carry such AI assistants. Headphone devices, represented by AirPods, offer a natural wearing position and a smooth voice channel, combining low-friction interaction with long-wear comfort. However, their social perception in public settings remains limited: the cultural preconception that "wearing headphones = not welcoming communication" still shapes the path of device proliferation.

The evolution of device forms is not merely a technical issue but also a redefinition of social context. As continuous recording becomes the industry's default trend, new social habits are being reconstructed. The era of "default recording" is quietly unfolding among a generation of young users.

Despite continuous recording raising privacy anxieties and ethical reflections, people are gradually forming a cultural consensus of "recording as background." In some mixed work and social scenes in San Francisco, the "presence of recording" has gradually become internalized as a default setting; similar cultural tolerance has yet to form in places like New York. Differences in how quickly cities accept and adapt to technological experiments are becoming micro-variables in the pace of AI product rollout.

When recording behavior transforms from tool selection to social background, the real reconstruction of norms will revolve around "boundary setting" and "value building." We are currently in the "early stage of synchronously constructing technical pathways and social norms"—with many blanks, few consensuses, and undefined definitions. But this is the most critical time for posing questions, setting boundaries, and shaping order.

Whether it is AI avatars, voice agents, digital personalities, virtual companionship, or hardware forms, the entire ecosystem remains in its most primitive and undefined state. This means that in the coming years, many assumptions will be falsified, and paths will be rapidly amplified. But more critically, it is essential to continuously pose real questions during this stage and build more sustainable answer structures.
