Original title: Why AI Characters & Virtual Influencers Are the Next Frontier in Video ft. Hedra's Michael Lingelbach

Hosts: Justine Moore and Matt Bornstein, a16z

Guest: Michael Lingelbach

Compiled & Edited by: Janna, ChainCatcher

Editor's note

Michael Lingelbach is the founder and CEO of Hedra, a former PhD student in computer science at Stanford University, and a stage actor who combines technology with a passion for performance, leading Hedra to develop industry-leading generative audio and video models. Hedra is a company focused on full-body expression and dialogue-driven video generation, with technology supporting a wide range of applications from virtual influencers to educational content, significantly lowering the barriers to content creation. This article is compiled from the a16z podcast, focusing on how AI technology has crossed from viral meme content to enterprise-level applications, showcasing the innovative potential of generative audio and video technology.

The following is the dialogue content, compiled and organized by ChainCatcher (with edits).

TL;DR

  • Artificial intelligence is bridging consumer and enterprise scenarios: a Hedra-generated ad featuring a talking baby promoted enterprise software, highlighting how eagerly enterprises are embracing new technology.

  • Viral meme content has become a powerful tool for startups; the 'baby podcast' trend rapidly boosted brand awareness, showcasing a clever market strategy.

  • Full-body expression and dialogue-driven video generation technology fills a creative gap, significantly reducing the time and cost of content production.

  • Creators like Jon Lajoie shape unique digital personas through projects like the 'Moses Podcast', giving content a distinct personality and appeal.

  • Content creators like 'mom bloggers' quickly produce videos with the help of technology, easily maintaining brand activity and audience connection.

  • Real-time interactive video models enable two-way conversations with virtual characters, bringing immersive experiences to education and entertainment.

  • Character-centric video generation technology focuses on personal expression and multi-agent control, meeting the demands of dynamic content creation.

  • A platform strategy that integrates dialogue, motion, and rendering creates a smooth generative media experience and meets the demand for high-quality content.

  • Interactive avatar models support dynamic adjustments of video emotions and elements, signaling the next wave of innovation in content creation.

(1) AI's Crossover from Memes to Enterprise Applications

Justine: We find the crossover of AI between consumer and enterprise scenarios very interesting. A few days ago, I saw a Hedra-generated advertisement in Forbes featuring a talking baby promoting enterprise software. It shows we are in a new era: enterprises are rapidly embracing AI technology with tremendous enthusiasm.

Michael: As a startup, our job is to draw inspiration from consumer user signals and translate them into next-generation content production tools that enterprise users can rely on. In recent months, viral content generated with Hedra has drawn widespread attention, from early anime-style characters to 'baby podcasts' to this week's trending topic (I'm not even sure what it is). Memes are a very effective marketing strategy: they reach a large audience quickly and capture mindshare. The approach is becoming increasingly common among startups; another a16z-backed company, Cluely, built significant brand recognition through viral spread on Twitter. The essence of memes is that technology gives people a vehicle to unleash creativity quickly, and short video content now dominates cultural awareness. Hedra's generative video technology lets users turn any idea into content in seconds.

(2) Why Creators and Influencers Choose Hedra

Justine: Can you explain why people use Hedra to create memes, how they use it, and how that relates to your target market?

Michael: Hedra is the first company to deploy full-body expressive, dialogue-driven generative video models at scale. Users have created millions of pieces of content with us, and that rapid adoption comes from filling a critical gap in the content-creation stack. Previously, producing generative podcasts, animated character dialogue scenes, or singing videos was very difficult: too costly, too inflexible, or too time-consuming. Our models are fast and cost-effective, which is what gave rise to virtual influencers.

Justine: Recently, CNBC published an article about Hedra-driven virtual influencers. Can you give a few specific examples of how influencers use Hedra?

Michael: For example, the actor Jon Lajoie (who played Taco in The League) used Hedra to create a series of content from the 'Moses Podcast' to the 'Baby Podcast', and those characters now have unique identities. Another example is Neural Viz, which built a character-identity-centric 'metaverse' on Hedra. Generative performance differs from a pure media model; it requires injecting personality, consistency, and control into the model, which matters especially for video performance. That's why the distinct personalities of these virtual characters are catching on, even though they are not real people.

(3) Virtual Influencers and Digital Avatars

Matt: I've seen many Hedra videos on Instagram Reels, featuring completely new characters like aliens from the Neural Viz series—previously only achievable by Hollywood blockbusters—and real people using these tools to expand their digital presence. Many influencers or content creators don’t want to spend time meticulously dressing up, adjusting lighting, or putting on makeup. Hedra allows groups like 'mom bloggers' to quickly generate videos to convey messages without spending excessive time on preparation. For example, they can directly generate content with Hedra that talks to the camera.

Michael: This is an important observation. Maintaining a personal brand is crucial for content creators, but staying online 24/7 is very difficult. If creators pause updates for a week, they may lose followers. Hedra's automation technology significantly lowers the barriers to creation. Users combine tools like Deep Research to generate scripts, then use Hedra to produce audio and video content, automatically posting to their channels. We see an increasing number of workflows around autonomous digital identities, serving not only real people but also entirely fictional characters.
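To make this workflow concrete, here is a minimal sketch of an autonomous digital-identity pipeline: draft a script, render a character video, post it to a channel. The endpoint URL, field names, and credential handling are illustrative assumptions, not Hedra's actual API; a real integration would follow Hedra's published documentation.

```python
# Hypothetical sketch of an autonomous digital-identity pipeline:
# script generation -> character video rendering -> channel posting.
# Endpoints and field names are illustrative assumptions, not Hedra's API.
import os
import time
import requests

API_URL = "https://api.example.com/v1"                 # placeholder URL
API_KEY = os.environ.get("HEDRA_API_KEY", "demo-key")  # hypothetical credential

def generate_script(topic: str) -> str:
    """Stand-in for the LLM / Deep Research step that drafts the script."""
    return f"Today we're talking about {topic}. Here are three things to know..."

def render_character_video(script: str, character_id: str) -> str:
    """Submit a render job and poll until the clip is ready (hypothetical API)."""
    headers = {"Authorization": f"Bearer {API_KEY}"}
    job = requests.post(
        f"{API_URL}/videos",
        json={"character_id": character_id, "script": script},
        headers=headers,
        timeout=30,
    ).json()
    while True:
        status = requests.get(
            f"{API_URL}/videos/{job['id']}", headers=headers, timeout=30
        ).json()
        if status["state"] == "complete":
            return status["video_url"]
        time.sleep(5)  # poll interval

def post_to_channel(video_url: str) -> None:
    """Stand-in for the platform upload step (YouTube, TikTok, etc.)."""
    print(f"Posting {video_url} to channel...")

if __name__ == "__main__":
    script = generate_script("this week's AI news")
    url = render_character_video(script, character_id="my-virtual-host")
    post_to_channel(url)
```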

(4) The Potential and Challenges of Interactive Video

Justine: Many historical videos are trending on Reels now. In the past, we gained knowledge by reading history books, but that can be a bit dry. If we could narrate history through characters and showcase generative video scenes, the experience would be much more engaging.

Michael: Although we are not directly targeting the education sector, many educational companies build applications on our API. Video interaction drives far more engagement than text. We recently launched a real-time interactive video model, the first product to deliver a low-latency audio-and-video experience. From language learning to personal-improvement apps, once the technology gets cheap enough it will completely change how users interact with large language models (LLMs). My personal favorite use case is 'chatting with your favorite book or movie character.' For example, you can ask, 'Why did you walk into that dark room knowing there was a killer?' This is richer than a traditional audiobook because users can ask questions and revisit content, making it far more vivid.
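To make the 'chat with a character' pattern concrete, here is a minimal sketch of the interaction loop. The avatar endpoint, payload shape, and function names are assumptions for illustration only, not Hedra's actual API; the LLM step is stubbed out.

```python
# Hypothetical sketch of a "talk to a book character" loop.
# The avatar endpoint and payload are illustrative assumptions,
# not a real interface.
import requests

AVATAR_API = "https://api.example.com/v1/realtime-avatar"  # placeholder URL

def character_reply(persona: str, question: str) -> str:
    """Stand-in for a persona-conditioned LLM call; a real implementation
    would send `persona` as the system prompt."""
    return "(in character) You ask why I walked into that dark room? I had to know."

def render_avatar_reply(text: str) -> bytes:
    """Hypothetical call that turns the reply into a talking-head video clip;
    a real low-latency system would stream frames instead."""
    resp = requests.post(AVATAR_API, json={"text": text}, timeout=30)
    return resp.content

persona = "A character from a mystery novel. Stay in character; keep replies short."
while True:
    question = input("You: ")
    if question.lower() in {"quit", "exit"}:
        break
    reply = character_reply(persona, question)
    print(f"Character: {reply}")
    video_bytes = render_avatar_reply(reply)  # display/stream in a real client
```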

Justine: The search space for video models is vast. Single-frame image generation is already complex, and generating 120 frames of continuous video is harder still. Hedra focuses on a unique and meaningful problem, distinct from other video models. How do you define that problem, and where did the inspiration come from?

Michael: That's a great question. We see specialization emerging at the foundation-model layer: Claude has become the benchmark for coding models, OpenAI provides a general assistant, and Gemini serves enterprise scenarios on cost-effectiveness and speed. Hedra occupies a similar position among video models. Our foundation model performs very well, especially the next generation, and gives creators great flexibility. But we focus more on making content 'come alive', so that users want to interact with it and feel a consistent personality and appeal. The core question is how to combine the intelligence of the characters in a video with the rendering experience. My vision is two-way communication between users and characters with programmable, unique personalities. That requires vertical integration: optimizing the core model while rethinking the future experience of user interaction.

(5) 'Character-Centric' Video Models and Subject Control

Michael: I come from a theater background; I'm not a professional actor, but I'm passionate about character performance. Video is central to our daily interactions, and whether in advertisements, online courses, or Hedra-driven faceless channels, the sense of connection is crucial. By lowering creation barriers and speeding up the process, we let ordinary users generate content easily. Over time, the boundary between model intelligence and rendering will blur, and users will converse with systems that understand their intent. We treat characters, not just videos, as the core unit of control. That requires gathering user feedback, optimizing character realism and expressiveness, and providing control levers for multiple subjects.

Matt: I've spent a lot of time creating characters for different videos, and Hedra's strength is its integrated character-creation tooling. You can create or upload character images, save them for later use, change contexts, or clone voices. Many of my YouTube video and tutorial intros use a voice I cloned with Hedra. That integrated experience is particularly valuable in a fragmented generative media market.

(6) Building an Integrated Generative Media Platform

Justine: Many companies like Black Forest Labs have made technological breakthroughs, but still need partners like Hedra to deliver experiences to consumers and enterprise users. How do you decide to build an integrated platform rather than being limited to a single technology?

Michael: It comes down to focus and user needs. When I founded Hedra, I found it very difficult to integrate dialogue into media. Users making short videos had to overlay lip sync on top, and the result lacked cohesion. Our technical insight was to unify signals like breathing, gestures, and dialogue into a single, more natural video model. From a market perspective, we observed big differences in willingness to pay across applications: some popular applications monetize poorly, while certain niches (like content creators) have strong demand for high-quality experiences. We choose to integrate the best technologies, whether from Hedra or from partners like ElevenLabs, to make sure users get the best experience.
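One way to picture 'unifying signals' is a single conditioning bundle that one model consumes in a single pass, instead of a video pass followed by a separate lip-sync pass. The sketch below is purely conceptual; the field names are my assumptions, not Hedra's actual model interface.

```python
# Conceptual sketch: one conditioning bundle for a unified performance model,
# versus separate video-generation and lip-sync passes. All field names are
# illustrative assumptions.
from dataclasses import dataclass

@dataclass
class PerformanceConditioning:
    dialogue_audio: bytes           # driving speech waveform
    character_image: bytes          # reference identity frame
    gesture_intensity: float = 0.5  # 0.0 (still) to 1.0 (highly animated)
    breathing: bool = True          # idle motion between lines
    emotion: str = "neutral"        # coarse style control

def generate_performance(cond: PerformanceConditioning) -> bytes:
    """A unified model would render expression, gesture, and lip motion
    from this single bundle in one pass (placeholder, illustrative only)."""
    raise NotImplementedError
```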

Matt: In the future, will AI characters generate text, scripts, voices, and visuals from a single model?

Michael: I believe the industry is moving toward a multimodal input-output paradigm. The challenge with a single model is control: users need to precisely adjust details like voice, tone, or rhythm. Decoupling the inputs provides more control, but the future may trend toward omni-modal models where users adjust each modality through guiding signals.

(7) The Future of Interactive Video

Justine: I'm impressed by Hedra's long-video generation. You can upload a few minutes of audio and generate a character dialogue video, then adjust appearance and voice separately instead of burning resources on one-shot regeneration. That level of control makes me very excited about the future of interactive video.

Michael: The interactive avatar model we just launched excites me. In the future, users will shape video elements as if on a fluid canvas, for example pausing a video and asking the character to deliver a particular line with more sadness. This two-way communication will define the next generation of experiences, and it will arrive soon.
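As a rough illustration, such a 'make this line sadder' edit might be expressed as a localized re-render request over just the affected span, rather than regenerating the whole clip. The request shape below is a hypothetical sketch, not an existing API.

```python
# Hypothetical shape of a localized edit request: re-render only the
# affected span with a new emotion while everything else stays fixed.
# All keys and values are illustrative assumptions.
edit_request = {
    "video_id": "clip-001",                    # previously generated clip
    "span": {"start_s": 12.0, "end_s": 16.5},  # the line to adjust
    "instruction": "deliver this line with more sadness",
    "keep": ["voice", "identity", "camera"],   # attributes to hold constant
}
```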

Matt: Is a true AI actor possible, where users interact with the characters they've created in real time and give them direction?

Michael: It's absolutely possible. But the current bottleneck isn't video models; it's the personality realism of large language models. Existing AI companions (like Character AI) still show obvious model artifacts. Truly interactive digital characters will require more research into configurable personalities.

(8) Hedra's Audio Generation and AI-Native Applications

Justine: Hedra's videos are stunning, but the audio sometimes falls short. ElevenLabs' latest model has improved audio quality, but the content's appeal still needs work.

Michael: Audio generation is an under-explored field. Today, generative speech is mostly used for narration or dubbing; generating a natural conversation in a noisy cafe remains challenging. We need audio models that can control ambient noise and multi-turn dialogue to make video creation feel natural. Video AI is still in its early stages, much as early CGI once looked realistic and now looks cartoonish. Our first-generation models amazed me at the time, but they look rough now. Getting to a highly controllable, cost-effective, real-time model still takes real effort.
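To illustrate what 'controlling ambient noise and multi-turn dialogue' could look like, here is a hypothetical scene-audio spec with ambience and speaker turns as explicit parameters. The keys are assumptions about what such controls might expose, not an existing interface.

```python
# Illustrative spec for controllable multi-speaker scene audio.
# All keys and presets are assumptions, not a real API.
scene_audio = {
    "ambience": {"preset": "busy_cafe", "level_db": -22},  # background bed
    "turns": [
        {"speaker": "host", "text": "So, what got you into robotics?"},
        {"speaker": "guest", "text": "Honestly? A broken toaster.",
         "style": "amused"},
    ],
    "overlap": "natural",  # allow slight interruptions between turns
}
```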

Matt: Would users prefer to interact with real humans, lifelike characters, or cartoon characters?

Michael: We have generated plenty of fluffy-ball and cat characters. Hedra's unified model handles all kinds of characters, whether rocks or robots, letting users experiment freely and create content that didn't exist before. We built a unified model instead of traditional video plus lip sync precisely so the technology wouldn't limit users. They can try a 'talking rock' or a 'podcast with robots and humans', and the model handles dialogue and personality automatically. That flexibility unlocks revolutionary consumer scenarios.

Justine: The crossover of AI is exciting. Consumers create content like 'baby podcasts', which then inspires enterprise applications. I was surprised to see the Hedra-generated baby advertisement promoting enterprise software in Forbes. It shows that enterprises are embracing AI quickly, and we need to translate consumer signals into enterprise-level solutions.

Michael: Enterprise is our fastest-growing segment. Generative AI has shortened content creation from weeks to real time. Automated news anchors, for example, are changing how information is disseminated: local news disappeared because of high costs, but now one person can run a news channel. This 'medium-scale personalization' serves specific groups, such as precisely targeted advertising for local restaurants or theme parks, and can be more effective than the hyper-personalized Google-style model.

(9) The Founder’s Journey: Challenges, Passions, and Collaborative Innovation

Justine: What has your experience been as a founder? What are the challenges and gains?

Michael: In San Francisco, founder life is often romanticized as a journey of building epoch-making technology. I come from a small town in Florida and never imagined I would take this path. But 99% of being a founder is tough. You have to keep pushing; the problems never let up, from invisible development work to floods of support emails. It's physically exhausting, but the inner satisfaction is unparalleled. I love my users and my team; I can't imagine doing anything else. It's 'Type 2 fun', like climbing a snowy mountain: you bruise your hands and feet, yet you want to go back after reaching the summit. I get to the office at 7:30 AM and leave at 10 PM, sometimes discussing features at 2 AM. It means sacrificing the boundary between work and life, but passion keeps me going.

Matt: Why do you still program personally? Is it to express creativity or to communicate with the team?

Michael: Both. Prototypes help me quickly validate ideas and clearly communicate expectations. As a leader, clear communication is crucial. I discuss edge cases with designers to ensure the system is scalable. Programming keeps me connected with the team, understanding their challenges while quickly exploring product directions.