Hedra founder Michael Lingelbach discusses how generative AI is moving from viral memes to enterprise-level applications, showcasing its potential in virtual influencers and interactive content creation. This article is based on an interview with Michael Lingelbach by Justine Moore and Matt Bornstein of a16z, originally titled 'Why AI Characters & Virtual Influencers Are the Next Frontier in Video ft. Hedra's Michael Lingelbach', organized and compiled by Janna and ChainCatcher.

Michael Lingelbach is the founder and CEO of Hedra. A former computer science PhD student at Stanford University and a stage actor, he combines his passions for technology and performance in leading Hedra's development of industry-leading generative video models. Hedra focuses on full-body, dialogue-driven video generation, with technology supporting applications from virtual influencers to educational content and significantly lowering the barriers to content creation. This article is adapted from the a16z Podcast and focuses on how AI is transitioning from viral meme content to enterprise-level applications, showcasing the innovative potential of generative video technology.

The following is the dialogue content, compiled by ChainCatcher (with edits).

TL;DR

Artificial intelligence is seamlessly bridging consumer and enterprise scenarios: the same technology behind viral 'talking baby' videos is now generating advertisements for enterprise software, highlighting enterprises' enthusiasm for embracing new technology.

Viral meme content has become a powerful tool for startups; formats like the 'Baby Podcast' can quickly boost brand awareness, a clever marketing strategy.

Full-body expression and dialogue-driven video generation technology fills creative gaps, significantly reducing the time and cost of content production.

Creators like Jon Lajoie shape unique digital characters through series like the 'Moses Podcast', giving content distinct personality and appeal.

Content creators like 'mom bloggers' leverage technology to quickly produce videos, easily maintaining brand activity and audience connection.

The interactive video model opens up two-way dialogue with virtual characters, providing immersive experiences for education and entertainment.

Character-centric video generation technology focuses on personality expression and multi-agent control, meeting the demand for dynamic content creation.

A platform strategy that integrates dialogue, motion, and rendering creates a smooth generative media experience and meets the demand for high-quality content.

The interactive avatar model supports dynamic adjustment of a video's emotions and elements, heralding the next wave of innovation in content creation.

(1) From Memes to Enterprise Applications: The Fusion of AI

Justine: We find the crossover of AI between consumer and enterprise scenarios very interesting. A few days ago I saw an ad generated with Hedra in Forbes, featuring a talking baby promoting enterprise software. It shows we're in a new era in which enterprises are embracing AI rapidly and enthusiastically.

Michael: As a startup, our responsibility is to draw inspiration from the usage signals of consumer users and turn them into next-generation content production tools that enterprise users can rely on. In recent months, viral content generated with Hedra has drawn widespread attention, from early anime-style characters to 'Baby Podcasts' to this week's trending topic—I'm not even sure what it is. Memes are a very effective marketing strategy, reaching a large audience and quickly capturing mindshare, and the approach is becoming increasingly common among startups. For instance, Cluely, another a16z-backed company, gained significant brand recognition through viral dissemination on Twitter. The essence of memes is that technology empowers people to unleash their creativity quickly; short video content has come to dominate cultural consciousness, and Hedra's generative video technology lets users turn any idea into content within seconds.

(2) Why Creators and Influencers Choose Hedra

Justine: Can you explain why people use Hedra to create memes, how they use it, and what the connection is to your target market?

Michael: Hedra is the first company to deploy full-body expressive, dialogue-driven generative video models at scale. We have supported users in creating millions of pieces of content, and we rose to popularity quickly because we filled a critical gap in the content creation stack. Previously, producing generative podcasts, animated character dialogue scenes, or singing videos was very difficult—too costly, too inflexible, or too slow. Our model is fast and cost-effective, which has fueled the rise of virtual influencers.

Justine: Recently, CNBC published an article about virtual influencers powered by Hedra. Can you give a few specific examples of how influencers use Hedra?

Michael: For example, actor Jon Lajoie (who played Taco in The League) has used Hedra to create a range of content, from the 'Moses Podcast' to the 'Baby Podcast', and these characters now have identities of their own. Another example is Neural Viz, which built a character-centric 'metaverse' on top of Hedra. Generative performance differs from simple media models: it requires injecting personality, consistency, and control into the model, which is particularly important for video. That's why we see these virtual characters' distinct personalities becoming popular, even though they are not real people.

(3) Virtual Influencers and Digital Avatars

Matt: I see many Hedra videos on Instagram Reels, featuring both newly created characters—like the aliens in the Neural Viz series, previously achievable only with a Hollywood budget—and real people using these tools to expand their digital presence. Many influencers and content creators don't want to meticulously dress up, adjust lighting, or do makeup every time. Hedra lets groups like 'mom bloggers' quickly generate talking-to-camera videos and get their message out without spending a lot of time preparing.

Michael: This is a very important observation. Maintaining a personal brand is crucial for content creators, but staying online 24/7 is very challenging. If a creator pauses updates for a week, they might lose fans. Hedra's automation technology significantly lowers the barriers to creation. Users combine tools like Deep Research to generate scripts, then use Hedra to create audio-visual content and automatically publish it to their channels. We see an increasing number of workflows around autonomous digital identities, serving not only real people but also completely fictional characters.
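To make this workflow concrete, here is a minimal sketch of the automation loop Michael describes—generate a script, render it as a character video, then hand it off for publishing. Everything Hedra-specific (the base URL, endpoint paths, and payload fields) is a hypothetical placeholder, not Hedra's actual API, and the script step stands in for a Deep Research-style tool.

```python
# Hypothetical sketch of an automated creator workflow: script -> render -> publish.
# Endpoint paths and payload fields are illustrative assumptions, not Hedra's real API.
import time
import requests

HEDRA_API = "https://api.example-hedra.com/v1"  # placeholder base URL
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}

def generate_script(topic: str) -> str:
    """Stand-in for a research/LLM scripting step (e.g. a Deep Research-style tool)."""
    return f"Today we're talking about {topic}. Here are three things to know..."

def render_character_video(script: str, character_id: str) -> str:
    """Submit a render job and poll until the video is ready (hypothetical endpoints)."""
    job = requests.post(
        f"{HEDRA_API}/videos",
        headers=HEADERS,
        json={"character_id": character_id, "script": script},
        timeout=30,
    ).json()
    while True:
        status = requests.get(
            f"{HEDRA_API}/videos/{job['id']}", headers=HEADERS, timeout=30
        ).json()
        if status["state"] == "complete":
            return status["video_url"]
        time.sleep(5)  # renders are fast but not instant

script = generate_script("budget meal prep for busy parents")
video_url = render_character_video(script, character_id="mom-blogger-avatar")
print("Ready to publish:", video_url)  # hand off to a scheduler or channel uploader
```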

(4) The Potential and Challenges of Interactive Videos

Justine: Many historical videos are trending on Reels now. In the past, we gained knowledge through reading history books, but that can be somewhat dry. If history could be told through characters and showcased in generative video scenes, the experience would be much more engaging.

Michael: While we don't target the education sector directly, many educational companies build applications on our API. Video interaction is far more engaging than text. We recently launched our instant interactive video model, the first product to deliver a low-latency audio-visual experience. From language learning to personal-development applications, once the cost of the technology is low enough, it will fundamentally change how users interact with large language models (LLMs). One of my favorite projects is 'chat with your favorite book or movie character'. For example, you could ask, 'Why did you walk into that dark room knowing there was a murderer?' This interactive experience is richer than a traditional audiobook: users can ask questions and revisit content, and the experience is far more vivid.
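As a rough illustration of the 'chat with a character' pattern, here is a minimal turn-based loop against a hypothetical interactive-session endpoint. The paths, payload fields, and persona prompt are illustrative assumptions; a real low-latency client would stream audio and video rather than print text.

```python
# Minimal sketch of a conversational character session (hypothetical API).
import requests

HEDRA_API = "https://api.example-hedra.com/v1"  # placeholder
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}

persona = (
    "You are the detective from the novel. Stay in character, answer in "
    "first person, and never break the fourth wall."
)

# Open a persistent session so each turn only pays per-utterance latency.
session = requests.post(
    f"{HEDRA_API}/interactive-sessions",
    headers=HEADERS,
    json={"character_id": "detective-avatar", "persona": persona},
    timeout=30,
).json()

while True:
    question = input("You: ")
    if not question:
        break
    turn = requests.post(
        f"{HEDRA_API}/interactive-sessions/{session['id']}/turns",
        headers=HEADERS,
        json={"text": question},
        timeout=30,
    ).json()
    # A production client would play a streamed audio/video track here.
    print("Character:", turn["reply_text"])
```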

Justine: The search space for video models is enormous. Generating a single frame is already complex; generating 120 frames of continuous video is far harder. Hedra focuses on a unique and meaningful problem, which differentiates it from other video models. How do you define that problem, and where did the inspiration come from?

Michael: That's a great question. We're seeing specialization emerge at the foundation model layer—much like Claude becoming the benchmark for coding models, OpenAI providing general assistants, and Gemini serving enterprise scenarios on cost-effectiveness and speed. Hedra has a similar positioning in the video model space. Our foundation model performs very well, especially the next generation, and provides tremendous flexibility for content creation. But we are more focused on making content 'come alive', letting users interact with it and feel a consistent personality and appeal. The core question is how to combine the intelligence of the characters in a video with the rendering experience. My vision is for users to communicate bidirectionally with characters in videos, where each character has a programmable, unique personality. That requires vertical integration: not only optimizing the core model but also rethinking the future experience of user interaction.

(5) Character-Centric Video Models and Subject Control

Michael: I come from a theater background; although I'm not a professional actor, I have a passion for character performance. Video is central to our daily interactions—whether in advertising, online courses, or Hedra-driven faceless channels, the sense of connection is crucial. We enable ordinary users to generate content easily by lowering the barriers to creation and speeding up the process. In the future, the boundary between a model's intelligence and its rendering will gradually blur, and users will interact with systems that understand their intent. We view characters, not just videos, as the core unit of control. That requires gathering user feedback to optimize characters' authenticity and expressiveness while providing control levers for multiple agents.

Matt: I've spent a lot of time creating characters for different videos, and Hedra's power lies in its integrated character-creation tools. You can create or upload character images, save them for later use, and even change contexts or clone voices. Many of the openings in my YouTube videos and tutorials use a clone of my voice created with Hedra. This integrated experience is particularly valuable in a fragmented generative media market.

(6) Building an Integrated Generative Media Platform

Justine: Many companies, like Black Forest Labs, have made technological breakthroughs but still need partners like Hedra to deliver experiences to consumers and enterprise users. How did you decide to build an integrated platform rather than limiting yourself to a specific piece of technology?

Michael: It comes down to focus and user needs. When I founded Hedra, I found it very difficult to incorporate dialogue into media. Previously, users had to overlay lip-sync to create short videos, which lacked a sense of wholeness. Our technical insight was to unify signals like breathing and gestures with dialogue to create a more natural video model. From a market perspective, we observed differences in users' willingness to pay across applications. Some popular applications have low willingness to pay, but certain niches (like content creators) have strong demand for high-quality experiences. We choose to integrate the best technologies, whether from Hedra or from partners like ElevenLabs, to ensure users get the best experience.
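As an illustration of this integration pattern—best-of-breed audio feeding a video render—here is a small sketch that calls ElevenLabs' public text-to-speech REST endpoint and hands the audio to a video step. The ElevenLabs call follows its documented API; the Hedra-side endpoint, fields, and helper names are hypothetical placeholders.

```python
# Sketch: synthesize speech with ElevenLabs, then drive a character render with it.
import requests

ELEVEN_URL = "https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"
HEDRA_API = "https://api.example-hedra.com/v1"  # placeholder, not the real API

def synthesize_speech(text: str, voice_id: str, xi_api_key: str) -> bytes:
    """Generate narration audio via ElevenLabs' documented TTS endpoint."""
    resp = requests.post(
        ELEVEN_URL.format(voice_id=voice_id),
        headers={"xi-api-key": xi_api_key},
        json={"text": text, "model_id": "eleven_multilingual_v2"},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.content  # audio bytes (mp3 by default)

def render_video_from_audio(audio: bytes, character_id: str, hedra_key: str) -> dict:
    """Drive a character render from pre-generated audio (hypothetical endpoint)."""
    return requests.post(
        f"{HEDRA_API}/videos",
        headers={"Authorization": f"Bearer {hedra_key}"},
        files={"audio": ("speech.mp3", audio, "audio/mpeg")},
        data={"character_id": character_id},
        timeout=60,
    ).json()
```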

Matt: In the future, will AI characters generate text, scripts, voices, and visuals from a single model?

Michael: I believe the industry is moving towards a multimodal input-output paradigm. The challenge with single models lies in control. Users need to precisely adjust details such as voice, tone, or rhythm. Decoupling inputs can provide more control, but in the future, we may trend towards fully multimodal models where users can adjust the fit of each modality through guiding signals.

(7) The Future of Interactive Videos

Justine: I'm impressed by Hedra's long-video generation capability. You can upload a few minutes of audio, generate a character dialogue video, and adjust the visuals and voice separately, avoiding the wasted compute of regenerating everything in one shot. This level of control makes me excited about the future of interactive video.

Michael: The interactive avatar model we just launched excites me. In the future, users will be able to shape video elements like on a fluid canvas, for instance, pausing the video and asking a character to sound sadder in a specific line. This kind of bidirectional communication will bring about the next generation of experiences, and it will be realized soon.
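One way to picture that 'fluid canvas' is as a targeted edit call: re-perform a single line with a new emotion while keeping everything else fixed. The sketch below is purely speculative—the endpoint and every field name are invented for illustration.

```python
# Speculative sketch of a targeted emotional edit on an existing render.
import requests

HEDRA_API = "https://api.example-hedra.com/v1"  # placeholder
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}

edit = {
    "target": {"line_index": 4},                  # the specific line to re-perform
    "direction": {"emotion": "sad", "intensity": 0.7},
    "keep": ["voice", "framing", "background"],   # leave everything else untouched
}
patched = requests.post(
    f"{HEDRA_API}/videos/vid_123/edits", headers=HEADERS, json=edit, timeout=60
).json()
print(patched.get("video_url"))  # the re-rendered clip with only that line changed
```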

Matt: Is a true AI actor possible—one that users interact with in real time and direct with instructions?

Michael: Absolutely. But the current bottleneck is not the video model; it's the authenticity of personality in large language models. Existing AI companions (like Character AI) still carry obvious traces of the underlying model. Achieving truly interactive digital characters will require more research into configurable personalities.

(8) Hedra's Audio Generation and AI Native Applications

Justine: Hedra's videos are stunning, but the audio sometimes falls short. ElevenLabs' latest models have improved audio quality, but the appeal of the content still needs work.

Michael: Audio generation is an underexplored field. Currently, generative speech is mainly used for narration or voiceovers, but generating natural dialogue in settings like a noisy café remains challenging. We need audio models that can control environmental sound and multi-turn dialogue to make video creation more natural. Video AI is still in its early stages—just as early CGI effects once looked realistic but now look cartoonish, our first-generation model amazed me at the time but seems rough today. Achieving highly controllable, cost-effective, real-time performance models still requires effort.

Matt: Do users prefer to interact with real humans, lifelike characters, or cartoon characters?

Michael: We've seen a lot of fluffy-ball and cat characters. Hedra's unified model can handle all kinds of characters, whether rocks or robots, letting users experiment freely and create unprecedented content. We built a unified model, rather than overlaying lip-sync on traditional video, precisely so the technology doesn't limit users. They can try a 'talking rock' or a 'podcast between a robot and a human', and the model automatically handles the dialogue and personality. This flexibility sparks revolutionary consumption scenarios.

Justine: The crossover of AI is exciting. Consumers create content like 'baby podcasts', which then inspires enterprise applications. I saw a Hedra-generated baby advertisement for enterprise software in Forbes, which was surprising. It shows that enterprises are quickly embracing AI, and we need to turn consumer signals into enterprise-grade solutions.

Michael: Enterprise is our fastest-growing segment. Generative AI has shortened content creation from weeks to near-instant. For example, automated news anchors are changing how information is disseminated: local news once disappeared because of high costs, but now one person can run a news channel. This 'medium-scale personalization' serves specific audiences—say, targeted advertising for local restaurants or theme parks—more effectively than hyper-personalized models like Google's.

(9) The Founder’s Journey: Challenges, Passion, and Collaborative Innovation

Justine: As a founder, what has your experience been like? What challenges and rewards have you encountered?

Michael: In San Francisco, the founder's life is often romanticized as a journey of building epoch-making technology. I come from a small town in Florida and never thought I would take this path. But 99% of a founder's time is very difficult. You have to keep pushing, and the problems never let up—from heads-down development to a massive influx of customer-support emails. It's physically exhausting, but the inner satisfaction is unparalleled. I love my users and my team; I can't imagine doing anything else. It's a kind of 'second-order joy'—like climbing a snowy mountain: my hands and feet get banged up, but after reaching the peak I still want to come back. I'm at the office at 7:30 AM and leave at 10 PM, sometimes still discussing features at 2 AM. That means giving up the boundary between work and life, but passion keeps me going.

Matt: Why do you still write code yourself? Is it to express creativity or to communicate with the team?

Michael: Both. Prototypes help me quickly validate ideas and clearly convey expectations. As a leader, clear communication is crucial. I discuss edge cases with designers to ensure the system is scalable. Writing code keeps me connected with the team, understanding their challenges while quickly exploring product directions.