GPT-4o's image generation is natively multimodal: not a simple handoff to DALL·E, but images produced through chains of reasoning combined with world knowledge.
What does this mean? A controversial claim:
The AI derivative tools currently favored by capital (RAG, AI IDEs, workflow tools, agents, scenario-specific products, and so on) may lose all of their value in the face of multimodal AI.
In 2023, face-swapping meant training a LoRA; now GPT-4o can do it with a single sentence.
Blind + deaf ≠ normal person: multimodal resonance is not 1 + 1 = 2, but exponential evolution. In front of truly multidimensional AI, every roundabout workaround will eventually be eliminated.
At this moment, I can only think of Thomas Wade's line from 'The Three-Body Problem':
Advance! Advance! Advance by any means necessary!