How Elman Mansimov Created the First Text-to-Image Artwork

AI scientist Elman Mansimov has long been ahead of his time. He started college at 15, and in 2015, while studying at the University of Toronto, he led a team that developed the first text-to-image generative model based on deep learning. That work laid the groundwork for text-to-image tools such as Midjourney and DALL-E 3, as well as for Mansimov's own NFT series, alignDRAW, launched in 2023.
The 32×32 pixel artworks in the alignDRAW series may not look as refined as today's AI-generated NFTs, but what they lack in detail they make up for in originality. These early alignDRAW works remain the first genuine experiments in AI-generated art, or at least the earliest documented ones.
Here is more information about Mansimov, the origins of alignDRAW, and how it entered the NFT world years later.
An airplane flying into the distance at night #78
Elman Mansimov and an Introduction to Text-to-Image Art
Today, Elman Mansimov is a senior applied scientist at Amazon Web Services (AWS) in New York, focusing on foundation model research for Amazon Bedrock, Amazon's generative AI service suite.
His journey in computer science began at the University of Toronto, where he earned his undergraduate degree under the guidance of leading AI expert Ruslan Salakhutdinov. That early experience led him to pursue a PhD at New York University under renowned deep learning researcher Kyunghyun Cho.
Mansimov's research pushes the limits of machine learning and natural language processing. He developed an approach based on iterative refinement to improve complex prediction tasks. His work spans machine translation (converting text between languages) and molecular generation, and his contributions to text-to-image models demonstrated that neural networks can create images from simple text descriptions.
A yellow school bus is passing through a green lawn #89
Creation of alignDRAW
In 2015, Elman Mansimov led a team to develop alignDRAW, a text-to-image generative model that marked a critical shift in human-machine collaboration in art. The model was introduced in an academic paper titled 'Generating Images from Captions with Attention,' which laid the foundation for future advances in AI and creative technologies.
alignDRAW represented a breakthrough: a new way of combining technology with artistic expression that let prompt writers create scenes rather than simply search for existing images on Google. Years later, Vox featured Mansimov's early works in a commentary video, presenting them as the dawn of generative AI art.
alignDRAW's early innovations have had an impact beyond academia: art generated by the model was shown at Paris Photo 2023, one of the world's premier photography fairs, held annually in France.
In February 2024, the Worcester Art Museum, a historic Massachusetts institution known for its rich art collection, acquired three alignDRAW works, adding to their significance and raising the profile of AI-generated art in established cultural spaces.
The original image-generation project was built on 36 prompts.
Mansimov himself was a prodigy. He started college at 15, and at 19 he led the team that developed the first text-to-image generative model, which produced 2,709 images in 2015, a rich portfolio for a medium that was only beginning to emerge.
One of the most striking images is a 32×32 pixel depiction of a group of elephants flying in a blue sky, an early achievement of generative imaging in both artistic and technical terms.
These more than 2,000 images are all that remains of the project. They were generated from 36 text prompts, including 'A yellow school bus parked in a parking lot' (shown above, in pixelated brown, yellow, and white), along with prompts such as 'A toilet seat opened on the grass,' 'A group of happy elephants,' and 'A person skiing on a snow-covered majestic peak.'
Mansimov led this unprecedented attempt to test whether such generative models could produce new images from prompts rather than merely memorizing and reproducing the data they were trained on.
As Mansimov stated in his conversation with the art platform Fellowship, 'I realized that there was skepticism about whether our model could understand and generate objects based on titles (rather than just memorizing the dataset). My goal was to use strong, representative images in the paper to overturn this view.'
A group of happy elephants on a green lawn #56
Introducing alignDRAW to the NFT World
alignDRAW was originally just an experiment, but its influence eventually spread beyond the lab. In April 2023, Alejandro Cartagena, co-founder of the art platform Fellowship, contacted Mansimov with an unexpected proposal: to bring alignDRAW into the NFT realm.
This marked the beginning of a new chapter, as Mansimov's academic innovations found a home in the rapidly evolving digital art field.
Mansimov and his team had demonstrated that the model could generate new images from text prompts, a historically significant moment in the development of visual arts and technology. The image set was divided into two categories: 'Paper Prompts' and 'Process Prompts.'
'Paper Prompts' comprises 168 unique images produced from 21 different prompts, such as 'A green school bus parked in a parking lot.' In contrast, 'Process Prompts' comprises 2,541 images produced from both unique and repeated prompts, with each prompt yielding a set of 121 images.
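The image counts quoted here can be cross-checked with a little arithmetic. The Python sketch below uses only figures stated in this article; the variable names are illustrative, and the number of 121-image sets in 'Process Prompts' is an inference from dividing 2,541 by 121, not a figure given in the source.

```python
# Figures quoted in the article; variable names are illustrative.
paper_images = 168             # images in 'Paper Prompts'
paper_prompts = 21             # distinct prompts in 'Paper Prompts'
process_images = 2_541         # images in 'Process Prompts'
images_per_process_set = 121   # images produced per 'Process Prompts' set

# Images per 'Paper Prompts' prompt, plus any remainder.
images_per_paper_prompt, paper_remainder = divmod(paper_images, paper_prompts)

# Number of 121-image sets in 'Process Prompts'.
# Assumption: this count is inferred here, not stated in the article.
process_sets, process_remainder = divmod(process_images, images_per_process_set)

# Combined size of the two categories.
total_images = paper_images + process_images

print(images_per_paper_prompt, process_sets, total_images)  # prints: 8 21 2709
```

Both divisions come out even, and the combined total matches the 2,709 images cited for the 2015 project.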
After developing the generative model, Mansimov pursued a PhD in computer science at New York University (NYU). The alignDRAW model became a cornerstone for later text-to-image modeling and a highlight on Mansimov's growing list of academic and research achievements. It remained largely an academic milestone until April 2023, when Fellowship co-founder Alejandro Cartagena brought it back to Mansimov's attention and gave it wider prominence.
In a Twitter DM that Cartagena shared on his blog, he asked Mansimov whether he would be interested in a joint project with Fellowship to bring alignDRAW into the NFT world.
As Mansimov explained in a blog post on his personal website, 'Next was a wonderful journey marked by the Paris Photo exhibition, a physical showcase of Verse, and a brilliant interview with the photography director at Christie's, culminating in a successful auction and sell-out on the blockchain.'
The NFT series was minted on Fellowship in December 2023 and auctioned through Christie's. NFTs in the 'Paper Prompts' category were minted and sold in groups by text prompt, with eight individual NFTs in each group. In the 'Process Prompts' category, 2,015 editions were sold publicly, with the remaining NFTs gifted to a select number of family members, friends, and collectors.
Mansimov did not anticipate that alignDRAW would become a mainstream web3 success in 2023. According to his blog, until recently his original generative model paper had been recognized only through academic citations.
However, with the rise of text-to-image art, his work and papers are increasingly regarded as cornerstones of the growing AI text-to-image and video movement.
Human-Machine Collaboration in Art
The creation of alignDRAW and similar text-to-image models opened new dimensions for human-machine collaboration in art. Elman Mansimov's pioneering work on alignDRAW and text-to-image art is widely recognized as an important milestone in the fusion of art and technology. His research expands the application of machine learning and natural language processing and opens new paths for human-machine collaboration in creative fields.
Looking ahead, the convergence of art and technology is likely to remain fertile ground for innovation and creativity. As foundation models and advanced machine learning algorithms continue to evolve, text-to-image art and related fields hold exciting possibilities.