Our guest today is Rohan Khadatkar, head of marketing at InstaHeadshots, the company behind popular AI tools like Magic Eraser and Magic Studio. With years of experience at the intersection of product design and artificial intelligence, Rohan has been building AI-powered consumer products since long before the current wave of generative AI innovation.
In this episode, Rohan shares how his team approaches product building in the AI era, focusing on solving real user problems rather than chasing technology trends. We explore what it takes to turn cutting-edge models into reliable, everyday tools, the importance of great prompting, and why human creativity and taste still matter in an increasingly automated world.
I'm excited to share this conversation about how Rohan and his team are redefining what it means to build useful AI products: simple, intuitive, and designed for real impact.
About Rohan: LinkedIn
Magic Studio: Download the App | Instagram
ABOUT ROCKETSHIP HQ: Website | LinkedIn | Newsletter | YouTube | Podcast Website
FULL TRANSCRIPT BELOW
Shamanth Rao:
Rohan, welcome to the show.
Rohan:
Thanks, Shamanth, for having me on the podcast. It feels amazing to be here and talk to you. Thank you for all the great work you've been doing with your "All Your Notes" newsletter.
Shamanth Rao:
Thank you for the kind words. I'm especially thrilled to have you on because you and your team were exploring AI and AI-driven products long before it became mainstream. I'm fascinated by how you approached AI ahead of the curve, and how that led to where you are today.
To start, your team began launching products in 2020, before what we now call the "AI era." How did those early product launches shape your understanding of AI and productizing it before most people had even heard of ChatGPT?
Rohan:
That's a great question. What happened, Shamanth, is that our goal was to build products that genuinely solved user problems, which is the foundation of good product building. That mindset hasn't changed.
Even though there are complex technologies behind the scenes, at the end of the day, it's the product manager's or builder's thinking that simplifies things so the end user can understand and use them easily.
So for us, it was never "tech first, product later." It was always: start with the customer, identify the use case, and then build for that.
Our initial entry point was the Shopify ecosystem, which was booming: a $200 billion small-to-medium business economy. The Shopify app store and community were rapidly growing.
One of our first products was a Shopify price testing app. It addressed a simple but powerful problem: could you extract more revenue from your pricing decisions for e-commerce products? It ran live A/B tests while you slept and helped you find optimal pricing.
Next, we built Hagrid.io, a social Q&A tool for any website, which is still live today.
Then came Master, inspired by Pinterest. Pinterest had all these product images, but only about 4% were actually shoppable. So we tried to make those images directly buyable.
Another breakthrough came from our AI team: Magic Eraser. It could remove unwanted objects from images. When we launched it on Product Hunt, it blew up overnight. We were genuinely surprised by the massive demand for such a simple product that did one thing but did it exceptionally well. It still ranks number one for the "magic eraser" keyword.
We also built a photo booth feature inside Magic Studio, where users could create cartoon or avatar versions of themselves. That eventually inspired InstaHeadshots, a realistic headshot generator.
At the core, it was never about thinking "AI-first." It was always: how do we build something useful for the end user?
Shamanth:
I love that framing: doing one thing and doing it really well, and making it simple enough for any user to understand. It's so easy to get distracted by the new shiny technology instead of solving real problems.
Let's move on to InstaHeadshots, one of your later products. You've mentioned it was profitable from day one. What inspired it, and how did you know it was scalable?
Rohan:
The story of InstaHeadshots starts again with our AI team. They stay deeply plugged into AI research and new technologies. The speed of progress in AI is so fast that it's easy to miss key developments if you're not actively following them.
We were reading some of the early papers on text-to-image generation, around the same time that text-to-text models like ChatGPT were evolving. Having that knowledge and access to the latest models gave us confidence that this technology could be productized.
We already had our photo booth feature in Magic Studio, which inspired us to explore realistic headshots. We noticed existing tools in the space produced low-quality or "uncanny" results. That made us think: maybe we can do this better.
So we quickly put together a prototype: just one landing page, a three-step flow, and a Stripe paywall from our previous product. We launched it on Product Hunt and ran a few Google Ads. On day one, we earned our first $100 in revenue.
That immediate traction validated the idea. Within 30 days, the product was profitable.
Shamanth:
That's incredible. You also mentioned that staying ahead of AI model improvements was crucial. What were some of the technical breakthroughs that made this kind of work possible?
Rohan:
Yes, absolutely. Let me explain this simply.
ChatGPT is text-to-text: you input text and get text back. But AI can now do text-to-image, text-to-voice, and even text-to-music.
In the text-to-image world, one of the biggest breakthroughs was the ability to describe an object and generate a photorealistic image that matched it. Then came another leap: identity preservation.
That meant you could train a model on just 8 to 10 images of a person and have it generate new, realistic photos of that same person in different settings or outfits.
That capability made headshot generation possible. Of course, there were still technical limitations (image resolution was capped at 512×512 pixels), so we had to do a lot of upscaling and post-processing.
But then, in 2024, Flux models arrived. These models were much larger and far more capable, similar to how GPT-5 today is a leap beyond GPT-3. When we integrated Flux into our workflow, our refund rate dropped by 50% overnight. The model quality alone transformed the business metrics.
Today, our focus is on pre-processing: making sure user identity data is handled correctly and giving customers more control over outfit, hairstyle, background, even brand colors.
The next frontier is eliminating model fine-tuning altogether, making identity generation instant and cost-free, which is where research like nano models is heading.
Shamanth:
That's a fantastic overview of the evolution of image generation. What's really striking is how important the concept of identity is: how the model understands who it's generating.
And this leads to another thought: model wrappers often get a bad reputation, like "oh, it's just a GPT wrapper." But as you've described, building something reliable on top of these models is far from simple.
So when it comes to productizing these models, how does your team approach making them truly usable for end users?
Rohan:
Great question. Here's the analogy we use internally:
Think of AI models as roads: they're the infrastructure. But users don't want to build their own cars just to use the road. We build the vehicles, the apps, that take them from A to B efficiently.
You can build a bicycle for short trips (a lightweight consumer app), or a bus (an enterprise product). But roads, or foundational models, are massive, expensive, and need to be fully built to be useful.
Our focus is on the vehicles. We work backwards from the user's problem and decide which model, or combination of models, can solve it effectively.
For example, when customers asked, "Can I generate a photo wearing my company's T-shirt or with a specific logo?", we didn't change the core model. We improved our workflow around it to make that request possible.
That's how we think about productization: use the foundation but innovate on the experience.
Shamanth:
I love that analogy: not everyone should build roads, but everyone should focus on building the right vehicle for the right user journey.
You also mentioned having to reject low-quality user uploads in the early days. Why was that so important, and what changed as the models improved?
Rohan:
Yes, that was a crucial part of building trust.
We learned early on that bad inputs produce bad outputs, but you can't just tell users that after they've uploaded a photo. If the result is poor, they'll blame the product, not their photo.
So, we put strict constraints in place: only front-facing images, no hats or glasses, one person per image, and a minimum of 12 photos.
It was tough, and yes, one ongoing joke was that men never have enough good photos to upload. But we needed those rules to ensure quality outputs and maintain that "wow factor."
Over time, as models improved, we relaxed those restrictions. Today, users can upload far fewer images and even preview their headshots before paying.
That balance between trust and accessibility has been key.
Shamanth:
That makes perfect sense. Sometimes you have to protect users from themselves to maintain product quality, even if it slightly hurts conversion rates.
Now, there are so many models available today. How does your team decide which ones to use and test for different use cases?
Rohan:
Our AI team constantly experiments. We read new research papers, test early-access models, and design our products modularly, so one part of a workflow can use one model, and another part can use a different one.
This flexibility lets us combine the best strengths of multiple models rather than relying on a single system.
Shamanth:
That makes sense. Now, much like "GPT wrappers," another term that often gets a bad rap is prompting or prompt engineering. But it's incredibly important.
How did your team realize the impact of the right or wrong prompt?
Rohan:
Honestly, prompting deserves more respect. It's simply the language we use to communicate with models.
With text-to-text systems like ChatGPT, you can be a bit lazy: vague prompts still give good results. But in text-to-image models, specificity is everything.
For example, we once used the prompt "navy blazer." The model kept generating navy military jackets instead of business blazers. Changing it to "navy blue blazer" fixed it instantly.
So prompt precision matters. It's how you teach the model exactly what you mean.
Testing and iterating on prompts is a big part of our workflow. That's where a lot of the invisible engineering effort goes: not just the model itself, but how we talk to it.
Shamanth:
I completely agree. In my own experience, tweaking the system prompt often made the biggest difference, more than any code changes.
Now, not everyone on a team is technical. How do you ensure that everyone in your organization uses AI effectively in their daily work?
Rohan:
We encourage constant exploration. Whenever a new tool launches, someone on our team signs up and tries it.
For engineers, that might be coding assistants. For marketers, AI ad tools. For me, it's content and analytics.
But the key is what we call the "aha moment." That's when a user experiences a clear, immediate benefit from AI: something that saves time or produces unexpectedly great results.
For a developer, it might be solving a complex syntax error instantly. For a marketer, it could be AI-generated ad copy that performs well. Once someone gets that aha moment, adoption becomes natural.
Shamanth:
That's so true. And finally, on the marketing side: even though your company is AI-first, you still rely heavily on human context and editing for messaging and copy. Why is that?
Rohan:
Because context is everything.
AI can generate text, but it doesn't always understand where or how that text will appear. A line that sounds great on a notepad might not fit a landing page or email.
Our founder, for example, will often edit copy directly on the staging site, adjusting it in real time to see how it looks in context.
That human touch (understanding tone, placement, and emotion) is what keeps the brand authentic.
Even with all the AI advancements, taste and context are still very human strengths.
Shamanth:
Absolutely. And that's a great note to end on. Context and taste: those are still uniquely human skills. Maybe one day AI will catch up, but for now, they remain our advantage.
Rohan:
Thank you so much, Shamanth. This was a great conversation, and I truly enjoyed sharing our journey and insights with you.
Shamanth:
Thank you, Rohan. It's been a pleasure having you on the show.
