✨ This team started with pre-GPT image models to launch 3 breakout apps: with Rohan Khadatkar, Head of Marketing at InstaHeadshots 🚀

Our guest today is Rohan Khadatkar, Head of Marketing at InstaHeadshots, the company behind popular AI tools like Magic Eraser and Magic Studio. With years of experience at the intersection of product design and artificial intelligence, Rohan has been building AI-powered consumer products since long before the current wave of generative AI innovation.

In this episode, Rohan shares how his team approaches product building in the AI era — focusing on solving real user problems rather than chasing technology trends. We explore what it takes to turn cutting-edge models into reliable, everyday tools, the importance of great prompting, and why human creativity and taste still matter in an increasingly automated world.

I’m excited to share this conversation about how Rohan and his team are redefining what it means to build useful AI products — simple, intuitive, and designed for real impact.

FULL TRANSCRIPT BELOW

Shamanth Rao:
Rohan, welcome to the show.

Rohan:
Thanks, Shamanth, for having me on the podcast. It feels amazing to be here and talk to you. Thank you for all the great work you’ve been doing with your “All Your Notes” newsletter.

Shamanth Rao:
Thank you for the kind words. I’m especially thrilled to have you on because you and your team were exploring AI and AI-driven products long before it became mainstream. I’m fascinated by how you approached AI ahead of the curve — and how that led to where you are today.

To start, your team began launching products in 2020 — before what we now call the “AI era.” How did those early product launches shape your understanding of AI and productizing it before most people had even heard of ChatGPT?

Rohan:
That’s a great question. From the start, Shamanth, our goal was to build products that genuinely solved user problems, which is the foundation of good product building. That mindset hasn’t changed.

Even though there are complex technologies behind the scenes, at the end of the day, it’s the product manager’s or builder’s thinking that simplifies things so the end user can understand and use them easily.

So for us, it was never “tech first, product later.” It was always: start with the customer, identify the use case, and then build for that.

Our initial entry point was the Shopify ecosystem, which was booming — a $200 billion small-to-medium business economy. The Shopify app store and community were rapidly growing.

One of our first products was a Shopify price testing app. It addressed a simple but powerful problem: could you extract more revenue from your pricing decisions for e-commerce products? It ran live A/B tests while you slept and helped you find optimal pricing.

Next, we built Hagrid.io, a social Q&A tool for any website — which is still live today.

Then came Master, inspired by Pinterest. Pinterest had all these product images, but only about 4% were actually shoppable. So we tried to make those images directly buyable.

Another breakthrough came from our AI team — the product Magic Eraser. It could remove unwanted objects from images. When we launched it on Product Hunt, it blew up overnight. We were genuinely surprised by the massive demand for such a simple product that did one thing — but did it exceptionally well. It still ranks number one for the “magic eraser” keyword.

We also built a photo booth feature inside Magic Studio, where users could create cartoon or avatar versions of themselves. That eventually inspired InstaHeadshots, a realistic headshot generator.

At the core, it was never about thinking “AI-first.” It was always: how do we build something useful for the end user?

Shamanth:
I love that framing — doing one thing and doing it really well, and making it simple enough for any user to understand. It’s so easy to get distracted by the new shiny technology instead of solving real problems.

Let’s move on to InstaHeadshots, one of your later products. You’ve mentioned it was profitable from day one. What inspired it, and how did you know it was scalable?

Rohan:
The story of InstaHeadshots starts again with our AI team. They stay deeply plugged into AI research and new technologies. The speed of progress in AI is so fast that it’s easy to miss key developments if you’re not actively following them.

We were reading some of the early papers on text-to-image generation — around the same time that text-to-text models like ChatGPT were evolving. Having that knowledge and access to the latest models gave us confidence that this technology could be productized.

We already had our photo booth feature in Magic Studio, which inspired us to explore realistic headshots. We noticed existing tools in the space produced low-quality or “uncanny” results. That made us think: maybe we can do this better.

So we quickly put together a prototype — just one landing page, a three-step flow, and a Stripe paywall from our previous product. We launched it on Product Hunt and ran a few Google Ads. On day one, we earned our first $100 in revenue.

That immediate traction validated the idea. Within 30 days, the product was profitable.

Shamanth:
That’s incredible. You also mentioned that staying ahead of AI model improvements was crucial. What were some of the technical breakthroughs that made this kind of work possible?

Rohan:
Yes, absolutely. Let me explain this simply.

ChatGPT is text-to-text — you input text and get text back. But AI can now do text-to-image, text-to-voice, and even text-to-music.

In the text-to-image world, one of the biggest breakthroughs was the ability to describe an object and generate a photorealistic image that matched it. Then came another leap — identity preservation.

That meant you could train a model on just 8–10 images of a person and have it generate new, realistic photos of that same person in different settings or outfits.

That capability made headshot generation possible. Of course, there were still technical limitations — image resolution was capped at 512×512 pixels — so we had to do a lot of upscaling and post-processing.

But then, in 2024, Flux models arrived. These models were much larger and far more capable — similar to how GPT-5 today is a leap beyond GPT-3. When we integrated Flux into our workflow, our refund rate dropped by 50% overnight. The model quality alone transformed the business metrics.

Today, our focus is on pre-processing — making sure user identity data is handled correctly and giving customers more control: choice of outfit, hairstyle, background, even brand colors.

The next frontier is eliminating model fine-tuning altogether — making identity generation instant and cost-free, which is where research like nano models is heading.

Shamanth:
That’s a fantastic overview of the evolution of image generation. What’s really striking is how important the concept of identity is — how the model understands who it’s generating.

And this leads to another thought: model wrappers often get a bad reputation, like “oh, it’s just a GPT wrapper.” But as you’ve described, building something reliable on top of these models is far from simple.

So when it comes to productizing these models, how does your team approach making them truly usable for end users?

Rohan:
Great question. Here’s the analogy we use internally:

Think of AI models as roads — they’re the infrastructure. But users don’t want to build their own cars just to use the road. We build the vehicles — the apps — that take them from A to B efficiently.

You can build a bicycle for short trips (a lightweight consumer app), or a bus (an enterprise product). But roads, meaning foundation models, are massive, expensive, and need to be fully built to be useful.

Our focus is on the vehicles. We work backwards from the user’s problem and decide which model, or combination of models, can solve it effectively.

For example, when customers asked, “Can I generate a photo wearing my company’s T-shirt or with a specific logo?” — we didn’t change the core model. We improved our workflow around it to make that request possible.

That’s how we think about productization — use the foundation but innovate on the experience.

Shamanth:
I love that analogy — not everyone should build roads, but everyone should focus on building the right vehicle for the right user journey.

You also mentioned having to reject low-quality user uploads in the early days. Why was that so important, and what changed as the models improved?

Rohan:
Yes, that was a crucial part of building trust.

We learned early on that bad inputs produce bad outputs — but you can’t just tell users that after they’ve uploaded a photo. If the result is poor, they’ll blame the product, not their photo.

So we put strict constraints in place: only front-facing images, no hats or glasses, one person per image, and a minimum of 12 photos.

It was tough, and yes, a running joke was that men never have enough good photos to upload. But we needed those rules to ensure quality outputs and maintain that “wow factor.”

Over time, as models improved, we relaxed those restrictions. Today, users can upload far fewer images and even preview their headshots before paying.

That balance — between trust and accessibility — has been key.
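
As an illustration of the early upload rules described above, here is a minimal Python sketch. It is not InstaHeadshots' actual code: the `PhotoCheck` fields and the `validate_upload` function are hypothetical names, and the per-photo attributes are assumed to come from some separate face-analysis step that is not shown.

```python
from dataclasses import dataclass

MIN_PHOTOS = 12  # the early minimum mentioned in the conversation


@dataclass
class PhotoCheck:
    """Hypothetical per-photo attributes, assumed to come from a face-analysis step."""
    filename: str
    face_count: int
    front_facing: bool
    has_hat: bool
    has_glasses: bool


def photo_problems(photo: PhotoCheck) -> list[str]:
    """Return the reasons a single photo breaks the upload rules (empty if none)."""
    problems = []
    if photo.face_count != 1:
        problems.append("must show exactly one person")
    if not photo.front_facing:
        problems.append("must be front-facing")
    if photo.has_hat:
        problems.append("no hats")
    if photo.has_glasses:
        problems.append("no glasses")
    return problems


def validate_upload(photos: list[PhotoCheck], min_photos: int = MIN_PHOTOS) -> dict:
    """Split an upload into accepted and rejected photos and check the minimum count."""
    accepted, rejected = [], {}
    for photo in photos:
        problems = photo_problems(photo)
        if problems:
            rejected[photo.filename] = problems
        else:
            accepted.append(photo.filename)
    return {
        "accepted": accepted,
        "rejected": rejected,
        "enough_photos": len(accepted) >= min_photos,
    }
```

Returning the specific reasons per file, rather than a single pass or fail, is one way to tell users what to fix before any output is generated. Relaxing the rules later would then be a matter of lowering `min_photos` or dropping individual checks.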

Shamanth:
That makes perfect sense. Sometimes you have to protect users from themselves to maintain product quality, even if it slightly hurts conversion rates.

Now, there are so many models available today. How does your team decide which ones to use and test for different use cases?

Rohan:
Our AI team constantly experiments. We read new research papers, test early-access models, and design our products modularly — so one part of a workflow can use one model, and another part can use a different one.

This flexibility lets us combine the best strengths of multiple models rather than relying on a single system.
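
A rough Python sketch of what a modular workflow like this could look like is shown here. The stage names and the "model-a", "model-b", and "model-c" functions are placeholders invented for illustration; they stand in for real model calls and do not describe the team's actual stack.

```python
from typing import Callable

# A stage takes the working job state and returns an updated copy.
Stage = Callable[[dict], dict]


def generate_with_model_a(job: dict) -> dict:
    # Placeholder for a call to one image model (hypothetical).
    return {**job, "image": f"portrait of {job['subject']} [model-a]"}


def upscale_with_model_b(job: dict) -> dict:
    # Placeholder for a separate upscaling model (hypothetical).
    return {**job, "image": job["image"] + " [upscaled by model-b]"}


def generate_with_model_c(job: dict) -> dict:
    # A newer generator that could replace model A without touching other stages.
    return {**job, "image": f"portrait of {job['subject']} [model-c]"}


def run_workflow(stages: dict[str, Stage], job: dict) -> dict:
    """Run the stages in insertion order, passing the job state along."""
    for stage in stages.values():
        job = stage(job)
    return job


workflow = {"generate": generate_with_model_a, "upscale": upscale_with_model_b}
first = run_workflow(workflow, {"subject": "user-123"})

# Swapping one stage leaves the rest of the workflow unchanged.
workflow["generate"] = generate_with_model_c
second = run_workflow(workflow, {"subject": "user-123"})
```

Because each stage only depends on the shape of the job state, a new model can be tried in one step while every other step stays as it was.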

Shamanth:
That makes sense. Now, much like “GPT wrappers,” another term that often gets a bad rap is prompting or prompt engineering. But it’s incredibly important.

How did your team realize the impact of the right or wrong prompt?

Rohan:
Honestly, prompting deserves more respect. It’s simply the language we use to communicate with models.

With text-to-text systems like ChatGPT, you can be a bit lazy — vague prompts still give good results. But in text-to-image models, specificity is everything.

For example, we once used the prompt “navy blazer.” The model kept generating navy military jackets instead of business blazers. Changing it to “navy blue blazer” fixed it instantly.

So prompt precision matters — it’s how you teach the model exactly what you mean.

Testing and iterating on prompts is a big part of our workflow. That’s where a lot of the invisible engineering effort goes — not just the model itself, but how we talk to it.
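
One way to keep prompt fixes like the blazer example from being lost is to record them in code. The Python sketch below is only an assumption about how that might be done: the `DISAMBIGUATED` table, the `build_prompt` function, and the prompt wording itself are hypothetical and are not the prompts InstaHeadshots uses.

```python
# Hypothetical table of wordings found to be ambiguous during testing,
# mapped to the wording that produced the intended result.
DISAMBIGUATED = {
    "navy blazer": "navy blue blazer",  # the example from the conversation
}


def build_prompt(outfit: str, background: str) -> str:
    """Assemble an image prompt, replacing known-ambiguous outfit wording."""
    cleaned = outfit.strip()
    # Look up the lowercase form so "Navy Blazer" and "navy blazer" match.
    cleaned = DISAMBIGUATED.get(cleaned.lower(), cleaned)
    return f"professional headshot, person wearing a {cleaned}, {background} background"
```

For instance, `build_prompt("Navy Blazer", "plain grey")` would send "navy blue blazer" to the model, while outfits that are not in the table pass through unchanged.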

Shamanth:
I completely agree. In my own experience, tweaking the system prompt often made the biggest difference — more than any code changes.

Now, not everyone on a team is technical. How do you ensure that everyone in your organization uses AI effectively in their daily work?

Rohan:
We encourage constant exploration. Whenever a new tool launches, someone on our team signs up and tries it.

For engineers, that might be coding assistants. For marketers, AI ad tools. For me, it’s content and analytics.

But the key is what we call the “aha moment.” That’s when a user experiences a clear, immediate benefit from AI — something that saves time or produces unexpectedly great results.

For a developer, it might be solving a complex syntax error instantly. For a marketer, it could be AI-generated ad copy that performs well. Once someone gets that aha moment, adoption becomes natural.

Shamanth:
That’s so true. And finally, on the marketing side — even though your company is AI-first, you still rely heavily on human context and editing for messaging and copy. Why is that?

Rohan:
Because context is everything.

AI can generate text, but it doesn’t always understand where or how that text will appear. A line that sounds great on a notepad might not fit a landing page or email.

Our founder, for example, will often edit copy directly on the staging site, adjusting it in real time to see how it looks in context.

That human touch — understanding tone, placement, and emotion — is what keeps the brand authentic.

Even with all the AI advancements, taste and context are still very human strengths.

Shamanth:
Absolutely. And that’s a great note to end on. Context and taste — those are still uniquely human skills. Maybe one day AI will catch up, but for now, they remain our advantage.

Rohan:
Thank you so much, Shamanth. This was a great conversation — I truly enjoyed sharing our journey and insights with you.

Shamanth:
Thank you, Rohan. It’s been a pleasure having you on the show.
