From photo to Shopify draft in one prompt.

An agent can now run the whole pipeline — end to end, from a single line of intent.

A small fashion brand brings a new tee to market. The work behind that one SKU usually involves a photographer, a model, a location scout, a stylist, someone retouching, a copywriter, an SEO specialist, a taxonomy/data person, an ops account manager, and a coordinator stitching the chain together. Call it a ten-person catalog team. Multiply by a hundred SKUs and a fall drop. The seller's evenings disappear.

We've been building roopafy for sellers who don't have that team. Until last week, the path was still: open roopafy in a browser, walk the wizard, click a button. Useful, but still a human at the wheel. This week we shipped the steering wheel itself: roopafy is now an MCP server. An AI agent can drive the entire pipeline — upload a garment photo, generate a model shoot, write the listing with SEO, fill the Shopify taxonomy, and push a publishable draft — from a single prompt.

This post is the honest version of how we did it, how you set it up, what works today, and what doesn't yet.

The shift

For three years the conversation about "AI for e-commerce" has meant assistance: a writing helper, a description generator, a thumbnail upscaler. You still juggled six tabs and made every decision.

Agentic e-commerce is different. You give the agent a goal — "turn this garment into a publishable listing, lookbook aesthetic, premium tone" — and it makes the decisions. Which model. Which poses. Which background. Which words. What taxonomy attributes Shopify needs filled. Whether the first cut is good enough or worth one more pose. When to call it ready.

That has been technically possible for about a year, because the language models are good enough, but operationally impossible: no platform exposed itself in a shape an agent could actually drive. MCP (Model Context Protocol) is the standard that fixed that. It gives any MCP-capable agent a common tool surface for talking to any platform that runs an MCP server. roopafy is now one of those platforms.

What it looks like

A real prompt from yesterday's testing. The agent ran in Claude Cowork; I typed one sentence:

Create a roopafy listing from this image and walk it all the way through to a publish-ready draft: https://res.cloudinary.com/.../floral-crop-top.jpg

Eleven minutes later, the agent had:

Title: Floral Cap Sleeve Crop Top | Casual Women's Navy Multi
SEO title: Women's Floral Crop Top Casual Cap Sleeve Navy Purple Green
Description:

Burst into bloom with this playful floral crop top featuring a flattering round neckline and breezy cap sleeves made for golden hour adventures. The slim fit silhouette moves with you, keeping every casual look effortlessly flirty. Navy, purple, green, and white florals pop against any outfit.

Plus two more variations of each, nine additional attribute tags (including button-front placket · lace trim neckline · fitted silhouette · dark floral print · feminine styling), the full Shopify taxonomy auto-filled (Color: Navy + Purple + Green + White · Pattern: Floral · Neckline: Round · Sleeve length: Cap · Top length: Crop top · Target gender: Female · Age group: Teens + Adults), and seven model shots our roopafy AI agents composed for this garment:

  • Studio Identity: clean catalog shot of the floral cap-sleeve top
  • Courtyard Stride: model walking with a soft smile in a sun-warmed courtyard
  • Lace Detail Closeup: beauty-shot framing of the neckline and floral print
  • Street Market Over-the-Shoulder: model laughing, full of life in a residential market scene
  • Café Seated: head thrown back laughing on a café chair
  • Alley Profile: mid-skip with hair flying in a residential alley
  • Golden Light Closing: bright laugh, hands framing face in late afternoon light
The seven-shot listing our roopafy AI agents composed for this garment in a single autonomous run. One model identity, one garment, seven distinct moods — no two of them interchangeable.

Notice the third shot — a beauty-shot framing of the floral print and round neckline. That wasn't a generic template. Our roopafy AI agents are fashion-native: our own virtual try-on models, trained on a billion garment images, paired with a pose library tuned for fashion editorial and a garment-physics layer that knows how a cap-sleeved crop top actually drapes on a real body. That's why they read the garment's distinctive details from the source photo and added dedicated shots for them. The whole initial generation cost five credits. No human approval anywhere in the chain until "ready to publish."

That's what "agentic e-commerce" actually means: not faster human work, but the work itself, done.

Why it works now

Two things had to be true for this to be possible.

First, the agent needs discoverable, well-described tools — not a REST API and a 200-page integration guide. MCP delivers this. The agent sees seventeen tools — list_models, upload_garment_image, analyze_garment, create_product, generate_listing, tweak_pose, update_listing_content, publish_to_shopify, and so on — each with a short description, a typed input schema, and a clear contract. The agent picks the right tool by reading the descriptions the same way a junior engineer reads a function signature. No glue code.
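
To make "a short description, a typed input schema" concrete, here is a minimal sketch of how one such tool might be described to an agent. The `name`, `description`, and `inputSchema` keys follow the shape MCP uses when listing tools; the description text and the `imageUrl` parameter are assumptions for illustration, not roopafy's actual schema.

```python
# Hypothetical descriptor for one tool; the field contents are illustrative only.
UPLOAD_GARMENT_IMAGE = {
    "name": "upload_garment_image",
    "description": "Fetch a garment photo from a public URL and stage it for a listing.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "imageUrl": {
                "type": "string",
                "description": "Public URL of the garment photo.",
            },
        },
        "required": ["imageUrl"],
    },
}


def missing_arguments(tool: dict, arguments: dict) -> list:
    """Return the required argument names that the caller did not supply."""
    required = tool["inputSchema"].get("required", [])
    return [name for name in required if name not in arguments]
```

An agent reading this descriptor knows it must supply `imageUrl` before the call can succeed, which is the "clear contract" part.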

Second, the platform behind the tools has to be built to be driven. This is where most retrofits fall over. roopafy's original wizard was a sequence of steps — upload → categorize → configure → generate → review → publish. We built that sequence as composable services from day one, because we knew the wizard was just the first client. And what's underneath isn't a general-purpose image API — it's our own virtual try-on models, trained on a billion garment images, with a fashion-editorial pose library and a garment-physics layer baked in. Fabric drape, sleeve break, body movement, the difference between a catalog frame and a lifestyle scene — the agent doesn't have to prompt-engineer around any of it. The framework already knows. So exposing it to an agent wasn't a rewrite; it was a translation. Seventeen MCP tools, one per meaningful action, sitting in front of the same agentic framework the dev console uses. The agent isn't going through a special "API mode" — it's literally using the same orchestration.

How agents reach roopafy

An AI agent connects to the roopafy MCP server over HTTPS. The MCP server exposes seventeen typed tools with hashed Personal Access Token auth, per-session spend caps, and guarded outbound image fetches. It dispatches into the roopafy agentic framework, the same engine that powers the in-product wizard, which generates multi-shot photo listings, writes SEO-ready copy, fills Shopify taxonomy, publishes drafts, and refines listings one image at a time.

[Diagram: AI agent (Claude Cowork, Claude Desktop, or any MCP client) → MCP / HTTPS → roopafy MCP server (17 typed tools; PAT auth, rfy_…, hashed; per-session spend cap; server-fetched images + SSRF guard) → internal roopafy agentic framework (multi-shot photo gen; listing copy + SEO; Shopify taxonomy + publish; per-image refinement)]
Agents reach roopafy through one purple bridge. Everything behind it is the same orchestration that powers the in-product wizard — not a parallel "API mode."
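
For readers unfamiliar with what an SSRF guard on server-side image fetches involves, here is a minimal sketch of the general technique. It is not roopafy's implementation: the function name and the injected `resolve` callable are hypothetical, and the rule (only http(s) URLs whose host resolves exclusively to public IP addresses) is one common policy among several.

```python
import ipaddress
from typing import Callable, Iterable
from urllib.parse import urlsplit


def is_safe_fetch_target(url: str, resolve: Callable[[str], Iterable[str]]) -> bool:
    """Allow only http(s) URLs whose host resolves exclusively to public IPs.

    `resolve` maps a hostname to IP strings; it is injected so the check
    can be exercised without real DNS lookups.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    addresses = list(resolve(parts.hostname))
    if not addresses:
        return False
    # Loopback, private, and link-local ranges (for example cloud metadata
    # endpoints) are not "global", so they are rejected here.
    return all(ipaddress.ip_address(a).is_global for a in addresses)
```

A production guard would also need to connect to the exact address it validated, since a hostname can resolve differently between the check and the fetch.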

A few architectural choices that mattered:

  • Server-fetched images. Image bytes never travel through the agent's token stream. The agent passes a URL; roopafy's server fetches and stages it in our CDN. Why this matters: a single 100 KB image is roughly 30,000 tokens of base64. An agent that has to emit that as a tool argument stalls before the call dispatches. We learned this the hard way.
  • Per-session spend caps. An agent that goes off the rails can't drain your account. You pass an optional X-MCP-Spend-Cap: 50 header at session start and the credit counter enforces it atomically across every charging call in that session.
  • Personal access tokens, hashed at rest. Long-lived, individually revocable, scoped to one account. Mint them in your profile, paste them into the agent, revoke them whenever.
  • Normalized status polling. When generation kicks off, the agent calls get_generation_status(productId) and gets back a clean { done, failed, progressPct }. No parsing of internal lifecycle strings. It either keeps polling or it doesn't.
  • Sane defaults and actionable errors. If you forget to pick a photo template, the tool rejects up front (TEMPLATE_REQUIRED) with the exact next step — instead of running for nine minutes and failing partway through generation.
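
The normalized status polling described above could be driven by a scripted client roughly like this. It is a sketch under assumptions: `call_tool` is a hypothetical stand-in for whatever function your MCP client exposes to invoke a tool, and the interval and timeout values are arbitrary. Only the tool name, the `productId` argument, and the `done` / `failed` / `progressPct` keys come from the description above.

```python
import time
from typing import Callable


def wait_for_generation(
    call_tool: Callable[[str, dict], dict],
    product_id: str,
    interval_s: float = 10.0,
    timeout_s: float = 900.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll get_generation_status until the run is done, failed, or timed out."""
    waited = 0.0
    while True:
        status = call_tool("get_generation_status", {"productId": product_id})
        if status["failed"]:
            raise RuntimeError(f"generation failed for {product_id}")
        if status["done"]:
            return status
        if waited >= timeout_s:
            raise TimeoutError(
                f"gave up after {waited:.0f}s at {status['progressPct']}%"
            )
        sleep(interval_s)
        waited += interval_s
```

Because the status is already reduced to three fields, the loop needs no knowledge of internal lifecycle states.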

These aren't features you'd brag about on a landing page. They're the difference between an MCP server an agent can actually use and one that produces a beautiful demo and a thousand support tickets.
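
As an illustration of what "enforces it atomically" means for the per-session spend cap, here is a minimal in-process sketch. The class and exception names are hypothetical, and a real server would presumably enforce the cap in shared storage rather than in one process's memory; the point is only that the check and the increment happen under one lock, so two concurrent charging calls cannot both slip under the cap.

```python
import threading
from typing import Optional


class SpendCapExceeded(Exception):
    """Raised when a charge would push the session past its cap."""


class SessionSpend:
    """Tracks credits spent in one session against an optional cap."""

    def __init__(self, cap: Optional[int] = None) -> None:
        self._cap = cap
        self._spent = 0
        self._lock = threading.Lock()

    def charge(self, credits: int) -> int:
        """Reserve credits atomically; refuse if the cap would be exceeded."""
        with self._lock:
            if self._cap is not None and self._spent + credits > self._cap:
                raise SpendCapExceeded(
                    f"cap {self._cap}, spent {self._spent}, requested {credits}"
                )
            self._spent += credits
            return self._spent
```

With a cap of 50, charges of 5 and 45 succeed and any further charge is rejected before work starts.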

Set it up in five minutes

You need a roopafy account on the Pro plan and Claude (Desktop or Cowork).

  1. Mint a token. Sign in to roopafy → User Profile → MCP Access. Give the token a name (e.g. "Claude — laptop"). Click create. The raw token (rfy_…) is shown once; copy it. We store only a SHA-256 hash — we can't show it again, even to you.
  2. Add roopafy to Claude. Open ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) and add roopafy as a server. Claude Desktop's native config doesn't take a remote URL plus a bearer header directly, so we bridge through mcp-remote:
    {
      "mcpServers": {
        "roopafy": {
          "command": "npx",
          "args": [
            "-y", "mcp-remote", "https://mcp.roopafy.com/mcp",
            "--header", "Authorization: Bearer rfy_your_token_here"
          ]
        }
      }
    }

    If the mcpServers block already exists (preferences live in the same file), add roopafy as a sibling — don't nest it. We tripped on this ourselves: it's easy to miss the comma and end up with invalid JSON that fails silently.

  3. Restart Claude — fully. Cmd-Q, not just close the window. The config is read once on launch.
  4. Smoke test. Open a fresh chat and ask:

    roopafy: who am I?

    If the agent comes back with your email and tier, you're connected. If you see "tool not found," the config didn't parse — check the JSON.
  5. First real run. Find any public image URL of a garment (a product page, a Cloudinary link, anywhere). Then:

    Create a roopafy listing from this garment: <URL>. Use the Classic photo template.

    Walk away for ~5 minutes. Come back to a full listing.
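
If you would rather not hand-edit the config in step 2, a small script can do the merge and catch invalid JSON before Claude ever reads it. This is a sketch, not an official installer: the function name is hypothetical, and it only reproduces the server entry shown in step 2.

```python
import json


def add_roopafy_server(config_text: str, token: str) -> str:
    """Return config JSON with roopafy added as a sibling under mcpServers."""
    # json.loads raises on invalid JSON instead of failing silently.
    config = json.loads(config_text) if config_text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    servers["roopafy"] = {
        "command": "npx",
        "args": [
            "-y", "mcp-remote", "https://mcp.roopafy.com/mcp",
            "--header", f"Authorization: Bearer {token}",
        ],
    }
    return json.dumps(config, indent=2)
```

Read the existing file, pass its text and your token through the function, and write the result back. Existing servers and preferences are left untouched.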

A note on plans. MCP access is a Pro plan feature. Basic users see an upgrade prompt instead of the token form. If your agent is going to drive your catalog, it should be on the plan that takes catalog seriously.

Use Cowork for the real flow

Claude Desktop is fine for trying tools one at a time. For real work — where the agent decides what to do next, runs the pipeline in a loop, polls progress, and shows you the result — use Claude Cowork. It's the place where the agent thinks for itself. Add the same MCP server in Cowork's settings and you're driving a multi-step pipeline from a single prompt.

The difference is real. In Desktop you'll find yourself nudging the agent through each step. In Cowork the agent owns the loop.

What isn't perfect yet — the honest list

We shipped, but a few things are still rough.

Inline image rendering in Cowork. Cowork renders rich artifacts inside a sandboxed iframe with a strict content-security policy. Our CDN's URLs aren't on its allowlist, so when the agent tries to show you the generated shoot in a chat tile, the images come up broken. We have a workaround live: get_product(includeThumbnails: true) returns small base64 thumbnails the agent can embed inline as data: URIs. The agent only fetches them when a human asks to see the shoot — autonomous flows skip the cost entirely. Cowork will eventually allowlist more hosts or the data: render path will mature; meanwhile, this works today.
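
The data: URI workaround amounts to prefixing the base64 thumbnail with a media type. A minimal sketch, assuming the thumbnail arrives as a plain base64 string and is a JPEG (the response field names and the image format are not specified here, so both are assumptions):

```python
import base64


def thumbnail_data_uri(thumbnail_b64: str, mime_type: str = "image/jpeg") -> str:
    """Wrap an already base64-encoded thumbnail in a data: URI."""
    # validate=True rejects characters outside the base64 alphabet,
    # so a truncated or corrupted payload fails here rather than in the UI.
    base64.b64decode(thumbnail_b64, validate=True)
    return f"data:{mime_type};base64,{thumbnail_b64}"
```

The resulting string can be used directly as an image source inside the sandboxed artifact, with no request to an external host.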

Local image attachments in Desktop. If you drag an image file into the chat (instead of passing a URL), the agent has the file in its sandbox but no clean way to get the bytes to us. The sandbox is network-restricted; emitting the file as base64 tool arguments stalls the model before the call ever dispatches. The dependable path right now is: host the image somewhere public (Cloudinary, S3, a Gist — anywhere a URL works), then pass the URL. We're exploring a "request upload ticket" tool that hands the agent's sandbox a short-lived signed upload URL so it can PUT the bytes directly to our CDN, bypassing the model entirely. Watch this space.
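
The cost of pushing bytes through the model is easy to estimate. Base64 emits four characters for every three input bytes; the characters-per-token ratio below is an assumption (tokenizers differ, and base64 often tokenizes less efficiently than prose), chosen only to show the order of magnitude.

```python
import math


def base64_length(num_bytes: int) -> int:
    """Characters needed to base64-encode num_bytes, including padding."""
    return 4 * math.ceil(num_bytes / 3)


def estimated_tokens(num_bytes: int, chars_per_token: float = 4.5) -> int:
    """Rough token estimate; chars_per_token is an assumed tokenizer ratio."""
    return round(base64_length(num_bytes) / chars_per_token)
```

For a 100 KB file (102,400 bytes), `base64_length` gives 136,536 characters, which at the assumed ratio lands near the roughly 30,000 tokens quoted earlier.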

Shopify publish timing. Pushing a multi-image draft to a connected Shopify store can take longer than the default MCP request timeout. We're moving publish to the same async-and-poll pattern the rest of the pipeline already uses. In the meantime, the listing already reports publishReadiness: ready on the roopafy side; the handoff to Shopify is the wobbly seam.

Generation time. A full seven-shot listing takes 3–8 minutes. The agent polls progress and surfaces the result; it doesn't sit blocking. But if you're impatient, watch for the Studio Front shot to land first — that one's a fast tell whether the model identity and garment fit are right.

None of these are blockers. They're the shape of "shipped and improving."

Where this goes — the refinement loop

One-shot listings are the easy half. The interesting half is the continuous loop, and it's already wired.

A typical week for a brand running on roopafy looks like this. The agent watches Shopify analytics on a schedule. A SKU's lifestyle scene — say, the "kitchen morning" shot from a women's tee — starts under-performing on click-through compared to sibling SKUs. The agent doesn't regenerate the whole shoot (expensive, wasteful, and most of the gallery is fine). It calls tweak_pose on that one image with a targeted instruction:

"swap to evening warmth — soft amber light, hands in pockets, more relaxed posture"

[Figure: Café Seated before the tweak (serious, low energy) and after the tweak (head thrown back laughing)]
One tweak_pose call. One credit. Same model, same garment, same brand voice — a new mood in a single slot. From an actual autonomous run: the agent flagged its own Café Seated frame as "serious, low energy," issued the tweak, and re-ran only that one image.

One credit. The single image regenerates with the same model, the same garment, the same brand voice — just a new mood for that one slot. The agent re-publishes the draft, watches the next window's signal, decides whether to keep the new variant or revert. That's single-image enhancement as a primitive — not "regenerate everything and hope," but a scalpel.
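
The keep-or-revert decision can be as simple as a relative-lift threshold. The sketch below is an assumed rule for illustration, not the logic the agent actually applies: the function name and the 5% minimum lift are hypothetical, and a real comparison would also need enough impressions in each window to be meaningful.

```python
def keep_variant(baseline_ctr: float, variant_ctr: float, min_lift: float = 0.05) -> bool:
    """Keep the tweaked image only if click-through improved by min_lift (relative)."""
    if baseline_ctr <= 0:
        # No baseline signal: keep the variant only if it earned any clicks.
        return variant_ctr > 0
    return (variant_ctr - baseline_ctr) / baseline_ctr >= min_lift
```

A move from 2.0% to 2.5% click-through is a 25% relative lift and keeps the variant; a drop, or a change inside the threshold, reverts to the original image.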

Pair that with tweak_field to rewrite a flat-performing SEO title in a sharper voice ("warmer, mention the linen, lead with the use case") and update_listing_content to refresh taxonomy attributes when Shopify category data shifts — and you have a system that learns its catalog. Daily. Per-SKU. Per-image. The seller decides only the brand voice and the budget cap; the agent decides everything else, and revises its decisions based on what actually converts.

This is what we mean when we say roopafy is built for agentic e-commerce. Not "AI tools that help you do your work." Agents that do the work — continuously, at the granularity of a single pose, paid for by the conversion lift they produce.

If you're a small fashion brand and shoot-day logistics have become the bottleneck on your growth, give the MCP server a few minutes. The setup above is the entire onboarding.

If you hit something in the rough list above, or something we haven't found yet, write to us. We read everything, and the limitations move fast when they're in front of us.

Mason, Chief Architect, roopafy
