E-commerce

Automating Product Enrichment for 50,000 SKUs

50,000 SKUs. Enriched and standardized in 5 days.

The Challenge

A Series B e-commerce brand had grown through acquisition and rapid catalog expansion, resulting in 50,000 SKUs with wildly inconsistent product data. The same attribute might be described as "Color: Navy", "Colour: Dark Blue", or not listed at all depending on which team had entered the data. Product descriptions ranged from detailed paragraphs to single sentences.

The downstream impact was severe: on-site search returned irrelevant results, category filters were unreliable, and the recommendation engine had poor signal quality. Customer support tickets about "can't find product" had doubled in six months.

The Solution

We built a batch processing pipeline that reads the existing catalog data, enriches descriptions, extracts and standardizes attributes, and generates missing taxonomy labels. Claude Opus 4.6 handles semantic understanding: it reads product descriptions and images to infer attributes that were never explicitly entered. GPT-5.4 performs structured attribute extraction, mapping free-text descriptions to a standardized attribute schema.
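To make the target of that extraction step concrete, here is a minimal sketch of what "mapping to a standardized attribute schema" can look like once a raw attribute string is in hand. It is not the production code: the alias tables, the function name `standardize_attribute`, and the choice to treat "Dark Blue" as "navy" are all assumptions made for illustration, and in the pipeline described above the models do the interpretation work rather than a lookup table.

```python
from __future__ import annotations

# Hypothetical alias tables. The real attribute schema is not published in
# this case study; these entries only mirror the "Color"/"Colour" example.
KEY_ALIASES = {"color": "color", "colour": "color"}
VALUE_ALIASES = {"color": {"navy": "navy", "dark blue": "navy"}}


def standardize_attribute(raw: str) -> tuple[str, str] | None:
    """Map a free-text "Key: Value" string to a canonical (key, value) pair.

    Returns None when the string cannot be mapped, so the caller can hand
    the item to a model or a human reviewer instead of guessing.
    """
    key, sep, value = raw.partition(":")
    if not sep:
        return None  # no "Key: Value" structure at all

    canonical_key = KEY_ALIASES.get(key.strip().lower())
    if canonical_key is None:
        return None  # attribute name not in the schema

    canonical_value = VALUE_ALIASES.get(canonical_key, {}).get(value.strip().lower())
    if canonical_value is None:
        return None  # value not recognized for this attribute

    return canonical_key, canonical_value


if __name__ == "__main__":
    for raw in ["Color: Navy", "Colour: Dark Blue", "Size: XL"]:
        print(raw, "->", standardize_attribute(raw))
```

Returning None for anything unrecognized, rather than passing the raw text through, keeps unmapped values out of the clean output and makes them easy to count.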

Cross-validation between the two models catches inconsistencies: if the models classify a product differently, the item is flagged for human review. The pipeline processes approximately 10,000 SKUs per hour and outputs clean, structured data formatted for direct import into the client's e-commerce platform.
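The cross-validation rule can be sketched in a few lines. This is an illustrative outline under stated assumptions, not the deployed logic: `ReviewDecision` and `cross_validate` are hypothetical names, each model's output is assumed to arrive as a plain dict of attribute to value, and an attribute that only one model returns is treated as a disagreement.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ReviewDecision:
    """Outcome of comparing two models' outputs for a single SKU."""

    sku: str
    agreed: dict[str, str] = field(default_factory=dict)
    # attribute -> (value from model A, value from model B)
    conflicts: dict[str, tuple[str | None, str | None]] = field(default_factory=dict)

    @property
    def needs_review(self) -> bool:
        # Any disagreement sends the whole item to human review.
        return bool(self.conflicts)


def cross_validate(
    sku: str, model_a: dict[str, str], model_b: dict[str, str]
) -> ReviewDecision:
    """Split attributes into those both models agree on and those they do not."""
    decision = ReviewDecision(sku=sku)
    for key in sorted(set(model_a) | set(model_b)):
        a, b = model_a.get(key), model_b.get(key)
        if a == b:
            decision.agreed[key] = a
        else:
            # Assumption: a value missing from one model counts as a conflict.
            decision.conflicts[key] = (a, b)
    return decision


if __name__ == "__main__":
    result = cross_validate(
        "SKU-1",
        {"color": "navy", "material": "cotton"},
        {"color": "navy", "material": "wool"},
    )
    print(result.needs_review, result.conflicts)
```

Keeping both candidate values in `conflicts` means a reviewer sees what each model proposed instead of only a flag.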

We also built a lightweight review dashboard where the merchandising team could approve changes in batches before they went live, giving them control without requiring them to touch individual SKUs.
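One way such batching could work is to group identical proposed changes so that a reviewer approves each distinct change once, however many SKUs it touches. The case study does not say how batches were formed, so the grouping key below (attribute, old value, new value) and the names `ProposedChange` and `batch_changes` are assumptions for illustration only.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class ProposedChange(NamedTuple):
    """A single attribute edit proposed by the pipeline (hypothetical shape)."""

    sku: str
    attribute: str
    old_value: Optional[str]  # None when the attribute was missing
    new_value: str


def batch_changes(changes):
    """Group SKUs by identical (attribute, old_value, new_value) change."""
    batches = defaultdict(list)
    for change in changes:
        key = (change.attribute, change.old_value, change.new_value)
        batches[key].append(change.sku)
    return dict(batches)


if __name__ == "__main__":
    proposed = [
        ProposedChange("SKU-1", "color", "Dark Blue", "navy"),
        ProposedChange("SKU-2", "color", "Dark Blue", "navy"),
        ProposedChange("SKU-3", "color", None, "navy"),
    ]
    for (attribute, old, new), skus in batch_changes(proposed).items():
        print(f"{attribute}: {old!r} -> {new!r} ({len(skus)} SKUs)")
```

Under this grouping, filling in a missing value stays a separate batch from renaming an existing one, so the two kinds of change can be approved or rejected independently.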

Results

50,000 SKUs enriched and standardized

34% improvement in on-site search relevance

5 days from kickoff to production

23% increase in filter-to-cart conversion

Timeline

5 days

Team

2 Deep Mist engineers

Tech Stack

Claude Opus 4.6, GPT-5.4, Python, Pandas, PostgreSQL, AWS Lambda, S3

“Our search finally works. Customers are finding products they didn't know we carried.”

VP of Product - E-commerce Brand

