Product 10 min read April 16, 2026 Daniel Crouch

Inside the webem.io prompt mining pipeline.

Every webem.io customer gets a custom prompt set within 48 hours of onboarding. Here's the pipeline we run to get there — and why a hand-picked list of 80 prompts beats an auto-generated list of 800.

Step 1 — Seed harvesting

We pull seed phrases from four sources: Google Search Console, customer support transcripts, sales call recordings and competitor review snippets. Each source contributes a different intent profile.
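As a rough illustration of how seeds from several sources could be merged while keeping track of where each phrase came from, here is a minimal Python sketch. The source keys, the sample phrases and the `merge_seeds` helper are hypothetical and are not taken from the production pipeline; the normalization (lowercasing and collapsing whitespace) is likewise an assumption.

```python
from collections import defaultdict


def merge_seeds(seeds_by_source):
    """Map each normalized seed phrase to the set of sources it appeared in."""
    merged = defaultdict(set)
    for source, phrases in seeds_by_source.items():
        for phrase in phrases:
            # Assumed normalization: lowercase and collapse runs of whitespace.
            key = " ".join(phrase.lower().split())
            if key:
                merged[key].add(source)
    return dict(merged)


# Hypothetical sample data, one entry per source named in the article.
seeds_by_source = {
    "search_console": ["Best CRM for startups", "crm pricing"],
    "support_transcripts": ["how to export contacts"],
    "sales_calls": ["best crm  for startups"],
    "competitor_reviews": ["CRM pricing"],
}

seeds = merge_seeds(seeds_by_source)
for phrase, sources in sorted(seeds.items()):
    print(phrase, sorted(sources))
```

Keeping the source set next to each phrase means a later step can still tell a seed that only appears in support transcripts from one that shows up in both search data and sales calls.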

Step 2 — Intent classification

A small fine-tuned model classifies each seed into one of four intents: awareness, comparison, decision and post-purchase. The mix matters: most clients arrive over-indexed on comparison prompts and under-indexed on awareness.
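To make "the mix" concrete, the sketch below computes the share of seeds in each intent bucket from a list of labels. The labels are made-up stand-ins for the classifier's output, and `intent_mix` is a hypothetical helper; the model itself is not shown.

```python
from collections import Counter

INTENTS = ("awareness", "comparison", "decision", "post-purchase")


def intent_mix(labels):
    """Return the fraction of seeds that fall into each intent bucket."""
    counts = Counter(labels)
    total = len(labels)
    return {intent: counts[intent] / total for intent in INTENTS}


# Hypothetical classifier output for ten seeds, skewed toward comparison.
labels = (
    ["comparison"] * 6
    + ["awareness"] * 1
    + ["decision"] * 2
    + ["post-purchase"] * 1
)

mix = intent_mix(labels)
for intent in INTENTS:
    print(f"{intent}: {mix[intent]:.0%}")
```

In this invented sample, comparison seeds make up well over half the list while awareness barely registers, which is the kind of skew described above.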

Step 3 — Clustering and pruning

We cluster by embedding similarity and keep the medoid of each cluster, that is, the prompt that is on average most similar to the other members. The result is a short, non-redundant list that covers the buyer journey without paying to track 20 variants of the same query.
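Here is a minimal sketch of that idea using only the Python standard library. The article does not say which clustering algorithm is used, so the greedy threshold clustering below is an assumption chosen for brevity, as are the `0.9` cosine threshold, the toy 2-D vectors standing in for real embeddings, and the sample prompts.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def cluster(vectors, threshold=0.9):
    """Greedy clustering (an assumption): each vector joins the first cluster
    whose first member is at least `threshold` similar, else starts a new one."""
    clusters = []  # each cluster is a list of indices into `vectors`
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def medoid(members, vectors):
    """Index of the member with the highest total similarity to its cluster."""
    return max(
        members,
        key=lambda i: sum(cosine(vectors[i], vectors[j]) for j in members),
    )


# Hypothetical prompts with toy 2-D vectors in place of real embeddings.
prompts = [
    "best crm for startups",
    "best crm software for startups",
    "top crm tools for a startup",
    "how to migrate crm data",
]
vectors = [(1.0, 0.0), (0.985, 0.174), (0.940, 0.342), (0.0, 1.0)]

kept = [prompts[medoid(members, vectors)] for members in cluster(vectors)]
print(kept)
```

With these toy vectors the three near-duplicate prompts collapse into one cluster and the middle one is kept as the medoid, while the unrelated migration prompt stays as its own cluster. Because a medoid is always an actual member of the cluster, the kept item is a real prompt someone typed rather than a synthetic average.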

Step 4 — Human review

A category specialist signs off on the final list. This step alone removes about 18% of the prompts as either off-brand or commercially irrelevant. It's the difference between a dashboard you trust and a dashboard you ignore.
