Matching · Feb 13, 2026 · 5 min read

Matching Products by Meaning, Not Barcodes

Why semantic matching finds the right competitor products when SKUs and barcodes don't line up — how it works, where it beats exact-match, and how to review results.

The Fuzzify Team

Competitor intelligence & pricing

[Figure: a network of connected nodes linking a query entity to a document entity, representing semantic product matching.]

You can't compare your prices to a competitor's until you know which of their products actually correspond to yours. That mapping sounds trivial until you try to build it. Across two different brands, the same product has a different name, a different SKU, a different barcode, a different photo, and a different description. Exact-match keys line up on almost nothing. To compare catalogs across brands, you have to match on what a product is, not on how it's labeled.

This is the difference between exact matching and semantic matching, and it's the foundation everything else (price suggestions, monitoring, positioning) is built on. Get the matching wrong and every downstream number compares apples to oranges.

Why barcodes and SKUs fail across brands

Exact identifiers work beautifully inside one catalog and fall apart the moment you cross a brand boundary. Here's why the obvious keys don't help:

  • SKUs are internal. Every brand invents its own scheme. There is no shared namespace, so your SKU and a competitor's SKU for the same item have zero relationship.
  • Barcodes (UPC/EAN/GTIN) identify one manufacturer's specific item, not its equivalents. Two competing brands' own-label products are equivalent to shoppers but carry entirely different barcodes, and plenty of catalogs don't publish barcodes at all.
  • Names are written for marketing, not matching. "Ultra-Soft Merino Base Layer — Slate" and "Men's Merino Wool Thermal Top (Grey)" are the same product to a buyer and a string-distance nightmare to a computer.
  • Titles and specs live in different fields, in different orders, with different units. Exact and fuzzy string matching drown in this noise.

Exact-match keys answer "is this the identical listing?" What you actually need answered is "would a shopper consider these the same product?" Those are different questions.
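To put a rough number on the naming problem, here is a minimal sketch that measures word overlap between the two example titles above. The `tokens` and `jaccard` helpers are hypothetical names written for this illustration, and token overlap is just one simple string measure, not what any particular matcher uses.

```python
import re


def tokens(title: str) -> set[str]:
    """Lowercase a title and split it into alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", title.lower()))


def jaccard(a: str, b: str) -> float:
    """Share of distinct tokens the two titles have in common (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


ours = "Ultra-Soft Merino Base Layer - Slate"
theirs = "Men's Merino Wool Thermal Top (Grey)"

shared = tokens(ours) & tokens(theirs)  # the only shared word is "merino"
overlap = jaccard(ours, theirs)         # 1 shared token out of 12 distinct ones
print(shared, round(overlap, 3))
```

One shared word out of twelve is about as weak a signal as string comparison can give, even though a shopper would treat the two listings as the same product.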

What "matching by meaning" actually means

Semantic matching represents each product as an embedding — a long vector of numbers that captures the meaning of its title, description, and attributes. Products that mean similar things land near each other in that vector space, regardless of the exact words used. "Merino base layer" and "merino wool thermal top" end up as neighbors because the concepts overlap, even though the strings barely do.

Once every product is a point in that space, finding candidate matches becomes a nearest-neighbor search: for each of your products, pull the competitor products whose vectors sit closest. That's a wide net — it surfaces everything plausibly related, including a few that are close-but-not-equal (a different size, a two-pack, an accessory). Precision comes in the next step.
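The nearest-neighbor step can be sketched in a few lines. This is an illustration under loud assumptions: the three-dimensional vectors below are invented by hand, whereas real embeddings come from a model and have hundreds of dimensions, and the `cosine` and `nearest` helpers are hypothetical names. A production system would use a vector index rather than a full scan.

```python
import math


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def nearest(query: list[float], catalog: dict[str, list[float]], k: int = 3):
    """Return the k catalog entries most similar to the query vector."""
    scored = [(cosine(query, vec), name) for name, vec in catalog.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [(name, round(score, 3)) for score, name in scored[:k]]


# Toy vectors, invented for illustration only (not output from a real model).
competitor_catalog = {
    "Men's Merino Wool Thermal Top (Grey)": [0.9, 0.1, 0.2],
    "Merino Thermal Top, 2-Pack": [0.8, 0.3, 0.2],
    "Cotton Crew Socks": [0.1, 0.9, 0.1],
    "Wool Wash Detergent": [0.3, 0.2, 0.9],
}
our_product = [0.88, 0.12, 0.22]  # stands in for "Ultra-Soft Merino Base Layer"

shortlist = nearest(our_product, competitor_catalog, k=2)
for name, score in shortlist:
    print(f"{score:.3f}  {name}")
```

Note that the two-pack lands in the shortlist right behind the direct equivalent. That is the wide net doing its job: proximity alone cannot tell the two apart.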

The two-stage pipeline: wide net, then precise filter

A good matcher doesn't trust vector proximity alone, and it doesn't ask an expensive model to read your entire competitor catalog either. It splits the work so each stage does what it's best at.

1. Retrieve with vector search (recall)

Embed every product and run a nearest-neighbor search to pull the handful of competitor products closest in meaning to each of yours. This stage is cheap and fast, and its only job is to make sure the true match is somewhere in the shortlist.

2. Judge with a language model (precision)

Hand each candidate pair to an LLM that reads both products the way a person would and returns a 0–100 confidence score plus a one-line reason. This stage throws out the near-misses vector search let through — the wrong size, the bundle, the accessory — and keeps only genuine equivalents.

The two-stage design is what makes semantic matching both accurate and affordable. Vector search is too blunt to trust alone; running a language model over every possible pair would be too slow and too expensive. Retrieve wide, then judge narrow, and you get the recall of the first stage with the precision of the second. Fuzzify auto-confirms matches above a confidence threshold and leaves the borderline ones for a human, which is exactly where review comes in.
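The shape of that pipeline can be sketched as follows. Everything here is an assumption made for illustration: `match`, `Verdict`, and `stub_judge` are hypothetical names, the judge is a rule-based stand-in for a real LLM call, and the cutoffs of 80 and 50 are borrowed loosely from the review band discussed later, not Fuzzify's actual thresholds.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    candidate: str
    score: int   # 0-100 confidence from the judge
    reason: str  # one-line explanation from the judge
    status: str  # "confirmed", "review", or "rejected"


def match(
    product: str,
    retrieve: Callable[[str], list[str]],
    judge: Callable[[str, str], tuple[int, str]],
    auto_confirm: int = 80,  # assumed cutoff
    review_floor: int = 50,  # assumed cutoff
) -> list[Verdict]:
    """Stage 1: retrieve a shortlist. Stage 2: judge each candidate pair."""
    verdicts = []
    for candidate in retrieve(product):
        score, reason = judge(product, candidate)
        if score >= auto_confirm:
            status = "confirmed"
        elif score >= review_floor:
            status = "review"
        else:
            status = "rejected"
        verdicts.append(Verdict(candidate, score, reason, status))
    return verdicts


def stub_judge(product: str, candidate: str) -> tuple[int, str]:
    """Rule-based placeholder for the LLM judge (illustration only)."""
    text = candidate.lower()
    if "2-pack" in text:
        return 65, "Same item, but sold as a two-pack"
    if "detergent" in text:
        return 10, "Accessory, not the garment"
    return 92, "Same material, garment type, and colour"


shortlist = [
    "Men's Merino Wool Thermal Top (Grey)",
    "Merino Thermal Top, 2-Pack",
    "Wool Wash Detergent",
]
verdicts = match("Ultra-Soft Merino Base Layer - Slate", lambda p: shortlist, stub_judge)
for v in verdicts:
    print(f"{v.status:<10} {v.score:>3}  {v.candidate}: {v.reason}")
```

Passing `retrieve` and `judge` in as functions keeps the cheap stage and the expensive stage swappable, so the judge only ever sees the shortlist and never the full catalog.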

One product, many matches — never collapse to one

A common mistake is forcing each of your products into a single "best" competitor match. Real catalogs don't work that way. One of your products might correspond to three competitor listings across two rivals: a direct equivalent, a slightly larger pack, a bundled version. Collapsing that to one match throws away most of the market. Keeping the full set of confirmed matches is what makes the competitor price range meaningful in the first place.
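As a data-structure sketch, keeping many matches per product means grouping on your product ID instead of picking a winner. The product ID, rival names, and prices below are invented for illustration, and `price_ranges` is a hypothetical helper.

```python
from collections import defaultdict

# (our product ID, rival, competitor listing, price). All values are made up.
confirmed_matches = [
    ("base-layer-slate", "RivalA", "Merino Thermal Top (Grey)", 54.00),
    ("base-layer-slate", "RivalA", "Merino Thermal Top, larger pack", 99.00),
    ("base-layer-slate", "RivalB", "Merino Base Layer bundle", 72.50),
]


def price_ranges(matches):
    """Group confirmed matches by our product and report (min, max, count)."""
    by_product = defaultdict(list)
    for product_id, _rival, _title, price in matches:
        by_product[product_id].append(price)
    return {pid: (min(p), max(p), len(p)) for pid, p in by_product.items()}


ranges = price_ranges(confirmed_matches)
low, high, count = ranges["base-layer-slate"]
print(f"{count} matches, competitor prices from {low:.2f} to {high:.2f}")
```

With a single "best" match, the same product would report one price and no range at all.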

How to review matches efficiently

Semantic matching gets you very close, but you stay in control. The goal of review isn't to check every pair — it's to spend your attention where the model is least sure. A good workflow:

  1. Trust the high-confidence auto-confirms, but spot-check a sample. If a handful look right, the rest of the batch is likely sound too.
  2. Focus on the middle band. Scores in the 50–80 range are where genuine judgment calls live — a two-pack, a slightly different spec, a renamed variant. This is where a minute of human attention pays off most.
  3. Read the reason, not just the score. The one-line explanation tells you *why* the model matched them, which makes an accept/reject decision fast.
  4. Reject bundles and accessories deliberately. These are the classic false positives from the retrieval stage. Clearing them sharpens every price downstream.
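The workflow above can be sketched as a small triage function. This is a hedged illustration: `review_plan` is a hypothetical helper, the 80 and 50 cutoffs are assumptions taken from the middle band mentioned in step 2, and the sample size is arbitrary.

```python
import random


def review_plan(scored, auto_confirm=80, review_floor=50, sample_size=2, seed=0):
    """Split scored matches into a spot-check sample and a middle-band queue."""
    confirmed = [m for m in scored if m["score"] >= auto_confirm]
    middle = [m for m in scored if review_floor <= m["score"] < auto_confirm]

    # Step 1: spot-check a random sample of the auto-confirmed matches.
    rng = random.Random(seed)  # seeded so the sample is reproducible
    spot_check = rng.sample(confirmed, min(sample_size, len(confirmed)))

    # Step 2: queue the middle band for review, least confident first.
    middle.sort(key=lambda m: m["score"])
    return {"spot_check": spot_check, "review": middle}


# Invented example data; the reason field supports step 3 (read the reason).
scored_matches = [
    {"pair": "A1/B7", "score": 95, "reason": "Same fabric, cut, and colour"},
    {"pair": "A1/B9", "score": 72, "reason": "Same item, sold as a two-pack"},
    {"pair": "A2/B3", "score": 88, "reason": "Same model, renamed"},
    {"pair": "A2/B4", "score": 55, "reason": "Slightly different spec"},
    {"pair": "A3/B1", "score": 91, "reason": "Identical specs"},
    {"pair": "A3/B2", "score": 20, "reason": "Accessory only"},
]

plan = review_plan(scored_matches)
for m in plan["review"]:
    print(f'{m["score"]:>3}  {m["pair"]}: {m["reason"]}')
```

Showing the reason next to each score in the queue is what keeps the accept/reject decision quick.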

Why this is the foundation

Everything else Fuzzify does depends on this layer being right. Price suggestions are only as good as the comparable set they're computed over. Weekly monitoring only tells you something useful if it's watching the products that actually compete with yours. Match by meaning, review where the model is unsure, and the rest of the pipeline has honest inputs to work with.

See it on your own catalog

Import your products and your competitors'. Fuzzify matches by meaning, suggests a defensible price, and monitors changes weekly.
