Tech News · February 28, 2024 · 7 min read

AI Vision Update: Smarter Item Recognition

Deep dive into the AI improvements that allow VFetch to identify items from blurry or partial photos with 40% higher accuracy.

VFetch Team

Lost property photography is rarely ideal. Items are found in storerooms, handed over at busy front desks, or discovered by cleaners mid-shift. Photos get taken quickly, under fluorescent lighting, against cluttered backgrounds, and sometimes out of focus. The real world is messy - and our AI has to work in it.

This month we shipped a significant update to VFetch's vision pipeline. Here is a plain-language breakdown of what changed, why it matters, and what it means for venues and guests.

40% - Improvement in attribute extraction

61% - Better on low-quality photos

89% - Brand ID accuracy (up from 67%)

34% - Richer descriptions on average

The Challenge: Real-World Photos Are Imperfect

When we first built VFetch's item recognition system, we trained and tested it against relatively clean product photography. A clear photo of a pair of black Sony headphones on a white surface is easy to analyse. But that is not what most found-item photos look like.

Early venue feedback revealed a consistent pattern: accuracy dropped when photos were taken at an angle, in low light, with part of the item obscured, or against a patterned background. A wallet photographed half-open on a patterned carpet would return less detail than we needed for a reliable match.

Real-world found-item photography challenges: 70% of found-item photos fall into the 'messy middle' - not terrible, but not clean product shots either.

We needed the model to perform well in that messy middle - the 70% of photos that are not terrible but are not pristine either.

What We Changed

The update covers three areas of the recognition pipeline:

Contextual inference

The model now draws on contextual signals to supplement what it can directly observe. A partial label, a distinctive strap, or a specific hardware detail is used to make probabilistic inferences about brand, model, and category - rather than returning a generic low-confidence description.
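One way to picture this kind of inference is as a naive Bayes-style combination of cues. The sketch below is an illustration only: it is not a description of how the VFetch model works internally, and the cue names, brands, prior, and likelihood numbers are all invented for the example.

```python
from math import prod

# Hypothetical prior belief about brand before looking at any cue.
PRIOR = {"Fossil": 0.2, "Mulberry": 0.1, "unbranded": 0.7}

# Hypothetical likelihoods P(cue | brand). All numbers are made up.
CUE_LIKELIHOODS = {
    "label_fragment_fos": {"Fossil": 0.60, "Mulberry": 0.02, "unbranded": 0.01},
    "brass_zip_pull": {"Fossil": 0.30, "Mulberry": 0.25, "unbranded": 0.10},
}


def infer_brand(cues):
    """Combine the prior with each observed cue and normalise to probabilities."""
    scores = {
        brand: PRIOR[brand] * prod(CUE_LIKELIHOODS[cue][brand] for cue in cues)
        for brand in PRIOR
    }
    total = sum(scores.values())
    return {brand: score / total for brand, score in scores.items()}


# Two weak cues together point strongly at one brand.
posterior = infer_brand(["label_fragment_fos", "brass_zip_pull"])
best_guess = max(posterior, key=posterior.get)
```

With no cues at all, `infer_brand([])` simply returns the prior, which corresponds to the generic low-confidence description the update is meant to avoid.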

Multi-angle prompt structuring

We restructured how we prompt the vision model. Instead of asking for a single description, the pipeline now runs structured extraction - separating object type, colour analysis, brand identification, condition assessment, and distinguishing features as discrete steps. This produces richer, more consistent output even when individual signals are weak.
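As a rough sketch of what running extraction as discrete steps could look like in code: the step names follow the list above, but the prompt wording, the `extract_attributes` helper, and the `fake_vision_model` stand-in are assumptions made for illustration, not VFetch's actual prompts or API.

```python
from typing import Callable, Dict

# Hypothetical prompts, one per extraction step. The real prompts are not published.
EXTRACTION_STEPS: Dict[str, str] = {
    "object_type": "What type of object is shown?",
    "colour": "Describe the colours of the item.",
    "brand": "Identify any visible brand, or answer 'unknown'.",
    "condition": "Assess the condition of the item.",
    "distinguishing_features": "List any distinguishing features.",
}


def extract_attributes(
    image: bytes, ask: Callable[[bytes, str], str]
) -> Dict[str, str]:
    """Run each step as its own query and collect the answers by step name."""
    return {step: ask(image, prompt) for step, prompt in EXTRACTION_STEPS.items()}


def fake_vision_model(image: bytes, prompt: str) -> str:
    """Stand-in for a real vision model call; returns a canned answer."""
    return f"answer to: {prompt}"


result = extract_attributes(b"", fake_vision_model)
```

Because each step is asked separately, a weak answer on one attribute (say, brand) does not drag down the others, which is the consistency benefit described above.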

Confidence-weighted descriptions

The model now flags each attribute by confidence level. High-confidence attributes are displayed prominently. Lower-confidence inferences (e.g. 'possibly Mulberry, stitching consistent with premium brand') are surfaced separately. Staff get a clearer picture of what the AI is certain about vs. what it is inferring.
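A minimal sketch of that split, assuming each attribute carries a confidence score between 0 and 1. The `Attribute` type, the 0.8 threshold, and the example scores are hypothetical; the article does not state how confidence is represented or where the cut-off sits.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Attribute:
    name: str
    value: str
    confidence: float  # assumed scale: 0.0 (guess) to 1.0 (certain)


HIGH_CONFIDENCE = 0.8  # hypothetical threshold


def split_by_confidence(
    attrs: List[Attribute], threshold: float = HIGH_CONFIDENCE
) -> Tuple[List[Attribute], List[Attribute]]:
    """Separate attributes to show prominently from those to flag as inferred."""
    confirmed = [a for a in attrs if a.confidence >= threshold]
    inferred = [a for a in attrs if a.confidence < threshold]
    return confirmed, inferred


attrs = [
    Attribute("object_type", "handbag", 0.97),
    Attribute("colour", "dark red", 0.91),
    Attribute("brand", "possibly Mulberry", 0.55),
]
confirmed, inferred = split_by_confidence(attrs)
```

Here the object type and colour would be displayed prominently, while the brand guess would be surfaced separately for staff to confirm.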

The Results

We measured the update against a benchmark of 4,000 real found-item photos submitted by venues over three months - manually reviewed and tagged to establish ground truth.

40% - Overall accuracy improvement

61% - Improvement on worst-quality photos

67% → 89% - Brand ID accuracy

+34% - Average description length
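Brand ID is the one result reported as a before-and-after pair, so it can be read two ways: as a change in percentage points or as a relative improvement. The short calculation below shows both. The `improvement` helper is just for illustration, and the article does not say which convention the other percentages use.

```python
from typing import Tuple


def improvement(before: float, after: float) -> Tuple[float, float]:
    """Return (percentage-point change, relative change in percent)."""
    points = after - before
    relative = points / before * 100
    return points, relative


points, relative = improvement(67.0, 89.0)
print(f"{points:.0f} points, {relative:.1f}% relative")
# prints "22 points, 32.8% relative"
```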

Why This Matters for Matching

Better descriptions directly improve match rates. The VFetch matching engine compares guest search queries against logged item descriptions using semantic similarity. The more detail a description contains, the more ways a guest's query can connect with it.

A wallet logged as "brown leather wallet" will match a handful of queries. A wallet logged as "bifold, dark tan leather, brass zip coin compartment, visible corner wear, possible Fossil branding" will match far more - and with far higher confidence.
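To make the effect concrete, here is a toy comparison using cosine similarity over word counts. This is a deliberately simple stand-in: the real matching engine uses semantic similarity, which would also connect related words such as "tan" and "brown" that plain word overlap misses. The guest query is invented, and the richer description is adapted from the example above with the word "wallet" added.

```python
import math
import re
from collections import Counter


def tokens(text: str) -> Counter:
    """Lowercase the text and count its words."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


query = tokens("tan bifold wallet with brass zip")  # hypothetical guest query
sparse = tokens("brown leather wallet")
rich = tokens(
    "bifold wallet, dark tan leather, brass zip coin compartment, "
    "visible corner wear, possible Fossil branding"
)

sparse_score = cosine(query, sparse)
rich_score = cosine(query, rich)
```

The sparse description shares only one word with the query, while the rich one shares five, so `rich_score` comes out noticeably higher than `sparse_score`.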

What stays the same

The logging flow for venue staff is unchanged. Staff take a photo and confirm the description - the AI handles the analysis silently in the background. The only visible difference is that descriptions are richer and more detailed than before.

The next update to the vision pipeline focuses on document and ID recognition - specifically, improving how the platform handles found passports, driving licences, and bank cards, where the right balance between useful identification and privacy protection requires careful design.

Free to use. No contracts. Staff training takes under 20 minutes.
