Pillar 02 · Services

Sensor and vision data, turned into actionable outputs.

We build the data platforms and inference pipelines behind smart-city perception, automated enforcement, logistics optimization, and industrial vision applications. Anchored in production work at Hayden AI and Cargomatic.

→ The problem

Physical-world systems generate the messiest data in the AI economy: real-world video from edge devices, multi-sensor streams with timing variance, environmental conditions that change frame-to-frame, and inference latency requirements measured in milliseconds, not minutes.

Most AI services teams are good at GenAI but have never shipped real-time perception at scale. Most computer vision teams are good at models but not at the data infrastructure that feeds them. That gap is where projects fail. See how this compares to Big 4 advisory engagements or building an in-house team.

What we do

We build the end-to-end stack — from edge ingestion through structured datasets to production inference — for systems that need to perceive and act on the physical world.

1. Edge data platforms & ingestion

Real-world video and sensor data ingestion from edge devices. Frame-level structured datasets. Data lake architecture for perception-driven systems. Production-tested at smart-city scale on Hayden AI's automated traffic enforcement platform.

2. Production CV & perception pipelines

Computer vision pipelines that turn raw video and sensor streams into structured outputs. Object detection, tracking, classification, anomaly identification. Built to operationalize — not just demo.

3. Real-time ML inference at scale

Production ML inference pipelines for real-time video analysis. Latency-optimized architectures. Edge-cloud orchestration. Continuous evaluation and drift detection.

4. AI-driven operational optimization

Routing, dispatch, and operational decisions powered by ML — anchored in our work at Cargomatic on logistics routing and container unloading workflows. Data-driven optimization applied to real-world ops.
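
The drift detection listed under item 3 is often approximated by comparing the distribution of model confidence scores in production against a reference window. Below is a minimal sketch using the Population Stability Index (PSI), assuming scores lie in [0, 1]. The function names, the bin count, and the 0.2 alert threshold are illustrative assumptions, not details of any deployment described on this page.

```python
import math
from typing import Sequence


def histogram(scores: Sequence[float], bins: int = 10) -> list:
    """Return the fraction of scores in each equal-width bin over [0, 1]."""
    if not scores:
        raise ValueError("scores must not be empty")
    counts = [0] * bins
    for s in scores:
        # Clamp so that 1.0 (and any out-of-range value) lands in a valid bin.
        idx = max(0, min(int(s * bins), bins - 1))
        counts[idx] += 1
    return [c / len(scores) for c in counts]


def psi(reference: Sequence[float], live: Sequence[float],
        bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index between two score samples (0.0 = identical)."""
    ref = histogram(reference, bins)
    cur = histogram(live, bins)
    # eps keeps the logarithm finite when a bin is empty in one sample.
    return sum((c - r) * math.log((c + eps) / (r + eps))
               for r, c in zip(ref, cur))


def has_drifted(reference: Sequence[float], live: Sequence[float],
                threshold: float = 0.2) -> bool:
    """Flag drift when PSI exceeds a threshold (0.2 is an assumed rule of thumb)."""
    return psi(reference, live) > threshold
```

In practice the threshold would be tuned per model and per camera population; a single global value is only a starting point.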

→ Reference architecture

Edge → ingestion → inference → action.

The pattern we deploy for perception-driven systems: real-world data ingestion from edge devices, structured frame-level datasets, production inference pipelines, and a reasoning layer that contextualizes ML outputs and routes to action.

Where Claude fits: anomaly explanation on top of perception outputs, exception handling in real-time pipelines, operator copilots for human-in-the-loop systems, and reasoning over multi-modal outputs from CV and sensor fusion.

// pattern

01 · Edge ingestion — Video and sensor data from edge devices. Resilient to network conditions. Structured at point of capture where possible.

02 · Frame-level structuring — Raw streams turned into queryable datasets. Annotation, labeling, lineage. The foundation everything else sits on.

03 · Production inference — CV models, sensor fusion, real-time output. Latency-optimized. Continuously evaluated and monitored for drift.

04 · Reasoning + action — Claude reasons over CV outputs and operational state. Explains anomalies. Routes exceptions. Generates human-readable reports. Supports operator copilots.
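
Stage 01's resilience to network conditions is commonly achieved with store-and-forward buffering: the device queues payloads locally and drains the queue whenever the uplink is available. Here is a minimal sketch of that idea. The class name, the `send` callback, and the drop-oldest policy are assumptions made for illustration.

```python
from collections import deque
from typing import Callable


class StoreAndForwardBuffer:
    """Bounded in-memory queue that retries delivery in arrival order."""

    def __init__(self, send: Callable[[bytes], bool], max_items: int = 1000):
        # `send` is a hypothetical uplink function: True on success, False on failure.
        self._send = send
        # With maxlen set, appending to a full deque silently drops the oldest item.
        self._queue = deque(maxlen=max_items)

    def submit(self, payload: bytes) -> None:
        """Queue a payload, then try to deliver everything that is pending."""
        self._queue.append(payload)
        self.flush()

    def flush(self) -> int:
        """Send queued payloads oldest-first; stop at the first failure."""
        sent = 0
        while self._queue:
            if not self._send(self._queue[0]):
                break  # Link is down: keep the payload for the next attempt.
            self._queue.popleft()
            sent += 1
        return sent

    def pending(self) -> int:
        return len(self._queue)
```

Dropping the oldest item when the buffer is full favors fresh frames. A system that must retain evidence would more likely spill to disk instead of discarding.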

[ Diagram: 01 · Edge ingestion (real-world capture: video, LiDAR, IoT devices; resilient at the network edge) → 02 · Frame datasets (structure raw streams: annotate, label, lineage; queryable frame-level data) → 03 · CV inference (detect and track: object, class, anomaly; latency-optimized pipelines) → 04 · Reasoning + action (Claude over outputs: explain, route, copilot; human-in-the-loop exceptions) ]

real-time · Edge video inference
Production ML inference for real-time video at smart-city scale, deployed on Hayden AI's traffic enforcement platform.

frame-level · Structured datasets
Raw real-world video turned into queryable, annotated frame-level data. The foundation for every CV model that follows.

end-to-end · From edge to action
Ingestion through inference through operational decisions. We own the whole pipeline so the seams don't break in production.

[ FAQ ]

Common questions about edge & perception AI.

What makes production computer vision different from a CV demo or POC?
Production CV has to survive conditions no benchmark predicts: variable lighting, camera occlusion, edge network instability, and real-world object variability. A POC runs on clean, curated video in a lab. Production systems need resilient ingestion, continuous evaluation, drift detection, and human review loops for edge cases. The infrastructure gap between POC and production is usually larger than the model gap.
What types of edge devices and sensors can you work with?
We've worked with traffic cameras, LiDAR units, industrial IoT sensors, dashcams, smart city fixed cameras, and logistics dock sensors. Ingestion is engineered to be resilient to network conditions at the edge — intermittent connectivity is expected, not exceptional. We handle time-synchronization variance across sensor streams and build structured datasets from raw multi-sensor input.
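
One simple way to absorb time-synchronization variance is to pair each camera frame with the nearest reading from another sensor and reject pairs that are too far apart. This is a minimal sketch, assuming both streams carry timestamps in seconds on a shared clock and that the second stream is sorted. The function names and the 50 ms tolerance are illustrative assumptions.

```python
from bisect import bisect_left
from typing import Optional, Sequence


def nearest_index(timestamps: Sequence[float], t: float) -> int:
    """Index of the timestamp closest to t; `timestamps` must be sorted and non-empty."""
    i = bisect_left(timestamps, t)
    if i == 0:
        return 0
    if i == len(timestamps):
        return len(timestamps) - 1
    before, after = timestamps[i - 1], timestamps[i]
    # On a tie, prefer the earlier reading.
    return i if (after - t) < (t - before) else i - 1


def align(frame_ts: Sequence[float], sensor_ts: Sequence[float],
          tolerance_s: float = 0.05) -> list:
    """Pair each frame index with the nearest sensor index, or None if out of tolerance."""
    pairs = []
    for frame_idx, t in enumerate(frame_ts):
        sensor_idx: Optional[int] = nearest_index(sensor_ts, t)
        if abs(sensor_ts[sensor_idx] - t) > tolerance_s:
            sensor_idx = None  # No reading close enough: leave the frame unpaired.
        pairs.append((frame_idx, sensor_idx))
    return pairs
```

Frames left unpaired can be kept as camera-only records, so a gap in one stream does not discard data from the other.
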
How do you handle real-time inference latency for video analysis?
Latency optimization is architecture-first: model selection, batching strategy, edge vs. cloud inference split, and caching for known-state scenes. We've deployed real-time inference pipelines for traffic enforcement video at smart-city scale. Typical production targets are under 200 ms for detection plus classification on live streams, though this varies by hardware and throughput requirements.
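
A latency target like this is usually tracked as a tail percentile, not an average, because a few slow frames can hide behind a healthy mean. Below is a minimal sketch of a budget check using the nearest-rank percentile. The choice of the 95th percentile and the function names are assumptions for illustration; the 200 ms default mirrors the figure quoted above.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty sample (pct in (0, 100])."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Nearest-rank: the smallest value with at least pct% of samples at or below it.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def within_budget(latencies_ms: Sequence[float],
                  budget_ms: float = 200.0, pct: float = 95.0) -> bool:
    """True when the chosen percentile of per-frame latency is under the budget."""
    return percentile(latencies_ms, pct) < budget_ms
```
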
Where does Claude fit in a computer vision pipeline — isn't this just about CV models?
CV models produce structured detections. Claude operates over those outputs: explaining anomalies in natural language, routing edge cases for human review, generating operator reports, handling exceptions that fall outside the model's confidence range, and providing the reasoning layer for operator copilots. For many production perception systems, the hard problems in year two are operational, and that is where Claude adds the most value.
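
The hand-off between CV output and the reasoning layer can be sketched as a confidence-band router: confident detections pass straight through, and ambiguous ones are queued for explanation and human review. This is a minimal sketch with hypothetical type names and thresholds. The call to the reasoning model itself is left out deliberately.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    AUTO_ACCEPT = "auto_accept"            # Confident enough to act on directly.
    REASONING_REVIEW = "reasoning_review"  # Ambiguous: explain and escalate.
    DISCARD = "discard"                    # Too weak to be worth reviewing.


@dataclass(frozen=True)
class Detection:
    frame_id: str
    label: str
    confidence: float  # Assumed to be in [0, 1].


def route(detection: Detection, accept_at: float = 0.90,
          discard_below: float = 0.30) -> Route:
    """Choose a route from the detection's confidence band (thresholds are assumptions)."""
    if detection.confidence >= accept_at:
        return Route.AUTO_ACCEPT
    if detection.confidence < discard_below:
        return Route.DISCARD
    return Route.REASONING_REVIEW
```
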
Can you build the data labeling and annotation pipeline, or just the inference layer?
We build the full stack: ingestion, annotation, labeling pipeline, structured dataset creation, model training infrastructure, and production inference. The annotation and dataset work is where most CV projects go wrong at scale — inconsistent labeling, poor lineage tracking, and no feedback loop from production errors back to training data. We engineer the whole pipeline rather than hand off the data problem.
What industries do you have production perception deployments in?
Our anchor deployments are in smart cities (Hayden AI: automated traffic enforcement at scale), logistics (Cargomatic: AI-driven routing and container handling), and broadcast/ad-tech (Bluocean: set-top box platform). We're actively working with teams in manufacturing, robotics, and industrial inspection. The common thread is systems that have to perceive and act on the physical world in real time.

→ Related reading

Physical AI Data Flywheel →
Computer vision production deployment →
AI for smart cities →
AI for logistics →
AI for manufacturing →
DehazeLabs vs Big 4 →
DehazeLabs vs in-house AI team →

Shipping perception that has to work in the wild?

Real-world systems break in ways no benchmark predicts. Tell us what you're building — we'll tell you what we'd build to make it survive contact with reality.