- Problem: A full-stack, production-style system that turns raw video into searchable, analyst-ready structured data using CV + NLP + async pipelines.
- Role: Solo full-stack applied ML engineer
- Timeframe: Prototype build (local-first, CPU)
- Stack: FastAPI • Celery • Redis • SQLAlchemy
- Focus: Computer Vision • NLP • Full-Stack
Problem
Before: analysts scrub videos manually, pause/play repeatedly, and take notes by hand. There is no structured output, no timeline correlation, and no searchable archive.
After: a single upload yields entities, timestamps, frames, transcripts, a searchable index, exportable reports, and shareable read-only links.
Success criteria: search across videos, jump directly to moments of interest, and export analyst-ready artifacts with explainable evidence.
- Upload video or URL → async pipeline → shareable read-only report
- Timeline + semantic search UX for fast triage
- Deterministic JSON/PDF/CSV artifacts for auditability
- Local-first, offline-capable processing on CPU
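The upload-to-report flow above can be sketched as a chain of pipeline stages. This is a minimal, dependency-free sketch: the `Job` dataclass and stage names are hypothetical, and the frame/detection stages are placeholders for what the real system does with OpenCV/ffmpeg, YOLOv8, and Celery tasks.

```python
from dataclasses import dataclass, field

@dataclass
class Job:
    """Hypothetical carrier object passed between pipeline stages."""
    video_path: str
    frames: list = field(default_factory=list)
    detections: list = field(default_factory=list)
    report: dict = field(default_factory=dict)

def extract_frames(job: Job) -> Job:
    # Placeholder: the real stage samples frames from the video file.
    job.frames = [f"{job.video_path}#t={t}" for t in (0, 1, 2)]
    return job

def detect_objects(job: Job) -> Job:
    # Placeholder: the real stage runs YOLOv8 on each sampled frame.
    job.detections = [
        {"frame": f, "label": "person", "conf": 0.9} for f in job.frames
    ]
    return job

def build_report(job: Job) -> Job:
    # Collect stage outputs into a single report structure.
    job.report = {"video": job.video_path, "entities": job.detections}
    return job

def run_pipeline(video_path: str) -> dict:
    """Run the stages in order; in production each stage is an async task."""
    job = Job(video_path)
    for stage in (extract_frames, detect_objects, build_report):
        job = stage(job)
    return job.report
```

In the deployed system these stages run as chained Celery tasks behind a FastAPI upload endpoint, so the synchronous loop here stands in for the task queue.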
Context
Scope: backend API, async pipeline, CV/NLP, entity aggregation, dataset exporter, search + timeline UX, shareable reports, and Dockerized deployment.
Constraints: solo developer, CPU-first local compute, offline requirement, no paid cloud services, and real-world video sizes.
Positioning: analyst-friendly video indexing, not an operational system.
Video intelligence pipeline with FastAPI, Celery, and YOLOv8
The system processes video asynchronously with FastAPI + Celery and uses YOLOv8 for object detection on extracted frames.
This enables analyst-ready outputs without requiring cloud inference.
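A simplified version of the detection pass, with the model abstracted behind a callable so the sketch stays backend-agnostic. The function name, signature, and `min_conf` threshold are illustrative assumptions; in the actual system the detector would wrap a YOLOv8 model rather than a plain function.

```python
def detect_entities(frames, detector, min_conf=0.5):
    """Run a detector over timestamped frames, keeping confident hits.

    `frames` is an iterable of (timestamp, frame) pairs. `detector` is
    any callable mapping a frame to (label, confidence) tuples -- in this
    project it would wrap YOLOv8, but the abstraction lets tests inject
    a stub. `min_conf` filters out low-confidence detections.
    """
    detections = []
    for ts, frame in frames:
        for label, conf in detector(frame):
            if conf >= min_conf:
                detections.append(
                    {"t": ts, "label": label, "conf": round(conf, 3)}
                )
    return detections
```

Keeping the model behind a callable is also what makes CPU-first, offline operation straightforward: the pipeline code never assumes a GPU or a cloud inference endpoint.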
Entity indexing, timeline search, and exportable evidence
Detected entities are aggregated into time ranges with confidence scores, then indexed for fast search and reporting.
Exports include JSON/PDF/CSV plus frame evidence and transcripts.
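The aggregation step described above, merging per-frame detections into time ranges, can be sketched as follows. The 2-second `gap` threshold and the field names are assumptions for illustration, not the project's actual values.

```python
def aggregate_entities(detections, gap=2.0):
    """Merge per-frame detections of the same label into contiguous
    time ranges, keeping the highest confidence seen in each range.

    `gap` is the maximum gap in seconds allowed inside one range; a
    detection further than `gap` from the range's end starts a new one.
    """
    ranges = []
    for d in sorted(detections, key=lambda d: (d["label"], d["t"])):
        last = ranges[-1] if ranges else None
        if last and last["label"] == d["label"] and d["t"] - last["end"] <= gap:
            last["end"] = d["t"]                       # extend current range
            last["conf"] = max(last["conf"], d["conf"])
        else:
            ranges.append(                             # open a new range
                {"label": d["label"], "start": d["t"],
                 "end": d["t"], "conf": d["conf"]}
            )
    return ranges
```

The resulting ranges are what gets indexed for search and rendered on the timeline, so a query can jump straight to the start of a range rather than a single frame.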
Security / Threat Model
- Optional API keys must be injected via env/.env/Docker secrets and never committed.
- Share links are read-only tokens stored in SQLite.
- If deployed publicly, add auth + rate limiting and isolate storage.
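A minimal sketch of the read-only share-link scheme from the bullets above, using only the standard library. The table and column names are illustrative, not the project's actual schema; the real system stores tokens via SQLAlchemy.

```python
import secrets
import sqlite3

def create_share_link(db: sqlite3.Connection, report_id: int) -> str:
    """Mint an unguessable read-only token for a report and persist it."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS share_links ("
        "token TEXT PRIMARY KEY, report_id INTEGER, "
        "readonly INTEGER DEFAULT 1)"
    )
    token = secrets.token_urlsafe(32)  # cryptographically random, URL-safe
    db.execute(
        "INSERT INTO share_links (token, report_id) VALUES (?, ?)",
        (token, report_id),
    )
    db.commit()
    return token

def resolve_share_link(db: sqlite3.Connection, token: str):
    """Return the report id for a valid read-only token, else None."""
    row = db.execute(
        "SELECT report_id FROM share_links WHERE token = ? AND readonly = 1",
        (token,),
    ).fetchone()
    return row[0] if row else None
```

Because the token only ever resolves to a read path, a leaked link exposes one report at most; revocation is a single row delete.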
FAQ
What makes this different from a simple video demo?
It is an end-to-end system: upload → async processing → searchable index → exportable, shareable reports with evidence.
Does it require cloud services to run?
No. It is local-first and CPU-friendly; all processing runs offline with local storage.
How do you validate accuracy?
Confidence scoring, verification passes, and a frame gallery make results easy to audit visually.
