- Problem: A full-stack, production-style system that turns raw video into searchable, analyst-ready structured data using CV + NLP + async pipelines.
- Role: Solo full-stack applied ML engineer
- Timeframe: Prototype build (local-first, CPU)
- Stack: FastAPI • Celery • Redis • SQLAlchemy
- Focus: Computer Vision • NLP • Full-Stack
Problem
Before: analysts scrub videos manually, pause/play repeatedly, and take notes by hand. There is no structured output, no timeline correlation, and no searchable archive.
After: a single upload yields entities, timestamps, frames, transcripts, a searchable index, exportable reports, and shareable read-only links.
Success criteria: search across videos, jump directly to moments of interest, and export analyst-ready artifacts with explainable evidence.
- Upload video or URL → async pipeline → shareable read-only report
- Timeline + semantic search UX for fast triage
- Deterministic JSON/PDF/CSV artifacts for auditability
- Local-first, offline-capable processing on CPU
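The upload-to-report flow above can be sketched as a chain of pipeline stages. This is a minimal, dependency-free sketch: the `Job` dataclass and stage names are hypothetical, and the frame/detection stages are placeholders for what the real system does with OpenCV/ffmpeg, YOLOv8, and Celery tasks.

```python
from dataclasses import dataclass, field

@dataclass
class Job:
    """Hypothetical carrier object passed between pipeline stages."""
    video_path: str
    frames: list = field(default_factory=list)
    detections: list = field(default_factory=list)
    report: dict = field(default_factory=dict)

def extract_frames(job: Job) -> Job:
    # Placeholder: the real stage samples frames from the video file.
    job.frames = [f"{job.video_path}#t={t}" for t in (0, 1, 2)]
    return job

def detect_objects(job: Job) -> Job:
    # Placeholder: the real stage runs YOLOv8 on each sampled frame.
    job.detections = [
        {"frame": f, "label": "person", "conf": 0.9} for f in job.frames
    ]
    return job

def build_report(job: Job) -> Job:
    # Collect stage outputs into a single report structure.
    job.report = {"video": job.video_path, "entities": job.detections}
    return job

def run_pipeline(video_path: str) -> dict:
    """Run the stages in order; in production each stage is an async task."""
    job = Job(video_path)
    for stage in (extract_frames, detect_objects, build_report):
        job = stage(job)
    return job.report
```

In the deployed system these stages run as chained Celery tasks behind a FastAPI upload endpoint, so the synchronous loop here stands in for the task queue.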
Context
Scope: backend API, async pipeline, CV/NLP, entity aggregation, dataset exporter, search + timeline UX, shareable reports, and Dockerized deployment.
Constraints: solo developer, CPU-first local compute, offline requirement, no paid cloud services, and real-world video sizes.
Positioning: analyst-friendly video indexing, not an operational system.
Video intelligence pipeline with FastAPI, Celery, and YOLOv8
The system processes video asynchronously with FastAPI + Celery and uses YOLOv8 for object detection on extracted frames.
This enables analyst-ready outputs without requiring cloud inference.
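A simplified version of the detection pass, with the model abstracted behind a callable so the sketch stays backend-agnostic. The function name, signature, and `min_conf` threshold are illustrative assumptions; in the actual system the detector would wrap a YOLOv8 model rather than a plain function.

```python
def detect_entities(frames, detector, min_conf=0.5):
    """Run a detector over timestamped frames, keeping confident hits.

    `frames` is an iterable of (timestamp, frame) pairs. `detector` is
    any callable mapping a frame to (label, confidence) tuples -- in this
    project it would wrap YOLOv8, but the abstraction lets tests inject
    a stub. `min_conf` filters out low-confidence detections.
    """
    detections = []
    for ts, frame in frames:
        for label, conf in detector(frame):
            if conf >= min_conf:
                detections.append(
                    {"t": ts, "label": label, "conf": round(conf, 3)}
                )
    return detections
```

Keeping the model behind a callable is also what makes CPU-first, offline operation straightforward: the pipeline code never assumes a GPU or a cloud inference endpoint.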
Entity indexing, timeline search, and exportable evidence
Detected entities are aggregated into time ranges with confidence scores, then indexed for fast search and reporting.
Exports include JSON/PDF/CSV plus frame evidence and transcripts.
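The aggregation step described above, merging per-frame detections into time ranges, can be sketched as follows. The 2-second `gap` threshold and the field names are assumptions for illustration, not the project's actual values.

```python
def aggregate_entities(detections, gap=2.0):
    """Merge per-frame detections of the same label into contiguous
    time ranges, keeping the highest confidence seen in each range.

    `gap` is the maximum gap in seconds allowed inside one range; a
    detection further than `gap` from the range's end starts a new one.
    """
    ranges = []
    for d in sorted(detections, key=lambda d: (d["label"], d["t"])):
        last = ranges[-1] if ranges else None
        if last and last["label"] == d["label"] and d["t"] - last["end"] <= gap:
            last["end"] = d["t"]                       # extend current range
            last["conf"] = max(last["conf"], d["conf"])
        else:
            ranges.append(                             # open a new range
                {"label": d["label"], "start": d["t"],
                 "end": d["t"], "conf": d["conf"]}
            )
    return ranges
```

The resulting ranges are what gets indexed for search and rendered on the timeline, so a query can jump straight to the start of a range rather than a single frame.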
Security / Threat Model
- Optional API keys must be injected via env/.env/Docker secrets and never committed.
- Share links are read-only tokens stored in SQLite.
- If deployed publicly, add auth + rate limiting and isolate storage.
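A minimal sketch of the read-only share-link scheme from the bullets above, using only the standard library. The table and column names are illustrative, not the project's actual schema; the real system stores tokens via SQLAlchemy.

```python
import secrets
import sqlite3

def create_share_link(db: sqlite3.Connection, report_id: int) -> str:
    """Mint an unguessable read-only token for a report and persist it."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS share_links ("
        "token TEXT PRIMARY KEY, report_id INTEGER, "
        "readonly INTEGER DEFAULT 1)"
    )
    token = secrets.token_urlsafe(32)  # cryptographically random, URL-safe
    db.execute(
        "INSERT INTO share_links (token, report_id) VALUES (?, ?)",
        (token, report_id),
    )
    db.commit()
    return token

def resolve_share_link(db: sqlite3.Connection, token: str):
    """Return the report id for a valid read-only token, else None."""
    row = db.execute(
        "SELECT report_id FROM share_links WHERE token = ? AND readonly = 1",
        (token,),
    ).fetchone()
    return row[0] if row else None
```

Because the token only ever resolves to a read path, a leaked link exposes one report at most; revocation is a single row delete.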
FAQ
What makes this different from a simple video demo?
It is an end-to-end system: upload → async processing → searchable index → exportable, shareable reports with evidence.
Does it require cloud services to run?
No. It is local-first and CPU-friendly; all processing runs offline with local storage.
How do you validate accuracy?
Confidence scoring, verification passes, and a frame gallery make results easy to audit visually.
