v4.3
Vision OCR Content Understanding & Learned Semantic Embeddings
WickyX Team
Mar 06, 2026
New Features
- Vision OCR Content Understanding: Integrated OCR into the media analysis pipeline using Google Vision TEXT_DETECTION, allowing the system to read text directly from images and video frames (HUD text, subtitles, overlays). Extracted OCR tokens are filtered and merged into analysis.tags, improving automatic content understanding even when captions or hashtags are missing.
- Extracted Text Storage: Added a new analysis.extractedText field to both Post and Moment schemas. This stores the raw OCR text detected from media (truncated and sanitized), enabling better debugging, moderation, and future NLP analysis.
- Learned Semantic Tag Embeddings: Introduced a new learned embedding system using Skip-gram with negative sampling. The system now learns semantic relationships between tags from real user interaction data, enabling the algorithm to understand relationships like gta ↔ heist ↔ los santos even if tags do not match exactly.
- Hybrid Embedding Architecture: Implemented a blended embedding strategy combining learned vectors and the existing feature-hash vectors. The algorithm dynamically blends both representations using a coverage factor so the system remains stable during cold start while gradually shifting toward learned semantic vectors as training data grows.
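The coverage-based blending described for the hybrid embedding architecture can be illustrated with a minimal sketch. It assumes a simple linear interpolation between unit-normalized vectors; the names blendEmbeddings and l2Normalize and the exact blend rule are hypothetical and are not taken from the WickyX codebase.

```typescript
// Hypothetical sketch: the function names and the linear blend rule are
// assumptions for illustration, not the production implementation.
type Vector = number[];

/** Scale a vector to unit length; a zero vector is returned as a copy. */
function l2Normalize(v: Vector): Vector {
  const norm = Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  return norm === 0 ? v.slice() : v.map((x) => x / norm);
}

/**
 * Blend a learned vector with a feature-hash vector.
 * coverage is clamped to [0, 1]: 0 means pure hash vector (cold start),
 * 1 means pure learned vector. A missing learned vector falls back to hash.
 */
function blendEmbeddings(
  learned: Vector | null,
  hashed: Vector,
  coverage: number,
): Vector {
  if (learned === null) return l2Normalize(hashed);
  if (learned.length !== hashed.length) {
    throw new Error("learned and hashed vectors must share a dimension");
  }
  const w = Math.min(1, Math.max(0, coverage));
  const a = l2Normalize(learned);
  const b = l2Normalize(hashed);
  // Weighted sum of the two unit vectors, re-normalized for cosine similarity.
  return l2Normalize(a.map((x, i) => w * x + (1 - w) * b[i]));
}
```

Normalizing both inputs before mixing keeps either representation from dominating just because its raw magnitude is larger, which is one plausible way to keep scores stable while the coverage factor grows.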
Machine Learning Infrastructure
- Embedding Training Worker: Added a dedicated worker thread (tagEmbeddingWorker.ts) to train the Skip-gram model using TensorFlow.js. This prevents blocking the Node.js event loop and ensures training does not affect API performance.
- Embedding Training Orchestrator: Implemented TagEmbeddingTrainer.ts, responsible for collecting tag co-occurrence data from user and content embeddings, generating Skip-gram training pairs, and scheduling model retraining automatically.
- Learned Embedding Database Model: Introduced a new LearnedEmbedding collection storing trained token vectors along with token frequency and model versioning, allowing safe incremental updates and future retraining.
- Scheduled Daily Training Pipeline: The system now automatically retrains tag embeddings every 24 hours, starting shortly after server initialization. Newly trained vectors are written to MongoDB and lazily refreshed in memory by EmbeddingService.
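To make the pair-generation step concrete, here is a minimal sketch of how Skip-gram training pairs with negative samples might be built from sets of co-occurring tags. It covers only pair generation, not the TensorFlow.js training itself. The SkipGramPair shape, the buildSkipGramPairs name, and the uniform negative sampling are assumptions for illustration (word2vec-style implementations typically sample negatives from a smoothed frequency distribution instead).

```typescript
// Hypothetical sketch: not the actual TagEmbeddingTrainer.ts logic.
interface SkipGramPair {
  target: string;
  context: string;
  label: 0 | 1; // 1 = observed co-occurrence, 0 = sampled negative
}

/**
 * Build (target, context) pairs from tag sets that co-occurred on the same
 * content. Each positive pair is followed by up to `negativesPerPositive`
 * negatives drawn uniformly from tags absent from that set.
 * `rng` is injectable so the output can be made deterministic.
 */
function buildSkipGramPairs(
  tagSets: string[][],
  negativesPerPositive: number,
  rng: () => number = Math.random,
): SkipGramPair[] {
  // Vocabulary: every distinct tag, in first-seen order.
  const vocab: string[] = [];
  const seen = new Set<string>();
  for (const tags of tagSets) {
    for (const tag of tags) {
      if (!seen.has(tag)) {
        seen.add(tag);
        vocab.push(tag);
      }
    }
  }

  const pairs: SkipGramPair[] = [];
  for (const tags of tagSets) {
    const unique = tags.filter((t, i) => tags.indexOf(t) === i);
    // Negatives may only be tags that do not appear in this set.
    const candidates = vocab.filter((t) => unique.indexOf(t) === -1);
    for (const target of unique) {
      for (const context of unique) {
        if (target === context) continue;
        pairs.push({ target, context, label: 1 });
        for (let k = 0; k < negativesPerPositive && candidates.length > 0; k++) {
          const pick = candidates[Math.floor(rng() * candidates.length)];
          pairs.push({ target, context: pick, label: 0 });
        }
      }
    }
  }
  return pairs;
}
```

A trainer along these lines would feed the labeled pairs to a binary classifier over dot products of target and context vectors, which is the usual objective for Skip-gram with negative sampling.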
Content Intelligence Improvements
- OCR-Driven Tag Enrichment: OCR text extracted from media is tokenized, normalized, filtered, and merged into content tags with confidence weighting. This enables the recommendation engine to detect contextual information such as game titles, mission names, subtitles, and on-screen labels.
- Improved Cold-Start Handling: The new hybrid embedding system ensures that when learned embeddings are unavailable or incomplete, the algorithm automatically falls back to deterministic hash vectors. As training coverage increases, the system smoothly transitions to semantic embeddings without degrading recommendation quality.
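The OCR-driven tag enrichment above (tokenize, normalize, filter, merge with confidence weighting) can be sketched as follows. This is an illustrative outline under stated assumptions: the WeightedTag shape, the stop-word list, the minimum token length of 3, the ASCII-only tokenizer, and the default OCR weight of 0.5 are all hypothetical and are not documented values.

```typescript
// Hypothetical sketch: thresholds, weights, and names are assumptions.
interface WeightedTag {
  tag: string;
  confidence: number;
}

// Illustrative stop-word list; a real one would be larger.
const OCR_STOP_WORDS = new Set(["the", "and", "for", "you", "with"]);

/** Lowercase, split on non-alphanumerics (ASCII only), drop noise tokens. */
function tokenizeOcrText(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter(
      (t) => t.length >= 3 && !OCR_STOP_WORDS.has(t) && !/^\d+$/.test(t),
    );
}

/**
 * Merge OCR tokens into existing tags. An OCR token never lowers the
 * confidence of a tag that is already present; new tokens enter at ocrWeight.
 */
function mergeOcrTags(
  existing: WeightedTag[],
  ocrText: string,
  ocrWeight = 0.5,
): WeightedTag[] {
  const merged = new Map<string, number>();
  for (const { tag, confidence } of existing) {
    merged.set(tag.toLowerCase(), confidence);
  }
  for (const token of tokenizeOcrText(ocrText)) {
    const current = merged.get(token) ?? 0;
    merged.set(token, Math.max(current, ocrWeight));
  }
  const result: WeightedTag[] = [];
  merged.forEach((confidence, tag) => result.push({ tag, confidence }));
  return result;
}
```

For example, merging the OCR text "GTA V: The Big HEIST 2024" into an existing gta tag with confidence 0.9 keeps gta at 0.9 and adds big and heist at 0.5, while the single letter, the stop word, and the bare number are filtered out.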
Fixes & Improvements
- Moment Video Playback Fix: Resolved a scrolling issue where videos in the Moments feed would sometimes fail to start playing when entering the viewport.
- Background Playback Control: Videos in the Moments feed now automatically pause when the browser tab becomes inactive, preventing unnecessary playback and reducing resource usage.
- Keyboard Navigation for Moments: Added keyboard arrow key navigation (↑ / ↓) to allow users to scroll through Moments more smoothly without using the mouse.
- Expanded Notification Settings: Added additional notification controls in the settings page, allowing users to enable or disable alerts for new activities such as live streams and other platform events.
- General Moments Stability Improvements: Minor playback and scrolling optimizations to improve the reliability and smoothness of the Moments feed.
Tags: v4.3, OCR, Machine Learning, Content Intelligence, Fixes