Sleep Story AI System
AI Automation
An end-to-end n8n automation that turns a single Google Sheets row (topic + length) into a finished sleep-story YouTube video. The orchestrator generates a script with Google Gemini, splits it into scenes, then fans out to five reusable sub-workflows that produce per-scene images and voiceovers, stitch them into clips, and combine everything into a final video via the Rendi.dev ffmpeg API. Status and asset URLs are written back to the sheet, so every run is traceable. The system consists of a 32-node main workflow plus five sub-workflows. It has been production-tested on Rendi's free tier, which allows 4 commands per minute, with batching, throttling, and retry logic to absorb HTTP 429 responses.
- Type: AI Automation
- Role: AI Automation Engineer
- Service: AI Automation / Workflow Engineering / Content Generation
- Year: 2025
Project Overview
Sleep Story AI System is a fully automated content pipeline that takes a single row in a Google Sheet — a topic and a target length — and produces a finished, ready-to-upload YouTube sleep-story video. No manual editing, no scripting, no asset stitching.
The orchestrator runs in n8n and chains together six workflows: a main controller plus five purpose-built sub-workflows for image generation, voice synthesis, per-scene clip assembly, clip combination, and final ffmpeg rendering via the Rendi.dev API.
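The fan-out described above can be sketched in a few lines of Python. This is only an illustration of the control flow, not the n8n workflow itself: the function names (`run_pipeline`, `create_image`, `create_voice`, `create_clip`, `combine_clips`) and the `SceneAssets` record are hypothetical stand-ins for the sub-workflows, and the Rendi rendering step is assumed to sit behind the clip and combine callables.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class SceneAssets:
    """Asset URLs collected for one scene (hypothetical record type)."""
    index: int
    image_url: str
    voice_url: str
    clip_url: str


def run_pipeline(
    scenes: List[str],
    create_image: Callable[[str], str],
    create_voice: Callable[[str], str],
    create_clip: Callable[[str, str], str],
    combine_clips: Callable[[List[str]], str],
) -> Tuple[List[SceneAssets], str]:
    """Produce image, voice, and clip per scene, then merge all clips."""
    assets: List[SceneAssets] = []
    for index, text in enumerate(scenes):
        image_url = create_image(text)                # stands in for "Create Image"
        voice_url = create_voice(text)                # stands in for "Create Voice"
        clip_url = create_clip(image_url, voice_url)  # stands in for "Create Clips"
        assets.append(SceneAssets(index, image_url, voice_url, clip_url))
    # Stands in for "Combine Clips": clips are merged in scene order.
    final_url = combine_clips([a.clip_url for a in assets])
    return assets, final_url
```

Passing the stages in as callables mirrors the reason given for the split: each stage can be swapped for a stub and tested on its own.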
Key Features
- Sheet-Driven Input: Topic + duration in a Google Sheet kicks off the entire pipeline
- AI Scripting: Google Gemini (via LangChain Agent) writes the full sleep story and splits it into scenes
- Per-Scene Assets: Each scene gets its own AI-generated image + AI voice narration
- Automated Stitching: ffmpeg via Rendi.dev combines images + voice into clips, then merges clips into one video
- Rate-Limit-Aware: Batching, a 20 s throttle, 15 s polling, and up to 8 retries handle Rendi's free-tier cap of 4 commands per minute without dropping runs
- Full Traceability: Every asset URL and status is written back to the source Google Sheet
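The throttle and retry behaviour in the rate-limit feature can be sketched as follows. This is a minimal sketch under stated assumptions, not the project's n8n configuration: `Throttle`, `submit_with_retry`, and the `send` callable are hypothetical, the retry count is read as up to eight retries after the first attempt, and job-status polling is left out. A cap of 4 commands per minute implies a floor of 60 / 4 = 15 s between commands, so the 20 s throttle leaves some headroom.

```python
import time

MIN_INTERVAL_S = 20.0  # throttle from the feature list; the cap's floor is 60 / 4 = 15 s
MAX_RETRIES = 8        # assumption: eight retries after the first attempt


class Throttle:
    """Spaces successive calls at least `min_interval` seconds apart."""

    def __init__(self, min_interval=MIN_INTERVAL_S,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.clock = clock
        self.sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        now = self.clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self._last = now


def submit_with_retry(send, throttle, max_retries=MAX_RETRIES):
    """Call `send()` until it returns a non-429 status or retries run out.

    `send` is a hypothetical callable returning (status_code, body).
    """
    for _ in range(max_retries + 1):
        throttle.wait()  # every attempt, including retries, is throttled
        status, body = send()
        if status != 429:
            return status, body
    raise RuntimeError(f"still rate-limited after {max_retries} retries")
```

Injecting `clock` and `sleep` keeps the sketch testable without real waiting.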
Technologies Used
- n8n: Workflow orchestration across 6 chained workflows (32 nodes in the main controller)
- Google Gemini + LangChain: Script generation with structured output parsing
- Google Sheets: Input row, output asset tracking
- Rendi.dev: ffmpeg-as-a-service for clip rendering and final video combination
- HTTP image + voice APIs: Per-scene visual and audio generation
- Docker: Containerised n8n instance
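To make the clip-rendering step concrete, here is one common way to turn a still image plus a narration track into a video clip with ffmpeg. The source does not show the command this project sends to Rendi.dev or the request format Rendi expects, so the function name `still_image_clip_args` and the file names are assumptions; only the ffmpeg options themselves are standard.

```python
from typing import List


def still_image_clip_args(image_path: str, audio_path: str, out_path: str) -> List[str]:
    """Build ffmpeg arguments that hold one image for the length of the audio."""
    return [
        "ffmpeg",
        "-loop", "1",           # repeat the single image as a video stream
        "-i", image_path,
        "-i", audio_path,
        "-c:v", "libx264",
        "-tune", "stillimage",  # x264 tuning for static content
        "-c:a", "aac",
        "-pix_fmt", "yuv420p",  # widely supported pixel format
        "-shortest",            # stop when the audio ends (the looped image never does)
        out_path,
    ]


if __name__ == "__main__":
    print(" ".join(still_image_clip_args("scene_01.png", "scene_01.mp3", "scene_01.mp4")))
```

Returning an argument list rather than a single string avoids shell-quoting problems if the command is ever run locally with `subprocess.run`.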
Architecture
The system is split into one orchestrator and five sub-workflows so each step is independently testable and reusable:
Sleep Story AI System (32 nodes — manual trigger + Google Sheets)
├── Create Image (image gen per scene)
├── Create Voice (voice synthesis per scene, with retry)
├── Create Clips (image + voice → per-scene clip)
├── Combine Clips (all scene clips → one video)
└── Videofy with Rendi (ffmpeg via Rendi.dev, rate-limited)
A 15-scene job runs end-to-end in about 12–15 minutes.
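As a rough sanity check on that figure, the share of the runtime spent purely on throttling can be estimated. The command count below is an assumption (one clip render per scene plus one final combine); the source does not state how many Rendi commands a job issues, and `throttle_floor_seconds` is a hypothetical helper.

```python
def throttle_floor_seconds(commands: int, interval_s: float = 20.0) -> float:
    """Minimum time spent on spacing alone for `commands` sequential calls."""
    return max(commands - 1, 0) * interval_s


scenes = 15
commands = scenes + 1  # assumption: one clip per scene plus one final combine
print(throttle_floor_seconds(commands) / 60)  # minutes of pure throttle wait -> 5.0
```

Under that assumption, about 5 of the 12–15 minutes would be throttle spacing, with the rest going to generation, rendering, and polling.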
Outcome
The system replaces what used to be hours of script-writing, image-sourcing, voice-recording, and video editing with a single sheet entry and a click. It is designed to run in the background while other work continues.