Sleep Story AI System

AI Automation

An end-to-end n8n automation that turns a single Google Sheets row (topic + length) into a finished sleep-story YouTube video. The orchestrator generates a script with Google Gemini, splits it into scenes, then fans out to five reusable sub-workflows that produce per-scene images and voiceovers, stitch them into clips, and combine everything into a final video via the Rendi.dev ffmpeg API. Status and asset URLs are written back to the sheet, so every run is fully traceable. The build is a 32-node main workflow plus five sub-workflows, production-tested on Rendi's free tier (4 commands per minute) with batching, throttling, and retry logic that absorbs HTTP 429 responses gracefully.

Type
AI Automation
Role
AI Automation Engineer
Service
AI Automation / Workflow Engineering / Content Generation
Year
2025

Project Overview

Sleep Story AI System is a fully automated content pipeline that takes a single row in a Google Sheet — a topic and a target length — and produces a finished, ready-to-upload YouTube sleep-story video. No manual editing, no scripting, no asset stitching.

The system runs in n8n as six chained workflows: a main controller (the orchestrator) plus five purpose-built sub-workflows for image generation, voice synthesis, per-scene clip assembly, clip combination, and final ffmpeg rendering via the Rendi.dev API.
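The control flow described above can be sketched in plain Python. This is only an illustration of the chaining, not the n8n workflow itself: the `Scene` and `Clip` types, the `run_pipeline` function, and the four callables are hypothetical names, and the sketch assumes the Rendi.dev rendering calls happen inside the clip and combine steps.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Scene:
    index: int
    narration: str
    image_prompt: str


@dataclass
class Clip:
    scene_index: int
    image_url: str
    voice_url: str
    clip_url: str


def run_pipeline(
    scenes: List[Scene],
    create_image: Callable[[Scene], str],
    create_voice: Callable[[Scene], str],
    create_clip: Callable[[str, str], str],
    combine_clips: Callable[[List[str]], str],
) -> Tuple[str, List[Clip]]:
    """Run the per-scene steps in order, then merge all clips into one video.

    Each callable stands in for one sub-workflow (hypothetical interface).
    Returns the final video URL plus the per-scene asset records that
    would be written back to the sheet.
    """
    clips: List[Clip] = []
    for scene in scenes:
        image_url = create_image(scene)               # image generation
        voice_url = create_voice(scene)               # voice synthesis
        clip_url = create_clip(image_url, voice_url)  # per-scene clip assembly
        clips.append(Clip(scene.index, image_url, voice_url, clip_url))
    final_url = combine_clips([c.clip_url for c in clips])  # clip combination
    return final_url, clips
```

Passing the steps in as callables mirrors the point of the sub-workflow split: each stage can be swapped for a stub and tested on its own.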


Key Features

  • Sheet-Driven Input: Topic + target length in a Google Sheet kicks off the entire pipeline
  • AI Scripting: Google Gemini (via LangChain Agent) writes the full sleep story and splits it into scenes
  • Per-Scene Assets: Each scene gets its own AI-generated image + AI voice narration
  • Automated Stitching: ffmpeg via Rendi.dev combines images + voice into clips, then merges clips into one video
  • Rate-Limit-Aware: Batching, a 20 s throttle, 15 s polling, and up to 8 retries keep runs under Rendi's free-tier cap of 4 commands per minute without dropping any
  • Full Traceability: Every asset URL and status is written back to the source Google Sheet
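The rate-limit handling in the list above can be illustrated with a minimal Python sketch. It assumes a fixed 20-second gap between commands (three per minute, which stays under the cap of four) and treats "8× retry" as one initial attempt plus up to eight retries. The `RateLimited` exception and both function names are hypothetical, and the real workflow implements this with n8n nodes rather than Python.

```python
import time
from typing import Callable, Iterable, List


class RateLimited(Exception):
    """Raised by the caller-supplied function when the API answers HTTP 429."""


def submit_with_retry(
    submit: Callable[[], str],
    max_retries: int = 8,
    throttle_seconds: float = 20.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call submit(); on RateLimited, wait throttle_seconds and try again.

    Makes one initial attempt plus up to max_retries retries, then re-raises.
    """
    for attempt in range(max_retries + 1):
        try:
            return submit()
        except RateLimited:
            if attempt == max_retries:
                raise
            sleep(throttle_seconds)
    raise AssertionError("unreachable")


def run_throttled(
    jobs: Iterable[str],
    submit_one: Callable[[str], str],
    throttle_seconds: float = 20.0,
    max_retries: int = 8,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Submit jobs one at a time, spacing them throttle_seconds apart."""
    results: List[str] = []
    for i, job in enumerate(jobs):
        if i > 0:
            sleep(throttle_seconds)  # 20 s spacing = 3 commands per minute
        results.append(
            submit_with_retry(
                lambda: submit_one(job), max_retries, throttle_seconds, sleep
            )
        )
    return results
```

The `sleep` parameter is injectable so the logic can be exercised in tests without real waiting.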

Technologies Used

  • n8n: Workflow orchestration across 6 chained workflows (32 nodes in the main controller)
  • Google Gemini + LangChain: Script generation with structured output parsing
  • Google Sheets: Input row, output asset tracking
  • Rendi.dev: ffmpeg-as-a-service for clip rendering and final video combination
  • HTTP image + voice APIs: Per-scene visual and audio generation
  • Docker: Containerised n8n instance
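As an illustration of the structured output parsing mentioned for the Gemini step, the sketch below validates a JSON reply and turns it into scene records. The schema is an assumption made for the example: a top-level `scenes` list whose items carry `narration` and `image_prompt` strings. The actual workflow relies on n8n's LangChain nodes, and its real schema is not shown in this write-up.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Scene:
    index: int
    narration: str
    image_prompt: str


def parse_scenes(raw: str) -> List[Scene]:
    """Parse and validate the model's JSON reply; raise ValueError if malformed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model reply is not valid JSON: {exc}") from exc

    items = data.get("scenes") if isinstance(data, dict) else None
    if not isinstance(items, list) or not items:
        raise ValueError("expected a non-empty 'scenes' list")

    scenes: List[Scene] = []
    for i, item in enumerate(items, start=1):
        if not isinstance(item, dict):
            raise ValueError(f"scene {i} is not an object")
        narration = item.get("narration")
        image_prompt = item.get("image_prompt")
        if not isinstance(narration, str) or not narration.strip():
            raise ValueError(f"scene {i} has no narration text")
        if not isinstance(image_prompt, str) or not image_prompt.strip():
            raise ValueError(f"scene {i} has no image prompt")
        # Scenes are numbered by position so later steps can keep them in order.
        scenes.append(Scene(i, narration.strip(), image_prompt.strip()))
    return scenes
```

Rejecting a malformed reply at this point keeps a bad script from consuming image, voice, and render calls further down the pipeline.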

Architecture

The system is split into one orchestrator and five sub-workflows so each step is independently testable and reusable:

Sleep Story AI System (32 nodes — manual trigger + Google Sheets)
        ├── Create Image      (image gen per scene)
        ├── Create Voice      (voice synthesis per scene, with retry)
        ├── Create Clips      (image + voice → per-scene clip)
        ├── Combine Clips     (all scene clips → one video)
        └── Videofy with Rendi (ffmpeg via Rendi.dev, rate-limited)
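The 15-second polling used around the Rendi step can be sketched as a simple loop: submit a command, then check its status at a fixed interval until it finishes. In this sketch the status dictionary, its `state` and `output_url` keys, the state values, and the `max_polls` limit are all hypothetical and are not taken from the Rendi.dev API.

```python
import time
from typing import Callable, Dict


def wait_for_render(
    get_status: Callable[[], Dict[str, str]],
    poll_seconds: float = 15.0,
    max_polls: int = 40,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until the render succeeds, then return its output URL.

    Raises RuntimeError if the render reports failure and TimeoutError if it
    is still unfinished after max_polls checks.
    """
    for _ in range(max_polls):
        status = get_status()
        state = status.get("state")
        if state == "SUCCESS":
            return status["output_url"]
        if state == "FAILED":
            raise RuntimeError("render failed")
        sleep(poll_seconds)  # still processing: wait before the next check
    raise TimeoutError(f"render not finished after {max_polls} polls")
```

A fixed interval keeps the loop predictable; whether status checks count toward the per-minute command limit is not stated here, so the interval should be treated as a tunable value.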

A 15-scene job runs end-to-end in about 12–15 minutes.


Outcome

The system replaces what used to be hours of script-writing, image-sourcing, voice-recording, and video editing with a single sheet entry and a click. It is designed to run in the background while other work continues.