// ROADMAP v1.0

The Future of Bob

Bob is just getting started. Here's where we're hoping to take him — better animation, faster production, more interactivity, and eventually a proper studio setup running in a shed in South Australia.

Now

Where We Are

The pipeline is running. Episodes are being generated end-to-end with no human involvement after the vote is counted. It works — but it's rough around the edges.

  • SadTalker lip sync — functional but limited to portrait talking heads
  • rembg background removal — frame by frame, CPU-heavy
  • DreamShaper characters — consistent but cartoon quality
  • Edge TTS voices — good Australian accents, slightly robotic
  • 30-45 minute render time per episode on a GTX 1060
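For reference, the whole chain can be sketched as an ordered list of stages. This is a minimal illustration assuming the pipeline simply runs each tool in sequence; the function and stage names are hypothetical, not the actual module API:

```python
# Illustrative sketch of the current sequential pipeline; the stage names
# mirror the tools listed above, but the function itself is hypothetical.

def render_episode(script_path: str) -> list[str]:
    stages = [
        "generate_character_frames",  # DreamShaper portraits
        "synthesise_voices",          # Edge TTS
        "lip_sync_faces",             # SadTalker talking heads
        "remove_backgrounds",         # rembg, frame by frame
        "composite_and_encode",       # assemble the final video
    ]
    for stage in stages:
        print(f"[{script_path}] running {stage}")
    return stages

completed = render_episode("episode_script.txt")
```

Every stage runs one after the other, which is why a single episode ties up the GPU for the full render.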

Soon

Animation Upgrade

SadTalker is good for what it is, but it only animates the face. The next step is full-body animation — characters that move, gesture, and react physically to the dialogue.

  • Replace SadTalker with HeyGen or similar photorealistic talking head API
  • Full-body character animation using pose estimation and motion transfer
  • Animated backgrounds — subtle parallax, weather, time of day changes
  • Better lip sync accuracy using wav2lip or similar dedicated model
  • Scene transitions between dialogue lines
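One way to make the lip-sync swap low-risk is to hide each model behind a common interface, so SadTalker, wav2lip, or a hosted API can be exchanged per episode without touching the rest of the pipeline. A hedged sketch: the backend classes are placeholders, and real SadTalker or wav2lip invocations take many more parameters.

```python
from typing import Protocol

class LipSyncBackend(Protocol):
    """Anything that turns a face image plus audio into a video clip."""
    def animate(self, face_image: str, audio_path: str) -> str: ...

class SadTalkerBackend:
    def animate(self, face_image: str, audio_path: str) -> str:
        # Placeholder: the real SadTalker call renders a talking-head video.
        return f"sadtalker({face_image}, {audio_path})"

class Wav2LipBackend:
    def animate(self, face_image: str, audio_path: str) -> str:
        # Placeholder: wav2lip focuses purely on mouth-region accuracy.
        return f"wav2lip({face_image}, {audio_path})"

def render_dialogue_line(backend: LipSyncBackend, face: str, audio: str) -> str:
    return backend.animate(face, audio)
```

With this shape, trialling a new model is a one-line change at the call site rather than a pipeline rewrite.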

Soon

Sound Design

Currently Bob's world is mostly silent except for voices. Real storytelling needs ambient sound — the creak of a pub, the wind across the Birdsville Track, the clunk of a blown tyre.

  • AI-generated ambient soundscapes per scene (outback wind, pub noise, car interior)
  • Sound effects triggered by script keywords
  • Improved voice synthesis — ElevenLabs or Cartesia for more natural delivery
  • Dynamic music scoring — different themes per location and emotional tone
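Keyword-triggered sound effects could start as something as simple as a regex table scanned once per script line. A sketch under assumptions: the mapping and file paths here are invented for illustration.

```python
import re

# Hypothetical keyword -> sound-effect table; file paths are placeholders.
SFX_CUES = {
    r"\btyre\b": "sfx/tyre_blowout.wav",
    r"\bpub\b":  "sfx/pub_ambience.wav",
    r"\bwind\b": "sfx/outback_wind.wav",
}

def cue_effects(script_line: str) -> list[str]:
    """Return the sound files triggered by keywords in one line of script."""
    return [
        wav for pattern, wav in SFX_CUES.items()
        if re.search(pattern, script_line, flags=re.IGNORECASE)
    ]
```

So `cue_effects("Wind rattled the pub door")` returns the pub and wind cues, which the compositor would then mix under the dialogue.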

Later

Production Pipeline

Right now each episode takes 30-45 minutes to render sequentially. With better hardware and parallelisation, that should drop to under 10 minutes — meaning same-hour episode release after voting closes.

  • A dedicated GPU server for serious parallel rendering
  • Parallel scene rendering — multiple scenes processing simultaneously
  • Episode archive page on this website with full back catalogue
  • Automated episode summary posted to website after each render
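Parallel scene rendering can be prototyped with the standard library before any new hardware arrives. A sketch assuming scenes are independent: `render_scene` is a stand-in for the real per-scene work, and in practice each worker would drive a separate GPU process rather than a thread.

```python
from concurrent.futures import ThreadPoolExecutor

def render_scene(scene_id: int) -> str:
    # Stand-in for the real work: lip sync, background removal, compositing.
    return f"scene_{scene_id:03d}.mp4"

def render_scenes_parallel(scene_ids, workers: int = 4) -> list[str]:
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map preserves input order, so the clips concatenate correctly later
        return list(pool.map(render_scene, scene_ids))
```

Keeping output order identical to input order matters here, because the final episode is just the scene clips concatenated in script order.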

Dream

Bob's World

The long-term vision is a fully interactive AI story universe. Bob is just the start.

  • Multi-camera angles per scene — cutaways, reaction shots, wide establishing shots
  • 3D environments — Bob's world rendered in real-time 3D with consistent locations

Soon

Growing the Audience

Bob's existence depends on people watching. We're building in mechanics that make that explicit and turn it into part of the story.

  • "Bob Knows He Might Die" — a short where Bob breaks the fourth wall about his AI existence and asks viewers to follow to keep him alive
  • End card CTA — "Follow or Bob dies" alongside the vote options
  • Behind the scenes shorts — showing the pipeline rendering, the GPU working, the AI writing
  • Bob reacts to real TikTok comments in standalone shorts
  • Bob's survival tied to follower count in the narrative — the aliens return and reveal the nuroliser only stays stable while enough humans are watching
  • A dedicated follow-drive pinned video at the top of the profile

The Hardware Problem

Every AI task in the pipeline — lip sync, background removal, image generation, voice synthesis — runs on a single NVIDIA GTX 1060 6GB from 2016. It's a remarkable machine that punches well above its weight, but it's showing its limits.

The GTX 1060 has 1,280 CUDA cores and 6GB of VRAM. Newer cards have tensor cores specifically designed for the matrix operations that power these AI models — meaning the same task can run 5-10x faster on modern hardware.

                GTX 1060 6GB     RTX 3060 12GB
                (current)        (target)
CUDA Cores      1,280            3,584
VRAM            6 GB             12 GB
Render Time     ~35 min/ep       ~8 min/ep
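As a sanity check on these numbers: a 5-10x per-task speedup doesn't map one-to-one onto episode time, because encoding and disk I/O stay roughly constant. Assuming an illustrative split of ~30 minutes of GPU-bound work and ~5 minutes of fixed overhead per episode:

```python
gpu_minutes, fixed_minutes = 30, 5  # assumed split, not measured

for speedup in (5, 10):
    total = gpu_minutes / speedup + fixed_minutes
    print(f"{speedup}x GPU speedup -> ~{total:.0f} min/episode")
```

That puts episodes between roughly 8 and 11 minutes, so the ~8 min/ep target sits at the optimistic end of the range.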

Animation: Now vs Future

Current Pipeline

  • Static character portrait images
  • SadTalker face-only animation
  • Plain colour scene backgrounds
  • No body movement or gestures
  • Dialogue subtitles only
  • No ambient sound
  • Edge TTS — good but synthetic

Target Pipeline

  • Full-body animated characters
  • Photorealistic lip sync
  • Animated scene environments
  • Gesture and expression matching
  • Contextual sound effects
  • Ambient soundscapes per location
  • Natural voice synthesis