
A Deep Dive into LLM Inference Latencies
Why?
Large language model deployment is becoming a necessity for modern applications, and inference optimization plays a central role in shaping both user experience and cost. Latency cuts across GPU efficiency, network routing, and autoscaling, which makes it a complex but rewarding area to improve. At Hathora, we’re drawing on lessons