Day 4 - Aug 21st
Panorama
Building GenAI features is one thing—operating them reliably in production is another. In this talk, we’ll dive into how we’ve architected and scaled real-time Generative AI systems using Scala, FS2, and Server-Sent Events (SSE), with a particular focus on integrating Retrieval-Augmented Generation (RAG) to ground LLMs in dynamic business data.
We’ll explore the challenges of serving low-latency responses to thousands of users, streaming outputs token by token, and orchestrating document retrieval pipelines on the fly. Expect practical insights on managing memory pressure, backpressure, observability, error handling, and model fallbacks, all drawn from running these systems in production.
If you’re building production-ready GenAI services—chatbots, assistants, or document QA systems—this talk will equip you with the architectural patterns, tooling, and hard-earned lessons needed to scale them with confidence.
Writer
Muayad Sayed Ali is a technologist, builder, and mentor with a deep passion for turning cutting-edge AI into real-world impact. As Director of Engineering at Writer.com, he leads the teams behind one of the most advanced Generative AI platforms for enterprise—helping companies transform how they launch products, analyze data, and make decisions using LLM-powered agents. Muayad’s journey spans more than a decade of building high-performance systems at the intersection of machine learning, big data, and functional programming. He thrives on solving hard engineering problems at scale—whether it’s designing production-grade GenAI infrastructure, fine-tuning RAG pipelines, or leading teams through the complexities of deploying AI responsibly. Known online as @coldenzero, Muayad actively shares ideas, mentors engineers, and contributes to the developer community. He’s driven by a belief that the future of work will be agentic, deeply contextual, and collaborative—and he’s helping to build that future, one system at a time.