AI to Z: Navigating the Alphabet Soup

Notes from my journey into the world of engineering, AI, NLP, and productivity tools.

From Sound to Text in Real-Time: Understanding Voxtral Realtime

Most speech-to-text models work like a translator reading an entire letter before responding: they need the full audio clip before they can produce any text. But what if your model could transcribe speech as it’s being spoken, word by word, with barely any delay? That’s exactly what Voxtral Mini 4B Realtime does. Released by Mistral AI under the Apache 2.0 license, it’s a 4-billion-parameter model that can transcribe audio in real time with delays as low as 80 milliseconds....

February 16, 2026 · 10 min · 2029 words · Me

Avengers Assemble!: Uniting AI Agents for Advanced Problem-Solving

Imagine if, just like the Avengers, a team of specialized AI agents could come together, each with its unique strengths, to tackle complex problems. This isn’t a scene from a sci-fi movie; it’s the reality of modern artificial intelligence. In the world of AI, assistants and agents work in unison, much like a well-coordinated superhero team. Each AI agent, equipped with specific skills — be it data analysis, autonomous decision-making, or predictive analytics — joins forces to form a formidable swarm, ready to take on tasks ranging from mundane to monumental....

January 4, 2024 · 6 min · 1200 words · Me

Local LLMs as browser sidekicks

Introduction: The rapidly evolving AI era presents new discoveries daily, particularly in software engineering, where AI assistants are now integral to our workflows. While these innovations have boosted productivity, they also bring challenges like reliance on cloud-based services and closed models 🛅, privacy and security concerns 🔒, and other barriers to entry 🚧. The Challenge of Cloud-Based and Proprietary Models: According to the Chatbot Arena leaderboard, the top entries with high Elo ratings are all proprietary models....

November 30, 2023 · 3 min · 531 words · Me