We're looking for a Senior Machine Learning Engineer to join our ML team in Toronto, focusing on building and optimizing state-of-the-art RAG (Retrieval-Augmented Generation) systems. You'll be joining us at an exciting time as we reinvent our RAG systems, making this an excellent opportunity for someone with strong ML and IR fundamentals who wants to dive deep into practical LLM applications.
Q: Klue who?
A: Klue is a VC-backed, capital-efficient, growing SaaS company. Tiger Global and Salesforce Ventures led our US$62M Series B in the fall of 2021. We’re creating the category of competitive enablement: helping companies understand their market and outmaneuver their competition. We benefit from having an experienced leadership team working alongside several hundred risk-taking builders who elevate every day.
We’re one of Canada’s Most Admired Corporate Cultures by Waterstone HC, a Deloitte Technology Fast 50 & Fast 500 winner, and recipient of both the Startup of the Year and Tech Culture of the Year awards at the Technology Impact Awards.
Q: What are the responsibilities, and how will I spend my time?
A: In this role, you'll focus on optimizing our RAG systems with scientific rigor and reproducible results. You'll measure and improve retrieval systems across the spectrum from BM25 to semantic search, using comprehensive evaluation metrics including Recall@K and Precision@K. A key challenge will be developing optimal chunking and enrichment strategies for diverse data sources, including news articles, website changes, documents, CRM entries, call recordings, and internal communications. You'll explore how different data types and formats impact retrieval performance and develop strategies to maintain high relevance across all sources.
Beyond RAG and retrieval, you'll work on prompt engineering to effectively utilize the retrieved context. This includes developing zero-shot and few-shot prompts with structured inputs/outputs, and implementing tight iteration loops with the right evaluation metrics.
You'll also train and fine-tune smaller, more efficient models that can match the performance of much larger LLMs at a fraction of the cost. This includes creating labeled datasets (sometimes using prompts), conducting careful hyperparameter optimization, and building automated training pipelines. You'll then deploy and monitor these models in production, optimize their latency, and implement comprehensive offline/online metrics to track their performance.
Throughout all this work, you'll apply your deep understanding of the latest breakthroughs in the field to connect new research advances to practical improvements in our systems. Working closely with backend engineers, you'll help build scalable, production-ready systems that turn cutting-edge ML experiments into reliable business value.
Q: What experience are we looking for?
Q: What makes you thrive at Klue?
A: We're looking for builders who:
Q: What technologies do we use?
How We Work at Klue: