Gimlet Labs is building the foundation for the next generation of AI applications. As generative AI workloads rapidly scale, inference efficiency is becoming the critical bottleneck. Gimlet is redefining AI inference from the ground up, combining cutting‑edge research with an integrated hardware‑software stack that delivers breakthrough performance, efficiency, and model quality.
Gimlet pairs its inference stack with a seamless developer experience, allowing users to deploy, manage, and monitor AI workloads built with frameworks like PyTorch and LangChain at production scale in seconds.
Gimlet was spun out of a Stanford research project under Professors Zain Asgar and Sachin Katti. The founding team has deep experience across AI, distributed systems, and hardware, along with previous successful exits.
Gimlet Labs is seeking a Software Engineer to help build and scale our platform for deploying efficient AI inference. You will have the opportunity to work across the stack: from Gimlet's orchestration layer for distributing data and workloads at production scale, to its compilation framework for optimizing AI across diverse environments. Whether you're diving deep into breakthrough techniques to drive performance for the latest AI models, designing systems for processing millions of tokens a second, or refining the developer experience for AI deployment, you'll help shape the future of modern AI infrastructure.