Job Details

AI/Backend Engineer | San Francisco | $250k + equity

2026-05-03 | Harrison Clarke | Santa Rosa, CA

Description:

Harrison Clarke are partnered with an early-stage startup building ground-truth infrastructure for AI agents: the data, evaluation, and runtime systems that allow LLM-powered agents to behave reliably in real-world environments.

As an AI / Infrastructure Engineer focused on LLM systems, you will help design and operate the production backbone for deploying and scaling large language models. This includes building low-latency inference systems, GPU-optimised serving infrastructure, and the evaluation pipelines that ensure model outputs remain accurate, consistent, and grounded.

Key Responsibilities:

Design and operate infrastructure for deploying LLMs (e.g., GPT-style, open-weight, fine-tuned models)
Build and optimise high-throughput, low-latency inference pipelines
Implement scalable LLM serving systems (batching, caching, streaming, request routing)
Manage GPU-based infrastructure with a focus on cost and performance efficiency
Deploy and maintain model serving stacks (e.g., vLLM, TensorRT-LLM, TGI, Triton, or equivalents)
Build systems for model routing, fallback logic, and multi-model orchestration
Implement observability for LLM systems (latency, throughput, cost, failure modes, quality signals)
Design evaluation infrastructure for production LLM behaviour (A/B testing, regression testing, drift detection)
Collaborate with ML and product teams to productionise RAG systems and fine-tuned models

Qualifications:

3+ years in infrastructure engineering, MLOps, or backend systems roles
Proven experience deploying ML or LLM systems in production environments
Strong proficiency in Python and/or Go
Strong understanding of distributed systems and scalable backend architecture
Hands-on experience with Docker, Kubernetes and CI/CD pipelines
Familiarity with model serving frameworks (e.g., vLLM, Triton, TGI)
Experience building high-performance APIs for production systems
Strong debugging skills across infrastructure and application layers
Must have the legal right to work in the US and must not require visa sponsorship

If this sounds of interest, please apply below or reach out to me at reece@harrisonclarke.com.
