Job Details

AI/Backend Engineer | San Francisco | $250k + equity

2026-05-03 | Harrison Clarke | Santa Rosa, CA
Description:

Harrison Clarke is partnering with an early-stage startup building ground-truth infrastructure for AI agents: the data, evaluation, and runtime systems that allow LLM-powered agents to behave reliably in real-world environments.

As an AI / Infrastructure Engineer focused on LLM systems, you will help design and operate the production backbone for deploying and scaling large language models. This includes building low-latency inference systems, GPU-optimised serving infrastructure, and the evaluation pipelines that ensure model outputs remain accurate, consistent, and grounded.

Key Responsibilities:

  • Design and operate infrastructure for deploying LLMs (e.g., GPT-style, open-weight, fine-tuned models)
  • Build and optimise high-throughput, low-latency inference pipelines
  • Implement scalable LLM serving systems (batching, caching, streaming, request routing)
  • Manage GPU-based infrastructure with a focus on cost and performance efficiency
  • Deploy and maintain model serving stacks (e.g., vLLM, TensorRT-LLM, TGI, Triton, or equivalents)
  • Build systems for model routing, fallback logic, and multi-model orchestration
  • Implement observability for LLM systems (latency, throughput, cost, failure modes, quality signals)
  • Design evaluation infrastructure for production LLM behaviour (A/B testing, regression testing, drift detection)
  • Collaborate with ML and product teams to productionise RAG systems and fine-tuned models

Qualifications:

  • 3+ years in infrastructure engineering, MLOps, or backend systems roles
  • Proven experience deploying ML or LLM systems in production environments
  • Strong proficiency in Python and/or Go
  • Strong understanding of distributed systems and scalable backend architecture
  • Hands-on experience with Docker, Kubernetes, and CI/CD pipelines
  • Familiarity with model serving frameworks (e.g., vLLM, Triton, TGI)
  • Experience building high-performance APIs for production systems
  • Strong debugging skills across infrastructure and application layers
  • Must have the legal right to work in the US and must not require visa sponsorship

If this role interests you, please apply below or reach out to me at reece@harrisonclarke.com.


Apply for this Job

Please use the APPLY HERE link below to view additional details and application instructions.

Apply Here