Job Details

Member of Technical Staff - Compilers

  2026-05-15 | Acceler8 Talent | Santa Rosa, CA
Description:

Full-time | On-site | Bay Area

About the Company

We are a frontier AI hardware company developing AI models and tools for on-demand custom ASICs at scale. Our mission is to co-design custom ASICs alongside evolving machine learning workloads and enable a new generation of domain-specific chips that unlock capabilities beyond current hardware paradigms.

Our team brings together expertise across AI, silicon, systems, and compiler infrastructure, with backgrounds spanning leading AI research labs, hyperscalers, semiconductor companies, and advanced computing organizations.

We are looking for staff- or principal-level compiler engineers with deep experience building code generation toolchains for custom AI accelerators. Ideal candidates have shipped production compilers for AI, ML, or domain-specific hardware accelerators.

What You'll Do

As a Member of the Technical Staff on the Compilers team, you will own the compiler stack targeting a SIMD/VLIW neural processing architecture, from graph ingestion through code generation on production silicon. You will work closely with hardware architects to co-design the ISA and close the loop between compiler requirements and hardware decisions.

Responsibilities include:

  • Own the compiler end-to-end, including graph ingestion from PyTorch and portable formats such as ONNX, IR optimization, AI-driven code generation, instruction scheduling, and register allocation for a SIMD/VLIW NPU.
  • Implement and own the memory management layer, including software-managed on-chip scratchpad memory, compiler-driven data tiling, bank allocation, DMA scheduling, and double-buffering across SRAM banks.
  • Design and iterate on mid-end and backend optimization passes, including operator fusion, loop transformations, vectorization, and software pipelining to close the gap between peak and achieved throughput.
  • Co-design ISA features and instruction encodings with architecture and silicon teams, using real workload performance data to inform architectural decisions.
  • Support quantization and mixed-precision lowering, including FP32, INT32, INT8, INT4, BF16, FP16, FP8, and FP4, while ensuring correct end-to-end numerics.
  • Benchmark compiler output against cycle-accurate models, RTL simulation, and FPGA prototypes, and own quality-of-results tracking.
  • Grow into a compiler team leadership role as the organization scales.

What We'd Like to See

Qualifications and Skills:

  • Bachelor's, Master's, or PhD in Computer Science, Computer Engineering, or a closely related field.
  • 5+ years of experience building compilers or code generation toolchains for custom accelerators.
  • Direct experience building compilers that target ML or AI hardware; general-purpose CPU compiler experience alone is not sufficient.
  • Hands-on experience with at least one custom accelerator compiler stack, such as neural processor compilers, TPU-like code generation, spatial scheduling systems, AI engine compilers, or equivalent domain-specific accelerator compiler infrastructure.
  • Strong understanding of instruction scheduling, register allocation, and software pipelining, especially for SIMD/VLIW or spatial architectures.
  • Experience with tiling strategies, loop nest optimization, and operator fusion for ML workloads such as convolution, attention, element-wise operations, reductions, and transpositions.
  • Experience with software-managed memory, including scratchpad allocation, data layout, DMA orchestration, and multi-buffering.
  • Strong C++ skills and Python proficiency.
  • Familiarity with MLIR, LLVM, or similar compiler infrastructure.
  • Ability to lead and grow a compiler team over time.

Bonus Experience

  • Hardware/software co-design experience, including defining ISA features, instruction encodings, or hardware interfaces driven by compiler needs.
  • IR design for ML accelerators, including custom dialects, MLIR-based flows, or graph-level IRs.
  • Experience with ML frameworks and portable graph formats, such as PyTorch, TensorFlow, and ONNX.
  • Experience benchmarking and profiling compiler output on real hardware, FPGA prototypes, or cycle-accurate simulators.
  • Understanding of ML inference systems and workload-level optimizations, including attention variants, batching systems, KV cache management, and prefill/decode scheduling.
  • Contributions to open-source ML compiler projects.
  • Track record in energy-efficient, high-performance hardware accelerator bring-up.

What We Offer

  • Competitive salary and meaningful equity.
  • Fast-paced startup environment with autonomy and visible impact.
  • Technical challenges at the intersection of AI, compilers, and silicon design.
  • Direct ownership of a compiler stack as the company scales.


Apply for this Job

Please use the APPLY HERE link below to view additional details and application instructions.

Apply Here
