Location: Irvine, California (CA)
Contract Type: C2C
Posted: 1 month ago
Closing Date: 11/06/2025
Skills: Machine Learning, NLP, ASR, Python, MLOps
Visa Type: Any Visa

Title: Machine Learning Manager

Location: Irvine, CA - Hybrid

Duration: 12+ Months


Note: Occasional travel to deployment sites or test locations may be required.


Job Description:

Lead the development and operationalization of machine learning systems powering the Voice AI experience. This role is central to ensuring performance, scalability, and reliability across real-time models that support speech, natural language understanding, and agent behavior. Manage a team of machine learning engineers (MLEs) and partner with AI Engineers, QA, and DevOps to deliver high-quality agent performance, with a strong focus on latency, integration with restaurant systems (e.g., HME, POS), and production excellence.

Key Responsibilities:

ML System Design & Architecture

Lead the design of end-to-end ML pipelines for speech, ASR, and NLU modules

Optimize model performance for real-time interaction, including latency, uptime, and inference cost

Implement and evolve model evaluation, testing, and monitoring frameworks

Infrastructure & Integration

Collaborate with engineering to integrate ML components with external systems (HME, menu boards, POS)

Support scalable deployment strategies across markets and environments

Drive MLOps best practices in CI/CD, rollback, logging, and observability

Leadership & Collaboration

Mentor and guide a team of MLEs, including junior ML engineers

Partner with product, AI engineering, and QA to define technical scope, delivery targets, and quality standards

Support internal upskilling and technical review of AI-driven components

Required Qualifications:

6+ years of experience in machine learning, including at least 2 years in technical leadership roles

Proven expertise in deploying NLP, ASR, or LLM-based systems in real-time applications

Strong Python programming skills and experience with ML tooling (e.g., PyTorch, HuggingFace, ONNX, MLflow)

Experience optimizing model latency and integrating ML with backend infrastructure