Location: San Jose, California (CA)
Contract Type: C2C
Posted: 4 hours ago
Closed Date: 06/23/2026
Skills: and Production Operations Engineer
Visa Type: Any Visa

Job Description:

This role is a hands-on technical lead who drives cloud engineering initiatives, automates infrastructure, and ensures high availability and performance across customer-facing systems. The Lead Engineer will collaborate with IT, DevOps, and Software Engineering teams to build secure, scalable environments that support continuous delivery and rapid innovation.


Reporting to the Associate Director of IT and Infrastructure, this position combines deep technical execution with mentoring responsibilities, balancing architectural vision with day-to-day operational excellence.


Key Responsibilities:

Cloud Infrastructure and Engineering

Design, deploy, and manage hybrid and cloud infrastructure (OCI, AWS, Azure, on-premises) to support production and enterprise systems

Implement infrastructure-as-code (IaC) using Terraform or CloudFormation to ensure repeatable, secure, and automated deployments

Develop and maintain CI/CD-ready environments that support rapid build, test, and release cycles for engineering teams

Partner with network and security teams to implement resilient, compliant architectures

 

Production Operations and Reliability

Serve as technical lead for production systems, ensuring stability, performance, and scalability

Establish monitoring, logging, and alerting frameworks to improve visibility and reduce mean time to detection (MTTD) and mean time to resolution (MTTR)

Participate in incident response, root cause analysis, and reliability improvement efforts

Collaborate with Engineering and SRE teams to define SLIs, SLOs, and performance metrics for critical services

 

Automation and CI/CD Enablement

Develop and enhance deployment pipelines (e.g., Jenkins, GitLab, ArgoCD) to automate software delivery and environment provisioning

Embed security, compliance, and testing gates into CI/CD workflows

Implement configuration management and orchestration tools such as Ansible, Chef, or Puppet to manage infrastructure at scale

Drive efficiency through self-healing systems, auto-scaling, and infrastructure automation

 

Operational Leadership and Collaboration

Lead day-to-day production operations activities, mentoring junior engineers on cloud and reliability best practices

Act as a technical bridge between Infrastructure, Security, and Application Engineering teams

Contribute to capacity planning, cost optimization, and production readiness reviews

Maintain documentation, runbooks, and standard operating procedures for production systems


Qualifications:

Bachelor’s degree in Computer Science, Information Systems, or equivalent experience

7+ years of experience in cloud and infrastructure engineering, with at least 2–3 years in a lead or senior engineer capacity

Deep expertise in OCI (preferred), AWS, or Azure, covering networking, compute, storage, IAM, and monitoring

Proven experience with production-scale operations and hybrid cloud deployments

 

Proficiency in:

Infrastructure-as-code (Terraform, CloudFormation)

CI/CD and DevOps pipelines (Jenkins, GitLab, ArgoCD)

Containers and orchestration (Kubernetes, Docker)

Observability tools (Datadog, Prometheus, Grafana, ELK)

Scripting languages (Python, Bash, PowerShell)

Strong troubleshooting skills and the ability to lead through high-impact incidents

Excellent communication and collaboration skills across cross-functional teams