Location: Austin, Texas (TX)
Contract Type: C2C
Posted: 2 days ago
Closed Date: 05/09/2025
Skills: LLM training and fine-tuning with PyTorch, DeepSpeed, and LoRA
Visa Type: Any Visa

Role: GenAI Ops Engineer

Location: Austin, TX (Onsite day 1)

Type: Contract

Job Description:

We are looking for a GenAI Ops Engineer to train, fine-tune, and deploy Generative AI models (LLMs, Diffusion Models, Transformers, etc.). You will optimize model performance, manage training pipelines, and integrate AI solutions into production.

Key Responsibilities:

  • Train and fine-tune LLMs using PyTorch, DeepSpeed, and LoRA.
  • Optimize inference using ONNX, vLLM, TensorRT, and GPU acceleration.
  • Manage datasets, preprocess data, and implement RAG with vector databases (FAISS, Chroma, Pinecone).
  • Automate training workflows using MLflow, Weights & Biases, and Ray.
  • Deploy models using Kubernetes, Docker, and cloud AI services (AWS or GCP).
  • Monitor model performance, mitigate drift, and optimize resource utilization.

Requirements:

  • Experience with LLM training, fine-tuning, and inference optimization.
  • Proficiency in Python, cloud AI services, and distributed training.
  • Familiarity with retrieval-augmented generation (RAG) and prompt engineering.
  • Strong problem-solving skills and the ability to work in fast-paced AI environments.

Preferred:

  • Experience with open-weight models (LLaMA, Mistral, Gemma, Falcon, etc.).
  • Hands-on knowledge of multi-agent architectures and synthetic data generation.