Location: Atlanta, Georgia (GA)
Contract Type: C2C
Posted: 20 hours ago
Closing Date: 04/28/2026
Skills: Python (PySpark), advanced SQL, Kafka, Kinesis
Visa Type: H1B

Role: Senior Data Engineer

Location: Atlanta, GA (Hybrid)

Experience Level: 12+ Years

Visa: H1B only

Key Responsibilities

  • Pipeline Engineering: Architect and implement scalable ETL/ELT pipelines using PySpark and SQL to ingest and process massive datasets from diverse sources (see the first sketch after this list).
  • Cloud Orchestration: Design and maintain complex workflow automation using Apache Airflow, ensuring high availability and fault tolerance (see the Airflow sketch after this list).
  • Platform Optimization: Leverage Databricks (Jobs & Delta Lake) and AWS EMR Serverless to optimize data processing performance and minimize cloud compute costs.
  • Modern Table Formats: Implement and manage table formats like Iceberg and Delta to support ACID transactions, time travel, and schema evolution (the first sketch after this list demonstrates Delta time travel).
  • Performance Tuning: Perform deep-dive analysis of Spark internals (partitioning, shuffling, caching) to resolve data skew and performance bottlenecks (see the salting sketch after this list).
  • Data Governance: Ensure data integrity and security by implementing robust metadata management, lineage tracking, and compliance with global financial regulations (GDPR/PCI-DSS).
  • Collaboration: Partner with Data Scientists, AI Ops, and Product Managers to translate complex business requirements into high-performance technical solutions.
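For the pipeline-engineering and table-format bullets above, here is a minimal PySpark sketch that ingests raw JSON from S3, cleans it, and lands it as a Delta table with time travel. The bucket paths, column names, and schema are illustrative assumptions, not details from this role:

```python
# Minimal ETL sketch: raw JSON on S3 -> cleaned Delta table.
# Assumes a Delta-enabled runtime (available out of the box on Databricks;
# on plain Spark, add the delta-spark package plus the two configs below).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder.appName("orders-etl-sketch")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

raw = spark.read.json("s3://example-bucket/raw/orders/")  # hypothetical path

cleaned = (
    raw.dropDuplicates(["order_id"])                      # assumed key column
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("order_date", F.to_date("order_ts"))
       .filter(F.col("amount") > 0)
)

# Delta provides ACID writes; each write creates a new versioned snapshot.
(cleaned.write.format("delta")
        .mode("overwrite")
        .partitionBy("order_date")
        .save("s3://example-bucket/curated/orders/"))

# Time travel: read the table as it existed at an earlier version.
v0 = (spark.read.format("delta")
      .option("versionAsOf", 0)
      .load("s3://example-bucket/curated/orders/"))
```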
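For the orchestration bullet, a minimal Airflow DAG sketch; the retries and retry_delay settings supply the fault tolerance the role calls for, and the DAG name and task callables are placeholders:

```python
# Minimal Airflow DAG sketch: daily extract -> transform with retries.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    ...  # placeholder for the real ingestion step

def transform():
    ...  # placeholder for triggering the real PySpark job

with DAG(
    dag_id="orders_daily",  # hypothetical DAG name
    start_date=datetime(2026, 1, 1),
    schedule="@daily",      # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
    default_args={"retries": 3, "retry_delay": timedelta(minutes=5)},
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_extract >> t_transform  # DAG edge: transform runs after extract succeeds
```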
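And for the performance-tuning bullet, one standard skew mitigation is key salting (Spark 3's adaptive skew-join handling covers many cases automatically; manual salting covers the rest). The facts/dims tables and the customer_id key below are hypothetical:

```python
# Skew-mitigation sketch: salt a hot join key so one key's rows spread
# across many partitions instead of piling onto a single straggler task.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("skew-sketch").getOrCreate()
SALT_BUCKETS = 16  # assumed; tune to the observed skew

facts = spark.read.parquet("s3://example-bucket/facts/")  # hypothetical
dims = spark.read.parquet("s3://example-bucket/dims/")    # hypothetical

# Random salt on the large, skewed side.
salted_facts = facts.withColumn(
    "salt", (F.rand() * SALT_BUCKETS).cast("int")
)

# Replicate the small side across every salt value so each salted key
# still finds its match.
salted_dims = dims.crossJoin(
    spark.range(SALT_BUCKETS).withColumnRenamed("id", "salt")
)

joined = (salted_facts
          .join(salted_dims, on=["customer_id", "salt"])
          .drop("salt"))
```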

Required Technical Skills

  • Languages: Expert proficiency in Python (PySpark) and advanced SQL (window functions, CTEs, performance tuning); the window-function pattern is illustrated after this list.
  • Cloud Ecosystem: Extensive experience with AWS services, specifically S3, EMR Serverless, Glue, and IAM.
  • Big Data Tech: Hands-on mastery of Apache Spark and the Databricks ecosystem.
  • Orchestration: Strong experience building and managing Directed Acyclic Graphs (DAGs) in Apache Airflow.
  • Data Modeling: Proven ability to design efficient Data Warehouse (DWH) and Data Lake schemas.
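The window-function pattern from the languages bullet, expressed with the PySpark DataFrame API (equivalent to a SQL ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...) query); the tiny inline dataset is illustrative:

```python
# Window-function sketch: latest order per customer via row_number().
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.appName("window-sketch").getOrCreate()

orders = spark.createDataFrame(
    [("c1", "2026-01-01", 10.0),
     ("c1", "2026-01-03", 25.0),
     ("c2", "2026-01-02", 40.0)],
    ["customer_id", "order_date", "amount"],
)

# Rank each customer's orders by recency and keep only the most recent.
w = Window.partitionBy("customer_id").orderBy(F.col("order_date").desc())
latest = (orders
          .withColumn("rn", F.row_number().over(w))
          .filter(F.col("rn") == 1)
          .drop("rn"))
latest.show()
```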

Preferred Qualifications

  • Streaming: Experience with real-time data processing using Kafka, Kinesis, or Spark Structured Streaming (see the sketch after this list).
  • AI/ML Integration: Exposure to building data foundations for GenAI or LLM-powered workflows.
  • Infrastructure as Code (IaC): Familiarity with Terraform or CloudFormation for automated environment scaffolding.
  • DevOps: Experience with CI/CD pipelines (GitLab/GitHub Actions) and containerization (Docker).
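As referenced in the streaming bullet, a minimal Spark Structured Streaming sketch that consumes from Kafka; the broker address, topic, and S3 paths are illustrative assumptions:

```python
# Structured Streaming sketch: Kafka topic -> Parquet files on S3.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("stream-sketch").getOrCreate()

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical broker
    .option("subscribe", "orders")                     # hypothetical topic
    .load()
)

# Kafka delivers key/value as binary; decode the payload before use.
decoded = events.select(F.col("value").cast("string").alias("payload"))

query = (
    decoded.writeStream.format("parquet")
    .option("path", "s3://example-bucket/stream-out/")         # hypothetical
    .option("checkpointLocation", "s3://example-bucket/chk/")  # for recovery
    .trigger(processingTime="1 minute")
    .start()
)
query.awaitTermination()
```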
