Location: Atlanta, Georgia (GA)
Contract Type: C2C
Posted: 20 hours ago
Closing Date: 04/28/2026
Skills: Python (PySpark), advanced SQL, Kafka, Kinesis
Visa Type: H1B

Role: Senior Data Engineer

Location: Atlanta, GA (Hybrid)

Experience Level: 12+ Years

Visa: H1B only

Key Responsibilities

  • Pipeline Engineering: Architect and implement scalable ETL/ELT pipelines using PySpark and SQL to ingest and process massive datasets from diverse sources (see the first sketch after this list).
  • Cloud Orchestration: Design and maintain complex workflow automation using Apache Airflow, ensuring high availability and fault tolerance (see the Airflow sketch after this list).
  • Platform Optimization: Leverage Databricks (Jobs & Delta Lake) and AWS EMR Serverless to optimize data processing performance and minimize cloud compute costs.
  • Modern Table Formats: Implement and manage table formats like Iceberg and Delta to support ACID transactions, time travel, and schema evolution (the first sketch after this list demonstrates Delta time travel).
  • Performance Tuning: Perform deep-dive analysis of Spark internals (partitioning, shuffling, caching) to resolve data skew and performance bottlenecks (see the salting sketch after this list).
  • Data Governance: Ensure data integrity and security by implementing robust metadata management, lineage tracking, and compliance with global financial regulations (GDPR/PCI-DSS).
  • Collaboration: Partner with Data Scientists, AI Ops, and Product Managers to translate complex business requirements into high-performance technical solutions.
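For the pipeline-engineering and table-format bullets above, here is a minimal PySpark sketch that ingests raw JSON from S3, cleans it, and lands it as a Delta table with time travel. The bucket paths, column names, and schema are illustrative assumptions, not details from this role:

```python
# Minimal ETL sketch: raw JSON on S3 -> cleaned Delta table.
# Assumes a Delta-enabled runtime (available out of the box on Databricks;
# on plain Spark, add the delta-spark package plus the two configs below).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder.appName("orders-etl-sketch")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

raw = spark.read.json("s3://example-bucket/raw/orders/")  # hypothetical path

cleaned = (
    raw.dropDuplicates(["order_id"])                      # assumed key column
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("order_date", F.to_date("order_ts"))
       .filter(F.col("amount") > 0)
)

# Delta provides ACID writes; each write creates a new versioned snapshot.
(cleaned.write.format("delta")
        .mode("overwrite")
        .partitionBy("order_date")
        .save("s3://example-bucket/curated/orders/"))

# Time travel: read the table as it existed at an earlier version.
v0 = (spark.read.format("delta")
      .option("versionAsOf", 0)
      .load("s3://example-bucket/curated/orders/"))
```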
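For the orchestration bullet, a minimal Airflow DAG sketch; the retries and retry_delay settings supply the fault tolerance the role calls for, and the DAG name and task callables are placeholders:

```python
# Minimal Airflow DAG sketch: daily extract -> transform with retries.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    ...  # placeholder for the real ingestion step

def transform():
    ...  # placeholder for triggering the real PySpark job

with DAG(
    dag_id="orders_daily",  # hypothetical DAG name
    start_date=datetime(2026, 1, 1),
    schedule="@daily",      # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
    default_args={"retries": 3, "retry_delay": timedelta(minutes=5)},
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_extract >> t_transform  # DAG edge: transform runs after extract succeeds
```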
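And for the performance-tuning bullet, one standard skew mitigation is key salting (Spark 3's adaptive skew-join handling covers many cases automatically; manual salting covers the rest). The facts/dims tables and the customer_id key below are hypothetical:

```python
# Skew-mitigation sketch: salt a hot join key so one key's rows spread
# across many partitions instead of piling onto a single straggler task.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("skew-sketch").getOrCreate()
SALT_BUCKETS = 16  # assumed; tune to the observed skew

facts = spark.read.parquet("s3://example-bucket/facts/")  # hypothetical
dims = spark.read.parquet("s3://example-bucket/dims/")    # hypothetical

# Random salt on the large, skewed side.
salted_facts = facts.withColumn(
    "salt", (F.rand() * SALT_BUCKETS).cast("int")
)

# Replicate the small side across every salt value so each salted key
# still finds its match.
salted_dims = dims.crossJoin(
    spark.range(SALT_BUCKETS).withColumnRenamed("id", "salt")
)

joined = (salted_facts
          .join(salted_dims, on=["customer_id", "salt"])
          .drop("salt"))
```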

Required Technical Skills

  • Languages: Expert proficiency in Python (PySpark) and advanced SQL (window functions, CTEs, performance tuning); the window-function pattern is illustrated after this list.
  • Cloud Ecosystem: Extensive experience with AWS services, specifically S3, EMR Serverless, Glue, and IAM.
  • Big Data Tech: Hands-on mastery of Apache Spark and the Databricks ecosystem.
  • Orchestration: Strong experience building and managing Directed Acyclic Graphs (DAGs) in Apache Airflow.
  • Data Modeling: Proven ability to design efficient Data Warehouse (DWH) and Data Lake schemas.
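The window-function pattern from the languages bullet, expressed with the PySpark DataFrame API (equivalent to a SQL ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...) query); the tiny inline dataset is illustrative:

```python
# Window-function sketch: latest order per customer via row_number().
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.appName("window-sketch").getOrCreate()

orders = spark.createDataFrame(
    [("c1", "2026-01-01", 10.0),
     ("c1", "2026-01-03", 25.0),
     ("c2", "2026-01-02", 40.0)],
    ["customer_id", "order_date", "amount"],
)

# Rank each customer's orders by recency and keep only the most recent.
w = Window.partitionBy("customer_id").orderBy(F.col("order_date").desc())
latest = (orders
          .withColumn("rn", F.row_number().over(w))
          .filter(F.col("rn") == 1)
          .drop("rn"))
latest.show()
```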

Preferred Qualifications

  • Streaming: Experience with real-time data processing using Kafka, Kinesis, or Spark Structured Streaming (see the sketch after this list).
  • AI/ML Integration: Exposure to building data foundations for GenAI or LLM-powered workflows.
  • Infrastructure as Code (IaC): Familiarity with Terraform or CloudFormation for automated environment scaffolding.
  • DevOps: Experience with CI/CD pipelines (GitLab/GitHub Actions) and containerization (Docker).
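As referenced in the streaming bullet, a minimal Spark Structured Streaming sketch that consumes from Kafka; the broker address, topic, and S3 paths are illustrative assumptions:

```python
# Structured Streaming sketch: Kafka topic -> Parquet files on S3.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("stream-sketch").getOrCreate()

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical broker
    .option("subscribe", "orders")                     # hypothetical topic
    .load()
)

# Kafka delivers key/value as binary; decode the payload before use.
decoded = events.select(F.col("value").cast("string").alias("payload"))

query = (
    decoded.writeStream.format("parquet")
    .option("path", "s3://example-bucket/stream-out/")         # hypothetical
    .option("checkpointLocation", "s3://example-bucket/chk/")  # for recovery
    .trigger(processingTime="1 minute")
    .start()
)
query.awaitTermination()
```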
