Title: Big Data Engineer
Location: Rockville, MD or McLean, VA (Hybrid)
Contract: 6+ months
The assessment is monitored: if a candidate cheats or copies and pastes anything during the test, it will be detected and the profile will be rejected.
No fake submissions, please.
Please submit only strong, genuine candidates. The right candidate is assured an interview and an offer.
Local candidates only: no relocation, and no candidates who have already been represented.
Interview: Video + face-to-face (F2F)
Only local candidates in DC/VA/MD who can take the assessment and attend an F2F interview will be considered.
Overview
- Must-haves: Spark, Hadoop, Scala, Hive
- Scripting is a must: Python or Perl
- Must be expert-level in complex SQL: window functions, complex multi-table joins
- Cloud experience is mandatory: S3, Glue, EMR, Athena
- AI: able to use AI tools, including prompt engineering
- GitHub
- Copilot
- ChatGPT
- Q Developer
Need someone well versed in Agile, test automation, and CI/CD practices.
Financial services experience is preferred.
ROLE FIT
- 5+ years building enterprise-scale data solutions using Spark, Hadoop, Hive, and Scala
- Strong scripting skills (Python or Perl) and expert-level complex SQL (window functions, multi-joins)
- AWS cloud experience required (S3, EMR, Glue, Athena)
- Experience with Agile delivery, CI/CD pipelines, automated testing, and GitHub workflows
- Financial services or regulated industry experience preferred
- Design and maintain scalable, reliable big data pipelines
- Optimize Spark/Hadoop workloads for performance, scalability, and cost efficiency
- Implement automated testing and data quality validation
- Enable analytics and data science teams with high-quality, accessible datasets
- Leverage AI-assisted tools (Copilot, ChatGPT, Q Developer) to improve development productivity
- Diagnose and resolve Spark performance bottlenecks and data pipeline failures
- Optimize complex SQL transformations and large-scale joins
- Troubleshoot data quality, latency, and reliability issues in production
- Improve AWS workload efficiency through tuning and resource optimization
- Automate repetitive engineering tasks using AI-assisted development tools
- Delivered end-to-end pipelines using Spark and Hadoop ecosystem tools
- Optimized SQL and pipeline performance with measurable improvements
- Deployed and supported AWS data workloads (EMR, Glue, Athena, S3)
- Implemented CI/CD and automated testing for data pipelines
- Used AI coding assistants and GitHub workflows in team-based development