Role: Databricks Data Scientist
Location: Indianapolis, IN - Onsite
Duration: 6 Months
JD:
Role Overview
- We are seeking a Databricks Data Scientist with strong experience in Databricks Lakehouse, Advanced Analytics, and Databricks Genie (AI/BI) to design, build, and deploy scalable data science and AI solutions.
- The role will focus on transforming enterprise data into actionable insights using machine learning, natural language analytics, and self-service BI powered by Databricks Genie.
Key Responsibilities:
Data Science & Machine Learning
- Design, develop, and deploy machine learning models using Databricks (MLflow, Spark ML, Python).
- Implement end-to-end ML pipelines (Data Ingestion → Training → Deployment → Monitoring).
- Collaborate with Data Engineers to ensure reliable, high-quality datasets in the Lakehouse environment.
Databricks & Lakehouse Architecture
- Leverage Databricks Lakehouse (Delta Lake, Unity Catalog) for scalable analytics.
- Optimize Spark jobs for performance, scalability, and cost efficiency.
- Apply best practices for data governance, lineage, and security.
Genie (AI/BI & Natural Language Analytics)
- Configure and enable Databricks Genie for self-service analytics.
- Design semantic layers and curated datasets optimized for natural language queries.
- Partner with business stakeholders to translate business questions into Genie-enabled insights.
Business Enablement & Collaboration
- Work closely with Product Owners, Analysts, and Business Leaders to identify high-value business use cases.
- Communicate complex analytical results in a clear, business-friendly manner.
- Drive adoption of AI/BI solutions across the organization.
Required Qualifications
- Bachelor's or Master's degree in Data Science, Computer Science, Statistics, Engineering, or a related field.
- 4+ years of experience in Data Science or Advanced Analytics.
- Hands-on experience with Databricks and Apache Spark.
- Strong programming skills in Python, PySpark, Pandas, NumPy, and Scikit-learn.
- Experience building and deploying Machine Learning models in production environments.
- Strong understanding of SQL and Data Modeling.
- Experience with MLflow, Model Lifecycle Management, and Experiment Tracking.
Mandatory Skills
- Databricks
- Apache Spark
- Python
- PySpark
- MLflow
- Machine Learning
- SQL
- Delta Lake
- Unity Catalog
- Databricks Lakehouse
- Databricks Genie (AI/BI)
Preferred Skills
- Databricks Genie
- Natural Language Analytics
- Advanced Analytics
- Semantic Layer Design
- Data Governance
- Model Monitoring
- Data Lineage
- Self-Service BI