Position: Senior SRE Engineer with Cloud Security – (Federal Project Exp. Required) – (F2F Interview)
Location: Alpharetta, GA or Berkeley, NJ or Columbus, OH or Frisco, TX – (Local candidates only; DL or ID card required)
Note: F2F interview and 5 days onsite are mandatory
Visa: Only USC and GC
Job Summary: SRE (Security Engineer)
We are seeking a skilled and motivated Cloud Security Engineer – SRE to join our dynamic team. The ideal candidate will possess a strong technical background in systems administration, cloud computing, and infrastructure as code, with a particular focus on solution engineering/site reliability. This role will involve collaborating with cross-functional teams to enhance our security posture and streamline processes through automation.
1. Technical Skills
- Programming and Scripting: Strong proficiency in languages like Python, Go, Bash, or Ruby. SREs often need to write automation scripts and build tooling.
- Systems Administration: Deep understanding of operating systems (Linux/Unix), file systems, processes, and system configurations.
- Infrastructure as Code (IaC): Experience with IaC tools like Terraform, Ansible, or Chef to manage infrastructure.
- Cloud Computing: Knowledge of cloud platforms such as AWS, Azure, or Google Cloud Platform, including compute and storage services (e.g., EC2 and S3 on AWS), managed Kubernetes, and serverless functions.
- Containers and Orchestration: Expertise in containerization (Docker) and container orchestration (Kubernetes, OpenShift).
- Networking: Understanding of networking concepts, including DNS, firewalls, load balancing, and VPNs.
- Monitoring and Observability: Experience with monitoring and observability tools like Prometheus, Grafana, Datadog, or New Relic. Ability to set up and maintain monitoring dashboards, alerts, and logs.
- Continuous Integration/Continuous Deployment (CI/CD): Familiarity with CI/CD tools like Jenkins, GitLab CI, GitHub Actions, or CircleCI.
- A strong understanding of HashiCorp Vault and Terraform will make you stand out.
2. Problem-Solving and Troubleshooting
- Incident Management: Ability to manage and respond to incidents, perform root cause analysis, and conduct post-mortem reviews.
- Automation: Focus on automating repetitive tasks to improve efficiency and reduce human error.
- Performance Tuning: Skills in identifying and resolving performance bottlenecks in systems and applications.
3. Collaboration and Communication
- Teamwork: Ability to work closely with cross-functional teams, including software engineers, product managers, and DevOps teams.
- Documentation: Skill in creating clear and comprehensive documentation for systems, processes, and incident reports.
- Communication: Effective communication skills for interacting with stakeholders and explaining technical concepts to non-technical audiences.
4. Reliability and Scalability
- Service-Level Objectives (SLOs) and Service-Level Agreements (SLAs): Understanding of setting, monitoring, and maintaining SLOs and SLAs for system reliability.
- Scalability: Knowledge of best practices for designing and scaling systems to handle increased loads and demands.
- Redundancy and Resilience: Experience in designing systems with redundancy and fault tolerance to minimize downtime.
5. Security and Compliance
- Security Best Practices: Understanding of security principles, such as access control, data encryption, and secure coding practices.
- Compliance: Familiarity with compliance standards like GDPR, HIPAA, or PCI-DSS, depending on the industry.
Minimum Job Qualifications:
- Bachelor's degree in business or equivalent work experience
- 10 years of previous program leadership and/or relevant consulting experience
- Knowledge of, and demonstrated experience with, program management frameworks, knowledge groups, and life cycles