Job Type: Contract
Job Category: IT

Job Description

Job Summary

The Senior Site Reliability Engineer will design, build, and maintain scalable, resilient, and high-performance infrastructure and systems. This role combines software engineering and systems engineering skills to ensure the reliability, availability, and efficiency of production environments while driving best practices in automation, observability, and incident response.


🛠️ Key Responsibilities

  • Design, implement, and manage highly available, fault-tolerant, and scalable systems.

  • Develop automation tools and frameworks to improve system reliability and reduce manual work.

  • Monitor, troubleshoot, and resolve production issues with a focus on incident response, root cause analysis, and post-mortems.

  • Partner with development teams to design systems with reliability, performance, and scalability in mind.

  • Maintain and improve observability (logging, monitoring, tracing, and alerting) for services and infrastructure.

  • Optimize CI/CD pipelines, infrastructure provisioning, and configuration management.

  • Define and maintain strong security, compliance, and disaster recovery (DR) strategies.

  • Define and track service-level objectives (SLOs), service-level indicators (SLIs), and error budgets to guide reliability engineering practices.

  • Participate in an on-call rotation and provide mentorship to junior engineers.

  • Stay up to date with new technologies and best practices in SRE, DevOps, and cloud-native ecosystems.


🎓 Qualifications

  • Bachelor’s degree in Computer Science, Information Technology, or a related field (or equivalent work experience).

  • 5+ years of experience in site reliability engineering, DevOps, or software engineering.

  • Strong expertise in cloud platforms (AWS, Azure, or GCP).

  • Proficiency in programming/scripting (Python, Go, Java, Bash, etc.).

  • Hands-on experience with containers and container orchestration (Docker, Kubernetes).

  • Solid knowledge of CI/CD tools (Jenkins, GitHub Actions, GitLab CI, ArgoCD, etc.).

  • Experience with monitoring and logging tools (Prometheus, Grafana, Datadog, New Relic, ELK, Splunk).

  • Strong background in infrastructure as code and configuration management (Terraform, Pulumi, Ansible, Helm).

  • Knowledge of networking, Linux systems, and security best practices.


🌟 Skills & Competencies

  • Strong analytical and problem-solving skills.

  • Ability to collaborate effectively with development, operations, and product teams.

  • Ability to handle high-pressure incidents with composure.

  • Passion for automation and continuous improvement.

  • Strong written and verbal communication skills.


🕒 Work Environment

  • Fast-paced technology environments (SaaS, cloud, fintech, e-commerce, healthcare IT, etc.).

  • Hybrid/remote work options, depending on the employer.

  • On-call rotation expected as part of the role.


#SRE #SiteReliabilityEngineering #CloudEngineering #DevOps #InfrastructureAsCode #AWS #GCP #Azure #Kubernetes #Automation

Required Skills
DevOps Engineer Senior Email Security Engineer
