Senior Site Reliability Engineer

Job Type: Contract

Job Category: IT

Job Description

Job Summary

The Senior Site Reliability Engineer will design, build, and maintain scalable, resilient, and high-performance infrastructure and systems. This role combines software engineering and systems engineering skills to ensure the reliability, availability, and efficiency of production environments while driving best practices in automation, observability, and incident response.

🛠️ Key Responsibilities

Design, implement, and manage highly available, fault-tolerant, and scalable systems.
Develop automation tools and frameworks to improve system reliability and reduce manual work.
Monitor, troubleshoot, and resolve production issues with a focus on incident response, root cause analysis, and post-mortems.
Partner with development teams to design systems with reliability, performance, and scalability in mind.
Maintain and improve observability (logging, monitoring, tracing, and alerting) for services and infrastructure.
Optimize CI/CD pipelines, infrastructure provisioning, and configuration management.
Develop and maintain strong security, compliance, and disaster recovery (DR) strategies.
Define and track service-level objectives (SLOs), service-level indicators (SLIs), and error budgets to guide reliability engineering practices.
Participate in an on-call rotation and provide mentorship to junior engineers.
Stay up to date with new technologies and best practices in SRE, DevOps, and cloud-native ecosystems.

🎓 Qualifications

Bachelor’s degree in Computer Science, Information Technology, or a related field (or equivalent work experience).
5+ years of experience in site reliability engineering, DevOps, or software engineering.
Strong expertise in cloud platforms (AWS, Azure, or GCP).
Proficiency in programming/scripting (Python, Go, Java, Bash, etc.).
Hands-on experience with containers and container orchestration (Docker, Kubernetes).
Solid knowledge of CI/CD tools (Jenkins, GitHub Actions, GitLab CI, ArgoCD, etc.).
Experience with monitoring and logging tools (Prometheus, Grafana, Datadog, New Relic, ELK, Splunk).
Strong background in infrastructure as code and configuration management (Terraform, Pulumi, Ansible, Helm).
Knowledge of networking, Linux systems, and security best practices.

🌟 Skills & Competencies

Strong analytical and problem-solving skills.
Ability to collaborate effectively with development, operations, and product teams.
Ability to handle high-pressure incidents with composure.
Passion for automation and continuous improvement.
Strong written and verbal communication skills.

🕒 Work Environment

Fast-paced technology environments (SaaS, cloud, fintech, e-commerce, healthcare IT, etc.).
Hybrid or remote work options, depending on the employer.
On-call rotation expected as part of the role.

#SRE #SiteReliabilityEngineering #CloudEngineering #DevOps #InfrastructureAsCode #AWS #GCP #Azure #Kubernetes #Automation
