Job Title: DevOps Engineer with SRE Capabilities
Location: Newport Beach, CA
Position Overview
We are seeking a highly skilled DevOps Engineer with Site Reliability Engineering (SRE) capabilities to join our team. The ideal candidate will possess strong technical expertise across modern DevOps tools, automation, cloud platforms, and observability practices. You will be responsible for ensuring the scalability, reliability, and performance of systems through automation, monitoring, troubleshooting, and collaboration with development teams.
Key Responsibilities
Core DevOps & Infrastructure
Create, manage, and optimize pipelines in Azure DevOps.
Build and manage CI/CD pipelines (end-to-end design and implementation).
Manage AWS services (EC2, S3, Lambda, RDS, CloudWatch, IAM).
Build, optimize, and troubleshoot Docker containers.
Manage Kubernetes clusters (troubleshooting, scaling, optimization).
Conduct code reviews, debugging, and performance optimization in Core Java and Python.
Develop automation scripts, tools, and API integrations.
Automate Windows administration and system management using PowerShell.
Automate configuration management with Ansible.
Quality & Security Tools
Manage repositories and artifact operations with JFrog Artifactory.
Ensure code quality and remediate findings using SonarQube.
Observability & Monitoring
Monitor application performance using AppDynamics, Grafana, and Zabbix.
Experience with Datadog or Dynatrace (preferred).
Conduct log analysis with ELK Stack, Splunk, or equivalents.
Monitor infrastructure using Prometheus, CloudWatch, or similar.
Essential Professional Competencies
Problem-Solving & Critical Thinking
Perform root cause analysis to identify and resolve complex issues.
Lead incident response, including escalation and post-mortem analysis.
Proactively optimize performance and resolve bottlenecks.
Apply capacity planning strategies to ensure scalability.
Development Collaboration
Conduct thorough code reviews and debugging.
Collaborate with developers to resolve application-level issues.
Use APM tools for profiling and resolving performance issues.
Implement and maintain security best practices across the pipeline.
Self-Direction & Initiative
Research, analyze, and solve issues independently.
Communicate proactively on status, blockers, and recommendations.
Stay updated with industry trends and best practices.
Create and maintain clear technical documentation.
Behavioral Expectations
Self-Motivated: Own tasks and drive them to completion.
Resourceful: Utilize documentation, community, and colleagues to find solutions.
Proactive: Anticipate issues before they occur.
Quality-Focused: Deliver work that meets standards without extensive rework.
Communication & Collaboration
Communicate technical concepts clearly to both technical and non-technical stakeholders.
Ask clarifying questions to ensure accurate understanding.
Contribute to team knowledge sharing through documentation and mentoring.
Qualifications
Minimum Competency Level: E2 (Medium) across all listed technologies.
Strong experience in DevOps, Infrastructure, Automation, Monitoring, and SRE practices.
Hands-on expertise with cloud platforms, CI/CD, observability tools, and scripting languages.