Senior Site Reliability Engineer Jobs in Halifax, Nova Scotia, Canada

Senior Site Reliability Engineer - ResMed Inc

Halifax, Nova Scotia, Canada
via Jobleads.com

Job Description

Senior Site Reliability Engineer

Locations: Halifax, Canada; San Diego, CA, United States
Time type: Full time
Posted: 4 days ago
Job requisition ID: JR_033768

ResMed is seeking a Sr. Site Reliability Engineer (SRE) to help define and execute a Site Reliability Engineering strategy for its rapidly expanding Digital Health Technology group. You will use your software engineering expertise to continually automate processes and innovate to improve the reliability of the system. You will plan, design, build, and maintain large-scale engineering solutions. Whether a bug fix or an awesome feature, you will own your work and deliver the most elegant and scalable solutions.

Let’s talk about Responsibilities.

• Monitoring and metrics: establishing desired service behavior, measuring how the service is actually behaving (availability, latency, and overall system health), and correcting discrepancies
• Emergency response: noticing and responding effectively to service failures in order to preserve the service's conformance to its SLA (service-level agreement)
• Simplifying and automating deployment processes and run-time operations, and providing non-disruptive releases
• Providing technical advice to other engineers to help them grow and deliver high-quality work faster
• Capacity planning: projecting future demand and ensuring that a service has enough computing resources in appropriate locations to satisfy that demand
• Service turn-up and turn-down: deploying and removing computing resources for a service in a data center in a predictable fashion, often as a consequence of capacity planning
• Scaling systems sustainably through mechanisms such as automation
• Participating in planning discussions with Product Development and other IT teams
• Maintaining expertise in architecture, including industry trends, strategies, and products, to ensure that our assets are used effectively and efficiently
• Evolving systems by pushing for changes that improve reliability and velocity
• Conducting incident responses and blameless postmortems

Let’s talk about Qualifications and Experience

Required:
• Bachelor's degree in Computer Science, Information Systems, or an equivalent technical discipline; minimum 8 years of working experience in an enterprise 24/7 production environment supporting critical, real-time applications
• Minimum 4 years of experience focused on site reliability for high-traffic applications
• Systematic problem-solving approach, combined with strong communication skills and a sense of ownership
• Cloud programming experience and comfort with working in multiple languages as required (please note we mainly use Python and Java)
• Expert full-stack debugging and performance optimization ability, including hands-on knowledge of AWS
• Extensive experience with monitoring tools such as Datadog and AWS native monitoring
• Track record of monitoring and analyzing system performance and isolating issues or bottlenecks that could impact reliability, performance, and scalability (we mainly use Datadog and CloudWatch)
• Performance engineering mindset: design, development, and engineering related to scalability, isolation, latency, throughput, and efficiency
• Good verbal and written communication skills, and the ability to work effectively with geographically remote teams

Good to have:

• Ability to write and maintain Terraform and Lambda code in the AWS environment
• Experience supporting CI/CD pipelines with GitHub
• Strong exposure to and use of AWS EKS
• Experience using Atlassian tools such as Confluence and Jira
• Understanding of the Product Development Life Cycle, including Agile Scrum, TDD, and BDD
• Experience with Machine Learning

Joining us is more than saying “yes” to making the world a healthier place. It’s discovering a career that’s challenging, supportive and inspiring, where a culture driven by excellence helps you not only meet your goals, but also create new ones. We focus on creating a diverse and inclusive culture and encouraging individual expression in the workplace, and we thrive on the innovative ideas this generates. If this sounds like the workplace for you, apply now! We commit to respond to every applicant.

About Us

At ResMed (NYSE: RMD, ASX: RMD) we pioneer innovative solutions that treat and keep people out of the hospital, empowering them to live healthier, higher-quality lives. Our digital health technologies and cloud-connected medical devices transform care for people with sleep apnea, COPD and other chronic diseases. Our comprehensive out-of-hospital software platforms support the professionals and caregivers who help people stay healthy in the home or care setting of their choice. By enabling better care, we improve quality of life, reduce the impact of chronic disease and lower costs for consumers and healthcare systems in more than 140 countries. To learn more, visit ResMed.com and follow @ResMed.

ResMed Corporation is an equal opportunity employer and provides equal opportunity in employment for all qualified persons, without regard to sex, gender identity, sexual orientation, race, color, religion, national origin, disability, protected Veteran status, age, or any other characteristic protected by law.

(US/Canada only) ResMed is an equal opportunity/affirmative action employer. ResMed is an E-Verify Employer. ResMed is a smoke-free workplace.

We are a 2024 Circle Back Initiative Employer – we commit to respond to every applicant!
