Site Reliability Engineer (Canada)

Company Uptime.com
Address Montreal, Quebec, Canada
Employment type Full-time
Salary
Category Technology, Information and Internet
Expires 2023-06-18
Posted 11 months ago
Job Description
Why We Need You:
We are looking for an experienced Site Reliability Engineer to join Uptime.com and help us build reliable, robust software solutions for our customers. As a Site Reliability Engineer, you will be responsible for monitoring system performance, troubleshooting technical issues, deploying code changes, and collaborating with other teams to ensure the best possible customer experience.
The ideal candidate has extensive experience in cloud infrastructure, distributed systems engineering, scripting, and automation, along with container tooling such as Docker and Kubernetes. You should also have the skills to manage service outages and keep systems available by writing scalable software.
What You Will Do:
  • Collaborate with other teams to ensure optimal performance of the system and its dependent resources
  • Design and assist in the setup and maintenance of application monitoring and alerting
  • Develop and support automation that allows for continuous testing of software created by the team
  • Manage monitoring tools such as Grafana and Prometheus, including deploying them and optimizing their use
  • Assist in designing and deploying HA/DR architecture for mission-critical workloads
  • Deploy releases of applications and services in collaboration with developers
  • Develop strategies for resolving performance issues and identify areas of improvement
  • Troubleshoot production outages and implement fault tolerance solutions
  • Maintain documentation related to system operation procedures
  • Monitor system health metrics to proactively identify potential bottlenecks or errors
  • Participate in on-call duty rotation
  • Document and test game-day scenarios
Requirements
What You Will Need:
  • Experience in defensive coding practices and patterns for high availability
  • Comfort working in a fast-paced agile environment where requirements change quickly and the team constantly adapts to moving targets
  • Expertise in Linux server administration and scripting languages (Python)
  • Deep understanding of modern microservices-based architectures and operations
  • Familiarity with configuration management tools
  • 5+ years of experience in SRE/DevOps roles
  • Proficiency in a modern language such as Go or Python
  • Bachelor's degree in Computer Science or relevant field preferred
  • Good communicator, able to clearly articulate complex issues and technologies
  • Knowledge of containerization technologies like Docker & Kubernetes
  • Excellent problem-solving skills and strong collaboration abilities
Benefits
How we will support your growth and success:
  • Work From Home
  • Training & Development
  • Professional development opportunities to further skills and knowledge
  • Partnership with executives, leadership, and cross-functional teams, including engineering, marketing, and business operations
  • Family Leave (Maternity, Paternity)
  • Health Care Plan (Medical, Dental & Vision)
  • A supportive team of passionate and dedicated individuals all focused on building the best monitoring service in the world.
  • Paid Time Off (Vacation, Sick & Public Holidays)
  • Discover the exciting world of monitoring, observability, and SRE while becoming an advocate and driving innovation in the industry.