Senior Cloud Operations Engineer

Job ID: 3362

Description

When you relentlessly push the limits of cloud infrastructure, something eventually breaks. That’s why we engage the brightest and best talent to rapidly and permanently fix operational issues for our portfolio of more than 100 SaaS products. We know that a diving save is exciting, yet we also need heroes who ward off potential attacks and shore up vulnerabilities before they cause trouble. Are you a cloud operations superstar who can do that? 

WHO WE ARE LOOKING FOR

Our continuously growing portfolio of 100+ products and expanding infrastructure scale means you will always be tackling new and different issues. We marshal teams composed of individuals with the following attributes:

  • You crave variety and are energized by novel and complex problems.
  • You are an expert troubleshooter who embraces the challenge of maintaining live systems of massive scale.
  • You want to be part of a smart, capable, and tenacious team that uncovers hidden issues and delivers the right solution the first time. 

If that describes you, then this is your opportunity to join a global cloud operations organization unlike any other.

YOU WILL LOVE THIS JOB IF

  • You enjoy learning and applying new knowledge and skills to tackle unfamiliar and challenging problems.  
  • You revel in playing detective and are confident in your ability to recognize clues and evidence that point toward the solution.
  • You thrive in situations where people are relying on you to quickly save the day.
  • You value permanent solutions over temporary fixes.

YOU WILL HATE THIS JOB IF

  • You only want to grow your expertise in a single product or technology stack.
  • High-stakes outages and urgent timelines stress you out rather than giving you a jolt of energy.
  • You can’t be bothered to carefully document your work.

What you will be doing

In this role, you will act as both a skilled firefighter and an ace detective of cloud-based infrastructure incidents. Your top-notch troubleshooting skills will be enlisted to combat the unique incidents encountered within our large and diverse infrastructure. Your expertise in SaaS Operations will be called upon to analyze and locate root causes, recognize and address systemic factors, and diagnose and mitigate weaknesses before they become disruptive.

What you will NOT be doing

  • Fixing product code
  • Writing product documentation or runbooks
  • Interacting with external customers

Key Responsibilities

This role is part of a team responsible for maintaining mission-critical SaaS infrastructure for thousands of enterprise users across the globe. In this role, you will:

  • Respond to alerts and L2 service requests, resolving incidents to ensure system uptime and expected service levels
  • Analyze cloud systems issues and provide recommendations and supporting data to monitor, prevent, simplify, and/or automate
  • Execute change requests that impact production systems used by our diverse portfolio of products
  • Provide 24x7 cloud operations support on a rotating, on-call schedule as part of a global SaaS Operations team

Candidate Requirements

We hire candidates with expertise in infrastructure administration and proven capabilities supporting cloud technologies. This is indicated by: 

  • Bachelor's degree in Computer Science or related technical field
  • Expertise in cloud infrastructure troubleshooting
  • Proven experience writing scripts in any language (e.g., Python, PHP, bash, SQL) and knowledge of software coding principles and methods
  • Broad-based knowledge of all IT functions and their interrelationships (incl. DevOps, Security, DB, Infrastructure)
  • Clutch Performance - the ability to concentrate and deliver high-quality work under pressure
  • English language proficiency (written and spoken)


Nice to have

  • AWS Cloud Practitioner or SysOps Admin Certification