Senior DevOps Engineer
$100,000 USD/year. Pay is set based on global value, not the local market. Most roles = hourly rate x 40 hrs x 50 weeks.

Worldwide
Fully remote
Full-time (40 hrs/week)
Flexible schedule
Long-term role

Description

You're the engineer who stabilizes production when no one else knows where to start. We need DevOps engineers capable of stepping into unknown AWS environments, restoring order from disorder, and driving uptime beyond 99.9% through rigorous monitoring, automation, and root-cause analysis. You'll break down complex projects into executable one-day increments, deliver production-ready Python or JavaScript, and leverage AI as an accelerant.

Most organizations talk about "cloud transformation" while hand-holding fragile systems. We're building industrial-grade reliability across dozens of acquired SaaS offerings where original engineers have departed and documentation is incomplete. The challenge: you'll deploy agents and cutting-edge tooling to map unfamiliar systems 5–10x faster, document your findings in code, and automate remediations so repeat incidents become impossible. Rather than judging you on certifications and vendor badges, we'll observe how you diagnose live issues, produce a meaningful 5-Whys analysis that identifies a single preventable root cause, and construct automations resilient enough for production deployment.

This is not a tier-two "execute the runbook" position. Here, you author the runbooks, architect the deployment pipeline from development through staging to canary and full rollout with soak periods and rollback triggers, and instrument the monitoring that surfaces obscure failure modes. You reject hazardous changes before they reach production. You distinguish infrastructure failures under your ownership from application bugs that belong to Engineering, and you route permanent remediations to the correct team.

You'll operate at the engineering heart of reliability, driving infrastructure initiatives, incident management and RCAs, and change execution with copy-paste-ready runbooks. If you've already run a significant SaaS platform and are ready to apply that expertise across an entire portfolio, this is your opportunity. Bring deep AWS knowledge, production-caliber coding ability, disciplined scope management, and daily, mission-critical use of AI tooling. If you're prepared to keep systems running, apply now.

What you will be doing

  • Advanced infrastructure migrations, consolidations, production-quality automation, and monitoring system modifications
  • Diagnosing production incidents, deploying immediate remediations, and authoring root cause analyses with permanent fixes allocated to owning teams
  • Authoring, reviewing, and deploying production changes, including assessing the safety profile of proposed modifications

What you will NOT be doing

  • Spending time in Jira and endless status calls - we prioritize people who deliver solutions, not those who merely document issues
  • Preserving legacy systems forever - you'll be authorized to implement substantial improvements
  • Waiting for bureaucratic approval workflows - you'll possess the authority to deploy immediate fixes during active incidents

Key responsibilities

  • Advance reliability and cloud infrastructure standardization across our expanding product suite by deploying comprehensive monitoring, automation, and AWS best practices.

Candidate requirements

  • Extensive AWS infrastructure knowledge (this is our core platform - expertise in other clouds alone is insufficient)
  • Track record owning substantial production infrastructure and resolving production outages autonomously (not simply executing a runbook)
  • Proficiency scripting in Python and Bash for routine administrative operations
  • Experience administering and migrating production databases across multiple engines (including MySQL, PostgreSQL, Oracle, and Microsoft SQL Server)
  • Hands-on experience with infrastructure automation (Terraform, Ansible, or CloudFormation)
  • Linux systems administration proficiency

Meet a successful candidate

Anonymous  |  Elite Coder
Lebanon

Have you ever made so much money you had to remain anonymous to protect yourself? How about being able to fix an impossible coding problem i...

Applying for a role? Here’s what to expect.

Crossover's skill assessment process combines innovative AI power with decades of human research to take the guesswork, human bias, and pointless filters out of recruiting high-performing teams.

STEP 1: Chat-style screening interview.
STEP 2: Cognitive aptitude test.
STEP 3: Prove real-world job skills.
STEP 4: Interview with the hiring manager.
STEP 5: Pass proctored test.
STEP 6: Accept job offer.

Why Crossover

Recruitment sucks. So we’re fixing it.

The Olympics of work

It’s super hard to qualify: extreme quality standards ensure every single team member is at the top of their game.

Premium pay for premium talent

Over 50% of new hires double or triple their previous pay. Why? Because that’s what the best person in the world is worth.

Shortlist by skills, not bias

We don’t care where you went to school, what color your hair is, or whether we can pronounce your name. Just prove you’ve got the skills.

Join the world's largest community of AI-first remote workers.