DevOps Architect
$100,000 USD/year
Pay is set based on global value, not the local market. For most roles, annual pay = hourly rate × 40 hrs × 50 weeks.

Worldwide
Fully-remote
Full-time (40 hrs/week)
Flexible schedule
Long-term role

Description

You are the engineer who ensures over 50 SaaS products remain operational while others are still learning. We need DevOps professionals capable of exploring unknown AWS environments, restoring order from disorder, and driving uptime beyond 99.9% through authentic monitoring, genuine automation, and thorough root-cause analyses. Your work will involve breaking down complex projects into daily deliverables, deploying production-ready Python or JavaScript, and leveraging AI as your assistant.

Many organizations boast about their "cloud" capabilities while manually maintaining individual systems. We are systematizing reliability across numerous acquired products where original developers have departed and documentation remains incomplete. The challenge is compelling: you will employ agents and contemporary tools to examine unfamiliar systems 5–10× more rapidly, document your discoveries, and automate solutions to prevent recurring failures. Rather than judging you on certifications and vendor logos, we will observe how you troubleshoot in real time, compose a genuine 5-Whys analysis that identifies a single preventable root cause, and construct automations that endure in production environments.

This position is not an L2 "execute the playbook" role. Here, you author the playbooks, architect the deployment progression from development through staged environments to 10% and then 100% rollout with soak periods and rollback mechanisms, and create monitoring that detects edge-case scenarios. You decline hazardous changes before implementation. You distinguish between infrastructure failures you manage and application bugs that Engineering addresses, assigning permanent remediation to the appropriate team.

You will operate at the engineering center of reliability, managing infrastructure initiatives, incident management and root-cause analyses, and change requests accompanied by executable runbooks. If you have already managed a significant SaaS product and wish to extend that expertise across an entire fleet, join us. Bring advanced AWS knowledge, production-quality coding skills, rigorous scope discipline, and daily, essential use of AI tooling. If you are prepared to maintain operational continuity, please apply.

What you will be doing

  • Executing complex infrastructure migrations, consolidations, production-quality automations, and monitoring changes
  • Diagnosing production incidents, deploying immediate remediation, and authoring root cause analyses with permanent solutions assigned to responsible teams
  • Authoring, reviewing, and implementing production changes, including assessing whether proposed changes are safe for execution

What you will NOT be doing

  • Spending time in Jira and endless status meetings - we value individuals who can deliver solutions, not merely document problems
  • Preserving legacy systems without end - you will be empowered to implement substantive improvements
  • Waiting on bureaucratic approval processes - you will possess the authority to implement immediate fixes during incidents

Key responsibilities

  • Advance reliability and standardization of cloud infrastructure throughout our expanding product portfolio by deploying comprehensive monitoring, automation, and AWS best practices.

Candidate requirements

  • Extensive AWS infrastructure expertise (this is our core platform - experience with other clouds alone is insufficient)
  • Experience managing substantial production infrastructure and resolving production outages autonomously (not simply executing a runbook)
  • Experience scripting in Python and Bash for routine administrative operations
  • Experience administering and migrating production databases across multiple engines (including MySQL, Postgres, Oracle, MS SQL)
  • Experience with infrastructure automation (Terraform, Ansible, or CloudFormation)
  • Linux systems administration expertise

Meet a successful candidate

Anonymous  |  Elite Coder
Lebanon

Have you ever made so much money you had to remain anonymous to protect yourself? How about being able to fix an impossible coding problem i...

Applying for a role? Here’s what to expect.

Crossover's skill assessment process combines innovative AI power with decades of human research to take the guesswork, human bias, and pointless filters out of recruiting high-performing teams.

STEP 1: Chat-style screening interview.
STEP 2: Cognitive aptitude test.
STEP 3: Prove real-world job skills.
STEP 4: Interview with the hiring manager.
STEP 5: Pass proctored test.
STEP 6: Accept job offer.

Why Crossover

Recruitment sucks. So we’re fixing it.

The Olympics of work

It’s super hard to qualify—extreme quality standards ensure every single team member is at the top of their game.

Premium pay for premium talent

Over 50% of new hires double or triple their previous pay. Why? Because that’s what the best person in the world is worth.

Shortlist by skills, not bias

We don’t care where you went to school, what color your hair is, or whether we can pronounce your name. Just prove you’ve got the skills.


Join the world's largest community of AI-first remote workers.