You are the engineer who keeps more than 50 SaaS products running while others are still learning. We need DevOps professionals who can explore unfamiliar AWS environments, bring order to disorder, and push uptime beyond 99.9% through real monitoring, real automation, and thorough root-cause analyses. You will break complex projects into daily deliverables, ship production-ready Python or JavaScript, and use AI as your assistant.
Many organizations boast about their "cloud" capabilities while manually maintaining individual systems. We are systematizing reliability across numerous acquired products where original developers have departed and documentation remains incomplete. The challenge is compelling: you will employ agents and contemporary tools to examine unfamiliar systems 5–10× more rapidly, document your discoveries, and automate solutions to prevent recurring failures. Rather than judging you on certifications and vendor logos, we will observe how you troubleshoot in real time, compose a genuine 5-Whys analysis that identifies a single preventable root cause, and construct automations that endure in production environments.
This position is not an L2 "execute the playbook" role. Here, you author the playbooks, architect the deployment progression from development through staged environments to 10% and then 100% rollout with soak periods and rollback mechanisms, and create monitoring that detects edge-case scenarios. You decline hazardous changes before implementation. You distinguish between infrastructure failures you manage and application bugs that Engineering addresses, assigning permanent remediation to the appropriate team.
You will operate at the engineering center of reliability, managing infrastructure initiatives, incident management and root-cause analyses, and change requests accompanied by executable runbooks. If you have already managed a significant SaaS product and wish to extend that expertise across an entire fleet, join us. Bring advanced AWS knowledge, production-quality coding skills, rigorous scope discipline, and daily, essential use of AI tooling. If you are prepared to maintain operational continuity, please apply.
Crossover's skill assessment process combines innovative AI power with decades of human research to take the guesswork, human bias, and pointless filters out of recruiting high-performing teams.

It’s super hard to qualify—extreme quality standards ensure every single team member is at the top of their game.
Over 50% of new hires double or triple their previous pay. Why? Because that’s what the best person in the world is worth.
We don’t care where you went to school, what color your hair is, or whether we can pronounce your name. Just prove you’ve got the skills.
Join the world's largest community of AI-first remote workers.