Manager, Bare Metal Support Engineering

 

Description:

As a Manager of Bare Metal Support Engineering, you'll be at the center of ensuring our dedicated infrastructure remains stable, reliable, and performant. You'll lead daily support operations, triage incidents, drive escalations, and ensure that hardware is monitored, maintained, and delivered effectively for our clients. You'll oversee a team of experienced Systems Operations Engineers and help build a new team focused on our Bare Metal support model. This role balances tactical execution with operational maturity, working cross-functionally with engineering, product, and infrastructure teams to scale processes as we grow.

In This Role, You Will
 

  • Lead a skilled team responsible for maintaining and optimizing physical infrastructure across multiple client environments.
  • Build, develop, and lead a dedicated Infrastructure Support team focused on key infrastructure, escalation handling, and smooth hardware operations.
  • Oversee incident resolution and escalation management for infrastructure-related issues, and collaborate with internal teams to deliver effective solutions.
  • Improve support processes to enhance efficiency and reduce downtime, ensuring the infrastructure meets client expectations.
  • Work closely with product, infrastructure, and other teams to ensure seamless delivery of infrastructure resources.
  • Manage client communication during escalations and issue resolution to ensure transparency and client satisfaction.
  • Mentor team members, developing their skills to manage and maintain critical infrastructure effectively.
     

Who You Are
 

  • 5+ years of experience leading teams responsible for infrastructure support, data center operations, or physical compute environments.
  • Hands-on experience with Linux system administration and command-line tools.
  • Familiarity with hardware-level diagnostics, troubleshooting, and replacement (servers, power, cabling, etc.).
  • Experience working with high-performance rack-scale hardware, including CPU- and GPU-based compute nodes.
  • Understanding of GPU infrastructure (e.g., NVIDIA A100/H100 GPUs, PCIe/NVLink, liquid cooling) or a demonstrated ability to quickly learn and adapt to HPC environments.
  • Proven track record in incident and escalation management, with direct ownership of client or production-impacting issues.
  • Experience managing ticket-based workflows (Jira, Zendesk, etc.) in a high-urgency technical environment.
  • Comfortable interpreting and acting on metrics (MTTR, SLOs, backlog, ticket trends) to drive operational improvements.
  • Skilled in managing scheduling, shift coverage, and team logistics in 24/7 or hybrid support models.
  • Willingness to travel up to 30% annually.

Organization: CoreWeave
Industry: Management Jobs
Occupational Category: Manager
Job Location: London, UK
Shift Type: Morning
Job Type: Full Time
Gender: No Preference
Career Level: Experienced Professional
Experience: 5 Years
Posted at: 2026-02-25 1:59 am
Expires on: 2026-04-11