
Nebius
Technical Product Manager – AI Compute Platform
US • Full Time • Senior • Remote • Posted Today
Job Description
About Nebius: Nebius is leading a new era in cloud infrastructure for the global AI economy. We are building a full-stack AI cloud platform that supports developers and enterprises from data and model
Key Highlights
- Hardware platforms & launch — bringing new GPU and CPU platforms (GB300, Vera Rubin, ARM/Grace, future generations) to production with full launch readiness across the stack.
- Cluster lifecycle & fleet operations — new region launches, 100,000+ GPU cluster bring-up, platform sharding and allocation architecture, release engineering, host-lifecycle automation, operational efficiency.
- Reliability & Mission Control — autohealing, health checks, SLA, fault-tolerant training, MTTR reduction, customer trust at scale, observability as a product.
- Customer experience & developer surface — Compute APIs, console, CLI, IMDS and in-VM signals, self-service workflows, notifications, customer-facing observability, unified UX across the product line.
- GPU & InfiniBand foundational services — drivers, firmware, NCCL, IB/RoCE, NVLink topology, the foundational layer everything else builds on.
Skills & Technologies
Kubernetes, AWS, GCP, Azure
About the Company
Nebius
Job Details
Employment Type: Full Time
Experience Level: Senior
Location: US
Work Mode: Remote
Posted: Today