Nebius

Technical Product Manager – AI Compute Platform

US • Full Time • Senior • Remote • Posted Today

Job Description

About Nebius: Nebius is leading a new era in cloud infrastructure for the global AI economy. We are building a full-stack AI cloud platform that supports developers and enterprises from data and model

Key Highlights

  • Hardware platforms & launch — bringing new GPU and CPU platforms (GB300, Vera Rubin, ARM/Grace, future generations) to production with full launch readiness across the stack.
  • Cluster lifecycle & fleet operations — new region launches, 100,000+ GPU cluster bring-up, platform sharding and allocation architecture, release engineering, host-lifecycle automation, operational efficiency.
  • Reliability & Mission Control — autohealing, health checks, SLA, fault-tolerant training, MTTR reduction, customer trust at scale, observability as a product.
  • Customer experience & developer surface — Compute APIs, console, CLI, IMDS and in-VM signals, self-service workflows, notifications, customer-facing observability, unified UX across the product line.
  • GPU & InfiniBand foundational services — drivers, firmware, NCCL, IB/RoCE, NVLink topology, the foundational layer everything else builds on.

Skills & Technologies

Kubernetes • AWS • GCP • Azure

Job Details

Employment Type: Full Time
Experience Level: Senior
Location: US
Work Mode: Remote
Posted: Today