POSTED Mar 4

DevOps Intern - AI Infrastructure

at NVIDIA (China, Shanghai)

We are now looking for a DevOps Intern - AI Infrastructure!

NVIDIA is hiring engineers & interns to scale up its AI infrastructure. You will need to have strong programming skills, a deep understanding of cloud technologies, orchestration & automation systems, data centers and cloud architecture, as well as excellent communication and planning skills. You and other specialists in this team will help advance NVIDIA's capacity to build and deploy leading solutions for a broad range of AI-based applications such as autonomous vehicles, healthcare, virtual reality, graphics engines and visual computing.

This is a challenging and exciting role in the AI Infrastructure Software team that gives you a chance to create and scale out a new product category. We are a dynamic, startup-like environment with a strong focus on execution, flexibility and teamwork. We are looking for highly motivated software engineers who share our passion for building phenomenal software.

NVIDIA is at the forefront of the DL and AI revolutions. Come join us as we craft the future of Deep Learning on NVIDIA GPUs.

What you’ll be doing:

  • Collaborate with multiple AI product teams to understand their data and compute requirements (cars, healthcare, etc.)

  • Work with the data center team to drive cluster needs

  • Collaborate with AI applied researchers and leaders to build future-proof infrastructure

  • Build infrastructure and tools that will increase the productivity of teams developing AI-based systems (training of deep learning / reinforcement learning systems)

  • Enable the development team by providing automated build and test solutions in simulation environments using AWS, Docker, and physical deep learning machines

  • Maintain version control schemas to track development, staging, and production code using git

  • Orchestrate create/delete/upgrade of live systems using maintenance windows, HA failover, and immutable infrastructure patterns

  • Collaborate with multiple teams and domain experts to integrate multiple NVIDIA products into the CI workflow

  • Automate complex tasks and improve the efficiency of functional, white box and black box automated tests

What we want to see:

  • Bachelor or Master of Science (or equivalent) in Computer Science, Computer Engineering, or a related field

  • Solid technical foundation in automation, cloud infrastructure and orchestration, including experience with at least one orchestration system (Kubernetes, Swarm, Mesos, Marathon, Aurora, etc.)

  • You have experience with microservices and ETL jobs

  • You have experience with cloud automation tools (Ansible, Terraform, etc.)

  • AWS: EC2, S3, RDS, ECS, CloudFront, VPC, or equivalents in Aliyun

  • CI/CD: Jenkins, GitHub, GitLab, etc.

  • Programming: Python, Bash, Go, JavaScript

  • Linux: Debian package management, Docker, systemd

  • Networking: Linux firewall, PXE, NFS, ZFS, CIFS

  • Excellent communication and interpersonal skills

Ways to stand out from the crowd:

  • Excellent data analysis skills and the demonstrated ability to solve complex issues involving multiple software or hardware components

  • A team player who loves working in a team environment

NVIDIA is widely considered to be one of the technology world’s most desirable employers. We have some of the most brilliant and talented people on the planet working for us. If you're creative and autonomous, we want to hear from you!

#deeplearning

Please mention that you found this job on Moaijobs. This helps us get more companies to post here. Thanks!
