POSTED Jul 30

Senior Site Reliability Engineer (SRE)

at Stability AI, United States


About Us:
Stability AI is a community- and mission-driven open artificial intelligence company that cares deeply about real-world implications and applications. Our most significant advances come from the diversity of working across multiple teams and disciplines. We are unafraid to go against established norms and explore creativity. We are motivated to generate breakthrough ideas and convert them into tangible solutions. Our vibrant communities consist of experts, leaders, and partners across the globe who are developing cutting-edge open AI models for Image, Language, Audio, Video, and 3D.

Job Description:
Stability AI’s Security team is looking for a Senior Site Reliability Engineer (SRE) who will play a pivotal role in improving and shaping our cloud infrastructure. This person will work closely with IT, security, product, and engineering teams to drive innovation and reliability in an evolving environment. Candidates should have the initiative to build and improve a maturing cloud landscape.

Responsibilities:

  • Developing and enforcing SRE best practices and standards across the organization.
  • Implementing and maintaining infrastructure as code using Terraform.
  • Architecting and managing scalable AWS environments, focusing on high availability and resilience.
  • Setting up and refining monitoring, logging, and alerting systems.
  • Driving incident management and root cause analysis to improve system reliability.
  • Championing SRE principles and mentoring junior team members.

Qualifications:

  • Experience collaborating with development teams to enhance CI/CD pipelines.
  • Cloud security experience.
  • Experience training and working with generative models.
  • Background in software development or automation scripting.
  • Knowledge of Grafana, ELK stack, or similar tools.

Equal Employment Opportunity:
We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or other legally protected statuses.

