About Us:
Stability AI is a community- and mission-driven open artificial intelligence company that cares deeply about real-world implications and applications. Our most significant advances come from working across diverse teams and disciplines. We are unafraid to go against established norms and explore creativity. We are motivated to generate breakthrough ideas and convert them into tangible solutions. Our vibrant communities consist of experts, leaders, and partners across the globe who are developing cutting-edge open AI models for Image, Language, Audio, Video, and 3D.
Job Description:
Stability AI’s Security team is looking for a Site Reliability Engineer (SRE) to help shape our cloud infrastructure. The person will work closely with IT, security, SRE, and engineering teams to improve reliability across our environment. Candidates should have the initiative to build and improve a maturing cloud landscape.
Responsibilities:
- Implementing and maintaining infrastructure as code using Terraform.
- Supporting container orchestration platforms such as Kubernetes or ECS.
- Participating in incident management and root cause analysis to improve system reliability.
- Contributing to cloud security practices and resource tagging strategies.
Qualifications:
- Experience collaborating with development teams to enhance CI/CD pipelines.
- Cloud security experience.
- Experience training and working with generative models.
- Background in software development or automation scripting.
- Knowledge of Grafana, ELK stack, or similar tools.
- Involvement in the SRE or DevOps community.
Equal Employment Opportunity:
We are an equal opportunity employer and do not discriminate on the basis of race, religion, national origin, gender, sexual orientation, age, veteran status, disability or other legally protected statuses.