POSTED Jan 21

Member of Technical Staff, Data Pipelines

at Cohere, London / Remote

Who are we?
Cohere is focused on building large language model (LLM) AI and deploying it into enterprises in a safe and responsible way that drives human productivity, creates magical new ways to interact with technology, and delivers real business value. We’re a team of highly motivated and experienced engineers, innovators, and disruptors looking to change the face of technology.

Our goals are ambitious, but also concrete and practical. Cohere wants to fundamentally change how businesses operate, making everyone more productive and able to focus on doing better what they do best. Every day, our team breaks new ground as we build transformational AI technology and products that let enterprises and developers harness the power of LLMs.

Cohere was founded by three global leaders in AI development, including our CEO, Aidan Gomez, who co-created the Transformer, which makes LLMs possible. Collectively, we're driven by the belief that our technology has the potential to revolutionize the way enterprises, their employees, and customers engage with technology through language.

Cohere’s broader research team is world-renowned, having contributed to the development of sentence transformers for semantic search, dynamic adversarial data collection and red teaming, and retrieval augmented generation, often referred to as “RAG,” among other technological breakthroughs.

We have been deliberate in assembling a team of operational leaders with industry-leading experience, with backgrounds working at the most sophisticated, demanding, and respected enterprises in the world. Cohere’s operational leaders have built, scaled, and led multi-billion-dollar product lines and businesses at Google, Apple, Rakuten, YouTube, AWS, and Cisco.

The Cohere team is a collective from all walks of life, from people who left college to start businesses, to some of the most experienced people from globally renowned companies. We believe a diverse team is the key to a safer, more responsible technology, and that different experiences and backgrounds enable us to tackle problems from all angles and avoid blind spots.

There’s no better time to play a role in defining the future of AI and its impact on the world.

Why this role?

We are seeking a Member of Technical Staff to join our Data Pipeline team at Cohere. Our team is responsible for handling the data annotation streams, ensuring data quality, and improving our large language models. As a member of this team, you'll play a crucial role in ensuring the high quality and accuracy of our models by designing and implementing innovative techniques for data ingestion, annotation, and integration into model training and evaluation pipelines. You will also contribute to the development of new methods to enhance the safety, efficiency, and effectiveness of our models.

Please Note: We have offices in Toronto, Palo Alto, and London but embrace being remote-first! There are no restrictions on where you can be located for this role.

As a Member of Technical Staff, Data Pipelines, you will:

    • Work on dataset ablations and other data quality efforts to ensure data diversity, coverage, and consistency
    • Evaluate the impact of the data on our models and continuously improve the data quality
    • Contribute to synthetic, adversarial, and state-of-the-art data collection efforts, including data collection in specialized domains such as code, math, and red-teaming
    • Develop and optimize data pipelines that efficiently handle the ingestion, annotation, and integration of large datasets for model training and evaluation
    • Enhance and develop infrastructure for data management, pipeline orchestration, data validation, and MLOps
    • Train large-scale models on massive datasets, leveraging distributed computing and optimization techniques to achieve best-in-class performance
    • Collaborate with modeling and product teams to identify, prioritize, and secure new data sources

Skills & Qualifications:

    • 5+ years of experience in managing and analyzing large datasets and evaluating the performance of machine learning models
    • Proficiency in Python and familiarity with relevant ML frameworks and compiler tooling such as TensorFlow, JAX, and XLA/MLIR, along with experience using large-scale distributed training strategies
    • Exposure to autoregressive sequence models, such as Transformers
    • Experience with annotated data orchestration, including managing and analyzing large datasets, applying evaluation methods, running experiments, and benchmarking performance
    • Excellent communication and collaboration skills, allowing you to work effectively across diverse teams and domains

If some of the above doesn’t line up perfectly with your experience, we still encourage you to apply! If you consider yourself a thoughtful worker, a lifelong learner, and a kind and playful team member, Cohere is the place for you.

We value and celebrate diversity and strive to create an inclusive work environment for all. We welcome applicants of all kinds and are committed to providing an equal opportunity process. Cohere provides accessibility accommodations during the recruitment process. Should you require any accommodation, please let us know and we will work with you to meet your needs.

Our Perks:
🤝 An open and inclusive culture and work environment 
🧑‍💻 Work closely with a team on the cutting edge of AI research 
🍽 Weekly lunch stipend, in-office lunches & snacks
🦷 Full health and dental benefits, including a separate budget to take care of your mental health 
🐣 100% Parental Leave top-up for 6 months for employees based in Canada, the US, and the UK
🎨 Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement
🏙 Remote-flexible, with offices in Toronto, Palo Alto, San Francisco, and London, plus a co-working stipend
✈️ 6 weeks of vacation

Note: This post is co-authored by both Cohere humans and Cohere technology.
