Research Engineer, Multimodal Companion Agent
At Google DeepMind, we value diversity of experience, knowledge, backgrounds and perspectives and harness these qualities to create extraordinary impact. We are committed to equal employment opportunities regardless of sex, race, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation, gender identity, pregnancy, or related condition (including breastfeeding) or any other basis as protected by applicable law. If you have a disability or additional need that requires accommodation, please do not hesitate to let us know.
Snapshot
We are seeking a highly motivated and innovative Research Engineer to join our team in Tokyo, focused on advancing the state of the art in multimodal companion agents. You will work with researchers and software engineers developing a cutting-edge companion agent. This will involve utilizing the latest advancements in large language models (LLMs), particularly in the multimodal domain (vision, audio, text). The focus will be on developing more capable, robust, factual, and helpful companion agents, with the potential to impact millions of users. In this role, you will have the opportunity to apply your expertise in areas such as LLM post-training and evaluation to create AI agents that can understand and interact with the world in unprecedented ways. This role offers a unique opportunity to collaborate with a world-class, cross-functional team at Google DeepMind, work on challenging problems, and develop innovative solutions in a dynamic and collaborative environment. If you are passionate about shaping the future of human-computer interaction through AI and are eager to make a significant impact in the rapidly evolving landscape of assistive technologies, we encourage you to apply.
About us
We believe artificial intelligence has the potential to revolutionize the way we live and interact with the world. At Google DeepMind, we’re a team of scientists, engineers, machine learning experts and more, working together to advance the state of the art in artificial intelligence. We use our technologies for widespread public benefit and scientific discovery, and collaborate with others on critical challenges, ensuring safety and ethics are the highest priority.
The role
As a Research Engineer at Google DeepMind, you will contribute to the development of the Gemini-powered multimodal companion, creating an immersive and engaging user experience across various domains, including education, health, gaming, and other areas. You will work on cutting-edge research that directly contributes to impactful products, pushing the boundaries of AI to create companion agents that can truly understand and respond to human needs.
Key responsibilities
- Implementation & Optimization: Translate research concepts into practical implementations by developing and optimizing multimodal AI models, and building and maintaining robust data pipelines for training and evaluation.
- Experimentation & Evaluation: Design, implement, and run experiments to evaluate the performance and robustness of multimodal companion AI agents, using appropriate metrics and techniques such as prompt engineering and few-shot learning.
- Contextual Interaction: Implement algorithms that enable the agent to analyze user interactions via vision and audio and provide contextually relevant assistance by voice.
- Collaboration & Knowledge Sharing: Work closely with research scientists and engineers, contributing to team discussions, sharing knowledge, and actively participating in code reviews to foster a collaborative environment.
- Innovation & Product Impact: Proactively identify and address technical challenges, stay updated on the latest AI advancements, and develop solutions that can be effectively integrated into Google products and services.
About you
You are a highly skilled and passionate engineer with a strong foundation in machine learning and a drive to push the boundaries of AI. You excel at translating research ideas into practical implementations and thrive in a collaborative environment. You are excited to work on ambitious challenges and make a positive impact on the world through your work. You are adaptable, thrive in a fast-paced, dynamic environment, and are comfortable with ambiguity.
- B.S. or M.S. in Computer Science, Artificial Intelligence, or a related field.
- Strong programming skills in Python. Experience with C++ is a plus.
- Solid understanding of deep learning, natural language processing, computer vision, and/or speech processing.
- Experience with relevant ML frameworks such as JAX, TensorFlow, or PyTorch.
- Experience implementing and evaluating machine learning models and/or LLMs.
- Excellent communication and collaboration skills.
In addition, the following would be an advantage:
- Ph.D. in Computer Science, Artificial Intelligence, or a related field.
- Experience with multimodal learning, large language models, and/or companion AI agents.
- Experience in prompt engineering, few-shot learning, post-training techniques, and evaluations with large language models.
- Familiarity with large-scale model training and deployment.
- Experience contributing to research publications or open-source projects.
- Experience working in a collaborative, cross-functional team environment, particularly across different time zones.
- Experience applying AI to complex systems or interactive environments.