4 days ago

Member of Technical Staff - Audio-Language Model Data

Liquid AI, an MIT spin-off, is a foundation model company headquartered in Boston, Massachusetts. Our mission is to build capable and efficient general-purpose AI systems at every scale.

Our goal at Liquid is to build the most capable AI systems to solve problems at every scale, so that users can build, access, and control their own AI solutions. This ensures that AI is meaningfully, reliably, and efficiently integrated across all enterprises. In the long term, Liquid will create and deploy frontier-AI-powered solutions that are available to everyone.

We are seeking a highly skilled Member of Technical Staff - Audio-Language Model Data to play a critical role in the development of Liquid Audio-Language models. This role focuses on gathering high-quality audio-text pre-training and SFT datasets.

Key Responsibilities

  • Create and maintain data cleaning, filtering, and selection pipelines that can handle audio-text data.
  • Track the release of public, high-quality audio (ASR and SFT) datasets.
  • Create and maintain a synthetic data generation pipeline to produce task-specific audio SFT data.
  • Work with the multimodal audio team to run ablations on new datasets.

Required Qualifications

  • Experience Level: B.S. + 5 years of experience, M.S. + 3 years of experience, or Ph.D. + 1 year of experience.
  • Dataset Engineering: Expertise in data curation, cleaning, augmentation, and synthetic data generation techniques.
  • Machine Learning Expertise: Ability to write and debug models in popular ML frameworks, and experience working with LLMs and VLMs.
  • Software Development: Strong programming skills in Python, with an emphasis on writing clean, maintainable, and scalable code.

Preferred Qualifications

  • M.S. or Ph.D. in Computer Science, Electrical Engineering, Math, or a related field.
  • Experience training text-to-speech (TTS) or translation models.
  • 2+ years working with audio data.
  • First-author publications at top ML or audio conferences (e.g., NeurIPS, ICML, ICLR, ICASSP, Interspeech).
  • Contributions to popular open-source projects.