Sr Staff R&D Engineer
Lucasfilm
Nicasio, CA
This is a Full Time Job
Job Summary:
The Skywalker Sound Development Group is seeking a highly accomplished Senior Staff AI/ML Research Engineer to lead the development of transformative audio intelligence technologies for global media production. This senior-level role is central to advancing our next-generation soundtrack platform, with a focus on speech processing, style transfer, upmixing, source separation, and generative audio synthesis.
You will architect, build, and optimize cutting-edge machine learning systems at scale, leveraging foundational models, neural vocoders, latent diffusion models, and advanced retraining workflows. As a core member of our applied R&D team, you will contribute to technical direction, collaborate across product and engineering, and deliver production-ready solutions that integrate seamlessly into creative and operational workflows for elite content creators worldwide.
This role is considered Hybrid, which means the employee will work onsite in our Nicasio, CA office and occasionally from home.
What You'll Do
• Lead the research, design, and implementation of state-of-the-art machine learning algorithms for speech processing, voice transfer, source separation, and upmixing in media post-production environments.
• Drive the architecture and deployment of scalable model training pipelines using PyTorch and distributed computing frameworks.
• Develop novel generative audio models, including latent diffusion, flow-based models, variational autoencoders, and neural vocoders, optimized for professional soundtrack production.
• Own end-to-end model lifecycle management: pretraining, fine-tuning, validation, inference optimization, and CI/CD integration.
• Guide the development of personalized model adaptation workflows to support per-user tuning, cross-project continuity, and flexible deployment.
• Collaborate with product, platform, and engineering leads to define integration strategies within a secure, cloud-optimized SaaS environment.
• Stay at the forefront of generative audio, multi-modal modeling, and self-supervised learning, translating emerging research into applied innovation.
• Contribute to internal tooling and infrastructure that improves iteration speed, reproducibility, and explainability of deployed models.
• Mentor junior researchers and engineers, and contribute to a culture of rigorous experimentation, collaboration, and continuous improvement.
What We're Looking For
• MSc or PhD in Computer Science, Electrical Engineering, Applied Math, or a related field with a focus on AI/ML and multi-modal signal processing.
• 5 years of professional experience in applied ML, with a deep focus on audio-centric AI/ML research and deployment.
• Expertise in building and scaling models using PyTorch, with fluency in training, fine-tuning, and inference for deep neural networks.
• Demonstrated experience developing generative models such as VAE, GAN, diffusion models, or neural vocoders (e.g., HiFi-GAN, WaveNet).
• Deep understanding of audio-specific ML domains, including source separation, speech enhancement, music processing, and cross-modal tasks.
• Experience with MLOps tooling (e.g., Weights & Biases, MLflow, Datachain), Docker-based containerization, and scalable infrastructure for distributed training.
• Fluency in audio signal processing fundamentals and the integration of DSP into ML pipelines.
• Proven ability to contribute to architectural planning, research strategy, and production deployment in complex, multi-stakeholder environments.
Preferred Qualifications
• Familiarity with audio/text/video multi-modal frameworks and cross-domain representations.
• Experience implementing real-time or near-real-time inference pipelines in cloud or edge environments (e.g., AWS, GCP, on-prem GPUs).
• Working knowledge of latent diffusion audio models (e.g., stable-audio, AudioLDM, AudioGen).
• Strong knowledge of industry-standard audio datasets and benchmarks (LibriSpeech, VCTK, MUSDB, etc.).
• Experience optimizing inference pipelines for creative applications or interactive use.
• Proficiency in lower-level audio frameworks (e.g., C/C++).
• Contributions to published research at top-tier conferences (NeurIPS, ICASSP, ICLR, Interspeech) and/or open-source ML frameworks.
The hiring range for this position in Nicasio, CA is $201,900 to $270,700 per year. The base pay actually offered will take into account internal equity and also may vary depending on the candidate's geographic region, job-related knowledge, skills, and experience among other factors. A bonus and/or long-term incentive units may be provided as part of the compensation package, in addition to the full range of medical, financial, and/or other benefits, dependent on the level and position offered.