Description:
You will architect, train, and deploy end-to-end large behaviour models for bi-manual and mobile manipulation, and lead the maturation of our early-stage RL pipeline.
Key Responsibilities
- Architect, train, and evaluate end-to-end large behaviour models for bi-manual and mobile manipulation
- Advance diffusion transformer policies, mature VLA integration, and develop language conditioning for true multi-task generalisation
- Apply RL to refine pre-trained policies: RL token fine-tuning, residual RL, off-policy RL with reference-action regularisation, RL-based fine-tuning of diffusion policies
- Build a systematic sim-to-real transfer pipeline, connecting existing simulation infrastructure to training
- Deploy and iterate learned policies on physical robot hardware
- Mentor junior researchers and engineers, and publish at top-tier venues
What We're Looking For
Essential:
- PhD/MSc in ML, Robotics, CS, or a related field with 4+ years of equivalent industry research experience
- Demonstrated expertise in training and deploying learned manipulation policies on real robots
- Strong background in at least two of: behaviour cloning, diffusion policies, VLA/VLM architectures, RL for manipulation
- Proficiency in PyTorch and large-scale (multi-GPU, distributed) training
- Track record of publications at top-tier venues (CoRL, RSS, ICRA, NeurIPS, ICML, ICLR), or equivalent demonstrated research impact through deployed systems, patents, or significant open-source contributions
- Strong Python skills, with a habit of writing production-quality research code that includes proper testing, type hints, and documentation