Sponsored Session: NeMo-Aligner: A Scalable Toolkit for Model Alignment - Gerald Shen & Jimmy Zhang
PyTorch

Published on Oct 1, 2024

Sponsored Session: NeMo-Aligner: A Scalable Toolkit for Model Alignment - Gerald Shen & Jimmy Zhang, NVIDIA

Aligning AI models with human values and preferences is essential for making them safe and helpful. However, building an efficient and scalable toolkit for alignment can be challenging, especially when it is applied to state-of-the-art foundation models with billions or trillions of parameters. NeMo-Aligner is an open-source, optimized, and scalable toolkit that implements alignment algorithms such as Reinforcement Learning from Human Feedback (RLHF), Direct Preference Optimization (DPO), SteerLM, and Self-Play Fine-Tuning (SPIN). This talk will introduce NeMo-Aligner and show the steps we took to design and optimize the toolkit around various alignment algorithms. In particular, we discuss the RLHF implementation, where we observe close to a 7x speedup and excellent scaling performance from adding TRT-LLM integration, carefully orchestrating communication, and using fast training kernels. We’re able to align state-of-the-art open-source models with NeMo-Aligner, and we hope our framework can enable the community to efficiently customize, fine-tune, and align foundation models at any scale.
