Jamba-1.5: Hybrid Transformer-Mamba Models at Scale (White Paper Explained)
AI21 Labs

Published on Sep 8, 2024

In this video, we dive into the white paper covering our new open model family, Jamba 1.5, with a focus on the novel hybrid architecture these models are built with: a combination of Transformer, Mamba, and Mixture-of-Experts layers. We also cover ExpertsInt8, a new quantization technique that enables more efficient serving.
Join us as we break down the key concepts and benefits of these innovative models.

The white paper:
https://arxiv.org/pdf/2408.12570

Previous white paper video:
   • Jamba: A Hybrid Transformer-Mamba Lan...  

Previous white paper:
https://arxiv.org/pdf/2403.19887

ExpertsInt8 commit in vLLM:
https://github.com/vllm-project/vllm/...

RULER benchmark:
https://arxiv.org/pdf/2404.06654
