RAG Production Trick - Semantic Cache (Step-by-step Juicy Code Walk-Through)
TwoSetAI
29.2K subscribers
1,868 views

Published on Apr 3, 2024

In this episode, join Angelina and Mehdi for a discussion about Semantic Cache, another trick for building efficient production RAG systems. We'll cover what Semantic Cache is, walk through the implementation code, and discuss when it is best suited.

Our implementation uses Facebook's FAISS and SentenceTransformer rather than LangChain.
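
For readers who want the idea before opening the Colab, here is a minimal sketch of a semantic cache sitting in front of a RAG pipeline. It is not the code from the video: to stay dependency-free it swaps SentenceTransformer for a toy bag-of-words embedding and swaps FAISS for a linear scan with cosine similarity. The names `embed`, `SemanticCache`, `answer_with_cache`, and the `0.8` threshold are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a sentence embedding (assumption, not SentenceTransformer).

    Returns a bag-of-words count vector, so it only captures word overlap.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    """Stores (question embedding, answer) pairs and matches by similarity."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # hypothetical value; tune on real traffic
        self.entries = []

    def lookup(self, question):
        """Return the cached answer of the most similar question, or None."""
        query = embed(question)
        best_score, best_answer = 0.0, None
        # Linear scan; a vector index such as FAISS would replace this loop.
        for stored_embedding, answer in self.entries:
            score = cosine(query, stored_embedding)
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.threshold else None

    def store(self, question, answer):
        self.entries.append((embed(question), answer))


def answer_with_cache(question, cache, rag_pipeline):
    """Check the cache first; only run the expensive pipeline on a miss."""
    cached = cache.lookup(question)
    if cached is not None:
        return cached, True
    answer = rag_pipeline(question)
    cache.store(question, answer)
    return answer, False
```

Unlike a key-value cache, which only hits on an identical string, the lookup here hits whenever the similarity clears the threshold, so "What's the capital of France" can reuse the answer stored for "What is the capital of France?". With a real sentence-embedding model the same structure would also match paraphrases that share few words, which the toy embedding cannot do.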

Who's Angelina:   / meetangelina  
Who's Mehdi:   / mehdiallahyari  

00:00 Intro
00:12 Hugging Face Cookbook
01:03 What is Semantic Cache?
01:13 What is Cache?
01:40 Diagram for a vanilla RAG
02:46 What is the issue here?
03:34 Diagram with caching
06:11 When to use caching
06:32 An example that motivates Semantic Cache
09:02 Key-value caching
09:26 Difference with Semantic Caching
11:55 Summary of benefits
13:20 When to use Semantic Caching in production?
14:31 Caveats in production
17:58 Juicy code walk-through
18:17 FAISS library
18:53 Setting up vector DB
22:36 We have a form for you! 👇

🦄 Any specific contents you wish to learn from us? Sign up here: https://noteforms.com/forms/twosetai-...

🧰 Our video editing tool is this one!: https://get.descript.com/nf5cum9nj1m8

🖼️ Blogpost for today: https://open.substack.com/pub/mlnotes...

🔨 Colab Implementation: https://colab.research.google.com/git...

📬 Don't miss out on the latest updates - Subscribe to our newsletter: https://mlnotes.substack.com/

📚 If you'd like to learn more about RAG systems, check out our book: https://angelinamagr.gumroad.com/

🕴️ Our consulting firm: We help companies that don't want to miss the current wave of AI advancement integrate these solutions into their business operations and products. https://www.transformaistudio.com/

Stay tuned for more content! 🎥 Thank you for watching! 🙌