Published On Oct 4, 2024
Optimizing your chunking technique is one of the most effective ways to improve the performance of your RAG pipelines, but which technique works best?
Jina AI just released a new method called late chunking. It takes the same amount of storage space as naive chunking but, similarly to ColBERT, addresses the problem of context that gets lost when a document is split into chunks.
You can implement it super easily with just a few extra lines in your embedding step!
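For a rough idea of what those few extra lines might look like, here is a minimal sketch in plain Python. It assumes you already have token-level embeddings for the whole document from a long-context embedding model, plus the token span of each chunk. The names `late_chunk`, `token_embeddings`, and `spans` are illustrative, and the numbers are toy values rather than real model output. This is not the code from the notebook linked below.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def late_chunk(
    token_embeddings: Sequence[Sequence[float]],
    spans: Sequence[Tuple[int, int]],
) -> List[Vector]:
    """Mean-pool token embeddings over each (start, end) token span.

    The document is embedded once as a whole, so every token vector
    already carries context from the full text. Chunking happens
    afterwards, at pooling time, which is where the "late" comes from.
    """
    chunk_vectors = []
    for start, end in spans:
        if not 0 <= start < end <= len(token_embeddings):
            raise ValueError(f"invalid span ({start}, {end})")
        window = token_embeddings[start:end]
        # Average each embedding dimension across the tokens in the span.
        chunk_vectors.append([sum(dim) / len(window) for dim in zip(*window)])
    return chunk_vectors


# Toy stand-in for the token embeddings of one document (4 tokens, 2 dims).
token_embeddings = [
    [1.0, 0.0],
    [3.0, 2.0],
    [5.0, 4.0],
    [7.0, 6.0],
]
# Hypothetical chunk boundaries, expressed as token index ranges.
spans = [(0, 2), (2, 4)]

chunk_vectors = late_chunk(token_embeddings, spans)
print(chunk_vectors)  # [[2.0, 1.0], [6.0, 5.0]]
```

You end up with one vector per chunk, which is why the storage cost matches naive chunking. The only difference is that the chunk vectors are pooled from token embeddings that have seen the whole document.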
Blog: https://weaviate.io/blog/late-chunking
Notebook: https://github.com/weaviate/recipes/b...
📄 Papers
Late Chunking: https://arxiv.org/pdf/2409.04701
ColBERT: https://arxiv.org/pdf/2004.12832
▬▬▬▬▬▬▬▬▬▬▬▬ CONNECT WITH US ▬▬▬▬▬▬▬▬▬▬▬▬
Visit http://weaviate.io/
Star us on GitHub https://github.com/weaviate/weaviate
Stay updated and subscribe to our newsletter: https://newsletter.weaviate.io/
Try out Weaviate Cloud for free here: https://console.weaviate.cloud/
Got a question?
Forum: https://forum.weaviate.io/
Slack: https://weaviate.io/slack
Connect with us on
Twitter: @weaviate_io
LinkedIn: weaviate-io