Published On May 17, 2024
Live demo showing how to deploy an end-to-end Retrieval-Augmented Generation (RAG) stack, including the LLM and embedding server, on top of any Kubernetes (K8s) cluster.
Lingo as the model proxy and autoscaler: https://github.com/substratusai/lingo
Verba as the RAG application: https://github.com/weaviate/Verba
Weaviate as the Vector DB: https://github.com/weaviate/weaviate
Mistral-7B-Instruct-v2 as the LLM
STAPI with MiniLM-L6-v2 as the embedding model: https://github.com/substratusai/stapi
Blog post with copy-pasteable steps: https://www.substratus.ai/blog/lingo-...
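The request flow through the components above can be sketched in a few lines: the embedding model turns the query into a vector, the vector DB retrieves the closest documents, and the retrieved context is prepended to the prompt sent to the LLM. This is a toy stand-in sketch, not the real STAPI/Weaviate/Lingo APIs; the embedding and storage functions here are hypothetical simplifications for illustration.

```python
import math

def embed(text: str) -> list[float]:
    """Toy stand-in for the embedding server (STAPI + MiniLM-L6-v2):
    a normalized bag-of-letters vector, just to make retrieval concrete."""
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

# Toy stand-in for the vector DB (Weaviate): a list of (text, vector) pairs.
docs = ["Lingo autoscales model servers", "Weaviate stores vectors"]
index = [(d, embed(d)) for d in docs]

def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k documents most similar to the query vector."""
    qv = embed(query)
    ranked = sorted(index, key=lambda dv: cosine(qv, dv[1]), reverse=True)
    return [d for d, _ in ranked[:k]]

def rag_prompt(query: str) -> str:
    """Assemble the augmented prompt; in the real stack this is what the
    RAG app (Verba) would send through Lingo to Mistral-7B-Instruct."""
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"

print(rag_prompt("How does Weaviate store data?"))
```

In the deployed stack, `embed` corresponds to a call to the embedding server, `retrieve` to a vector search in the DB, and the final prompt is routed by the model proxy to the LLM backend.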