Published On May 13, 2024
OpenAI recently released GPT-4o, which reports significant improvements in latency and cost. Many users may wonder how to evaluate the effects of upgrading their app to GPT-4o. For example: what latency gains should I expect, and are there any material differences in app performance when I switch to the new GPT-4o model?
Decisions like this are best informed by quality evaluations. Here, we show the process of evaluating GPT-4o on an example RAG app with a 20-question eval set related to LangChain documentation. We show how regression testing in the LangSmith UI lets you quickly pinpoint examples where GPT-4o improves on or regresses from your current app.
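The core idea behind the regression view is a per-example comparison of scores between two experiments. A minimal sketch of that comparison in plain Python (using hypothetical correctness scores, not real eval data):

```python
# Hypothetical per-example correctness scores (0 or 1) from two eval runs.
# In practice these would come from your LangSmith experiments.
baseline = {"q1": 1, "q2": 0, "q3": 1, "q4": 1}   # current model
candidate = {"q1": 1, "q2": 1, "q3": 0, "q4": 1}  # GPT-4o

def regressions_and_improvements(baseline, candidate):
    """Pinpoint examples where the candidate model improves or regresses."""
    improvements = [q for q in baseline if candidate[q] > baseline[q]]
    regressions = [q for q in baseline if candidate[q] < baseline[q]]
    return improvements, regressions

improved, regressed = regressions_and_improvements(baseline, candidate)
print("improved:", improved)    # examples GPT-4o now gets right
print("regressed:", regressed)  # examples GPT-4o now gets wrong
```

The LangSmith regression testing UI surfaces exactly this kind of per-example diff, so you can inspect the individual questions behind an aggregate score change.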
GPT-4o docs:
https://openai.com/index/hello-gpt-4o/
LangSmith regression testing UI docs:
https://docs.smith.langchain.com/old/...
RAG evaluation docs:
https://docs.smith.langchain.com/old/...
Public dataset referenced in the video:
https://smith.langchain.com/public/ea...
Cookbook referenced in the video:
https://github.com/langchain-ai/langs...