McKay Wrigley - I gave GPT4 eyes - AI CAN SEE!!
TheTunneys

Published on Apr 28, 2023

mckaywrigley Apr 26
https://twitter.com/mckaywrigley
https://twitter.com/mckaywrigley/status/16...

Here’s what I did:
added some data to a vision model
gave the AI camera access
asked it questions about the scene
it identified objects
it searched the web for info
it used that info to answer accurately

Watch it get 3 questions 100% correct!
And just for clarification the vision stuff isn’t GPT-4’s work.
It can’t access your camera.
I hooked up a separate vision model to it that handles the camera stuff.
I paired the two to make the idea work.

It does a frequency count of the objects in the scene to determine the “base” state; then, when you ask it questions, it identifies the “anomalies.”
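
A rough sketch of that base-state idea, in case it helps. The thread doesn't say how the frequency is actually computed, so the per-frame label lists, the `min_fraction` threshold, and both function names here are my assumptions, not the real implementation:

```python
from collections import Counter

def base_state(frames, min_fraction=0.6):
    """Return the labels seen in at least `min_fraction` of the sampled frames.

    `frames` is a list of per-frame label lists. The 0.6 cutoff is an
    arbitrary assumption, not a value from the original project.
    """
    if not frames:
        return set()
    # Count each label once per frame, so the tally means "frames containing it".
    seen_in = Counter(label for frame in frames for label in set(frame))
    return {label for label, n in seen_in.items() if n / len(frames) >= min_fraction}

def anomalies(current_frame, base):
    """Labels in the current frame that are not part of the base state."""
    return sorted(set(current_frame) - base)

# Made-up detections from five sampled frames of the same desk.
history = [
    ["desk", "monitor", "keyboard"],
    ["desk", "monitor", "keyboard"],
    ["desk", "monitor"],
    ["desk", "monitor", "keyboard", "cup"],
    ["desk", "monitor", "keyboard"],
]
base = base_state(history)
print(anomalies(["desk", "monitor", "sneaker"], base))
```

Anything that shows up in most frames counts as background, and whatever is left over in the current frame is treated as the thing you're probably asking about.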

Each anomaly I added to the dataset is mapped to a description, which gets injected into the prompt along with my dialogue.
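
Here's roughly what that mapping and injection step could look like. The labels, descriptions, prompt wording, and the `build_prompt` helper are all hypothetical placeholders for illustration; the thread doesn't show the real ones:

```python
# Hypothetical mapping from custom detection labels to short descriptions.
ANOMALY_DESCRIPTIONS = {
    "sneaker": "a running shoe sitting on the desk",
    "cup": "a coffee cup",
}

def build_prompt(question, anomaly_labels, descriptions=ANOMALY_DESCRIPTIONS):
    """Combine descriptions of the detected anomalies with the user's question."""
    # Labels without a mapped description are skipped rather than guessed at.
    lines = [
        f"- {label}: {descriptions[label]}"
        for label in anomaly_labels
        if label in descriptions
    ]
    scene = "\n".join(lines) if lines else "- nothing unusual detected"
    return (
        "Objects the camera currently sees that are not normally in the scene:\n"
        f"{scene}\n\n"
        f"User question: {question}"
    )

print(build_prompt("What is this and how much does it cost?", ["sneaker"]))
```

The resulting string is what would be sent to the language model, so the model only ever sees text about the scene, never the camera feed itself.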

I have a lot of open source examples on my GitHub if you wanna check them out: https://github.com/mckaywrigley

Which vision models are you using?
This one, with custom data added: https://github.com/ultralytics/ultralytics
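
For anyone curious, a minimal sketch of pulling object labels out of that library. It assumes the `ultralytics` package is installed and uses the stock `yolov8n.pt` weights as a stand-in, since the custom-trained weights from the demo aren't published in this thread; the two function names are mine:

```python
def labels_from_result(result):
    """Turn one Ultralytics `Results` object into a list of class-name strings."""
    # `result.boxes.cls` holds the class indices of the detected boxes;
    # `result.names` maps each index to its class name.
    return [result.names[int(i)] for i in result.boxes.cls.tolist()]

def detect_labels(image_path, weights="yolov8n.pt"):
    """Run detection on one image and return the detected class names.

    `weights` is a placeholder; a custom-trained weights file would go here.
    """
    from ultralytics import YOLO  # lazy import; needs `pip install ultralytics`

    model = YOLO(weights)
    results = model(image_path)  # one Results object per input image
    return labels_from_result(results[0])
```

Calling `detect_labels("frame.jpg")` on a saved camera frame would give a plain list of labels that can be fed into whatever logic sits on top.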

How did you search the web - GPT Browsing plug-in?
Google Custom Search Engine results fed into GPT
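
A sketch of how search results might be fetched and flattened into text for the model. It uses the Custom Search JSON API endpoint, and you'd need your own API key and search engine ID; the helper names, the result limit, and the output format are assumptions on my part, not what the demo actually does:

```python
import json
import urllib.parse
import urllib.request

CSE_ENDPOINT = "https://www.googleapis.com/customsearch/v1"

def search(query, api_key, engine_id, num=3):
    """Query the Custom Search JSON API and return the parsed JSON response."""
    params = urllib.parse.urlencode(
        {"key": api_key, "cx": engine_id, "q": query, "num": num}
    )
    with urllib.request.urlopen(f"{CSE_ENDPOINT}?{params}") as resp:
        return json.load(resp)

def results_to_context(response, limit=3):
    """Flatten the top results into numbered lines that can go into a prompt."""
    items = response.get("items", [])[:limit]
    return "\n".join(
        f"[{i}] {it.get('title', '')}: {it.get('snippet', '')} ({it.get('link', '')})"
        for i, it in enumerate(items, 1)
    )
```

The text from `results_to_context(search(...))` would then be placed in the prompt next to the question, so the model answers from the snippets instead of from memory.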

The response time is nuts. Is it just really fast internet, or is there a particular method for getting the GPT response to TTS more quickly?
Everything was pretty short, so 4-6 seconds for the response was about right.
My TTS is a local library so it doesn’t have to hit an API.
I also have about 100 up/down so pretty good internet.

Building AI devtools @CodewandAI, an AI code school @TakeoffAI, open source AI chat @ChatbotUI, & other experiments.

