Implement and Train ViT From Scratch for Image Recognition - PyTorch Video Tanpa Iklan

Implement and Train ViT From Scratch for Image Recognition - PyTorch

1.58K subscribers

14,335 views

652

About
Share

Published On Sep 29, 2023

We're going to implement ViT (Vision Transformer) and train our implementation on the MNIST dataset to classify images! Video where I explain the ViT paper and GitHub below ↓

Want to support the channel? Hit that like button and subscribe!

ViT (Vision Transformer) - An Image Is Worth 16x16 Words (Paper Explained)
• ViT (Vision Transformer) - An Image I...

GitHub Link of the Code
https://github.com/uygarkurt/ViT-PyTorch

Notebook
https://github.com/uygarkurt/ViT-PyTo...

ViT (Vision Transformer) is introduced in the paper: "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale"
https://arxiv.org/abs/2010.11929

What should I implement next? Let me know in the comments!

00:00:00 Introduction
00:00:09 Paper Overview
00:02:41 Imports and Hyperparameter Definitions
00:11:09 Patch Embedding Implementation
00:19:36 ViT Implementation
00:29:00 Dataset Preparation
00:51:16 Train Loop
01:09:27 Prediction Loop
01:12:05 Classifying Our Own Images

Published On Sep 29, 2023

Share/Embed

Video Link