Python Data Load time - Pandas vs Polars: New York SPARCS 2022 Data
Hants Williams

Published on Oct 2, 2024

A comparison of CSV load times in pandas and Polars, using the 2022 SPARCS health care data from New York State.

In this video, I first create and activate a virtual environment, then install the necessary packages. Next, we troubleshoot a problem loading the SPARCS data: Polars infers the wrong data type for one of the columns, so we fix it on the fly.

Once the loading issue is resolved, the speed comparison shows Polars loading the file in approximately 1 second versus approximately 9 seconds for pandas.
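
If you want to run a similar comparison on your own machine, a minimal timing sketch might look like the following. The function names `time_call` and `compare_load_times` are hypothetical, and the file path is whatever CSV you downloaded. Timings will vary with hardware, library versions, and disk caching, so treat a single run as a rough indication only.

```python
import time


def time_call(loader, *args, **kwargs):
    """Run `loader` once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = loader(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def compare_load_times(path):
    """Time one pandas load and one Polars load of the same CSV file.

    Hypothetical helper. If Polars fails on a column's inferred type,
    pass the appropriate overrides to `pl.read_csv` before timing.
    """
    import pandas as pd  # imported here so `time_call` needs no extras
    import polars as pl

    _, pandas_seconds = time_call(pd.read_csv, path)
    _, polars_seconds = time_call(pl.read_csv, path)
    return {"pandas": pandas_seconds, "polars": polars_seconds}
```

`time.perf_counter` is used because it is a monotonic, high-resolution clock intended for measuring short durations.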
