Use TQDM Progress Bar with Pandas

Is it possible to use TQDM progress bar when importing and indexing large datasets using Pandas?

Here is an example of some 5-minute data that I am importing, indexing, and converting with to_datetime. It takes a while, and it would be nice to see a progress bar.

import pandas as pd

# Import the CSV file into a pandas DataFrame, convert 'Gmt time' to datetime, and set it as the index
eurusd_ask = pd.read_csv('EURUSD_Candlestick_5_m_ASK_01.01.2012-05.08.2017.csv')
eurusd_ask.index = pd.to_datetime(eurusd_ask.pop('Gmt time'))

You could fill a pandas DataFrame line by line: read the file normally and append each line as a new row. This would be a fair bit slower than pandas' own reading methods, though.

from tqdm import tqdm

# total is the row count, so the bar can show a percentage
with tqdm(total=Df.shape[0]) as pbar:
    for index, row in Df.iterrows():
        pbar.update(1)
        ...
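
If the goal is to see progress during the import itself, one alternative to line-by-line filling is to let pandas read the file in chunks and wrap the chunk iterator in tqdm. The following is only a sketch under assumptions: the in-memory CSV text, its column values, and the chunk size are made up for illustration, and the real file would be passed to read_csv in place of the StringIO object.

```python
import io

import pandas as pd
from tqdm import tqdm

# Small in-memory stand-in for the real CSV file (hypothetical data).
csv_text = "Gmt time,Open,Close\n" + "".join(
    f"2012-01-01 00:{m:02d}:00,1.29,1.30\n" for m in range(0, 60, 5)
)

chunks = []
# chunksize makes read_csv return an iterator of DataFrames.
# tqdm advances once per chunk; the number of chunks is not known
# in advance, so the bar shows a running count rather than a percentage.
for chunk in tqdm(pd.read_csv(io.StringIO(csv_text), chunksize=4), desc="reading"):
    chunks.append(chunk)

# Stitch the chunks back together, then index by the parsed timestamps.
eurusd_ask = pd.concat(chunks, ignore_index=True)
eurusd_ask.index = pd.to_datetime(eurusd_ask.pop("Gmt time"))
```

A larger chunksize means fewer bar updates but less overhead; the value of 4 here only suits the toy data.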

There is a workaround for tqdm > 4.24. As per https://github.com/tqdm/tqdm#pandas-integration:

import pandas as pd
from tqdm import tqdm

# Register `pandas.progress_apply` and `pandas.Series.map_apply` with `tqdm`
# (can use `tqdm_gui`, `tqdm_notebook`, optional kwargs, etc.)
tqdm.pandas(desc="my bar!")
# Call pd.Timestamp on each value (the lambda must call it, not just return the class)
eurusd_ask['t_stamp'] = eurusd_ask['Gmt time'].progress_apply(lambda x: pd.Timestamp(x))
eurusd_ask.set_index(['t_stamp'], inplace=True)
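
To try the progress_apply approach without the original CSV, here is a minimal self-contained sketch. The two-row DataFrame and the bar description are assumptions made up for illustration; only tqdm.pandas and progress_apply come from the tqdm README linked above.

```python
import pandas as pd
from tqdm import tqdm

# Adds progress_apply to pandas Series and DataFrame objects.
tqdm.pandas(desc="parsing")

# Hypothetical sample standing in for the real 'Gmt time' column.
df = pd.DataFrame({
    "Gmt time": ["2012-01-01 00:00:00", "2012-01-01 00:05:00"],
    "Close": [1.29, 1.30],
})

# progress_apply behaves like apply, but the bar advances once per element.
df["t_stamp"] = df["Gmt time"].progress_apply(pd.Timestamp)
df = df.set_index("t_stamp")
```

Note that applying a function per element is slower than a single vectorized pd.to_datetime call; the trade-off is that the element-wise version gives tqdm something to count.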

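Get the number of rows from the DataFrame's shape and pass it as total, since tqdm cannot infer the length of the iterrows() generator: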

for index, row in tqdm(df.iterrows(), total=df.shape[0]):
    print("index", index)
    print("row", row)