Date: 15 May 2023
Module : modin
Installation : pip install "modin[all]"
(plain "pip install modin" also works if Ray or Dask is already installed)
About:
Modin is a drop-in replacement for pandas. While pandas is single-threaded,
Modin lets you instantly speed up your workflows by scaling pandas so it uses
all of your cores. Modin works especially well on larger datasets, where
pandas can become slow or run out of memory.
By simply replacing the import statement, Modin offers users
effortless speed and scale for their pandas workflows:
import modin.pandas as pd
Sample:
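import modin.pandas as pd
import numpy as np
# read_csv works as in pandas; "my_dataset.csv" is a placeholder path
df = pd.read_csv("my_dataset.csv")
# Two random integer arrays: 256 x 256 and 4096 x 4096
left_data = np.random.randint(0, 100, size=(2**8, 2**8))
right_data = np.random.randint(0, 100, size=(2**12, 2**12))
left_df = pd.DataFrame(left_data)
right_df = pd.DataFrame(right_data)
# Inner merge on column 10, timed with the IPython/Jupyter %timeit magic
%timeit left_df.merge(right_df, how="inner", on=10)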
3.59 s ± 107 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit right_df.merge(left_df, how="inner", on=10)
1.22 s ± 40.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
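The timings above need IPython. Outside a notebook, the same merge can be timed with a small helper, sketched below. The names time_merge and make_data are hypothetical (not part of Modin or pandas), and the array sizes are deliberately much smaller than in the sample so it runs quickly. The helper takes the module as an argument, so the same data can be run through both pandas and modin.pandas. Treat the numbers as a rough guide only: depending on the engine, Modin may not finish all the work inside the timed region.

```python
import time

import numpy as np


def make_data(rows_left=64, rows_right=256, cols=16, seed=0):
    """Build two small random integer arrays (sizes are illustrative)."""
    rng = np.random.default_rng(seed)
    left = rng.integers(0, 100, size=(rows_left, cols))
    right = rng.integers(0, 100, size=(rows_right, cols))
    return left, right


def time_merge(pd_module, left_data, right_data, on=10):
    """Time one inner merge using any pandas-compatible module.

    pd_module is assumed to expose DataFrame with a pandas-style merge,
    for example the pandas module itself or modin.pandas.
    Returns (number of merged rows, elapsed seconds).
    """
    left_df = pd_module.DataFrame(left_data)
    right_df = pd_module.DataFrame(right_data)
    start = time.perf_counter()
    merged = left_df.merge(right_df, how="inner", on=on)
    elapsed = time.perf_counter() - start
    return len(merged), elapsed
```

For a like-for-like comparison, call time_merge(pandas, left, right) and time_merge(modin.pandas, left, right) on the same arrays returned by make_data(). Both calls should report the same row count.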
Reference:
https://pypi.org/project/modin/
_______________________________________________
Chennaipy mailing list
https://mail.python.org/mailman/listinfo/chennaipy