Hi, I am trying to use the chi-square test to select the most important columns among 5501 columns, but for most of the columns I am getting NaN as the chi-square value.

    import pandas as pd
    from sklearn.preprocessing import LabelEncoder
    from sklearn.feature_selection import chi2

    cols = []
    cols.append(int(0))
    #for i in range(1, 5502):
    cols.append(int(10))
    df = pd.read_csv("D:\PHD\obranking\\demo.csv", usecols=cols)
    df.apply(LabelEncoder().fit_transform)
    X = df.drop(labels='label', axis=1)
    Y = df['label']
    chi_scores = chi2(X, Y)
    print(chi_scores)

In this code I print the chi-square value for the 10th column only, but most of the columns behave as shown below:

    "C:\Users\Rahul Gupta\PycharmProjects\CSVLearn\venv\Scripts\python.exe" "C:/Users/Rahul Gupta/PycharmProjects/CSVLearn/ChiSq_learn.py"
    (array([nan]), array([nan]))

    Process finished with exit code 0
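
A minimal sketch that reproduces the same kind of output, in case it helps narrow things down. It assumes the NaN comes from a column whose values are all zero (so the expected frequencies are 0 and the statistic becomes 0/0); whether that is what happens in demo.csv is not confirmed. The toy data and the column names all_zero and varying are hypothetical and are not taken from the real file.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import chi2

# Hypothetical toy data standing in for demo.csv (assumption, not the real file).
df = pd.DataFrame({
    "label":    [0, 1, 0, 1, 0, 1],
    "all_zero": [0, 0, 0, 0, 0, 0],  # every value is zero
    "varying":  [1, 0, 2, 0, 3, 0],  # values differ between the classes
})

X = df.drop(labels="label", axis=1)
y = df["label"]

# Silence the 0/0 warning in case the installed version emits one.
with np.errstate(invalid="ignore", divide="ignore"):
    scores, p_values = chi2(X, y)

# Names of the columns whose chi-square score came out as NaN.
nan_columns = X.columns[np.isnan(scores)].tolist()
print(nan_columns)
```

On this toy data only the all-zero column is reported. One related thing that may be worth checking in the original script: df.apply(...) returns a new DataFrame rather than modifying df in place, so unless the result is assigned back, chi2 sees the unencoded values.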