Hi, I am trying to use the chi-square test to select the most important 
columns among 5501 columns, but for most of the columns I get NaN as the 
chi-square value.

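import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.feature_selection import chi2

# Column 0 is read together with the one feature column being tested (10).
cols = [0, 10]
# for i in range(1, 5502):
#     cols.append(i)

# Raw string, so the backslashes in the Windows path are not read as escapes.
df = pd.read_csv(r"D:\PHD\obranking\demo.csv", usecols=cols)
# apply() returns a new DataFrame; assign it back or the encoding is discarded.
df = df.apply(LabelEncoder().fit_transform)
X = df.drop(labels='label', axis=1)
Y = df['label']
chi_scores = chi2(X, Y)
print(chi_scores)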

In this code I printed the chi-square value for column 10, but for most of 
the columns it behaves like this:

"C:\Users\Rahul Gupta\PycharmProjects\CSVLearn\venv\Scripts\python.exe" "C:/Users/Rahul Gupta/PycharmProjects/CSVLearn/ChiSq_learn.py"
(array([nan]), array([nan]))

Process finished with exit code 0
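
One possible cause, stated as an assumption because the contents of demo.csv 
are not shown: a feature column whose values are all zero (for example, a 
constant column after label encoding) has expected frequencies of zero, so 
the chi-square statistic becomes 0/0. The sketch below is a hand-rolled 
illustration of that arithmetic, not scikit-learn's own code; chi2_stat and 
the sample lists are hypothetical names and data.

```python
import math


def chi2_stat(feature, labels):
    """Hand-rolled chi-square statistic for one non-negative feature column.

    Observed = per-class sum of the feature values.
    Expected = total of the feature * share of samples in that class.
    """
    total = sum(feature)
    n = len(labels)
    stat = 0.0
    for cls in set(labels):
        observed = sum(v for v, y in zip(feature, labels) if y == cls)
        expected = total * labels.count(cls) / n
        if expected == 0:
            # 0/0: the statistic is undefined, so report NaN
            return math.nan
        stat += (observed - expected) ** 2 / expected
    return stat


labels = [0, 0, 1, 1]
varying = [1, 0, 3, 2]    # feature that differs between the classes
all_zero = [0, 0, 0, 0]   # e.g. a constant column after label encoding

print(chi2_stat(varying, labels))   # a finite score
print(chi2_stat(all_zero, labels))  # nan
```

If this is what is happening, checking which columns have a single unique 
value (or sum to zero) before calling chi2 should show whether they are the 
ones producing NaN.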
-- 
https://mail.python.org/mailman/listinfo/python-list
