On 19/11/17 18:55, shalu.ash...@gmail.com wrote:
> Hello Peter,
>
> Many thanks for your suggestion.
> Now I am using Pandas &
> I already did that but now I need to make a multi-dimensional array for
> reading all variables (5 in this case) at one x-axis, so I can perform
> multiple regression analysis.
>
> I am not getting how to bring all variables at one axis (e.g. at x-axis)?

Pandas is great at this: index a single row of a DataFrame with your
favourite selector from
http://pandas.pydata.org/pandas-docs/stable/indexing.html
(or just loop over the DataFrame's .iterrows()).

If you want a multi-dimensional array with all the data, numpy.loadtxt
can do that for you (pass skiprows=1 so the header line is skipped).

>
> Thanks
> Vishal
>
> On Sunday, 19 November 2017 22:32:06 UTC+5:30, Peter Otten wrote:
>> shalu.ash...@gmail.com wrote:
>>
>>> Hi, All,
>>>
>>> I have 6 variables in CSV file. One is rainfall (dependent, at y-axis) and
>>> others are predictors (at x). I want to do multiple regression and create
>>> a correlation matrix between rainfall (y) and predictors (x; n1=5). Thus I
>>> want to read rainfall as a separate variable and others in separate
>>> columns, so I can apply the algo. However, I am not able to make a proper
>>> matrix for them.
>>>
>>> Here are my data and codes?
>>> Please suggest me for the same.
>>> I am new to Python.
>>>
>>> RF P1 P2 P3 P4 P5
>>> 120.235 0.234 -0.012 0.145 21.023 0.233
>>> 200.14 0.512 -0.021 0.214 22.21 0.332
>>> 185.362 0.147 -0.32 0.136 24.65 0.423
>>> 201.895 0.002 -0.12 0.217 30.25 0.325
>>> 165.235 0.256 0.001 0.22 31.245 0.552
>>> 198.236 0.012 -0.362 0.215 32.25 0.333
>>> 350.263 0.98 -0.85 0.321 38.412 0.411
>>> 145.25 0.046 -0.36 0.147 39.256 0.872
>>> 198.654 0.65 -0.45 0.224 40.235 0.652
>>> 245.214 0.47 -0.325 0.311 26.356 0.632
>>> 214.02 0.18 -0.012 0.242 22.01 0.745
>>> 147.256 0.652 -0.785 0.311 18.256 0.924
>>>
>>> import numpy as np
>>> import statsmodels as sm
>>> import statsmodels.formula as smf
>>> import csv
>>>
>>> with open("pcp1.csv", "r") as csvfile:
>>>     readCSV=csv.reader(csvfile)
>>>
>>>     rainfall = []
>>>     csvFileList = []
>>>
>>>     for row in readCSV:
>>>         Rain = row[0]
>>>         rainfall.append(Rain)
>>>
>>>         if len (row) !=0:
>>>             csvFileList = csvFileList + [row]
>>>
>>> print(csvFileList)
>>> print(rainfall)
>>
>> You are not the first to read tabular data from a file; therefore numpy (and
>> pandas) offer high-level functions to do just that. Once you have the complete
>> table, extracting a specific column is easy. For instance:
>>
>> $ cat rainfall.txt
>> RF P1 P2 P3 P4 P5
>> 120.235 0.234 -0.012 0.145 21.023 0.233
>> 200.14 0.512 -0.021 0.214 22.21 0.332
>> 185.362 0.147 -0.32 0.136 24.65 0.423
>> 201.895 0.002 -0.12 0.217 30.25 0.325
>> 165.235 0.256 0.001 0.22 31.245 0.552
>> 198.236 0.012 -0.362 0.215 32.25 0.333
>> 350.263 0.98 -0.85 0.321 38.412 0.411
>> 145.25 0.046 -0.36 0.147 39.256 0.872
>> 198.654 0.65 -0.45 0.224 40.235 0.652
>> 245.214 0.47 -0.325 0.311 26.356 0.632
>> 214.02 0.18 -0.012 0.242 22.01 0.745
>> 147.256 0.652 -0.785 0.311 18.256 0.924
>> $ python3
>> Python 3.4.3 (default, Nov 17 2016, 01:08:31)
>> [GCC 4.8.4] on linux
>> Type "help", "copyright", "credits" or "license" for more information.
>> >>> import numpy
>> >>> rf = numpy.genfromtxt("rainfall.txt", names=True)
>> >>> rf["RF"]
>> array([ 120.235, 200.14 , 185.362, 201.895, 165.235, 198.236,
>>         350.263, 145.25 , 198.654, 245.214, 214.02 , 147.256])
>> >>> rf["P3"]
>> array([ 0.145, 0.214, 0.136, 0.217, 0.22 , 0.215, 0.321, 0.147,
>>         0.224, 0.311, 0.242, 0.311])
>
-- 
https://mail.python.org/mailman/listinfo/python-list
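
For what it's worth, here is a minimal sketch of the numpy.loadtxt route
mentioned above: it reads the whole table into one 2-D array, splits it
into rainfall (y) and the five predictors (X), and then computes a
correlation matrix and a least-squares fit. The inline DATA string stands
in for the file on disk, and the variable names and the choice of
numpy.linalg.lstsq for the regression are my own assumptions for
illustration, not something taken from the thread.

```python
import io
import numpy as np

# Rows copied from the thread; in practice you would pass the file name
# (e.g. "rainfall.txt") to np.loadtxt instead of a StringIO object.
DATA = """\
RF P1 P2 P3 P4 P5
120.235 0.234 -0.012 0.145 21.023 0.233
200.14 0.512 -0.021 0.214 22.21 0.332
185.362 0.147 -0.32 0.136 24.65 0.423
201.895 0.002 -0.12 0.217 30.25 0.325
165.235 0.256 0.001 0.22 31.245 0.552
198.236 0.012 -0.362 0.215 32.25 0.333
350.263 0.98 -0.85 0.321 38.412 0.411
145.25 0.046 -0.36 0.147 39.256 0.872
198.654 0.65 -0.45 0.224 40.235 0.652
245.214 0.47 -0.325 0.311 26.356 0.632
214.02 0.18 -0.012 0.242 22.01 0.745
147.256 0.652 -0.785 0.311 18.256 0.924
"""

# skiprows=1 drops the header line; the result is a plain 2-D float array
# with one row per observation and one column per variable.
table = np.loadtxt(io.StringIO(DATA), skiprows=1)

y = table[:, 0]   # rainfall (dependent variable)
X = table[:, 1:]  # predictors P1..P5 side by side, one column each

# Correlation matrix of all six columns; rowvar=False tells numpy that
# the variables are in columns rather than rows.
corr = np.corrcoef(table, rowvar=False)

# Ordinary least squares via lstsq: prepend a column of ones so the
# first coefficient is the intercept.
A = np.column_stack([np.ones(len(y)), X])
coef, _, rank, _ = np.linalg.lstsq(A, y, rcond=None)

print("X shape:", X.shape)
print("correlation of RF with P1..P5:", corr[0, 1:])
print("intercept and slopes:", coef)
```

Row 0 of corr holds the correlations between rainfall and each column
(the first entry being rainfall with itself). If you would rather have
the full regression summary, the same y and X arrays should be usable
with statsmodels as well, but I have not shown that here.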