Re: Problem in defining multidimensional array matrix and regression

Thomas Jollans Sun, 19 Nov 2017 14:04:57 -0800

On 19/11/17 18:55, shalu.ash...@gmail.com wrote:
> Hello Peter,
> 
> Many thanks for your suggestion.
> I am now using Pandas and have already done that, but I now need to build
> a multi-dimensional array that holds all the predictor variables (5 in this
> case) on the x-axis, so that I can perform multiple regression analysis.
> 
> I do not understand how to put all the variables on one axis (e.g. the x-axis).


Pandas is great at this: index a single row of a DataFrame with your
favourite selector from
http://pandas.pydata.org/pandas-docs/stable/indexing.html (or just loop
over the rows with the DataFrame's .iterrows() method).
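
As a rough sketch of what that could look like (assuming the table is
whitespace-separated as posted; the SAMPLE string below is only a stand-in
for the real file, and the variable names are illustrative), selecting the
rainfall column, the predictor columns and individual rows might go
something like this:

```python
import io

import pandas as pd

# A few rows of the posted table, inlined so the sketch runs without the
# file; in practice pass the real path to read_csv instead.
SAMPLE = """\
RF P1 P2 P3 P4 P5
120.235 0.234 -0.012 0.145 21.023 0.233
200.14 0.512 -0.021 0.214 22.21 0.332
185.362 0.147 -0.32 0.136 24.65 0.423
"""

# sep=r"\s+" splits on runs of whitespace, since the posted table is not
# comma-separated.
df = pd.read_csv(io.StringIO(SAMPLE), sep=r"\s+")

y = df["RF"]                 # dependent variable (rainfall)
X = df.drop(columns="RF")    # the five predictors, one column each

first_row = df.iloc[0]       # a single row, selected by position

# Looping over the rows, as an alternative to indexing.
for index, row in df.iterrows():
    print(index, row["RF"], row["P1"])

# Pairwise correlations between all six columns.
corr = df.corr()
print(corr["RF"])
```

X is then a two-dimensional table with one predictor per column, which is
the layout regression routines usually expect.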

If you want a multi-dimensional array with all the data, numpy.loadtxt
can do that for you; pass skiprows=1 so that it skips the header line.
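
A minimal sketch of the numpy route, assuming the same whitespace-separated
layout with one header line (the DATA string stands in for the file, and
fitting with numpy.linalg.lstsq is just one way to do the regression, not
necessarily the one you will end up using):

```python
import io

import numpy as np

# The posted table, inlined so the sketch is self-contained; in practice
# pass the file name to loadtxt instead of a StringIO object.
DATA = """\
RF P1 P2 P3 P4 P5
120.235 0.234 -0.012 0.145 21.023 0.233
200.14 0.512 -0.021 0.214 22.21 0.332
185.362 0.147 -0.32 0.136 24.65 0.423
201.895 0.002 -0.12 0.217 30.25 0.325
165.235 0.256 0.001 0.22 31.245 0.552
198.236 0.012 -0.362 0.215 32.25 0.333
350.263 0.98 -0.85 0.321 38.412 0.411
145.25 0.046 -0.36 0.147 39.256 0.872
198.654 0.65 -0.45 0.224 40.235 0.652
245.214 0.47 -0.325 0.311 26.356 0.632
214.02 0.18 -0.012 0.242 22.01 0.745
147.256 0.652 -0.785 0.311 18.256 0.924
"""

# skiprows=1 skips the header line, which loadtxt cannot parse as floats.
table = np.loadtxt(io.StringIO(DATA), skiprows=1)   # shape (12, 6)

y = table[:, 0]    # rainfall (RF)
X = table[:, 1:]   # predictors P1..P5, shape (12, 5)

# Prepend a column of ones so that the fit includes an intercept.
A = np.column_stack([np.ones(len(y)), X])

# Ordinary least squares: find coef minimising ||A @ coef - y||.
coef, residuals, rank, singular_values = np.linalg.lstsq(A, y, rcond=None)

print("intercept:", coef[0])
print("slopes:   ", coef[1:])
```

The two slices are the important part: column 0 becomes y and the remaining
columns become the two-dimensional predictor matrix X.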


> 
> Thanks
> Vishal
> 
> On Sunday, 19 November 2017 22:32:06 UTC+5:30, Peter Otten  wrote:
>> shalu.ash...@gmail.com wrote:
>>
>>> Hi, All,
>>>
>>> I have 6 variables in CSV file. One is rainfall (dependent, at y-axis) and
>>> others are predictors (at x). I want to do multiple regression and create
>>> a correlation matrix between rainfall (y) and predictors (x; n1=5). Thus I
>>> want to read rainfall as a separate variable and others in separate
>>> columns, so I can apply the algo. However, I am not able to make a proper
>>> matrix for them.
>>>
>>> Here are my data and code.
>>> Please suggest how I can do this.
>>> I am new to Python.
>>>
>>> RF  P1      P2      P3      P4      P5
>>> 120.235     0.234   -0.012  0.145   21.023  0.233
>>> 200.14      0.512   -0.021  0.214   22.21   0.332
>>> 185.362     0.147   -0.32   0.136   24.65   0.423
>>> 201.895     0.002   -0.12   0.217   30.25   0.325
>>> 165.235     0.256   0.001   0.22    31.245  0.552
>>> 198.236     0.012   -0.362  0.215   32.25   0.333
>>> 350.263     0.98    -0.85   0.321   38.412  0.411
>>> 145.25      0.046   -0.36   0.147   39.256  0.872
>>> 198.654     0.65    -0.45   0.224   40.235  0.652
>>> 245.214     0.47    -0.325  0.311   26.356  0.632
>>> 214.02      0.18    -0.012  0.242   22.01   0.745
>>> 147.256     0.652   -0.785  0.311   18.256  0.924
>>>
>>> import numpy as np
>>> import statsmodels.api as sm
>>> import statsmodels.formula.api as smf
>>> import csv
>>>
>>> with open("pcp1.csv", "r", newline="") as csvfile:
>>>     readCSV = csv.reader(csvfile)
>>>
>>>     rainfall = []
>>>     csvFileList = []
>>>
>>>     for row in readCSV:
>>>         # Skip blank lines before indexing into the row.
>>>         if len(row) != 0:
>>>             rainfall.append(row[0])   # first column is rainfall (RF)
>>>             csvFileList.append(row)
>>>
>>> print(csvFileList)
>>> print(rainfall)
>>
>> You are not the first to read tabular data from a file; therefore numpy (and 
>> pandas) offer high-level functions to do just that. Once you have the complete 
>> table, extracting a specific column is easy. For instance:
>>
>> $ cat rainfall.txt 
>> RF      P1      P2      P3      P4      P5
>> 120.235 0.234   -0.012  0.145   21.023  0.233
>> 200.14  0.512   -0.021  0.214   22.21   0.332
>> 185.362 0.147   -0.32   0.136   24.65   0.423
>> 201.895 0.002   -0.12   0.217   30.25   0.325
>> 165.235 0.256   0.001   0.22    31.245  0.552
>> 198.236 0.012   -0.362  0.215   32.25   0.333
>> 350.263 0.98    -0.85   0.321   38.412  0.411
>> 145.25  0.046   -0.36   0.147   39.256  0.872
>> 198.654 0.65    -0.45   0.224   40.235  0.652
>> 245.214 0.47    -0.325  0.311   26.356  0.632
>> 214.02  0.18    -0.012  0.242   22.01   0.745
>> 147.256 0.652   -0.785  0.311   18.256  0.924
>> $ python3
>> Python 3.4.3 (default, Nov 17 2016, 01:08:31) 
>> [GCC 4.8.4] on linux
>> Type "help", "copyright", "credits" or "license" for more information.
>> >>> import numpy
>> >>> rf = numpy.genfromtxt("rainfall.txt", names=True)
>> >>> rf["RF"]
>> array([ 120.235,  200.14 ,  185.362,  201.895,  165.235,  198.236,
>>         350.263,  145.25 ,  198.654,  245.214,  214.02 ,  147.256])
>> >>> rf["P3"]
>> array([ 0.145,  0.214,  0.136,  0.217,  0.22 ,  0.215,  0.321,  0.147,
>>         0.224,  0.311,  0.242,  0.311])
> 
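
Building on Peter's genfromtxt example quoted above, here is a hedged
sketch of how the named columns could be stacked into one two-dimensional
predictor array and turned into a correlation matrix. The DATA string
stands in for rainfall.txt, and the predictors list and variable names are
my own illustrative choices:

```python
import io

import numpy as np

# The table from the quoted example, inlined so the sketch is
# self-contained; in practice pass "rainfall.txt" to genfromtxt.
DATA = """\
RF P1 P2 P3 P4 P5
120.235 0.234 -0.012 0.145 21.023 0.233
200.14 0.512 -0.021 0.214 22.21 0.332
185.362 0.147 -0.32 0.136 24.65 0.423
201.895 0.002 -0.12 0.217 30.25 0.325
165.235 0.256 0.001 0.22 31.245 0.552
198.236 0.012 -0.362 0.215 32.25 0.333
350.263 0.98 -0.85 0.321 38.412 0.411
145.25 0.046 -0.36 0.147 39.256 0.872
198.654 0.65 -0.45 0.224 40.235 0.652
245.214 0.47 -0.325 0.311 26.356 0.632
214.02 0.18 -0.012 0.242 22.01 0.745
147.256 0.652 -0.785 0.311 18.256 0.924
"""

# names=True takes the column names from the first line, giving a
# structured array whose columns are addressed by name.
rf = np.genfromtxt(io.StringIO(DATA), names=True)

predictors = ["P1", "P2", "P3", "P4", "P5"]

y = rf["RF"]                                             # shape (12,)
X = np.column_stack([rf[name] for name in predictors])   # shape (12, 5)

# rowvar=False tells corrcoef that each column (not each row) is a variable.
corr = np.corrcoef(np.column_stack([y, X]), rowvar=False)   # shape (6, 6)

# The first row holds the correlation of rainfall with each predictor.
for name, r in zip(predictors, corr[0, 1:]):
    print(name, r)
```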

-- 
https://mail.python.org/mailman/listinfo/python-list
