Diego, If your goal is to sum data by month, there are more general methods than you making lists of starting and ending dates.
ONE way to consider in your case that is easy to understand is to add a derived column/feature to your data such as the first 7 characters of the date as perhaps a routine string. I mean 2012-01-01 06:00 Becomes: "2012-01" Once you have such a column, you can aggregate your data by that column using functionality in pandas and it will be done for all items with that field being the same. The extra column can be temporary and there are other ways but it is conceptually simple. Of course to do the above does involve lots of details and you don't seem to need it now, so I am leaving this as an academic hint. -----Original Message----- From: Python-list <python-list-bounces+avigross=verizon....@python.org> On Behalf Of Diego Avesani Sent: Monday, February 4, 2019 2:38 PM To: python-list@python.org Subject: Re: pandas read dataframe and sum all value same month and year Deal all, following Peter's suggestion, I put the example code: import pandas as pd import numpy as np from datetime import datetime #input: start_date = np.array(["2012-01-01 06:00",'2013-01-01 06:00','2014-01-01 06:00']) end_date = np.array(["2013-01-01 05:00",'2014-01-01 05:00','2015-01-01 05:00']) yearfolder = np.array(['2012','2013','2014']) for ii in range(0, 1): df = pd.read_csv('dati.csv',delimiter=',',header=0,parse_dates=True,na_values=-99 9) df['datatime'] = df['datatime'].map(lambda x: datetime.strptime(str(x), "%Y-%m-%d %H:%M")) mask = (df['datatime'] > str(start_date[ii])) & (df['datatime'] <= str(end_date[ii])) df = df.loc[mask] df = df.reset_index(drop=True) # df.groupby(pd.TimeGrouper('m')).sum() and the example of file: datatime,T,RH,PSFC,DIR,VEL10,PREC,RAD,CC,FOG 2012-01-01 06:00, 0.4,100, 911,321, 2.5, 0.0, 0, 0,0 2012-01-01 07:00, 0.8,100, 911,198, 0.8, 0.0, 0, 22,0 2012-01-01 08:00, 0.6,100, 912, 44, 1.2, 0.0, 30, 22,0 2012-01-01 09:00, 3.1, 76, 912, 22, 0.8, 0.0, 134, 44,0 2012-01-01 10:00, 3.4, 77, 912, 37, 0.5, 0.0, 191, 67,0 2012-01-01 11:00, 3.5,100, 912,349, 0.4, 0.0, 277, 44,0 2012-01-01 12:00, 3.6,100, 912, 17, 0.9, 0.0, 292, 22,0 2012-01-01 13:00, 3.5,100, 912, 28, 0.3, 0.0, 219, 44,0 2012-01-01 14:00, 3.3, 68, 912, 42, 0.5, 0.0, 151, 22,0 Hope this could help in finding a way to sum value belonging to the same month. Thanks again, a lot Diego On Monday, 4 February 2019 15:50:52 UTC+1, Diego Avesani wrote: > Dear all, > > I am reading the following data-frame: > > datatime,T,RH,PSFC,DIR,VEL10,PREC,RAD,CC,FOG > 2012-01-01 06:00, -0.1,100, 815,313, 2.6, 0.0, 0, 0,0 > 2012-01-01 07:00, -1.2, 93, 814,314, 4.8, 0.0, 0, 0,0 > 2012-01-01 08:00, 1.7, 68, 815,308, 7.5, 0.0, 41, 11,0 > 2012-01-01 09:00, 2.4, 65, 815,308, 7.4, 0.0, 150, 33,0 > 2012-01-01 10:00, 3.0, 64, 816,305, 8.4, 0.0, 170, 44,0 > 2012-01-01 11:00, 2.6, 65, 816,303, 6.3, 0.0, 321, 22,0 > 2012-01-01 12:00, 2.0, 72, 816,278, 1.3, 0.0, 227, 22,0 > 2012-01-01 13:00, -0.0, 72, 816,124, 0.1, 0.0, 169, 22,0 > 2012-01-01 14:00, -0.1, 68, 816,331, 1.4, 0.0, 139, 33,0 > 2012-01-01 15:00, -4.0, 85, 816,170, 0.6, 0.0, 49, 0,0 .... > .... > > I read the data frame as: > > df = pd.read_csv('dati.csv',delimiter=',',header=0,parse_dates=True,na_values=-99 9) > df['datatime'] = df['datatime'].map(lambda x: datetime.strptime(str(x), "%Y-%m-%d %H:%M")) > # > mask = (df['datatime'] > str(start_date[ii])) & (df['datatime'] <= str(end_date[ii])) > df = df.loc[mask] > df = df.reset_index(drop=True) > > I would to create an array with the sum of all the PREC value in the same month. > > I have tried with: > > df.groupby(pd.TimeGrouper('M')).sum() > > But as always, it seems that I have same problems with the indexes. Indeed, I get: > 'an instance of %r' % type(ax).__name__) > TypeError: axis must be a DatetimeIndex, but got an instance of 'Int64Index' > > thanks for any kind of help, > Really Really thanks > > Diego -- https://mail.python.org/mailman/listinfo/python-list -- https://mail.python.org/mailman/listinfo/python-list