Hello, Michael, Pandas GroupBy does not behave consistently. When we last talked, I used groupby and it worked well. Now I have rewritten the program to end up with a clean script, but a lot of columns are missing after applying groupby. Any idea? Regards. David
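A minimal sketch of two common reasons columns can seem to go missing after a groupby: the grouping key moves into the index, and numeric-only aggregations leave out non-numeric columns. The data and the column names (StateFIPS, State, value, note) are made up for illustration and are not the data from this thread.

```python
import pandas as pd

# Hypothetical data; the column names are assumptions for illustration.
df = pd.DataFrame({
    "StateFIPS": [1, 1, 4, 4],
    "State": ["AL", "AL", "AZ", "AZ"],
    "value": [10.0, 20.0, 30.0, 40.0],
    "note": ["a", "b", "c", "d"],
})

# The grouping key becomes the index, and numeric_only=True keeps
# only numeric columns, so 'State' and 'note' are left out.
summed = df.groupby("StateFIPS").sum(numeric_only=True)
print(list(summed.columns))

# as_index=False keeps the key as a regular column, and agg() gives
# each column its own aggregation, so every column is accounted for.
kept = df.groupby("StateFIPS", as_index=False).agg(
    State=("State", "first"),
    value=("value", "sum"),
    note=("note", "first"),
)
print(list(kept.columns))
```

Whether this matches the situation described above depends on the actual data and the aggregation used, so treat it as one possibility to check rather than a diagnosis.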
On Saturday, 14 May 2016, 17:00, Michael Selik <michael.se...@gmail.com> wrote:
This StackOverflow question was the first search result when I Googled for "Python why is there a little u": http://stackoverflow.com/questions/11279331/what-does-the-u-symbol-mean-in-front-of-string-values

On Sat, May 14, 2016, 11:40 AM David Shi <davidg...@yahoo.co.uk> wrote:
Hello, Michael, Why is there a little u, as in u'ID'? What can be done about it? How do I handle such objects? Can it be turned into a list easily? Regards. David

On Saturday, 14 May 2016, 15:34, Michael Selik <michael.se...@gmail.com> wrote:
You might also be interested in "Python for Data Analysis" for a thorough discussion of Pandas. http://shop.oreilly.com/product/0636920023784.do

On Sat, May 14, 2016 at 10:29 AM Michael Selik <michael.se...@gmail.com> wrote:
David, it sounds like you'll need a thorough introduction to the basics of Python. Check out the tutorial: https://docs.python.org/3/tutorial/

On Sat, May 14, 2016 at 6:19 AM David Shi <davidg...@yahoo.co.uk> wrote:
Hello, Michael, I discovered that the problem is "two columns of data are put together" and "are recognised as one column". This is very strange. I would like to understand the subject well. Also, how many ways are there to investigate the nature of objects dynamically? Some object types only get shown as an object. Is there anything I can type in Python to reveal what an object is? Regards. David

On Saturday, 14 May 2016, 4:30, Michael Selik <michael.se...@gmail.com> wrote:
What were you hoping to get from ``df[0]``? When you say it "yields nothing" do you mean it raised an error? What was the error message? Have you tried a Google search for "pandas set index"? http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.set_index.html

On Fri, May 13, 2016 at 11:18 PM David Shi <davidg...@yahoo.co.uk> wrote:
Hello, Michael, I tried to discover the problem.
df[0] yields nothing
df[1] yields nothing
df[2] yields nothing
However, df[3] gives the following:
sid
-9223372036854775808          NaN
1                       133738.70
4                       295256.11
5                       137733.09
6                       409413.58
8                       269600.97
9                        12852.94
Can we split this back to normal, or turn it into a dictionary, so that I can put the values back properly? I would like to use sid as the index in some way. Regards. David

On Friday, 13 May 2016, 22:58, Michael Selik <michael.se...@gmail.com> wrote:
What code have you tried? What error message are you receiving?

On Fri, May 13, 2016, 5:54 PM David Shi <davidg...@yahoo.co.uk> wrote:
Hello, Michael, How do I convert a float type column into an integer, label, or string type?

On Friday, 13 May 2016, 22:02, Michael Selik <michael.se...@gmail.com> wrote:
To clarify that you're specifying the index as a label, use df.loc (df.iloc selects by integer position instead).
>>> df = pd.DataFrame({'X': range(4)}, index=list('abcd'))
>>> df
   X
a  0
b  1
c  2
d  3
>>> df.loc['a']
X    0
Name: a, dtype: int64
>>> df.iloc[0]
X    0
Name: a, dtype: int64

On Fri, May 13, 2016 at 4:54 PM David Shi <davidg...@yahoo.co.uk> wrote:
Dear Michael, To avoid complication, I only group by one column. It is OK now. But how do I refer to the new row index? How do I use a floating-point index?
Float64Index([ 1.0, 4.0, 5.0, 6.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 16.0, 17.0, 18.0, 19.0, 20.0, 21.0, 22.0, 23.0, 24.0, 25.0, 26.0, 27.0, 28.0, 29.0, 30.0, 31.0, 32.0, 33.0, 34.0, 35.0, 36.0, 37.0, 38.0, 39.0, 40.0, 41.0, 42.0, 44.0, 45.0, 46.0, 47.0, 48.0, 49.0, 50.0, 51.0, 53.0, 54.0, 55.0, 56.0], dtype='float64', name=u'StateFIPS')
Regards. David

On Friday, 13 May 2016, 21:43, Michael Selik <michael.se...@gmail.com> wrote:
Here's an example.
>>> import pandas as pd
>>> df = pd.DataFrame({'group': list('AB') * 2, 'data': range(4)}, index=list('wxyz'))
>>> df
   data group
w     0     A
x     1     B
y     2     A
z     3     B
>>> df = df.reset_index()
>>> df
  index  data group
0     w     0     A
1     x     1     B
2     y     2     A
3     z     3     B
>>> df.groupby('group').max()
      index  data
group
A         y     2
B         z     3
If that doesn't help, you'll need to explain what you're trying to accomplish in detail -- what variables you started with, what transformations you want to do, and what variables you hope to have when finished.

On Fri, May 13, 2016 at 4:36 PM David Shi <davidg...@yahoo.co.uk> wrote:
Hello, Michael, I changed groupby with one column. The index is different.
Index([ u'AL', u'AR', u'AZ', u'CA', u'CO', u'CT', u'DC', u'DE', u'FL', u'GA', u'IA', u'ID', u'IL', u'IN', u'KS', u'KY', u'LA', u'MA', u'MD', u'ME', u'MI', u'MN', u'MO', u'MS', u'MT', u'NC', u'ND', u'NE', u'NH', u'NJ', u'NM', u'NV', u'NY', u'OH', u'OK', u'OR', u'PA', u'RI', u'SC', u'SD', u'State', u'TN', u'TX', u'UT', u'VA', u'VT', u'WA', u'WI', u'WV', u'WY'], dtype='object', name=0)
How to use this index? Regards.
David

On Friday, 13 May 2016, 21:19, David Shi <davidg...@yahoo.co.uk> wrote:
Hello, Michael, I typed in df.index and got the following:
MultiIndex(levels=[[1.0, 4.0, 5.0, 6.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 16.0, 17.0, 18.0, 19.0, 20.0, 21.0, 22.0, 23.0, 24.0, 25.0, 26.0, 27.0, 28.0, 29.0, 30.0, 31.0, 32.0, 33.0, 34.0, 35.0, 36.0, 37.0, 38.0, 39.0, 40.0, 41.0, 42.0, 44.0, 45.0, 46.0, 47.0, 48.0, 49.0, 50.0, 51.0, 53.0, 54.0, 55.0, 56.0], [u'AL', u'AR', u'AZ', u'CA', u'CO', u'CT', u'DC', u'DE', u'FL', u'GA', u'IA', u'ID', u'IL', u'IN', u'KS', u'KY', u'LA', u'MA', u'MD', u'ME', u'MI', u'MN', u'MO', u'MS', u'MT', u'NC', u'ND', u'NE', u'NH', u'NJ', u'NM', u'NV', u'NY', u'OH', u'OK', u'OR', u'PA', u'RI', u'SC', u'SD', u'State', u'TN', u'TX', u'UT', u'VA', u'VT', u'WA', u'WI', u'WV', u'WY']],
labels=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48], [0, 2, 1, 3, 4, 5, 7, 6, 8, 9, 11, 12, 13, 10, 14, 15, 16, 19, 18, 17, 20, 21, 23, 22, 24, 27, 31, 28, 29, 30, 32, 25, 26, 33, 34, 35, 36, 37, 38, 39, 41, 42, 43, 45, 44, 46, 48, 47, 49]],
names=[u'StateFIPS', 0])
Regards. David

On Friday, 13 May 2016, 21:11, David Shi <davidg...@yahoo.co.uk> wrote:
Dear Michael, I have done a number of operations in between. Providing that information would not help you. What I am interested in is how to reset the index after grouping and various other operations. Is there a command I can type to inspect the current dataframe? Regards. David

On Friday, 13 May 2016, 20:58, Michael Selik <michael.se...@gmail.com> wrote:
Just in case I misunderstood, why don't you make a little example of before and after the grouping? This mailing list does not accept attachments, so you'll have to make do with pasting a few rows of comma-separated or tab-separated values.
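As a small before-and-after illustration of the MultiIndex question above, here is a sketch with made-up rows (only the names StateFIPS and State are borrowed from the index printed in this thread; the value column and all data are assumptions). Grouping by two columns produces a MultiIndex, reset_index() turns its levels back into ordinary columns, and astype(int) converts the float key to integers, assuming the column contains no NaN.

```python
import pandas as pd

# Hypothetical rows; only the column names echo the thread.
df = pd.DataFrame({
    "StateFIPS": [1.0, 1.0, 4.0, 5.0],
    "State": ["AL", "AL", "AZ", "AR"],
    "value": [1.5, 2.5, 3.0, 4.0],
})

# Grouping by two keys gives a result indexed by a MultiIndex.
grouped = df.groupby(["StateFIPS", "State"]).sum()
print(type(grouped.index).__name__)
print(list(grouped.index.names))

# reset_index() moves both index levels back into regular columns.
flat = grouped.reset_index()

# Convert the float key to int (this would fail if NaN were present).
flat["StateFIPS"] = flat["StateFIPS"].astype(int)

# Ways to inspect the current state of a dataframe.
print(flat.columns.tolist())
print(flat.dtypes)
print(flat.index)
```

Printing df.index, df.columns and df.dtypes after each step is a simple way to see what a dataframe currently looks like.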
On Fri, May 13, 2016 at 3:56 PM Michael Selik <michael.se...@gmail.com> wrote:
In order to preserve your index after the aggregation, you need to make sure it is considered a data column (via reset_index) and then choose how your aggregation will operate on that column.

On Fri, May 13, 2016 at 3:29 PM David Shi <davidg...@yahoo.co.uk> wrote:
Hello, Michael, Why reset_index before grouping? Regards. David

On Friday, 13 May 2016, 17:57, Michael Selik <michael.se...@gmail.com> wrote:
On Fri, May 13, 2016 at 12:27 PM David Shi via Python-list <python-list@python.org> wrote:
I lost my indexes after grouping in Pandas. I managed to reset_index and got back the index column. But how can I get back an index row?

Was the grouping an aggregation? If so, the original indexes are meaningless. What you could do is reset_index before the grouping and when you aggregate decide how to handle the formerly-known-as-index column (min, max, mean, ?).

-- https://mail.python.org/mailman/listinfo/python-list
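The advice above (reset_index before grouping, then decide how to aggregate the former index column) could be sketched roughly as follows. The frame reuses the small group/data example from earlier in the thread; the output column names first_label, last_label and total are hypothetical, and "first"/"last" are just one possible choice alongside min, max or mean.

```python
import pandas as pd

df = pd.DataFrame(
    {"group": list("AB") * 2, "data": range(4)},
    index=list("wxyz"),
)

# Turn the index into an ordinary column (named 'index' by default).
flat = df.reset_index()

# Choose explicitly how the former index column is aggregated,
# alongside the aggregation for the data column.
result = flat.groupby("group").agg(
    first_label=("index", "first"),
    last_label=("index", "last"),
    total=("data", "sum"),
)
print(result)
```

Because an aggregation collapses several rows into one, there is no single original label per group; the sketch keeps the first and last label so that the choice is visible in the result.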