You might also be interested in "Python for Data Analysis" for a thorough discussion of Pandas. http://shop.oreilly.com/product/0636920023784.do
On Sat, May 14, 2016 at 10:29 AM Michael Selik <michael.se...@gmail.com> wrote:
> David, it sounds like you'll need a thorough introduction to the basics of
> Python.
> Check out the tutorial: https://docs.python.org/3/tutorial/
>
> On Sat, May 14, 2016 at 6:19 AM David Shi <davidg...@yahoo.co.uk> wrote:
>
>> Hello, Michael,
>>
>> I discovered that the problem is that "two columns of data are put
>> together" and "are recognised as one column".
>>
>> This is very strange. I would like to understand the subject well.
>>
>> Also, how many ways are there to investigate the nature of objects
>> dynamically?
>>
>> Some object types are only shown as an object. Is there anything that can
>> be typed in Python to reveal objects?
>>
>> Regards.
>>
>> David
>>
>> On Saturday, 14 May 2016, 4:30, Michael Selik <michael.se...@gmail.com>
>> wrote:
>>
>> What were you hoping to get from ``df[0]``?
>> When you say it "yields nothing" do you mean it raised an error? What was
>> the error message?
>>
>> Have you tried a Google search for "pandas set index"?
>>
>> http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.set_index.html
>>
>> On Fri, May 13, 2016 at 11:18 PM David Shi <davidg...@yahoo.co.uk> wrote:
>>
>> Hello, Michael,
>>
>> I tried to discover the problem.
>>
>> df[0] yields nothing
>> df[1] yields nothing
>> df[2] yields nothing
>>
>> However, df[3] gives the following:
>>
>> sid
>> -9223372036854775808          NaN
>>  1                      133738.70
>>  4                      295256.11
>>  5                      137733.09
>>  6                      409413.58
>>  8                      269600.97
>>  9                       12852.94
>>
>> Can we split this back to normal? Or turn it into a dictionary, so that I
>> can put the values back properly?
>>
>> I would like to use sid as the index in some way.
>>
>> Regards.
>>
>> David
>>
>> On Friday, 13 May 2016, 22:58, Michael Selik <michael.se...@gmail.com>
>> wrote:
>>
>> What code have you tried? What error message are you receiving?
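For the question above about inspecting objects dynamically, here is a minimal sketch using Python built-ins (`type`, `dir`) and a few pandas attributes. The frame and its column names (`sid`, `amount`) are hypothetical stand-ins, with sample values borrowed from the quoted `df[3]` output; this is not the original data.

```python
import pandas as pd

# Hypothetical frame; the real data from the thread is not available.
df = pd.DataFrame({
    "sid": [1, 4, 5],
    "amount": [133738.70, 295256.11, 137733.09],
})

# What kind of object is this?
kind = type(df).__name__                  # class name of the object
index_kind = type(df.index).__name__      # class name of its index

# Which columns exist, and what does each one hold?
columns = list(df.columns)
dtypes = df.dtypes.astype(str).to_dict()  # column name -> dtype name

# What can the object do? dir() lists its attributes and methods.
public = [name for name in dir(df) if not name.startswith("_")]

# Using an existing column as the index, as asked about in the thread.
indexed = df.set_index("sid")

print(kind, index_kind, columns, dtypes)
print(indexed)
```

In an interactive session, `help(df.set_index)` and `df.info()` give further detail about a method or about the frame as a whole.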
>>
>> On Fri, May 13, 2016, 5:54 PM David Shi <davidg...@yahoo.co.uk> wrote:
>>
>> Hello, Michael,
>>
>> How do I convert a float-type column into an integer, label, or string
>> type?
>>
>> On Friday, 13 May 2016, 22:02, Michael Selik <michael.se...@gmail.com>
>> wrote:
>>
>> To clarify that you're specifying the index as a label, use df.loc
>> (df.iloc selects by integer position instead).
>>
>> >>> df = pd.DataFrame({'X': range(4)}, index=list('abcd'))
>> >>> df
>>    X
>> a  0
>> b  1
>> c  2
>> d  3
>> >>> df.loc['a']
>> X    0
>> Name: a, dtype: int64
>> >>> df.iloc[0]
>> X    0
>> Name: a, dtype: int64
>>
>> On Fri, May 13, 2016 at 4:54 PM David Shi <davidg...@yahoo.co.uk> wrote:
>>
>> Dear Michael,
>>
>> To avoid complication, I only group by one column.
>>
>> It is OK now. But how do I refer to the new row index? How do I use a
>> floating-point index?
>>
>> Float64Index([ 1.0,  4.0,  5.0,  6.0,  8.0,  9.0, 10.0, 11.0, 12.0, 13.0,
>>               16.0, 17.0, 18.0, 19.0, 20.0, 21.0, 22.0, 23.0, 24.0, 25.0,
>>               26.0, 27.0, 28.0, 29.0, 30.0, 31.0, 32.0, 33.0, 34.0, 35.0,
>>               36.0, 37.0, 38.0, 39.0, 40.0, 41.0, 42.0, 44.0, 45.0, 46.0,
>>               47.0, 48.0, 49.0, 50.0, 51.0, 53.0, 54.0, 55.0, 56.0],
>>              dtype='float64', name=u'StateFIPS')
>>
>> Regards.
>>
>> David
>>
>> On Friday, 13 May 2016, 21:43, Michael Selik <michael.se...@gmail.com>
>> wrote:
>>
>> Here's an example.
>>
>> >>> import pandas as pd
>> >>> df = pd.DataFrame({'group': list('AB') * 2, 'data': range(4)},
>> ...                   index=list('wxyz'))
>> >>> df
>>    data group
>> w     0     A
>> x     1     B
>> y     2     A
>> z     3     B
>> >>> df = df.reset_index()
>> >>> df
>>   index  data group
>> 0     w     0     A
>> 1     x     1     B
>> 2     y     2     A
>> 3     z     3     B
>> >>> df.groupby('group').max()
>>       index  data
>> group
>> A         y     2
>> B         z     3
>>
>> If that doesn't help, you'll need to explain what you're trying to
>> accomplish in detail -- what variables you started with, what
>> transformations you want to do, and what variables you hope to have when
>> finished.
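The two questions above (label versus position lookup, and converting a float index) can be sketched together. This is an illustration under assumptions: the three-row frame and its `value` column are hypothetical, and only the `StateFIPS` index name comes from the thread.

```python
import pandas as pd

# Hypothetical data with a float-valued index named like the one in the thread.
df = pd.DataFrame(
    {"value": [10, 20, 30]},
    index=pd.Index([1.0, 4.0, 5.0], name="StateFIPS"),
)

# .loc looks rows up by index label; .iloc looks them up by integer position.
by_label = df.loc[4.0, "value"]      # the row whose label is 4.0
by_position = df.iloc[1]["value"]    # the second row, whatever its label

# Convert the float labels to integers (safe here because every label is
# a whole number).
df_int = df.copy()
df_int.index = df_int.index.astype("int64")

# Or convert the labels to strings, if they are identifiers rather than numbers.
df_str = df.copy()
df_str.index = df_int.index.astype(str)

print(by_label, by_position)
print(df_int.index.tolist(), df_str.index.tolist())
```

The same `astype` call works on a column, for example `df["value"].astype(float)`. Converting a float column to an integer type raises an error if the column contains NaN, so missing values need handling first.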
>>
>> On Fri, May 13, 2016 at 4:36 PM David Shi <davidg...@yahoo.co.uk> wrote:
>>
>> Hello, Michael,
>>
>> I changed groupby with one column.
>>
>> The index is different.
>>
>> Index([   u'AL',    u'AR',    u'AZ',    u'CA',    u'CO',    u'CT',    u'DC',
>>           u'DE',    u'FL',    u'GA',    u'IA',    u'ID',    u'IL',    u'IN',
>>           u'KS',    u'KY',    u'LA',    u'MA',    u'MD',    u'ME',    u'MI',
>>           u'MN',    u'MO',    u'MS',    u'MT',    u'NC',    u'ND',    u'NE',
>>           u'NH',    u'NJ',    u'NM',    u'NV',    u'NY',    u'OH',    u'OK',
>>           u'OR',    u'PA',    u'RI',    u'SC',    u'SD', u'State',    u'TN',
>>           u'TX',    u'UT',    u'VA',    u'VT',    u'WA',    u'WI',    u'WV',
>>           u'WY'],
>>       dtype='object', name=0)
>>
>> How to use this index?
>>
>> Regards.
>>
>> David
>>
>> On Friday, 13 May 2016, 21:19, David Shi <davidg...@yahoo.co.uk> wrote:
>>
>> Hello, Michael,
>>
>> I typed in df.index
>>
>> I got the following
>>
>> MultiIndex(levels=[[1.0, 4.0, 5.0, 6.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0,
>> 16.0, 17.0, 18.0, 19.0, 20.0, 21.0, 22.0, 23.0, 24.0, 25.0, 26.0, 27.0,
>> 28.0, 29.0, 30.0, 31.0, 32.0, 33.0, 34.0, 35.0, 36.0, 37.0, 38.0, 39.0,
>> 40.0, 41.0, 42.0, 44.0, 45.0, 46.0, 47.0, 48.0, 49.0, 50.0, 51.0, 53.0,
>> 54.0, 55.0, 56.0], [u'AL', u'AR', u'AZ', u'CA', u'CO', u'CT', u'DC', u'DE',
>> u'FL', u'GA', u'IA', u'ID', u'IL', u'IN', u'KS', u'KY', u'LA', u'MA', u'MD',
>> u'ME', u'MI', u'MN', u'MO', u'MS', u'MT', u'NC', u'ND', u'NE', u'NH', u'NJ',
>> u'NM', u'NV', u'NY', u'OH', u'OK', u'OR', u'PA', u'RI', u'SC', u'SD',
>> u'State', u'TN', u'TX', u'UT', u'VA', u'VT', u'WA', u'WI', u'WV', u'WY']],
>>            labels=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
>> 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
>> 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48], [0, 2, 1, 3, 4, 5,
>> 7, 6, 8, 9, 11, 12, 13, 10, 14, 15, 16, 19, 18, 17, 20, 21, 23, 22, 24, 27,
>> 31, 28, 29, 30, 32, 25, 26, 33, 34, 35, 36, 37, 38, 39, 41, 42, 43, 45, 44,
>> 46, 48, 47, 49]],
>>            names=[u'StateFIPS', 0])
>>
>> Regards.
>>
>> David
>>
>> On Friday, 13 May 2016, 21:11, David Shi <davidg...@yahoo.co.uk> wrote:
>>
>> Dear Michael,
>>
>> I have done a number of operations in between.
>>
>> Providing that information does not help you.
>>
>> How to reset the index after grouping and various operations is of
>> interest.
>>
>> How to type in a command to find out its current dataframe?
>>
>> Regards.
>>
>> David
>>
>> On Friday, 13 May 2016, 20:58, Michael Selik <michael.se...@gmail.com>
>> wrote:
>>
>> Just in case I misunderstood, why don't you make a little example of
>> before and after the grouping? This mailing list does not accept
>> attachments, so you'll have to make do with pasting a few rows of
>> comma-separated or tab-separated values.
>>
>> On Fri, May 13, 2016 at 3:56 PM Michael Selik <michael.se...@gmail.com>
>> wrote:
>>
>> In order to preserve your index after the aggregation, you need to make
>> sure it is considered a data column (via reset_index) and then choose how
>> your aggregation will operate on that column.
>>
>> On Fri, May 13, 2016 at 3:29 PM David Shi <davidg...@yahoo.co.uk> wrote:
>>
>> Hello, Michael,
>>
>> Why reset_index before grouping?
>>
>> Regards.
>>
>> David
>>
>> On Friday, 13 May 2016, 17:57, Michael Selik <michael.se...@gmail.com>
>> wrote:
>>
>> On Fri, May 13, 2016 at 12:27 PM David Shi via Python-list
>> <python-list@python.org> wrote:
>>
>> I lost my indexes after grouping in Pandas.
>> I managed to reset_index and got back the index column.
>> But how can I get back an index row?
>>
>> Was the grouping an aggregation? If so, the original indexes are
>> meaningless. What you could do is reset_index before the grouping and when
>> you aggregate decide how to handle the formerly-known-as-index column (min,
>> max, mean, ?).

--
https://mail.python.org/mailman/listinfo/python-list
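The advice in the earliest reply (call reset_index before grouping, then decide how to aggregate the former index column) can be sketched as follows. The frame mirrors the small example given earlier in the thread; the index name `key` and the choice of `"first"` as the aggregation for it are assumptions made for illustration only.

```python
import pandas as pd

# Small frame modelled on the example in the thread; "key" is a made-up name
# for the original index.
df = pd.DataFrame(
    {"group": ["A", "B", "A", "B"], "data": [0, 1, 2, 3]},
    index=pd.Index(list("wxyz"), name="key"),
)

# Step 1: turn the index into an ordinary column so it survives the groupby.
flat = df.reset_index()

# Step 2: aggregate, choosing explicitly what to do with the old index column.
# Here we keep the first key seen in each group and the maximum of "data".
summary = flat.groupby("group").agg({"key": "first", "data": "max"})

# Step 3 (optional): the group labels are now the index; reset_index turns
# them back into a column.
summary_flat = summary.reset_index()

# Alternative: keep the original row labels of the rows holding each
# group's maximum, instead of aggregating the labels.
top_rows = df.loc[df.groupby("group")["data"].idxmax()]

print(summary_flat)
print(top_rows)
```

Which aggregation suits the former index column depends on what the labels mean. When the goal is to know which original row produced each group's result, the `idxmax` variant is often closer to what is wanted.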