I have recently been going through "Data Science From Scratch", which may be of interest. There is a podcast with the author on Talk Python To Me:
https://talkpython.fm/episodes/show/56/data-science-from-scratch

On Sat, May 14, 2016 at 10:33 AM, Michael Selik <michael.se...@gmail.com> wrote:

> You might also be interested in "Python for Data Analysis" for a thorough
> discussion of Pandas.
> http://shop.oreilly.com/product/0636920023784.do
>
> On Sat, May 14, 2016 at 10:29 AM Michael Selik <michael.se...@gmail.com>
> wrote:
>
> > David, it sounds like you'll need a thorough introduction to the basics
> > of Python.
> > Check out the tutorial: https://docs.python.org/3/tutorial/
> >
> > On Sat, May 14, 2016 at 6:19 AM David Shi <davidg...@yahoo.co.uk> wrote:
> >
> >> Hello, Michael,
> >>
> >> I discovered that the problem is "two columns of data are put together"
> >> and "are recognised as one column".
> >>
> >> This is very strange. I would like to understand the subject well.
> >>
> >> And, how many ways are there to investigate the nature of objects
> >> dynamically?
> >>
> >> Some object types only get shown as an object. Is there anything that
> >> can be typed in Python to reveal objects?
> >>
> >> Regards.
> >>
> >> David
> >>
> >>
> >> On Saturday, 14 May 2016, 4:30, Michael Selik <michael.se...@gmail.com>
> >> wrote:
> >>
> >>
> >> What were you hoping to get from ``df[0]``?
> >> When you say it "yields nothing" do you mean it raised an error? What
> >> was the error message?
> >>
> >> Have you tried a Google search for "pandas set index"?
> >>
> >> http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.set_index.html
> >>
> >> On Fri, May 13, 2016 at 11:18 PM David Shi <davidg...@yahoo.co.uk> wrote:
> >>
> >> Hello, Michael,
> >>
> >> I tried to discover the problem.
> >>
> >> df[0] yields nothing
> >> df[1] yields nothing
> >> df[2] yields nothing
> >>
> >> However, df[3] gives the following:
> >>
> >> sid
> >> -9223372036854775808          NaN
> >>  1                      133738.70
> >>  4                      295256.11
> >>  5                      137733.09
> >>  6                      409413.58
> >>  8                      269600.97
> >>  9                       12852.94
> >>
> >>
> >> Can we split this back to normal? Or turn it into a dictionary, so
> >> that I can put the values back properly?
> >>
> >>
> >> I would like to use sid as the index, some way.
> >>
> >>
> >> Regards.
> >>
> >>
> >> David
> >>
> >>
> >>
> >> On Friday, 13 May 2016, 22:58, Michael Selik <michael.se...@gmail.com>
> >> wrote:
> >>
> >>
> >> What code have you tried? What error message are you receiving?
> >>
> >> On Fri, May 13, 2016, 5:54 PM David Shi <davidg...@yahoo.co.uk> wrote:
> >>
> >> Hello, Michael,
> >>
> >> How to convert a float type column into an integer or label or string
> >> type?
> >>
> >>
> >> On Friday, 13 May 2016, 22:02, Michael Selik <michael.se...@gmail.com>
> >> wrote:
> >>
> >>
> >> To clarify that you're specifying the index as a label, use df.loc
> >> (df.iloc selects by position instead):
> >>
> >> >>> import pandas as pd
> >> >>> df = pd.DataFrame({'X': range(4)}, index=list('abcd'))
> >> >>> df
> >>    X
> >> a  0
> >> b  1
> >> c  2
> >> d  3
> >> >>> df.loc['a']
> >> X    0
> >> Name: a, dtype: int64
> >> >>> df.iloc[0]
> >> X    0
> >> Name: a, dtype: int64
> >>
> >> On Fri, May 13, 2016 at 4:54 PM David Shi <davidg...@yahoo.co.uk> wrote:
> >>
> >> Dear Michael,
> >>
> >> To avoid complication, I only groupby using one column.
> >>
> >> It is OK now. But how do I refer to the new row index? How do I use a
> >> floating index?
> >>
> >> Float64Index([ 1.0,  4.0,  5.0,  6.0,  8.0,  9.0, 10.0, 11.0, 12.0, 13.0, 16.0,
> >>               17.0, 18.0, 19.0, 20.0, 21.0, 22.0, 23.0, 24.0, 25.0, 26.0, 27.0,
> >>               28.0, 29.0, 30.0, 31.0, 32.0, 33.0, 34.0, 35.0, 36.0, 37.0, 38.0,
> >>               39.0, 40.0, 41.0, 42.0, 44.0, 45.0, 46.0, 47.0, 48.0, 49.0, 50.0,
> >>               51.0, 53.0, 54.0, 55.0, 56.0],
> >>              dtype='float64', name=u'StateFIPS')
> >>
> >>
> >> Regards.
> >>
> >>
> >> David
> >>
> >>
> >>
> >> On Friday, 13 May 2016, 21:43, Michael Selik <michael.se...@gmail.com>
> >> wrote:
> >>
> >>
> >> Here's an example.
> >>
> >> >>> import pandas as pd
> >> >>> df = pd.DataFrame({'group': list('AB') * 2, 'data': range(4)},
> >> ...                   index=list('wxyz'))
> >> >>> df
> >>    data group
> >> w     0     A
> >> x     1     B
> >> y     2     A
> >> z     3     B
> >> >>> df = df.reset_index()
> >> >>> df
> >>   index  data group
> >> 0     w     0     A
> >> 1     x     1     B
> >> 2     y     2     A
> >> 3     z     3     B
> >> >>> df.groupby('group').max()
> >>       index  data
> >> group
> >> A         y     2
> >> B         z     3
> >>
> >> If that doesn't help, you'll need to explain what you're trying to
> >> accomplish in detail -- what variables you started with, what
> >> transformations you want to do, and what variables you hope to have when
> >> finished.
> >>
> >> On Fri, May 13, 2016 at 4:36 PM David Shi <davidg...@yahoo.co.uk> wrote:
> >>
> >> Hello, Michael,
> >>
> >> I changed groupby with one column.
> >>
> >> The index is different.
> >>
> >> Index([   u'AL',    u'AR',    u'AZ',    u'CA',    u'CO',    u'CT',    u'DC',
> >>           u'DE',    u'FL',    u'GA',    u'IA',    u'ID',    u'IL',    u'IN',
> >>           u'KS',    u'KY',    u'LA',    u'MA',    u'MD',    u'ME',    u'MI',
> >>           u'MN',    u'MO',    u'MS',    u'MT',    u'NC',    u'ND',    u'NE',
> >>           u'NH',    u'NJ',    u'NM',    u'NV',    u'NY',    u'OH',    u'OK',
> >>           u'OR',    u'PA',    u'RI',    u'SC',    u'SD', u'State',    u'TN',
> >>           u'TX',    u'UT',    u'VA',    u'VT',    u'WA',    u'WI',    u'WV',
> >>           u'WY'],
> >>       dtype='object', name=0)
> >>
> >>
> >> How to use this index?
> >>
> >>
> >> Regards.
> >>
> >>
> >> David
> >>
> >>
> >>
> >> On Friday, 13 May 2016, 21:19, David Shi <davidg...@yahoo.co.uk> wrote:
> >>
> >>
> >> Hello, Michael,
> >>
> >> I typed in df.index
> >>
> >> I got the following
> >>
> >> MultiIndex(levels=[[1.0, 4.0, 5.0, 6.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0,
> >>                     16.0, 17.0, 18.0, 19.0, 20.0, 21.0, 22.0, 23.0, 24.0,
> >>                     25.0, 26.0, 27.0, 28.0, 29.0, 30.0, 31.0, 32.0, 33.0,
> >>                     34.0, 35.0, 36.0, 37.0, 38.0, 39.0, 40.0, 41.0, 42.0,
> >>                     44.0, 45.0, 46.0, 47.0, 48.0, 49.0, 50.0, 51.0, 53.0,
> >>                     54.0, 55.0, 56.0],
> >>                    [u'AL', u'AR', u'AZ', u'CA', u'CO', u'CT', u'DC', u'DE',
> >>                     u'FL', u'GA', u'IA', u'ID', u'IL', u'IN', u'KS', u'KY',
> >>                     u'LA', u'MA', u'MD', u'ME', u'MI', u'MN', u'MO', u'MS',
> >>                     u'MT', u'NC', u'ND', u'NE', u'NH', u'NJ', u'NM', u'NV',
> >>                     u'NY', u'OH', u'OK', u'OR', u'PA', u'RI', u'SC', u'SD',
> >>                     u'State', u'TN', u'TX', u'UT', u'VA', u'VT', u'WA',
> >>                     u'WI', u'WV', u'WY']],
> >>            labels=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
> >>                     16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29,
> >>                     30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43,
> >>                     44, 45, 46, 47, 48],
> >>                    [0, 2, 1, 3, 4, 5, 7, 6, 8, 9, 11, 12, 13, 10, 14, 15,
> >>                     16, 19, 18, 17, 20, 21, 23, 22, 24, 27, 31, 28, 29, 30,
> >>                     32, 25, 26, 33, 34, 35, 36, 37, 38, 39, 41, 42, 43, 45,
> >>                     44, 46, 48, 47, 49]],
> >>            names=[u'StateFIPS', 0])
> >>
> >> Regards.
> >>
> >>
> >> David
> >>
> >>
> >>
> >> On Friday, 13 May 2016, 21:11, David Shi <davidg...@yahoo.co.uk> wrote:
> >>
> >>
> >> Dear Michael,
> >>
> >> I have done a number of operations in between.
> >>
> >> Providing that information does not help you.
> >>
> >> How to reset the index after grouping and various operations is of
> >> interest.
> >>
> >> How to type in a command to find out its current dataframe?
> >>
> >> Regards.
> >>
> >> David
> >>
> >>
> >> On Friday, 13 May 2016, 20:58, Michael Selik <michael.se...@gmail.com>
> >> wrote:
> >>
> >>
> >> Just in case I misunderstood, why don't you make a little example of
> >> before and after the grouping? This mailing list does not accept
> >> attachments, so you'll have to make do with pasting a few rows of
> >> comma-separated or tab-separated values.
> >>
> >> On Fri, May 13, 2016 at 3:56 PM Michael Selik <michael.se...@gmail.com>
> >> wrote:
> >>
> >> In order to preserve your index after the aggregation, you need to make
> >> sure it is considered a data column (via reset_index) and then choose how
> >> your aggregation will operate on that column.
> >>
> >> On Fri, May 13, 2016 at 3:29 PM David Shi <davidg...@yahoo.co.uk> wrote:
> >>
> >> Hello, Michael,
> >>
> >> Why reset_index before grouping?
> >>
> >> Regards.
> >>
> >> David
> >>
> >>
> >> On Friday, 13 May 2016, 17:57, Michael Selik <michael.se...@gmail.com>
> >> wrote:
> >>
> >>
> >> On Fri, May 13, 2016 at 12:27 PM David Shi via Python-list <
> >> python-list@python.org> wrote:
> >>
> >> I lost my indexes after grouping in Pandas.
> >> I managed to reset_index and got back the index column.
> >> But how can I get back an index row?
> >>
> >>
> >> Was the grouping an aggregation? If so, the original indexes are
> >> meaningless. What you could do is reset_index before the grouping and,
> >> when you aggregate, decide how to handle the formerly-known-as-index
> >> column (min, max, mean, ?).
> >>
> >>
> --
> https://mail.python.org/mailman/listinfo/python-list
>

--
"On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into
the machine wrong figures, will the right answers come out?' I am not able
rightly to apprehend the kind of confusion of ideas that could provoke such
a question."
-Charles Babbage, 19th century English mathematician, philosopher, inventor
and mechanical engineer who originated the concept of a programmable
computer.
--
https://mail.python.org/mailman/listinfo/python-list
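
P.S. For anyone following the quoted thread, here is a minimal sketch of the advice in it: call reset_index before grouping, choose how each column is aggregated, then cast the float group keys to integers and look rows up with .loc or .iloc. The DataFrame, its column names (sid, value) and its numbers are made up for illustration; they are not David's data, and this is only one way to do it.

```python
import pandas as pd

# Made-up data: 'sid' is stored as floats but is really an integer key.
df = pd.DataFrame(
    {"sid": [1.0, 1.0, 4.0, 4.0, 5.0],
     "value": [10.0, 20.0, 30.0, 40.0, 50.0]},
    index=list("vwxyz"),
)

# Move the old index into an ordinary column (named 'index' by default)
# so that the aggregation can see it.
flat = df.reset_index()

# Aggregate per sid, stating explicitly how each column is combined.
grouped = flat.groupby("sid").agg({"index": "max", "value": "sum"})

# The group keys become a float index; cast it to integers.
grouped.index = grouped.index.astype("int64")

# .loc looks a row up by label, .iloc by integer position.
by_label = grouped.loc[4]
by_position = grouped.iloc[1]

# type() and dir() are the usual first steps for inspecting an object.
print(type(grouped.index))
print(by_label)
```

If the index should become an ordinary column again afterwards, grouped.reset_index() does that.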