Works perfectly. Thanks. f. On 1 August 2011 18:22, jim holtman <jholt...@gmail.com> wrote:
> Try this: had to add extra names to your data since it was not clear > how it was organized. Next time use 'dput' to enclose data. > > > x <- read.table(textConnection(" index time key date values > + 13732 27965 DATA.Q211.SUM.Index 04/08/11 1.42 > + 13733 27974 DATA.Q211.SUM.Index 05/10/11 1.45 > + 13734 27984 DATA.Q211.SUM.Index 06/01/11 1.22 > + 13746 28615 DATA.Q211.TDS.Index 04/07/11 1.35 > + 13747 28624 DATA.Q211.TDS.Index 05/20/11 1.40 > + 13754 29262 DATA.Q211.UBS.Index 05/02/11 1.30 > + 13755 29272 DATA.Q211.UBS.Index 05/03/11 1.48 > + 13761 29915 DATA.Q211.UCM.Index 04/28/11 1.43 > + 13768 30565 DATA.Q211.VDE.Index 05/02/11 1.48 > + 13775 31215 DATA.Q211.WF.Index 04/14/11 1.44 > + 13776 31225 DATA.Q211.WF.Index 05/12/11 1.42 > + 13789 31865 DATA.Q211.WPC.Index 04/01/11 1.40 > + 13790 31875 DATA.Q211.WPC.Index 04/08/11 1.42 > + 13791 31883 DATA.Q211.WPC.Index 05/10/11 1.43 > + 13804 32515 DATA.Q211.XTB.Index 04/29/11 1.50 > + 13805 32525 DATA.Q211.XTB.Index 05/30/11 1.40 > + 13806 32532 DATA.Q211.XTB.Index 06/28/11 1.43") > + , header = TRUE > + , as.is = TRUE > + ) > > closeAllConnections() > > x > index time key date values > 1 13732 27965 DATA.Q211.SUM.Index 04/08/11 1.42 > 2 13733 27974 DATA.Q211.SUM.Index 05/10/11 1.45 > 3 13734 27984 DATA.Q211.SUM.Index 06/01/11 1.22 > 4 13746 28615 DATA.Q211.TDS.Index 04/07/11 1.35 > 5 13747 28624 DATA.Q211.TDS.Index 05/20/11 1.40 > 6 13754 29262 DATA.Q211.UBS.Index 05/02/11 1.30 > 7 13755 29272 DATA.Q211.UBS.Index 05/03/11 1.48 > 8 13761 29915 DATA.Q211.UCM.Index 04/28/11 1.43 > 9 13768 30565 DATA.Q211.VDE.Index 05/02/11 1.48 > 10 13775 31215 DATA.Q211.WF.Index 04/14/11 1.44 > 11 13776 31225 DATA.Q211.WF.Index 05/12/11 1.42 > 12 13789 31865 DATA.Q211.WPC.Index 04/01/11 1.40 > 13 13790 31875 DATA.Q211.WPC.Index 04/08/11 1.42 > 14 13791 31883 DATA.Q211.WPC.Index 05/10/11 1.43 > 15 13804 32515 DATA.Q211.XTB.Index 04/29/11 1.50 > 16 13805 32525 DATA.Q211.XTB.Index 05/30/11 1.40 > 17 13806 32532 DATA.Q211.XTB.Index 06/28/11 1.43 > > # get index of first occurance of 'key' column > > indx <- !duplicated(x$key) > > x[indx,] > index time key date values > 1 13732 27965 DATA.Q211.SUM.Index 04/08/11 1.42 > 4 13746 28615 DATA.Q211.TDS.Index 04/07/11 1.35 > 6 13754 29262 DATA.Q211.UBS.Index 05/02/11 1.30 > 8 13761 29915 DATA.Q211.UCM.Index 04/28/11 1.43 > 9 13768 30565 DATA.Q211.VDE.Index 05/02/11 1.48 > 10 13775 31215 DATA.Q211.WF.Index 04/14/11 1.44 > 12 13789 31865 DATA.Q211.WPC.Index 04/01/11 1.40 > 15 13804 32515 DATA.Q211.XTB.Index 04/29/11 1.50 > > > > > > > > On Mon, Aug 1, 2011 at 11:13 AM, Francesca <francesca.panco...@gmail.com> > wrote: > > Dear Contributors > > thanks for any help you can provide. I searched the threads > > but I could not find any query that satisfied my needs. > > This is my database: > > index time values > > 13732 27965 DATA.Q211.SUM.Index 04/08/11 1.42 > > 13733 27974 DATA.Q211.SUM.Index 05/10/11 1.45 > > 13734 27984 DATA.Q211.SUM.Index 06/01/11 1.22 > > 13746 28615 DATA.Q211.TDS.Index 04/07/11 1.35 > > 13747 28624 DATA.Q211.TDS.Index 05/20/11 1.40 > > 13754 29262 DATA.Q211.UBS.Index 05/02/11 1.30 > > 13755 29272 DATA.Q211.UBS.Index 05/03/11 1.48 > > 13761 29915 DATA.Q211.UCM.Index 04/28/11 1.43 > > 13768 30565 DATA.Q211.VDE.Index 05/02/11 1.48 > > 13775 31215 DATA.Q211.WF.Index 04/14/11 1.44 > > 13776 31225 DATA.Q211.WF.Index 05/12/11 1.42 > > 13789 31865 DATA.Q211.WPC.Index 04/01/11 1.40 > > 13790 31875 DATA.Q211.WPC.Index 04/08/11 1.42 > > 13791 31883 DATA.Q211.WPC.Index 05/10/11 1.43 > > 13804 32515 DATA.Q211.XTB.Index 04/29/11 1.50 > > 13805 32525 DATA.Q211.XTB.Index 05/30/11 1.40 > > 13806 32532 DATA.Q211.XTB.Index 06/28/11 1.43 > > > > I need to select only the rows of this database that correspond to each > > of the first occurrences of the string represented in column > > index. In the example shown I would like to obtain a new > > data.frame which is > > > > index time values > > 13732 27965 DATA.Q211.SUM.Index 04/08/11 1.42 > > 13746 28615 DATA.Q211.TDS.Index 04/07/11 1.35 > > 13754 29262 DATA.Q211.UBS.Index 05/02/11 1.30 > > 13761 29915 DATA.Q211.UCM.Index 04/28/11 1.43 > > 13768 30565 DATA.Q211.VDE.Index 05/02/11 1.48 > > 13775 31215 DATA.Q211.WF.Index 04/14/11 1.44 > > 13789 31865 DATA.Q211.WPC.Index 04/01/11 1.40 > > 13804 32515 DATA.Q211.XTB.Index 04/29/11 1.50 > > > > As you can see, it is not the whole string to change, > > rather a substring that is part of it. I want to select > > only the first values related to the row that presents for the first time > > the different part of the string(substring). > > I know how to select rows according to a substring condition on the > > index column, but I cannot use it here because the substring changes > > and moreover the number of occurrences per substring is variable. > > > > Thank you for any help you can provide. > > Francesca > > > > [[alternative HTML version deleted]] > > > > ______________________________________________ > > R-help@r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > > > > > -- > Jim Holtman > Data Munger Guru > > What is the problem that you are trying to solve? > -- Francesca ---------------------------------- Francesca Pancotto, PhD Dipartimento di Economia Università di Bologna Piazza Scaravilli, 2 40126 Bologna Office: +39 051 2098135 Cell: +39 393 6019138 Web: http://www2.dse.unibo.it/francesca.pancotto/ ---------------------------------- [[alternative HTML version deleted]]
______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.