Can you give an example of what your final product would look like?  I'm
not exactly sure what you mean by triplets of features.

Assuming your data frame is called "df", the code below keeps the rows with
a Score > 0.6 and then splits them into groups, one per unique combination of
Scaff and Cat.  This might help you get started.  I'm not sure how you want
to define/create your triplets within each group.

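# keep rows with Score above 0.6
df2 <- df[df$Score > 0.6, ]
# one data frame per Scaff/Cat combination; drop = TRUE skips empty combinations
split(df2, interaction(df2$Scaff, df2$Cat, drop = TRUE))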
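
If "triplet" turns out to mean three consecutive rows within a group (sorted
by Start) where each gap, Start of row[i+1] - End of row[i], is at most 10000,
then something like the sketch below might work.  That reading is only my
assumption, and count_triplets is a made-up helper name, not an existing
function.  The data frame here is just a subset of the rows you posted, so the
example runs on its own.

```r
# A subset of the rows from the original post, so the example is self-contained
df <- read.table(header = TRUE, text = "
Scaff     Start   End     Score Cat
scaff_234 767099  767299  0.93  cat1
scaff_234 790221  790421  0.924 cat1
scaff_234 1597306 1597885 0.864 cat2
scaff_234 1598617 1599091 0.908 cat2
scaff_234 1822690 1822890 0.918 cat2
scaff_234 1824905 1825105 0.854 cat2
scaff_234 1826092 1826292 0.95  cat2
scaff_234 1855240 1855457 0.962 cat2
")

# Hypothetical helper (assumed definition of a triplet): count windows of
# three consecutive rows, sorted by Start, whose two successive gaps
# Start[i+1] - End[i] are both <= max_gap.
count_triplets <- function(g, max_gap = 10000) {
  if (nrow(g) < 3) return(0L)
  g <- g[order(g$Start), ]
  gap <- g$Start[-1] - g$End[-nrow(g)]  # gap between row i and row i+1
  ok <- gap <= max_gap
  sum(ok[-length(ok)] & ok[-1])         # both gaps in a window must be small
}

df2 <- df[df$Score > 0.6, ]
groups <- split(df2, interaction(df2$Scaff, df2$Cat, drop = TRUE))
counts <- sapply(groups, count_triplets)
counts
```

On this subset, counts should be 0 for scaff_234.cat1 (only two rows) and 1
for scaff_234.cat2 (the rows starting at 1822690, 1824905 and 1826092).  If
you meant any three rows in a group rather than consecutive ones, combn()
within each group would be the place to start instead.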

Jean



On Sat, Aug 3, 2013 at 10:42 AM, PQuery <pierre.khoue...@embl.de> wrote:

> Dear all,
>
> I have a data frame of features (example pasted below) from which I would
> like to select, say:
>
> how many triplets of features (corresponding to rows) have the same Scaff
> and the same "Cat" and a score >0.6 and fall in a distance of max 10000
> (distance defined as Start of row[i+1] - End of row[i])
>
> I've been trying that using selectors and combn in R but it is becoming
> complicated.
> Is there an intuitive way to achieve that elegantly ?
>
> Many thanks,
> Best,
>
> Scaff   Start   End     Score   Cat
> scaff_234       767099  767299  0.93    cat1
> scaff_234       790221  790421  0.924   cat1
> scaff_234       1341263 1341463 0.845   cat2
> scaff_234       1543343 1543543 0.715   cat2
> scaff_234       1551844 1552044 0.967   cat1
> scaff_234       1560829 1561029 0.825   cat2
> scaff_234       1580868 1581068 0.929   cat3
> scaff_234       1589612 1589812 0.744   cat3
> scaff_234       1597306 1597885 0.864   cat2
> scaff_234       1598617 1599091 0.908   cat2
> scaff_234       1613500 1613700 0.705   cat2
> scaff_234       1614297 1614643 0.748   cat1
> scaff_234       1623852 1624052 0.799   cat2
> scaff_234       1669873 1670073 0.691   cat2
> scaff_234       1670210 1670515 0.904   cat1
> scaff_234       1822690 1822890 0.918   cat2
> scaff_234       1824905 1825105 0.854   cat2
> scaff_234       1826092 1826292 0.95    cat2
> scaff_234       1855240 1855457 0.962   cat2
> scaff_234       1872803 1873106 0.97    cat2
> scaff_234       1894767 1894967 0.945   cat1
> scaff_234       1903338 1903538 0.854   cat3
> scaff_234       1920157 1920509 0.739   cat1
> scaff_234       1944032 1944232 0.871   cat2
> scaff_234       1976753 1976953 0.847   cat2
> scaff_234       1992677 1992877 0.694   cat2
> scaff_234       2007772 2007972 0.916   cat2
> scaff_234       2009638 2010167 0.945   cat2
>
>
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/subselecting-on-Data-frame-tp4672992.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
