Dear Jeff, this is precious help and a fabulous suggestion. I will go carefully over the R code that you have sent. Thanks a lot!
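
For anyone following the thread, here is a minimal sketch of why the i = 2, j = 1 test quoted further down did not update the list column, and one way it might be repaired. It assumes the list column holds character vectors, as Jeff suggests. The three-row data frame and the name delta are stand-ins for illustration, not code from the thread.

```r
# Stand-in for the first three rows of the data frame in the thread.
x <- data.frame(CHRA = c("chr1", "chr1", "chr1"),
                POSA = c(10, 15, 120),
                stringsAsFactors = FALSE)
x$labA <- paste(x$CHRA, x$POSA, sep = "_")

# One empty character vector per row, stored as a list column.
x$intersectA <- rep(list(character(0)), nrow(x))

i <- 2
j <- 1
delta <- 10  # hypothetical name for the +/- 10 window used in the thread

if (x$CHRA[i] == x$CHRA[j] &&
    x$POSA[i] > x$POSA[j] - delta &&
    x$POSA[i] < x$POSA[j] + delta) {
  # `[[` reaches the i-th element itself. `[` would return a length-1 list,
  # and c() on that yields a length-2 list that cannot fit into one slot.
  x$intersectA[[i]] <- c(x$intersectA[[i]], x$labA[j])
}

print(x$intersectA[[i]])
```

Growing the vectors one element at a time like this still reallocates on every append, so it is only meant to show the `[` versus `[[` distinction. The vectorised lapply() approach in Jeff's message avoids the repeated appends.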
On Wed, Jul 25, 2018 at 10:43 AM, Jeff Newmiller <jdnew...@dcn.davis.ca.us> wrote:

> The code below reeks of a misconception that lists are efficient to add
> items to, which is a confusion with the computer science term "linked
> list". In R, a list is NOT a linked list... it is a vector, which means
> the memory used by the list is allocated at the time it is created, and
> REALLOCATED when a new item is added. The only reason you should use a list
> is because you expect to put values of different types or shapes into it,
> which does not appear to apply in this use case.
>
> In R, you should make a valiant effort to create things right the first
> time, and if that doesn't work then preallocate the space you will need in
> the vectors you are working with. Since you have a need to store a variable
> number of elements in each intersectX element, the column needs to be a
> list but the elements of that list can perfectly well be character vectors.
>
> x <- data.frame( TYPE=c("DEL", "DEL", "DUP", "TRA", "INV", "TRA")
>                , CHRA=c("chr1", "chr1", "chr1", "chr1", "chr2", "chr2")
>                , POSA=c(10, 15, 120, 340, 100, 220)
>                , CHRB=c("chr1", "chr1", "chr1", "chr2", "chr2", "chr1")
>                , POSB=c(30, 100, 300, 20, 200, 320)
>                , stringsAsFactors = FALSE
>                )
> compareRng <- function( chr1, pos1, chr2, pos2, delta ) {
>   (  chr1 == chr2
>   & ( pos2 - delta ) < pos1
>   & pos1 < ( pos2 + delta )
>   )
> }
> makeIntersectX <- function( n, chrlabel, poslabel, delta ) {
>   lgclidx <- rep( TRUE, nrow( x ) )
>   lgclidx[ n ] <- FALSE
>   x[[ chrlabel ]][ compareRng( x[[ chrlabel ]][ n ]
>                              , x[[ poslabel ]][ n ]
>                              , x[[ chrlabel ]]
>                              , x[[ poslabel ]]
>                              , delta
>                              )
>                  & lgclidx
>                  ]
> }
>
> x$intersectA <- lapply( seq.int( nrow( x ) )
>                       , makeIntersectX
>                       , chrlabel = "CHRA"
>                       , poslabel = "POSA"
>                       , delta = 10L
>                       )
> x$intersectB <- lapply( seq.int( nrow( x ) )
>                       , makeIntersectX
>                       , chrlabel = "CHRB"
>                       , poslabel = "POSB"
>                       , delta = 21L
>                       )
>
> > x
>   TYPE CHRA POSA CHRB POSB intersectA intersectB
> 1  DEL chr1   10 chr1   30       chr1
> 2  DEL chr1   15 chr1  100       chr1
> 3  DUP chr1  120 chr1  300                  chr1
> 4  TRA chr1  340 chr2   20
> 5  INV chr2  100 chr2  200
> 6  TRA chr2  220 chr1  320                  chr1
>
> Note that depending on what you plan to do beyond this point, it might
> actually be more performant to use a data frame with repeated rows instead
> of list columns... but I cannot tell from what you have provided.
>
> On Wed, 25 Jul 2018, Bogdan Tanasa wrote:
>
>> Dear Thierry and Juan, thank you for your help. Thank you all.
>>
>> Now, if I would like to add an element to the empty list, how shall I do :
>> for example, shall i = 2, and j = 1, in a bit of more complex R code :
>>
>> x <- data.frame(TYPE=c("DEL", "DEL", "DUP", "TRA", "INV", "TRA"),
>>                 CHRA=c("chr1", "chr1", "chr1", "chr1", "chr2", "chr2"),
>>                 POSA=c(10, 15, 120, 340, 100, 220),
>>                 CHRB=c("chr1", "chr1", "chr1", "chr2", "chr2", "chr1"),
>>                 POSB=c(30, 100, 300, 20, 200, 320))
>>
>> x$labA <- paste(x$CHRA, x$POSA, sep="_")
>> x$labB <- paste(x$CHRB, x$POSB, sep="_")
>>
>> x$POSA_left <- x$POSA - 10
>> x$POSA_right <- x$POSA + 10
>>
>> x$POSB_left <- x$POSB - 10
>> x$POSB_right <- x$POSB + 10
>>
>> x$intersectA <- rep(list(list()), nrow(x))
>> x$intersectB <- rep(list(list()), nrow(x))
>>
>> And we know that for i = 2, and j = 1, the condition is TRUE :
>>
>> i <- 2
>>
>> j <- 1
>>
>> if ( (x$CHRA[i] == x$CHRA[j] ) &&
>>      (x$POSA[i] > x$POSA_left[j] ) &&
>>      (x$POSA[i] < x$POSA_right[j] ) ){
>>   x$intersectA[i] <- c(x$intersectA[i], x$labA[j])}
>>
>> the R code does not work. Thank you for your kind help !
>>
>> On Wed, Jul 25, 2018 at 12:26 AM, Thierry Onkelinx <thierry.onkel...@inbo.be> wrote:
>>
>>> Dear Bogdan,
>>>
>>> You are looking for x$intersectA <- vector("list", nrow(x))
>>>
>>> Best regards,
>>>
>>> ir. Thierry Onkelinx
>>> Statisticus / Statistician
>>>
>>> Vlaamse Overheid / Government of Flanders
>>> INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND FOREST
>>> Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
>>> thierry.onkel...@inbo.be
>>> Havenlaan 88 bus 73, 1000 Brussel
>>> www.inbo.be
>>>
>>> ///////////////////////////////////////////////////////////////////////////
>>> To call in the statistician after the experiment is done may be no more
>>> than asking him to perform a post-mortem examination: he may be able to say
>>> what the experiment died of. ~ Sir Ronald Aylmer Fisher
>>> The plural of anecdote is not data. ~ Roger Brinner
>>> The combination of some data and an aching desire for an answer does not
>>> ensure that a reasonable answer can be extracted from a given body of data.
>>> ~ John Tukey
>>> ///////////////////////////////////////////////////////////////////////////
>>>
>>> 2018-07-25 8:55 GMT+02:00 Bogdan Tanasa <tan...@gmail.com>:
>>>
>>>> Dear all,
>>>>
>>>> assuming that I do have a dataframe like :
>>>>
>>>> x <- data.frame(TYPE=c("DEL", "DEL", "DUP", "TRA", "INV", "TRA"),
>>>>                 CHRA=c("chr1", "chr1", "chr1", "chr1", "chr2", "chr2"),
>>>>                 POSA=c(10, 15, 120, 340, 100, 220),
>>>>                 CHRB=c("chr1", "chr1", "chr1", "chr2", "chr2", "chr1"),
>>>>                 POSB=c(30, 100, 300, 20, 200, 320)) ,
>>>>
>>>> how could I initiate another 2 columns in x, where each element in these 2
>>>> columns is going to be a list (the list could be updated later). Thank you !
>>>>
>>>> Shall I do,
>>>>
>>>> for (i in 1:dim(x)[1]) { x$intersectA[i] <- list()}
>>>>
>>>> for (i in 1:dim(x)[1]) { x$intersectB[i] <- list()}
>>>>
>>>> nothing is happening. Thank you very much !
>>>>
>>>> [[alternative HTML version deleted]]
>>>>
>>>> ______________________________________________
>>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>
> ---------------------------------------------------------------------------
> Jeff Newmiller                        The     .....       .....  Go Live...
> DCN:<jdnew...@dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>                                       Live:   OO#.. Dead: OO#..  Playing
> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
> ---------------------------------------------------------------------------
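
Jeff's closing remark about "a data frame with repeated rows instead of list columns" is not spelled out in the thread. The sketch below is one possible reading of it, not his code: each row is paired with every other row on the same chromosome, and only the pairs within the window are kept, giving one row per match. The names row, pairs, hits and delta are hypothetical, and only the CHRA/POSA side is shown.

```r
x <- data.frame(TYPE = c("DEL", "DEL", "DUP", "TRA", "INV", "TRA"),
                CHRA = c("chr1", "chr1", "chr1", "chr1", "chr2", "chr2"),
                POSA = c(10, 15, 120, 340, 100, 220),
                stringsAsFactors = FALSE)
x$row <- seq_len(nrow(x))  # hypothetical row identifier

# Pair every row with every row on the same chromosome (including itself).
side <- x[c("row", "CHRA", "POSA")]
pairs <- merge(side, side, by = "CHRA", suffixes = c(".self", ".other"))

delta <- 10
# Keep pairs of distinct rows whose positions differ by less than delta,
# which matches pos2 - delta < pos1 < pos2 + delta in compareRng().
hits <- pairs[pairs$row.self != pairs$row.other &
              abs(pairs$POSA.self - pairs$POSA.other) < delta, ]
hits <- hits[order(hits$row.self), ]

print(hits)
```

In this layout a row with no match simply does not appear in hits, and a row with several matches appears several times. The self-merge builds every same-chromosome pair before filtering, so its size grows quadratically with the number of rows per chromosome. That is fine for a table this small but may matter for large inputs.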