Great, thanks Arun, but I seem to be running into this error. Not sure what
did I miss.

>
result<-data.frame(final_ouput[,-5],read.table(text=as.character(final_output$comment),sep="|",fill=TRUE,na.strings=""),stringsAsFactors=FALSE)colnames(result)[5:7]<-paste0("DataComment",1:3)
Error: unexpected symbol in
"result<-data.frame(final_ouput[,-5],read.table(text=as.character(final_output$comment),sep="|",fill=TRUE,na.strings=""),stringsAsFactors=FALSE)colnames"


On Tue, Jun 11, 2013 at 5:09 PM, arun <smartpink...@yahoo.com> wrote:

>
>
>
> HI,
> You could use:
> result3<-
> data.frame(result2[,-5],read.table(text=as.character(result2$comment),sep="|",fill=TRUE,na.strings=""),stringsAsFactors=FALSE)
> colnames(result3)[5:7]<- paste0("DataComment",1:3)
> A.K.
> ________________________________
> From: Shreya Rawal <rawal.shr...@gmail.com>
> To: arun <smartpink...@yahoo.com>
> Sent: Tuesday, June 11, 2013 4:22 PM
> Subject: Re: [R] Combining CSV data
>
>
>
> Hey Arun,
>
> I guess you could guide me with this a little bit. I have been working on
> the solution Jim suggested (and also because that I could understand it
> with my little knowledge of R :))
>
> So with these commands I am able to get the data in this format:
>
> > fileA <- read.csv(text = "Row_ID_CR,   Data1,    Data2,    Data3
> + 1,                   aa,          bb,          cc
> + 2,                   dd,          ee,          ff", as.is = TRUE)
> >
> > fileB <- read.csv(text = "Row_ID_N,   Src_Row_ID,   DataN1
> + 1a,               1,                   This is comment 1
> + 2a,               1,                   This is comment 2
> + 3a,               2,                   This is comment 1
> + 4a,               1,                   This is comment 3", as.is = TRUE)
> >
> > # get rid of leading/trailing blanks on comments
> > fileB$DataN1 <- gsub("^ *| *$", "", fileB$DataN1)
> >
> > # merge together
> > result <- merge(fileA, fileB, by.x = 'Row_ID_CR', by.y = "Src_Row_ID")
> >
> > # now partition by Row_ID_CR and aggregate the comments
> > result2 <- do.call(rbind,
> +     lapply(split(result, result$Row_ID_CR), function(.grp){
> +         cbind(.grp[1L, -c(5,6)], comment = paste(.grp$DataN1, collapse =
> '|'))
> +     })
> + )
>
> Row_ID_CR                 Data1        Data2        Data3
>                                 comment
> 1         1                    aa           bb           cc
>                                This is comment 1| This is comment 2| This
> is comment 3
> 2         2                    dd           ee           ff
>                                  This is comment 1| This is Comment 2
>
> I can even split the last column by
> this: strsplit(as.character(result2$comment), split='\\|')
>
> [[1]]
> [1] "This is comment 1" "This is comment 2" " This is comment 3"
>
> [[2]]
> [1] "This is comment 1" "This is comment 2"
>
>
> but now I am not sure how to combine everything together. I guess by now
> you must have realized how new I am to R :)
>
> Thanks!!
> Shreya
>
>
>
>
>
>
> On Tue, Jun 11, 2013 at 1:02 PM, arun <smartpink...@yahoo.com> wrote:
>
> Hi,
> >If the dataset is like this with the comments in the order:
> >
> >dat2<-read.table(text="
> >Row_ID_N,  Src_Row_ID,  DataN1
> >1a,              1,                  This is comment 1
> >2a,              1,                  This is comment 2
> >3a,              2,                  This is comment 1
> >4a,              1,                  This is comment 3
> >",sep=",",header=TRUE,stringsAsFactors=FALSE)
> >
> >dat3<-read.table(text="
> >Row_ID_N,  Src_Row_ID,  DataN1
> >1a,              1,                  This is comment 1
> >2a,              1,                  This is comment 2
> >3a,              2,                  This is comment 1   #
> >
> >4a,              1,                  This is comment 3
> >5a,         2,                  This is comment 2  #
> >
> >",sep=",",header=TRUE,stringsAsFactors=FALSE)
> >
> >
> >library(stringr)
> >library(plyr)
> >fun1<- function(data1,data2){
> >    data2$DataN1<- str_trim(data2$DataN1)
> >        res<- merge(data1,data2,by.x=1,by.y=2)
> >    res1<- res[,-5]
> >    res2<-
> ddply(res1,.(Row_ID_CR,Data1,Data2,Data3),summarize,DataN1=list(DataN1))
> >    Mx1<- max(sapply(res2[,5],length))
> >    res3<- data.frame(res2[,-5],do.call(rbind,lapply(res2[,5],function(x){
> >                                  c(x,rep(NA,Mx1-length(x)))
> >
> >                                  })),stringsAsFactors=FALSE)
> >    colnames(res3)[grep("X",colnames(res3))]<-
> paste0("DataComment",gsub("[[:alpha:]]","",colnames(res3)[grep("X",colnames(res3))]))
> >    res3
> >    }
> >
> >
> >fun1(dat1,dat2)
> >#  Row_ID_CR                Data1        Data2        Data3
> DataComment1
> >#1         1                   aa           bb           cc This is
> comment 1
> >
> >#2         2                   dd           ee           ff This is
> comment 1
> >#       DataComment2      DataComment3
> >#1 This is comment 2 This is comment 3
> >#2              <NA>              <NA>
> >
> > fun1(dat1,dat3)
> >#  Row_ID_CR                Data1        Data2        Data3
> DataComment1
> >#1         1                   aa           bb           cc This is
> comment 1
> >
> >#2         2                   dd           ee           ff This is
> comment 1
> > #      DataComment2      DataComment3
> >#1 This is comment 2 This is comment 3
> >
> >#2 This is comment 2              <NA>
> >
> >
> >Otherwise, you need to provide an example that matches the real dataset.
> >A.K.
> >
> >________________________________
> >From: Shreya Rawal <rawal.shr...@gmail.com>
> >To: arun <smartpink...@yahoo.com>
> >Cc: R help <r-help@r-project.org>
> >Sent: Tuesday, June 11, 2013 12:22 PM
> >
> >Subject: Re: [R] Combining CSV data
> >
> >
> >
> >Hi Arun,
> >
> >Thanks for your reply. Unfortunately the Comments are just text in the
> real data. There is no way to differentiate based on the value of the
> Comments column. I guess because of that reason I couldn't get your
> solution to work properly. Do you think I can try it for a more general
> case where we don't merger/split the comments based on the values?
> >
> >Thanks for your help, I appreciate!
> >
> >
> >
> >On Mon, Jun 10, 2013 at 10:14 PM, arun <smartpink...@yahoo.com> wrote:
> >
> >HI,
> >>I am not sure about your DataN1 column.  If there is any identifier to
> differentiate the comments (in this case 1,2,3), then it will easier to
> place that in the correct column.
> >>  My previous solution is not helpful in situations like these:
> >>
> >>dat2<-read.table(text="
> >>Row_ID_N,  Src_Row_ID,  DataN1
> >>1a,              1,                  This is comment 1
> >>2a,              1,                  This is comment 2
> >>3a,              2,                  This is comment 2
> >>4a,              1,                  This is comment 3
> >>",sep=",",header=TRUE,stringsAsFactors=FALSE)
> >>dat3<-read.table(text="
> >>
> >>Row_ID_N,  Src_Row_ID,  DataN1
> >>1a,              1,                  This is comment 1
> >>2a,              1,                  This is comment 2
> >>3a,              2,                  This is comment 3
> >>4a,              1,                  This is comment 3
> >>5a,         2,                  This is comment 2
> >>",sep=",",header=TRUE,stringsAsFactors=FALSE)
> >>
> >>
> >>library(stringr)
> >>library(plyr)
> >>fun1<- function(data1,data2){
> >>    data2$DataN1<- str_trim(data2$DataN1)
> >>        res<- merge(data1,data2,by.x=1,by.y=2)
> >>    res1<- res[,-5]
> >>    res2<-
> ddply(res1,.(Row_ID_CR,Data1,Data2,Data3),summarize,DataN1=list(DataN1))
> >>    Mx1<- max(sapply(res2[,5],length))
> >>    res3<-
> data.frame(res2[,-5],do.call(rbind,lapply(res2[,5],function(x){
> >>                                  indx<-
> as.numeric(gsub("[[:alpha:]]","",x))
> >>                                  x[match(seq(Mx1),indx)]
> >>                                  })),stringsAsFactors=FALSE)
> >>
> >>    colnames(res3)[grep("X",colnames(res3))]<-
> paste0("DataComment",gsub("[[:alpha:]]","",colnames(res3)[grep("X",colnames(res3))]))
> >>    res3
> >>    }
> >>fun1(dat1,dat2)
> >>
> >>#  Row_ID_CR                Data1        Data2        Data3
> DataComment1
> >>#1         1                   aa           bb           cc This is
> comment 1
> >>#2         2                   dd           ee           ff
> <NA>
> >>
> >>#       DataComment2      DataComment3
> >>#1 This is comment 2 This is comment 3
> >>#2 This is comment 2              <NA>
> >> fun1(dat1,dat3)
> >>
> >>#  Row_ID_CR                Data1        Data2        Data3
> DataComment1
> >>#1         1                   aa           bb           cc This is
> comment 1
> >>#2         2                   dd           ee           ff
> <NA>
> >>
> >>#       DataComment2      DataComment3
> >>#1 This is comment 2 This is comment 3
> >>#2 This is comment 2 This is comment 3
> >>
> >>
> >>
> >>A.K.
> >>
> >>
> >>----- Original Message -----
> >>
> >>From: arun <smartpink...@yahoo.com>
> >>To: Shreya Rawal <rawal.shr...@gmail.com>
> >>Cc: R help <r-help@r-project.org>
> >>Sent: Monday, June 10, 2013 6:41 PM
> >>Subject: Re: [R] Combining CSV data
> >>
> >>Hi,
> >>Try this:
> >>
> >>dat1<-read.table(text="
> >>Row_ID_CR,  Data1,    Data2,    Data3
> >>1,                  aa,          bb,          cc
> >>2,                  dd,          ee,          ff
> >>",sep=",",header=TRUE,stringsAsFactors=FALSE)
> >>
> >>dat2<-read.table(text="
> >>Row_ID_N,  Src_Row_ID,  DataN1
> >>1a,              1,                  This is comment 1
> >>2a,              1,                  This is comment 2
> >>3a,              2,                  This is comment 1
> >>4a,              1,                  This is comment 3
> >>",sep=",",header=TRUE,stringsAsFactors=FALSE)
> >>library(stringr)
> >>dat2$DataN1<-str_trim(dat2$DataN1)
> >>res<- merge(dat1,dat2,by.x=1,by.y=2)
> >> res1<-res[,-5]
> >>library(plyr)
> >> res2<-ddply(res1,.(Row_ID_CR,Data1,Data2,Data3),summarize,
> DataN1=list(DataN1))
> >> res2
> >> # Row_ID_CR                Data1        Data2        Data3
> >>#1         1                   aa           bb           cc
> >>#2         2                   dd           ee           ff
> >>#                                                   DataN1
> >>#1 This is comment 1, This is comment 2, This is comment 3
> >>#2                                       This is comment 1
> >>
> >>
> >>
> >>res3<-data.frame(res2[,-5],t(apply(do.call(rbind,res2[,5]),1,function(x)
> {x[duplicated(x)]<-NA;x})))
> >> colnames(res3)[grep("X",colnames(res3))]<-
> paste0("DataComment",gsub("[[:alpha:]]","",colnames(res3)[grep("X",colnames(res3))]))
> >>res3
> >>#  Row_ID_CR                Data1        Data2        Data3
> DataComment1
> >>#1         1                   aa           bb           cc This is
> comment 1
> >>#2         2                   dd           ee           ff This is
> comment 1
> >>#       DataComment2      DataComment3
> >>#1 This is comment 2 This is comment 3
> >>#2              <NA>              <NA>
> >>
> >>A.K.
> >>
> >>
> >>----- Original Message -----
> >>From: Shreya Rawal <rawal.shr...@gmail.com>
> >>To: r-help@r-project.org
> >>Cc:
> >>Sent: Monday, June 10, 2013 4:38 PM
> >>Subject: [R] Combining CSV data
> >>
> >>Hello R community,
> >>
> >>I am trying to combine two CSV files that look like this:
> >>
> >>File A
> >>
> >>Row_ID_CR,   Data1,    Data2,    Data3
> >>1,                   aa,          bb,          cc
> >>2,                   dd,          ee,          ff
> >>
> >>
> >>File B
> >>
> >>Row_ID_N,   Src_Row_ID,   DataN1
> >>1a,               1,                   This is comment 1
> >>2a,               1,                   This is comment 2
> >>3a,               2,                   This is comment 1
> >>4a,               1,                   This is comment 3
> >>
> >>And the output I am looking for is, comparing the values of Row_ID_CR and
> >>Src_Row_ID
> >>
> >>Output
> >>
> >>ROW_ID_CR,    Data1,    Data2,    Data3,    DataComment1,
> >>DataComment2,          DataComment3
> >>1,                      aa,         bb,         cc,        This is
> >>comment1,    This is comment2,     This is comment 3
> >>2,                      dd,          ee,         ff,          This is
> >>comment1
> >>
> >>
> >>I am a novice R user, I am able to replicate a left join but I need a bit
> >>more in the final result.
> >>
> >>
> >>Thanks!!
> >>
> >>    [[alternative HTML version deleted]]
> >>
> >>______________________________________________
> >>R-help@r-project.org mailing list
> >>https://stat.ethz.ch/mailman/listinfo/r-help
> >>PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> >>and provide commented, minimal, self-contained, reproducible code.
> >>
> >>
> >
>

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to