Your problem is that the command you entered

    > the_data<-read.csv(file="c:/file_name.csv,header=TRUE,sep=",")

is missing a double quote after the .csv. The statement should be

    > the_data<-read.csv(file="c:/file_name.csv",header=TRUE,sep=",")

The '+' sign is a prompt from R that indicates it has not yet seen the end of a statement, and it is expecting you to continue from the previous line. Without the closing quote after .csv, R reads everything up to the next double quote as one string, so the last double quote on the line opens a new string that is never closed.

The explanation: you are supplying the read.csv() function three arguments, one each for the parameters 'file', 'header' and 'sep'. The parameters 'file' and 'sep' expect strings as arguments. For 'file' that is a path such as "c:/file_name.csv" or "c:/myspecialdata.csv". For 'sep' (separator), the string "," indicates that the separator is a comma.

Note that you could also have written

    > the_data<-read.csv(file="c:/file_name.csv")

as the default value for the parameter 'header' is TRUE, and for the parameter 'sep' is a comma. You can confirm this by looking at the help via

    > ?read.csv

HTH,
Eric

On Mon, Aug 27, 2018 at 6:49 AM, Spencer Brackett <spbracket...@saintjosephhs.com> wrote:

> Hello all,
>
> To begin my analysis, I downloaded two TCGA datasets (GBM and LGG), both
> csv files, onto an R script after loading the cBioLite package. Following
> this, I entered the following command...
>
> > the_data<-read.csv(file="c:/file_name.csv,header=TRUE,sep=",")
>
> Upon running the line I received this...
>
> +
>
> If I continue to press enter, the + sign continues to appear on every
> subsequent/new line.
>
> Does anyone know what this is indicative of and how I may continue on with
> my analysis?
>
> My next step after this would have been the following (the numbers before
> each command being line markers; not part of line)..
>
> 1 library(TCGAbiolinks)
> 2
> 3 # Download the DNA methylation data: HumanMethylation450 LGG and GBM.
> 4 path <– "."
>
> Best wishes,
>
> Spencer Brackett
>
> On Sun, Aug 26, 2018 at 9:13 PM Caitlin <bioprogram...@gmail.com> wrote:
>
> > You're welcome Spencer :)
> >
> > I hope I was able to help you. If this problem persists, or a new one
> > appears, feel free to post or email.
> > You might also like:
> >
> > https://www.biostars.org/
> >
> > It is quite similar to StackOverflow but with a biological sciences focus.
> >
> > Hope this helps!
> >
> > ~Caitlin
> >
> > On Sun, Aug 26, 2018 at 6:02 PM Spencer Brackett <spbracket...@saintjosephhs.com> wrote:
> >
> >> Caitlin,
> >>
> >> Thanks again! I already have the two files stored in those two CSV files
> >> via my desktop, but if running those with this function does not work, then I
> >> will try it with a flash drive.
> >>
> >> Best,
> >>
> >> Spencer Brackett
> >>
> >> On Sun, Aug 26, 2018 at 8:56 PM Caitlin <bioprogram...@gmail.com> wrote:
> >>
> >>> Hmm...could you store each in its own file (a flash drive would be fine)
> >>> then use:
> >>>
> >>> the_data <- read.csv(file="c:/file_name.csv", header=TRUE, sep=",")
> >>>
> >>> to read each into your script? The data would then exist as a
> >>> dataframe object that you could then work with.
> >>>
> >>> On Sun, Aug 26, 2018 at 5:50 PM Spencer Brackett <spbracket...@saintjosephhs.com> wrote:
> >>>
> >>>> Caitlin,
> >>>>
> >>>> Perhaps that is the problem. To be more specific, the data was
> >>>> transferred from the TCGA database to a CSV file... there are technically
> >>>> two separate files (CSV) for this analysis.... one for GBM and one for LGG.
> >>>> Both CSV files were then individually downloaded onto my open R console.
> >>>> Upon arranging them with the summary() function, the data expanded and
> >>>> took up the whole console page... even seemingly abrogating the arguments
> >>>> which allowed for the data to be downloaded onto R in the first place. Are
> >>>> you suggesting that I would need to utilize a flash drive to successfully
> >>>> utilize the function you suggested? Or could I perhaps do so with the CSV
> >>>> files I mentioned? If so, how?
> >>>>
> >>>> -Spencer B
> >>>>
> >>>> On Sun, Aug 26, 2018 at 8:42 PM Caitlin <bioprogram...@gmail.com> wrote:
> >>>>
> >>>>> No worries Spencer. There is no downloaded data? Nothing is physically
> >>>>> stored on your hard drive? The dot in the path would be interpreted (no pun
> >>>>> intended!) as something like the following:
> >>>>>
> >>>>> If the TCGA data was stored in a file named "tcga_data.dat" and it was
> >>>>> in a directory named "C:\spencer", the 4th line of that script would set
> >>>>> the path to "C:\spencer", the folder that contains "tcga_data.dat", if you
> >>>>> ran the script from that same folder. If your TCGA data is not stored in
> >>>>> the same folder from which the script is being run, it won't find any data
> >>>>> to work with. Does this help?
> >>>>>
> >>>>> On Sun, Aug 26, 2018 at 5:34 PM Spencer Brackett <spbracket...@saintjosephhs.com> wrote:
> >>>>>
> >>>>>> Caitlin,
> >>>>>>
> >>>>>> Forgive me, but I’m not quite sure exactly what your question is
> >>>>>> asking. The data is originally from the TCGA and I have it downloaded onto
> >>>>>> another R script. I opened a new script to perform the functions I posted
> >>>>>> to this forum because I was unable to input any other commands into the
> >>>>>> console.... due to the fact that the translated data filled the entirety of
> >>>>>> said console. Perhaps overloaded it? Regardless, I was unable to input any
> >>>>>> further commands.
> >>>>>>
> >>>>>> -Spencer Brackett
> >>>>>>
> >>>>>> On Sun, Aug 26, 2018 at 8:27 PM Caitlin <bioprogram...@gmail.com> wrote:
> >>>>>>
> >>>>>>> You're welcome Spencer :)
> >>>>>>>
> >>>>>>> The 4th line:
> >>>>>>>
> >>>>>>> path <– "."
> >>>>>>>
> >>>>>>> refers to the current directory (the dot in other words). Is the
> >>>>>>> data stored in the same directory where the code is being run?
> >>>>>>>
> >>>>>>> On Sun, Aug 26, 2018 at 5:22 PM Spencer Brackett <spbracket...@saintjosephhs.com> wrote:
> >>>>>>>
> >>>>>>>> Thank you! I will make note of that. Unfortunately, lines 1 and 4
> >>>>>>>> of the first portion of this analysis appear to be where the error
> >>>>>>>> begins... to which several subsequent lines also come up as ‘errored’.
> >>>>>>>> Perhaps this is an issue of the capitalization and/or spacing (something
> >>>>>>>> within the text)? The proposed method for methylation data extraction is
> >>>>>>>> based on the first third of the following TCGA workflow:
> >>>>>>>> https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5302158/#!po=0.0715308
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>>
> >>>>>>>> Spencer Brackett
> >>>>>>>>
> >>>>>>>> On Sun, Aug 26, 2018 at 8:07 PM Caitlin <bioprogram...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> Hi Spencer.
> >>>>>>>>>
> >>>>>>>>> Should you capitalize the following library import?
> >>>>>>>>>
> >>>>>>>>> library(summarizedExperiment)
> >>>>>>>>>
> >>>>>>>>> In other words, I think that line should be:
> >>>>>>>>>
> >>>>>>>>> library(SummarizedExperiment)
> >>>>>>>>>
> >>>>>>>>> Hope this helps.
> >>>>>>>>>
> >>>>>>>>> ~Caitlin
> >>>>>>>>>
> >>>>>>>>> On Sun, Aug 26, 2018 at 2:09 PM Spencer Brackett <spbracket...@saintjosephhs.com> wrote:
> >>>>>>>>>
> >>>>>>>>>> Good evening,
> >>>>>>>>>>
> >>>>>>>>>> I am attempting to run the following analysis on TCGA data, however
> >>>>>>>>>> something is being reported as an error in my arguments... any ideas as to
> >>>>>>>>>> what is incorrect in the following? Thanks!
> >>>>>>>>>>
> >>>>>>>>>> 1 library(TCGAbiolinks)
> >>>>>>>>>> 2
> >>>>>>>>>> 3 # Download the DNA methylation data: HumanMethylation450 LGG and GBM.
> >>>>>>>>>> 4 path <– "."
> >>>>>>>>>> 5
> >>>>>>>>>> 6 query.met <– TCGAquery(tumor = c("LGG","GBM"),"HumanMethylation450", level = 3)
> >>>>>>>>>> 7 TCGAdownload(query.met, path = path )
> >>>>>>>>>> 8 met <– TCGAprepare(query = query.met,dir = path,
> >>>>>>>>>> 9     add.subtype = TRUE, add.clinical = TRUE,
> >>>>>>>>>> 10     summarizedExperiment = TRUE,
> >>>>>>>>>> 11     save = TRUE, filename = "lgg_gbm_met.rda")
> >>>>>>>>>> 12
> >>>>>>>>>> 13 # Download the expression data: IlluminaHiSeq_RNASeqV2 LGG and GBM.
> >>>>>>>>>> 14 query.exp <– TCGAquery(tumor = c("lgg","gbm"), platform = "IlluminaHiSeq_RNASeqV2",level = 3)
> >>>>>>>>>> 15
> >>>>>>>>>> 16 TCGAdownload(query.exp,path = path, type = "rsem.genes.normalized_results")
> >>>>>>>>>> 17
> >>>>>>>>>> 18 exp <– TCGAprepare(query = query.exp, dir = path,
> >>>>>>>>>> 19     summarizedExperiment = TRUE,
> >>>>>>>>>> 20     add.subtype = TRUE, add.clinical = TRUE,
> >>>>>>>>>> 21     type = "rsem.genes.normalized_results",
> >>>>>>>>>> 22     save = T,filename = "lgg_gbm_exp.rda")
> >>>>>>>>>>
> >>>>>>>>>> To download data on DNA methylation and gene expression…
> >>>>>>>>>>
> >>>>>>>>>> 1 library(summarizedExperiment)
> >>>>>>>>>> 2 # get expression matrix
> >>>>>>>>>> 3 data <– assay(exp)
> >>>>>>>>>> 4
> >>>>>>>>>> 5 # get sample information
> >>>>>>>>>> 6 sample.info <– colData(exp)
> >>>>>>>>>> 7
> >>>>>>>>>> 8 # get genes information
> >>>>>>>>>> 9 genes.info <– rowRanges(exp)
> >>>>>>>>>>
> >>>>>>>>>> Following stepwise procedure for obtaining GBM and LGG clinical data…
> >>>>>>>>>>
> >>>>>>>>>> 1 # get clinical patient data for GBM samples
> >>>>>>>>>> 2 gbm_clin <– TCGAquery_clinic("gbm","clinical_patient")
> >>>>>>>>>> 3
> >>>>>>>>>> 4 # get clinical patient data for LGG samples
> >>>>>>>>>> 5 lgg_clin <– TCGAquery_clinic("lgg","clinical_patient")
> >>>>>>>>>> 6
> >>>>>>>>>> 7 # Bind the results, as the columns might not be the same,
> >>>>>>>>>> 8 # we will plyr rbind.fill , to have all columns from both files
> >>>>>>>>>> 9 clinical <– plyr::rbind.fill(gbm_clin ,lgg_clin)
> >>>>>>>>>> 10
> >>>>>>>>>> 11 # Other clinical files can be downloaded,
> >>>>>>>>>> 12 # Use ?TCGAquery_clinic for more information
> >>>>>>>>>> 13 clin_radiation <– TCGAquery_clinic("lgg","clinical_radiation")
> >>>>>>>>>> 14
> >>>>>>>>>> 15 # Also, you can get clinical information from different tumor types.
> >>>>>>>>>> 16 # For example sample 1 is GBM, sample 2 and 3 are TGCT
> >>>>>>>>>> 17 data <– TCGAquery_clinic(clinical_data_type = "clinical_patient",
> >>>>>>>>>> 18     samples = c("TCGA-06-5416-01A-01D-1481-05",
> >>>>>>>>>> 19     "TCGA-2G-AAEW-01A-11D-A42Z-05",
> >>>>>>>>>> 20     "TCGA-2G-AAEX-01A-11D-A42Z-05"))
> >>>>>>>>>>
> >>>>>>>>>> # Searching idat file for DNA methylation
> >>>>>>>>>> query <- GDCquery(project = "TCGA-GBM",
> >>>>>>>>>>     data.category = "Raw microarray data",
> >>>>>>>>>>     data.type = "Raw intensities",
> >>>>>>>>>>     experimental.strategy = "Methylation array",
> >>>>>>>>>>     legacy = TRUE,
> >>>>>>>>>>     file.type = ".idat",
> >>>>>>>>>>     platform = "Illumina Human Methylation 450")
> >>>>>>>>>>
> >>>>>>>>>> **Repeat for LGG**
> >>>>>>>>>>
> >>>>>>>>>> To access mutational information concerning TMZ methylation…
> >>>>>>>>>>
> >>>>>>>>>> > mutation <– TCGAquery_maf(tumor = "lgg")
> >>>>>>>>>> 2 Getting maf tables
> >>>>>>>>>> 3 Source: https://wiki.nci.nih.gov/display/TCGA/TCGA+MAF+Files
> >>>>>>>>>> 4 We found these maf files below:
> >>>>>>>>>> 5 MAF.File.Name
> >>>>>>>>>> 6 2 hgsc.bcm.edu_LGG.IlluminaGA_DNASeq.1.somatic.maf
> >>>>>>>>>> 7
> >>>>>>>>>> 8 3 LGG_FINAL_ANALYSIS.aggregated.capture.tcga.uuid.curated.somatic.maf
> >>>>>>>>>> 9
> >>>>>>>>>> 10 Archive.Name Deploy.Date
> >>>>>>>>>> 11 2 hgsc.bcm.edu_LGG.IlluminaGA_DNASeq_automated.Level_2.1.0.0 10-DEC-13
> >>>>>>>>>> 12 3 broad.mit.edu_LGG.IlluminaGA_DNASeq_curated.Level_2.1.3.0 24-DEC-14
> >>>>>>>>>> 13
> >>>>>>>>>> 14 Please, select the line that you want to download: 3
> >>>>>>>>>>
> >>>>>>>>>> **Repeat this for GBM***
> >>>>>>>>>>
> >>>>>>>>>> Selecting specified lines to download…
> >>>>>>>>>>
> >>>>>>>>>> 1 gbm.subtypes <− TCGAquery_subtype(tumor = "gbm")
> >>>>>>>>>> 2 lgg.subtypes <− TCGAquery_subtype(tumor = "lgg”)
> >>>>>>>>>>
> >>>>>>>>>> Downloading data via the Bioconductor package RTCGAtoolbox…
> >>>>>>>>>>
> >>>>>>>>>> library(RTCGAToolbox)
> >>>>>>>>>> 2
> >>>>>>>>>> 3 # Get the last run dates
> >>>>>>>>>> 4 lastRunDate <− getFirehoseRunningDates()[1]
> >>>>>>>>>> 5 lastAnalyseDate <− getFirehoseAnalyzeDates(1)
> >>>>>>>>>> 6
> >>>>>>>>>> 7 # get DNA methylation data, RNAseq2 and clinical data for LGG
> >>>>>>>>>> 8 lgg.data <− getFirehoseData(dataset = "LGG",
> >>>>>>>>>> 9     gistic2_Date = getFirehoseAnalyzeDates(1), runDate = lastRunDate,
> >>>>>>>>>> 10     Methylation = TRUE, RNAseq2_Gene_Norm = TRUE, Clinic = TRUE,
> >>>>>>>>>> 11     Mutation = T,
> >>>>>>>>>> 12     fileSizeLimit = 10000)
> >>>>>>>>>> 13
> >>>>>>>>>> 14 # get DNA methylation data, RNAseq2 and clinical data for GBM
> >>>>>>>>>> 15 gbm.data <− getFirehoseData(dataset = "GBM",
> >>>>>>>>>> 16     runDate = lastDate, gistic2_Date = getFirehoseAnalyzeDates(1),
> >>>>>>>>>> 17     Methylation = TRUE, Clinic = TRUE, RNAseq2_Gene_Norm = TRUE,
> >>>>>>>>>> 18     fileSizeLimit = 10000)
> >>>>>>>>>> 19
> >>>>>>>>>> 20 # To access the data you should use the getData function
> >>>>>>>>>> 21 # or simply access with @ (for example gbm.data@Clinical)
> >>>>>>>>>> 22 gbm.mut <− getData(gbm.data,"Mutations")
> >>>>>>>>>> 23 gbm.clin <− getData(gbm.data,"Clinical")
> >>>>>>>>>> 24 gbm.gistic <− getData(gbm.data,"GISTIC")
> >>>>>>>>>>
> >>>>>>>>>> Genomic Analysis/Final data extraction:
> >>>>>>>>>>
> >>>>>>>>>> Enable “getData” to access the data
> >>>>>>>>>>
> >>>>>>>>>> Obtaining GISTIC results…
> >>>>>>>>>>
> >>>>>>>>>> 1 # Download GISTIC results
> >>>>>>>>>> 2 gistic <− getFirehoseData("GBM",gistic2_Date ="20141017" )
> >>>>>>>>>> 3
> >>>>>>>>>> 4 # get GISTIC results
> >>>>>>>>>> 5 gistic.allbygene <− gistic@GISTIC@AllByGene
> >>>>>>>>>> 6 gistic.thresholedbygene <− gistic@GISTIC@ThresholedByGene
> >>>>>>>>>>
> >>>>>>>>>> Repeat this procedure to obtain LGG GISTIC results.
> >>>>>>>>>>
> >>>>>>>>>> ***Please ignore the 'non-coded' text as they are procedural
> >>>>>>>>>> steps/classifications***
> >>>>>>>>>>
> >>>>>>>>>> [[alternative HTML version deleted]]
> >>>>>>>>>>
> >>>>>>>>>> ______________________________________________
> >>>>>>>>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>>>>>>>>> PLEASE do read the posting guide
> >>>>>>>>>> http://www.R-project.org/posting-guide.html
> >>>>>>>>>> and provide commented, minimal, self-contained, reproducible code.
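P.S. To make the point about read.csv() defaults at the top of this thread concrete, here is a minimal sketch. It assumes a small made-up CSV written to a temporary file rather than the real c:/file_name.csv, and the names csv_path, explicit and defaults are purely illustrative. It reads the same file once with 'header' and 'sep' spelled out and once relying on the defaults, then compares the results.

```r
# Write a tiny, made-up CSV to a temporary file
# (a stand-in for c:/file_name.csv; the contents are hypothetical).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("sample,value", "GBM,1.5", "LGG,2.5"), csv_path)

# Explicit arguments, as in the corrected command.
# Note the straight double quotes closing each string.
explicit <- read.csv(file = csv_path, header = TRUE, sep = ",")

# Relying on the defaults documented in ?read.csv
# (header = TRUE, sep = ",").
defaults <- read.csv(file = csv_path)

# Compare the two data frames and show the structure of one of them.
print(identical(explicit, defaults))
str(explicit)
```

Both calls should give the same data frame, which is why the shorter form is enough for an ordinary comma-separated file with a header row.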