Follow-up: would read.txt also work? I am certain that I have both datasets saved as .txt files. As for a previous user's question concerning the .csv extension of the supposed Excel file, I am uncertain how it was converted to that format. The file is most certainly in Excel.
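
For reference, here is a minimal sketch of the tab-separated approach discussed in this thread. The file contents and column names below are made up for illustration and are not the actual ICGC data. As far as I know, base R has no function called read.txt; read.delim and read.table are the usual choices, and the file extension (.txt or .tsv) does not matter as long as the fields are separated by tabs.

```r
# Write a tiny tab-separated file to a temporary location.
# The data here are hypothetical, only meant to mimic a TSV download.
tsv_path <- tempfile(fileext = ".txt")
writeLines(c("gene\tsample\thigh_expression",
             "EGFR\tS1\tTRUE",
             "PTEN\tS2\tFALSE"),
           tsv_path)

# read.delim() assumes header = TRUE and sep = "\t" by default.
via_delim <- read.delim(tsv_path)

# read.table() needs the same settings spelled out to behave the same way.
via_table <- read.table(tsv_path, header = TRUE, sep = "\t",
                        quote = "\"", fill = TRUE, comment.char = "")

# Inspect the column types and confirm both calls give the same data frame.
str(via_delim)
identical(via_delim, via_table)
```

To try this on a real download, replace tsv_path with the path to the file, or use file.choose() to pick it interactively.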

On Thu, Dec 27, 2018 at 12:10 AM Spencer Brackett <spbracket...@saintjosephhs.com> wrote:

> Caitlin,
>
> I tried your command in both RGui and RStudio but both came up as
> errors. I believe I made a mistake somewhere in labeling/downloading the
> files, which is the source of the confusion in R. I will re-examine the
> files saved on my desktop to determine the error. Regardless, would it be
> better to use a read.table or read.csv function when attempting to download
> my datasets? I tried using read.xl on RStudio as this process seemed much
> easier, however, it would seem that my proclivity to error prevents such.
>
> Best,
>
> Spencer
>
> On Wed, Dec 26, 2018 at 11:55 PM Caitlin Gibbons <bioprogram...@gmail.com>
> wrote:
>
>> Does this help Spencer? The read.delim() function assumes a tab character
>> by default, but I specifically included it using the read.csv function. The
>> downloaded file is NOT an Excel file so this should help.
>>
>> GBM_protein_expression <- read.csv("C:/Users/Spencer/Desktop/GBM protein_expression.tsv", sep="\t")
>>
>> Sent from my iPhone
>>
>> > On Dec 26, 2018, at 9:23 PM, Richard M. Heiberger <r...@temple.edu> wrote:
>> >
>> > this is wrong because the file is a csv file. read_excel is designed
>> > for xls files.
>> > GBM_protein_expression <- read_excel("C:/Users/Spencer/Desktop/GBM protein_expression.csv")
>> >
>> > How did you get a csv? it downloads as tsv.
>> >
>> > the statement you should use is in base, no library() statement is needed.
>> >
>> > GBM_protein_expression <- read.delim("C:/Users/Spencer/Desktop/GBM protein_expression.csv")
>> >
>> > read.delim is the same as read.csv except that it sets the sep
>> > argument to "\t".
>> >
>> > On Wed, Dec 26, 2018 at 11:11 PM Spencer Brackett
>> > <spbracket...@saintjosephhs.com> wrote:
>> >>
>> >> Sorry, my mistake.
>> >>
>> >> So I could still use read.table and should I try using a .txt version of
>> >> the file to avoid the silent changes you described?
>> >>
>> >> Also, when I tried to simplify this process by downloading the dataset onto
>> >> RStudio as opposed to RGui I received the following...
>> >>
>> >> library(readxl)
>> >> > GBM_protein_expression <- read_excel("C:/Users/Spencer/Desktop/GBM protein_expression.csv")
>> >> Error: Can't establish that the input is either xls or xlsx.
>> >> > View(GBM_protein_expression)
>> >> Error in View : object 'GBM_protein_expression' not found
>> >> Error in gzfile(file, mode) : cannot open the connection
>> >> In addition: Warning message:
>> >> In gzfile(file, mode) :
>> >> cannot open compressed file
>> >> 'C:/Users/Spencer/AppData/Local/Temp/RtmpQNQrMh/input147c61fc5b52.rds',
>> >> probable reason 'No such file or directory'
>> >> > library(readxl)
>> >> > GBM_protein_expression <-
>> >> read_excel("C:/Users/Spencer/Desktop/GBM_protein_ expression.xlsx")
>> >> readxl works best with a newer version of the tibble package.
>> >> You currently have tibble v1.4.2.
>> >> Falling back to column name repair from tibble <= v1.4.2.
>> >> Message displays once per session.
>> >> > View(GBM_protein_expression)
>> >>
>> >> Is this perhaps the result of lack of preview (which I did not complete at
>> >> the time I hit import as the preview failed to load), or the fact that the
>> >> excel file itself contains no numerical data, but only TRUE or FALSE
>> >> entries?
>> >>
>> >> On Wed, Dec 26, 2018 at 10:59 PM Jeff Newmiller <jdnew...@dcn.davis.ca.us>
>> >> wrote:
>> >>
>> >>> Please always reply-all to keep the list involved.
>> >>>
>> >>> If you used Save As to change the data format to Excel AND the file
>> >>> extension to xlsx, then yes, you should be able to read with readxl. I
>> >>> don't recommend it, though... Excel often changes data silently and in
>> >>> irregularly located places in your file.
>> >>>
>> >>> On December 26, 2018 7:38:16 PM PST, Spencer Brackett <spbracket...@saintjosephhs.com> wrote:
>> >>>> So even if I imported the file from ICGC to my desktop as an excel file,
>> >>>> and can view and save the data as such, it is still a TSV?
>> >>>>
>> >>>> On Wed, Dec 26, 2018 at 10:35 PM Jeff Newmiller
>> >>>> <jdnew...@dcn.davis.ca.us> wrote:
>> >>>>
>> >>>>> CSV and TSV are not Excel files. Yes, I know Excel will open them, but
>> >>>>> that does not make them Excel files.
>> >>>>>
>> >>>>> Read a TSV file with read.table or read.csv, setting the sep argument to
>> >>>>> "\t".
>> >>>>>
>> >>>>> On December 26, 2018 7:26:35 PM PST, Spencer Brackett <spbracket...@saintjosephhs.com> wrote:
>> >>>>>> I tried importing the file without preview and received the
>> >>>>>> following....
>> >>>>>>
>> >>>>>> library(readxl)
>> >>>>>> > GBM_protein_expression <- read_excel("C:/Users/Spencer/Desktop/GBM protein_expression.csv")
>> >>>>>> Error: Can't establish that the input is either xls or xlsx.
>> >>>>>> > View(GBM_protein_expression)
>> >>>>>> Error in View : object 'GBM_protein_expression' not found
>> >>>>>> Error in gzfile(file, mode) : cannot open the connection
>> >>>>>> In addition: Warning message:
>> >>>>>> In gzfile(file, mode) :
>> >>>>>> cannot open compressed file
>> >>>>>> 'C:/Users/Spencer/AppData/Local/Temp/RtmpQNQrMh/input147c61fc5b52.rds',
>> >>>>>> probable reason 'No such file or directory'
>> >>>>>> > library(readxl)
>> >>>>>> > GBM_protein_expression <-
>> >>>>>> read_excel("C:/Users/Spencer/Desktop/GBM_protein_ expression.xlsx")
>> >>>>>> readxl works best with a newer version of the tibble package.
>> >>>>>> You currently have tibble v1.4.2.
>> >>>>>> Falling back to column name repair from tibble <= v1.4.2.
>> >>>>>> Message displays once per session.
>> >>>>>> > View(GBM_protein_expression)
>> >>>>>>
>> >>>>>> Also, the area above my console says that no data is available in the
>> >>>>>> table. Is this perhaps the result of lack of preview or the fact that the
>> >>>>>> excel file itself contains no numerical data, but only TRUE or FALSE
>> >>>>>> entries?
>> >>>>>>
>> >>>>>> On Wed, Dec 26, 2018 at 9:57 PM Spencer Brackett <spbracket...@saintjosephhs.com> wrote:
>> >>>>>>
>> >>>>>>> Hello again,
>> >>>>>>>
>> >>>>>>> I worked on directly downloading the file into R as was suggested, but
>> >>>>>>> have thus far been unsuccessful. This is what I generated on my second
>> >>>>>>> attempt...
>> >>>>>>>
>> >>>>>>> GBM protein_expression<-(file.choose(), header=TRUE, sep="\t")
>> >>>>>>> Error: unexpected symbol in "GBM protein_expression"
>> >>>>>>> > GBM protein_expression<-(file.choose(GBM_protein_expression.xlsx),header=TRUE,
>> >>>>>>> sep="\t")
>> >>>>>>> Error: unexpected symbol in "GBM protein_expression"
>> >>>>>>> >
>> >>>>>>>
>> >>>>>>> What part of the argument is in error?
>> >>>>>>>
>> >>>>>>> Also I tried importing the dataset as an excel file on RStudio to see if I
>> >>>>>>> could solve my problem that way. However, my imported excel file has been
>> >>>>>>> stuck in the 'retrieving preview data' stage and no data is appearing. Is the
>> >>>>>>> data file perhaps too large or in the wrong format?
>> >>>>>>>
>> >>>>>>> On Wed, Dec 26, 2018 at 6:42 PM Spencer Brackett <spbracket...@saintjosephhs.com> wrote:
>> >>>>>>>
>> >>>>>>>> Mr. Heiberger,
>> >>>>>>>>
>> >>>>>>>> Thank you for the insight! I will try out your suggestion.
>> >>>>>>>>
>> >>>>>>>> Best,
>> >>>>>>>>
>> >>>>>>>> Spencer Brackett
>> >>>>>>>>
>> >>>>>>>> On Wed, Dec 26, 2018 at 6:34 PM Richard M.
>> >>>>>>>> Heiberger <r...@temple.edu> wrote:
>> >>>>>>>>
>> >>>>>>>>> I looked at the first file. It gives an option to download as TSV
>> >>>>>>>>> (tab separated values).
>> >>>>>>>>> That is the same as CSV except with tabs instead of commas.
>> >>>>>>>>> You do not need any external software to read it. Read the downloaded
>> >>>>>>>>> file directly into R.
>> >>>>>>>>>
>> >>>>>>>>> read.delim looks as if it would work directly on the downloaded file.
>> >>>>>>>>> ?read.delim
>> >>>>>>>>> The notation "\t" means the tab character.
>> >>>>>>>>>
>> >>>>>>>>> As an aside, stay away from Notepad. It is too naive for almost
>> >>>>>>>>> anything interesting.
>> >>>>>>>>> The specific case I often see is people reading Linux-style text files
>> >>>>>>>>> with Notepad, which doesn't understand NL terminated lines. Nicely
>> >>>>>>>>> formatted text files become illegible.
>> >>>>>>>>>
>> >>>>>>>>> On Wed, Dec 26, 2018 at 6:04 PM Spencer Brackett
>> >>>>>>>>> <spbracket...@saintjosephhs.com> wrote:
>> >>>>>>>>>>
>> >>>>>>>>>> Good evening,
>> >>>>>>>>>>
>> >>>>>>>>>> I am attempting to analyze the protein expression data contained within
>> >>>>>>>>>> these two ICGC, TCGA datasets (one for GBM and the other for LGG)
>> >>>>>>>>>>
>> >>>>>>>>>> *File for GBM protein expression*:
>> >>>>>>>>>>
>> >>>>>>>>>> https://dcc.icgc.org/search?filters=%7B%22donor%22:%7B%22projectId%22:%7B%22is%22:%5B%22GBM-US%22%5D%7D,%22availableDataTypes%22:%7B%22is%22:%5B%22pexp%22%5D%7D%7D%7D
>> >>>>>>>>>>
>> >>>>>>>>>> *File for LGG protein expression:*
>> >>>>>>>>>>
>> >>>>>>>>>> https://dcc.icgc.org/search?filters=%7B%22donor%22:%7B%22projectId%22:%7B%22is%22:%5B%22LGG-US%22%5D%7D,%22availableDataTypes%22:%7B%22is%22:%5B%22pexp%22%5D%7D%7D%7D
>> >>>>>>>>>>
>> >>>>>>>>>> When I tried to transfer the files from .txt (via Notepad) to .csv (via
>> >>>>>>>>>> Excel), the data appeared in the columns as unorganized and random
>> >>>>>>>>>> script... not like how a typical csv should be arranged at all. I need the
>> >>>>>>>>>> dataset to be converted into .csv in order to analyze it in R, which is why
>> >>>>>>>>>> I am hoping someone here might help me in doing that. If not, is there
>> >>>>>>>>>> perhaps some other way that I could analyze the datasets in R, which again
>> >>>>>>>>>> were downloaded from the ICGC data portal?
>> >>>>>>>>>>
>> >>>>>>>>>> Best,
>> >>>>>>>>>>
>> >>>>>>>>>> Spencer Brackett
>> >>>>>>>>>>
>> >>>>>>>>>> [[alternative HTML version deleted]]
>> >>>>>>>>>>
>> >>>>>>>>>> ______________________________________________
>> >>>>>>>>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> >>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>> >>>>>>>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> >>>>>>>>>> and provide commented, minimal, self-contained, reproducible code.
>> >>>>>
>> >>>>> --
>> >>>>> Sent from my phone.
>> >>>>> Please excuse my brevity.
>> >>>
>> >>> --
>> >>> Sent from my phone. Please excuse my brevity.