-----Original Message-----
From: Noah Silverman
Sent: Tuesday, February 28, 2012 3:27 PM
To: r-help
Subject: [R] Cleaning up messy Excel data

Unfortunately, some data I need to work with was delivered in a rather messy Excel file. I want to import it into R and clean up some things so that I can do my analysis. Pulling in a CSV from Excel is the easy part.

My current challenge is dealing with some text mixed in with the values.
e.g.   118   5.7   <2.0  3.7

Since this column in Excel has a "<2.0" value, R reads the column as a factor with levels. Ideally, I want to convert it to a normal numeric vector and code the "<2.0" as 0.

Can anyone suggest an easy way to do this?
--------------------------------------
?as.character
will show you how to change the "factor" column into a character column. Then, you can replace text using any of a number of procedures.
see for example
?gsub

Finally, you can use as.numeric if you want numbers. "Coding" is best done in the context of factors, so you might want to consider whether replacing <2 with NA is more appropriate than replacing it with 0. In the end, the choice might be context sensitive.
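
A minimal sketch of those three steps, assuming the column arrived as a factor. The vector `raw` and the other variable names are made up for illustration and stand in for the real column from the imported CSV:

```r
# Hypothetical stand-in for the messy column from the question.
raw <- factor(c("118", "5.7", "<2.0", "3.7"))

# Step 1: factor -> character. Calling as.numeric() directly on a
# factor would return the internal level codes, not the values.
txt <- as.character(raw)

# Step 2: remember which entries were reported as "<something",
# then strip the "<" with gsub() and convert to numbers.
censored <- grepl("^<", txt)
num <- as.numeric(gsub("<", "", txt, fixed = TRUE))

# Step 3: choose a coding for the below-limit entries.
as_zero <- ifelse(censored, 0, num)   # what the original post asked for
as_na   <- ifelse(censored, NA, num)  # the alternative suggested above
```

Keeping the `censored` flag around means the choice between 0 and NA can be revisited later without re-reading the file.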

Rob
------------------------------------------
Robert W. Baer, Ph.D.
Professor of Physiology
Kirksville College of Osteopathic Medicine
A. T. Still University of Health Sciences
800 W. Jefferson St.
Kirksville, MO 63501
660-626-2322
FAX 660-626-2965

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
