On Sep 9, 2011, at 09:13 , Petr PIKAL wrote:

> Hi
>
> Isn't this something merge is designed for?

Sort of. (You'd need to think carefully about what happens with non-matched codes.)

Wouldn't this do the trick as well?

# lookup vectors: department codes and the matching department names
from <- as.character(DeptCodes$Depts)
to <- as.character(DeptCodes$DeptNames)
Doctors <- within(Doctors, DeptNames <- factor(DocDepts, levels=from, labels=to))

>
>> merge(Doctors, DeptCodes, by.x="DocDepts", by.y="Depts")
>   DocDepts                    Docs  DeptNames
> 1     1111 Christian\nChristianson      Heart
> 2     5555               Bob Smith      Brain
> 3     9999              Greg Jones Anesthesia
> 4     9999             Al Franklin Anesthesia
>
> It is easy to get rid of the first column.
>
> Regards
> Petr
>
>
>> Re: [R] Factors? I think?
>>
>> It's probably easiest to think of this as a compound map (doctor -> dept
>> code -> factor -> character -> integer -> dept code -> dept name as
>> character) and to treat the code as such: if you already have R objects with
>> the codes in them, it shouldn't be hard to do the transformation.
>>
>> Consider the following toy set up
>>
>> Docs = factor(c("Greg Jones","Bob Smith","Al Franklin","Christian Christianson"))
>> DocDepts = factor(c("9999","5555","9999","1111"))
>> Doctors = data.frame(Docs, DocDepts)
>>
>> Depts = factor(1:9 * 1111)
>> DeptNames = factor(c("Heart","Kidney","Feet","Teeth","Brain","Digestive",
>>                      "Diagnostic","Surgery","Anesthesia"))
>> DeptCodes = data.frame(Depts,DeptNames)
>> # Everything in our data frames is now some sort of factor so we can't match
>> # things up in the "normal" ways
>>
>> # Now, you have to do some unpleasantly long but pretty straightforward code
>> # to convert the factors in a way that makes the match properly:
>>
>> Doctors$numbers <- as.numeric(as.character(Doctors[,2])) ## Will extract the
>> ## "9999" as a real 9999, rather than the internal factor code
>> DeptCodes$values <- as.numeric(as.character(DeptCodes[,1]))
>>
>> match(Doctors$numbers, DeptCodes$values) ## Will map the department numbers
>> ## onto the correct rows of the DeptCodes df
>>
>> # Now we get the correct names using those row numbers
>> DeptAssignments = as.character(DeptCodes[match(Doctors$numbers, DeptCodes$values),2])
>>
>> # Combine with doctor names to finish
>> NamesandTitles = cbind(as.character(Doctors[,1]),DeptAssignments)
>>
>> It's not the most elegant way of doing it, but hopefully it gives some
>> insight into how to work with factors. If you can send a little more
>> information about how your data is currently stored we can optimize this
>> into something easily repeatable but without specifics, I have to work in
>> generalities.
>>
>> Hope this helps,
>>
>> Michael Weylandt
>>
>> On Thu, Sep 8, 2011 at 6:36 PM, Totally Inept <kramer...@gmail.com> wrote:
>>
>>> First of all, let me apologize, as this is probably an absurdly basic
>>> question. I did search before asking, but perhaps my ineptitude didn't
>>> allow me to apply what I read to what I'm doing. Totally new to R, and
>>> haven't done any code in any language in a long time.
>>>
>>> Basically I've got categories. They're department codes for doctors (say,
>>> 9999 for radiology or 5555 for endocrinology), which of course means that
>>> there are a good number of them, i.e. it's not practical for me to write
>>> them all out as I usually see in examples of categorical variables
>>> (factors).
>>>
>>> And then I've got a list of doctors that I'm actually interested in. I have
>>> the department codes associated with each, but I need to map the department
>>> name to the doctor name. So I might have Greg Jones, Bob Smith, Tom Wilson,
>>> etc... to go with 1234, 9999, 2222, etc.
>>>
>>> I need to turn Greg Jones, Bob Smith, ... and 1234, 9999, ... into Greg
>>> Jones, Bob Smith, ... Cardiology, Radiology, ....
>>>
>>> Obviously I could just search and replace within the csv files but I need
>>> something durable that I can run things through repeatedly.
>>>
>>> Anyhow, thanks to anyone willing to humor me with an answer.
>>>
>>> --
>>> View this message in context:
>>> http://r.789695.n4.nabble.com/Factors-I-think-tp3800413p3800413.html
>>> Sent from the R help mailing list archive at Nabble.com.
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd....@cbs.dk  Priv: pda...@gmail.com
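
On the point about non-matched codes raised in the reply above, here is a minimal sketch of how the options behave. It assumes the columns are stored as character rather than factor, and it adds a hypothetical doctor ("Tom Wilson") whose code "1234" is deliberately missing from the lookup table; neither assumption comes from the poster's real data.

```r
# Toy data modelled on the thread, stored as character instead of factor.
# "Tom Wilson" / "1234" is a hypothetical unmatched entry for illustration.
Doctors <- data.frame(
  Docs     = c("Greg Jones", "Bob Smith", "Al Franklin", "Tom Wilson"),
  DocDepts = c("9999", "5555", "9999", "1234"),
  stringsAsFactors = FALSE
)
DeptCodes <- data.frame(
  Depts     = as.character(1:9 * 1111),
  DeptNames = c("Heart", "Kidney", "Feet", "Teeth", "Brain",
                "Digestive", "Diagnostic", "Surgery", "Anesthesia"),
  stringsAsFactors = FALSE
)

# Default merge() is an inner join: the doctor with the unmatched code
# disappears from the result.
inner <- merge(Doctors, DeptCodes, by.x = "DocDepts", by.y = "Depts")

# all.x = TRUE keeps every doctor; unmatched codes get NA as department name.
left <- merge(Doctors, DeptCodes, by.x = "DocDepts", by.y = "Depts",
              all.x = TRUE)

# A match() lookup keeps the original row order of Doctors and likewise
# yields NA for codes that are not in the lookup table.
Doctors$DeptNames <- DeptCodes$DeptNames[match(Doctors$DocDepts, DeptCodes$Depts)]
```

Which behaviour is preferable depends on whether a silently dropped doctor or an explicit NA is easier to spot in the downstream workflow; checking anyNA(Doctors$DeptNames) after the lookup is one cheap way to catch unmatched codes.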