"Make a table that looks like..." sounds like a use case that would benefit from some reflection. Anyway, at least don't put your IDs *in* the "table".

# Your data
CaseID <- c('1015285', '1005317', '1012281', '1015285', '1015285',
            '1007183', '1008833', '1015315', '1015322', '1015285')
Primary.Viol.Type <- c('AS.Age', 'HS.Hours', 'HS.Hours', 'HS.Hours',
                       'RK.Records_CL', 'OT.Overtime', 'OT.Overtime',
                       'OT.Overtime', 'V.Poster_Other', 'V.Poster_Other')

# the code
uID <- unique(CaseID)
uVT <- unique(Primary.Viol.Type)
m <- matrix(NA, nrow=length(uID), ncol=length(uVT), dimnames=list(uID, uVT))
for (i in 1:length(CaseID)) {
   m[CaseID[i], Primary.Viol.Type[i]] <- 1
}

# the result
        AS.Age HS.Hours RK.Records_CL OT.Overtime V.Poster_Other
1015285      1        1             1          NA              1
1005317     NA        1            NA          NA             NA
1012281     NA        1            NA          NA             NA
1007183     NA       NA            NA           1             NA
1008833     NA       NA            NA           1             NA
1015315     NA       NA            NA           1             NA
1015322     NA       NA            NA          NA              1

B.


On Dec 18, 2014, at 8:09 AM, Crombie, Burnette N <bcrom...@utk.edu> wrote:

> I want to achieve a table that looks like a grid of 1's for all cases in a
> survey. I'm an R beginner and don't have a clue how to do all the things you
> just suggested. I really appreciate the time you took to explain all of
> those options, though. -- BNC
>
> -----Original Message-----
> From: Boris Steipe [mailto:boris.ste...@utoronto.ca]
> Sent: Thursday, December 18, 2014 5:29 AM
> To: Crombie, Burnette N
> Cc: r-help@r-project.org
> Subject: Re: [R] Make 2nd col of 2-col df into header row of same df then
> adjust col1 data display
>
> What you are describing sounds like a very spreadsheet-y thing.
>
> - The information is already IN your dataframe, and easy to get out by
> subsetting. Depending on your usecase, that may actually be the "best".
>
> - If the number of CaseIDs is large, I would use a hash of lists (if the data
> is sparse), or hash of named vectors if it's not sparse. Lookup is O(1) so
> that may be the best. (Cf package hash, and explanations there).
>
> - If it must be the spreadsheet-y thing, you could make a matrix with
> rownames and colnames taken from unique() of your respective dataframe.
> Instead of 1 and NA I probably would use TRUE/FALSE.
>
> - If it takes less time to wait for the results than to look up how apply()
> works, you can write a simple loop to populate your matrix. Otherwise apply()
> is much faster.
>
> - You could even use a loop to build the datastructure, checking for every
> cbind() whether the value in column 1 already exists in the table - but
> that's terrible and would make a kitten die somewhere on every iteration.
>
> All of these are possible, and you haven't told us enough about what you want
> to achieve to figure out what the "best" is. If you choose one of the options
> and need help with the code, let us know.
>
> Cheers,
> B.
>
>
> On Dec 17, 2014, at 10:15 PM, bcrombie <bcrom...@utk.edu> wrote:
>
>> # I have a dataframe that contains 2 columns:
>> CaseID <- c('1015285',
>>             '1005317',
>>             '1012281',
>>             '1015285',
>>             '1015285',
>>             '1007183',
>>             '1008833',
>>             '1015315',
>>             '1015322',
>>             '1015285')
>>
>> Primary.Viol.Type <- c('AS.Age',
>>                        'HS.Hours',
>>                        'HS.Hours',
>>                        'HS.Hours',
>>                        'RK.Records_CL',
>>                        'OT.Overtime',
>>                        'OT.Overtime',
>>                        'OT.Overtime',
>>                        'V.Poster_Other',
>>                        'V.Poster_Other')
>>
>> PViol.Type.Per.Case.Original <- data.frame(CaseID,Primary.Viol.Type)
>>
>> # CaseID's can be repeated because there can be up to 14
>> Primary.Viol.Type's per CaseID.
>>
>> # I want to transform this dataframe into one that has 15 columns,
>> where the first column is CaseID, and the rest are the 14 primary
>> viol. types. The CaseID column will contain a list of the unique
>> CaseID's (no replicates) and for each of their rows, there will be a
>> "1" under a column corresponding to a primary violation type recorded
>> for that CaseID. So, technically, there could be zero to 14 "1's" in a
>> CaseID's row.
>>
>> # For example, the row for CaseID '1015285' above would have a "1"
>> under "AS.Age", "HS.Hours", "RK.Records_CL", and "V.Poster_Other",
>> but have "NA" under the rest of the columns.
>>
>> PViol.Type <- c("CaseID",
>>                 "BW.BackWages",
>>                 "LD.Liquid_Damages",
>>                 "MW.Minimum_Wage",
>>                 "OT.Overtime",
>>                 "RK.Records_FLSA",
>>                 "V.Poster_Other",
>>                 "AS.Age",
>>                 "BW.WHMIS_BackWages",
>>                 "HS.Hours",
>>                 "OA.HazOccupationAg",
>>                 "ON.HazOccupationNonAg",
>>                 "R3.Reg3AgeOccupation",
>>                 "RK.Records_CL",
>>                 "V.Other")
>>
>> PViol.Type.Columns <- t(data.frame(PViol.Type))
>>
>> # What is the best way to do this in R?
>>
>>
>> --
>> View this message in context:
>> http://r.789695.n4.nabble.com/Make-2nd-col-of-2-col-df-into-header-row-of-same-df-then-adjust-col1-data-display-tp4700878.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> ______________________________________________
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.