Re: [R] Make 2nd col of 2-col df into header row of same df then adjust col1 data display

Chel Hee Lee Thu, 18 Dec 2014 21:37:56 -0800

Please take a look at my code again. The error message says that object 'Primary.Viol.Type' was not found. Have you ever created an object named 'Primary.Viol.Type'? In your real dataset it exists only as a column of the data frame, so the code will work if you replace 'Primary.Viol.Type' with 'PViol.Type.Per.Case.Original$Primary.Viol.Type' where 'factor()' is used. I hope this helps.
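
A minimal sketch of that replacement, for illustration only. The three-row data frame and the shortened level list are made-up stand-ins (assumptions) for the imported CSV and the full 14-type list:

```r
# Toy stand-in for the data frame that read.csv() would return.
PViol.Type.Per.Case.Original <- data.frame(
  CaseID = c("1015285", "1005317", "1007183"),
  Primary.Viol.Type = c("AS.Age", "HS.Hours", "OT.Overtime"),
  stringsAsFactors = TRUE
)

# Shortened level list, for illustration only.
PViol.Type <- c("OT.Overtime", "AS.Age", "HS.Hours", "V.Other")

# Refer to the column through its data frame; there is no free-standing
# object called Primary.Viol.Type in this workspace.
PViol.Type.Per.Case.Original$Primary.Viol.Type <- factor(
  PViol.Type.Per.Case.Original$Primary.Viol.Type,
  levels = PViol.Type
)

levels(PViol.Type.Per.Case.Original$Primary.Viol.Type)
```

In the original example the call worked only because 'Primary.Viol.Type' had also been created as a separate vector before the data frame was built.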


Chel Hee Lee


On 12/18/2014 08:57 PM, Crombie, Burnette N wrote:

Chel, your solution is fantastic on the dataset I submitted in my question but 
it is not working when I import my real dataset into R.  Do I need to vectorize 
the columns in my real dataset after importing?  I tried a few things (###) but 
not making progress:

MERGE_PViol.Detail.Per.Case <- 
read.csv("~/FOIA_FLSA/MERGE_PViol.Detail.Per.Case_for_rtf10.csv", 
stringsAsFactors=TRUE)

### select only certain columns
PViol.Type.Per.Case.Original <- MERGE_PViol.Detail.Per.Case[,c("CaseID", 
"Primary.Viol.Type")]

### write.csv(PViol.Type.Per.Case,file="PViol.Type.Per.Case.Select.csv")
### PViol.Type.Per.Case.Original <- read.csv("~/FOIA_FLSA/PViol.Type.Per.Case.Select.csv")
### PViol.Type.Per.Case.Original$X <- NULL
### PViol.Type.Per.Case.Original[] <- lapply(PViol.Type.Per.Case.Original, as.character)

PViol.Type <- c("CaseID",
                 "BW.BackWages",
                 "LD.Liquid_Damages",
                 "MW.Minimum_Wage",
                 "OT.Overtime",
                 "RK.Records_FLSA",
                 "V.Poster_Other",
                 "AS.Age",
                 "BW.WHMIS_BackWages",
                 "HS.Hours",
                 "OA.HazOccupationAg",
                 "ON.HazOccupationNonAg",
                 "R3.Reg3AgeOccupation",
                 "RK.Records_CL",
                 "V.Other")

PViol.Type.Per.Case.Original$Primary.Viol.Type <- factor(Primary.Viol.Type, 
levels=PViol.Type, labels=PViol.Type)

### Error in factor(Primary.Viol.Type, levels = PViol.Type, labels = PViol.Type) :
###   object 'Primary.Viol.Type' not found

tmp <- split(PViol.Type.Per.Case.Original,PViol.Type.Per.Case.Original$CaseID)
ans <- ifelse(do.call(rbind, lapply(tmp, 
function(x)table(x$Primary.Viol.Type))), 1, NA)



-----Original Message-----
From: Crombie, Burnette N
Sent: Thursday, December 18, 2014 3:01 PM
To: 'Chel Hee Lee'
Subject: RE: [R] Make 2nd col of 2-col df into header row of same df then 
adjust col1 data display

Thanks for taking the time to review this, Chel.  I've got to step away from my 
desk, but will reply more substantially as soon as possible. -- BNC

-----Original Message-----
From: Chel Hee Lee [mailto:chl...@mail.usask.ca]
Sent: Thursday, December 18, 2014 2:43 PM
To: Jeff Newmiller; Crombie, Burnette N
Cc: r-help@r-project.org
Subject: Re: [R] Make 2nd col of 2-col df into header row of same df then 
adjust col1 data display

I like the approach presented by Jeff Newmiller as shown in the previous post 
(I really like his way).  As he suggested, it would be good to start with 
'factor' since you have all values of 'Primary.Viol.Type'.
You may try to use 'split()' function for creating table that you wish to 
build.  Please see the below (I hope this helps):

  > PViol.Type.Per.Case.Original$Primary.Viol.Type <- factor(Primary.Viol.Type, levels=PViol.Type, labels=PViol.Type)
  >
  > tmp <- split(PViol.Type.Per.Case.Original, PViol.Type.Per.Case.Original$CaseID)
  > ans <- ifelse(do.call(rbind, lapply(tmp, function(x) table(x$Primary.Viol.Type))), 1, NA)
  > ans
          CaseID BW.BackWages LD.Liquid_Damages MW.Minimum_Wage OT.Overtime
1005317     NA           NA                NA              NA          NA
1007183     NA           NA                NA              NA           1
1008833     NA           NA                NA              NA           1
1012281     NA           NA                NA              NA          NA
1015285     NA           NA                NA              NA          NA
1015315     NA           NA                NA              NA           1
1015322     NA           NA                NA              NA          NA
          RK.Records_FLSA V.Poster_Other AS.Age BW.WHMIS_BackWages HS.Hours
1005317              NA             NA     NA                 NA        1
1007183              NA             NA     NA                 NA       NA
1008833              NA             NA     NA                 NA       NA
1012281              NA             NA     NA                 NA        1
1015285              NA              1      1                 NA        1
1015315              NA             NA     NA                 NA       NA
1015322              NA              1     NA                 NA       NA
          OA.HazOccupationAg ON.HazOccupationNonAg R3.Reg3AgeOccupation
1005317                 NA                    NA                   NA
1007183                 NA                    NA                   NA
1008833                 NA                    NA                   NA
1012281                 NA                    NA                   NA
1015285                 NA                    NA                   NA
1015315                 NA                    NA                   NA
1015322                 NA                    NA                   NA
          RK.Records_CL V.Other
1005317            NA      NA
1007183            NA      NA
1008833            NA      NA
1012281            NA      NA
1015285             1      NA
1015315            NA      NA
1015322            NA      NA
  >
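
For reference, a self-contained sketch of the same split() idea on a small made-up subset (the four rows and the shortened level list are assumptions for illustration). Here "CaseID" is left out of the levels; the all-NA CaseID column in the output above appears because "CaseID" was listed among the factor levels:

```r
# Small made-up subset of the example data.
CaseID <- c("1015285", "1005317", "1015285", "1007183")
Primary.Viol.Type <- c("AS.Age", "HS.Hours", "HS.Hours", "OT.Overtime")

# Shortened level list, without "CaseID".
PViol.Type <- c("OT.Overtime", "AS.Age", "HS.Hours", "V.Other")

df <- data.frame(
  CaseID,
  Primary.Viol.Type = factor(Primary.Viol.Type, levels = PViol.Type)
)

# One data frame per CaseID, then one row of counts per CaseID.
tmp <- split(df, df$CaseID)
counts <- do.call(rbind, lapply(tmp, function(x) table(x$Primary.Viol.Type)))

# Turn positive counts into 1 and zero counts into NA.
ans <- ifelse(counts > 0, 1, NA)
ans
```

Because every level is declared up front, each per-case table has the same columns, which is what lets rbind() stack them.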

Chel Hee Lee

On 12/18/2014 10:02 AM, Jeff Newmiller wrote:

No guarantees on "best"... but one way using base R could be:

# Note that "CaseID" is actually not a valid PViol.Type as you had it
PViol.Type <- c( "BW.BackWages"
                 , "LD.Liquid_Damages"
                 , "MW.Minimum_Wage"
                 , "OT.Overtime"
                 , "RK.Records_FLSA"
                 , "V.Poster_Other"
                 , "AS.Age"
                 , "BW.WHMIS_BackWages"
                 , "HS.Hours"
                 , "OA.HazOccupationAg"
                 , "ON.HazOccupationNonAg"
                 , "R3.Reg3AgeOccupation"
                 , "RK.Records_CL"
                 , "V.Other" )

# explicitly specifying all levels to the factor ensures a complete
# set of column outputs regardless of what is in the input
PViol.Type.Per.Case.Original <-
      data.frame( CaseID
                , Primary.Viol.Type=factor( Primary.Viol.Type
                                          , levels=PViol.Type ) )

tmp <- table( PViol.Type.Per.Case.Original )
ans <- data.frame( CaseID=rownames( tmp )
                   , as.data.frame( ifelse( 0==tmp, NA, 1 ) )
                   )
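
A hedged variant of the table() idea that builds the wide layout from a plain matrix. The toy data, the shortened level list, and the use of unclass() are assumptions for illustration, not part of the post above:

```r
# Small made-up subset of the example data.
CaseID <- c("1015285", "1005317", "1015285", "1007183")
Primary.Viol.Type <- c("AS.Age", "HS.Hours", "HS.Hours", "OT.Overtime")
PViol.Type <- c("OT.Overtime", "AS.Age", "HS.Hours", "V.Other")

df <- data.frame(
  CaseID,
  Primary.Viol.Type = factor(Primary.Viol.Type, levels = PViol.Type)
)

# Two-way contingency table (CaseID by violation type); unclass() drops
# the "table" class so the result is treated as an ordinary matrix.
counts <- unclass(table(df))

# 1 where a violation type was recorded for the case, NA otherwise.
flags <- ifelse(counts > 0, 1, NA)

# First column is CaseID, followed by one column per violation type.
ans <- data.frame(CaseID = rownames(flags), flags, check.names = FALSE)
ans
```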


On Wed, 17 Dec 2014, bcrombie wrote:

# I have a dataframe that contains 2 columns:
CaseID  <- c('1015285',
'1005317',
'1012281',
'1015285',
'1015285',
'1007183',
'1008833',
'1015315',
'1015322',
'1015285')

Primary.Viol.Type <- c('AS.Age',
'HS.Hours',
'HS.Hours',
'HS.Hours',
'RK.Records_CL',
'OT.Overtime',
'OT.Overtime',
'OT.Overtime',
'V.Poster_Other',
'V.Poster_Other')

PViol.Type.Per.Case.Original <- data.frame(CaseID,Primary.Viol.Type)

# CaseID's can be repeated because there can be up to 14
# Primary.Viol.Type's per CaseID.

# I want to transform this dataframe into one that has 15 columns,
# where the first column is CaseID, and the rest are the 14 primary
# viol. types.  The CaseID column will contain a list of the unique
# CaseID's (no replicates) and for each of their rows, there will be
# a '1' under a column corresponding to a primary violation type
# recorded for that CaseID.  So, technically, there could be zero to
# 14 '1's in a CaseID's row.

# For example, the row for CaseID '1015285' above would have a '1'
# under 'AS.Age', 'HS.Hours', 'RK.Records_CL', and 'V.Poster_Other',
# but have "NA" under the rest of the columns.

PViol.Type <- c("CaseID",
                "BW.BackWages",
           "LD.Liquid_Damages",
           "MW.Minimum_Wage",
           "OT.Overtime",
           "RK.Records_FLSA",
           "V.Poster_Other",
           "AS.Age",
           "BW.WHMIS_BackWages",
           "HS.Hours",
           "OA.HazOccupationAg",
           "ON.HazOccupationNonAg",
           "R3.Reg3AgeOccupation",
           "RK.Records_CL",
           "V.Other")

PViol.Type.Columns <- t(data.frame(PViol.Type))

# What is the best way to do this in R?




--
View this message in context:
http://r.789695.n4.nabble.com/Make-2nd-col-of-2-col-df-into-header-row-of-same-df-then-adjust-col1-data-display-tp4700878.html

Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnew...@dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                        Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
