Hi

Try

# is.null() on a column returns a single FALSE, so count non-NA values instead
for (cname in colnames(mydf))
 print(percent(sum(!is.na(mydf[, cname])) / lines))
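
If you would rather avoid the explicit loop, the same per-column calculation can be sketched with sapply(). This is only a minimal sketch under some assumptions: the toy data frame and the is_populated() helper are made up for illustration, "populated" is taken to mean neither NA nor an empty string, and base sprintf() stands in for percent() so that no extra package is needed.

```r
# Toy data frame standing in for mydf (hypothetical values, not the real extract).
mydf <- data.frame(
  PARTY_ID = c(1, 2, NA, 4),
  CITY = c("Oslo", "", NA, "Rome"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: a value counts as populated when it is
# neither NA nor an empty string.
is_populated <- function(x) !is.na(x) & x != ""

# Share of populated values per column, returned as a named numeric vector.
# mean() of a logical vector is the fraction of TRUE values, so the row
# count does not have to be computed separately.
pct_populated <- sapply(mydf, function(x) mean(is_populated(x)))

# Format as percentages with base R.
print(sprintf("%s: %.1f%%", names(pct_populated), 100 * pct_populated))
```

Because the result is a named vector, it can be sorted or filtered afterwards, for example to list the least populated columns first with sort(pct_populated).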

Br. Frede


-------- Original message --------
From: Jeff Johnson
Date: 18/01/2014 02.10 (GMT+01:00)
To: R help
Subject: [R] For loop on column names

I'm trying to find a more efficient way to calculate the percentage of
rows in which a field is populated, and to repeat it for each field (column).

First, I'm counting the number of lines:
extract <- 'C:/Users/jeffjohn/Desktop/batchextract_100k_sample.csv'
lines <- as.integer(countLines(extract) - 1)
> dput(lines)
100000L

mydf <- read.csv(file = extract, header = TRUE)

Here's the list of columns in my file:
> dput(colnames(mydf))
c("PERSONPROFILE_POS", "PARTY_ID", "PERSON_FIRST_NAME", "PERSON_LAST_NAME",
"PERSON_MIDDLE_NAME", "PARTY_NUMBER", "ACCOUNT_NUMBER", "ABILITEC_LINK",
"ADDRESS1", "ADDRESS2", "ADDRESS3", "ADDRESS4", "CITY", "COUNTY",
"STATE", "PROVINCE", "POSTAL_CODE", "COUNTRY", "PRIMARY_PER_TYPE",
"SELLTOADDR_LOS", "LOCATION_ID", "SELLTOADDR_SOS", "PARTY_SITE_ID",
"PRIMARYPHONE_CPOS", "CONTACT_POINT_ID_PCP", "CONTACT_POINT_PURPOSE_PCP",
"PHONE_LINE_TYPE", "PRIMARY_FLAG_PCP", "PHONE_COUNTRY_CODE",
"PHONE_AREA_CODE", "PHONE_NUMBER", "EMAIL_CPOS", "CONTACT_POINT_ID_ECP",
"CONTACT_POINT_PURPOSE_ECP", "PRIMARY_FLAG_ECP", "EMAIL_ADDRESS",
"BB_PARTY_ID")

I want to count the percentage populated for each field. Rather than do:
percent(length(is.null(mydf$PERSONPROFILE_POS)) / lines)
percent(length(is.null(mydf$PARTY_ID)) / lines)
etc.
and repeat for each field manually, I want to use a for loop.

I am trying the following:
a <- length(colnames(mydf)) # this is to get the total number of columns

for (i in 1:a)
 print((percent(length(is.null(a)) / lines))

which isn't correct. I'm new to programming, so I don't quite know how to
deal with this. Any suggestions? Thanks much.
--
Jeff

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
