Re: [R] Approach for Storing Result Data

Jeff Newmiller Wed, 08 Mar 2017 09:04:20 -0800

Seems pretty normal, except that your one-by-one lookup process usually gets 
old eventually. Comparing results is much easier if you merge the study data 
with the lookup data all at once and then use aggregate() (or any of the 
numerous equivalents from contributed packages) to collect results, or map the 
grouping columns to color, linetype, panels, etc. in graphical presentations 
made with lattice or ggplot2.
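
A minimal sketch of the merge-then-aggregate idea, using base R only. The
data frames `study` and `lookup`, their column names, and all numbers in them
are made up for illustration; they are not taken from the original post.

```r
# Hypothetical study data: one row per respondent, with a coded state.
study <- data.frame(
  id         = 1:6,
  state_code = c("BW", "BY", "BY", "BE", "BW", "BY"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table: labels and universe shares per code.
# The percentages are invented for this example.
lookup <- data.frame(
  state_code   = c("BW", "BY", "BE"),
  label        = c("Baden-Wuerttemberg", "Bayern", "Berlin"),
  universe_pct = c(50, 40, 10),
  stringsAsFactors = FALSE
)

# Merge once, instead of looking up each code one by one.
merged <- merge(study, lookup, by = "state_code")

# Count respondents per label, then convert counts to percentages.
counts <- aggregate(id ~ label, data = merged, FUN = length)
names(counts)[names(counts) == "id"] <- "n"
counts$respondent_pct <- 100 * counts$n / sum(counts$n)

# Put respondent and universe shares side by side for comparison.
result <- merge(counts, lookup[, c("label", "universe_pct")], by = "label")
result$diff_pct <- result$respondent_pct - result$universe_pct
print(result)
```

With real data the same pattern should carry over to many variables at once,
provided the lookup table has one row per variable/code combination and the
merge is done on both columns.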
-- 
Sent from my phone. Please excuse my brevity.


On March 8, 2017 7:27:08 AM PST, g.maub...@weinwolf.de wrote:
>Hi All,
>
>today I have a more general question about how to store the values
>that result from analysing multiple variables.
>
>My task is to compare distributions in a universe with distributions
>among the respondents across a whole bunch of variables. The comparison
>shall be done on relative frequencies (proportions).
>
>I was thinking about the structure I should store the results in and
>came up with the following:
>
>-- cut --
>
>library(ggplot2)  # needed for the ggplot() call below
>library(stringi)
>
># Result data frame
># Some sort of tidy data set where
># each value is stored in its own row.
># This way all values for all variables can be stored in
># one single data structure.
># If an additional variable is added for the name of the
># study, one could also build a result data set across
># surveys.
># Values for measure could be "number" for 'raw' values or
># "freq" for frequencies/counts.
># Values for unit could be "n" for 'numbers' and
># "%" for percentages.
>d_test <- data.frame(
>    group = rep(c("Universe", "Respondents"), each = 16),
>    variable = rep("State", 32),
>    value = rep(c(11.3,
>                    12.7,
>                    3.3,
>                    5,
>                    0.6,
>                    8.1,
>                    6.2,
>                    5.8,
>                    6.4,
>                    14.5,
>                    8.3,
>                    0.3,
>                    3.8,
>                    2.5,
>                    8.1,
>                    3), 2),
>    label = rep(c("Baden-Wuerttemberg",
>                "Bayern",
>                "Berlin",
>                "Brandenburg",
>                "Bremen",
>                "Hamburg",
>                "Hessen",
>                "Mecklenburg-Vorpommern",
>                "Niedersachsen",
>                "Nordrhein-Westfalen",
>                "Rheinland-Pfalz",
>                "Saarland",
>                "Sachsen",
>                "Sachsen-Anhalt",
>                "Schleswig-Holstein",
>                "Thueringen"),2),
>    measure = rep("freq", 32),
>    unit = rep("%", 32),
>    stringsAsFactors = FALSE
>)
>
># This way the variables can be selected using simple
># value selection from Base R functionality.
>data <- d_test[d_test$variable == "State" ,]
>
># And plot results for every variable.
>ggplot(
>  data = data,
>  aes(
>    x = label,
>    y = value,
>    fill = group)) +
>  geom_bar(stat = "identity", position = "dodge") +
>  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
>  scale_fill_discrete(name = stringi::stri_trans_totitle(names(data)[1])) +
>  scale_x_discrete(name = data$variable[1]) +
>  # value is numeric, so the y axis needs a continuous scale
>  scale_y_continuous(name = data$unit[1])
>
>-- cut --
>
>The reporting / presentation is done in R Markdown. I would load the
>result data set once at the beginning and run the comparisons as plots
>for each variable named in the result data set under "variable".
>
>If I follow this approach for my customer relationship survey, do you
>think I would face drawbacks or run into serious trouble?
>
>I am interested in your opinion and open to other approaches and
>suggestions.
>
>Kind regards
>
>Georg
>
>______________________________________________
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

