Dear R-users,

I have a dataset that I would like to analyze and plot.
It consists of 100 dummy variables (0/1) for about 2,000,000 observations.
There is no quantitative variable, nor anything I could use as a response
variable for a regression analysis.
The dataset represents the patronage of about 2 million consumers across
100 stores: a variable equals 1 if the consumer goes to the store and 0 if
he doesn't, with no further information.

As the variables look like factors (0/1), I thought I could go for a
Multiple Correspondence Analysis (MCA). However, the resulting plot
contains 2 points for each variable (one for 1 and one for 0), which is
not easily interpretable. (Or is there a way to leave certain points out
of an MCA plot?)

I also tried treating my dataset as a bipartite network
(consumer-store). However, the plot is not really insightful, as I am
mainly looking for links between stores (along the lines of "if a
consumer goes to that store, he probably also goes to this one...").
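
To make "links between stores" concrete, here is a minimal sketch on
simulated data. It is only an illustration under stated assumptions: the
matrix X, its size, the visit probability, and the store names are all
made up, and Jaccard similarity and lift are just two common choices of
link measure, not necessarily the right ones for this dataset.

```r
# Simulated stand-in for the real data (assumption, not the actual dataset):
# rows are consumers, columns are stores, entries are 0/1.
set.seed(1)
n_consumers <- 1000   # illustrative size only
n_stores    <- 5
X <- matrix(rbinom(n_consumers * n_stores, 1, 0.3),
            nrow = n_consumers,
            dimnames = list(NULL, paste0("store", seq_len(n_stores))))

# One-mode projection of the consumer-store network onto stores:
# co[i, j] = number of consumers who go to both store i and store j.
# The diagonal holds each store's total number of visitors.
co <- crossprod(X)

# Jaccard similarity: shared visitors / visitors of at least one of the two.
visits  <- diag(co)
either  <- outer(visits, visits, "+") - co
jaccard <- co / either

# Lift: observed co-patronage rate divided by the rate expected if the two
# stores were visited independently (off-diagonal entries are the useful ones).
p    <- visits / n_consumers
lift <- (co / n_consumers) / outer(p, p)

round(jaccard, 2)
```

Whatever the number of consumers, co, jaccard and lift are only
stores x stores (100 x 100 for the real data), so they can be passed to
heatmap() or used as a weighted adjacency matrix for a store-only network.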

So, I have a simple question: which method would you choose for computing
and plotting the links between a set of dummy variables?

Thanks in advance

Sylvain
PhD Marketing
Associate Professor University of Lille - FR

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.