Hi all,

I'm having a problem restructuring my data the way I'd like it.  I have data
that look like this:

  Candidate.ID        Specialty       Office         Score
       110002         C                  London           47
       110002         C                  East              48
       110003         RM               West              45
       110003         RM               Southwest      39
       110003         C                  Southwest      38
       110004         H                  South             42
       110006         G                  East               47
       110006         G                  London           45

Candidates can apply for the same job specialty in up to 2 offices (never
more).  They can apply for different specialties in further centres.  I
would like to look at score differences when candidates apply for the same
specialty in two different offices.  With the help of the archives I have
tried various stack/unstack and reshape/melt/cast combinations, and I've
managed to get a huge matrix where the columns are all possible combinations
of Specialties & Offices - and there are many.  This leaves a very sparse
matrix  with mainly null values, and this is not what I want.  I'd like the
scores from the two attempts in two columns so I can do scatterplots,
calculate differences by specialty etc. In SPSS I'd use 'restructure' to get
what I want.  I'm working to order with specific requests here so I have to
do it this way (as opposed to a modelling approach).

I would like it restructured to look something like this:

  Candidate.ID        Specialty       Office.1       Score.1      Office.2
Score.2
       110002         C                  London           47
 East          48
       110003         RM               West              45
 Southwest  39
       110003         C                  Southwest      38
       110004         H                  South             42
       110006         G                  East               47
London      45
       110006         G                  London           45

So one row per candidate/specialty combination, with 2 sets of
offices/scores and null values in the second set if they've only applied
once for that specialty.  Can anyone help me out with this? Is it possible
using stack or reshape?

Many thanks for reading,
Jan

PS Closest I've come to what I need is the sparse matrix produced by this:

recast(spec.scores, Candidate.ID ~ Specialty + Office, measure.var="Score")

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to