Dear Trent

If you want them side-by-side in one data frame then you could use merge making sure it only merges by date. I would use sub to change all the anitbiotic names by adding "h1" "h2" and so on. Then you can sum antibiotic over hospital by using grep to select all the columns containing antibiotic1. The side-by-side solution has some advantages over stacking them vertically and some disadvantages. You may need to do both for different purposes.

You would need to learn about regular expressions if they are not already familiar to you to get the best out of sub and grep.

On 05/08/2015 07:30, Trent Yarwood wrote:
Hi all,

I'm responsible for collating data on antibiotic use at my local group of
hospitals.  I have data for five different hospitals, about 40 different
antibiotics and monthly data going back to 2006.

At the moment, I have this stored in 5 datafiles, one for each hospital,
formatted as follows:

date, antibiotic1, antibiotic2, antibiotic3....
1-mmm-yy, ab11, ab21, ab31....
1-mmm-yy, ab12, ab22, ab32...

This works most of the time for me, because the most common thing I need to
do is to track a particular hospital's antibiotic use over time (sum of
columns, as a time series by row).

What I would like to do is to amalgamate the data so instead of analysing
an individual hospital (ie a datasheet in the current format) is to be able
to look at a particular antibiotic across the five hospitals.

The best way I can visualise this is having the data in a data cube, with
each hospital as a single plane. Currently, my hospitals are (x,y,1),
(x,y,2) etc. What I'd like to do is look at (2,y,z) - for example, the sum
of antibiotic1 in all hospitals.

I imagine one way of doing this is having a hospital column in the data:

date, hospital, antibiotic1, antibiotic2, antibiotic3...
1-mmm-yy, hospital1, a11, a21, a31...
1-mmm-yy, hospital2, a11, a21, a31... etc

Two questions:

1) Is there a better way of storing the data than this?
2) Is there an easy way to turn what I have into what I want?

I know that once I have the data sorted, I'll be able to dpyl it into the
categories I currently use - it's the getting from here to there I need
help with, please.

Cheers,

Trent.







--
Michael
http://www.dewey.myzen.co.uk/home.html

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to