Here are some of the ideas I have used in the past teaching a class like this:

Give them a paragraph of text describing data values and have them create a 
data frame from the data (the prose is so that there is not an obvious table 
structure to start with).  Something like:

Patient number 1 (male) had blood pressure of 120/80 before the treatment and 
110/70 after the treatment, patient number 2 lowered her systolic value from 
130 to 120 with the treatment but her diastolic value stayed at 80, ...

I only used about 6 rows of final data, but this forces the students to think 
about how they want to structure the data (should systolic before and after be 
separate columns? Or 1 column with another column indicating before/after?)
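
To make the two layouts concrete, here is a rough sketch using only the two 
patients described above.  The column names are my own invention for 
illustration; the students will (and should) come up with their own.

```r
# Wide layout: one row per patient, separate before/after columns.
# (Column names are hypothetical, chosen only for this illustration.)
bp_wide <- data.frame(
  patient    = c(1, 2),
  sex        = c("male", "female"),
  sys_before = c(120, 130),
  dia_before = c(80, 80),
  sys_after  = c(110, 120),
  dia_after  = c(70, 80)
)

# Long layout: one row per patient per time point, with a column
# indicating before/after.
bp_long <- data.frame(
  patient   = rep(bp_wide$patient, times = 2),
  sex       = rep(bp_wide$sex, times = 2),
  time      = rep(c("before", "after"), each = nrow(bp_wide)),
  systolic  = c(bp_wide$sys_before, bp_wide$sys_after),
  diastolic = c(bp_wide$dia_before, bp_wide$dia_after)
)

print(bp_wide)
print(bp_long)
```

The wide form is convenient for paired comparisons, the long form for 
plotting and modeling with a before/after factor.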

Now have the students do some basic analyses on a sample dataset: t-tests, 
summaries, basic regression, and diagnostic plots.
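
A short sketch of what such a session might look like.  The choice of the 
built-in sleep and cars datasets is only an assumption for illustration; any 
small dataset would do.

```r
# Numeric and factor summaries of a built-in dataset
summary(sleep)

# Two-sample (Welch) t-test of extra sleep by drug group
tt <- t.test(extra ~ group, data = sleep)
print(tt)

# Simple linear regression of stopping distance on speed
fit <- lm(dist ~ speed, data = cars)
print(summary(fit))

# Standard diagnostic plots (residuals, QQ plot, etc.) in a 2 x 2 grid
old_par <- par(mfrow = c(2, 2))
plot(fit)
par(old_par)
```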

Have them compute regression coefficients the hard way (doing the matrix 
multiplications and/or minimizing the sum of squared residuals using optim); 
this may help them appreciate the lm function.
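
For reference, one possible version of both approaches.  The cars dataset, the 
starting values, and the BFGS method are my assumptions here, not 
requirements of the exercise.

```r
# Design matrix with an intercept column, and the response
X <- cbind(Intercept = 1, speed = cars$speed)
y <- cars$dist

# 1. Matrix approach: solve the normal equations (X'X) b = X'y
beta_matrix <- drop(solve(t(X) %*% X, t(X) %*% y))

# 2. Numerical approach: minimize the residual sum of squares with optim
rss <- function(beta) sum((y - X %*% beta)^2)
fit_optim <- optim(c(0, 0), rss, method = "BFGS")
beta_optim <- fit_optim$par

# 3. The easy way, for comparison
beta_lm <- coef(lm(dist ~ speed, data = cars))

print(rbind(matrix = beta_matrix, optim = beta_optim, lm = beta_lm))
```

The optim answer will only agree with the other two up to the optimizer's 
convergence tolerance, which is a useful discussion point in itself.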

Generate a population of random data and compute its mean and standard 
deviation, then take 100 or 1,000 samples from this population and compute the 
mean of each sample.  Compare the mean and standard deviation of the sample 
means to the mean and standard deviation of the population.  Create a histogram 
of the means and show summaries of the means and the population as reference 
lines on the plot (this helps cement the central limit theorem).
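
A minimal sketch of this simulation.  The exponential population, the 
population size, and the sample size of 30 are all assumptions picked for 
illustration.

```r
set.seed(1)

# A skewed "population" so the normality of the means is not a given
population <- rexp(100000, rate = 1)
pop_mean <- mean(population)
pop_sd   <- sd(population)

# Draw 1,000 samples of size n and keep the mean of each
n <- 30
sample_means <- replicate(1000, mean(sample(population, n)))

# Mean of the means should be near pop_mean;
# sd of the means should be near pop_sd / sqrt(n)
print(c(pop_mean = pop_mean, mean_of_means = mean(sample_means)))
print(c(pop_sd = pop_sd, sd_of_means = sd(sample_means),
        theory = pop_sd / sqrt(n)))

# Histogram of the means with reference lines
hist(sample_means, breaks = 30, main = "Means of 1,000 samples",
     xlab = "sample mean")
abline(v = pop_mean, col = "red", lwd = 2)             # population mean
abline(v = mean(sample_means), col = "blue", lty = 2)  # mean of the means
```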

Write a function to do the classic number guessing game where the function will 
choose an integer between 1 and 100 then prompt the user for a guess, then tell 
them if their guess is too high, too low, or correct (not interesting 
statistically, but gives some good basic use of programming logic).
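
One way the game could be written.  The get_guess argument is a hypothetical 
addition of mine so that the function can be driven by code as well as by the 
keyboard; by default it prompts with readline.

```r
guessing_game <- function(low = 1, high = 100, get_guess = NULL) {
  # Default: read a guess from the keyboard (interactive sessions only)
  if (is.null(get_guess)) {
    get_guess <- function() {
      suppressWarnings(as.integer(readline("Your guess: ")))
    }
  }
  target <- sample(low:high, 1)
  n_guesses <- 0
  repeat {
    guess <- get_guess()
    if (is.na(guess)) {
      message("Please enter a whole number.")
      next
    }
    n_guesses <- n_guesses + 1
    if (guess > target) {
      message("Too high.")
    } else if (guess < target) {
      message("Too low.")
    } else {
      message("Correct, in ", n_guesses, " guesses.")
      return(invisible(n_guesses))
    }
  }
}

# Interactive use:
#   guessing_game()
```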

Write a function that will compute the arithmetic, geometric, harmonic, and 
self weighting means.  The function needs the same optional arguments as mean.  
Optionally have it plot a histogram of the data with reference lines at each of 
the means.
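
A sketch of such a function.  Two assumptions to flag: I take the self 
weighting mean to be sum(x^2)/sum(x) (each value weighted by itself), and trim 
is handled by simply dropping a fraction of the sorted values from each end 
before computing all four means.  The function name is made up.

```r
all_means <- function(x, trim = 0, na.rm = FALSE, plot = FALSE) {
  stopifnot(is.numeric(x), trim >= 0, trim < 0.5)
  if (na.rm) x <- x[!is.na(x)]
  if (anyNA(x)) {
    # Mirror mean(): missing values propagate unless na.rm = TRUE
    return(c(arithmetic = NA_real_, geometric = NA_real_,
             harmonic = NA_real_, self_weighting = NA_real_))
  }
  if (trim > 0 && length(x) > 0) {
    k <- floor(length(x) * trim)          # number dropped from each end
    x <- sort(x)[(k + 1):(length(x) - k)]
  }
  means <- c(
    arithmetic     = mean(x),
    geometric      = exp(mean(log(x))),   # assumes positive data
    harmonic       = 1 / mean(1 / x),     # assumes nonzero data
    self_weighting = sum(x^2) / sum(x)    # assumed definition, see above
  )
  if (plot) {
    hist(x, main = "Data with reference lines at each mean")
    abline(v = means, col = 1:4, lwd = 2)
    legend("topright", legend = names(means), col = 1:4, lwd = 2)
  }
  means
}

print(all_means(c(1, 2, 4)))
```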

Use regexpr and related functions to extract information from date(), or from 
the rownames of a dataset (I often get data with id values like M1, M2, F1, 
F2, ... and no column of sex info, so I need to extract that from the id).
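
For example, something along these lines.  The id vector is made up to match 
the pattern above, and the patterns assume ids always start with M or F and 
that date() ends with a four-digit year.

```r
ids <- c("M1", "M2", "F1", "F2")

# regexpr finds the match position; regmatches pulls out the matched text.
# Note that regmatches silently drops elements that do not match.
sex_code <- regmatches(ids, regexpr("^[MF]", ids))
sex <- factor(sex_code, levels = c("M", "F"), labels = c("male", "female"))

# The numeric part of the id is what remains after removing the letter
id_num <- as.integer(sub("^[MF]", "", ids))

print(data.frame(id = ids, sex = sex, number = id_num))

# Pull pieces out of the string returned by date()
now   <- date()
year  <- regmatches(now, regexpr("[0-9]{4}$", now))
clock <- regmatches(now, regexpr("[0-9]{2}:[0-9]{2}:[0-9]{2}", now))
print(c(year = year, time = clock))
```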

Generate data from the distribution f(x) = x/2 for 0<x<2.  Generate bivariate 
data from the joint distribution f(x,y) = 2x+2y-4xy, 0<x<1, 0<y<1.  Plot the 
data to see if it looks like it comes from the theoretical distribution.
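
One possible approach, offered as a sketch rather than the intended solution: 
inverse-CDF sampling for the first density (F(x) = x^2/4, so x = 2*sqrt(u)) 
and rejection sampling for the second, using the fact that f(x,y) is at most 2 
on the unit square.

```r
set.seed(42)
n <- 10000

# Univariate: invert the CDF F(x) = x^2 / 4 on (0, 2)
x1 <- 2 * sqrt(runif(n))

# Bivariate: rejection sampling from uniform proposals on the unit square
f_xy <- function(x, y) 2 * x + 2 * y - 4 * x * y
f_max <- 2                       # attained at the corners (1,0) and (0,1)
m  <- 4 * n                      # proposals; about half are accepted
xp <- runif(m)
yp <- runif(m)
keep <- runif(m) * f_max <= f_xy(xp, yp)
xy <- cbind(x = xp[keep], y = yp[keep])[seq_len(n), ]

# Visual checks against the theoretical distributions
hist(x1, freq = FALSE, breaks = 40, main = "f(x) = x/2")
curve(x / 2, from = 0, to = 2, add = TRUE, col = "red", lwd = 2)
plot(xy, pch = ".", main = "f(x,y) = 2x + 2y - 4xy")
```

The points should pile up near the (0,1) and (1,0) corners and thin out near 
(0,0) and (1,1).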

Various simulations:
Recreate the t-table (generate samples of normals, compute t, and find the 
quantile at which 5% of tests would be more extreme).
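
A sketch of the t-table simulation, assuming two-sided 5% critical values and 
a handful of arbitrary sample sizes; the function name is made up.

```r
set.seed(123)

# Simulated two-sided 5% critical value for samples of size n
sim_crit <- function(n, reps = 20000) {
  t_stats <- replicate(reps, {
    x <- rnorm(n)
    mean(x) / (sd(x) / sqrt(n))   # one-sample t statistic under the null
  })
  # 5% of |t| values are more extreme than this quantile
  unname(quantile(abs(t_stats), 0.95))
}

sizes <- c(5, 10, 30)
t_table <- data.frame(
  df        = sizes - 1,
  simulated = sapply(sizes, sim_crit),
  exact     = qt(0.975, df = sizes - 1)
)
print(t_table)
```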

Generate data for a 2-sample t test, but decide whether to pool the variances 
based on a test of the variances.  Simulate under various conditions to see if 
you get a different error rate than you should.
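
A rough sketch of the pretest simulation.  The 0.05 cutoff for the variance 
test, the sample sizes, and the standard deviations are assumptions chosen 
only to give the students a starting point.

```r
set.seed(2008)

# One simulated experiment under the null (equal means):
# pool the variances only if the F test does not reject equal variances
pretest_p <- function(n1, n2, sd1, sd2) {
  x <- rnorm(n1, mean = 0, sd = sd1)
  y <- rnorm(n2, mean = 0, sd = sd2)
  pool <- var.test(x, y)$p.value > 0.05
  t.test(x, y, var.equal = pool)$p.value
}

# Proportion of experiments that reject at the nominal 5% level
error_rate <- function(n1, n2, sd1, sd2, reps = 2000) {
  mean(replicate(reps, pretest_p(n1, n2, sd1, sd2)) < 0.05)
}

rates <- c(
  equal_var   = error_rate(10, 10, sd1 = 1, sd2 = 1),
  unequal_var = error_rate(10, 30, sd1 = 2, sd2 = 1)
)
print(rates)   # compare each to the nominal 0.05
```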

Do simulations to calculate power for different scenarios.
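
For instance, a simulated power for a pooled two-sample t test can be checked 
against power.t.test.  The scenario here (n = 20 per group, a difference of 1 
standard deviation) is an arbitrary choice, and sim_power is a made-up name.

```r
set.seed(99)

# Proportion of simulated experiments in which the test rejects
sim_power <- function(n, delta, sd = 1, reps = 2000, alpha = 0.05) {
  mean(replicate(reps, {
    x <- rnorm(n, mean = 0, sd = sd)
    y <- rnorm(n, mean = delta, sd = sd)
    t.test(x, y, var.equal = TRUE)$p.value < alpha
  }))
}

p_sim    <- sim_power(n = 20, delta = 1)
p_theory <- power.t.test(n = 20, delta = 1, sd = 1)$power
print(c(simulated = p_sim, theoretical = p_theory))
```

The same skeleton extends to scenarios where no closed-form power exists 
(skewed data, unequal variances, and so on).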


As part of the final I would usually have them write a function to do 
Hotelling's multivariate T-test.
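
A sketch of what a solution might look like, assuming the one-sample version 
of the test (the two-sample version follows the same pattern with a pooled 
covariance matrix).  The function name and the example null mean are made up.

```r
hotelling_one_sample <- function(X, mu0) {
  X <- as.matrix(X)
  n <- nrow(X)
  p <- ncol(X)
  stopifnot(n > p, length(mu0) == p)

  d <- colMeans(X) - mu0            # difference from the hypothesized mean
  S <- cov(X)                       # sample covariance matrix

  # T^2 = n * d' S^{-1} d
  T2 <- n * drop(t(d) %*% solve(S, d))

  # Under H0, (n - p) / (p (n - 1)) * T^2 follows an F(p, n - p) distribution
  F_stat  <- (n - p) / (p * (n - 1)) * T2
  p_value <- pf(F_stat, p, n - p, lower.tail = FALSE)

  list(T2 = T2, F = F_stat, df = c(p, n - p), p.value = p_value)
}

# Example: first 50 rows of iris against an arbitrary hypothesized mean
res <- hotelling_one_sample(iris[1:50, 1:4], mu0 = c(5, 3.4, 1.5, 0.2))
print(res)
```

A handy check for the students: with a single column, T^2 should equal the 
square of the ordinary one-sample t statistic.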

Hope this helps,





--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
801.408.8111


> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> project.org] On Behalf Of Erin Hodgess
> Sent: Wednesday, September 24, 2008 11:39 AM
> To: r-help@r-project.org
> Subject: [R] possible interesting R projects for undergrads
>
> Dear R People:
>
> I finally (Yay!) got R installed in a classroom!
>
> Anyhow, I have a respectful request, please:  could anyone recommend
> some nice undergrad projects in R, please?
>
> This is in a statistical computation class; first time being run.
>
> Thanks,
> Erin
>
>
> --
> Erin Hodgess
> Associate Professor
> Department of Computer and Mathematical Sciences
> University of Houston - Downtown
> mailto: [EMAIL PROTECTED]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

