[R] Installation of R, Sweave, ESS and [X]Emacs on Windows?

2008-03-20 Thread Zembower, Kevin
I'm trying to get R, Sweave, ESS and XEmacs or Emacs all installed and
working together on my Windows XP Pro system. I've got R 2.6.0 working
just fine, installed from the R Windows installer. I also have
CYGWIN_NT-5.1 with XEmacs 21.4 working okay. Can anyone point me to any
documentation on how to bring these together so that R code typed in
XEmacs can be run in R? I found the ESS installation directions at
http://ess.r-project.org/Manual/ess.html#Microsoft-Windows-installation
but they seem daunting. I'm not sure that XEmacs from Cygwin can work
with R installed separately. Can anyone confirm that I just have to follow
these directions to have everything I want?

Thank you all for your help and advice.

-Kevin

Kevin Zembower
Internet Services Group manager
Center for Communication Programs
Bloomberg School of Public Health
Johns Hopkins University
111 Market Place, Suite 310
Baltimore, Maryland  21202
410-659-6139 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Installation of R, Sweave, ESS and [X]Emacs on Windows?

2008-03-20 Thread Zembower, Kevin
Jim and Vincent, thank you both so much. Vincent, I really appreciate the time 
and effort you've put into this project. I was hoping for exactly what you've 
provided. Thanks again.

-Kevin

-Original Message-
From: Vincent Goulet [mailto:[EMAIL PROTECTED] 
Sent: Thursday, March 20, 2008 12:02 PM
To: Zembower, Kevin
Cc: [EMAIL PROTECTED]
Subject: Re: [R] Installation of R, Sweave, ESS and [X]Emacs on Windows?

Kevin,

Save yourself a lot of trouble and use my modified version of GNU  
Emacs available from

http://vgoulet.act.ulaval.ca/en/emacs

and also linked from the ESS home page. It comes bundled with ESS and  
AUCTeX, so the only other thing you will need to install for the  
purposes you mention is R itself (upgrade while you're at it, you're  
two versions behind) and a TeX distribution (consider TeX Live or  
MiKTeX). There is no need for Cygwin with this setup.

Hope this helps

---
   Vincent Goulet, Associate Professor
   École d'actuariat
   Université Laval, Québec
   [EMAIL PROTECTED]   http://vgoulet.act.ulaval.ca




[R] Newbie help with Sweave

2008-03-24 Thread Zembower, Kevin
I think I've gotten my Emacs/Sweave/R system set up correctly, thanks to
Vincent and Jim, but I haven't been successful getting my first document
produced. I'm trying to use one of Friedrich Leisch's examples,
http://www.ci.tuwien.ac.at/~leisch/Sweave/example-1.Snw. I cut and
pasted the text into a document sweaveexample.Rnw in Emacs. It seemed to
be processed successfully with R:
> Sweave("sweaveexample.Rnw")
Writing to file sweaveexample.tex
Processing code chunks ...

You can now run LaTeX on 'sweaveexample.tex'
>

However, when I try to open the file sweaveexample.tex and process it
with Latex in Emacs, I get this error:
ERROR: Missing \endcsname inserted.

--- TeX said ---
<to be read again> 
                   \protect 
l.7 \begin
          {document}
--- HELP ---
From the .log file...

The control sequence marked <to be read again> should
not appear between \csname and \endcsname.

I've tried a variety of examples, but the error messages are the same.

Can anyone point out my errors or mistakes? I've pasted in the full
files below. Thanks so much for your help and advice.

-Kevin

Kevin Zembower
Internet Services Group manager
Center for Communication Programs
Bloomberg School of Public Health
Johns Hopkins University
111 Market Place, Suite 310
Baltimore, Maryland  21202
410-659-6139 
==
sweaveexample.tex:
==
\documentclass[a4paper]{article}

\title{Sweave Example 1}
\author{Friedrich Leisch}

\usepackage{C:/PROGRA~1/R/R-26~1.2/share/texmf/Sweave}
\begin{document}

\maketitle

In this example we embed parts of the examples from the
\texttt{kruskal.test} help page into a \LaTeX{} document:

\begin{Schunk}
\begin{Sinput}
> data(airquality)
> library(ctest)
> kruskal.test(Ozone ~ Month, data = airquality)
\end{Sinput}
\begin{Soutput}
Kruskal-Wallis rank sum test

data:  Ozone by Month 
Kruskal-Wallis chi-squared = 29.2666, df = 4, p-value = 6.901e-06
\end{Soutput}
\end{Schunk}
which shows that the location parameter of the Ozone 
distribution varies significantly from month to month. Finally we
include a boxplot of the data:

\begin{center}
\includegraphics{sweaveexample-002}
\end{center}

\end{document}

sweaveexample.Rnw:
==
\documentclass[a4paper]{article}

\title{Sweave Example 1}
\author{Friedrich Leisch}

\begin{document}

\maketitle

In this example we embed parts of the examples from the
\texttt{kruskal.test} help page into a \LaTeX{} document:

<<>>=
data(airquality)
library(ctest)
kruskal.test(Ozone ~ Month, data = airquality)
@
which shows that the location parameter of the Ozone 
distribution varies significantly from month to month. Finally we
include a boxplot of the data:

\begin{center}
<<fig=TRUE,echo=FALSE>>=
boxplot(Ozone ~ Month, data = airquality)
@
\end{center}

\end{document}



Re: [R] Newbie help with Sweave

2008-03-25 Thread Zembower, Kevin
Kevin, thanks for writing. Yes, sorry, I forgot to mention that this is
a Windows XP Professional system running GNU Emacs 22.1.1
(i386-mingw-nt5.1.2600) from Vincent Goulet, and R 2.6.2 Windows
version. I pasted in the sessionInfo() output from ESS inside of Emacs
to the end of this note.

Was your TA successful in correcting this error? How? Should I report
this to R-development as something worth fixing for the next release?

Thanks, again, for your response and advice.

-Kevin



-Original Message-
From: Kevin E. Thorpe [mailto:[EMAIL PROTECTED] 
Sent: Monday, March 24, 2008 9:01 PM
To: Zembower, Kevin
Cc: [EMAIL PROTECTED]
Subject: Re: [R] Newbie help with Sweave

Is this in a windows system?  A TA of mine was just getting the exact
same message.  He tracked it down to the pathname for Sweave.sty having
trouble with "Program Files" in the path.
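
One possible workaround, sketched here under assumptions rather than taken from this thread: put a copy of Sweave.sty next to the generated .tex file and ask Sweave to write a plain \usepackage{Sweave} instead of a hard-coded path. The stylepath argument is documented for the RweaveLatex driver in recent R releases; whether it behaves the same way in R 2.6.2 is an assumption, and the file name demo.Rnw is hypothetical.

```r
# Sketch: keep LaTeX away from the installation path to Sweave.sty.
# Assumptions: a recent R release; 'demo.Rnw' is a made-up file name.
work_dir <- file.path(tempdir(), "sweave-demo")
dir.create(work_dir, showWarnings = FALSE)

# Locate Sweave.sty inside the R installation instead of hard-coding it,
# and copy it next to the document if it was found.
sty <- list.files(R.home("share"), pattern = "^Sweave\\.sty$",
                  recursive = TRUE, full.names = TRUE)
if (length(sty) > 0) file.copy(sty[1], work_dir, overwrite = TRUE)

# A tiny Sweave source with a single code chunk.
writeLines(c("\\documentclass{article}",
             "\\begin{document}",
             "<<>>=",
             "1 + 1",
             "@",
             "\\end{document}"),
           file.path(work_dir, "demo.Rnw"))

# stylepath = FALSE asks the driver to write \usepackage{Sweave}
# rather than the full path to the style file.
old_wd <- setwd(work_dir)
utils::Sweave("demo.Rnw", stylepath = FALSE)
setwd(old_wd)

tex_lines <- readLines(file.path(work_dir, "demo.tex"))
print(grep("usepackage", tex_lines, value = TRUE))
```

If the source file already contains a \usepackage{Sweave} line, the driver should leave it alone, which may be an alternative way to avoid the path.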

Kevin



-- 
Kevin E. Thorpe
Biostatistician/Trialist, Knowledge Translation Program
Assistant Professor, Department of Public Health Sciences
Faculty of Medicine, University of Toronto
email: [EMAIL PROTECTED]  Tel: 416.864.5776  Fax: 416.864.6057

=
> sessionInfo()
R version 2.6.2 (2008-02-08) 
i386-pc-mingw32 

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
States.1252;LC_MONETARY=English_United
States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

>



[R] Learning to do randomized block design analysis

2007-12-04 Thread Zembower, Kevin
We just studied randomized block design analysis in my statistics class,
and I'm trying to learn how to do them in R. I'm trying to duplicate a
case study example from my textbook [1]:

> # Case Study 13.2.1, page 778
> cd <- c(8, 11, 9, 16, 24)
> dp <- c(2, 1, 12, 11, 19)
> lm <- c(-2, 0, 6, 2, 11)
> table <- data.frame(Block=LETTERS[1:5], "Score changes"=c(cd, dp, lm),
+     Therapy=rep(c("Contact Desensitization",
+                   "Demonstration Participation", "Live Modeling"), each=5))
> table
   Block Score.changes                     Therapy
1      A             8     Contact Desensitization
2      B            11     Contact Desensitization
3      C             9     Contact Desensitization
4      D            16     Contact Desensitization
5      E            24     Contact Desensitization
6      A             2 Demonstration Participation
7      B             1 Demonstration Participation
8      C            12 Demonstration Participation
9      D            11 Demonstration Participation
10     E            19 Demonstration Participation
11     A            -2               Live Modeling
12     B             0               Live Modeling
13     C             6               Live Modeling
14     D             2               Live Modeling
15     E            11               Live Modeling
> model.aov <- aov(Score.changes ~ Therapy + Error(Block), data=table)
> summary(model.aov)

Error: Block
          Df Sum Sq Mean Sq F value Pr(>F)
Residuals  4  438.0   109.5               

Error: Within
          Df Sum Sq Mean Sq F value   Pr(>F)   
Therapy    2 260.93  130.47  15.259 0.001861 **
Residuals  8  68.40    8.55                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
>

I don't understand why R doesn't output a value for F and Pr for the
Error (Block) dimension, as my textbook shows 12.807 and 0.0015
respectively. All the other numbers match. Can these two values be
recovered? Also, my text shows a total line which R omits. Is this
because it's not particularly useful?
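
A minimal sketch of one way the block F and p-value might be recovered, assuming it is acceptable to treat Block as an ordinary factor instead of an Error() stratum. The names lm_scores, scores, fit and anova_table are introduced here only for illustration.

```r
# Case study data from the message; 'lm_scores' avoids masking lm().
cd <- c(8, 11, 9, 16, 24)
dp <- c(2, 1, 12, 11, 19)
lm_scores <- c(-2, 0, 6, 2, 11)

scores <- data.frame(
  Block = factor(rep(LETTERS[1:5], times = 3)),
  Score.changes = c(cd, dp, lm_scores),
  Therapy = factor(rep(c("Contact Desensitization",
                         "Demonstration Participation",
                         "Live Modeling"), each = 5))
)

# With Block as a plain factor, both Block and Therapy are tested
# against the same residual mean square.
fit <- aov(Score.changes ~ Block + Therapy, data = scores)
anova_table <- summary(fit)[[1]]
print(anova_table)

# A "total" line is just the column sums of Df and Sum Sq.
total <- colSums(anova_table[, c("Df", "Sum Sq")])
print(total)
```

With Error(Block), the block stratum has no term left to test, which is probably why no F value is printed there.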

Thanks for your suggestions and advice. Also, if I'm executing this type
of problem in R inefficiently, I'd appreciate suggestions.

-Kevin

[1] An Introduction to Mathematical Statistics and Its Applications,
Larsen and Marx, fourth edition.

Kevin Zembower
Internet Services Group manager
Center for Communication Programs
Bloomberg School of Public Health
Johns Hopkins University
111 Market Place, Suite 310
Baltimore, Maryland  21202
410-659-6139 



[R] Using predict()?

2007-12-11 Thread Zembower, Kevin
I'm trying to solve a homework problem using R. The problem gives a list
of cricket chirps per second and corresponding temperature, and asks to
give the equation for the linear model and then predict the temperature
to produce 18 chirps per second. So far, I have:

> # Homework 11.2.1 and 11.3.3
> chirps <- scan()
1: 20
2: 16
3: 19.8
4: 18.4
5: 17.1
6: 15.5
7: 14.7
8: 17.1
9: 15.4
10: 16.2
11: 15
12: 17.2
13: 16
14: 17
15: 14.4
16: 
Read 15 items
> temp <- scan()
1: 88.6
2: 71.6
3: 93.3
4: 84.3
5: 80.6
6: 75.2
7: 69.7
8: 82
9: 69.4
10: 83.3
11: 79.6
12: 82.5
13: 80.6
14: 83.5
15: 76.3
16: 
Read 15 items
> chirps
 [1] 20.0 16.0 19.8 18.4 17.1 15.5 14.7 17.1 15.4 16.2 15.0 17.2 16.0 17.0 14.4
> temp
 [1] 88.6 71.6 93.3 84.3 80.6 75.2 69.7 82.0 69.4 83.3 79.6 82.5 80.6 83.5 76.3
> chirps.res <- lm(chirps ~ temp)
> summary(chirps.res)

Call:
lm(formula = chirps ~ temp)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.56146 -0.58088  0.02972  0.58807  1.53047 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -0.31433    3.10963  -0.101 0.921028    
temp         0.21201    0.03873   5.474 0.000107 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Residual standard error: 0.9715 on 13 degrees of freedom
Multiple R-Squared: 0.6975, Adjusted R-squared: 0.6742 
F-statistic: 29.97 on 1 and 13 DF,  p-value: 0.0001067
> # From the linear model summary output above, the equation for the least squares line is:
> #y = -0.3143 + 0.2120*x or chirps = -0.3143 + 0.2120*temp
> 

I can then determine the answer to the prediction, using algebra and R:
> pred_temp <- (18+0.3143)/0.2120
> pred_temp
[1] 86.3882

However, I'd like to try to use the predict() function. Since 'chirps'
and 'temp' are just vectors of numbers, and not dataframes, these
failed:
predict(chirps.res, newdata=data.frame(chirp=18))
predict(chirps.res, newdata="chirp=18")
predict(chirps.res, newdata=18)

I then tried to turn my two vectors into a dataframe. I would have bet
money that this would have worked, but it didn't:
> df <- data.frame(chirps, temp)
>  chirps.res <- lm(chirps ~ temp, data=df)
> predict(chirps.res, newdata=data.frame(chirps=18))

Can anyone tell me how to use predict() in this circumstance?
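
A hedged sketch of how predict() relates to this problem: it maps the predictor (temp) to the response (chirps), so newdata needs a column named temp, and going from 18 chirps back to a temperature means inverting the fitted line. The names fit, b, temp_for_18 and check are illustrative only.

```r
# Data as entered with scan() in the message.
chirps <- c(20, 16, 19.8, 18.4, 17.1, 15.5, 14.7, 17.1, 15.4, 16.2,
            15, 17.2, 16, 17, 14.4)
temp <- c(88.6, 71.6, 93.3, 84.3, 80.6, 75.2, 69.7, 82, 69.4, 83.3,
          79.6, 82.5, 80.6, 83.5, 76.3)
fit <- lm(chirps ~ temp)

# Forward prediction: the column in newdata must be called 'temp'.
chirps_at_80 <- predict(fit, newdata = data.frame(temp = 80))

# Inverse direction: solve chirps = a + b * temp for temp.
b <- coef(fit)
temp_for_18 <- (18 - b[["(Intercept)"]]) / b[["temp"]]

# Feeding that temperature back into predict() should return 18 chirps.
check <- predict(fit, newdata = data.frame(temp = temp_for_18))
print(c(chirps_at_80 = unname(chirps_at_80),
        temp_for_18 = temp_for_18,
        check = unname(check)))
```

Fitting lm(temp ~ chirps) and calling predict() with a chirps column would also run, but that is a different regression line and would generally give a different temperature.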

Thanks for your help and advice.

-Kevin

Kevin Zembower
Internet Services Group manager
Center for Communication Programs
Bloomberg School of Public Health
Johns Hopkins University
111 Market Place, Suite 310
Baltimore, Maryland  21202
410-659-6139 



Re: [R] R brakes when submitting a query to MySQL

2007-12-18 Thread Zembower, Kevin
Is it your use of 'con' rather than 'con2' in dbSendQuery? -Kevin

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of Marc Moragues
Sent: Tuesday, December 18, 2007 1:14 PM
To: r-help@r-project.org
Subject: [R] R brakes when submitting a query to MySQL

Hello,

I would like to retrieve data stored in a MySQL database, so I installed
the RMySQL package.
I can successfully connect to my database using the following code

> dvr<-dbDriver("MySQL")
> con2<-dbConnect(dvr,group="exbardiv")
> mysqlDescribeConnection(con2)

 
  User: mmorag 
  Host: localhost 
  Dbname: exbardiv 
  Connection type: localhost via TCP/IP 
  No resultSet available

I can even see the tables in the database

> dbListTables(con2)
[1] "agoueb"    "high_ld"   "rescue"    "sjlc_info" "sjlc_ld"   "temp"     
[7] "temp_snp1" "temp_snp2"

However, when I try to query the database, R breaks.

res<-dbSendQuery(con,'select * from sjlc_ld')

Can anyone help me tune up the connection between R and MySQL?

Thank you,
Marc.
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
_ _

SCRI, Invergowrie, Dundee, DD2 5DA.  
The Scottish Crop Research Institute is a charitable company limited by
guarantee. 
Registered in Scotland No: SC 29367.
Recognised by the Inland Revenue as a Scottish Charity No: SC 006662.




[R] Oddities with RSiteSearch?

2008-01-11 Thread Zembower, Kevin
[If I knew who to report this to privately, I would. Sorry to embarrass
anyone who's just trying to contribute to the R-project.]

There seem to be some oddities with the RSiteSearch web page. When I
enter 'RSiteSearch("console")' I'm taken to
http://search.r-project.org/cgi-bin/namazu.cgi?query=console&max=20&result=normal&sort=score&idxname=Rhelp02a&idxname=functions&idxname=docs.
On this page, the "How to search" link goes to
http://finzi.psych.upenn.edu/namazu.html#query, which gives me a '403
Forbidden' error.

At the bottom of the page is "This search system is powered by Namazu
v". The text "Namazu" is a link to http://www.namazu.org/, which, when
clicked on, starts a download rather than displaying a page. Also, the
email address at the bottom, [EMAIL PROTECTED], is suspicious. I realize
that these last two errors might be caused by the system RSiteSearch
uses to form indices, but we may want to suppress them until that
organization gets them working correctly.

Thanks for your efforts in setting up the RSiteSearch system; I use it
all the time.

-Kevin

Kevin Zembower
Internet Services Group manager
Center for Communication Programs
Bloomberg School of Public Health
Johns Hopkins University
111 Market Place, Suite 310
Baltimore, Maryland  21202
410-659-6139 



[R] Any tools for working with US 2000 census data?

2008-01-17 Thread Zembower, Kevin
I've been given the job of extracting some data from the United States
2000 census (files at
http://www2.census.gov/census_2000/datasets/Summary_File_2/Maryland/all_Maryland.zip,
52M). I'm only interested in Census Block Groups (CBGs)
located within Baltimore City, Maryland. Additionally, I just have to
extract certain data fields. I think I'll be using Summary File 2. This
is my first experience working with US census data.

I wasn't successful finding anything using RSiteSearch, although there
were some packages with data extracted from the US 2000 census.

Are there any pre-constructed tools in R for working with this data?
Does the US 2000 census data itself come packaged in R? If there are no
R tools, I'd welcome any suggestions on working with this data from
anyone experienced with it.

Thanks for your advice and suggestions for me.

-Kevin 

Kevin Zembower
Internet Services Group manager
Center for Communication Programs
Bloomberg School of Public Health
Johns Hopkins University
111 Market Place, Suite 310
Baltimore, Maryland  21202
410-659-6139 



Re: [R] R on an eeePC

2008-01-28 Thread Zembower, Kevin
Doing 'RSiteSearch("eee")' yields some hits. I knew that the ASUS eeePC
had come up on r-help.

-kevin

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of Dr. Walter H. Schreiber
Sent: Monday, January 28, 2008 9:32 AM
To: r-help@r-project.org
Subject: [R] R on an eeePC

Dear list,

I wonder if somebody has succeeded in installing R on an eeePC (Xandros 
desktop). Searching via Rseek (term eeePC)  and in eeePC forums (term 
Cran)  left me without proper hits.

Best wishes,
Walter.



Re: [R] Newbie: Using R to analyse Apache logs

2008-01-31 Thread Zembower, Kevin
Raj,

I've been experimenting with R to compute simple statistics from my web
logs somewhat similar to what you're describing. For instance, I'm
working on trying to classify a unique IP or domain name requestor as
'human' or 'robot' based on the number of seconds between requests for
pages. I've found that the easiest method of work, given my (elementary)
knowledge of R and my (professional) knowledge of perl, is to run my
logs through a perl program to pre-process the data, before submitting
it to R. The output of running my Apache web log through my perl program
looks like this tab-delimited output:
[EMAIL PROTECTED]:~/weblogstats$ ./weblogtimediff.pl access_log.20071130.sorted | head
Date         Time      Source                           TimeDiff  Type
30/Nov/2007  00:00:47  54.100.68.58.sikkanet.com        15        unknown
30/Nov/2007  00:00:48  54.100.68.58.sikkanet.com        1         unknown
30/Nov/2007  00:01:19  54.100.68.58.sikkanet.com        31        unknown
30/Nov/2007  00:01:25  54.100.68.58.sikkanet.com        6         unknown
30/Nov/2007  00:01:29  ip-61-14-181-116.asianetcom.net  15        unknown
30/Nov/2007  00:01:40  54.100.68.58.sikkanet.com        15        unknown
30/Nov/2007  00:01:41  54.100.68.58.sikkanet.com        1         unknown
30/Nov/2007  00:01:44  llf520049.crawl.yahoo.net        14        robot
30/Nov/2007  00:01:46  ip-61-14-181-116.asianetcom.net  17        unknown
[EMAIL PROTECTED]:~/weblogstats$

In this, I also make a preliminary classification into 'robot' (because
it identified itself as such in the browser field), 'human' (because it
submitted a text string to my internal search engine), or 'unknown'.

Unfortunately, this approach doesn't seem to be working. The
distributions of both the 'humans' and 'robots' seemed to be Poisson by
inspection. I therefore created box plots of the log(mean(time
intervals)), but the 'humans' versus the 'robots' were indistinguishable
by inspection. As this is not exactly what I'm paid to do, I just play
with this in my spare time, so I haven't tried anything else yet.
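
For readers who want to try the same comparison, here is a rough sketch of the R side, assuming the pre-processed log is tab-delimited with the five columns shown above. The sample rows are copied from that output; the object names hits and per_source are made up for this example, and the actual Perl program is not reproduced.

```r
# Sample rows from the pre-processed log, rebuilt as tab-delimited text.
log_text <- paste(
  "Date\tTime\tSource\tTimeDiff\tType",
  "30/Nov/2007\t00:00:47\t54.100.68.58.sikkanet.com\t15\tunknown",
  "30/Nov/2007\t00:00:48\t54.100.68.58.sikkanet.com\t1\tunknown",
  "30/Nov/2007\t00:01:19\t54.100.68.58.sikkanet.com\t31\tunknown",
  "30/Nov/2007\t00:01:25\t54.100.68.58.sikkanet.com\t6\tunknown",
  "30/Nov/2007\t00:01:29\tip-61-14-181-116.asianetcom.net\t15\tunknown",
  "30/Nov/2007\t00:01:40\t54.100.68.58.sikkanet.com\t15\tunknown",
  "30/Nov/2007\t00:01:41\t54.100.68.58.sikkanet.com\t1\tunknown",
  "30/Nov/2007\t00:01:44\tllf520049.crawl.yahoo.net\t14\trobot",
  "30/Nov/2007\t00:01:46\tip-61-14-181-116.asianetcom.net\t17\tunknown",
  sep = "\n")
hits <- read.delim(text = log_text, stringsAsFactors = FALSE)

# Mean time difference per requester, then its log.
per_source <- aggregate(TimeDiff ~ Source + Type, data = hits, FUN = mean)
per_source$LogMean <- log(per_source$TimeDiff)
print(per_source)

# One box per preliminary class present in the data.
boxplot(LogMean ~ Type, data = per_source,
        ylab = "log(mean seconds between requests)")
```

With a real log, the text argument would be replaced by the path to the pre-processed file.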

If it's of general interest to this group, I'd be happy to publish my
program for this. Otherwise, Raj, if you're interested, I'd be happy to
send it to you privately.

One oddity I noted is that Apache logs are not always in chronological
order. The date/time stamp is when the request occurred, but it's
written in the log when the request is completed. Thus, for a long
download, several, shorter subsequent downloads may have been requested
and completed before the earlier, long one. I was confused by negative
time differences from my program until I discovered this. Subsequently,
I sort my Apache log in chronological order before passing it through my
program.
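
The effect of ordering on the time differences can be illustrated in R itself. This is only a sketch: it assumes an English locale so that %b matches "Nov", it uses three timestamps from the sample output placed deliberately out of order, and it is not the sorting step actually used on the log file.

```r
# Three request times from the sample output, deliberately out of order.
stamps <- c("30/Nov/2007 00:01:19",
            "30/Nov/2007 00:00:47",
            "30/Nov/2007 00:00:48")

# Assumption: English month abbreviations are recognised by %b.
when <- as.POSIXct(stamps, format = "%d/%b/%Y %H:%M:%S", tz = "UTC")

# Differences taken in file order include a negative gap ...
raw_gaps <- as.numeric(diff(when), units = "secs")

# ... which goes away once the requests are sorted by request time.
sorted_gaps <- as.numeric(diff(sort(when)), units = "secs")

print(raw_gaps)
print(sorted_gaps)
```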

Hope this helps. Let me know if you have any other questions.

-Kevin

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of Raj Mathur
Sent: Thursday, January 31, 2008 8:31 AM
To: r-help
Subject: [R] Newbie: Using R to analyse Apache logs

Hi,

I have a requirement to scan Apache logs and discover ``exceptions''.
Exceptions can be of two types:

1. A single IP generating a large amount of traffic within a given time
frame (for definable values of ``large'' and ``time frame'').

2. A single IP hitting a wide set of URLs on the server (indicates a
crawler), again for definable values of ``wide''.

I'm a complete newbie to R (and to statistics), so the questions are:

- Can R help me generate graphs which would help me identify these
activities?

- Has someone already done something like this?  If so, where could I
find it?

- If not, can someone help me with the stats (and R) part to help me
achieve these objectives?  Any software that gets created as a result
would be released under a FOSS license.

Data massaging, tuning, etc. are not an issue.  We'd be dealing with a
few hundred thousand or a million records a day.

Regards,

-- Raju
-- 
Raj Mathur[EMAIL PROTECTED]  http://kandalaya.org/
 Freedom in Technology & Software || February 2008 || http://freed.in/
   GPG: 78D4 FC67 367F 40E2 0DD5  0FEF C968 D0EF CC68 D17F
PsyTrance & Chill: http://schizoid.in/   ||   It is the mind that moves



Re: [R] Sampling

2008-02-05 Thread Zembower, Kevin
Would this work:
g<-sample(rep(LETTERS[1:2],12), 24, replace=F)

HTH
-Kevin
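
A quick check of the suggestion above, as a sketch: shuffling a vector that already holds 12 of each letter keeps the counts fixed and only randomizes the order. The seed is an assumption added purely to make the run repeatable.

```r
set.seed(2008)  # arbitrary seed, only for repeatability

# Build exactly 12 "A"s and 12 "B"s, then permute them.
g <- sample(rep(LETTERS[1:2], times = 12), size = 24, replace = FALSE)

counts <- table(g)
print(g)
print(counts)
```

Sampling without replacement from the full vector is a permutation, so the 12/12 split holds for any seed.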

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of Judith Flores
Sent: Tuesday, February 05, 2008 1:52 PM
To: RHelp
Subject: [R] Sampling

Hi there,

   I want to generate different samples using the
following code:


g<-sample(LETTERS[1:2], 24, replace=T)

   How can I specify that I need 12 "A"s and 12 "B"s?

Thank you,

Judith


 




[R] Dice simulation: Getting rep to re-evaluate sample()?

2007-10-08 Thread Zembower, Kevin
I'm trying to get R to simulate the sum of the values on 10 fair dice
(yes, it's related to a homework problem, but is not the problem
itself). I tried to do this:
> rep(sum(sample(1:6,100,replace=T)), times=10)
 [1] 341 341 341 341 341 341 341 341 341 341
 
and noticed that sum(sample()) seems to be only evaluated once. How can
I overcome this, so that I get a vector of values that correspond to
independent throws of 10 dice each time?
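
The behaviour described above can be reproduced in a few lines. In this sketch, rep() receives a single number that has already been computed, while replicate() re-evaluates its expression on each repetition. The seed and the object names once and fresh are assumptions for illustration.

```r
set.seed(42)  # arbitrary seed, only for repeatability

# The argument of rep() is evaluated once, then the result is repeated.
once <- rep(sum(sample(1:6, 10, replace = TRUE)), times = 5)

# replicate() evaluates the expression anew for every repetition.
fresh <- replicate(5, sum(sample(1:6, 10, replace = TRUE)))

print(once)
print(fresh)
```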

Thanks for your advice and suggestions.

-Kevin

Kevin Zembower
Internet Services Group manager
Center for Communication Programs
Bloomberg School of Public Health
Johns Hopkins University
111 Market Place, Suite 310
Baltimore, Maryland  21202
410-659-6139 



Re: [R] Dice simulation: Getting rep to re-evaluate sample()?

2007-10-08 Thread Zembower, Kevin
Thanks so much, Chuck and Mark. Here's my script to simulate 10,000
rolls of 100 fair dice to demonstrate their conformity to a normal
curve:
> x<-replicate(10000, sum(sample(1:6,100,replace=T)))
> sdx<-sd(x)
> sdx
[1] 17.13966
> meanx<-mean(x)
> meanx
[1] 350.0451
> hist(x, freq=FALSE)
> curve(dnorm(x, mean=meanx, sd=sdx), add=TRUE)
>

Thanks, again, for your quick and accurate help.
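
As a cross-check on the simulated mean and standard deviation, the theoretical values for the sum of 100 fair dice can be computed directly. This is a sketch; the seed and the variable names are assumptions, and the simulation simply repeats the approach in the message with 10,000 replicates.

```r
n_dice <- 100
faces <- 1:6

# Mean and (population) variance of a single fair die.
die_mean <- mean(faces)
die_var <- mean((faces - die_mean)^2)

# For a sum of independent dice, means and variances add.
theory_mean <- n_dice * die_mean
theory_sd <- sqrt(n_dice * die_var)
print(c(mean = theory_mean, sd = theory_sd))

# Simulated counterpart, 10,000 replicates as in the message.
set.seed(1)  # arbitrary seed, only for repeatability
x <- replicate(10000, sum(sample(faces, n_dice, replace = TRUE)))
print(c(mean = mean(x), sd = sd(x)))
```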

-Kevin

-Original Message-
From: Charles C. Berry [mailto:[EMAIL PROTECTED] 
Sent: Monday, October 08, 2007 1:56 PM
To: Zembower, Kevin
Cc: [EMAIL PROTECTED]
Subject: Re: [R] Dice simulation: Getting rep to re-evaluate sample()?



See

?replicate

which I think is what you are after.

Chuck



Charles C. Berry                            (858) 534-2098
                                            Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]                  UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901



[R] Calculating confidence in an estimate including number of trials?

2007-10-16 Thread Zembower, Kevin
[Yes, this is related to a homework problem, but is not the problem
itself.]

In my mathematical statistics class, we've just learned about properties
of estimators, and I can now solve manually problems like this:

A sample of size n = 16 is drawn from a normal distribution where sigma
= 10 but mu is unknown. If mu = 20, what is the probability that the
estimator mu hat = Y bar will lie between 19.0 and 21.0?[1]

I solved this by converting to Z scores and using a table of cumulative
values under the normal curve and got an answer of .3108 (someone please
tell me if I'm wrong).

Now I'd like to know how to use R to solve this type of problem. In all
my other problems using normal curves, I used dnorm or pnorm, but
neither of these includes anything regarding the number of trials. I can
put the math into R after I've worked out the equation, but I wondered
if there was an R function that computed this directly, in the same
fashion that pnorm can compute probabilities using parameters of mean
and sd.

Using help.search for 'estimator' or 'sample mean' didn't turn up
anything that I recognized. Any hints on where to go looking for this?

Thanks for your help and advice.

-Kevin

Kevin Zembower
Internet Services Group manager
Center for Communication Programs
Bloomberg School of Public Health
Johns Hopkins University
111 Market Place, Suite 310
Baltimore, Maryland  21202
410-659-6139 

[1] Introduction to Mathematical Statistics and its applications, Larsen
and Marx, fourth ed., question 5.4.4.



Re: [R] Calculating confidence in an estimate including number of trials?

2007-10-16 Thread Zembower, Kevin
Daniel, thanks for your suggestion. So, it's just done like this:
> pnorm(21, mean=20, sd=10/4) - pnorm(19, mean=20, sd=10/4)
[1] 0.3108435
> # OR
> pnorm(21, mean=20, sd=10/sqrt(16)) - pnorm(19, mean=20, sd=10/sqrt(16))
[1] 0.3108435
>

Thanks, again.

-Kevin

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of Daniel Lakeland
Sent: Tuesday, October 16, 2007 4:35 PM
To: r-help@r-project.org
Subject: Re: [R] Calculating confidence in an estimate including number of trials?

On Tue, Oct 16, 2007 at 04:30:48PM -0400, Zembower, Kevin wrote:

> Now I'd like to know how to use R to solve this type of problem. In all
> my other problems using normal curves, I used dnorm or pnorm, but
> neither of these includes anything regarding the number of trials.

pnorm can be used like your table of area under the normal curve. To
account for size of sample you have to scale the variance
appropriately according to the theory you have learned in your course.
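
As a sketch of what scaling the variance means here: the mean of n independent normal draws is itself normal, with mean mu and standard deviation sigma/sqrt(n), so pnorm can be applied to the sample mean directly. The helper name p_mean_between, the seed and the number of simulated samples are illustrative assumptions, not something from this thread; the inputs are the ones from the textbook problem.

```r
# Probability that the mean of n draws from N(mu, sigma^2) lies in (lower, upper).
# p_mean_between is a hypothetical helper name, not a function from base R.
p_mean_between <- function(lower, upper, mu, sigma, n) {
  se <- sigma / sqrt(n)            # standard deviation of the sample mean
  pnorm(upper, mean = mu, sd = se) - pnorm(lower, mean = mu, sd = se)
}

p_exact <- p_mean_between(19, 21, mu = 20, sigma = 10, n = 16)
print(p_exact)                     # about 0.3108, matching the table lookup

# Rough Monte Carlo check: draw many samples of size 16 and count how
# often the sample mean lands inside the interval.
set.seed(1)                        # arbitrary seed, for reproducibility only
ybar <- replicate(20000, mean(rnorm(16, mean = 20, sd = 10)))
p_sim <- mean(ybar > 19 & ybar < 21)
print(p_sim)                       # should be close to p_exact
```

The simulation is only a sanity check; the pnorm line is the actual calculation.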


-- 
Daniel Lakeland
[EMAIL PROTECTED]
http://www.street-artists.org/~dlakelan



[R] Homework help: Is this how CI using t dist are constructed?

2007-10-30 Thread Zembower, Kevin
I'm trying to replicate some of the examples from my textbook in R (my
text uses Minitab). In this problem, I'm trying to construct a 95%
confidence interval for these distance measurements [1]:

> # Case Study 7.4.1, p. 483
> x <- scan()
1:  62 52 68 23 34 45 27 42 83 56 40
12: 
Read 11 items
> alpha<-.95
> mean(x) + qt(c((1-alpha)/2, 1-((1-alpha)/2)), df=length(x)-1) * sd(x) / sqrt(length(x))
[1] 36.21420 60.51307
>

Are confidence intervals with the t distribution constructed using this
type of equation, or am I overlooking a more concise, 'canned' approach
that's already been programmed? Any suggestions on simplifying this?

Thanks for all your advice and help.

-Kevin

[1] An Introduction to Mathematical Statistics and its Applications,
fourth ed., Larsen and Marx.

Kevin Zembower
Internet Services Group manager
Center for Communication Programs
Bloomberg School of Public Health
Johns Hopkins University
111 Market Place, Suite 310
Baltimore, Maryland  21202
410-659-6139 



Re: [R] Homework help: Is this how CI using t dist are constructed?

2007-10-30 Thread Zembower, Kevin
Yes, exactly. In fact, I had already discovered this, too. I don't know why I 
didn't think of it before asking this question.

Thanks for your patience with me.

-Kevin

-Original Message-
From: Peter Dalgaard [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, October 30, 2007 4:54 PM
To: Zembower, Kevin
Cc: r-help@r-project.org
Subject: Re: [R] Homework help: Is this how CI using t dist are constructed?

Zembower, Kevin wrote:
> I'm trying to replicate some of the examples from my textbook in R (my
> text uses Minitab). In this problem, I'm trying to construct a 95%
> confidence interval for these distance measurements [1]:

>   
You mean like t.test(x)?
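
For reference, a minimal sketch of how the interval can be pulled out of the t.test() result and compared with the hand calculation from the original post. The variable names conf_level, tail_prob, manual_ci and canned_ci are illustrative.

```r
# Distance measurements from the original post
x <- c(62, 52, 68, 23, 34, 45, 27, 42, 83, 56, 40)
conf_level <- 0.95

# Hand calculation: mean +/- t quantile * standard error of the mean
tail_prob <- (1 - conf_level) / 2
manual_ci <- mean(x) + qt(c(tail_prob, 1 - tail_prob), df = length(x) - 1) *
  sd(x) / sqrt(length(x))

# Canned version: t.test() stores the interval in its conf.int component
canned_ci <- as.numeric(t.test(x, conf.level = conf_level)$conf.int)

print(manual_ci)                   # 36.21420 60.51307, as in the original post
print(canned_ci)                   # same two limits
```

Passing conf.level to t.test() is how a level other than the default 95% would be requested.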

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



[R] Homework help: Is this how CIs of normal distributions are computed?

2007-10-31 Thread Zembower, Kevin
I'm looking for a function in R similar to t.test() which was generously
pointed out to me yesterday, but which can be used for normally
distributed data.

To recap yesterday:
> x <- scan()
1: 62 52 68 23 34 45 27 42 83 56 40
12: 
Read 11 items
> alpha<- .05
> t.test(x)

One Sample t-test

data:  x 
t = 8.8696, df = 10, p-value = 4.717e-06
alternative hypothesis: true mean is not equal to 0 
95 percent confidence interval:
 36.21420 60.51307 
sample estimates:
mean of x 
 48.36364 

What if I now mock-up my data for 100 trials:
> x100<-sample(x, 100, replace=TRUE)

I think that I should be able to use a normal distribution, because of
the n>30 rule-of-thumb.

I can compute the 95% CI using:
> mean(x100) - qnorm(alpha/2)*sd(x100)/sqrt(length(x100))
[1] 51.91222
> mean(x100) + qnorm(alpha/2)*sd(x100)/sqrt(length(x100))
[1] 44.80778
> t.test(x100)

One Sample t-test

data:  x100 
t = 26.683, df = 99, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 0 
95 percent confidence interval:
 44.76383 51.95617 
sample estimates:
mean of x 
48.36 

>

The critical values I compute manually are close to the t.test values,
which is what I expect. As the number of samples increases, the t value
approaches the normal distribution value.
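
A small sketch of that convergence, comparing the two-sided 95% critical value of the t distribution with the normal one as the degrees of freedom grow. The particular df values are arbitrary choices for illustration.

```r
# Two-sided 95% critical values: normal versus t with increasing df
z_crit <- qnorm(0.975)
df <- c(10, 30, 99, 1000)
t_crit <- qt(0.975, df = df)

# The last column shows how much wider the t-based multiplier is
print(data.frame(df = df, t_crit = t_crit, excess_over_z = t_crit - z_crit))
```

The t critical value stays slightly above the normal one for every finite df, so the t interval is always a little wider. Note also that qnorm(alpha/2) is negative for alpha = .05, which is why the line with the minus sign in the session above prints the upper limit (51.91) and the line with the plus sign prints the lower one (44.81).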

I thought I looked at all the other .test functions in the stats
package, and didn't find one that computed results like the t.test for
normal distributions. Is something similar to my 'manual' computations
the way it's done in R, or have I overlooked something again?

Thanks.

-Kevin

Kevin Zembower
Internet Services Group manager
Center for Communication Programs
Bloomberg School of Public Health
Johns Hopkins University
111 Market Place, Suite 310
Baltimore, Maryland  21202
410-659-6139 



Re: [R] Homework help: Is this how CIs of normal distributions are computed?

2007-10-31 Thread Zembower, Kevin
Daniel, thanks, I should have remembered this, too; I've seen it and
worked with it before. Thanks.

-Kevin

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of Daniel Lakeland
Sent: Wednesday, October 31, 2007 4:04 PM
To: r-help@r-project.org
Subject: Re: [R] Homework help: Is this how CIs of normal distributions are computed?

On Wed, Oct 31, 2007 at 03:56:37PM -0400, Zembower, Kevin wrote:
> I'm looking for a function in R similar to t.test() which was generously
> pointed out to me yesterday, but which can be used for normally
> distributed data.
...
> > x100<-sample(x, 100, replace=TRUE)
> 
> I think that I should be able to use a normal distribution, because of
> the n>30 rule-of-thumb.
> 
> I can compute the 95% CI using:
> > mean(x100) - qnorm(alpha/2)*sd(x100)/sqrt(length(x100))

You can compute quantiles of the particular normal distribution itself
rather than transforming from the standardized normal by hand.

qnorm(c(.025,.975),mean=mean(x100),sd=sd(x100)/sqrt(length(x100)))
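
A runnable sketch of this one-liner next to the t.test() interval, for comparison. The seed is an arbitrary assumption (the resample in the thread was not seeded), so the limits will not match the numbers posted earlier.

```r
x <- c(62, 52, 68, 23, 34, 45, 27, 42, 83, 56, 40)
set.seed(42)                       # arbitrary; makes this resample reproducible
x100 <- sample(x, 100, replace = TRUE)

# Normal-based interval: quantiles of N(mean, se^2) taken directly
se <- sd(x100) / sqrt(length(x100))
z_ci <- qnorm(c(0.025, 0.975), mean = mean(x100), sd = se)

# t-based interval from the canned test, for comparison
t_ci <- as.numeric(t.test(x100)$conf.int)

print(rbind(normal = z_ci, t = t_ci))
```

Passing both probabilities as a vector returns the lower and upper limits in order, which avoids the sign confusion of adding or subtracting a negative quantile by hand.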

-- 
Daniel Lakeland
[EMAIL PROTECTED]
http://www.street-artists.org/~dlakelan



Re: [R] Homework help: Is this how CIs of normal distributions are computed?

2007-10-31 Thread Zembower, Kevin
Dan, I didn't realize that the t values were more accurate than the
normal approximation for n > about 30. I may have learned (incorrectly)
that the normal distribution should be used if n > 30, but now that I'm
thinking about it, this may have just been computationally economical
before computers.

Thanks for this thought.

-Kevin 

Dan Nordlund wrote:

You could probably use the Normal distribution as an approximation under
these circumstances, but why would you when you have a more accurate CI
using t.test?

Dan


Daniel J. Nordlund
Research and Data Analysis
Washington State Department of Social and Health Services
Olympia, WA  98504-5204
 
 



Re: [R] problem with log axis

2007-11-01 Thread Zembower, Kevin
Well, here are two attempts that I would have bet on to work, but don't:
#Doesn't seem to draw any line at all:
abline(a=as.numeric(r1$coefficients["(Intercept)"]),
b=as.numeric(r1$coefficients["log(x)"]))
#Line doesn't match points:
abline(r1, untf=TRUE)

So much for furthering knowledge and this discussion...
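
One likely explanation, offered as a sketch rather than a definitive answer: with log="x", abline() draws a straight line in the plot's transformed coordinates, which are log10(x), while r1 was fitted on the natural log. Rescaling the slope by log(10), refitting on log10(x), or drawing the fitted values with lines() should each put the line through the points. The seed and the names r10 and xs are illustrative additions.

```r
set.seed(1)                        # arbitrary seed so the example is reproducible
x <- c(0.5, 1, 3, 6, 10, 20, 40)
y <- 10 - log(x) + rnorm(7, 0, 0.05)
r1 <- lm(y ~ log(x))               # natural-log fit, as in the original post

plot(x, y, log = "x")

# Option 1: keep r1 but convert the slope from per-ln(x) to per-log10(x)
abline(a = coef(r1)[[1]], b = coef(r1)[[2]] * log(10))

# Option 2: refit on log10(x); abline() can then use the fit directly
r10 <- lm(y ~ log10(x))
abline(r10, lty = 2)

# Option 3: skip abline() and draw the fitted values on a grid of x values
xs <- exp(seq(log(min(x)), log(max(x)), length.out = 50))
lines(xs, predict(r1, newdata = data.frame(x = xs)), lty = 3)
```

All three lines should coincide. untf=TRUE does not help here because it draws y = a + b*x in the original coordinates, which is not the fitted model.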

-Kevin

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of R Heberto Ghezzo, Dr
Sent: Thursday, November 01, 2007 1:40 PM
To: r-help@r-project.org
Subject: [R] problem with log axis

Hello,
if I do:
  x <- c(0.5,1,3,6,10,20,40)
  y <- 10-log(x)+rnorm(7,0,0.05)
  r1 <- lm(y ~ log(x))
  plot(log(x),y)
  abline(r1)
#
I get a nice plot with the regression line almost over the points.
but:
  plot(x,y,log="x")
  abline(r1)
gives me exactly the same plot for the points, but the regression line
is completely off!
I would like the plot with the real values of X on the axis, not
log(X).
Can somebody tell me why the "abline" is not correct in the second case?
Thanks
H.Ghezzo
McGill University
Montreal - Canada



[R] Homework help: t test hypothesis testing with summarized data?

2007-11-07 Thread Zembower, Kevin
Is this how a t hypothesis test is done when I don't have the actual
data, but just the summarized statistics: 
> #Homework 9.2.6 [1]
> n<-31
> xbar<-3.10
> s_x<-1.469
> m<-57
> ybar<-2.43
> s_y<-1.35
> s_pooled<- (((n-1)*s_x^2) + ((m-1)*s_y^2)) / (n + m - 2)
> s_pooled
[1] 1.939521
> t_obs <- (xbar -  ybar) / (s_pooled * (sqrt(1/n + 1/m)))
> t_obs
[1] 1.547951
> qt(c(.025, .975), n+m-2)
[1] -1.987934  1.987934
> # Therefore, fail to reject H0 at the 0.05 level of significance
>

Or am I again overlooking a canned procedure or an easier calculation
using the t distribution?

Thank you for your continued advice and help.

-Kevin

[1] An Introduction to Mathematical Statistics and its Applications,
fourth ed., Larsen and Marx.

Kevin Zembower
Internet Services Group manager
Center for Communication Programs
Bloomberg School of Public Health
Johns Hopkins University
111 Market Place, Suite 310
Baltimore, Maryland  21202
410-659-6139 



Re: [R] Homework help: t test hypothesis testing with summarized data?

2007-11-08 Thread Zembower, Kevin
Peter and Moshe, thank you both for your suggestions and hints. I'm proud to 
say that it took me less than an hour to find my mistake:
> s_pooled <- (((n-1)*(s_x^2)) + ((m-1)*(s_y^2))) /  (n+m-2)
> s_pooled
[1] 1.939521
> t_obs <- (xbar - ybar) / (sqrt(s_pooled) * (sqrt(1/n + 1/m)))
> t_obs
[1] 2.15578
> qt(c(.025, .975), n+m-2)
[1] -1.987934  1.987934
> # Therefore, reject H0 at the 0.05 level of significance.
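
The corrected calculation can be wrapped in a small function, sketched below. The name pooled_t_from_summary is hypothetical (as far as I know, the stats package has no summary-statistics version of t.test), and the two-sided p-value is an addition to what was computed above.

```r
# Pooled two-sample t test from summary statistics only.
# pooled_t_from_summary is a hypothetical helper, not part of base R.
pooled_t_from_summary <- function(xbar, s_x, n, ybar, s_y, m) {
  df <- n + m - 2
  var_pooled <- ((n - 1) * s_x^2 + (m - 1) * s_y^2) / df   # pooled variance
  se <- sqrt(var_pooled) * sqrt(1 / n + 1 / m)             # uses the pooled SD
  t_obs <- (xbar - ybar) / se
  p_value <- 2 * pt(-abs(t_obs), df)                       # two-sided p-value
  list(t = t_obs, df = df, p_value = p_value)
}

res <- pooled_t_from_summary(xbar = 3.10, s_x = 1.469, n = 31,
                             ybar = 2.43, s_y = 1.35, m = 57)
print(res$t)                       # about 2.156, as in the session above
print(res$p_value)
```

Naming the intermediate quantity var_pooled makes it harder to repeat the original slip of dividing by the pooled variance instead of the pooled standard deviation.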

Just to be clear about the 'homework' aspect of my questions: my homework is to 
work the problems out 'longhand' with just a calculator and printed tables. (In 
fact, 10 weeks into a 14 week course, we haven't been asked yet to use a 
computer.) I do this before I ask any questions regarding homework on this 
forum. On my own, I'm trying to answer some of the questions and examples in my 
textbook using R. My 'Homework help:' subject may have been misleading. I may 
change it to 'Extra-credit help:' to acknowledge the academic aspect of my 
question but distinguish it from my homework. I used 'Homework help:' because I 
didn't want anyone to suspect from the nature of the questions that I was 
trying to sneak in a homework question without acknowledging it.

Thanks, again, for all your help for this statistics student.

-Kevin

-Original Message-
From: Peter Dalgaard [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, November 07, 2007 6:50 PM
To: Zembower, Kevin
Cc: [EMAIL PROTECTED]
Subject: Re: [R] Homework help: t test hypothesis testing with summarized data?

Zembower, Kevin wrote:
> Is this how a t hypothesis test is done when I don't have the actual
> data, but just the summarized statistics: 
>   
>> #Homework 9.2.6 [1]
>> n<-31
>> xbar<-3.10
>> s_x<-1.469
>> m<-57
>> ybar<-2.43
>> s_y<-1.35
>> s_pooled<- (((n-1)*s_x^2) + ((m-1)*s_y^2)) / (n + m - 2)
>> s_pooled
>> 
> [1] 1.939521
>   
>> t_obs <- (xbar -  ybar) / (s_pooled * (sqrt(1/n + 1/m)))
>> t_obs
>> 
> [1] 1.547951
>   
>> qt(c(.025, .975), n+m-2)
>> 
> [1] -1.987934  1.987934
>   
>> # Therefore, fail to reject H0 at the 0.05 level of significance
>>
>> 
>
> Or am I again overlooking a canned procedure or an easier calculation
> using the t distribution.
>   
I don't know if someone told you last time, but there's an Internet code 
of honor about helping with homework. Don't expect more than hints.

You're on track but there's a mistake.

Here's a way of testing your result:

 > x <- scale(rnorm(31))*1.469+3.10
 > y <- scale(rnorm(57))*1.35+2.43
 > t.test(x,y, var.equal=TRUE)
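
A sketch of why this check works: scale() centers and rescales its input so that the sample mean is exactly 0 and the sample standard deviation exactly 1, so after multiplying and shifting, the simulated data have exactly the summary statistics from the problem. The pooled t statistic depends only on those statistics and the sample sizes, so any seed gives the same result. The seed and the as.numeric() calls (which drop the matrix attributes that scale() returns) are additions for illustration.

```r
set.seed(123)                      # arbitrary; the t statistic does not depend on it
x <- as.numeric(scale(rnorm(31))) * 1.469 + 3.10
y <- as.numeric(scale(rnorm(57))) * 1.35 + 2.43

# The fake data reproduce the summary statistics exactly
print(c(mean(x), sd(x), mean(y), sd(y)))

res <- t.test(x, y, var.equal = TRUE)
print(res$statistic)               # compare with a hand-computed t_obs
print(res$p.value)
```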




> Thank you for your continued advice and help.
>
> -Kevin
>
> [1] An Introduction to Mathematical Statistics and its Applications,
> fourth ed., Larsen and Marx.
>
> Kevin Zembower
> Internet Services Group manager
> Center for Communication Programs
> Bloomberg School of Public Health
> Johns Hopkins University
> 111 Market Place, Suite 310
> Baltimore, Maryland  21202
> 410-659-6139 
>
>   


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907
