Re: latex2html problems with equations (the file)

Dr. Richard E. Hawkins Thu, 26 Sep 2002 11:41:12 -0700

oops; here's the file.

#LyX 1.3 created this file. For more info see http://www.lyx.org/
\lyxformat 221
\textclass article
\language english
\inputencoding auto
\fontscheme default
\graphics default
\paperfontsize default
\papersize Default
\paperpackage a4
\use_geometry 0
\use_amsmath 0
\use_natbib 0
\use_numerical_citations 0
\paperorientation portrait
\secnumdepth 3
\tocdepth 3
\paragraph_separation indent
\defskip medskip
\quotes_language english
\quotes_times 2
\papercolumns 1
\papersides 1
\paperpagestyle default


\layout Title

Statistics for Crisis Management
\layout Section

The Statistical Methodology
\layout Standard

Answering a question in a particular manner can be treated as a Bernoulli
 variable, having a success (or value of 1) when answered that way, and
 a failure (or value of 0) when not answered that way.
 
\begin_inset Formula $ p_{i}$
\end_inset 

is the underlying parameter indicating the proportion of the population
 that will answer in that manner, also called the frequency.
 It is customary to define 
\begin_inset Formula \begin{equation}
 q_{i}=1-p_{i}\end{equation}

\end_inset 

The fraction of respondents answering in this manner is then a sample mean,
 and can be calculated as 
\layout Standard


\begin_inset Formula \[
 \hat{p}_{i}=\frac{x_{i}}{n}\]

\end_inset 

For 
\begin_inset Quotes eld
\end_inset 

large
\begin_inset Quotes erd
\end_inset 

 samples, the Central Limit Theorem guarantees that this variable will be
 asymptotically normally
\begin_inset Marginal
collapsed false

\layout Standard

jack--cite this?
\end_inset 

 distributed, and using the well-known variance of the Bernoulli trial,
 
\begin_inset Formula \begin{equation}
 \hat{p}_{i}\sim N\left(p,\frac{pq}{n}\right)\end{equation}

\end_inset 

where 
\begin_inset Formula $ n$
\end_inset 

is the number of responses.
\layout Standard

Generally, a large sample is taken to be thirty or more.
 In the particular case of frequencies, the additional requirements are
 generally made that 
\begin_inset Formula \begin{equation}
 np\geq 5\end{equation}

\end_inset 

and 
\begin_inset Formula \begin{equation}
 nq\geq 5\end{equation}

\end_inset 

 When either condition fails, the distribution is not sufficiently close to normal.
\begin_inset Marginal
collapsed false

\layout Standard

cite! (Davidson?)
\end_inset 
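
\layout Standard

As a rough worked check with hypothetical numbers (not taken from the survey),
 a sample of 
\begin_inset Formula $ n=30$
\end_inset 

 responses with 
\begin_inset Formula $ p=0.1$
\end_inset 

 meets the thirty-response rule of thumb but gives 
\begin_inset Formula \[
 np=30\times 0.1=3<5,\qquad nq=30\times 0.9=27\geq 5\]

\end_inset 

so the first condition fails and the normal approximation would not be relied
 upon for that question.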

 
\layout Standard

For the present data, the null hypothesis for any given survey question
 will be that the American response and the Guatemalan are identical.
 Under this hypothesis, 
\begin_inset Formula $ \hat{p}_{i,a}$
\end_inset 

and 
\begin_inset Formula $ \hat{p}_{i,g}$
\end_inset 

 are separate observations of the same underlying parameter, 
\begin_inset Formula $ p_{i}$
\end_inset 

.
 That is, if the hypothesis is true, 
\begin_inset Formula $ p_{i}$
\end_inset 

 is the true frequency for both American and Guatemalan firms, while
\begin_inset Formula $ \hat{p}_{i,a}$
\end_inset 

and 
\begin_inset Formula $ \hat{p}_{i,g}$
\end_inset 

are two separate variables drawn from the distribution.
 Linear combinations of independent normal variables are themselves normally
 distributed; for straightforward addition and subtraction the combined variance
 is the sum of the variances, while the combined mean is the sum or difference
 of the means.
\begin_inset Marginal
collapsed false

\layout Standard

jack--cite?
\end_inset 

 In this case, the means are the same under the hypothesis, so their difference
 has mean zero and is distributed 
\begin_inset Formula \begin{equation}
 \hat{p}_{i,a}-\hat{p}_{i,g}\sim N\left(0,\sigma_{\hat{p}_{i,a}}^{2}+\sigma_{\hat{p}_{i,g}}^{2}\right)\end{equation}

\end_inset 

with the individual variances being of the form 
\begin_inset Formula \begin{equation}
 \sigma_{\hat{p}_{i,j}}^{2}=\frac{pq}{n_{i,j}}\end{equation}

\end_inset 

which combine as 
\begin_inset Formula \begin{equation}
 
\sigma_{\hat{p}_{i,a}-\hat{p}_{i,g}}^{2}=\frac{pq}{n_{i,a}}+\frac{pq}{n_{i,g}}=pq\left(\frac{1}{n_{i,a}}+\frac{1}{n_{i,g}}\right)\end{equation}

\end_inset 

and therefore the quantity 
\begin_inset Formula \begin{equation}
 
z_{p_{i}}\equiv\frac{\hat{p}_{i,a}-\hat{p}_{i,g}}{\sqrt{pq}\sqrt{\frac{1}{n_{i,a}}+\frac{1}{n_{i,g}}}}\label{eq:zdef-init}\end{equation}

\end_inset 

has a standard normal distribution.
 
\layout Standard

Equation 
\begin_inset LatexCommand \prettyref{eq:zdef-init}

\end_inset 

 still requires a calculation of 
\begin_inset Formula $ p$
\end_inset 

, which in turn yields a usable 
\begin_inset Formula $ q$
\end_inset 

.
 Returning to the hypothesis that both groups are the same, the underlying
 frequency 
\begin_inset Formula $ p_{i}$
\end_inset 

can be best estimated by weighting each observed frequency by its number
 of responses to form a weighted average,
\begin_inset Marginal
collapsed false

\layout Standard

jack--cite? I can show that it's BLUE if we want
\end_inset 


\begin_inset Formula \begin{equation}
 
\hat{p}_{i}=\frac{n_{i,a}\hat{p}_{i,a}+n_{i,g}\hat{p}_{i,g}}{n_{i,a}+n_{i,g}}\end{equation}

\end_inset 

which, substituted into Equation 
\begin_inset LatexCommand \prettyref{eq:zdef-init}

\end_inset 

, yields the final 
\begin_inset Formula $ z$
\end_inset 

distributed test statistic of 
\begin_inset Formula \begin{eqnarray}
 z_{i} & = & \frac{\hat{p}_{i,a}-\hat{p}_{i,g}}{\sqrt{\hat{p}_{i}\hat{q}_{i}}\sqrt{\frac{1}{n_{i,a}}+\frac{1}{n_{i,g}}}}\nonumber \\
  & = & \frac{\hat{p}_{i,a}-\hat{p}_{i,g}}{\sqrt{\frac{\left(n_{i,a}\hat{p}_{i,a}+n_{i,g}\hat{p}_{i,g}\right)\left(n_{i,a}+n_{i,g}-\left(n_{i,a}\hat{p}_{i,a}+n_{i,g}\hat{p}_{i,g}\right)\right)}{\left(n_{i,a}+n_{i,g}\right)^{2}}}\sqrt{\frac{1}{n_{i,a}}+\frac{1}{n_{i,g}}}}\nonumber \\
  & = & \frac{\left(\hat{p}_{i,a}-\hat{p}_{i,g}\right)\left(n_{i,a}+n_{i,g}\right)}{\sqrt{\left(n_{i,a}\hat{p}_{i,a}+n_{i,g}\hat{p}_{i,g}\right)\left[n_{i,a}+n_{i,g}-\left(n_{i,a}\hat{p}_{i,a}+n_{i,g}\hat{p}_{i,g}\right)\right]}\sqrt{\frac{1}{n_{i,a}}+\frac{1}{n_{i,g}}}}
\end{eqnarray}

\end_inset 

which can easily be calculated partwise on a spreadsheet.
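\layout Standard

As a rough illustration of the arithmetic, using hypothetical counts that
 are not taken from the survey, suppose 60 of 
\begin_inset Formula $ n_{i,a}=100$
\end_inset 

 American respondents and 20 of 
\begin_inset Formula $ n_{i,g}=50$
\end_inset 

 Guatemalan respondents answer in the given manner, so that 
\begin_inset Formula $ \hat{p}_{i,a}=0.60$
\end_inset 

 and 
\begin_inset Formula $ \hat{p}_{i,g}=0.40$
\end_inset 

.
 The pooled estimate and the resulting statistic would then be 
\begin_inset Formula \begin{eqnarray*}
 \hat{p}_{i} & = & \frac{100\times 0.60+50\times 0.40}{100+50}=\frac{80}{150}\approx 0.533\\
 z_{i} & = & \frac{0.60-0.40}{\sqrt{0.533\times 0.467}\sqrt{\frac{1}{100}+\frac{1}{50}}}\approx \frac{0.20}{0.0864}\approx 2.31
\end{eqnarray*}

\end_inset 

The same value follows from the last form of the statistic, since 
\begin_inset Formula $ 0.20\times 150/\left(\sqrt{80\times 70}\sqrt{0.03}\right)\approx 2.31$
\end_inset 

.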
\layout Section

Results
\layout Standard

In many areas, the data shows conclusively that American and Guatemalan
 attitudes and experiences with crisis management are different.
 Table **
\begin_inset Marginal
collapsed true

\layout Standard

x-ref
\end_inset 

 shows the z values for all questions for which the methods of Section ***
\begin_inset Marginal
collapsed true

\layout Standard

xreftt
\end_inset 

 can be calculated.
 Any value with a magnitude greater than 
\begin_inset Formula $ 1.96$
\end_inset 

allows the hypothesis that the response is the same for both countries to
 be rejected at the 
\begin_inset Formula $ 95\% $
\end_inset 

 level.
 Similarly, a magnitude of 
\begin_inset Formula $ 2.576$
\end_inset 

 or greater allows rejection at the 
\begin_inset Formula $ 99\% $
\end_inset 

 level.
 For magnitudes beyond 
\begin_inset Formula $ 3$
\end_inset 

, the hypothesis can be rejected at any reasonable level.
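\layout Standard

For reference, and assuming a two-sided test on a standard normal variable
 
\begin_inset Formula $ Z$
\end_inset 

, these thresholds correspond approximately to the tail probabilities 
\begin_inset Formula \[
 P\left(\left|Z\right|>1.96\right)\approx 0.05,\qquad P\left(\left|Z\right|>2.576\right)\approx 0.01,\qquad P\left(\left|Z\right|>3\right)\approx 0.0027\]

\end_inset 
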
\layout Subsection

The OP1 category.
\layout Standard

Nearly all of the respondents answered with extreme values; either they
 are very concerned or very unconcerned with OP1.
\begin_inset Marginal
collapsed true

\layout Standard

are the 2's and 3's even valid responses????
\end_inset 

 The positive value for level one indicates that Americans answered in this
 manner far more often.
 That is, Americans are far more concerned with the issue.
 OPOCCUR1 suggests a reason for this: OP1 happens far more often in the
 American experience than the Guatemalan.
\layout Standard

*****
\layout Standard

Jack, we need to go over what these mean.
 Also, are the intermediate responses valid? If not, we need to do some
 paranoia adjustments.
\layout Section

Future Research
\layout Standard

These results come from a moderately sized sample and a simple statistical
 analysis.
 At this level, it can be seen that significant differences exist between
 both the expectations and the experiences of businesses in the two countries.
 
\layout Standard

A larger data set, drawn from a larger cross-section of both countries, would
 strengthen the findings; the conclusions here justify such an effort.
 Additionally, further statistical analysis of the present data set is possible.
 The only tests done so far are upon individual questions.
 A Logit or other regression model may lead to further insights.
\the_end
