Hmm, it's quite possible you know more about statistics than me, but...
Usually the equations for calculating a confidence level assume a random
sample, not a self-selected sample of volunteers. If you have a
self-selected sample, then the equations for "how likely is this to be a
fluke" are only accurate if that sample happens to be representative.
There aren't really any equations that can tell you how likely a
self-selected sample is to be representative; it depends on the
circumstances (which is why you need a random sample for the statistical
equations to be completely valid).
That's my understanding, anyway.
On 12/5/2012 2:18 PM, Rosalyn Metz wrote:
Ross,
I totally get what you're saying, I thought of all of that too, but
according to everything I was reading through, the likelihood that the
survey's results are a fluke is extremely low. It's actually the reason I
put information in the write-up about the sample size (378), population
size (2,250), response rate (16.8%), confidence level (95%), and confidence
interval (+/- 4.6%).
Rosalyn
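
For anyone who wants to check the arithmetic behind those figures, here is a minimal sketch. It assumes the standard margin-of-error formula for a proportion with a finite population correction, the most conservative proportion (0.5), and z = 1.96 for a 95% confidence level. The summary does not say which formula or calculator was used, so the function name and those defaults are assumptions, not the survey's actual method. The formula also assumes a random sample.

```python
import math

def margin_of_error(sample_size, population_size, proportion=0.5, z=1.96):
    """Margin of error for a proportion, with finite population correction.

    Assumes a simple random sample; proportion=0.5 is the worst case,
    and z=1.96 corresponds to a 95% confidence level.
    """
    standard_error = math.sqrt(proportion * (1 - proportion) / sample_size)
    # The correction shrinks the error when the sample is a sizable
    # fraction of the whole population.
    fpc = math.sqrt((population_size - sample_size) / (population_size - 1))
    return z * standard_error * fpc

sample_size = 378        # responses, from the message above
population_size = 2250   # mailing list size, from the message above

response_rate = sample_size / population_size
moe = margin_of_error(sample_size, population_size)

print(f"response rate: {response_rate:.1%}")
print(f"margin of error: +/- {moe:.1%}")
```

Under those assumptions the numbers come out to the 16.8% response rate and +/- 4.6% interval quoted above, so the figures are internally consistent. That says nothing about whether the respondents are representative.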
On Wed, Dec 5, 2012 at 1:52 PM, Ross Singer <[email protected]> wrote:
Thanks, Rosalyn, for setting this up and compiling the results!
While it doesn't change my default position, "yes we need more diversity
among Code4lib presenters!", I'm not sure, statistically speaking, that you
can draw the conclusions you have based on the sample size, especially
given the survey's topic (note, I am not saying that women aren't
underrepresented in the Code4lib program).
If 83% of the mailing list didn't respond, we simply know nothing about their
demographics. They could be 95% male, they could be 99% female, we have no
idea. I think it is safe to say that the breakdown of the 16% is probably
biased towards females simply given the subject matter and the dialogue
that surrounded it. We simply cannot project that the mailing list is
57/42 from this, I don't think.
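
To put rough numbers on that, here is a small sketch of the worst-case bounds when nothing is known about the non-respondents. It uses the 378 responses and 2,250 list members reported elsewhere in this thread and takes the 57% figure as the share of one group among respondents. The function name is hypothetical, and the two extremes (no non-respondent in the group, or every non-respondent in the group) are deliberately unrealistic limits, not estimates.

```python
def nonresponse_bounds(respondents, population, observed_share):
    """Lowest and highest possible population share of a group,
    given its share among respondents and no information at all
    about the people who did not respond."""
    in_group = observed_share * respondents
    nonrespondents = population - respondents
    lowest = in_group / population                      # no non-respondent is in the group
    highest = (in_group + nonrespondents) / population  # every non-respondent is in the group
    return lowest, highest

low, high = nonresponse_bounds(respondents=378, population=2250, observed_share=0.57)
print(f"possible share across the whole list: {low:.1%} to {high:.1%}")
```

With these inputs the possible range runs from roughly 10% to roughly 93%, which is the sense in which the responses alone cannot pin down the makeup of the full list.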
What is interesting, however, is that the number roughly corresponds to
the number of seats in the conference. I think it would be interesting to
see how this compares to the gender breakdown at the conference.
This doesn't diminish how awesome it is that you put this together,
though. Thanks again to you and Karen!
-Ross.
On Dec 5, 2012, at 1:28 PM, Rosalyn Metz <[email protected]> wrote:
Hi Friends,
I put together the data and a summary for the gender survey. Now that
conference and hotel registration has subsided, it's a perfect time for you
to kick back and read through.
[Code4Lib] Gender Survey Data
<https://docs.google.com/spreadsheet/ccc?key=0AqfFxMd8RTVhdFVQSWlPaFJ2UTh1Nmo0akNhZlVDTlE>
Gender Survey Data is the raw data for the survey. Not very interesting,
but you can use it to view my Pivot Tables and charts.
[Code4Lib] Gender Survey Summary
<https://docs.google.com/document/d/1Hbofh63-5F9MWEk8y8C83heOkNodttASWF5juqGLQ1E/edit>
Gender Survey Summary is an easy-to-read version of the above -- it's the
summary I wrote about the results. Included are a brief intro, charts (from
above), and a summary of the results.
Let the discussion begin,
Rosalyn
P.S. Many thanks to Karen Coyle for reviewing the summary for me before I
sent it out. Also, if there are any typos or grammar mistakes, please blame
my friend Abigail, who acted as my editor.