"A quick scan of blogs by scientists between now and last May when the 
intention was announced reveals that much single-investigator science has no 
process or procedures in place that could safely be called data management.  
The data life cycle for these projects ends with the publication of results in 
a peer-reviewed journal."  

"Avoid only committing your data to commercial journal repositories for what I 
hope are obvious reasons."

"Data management plans and procedures should become standardized . . ."

"Comments and discussion are encouraged and should be directed to the 
<http://intranet2.lternet.edu/comment/reply/3248#comment-form>online forum so 
that the community may benefit."


I have offered these excerpts for what I hope are obvious reasons; that is, 
to solicit questions and comments in response without indulging in redundant 
commentary. Contrary to the last excerpt, I have so far submitted this only to 
Ecolog, as I am unfamiliar with LTER. Should there be a discussion on Ecolog 
as well (or instead)?

WT



----- Original Message ----- 
From: "David Inouye" <[email protected]>
To: <[email protected]>
Sent: Wednesday, February 23, 2011 12:22 PM
Subject: [ECOLOG-L] How to write a data management plan for an NSF proposal


> Thanks to James Brunt for agreeing to share this. I'm sure that as
> the reviewer and PI communities gain experience with this component
> of proposals, expectations will develop.
> 
> David Inouye
> 
> How to Write a Data Management Plan for a National Science Foundation
> (NSF) Proposal
> 
> LTER Cybersecurity and Data Management Briefing #2 - February 2011
> 
> by James Brunt
> 
> The National Science Foundation (NSF) has made good on the announcement
> in <http://www.nsf.gov/news/news_summ.jsp?cntn_id=116928>last May's
> press release to require a data management plan with every NSF
> proposal.  You will be happy to know that writing a data management
> plan is not difficult. While constructing the text to meet the NSF
> requirements does demand some attention to detail, the real challenge
> is that the data management plan has to be non-fiction, describing
> procedures that will actually take place.  The NSF receives about
> 40,000 proposals each year (source: Wikipedia).  It occurred to me to
> wonder how those 40,000 potential investigators were going to
> approach this new requirement.  A quick scan of blogs by scientists
> between now and last May when the intention was announced reveals
> that much single-investigator science has no process or procedures in
> place that could safely be called data management.  The data life
> cycle for these projects ends with the publication of results in a
> peer-reviewed journal.  The purpose of this briefing is to provide
> you with a solid outline for a data management plan to include in
> your NSF proposals and some resources that will help you on your way
> to leveraging your valuable research products through preservation and reuse.
> 
> As of January 18, 2011, all proposals to NSF must include a
> supplementary document of no more than two pages labeled "Data
> Management Plan".  This supplement should describe how the proposal
> will conform to NSF policy on the dissemination and sharing of
> research results
> (see <http://www.nsf.gov/pubs/policydocs/pappguide/nsf11001/aag_6.jsp#VID4>AAG
> Chapter VI.D.4).  The NSF policy includes the sharing of results,
> primary data, physical samples, and collections.  The policy also
> states that NSF will enforce it through a variety of
> mechanisms and provide appropriate support and incentives for data
> cleanup, documentation, dissemination, and storage.  NSF suggests
> that the plan "may" contain:
>     * the types of data, samples, physical collections, software,
> curriculum materials, and other materials to be produced in the
> course of the project;
>     * the standards to be used for data and metadata format and
> content (where existing standards are absent or deemed inadequate,
> this should be documented along with any proposed solutions or remedies);
>     * policies for access and sharing including provisions for
> appropriate protection of privacy, confidentiality, security,
> intellectual property, or other rights or requirements;
>     * policies and provisions for re-use, re-distribution, and the
> production of derivatives; and
>     * plans for archiving data, samples, and other research products,
> and for preservation of access to them.
> 
> NSF stops short of dictating what data management practices you
> should engage in. This means if there are community standards they
> will be applied through peer review pressure.  While in some
> communities this means you can probably get away with two sentences
> saying how much you don't need a data management plan, that's not
> true in the ecological community where there are standards of
> practice and experienced informatics-oriented colleagues on the
> review panels.  Some NSF directorates and divisions have issued
> advice to proposers that contains more specific suggestions (e.g.,
> <http://www.nsf.gov/sbe/SBE_DataMgmtPlanPolicy.pdf>SBE,
> <http://www.nsf.gov/geo/ear/2010EAR_data_policy_9_28_10.pdf>EAR,
> <http://www.nsf.gov/bfa/dias/policy/dmpdocs/phy.pdf>MPS).  In
> addition, institutions are beginning to post resources for their
> constituents that can be of use in developing a data management plan
> (e.g.,
> <http://libraries.mit.edu/guides/subjects/data-management/>MIT,
> <http://dataplan.wisc.edu/wp-content/uploads/2010/04/data_plan_guide.pdf>UW-Madison).
> 
> If you are reading this first hand then you are in luck because you
> are in some way associated with an LTER site.  LTER proposals have
> been going in with data management plans, backed up by data
> management procedures, for the last 30 years.  This means that there
> is expertise for you to draw on to prepare your plan and more
> importantly resources to guide you down the road to fulfilling your
> plan. (Note: An NSF source has indicated that a PI who adopts their
> LTER site's research data management plan for projects proposed to
> other NSF programs would be viewed favorably.)  If you've received
> this via a colleague or through the
> magic of Google then I hope that I can give you some added confidence
> in the composition of your data management plan.
> 
> The National Science Board in its 2005 recommendations to NSF,
> <http://www.nsf.gov/pubs/2005/nsb0540/>NSB-05-40, Long-Lived Digital
> Data Collections: Enabling Research and Education in the 21st
> Century, intended these data management plans to be quite
> comprehensive. With this 2-page directive, however, NSF is
> particularly interested in data management with regard to the
> dissemination and sharing of research results.  While the
> instructions below reflect desirable data management
> practices,  there are several essential issues among them that
> deserve more weight in your write-up for NSF.  I will identify these
> in the text below. As with LTER proposals, any specific solicitation
> instructions trump this 2-pager in terms of expectations, but your
> plan must still include the essential information below.
> 
> Step 0. Label the page - "Data Management Plan"
> 
> Step 1.  Collection - Describe the data to be collected during the
> proposed period of operation.  These are the actual observations, not
> the final derivative product.  This can be prose if simple or a table
> if more complex.  Name the type of data (e.g., mass of seeds, counts
> of inflorescences), the instrument or collection approach (e.g.,
> visual count recorded on paper), and the sampling design (e.g.,
> number of plots, replicates, frequency of collection).  If actual
> data are interpreted, note the interpretation (e.g., impedance
> interpreted as soil moisture).  If data volumes are significant
> (e.g., >1Gb/day), give an estimate of the totals.  Describe any
> quality control measures that will be put in place as part of data collection.
> 
> Step 2.  Processing - Describe the disposition of the raw data
> post-collection.  How will data be transmitted from field or
> instrument to institution?  How regularly, by whom, and where will
> data be stored? How will the security of those data be ensured?  A
> previous article describes several rules of thumb for data security
> (<http://intranet2.lternet.edu/content/protecting-your-digital-research-data-and-documents-cybersecurity-briefing-1-september-2010>LTER
> Data Management and Cybersecurity Briefing #1).
> 
> Step 3. Analysis - Describe in general any descriptive or analytical
> statistics that will be run against the data for quality assurance,
> derivation, aggregation, etc.  Mention the names of analytical
> packages (e.g., SAS, SPSS, MATLAB, R).
> 
> Step 4. Documentation - Documentation is required to ensure the
> longevity of data. The documentation of your study is best done
> during the process, not after. This step describes the accumulation
> of the documentation text, while Step 8 describes the encoding of
> this text into a metadata language for publication.  Here you will
> describe what metadata/documentation will be created at each stage of
> the data life cycle and by whom.  For example, "Changes made to the
> data to correct errors will be described and revised during the data
> manipulation process by the budgeted graduate student".  Examples of
> good metadata can be seen in the <http://metacat.lternet.edu>LTER
> data catalog, or you can consult your Site Information Manager.  What is
> the metadata content standard you will use to document these
> data?  Most ecological metadata is based on recommendations contained
> in
> <http://www.esajournals.org/doi/abs/10.1890/1051-0761%281997%29007%5B0330%3ANMFTES%5D2.0.CO%3B2>Michener
> et al. 1997.
> 
> Step 5. Products (Essential) - Describe the data or other products that
> you will be making available from the study. These may or may not be
> the raw data described in Step 1. This is another place where a table
> might be useful.
> 
> Step 6. Policy (Essential) - Describe the policies under which these
> data will be made available (see the
> <http://intranet2.lternet.edu/documents/lter-network-data-access-policy-revision-3>LTER
> Data Access Policy for an example) and how you will deal with privacy or
> other sensitive data issues (e.g., location of endangered species).
> 
> Step 7. Archival (Essential) - Describe how and where you will make
> these data and metadata available to the community in perpetuity.
> Here again you have an advantage by being associated with an LTER
> site.  LTER sites maintain archival infrastructure for making data
> and metadata accessible and can give you tips and maybe some direct
> support. If you are not associated with an LTER site, most institutional
> libraries operate digital repositories that will provide this service
> for their constituents.
> 
> Step 8. Curation (Essential) - Preparation of metadata and data for
> publication is a time-consuming process. This should be acknowledged
> in the data management plan and in the budget.  In this step you will
> describe the structural standards that you will apply in making data
> and metadata available.  For example, for most ecological data,
> documentation will need to be structured in Ecological Metadata
> Language (EML) to be included in community repositories.   There are
> <http://intranet2.lternet.edu/documents/eml-best-practices-document-2004>best
> practices available from the LTER community for EML.  However, you
> can avoid direct contact with EML and best practices documents by
> registering your datasets online with the Knowledge Network for
> Biocomplexity (see Step 9).
> 
> Step 9.  Publication (Essential) - After making sure you have a
> secure place for your data products to reside, you need to register
> them with community repositories.  Include a description here of the
> institutional repository or repositories where you will register your
> data.  Your LTER site can register and publish your data.  If that is not
> appropriate for your study, the LTER Network operates as a node on
> the <http://knb.ecoinformatics.org>Knowledge Network for
> Biocomplexity (KNB) where these data can be independently
> registered.  KNB offers an online repository form and a guide for
> completing the form.  The NSF DataNet projects, in particular
> <http://www.dataone.org>DataONE, will hopefully soon offer another
> outlet for data publication.
> 
> For specific datasets you may consider formally publishing the
> data.  <http://esapubs.org/archive/default.htm>Ecological Archives is
> a peer-reviewed data journal operated by the Ecological Society of
> America that accepts well-described datasets and their textual
> description for publication. There are others operated in various
> ways by scientific societies. Avoid only committing your data to
> commercial journal repositories for what I hope are obvious reasons.
> 
> Other considerations:
> 
> The information contained in the plan regarding "plans for
> preservation, documentation, and sharing of data" is also required to
> be part of the Project Description, so it seems that placement of
> an appropriate reference to the 2-page plan in the project
> description would be prudent.
> 
> Make sure your proposed budget addresses the data management plan.
> Costs of documenting, preparing, publishing, disseminating and
> sharing research findings and supporting material are allowable
> charges against the grant.
> 
> Data management plans and procedures should become standardized for a
> lab, institute, or even community such that in time there is
> boilerplate material available that reflects institutionalized procedures.
> 
> Ultimately the success of any given plan will lie in the hands of the
> reviewers and the makeup of the panel, but as with any new initiative
> those 40,000 proposals that go in first tend to set the tone for the
> future.  Finally, just before going to press I read in a
> <http://news.unm.edu/2011/02/online-data-management-planning-tool-tames-data-and-meets-researchers%E2%80%99-funding-requirements/>reliable
> source that DataONE and others are developing a software tool that
> will write data management plans for you. Until that time, I hope you
> find this information useful.
> 
> Comments and discussion are encouraged and should be directed to the
> <http://intranet2.lternet.edu/comment/reply/3248#comment-form>online
> forum so that the community may benefit.
> 
> Copyright 2010-2011 James W Brunt 
> 
> 
