[ 
https://jira.duraspace.org/browse/DS-1226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Masár updated DS-1226:
---------------------------

    Labels: has-patch has-pull-request import  (was: has-patch import)
    
> Batch import from basic bibliographic formats (Endnote, BibTex, RIS, TSV, CSV)
> ------------------------------------------------------------------------------
>
>                 Key: DS-1226
>                 URL: https://jira.duraspace.org/browse/DS-1226
>             Project: DSpace
>          Issue Type: New Feature
>          Components: DSpace API
>            Reporter: Kostas Stamatis
>              Labels: has-patch, has-pull-request, import
>         Attachments: biblio-transformation-engine-0.8.jar, import-patch.diff, 
> jbibtex-r45.jar, README.txt
>
>
> This proposed extension (implemented by National Documentation Centre/EKT - 
> http://www.ekt.gr) allows the batch import of metadata (and/or bitstreams) 
> into DSpace using the import script and the Biblio-Transformation-Engine 
> tool. The input can be in any bibliographic format (this patch includes 
> support for the Endnote, RIS, BibTex, TSV and CSV formats).
> The Biblio-Transformation-Engine 
> (http://code.google.com/p/biblio-transformation-engine/) is an open source 
> Java framework developed by the Hellenic National Documentation Centre (EKT, 
> www.ekt.gr). It consists of programmatic APIs for filtering and modifying 
> records retrieved from various types of data sources (e.g. databases, 
> files, legacy data sources) and for outputting them in appropriate 
> standard formats (e.g. database files, txt, xml, Excel). The framework 
> includes independent abstract modules that are executed separately, in many 
> cases offering the user alternative choices depending on the input data 
> set, the transformation workflow to be executed and the output format to be 
> generated.
> Thus, the attached patch adds support for using the 
> Biblio-Transformation-Engine in the DSpace batch import procedure, where the 
> user only needs to specify the mapping between the input metadata and DSpace 
> metadata. Default mappings are also provided for the default DSpace Dublin 
> Core metadata schema.
> USEFULNESS
> ---------------------
> Suppose a researcher at your institute gives you a file of his/her 
> publications that you need to import into the repository. If the file is in 
> one of the following formats: CSV, TSV, Endnote, BibTex, RIS (formats that 
> are commonly used for bibliographic metadata), a single command imports all 
> the records into the DSpace repository, while configuration files control 
> which metadata is imported and which DC field (or field of any other schema 
> in the DSpace repository) it maps to.
> For those familiar with the Biblio-Transformation-Engine, this extension is 
> powerful because they can write their own DataLoaders to support more input 
> formats. Filtering records and modifying the metadata is also possible with 
> very little effort (using the Biblio-Transformation-Engine's filters and 
> modifiers). The same applies to adding bitstreams to the records.
> CONFIGURATION FILES
> ---------------------------------------
> Since the Biblio-Transformation-Engine supports Spring, the only 
> configuration files the user must work with are the Spring XML files for 
> Dependency Injection. These files are located in the "config" directory, 
> and in them the user can specify the mapping between the input metadata and 
> the DSpace Dublin Core schema (or any other schema users have in their 
> repository).
> EXTERNAL LIBRARIES
> -----------------------------------
> This extension makes use of three external Java libraries:
> a) jbibtex, a Java library for reading BibTex files (under the BSD licence - 
> http://www.linfo.org/bsdlicense.html)
> b) opencsv, a Java library for reading CSV files (under Apache License V2.0 - 
> http://www.apache.org/licenses/LICENSE-2.0)
> c) biblio-transformation-engine, a Java library for metadata transformation, 
> filtering and modification (under the European Union Public Licence (EUPL) - 
> http://www.osor.eu/eupl/european-union-public-licence-eupl-v.1.1)
> HOW TO RUN
> ----------------------
> The import script has a new option (-b) to import using the 
> Biblio-Transformation-Engine and an option (-i) to declare the type of the 
> input format. All the other options are unchanged, except that -s now points 
> to the input data file (and not to a directory, as it used to).
> Thus, to import metadata from the various input formats, use the following 
> commands:
> for BibTex input: ./dspace import -b -m mapFile -e [email protected] -c 
> 123456789/1 -s /DATA/export-bibtex -i bibtex
> for csv input: ./dspace import -b -m mapFile -e [email protected] -c 
> 123456789/1 -s /DATA/export-csv -i csv
> for tsv input: ./dspace import -b -m mapFile -e [email protected] -c 
> 123456789/1 -s /DATA/export-tsv -i tsv
> for ris input: ./dspace import -b -m mapFile -e [email protected] -c 
> 123456789/1 -s /DATA/export-ris -i ris
> for endnote input: ./dspace import -b -m mapFile -e [email protected] -c 
> 123456789/1 -s /DATA/export-endnote -i endnote
> (-e must be the email address of a valid DSpace user and -c must be the 
> handle of the collection into which the items will be imported)
> Before you run the commands, feel free to change the configuration files 
> (config/spring-bibtex2dspace.xml, config/spring-csv2dspace.xml, 
> config/spring-tsv2dspace.xml, config/spring-ris2dspace.xml, 
> config/spring-endnote2dspace.xml) in order to specify the mapping of the 
> input format to the DC metadata schema of DSpace.
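
The five commands in the issue description differ only in the format name, so they can be generated in a loop. The following is a minimal dry-run sketch, not part of the patch: the function name build_import_cmd and the variables EPERSON, COLLECTION and DATA_DIR are hypothetical, the email address is a placeholder, and the /DATA/export-<format> file names simply mirror the examples in the issue description. It only prints the commands and executes nothing.

```shell
#!/bin/sh
# Hypothetical placeholder values (assumptions, not taken from the patch):
# replace them with a real DSpace user email and collection handle.
EPERSON="admin@example.org"
COLLECTION="123456789/1"
DATA_DIR="/DATA"

# Print the import command for one input format.
# Dry run only: the command is printed, never executed.
build_import_cmd() {
    format="$1"
    printf './dspace import -b -m mapFile -e %s -c %s -s %s/export-%s -i %s\n' \
        "$EPERSON" "$COLLECTION" "$DATA_DIR" "$format" "$format"
}

# The five input formats named in the issue description.
for format in bibtex csv tsv ris endnote; do
    build_import_cmd "$format"
done
```

Once the printed commands look right, they can be run one at a time from the DSpace bin directory. Note that every command here reuses the same mapFile name, as in the issue description; whether separate map files per run are preferable is left to the operator.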

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://jira.duraspace.org/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
