Here are some of the software packages developed or managed by Yan Han.

Software Packages for ProQuest/UMI new Electronic Theses and Dissertations (ETD)
1. Software Package for ProQuest/UMI ETD DTD 4.3 (After July 1, 2008, ProQuest's new delivery platform)
When ProQuest/UMI switched its delivery platform for Electronic Theses and Dissertations (ETD), I developed a small software package to process ETDs. The software package does the following:
- Unzips the ProQuest/UMI ETD delivery zip files and creates one directory per ETD.
- Renames the ETD files to other preferred file names (in my case, Wang_arizona_0009D_10075.xml --> azu_etd_10075_sip1_m.xml).
- Generates a digital signature for digital preservation.
- Creates MARC records from the ProQuest/UMI XML files (i.e. a MRK file is generated for direct loading into the catalog; I load the MRC file into Innovative and Koha).
- Creates embargo notifications and moves embargoed ETDs to a different directory for future loading.
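As an illustration of the renaming and signature steps, here is a minimal Python sketch. It is not the code shipped in the package (which is Java and Perl). The function names `preferred_xml_name` and `sha256_hex` are hypothetical, the regular expression is inferred only from the file names shown on this page, and SHA-256 is an assumed stand-in for whatever "digital signature" the package actually produces.

```python
import hashlib
import re

# Assumed source pattern, inferred from the examples on this page:
# [authorLastName]_[institutionName]_0009[D|M]_[digits], optionally
# followed by _DATA, with an .xml extension.
SOURCE_XML = re.compile(
    r"^[^_]+_[^_]+_0009[DM]_(?P<etd_id>[0-9]+)(?:_DATA)?\.xml$"
)


def preferred_xml_name(source_name: str) -> str:
    """Map a ProQuest/UMI XML file name to the azu_etd_<id>_sip1_m.xml form."""
    match = SOURCE_XML.match(source_name)
    if match is None:
        raise ValueError(f"not a recognised ETD XML name: {source_name!r}")
    return f"azu_etd_{match.group('etd_id')}_sip1_m.xml"


def sha256_hex(path: str) -> str:
    """Return the SHA-256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For example, `preferred_xml_name("Wang_arizona_0009D_10075.xml")` yields the target name used above. Author names that themselves contain an underscore would not match this sketch.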
This package can save time when processing hundreds of ETDs. It consists of compiled Java code (a class file) and Perl scripts. Currently I run it on Linux, but it can also be run on Windows.
- ProQuest/UMI original file pattern: zipped file: etdadmin_upload_[0-9].zip (e.g. etdadmin_upload_1242.zip), which can be unzipped to:
- PDF: [authorLastName]_[institutionName]_0009[D|M]_[0-9].pdf. (e.g. Zhu_arizona_0009D_10020.pdf)
- XML: [authorLastName]_[institutionName]_0009[D|M]_[0-9]_DATA.xml (e.g. Zhu_arizona_0009D_10020_DATA.xml)
- A directory: [authorLastName]_[institutionName]_0009[D|M]_63/ (e.g. Zhu_arizona_0009D_63/). This directory is empty; I guess it is used for optional files.
- Assumption: the unique ID should be 10020, but the directory name does not contain this unique ID.
- Example:
[hany@yanbox test]$ unzip etdadmin_upload_3002.zip
Archive: etdadmin_upload_3002.zip
inflating: Choi_arizona_0009D_10081.pdf
creating: Choi_arizona_0009D_63/
inflating: Choi_arizona_0009D_10081_DATA.xml
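The file pattern above can be checked mechanically. The following Python sketch classifies the member names of one delivery into PDF, XML, and optional directory, and picks out the ETD ID. It is only an illustration under assumptions: `classify_members` and the `MEMBER` expression are hypothetical names, the pattern is inferred from the examples on this page, and `[0-9]` is read as "one or more digits".

```python
import re
from typing import Dict, List

# Inferred pattern: [authorLastName]_[institutionName]_0009[D|M]_[digits]
# followed by .pdf, _DATA.xml, or a trailing slash for the directory.
MEMBER = re.compile(
    r"^(?P<author>[^_/]+)_(?P<institution>[^_/]+)_0009(?P<degree>[DM])"
    r"_(?P<number>[0-9]+)(?P<rest>\.pdf|_DATA\.xml|/)$"
)

KINDS = {".pdf": "pdf", "_DATA.xml": "xml", "/": "optional_dir"}


def classify_members(names: List[str]) -> Dict[str, str]:
    """Return the PDF, XML, and optional directory found in one delivery.

    The ETD ID is taken from the PDF/XML names only, because the
    directory name carries a different number (63 in the examples).
    """
    found: Dict[str, str] = {}
    for name in names:
        match = MEMBER.match(name)
        if match is None:
            continue  # ignore anything that does not fit the pattern
        kind = KINDS[match.group("rest")]
        found[kind] = name
        if kind != "optional_dir":
            found["etd_id"] = match.group("number")
    return found
```

The list of names could come from `zipfile.ZipFile(path).namelist()` in the Python standard library, so the archive does not need to be extracted before it is checked.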
2. Software Package for ProQuest/UMI ETD DTD 4.2 and below (Before July 1, 2008, using BePress delivery platform)
This package was used for processing deliveries from ProQuest's previous delivery platform, which used BePress.
- BePress original file pattern: zipped file: upload-[0-9].zip (e.g. upload-3025.zip), which can be unzipped to:
- One or more directories: [0-9]/
- PDF: umi-[institutionName]-[0-9].pdf
- XML: umi-[institutionName]-[0-9].xml
- May include a directory for optional files
- Example:
[hany@yanbox test1]$ unzip upload-3299.zip
Archive: upload-3299.zip
creating: 2022/
inflating: 2022/umi-arizona-2022.pdf
inflating: 2022/umi-arizona-2022.xml
creating: 2028/
inflating: 2028/umi-arizona-2028.pdf
inflating: 2028/umi-arizona-2028.xml
creating: 2045/
inflating: 2045/umi-arizona-2045.pdf
inflating: 2045/umi-arizona-2045.xml
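Because one BePress zip file can hold several ETDs, it can help to group the members by ETD before processing. The Python sketch below does that and reports ETDs that lack a PDF or an XML file. It is an illustration only, not part of the downloadable package: `group_bepress` and `incomplete` are hypothetical names, and the rule that the directory number equals the number in the file name is an assumption read off the example above.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Inferred pattern: [digits]/umi-[institutionName]-[digits].pdf or .xml
BEPRESS_MEMBER = re.compile(
    r"^(?P<dir>[0-9]+)/umi-(?P<institution>[^/]+)-(?P<number>[0-9]+)"
    r"\.(?P<ext>pdf|xml)$"
)


def group_bepress(names: List[str]) -> Dict[str, Dict[str, str]]:
    """Group archive member names by ETD number, keyed by file extension."""
    groups: Dict[str, Dict[str, str]] = defaultdict(dict)
    for name in names:
        match = BEPRESS_MEMBER.match(name)
        # Assumption: the directory number and the file number agree.
        if match is None or match.group("dir") != match.group("number"):
            continue
        groups[match.group("number")][match.group("ext")] = name
    return dict(groups)


def incomplete(groups: Dict[str, Dict[str, str]]) -> List[str]:
    """Return the ETD numbers that do not have both a PDF and an XML file."""
    return sorted(
        number for number, files in groups.items()
        if set(files) != {"pdf", "xml"}
    )
```

Directory entries such as `2022/` and any optional files do not match the pattern and are simply skipped by this sketch.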
Download package etd_tool_for_bepress.tar now (3 MB). Please read README first!
Last Updated ( Monday, 06 July 2009 )