DigiTool Migration Workflow: Difference between revisions

 
(54 intermediate revisions by 3 users not shown)
Line 1: Line 1:
Developed by Kevin Powell in 2018 to migrate Center for Jewish History assets from DigiTool to Rosetta.
Developed by Kevin Powell in 2018 and 2019 to migrate Center for Jewish History assets from DigiTool to Rosetta.


==Requirements==
=DigiTool Migration Workflow Requirements=
*'''Mapped Network Drive to at least one of the partners' Rosetta submission folders'''
*'''Mapped Network Drive to at least one of the partners' Rosetta submission folders'''
**67.111.179.133
**/storage1/operational_shared/submissions/[PARTNER_CODE]
**/storage1/operational_shared/submissions/[PARTNER_CODE]
*'''Mapped Network Drive to DigiTool server'''
*'''Mapped Network Drive to DigiTool server'''
**67.111.179.146
**/storage6/bigstreams6/dtl-export
**/storage6/bigstreams6/dtl-export
*'''Access to the Export Digital Entities job in DigiTool for at least 1 admin unit'''
*'''Access to the Export Digital Entities job in DigiTool for at least 1 admin unit'''


==Export from DigiTool==
=Export from DigiTool=
#'''Make sure you are connected to the correct Admin Unit'''
'''1. Make sure you are connected to the correct Admin Unit'''
#*DigiTool → Connect To
*DigiTool → Connect To
#'''Start Export Digital Entities Job'''
 
##Management → Maintenance → Submit New Job
'''2. Start Export Digital Entities Job'''
##Choose Export Digital Entities from list
*Management → Maintenance → Submit New Job
##Search for collection and/or objects to be migrated
*Choose Export Digital Entities from list
##Click Next
*Search for collection and/or objects to be migrated
##Set the Export directory
*Click Next
##*'''[YYYYMMDD]-[PartnerCode]-[DescriptiveText]'''
*Set the Export directory
##**'''[YYYYMMDD]''' is the date format.  
**'''[YYYYMMDD]-[PartnerCode]-[DescriptiveText]'''
##***E.g. 20180816 = August 16th, 2018
**'''[YYYYMMDD]''' is the date format.  
##**'''[PartnerCode]'''
***E.g. 20180816 = August 16th, 2018
##***AJH01
**'''[PartnerCode]'''
##***ASF01
***AJH01
##***LBI01
***ASF01
##***YIV01
***LBI01
##***YUM01
***YIV01
##**'''[DescriptiveText]''' can be the call number, name of collection, a simple description, etc.
***YUM01
##** Example: '''20180816-AJH01-JJLYONS'''
**'''[DescriptiveText]''' can be the call number, name of collection, a simple description, etc.
##*Add sequential number to end of folder name if exporting in batches
*** Example: '''20180816-AJH01-JJLYONS'''
##** Example: '''20180816-AJH01-JJLYONS-001'''
*Add sequential number to end of folder name if exporting in batches
##Set Format to “Digital Entities”
*** Example: '''20180816-AJH01-JJLYONS-001'''
##Select the “Include Streams” and “Export Related Objects” boxes
*Set Format to “Digital Entities”
##Click Next
*'''DE-SELECT''' the “Include Streams” box
##Click Confirm
*'''SELECT''' the “Export Related Objects” box
#'''Monitor the Export job'''
** NOTE: These parameters ONLY export XML from DigiTool
#* Management → Maintenance → Monitor  
*Click Next
#'''Merge batches if exporting in batches'''
*Click Confirm
##Make sure all batches have completed exporting
 
##Use [[Rosetta_Deposit_Processor_Tools#Merge_Batches | Merge Batches Tool]] in Rosetta Deposit Processor
'''3. Monitor the Export job'''
#'''Save Export job log once job is complete'''
* Management → Maintenance → Monitor  
##Search for Job log
 
##*Management → Maintenance → Jobs List
'''4. Save Export job log once job is complete'''
##**Job Name = Export Digital Entities
* Search for Job log
##**Admin Unit = Partner whose objects are being exported
* Management → Maintenance → Jobs List
##**Job Status = Completed
** Job Name = Export Digital Entities
##Click eye icon under “Action” on the far left
** Admin Unit = Partner whose objects are being exported
##In the pop up window, choose “Log”
** Job Status = Completed
##Click eye icon under “Action” on the far left
* Click eye icon under “Action” on the far left
##Copy and paste contents of log into text document and save with the name of the export folder.  
* In the pop up window, choose “Log”
##*[YYYYMMDD]-[PartnerCode]-[DescriptiveText].txt
* Click eye icon under “Action” on the far left
##*Save file '''on Rosetta server''' at [PartnerCode]/migration/migration_logs
* Copy and paste contents of log into text document and save with the name of the export folder.  
#'''Move DigiTool export to Rosetta server'''
** [YYYYMMDD]-[PartnerCode]-[DescriptiveText].txt
#*[PartnerCode]/migration
* Save file '''on Rosetta server''' at [PartnerCode]/migration/migration_logs
#*Be patient, this may take awhile!
 
#'''Write Migration Note in DigiTool'''
'''5. Move DigiTool export to Rosetta server'''
#*TBD
* Once the export is finished, locate '''[YYYYMMDD]-[PartnerCode]-[DescriptiveText]''' in the "DigiTool → Rosetta" tab of the Rosetta Migration Utility
#*Partition C
* Click submit
 
:: [[File:Migration utility 1.png|border|500px]]
 
'''6. Write Migration Note in DigiTool'''
* Management → Maintenance → Submit New Job
* Choose Assign Control Attributes from list
* Search for objects to update, click Next
* Parameters for job:
** Name = Partition C
** Value = ROS_YYYY_MM
 
=CSV Processing=
Used for migrating non-complex objects from DigiTool to Rosetta.
 
==Requirements==
*'''CSV template'''
** Current template version is available for [http://wiki.cjh.org/index.php/File:CJH_CSV.csv download here].
*'''An export folder from DigiTool that is organized into “digital_entities” and “streams” subfolders.'''
** This organization is the default structure of DigiTool exports.
 
==Process==
'''1. Supply path to CSV template'''
 
'''2. Supply path to export folder'''
 
:: [[File:Migration 1.PNG|border|500px]]
 
'''3. Submit'''
* The log will provide the path for the deposit folder.
* This process extracts the technical and administrative metadata exported by DigiTool and creates Metadata Update files. These files will be used to update the migrated objects' metadata once they're ingested into Rosetta.
* An example of a successful DigiTool Migration process log can be [[Media:Migration log.txt|viewed here]].
 
= METS Processing =
Used for migrating complex objects from DigiTool to Rosetta.
 
This process takes significantly longer than the other migration workflow. It creates METS files which includes technical, administrative, and structmap metadata.
 
===Requirements===
*'''An export folder from DigiTool that is organized into “digital_entities” and “streams” subfolders.'''
** This organization is the default structure of DigiTool exports.
 
===Process===
'''1. Supply path to export folder'''
 
:: [[File:Migration complex.PNG|border|500px]]
 
'''2. Submit'''
* The log will provide the path for the deposit folders.
* An example of a successful DigiTool Migration process log can be [[Media:Migration complex log.txt|viewed here]].
 
=ALEPH Reconciliation=
'''1. Update MARCXML Batch.'''
* Open the Rosetta Migration Utility and go to Tools --> Update MARCXML. Locate the MARCXML Batch.
 
:: [[File:Update_marc.png|border|500px]]
* If these MARCXML records are associated with a parent collection, add the ALEPH System Number for that parent collection in the pop up window.
 
:: [[File:Coll num.PNG|border|500px]]
 
'''2. Open MarcEdit. Go to File --> MARC Tools --> "MARC21XML => MARC"'''
 
:: [[File:Marc edit 1.png|border|500px]]
 
* Under Input File supply the path to the updated MARCXML batch file.
* Under Output File supply a path for a new .MRC file.
** Use the '''[YYYYMMDD]-[PartnerCode]-[DescriptiveText]''' naming convention, but make sure it is entirely in lower case. Aleph does not like capital letters in input file names.
* Make sure UTF-8 is chosen as Default Character Encoding
 
:: [[File:Marc edit 2.png|border|500px]]
 
* Click Execute
* After the MRC file is created, double click it. This will open MARC Edit again and prompt you to create a human-readable MRK file.
 
:: [[File:Marc edit 3.PNG|border|500px]]
 
* Open the MRK file to double check the MARC records. Make sure there aren't bad leaders in a record. You can do this by doing a "Find All" search for "=LDR" and looking for "^" symbols.
 
:: [[File:Marc edit 4.PNG|border|500px]]
 
* If there are bad leaders, contact the Rosetta Systems Administrator
 
'''3. Upload MRC file to the ALEPH server'''
* '''/exlibris/aleph/u22_1/cjh01/scratch'''


==Processing==
'''4. Open Aleph and follow ~ THIS GUIDE ~ for adding records.'''
#Open [[Rosetta#Rosetta_Deposit_Processor | Rosetta Deposit Processor]]
* Write down the name of the log file created during the manage-18 job. This can be used in the next step as the input file.
#If the migration deposit '''DOES NOT''' require a complex structMap:
#*'''[[DigiTool_Migration#DigiTool_Migration | Processor Migration Workflow]]'''
#*'''CSV Template''' = [http://wiki.cjh.org/index.php?title=File:CJH_CSV.csv template found here]
#*'''Export Folder''' = /[PartnerCode]/migration/[YYYYMMDD]-[PartnerCode]-[DescriptiveText]
#*Move newly created deposit folder to /[PartnerCode]/CSV
#If the migration deposit '''DOES''' require a complex structMap:
#*'''[[DigiTool_Migration#DigiTool_Migration_-_Complex | Processor Migration Workflow - Complex]]'''
#*'''Export Folder''' = /[PartnerCode]/migration/[YYYYMMDD]-[PartnerCode]-[DescriptiveText]
#*Move newly created deposit folder to /[PartnerCode]/METS


==ALEPH Reconciliation==
'''5. Follow ~ THIS GUIDE ~ for downloading MARCXML from ALEPH. '''
If the Deposit Processor extracted MARC records from the DigiTool export, open a sample amount to see if they have ALEPH system numbers in the 001 controlfield. If they do not have ALEPH system numbers, they need to be added to ALEPH:


# Create MARCXML Batch from extracted MARC records.
'''6. Add ALEPH system numbers to deposit CSV'''
# Open the Rosetta Deposit Processor and go to Tools --> Migration Tools --> Update MARC. Locate the MARCXML Batch.
* Open Command Prompt
#*: PICTURE
* cd into the directory containing add_aleph_sys.py
# If these MARC records are associated with a parent collection, add the ALEPH System Number for that parent collection in the pop up window.
<nowiki>
#*: PICTURE
C:\> cd C:\Path\To\script
# Once the batch has been updated, open MarcEdit. Go to File --> MARC Tools --> "MARC21XML => MARC"
#*: PICTURE
# Under Input File supply the path to the updated MARCXML batch file.
# Under Output File supply a path for a new .MRC file. Use the '''[YYYYMMDD]-[PartnerCode]-[DescriptiveText]''' naming convention, but make sure it is entirely in lower case. Aleph does not like capital letters in input file names.
# Make sure UTF-8 is chosen as Default Character Encoding
# Click Execute
# After the MRC file is created, double click it. This will open MARC Edit again and prompt you to create a human-readable MRK file.
# Open the MRK file to double check the MARC records. Make sure there aren't two leaders in a record.
#* If there are two leaders, contact the Rosetta Systems Administrator
# If the MRC file is ready to go, upload it to two locations on the ALEPH server
#* '''/exlibris/aleph/u22_1/cjh01/scratch'''
#* '''/exlibris/aleph/u22_1/alephe'''
# Open Aleph and, follow [[ this guide ]] for adding records.
# Write down the name of the log file created during the manage-18 job. This can be used in the next step as the input file.
# Once the records are added, follow [[ this guide ]] for downloading MARCXML from Aleph.
# Open Command Prompt
# cd into the directory containing add_aleph_sys.py
# Use the following commands


<nowiki>C:\> py -3.7
C:\Path\To\script> py -3.5
Python 3.7.0 on win32
Python 3.5.0 on win32
>>> import add_aleph_sys
>>> from add_aleph_sys import *
>>> csvfile = r'C:\Path\To\DepositCSV.csv'
>>> csvfile = r'C:\Path\To\DepositCSV.csv'
>>> xmlfile = r'C:\Path\To\DownloadedAlephXML.xml'
>>> xmlfile = r'C:\Path\To\DownloadedAlephXML.xml'
Line 107: Line 160:
>>> 'C:\Path\To\EDITED_DepositCSV.csv'</nowiki>
>>> 'C:\Path\To\EDITED_DepositCSV.csv'</nowiki>


# If successful, the script will create an edited version of the original deposit CSV with the ALEPH system numbers added.
=Rosetta Ingest=
# If the edited Deposit CSV is correct, move the original CSV to another location.


==Rosetta Ingest==
'''1. Copy the '''[YYYYMMDD]-[PartnerCode]-[DescriptiveText]_deposit''' folder to the appropriate submissions folder.'''
==Quality Assurance (QA)==
* For the simple Migration workflow (CSV)
** '''submissions\[PartnerCode]\CSV'''
* For the complex Migration workflow (METS)
** '''submissions\[PartnerCode]\METS'''

Latest revision as of 17:13, 22 February 2019

Developed by Kevin Powell in 2018 and 2019 to migrate Center for Jewish History assets from DigiTool to Rosetta.

DigiTool Migration Workflow Requirements

  • Mapped Network Drive to at least one of the partners' Rosetta submission folders
    • /storage1/operational_shared/submissions/[PARTNER_CODE]
  • Mapped Network Drive to DigiTool server
    • /storage6/bigstreams6/dtl-export
  • Access to the Export Digital Entities job in DigiTool for at least 1 admin unit

Export from DigiTool

1. Make sure you are connected to the correct Admin Unit

  • DigiTool → Connect To

2. Start Export Digital Entities Job

  • Management → Maintenance → Submit New Job
  • Choose Export Digital Entities from list
  • Search for collection and/or objects to be migrated
  • Click Next
  • Set the Export directory
    • [YYYYMMDD]-[PartnerCode]-[DescriptiveText]
    • [YYYYMMDD] is the date, written as a four-digit year, two-digit month, and two-digit day.
      • E.g. 20180816 = August 16th, 2018
    • [PartnerCode]
      • AJH01
      • ASF01
      • LBI01
      • YIV01
      • YUM01
    • [DescriptiveText] can be the call number, name of collection, a simple description, etc.
      • Example: 20180816-AJH01-JJLYONS
  • Add a sequential number to the end of the folder name if exporting in batches
    • Example: 20180816-AJH01-JJLYONS-001
  • Set Format to “Digital Entities”
  • DE-SELECT the “Include Streams” box
  • SELECT the “Export Related Objects” box
    • NOTE: These parameters ONLY export XML from DigiTool
  • Click Next
  • Click Confirm
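
The export directory naming convention above can be sketched in Python as a rough illustration. The function name `export_dir_name`, the three-digit zero-padding of the batch number (inferred from the "-001" example), and the check against the five listed partner codes are assumptions made for this sketch; they are not part of DigiTool or the Rosetta Migration Utility.

```python
from datetime import date

# Partner codes listed in the naming convention above.
PARTNER_CODES = {"AJH01", "ASF01", "LBI01", "YIV01", "YUM01"}


def export_dir_name(export_date, partner_code, descriptive_text, batch=None):
    """Build [YYYYMMDD]-[PartnerCode]-[DescriptiveText], with an optional batch suffix."""
    if partner_code not in PARTNER_CODES:
        raise ValueError("unknown partner code: " + partner_code)
    if not descriptive_text:
        raise ValueError("descriptive text must not be empty")
    # date objects accept strftime codes in format(), giving YYYYMMDD.
    name = "{:%Y%m%d}-{}-{}".format(export_date, partner_code, descriptive_text)
    if batch is not None:
        # Assumption: batch numbers are zero-padded to three digits, as in "-001".
        name += "-{:03d}".format(batch)
    return name


print(export_dir_name(date(2018, 8, 16), "AJH01", "JJLYONS"))
print(export_dir_name(date(2018, 8, 16), "AJH01", "JJLYONS", batch=1))
```

Generating the name from a date object rather than typing it by hand keeps the date portion consistent across batches of the same export.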

3. Monitor the Export job

  • Management → Maintenance → Monitor

4. Save Export job log once job is complete

  • Search for Job log
  • Management → Maintenance → Jobs List
    • Job Name = Export Digital Entities
    • Admin Unit = Partner whose objects are being exported
    • Job Status = Completed
  • Click eye icon under “Action” on the far left
  • In the pop up window, choose “Log”
  • Click eye icon under “Action” on the far left
  • Copy and paste the contents of the log into a text document and save it with the name of the export folder.
    • [YYYYMMDD]-[PartnerCode]-[DescriptiveText].txt
  • Save file on Rosetta server at [PartnerCode]/migration/migration_logs
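
As a small illustration of where the saved log ends up, the sketch below derives the log file path from the partner code and the export folder name. The helper name `migration_log_path` is hypothetical, and the path is relative to wherever the partner's Rosetta folder is mapped on your machine, which is not specified here.

```python
from pathlib import PurePosixPath


def migration_log_path(partner_code, export_folder_name):
    """Return [PartnerCode]/migration/migration_logs/[export folder name].txt."""
    return (
        PurePosixPath(partner_code)
        / "migration"
        / "migration_logs"
        / (export_folder_name + ".txt")
    )


print(migration_log_path("AJH01", "20180816-AJH01-JJLYONS"))
```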

5. Move DigiTool export to Rosetta server

  • Once the export is finished, locate [YYYYMMDD]-[PartnerCode]-[DescriptiveText] in the "DigiTool → Rosetta" tab of the Rosetta Migration Utility
  • Click submit
Migration utility 1.png

6. Write Migration Note in DigiTool

  • Management → Maintenance → Submit New Job
  • Choose Assign Control Attributes from list
  • Search for objects to update, click Next
  • Parameters for job:
    • Name = Partition C
    • Value = ROS_YYYY_MM

CSV Processing

Used for migrating non-complex objects from DigiTool to Rosetta.

Requirements

  • CSV template
  • An export folder from DigiTool that is organized into “digital_entities” and “streams” subfolders.
    • This organization is the default structure of DigiTool exports.
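
Before submitting, it can help to confirm that the export folder really has the expected layout. The following is a minimal sketch, assuming the two subfolder names quoted above; the function name `missing_subfolders` is hypothetical and is not part of the Rosetta Migration Utility.

```python
from pathlib import Path

# Subfolder names taken from the requirement above.
REQUIRED_SUBFOLDERS = ("digital_entities", "streams")


def missing_subfolders(export_folder):
    """Return the required subfolder names that are absent from export_folder."""
    root = Path(export_folder)
    return [name for name in REQUIRED_SUBFOLDERS if not (root / name).is_dir()]
```

An empty list means both subfolders are present; any names returned point to a folder that needs attention before processing.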

Process

1. Supply path to CSV template

2. Supply path to export folder

Migration 1.PNG

3. Submit

  • The log will provide the path for the deposit folder.
  • This process extracts the technical and administrative metadata exported by DigiTool and creates Metadata Update files. These files will be used to update the migrated objects' metadata once they're ingested into Rosetta.
  • An example of a successful DigiTool Migration process log can be viewed here.

METS Processing

Used for migrating complex objects from DigiTool to Rosetta.

This process takes significantly longer than the CSV workflow. It creates METS files that include technical, administrative, and structMap metadata.

Requirements

  • An export folder from DigiTool that is organized into “digital_entities” and “streams” subfolders.
    • This organization is the default structure of DigiTool exports.

Process

1. Supply path to export folder

Migration complex.PNG

2. Submit

  • The log will provide the path for the deposit folders.
  • An example of a successful DigiTool Migration process log can be viewed here.

ALEPH Reconciliation

1. Update MARCXML Batch.

  • Open the Rosetta Migration Utility and go to Tools --> Update MARCXML. Locate the MARCXML Batch.
Update marc.png
  • If these MARCXML records are associated with a parent collection, add the ALEPH System Number for that parent collection in the pop up window.
Coll num.PNG

2. Open MarcEdit. Go to File --> MARC Tools --> "MARC21XML => MARC"

Marc edit 1.png
  • Under Input File supply the path to the updated MARCXML batch file.
  • Under Output File supply a path for a new .MRC file.
    • Use the [YYYYMMDD]-[PartnerCode]-[DescriptiveText] naming convention, but make sure it is entirely in lower case. Aleph does not like capital letters in input file names.
  • Make sure UTF-8 is chosen as Default Character Encoding
Marc edit 2.png
  • Click Execute
  • After the MRC file is created, double-click it. This will open MarcEdit again and prompt you to create a human-readable MRK file.
Marc edit 3.PNG
  • Open the MRK file to double check the MARC records. Make sure no record has a bad leader. To check, run a "Find All" search for "=LDR" and look for "^" symbols in the results.
Marc edit 4.PNG
  • If there are bad leaders, contact the Rosetta Systems Administrator
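
Two of the checks in this step lend themselves to a short script: lower-casing the output file name and scanning the MRK file for leaders that contain "^". This is a rough sketch only. The function names are hypothetical, the lower-case ".mrc" extension is an assumption, and the leader check simply mirrors the manual "Find All" search described above rather than validating the leader in full.

```python
def mrc_output_name(export_folder_name):
    """Lower-case the folder name for use as the MRC file name."""
    # Assumption: the extension is written in lower case as well.
    return export_folder_name.lower() + ".mrc"


def find_bad_leaders(mrk_lines):
    """Return (line number, text) for every =LDR line that contains '^'."""
    bad = []
    for number, line in enumerate(mrk_lines, start=1):
        if line.startswith("=LDR") and "^" in line:
            bad.append((number, line.rstrip("\n")))
    return bad
```

To scan a file, pass an open handle, for example `find_bad_leaders(open(path, encoding="utf-8"))`. If the returned list is not empty, contact the Rosetta Systems Administrator as described above.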

3. Upload MRC file to the ALEPH server

  • /exlibris/aleph/u22_1/cjh01/scratch

4. Open Aleph and follow ~ THIS GUIDE ~ for adding records.

  • Write down the name of the log file created during the manage-18 job. This can be used in the next step as the input file.

5. Follow ~ THIS GUIDE ~ for downloading MARCXML from ALEPH.

6. Add ALEPH system numbers to deposit CSV

  • Open Command Prompt
  • cd into the directory containing add_aleph_sys.py
C:\> cd C:\Path\To\script

C:\Path\To\script> py -3.5
Python 3.5.0 on win32
>>> from add_aleph_sys import *
>>> csvfile = r'C:\Path\To\DepositCSV.csv'
>>> xmlfile = r'C:\Path\To\DownloadedAlephXML.xml'
>>> add_aleph_num_CSV(csvfile, xmlfile)
>>> 'C:\Path\To\EDITED_DepositCSV.csv'

Rosetta Ingest

1. Copy the [YYYYMMDD]-[PartnerCode]-[DescriptiveText]_deposit folder to the appropriate submissions folder.

  • For the simple Migration workflow (CSV)
    • submissions\[PartnerCode]\CSV
  • For the complex Migration workflow (METS)
    • submissions\[PartnerCode]\METS
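
The choice of destination can be expressed as a small lookup. This is only a sketch: the function name `submission_target` and the "simple"/"complex" labels are hypothetical, and the returned path is relative to wherever the submissions share is mapped.

```python
from pathlib import PureWindowsPath

# Workflow label -> subfolder under submissions\[PartnerCode].
WORKFLOW_SUBFOLDERS = {"simple": "CSV", "complex": "METS"}


def submission_target(partner_code, workflow, deposit_folder_name):
    """Return submissions\\[PartnerCode]\\CSV or ...\\METS plus the deposit folder."""
    subfolder = WORKFLOW_SUBFOLDERS[workflow]  # raises KeyError for unknown labels
    return PureWindowsPath("submissions") / partner_code / subfolder / deposit_folder_name


print(submission_target("AJH01", "simple", "20180816-AJH01-JJLYONS_deposit"))
```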