New Deposits

From CJH Wiki

The New Deposits Module contains workflows for processing new deposits for ingest into Rosetta. It can be accessed by going to the "Modules" menu and clicking "New Deposits."

Organizing

Requirements

  • Folder with subfolders of files.
- An example of this folder structure can be viewed here
  • Knowledge of master formats represented in subfolders.
- If unsure of the formats represented in the subfolders, use Count Formats located in the Tools menu. This tool will count all of the formats in a folder and its subfolders.

Process

  • Supply path to folder with subfolders of files
  • Select the master formats represented in the subfolders. You can select multiple.
- XML files are automatically added to a 'supplementary' folder.
- Formats not chosen will be added to the 'access' folder.


Organize 2.PNG


  • Submit
- An example of a successful Organize log can be viewed here
- An example of the folder structure after the Organize process can be viewed here
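The sorting rules above can be sketched in a few lines of Python. This is only an illustration of the rules as described, not the Processor's own code: the function names `target_folder` and `organize` are hypothetical, and identifying a format by its file extension is an assumption.

```python
import shutil
from pathlib import Path


def target_folder(filename: str, master_formats: set[str]) -> str:
    """Return the sub-subfolder a file belongs in under the rules above.

    master_formats holds lowercase extensions without the dot, e.g. {"tif"}.
    """
    ext = Path(filename).suffix.lower().lstrip(".")
    if ext == "xml":
        return "supplementary"  # XML files are always set aside
    if ext in master_formats:
        return "master"         # formats the user selected as masters
    return "access"             # every format that was not selected


def organize(root: str, master_formats: set[str]) -> None:
    """Sort the files in each subfolder of root (hypothetical helper)."""
    formats = {f.lower().lstrip(".") for f in master_formats}
    for subfolder in Path(root).iterdir():
        if not subfolder.is_dir():
            continue
        # Snapshot the listing first, because files are moved while looping.
        for item in list(subfolder.iterdir()):
            if item.is_file():
                dest = subfolder / target_folder(item.name, formats)
                dest.mkdir(exist_ok=True)
                shutil.move(str(item), str(dest / item.name))
```

Keeping the decision in `target_folder` separate from the file moves makes it possible to preview where each file would go before anything is changed on disk.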

Processing

Requirements Glossary

MARCXML Batch

A MARCXML batch is an XML file that includes multiple MARCXML records. ALEPH exports multiple records in this batch format. The MARCXML batch can be used to generate the CSV Key that is required for all New Deposit processes.

Individual MARCXML records can be merged into a MARCXML batch using the Rosetta Deposit Processor. Go to Tools -> New Deposit Tools -> Create MARCXML Batch. You will be prompted to choose a folder of individual MARCXML records.
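For readers curious what such a merge involves, here is a minimal sketch using Python's standard library. It is not the tool's actual implementation. It assumes each input file uses the standard MARC21 slim namespace and that the batch is a single `collection` element wrapping the records; the function name `merge_marcxml` is hypothetical.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

MARC_NS = "http://www.loc.gov/MARC21/slim"
ET.register_namespace("marc", MARC_NS)


def merge_marcxml(folder: str, out_path: str) -> int:
    """Wrap every <record> found in folder/*.xml in one <collection>.

    Returns the number of records written.
    """
    collection = ET.Element(f"{{{MARC_NS}}}collection")
    for xml_file in sorted(Path(folder).glob("*.xml")):
        root = ET.parse(xml_file).getroot()
        # iter() also yields the root itself, so a file whose root element
        # is a bare <record> is handled the same as one with a wrapper.
        for record in root.iter(f"{{{MARC_NS}}}record"):
            collection.append(record)
    ET.ElementTree(collection).write(
        out_path, encoding="utf-8", xml_declaration=True
    )
    return len(collection)
```

Write the output outside the input folder, or give it a non-.xml name, so that a second run does not pick up the previous batch as an input.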

CSV Key

A CSV Key is needed for ALL New Deposit processes. The key associates a folder of stream files with its system number and/or metadata file. The bulk of a CSV Key can be generated from a MARCXML file* or EAD file. Once the key is generated, the ENTITY column must be edited to include the names of folders (or, in the Dublin Core Process, filenames) associated with the system number and/or metadata file.

If the CSV Key is generated from an ArchivesSpace EAD or an Aleph MARC record, it should include the Partner Code in the PARTNER column. This Partner Code will automatically be included in the deposit CSV / DC / METS file.

* MARC records in the MARCXML Batch file MUST have a system number in the 001 controlfield to be added to the CSV Key.
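The sketch below shows roughly how a key could be derived from a MARCXML batch, including the 001 rule. It is an illustration under stated assumptions, not the Processor's generator. ENTITY, LABEL and PARTNER are column names taken from this page. SYSTEM_NUMBER is a placeholder column name, and filling LABEL from 245 $a is a guess made for the example. ENTITY and PARTNER are left blank here.

```python
import csv
import xml.etree.ElementTree as ET

NS = {"marc": "http://www.loc.gov/MARC21/slim"}
# SYSTEM_NUMBER is a placeholder name chosen for this sketch.
COLUMNS = ["ENTITY", "SYSTEM_NUMBER", "LABEL", "PARTNER"]


def key_rows(batch_xml: str) -> list[dict]:
    """Build one key row per record that has an 001 controlfield."""
    rows = []
    root = ET.fromstring(batch_xml)
    for record in root.iter("{http://www.loc.gov/MARC21/slim}record"):
        sysno = record.findtext(
            "marc:controlfield[@tag='001']", default="", namespaces=NS
        ).strip()
        if not sysno:
            continue  # records without a system number are left out
        # Assumption: the label comes from the title proper (245 $a).
        title = record.findtext(
            "marc:datafield[@tag='245']/marc:subfield[@code='a']",
            default="",
            namespaces=NS,
        ).strip()
        rows.append(
            {"ENTITY": "", "SYSTEM_NUMBER": sysno, "LABEL": title, "PARTNER": ""}
        )
    return rows


def write_key(rows: list[dict], handle) -> None:
    """Write the rows as CSV to an open text file handle."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
```

The blank ENTITY column in the output is what the user then edits by hand, as described above.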

Partner Codes

Partner Codes are abbreviations of each Partner's name. They are holdovers from DigiTool. Partner Codes are used in the submission folder filepath and to build collections in Rosetta.

  • AJH01 = American Jewish Historical Society
  • ASF01 = American Sephardi Federation
  • LBI01 = Leo Baeck Institute
  • YIV01 = YIVO Institute for Jewish Research
  • YUM01 = Yeshiva University Museum
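If you check a CSV Key's PARTNER column in a script of your own, the list above maps naturally onto a lookup table. The helper name `partner_name` is hypothetical, and accepting lowercase or padded input is a convenience added for this sketch.

```python
PARTNER_CODES = {
    "AJH01": "American Jewish Historical Society",
    "ASF01": "American Sephardi Federation",
    "LBI01": "Leo Baeck Institute",
    "YIV01": "YIVO Institute for Jewish Research",
    "YUM01": "Yeshiva University Museum",
}


def partner_name(code: str) -> str:
    """Return the partner's full name, or raise ValueError if unknown."""
    try:
        return PARTNER_CODES[code.strip().upper()]
    except KeyError:
        raise ValueError(f"Unknown partner code: {code!r}") from None
```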

To generate a CSV Key, go to Tools -> New Deposit Tools.

CSV Process

Process Requirements

  • CSV Key
- CSV Key must have values in the ENTITY and LABEL columns.
- Derived from MARCXML Batch File using Tools --> New Deposit Tools --> Generate CSV Key from MARC
- Derived from exported ArchivesSpace EAD file using Tools --> New Deposit Tools --> Generate CSV Key from EAD
  • Folder with subfolders of files where the subfolders are further organized into access and master sub-subfolders.
- Subfolders can be organized into ‘master’ and ‘access’ sub-subfolders using the Organize tab.
- Supplemental files (e.g. indices, manifests, etc.) should be manually placed in a sub-subfolder entitled ‘supplement’.
- An example of this folder structure can be viewed here.
  • Deposit template
- Current template version is available for download here.
- A default deposit template path can be set in File --> Settings.

Process

  • Edit the ENTITY column in the CSV Key to include the name (not the path) of each folder in "streams"
  • Supply path to CSV Key
  • Supply path to folder with subfolders of files
  • Supply path to deposit template
- A default deposit template path can be set in File --> Settings.


CSV fields.PNG


  • Submit
- The log will provide the path for the deposit folder. This folder can be copied to the appropriate submissions folder for Rosetta ingest.
- An example of a successful CSV process log can be viewed here.
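Before submitting, it can help to confirm that the key and the folder agree. The following pre-flight check is a sketch only and is not part of the Processor. The function name `check_csv_inputs` is hypothetical, and treating both 'master' and 'access' as mandatory for every entity is an assumption based on the requirements listed above.

```python
import csv
from pathlib import Path


def check_csv_inputs(key_path: str, streams_dir: str) -> list[str]:
    """Return a list of problems; an empty list means the inputs look usable."""
    problems = []
    # utf-8-sig tolerates the byte-order mark that Excel adds to CSV exports.
    with open(key_path, newline="", encoding="utf-8-sig") as handle:
        # Data rows start on line 2 of the file, after the header.
        for line_no, row in enumerate(csv.DictReader(handle), start=2):
            entity = (row.get("ENTITY") or "").strip()
            label = (row.get("LABEL") or "").strip()
            if not entity or not label:
                problems.append(f"row {line_no}: ENTITY and LABEL are both required")
                continue
            folder = Path(streams_dir) / entity
            if not folder.is_dir():
                problems.append(f"row {line_no}: no subfolder named {entity!r}")
                continue
            for required in ("master", "access"):
                if not (folder / required).is_dir():
                    problems.append(
                        f"row {line_no}: {entity!r} has no {required!r} folder"
                    )
    return problems
```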

Dublin Core Process

Use this process for non-complex entities that each have their own MARC record or EAD file-level description and have no derivatives.

NOTE: Derivatives can be created once preservation masters are ingested in Rosetta. For ingests containing both masters and derivatives (access copies), use the CSV or METS workflows.

Process Requirements

  • Folder of non-complex entities (i.e. entities that are only one file)
  • CSV Key
- Derived from MARCXML Batch File using Tools --> New Deposit Tools --> Generate CSV Key from MARC
- Derived from exported ArchivesSpace EAD file using Tools --> New Deposit Tools --> Generate CSV Key from EAD
- ENTITY values are individual filenames, NOT folder names for the DC process.

Process

  • Edit the ENTITY column in the CSV Key to include filenames
  • Supply path to CSV Key
  • Supply path to folder of non-complex entities


DC fields.PNG


  • Submit
- The log will provide the path for the deposit folder. This folder can be copied to the appropriate submissions folder for Rosetta ingest.
- An example of a successful Dublin Core process log can be viewed here.
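Because ENTITY values are filenames in this process, a simple two-way comparison between the key and the folder catches most mismatches. This is an illustrative sketch, and `compare_key_to_folder` is a hypothetical name.

```python
from pathlib import Path


def compare_key_to_folder(entities: list[str], folder: str) -> tuple[list[str], list[str]]:
    """Compare ENTITY filenames from the key with the files on disk.

    Returns (listed_but_missing, present_but_unlisted), both sorted.
    """
    on_disk = {p.name for p in Path(folder).iterdir() if p.is_file()}
    listed = {e.strip() for e in entities if e.strip()}
    return sorted(listed - on_disk), sorted(on_disk - listed)
```

Both lists should be empty before submitting. A name in the first list usually means a typo in the key or a path entered where a filename was expected.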

METS Process

Use this process for complex objects that need a nested structMap.

Process Requirements

  • CSV Key
- Derived from MARCXML Batch File using Tools --> New Deposit Tools --> Generate CSV Key from MARC
- Derived from exported ArchivesSpace EAD file using Tools --> New Deposit Tools --> Generate CSV Key from EAD
  • Folder with subfolders of files where the subfolders are further organized into access and master sub-subfolders.
- Subfolders can be organized into ‘master’ and ‘access’ sub-subfolders using the Organize tab.
- Supplemental files (e.g. indices, manifests, etc.) should be manually placed in a sub-subfolder entitled ‘supplement’.
- An example of this folder structure can be viewed here.

Process

  • Edit the ENTITY column in the CSV Key to include stream folders
  • Supply path to CSV Key
  • Supply path to folder with subfolders of files


METS fields.PNG


  • Submit
- The log will provide the path for the METS deposits. These deposit folders can be copied to the appropriate submissions folder for Rosetta ingest.
- An example of a successful METS process log can be viewed here.
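To show what "nested" means here, the sketch below builds a METS structMap whose div elements mirror a nested Python dict. It covers only the general shape of a METS structMap. How the Processor derives the nesting, and any further requirements of Rosetta's METS profile, are not covered by this page and are not represented. The function names and the `TYPE="LOGICAL"` value are choices made for the example.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def add_divs(parent: ET.Element, tree: dict) -> None:
    """Recursively turn a nested dict into nested <mets:div> elements."""
    for label, child in tree.items():
        div = ET.SubElement(parent, f"{{{METS_NS}}}div", LABEL=label)
        if isinstance(child, dict):
            add_divs(div, child)  # a grouping level: recurse into it
        else:
            # A leaf: point at a file by its ID in the METS fileSec.
            ET.SubElement(div, f"{{{METS_NS}}}fptr", FILEID=child)


def build_struct_map(tree: dict) -> ET.Element:
    """Build a structMap; tree should have exactly one top-level key,
    because METS allows a single root div per structMap."""
    struct_map = ET.Element(f"{{{METS_NS}}}structMap", TYPE="LOGICAL")
    add_divs(struct_map, tree)
    return struct_map
```

For example, `build_struct_map({"Album": {"Side A": {"Track 1": "FL1"}, "Cover": "FL2"}})` produces a root div "Album" that contains a nested "Side A" div and a sibling "Cover" div.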

Synchronize

The Synchronize tab is new with Version 2.0. It replaces the SIP Status module and the Add DAO tab. DAO linking is now rolled into the synchronization job. To learn how to check on a SIP's status using the Rosetta software, see this tutorial.

Sync.png

Requirements

  • Internet Access
  • Rosetta IE contains either an Aleph ID in the "Identifier (DC)" field or an ArchivesSpace Archival Object ID in the "Identifier - Archivesspace (DC)" field.
  • IE PID or SIP ID or IE CSV

Process

The Synchronize tab will synchronize an IE or set of IEs with either Aleph or ArchivesSpace. If the SIP is put together with a MARCXML file exported from Aleph, then each IE from the SIP should automatically contain an Aleph ID. Similarly, if the SIP is put together using an EAD exported from ArchivesSpace, then each IE from the SIP should automatically contain an Archival Object ID.

The Synchronize tab will do the following when supplied with an IE or set of IEs:

1. Gather metadata from Rosetta for each IE.
2. Look through the IE metadata for an Aleph and/or ArchivesSpace ID.
- If an ArchivesSpace ID is found, the Processor will use the ArchivesSpace API to check whether a link is associated with the ArchivesSpace ID. If no link is present, it will create a Digital Object in ArchivesSpace with the Rosetta IE link and associate that Digital Object with the appropriate Archival Object.
- The default ArchivesSpace link caption is "View Online." To customize that caption, use the IE CSV input method.
3. Update Rosetta IE with Aleph -> DC or ArchivesSpace -> DC metadata.
- The Processor prioritizes Aleph IDs over ArchivesSpace IDs for metadata synchronization. However, if an IE has both an Aleph ID and an ArchivesSpace ID, the Processor will still check whether the Archival Object in ArchivesSpace needs a link.
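The priority rule in the steps above is easy to misread, so here it is restated as a small decision function. This models only the logic described on this page and makes no calls to Rosetta, Aleph or ArchivesSpace. `SyncPlan` and `plan_sync` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SyncPlan:
    metadata_source: Optional[str]  # "aleph", "archivesspace", or None
    create_aspace_link: bool        # whether a Digital Object link is needed


def plan_sync(
    aleph_id: Optional[str],
    aspace_id: Optional[str],
    aspace_has_link: bool,
) -> SyncPlan:
    """Decide what a synchronization job should do for one IE."""
    # Metadata: an Aleph ID wins whenever one is present.
    if aleph_id:
        source = "aleph"
    elif aspace_id:
        source = "archivesspace"
    else:
        source = None
    # Linking: checked independently of which source supplies metadata.
    create_link = bool(aspace_id) and not aspace_has_link
    return SyncPlan(source, create_link)
```

For an IE with both IDs and no existing link, `plan_sync("000123", "/repositories/2/archival_objects/45", False)` returns a plan that takes metadata from Aleph and still creates the ArchivesSpace link. The ID values in that call are made up for the example.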

IE CSV

Users can synchronize IE or SIP numbers one at a time by entering them in the IE PID and SIP ID input fields, respectively. These synchronization jobs will use the default "View Online" ArchivesSpace link caption. However, if a user would like to synchronize more than one IE PID or SIP ID at a time, they can put together a simple CSV.

Ie csv.png

The Processor will look in each row of the IE CSV for an IE PID first, then for a SIP ID. If there is a SIP ID, the Processor will use the Rosetta API to retrieve all of the IE PIDs for that SIP.
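The row-by-row lookup can be sketched as follows. This is an illustration rather than the Processor's code. The column headers "IE PID" and "SIP ID" are assumptions about the CSV layout, while "ASPACE CAPTION" is the caption column named in the export steps on this page. The `ies_for_sip` argument is a stand-in for the Rosetta API lookup, which is not reproduced here.

```python
import csv
import io
from typing import Callable

DEFAULT_CAPTION = "View Online"


def resolve_rows(
    csv_text: str, ies_for_sip: Callable[[str], list[str]]
) -> list[tuple[str, str]]:
    """Return one (ie_pid, caption) pair per IE to synchronize."""
    jobs = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # A blank or missing caption falls back to the default.
        caption = (row.get("ASPACE CAPTION") or "").strip() or DEFAULT_CAPTION
        ie_pid = (row.get("IE PID") or "").strip()
        sip_id = (row.get("SIP ID") or "").strip()
        if ie_pid:
            jobs.append((ie_pid, caption))  # an IE PID is used first
        elif sip_id:
            # Otherwise expand the SIP into all of its IEs.
            jobs.extend((pid, caption) for pid in ies_for_sip(sip_id))
    return jobs
```

Passing the SIP lookup in as a function keeps the sketch runnable without network access, since any callable that returns a list of IE PIDs will do.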

Exporting an IE CSV

Users can either put this CSV together from scratch or export an IE CSV from Rosetta:

1. Export IE information as a CSV from Rosetta.
Ie csv export 1.png


2. Add a custom ASPACE CAPTION column if desired.
Ie csv export 2.png


3. Supply IE CSV path
Ie csv export 3.png