Metadata and Searching

From CJH Wiki
Latest revision as of 14:43, 24 July 2024

Overview

The Rosetta Digital Asset Management System uses a few different types of metadata for its assets. Most of the descriptive metadata is Dublin Core, which is sourced from Aleph and ArchivesSpace. These source records must be crosswalked into Rosetta Dublin Core using code defined by the CJH Metadata Lab. This synchronization between systems means that CJH staff should rarely have to edit Dublin Core records in Rosetta, unless it is to change or add an Aleph/ArchivesSpace identifier.

Searching in Rosetta

Searching in Rosetta is available on the three levels of the Rosetta data model: Intellectual Entity (IE), Representation (REP), and File (FILE).

TIP
To include both FILE and IE information in a search, search on the FILE level and customize your search columns in the Columns tab to include IE-level information.

Here is how that data model breaks down for an example IE (https://digipres.cjh.org/delivery/DeliveryManagerServlet?dps_pid=IE1269246).

  • The IE level is for Intellectual Entities. Example: "Photographs"
  • The REP level is for representations of Intellectual Entities. Examples: the Modified Master viewable to patrons, and the Preservation Master viewable to staff
  • The FILE level is for files that make up Intellectual Entities. Example: 3124174_la-ar25060-b01-f07.pdf

You can search across levels, as well. For example, you can search for the File Extension "PDF" on the IE level, and it will bring up any IE that has a PDF File associated with it.
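As a rough illustration of why a File-level field can be used in an IE-level search, the three levels can be modeled as nested objects. This is a minimal sketch, not Rosetta's actual schema or API: the class names, the FILE PIDs, the second IE, and the preservation type strings are all hypothetical and exist only for the example.

```python
from dataclasses import dataclass, field


@dataclass
class File:
    pid: str
    original_name: str

    @property
    def extension(self) -> str:
        # Text after the last dot, lower-cased; empty string if there is no dot.
        _, dot, ext = self.original_name.rpartition(".")
        return ext.lower() if dot else ""


@dataclass
class Representation:
    preservation_type: str
    files: list[File] = field(default_factory=list)


@dataclass
class IntellectualEntity:
    pid: str
    title: str
    representations: list[Representation] = field(default_factory=list)


def ies_with_extension(ies, extension):
    """Return the IEs that have at least one FILE with the given extension."""
    wanted = extension.lower()
    return [
        ie
        for ie in ies
        if any(f.extension == wanted for rep in ie.representations for f in rep.files)
    ]


# Hypothetical sample data: PIDs other than IE1269246 are made up.
example_ies = [
    IntellectualEntity(
        pid="IE1269246",
        title="Photographs",
        representations=[
            Representation(
                preservation_type="MODIFIED_MASTER",
                files=[File("FL0000001", "3124174_la-ar25060-b01-f07.pdf")],
            )
        ],
    ),
    IntellectualEntity(
        pid="IE0000002",
        title="Hypothetical scans",
        representations=[
            Representation(
                preservation_type="PRESERVATION_MASTER",
                files=[File("FL0000002", "scan_001.tif")],
            )
        ],
    ),
]

pdf_ies = ies_with_extension(example_ies, "PDF")
print([ie.pid for ie in pdf_ies])
```

The search walks down from each IE through its REPs to its FILEs, which mirrors the idea that an IE-level search for the File Extension "PDF" returns every IE with an associated PDF.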

You can find search fields by typing any or all of the field name in the field search bar.

[Image: Find field.png]

Add and subtract fields using the green plus and red minus icons, respectively. Be sure to define whether you're searching for "ALL" fields or "ANY" fields.

For example, the following query looks for any LBI IEs with ArchivesSpace identifiers that were ingested after 01/01/2020:

[Image: Search example.png]
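The difference between "ALL" and "ANY" can be sketched in a few lines of Python. This is only an illustration of the matching logic, not how Rosetta implements search: the record keys (`partner`, `aspace_id`, `created`), the sample records, and the use of a creation date to stand in for "ingested after" are all assumptions.

```python
from datetime import date


def matches(record, conditions, mode="ALL"):
    """Check a record against a list of conditions.

    mode="ALL" requires every condition to hold;
    mode="ANY" requires at least one condition to hold.
    """
    if mode not in ("ALL", "ANY"):
        raise ValueError("mode must be 'ALL' or 'ANY'")
    combine = all if mode == "ALL" else any
    return combine(condition(record) for condition in conditions)


# Hypothetical records standing in for IEs.
records = [
    {"pid": "IE_A", "partner": "LBI", "aspace_id": "abc123", "created": date(2021, 3, 15)},
    {"pid": "IE_B", "partner": "LBI", "aspace_id": "", "created": date(2019, 6, 1)},
    {"pid": "IE_C", "partner": "OTHER", "aspace_id": "", "created": date(2018, 1, 1)},
]

# The three conditions from the example query above.
conditions = [
    lambda r: r["partner"] == "LBI",
    lambda r: bool(r["aspace_id"]),
    lambda r: r["created"] > date(2020, 1, 1),
]

all_hits = [r["pid"] for r in records if matches(r, conditions, "ALL")]
any_hits = [r["pid"] for r in records if matches(r, conditions, "ANY")]
print(all_hits, any_hits)
```

With "ALL", only the record that satisfies every condition is returned; with "ANY", a record that satisfies even one condition is returned, so the result set is usually much larger.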

Rosetta lets users search on a huge list of possible metadata fields. Many of them are technical metadata fields that we do not include in our descriptive metadata. Unfortunately, this master list of search fields cannot be edited, and the majority of fields on the list are not in use. The Search Glossary below includes the fields that are commonly used and can be searched on.

Search Glossary

Access Rights Policy ID (IE)

  • Search for IEs based on their Access Policy ID code. Options provided.

Collections

  • Search for IEs based on their Collection in the Collection Management module. These collections were created programmatically from MARC records and will need significant data cleanup in the future. Options provided.

Contributor - Deposit Agent (DC)

  • The username of the staff member who prepared the object for ingest. Free text.

Deposit ID

  • Deposit ID for an ingest activity. Free text.

FILE - Identifier - DTLPID (DC)

  • DigiTool PID associated with a FILE. Free text.

File Extension

  • Extension of a file. Options provided.

File Label

  • Label for a file in the IE's Struct Map (left-hand side of the IE viewer). Free text.

File Original Name

  • The filename of a FILE. Free text.

File PID

  • The Rosetta FILE PID. Free text.

File Size Bytes

  • The size of a FILE. Free text.

IE Creation Date

  • The date the IE was created in Rosetta. Choose date from calendar.

IE Modification Date

  • The date the IE was last modified in Rosetta. Choose date from calendar.

IE PID

  • The Intellectual Entity (IE) PID. Free text.

Identifier (DC)

  • The Dublin Core identifier associated with an IE. Often used for the Aleph identifier. Free text.

Identifier - Archivesspace (DC)

  • A qualified Dublin Core Identifier for the ArchivesSpace Archival Object Ref ID associated with an IE. Free text.

Identifier - DTLPID (DC)

  • A qualified Dublin Core Identifier for the DigiTool PID associated with a FILE. Free text.

Is Part Of (DCTERMS)

  • A DCTERMS field often used by the ArchivesSpace crosswalk to show the IE's relationship with an ArchivesSpace Resource. Free text.

Partner Name

  • The partner to whom the IE belongs. Options provided.

Preservation Type

  • The preservation type of a REP associated with an IE. Options provided.

Relation (DC)

  • A Dublin Core field often used by the Aleph MARC crosswalk for Finding Aid or Collection information. Free text.

SIP ID

  • The ID for a Submission Information Package, which often includes multiple IEs. Free text.

Source (DC)

  • A Dublin Core field often used by the ArchivesSpace crosswalk to store Box/Folder/Item numbers. Free text.

Title (DC)

  • A Dublin Core field for storing the title of an IE. Free text.

Metadata Mapping

Many of the above fields are mapped from Aleph or ArchivesSpace records. Here are the mappings for each source record.

Aleph MARC to DC

The Aleph MARC to DC crosswalk takes MARC XML furnished by the Aleph REST API and transforms it into Dublin Core using an XSLT stylesheet.

Aleph MARC                     | IE Dublin Core
210, 245, 246 (all subfields)  | Title
100a, 110a                     | Creator
260c, 264c                     | Date
500a                           | Description
520ab                          | Description
545ab                          | Description
555au                          | Relation
700ae                          | Contributor
710ae                          | Contributor
773ao                          | Relation
09x                            | Identifier
001                            | Identifier
506a                           | Rights
540a                           | Rights
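To make the mapping concrete, here is a minimal Python sketch that applies the same rules to a MARC XML record. It is not the CJH stylesheet: the real crosswalk is written in XSLT, and this version swaps in Python's standard `xml.etree.ElementTree` purely for illustration. The sample record, the choice to join subfields with a space, and the treatment of 09x as "all subfields" are assumptions.

```python
import xml.etree.ElementTree as ET

# MARCXML namespace in ElementTree's "{uri}tag" form.
MARC = "{http://www.loc.gov/MARC21/slim}"

# (MARC tags, subfield codes or None for "all subfields", DC element)
RULES = [
    (("210", "245", "246"), None, "title"),
    (("100", "110"), "a", "creator"),
    (("260", "264"), "c", "date"),
    (("500",), "a", "description"),
    (("520", "545"), "ab", "description"),
    (("555",), "au", "relation"),
    (("700", "710"), "ae", "contributor"),
    (("773",), "ao", "relation"),
    (("506", "540"), "a", "rights"),
]


def rule_for(tag):
    """Return (subfield codes, DC element) for a datafield tag, or None."""
    if tag.startswith("09"):
        # Assumption: 09x fields contribute all of their subfields.
        return None, "identifier"
    for tags, codes, element in RULES:
        if tag in tags:
            return codes, element
    return None


def marc_to_dc(marcxml):
    """Map one MARCXML record to a list of (DC element, value) pairs."""
    root = ET.fromstring(marcxml.strip())
    pairs = []
    for cf in root.iter(MARC + "controlfield"):
        if cf.get("tag") == "001" and cf.text:
            pairs.append(("identifier", cf.text.strip()))
    for df in root.iter(MARC + "datafield"):
        rule = rule_for(df.get("tag", ""))
        if rule is None:
            continue  # tag is not part of the crosswalk
        codes, element = rule
        parts = [
            sf.text.strip()
            for sf in df.findall(MARC + "subfield")
            if sf.text and (codes is None or sf.get("code") in codes)
        ]
        if parts:
            pairs.append((element, " ".join(parts)))
    return pairs


# Hypothetical record, invented for this example.
SAMPLE = """
<record xmlns="http://www.loc.gov/MARC21/slim">
  <controlfield tag="001">000123456</controlfield>
  <datafield tag="099" ind1=" " ind2=" "><subfield code="a">AR 00000</subfield></datafield>
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Example, Author</subfield>
    <subfield code="d">1900-1980</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Example collection</subfield>
    <subfield code="f">1920-1950</subfield>
  </datafield>
  <datafield tag="650" ind1=" " ind2="0"><subfield code="a">Unmapped subject</subfield></datafield>
</record>
"""

dc_pairs = marc_to_dc(SAMPLE)
for element, value in dc_pairs:
    print(f"dc:{element} = {value}")
```

In this sketch the 100 field contributes only subfield a, the 245 field contributes every subfield, and the unmapped 650 field is ignored, which follows the table above.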

ArchivesSpace Metadata to Rosetta Dublin Core

The ArchivesSpace to Rosetta DC crosswalk uses the ArchivesSpace API, drawing on both Archival Object and Resource metadata to create the Rosetta Dublin Core record.

ArchivesSpace                            | IE Dublin Core
Title (Archival Object)                  | Title
Title (Resource)                         | isPartOf
Call Number (Resource)                   | Identifier
Container Information (Archival Object)  | Source
Language (Archival Object)               | Language
Ref ID (Archival Object)                 | Identifier - Archivesspace
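The table above can be read as a function of two inputs, one Archival Object and its parent Resource. The following Python sketch shows that shape only. It does not call the ArchivesSpace API, and the dictionary keys (`title`, `container`, `language`, `ref_id`, `call_number`) are a hypothetical flattened form, not the real ArchivesSpace JSON schema. The sample values are invented.

```python
def aspace_to_dc(archival_object, resource):
    """Map simplified ArchivesSpace metadata to (DC field, value) pairs.

    Both arguments are plain dicts with hypothetical keys; empty or
    missing values are skipped.
    """
    candidates = [
        ("Title", archival_object.get("title")),
        ("isPartOf", resource.get("title")),
        ("Identifier", resource.get("call_number")),
        ("Source", archival_object.get("container")),
        ("Language", archival_object.get("language")),
        ("Identifier - Archivesspace", archival_object.get("ref_id")),
    ]
    return [(dc_field, value) for dc_field, value in candidates if value]


# Hypothetical inputs, invented for this example.
example_object = {
    "title": "Photographs",
    "container": "Box 1, Folder 7",
    "language": "",
    "ref_id": "0123456789abcdef",
}
example_resource = {
    "title": "Example Family Collection",
    "call_number": "AR 00000",
}

aspace_pairs = aspace_to_dc(example_object, example_resource)
for dc_field, value in aspace_pairs:
    print(f"{dc_field}: {value}")
```

Note that the Title comes from the Archival Object while isPartOf comes from the Resource, which is why the crosswalk needs both records to build one Dublin Core record.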