How to find your OAI Sets and Records
Find your base harvesting URL. Contact your systems manager if you need help finding your base harvesting URL.
Your URL may look something like this: http://cdm16007.contentdm.oclc.org/oai/oai.php
To view all sets that are available for harvest, add “?verb=ListSets” to the end of your base harvesting URL.
Your link is going to look similar to this: http://cdm16007.contentdm.oclc.org/oai/oai.php?verb=ListSets
This page will display all of your sets in XML. Take note of the setSpec values for the next step.
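The ListSets step above can also be scripted. The following is a minimal Python sketch, not part of the original instructions: it builds the ListSets URL from a base harvesting URL and reads the setSpec and setName values out of the XML response. The function names and the sample response are illustrative assumptions (the second set in the sample is made up), and the sketch assumes the feed uses the standard OAI-PMH 2.0 namespace.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

# XML namespace used by standard OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_sets_url(base_url):
    """Append the ListSets verb to a base harvesting URL."""
    return base_url + "?" + urlencode({"verb": "ListSets"})


def fetch(url):
    """Download a response as raw bytes (requires network access)."""
    with urlopen(url) as response:
        return response.read()


def extract_sets(xml_data):
    """Return a list of (setSpec, setName) pairs from a ListSets response."""
    root = ET.fromstring(xml_data)
    sets = []
    for set_el in root.iterfind(".//oai:set", OAI_NS):
        spec = set_el.findtext("oai:setSpec", default="", namespaces=OAI_NS)
        name = set_el.findtext("oai:setName", default="", namespaces=OAI_NS)
        sets.append((spec, name))
    return sets


# Illustrative response only; the second set is hypothetical.
SAMPLE_LIST_SETS = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListSets>
    <set>
      <setSpec>p267401cdi</setSpec>
      <setName>State Library of Ohio Rare Books Collection</setName>
    </set>
    <set>
      <setSpec>example_set</setSpec>
      <setName>Example Collection</setName>
    </set>
  </ListSets>
</OAI-PMH>
"""

for spec, name in extract_sets(SAMPLE_LIST_SETS):
    print(spec, "-", name)
```

With network access, passing the result of fetch(list_sets_url(your_base_url)) to extract_sets should list the sets from a live feed in the same way.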
To view the metadata records of a specific set, you will need to decide which metadata schema you want to view your records in and choose a setSpec from the previous step. In this example we will view the records in Qualified Dublin Core and view the State Library of Ohio Rare Books Collection, for which the setSpec value is “p267401cdi”.
Add the following string to your base harvesting URL: “?verb=ListRecords&set=[setSpec chosen above]&metadataPrefix=oai_qdc”
The URL is now http://cdm16007.contentdm.oclc.org/oai/oai.php?verb=ListRecords&set=p267401cdi&metadataPrefix=oai_qdc
This link will display the metadata as XML. You can see how each of your metadata values is mapped to the schema you chose. Review a few records to make sure that the metadata is displaying as you expect.
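Reviewing records can be easier with a short script. The sketch below is a hedged Python example, not part of the original instructions: it builds the ListRecords URL shown above and summarizes each record as an identifier plus its field names and values. The function names, the record identifier, and the container namespace in the sample are placeholders chosen for illustration; a real feed will use its own values.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# XML namespace used by standard OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, set_spec, metadata_prefix="oai_qdc"):
    """Build a ListRecords URL for one set and one metadata schema."""
    query = urlencode(
        {"verb": "ListRecords", "set": set_spec, "metadataPrefix": metadata_prefix}
    )
    return base_url + "?" + query


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def summarize_records(xml_data):
    """Return (identifier, {field name: [values]}) for each record."""
    root = ET.fromstring(xml_data)
    summaries = []
    for record in root.iterfind(".//oai:record", OAI_NS):
        identifier = record.findtext(
            "oai:header/oai:identifier", default="", namespaces=OAI_NS
        )
        fields = {}
        metadata = record.find("oai:metadata", OAI_NS)
        if metadata is not None:
            # Collect every element under <metadata> that carries text.
            for element in metadata.iter():
                text = (element.text or "").strip()
                if text:
                    fields.setdefault(local_name(element.tag), []).append(text)
        summaries.append((identifier, fields))
    return summaries


# Illustrative response only; identifier and container namespace are placeholders.
SAMPLE_LIST_RECORDS = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:p267401cdi/1</identifier>
      </header>
      <metadata>
        <qdc xmlns="http://example.org/qdc-placeholder/"
             xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example title</dc:title>
          <dc:date>1850</dc:date>
        </qdc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""

print(list_records_url("http://cdm16007.contentdm.oclc.org/oai/oai.php", "p267401cdi"))
for identifier, fields in summarize_records(SAMPLE_LIST_RECORDS):
    print(identifier, fields)
```

Printing the field names per record gives a quick way to spot a value that landed in an unexpected element.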
Disable Page Level Metadata in OAI-PMH Feed
DPLA requires each item to be described in a single item-level record, not in individual page-level records. Please take a moment to review your OAI-PMH output settings.
If you are using CONTENTdm, please check the following settings:
CONTENTdm Administration > “Administration” tab > “Harvesting”
The “Enable compound object pages” option in the “OAI-PMH” section allows you to enable or disable this functionality. Disable this setting so that page-level records are not included in your OAI-PMH feed.
If you are using another Digital Asset Management System, please contact your server administrator to determine the settings in your system.
Removing Deleted Records From OAI-PMH Feed
It would be helpful during our initial DPLA setup of your collections if you could remove the references to deleted records in the OAI-PMH feed. This isn’t required, but it simplifies the QA process of your metadata.
If you are using a Digital Asset Management System other than CONTENTdm, please contact your server administrator to find out how to perform this task in your system.
CONTENTdm keeps track of records that are deleted and sends some basic information about them to any OAI-PMH harvesters (including us, in our role with the DPLA). To the best of our knowledge, this information is not used in any other context in CONTENTdm, so it should be safe to remove the references to these deleted records from the collection (with one caveat, explained below). You should still confirm with your CONTENTdm support person that this functionality has not changed recently.
It is not possible to clear these references using the GUI in CONTENTdm. You must have access to the back-end server so you can manually delete a file from the server. If your CONTENTdm server is hosted by OCLC, then you must send a request to the CONTENTdm Support/Hosting teams for this change to be implemented.
Each CONTENTdm collection must be processed separately. The file to be removed from the collection is: /index/description/delete.log
After you delete the “delete.log” file, it will be recreated the next time that you delete an item from this CONTENTdm collection. This won’t be a problem, as we simply wish to clear out the deleted references during the initial setup, when we’re looking most closely at your data in order to verify that our harvesting process is correct.
Removing “deleted” references from the OAI-PMH feed should only be problematic if you have another entity harvesting data from these collections and this other entity needs to know when records have been deleted from the collections. This information would then be used to keep that other entity’s records up-to-date. If you have no other harvesters that are doing this, then removal of the deleted references shouldn’t be a problem.
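To check whether a feed still contains references to deleted records, you can look for record headers marked with status="deleted", which is how the OAI-PMH protocol flags them. The following Python sketch is an illustration under stated assumptions, not part of the procedure above: the function name and the sample response (including its identifiers) are made up for the example.

```python
import xml.etree.ElementTree as ET

# XML namespace used by standard OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def deleted_identifiers(xml_data):
    """Return the identifiers of all headers flagged status="deleted"."""
    root = ET.fromstring(xml_data)
    deleted = []
    for header in root.iterfind(".//oai:header", OAI_NS):
        if header.get("status") == "deleted":
            identifier = header.findtext(
                "oai:identifier", default="", namespaces=OAI_NS
            )
            deleted.append(identifier)
    return deleted


# Illustrative response only; both identifiers are hypothetical.
SAMPLE_WITH_DELETED = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:p267401cdi/1</identifier>
      </header>
    </record>
    <record>
      <header status="deleted">
        <identifier>oai:example.org:p267401cdi/2</identifier>
      </header>
    </record>
  </ListRecords>
</OAI-PMH>
"""

found = deleted_identifiers(SAMPLE_WITH_DELETED)
print(len(found), "deleted record reference(s):", found)
```

If the deleted references have been cleared from a collection, a check like this should report none for that collection's feed.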