Advanced Systems Access
1. Machine Readable Collection Description
A machine may need to read or use the collection-level metadata. This is available as an XML feed, again by parameterising a URL with the relevant account identifier and the collection identifier:
http://www.archive-it.org/archiveit/feed.xml?accountId=ACCOUNTID&sc=COLLECTIONID
e.g. http://www.archive-it.org/archiveit/feed.xml?accountId=197&sc=938
For the University of Melbourne, our ACCOUNTID is 197 and the main collection is 938.
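As a minimal sketch, the feed URL can be assembled from the two identifiers as follows. The function name collection_feed_url and the constant FEED_BASE are illustrative names chosen here, not part of Archive-IT; only the URL pattern and the parameter names accountId and sc come from the text above.

```python
from urllib.parse import urlencode

# Feed address as given in the text above.
FEED_BASE = "http://www.archive-it.org/archiveit/feed.xml"


def collection_feed_url(account_id, collection_id):
    """Build the collection-level metadata feed URL (hypothetical helper)."""
    params = {"accountId": account_id, "sc": collection_id}
    return FEED_BASE + "?" + urlencode(params)


# University of Melbourne account (197) and its main collection (938).
print(collection_feed_url(197, 938))
```

The resulting string can be passed to any HTTP client or XML parser to retrieve the feed.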
2. OAI-PMH Data Provider
A web archive collection can be exposed to federated collection catalogue services through industry-standard protocols such as OAI-PMH. There is an OAI-PMH data provider built into Archive-IT, supporting collection metadata only. The OAI-PMH repository base URL is http://oai.archive-it.org:7090/oai
This service is designed for an OAI-PMH harvester, which can issue the following six types of requests, each appended to the base URL. All responses are in XML.
?verb=Identify
?verb=ListMetadataFormats
?verb=ListIdentifiers&metadataPrefix=oai_dc
    returns a list of all record identifiers (Archive-IT's are oai:archive-it.org:archiveit/[collectionid]) with date of last modification
?verb=ListRecords&metadataPrefix=oai_dc
    returns all of the metadata records
?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:archive-it.org:archiveit/938
    returns the metadata record for collection 938
?verb=ListSets
    returns nothing for now; very soon there will be institution-based "sets" so people can pull out all of the records for a given institution (via a "set=" argument to ListRecords above)
For more information on OAI-PMH please see Open Archives Initiative.
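The requests listed above can be sketched as a small URL builder. This is an illustration only: oai_request_url, record_identifier and OAI_VERBS are names introduced here, and the code assumes nothing about the provider beyond the base URL, verbs and arguments quoted in this section.

```python
from urllib.parse import urlencode

# Repository base URL as given in the text above.
OAI_BASE = "http://oai.archive-it.org:7090/oai"

# The six OAI-PMH request types a harvester can issue.
OAI_VERBS = {
    "Identify",
    "ListMetadataFormats",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
    "ListSets",
}


def record_identifier(collection_id):
    """Identifier pattern quoted above: oai:archive-it.org:archiveit/[collectionid]."""
    return f"oai:archive-it.org:archiveit/{collection_id}"


def oai_request_url(verb, **arguments):
    """Append an OAI-PMH verb and its arguments to the base URL (hypothetical helper)."""
    if verb not in OAI_VERBS:
        raise ValueError(f"not an OAI-PMH verb: {verb}")
    query = {"verb": verb, **arguments}
    # Keep ':' and '/' unescaped so identifiers read as they do in the text.
    return OAI_BASE + "?" + urlencode(query, safe=":/")


print(oai_request_url("Identify"))
print(oai_request_url("ListRecords", metadataPrefix="oai_dc"))
print(oai_request_url("GetRecord", metadataPrefix="oai_dc",
                      identifier=record_identifier(938)))
```

Rejecting unknown verbs locally is a design choice of this sketch; a harvester could equally send the request and handle the XML error response from the provider.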
3. Full Text Search by URL of Selected Archive Collection
Since Archive-IT also supports search and retrieve via URL (SRU), it is possible to construct a URL that both identifies a collection and contains a query string. The syntax is:
http://index.archive-it.org:8080/nutchwax/?query=QUERY+TERM&go=Search+Web+Archive&collection=COLLECTION_ID
where QUERY+TERM is the normalised form of the search query (with spaces written as +), and COLLECTION_ID is the identifier of the particular collection.
As an example, to construct the URL which corresponds to the earlier cases:
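A minimal sketch of assembling such a full text search URL, following the syntax given above. The helper name fulltext_search_url and the search phrase "annual report" are hypothetical and chosen only for illustration; collection 938 is the main University of Melbourne collection mentioned earlier.

```python
from urllib.parse import urlencode

# Search endpoint as given in the syntax above.
SEARCH_BASE = "http://index.archive-it.org:8080/nutchwax/"


def fulltext_search_url(query, collection_id):
    """Build a full text search URL for one collection (hypothetical helper)."""
    params = {
        "query": query,
        "go": "Search Web Archive",
        "collection": collection_id,
    }
    # urlencode writes spaces as '+', matching the QUERY+TERM form above.
    return SEARCH_BASE + "?" + urlencode(params)


# "annual report" is an illustrative search phrase, not taken from the source.
print(fulltext_search_url("annual report", 938))
```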
4. URL Search by URL of Selected Archive Collection
Again, since Archive-IT also supports search and retrieve via URL (SRU), it is possible to construct a URL that both identifies a collection and contains a specific target URL. The syntax is:
http://wayback.archive-it.org/COLLECTION_ID/query?type=urlquery&url=QUERY_URL&go=Search+Web+Archive&type=urlquery
where QUERY_URL is the normalised form of the URL, and COLLECTION_ID is the identifier of the particular collection (938 in most cases). As an example, to construct the URL which corresponds to the earlier cases:
or, with a date range included, by adding the date parameter (YYYY, YYYY-MM or YYYY-MM-DD):
http://wayback.archive-it.org/938/query?type=urlquery&url=http%3A%2F%2Fwww.unimelb.edu.au&type=urlquery&date=2007
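The dated example above can be reproduced with a short sketch. The helper name url_search_url is hypothetical. The parameter order, including the repeated type=urlquery, is copied from the example URL rather than from any documented API, and the go= parameter shown in the general syntax is left out because the dated example omits it.

```python
from urllib.parse import urlencode

# Wayback host as given in the syntax above.
WAYBACK_BASE = "http://wayback.archive-it.org"


def url_search_url(target_url, collection_id=938, date=None):
    """Build a URL search link for one collection (hypothetical helper)."""
    # A list of pairs is used because 'type' appears twice in the example.
    pairs = [
        ("type", "urlquery"),
        ("url", target_url),
        ("type", "urlquery"),
    ]
    if date is not None:
        # Date restriction, e.g. "2007", as in the example above.
        pairs.append(("date", date))
    return f"{WAYBACK_BASE}/{collection_id}/query?" + urlencode(pairs)


print(url_search_url("http://www.unimelb.edu.au", date="2007"))
```

Calling the function without the date argument gives the same link with no date restriction.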