NLM Digital Collections

NLM's Digital Collections Web Service

Main Help | Book Viewer | Video Player | Image Viewer | PDF Viewer | Frequently Asked Questions (FAQs) | Web Service

The National Library of Medicine's (NLM) Digital Collections offers a search-based Web service that provides access to the Dublin Core metadata and full-text OCR in the repository in XML format. Developers can use the Web service to build applications that query and link to these resources.

The service accepts keyword and fielded searches as requests. Results are returned in order of relevancy. Keyword searches may also be limited to Dublin Core fields. The returned results represent the subset of Digital Collections resources that are relevant to the query in the request.

The Web service is free of charge and does not require registration or licensing. If you use data from the Web service or build an interface using the Web service, please indicate that the information is from the NLM Digital Collections. If you have questions about the Web service, or if you would like to share an application that you have built using the service, please contact us.

Acceptable Use Policy

In order to avoid overloading our servers, NLM requires that users of the NLM Digital Collections Web service send no more than 85 requests per minute per IP address. Requests that exceed this limit will not be serviced, and service will not be restored until the request rate falls beneath the limit.
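
One way a client can stay under this limit is to space its requests evenly. The following Python sketch is an illustration only, not an NLM-provided client. The class name MinIntervalThrottle, the default of 80 requests per minute (chosen to leave headroom below the limit of 85), and the injectable clock and sleep functions are all assumptions made for this example.

```python
import time


class MinIntervalThrottle:
    """Space calls at least 60 / max_per_minute seconds apart."""

    def __init__(self, max_per_minute=80, clock=time.monotonic, sleep=time.sleep):
        # 80 is an assumed default that stays below the documented limit of 85.
        self.min_interval = 60.0 / max_per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next request may start

    def wait(self):
        """Block until the next request may be sent; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._clock()  # re-read the clock in case the sleep overshot
        self._next_allowed = now + self.min_interval
        return delay
```

Call wait() immediately before each request. Passing a fake clock and sleep function makes the class testable without real delays.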

The Digital Collections Web service is updated weekly on Sundays. To limit the number of requests that you send to the Web service, NLM recommends caching results for a 12-24 hour period.
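
The caching recommendation could be followed with a small in-memory cache keyed by request URL. This is a minimal sketch under stated assumptions: the class name TTLCache, the 12-hour default, and the caller-supplied fetch function are hypothetical and not part of the Web service.

```python
import time


class TTLCache:
    """Keep fetched responses in memory for a fixed number of seconds."""

    def __init__(self, ttl_seconds=12 * 60 * 60, clock=time.time):
        self.ttl_seconds = ttl_seconds
        self._clock = clock
        self._entries = {}  # url -> (time stored, response text)

    def get_or_fetch(self, url, fetch):
        """Return the cached text for `url`, calling `fetch(url)` only when
        there is no entry or the entry is older than the TTL."""
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and now - entry[0] < self.ttl_seconds:
            return entry[1]
        text = fetch(url)
        self._entries[url] = (now, text)
        return text
```

Because subsequent-request URLs depend on a server-side file that expires, a cache like this is probably best applied to initial search URLs.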

This policy ensures that the service remains available and accessible to all users. NLM encourages all users of the Digital Collections Web service to use the email and tool parameters (as described below). NLM may use this information to contact you if there are problems with your requests.

If your use case requires sending requests at a rate that exceeds the limit outlined in this policy, please contact us. NLM staff will evaluate your request and determine whether an exception may be granted.

Base URL

https://wsearch.nlm.nih.gov/ws/query

Parameters

Parameters for Initial Search Request

Two parameters must be included in the initial search request: one defines the database being searched and the other defines the search query.

Parameter Name Required Description

db

Yes

Database to search. Value must be "digitalCollections".

term

Yes

Term submitted to the Web service as a text query. All special characters must be URL encoded. Spaces may be replaced by '+' signs, which represent the AND operator. Represent the OR operator as +OR+. To send a query as a phrase, enclose the phrase in quotes using %22 to represent quotation marks.

Examples:
https://wsearch.nlm.nih.gov/ws/query?db=digitalCollections&term=cholera
https://wsearch.nlm.nih.gov/ws/query?db=digitalCollections&term=%22asiatic+cholera%22
https://wsearch.nlm.nih.gov/ws/query?db=digitalCollections&term=%22healing%22+OR+%22medicine%22
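
The encoding rules above can be delegated to a standard URL library rather than applied by hand. The Python sketch below reproduces the three example URLs; the function name build_search_url is an assumption made for illustration.

```python
from urllib.parse import urlencode

BASE_URL = "https://wsearch.nlm.nih.gov/ws/query"


def build_search_url(term, db="digitalCollections"):
    """Return the initial search URL for a plain-text query term."""
    # urlencode uses quote_plus by default: spaces become '+', quotes become %22.
    return BASE_URL + "?" + urlencode({"db": db, "term": term})


print(build_search_url("cholera"))
print(build_search_url('"asiatic cholera"'))
print(build_search_url('"healing" OR "medicine"'))
```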

Field Searching

The text for the query term can include limiters to restrict the search to a specific Dublin Core metadata field. The syntax is "<fieldName>:<searchText>", for example dc:subject:cholera. Dublin Core field names that can be searched include:

  • dc:creator
  • dc:coverage
  • dc:date
  • dc:description
  • dc:format
  • dc:identifier
  • dc:language
  • dc:publisher
  • dc:relation
  • dc:rights
  • dc:subject
  • dc:title
  • dc:type

See the Output Format section below for a description of each of these fields.

Examples:

https://wsearch.nlm.nih.gov/ws/query?db=digitalCollections&term=dc:title:%22tropical+disease%22
https://wsearch.nlm.nih.gov/ws/query?db=digitalCollections&term=dc:subject:cholera
https://wsearch.nlm.nih.gov/ws/query?db=digitalCollections&term=dc:language:french
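
A fielded query can be assembled the same way as a keyword query. In this sketch the function name build_field_search_url and its phrase flag are hypothetical. The colon is left unencoded only to match the examples above.

```python
from urllib.parse import urlencode

BASE_URL = "https://wsearch.nlm.nih.gov/ws/query"


def build_field_search_url(field, value, phrase=False):
    """Return a search URL limited to one Dublin Core field, e.g. 'dc:title'."""
    if phrase:
        value = '"' + value + '"'  # quotes are encoded as %22 below
    term = field + ":" + value
    # safe=":" keeps the field separator readable, as in the documented examples.
    query = urlencode({"db": "digitalCollections", "term": term}, safe=":")
    return BASE_URL + "?" + query


print(build_field_search_url("dc:title", "tropical disease", phrase=True))
print(build_field_search_url("dc:subject", "cholera"))
```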

Parameters for Subsequent Requests

The parameters listed below are used to retrieve subsequent results related to the initial search query request.

Parameter Name Required Description

file

Yes

Name of the file containing the document references for the current search. This parameter is required when retstart is being used. The value is obtained from the XML returned from the initial search. The file will expire after a certain period of inactivity, after which a new request must be initiated. If the file is expired, the XML output will contain an error message.

server

Yes

Name of the server with the file referenced by the file parameter. This is required when the file parameter is being used.

retstart

Yes

Sequential index of the first document in the retrieved set to be shown in the XML output (default=0, corresponding to the first record of the entire set). This parameter can be used in conjunction with retmax to download an arbitrary subset of documents retrieved from a search.

Example (retstart):
https://wsearch.nlm.nih.gov/ws/query?file=viv_briwYO&server=pvlbsrch05&retstart=10

Optional Parameters

These parameters may be used in initial or subsequent search requests.

Parameter Name Description

retmax

Total number of documents from the retrieved set to be shown in the XML output (default=10). By default, the web service only includes the first 10 documents retrieved in the XML output. Increasing retmax allows more of the retrieved documents to be included in the XML output.

tool

A string with no internal spaces that identifies the resource which is using the Web service (e.g., tool=your_tool_name). This argument is used to help NLM provide better service to third parties using the Web service from programs. As with any query system, it is sometimes possible to ask the same question different ways, with different effects on performance. NLM requests that developers sending a large volume of requests include a constant 'tool' argument for all requests using the Web service.

email

Email address. If you choose to provide an email address, NLM may use it to contact you if there are problems with your queries.

Example (retmax):

https://wsearch.nlm.nih.gov/ws/query?file=viv_briwYO&server=pvlbsrch05&retstart=20&retmax=30
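
Paging through a result set combines the file and server values from the initial response with retstart and retmax. The sketch below shows one way to build these URLs. The helper names build_page_url and page_starts are assumptions, and the file and server values are the ones used in the examples above, not live identifiers.

```python
from urllib.parse import urlencode

BASE_URL = "https://wsearch.nlm.nih.gov/ws/query"


def build_page_url(file, server, retstart, retmax=None, tool=None, email=None):
    """Return the URL for a subsequent request against an existing result file."""
    params = {"file": file, "server": server, "retstart": retstart}
    # Optional parameters are added only when the caller supplies them.
    if retmax is not None:
        params["retmax"] = retmax
    if tool is not None:
        params["tool"] = tool
    if email is not None:
        params["email"] = email
    return BASE_URL + "?" + urlencode(params)


def page_starts(count, retmax):
    """Return the retstart value of each page for a result set of `count` documents."""
    return list(range(0, count, retmax))


print(build_page_url("viv_briwYO", "pvlbsrch05", 20, retmax=30))
print(page_starts(25, 10))
```

Here count corresponds to the count element of the XML response.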

Output Format

Output Type

The Web Service returns XML using the markup defined in this document.

XML Response Format

The following is the basic structure of the XML returned by the Web service. This XML file does not have any style information associated with it. The document tree is shown below:

<nlmSearchResult>
   <term>…</term>
   <file>…</file>
   <server>…</server>
   <count>…</count>
   <retstart>…</retstart>
   <retmax>…</retmax>
   <list>
      <document>
         <content name="dc:title">…</content>
         <content name="dc:creator">…</content>
         <content name="dc:subject">…</content>
         <content name="dc:description"> …</content>
         <content name="dc:publisher">…</content>
         <content name="dc:date">…</content>
         <content name="dc:type">…</content>
         <content name="dc:relation">…</content>
         <content name="dc:format">…</content>
         <content name="dc:identifier">…</content>
         <content name="dc:coverage">…</content>
         <content name="dc:language">…</content>
         <content name="dc:rights">…</content>
         <content name="snippet">…</content>
      </document>
   </list>
</nlmSearchResult>

Descriptions of Elements

The element names for the Web Service are described below:

Element Name Description

nlmSearchResult

Basic XML node that contains the response (has no attributes).

term

Text query submitted to the Web service.

file

Name of the file containing the document references for the current search.

server

Name of the server with the file referenced by the file parameter.

count

The number of documents in the current result set (non-negative integer).

retstart

Sequential index of the first document in the retrieved set shown in the XML output.

retmax

Total number of documents from the retrieved set shown in the XML output.

list

The element containing all of the retrieved <document>s.

document

An individual resource from the Digital Collections Repository.

content

The element containing details of the Dublin Core metadata which are defined by the name attribute.

List Element Description
The list element includes three attributes: num, start, and per.

Attribute Type Description

num

Non-negative Integer

Total number of results retrieved for the search query term.

start

Non-negative Integer

Sequential index of the first document in the retrieved set shown in the XML output (see retstart).

per

Non-negative Integer

Total number of documents from the retrieved set shown in the XML output (see retmax).

Document Element Description

The document element includes two attributes: url and rank.

Attribute Type Description

url

Text

The permanent URL for the resource.

rank

Non-negative integer

Rank of the document in the results. The rank is based on the relevance score as determined by the search engine. Generally, the first document in the first result set has a rank of zero, and subsequent documents have ranks of 1, 2, 3, etc. However, if the query string matches a resource's title, that resource is boosted to the top regardless of its rank. In these cases the first document in the first result set may have a rank greater than zero.

Content Element Description
The content element includes one attribute: name.

Attribute Type Description

name

Text

The name of the content node. Allowed values are: dc:creator, dc:coverage, dc:date, dc:description, dc:format, dc:identifier, dc:language, dc:publisher, dc:relation, dc:rights, dc:subject, dc:title, dc:type, and snippet. See description of content node values listed below.

The content nodes for the name attribute include the following values:

Content Name Attribute Value Occurrences Description

dc:creator

Zero or more

Individual author or organization responsible for the intellectual content of the resource; this field may also include contributors to the content, to the publication, or to the provenance of the resource.

dc:coverage

Zero or more

Geographic subjects of the resource.

dc:date

Zero or more

Publication or copyright dates.

dc:description

Zero or more

Brief description of the content of the resource and/or notes regarding the resource, such as credits, gift/donor information, NLM permanence rating, etc.

dc:format

One or more

May include the physical or digital manifestation of the resource (such as Text, Moving image, etc.), illustrative content, and extent information.

dc:identifier

One or more

Identifiers include the Permanent URL of the resource in Digital Collections and the NLM Unique Identifier (NLMUID); other potential identifiers may include ISSN, ISBN, LCCN or OCLC numbers.

dc:language

One or more

Language of the intellectual content of the resource. Values include English, French, German, Greek, Hawaiian, Latin, Portuguese, Spanish, etc.

dc:publisher

One or more

Imprint statement, which may include the publisher, the distributor, the place of publication or distribution, and date(s) of publication or copyright.

dc:relation

Zero or more

A reference to a related resource.

dc:rights

Zero or more

Information about rights held in and over the resource.

dc:subject

One or more

Subjects (both topical and persons) of the resource; topical subjects are from the Medical Subject Headings (MeSH) vocabulary.

dc:title

One or more

Main and variant titles (including series titles) for the resource.

dc:type

Zero or one

Nature or genre of the content of the resource.

snippet

One

Brief result summary generated by the search engine that provides a preview of the relevant content from the resource's full text.

Example:

<document rank="0" url="http://resource.nlm.nih.gov/34711120R">
   <content name="dc:title">The <span class="qt0">"laws of cholera"</span> : reprinted by permission from "The Times" : with an introduction and supplementary matter</content>
   <content name="dc:subject"><span class="qt0">Cholera</span> - prevention &amp; control</content>
   <content name="dc:subject">Public Health</content>
   <content name="dc:description">NLM Permanence Rating: Permanent: Unchanging content.</content>
   <content name="dc:publisher">London : Charles Knight, [1855]</content>
   <content name="dc:date">1855</content>
   <content name="dc:format">Text</content>
   <content name="dc:format">91, [5] p., [3] leaves of plates</content>
   <content name="dc:format">Illustrations</content>
   <content name="dc:identifier">http://resource.nlm.nih.gov/34711120R</content>
   <content name="dc:language">English</content>
   <content name="snippet">The <span class="qt0">"laws of cholera"</span> : reprinted by permission from "The Times" : with an introduction and supplementary matter</content>
</document>
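
A response with the structure described above can be read with a standard XML parser. The Python sketch below is a minimal illustration. The SAMPLE string is hand-assembled from the examples in this document rather than captured from the service, the function name parse_result is hypothetical, and the code assumes the highlighting spans arrive as nested elements, as they are displayed here.

```python
import xml.etree.ElementTree as ET

# Hand-assembled sample that mirrors the documented structure (not a live response).
SAMPLE = """<nlmSearchResult>
  <term>cholera</term>
  <file>viv_briwYO</file>
  <server>pvlbsrch05</server>
  <count>1</count>
  <retstart>0</retstart>
  <retmax>10</retmax>
  <list num="1" start="0" per="10">
    <document rank="0" url="http://resource.nlm.nih.gov/34711120R">
      <content name="dc:subject"><span class="qt0">Cholera</span> - prevention &amp; control</content>
      <content name="dc:subject">Public Health</content>
      <content name="dc:date">1855</content>
    </document>
  </list>
</nlmSearchResult>"""


def parse_result(xml_text):
    """Return the search metadata and a list of documents as plain dictionaries."""
    root = ET.fromstring(xml_text)
    result = {
        "term": root.findtext("term"),
        "file": root.findtext("file"),
        "server": root.findtext("server"),
        "count": int(root.findtext("count", default="0")),
        "documents": [],
    }
    for doc in root.iter("document"):
        fields = {}
        for content in doc.findall("content"):
            # itertext() joins the text around any nested highlighting spans.
            text = "".join(content.itertext())
            # A name such as dc:subject may occur more than once, so keep a list.
            fields.setdefault(content.get("name"), []).append(text)
        result["documents"].append(
            {"url": doc.get("url"), "rank": int(doc.get("rank")), "fields": fields}
        )
    return result


parsed = parse_result(SAMPLE)
print(parsed["count"], parsed["documents"][0]["fields"]["dc:subject"])
```

The file and server values returned here are the ones to pass back in subsequent requests.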

Search Term Highlighting (Keyword-in-Context)

The query term is identified in the content nodes using span tags. Each instance of the query term includes a class attribute equal to qt0, qt1, qt2, etc. Use these tags to highlight the search terms in your user interface.

Example:

<document rank="8" url="http://resource.nlm.nih.gov/64710040R">
   <content name="dc:title">An account of the rise and progress of the Indian or spasmodic <span class="qt0">cholera</span> : with a particular description of the symptoms attending the disease : illustrated by a map, showing the route and progress of the disease, from Jessore, near the Ganges, in 1817, to Great Britain, in 1831</content>
   <content name="dc:creator">Barber, John Warner, 1798-1885</content>
   <content name="dc:subject"><span class="qt0">Cholera</span> - history</content>
   <content name="dc:subject">Disease Outbreaks</content>
   <content name="dc:description">NLM Permanence Rating: Permanent: Unchanging content.</content>
   <content name="dc:publisher">New Haven : Published and sold by L.H. Young, 1832</content>
   <content name="dc:date">1832</content>
   <content name="dc:format">Text</content>
   <content name="dc:format">48 p., [1] folded leaf of plate</content>
   <content name="dc:format">Maps</content>
   <content name="dc:identifier">http://resource.nlm.nih.gov/64710040R</content>
   <content name="dc:language">English</content>
   <content name="snippet"> ... rise and progress of the Indian or spasmodic <span class="qt0">cholera</span> : with a particular description of the symptoms attending ... </content>
</document>
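
As an illustration of using these tags, the following Python sketch replaces each query-term span in a single content element with plain-text markers. The function name highlight and the bracket markers are assumptions. It also assumes the spans arrive as nested elements, as displayed above; if the service returns them as escaped text, they would need to be unescaped first.

```python
import xml.etree.ElementTree as ET


def highlight(content_xml, start="[", end="]"):
    """Return the text of one <content> element with query terms wrapped in markers."""
    elem = ET.fromstring(content_xml)
    parts = [elem.text or ""]
    for child in elem:
        text = "".join(child.itertext())
        # Query terms are spans whose class is qt0, qt1, qt2, and so on.
        is_query_term = child.tag == "span" and child.get("class", "").startswith("qt")
        parts.append(start + text + end if is_query_term else text)
        parts.append(child.tail or "")  # text that follows the span
    return "".join(parts)


snippet = (
    '<content name="snippet"> ... rise and progress of the Indian or spasmodic '
    '<span class="qt0">cholera</span> : with a particular description of the '
    'symptoms attending ... </content>'
)
print(highlight(snippet))
```

In a Web interface the markers could be replaced with whatever highlighting markup the application uses.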
