LinkOut and Publisher Holdings
Last Updated: September 23, 2004
Introduction
Frequently Asked Questions
How Can a Publisher Participate in LinkOut?
Step 1 - Preliminary Contact
Step 2 - Files Preparation
Identity File
Example Identity File
Identity File Prolog
Identity File Elements
Holdings Files
Example Simple Holdings File
Holdings File Prolog
Holdings File Elements
Additional Information on Creating a Holdings File
Selecting PubMed Citations in a Holdings File
Query
Additional Query Rules
ObjId
Specifying the Link to Access the Full-text of Selected Citations
Base
Rule
Putting It All Together
Example Complex Holdings File
Step 3 - File Transfer
Step 4 - Activate the Publisher Holdings in PubMed
Allowable Rule Keywords
Document Delivery Service
Announcement Mailing List
For More Assistance
LinkOut is a feature of Entrez where third parties provide information to link specific Entrez records to relevant web-accessible online resources such as full-text publications, molecular biology databases, consumer health information, research tools and more. Typically, publishers or full-text providers use LinkOut to provide links from PubMed citations to their full-text journals available on the Web.
A list of Frequently Asked Questions and answers is also available to address questions that a publisher may have.
How Can a Publisher Participate in LinkOut?
To participate in LinkOut a publisher of a journal must first submit their citation and abstract data electronically for inclusion in PubMed. Publishers or full-text providers may then provide NCBI with URL links to their web-accessible electronic journals by following the steps outlined below.
Step 1 - Preliminary Contact
Email NCBI at linkout@ncbi.nlm.nih.gov to indicate your interest in providing links from Entrez records to your web resources. Please provide the name, e-mail address and phone number of a member of your organization who will act as a designated contact person. In addition, include a LinkOut Identity File (providerinfo.xml) based on the file specifications provided below.
NCBI will establish a ProviderId, an FTP account, and a name abbreviation (NameAbbr) for your organization, and will send this information to the designated contact person.
Step 2 - Files Preparation
Two types of files are needed from a participating publisher. Both must be XML files that conform to the LinkOut Document Type Definition (LinkOut.DTD). XML tags are case sensitive.
The first is the identity file, "providerinfo.xml", that provides information about a publisher.
The second is the holdings file or files, typically named "journals.xml"; the file must have an extension of .xml. This file describes the electronic journal holdings that are provided by a publisher.
The Identity File: "providerinfo.xml"
The "providerinfo.xml" file stores information about a publisher. Send this file by email with the preliminary contact initiated by the publisher. The ProviderId field may be left blank. NCBI will change the publisher's NameAbbr if it is not unique. Please consult the current list of LinkOut providers and their name abbreviations (NameAbbr) to help choose a NameAbbr.
Identity File Example:
A providerinfo.xml file for the LinkOut participant GoodPublisher, Inc., with the ProviderId "8888" and NameAbbr "GoodPublisher":
<?xml version="1.0"?>
<!DOCTYPE Provider PUBLIC "-//NLM//DTD LinkOut 1.0//EN"
  "http://www.ncbi.nlm.nih.gov/entrez/linkout/doc/LinkOut.dtd">
<Provider>
  <ProviderId>8888</ProviderId>
  <Name>GoodPublisher, Inc.</Name>
  <NameAbbr>GoodPublisher</NameAbbr>
  <SubjectType>publishers/providers</SubjectType>
  <Attribute>publisher of information in URL</Attribute>
  <Url>http://www.goodpublisher.com/</Url>
  <IconUrl>http://www.goodpublisher.com/icon/pubmed/goodpublisher.jpg</IconUrl>
  <Brief>An international publisher of biomedical journals and books</Brief>
</Provider>
Identity File Prolog
XML Declaration - <?xml version="1.0"?> (optional)
Document Type Declaration - <!DOCTYPE Provider PUBLIC "-//NLM//DTD LinkOut 1.0//EN" "http://www.ncbi.nlm.nih.gov/entrez/linkout/doc/LinkOut.dtd"> (required)
Identity File Elements
Provider - root element of the file. (required)
ProviderId - unique ID assigned by NCBI. (required)
Name - full name of the publisher. (required)
NameAbbr - short, one-word name of the publisher. It may include only alphabetic and numeric characters; spaces and special characters such as hyphens are not allowed. (required)
SubjectType, Attribute - descriptions of the resources and of the relationship of the provider to the resources listed in the holdings. The SubjectTypes and Attributes that appear in the identity file apply to all holdings provided by a provider. See LinkOut SubjectTypes, Attributes and UrlName for the list and description of these elements. (optional, repeatable)
Url - URL of the publisher's web site, used in the LinkOut Providers list in Cubby. (optional, repeatable)
IconUrl - logo of the publisher. The optimal size of the icon is 100 pixels in width and 25 pixels in height. The icon should look like a button. An icon with a white or transparent background, or without borders, is not recommended. Note: the Url and IconUrl here and in the holdings file(s) may be different for different languages; see the LNG attribute in the LinkOut.DTD. (optional, repeatable - currently not being displayed)
Brief - short (up to 256 characters) description of the publisher. (optional - currently not being displayed)
The providerinfo.xml file is specified in the LinkOut.DTD.
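The NameAbbr and Brief constraints described above can be checked before the identity file is sent. The following is a minimal sketch, assuming that "alpha and numeric characters" means ASCII letters and digits; the function names are hypothetical and are not part of any NCBI tool.

```python
import re

# Assumption: "alpha and numeric characters" means ASCII letters and digits.
_NAME_ABBR = re.compile(r"[A-Za-z0-9]+")


def is_valid_name_abbr(name_abbr):
    """Return True if name_abbr is a single word of letters and digits only."""
    return _NAME_ABBR.fullmatch(name_abbr) is not None


def is_valid_brief(brief, limit=256):
    """Return True if the Brief description is within the 256-character limit."""
    return len(brief) <= limit


print(is_valid_name_abbr("GoodPublisher"))
print(is_valid_name_abbr("Good-Publisher"))
```

A check like this only catches formatting problems; NCBI still decides whether the abbreviation is unique.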
The Holdings File: "journals.xml"
The holdings file or files contain information on the web-accessible electronic journals of a publisher that will be linked from PubMed. Links described in the holdings file should go to the full-text of articles, not to the Table of Contents or the home page of a journal. Each holdings file must have the file extension ".xml"; it is typically named "journals.xml".
Simple Holdings File Example:
A simple holdings file for GoodPublisher, Inc., ProviderId 8888, supplying
links to the web-accessible full-text for the journal J Cell Biol.
<?xml version="1.0"?>
<!DOCTYPE LinkSet PUBLIC "-//NLM//DTD LinkOut 1.0//EN" "http://www.ncbi.nlm.nih.gov/entrez/linkout/doc/LinkOut.dtd"
[ <!ENTITY icon.url "http://www.goodpublisher.com/icon/goodpublisher.jpg">
<!ENTITY base.url "http://www.goodmedical.org/cgi/content/"> ]>
<LinkSet>
  <Link>
    <LinkId>1</LinkId>
    <ProviderId>8888</ProviderId>
    <IconUrl>&icon.url;</IconUrl>
    <ObjectSelector>
      <Database>PubMed</Database>
      <ObjectList>
        <Query> "J Cell Biol" [ta] AND 1997/06/15:2010[dp]</Query>
      </ObjectList>
    </ObjectSelector>
    <ObjectUrl>
      <Base>&base.url;</Base>
      <Rule>&lo.issn;/&lo.vol;/&lo.page;</Rule>
      <Attribute>full-text online</Attribute>
    </ObjectUrl>
  </Link>
</LinkSet>
Holdings File Prolog
XML Declaration - <?xml version="1.0"?> (optional)
Document Type Declaration and Entity Declaration -
<!DOCTYPE LinkSet PUBLIC "-//NLM//DTD LinkOut 1.0//EN" "http://www.ncbi.nlm.nih.gov/entrez/linkout/doc/LinkOut.dtd"
[ <!ENTITY icon.url "http://www.goodpublisher.com/icon/goodpublisher.jpg">
<!ENTITY base.url "http://www.goodmedical.org/cgi/content/"> ]>
The Document Type Declaration, <!DOCTYPE LinkSet PUBLIC "-//NLM//DTD LinkOut 1.0//EN" "http://www.ncbi.nlm.nih.gov/entrez/linkout/doc/LinkOut.dtd">, is required. The Entity Declaration is optional. A publisher may define entities to be used in the body of the file. In the above example, the entities icon.url and base.url are defined as "http://www.goodpublisher.com/icon/goodpublisher.jpg" and "http://www.goodmedical.org/cgi/content/" respectively.
Once an ENTITY is defined in the Prolog, it can be used in the holdings file by placing the ENTITY name between an ampersand (&) and semicolon (;) to alleviate the need to replicate long, textual data. In the above example, '&icon.url;' and '&base.url;' are used to represent the respective information.
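The effect of an entity reference can be observed with any XML parser that expands internal entities. The following is a minimal sketch using Python's standard xml.etree.ElementTree module. The sample is a deliberately cut-down document, not a complete holdings file: it keeps only one internal entity declaration and leaves out the reference to the external LinkOut DTD so that it can be parsed without that file.

```python
import xml.etree.ElementTree as ET

# Cut-down illustration: one internal entity, no external DTD reference.
SAMPLE = """<?xml version="1.0"?>
<!DOCTYPE ObjectUrl [
  <!ENTITY base.url "http://www.goodmedical.org/cgi/content/">
]>
<ObjectUrl>
  <Base>&base.url;</Base>
</ObjectUrl>
"""

root = ET.fromstring(SAMPLE)

# The parser replaces &base.url; with the text from the declaration.
expanded_base = root.findtext("Base")
print(expanded_base)
```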
Holdings File Elements
LinkSet - root element of the holdings file. (required)
Link - an element that describes a specific set of holdings grouped together based on access characteristics or for convenience. A holdings file can have multiple Link elements. (required, repeatable)
LinkId - an identifier assigned by a participating publisher for its own reference. It can be any alphanumeric string. Each LinkId should be unique within its LinkSet or file. (required)
ProviderId - the identifier number assigned to the publisher by NCBI and listed in the providerinfo.xml file. (required)
IconUrl - the icon that will be displayed in the PubMed citation. The icon should not be larger than 100 pixels in width and 25 pixels in height, and should look like a button. An icon with a white or transparent background, or without borders, is not recommended. Publishers that supply their data to PubMed electronically will have their icon displayed on the Abstract and Citation formats by default. The Cubby feature allows users to activate other providers' icons on the Citation and Abstract display formats. In addition, a provider's icon can be activated by searching PubMed with a holding parameter. (required, repeatable)
ObjectSelector - an element whose sub-elements specify which PubMed records a <Link> element links from. (required)
Database - a sub-element of ObjectSelector. Databases available for linking include: PubMed, Nucleotide, Protein, Genome, Structure, PopSet, Taxonomy, OMIM, Gene, GEO, SNP, UniGene, UniSTS, NLMCatalog. (required)
ObjectList - a sub-element of ObjectSelector containing either Query or ObjId elements, which specify the PubMed citations that the publisher URLs (ObjectUrl) will be linked from. (required, repeatable)
Query - a sub-element of ObjectList that contains any valid PubMed search used to select the PubMed records being linked from.
Note:
1. Use the Medline abbreviation or ISSN to specify a journal title. Be sure to include all valid ISSNs for a title.
2. Do not use the search field tag [filter] in Query; filters are generated after the LinkOut files are processed.
(required unless ObjId is specified, repeatable)
ObjId - a sub-element of ObjectList that contains the PubMed Unique Identifier (PMID). (required unless Query is specified, repeatable)
ObjectUrl - an element that describes the information the Entrez system needs to construct an appropriate URL to access the full-text of a citation. (required)
Base - a sub-element of ObjectUrl that is the base of the URL for the publisher's full-text system. (required)
Rule - a sub-element of ObjectUrl, from which the remainder of the URL is constructed. It is based on the publisher's specifications for access to the full-text articles. Links should point to full-text articles, NOT to the journal's Table of Contents or the home page. (required)
SubjectType, Attribute - sub-elements of ObjectUrl used to describe the resources and the relationship of the provider to the resources listed in the link. The SubjectType(s) and Attribute(s) are applied to the resources provided within a <Link>. For example: "full-text online", "subscription/membership/fee required". See LinkOut SubjectTypes, Attributes and UrlName for the list and description of these elements. Publishers that provide free access to their summary pages, but restrict full-text access to subscribers, may choose to provide links to their summary pages that have links to the full-text. If they do so, they should include the attributes "full-text online" and "subscription/membership/fee required" to let users know the full-text is available in addition to the summary. (optional, repeatable)
The holdings file is specified in the LinkOut.DTD.
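Before running a file through DTD validation, a publisher might want a quick structural sanity check. The following is a rough sketch, not a substitute for validation against the LinkOut.DTD: it only checks that each <Link> has the child elements marked as required above and that LinkId values are not repeated. The function name and the sample data are hypothetical, and the sample omits the DOCTYPE and the lo.* keywords so that it can be parsed on its own.

```python
import xml.etree.ElementTree as ET

# Children marked "required" for each <Link> in the element descriptions.
REQUIRED_CHILDREN = ("LinkId", "ProviderId", "IconUrl", "ObjectSelector", "ObjectUrl")


def check_link_set(link_set):
    """Return a list of problems found in a parsed <LinkSet> element."""
    problems = []
    seen_ids = set()
    for position, link in enumerate(link_set.findall("Link"), start=1):
        for name in REQUIRED_CHILDREN:
            if link.find(name) is None:
                problems.append(f"Link {position}: missing <{name}>")
        link_id = link.findtext("LinkId")
        if link_id is not None:
            if link_id in seen_ids:
                problems.append(f"Link {position}: duplicate LinkId {link_id}")
            seen_ids.add(link_id)
    return problems


# Hypothetical sample: the second Link is incomplete and reuses LinkId 1.
SAMPLE = """<LinkSet>
  <Link>
    <LinkId>1</LinkId>
    <ProviderId>8888</ProviderId>
    <IconUrl>http://www.goodpublisher.com/icon/goodpublisher.jpg</IconUrl>
    <ObjectSelector/>
    <ObjectUrl/>
  </Link>
  <Link>
    <LinkId>1</LinkId>
    <ProviderId>8888</ProviderId>
  </Link>
</LinkSet>"""

problems = check_link_set(ET.fromstring(SAMPLE))
for problem in problems:
    print(problem)
```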
Additional Information on Creating a Holdings File
The holdings file contains a <LinkSet> which may contain one or more <Link> elements. Each <Link> element lets the publisher select a range of PubMed citations to be associated with a particular URL Rule. The URL Rule is translated to a valid HTTP request when a citation is retrieved from a PubMed search. A publisher should examine their full-text holdings and group together in one Link those that can be accessed via a single URL Rule.
A publisher may choose to put their holdings in different Link elements even if the same URL Rule applies to all their holdings. Similarly, a publisher may supply more than one holdings file to help with file management.
Selecting PubMed Citations in a Holdings File
The element <ObjectList> is used to select PubMed citations. Within this element, a publisher may use either <Query> or <ObjId> to specify PubMed citations that it is providing a link to in a particular Link element.
<Query> can include any valid PubMed search. Please consult the PubMed Help for information on constructing PubMed Boolean search queries and PubMed Search Field Tags. In addition, more than one <Query> can be listed within the <ObjectList> so a publisher may select a wide variety of citations.
Examples:
<Query> "J Mol Dis" [ta] AND 1997/06/15:2010 [dp]</Query>
will select citations for the journal "J Mol Dis" with publication dates from June 15, 1997 through 2010.
<Query> "J Mol Dis" [ta] AND Smith J [au]</Query>
will select citations for the journal "J Mol Dis" and the author J Smith.
The LinkOut page lists multiple links for citations that include different Attributes. For example, a citation may have a link to the "PostScript", "HTML", and "PDF" format of the article.
<ObjId> is the PubMed ID (PMID) for a citation. It can be used in place of the <Query> element. Each <ObjId> selects only one citation. More than one <ObjId> can be used in an <ObjectList>.
Example:
<ObjId>9679140</ObjId>
will select the citation with PMID 9679140.
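When many citations are selected by PMID, the <ObjId> elements can be generated rather than typed by hand. The following is a small sketch using Python's standard xml.etree.ElementTree module; the function name is hypothetical, and the result is only the <ObjectList> fragment, not a complete holdings file.

```python
import xml.etree.ElementTree as ET


def object_list_from_pmids(pmids):
    """Build an <ObjectList> element with one <ObjId> per PMID."""
    object_list = ET.Element("ObjectList")
    for pmid in pmids:
        ET.SubElement(object_list, "ObjId").text = str(pmid)
    return object_list


fragment = ET.tostring(object_list_from_pmids([9679140, 9679141]), encoding="unicode")
print(fragment)
```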
Specifying the Link to Access the Full-text of Selected Citations
A publisher uses the <ObjectUrl> element to describe the link to their full-text. This includes instructions on how to generate the full-text URL, and additional information about the articles.
Entrez uses a rule based mechanism to generate the URL to link to the full-text of a retrieved PubMed citation. In general, two elements are needed to build this URL: <Base> and <Rule>.
<Base> is the base of the URL for the full-text of the publisher's selected citations. It may be the URL of a publisher's web site or the CGI program.
<Rule> is the remainder of the URL derived from a publisher's full-text system. Information about a citation can be passed to the publisher's system via a list of supported keywords (entities) which can be found at the end of this document or in the LinkOut.DTD. For example, &lo.vol;/&lo.iss;/&lo.page;
Entrez will replace the keywords with the actual value for a retrieved citation. A publisher may also add any additional information for their system in the <Rule> element. <Base> is then concatenated with <Rule> to form the complete URL to the full-text article.
Examples:
<Base>http://www.goodmedical.org/cgi/full/</Base>
<Rule>&lo.issn;/&lo.vol;/&lo.page;</Rule>
Using the above rule, the URL constructed for the citation "ISSN 1234-5678, volume 23, page 123" will be: http://www.goodmedical.org/cgi/full/1234-5678/23/123
<Base>http://www.goodmedical.org/links/citation/</Base>
<Rule>pmidlookup?view=reprint&amp;pmid=&lo.id;</Rule>
Using the above rule, the URL constructed for the citation with PMID "9679140" will be:
http://www.goodmedical.org/links/citation/pmidlookup?view=reprint&pmid=9679140
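The substitution and concatenation steps can be imitated in a few lines of code, which is a convenient way to preview the URLs a Rule will produce. This is only an illustrative sketch and not how Entrez is implemented: the function name is hypothetical, and the citation values are supplied by hand as a dictionary keyed by keyword name.

```python
import re

# Matches a LinkOut keyword reference such as &lo.vol; and captures "lo.vol".
KEYWORD = re.compile(r"&(lo\.[a-z]+);")


def build_url(base, rule, citation):
    """Replace each keyword in rule with the citation's value, then prepend base."""
    def substitute(match):
        return str(citation[match.group(1)])

    return base + KEYWORD.sub(substitute, rule)


# Values from the first example: ISSN 1234-5678, volume 23, page 123.
citation = {"lo.issn": "1234-5678", "lo.vol": "23", "lo.page": "123"}
url = build_url("http://www.goodmedical.org/cgi/full/", "&lo.issn;/&lo.vol;/&lo.page;", citation)
print(url)
```

A keyword that has no value in the dictionary raises a KeyError, which makes a misspelled keyword easy to spot while testing.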
The following entities are supported by default and must be used in place of these symbols in the holdings file:
"&" ENTITY amp (written as &amp;)
"<" ENTITY lt (written as &lt;)
">" ENTITY gt (written as &gt;)
The list of supported keywords(entities) in the <Rule> can be found at the end of this document or in the LinkOut.DTD.
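A Rule often contains both literal ampersands (separating CGI parameters) and keyword references that begin with an ampersand, so a blanket XML escape would damage the keywords. The following is a sketch of one way to prepare a Rule, assuming that every keyword has the form &lo.name; with a lowercase name; the function name is hypothetical. Any other entity a publisher defines would also be escaped by this sketch, so it suits Rules that use only the lo.* keywords.

```python
import re

# An ampersand that does not begin a LinkOut keyword (&lo.name;) or one of the
# entities &amp; &lt; &gt; is treated as a literal character to be escaped.
_BARE_AMPERSAND = re.compile(r"&(?!(?:lo\.[a-z]+|amp|lt|gt);)")


def escape_rule(raw_rule):
    """Escape literal &, < and > in a Rule while leaving lo.* keywords intact."""
    escaped = _BARE_AMPERSAND.sub("&amp;", raw_rule)
    return escaped.replace("<", "&lt;").replace(">", "&gt;")


escaped_rule = escape_rule("pmidlookup?view=reprint&pmid=&lo.id;")
print(escaped_rule)
```

Applying the function to text that is already escaped leaves it unchanged.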
Putting It All Together
The following complex holdings file "journals.xml" includes all the elements outlined thus far.
GoodMedical Publisher describes links to the full-text of two of their journals: J Mol Dis and J Biol Chem.
LinkId 1 is a special case for the articles with PMIDs 9679140, 9679141 and 9679142, selected using <ObjId>, with a special rule: <Rule>pmidlookup?view=reprint&amp;pmid=&lo.id;</Rule>. They are available online in PostScript format without restriction.
LinkId 2 is for all articles authored by J Smith published in the journal J Mol Dis. The full-text articles are available in PDF format without restriction. A special icon is used for these articles.
LinkId 3 applies to the remainder of the J Mol Dis articles, and to the J Biol Chem articles, supplied by GoodMedical, where a subscription is required to access the full-text. The articles are in "HTML" format. If the data format is not listed in Attribute, the default is HTML.
Since LinkId 1 and LinkId 2 both describe specific requirements, they should be listed before the general LinkId 3. If a citation is selected in two Links, the first Link in the file is used to construct the URL to the full-text.
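The ordering rule amounts to a first-match lookup. The following sketch imitates it under a simplifying assumption: each Link has already been resolved to the set of PMIDs it selects, which in practice is done by Entrez when it runs the queries. The function name and the PMIDs 1111 and 2222 are made up for illustration.

```python
def resolve_link(pmid, links):
    """Return the LinkId of the first Link, in file order, that selects pmid."""
    for link_id, selected_pmids in links:
        if pmid in selected_pmids:
            return link_id
    return None


# Links in file order; the sets stand in for the results of each ObjectList.
links = [
    ("1", {9679140, 9679141, 9679142}),
    ("2", {9679141, 1111}),
    ("3", {9679140, 9679141, 1111, 2222}),
]

print(resolve_link(9679141, links))  # selected by all three Links
print(resolve_link(2222, links))     # selected only by the general Link
```

Moving the general Link to the top of the list would make it win for every citation, which is why the specific Links are listed first.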
Complex Holdings File Example:
<?xml version="1.0"?>
<!DOCTYPE LinkSet PUBLIC "-//NLM//DTD LinkOut 1.0//EN" "http://www.ncbi.nlm.nih.gov/entrez/linkout/doc/LinkOut.dtd"
[ <!ENTITY icon.url "http://www.goodpublisher.com/icon/externalservices/pubmed/goodpublisher.jpg">
<!ENTITY base.url "http://www.goodmedical.org/cgi/content/"> ]>
<LinkSet>
  <!-- LinkId 1: three citations selected by PMID, PostScript format -->
  <Link>
    <LinkId>1</LinkId>
    <ProviderId>8888</ProviderId>
    <IconUrl>&icon.url;</IconUrl>
    <ObjectSelector>
      <Database>PubMed</Database>
      <ObjectList>
        <ObjId>9679140</ObjId>
        <ObjId>9679141</ObjId>
        <ObjId>9679142</ObjId>
      </ObjectList>
    </ObjectSelector>
    <ObjectUrl>
      <Base>&base.url;</Base>
      <Rule>pmidlookup?view=reprint&amp;pmid=&lo.id;</Rule>
      <Attribute>full-text PostScript</Attribute>
    </ObjectUrl>
  </Link>
  <!-- LinkId 2: J Mol Dis articles by J Smith, PDF format, special icon -->
  <Link>
    <LinkId>2</LinkId>
    <ProviderId>8888</ProviderId>
    <IconUrl>http://www.goodpublisher.com/pubmed/smith.gif</IconUrl>
    <ObjectSelector>
      <Database>PubMed</Database>
      <ObjectList>
        <Query> "J Mol Dis" [ta] AND Smith J [auth]</Query>
      </ObjectList>
    </ObjectSelector>
    <ObjectUrl>
      <Base>&base.url;</Base>
      <Rule>&lo.issn;/&lo.vol;/&lo.page;</Rule>
      <Attribute>full-text PDF</Attribute>
    </ObjectUrl>
  </Link>
  <!-- LinkId 3: all remaining articles, subscription required -->
  <Link>
    <LinkId>3</LinkId>
    <ProviderId>8888</ProviderId>
    <IconUrl>&icon.url;</IconUrl>
    <ObjectSelector>
      <Database>PubMed</Database>
      <ObjectList>
        <Query> "J Mol Dis" [ta] AND 1997:2010 [pdat]</Query>
        <Query> "J Biol Chem" [ta] AND 1996:2010[pdat]</Query>
      </ObjectList>
    </ObjectSelector>
    <ObjectUrl>
      <Base>&base.url;</Base>
      <Rule>&lo.issn;/&lo.vol;/&lo.page;</Rule>
      <Attribute>full-text online</Attribute>
      <Attribute>subscription/membership/fee required</Attribute>
    </ObjectUrl>
  </Link>
</LinkSet>
Step 3 - File Transfer
Use the LinkOut File Validation utility to validate all your LinkOut files against the LinkOut DTD before submitting them to NCBI. Then transfer both the providerinfo.xml and the holdings files via FTP to the host ftp-private.ncbi.nih.gov. These files must be in plain text format. Place the files in the "holdings" directory of the FTP account set up by NCBI for each publisher. Do not create subdirectories in the holdings directory.
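The transfer can be scripted. The following is a minimal sketch using Python's standard ftplib module, assuming a plain FTP login with the account details supplied by NCBI; the function names are hypothetical, and the user name and password are placeholders to be passed in by the caller. The helper rejects file names without the .xml extension and strips any local directory so that nothing is placed in a subdirectory.

```python
import ftplib
import posixpath

HOST = "ftp-private.ncbi.nih.gov"  # host named in the text
REMOTE_DIR = "holdings"            # directory named in the text


def remote_name(local_path):
    """Return the bare file name to use on the server, or raise ValueError."""
    name = posixpath.basename(local_path.replace("\\", "/"))
    if not name.endswith(".xml"):
        raise ValueError(f"{name!r} does not have the .xml extension")
    return name


def upload(local_paths, user, password):
    """Upload each file, unchanged, into the holdings directory."""
    with ftplib.FTP(HOST) as ftp:
        ftp.login(user, password)
        ftp.cwd(REMOTE_DIR)
        for path in local_paths:
            with open(path, "rb") as handle:
                ftp.storbinary("STOR " + remote_name(path), handle)


print(remote_name("output/journals.xml"))
```

Calling upload(["providerinfo.xml", "journals.xml"], user, password) would send both files; it is not called here because it needs a real account.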
There will be a test period for new LinkOut participants. During this period, the publisher should notify NCBI of all file submissions and updates. NCBI staff will check the accuracy of the submission and update files.
When the files submitted are consistently error free, NCBI will end the test period. From that point on, the files will be processed automatically every morning, except weekends and federal holidays.
A publisher may transfer new versions of current files or add new holdings files at its own discretion. It is the responsibility of the publisher to keep its files current and valid. Links in PubMed are regenerated each day based on the holdings files in each publisher's directory, so publishers must delete obsolete files from their holdings directory.
Step 4 - Activate the Publisher Holdings in PubMed
Once the LinkOut files for a provider are processed, the provider's icon can be displayed in PubMed citations in the Abstract or Citation formats by adding the parameter holding=NameAbbr to the basic PubMed URL. Currently, only the icon from the provider of electronic citation data is listed on the Abstract and Citation display by default.
This example URL illustrates how to display the icon of GoodPublisher for the citations for which they provide full-text links:
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?holding=GoodPublisher
Several NameAbbr values, separated by commas, may be used in a URL to activate more than one icon. For example, to display icons for both GoodPublisher and MyPublisher:
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?holding=GoodPublisher,MyPublisher
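A site that links to PubMed on behalf of several providers can assemble this URL programmatically. The following is a small sketch; the function name is hypothetical, and the URL pattern is taken from the examples above.

```python
PUBMED_URL = "http://www.ncbi.nlm.nih.gov/entrez/query.fcgi"


def holding_url(*name_abbrs):
    """Build a PubMed URL that activates the icons of the given providers."""
    return PUBMED_URL + "?holding=" + ",".join(name_abbrs)


print(holding_url("GoodPublisher"))
print(holding_url("GoodPublisher", "MyPublisher"))
```

Because a NameAbbr contains only letters and digits, no URL encoding is needed for the values.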
A provider's icon can also be activated if a user selects the provider from the LinkOut Preferences in Cubby.
All restrictions on access will still apply. For example, if access to a journal is limited by IP address, users will only have access via a computer within the acceptable IP range; if access is password protected, users must still enter the password.
Allowable Rule Keywords
Below are the allowable Rule keywords (entities) as specified in the LinkOut.DTD.
For any database:
lo.id - Unique Identifier (PMID, GI, TaxID, etc.)
For PubMed only:
lo.pii - Publisher Item Identification. Must be submitted by the publisher. In the PubMed DTD this ID is an attribute of the ArticleId element.
lo.doi - Article DOI
lo.issn - Journal ISSN code
lo.issnl - Journal ISSN code without the dash
lo.jtit - Journal title (MEDLINE abbreviation)
lo.muid - MEDLINE Unique Identifier
lo.msrc - MEDLINE source. For example, Exp Brain Res 1998 Oct; 122(3):339-350
lo.vol - Volume
lo.iss - Issue
lo.page - First page
lo.year - Four-digit year of the publication date. For example, 1998
lo.yr - Last two digits of the year of the publication date. For example, 98; 00
lo.yl - Last digit of the year of the publication date. For example, 9 for 1999; 0 for 1990
lo.month - The month of the publication date. For example, September
lo.mon - The three-letter month abbreviation of the publication date. For example, Sep
lo.mo - Two-digit month of the publication date. For example, 01; 12
lo.day - Two-digit day of the publication date. For example, 01; 31
lo.otit - Article title
lo.auth - First author. For example, Smith JE
lo.authln - Last name of the first author. For example, Smith
For sequence databases (Nucleotide, Protein, Structure, Genome):
lo.pacc - Primary accession for sequences
For Taxonomy only:
lo.name - Scientific name. For example, "Homo sapiens neanderthalensis"
lo.genus - Genus name. For example, "Homo"
lo.species - Species epithet. For example, "sapiens"
lo.subsp - Sub-species epithet. For example, "neanderthalensis"
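To check which date keyword matches a publisher's URL scheme, it can help to see all of the date forms side by side for one publication date. The following sketch derives them from a date according to the descriptions above; it is an illustration only, the function names are hypothetical, and the English month names are hard-coded so the output does not depend on the system locale.

```python
import datetime

MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def date_keywords(pub_date):
    """Return the date-related keyword values for a publication date."""
    year = f"{pub_date.year:04d}"
    month = MONTHS[pub_date.month - 1]
    return {
        "lo.year": year,                   # four-digit year
        "lo.yr": year[-2:],                # last two digits of the year
        "lo.yl": year[-1],                 # last digit of the year
        "lo.month": month,                 # full month name
        "lo.mon": month[:3],               # three-letter abbreviation
        "lo.mo": f"{pub_date.month:02d}",  # two-digit month
        "lo.day": f"{pub_date.day:02d}",   # two-digit day
    }


def issn_without_dash(issn):
    """Return the ISSN in the lo.issnl form, with the dash removed."""
    return issn.replace("-", "")


keywords = date_keywords(datetime.date(1998, 9, 5))
for name, value in keywords.items():
    print(name, value)
```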
Document Delivery Service
LinkOut should be used to link from PubMed and other Entrez databases to resources such as full-text articles. For document delivery service providers, NCBI offers an alternative, the Document Delivery Service and its tool parameter, to handle this kind of comprehensive ordering system.
If the tool parameter is included in a hot link from a web site it will identify the document delivery service and change the function of the Order button to send users back to that web site rather than Loansome Doc.
For example,
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?tool=docdelservice
where docdelservice is the name abbreviation for a document delivery service.
In order for this to function properly, document delivery services must supply NCBI with a URL to a CGI program on their site. We will then transfer summary information such as a PMID back to the site in the URL generated from the Order button. Complete citation information to handle the document request can then be obtained using the Entrez Utilities.
The Cubby includes a selection on the sidebar menu for document delivery services. If a user registers for the Cubby, and selects an alternative document delivery service, all orders will be sent to that service rather than Loansome Doc. The alternative document delivery preference is in effect only when users are logged into the Cubby.
Announcement Mailing List
For general announcements regarding LinkOut for data providers, you may subscribe to the linkout-news announcement mailing list. Please disregard the notice you receive about posting messages to the list. This mailing list is an announcement list only; individual subscribers may not send mail. The list of subscribers is private. To subscribe, send an e-mail message with subscribe in the Subject to:
linkout-news-request@ncbi.nlm.nih.gov
You may also subscribe on the web at:
http://www.ncbi.nlm.nih.gov/mailman/listinfo/linkout-news
For More Assistance
Please send questions about LinkOut to linkout@ncbi.nlm.nih.gov.