There are two aspects to preparing your documents for iPlanet Compass Server:
iPlanet Compass Server is based on an index of documents. When
you perform a search, it is the information in the index that
is examined. When a document is listed, it is the information in
the index that is displayed. iPlanet Compass Server extracts two different
kinds of information from each document to build its index:
The following guidelines describe how to prepare your documents so that iPlanet Compass Server users can more easily find the information that they need.
Note, however, that:
By default, the server index contains four different kinds of information:
You use iPlanet Compass Server to search the index for any or all of these kinds of information. For example, if you enter "Abraham Lincoln" in the search box, you will find documents about Abraham Lincoln and documents written by Abraham Lincoln.
Keywords are words that identify the contents of a document. You use iPlanet Compass Server to search for documents containing the keywords you are interested in.
For example, the keywords for an essay on the life of Thomas Jefferson might include Jefferson, presidents, America, United States, Declaration of Independence, constitution, history, Monticello, founders, founding fathers, revolution, and so on. If you wanted to find documents containing information about Jefferson and Monticello, you would search for those two keywords.
Keywords are the most important element in a document search. By making sure that the index contains the right keywords for your documents you make it easier for users to find the information you want them to have.
Having the right keywords to describe your document is far more important than the number of keywords. (While iPlanet Compass Server can accommodate a maximum of 1 MB of keywords for each document, it is unlikely that any index entry would ever approach that limit.)
By default, the server index obtains its list of keywords from four different document sources:
By default, the index's list of keywords for a document comes from the following content sources:
(Keywords can also come from META tags.)
The most obvious place to look for keywords is in chapter and section headings.
The unique words in all of the level <h1>, <h2>, and <h3> headings are automatically listed as keywords. For example, the heading Lincoln at Gettysburg would produce two useful keywords: Lincoln and Gettysburg.
To ensure that your document's headings are helpful in generating keywords, follow these rules:
By default, the server only generates keywords from the first three heading levels. (Your administrator can specify more or fewer levels.)
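As a minimal sketch of how heading levels relate to keywords, the page below assumes the default setting of three heading levels. The heading text is invented for illustration, and your administrator's settings may differ.

```html
<html>
<head>
<title>Lincoln at Gettysburg</title>
</head>
<body>
<!-- Levels 1 through 3 are read for keywords under the default setting -->
<h1>Lincoln at Gettysburg</h1>
<h2>Writing the Address</h2>
<h3>Surviving Manuscript Copies</h3>
<!-- Level 4 is not read as a heading source under the default setting -->
<h4>Notes</h4>
</body>
</html>
```

Placing the most descriptive words in the first three heading levels, rather than in deeper levels, makes it more likely that they are listed as keywords.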
Much of the information that iPlanet Compass Server uses to generate keywords comes from the text of the document itself. By default, all the unique words in the first 4,000 bytes of text (approximately the first 800 words) are listed as keywords. (Your administrator can increase or decrease the number of bytes from which keywords are taken.)
Keep in mind, though, that to the server, the "first text" is whatever immediately follows the <body> tag in the document file. If the first text is routing information, reference citations, acknowledgments, and so forth, that is what gets listed as keywords.
From the point of view of a search, it is a good rule of thumb to begin each document with a concise summary or overview of the document's contents. By doing that you ensure that the keywords taken from the first text are the important keywords that you want listed in the index.
Note that the exact amount of text included as keywords is adjustable by your site administrator.
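The following sketch shows one way to arrange a document so that a summary is the first text after the <body> tag. The title, summary wording, and distribution line are illustrative only.

```html
<html>
<head>
<title>Third Quarter Sales Analysis</title>
</head>
<body>
<!-- Summary comes first, so its words fall within the first text the server reads -->
<p>This report analyzes third quarter sales by region and product line.</p>
<!-- Routing information, acknowledgments, and citations follow the summary -->
<p>Distribution: regional sales managers.</p>
</body>
</html>
```

If the distribution line came first instead, its words would be among the first listed as keywords.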
In addition to shaping your document content to make iPlanet Compass Server searches more effective, you can also use META tags to help users find the information that they need. You can:
You can use META tags to specify keywords to
be included in the index. When specifying
keywords with META tags, keep these principles in mind:
You can use the following META tags to add keywords to the server
index:
See Working With META information for information on how to add META tags to your documents.
By default, document lists contain two pieces of information about each document:
Document lists produced by a search also display search relevance indicators (boxes), and a link to the item's category if it has one.
The title displayed in a list is the document's search title. When users browse by category, each category's documents are listed alphabetically by title.
You specify the search title you want to use for your document with a <title> META tag.
If you do not include a <title> META tag in your document, the list displays the document's URL (web) address as the title. Since that may not be helpful to readers, it is good practice to always include a <title> META tag in every document.
For many documents, the search title is the same as the formal title that readers see when they view it. But they do not have to be the same, and it is not unusual for the search title and formal title to be different. (See <Title> META Tags for information on the different uses of title tags.)
By default, document lists contain descriptions of every item. You specify the description you want displayed for your document with a <description> META tag.
If you do not include a <description> META tag in your document, the document list displays the first 20 to 30 words of document content as the description. That is, whatever words immediately follow the <body> tag. If those words are headers, bylines, acknowledgments, navigation links, frame descriptors, or other miscellaneous information, they won't provide a very useful description. Thus, it is good practice to always include a <description> META tag in every document.
At most sites, documents are grouped into categories and subcategories. Once a document has been assigned to a category, it will be listed in alphabetic order whenever a reader browses that category. When a document is found by a search, its listing will contain a link back to that document's category so that users can browse for similarly categorized items.
Your site administrator creates the categories and specifies the rules that the server will use when automatically assigning a document to a category or categories.
A document can be assigned to more than one category. By default, a document can be placed in as many as three different categories. (Your site administrator can increase or decrease that number.) When a document has multiple categories, one of those categories is primary. When a document is listed as a result of a search, the link to similarly categorized documents links to the document's primary category.
You can use a META <Classification> tag to explicitly place a document in a particular category. Categories that you assign with a classification tag are in addition to any categories automatically assigned by the server. A category explicitly specified with a classification tag becomes that document's primary category with precedence over any categories automatically generated by iPlanet Compass Server. If you specify multiple categories with a classification tag, the first one is the primary one.
META information is information about a document (as opposed to the document's contents). Types of META information include:
Some META information is generated, maintained, and displayed by the
server containing the document. Other types you enter into the
document itself using META tags. There are two ways of adding
META tags to your documents:
You can use a text editor to add META information to your web documents. META information is specified with META tags.
Important: When using a text editor to add any kind of information to a document, you must always save your document in ASCII format. (Some popular word processors and editors use names like text, text-only, or DOS-text instead of ASCII.) If you do not know how to set your editor to work in, or save as, ASCII text, ask your site administrator for assistance before adding META tags by hand.
Web documents (also known as HTML documents) have two parts:
All META tags (except the title tag) have the same format:
<META name="xxx" content="zzz">.
Where:
For example, the META tag specifying the author for this page looks like this:
<META NAME="Author" CONTENT="Sun Microsystems, Inc.">
For example, the following portion of an HTML document defines the document title, author, description, category, and keywords:
<HTML>
<HEAD>
<TITLE>Declaration of Independence</TITLE>
<META name="Author" content="Thomas Jefferson">
<META name="Description" content="Statement of principles and enumeration of grievances by American colonists to British monarchy">
<META name="Keywords" content="Continental Congress, human rights, independence, America, democracy, July 4th 1776, Philadelphia, Liberty Bell, taxation">
<META name="Classification" content="History:American:Documents">
</HEAD>
</HTML>
In addition to the standard META tags described here, your site administrator can define other META tags that you can use and that the server recognizes.
By default, the following META tags are recognized and used by iPlanet Compass Server. (Your site administrator can add or delete recognized META tags.)
See also Document Titles.
You use the <title> tag to assign a search title to your document. Search titles are used for the following online purposes:
To create a search title:
For example, to create a search title that reads This is the Title, your document would look like this:
<head> <title>This is the Title</title> </head>
Note that unlike other META tags, the <title> tag does not use the word META.
See About Keywords and Document Content and Keywords for information on how iPlanet Compass Server uses keywords.
You use the <keywords> META tag to specify keywords for your document. This tag uses the standard <META name="keywords" content=" "> format.
To create keywords:
For example, to add the keywords Netscape browser, web, HTML, Compass, search, search engine, and document, your document would look like this:
<head> <title>This is the Title</title> <META name="keywords" content="Netscape browser, web, HTML, Compass, search, search engine, document"> </head>
See META Tags and Keywords for general principles that you should use when specifying keywords. Also note that:
You use the <author> META tag to specify the individuals or organization that created your document. This tag uses the standard <META name="author" content=" "> format.
To specify authors:
For example, to add the author Harriet Stowe, your document would look like this:
<head> <title>This is the Title</title> <META name="author" content="Harriet Stowe"> </head>
You can specify multiple authors by separating the author names with
semicolons. For example, to specify both Sun Microsystems and C. Brookes as authors, your tag would
look like this:
<META NAME="Author" CONTENT="Sun Microsystems; C. Brookes">
Using the <Description> META Tag
See also Describing Your Documents.
You use the <description> META tag to specify the short description of your document that you want readers to see when the document is displayed in a list. The words in your description are also added to the list of document keywords.
This tag uses the standard <META name="description" content=" "> format.
To specify a description:
For example, to add the description An analysis of third quarter sales by region and product line, your document would look like this:
<head> <title>This is the Title</title> <META name="description" content="An analysis of third quarter sales by region and product line."> </head>
Your description can be as long as necessary, but keep in mind that most users will prefer to see a brief summary rather than an extensively detailed review.
See Document Descriptions for additional information.
You use the <expires> META tag to specify when your document should be dropped from the iPlanet Compass Server index. (Note that the expiration date only affects the index listing; it does not cause a document to be removed from the server where it resides.)
Once the expiration date you specify has passed, your document will be removed from the index the next time your administrator purges expired documents. See Removing Documents from iPlanet Compass Server for additional information.
This tag uses the standard <META name="expires" content=" "> format.
To specify an expiration date:
For example, to specify that your document should be dropped from the index listing after January 9, 1998, your document would look like this:
<head> <title>This is the Title</title> <META name="expires" content="1/9/98"> </head>
Keep in mind the following points:
1/15/98 1.15.97 1-15-98
See also Categorizing Your Documents.
You use the <classification> META tag to specify categories for your document.
This tag uses the standard <META name="classification" content=" "> format.
To specify a category:
For example, to specify that your document should be listed in the Compass category, which is a subcategory of New Products, which is a subcategory of iPlanet, your document would look like this:
<head> <title>This is the Title</title> <META name="classification" content="iPlanet:New Products:Compass"> </head>
You can specify more than one category up to a maximum number. By default, you can specify a maximum of three categories. (Your site administrator can change this maximum number.)
To specify more than one category, separate the different categories
with semicolons. For example, to specify both iPlanet:New Products:Compass
and iPlanet:Market Share, your tag would look like this:
<META name="classification" content="iPlanet:New Products:Compass;iPlanet:Market Share">
Keep in mind the following points:
See Categorizing Your Documents for additional information.
You can easily add an iPlanet Compass Server search box to any of your HTML web documents. This allows anyone who is viewing your document to conduct a search from your page.
You add a search box by adding the following lines of HTML code to your document at the place where you want the search box to appear. (Note: In the example below, replace http://your.host.com with the URL of your Compass Server. Ask your site administrator if you are unsure what URL to use.)
<FORM METHOD="GET" ACTION="http://your.host.com/compass">
<INPUT TYPE="text" NAME="scope">
<INPUT TYPE="submit" VALUE="Search">
<INPUT TYPE="hidden" NAME="ui" VALUE="sr">
</FORM>
This produces a search box in your document that looks like this:
Using JavaScript, you can also:
See the Compass Server Developer's Guide for information on enhancing search boxes with JavaScript.
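To show where the search box code goes, the sketch below embeds the form from this section in a complete page. The heading and paragraph text are placeholders, and http://your.host.com stands in for the URL of your Compass Server.

```html
<html>
<head>
<title>This is the Title</title>
</head>
<body>
<h1>Product Overview</h1>
<p>Search our site:</p>
<!-- Replace http://your.host.com with the URL of your Compass Server -->
<form method="get" action="http://your.host.com/compass">
<input type="text" name="scope">
<input type="submit" value="Search">
<input type="hidden" name="ui" value="sr">
</form>
<p>The rest of the document content goes here.</p>
</body>
</html>
```

The form can appear anywhere inside the body of the page; the browser displays the search box at that position.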
Note: The file containing your document is stored on a file server.
iPlanet Compass Server cannot remove a document from a file server. To
remove a document from a file server, you must use whatever tool, application, or command
is appropriate.
To remove a document's index listing, you first specify an expiration date with an <expires> META tag. Once the expiration date you specify has passed, your document will be removed from the index the next time your administrator purges expired documents.
If you did not
specify an expiration date when you originally created your document,
you can edit it later to add one. You can also edit your document to
change an expiration date. Note, however, that a new or altered
expiration date will not take effect until the next time the
server happens to index the site where the document is stored. (Your
site administrator controls when and how often sites are indexed.)