CDDS Search Help
The Webinator search can be as simple or as complex as you need it to be.
Usually you just need to enter a few words that best describe what you are
trying to locate. For more complicated searches you can use any combination
of logic operators, special pattern matchers, concept expansion, or
proximity operations.
Example: nature conservation organization
Query Rules of Thumb:
- If you get too many junk or nonsense answers, try the following:
  - Add some more words to your query.
  - Decrease the range of the Proximity control.
  - Change the Word Forms control to Exact.
  - Look at the Match Info to see why they are showing up.
  - Use the Exclusion Operator (-) to remove unwanted terms.
  - If you are searching for a phrase, hyphenate the words together.
- If you don't get any answers, or just too few:
  - Remove some words from your query.
  - Check your spelling.
  - Increase the scope of the Proximity control.
  - It just might not be there.
Overview of query abilities
- Controlling proximity:
Mastering proximity lets you locate answers with greater precision.
The Site Search input form gives you several options to control the
search proximity:
- line: All query terms must occur on the same line.
- sentence: All query terms must occur within the same sentence.
- paragraph: All query terms must occur within the same paragraph or text block.
- page (default): All query terms must occur within the same HTML document.
The bar-graph display will be shown any time a ranking search was performed
(e.g. all searches except Show Parents).
More blue indicates a better match.
- Ranking Factors
The ranking algorithm takes into consideration relative word ordering,
word proximity, database frequency, document frequency, and position
in text. The relative importance of these factors in computing
the quality of a hit can be altered under RANKING FACTORS
on the Options page.
- Keywords, Phrases and Wild-cards:
To locate words, just type them in as you would in a word processor.
Letter case is ignored.
The wild-card character * (asterisk) may be used to match just the prefix
of a word or to ignore the middle of something.
To locate a number of adjacent words in a specific order, surround them
with " (double quotation) characters. Putting a '-' (hyphen) between words
will also force word order and one-word proximity.
Examples:
| Query | Locates |
| john | john, John |
| "john public" | John Public |
| web-browser | Web browser, web-browser |
| John*Public | John Q. Public, John Public |
| 456*a*def | 1-23456-789-ABCDEF |
| activate | activate, activation, activated... (see Word Forms) |
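The wild-card rows in the table above can be imitated with a small Python sketch. This is only an illustration, not the search engine's actual matcher: it assumes that `*` stands for any run of characters, that letter case is ignored, and that a wild-card pattern may match anywhere inside the text. The function names are hypothetical.

```python
import re

def wildcard_to_regex(query: str) -> re.Pattern:
    """Translate a wild-card query into a case-insensitive regular expression.

    Assumption: '*' means "any run of characters"; everything else is literal.
    """
    # Escape each literal piece so characters such as '.' are not treated
    # as regex syntax, then join the pieces with '.*' in place of '*'.
    pieces = [re.escape(piece) for piece in query.split("*")]
    return re.compile(".*".join(pieces), re.IGNORECASE)

def locates(query: str, text: str) -> bool:
    """Return True if the wild-card query matches somewhere in the text."""
    return wildcard_to_regex(query).search(text) is not None

print(locates("John*Public", "John Q. Public"))
print(locates("456*a*def", "1-23456-789-ABCDEF"))
```

Under these assumptions both calls report a match, which agrees with the corresponding table rows.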
- Applying Search Logic
Texis and Metamorph use set logic for text queries. Set logic is easier
to use and provides more abilities than boolean logic. The examples below
refer to single keywords, but keep in mind that each keyword can represent
an entire list of things or any of the special pattern matchers.
Sets (or lists) of things are specified by placing the elements within
parentheses, separated by commas. Example: (bob,joe,sam,sue). In the
examples below, you could replace any of the keywords with a list like this.
The default behavior of the search is to locate an intersection (or 'AND')
of every element within a query. This means that the query
"microsoft bob interface" is equivalent to the boolean query
"microsoft AND bob AND interface".
- '-' (without)
The '-' (minus) is the most commonly used logic symbol.
It means the answer should EXCLUDE references to that item.
- '+' (mandatory)
The '+' (plus) symbol in front of a search item means that the answer
MUST INCLUDE that item. This is generally used in conjunction with the
permutation operation.
- '@N' (permute)
The '@' followed by a number indicates how many intersections to locate
among the terms in your query. For example, @1 asks for one intersection,
that is, any two of the query terms occurring together. This may be
confusing at first, but it is very powerful.
Notes: Only the '+' and '-' operations are valid with a relevance rank search.
Examples:
| Query | Finds |
| bob sam joe | Bob with Sam and Joe (within the selected proximity) |
| bob sam -joe | Bob with Sam without Joe |
| bob sam joe @1 | Bob with Sam, or Bob with Joe, or Joe with Sam |
| A B C D @1 | AB or AC or AD or BC or BD or CD |
| +A B C D @1 | ABC or ABD or ACD |
| A B C -D @1 | (AB or AC or BC) without D |
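To make the permutation rows above easier to follow, here is a minimal Python sketch that enumerates the term combinations a query stands for. It is an interpretation inferred from the example table, not the Metamorph implementation: it assumes that `@N` means N intersections (so N+1 of the plain terms must co-occur), that `+` terms are added to every combination, and that `-` terms are simply collected as exclusions. The function name `expand_permutations` is hypothetical.

```python
from itertools import combinations

def expand_permutations(query):
    """Return (combos, excluded) for a query such as '+A B C D @1'.

    Each combo is a tuple of terms that must occur together; `excluded`
    lists the terms that must be absent. Semantics are assumed from the
    examples in the help text.
    """
    mandatory, optional, excluded = [], [], []
    intersections = None
    for token in query.split():
        if token.startswith("@"):
            intersections = int(token[1:])
        elif token.startswith("+"):
            mandatory.append(token[1:])
        elif token.startswith("-"):
            excluded.append(token[1:])
        else:
            optional.append(token)
    if intersections is None:
        # Default behavior: intersection (AND) of every plain term.
        size = len(optional)
    else:
        # N intersections means N + 1 of the plain terms occur together.
        size = min(intersections + 1, len(optional))
    combos = [tuple(mandatory) + combo for combo in combinations(optional, size)]
    return combos, excluded

combos, excluded = expand_permutations("+A B C D @1")
print(["".join(c) for c in combos], excluded)
```

For `A B C -D @1` the same function yields the pairs AB, AC and BC together with D as an excluded term.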
- Natural Language Query:
You may enter a query in the form of a sentence or question. The
software will automatically identify the important words and phrases
within your query and remove the "noise words".
- Example: What is the state of the art in text retrieval?
- The software will search for: state of the art AND text AND retrieval
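The natural language example above can be approximated with a short Python sketch. The noise-word list and the phrase list below are hypothetical stand-ins chosen only to reproduce that one example; the real software presumably uses much larger vocabularies and its own rules.

```python
import re

# Hypothetical lists, sized just for the example in the help text.
NOISE_WORDS = {"what", "is", "the", "of", "in", "a", "an"}
KNOWN_PHRASES = [("state", "of", "the", "art")]

def extract_terms(question):
    """Keep known phrases and non-noise words, in their original order."""
    words = re.findall(r"[a-z0-9']+", question.lower())
    terms = []
    i = 0
    while i < len(words):
        # Check for a known phrase first, so its noise words are not dropped.
        phrase = next(
            (p for p in KNOWN_PHRASES if tuple(words[i:i + len(p)]) == p), None
        )
        if phrase:
            terms.append(" ".join(phrase))
            i += len(phrase)
        elif words[i] in NOISE_WORDS:
            i += 1
        else:
            terms.append(words[i])
            i += 1
    return terms

terms = extract_terms("What is the state of the art in text retrieval?")
print(" AND ".join(terms))  # state of the art AND text AND retrieval
```

Phrases are tested before noise words so that "of" and "the" survive inside "state of the art" while being dropped elsewhere.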
- Invoking Thesaurus Expansion
Metamorph and Texis have an editable vocabulary of over 250,000
word and phrase associations. Each entry is generally classifiable
by either its meaning or part of speech.
To expand the meaning of a word or phrase within your query, precede it
with a '~' (tilde) character.
The Word Forms options give you control over how many variations of your
query terms will be sought in your search.
- Exact: (default) Only exact matches will be allowed.
- Plural & possessives: Plural and possessive forms will be found (s, es, 's).
- Any word forms: As many word forms as can be derived will be located.
EXAMPLES:
president
EXACT : president
PLURAL: (above) + presidents president's
ANY : (above) + presidential presidency preside presides presiding presided
tight
EXACT : tight
PLURAL: (above) + tights
ANY : (above) + tightly tightening tightened tighter tightest
program
EXACT : program
PLURAL: (above) + programs program's
ANY : (above) + programming programmatic programmed programmer programmable
We call this morpheme processing, and it is generally smarter than a
traditional "stemming" algorithm. It doesn't just rip the end off a word;
it actually checks to see if it could be a valid form of the search term.
Notes: Thesaurus terms are also treated in the same manner. Words shorter
than 4-5 characters will not be processed.
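As a rough illustration of the Exact and Plural & possessives settings, the Python sketch below checks whether a word found in a document counts as a form of a query term. It is a deliberate simplification: it only tries the three suffixes named above (s, es, 's), treats the minimum term length as 4 (an assumption, since the note gives a 4-5 range), and does not attempt the "Any word forms" setting, which needs real morphological knowledge. The function name is hypothetical.

```python
PLURAL_SUFFIXES = ("s", "es", "'s")

def is_form_of(term, word, mode="exact", min_length=4):
    """Return True if `word` is an accepted form of `term` under `mode`.

    mode is "exact" or "plural". Terms shorter than `min_length`
    (an assumed threshold) are only ever matched exactly.
    """
    term, word = term.lower(), word.lower()
    if word == term:
        return True
    if mode == "exact" or len(term) < min_length:
        return False
    # Plural & possessives: the term followed by one of the listed suffixes.
    return any(word == term + suffix for suffix in PLURAL_SUFFIXES)

print(is_form_of("president", "presidents", mode="plural"))   # True
print(is_form_of("president", "presidential", mode="plural")) # False
```

Unlike the morpheme processing described above, this sketch does not verify that a candidate is a valid word; it only compares suffixes.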
These options give you control over the region in which a match must be found.
- line: match terms must be located within the same line.
- sentence: all terms within the same sentence.
- paragraph: match terms must be located within the same paragraph.
- page: (default) all terms within the same document.
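The four proximity regions listed above can be sketched in Python as follows. This is only an approximation for illustration: it assumes sentences end at '.', '!' or '?' followed by whitespace, and paragraphs are separated by blank lines, which may differ from how the search engine actually delimits text. The function names are hypothetical.

```python
import re

def split_units(text, proximity):
    """Split text into the regions implied by a proximity setting."""
    if proximity == "line":
        return text.splitlines()
    if proximity == "sentence":
        # Assumed rule: a sentence ends at ., ! or ? followed by whitespace.
        return re.split(r"(?<=[.!?])\s+", text)
    if proximity == "paragraph":
        # Assumed rule: paragraphs are separated by blank lines.
        return re.split(r"\n\s*\n", text)
    if proximity == "page":
        return [text]
    raise ValueError(f"unknown proximity: {proximity}")

def matches_within(text, terms, proximity="page"):
    """Return True if every term occurs inside a single region."""
    wanted = [term.lower() for term in terms]
    for unit in split_units(text, proximity):
        words = set(re.findall(r"\w+", unit.lower()))
        if all(term in words for term in wanted):
            return True
    return False

doc = "Bob met Sam.\nJoe arrived later.\n\nSue stayed home."
print(matches_within(doc, ["bob", "joe"], "line"))       # False
print(matches_within(doc, ["bob", "joe"], "paragraph"))  # True
```

Narrowing the proximity from page to line shrinks the region every term must share, which is why it reduces the number of answers.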
In all cases the best possible
matches for your query are located and ordered by decreasing quality.
A bar graph is produced to indicate the quality of each answer.
Note: The look
and feel described here is for the standard search interface. The
interface may have been customized by the web site administrator.
When a query is submitted it will come back with another query form and
up to 10 matching documents. If there are more than 10 answers, a link
at the top and bottom of the list will allow you to view the next
10 in sequence.
The input form at the top allows you to further tailor your query to
home in on the desired answers, or to submit a completely new query
without having to navigate back to the original input form.
Each answer
in the result set will have a format similar to the following:
The components
of each result are:
- Result number
- Document title (clicking on this will take you to the original document)
- Abstract (the first few hundred characters of the document)
- Match quality graph, e.g. 84% (only shown if relevancy ranking was used)
- Size (how big the original document is)
- Depth (how many clicks from the top of the site)
- Find Similar (find other documents similar to this one)
- Match Info (view the matches and other information about the document)
- Show Parents (list pages that link to this one)
The Match Info link will show you the context of your answers within
the matching document. Matching words will be shown as hypertext links.
Clicking on any match term will take you to the next matching term.
A summary at the top of the in-context view shows information about
the document, including the time it was last modified.
The Find Similar link will find documents that are similar to the
corresponding result. It does this by reading the original document to
ascertain its main subject matter, and then conducting a relevance ranked
search for those subjects.
Result documents are ordered from best to worst match. The bar graph
display will indicate the overall quality of the match.
Note: The document you click on may not be ranked as the best match.
This is because other documents may contain more information about
the overall subject matter than the original.
It is often difficult to navigate using a search engine because
there is no back-link present on the matching document. The
Show Parents link solves this.
This link will show other documents that contain hyperlinks to the one
you click on. In other words, it is an automated back button.
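The idea behind Show Parents can be sketched as inverting a table of outgoing links. The example below is a hypothetical illustration in Python, with made-up page names; it does not describe how the search engine actually stores its link data.

```python
from collections import defaultdict

def build_parent_index(links):
    """Invert {page: [pages it links to]} into {page: {pages linking to it}}."""
    parents = defaultdict(set)
    for page, targets in links.items():
        for target in targets:
            parents[target].add(page)
    return dict(parents)

# Hypothetical site: the keys are pages, the values are their outgoing links.
links = {
    "/index.html": ["/a.html", "/b.html"],
    "/a.html": ["/b.html"],
}
parents = build_parent_index(links)
print(sorted(parents["/b.html"]))  # ['/a.html', '/index.html']
```

A page that nothing links to, such as `/index.html` here, has no entry in the index and therefore no parents to show.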