This web site was copied prior to December 11, 2006. It is now a Federal record managed by the National Archives and Records Administration. External links, forms, and search boxes may not function within this collection.

U.S. House of Representatives: Search Tips


Internet Query Parser (Simple Search) | Verity Query Language (Advanced Search)

Internet Query Parser (IQP)

With the internet-style query parser (IQP), users can search entire documents or parts of documents (zones and fields) using a command syntax similar to the syntax used in many Web search engines.

Search Terms
In a search form enabled with the internet-style query parser, users can enter words, phrases, and plain language. The internet-style parser does not support the Verity query language (VQL).

Words
To search for multiple words, separate them with spaces.

Phrases
To search for an exact phrase, surround it with double quotation marks. A string of capitalized words is assumed to be a name. Separate a series of names with commas. Commas aren't needed when the phrases are surrounded by quotation marks.

The following example searches for a document that contains the phrases "San Francisco" and "sourdough bread".
San Francisco "sourdough bread"

Plain Language
To search with plain language, enter a question or concept. The internet-style query parser identifies the important words and searches for them. For example, enter a question such as:
Where is the sales office in San Francisco?
This query produces the same results as entering:
sales office San Francisco
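
The reduction from a question to its important words can be pictured with a small Python sketch. This is an illustration only: the page does not say how the parser chooses important words, so the stop-word list and the function name important_words below are assumptions, not the parser's actual method.

```python
import re

# Hypothetical stop-word list; the real parser's list is not documented here.
STOP_WORDS = {"where", "is", "the", "in", "a", "an", "of", "what", "are", "to"}


def important_words(question: str) -> str:
    """Drop punctuation and stop words, keeping the remaining words in order."""
    words = re.findall(r"[A-Za-z0-9']+", question)
    kept = [w for w in words if w.lower() not in STOP_WORDS]
    return " ".join(kept)


print(important_words("Where is the sales office in San Francisco?"))
# prints: sales office San Francisco
```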

Including and Excluding Search Terms
You can limit searches by excluding or requiring search terms, or by limiting the areas of the document that are searched. A minus sign (-) immediately preceding a search term (word or phrase) excludes documents containing the term.

A plus sign (+) immediately preceding a search term (word or phrase) means returned documents are guaranteed to contain the term. If neither sign is associated with the search term, the results may include documents that do not contain the specified term as long as they meet other search criteria.
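
The effect of the plus and minus signs can be sketched in Python as follows. This is a simplified model, assuming single-word terms and whitespace tokenization; the function names are hypothetical and the real parser also handles phrases and relevance ranking.

```python
def parse_terms(query: str):
    """Split a query into required (+), excluded (-), and optional terms."""
    required, excluded, optional = [], [], []
    for token in query.split():
        if token.startswith("+") and len(token) > 1:
            required.append(token[1:].lower())
        elif token.startswith("-") and len(token) > 1:
            excluded.append(token[1:].lower())
        else:
            optional.append(token.lower())
    return required, excluded, optional


def document_matches(document: str, query: str) -> bool:
    """Return True if the document satisfies the +/- constraints of the query."""
    words = set(document.lower().split())
    required, excluded, optional = parse_terms(query)
    if any(term in words for term in excluded):
        return False  # a minus term rules the document out
    if not all(term in words for term in required):
        return False  # every plus term must be present
    # Assumption: with no plus terms, at least one unsigned term must appear.
    return bool(required) or any(term in words for term in optional)
```

For example, document_matches("sourdough rolls", "bread sourdough") is true because neither term is required, while document_matches("sourdough rolls", "+bread sourdough") is false.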

Verity Query Language (VQL)

VQL
Search Tips Index
Basic Queries
Optional Modifiers
CASE, MANY, NOT, ORDER, [X]

Concept Operators
AND, ACCRUE, OR

Proximity Operators
IN, WHEN, NEAR, NEAR/N, PARAGRAPH, PHRASE, SENTENCE

Evidence Operators
SOUNDEX, STEM, THESAURUS, TYPO/N, WORD, WILDCARD, ?, *, [], {}, [ ^ ], [ - ]

Relational Operators (Text Fields)
CONTAINS, ENDS, MATCHES, STARTS, SUBSTRING

Relational Operators (Numeric Fields)
=, >, >=, <, <=

QUERIES
A simple query uses words and phrases separated by commas.

When a query is entered in mixed case, the search engine finds case-sensitive matches.
Queries entered completely in upper or lower case force the search engine to find all versions of the query terms -- mixed, upper, and lower case.
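
The case rule above can be expressed as a short Python sketch. The function names are hypothetical, and the sketch compares single words only.

```python
def is_case_sensitive(query: str) -> bool:
    """A query is treated as case-sensitive only when it mixes upper and lower case."""
    letters = [c for c in query if c.isalpha()]
    has_upper = any(c.isupper() for c in letters)
    has_lower = any(c.islower() for c in letters)
    return has_upper and has_lower


def term_matches(query: str, word: str) -> bool:
    """Compare a query term with a document word using the case rule."""
    if is_case_sensitive(query):
        return query == word
    return query.lower() == word.lower()
```

Under this model, the query next matches NeXT, Next, and NEXT, while the query NeXT matches only NeXT.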

To search for characters such as backslash (\), comma (,), and double quotation marks ("), insert a backslash before the character, for example:
     C:\\bin\\print
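
When queries are built by a program, the escaping can be automated. The following Python sketch covers only the three characters named above; the function name is hypothetical, and other characters may also need escaping.

```python
def escape_query_text(text: str) -> str:
    """Prefix backslash, comma, and double quote with a backslash."""
    special = {"\\", ",", '"'}
    return "".join("\\" + ch if ch in special else ch for ch in text)


print(escape_query_text(r"C:\bin\print"))
# prints: C:\\bin\\print
```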

To create robust queries, you can incorporate Verity operators and modifiers in the query syntax. The use of operators and modifiers is optional. A modifier usually is used with an operator rather than alone.

Field searches, both text and numeric, can be performed using the relational operators. Numeric and text searches are supported for fields defined in your Verity collections.


OPTIONAL MODIFIERS

CASE
Performs a case-sensitive search. When the query expression uses mixed case, the search engine attempts to match that exact case. CASE is invalid with the SOUNDEX and STEM operators.
     <CASE> NeXT

MANY
Incorporates the density of search words in the calculation of the relevance-ranked score. Used with evidence and proximity operators, except NEAR.
     <MANY> <STEM> apple

NOT
Excludes documents containing the specified words or phrases. To search for the word NOT, enclose NOT in double quotes.
     cat <AND> mouse <AND> <NOT> dog

ORDER
Specifies the order in which search elements must occur in the document. Used with PARAGRAPH, SENTENCE, and NEAR/N.
     president <ORDER> <PARAGRAPH> adams

[X]
(RANKING)
Assigns a relative importance, or weight, to search terms from 1 to 100, where 1 represents the lowest importance and 100 represents the highest.
     [50]politics, [80]politics <IN> title


CONCEPT OPERATORS

AND
Selects documents that contain all of the search elements you specify. (To search for the word AND, enclose AND in double quotes.)
     computer <AND> laptop
     Lewis "and" Clark

ACCRUE
Selects documents that include at least one of the search elements you specify. The more search elements that are present, the higher the score will be.
     <ACCRUE> (computer, laptop)

OR
Selects documents that show evidence of at least one of the search elements you specify. (To search for the word OR, enclose OR in double quotes.)
     nation <OR> region
     left "or" right
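
The difference between AND, OR, and ACCRUE can be illustrated with a toy scoring model in Python. The scores below are assumptions chosen for illustration; the page does not give the search engine's actual scoring formula.

```python
def score_and(doc_words, terms):
    """AND: the document qualifies only if every term is present."""
    return 1.0 if all(t in doc_words for t in terms) else 0.0


def score_or(doc_words, terms):
    """OR: the document qualifies if at least one term is present."""
    return 1.0 if any(t in doc_words for t in terms) else 0.0


def score_accrue(doc_words, terms):
    """ACCRUE: like OR, but the score grows with the number of terms present.

    Assumption: the score is the fraction of terms found.
    """
    hits = sum(1 for t in terms if t in doc_words)
    return hits / len(terms)


words = set("a laptop computer review".split())
print(score_and(words, ["computer", "tablet"]))     # prints: 0.0
print(score_or(words, ["computer", "tablet"]))      # prints: 1.0
print(score_accrue(words, ["computer", "tablet"]))  # prints: 0.5
```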


PROXIMITY OPERATORS

IN
Selects documents that contain specified values in one or more document zones. A document zone represents a region of a document, such as the document's summary, date, or body text.
     laptop <IN> title

WHEN
The IN operator can be qualified with the WHEN operator to search for a term only within zones that meet certain conditions. The following example locates links to the document "report.html" on the Verity Web site.
     "report.html" <IN> A <WHEN> (HREF <CONTAINS> "verity")

NEAR
Selects documents containing the specified search terms. The closer the search terms are to each other within a document, the higher the document's score.
     <CASE> World <NEAR> peace

NEAR/N
Selects documents containing two or more search terms within N words of each other, where N is an integer up to 1000. The closer the search terms are within a document, the higher the document's score.
     commute <NEAR/10> train
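
A proximity test of this kind can be sketched in Python. The sketch assumes that "within N words" means the word positions differ by at most N, and it uses simple whitespace tokenization; neither detail is confirmed by this page, and the function name is hypothetical.

```python
def within_n_words(text: str, first: str, second: str, n: int) -> bool:
    """Return True if some occurrence of each term lies within n word positions."""
    words = [w.strip('.,;:!?"').lower() for w in text.split()]
    pos_first = [i for i, w in enumerate(words) if w == first.lower()]
    pos_second = [i for i, w in enumerate(words) if w == second.lower()]
    return any(abs(i - j) <= n for i in pos_first for j in pos_second)


sentence = "I commute to work by train every day."
print(within_n_words(sentence, "commute", "train", 10))  # prints: True
print(within_n_words(sentence, "commute", "train", 3))   # prints: False
```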

PARAGRAPH
Selects documents that include all of the search elements you specify within the same paragraph.
     <CASE> Apple <PARAGRAPH> computer
     <MANY> <PARAGRAPH> laptop

PHRASE
Selects documents that include a phrase you specify. A phrase is a grouping of two or more words that occur in a specific order.
     national <PHRASE> park

SENTENCE
Selects documents that include all of the words you specify within the same sentence.
     <SENTENCE> (car, park)


EVIDENCE OPERATORS

SOUNDEX
Expands the search to include the word you enter and one or more words that "sound like", or whose letter pattern is similar to, the word specified. Collections do not have sound-alike indexes by default; to use this feature sound-alike indexes must be built.
     <SOUNDEX> sale

STEM
Expands the search to include the word you enter and its linguistic variations.
     <STEM> film

THESAURUS
Expands the search to include the word you enter and its synonyms.
     <THESAURUS> altitude
     <MANY> <THESAURUS> altitude

TYPO/N
Expands the search to include the word you enter plus words that are similar to the query term. This operator performs approximate pattern matching to identify similar words. The number specifies the maximum number of transpositions in one word.
     <TYPO/3> sweeping
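
Approximate matching of this kind is often described in terms of edit distance. The Python sketch below uses the standard Levenshtein distance (insertions, deletions, and substitutions) as a stand-in; it is not the search engine's own algorithm, and the function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def typo_match(query: str, word: str, n: int) -> bool:
    """Return True if word is within n edits of the query term."""
    return edit_distance(query.lower(), word.lower()) <= n


print(typo_match("sweeping", "sleeping", 3))  # prints: True
```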

WORD
Performs a basic word search, selecting documents that include one or more instances of the specific word you enter.
     <WORD> rhetoric

WILDCARD
Matches wildcard characters included in search strings. Certain characters automatically indicate a wildcard specification.

?
Specifies one alphanumeric character. When the question mark is used, <WILDCARD> is unnecessary.

*
Specifies zero or more alphanumeric characters. When the asterisk is used, <WILDCARD> is unnecessary.

[ ]
Specifies any one of the characters in a set. You must enclose the word that includes a set in back quotes (`), and there can be no spaces in a set.
     <WILDCARD> `c[au]t` finds cat and cut.

{ }
Specifies one of each pattern separated by a comma. You must enclose the word that includes a pattern in back quotes (`), and there can be no spaces in a set.
     <WILDCARD> `cat{s,er}` finds cats and cater.

[ ^ ]
Specifies characters excluded from the set. The caret (^) must be the first character after the left bracket ([) that introduces a set.
     <WILDCARD> `l[^ai]p` excludes lap and lip.

[ - ]
Specifies a range of characters in a set.
     <WILDCARD> `c[a-u]t` finds every three-letter word from "cat" to "cut".
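
The wildcard characters described above behave much like a small subset of regular expressions. The following Python sketch translates a wildcard pattern (written without its enclosing quotes) into a regular expression. It is an approximation for illustration: the function names are hypothetical, and it assumes "alphanumeric" means the ASCII letters and digits.

```python
import re


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate ?, *, [set], [^set], [a-z], and {a,b} into a regular expression."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "?":
            out.append("[A-Za-z0-9]")    # exactly one alphanumeric character
        elif ch == "*":
            out.append("[A-Za-z0-9]*")   # zero or more alphanumeric characters
        elif ch == "[":
            end = pattern.index("]", i)
            out.append(pattern[i:end + 1])  # sets, ranges, and ^ carry over as-is
            i = end
        elif ch == "{":
            end = pattern.index("}", i)
            options = pattern[i + 1:end].split(",")
            out.append("(?:" + "|".join(re.escape(o) for o in options) + ")")
            i = end
        else:
            out.append(re.escape(ch))
        i += 1
    return re.compile("".join(out))


def wildcard_match(pattern: str, word: str) -> bool:
    """Return True if the whole word matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(word) is not None


print(wildcard_match("c[au]t", "cut"))       # prints: True
print(wildcard_match("cat{s,er}", "cater"))  # prints: True
print(wildcard_match("l[^ai]p", "lap"))      # prints: False
```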


RELATIONAL OPERATORS (TEXT FIELDS)

CONTAINS
Selects documents by matching the word or phrase you specify with the values stored in a specific document field. Documents are selected only if the search elements specified appear in the same sequential and contiguous order in the field value.
     title <CONTAINS> computer

ENDS
Selects documents by matching the character string you specify with the ending characters stored in a specific document field.
     title <ENDS> ter

MATCHES
Selects documents by matching the character string you specify with values stored in a specific document field. Documents are selected only if the search elements specified match the field value exactly. If a partial match is found, a document is not selected.
     author <MATCHES> Lang

STARTS
Selects documents by matching the character string you specify with the starting characters stored in a specific document field.
     author <STARTS> jack

SUBSTRING
Selects documents by matching the character string you specify with a portion of the strings stored in a specific document field.
     organization <SUBSTRING> eng
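
The five text-field operators can be compared side by side with ordinary string operations. The Python sketch below is a rough model: the function names are hypothetical, and it assumes case-insensitive comparison and whitespace-separated words, which this page does not specify for field searches.

```python
def field_contains(value: str, phrase: str) -> bool:
    """CONTAINS: the words of the phrase appear contiguously and in order."""
    v, p = value.lower().split(), phrase.lower().split()
    return any(v[i:i + len(p)] == p for i in range(len(v) - len(p) + 1))


def field_matches(value: str, text: str) -> bool:
    """MATCHES: the whole field value equals the text; partial matches fail."""
    return value.lower() == text.lower()


def field_starts(value: str, text: str) -> bool:
    """STARTS: the field value begins with the text."""
    return value.lower().startswith(text.lower())


def field_ends(value: str, text: str) -> bool:
    """ENDS: the field value ends with the text."""
    return value.lower().endswith(text.lower())


def field_substring(value: str, text: str) -> bool:
    """SUBSTRING: the text appears anywhere inside the field value."""
    return text.lower() in value.lower()


print(field_matches("Langford", "Lang"))         # prints: False
print(field_starts("Langford", "Lang"))          # prints: True
print(field_substring("Engineering", "eng"))     # prints: True
```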


RELATIONAL OPERATORS (NUMERIC FIELDS)

=
(EQUALS)
Selects documents whose document field values are exactly the same as the search string you specify.
     orgnumber = 500

>
(GREATER THAN)
Selects documents whose document field values are greater than the search string you specify.
     revision > 4

>=
(GREATER THAN OR EQUAL TO)
Selects documents whose document field values are greater than or equal to the search string you specify.
     pages >= 500

<
(LESS THAN)
Selects documents whose document field values are less than the search string you specify.
     groupnumber < 20

<=
(LESS THAN OR EQUAL TO)
Selects documents whose document field values are less than or exactly the same as the search string you specify.
     pages <= 500
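
The numeric operators correspond directly to ordinary comparisons. As an illustration, the Python sketch below filters a list of documents, each represented as a dictionary of field values; the document structure and the function name are assumptions made for the example.

```python
import operator

# Map each query operator to the equivalent Python comparison.
OPERATORS = {
    "=": operator.eq,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def select(documents, field, op, value):
    """Return the documents whose numeric field satisfies the comparison."""
    compare = OPERATORS[op]
    return [d for d in documents if field in d and compare(d[field], value)]


docs = [{"pages": 120}, {"pages": 500}, {"pages": 750}, {"revision": 5}]
print(select(docs, "pages", ">=", 500))
# prints: [{'pages': 500}, {'pages': 750}]
```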