Operators must be capitalized; otherwise they are treated as query terms. Each operator must also be preceded and followed by a query term or query phrase.
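The two rules above can be checked mechanically before a query is submitted. The following is a minimal sketch, not the product's own validator: the function name `check_operator_placement` and the token pattern are assumptions made for illustration, and it only handles flat queries without parentheses.

```python
import re

# Operator names taken from this page.
OPERATORS = {"OR", "AND", "NOT", "NEAR", "NOTNEAR", "WITH", "NOTWITH", "EXCLUDE"}

def is_operator(token: str) -> bool:
    # NEAR/50 counts as NEAR; only the capitalized form is an operator.
    return token.split("/")[0] in OPERATORS

def check_operator_placement(query: str) -> list[str]:
    """Return a list of problems found in a flat (parenthesis-free) query."""
    # Keep quoted phrases together so words inside quotes are never flagged.
    tokens = re.findall(r'~?"[^"]*"|\S+', query)
    problems = []
    for i, tok in enumerate(tokens):
        if is_operator(tok):
            left = tokens[i - 1] if i > 0 else None
            right = tokens[i + 1] if i + 1 < len(tokens) else None
            if left is None or right is None or is_operator(left) or is_operator(right):
                problems.append(f"operator {tok} needs a term or phrase on both sides")
        elif tok.split("/")[0].upper() in OPERATORS:
            problems.append(f"'{tok}' is not capitalized, so it is read as a query term")
    return problems

print(check_operator_placement("onions or cheese"))
print(check_operator_placement("OR cheese"))
```

An empty list means neither rule was violated; words inside a quoted phrase, such as the "and" in "salt and pepper", are deliberately ignored.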
Inside a query, the OR operator may be used to retrieve documents containing either of two terms.
Example:
Inside a query, the AND operator may be used to retrieve documents containing both specified terms.
Example:
The NEAR operator is effectively an AND operator that lets you control the distance between the words. onions NEAR cheese means that the term cheese must occur within 10 words of onions. The default distance is 10 words, but you can change it by adding a number suffix: onions NEAR/50 cheese means that onions must occur within 50 words of cheese. The window can be between 0 and 99.
Other examples include:
Do not use the NEAR operator in the following fashions:
The NOTNEAR operator is effectively a NOT operator that lets you control the distance between the two words. onions NOTNEAR cheese means that the term onions cannot occur within 10 words of the term cheese. The default distance is 10 words, but you can change it by adding a number suffix: onions NOTNEAR/50 cheese means that onions cannot occur within 50 words of cheese. The window can be between 0 and 99.
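To make the window behaviour of NEAR and NOTNEAR concrete, here is a rough sketch under stated assumptions. The names `parse_proximity` and `proximity_match` are hypothetical, distance is assumed to be the difference between word positions, NOTNEAR is assumed to require that the left term is present, and stemming is ignored; the real engine may differ on all of these points.

```python
import re

def parse_proximity(op: str) -> tuple[str, int]:
    """Split 'NEAR/50' into ('NEAR', 50); a bare operator uses the default of 10."""
    m = re.fullmatch(r"(NEAR|NOTNEAR)(?:/(\d+))?", op)
    if m is None:
        raise ValueError(f"not a proximity operator: {op!r}")
    window = int(m.group(2)) if m.group(2) else 10
    if not 0 <= window <= 99:
        raise ValueError("window must be between 0 and 99")
    return m.group(1), window

def proximity_match(text: str, left: str, op: str, right: str) -> bool:
    name, window = parse_proximity(op)
    tokens = re.findall(r"\w+", text.lower())
    pos_left = [i for i, t in enumerate(tokens) if t == left.lower()]
    pos_right = [i for i, t in enumerate(tokens) if t == right.lower()]
    # Assumption: "within N words" means the word positions differ by at most N.
    close = any(abs(i - j) <= window for i in pos_left for j in pos_right)
    if name == "NEAR":
        return close
    # Assumption: NOTNEAR needs the left term present and no close right term.
    return bool(pos_left) and not close

text = "fried onions with melted cheese"
print(proximity_match(text, "onions", "NEAR/3", "cheese"))
print(proximity_match(text, "onions", "NOTNEAR/2", "cheese"))
```

In the sample text the two terms are three positions apart, so NEAR/3 matches while NEAR/2 does not.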
The WITH operator requires that the two terms occur within the same sentence. It works like the NEAR operator, except that the match window is the sentence rather than a specified number of words.
The NOTWITH operator requires that the two terms do not occur within the same sentence. It works like the NOTNEAR operator, except that the match window is the sentence rather than a specified number of words. onions NOTWITH cheese means that the term cheese must not occur in the same sentence as onions.
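The sentence-level behaviour of WITH and NOTWITH can be sketched as follows. This is an illustration only: the helper names are hypothetical, sentences are split naively on ., ! and ?, and stemming is ignored, so it will not reproduce the real engine's sentence detection.

```python
import re

def sentence_word_sets(text: str) -> list[set[str]]:
    # Naive sentence splitter: break after ., ! or ? followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [set(re.findall(r"\w+", p.lower())) for p in parts if p]

def with_match(text: str, a: str, b: str) -> bool:
    """True if both terms occur together in at least one sentence."""
    return any(a.lower() in s and b.lower() in s for s in sentence_word_sets(text))

def notwith_match(text: str, a: str, b: str) -> bool:
    """True if `a` occurs and never shares a sentence with `b` (assumed semantics)."""
    sets = sentence_word_sets(text)
    return any(a.lower() in s for s in sets) and not with_match(text, a, b)

print(with_match("I like onions. I also like cheese.", "onions", "cheese"))
print(notwith_match("I like onions. I also like cheese.", "onions", "cheese"))
```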
The NOT operator excludes any document containing the term that follows it. onions NOT celery returns documents that mention onions, excluding those that contain "celery". A query that uses the NOT operator must contain at least one non-excluded term.
Example
Two query terms of any type may be joined by an EXCLUDE operator, e.g. York EXCLUDE "New York". The effect is different from that of the NOT operator. The query will return documents with the word "York", excluding those that only contain occurrences of "New York".
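One way to picture the EXCLUDE behaviour is as a search for at least one occurrence of the term that is not part of the excluded phrase. The sketch below follows that reading; `exclude_match` is a hypothetical helper, matching is case-insensitive, and the sample strings are made up for illustration.

```python
import re

def exclude_match(text: str, term: str, excluded_phrase: str) -> bool:
    """True if `term` occurs at least once outside of `excluded_phrase`."""
    tokens = re.findall(r"\w+", text.lower())
    phrase = excluded_phrase.lower().split()
    covered = set()  # token positions that belong to an excluded phrase
    for i in range(len(tokens) - len(phrase) + 1):
        if tokens[i:i + len(phrase)] == phrase:
            covered.update(range(i, i + len(phrase)))
    return any(t == term.lower() and i not in covered for i, t in enumerate(tokens))

print(exclude_match("I spent the day in York.", "York", "New York"))
print(exclude_match("I flew home to New York.", "York", "New York"))
print(exclude_match("From York I flew to New York.", "York", "New York"))
```

The third string is where EXCLUDE and NOT would differ: it still matches here because "York" also appears on its own, whereas a NOT on "New York" would drop the document entirely.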
Consider the following sample text:
I spent the day in York, visiting the magnificent cathedral. Then it was time to head back to London for my flight home to New York.
This text would generate the following results for the provided queries:
Queries can use parentheses to control the logic of the query and they may appear in any combination.
Two examples of queries with smart uses of parentheses are:
Every left parenthesis must have a corresponding right parenthesis. Queries can have nested parentheses up to 10 levels deep.
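Both parenthesis rules (balanced pairs and at most 10 levels of nesting) are easy to verify with a depth counter. The following is a minimal sketch; `paren_error` is a hypothetical helper, and skipping parentheses that appear inside double quotes is an assumption.

```python
from typing import Optional

def paren_error(query: str, max_depth: int = 10) -> Optional[str]:
    """Return a description of the first parenthesis problem, or None if there is none."""
    depth = 0
    in_quotes = False
    for ch in query:
        if ch == '"':
            in_quotes = not in_quotes
        elif in_quotes:
            continue  # assumption: parentheses inside quotes are literal text
        elif ch == "(":
            depth += 1
            if depth > max_depth:
                return f"nesting is deeper than {max_depth} levels"
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return "right parenthesis without a matching left parenthesis"
    if depth != 0:
        return "left parenthesis without a matching right parenthesis"
    return None

print(paren_error("((internet OR online) AND (bank*)) AND (mobile OR phone*)"))
print(paren_error("(onions OR cheese"))
```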
Single query terms are the simplest query element, consisting of a single word. An operator or a word that appears in a stopword list can be used as a query term only if it is in quotations. A query term cannot contain punctuation or other special characters like ` ! @ # $ % ^ ( ) _ = ~ + [ ] { } | " ' : ; . , < > ? / -
Phrases must be enclosed in double quotes. When a single word is enclosed in quotes, it is not treated as a phrase search; it is treated like a single word. The case sensitivity operator (~) must be placed outside the quotes, like so: ~"Tim Cook"
A wildcard character (*) may be used at the end of a single word query term or within a phrase. It allows the system to tag all spellings of the word starting with the letters before the wildcard (*). Wildcards will only work in phrases if they are attached to the last term in the phrase. For example:
A wildcard query must have a prefix of at least three letters, and the wildcard must come at the end of the term. d*, do*, and dog*M are all invalid. Queries like "*" and Commonwealth AND "*" are invalid and achieve nothing.
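The wildcard rules for single-word terms can be approximated with a regular expression. This is a sketch only: `wildcard_to_regex` is a hypothetical helper, it does not cover wildcards inside phrases, and treating digits as valid prefix characters is an assumption.

```python
import re

def wildcard_to_regex(term: str) -> re.Pattern:
    """Turn a single-word wildcard term such as 'bank*' into a compiled pattern."""
    # At least three prefix characters, and the * must be the final character.
    if not re.fullmatch(r"[A-Za-z0-9]{3,}\*", term):
        raise ValueError(f"invalid wildcard term: {term!r}")
    # Any word that starts with the prefix matches, including the bare prefix.
    return re.compile(re.escape(term[:-1]) + r"\w*", re.IGNORECASE)

pattern = wildcard_to_regex("bank*")
print(bool(pattern.fullmatch("banking")))
print(bool(pattern.fullmatch("ban")))
```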
Referencing a query is done by placing a caret (^) at the beginning of a query name and wrapping the caret and the query name in parentheses "( )". It signals to the system to look for a query and use it in another query. For example, consider the following queries:
Two queries can be combined to create a nested query. For example:
The names of nested queries cannot contain spaces. Only the AND and OR operators work with nested queries.
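Conceptually, a reference of the form (^name) stands in for the text of the named query. The sketch below models this as plain text substitution, which is an assumption about how references resolve rather than a description of the engine; the function name and the saved query names (fruit, veg, food) are hypothetical.

```python
import re

# A reference is a caret plus a space-free query name, wrapped in parentheses.
REFERENCE = re.compile(r"\(\^([^\s()]+)\)")

def expand_references(query: str, saved: dict[str, str], _seen: tuple = ()) -> str:
    """Replace every (^name) with the parenthesized text of the saved query."""
    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name in _seen:
            raise ValueError(f"circular reference to {name!r}")
        if name not in saved:
            raise KeyError(name)
        # Expand recursively so a referenced query may reference other queries.
        return "(" + expand_references(saved[name], saved, _seen + (name,)) + ")"
    return REFERENCE.sub(replace, query)

saved = {"fruit": "apple OR pear", "veg": "onions OR celery"}
print(expand_references("(^fruit) AND (^veg)", saved))
# prints: (apple OR pear) AND (onions OR celery)
```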
Certain metadata criteria can be included by enclosing accepted keywords within braces:
The syntax above allows for the first component of the metadata criteria to be either entity or document.
If the first component is entity, it may be followed by an entity_type. This may be any of the entity types supported by the Salience entity extraction model, company, person, place, or product.
Optionally, a sentiment criteria component may be added. Sentiment criteria can be a comparison of document or entity sentiment to a single value, or a range.
Based on these specifications, the following metadata query phrases are valid:
The NEAR and WITH operators assume usage with text-level elements, so it is not valid to use the {document: sentiment} construction with these query operators.
Valid: "merger announcement" NEAR/5 {entity company}
Invalid: "merger announcement" NEAR/5 {document: sentiment > 0.2}
By default, query terms are handled in a case-insensitive manner. Case-sensitivity on a query term can be enforced using the ~ operator.
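The effect of the ~ operator on a single-word term can be sketched as below. `term_matches` is a hypothetical helper that compares one query term with one word from the text; stemming and quoted phrases are left out to keep the example small.

```python
def term_matches(query_term: str, word: str) -> bool:
    """Compare one query term with one word, honouring a leading ~."""
    if query_term.startswith("~"):
        # Case-sensitive: the word must match the term exactly.
        return word == query_term[1:]
    # Default: case-insensitive comparison.
    return word.lower() == query_term.lower()

print(term_matches("apple", "Apple"))
print(term_matches("~Apple", "apple"))
```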
By default, query terms are stemmed. For phrase searches, only the right-most word is stemmed. The query process will not stem all words within the multi-word phrase.
Placing the ! character in front of a query term will turn off stemming for the entire query. To turn off stemming for an individual term, enclose it in ().
Special characters may be used within query phrases if they are in quotations.
Correct Query:
Gepp OR Gunther OR Hasso OR "Hayden-Smith" OR Hirakubo OR Kanai OR Mathis OR Moeller OR "Nijssen_Smith" OR Sherman OR Shimizu OR "U'Ren" OR Daiji
Wrong Query:
Gepp OR Gunther OR Hasso OR Hayden-Smith OR Hirakubo OR Kanai OR Mathis OR Moeller OR Nijssen_Smith OR Sherman OR Shimizu OR U'Ren OR Daiji
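When a long OR list is generated from data such as a list of surnames, the quoting can be applied automatically. This is a small sketch with hypothetical helper names; it quotes any term that contains a non-alphanumeric character, and it does not handle stopwords, operators used as terms, or terms that already contain a double quote.

```python
def quote_if_needed(term: str) -> str:
    # str.isalnum() is False as soon as the term contains -, _, ' or similar.
    return term if term.isalnum() else f'"{term}"'

def build_or_query(terms: list[str]) -> str:
    return " OR ".join(quote_if_needed(t) for t in terms)

print(build_or_query(["Gepp", "Hayden-Smith", "Nijssen_Smith", "U'Ren", "Daiji"]))
# prints: Gepp OR "Hayden-Smith" OR "Nijssen_Smith" OR "U'Ren" OR Daiji
```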
Query results are accompanied by two scores: Query Relevancy and Query Sentiment.
Query Relevancy is a count of the query terms found within a document. It is particularly useful for judging how well your queries fit your text. Consider the following text:
I have one cat and I used to have a dog too.
The query relevancy score for the query cat OR dog OR bird will be 2 because two of the query terms (cat and dog) occur in the text.
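The count in this example can be reproduced with a few lines of code. The sketch assumes that each query term is counted once no matter how often it occurs, which is consistent with the example but not confirmed by it; `query_relevancy` is a hypothetical helper and ignores stemming, wildcards, and phrases.

```python
import re

def query_relevancy(text: str, terms: list[str]) -> int:
    """Count how many of the query terms occur in the text (assumed: once per term)."""
    words = set(re.findall(r"\w+", text.lower()))
    return sum(1 for term in terms if term.lower() in words)

text = "I have one cat and I used to have a dog too."
print(query_relevancy(text, ["cat", "dog", "bird"]))
# prints: 2
```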
Query Sentiment is calculated by identifying the sentiment for each query term separately, using model- and dictionary-driven approaches, and then averaging the scores for all mentioned terms.
The most important thing to keep in mind when creating queries is to keep them simple and organized. Here are some examples of queries that vary in complexity:
anti* OR bact* OR germ* OR "anti-bacterial"
This uses simple "OR" logic while incorporating the wildcard (*) to account for plural versions and typos/misspellings.
((internet OR online OR paperless) AND (bank*)) AND (mobile OR cell* OR phone* OR access*)
This uses "OR" logic and wildcards in the same way as the previous example. The AND operator requires the use of parentheses to keep the desired logic.
(pric* OR cost* OR fee* OR item*) AND (high OR expensive OR premium OR "so much" OR disappoint* OR spendy OR ("too" AND (high OR "much" OR expensive)) OR ("not" AND (good OR competitive* OR worth OR fair))) OR ("too expensive" OR "a little expensive")
Sometimes customers use two separate queries for a single topic (for example, instead of one query for price, there is one for Price (Positive) and one for Price (Negative)). A downside of this approach is that false positives and false negatives can occur. For example, the comment "it has high quality and reasonable prices" would match both the Price (Positive) query and the Price (Negative) query, when it belongs only with the Price (Positive) query.
(pric* OR cost* OR fee* OR item*) AND (expensive OR premium OR "so much" OR disappoint* OR ("too" AND ("much" OR expensive)) OR ("not" AND (good OR competitive* OR worth OR fair))) OR (("too expensive" OR "a little expensive") AND (price* OR cost* OR fee* OR item*)) NEAR/8 (high OR courses)
To fix the problem above, we added an operator at the end of the query, removed "high", and added parentheses at the beginning and end of the original query. The "AND" and "NEAR/8" operators eliminate the false match by adding the qualification that high must occur within 8 words of "price, cost, fee, or item".
Stopword lists remove small, common words that have little effect on the content, such as prepositions and conjunctions. In a query, every stopword must be enclosed in quotes.