EN:Text search: Difference between revisions

From IP7 Wiki

Latest revision as of 10:51, 18 February 2026

Text Search Block

Using the Text search block, extensive full-text searches can be created.
Two options are available:
Full text search or AI - Full-Text Search

By choosing between Title, Abstract, Claim and Description, you can determine which part of the text is searched.

With the option Machine translations, the English translations are also searched.

With the option Stemming, a search term also finds terms with the same root.
For example, with Stemming enabled, a search for
brake
also finds the following terms: brakes, braking, braker, braked, ...

The Stemming option supports the following languages: German, English, French

Full text ranking

If you search for text, the result list is sorted by a full-text ranking.
This way, the most relevant results are displayed at the top of the list, while less relevant results are displayed at the bottom.
Here, the search terms in the texts are counted. Additionally, the search terms are weighted. If the search term is contained in the title, a higher weighting is applied in comparison to the search term occurring only in the description.
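
The weighting described above can be illustrated with a minimal Python sketch. It is not the IP7 implementation: the field weights, the function name rank_score and the sample documents are assumptions chosen only to show how counting and weighting interact.

```python
from collections import Counter

# Hypothetical field weights: a hit in the title counts more than a hit
# in the description. The real weights used by IP7 are not documented here.
FIELD_WEIGHTS = {"title": 3.0, "abstract": 2.0, "claims": 1.5, "description": 1.0}

def rank_score(document, terms):
    """Sum the weighted occurrence counts of the search terms over all fields."""
    score = 0.0
    for field, text in document.items():
        counts = Counter(text.lower().split())
        weight = FIELD_WEIGHTS.get(field, 1.0)
        score += weight * sum(counts[t.lower()] for t in terms)
    return score

# Invented sample documents, used only for illustration.
docs = {
    "A": {"title": "fuel cell stack", "description": "a stack of plates"},
    "B": {"title": "vehicle", "description": "the vehicle has a fuel cell"},
}

# Sort the documents so that the highest score comes first.
ranked = sorted(docs, key=lambda name: rank_score(docs[name], ["fuel", "cell"]), reverse=True)
```

In this sketch, document A ranks above document B because both terms occur in its title.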

AI - Full-Text Search

The AI full-text search is an AI-supported semantic search.
An English text is required as input for the search.
The inserted text is then analyzed/interpreted by the AI, and a similarity search or semantic search is subsequently performed.
This is intended to find patents with similar patent texts.

Usage

Results of the AI full-text search may be incomplete.
Furthermore, results should always be questioned, challenged, and verified.
The AI full-text search is also not transparent to the user: it is not apparent when or why a patent is found or not found.

For this reason, AI full-text search is less suitable for, e.g., automated search profiles.
This type of search is also not intended to replace professional patent search.
AI full-text search is intended to serve as a tool (one step within the patent search process).
It could, for example, be the first step, the beginning of a search.
The results can then be analyzed, for example, using the Filter, and relevant insights (e.g., IPC/CPC classes) can be used for a "normal" search.
AI full-text search can also be helpful as the final step in patent search.
For example, further similar patents can be searched for based on the relevant results.

Input Text

The text of an invention disclosure can be entered as input.
No interfaces to external AI providers are used for the "AI full-text search".
Instead, only AI systems running on IP7 servers are used.
(IP7 has created its own vector database for this purpose)
This means that the "AI - Full Text Search" queries do not leave the IP7 system.

Input can also include, for example, a portion of a patent text.
(e.g., the first claim)

As a general rule:
The more general the text is, the less precise the results will be.
Therefore, it is recommended to copy only the most important text sections or the most interesting claims into the search.

Minimum Score

The more similar a patent text is to the input text, the higher the score for that patent.
The search results are sorted according to this score. The hits/patents with the highest score appear at the top.

A "Minimum Score" can be specified as a percentage.
This allows you to, for example, specify that results with a score below 70% should no longer be displayed/found.
This is helpful when there are too many hits or too many irrelevant hits.
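
How a minimum score narrows a similarity-ranked result list can be sketched as follows. The page does not state how the score is calculated, so the cosine similarity between vectors, the function names and the toy vectors below are assumptions for illustration only.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equally long vectors (assumed scoring method)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search(query_vec, patent_vecs, min_score_percent=0.0):
    """Score every patent, drop hits below the minimum score, sort best first."""
    scored = [(pid, 100.0 * cosine_similarity(query_vec, vec))
              for pid, vec in patent_vecs.items()]
    hits = [(pid, score) for pid, score in scored if score >= min_score_percent]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)

# Invented two-dimensional vectors; real text embeddings have many more dimensions.
patents = {"P1": [1.0, 0.0], "P2": [1.0, 1.0], "P3": [0.0, 1.0]}
results = search([1.0, 0.0], patents, min_score_percent=70.0)
```

With a minimum score of 70%, P3 is dropped, and P1 is listed before P2.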

Show similar patents

The "Show similar patents" function is the same as the AI full-text search.
The only difference is that here, the text of the selected patent is used.
Therefore, it is not possible to specify your own text here.

Full-text search

The full-text search is a Boolean text search with extensive functions and options, which are explained in more detail here.
In contrast to the AI - Full-Text Search, the full-text search is traceable (it is clear why a patent is found) and should therefore be used for, e.g., FTO research or monitoring profiles.

Umlauts

When searching for Ä, Ö or Ü, alternative spellings are also found automatically.
If, for example, a term containing Ü is searched for, German patent texts are automatically searched for the spelling UE as well.

Example

befüllen

also finds German texts with:
befuellen
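
The umlaut handling can be pictured as a query expansion. The following Python sketch is only an assumption about how such an expansion could look; the mapping table and the function name umlaut_variants are hypothetical.

```python
# Transcriptions of the German umlauts (lower and upper case).
UMLAUT_MAP = {"ä": "ae", "ö": "oe", "ü": "ue", "Ä": "Ae", "Ö": "Oe", "Ü": "Ue"}

def umlaut_variants(term):
    """Return the term plus its transcribed spelling, if the two differ."""
    transcribed = "".join(UMLAUT_MAP.get(ch, ch) for ch in term)
    return [term] if transcribed == term else [term, transcribed]

variants = umlaut_variants("befüllen")
```

For "befüllen", the sketch yields both "befüllen" and "befuellen"; a term without umlauts is returned unchanged.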

Truncation

The following truncation options are available:

  • * - zero or more characters
  • % - zero or one character
  • ? - exactly one character

Example

?otogra?ie

finds (among others):
fotografie

does not find (among others):
photographie

?%otogra?%ie

finds (among others):
photographie, fotografie, fotographie, photografie
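
The three truncation characters behave like a small subset of regular expressions. As a rough sketch (the function names are hypothetical, and the real search engine may treat word characters differently), they can be translated to Python regular expressions like this:

```python
import re

# Translation of the truncation characters into regular-expression syntax.
WILDCARDS = {
    "*": ".*",  # zero or more characters
    "%": ".?",  # zero or one character
    "?": ".",   # exactly one character
}

def truncation_to_regex(pattern):
    """Build a case-insensitive regular expression from a truncated term."""
    parts = [WILDCARDS.get(ch, re.escape(ch)) for ch in pattern]
    return re.compile("".join(parts), re.IGNORECASE)

def matches(pattern, word):
    """True if the whole word matches the truncated term."""
    return truncation_to_regex(pattern).fullmatch(word) is not None
```

Applied to the examples above, matches("?otogra?ie", "fotografie") is true and matches("?otogra?ie", "photographie") is false, while "?%otogra?%ie" matches all four spellings.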

Boolean Operators

The following 3 operators are available for linking search terms:

AND

OR

NOT

Using the AND, OR operators and brackets, synonyms can be combined.

Example

(fahrrad* or bike) and (batter%% or akku*)

If you do not place an operator between two search terms, the terms are automatically linked with AND.

Example

fuel cell

corresponds to:
fuel and cell
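
The implicit AND can be sketched as a small rewriting step. This is a simplified illustration, not the actual parser: the function name add_implicit_and is hypothetical, and the sketch only handles plain terms without brackets, phrases or proximity operators.

```python
OPERATORS = {"and", "or", "not"}

def add_implicit_and(query):
    """Insert 'and' between two neighbouring terms that have no operator between them."""
    result = []
    for token in query.split():
        previous_is_term = bool(result) and result[-1].lower() not in OPERATORS
        if previous_is_term and token.lower() not in OPERATORS:
            result.append("and")
        result.append(token)
    return " ".join(result)
```

Under these assumptions, "fuel cell" becomes "fuel and cell", while "fuel or cell" is left unchanged.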

Boost

The Boost feature enables you to influence the full text ranking in a result list.
Individual terms can be boosted, influencing the sorting of the result list.


Example

fuel and cell

The term "fuel" has a greater importance to the user than the term "cell" and should be weighted higher.

fuel^2.5 and cell

The value of the term "fuel" is multiplied by 2.5.
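
The effect of a boost factor on the ranking can be sketched in Python. The scoring below (occurrence count multiplied by the boost) is a simplifying assumption, and the function names are hypothetical.

```python
def parse_boosted_term(token):
    """Split 'fuel^2.5' into ('fuel', 2.5); terms without '^' get a factor of 1.0."""
    term, sep, factor = token.partition("^")
    return term, (float(factor) if sep else 1.0)

def boosted_score(text, tokens):
    """Count each term in the text and multiply the count by its boost factor."""
    words = text.lower().split()
    score = 0.0
    for token in tokens:
        term, boost = parse_boosted_term(token)
        score += boost * words.count(term.lower())
    return score

# Invented sample text: "fuel" occurs twice, "cell" once.
score = boosted_score("fuel cell with fuel pump", ["fuel^2.5", "cell"])
```

Here the two occurrences of "fuel" contribute 2 × 2.5 and the single occurrence of "cell" contributes 1, giving a score of 6.0.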


Fuzzy

The fuzzy search is based on the Damerau-Levenshtein distance algorithm. It finds terms that are similar to the entered search term.

Optionally, the distance (number of allowed changes) can be specified after the fuzzy operator. A change could be the addition, deletion or replacement of a single character.

If no distance is stated, the distance is automatically selected corresponding to the length of the term:

  • Less than 3 characters: Terms must match.
  • 3 to (including) 5 characters: One change allowed.
  • 6 or more characters: Two changes allowed.

Example

electronic~
(max. 2 changes, because the term has 6 or more characters)
finds (among others):
electronic
elektronik

also finds:
electron

Enter number of changes manually

kraftstoffluss~1
(max. one change)
finds (among others):
kraftstoffluss
kraftstofffluss

The fuzzy operator cannot be combined with truncation and can only be applied to a single term.
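
The fuzzy matching described in this section can be sketched as follows. The sketch uses the restricted Damerau-Levenshtein distance (also called optimal string alignment), which counts the swap of two neighbouring characters as one change in addition to insertion, deletion and replacement. Whether IP7 uses exactly this variant is an assumption, and the function names are hypothetical.

```python
def default_max_changes(term):
    """Default distance by term length, following the rules listed above."""
    if len(term) < 3:
        return 0
    if len(term) <= 5:
        return 1
    return 2

def osa_distance(a, b):
    """Restricted Damerau-Levenshtein (optimal string alignment) distance."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # replacement
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # swap of neighbours
    return d[-1][-1]

def fuzzy_match(query, word):
    """Evaluate a query such as 'electronic~' or 'kraftstoffluss~1' against one word."""
    term, _, number = query.partition("~")
    max_changes = int(number) if number else default_max_changes(term)
    return osa_distance(term.lower(), word.lower()) <= max_changes
```

With this sketch, fuzzy_match("electronic~", "elektronik") and fuzzy_match("kraftstoffluss~1", "kraftstofffluss") both return True, which agrees with the examples above.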

Phrase

If terms are put in quotation marks, they are searched in exactly this sequence.

Example

"fuel cell"

corresponds to:
span(fuel cell, 0)

This way, it is also possible to search for words that would otherwise be interpreted as operators.

Example

"Menschen in Not"

The quotation marks can be used to search for numbers:
Example

"420"

You can also search for a quotation mark (") by placing a backslash in front of it:

"fuel\""
searches for:
fuel"

Wildcards

If two terms are linked with a hyphen ("-"), they are searched in exactly this order.

Example

fuel-cell

searches for:
span (fuel cell, 0)


Comments

It is possible to add comments in the text search.
Comments are not taken into account in the text search and only serve as information for the user.

Example


Proximity Operators

span

Terms are searched using the maximum distance between words.
Here, the order of the words is taken into consideration.

Example

span (fuel cell, 2)

The text must contain the word fuel followed by the word cell. Up to 2 other terms can appear between the two words.

near

Terms are searched using the maximum distance between words.
Here, the order of the words is not taken into consideration.

Example

near (fuel cell, 2)

The text must contain the words fuel and cell. Up to 2 other terms can appear between the two words.

General

Within the span and near proximity operators, multiple terms can be combined with "OR" and brackets.

Example

near ((electric or elektrisch) (generator or Stromerzeuger or stromgenerator), 3)

Furthermore, near and span can be nested, i.e. used within other near and span operators.

Example

near((span(rotary wing aircraft,2) or helicopter) rotor,2)

The maximum word distance for near and span refers to all specified terms or synonyms.

Example

span (rotary wing thrust, 4)

In total, a maximum of 4 other terms may occur between the 3 searched terms.

Therefore, this patent, for example, is found:

If the search is run with a word distance of 2, the patent is no longer found.
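
The behaviour of span and near, including the total word distance across more than two terms, can be sketched in Python. This is a brute-force illustration on a tokenized text, not the real engine; the function names and the sample sentences are invented.

```python
from itertools import product

def _positions(tokens, term):
    """All indexes at which the term occurs in the token list."""
    return [i for i, token in enumerate(tokens) if token == term]

def span(tokens, terms, max_gap):
    """Terms must occur in the given order with at most max_gap other words in total."""
    for combo in product(*(_positions(tokens, t) for t in terms)):
        in_order = all(a < b for a, b in zip(combo, combo[1:]))
        gap = combo[-1] - combo[0] + 1 - len(terms)
        if in_order and gap <= max_gap:
            return True
    return False

def near(tokens, terms, max_gap):
    """Like span, but the order of the terms does not matter."""
    for combo in product(*(_positions(tokens, t) for t in terms)):
        if len(set(combo)) < len(terms):
            continue  # the same position must not serve two terms
        gap = max(combo) - min(combo) + 1 - len(terms)
        if gap <= max_gap:
            return True
    return False

# Invented sample sentences.
ordered = "fuel for the cell".split()
reversed_order = "the cell supplies fuel".split()
three_terms = "rotary and fixed wing with high thrust".split()
```

In the last sample, 4 other words lie between "rotary", "wing" and "thrust" in total, so span(three_terms, ["rotary", "wing", "thrust"], 4) is true, while the same call with a distance of 2 is false.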

spannot

In rare exceptional cases, it may make sense to use the spannot function.

Unlike the NOT operator, spannot does not exclude a patent merely because the excluded term also occurs somewhere in the same patent text.

Example
You want to search for "disc brake", but not for "disc brake caliper".

SPAN(disc brake,0) NOT SPAN(disc brake caliper,0)

The search will not find the patent with the following text:
SLEEVE FOR DISC BRAKE CALIPER AND DISC BRAKE FITTED WITH SUCH A SLEEVE
(because "DISC BRAKE CALIPER" appears in the text, the hit is excluded)

The following search with SPANNOT will find the text:

SPANNOT(SPAN(disc brake,0), SPAN(disc brake caliper,0))

The search finds every text in which "disc brake" occurs without being part of "disc brake caliper", even if "disc brake caliper" also appears elsewhere in the same text.
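
The difference between NOT and spannot can be sketched on the title from the example. The sketch assumes that spannot discards only those occurrences of the first phrase that overlap an occurrence of the second phrase; this overlap rule and the function names are assumptions.

```python
def phrase_positions(tokens, phrase):
    """Start indexes at which the phrase (a list of words) occurs."""
    n = len(phrase)
    return [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == phrase]

def plain_not(tokens, include, exclude):
    """NOT: the text is excluded as soon as the unwanted phrase occurs anywhere."""
    return bool(phrase_positions(tokens, include)) and not phrase_positions(tokens, exclude)

def spannot(tokens, include, exclude):
    """SPANNOT: a hit needs one occurrence of 'include' that does not overlap 'exclude'."""
    excluded = [(i, i + len(exclude)) for i in phrase_positions(tokens, exclude)]
    for start in phrase_positions(tokens, include):
        end = start + len(include)
        overlaps = any(start < ex_end and ex_start < end for ex_start, ex_end in excluded)
        if not overlaps:
            return True
    return False

title = "sleeve for disc brake caliper and disc brake fitted with such a sleeve".split()
disc_brake = ["disc", "brake"]
caliper = ["disc", "brake", "caliper"]
```

For this title, plain_not(title, disc_brake, caliper) is false, whereas spannot(title, disc_brake, caliper) is true because the second "disc brake" is not followed by "caliper".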

Transfer highlighting synonym groups to the text search

All terms of a synonym group (Highlighting) can be added to a full text search.
Collected synonyms can be re-used for the search.

When a term is entered in the text field, the keyboard shortcut Ctrl + Space will show the synonym groups which contain the term.

All groups from all highlighting schemes are considered.

The desired group can then be selected using the arrow keys. By pressing the Enter or Tab key, the synonyms are automatically transferred to the search.

AI - Add Synonyms

In the full-text search, synonyms for a term can be added using AI.
Simply select the desired term and right-click to open the context menu:

Adding the synonyms may take a few seconds.
The synonyms are automatically combined with OR and enclosed in brackets:
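
As an illustration of the resulting expression, the following Python sketch builds such a bracketed OR group. The synonym list is invented, the function name or_group is hypothetical, and putting multi-word synonyms in quotation marks is an assumption about a sensible result, not documented behaviour.

```python
def or_group(term, synonyms):
    """Combine a term and its synonyms with 'or' and enclose them in brackets."""
    unique = list(dict.fromkeys([term, *synonyms]))  # drop duplicates, keep order
    quoted = [f'"{s}"' if " " in s else s for s in unique]
    return "(" + " or ".join(quoted) + ")"

expression = or_group("bike", ["bicycle", "fahrrad", "bike"])
```

The example produces the expression (bike or bicycle or fahrrad).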

Regular expressions "Regexp"

It is possible to use regular expressions in the search.

Example

SPAN (/<20-30>/ zoll (monitor or screen),2) 

searches for numbers from 20 to 30
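
The notation <20-30> in the example appears to denote a numeric interval. Python's re module has no such interval operator, so the following sketch writes the same range out as an explicit alternation; it is meant only to show which tokens such an expression accepts, and the names are hypothetical.

```python
import re

# Explicit pattern for the whole numbers 20 to 30:
# "2" followed by any digit (20-29), or exactly "30".
NUMBER_20_TO_30 = re.compile(r"2[0-9]|30")

def is_in_range(token):
    """True if the whole token is a number from 20 to 30."""
    return NUMBER_20_TO_30.fullmatch(token) is not None
```

Tokens such as "20", "27" and "30" are accepted; "19", "31" and "200" are rejected.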

Basis of the text search

By using the following options, the basis of the text search can be set.
Document, Application, Strict family or Extended family

Depending on the selected option it can be determined which texts are searched for the terms.

Example

fuel and cell

selected texts: Title 

Document – both terms have to appear in the title of the document
Application – one term can appear in the title of the A-document and the other term can appear in the corresponding B-document
Strict Family – one term appears in the title of a document from one country, the other term appears in the title of a document from a different country. Both documents belong to the same strict family
Extended Family – same as strict family, except that both documents must belong to the same extended family

The broader the selected basis of the text search, the higher the number of results.
Document (fewer results) → Extended family (more results)

Special feature of the base document

The text search behaves differently on the basis "Document" than on the basis "Application" or higher.
This peculiarity is explained using the following example:

US 5727260 A

This example or patent family consists of a single document.
So there is no other document in the family that could influence the search.

This example deals with the terms "thickness" and "injury".
The term "thickness" is only present in the abstract.
The term "injury" only appears in the claims.

In this example, both terms are combined with AND and searched for in the abstract and in the claims.
If the search is performed on the basis Document, both terms must be present either in the abstract or in the claims.
In this case:

"thickness" and "injury" must be present in the abstract
or
"thickness" and "injury" must be present in the claims

Neither is true, which is why the patent was not found.


If the search is performed on the basis Application or higher, one term must be in the abstract and the other term in the claims.
The two terms no longer have to occur together in the abstract or together in the claims.
In this case:

"thickness" in the abstract
and
"injury" in the claims

or vice versa:

"injury" in the abstract
and
"thickness" in the claims

Here the first case applies and the patent is found.
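
The two evaluation rules from this example can be sketched in Python. The field contents below are stand-ins that contain only the two terms in question, not the actual text of US 5727260 A, and the function names are hypothetical.

```python
def found_on_document_basis(fields, terms):
    """Basis Document: all terms must occur together in at least one searched field."""
    return any(all(term in words for term in terms) for words in fields.values())

def found_on_application_basis(fields, terms):
    """Basis Application or higher: each term may occur in any of the searched fields."""
    all_words = set().union(*fields.values())
    return all(term in all_words for term in terms)

# Stand-in for the searched fields: "thickness" only in the abstract,
# "injury" only in the claims.
fields = {"abstract": {"thickness"}, "claims": {"injury"}}
terms = ["thickness", "injury"]
```

With these stand-in fields, found_on_document_basis returns False and found_on_application_basis returns True, matching the outcome described above.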

Basis of the text search and the selected basis of the search

In this search, the expression "fuel and cell" is searched in the text on the basis "Document".
Both terms have to appear in one document.

Below, the basis "Strict family" is selected.

This means that the results of all search blocks are expanded to the strict family.
This way, for example, "fuel cell" can appear in a US document whose strict family also contains a DE document. This strict family is then found by the search.

If the setting is changed from "Strict family" to "Document", then "fuel cell" must appear in one DE document.
