How can you compare different AI patent search engines?

2 Nov 2021 - Artificial Intelligence (AI) patent searching continues to grow, with more and more options becoming available. And AI is all around us - whenever you use FaceID or a fingerprint sensor to unlock your phone, run an online search, or look at a feed of Facebook items or movie recommendations in a subscription service, you are using results generated by Artificial Intelligence. Given this ubiquity, it is tempting to ask - could AI be used to improve the efficiency of patent searching?

But what exactly is ‘artificial intelligence’? AI means different things to different people, but one definition is: “any system that perceives its environment and takes actions that maximize its chance of achieving its goals”.

In the world of patent searching, AI increasingly tends to mean any system that applies an algorithm to predict relevant patents from one or more starting patents or a block of relevant text - as opposed to conventional searching via a traditional filtering query ('show me all patents that meet these keyword and class-code criteria'), which is then followed by what can be hours and hours of reviewing predominantly irrelevant results.

Potential benefits of AI patent searching

Conventional patent searching can take hours and hours to execute, and still not return all relevant results. As an analogy which I will return to, this can be compared to searching for pizza shops in your area by finding a list of all pizza shops in your area, and maybe filtering by certain criteria, e.g. ‘Marinara pizza is available’ or ‘Has been around for more than 5 years’. This is certainly doable, but would not tell you which pizza shops best meet your needs. In practice, you might have to go through all of these results individually to make a shortlist to investigate further.

In contrast, an AI patent search approach should be familiar to anyone who has ever run a Google search, e.g. ‘pizza near me’. In both cases, an algorithm interprets the instructions and returns a ranked list of results. The ranking of results would be affected by a number of criteria, e.g. closeness to you, customer ratings, etc.

The benefits of this latter approach are many, and as a test, ask yourself - the last time you wanted to look into a new area, did you start with an online search in Google (or an equivalent search engine), or did you find a list of relevant results and review them one by one, starting for example with ‘Arnie’s Pizza’ and working onwards from there?

How does AI patent searching work in general?

Most AI patent search engines work by semantic analysis, i.e. given a block of text that might describe the invention, they apply natural language processing and other tools to find the key concepts within this block of text. They then search through their database of other patents and return the most similar patents, based on the matching of the found concepts and other parameters.
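As an illustrative sketch only - not any vendor's actual algorithm - the core of a semantic matching step can be approximated as scoring candidate patents by the similarity of their text to the query text. The documents and scores below are invented for the example:

```python
import math
from collections import Counter

def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between two bag-of-words term-frequency vectors."""
    vec_a = Counter(text_a.lower().split())
    vec_b = Counter(text_b.lower().split())
    shared = set(vec_a) & set(vec_b)
    dot = sum(vec_a[t] * vec_b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in vec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in vec_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical query text and candidate patent abstracts
query = "touch screen with a virtual bezel around the display edge"
candidates = {
    "P1": "a display with a touch sensitive bezel region at the edge",
    "P2": "a method of brewing coffee with a heated filter",
}

# Return candidates ranked by similarity to the query, most similar first
ranked = sorted(candidates, key=lambda p: cosine_similarity(query, candidates[p]),
                reverse=True)
```

Real engines use far richer representations (synonym handling, neural embeddings, class codes and more), but the principle of returning a ranked list by text similarity is the same.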

Ambercite works differently from this, in that it looks for patents with similar citation profiles, and then returns the most similar patents based on this similarity.
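Again as an illustrative sketch rather than Ambercite's actual (confidential) algorithm, a citation-profile comparison could be approximated with a Jaccard index over each patent's set of cited and citing documents; the patent numbers here are hypothetical:

```python
def citation_similarity(citations_a: set, citations_b: set) -> float:
    """Jaccard similarity of two citation profiles (cited + citing documents)."""
    if not citations_a or not citations_b:
        return 0.0
    overlap = citations_a & citations_b
    union = citations_a | citations_b
    return len(overlap) / len(union)

# Citation profile of the starting patent (hypothetical patent numbers)
seed = {"US111", "US222", "US333", "US444"}

# Candidate patents score higher the more citations they share with the seed
candidate_profiles = {
    "CandA": {"US111", "US222", "US333", "US999"},  # 3 shared citations
    "CandB": {"US888", "US999"},                     # no shared citations
}
scores = {p: citation_similarity(seed, profile)
          for p, profile in candidate_profiles.items()}
```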

Having said that, there is a range of AI search engines now available, and all of them work slightly differently according to their internal algorithms. For commercial reasons, these algorithms are kept confidential. I don’t think anyone knows exactly how each of the options works, but ultimately this does not matter, in the same way that it does not matter that very few people know exactly how Google works. Instead, just as we judge Google on the quality of its result listing, i.e. the list of pizza shop recommendations, we should judge AI search engines on the quality of their search results.

Judging AI patent search results

There are many ways to judge the quality of patent search results. My standard recommendation is to judge the search engine by the relevance of the search results - how close are they to the invention you are searching?

What I do not suggest is to judge the results by a pre-conceived view of what the results should be. This would be like judging the list of pizza shops by whether this list returned your favorite pizza shop at the top of the list. To limit the judgment in this way would raise two questions:

  1. Why did you run the search if you already knew the result?

  2. You may have a favorite pizza shop, but how do you know that there is not another pizza shop worth trying?

And judging patent search results should be no different - they should be judged on what you learned, and not necessarily on what you already know.

Comparing different search engines

There is a range of different patent search engines, and it can take time to compare them all. But there is a short-cut. Unified Patents run a series of patent search contests, and on the page for each patent search competition, they provide a list of predicted results from a series of AI vendors, including Ambercite. Their most recent contest (at the time of writing) was for patent US9645663, which covers an electronic display with a virtual bezel.

Readers can interpret claim one of this patent for themselves, but my interpretation is that this covers a touch screen with a ‘virtual bezel’ around the outside. This virtual bezel responds differently to touch inputs compared to the main part of the screen. There were only three listed citations for this patent.

On this competition page, seven vendors provided a list of similar patents, plus a set of similar patent results from Google. One of the vendors (IP Screener) provided a link, but I needed an account to access these results. The other results were available without an account, and the table below provides a link to each of these result lists, and also lists the top three patents from each vendor.

I should note the following points in relation to this table:

  • Patents often exist in families, which can for example include a PCT patent, a US application, a granted US patent, and European and Japanese filings. Some IP search engines (including Ambercite and Google Patent) collapse patents into single representative filings, and some do not, so their results can include separate listings for different members of the same patent family.

  • Even then, the choice of representative patents can vary between databases. In the table below, to make comparing results easier, I have reported wherever possible the earliest granted US family member, or failing that the US application. I have also collapsed all results by simple family.

  • Ambercite results as reported in the competition pages automatically filter out known citations, as the aim of the competition is to find new patents. However, in this case, the link I have provided lists all results to make the comparison easier.
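The family-collapsing step described in the notes above can be sketched as follows; the `family_of` mapping and the patent numbers are hypothetical stand-ins for a real simple-family lookup such as Patseer's:

```python
def collapse_by_family(results, family_of):
    """Keep only the first-seen (highest-ranked) member of each patent family.

    `results` is a ranked list of patent numbers; `family_of` maps a patent
    number to its family ID (in practice this lookup would come from a
    family database)."""
    seen = set()
    collapsed = []
    for patent in results:
        fam = family_of.get(patent, patent)  # unknown patents stay as their own family
        if fam not in seen:
            seen.add(fam)
            collapsed.append(patent)
    return collapsed

# US1A and EP1 belong to the same family F1, so only US1A (ranked higher) is kept
family_of = {"US1A": "F1", "EP1": "F1", "US2B": "F2"}
collapsed = collapse_by_family(["US1A", "EP1", "US2B", "JP9"], family_of)
```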

 

Vendor - first three ranked results

Ambercite (list of results)

  1. US8018440B2, Unintentional touch rejection
  2. US9454304B2, Multi-screen dual tap gesture
  3. US9582122B2, Touch-sensitive bezel techniques

Amplified (list of results)

  1. US20120038571A1, System and Method for Dynamically Resizing an Active Screen of a Handheld Device
  2. US8674959B2, Dynamic bezel for a mobile device
  3. US20070291008A1, Inverted direct touch sensitive input devices

Apex Standards (list of results)

  1. US9400576B2, Touch sensor arrangements for organic light-emitting diode displays
  2. US20060197753A1, Multi-functional hand-held device
  3. US20120266079A1, Usability of cross-device user interfaces

Google Patent (list of results)

  1. US9606668B2, Mode-based graphical user interfaces for touch sensitive input devices
  2. US20060197753A1, Multi-functional hand-held device
  3. US7657849B2, Unlocking a device by performing gestures on an unlock image

Inquartik Patentcloud (list of results)

  1. WO2007103631A2, Electronic device having display and surrounding touch sensitive bezel for user interface and control
  2. US8154523B2, Electronic device, display and touch-sensitive user interface
  3. US8511890B2, Rocking bezel control

Techson (list of results)

  1. US9823890B1, Modifiable bezel for media device
  2. US20110261001A1, Apparatus and method for impact resistant touchscreen display module
  3. US9274682B2, Off-screen gestures to create on-screen input

Listed examiner citations

  1. US20130222286A1, Device having touch display and method for reducing execution of erroneous touch operation
  2. US20130234982A1, Mobile terminal and display control method
  3. US9507561B2, Method and Apparatus for Facilitating Use of Touchscreen Devices

Which results are best? This is a subjective question, and we would ideally look at more than just the top three results (which you can do via the provided links). But if we take the top-listed Ambercite results:

  • The #2 listed patent refers to the concept of two separate screens that independently consider touch inputs (and a virtual bezel can be considered as a separate screen).

  • The #3 listed patent directly refers to a touch-sensitive bezel.

  • The #1 patent does not specifically refer to bezels, but instead to the prediction of the relevance of a touch based on a range of factors, one of which is the specific area of the touchscreen.

But this relevance is not limited to Ambercite results.

  • The #2 patent in Amplified’s results refers to a dynamic bezel for a mobile device;

  • the #1 patent in the Apex list discloses the concept of a display with edge portions that may include virtual buttons;

  • the #5 patent in the Google list refers to a display with a touch-sensitive bezel;

  • the #1 patent for Inquartik refers to a touch-sensitive bezel;

  • while the #1 patent in Techson refers to a bezel that can be virtual.

  • And there are other relevant results in these lists.

So many of these results appear to be relevant - AI patent searching is working in this case.

How do the results compare to each other?

I downloaded the top 50 results from each of these databases, removed family duplicates, and then converted each patent into a family ID based on Patseer family ID numbers. I then counted the number of overlapping cases by cross-matching these family IDs.
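The cross-matching step can be sketched as counting set intersections of family IDs for every pair of vendors; the family-ID sets below are invented placeholders, not the actual competition data:

```python
def overlap_counts(vendor_families: dict) -> dict:
    """Count shared patent families for every pair of vendors."""
    names = list(vendor_families)
    counts = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            counts[(a, b)] = len(vendor_families[a] & vendor_families[b])
    return counts

# Hypothetical family-ID sets standing in for each vendor's top-50 list
vendor_families = {
    "Ambercite": {1, 2, 3, 4},
    "Amplified": {3, 4, 5},
    "Google": {4, 6},
}
counts = overlap_counts(vendor_families)
```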

The table below shows the results - for example, the first number in the Ambercite row shows that Ambercite and Amplified had two results (patent families) in common.

 

Vendor | Ambercite | Amplified | Apex standards | Google | Inquartik Patentcloud | Techson IP | Listed citations
Ambercite | * | 2 | 1 | 6 | 6 | 9 | 1
Amplified | | * | 1 | 1 | 2 | 1 | 0
Apex standards | | | * | 2 | 1 | 0 | 0
Google | | | | * | 1 | 2 | 0
Inquartik Patentcloud | | | | | * | 7 | 0
Techson IP | | | | | | * | 1
Listed citations | | | | | | | *


I found these results to be surprising in a number of ways:

  • Apart from Ambercite, the remaining AI search engines all appear to work using semantic search principles - and yet there was relatively little overlap between their results. The highest overlap score was 9 (out of the top 50 results), between Techson and Ambercite.

  • There was only one Listed Citation in the list of Ambercite results, which initially was surprising as Ambercite is based on citation analysis. I investigated, and this was because Ambercite found a lot of new prior art documents that pushed the other two listed citations out of the top 50 results. But these listed citations were found further down a full list of results (Ambercite can return a list of up to 2000 similar patents).

  • None of the other AI search engines returned a listed citation in their top 50 results. To be fair, I do not know if these results were set up to exclude known citations.


What do these results say about AI testing?

This is an interesting case study as the published list of citations only had three patents in it.

Despite that, all of these AI search engines were able to return useful results - and quickly.

This last point is worth reinforcing, as traditional searching can take many hours, mainly because the number of ‘false positive’ results is so high. For example, if you look at the initial search query run for US9645663, it appears that the examiner considered hundreds of patents across 47 different queries before citing just two examiner citations. I am told that running such searches can take many hours - as opposed to the speed and ranked results of AI searching.

But is hybrid searching the best approach?

Canadian patent search specialists Riahi Patents have investigated using AI patent search tools*, including a hybrid approach that combines traditional and AI patent searching (including Ambercite). Their results suggest that combining Ambercite with traditional patent searching leads to up to a 46% improvement in search quality, as shown in the two graphs below:

None of which is surprising. Patent searching is a complex task, and like many complex tasks, a range of tools is more likely to give you the best possible results, as well as saving you time due to improved productivity. Would you build a house with a hammer alone?


*For a full explanation of these results and this project, check out our earlier blog found here, and also the full Riahi reports.


Would you like to try this for yourself?

Ambercite does offer free trials of a scaled-back version, but only 25 results are shown, and you really need to go beyond the first 25 patents to get the best outcomes from this approach. For this reason, I strongly suggest that you contact us if you want to test this for yourself, and we can arrange a trial of the full version:

 
Mike Lloyd