AI vs AI: What is AI, and is this the future of patent searching?

April 28, 2019

Ambercite often gets compared to other AI solutions for patent searching, and this is always a very interesting comparison. At the heart of these comparisons is the realization that conventional patent searching, which traditionally involves filtering the world’s 100 million plus patents for the ones you are looking for, can be very time consuming and does not always find what you are looking for.

Given this realization, finding an improved AI solution for patent searching is obviously very attractive. But what are these AI solutions? And how effective are they?

What is AI in relation to patent searching?

AI is traditionally short for Artificial Intelligence, a term with a range of meanings, such as ‘intelligence demonstrated by machines’. The Wikipedia entry for Artificial Intelligence contains a wealth of detail about AI in general, including the concept of Natural Language Processing (NLP): ‘giving machines the ability to read and understand human language’. Often when companies talk about ‘AI’ in the context of patent searching, this is what they are referring to. However, the same Wikipedia entry points out the limitations of NLP, noting (as of 28 April 2019) that “Besides the usual difficulties with encoding semantic commonsense knowledge, existing semantic NLP sometimes scales too poorly to be viable in business applications”.

Ambercite takes a different approach, based on an algorithm that analyses the 175 million patent citations that link patents together. In terms of ‘AI’, Augmented Intelligence is probably the better meaning, being ‘a complement—not a replacement—to human intelligence. It’s about helping humans become faster and smarter at the tasks they’re performing.’

Ultimately, however, these definitions and differences matter less than the underlying question: does AI help you find better patents faster? As always, such questions are best answered with data.

How to compare different solutions?

In this case, we ran a simple experiment. Patent US9,000,000 covers a means of conditioning rainwater captured by your car. At a very high level, claim 1 of this patent includes the following five concepts:

  • Collecting rainwater

  • For cleaning a windshield in a car

  • Conditioning (treating) the rainwater

  • Adding a fluid concentrate

  • Using an ion exchange mechanism for cleaning the water

A good test for any AI patent search system is how many of these five concepts are disclosed by the similar patents it finds, say by awarding one point for every concept a patent discloses. To keep the experiment simple, we limited the patents reviewed to the top 5 listed by each AI search engine. And to make it a little more challenging, we ignored related patents filed by the same applicant as US9,000,000.

Example of analysis

The most similar* patent found by Ambercite is US5669986. We reviewed this patent and found that it disclosed the concepts of

  • Collecting rainwater

  • For cleaning a windshield in a car

But not the other three concepts, so we gave this patent 2 points. A similar review of the other four top-ranked patents listed by Ambercite found a further 8 points, leading to a total score for Ambercite of 10 points.

*and not owned by Wiperfill Holdings, the owner of US9,000,000.
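The scoring scheme described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the tooling actually used for the experiment: the concept labels are paraphrased from the list above, the function names and the (applicant, concepts) input format are hypothetical, and the sketch assumes that same-applicant patents are filtered out before the top 5 are taken.

```python
# The five concepts identified in claim 1 of US9,000,000 (paraphrased labels).
CLAIM_CONCEPTS = frozenset({
    "collecting rainwater",
    "cleaning a windshield in a car",
    "conditioning the rainwater",
    "adding a fluid concentrate",
    "ion exchange mechanism",
})

TOP_N = 5  # only the first five eligible results from each tool are scored


def score_patent(disclosed):
    """Return one point per claim concept that a reviewed patent discloses."""
    return len(CLAIM_CONCEPTS & set(disclosed))


def score_results(ranked_results, excluded_applicant):
    """Total score for the top TOP_N results not filed by the excluded applicant.

    ranked_results is a list of (applicant, disclosed_concepts) pairs in the
    order returned by the search tool (a hypothetical input format).
    Assumption: same-applicant patents are dropped before taking the top 5.
    """
    eligible = [concepts for applicant, concepts in ranked_results
                if applicant != excluded_applicant]
    return sum(score_patent(concepts) for concepts in eligible[:TOP_N])


# US5669986 was found to disclose two of the five concepts.
us5669986_points = score_patent(
    {"collecting rainwater", "cleaning a windshield in a car"}
)

# The other four Ambercite results contributed a further 8 points in total.
ambercite_total = us5669986_points + 8
print(us5669986_points, ambercite_total)  # 2 10
```

The concepts found in each patent still come from a human review of the document; the code only adds up the points once that review is done.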

The AI search candidates


In this case, we compared the following candidates:

  • Ambercite AI

  • The Show Similar documents function in Google Patents

  • A find similar patents function in a well known conventional patent search database (‘AI patent search engine #3’)

  • A leading AI search engine (‘AI patent search engine #3’)

  • Two lesser-known AI patent databases (‘AI patent search engine #2’ and #4)

  • The 6 patents listed as examiner citations for this patent.

What did we find?

The results are summarised by the figure below:

[Figure: AI search comparison results]

The figure above shows that Ambercite gave the highest total score. The performance of the different AI semantic search systems was mixed, with scores ranging from 1 to 8 points.

This compares to 5 points earned by the 6 patents listed by the examiner for US9,000,000. In other words, four of the search engines were at least as good as the examiner, for this case at least.

Why is this so?

I think we all appreciate that searching by keywords is hard. But it is actually harder than it looks. A typical patent claim contains a whole set of technical terms, some critical to the core of the invention and some of lesser importance, helping to provide the base for the concept but not the unique inventive step.

A skilled patent searcher or examiner can look at a patent claim or other text, and create a query that can find the key concepts.

But natural language processing? That is a big challenge indeed, to be able to distinguish between filler and key text.

Ambercite is different. It does not try to analyse text, as it recognises how hard this is. Instead it relies on expert reviewers, being patent examiners and applicants, to recognise similar patents and list them as patent citations. But it does more than that: Ambercite also applies a unique algorithm to combine the valuable information within these citations to provide an overall ranking of the most similar patents.
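To make the general idea of ranking by citations more concrete, here is a minimal sketch in Python. It is not Ambercite's algorithm, which is not described in this article. It uses a generic stand-in technique, the Jaccard overlap between the citation neighbourhoods of two patents, and the patent identifiers (P1 to P5) and function names are hypothetical.

```python
from collections import defaultdict


def build_neighbours(citations):
    """Map each patent to the set of patents it cites or is cited by."""
    neighbours = defaultdict(set)
    for citing, cited in citations:
        neighbours[citing].add(cited)
        neighbours[cited].add(citing)
    return neighbours


def rank_similar(target, citations):
    """Rank other patents by Jaccard overlap of citation neighbourhoods.

    Each neighbourhood includes the patent itself, so that a patent directly
    linked to the target also counts towards the overlap.
    """
    neighbours = build_neighbours(citations)
    target_links = neighbours[target] | {target}
    scores = {}
    for patent, links in neighbours.items():
        if patent == target:
            continue
        candidate_links = links | {patent}
        overlap = len(target_links & candidate_links)
        union = len(target_links | candidate_links)
        scores[patent] = overlap / union
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical citation pairs (citing patent, cited patent).
citations = [
    ("P1", "P2"), ("P1", "P3"),
    ("P4", "P2"), ("P4", "P3"),
    ("P5", "P3"),
]
ranking = rank_similar("P1", citations)
for patent, score in ranking:
    print(patent, round(score, 2))
```

In this toy network, P4 has no direct citation link to P1 but cites the same two patents, so it still ranks near the top. This is the kind of indirect signal that a keyword search would not see.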

Is AI the future of patent searching?

Patent searching is hard, and it is unlikely that fully automated solutions can take over in the near future. But augmented solutions such as Ambercite, and perhaps the best of the natural language search tools, can help patent searchers make the most of their limited time. In this particular case, they gave a comparable or better performance when compared to conventional searching, at least as expressed by the examiner citations.

Is there other data available?

A major patent office in Europe has recently conducted its own review comparing Ambercite and other AI solutions, and again Ambercite comes out very well. These results are due to be published in June 2019, and we look forward to sharing them in due course.

Mike Lloyd