Is now a good time to review your patent searching software?

With the Covid pandemic upon us, many businesses are reviewing their operations and looking for improvements and cost savings. The same will no doubt apply to patent searching - whether you search as part of a patent owning or patent attorney business, work in a specialist patent searching business, or are a patent examiner.

How we search at the moment - ‘Filter Searching’ - and why this is not perfect

The vast majority of patent searching is done using an approach which could be described as ‘Filter Searching’. In other words, we construct a search query using a command along these lines:

Filter out all patents except those that meet all of these conditions: one or more of these keywords, classification codes, owners, jurisdictions and dates.
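To make this concrete, here is a minimal sketch of a filter query in Python. The `Patent` record, the `matches_filter` function and the three sample patents are all made up for illustration (they are not taken from any real database or search engine), and the owner condition is left out to keep it short. The point is simply that every condition must be met, so a patent that uses an unexpected keyword drops out of the results.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Patent:
    # Hypothetical record; real databases hold many more fields.
    number: str
    title: str
    class_codes: frozenset
    jurisdiction: str
    filed: date


def matches_filter(patent, keywords, class_codes, jurisdictions, filed_after):
    """Return True only if the patent meets every condition (logical AND)."""
    title = patent.title.lower()
    has_keyword = any(word.lower() in title for word in keywords)
    has_class = bool(patent.class_codes & class_codes)
    in_jurisdiction = patent.jurisdiction in jurisdictions
    recent_enough = patent.filed >= filed_after
    return has_keyword and has_class and in_jurisdiction and recent_enough


# Invented sample data.
patents = [
    Patent("US-1", "Folding cardboard box", frozenset({"B65D5"}), "US", date(2015, 3, 1)),
    Patent("US-2", "Collapsible carton", frozenset({"B65D5"}), "US", date(2016, 7, 9)),
    Patent("EP-3", "Cardboard box with handle", frozenset({"B65D5"}), "EP", date(2009, 1, 20)),
]

hits = [
    p.number
    for p in patents
    if matches_filter(p, {"box"}, {"B65D5"}, {"US", "EP"}, date(2010, 1, 1))
]
print(hits)  # ['US-1']
```

Note that US-2 is filtered out only because its title says "carton" rather than "box", and that the surviving results come back in no particular order of relevance.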

These sorts of queries can be constructed in a variety of free and paid patent search engines, and generally these engines work pretty well. But despite this, there are some problems with this approach:

  1. These queries are based on assumptions about what the keywords, classification codes, owners etc. should be. Most of the time competent searchers get these assumptions correct, but sometimes relevant patents fall outside even the best prepared assumptions. Some applicants or attorneys use very unexpected keywords, there may be imperfections in translating from other languages, and class codes can be highly variable for some technologies. In practice this can lead to what we call ‘false negatives’ - relevant results that are missing from the list produced by these queries.

  2. It can be very time consuming. Searchers typically search through hundreds of patents to find the most relevant results, and this process can take many hours. In other words, there are lots of ‘false positives’ produced by these queries - being results listed that are not in fact what we are looking for.

    A key reason for this is that the results are very rarely ranked by relevance - instead they are often sorted by, say, some sort of date. So #398 in the list may be just as relevant as #2, and you need to look at both - a very inefficient process.

The alternative - ‘Predictive Searching’

Not surprisingly, for this reason many searching managers (and in particular patent offices) have been looking for alternatives to ‘Filter Searching’. ‘Artificial intelligence’ has been discussed in this context, but what people often mean by this in relation to patent searching could be more precisely described as ‘Predictive Searching’ (or, in more general terms, predictive analytics). We can define Predictive Searching as the ability to predict relevant patents from one or more input conditions.

A key difference between Predictive Searching and Filter Searching is that Predictive Searching has the potential to find results that you did not expect, to make predictions based on (what appears to be) limited inputs.

Predictive Searching (we could also call this Predictive Intelligence) is all around us - every time you type something into a search bar or ask Siri or a similar assistant a question, Predictive Searching algorithms are being used to produce the answer. Similarly, the likes of Facebook are always trying to predict who your friends are. Some of the time the answer is spot on; at other times, as with a Google search query, you end up with a list of predicted results - and hopefully the most relevant results are near the top of the list.

Predictive Searching in patent searching

Predictive intelligence is also used very commonly in patent searching, with the most common application being Google Patents, followed by semantic searching, and Ambercite citation searching.

Google Patents

We estimate that Google Patents is used several hundred thousand times a month to run patent searches - with an example of a search shown below:

[Image: Predictive intelligence search.png]

(There is also a Filter Searching mode available in Google Patents, but personally I rarely use it).

Now if you think about this process, it is remarkably simple - we simply enter our best guess at the keywords, and Google Patents produces a list of patent results. Not all will be relevant (there will be a high number of false positives in the 52,967 patents returned by this query), but because they are ranked, the patents at the top of the list are likely to be relevant.

No wonder Google Patents is so popular. I went to a global patent conference a few years ago, where there were presentations by all of the leading vendors - and the Google Patents presentation was by far the most popular.

But - Google Patents is not perfect. While it is easy to use, relevant patents can be missed by its queries, or perhaps found in position #31,568 in the list. So alternative approaches can be required - this is not to say that you should not use Google Patents, only that no one search process is perfect.

Semantic Searching

The next most commonly used form of Predictive Searching is Semantic Searching. This is the basis of the specialist search engines Innography, Octimine, IPScreener and many others, and is also offered as a ‘similar patents’ search by many of the more conventional patent searching databases (and even in Google Patents as a ‘Similar documents’ search). Semantic Searching uses natural language processing to find and rank patents similar to the text you input into the search query (in many cases you can instead nominate a patent number - and the search engine extracts key text from this nominated patent).

In theory, Semantic Searching should be wonderful. All you need to do is input say a patent abstract, claim or technology description, and it will return a ranked list of similar patents.

However - after much testing of a range of these semantic search engines, I have come to the view that the results are very mixed. They can return similar patents, but they also return a lot of not very relevant patents (false positives).

And when we think about it, this is not very surprising. Different applicants can describe the same invention in very different ways - even a humble cardboard box can also be a carton, container, package, packaging, and receptacle, and this is before we bring in foreign language terms. Remember that this variation is for just one word - this is before we start to think of word order, or the different ways that different applicants can arrange words, or alternative sentence constructions for the same concepts.

However, this is not to dismiss semantic searching out of hand - in some cases it can return useful results, and it is fast and simple to use.

Ambercite citation based searching

A common weakness in the earlier techniques discussed is the reliance on language - when language can be so imprecise. Ambercite applies a different approach, looking for patents similar to one or more starting patents using algorithms that look for similarity in citation profiles.

By doing so, Ambercite is building on the expertise of patent examiners and applicants to recognise similarity in patents regardless of whether they use the same terminology, word order or classification codes. When they recognise a similar patent, they record it as a patent citation. But Ambercite goes further than that and combines citations from different examiners - and by doing so combines different opinions in a form of ‘collective wisdom’.
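As a rough illustration of what ‘similarity in citation profiles’ can mean, the sketch below scores patents by how much their citation neighbourhoods overlap, using a simple Jaccard measure. To be clear, this is not Ambercite's algorithm (which is not described here) - the function names, the scoring and the tiny citation map are all assumptions made up for this example.

```python
def citation_profile(citations, patent):
    """Patents that `patent` cites, plus patents that cite `patent`."""
    cited = set(citations.get(patent, ()))
    citing = {p for p, refs in citations.items() if patent in refs}
    return cited | citing


def jaccard(a, b):
    """Overlap between two sets, from 0.0 (nothing shared) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_similar(citations, start):
    """Rank every other patent by citation-profile overlap with `start`."""
    start_profile = citation_profile(citations, start)
    candidates = set(citations) | {r for refs in citations.values() for r in refs}
    candidates.discard(start)
    scored = [
        (jaccard(start_profile, citation_profile(citations, c)), c)
        for c in candidates
    ]
    # Highest score first; ties broken by patent number for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [c for score, c in scored if score > 0]


# Invented data: each patent maps to the set of patents it cites.
citations = {
    "A": {"P1", "P2", "P3"},
    "B": {"P1", "P2", "P3"},
    "C": {"P3", "P4"},
    "D": {"P5"},
}

print(rank_similar(citations, "A"))  # ['B', 'C']
```

Patent B is ranked first because it shares all three citations with A, even though nothing in this calculation looks at the words or class codes used in either patent.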

By using this approach, Ambercite has been shown to produce a high-quality set of new results not found by conventional searching, as shown by the Austrian patent office.

Nonetheless, Ambercite is not perfect. It does require one or more starting patents before a search begins, and these may not be available in all cases. It also will not work for recent applications where no citation data has been published, or for some patent offices that do not publish patent citations. Regardless, it can work for patents from all over the world, in a variety of languages, and there are many case studies of its efficacy.

How do the different approaches compare?

Not surprisingly, there are strengths and weaknesses with the different approaches:

Conventional keyword and class code searching (‘Patent filtering’)

·  Some examples of available databases: Questel, Patsnap, Derwent Innovation, PatBase, Patentlens

·  Advantages: commonly used; easily understood

·  Disadvantages: can be very time consuming, as a lot of ‘false positives’ are produced; results are rarely ranked in order of relevance; relies on assumptions that may not pick up all relevant patents

Google Patents

·  Some examples of available databases: Google Patents

·  Advantages: very easy to use; results are ranked in order of predicted relevance

·  Disadvantages: can produce a very long list of patents; not all relevant patents are found at the top of the list, i.e. many ‘false positives’

Semantic Searching

·  Some examples of available databases: Innography, Octimine, “Similar patents” functions found in some conventional databases

·  Advantages: very easy to use; results are ranked in order of predicted relevance

·  Disadvantages: many false positives

Citation based searching

·  Some examples of available databases: Ambercite

·  Advantages: relatively easy to use; high relevancy rate (has been ranked highly against semantic searching); avoids the reliance on language of the other approaches

·  Disadvantages: requires a good starting patent; will not find patents without available citation data

Why not use a variety of approaches?

To change the topic slightly, I have shown four different wood saws below. So which is the best way to cut wood?

Experienced carpenters will recognise that this is a silly question - each saw has its benefits for certain use cases, and a good carpenter will probably have one of each. Yes, in theory a truly skilled carpenter might be able to use one saw for all of the different use cases, but why would they? To do so would be time consuming, and unlikely to lead to the best outcomes.

And the same applies to patent searching. Rather than just relying on one search tool, experienced patent searchers will apply a variety of approaches because they want to save time while producing the best possible set of results.

Using multiple approaches in practice

Using different approaches may seem to be time consuming, but in practice this is rarely the case - just as a skilled carpenter will not use all of their saws for every job.

For example, given a patentability search you might:

  1. Start with a Google Patents, Filter or Semantic search - or perhaps a combination of these approaches.

  2. Input the best results found into Ambercite to find patents missed by language based searching.

  3. Perhaps even review the best results found by Ambercite to help run a conventional search.
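If you like to see this as a process, the steps above can be sketched as a simple merge of the ranked lists coming back from each tool. This is only an illustration under my own assumptions - the `merge_results` function, the tool labels and the patent numbers are invented, and in practice you would be exporting these lists from the tools themselves.

```python
def merge_results(results_by_tool):
    """Combine ranked lists, keeping first-seen order and noting each source."""
    merged = {}
    for tool, ranked in results_by_tool.items():
        for number in ranked:
            merged.setdefault(number, []).append(tool)
    return merged


# Invented results: a language based search first, then a citation search
# seeded with the best of those results.
results = {
    "keyword": ["US-1", "US-7", "EP-3"],
    "citation": ["US-2", "US-1"],
}

merged = merge_results(results)

# Patents that only the citation search found, i.e. the ones missed by
# language based searching.
only_citation = [n for n, tools in merged.items() if tools == ["citation"]]
print(only_citation)  # ['US-2']
```

Keeping track of which tool found which patent also makes it easy to see what each extra step is adding to the search.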

While three approaches may seem more laborious than one, in fact they are very synergistic, just like a carpenter’s set of tools. For this reason, it is not surprising that most users of Ambercite use a variety of search databases - because this both saves time (and hence money) and improves search quality.

Why now is a good time to review your searching practices

An economic crisis may seem a counter-intuitive time to consider new tools, but there are a number of reasons why this makes sense at the moment:

  • You may have more time to consider alternative approaches, as some other activity has slowed.

  • You may be considering reducing your subscriptions for some of your conventional databases. You may find that pricing for Ambercite (and other predictive searching tools) is somewhat less than the per-seat pricing of conventional databases (please contact us for details).

  • When the crisis abates and things get back to normal, you may be asked to do more with less - and Ambercite and other predictive searching tools can allow you to do this.

How can I try Ambercite for myself?

Ambercite offers free trials, but to get the most out of a trial, please contact us for a demonstration. You can try either option via the links below.

 
Mike Lloyd