Report on the 1st Rejection of Google's Batch Normalization Patent

Summary

We explain in detail the examination process and the examiner’s comments in the first rejection of Google’s patent application on Batch Normalization Layers, an issue that anyone interested in deep learning technologies will find insightful.

On 31 October 2018, the U.S. Patent and Trademark Office (USPTO) issued the first office action (non-final rejection) for the patent application with application number 15-009647 (Batch Normalization Layers) filed by Google Inc.

Most of the attendees at the Artificial Intelligence technology and patents seminar held by PI IP LAW were interested in the topic of “Batch Normalization Layers”. We believe that this is due to the strong implications of this patent for deep learning technology as a whole. At the time, we promised to keep our attendees updated on any developments in the examination of this patent application, and we are now pleased to report that the USPTO has issued the first examination results.

First Non-final Rejection, USPTO’s Office Action (OA)

Usually, a first office action (OA) is a compilation of rejection reasons. In this particular case, the USPTO’s first OA is “the complete package” of rejection reasons related to the batch normalization layer patent, citing 14 instances of prior art (4 patents and 3 technical papers were explicitly cited) as the basis for denying the registrability of Google’s patent. In this context, 14 is an extraordinarily high number, considering that examiners of the USPTO rarely cite more than 5 prior art references. Moreover, the Office Action was 58 pages long, which leads us to believe that the USPTO invested a great deal of effort in this patent examination.

Patent Process Overview

To help readers follow this article, let us first explain the general procedure of patent examination.

In order to obtain patent protection for your technology, you first need to submit a request form to the patent office of each jurisdiction from which you wish to obtain a patent. We call this procedure “filing a patent application”. The application requires preparation of many documents, but the most important one is a list of “claims”, which define the precise scope of the subject patent. Of course, simply filing an application does not guarantee that the patent office will grant you patent protection, and in most jurisdictions, the patent office will first examine the patent application to determine whether or not you are eligible for patent protection.

After examining the patent application, the patent office may grant a patent right to the applicant immediately by issuing a notice of allowance with respect to the application (sometimes called a decision of grant or the like). This scenario, however, in which the patent office grants a patent immediately without citing any reasons for rejection, is rare, and only occurs in about 10% of all patent applications (although the rate varies from country to country).

For the remaining 90% of the cases, the patent office initially rejects the applicant’s request for a patent registration. The patent office issues a notification (commonly referred to as an office action or an OA), which cites the reasons for rejecting the application and gives the applicant an opportunity to respond to the patent office’s objections.

In the present case, the applicant, Google Inc., had an opportunity to respond to the office action by amending the claims of the patent application or submitting an argument against the rejection reasons cited by the patent office.

The USPTO cited 3 major rejection reasons in this office action, namely:

Patent eligibility (35 U.S.C. 101)
Novelty (35 U.S.C. 102)
Inventive step, or non-obviousness (35 U.S.C. 103)

Patent Eligibility (35 U.S.C. 101)

Patent eligibility became a critical issue for software patents in the wake of the U.S. Supreme Court decision in the Alice Corp. v. CLS Bank International case in 2014. It is interesting to see that the USPTO has now started to question the patent eligibility of core deep learning algorithm inventions. It may be an indication of the future stance of the USPTO on deep learning algorithm patents, especially the ones related to operations between layers, neural networks, etc. in deep learning. We will keep track of this issue and post another article about this matter soon.

You can now read our article about patent eligibility issues in software-related patents here.

Google’s Batch Normalization Layer, Claim 1

I believe that the readers of this article may be familiar with the second and third issues. Those rejection reasons address the question of how similar the invention sought to be patented is to existing technology (commonly referred to as “prior art”). The issue of novelty is raised when the patent office believes that the invention in question is substantially identical to a single example of prior art, and the issue of inventive step is raised if the invention is similar to one or more examples of prior art.

In the remainder of this article, we will analyze the rejection reasons against claim 1 of Google’s patent application.

Here is claim 1:

“A neural network system implemented by one or more computers, the neural network system comprising:

a batch normalization layer between a first neural network layer and a second neural network layer, wherein the first neural network layer generates first layer outputs having a plurality of components, and wherein the batch normalization layer is configured to, during training of the neural network system on a batch of training examples:

receive a respective first layer output for each training example in the batch;

compute a plurality of normalization statistics for the batch from the first layer outputs;

normalize each component of each first layer output using the normalization statistics to generate a respective normalized layer output for each training example in the batch;

generate a respective batch normalization layer output for each of the training examples from the normalized layer outputs; and

provide the batch normalization layer output as an input to the second neural network layer.”

In essence, claim 1 describes the well-known concepts of (1) batch normalization layers between two layers, (2) which receive inputs from the preceding layer, (3) compute normalization statistics (such as average and standard deviation), (4) normalize the inputs using the statistics and (5) provide the normalized outputs to the subsequent layer as an input.
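To make the five elements above concrete, here is a minimal sketch in plain Python of what such a layer might compute during training. It is an illustration only, not Google’s implementation and not a reading of the claim language: the function name batch_norm_forward, the small constant eps, and the learnable scale and shift parameters gamma and beta are our assumptions, taken from the commonly known form of batch normalization rather than from claim 1 itself.

```python
import math

def batch_norm_forward(batch, gamma=None, beta=None, eps=1e-5):
    """Hypothetical sketch of a batch normalization layer at training time.

    batch: list of first layer outputs, one list of components per example.
    gamma, beta, eps: assumed parameters (scale, shift, numerical stabilizer);
    they are not recited in claim 1.
    """
    n = len(batch)
    dims = len(batch[0])
    gamma = gamma if gamma is not None else [1.0] * dims
    beta = beta if beta is not None else [0.0] * dims

    # (3) Compute normalization statistics for the batch, per component.
    means = [sum(x[j] for x in batch) / n for j in range(dims)]
    variances = [sum((x[j] - means[j]) ** 2 for x in batch) / n
                 for j in range(dims)]

    # (4) Normalize each component of each first layer output.
    normalized = [[(x[j] - means[j]) / math.sqrt(variances[j] + eps)
                   for j in range(dims)] for x in batch]

    # Generate the layer output from the normalized values (scale and shift).
    return [[gamma[j] * xn[j] + beta[j] for j in range(dims)]
            for xn in normalized]

# (2) Outputs received from the first layer: 3 training examples, 2 components.
first_layer_outputs = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]

# (5) These values would be passed on as input to the second layer.
bn_outputs = batch_norm_forward(first_layer_outputs)
```

With the default gamma and beta, each component of bn_outputs has a mean of approximately zero and a variance of approximately one across the batch. Note how little of this detail appears in claim 1, which recites only the receiving, computing, normalizing, generating and providing steps.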

Novelty (35 U.S.C. 102)

The USPTO raised the issue that claim 1 of the batch normalization layer patent application is substantially identical to US patent 5479576.

US5479576 is a patent that was first filed in Japan in 1992, and it relates to technology that is already rather old. This can be read as a sign that, from the beginning, the USPTO has had a negative stance on the registrability of Google’s patent.

In the detailed description of US5479576, the relevant technical disclosure having similarity with the batch normalization patent is in Fig. 14. As shown in the figure below, in the cited patent, there are two layers - a first layer (61) and a second layer (64). Fig. 14 also discloses a network (63) which infers the average of the output of the first layer (61) and another network (64) which infers the standard deviation. The inferred average and standard deviation are used for normalizing the inputs to the second layer (64).

What do you think?

If you read US5479576 carefully, you will notice that this patent itself is quite different from the commonly known batch normalization technique. However, you should consider that the patent office examines a patent application on the basis of the patent claims. Since Google’s patent claims only describe rudimentary elements of batch normalization, the USPTO only needs to find examples of prior art that match the wording of Google’s patent claims, not Google’s specific technology.

Then, why didn’t Google describe its technology in more detail?

Presumably, Google’s intention was to secure a broader scope of patent protection.

The simpler the claim, the broader the scope of protection once it is registered, but the more likely it is to be rejected during the application process. The process of patent examination involves trying to persuade the patent office to accept the broadest possible formulation of the claims, and balancing this against the risk of having one’s claims rejected for not being specific enough.

My guess is that it will not be easy for Google to persuade the patent office to change its opinion on the current patent claims. If Google comes to the same conclusion, Google still has the option of amending the claims to be more limited, by describing more specific features in the claims.

However, even with additional limitations added, the patent application still needs to overcome a further 13 prior art citations relating to other claims of the application. It will not be an easy process.

For third-party engineers who use deep learning technology daily, this can be considered good news, since it makes it less risky to use well-known deep learning techniques such as batch normalization. However, as we already discussed in the previous article, a third party can also rely on open source licenses covering the TensorFlow library.

We can expect Google to submit a response (possibly with claim amendments) within the next few months, and we will keep you updated as the situation unfolds.

Thank you for reading!

Also, please follow our updates on this case.

Shortly after we published the article, the European Patent Office granted Google a patent for the same batch normalization layers application. We will continue to post updates on this patent as more information becomes available.