On 31 October 2018, the USPTO issued the first office action (non-final rejection) for patent application number 15-009647 (batch normalization layers), filed by Google Inc.
This “batch normalization layers” patent attracted a huge amount of attention among the attendees of our seminar on artificial intelligence tech and patents due to its impact on deep learning technology. At the time, we promised to keep our attendees informed of any developments in the examination of this application, and we are now pleased to report that the USPTO has issued a first examination result.
The USPTO’s first office action can be characterized as “the complete package” of rejection reasons related to software patents, citing 14 instances of prior art (4 patents and 3 technical papers were explicitly cited in the office action), which provide the basis for denying the registrability of Google’s patent. In this context, 14 is an extraordinarily high number, considering that it is seldom the case that 5 or more prior art references are cited by examiners of the USPTO.
The number of pages of the office action is high as well - 58 in total - which leads us to believe that the USPTO put a fair amount of effort into this patent examination.
To facilitate the understanding of this article, let us first explain the general procedure of patent examination.
In order to obtain patent protection for your technology, you first need to submit a request form to the patent office of the country in which you want to obtain a patent. We call this “filing a patent application”. The application consists of many components, but the most important is a list of so-called “claims”, which define the precise scope of patent protection that the applicant is seeking. Of course, simply filing an application does not guarantee that the patent office will grant you patent protection, and in most countries, the patent office will first examine the patent application to determine whether or not you are eligible for patent protection.
After examining the patent application, the patent office may grant a patent right to the applicant immediately by issuing a notice of allowance with respect to the application (sometimes called a decision of grant or the like). This scenario, in which the patent office grants a patent immediately without citing any reasons for rejection, is rare, however, and only occurs for about 10% of all patent applications (although the rate does vary from country to country).
For the remaining 90% of the cases, where the patent office decides to reject the applicant’s request to get a patent registration, the patent office will issue a notification (commonly referred to as an office action or OA), citing the reasons for rejecting the application, and provide an opportunity for the applicant to respond to the patent office’s objections.
In the present case, the applicant, Google Inc., has an opportunity to respond to the office action by amending the claims of the patent application or submitting an argument against the rejection reasons cited by the patent office.
The USPTO cited 3 major rejection reasons in this office action, namely:
1. Patent eligibility (35 U.S.C. 101)
2. Novelty (35 U.S.C. 102)
3. Inventiveness (35 U.S.C. 103)
Patent eligibility became a critical issue for software patents in the wake of the US Supreme Court decision in Alice Corp. v. CLS Bank International in 2014. It is interesting to see that the USPTO has now started to question the patent eligibility of core deep learning algorithm inventions. It may be an indication of the future stance of the USPTO on deep learning algorithm patents, especially ones related to operations between layers, neural networks, etc. in deep learning. We will track this issue and post another article about it soon.
I believe that the readers of this article may be familiar with the second and third issues. Those rejection reasons address the question of how similar the invention sought to be patented is to existing technology (commonly referred to as “prior art”). The issue of novelty is raised when the patent office believes that the invention in question is substantially identical to prior art, and the issue of inventive step is raised if the invention is similar to one or more examples of prior art.
In the remainder of this article, we will analyze the rejection reasons relating to claim 1 of Google’s patent application.
Here is claim 1:
1. A neural network system implemented by one or more computers, the neural network system comprising:
a batch normalization layer between a first neural network layer and a second neural network layer, wherein the first neural network layer generates first layer outputs having a plurality of components, and wherein the batch normalization layer is configured to, during training of the neural network system on a batch of training examples:
receive a respective first layer output for each training example in the batch;
compute a plurality of normalization statistics for the batch from the first layer outputs;
normalize each component of each first layer output using the normalization statistics to generate a respective normalized layer output for each training example in the batch;
generate a respective batch normalization layer output for each of the training examples from the normalized layer outputs; and
provide the batch normalization layer output as an input to the second neural network layer.
Claim 1 describes the well known concepts of (1) batch normalization layers between two layers, (2) which receive inputs from the preceding layer, (3) compute normalization statistics (such as average and standard deviation), (4) normalize the inputs using the statistics and (5) provide the normalized outputs to the subsequent layer as an input.
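For readers who prefer code to claim language, the five elements above can be sketched roughly as follows. This is only an illustrative sketch in Python with NumPy, not Google's implementation and not wording taken from the application. The function name, the small constant eps, and the scale and shift parameters gamma and beta are our own assumptions, borrowed from the commonly known batch normalization technique; claim 1 itself does not name them.

```python
import numpy as np

def batch_norm_training_step(first_layer_outputs, gamma, beta, eps=1e-5):
    """Illustrative training-time batch normalization (names are hypothetical)."""
    # (2) Receive one first layer output per training example.
    #     Shape: (batch_size, num_components).
    x = np.asarray(first_layer_outputs, dtype=np.float64)

    # (3) Compute normalization statistics for the batch, one per component.
    mean = x.mean(axis=0)
    std = np.sqrt(x.var(axis=0) + eps)  # eps is an assumed stabilizer

    # (4) Normalize each component of each first layer output.
    normalized = (x - mean) / std

    # Generate the layer output from the normalized outputs. A learned
    # scale (gamma) and shift (beta) is assumed here; the claim leaves
    # this step unspecified.
    return gamma * normalized + beta

# (5) The returned array would then be fed to the second layer.
batch = np.array([[1.0, 2.0], [3.0, 6.0]])
layer_output = batch_norm_training_step(batch, gamma=np.ones(2), beta=np.zeros(2))
```

Note how little of this the claim actually pins down: any prior art that computes statistics over a set of layer outputs and normalizes with them arguably falls within the same wording.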
Novelty (35 U.S.C. 102)
The USPTO raised the issue that claim 1 of the patent application is substantially identical to US patent 5,479,576.
US5,479,576 is a patent that was first filed in Japan in 1992, and it relates to technology that is already rather old. This indicates to me that the USPTO had a negative stance on the registrability of Google’s patent from the outset.
In the detailed description of US5,479,576, the relevant technical disclosure having similarity with the batch normalization patent is in Fig. 14. In Fig. 14, there are two layers - a first layer (61) and a second layer (64). Fig. 14 also discloses a network (63) which infers the average of the output of the first layer (61) and another network (64) which infers the standard deviation. The inferred average and standard deviation are used for normalizing the inputs to the second layer (64).
What do you think? If you read US5,479,576 carefully, you will notice that this patent itself is quite different from the commonly known batch normalization technique. However, you should consider that the patent office examines a patent application on the basis of the patent claims. Since Google’s patent claims only describe rudimentary elements of batch normalization, the USPTO only needs to find examples of prior art that match the wording of Google’s patent claims, not Google’s specific technology.
Then why did Google not describe its technology in more detail? Google’s motivation was presumably to try to secure a broader scope of patent protection.
The simpler a claim is, the broader its scope of protection once it is registered, but the more likely it is to be rejected during the application process. The process of patent examination involves trying to persuade the patent office to accept the broadest possible formulation of the claims, and balancing this against the risk of having one’s claims rejected for not being specific enough.
I believe that it will not be easy for Google to persuade the patent office to change its opinion on the current patent claims. If Google comes to the same conclusion, Google still has the option of amending the claims to be more limited, by describing more specific features in the claims.
However, even with additional limitations added, the patent application still needs to overcome a further 13 prior art citations relating to other claims of the application. It will not be an easy process.
For third-party engineers who use deep learning technology, this can be considered good news, since it makes it less risky to use well-known deep learning techniques such as batch normalization. However, as we already discussed in the previous article, a third party can also rely on the open source licenses covering the TensorFlow library.
We can expect Google to submit a response (possibly with claim amendments) within the next few months, and we will keep you updated as the situation unfolds.
The European Patent Office recently granted Google a patent on the corresponding batch normalization layers application. We will write about this as well when more information is available.