NEWS

Allowance of Google’s “Batch Normalization” patent

2019.05.19 Daeho Lee

Since we started our series of articles on deep learning, one of the most interesting topics has been Google's patent application on batch normalization layers. The US Patent and Trademark Office has now finally allowed it.


To me, the bottom line is that Google achieved the best possible outcome under the circumstances. Given the initial rejection in October last year and the position the United States Patent and Trademark Office took in the examiner's reports, it seemed inevitable that Google would have to give up a great deal of claim scope in order to obtain a patent.


However, on April 1, Google amended the claims in a way that appeared to give up a great deal of scope but in fact preserved its claim to the technical core: performing batch normalization on a convolutional layer.


Initial Claim
1. A neural network system implemented by one or more computers, the neural network system comprising:
 a batch normalization layer between a first neural network layer and a second neural network layer, wherein the first neural network layer generates first layer outputs having a plurality of components, and wherein the batch normalization layer is configured to, during training of the neural network system on a batch of training examples:
  receive a respective first layer output for each training example in the batch;
  compute a plurality of normalization statistics for the batch from the first layer outputs;
  normalize each component of each first layer output using the normalization statistics to generate a respective normalized layer output for each training example in the batch;
  generate a respective batch normalization layer output for each of the training examples from the normalized layer outputs; and
  provide the batch normalization layer output as an input to the second neural network layer.

Allowed Claim
1. A neural network system implemented by one or more computers, the neural network system comprising:
 instructions for implementing a batch normalization layer between a first neural network layer and a second neural network layer in a neural network, wherein the first neural network layer generates first layer outputs having a plurality of components, and wherein the instructions cause the one or more computers to perform operations comprising:
  during training of the neural network on a plurality of batches of training data, each batch comprising a respective plurality of training examples and for each of the batches:
  receiving a respective first layer output for each of the plurality of training examples in the batch;
  computing a plurality of normalization statistics for the batch from the first layer outputs, comprising:
  determining, for each of a plurality of subsets of the plurality of the components of the first layer outputs, a mean of the components of the first layer outputs for each of the plurality of training examples in batch that are in the respective subset, and
  determining, for each of a plurality of subsets of the plurality of the components of the first layer outputs, a standard deviation of the components of the first layer outputs for each of the plurality of training examples in the batch that are in the respective subset;
  normalizing each of the plurality of the components of each first layer output using the normalization statistics to generate a respective normalized layer output for each training example in the batch, comprising:
  for each first layer output and for each of the plurality of subsets, normalizing the components of the first layer output that are in the respective subset using the mean for the respective subset and standard deviation for the respective subset;
  generating a respective batch normalization output for each of the training examples from the normalized layer outputs; and
  providing the batch normalization layer output as an input to the second neural network layer.
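
To make the steps recited in the allowed claim easier to follow, here is a minimal Python sketch of one training-time pass. It is only an illustration under my own assumptions, not Google's implementation and not a reading of the claim's legal scope. The function name `batch_norm_train`, the list-of-lists data layout, the `subsets` argument (lists of component indices that share one mean and standard deviation), the small constant `eps`, and the scale and shift parameters `gamma` and `beta` are all hypothetical choices made for the example.

```python
import math


def batch_norm_train(batch, subsets, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize a batch of first-layer outputs, subset by subset.

    batch   : list of first-layer outputs, one per training example;
              each output is a list of float components.
    subsets : list of lists of component indices; every subset gets
              its own mean and standard deviation (assumed layout).
    gamma, beta, eps : assumed scale, shift and stability constant;
              they are not taken from the claim text.
    """
    outputs = [list(example) for example in batch]
    for indices in subsets:
        # Gather the components in this subset across all examples.
        values = [example[i] for example in batch for i in indices]
        mean = sum(values) / len(values)
        variance = sum((v - mean) ** 2 for v in values) / len(values)
        std = math.sqrt(variance + eps)
        # Normalize each component in the subset with the subset statistics,
        # then produce the layer output passed on to the next layer.
        for n, example in enumerate(batch):
            for i in indices:
                normalized = (example[i] - mean) / std
                outputs[n][i] = gamma * normalized + beta
    return outputs


if __name__ == "__main__":
    batch = [[1.0, 2.0, 10.0, 20.0], [3.0, 4.0, 30.0, 40.0]]
    print(batch_norm_train(batch, subsets=[[0, 1], [2, 3]]))
```

In this sketch, each subset could be a single component (as in a fully connected layer) or all spatial locations of one feature map (as in a convolutional layer); the claim language itself does not say which.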


Previously, in the column on the examiner's report for the batch normalization patent, I noted that the examiner had hinted that claim 9 of the application involved an inventive step.


Claim 9
9. The neural network system of claim 1, wherein the first neural network layer is a convolutional layer, wherein the plurality of components of the first layer output are indexed by feature index and spatial location index, and wherein computing a plurality of normalization statistics for the first layer outputs comprises:
 computing, for each combination of feature index and spatial location index, a mean of the components of the first layer outputs having the feature index and spatial location index;
 computing, for each feature index, an average of the means for combinations that include the feature index;
 computing, for each combination of feature index and spatial location index, a variance of the components of the first layer outputs having the feature index and spatial location index; and
 computing, for each feature index, an average of the variances for combinations that include the feature index.


Claim 9 narrows the scope of the patent to CNNs. As for the unit of normalization, it covers only cases in which the statistics are first computed over data units made up of values that share the same feature index and spatial location index.


In other words, when a CNN performs batch normalization on a convolutional layer, the normalization statistics (mean and variance) are computed for each channel of the layer's output, and the values in that channel are normalized with them.
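
The statistics recited in claim 9 can be sketched in a few lines of Python. This is a hedged illustration only: the function name `conv_bn_statistics` and the `outputs[n][f][s]` layout (example `n`, feature index `f`, flattened spatial location index `s`) are assumptions I have made for the example, not anything specified in the patent.

```python
def conv_bn_statistics(outputs):
    """Per-feature mean and variance following the steps of claim 9.

    outputs[n][f][s] is the component of training example n at
    feature index f and spatial location index s (assumed layout).
    Returns (means, variances), each with one entry per feature index.
    """
    n_examples = len(outputs)
    n_features = len(outputs[0])
    n_locations = len(outputs[0][0])

    means, variances = [], []
    for f in range(n_features):
        location_means, location_variances = [], []
        for s in range(n_locations):
            # Statistics over the batch for one (feature, location) pair.
            values = [outputs[n][f][s] for n in range(n_examples)]
            mean = sum(values) / n_examples
            location_means.append(mean)
            location_variances.append(
                sum((v - mean) ** 2 for v in values) / n_examples
            )
        # Average the per-location statistics for this feature index.
        means.append(sum(location_means) / n_locations)
        variances.append(sum(location_variances) / n_locations)
    return means, variances


if __name__ == "__main__":
    # Two examples, one feature map, two spatial locations.
    outputs = [[[1.0, 2.0]], [[3.0, 6.0]]]
    print(conv_bn_statistics(outputs))
```

One point worth noticing: the average of the per-location variances is generally not equal to the variance of all values of a channel pooled together, because the pooled variance also includes the spread of the per-location means.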


In the previous column, I suggested that Google would find it hard to settle for a compromise in which it accepted the examiner's proposal and secured only the scope of claim 9.


However, Google built on that content and obtained an allowance with the broadest scope available in the circumstances, amending claim 1 without adding any limitation that applies only to CNNs or any restriction on how mini-batches are constructed.


Now we need to take Google's allowance as given and watch what Google decides to do to broaden its scope of rights further. Although its claim to using BN on convolutional layers has been preserved, Google may still seek rights to the technique of performing BN in more general neural networks.


In the next column, I'll say more about the scope of rights Google has obtained and the options it still has.


Thank you.



About the author

Daeho Lee

Daeho started his career as a patent attorney at Seoul firm Park, Kim & Partner, where he worked on ...
