I would like to implement pruning algorithms in TensorFlow. To do this, I need not only to mask the weights during the pruning stage, but also to keep those entries of the weight tensor frozen at zero during the subsequent training.
For example, say I have the entries [0, 0] and [1, 1] masked:
[[0 1.3 2], [1.34 0 2.3]]
After several batches of training, I expect positions [0, 0] and [1, 1] to still contain zeros.
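For concreteness, this is the setup I have in mind (the names w and mask are my own):

```python
import tensorflow as tf

# Pruned weight tensor: entries [0, 0] and [1, 1] are masked to zero.
w = tf.Variable([[0.0, 1.3, 2.0],
                 [1.34, 0.0, 2.3]])

# Binary mask: 0 marks pruned entries, 1 marks trainable ones.
mask = tf.constant([[0.0, 1.0, 1.0],
                    [1.0, 0.0, 1.0]])
```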
There is a solution proposed here: How to stop gradient for some entry of a tensor in tensorflow. However, it seems to work only for TensorFlow v1, because the masked entries of the variable were still updated after calling the fit method.
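If I understand that answer correctly, the trick is to route the gradient around the masked entries, roughly like this (using w and mask from above):

```python
# Forward pass sees all of w unchanged, but the gradient reaches
# only the unmasked entries; the masked ones get zero gradient.
w_used = mask * w + (1.0 - mask) * tf.stop_gradient(w)
```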
It is possible to create a special Optimizer subclass with a redefined apply_gradients method, in which the gradient is multiplied by the mask after each backward pass. But this solution seems rather inconvenient, since one would have to redefine every optimizer one wants to use - Adam, RMSProp, whatever.
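For Adam it might look roughly like the following sketch (MaskedAdam and the masks mapping are my own names; the same override would have to be repeated for each optimizer class, which is exactly the inconvenience):

```python
import tensorflow as tf

class MaskedAdam(tf.keras.optimizers.Adam):
    def __init__(self, masks, **kwargs):
        super().__init__(**kwargs)
        # masks: dict mapping a variable's ref() to its binary mask tensor.
        self._masks = masks

    def apply_gradients(self, grads_and_vars, **kwargs):
        # Zero out the gradients of pruned entries before the update,
        # so the optimizer never moves them away from zero.
        masked = [
            (grad * self._masks[var.ref()], var)
            if var.ref() in self._masks else (grad, var)
            for grad, var in grads_and_vars
        ]
        return super().apply_gradients(masked, **kwargs)

# Usage with the w and mask from above:
opt = MaskedAdam(masks={w.ref(): mask}, learning_rate=1e-3)
```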