使用自定义渐变进行自定义激活不起作用

小编典典

使用自定义渐变进行自定义激活不起作用

python

我正在尝试为简单的神经网络培训编写代码。目标是定义一个自定义激活函数，而不是让Keras自动将其派生用于反向传播，而是让Keras使用我的自定义梯度函数进行自定义激活：

import numpy as np
import tensorflow as tf
import math
import keras
from keras.models import Model, Sequential
from keras.layers import Input, Dense, Activation
from keras import regularizers
from keras import backend as K
from keras.backend import tf
from keras import initializers
from keras.layers import Lambda

@tf.custom_gradient
def custom_activation(x):

    def grad(dy):
        return dy * 0

    result=(K.sigmoid(x) *2-1 )
    return result, grad

x_train=np.array([[1,2],[3,4],[3,4]]);

inputs = Input(shape=(2,))
output_1 = Dense(20, kernel_initializer='glorot_normal')(inputs)
layer = Lambda(lambda x: custom_activation)(output_1)
output_2 = Dense(2, activation='linear',kernel_initializer='glorot_normal')(layer)
model2 = Model(inputs=inputs, outputs=output_2)

model2.compile(optimizer='adam',loss='mean_squared_error')
model2.fit(x_train,x_train,epochs=20,validation_split=0.1,shuffle=False)

由于梯度已定义为零，因此我预计在所有时期之后损耗都不会改变。这是我得到的错误的回溯：

Using TensorFlow backend.
WARNING:tensorflow:From C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
Traceback (most recent call last):
  File "C:/p/CE/mytest.py", line 43, in <module>
    layer = Lambda(lambda x: custom_activation)(output_1)
  File "C:\ProgramData\Anaconda3\lib\site-packages\keras\engine\base_layer.py", line 474, in __call__
    output_shape = self.compute_output_shape(input_shape)
  File "C:\ProgramData\Anaconda3\lib\site-packages\keras\layers\core.py", line 656, in compute_output_shape
    return K.int_shape(x)
  File "C:\ProgramData\Anaconda3\lib\site-packages\keras\backend\tensorflow_backend.py", line 593, in int_shape
    return tuple(x.get_shape().as_list())
AttributeError: 'function' object has no attribute 'get_shape'

更新： 我使用了Manoj
Mohan的答案，现在代码可以正常工作了。由于梯度定义为零，因此我希望各个时期之间的损失不会发生变化。但是，它确实发生了变化。为什么？我有什么想念的吗？

例：

Epoch 1/20
2019-10-03 10:31:34.193232: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2

2/2 [==============================] - 0s 68ms/step - loss: 8.3184 - val_loss: 13.7232
Epoch 2/20

2/2 [==============================] - 0s 496us/step - loss: 8.2783 - val_loss: 13.6368

阅读 209

2021-01-20

共1个答案

小编典典

更换

layer = Lambda(lambda x: custom_activation)(output_1)

与

layer = Lambda(custom_activation)(output_1)

由于梯度定义为零，因此我希望各个时期之间的损失不会发生变化。但是，它确实发生了变化。为什么？

在中间层中将梯度更新为零。因此，梯度不会从那里倒流。但是从输出到中间层，梯度将流动并且权重将被更新。修改后的架构将在各个时期输出恒定的损耗。

inputs = Input(shape=(2,))
output_1 = Dense(20, kernel_initializer='glorot_normal')(inputs)
output_2 = Dense(2, activation='linear',kernel_initializer='glorot_normal')(output_1)
layer = Lambda(custom_activation)(output_2)  #should be last layer
model2 = Model(inputs=inputs, outputs=layer)

2021-01-20