
Batch normalization in TensorFlow: variables and performance

Python

烙印99 2021-12-26 10:28:00

I would like to add a conditional operation on the variables of a batch normalization layer. Specifically, I want to train in float and then quantize during a secondary fine-tuning phase. For this, I want to add a tf.cond operation on the variables (scale, shift, and the exponential moving averages of the mean and variance).

I replaced tf.layers.batch_normalization with a batchnorm layer I wrote myself (see below). The function works correctly (I get the same metrics with both implementations), and I can add any pipeline to the variables before the batchnorm operation. The problem is that runtime performance dropped dramatically: simply replacing tf.layers.batch_normalization with my own function makes the model about 2x slower.

def batchnorm(self, x, name, epsilon=0.001, decay=0.99):
    epsilon = tf.to_float(epsilon)
    decay = tf.to_float(decay)
    with tf.variable_scope(name):
        shape = x.get_shape().as_list()
        channels_num = shape[3]
        # scale factor
        gamma = tf.get_variable("gamma", shape=[channels_num], initializer=tf.constant_initializer(1.0), trainable=True)
        # shift value
        beta = tf.get_variable("beta", shape=[channels_num], initializer=tf.constant_initializer(0.0), trainable=True)
        moving_mean = tf.get_variable("moving_mean", channels_num, initializer=tf.constant_initializer(0.0), trainable=False)
        moving_var = tf.get_variable("moving_var", channels_num, initializer=tf.constant_initializer(1.0), trainable=False)
        batch_mean, batch_var = tf.nn.moments(x, axes=[0, 1, 2])  # per channel
        update_mean = moving_mean.assign((decay * moving_mean) + ((1. - decay) * batch_mean))
        update_var = moving_var.assign((decay * moving_var) + ((1. - decay) * batch_var))
        tf.add_to_collection(tf.GraphKeys.UPDATE_OPS, update_mean)
        tf.add_to_collection(tf.GraphKeys.UPDATE_OPS, update_var)
        bn_mean = tf.cond(self.is_training, lambda: tf.identity(batch_mean), lambda: tf.identity(moving_mean))
        bn_var = tf.cond(self.is_training, lambda: tf.identity(batch_var), lambda: tf.identity(moving_var))

    with tf.variable_scope(name + "_batchnorm_op"):
        inv = tf.math.rsqrt(bn_var + epsilon)
        inv *= gamma
        output = ((x*inv) - (bn_mean*inv)) + beta

    return output

I would appreciate help with any of the following questions:

1. Any ideas on how to improve the performance (reduce the runtime) of my solution?
2. Is it possible to add my own operators to the variables pipeline of tf.layers.batch_normalization before the batchnorm operation?
3. Is there any other solution to the same problem?
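The arithmetic in the custom layer above folds gamma into the inverse standard deviation before applying it. As a minimal sketch in plain Python (no TensorFlow; the input values and the gamma/beta settings are made up for illustration), the folded form can be checked against the textbook form on a single channel:

```python
import math

def batchnorm_folded(x, mean, var, gamma, beta, epsilon=0.001):
    """Folded form used in the layer above: inv = gamma * rsqrt(var + eps)."""
    inv = gamma / math.sqrt(var + epsilon)
    return (x * inv) - (mean * inv) + beta

def batchnorm_textbook(x, mean, var, gamma, beta, epsilon=0.001):
    """Textbook form: normalize first, then scale and shift."""
    return gamma * (x - mean) / math.sqrt(var + epsilon) + beta

# One channel with four hypothetical activations.
values = [1.0, 2.0, 3.0, 4.0]
mean = sum(values) / len(values)
# Population variance (divide by N), which is what tf.nn.moments computes.
var = sum((v - mean) ** 2 for v in values) / len(values)

folded = [batchnorm_folded(v, mean, var, gamma=1.5, beta=0.1) for v in values]
textbook = [batchnorm_textbook(v, mean, var, gamma=1.5, beta=0.1) for v in values]
max_diff = max(abs(a - b) for a, b in zip(folded, textbook))
print(max_diff)  # tiny floating-point difference only
```

Both forms agree up to rounding, so the slowdown is unlikely to come from the math itself; it is more plausibly a matter of how many separate ops the graph has to execute.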


1 Answer

慕桂英546537


tf.nn.fused_batch_norm is optimized and did the trick.

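I had to create two subgraphs, one per mode, because the interface of fused_batch_norm does not accept a conditional training/test mode (is_training is a Python bool, not a tensor, so the graph it builds is not conditional). I added the condition afterwards (see below). Even with the two subgraphs, the runtime is approximately the same as with tf.layers.batch_normalization.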
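The "two subgraphs, then a condition" structure described above can be pictured with a plain-Python stand-in. This is only a sketch: the helper names `normalize` and `batchnorm_two_paths` are hypothetical, gamma and beta are left out for brevity, and an ordinary `if` expression stands in for tf.cond.

```python
import math

def normalize(values, mean, var, epsilon=0.001):
    """Normalize one channel with the given statistics."""
    return [(v - mean) / math.sqrt(var + epsilon) for v in values]

def batchnorm_two_paths(values, moving_mean, moving_var, is_training):
    """Build both outputs first, then select one (stand-in for tf.cond)."""
    batch_mean = sum(values) / len(values)
    batch_var = sum((v - batch_mean) ** 2 for v in values) / len(values)
    output_train = normalize(values, batch_mean, batch_var)   # training path
    output_test = normalize(values, moving_mean, moving_var)  # inference path
    return output_train if is_training else output_test

values = [1.0, 2.0, 3.0, 4.0]
train_out = batchnorm_two_paths(values, moving_mean=0.0, moving_var=1.0, is_training=True)
test_out = batchnorm_two_paths(values, moving_mean=0.0, moving_var=1.0, is_training=False)
```

Note that both paths are computed before the selection. In TensorFlow 1.x graph mode the same holds for tensors created outside the branch functions of tf.cond: they are evaluated regardless of which branch is chosen, so both fused ops may run on every step.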

Here is the final solution (I would still appreciate any comments or suggestions for improvement):

import tensorflow as tf  # TensorFlow 1.x graph-mode API

def batchnorm(self, x, name, epsilon=0.001, decay=0.99):
    with tf.variable_scope(name):
        shape = x.get_shape().as_list()
        channels_num = shape[3]  # assumes NHWC input
        # scale factor
        gamma = tf.get_variable("gamma", shape=[channels_num], initializer=tf.constant_initializer(1.0), trainable=True)
        # shift value
        beta = tf.get_variable("beta", shape=[channels_num], initializer=tf.constant_initializer(0.0), trainable=True)
        moving_mean = tf.get_variable("moving_mean", shape=[channels_num], initializer=tf.constant_initializer(0.0), trainable=False)
        moving_var = tf.get_variable("moving_var", shape=[channels_num], initializer=tf.constant_initializer(1.0), trainable=False)

        # Training subgraph: normalizes with the statistics of the current batch.
        (output_train, batch_mean, batch_var) = tf.nn.fused_batch_norm(
            x,
            gamma,
            beta,
            mean=None,
            variance=None,
            epsilon=epsilon,
            data_format="NHWC",
            is_training=True,
            name="_batchnorm_op")

        # Inference subgraph: normalizes with the moving averages.
        (output_test, _, _) = tf.nn.fused_batch_norm(
            x,
            gamma,
            beta,
            mean=moving_mean,
            variance=moving_var,
            epsilon=epsilon,
            data_format="NHWC",
            is_training=False,
            name="_batchnorm_op")

        # Select the output at run time; self.is_training is a boolean tensor.
        output = tf.cond(self.is_training, lambda: tf.identity(output_train), lambda: tf.identity(output_test))

        # Exponential moving averages of the batch statistics.
        update_mean = moving_mean.assign((decay * moving_mean) + ((1. - decay) * batch_mean))
        update_var = moving_var.assign((decay * moving_var) + ((1. - decay) * batch_var))
        tf.add_to_collection(tf.GraphKeys.UPDATE_OPS, update_mean)
        tf.add_to_collection(tf.GraphKeys.UPDATE_OPS, update_var)

    return output
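To make the effect of decay=0.99 in update_mean and update_var concrete, here is a small plain-Python sketch of the same moving-average rule. The constant batch statistic and the step count are made-up values for illustration, and `update_moving` is a hypothetical helper, not a TensorFlow function.

```python
def update_moving(moving, batch_value, decay=0.99):
    """One exponential-moving-average step, same rule as update_mean/update_var."""
    return decay * moving + (1.0 - decay) * batch_value

moving_mean = 0.0   # same initial value as the moving_mean variable
batch_mean = 2.5    # hypothetical batch statistic, held constant
steps = 100
for _ in range(steps):
    moving_mean = update_moving(moving_mean, batch_mean)

# Closed form for a constant input starting from zero: m * (1 - decay**n)
expected = batch_mean * (1.0 - 0.99 ** steps)
print(moving_mean, expected)
```

After 100 steps the moving mean has covered only about 63% of the distance to the batch mean, so the inference path needs a fair number of training steps before its statistics are representative. Also note that in TensorFlow 1.x the ops added to tf.GraphKeys.UPDATE_OPS are not run automatically; they are typically attached to the training op with tf.control_dependencies.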

2021-12-26
