I built a Deep Q-Network with TensorFlow. When I try to create two of them (I want the network to play against itself), I get:

ValueError: Trying to share variable dense/kernel, but specified shape (100, 160) and found shape (9, 100)

Here is my network:

```python
class QNetwork:
    """A Q-Network implementation"""
    def __init__(self, input_size, output_size, hidden_layers_size,
                 gamma, maximize_entropy, reuse):
        self.q_target = tf.placeholder(shape=(None, output_size), dtype=tf.float32)
        self.r = tf.placeholder(shape=None, dtype=tf.float32)
        self.states = tf.placeholder(shape=(None, input_size), dtype=tf.float32)
        self.enumerated_actions = tf.placeholder(shape=(None, 2), dtype=tf.int32)
        self.learning_rate = tf.placeholder(shape=[], dtype=tf.float32)
        layer = self.states
        for l in hidden_layers_size:
            layer = tf.layers.dense(inputs=layer, units=l, activation=tf.nn.relu,
                                    kernel_initializer=tf.contrib.layers.xavier_initializer(),
                                    reuse=reuse)
        self.output = tf.layers.dense(inputs=layer, units=output_size,
                                      kernel_initializer=tf.contrib.layers.xavier_initializer(),
                                      reuse=reuse)
        self.predictions = tf.gather_nd(self.output, indices=self.enumerated_actions)
        if maximize_entropy:
            self.future_q = tf.log(tf.reduce_sum(tf.exp(self.q_target), axis=1))
        else:
            self.future_q = tf.reduce_max(self.q_target, axis=1)
        self.labels = self.r + (gamma * self.future_q)
        self.cost = tf.reduce_mean(tf.losses.mean_squared_error(labels=self.labels,
                                                                predictions=self.predictions))
        self.optimizer = tf.train.AdamOptimizer(learning_rate=self.learning_rate).minimize(self.cost)
```

And this code fails:

```python
q1 = QNetwork(9, 9, [100, 160, 160, 100], gamma=0.99, maximize_entropy=False, reuse=tf.AUTO_REUSE)
q2 = QNetwork(9, 9, [100, 160, 160, 100], gamma=0.99, maximize_entropy=False, reuse=tf.AUTO_REUSE)
```

Any idea how to solve this? (Running TF 1.10.1, Python 3.6.5)
1 Answer

哈士奇WWW
Solved it. I needed to:
- give each layer a unique name
- put everything inside a variable_scope with reuse=tf.AUTO_REUSE (AUTO_REUSE rather than reuse=True, so the Adam optimizer's variables can still be created on first use)
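A minimal sketch of that fix, reconstructed from the two bullet points rather than copied from the answerer's code. It is written against the TF 1.x graph API via `tf.compat.v1` (plain `import tensorflow as tf` works the same on TF 1.x), and relies on `tf.layers.dense`'s default glorot-uniform initializer, which is the same scheme as `tf.contrib.layers.xavier_initializer()`:

```python
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # needed for graph mode under TF 2

def build_q_values(states, hidden_layers_size, output_size, scope='q_net'):
    # One variable_scope around the whole network, with AUTO_REUSE so a
    # second construction (and Adam's slot variables later) can create
    # or reuse variables as needed; every layer gets an explicit,
    # unique name so the layers no longer collide on "dense/kernel".
    with tf.variable_scope(scope, reuse=tf.AUTO_REUSE):
        layer = states
        for i, units in enumerate(hidden_layers_size):
            layer = tf.layers.dense(layer, units, activation=tf.nn.relu,
                                    name='hidden_%d' % i)
        return tf.layers.dense(layer, output_size, name='output')

states = tf.placeholder(shape=(None, 9), dtype=tf.float32)
q1 = build_q_values(states, [100, 160, 160, 100], 9)  # creates the variables
q2 = build_q_values(states, [100, 160, 160, 100], 9)  # reuses them, no ValueError
```

Note that with the same `scope` name the two networks share one set of weights; if the two self-play agents should learn independently, pass a different `scope` per QNetwork instance instead.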