
Strange results from the Unity ML-Agents Python API

Python

森欄 2023-10-05 16:25:25

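I am using the 3DBall example environment, but I am getting some very strange results and I don't understand why. So far my code is just a for-range loop that prints the reward and fills the required inputs with random values. When I run it, a negative reward is never shown, and at random times there are no decision steps. That would make sense, but shouldn't the simulation keep going until there is a decision step? Any help would be greatly appreciated, since there are almost no resources for this apart from the documentation.

    import random
    import numpy
    from mlagents_envs.environment import UnityEnvironment

    env = UnityEnvironment()
    env.reset()
    behavior_names = env.behavior_specs
    for i in range(50):
        arr = []
        behavior_names = env.behavior_specs
        for i in behavior_names:
            print(i)
        DecisionSteps = env.get_steps("3DBall?team=0")
        print(DecisionSteps[0].reward, len(DecisionSteps[0].reward))
        # for some reason it returns action mask as False when
        # DecisionSteps[0].reward is empty, and as None when it is not
        print(DecisionSteps[0].action_mask)
        for i in range(len(DecisionSteps[0])):
            arr.append([])
            for b in range(2):
                arr[-1].append(random.uniform(-10, 10))
        if len(DecisionSteps[0]) != 0:
            env.set_actions("3DBall?team=0", numpy.array(arr))
            env.step()
        else:
            env.step()
    env.close()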

1 Answer

白板的微信


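I think your problem is that when an agent's episode terminates and needs to be reset, the agent does not return a decision_step but a terminal_step. This happens because the agent has dropped its ball, and the reward returned in the terminal_step will be -1.0. I took your code and made some changes, and it now runs fine (although you will probably want to change it so that the environment is not reset every time one of the agents drops its ball).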

    import numpy as np
    from mlagents_envs.environment import UnityEnvironment

    # -----------------
    # Close an env that might not have been closed before
    # (for example, when re-running this in a notebook).
    try:
        env.close()
    except Exception:
        pass
    # -----------------

    env = UnityEnvironment(file_name=None)
    env.reset()

    for i in range(1000):
        # Go through all existing behaviors
        for behavior_name in env.behavior_specs:
            decision_steps, terminal_steps = env.get_steps(behavior_name)

            if len(terminal_steps) > 0:
                print("Agent " + behavior_name + " has terminated, resetting environment.")
                # This is probably not the desired behaviour, as the other agents are still active.
                env.reset()
                # The reset invalidates the steps fetched above, so fetch them again.
                decision_steps, terminal_steps = env.get_steps(behavior_name)

            # One random action pair per agent that requested a decision
            actions = []
            for agent_id_decisions in decision_steps:
                actions.append(np.random.uniform(-1, 1, 2))

            # print(decision_steps[0].reward)
            # print(decision_steps[0].action_mask)

            if len(actions) > 0:
                env.set_actions(behavior_name, np.array(actions))

        try:
            env.step()
        except Exception:
            print("Something happened when taking a step in the environment.")
            print("The communicator has probably terminated, stopping simulation early.")
            break

    env.close()
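As a possible variation that avoids resetting the whole environment, the loop can record the reward carried by each terminal step and keep stepping. The following is only a sketch under a few assumptions: run_random_policy is a hypothetical helper name, the scene is assumed to restart a finished agent on its own, the action size of 2 is taken from the code in this thread, and set_actions is assumed to accept a plain array as it does there (newer releases may expect the array wrapped in an ActionTuple instead). Note that env.get_steps returns a pair, so the question's DecisionSteps[0] is the decision half and the negative rewards sit in the other half.

```python
import numpy as np


def run_random_policy(env, n_steps, action_size=2, rng=None):
    """Drive `env` with random actions and collect end-of-episode rewards.

    `env` only needs behavior_specs, get_steps, set_actions and step,
    i.e. the part of the UnityEnvironment interface used in this thread.
    """
    if rng is None:
        rng = np.random.default_rng()
    final_rewards = []  # (behavior_name, agent_id, reward) per finished episode

    for _ in range(n_steps):
        for behavior_name in env.behavior_specs:
            decision_steps, terminal_steps = env.get_steps(behavior_name)

            # Agents whose episode just ended: record the final reward only.
            # Assumption: no env.reset() is needed because the scene
            # restarts the agent by itself.
            for agent_id, reward in zip(terminal_steps.agent_id,
                                        terminal_steps.reward):
                final_rewards.append((behavior_name, int(agent_id), float(reward)))

            # Agents waiting for a decision: one row of actions per agent.
            n_agents = len(decision_steps)
            if n_agents > 0:
                actions = rng.uniform(-1.0, 1.0, size=(n_agents, action_size))
                # Assumption: same array form as the answer above.
                env.set_actions(behavior_name, actions)

        # Advance the simulation once per iteration, with or without actions.
        env.step()

    return final_rewards
```

With a connected environment this would be called as run_random_policy(env, 1000) after env.reset(), followed by env.close(). If the assumptions hold, the returned list is where a dropped ball's -1.0 should appear.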

Answered 2023-10-05
