I am new to Google Cloud Platform and I am trying to train a model on a TPU. I followed this tutorial to set up the TPU from Google Colab. All of the code below comes from the tutorial. These are the steps I completed:

    import datetime
    import json
    import os
    import pprint
    import random
    import string
    import sys
    import tensorflow as tf

    assert 'COLAB_TPU_ADDR' in os.environ, 'ERROR: Not connected to a TPU runtime; please see the first cell in this notebook for instructions!'
    TPU_ADDRESS = 'grpc://' + os.environ['COLAB_TPU_ADDR']
    print('TPU address is => ', TPU_ADDRESS)

    from google.colab import auth
    auth.authenticate_user()
    with tf.Session(TPU_ADDRESS) as session:
      print('TPU devices:')
      pprint.pprint(session.list_devices())

      # Upload credentials to TPU.
      with open('/content/adc.json', 'r') as f:
        auth_info = json.load(f)
      tf.contrib.cloud.configure_gcs(session, credentials=auth_info)
      # Now credentials are set for all future sessions on this TPU.

Output:

    TPU address is => grpc://10.4.89.154:8470

Set my bucket name and output directory name:

    BUCKET = 'my_xlnet' #@param {type:"string"}
    assert BUCKET, '*** Must specify an existing GCS bucket name ***'
    output_dir_name = 'xlnet_output' #@param {type:"string"}
    BUCKET_NAME = 'gs://{}'.format(BUCKET)
    OUTPUT_DIR = 'gs://{}/{}'.format(BUCKET,output_dir_name)
    tf.gfile.MakeDirs(OUTPUT_DIR)
    print('***** Model output directory: {} *****'.format(OUTPUT_DIR))

Move the pretrained model to the GCS bucket:

    !gsutil mv /content/xlnet_extension_tf/model/xlnet_cased_L-24_H-1024_A-16 $BUCKET_NAME

Output:

    ...
    Operation completed over 5 objects/1.3 GiB.

Then run the main script:

    !python /content/xlnet_extension_tf/run_coqa.py \
    --use_tpu=True \
    --tpu_name=grpc://10.4.89.154:8470 \
    --spiece_model_file=$BUCKET_NAME/xlnet_cased_L-24_H-1024_A-16/spiece.model \
    --model_config_path=$BUCKET_NAME/xlnet_cased_L-24_H-1024_A-16/xlnet_config.json \
    --init_checkpoint=$BUCKET_NAME/xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt \
    ...

Then I got this error:

    OSError: Not found: "gs://my_xlnet/xlnet_cased_L-24_H-1024_A-16/spiece.model": No such file or directory Error #2

This is a screenshot of the GCS bucket:

(screenshot of the GCS bucket)

I don't understand why this error occurs, since the pretrained model was moved to the bucket successfully. Does anyone know how to fix this?
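One way to narrow this down is to confirm that the object paths passed to the script match what is actually in the bucket. The following is only a minimal sketch, not part of the tutorial: the helper names (`expected_uris`, `parse_listing`, `missing_uris`, `list_bucket`) are hypothetical, and it assumes `gsutil` is on the PATH and already authenticated, as in the Colab session above. Only `spiece.model` and `xlnet_config.json` are checked, because `xlnet_model.ckpt` is a checkpoint prefix rather than a single file.

```python
import subprocess

# Directory and file names taken from the flags in the post.
MODEL_DIR = "xlnet_cased_L-24_H-1024_A-16"
REQUIRED_FILES = ["spiece.model", "xlnet_config.json"]


def expected_uris(bucket, model_dir=MODEL_DIR, files=REQUIRED_FILES):
    """Build the gs:// URIs that the run_coqa.py flags point at."""
    base = "gs://{}/{}".format(bucket, model_dir)
    return ["{}/{}".format(base, name) for name in files]


def parse_listing(listing_text):
    """Extract object URIs from `gsutil ls -r` output.

    Lines ending in ':' are directory headers and are skipped,
    as are blank lines.
    """
    uris = set()
    for line in listing_text.splitlines():
        line = line.strip()
        if line.startswith("gs://") and not line.endswith(":"):
            uris.add(line)
    return uris


def missing_uris(expected, listing_text):
    """Return the expected URIs that do not appear in the listing."""
    present = parse_listing(listing_text)
    return [uri for uri in expected if uri not in present]


def list_bucket(bucket):
    """Run `gsutil ls -r gs://<bucket>` and return its stdout.

    Assumption: gsutil is installed and authenticated in this environment.
    """
    result = subprocess.run(
        ["gsutil", "ls", "-r", "gs://{}".format(bucket)],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout
```

In the notebook this could be used as `missing_uris(expected_uris('my_xlnet'), list_bucket('my_xlnet'))`. An empty list would mean both objects exist at the expected paths. In that case the path itself is probably not the problem, and it may be worth checking whether the code that loads `spiece.model` can read `gs://` URLs at all; that is an assumption to verify, not something the error message confirms.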