I'm running Spark inside Docker for some processing. We have a Kafka container, a Spark master container, two Spark worker containers, and a Python container that orchestrates the whole flow. We usually bring everything up with docker-compose:

version: '3.4'

volumes:
  zookeeper-persistence:
  kafka-store:
  spark-store:

services:
  zookeeper-server:
    image: 'bitnami/zookeeper:3.6.1'
    expose:
      - '2181'
    environment:
      ...
    volumes:
      - zookeeper-persistence:/bitnami/zookeeper

  kafka-server:
    image: 'bitnami/kafka:2.6.0'
    expose:
      - '29092'
      - '9092'
    environment:
      ...
    volumes:
      - kafka-store:/bitnami/kafka
    depends_on:
      - zookeeper-server

  spark-master:
    image: bitnami/spark:3.0.1
    environment:
      SPARK_MODE: 'master'
      SPARK_MASTER_HOST: 'spark-master'
    ports:
      - '8080:8080'
    expose:
      - '7077'
    depends_on:
      - kafka-server

  spark-worker1:
    image: bitnami/spark:3.0.1
    environment:
      SPARK_MODE: 'worker'
      SPARK_WORKER_MEMORY: '4G'
      SPARK_WORKER_CORES: '2'
    depends_on:
      - spark-master

  spark-worker2:
    # same as spark-worker1

  compute:
    build: ./app
    image: compute
    environment:
      KAFKA_HOST: kafka-server:29092
      COMPUTE_TOPIC: DataFrames
      PYSPARK_SUBMIT_ARGS: "--master spark://spark-master:7077 --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.0.0 pyspark-shell"
    depends_on:
      - spark-master
      - kafka-server
    volumes:
      - spark-store:/app/checkpoints

Data is sent in by another Python application, and the compute container reacts to the changes. We create a ComputeDeployment and call its start function to launch the Spark job.
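The ComputeDeployment snippet itself is not shown above, but given the compose file, a job of this shape generally boils down to a Kafka-sourced structured-streaming query that checkpoints into the spark-store volume. The following is a minimal, hypothetical sketch under that assumption; the console sink and the /app/checkpoints/compute path are illustrative placeholders, not the poster's actual code:

import os
from pyspark.sql import SparkSession

# KAFKA_HOST and COMPUTE_TOPIC are injected by the compose file above;
# PYSPARK_SUBMIT_ARGS points the session at spark://spark-master:7077
# and pulls in the spark-sql-kafka connector package.
kafka_host = os.environ["KAFKA_HOST"]    # kafka-server:29092
topic = os.environ["COMPUTE_TOPIC"]      # DataFrames

spark = SparkSession.builder.appName("compute").getOrCreate()

# Subscribe to the topic as a streaming DataFrame.
frames = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", kafka_host)
          .option("subscribe", topic)
          .load())

# Process the records and write them out, checkpointing into the mounted
# volume (the directory the answer below targets with its permission fix).
query = (frames.selectExpr("CAST(value AS STRING) AS value")
         .writeStream
         .format("console")  # placeholder sink for the sketch
         .option("checkpointLocation", "/app/checkpoints/compute")
         .start())

query.awaitTermination()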
1 Answer

牛魔王的故事
This turned out to be permission-related: the user inside the Spark container could not write to that directory. The following entrypoint script fixed it:
#!/bin/bash
# Make the checkpoint directory (the spark-store volume mounted at
# /app/checkpoints) readable, writable, and traversable by all users.
chmod -R a+rwX /app/checkpoints/
# Start the app with unbuffered output so logs appear immediately.
python -u run.py
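A note on why this works: the Bitnami Spark images run as a non-root user by default, so a freshly created named volume mounted at /app/checkpoints is typically owned by root and not writable by that user; chmod -R a+rwX grants read and write to everyone and sets the search/execute bit only on directories (and files that are already executable). For the script to run at container start, it also has to be wired in as the image's entrypoint, e.g. an ENTRYPOINT line in ./app's Dockerfile (an assumption about that build, since the Dockerfile is not shown), and the entrypoint itself must run with enough privileges to chmod the mount.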