[aws] emr spark config

승가비 2022. 8. 19. 11:28
[
  {
    "Classification": "spark-env",
    "Configurations": [
      {
        "Classification": "export",
        "Properties": {
          "JAVA_HOME": "/usr/lib/jvm/java-11-amazon-corretto.x86_64"
        }
      }
    ]
  },
  {
    "Classification": "spark-defaults",
    "Properties": {
      "spark.executorEnv.JAVA_HOME": "/usr/lib/jvm/java-11-amazon-corretto.x86_64",
      "spark.sql.broadcastTimeout": "3600",
      "spark.yarn.executor.memoryOverheadFactor": "0.1",
      "spark.default.parallelism": "200",
      "spark.yarn.am.memory": "2g",
      "spark.executor.extraJavaOptions": "-XX:+IgnoreUnrecognizedVMOptions"
    }
  },
  {
    "Classification": "hive-site",
    "Properties": {
      "hive.metastore.client.factory.class": "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory"
    }
  },
  {
    "Classification": "spark-hive-site",
    "Properties": {
      "hive.metastore.client.factory.class": "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory"
    }
  },
  {
    "Classification": "yarn-site",
    "Properties": {
      "yarn.nodemanager.pmem-check-enabled": "false",
      "yarn.nodemanager.vmem-check-enabled": "false",
      "yarn.nodemanager.vmem-pmem-ratio": "5",
      "yarn.nodemanager.resource.cpu-vcores": "4",
      "yarn.nodemanager.resource.memory-mb": "253952",
      "yarn.scheduler.maximum-allocation-vcores": "128",
      "yarn.scheduler.maximum-allocation-mb": "253952"
    }
  },
  {
    "Classification": "spark",
    "Properties": {
      "maximizeResourceAllocation": "true"
    }
  }
]
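The list above nests one classification inside another (spark-env contains export), which makes it easy to misread which file a property ends up in. As a minimal sketch, and not part of the original post, the helper below flattens such a list into a path-to-properties mapping for inspection. The function name flatten_configurations and the shortened SAMPLE copy are illustrative assumptions.

```python
import json


def flatten_configurations(configs, prefix=""):
    """Return {classification_path: properties} for an EMR configuration list."""
    flat = {}
    for entry in configs:
        path = prefix + entry["Classification"]
        # "Properties" may be absent when an entry only carries nested Configurations.
        if entry.get("Properties"):
            flat[path] = dict(entry["Properties"])
        # Nested classifications (e.g. spark-env -> export) are walked recursively.
        nested = entry.get("Configurations", [])
        flat.update(flatten_configurations(nested, path + "/"))
    return flat


# A shortened copy of the configuration above, used only for illustration.
SAMPLE = json.loads("""
[
  {"Classification": "spark-env",
   "Configurations": [
     {"Classification": "export",
      "Properties": {"JAVA_HOME": "/usr/lib/jvm/java-11-amazon-corretto.x86_64"}}]},
  {"Classification": "yarn-site",
   "Properties": {
     "yarn.nodemanager.resource.memory-mb": "253952",
     "yarn.scheduler.maximum-allocation-mb": "253952"}},
  {"Classification": "spark",
   "Properties": {"maximizeResourceAllocation": "true"}}
]
""")

flat = flatten_configurations(SAMPLE)
for path, props in sorted(flat.items()):
    for key, value in sorted(props.items()):
        print(f"{path}: {key} = {value}")
```

The full list itself is what gets handed to EMR at cluster creation, for example as the Configurations argument of boto3's run_job_flow or through aws emr create-cluster --configurations file://... on the command line.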

https://www.facebook.com/groups/sparkkoreauser/posts/1323386911056542/

 


https://docs.aws.amazon.com/ko_kr/emr/latest/ReleaseGuide/emr-spark-configure.html

 

Configure Spark - Amazon EMR

Do not use the maximizeResourceAllocation option on clusters that run other distributed applications such as HBase. Amazon EMR uses custom YARN configurations for distributed applications, and this ...

docs.aws.amazon.com
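The note in the EMR documentation excerpt can be turned into a small pre-flight check. This is only a sketch under stated assumptions: the function name maximize_allocation_conflicts is hypothetical, and the set of "other distributed applications" contains HBase alone because that is the only example the excerpt names.

```python
def maximize_allocation_conflicts(configurations, applications):
    """Return installed applications that the EMR note warns about.

    An empty list means either maximizeResourceAllocation is not enabled
    or none of the listed applications is in the watch list.
    """
    enabled = any(
        entry.get("Classification") == "spark"
        and str(entry.get("Properties", {}).get("maximizeResourceAllocation", "")).lower() == "true"
        for entry in configurations
    )
    if not enabled:
        return []
    # Assumption: only HBase is listed, since it is the one example the note gives.
    other_distributed_apps = {"hbase"}
    return [app for app in applications if app.lower() in other_distributed_apps]


configs = [
    {"Classification": "spark", "Properties": {"maximizeResourceAllocation": "true"}}
]
print(maximize_allocation_conflicts(configs, ["Spark", "Hive", "HBase"]))  # ['HBase']
print(maximize_allocation_conflicts(configs, ["Spark", "Hive"]))  # []
```

The check only looks at top-level classifications, which is where the spark classification sits in the configuration shown in this post.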

https://stackoverflow.com/questions/54596569/tables-not-found-in-spark-sql-after-migrating-from-emr-to-aws-glue

 

Tables not found in Spark SQL after migrating from EMR to AWS Glue

I have Spark jobs on EMR, and EMR is configured to use the Glue catalog for Hive and Spark metadata. I create Hive external tables, and they appear in the Glue catalog, and my Spark jobs can ref...

stackoverflow.com

https://stackoverflow.com/questions/38988941/running-yarn-with-spark-not-working-with-java-8/39456967#39456967

 

Running yarn with spark not working with Java 8

I have cluster with 1 master and 6 slaves which uses pre-built version of hadoop 2.6.0 and spark 1.6.2. I was running hadoop MR and spark jobs without any problem with openjdk 7 installed on all the

stackoverflow.com

 
