[aws] emr spark config

승가비 2022. 8. 19. 11:28


[
  {
    "Classification": "spark-env",
    "Configurations": [
      {
        "Classification": "export",
        "Properties": {
          "JAVA_HOME": "/usr/lib/jvm/java-11-amazon-corretto.x86_64"
        }
      }
    ]
  },
  {
    "Classification": "spark-defaults",
    "Properties": {
      "spark.executorEnv.JAVA_HOME": "/usr/lib/jvm/java-11-amazon-corretto.x86_64",
      "spark.sql.broadcastTimeout": "3600",
      "spark.yarn.executor.memoryOverheadFactor": "0.1",
      "spark.default.parallelism": "200",
      "spark.yarn.am.memory": "2g",
      "spark.executor.extraJavaOptions": "-XX:+IgnoreUnrecognizedVMOptions"
    }
  },
  {
    "Classification": "hive-site",
    "Properties": {
      "hive.metastore.client.factory.class": "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory"
    }
  },
  {
    "Classification": "spark-hive-site",
    "Properties": {
      "hive.metastore.client.factory.class": "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory"
    }
  },
  {
    "Classification": "yarn-site",
    "Properties": {
      "yarn.nodemanager.pmem-check-enabled": "false",
      "yarn.nodemanager.vmem-check-enabled": "false",
      "yarn.nodemanager.vmem-pmem-ratio": "5",
      "yarn.nodemanager.resource.cpu-vcores": "4",
      "yarn.nodemanager.resource.memory-mb": "253952",
      "yarn.scheduler.maximum-allocation-vcores": "128",
      "yarn.scheduler.maximum-allocation-mb": "253952"
    }
  },
  {
    "Classification": "spark",
    "Properties": {
      "maximizeResourceAllocation": "true"
    }
  }
]
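A minimal sketch of how the classifications above could be flattened and sanity-checked before they are handed to a cluster. The helper names (`flatten`, `check`) and the abbreviated `CONFIG` literal are hypothetical and only mirror the structure of the JSON. The three checks are reasonable expectations (matching JAVA_HOME on driver and executors, the same metastore factory for Hive and Spark, a scheduler maximum that fits on one node), not rules taken from the EMR documentation.

```python
# Abbreviated copy of the configuration above: only the keys the checks read.
JAVA11 = "/usr/lib/jvm/java-11-amazon-corretto.x86_64"
GLUE_FACTORY = (
    "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory"
)

CONFIG = [
    {
        "Classification": "spark-env",
        "Configurations": [
            {"Classification": "export", "Properties": {"JAVA_HOME": JAVA11}}
        ],
    },
    {
        "Classification": "spark-defaults",
        "Properties": {"spark.executorEnv.JAVA_HOME": JAVA11},
    },
    {
        "Classification": "hive-site",
        "Properties": {"hive.metastore.client.factory.class": GLUE_FACTORY},
    },
    {
        "Classification": "spark-hive-site",
        "Properties": {"hive.metastore.client.factory.class": GLUE_FACTORY},
    },
    {
        "Classification": "yarn-site",
        "Properties": {
            "yarn.nodemanager.resource.memory-mb": "253952",
            "yarn.scheduler.maximum-allocation-mb": "253952",
        },
    },
]


def flatten(configs, prefix=""):
    """Map 'classification/path' to its Properties, following nested Configurations."""
    out = {}
    for entry in configs:
        name = prefix + entry["Classification"]
        out[name] = dict(entry.get("Properties", {}))
        out.update(flatten(entry.get("Configurations", []), name + "/"))
    return out


def check(configs):
    """Return a list of human-readable problems (empty when none are found)."""
    flat = flatten(configs)
    problems = []

    # Driver-side and executor-side JVMs should point at the same JDK.
    driver_java = flat.get("spark-env/export", {}).get("JAVA_HOME")
    executor_java = flat.get("spark-defaults", {}).get("spark.executorEnv.JAVA_HOME")
    if driver_java != executor_java:
        problems.append("JAVA_HOME differs between spark-env and spark.executorEnv")

    # Hive and Spark should agree on the metastore client factory.
    key = "hive.metastore.client.factory.class"
    if flat.get("hive-site", {}).get(key) != flat.get("spark-hive-site", {}).get(key):
        problems.append("hive-site and spark-hive-site use different metastore factories")

    # A container larger than a node's memory can never be placed on that node.
    yarn = flat.get("yarn-site", {})
    node_mb = int(yarn.get("yarn.nodemanager.resource.memory-mb", 0))
    max_mb = int(yarn.get("yarn.scheduler.maximum-allocation-mb", 0))
    if max_mb > node_mb:
        problems.append("scheduler maximum-allocation-mb exceeds nodemanager memory")

    return problems


if __name__ == "__main__":
    for problem in check(CONFIG):
        print("WARN:", problem)
```

Keeping the checks in a plain function makes it easy to run them in CI against the JSON file before the cluster is created.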

https://www.facebook.com/groups/sparkkoreauser/posts/1323386911056542/

https://docs.aws.amazon.com/ko_kr/emr/latest/ReleaseGuide/emr-spark-configure.html

Configure Spark - Amazon EMR

You should not use the maximizeResourceAllocation option on clusters with other distributed applications such as HBase. Amazon EMR uses custom YARN configurations for distributed applications, and ...

docs.aws.amazon.com
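The AWS documentation linked above warns against combining maximizeResourceAllocation with other distributed applications such as HBase. A minimal sketch of a guard for that case follows. The function name `conflicts_with_maximize` and the `applications` argument (the list of application names requested for the cluster) are hypothetical, and the set of conflicting applications contains only HBase because that is the only one the excerpt names.

```python
def conflicts_with_maximize(configs, applications):
    """Return the requested applications that clash with maximizeResourceAllocation.

    configs: list of classification dicts, shaped like the JSON in this post.
    applications: application names requested for the cluster (assumed input).
    """
    enabled = any(
        entry.get("Classification") == "spark"
        and str(entry.get("Properties", {}).get("maximizeResourceAllocation", "false")).lower()
        == "true"
        for entry in configs
    )
    if not enabled:
        return []

    # Assumption: only HBase is listed, since it is the only example the docs excerpt gives.
    other_distributed = {"hbase"}
    return [app for app in applications if app.lower() in other_distributed]


if __name__ == "__main__":
    spark_only = [
        {"Classification": "spark", "Properties": {"maximizeResourceAllocation": "true"}}
    ]
    for app in conflicts_with_maximize(spark_only, ["Spark", "HBase"]):
        print("WARN: maximizeResourceAllocation should not be used together with", app)
```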

https://stackoverflow.com/questions/54596569/tables-not-found-in-spark-sql-after-migrating-from-emr-to-aws-glue

Tables not found in Spark SQL after migrating from EMR to AWS Glue

I have Spark jobs on EMR, and EMR is configured to use the Glue catalog for Hive and Spark metadata. I create Hive external tables, and they appear in the Glue catalog, and my Spark jobs can ref...

stackoverflow.com

https://stackoverflow.com/questions/38988941/running-yarn-with-spark-not-working-with-java-8/39456967#39456967

Running yarn with spark not working with Java 8

I have cluster with 1 master and 6 slaves which uses pre-built version of hadoop 2.6.0 and spark 1.6.2. I was running hadoop MR and spark jobs without any problem with openjdk 7 installed on all the

stackoverflow.com


License: Attribution-NonCommercial
