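When many EmrStepSensor tasks poke the EMR DescribeStep API on the same schedule, the calls get throttled and the tasks fail (see the Airflow issue linked below). The subclass below mitigates this in two ways: it adds random jitter to the poke interval so the sensors do not poll in lockstep, and it calls DescribeStep through a boto3 client configured with the "standard" retry mode.
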
import random

import boto3
from botocore.config import Config
from airflow.providers.amazon.aws.sensors.emr_step import EmrStepSensor

# Base poke interval (seconds) plus up to 59 seconds of random jitter.
_POKE_INTERVAL = 60
_RANDOM = {"START": 1, "END": 60, "INTERVAL": 1}
# Maximum attempts for boto3's "standard" retry mode.
_MAX_ATTEMPTS = 15


class DefaultEmrStepSensor(EmrStepSensor):
    def __init__(
            self,
            poke_interval=None,
            **kwargs,
    ):
        # Randomize the poke interval so sensors started at the same time
        # do not all hit the EMR API at the same moment.
        poke_interval = poke_interval or _POKE_INTERVAL + random.randrange(
            _RANDOM["START"], _RANDOM["END"], _RANDOM["INTERVAL"]
        )
        super().__init__(poke_interval=poke_interval, **kwargs)

    def get_emr_response(self):
        # Build a client with the "standard" retry mode so throttled
        # DescribeStep calls are retried with backoff instead of
        # failing the task immediately.
        config = Config(retries={"max_attempts": _MAX_ATTEMPTS, "mode": "standard"})
        emr_client = boto3.client("emr", config=config)

        self.log.info("Poking step %s on cluster %s", self.step_id, self.job_flow_id)

        return emr_client.describe_step(
            ClusterId=self.job_flow_id,
            StepId=self.step_id,
        )
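
A minimal usage sketch, assuming hypothetical upstream tasks named create_job_flow and add_steps that push the cluster ID and step IDs to XCom, as in the stock EmrStepSensor examples:

from datetime import datetime

from airflow import DAG

with DAG(
    dag_id="emr_step_sensor_example",  # hypothetical DAG for illustration
    start_date=datetime(2023, 1, 1),
    schedule_interval=None,
) as dag:
    watch_step = DefaultEmrStepSensor(
        task_id="watch_step",
        # job_flow_id and step_id are templated fields, pulled here from
        # the assumed upstream tasks via XCom.
        job_flow_id="{{ task_instance.xcom_pull(task_ids='create_job_flow', key='return_value') }}",
        step_id="{{ task_instance.xcom_pull(task_ids='add_steps', key='return_value')[0] }}",
    )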

https://github.com/apache/airflow/issues/23475
AWS rate limiting causes tasks to fail · Issue #23475 · apache/airflow

https://github.com/apache/airflow/pull/23906/files
docs: amazon-provider retry modes by Taragolis · Pull Request #23906 · apache/airflow
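
The docs added in this PR also cover setting retry options once on the Airflow AWS connection instead of per client. As far as I know, the amazon provider forwards a config_kwargs key from the connection's Extra field to botocore.config.Config; a sketch of that Extra value, under that assumption:

import json

# Assumption: the provider passes "config_kwargs" through to
# botocore.config.Config for every hook created from this connection.
extra = {"config_kwargs": {"retries": {"mode": "standard", "max_attempts": 15}}}
print(json.dumps(extra))  # paste the output into the aws_default connection's Extra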

https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/emr.html#EMR.Client.describe_step
EMR — Boto3 Docs 1.26.45 documentation
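
For context, the only part of the describe_step response the sensor inspects is the step state: COMPLETED counts as success, while CANCELLED, FAILED, and INTERRUPTED count as failure. A sketch with hypothetical IDs:

import boto3

emr = boto3.client("emr")
# "j-XXXXXXXX" and "s-XXXXXXXX" are hypothetical placeholder IDs.
response = emr.describe_step(ClusterId="j-XXXXXXXX", StepId="s-XXXXXXXX")
state = response["Step"]["Status"]["State"]
# One of: PENDING, CANCEL_PENDING, RUNNING, COMPLETED, CANCELLED, FAILED, INTERRUPTED
print(state)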

https://boto3.amazonaws.com/v1/documentation/api/latest/guide/retries.html#retries
Retries — Boto3 Docs 1.26.44 documentation
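
Per this guide, the same retry configuration can also be supplied without code changes, through environment variables (or the shared AWS config file):

import os

import boto3

# Equivalent to Config(retries={"mode": "standard", "max_attempts": 15}),
# but read from the environment as described in the retries guide.
os.environ["AWS_RETRY_MODE"] = "standard"
os.environ["AWS_MAX_ATTEMPTS"] = "15"

emr = boto3.client("emr")  # picks the retry settings up automatically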

https://aws.amazon.com/ko/premiumsupport/knowledge-center/ssm-parameter-store-rate-exceeded/
Resolving "ThrottlingException" or "Rate exceeded" errors when using AWS Systems Manager Parameter Store

https://aws.amazon.com/ko/premiumsupport/knowledge-center/cloudwatch-logs-throttling/
Determining throttling in CloudWatch Logs

https://github.com/cloud-custodian/cloud-custodian/issues/1812
ClientError: An error occurred (ThrottlingException) when calling the DescribeLogStreams operation (reached max retries: 4) · Issue #1812 · cloud-custodian/cloud-custodian
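
As in this issue, once client-side retries run out the throttling error surfaces as a botocore ClientError with the error code "ThrottlingException", which can be inspected like so (hypothetical IDs again):

import boto3
from botocore.exceptions import ClientError

emr = boto3.client("emr")
try:
    emr.describe_step(ClusterId="j-XXXXXXXX", StepId="s-XXXXXXXX")
except ClientError as error:
    if error.response["Error"]["Code"] != "ThrottlingException":
        raise
    # Throttled even after client-side retries; back off longer before retrying.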

https://docs.aws.amazon.com/step-functions/latest/dg/limits-overview.html
Quotas - AWS Step Functions

https://giaosudau.medium.com/how-to-submit-spark-jobs-to-emr-easy-and-reliable-with-airflow-decf15e2584c
Submit Spark jobs to EMR cluster from Airflow

 
