https://dailyheumsi.tistory.com/265
Feast - Quick Review
Machine learning code for structured data usually has a part that loads data, then extracts and processes the features it needs. The data is typically loaded from a data warehouse or from the raw data source, and so..
dailyheumsi.tistory.com

https://docs.feast.dev/
Introduction - Feast
Decouple ML from data infrastructure by providing a single data access layer that abstracts feature storage from feature retrieval, ensuring models remain portable a..

Generation / Usage / Description
First (s3) / s3:// / Also called classic. A filesystem for reading from or storing objects in Amazon S3. It has been deprecated; the second- or third-generation library is recommended instead.
Second (s3n) / s3n:// / Uses native S3 objects, which makes them easy to use with Hadoop and other file systems. This is also not the recommended option.
Third (s3a) / s..
"--conf", "spark.yarn.appMasterEnv.JAVA_HOME=/usr/lib/jvm/java-11-amazon-corretto.x86_64", https://brocess.tistory.com/176 [ Spark ] 스파크 jdk버전 바꿔서 실행하기 현상황 : Cloudera(클라우데라) 버전(CDH 5.5.1, Parcel), Spark버전(1.5) - jdk version 1.7필요상황 : 기존 작업을 Spark1.5(jdk1.7) - jdk 1.8로 돌리기준비상황 : 클러스터의 각 노드들에 jdk1.8이 설치되어 있어야 함. brocess.tistory.com

aws s3 ls s3://bucket --recursive | grep -v -E "(Bucket: |Prefix: |LastWriteTime|^$|--)" | awk 'BEGIN {total=0}{total+=$3}END{print total/1024/1024/1024" GB"}'

https://gist.github.com/stefhen/06e9e87cb28eb46b9e34
aws cli s3 du
gist.github.com
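
The one-liner above sums the third column (object size in bytes) of the recursive listing and divides by 1024 three times. The same arithmetic can be sketched in Python as follows; the function name `total_size_gib` and the sample listing lines are made up for illustration, and the parser assumes the usual `date time size key` layout of `aws s3 ls --recursive` output.

```python
from typing import Iterable


def total_size_gib(listing_lines: Iterable[str]) -> float:
    """Sum the size column of `aws s3 ls --recursive` output, in GiB."""
    total_bytes = 0
    for line in listing_lines:
        # Expected layout: date, time, size in bytes, key (key may contain spaces).
        fields = line.split(None, 3)
        if len(fields) < 4 or not fields[2].isdigit():
            continue  # skip blank lines, "PRE" prefix rows, and summary rows
        total_bytes += int(fields[2])
    return total_bytes / 1024 ** 3


if __name__ == "__main__":
    sample = [
        "2021-01-01 00:00:00 1073741824 data/part-0000.parquet",
        "2021-01-02 00:00:00  536870912 data/part 0001.parquet",
    ]
    print(f"{total_size_gib(sample):.2f} GiB")  # prints "1.50 GiB"
```

In practice the lines would come from piping the CLI output into the script. The AWS CLI can also report totals itself with `aws s3 ls s3://bucket --recursive --summarize --human-readable`.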

https://www.gorillastack.com/blog/real-time-events/cloudtrail-vs-cloudwatch/
CloudTrail vs CloudWatch - A Detailed Guide - GorillaStack
Learn the difference between AWS CloudTrail and CloudWatch and when to use them. Details on CloudTrail vs CloudWatch Events, Alarms, Metrics & Logs.
www.gorillastack.com

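https://weejw.tistory.com/200
Spark Streaming, Structured Streaming
http://weejw.tistory.com/35 Back in early July of last year, no less, I wrote a post saying "let's study Spark Streaming" .... and now I am finally studying streaming! ٩(ˊᗜˋ*)و Spark Streaming: streaming data, like weather or log data, is data that keeps being..
weejw.tistory.com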