Generation | Usage | Description |
---|---|---|
First – s3 | s3:// | Also called "classic": the `s3:` filesystem for reading and storing objects in Amazon S3. It has been deprecated; the second- or third-generation library is recommended instead. |
Second – s3n | s3n:// | Uses native S3 objects, which makes them easy to use with Hadoop and other file systems. This is also not the recommended option. |
Third – s3a | s3a:// | A replacement for s3n that supports larger files and improves performance. |
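Since the table points to s3a as the option to use, paths written with the older schemes may need rewriting before they are handed to Spark. Below is a minimal sketch of that step, assuming a stock Apache Hadoop/Spark setup where the s3a connector is available. The helper name `to_s3a` and the bucket names are hypothetical, not part of any library.

```python
# Legacy Hadoop S3 schemes (first and second generation in the table above).
LEGACY_SCHEMES = {"s3", "s3n"}


def to_s3a(uri: str) -> str:
    """Rewrite a legacy s3:// or s3n:// URI to use the s3a:// scheme.

    Hypothetical helper for illustration: URIs with any other scheme,
    or with no scheme at all, are returned unchanged.
    """
    scheme, sep, rest = uri.partition("://")
    if sep and scheme.lower() in LEGACY_SCHEMES:
        return "s3a://" + rest
    return uri


if __name__ == "__main__":
    # Example bucket and key are made up for this sketch.
    for path in ("s3n://my-bucket/logs/app.txt", "s3a://my-bucket/logs/app.txt"):
        print(path, "->", to_s3a(path))
```

The rewritten path could then be passed to a reader such as `spark.read.text(to_s3a(path))`. The helper only changes the scheme prefix; credentials and connector jars still have to be configured separately.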
https://sparkbyexamples.com/spark/spark-read-text-file-from-s3/
Spark Read Text File from AWS S3 bucket
This Spark tutorial covers the sparkContext.textFile() and sparkContext.wholeTextFiles() methods for reading a text file from Amazon AWS S3 into an RDD, as well as spark.read.text()
sparkbyexamples.com