
[Java][Spark] rows to dataframe

승가비 2021. 9. 11. 08:57
from pyspark.sql import SparkSession, Row

spark = SparkSession.builder.getOrCreate()

# A list of Row objects; Spark infers the schema from the field names and values.
data = [Row(id='1', probability=0.0, thresh=10, prob_opt=0.45),
        Row(id='2', probability=0.4444444444444444, thresh=60, prob_opt=0.45),
        Row(id='3', probability=0.0, thresh=10, prob_opt=0.45),
        Row(id='80000000808', probability=0.0, thresh=100, prob_opt=0.45)]

# createDataFrame accepts the list of Rows directly.
df = spark.createDataFrame(data)
df.show()
#  +-----------+------------------+------+--------+
#  |         id|       probability|thresh|prob_opt|
#  +-----------+------------------+------+--------+
#  |          1|               0.0|    10|    0.45|
#  |          2|0.4444444444444444|    60|    0.45|
#  |          3|               0.0|    10|    0.45|
#  |80000000808|               0.0|   100|    0.45|
#  +-----------+------------------+------+--------+

https://stackoverflow.com/questions/57559783/converting-a-list-of-rows-to-a-pyspark-dataframe


