Study

[Python][Spark] Rows to DataFrame

승가비 2021. 9. 11. 08:57
from pyspark.sql import SparkSession, Row

spark = SparkSession.builder.getOrCreate()

data = [Row(id='1', probability=0.0, thresh=10, prob_opt=0.45),
        Row(id='2', probability=0.4444444444444444, thresh=60, prob_opt=0.45),
        Row(id='3', probability=0.0, thresh=10, prob_opt=0.45),
        Row(id='80000000808', probability=0.0, thresh=100, prob_opt=0.45)]

df = spark.createDataFrame(data)
df.show()
#  +-----------+------------------+------+--------+
#  |         id|       probability|thresh|prob_opt|
#  +-----------+------------------+------+--------+
#  |          1|               0.0|    10|    0.45|
#  |          2|0.4444444444444444|    60|    0.45|
#  |          3|               0.0|    10|    0.45|
#  |80000000808|               0.0|   100|    0.45|
#  +-----------+------------------+------+--------+

https://stackoverflow.com/questions/57559783/converting-a-list-of-rows-to-a-pyspark-dataframe

