[Python][Spark] rows to DataFrame
승가비
2021. 9. 11. 08:57
from pyspark.sql import SparkSession, Row

spark = SparkSession.builder.getOrCreate()

# Build a list of Row objects; createDataFrame infers column
# names and types from the Row fields.
data = [Row(id='1', probability=0.0, thresh=10, prob_opt=0.45),
        Row(id='2', probability=0.4444444444444444, thresh=60, prob_opt=0.45),
        Row(id='3', probability=0.0, thresh=10, prob_opt=0.45),
        Row(id='80000000808', probability=0.0, thresh=100, prob_opt=0.45)]

df = spark.createDataFrame(data)
df.show()
# +-----------+------------------+------+--------+
# | id| probability|thresh|prob_opt|
# +-----------+------------------+------+--------+
# | 1| 0.0| 10| 0.45|
# | 2|0.4444444444444444| 60| 0.45|
# | 3| 0.0| 10| 0.45|
# |80000000808| 0.0| 100| 0.45|
# +-----------+------------------+------+--------+
https://stackoverflow.com/questions/57559783/converting-a-list-of-rows-to-a-pyspark-dataframe