[java][spark] List<Data> to Dataset<Row>

승가비 2022. 9. 11. 19:31
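import org.apache.spark.sql.Dataset
import org.apache.spark.sql.Row

// Converts a list of Data beans to a Dataset<Row>; the schema is inferred from the getters of Data.
private fun dataframe(list: List<Data>): Dataset<Row> {
    // createDataFrame already returns a Dataset<Row>, so neither a SQLContext wrapper nor toDF() is needed.
    return SparkUtil.instance().createDataFrame(list, Data::class.java)
}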
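import org.apache.spark.sql.SparkSession

object SparkUtil {
    // Returns the active SparkSession, or builds one with the given application name.
    fun instance(name: String = ""): SparkSession {
        return make(name)
    }

    private fun make(name: String): SparkSession {
        return config(
            SparkSession.builder().appName(name)
        ).enableHiveSupport().orCreate
    }

    // Applies every entry of the "spark" YAML resource to the builder.
    // YamlUtils and Extension are project-local helpers that are not shown in this post.
    private fun config(builder: SparkSession.Builder): SparkSession.Builder {
        val map = YamlUtils.read(this::class.java, "spark", Extension.YAML)

        var b = builder
        for ((k, v) in map) {
            b = when (v) {
                is Long -> b.config(k, v)
                is String -> b.config(k, v)
                is Double -> b.config(k, v)
                is Boolean -> b.config(k, v)
                else -> b // values of any other type are skipped
            }
        }

        return b
    }
}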
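
The post does not show the `Data` class itself. As a minimal sketch, assuming `Data` is a plain bean with two made-up fields (`name` and `salary` are hypothetical, as is the local master setting), the conversion could be tried end to end like this. `createDataFrame(list, Data::class.java)` infers the columns from the bean's getters, which Kotlin generates for each property.

```kotlin
import org.apache.spark.sql.Dataset
import org.apache.spark.sql.Row
import org.apache.spark.sql.SparkSession
import java.io.Serializable

// Hypothetical bean: the field names and types are assumptions for illustration only.
// Kotlin generates getName()/getSalary(), which Spark reads to infer the schema.
class Data(
    var name: String = "",
    var salary: Long = 0L
) : Serializable

fun main() {
    // Assumption: a local session without Hive support, just for this example.
    val spark = SparkSession.builder()
        .appName("list-to-dataset")
        .master("local[*]")
        .getOrCreate()

    val list = listOf(Data("dev", 10000L), Data("karthik", 20000L))

    // One column per bean property; each list element becomes one row.
    val df: Dataset<Row> = spark.createDataFrame(list, Data::class.java)
    df.printSchema()
    df.show()

    spark.stop()
}
```

If the real `Data` class has nested or collection-typed properties, the inferred schema will differ, so it is worth checking the `printSchema()` output before relying on column names or types.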

Reference: Dataframe from List<String> in Java (Stack Overflow)
https://stackoverflow.com/questions/43633696/dataframe-from-liststring-in-java
