Extract a JSON value directly in PySpark
Suppose a Spark DataFrame has a column named json_str_col that contains JSON strings of the form {"key1": "some value"}.
We can extract the value of key1 directly into a new column with the following:
from pyspark.sql import functions as F

df.withColumn('value1', F.get_json_object(F.col('json_str_col'), '$.key1'))