PySpark minimum of a list
How do I find the minimum of a list stored in a cell? I can do it with a udf, but that feels like overkill. The min function in pyspark.sql.functions only works on groups (that is, on the result of a groupBy).
min_ = udf(lambda inarr: min(inarr), IntegerType())
mydataframewithmin = mydataframe.withColumn('min_value', min_(F.col('position_list')))
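If you have imported the names from pyspark.sql.functions into your namespace (for example with a wildcard import), Python's built-in min is shadowed. You can still reach it through the __builtins__ prefix, for example: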
min_ = udf(lambda inarr: __builtins__.min(inarr), IntegerType())
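The shadowing problem can be reproduced without Spark. The sketch below is only an illustration: the module-level `min` is a hypothetical stand-in for a name pulled in by a wildcard import, and `list_min` is a made-up helper name. It uses the standard `builtins` module rather than `__builtins__`, since `__builtins__` is a CPython detail that can be a plain dict inside imported modules, where attribute access would fail.

```python
import builtins


def min(*cols):
    # Hypothetical stand-in for a shadowing name such as the one a
    # wildcard import from pyspark.sql.functions would bring in.
    raise TypeError("this min expects columns, not a Python list")


def list_min(values):
    """Return the smallest element of a list, or None for None/empty input."""
    if not values:
        return None
    # builtins.min always refers to Python's own min, whatever the
    # local name `min` currently points to.
    return builtins.min(values)
```

Assuming `udf` and `IntegerType` are imported as in the snippets above, `udf(list_min, IntegerType())` could then be applied to the list column in the same way as `min_`; returning None for an empty or null list is a choice made here so the function does not raise on such rows.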