pyspark.sql.functions.extract

pyspark.sql.functions.extract(field, source)

Extracts a part of the date/timestamp or interval source.

New in version 3.5.0.

Parameters
field : Column

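selects which part of the source should be extracted, given as a string literal such as sf.lit('YEAR'). Field names are case-insensitive, and abbreviations such as 'D' (day), 'M' (minute), and 'S' (second) are accepted.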

source : Column or column name

a date/timestamp or interval column from which field is extracted.

Returns
Column

the extracted part of the date/timestamp or interval source.

Examples

>>> import datetime
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([(datetime.datetime(2015, 4, 8, 13, 8, 15),)], ['ts'])
>>> df.select(
...     '*',
...     sf.extract(sf.lit('YEAR'), 'ts').alias('year'),
...     sf.extract(sf.lit('month'), 'ts').alias('month'),
...     sf.extract(sf.lit('WEEK'), 'ts').alias('week'),
...     sf.extract(sf.lit('D'), df.ts).alias('day'),
...     sf.extract(sf.lit('M'), df.ts).alias('minute'),
...     sf.extract(sf.lit('S'), df.ts).alias('second')
... ).show()
+-------------------+----+-----+----+---+------+---------+
|                 ts|year|month|week|day|minute|   second|
+-------------------+----+-----+----+---+------+---------+
|2015-04-08 13:08:15|2015|    4|  15|  8|     8|15.000000|
+-------------------+----+-----+----+---+------+---------+
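
As a cross-check on the values above, the same parts can be computed with the Python standard library alone. This is a rough sketch rather than a description of how Spark computes them: it assumes that the WEEK field follows ISO 8601 week numbering, and the `parts` dictionary is a name introduced only for illustration. Note that 'M' in the Spark example resolves to the minute (8), not the month (4).

```python
import datetime

# Same timestamp as in the Spark example above.
ts = datetime.datetime(2015, 4, 8, 13, 8, 15)

# Standard-library counterparts of the extracted fields.
# Assumption: Spark's WEEK field follows ISO 8601 week numbering,
# which is what isocalendar() returns in its second position.
parts = {
    "year": ts.year,
    "month": ts.month,
    "week": ts.isocalendar()[1],
    "day": ts.day,
    "minute": ts.minute,
    "second": ts.second,
}

print(parts)
```

The integers match the columns in the table above. The one difference is the seconds value, which Spark shows with a fractional part (15.000000) while `datetime.second` is a plain integer.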