[ 
https://issues.apache.org/jira/browse/SPARK-57741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk reassigned SPARK-57741:
--------------------------------

    Assignee: Jubin Soni

> Add timestamp_nanos to PySpark public API
> -----------------------------------------
>
>                 Key: SPARK-57741
>                 URL: https://issues.apache.org/jira/browse/SPARK-57741
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark
>    Affects Versions: 4.3.0, 5.0.0
>            Reporter: Jubin Soni
>            Assignee: Jubin Soni
>            Priority: Minor
>              Labels: pull-request-available
>
> *What is the issue?*
> {{timestamp_nanos(e)}} converts a nanoseconds-since-epoch integer to a 
> {{TIMESTAMP_LTZ(9)}} value. It is the inverse of {{unix_nanos}} and was added 
> to the Scala API in SPARK-57526. PySpark support was explicitly deferred and 
> tracked as a follow-up: the function is listed in {{expected_missing_in_py}} 
> in {{python/pyspark/sql/tests/test_functions.py}}:
> {{expected_missing_in_py = {
>     "timestamp_nanos"
> }  # SPARK-57526: PySpark support tracked as a follow-up}}
> Without this function, PySpark users can convert a nanosecond-precision 
> timestamp to epoch nanoseconds via {{unix_nanos}}, but cannot convert 
> back, leaving the round-trip incomplete in Python.
> ----
> *How to reproduce*
> {{from pyspark.sql import functions as sf
> # unix_nanos exists:
> sf.unix_nanos("ts")           # works
> # timestamp_nanos does not:
> sf.timestamp_nanos(sf.lit(1577885075123456789))  # AttributeError}}
> ----
> *Actual behavior*
> {{AttributeError: module 'pyspark.sql.functions' has no attribute 
> 'timestamp_nanos'}}
> ----
> *Expected behavior*
> {{from pyspark.sql import functions as sf
> df = spark.sql("SELECT BIGINT('1577885075123456789') AS ns")
> df.select(sf.timestamp_nanos("ns")).show(truncate=False)
> # Output shown for a UTC session time zone:
> # +-----------------------------+
> # |timestamp_nanos(ns)          |
> # +-----------------------------+
> # |2020-01-01 13:24:35.123456789|
> # +-----------------------------+}}
> ----
> *Proposed fix*
> Follow the pattern of {{timestamp_micros}}:
>  * Add {{timestamp_nanos(col)}} to {{python/pyspark/sql/functions/builtin.py}}
>  * Add Connect-side wrapper in 
> {{python/pyspark/sql/connect/functions/builtin.py}}
>  * Export from {{python/pyspark/sql/functions/__init__.py}}
>  * Add entry to {{python/docs/source/reference/pyspark.sql/functions.rst}}
>  * Remove {{"timestamp_nanos"}} from {{expected_missing_in_py}} in 
> {{python/pyspark/sql/tests/test_functions.py}}
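
The round-trip described in the issue can be illustrated without Spark. The following is only a plain-Python sketch of the arithmetic, not the PySpark implementation: the helper names {{format_epoch_nanos}} and {{to_epoch_nanos}} are hypothetical stand-ins for {{timestamp_nanos}} and {{unix_nanos}}, and the sketch assumes a UTC time zone, which matches the expected output quoted above.

```python
from datetime import datetime, timedelta, timezone

NANOS_PER_SECOND = 1_000_000_000
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def format_epoch_nanos(ns: int) -> str:
    """Hypothetical stand-in for timestamp_nanos: epoch nanoseconds -> UTC string."""
    # divmod floors, so the fractional part stays in [0, 1e9) for pre-epoch values too.
    seconds, nanos = divmod(ns, NANOS_PER_SECOND)
    whole = EPOCH + timedelta(seconds=seconds)
    # datetime only carries microseconds, so the 9-digit fraction is formatted separately.
    return f"{whole:%Y-%m-%d %H:%M:%S}.{nanos:09d}"


def to_epoch_nanos(text: str) -> int:
    """Hypothetical stand-in for unix_nanos: UTC string -> epoch nanoseconds."""
    whole_text, frac_text = text.split(".")
    whole = datetime.strptime(whole_text, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids any float rounding.
    seconds = (whole - EPOCH) // timedelta(seconds=1)
    # Right-pad the fraction so "0.5" means 500,000,000 ns.
    return seconds * NANOS_PER_SECOND + int(frac_text.ljust(9, "0"))


if __name__ == "__main__":
    ns = 1577885075123456789
    text = format_epoch_nanos(ns)
    print(text)
    print(to_epoch_nanos(text) == ns)
```

The fraction is kept as an integer throughout because Python's {{datetime}} stops at microsecond precision; passing the value through a float or a {{datetime}} microsecond field would silently drop the last three digits.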


