ntjohnson1 opened a new issue, #1234:
URL: https://github.com/apache/datafusion-python/issues/1234

   **Describe the bug**
   `with_column` typically adds a single new column with the provided name. However, when
using `lag` we get two new columns: one with the provided name and one with an
automatically generated name.
   
   **To Reproduce**
   ```python
   import datafusion as dfn
   from datafusion import col, lit, functions as F
   import pyarrow as pa
   
   
   def datafusion_example() -> None:
       table = pa.table({"a": [1.0, 2.0, 3.0]})
       ctx = dfn.SessionContext()
       df = ctx.from_arrow(table)
       print(
           df.with_column(
               "previous_a",
               F.lag(
                   col("a"),
                   default_value=None,
               ),
           )
       )
   
       print(df.with_column("something_else", col("a") + lit(1.0)))
   
   
   if __name__ == "__main__":
       datafusion_example()
   ```
   
   Output
   ```console
   DataFrame()
   +-----+-----------------------------------------------------------------------------------------------------------------+------------+
   | a   | lag(ced8c2b3710c14382bd5eb58d49ffbd53.a,Int64(1),NULL) ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING | previous_a |
   +-----+-----------------------------------------------------------------------------------------------------------------+------------+
   | 1.0 |                                                                                                                 |            |
   | 2.0 | 1.0                                                                                                             | 1.0        |
   | 3.0 | 2.0                                                                                                             | 2.0        |
   +-----+-----------------------------------------------------------------------------------------------------------------+------------+
   DataFrame()
   +-----+----------------+
   | a   | something_else |
   +-----+----------------+
   | 1.0 | 2.0            |
   | 2.0 | 3.0            |
   | 3.0 | 4.0            |
   +-----+----------------+
   ```
   
   **Expected behavior**
   ```console
   DataFrame()
   +-----+------------+
   | a   | previous_a |
   +-----+------------+
   | 1.0 |            |
   | 2.0 | 1.0        |
   | 3.0 | 2.0        |
   +-----+------------+
   DataFrame()
   +-----+----------------+
   | a   | something_else |
   +-----+----------------+
   | 1.0 | 2.0            |
   | 2.0 | 3.0            |
   | 3.0 | 4.0            |
   +-----+----------------+
   ```
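
The expected behavior above can also be stated as a check on column names: after `with_column`, the only addition to the original columns should be the requested name. The following is a minimal sketch of such a check. The helper name `unexpected_columns` is hypothetical and not part of datafusion, and it works on plain lists of names so it does not depend on any particular DataFrame API.

```python
from typing import List, Sequence


def unexpected_columns(
    before: Sequence[str], after: Sequence[str], new_name: str
) -> List[str]:
    """Return columns in `after` that are neither original nor `new_name`.

    Hypothetical helper for illustration; an empty result means the
    operation added only the requested column.
    """
    expected = set(before) | {new_name}
    return [name for name in after if name not in expected]


# Column names taken from the reported output above.
generated = (
    "lag(ced8c2b3710c14382bd5eb58d49ffbd53.a,Int64(1),NULL) "
    "ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING"
)
reported = unexpected_columns(["a"], ["a", generated, "previous_a"], "previous_a")
expected = unexpected_columns(["a"], ["a", "previous_a"], "previous_a")
```

Here `reported` contains the auto-generated `lag(...)` column, while `expected` is empty. In a real regression test the name lists would come from the DataFrames themselves, for example via `df.schema().names`, assuming `schema()` returns a pyarrow schema.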
   

