MaxGekk commented on code in PR #50336: URL: https://github.com/apache/spark/pull/50336#discussion_r2021600007
########## sql/api/src/main/scala/org/apache/spark/sql/catalyst/util/SparkDateTimeUtils.scala: ########## @@ -133,6 +134,32 @@ trait SparkDateTimeUtils { } } + /** + * Gets the number of microseconds since midnight using the session time zone. + */ + def instantToMicrosOfDay(instant: Instant): Long = { + val zoneId = getZoneId(SqlApiConf.get.sessionLocalTimeZone) + val localDateTime = LocalDateTime.ofInstant(instant, zoneId) + localDateTime.toLocalTime.toNanoOfDay / 1000 Review Comment: ```suggestion localDateTime.toLocalTime.getLong(MICRO_OF_DAY) ``` ########## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/timeExpressions.scala: ########## @@ -290,3 +291,63 @@ object HourExpressionBuilder extends ExpressionBuilder { } } } + +case class CurrentTime(precision: Int = 6) extends CurrentTimestampLike with ExpectsInputTypes { + // The function returns a TIME(n). + override def dataType: DataType = TimeType(precision) + + override def prettyName: String = "current_time" + + override def inputTypes: Seq[AbstractDataType] = Seq(IntegerType) + + override def checkInputDataTypes(): TypeCheckResult = { + if (precision < TimeType.MIN_PRECISION || precision > TimeType.MICROS_PRECISION) { + TypeCheckFailure(s"Invalid precision $precision. Must be between 0 and 6.") + } else { + TypeCheckSuccess + } + } +} + +/** + * Returns the current time at the start of query evaluation. + * There is no code generation since this expression should get constant folded by the optimizer. + */ +// scalastyle:off line.size.limit +@ExpressionDescription( + usage = """ + _FUNC_() - Returns the current time at the start of query evaluation. All calls of current_time within the same query return the same value. + + _FUNC_ - Returns the current time at the start of query evaluation. 
+ """, + examples = """ + Examples: + > SELECT _FUNC_(); + 15:49:11.914120 + > SELECT _FUNC_; + 15:49:11.914120 + > SELECT _FUNC_(0); + 15:49:11 + > SELECT _FUNC_(3); + 15:49:11.914 + """, + group = "datetime_funcs", + since = "4.1.0") +// scalastyle:on line.size.limit +object CurrentTimeExpressionBuilder extends ExpressionBuilder { + override def build(funcName: String, expressions: Seq[Expression]): Expression = { + expressions match { + case Nil => CurrentTime(TimeType.MICROS_PRECISION) + + case Seq(Literal(precisionValue, IntegerType)) => + CurrentTime(precisionValue.asInstanceOf[Int]) + + case Seq(_) => + val child = expressions.head + throw QueryCompilationErrors.nonFoldableInputError("precision", child, IntegerType) Review Comment: A foldable expression could be non-Literal, for example `1+1` ########## sql/api/src/main/scala/org/apache/spark/sql/catalyst/util/SparkDateTimeUtils.scala: ########## @@ -133,6 +134,32 @@ trait SparkDateTimeUtils { } } + /** + * Gets the number of microseconds since midnight using the session time zone. + */ + def instantToMicrosOfDay(instant: Instant): Long = { + val zoneId = getZoneId(SqlApiConf.get.sessionLocalTimeZone) Review Comment: Most of the functions in the file are pure, and don't depend on global variables. Let's follow the convention, and pass the time zone to the function. It is available at the caller side, right? 
########## sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala: ########## @@ -2148,6 +2148,21 @@ private[sql] object QueryCompilationErrors extends QueryErrorsBase with Compilat )) } + // scalastyle:off line.size.limit + def nonFoldableInputError(argumentName: String, expression: Expression, requiredType: DataType): Throwable = { Review Comment: Just put parameters on separate lines, see https://github.com/databricks/scala-style-guide?tab=readme-ov-file#spacing-and-indentation ########## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/timeExpressions.scala: ########## @@ -290,3 +291,63 @@ object HourExpressionBuilder extends ExpressionBuilder { } } } + +case class CurrentTime(precision: Int = 6) extends CurrentTimestampLike with ExpectsInputTypes { + // The function returns a TIME(n). + override def dataType: DataType = TimeType(precision) + + override def prettyName: String = "current_time" + + override def inputTypes: Seq[AbstractDataType] = Seq(IntegerType) + + override def checkInputDataTypes(): TypeCheckResult = { + if (precision < TimeType.MIN_PRECISION || precision > TimeType.MICROS_PRECISION) { + TypeCheckFailure(s"Invalid precision $precision. Must be between 0 and 6.") + } else { + TypeCheckSuccess + } + } +} + +/** + * Returns the current time at the start of query evaluation. + * There is no code generation since this expression should get constant folded by the optimizer. + */ +// scalastyle:off line.size.limit +@ExpressionDescription( + usage = """ + _FUNC_() - Returns the current time at the start of query evaluation. All calls of current_time within the same query return the same value. + + _FUNC_ - Returns the current time at the start of query evaluation. 
+ """, + examples = """ + Examples: + > SELECT _FUNC_(); + 15:49:11.914120 + > SELECT _FUNC_; + 15:49:11.914120 + > SELECT _FUNC_(0); + 15:49:11 + > SELECT _FUNC_(3); + 15:49:11.914 + """, + group = "datetime_funcs", + since = "4.1.0") +// scalastyle:on line.size.limit +object CurrentTimeExpressionBuilder extends ExpressionBuilder { Review Comment: You can just add constructors with different parameters to `CurrentTime` instead of using `ExpressionBuilder`. ########## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/finishAnalysis.scala: ########## @@ -129,6 +131,11 @@ object ComputeCurrentTime extends Rule[LogicalPlan] { Literal.create( DateTimeUtils.microsToDays(currentTimestampMicros, cd.zoneId), DateType) }) + case currentTimeType : CurrentTime => + // scalastyle:off line.size.limit Review Comment: Just split the line below, and remove this. ########## sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/finishAnalysis.scala: ########## @@ -129,6 +131,11 @@ object ComputeCurrentTime extends Rule[LogicalPlan] { Literal.create( DateTimeUtils.microsToDays(currentTimestampMicros, cd.zoneId), DateType) }) + case currentTimeType : CurrentTime => + // scalastyle:off line.size.limit + val truncatedTime = truncateTimeMicrosToPrecision(currentTimeOfDayMicros, currentTimeType.precision) Review Comment: ```suggestion val truncatedTime = truncateTimeMicrosToPrecision(currentTimeOfDayMicros, currentTimeType.precision)``` ########## sql/api/src/main/scala/org/apache/spark/sql/catalyst/util/SparkDateTimeUtils.scala: ########## @@ -133,6 +134,32 @@ trait SparkDateTimeUtils { } } + /** + * Gets the number of microseconds since midnight using the session time zone. 
+ */ + def instantToMicrosOfDay(instant: Instant): Long = { + val zoneId = getZoneId(SqlApiConf.get.sessionLocalTimeZone) + val localDateTime = LocalDateTime.ofInstant(instant, zoneId) + localDateTime.toLocalTime.toNanoOfDay / 1000 + } + + /** + * Truncates a time value (in microseconds) to the specified fractional precision `p`. + * + * For example, if `p = 3`, we keep millisecond resolution and discard any digits beyond + * the thousand-microsecond place. So a value like `123456` microseconds (12:34:56.123456) + * becomes `123000` microseconds (12:34:56.123). + * + * @param micros The original time in microseconds. + * @param p The fractional second precision (range 0 to 6). Review Comment: Could you add an assert for that. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org