[ https://issues.apache.org/jira/browse/SPARK-55440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
David Milicevic updated SPARK-55440:
------------------------------------
Description:
*Summary:*
Create framework interfaces and integrate core type operations
*Description:*
Create the foundational Ops traits and interfaces, implement them for TimeType
as proof of concept, and integrate with core type system files.
*What this includes:*
* {{TypeOps}} (catalyst) - mandatory trait with physical type, literal, and
external conversion methods
* {{TypeApiOps}} (sql-api) - mandatory trait with formatting and encoding
methods
* {{TimeTypeOps}} (catalyst) and {{TimeTypeApiOps}} (sql-api) - TimeType
implementations
* Feature flag: {{spark.sql.types.framework.enabled}}
* {{TypeOps.apply(dt)}} returns {{Option[TypeOps]}} and checks the feature flag
internally, making it the single registration point
* Integration in 9 core files using {{getOrElse}} pattern:
{{PhysicalDataType}}, {{CatalystTypeConverters}}, {{ToStringBase}},
{{RowEncoder}}, {{literals.scala}}, {{EncoderUtils}},
{{CodeGenerator}}, {{SpecificInternalRow}}, {{InternalRow}}
*Example - before and after for physical type dispatch:*
{code:java}
// Before: each type hardcoded
case _: TimeType => PhysicalLongType
// After: framework dispatch with getOrElse fallback
TypeOps(dt).map(_.getPhysicalType).getOrElse {
  dt match {
    case DateType => PhysicalIntegerType
    // ... legacy types unchanged
  }
}{code}
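The {{getOrElse}} dispatch described above can be sketched as a self-contained Scala snippet. This is only an illustration under stated assumptions: every type below is a simplified stand-in rather than Spark's real class (for example, {{TimeType}} is modelled as a plain object), {{frameworkEnabled}} is a hypothetical substitute for reading {{spark.sql.types.framework.enabled}}, and {{PhysicalTypeDispatch}} is an invented name.

```scala
// Simplified stand-ins for Spark's types; NOT the real classes.
sealed trait DataType
case object DateType extends DataType
case object TimeType extends DataType

sealed trait PhysicalDataType
case object PhysicalIntegerType extends PhysicalDataType
case object PhysicalLongType extends PhysicalDataType

// Only the physical-type method is shown; the ticket also lists
// literal and external-conversion methods on this trait.
trait TypeOps {
  def getPhysicalType: PhysicalDataType
}

object TimeTypeOps extends TypeOps {
  override def getPhysicalType: PhysicalDataType = PhysicalLongType
}

object TypeOps {
  // Hypothetical stand-in for the spark.sql.types.framework.enabled flag.
  @volatile var frameworkEnabled: Boolean = true

  // Single registration point: the flag check and the type lookup live here,
  // so call sites never consult the flag themselves.
  def apply(dt: DataType): Option[TypeOps] =
    if (!frameworkEnabled) None
    else dt match {
      case TimeType => Some(TimeTypeOps)
      case _        => None
    }
}

object PhysicalTypeDispatch {
  // Framework types resolve through TypeOps; everything else (and every type
  // when the flag is off) falls through to the legacy match.
  def physicalTypeOf(dt: DataType): PhysicalDataType =
    TypeOps(dt).map(_.getPhysicalType).getOrElse {
      dt match {
        case DateType => PhysicalIntegerType
        case TimeType => PhysicalLongType // legacy path, used when the flag is off
      }
    }
}
```

In this sketch, turning {{frameworkEnabled}} off makes {{TypeOps(TimeType)}} return {{None}}, so the call site falls back to the legacy branch without any change at the call site itself.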
*Design doc:*
Linked in the parent work item.
was:
*Summary:*
Create framework interfaces and integrate core type operations
*Description:*
Create the foundational Ops traits and interfaces, implement them for TimeType
as proof of concept, and integrate with core type system files.
*What this includes:*
* Base traits ({{TypeOps}}, {{TypeApiOps}}) with factory objects and
{{supports()}} methods
* Five core interfaces: {{PhyTypeOps}}, {{LiteralTypeOps}},
{{ExternalTypeOps}}, {{FormatTypeOps}}, {{EncodeTypeOps}}
* TimeType implementation: {{TimeTypeOps}} (catalyst) and {{TimeTypeApiOps}}
(sql-api)
* Feature flag: {{spark.sql.types.framework.enabled}}
* Check-and-delegate integration in ~10 core files: {{PhysicalDataType}},
{{CatalystTypeConverters}}, {{ToStringBase}}, {{RowEncoder}},
{{literals.scala}}, {{EncoderUtils}}, {{CodeGenerator}},
{{SpecificInternalRow}}, {{InternalRow}},
{{SQLConf}}/{{SqlApiConf}}
*Example - before and after for physical type dispatch:*
{code:java}
// Before: each type hardcoded
case _: TimeType => PhysicalLongType
// After: framework dispatch
case _ if PhyTypeOps.supports(dt) => PhyTypeOps(dt).getPhysicalType{code}
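For comparison, the earlier check-and-delegate shape might look roughly like the following self-contained sketch. The types are again simplified stand-ins, not Spark's real classes. Where the feature flag is checked ({{supports()}} here), the exception thrown by {{apply}}, and the names {{TimePhyTypeOps}} and {{CheckAndDelegateDispatch}} are all assumptions made for illustration.

```scala
// Simplified stand-ins for Spark's types; NOT the real classes.
sealed trait DataType
case object DateType extends DataType
case object TimeType extends DataType

sealed trait PhysicalDataType
case object PhysicalIntegerType extends PhysicalDataType
case object PhysicalLongType extends PhysicalDataType

trait PhyTypeOps {
  def getPhysicalType: PhysicalDataType
}

// Hypothetical TimeType implementation of the physical-type interface.
object TimePhyTypeOps extends PhyTypeOps {
  override def getPhysicalType: PhysicalDataType = PhysicalLongType
}

object PhyTypeOps {
  // Hypothetical stand-in for the spark.sql.types.framework.enabled flag.
  @volatile var frameworkEnabled: Boolean = true

  // Assumption: supports() is where the flag and the type are checked.
  def supports(dt: DataType): Boolean = frameworkEnabled && dt == TimeType

  // Callers are expected to call supports() first; apply() fails otherwise.
  def apply(dt: DataType): PhyTypeOps = dt match {
    case TimeType => TimePhyTypeOps
    case other =>
      throw new IllegalArgumentException(s"No PhyTypeOps registered for $other")
  }
}

object CheckAndDelegateDispatch {
  def physicalTypeOf(dt: DataType): PhysicalDataType = dt match {
    // Guard first, then delegate: two calls per dispatch site.
    case _ if PhyTypeOps.supports(dt) => PhyTypeOps(dt).getPhysicalType
    case DateType => PhysicalIntegerType
    case TimeType => PhysicalLongType // legacy path, used when the flag is off
  }
}
```

In this sketch each call site pairs a {{supports()}} guard with a separate {{apply}} call, whereas the {{Option}}-returning form needs a single lookup.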
*Design doc:*
Linked in the parent work item.
> Types Framework - Phase 1a - Core Type System Foundation
> --------------------------------------------------------
>
> Key: SPARK-55440
> URL: https://issues.apache.org/jira/browse/SPARK-55440
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 4.2.0
> Reporter: David Milicevic
> Priority: Major
> Labels: pull-request-available