amoeba commented on issue #12244:
URL: https://github.com/apache/datafusion/issues/12244#issuecomment-2699036156

   The behavior was modified in #12245 and the original issue looks addressed, 
but the plans I'm getting still don't validate. DataFusion currently hardcodes 
struct nullability to `NULLABILITY_UNSPECIFIED`; see 
https://github.com/apache/datafusion/blob/dd0fd889ea603f929accb99002e2f99280823f5c/datafusion/substrait/src/logical_plan/producer.rs#L1042
   
   The `baseSchema` I get with the latest DataFusion looks like this:
   
   ```json
   "baseSchema": {
     "names": [
       "species",
       "island",
       "bill_length_mm",
       "bill_depth_mm",
       "body_mass_g",
       "sex",
       "year"
     ],
     "struct": {
       "types": [
         {
           "string": {
             "nullability": "NULLABILITY_NULLABLE"
           }
         },
         {
           "string": {
             "nullability": "NULLABILITY_NULLABLE"
           }
         },
         {
           "fp64": {
             "nullability": "NULLABILITY_NULLABLE"
           }
         },
         {
           "fp64": {
             "nullability": "NULLABILITY_NULLABLE"
           }
         },
         {
           "i32": {
             "nullability": "NULLABILITY_NULLABLE"
           }
         },
         {
           "string": {
             "nullability": "NULLABILITY_NULLABLE"
           }
         },
         {
           "i32": {
             "nullability": "NULLABILITY_NULLABLE"
           }
         }
       ]
     }
   },
   ```
   
   The struct above has no `nullability` field of its own, so the nullability 
of the `baseSchema` struct is implicitly `NULLABILITY_UNSPECIFIED`. When 
validated as part of a larger plan, this errors out with:
   
   ```
   Error (code 0002):
     at plan.relations[0].rel_type<root>.input.rel_type<project>.input.rel_type<read>.base_schema.struct.nullability:
     illegal value: nullability information is required in this context (code 0002)
   ```
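
   To check a plan for this before running the validator, here is a minimal 
sketch that walks the JSON form of a plan and reports every `baseSchema` struct 
whose nullability is missing or unspecified. It assumes the plan has been 
serialized to JSON with the same camelCase keys as the snippet above; the 
function name `find_unspecified_base_schemas` is made up for illustration and 
is not part of DataFusion or the validator.

```python
import json


def find_unspecified_base_schemas(node, path="plan"):
    """Yield JSON paths of baseSchema structs with no usable nullability.

    Hypothetical helper: assumes a Substrait plan serialized to JSON with
    camelCase keys ("baseSchema", "struct", "nullability").
    """
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}"
            if key == "baseSchema" and isinstance(value, dict):
                struct = value.get("struct", {})
                # An absent field reads back as the unspecified value.
                nullability = struct.get("nullability", "NULLABILITY_UNSPECIFIED")
                if nullability == "NULLABILITY_UNSPECIFIED":
                    yield f"{child}.struct"
            yield from find_unspecified_base_schemas(value, child)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_unspecified_base_schemas(item, f"{path}[{index}]")


# Tiny stand-in plan with the same shape as the baseSchema shown earlier.
example_plan = json.loads("""
{
  "relations": [
    {"root": {"input": {"read": {"baseSchema": {
      "names": ["year"],
      "struct": {"types": [{"i32": {"nullability": "NULLABILITY_NULLABLE"}}]}
    }}}}}
  ]
}
""")

for location in find_unspecified_base_schemas(example_plan):
    print(location)
```

   The paths use the JSON key names, so they will not match the validator's 
`rel_type<...>` path format exactly.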
   
   I can get the plan to validate if I set the struct's nullability to 
`NULLABILITY_REQUIRED`:
   
   ```
   "baseSchema": {
     "names": [...],
     "struct": {
       "types": [...],
       "nullability": "NULLABILITY_REQUIRED"
     }
   },
   ```
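
   The manual edit above can be scripted. This is only a sketch of the 
workaround, assuming the plan is available as JSON with camelCase keys; 
`set_base_schema_nullability` is a hypothetical helper, and whether 
`NULLABILITY_REQUIRED` is the correct value is the open question in this 
comment, not something the code settles.

```python
import copy


def set_base_schema_nullability(plan, value="NULLABILITY_REQUIRED"):
    """Return a copy of `plan` with nullability filled in on baseSchema structs.

    Hypothetical workaround helper: only touches structs whose nullability is
    absent or NULLABILITY_UNSPECIFIED, and never modifies the input plan.
    """
    patched = copy.deepcopy(plan)
    stack = [patched]
    while stack:
        node = stack.pop()
        if isinstance(node, dict):
            base = node.get("baseSchema")
            if isinstance(base, dict) and isinstance(base.get("struct"), dict):
                struct = base["struct"]
                if struct.get("nullability") in (None, "NULLABILITY_UNSPECIFIED"):
                    struct["nullability"] = value
            stack.extend(node.values())
        elif isinstance(node, list):
            stack.extend(node)
    return patched


# Stand-in plan fragment shaped like the read relation in this comment.
original = {
    "read": {
        "baseSchema": {
            "names": ["year"],
            "struct": {"types": [{"i32": {"nullability": "NULLABILITY_NULLABLE"}}]},
        }
    }
}
fixed = set_base_schema_nullability(original)
print(fixed["read"]["baseSchema"]["struct"]["nullability"])
```

   Column-level nullability inside `types` is left untouched; only the 
top-level struct of each `baseSchema` is changed.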
   
   Is the validator right and should DataFusion change its behavior here?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

