anishshri-db commented on code in PR #49277:
URL: https://github.com/apache/spark/pull/49277#discussion_r1915987901
##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateSchemaCompatibilityChecker.scala:
##########
@@ -206,22 +206,23 @@ class StateSchemaCompatibilityChecker(
     } else if (!ignoreValueSchema && schemaEvolutionEnabled) {
       // Check value schema evolution
       // Sort schemas by most recent to least recent
-      val oldAvroSchemas = oldSchemas.sortBy(_.valueSchemaId).reverse.map { oldSchema =>
-        SchemaConverters.toAvroTypeWithDefaults(oldSchema.valueSchema)
+      val oldStateSchemas = oldSchemas.sortBy(_.valueSchemaId).reverse.map { oldSchema =>
+        StateSchemaMetadataValue(
+          oldSchema.valueSchema, SchemaConverters.toAvroTypeWithDefaults(oldSchema.valueSchema))
       }.asJava
-      val l = oldSchemas.sortBy(_.valueSchemaId).reverse.map { oldSchema =>
-        SchemaConverters.toAvroTypeWithDefaults(oldSchema.valueSchema)
-      }
+
       val newAvroSchema = SchemaConverters.toAvroTypeWithDefaults(valueSchema)
       val validator = new SchemaValidatorBuilder().canReadStrategy.validateAll()
-      try {
-        validator.validate(newAvroSchema, oldAvroSchemas)
-      } catch {
-        case s: SchemaValidationException =>
-          throw StateStoreErrors.stateStoreInvalidValueSchemaEvolution(
-            valueSchema.toString, s.getMessage)
-        case e: Throwable => throw e
+      oldStateSchemas.forEach { oldStateSchema =>

Review Comment:
   Yes, this is needed. @HeartSaVioR, Avro supports more than one way of validating schema evolution. Comparing only the previous schema against the current one tells you whether those two are compatible, but it also assumes that you will rewrite all existing data with the current schema. If you don't want to do that, you need to run validation from the first schema to the latest and ensure that every transformation is valid. Avro has a separate API for each case: https://avro.apache.org/docs/1.7.7/api/java/org/apache/avro/ValidateLatest.html vs https://avro.apache.org/docs/1.7.7/api/java/org/apache/avro/ValidateAll.html
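   To make the difference between the two validators concrete, here is a minimal sketch that is not taken from the PR. The three `Value` schemas and the object name `ValidateLatestVsAll` are hypothetical. It assumes only that Avro is on the classpath and uses the same `SchemaValidatorBuilder` API as the diff above. The existing schemas are passed most recent first, matching the sort order in the checker.

```scala
import java.util.{Arrays => JArrays, List => JList}

import org.apache.avro.{Schema, SchemaValidationException, SchemaValidator, SchemaValidatorBuilder}

object ValidateLatestVsAll {
  // Use a fresh parser per schema: a single parser rejects redefining "Value".
  private def parse(json: String): Schema = new Schema.Parser().parse(json)

  // Hypothetical history of one value schema.
  // v1: only `id`.
  val v1: Schema = parse(
    """{"type":"record","name":"Value","fields":[{"name":"id","type":"int"}]}""")
  // v2: adds `name` with a default, so it can still read v1 data.
  val v2: Schema = parse(
    """{"type":"record","name":"Value","fields":[
      |{"name":"id","type":"int"},
      |{"name":"name","type":"string","default":""}]}""".stripMargin)
  // v3: drops the default on `name`.
  val v3: Schema = parse(
    """{"type":"record","name":"Value","fields":[
      |{"name":"id","type":"int"},
      |{"name":"name","type":"string"}]}""".stripMargin)

  // Existing schemas, most recent first.
  val existing: JList[Schema] = JArrays.asList(v2, v1)

  val latestOnly: SchemaValidator =
    new SchemaValidatorBuilder().canReadStrategy().validateLatest()
  val all: SchemaValidator =
    new SchemaValidatorBuilder().canReadStrategy().validateAll()

  // True when `validator` accepts v3 as a reader for the existing schemas.
  def accepts(validator: SchemaValidator): Boolean =
    try {
      validator.validate(v3, existing)
      true
    } catch {
      case _: SchemaValidationException => false
    }

  def main(args: Array[String]): Unit = {
    // v3 can read data written with v2, so the latest-only check should pass.
    println(s"validateLatest accepts v3: ${accepts(latestOnly)}")
    // v3 cannot read v1 data (no `name`, no default), so validateAll should reject it.
    println(s"validateAll accepts v3: ${accepts(all)}")
  }
}
```

   With `validateLatest`, data still stored under v1 would become unreadable unless everything is rewritten with the current schema, which is the assumption described in the comment above.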