[ https://issues.apache.org/jira/browse/AVRO-2539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nikita Ryanov updated AVRO-2539:
--------------------------------
    Description:
Currently, the ThriftData class produces Avro schemas that are not compatible in terms of AvroCompatibility rules.

For example, consider these Thrift structs:
{code:java}
struct V1 {
  1: required string f1,
  2: optional string f2
}
struct V2 {
  1: required string f1,
  2: optional string f2,
  3: optional string f3
}{code}
The produced schemas will be:
{noformat}
{"type":"record","name":"V1","namespace":"serialization.thrift.test","fields":[{"name":"f1","type":["null",{"type":"string","avro.java.string":"String"}]},{"name":"f2","type":["null",{"type":"string","avro.java.string":"String"}],"default":null}]}
{"type":"record","name":"V2","namespace":"serialization.thrift.test","fields":[{"name":"f1","type":["null",{"type":"string","avro.java.string":"String"}]},{"name":"f2","type":["null",{"type":"string","avro.java.string":"String"}]},
{"name":"f3","type":["null",{"type":"string","avro.java.string":"String"}]}]}
{noformat}
The problem is that if I check these schemas with the BACKWARD compatibility checker, the result is false, because fields f2 and f3 have no default values even though they are optional.
Also, if I use a default value in my Thrift definition, the resulting Avro schema will not contain it.

It is possible to fix the default null values for optional fields using NULL_DEFAULT_VALUE, but that will ignore the real default values. To honour the real default values specified in *.thrift, we can use an instance of the Thrift message to get the default value, but this will require some refactoring of methods such as getSchema and nullable in ThriftData.class

  was:
Currently, the ThriftData class produces Avro schemas that are not compatible in terms of AvroCompatibility rules.
For example, consider these Thrift structs:
{code:java}
struct V1 {
  1: required string f1,
  2: optional string f2
}
struct V2 {
  1: required string f1,
  2: optional string f2,
  3: optional string f3
}{code}
The produced schemas will be:
{noformat}
{"type":"record","name":"V1","namespace":"schemakeeper.serialization.thrift.test","fields":[{"name":"f1","type":["null",{"type":"string","avro.java.string":"String"}]},{"name":"f2","type":["null",{"type":"string","avro.java.string":"String"}],"default":null}]}
{"type":"record","name":"V2","namespace":"schemakeeper.serialization.thrift.test","fields":[{"name":"f1","type":["null",{"type":"string","avro.java.string":"String"}]},{"name":"f2","type":["null",{"type":"string","avro.java.string":"String"}]},
{"name":"f3","type":["null",{"type":"string","avro.java.string":"String"}]}]}
{noformat}
The problem is that if I check these schemas with the BACKWARD compatibility checker, the result is false, because fields f2 and f3 have no default values even though they are optional.
Also, if I use a default value in my Thrift definition, the resulting Avro schema will not contain it.

It is possible to fix the default null values for optional fields using NULL_DEFAULT_VALUE, but that will ignore the real default values. To honour the real default values specified in *.thrift, we can use an instance of the Thrift message to get the default value, but this will require some refactoring of methods such as getSchema and nullable in ThriftData.class


> ThriftData produces not compatible avro schemas
> -----------------------------------------------
>
>                 Key: AVRO-2539
>                 URL: https://issues.apache.org/jira/browse/AVRO-2539
>             Project: Apache Avro
>          Issue Type: Improvement
>          Components: java
>    Affects Versions: 1.9.0
>            Reporter: Nikita Ryanov
>            Priority: Major
>
> Currently, the ThriftData class produces Avro schemas that are not compatible in terms of
> AvroCompatibility rules.
> For example, consider these Thrift structs:
> {code:java}
> struct V1 {
>   1: required string f1,
>   2: optional string f2
> }
> struct V2 {
>   1: required string f1,
>   2: optional string f2,
>   3: optional string f3
> }{code}
> The produced schemas will be:
> {noformat}
> {"type":"record","name":"V1","namespace":"serialization.thrift.test","fields":[{"name":"f1","type":["null",{"type":"string","avro.java.string":"String"}]},{"name":"f2","type":["null",{"type":"string","avro.java.string":"String"}],"default":null}]}
> {"type":"record","name":"V2","namespace":"serialization.thrift.test","fields":[{"name":"f1","type":["null",{"type":"string","avro.java.string":"String"}]},{"name":"f2","type":["null",{"type":"string","avro.java.string":"String"}]},
> {"name":"f3","type":["null",{"type":"string","avro.java.string":"String"}]}]}
> {noformat}
> The problem is that if I check these schemas with the BACKWARD
> compatibility checker, the result is false, because fields f2 and f3 have no
> default values even though they are optional.
> Also, if I use a default value in my Thrift definition, the resulting Avro
> schema will not contain it.
> It is possible to fix the default null values for optional fields using
> NULL_DEFAULT_VALUE, but that will ignore the real default values. To honour the
> real default values specified in *.thrift, we can use an instance of the Thrift
> message to get the default value, but this will require some refactoring of
> methods such as getSchema and nullable in ThriftData.class



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
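
The default-value rule behind the failing BACKWARD check can be illustrated with a small sketch. This is not Avro's compatibility checker and not part of ThriftData; it models only one rule, namely that a field present in the reader (new) schema but absent from the writer (old) schema needs a default. The helper names `fields_missing_defaults`, `record` and `nullable_string` are hypothetical, and the schemas are rebuilt by hand from the ones quoted in the description.

```python
# Minimal model of one Avro schema-resolution rule: a reader field that the
# writer schema does not contain must declare a default value.
# All helper names here are hypothetical and exist only for this illustration.

STRING = {"type": "string", "avro.java.string": "String"}


def nullable_string(name, with_default):
    """Build a ["null", string] field, optionally with "default": null."""
    field = {"name": name, "type": ["null", STRING]}
    if with_default:
        field["default"] = None
    return field


def record(name, fields):
    """Build a record schema in the namespace used in the issue."""
    return {
        "type": "record",
        "name": name,
        "namespace": "serialization.thrift.test",
        "fields": fields,
    }


def fields_missing_defaults(reader_schema, writer_schema):
    """Return reader fields that are absent from the writer and have no default."""
    writer_names = {f["name"] for f in writer_schema["fields"]}
    return [
        f["name"]
        for f in reader_schema["fields"]
        if f["name"] not in writer_names and "default" not in f
    ]


# V1 and V2 as quoted in the description.
v1 = record("V1", [nullable_string("f1", False), nullable_string("f2", True)])
v2_as_produced = record(
    "V2", [nullable_string(n, False) for n in ("f1", "f2", "f3")]
)

# V2 as it might look if optional fields received a null default.
v2_with_defaults = record(
    "V2",
    [
        nullable_string("f1", False),
        nullable_string("f2", True),
        nullable_string("f3", True),
    ],
)

print(fields_missing_defaults(v2_as_produced, v1))    # ['f3']
print(fields_missing_defaults(v2_with_defaults, v1))  # []
```

In this simplified model only f3 is flagged when V2 reads data written with V1, because f2 exists in both schemas. A real checker applies further rules (for example record-name and type matching) that the sketch leaves out.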