cricket007 commented on a change in pull request #526: HIVE-21218: KafkaSerDe doesn't support topics created via Confluent
URL: https://github.com/apache/hive/pull/526#discussion_r254993454
##########
File path: kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaSerDe.java
##########
@@ -369,6 +379,20 @@ private SubStructObjectInspector(StructObjectInspector baseOI, int toIndex) {
     }
   }
 
+  static class ConfluentAvroBytesConverter extends AvroBytesConverter {
+    ConfluentAvroBytesConverter(Schema schema) {
+      super(schema);
+    }
+
+    @Override
+    Decoder getDecoder(byte[] value) {
+      /**
+       * Confluent 4 magic bytes that represents Schema ID as Integer. These bits are added before value bytes.
+       */
+      return DecoderFactory.get().binaryDecoder(value, 5, value.length - 5, null);

Review comment:
> I'd like to see hive download schemas from schema registry

As linked earlier, the `/schema` endpoint should be able to return the schema text. However, I suspect that it'll cause an HTTP request per Hive "stage"? For example, N requests if reading from N Kafka partitions? And more requests if trying to do fancy selects, unions, and joins?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services
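
For context on the hunk above: the `binaryDecoder(value, 5, value.length - 5, null)` call skips a 5-byte prefix, which in the Confluent wire format is one magic byte (`0`) followed by a 4-byte big-endian schema ID. The following is only a rough sketch, not code from this PR, of reading that ID and caching schema text per ID so that repeated lookups within one JVM do not each trigger a request. The names `ConfluentWireFormat`, `SchemaCache`, and the injected `fetcher` are hypothetical, and a cache like this would not be shared across Hive tasks running in separate JVMs.

```java
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntFunction;

public class ConfluentWireFormat {
    static final byte MAGIC_BYTE = 0x0;
    // 1 magic byte + 4-byte schema ID precede the Avro payload.
    static final int HEADER_LENGTH = 5;

    /** Reads the 4-byte big-endian schema ID that follows the magic byte. */
    static int schemaId(byte[] value) {
        if (value == null || value.length < HEADER_LENGTH) {
            throw new IllegalArgumentException("Value too short for Confluent header");
        }
        ByteBuffer buffer = ByteBuffer.wrap(value); // ByteBuffer is big-endian by default
        byte magic = buffer.get();
        if (magic != MAGIC_BYTE) {
            throw new IllegalArgumentException("Unexpected magic byte: " + magic);
        }
        return buffer.getInt();
    }

    /** Hypothetical cache: invokes the fetcher at most once per schema ID in this JVM. */
    static final class SchemaCache {
        private final Map<Integer, String> schemas = new ConcurrentHashMap<>();
        private final IntFunction<String> fetcher; // assumed to wrap the registry HTTP call

        SchemaCache(IntFunction<String> fetcher) {
            this.fetcher = fetcher;
        }

        String schemaText(int id) {
            return schemas.computeIfAbsent(id, key -> fetcher.apply(key));
        }
    }

    public static void main(String[] args) {
        // Magic byte, schema ID 42, then two placeholder payload bytes.
        byte[] value = {0, 0, 0, 0, 42, 1, 2};
        AtomicInteger fetches = new AtomicInteger();
        // Stand-in fetcher; a real one would query the schema registry.
        SchemaCache cache = new SchemaCache(id -> {
            fetches.incrementAndGet();
            return "schema-" + id;
        });
        int id = schemaId(value);
        cache.schemaText(id);
        cache.schemaText(id); // served from the cache, no second fetch
        System.out.println("schema id = " + id + ", fetches = " + fetches.get());
    }
}
```

Whether this reduces the request count in practice depends on how Hive distributes readers across partitions and stages, which is the open question raised in the review comment.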