One of the areas where Kafka has an advantage over Pulsar is around data quality. Kafka conforms to the JSON Schema specification, which enables integration with any technology that conforms to the standard, such as for data validation, discoverability, lineage, versioning, etc. Pulsar's implementation is non-compliant with the standard, and producers and consumers have no built-in way in Pulsar to validate that values in their messages match expectations. As a consequence, if a producer makes a change or omission, such as in a value used for tracking, it might not surface until way down the line, and then it can be very difficult to track down the source of the problem, which kills the agility of teams responsible for maintaining apps using Pulsar. It's also bad PR because then incidents are associated with Pulsar, even though the business might not understand that the data problem wasn't necessarily caused by Pulsar.
What's the right way for us to address this problem? -- Devin Bost Sent from mobile Cell: 801-400-4602