lkuchars commented on code in PR #10105:
URL: https://github.com/apache/nifi/pull/10105#discussion_r2260130350
##########
nifi-extension-bundles/nifi-standard-services/nifi-schema-registry-service-api/src/main/java/org/apache/nifi/schemaregistry/services/SchemaRegistry.java:
##########
@@ -45,4 +45,32 @@ public interface SchemaRegistry extends ControllerService {
* @return the set of all Schema Fields that are supplied by the
RecordSchema that is returned from {@link #retrieveSchema(SchemaIdentifier)}
*/
Set<SchemaField> getSuppliedSchemaFields();
+
+ /**
+ * Retrieves the raw schema definition including its textual
representation and references.
+ * <p>
+ * This method is used to retrieve the complete schema definition
structure, including the raw schema text
+ * and any schema references. Unlike {@link
#retrieveSchema(SchemaIdentifier)}, which returns a parsed
+ * {@link RecordSchema} ready for immediate use, this method returns a
{@link SchemaDefinition} containing
+ * the raw schema content that can be used for custom schema processing,
compilation, or when schema
+ * references need to be resolved.
+ * </p>
+ * <p>
+ * This method is particularly useful for:
+ * <ul>
+ * <li>Processing schemas that reference other schemas (e.g., Protocol
Buffers with imports)</li>
+ * <li>Custom schema compilation workflows where the raw schema text is
needed</li>
+ * <li>Accessing schema metadata and references for advanced schema
processing</li>
+ * </ul>
+ * </p>
+ *
+ * @param schemaIdentifier the schema identifier containing id, name,
version, and optionally branch information
+ * @return a {@link SchemaDefinition} containing the raw schema text,
type, identifier, and references
+ * @throws IOException if unable to communicate with the backing store
+ * @throws SchemaNotFoundException if unable to find the schema based on
the given identifier
+ * @throws UnsupportedOperationException if the schema registry
implementation does not support raw schema retrieval
+ */
+ default SchemaDefinition retrieveSchemaRaw(SchemaIdentifier
schemaIdentifier) throws IOException, SchemaNotFoundException {
+ throw new UnsupportedOperationException("retrieveSchemaRaw is not
supported by this SchemaRegistry implementation");
Review Comment:
>I agree that extending this interface and throwing an exception is a
concern. One option is to include some kind of status method such as
isSchemaDefinitionAccessSupported() and return false in the default
implementation. Another option is to define a new interface.
Got it. Of the two options mentioned above, I'd prefer to add the
isSchemaDefinitionAccessSupported() method and call it in the ProtobufReader
to report a validation error if the user tries to use a schema registry
that does not support raw schemas. I'd leave the new method in
SchemaRegistry. It seems to be the correct place given the current
responsibilities of SchemaRegistry implementations.
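
To make the first option concrete, here is a minimal sketch of the capability-check approach. It is an assumption-laden illustration, not the PR's code: `RawSchemaCapable`, `ParsedOnlyRegistry`, `RawCapableRegistry`, and `ReaderValidationSketch` are hypothetical stand-ins for SchemaRegistry, its implementations, and the validation step in ProtobufReader, and the raw schema is reduced to a plain String.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for SchemaRegistry; not the real NiFi interface.
interface RawSchemaCapable {
    // Capability flag: implementations without raw schema access inherit "false".
    default boolean isSchemaDefinitionAccessSupported() {
        return false;
    }

    // Default mirrors the PR's behavior; callers are expected to check the flag first.
    default String retrieveSchemaRaw(String schemaName) {
        throw new UnsupportedOperationException(
                "retrieveSchemaRaw is not supported by this implementation");
    }
}

// A registry that only returns parsed schemas and therefore keeps both defaults.
final class ParsedOnlyRegistry implements RawSchemaCapable {
}

// A registry that can hand out raw schema text (here a made-up proto snippet).
final class RawCapableRegistry implements RawSchemaCapable {
    @Override
    public boolean isSchemaDefinitionAccessSupported() {
        return true;
    }

    @Override
    public String retrieveSchemaRaw(String schemaName) {
        return "syntax = \"proto3\"; message " + schemaName + " {}";
    }
}

// Stand-in for the reader-side validation: collects human-readable problems
// instead of letting an UnsupportedOperationException surface at runtime.
final class ReaderValidationSketch {
    static List<String> validate(RawSchemaCapable registry) {
        List<String> problems = new ArrayList<>();
        if (!registry.isSchemaDefinitionAccessSupported()) {
            problems.add("The selected Schema Registry does not support raw schema retrieval");
        }
        return problems;
    }
}
```

With this shape, a registry that lacks raw schema support is rejected when the reader is validated, so the default exception only fires if a caller skips the check.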
>Taking a step back, the RecordSchema interface supports returning the raw
schema text as an optional property, so was there a consideration of using the
existing method instead of a new one?
I saw the text property in the RecordSchema interface, but decided against using
it for the following reasons:
- RecordSchema represents the compiled/parsed schema. Once an
implementation of RecordSchema is returned, it should support all interface
methods, not only getText(), which means we would have to parse with Wire before
the RecordSchema implementation is created. I wanted to defer the compilation
and decouple those two operations. I could potentially defer the compilation
until one of the methods, e.g. getFields(), is called, but I did not want to
introduce a new implementation of this interface.
- I needed some kind of representation for referenced schemas.
- I wanted a raw schema to be a separate entity that is not bound to
Record-related entities, so that the schemas can be worked on without
Record-related abstractions.
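
As a rough illustration of the last two points, the sketch below models a raw schema as its own entity that carries its references and involves no Record types. Everything here is hypothetical: `RawSchemaSketch` and `ReferenceCollector` are made-up names, not the PR's SchemaDefinition, and no compilation happens. The collector only gathers the raw text of a schema and its transitive references so that a later, separate step could compile them.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical raw schema entity: a name, the unparsed text, and referenced schemas.
final class RawSchemaSketch {
    private final String name;
    private final String text;
    private final Map<String, RawSchemaSketch> references;

    RawSchemaSketch(String name, String text, Map<String, RawSchemaSketch> references) {
        this.name = name;
        this.text = text;
        // Defensive copy that keeps insertion order and blocks later mutation.
        this.references = Collections.unmodifiableMap(new LinkedHashMap<>(references));
    }

    String getName() {
        return name;
    }

    String getText() {
        return text;
    }

    Map<String, RawSchemaSketch> getReferences() {
        return references;
    }
}

// Flattens a schema and its transitive references into name -> raw text.
final class ReferenceCollector {
    static Map<String, String> collect(RawSchemaSketch root) {
        Map<String, String> files = new LinkedHashMap<>();
        visit(root, files);
        return files;
    }

    private static void visit(RawSchemaSketch schema, Map<String, String> files) {
        if (files.containsKey(schema.getName())) {
            return; // already collected; also stops on circular references
        }
        files.put(schema.getName(), schema.getText());
        for (RawSchemaSketch reference : schema.getReferences().values()) {
            visit(reference, files);
        }
    }
}
```

For example, a made-up `order.proto` that imports `common.proto` would be collected as two entries, root first, and only then handed to whatever compiler the reader uses.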