lkuchars commented on code in PR #10105:
URL: https://github.com/apache/nifi/pull/10105#discussion_r2260130350


##########
nifi-extension-bundles/nifi-standard-services/nifi-schema-registry-service-api/src/main/java/org/apache/nifi/schemaregistry/services/SchemaRegistry.java:
##########
@@ -45,4 +45,32 @@ public interface SchemaRegistry extends ControllerService {
      * @return the set of all Schema Fields that are supplied by the 
RecordSchema that is returned from {@link #retrieveSchema(SchemaIdentifier)}
      */
     Set<SchemaField> getSuppliedSchemaFields();
+
+    /**
+     * Retrieves the raw schema definition including its textual 
representation and references.
+     * <p>
+     * This method is used to retrieve the complete schema definition 
structure, including the raw schema text
+     * and any schema references. Unlike {@link 
#retrieveSchema(SchemaIdentifier)}, which returns a parsed
+     * {@link RecordSchema} ready for immediate use, this method returns a 
{@link SchemaDefinition} containing
+     * the raw schema content that can be used for custom schema processing, 
compilation, or when schema
+     * references need to be resolved.
+     * </p>
+     * <p>
+     * This method is particularly useful for:
+     * <ul>
+     *   <li>Processing schemas that reference other schemas (e.g., Protocol 
Buffers with imports)</li>
+     *   <li>Custom schema compilation workflows where the raw schema text is 
needed</li>
+     *   <li>Accessing schema metadata and references for advanced schema 
processing</li>
+     * </ul>
+     * </p>
+     *
+     * @param schemaIdentifier the schema identifier containing id, name, 
version, and optionally branch information
+     * @return a {@link SchemaDefinition} containing the raw schema text, 
type, identifier, and references
+     * @throws IOException if unable to communicate with the backing store
+     * @throws SchemaNotFoundException if unable to find the schema based on 
the given identifier
+     * @throws UnsupportedOperationException if the schema registry 
implementation does not support raw schema retrieval
+     */
+    default SchemaDefinition retrieveSchemaRaw(SchemaIdentifier 
schemaIdentifier) throws IOException, SchemaNotFoundException {
+        throw new UnsupportedOperationException("retrieveSchemaRaw is not 
supported by this SchemaRegistry implementation");

Review Comment:
   >I agree that extending this interface and throwing an exception is a 
concern. One option is to include some kind of status method such as 
isSchemaDefinitionAccessSupported() and return false in the default 
implementation. Another option is to define a new interface.
   
   Got it. Of the two options above, I'd prefer to add the 
isSchemaDefinitionAccessSupported() method and call it in the ProtobufReader 
to report a validation error if the user tries to use a schema registry that 
does not support raw schemas. I'd leave the new method in SchemaRegistry; it 
seems to be the correct place given the current responsibilities of 
SchemaRegistry implementations. 
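
   To make the proposal concrete, here is a minimal sketch of the capability 
check. Everything in it is a simplified stand-in written for this comment: 
Registry, ParsedOnlyRegistry, RawCapableRegistry and validate() are 
hypothetical names, the raw schema is reduced to a plain String, and none of 
it is the actual NiFi API or the final ProtobufReader code.

```java
import java.util.ArrayList;
import java.util.List;

public class CapabilityCheckSketch {

    /** Simplified stand-in for SchemaRegistry (assumption, not the real NiFi interface). */
    interface Registry {
        // Default: raw schema access is not available.
        default boolean isSchemaDefinitionAccessSupported() {
            return false;
        }

        // Default mirrors the PR: unsupported unless an implementation overrides it.
        default String retrieveSchemaRaw(String schemaName) {
            throw new UnsupportedOperationException(
                    "retrieveSchemaRaw is not supported by this registry");
        }
    }

    /** Hypothetical registry that only serves parsed schemas. */
    static class ParsedOnlyRegistry implements Registry {
    }

    /** Hypothetical registry that can return raw schema text. */
    static class RawCapableRegistry implements Registry {
        @Override
        public boolean isSchemaDefinitionAccessSupported() {
            return true;
        }

        @Override
        public String retrieveSchemaRaw(String schemaName) {
            return "syntax = \"proto3\"; message " + schemaName + " {}";
        }
    }

    /** Stand-in for the reader's validation step: collect problems instead of throwing later. */
    static List<String> validate(Registry registry) {
        List<String> problems = new ArrayList<>();
        if (!registry.isSchemaDefinitionAccessSupported()) {
            problems.add("Configured Schema Registry does not support raw schema retrieval");
        }
        return problems;
    }

    public static void main(String[] args) {
        System.out.println(validate(new ParsedOnlyRegistry()));
        System.out.println(validate(new RawCapableRegistry()));
    }
}
```

   With this shape the user sees a validation error at configuration time, 
and the UnsupportedOperationException only remains as a safety net for 
callers that skip the check.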
   
   >Taking a step back, the RecordSchema interface supports returning the raw 
schema text as an optional property, so was there a consideration of using the 
existing method instead of a new one?
   
   I saw the text property in the RecordSchema interface, but decided against 
using it for the following reasons:
   - RecordSchema represents the compiled/parsed schema. Once a RecordSchema 
implementation is returned, it should support all interface methods, not only 
getText(), which means we would have to parse with Wire before the 
RecordSchema implementation is created. I wanted to defer the compilation and 
decouple those two operations. I could potentially defer the compilation 
until one of the methods, e.g. getFields(), is called, but I did not want to 
introduce a new implementation of this interface.
   - I needed some kind of representation for referenced schemas.
   - I wanted a raw schema to be a separate entity that is not bound to 
Record-related entities, so that the schemas can be worked on without 
Record-related abstractions.
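
   As a rough illustration of the last two points, a raw schema entity can 
carry its own text plus its referenced schemas and be processed without any 
Record types. The sketch below is an assumption-laden example, not the 
SchemaDefinition from this PR: RawDefinition, compileOrder() and the .proto 
file names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class RawSchemaSketch {

    /** Hypothetical raw schema holder: schema text plus referenced schemas keyed by import path. */
    static final class RawDefinition {
        final String name;
        final String text;
        final Map<String, RawDefinition> references;

        RawDefinition(String name, String text, Map<String, RawDefinition> references) {
            this.name = name;
            this.text = text;
            this.references = references;
        }
    }

    /** Returns schema names with referenced schemas listed before the schemas that import them. */
    static List<String> compileOrder(RawDefinition root) {
        List<String> ordered = new ArrayList<>();
        visit(root, new HashSet<>(), ordered);
        return ordered;
    }

    private static void visit(RawDefinition def, Set<String> visited, List<String> ordered) {
        // Skip schemas already seen so shared or circular references are visited once.
        if (!visited.add(def.name)) {
            return;
        }
        for (RawDefinition reference : def.references.values()) {
            visit(reference, visited, ordered);
        }
        ordered.add(def.name);
    }

    public static void main(String[] args) {
        RawDefinition common = new RawDefinition(
                "common.proto", "message Money {}", Map.of());
        RawDefinition order = new RawDefinition(
                "order.proto", "import \"common.proto\"; message Order {}",
                Map.of("common.proto", common));

        // Compilation can happen later, starting from this ordering.
        System.out.println(compileOrder(order));
    }
}
```

   Nothing here is parsed or compiled when the definition is retrieved; the 
reader decides when to hand the texts to the compiler.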


