[ https://issues.apache.org/jira/browse/AVRO-1795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671833#comment-16671833 ]
Chong Wang commented on AVRO-1795:
----------------------------------
I found an example at
http://gisgeek.blogspot.com/2012/12/using-apache-avro-with-python.html that
seems to do what is required.
> Python2: Cannot parse nested schemas
> ------------------------------------
>
> Key: AVRO-1795
> URL: https://issues.apache.org/jira/browse/AVRO-1795
> Project: Avro
> Issue Type: Bug
> Components: python
> Affects Versions: 1.8.0
> Reporter: Jakob Homan
> Assignee: Jakob Homan
> Priority: Major
>
> In the Java client, one can parse nested schemas by loading the nested schema
> before the nesting schema.
> For example, a header can be defined in one file:
> {code:javascript}
> {
>   "namespace": "python.avro",
>   "type": "record",
>   "name": "header",
>   "fields": [
>     { "name": "header_field", "type": "string" }
>   ]
> }
> {code}
> and then included in another schema:
> {code:javascript}
> {
>   "namespace": "python.avro",
>   "type": "record",
>   "name": "event",
>   "fields": [
>     { "name": "header", "type": "python.avro.header" },
>     { "name": "event_field", "type": "string" }
>   ]
> }
> {code}
> As long as one instantiates the Parser and loads the header first, the
> schemas will be reconciled and merged correctly.
> However, the Python client does not support this. The {{parse}} function in
> {{schema.py}} always instantiates a new {{Names}} object to hold the
> schemas:
> {code}
> def parse(json_string):
>     """Constructs the Schema from the JSON text."""
>     # TODO(hammer): preserve stack trace from JSON parse
>     # parse the JSON
>     try:
>         json_data = json.loads(json_string)
>     except:
>         raise SchemaParseException('Error parsing JSON: %s' % json_string)
>     # Initialize the names object
>     names = Names()
>     # construct the Avro Schema object
>     return make_avsc_object(json_data, names)
> {code}
> Some possible fixes for this are:
> 1) Create a separate Parser class to mimic the Schema.Parser Java approach,
> while deprecating the current parse method.
> 2) Make Names a global variable used by the parse method, allowing multiple
> parse calls to populate the same namespace. This breaks current behavior
> (at least one unit test depends on it), so it would not be backwards
> compatible.
> 3) Create a new parse method that accepts a Names instance and returns both
> the schema and that instance. This keeps the code nice and functional but
> exposes the Names class, which had previously not been particularly
> public.
> I like the first approach.
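
To illustrate the first approach, here is a minimal, self-contained sketch of
a parser object that keeps one name registry across several parse calls. It
does not use the avro package: SchemaParser, its names dict, and the HEADER
and EVENT constants are hypothetical names chosen for illustration, and only
records and primitive types are handled.

```python
import json

PRIMITIVES = {"null", "boolean", "int", "long", "float", "double", "bytes", "string"}


class SchemaParser:
    """Toy stand-in for a Java-style Schema.Parser (hypothetical, not avro API)."""

    def __init__(self):
        # Full name -> parsed record; shared by every parse() call on this object.
        self.names = {}

    def parse(self, json_string):
        return self._resolve(json.loads(json_string), namespace=None)

    def _resolve(self, node, namespace):
        if isinstance(node, str):
            if node in PRIMITIVES:
                return node
            # A dotted name is already a full name; otherwise qualify it.
            if "." in node or not namespace:
                full = node
            else:
                full = namespace + "." + node
            if full not in self.names:
                raise ValueError("Unknown type: %s" % full)
            return self.names[full]
        if isinstance(node, dict) and node.get("type") == "record":
            ns = node.get("namespace", namespace)
            full = ns + "." + node["name"] if ns else node["name"]
            record = {"type": "record", "name": full, "fields": []}
            # Register before resolving fields so later schemas can refer to it.
            self.names[full] = record
            for field in node["fields"]:
                record["fields"].append(
                    {"name": field["name"], "type": self._resolve(field["type"], ns)}
                )
            return record
        raise ValueError("Unsupported schema: %r" % (node,))


HEADER = """{ "namespace": "python.avro", "type": "record", "name": "header",
  "fields": [ { "name": "header_field", "type": "string" } ] }"""

EVENT = """{ "namespace": "python.avro", "type": "record", "name": "event",
  "fields": [ { "name": "header", "type": "python.avro.header" },
              { "name": "event_field", "type": "string" } ] }"""

parser = SchemaParser()
header = parser.parse(HEADER)  # must be loaded first
event = parser.parse(EVENT)    # "python.avro.header" now resolves
```

Because the registry lives on the instance, a fresh SchemaParser that is given
only EVENT raises ValueError, which mirrors the behavior of the current
one-shot parse function.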