I am trying to wrap my head around how one goes about working with and editing XML elements, since it feels more complicated than it should be. I would like some feedback on how others approach it, in case I am missing something obvious or wandering off in the wrong direction.

I am looking to interact with elements directly: load them from a template, edit them, then ultimately submit them to an API as a modified XML document.

Consider the following:

from lxml import objectify, etree
schema = etree.XMLSchema(file="path_to_my_xsd_schema_file")
parser = objectify.makeparser(schema=schema, encoding="UTF-8")
xml_obj = objectify.parse("path_to_my_xml_file", parser=parser)
xml_root = xml_obj.getroot()

Let's say I have a Version element that is defined simply as a string in a third-party XSD schema:

<xs:element name="Version" type="xs:string" minOccurs="0">

and is set to a number, <Version>2342</Version>, in my document.

With the code above, the XML file loads successfully against the schema.

But lxml objectify decides the element type is IntElement, and the pyval is an int:

Version <class 'lxml.objectify.IntElement'>
Version.pyval <class 'int'>
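
For what it's worth, the inferred Python type does not replace the underlying text: .text still gives the string as written in the document, while .pyval gives the inferred value. A minimal sketch, using a made-up inline document instead of my real template (Doc and Part are hypothetical names, not from the actual schema):

```python
from lxml import objectify

# Hypothetical inline document standing in for the real template.
root = objectify.fromstring(
    "<Doc><Version>2342</Version><Part>322.1121000</Part></Doc>"
)

for child in root.iterchildren():
    # type(child) is objectify's guess based on the text content;
    # child.text is the raw string exactly as it appears in the XML.
    print(child.tag, type(child).__name__, repr(child.pyval), repr(child.text))

# Raw strings keyed by tag, e.g. for filling entry widgets without
# losing trailing zeros.
raw_values = {child.tag: child.text for child in root.iterchildren()}
```

So for display purposes, reading .text rather than .pyval may be enough to sidestep the type guessing.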

Let's say I want this loaded into a UI with a variety of dynamically loaded entry widgets, so I can edit a large number of values like this one, as well as values of many other types.

I can assign in one of two ways (both with the same result):
xml_root.Version =
xml_root['Version'] =
(If there is some other, more kosher way of assignment, let me know.)

I can assign "2342" and the element suddenly becomes a <class 'lxml.objectify.StringElement'>.

I can assign 1.4 and the element suddenly becomes a <class 'lxml.objectify.FloatElement'>.
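
Since assigning a Python str gives a StringElement, one workaround would be to always convert the widget value to str before assigning. A rough sketch of that idea, where set_text is a hypothetical helper of my own and the inline document is made up:

```python
from lxml import objectify

# Hypothetical inline document standing in for the real template.
root = objectify.fromstring("<Doc><Version>2342</Version></Doc>")


def set_text(parent, name, value):
    """Assign value as a str so the element's text is stored unchanged."""
    setattr(parent, name, str(value))
    return getattr(parent, name)


elem = set_text(root, "Version", "322.1121000")
print(type(elem).__name__, repr(elem.text))
```

The catch is that this still adds the py:pytype annotation to the element, so it does not remove the need to clean up before validating.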

The schema is not checked during this assignment, so the new value can be invalid: assigning "abc" to an xs:dateTime goes through anyway, and the original value is lost. The only way I see to verify against the schema again is to do so explicitly against the whole root:

schema.validate(xml_root)

This returns False because of the added xmlns:py and py:pytype annotations. I can strip those with: objectify.deannotate(xml_root[etree.QName(xml_root.Version.tag).localname], cleanup_namespaces=True)

After that, schema.validate(xml_root) returns True again. BUT it returns True whether the element is a String, Int, Float, etc. (so long as the text could potentially be a string). So let's say a Version is 322.1121000: it should be a string, and it validates against the schema as a string, but it is now 322.1121 (much more relevant for something like a product identification number).
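
The assign, deannotate, validate sequence described above could be wrapped in one helper that also restores the old text when the schema rejects the change. This is only a sketch under my own assumptions: the inline XSD and the Doc document are made up for illustration, and set_validated is a hypothetical helper, not part of lxml.

```python
import copy

from lxml import etree, objectify

# Hypothetical miniature schema standing in for the real third-party XSD.
XSD = b"""
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Doc">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Version" type="xs:string" minOccurs="0"/>
        <xs:element name="StartTime" type="xs:dateTime" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""
schema = etree.XMLSchema(etree.fromstring(XSD))


def set_validated(root, name, text, schema):
    """Assign text to child `name`; put the old text back if validation fails."""
    old = getattr(root, name).text or ""
    setattr(root, name, str(text))
    # Validate a stripped copy so the working tree keeps its annotations.
    candidate = copy.deepcopy(root)
    objectify.deannotate(candidate, cleanup_namespaces=True)
    if schema.validate(candidate):
        return True, []
    errors = [entry.message for entry in schema.error_log]
    setattr(root, name, old)  # roll back to the previous text
    return False, errors


root = objectify.fromstring(
    "<Doc><Version>2342</Version><StartTime>2020-01-01T00:00:00</StartTime></Doc>"
)
bad_ok, bad_errors = set_validated(root, "StartTime", "asdfasdfa", schema)
good_ok, good_errors = set_validated(root, "Version", "322.1121000", schema)
```

Copying and validating the whole tree on every edit is probably too slow for a large document, so this may only be practical when validation runs on focus-out or on save rather than on every keystroke.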

If validation still returns False, I then have to manually look at the error log via schema.error_log for something like this:

api_files/Basic:0:0:ERROR:SCHEMASV:SCHEMAV_CVC_DATATYPE_VALID_1_2_1: Element '{nsstuff}StartTime': 'asdfasdfa' is not a valid value of the atomic type 'xs:dateTime'.
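
Rather than parsing that line as text, the entries in schema.error_log can be read field by field (type_name, line, message). A small sketch, again with a made-up inline schema and document, where collect_problems is a hypothetical helper:

```python
from lxml import etree

# Hypothetical miniature schema standing in for the real third-party XSD.
XSD = b"""
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Doc">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="StartTime" type="xs:dateTime" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""
schema = etree.XMLSchema(etree.fromstring(XSD))
doc = etree.fromstring(b"<Doc><StartTime>asdfasdfa</StartTime></Doc>")


def collect_problems(schema, doc):
    """Return one dict per schema error, or an empty list if doc is valid."""
    if schema.validate(doc):
        return []
    return [
        {"type": entry.type_name, "line": entry.line, "message": entry.message}
        for entry in schema.error_log
    ]


problems = collect_problems(schema, doc)
for problem in problems:
    print(problem["type"], problem["line"], problem["message"])
```

A UI could then show problem["message"] next to the offending widget instead of a raw log line.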

Then I have to consider how I should reject the user's input. From a UI design standpoint it just seems like a lot of added steps, and redundant work on top of an object layer that doesn't really do anything other than give me a thumbs up on the way in and a thumbs up on the way out. Rather than interacting with an object that can say from the get-go whether my change is schema approved, I instead seem to have to parse 100000+ lines of XSD and design the UI interaction much more situationally and granularly, to assert types and corner cases, preserve original values in duplicate structures, etc.

The features I originally assumed XML tooling would have don't seem to exist, from what I have found so far. I assumed the schema would be the law: if my schema says something should be loaded as a string, it should be a string (or something close enough, definitely not an int or float), and attempting to assign something to it that doesn't match the schema should be denied or throw an error. I am sure that under the hood it would probably have performance drawbacks or something. Oh well. Back to contemplating and tinkering.
--
https://mail.python.org/mailman/listinfo/python-list
