working with xml/lxml/objectify/schemas, datatypes, and assignments

aapost Tue, 03 Jan 2023 20:18:01 -0800

I am trying to wrap my head around how one goes about working with and editing xml elements, since it feels more complicated than it seems it should be. I'd like some feedback on how others might approach it, to see if I am missing anything obvious that I haven't discovered yet, since maybe I am wandering off in a wrong way of thinking.

I am looking to interact with elements directly: load them from a template, edit them, then ultimately submit them to an API as a modified xml document.


Consider the following:

from lxml import objectify, etree
schema = etree.XMLSchema(file="path_to_my_xsd_schema_file")
parser = objectify.makeparser(schema=schema, encoding="UTF-8")
xml_obj = objectify.parse("path_to_my_xml_file", parser=parser)
xml_root = xml_obj.getroot()

Let's say I have a Version element that is defined simply as a string in a 3rd party provided xsd schema:


<xs:element name="Version" type="xs:string" minOccurs="0"/>

and is set to a number <Version>2342</Version> in my document

The xml file loads with the above code successfully against the schema

But lxml objectify decides the element type is IntElement, and its pyval is an int:

Version <class 'lxml.objectify.IntElement'>
Version.pyval <class 'int'>
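
A minimal sketch of what appears to be happening, assuming a made-up <Doc> wrapper and a hypothetical Build element, with no schema involved at all: objectify guesses the Python type from the element's text, while the original string stays reachable through .text.

```python
from lxml import objectify

# Hypothetical stand-in document; the real one comes from the API template.
root = objectify.fromstring(
    b"<Doc><Version>2342</Version><Build>322.1121000</Build></Doc>"
)

# The element class is inferred from the text content, not from any XSD type.
print(type(root.Version).__name__)
print(type(root.Version.pyval).__name__)

# .pyval is the guessed Python value; .text is the untouched source string.
print(root.Build.pyval)
print(root.Build.text)
```

So for anything that must round-trip exactly (like an identifier), reading .text rather than .pyval looks like the safer choice.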

Let's say I want this loaded into a UI with a variety of dynamically loaded entry widgets so I can edit a large number of values like this and of many other different types.


I can assign in one of two ways (both with the same result):
xml_root.Version =
xml_root['Version'] =
(if there is some other more kosher way of assignment, let me know)

I can assign "2342" and the element suddenly becomes a <class 'lxml.objectify.StringElement'>

I can assign 1.4 and the element suddenly becomes a <class 'lxml.objectify.FloatElement'>
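
A small sketch of that behaviour, again with a made-up <Doc> wrapper: the element class seems to follow the Python type of whatever is on the right-hand side, and the assignment is what adds the py:pytype annotation.

```python
from lxml import etree, objectify

# Hypothetical stand-in document.
root = objectify.fromstring(b"<Doc><Version>2342</Version></Doc>")
before = type(root.Version).__name__

root.Version = "2342"  # assigning a str
after_str = type(root.Version).__name__
# Serialize to see the annotation that the assignment added.
annotated = etree.tostring(root).decode()

root.Version = 1.4  # assigning a float
after_float = type(root.Version).__name__

print(before, after_str, after_float)
print(annotated)
```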

The schema is not checked during this assignment, so the value could be invalid (like assigning "abc" to an xs:dateTime) and the assignment goes through anyway. The original value is lost. The only way I see to verify against the schema again is to do so explicitly against the whole root:


schema.validate(xml_root)

This returns False because of the added xmlns:py, py:pytype stuff. I can strip those with:

objectify.deannotate(xml_root[etree.QName(xml_root.Version.tag).localname], cleanup_namespaces=True)

and get back to schema.validate(xml_root) returning True. BUT, it validates True whether the element is a String, Int, Float, etc. (so long as it 'could' potentially be a string or something). So let's say a Version is 322.1121000: it should be a string and validates against the schema as a string, but is now 322.1121 (much more relevant for something like a product identification number).

If it is a case where the validation remains False, I then have to manually look at the error log via schema.error_log for something like this:

api_files/Basic:0:0:ERROR:SCHEMASV:SCHEMAV_CVC_DATATYPE_VALID_1_2_1: Element '{nsstuff}StartTime': 'asdfasdfa' is not a valid value of the atomic type 'xs:dateTime'.
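
Rather than splitting that line by hand, the log entries can be read as objects. A sketch, assuming a tiny hypothetical schema in place of the real 3rd party XSD; each entry in schema.error_log carries the fields of the line above as attributes such as line, type_name and message.

```python
from lxml import etree

# Hypothetical one-element schema standing in for the 3rd party XSD.
XSD = b"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Doc">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="StartTime" type="xs:dateTime" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

schema = etree.XMLSchema(etree.fromstring(XSD))
doc = etree.fromstring(b"<Doc><StartTime>asdfasdfa</StartTime></Doc>")

is_valid = schema.validate(doc)

# Collect the parts of each log entry that a UI might want to show.
problems = [(e.line, e.type_name, e.message) for e in schema.error_log]
for line, type_name, message in problems:
    print(line, type_name, message)
```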

Then I have to consider how I should reject the user's input. From a UI design standpoint it just seems like a lot of added steps, and redundant work on top of an object layer that doesn't really do anything other than give me a thumbs up on the way in and a thumbs up on the way out. Rather than interacting with an object that can say whether my change is schema approved or not from the get-go, I instead seem to have to parse 100000+ lines of xsd and design the UI interaction much more situationally and granularly, to assert types and corner cases and preserve original values in duplicate structures, etc.
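
One way those steps could be bundled is sketched below. It is only an illustration under stated assumptions: the schema and document are made-up stand-ins, assign_checked is a hypothetical helper (not part of lxml), and it only handles existing leaf elements that have text and no attributes to preserve. It assigns the new value as a string, strips the annotations, validates the whole root, and puts the old text back if validation fails.

```python
from lxml import etree, objectify

# Hypothetical schema and document standing in for the real files.
XSD = b"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Doc">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Version" type="xs:string" minOccurs="0"/>
        <xs:element name="StartTime" type="xs:dateTime" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""
XML = b"<Doc><Version>2342</Version><StartTime>2023-01-03T20:18:01</StartTime></Doc>"

schema = etree.XMLSchema(etree.fromstring(XSD))
root = objectify.fromstring(XML, parser=objectify.makeparser(schema=schema))


def assign_checked(root, name, new_text):
    """Hypothetical helper: set a leaf child's text, roll back if invalid.

    Returns (ok, reason); reason is the joined schema error messages.
    """
    old_text = getattr(root, name).text  # keep the original string
    setattr(root, name, str(new_text))   # always assign as a string
    objectify.deannotate(root, cleanup_namespaces=True)
    if schema.validate(root):
        return True, ""
    reason = "; ".join(entry.message for entry in schema.error_log)
    setattr(root, name, old_text)        # restore the previous text
    objectify.deannotate(root, cleanup_namespaces=True)
    return False, reason


print(assign_checked(root, "StartTime", "asdfasdfa"))
print(assign_checked(root, "Version", "322.1121000"))
print(root.StartTime.text, root.Version.text)
```

Validating the whole root on every edit is the simple option, not the fast one, and reading values back through .text sidesteps the int/float guessing.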

My original assumptions when hearing about xml features don't seem to hold, from what I have found so far. I assumed the schema would be the law: if my schema says something should be loaded as a string, it should be a string (or something close enough, definitely not an int or float), and attempting to assign something to it that doesn't match the schema should be denied or throw an error. I am sure under the hood it would probably have performance drawbacks or something. Oh well. Back to contemplating and tinkering.

--
https://mail.python.org/mailman/listinfo/python-list
