> I'm not sure what exactly you're asking for.
> Especially "is not being interpreted as a string requiring base64 encoding" is
> written without giving the right context.
>
> So I'm just guessing that this might be the usual misunderstandings with use
> of base64 in LDIF. Read more about when LDIF requires base64-encoding here:
>
> http://tools.ietf.org/html/rfc2849
>
> To me everything looks right:
>
> Python 2.7.3 (default, Apr 14 2012, 08:58:41) [GCC] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> 'ZGV0XDMzMTB3YmJccGc='.decode('base64').decode('utf-8')
> u'det\\3310wbb\\pg'
> >>>
>
> What do you think is a problem?

Michael,

Thanks for the reply. I am sure the issue is in my code. I read the LDIF source file and end up with values such as 'det\3310wbb\pg' after the base64-encoded entries are decoded. The problem comes when I add such a value to an add/mod entry list and write it back out. Because it is not re-encoded to base64, the LDIF file ends up with a plain-text entry containing a ^] character. If I re-read that file with the parser, the handle method breaks midway through the entry dict, and the last half reappears disjoint, without a dn.

Like I said, I am pretty sure it's my poor understanding of decoding and encoding. I am using the build from http://www.lfd.uci.edu/~gohlke/pythonlibs/ on a Windows Server 2008 R2 machine.

I have re-implemented handle to create a cidict holding all the dn/entry pairs that are parsed, since I then do some processing such as manipulating attribute values in the entry dict. I am pretty sure I am breaking things here. The data I am reading comes from utf-16-le encoded files and contains Unicode characters, as the source directory is globally available and is written to from just about every country.

Is there a process I should adhere to for manipulating/adding data in the entry dict before I write it out? For example, if I am adding a new attribute composed of part of another parsed attribute for use in a modlist:

{'customAttr': ['foo.{}.bar'.format(entry['uid'][0])]}

Looking at the value above, 'det\3310wbb\pg', I gather the entry dict was parsed into byte strings. Should I have decoded these first and then, since some of the data is Unicode, encoded everything again before writing?

I really appreciate the time.

Thanks for everything,
jlc
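
P.S. To make the question concrete, here is a minimal sketch of the round trip I think I should be doing: decode the parsed byte strings to Unicode, build the new attribute, then encode everything back to UTF-8 before writing. The helper names (decode_entry, encode_entry, needs_base64) are my own and hypothetical, not part of python-ldap, and the base64 check is only my reading of the SAFE-STRING rule in RFC 2849, not a description of what any particular LDIF writer does. It assumes every attribute holds text; binary attributes would have to be left alone. I tried to avoid anything specific to Python 2 or 3.

```python
import base64


def decode_entry(entry, encoding="utf-8"):
    """Return a copy of the entry with every byte-string value decoded to text.

    Assumes all attributes are textual; binary ones would need to be skipped.
    """
    return {attr: [v.decode(encoding) for v in values]
            for attr, values in entry.items()}


def encode_entry(entry, encoding="utf-8"):
    """Return a copy of the entry with every text value encoded back to bytes."""
    return {attr: [v.encode(encoding) for v in values]
            for attr, values in entry.items()}


def needs_base64(value):
    """My reading of RFC 2849: True if the bytes are not a SAFE-STRING.

    Unsafe: a leading space, colon or '<'; a trailing space; or any
    NUL, LF, CR or byte above 127 anywhere in the value.
    """
    if not value:
        return False
    if value[:1] in (b" ", b":", b"<") or value[-1:] == b" ":
        return True
    # bytearray yields integers on both Python 2 and Python 3.
    return any(b in (0, 10, 13) or b > 127 for b in bytearray(value))


# The value from the session quoted above, decoded from base64 to raw bytes.
raw = base64.b64decode(b"ZGV0XDMzMTB3YmJccGc=")

# A made-up entry in the shape the parser hands to handle(): lists of bytes.
entry = {"uid": [b"jdoe"], "cn": [u"Jos\u00e9".encode("utf-8")]}

text = decode_entry(entry)                                  # bytes -> text
text["customAttr"] = [u"foo.{}.bar".format(text["uid"][0])]  # work in text
out = encode_entry(text)                                    # text -> UTF-8 bytes
```

With this, out['customAttr'] is a list holding the UTF-8 bytes for foo.jdoe.bar, and needs_base64 flags the non-ASCII cn value but not the plain ASCII ones. Is that the right order of operations, or should the values be handled differently before they go into the modlist?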