This is driving me batty... more enjoyment with the Python3 "Everything must be bytes" thing... sigh... I have a file that contains a class used by other scripts. The class is fed either a file, or a stream of output from another command, then interprets that output and returns a set that the main program can use... confusing, perhaps, but not necessarily important.
The class is created and then called with the load_filename method:

    def load_filename(self, filename):
        logging.info("Loading elements from filename: %s", filename)
        file = open(filename, "rb", encoding="utf-8")
        return self.load_file(file, filename)

As you can see, this calls the load_file method, passing the filehandle and filename (in common use, filename is actually an IOStream object). load_file starts out like this:

    def load_file(self, file, filename="<stream>"):
        elements = []
        for string in self._reader(file):
            if not string:
                break
            element = {}

Note that it now calls the private _reader(), passing the filehandle further in. THIS is where I'm failing. This is the private _reader function:

    def _reader(self, file, size=4096, delimiter=r"\n{2,}"):
        buffer_old = ""
        while True:
            buffer_new = file.read()
            print(type(buffer_new))
            if not buffer_new:
                break
            lines = re.split(delimiter, buffer_old + buffer_new)
            buffer_old = lines.pop(-1)
            for line in lines:
                yield line
        yield buffer_old

(The print statement is something I put in to verify the problem.)

So stepping through this: when _reader executes, it calls read() on the opened filehandle. Originally it read in 4096-byte chunks; I removed that to test a theory. It creates buffer_new with the output of the read. Running type() on buffer_new tells me that it's a bytes object. However, no matter which of these I try:

    file.read().decode()
    buffer_new.decode() in the lines = re.split() statement
    buffer_str = buffer_new.decode()

I always get a traceback telling me that the str object has no decode() method. If I remove the decode attempts, I get a traceback telling me that it can't implicitly convert a bytes object to a str object. So I'm stuck in a vicious circle and can't see a way out.
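For reference, here is a minimal sketch of a reader that tolerates both kinds of input. It assumes the stream may hand back either bytes or str from read(), and that the data is UTF-8. The name read_records and the encoding parameter are hypothetical and not part of the checkbox code. It keeps the same split-on-blank-lines logic as _reader above, but decodes only when a chunk is actually bytes, using an incremental decoder so that a multi-byte character split across two chunks is not mangled.

```python
import codecs
import io
import re


def read_records(stream, size=4096, delimiter=r"\n{2,}", encoding="utf-8"):
    """Yield blank-line-separated records from a text or binary stream.

    Hypothetical stand-in for _reader(); `encoding` is an assumption.
    """
    # The incremental decoder buffers incomplete multi-byte sequences
    # until the rest of the character arrives in a later chunk.
    decoder = codecs.getincrementaldecoder(encoding)()
    pending = ""
    while True:
        chunk = stream.read(size)
        if not chunk:
            break
        # Binary streams return bytes, text streams return str.
        if isinstance(chunk, bytes):
            chunk = decoder.decode(chunk)
        parts = re.split(delimiter, pending + chunk)
        # The last piece may be an incomplete record: keep it for later.
        pending = parts.pop()
        yield from parts
    # Flush the decoder (raises if the input ended mid-character).
    pending += decoder.decode(b"", final=True)
    yield pending


sample = "a: 1\nb: 2\n\nc: 3\n"
from_bytes = list(read_records(io.BytesIO(sample.encode("utf-8")), size=5))
from_text = list(read_records(io.StringIO(sample), size=5))
print(from_bytes == from_text)
```

Like the original, this yields whatever is left in the buffer at end of input, so the caller still needs its own check for an empty final string.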
Here are sample error messages. When using the decode() method to attempt to convert the bytes object:

    Traceback (most recent call last):
      File "./filter_templates", line 134, in <module>
        sys.exit(main(sys.argv[1:]))
      File "./filter_templates", line 126, in main
        options.whitelist, options.blacklist)
      File "./filter_templates", line 77, in parse_file
        matches = match_elements(template.load_file(file), *args, **kwargs)
      File "/usr/lib/python3/dist-packages/checkbox/lib/template.py", line 73, in load_file
        for string in self._reader(file):
      File "/usr/lib/python3/dist-packages/checkbox/lib/template.py", line 35, in _reader
        lines = re.split(delimiter, buffer_old + buffer_new.decode())
    AttributeError: 'str' object has no attribute 'decode'

It's telling me that buffer_new is a str object. So if I remove the decode():

    Traceback (most recent call last):
      File "./run_templates", line 142, in <module>
        sys.exit(main(sys.argv[1:]))
      File "./run_templates", line 137, in main
        runner.process(args, options.shell)
      File "./run_templates", line 39, in process
        records = self.process_output(process.stdout)
      File "./run_templates", line 88, in process_output
        return template.load_file(output)
      File "/usr/lib/python3/dist-packages/checkbox/lib/template.py", line 73, in load_file
        for string in self._reader(file):
      File "/usr/lib/python3/dist-packages/checkbox/lib/template.py", line 35, in _reader
        lines = re.split(delimiter, buffer_old + buffer_new)
    TypeError: Can't convert 'bytes' object to str implicitly

Now it's complaining that buffer_new is a bytes object and can't be implicitly converted to str. This is a bug introduced in our conversion from Python 2 to Python 3.

I am really, really starting to dislike some of the things Python 3 does... or I'm just really, really frustrated.
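One possible reading of the two tracebacks, offered as an assumption rather than a confirmed diagnosis: they come from two different scripts (./filter_templates and ./run_templates), so load_file may be receiving a text-mode file in one case and a binary pipe (process.stdout) in the other. The sketch below uses hypothetical helper names and in-memory streams to show that each error appears with only one kind of stream, which would explain why neither "always decode" nor "never decode" works for both callers.

```python
import io


def join_chunk(stream, decode):
    """Mimic the failing line: read once and concatenate onto a str buffer."""
    chunk = stream.read()
    if decode:
        chunk = chunk.decode()  # only bytes objects have decode()
    return "" + chunk           # str + bytes is not allowed in Python 3


def error_name(stream, decode):
    """Return the name of the exception raised, or None if none was."""
    try:
        join_chunk(stream, decode)
    except (AttributeError, TypeError) as exc:
        return type(exc).__name__
    return None


# Text stream (like a file opened in text mode): read() returns str.
text_with_decode = error_name(io.StringIO("name: x\n"), decode=True)
text_without_decode = error_name(io.StringIO("name: x\n"), decode=False)

# Binary stream (like a subprocess pipe): read() returns bytes.
binary_with_decode = error_name(io.BytesIO(b"name: x\n"), decode=True)
binary_without_decode = error_name(io.BytesIO(b"name: x\n"), decode=False)

print(text_with_decode, text_without_decode)
print(binary_with_decode, binary_without_decode)
```

If that reading is right, the choice is between normalising the input inside _reader and making every caller pass the same kind of stream, for example by wrapping a binary pipe in io.TextIOWrapper before handing it to load_file.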