On 12/29/14, 7:40 PM, Craig Ringer wrote:
On 12/30/2014 06:39 AM, Jim Nasby wrote:


How much of this issue is caused by trying to machine-parse log files?
Is a better option to improve that case, possibly doing something like
including a field in each line that tells you the encoding for that entry?

That'd be absolutely ghastly. You couldn't just view the logs with
'less' or a text editor if your logs had mixed encodings, you'd need
some kind of special PostgreSQL log viewer tool.

I was specifically talking about logs intended for machine reading (i.e., CSV), 
not human reading.

Just as client logging (where encoding is a lot more important) and 
server logging aren't exactly the same use case, human-readable logs and logs 
meant for a machine to read aren't the same thing either.

BTW, before someone makes an argument for using tools like cut or grep with 
CSV, that actually falls apart spectacularly at the first multi-line log 
message. I think that's just another example that trying to make one logfile 
serve two different purposes just won't work well.
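
To illustrate the multi-line problem, here is a minimal sketch. The three-column layout and the sample messages are made up for brevity (real csvlog output has many more columns), and the helper names are hypothetical. It compares a line-oriented split, which is roughly what cut or grep see, with a real CSV parser:

```python
import csv
import io

# Hypothetical two-record sample; the second message contains an embedded newline.
SAMPLE = (
    '2014-12-29 19:40:00 UTC,LOG,"connection received"\n'
    '2014-12-29 19:40:01 UTC,ERROR,"syntax error at or near ""FROM""\n'
    'LINE 1: SELECT FROM t"\n'
)


def naive_records(text):
    """Treat every physical line as one record, as cut or grep would."""
    return [line.split(",") for line in text.splitlines()]


def csv_records(text):
    """Parse with a real CSV reader, which honors newlines inside quotes."""
    return list(csv.reader(io.StringIO(text, newline="")))


naive = naive_records(SAMPLE)
parsed = csv_records(SAMPLE)

# The line-oriented view sees one more "record" than actually exists.
print(len(naive), len(parsed))
```

The line-oriented view cuts the second record in half, so any filter on a column index silently operates on a fragment of the message.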

Perhaps the solution here is to include a tool that makes it easier to deal 
with CSV logs, including encoding. I've certainly wished for such a tool to 
allow me to effectively deal with CSV logs in a way that didn't necessitate 
loading them into a table.
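
As a rough idea of what I mean, something like the sketch below would already cover a lot. Everything here is an assumption for illustration: the function name filter_log, the three-column sample, and the choice to decode the whole buffer up front and replace undecodable bytes rather than fail.

```python
import csv
import io


def filter_log(raw, column, wanted, encoding="utf-8"):
    """Return the CSV records whose field at index `column` equals `wanted`.

    Bytes that are invalid in `encoding` become U+FFFD instead of raising,
    so one badly encoded entry does not make the whole file unreadable.
    """
    text = raw.decode(encoding, errors="replace")
    reader = csv.reader(io.StringIO(text, newline=""))
    return [row for row in reader if len(row) > column and row[column] == wanted]


# Hypothetical sample: the ERROR entry spans two lines and holds a stray byte.
raw = (
    b'1,LOG,"checkpoint starting"\n'
    b'2,ERROR,"bad value \xff\nDETAIL: second line"\n'
    b'3,LOG,"checkpoint complete"\n'
)

errors = filter_log(raw, column=1, wanted="ERROR")
for row in errors:
    print(row)
```

This only handles a file in a single encoding; a file with mixed encodings would need per-entry information that the log doesn't carry today.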

Why would we possibly do that when we could just emit utf-8 instead?

What happens if we get a translation/encoding failure (the case Tom's worried 
about)?
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
