Mozilla Charset Detectors

2017-05-22 Thread Gabriel Sandor
Greetings, I recently came across the Mozilla Charset Detectors tool, at https://www-archive.mozilla.org/projects/intl/chardet.html. I'm working on a C# project where I could use a port of this library (e.g. https://github.com/errepi/ude) for advanced charset detection. I'm not sure, however, if th

Re: Mozilla Charset Detectors

2017-05-23 Thread Gabriel Sandor
e C/C++ libraries, but in theory they can be wrapped into a managed C++.NET assembly and consumed by a C# project. I haven't yet seen any existing C# ports that also handle charset detection. On Mon, May 22, 2017 at 5:49 PM, Henri Sivonen wrote: > On Mon, May 22, 2017 at 12:13 PM, Gabriel Sandor

Re: Mozilla Charset Detectors

2017-05-30 Thread Gabriel Sandor
They can come from arbitrary sources that are out of my control. Therefore I may not get the charset of the original document, so all I'm left with is heuristic detection for those fragments. The application must be able to deal with any XML it receives; it doesn't impose any particular structure o