There is a Wikipedia article on this, and it turns out it is not 
straightforward. A file can begin with a Byte Order Mark (BOM), but not all 
vendors write one. And I do not think you can make the determination with 
certainty simply by examining the contents of the file. 
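
That said, inspecting the bytes can at least narrow things down. Here is a rough sketch in Python, used only as a neutral illustration and not as LiveCode syntax; the function name classify_bytes is made up for this example. The assumptions are: a file with no byte above 0x7F is plain ASCII (and reads the same either way), a file with high bytes that decodes strictly as UTF-8 is very likely UTF-8, and anything else is some other encoding. It is a heuristic, not a proof.

```python
def classify_bytes(data: bytes) -> str:
    """Guess 'ascii', 'utf-8', or 'unknown' from raw file bytes (heuristic only)."""
    # Every ASCII byte is below 0x80, and pure ASCII is also valid UTF-8.
    if all(b < 0x80 for b in data):
        return "ascii"
    # High bytes are present: check whether they form valid UTF-8 sequences.
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "unknown"  # e.g. Latin-1, MacRoman, or binary data
    return "utf-8"


if __name__ == "__main__":
    print(classify_bytes(b"plain text"))
    print(classify_bytes("caf\u00e9".encode("utf-8")))
    print(classify_bytes(b"caf\xe9"))
```

If the result is "ascii", it should not matter which form of "open file" is used; if it is "utf-8", the UTF8 form is the likely choice.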

Byte-order mark
If the Unicode byte-order mark U+FEFF is at the start of a UTF-8 file, the 
first three bytes will be 0xEF, 0xBB, 0xBF.
The Unicode Standard neither requires nor recommends the use of the BOM for 
UTF-8, but warns that it may be encountered at the start of a file transcoded 
from another encoding.[23] While ASCII text encoded using UTF-8 is backward 
compatible with ASCII, this is not true when Unicode Standard recommendations 
are ignored and a BOM is added. A BOM can confuse software that isn't prepared 
for it but can otherwise accept UTF-8, e.g. programming languages that permit 
non-ASCII bytes in string literals but not at the start of the file. 
Nevertheless, there was and still is software that always inserts a BOM when 
writing UTF-8, and refuses to correctly interpret UTF-8 unless the first 
character is a BOM (or the file only contains ASCII).[24]
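
The BOM check described in the excerpt is simple to sketch. The following is Python for illustration only, and the helper names has_utf8_bom and strip_utf8_bom are invented for this example. It assumes the file has been read as raw bytes. Note that a missing BOM says nothing either way, since most UTF-8 files are written without one.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """True if the data starts with the UTF-8 BOM bytes 0xEF, 0xBB, 0xBF."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the data without a leading UTF-8 BOM, if one is present."""
    if has_utf8_bom(data):
        return data[len(codecs.BOM_UTF8):]
    return data


if __name__ == "__main__":
    sample = codecs.BOM_UTF8 + "text".encode("utf-8")
    print(has_utf8_bom(sample))
    print(strip_utf8_bom(sample))
```

Python's "utf-8-sig" codec does the same stripping while decoding, so it is an alternative when the goal is just to get clean text.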

https://en.wikipedia.org/wiki/UTF-8#

Bob S


> On Oct 29, 2024, at 1:53 AM, jbv via use-livecode 
> <use-livecode@lists.runrev.com> wrote:
> 
> Hi list,
> 
> How to determine if a text file is UTF8 or just plain ASCII ?
> In other words, how to know if one should use
>  open file myfile.txt for UTF8 read
> or
>  open file myfile.txt for read
> 
> Thank you.
> jbv
> 
> _______________________________________________
> use-livecode mailing list
> use-livecode@lists.runrev.com
> Please visit this url to subscribe, unsubscribe and manage your subscription 
> preferences:
> http://lists.runrev.com/mailman/listinfo/use-livecode


