|
Re: Determine Encoding Type [message #134628 is a reply to message #133033] |
Thu, 25 August 2005 17:13 |
andrew again
Messages: 2577 Registered: March 2000
|
Senior Member |
|
|
Windows supports big-endian and little-endian Unicode (UTF-16) files (WordPad > Save As). The first 2 bytes of the file, the byte order mark (BOM), indicate which one it is: FE FF for big-endian, FF FE for little-endian.
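As a rough illustration of checking those first two bytes, here is a minimal Python sketch. The function name detect_utf16_bom is just something I made up for the example; the BOM constants come from the standard codecs module.

```python
import codecs

def detect_utf16_bom(data: bytes):
    """Return 'utf-16-le' or 'utf-16-be' if data starts with a UTF-16 BOM, else None.

    Caveat: a UTF-32 little-endian BOM (FF FE 00 00) also starts with FF FE,
    so this check alone cannot tell the two apart.
    """
    if data.startswith(codecs.BOM_UTF16_LE):  # b'\xff\xfe'
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):  # b'\xfe\xff'
        return "utf-16-be"
    return None
```

You would normally pass in just the first few bytes of the file, e.g. open(path, "rb").read(2).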
If there is no marker in the file then I don't see how you could reliably interpret the contents (assuming the file extension doesn't tell you). The "file" utility on Unix makes a best guess at the content of a file, so you can try that. If the file is 7-bit ASCII, you won't find any bytes with values above 127, whereas you would in something like ISO-8859-1. ISO-8859-1 and ISO-8859-15 have no printable characters in the range 128-159 (it is reserved for control codes), whereas Windows (cp1252) text could have bytes in that range. Without a byte order mark in the file, or knowing what codepage it was written in, I don't think you can tell for certain. Some operating systems store additional attributes about a file, which could include something like the codepage.
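The byte-range checks above could be sketched in Python roughly like this. It is only a heuristic under the assumptions in this post (the file is ASCII or one of the single-byte encodings mentioned), and the function name and return labels are made up for illustration.

```python
def guess_single_byte_encoding(data: bytes) -> str:
    """Best-guess label based only on which byte ranges appear in data."""
    # No byte above 127: consistent with plain 7-bit ASCII.
    if all(b < 128 for b in data):
        return "ascii"
    # Bytes in 128-159 are control codes in ISO-8859-1/15 but mostly
    # printable characters in cp1252, so their presence hints at cp1252.
    if any(128 <= b <= 159 for b in data):
        return "cp1252-likely"
    # Only bytes in 160-255: valid in all three, so it cannot be decided.
    return "ambiguous"
```

For example, bytes 0x93 and 0x94 (curly quotes in cp1252) push the guess toward cp1252, while a lone 0xE9 fits all three encodings equally well and stays ambiguous.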