Problems with charset when reading a file. How to fix it?


The file input, in Jade:

input#upload(type='file', accept="text/xml, .csv")

and I read it in JS:

var file = document.getElementById('upload').files[0];
var reader = new FileReader();
reader.onloadend = function(e){
    var file = e.target.result;
};
reader.readAsBinaryString(file);

I get this line:

"mail;name;ТеÑÑ"

where ТеÑÑ, the last element in the file, is a Russian word.

How can I fix the charset?

1 Answer

The symptom is clear: you are (inadvertently) taking UTF-8 content (judging by your tag) and passing it through something that presents it as a different, non-UTF-8 encoding, hence the mojibake.
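
As a rough illustration of that symptom, the sketch below assumes the file is UTF-8 and that the garbled word was something like "Тест" (an assumption; the original word is not given). readAsBinaryString hands back one character per byte, so multi-byte UTF-8 sequences show up as pairs of Latin-1-looking characters. The helper names decodeBinaryStringAsUtf8 and toBinaryString are hypothetical, introduced only for this example. In the browser, calling reader.readAsText(file, 'UTF-8') instead of readAsBinaryString is probably the more direct fix.

```javascript
// Hypothetical helper: reinterpret a "binary string" (one char per byte,
// the shape FileReader.readAsBinaryString returns) as UTF-8 text.
function decodeBinaryStringAsUtf8(binaryString) {
  const bytes = new Uint8Array(binaryString.length);
  for (let i = 0; i < binaryString.length; i++) {
    bytes[i] = binaryString.charCodeAt(i) & 0xff; // keep the low byte only
  }
  return new TextDecoder('utf-8').decode(bytes);
}

// Hypothetical helper: simulate what readAsBinaryString would produce
// for a UTF-8 encoded file containing `text`.
function toBinaryString(text) {
  const bytes = new TextEncoder().encode(text); // TextEncoder always emits UTF-8
  let out = '';
  for (const b of bytes) {
    out += String.fromCharCode(b);
  }
  return out;
}

// Assumed sample line; the Cyrillic word is a guess at the original.
const garbled = toBinaryString('mail;name;Тест');
const repaired = decodeBinaryStringAsUtf8(garbled);
console.log(repaired);
```

If the file turns out not to be UTF-8 (for example windows-1251), the same approach should still apply with a different TextDecoder label, but that is outside what the question shows.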

Make sure that every step the content passes through is UTF-8 clean or preserves the original content exactly, byte for byte. That includes setting Content-Type headers appropriately (likely: text/html; charset=utf-8).

This precise issue is why it is recommended to use UTF-8 for all the things. Set up your DBs to use UTF-8, set up your webserver to serve UTF-8, keep your source code in UTF-8, set up your editors to save in UTF-8 by default, and set up your HTTP headers and meta tags to advertise UTF-8. Reject anything that is not UTF-8, or transcode it where feasible. Anything that is not UTF-8 is just asking for trouble.
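
One way to act on the "reject anything that is not UTF-8" advice on the client is a strict decode. The following is a minimal sketch, not a definitive implementation: isValidUtf8 is a hypothetical helper name, and it assumes the uploaded bytes are available as a Uint8Array (for instance from FileReader.readAsArrayBuffer). It relies on the standard TextDecoder option fatal: true, which makes decode throw on malformed input rather than silently substituting replacement characters.

```javascript
// Hypothetical helper: returns true only if `bytes` is well-formed UTF-8.
function isValidUtf8(bytes) {
  try {
    // With fatal: true, decode() throws a TypeError on invalid sequences.
    new TextDecoder('utf-8', { fatal: true }).decode(bytes);
    return true;
  } catch (err) {
    return false;
  }
}

// Well-formed UTF-8 bytes (assumed sample content).
const okBytes = new TextEncoder().encode('mail;name;Тест');
// "ma" followed by 0xD0, the first byte of a two-byte sequence, cut short.
const truncatedBytes = new Uint8Array([0x6d, 0x61, 0xd0]);

console.log(isValidUtf8(okBytes), isValidUtf8(truncatedBytes));
```

A check like this only says whether the bytes could be UTF-8; it cannot tell which legacy encoding a rejected file actually uses.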

Why standardise on UTF-8, you ask? Because its low 7-bit range is identical to ASCII, which can make a whole world of difference in interoperability with broken or legacy things that don't really understand much else.
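
The ASCII overlap can be checked directly. This small sketch (sample strings are assumptions taken from the line in the question) encodes an ASCII-only string as UTF-8 and compares each byte to the character code, then shows that a Cyrillic letter uses only bytes outside the 7-bit range.

```javascript
// ASCII-only part of the sample line from the question.
const asciiText = 'mail;name;';
const asciiAsUtf8 = new TextEncoder().encode(asciiText);

// Every UTF-8 byte equals the ASCII code of the matching character.
const sameAsAscii =
  asciiAsUtf8.length === asciiText.length &&
  Array.from(asciiAsUtf8).every(
    (byte, i) => byte === asciiText.charCodeAt(i) && byte < 0x80
  );

// A Cyrillic capital "Т" (U+0422) becomes two bytes, both with the high bit set,
// so it can never be mistaken for an ASCII character.
const cyrillicBytes = Array.from(new TextEncoder().encode('Т'));

console.log(sameAsAscii, cyrillicBytes);
```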
