Problems with charset when reading a file. How to fix it?


The file input, written in Jade:

input#upload(type='file', accept='text/xml, .csv')

and this is how I read it in JavaScript:

var file = document.getElementById('upload').files[0];
var reader = new FileReader();
reader.onloadend = function (e) {
    // The result is a "binary string": one character per raw byte, not decoded text
    var contents = e.target.result;
};
reader.readAsBinaryString(file);

I get this line:

"mail;name;ТеÑÑ"

where ТеÑÑ, the last element in the file, should be a Russian word.

How do I fix the charset?

1 answer

#1


The symptom is clear: you are (inadvertently) treating UTF-8 content (judging by your tag) as if it were in some other, non-UTF-8 encoding, hence the mojibake.
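
To make the mechanism concrete, here is a minimal sketch of how this kind of mojibake arises. The sample word `Тест` is an assumption chosen for illustration (the actual file contents are not known), and the byte-to-character mapping imitates what `readAsBinaryString` produces.

```javascript
// Hypothetical sample word; the real contents of the uploaded file are not known.
const original = 'Тест';

// The UTF-8 bytes of the word, as they would be stored in a UTF-8 file.
const utf8Bytes = new TextEncoder().encode(original);

// A "binary string" maps every byte to one character (code units 0x00-0xFF),
// so each two-byte Cyrillic letter turns into two unrelated characters.
const binaryString = Array.from(utf8Bytes, (b) => String.fromCharCode(b)).join('');

// Decoding the very same bytes as UTF-8 recovers the word.
const decoded = new TextDecoder('utf-8').decode(utf8Bytes);

console.log(binaryString.length);  // 8
console.log(decoded === original); // true
```

The four letters occupy eight bytes, which is why the garbled output looks roughly twice as long as the intended word.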

Make sure that every step the content passes through is either UTF-8 clean or preserves the original content exactly, byte for byte. That includes setting Content-Type headers appropriately (likely: text/html; charset=utf-8).
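
Applied to the code in the question, one possible fix is to let `FileReader` decode the bytes itself with `readAsText` instead of `readAsBinaryString`. This is only a sketch that assumes the uploaded file really is UTF-8; the names `readFileAsUtf8` and `binaryStringToUtf8` are hypothetical helpers, not part of any library.

```javascript
// Browser only: read a File and hand the decoded text to a callback.
// Assumes the file is UTF-8 encoded.
function readFileAsUtf8(file, onText) {
  const reader = new FileReader();
  reader.onload = function (e) {
    onText(e.target.result); // a properly decoded string
  };
  reader.readAsText(file, 'UTF-8');
}

// If a binary string is all you have, turn it back into bytes
// and decode those bytes as UTF-8 (again assuming UTF-8 input).
function binaryStringToUtf8(binaryString) {
  const bytes = Uint8Array.from(binaryString, (ch) => ch.charCodeAt(0));
  return new TextDecoder('utf-8').decode(bytes);
}
```

In the page from the question, this could be called as `readFileAsUtf8(document.getElementById('upload').files[0], function (text) { console.log(text); })`. If the file turns out to be in another encoding such as windows-1251, the label passed to `readAsText` would need to change accordingly.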

This precise issue is why using UTF-8 everywhere is recommended. Set up your databases to use UTF-8, set up your web server to serve UTF-8, keep your source code in UTF-8, set up your editors to save in UTF-8 by default, and set up your HTTP headers and meta tags to advertise UTF-8. Do not accept anything that is not UTF-8, or transcode it where feasible. Anything that is not UTF-8 is just asking for trouble.

Why standardise on UTF-8, you ask? Because its low 7-bit range is identical to ASCII, which makes a world of difference for interoperability with broken or legacy systems that don't really understand much else.
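
The ASCII compatibility is easy to check. The short sketch below encodes the ASCII part of the line from the question and compares each UTF-8 byte with the corresponding ASCII code; the variable names are made up for illustration.

```javascript
// The ASCII-only prefix of the line shown in the question.
const asciiText = 'mail;name;';

// Encode it as UTF-8.
const encoded = new TextEncoder().encode(asciiText);

// Every byte is below 0x80 and equals the character's ASCII code,
// so an ASCII-only consumer sees exactly the same bytes.
const sameAsAscii =
  encoded.length === asciiText.length &&
  encoded.every((b, i) => b < 0x80 && b === asciiText.charCodeAt(i));

console.log(sameAsAscii); // true
```

This also explains why only the Russian word in the question was garbled while `mail;name;` came through intact.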

