Weird problem with input encoding in IPython


I'm running Python 2.6 with the latest IPython on Windows XP SP3, and I have two problems. The first is that under IPython I cannot input Unicode strings directly and, as a result, cannot open files with non-Latin names. Let me demonstrate. Under the plain Python interpreter this works:

>>> sys.getdefaultencoding()
'ascii'
>>> sys.getfilesystemencoding()
'mbcs'
>>> fd = open(u'm:/Блокнот/home.tdl')
>>> print u'm:/Блокнот/home.tdl'
m:/Блокнот/home.tdl
>>>

That's Cyrillic in there, by the way. Under IPython I get:

In [49]: sys.getdefaultencoding()
Out[49]: 'ascii'

In [50]: sys.getfilesystemencoding()
Out[50]: 'mbcs'

In [52]: fd = open(u'm:/Блокнот/home.tdl')
---------------------------------------------------------------------------
IOError                                   Traceback (most recent call last)

C:\Documents and Settings\andrey\<ipython console> in <module>()

IOError: [Errno 2] No such file or directory: u'm:/\x81\xab\xae\xaa\xad\xae\xe2/home.tdl'

In [53]: print u'm:/Блокнот/home.tdl'
-------------->print(u'm:/Блокнот/home.tdl')
ERROR: An unexpected error occurred while tokenizing input
The following traceback may be corrupted or invalid
The error message is: ('EOF in multi-line statement', (15, 0))

---------------------------------------------------------------------------
UnicodeEncodeError                        Traceback (most recent call last)

C:\Documents and Settings\andrey\<ipython console> in <module>()

C:\Program Files\Python26\lib\encodings\cp866.pyc in encode(self, input, errors)
     10
     11     def encode(self,input,errors='strict'):
---> 12         return codecs.charmap_encode(input,errors,encoding_map)
     13
     14     def decode(self,input,errors='strict'):

UnicodeEncodeError: 'charmap' codec can't encode characters in position 3-9: character maps to <und

In [54]:

The second problem is less frustrating, but still annoying. When I try to open a file and pass the file name as a non-Unicode string, it does not open. I have to explicitly decode the string from the OEM charset before I can open the file, which is pretty inconvenient:

>>> fd2 = open('m:/Блокнот/home.tdl'.decode('cp866'))
>>>

Maybe it has something to do with my regional settings, I don't know, because I can't even cut and paste Cyrillic text from the console. I've set "Russian" everywhere in the regional settings, but it does not seem to help.

3 answers

#1


12  

Yes. Typing Unicode at the console is always problematic and generally best avoided, but IPython is particularly broken. It converts characters you type at its console as if they were encoded in ISO-8859-1, regardless of the actual encoding you're giving it.

For now, you'll have to say u'm:/\u0411\u043b\u043e\u043a\u043d\u043e\u0442/home.tdl'.
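To see why the path in the question's IOError comes out as `\x81\xab\xae\xaa\xad\xae\xe2`, here is a minimal sketch of the misdecoding described above. It assumes the console delivers cp866 bytes (the codec named in the question's traceback); the variable names are illustrative only.

```python
# -*- coding: utf-8 -*-
# Sketch: what happens when cp866 console bytes are decoded as ISO-8859-1.
# Assumption: the console code page is cp866, as the question's traceback suggests.

# The folder name from the question, written with escapes so that the
# example itself does not depend on console input encoding.
name = u'\u0411\u043b\u043e\u043a\u043d\u043e\u0442'

# Bytes the console would hand over for the typed text.
oem_bytes = name.encode('cp866')

# Decoding those bytes as ISO-8859-1 maps each byte to the code point with
# the same number, which produces the garbled name seen in the IOError.
misread = oem_bytes.decode('iso-8859-1')
print(repr(misread))

# ISO-8859-1 decoding is lossless, so the damage can be reversed.
recovered = misread.encode('iso-8859-1').decode('cp866')
print(recovered == name)
```

Because the ISO-8859-1 step loses no information, a garbled string can be repaired after the fact, but spelling the name with `\u` escapes avoids the problem entirely.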

#2


1  

Perversely enough, this will work:

fd = open('m:/Блокнот/home.tdl')

Or:

fd = open('m:/Блокнот/home.tdl'.decode('utf-8'))

This gets around IPython's bug by inputting the string as a raw UTF-8 encoded byte string, which IPython doesn't try any funny business with. You're then free to decode it into a unicode string if you like, and get on with your life.
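If it is unclear which encoding the byte string actually arrives in, the decoding step can be made a little more forgiving. The following is only a sketch: `to_unicode` is a hypothetical helper, not part of IPython or the standard library, and the encoding order (UTF-8 first, then cp866 as taken from the question) is an assumption.

```python
# -*- coding: utf-8 -*-
# Sketch: turn a byte string of uncertain encoding into a unicode string.
# `to_unicode` is a hypothetical helper; the default encodings are assumptions.

def to_unicode(raw, encodings=('utf-8', 'cp866')):
    """Decode `raw` by trying each encoding in order."""
    if not isinstance(raw, bytes):
        return raw  # already a unicode string, nothing to do
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue  # not valid in this encoding, try the next one
    raise ValueError('could not decode %r with %r' % (raw, encodings))

name = u'\u0411\u043b\u043e\u043a\u043d\u043e\u0442'

# Works whether the bytes came in as UTF-8 or as cp866.
print(to_unicode(name.encode('utf-8')) == name)
print(to_unicode(name.encode('cp866')) == name)
```

UTF-8 goes first because it rejects most byte sequences that are not really UTF-8, whereas cp866 accepts any byte and therefore only makes sense as the last fallback.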

#3


0  

I had the same problem with Greek input; this patch from Launchpad works for me too.

Thanks.
