Weird problem with input encoding in IPython


I'm running Python 2.6 with the latest IPython on Windows XP SP3, and I have two problems. The first is that under IPython I cannot input Unicode strings directly and, as a result, cannot open files with non-Latin names. Let me demonstrate. Under the plain Python interpreter this works:

>>> sys.getdefaultencoding()
'ascii'
>>> sys.getfilesystemencoding()
'mbcs'
>>> fd = open(u'm:/Блокнот/home.tdl')
>>> print u'm:/Блокнот/home.tdl'
m:/Блокнот/home.tdl
>>>

That's Cyrillic in there, by the way. Under IPython I get:

In [49]: sys.getdefaultencoding()
Out[49]: 'ascii'

In [50]: sys.getfilesystemencoding()
Out[50]: 'mbcs'

In [52]: fd = open(u'm:/Блокнот/home.tdl')
---------------------------------------------------------------------------
IOError                                   Traceback (most recent call last)

C:\Documents and Settings\andrey\<ipython console> in <module>()

IOError: [Errno 2] No such file or directory: u'm:/\x81\xab\xae\xaa\xad\xae\xe2/home.tdl'

In [53]: print u'm:/Блокнот/home.tdl'
-------------->print(u'm:/Блокнот/home.tdl')
ERROR: An unexpected error occurred while tokenizing input
The following traceback may be corrupted or invalid
The error message is: ('EOF in multi-line statement', (15, 0))

---------------------------------------------------------------------------
UnicodeEncodeError                        Traceback (most recent call last)

C:\Documents and Settings\andrey\<ipython console> in <module>()

C:\Program Files\Python26\lib\encodings\cp866.pyc in encode(self, input, errors)
     10
     11     def encode(self,input,errors='strict'):
---> 12         return codecs.charmap_encode(input,errors,encoding_map)
     13
     14     def decode(self,input,errors='strict'):

UnicodeEncodeError: 'charmap' codec can't encode characters in position 3-9: character maps to <und

In [54]:

The second problem is less frustrating, but still annoying. When I try to open a file and pass the file name as a non-Unicode string, it does not open. I have to explicitly decode the string from the OEM charset before I can open the file, which is pretty inconvenient:

>>> fd2 = open('m:/Блокнот/home.tdl'.decode('cp866'))
>>>
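
The explicit decode can be wrapped in a small helper so that byte-string paths are converted once, at the boundary. This is only a sketch: `to_unicode_path` is a hypothetical name, and the default `cp866` is an assumption taken from the traceback above; the console code page may differ on another machine.

```python
# -*- coding: utf-8 -*-

def to_unicode_path(path, encoding='cp866'):
    """Return path as a unicode string, decoding byte strings with `encoding`.

    Hypothetical helper for illustration; 'cp866' is assumed to be the
    console (OEM) code page, as in the question.
    """
    if isinstance(path, bytes):
        return path.decode(encoding)
    return path  # already unicode, pass through unchanged

# The raw OEM bytes for the path used in the question.
raw = b'm:/\x81\xab\xae\xaa\xad\xae\xe2/home.tdl'
path = to_unicode_path(raw)
```

Calling `open(to_unicode_path(name))` then always hands `open` a unicode path, whichever form `name` arrived in.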

Maybe it has something to do with my regional settings; I don't know, because I can't even cut and paste Cyrillic text from the console. I've set "Russian" everywhere in the regional settings, but it does not seem to help.

3 answers

#1


12  

Yes. Typing Unicode at the console is always problematic and generally best avoided, but IPython is particularly broken. It converts the characters you type on its console as if they were encoded in ISO-8859-1, regardless of the actual encoding you're giving it.
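
The escaped path in the question's IOError is consistent with this: the bytes \x81\xab\xae\xaa\xad\xae\xe2 are exactly "Блокнот" encoded in cp866, with each byte turned into one ISO-8859-1 character. A minimal sketch of that round trip, written so it also runs on Python 3 (the variable names are illustrative, and this reproduces the effect rather than IPython's actual code):

```python
# -*- coding: utf-8 -*-
# Illustration only: reproduces the mis-decoding, not IPython's internals.

typed = u'm:/Блокнот/home.tdl'

# What a cp866 console hands over: raw OEM bytes.
console_bytes = typed.encode('cp866')

# Decoding those bytes as ISO-8859-1 maps every byte to one code point,
# which yields the path shown in the IOError.
misread = console_bytes.decode('iso-8859-1')

# Reversing the two steps recovers the intended name.
repaired = misread.encode('iso-8859-1').decode('cp866')

# Printing the misread string on a cp866 console must encode it back to
# cp866, and that codec has no mapping for these Latin-1 characters.
try:
    misread.encode('cp866')
    encode_failed = False
except UnicodeEncodeError:
    encode_failed = True
```

The failed encode at the end matches the UnicodeEncodeError at positions 3-9 in the question, which are the seven mis-decoded characters.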

For now, you'll have to say u'm:/\u0411\u043b\u043e\u043a\u043d\u043e\u0442/home.tdl'.
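
If typing the escapes by hand is tedious, they can be generated once in a regular Python session and pasted into IPython. A small sketch, assuming Python 3 or a UTF-8 encoded source file; `ascii_form` is an illustrative name:

```python
# -*- coding: utf-8 -*-
literal = u'm:/Блокнот/home.tdl'
escaped = u'm:/\u0411\u043b\u043e\u043a\u043d\u043e\u0442/home.tdl'

# Both spellings denote the same unicode string.
same = (literal == escaped)

# The unicode_escape codec produces the pure-ASCII \uXXXX spelling,
# which is safe to type or paste into a console with a broken encoding.
ascii_form = literal.encode('unicode_escape').decode('ascii')
print(ascii_form)
```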

#2


1  

Perversely enough, this will work:

fd = open('m:/Блокнот/home.tdl')

Or:

fd = open('m:/Блокнот/home.tdl'.decode('utf-8'))

This gets around IPython's bug by inputting the string as a raw UTF-8 encoded byte string. IPython doesn't try any funny business with it. You're then free to decode it into a unicode string if you like, and get on with your life.

#3


0  

I had the same problem with Greek input; this patch from Launchpad works for me too.

Thanks.
