How do I escape a Unicode string with Ruby?


I need to encode/convert a Unicode string to its escaped form, with backslashes. Anybody know how?

5 answers

#1


19  

In Ruby 1.8.x, String#inspect may be what you are looking for, e.g.

>> multi_byte_str = "hello\330\271!"
=> "hello\330\271!"

>> multi_byte_str.inspect
=> "\"hello\\330\\271!\""

>> puts multi_byte_str.inspect
"hello\330\271!"
=> nil

In Ruby 1.9 if you want multi-byte characters to have their component bytes escaped, you might want to say something like:

>> multi_byte_str.bytes.to_a.map(&:chr).join.inspect
=> "\"hello\\xD8\\xB9!\""
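On Ruby 2.0 and later, the same byte-level escaping can probably be had more directly. The following is a minimal sketch, assuming String#b is available (it returns a binary, ASCII-8BIT copy of the string); the variable names are only illustrative.

```ruby
# Sketch: escape the component bytes of a UTF-8 string.
# Assumes Ruby 2.0+, where String#b returns an ASCII-8BIT copy.
multi_byte_str = "hello\u0639!"       # U+0639 is encoded as the bytes D8 B9

# inspect on a binary string renders every non-ASCII byte as \xHH
byte_escaped = multi_byte_str.b.inspect

puts byte_escaped                     # prints "hello\xD8\xB9!"
```

The original string is left untouched; only the binary copy is inspected.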

In both Ruby 1.8 and 1.9, if you are instead interested in the (escaped) Unicode code points, you could do this (though it escapes printable characters too):

>> multi_byte_str.unpack('U*').map{ |i| "\\u" + i.to_s(16).rjust(4, '0') }.join
=> "\\u0068\\u0065\\u006c\\u006c\\u006f\\u0639\\u0021"
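If escaping the printable ASCII characters is not wanted, a variation along these lines might help. It is only a sketch: escape_unicode is a hypothetical helper name, it assumes Ruby 1.9+ with a valid UTF-8 string, and it passes every code point below U+0080 through unchanged (control characters included). Code points above U+FFFF are written in the \u{...} form, since they do not fit in four hex digits.

```ruby
# Sketch: escape only non-ASCII characters as Unicode code points.
# escape_unicode is a hypothetical helper, not part of Ruby itself.
def escape_unicode(str)
  str.each_char.map do |ch|
    cp = ch.ord
    if cp < 0x80
      ch                          # plain ASCII is passed through as-is
    elsif cp <= 0xFFFF
      format('\\u%04x', cp)       # four hex digits, e.g. \u0639
    else
      format('\\u{%x}', cp)       # braces for code points above U+FFFF
    end
  end.join
end

puts escape_unicode("hello\u0639!")   # prints hello\u0639!
```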

#2


12  

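To use a Unicode character in Ruby (1.9 and later), use the "\uXXXX" escape, where XXXX is the character's Unicode code point written as four hexadecimal digits; code points above U+FFFF need the "\u{XXXXX}" form instead. See http://leejava.wordpress.com/2009/03/11/unicode-escape-in-ruby/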
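As a small illustration of these escapes, assuming Ruby 1.9 or later (the variable names are made up for the example):

```ruby
# Sketch: writing Unicode characters with \u escapes in double-quoted strings.
micro = "\u00b5"       # MICRO SIGN, four hex digits
ain   = "\u0639"       # ARABIC LETTER AIN
face  = "\u{1F600}"    # above U+FFFF, so the braced form is needed

puts micro.unpack('U*').inspect   # prints [181]
puts ain.bytes.to_a.inspect       # prints [216, 185]
puts face.length                  # prints 1
```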

#3


8  

If you have Rails kicking around you can use the JSON encoder for this:

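require 'active_support'
require 'active_support/json'
x = ActiveSupport::JSON.encode('µ')
# x is now the JSON text "\u00b5" (surrounding quotes included) on the
# Rails versions current when this was written; newer ActiveSupport
# releases may leave non-ASCII characters unescaped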

The usual non-Rails JSON encoder doesn't "\u"-ify Unicode by default.
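Without Rails, the json library that ships with Ruby can likely do the same when asked. This sketch assumes a reasonably recent json version (2.x), which accepts a bare string and the ascii_only generator option; characters above U+FFFF come out as UTF-16 surrogate pairs, as JSON requires.

```ruby
require 'json'

# Sketch: ask the standard JSON generator to escape all non-ASCII characters.
# Assumes json 2.x, where a bare string is accepted as a top-level value.
json_escaped = JSON.generate("µ", ascii_only: true)

puts json_escaped   # the JSON string literal, quotes included
```

Note that the result is a JSON string literal, so it carries the surrounding double quotes and also escapes characters such as `"` and `\`.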

#4


3  

You can use Unicode characters directly if you add the magic comment "# encoding: UTF-8" as the first line of your file (Ruby 2.0 and later already assume UTF-8 source files). Then you can freely use ä, ǹ, ú and so on in your source code.

#5


-1  

Try this gem. It converts Unicode or non-ASCII punctuation and symbols to the nearest ASCII punctuation and symbols:

https://github.com/qwuen/punctuate

Example usage: "100٪".punctuate => "100%"

The gem uses the symbol table at https://lexsrv3.nlm.nih.gov/LexSysGroup/Projects/lvg/current/docs/designDoc/UDF/unicode/DefaultTables/symbolTable.html for the conversion.
