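How do I transliterate the Cyrillic characters in a string to Latin in Ruby? I can't find any documentation on this. I would expect there to be some standard function for it.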
sigwle7e1#
You can use the translit gem:

    require 'translit'

    str = "Кириллица"
    Translit.convert(str, :english) #=> "Kirillica"
ie3xauqp2#
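The most mature gem for working with Cyrillic/Russian is https://github.com/yaroslav/russian/
It also supports transliteration, among many other features:

    require 'russian' # => true
    Russian.translit('Транслит, english letters untouched')
    # => "Translit, english letters untouched"

It also offers pluralization, date formatting, Rails i18n integration and many other goodies.
Disclaimer: I'm not affiliated with the gem in any way, just a happy user.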
bpzcxfmw3#
There is a gem for this. I haven't tried it, but it sounds promising.
https://github.com/dalibor/cyrillizer
wlwcrazw4#
    def transliterate(cyrillic_string)
      # Lowercase Russian letters mapped to Latin; hard and soft signs are dropped
      ru = {
        'а' => 'a', 'б' => 'b', 'в' => 'v', 'г' => 'g', 'д' => 'd',
        'е' => 'e', 'ё' => 'e', 'ж' => 'j', 'з' => 'z', 'и' => 'i',
        'к' => 'k', 'л' => 'l', 'м' => 'm', 'н' => 'n', 'о' => 'o',
        'п' => 'p', 'р' => 'r', 'с' => 's', 'т' => 't', 'у' => 'u',
        'ф' => 'f', 'х' => 'h', 'ц' => 'c', 'ч' => 'ch', 'ш' => 'sh',
        'щ' => 'shch', 'ы' => 'y', 'э' => 'e', 'ю' => 'u', 'я' => 'ya',
        'й' => 'i', 'ъ' => '', 'ь' => ''
      }

      # Replace mapped characters and leave everything else as it is
      identifier = cyrillic_string.downcase.each_char.map { |char| ru.fetch(char, char) }.join

      # Collapse each run of remaining non-alphanumeric characters into one underscore
      identifier.gsub!(/[^a-z0-9_]+/, '_')

      # Strip hyphens/underscores from the beginning and the end
      identifier.gsub(/\A[-_]+|[-_]+\z/, '')
    end
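The same idea can be sketched more compactly by doing the lookup inside gsub. This is only an illustration: the table is deliberately abbreviated, and SLUG_MAP and slugify are names invented for this example rather than anything from the answer above or from a library.

```ruby
# Abbreviated, assumed table for illustration; extend it to the full alphabet as needed.
SLUG_MAP = {
  'п' => 'p', 'р' => 'r', 'и' => 'i', 'в' => 'v',
  'е' => 'e', 'т' => 't', 'м' => 'm'
}.freeze

# Hypothetical helper: transliterate, then reduce the result to a slug.
def slugify(text)
  text.downcase
      .gsub(/\p{Cyrillic}/) { |ch| SLUG_MAP.fetch(ch, ch) } # unmapped letters pass through
      .gsub(/[^a-z0-9_]+/, '_')                             # runs of other characters => "_"
      .gsub(/\A_+|_+\z/, '')                                # trim underscores at both ends
end

puts slugify("Привет, мир!") # prints "privet_mir"
```

Unicode-aware downcase needs Ruby 2.4 or later; any Cyrillic letter missing from the table ends up as an underscore after the second gsub.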
xkrw2x1b5#
I didn't want to add a dependency, just something simple inside a script, so I did this:

    transmap = [
      ["кс", "x"], ["Кс", "X"], ["а", "a"], ["А", "A"], ["б", "b"], ["Б", "B"],
      ["в", "v"], ["В", "V"], ["г", "g"], ["Г", "G"], ["д", "d"], ["Д", "D"],
      ["е", "e"], ["Е", "E"], ["ё", "yo"], ["Ё", "Yo"], ["ё", "jo"], ["Ё", "Jo"],
      ["ё", "ö"], ["Ё", "Ö"], ["ж", "zh"], ["Ж", "Zh"], ["з", "z"], ["З", "Z"],
      ["и", "i"], ["И", "I"], ["й", "j"], ["Й", "J"], ["к", "k"], ["К", "K"],
      ["л", "l"], ["Л", "L"], ["м", "m"], ["М", "M"], ["н", "n"], ["Н", "N"],
      ["о", "o"], ["О", "O"], ["п", "p"], ["П", "P"], ["р", "r"], ["Р", "R"],
      ["с", "s"], ["С", "S"], ["т", "t"], ["Т", "T"], ["у", "u"], ["У", "U"],
      ["ф", "f"], ["Ф", "F"], ["х", "h"], ["Х", "H"], ["ц", "ts"], ["Ц", "Ts"],
      ["ч", "ch"], ["Ч", "Ch"], ["ш", "sh"], ["Ш", "Sh"], ["в", "w"], ["В", "W"],
      ["щ", "shch"], ["Щ", "Shch"], ["щ", "sch"], ["Щ", "Sch"], ["ъ", "#"], ["Ъ", "#"],
      ["ы", "y"], ["Ы", "Y"], ["ь", ""], ["Ь", ""], ["э", "je"], ["Э", "Je"],
      ["э", "ä"], ["Э", "Ä"], ["ю", "yu"], ["Ю", "Yu"], ["ю", "ju"], ["Ю", "Ju"],
      ["ю", "ü"], ["Ю", "Ü"], ["я", "ya"], ["Я", "Ya"], ["я", "ja"], ["Я", "Ja"],
      ["я", "q"], ["Я", "Q"]
    ]

    # Apply every [cyrillic, latin] pair in order; the first pair for a letter wins
    translit = ->(string) { transmap.inject(string) { |s, (k, v)| s.gsub(k, v) } }

    translit.call("Пoo") #=> "Poo"
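Note that Translit maps the same Cyrillic letter to several Latin strings, e.g. "я" to "q", "ja" and "ya", so this code (like Translit) will of course only pick one of them.
That's it, but the details are below.
I generated transmap from https://github.com/tjbladez/translit/blob/master/lib/translit.rb with the following snippet:

    transmap = translit_map
      .flat_map { |k, (up, down)| [[down, k], [up, k.capitalize]] }
      .sort_by { |k, _| -k.length }

It has to be sorted longest key first, so that кс => x is applied before the single-letter transliterations.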
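To see why the order matters, here is a minimal sketch with a tiny made-up map. DEMO_MAP, LONGEST_FIRST and apply_map are names invented for this illustration and are not part of Translit.

```ruby
# Hypothetical mini-map for illustration only; not taken from Translit.
DEMO_MAP = [["т", "t"], ["а", "a"], ["к", "k"], ["с", "s"], ["кс", "x"]].freeze

# Apply each [cyrillic, latin] pair in turn, the same way the translit lambda does.
def apply_map(pairs, string)
  pairs.inject(string) { |s, (k, v)| s.gsub(k, v) }
end

# Longest Cyrillic key first, so "кс" is replaced before "к" and "с" get a chance.
LONGEST_FIRST = DEMO_MAP.sort_by { |k, _| -k.length }

puts apply_map(DEMO_MAP, "такса")      # prints "taksa": single letters consumed "к" and "с"
puts apply_map(LONGEST_FIRST, "такса") # prints "taxa": the digraph was matched first
```

With the unsorted list, the "кс" pair never matches because both of its letters have already been replaced by the time it is tried.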
nfzehxib6#
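Works with any locale (tested with en and fr):

    # Requires the i18n gem; `text` is expected to be provided by the enclosing object
    def normalized_text
      I18n.transliterate(text.downcase.strip)
    end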