When you save this data as YAML, accented characters are written as escape sequences such as \xe1 for á (in cp1250), and as strange characters in UTF-8. So we need to get rid of this encoding and replace the escapes with plain ASCII characters.
You can use the following code to remove the diacritics:
TABLE1250 = {
  "e1" => "a", "e4" => "a", "e8" => "c", "ef" => "d", "e9" => "e", "ec" => "e",
  "ed" => "i", "be" => "l", "e5" => "l", "f2" => "n", "f3" => "o", "f6" => "o",
  "f5" => "o", "f4" => "o", "f8" => "r", "e0" => "r", "9a" => "s", "9d" => "t",
  "fa" => "u", "f9" => "u", "fc" => "u", "fb" => "u", "fd" => "y", "9e" => "z",
  "c1" => "A", "c4" => "A", "c8" => "C", "cf" => "D", "c9" => "E", "cc" => "E",
  "cd" => "I", "bc" => "L", "c5" => "L", "d2" => "N", "d3" => "O", "d6" => "O",
  "d5" => "O", "d4" => "O", "d8" => "R", "c0" => "R", "8a" => "S", "8d" => "T",
  "da" => "U", "d9" => "U", "dc" => "U", "db" => "U", "dd" => "Y", "8e" => "Z"
}
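# Replaces every literal "\xNN" escape in str with its ASCII letter from TABLE1250.
# Codes missing from the table are dropped. Modifies str in place and returns it.
def remove_diacritic(str)
  while (idx = str.index("\\x"))
    # The two characters after "\x" are the hex code of the cp1250 byte.
    str[idx, 4] = TABLE1250[str[idx + 2, 2].downcase].to_s
  end
  str
end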
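
As an alternative to the hand-written table, the same result can probably be reached with Ruby's built-in encoding support. The sketch below is an assumption, not the code from this post: the helper name remove_diacritic_unicode is hypothetical, it needs Ruby 2.2 or newer for unicode_normalize, and it assumes the escapes really are cp1250 bytes. It turns each "\xNN" escape back into the character it stands for and then strips the accent marks.

```ruby
# Hypothetical helper, not part of the original post.
# Assumes the input holds literal "\xNN" escapes of cp1250 (Windows-1250) bytes.
def remove_diacritic_unicode(str)
  # Turn each "\xNN" escape into the cp1250 byte it names, then into UTF-8.
  # Bytes that cp1250 does not define are dropped.
  decoded = str.gsub(/\\x(\h\h)/) do
    $1.hex.chr.force_encoding("Windows-1250")
      .encode("UTF-8", invalid: :replace, undef: :replace, replace: "")
  end
  # Split accented letters into base letter + combining mark (NFD),
  # then remove the combining marks.
  decoded.unicode_normalize(:nfd).gsub(/\p{Mn}/, "")
end

puts remove_diacritic_unicode("P\\xf8\\xedli\\x9a")
```

Unlike remove_diacritic, this version returns a new string and leaves its argument untouched. Letters that have no Unicode decomposition (for example ł) would pass through unchanged, so the explicit table is still the safer choice when the output must be pure ASCII.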

Comments:
I've seen a tool for this in the Iconv library. Something like Iconv.new('ASCII//TRANSLIT', 'UTF-8') ...
Nice Information.. Thx for sharing this
information