Closed

I want to convert Roman numeral characters

2004/12/21 18:42

Language: perl5.00404

I want to convert Roman numerals, which are platform-dependent characters. For example, I want to convert the Roman numerals for 1 to 10 into I, II, III, IV, V, and so on.

The first thing I tried was converting with jcode:

    &jcode::tr(\$str, "\xAD\xB5", "I");
    &jcode::tr(\$str, "\xAD\xB6", "II");
    &jcode::tr(\$str, "\xAD\xB7", "III");
    &jcode::tr(\$str, "\xAD\xB8", "IV");
    &jcode::tr(\$str, "\xAD\xB9", "V");

With this method, the Roman numerals 1 to 3 are all converted to just "I", which is the problem. (It seems that only the first character gets converted.)

Since that would not do, I tried regular expressions as follows:

    $eucpre = qr{(?<!\x8F)};
    $eucpost = qr{
        (?=
            (?:[\xA1-\xFE][\xA1-\xFE])*  # zero or more JIS X 0208 characters follow
            (?:[\x00-\x7F\x8E\x8F]|\z)   # then ASCII, SS2, SS3, or end of string
        )
    }x;
    $str =~ s/$eucpre(?:\xAD\xB5)$eucpost/$1I/g;
    $str =~ s/$eucpre\Q\xAD\xB5\E$eucpost/$1I/g;
    $str =~ s/$eucpre(?:\xAD\xB6)$eucpost/$1II/g;
    $str =~ s/$eucpre\Q\xAD\xB6\E$eucpost/$1II/g;
    $str =~ s/$eucpre(?:\xAD\xB7)$eucpost/$1III/g;
    $str =~ s/$eucpre\Q\xAD\xB7\E$eucpost/$1III/g;
    $str =~ s/$eucpre(?:\xAD\xB8)$eucpost/$1IV/g;
    $str =~ s/$eucpre\Q\xAD\xB8\E$eucpost/$1IV/g;
    $str =~ s/$eucpre(?:\xAD\xB9)$eucpost/$1V/g;
    $str =~ s/$eucpre\Q\xAD\xB9\E$eucpost/$1V/g;

However, my version of Perl does not support this (it works on perl5.005 or later), so this approach is out as well, and I am at a loss. Could someone please tell me a good way to do this?
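The symptom described above (every numeral coming out as "I") is consistent with how transliteration works in general: it maps the n-th search character to the n-th replacement character, so extra replacement characters are never used. The sketch below illustrates this with Perl's built-in tr and compares it with s///. It does not call jcode::tr; that jcode::tr follows the same one-to-one rule for multibyte characters is an assumption here, and the variable names are made up for illustration.

```perl
use strict;

my $text = "a2b";

# tr maps the n-th search character to the n-th replacement character.
# The search list has one character, so only "X" is ever used.
(my $by_tr = $text) =~ tr/2/XY/;

# s/// replaces each match with the whole replacement string.
(my $by_subst = $text) =~ s/2/XY/g;

print "$by_tr\n";     # aXb
print "$by_subst\n";  # aXYb
```

If that assumption holds, a one-to-many conversion such as a single numeral character to "II" needs substitution rather than transliteration.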

niitan
Thanks rate: 43% (7/16)

CGI
Answers: 2
Thanks: 3

Answers (2)

leaz024
Best answer rate: 75% (398/526)

2004/12/22 15:37 Answer No.2

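That regular-expression method is the one presented in "Perlメモ：正しくパターンマッチさせる" ("Perl memo: matching patterns correctly") at the reference URL. The same page also describes a method that works in environments older than Perl 5.005, so I suggest you follow that one instead.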

Reference URL: http://www.din.or.jp/~ohzaki/perl.htm#JP_Match
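As a rough sketch of an approach that avoids both qr// and lookbehind, the code below walks the string one EUC-JP character at a time from the \G position, so a match can only begin on a character boundary. It then looks the numeral up in a table in a single pass. The character classes and the byte pairs for I to V come from this thread; the helper name roman_to_ascii and the lookup table are made up for illustration, and nothing here has been run on perl5.00404 itself.

```perl
use strict;

# One EUC-JP character. Single-quoted so that the regex engine, not the
# string literal, interprets the \x escapes.
my $ascii      = '[\x00-\x7F]';
my $twoBytes   = '[\x8E\xA1-\xFE][\xA1-\xFE]';
my $threeBytes = '\x8F[\xA1-\xFE][\xA1-\xFE]';
my $char       = "(?:$ascii|$twoBytes|$threeBytes)";

# Byte pairs taken from the question; extend the table for further numerals.
my %roman = (
    "\xAD\xB5" => 'I',
    "\xAD\xB6" => 'II',
    "\xAD\xB7" => 'III',
    "\xAD\xB8" => 'IV',
    "\xAD\xB9" => 'V',
);

# Hypothetical helper: replace each listed numeral with its ASCII spelling.
sub roman_to_ascii {
    my ($str) = @_;
    # $1 holds the whole characters skipped before the numeral, $2 the numeral.
    $str =~ s/\G($char*?)(\xAD[\xB5-\xB9])/$1$roman{$2}/g;
    return $str;
}

print roman_to_ascii("No.\xAD\xB7"), "\n";   # No.III
```

Doing all numerals in one substitution means the string is scanned once, and text already converted is never scanned again by a later pattern.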

Asker

Thanks 2004/12/22 21:50

Thank you for your answer. I tried it, but ran into something puzzling.

[Works correctly]

    $ascii = "[\x00-\x7F]";
    $twoBytes = "[\x8E\xA1-\xFE][\xA1-\xFE]";
    $threeBytes = "\x8F[\xA1-\xFE][\xA1-\xFE]";
    $pattern = "(2)"; $replace = "II";
    $str =~ s/\G((?:$ascii|$twoBytes|$threeBytes)*?)(?:$pattern)/$1$replace/g;
    $pattern = "(3)"; $replace = "III";
    $str =~ s/\G((?:$ascii|$twoBytes|$threeBytes)*?)(?:$pattern)/$1$replace/g;
    $pattern = "(4)"; $replace = "IV";
    $str =~ s/\G((?:$ascii|$twoBytes|$threeBytes)*?)(?:$pattern)/$1$replace/g;
    $pattern = "(5)"; $replace = "V";
    $str =~ s/\G((?:$ascii|$twoBytes|$threeBytes)*?)(?:$pattern)/$1$replace/g;

[Results in an Internal Error]

    $ascii = "[\x00-\x7F]";
    $twoBytes = "[\x8E\xA1-\xFE][\xA1-\xFE]";
    $threeBytes = "\x8F[\xA1-\xFE][\xA1-\xFE]";
    $pattern = "(1)"; $replace = "I";
    $str =~ s/\G((?:$ascii|$twoBytes|$threeBytes)*?)(?:$pattern)/$1$replace/g;
    $pattern = "(2)"; $replace = "II";
    $str =~ s/\G((?:$ascii|$twoBytes|$threeBytes)*?)(?:$pattern)/$1$replace/g;
    $pattern = "(3)"; $replace = "III";
    $str =~ s/\G((?:$ascii|$twoBytes|$threeBytes)*?)(?:$pattern)/$1$replace/g;
    $pattern = "(4)"; $replace = "IV";
    $str =~ s/\G((?:$ascii|$twoBytes|$threeBytes)*?)(?:$pattern)/$1$replace/g;
    $pattern = "(5)"; $replace = "V";
    $str =~ s/\G((?:$ascii|$twoBytes|$threeBytes)*?)(?:$pattern)/$1$replace/g;

When I try to convert the Roman numeral (1), I get an error. Why is that?
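One way to narrow this down, assuming the Internal Error is only the web server's generic response to a script that died, is to run each substitution inside eval and look at the message Perl leaves in $@. The sketch below does that for a single pattern. The helper name try_subst is hypothetical, and the sample input uses the byte pair for II from the original question rather than the patterns shown above.

```perl
use strict;

# Hypothetical helper: run one substitution, return (result, error message).
sub try_subst {
    my ($str, $pattern, $replace) = @_;
    # One EUC-JP character: ASCII, two-byte, or three-byte (SS3) form.
    my $char = '(?:[\x00-\x7F]|[\x8E\xA1-\xFE][\xA1-\xFE]|\x8F[\xA1-\xFE][\xA1-\xFE])';
    my $ok = eval {
        $str =~ s/\G($char*?)(?:$pattern)/$1$replace/g;
        1;   # reached only if the substitution did not die
    };
    return ($str, $ok ? '' : $@);
}

my ($out, $err) = try_subst("a\xAD\xB6b", "\xAD\xB6", "II");
print $err ? "error: $err" : "result: $out\n";
```

Calling the helper once per pattern, from the command line rather than through the web server, should show which substitution fails and with what message.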