Text-RewriteRules-0.25/000755 000765 000024 00000000000 11777260151 015112 5ustar00ambsstaff000000 000000 Text-RewriteRules-0.25/Changes000644 000765 000024 00000006061 11777116022 016405 0ustar00ambsstaff000000 000000 Revision history for Text-RewriteRules 0.25 2012, July 10 (22:17) - Fixed stupid bug for too generic regexps; 0.24 2012, June 8 (18:44) - POD fixes 0.23 2010, May 26 (19:51) - Added /d debug mode for RULES and RULES/m; 0.22 2010, Apr 4 (20:02) - It seems that some code was using an undefined behavior of \G and regular expressions. That code was workarounded not to use it. While it worked perfectly under perl 5.8 and perl 5.10 something changed on perl 5.12 regular expression engine that created this behavior. 0.21 2009, Jun 4 (17:08) - Mark algorithm skiping new lines. - Better compilation of power regexps. - Basic capturing for power regexps. 0.20 2009, May 15 (21:13) - Added parenthesized expressions (parenthesis, brackets and curly braces); - Fixed XML regular expression (both empty tags and balanceness) 0.19 2009, May 13 (11:52) - Added support for XML automagical tags [[:XML:]] and [[:XML(tag):]]; - Added variable +{TAG} when using [[:XML:]]; - Compile function removes 'use Text::RewriteRules;' 0.18 2009, Mar 2 (20:56) - Added parenthesis around regexps so them supports ~or~s nicely. 0.17 2009, Fev 15 (00:55) - Added support for conditional last rules on simple RULES blocks. - Added support for conditional and non conditional last rules on MRULES blocks. 0.16 2008, Nov 25 (19:42) - Workaround with a bug for perl 5.8.x - Added support for multiple lexers at the same time; - Added support to the 'x' flag on the MRULES; 0.15 2008, Nov 23 (21:27) - Fixed some issues with a global variable. 0.14 2008, Nov 15 (11:00) - Added support for lex-style blocks for lexers (LRULES and RULES/l) 0.13 2008, Apr 18 (15:50) - Added support for some spaces between RULES and the rule identifier. - Support scripts written under MS OSes that still use carriage returns. 0.12 2008, Mar 16 (14:30) - missing an entry on MANIFEST 0.11 2008, Mar 15 (14:30) - Added support to the =b=> begin rule; - added a compiler 0.10 2005, Dec 5 (19:40) - Corrected problems with cursor mode - Some more examples and tests - Some documentation for the cursor mode 0.09 2005, May 14 (14:30) - Corrected evaluation rules on mark approach 0.08 2005, Mar 06 (20:13) - Added RULES/x flag -- needs some more tests 0.07 2005, Feb 22 (18:38) - upload for 0.06 failed. 0.06 2005, Feb 22 (18:28) - Added a lot of documentation (Hurray!!) 0.05 2005, Feb 21 (17:20) - Added RULES/m as option for MRULES; - Added RULES/i flag for ignore case; - Added flag support on RULES line; - Support for empty lines and/or comments; 0.04 2005, Feb 18 (17:25) - Missing commas on generated code giving problems. 0.03 2005, Feb 15 (16:34) - Corrected bug: mark walking code missing - Added support for ignore case 0.02 2004, Dec 30 (14:00) - Added num2words example 0.01 2004, Dec 27 (11:30) - First version. Text-RewriteRules-0.25/lib/000755 000765 000024 00000000000 11777260150 015657 5ustar00ambsstaff000000 000000 Text-RewriteRules-0.25/Makefile.PL000644 000765 000024 00000001146 11202063020 017041 0ustar00ambsstaff000000 000000 use strict; use warnings; use ExtUtils::MakeMaker; use 5.010000; # 5.10.0 WriteMakefile( NAME => 'Text::RewriteRules', AUTHOR => 'Alberto Simoes ', VERSION_FROM => 'lib/Text/RewriteRules.pm', ABSTRACT_FROM => 'lib/Text/RewriteRules.pm', EXE_FILES => [ "scripts/textrr" ], PL_FILES => {}, PREREQ_PM => { 'Filter::Simple' => 0.78, 'Test::More' => 0, }, dist => { COMPRESS => 'gzip -9f', SUFFIX => 'gz', }, clean => { FILES => 'Text-RewriteRules-*' }, ); Text-RewriteRules-0.25/MANIFEST000644 000765 000024 00000000737 11777260151 016252 0ustar00ambsstaff000000 000000 Changes MANIFEST META.yml # Will be created by "make dist" Makefile.PL README lib/Text/RewriteRules.pm t/00.load_and_init.t t/pod-coverage.t t/pod.t t/01.simple.t t/02.mark.t samples/num2words samples/num2wordsx samples/naif_translator t/03.mline.t scripts/textrr t/04.lexer.t t/05.xml.t t/101.simple.t t/102.mark.t t/103.mline.t t/104.lexer.t t/105.xml.t t/06.parenthesis.t t/106.parenthesis.t META.json Module JSON meta-data (added by MakeMaker) Text-RewriteRules-0.25/META.json000644 000765 000024 00000001654 11777260151 016541 0ustar00ambsstaff000000 000000 { "abstract" : "A system to rewrite text using regexp-based rules", "author" : [ "Alberto Simoes " ], "dynamic_config" : 1, "generated_by" : "ExtUtils::MakeMaker version 6.62, CPAN::Meta::Converter version 2.120921", "license" : [ "unknown" ], "meta-spec" : { "url" : "http://search.cpan.org/perldoc?CPAN::Meta::Spec", "version" : "2" }, "name" : "Text-RewriteRules", "no_index" : { "directory" : [ "t", "inc" ] }, "prereqs" : { "build" : { "requires" : { "ExtUtils::MakeMaker" : "0" } }, "configure" : { "requires" : { "ExtUtils::MakeMaker" : "0" } }, "runtime" : { "requires" : { "Filter::Simple" : "0.78", "Test::More" : "0" } } }, "release_status" : "stable", "version" : "0.25" } Text-RewriteRules-0.25/META.yml000644 000765 000024 00000001032 11777260150 016356 0ustar00ambsstaff000000 000000 --- abstract: 'A system to rewrite text using regexp-based rules' author: - 'Alberto Simoes ' build_requires: ExtUtils::MakeMaker: 0 configure_requires: ExtUtils::MakeMaker: 0 dynamic_config: 1 generated_by: 'ExtUtils::MakeMaker version 6.62, CPAN::Meta::Converter version 2.120921' license: unknown meta-spec: url: http://module-build.sourceforge.net/META-spec-v1.4.html version: 1.4 name: Text-RewriteRules no_index: directory: - t - inc requires: Filter::Simple: 0.78 Test::More: 0 version: 0.25 Text-RewriteRules-0.25/README000644 000765 000024 00000000654 10730327605 015773 0ustar00ambsstaff000000 000000 Text-RewriteRules ================= This module provides a simple way to write rules to rewrite text INSTALLATION To install this module, run the following commands: perl Makefile.PL make make test make install COPYRIGHT AND LICENCE Copyright (C) 2004-2005 Alberto Simões and José João Almeida This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself. Text-RewriteRules-0.25/samples/000755 000765 000024 00000000000 11777260150 016555 5ustar00ambsstaff000000 000000 Text-RewriteRules-0.25/scripts/000755 000765 000024 00000000000 11777260150 016600 5ustar00ambsstaff000000 000000 Text-RewriteRules-0.25/t/000755 000765 000024 00000000000 11777260150 015354 5ustar00ambsstaff000000 000000 Text-RewriteRules-0.25/t/00.load_and_init.t000644 000765 000024 00000000667 11203052732 020541 0ustar00ambsstaff000000 000000 # -*- cperl -*- use Test::More tests => 1; BEGIN { use_ok( 'Text::RewriteRules' ) } for my $file (qw/01.simple.t 02.mark.t 03.mline.t 04.lexer.t 05.xml.t 06.parenthesis.t/) { open R, "t/$file" or die "Can't open file t/$file\n"; open W, ">t/1$file" or die "Can't write file t/1$file\n"; my $str; { undef $/; $str = ; } print W Text::RewriteRules::__compiler($str); close W; close R; } Text-RewriteRules-0.25/t/01.simple.t000644 000765 000024 00000005405 11377246213 017255 0ustar00ambsstaff000000 000000 # -*- cperl -*- use Test::More tests => 54; use Text::RewriteRules; ## Replace RULES first a==>b ENDRULES is(first("bar"),"bbr"); ## Replace Ignore Case RULES ifirst a=i=>b ENDRULES is(ifirst("BAR"),"BbR"); ### --- ### ## Replace with references... RULES second a(\d+)==>$1 ENDRULES is(second("a342"),"342"); is(second("b342"),"b342"); is(second("ba2cd"),"b2cd"); ## Replace Ignore Case with references... RULES isecond a(\d+)=i=>$1 ENDRULES is(isecond("A342"),"342"); is(isecond("b342"),"b342"); is(isecond("ba2cd"),"b2cd"); ### --- ### ## Conditional RULES third b(a+)b==>bbb!! length($1)>5 ENDRULES is(third("bab"), "bab"); is(third("baab"), "baab"); is(third("baaab"), "baaab"); is(third("baaaab"), "baaaab"); is(third("baaaaab"), "baaaaab"); is(third("baaaaaab"), "bbb"); is(third("baaaaaaab"), "bbb"); ## Conditional Ignore Case RULES ithird b(a+)b=i=>bbb!! length($1)>5 ENDRULES is(ithird("bAb"), "bAb"); is(ithird("baab"), "baab"); is(ithird("bAaAb"), "bAaAb"); is(ithird("baAaAb"), "baAaAb"); is(ithird("bAaAaAb"), "bAaAaAb"); is(ithird("baAaAaAb"), "bbb"); is(ithird("bAaAaAaAb"), "bbb"); ### --- ### ## Eval Conditional RULES fourth b(\d+)=e=>'b' x $1 !! $1 > 5 ENDRULES is(fourth("b1"), "b1"); is(fourth("b2"), "b2"); is(fourth("b5"), "b5"); is(fourth("b6"), "bbbbbb"); is(fourth("b8"), "bbbbbbbb"); ## Eval Conditional with Ignore Case RULES ifourth b(\d+)=i=e=>'b' x $1 !! $1 > 5 ENDRULES is(ifourth("b1"), "b1"); is(ifourth("B2"), "B2"); is(ifourth("b5"), "b5"); is(ifourth("B6"), "bbbbbb"); is(ifourth("b8"), "bbbbbbbb"); ### --- ### ## Eval RULES fifth b(\d+)=e=>'b' x $1 ENDRULES is(fifth("b1"), "b"); is(fifth("b2"), "bb"); is(fifth("b5"), "bbbbb"); is(fifth("b8"), "bbbbbbbb"); ## Eval with ignore case RULES ififth (b)(\d+)=i=eval=>$1 x $2 ENDRULES is(ififth("b1"), "b"); is(ififth("B2"), "BB"); is(ififth("b5"), "bbbbb"); is(ififth("B8"), "BBBBBBBB"); ### --- ### ### Don't like this ### the return value should be used, I think. RULES sixth =b=> $_="AA${_}AA" ENDRULES is(sixth("foo"),"AAfooAA"); ### --- ### ## Last... RULES seventh bbbbbb=l=> b==>bb ENDRULES is(seventh("b"),"bbbbbb"); ## Last... with ignore case RULES iseventh bbbbbb=i=l=> b==>bB ENDRULES is(iseventh("b"),"bBBBBB"); ### --- ### # ignore and NOT ignore RULES eigth a=i=>c b==>d ENDRULES is(eigth("abc"),"cdc"); is(eigth("Abc"),"cdc"); is(eigth("aBc"),"cBc"); # ignore all RULES/i ieigth a==>c b==>d ENDRULES is(ieigth("abc"),"cdc"); is(ieigth("Abc"),"cdc"); is(ieigth("aBc"),"cdc"); # ignore all... =i= does nothing RULES/i iieigth a=i=>c b==>d ENDRULES is(iieigth("abc"),"cdc"); is(iieigth("Abc"),"cdc"); is(iieigth("aBc"),"cdc"); # Test last with condition RULES more (...)=l=>!!$1 eq "bar" ar==>oo ENDRULES is(more("bar"),"bar"); is(more("arre"),"oore"); Text-RewriteRules-0.25/t/02.mark.t000644 000765 000024 00000001634 11356161036 016713 0ustar00ambsstaff000000 000000 # -*- cperl -*- use warnings; use strict; use Test::More tests => 9; use Text::RewriteRules; ## Replace MRULES first b==>bb r==> ENDRULES is(first("bar"),"bba"); ## Replace (ignore case) RULES/mx ifirst b=i=>bb r==> ENDRULES is(ifirst("Bar"),"bba"); ## Eval MRULES second b=eval=>'b' x 2 r==> ENDRULES is(second("bar"),"bba"); ## Eval with ignore case MRULES isecond (b)=i=e=>$1 x 2 r==> ENDRULES is(isecond("Bar"),"BBa"); MRULES third a==>b!!1 ENDRULES is(third("bab"),"bbb"); ## use of flag instead of MRULES RULES/m fourth b==>bb r==> ENDRULES is(fourth("bar"),"bba"); ## Eval MRULES fifth b=eval=>$a = log(2); $a = sin($a);'b' x 2 r==> ENDRULES is(fifth("bar"),"bba"); ## Simple Last MRULES sixth bar==>ugh foo=l=> ENDRULES is(sixth("barfoobar"),"ughfoobar"); ## Last with condition MRULES seventh bar==>ugh f(o+)=l=>!!length($1)>2 ENDRULES is(seventh("barfoobarfooobar"),"ughfooughfooobar"); Text-RewriteRules-0.25/t/03.mline.t000644 000765 000024 00000000375 10730327605 017070 0ustar00ambsstaff000000 000000 # -*- cperl -*- use Test::More tests => 4; use Text::RewriteRules; RULES/x first a b c ==>cba ENDRULES is(first("abc"),"cba"); is(first("a b c"), "a b c"); RULES/x second a b c ==>cba ENDRULES is(second("abc"),"cba"); is(second("a b c"), "a b c"); Text-RewriteRules-0.25/t/04.lexer.t000644 000765 000024 00000001721 11113052556 017074 0ustar00ambsstaff000000 000000 # -*- cperl -*- use Test::More tests => 17; use Text::RewriteRules; RULES/l lexer foo==>zbr bar==>ugh ENDRULES is(lexer(),undef); lexer_init("foobar"); is(lexer(),"zbr"); is(lexer(),"ugh"); is(lexer(),undef); # (4 tests above)--------------- RULES/l lex (\d+)=e=>["INT",$1] ([A-Z]+)=e=>["STR",$1] ENDRULES is(lex(),undef); lex_init("ID25"); is_deeply(lex(),["STR","ID"]); is_deeply(lex(),["INT", 25]); is(lex(),undef); # (8 tests above)----------------- RULES/l yylex IF=e=>["IF","IF"] (\w+)=e=>["ID",$1] \s+=ignore=> ENDRULES is(yylex(),undef); yylex_init(" IF XPTO"); is_deeply(yylex(),["IF","IF"]); is_deeply(yylex(),["ID","XPTO"]); is(yylex(),undef); # (12 tests above)---------------- RULES/lx foo IF=e=>("IF","IF") (\w+)=e=>("ID",$1) \s+=ignore=> =EOF=e=>('',undef) ENDRULES =head Fix Highlight =cut is(foo(),undef); foo_init(" IF XPTO"); is_deeply([foo()],["IF","IF"]); is_deeply([foo()],["ID","XPTO"]); is_deeply([foo()],['',undef]); is(foo(),undef); Text-RewriteRules-0.25/t/05.xml.t000644 000765 000024 00000002131 11211762420 016547 0ustar00ambsstaff000000 000000 # -*- cperl -*- use Test::More tests => 10; use Text::RewriteRules; RULES first [[:XML:]]==>XML ENDRULES RULES Xsecond [[:XML(d):]]==>XML ENDRULES RULES Ysecond [[:XML(c):]]==>XML ENDRULES RULES Zsecond [[:XML(b):]]==>XML ENDRULES RULES third [[:XML:]]=e=>$+{TAGNAME} ENDRULES RULES Xthird [[:XML:]]=e=>$+{PCDATA} ENDRULES my $in = " ola o ola"; my $in2 = "ola o ola o ola"; my $in3 = ""; is(first($in)," ola XML ola"); is(first($in2),"ola XML ola XML ola"); is(first($in3), "XML"); is(Xsecond($in)," ola XMLo ola"); is(Ysecond($in)," ola XML ola"); is(Zsecond($in)," ola XML ola"); is(Zsecond($in2),"ola XML ola XML ola"); is(third($in)," ola a ola"); is(third($in3),"foo"); is(Xthird($in)," ola o ola"); Text-RewriteRules-0.25/t/06.parenthesis.t000644 000765 000024 00000002423 11211764523 020307 0ustar00ambsstaff000000 000000 # -*- cperl -*- use Test::More tests => 9; use Text::RewriteRules; RULES first [[:PB:]]==>+ ENDRULES RULES second [[:BB:]]==>* ENDRULES RULES third [[:CBB:]]==># ENDRULES RULES fourth [[:CBB:]]==>[$+{CBB}] ENDRULES RULES fifth [[:BB:]]==>{$+{BB}} ENDRULES RULES sixth [[:PB:]]==>{$+{PB}} ENDRULES my $in = "ola (a (b)(d zbr='foo')(c)) munto (c()()ba)((())) ola"; my $in2 = "ola ((a hmm =\"hmm\")(b)(d zbr='foo'/)(c)) lua ((/c)(/b)(/a) ola (a hmm =\"hmm\")(b)(d zbr='foo'/))(c)(/c)(aaa()(/a) ola"; my $on = "ola [a [b][d zbr='foo'][c]] munto [c[][]ba][[[]]] ola"; my $on2 = "ola [[a hmm =\"hmm\"][b][d zbr='foo'/][c]] lua [[/c][/b][/a] ola [a hmm =\"hmm\"][b][d zbr='foo'/]][c][/c][aaa[][/a] ola"; my $un = "ola {a {b}{d zbr='foo'}{c}} munto {c{}{}ba}{{{}}} ola"; my $un2 = "ola {{a hmm =\"hmm\"}{b}{d zbr='foo'/}{c}} lua {{/c}{/b}{/a} ola {a hmm =\"hmm\"}{b}{d zbr='foo'/}}{c}{/c}{aaa{}{/a} ola"; is(first($in),"ola + munto ++ ola"); is(first($in2),"ola + lua +++(aaa++ ola"); is(second($on),"ola * munto ** ola"); is(second($on2),"ola * lua ***[aaa** ola"); is(third($un),"ola # munto ## ola"); is(third($un2),"ola # lua ###{aaa## ola"); is(fourth("{ xpto } {{"),"[ xpto ] {{"); is(fifth("]] [xpto] {{"),"]] {xpto} {{"); is(sixth("((xpto)(xpto){{"),"({xpto}{xpto}{{"); Text-RewriteRules-0.25/t/101.simple.t000644 000765 000024 00000021137 11777115765 017351 0ustar00ambsstaff000000 000000 # -*- cperl -*- use Test::More tests => 54; our $__XMLattrs = qr/(?: \s+[a-zA-Z0-9:-]+\s* = \s*(?: '[^']+' | "[^"]+" ))*/x; ### This (?\n) is a BIG hack! our $__XMLempty = qr/<(?[a-zA-Z0-9:-]+)(?\b)$__XMLattrs\/>/x; our $__XMLtree2 = qr/$__XMLempty | (? <(?[a-zA-Z0-9:-]+)$__XMLattrs> (?: $__XMLempty | [^<]++ | (?&XML) )*+ <\/\k> )/x; our $__XMLtree = qr/$__XMLempty | (? <(?[a-zA-Z0-9:-]+)$__XMLattrs> (?(?: $__XMLempty | [^<]++ | $__XMLtree2 )*+) <\/\k> )/x; our $__XMLinner = qr/(?: [^<]++ | $__XMLempty | $__XMLtree2 )*+/x; our $__CBB = qr{ (? \{ (?(?:[^\{\}]++|(?&cbb1))*+) \} ) }sx; our $__BB = qr{ (? \[ (? (?:[^\[\]]++|(?&bb1) )*+) \] ) }sx; our $__PB = qr{ (? \( (? (?:[^\(\)]++|(?&pb1) )*+) \) ) }sx; our $__TEXENV = qr{\\begin\{(\w+)\}(.*?)\\end\{\1\}}s; ## \begin{$1}$2\end our $__TEXENV1 = qr{\\begin\{(\w+)\}($__BB?)($__CBB)(.*?)\\end\{\1\}}s; ## \begin{$1}[$2]{$3}$4\end ## Replace sub first { my $p = shift; for ($p) { my $modified = 1; #__1# my $iteration = 0; MAIN: while($modified) { $modified = 0; $iteration++; if (m{a}) { s{a}{b}; $modified = 1; next } } } return $p; } is(first("bar"),"bbr"); ## Replace Ignore Case sub ifirst { my $p = shift; for ($p) { my $modified = 1; #__2# my $iteration = 0; MAIN: while($modified) { $modified = 0; $iteration++; if (m{a}i) { s{a}{b}i; $modified = 1; next } } } return $p; } is(ifirst("BAR"),"BbR"); ### --- ### ## Replace with references... sub second { my $p = shift; for ($p) { my $modified = 1; #__3# my $iteration = 0; MAIN: while($modified) { $modified = 0; $iteration++; if (m{a(\d+)}) { s{a(\d+)}{$1}; $modified = 1; next } } } return $p; } is(second("a342"),"342"); is(second("b342"),"b342"); is(second("ba2cd"),"b2cd"); ## Replace Ignore Case with references... sub isecond { my $p = shift; for ($p) { my $modified = 1; #__4# my $iteration = 0; MAIN: while($modified) { $modified = 0; $iteration++; if (m{a(\d+)}i) { s{a(\d+)}{$1}i; $modified = 1; next } } } return $p; } is(isecond("A342"),"342"); is(isecond("b342"),"b342"); is(isecond("ba2cd"),"b2cd"); ### --- ### ## Conditional sub third { my $p = shift; for ($p) { my $modified = 1; #__5# my $iteration = 0; MAIN: while($modified) { $modified = 0; $iteration++; while (m{b(a+)b}g) { if ( length($1)>5) { s{b(a+)b\G}{bbb}; pos = undef; $modified = 1; next MAIN } } } } return $p; } is(third("bab"), "bab"); is(third("baab"), "baab"); is(third("baaab"), "baaab"); is(third("baaaab"), "baaaab"); is(third("baaaaab"), "baaaaab"); is(third("baaaaaab"), "bbb"); is(third("baaaaaaab"), "bbb"); ## Conditional Ignore Case sub ithird { my $p = shift; for ($p) { my $modified = 1; #__6# my $iteration = 0; MAIN: while($modified) { $modified = 0; $iteration++; while (m{b(a+)b}gi) { if ( length($1)>5) { s{b(a+)b\G}{bbb}i; pos = undef; $modified = 1; next MAIN } } } } return $p; } is(ithird("bAb"), "bAb"); is(ithird("baab"), "baab"); is(ithird("bAaAb"), "bAaAb"); is(ithird("baAaAb"), "baAaAb"); is(ithird("bAaAaAb"), "bAaAaAb"); is(ithird("baAaAaAb"), "bbb"); is(ithird("bAaAaAaAb"), "bbb"); ### --- ### ## Eval Conditional sub fourth { my $p = shift; for ($p) { my $modified = 1; #__7# my $iteration = 0; MAIN: while($modified) { $modified = 0; $iteration++; while (m{b(\d+)}g) { if ( $1 > 5) { s{b(\d+)\G}{'b' x $1 }e; pos = undef; $modified = 1; next MAIN } } } } return $p; } is(fourth("b1"), "b1"); is(fourth("b2"), "b2"); is(fourth("b5"), "b5"); is(fourth("b6"), "bbbbbb"); is(fourth("b8"), "bbbbbbbb"); ## Eval Conditional with Ignore Case sub ifourth { my $p = shift; for ($p) { my $modified = 1; #__8# my $iteration = 0; MAIN: while($modified) { $modified = 0; $iteration++; while (m{b(\d+)}gi) { if ( $1 > 5) { s{b(\d+)\G}{'b' x $1 }ei; pos = undef; $modified = 1; next MAIN } } } } return $p; } is(ifourth("b1"), "b1"); is(ifourth("B2"), "B2"); is(ifourth("b5"), "b5"); is(ifourth("B6"), "bbbbbb"); is(ifourth("b8"), "bbbbbbbb"); ### --- ### ## Eval sub fifth { my $p = shift; for ($p) { my $modified = 1; #__9# my $iteration = 0; MAIN: while($modified) { $modified = 0; $iteration++; if (m{b(\d+)}) { s{b(\d+)}{'b' x $1}e; $modified = 1; next } } } return $p; } is(fifth("b1"), "b"); is(fifth("b2"), "bb"); is(fifth("b5"), "bbbbb"); is(fifth("b8"), "bbbbbbbb"); ## Eval with ignore case sub ififth { my $p = shift; for ($p) { my $modified = 1; #__10# my $iteration = 0; MAIN: while($modified) { $modified = 0; $iteration++; if (m{(b)(\d+)}i) { s{(b)(\d+)}{$1 x $2}ei; $modified = 1; next } } } return $p; } is(ififth("b1"), "b"); is(ififth("B2"), "BB"); is(ififth("b5"), "bbbbb"); is(ififth("B8"), "BBBBBBBB"); ### --- ### ### Don't like this ### the return value should be used, I think. sub sixth { my $p = shift; for ($p) { my $modified = 1; $_="AA${_}AA"; #__11# my $iteration = 0; MAIN: while($modified) { $modified = 0; $iteration++; } } return $p; } is(sixth("foo"),"AAfooAA"); ### --- ### ## Last... sub seventh { my $p = shift; for ($p) { my $modified = 1; #__12# my $iteration = 0; MAIN: while($modified) { $modified = 0; $iteration++; if (m{bbbbbb}) { last } if (m{b}) { s{b}{bb}; $modified = 1; next } } } return $p; } is(seventh("b"),"bbbbbb"); ## Last... with ignore case sub iseventh { my $p = shift; for ($p) { my $modified = 1; #__13# my $iteration = 0; MAIN: while($modified) { $modified = 0; $iteration++; if (m{bbbbbb}i) { last } if (m{b}) { s{b}{bB}; $modified = 1; next } } } return $p; } is(iseventh("b"),"bBBBBB"); ### --- ### # ignore and NOT ignore sub eigth { my $p = shift; for ($p) { my $modified = 1; #__14# my $iteration = 0; MAIN: while($modified) { $modified = 0; $iteration++; if (m{a}i) { s{a}{c}i; $modified = 1; next } if (m{b}) { s{b}{d}; $modified = 1; next } } } return $p; } is(eigth("abc"),"cdc"); is(eigth("Abc"),"cdc"); is(eigth("aBc"),"cBc"); # ignore all sub ieigth { my $p = shift; for ($p) { my $modified = 1; #__15# my $iteration = 0; MAIN: while($modified) { $modified = 0; $iteration++; if (m{a}i) { s{a}{c}i; $modified = 1; next } if (m{b}i) { s{b}{d}i; $modified = 1; next } } } return $p; } is(ieigth("abc"),"cdc"); is(ieigth("Abc"),"cdc"); is(ieigth("aBc"),"cdc"); # ignore all... =i= does nothing sub iieigth { my $p = shift; for ($p) { my $modified = 1; #__16# my $iteration = 0; MAIN: while($modified) { $modified = 0; $iteration++; if (m{a}i) { s{a}{c}i; $modified = 1; next } if (m{b}i) { s{b}{d}i; $modified = 1; next } } } return $p; } is(iieigth("abc"),"cdc"); is(iieigth("Abc"),"cdc"); is(iieigth("aBc"),"cdc"); # Test last with condition sub more { my $p = shift; for ($p) { my $modified = 1; #__17# my $iteration = 0; MAIN: while($modified) { $modified = 0; $iteration++; if (m{(...)}) { if ($1 eq "bar") { last } } if (m{ar}) { s{ar}{oo}; $modified = 1; next } } } return $p; } is(more("bar"),"bar"); is(more("arre"),"oore"); Text-RewriteRules-0.25/t/102.mark.t000644 000765 000024 00000015172 11777115765 017015 0ustar00ambsstaff000000 000000 # -*- cperl -*- use warnings; use strict; use Test::More tests => 9; our $__XMLattrs = qr/(?: \s+[a-zA-Z0-9:-]+\s* = \s*(?: '[^']+' | "[^"]+" ))*/x; ### This (?\n) is a BIG hack! our $__XMLempty = qr/<(?[a-zA-Z0-9:-]+)(?\b)$__XMLattrs\/>/x; our $__XMLtree2 = qr/$__XMLempty | (? <(?[a-zA-Z0-9:-]+)$__XMLattrs> (?: $__XMLempty | [^<]++ | (?&XML) )*+ <\/\k> )/x; our $__XMLtree = qr/$__XMLempty | (? <(?[a-zA-Z0-9:-]+)$__XMLattrs> (?(?: $__XMLempty | [^<]++ | $__XMLtree2 )*+) <\/\k> )/x; our $__XMLinner = qr/(?: [^<]++ | $__XMLempty | $__XMLtree2 )*+/x; our $__CBB = qr{ (? \{ (?(?:[^\{\}]++|(?&cbb1))*+) \} ) }sx; our $__BB = qr{ (? \[ (? (?:[^\[\]]++|(?&bb1) )*+) \] ) }sx; our $__PB = qr{ (? \( (? (?:[^\(\)]++|(?&pb1) )*+) \) ) }sx; our $__TEXENV = qr{\\begin\{(\w+)\}(.*?)\\end\{\1\}}s; ## \begin{$1}$2\end our $__TEXENV1 = qr{\\begin\{(\w+)\}($__BB?)($__CBB)(.*?)\\end\{\1\}}s; ## \begin{$1}[$2]{$3}$4\end ## Replace sub first { my $p = shift; my $_M = "\x01"; for ($p) { my $modified = 1; $_ = $_M.$_; #__18# my $iteration = 0; MAIN: while ($modified) { $iteration++; $modified = 0; if (m{${_M}(?:b)}) { s{${_M}(?:b)}{bb${_M}}; $modified = 1; next } if (m{${_M}(?:r)}) { s{${_M}(?:r)}{${_M}}; $modified = 1; next } if (m{${_M}(.|\n)}) { s{${_M}(.|\n)}{$1${_M}}; $modified = 1; next } } s/$_M//; } return $p; } is(first("bar"),"bba"); ## Replace (ignore case) sub ifirst { my $p = shift; my $_M = "\x01"; for ($p) { my $modified = 1; $_ = $_M.$_; #__25# my $iteration = 0; MAIN: while ($modified) { $iteration++; $modified = 0; if (m{${_M}(?:b)}i) { s{${_M}(?:b)}{bb${_M}}i; $modified = 1; next } if (m{${_M}(?:r)}i) { s{${_M}(?:r)}{${_M}}i; $modified = 1; next } if (m{${_M}(.|\n)}) { s{${_M}(.|\n)}{$1${_M}}; $modified = 1; next } } s/$_M//; } return $p; } is(ifirst("Bar"),"bba"); ## Eval sub second { my $p = shift; my $_M = "\x01"; for ($p) { my $modified = 1; $_ = $_M.$_; #__19# my $iteration = 0; MAIN: while ($modified) { $iteration++; $modified = 0; if (m{${_M}(?:b)}) { s{${_M}(?:b)}{eval{'b' x 2}."$_M"}e; $modified = 1; next } if (m{${_M}(?:r)}) { s{${_M}(?:r)}{${_M}}; $modified = 1; next } if (m{${_M}(.|\n)}) { s{${_M}(.|\n)}{$1${_M}}; $modified = 1; next } } s/$_M//; } return $p; } is(second("bar"),"bba"); ## Eval with ignore case sub isecond { my $p = shift; my $_M = "\x01"; for ($p) { my $modified = 1; $_ = $_M.$_; #__20# my $iteration = 0; MAIN: while ($modified) { $iteration++; $modified = 0; if (m{${_M}(?:(b))}i) { s{${_M}(?:(b))}{eval{$1 x 2}."$_M"}ei; $modified = 1; next } if (m{${_M}(?:r)}i) { s{${_M}(?:r)}{${_M}}i; $modified = 1; next } if (m{${_M}(.|\n)}) { s{${_M}(.|\n)}{$1${_M}}; $modified = 1; next } } s/$_M//; } return $p; } is(isecond("Bar"),"BBa"); sub third { my $p = shift; my $_M = "\x01"; for ($p) { my $modified = 1; $_ = $_M.$_; #__21# my $iteration = 0; MAIN: while ($modified) { $iteration++; $modified = 0; while (m{${_M}(?:a)}g) { if (1) { s{${_M}(?:a)}{b${_M}}; pos = undef; $modified = 1; next MAIN } } if (m{${_M}(.|\n)}) { s{${_M}(.|\n)}{$1${_M}}; $modified = 1; next } } s/$_M//; } return $p; } is(third("bab"),"bbb"); ## use of flag instead of MRULES sub fourth { my $p = shift; my $_M = "\x01"; for ($p) { my $modified = 1; $_ = $_M.$_; #__26# my $iteration = 0; MAIN: while ($modified) { $iteration++; $modified = 0; if (m{${_M}(?:b)}) { s{${_M}(?:b)}{bb${_M}}; $modified = 1; next } if (m{${_M}(?:r)}) { s{${_M}(?:r)}{${_M}}; $modified = 1; next } if (m{${_M}(.|\n)}) { s{${_M}(.|\n)}{$1${_M}}; $modified = 1; next } } s/$_M//; } return $p; } is(fourth("bar"),"bba"); ## Eval sub fifth { my $p = shift; my $_M = "\x01"; for ($p) { my $modified = 1; $_ = $_M.$_; #__22# my $iteration = 0; MAIN: while ($modified) { $iteration++; $modified = 0; if (m{${_M}(?:b)}) { s{${_M}(?:b)}{eval{$a = log(2); $a = sin($a);'b' x 2}."$_M"}e; $modified = 1; next } if (m{${_M}(?:r)}) { s{${_M}(?:r)}{${_M}}; $modified = 1; next } if (m{${_M}(.|\n)}) { s{${_M}(.|\n)}{$1${_M}}; $modified = 1; next } } s/$_M//; } return $p; } is(fifth("bar"),"bba"); ## Simple Last sub sixth { my $p = shift; my $_M = "\x01"; for ($p) { my $modified = 1; $_ = $_M.$_; #__23# my $iteration = 0; MAIN: while ($modified) { $iteration++; $modified = 0; if (m{${_M}(?:bar)}) { s{${_M}(?:bar)}{ugh${_M}}; $modified = 1; next } if (m{${_M}(?:foo)}) { s{${_M}}{}; last } if (m{${_M}(.|\n)}) { s{${_M}(.|\n)}{$1${_M}}; $modified = 1; next } } s/$_M//; } return $p; } is(sixth("barfoobar"),"ughfoobar"); ## Last with condition sub seventh { my $p = shift; my $_M = "\x01"; for ($p) { my $modified = 1; $_ = $_M.$_; #__24# my $iteration = 0; MAIN: while ($modified) { $iteration++; $modified = 0; if (m{${_M}(?:bar)}) { s{${_M}(?:bar)}{ugh${_M}}; $modified = 1; next } if (m{${_M}(?:f(o+))}) { if (length($1)>2) { s{${_M}}{}; last } } if (m{${_M}(.|\n)}) { s{${_M}(.|\n)}{$1${_M}}; $modified = 1; next } } s/$_M//; } return $p; } is(seventh("barfoobarfooobar"),"ughfooughfooobar"); Text-RewriteRules-0.25/t/103.mline.t000644 000765 000024 00000003706 11777115765 017170 0ustar00ambsstaff000000 000000 # -*- cperl -*- use Test::More tests => 4; our $__XMLattrs = qr/(?: \s+[a-zA-Z0-9:-]+\s* = \s*(?: '[^']+' | "[^"]+" ))*/x; ### This (?\n) is a BIG hack! our $__XMLempty = qr/<(?[a-zA-Z0-9:-]+)(?\b)$__XMLattrs\/>/x; our $__XMLtree2 = qr/$__XMLempty | (? <(?[a-zA-Z0-9:-]+)$__XMLattrs> (?: $__XMLempty | [^<]++ | (?&XML) )*+ <\/\k> )/x; our $__XMLtree = qr/$__XMLempty | (? <(?[a-zA-Z0-9:-]+)$__XMLattrs> (?(?: $__XMLempty | [^<]++ | $__XMLtree2 )*+) <\/\k> )/x; our $__XMLinner = qr/(?: [^<]++ | $__XMLempty | $__XMLtree2 )*+/x; our $__CBB = qr{ (? \{ (?(?:[^\{\}]++|(?&cbb1))*+) \} ) }sx; our $__BB = qr{ (? \[ (? (?:[^\[\]]++|(?&bb1) )*+) \] ) }sx; our $__PB = qr{ (? \( (? (?:[^\(\)]++|(?&pb1) )*+) \) ) }sx; our $__TEXENV = qr{\\begin\{(\w+)\}(.*?)\\end\{\1\}}s; ## \begin{$1}$2\end our $__TEXENV1 = qr{\\begin\{(\w+)\}($__BB?)($__CBB)(.*?)\\end\{\1\}}s; ## \begin{$1}[$2]{$3}$4\end sub first { my $p = shift; for ($p) { my $modified = 1; #__27# my $iteration = 0; MAIN: while($modified) { $modified = 0; $iteration++; if (m{a b c }x) { s{a b c }{cba}x; $modified = 1; next } } } return $p; } is(first("abc"),"cba"); is(first("a b c"), "a b c"); sub second { my $p = shift; for ($p) { my $modified = 1; #__28# my $iteration = 0; MAIN: while($modified) { $modified = 0; $iteration++; if (m{a b c }x) { s{a b c }{cba}x; $modified = 1; next } } } return $p; } is(second("abc"),"cba"); is(second("a b c"), "a b c"); Text-RewriteRules-0.25/t/104.lexer.t000644 000765 000024 00000007146 11777115765 017206 0ustar00ambsstaff000000 000000 # -*- cperl -*- use Test::More tests => 17; our $__XMLattrs = qr/(?: \s+[a-zA-Z0-9:-]+\s* = \s*(?: '[^']+' | "[^"]+" ))*/x; ### This (?\n) is a BIG hack! our $__XMLempty = qr/<(?[a-zA-Z0-9:-]+)(?\b)$__XMLattrs\/>/x; our $__XMLtree2 = qr/$__XMLempty | (? <(?[a-zA-Z0-9:-]+)$__XMLattrs> (?: $__XMLempty | [^<]++ | (?&XML) )*+ <\/\k> )/x; our $__XMLtree = qr/$__XMLempty | (? <(?[a-zA-Z0-9:-]+)$__XMLattrs> (?(?: $__XMLempty | [^<]++ | $__XMLtree2 )*+) <\/\k> )/x; our $__XMLinner = qr/(?: [^<]++ | $__XMLempty | $__XMLtree2 )*+/x; our $__CBB = qr{ (? \{ (?(?:[^\{\}]++|(?&cbb1))*+) \} ) }sx; our $__BB = qr{ (? \[ (? (?:[^\[\]]++|(?&bb1) )*+) \] ) }sx; our $__PB = qr{ (? \( (? (?:[^\(\)]++|(?&pb1) )*+) \) ) }sx; our $__TEXENV = qr{\\begin\{(\w+)\}(.*?)\\end\{\1\}}s; ## \begin{$1}$2\end our $__TEXENV1 = qr{\\begin\{(\w+)\}($__BB?)($__CBB)(.*?)\\end\{\1\}}s; ## \begin{$1}[$2]{$3}$4\end my $lexer_input = ""; sub lexer_init { $lexer_input = shift; return 1; } sub lexer { return undef if not defined $lexer_input; for ($lexer_input) { if (m{^foo}g) { s{foo}{}; pos = undef; return "zbr" } if (m{^bar}g) { s{bar}{}; pos = undef; return "ugh" } } return undef; } is(lexer(),undef); lexer_init("foobar"); is(lexer(),"zbr"); is(lexer(),"ugh"); is(lexer(),undef); # (4 tests above)--------------- my $lex_input = ""; sub lex_init { $lex_input = shift; return 1; } sub lex { return undef if not defined $lex_input; for ($lex_input) { if (m{^(\d+)}g) { s{(\d+)}{}; pos = undef; return ["INT",$1]; } if (m{^([A-Z]+)}g) { s{([A-Z]+)}{}; pos = undef; return ["STR",$1]; } } return undef; } is(lex(),undef); lex_init("ID25"); is_deeply(lex(),["STR","ID"]); is_deeply(lex(),["INT", 25]); is(lex(),undef); # (8 tests above)----------------- my $yylex_input = ""; sub yylex_init { $yylex_input = shift; return 1; } sub yylex { return undef if not defined $yylex_input; for ($yylex_input) { if (m{^IF}g) { s{IF}{}; pos = undef; return ["IF","IF"]; } if (m{^(\w+)}g) { s{(\w+)}{}; pos = undef; return ["ID",$1]; } if (m{^\s+}gi) { s{\s+}{}i; pos = undef; return yylex(); } } return undef; } is(yylex(),undef); yylex_init(" IF XPTO"); is_deeply(yylex(),["IF","IF"]); is_deeply(yylex(),["ID","XPTO"]); is(yylex(),undef); # (12 tests above)---------------- my $foo_input = ""; sub foo_init { $foo_input = shift; return 1; } sub foo { return undef if not defined $foo_input; for ($foo_input) { if (m{^IF}gx) { s{IF}{}x; pos = undef; return ("IF","IF"); } if (m{^(\w+)}gx) { s{(\w+)}{}x; pos = undef; return ("ID",$1); } if (m{^\s+}gix) { s{\s+}{}ix; pos = undef; return foo(); } if (m{^$}) { $foo_input = undef; return ('',undef); } } return undef; } =head Fix Highlight =cut is(foo(),undef); foo_init(" IF XPTO"); is_deeply([foo()],["IF","IF"]); is_deeply([foo()],["ID","XPTO"]); is_deeply([foo()],['',undef]); is(foo(),undef); Text-RewriteRules-0.25/t/105.xml.t000644 000765 000024 00000007677 11777115765 016701 0ustar00ambsstaff000000 000000 # -*- cperl -*- use Test::More tests => 10; our $__XMLattrs = qr/(?: \s+[a-zA-Z0-9:-]+\s* = \s*(?: '[^']+' | "[^"]+" ))*/x; ### This (?\n) is a BIG hack! our $__XMLempty = qr/<(?[a-zA-Z0-9:-]+)(?\b)$__XMLattrs\/>/x; our $__XMLtree2 = qr/$__XMLempty | (? <(?[a-zA-Z0-9:-]+)$__XMLattrs> (?: $__XMLempty | [^<]++ | (?&XML) )*+ <\/\k> )/x; our $__XMLtree = qr/$__XMLempty | (? <(?[a-zA-Z0-9:-]+)$__XMLattrs> (?(?: $__XMLempty | [^<]++ | $__XMLtree2 )*+) <\/\k> )/x; our $__XMLinner = qr/(?: [^<]++ | $__XMLempty | $__XMLtree2 )*+/x; our $__CBB = qr{ (? \{ (?(?:[^\{\}]++|(?&cbb1))*+) \} ) }sx; our $__BB = qr{ (? \[ (? (?:[^\[\]]++|(?&bb1) )*+) \] ) }sx; our $__PB = qr{ (? \( (? (?:[^\(\)]++|(?&pb1) )*+) \) ) }sx; our $__TEXENV = qr{\\begin\{(\w+)\}(.*?)\\end\{\1\}}s; ## \begin{$1}$2\end our $__TEXENV1 = qr{\\begin\{(\w+)\}($__BB?)($__CBB)(.*?)\\end\{\1\}}s; ## \begin{$1}[$2]{$3}$4\end sub first { my $p = shift; for ($p) { my $modified = 1; #__33# my $iteration = 0; MAIN: while($modified) { $modified = 0; $iteration++; if (m{$__XMLtree}) { s{$__XMLtree}{XML}; $modified = 1; next } } } return $p; } sub Xsecond { my $p = shift; for ($p) { my $modified = 1; #__34# my $iteration = 0; MAIN: while($modified) { $modified = 0; $iteration++; if (m{|>$__XMLinner)}) { s{|>$__XMLinner)}{XML}; $modified = 1; next } } } return $p; } sub Ysecond { my $p = shift; for ($p) { my $modified = 1; #__35# my $iteration = 0; MAIN: while($modified) { $modified = 0; $iteration++; if (m{|>$__XMLinner)}) { s{|>$__XMLinner)}{XML}; $modified = 1; next } } } return $p; } sub Zsecond { my $p = shift; for ($p) { my $modified = 1; #__36# my $iteration = 0; MAIN: while($modified) { $modified = 0; $iteration++; if (m{|>$__XMLinner)}) { s{|>$__XMLinner)}{XML}; $modified = 1; next } } } return $p; } sub third { my $p = shift; for ($p) { my $modified = 1; #__37# my $iteration = 0; MAIN: while($modified) { $modified = 0; $iteration++; if (m{$__XMLtree}) { s{$__XMLtree}{$+{TAGNAME}}e; $modified = 1; next } } } return $p; } sub Xthird { my $p = shift; for ($p) { my $modified = 1; #__38# my $iteration = 0; MAIN: while($modified) { $modified = 0; $iteration++; if (m{$__XMLtree}) { s{$__XMLtree}{$+{PCDATA}}e; $modified = 1; next } } } return $p; } my $in = " ola o ola"; my $in2 = "ola o ola o ola"; my $in3 = ""; is(first($in)," ola XML ola"); is(first($in2),"ola XML ola XML ola"); is(first($in3), "XML"); is(Xsecond($in)," ola XMLo ola"); is(Ysecond($in)," ola XML ola"); is(Zsecond($in)," ola XML ola"); is(Zsecond($in2),"ola XML ola XML ola"); is(third($in)," ola a ola"); is(third($in3),"foo"); is(Xthird($in)," ola o ola"); Text-RewriteRules-0.25/t/106.parenthesis.t000644 000765 000024 00000007662 11777115765 020421 0ustar00ambsstaff000000 000000 # -*- cperl -*- use Test::More tests => 9; our $__XMLattrs = qr/(?: \s+[a-zA-Z0-9:-]+\s* = \s*(?: '[^']+' | "[^"]+" ))*/x; ### This (?\n) is a BIG hack! our $__XMLempty = qr/<(?[a-zA-Z0-9:-]+)(?\b)$__XMLattrs\/>/x; our $__XMLtree2 = qr/$__XMLempty | (? <(?[a-zA-Z0-9:-]+)$__XMLattrs> (?: $__XMLempty | [^<]++ | (?&XML) )*+ <\/\k> )/x; our $__XMLtree = qr/$__XMLempty | (? <(?[a-zA-Z0-9:-]+)$__XMLattrs> (?(?: $__XMLempty | [^<]++ | $__XMLtree2 )*+) <\/\k> )/x; our $__XMLinner = qr/(?: [^<]++ | $__XMLempty | $__XMLtree2 )*+/x; our $__CBB = qr{ (? \{ (?(?:[^\{\}]++|(?&cbb1))*+) \} ) }sx; our $__BB = qr{ (? \[ (? (?:[^\[\]]++|(?&bb1) )*+) \] ) }sx; our $__PB = qr{ (? \( (? (?:[^\(\)]++|(?&pb1) )*+) \) ) }sx; our $__TEXENV = qr{\\begin\{(\w+)\}(.*?)\\end\{\1\}}s; ## \begin{$1}$2\end our $__TEXENV1 = qr{\\begin\{(\w+)\}($__BB?)($__CBB)(.*?)\\end\{\1\}}s; ## \begin{$1}[$2]{$3}$4\end sub first { my $p = shift; for ($p) { my $modified = 1; #__39# my $iteration = 0; MAIN: while($modified) { $modified = 0; $iteration++; if (m{$__PB}) { s{$__PB}{+}; $modified = 1; next } } } return $p; } sub second { my $p = shift; for ($p) { my $modified = 1; #__40# my $iteration = 0; MAIN: while($modified) { $modified = 0; $iteration++; if (m{$__BB}) { s{$__BB}{*}; $modified = 1; next } } } return $p; } sub third { my $p = shift; for ($p) { my $modified = 1; #__41# my $iteration = 0; MAIN: while($modified) { $modified = 0; $iteration++; if (m{$__CBB}) { s{$__CBB}{#}; $modified = 1; next } } } return $p; } sub fourth { my $p = shift; for ($p) { my $modified = 1; #__42# my $iteration = 0; MAIN: while($modified) { $modified = 0; $iteration++; if (m{$__CBB}) { s{$__CBB}{[$+{CBB}]}; $modified = 1; next } } } return $p; } sub fifth { my $p = shift; for ($p) { my $modified = 1; #__43# my $iteration = 0; MAIN: while($modified) { $modified = 0; $iteration++; if (m{$__BB}) { s{$__BB}{{$+{BB}}}; $modified = 1; next } } } return $p; } sub sixth { my $p = shift; for ($p) { my $modified = 1; #__44# my $iteration = 0; MAIN: while($modified) { $modified = 0; $iteration++; if (m{$__PB}) { s{$__PB}{{$+{PB}}}; $modified = 1; next } } } return $p; } my $in = "ola (a (b)(d zbr='foo')(c)) munto (c()()ba)((())) ola"; my $in2 = "ola ((a hmm =\"hmm\")(b)(d zbr='foo'/)(c)) lua ((/c)(/b)(/a) ola (a hmm =\"hmm\")(b)(d zbr='foo'/))(c)(/c)(aaa()(/a) ola"; my $on = "ola [a [b][d zbr='foo'][c]] munto [c[][]ba][[[]]] ola"; my $on2 = "ola [[a hmm =\"hmm\"][b][d zbr='foo'/][c]] lua [[/c][/b][/a] ola [a hmm =\"hmm\"][b][d zbr='foo'/]][c][/c][aaa[][/a] ola"; my $un = "ola {a {b}{d zbr='foo'}{c}} munto {c{}{}ba}{{{}}} ola"; my $un2 = "ola {{a hmm =\"hmm\"}{b}{d zbr='foo'/}{c}} lua {{/c}{/b}{/a} ola {a hmm =\"hmm\"}{b}{d zbr='foo'/}}{c}{/c}{aaa{}{/a} ola"; is(first($in),"ola + munto ++ ola"); is(first($in2),"ola + lua +++(aaa++ ola"); is(second($on),"ola * munto ** ola"); is(second($on2),"ola * lua ***[aaa** ola"); is(third($un),"ola # munto ## ola"); is(third($un2),"ola # lua ###{aaa## ola"); is(fourth("{ xpto } {{"),"[ xpto ] {{"); is(fifth("]] [xpto] {{"),"]] {xpto} {{"); is(sixth("((xpto)(xpto){{"),"({xpto}{xpto}{{"); Text-RewriteRules-0.25/t/pod-coverage.t000644 000765 000024 00000000374 11764434660 020125 0ustar00ambsstaff000000 000000 #!perl -T use Test::More; eval "use Test::Pod::Coverage 1.04"; plan skip_all => "Test::Pod::Coverage 1.04 required for testing POD coverage" if $@; plan skip_all => "set AUTHOR_TESTS for author tests" unless $ENV{AUTHOR_TESTS}; all_pod_coverage_ok(); Text-RewriteRules-0.25/t/pod.t000644 000765 000024 00000000334 11764434652 016331 0ustar00ambsstaff000000 000000 #!perl -T use Test::More; eval "use Test::Pod 1.14"; plan skip_all => "Test::Pod 1.14 required for testing POD" if $@; plan skip_all => "set AUTHOR_TESTS for author tests" unless $ENV{AUTHOR_TESTS}; all_pod_files_ok(); Text-RewriteRules-0.25/scripts/textrr000755 000765 000024 00000000736 11753456366 020076 0ustar00ambsstaff000000 000000 #!/usr/bin/perl use Text::RewriteRules; use strict; Text::RewriteRules::_compiler(); =head1 NAME textrr - rewrite rules compiler =head1 SYNOPSIS $ textrr file.rr > file.pl =head1 DESCRIPTION L converts rules into Perl on the fly. For efficiency purposes, you can compile the rules into Perl code, removing the overhead of parsing them when starting the script, and removing the dependency to this module. =head1 SEE ALSO Text::RewriteRules =cut Text-RewriteRules-0.25/samples/naif_translator000644 000765 000024 00000000703 10730327605 021663 0ustar00ambsstaff000000 000000 #!/usr/bin/perl use Text::RewriteRules; %dict=(driver=>"motorista", the=>"o", of=>"de", car=>"carro"); $word='\b\w+\b'; if( b(a("I see the Driver of the car")) eq "(I) (see) o Motorista do carro" ) {print "ok\n"} else {print "ko\n"} RULES/m a ($word)==>$dict{$1}!! defined($dict{$1}) ($word)=e=> ucfirst($dict{lc($1)}) !! defined($dict{lc($1)}) ($word)==>($1) ENDRULES RULES/m b \bde o\b==>do ENDRULES Text-RewriteRules-0.25/samples/num2words000644 000765 000024 00000003157 11044057656 020450 0ustar00ambsstaff000000 000000 #!/usr/bin/perl -w -C use Text::RewriteRules; my %fixnum=( 0=> "zero", 1=> "um", 2=> "dois", 3=> "três", 4=> "quatro", 5=> "cinco", 6=> "seis", 7=> "sete", 8=> "oito", 9=> "nove", 10=> "dez", 11=> "onze", 12=> "doze", 13=> "treze", 14=> "catorze", 15=> "quinze", 16=> "dezasseis", 17=> "dezassete", 18=> "dezoito", 19=> "dezanove", 20=> "vinte", 30=> "trinta", 40=> "quarenta", 50=> "cinquenta", 60=> "sessenta", 70=> "setenta", 80=> "oitenta", 90=> "noventa", 100=> "cem", 200=> "duzentos", 300=> "trezentos", 400=> "quatrocentos", 500=> "quinhentos", 600=> "seiscentos", 700=> "setecentos", 800=> "oitocentos", 900=> "novecentos", 1000=> "mil", 1000000=> "um milhão", ); while(<>){ print n2w($_),"\n"; } RULES n2w (\d+)[Ee](-?\d+)==>$1 vezes 10 levantado a $2 -(\d+)==>menos $1 (\d+)\s*\%==>$1 por cento __decimaiscomp(\d+)\.=e=>join(" ",split(//,$1)) (\d+)\.(\d{1,3})\b==>$1 ponto $2 (\d+)\.(\d+)==>$1 ponto __decimaiscomp$2. \b(\d+)\b==>$fixnum{$1}!!defined $fixnum{$1} (\d+)(000000)\b==>$1 milhões (\d+)(000)(\d{3})==>$1 milhão e $3!! $1 == 1 (\d+)(\d{3})(000)==>$1 milhão e $2 mil!! $1 == 1 (\d+)(\d{6})==>$1 milhão, $2!! $1 == 1 (\d+)(000)(\d{3})==>$1 milhões e $3 (\d+)(\d{3})(000)==>$1 milhões e $2 mil (\d+)(\d{6})==>$1 milhões, $2 (\d+)(000)\b==>$1 mil (\d+)0(\d{2})==>mil e $2!! $1 == 1 (\d+)(\d00)==>mil e $2!! $1 == 1 (\d+)(\d{3})==>mil $2!! $1 == 1 (\d+)0(\d{2})==>$1 mil e $2 (\d+)(\d00)==>$1 mil e $2 (\d+)(\d{3})==>$1 mil, $2 1(\d\d)==>cento e $1 0(\d\d)==>$1 (\d)(\d\d)==>${1}00 e $2 0(\d)==>$1 (\d)(\d)==>${1}0 e $2 0$==>zero 0==> ENDRULES Text-RewriteRules-0.25/samples/num2wordsx000644 000765 000024 00000002745 10730327605 020635 0ustar00ambsstaff000000 000000 #!/usr/bin/perl -w -C use Text::RewriteRules; RULES/x num2words (\d+) (000000) \b ==>$1 milhão!! $1 == 1 (\d+) (000000) \b ==>$1 milhões (\d+) (000) (\d{3}) ==>$1 milhão e $3!! $1 == 1 (\d+) (\d{3}) (000) ==>$1 milhão e $2 mil!! $1 == 1 (\d+) (\d{6}) ==>$1 milhão, $2!! $1 == 1 (\d+) (000) (\d{3}) ==>$1 milhões e $3 (\d+) (\d{3}) (000) ==>$1 milhões e $2 mil (\d+) (\d{6}) ==>$1 milhões, $2 (\d+) (000) \b ==>mil!! $1 == 1 (\d+)(000)\b ==>$1 mil (\d+)0(\d{2}) ==>mil e $2!! $1 == 1 (\d+)(\d00) ==>mil e $2!! $1 == 1 (\d+)(\d{3}) ==>mil $2!! $1 == 1 (\d+)0(\d{2}) ==>$1 mil e $2 (\d+)(\d00) ==>$1 mil e $2 (\d+)(\d{3}) ==>$1 mil, $2 100 ==>cem 1(\d\d) ==>cento e $1 0(\d\d) ==>$1 200 ==>duzentos 300 ==>trezentos 400 ==>quatrocentos 500 ==>quinhentos 600 ==>seiscentos 700 ==>setecentos 800 ==>oitocentos 900 ==>novecentos (\d)(\d\d) ==>${1}00 e $2 10 ==>dez 11 ==>onze 12 ==>doze 13 ==>treze 14 ==>catorze 15 ==>quinze 16 ==>dezasseis 17 ==>dezassete 18 ==>dezoito 19 ==>dezanove 20 ==>vinte 30 ==>trinta 40 ==>quarenta 50 ==>cinquenta 60 ==>sessenta 70 ==>setenta 80 ==>oitenta 90 ==>noventa 0(\d) ==>$1 (\d)(\d) ==>${1}0 e $2 1 ==>um 2 ==>dois 3 ==>três 4 ==>quatro 5 ==>cinco 6 ==>seis 7 ==>sete 8 ==>oito 9 ==>nove 0$ ==>zero 0 ==> " " ==> " ," ==>, ENDRULES for (@ARGV) { print num2words($_),"\n"; } Text-RewriteRules-0.25/lib/Text/000755 000765 000024 00000000000 11777260150 016603 5ustar00ambsstaff000000 000000 Text-RewriteRules-0.25/lib/Text/RewriteRules.pm000644 000765 000024 00000061107 11777116034 021603 0ustar00ambsstaff000000 000000 package Text::RewriteRules; use Data::Dumper; use Filter::Simple; use warnings; use strict; use 5.010000; # 5.10.0 =encoding UTF-8 =head1 NAME Text::RewriteRules - A system to rewrite text using regexp-based rules =cut our $VERSION = '0.25'; =head1 SYNOPSIS use Text::RewriteRules; RULES email \.==> DOT @==> AT ENDRULES print email("ambs@cpan.org") # prints ambs AT cpan DOT org RULES/m inc (\d+)=e=> $1+1 ENDRULES print inc("I saw 11 cats and 23 dogs") # prints I saw 12 cats and 24 dogs =head1 ABSTRACT This module uses a simplified syntax for regexp-based rules for rewriting text. You define a set of rules, and the system applies them until no more rule can be applied. Two variants are provided: =over 4 =item 1 traditional rewrite (RULES function): while it is possible do substitute | apply first substitution rule =item 2 cursor based rewrite (RULES/m function): add a cursor to the beginning of the string while not reach end of string | apply substitute just after cursor and advance cursor | or advance cursor if no rule can be applied =back =head1 DESCRIPTION A lot of computer science problems can be solved using rewriting rules. Rewriting rules consist of mainly two parts: a regexp (LHS: Left Hand Side) that is matched with the text, and the string to use to substitute the content matched with the regexp (RHS: Right Hand Side). Now, why don't use a simple substitute? Because we want to define a set of rules and match them again and again, until no more regexp of the LHS matches. A point of discussion is the syntax to define this system. A brief discussion shown that some users would prefer a function to receive an hash with the rules, some other, prefer some syntax sugar. The approach used is the last: we use C such that we can add a specific non-perl syntax inside the Perl script. This improves legibility of big rewriting rules systems. This documentation is divided in two parts: first we will see the reference of the module. Kind of, what it does, with a brief explanation. Follows a tutorial which will be growing through time and releases. =head1 SYNTAX REFERENCE Note: most of the examples are very stupid, but that is the easiest way to explain the basic syntax. The basic syntax for the rewrite rules is a block, started by the keyword C and ended by the C. Everything between them is handled by the module and interpreted as rules or comments. The C keyword can handle a set of flags (we will see that later), and requires a name for the rule-set. This name will be used to define a function for that rewriting system. RULES functioname ... ENDRULES The function is defined in the main namespace where the C block appears. In this block, each line can be a comment (Perl style), an empty line or a rule. =head2 Basic Rule A basic rule is a simple substitution: RULES foobar foo==>bar ENDRULES The arrow C<==E> is used as delimiter. At its left is the regexp to match, at the right side, the substitution. So, the previous block defines a C function that substitutes all C by C. Although this can seems similar to a global substitution, it is not. With a global substitution you can't do an endless loop. With this module it is very simple. I know you will get the idea. You can use the syntax of Perl both on the left and right hand side of the rule, including C<$1...>. =head2 Execution Rule If the Perl substitution supports execution, why not to support it, also? So, you got the idea. Here is an example: RULES foo (\d+)b=e=>'b' x $1 (\d+)a=eval=>'a' x ($1*2) ENDRULES So, for any number followed by a C, we replace by that number of C. For each number followed by an C, we replace them by twice that number of C. Also, you mean evaluation using an C or C inside the arrow. I should remind you can mix all these rules together in the same rewriting system. =head2 Conditional Rule On some cases we want to perform a substitution if the pattern matches B a set of conditions about that pattern (or not) are true. For that, we use a three part rule. We have the common rule plus the condition part, separated from the rule by C. These conditional rules can be applied both for basic and execution rules. RULES translate ([[:alpha:]]+)=e=>$dic{$1}!! exists($dic{$1}) ENDRULES The previous example would translate all words that exist on the dictionary. =head2 Begin Rule Sometimes it is useful to change something on the string before starting to apply the rules. For that, there is a special rule named C (or C for abbreviate) just with a RHS. This RHS is Perl code. Any Perl code. If you want to modify the string, use C<$_>. RULES foo =b=> $_.=" END" ENDRULES =head2 Last Rule As you use C on Perl to skip the remaining code on a loop, you can also call a C (or C) rule when a specific pattern matches. Like the C rule with only a RHS, the C rule has only a LHS: RULES foo foobar=l=> ENDRULES This way, the rules iterate until the string matches with C. You can also supply a condition in a last rule: RULES bar f(o+)b(a+)r=l=> !! length($1) == 2 * length($2); =head2 Rules with /x mode It is possible to use the regular expressions /x mode in the rewrite rules. In this case: =over 4 =item 1 there must be an empty line between rules =item 2 you can insert space and line breaks into the regular expression: RULES/x f1 (\d+) (\d{3}) (000) ==>$1 milhao e $2 mil!! $1 == 1 ENDRULES =back =head1 POWER EXPRESSIONS To facilitate matching complex languages Text::RewriteRules defines a set of regular expressions that you can use (without defining them). =head2 Parenthesis There are three kind of usual parenthesis: the standard parenthesis, brackets or curly braces. You can match a balanced string of parenthesis using the power expressions C<[[:PB:]]>, C<[[:BB:]]> and C<[[:CBB:]]> for these three kind of parenthesis. For instance, if you apply this rule: [[:BB:]]==>foo to this string something [ a [ b] c [d ]] and something more then, you will get something foo and something more Note that if you apply it to something [[ not ] balanced [ here then you will get something [foo balanced [ here =head2 XML tags The power expression C<[[:XML:]]> match a XML tag (with or without children XML tags. Note that this expression matches only well formed XML tags. As an example, the rule [[:XML:]]=>tag applied to the string and will result in and tag =cut our $DEBUG = 0; our $count = 0; our $NL = qr/\r?\n\r?/; my %pseudo_classes=( TEXENV => 'TEXENV', PB => 'PB', BB => 'BB', CBB => 'CBB', XML => 'XMLtree', 'XML+1' => \&_tag_re, ); sub _regular_expressions { return <<'EORE'; our $__XMLattrs = qr/(?: \s+[a-zA-Z0-9:-]+\s* = \s*(?: '[^']+' | "[^"]+" ))*/x; ### This (?\n) is a BIG hack! our $__XMLempty = qr/<(?[a-zA-Z0-9:-]+)(?\b)$__XMLattrs\/>/x; our $__XMLtree2 = qr/$__XMLempty | (? <(?[a-zA-Z0-9:-]+)$__XMLattrs> (?: $__XMLempty | [^<]++ | (?&XML) )*+ <\/\k> )/x; our $__XMLtree = qr/$__XMLempty | (? <(?[a-zA-Z0-9:-]+)$__XMLattrs> (?(?: $__XMLempty | [^<]++ | $__XMLtree2 )*+) <\/\k> )/x; our $__XMLinner = qr/(?: [^<]++ | $__XMLempty | $__XMLtree2 )*+/x; our $__CBB = qr{ (? \{ (?(?:[^\{\}]++|(?&cbb1))*+) \} ) }sx; our $__BB = qr{ (? \[ (? (?:[^\[\]]++|(?&bb1) )*+) \] ) }sx; our $__PB = qr{ (? \( (? (?:[^\(\)]++|(?&pb1) )*+) \) ) }sx; our $__TEXENV = qr{\\begin\{(\w+)\}(.*?)\\end\{\1\}}s; ## \begin{$1}$2\end our $__TEXENV1 = qr{\\begin\{(\w+)\}($__BB?)($__CBB)(.*?)\\end\{\1\}}s; ## \begin{$1}[$2]{$3}$4\end EORE } sub _tag_re { my $tagname = shift; return "<$tagname\$__XMLattrs(?:\/>|>\$__XMLinner<\/$tagname>)"; } sub _expand_pseudo_classes { my $rules = shift; $rules =~ s/(\[\[:(\w+):\]\])/$pseudo_classes{$2}?"\$__$pseudo_classes{$2}":$1/ge; $rules =~ s/\[\[:(\w+)\(([^,\(\)]+)\):\]\]/$pseudo_classes{"$1+1"}->($2)/ge; return $rules; } sub _mrules { my ($conf, $name, $rules) = @_; ++$count; my $code = "sub $name {\n"; $code .= " my \$p = shift;\n"; $code .= " my \$_M = \"\\x01\";\n"; $code .= " for (\$p) {\n"; $code .= " my \$modified = 1;\n"; $code .= " \$_ = \$_M.\$_;\n"; $code .= " #__$count#\n"; $code .= " my \$iteration = 0;\n"; $code .= " MAIN: while (\$modified) {\n"; $code .= " \$iteration++;\n"; if ($DEBUG) { $code .= " print STDERR \" >\$_\\n\";\n" } $code .= " \$modified = 0;\n"; my $ICASE = exists($conf->{i})?"i":""; my $DX = exists($conf->{x})?"x":""; if (exists($conf->{d})) { $code .= " print STDERR \"Iteration on $name: \$iteration\n\$p\n\";"; } my @rules; if ($DX eq "x") { @rules = split /$NL$NL/, $rules; } else { @rules = split /$NL/, $rules; } for my $rule (@rules) { $rule =~ s/$NL$//; if ($rule =~ m/(.*?)(=i?=>)(.*)!!(.*)/) { my ($ant,$con,$cond) = ($1,$3,$4); $ICASE = "i" if $2 =~ m!i!; $ant = _expand_pseudo_classes($ant); $code .= " while (m{\${_M}(?:$ant)}g$ICASE) {\n"; $code .= " if ($cond) {\n"; $code .= " s{\${_M}(?:$ant)}{$con\${_M}}$ICASE;\n"; $code .= " pos = undef;\n"; $code .= " \$modified = 1;\n"; $code .= " next MAIN\n"; $code .= " }\n"; $code .= " }\n"; } elsif ($rule =~ m/(.*?)(=(?:i=)?e(?:val)?=>)(.*)!!(.*)/) { my ($ant,$con,$cond) = ($1,$3,$4); $ICASE = "i" if $2 =~ m!i!; $ant = _expand_pseudo_classes($ant); $code .= " while (m{\${_M}(?:$ant)}g$ICASE) {\n"; $code .= " if ($cond) {\n"; $code .= " s{\${_M}(?:$ant)}{eval{$con}.\${_M}}e$ICASE;\n"; $code .= " pos = undef;\n"; $code .= " \$modified = 1;\n"; $code .= " next MAIN\n"; $code .= " }\n"; $code .= " }\n"; } elsif ($rule =~ m/(.*?)(=i?=>)(.*)/) { my ($ant,$con) = ($1,$3); $ICASE = "i" if $2 =~ m!i!; $ant = _expand_pseudo_classes($ant); $code .= " if (m{\${_M}(?:$ant)}$ICASE) {\n"; $code .= " s{\${_M}(?:$ant)}{$con\${_M}}$ICASE;\n"; $code .= " \$modified = 1;\n"; $code .= " next\n"; $code .= " }\n"; } elsif($rule =~ m/=b(?:egin)?=>(.*)/s) { my $ac = $1; $code =~ s/(#__$count#\n)/$ac;\n$1/; } elsif ($rule =~ m/(.*?)(=(?:i=)?e(?:val)?=>)(.*)/) { my ($ant,$con) = ($1,$3); $ICASE = "i" if $2 =~ m!i!; $ant = _expand_pseudo_classes($ant); $code .= " if (m{\${_M}(?:$ant)}$ICASE) {\n"; $code .= " s{\${_M}(?:$ant)}{eval{$con}.\"\$_M\"}e$ICASE;\n"; $code .= " \$modified = 1;\n"; $code .= " next\n"; $code .= " }\n"; } elsif($rule =~ m/(.*?)(=(?:i=)?l(?:ast)?=>\s*!!(.*))/s) { my ($ant,$cond) = ($1,$3); $ICASE = "i" if $2 =~ m!i!; $ant = _expand_pseudo_classes($ant); $code .= " if (m{\${_M}(?:$ant)}$ICASE$DX) {\n"; $code .= " if ($cond) {\n"; $code .= " s{\${_M}}{};\n"; $code .= " last\n"; $code .= " }\n"; $code .= " }\n"; } elsif($rule =~ m/(.*?)(=(?:i=)?l(?:ast)?=>)/s) { my ($ant) = ($1); $ICASE = "i" if $2 =~ m!i!; $ant = _expand_pseudo_classes($ant); $code .= " if (m{\${_M}(?:$ant)}$ICASE$DX) {\n"; $code .= " s{\${_M}}{};\n"; $code .= " last\n"; $code .= " }\n"; } else { warn "Unknown rule: $rule\n" unless $rule =~ m!^\s*(#|$)!; } } ##--- # Make it walk... $code .= " if (m{\${_M}(.|\\n)}) {\n"; $code .= " s{\${_M}(.|\\n)}{\$1\${_M}};\n"; $code .= " \$modified = 1;\n"; $code .= " next\n"; $code .= " }\n"; $code .= " }\n"; $code .= " s/\$_M//;\n"; $code .= " }\n"; $code .= " return \$p;\n"; $code .= "}\n"; $code; } sub _rules { my ($conf, $name, $rules) = @_; ++$count; my $code = "sub $name {\n"; $code .= " my \$p = shift;\n"; $code .= " for (\$p) {\n"; $code .= " my \$modified = 1;\n"; $code .= " #__$count#\n"; $code .= " my \$iteration = 0;\n"; $code .= " MAIN: while(\$modified) {\n"; $code .= " print STDERR \$_;\n" if $DEBUG > 1; $code .= " \$modified = 0;\n"; $code .= " \$iteration++;\n"; ##--- my $DICASE = exists($conf->{i})?"i":""; my $DX = exists($conf->{x})?"x":""; if (exists($conf->{d})) { $code .= " print STDERR \"Iteration on $name: \$iteration\n\$p\n\";"; } my @rules; if ($DX eq "x") { @rules = split /$NL$NL/, $rules; } else { @rules = split /$NL/, $rules; } for my $rule (@rules) { $rule =~ s/$NL$//; my $ICASE = $DICASE; if($rule =~ m/(.*?)(=i?=>)(.*)!!(.*)/s) { my ($ant,$con,$cond) = ($1,$3,$4); $ICASE = "i" if $2 =~ m!i!; $ant = _expand_pseudo_classes($ant); $code .= " while (m{$ant}g$ICASE$DX) {\n"; $code .= " if ($cond) {\n"; $code .= " s{$ant\\G}{$con}$ICASE$DX;\n"; $code .= " pos = undef;\n"; $code .= " \$modified = 1;\n"; $code .= " next MAIN\n"; $code .= " }\n"; $code .= " }\n"; } elsif($rule =~ m/(.*?)(=(?:i=)?e(?:val)?=>)(.*)!!(.*)/s) { my ($ant,$con,$cond) = ($1,$3,$4); $ICASE = "i" if $2 =~ m!i!; $ant = _expand_pseudo_classes($ant); $code .= " while (m{$ant}g$ICASE$DX) {\n"; $code .= " if ($cond) {\n"; $code .= " s{$ant\\G}{$con}e${ICASE}${DX};\n"; $code .= " pos = undef;\n"; $code .= " \$modified = 1;\n"; $code .= " next MAIN\n"; $code .= " }\n"; $code .= " }\n"; } elsif ($rule =~ m/(.*?)(=i?=>)(.*)/s) { my ($ant,$con) = ($1,$3); $ICASE = "i" if $2 =~ m!i!; $ant = _expand_pseudo_classes($ant); $code .= " if (m{$ant}$ICASE$DX) {\n"; $code .= " s{$ant}{$con}$ICASE$DX;\n"; $code .= " \$modified = 1;\n"; $code .= " next\n"; $code .= " }\n"; } elsif ($rule =~ m/(.*?)(=(?:i=)?e(?:val)?=>)(.*)/s) { my ($ant,$con) = ($1,$3); $ICASE = "i" if $2 =~ m!i!; $ant = _expand_pseudo_classes($ant); $code .= " if (m{$ant}$ICASE$DX) {\n"; $code .= " s{$ant}{$con}e$ICASE$DX;\n"; $code .= " \$modified = 1;\n"; $code .= " next\n"; $code .= " }\n"; } elsif($rule =~ m/=b(?:egin)?=>(.*)/s) { my $ac = $1; $code =~ s/(#__$count#\n)/$ac;\n$1/; } elsif($rule =~ m/(.*?)(=(i=)?l(ast)?=>\s*!!(.*))/s) { my ($ant,$cond) = ($1,$5); $ICASE = "i" if $2 =~ m!i!; $ant = _expand_pseudo_classes($ant); $code .= " if (m{$ant}$ICASE$DX) {\n"; $code .= " if ($cond) {\n"; $code .= " last\n"; $code .= " }\n"; $code .= " }\n"; } elsif($rule =~ m/(.*?)(=(i=)?l(ast)?=>)/s) { my ($ant) = ($1); $ICASE = "i" if $2 =~ m!i!; $ant = _expand_pseudo_classes($ant); $code .= " if (m{$ant}$ICASE$DX) {\n"; $code .= " last\n"; $code .= " }\n"; } else { warn "Unknown rule: $rule\n" unless $rule =~ m!^\s*(#|$)!; } } ##--- $code .= " }\n"; $code .= " }\n"; $code .= " return \$p;\n"; $code .= "}\n"; $code; } sub _lrules { my ($conf, $name, $rules) = @_; ++$count; my $code = "my \$${name}_input = \"\";\n"; $code .= "sub ${name}_init {\n"; $code .= " \$${name}_input = shift;\n"; $code .= " return 1;\n"; $code .= "}\n\n"; $code .= "sub $name {\n"; $code .= " return undef if not defined \$${name}_input;\n"; $code .= " print STDERR \$_;\n" if $DEBUG > 1; $code .= " for (\$${name}_input) {\n"; ##--- my $DICASE = exists($conf->{i})?"i":""; my $DX = exists($conf->{x})?"x":""; my @rules; if ($DX eq "x") { @rules = split /$NL$NL/, $rules; } else { @rules = split /$NL/, $rules; } for my $rule (@rules) { $rule =~ s/$NL$//; my $ICASE = $DICASE; if ($rule =~ m/=EOF=>(.*)/s) { my $act = $1; $code .= " if (m{^\$}) {\n"; $code .= " \$${name}_input = undef;\n"; $code .= " return \"$act\";\n"; $code .= " }\n"; } elsif ($rule =~ m/=EOF=e=>(.*)/s) { my $act = $1; $code .= " if (m{^\$}) {\n"; $code .= " \$${name}_input = undef;\n"; $code .= " return $act;\n"; $code .= " }\n"; } elsif ($rule =~ m/(.*?)(=(?:i=)?ignore=>)(.*)!!(.*)/s) { my ($ant,$cond) = ($1, $4); $ICASE = "i" if $2 =~ m!i!; $ant = _expand_pseudo_classes($ant); $code .= " if (m{^$ant}g$ICASE$DX) {\n"; $code .= " if ($cond) {\n"; $code .= " s{$ant}{}$ICASE$DX;\n"; $code .= " pos = undef;\n"; $code .= " return $name();\n"; $code .= " }\n"; $code .= " }\n"; } elsif ($rule =~ m/(.*?)(=(?:i=)?ignore=>)(.*)/s) { my ($ant) = ($1); $ICASE = "i" if $2 =~ m!i!; $ant = _expand_pseudo_classes($ant); $code .= " if (m{^$ant}g$ICASE$DX) {\n"; $code .= " s{$ant}{}$ICASE$DX;\n"; $code .= " pos = undef;\n"; $code .= " return $name();\n"; $code .= " }\n"; } elsif($rule =~ m/(.*?)(=i?=>)(.*)!!(.*)/s) { my ($ant,$con,$cond) = ($1,$3,$4); $ICASE = "i" if $2 =~ m!i!; $ant = _expand_pseudo_classes($ant); $code .= " if (m{^$ant}g$ICASE$DX) {\n"; $code .= " if ($cond) {\n"; $code .= " s{$ant}{}$ICASE$DX;\n"; $code .= " pos = undef;\n"; $code .= " return \"$con\"\n"; $code .= " }\n"; $code .= " }\n"; } elsif($rule =~ m/(.*?)(=(?:i=)?e(?:val)?=>)(.*)!!(.*)/s) { my ($ant,$con,$cond) = ($1,$3,$4); $ICASE = "i" if $2 =~ m!i!; $ant = _expand_pseudo_classes($ant); $code .= " if (m{^$ant}g$ICASE$DX) {\n"; $code .= " if ($cond) {\n"; $code .= " s{$ant}{}${ICASE}${DX};\n"; $code .= " pos = undef;\n"; $code .= " return $con;\n"; $code .= " }\n"; $code .= " }\n"; } elsif($rule =~ m/(.*?)(=i?=>)(.*)/s) { my ($ant,$con) = ($1,$3); $ICASE = "i" if $2 =~ m!i!; $ant = _expand_pseudo_classes($ant); $code .= " if (m{^$ant}g$ICASE$DX) {\n"; $code .= " s{$ant}{}$ICASE$DX;\n"; $code .= " pos = undef;\n"; $code .= " return \"$con\"\n"; $code .= " }\n"; } elsif($rule =~ m/(.*?)(=(?:i=)?e(?:val)?=>)(.*)/s) { my ($ant,$con) = ($1,$3); $ICASE = "i" if $2 =~ m!i!; $ant = _expand_pseudo_classes($ant); $code .= " if (m{^$ant}g$ICASE$DX) {\n"; $code .= " s{$ant}{}${ICASE}${DX};\n"; $code .= " pos = undef;\n"; $code .= " return $con;\n"; $code .= " }\n"; } else { warn "Unknown rule in lexer mode: $rule\n" unless $rule =~ m!^\s*(#|$)!; } } ##--- $code .= " }\n"; $code .= " return undef;\n"; $code .= "}\n"; $code; } FILTER { return if m!^(\s|\n)*$!; s!^!_regular_expressions()!e; print STDERR "BEFORE>>>>\n$_\n<<<<\n" if $DEBUG; s!^MRULES +(\w+)\s*?\n((?:.|\n)*?)^ENDRULES!_mrules({}, $1,$2)!gem; s!^LRULES +(\w+)\s*?\n((?:.|\n)*?)^ENDRULES!_lrules({}, $1,$2)!gem; s{^RULES((?:\/\w+)?) +(\w+)\s*?\n((?:.|\n)*?)^ENDRULES}{ my ($a,$b,$c) = ($1,$2,$3); my $conf = {map {($_=>$_)} split //,$a}; if (exists($conf->{'l'})) { _lrules($conf, $b, $c) } elsif (exists($conf->{'m'})) { _mrules($conf,$b,$c) } else { _rules($conf,$b,$c) } }gem; print STDERR "AFTER>>>>\n$_\n<<<<\n" if $DEBUG; $_ }; sub _compiler{ local $/ = undef; $_ = <>; print __compiler($_); } sub __compiler { my $str = shift; for ($str) { s!use Text::RewriteRules;!_regular_expressions()!e; s!^MRULES +(\w+)\s*\n((?:.|\n)*?)^ENDRULES!_mrules({}, $1,$2)!gem; s!^LRULES +(\w+)\s*\n((?:.|\n)*?)^ENDRULES!_lrules({}, $1,$2)!gem; s{^RULES((?:\/\w+)?) +(\w+)\s*\n((?:.|\n)*?)^ENDRULES}{ my ($a,$b,$c) = ($1,$2,$3); my $conf = {map {($_=>$_)} split //,$a}; if (exists($conf->{'l'})) { _lrules($conf,$b,$c) } elsif (exists($conf->{'m'})) { _mrules($conf,$b,$c) } else { _rules($conf,$b,$c) } }gem; } return $str; } =head1 TUTORIAL At the moment, just a set of commented examples. Example1 -- from number to portuguese words (using traditional rewriting) Example2 -- Naif translator (using cursor-based rewriting) =head1 Conversion between numbers and words Yes, you can use L and similar (for other languages). Meanwhile, before it existed we needed to write such a conversion tool. Here I present a subset of the rules (for numbers bellow 1000). The generated text is Portuguese but I think you can get the idea. I'll try to create a version for English very soon. You can check the full code on the samples directory (file C). use Text::RewriteRules; RULES num2words 100==>cem 1(\d\d)==>cento e $1 0(\d\d)==>$1 200==>duzentos 300==>trezentos 400==>quatrocentos 500==>quinhentos 600==>seiscentos 700==>setecentos 800==>oitocentos 900==>novecentos (\d)(\d\d)==>${1}00 e $2 10==>dez 11==>onze 12==>doze 13==>treze 14==>catorze 15==>quinze 16==>dezasseis 17==>dezassete 18==>dezoito 19==>dezanove 20==>vinte 30==>trinta 40==>quarenta 50==>cinquenta 60==>sessenta 70==>setenta 80==>oitenta 90==>noventa 0(\d)==>$1 (\d)(\d)==>${1}0 e $2 1==>um 2==>dois 3==>três 4==>quatro 5==>cinco 6==>seis 7==>sete 8==>oito 9==>nove 0$==>zero 0==> ==> ,==>, ENDRULES num2words(123); # returns "cento e vinte e três" =head2 Naif translator (using cursor-based rewriting) use Text::RewriteRules; %dict=(driver=>"motorista", the=>"o", of=>"de", car=>"carro"); $word='\b\w+\b'; if( b(a("I see the Driver of the car")) eq "(I) (see) o Motorista do carro" ) {print "ok\n"} else {print "ko\n"} RULES/m a ($word)==>$dict{$1}!! defined($dict{$1}) ($word)=e=> ucfirst($dict{lc($1)}) !! defined($dict{lc($1)}) ($word)==>($1) ENDRULES RULES/m b \bde o\b==>do ENDRULES =head1 AUTHOR Alberto Simões, C<< >> José João Almeida, C<< >> =head1 BUGS We know documentation is missing and you all want to use this module. In fact we are using it a lot, what explains why we don't have the time to write documentation. Please report any bugs or feature requests to C, or through the web interface at L. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes. =head1 ACKNOWLEDGEMENTS Damian Conway for Filter::Simple =head1 COPYRIGHT & LICENSE Copyright 2004-2012 Alberto Simões and José João Almeida, All Rights Reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself. =cut 1; # End of Text::RewriteRules