parser, which simply encodes the
entire file and wraps it in a C<< <pre> >> element.
=item C<encoding>
The character encoding to assume the source file is encoded in (if such cannot
be determined by other means, such as a BOM). If not specified, the value of
the C<default_encoding> attribute will be used, and if that attribute is not
set, UTF-8 will be assumed.
=item C<options>
An array reference of options for the parser. See the documentation of the
various parser modules for details.
=back
=head3 C<default_format>
    my $format = $parser->default_format;
    $parser->default_format('markdown');
An accessor for the default format attribute.
=head3 C<default_encoding>
    my $encoding = $parser->default_encoding;
    $parser->default_encoding('Big5');
An accessor for the default encoding attribute.
=head3 C<guess_format>
    my $format = $parser->guess_format($filename);
Compares the passed file name's suffix to the regular expressions of all
registered formatting parsers and returns the first one that matches. Returns
C<undef> if none matches.
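As a rough illustration of the suffix matching described above, the following standalone sketch keeps a hypothetical C<%regex_for> table and checks a file name against each entry. The table contents and the name C<guess_format_sketch> are assumptions made for the example; it is not a copy of how Text::Markup implements C<guess_format>.

```perl
use strict;
use warnings;

# Hypothetical registry of format names and suffix patterns (assumed values).
my %regex_for = (
    foobar   => qr{fb|foob(?:ar)?},
    html     => qr{x?html?},
    markdown => qr{m(?:d(?:own)?|kdn?|arkdown)},
);

# Return the first format (in sorted key order) whose pattern matches the
# file name's suffix, or undef if none matches.
sub guess_format_sketch {
    my ($file) = @_;
    for my $format (sort keys %regex_for) {
        return $format if $file =~ /[.]$regex_for{$format}\z/;
    }
    return;
}

for my $file (qw(README.md index.xhtml notes.foob LICENSE)) {
    my $format = guess_format_sketch($file);
    printf "%-12s => %s\n", $file, defined $format ? $format : 'undef';
}
```

Sorting the keys only makes the sketch deterministic; the order in which the real module tries its patterns is not specified here.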
=head1 Add a Parser
Adding support for markup formats not supported by the core Text::Markup
distribution is a straightforward exercise. Say you wanted to add a "FooBar"
markup parser. Here are the steps to take:
=over
=item 1
Fork the Text::Markup repository on GitHub.
=item 2
Clone your fork and create a new branch in which to work:
    git clone git@github.com:$USER/text-markup.git
    cd text-markup
    git checkout -b foobar
=item 3
Create a new module named C<Text::Markup::FooBar>. The simplest thing to do is
copy an existing module and modify it. The HTML parser is probably the simplest:
    cp lib/Text/Markup/HTML.pm lib/Text/Markup/FooBar.pm
    perl -i -pe 's{HTML}{FooBar}g' lib/Text/Markup/FooBar.pm
    perl -i -pe 's{html}{foobar}g' lib/Text/Markup/FooBar.pm
=item 4
Implement the C<parser> function in your new module. If you were to use a
C<Text::FooBar> module, it might look something like this:
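    package Text::Markup::FooBar;
    use 5.8.1;
    use strict;
    use Text::FooBar ();
    use File::BOM qw(open_bom);
    sub import {
        # Replace the regex if passed one.
        Text::Markup->register( foobar => $_[1] ) if $_[1];
    }
    sub parser {
        my ($file, $encoding, $opts) = @_;
        my $md = Text::FooBar->new(@{ $opts || [] });
        # Let a BOM decide the encoding; otherwise use the one passed in.
        open_bom my $fh, $file, ":encoding($encoding)";
        local $/;
        my $html = $md->parse(<$fh>);
        # No content: return undef.
        return unless $html =~ /\S/;
        # Return UTF-8 bytes and declare the encoding in the document.
        utf8::encode($html);
        return join( "\n",
            '<html>',
            '<head><meta http-equiv="Content-Type" content="text/html; charset=utf-8" /></head>',
            '<body>',
            $html,
            '</body>',
            '</html>',
        );
    }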
Use the C<$encoding> argument as appropriate to read in the source file. If
your parser requires that text be decoded to Perl's internal form, use of
L<File::BOM> is recommended, so that an explicit BOM will determine the
encoding. Otherwise, fall back on the specified encoding. Note that some
parsers, such as an HTML parser, expect encoded text as input. In such a
case, read in the file as raw bytes:
    open my $fh, '<:raw', $file or die "Cannot open $file: $!\n";
The returned HTML, however, B<must be encoded in UTF-8>. Please include an
encoding declaration, such as a content-type C<< <meta> >> element that
declares C<charset=utf-8>. This will allow any consumers of the returned
HTML to parse it correctly.
If the parser parsed no content, C<parser()> should return C<undef>.
=item 5
Edit F<lib/Text/Markup.pm> and add an entry to its C<%REGEX_FOR> hash for your
new format. The key should be the name of the format (lowercase, the same as
the last part of your module's name). The value should be a regular expression
that matches the file extensions that suggest that a file is formatted in your
parser's markup language. For our FooBar parser, the line might look like
this:
    foobar => qr{fb|foob(?:ar)?},
=item 6
Add a sample file written in your parser's markup language to the test
suite, alongside the sample files for the other formats. It should be named
for your parser and end in F<.txt>, that is, F<foobar.txt>.
=item 7
Add an HTML file, F<foobar.html>, which should be the expected output
once F<foobar.txt> is parsed into HTML. This will be used to test
that your parser works correctly.
=item 8
Edit F<t/formats.t> by adding a line to its C<__DATA__> section. The line
should be a comma-separated list describing your parser. The columns are:
=over
=item * Format
The lowercased name of the format.
=item * Format Module
The name of the parser module.
=item * Required Module
The name of a module that's required to be installed in order for your parser
to load.
=item * Extensions
Additional comma-separated values should be a list of file extensions that
your parser should recognize.
=back
So for our FooBar parser, it might look like this:
    foobar,Text::Markup::FooBar,Text::FooBar 0.22,fb,foob,foobar
=item 9
Test your new parser by running
    prove -lv t/formats.t
This will test I<all> included parsers, but of course you should only pay
attention to how your parser works. Tweak until your tests pass. Note that one
test has the parser parse a file with just a couple of empty lines, to ensure
that the parser finds no content and returns C<undef>.
=item 10
Don't forget to write the documentation in your new parser module! If you
copied F<lib/Text/Markup/HTML.pm>, you can just modify as appropriate.
=item 11
Add any new module requirements to the prerequisites declared in the
distribution's build file.
=item 12
Commit and push the branch to your fork on GitHub:
    git add .
    git commit -am 'Add great new FooBar parser!'
    git push origin -u foobar
=item 13
And finally, submit a pull request to the upstream repository via the GitHub
UI.
=back
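The contract from step 4 (decode the file, convert it, return C<undef> when there is no content, and hand back UTF-8 bytes with an encoding declaration) can be tried out without installing anything. The sketch below is only an illustration under stated assumptions: C<foobar_to_html> is a made-up stand-in for a real converter, C<parser_sketch> is a hypothetical name, and plain C<open> with an encoding layer is used instead of File::BOM so that the example depends on core modules only.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-in for a real converter such as Text::FooBar: wraps
# each blank-line-separated block in a <p> element. No HTML escaping is
# done; this is for illustration only.
sub foobar_to_html {
    my ($text) = @_;
    my @blocks = grep { /\S/ } split /\n{2,}/, $text;
    s/\A\s+|\s+\z//g for @blocks;
    return join '', map { "<p>$_</p>\n" } @blocks;
}

sub parser_sketch {
    my ($file, $encoding, $opts) = @_;
    open my $fh, "<:encoding($encoding)", $file
        or die "Cannot open $file: $!\n";
    my $text = do { local $/; <$fh> };
    close $fh;

    # No content: return undef, which is what the test suite checks for.
    return unless defined $text && $text =~ /\S/;

    my $html = foobar_to_html($text);
    utf8::encode($html);    # hand back UTF-8 bytes, not characters
    return join "\n",
        '<html>',
        '<head><meta http-equiv="Content-Type" content="text/html; charset=utf-8" /></head>',
        '<body>',
        $html,
        '</body>',
        '</html>';
}

# Write a small sample file and parse it.
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out, ':encoding(UTF-8)';
print {$out} "Hello\n\nWorld\n";
close $out;
print parser_sketch($path, 'UTF-8', []), "\n";
```

A real parser would normally prefer C<open_bom> from File::BOM, as recommended in step 4, so that a BOM in the file overrides the passed encoding.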
If you don't want to submit your parser, you can still create and use one
independently. Just omit editing the C<%REGEX_FOR> hash in this module and make
sure you C<register> the parser manually with a default regular expression
in the C<import> method, like so:
    package My::Markup::FooBar;
    use Text::Markup;
    sub import {
        Text::Markup->register( foobar => $_[1] || qr{fb|foob(?:ar)?} );
    }
This will be useful for creating private parsers you might not want to
contribute, or that you'd want to distribute independently.
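To see how the regular expression reaches C<import>, here is a small self-contained sketch. C<My::Registry> is a hypothetical stand-in for Text::Markup's registry, introduced only so the example runs without the distribution installed. The explicit C<< ->import >> calls mimic what a C<use> line does at compile time.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Text::Markup's registry (an assumption for
# this sketch, not the real implementation).
package My::Registry;
my %regex_for;
sub register  { my ($class, $name, $re) = @_; $regex_for{$name} = $re; return }
sub regex_for { my ($class, $name) = @_; return $regex_for{$name} }

package My::Markup::FooBar;
sub import {
    # $_[0] is the class name; $_[1] is the optional regex from the use line.
    My::Registry->register( foobar => $_[1] || qr{fb|foob(?:ar)?} );
}

package main;

# Equivalent to "use My::Markup::FooBar;"
My::Markup::FooBar->import;
print "default:  ", My::Registry->regex_for('foobar'), "\n";

# Equivalent to "use My::Markup::FooBar qr{fbr};"
My::Markup::FooBar->import(qr{fbr});
print "override: ", My::Registry->regex_for('foobar'), "\n";
```

Because the default is supplied with C<||>, loading the module without arguments always registers something, which is the behavior an independently distributed parser needs.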
=head1 See Also
=over
=item *
The GitHub Markup Ruby library -- the inspiration
for this module -- provides similar functionality, and is used to parse
F<README> files on GitHub.
=item *
L offers similar functionality.
=back
=head1 Support
This module is stored in an open GitHub repository. Feel free to
fork and contribute!
Please file bug reports via GitHub Issues.
=head1 Author
David E. Wheeler
=head1 Copyright and License
Copyright (c) 2011-2024 David E. Wheeler. Some Rights Reserved.
This module is free software; you can redistribute it and/or modify it under
the same terms as Perl itself.
=cut
package Text::Markup::Asciidoc;
use 5.8.1;
use strict;
use warnings;
use Text::Markup;
use Text::Markup::Cmd;
use utf8;
our $VERSION = '0.33';
sub import {
# Replace the regex if passed one.
Text::Markup->register( asciidoc => $_[1] ) if $_[1];
}
my $ASCIIDOC = find_cmd([
(map { (WIN32 ? ("$_.exe", "$_.bat") : ($_)) } qw(asciidoc)),
'asciidoc.py',
], '--version');
# Arguments to pass to asciidoc.
# Restore --safe if Asciidoc ever fixes it with the XHTML back end.
# https://groups.google.com/forum/#!topic/asciidoc/yEr5PqHm4-o
my @OPTIONS = qw(
--no-header-footer
--out-file -
--attribute newline=\\n
);
sub parser {
my ($file, $encoding, $opts) = @_;
my $html = do {
my $fh = open_pipe(
$ASCIIDOC, @OPTIONS,
'--attribute' => "encoding=$encoding",
$file
);
binmode $fh, ":encoding($encoding)";
local $/;
<$fh>;
};
# Make sure we have something.
return unless $html =~ /\S/;
utf8::encode $html;
return $html if { @{ $opts } }->{raw};
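return qq{<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
<body>
$html
</body>
</html>
};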
}
1;
__END__
=head1 Name
Text::Markup::Asciidoc - Asciidoc parser for Text::Markup
=head1 Synopsis
use Text::Markup;
my $html = Text::Markup->new->parse(file => 'hello.adoc');
my $raw = Text::Markup->new->parse(
file => 'hello.adoc',
options => [raw => 1],
);
=head1 Description
This is the Asciidoc parser for L<Text::Markup>. It
depends on the L<C<asciidoc>|https://asciidoc-py.github.io> command-line
application. See the L
for details, or use the command C.
Text::Markup::Asciidoc recognizes files with the following extensions as
Asciidoc:
=over
=item F<.asciidoc>
=item F<.asc>
=item F<.adoc>
=back
To change the files it recognizes, load this module directly and pass a
regular expression matching the desired extension(s), like so:
use Text::Markup::Asciidoc qr{ski?doc};
Normally this parser returns the output of C<asciidoc> wrapped in a minimal
HTML page skeleton. If you would prefer to just get the exact output returned
by C<asciidoc>, you can pass in a true value for the C<raw> option.
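The options are passed as a flat list of key/value pairs inside an array reference, which the parser turns into a hash before looking up C<raw>. A minimal sketch of that lookup follows; C<wants_raw> is a hypothetical helper name, not part of this module.

```perl
use strict;
use warnings;

# Hypothetical helper: report whether a flat options list asks for raw output.
sub wants_raw {
    my ($opts) = @_;
    my %params = @{ $opts || [] };    # [ raw => 1 ] becomes ( raw => 1 )
    return $params{raw} ? 1 : 0;
}

print wants_raw([ raw => 1 ]), "\n";
print wants_raw([]), "\n";
```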
=head1 Author
David E. Wheeler
=head1 Copyright and License
Copyright (c) 2012-2024 David E. Wheeler. Some Rights Reserved.
This module is free software; you can redistribute it and/or modify it under
the same terms as Perl itself.
=cut
package Text::Markup::Asciidoctor;
use 5.8.1;
use strict;
use warnings;
use Text::Markup;
use Text::Markup::Cmd;
use utf8;
our $VERSION = '0.33';
sub import {
# Replace Text::Markup::Asciidoc.
Text::Markup->register( asciidoc => $_[1] || qr{a(?:sc(?:iidoc)?|doc)?} );
}
# Find Asciidoctor.
my $ASCIIDOC = find_cmd([
(map { (WIN32 ? ("$_.exe", "$_.bat") : ($_)) } qw(asciidoctor)),
'asciidoctor.py',
], '--version');
# Arguments to pass to asciidoc.
# Restore --safe if Asciidoc ever fixes it with the XHTML back end.
# https://groups.google.com/forum/#!topic/asciidoc/yEr5PqHm4-o
my @OPTIONS = qw(
--no-header-footer
--out-file -
--attribute newline=\\n
);
sub parser {
my ($file, $encoding, $opts) = @_;
my $html = do {
my $fh = open_pipe(
$ASCIIDOC, @OPTIONS,
'--attribute' => "encoding=$encoding",
$file
);
binmode $fh, ":encoding($encoding)";
local $/;
<$fh>;
};
# Make sure we have something.
return unless $html =~ /\S/;
utf8::encode $html;
return $html if { @{ $opts } }->{raw};
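return qq{<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
<body>
$html
</body>
</html>
};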
}
1;
__END__
=head1 Name
Text::Markup::Asciidoctor - Asciidoctor parser for Text::Markup
=head1 Synopsis
use Text::Markup::Asciidoctor;
my $html = Text::Markup->new->parse(file => 'hello.adoc');
my $raw = Text::Markup->new->parse(
file => 'hello.adoc',
options => [raw => 1],
);
=head1 Description
This is the Asciidoc parser for L<Text::Markup>. It
depends on the C<asciidoctor> command-line application; see the
L for details, or
use the command C. Note that L<Text::Markup> does
not load this module by default, but when loaded manually will replace
Text::Markup::Asciidoc as the preferred Asciidoc parser.
Text::Markup::Asciidoctor reads in the file (relying on a BOM), hands it off to
L<C<asciidoctor>|https://asciidoctor.org> for parsing, and then returns the
generated HTML as an encoded UTF-8 string with an C<http-equiv>
element identifying the encoding as UTF-8.
Text::Markup::Asciidoctor recognizes files with the following extensions as
Asciidoc:
=over
=item F<.asciidoc>
=item F<.asc>
=item F<.adoc>
=back
To change the files it recognizes, load this module directly and pass a
regular expression matching the desired extension(s), like so:
    use Text::Markup::Asciidoctor qr{ski?doc};
Normally this parser returns the output of C<asciidoctor> wrapped in a minimal
HTML page skeleton. If you would prefer to just get the exact output returned
by C<asciidoctor>, you can pass in a true value for the C<raw> option.
=head1 Author
David E. Wheeler
=head1 Copyright and License
Copyright (c) 2012-2024 David E. Wheeler. Some Rights Reserved.
This module is free software; you can redistribute it and/or modify it under
the same terms as Perl itself.
=cut
package Text::Markup::Bbcode;
use 5.8.1;
use strict;
use warnings;
use Text::Markup;
use File::BOM qw(open_bom);
use Parse::BBCode;
our $VERSION = '0.33';
sub import {
# Replace the regex if passed one.
Text::Markup->register( bbcode => $_[1] ) if $_[1];
}
sub parser {
my ($file, $encoding, $opts) = @_;
my %params = @{ $opts };
my $parse = Parse::BBCode->new(\%params);
open_bom my $fh, $file, ":encoding($encoding)";
local $/;
my $html = $parse->render(<$fh>);
return unless $html =~ /\S/;
utf8::encode($html);
return $html if $params{raw};
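return qq{<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
<body>
$html
</body>
</html>
};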
}
1;
__END__
=head1 Name
Text::Markup::Bbcode - BBcode parser for Text::Markup
=head1 Synopsis
my $html = Text::Markup->new->parse(file => 'file.bbcode');
my $raw = Text::Markup->new->parse(
file => 'file.bbcode',
options => [ raw => 1 ],
);
=head1 Description
This is the BBcode parser for L<Text::Markup>. It
reads in the file (relying on a BOM), hands it off to
L<Parse::BBCode> for parsing, and then returns the generated HTML as an
encoded UTF-8 string with an C<http-equiv> element identifying
the encoding as UTF-8.
It recognizes files with the following extensions as BBcode:
=over
=item F<.bb>
=item F<.bbcode>
=back
To change the files it recognizes, load this module directly and pass a
regular expression matching the desired extension(s), like so:
use Text::Markup::Bbcode qr{beebee};
Normally this module returns the output wrapped in a minimal HTML document
skeleton. If you would like the raw output without the skeleton, you can pass
the C<raw> option to C<parse>.
In addition, Text::Markup::Bbcode supports all of the
L<Parse::BBCode> options, including:
=over
=item C
=item C
=item C
=item C
=item C
=item C
=item C
=item C
=item C
=item C
=item C
=item C
=back
=head1 Author
Lucas Kanashiro
=head1 Copyright and License
Copyright (c) 2011-2023 Lucas Kanashiro. Some Rights Reserved.
This module is free software; you can redistribute it and/or modify it under
the same terms as Perl itself.
=cut
package Text::Markup::Cmd;
use strict;
use warnings;
use Symbol;
use IPC::Open3;
use File::Spec;
use Carp;
use constant WIN32 => $^O eq 'MSWin32';
use Exporter 'import';
our @EXPORT = qw(find_cmd exec_or_die open_pipe WIN32);
sub find_cmd {
my ($names, @opts) = @_;
my $cli;
EXE: {
for my $exe (@{ $names }) {
for my $p (File::Spec->path) {
my $path = File::Spec->catfile($p, $exe);
next unless -f $path && -x $path;
$cli = $path;
last EXE;
}
}
}
unless ($cli) {
my $list = join(', ', @{ $names }[0..$#$names-1]) . ", or $names->[-1]";
Carp::croak( "Cannot find $list in path $ENV{PATH}" );
}
# Make sure it looks like it will work.
exec_or_die("$cli will not execute", $cli, @opts);
return $cli;
}
sub exec_or_die {
my $err = shift;
my $output = gensym;
my $pid = open3(undef, $output, $output, @_);
waitpid $pid, 0;
return 1 unless $?;
use Carp;
local $/;
Carp::croak( qq{$err\n}, <$output> );
}
# Stolen from SVN::Notify.
sub open_pipe {
# Ignored; looks like docutils always emits UTF-8.
if (WIN32) {
my $cmd = q{"} . join(q{" "}, @_) . q{"|};
open my $fh, $cmd or die "Cannot fork: $!\n";
return $fh;
}
my $pid = open my $fh, '-|';
die "Cannot fork: $!\n" unless defined $pid;
if ($pid) {
# Parent process, return the file handle.
return $fh;
} else {
# Child process. Execute the commands.
exec @_ or die "Cannot exec $_[0]: $!\n";
# Not reached.
}
}
1;
__END__
=head1 Name
Text::Markup::Cmd - Tools for external commands
=head1 Synopsis
use Text::Markup::Cmd;
my $fh = open_pipe(qw(perl -V));
=head1 Description
Text::Markup::Cmd provides tools for Text::Markup parsers that depend on
external commands, such as L<Text::Markup::Rest> and
L<Text::Markup::Asciidoc>. Will mainly be of interest to those
L<adding a parser|Text::Markup/Add a Parser> with such a dependency.
=head1 Interface
=head2 Exported Functions
=head3 C<WIN32>
my $exe = 'nerble' . (WIN32 ? '.exe' : '');
Constant indicating whether the current runtime environment (OS) is Windows.
=head3 C<find_cmd>
my $cmd = find_cmd(
['nerble' . (WIN32 ? '.exe' : ''), 'nerble.rb'],
'--version',
);
Searches the path for one or more named commands. Returns the first command
to be found in the path and which executes with the specified command line
options without error. The caller must specify OS-appropriate spellings
of the commands.
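The path search can be pictured with the short sketch below. It is a simplified illustration, not the module's code: C<first_in_path> is a hypothetical name, it accepts an optional list of directories so it can be tried against a scratch directory, and unlike C<find_cmd> it does not run the command to confirm that it works.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return the first candidate name found as an
# executable file in the given directories (default: the search path).
sub first_in_path {
    my ($names, @dirs) = @_;
    @dirs = File::Spec->path unless @dirs;
    for my $name (@{ $names }) {
        for my $dir (@dirs) {
            my $path = File::Spec->catfile($dir, $name);
            return $path if -f $path && -x _;
        }
    }
    return;
}

my $perl = first_in_path(['perl', 'perl.exe']);
print defined $perl ? "Found $perl\n" : "perl is not in the search path\n";
```

Candidate names are tried in the order given, so earlier names win even when a later name appears in an earlier directory.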
=head3 C<exec_or_die>
exec_or_die(
qq{Missing required Python "docutils" module},
$PYTHON, '-c', 'import docutils',
);
Executes a command and its arguments. Dies with the error argument if the
command fails.
=head3 C<open_pipe>
    my $fh = open_pipe(qw(nerble --as-html input.nerb));
Executes a command and its arguments and returns a file handle opened to
its C<STDOUT>. Dies if the command fails.
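For comparison, reading a command's output can also be done with Perl's built-in list-form pipe open. The sketch below is an assumption-laden illustration rather than a description of C<open_pipe> itself: C<slurp_command> is a hypothetical helper, and it runs the current perl binary (C<$^X>) so that no external tool is required.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command (list form, no shell) and return
# everything it writes to standard output.
sub slurp_command {
    my @cmd = @_;
    open my $fh, '-|', @cmd or die "Cannot run $cmd[0]: $!\n";
    my $out = do { local $/; <$fh> };
    close $fh or die "$cmd[0] exited with status $?\n";
    return $out;
}

# $^X is the path of the running perl, so no external tool is assumed.
print slurp_command($^X, '-e', 'print "hello from a pipe\n"');
```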
=head1 Author
David E. Wheeler
=head1 Copyright and License
Copyright (c) 2012-2024 David E. Wheeler. Some Rights Reserved.
This module is free software; you can redistribute it and/or modify it under
the same terms as Perl itself.
=cut
package Text::Markup::CommonMark;
use 5.8.1;
use strict;
use warnings;
use CommonMark;
use Text::Markup;
use File::BOM qw(open_bom);
our $VERSION = '0.33';
sub import {
# Replace Text::Markup::Markdown.
Text::Markup->register( markdown => $_[1] || qr{m(?:d(?:own)?|kdn?|arkdown)} );
}
sub parser {
my ($file, $encoding, $opts) = @_;
open_bom my $fh, $file, ":encoding($encoding)";
my %params = @{ $opts };
my $html = CommonMark->parse(
smart => 1,
unsafe => 1,
%params,
string => join( '', <$fh>),
)->render( %params, format => 'html' );
return unless $html =~ /\S/;
utf8::encode($html);
return $html if $params{raw};
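return qq{<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
<body>
$html
</body>
</html>
};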
}
1;
__END__
=head1 Name
Text::Markup::CommonMark - CommonMark Markdown parser for Text::Markup
=head1 Synopsis
use Text::Markup::CommonMark;
my $html = Text::Markup->new->parse(file => 'README.md');
my $raw = Text::Markup->new->parse(
file => 'README.md',
options => [ raw => 1 ],
);
=head1 Description
This is the CommonMark parser
for L<Text::Markup>. On load, it replaces the default L<Text::Markup::Markdown>
parser for parsing Markdown.
Note that L<Text::Markup> does not load this module by default, but when
loaded manually will be the preferred Markdown parser.
Text::Markup::CommonMark reads in the file (relying on a
BOM), hands it off to
L<CommonMark> for parsing, and then returns the generated HTML as an
encoded UTF-8 string with an C<http-equiv> element identifying
the encoding as UTF-8.
It recognizes files with the following extensions as CommonMark Markdown:
=over
=item F<.md>
=item F<.mkd>
=item F<.mkdn>
=item F<.mdown>
=item F<.markdown>
=back
To change the files it recognizes, load this module directly and pass a
regular expression matching the desired extension(s), like so:
use Text::Markup::CommonMark qr{markd?};
Normally this module returns the output wrapped in a minimal HTML document
skeleton. If you would like the raw output without the skeleton, you can pass
the C<raw> option to C<parse>.
In addition, Text::Markup::CommonMark supports all of the CommonMark
parse and render options,
including:
=over
=item C<smart>
When true, convert straight quotes to curly, --- to em dashes, -- to en
dashes. Enabled by default.
=item C<sourcepos>
When true, include a data-sourcepos attribute on all block elements. Disabled
by default.
=item C<hardbreaks>
When true, render soft-break elements as hard line breaks. Disabled by default.
=item C<nobreaks>
When true, render soft-break elements as spaces. Disabled by default.
=item C<validate_utf8>
When true, validate UTF-8 in the input before parsing, replacing illegal
sequences with the replacement character C<U+FFFD>. Disabled by default.
=item C<unsafe>
When true, render raw HTML and unsafe links (C<javascript:>, C<vbscript:>,
C<file:>, and C<data:>, except for C<image/png>, C<image/gif>, C<image/jpeg>,
or C<image/webp> mime types). When false, raw HTML is replaced by a
placeholder HTML comment and unsafe links are replaced by empty strings.
Enabled by default.
=back
=head1 Author
David E. Wheeler
=head1 Copyright and License
Copyright (c) 2011-2024 David E. Wheeler. Some Rights Reserved.
This module is free software; you can redistribute it and/or modify it under
the same terms as Perl itself.
=cut
package Text::Markup::Creole;
use 5.8.1;
use strict;
use warnings;
use Text::Markup;
use File::BOM qw(open_bom);
use Text::WikiCreole;
our $VERSION = '0.33';
sub import {
# Replace the regex if passed one.
Text::Markup->register( creole => $_[1] ) if $_[1];
}
sub parser {
my ($file, $encoding, $opts) = @_;
open_bom my $fh, $file, ":encoding($encoding)";
local $/;
my $html = creole_parse(<$fh>);
return unless $html =~ /\S/;
utf8::encode($html);
return $html if { @{ $opts } }->{raw};
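return qq{<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
<body>
$html
</body>
</html>
};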
}
1;
__END__
=head1 Name
Text::Markup::Creole - Creole parser for Text::Markup
=head1 Synopsis
my $html = Text::Markup->new->parse(file => 'file.creole');
my $raw = Text::Markup->new->parse(
file => 'file.creole',
options => [ raw => 1 ],
);
=head1 Description
This is the Creole parser for L<Text::Markup>.
It reads in the file (relying on a
BOM), hands it off to
L<Text::WikiCreole> for parsing, and then returns the generated HTML as an
encoded UTF-8 string with an C<http-equiv> element identifying
the encoding as UTF-8.
It recognizes files with the following extensions as Creole:
=over
=item F<.creole>
=back
To change the files it recognizes, load this module directly and pass a
regular expression matching the desired extension(s), like so:
use Text::Markup::Creole qr{cre+ole+};
Normally this module returns the output wrapped in a minimal HTML document
skeleton. If you would like the raw output without the skeleton, you can pass
the C<raw> option to C<parse>.
=head1 Author
Lucas Kanashiro
=head1 Copyright and License
Copyright (c) 2011-2023 Lucas Kanashiro. Some Rights Reserved.
This module is free software; you can redistribute it and/or modify it under
the same terms as Perl itself.
=cut
package Text::Markup::HTML;
use 5.8.1;
use strict;
use warnings;
use Text::Markup;
our $VERSION = '0.33';
sub import {
# Replace the regex if passed one.
Text::Markup->register( html => $_[1] ) if $_[1];
}
sub parser {
my ($file, $encoding, $opts) = @_;
my $html = do {
open my $fh, '<:raw', $file or die "Cannot open $file: $!\n";
local $/;
<$fh>;
};
return $html =~ /\S/ ? $html : undef
}
1;
__END__
=head1 Name
Text::Markup::HTML - HTML parser for Text::Markup
=head1 Synopsis
use Text::Markup;
my $html = Text::Markup->new->parse(file => 'hello.html');
=head1 Description
This is the HTML parser for L<Text::Markup>. All
it does is read in the HTML file and return it as a string. It makes no
assumptions about encoding, and returns the string raw as read from the file,
with no decoding. It recognizes files with the following extensions as HTML:
=over
=item F<.html>
=item F<.htm>
=item F<.xhtml>
=item F<.xhtm>
=back
To change the files it recognizes, load this module directly and pass a
regular expression matching the desired extension(s), like so:
use Text::Markup::HTML qr{hachetml};
=head1 Author
David E. Wheeler
=head1 Copyright and License
Copyright (c) 2011-2024 David E. Wheeler. Some Rights Reserved.
This module is free software; you can redistribute it and/or modify it under
the same terms as Perl itself.
=cut
package Text::Markup::Markdown;
use 5.8.1;
use strict;
use warnings;
use Text::Markup;
use File::BOM qw(open_bom);
use Text::Markdown ();
our $VERSION = '0.33';
sub import {
# Replace the regex if passed one.
Text::Markup->register( markdown => $_[1] ) if $_[1];
}
sub parser {
my ($file, $encoding, $opts) = @_;
my %params = @{ $opts };
my $md = Text::Markdown->new(%params);
open_bom my $fh, $file, ":encoding($encoding)";
my $html = $md->markdown(join '', <$fh>);
return unless $html =~ /\S/;
utf8::encode($html);
return $html if $params{raw};
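return qq{<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
<body>
$html
</body>
</html>
};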
}
1;
__END__
=head1 Name
Text::Markup::Markdown - Markdown parser for Text::Markup
=head1 Synopsis
my $html = Text::Markup->new->parse(file => 'README.md');
my $raw = Text::Markup->new->parse(
file => 'README.md',
options => [ raw => 1 ],
);
=head1 Description
This is the Markdown parser
for L<Text::Markup>. It reads in the file (relying on a
BOM), hands it off to
L<Text::Markdown> for parsing, and then returns the generated HTML as an
encoded UTF-8 string with an C<http-equiv> element identifying
the encoding as UTF-8.
It recognizes files with the following extensions as Markdown:
=over
=item F<.md>
=item F<.mkd>
=item F<.mkdn>
=item F<.mdown>
=item F<.markdown>
=back
To change the files it recognizes, load this module directly and pass a
regular expression matching the desired extension(s), like so:
use Text::Markup::Markdown qr{markd?};
Normally this module returns the output wrapped in a minimal HTML document
skeleton. If you would like the raw output without the skeleton, you can pass
the C<raw> option to C<parse>.
In addition, Text::Markup::Markdown supports all of the
L<Text::Markdown> options, including:
=over
=item C
=item C
=item C
=back
=head1 See Also
L.
MarkI<up> or MarkI<down>?
=head1 Author
David E. Wheeler
=head1 Copyright and License
Copyright (c) 2011-2024 David E. Wheeler. Some Rights Reserved.
This module is free software; you can redistribute it and/or modify it under
the same terms as Perl itself.
=cut
package Text::Markup::Mediawiki;
use 5.8.1;
use strict;
use warnings;
use Text::Markup;
use File::BOM qw(open_bom);
use Text::MediawikiFormat 1.0;
our $VERSION = '0.33';
sub import {
# Replace the regex if passed one.
Text::Markup->register( mediawiki => $_[1] ) if $_[1];
}
sub parser {
my ($file, $encoding, $opts) = @_;
open_bom my $fh, $file, ":encoding($encoding)";
local $/;
my $html = Text::MediawikiFormat::format(<$fh>, @{ $opts });
return unless $html =~ /\S/;
utf8::encode($html);
return $html if $opts->[1] && $opts->[1]->{raw};
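return qq{<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
<body>
$html
</body>
</html>
};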
}
1;
__END__
=head1 Name
Text::Markup::Mediawiki - MediaWiki syntax parser for Text::Markup
=head1 Synopsis
my $html = Text::Markup->new->parse(file => 'README.mediawiki');
my $raw = Text::Markup->new->parse(
file => 'README.mediawiki',
options => [ {}, { raw => 1 } ],
);
=head1 Description
This is the MediaWiki parser
for L<Text::Markup>. It reads in the file (relying on a
BOM), hands it off to
L<Text::MediawikiFormat> for parsing, and then returns the generated HTML as
an encoded UTF-8 string with an C<http-equiv> element
identifying the encoding as UTF-8.
It recognizes files with the following extensions as MediaWiki:
=over
=item F<.mediawiki>
=item F<.mwiki>
=item F<.wiki>
=back
To change the files it recognizes, load this module directly and pass a
regular expression matching the desired extension(s), like so:
use Text::Markup::Mediawiki qr{kwiki?};
Text::Markup::Mediawiki supports the two optional arguments to the
L<Text::MediawikiFormat> C<format> function, a hash
reference for tags and a hash reference of options. The supported options
include:
=over
=item C
=item C
=item C
=item C
=item C
=back
Normally this module returns the output wrapped in a minimal HTML document
skeleton. If you would like the raw output without the skeleton, you can pass
the C<raw> option via that second hash reference of options.
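Because this parser expects two hash references rather than a flat key/value list, the C<raw> flag lives in the second slot of the options array. The following sketch shows that layout being unpacked; C<split_format_args> is a hypothetical helper used only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the two-slot layout described above:
# slot 0 holds the tags hash, slot 1 holds the options hash.
sub split_format_args {
    my ($opts) = @_;
    my ($tags, $options) = @{ $opts || [] };
    return ($tags || {}, $options || {});
}

my ($tags, $options) = split_format_args([ {}, { raw => 1 } ]);
print "raw requested\n" if $options->{raw};
```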
=head1 Author
David E. Wheeler
=head1 Copyright and License
Copyright (c) 2011-2024 David E. Wheeler. Some Rights Reserved.
This module is free software; you can redistribute it and/or modify it under
the same terms as Perl itself.
=cut
package Text::Markup::Multimarkdown;
use 5.8.1;
use strict;
use warnings;
use Text::Markup;
use File::BOM qw(open_bom);
use Text::MultiMarkdown ();
our $VERSION = '0.33';
sub import {
# Replace the regex if passed one.
Text::Markup->register( multimarkdown => $_[1] ) if $_[1];
}
sub parser {
my ($file, $encoding, $opts) = @_;
my %params = @{ $opts };
my $md = Text::MultiMarkdown->new(%params);
open_bom my $fh, $file, ":encoding($encoding)";
local $/;
my $html = $md->markdown(<$fh>);
return unless $html =~ /\S/;
utf8::encode($html);
return $html if $params{raw};
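return qq{<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
<body>
$html
</body>
</html>
};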
}
1;
__END__
=head1 Name
Text::Markup::Multimarkdown - MultiMarkdown parser for Text::Markup
=head1 Synopsis
my $html = Text::Markup->new->parse(file => 'README.mmd');
my $raw = Text::Markup->new->parse(
file => 'README.mmd',
options => [ raw => 1 ],
);
=head1 Description
This is the MultiMarkdown parser
for L<Text::Markup>. It reads in the file (relying on a
BOM), hands it off to
L<Text::MultiMarkdown> for parsing, and then returns the generated HTML as an
encoded UTF-8 string with an C<http-equiv> element identifying
the encoding as UTF-8.
It recognizes files with the following extensions as MultiMarkdown:
=over
=item F<.mmd>
=item F<.mmkd>
=item F<.mmkdn>
=item F<.mmdown>
=item F<.multimarkdown>
=back
To change the files it recognizes, load this module directly and pass a
regular expression matching the desired extension(s), like so:
use Text::Markup::Multimarkdown qr{mmm+};
Normally this module returns the output wrapped in a minimal HTML document
skeleton. If you would like the raw output without the skeleton, you can pass
the C<raw> option in the C<options> argument to C<parse>.
In addition, Text::Markup::Multimarkdown supports all of the
L<Text::MultiMarkdown> options, including:
=over
=item C
=item C
=item C
=item C
=item C
=item C
=item C
=item C
=item C
=item C
=item C
=back
=head1 Author
David E. Wheeler
=head1 Copyright and License
Copyright (c) 2011-2024 David E. Wheeler. Some Rights Reserved.
This module is free software; you can redistribute it and/or modify it under
the same terms as Perl itself.
=cut
package Text::Markup::None;
use 5.8.1;
use strict;
use warnings;
use Text::Markup;
use HTML::Entities;
use File::BOM qw(open_bom);
our $VERSION = '0.33';
sub import {
# Set a regex if passed one.
Text::Markup->register( none => $_[1] ) if $_[1];
}
sub parser {
my ($file, $encoding, $opts) = @_;
open_bom my $fh, $file, ":encoding($encoding)";
local $/;
my $html = encode_entities(<$fh>, '<>&"');
return undef unless $html =~ /\S/;
utf8::encode($html);
return $html if { @{ $opts } }->{raw};
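return qq{<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
<body>
<pre>$html</pre>
</body>
</html>
};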
}
1;
__END__
=head1 Name
Text::Markup::None - Turn a file with no known markup into HTML
=head1 Synopsis
use Text::Markup;
my $html = Text::Markup->new->parse(file => 'README');
my $raw = Text::Markup->new->parse(
file => 'README',
options => [ raw => 1 ],
);
=head1 Description
This is the default parser used by Text::Markup in the event that it cannot
determine the format of a text file. All it does is read the file in (relying
on a BOM), encode all
entities, and then return an HTML string with the file in a C<< <pre> >>
element. This will be handy for files that really are nothing but plain text,
like F<README> files.
By default this parser is not associated with any file extensions. To have
Text::Markup also recognize files for this module, load it directly and pass
a regular expression matching the desired extension(s), like so:
use Text::Markup::None qr{te?xt};
Normally this module returns the output wrapped in a minimal HTML document
skeleton. If you would like the raw output without the skeleton, you can pass
the C<raw> option to C<parse>.
=head1 Author
David E. Wheeler
=head1 Copyright and License
Copyright (c) 2011-2024 David E. Wheeler. Some Rights Reserved.
This module is free software; you can redistribute it and/or modify it under
the same terms as Perl itself.
=cut
package Text::Markup::Pod;
use 5.8.1;
use strict;
use warnings;
use Text::Markup;
use Pod::Simple::XHTML 3.15;
sub import {
# Replace the regex if passed one.
Text::Markup->register( pod => $_[1] ) if $_[1];
}
# Disable the use of HTML::Entities.
$Pod::Simple::XHTML::HAS_HTML_ENTITIES = 0;
our $VERSION = '0.33';
sub parser {
my ($file, $encoding, $opts) = @_;
my $p = Pod::Simple::XHTML->new;
# Output everything as UTF-8.
$p->html_header_tags('<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />');
$p->strip_verbatim_indent(sub { (sort map { /^(\s+)/ } @{$_[0]})[0] });
$p->output_string(\my $html);
# Want user supplied options to override even these default behaviors,
# if necessary
my $opt = $opts ? { @$opts } : {};
foreach my $method ( keys %$opt ) {
my $v = $opt->{$method};
$p->$method($v);
}
$p->parse_file($file);
return unless $p->content_seen;
utf8::encode($html);
return $html;
}
1;
__END__
=head1 Name
Text::Markup::Pod - Pod parser for Text::Markup
=head1 Synopsis
use Text::Markup;
my $pod = Text::Markup->new->parse(file => 'README.pod');
=head1 Description
This is the Pod parser for L<Text::Markup>. It runs the file
through L<Pod::Simple::XHTML> and returns the result. If the Pod contains any
non-ASCII characters, the encoding must be declared either via a BOM or via
the C<=encoding> tag. Text::Markup::Pod recognizes files with the following
extensions as Pod:
=over
=item F<.pod>
=item F<.pm>
=item F<.pl>
=back
To change the files it recognizes, load this module directly and pass a
regular expression matching the desired extension(s), like so:
use Text::Markup::Pod qr{cgi};
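As a rough illustration of what such a replacement regex would then match,
here is a minimal sketch. It assumes the regex is compared against the final
suffix of the file name only, which is how the distribution's tests for
C<guess_format> behave; C<matches_suffix> is a hypothetical helper written for
this example and is not part of Text::Markup.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Text::Markup. Assumption: the regex
# must match everything after the last dot in the file name.
sub matches_suffix {
    my ($file, $regex) = @_;
    return $file =~ /[.]$regex\z/ ? 1 : 0;
}

# Report which of a few sample file names the regex would claim.
for my $file (qw(index.cgi index.cgi.bak indexcgi)) {
    my $verdict = matches_suffix($file, qr{cgi}) ? 'Pod' : 'not Pod';
    printf "%-14s %s\n", $file, $verdict;
}
```

Under that assumption, only a name ending in F<.cgi> is treated as Pod; a name
with a further suffix, or with no dot before C<cgi>, is not.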
=head1 Options
You may pass an array reference of settings to this parser to change the
output it returns. For example, to suppress the HTML header and footer, pass:
my $pod_fragment = Text::Markup->new->parse(
file => 'README.pod',
options => [
html_header => '',
html_footer => '',
]
);
This implementation calls methods on the L<Pod::Simple::XHTML> parser, using
each key as the method name and its value as the argument.
See L<Pod::Simple::XHTML> and L<Pod::Simple> for the full list of options and
inherited options that can be set this way.
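The key-to-method dispatch can be sketched outside of Text::Markup as follows.
This is a minimal illustration that drives L<Pod::Simple::XHTML> directly; the
inline Pod document and the choice of C<html_h_level> are assumptions made for
the example, and the loop only mirrors what the C<parser> function does.

```perl
use strict;
use warnings;
use Pod::Simple::XHTML;

# Settings in the same key/value layout as the "options" array reference.
my @options = (
    html_header  => '',
    html_footer  => '',
    html_h_level => 2,    # render =head1 as a second-level heading
);

my $p = Pod::Simple::XHTML->new;
$p->output_string(\my $html);

# Each key names a Pod::Simple::XHTML method; its value is the argument.
my %opt = @options;
for my $method (keys %opt) {
    $p->$method($opt{$method});
}

# Illustrative document, not taken from the distribution.
$p->parse_string_document("=head1 Name\n\nDemo - a tiny document\n");
print $html;
```

Because every key is used as a method name, a misspelled key dies with a
"Can't locate object method" error rather than being silently ignored.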
=head1 Author
David E. Wheeler
=head1 Copyright and License
Copyright (c) 2011-2024 David E. Wheeler. Some Rights Reserved.
This module is free software; you can redistribute it and/or modify it under
the same terms as Perl itself.
=cut
# Text-Markup-0.33/lib/Text/Markup/Rest.pm
package Text::Markup::Rest;
use 5.8.1;
use strict;
use warnings;
use Text::Markup;
use Text::Markup::Cmd;
use File::Spec;
use File::Basename;
our $VERSION = '0.33';
sub import {
# Replace the regex if passed one.
Text::Markup->register( rest => $_[1] ) if $_[1];
}
# Find Python or die.
my $PYTHON = find_cmd(
[WIN32 ? 'python3.exe' : 'python3'],
'--version',
);
# We have python, let's find out if we have docutils.
exec_or_die(
qq{Missing required Python "docutils" module},
$PYTHON, '-c', 'import docutils',
);
# We ship with our own rst2html that's lenient with unknown directives.
my $RST2HTML = File::Spec->catfile(dirname(__FILE__), 'rst2html_lenient.py');
exec_or_die(
"$RST2HTML will not execute",
$PYTHON, $RST2HTML, '--test-patch',
);
# Optional arguments to pass to rst2html
my @OPTIONS = qw(
--no-raw
--no-file-insertion
--stylesheet=
--cloak-email-address
--no-generator
--quiet
);
# Options to improve rendering of Sphinx documents
my @SPHINX_OPTIONS = qw(
--dir-ignore toctree
--dir-ignore highlight
--dir-ignore index
--dir-ignore default-domain
--dir-nested note
--dir-nested warning
--dir-nested versionadded
--dir-nested versionchanged
--dir-nested deprecated
--dir-nested seealso
--dir-nested hlist
--dir-nested glossary
--dir-notitle code-block
--dir-nested module
--dir-nested function
--output-encoding utf-8
);
# note: support for domain directives (the "module" and "function" options
# above) is incomplete
sub parser {
my ($file, $encoding, $opts) = @_;
my $html = do {
my $fh = open_pipe(
$PYTHON, $RST2HTML,
@OPTIONS, @SPHINX_OPTIONS,
'--input-encoding', $encoding,
$file
);
local $/;
<$fh>;
};
# Make sure we have something.
return undef if $html =~ m{<div\s+class="document">\s+</div>}ms;
# Alas, --no-generator does not remove the generator meta tag. :-(
$html =~ s{^\s*<meta\s+name="generator"[^>]+>\n}{}ms;
return $html;
}
1;
__END__
=head1 Name
Text::Markup::Rest - reStructuredText parser for Text::Markup
=head1 Synopsis
use Text::Markup;
my $html = Text::Markup->new->parse(file => 'hello.rst');
=head1 Description
This is the reStructuredText parser for L<Text::Markup>. It depends on a
C<python3> executable and the C<docutils> Python package, which is packaged
by many Linux distributions and can also be installed using the command
C<pip install docutils>. It recognizes files with the following
extensions as reST:
=over
=item F<.rest>
=item F<.rst>
=back
To change the files it recognizes, load this module directly and pass a
regular expression matching the desired extension(s), like so:
use Text::Markup::Rest qr{re?st(?:aurant)};
=head1 Author
Daniele Varrazzo
=head1 Copyright and License
Copyright (c) 2011-2023 Daniele Varrazzo. Some Rights Reserved.
This module is free software; you can redistribute it and/or modify it under
the same terms as Perl itself.
=cut
# Text-Markup-0.33/lib/Text/Markup/Textile.pm
package Text::Markup::Textile;
use 5.8.1;
use strict;
use warnings;
use Text::Markup;
use File::BOM qw(open_bom);
use Text::Textile 2.10;
our $VERSION = '0.33';
sub import {
# Replace the regex if passed one.
Text::Markup->register( textile => $_[1] ) if $_[1];
}
sub parser {
my ($file, $encoding, $opts) = @_;
# Tolerate a missing options array reference.
my %params = @{ $opts || [] };
my $textile = Text::Textile->new(
charset => 'utf-8',
char_encoding => 0,
trim_spaces => 1,
%params,
);
open_bom my $fh, $file, ":encoding($encoding)";
local $/;
my $html = $textile->process(<$fh>);
return unless $html =~ /\S/;
utf8::encode($html);
return $html if $params{raw};
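return qq{<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
<body>
$html
</body>
</html>
};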
}
1;
__END__
=head1 Name
Text::Markup::Textile - Textile parser for Text::Markup
=head1 Synopsis
my $html = Text::Markup->new->parse(file => 'README.textile');
my $raw = Text::Markup->new->parse(
file => 'README.textile',
options => [ raw => 1 ],
);
=head1 Description
This is the Textile parser for L<Text::Markup>.
It reads in the file (relying on a BOM, if present, to determine the
encoding), hands it off to L<Text::Textile> for parsing, and then returns the
generated HTML as an encoded UTF-8 string with a C<meta> element identifying
the encoding as UTF-8.
It recognizes files with the following extension as Textile:
=over
=item F<.textile>
=back
To change the files it recognizes, load this module directly and pass a
regular expression matching the desired extension(s), like so:
use Text::Markup::Textile qr{text(?:ile)?};
Normally this module returns the output wrapped in a minimal HTML document
skeleton. If you would like the raw output without the skeleton, you can pass
the C<raw> option to C<parse>.
In addition, Text::Markup::Textile supports all of the
L<Text::Textile> options, including:
=over
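=item C<disable_html>
=item C<flavor>
=item C<css>
=item C<charset>
=item C<docroot>
=item C<trim_spaces>
=item C<preserve_spaces>
=item C<filter_param>
=item C<filters>
=item C<char_encoding>
=item C<disable_encode_entities>
=item C<handle_quotes>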
=back
=head1 Author
David E. Wheeler
=head1 Copyright and License
Copyright (c) 2011-2024 David E. Wheeler. Some Rights Reserved.
This module is free software; you can redistribute it and/or modify it under
the same terms as Perl itself.
=cut
# Text-Markup-0.33/lib/Text/Markup/Trac.pm
package Text::Markup::Trac;
use 5.8.1;
use strict;
use warnings;
use Text::Markup;
use File::BOM qw(open_bom);
use Text::Trac 0.10;
our $VERSION = '0.33';
sub import {
# Replace the regex if passed one.
Text::Markup->register( trac => $_[1] ) if $_[1];
}
sub parser {
my ($file, $encoding, $opts) = @_;
# Tolerate a missing options array reference.
my %params = @{ $opts || [] };
my $trac = Text::Trac->new(%params);
open_bom my $fh, $file, ":encoding($encoding)";
local $/;
my $html = $trac->parse(<$fh>);
return unless $html =~ /\S/;
utf8::encode($html);
return $html if $params{raw};
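return qq{<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
<body>
$html
</body>
</html>
};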
}
1;
__END__
=head1 Name
Text::Markup::Trac - Trac wiki syntax parser for Text::Markup
=head1 Synopsis
my $html = Text::Markup->new->parse(file => 'README.trac');
my $raw = Text::Markup->new->parse(
file => 'README.trac',
options => [ raw => 1 ],
);
=head1 Description
This is the Trac wiki syntax
parser for L<Text::Markup>. It reads in the file (relying on a
BOM, if present, to determine the encoding), hands it off to
L<Text::Trac> for parsing, and then returns the generated HTML as an encoded
UTF-8 string with a C<meta> element identifying the
encoding as UTF-8.
It recognizes files with the following extensions as Trac:
=over
=item F<.trac>
=item F<.trc>
=back
To change the files it recognizes, load this module directly and pass a
regular expression matching the desired extension(s), like so:
use Text::Markup::Trac qr{tr[au]ck?};
Normally this module returns the output wrapped in a minimal HTML document
skeleton. If you would like the raw output without the skeleton, you can pass
the C<raw> option to C<parse>.
In addition, Text::Markup::Trac supports all of the
L<Text::Trac> options, including:
=over
=item C<trac_url>
=item C<disable_links>
=item C<enable_links>
=back
=head1 Author
David E. Wheeler
=head1 Copyright and License
Copyright (c) 2011-2024 David E. Wheeler. Some Rights Reserved.
This module is free software; you can redistribute it and/or modify it under
the same terms as Perl itself.
=cut
# Text-Markup-0.33/lib/Text/Markup/rst2html_lenient.py
#!/usr/bin/env python3
"""
Parse a reST file into HTML in a very forgiving way.
The script is meant to render specialized reST documents, such as Sphinx
files, preserving the content, while not emulating the original rendering.
The script is currently tested against docutils 0.7-0.10. Other versions may
break it as it deals with the parser at a relatively low level. Use
--test-patch to verify if the script works as expected with your library
version.
"""
import sys
import docutils
from docutils import nodes, utils, SettingsSpec
from docutils.core import publish_cmdline, publish_string, default_description
from docutils.parsers.rst import Directive, directives, roles
from docutils.writers.html4css1 import HTMLTranslator, Writer
from docutils.parsers.rst.states import Body, Inliner
from docutils.frontend import validate_boolean
class any_directive(nodes.General, nodes.FixedTextElement):
    """A generic directive to deal with any unknown directive we may find."""
    pass


class AnyDirective(Directive):
    """A directive returning its unaltered body."""
    optional_arguments = 100 # should suffice
    has_content = True

    def run(self):
        if self.name in self.state.document.settings.dir_ignore:
            return []

        children = []

        if self.name not in self.state.document.settings.dir_notitle:
            children.append(nodes.strong(self.name, "%s: " % self.name))
            # keep the arguments, drop the options
            for a in self.arguments:
                if a.startswith(':') and a.endswith(':'):
                    break
                children.append(nodes.emphasis(a, "%s " % a))

        if self.name in self.state.document.settings.dir_nested:
            if self.content:
                container = nodes.Element()
                self.state.nested_parse(self.content, self.content_offset,
                    container)
                children.extend(container.children)
        else:
            content = '\n'.join(self.content)
            children.append(nodes.literal_block(content, content))

        node = any_directive(self.block_text, '', *children, dir_name=self.name)
        return [node]
class any_role(nodes.Inline, nodes.TextElement):
    """A generic role to deal with any unknown role we may find."""
    pass


class AnyRole:
    """A role to be rendered as a generic element with a specific class."""
    def __init__(self, role_name):
        self.role_name = role_name

    def __call__(self, role, rawtext, text, lineno, inliner,
            options={}, content=[]):
        # Work on a copy so the shared default dict is never mutated.
        options = dict(options)
        roles.set_classes(options)
        options['role_name'] = self.role_name
        node = any_role(rawtext, utils.unescape(text), **options)
        return [node], []
def catchall_directive(self, match, **option_presets):
    """Directive dispatch method.

    Replacement for Body.directive(): if a directive is not known, build one
    on the fly instead of reporting an error.
    """
    type_name = match.group(1)
    directive_class, messages = directives.directive(
        type_name, self.memo.language, self.document)

    # in case it's missing, register a generic directive
    if not directive_class:
        directives.register_directive(type_name, AnyDirective)
        directive_class, messages = directives.directive(
            type_name, self.memo.language, self.document)
        assert directive_class, "can't find just defined directive"

    self.parent += messages
    return self.run_directive(
        directive_class, match, type_name, option_presets)


def catchall_interpreted(self, rawsource, text, role, lineno):
    """Interpreted text role dispatch method.

    Replacement for Inliner.interpreted(): if a role is not known, build one
    on the fly instead of reporting an error.
    """
    role_fn, messages = roles.role(role, self.language, lineno,
        self.reporter)

    # in case it's missing, register a generic role
    if not role_fn:
        role_obj = AnyRole(role)
        roles.register_canonical_role(role, role_obj)
        role_fn, messages = roles.role(
            role, self.language, lineno, self.reporter)
        assert role_fn, "can't find just defined role"

    nodes, messages2 = role_fn(role, rawsource, text, lineno, self)
    return nodes, messages + messages2
def patch_docutils():
    """Change the docutils parser behaviour."""
    # Patch the constructs dispatch table
    for i, (f, p) in enumerate(Body.explicit.constructs):
        if f is Body.directive:
            Body.explicit.constructs[i] = (catchall_directive, p)
            break
    else:
        assert False, "can't find directive dispatch entry"

    # Patch the parser so that when an unknown directive is found, a generic one
    # is generated on the fly.
    Body.directive = catchall_directive

    # Patch the parser so that when an unknown interpreted text role is found,
    # a generic one is generated on the fly.
    Inliner.interpreted = catchall_interpreted
class MyTranslator(HTMLTranslator):
    """An HTML translator that can render with any_role/any_directive.
    """
    def visit_any_directive(self, node):
        cls = node.get('dir_name')
        cls = cls and 'directive-%s' % cls or 'directive'
        self.body.append(self.starttag(node, 'div', CLASS=cls))

    def depart_any_directive(self, node):
        # Close the div opened in visit_any_directive.
        self.body.append('\n</div>\n')

    def visit_any_role(self, node):
        cls = node.get('role_name')
        cls = cls and 'role-%s' % cls or 'role'
        self.body.append(self.starttag(node, 'span', '', CLASS=cls))

    def depart_any_role(self, node):
        # Close the span opened in visit_any_role.
        self.body.append('</span>')
class LenientSettingsSpecs(SettingsSpec):
    settings_spec = ("Lenient parsing options", None, (
        ("Directive whose content should be interpreted as reST. "
         "By default emit the content as unparsed text block. "
         "Can be specified more than once",
         ["--dir-nested"],
         {'metavar': 'NAME', 'default': [], 'action': 'append'}),
        ("Directive that should produce no output. "
         "Can be specified more than once",
         ["--dir-ignore"],
         {'metavar': 'NAME', 'default': [], 'action': 'append'}),
        ("Only emit the content of the directive, no title and options. "
         "Can be specified more than once",
         ["--dir-notitle"],
         {'metavar': 'NAME', 'default': [], 'action': 'append'}),
        ("Verify that lenient customization works fine. "
         "Immediately return with 0 (success) or 1 (error). "
         "In case of error, print a report on stderr.",
         ['--test-patch'],
         {'action': 'store_true', 'validator': validate_boolean}),
    ))
def main():
    # Create a writer to deal with the generic element we may have created.
    writer = Writer()
    writer.translator_class = MyTranslator

    description = (
        'Generates (X)HTML documents from standalone reStructuredText '
        'sources. Be forgiving against unknown elements. '
        + default_description)

    # the parser processes the settings too late: we want to decide earlier if
    # we are running or testing.
    if ('--test-patch' in sys.argv
            and not ('-h' in sys.argv or '--help' in sys.argv)):
        return test_patch(writer)

    else:
        # Make docutils lenient.
        patch_docutils()

        overrides = {
            # If Pygments is missing, code-block directives are swallowed
            # with Docutils >= 0.9.
            'syntax_highlight': 'none',

            # not available on Docutils < 0.8 so can't pass as an option
            'math_output': 'HTML',
        }

        publish_cmdline(writer=writer, description=description,
            settings_spec=LenientSettingsSpecs, settings_overrides=overrides)
        return 0
def test_patch(writer):
    """Verify that patching docutils works as expected."""
    TEST_SOURCE = """`
Hello `role`:norole:

.. nodirective::
"""
    rv = 0
    problems = []
    exc = None

    # patch and use lenient docutils
    try:
        try:
            patch_docutils()
        except Exception as e:
            # Python 3 unbinds the "as" name when the handler exits, so keep
            # the exception in a separate variable for the report below.
            exc = e
            problems.append("error during library patching")
            raise

        try:
            out = publish_string(TEST_SOURCE,
                writer=writer, settings_spec=LenientSettingsSpecs,
                settings_overrides={'output_encoding': 'unicode'})
        except Exception as e:
            exc = e
            problems.append("error while running patched docutils")
            raise
    except:
        pass

    # verify conform output
    else:
        out = out.replace("'", '"')
        # Expected markup, as emitted by MyTranslator for unknown elements.
        if '<span class="role-norole">role</span>' not in out:
            problems.append(
                "unknown role didn't produce the expected output")

        if '<div class="directive-nodirective">' not in out:
            problems.append(
                "unknown directive didn't produce the expected output")

    # report problems if any
    if problems:
        rv = 1
        print("Patching docutils failed!", file=sys.stderr)
        for problem in problems:
            print("-", problem, file=sys.stderr)

    if rv:
        print("\nVersions:",
            'docutils:', docutils.__version__, docutils.__version_details__,
            '\nPython:', sys.version, file=sys.stderr)

    if exc:
        if '--traceback' in sys.argv:
            print(file=sys.stderr)
            import traceback
            # We are outside the handler, so print the saved exception.
            traceback.print_exception(type(exc), exc, exc.__traceback__)
        else:
            print("\nUse --traceback to display the error stack trace.", file=sys.stderr)

    return rv


if __name__ == '__main__':
    sys.exit(main())
# Text-Markup-0.33/t/base.t
#!/usr/bin/env perl -w
use strict;
use warnings;
use Test::More tests => 38;
#use Test::More 'no_plan';
use File::Spec::Functions qw(catdir);
use HTML::Entities;
BEGIN { use_ok 'Text::Markup' or die; }
can_ok 'Text::Markup' => qw(
register
formats
new
parse
default_format
_get_parser
);
# Find core parsers.
my $dir = catdir qw(lib Text Markup);
opendir my $dh, $dir or die "Cannot open directory $dir: $!\n";
my @core_parsers = sort qw(
creole
mediawiki
html
rest
pod
bbcode
markdown
textile
asciidoc
multimarkdown
trac
);
is_deeply [Text::Markup->formats], \@core_parsers,
'Should have core formats';
ok my %matchers = Text::Markup->format_matchers,
'Get format matchers';
is_deeply [sort keys %matchers], [sort @core_parsers],
'Should have core format matchers';
isa_ok $_, 'Regexp', $_ for values %matchers;
# Register one.
PARSER: {
package My::Cool::Parser;
use Text::Markup;
Text::Markup->register(cool => qr{cool});
sub parser {
return $_[2]->[0] || 'hello';
}
}
is_deeply [Text::Markup->formats], [sort @core_parsers, 'cool'],
'Should now have the "cool" parser';
my $parser = new_ok 'Text::Markup';
is $parser->default_format, undef, 'Should have no default format';
$parser = new_ok 'Text::Markup', [default_format => 'cool'];
is $parser->default_format, 'cool', 'Should have default format';
is $parser->_get_parser({ format => 'cool' }), My::Cool::Parser->can('parser'),
'Should be able to find specific parser';
is $parser->_get_parser({ file => 'foo' }), My::Cool::Parser->can('parser'),
'Should be able to find default format parser';
$parser->default_format(undef);
is $parser->_get_parser({ file => 'foo'}), Text::Markup::None->can('parser'),
'Should fall back to the "none" parser when there is no default format';
# Now make it guess the format.
$parser->default_format(undef);
is $parser->_get_parser({ file => 'foo.cool'}),
My::Cool::Parser->can('parser'),
'Should be able to guess the parser from the file name';
# Now test guess_format.
is $parser->guess_format('foo.cool'), 'cool',
'Should guess "cool" format file "foo.cool"';
is $parser->guess_format('foocool'), undef,
'Should not guess "cool" format file "foocool"';
is $parser->guess_format('foo.cool.txt'), undef,
'Should not guess "cool" format file "foo.cool.txt"';
# Add another parser.
PARSER: {
package My::Funky::Parser;
Text::Markup->register(funky => qr{funky(?:[.]txt)?});
sub parser {
# Must return a UTF-8 encoded string.
use utf8;
my $ret = 'fünky';
utf8::encode($ret);
return $ret;
}
}
is_deeply [Text::Markup->formats], [sort @core_parsers, qw(cool funky)],
'Should now have the "cool" and "funky" parsers';
is $parser->guess_format('foo.cool'), 'cool',
'Should still guess "cool" format file "foo.cool"';
is $parser->guess_format('foo.funky'), 'funky',
'Should guess "funky" format file "foo.funky"';
is $parser->guess_format('foo.funky.txt'), 'funky',
'Should guess "funky" format file "foo.funky.txt"';
# Now try parsing.
is $parser->parse(
file => 'README.md',
format => 'cool',
), 'hello', 'Test the "cool" parser';
# Test a parser that returns UTF-8 encoded output.
is $parser->parse(
file => 'README.md',
format => 'funky',
), 'fünky', 'Test the "funky" parser';
# Test opts to the parser.
is $parser->parse(
file => 'README.md',
format => 'cool',
options => ['goodbye'],
), 'goodbye', 'Test the "cool" parser with options';
my $pod_dir = catdir (qw(t markups));
like $parser->parse(
file => "$pod_dir/pod.txt",
format => "pod",
options => [
html_header => '',
],
), qr||, 'Test pod option to suppress HTML header';
unlike $parser->parse(
file => "$pod_dir/pod.txt",
format => "pod",
options => [
html_header => '',
html_footer => '',
],
), qr|