Regexp::Common::list -- provide regexes for lists
    use Regexp::Common qw /list/;

    while (<>) {
        /$RE{list}{-pat => '\w+'}/          and print "List of words";
        /$RE{list}{-pat => $RE{num}{real}}/ and print "List of numbers";
    }
Please consult the Regexp::Common manpage for a general description of how this interface works.
Do not use this module directly, but load it via Regexp::Common.
$RE{list}{-pat}{-sep}{-lastsep}
Returns a pattern matching a list of (at least two) substrings.
If -pat=P is specified, it defines the pattern for each substring in the list. By default, P is qr/.*?\S/. In Regexp::Common 0.02 or earlier, the default pattern was qr/.*?/. That pattern, however, also matches a single space, causing "a, b, and c" to be parsed as a list of four elements instead of three (with -word being (?:and)).
One consequence is that a list of the form "a,,b" will no longer be parsed. Use the pattern qr/.*?/ to be able to parse this, but see the previous remark.
If -sep=P is specified, it defines the pattern P to be used as a separator between each pair of substrings in the list, except the final two. By default P is qr/\s*,\s*/.
If -lastsep=P is specified, it defines the pattern P to be used as a separator between the final two substrings in the list. By default P is the same as the pattern specified by the -sep flag.
For example:
    $RE{list}{-pat=>'\w+'}                # match a list of word chars
    $RE{list}{-pat=>$RE{num}{real}}       # match a list of numbers
    $RE{list}{-sep=>"\t"}                 # match a tab-separated list
    $RE{list}{-lastsep=>',\s+and\s+'}     # match a proper English list
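How the three flags combine can be sketched with a hand-written pattern. This is only an illustration of the behaviour described above, not the module's source: the helper name list_pattern and its argument names are hypothetical, and the defaults are the ones quoted in this document.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Regexp::Common): compose a list
# pattern from an item pattern, a separator and a last separator.
sub list_pattern {
    my (%arg) = @_;
    my $pat  = defined $arg{pat}     ? $arg{pat}     : '.*?\S';    # documented default
    my $sep  = defined $arg{sep}     ? $arg{sep}     : '\s*,\s*';  # documented default
    my $last = defined $arg{lastsep} ? $arg{lastsep} : $sep;       # falls back to sep

    # Zero or more "item separator" pairs, then "item lastsep item",
    # so a match always contains at least two items.
    return qr/(?:(?:$pat)(?:$sep))*(?:$pat)(?:$last)(?:$pat)/;
}

my $words   = list_pattern(pat => '\w+');
my $english = list_pattern(pat => '\w+', lastsep => ',\s+and\s+');

for my $text ('red, green, blue', 'red, green, and blue', 'red') {
    printf "%-22s words:%-3s english:%s\n", $text,
        ($text =~ /\A$words\z/   ? 'yes' : 'no'),
        ($text =~ /\A$english\z/ ? 'yes' : 'no');
}
```

The \A and \z anchors are added here so that the whole string must be a list; the patterns themselves are unanchored.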
Under -keep:
$1 captures the entire list
$2 captures the last separator
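A minimal sketch of reading the -keep captures, assuming Regexp::Common is installed. The sample text and variable names are made up for illustration; the -lastsep pattern is the "proper English list" example from above.

```perl
use strict;
use warnings;
use Regexp::Common qw /list/;

my $text = 'apples, pears, and plums';

# Build the pattern once; -keep switches the capturing groups on.
my $re = $RE{list}{-pat => '\w+'}{-lastsep => ',\s+and\s+'}{-keep};

# In list context the match returns the captured groups in order.
my ($whole, $last_sep) = $text =~ /$re/;

print "whole list:     '$whole'\n";
print "last separator: '$last_sep'\n";
```

Without -keep the same pattern matches the same text but sets no capture variables.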
$RE{list}{conj}{-word=PATTERN}
An alias for $RE{list}{-lastsep=>'\s*,?\s*PATTERN\s*'}
If -word is not specified, the default pattern is qr/and|or/.
For example:
    $RE{list}{conj}{-word=>'et'}      # match Jean, Paul, et Sartre
    $RE{list}{conj}{-word=>'oder'}    # match Bonn, Koln oder Hamburg
$RE{list}{and}
An alias for $RE{list}{conj}{-word=>'and'}
$RE{list}{or}
An alias for $RE{list}{conj}{-word=>'or'}
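The conjunction variants can be tried side by side. The following is a sketch under the assumption that Regexp::Common is installed; the sample strings are invented, and the \A and \z anchors are added so that each string must be a list as a whole.

```perl
use strict;
use warnings;
use Regexp::Common qw /list/;

my $oder = $RE{list}{conj}{-word => 'oder'};   # custom conjunction word
my $and  = $RE{list}{and};                     # alias with -word => 'and'

my @samples = ('Bonn, Koln oder Hamburg', 'salt and pepper', 'salt, pepper');

for my $text (@samples) {
    printf "%-26s oder:%-3s and:%s\n", $text,
        ($text =~ /\A$oder\z/ ? 'yes' : 'no'),
        ($text =~ /\A$and\z/  ? 'yes' : 'no');
}
```

Since the last separator requires the conjunction word, a plain comma-separated string such as the third sample is not matched by either pattern.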
$Log: list.pm,v $
Revision 2.103 2003/07/04 13:34:05 abigail Fixed assignment to
Revision 2.102 2003/02/11 09:42:06 abigail Added
Revision 2.101 2003/02/01 22:55:31 abigail Changed Copyright years
Revision 2.100 2003/01/21 23:19:40 abigail The whole world understands RCS/CVS version numbers, that 1.9 is an older version than 1.10. Except CPAN. Curse the idiot(s) who think that version numbers are floats (in which universe do floats have more than one decimal dot?). Everything is bumped to version 2.100 because CPAN couldn't deal with the fact one file had version 1.10.
Revision 1.2 2002/08/05 12:16:59 abigail Fixed 'Regex::' and 'Rexexp::' typos to 'Regexp::' (Found by Mike Castle).
Revision 1.1 2002/07/28 21:41:07 abigail Split off from Regexp::Common.
See the Regexp::Common manpage for a general description of how to use this interface.
Damian Conway (damian@conway.org)
This package is maintained by Abigail (regexp-common@abigail.nl).
Bugs: bound to be plenty.
For a start, there are many common regexes missing. Send them in to regexp-common@abigail.nl.
Copyright (c) 2001 - 2003, Damian Conway. All Rights Reserved. This module is free software. It may be used, redistributed and/or modified under the terms of the Perl Artistic License (see http://www.perl.com/perl/misc/Artistic.html)