SWI-Prolog -- re

Availability: :- use_module(library(pcre)). (can be autoloaded)

[det] re_compile(+Pattern, -Regex, +Options)

Compiles Pattern to a Regex blob of type regex (see blob/2). The supported Options are listed below. Please consult the PCRE documentation for details.

anchored(Bool): If true, force pattern anchoring.
bsr(Mode): If anycrlf, \R only matches CR, LF or CRLF. If unicode, \R matches all Unicode line endings.
caseless(Bool): If true, do caseless matching.
dollar_endonly(Bool): If true, $ matches only at the very end of the subject, not before a trailing newline.
dotall(Bool): If true, . matches anything, including newline.
dupnames(Bool): If true, allow duplicate names for subpatterns.
extended(Bool): If true, ignore white space and # comments in the pattern.
extra(Bool): If true, enable PCRE extra features (not much use currently).
firstline(Bool): If true, force matching to be before the first newline.
compat(With): If javascript, enable JavaScript compatibility.
multiline(Bool): If true, ^ and $ match at newlines within the data.
newline(Mode): If any, recognize any Unicode newline sequence; if anycrlf (default), recognize CR, LF, and CRLF as newline sequences; if cr, recognize CR; if lf, recognize LF; if crlf, recognize CRLF as newline.
ucp(Bool): If true, use Unicode properties for \d, \w, etc.
ungreedy(Bool): If true, invert greediness of quantifiers.
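
As an illustration of how these flags are passed, the following is a minimal sketch that compiles a pattern once and then matches with the resulting blob. It assumes re_match/2 from the same library (not described in this section) accepts a compiled Regex; the predicate names greeting/1 and has_line_b/1 are hypothetical and exist only for this example.

```prolog
:- use_module(library(pcre)).

% True if Text starts with the word "hello" in any letter case.
% caseless(true) makes the literal characters match case-insensitively.
greeting(Text) :-
    re_compile("^hello\\b", Regex, [caseless(true)]),
    re_match(Regex, Text).

% True if some line of Text consists of exactly the letter b.
% multiline(true) lets ^ and $ match at newlines inside Text.
has_line_b(Text) :-
    re_compile("^b$", Regex, [multiline(true)]),
    re_match(Regex, Text).
```

For example, greeting("Hello world") should succeed while greeting("Say hello") should fail, because the pattern itself starts with ^.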

In addition to the options above, which map directly to PCRE flags, the following options are processed:

optimize(Bool)

If true, study the regular expression.

capture_type(+Type)

Specifies how the matched part of the input, and any captured groups within it, are returned. Possible values are:

string: Return the captured string as a string (default).
atom: Return the captured string as an atom.
range: Return the captured string as a pair Start-Length. Note that we use Start-Length rather than the more conventional Start-End to allow for immediate use with sub_atom/5 and sub_string/5.
term: Parse the captured string as a Prolog term. This is notably practical if you capture a number.
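
The capture types above can be sketched as follows. This example assumes re_matchsub/4 from the same library (not described in this section), which returns the captures as a dict keyed by group number. The predicates first_number/2 and first_digits/2 are hypothetical helpers written for this illustration.

```prolog
:- use_module(library(pcre)).

% capture_type(term): the captured digits are parsed as a Prolog term,
% so N is an integer rather than a string.
first_number(Text, N) :-
    re_compile("(\\d+)", Regex, [capture_type(term)]),
    re_matchsub(Regex, Text, Sub, []),
    get_dict(1, Sub, N).

% capture_type(range): the capture is a Start-Length pair that can be
% handed directly to sub_string/5 to extract the matched text.
first_digits(Text, Digits) :-
    re_compile("(\\d+)", Regex, [capture_type(range)]),
    re_matchsub(Regex, Text, Sub, []),
    get_dict(1, Sub, Start-Length),
    sub_string(Text, Start, Length, _, Digits).
```

With the input "abc 42", first_number/2 should bind N to the integer 42, and first_digits/2 should bind Digits to the string "42".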

The capture_type specifies the default for this pattern. The interface supports a different type for each named group using the syntax (?<name_T>...), where T is one of S (string), A (atom), I (integer), F (float), N (number), T (term) and R (range). In the current implementation I, F and N are synonyms for T. Future versions may behave differently if the parsed value is not of the requested numeric type.
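
A per-group type suffix might be used as in the sketch below. It assumes re_matchsub/4 from the same library, and that the dict key for a group named year_I is year (the suffix is stripped), which is how the library's own examples present the result. The predicate parse_date/2 and the date/3 term are hypothetical names chosen for this illustration.

```prolog
:- use_module(library(pcre)).

% Parse an ISO-style date such as "2017-04-20".
% The _I suffix asks for each group to be returned as an integer;
% per the text above, I is currently a synonym for T (term).
parse_date(Text, date(Year, Month, Day)) :-
    re_compile("(?<year_I>\\d{4})-(?<month_I>\\d{2})-(?<day_I>\\d{2})",
               Regex, []),
    re_matchsub(Regex, Text, Sub, []),
    get_dict(year, Sub, Year),
    get_dict(month, Sub, Month),
    get_dict(day, Sub, Day).
```

Because the groups are parsed as terms, a capture such as "04" is returned as the integer 4, so parse_date("2017-04-20", D) should yield D = date(2017, 4, 20).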