- <html>
- <head>
- <title>pcre2build specification</title>
- </head>
- <body bgcolor="#FFFFFF" text="#00005A" link="#0066FF" alink="#3399FF" vlink="#2222BB">
- <h1>pcre2build man page</h1>
- <p>
- Return to the <a href="index.html">PCRE2 index page</a>.
- </p>
- <p>
- This page is part of the PCRE2 HTML documentation. It was generated
- automatically from the original man page. If there is any nonsense in it,
- please consult the man page, in case the conversion went wrong.
- <br>
- <ul>
- <li><a name="TOC1" href="#SEC1">BUILDING PCRE2</a>
- <li><a name="TOC2" href="#SEC2">PCRE2 BUILD-TIME OPTIONS</a>
- <li><a name="TOC3" href="#SEC3">BUILDING 8-BIT, 16-BIT AND 32-BIT LIBRARIES</a>
- <li><a name="TOC4" href="#SEC4">BUILDING SHARED AND STATIC LIBRARIES</a>
- <li><a name="TOC5" href="#SEC5">UNICODE AND UTF SUPPORT</a>
- <li><a name="TOC6" href="#SEC6">DISABLING THE USE OF \C</a>
- <li><a name="TOC7" href="#SEC7">JUST-IN-TIME COMPILER SUPPORT</a>
- <li><a name="TOC8" href="#SEC8">NEWLINE RECOGNITION</a>
- <li><a name="TOC9" href="#SEC9">WHAT \R MATCHES</a>
- <li><a name="TOC10" href="#SEC10">HANDLING VERY LARGE PATTERNS</a>
- <li><a name="TOC11" href="#SEC11">LIMITING PCRE2 RESOURCE USAGE</a>
- <li><a name="TOC12" href="#SEC12">CREATING CHARACTER TABLES AT BUILD TIME</a>
- <li><a name="TOC13" href="#SEC13">USING EBCDIC CODE</a>
- <li><a name="TOC14" href="#SEC14">PCRE2GREP SUPPORT FOR EXTERNAL SCRIPTS</a>
- <li><a name="TOC15" href="#SEC15">PCRE2GREP OPTIONS FOR COMPRESSED FILE SUPPORT</a>
- <li><a name="TOC16" href="#SEC16">PCRE2GREP BUFFER SIZE</a>
- <li><a name="TOC17" href="#SEC17">PCRE2TEST OPTION FOR LIBREADLINE SUPPORT</a>
- <li><a name="TOC18" href="#SEC18">INCLUDING DEBUGGING CODE</a>
- <li><a name="TOC19" href="#SEC19">DEBUGGING WITH VALGRIND SUPPORT</a>
- <li><a name="TOC20" href="#SEC20">CODE COVERAGE REPORTING</a>
- <li><a name="TOC21" href="#SEC21">DISABLING THE Z AND T FORMATTING MODIFIERS</a>
- <li><a name="TOC22" href="#SEC22">SUPPORT FOR FUZZERS</a>
- <li><a name="TOC23" href="#SEC23">OBSOLETE OPTION</a>
- <li><a name="TOC24" href="#SEC24">SEE ALSO</a>
- <li><a name="TOC25" href="#SEC25">AUTHOR</a>
- <li><a name="TOC26" href="#SEC26">REVISION</a>
- </ul>
- <br><a name="SEC1" href="#TOC1">BUILDING PCRE2</a><br>
- <P>
- PCRE2 is distributed with a <b>configure</b> script that can be used to build
- the library in Unix-like environments using the applications known as
- Autotools. Also in the distribution are files to support building using
- <b>CMake</b> instead of <b>configure</b>. The text file
- <a href="README.txt"><b>README</b></a>
- contains general information about building with Autotools (some of which is
- repeated below), and also has some comments about building on various operating
- systems. There is a lot more information about building PCRE2 without using
- Autotools (including information about using <b>CMake</b> and building "by
- hand") in the text file called
- <a href="NON-AUTOTOOLS-BUILD.txt"><b>NON-AUTOTOOLS-BUILD</b>.</a>
- You should consult this file as well as the
- <a href="README.txt"><b>README</b></a>
- file if you are building in a non-Unix-like environment.
- </P>
- <br><a name="SEC2" href="#TOC1">PCRE2 BUILD-TIME OPTIONS</a><br>
- <P>
- The rest of this document describes the optional features of PCRE2 that can be
- selected when the library is compiled. It assumes use of the <b>configure</b>
- script, where the optional features are selected or deselected by providing
- options to <b>configure</b> before running the <b>make</b> command. However, the
- same options can be selected in both Unix-like and non-Unix-like environments
- if you are using <b>CMake</b> instead of <b>configure</b> to build PCRE2.
- </P>
- <P>
- If you are not using Autotools or <b>CMake</b>, option selection can be done by
- editing the <b>config.h</b> file, or by passing parameter settings to the
- compiler, as described in
- <a href="NON-AUTOTOOLS-BUILD.txt"><b>NON-AUTOTOOLS-BUILD</b>.</a>
- </P>
- <P>
- The complete list of options for <b>configure</b> (which includes the standard
- ones such as the selection of the installation directory) can be obtained by
- running
- <pre>
- ./configure --help
- </pre>
- The following sections include descriptions of "on/off" options whose names
- begin with --enable or --disable. Because of the way that <b>configure</b>
- works, --enable and --disable always come in pairs, so the complementary option
- always exists as well, but as it specifies the default, it is not described.
- Options that specify values have names that start with --with. At the end of a
- <b>configure</b> run, a summary of the configuration is output.
- </P>
- <br><a name="SEC3" href="#TOC1">BUILDING 8-BIT, 16-BIT AND 32-BIT LIBRARIES</a><br>
- <P>
- By default, a library called <b>libpcre2-8</b> is built, containing functions
- that take string arguments contained in arrays of bytes, interpreted either as
- single-byte characters, or UTF-8 strings. You can also build two other
- libraries, called <b>libpcre2-16</b> and <b>libpcre2-32</b>, which process
- strings that are contained in arrays of 16-bit and 32-bit code units,
- respectively. These can be interpreted either as single-unit characters or
- UTF-16/UTF-32 strings. To build these additional libraries, add one or both of
- the following to the <b>configure</b> command:
- <pre>
- --enable-pcre2-16
- --enable-pcre2-32
- </pre>
- If you do not want the 8-bit library, add
- <pre>
- --disable-pcre2-8
- </pre>
- as well. At least one of the three libraries must be built. Note that the POSIX
- wrapper is for the 8-bit library only, and that <b>pcre2grep</b> is an 8-bit
- program. Neither of these is built if you select only the 16-bit or 32-bit
- libraries.
- </P>
- <br><a name="SEC4" href="#TOC1">BUILDING SHARED AND STATIC LIBRARIES</a><br>
- <P>
- The Autotools PCRE2 building process uses <b>libtool</b> to build both shared
- and static libraries by default. You can suppress an unwanted library by adding
- one of
- <pre>
- --disable-shared
- --disable-static
- </pre>
- to the <b>configure</b> command.
- </P>
- <br><a name="SEC5" href="#TOC1">UNICODE AND UTF SUPPORT</a><br>
- <P>
- By default, PCRE2 is built with support for Unicode and UTF character strings.
- To build it without Unicode support, add
- <pre>
- --disable-unicode
- </pre>
- to the <b>configure</b> command. This setting applies to all three libraries. It
- is not possible to build one library with Unicode support and another without
- in the same configuration.
- </P>
- <P>
- Of itself, Unicode support does not make PCRE2 treat strings as UTF-8, UTF-16
- or UTF-32. To do that, applications that use the library can set the PCRE2_UTF
- option when they call <b>pcre2_compile()</b> to compile a pattern.
- Alternatively, patterns may be started with (*UTF) unless the application has
- locked this out by setting PCRE2_NEVER_UTF.
- </P>
- <P>
- UTF support allows the libraries to process character code points up to
- 0x10ffff in the strings that they handle. Unicode support also gives access to
- the Unicode properties of characters, using pattern escapes such as \P, \p,
- and \X. Only the general category properties such as <i>Lu</i> and <i>Nd</i> are
- supported. Details are given in the
- <a href="pcre2pattern.html"><b>pcre2pattern</b></a>
- documentation.
- </P>
- <P>
- Pattern escapes such as \d and \w do not by default make use of Unicode
- properties. The application can request that they do by setting the PCRE2_UCP
- option. Unless the application has set PCRE2_NEVER_UCP, a pattern may also
- request this by starting with (*UCP).
- </P>
- <br><a name="SEC6" href="#TOC1">DISABLING THE USE OF \C</a><br>
- <P>
- The \C escape sequence, which matches a single code unit, even in a UTF mode,
- can cause unpredictable behaviour because it may leave the current matching
- point in the middle of a multi-code-unit character. The application can lock it
- out by setting the PCRE2_NEVER_BACKSLASH_C option when calling
- <b>pcre2_compile()</b>. There is also a build-time option
- <pre>
- --enable-never-backslash-C
- </pre>
- (note the upper case C) which locks out the use of \C entirely.
- </P>
- <br><a name="SEC7" href="#TOC1">JUST-IN-TIME COMPILER SUPPORT</a><br>
- <P>
- Just-in-time (JIT) compiler support is included in the build by specifying
- <pre>
- --enable-jit
- </pre>
- This support is available only for certain hardware architectures. If this
- option is set for an unsupported architecture, a build error occurs.
- If in doubt, use
- <pre>
- --enable-jit=auto
- </pre>
- which enables JIT only if the current hardware is supported. You can check
- if JIT is enabled in the configuration summary that is output at the end of a
- <b>configure</b> run. If you are enabling JIT under SELinux you may also want to
- add
- <pre>
- --enable-jit-sealloc
- </pre>
- which enables the use of an execmem allocator in JIT that is compatible with
- SELinux. This has no effect if JIT is not enabled. See the
- <a href="pcre2jit.html"><b>pcre2jit</b></a>
- documentation for a discussion of JIT usage. When JIT support is enabled,
- <b>pcre2grep</b> automatically makes use of it, unless you add
- <pre>
- --disable-pcre2grep-jit
- </pre>
- to the <b>configure</b> command.
- </P>
- <br><a name="SEC8" href="#TOC1">NEWLINE RECOGNITION</a><br>
- <P>
- By default, PCRE2 interprets the linefeed (LF) character as indicating the end
- of a line. This is the normal newline character on Unix-like systems. You can
- compile PCRE2 to use carriage return (CR) instead, by adding
- <pre>
- --enable-newline-is-cr
- </pre>
- to the <b>configure</b> command. There is also an --enable-newline-is-lf option,
- which explicitly specifies linefeed as the newline character.
- </P>
- <P>
- Alternatively, you can specify that line endings are to be indicated by the
- two-character sequence CRLF (CR immediately followed by LF). If you want this,
- add
- <pre>
- --enable-newline-is-crlf
- </pre>
- to the <b>configure</b> command. There is a fourth option, specified by
- <pre>
- --enable-newline-is-anycrlf
- </pre>
- which causes PCRE2 to recognize any of the three sequences CR, LF, or CRLF as
- indicating a line ending. A fifth option, specified by
- <pre>
- --enable-newline-is-any
- </pre>
- causes PCRE2 to recognize any Unicode newline sequence. The Unicode newline
- sequences are the three just mentioned, plus the single characters VT (vertical
- tab, U+000B), FF (form feed, U+000C), NEL (next line, U+0085), LS (line
- separator, U+2028), and PS (paragraph separator, U+2029). The final option is
- <pre>
- --enable-newline-is-nul
- </pre>
- which causes NUL (binary zero) to be set as the default line-ending character.
- </P>
- <P>
- Whatever default line ending convention is selected when PCRE2 is built can be
- overridden by applications that use the library. At build time it is
- recommended to use the standard for your operating system.
- </P>
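- <P>
- To make the difference between the conventions concrete, here is a minimal
- sketch in C. It is an illustration only, not PCRE2's own code: the enum and
- function names are hypothetical, it works on plain byte strings, and it leaves
- out the "any Unicode newline" convention.
- </P>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for the build-time newline conventions described above. */
typedef enum {
    NL_CR, NL_LF, NL_CRLF, NL_ANYCRLF, NL_NUL
} newline_convention;

/* Return the length in bytes of the line ending that starts at s[pos],
   or 0 if no line ending starts there under the given convention. */
size_t newline_length(newline_convention nl, const char *s, size_t len, size_t pos)
{
    assert(s != NULL);
    if (pos >= len) return 0;

    char c = s[pos];
    int next_is_lf = (pos + 1 < len) && s[pos + 1] == '\n';

    switch (nl) {
    case NL_CR:   return c == '\r' ? 1 : 0;
    case NL_LF:   return c == '\n' ? 1 : 0;
    case NL_CRLF: return (c == '\r' && next_is_lf) ? 2 : 0;
    case NL_ANYCRLF:
        /* CRLF is taken as one two-character ending, not as CR then LF. */
        if (c == '\r') return next_is_lf ? 2 : 1;
        return c == '\n' ? 1 : 0;
    case NL_NUL:  return c == '\0' ? 1 : 0;
    }
    return 0;
}
```

- <P>
- Under this model a lone CR ends a line for the "anycrlf" convention but not
- for the "crlf" convention, which requires the full two-character sequence.
- </P>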
- <br><a name="SEC9" href="#TOC1">WHAT \R MATCHES</a><br>
- <P>
- By default, the sequence \R in a pattern matches any Unicode newline sequence,
- independently of what has been selected as the line ending sequence. If you
- specify
- <pre>
- --enable-bsr-anycrlf
- </pre>
- the default is changed so that \R matches only CR, LF, or CRLF. Whatever is
- selected when PCRE2 is built can be overridden by applications that use the
- library.
- </P>
- <br><a name="SEC10" href="#TOC1">HANDLING VERY LARGE PATTERNS</a><br>
- <P>
- Within a compiled pattern, offset values are used to point from one part to
- another (for example, from an opening parenthesis to an alternation
- metacharacter). By default, in the 8-bit and 16-bit libraries, two-byte values
- are used for these offsets, leading to a maximum size for a compiled pattern of
- around 64 thousand code units. This is sufficient to handle all but the most
- gigantic patterns. Nevertheless, some people do want to process truly enormous
- patterns, so it is possible to compile PCRE2 to use three-byte or four-byte
- offsets by adding a setting such as
- <pre>
- --with-link-size=3
- </pre>
- to the <b>configure</b> command. The value given must be 2, 3, or 4. For the
- 16-bit library, a value of 3 is rounded up to 4. In these libraries, using
- longer offsets slows down the operation of PCRE2 because it has to load
- additional data when handling them. For the 32-bit library the value is always
- 4 and cannot be overridden; the value of --with-link-size is ignored.
- </P>
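- <P>
- The rules for how a requested link size is applied in each library can be
- summarized in a short C function. This is a sketch of the rules as stated
- above, not code taken from PCRE2; the function name and the convention of
- returning 0 for a value outside 2, 3, or 4 are assumptions made for the example.
- </P>

```c
#include <assert.h>

/* Return the link size in bytes that takes effect for a library with the
   given code unit width (8, 16, or 32), or 0 if the requested value is
   not one of 2, 3, or 4. */
int effective_link_size(int code_unit_width, int requested)
{
    assert(code_unit_width == 8 || code_unit_width == 16 || code_unit_width == 32);

    if (requested < 2 || requested > 4) return 0;     /* not a permitted value */
    if (code_unit_width == 32) return 4;              /* setting is ignored */
    if (code_unit_width == 16 && requested == 3) return 4;  /* rounded up */
    return requested;
}
```

- <P>
- For example, with --with-link-size=3 this gives 3 for the 8-bit library and 4
- for both the 16-bit and 32-bit libraries.
- </P>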
- <br><a name="SEC11" href="#TOC1">LIMITING PCRE2 RESOURCE USAGE</a><br>
- <P>
- The <b>pcre2_match()</b> function increments a counter each time it goes round
- its main loop. Putting a limit on this counter controls the amount of computing
- resource used by a single call to <b>pcre2_match()</b>. The limit can be changed
- at run time, as described in the
- <a href="pcre2api.html"><b>pcre2api</b></a>
- documentation. The default is 10 million, but this can be changed by adding a
- setting such as
- <pre>
- --with-match-limit=500000
- </pre>
- to the <b>configure</b> command. This setting also applies to the
- <b>pcre2_dfa_match()</b> matching function, and to JIT matching (though the
- counting is done differently).
- </P>
- <P>
- The <b>pcre2_match()</b> function starts out using a 20KiB vector on the system
- stack to record backtracking points. The more nested backtracking points there
- are (that is, the deeper the search tree), the more memory is needed. If the
- initial vector is not large enough, heap memory is used, up to a certain limit,
- which is specified in kibibytes (units of 1024 bytes). The limit can be changed
- at run time, as described in the
- <a href="pcre2api.html"><b>pcre2api</b></a>
- documentation. The default limit (in effect unlimited) is 20 million. You can
- change this by a setting such as
- <pre>
- --with-heap-limit=500
- </pre>
- which limits the amount of heap to 500 KiB. This limit applies only to
- interpretive matching in <b>pcre2_match()</b> and <b>pcre2_dfa_match()</b>, which
- may also use the heap for internal workspace when processing complicated
- patterns. This limit does not apply when JIT (which has its own memory
- arrangements) is used.
- </P>
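- <P>
- Because the heap limit is given in kibibytes, it can help to see the
- arithmetic spelled out. The following C sketch is a simplified model of the
- description above, not PCRE2's implementation; the macro and function names
- are hypothetical.
- </P>

```c
#include <assert.h>
#include <stdint.h>

/* Size of the initial stack vector mentioned above (20KiB). */
#define START_VECTOR_BYTES (20u * 1024u)

/* Convert a heap limit in kibibytes to bytes. A 64-bit result is needed
   because the default of 20 million KiB does not fit in 32 bits. */
uint64_t heap_limit_bytes(uint32_t limit_kib)
{
    return (uint64_t)limit_kib * 1024u;
}

/* Simplified model: backtracking storage fits if it is no larger than the
   initial stack vector, or otherwise no larger than the heap limit. */
int backtracking_memory_ok(uint64_t needed_bytes, uint32_t limit_kib)
{
    if (needed_bytes <= START_VECTOR_BYTES) return 1;
    return needed_bytes <= heap_limit_bytes(limit_kib);
}
```

- <P>
- With --with-heap-limit=500 the limit is 512000 bytes. The default of 20
- million KiB is 20480000000 bytes, roughly 19 GiB, which is why it is described
- as in effect unlimited.
- </P>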
- <P>
- You can also explicitly limit the depth of nested backtracking in the
- <b>pcre2_match()</b> interpreter. This limit defaults to the value that is set
- for --with-match-limit. You can set a lower default limit by adding, for
- example,
- <pre>
- --with-match-limit-depth=10000
- </pre>
- to the <b>configure</b> command. This value can be overridden at run time. This
- depth limit indirectly limits the amount of heap memory that is used, but
- because the size of each backtracking "frame" depends on the number of
- capturing parentheses in a pattern, the amount of heap that is used before the
- limit is reached varies from pattern to pattern. This limit was more useful in
- versions before 10.30, where function recursion was used for backtracking.
- </P>
- <P>
- As well as applying to <b>pcre2_match()</b>, the depth limit also controls
- the depth of recursive function calls in <b>pcre2_dfa_match()</b>. These are
- used for lookaround assertions, atomic groups, and recursion within patterns.
- The limit does not apply to JIT matching.
- <a name="createtables"></a></P>
- <br><a name="SEC12" href="#TOC1">CREATING CHARACTER TABLES AT BUILD TIME</a><br>
- <P>
- PCRE2 uses fixed tables for processing characters whose code points are less
- than 256. By default, PCRE2 is built with a set of tables that are distributed
- in the file <i>src/pcre2_chartables.c.dist</i>. These tables are for ASCII codes
- only. If you add
- <pre>
- --enable-rebuild-chartables
- </pre>
- to the <b>configure</b> command, the distributed tables are no longer used.
- Instead, a program called <b>pcre2_dftables</b> is compiled and run. This
- outputs the source for a new set of tables, created in the default locale of your
- C run-time system. This method of replacing the tables does not work if you are
- cross compiling, because <b>pcre2_dftables</b> needs to be run on the local
- host and therefore must not be compiled with the cross compiler.
- </P>
- <P>
- If you need to create alternative tables when cross compiling, you will have to
- do so "by hand". There may also be other reasons for creating tables manually.
- To cause <b>pcre2_dftables</b> to be built on the local host, run a normal
- compiling command, and then run the program with the output file as its
- argument, for example:
- <pre>
- cc src/pcre2_dftables.c -o pcre2_dftables
- ./pcre2_dftables src/pcre2_chartables.c
- </pre>
- This builds the tables in the default locale of the local host. If you want to
- specify a locale, you must use the -L option:
- <pre>
- LC_ALL=fr_FR ./pcre2_dftables -L src/pcre2_chartables.c
- </pre>
- You can also specify -b (with or without -L). This causes the tables to be
- written in binary instead of as source code. A set of binary tables can be
- loaded into memory by an application and passed to <b>pcre2_compile()</b> in the
- same way as tables created by calling <b>pcre2_maketables()</b>. The tables are
- just a string of bytes, independent of hardware characteristics such as
- endianness. This means they can be bundled with an application that runs in
- different environments, to ensure consistent behaviour.
- </P>
- <br><a name="SEC13" href="#TOC1">USING EBCDIC CODE</a><br>
- <P>
- PCRE2 assumes by default that it will run in an environment where the character
- code is ASCII or Unicode, which is a superset of ASCII. This is the case for
- most computer operating systems. PCRE2 can, however, be compiled to run in an
- 8-bit EBCDIC environment by adding
- <pre>
- --enable-ebcdic --disable-unicode
- </pre>
- to the <b>configure</b> command. This setting implies
- --enable-rebuild-chartables. You should only use it if you know that you are in
- an EBCDIC environment (for example, an IBM mainframe operating system).
- </P>
- <P>
- It is not possible to support both EBCDIC and UTF-8 codes in the same version
- of the library. Consequently, --enable-unicode and --enable-ebcdic are mutually
- exclusive.
- </P>
- <P>
- The EBCDIC character that corresponds to an ASCII LF is assumed to have the
- value 0x15 by default. However, in some EBCDIC environments, 0x25 is used. In
- such an environment you should use
- <pre>
- --enable-ebcdic-nl25
- </pre>
- as well as, or instead of, --enable-ebcdic. The EBCDIC character for CR has the
- same value as in ASCII, namely, 0x0d. Whichever of 0x15 and 0x25 is <i>not</i>
- chosen as LF is made to correspond to the Unicode NEL character (which, in
- Unicode, is 0x85).
- </P>
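- <P>
- The effect of --enable-ebcdic-nl25 on the three newline-related code values
- can be written down as a small C function. This is a sketch of the mapping
- described above; the struct and function names are hypothetical and do not
- come from the PCRE2 sources.
- </P>

```c
#include <assert.h>

/* EBCDIC code values used for CR, LF, and NEL. */
typedef struct {
    unsigned char cr;
    unsigned char lf;
    unsigned char nel;
} ebcdic_newlines;

/* nl25 is non-zero when the build uses 0x25 for LF. */
ebcdic_newlines ebcdic_newline_codes(int nl25)
{
    ebcdic_newlines n;
    n.cr  = 0x0d;                 /* same value as in ASCII */
    n.lf  = nl25 ? 0x25 : 0x15;   /* the value chosen as LF */
    n.nel = nl25 ? 0x15 : 0x25;   /* the other value stands for NEL */
    return n;
}
```

- <P>
- Whichever option is chosen, the two values 0x15 and 0x25 are both in use: one
- as LF and the other as NEL.
- </P>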
- <P>
- The options that select newline behaviour, such as --enable-newline-is-cr,
- and equivalent run-time options, refer to these character values in an EBCDIC
- environment.
- </P>
- <br><a name="SEC14" href="#TOC1">PCRE2GREP SUPPORT FOR EXTERNAL SCRIPTS</a><br>
- <P>
- By default <b>pcre2grep</b> supports the use of callouts with string arguments
- within the patterns it is matching. There are two kinds: one that generates
- output using local code, and another that calls an external program or script.
- If --disable-pcre2grep-callout-fork is added to the <b>configure</b> command,
- only the first kind of callout is supported; if --disable-pcre2grep-callout is
- used, all callouts are completely ignored. For more details of <b>pcre2grep</b>
- callouts, see the
- <a href="pcre2grep.html"><b>pcre2grep</b></a>
- documentation.
- </P>
- <br><a name="SEC15" href="#TOC1">PCRE2GREP OPTIONS FOR COMPRESSED FILE SUPPORT</a><br>
- <P>
- By default, <b>pcre2grep</b> reads all files as plain text. You can build it so
- that it recognizes files whose names end in <b>.gz</b> or <b>.bz2</b>, and reads
- them with <b>libz</b> or <b>libbz2</b>, respectively, by adding one or both of
- <pre>
- --enable-pcre2grep-libz
- --enable-pcre2grep-libbz2
- </pre>
- to the <b>configure</b> command. These options naturally require that the
- relevant libraries are installed on your system. Configuration will fail if
- they are not.
- </P>
- <br><a name="SEC16" href="#TOC1">PCRE2GREP BUFFER SIZE</a><br>
- <P>
- <b>pcre2grep</b> uses an internal buffer to hold a "window" on the file it is
- scanning, in order to be able to output "before" and "after" lines when it
- finds a match. The default starting size of the buffer is 20KiB. The buffer
- itself is three times this size, but because of the way it is used for holding
- "before" lines, the longest line that is guaranteed to be processable is the
- notional buffer size. If a longer line is encountered, <b>pcre2grep</b>
- automatically expands the buffer, up to a specified maximum size, whose default
- is 1MiB or the starting size, whichever is the larger. You can change the
- default parameter values by adding, for example,
- <pre>
- --with-pcre2grep-bufsize=51200
- --with-pcre2grep-max-bufsize=2097152
- </pre>
- to the <b>configure</b> command. The caller of <b>pcre2grep</b> can override
- these values by using --buffer-size and --max-buffer-size on the command line.
- </P>
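- <P>
- The relationship between the notional buffer size, the memory actually
- allocated, and the default maximum can be checked with a little arithmetic.
- The C sketch below models the description above; the macro and function names
- are hypothetical and are not taken from <b>pcre2grep</b>.
- </P>

```c
#include <assert.h>
#include <stddef.h>

#define DEFAULT_BUFSIZE     ((size_t)20 * 1024)     /* 20KiB starting size */
#define DEFAULT_MAX_BUFSIZE ((size_t)1024 * 1024)   /* 1MiB */

/* The buffer itself is three times the notional size. */
size_t allocated_buffer_bytes(size_t bufsize)
{
    return 3 * bufsize;
}

/* The default maximum is 1MiB or the starting size, whichever is larger. */
size_t default_max_bufsize(size_t bufsize)
{
    return bufsize > DEFAULT_MAX_BUFSIZE ? bufsize : DEFAULT_MAX_BUFSIZE;
}
```

- <P>
- With the default starting size the allocation is 61440 bytes, and the longest
- line guaranteed to be processable without expansion is 20480 bytes.
- </P>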
- <br><a name="SEC17" href="#TOC1">PCRE2TEST OPTION FOR LIBREADLINE SUPPORT</a><br>
- <P>
- If you add one of
- <pre>
- --enable-pcre2test-libreadline
- --enable-pcre2test-libedit
- </pre>
- to the <b>configure</b> command, <b>pcre2test</b> is linked with the
- <b>libreadline</b> or <b>libedit</b> library, respectively, and when its input is
- from a terminal, it reads it using the <b>readline()</b> function. This provides
- line-editing and history facilities. Note that <b>libreadline</b> is
- GPL-licensed, so if you distribute a binary of <b>pcre2test</b> linked in this
- way, there may be licensing issues. These can be avoided by linking instead
- with <b>libedit</b>, which has a BSD licence.
- </P>
- <P>
- Setting --enable-pcre2test-libreadline causes the <b>-lreadline</b> option to be
- added to the <b>pcre2test</b> build. In many operating environments with a
- system-installed readline library this is sufficient. However, in some
- environments (e.g. if an unmodified distribution version of readline is in
- use), some extra configuration may be necessary. The INSTALL file for
- <b>libreadline</b> says this:
- <pre>
- "Readline uses the termcap functions, but does not link with
- the termcap or curses library itself, allowing applications
- which link with readline the to choose an appropriate library."
- </pre>
- If your environment has not been set up so that an appropriate library is
- automatically included, you may need to add something like
- <pre>
- LIBS="-lncurses"
- </pre>
- immediately before the <b>configure</b> command.
- </P>
- <br><a name="SEC18" href="#TOC1">INCLUDING DEBUGGING CODE</a><br>
- <P>
- If you add
- <pre>
- --enable-debug
- </pre>
- to the <b>configure</b> command, additional debugging code is included in the
- build. This feature is intended for use by the PCRE2 maintainers.
- </P>
- <br><a name="SEC19" href="#TOC1">DEBUGGING WITH VALGRIND SUPPORT</a><br>
- <P>
- If you add
- <pre>
- --enable-valgrind
- </pre>
- to the <b>configure</b> command, PCRE2 will use valgrind annotations to mark
- certain memory regions as unaddressable. This allows it to detect invalid
- memory accesses, and is mostly useful for debugging PCRE2 itself.
- </P>
- <br><a name="SEC20" href="#TOC1">CODE COVERAGE REPORTING</a><br>
- <P>
- If your C compiler is gcc, you can build a version of PCRE2 that can generate a
- code coverage report for its test suite. To enable this, you must install
- <b>lcov</b> version 1.6 or above. Then specify
- <pre>
- --enable-coverage
- </pre>
- to the <b>configure</b> command and build PCRE2 in the usual way.
- </P>
- <P>
- Note that using <b>ccache</b> (a caching C compiler) is incompatible with code
- coverage reporting. If you have configured <b>ccache</b> to run automatically
- on your system, you must set the environment variable
- <pre>
- CCACHE_DISABLE=1
- </pre>
- before running <b>make</b> to build PCRE2, so that <b>ccache</b> is not used.
- </P>
- <P>
- When --enable-coverage is used, the following additional targets are added to the
- <i>Makefile</i>:
- <pre>
- make coverage
- </pre>
- This creates a fresh coverage report for the PCRE2 test suite. It is equivalent
- to running "make coverage-reset", "make coverage-baseline", "make check", and
- then "make coverage-report".
- <pre>
- make coverage-reset
- </pre>
- This zeroes the coverage counters, but does nothing else.
- <pre>
- make coverage-baseline
- </pre>
- This captures baseline coverage information.
- <pre>
- make coverage-report
- </pre>
- This creates the coverage report.
- <pre>
- make coverage-clean-report
- </pre>
- This removes the generated coverage report without cleaning the coverage data
- itself.
- <pre>
- make coverage-clean-data
- </pre>
- This removes the captured coverage data without removing the coverage files
- created at compile time (*.gcno).
- <pre>
- make coverage-clean
- </pre>
- This cleans all coverage data including the generated coverage report. For more
- information about code coverage, see the <b>gcov</b> and <b>lcov</b>
- documentation.
- </P>
- <br><a name="SEC21" href="#TOC1">DISABLING THE Z AND T FORMATTING MODIFIERS</a><br>
- <P>
- The C99 standard defines formatting modifiers z and t for size_t and
- ptrdiff_t values, respectively. By default, PCRE2 uses these modifiers in
- environments other than Microsoft Visual Studio when __STDC_VERSION__ is
- defined and has a value greater than or equal to 199901L (indicating C99).
- However, there is at least one environment that claims to be C99 but does not
- support these modifiers. If
- <pre>
- --disable-percent-zt
- </pre>
- is specified, no use is made of the z or t modifiers. Instead of %td or %zu,
- %lu is used, with a cast for size_t values.
- </P>
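- <P>
- The fallback described above can be pictured as follows. This is a minimal C
- sketch, not PCRE2's code: the macro DEMO_DISABLE_PERCENT_ZT and the function
- name are hypothetical, chosen only for the example.
- </P>

```c
#include <assert.h>
#include <stdio.h>
#include <stddef.h>

/* Format a size_t value into buf and return the number of characters that
   snprintf reports. Define DEMO_DISABLE_PERCENT_ZT (a hypothetical macro) to
   select the fallback that avoids the z modifier. */
int format_size(char *buf, size_t buflen, size_t value)
{
    assert(buf != NULL);
#ifdef DEMO_DISABLE_PERCENT_ZT
    /* Fallback: %lu with an explicit cast of the size_t value. */
    return snprintf(buf, buflen, "%lu", (unsigned long)value);
#else
    /* C99 path: the z modifier matches size_t directly. */
    return snprintf(buf, buflen, "%zu", value);
#endif
}
```

- <P>
- Both paths produce the same text for values that fit in an unsigned long. The
- cast matters on platforms where size_t is wider than unsigned long.
- </P>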
- <br><a name="SEC22" href="#TOC1">SUPPORT FOR FUZZERS</a><br>
- <P>
- There is a special option for use by people who want to run fuzzing tests on
- PCRE2:
- <pre>
- --enable-fuzz-support
- </pre>
- At present this applies only to the 8-bit library. If set, it causes an extra
- library called libpcre2-fuzzsupport.a to be built, but not installed. This
- contains a single function called LLVMFuzzerTestOneInput() whose arguments are
- a pointer to a string and the length of the string. When called, this function
- tries to compile the string as a pattern, and if that succeeds, to match it.
- This is done both with no options and with some random options bits that are
- generated from the string.
- </P>
- <P>
- Setting --enable-fuzz-support also causes a binary called <b>pcre2fuzzcheck</b>
- to be created. This is normally run under valgrind or used when PCRE2 is
- compiled with address sanitizing enabled. It calls the fuzzing function and
- outputs information about what it is doing. The input strings are specified by
- arguments: if an argument starts with "=" the rest of it is a literal input
- string. Otherwise, it is assumed to be a file name, and the contents of the
- file are the test string.
- </P>
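- <P>
- The argument convention just described is simple enough to sketch in C. The
- function below is an illustration under the stated rule only; its name is
- hypothetical and it is not taken from the <b>pcre2fuzzcheck</b> source.
- </P>

```c
#include <assert.h>
#include <stddef.h>

/* If arg starts with "=", return a pointer to the literal input string that
   follows it. Otherwise return NULL, meaning that arg is a file name whose
   contents are the test string. */
const char *fuzz_arg_literal(const char *arg)
{
    assert(arg != NULL);
    return arg[0] == '=' ? arg + 1 : NULL;
}
```

- <P>
- Note that an argument consisting of "=" alone yields an empty literal string
- under this rule, not a file name.
- </P>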
- <br><a name="SEC23" href="#TOC1">OBSOLETE OPTION</a><br>
- <P>
- In versions of PCRE2 prior to 10.30, there were two ways of handling
- backtracking in the <b>pcre2_match()</b> function. The default was to use the
- system stack, but if
- <pre>
- --disable-stack-for-recursion
- </pre>
- was set, memory on the heap was used. From release 10.30 onwards this has
- changed (the stack is no longer used) and this option now does nothing except
- give a warning.
- </P>
- <br><a name="SEC24" href="#TOC1">SEE ALSO</a><br>
- <P>
- <b>pcre2api</b>(3), <b>pcre2-config</b>(1).
- </P>
- <br><a name="SEC25" href="#TOC1">AUTHOR</a><br>
- <P>
- Philip Hazel
- <br>
- University Computing Service
- <br>
- Cambridge, England.
- <br>
- </P>
- <br><a name="SEC26" href="#TOC1">REVISION</a><br>
- <P>
- Last updated: 20 March 2020
- <br>
- Copyright © 1997-2020 University of Cambridge.
- <br>