Commit b760a69e authored by Sergei Golubchik

Merge branch 'merge-pcre' into 10.0

parents c84a40bf 1592fbd3
...@@ -8,7 +8,7 @@ Email domain: cam.ac.uk ...@@ -8,7 +8,7 @@ Email domain: cam.ac.uk
University of Cambridge Computing Service, University of Cambridge Computing Service,
Cambridge, England. Cambridge, England.
Copyright (c) 1997-2015 University of Cambridge Copyright (c) 1997-2016 University of Cambridge
All rights reserved All rights reserved
...@@ -19,7 +19,7 @@ Written by: Zoltan Herczeg ...@@ -19,7 +19,7 @@ Written by: Zoltan Herczeg
Email local part: hzmester Email local part: hzmester
Emain domain: freemail.hu Emain domain: freemail.hu
Copyright(c) 2010-2015 Zoltan Herczeg Copyright(c) 2010-2016 Zoltan Herczeg
All rights reserved. All rights reserved.
...@@ -30,7 +30,7 @@ Written by: Zoltan Herczeg ...@@ -30,7 +30,7 @@ Written by: Zoltan Herczeg
Email local part: hzmester Email local part: hzmester
Emain domain: freemail.hu Emain domain: freemail.hu
Copyright(c) 2009-2015 Zoltan Herczeg Copyright(c) 2009-2016 Zoltan Herczeg
All rights reserved. All rights reserved.
......
...@@ -65,6 +65,7 @@ ...@@ -65,6 +65,7 @@
# so it has been removed. # so it has been removed.
# 2013-10-08 PH got rid of the "source" command, which is a bash-ism (use ".") # 2013-10-08 PH got rid of the "source" command, which is a bash-ism (use ".")
# 2013-11-05 PH added support for PARENS_NEST_LIMIT # 2013-11-05 PH added support for PARENS_NEST_LIMIT
# 2016-03-01 PH applied Chris Wilson's patch for MSVC static build
PROJECT(PCRE C CXX) PROJECT(PCRE C CXX)
......
...@@ -4,12 +4,104 @@ ChangeLog for PCRE ...@@ -4,12 +4,104 @@ ChangeLog for PCRE
Note that the PCRE 8.xx series (PCRE1) is now in a bugfix-only state. All Note that the PCRE 8.xx series (PCRE1) is now in a bugfix-only state. All
development is happening in the PCRE2 10.xx series. development is happening in the PCRE2 10.xx series.
Version 8.39 14-June-2016
-------------------------
1. If PCRE_AUTO_CALLOUT was set on a pattern that had a (?# comment between
an item and its qualifier (for example, A(?#comment)?B) pcre_compile()
misbehaved. This bug was found by the LLVM fuzzer.
2. Similar to the above, if an isolated \E was present between an item and its
qualifier when PCRE_AUTO_CALLOUT was set, pcre_compile() misbehaved. This
bug was found by the LLVM fuzzer.
3. Further to 8.38/46, negated classes such as [^[:^ascii:]\d] were also not
working correctly in UCP mode.
4. The POSIX wrapper function regexec() crashed if the option REG_STARTEND
was set when the pmatch argument was NULL. It now returns REG_INVARG.
5. Allow for up to 32-bit numbers in the ordin() function in pcregrep.
6. An empty \Q\E sequence between an item and its qualifier caused
pcre_compile() to misbehave when auto callouts were enabled. This bug was
found by the LLVM fuzzer.
7. If a pattern that was compiled with PCRE_EXTENDED started with white
space or a #-type comment that was followed by (?-x), which turns off
PCRE_EXTENDED, and there was no subsequent (?x) to turn it on again,
pcre_compile() assumed that (?-x) applied to the whole pattern and
consequently mis-compiled it. This bug was found by the LLVM fuzzer.
8. A call of pcre_copy_named_substring() for a named substring whose number
was greater than the space in the ovector could cause a crash.
9. Yet another buffer overflow bug involved duplicate named groups with a
group that reset capture numbers (compare 8.38/7 below). Once again, I have
just allowed for more memory, even if not needed. (A proper fix is
implemented in PCRE2, but it involves a lot of refactoring.)
10. pcre_get_substring_list() crashed if the use of \K in a match caused the
start of the match to be earlier than the end.
11. Migrating appropriate PCRE2 JIT improvements to PCRE.
12. A pattern such as /(?<=((?C)0))/, which has a callout inside a lookbehind
assertion, caused pcretest to generate incorrect output, and also to read
uninitialized memory (detected by ASAN or valgrind).
13. A pattern that included (*ACCEPT) in the middle of a sufficiently deeply
nested set of parentheses of sufficient size caused an overflow of the
compiling workspace (which was diagnosed, but of course is not desirable).
14. And yet another buffer overflow bug involving duplicate named groups, this
time nested, with a nested back reference. Yet again, I have just allowed
for more memory, because anything more needs all the refactoring that has
been done for PCRE2. An example pattern that provoked this bug is:
/((?J)(?'R'(?'R'(?'R'(?'R'(?'R'(?|(\k'R'))))))))/ and the bug was
registered as CVE-2016-1283.
15. pcretest went into a loop if global matching was requested with an ovector
size less than 2. It now gives an error message. This bug was found by
afl-fuzz.
16. An invalid pattern fragment such as (?(?C)0 was not diagnosing an error
("assertion expected") when (?(?C) was not followed by an opening
parenthesis.
17. Fixed typo ("&&" for "&") in pcre_study(). Fortunately, this could not
actually affect anything, by sheer luck.
18. Applied Chris Wilson's patch (Bugzilla #1681) to CMakeLists.txt for MSVC
static compilation.
19. Modified the RunTest script to incorporate a valgrind suppressions file so
that certain errors, provoked by the SSE2 instruction set when JIT is used,
are ignored.
20. A racing condition is fixed in JIT reported by Mozilla.
21. Minor code refactor to avoid "array subscript is below array bounds"
compiler warning.
22. Minor code refactor to avoid "left shift of negative number" warning.
23. Fix typo causing compile error when 16- or 32-bit JIT is compiled without
UCP support.
24. Refactor to avoid compiler warnings in pcrecpp.cc.
25. Refactor to fix a typo in pcre_jit_test.c
26. Patch to support compiling pcrecpp.cc with Intel compiler.
Version 8.38 23-November-2015 Version 8.38 23-November-2015
----------------------------- -----------------------------
1. If a group that contained a recursive back reference also contained a 1. If a group that contained a recursive back reference also contained a
forward reference subroutine call followed by a non-forward-reference forward reference subroutine call followed by a non-forward-reference
subroutine call, for example /.((?2)(?R)\1)()/, pcre2_compile() failed to subroutine call, for example /.((?2)(?R)\1)()/, pcre_compile() failed to
compile correct code, leading to undefined behaviour or an internally compile correct code, leading to undefined behaviour or an internally
detected error. This bug was discovered by the LLVM fuzzer. detected error. This bug was discovered by the LLVM fuzzer.
......
...@@ -25,7 +25,7 @@ Email domain: cam.ac.uk ...@@ -25,7 +25,7 @@ Email domain: cam.ac.uk
University of Cambridge Computing Service, University of Cambridge Computing Service,
Cambridge, England. Cambridge, England.
Copyright (c) 1997-2015 University of Cambridge Copyright (c) 1997-2016 University of Cambridge
All rights reserved. All rights reserved.
...@@ -36,7 +36,7 @@ Written by: Zoltan Herczeg ...@@ -36,7 +36,7 @@ Written by: Zoltan Herczeg
Email local part: hzmester Email local part: hzmester
Emain domain: freemail.hu Emain domain: freemail.hu
Copyright(c) 2010-2015 Zoltan Herczeg Copyright(c) 2010-2016 Zoltan Herczeg
All rights reserved. All rights reserved.
...@@ -47,7 +47,7 @@ Written by: Zoltan Herczeg ...@@ -47,7 +47,7 @@ Written by: Zoltan Herczeg
Email local part: hzmester Email local part: hzmester
Emain domain: freemail.hu Emain domain: freemail.hu
Copyright(c) 2009-2015 Zoltan Herczeg Copyright(c) 2009-2016 Zoltan Herczeg
All rights reserved. All rights reserved.
......
News about PCRE releases News about PCRE releases
------------------------ ------------------------
Release 8.39 14-June-2016
-------------------------
Some appropriate PCRE2 JIT improvements have been retro-fitted to PCRE1. Apart
from that, this is another bug-fix release. Note that this library (now called
PCRE1) is now being maintained for bug fixes only. New projects are advised to
use the new PCRE2 libraries.
Release 8.38 23-November-2015 Release 8.38 23-November-2015
----------------------------- -----------------------------
......
...@@ -67,6 +67,15 @@ fi ...@@ -67,6 +67,15 @@ fi
./pcretest -C utf >/dev/null ./pcretest -C utf >/dev/null
utf8=$? utf8=$?
# We need valgrind suppressions when JIT is in use. (This isn't perfect because
# some tests are run with -no-jit, but as PCRE1 is in maintenance only, I have
# not bothered about that.)
./pcretest -C jit >/dev/null
if [ $? -eq 1 -a "$valgrind" != "" ] ; then
valgrind="$valgrind --suppressions=./testdata/valgrind-jit.supp"
fi
echo "Testing pcregrep main features" echo "Testing pcregrep main features"
echo "---------------------------- Test 1 ------------------------------" >testtrygrep echo "---------------------------- Test 1 ------------------------------" >testtrygrep
......
...@@ -178,6 +178,7 @@ nojit= ...@@ -178,6 +178,7 @@ nojit=
sim= sim=
skip= skip=
valgrind= valgrind=
vjs=
# This is in case the caller has set aliases (as I do - PH) # This is in case the caller has set aliases (as I do - PH)
unset cp ls mv rm unset cp ls mv rm
...@@ -357,6 +358,9 @@ $sim ./pcretest -C jit >/dev/null ...@@ -357,6 +358,9 @@ $sim ./pcretest -C jit >/dev/null
jit=$? jit=$?
if [ $jit -ne 0 -a "$nojit" != "yes" ] ; then if [ $jit -ne 0 -a "$nojit" != "yes" ] ; then
jitopt=-s+ jitopt=-s+
if [ "$valgrind" != "" ] ; then
vjs="--suppressions=$testdata/valgrind-jit.supp"
fi
fi fi
# If no specific tests were requested, select all. Those that are not # If no specific tests were requested, select all. Those that are not
...@@ -423,7 +427,7 @@ for bmode in "$test8" "$test16" "$test32"; do ...@@ -423,7 +427,7 @@ for bmode in "$test8" "$test16" "$test32"; do
if [ $do1 = yes ] ; then if [ $do1 = yes ] ; then
echo $title1 echo $title1
for opt in "" "-s" $jitopt; do for opt in "" "-s" $jitopt; do
$sim $valgrind ./pcretest -q $bmode $opt $testdata/testinput1 testtry $sim $valgrind ${opt:+$vjs} ./pcretest -q $bmode $opt $testdata/testinput1 testtry
if [ $? = 0 ] ; then if [ $? = 0 ] ; then
$cf $testdata/testoutput1 testtry $cf $testdata/testoutput1 testtry
if [ $? != 0 ] ; then exit 1; fi if [ $? != 0 ] ; then exit 1; fi
...@@ -441,7 +445,7 @@ fi ...@@ -441,7 +445,7 @@ fi
if [ $do2 = yes ] ; then if [ $do2 = yes ] ; then
echo $title2 "(not UTF-$bits)" echo $title2 "(not UTF-$bits)"
for opt in "" "-s" $jitopt; do for opt in "" "-s" $jitopt; do
$sim $valgrind ./pcretest -q $bmode $opt $testdata/testinput2 testtry $sim $valgrind ${opt:+$vjs} ./pcretest -q $bmode $opt $testdata/testinput2 testtry
if [ $? = 0 ] ; then if [ $? = 0 ] ; then
$cf $testdata/testoutput2 testtry $cf $testdata/testoutput2 testtry
if [ $? != 0 ] ; then exit 1; fi if [ $? != 0 ] ; then exit 1; fi
...@@ -504,7 +508,7 @@ if [ $do3 = yes ] ; then ...@@ -504,7 +508,7 @@ if [ $do3 = yes ] ; then
if [ "$locale" != "" ] ; then if [ "$locale" != "" ] ; then
echo $title3 "(using '$locale' locale)" echo $title3 "(using '$locale' locale)"
for opt in "" "-s" $jitopt; do for opt in "" "-s" $jitopt; do
$sim $valgrind ./pcretest -q $bmode $opt $infile testtry $sim $valgrind ${opt:+$vjs} ./pcretest -q $bmode $opt $infile testtry
if [ $? = 0 ] ; then if [ $? = 0 ] ; then
if $cf $outfile testtry >teststdout || \ if $cf $outfile testtry >teststdout || \
$cf $outfile2 testtry >teststdout || \ $cf $outfile2 testtry >teststdout || \
...@@ -540,7 +544,7 @@ if [ $do4 = yes ] ; then ...@@ -540,7 +544,7 @@ if [ $do4 = yes ] ; then
echo " Skipped because UTF-$bits support is not available" echo " Skipped because UTF-$bits support is not available"
else else
for opt in "" "-s" $jitopt; do for opt in "" "-s" $jitopt; do
$sim $valgrind ./pcretest -q $bmode $opt $testdata/testinput4 testtry $sim $valgrind ${opt:+$vjs} ./pcretest -q $bmode $opt $testdata/testinput4 testtry
if [ $? = 0 ] ; then if [ $? = 0 ] ; then
$cf $testdata/testoutput4 testtry $cf $testdata/testoutput4 testtry
if [ $? != 0 ] ; then exit 1; fi if [ $? != 0 ] ; then exit 1; fi
...@@ -560,7 +564,7 @@ if [ $do5 = yes ] ; then ...@@ -560,7 +564,7 @@ if [ $do5 = yes ] ; then
echo " Skipped because UTF-$bits support is not available" echo " Skipped because UTF-$bits support is not available"
else else
for opt in "" "-s" $jitopt; do for opt in "" "-s" $jitopt; do
$sim $valgrind ./pcretest -q $bmode $opt $testdata/testinput5 testtry $sim $valgrind ${opt:+$vjs} ./pcretest -q $bmode $opt $testdata/testinput5 testtry
if [ $? = 0 ] ; then if [ $? = 0 ] ; then
$cf $testdata/testoutput5 testtry $cf $testdata/testoutput5 testtry
if [ $? != 0 ] ; then exit 1; fi if [ $? != 0 ] ; then exit 1; fi
...@@ -580,7 +584,7 @@ if [ $do6 = yes ] ; then ...@@ -580,7 +584,7 @@ if [ $do6 = yes ] ; then
echo " Skipped because Unicode property support is not available" echo " Skipped because Unicode property support is not available"
else else
for opt in "" "-s" $jitopt; do for opt in "" "-s" $jitopt; do
$sim $valgrind ./pcretest -q $bmode $opt $testdata/testinput6 testtry $sim $valgrind ${opt:+$vjs} ./pcretest -q $bmode $opt $testdata/testinput6 testtry
if [ $? = 0 ] ; then if [ $? = 0 ] ; then
$cf $testdata/testoutput6 testtry $cf $testdata/testoutput6 testtry
if [ $? != 0 ] ; then exit 1; fi if [ $? != 0 ] ; then exit 1; fi
...@@ -602,7 +606,7 @@ if [ $do7 = yes ] ; then ...@@ -602,7 +606,7 @@ if [ $do7 = yes ] ; then
echo " Skipped because Unicode property support is not available" echo " Skipped because Unicode property support is not available"
else else
for opt in "" "-s" $jitopt; do for opt in "" "-s" $jitopt; do
$sim $valgrind ./pcretest -q $bmode $opt $testdata/testinput7 testtry $sim $valgrind ${opt:+$vjs} ./pcretest -q $bmode $opt $testdata/testinput7 testtry
if [ $? = 0 ] ; then if [ $? = 0 ] ; then
$cf $testdata/testoutput7 testtry $cf $testdata/testoutput7 testtry
if [ $? != 0 ] ; then exit 1; fi if [ $? != 0 ] ; then exit 1; fi
...@@ -698,7 +702,7 @@ if [ $do12 = yes ] ; then ...@@ -698,7 +702,7 @@ if [ $do12 = yes ] ; then
if [ $jit -eq 0 -o "$nojit" = "yes" ] ; then if [ $jit -eq 0 -o "$nojit" = "yes" ] ; then
echo " Skipped because JIT is not available or not usable" echo " Skipped because JIT is not available or not usable"
else else
$sim $valgrind ./pcretest -q $bmode $testdata/testinput12 testtry $sim $valgrind $vjs ./pcretest -q $bmode $testdata/testinput12 testtry
if [ $? = 0 ] ; then if [ $? = 0 ] ; then
$cf $testdata/testoutput12 testtry $cf $testdata/testoutput12 testtry
if [ $? != 0 ] ; then exit 1; fi if [ $? != 0 ] ; then exit 1; fi
...@@ -735,7 +739,7 @@ if [ "$do14" = yes ] ; then ...@@ -735,7 +739,7 @@ if [ "$do14" = yes ] ; then
cp -f $testdata/saved16 testsaved16 cp -f $testdata/saved16 testsaved16
cp -f $testdata/saved32 testsaved32 cp -f $testdata/saved32 testsaved32
for opt in "" "-s" $jitopt; do for opt in "" "-s" $jitopt; do
$sim $valgrind ./pcretest -q $bmode $opt $testdata/testinput14 testtry $sim $valgrind ${opt:+$vjs} ./pcretest -q $bmode $opt $testdata/testinput14 testtry
if [ $? = 0 ] ; then if [ $? = 0 ] ; then
$cf $testdata/testoutput14 testtry $cf $testdata/testoutput14 testtry
if [ $? != 0 ] ; then exit 1; fi if [ $? != 0 ] ; then exit 1; fi
...@@ -759,7 +763,7 @@ if [ "$do15" = yes ] ; then ...@@ -759,7 +763,7 @@ if [ "$do15" = yes ] ; then
echo " Skipped because UTF-$bits support is not available" echo " Skipped because UTF-$bits support is not available"
else else
for opt in "" "-s" $jitopt; do for opt in "" "-s" $jitopt; do
$sim $valgrind ./pcretest -q $bmode $opt $testdata/testinput15 testtry $sim $valgrind ${opt:+$vjs} ./pcretest -q $bmode $opt $testdata/testinput15 testtry
if [ $? = 0 ] ; then if [ $? = 0 ] ; then
$cf $testdata/testoutput15 testtry $cf $testdata/testoutput15 testtry
if [ $? != 0 ] ; then exit 1; fi if [ $? != 0 ] ; then exit 1; fi
...@@ -783,7 +787,7 @@ if [ $do16 = yes ] ; then ...@@ -783,7 +787,7 @@ if [ $do16 = yes ] ; then
echo " Skipped because Unicode property support is not available" echo " Skipped because Unicode property support is not available"
else else
for opt in "" "-s" $jitopt; do for opt in "" "-s" $jitopt; do
$sim $valgrind ./pcretest -q $bmode $opt $testdata/testinput16 testtry $sim $valgrind ${opt:+$vjs} ./pcretest -q $bmode $opt $testdata/testinput16 testtry
if [ $? = 0 ] ; then if [ $? = 0 ] ; then
$cf $testdata/testoutput16 testtry $cf $testdata/testoutput16 testtry
if [ $? != 0 ] ; then exit 1; fi if [ $? != 0 ] ; then exit 1; fi
...@@ -805,7 +809,7 @@ if [ $do17 = yes ] ; then ...@@ -805,7 +809,7 @@ if [ $do17 = yes ] ; then
echo " Skipped when running 8-bit tests" echo " Skipped when running 8-bit tests"
else else
for opt in "" "-s" $jitopt; do for opt in "" "-s" $jitopt; do
$sim $valgrind ./pcretest -q $bmode $opt $testdata/testinput17 testtry $sim $valgrind ${opt:+$vjs} ./pcretest -q $bmode $opt $testdata/testinput17 testtry
if [ $? = 0 ] ; then if [ $? = 0 ] ; then
$cf $testdata/testoutput17 testtry $cf $testdata/testoutput17 testtry
if [ $? != 0 ] ; then exit 1; fi if [ $? != 0 ] ; then exit 1; fi
...@@ -829,7 +833,7 @@ if [ $do18 = yes ] ; then ...@@ -829,7 +833,7 @@ if [ $do18 = yes ] ; then
echo " Skipped because UTF-$bits support is not available" echo " Skipped because UTF-$bits support is not available"
else else
for opt in "" "-s" $jitopt; do for opt in "" "-s" $jitopt; do
$sim $valgrind ./pcretest -q $bmode $opt $testdata/testinput18 testtry $sim $valgrind ${opt:+$vjs} ./pcretest -q $bmode $opt $testdata/testinput18 testtry
if [ $? = 0 ] ; then if [ $? = 0 ] ; then
$cf $testdata/testoutput18-$bits testtry $cf $testdata/testoutput18-$bits testtry
if [ $? != 0 ] ; then exit 1; fi if [ $? != 0 ] ; then exit 1; fi
...@@ -853,7 +857,7 @@ if [ $do19 = yes ] ; then ...@@ -853,7 +857,7 @@ if [ $do19 = yes ] ; then
echo " Skipped because Unicode property support is not available" echo " Skipped because Unicode property support is not available"
else else
for opt in "" "-s" $jitopt; do for opt in "" "-s" $jitopt; do
$sim $valgrind ./pcretest -q $bmode $opt $testdata/testinput19 testtry $sim $valgrind ${opt:+$vjs} ./pcretest -q $bmode $opt $testdata/testinput19 testtry
if [ $? = 0 ] ; then if [ $? = 0 ] ; then
$cf $testdata/testoutput19 testtry $cf $testdata/testoutput19 testtry
if [ $? != 0 ] ; then exit 1; fi if [ $? != 0 ] ; then exit 1; fi
......
...@@ -9,18 +9,18 @@ dnl The PCRE_PRERELEASE feature is for identifying release candidates. It might ...@@ -9,18 +9,18 @@ dnl The PCRE_PRERELEASE feature is for identifying release candidates. It might
dnl be defined as -RC2, for example. For real releases, it should be empty. dnl be defined as -RC2, for example. For real releases, it should be empty.
m4_define(pcre_major, [8]) m4_define(pcre_major, [8])
m4_define(pcre_minor, [38]) m4_define(pcre_minor, [39])
m4_define(pcre_prerelease, []) m4_define(pcre_prerelease, [])
m4_define(pcre_date, [2015-11-23]) m4_define(pcre_date, [2016-06-14])
# NOTE: The CMakeLists.txt file searches for the above variables in the first # NOTE: The CMakeLists.txt file searches for the above variables in the first
# 50 lines of this file. Please update that if the variables above are moved. # 50 lines of this file. Please update that if the variables above are moved.
# Libtool shared library interface versions (current:revision:age) # Libtool shared library interface versions (current:revision:age)
m4_define(libpcre_version, [3:6:2]) m4_define(libpcre_version, [3:7:2])
m4_define(libpcre16_version, [2:6:2]) m4_define(libpcre16_version, [2:7:2])
m4_define(libpcre32_version, [0:6:0]) m4_define(libpcre32_version, [0:7:0])
m4_define(libpcreposix_version, [0:3:0]) m4_define(libpcreposix_version, [0:4:0])
m4_define(libpcrecpp_version, [0:1:0]) m4_define(libpcrecpp_version, [0:1:0])
AC_PREREQ(2.57) AC_PREREQ(2.57)
......
...@@ -315,9 +315,8 @@ documentation for details of how to do this. It is a non-standard way of ...@@ -315,9 +315,8 @@ documentation for details of how to do this. It is a non-standard way of
building PCRE, for use in environments that have limited stacks. Because of the building PCRE, for use in environments that have limited stacks. Because of the
greater use of memory management, it runs more slowly. Separate functions are greater use of memory management, it runs more slowly. Separate functions are
provided so that special-purpose external code can be used for this case. When provided so that special-purpose external code can be used for this case. When
used, these functions are always called in a stack-like manner (last obtained, used, these functions always allocate memory blocks of the same size. There is
first freed), and always for memory blocks of the same size. There is a a discussion about PCRE's stack usage in the
discussion about PCRE's stack usage in the
<a href="pcrestack.html"><b>pcrestack</b></a> <a href="pcrestack.html"><b>pcrestack</b></a>
documentation. documentation.
</P> </P>
...@@ -2913,9 +2912,9 @@ Cambridge CB2 3QH, England. ...@@ -2913,9 +2912,9 @@ Cambridge CB2 3QH, England.
</P> </P>
<br><a name="SEC26" href="#TOC1">REVISION</a><br> <br><a name="SEC26" href="#TOC1">REVISION</a><br>
<P> <P>
Last updated: 09 February 2014 Last updated: 18 December 2015
<br> <br>
Copyright &copy; 1997-2014 University of Cambridge. Copyright &copy; 1997-2015 University of Cambridge.
<br> <br>
<p> <p>
Return to the <a href="index.html">PCRE index page</a>. Return to the <a href="index.html">PCRE index page</a>.
......
.TH PCREAPI 3 "09 February 2014" "PCRE 8.35" .TH PCREAPI 3 "18 December 2015" "PCRE 8.39"
.SH NAME .SH NAME
PCRE - Perl-compatible regular expressions PCRE - Perl-compatible regular expressions
.sp .sp
...@@ -273,9 +273,8 @@ documentation for details of how to do this. It is a non-standard way of ...@@ -273,9 +273,8 @@ documentation for details of how to do this. It is a non-standard way of
building PCRE, for use in environments that have limited stacks. Because of the building PCRE, for use in environments that have limited stacks. Because of the
greater use of memory management, it runs more slowly. Separate functions are greater use of memory management, it runs more slowly. Separate functions are
provided so that special-purpose external code can be used for this case. When provided so that special-purpose external code can be used for this case. When
used, these functions are always called in a stack-like manner (last obtained, used, these functions always allocate memory blocks of the same size. There is
first freed), and always for memory blocks of the same size. There is a a discussion about PCRE's stack usage in the
discussion about PCRE's stack usage in the
.\" HREF .\" HREF
\fBpcrestack\fP \fBpcrestack\fP
.\" .\"
...@@ -2914,6 +2913,6 @@ Cambridge CB2 3QH, England. ...@@ -2914,6 +2913,6 @@ Cambridge CB2 3QH, England.
.rs .rs
.sp .sp
.nf .nf
Last updated: 09 February 2014 Last updated: 18 December 2015
Copyright (c) 1997-2014 University of Cambridge. Copyright (c) 1997-2015 University of Cambridge.
.fi .fi
@@ -250,6 +250,7 @@ It returns the number of the first one that was set in a pattern match.
 code the compiled regex
 stringname the name of the capturing substring
 ovector the vector of matched substrings
+stringcount number of captured substrings
 Returns: the number of the first that is set,
 or the number of the last one if none are set,
@@ -258,13 +259,16 @@ Returns: the number of the first that is set,
 #if defined COMPILE_PCRE8
 static int
-get_first_set(const pcre *code, const char *stringname, int *ovector)
+get_first_set(const pcre *code, const char *stringname, int *ovector,
+  int stringcount)
 #elif defined COMPILE_PCRE16
 static int
-get_first_set(const pcre16 *code, PCRE_SPTR16 stringname, int *ovector)
+get_first_set(const pcre16 *code, PCRE_SPTR16 stringname, int *ovector,
+  int stringcount)
 #elif defined COMPILE_PCRE32
 static int
-get_first_set(const pcre32 *code, PCRE_SPTR32 stringname, int *ovector)
+get_first_set(const pcre32 *code, PCRE_SPTR32 stringname, int *ovector,
+  int stringcount)
 #endif
 {
 const REAL_PCRE *re = (const REAL_PCRE *)code;
@@ -295,7 +299,7 @@ if (entrysize <= 0) return entrysize;
 for (entry = (pcre_uchar *)first; entry <= (pcre_uchar *)last; entry += entrysize)
 {
 int n = GET2(entry, 0);
-if (ovector[n*2] >= 0) return n;
+if (n < stringcount && ovector[n*2] >= 0) return n;
 }
 return GET2(entry, 0);
 }
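The added `n < stringcount` test keeps the lookup from reading `ovector` slots that the match never filled in. The idea can be sketched in isolation as follows. This is not the PCRE source: `first_set_group` and its `numbers` array are hypothetical stand-ins for the name-table walk, and the `-1` return for an empty list is an assumption made for the sketch.

```c
#include <stddef.h>

/* Hypothetical helper (not a PCRE function): given the group numbers that
   share one name, return the first whose start offset in ovector is set.
   Only groups below stringcount have valid ovector slots, so any group at
   or above it is skipped rather than read. If none is set, the last
   candidate is returned, as the function comment in the hunk describes. */
static int first_set_group(const int *numbers, size_t count,
                           const int *ovector, int stringcount)
{
    size_t i;
    if (count == 0) return -1;   /* assumption: -1 means "no candidates" */
    for (i = 0; i < count; i++) {
        int n = numbers[i];
        if (n < stringcount && ovector[n * 2] >= 0) return n;
    }
    return numbers[count - 1];
}
```

Checking `n < stringcount` before indexing relies on short-circuit evaluation of `&&`, so the out-of-range slot is never touched.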
@@ -402,7 +406,7 @@ pcre32_copy_named_substring(const pcre32 *code, PCRE_SPTR32 subject,
 PCRE_UCHAR32 *buffer, int size)
 #endif
 {
-int n = get_first_set(code, stringname, ovector);
+int n = get_first_set(code, stringname, ovector, stringcount);
 if (n <= 0) return n;
 #if defined COMPILE_PCRE8
 return pcre_copy_substring(subject, ovector, stringcount, n, buffer, size);
@@ -457,7 +461,10 @@ pcre_uchar **stringlist;
 pcre_uchar *p;
 for (i = 0; i < double_count; i += 2)
-size += sizeof(pcre_uchar *) + IN_UCHARS(ovector[i+1] - ovector[i] + 1);
+{
+size += sizeof(pcre_uchar *) + IN_UCHARS(1);
+if (ovector[i+1] > ovector[i]) size += IN_UCHARS(ovector[i+1] - ovector[i]);
+}
 stringlist = (pcre_uchar **)(PUBL(malloc))(size);
 if (stringlist == NULL) return PCRE_ERROR_NOMEMORY;
@@ -473,7 +480,7 @@ p = (pcre_uchar *)(stringlist + stringcount + 1);
 for (i = 0; i < double_count; i += 2)
 {
-int len = ovector[i+1] - ovector[i];
+int len = (ovector[i+1] > ovector[i])? (ovector[i+1] - ovector[i]) : 0;
 memcpy(p, subject + ovector[i], IN_UCHARS(len));
 *stringlist++ = p;
 p += len;
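The two hunks above correspond to ChangeLog item 10: when `\K` leaves a capture pair with its start offset greater than its end offset, `ovector[i+1] - ovector[i]` is negative, and both the size calculation and the `memcpy` length go wrong. The clamp might be sketched on its own like this, assuming 8-bit strings; `span_length` and `list_size` are illustrative names, not PCRE functions.

```c
#include <stddef.h>

/* Illustrative helper: length of capture pair i in an ovector, clamped to 0
   when the recorded start lies after the recorded end. */
static size_t span_length(const int *ovector, int i)
{
    int start = ovector[2 * i];
    int end = ovector[2 * i + 1];
    return (end > start) ? (size_t)(end - start) : 0;
}

/* Bytes needed for a NULL-terminated list of 'count' substrings held as
   8-bit strings: one pointer and one terminator per substring, the
   characters themselves, plus the final NULL pointer. */
static size_t list_size(const int *ovector, int count)
{
    size_t size = sizeof(char *);
    int i;
    for (i = 0; i < count; i++)
        size += sizeof(char *) + 1 + span_length(ovector, i);
    return size;
}
```

Using the same clamped length for both the allocation and the copy keeps the two in step, which is the property the patch restores.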
@@ -619,7 +626,7 @@ pcre32_get_named_substring(const pcre32 *code, PCRE_SPTR32 subject,
 PCRE_SPTR32 *stringptr)
 #endif
 {
-int n = get_first_set(code, stringname, ovector);
+int n = get_first_set(code, stringname, ovector, stringcount);
 if (n <= 0) return n;
 #if defined COMPILE_PCRE8
 return pcre_get_substring(subject, ovector, stringcount, n, stringptr);
......
...@@ -7,7 +7,7 @@ ...@@ -7,7 +7,7 @@
and semantics are as close as possible to those of the Perl 5 language. and semantics are as close as possible to those of the Perl 5 language.
Written by Philip Hazel Written by Philip Hazel
Copyright (c) 1997-2014 University of Cambridge Copyright (c) 1997-2016 University of Cambridge
----------------------------------------------------------------------------- -----------------------------------------------------------------------------
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
@@ -275,7 +275,7 @@ pcre.h(.in) and disable (comment out) this message. */
 typedef pcre_uint16 pcre_uchar;
 #define UCHAR_SHIFT (1)
-#define IN_UCHARS(x) ((x) << UCHAR_SHIFT)
+#define IN_UCHARS(x) ((x) * 2)
 #define MAX_255(c) ((c) <= 255u)
 #define TABLE_GET(c, table, default) (MAX_255(c)? ((table)[c]):(default))
@@ -283,7 +283,7 @@ typedef pcre_uint16 pcre_uchar;
 typedef pcre_uint32 pcre_uchar;
 #define UCHAR_SHIFT (2)
-#define IN_UCHARS(x) ((x) << UCHAR_SHIFT)
+#define IN_UCHARS(x) ((x) * 4)
 #define MAX_255(c) ((c) <= 255u)
 #define TABLE_GET(c, table, default) (MAX_255(c)? ((table)[c]):(default))
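Replacing the shift with a multiplication lines up with ChangeLog item 22 ("left shift of negative number"). In C99 and C11, left-shifting a negative signed value is undefined behaviour, whereas multiplying it by a small constant is well defined as long as the result is representable. A small sketch of the difference; the `_16` and `_32` macro names and `byte_delta_16` are illustrative and do not appear in pcre_internal.h.

```c
/* Illustrative macros: convert a count of code units to a count of bytes. */
#define IN_UCHARS_16(x) ((x) * 2)   /* 16-bit code units */
#define IN_UCHARS_32(x) ((x) * 4)   /* 32-bit code units */

/* Byte distance between two code-unit offsets. The result is negative when
   end < start, which is exactly the case a left shift must not be given. */
static int byte_delta_16(int start, int end)
{
    return IN_UCHARS_16(end - start);
}
```

An optimizing compiler will typically emit the same shift instruction for the multiplication, so the change is about defined behaviour and warning-free builds rather than speed.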
...@@ -2289,7 +2289,7 @@ enum { ERR0, ERR1, ERR2, ERR3, ERR4, ERR5, ERR6, ERR7, ERR8, ERR9, ...@@ -2289,7 +2289,7 @@ enum { ERR0, ERR1, ERR2, ERR3, ERR4, ERR5, ERR6, ERR7, ERR8, ERR9,
ERR50, ERR51, ERR52, ERR53, ERR54, ERR55, ERR56, ERR57, ERR58, ERR59, ERR50, ERR51, ERR52, ERR53, ERR54, ERR55, ERR56, ERR57, ERR58, ERR59,
ERR60, ERR61, ERR62, ERR63, ERR64, ERR65, ERR66, ERR67, ERR68, ERR69, ERR60, ERR61, ERR62, ERR63, ERR64, ERR65, ERR66, ERR67, ERR68, ERR69,
ERR70, ERR71, ERR72, ERR73, ERR74, ERR75, ERR76, ERR77, ERR78, ERR79, ERR70, ERR71, ERR72, ERR73, ERR74, ERR75, ERR76, ERR77, ERR78, ERR79,
ERR80, ERR81, ERR82, ERR83, ERR84, ERR85, ERR86, ERRCOUNT }; ERR80, ERR81, ERR82, ERR83, ERR84, ERR85, ERR86, ERR87, ERRCOUNT };
/* JIT compiling modes. The function list is indexed by them. */ /* JIT compiling modes. The function list is indexed by them. */
......
...@@ -242,13 +242,17 @@ static struct regression_test_case regression_test_cases[] = { ...@@ -242,13 +242,17 @@ static struct regression_test_case regression_test_cases[] = {
{ MA, 0, "a\\z", "aaa" }, { MA, 0, "a\\z", "aaa" },
{ MA, 0 | F_NOMATCH, "a\\z", "aab" }, { MA, 0 | F_NOMATCH, "a\\z", "aab" },
/* Brackets. */ /* Brackets and alternatives. */
{ MUA, 0, "(ab|bb|cd)", "bacde" }, { MUA, 0, "(ab|bb|cd)", "bacde" },
{ MUA, 0, "(?:ab|a)(bc|c)", "ababc" }, { MUA, 0, "(?:ab|a)(bc|c)", "ababc" },
{ MUA, 0, "((ab|(cc))|(bb)|(?:cd|efg))", "abac" }, { MUA, 0, "((ab|(cc))|(bb)|(?:cd|efg))", "abac" },
{ CMUA, 0, "((aB|(Cc))|(bB)|(?:cd|EFg))", "AcCe" }, { CMUA, 0, "((aB|(Cc))|(bB)|(?:cd|EFg))", "AcCe" },
{ MUA, 0, "((ab|(cc))|(bb)|(?:cd|ebg))", "acebebg" }, { MUA, 0, "((ab|(cc))|(bb)|(?:cd|ebg))", "acebebg" },
{ MUA, 0, "(?:(a)|(?:b))(cc|(?:d|e))(a|b)k", "accabdbbccbk" }, { MUA, 0, "(?:(a)|(?:b))(cc|(?:d|e))(a|b)k", "accabdbbccbk" },
{ MUA, 0, "\xc7\x82|\xc6\x82", "\xf1\x83\x82\x82\xc7\x82\xc7\x83" },
{ MUA, 0, "=\xc7\x82|#\xc6\x82", "\xf1\x83\x82\x82=\xc7\x82\xc7\x83" },
{ MUA, 0, "\xc7\x82\xc7\x83|\xc6\x82\xc6\x82", "\xf1\x83\x82\x82\xc7\x82\xc7\x83" },
{ MUA, 0, "\xc6\x82\xc6\x82|\xc7\x83\xc7\x83|\xc8\x84\xc8\x84", "\xf1\x83\x82\x82\xc8\x84\xc8\x84" },
/* Greedy and non-greedy ? operators. */ /* Greedy and non-greedy ? operators. */
{ MUA, 0, "(?:a)?a", "laab" }, { MUA, 0, "(?:a)?a", "laab" },
...@@ -318,6 +322,14 @@ static struct regression_test_case regression_test_cases[] = { ...@@ -318,6 +322,14 @@ static struct regression_test_case regression_test_cases[] = {
{ CMUA, 0, "[^\xe1\xbd\xb8][^\xc3\xa9]", "\xe1\xbd\xb8\xe1\xbf\xb8\xc3\xa9\xc3\x89#" }, { CMUA, 0, "[^\xe1\xbd\xb8][^\xc3\xa9]", "\xe1\xbd\xb8\xe1\xbf\xb8\xc3\xa9\xc3\x89#" },
{ MUA, 0, "[^\xe1\xbd\xb8][^\xc3\xa9]", "\xe1\xbd\xb8\xe1\xbf\xb8\xc3\xa9\xc3\x89#" }, { MUA, 0, "[^\xe1\xbd\xb8][^\xc3\xa9]", "\xe1\xbd\xb8\xe1\xbf\xb8\xc3\xa9\xc3\x89#" },
{ MUA, 0, "[^\xe1\xbd\xb8]{3,}?", "##\xe1\xbd\xb8#\xe1\xbd\xb8#\xc3\x89#\xe1\xbd\xb8" }, { MUA, 0, "[^\xe1\xbd\xb8]{3,}?", "##\xe1\xbd\xb8#\xe1\xbd\xb8#\xc3\x89#\xe1\xbd\xb8" },
{ MUA, 0, "\\d+123", "987654321,01234" },
{ MUA, 0, "abcd*|\\w+xy", "aaaaa,abxyz" },
{ MUA, 0, "(?:abc|((?:amc|\\b\\w*xy)))", "aaaaa,abxyz" },
{ MUA, 0, "a(?R)|([a-z]++)#", ".abcd.abcd#."},
{ MUA, 0, "a(?R)|([a-z]++)#", ".abcd.mbcd#."},
{ MUA, 0, ".[ab]*.", "xx" },
{ MUA, 0, ".[ab]*a", "xxa" },
{ MUA, 0, ".[ab]?.", "xx" },
/* Bracket repeats with limit. */ /* Bracket repeats with limit. */
{ MUA, 0, "(?:(ab){2}){5}M", "abababababababababababM" }, { MUA, 0, "(?:(ab){2}){5}M", "abababababababababababM" },
...@@ -574,6 +586,16 @@ static struct regression_test_case regression_test_cases[] = { ...@@ -574,6 +586,16 @@ static struct regression_test_case regression_test_cases[] = {
{ MUA, 0, "(?:(?=.)??[a-c])+m", "abacdcbacacdcaccam" }, { MUA, 0, "(?:(?=.)??[a-c])+m", "abacdcbacacdcaccam" },
{ MUA, 0, "((?!a)?(?!([^a]))?)+$", "acbab" }, { MUA, 0, "((?!a)?(?!([^a]))?)+$", "acbab" },
{ MUA, 0, "((?!a)?\?(?!([^a]))?\?)+$", "acbab" }, { MUA, 0, "((?!a)?\?(?!([^a]))?\?)+$", "acbab" },
{ MUA, 0, "a(?=(?C)\\B)b", "ab" },
{ MUA, 0, "a(?!(?C)\\B)bb|ab", "abb" },
{ MUA, 0, "a(?=\\b|(?C)\\B)b", "ab" },
{ MUA, 0, "a(?!\\b|(?C)\\B)bb|ab", "abb" },
{ MUA, 0, "c(?(?=(?C)\\B)ab|a)", "cab" },
{ MUA, 0, "c(?(?!(?C)\\B)ab|a)", "cab" },
{ MUA, 0, "c(?(?=\\b|(?C)\\B)ab|a)", "cab" },
{ MUA, 0, "c(?(?!\\b|(?C)\\B)ab|a)", "cab" },
{ MUA, 0, "a(?=)b", "ab" },
{ MUA, 0 | F_NOMATCH, "a(?!)b", "ab" },
/* Not empty, ACCEPT, FAIL */ /* Not empty, ACCEPT, FAIL */
{ MUA | PCRE_NOTEMPTY, 0 | F_NOMATCH, "a*", "bcx" }, { MUA | PCRE_NOTEMPTY, 0 | F_NOMATCH, "a*", "bcx" },
...@@ -664,6 +686,7 @@ static struct regression_test_case regression_test_cases[] = { ...@@ -664,6 +686,7 @@ static struct regression_test_case regression_test_cases[] = {
{ PCRE_MULTILINE | PCRE_UTF8 | PCRE_NEWLINE_CRLF | PCRE_FIRSTLINE, 1, ".", "\r\n" }, { PCRE_MULTILINE | PCRE_UTF8 | PCRE_NEWLINE_CRLF | PCRE_FIRSTLINE, 1, ".", "\r\n" },
{ PCRE_FIRSTLINE | PCRE_NEWLINE_LF | PCRE_DOTALL, 0 | F_NOMATCH, "ab.", "ab" }, { PCRE_FIRSTLINE | PCRE_NEWLINE_LF | PCRE_DOTALL, 0 | F_NOMATCH, "ab.", "ab" },
{ MUA | PCRE_FIRSTLINE, 1 | F_NOMATCH, "^[a-d0-9]", "\nxx\nd" }, { MUA | PCRE_FIRSTLINE, 1 | F_NOMATCH, "^[a-d0-9]", "\nxx\nd" },
{ PCRE_NEWLINE_ANY | PCRE_FIRSTLINE | PCRE_DOTALL, 0, "....a", "012\n0a" },
/* Recurse. */ /* Recurse. */
{ MUA, 0, "(a)(?1)", "aa" }, { MUA, 0, "(a)(?1)", "aa" },
...@@ -798,6 +821,9 @@ static struct regression_test_case regression_test_cases[] = { ...@@ -798,6 +821,9 @@ static struct regression_test_case regression_test_cases[] = {
/* (*SKIP) verb. */ /* (*SKIP) verb. */
{ MUA, 0 | F_NOMATCH, "(?=a(*SKIP)b)ab|ad", "ad" }, { MUA, 0 | F_NOMATCH, "(?=a(*SKIP)b)ab|ad", "ad" },
{ MUA, 0, "(\\w+(*SKIP)#)", "abcd,xyz#," },
{ MUA, 0, "\\w+(*SKIP)#|mm", "abcd,xyz#," },
{ MUA, 0 | F_NOMATCH, "b+(?<=(*SKIP)#c)|b+", "#bbb" },
/* (*THEN) verb. */ /* (*THEN) verb. */
{ MUA, 0, "((?:a(*THEN)|aab)(*THEN)c|a+)+m", "aabcaabcaabcaabcnacm" }, { MUA, 0, "((?:a(*THEN)|aab)(*THEN)c|a+)+m", "aabcaabcaabcaabcnacm" },
...@@ -1534,10 +1560,10 @@ static int regression_tests(void) ...@@ -1534,10 +1560,10 @@ static int regression_tests(void)
is_successful = 0; is_successful = 0;
} }
#endif #endif
#if defined SUPPORT_PCRE16 && defined SUPPORT_PCRE16 #if defined SUPPORT_PCRE16 && defined SUPPORT_PCRE32
if (ovector16_1[i] != ovector16_2[i] || ovector16_1[i] != ovector16_1[i] || ovector16_1[i] != ovector16_2[i]) { if (ovector16_1[i] != ovector16_2[i] || ovector16_1[i] != ovector32_1[i] || ovector16_1[i] != ovector32_2[i]) {
printf("\n16 and 16 bit: Ovector[%d] value differs(J16:%d,I16:%d,J32:%d,I32:%d): [%d] '%s' @ '%s' \n", printf("\n16 and 32 bit: Ovector[%d] value differs(J16:%d,I16:%d,J32:%d,I32:%d): [%d] '%s' @ '%s' \n",
i, ovector16_1[i], ovector16_2[i], ovector16_1[i], ovector16_2[i], i, ovector16_1[i], ovector16_2[i], ovector32_1[i], ovector32_2[i],
total, current->pattern, current->input); total, current->pattern, current->input);
is_successful = 0; is_successful = 0;
} }
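The old condition in this hunk compared the 16-bit vectors only against themselves, so a 32-bit mismatch could never be reported. A minimal sketch of the intended four-way comparison follows. The function name and parameter names are hypothetical, chosen to echo the J16/I16/J32/I32 labels in the `printf` call, and are not part of the test program.

```c
#include <assert.h>

/* Hypothetical helper, not part of pcre_jit_test.c: returns the first
   index at which the four offset vectors disagree, or -1 if all of them
   agree on the first `count` entries. */
int first_ovector_mismatch(const int *j16, const int *i16,
                           const int *j32, const int *i32, int count)
{
  int i;
  assert(count >= 0);
  for (i = 0; i < count; i++)
    {
    /* Each comparison involves a different vector; comparing a vector
       with itself (as the old code did) is always false. */
    if (j16[i] != i16[i] || j16[i] != j32[i] || j16[i] != i32[i])
      return i;
    }
  return -1;
}
```

Using the first vector as the reference keeps the check to three comparisons per index while still catching a difference in any of the other three.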
......
...@@ -1371,7 +1371,7 @@ do ...@@ -1371,7 +1371,7 @@ do
for (c = 0; c < 16; c++) start_bits[c] |= map[c]; for (c = 0; c < 16; c++) start_bits[c] |= map[c];
for (c = 128; c < 256; c++) for (c = 128; c < 256; c++)
{ {
if ((map[c/8] && (1 << (c&7))) != 0) if ((map[c/8] & (1 << (c&7))) != 0)
{ {
int d = (c >> 6) | 0xc0; /* Set bit for this starter */ int d = (c >> 6) | 0xc0; /* Set bit for this starter */
start_bits[d/8] |= (1 << (d&7)); /* and then skip on to the */ start_bits[d/8] |= (1 << (d&7)); /* and then skip on to the */
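The one-character change in this hunk replaces a logical AND with a bitwise AND when testing a bit in the class bitmap. A minimal sketch of the difference is shown here. The two helper names are hypothetical and exist only for illustration; the 32-byte `map` layout is assumed from the `c/8` and `c&7` indexing in the hunk.

```c
#include <assert.h>

/* Hypothetical helpers for illustration; not PCRE functions. */

/* Old form: '&&' is true whenever the byte map[c/8] is non-zero,
   because (1 << (c&7)) is itself always non-zero. */
int bit_set_logical(const unsigned char *map, int c)
{
  assert(c >= 0 && c < 256);
  return (map[c/8] && (1 << (c&7))) != 0;
}

/* New form: '&' tests only the single bit that belongs to c. */
int bit_set_bitwise(const unsigned char *map, int c)
{
  assert(c >= 0 && c < 256);
  return (map[c/8] & (1 << (c&7))) != 0;
}
```

With only the bit for character 200 set, the logical form also reports 201 to 207 as present, since they share the same byte.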
......
...@@ -66,7 +66,7 @@ Arg RE::no_arg((void*)NULL); ...@@ -66,7 +66,7 @@ Arg RE::no_arg((void*)NULL);
// inclusive test if we ever needed it. (Note that not only the // inclusive test if we ever needed it. (Note that not only the
// __attribute__ syntax, but also __USER_LABEL_PREFIX__, are // __attribute__ syntax, but also __USER_LABEL_PREFIX__, are
// gnu-specific.) // gnu-specific.)
#if defined(__GNUC__) && __GNUC__ >= 3 && defined(__ELF__) #if defined(__GNUC__) && __GNUC__ >= 3 && defined(__ELF__) && !defined(__INTEL_COMPILER)
# define ULP_AS_STRING(x) ULP_AS_STRING_INTERNAL(x) # define ULP_AS_STRING(x) ULP_AS_STRING_INTERNAL(x)
# define ULP_AS_STRING_INTERNAL(x) #x # define ULP_AS_STRING_INTERNAL(x) #x
# define USER_LABEL_PREFIX_STR ULP_AS_STRING(__USER_LABEL_PREFIX__) # define USER_LABEL_PREFIX_STR ULP_AS_STRING(__USER_LABEL_PREFIX__)
...@@ -168,22 +168,22 @@ bool RE::FullMatch(const StringPiece& text, ...@@ -168,22 +168,22 @@ bool RE::FullMatch(const StringPiece& text,
const Arg& ptr16) const { const Arg& ptr16) const {
const Arg* args[kMaxArgs]; const Arg* args[kMaxArgs];
int n = 0; int n = 0;
if (&ptr1 == &no_arg) goto done; args[n++] = &ptr1; if (&ptr1 == &no_arg) { goto done; } args[n++] = &ptr1;
if (&ptr2 == &no_arg) goto done; args[n++] = &ptr2; if (&ptr2 == &no_arg) { goto done; } args[n++] = &ptr2;
if (&ptr3 == &no_arg) goto done; args[n++] = &ptr3; if (&ptr3 == &no_arg) { goto done; } args[n++] = &ptr3;
if (&ptr4 == &no_arg) goto done; args[n++] = &ptr4; if (&ptr4 == &no_arg) { goto done; } args[n++] = &ptr4;
if (&ptr5 == &no_arg) goto done; args[n++] = &ptr5; if (&ptr5 == &no_arg) { goto done; } args[n++] = &ptr5;
if (&ptr6 == &no_arg) goto done; args[n++] = &ptr6; if (&ptr6 == &no_arg) { goto done; } args[n++] = &ptr6;
if (&ptr7 == &no_arg) goto done; args[n++] = &ptr7; if (&ptr7 == &no_arg) { goto done; } args[n++] = &ptr7;
if (&ptr8 == &no_arg) goto done; args[n++] = &ptr8; if (&ptr8 == &no_arg) { goto done; } args[n++] = &ptr8;
if (&ptr9 == &no_arg) goto done; args[n++] = &ptr9; if (&ptr9 == &no_arg) { goto done; } args[n++] = &ptr9;
if (&ptr10 == &no_arg) goto done; args[n++] = &ptr10; if (&ptr10 == &no_arg) { goto done; } args[n++] = &ptr10;
if (&ptr11 == &no_arg) goto done; args[n++] = &ptr11; if (&ptr11 == &no_arg) { goto done; } args[n++] = &ptr11;
if (&ptr12 == &no_arg) goto done; args[n++] = &ptr12; if (&ptr12 == &no_arg) { goto done; } args[n++] = &ptr12;
if (&ptr13 == &no_arg) goto done; args[n++] = &ptr13; if (&ptr13 == &no_arg) { goto done; } args[n++] = &ptr13;
if (&ptr14 == &no_arg) goto done; args[n++] = &ptr14; if (&ptr14 == &no_arg) { goto done; } args[n++] = &ptr14;
if (&ptr15 == &no_arg) goto done; args[n++] = &ptr15; if (&ptr15 == &no_arg) { goto done; } args[n++] = &ptr15;
if (&ptr16 == &no_arg) goto done; args[n++] = &ptr16; if (&ptr16 == &no_arg) { goto done; } args[n++] = &ptr16;
done: done:
int consumed; int consumed;
...@@ -210,22 +210,22 @@ bool RE::PartialMatch(const StringPiece& text, ...@@ -210,22 +210,22 @@ bool RE::PartialMatch(const StringPiece& text,
const Arg& ptr16) const { const Arg& ptr16) const {
const Arg* args[kMaxArgs]; const Arg* args[kMaxArgs];
int n = 0; int n = 0;
if (&ptr1 == &no_arg) goto done; args[n++] = &ptr1; if (&ptr1 == &no_arg) { goto done; } args[n++] = &ptr1;
if (&ptr2 == &no_arg) goto done; args[n++] = &ptr2; if (&ptr2 == &no_arg) { goto done; } args[n++] = &ptr2;
if (&ptr3 == &no_arg) goto done; args[n++] = &ptr3; if (&ptr3 == &no_arg) { goto done; } args[n++] = &ptr3;
if (&ptr4 == &no_arg) goto done; args[n++] = &ptr4; if (&ptr4 == &no_arg) { goto done; } args[n++] = &ptr4;
if (&ptr5 == &no_arg) goto done; args[n++] = &ptr5; if (&ptr5 == &no_arg) { goto done; } args[n++] = &ptr5;
if (&ptr6 == &no_arg) goto done; args[n++] = &ptr6; if (&ptr6 == &no_arg) { goto done; } args[n++] = &ptr6;
if (&ptr7 == &no_arg) goto done; args[n++] = &ptr7; if (&ptr7 == &no_arg) { goto done; } args[n++] = &ptr7;
if (&ptr8 == &no_arg) goto done; args[n++] = &ptr8; if (&ptr8 == &no_arg) { goto done; } args[n++] = &ptr8;
if (&ptr9 == &no_arg) goto done; args[n++] = &ptr9; if (&ptr9 == &no_arg) { goto done; } args[n++] = &ptr9;
if (&ptr10 == &no_arg) goto done; args[n++] = &ptr10; if (&ptr10 == &no_arg) { goto done; } args[n++] = &ptr10;
if (&ptr11 == &no_arg) goto done; args[n++] = &ptr11; if (&ptr11 == &no_arg) { goto done; } args[n++] = &ptr11;
if (&ptr12 == &no_arg) goto done; args[n++] = &ptr12; if (&ptr12 == &no_arg) { goto done; } args[n++] = &ptr12;
if (&ptr13 == &no_arg) goto done; args[n++] = &ptr13; if (&ptr13 == &no_arg) { goto done; } args[n++] = &ptr13;
if (&ptr14 == &no_arg) goto done; args[n++] = &ptr14; if (&ptr14 == &no_arg) { goto done; } args[n++] = &ptr14;
if (&ptr15 == &no_arg) goto done; args[n++] = &ptr15; if (&ptr15 == &no_arg) { goto done; } args[n++] = &ptr15;
if (&ptr16 == &no_arg) goto done; args[n++] = &ptr16; if (&ptr16 == &no_arg) { goto done; } args[n++] = &ptr16;
done: done:
int consumed; int consumed;
...@@ -252,22 +252,22 @@ bool RE::Consume(StringPiece* input, ...@@ -252,22 +252,22 @@ bool RE::Consume(StringPiece* input,
const Arg& ptr16) const { const Arg& ptr16) const {
const Arg* args[kMaxArgs]; const Arg* args[kMaxArgs];
int n = 0; int n = 0;
if (&ptr1 == &no_arg) goto done; args[n++] = &ptr1; if (&ptr1 == &no_arg) { goto done; } args[n++] = &ptr1;
if (&ptr2 == &no_arg) goto done; args[n++] = &ptr2; if (&ptr2 == &no_arg) { goto done; } args[n++] = &ptr2;
if (&ptr3 == &no_arg) goto done; args[n++] = &ptr3; if (&ptr3 == &no_arg) { goto done; } args[n++] = &ptr3;
if (&ptr4 == &no_arg) goto done; args[n++] = &ptr4; if (&ptr4 == &no_arg) { goto done; } args[n++] = &ptr4;
if (&ptr5 == &no_arg) goto done; args[n++] = &ptr5; if (&ptr5 == &no_arg) { goto done; } args[n++] = &ptr5;
if (&ptr6 == &no_arg) goto done; args[n++] = &ptr6; if (&ptr6 == &no_arg) { goto done; } args[n++] = &ptr6;
if (&ptr7 == &no_arg) goto done; args[n++] = &ptr7; if (&ptr7 == &no_arg) { goto done; } args[n++] = &ptr7;
if (&ptr8 == &no_arg) goto done; args[n++] = &ptr8; if (&ptr8 == &no_arg) { goto done; } args[n++] = &ptr8;
if (&ptr9 == &no_arg) goto done; args[n++] = &ptr9; if (&ptr9 == &no_arg) { goto done; } args[n++] = &ptr9;
if (&ptr10 == &no_arg) goto done; args[n++] = &ptr10; if (&ptr10 == &no_arg) { goto done; } args[n++] = &ptr10;
if (&ptr11 == &no_arg) goto done; args[n++] = &ptr11; if (&ptr11 == &no_arg) { goto done; } args[n++] = &ptr11;
if (&ptr12 == &no_arg) goto done; args[n++] = &ptr12; if (&ptr12 == &no_arg) { goto done; } args[n++] = &ptr12;
if (&ptr13 == &no_arg) goto done; args[n++] = &ptr13; if (&ptr13 == &no_arg) { goto done; } args[n++] = &ptr13;
if (&ptr14 == &no_arg) goto done; args[n++] = &ptr14; if (&ptr14 == &no_arg) { goto done; } args[n++] = &ptr14;
if (&ptr15 == &no_arg) goto done; args[n++] = &ptr15; if (&ptr15 == &no_arg) { goto done; } args[n++] = &ptr15;
if (&ptr16 == &no_arg) goto done; args[n++] = &ptr16; if (&ptr16 == &no_arg) { goto done; } args[n++] = &ptr16;
done: done:
int consumed; int consumed;
...@@ -300,22 +300,22 @@ bool RE::FindAndConsume(StringPiece* input, ...@@ -300,22 +300,22 @@ bool RE::FindAndConsume(StringPiece* input,
const Arg& ptr16) const { const Arg& ptr16) const {
const Arg* args[kMaxArgs]; const Arg* args[kMaxArgs];
int n = 0; int n = 0;
if (&ptr1 == &no_arg) goto done; args[n++] = &ptr1; if (&ptr1 == &no_arg) { goto done; } args[n++] = &ptr1;
if (&ptr2 == &no_arg) goto done; args[n++] = &ptr2; if (&ptr2 == &no_arg) { goto done; } args[n++] = &ptr2;
if (&ptr3 == &no_arg) goto done; args[n++] = &ptr3; if (&ptr3 == &no_arg) { goto done; } args[n++] = &ptr3;
if (&ptr4 == &no_arg) goto done; args[n++] = &ptr4; if (&ptr4 == &no_arg) { goto done; } args[n++] = &ptr4;
if (&ptr5 == &no_arg) goto done; args[n++] = &ptr5; if (&ptr5 == &no_arg) { goto done; } args[n++] = &ptr5;
if (&ptr6 == &no_arg) goto done; args[n++] = &ptr6; if (&ptr6 == &no_arg) { goto done; } args[n++] = &ptr6;
if (&ptr7 == &no_arg) goto done; args[n++] = &ptr7; if (&ptr7 == &no_arg) { goto done; } args[n++] = &ptr7;
if (&ptr8 == &no_arg) goto done; args[n++] = &ptr8; if (&ptr8 == &no_arg) { goto done; } args[n++] = &ptr8;
if (&ptr9 == &no_arg) goto done; args[n++] = &ptr9; if (&ptr9 == &no_arg) { goto done; } args[n++] = &ptr9;
if (&ptr10 == &no_arg) goto done; args[n++] = &ptr10; if (&ptr10 == &no_arg) { goto done; } args[n++] = &ptr10;
if (&ptr11 == &no_arg) goto done; args[n++] = &ptr11; if (&ptr11 == &no_arg) { goto done; } args[n++] = &ptr11;
if (&ptr12 == &no_arg) goto done; args[n++] = &ptr12; if (&ptr12 == &no_arg) { goto done; } args[n++] = &ptr12;
if (&ptr13 == &no_arg) goto done; args[n++] = &ptr13; if (&ptr13 == &no_arg) { goto done; } args[n++] = &ptr13;
if (&ptr14 == &no_arg) goto done; args[n++] = &ptr14; if (&ptr14 == &no_arg) { goto done; } args[n++] = &ptr14;
if (&ptr15 == &no_arg) goto done; args[n++] = &ptr15; if (&ptr15 == &no_arg) { goto done; } args[n++] = &ptr15;
if (&ptr16 == &no_arg) goto done; args[n++] = &ptr16; if (&ptr16 == &no_arg) { goto done; } args[n++] = &ptr16;
done: done:
int consumed; int consumed;
......
...@@ -2437,7 +2437,7 @@ return options; ...@@ -2437,7 +2437,7 @@ return options;
static char * static char *
ordin(int n) ordin(int n)
{ {
static char buffer[8]; static char buffer[14];
char *p = buffer; char *p = buffer;
sprintf(p, "%d", n); sprintf(p, "%d", n);
while (*p != 0) p++; while (*p != 0) p++;
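The buffer grows from 8 to 14 bytes in this hunk. A rough sketch of where 14 comes from is given below, assuming a 32-bit `int` and assuming the function appends a two-letter ordinal suffix after the digits (the hunk does not show that part). `ordinal_sketch` and `ORDINAL_BUFSIZE` are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* 11 characters for "-2147483648" (assuming a 32-bit int),
   2 for the suffix, 1 for the terminating NUL. */
#define ORDINAL_BUFSIZE 14

/* Hypothetical stand-in for ordin(): writes n followed by a suffix
   into out, which must hold at least ORDINAL_BUFSIZE bytes. The
   suffix choice is deliberately naive (it looks at n % 10 only). */
char *ordinal_sketch(int n, char *out, size_t size)
{
  int len;
  const char *suffix;
  assert(out != NULL && size >= ORDINAL_BUFSIZE);
  len = snprintf(out, size, "%d", n);   /* bounded, unlike sprintf */
  switch (n % 10)
    {
    case 1:  suffix = "st"; break;
    case 2:  suffix = "nd"; break;
    case 3:  suffix = "rd"; break;
    default: suffix = "th"; break;
    }
  strcpy(out + len, suffix);            /* at most 11 + 2 + 1 bytes in total */
  return out;
}
```

An 8-byte buffer has room for at most five digits plus a suffix, so any value of 100000 or more would already overflow it under these assumptions.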
......
...@@ -6,7 +6,7 @@ ...@@ -6,7 +6,7 @@
and semantics are as close as possible to those of the Perl 5 language. and semantics are as close as possible to those of the Perl 5 language.
Written by Philip Hazel Written by Philip Hazel
Copyright (c) 1997-2014 University of Cambridge Copyright (c) 1997-2016 University of Cambridge
----------------------------------------------------------------------------- -----------------------------------------------------------------------------
Redistribution and use in source and binary forms, with or without Redistribution and use in source and binary forms, with or without
...@@ -173,7 +173,8 @@ static const int eint[] = { ...@@ -173,7 +173,8 @@ static const int eint[] = {
REG_BADPAT, /* group name must start with a non-digit */ REG_BADPAT, /* group name must start with a non-digit */
/* 85 */ /* 85 */
REG_BADPAT, /* parentheses too deeply nested (stack check) */ REG_BADPAT, /* parentheses too deeply nested (stack check) */
REG_BADPAT /* missing digits in \x{} or \o{} */ REG_BADPAT, /* missing digits in \x{} or \o{} */
REG_BADPAT /* pattern too complicated */
}; };
/* Table of texts corresponding to POSIX error codes */ /* Table of texts corresponding to POSIX error codes */
...@@ -364,6 +365,7 @@ start location rather than being passed as a PCRE "starting offset". */ ...@@ -364,6 +365,7 @@ start location rather than being passed as a PCRE "starting offset". */
if ((eflags & REG_STARTEND) != 0) if ((eflags & REG_STARTEND) != 0)
{ {
if (pmatch == NULL) return REG_INVARG;
so = pmatch[0].rm_so; so = pmatch[0].rm_so;
eo = pmatch[0].rm_eo; eo = pmatch[0].rm_eo;
} }
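The added line rejects a NULL `pmatch` before the `REG_STARTEND` branch dereferences it. The guard can be sketched in isolation as follows. All types, constants, and the function name here are hypothetical stand-ins rather than the real pcreposix definitions, and the `else` branch is an assumption about what happens when the flag is not set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the POSIX wrapper's types and codes. */
typedef struct { int rm_so; int rm_eo; } match_sketch;
enum { SKETCH_OK = 0, SKETCH_INVARG = 1 };
#define SKETCH_STARTEND 0x01

/* Works out which part of the subject to search. With the STARTEND
   flag the range is read from pmatch[0], so pmatch must not be NULL. */
int subject_range(const match_sketch *pmatch, int eflags,
                  int subject_len, int *so, int *eo)
{
  assert(so != NULL && eo != NULL);
  if ((eflags & SKETCH_STARTEND) != 0)
    {
    if (pmatch == NULL) return SKETCH_INVARG;  /* nothing to read from */
    *so = pmatch[0].rm_so;
    *eo = pmatch[0].rm_eo;
    }
  else
    {
    *so = 0;               /* assumed default: the whole subject */
    *eo = subject_len;
    }
  return SKETCH_OK;
}
```

Returning an error code keeps the failure visible to the caller instead of crashing inside the library.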
......
...@@ -2250,7 +2250,7 @@ data is not zero. */ ...@@ -2250,7 +2250,7 @@ data is not zero. */
static int callout(pcre_callout_block *cb) static int callout(pcre_callout_block *cb)
{ {
FILE *f = (first_callout | callout_extra)? outfile : NULL; FILE *f = (first_callout | callout_extra)? outfile : NULL;
int i, pre_start, post_start, subject_length; int i, current_position, pre_start, post_start, subject_length;
if (callout_extra) if (callout_extra)
{ {
...@@ -2280,14 +2280,19 @@ printed lengths of the substrings. */ ...@@ -2280,14 +2280,19 @@ printed lengths of the substrings. */
if (f != NULL) fprintf(f, "--->"); if (f != NULL) fprintf(f, "--->");
/* If a lookbehind is involved, the current position may be earlier than the
match start. If so, use the match start instead. */
current_position = (cb->current_position >= cb->start_match)?
cb->current_position : cb->start_match;
PCHARS(pre_start, cb->subject, 0, cb->start_match, f); PCHARS(pre_start, cb->subject, 0, cb->start_match, f);
PCHARS(post_start, cb->subject, cb->start_match, PCHARS(post_start, cb->subject, cb->start_match,
cb->current_position - cb->start_match, f); current_position - cb->start_match, f);
PCHARS(subject_length, cb->subject, 0, cb->subject_length, NULL); PCHARS(subject_length, cb->subject, 0, cb->subject_length, NULL);
PCHARSV(cb->subject, cb->current_position, PCHARSV(cb->subject, current_position, cb->subject_length - current_position, f);
cb->subject_length - cb->current_position, f);
if (f != NULL) fprintf(f, "\n"); if (f != NULL) fprintf(f, "\n");
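The comment added in this hunk explains that a lookbehind can leave the current position before the match start, which would make the printed length `current_position - start_match` negative. A small sketch of the clamp is given here. Both helper names are hypothetical and are not part of pcretest.c.

```c
#include <assert.h>

/* Hypothetical helper mirroring the clamp added in callout(): never
   report a position earlier than the start of the match. */
int clamp_to_match_start(int current_position, int start_match)
{
  return (current_position >= start_match)? current_position : start_match;
}

/* Length of the text printed between the match start and the clamped
   current position; with the clamp applied it cannot be negative. */
int post_start_length(int current_position, int start_match)
{
  int pos = clamp_to_match_start(current_position, start_match);
  assert(pos >= start_match);
  return pos - start_match;
}
```

For a callout inside a lookbehind with `start_match` 3 and `current_position` 1, the clamped length is 0 rather than -2.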
...@@ -5612,6 +5617,12 @@ while (!done) ...@@ -5612,6 +5617,12 @@ while (!done)
break; break;
} }
if (use_size_offsets < 2)
{
fprintf(outfile, "Cannot do global matching with an ovector size < 2\n");
break;
}
/* If we have matched an empty string, first check to see if we are at /* If we have matched an empty string, first check to see if we are at
the end of the subject. If so, the /g loop is over. Otherwise, mimic what the end of the subject. If so, the /g loop is over. Otherwise, mimic what
Perl's /g options does. This turns out to be rather cunning. First we set Perl's /g options does. This turns out to be rather cunning. First we set
...@@ -5740,3 +5751,4 @@ return yield; ...@@ -5740,3 +5751,4 @@ return yield;
} }
/* End of pcretest.c */ /* End of pcretest.c */
...@@ -138,4 +138,6 @@ is required for these tests. --/ ...@@ -138,4 +138,6 @@ is required for these tests. --/
/.((?2)(?R)\1)()/B /.((?2)(?R)\1)()/B
/([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00](*ACCEPT)/
/-- End of testinput11 --/ /-- End of testinput11 --/
...@@ -4217,4 +4217,30 @@ backtracking verbs. --/ ...@@ -4217,4 +4217,30 @@ backtracking verbs. --/
/a[[:punct:]b]/BZ /a[[:punct:]b]/BZ
/L(?#(|++<!(2)?/BZ
/L(?#(|++<!(2)?/BOZ
/L(?#(|++<!(2)?/BCZ
/L(?#(|++<!(2)?/BCOZ
/(A*)\E+/CBZ
/()\Q\E*]/BCZ
/(?<A>)(?J:(?<B>)(?<B>))(?<C>)/
\O\CC
/(?=a\K)/
ring bpattingbobnd $ 1,oern cou \rb\L
/(?<=((?C)0))/
9010
abcd
/((?J)(?'R'(?'R'(?'R'(?'R'(?'R'(?|(\k'R'))))))))/
/\N(?(?C)0?!.)*/
/-- End of testinput2 --/ /-- End of testinput2 --/
...@@ -1553,4 +1553,13 @@ ...@@ -1553,4 +1553,13 @@
\x{200} \x{200}
\x{37e} \x{37e}
/[^[:^ascii:]\d]/8W
a
~
0
\a
\x{7f}
\x{389}
\x{20ac}
/-- End of testinput6 --/ /-- End of testinput6 --/
...@@ -853,4 +853,8 @@ of case for anything other than the ASCII letters. --/ ...@@ -853,4 +853,8 @@ of case for anything other than the ASCII letters. --/
/a[b[:punct:]]/8WBZ /a[b[:punct:]]/8WBZ
/L(?#(|++<!(2)?/B8COZ
/L(?#(|++<!(2)?/B8WCZ
/-- End of testinput7 --/ /-- End of testinput7 --/
...@@ -231,7 +231,7 @@ Memory allocation (code space): 73 ...@@ -231,7 +231,7 @@ Memory allocation (code space): 73
------------------------------------------------------------------ ------------------------------------------------------------------
/(?P<a>a)...(?P=a)bbb(?P>a)d/BM /(?P<a>a)...(?P=a)bbb(?P>a)d/BM
Memory allocation (code space): 77 Memory allocation (code space): 93
------------------------------------------------------------------ ------------------------------------------------------------------
0 24 Bra 0 24 Bra
2 5 CBra 1 2 5 CBra 1
...@@ -765,4 +765,7 @@ Memory allocation (code space): 14 ...@@ -765,4 +765,7 @@ Memory allocation (code space): 14
25 End 25 End
------------------------------------------------------------------ ------------------------------------------------------------------
/([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00](*ACCEPT)/
Failed: regular expression is too complicated at offset 490
/-- End of testinput11 --/ /-- End of testinput11 --/
...@@ -231,7 +231,7 @@ Memory allocation (code space): 155 ...@@ -231,7 +231,7 @@ Memory allocation (code space): 155
------------------------------------------------------------------ ------------------------------------------------------------------
/(?P<a>a)...(?P=a)bbb(?P>a)d/BM /(?P<a>a)...(?P=a)bbb(?P>a)d/BM
Memory allocation (code space): 157 Memory allocation (code space): 189
------------------------------------------------------------------ ------------------------------------------------------------------
0 24 Bra 0 24 Bra
2 5 CBra 1 2 5 CBra 1
...@@ -765,4 +765,7 @@ Memory allocation (code space): 28 ...@@ -765,4 +765,7 @@ Memory allocation (code space): 28
25 End 25 End
------------------------------------------------------------------ ------------------------------------------------------------------
/([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00](*ACCEPT)/
Failed: missing ) at offset 509
/-- End of testinput11 --/ /-- End of testinput11 --/
...@@ -231,7 +231,7 @@ Memory allocation (code space): 45 ...@@ -231,7 +231,7 @@ Memory allocation (code space): 45
------------------------------------------------------------------ ------------------------------------------------------------------
/(?P<a>a)...(?P=a)bbb(?P>a)d/BM /(?P<a>a)...(?P=a)bbb(?P>a)d/BM
Memory allocation (code space): 50 Memory allocation (code space): 62
------------------------------------------------------------------ ------------------------------------------------------------------
0 30 Bra 0 30 Bra
3 7 CBra 1 3 7 CBra 1
...@@ -765,4 +765,7 @@ Memory allocation (code space): 10 ...@@ -765,4 +765,7 @@ Memory allocation (code space): 10
38 End 38 End
------------------------------------------------------------------ ------------------------------------------------------------------
/([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00]([00](*ACCEPT)/
Failed: missing ) at offset 509
/-- End of testinput11 --/ /-- End of testinput11 --/
...@@ -419,7 +419,7 @@ Need char = '>' ...@@ -419,7 +419,7 @@ Need char = '>'
/(?U)<.*>/I /(?U)<.*>/I
Capturing subpattern count = 0 Capturing subpattern count = 0
Options: ungreedy No options
First char = '<' First char = '<'
Need char = '>' Need char = '>'
abc<def>ghi<klm>nop abc<def>ghi<klm>nop
...@@ -443,7 +443,7 @@ Need char = '=' ...@@ -443,7 +443,7 @@ Need char = '='
/(?U)={3,}?/I /(?U)={3,}?/I
Capturing subpattern count = 0 Capturing subpattern count = 0
Options: ungreedy No options
First char = '=' First char = '='
Need char = '=' Need char = '='
abc========def abc========def
...@@ -477,7 +477,7 @@ Failed: lookbehind assertion is not fixed length at offset 12 ...@@ -477,7 +477,7 @@ Failed: lookbehind assertion is not fixed length at offset 12
/(?i)abc/I /(?i)abc/I
Capturing subpattern count = 0 Capturing subpattern count = 0
Options: caseless No options
First char = 'a' (caseless) First char = 'a' (caseless)
Need char = 'c' (caseless) Need char = 'c' (caseless)
...@@ -489,7 +489,7 @@ No need char ...@@ -489,7 +489,7 @@ No need char
/(?i)^1234/I /(?i)^1234/I
Capturing subpattern count = 0 Capturing subpattern count = 0
Options: anchored caseless Options: anchored
No first char No first char
No need char No need char
...@@ -502,7 +502,7 @@ No need char ...@@ -502,7 +502,7 @@ No need char
/(?s).*/I /(?s).*/I
Capturing subpattern count = 0 Capturing subpattern count = 0
May match empty string May match empty string
Options: anchored dotall Options: anchored
No first char No first char
No need char No need char
...@@ -516,7 +516,7 @@ Starting chars: a b c d ...@@ -516,7 +516,7 @@ Starting chars: a b c d
/(?i)[abcd]/IS /(?i)[abcd]/IS
Capturing subpattern count = 0 Capturing subpattern count = 0
Options: caseless No options
No first char No first char
No need char No need char
Subject length lower bound = 1 Subject length lower bound = 1
...@@ -524,7 +524,7 @@ Starting chars: A B C D a b c d ...@@ -524,7 +524,7 @@ Starting chars: A B C D a b c d
/(?m)[xy]|(b|c)/IS /(?m)[xy]|(b|c)/IS
Capturing subpattern count = 1 Capturing subpattern count = 1
Options: multiline No options
No first char No first char
No need char No need char
Subject length lower bound = 1 Subject length lower bound = 1
...@@ -538,7 +538,7 @@ No need char ...@@ -538,7 +538,7 @@ No need char
/(?i)(^a|^b)/Im /(?i)(^a|^b)/Im
Capturing subpattern count = 1 Capturing subpattern count = 1
Options: caseless multiline Options: multiline
First char at start or follows newline First char at start or follows newline
No need char No need char
...@@ -555,13 +555,13 @@ Failed: malformed number or name after (?( at offset 4 ...@@ -555,13 +555,13 @@ Failed: malformed number or name after (?( at offset 4
Failed: malformed number or name after (?( at offset 4 Failed: malformed number or name after (?( at offset 4
/(?(?i))/ /(?(?i))/
Failed: assertion expected after (?( at offset 3 Failed: assertion expected after (?( or (?(?C) at offset 3
/(?(abc))/ /(?(abc))/
Failed: reference to non-existent subpattern at offset 7 Failed: reference to non-existent subpattern at offset 7
/(?(?<ab))/ /(?(?<ab))/
Failed: assertion expected after (?( at offset 3 Failed: assertion expected after (?( or (?(?C) at offset 3
/((?s)blah)\s+\1/I /((?s)blah)\s+\1/I
Capturing subpattern count = 1 Capturing subpattern count = 1
...@@ -1179,7 +1179,7 @@ No need char ...@@ -1179,7 +1179,7 @@ No need char
End End
------------------------------------------------------------------ ------------------------------------------------------------------
Capturing subpattern count = 1 Capturing subpattern count = 1
Options: anchored dotall Options: anchored
No first char No first char
No need char No need char
...@@ -2735,7 +2735,7 @@ No match ...@@ -2735,7 +2735,7 @@ No match
End End
------------------------------------------------------------------ ------------------------------------------------------------------
Capturing subpattern count = 0 Capturing subpattern count = 0
Options: caseless extended Options: extended
First char = 'a' (caseless) First char = 'a' (caseless)
Need char = 'c' (caseless) Need char = 'c' (caseless)
...@@ -2748,7 +2748,7 @@ Need char = 'c' (caseless) ...@@ -2748,7 +2748,7 @@ Need char = 'c' (caseless)
End End
------------------------------------------------------------------ ------------------------------------------------------------------
Capturing subpattern count = 0 Capturing subpattern count = 0
Options: caseless extended Options: extended
First char = 'a' (caseless) First char = 'a' (caseless)
Need char = 'c' (caseless) Need char = 'c' (caseless)
...@@ -3095,7 +3095,7 @@ Need char = 'b' ...@@ -3095,7 +3095,7 @@ Need char = 'b'
End End
------------------------------------------------------------------ ------------------------------------------------------------------
Capturing subpattern count = 0 Capturing subpattern count = 0
Options: ungreedy No options
First char = 'x' First char = 'x'
Need char = 'b' Need char = 'b'
xaaaab xaaaab
...@@ -3497,7 +3497,7 @@ Need char = 'c' ...@@ -3497,7 +3497,7 @@ Need char = 'c'
/(?i)[ab]/IS /(?i)[ab]/IS
Capturing subpattern count = 0 Capturing subpattern count = 0
Options: caseless No options
No first char No first char
No need char No need char
Subject length lower bound = 1 Subject length lower bound = 1
...@@ -6299,7 +6299,7 @@ Capturing subpattern count = 3 ...@@ -6299,7 +6299,7 @@ Capturing subpattern count = 3
Named capturing subpatterns: Named capturing subpatterns:
A 2 A 2
A 3 A 3
Options: anchored dupnames Options: anchored
Duplicate name status changes Duplicate name status changes
No first char No first char
No need char No need char
...@@ -7870,7 +7870,7 @@ No match ...@@ -7870,7 +7870,7 @@ No match
Failed: malformed number or name after (?( at offset 6 Failed: malformed number or name after (?( at offset 6
/(?(''))/ /(?(''))/
Failed: assertion expected after (?( at offset 4 Failed: assertion expected after (?( or (?(?C) at offset 4
/(?('R')stuff)/ /(?('R')stuff)/
Failed: reference to non-existent subpattern at offset 7 Failed: reference to non-existent subpattern at offset 7
...@@ -14346,7 +14346,7 @@ No match ...@@ -14346,7 +14346,7 @@ No match
"((?2)+)((?1))" "((?2)+)((?1))"
"(?(?<E>.*!.*)?)" "(?(?<E>.*!.*)?)"
Failed: assertion expected after (?( at offset 3 Failed: assertion expected after (?( or (?(?C) at offset 3
"X((?2)()*+){2}+"BZ "X((?2)()*+){2}+"BZ
------------------------------------------------------------------ ------------------------------------------------------------------
...@@ -14574,4 +14574,100 @@ No match ...@@ -14574,4 +14574,100 @@ No match
End End
------------------------------------------------------------------ ------------------------------------------------------------------
/L(?#(|++<!(2)?/BZ
------------------------------------------------------------------
Bra
L?+
Ket
End
------------------------------------------------------------------
/L(?#(|++<!(2)?/BOZ
------------------------------------------------------------------
Bra
L?
Ket
End
------------------------------------------------------------------
/L(?#(|++<!(2)?/BCZ
------------------------------------------------------------------
Bra
Callout 255 0 14
L?+
Callout 255 14 0
Ket
End
------------------------------------------------------------------
/L(?#(|++<!(2)?/BCOZ
------------------------------------------------------------------
Bra
Callout 255 0 14
L?
Callout 255 14 0
Ket
End
------------------------------------------------------------------
/(A*)\E+/CBZ
------------------------------------------------------------------
Bra
Callout 255 0 7
SCBra 1
Callout 255 1 2
A*
Callout 255 3 0
KetRmax
Callout 255 7 0
Ket
End
------------------------------------------------------------------
/()\Q\E*]/BCZ
------------------------------------------------------------------
Bra
Callout 255 0 7
Brazero
SCBra 1
Callout 255 1 0
KetRmax
Callout 255 7 1
]
Callout 255 8 0
Ket
End
------------------------------------------------------------------
/(?<A>)(?J:(?<B>)(?<B>))(?<C>)/
\O\CC
Matched, but too many substrings
copy substring C failed -7
/(?=a\K)/
ring bpattingbobnd $ 1,oern cou \rb\L
Start of matched string is beyond its end - displaying from end to start.
0: a
0L
/(?<=((?C)0))/
9010
--->9010
0 ^ 0
0 ^ 0
0:
1: 0
abcd
--->abcd
0 ^ 0
0 ^ 0
0 ^ 0
0 ^ 0
No match
/((?J)(?'R'(?'R'(?'R'(?'R'(?'R'(?|(\k'R'))))))))/
/\N(?(?C)0?!.)*/
Failed: assertion expected after (?( or (?(?C) at offset 4
/-- End of testinput2 --/ /-- End of testinput2 --/
...@@ -2557,4 +2557,20 @@ No match ...@@ -2557,4 +2557,20 @@ No match
\x{37e} \x{37e}
0: \x{37e} 0: \x{37e}
/[^[:^ascii:]\d]/8W
a
0: a
~
0: ~
0
No match
\a
0: \x{07}
\x{7f}
0: \x{7f}
\x{389}
No match
\x{20ac}
No match
/-- End of testinput6 --/ /-- End of testinput6 --/
...@@ -2348,4 +2348,24 @@ No match ...@@ -2348,4 +2348,24 @@ No match
End End
------------------------------------------------------------------ ------------------------------------------------------------------
/L(?#(|++<!(2)?/B8COZ
------------------------------------------------------------------
Bra
Callout 255 0 14
L?
Callout 255 14 0
Ket
End
------------------------------------------------------------------
/L(?#(|++<!(2)?/B8WCZ
------------------------------------------------------------------
Bra
Callout 255 0 14
L?+
Callout 255 14 0
Ket
End
------------------------------------------------------------------
/-- End of testinput7 --/ /-- End of testinput7 --/