Merge branch 'merge-pcre' into 10.0

0b4f5060 · Sergei Golubchik · 6c5ee862 · c4cc91cd · 0b4f5060 · 0b4f5060
Commit 0b4f5060 authored May 04, 2015 by Sergei Golubchik
41 changed files
--- a/pcre/AUTHORS
+++ b/pcre/AUTHORS
@@ -8,7 +8,7 @@ Email domain:     cam.ac.uk
 University of Cambridge Computing Service,
 Cambridge, England.

-Copyright (c) 1997-2014 University of Cambridge
+Copyright (c) 1997-2015 University of Cambridge
 All rights reserved


@@ -19,7 +19,7 @@ Written by:       Zoltan Herczeg
 Email local part: hzmester
 Emain domain:     freemail.hu

-Copyright(c) 2010-2014 Zoltan Herczeg
+Copyright(c) 2010-2015 Zoltan Herczeg
 All rights reserved.


@@ -30,7 +30,7 @@ Written by:       Zoltan Herczeg
 Email local part: hzmester
 Emain domain:     freemail.hu

-Copyright(c) 2009-2014 Zoltan Herczeg
+Copyright(c) 2009-2015 Zoltan Herczeg
 All rights reserved.



--- a/pcre/ChangeLog
+++ b/pcre/ChangeLog
 ChangeLog for PCRE
 ------------------

+Version 8.37 28-April-2015
+--------------------------
+
+1.  When an (*ACCEPT) is triggered inside capturing parentheses, it arranges
+    for those parentheses to be closed with whatever has been captured so far.
+    However, it was failing to mark any other groups between the highest
+    capture so far and the current group as "unset". Thus, the ovector for
+    those groups contained whatever was previously there. An example is the
+    pattern /(x)|((*ACCEPT))/ when matched against "abcd".
+
+2.  If an assertion condition was quantified with a minimum of zero (an odd
+    thing to do, but it happened), SIGSEGV or other misbehaviour could occur.
+
+3.  If a pattern in pcretest input had the P (POSIX) modifier followed by an
+    unrecognized modifier, a crash could occur.
+
+4.  An attempt to do global matching in pcretest with a zero-length ovector
+    caused a crash.
+
+5.  Fixed a memory leak during matching that could occur for a subpattern
+    subroutine call (recursive or otherwise) if the number of captured groups
+    that had to be saved was greater than ten.
+
+6.  Catch a bad opcode during auto-possessification after compiling a bad UTF
+    string with NO_UTF_CHECK. This is a tidyup, not a bug fix, as passing bad
+    UTF with NO_UTF_CHECK is documented as having an undefined outcome.
+
+7.  A UTF pattern containing a "not" match of a non-ASCII character and a
+    subroutine reference could loop at compile time. Example: /[^\xff]((?1))/.
+
+8. When a pattern is compiled, it remembers the highest back reference so that
+   when matching, if the ovector is too small, extra memory can be obtained to
+   use instead. A conditional subpattern whose condition is a check on a
+   capture having happened, such as, for example in the pattern
+   /^(?:(a)|b)(?(1)A|B)/, is another kind of back reference, but it was not
+   setting the highest backreference number. This mattered only if pcre_exec()
+   was called with an ovector that was too small to hold the capture, and there
+   was no other kind of back reference (a situation which is probably quite
+   rare). The effect of the bug was that the condition was always treated as
+   FALSE when the capture could not be consulted, leading to incorrect
+   behaviour by pcre_exec(). This bug has been fixed.
+
+9. A reference to a duplicated named group (either a back reference or a test
+   for being set in a conditional) that occurred in a part of the pattern where
+   PCRE_DUPNAMES was not set caused the amount of memory needed for the pattern
+   to be incorrectly calculated, leading to overwriting.
+
+10. A mutually recursive set of back references such as (\2)(\1) caused a
+    segfault at study time (while trying to find the minimum matching length).
+    The infinite loop is now broken (with the minimum length unset, that is,
+    zero).
+
+11. If an assertion that was used as a condition was quantified with a minimum
+    of zero, matching went wrong. In particular, if the whole group had
+    unlimited repetition and could match an empty string, a segfault was
+    likely. The pattern (?(?=0)?)+ is an example that caused this. Perl allows
+    assertions to be quantified, but not if they are being used as conditions,
+    so the above pattern is faulted by Perl. PCRE has now been changed so that
+    it also rejects such patterns.
+
+12. A possessive capturing group such as (a)*+ with a minimum repeat of zero
+    failed to allow the zero-repeat case if pcre2_exec() was called with an
+    ovector too small to capture the group.
+
+13. Fixed two bugs in pcretest that were discovered by fuzzing and reported by
+    Red Hat Product Security:
+
+    (a) A crash if /K and /F were both set with the option to save the compiled
+    pattern.
+
+    (b) Another crash if the option to print captured substrings in a callout
+    was combined with setting a null ovector, for example \O\C+ as a subject
+    string.
+
+14. A pattern such as "((?2){0,1999}())?", which has a group containing a
+    forward reference repeated a large (but limited) number of times within a
+    repeated outer group that has a zero minimum quantifier, caused incorrect
+    code to be compiled, leading to the error "internal error:
+    previously-checked referenced subpattern not found" when an incorrect
+    memory address was read. This bug was reported as "heap overflow",
+    discovered by Kai Lu of Fortinet's FortiGuard Labs and given the CVE number
+    CVE-2015-2325.
+
+23. A pattern such as "((?+1)(\1))/" containing a forward reference subroutine
+    call within a group that also contained a recursive back reference caused
+    incorrect code to be compiled. This bug was reported as "heap overflow",
+    discovered by Kai Lu of Fortinet's FortiGuard Labs, and given the CVE
+    number CVE-2015-2326.
+
+24. Computing the size of the JIT read-only data in advance has been a source
+    of various issues, and new ones still appear, unfortunately. To fix
+    existing and future issues, size computation is eliminated from the code,
+    and replaced by on-demand memory allocation.
+
+25. A pattern such as /(?i)[A-`]/, where characters in the other case are
+    adjacent to the end of the range, and the range contained characters with
+    more than one other case, caused incorrect behaviour when compiled in UTF
+    mode. In that example, the range a-j was left out of the class.
+
+26. Fix JIT compilation of conditional blocks whose assertion
+    is converted to (*FAIL). E.g: /(?(?!))/.
+
+27. The pattern /(?(?!)^)/ caused references to random memory. This bug was
+    discovered by the LLVM fuzzer.
+
+28. The assertion (?!) is optimized to (*FAIL). This was not handled correctly
+    when this assertion was used as a condition, for example (?(?!)a|b). In
+    pcre2_match() it worked by luck; in pcre2_dfa_match() it gave an incorrect
+    error about an unsupported item.
+
+29. For some types of pattern, for example /Z*(|d*){216}/, the auto-
+    possessification code could take exponential time to complete. A recursion
+    depth limit of 1000 has been imposed to limit the resources used by this
+    optimization.
+
+30. In a pattern such as /(*UTF)[\S\V\H]/, which contains a negated special class
+    such as \S in non-UCP mode, explicit wide characters (> 255) can be ignored
+    because \S ensures they are all in the class. The code for doing this was
+    interacting badly with the code for computing the amount of space needed to
+    compile the pattern, leading to a buffer overflow. This bug was discovered
+    by the LLVM fuzzer.
+
+31. A pattern such as /((?2)+)((?1))/ which has mutual recursion nested inside
+    other kinds of group caused stack overflow at compile time. This bug was
+    discovered by the LLVM fuzzer.
+
+32. A pattern such as /(?1)(?#?'){8}(a)/ which had a parenthesized comment
+    between a subroutine call and its quantifier was incorrectly compiled,
+    leading to buffer overflow or other errors. This bug was discovered by the
+    LLVM fuzzer.
+
+33. The illegal pattern /(?(?<E>.*!.*)?)/ was not being diagnosed as missing an
+    assertion after (?(. The code was failing to check the character after
+    (?(?< for the ! or = that would indicate a lookbehind assertion. This bug
+    was discovered by the LLVM fuzzer.
+
+34. A pattern such as /X((?2)()*+){2}+/ which has a possessive quantifier with
+    a fixed maximum following a group that contains a subroutine reference was
+    incorrectly compiled and could trigger buffer overflow. This bug was
+    discovered by the LLVM fuzzer.
+
+35. A mutual recursion within a lookbehind assertion such as (?<=((?2))((?1)))
+    caused a stack overflow instead of the diagnosis of a non-fixed length
+    lookbehind assertion. This bug was discovered by the LLVM fuzzer.
+
+36. The use of \K in a positive lookbehind assertion in a non-anchored pattern
+    (e.g. /(?<=\Ka)/) could make pcregrep loop.
+
+37. There was a similar problem to 36 in pcretest for global matches.
+
+38. If a greedy quantified \X was preceded by \C in UTF mode (e.g. \C\X*),
+    and a subsequent item in the pattern caused a non-match, backtracking over
+    the repeated \X did not stop, but carried on past the start of the subject,
+    causing reference to random memory and/or a segfault. There were also some
+    other cases where backtracking after \C could crash. This set of bugs was
+    discovered by the LLVM fuzzer.
+
+39. The function for finding the minimum length of a matching string could take
+    a very long time if mutual recursion was present many times in a pattern,
+    for example, /((?2){73}(?2))((?1))/. A better mutual recursion detection
+    method has been implemented. This infelicity was discovered by the LLVM
+    fuzzer.
+
+40. Static linking against the PCRE library using the pkg-config module was
+    failing on missing pthread symbols.
+
+
 Version 8.36 26-September-2014
 ------------------------------
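
Item 1 of the 8.37 list describes how closing a group at (*ACCEPT) must also mark the skipped groups as unset. The following is a minimal sketch of that idea, not the PCRE source: `close_group` and its parameters are hypothetical names, and the ovector is modelled as a plain `int` array of (start, end) pairs with -1 meaning "unset".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a PCRE function): record the span of capture
   group `number`, and if that group lies at or above the current high-water
   mark, mark every slot between the old mark and this group as unset (-1).
   `offset_top` is the index just past the highest pair written so far.
   Returns the updated high-water mark. */
static int close_group(int *ovector, int offset_top, int number,
                       int start, int end)
{
    int offset = number * 2;       /* each group owns one (start, end) pair */

    ovector[offset] = start;
    ovector[offset + 1] = end;

    if (offset >= offset_top) {
        for (int i = offset_top; i < offset; i++)
            ovector[i] = -1;       /* groups in between never matched */
        offset_top = offset + 2;
    }
    return offset_top;
}
```

For /(x)|((*ACCEPT))/ against "abcd", group 2 closes empty at offset 0 while group 1 never matched. Calling `close_group(ov, 2, 2, 0, 0)` on a vector holding stale values in slots 2 and 3 overwrites them with -1, which is the behaviour the ChangeLog entry says was previously missing.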


--- a/pcre/LICENCE
+++ b/pcre/LICENCE
@@ -6,7 +6,8 @@ and semantics are as close as possible to those of the Perl 5 language.

 Release 8 of PCRE is distributed under the terms of the "BSD" licence, as
 specified below. The documentation for PCRE, supplied in the "doc"
-directory, is distributed under the same terms as the software itself.
+directory, is distributed under the same terms as the software itself. The data
+in the testdata directory is not copyrighted and is in the public domain.

 The basic library functions are written in C and are freestanding. Also
 included in the distribution is a set of C++ wrapper functions, and a
@@ -24,7 +25,7 @@ Email domain:     cam.ac.uk
 University of Cambridge Computing Service,
 Cambridge, England.

-Copyright (c) 1997-2014 University of Cambridge
+Copyright (c) 1997-2015 University of Cambridge
 All rights reserved.


@@ -35,7 +36,7 @@ Written by:       Zoltan Herczeg
 Email local part: hzmester
 Emain domain:     freemail.hu

-Copyright(c) 2010-2014 Zoltan Herczeg
+Copyright(c) 2010-2015 Zoltan Herczeg
 All rights reserved.


@@ -46,7 +47,7 @@ Written by:       Zoltan Herczeg
 Email local part: hzmester
 Emain domain:     freemail.hu

-Copyright(c) 2009-2014 Zoltan Herczeg
+Copyright(c) 2009-2015 Zoltan Herczeg
 All rights reserved.



--- a/pcre/NEWS
+++ b/pcre/NEWS
 News about PCRE releases
 ------------------------

+Release 8.37 28-April-2015
+--------------------------
+
+This is a bug-fix release. Note that this library (now called PCRE1) is now being
+maintained for bug fixes only. New projects are advised to use the new PCRE2
+libraries.
+
+
 Release 8.36 26-September-2014
 ------------------------------


--- a/pcre/NON-AUTOTOOLS-BUILD
+++ b/pcre/NON-AUTOTOOLS-BUILD
 Building PCRE without using autotools
 -------------------------------------

+NOTE: This document relates to PCRE releases that use the original API, with
+library names libpcre, libpcre16, and libpcre32. January 2015 saw the first
+release of a new API, known as PCRE2, with release numbers starting at 10.00
+and library names libpcre2-8, libpcre2-16, and libpcre2-32. The old libraries
+(now called PCRE1) are still being maintained for bug fixes, but there will be
+no new development. New projects are advised to use the new PCRE2 libraries.
+
+
 This document contains the following sections:

  General
@@ -761,4 +769,4 @@ There is also a mirror here:
  http://www.vsoft-software.com/downloads.html

 ==========================
-Last Updated: 14 May 2013
+Last Updated: 10 February 2015
--- a/pcre/README
+++ b/pcre/README
 README file for PCRE (Perl-compatible regular expression library)
 -----------------------------------------------------------------

-The latest release of PCRE is always available in three alternative formats
+NOTE: This set of files relates to PCRE releases that use the original API,
+with library names libpcre, libpcre16, and libpcre32. January 2015 saw the
+first release of a new API, known as PCRE2, with release numbers starting at
+10.00 and library names libpcre2-8, libpcre2-16, and libpcre2-32. The old
+libraries (now called PCRE1) are still being maintained for bug fixes, but
+there will be no new development. New projects are advised to use the new PCRE2
+libraries.
+
+
+The latest release of PCRE1 is always available in three alternative formats
 from:

  ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre-xxx.tar.gz
@@ -990,4 +999,4 @@ pcre_xxx, one with the name pcre16_xx, and a third with the name pcre32_xxx.
 Philip Hazel
 Email local part: ph10
 Email domain: cam.ac.uk
-Last updated: 24 October 2014
+Last updated: 10 February 2015
--- a/pcre/RunGrepTest
+++ b/pcre/RunGrepTest
@@ -506,6 +506,11 @@ echo "---------------------------- Test 106 -----------------------------" >>tes
 (cd $srcdir; echo "a" | $valgrind $pcregrep -M "|a" ) >>testtrygrep 2>&1
 echo "RC=$?" >>testtrygrep

+echo "---------------------------- Test 107 -----------------------------" >>testtrygrep
+echo "a" >testtemp1grep
+echo "aaaaa" >>testtemp1grep
+(cd $srcdir; $valgrind $pcregrep  --line-offsets '(?<=\Ka)' $builddir/testtemp1grep) >>testtrygrep 2>&1
+echo "RC=$?" >>testtrygrep

 # Now compare the results.


--- a/pcre/configure.ac
+++ b/pcre/configure.ac
@@ -9,17 +9,17 @@ dnl The PCRE_PRERELEASE feature is for identifying release candidates. It might
 dnl be defined as -RC2, for example. For real releases, it should be empty.

 m4_define(pcre_major, [8])
-m4_define(pcre_minor, [36])
+m4_define(pcre_minor, [37])
 m4_define(pcre_prerelease, [])
-m4_define(pcre_date, [2014-09-26])
+m4_define(pcre_date, [2015-04-28])

 # NOTE: The CMakeLists.txt file searches for the above variables in the first
 # 50 lines of this file. Please update that if the variables above are moved.

 # Libtool shared library interface versions (current:revision:age)
-m4_define(libpcre_version, [3:4:2])
-m4_define(libpcre16_version, [2:4:2])
-m4_define(libpcre32_version, [0:4:0])
+m4_define(libpcre_version, [3:5:2])
+m4_define(libpcre16_version, [2:5:2])
+m4_define(libpcre32_version, [0:5:0])
 m4_define(libpcreposix_version, [0:3:0])
 m4_define(libpcrecpp_version, [0:1:0])


--- a/pcre/doc/html/NON-AUTOTOOLS-BUILD.txt
+++ b/pcre/doc/html/NON-AUTOTOOLS-BUILD.txt
 Building PCRE without using autotools
 -------------------------------------

+NOTE: This document relates to PCRE releases that use the original API, with
+library names libpcre, libpcre16, and libpcre32. January 2015 saw the first
+release of a new API, known as PCRE2, with release numbers starting at 10.00
+and library names libpcre2-8, libpcre2-16, and libpcre2-32. The old libraries
+(now called PCRE1) are still being maintained for bug fixes, but there will be
+no new development. New projects are advised to use the new PCRE2 libraries.
+
+
 This document contains the following sections:

  General
@@ -761,4 +769,4 @@ There is also a mirror here:
  http://www.vsoft-software.com/downloads.html

 ==========================
-Last Updated: 14 May 2013
+Last Updated: 10 February 2015
--- a/pcre/doc/html/README.txt
+++ b/pcre/doc/html/README.txt
 README file for PCRE (Perl-compatible regular expression library)
 -----------------------------------------------------------------

-The latest release of PCRE is always available in three alternative formats
+NOTE: This set of files relates to PCRE releases that use the original API,
+with library names libpcre, libpcre16, and libpcre32. January 2015 saw the
+first release of a new API, known as PCRE2, with release numbers starting at
+10.00 and library names libpcre2-8, libpcre2-16, and libpcre2-32. The old
+libraries (now called PCRE1) are still being maintained for bug fixes, but
+there will be no new development. New projects are advised to use the new PCRE2
+libraries.
+
+
+The latest release of PCRE1 is always available in three alternative formats
 from:

  ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre-xxx.tar.gz
@@ -990,4 +999,4 @@ pcre_xxx, one with the name pcre16_xx, and a third with the name pcre32_xxx.
 Philip Hazel
 Email local part: ph10
 Email domain: cam.ac.uk
-Last updated: 24 October 2014
+Last updated: 10 February 2015
--- a/pcre/doc/html/pcre.html
+++ b/pcre/doc/html/pcre.html
@@ -13,13 +13,24 @@ from the original man page. If there is any nonsense in it, please consult the
 man page, in case the conversion went wrong.
 <br>
 <ul>
-<li><a name="TOC1" href="#SEC1">INTRODUCTION</a>
-<li><a name="TOC2" href="#SEC2">SECURITY CONSIDERATIONS</a>
-<li><a name="TOC3" href="#SEC3">USER DOCUMENTATION</a>
-<li><a name="TOC4" href="#SEC4">AUTHOR</a>
-<li><a name="TOC5" href="#SEC5">REVISION</a>
+<li><a name="TOC1" href="#SEC1">PLEASE TAKE NOTE</a>
+<li><a name="TOC2" href="#SEC2">INTRODUCTION</a>
+<li><a name="TOC3" href="#SEC3">SECURITY CONSIDERATIONS</a>
+<li><a name="TOC4" href="#SEC4">USER DOCUMENTATION</a>
+<li><a name="TOC5" href="#SEC5">AUTHOR</a>
+<li><a name="TOC6" href="#SEC6">REVISION</a>
 </ul>
-<br><a name="SEC1" href="#TOC1">INTRODUCTION</a><br>
+<br><a name="SEC1" href="#TOC1">PLEASE TAKE NOTE</a><br>
+<P>
+This document relates to PCRE releases that use the original API,
+with library names libpcre, libpcre16, and libpcre32. January 2015 saw the
+first release of a new API, known as PCRE2, with release numbers starting at
+10.00 and library names libpcre2-8, libpcre2-16, and libpcre2-32. The old
+libraries (now called PCRE1) are still being maintained for bug fixes, but
+there will be no new development. New projects are advised to use the new PCRE2
+libraries.
+</P>
+<br><a name="SEC2" href="#TOC1">INTRODUCTION</a><br>
 <P>
 The PCRE library is a set of functions that implement regular expression
 pattern matching using the same syntax and semantics as Perl, with just a few
@@ -115,7 +126,7 @@ clashes. In some environments, it is possible to control which external symbols
 are exported when a shared library is built, and in these cases the
 undocumented symbols are not exported.
 </P>
-<br><a name="SEC2" href="#TOC1">SECURITY CONSIDERATIONS</a><br>
+<br><a name="SEC3" href="#TOC1">SECURITY CONSIDERATIONS</a><br>
 <P>
 If you are using PCRE in a non-UTF application that permits users to supply
 arbitrary patterns for compilation, you should be aware of a feature that
@@ -149,7 +160,7 @@ against this: see the PCRE_EXTRA_MATCH_LIMIT feature in the
 <a href="pcreapi.html"><b>pcreapi</b></a>
 page.
 </P>
-<br><a name="SEC3" href="#TOC1">USER DOCUMENTATION</a><br>
+<br><a name="SEC4" href="#TOC1">USER DOCUMENTATION</a><br>
 <P>
 The user documentation for PCRE comprises a number of different sections. In
 the "man" format, each of these is a separate "man page". In the HTML format,
@@ -188,7 +199,7 @@ follows:
 In the "man" and HTML formats, there is also a short page for each C library
 function, listing its arguments and results.
 </P>
-<br><a name="SEC4" href="#TOC1">AUTHOR</a><br>
+<br><a name="SEC5" href="#TOC1">AUTHOR</a><br>
 <P>
 Philip Hazel
 <br>
@@ -202,11 +213,11 @@ Putting an actual email address here seems to have been a spam magnet, so I've
 taken it away. If you want to email me, use my two initials, followed by the
 two digits 10, at the domain cam.ac.uk.
 </P>
-<br><a name="SEC5" href="#TOC1">REVISION</a><br>
+<br><a name="SEC6" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 08 January 2014
+Last updated: 10 February 2015
 <br>
-Copyright &copy; 1997-2014 University of Cambridge.
+Copyright &copy; 1997-2015 University of Cambridge.
 <br>
 <p>
 Return to the <a href="index.html">PCRE index page</a>.

--- a/pcre/doc/pcre.3
+++ b/pcre/doc/pcre.3
-.TH PCRE 3 "08 January 2014" "PCRE 8.35"
+.TH PCRE 3 "10 February 2015" "PCRE 8.37"
 .SH NAME
-PCRE - Perl-compatible regular expressions
+PCRE - Perl-compatible regular expressions (original API)
+.SH "PLEASE TAKE NOTE"
+.rs
+.sp
+This document relates to PCRE releases that use the original API,
+with library names libpcre, libpcre16, and libpcre32. January 2015 saw the
+first release of a new API, known as PCRE2, with release numbers starting at
+10.00 and library names libpcre2-8, libpcre2-16, and libpcre2-32. The old
+libraries (now called PCRE1) are still being maintained for bug fixes, but
+there will be no new development. New projects are advised to use the new PCRE2
+libraries.
+.
+.
 .SH INTRODUCTION
 .rs
 .sp
@@ -213,6 +225,6 @@ two digits 10, at the domain cam.ac.uk.
 .rs
 .sp
 .nf
-Last updated: 08 January 2014
-Copyright (c) 1997-2014 University of Cambridge.
+Last updated: 10 February 2015
+Copyright (c) 1997-2015 University of Cambridge.
 .fi
--- a/pcre/doc/pcre.txt
+++ b/pcre/doc/pcre.txt
@@ -13,7 +13,18 @@ PCRE(3)                    Library Functions Manual                    PCRE(3)


 NAME
-       PCRE - Perl-compatible regular expressions
+       PCRE - Perl-compatible regular expressions (original API)
+
+PLEASE TAKE NOTE
+
+       This  document relates to PCRE releases that use the original API, with
+       library names libpcre, libpcre16, and libpcre32. January 2015  saw  the
+       first release of a new API, known as PCRE2, with release numbers start-
+       ing  at  10.00  and  library   names   libpcre2-8,   libpcre2-16,   and
+       libpcre2-32. The old libraries (now called PCRE1) are still being main-
+       tained for bug fixes,  but  there  will  be  no  new  development.  New
+       projects are advised to use the new PCRE2 libraries.
+

 INTRODUCTION

@@ -179,8 +190,8 @@ AUTHOR

 REVISION

-       Last updated: 08 January 2014
-       Copyright (c) 1997-2014 University of Cambridge.
+       Last updated: 10 February 2015
+       Copyright (c) 1997-2015 University of Cambridge.
 ------------------------------------------------------------------------------



--- a/pcre/pcre_compile.c
+++ b/pcre/pcre_compile.c
--- a/pcre/pcre_dfa_exec.c
+++ b/pcre/pcre_dfa_exec.c
@@ -2736,9 +2736,10 @@ for (;;)
            condcode == OP_DNRREF)
          return PCRE_ERROR_DFA_UCOND;

-        /* The DEFINE condition is always false */
+        /* The DEFINE condition is always false, and the assertion (?!) is
+        converted to OP_FAIL. */

-        if (condcode == OP_DEF)
+        if (condcode == OP_DEF || condcode == OP_FAIL)
          { ADD_ACTIVE(state_offset + codelink + LINK_SIZE + 1, 0); }

        /* The only supported version of OP_RREF is for the value RREF_ANY,

--- a/pcre/pcre_exec.c
+++ b/pcre/pcre_exec.c
@@ -1136,8 +1136,8 @@ for (;;)
    printf("\n");
 #endif

-    if (offset < md->offset_max)
-      {
+    if (offset >= md->offset_max) goto POSSESSIVE_NON_CAPTURE;
+
    matched_once = FALSE;
    code_offset = (int)(ecode - md->start_code);

@@ -1211,18 +1211,6 @@ for (;;)
      }

    RRETURN(MATCH_NOMATCH);
-      }
-
-    /* FALL THROUGH ... Insufficient room for saving captured contents. Treat
-    as a non-capturing bracket. */
-
-    /* VVVVVVVVVVVVVVVVVVVVVVVVV */
-    /* VVVVVVVVVVVVVVVVVVVVVVVVV */
-
-    DPRINTF(("insufficient capture room: treat as non-capturing\n"));
-
-    /* VVVVVVVVVVVVVVVVVVVVVVVVV */
-    /* VVVVVVVVVVVVVVVVVVVVVVVVV */

    /* Non-capturing possessive bracket with unlimited repeat. We come here
    from BRAZERO with allow_zero = TRUE. The code is similar to the above,
@@ -1388,6 +1376,7 @@ for (;;)
      break;

      case OP_DEF:     /* DEFINE - always false */
+      case OP_FAIL:    /* From optimized (?!) condition */
      break;

      /* The condition is an assertion. Call match() to evaluate it - setting
@@ -1404,8 +1393,11 @@ for (;;)
        condition = TRUE;

        /* Advance ecode past the assertion to the start of the first branch,
-        but adjust it so that the general choosing code below works. */
+        but adjust it so that the general choosing code below works. If the
+        assertion has a quantifier that allows zero repeats we must skip over
+        the BRAZERO. This is a lunatic thing to do, but somebody did! */

+        if (*ecode == OP_BRAZERO) ecode++;
        ecode += GET(ecode, 1);
        while (*ecode == OP_ALT) ecode += GET(ecode, 1);
        ecode += 1 + LINK_SIZE - PRIV(OP_lengths)[condcode];
@@ -1474,7 +1466,18 @@ for (;;)
      md->offset_vector[offset] =
        md->offset_vector[md->offset_end - number];
      md->offset_vector[offset+1] = (int)(eptr - md->start_subject);
-      if (offset_top <= offset) offset_top = offset + 2;
+
+      /* If this group is at or above the current highwater mark, ensure that
+      any groups between the current high water mark and this group are marked
+      unset and then update the high water mark. */
+
+      if (offset >= offset_top)
+        {
+        register int *iptr = md->offset_vector + offset_top;
+        register int *iend = md->offset_vector + offset;
+        while (iptr < iend) *iptr++ = -1;
+        offset_top = offset + 2;
+        }
      }
    ecode += 1 + IMM2_SIZE;
    break;
@@ -1826,7 +1829,11 @@ for (;;)
        are defined in a range that can be tested for. */

        if (rrc >= MATCH_BACKTRACK_MIN && rrc <= MATCH_BACKTRACK_MAX)
+          {
+          if (new_recursive.offset_save != stacksave)
+            (PUBL(free))(new_recursive.offset_save);
          RRETURN(MATCH_NOMATCH);
+          }

        /* Any return code other than NOMATCH is an error. */

@@ -3476,7 +3483,7 @@ for (;;)
          if (possessive) continue;    /* No backtracking */
          for(;;)
            {
-            if (eptr == pp) goto TAIL_RECURSE;
+            if (eptr <= pp) goto TAIL_RECURSE;
            RMATCH(eptr, ecode, offset_top, md, eptrb, RM23);
            if (rrc != MATCH_NOMATCH) RRETURN(rrc);
 #ifdef SUPPORT_UCP
@@ -3897,7 +3904,7 @@ for (;;)
          if (possessive) continue;    /* No backtracking */
          for(;;)
            {
-            if (eptr == pp) goto TAIL_RECURSE;
+            if (eptr <= pp) goto TAIL_RECURSE;
            RMATCH(eptr, ecode, offset_top, md, eptrb, RM30);
            if (rrc != MATCH_NOMATCH) RRETURN(rrc);
            eptr--;
@@ -4032,7 +4039,7 @@ for (;;)
          if (possessive) continue;    /* No backtracking */
          for(;;)
            {
-            if (eptr == pp) goto TAIL_RECURSE;
+            if (eptr <= pp) goto TAIL_RECURSE;
            RMATCH(eptr, ecode, offset_top, md, eptrb, RM34);
            if (rrc != MATCH_NOMATCH) RRETURN(rrc);
            eptr--;
@@ -5603,7 +5610,7 @@ for (;;)
        if (possessive) continue;    /* No backtracking */
        for(;;)
          {
-          if (eptr == pp) goto TAIL_RECURSE;
+          if (eptr <= pp) goto TAIL_RECURSE;
          RMATCH(eptr, ecode, offset_top, md, eptrb, RM44);
          if (rrc != MATCH_NOMATCH) RRETURN(rrc);
          eptr--;
@@ -5645,12 +5652,17 @@ for (;;)

        if (possessive) continue;    /* No backtracking */

+        /* We use <= pp rather than == pp to detect the start of the run while
+        backtracking because the use of \C in UTF mode can cause BACKCHAR to
+        move back past pp. This is just palliative; the use of \C in UTF mode
+        is fraught with danger. */
+
        for(;;)
          {
          int lgb, rgb;
          PCRE_PUCHAR fptr;

-          if (eptr == pp) goto TAIL_RECURSE;   /* At start of char run */
+          if (eptr <= pp) goto TAIL_RECURSE;   /* At start of char run */
          RMATCH(eptr, ecode, offset_top, md, eptrb, RM45);
          if (rrc != MATCH_NOMATCH) RRETURN(rrc);

@@ -5668,7 +5680,7 @@ for (;;)

          for (;;)
            {
-            if (eptr == pp) goto TAIL_RECURSE;   /* At start of char run */
+            if (eptr <= pp) goto TAIL_RECURSE;   /* At start of char run */
            fptr = eptr - 1;
            if (!utf) c = *fptr; else
              {
@@ -5918,7 +5930,7 @@ for (;;)
        if (possessive) continue;    /* No backtracking */
        for(;;)
          {
-          if (eptr == pp) goto TAIL_RECURSE;
+          if (eptr <= pp) goto TAIL_RECURSE;
          RMATCH(eptr, ecode, offset_top, md, eptrb, RM46);
          if (rrc != MATCH_NOMATCH) RRETURN(rrc);
          eptr--;
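
The repeated change from `eptr == pp` to `eptr <= pp` in the hunks above guards against a backward step that jumps over the start of the run. The sketch below illustrates why an equality test can be missed. It is a simplified model under stated assumptions, not PCRE code: `backtrack_positions` is a hypothetical function, and the inner `while` loop is only a stand-in for what BACKCHAR does with UTF-8 continuation bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a backtracking loop over a greedy run.
   subject: start of the buffer; pp: start of the run; eptr: end of the run.
   use_le selects the guard: nonzero for `eptr <= pp`, zero for `eptr == pp`.
   Returns the number of positions tried, or -1 if the loop would have
   walked off the front of the subject. */
static int backtrack_positions(const unsigned char *subject,
                               const unsigned char *pp,
                               const unsigned char *eptr, int use_le)
{
    int tried = 0;

    for (;;) {
        if (use_le ? (eptr <= pp) : (eptr == pp))
            return tried;              /* reached the start of the run */
        if (eptr == subject)
            return -1;                 /* guard missed: would underrun */
        tried++;                       /* stands in for the RMATCH attempt */
        eptr--;
        /* Stand-in for BACKCHAR: skip back over continuation bytes. */
        while (eptr > subject && (*eptr & 0xC0) == 0x80)
            eptr--;
    }
}
```

With the bytes `C3 A9 'a' 'b'` and `pp` pointing at the second byte (as happens when \C has consumed only the lead byte), stepping back from the end lands on the first byte and skips `pp` entirely. The `<=` guard stops after three attempts, whereas the `==` guard never fires and the model reports an underrun.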

--- a/pcre/pcre_internal.h
+++ b/pcre/pcre_internal.h
@@ -2446,6 +2446,7 @@ typedef struct compile_data {
  BOOL had_pruneorskip;             /* (*PRUNE) or (*SKIP) encountered */
  BOOL check_lookbehind;            /* Lookbehinds need later checking */
  BOOL dupnames;                    /* Duplicate names exist */
+  BOOL iscondassert;                /* Next assert is a condition */
  int  nltype;                      /* Newline type */
  int  nllen;                       /* Newline string length */
  pcre_uchar nl[4];                 /* Newline string when fixed length */
@@ -2459,6 +2460,13 @@ typedef struct branch_chain {
  pcre_uchar *current_branch;
 } branch_chain;

+/* Structure for mutual recursion detection. */
+
+typedef struct recurse_check {
+  struct recurse_check *prev;
+  const pcre_uchar *group;
+} recurse_check;
+
 /* Structure for items in a linked list that represents an explicit recursive
 call within the pattern; used by pcre_exec(). */
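
The `recurse_check` structure added above forms a chain of the groups currently being examined, so that a group reached a second time can be recognised as mutual recursion. The following is a minimal sketch of that technique on a toy model. The `group` type, its fields, and `min_length` are hypothetical and only illustrate the chain walk; treating a detected cycle as length zero follows ChangeLog item 10, but the real `find_minlength()` does considerably more.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a pattern group (hypothetical, for illustration only). */
typedef struct group {
    int literal_len;               /* characters the group matches itself */
    const struct group *calls;     /* group it calls as a subroutine, or NULL */
} group;

/* Same shape as the structure in the diff, but pointing at the toy type. */
typedef struct recurse_check {
    struct recurse_check *prev;
    const group *grp;
} recurse_check;

/* Minimum length of `g`, treating a group that is already on the chain as
   contributing zero so that mutual recursion terminates. */
static int min_length(const group *g, recurse_check *recurses)
{
    recurse_check this_recurse;

    for (recurse_check *r = recurses; r != NULL; r = r->prev)
        if (r->grp == g)
            return 0;              /* already being measured: break the cycle */

    this_recurse.prev = recurses;  /* push this group; lives on the C stack */
    this_recurse.grp = g;

    int len = g->literal_len;
    if (g->calls != NULL)
        len += min_length(g->calls, &this_recurse);
    return len;
}
```

Because each chain node is a local variable of the recursive call, no allocation or explicit depth counter is needed, and the chain unwinds automatically on return. Two groups that call each other, with literal lengths 2 and 3, yield 5 from `min_length(&a, NULL)` rather than recursing without end.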


--- a/pcre/pcre_jit_compile.c
+++ b/pcre/pcre_jit_compile.c
--- a/pcre/pcre_jit_test.c
+++ b/pcre/pcre_jit_test.c
@@ -51,8 +51,6 @@ POSSIBILITY OF SUCH DAMAGE.

 #include "pcre_internal.h"

-#define PCRE_BUG 0x80000000
-
 /*
 Letter characters:
   \xe6\x92\xad = 0x64ad = 25773 (kanji)
@@ -69,6 +67,9 @@ POSSIBILITY OF SUCH DAMAGE.
      \xc3\x89 = 0xc9 = 201 (E')
   \xc3\xa1 = 0xe1 = 225 (a')
      \xc3\x81 = 0xc1 = 193 (A')
+   \x53 = 0x53 = S
+     \x73 = 0x73 = s
+     \xc5\xbf = 0x17f = 383 (long S)
   \xc8\xba = 0x23a = 570
      \xe2\xb1\xa5 = 0x2c65 = 11365
   \xe1\xbd\xb8 = 0x1f78 = 8056
@@ -78,6 +79,10 @@ POSSIBILITY OF SUCH DAMAGE.
   \xc7\x84 = 0x1c4 = 452
     \xc7\x85 = 0x1c5 = 453
     \xc7\x86 = 0x1c6 = 454
+ Caseless sets:
+   ucp_Armenian - \x{531}-\x{556} -> \x{561}-\x{586}
+   ucp_Coptic - \x{2c80}-\x{2ce3} -> caseless: XOR 0x1
+   ucp_Latin - \x{ff21}-\x{ff3a} -> \x{ff41}-\x{ff5a}

 Mark property:
   \xcc\x8d = 0x30d = 781
@@ -626,6 +631,9 @@ static struct regression_test_case regression_test_cases[] = {
 	{ MUA, 0, "(?P<Name>a)?(?P<Name2>b)?(?(Name)c|d)+?dd", "bcabcacdb bdddd" },
 	{ MUA, 0, "(?P<Name>a)?(?P<Name2>b)?(?(Name)c|d)+l", "ababccddabdbccd abcccl" },
 	{ MUA, 0, "((?:a|aa)(?(1)aaa))x", "aax" },
+	{ MUA, 0, "(?(?!)a|b)", "ab" },
+	{ MUA, 0, "(?(?!)a)", "ab" },
+	{ MUA, 0 | F_NOMATCH, "(?(?!)a|b)", "ac" },

 	/* Set start of match. */
 	{ MUA, 0, "(?:\\Ka)*aaaab", "aaaaaaaa aaaaaaabb" },
@@ -944,7 +952,7 @@ static void setstack16(pcre16_extra *extra)

 	pcre16_assign_jit_stack(extra, callback16, getstack16());
 }
-#endif /* SUPPORT_PCRE8 */
+#endif /* SUPPORT_PCRE16 */

 #ifdef SUPPORT_PCRE32
 static pcre32_jit_stack *stack32;
@@ -967,7 +975,7 @@ static void setstack32(pcre32_extra *extra)

 	pcre32_assign_jit_stack(extra, callback32, getstack32());
 }
-#endif /* SUPPORT_PCRE8 */
+#endif /* SUPPORT_PCRE32 */

 #ifdef SUPPORT_PCRE16

@@ -1177,7 +1185,7 @@ static int regression_tests(void)
 #elif defined SUPPORT_PCRE16
 	pcre16_config(PCRE_CONFIG_UTF16, &utf);
 	pcre16_config(PCRE_CONFIG_UNICODE_PROPERTIES, &ucp);
-#elif defined SUPPORT_PCRE16
+#elif defined SUPPORT_PCRE32
 	pcre32_config(PCRE_CONFIG_UTF32, &utf);
 	pcre32_config(PCRE_CONFIG_UNICODE_PROPERTIES, &ucp);
 #endif

--- a/pcre/pcre_study.c
+++ b/pcre/pcre_study.c
@@ -70,7 +70,7 @@ rather than bytes.
  code            pointer to start of group (the bracket)
  startcode       pointer to start of the whole pattern's code
  options         the compiling options
-  int             RECURSE depth
+  recurses        chain of recurse_check to catch mutual recursion

 Returns:   the minimum length
           -1 if \C in UTF-8 mode or (*ACCEPT) was encountered
@@ -80,12 +80,13 @@ Returns:   the minimum length

 static int
 find_minlength(const REAL_PCRE *re, const pcre_uchar *code,
-  const pcre_uchar *startcode, int options, int recurse_depth)
+  const pcre_uchar *startcode, int options, recurse_check *recurses)
 {
 int length = -1;
 /* PCRE_UTF16 has the same value as PCRE_UTF8. */
 BOOL utf = (options & PCRE_UTF8) != 0;
 BOOL had_recurse = FALSE;
+recurse_check this_recurse;
 register int branchlength = 0;
 register pcre_uchar *cc = (pcre_uchar *)code + 1 + LINK_SIZE;

@@ -130,7 +131,7 @@ for (;;)
    case OP_SBRAPOS:
    case OP_ONCE:
    case OP_ONCE_NC:
-    d = find_minlength(re, cc, startcode, options, recurse_depth);
+    d = find_minlength(re, cc, startcode, options, recurses);
    if (d < 0) return d;
    branchlength += d;
    do cc += GET(cc, 1); while (*cc == OP_ALT);
@@ -393,7 +394,7 @@ for (;;)
        ce = cs = (pcre_uchar *)PRIV(find_bracket)(startcode, utf, GET2(slot, 0));
        if (cs == NULL) return -2;
        do ce += GET(ce, 1); while (*ce == OP_ALT);
-        if (cc > cs && cc < ce)
+        if (cc > cs && cc < ce)     /* Simple recursion */
          {
          d = 0;
          had_recurse = TRUE;
@@ -401,9 +402,23 @@ for (;;)
          }
        else
          {
-          int dd = find_minlength(re, cs, startcode, options, recurse_depth);
+          recurse_check *r = recurses;
+          for (r = recurses; r != NULL; r = r->prev) if (r->group == cs) break;
+          if (r != NULL)           /* Mutual recursion */
+            {
+            d = 0;
+            had_recurse = TRUE;
+            break;
+            }
+          else
+            {
+            int dd;
+            this_recurse.prev = recurses;
+            this_recurse.group = cs;
+            dd = find_minlength(re, cs, startcode, options, &this_recurse);
            if (dd < d) d = dd;
            }
+          }
        slot += re->name_entry_size;
        }
      }
@@ -418,14 +433,26 @@ for (;;)
      ce = cs = (pcre_uchar *)PRIV(find_bracket)(startcode, utf, GET2(cc, 1));
      if (cs == NULL) return -2;
      do ce += GET(ce, 1); while (*ce == OP_ALT);
-      if (cc > cs && cc < ce)
+      if (cc > cs && cc < ce)    /* Simple recursion */
        {
        d = 0;
        had_recurse = TRUE;
        }
      else
        {
-        d = find_minlength(re, cs, startcode, options, recurse_depth);
+        recurse_check *r = recurses;
+        for (r = recurses; r != NULL; r = r->prev) if (r->group == cs) break;
+        if (r != NULL)           /* Mutual recursion */
+          {
+          d = 0;
+          had_recurse = TRUE;
+          }
+        else
+          {
+          this_recurse.prev = recurses;
+          this_recurse.group = cs;
+          d = find_minlength(re, cs, startcode, options, &this_recurse);
+          }
        }
      }
    else d = 0;
@@ -474,12 +501,21 @@ for (;;)
    case OP_RECURSE:
    cs = ce = (pcre_uchar *)startcode + GET(cc, 1);
    do ce += GET(ce, 1); while (*ce == OP_ALT);
-    if ((cc > cs && cc < ce) || recurse_depth > 10)
+    if (cc > cs && cc < ce)    /* Simple recursion */
      had_recurse = TRUE;
    else
      {
+      recurse_check *r = recurses;
+      for (r = recurses; r != NULL; r = r->prev) if (r->group == cs) break;
+      if (r != NULL)           /* Mutual recursion */
+        had_recurse = TRUE;
+      else
+        {
+        this_recurse.prev = recurses;
+        this_recurse.group = cs;
        branchlength += find_minlength(re, cs, startcode, options,
-        recurse_depth + 1);
+          &this_recurse);
+        }
      }
    cc += 1 + LINK_SIZE;
    break;
@@ -1503,7 +1539,7 @@ if ((re->options & PCRE_ANCHORED) == 0 &&

 /* Find the minimum length of subject string. */

-switch(min = find_minlength(re, code, code, re->options, 0))
+switch(min = find_minlength(re, code, code, re->options, NULL))
  {
  case -2: *errorptr = "internal error: missing capturing bracket"; return NULL;
  case -3: *errorptr = "internal error: opcode not recognized"; return NULL;

--- a/pcre/pcregrep.c
+++ b/pcre/pcregrep.c
@@ -1582,12 +1582,15 @@ while (ptr < endptr)
  int endlinelength;
  int mrc = 0;
  int startoffset = 0;
+  int prevoffsets[2];
  unsigned int options = 0;
  BOOL match;
  char *matchptr = ptr;
  char *t = ptr;
  size_t length, linelength;

+  prevoffsets[0] = prevoffsets[1] = -1;
+
  /* At this point, ptr is at the start of a line. We need to find the length
  of the subject string to pass to pcre_exec(). In multiline mode, it is the
  length remainder of the data in the buffer. Otherwise, it is the length of
@@ -1729,6 +1732,20 @@ while (ptr < endptr)
      {
      if (!invert)
        {
+        int oldstartoffset = startoffset;
+
+        /* It is possible, when a lookbehind assertion contains \K, for the
+        same string to be found again. The code below advances startoffset, but
+        until it is past the "bumpalong" offset that gave the match, the same
+        substring will be returned. The PCRE1 library does not return the
+        bumpalong offset, so all we can do is ignore repeated strings. (PCRE2
+        does this better.) */
+
+        if (prevoffsets[0] != offsets[0] || prevoffsets[1] != offsets[1])
+          {
+          prevoffsets[0] = offsets[0];
+          prevoffsets[1] = offsets[1];
+
          if (printname != NULL) fprintf(stdout, "%s:", printname);
          if (number) fprintf(stdout, "%d:", linenumber);

@@ -1771,13 +1788,30 @@ while (ptr < endptr)

            if (printed || printname != NULL || number) fprintf(stdout, "\n");
            }
+          }

-        /* Prepare to repeat to find the next match */
+        /* Prepare to repeat to find the next match. If the pattern contained
+        a lookbehind that included \K, it is possible that the end of the match
+        might be at or before the actual starting offset we have just used. We
+        need to start one character further on. Unfortunately, for unanchored
+        patterns, the actual start offset can be greater than the one that was
+        set as a result of "bumpalong". PCRE1 does not return the actual start
+        offset, so we have to check against the original start offset. This may
+        lead to duplicates - we need the fudge above to avoid printing them.
+        (PCRE2 does this better.) */

        match = FALSE;
        if (line_buffered) fflush(stdout);
        rc = 0;                      /* Had some success */
        startoffset = offsets[1];    /* Restart after the match */
+        if (startoffset <= oldstartoffset)
+          {
+          if ((size_t)startoffset >= length)
+            goto END_ONE_MATCH;              /* We were at the end */
+          startoffset = oldstartoffset + 1;
+          if (utf8)
+            while ((matchptr[startoffset] & 0xc0) == 0x80) startoffset++;
+          }
        goto ONLY_MATCHING_RESTART;
        }
      }
@@ -1974,6 +2008,7 @@ while (ptr < endptr)
  /* Advance to after the newline and increment the line number. The file
  offset to the current line is maintained in filepos. */

+  END_ONE_MATCH:
  ptr += linelength + endlinelength;
  filepos += (int)(linelength + endlinelength);
  linenumber++;

--- a/pcre/pcretest.c
+++ b/pcre/pcretest.c
@@ -2257,6 +2257,8 @@ if (callout_extra)
  fprintf(f, "Callout %d: last capture = %d\n",
    cb->callout_number, cb->capture_last);

+  if (cb->offset_vector != NULL)
+    {
    for (i = 0; i < cb->capture_top * 2; i += 2)
      {
      if (cb->offset_vector[i] < 0)
@@ -2270,6 +2272,7 @@ if (callout_extra)
        }
      }
    }
+  }

 /* Re-print the subject in canonical form, the first time or if giving full
 datails. On subsequent calls in the same match, we use pchars just to find the
@@ -2519,7 +2522,7 @@ re->name_entry_size = swap_uint16(re->name_entry_size);
 re->name_count = swap_uint16(re->name_count);
 re->ref_count = swap_uint16(re->ref_count);

-if (extra != NULL)
+if (extra != NULL && (extra->flags & PCRE_EXTRA_STUDY_DATA) != 0)
  {
  pcre_study_data *rsd = (pcre_study_data *)(extra->study_data);
  rsd->size = swap_uint32(rsd->size);
@@ -2700,7 +2703,7 @@ re->name_entry_size = swap_uint16(re->name_entry_size);
 re->name_count = swap_uint16(re->name_count);
 re->ref_count = swap_uint16(re->ref_count);

-if (extra != NULL)
+if (extra != NULL && (extra->flags & PCRE_EXTRA_STUDY_DATA) != 0)
  {
  pcre_study_data *rsd = (pcre_study_data *)(extra->study_data);
  rsd->size = swap_uint32(rsd->size);
@@ -3453,7 +3456,7 @@ while (!done)
  pcre_extra *extra = NULL;

 #if !defined NOPOSIX  /* There are still compilers that require no indent */
-  regex_t preg;
+  regex_t preg = { NULL, 0, 0} ;
  int do_posix = 0;
 #endif

@@ -5603,6 +5606,12 @@ while (!done)

      if (!do_g && !do_G) break;

+      if (use_offsets == NULL)
+        {
+        fprintf(outfile, "Cannot do global matching without an ovector\n");
+        break;
+        }
+
      /* If we have matched an empty string, first check to see if we are at
      the end of the subject. If so, the /g loop is over. Otherwise, mimic what
      Perl's /g options does. This turns out to be rather cunning. First we set
@@ -5618,9 +5627,33 @@ while (!done)
        g_notempty = PCRE_NOTEMPTY_ATSTART | PCRE_ANCHORED;
        }

-      /* For /g, update the start offset, leaving the rest alone */
+      /* For /g, update the start offset, leaving the rest alone. There is a
+      tricky case when \K is used in a positive lookbehind assertion. This can
+      cause the end of the match to be less than or equal to the start offset.
+      In this case we restart at one past the start offset. This may return the
+      same match if the original start offset was bumped along during the
+      match, but eventually the new start offset will hit the actual start
+      offset. (In PCRE2 the true start offset is available, and this can be
+      done better. It is not worth doing more than making sure we do not loop
+      at this stage in the life of PCRE1.) */

-      if (do_g) start_offset = use_offsets[1];
+      if (do_g)
+        {
+        if (g_notempty == 0 && use_offsets[1] <= start_offset)
+          {
+          if (start_offset >= len) break;  /* End of subject */
+          start_offset++;
+          if (use_utf)
+            {
+            while (start_offset < len)
+              {
+              if ((bptr[start_offset] & 0xc0) != 0x80) break;
+              start_offset++;
+              }
+            }
+          }
+        else start_offset = use_offsets[1];
+        }

      /* For /G, update the pointer and length */

@@ -5637,7 +5670,7 @@ while (!done)
  CONTINUE:

 #if !defined NOPOSIX
-  if (posix || do_posix) regfree(&preg);
+  if ((posix || do_posix) && preg.re_pcre != 0) regfree(&preg);
 #endif

  if (re != NULL) new_free(re);

--- a/pcre/testdata/grepoutput
+++ b/pcre/testdata/grepoutput
--- a/pcre/testdata/testinput1
+++ b/pcre/testdata/testinput1
@@ -5720,4 +5720,14 @@ AbcdCBefgBhiBqz
 /[\Q]a\E]+/
    aa]]

+/(?:((abcd))|(((?:(?:(?:(?:abc|(?:abcdef))))b)abcdefghi)abc)|((*ACCEPT)))/
+    1234abcd
+
+/(\2)(\1)/
+
+"Z*(|d*){216}"
+
+"(?1)(?#?'){8}(a)"
+    baaaaaaaaac
+
 /-- End of testinput1 --/
--- a/pcre/testdata/testinput11
+++ b/pcre/testdata/testinput11
@@ -134,4 +134,6 @@ is required for these tests. --/

 /(((a\2)|(a*)\g<-1>))*a?/B

+/((?+1)(\1))/B
+
 /-- End of testinput11 --/
--- a/pcre/testdata/testinput12
+++ b/pcre/testdata/testinput12
@@ -87,4 +87,12 @@ and a couple of things that are different with JIT. --/
 /^12345678abcd/mS++
    12345678abcd

+/-- Test pattern compilation --/ 
+
+/(?:a|b|c|d|e)(?R)/S++
+
+/(?:a|b|c|d|e)(?R)(?R)/S++
+
+/(a(?:a|b|c|d|e)b){8,16}/S++
+
 /-- End of testinput12 --/
--- a/pcre/testdata/testinput2
+++ b/pcre/testdata/testinput2
@@ -1380,6 +1380,8 @@
    1X
    123456\P

+//KF>/dev/null
+
 /abc/IS>testsavedregex
 <testsavedregex
    abc
@@ -4078,4 +4080,76 @@ backtracking verbs. --/

 /\x{whatever}/

+"((?=(?(?=(?(?=(?(?=()))))))))"
+    a
+
+"(?(?=)==)(((((((((?=)))))))))"
+    a
+
+/^(?:(a)|b)(?(1)A|B)/I
+    aA123\O3
+    aA123\O6
+
+'^(?:(?<AA>a)|b)(?(<AA>)A|B)'
+    aA123\O3
+    aA123\O6
+
+'^(?<AA>)(?:(?<AA>a)|b)(?(<AA>)A|B)'J
+    aA123\O3
+    aA123\O6
+
+'^(?:(?<AA>X)|)(?:(?<AA>a)|b)\k{AA}'J
+    aa123\O3
+    aa123\O6
+
+/(?<N111>(?J)(?<N111>1(111111)11|)1|1|)(?(<N111>)1)/
+
+/(?(?=0)?)+/
+
+/(?(?=0)(?=00)?00765)/
+     00765
+
+/(?(?=0)(?=00)?00765|(?!3).56)/
+     00765
+     456
+     ** Failers
+     356   
+
+'^(a)*+(\w)'
+    g
+    g\O3
+
+'^(?:a)*+(\w)'
+    g
+    g\O3
+
+//C
+    \O\C+
+
+"((?2){0,1999}())?"
+
+/((?+1)(\1))/BZ
+
+/(?(?!)a|b)/
+    bbb
+    aaa 
+
+"((?2)+)((?1))"
+
+"(?(?<E>.*!.*)?)"
+
+"X((?2)()*+){2}+"BZ
+
+"X((?2)()*+){2}"BZ
+
+"(?<=((?2))((?1)))"
+
+/(?<=\Ka)/g+
+    aaaaa
+
+/(?<=\Ka)/G+
+    aaaaa
+
+/((?2){73}(?2))((?1))/
+
 /-- End of testinput2 --/
--- a/pcre/testdata/testinput4
+++ b/pcre/testdata/testinput4
@@ -722,4 +722,9 @@
 /^#[^\x{ffff}]#[^\x{ffff}]#[^\x{ffff}]#/8
    #\x{10000}#\x{100}#\x{10ffff}#

+"[\S\V\H]"8
+
+/\C(\W?ſ)'?{{/8
+    \\C(\\W?ſ)'?{{
+
 /-- End of testinput4 --/
--- a/pcre/testdata/testinput5
+++ b/pcre/testdata/testinput5
@@ -790,4 +790,12 @@

 /[b-d\x{200}-\x{250}]*[ae-h]?#[\x{200}-\x{250}]{0,8}[\x00-\xff]*#[\x{200}-\x{250}]+[a-z]/8BZ

+/[^\xff]*PRUNE:\x{100}abc(xyz(?1))/8DZ
+
+/(?<=\K\x{17f})/8g+
+    \x{17f}\x{17f}\x{17f}\x{17f}\x{17f}
+
+/(?<=\K\x{17f})/8G+
+    \x{17f}\x{17f}\x{17f}\x{17f}\x{17f}
+
 /-- End of testinput5 --/

--- a/pcre/testdata/testinput6
+++ b/pcre/testdata/testinput6
@@ -1496,4 +1496,10 @@
 /^s?c/mi8
    scat

+/[A-`]/i8
+    abcdefghijklmno
+
+/\C\X*QT/8
+    Ӆ\x0aT
+
 /-- End of testinput6 --/
--- a/pcre/testdata/testinput8
+++ b/pcre/testdata/testinput8
@@ -4837,4 +4837,8 @@
 '\A(?:[^\"]++|\"(?:[^\"]++|\"\")*+\")++'
    NON QUOTED \"QUOT\"\"ED\" AFTER \"NOT MATCHED

+/(?(?!)a|b)/
+    bbb
+    aaa 
+
 /-- End of testinput8 --/
--- a/pcre/testdata/testoutput1
+++ b/pcre/testdata/testoutput1
@@ -9411,4 +9411,22 @@ No match
    aa]]
 0: aa]]

+/(?:((abcd))|(((?:(?:(?:(?:abc|(?:abcdef))))b)abcdefghi)abc)|((*ACCEPT)))/
+    1234abcd
+ 0: 
+ 1: <unset>
+ 2: <unset>
+ 3: <unset>
+ 4: <unset>
+ 5: 
+
+/(\2)(\1)/
+
+"Z*(|d*){216}"
+
+"(?1)(?#?'){8}(a)"
+    baaaaaaaaac
+ 0: aaaaaaaaa
+ 1: a
+
 /-- End of testinput1 --/
--- a/pcre/testdata/testoutput11-16
+++ b/pcre/testdata/testoutput11-16
@@ -231,7 +231,7 @@ Memory allocation (code space): 73
 ------------------------------------------------------------------

 /(?P<a>a)...(?P=a)bbb(?P>a)d/BM
-Memory allocation (code space): 57
+Memory allocation (code space): 61
 ------------------------------------------------------------------
  0  24 Bra
  2   5 CBra 1
@@ -733,4 +733,19 @@ Memory allocation (code space): 14
 41     End
 ------------------------------------------------------------------

+/((?+1)(\1))/B
+------------------------------------------------------------------
+  0  20 Bra
+  2  16 Once
+  4  12 CBra 1
+  7   9 Recurse
+  9   5 CBra 2
+ 12     \1
+ 14   5 Ket
+ 16  12 Ket
+ 18  16 Ket
+ 20  20 Ket
+ 22     End
+------------------------------------------------------------------
+
 /-- End of testinput11 --/
--- a/pcre/testdata/testoutput11-32
+++ b/pcre/testdata/testoutput11-32
@@ -231,7 +231,7 @@ Memory allocation (code space): 155
 ------------------------------------------------------------------

 /(?P<a>a)...(?P=a)bbb(?P>a)d/BM
-Memory allocation (code space): 117
+Memory allocation (code space): 125
 ------------------------------------------------------------------
  0  24 Bra
  2   5 CBra 1
@@ -733,4 +733,19 @@ Memory allocation (code space): 28
 41     End
 ------------------------------------------------------------------

+/((?+1)(\1))/B
+------------------------------------------------------------------
+  0  20 Bra
+  2  16 Once
+  4  12 CBra 1
+  7   9 Recurse
+  9   5 CBra 2
+ 12     \1
+ 14   5 Ket
+ 16  12 Ket
+ 18  16 Ket
+ 20  20 Ket
+ 22     End
+------------------------------------------------------------------
+
 /-- End of testinput11 --/
--- a/pcre/testdata/testoutput11-8
+++ b/pcre/testdata/testoutput11-8
@@ -231,7 +231,7 @@ Memory allocation (code space): 45
 ------------------------------------------------------------------

 /(?P<a>a)...(?P=a)bbb(?P>a)d/BM
-Memory allocation (code space): 34
+Memory allocation (code space): 38
 ------------------------------------------------------------------
  0  30 Bra
  3   7 CBra 1
@@ -733,4 +733,19 @@ Memory allocation (code space): 10
 60     End
 ------------------------------------------------------------------

+/((?+1)(\1))/B
+------------------------------------------------------------------
+  0  31 Bra
+  3  25 Once
+  6  19 CBra 1
+ 11  14 Recurse
+ 14   8 CBra 2
+ 19     \1
+ 22   8 Ket
+ 25  19 Ket
+ 28  25 Ket
+ 31  31 Ket
+ 34     End
+------------------------------------------------------------------
+
 /-- End of testinput11 --/
--- a/pcre/testdata/testoutput12
+++ b/pcre/testdata/testoutput12
@@ -176,4 +176,12 @@ No match, mark = m (JIT)
    12345678abcd
 0: 12345678abcd (JIT)

+/-- Test pattern compilation --/ 
+
+/(?:a|b|c|d|e)(?R)/S++
+
+/(?:a|b|c|d|e)(?R)(?R)/S++
+
+/(a(?:a|b|c|d|e)b){8,16}/S++
+
 /-- End of testinput12 --/
--- a/pcre/testdata/testoutput2
+++ b/pcre/testdata/testoutput2
@@ -561,7 +561,7 @@ Failed: assertion expected after (?( at offset 3
 Failed: reference to non-existent subpattern at offset 7

 /(?(?<ab))/
-Failed: syntax error in subpattern name (missing terminator) at offset 7
+Failed: assertion expected after (?( at offset 3

 /((?s)blah)\s+\1/I
 Capturing subpattern count = 1
@@ -1566,30 +1566,35 @@ Need char = 'b'

 /a(?(1)b)(.)/I
 Capturing subpattern count = 1
+Max back reference = 1
 No options
 First char = 'a'
 No need char

 /a(?(1)bag|big)(.)/I
 Capturing subpattern count = 1
+Max back reference = 1
 No options
 First char = 'a'
 Need char = 'g'

 /a(?(1)bag|big)*(.)/I
 Capturing subpattern count = 1
+Max back reference = 1
 No options
 First char = 'a'
 No need char

 /a(?(1)bag|big)+(.)/I
 Capturing subpattern count = 1
+Max back reference = 1
 No options
 First char = 'a'
 Need char = 'g'

 /a(?(1)b..|b..)(.)/I
 Capturing subpattern count = 1
+Max back reference = 1
 No options
 First char = 'a'
 Need char = 'b'
@@ -3379,24 +3384,28 @@ Need char = 'a'

 /(?(1)ab|ac)(.)/I
 Capturing subpattern count = 1
+Max back reference = 1
 No options
 First char = 'a'
 No need char

 /(?(1)abz|acz)(.)/I
 Capturing subpattern count = 1
+Max back reference = 1
 No options
 First char = 'a'
 Need char = 'z'

 /(?(1)abz)(.)/I
 Capturing subpattern count = 1
+Max back reference = 1
 No options
 No first char
 No need char

 /(?(1)abz)(1)23/I
 Capturing subpattern count = 1
+Max back reference = 1
 No options
 No first char
 Need char = '3'
@@ -5605,6 +5614,10 @@ No match
    123456\P
 No match

+//KF>/dev/null
+Compiled pattern written to /dev/null
+Study data written to /dev/null
+
 /abc/IS>testsavedregex
 Capturing subpattern count = 0
 No options
@@ -6336,6 +6349,7 @@ No need char

 /^(?P<A>a)?(?(A)a|b)/I
 Capturing subpattern count = 1
+Max back reference = 1
 Named capturing subpatterns:
  A   1
 Options: anchored
@@ -6353,6 +6367,7 @@ No match

 /(?:(?(ZZ)a|b)(?P<ZZ>X))+/I
 Capturing subpattern count = 1
+Max back reference = 1
 Named capturing subpatterns:
  ZZ   1
 No options
@@ -6370,6 +6385,7 @@ Failed: reference to non-existent subpattern at offset 9

 /(?:(?(ZZ)a|b)(?(ZZ)a|b)(?P<ZZ>X))+/I
 Capturing subpattern count = 1
+Max back reference = 1
 Named capturing subpatterns:
  ZZ   1
 No options
@@ -6381,6 +6397,7 @@ Need char = 'X'

 /(?:(?(ZZ)a|\(b\))\\(?P<ZZ>X))+/I
 Capturing subpattern count = 1
+Max back reference = 1
 Named capturing subpatterns:
  ZZ   1
 No options
@@ -10226,6 +10243,7 @@ No starting char list
  (?(1)|.)                    # check that there was an empty component
  /xiIS
 Capturing subpattern count = 1
+Max back reference = 1
 Options: anchored caseless extended
 No first char
 Need char = ':'
@@ -10255,6 +10273,7 @@ Failed: different names for subpatterns of the same number are not allowed at of
    b(?<quote> (?<apostrophe>')|(?<realquote>")) ) 
    (?('quote')[a-z]+|[0-9]+)/JIx
 Capturing subpattern count = 6
+Max back reference = 1
 Named capturing subpatterns:
  apostrophe   2
  apostrophe   5
@@ -10317,6 +10336,7 @@ No match
        End
 ------------------------------------------------------------------
 Capturing subpattern count = 4
+Max back reference = 4
 Named capturing subpatterns:
  D   4
  D   1
@@ -10364,6 +10384,7 @@ No match
        End
 ------------------------------------------------------------------
 Capturing subpattern count = 4
+Max back reference = 1
 Named capturing subpatterns:
  A   1
  A   4
@@ -10486,6 +10507,7 @@ No starting char list
    
 /()i(?(1)a)/SI 
 Capturing subpattern count = 1
+Max back reference = 1
 No options
 No first char
 Need char = 'i'
@@ -14206,4 +14228,199 @@ Failed: digits missing in \x{} or \o{} at offset 3
 /\x{whatever}/
 Failed: non-hex character in \x{} (closing brace missing?) at offset 3

+"((?=(?(?=(?(?=(?(?=()))))))))"
+    a
+ 0: 
+ 1: 
+ 2: 
+
+"(?(?=)==)(((((((((?=)))))))))"
+    a
+No match
+
+/^(?:(a)|b)(?(1)A|B)/I
+Capturing subpattern count = 1
+Max back reference = 1
+Options: anchored
+No first char
+No need char
+    aA123\O3
+Matched, but too many substrings
+ 0: aA
+    aA123\O6
+ 0: aA
+ 1: a
+
+'^(?:(?<AA>a)|b)(?(<AA>)A|B)'
+    aA123\O3
+Matched, but too many substrings
+ 0: aA
+    aA123\O6
+ 0: aA
+ 1: a
+
+'^(?<AA>)(?:(?<AA>a)|b)(?(<AA>)A|B)'J
+    aA123\O3
+Matched, but too many substrings
+ 0: aA
+    aA123\O6
+Matched, but too many substrings
+ 0: aA
+ 1: 
+
+'^(?:(?<AA>X)|)(?:(?<AA>a)|b)\k{AA}'J
+    aa123\O3
+Matched, but too many substrings
+ 0: aa
+    aa123\O6
+Matched, but too many substrings
+ 0: aa
+ 1: <unset>
+
+/(?<N111>(?J)(?<N111>1(111111)11|)1|1|)(?(<N111>)1)/
+
+/(?(?=0)?)+/
+Failed: nothing to repeat at offset 7
+
+/(?(?=0)(?=00)?00765)/
+     00765
+ 0: 00765
+
+/(?(?=0)(?=00)?00765|(?!3).56)/
+     00765
+ 0: 00765
+     456
+ 0: 456
+     ** Failers
+No match
+     356   
+No match
+
+'^(a)*+(\w)'
+    g
+ 0: g
+ 1: <unset>
+ 2: g
+    g\O3
+Matched, but too many substrings
+ 0: g
+
+'^(?:a)*+(\w)'
+    g
+ 0: g
+ 1: g
+    g\O3
+Matched, but too many substrings
+ 0: g
+
+//C
+    \O\C+
+Callout 255: last capture = -1
+--->
+ +0 ^    
+Matched, but too many substrings
+
+"((?2){0,1999}())?"
+
+/((?+1)(\1))/BZ
+------------------------------------------------------------------
+        Bra
+        Once
+        CBra 1
+        Recurse
+        CBra 2
+        \1
+        Ket
+        Ket
+        Ket
+        Ket
+        End
+------------------------------------------------------------------
+
+/(?(?!)a|b)/
+    bbb
+ 0: b
+    aaa 
+No match
+
+"((?2)+)((?1))"
+
+"(?(?<E>.*!.*)?)"
+Failed: assertion expected after (?( at offset 3
+
+"X((?2)()*+){2}+"BZ
+------------------------------------------------------------------
+        Bra
+        X
+        Once
+        CBra 1
+        Recurse
+        Braposzero
+        SCBraPos 2
+        KetRpos
+        Ket
+        CBra 1
+        Recurse
+        Braposzero
+        SCBraPos 2
+        KetRpos
+        Ket
+        Ket
+        Ket
+        End
+------------------------------------------------------------------
+
+"X((?2)()*+){2}"BZ
+------------------------------------------------------------------
+        Bra
+        X
+        CBra 1
+        Recurse
+        Braposzero
+        SCBraPos 2
+        KetRpos
+        Ket
+        CBra 1
+        Recurse
+        Braposzero
+        SCBraPos 2
+        KetRpos
+        Ket
+        Ket
+        End
+------------------------------------------------------------------
+
+"(?<=((?2))((?1)))"
+Failed: lookbehind assertion is not fixed length at offset 17
+
+/(?<=\Ka)/g+
+    aaaaa
+ 0: a
+ 0+ aaaa
+ 0: a
+ 0+ aaaa
+ 0: a
+ 0+ aaa
+ 0: a
+ 0+ aa
+ 0: a
+ 0+ a
+ 0: a
+ 0+ 
+
+/(?<=\Ka)/G+
+    aaaaa
+ 0: a
+ 0+ aaaa
+ 0: a
+ 0+ aaa
+ 0: a
+ 0+ aa
+ 0: a
+ 0+ a
+ 0: a
+ 0+ 
+
+/((?2){73}(?2))((?1))/
+
 /-- End of testinput2 --/
--- a/pcre/testdata/testoutput4
+++ b/pcre/testdata/testoutput4
@@ -1271,4 +1271,10 @@ No match
    #\x{10000}#\x{100}#\x{10ffff}#
 0: #\x{10000}#\x{100}#\x{10ffff}#

+"[\S\V\H]"8
+
+/\C(\W?ſ)'?{{/8
+    \\C(\\W?ſ)'?{{
+No match
+
 /-- End of testinput4 --/
--- a/pcre/testdata/testoutput5
+++ b/pcre/testdata/testoutput5
@@ -1897,4 +1897,49 @@ Failed: disallowed Unicode code point (>= 0xd800 && <= 0xdfff) at offset 5
        End
 ------------------------------------------------------------------

+/[^\xff]*PRUNE:\x{100}abc(xyz(?1))/8DZ
+------------------------------------------------------------------
+        Bra
+        [^\x{ff}]*
+        PRUNE:\x{100}abc
+        CBra 1
+        xyz
+        Recurse
+        Ket
+        Ket
+        End
+------------------------------------------------------------------
+Capturing subpattern count = 1
+Options: utf
+No first char
+Need char = 'z'
+
+/(?<=\K\x{17f})/8g+
+    \x{17f}\x{17f}\x{17f}\x{17f}\x{17f}
+ 0: \x{17f}
+ 0+ \x{17f}\x{17f}\x{17f}\x{17f}
+ 0: \x{17f}
+ 0+ \x{17f}\x{17f}\x{17f}\x{17f}
+ 0: \x{17f}
+ 0+ \x{17f}\x{17f}\x{17f}
+ 0: \x{17f}
+ 0+ \x{17f}\x{17f}
+ 0: \x{17f}
+ 0+ \x{17f}
+ 0: \x{17f}
+ 0+ 
+
+/(?<=\K\x{17f})/8G+
+    \x{17f}\x{17f}\x{17f}\x{17f}\x{17f}
+ 0: \x{17f}
+ 0+ \x{17f}\x{17f}\x{17f}\x{17f}
+ 0: \x{17f}
+ 0+ \x{17f}\x{17f}\x{17f}
+ 0: \x{17f}
+ 0+ \x{17f}\x{17f}
+ 0: \x{17f}
+ 0+ \x{17f}
+ 0: \x{17f}
+ 0+ 
+
 /-- End of testinput5 --/

--- a/pcre/testdata/testoutput6
+++ b/pcre/testdata/testoutput6
@@ -2461,4 +2461,12 @@ No match
    scat
 0: sc

+/[A-`]/i8
+    abcdefghijklmno
+ 0: a
+
+/\C\X*QT/8
+    Ӆ\x0aT
+No match
+
 /-- End of testinput6 --/
--- a/pcre/testdata/testoutput8
+++ b/pcre/testdata/testoutput8
@@ -7785,4 +7785,10 @@ Matched, but offsets vector is too small to show all matches
    NON QUOTED \"QUOT\"\"ED\" AFTER \"NOT MATCHED
 0: NON QUOTED "QUOT""ED" AFTER 

+/(?(?!)a|b)/
+    bbb
+ 0: b
+    aaa 
+No match
+
 /-- End of testinput8 --/