Commit 48636f09 authored by Sergei Golubchik

Merge branch 'merge-pcre' into 10.0

parents 5ae2656b cf242ade
......@@ -8,7 +8,7 @@ Email domain: cam.ac.uk
University of Cambridge Computing Service,
Cambridge, England.
Copyright (c) 1997-2017 University of Cambridge
Copyright (c) 1997-2018 University of Cambridge
All rights reserved
......@@ -19,7 +19,7 @@ Written by: Zoltan Herczeg
Email local part: hzmester
Emain domain: freemail.hu
Copyright(c) 2010-2017 Zoltan Herczeg
Copyright(c) 2010-2018 Zoltan Herczeg
All rights reserved.
......@@ -30,7 +30,7 @@ Written by: Zoltan Herczeg
Email local part: hzmester
Emain domain: freemail.hu
Copyright(c) 2009-2017 Zoltan Herczeg
Copyright(c) 2009-2018 Zoltan Herczeg
All rights reserved.
......
......@@ -4,6 +4,59 @@ ChangeLog for PCRE
Note that the PCRE 8.xx series (PCRE1) is now in a bugfix-only state. All
development is happening in the PCRE2 10.xx series.
Version 8.42 20-March-2018
--------------------------
1. Fixed a MIPS issue in the JIT compiler reported by Joshua Kinard.
2. Fixed outdated real_pcre definitions in pcre.h.in (patch by Evgeny Kotkov).
3. pcregrep was truncating components of file names to 128 characters when
processing files with the -r option, and also (some very odd code) truncating
path names to 512 characters. There is now a check on the absolute length of
full path file names, which may be up to 2047 characters long.
4. Using pcre_dfa_exec(), in UTF mode when UCP support was not defined, there
was the possibility of a false positive match when caselessly matching a "not
this character" item such as [^\x{1234}] (with a code point greater than 127)
because the "other case" variable was not being initialized.
5. Although pcre_jit_exec checks whether the pattern is compiled
in a given mode, it was also expected that at least one mode is available.
This is fixed and pcre_jit_exec returns with PCRE_ERROR_JIT_BADOPTION
when the pattern is not optimized by JIT at all.
6. The line number and related variables such as match counts in pcregrep
were all int variables, causing overflow when files with more than 2147483647
lines were processed (assuming 32-bit ints). They have all been changed to
unsigned long ints.
7. If a backreference with a minimum repeat count of zero was first in a
pattern, apart from assertions, an incorrect first matching character could be
recorded. For example, for the pattern /(?=(a))\1?b/, "b" was incorrectly set
as the first character of a match.
8. Fix out-of-bounds read for partial matching of /./ against an empty string
when the newline type is CRLF.
9. When matching using the REG_STARTEND feature of the POSIX API with a
non-zero starting offset, unset capturing groups with lower numbers than a
group that did capture something were not being correctly returned as "unset"
(that is, with offset values of -1).
10. Matching the pattern /(*UTF)\C[^\v]+\x80/ against an 8-bit string
containing multi-code-unit characters caused bad behaviour and possibly a
crash. This issue was fixed for other kinds of repeat in release 8.37 by change
38, but repeating character classes were overlooked.
11. A small fix to pcregrep to avoid compiler warnings for -Wformat-overflow=2.
12. Added --enable-jit=auto support to configure.ac.
13. Fix misleading error message in configure.ac.
Version 8.41 05-July-2017
-------------------------
......
This diff is collapsed.
......@@ -25,7 +25,7 @@ Email domain: cam.ac.uk
University of Cambridge Computing Service,
Cambridge, England.
Copyright (c) 1997-2017 University of Cambridge
Copyright (c) 1997-2018 University of Cambridge
All rights reserved.
......@@ -36,7 +36,7 @@ Written by: Zoltan Herczeg
Email local part: hzmester
Emain domain: freemail.hu
Copyright(c) 2010-2017 Zoltan Herczeg
Copyright(c) 2010-2018 Zoltan Herczeg
All rights reserved.
......@@ -47,7 +47,7 @@ Written by: Zoltan Herczeg
Email local part: hzmester
Emain domain: freemail.hu
Copyright(c) 2009-2017 Zoltan Herczeg
Copyright(c) 2009-2018 Zoltan Herczeg
All rights reserved.
......
News about PCRE releases
------------------------
Release 8.42 20-March-2018
--------------------------
This is a bug-fix release.
Release 8.41 13-June-2017
-------------------------
......
......@@ -760,13 +760,14 @@ The character code used is EBCDIC, not ASCII or Unicode. In z/OS, UNIX APIs and
applications can be supported through UNIX System Services, and in such an
environment PCRE can be built in the same way as in other systems. However, in
native z/OS (without UNIX System Services) and in z/VM, special ports are
required. For details, please see this web site:
required. PCRE1 version 8.39 is available in file 882 on this site:
http://www.zaconsultants.net
http://www.cbttape.org
You may download PCRE from WWW.CBTTAPE.ORG, file 882.  Everything, source and
executable, is in EBCDIC and native z/OS file formats and this is the
recommended download site.
Everything, source and executable, is in EBCDIC and native z/OS file formats.
However, this software is not maintained and will not be upgraded. If you are
new to PCRE you should be looking at PCRE2 (version 10.30 or later).
==========================
Last Updated: 25 June 2015
===============================
Last Updated: 13 September 2017
===============================
......@@ -9,18 +9,18 @@ dnl The PCRE_PRERELEASE feature is for identifying release candidates. It might
dnl be defined as -RC2, for example. For real releases, it should be empty.
m4_define(pcre_major, [8])
m4_define(pcre_minor, [41])
m4_define(pcre_minor, [42])
m4_define(pcre_prerelease, [])
m4_define(pcre_date, [2017-07-05])
m4_define(pcre_date, [2018-03-20])
# NOTE: The CMakeLists.txt file searches for the above variables in the first
# 50 lines of this file. Please update that if the variables above are moved.
# Libtool shared library interface versions (current:revision:age)
m4_define(libpcre_version, [3:9:2])
m4_define(libpcre16_version, [2:9:2])
m4_define(libpcre32_version, [0:9:0])
m4_define(libpcreposix_version, [0:5:0])
m4_define(libpcre_version, [3:10:2])
m4_define(libpcre16_version, [2:10:2])
m4_define(libpcre32_version, [0:10:0])
m4_define(libpcreposix_version, [0:6:0])
m4_define(libpcrecpp_version, [0:1:0])
AC_PREREQ(2.57)
......@@ -155,6 +155,18 @@ AC_ARG_ENABLE(jit,
[enable Just-In-Time compiling support]),
, enable_jit=no)
# This code enables JIT if the hardware supports it.
if test "$enable_jit" = "auto"; then
AC_LANG(C)
AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
#define SLJIT_CONFIG_AUTO 1
#include "sljit/sljitConfigInternal.h"
#if (defined SLJIT_CONFIG_UNSUPPORTED && SLJIT_CONFIG_UNSUPPORTED)
#error unsupported
#endif]])], enable_jit=yes, enable_jit=no)
fi
# Handle --disable-pcregrep-jit (enabled by default)
AC_ARG_ENABLE(pcregrep-jit,
AS_HELP_STRING([--disable-pcregrep-jit],
......@@ -469,7 +481,7 @@ pcre_have_type_traits="0"
pcre_have_bits_type_traits="0"
if test "x$enable_cpp" = "xyes" -a -z "$CXX"; then
AC_MSG_ERROR([You need a C++ compiler for C++ support.])
AC_MSG_ERROR([Invalid C++ compiler or C++ compiler flags])
fi
if test "x$enable_cpp" = "xyes" -a -n "$CXX"
......
......@@ -760,13 +760,14 @@ The character code used is EBCDIC, not ASCII or Unicode. In z/OS, UNIX APIs and
applications can be supported through UNIX System Services, and in such an
environment PCRE can be built in the same way as in other systems. However, in
native z/OS (without UNIX System Services) and in z/VM, special ports are
required. For details, please see this web site:
required. PCRE1 version 8.39 is available in file 882 on this site:
http://www.zaconsultants.net
http://www.cbttape.org
You may download PCRE from WWW.CBTTAPE.ORG, file 882.  Everything, source and
executable, is in EBCDIC and native z/OS file formats and this is the
recommended download site.
Everything, source and executable, is in EBCDIC and native z/OS file formats.
However, this software is not maintained and will not be upgraded. If you are
new to PCRE you should be looking at PCRE2 (version 10.30 or later).
==========================
Last Updated: 25 June 2015
===============================
Last Updated: 13 September 2017
===============================
......@@ -321,11 +321,11 @@ these bits, just add new ones on the end, in order to remain compatible. */
/* Types */
struct real_pcre; /* declaration; the definition is private */
typedef struct real_pcre pcre;
struct real_pcre8_or_16; /* declaration; the definition is private */
typedef struct real_pcre8_or_16 pcre;
struct real_pcre16; /* declaration; the definition is private */
typedef struct real_pcre16 pcre16;
struct real_pcre8_or_16; /* declaration; the definition is private */
typedef struct real_pcre8_or_16 pcre16;
struct real_pcre32; /* declaration; the definition is private */
typedef struct real_pcre32 pcre32;
......
......@@ -8063,7 +8063,7 @@ for (;; ptr++)
single group (i.e. not to a duplicated name). */
HANDLE_REFERENCE:
if (firstcharflags == REQ_UNSET) firstcharflags = REQ_NONE;
if (firstcharflags == REQ_UNSET) zerofirstcharflags = firstcharflags = REQ_NONE;
previous = code;
item_hwm_offset = cd->hwm - cd->start_workspace;
*code++ = ((options & PCRE_CASELESS) != 0)? OP_REFI : OP_REF;
......
......@@ -2287,12 +2287,14 @@ for (;;)
case OP_NOTI:
if (clen > 0)
{
unsigned int otherd;
pcre_uint32 otherd;
#ifdef SUPPORT_UTF
if (utf && d >= 128)
{
#ifdef SUPPORT_UCP
otherd = UCD_OTHERCASE(d);
#else
otherd = d;
#endif /* SUPPORT_UCP */
}
else
......
......@@ -6,7 +6,7 @@
and semantics are as close as possible to those of the Perl 5 language.
Written by Philip Hazel
Copyright (c) 1997-2014 University of Cambridge
Copyright (c) 1997-2018 University of Cambridge
-----------------------------------------------------------------------------
Redistribution and use in source and binary forms, with or without
......@@ -2313,7 +2313,7 @@ for (;;)
case OP_ANY:
if (IS_NEWLINE(eptr)) RRETURN(MATCH_NOMATCH);
if (md->partial != 0 &&
eptr + 1 >= md->end_subject &&
eptr == md->end_subject - 1 &&
NLBLOCK->nltype == NLTYPE_FIXED &&
NLBLOCK->nllen == 2 &&
UCHAR21TEST(eptr) == NLBLOCK->nl[0])
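The hunk above tightens the bounds test that guards the character read for OP_ANY under partial matching. The following is a minimal sketch of why that matters, using a hypothetical helper (at_partial_crlf) and plain char pointers rather than PCRE's internal macros. With an empty subject the current pointer equals the end pointer, so a test of the form "p + 1 >= end" passes and the character test that follows reads one element past the end. Requiring exactly one remaining code unit avoids that read.

```c
#include <stddef.h>

/* Hypothetical helper, not part of PCRE: returns 1 when exactly one code
   unit remains at p and it is CR, the first character of a CRLF newline,
   so a partial match is still possible. */
int at_partial_crlf(const char *p, const char *end)
{
  /* end - p == 1 expresses the same condition as p == end - 1, written so
     that no pointer before the start of the buffer is ever formed. */
  if (end - p != 1) return 0;   /* empty or longer remainder: never read *p */
  return *p == '\r';
}
```

Calling at_partial_crlf(s, s) on an empty string s returns 0 without dereferencing the pointer.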
......@@ -3061,7 +3061,7 @@ for (;;)
{
RMATCH(eptr, ecode, offset_top, md, eptrb, RM18);
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
if (eptr-- == pp) break; /* Stop if tried at original pos */
if (eptr-- <= pp) break; /* Stop if tried at original pos */
BACKCHAR(eptr);
}
}
......@@ -3218,7 +3218,7 @@ for (;;)
{
RMATCH(eptr, ecode, offset_top, md, eptrb, RM21);
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
if (eptr-- == pp) break; /* Stop if tried at original pos */
if (eptr-- <= pp) break; /* Stop if tried at original pos */
#ifdef SUPPORT_UTF
if (utf) BACKCHAR(eptr);
#endif
......
This diff is collapsed.
......@@ -1387,8 +1387,8 @@ Returns: nothing
*/
static void
do_after_lines(int lastmatchnumber, char *lastmatchrestart, char *endptr,
char *printname)
do_after_lines(unsigned long int lastmatchnumber, char *lastmatchrestart,
char *endptr, char *printname)
{
if (after_context > 0 && lastmatchnumber > 0)
{
......@@ -1398,7 +1398,7 @@ if (after_context > 0 && lastmatchnumber > 0)
int ellength;
char *pp = lastmatchrestart;
if (printname != NULL) fprintf(stdout, "%s-", printname);
if (number) fprintf(stdout, "%d-", lastmatchnumber++);
if (number) fprintf(stdout, "%lu-", lastmatchnumber++);
pp = end_of_line(pp, endptr, &ellength);
FWRITE(lastmatchrestart, 1, pp - lastmatchrestart, stdout);
lastmatchrestart = pp;
......@@ -1502,11 +1502,11 @@ static int
pcregrep(void *handle, int frtype, char *filename, char *printname)
{
int rc = 1;
int linenumber = 1;
int lastmatchnumber = 0;
int count = 0;
int filepos = 0;
int offsets[OFFSET_SIZE];
unsigned long int linenumber = 1;
unsigned long int lastmatchnumber = 0;
unsigned long int count = 0;
char *lastmatchrestart = NULL;
char *ptr = main_buffer;
char *endptr;
......@@ -1609,7 +1609,7 @@ while (ptr < endptr)
if (endlinelength == 0 && t == main_buffer + bufsize)
{
fprintf(stderr, "pcregrep: line %d%s%s is too long for the internal buffer\n"
fprintf(stderr, "pcregrep: line %lu%s%s is too long for the internal buffer\n"
"pcregrep: check the --buffer-size option\n",
linenumber,
(filename == NULL)? "" : " of file ",
......@@ -1747,7 +1747,7 @@ while (ptr < endptr)
prevoffsets[1] = offsets[1];
if (printname != NULL) fprintf(stdout, "%s:", printname);
if (number) fprintf(stdout, "%d:", linenumber);
if (number) fprintf(stdout, "%lu:", linenumber);
/* Handle --line-offsets */
......@@ -1862,7 +1862,7 @@ while (ptr < endptr)
{
char *pp = lastmatchrestart;
if (printname != NULL) fprintf(stdout, "%s-", printname);
if (number) fprintf(stdout, "%d-", lastmatchnumber++);
if (number) fprintf(stdout, "%lu-", lastmatchnumber++);
pp = end_of_line(pp, endptr, &ellength);
FWRITE(lastmatchrestart, 1, pp - lastmatchrestart, stdout);
lastmatchrestart = pp;
......@@ -1902,7 +1902,7 @@ while (ptr < endptr)
int ellength;
char *pp = p;
if (printname != NULL) fprintf(stdout, "%s-", printname);
if (number) fprintf(stdout, "%d-", linenumber - linecount--);
if (number) fprintf(stdout, "%lu-", linenumber - linecount--);
pp = end_of_line(pp, endptr, &ellength);
FWRITE(p, 1, pp - p, stdout);
p = pp;
......@@ -1916,7 +1916,7 @@ while (ptr < endptr)
endhyphenpending = TRUE;
if (printname != NULL) fprintf(stdout, "%s:", printname);
if (number) fprintf(stdout, "%d:", linenumber);
if (number) fprintf(stdout, "%lu:", linenumber);
/* In multiline mode, we want to print to the end of the line in which
the end of the matched string is found, so we adjust linelength and the
......@@ -2112,7 +2112,7 @@ if (count_only && !quiet)
{
if (printname != NULL && filenames != FN_NONE)
fprintf(stdout, "%s:", printname);
fprintf(stdout, "%d\n", count);
fprintf(stdout, "%lu\n", count);
}
}
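The pcregrep hunks above switch the line and match counters from int to unsigned long int and change every matching format specifier from %d to %lu. A small sketch of the pairing, with a hypothetical helper name (format_line_prefix); the point is that the specifier must change together with the type. Note that unsigned long int is only guaranteed to hold values up to 4294967295; it is wider on many 64-bit platforms, but that is platform dependent.

```c
#include <stdio.h>

/* Hypothetical helper, not part of pcregrep: writes "<line>:" into out and
   returns snprintf's result. %lu matches unsigned long int; passing the
   same value to %d would be undefined behaviour. */
int format_line_prefix(char *out, size_t size, unsigned long int line)
{
  return snprintf(out, size, "%lu:", line);
}
```

A line number such as 3000000000, which does not fit in a 32-bit int, is formatted correctly this way.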
......@@ -2234,7 +2234,7 @@ if (isdirectory(pathname))
if (dee_action == dee_RECURSE)
{
char buffer[1024];
char buffer[2048];
char *nextfile;
directory_type *dir = opendirectory(pathname);
......@@ -2249,7 +2249,14 @@ if (isdirectory(pathname))
while ((nextfile = readdirectory(dir)) != NULL)
{
int frc;
sprintf(buffer, "%.512s%c%.128s", pathname, FILESEP, nextfile);
int fnlength = strlen(pathname) + strlen(nextfile) + 2;
if (fnlength > 2048)
{
fprintf(stderr, "pcre2grep: recursive filename is too long\n");
rc = 2;
break;
}
sprintf(buffer, "%s%c%s", pathname, FILESEP, nextfile);
frc = grep_or_recurse(buffer, dir_recurse, FALSE);
if (frc > 1) rc = frc;
else if (frc == 0 && rc == 1) rc = 0;
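The hunk above replaces silent truncation ("%.512s%c%.128s") with an explicit length check before the path is assembled. A self-contained sketch of the same idea follows; join_path and PATH_BUFFER_SIZE are hypothetical names chosen for illustration, and the size mirrors the 2048-byte buffer in the hunk.

```c
#include <stdio.h>
#include <string.h>

#define PATH_BUFFER_SIZE 2048   /* mirrors char buffer[2048] in the hunk */

/* Hypothetical helper: joins dir and name with sep into out, which must
   have room for PATH_BUFFER_SIZE bytes. Returns 0 on success and -1 if
   the joined path would not fit. */
int join_path(char *out, const char *dir, char sep, const char *name)
{
  /* +2 accounts for the separator and the terminating NUL. */
  size_t needed = strlen(dir) + strlen(name) + 2;
  if (needed > PATH_BUFFER_SIZE) return -1;
  sprintf(out, "%s%c%s", dir, sep, name);   /* safe: length checked above */
  return 0;
}
```

With this check the longest accepted path is 2047 characters plus the terminating NUL, which agrees with ChangeLog item 3.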
......@@ -2520,7 +2527,14 @@ if ((popts & PO_FIXED_STRINGS) != 0)
}
}
sprintf(buffer, "%s%.*s%s", prefix[popts], patlen, ps, suffix[popts]);
if (snprintf(buffer, PATBUFSIZE, "%s%.*s%s", prefix[popts], patlen, ps,
suffix[popts]) > PATBUFSIZE)
{
fprintf(stderr, "pcregrep: Buffer overflow while compiling \"%s\"\n",
ps);
return FALSE;
}
p->compiled = pcre_compile(buffer, options, &error, &errptr, pcretables);
if (p->compiled != NULL) return TRUE;
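The hunk above replaces sprintf with snprintf and tests the return value. In C99, snprintf returns the number of characters that would have been written, not counting the terminating NUL, so the output is complete only when the result is non-negative and strictly less than the buffer size. The sketch below uses a hypothetical helper (build_pattern) and the stricter ">=" form of the test, which also catches a result of exactly the buffer size; whether that edge case is reachable in pcregrep is not established here.

```c
#include <stdio.h>

/* Hypothetical helper: formats prefix + first patlen chars of pat + suffix
   into out. Returns 1 if the whole string fit, 0 if it was truncated or
   snprintf reported an error. */
int build_pattern(char *out, size_t size, const char *prefix,
                  const char *pat, int patlen, const char *suffix)
{
  int n = snprintf(out, size, "%s%.*s%s", prefix, patlen, pat, suffix);
  return n >= 0 && (size_t)n < size;
}
```

For an 8-byte buffer, a formatted result of exactly 8 characters is reported as not fitting, because there is no room left for the NUL.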
......@@ -2756,8 +2770,15 @@ for (i = 1; i < argc; i++)
int arglen = (argequals == NULL || equals == NULL)?
(int)strlen(arg) : (int)(argequals - arg);
sprintf(buff1, "%.*s", baselen, op->long_name);
sprintf(buff2, "%s%.*s", buff1, fulllen - baselen - 2, opbra + 1);
if (snprintf(buff1, sizeof(buff1), "%.*s", baselen, op->long_name) >
(int)sizeof(buff1) ||
snprintf(buff2, sizeof(buff2), "%s%.*s", buff1,
fulllen - baselen - 2, opbra + 1) > (int)sizeof(buff2))
{
fprintf(stderr, "pcregrep: Buffer overflow when parsing %s option\n",
op->long_name);
pcregrep_exit(2);
}
if (strncmp(arg, buff1, arglen) == 0 ||
strncmp(arg, buff2, arglen) == 0)
......
......@@ -6,7 +6,7 @@
and semantics are as close as possible to those of the Perl 5 language.
Written by Philip Hazel
Copyright (c) 1997-2017 University of Cambridge
Copyright (c) 1997-2018 University of Cambridge
-----------------------------------------------------------------------------
Redistribution and use in source and binary forms, with or without
......@@ -389,8 +389,8 @@ if (rc >= 0)
{
for (i = 0; i < (size_t)rc; i++)
{
pmatch[i].rm_so = ovector[i*2] + so;
pmatch[i].rm_eo = ovector[i*2+1] + so;
pmatch[i].rm_so = (ovector[i*2] < 0)? -1 : ovector[i*2] + so;
pmatch[i].rm_eo = (ovector[i*2+1] < 0)? -1: ovector[i*2+1] + so;
}
if (allocated_ovector) free(ovector);
for (; i < nmatch; i++) pmatch[i].rm_so = pmatch[i].rm_eo = -1;
......
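The regexec hunk above stops the start offset from being added to unset groups. A standalone sketch of the corrected mapping, using a hypothetical match_t in place of regmatch_t and a hypothetical function name (fill_matches): a negative ovector entry marks an unset group and must stay -1, otherwise a start offset of 5 would turn it into 4.

```c
#include <stddef.h>

/* Hypothetical stand-in for the POSIX regmatch_t structure. */
typedef struct { int rm_so; int rm_eo; } match_t;

/* Copies rc offset pairs from ovector into pmatch, shifting set groups by
   the start offset so and leaving unset groups at -1. Entries beyond rc are
   also marked unset. Assumes rc >= 0. */
void fill_matches(match_t *pmatch, size_t nmatch, const int *ovector,
                  int rc, int so)
{
  size_t i;
  for (i = 0; i < (size_t)rc && i < nmatch; i++)
  {
    pmatch[i].rm_so = (ovector[i*2] < 0) ? -1 : ovector[i*2] + so;
    pmatch[i].rm_eo = (ovector[i*2+1] < 0) ? -1 : ovector[i*2+1] + so;
  }
  for (; i < nmatch; i++) pmatch[i].rm_so = pmatch[i].rm_eo = -1;
}
```

With ovector {0,2,-1,-1,1,2}, rc 3 and a start offset of 5, group 1 is reported as (-1,-1) while groups 0 and 2 become (5,7) and (6,7).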
......@@ -4249,4 +4249,12 @@ backtracking verbs. --/
/(?=.*[A-Z])/I
"(?<=(a))\1?b"
ab
aaab
"(?=(a))\1?b"
ab
aaab
/-- End of testinput2 --/
......@@ -798,4 +798,10 @@
/(?<=\K\x{17f})/8G+
\x{17f}\x{17f}\x{17f}\x{17f}\x{17f}
/\C[^\v]+\x80/8
[AΏBŀC]
/\C[^\d]+\x80/8
[AΏBŀC]
/-- End of testinput5 --/
......
......@@ -14705,4 +14705,20 @@ No options
No first char
No need char
"(?<=(a))\1?b"
ab
0: b
1: a
aaab
0: ab
1: a
"(?=(a))\1?b"
ab
0: ab
1: a
aaab
0: ab
1: a
/-- End of testinput2 --/
......@@ -1942,4 +1942,12 @@ Need char = 'z'
0: \x{17f}
0+
/\C[^\v]+\x80/8
[AΏBŀC]
No match
/\C[^\d]+\x80/8
[AΏBŀC]
No match
/-- End of testinput5 --/
......