Commit 4a62dd59 authored by Chris McDonough

Collector #2348 - doc_href method fails to include query strings in detected URLs.  Fixed two regexes.  Thanks to "datagrok".
parent 2c262eed
@@ -951,8 +951,8 @@ class DocumentClass:
## Some constants to make the doc_href() regex easier to read.
_DQUOTEDTEXT = r'("[ %s0-9\n\r\-\.\,\;\(\)\/\:\/\*\']+")' % letters ## double quoted text
-_ABSOLUTE_URL=r'((http|https|ftp|mailto|file|about)[:/]+?[%s0-9_\@\.\,\?\!\/\:\;\-\#\~]+)' % letters
-_ABS_AND_RELATIVE_URL=r'([%s0-9_\@\.\,\?\!\/\:\;\-\#\~]+)' % letters
+_ABSOLUTE_URL=r'((http|https|ftp|mailto|file|about)[:/]+?[%s0-9_\@\.\,\?\!\/\:\;\-\#\~\=\?]+)' % letters
+_ABS_AND_RELATIVE_URL=r'([%s0-9_\@\.\,\?\!\/\:\;\-\#\~\=\?]+)' % letters
_SPACES = r'(\s*)'
def doc_href(self, s,
......
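The effect of the change to `_ABSOLUTE_URL` can be checked in isolation. The sketch below copies the old and new patterns from the diff and runs them against a made-up URL. It assumes `letters` is equivalent to `string.ascii_letters`. The names `OLD_ABSOLUTE_URL`, `NEW_ABSOLUTE_URL`, and the example URL are illustrative and do not come from the commit. Since `\?` was already in the old character class, the practical difference in this sketch is the added `\=`.

```python
import re
from string import ascii_letters as letters  # assumption: stands in for the module's `letters`

# Patterns copied from the diff: before and after the commit.
OLD_ABSOLUTE_URL = r'((http|https|ftp|mailto|file|about)[:/]+?[%s0-9_\@\.\,\?\!\/\:\;\-\#\~]+)' % letters
NEW_ABSOLUTE_URL = r'((http|https|ftp|mailto|file|about)[:/]+?[%s0-9_\@\.\,\?\!\/\:\;\-\#\~\=\?]+)' % letters

url = "http://example.com/search?q=zope"  # hypothetical URL with a query string

# The old class has no '=', so the match stops right before it.
old_match = re.match(OLD_ABSOLUTE_URL, url).group(1)
# The new class accepts '=', so the whole URL is captured.
new_match = re.match(NEW_ABSOLUTE_URL, url).group(1)

print(old_match)
print(new_match)
```

Neither character class contains `&`, so under the same assumptions a URL with several query parameters would still be cut off at the first `&`.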