Commit
urlwatch 2.29
thp committed Oct 28, 2024
1 parent 707a62f commit cbc8eb0
Showing 11 changed files with 167 additions and 44 deletions.
2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -4,7 +4,7 @@ All notable changes to this project will be documented in this file.

The format mostly follows [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).

## UNRELEASED
## [2.29] -- 2024-10-28

### Added

4 changes: 2 additions & 2 deletions docs/source/conf.py
@@ -23,11 +23,11 @@
# -- Project information -----------------------------------------------------

project = 'urlwatch'
copyright = '2023 Thomas Perl'
copyright = '2024 Thomas Perl'
author = 'Thomas Perl'

# The full version, including alpha/beta/rc tags
release = '2.28'
release = '2.29'


# -- General configuration ---------------------------------------------------
2 changes: 1 addition & 1 deletion lib/urlwatch/__init__.py
@@ -12,5 +12,5 @@
__author__ = 'Thomas Perl <[email protected]>'
__license__ = 'BSD'
__url__ = 'https://thp.io/2008/urlwatch/'
__version__ = '2.28'
__version__ = '2.29'
__user_agent__ = '%s/%s (+https://thp.io/2008/urlwatch/info.html)' % (pkgname, __version__)
18 changes: 14 additions & 4 deletions share/man/man1/urlwatch.1
@@ -27,7 +27,7 @@ level margin: \\n[rst2man-indent\\n[rst2man-indent-level]]
.\" new: \\n[rst2man-indent\\n[rst2man-indent-level]]
.in \\n[rst2man-indent\\n[rst2man-indent-level]]u
..
.TH "URLWATCH" "1" "May 03, 2023" "urlwatch 2.28" "urlwatch 2.28 Documentation"
.TH "URLWATCH" "1" "Oct 28, 2024" "urlwatch " "urlwatch Documentation"
.SH NAME
urlwatch \- Monitor webpages and command output for changes
.SH SYNOPSIS
@@ -50,8 +50,9 @@ This manpage describes the CLI tool.
.INDENT 7.0
.TP
.B JOB
index of job(s) to run, as numbered according to the \-\-list command.
If none are specified, then all jobs will be run.
indexes or tags of job(s) to run.
If \-\-tags is set, each JOB is a tag,
if not, each JOB is an index numbered according to the \-\-list command.
.UNINDENT
.TP
.B optional arguments:
@@ -60,6 +61,9 @@ If none are specified, then all jobs will be run.
.B \-h\fP,\fB \-\-help
show this help message and exit
.TP
.B \-\-tags
use tags instead of indexes to select jobs to run
.TP
.B \-\-version
show program\(aqs version number and exit
.TP
@@ -110,6 +114,12 @@ add job (key1=value1,key2=value2,...)
.TP
.BI \-\-delete \ JOB
delete job by location or index
.TP
.BI \-\-enable \ JOB
enable job by location or index
.TP
.BI \-\-disable \ JOB
delete job by location or index
.UNINDENT
.INDENT 7.0
.TP
@@ -185,6 +195,6 @@ Thomas Perl <\fI\%https://thp.io/\fP>
.sp
\fI\%https://thp.io/2008/urlwatch/\fP
.SH COPYRIGHT
2023 Thomas Perl
2024 Thomas Perl
.\" Generated by docutils manpage writer.
.
9 changes: 5 additions & 4 deletions share/man/man5/urlwatch-config.5
@@ -27,7 +27,7 @@ level margin: \\n[rst2man-indent\\n[rst2man-indent-level]]
.\" new: \\n[rst2man-indent\\n[rst2man-indent-level]]
.in \\n[rst2man-indent\\n[rst2man-indent-level]]u
..
.TH "URLWATCH-CONFIG" "5" "May 03, 2023" "" "urlwatch"
.TH "URLWATCH-CONFIG" "5" "Oct 28, 2024" "urlwatch " "urlwatch Documentation"
.SH NAME
urlwatch-config \- Configuration of urlwatch behavior
.SH SYNOPSIS
@@ -85,7 +85,7 @@ In addition to the reporter\-specific options, all reporters support these
options:
.INDENT 0.0
.IP \(bu 2
\fBenable\fP: \fI[bool]\fP Activate the reporter. (default: False)
\fBenabled\fP: \fI[bool]\fP Activate the reporter. (default: False)
.IP \(bu 2
\fBseparate\fP: \fI[bool]\fP Send a report for each job rather than a combined
report for all jobs. (default: False)
@@ -115,7 +115,8 @@ text
├───prowl
└───shell
markdown
└───matrix
├───matrix
└───gotify
.ft P
.fi
.UNINDENT
@@ -208,6 +209,6 @@ See \fI\%Jobs\fP about the different job kinds and what the possible keys are.
\fBurlwatch\-intro(7)\fP,
\fBurlwatch\-cookbook(7)\fP
.SH COPYRIGHT
2023 Thomas Perl
2024 Thomas Perl
.\" Generated by docutils manpage writer.
.
70 changes: 52 additions & 18 deletions share/man/man5/urlwatch-filters.5
@@ -27,7 +27,7 @@ level margin: \\n[rst2man-indent\\n[rst2man-indent-level]]
.\" new: \\n[rst2man-indent\\n[rst2man-indent-level]]
.in \\n[rst2man-indent\\n[rst2man-indent-level]]u
..
.TH "URLWATCH-FILTERS" "5" "May 03, 2023" "" "urlwatch"
.TH "URLWATCH-FILTERS" "5" "Oct 28, 2024" "urlwatch " "urlwatch Documentation"
.SH NAME
urlwatch-filters \- Filtering output and diff data of urlwatch jobs
.SH SYNOPSIS
@@ -126,6 +126,8 @@ At the moment, the following filters are built\-in:
.IP \(bu 2
\fBre.sub\fP: Replace text with regular expressions using Python\(aqs re.sub
.IP \(bu 2
\fBre.findall\fP: Find all non\-overlapping matches using Python\(aqs re.findall
.IP \(bu 2
\fBreverse\fP: Reverse input items
.IP \(bu 2
\fBsha1sum\fP: Calculate the SHA\-1 checksum of the content
@@ -612,9 +614,10 @@ project for the latest release version, to be notified of new releases:
.ft C
url: https://github.com/tulir/gomuks/releases
filter:
\- xpath: \(aq(//div[contains(@class,\(dqd\-flex flex\-column flex\-md\-row my\-5 flex\-justify\-center\(dq)]//h1//a)[1]\(aq
\- html2text: re
\- strip
\- xpath:
path: //*[@class=\(dqLink\-\-primary Link\(dq]
maxitems: 1
\- html2text:
.ft P
.fi
.UNINDENT
@@ -629,7 +632,7 @@ This is the corresponding version for Github tags:
url: https://github.com/thp/urlwatch/tags
filter:
\- xpath:
path: //*[@class=\(dqLink\-\-primary\(dq]
path: //*[@class=\(dqLink\-\-primary Link\(dq]
maxitems: 1
\- html2text:
.ft P
@@ -665,11 +668,12 @@ filter:
.fi
.UNINDENT
.UNINDENT
.SH REMOVE OR REPLACE TEXT USING REGULAR EXPRESSIONS
.SH FIND, REMOVE OR REPLACE TEXT USING REGULAR EXPRESSIONS
.sp
Just like Python’s \fBre.sub\fP function, there’s the possibility to apply
a regular expression and either remove of replace the matched text. The
following example applies the filter 3 times:
You can use \fBre.sub\fP and \fBre.findall\fP to apply regular expressions.
.sp
\fBre.sub\fP can be used to remove or replace all non\-overlapping instances
of matched text. The following example applies the filter 3 times:
.INDENT 0.0
.IP 1. 3
Just specifying a string as the value will replace the matches with
@@ -682,11 +686,7 @@ You can use groups (\fB()\fP) and back\-reference them with \fB\e1\fP
(etc..) to put groups into the replacement string.
.UNINDENT
.sp
All features are described in Python’s
\fI\%re.sub\fP <\fBhttps://docs.python.org/3/library/re.html#re.sub\fP>
documentation (the \fBpattern\fP and \fBrepl\fP values are passed to this
function as\-is, with the value of \fBrepl\fP defaulting to the empty
string).
\fBrepl\fP defaults to the empty string, which will remove matched strings.
.INDENT 0.0
.INDENT 3.5
.sp
@@ -706,9 +706,43 @@ filter:
.UNINDENT
.UNINDENT
.sp
If you want to enable certain flags (e.g. \fBre.MULTILINE\fP) in the
call, this is possible by inserting an \(dqinline flag\(dq documented in
\fI\%flags in re.compile\fP <\fBhttps://docs.python.org/3/library/re.html#re.compile\fP>, here are some examples:
\fBre.findall\fP can be used to find all non\-overlapping matches of a
regular expression. Each match is output on its own line. The following
example applies the filter twice:
.INDENT 0.0
.IP 1. 3
It uses a group (\fB()\fP) and back\-reference (\fB\e1\fP) to extract a
date from the input string.
.IP 2. 3
It breaks the numbers in the date out into separate lines.
.UNINDENT
.sp
If \fBrepl\fP is not specified, the full match will be included in the output.
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
url: https://example.com/regex\-findall.html
filter:
\- re.findall:
pattern: \(aqThe next draw is on (\ed{4}\-\ed{2}\-\ed{2}).\(aq
repl: \(aq\e1\(aq
\- re.findall: \(aq\ed+\(aq
.ft P
.fi
.UNINDENT
.UNINDENT
.sp
Note: When using HTML or XML, it is usually better to use CSS selectors or
XPATH expressions. HTML and XML \fI\%cannot be parsed\fP <\fBhttps://stackoverflow.com/a/1732454/1047040\fP> properly using regular
expressions. If the CSS selector or XPATH cannot provide the targeted
selection required, using an \fBhtml2text\fP filter first then using
\fBre.findall\fP can be a good pattern.
.sp
If you want to enable flags (e.g. \fBre.MULTILINE\fP) in \fBre.sub\fP
or \fBre.findall\fP filters, use an \(dqinline flag\(dq, here are some
examples:
.INDENT 0.0
.IP \(bu 2
\fBre.MULTILINE\fP: \fB(?m)\fP (Makes \fB^\fP match start\-of\-line and \fB$\fP match end\-of\-line)
@@ -892,6 +926,6 @@ more information on the operations permitted, see the \fI\%jq Manual\fP <\fBhttp
\fBurlwatch\-intro(5)\fP,
\fBurlwatch\-jobs(5)\fP
.SH COPYRIGHT
2023 Thomas Perl
2024 Thomas Perl
.\" Generated by docutils manpage writer.
.
20 changes: 15 additions & 5 deletions share/man/man5/urlwatch-jobs.5
@@ -27,7 +27,7 @@ level margin: \\n[rst2man-indent\\n[rst2man-indent-level]]
.\" new: \\n[rst2man-indent\\n[rst2man-indent-level]]
.in \\n[rst2man-indent\\n[rst2man-indent-level]]u
..
.TH "URLWATCH-JOBS" "5" "May 03, 2023" "" "urlwatch"
.TH "URLWATCH-JOBS" "5" "Oct 28, 2024" "urlwatch " "urlwatch Documentation"
.SH NAME
urlwatch-jobs \- Job types and configuration for urlwatch
.SH SYNOPSIS
@@ -90,9 +90,9 @@ Job\-specific optional keys:
.IP \(bu 2
\fBignore_cached\fP: Do not use cache control (ETag/Last\-Modified) values (true/false)
.IP \(bu 2
\fBhttp_proxy\fP: Proxy server to use for HTTP requests (might be http:// or socks5://)
\fBhttp_proxy\fP: Proxy server to use for HTTP requests (might be \fI\%http://\fP or socks5://)
.IP \(bu 2
\fBhttps_proxy\fP: Proxy server to use for HTTPS requests (might be http:// or socks5://)
\fBhttps_proxy\fP: Proxy server to use for HTTPS requests (might be \fI\%http://\fP or socks5://)
.IP \(bu 2
\fBheaders\fP: HTTP header to send along with the request
.IP \(bu 2
@@ -107,6 +107,8 @@ Job\-specific optional keys:
\fBignore_timeout_errors\fP: Do not report errors when the timeout is hit
.IP \(bu 2
\fBignore_too_many_redirects\fP: Ignore redirect loops (see \fI\%Advanced Topics\fP)
.IP \(bu 2
\fBignore_incomplete_reads\fP: Ignore incomplete HTTP responses (see \fI\%Advanced Topics\fP)
.UNINDENT
.sp
(Note: \fBurl\fP implies \fBkind: url\fP)
@@ -142,6 +144,10 @@ Job\-specific optional keys:
\fBwait_until\fP: Either \fBload\fP, \fBdomcontentloaded\fP, \fBnetworkidle\fP, or
\fBcommit\fP (see \fI\%Advanced Topics\fP)
.IP \(bu 2
\fBwait_for\fP: A CSS or XPath selector based on the
: \fI\%https://playwright.dev/python/docs/locators#locate\-by\-css\-or\-xpath\fP
spec. The job will wait for the default timeout of 30 seconds.
.IP \(bu 2
\fBuseragent\fP: \fBUser\-Agent\fP header used for requests (otherwise browser default is used)
.IP \(bu 2
\fBbrowser\fP: Either \fBchromium\fP, \fBchrome\fP, \fBchrome\-beta\fP, \fBmsedge\fP,
@@ -242,9 +248,11 @@ stderr: stdout
.IP \(bu 2
\fBname\fP: Human\-readable name/label of the job
.IP \(bu 2
\fBtags\fP: Array of tags, or a single tag as a string
.IP \(bu 2
\fBfilter\fP: \fI\%Filters\fP (if any) to apply to the output (can be tested with \fB\-\-test\-filter\fP)
.IP \(bu 2
\fBmax_tries\fP: Number of times to retry fetching the resource
\fBmax_tries\fP: After this many sequential failed runs, the error will be reported rather than ignored
.IP \(bu 2
\fBdiff_tool\fP: Command to a custom tool for generating diff text
.IP \(bu 2
@@ -257,6 +265,8 @@ stderr: stdout
\fBkind\fP (redundant): Either \fBurl\fP, \fBshell\fP or \fBbrowser\fP\&. Automatically derived from the unique key (\fBurl\fP, \fBcommand\fP or \fBnavigate\fP) of the job type
.IP \(bu 2
\fBuser_visible_url\fP: Different URL to show in reports (e.g. when watched URL is a REST API URL, and you want to show a webpage)
.IP \(bu 2
\fBenabled\fP: Can be set to false to disable an individual job (default is \fBtrue\fP)
.UNINDENT
.SH SETTING KEYS FOR ALL JOBS AT ONCE
.sp
@@ -276,6 +286,6 @@ See \fBurlwatch\-cookbook(7)\fP for example job configurations.
\fBurlwatch\-intro(5)\fP,
\fBurlwatch\-filters(5)\fP
.SH COPYRIGHT
2023 Thomas Perl
2024 Thomas Perl
.\" Generated by docutils manpage writer.
.
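
Note on the re.findall filter documented in share/man/man5/urlwatch-filters.5: the following is a minimal Python sketch of what the two-step filter chain from that man page would be expected to produce. It is not urlwatch's own code. `findall_filter` is a hypothetical helper, the sample input string is made up, and the assumption that `repl` is expanded once per match (defaulting to the full match) is inferred only from the man page wording.

```python
import re


def findall_filter(data, pattern, repl=r'\g<0>'):
    """Hypothetical stand-in for the documented re.findall filter.

    Finds all non-overlapping matches of `pattern` in `data`, expands the
    `repl` template for each match (the default template is the full
    match), and returns the results joined one per line.
    """
    return '\n'.join(m.expand(repl) for m in re.finditer(pattern, data))


# Made-up input resembling the page in the man page example.
text = 'The next draw is on 2024-10-28.'

# Step 1: extract the date through the group and the \1 back-reference.
date = findall_filter(text, r'The next draw is on (\d{4}-\d{2}-\d{2}).', r'\1')

# Step 2: break the numbers in the date out into separate lines.
parts = findall_filter(date, r'\d+')

print(date)
print(parts)
```

Under these assumptions, the first step leaves only the date and the second step puts each number of the date on its own line, matching the two numbered steps in the man page text.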