Commit
Update link to robots.txt meta documentation
Kristinn Sigurðsson authored Sep 2, 2020
1 parent 33a3f9a commit adac067
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion README.md
@@ -10,7 +10,7 @@ Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-

## Crawl Operators!

-Heritrix is designed to respect the [`robots.txt`](http://www.robotstxt.org/wc/robots.html) exclusion directives and [META robots tags](http://www.robotstxt.org/wc/exclusion.html#meta). Please consider the
+Heritrix is designed to respect the [`robots.txt`](http://www.robotstxt.org/wc/robots.html) exclusion directives and [META robots tags](http://www.robotstxt.org/meta.html). Please consider the
load your crawl will place on seed sites and set politeness policies accordingly. Also, always identify your crawl with contact information in the `User-Agent` so sites that may be adversely affected by your crawl can contact you or adapt their server behavior accordingly.

## Getting Started
