This issue was moved to a discussion.

Heritrix 3.4.0-SNAPSHOT-2022-03-08T19:15:59Z keeps pausing.. #569

Closed
cslr opened this issue Oct 16, 2023 · 1 comment
cslr commented Oct 16, 2023

Heritrix keeps pausing during crawls for no apparent reason. It may crawl something like 1GB and then pause, again and again.

The only error message I see is "WARNING politenessDelay unset", but I can't find documentation on how to set politenessDelay.

anjackson (Collaborator) commented

The politenessDelay warning is irritating but should not be the cause of the pause.

As far as I know, pausing can happen for a couple of reasons:

  1. You have a bean such as DiskSpaceMonitor configured, and it is pausing the crawl because you don't have enough disk space for the way that bean is configured.
  2. You have less than 5GB of free space and are using the default value of the FREE_DISK parameter for the underlying JE-BDB. See issue #340, "Disk usage is not within je.maxDisk or je.freeDisk limits and write operations are prohibited", for some information on that, but I personally would not recommend running Heritrix without plenty of disk space.
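
To rule out the second cause quickly, a small script can compare the free space on the crawl volume against the 5GB reserve mentioned above. This is only a sketch and not part of Heritrix: the function names and the `DEFAULT_FREE_DISK_BYTES` constant are made up for illustration, the reserve is assumed to be 5 GiB, and the path to check (the job directory, by assumption) is whatever you pass on the command line.

```python
import shutil
import sys

# Assumption: the default free-disk reserve discussed above, taken as 5 GiB.
DEFAULT_FREE_DISK_BYTES = 5 * 1024**3


def free_bytes(path: str) -> int:
    """Return the free bytes on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free


def has_headroom(path: str, reserve_bytes: int = DEFAULT_FREE_DISK_BYTES) -> bool:
    """Return True when free space on `path` exceeds the reserve."""
    return free_bytes(path) > reserve_bytes


if __name__ == "__main__":
    # Hypothetical usage: pass the crawl job directory as the first argument.
    job_dir = sys.argv[1] if len(sys.argv) > 1 else "."
    gib = free_bytes(job_dir) / 1024**3
    status = "above" if has_headroom(job_dir) else "at or below"
    print(f"{job_dir}: {gib:.1f} GiB free, {status} the 5 GiB reserve")
```

If this reports free space at or below the reserve on the volume holding the crawl state, the disk-space explanation becomes the most likely one.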

If neither of these is consistent with what you're seeing then it sounds like a bug. In which case, some more detailed logs might help.

@internetarchive internetarchive locked and limited conversation to collaborators May 15, 2024
@ato ato converted this issue into discussion #587 May 15, 2024
