[WIP]Celery spider #156

Draft
wants to merge 37 commits into
base: master
@cronosnull cronosnull commented Feb 5, 2020

The celery version of the spider is a rewrite of the current spider using Celery/Redis to make it scalable in a containerized environment (such as k8s).

The docker-compose file will create the appropriate services and run the application. Before running it, you will need to adjust the paths in the secrets section.

Design decisions:
Each worker container runs only one Celery worker. Scale by creating new pods.

Work in progress, it still needs:

  • ES support (Done)
  • Update the documentation
  • Adjust logging
  • Additional testing
  • Parameter tuning
  • Move the validation schema to the shared volume.

For testing purposes, using 4 spider workers and one instance of each of the other pods, we get execution times similar to the current spider.

cronosnull added 12 commits May 28, 2020 15:59
Remove unused imports
A docker-compose configuration for a Celery-based spider.
The tests/celery_test.py script queries all the schedd queues and sends messages to the test broker.
# How to run:

```bash 
docker-compose up --scale spider-worker=3
```
As we changed the user id in the dockerfile before the rebase, we should update the docker-compose file too.
Adding the support for the history queries.
These modules are being replaced by the Celery tasks.
* Adding support to ES

* [Celery]Adding some documentation and small style changes.

* [Celery][ES] Fixing the index assignment.

* [Celery][ES]Moving the post ads to a new task
To improve performance (related to the AMQ frequency), we can move the process that sends data to ES into its own task, as it can otherwise affect the rate at which we send data to AMQ.

* [Celery][ES]Changing the format of the conf file
It turns out that, after a restart of the worker process, the worker tried to serialize the first message as JSON and failed because ClassAds are not JSON serializable. Explicitly setting the serializer on the tasks prevents this.
K8S (and OpenStack) add some environment variables to the containers, so we need to change the naming to avoid conflicts (e.g. Flower uses a FLOWER_PORT variable, which conflicts with the {POD_NAME}_PORT variable in K8S).
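A minimal sketch of the serialization failure described above, using only the standard library. `types.SimpleNamespace` is a hypothetical stand-in for a ClassAd, and pickle is assumed as the alternative serializer; the commit does not say which serializer was chosen.

```python
import json
import pickle
from types import SimpleNamespace

# Hypothetical stand-in for an HTCondor ClassAd: an object, not a plain dict.
ad = SimpleNamespace(GlobalJobId="schedd1#42.0", JobStatus=2)
payload = {"ads": [ad]}

# The JSON encoder has no rule for arbitrary objects, so this raises TypeError.
try:
    json.dumps(payload)
    json_ok = True
except TypeError:
    json_ok = False

# pickle can round-trip the same payload.
restored = pickle.loads(pickle.dumps(payload))
pickle_ok = restored["ads"][0] == ad

# With Celery, the per-task equivalent would look something like
#   @app.task(serializer="pickle")
# with "pickle" also listed in the app's accept_content setting.
```

Setting the serializer per task, rather than relying on the app-wide default, means the choice does not depend on when the worker loads its configuration.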
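The clash can be illustrated with a small sketch. It assumes the docker-link style variables that Kubernetes injects for each Service (`{NAME}_SERVICE_HOST`, `{NAME}_SERVICE_PORT`, `{NAME}_PORT`); the IP address, the `read_int_option` helper, and the renamed service `spider-flower` are hypothetical.

```python
def k8s_service_env(service_name, cluster_ip, port):
    """Build the docker-link style variables Kubernetes injects for a Service."""
    prefix = service_name.upper().replace("-", "_")
    return {
        f"{prefix}_SERVICE_HOST": cluster_ip,
        f"{prefix}_SERVICE_PORT": str(port),
        f"{prefix}_PORT": f"tcp://{cluster_ip}:{port}",
    }


def read_int_option(env, name, default):
    """Read an integer option from the environment, as a CLI tool might."""
    raw = env.get(name)
    return default if raw is None else int(raw)


# A Service called "flower" produces FLOWER_PORT=tcp://..., not a port number.
env = k8s_service_env("flower", "10.0.0.12", 5555)
try:
    read_int_option(env, "FLOWER_PORT", 5555)
    clash = False
except ValueError:
    clash = True

# Renaming the Service (hypothetical name) moves the injected variables
# out of the FLOWER_ namespace, so the tool falls back to its default.
safe_env = k8s_service_env("spider-flower", "10.0.0.12", 5555)
safe_port = read_int_option(safe_env, "FLOWER_PORT", 5555)
```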
@cronosnull cronosnull force-pushed the CelerySpider branch 4 times, most recently from e26d6dc to f0073b6 Compare May 28, 2020 18:57
@cronosnull cronosnull force-pushed the CelerySpider branch 4 times, most recently from 4020976 to 8099372 Compare June 2, 2020 09:45
Creation of the share folder and cleanup of the environment in the affiliation cron (as it will not need most of the secrets)
As it was before, the affiliation manager would be created on first module load (which would cause problems in the k8s setup, as it may not exist).
Reducing the required resources, adding a shared Redis storage, and adding the RuntimeError exception to the list of exceptions to retry for in the schedd query.
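As a rough illustration of retrying a schedd query on `RuntimeError`, here is a plain-Python sketch. The `retry_on` decorator, the `query_schedd` body, and the host name are hypothetical; in Celery the same intent is usually expressed with the task options `autoretry_for=(RuntimeError,)` and `retry_kwargs={"max_retries": 3}`.

```python
import functools


def retry_on(exceptions, max_retries=3):
    """Re-run the wrapped function when it raises one of `exceptions`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == max_retries:
                        raise  # out of retries: surface the error
        return wrapper
    return decorator


calls = {"n": 0}


@retry_on((RuntimeError,), max_retries=3)
def query_schedd(name):
    """Hypothetical query that fails twice before succeeding."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("schedd did not answer")
    return f"{name}: ok"


result = query_schedd("schedd1.example.org")
```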
It turns out that the problem in the worker was caused by flower configuration.
@cronosnull cronosnull force-pushed the CelerySpider branch 3 times, most recently from 42fe429 to 25cf493 Compare June 4, 2020 14:33
With multiple queues, tasks of different types can be served in parallel.
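One way to picture the queue split is a routing table that maps task names to queues, similar in spirit to Celery's `task_routes` setting (which also accepts glob patterns). The task and queue names below are hypothetical, not the ones used in this branch.

```python
from fnmatch import fnmatch

# Hypothetical task names and queues; first matching pattern wins.
TASK_ROUTES = {
    "spider.tasks.query_schedd": "query",
    "spider.tasks.process_docs": "convert",
    "spider.tasks.es.*": "es_post",
}


def queue_for(task_name, routes=TASK_ROUTES, default="celery"):
    """Return the queue a task would be sent to ("celery" is Celery's default queue name)."""
    for pattern, queue in routes.items():
        if fnmatch(task_name, pattern):
            return queue
    return default


assignments = {
    name: queue_for(name)
    for name in ("spider.tasks.query_schedd", "spider.tasks.es.post_ads", "other.task")
}
```

Each worker can then consume a single queue, so a backlog of one task type does not block the others.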
@cronosnull cronosnull force-pushed the CelerySpider branch 2 times, most recently from c969024 to 83516ca Compare June 4, 2020 21:30
Making part of the process synchronous (parallelizing only by schedd) creates fewer and smaller messages and, with a small number of workers, performs better.
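A minimal sketch of the per-schedd idea, with a thread pool standing in for the Celery workers. The `query`, `convert`, and `process_schedd` functions and the schedd names are hypothetical; the point is that only the schedd name crosses the task boundary, while the query and conversion run synchronously inside one task.

```python
from concurrent.futures import ThreadPoolExecutor


def query(schedd):
    """Hypothetical schedd query returning three job ads."""
    return [{"schedd": schedd, "job": i} for i in range(3)]


def convert(ad):
    """Hypothetical conversion of a job ad into an output document."""
    return {**ad, "converted": True}


def process_schedd(schedd):
    """One task per schedd: query and convert in the same call."""
    docs = [convert(ad) for ad in query(schedd)]
    return schedd, len(docs)


schedds = ["schedd-a", "schedd-b", "schedd-c"]

# One small message (the schedd name) per task, instead of one per batch of ads.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = dict(pool.map(process_schedd, schedds))
```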
@cronosnull cronosnull force-pushed the CelerySpider branch 3 times, most recently from f562be7 to 920bb35 Compare June 9, 2020 14:03
In order to remove the most common warning from the log we can try to solve it (and then, ignore it).
Rolling back to prefetch multiplier 1 (the waiting time of the messages is lower as the task duration is uneven)
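The effect of the prefetch setting on waiting time can be seen in a toy simulation. This is a simplified model and not Celery's actual scheduler: each worker runs one task at a time and reserves up to `prefetch` tasks, all tasks are queued at time zero, and the durations are made up. The corresponding Celery setting is `worker_prefetch_multiplier`.

```python
import heapq
from collections import deque


def mean_wait(durations, workers, prefetch):
    """Mean time tasks wait before starting, in a toy prefetch model."""
    pending = deque(enumerate(durations))
    buffers = [deque() for _ in range(workers)]
    # Initial fill: hand out tasks round-robin until each buffer holds `prefetch`.
    for _ in range(prefetch):
        for buf in buffers:
            if pending:
                buf.append(pending.popleft())
    start = {}
    events = []  # heap of (finish_time, worker_index)
    for w, buf in enumerate(buffers):
        if buf:
            idx, dur = buf[0]
            start[idx] = 0
            heapq.heappush(events, (dur, w))
    while events:
        now, w = heapq.heappop(events)
        buffers[w].popleft()  # the running task finished
        if pending:
            buffers[w].append(pending.popleft())  # refill the freed slot
        if buffers[w]:
            idx, dur = buffers[w][0]
            start[idx] = now
            heapq.heappush(events, (now + dur, w))
    return sum(start.values()) / len(start)


durations = [10, 1, 1, 1, 1, 1]  # one long task followed by short ones
wait_prefetch_1 = mean_wait(durations, workers=2, prefetch=1)
wait_prefetch_2 = mean_wait(durations, workers=2, prefetch=2)
```

In this toy run, a short task reserved behind the long one has to wait for it when `prefetch` is 2, whereas with `prefetch` set to 1 the idle worker picks it up, so the mean wait is lower.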
This will make it easier to update the schema in the Docker image, as it will be automatically updated on image build (which is triggered by a commit to the repository).