Improve LogProcessingWorker._fetch_events() performance #94
Thanks for the detailed analysis and ideas.
I think I will soon work on the last item, giving the queue a maxsize and handling a full queue accordingly, as this might improve the overall behavior. For the other options, though, it is unlikely I will be able to spend much time on this, as it would probably also require refactoring a lot of code for what may be only a special use case.
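For illustration, here is a minimal sketch of what a bounded queue with explicit full-queue handling could look like; the `maxsize` value, the `enqueue_event` helper and the drop-and-count policy are assumptions for the example, not the library's actual behavior:

```python
from queue import Full, PriorityQueue

# Hypothetical sketch: bound the in-memory queue and decide explicitly what
# happens when it fills up, instead of letting it grow without limit.
queue = PriorityQueue(maxsize=10000)   # assumed limit, not a library default
dropped_events = 0

def enqueue_event(priority, event):
    """Try to enqueue an event; drop it (and count the drop) when the queue is full."""
    global dropped_events
    try:
        queue.put_nowait((priority, event))
    except Full:
        # Alternative policies: block with a timeout, or log a warning instead.
        dropped_events += 1
```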
Hi,
I'm using python-logstash-async in a very particular context:
First, let's summarize what `LogProcessingWorker._fetch_events()` does. It sequentially processes the following steps (a simplified sketch follows the list):

- It fetches the events from the in-memory queue (`self._queue = PriorityQueue()`) and writes them to the database.
- It retrieves the events queued in the DB: `self._fetch_queued_events_for_flush()` => `self._database.get_queued_events()`.
- It sends the events using `self._send_events(events)`.
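As a rough illustration only (the method names not quoted above are invented for the example), the whole flow runs sequentially in a single worker thread:

```python
# Simplified sketch of the sequential flow described above, not the real
# implementation: every cycle goes queue -> sqlite -> network in one thread.
def worker_loop(queue, database, transport):
    while True:
        event = queue.get()                     # take one event from the in-memory queue
        database.insert(event)                  # persist it to the sqlite database
        events = database.get_queued_events()   # read back the events queued for flush
        transport.send(events)                  # send them over the network
        database.delete_queued_events()         # delete the events flagged as sent
```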
So now, what are the issues?
That can be easily solved by adding:
But this solution is not good enough, because while the events are being stored in the DB and sent over the network, new events keep accumulating in the queue.
In fact, the queue is never emptied because writing the events to the database takes too much time.
One option would be to retrieve the events from the queue in batches (not possible with the current `PriorityQueue()` implementation); a sketch of this idea follows below. Another would be to write the events to the DB from one or several additional async threads.
Writing to the sqlite DB via `import sqlite3` does not allow concurrent access from multiple threads, so the code would have to be rewritten using `import aiosqlite`, and the queue should maybe become a `multiprocessing.Queue()`.
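A minimal aiosqlite sketch of what an asynchronous batched write could look like; the table and column names are illustrative, not the library's actual schema:

```python
import aiosqlite

async def store_events(db_path: str, events: list) -> None:
    # Sketch only: insert a batch of serialized events in a single transaction
    # without blocking the event loop (table/column names are made up here).
    async with aiosqlite.connect(db_path) as db:
        await db.executemany(
            "INSERT INTO event (event_text, pending_delete) VALUES (?, 0)",
            [(event,) for event in events],
        )
        await db.commit()
```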
type.The same way the events should be sent using another async thread one at a time in order to not delete not yet sent events from database with flag
pending_delete=1
.So
self._database.delete_queued_events()
should not be call when a sending suceed but it should be a requirement before a new sending or when the app shutdown.If I resume, basicaly my proposal is to have :
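Here is a hedged sketch of that deferred-deletion ordering only (the `flush_cycle`/`on_shutdown` helpers and the `transport` object are invented for the example; only `get_queued_events()` and `delete_queued_events()` come from the discussion above):

```python
# Sketch of the proposed ordering: clean up previously sent events *before*
# the next send and at shutdown, instead of right after each successful send.
def flush_cycle(database, transport):
    database.delete_queued_events()         # remove events already flagged pending_delete=1
    events = database.get_queued_events()   # fetch (and flag) the next batch to send
    if events:
        transport.send(events)              # deletion of this batch is deferred

def on_shutdown(database):
    database.delete_queued_events()         # final cleanup when the app shuts down
```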
It should improve the performance and make python-logstash-async work better in the field.
Of course, any other ideas are welcome!