forked from osTicket/osTicket
Large dataset performance #11
Open
protich wants to merge 20 commits into greezybacon:issue/large-dataset-performance from protich:issue/large-dataset-performance
Conversation
The joins for the queues should not yield duplicate records, so distinct counts should not be necessary. Dropping the DISTINCT saves the overhead of sorting the records to be counted, which is only needed to ensure that duplicated rows are not counted multiple times.
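As a rough illustration of the point above (not the code from this PR), the following sketch uses Python's `sqlite3` as a stand-in for MySQL. The two-table schema and the function names are hypothetical. Because every ticket joins to exactly one status row, the plain count and the distinct count agree, so the DISTINCT only adds work.

```python
import sqlite3


def build_count_demo_db():
    """Create a tiny in-memory schema (hypothetical, not osTicket's real tables)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE status (id INTEGER PRIMARY KEY, state TEXT);
        CREATE TABLE ticket (ticket_id INTEGER PRIMARY KEY, status_id INTEGER);
        INSERT INTO status VALUES (1, 'open'), (2, 'closed');
        INSERT INTO ticket VALUES (1, 1), (2, 1), (3, 2), (4, 1);
    """)
    return db


def queue_counts(db):
    """Return (plain_count, distinct_count) for the 'open' queue."""
    # Each ticket matches exactly one status row, so the join cannot
    # duplicate tickets and both counts must be equal.
    base = ("FROM ticket t JOIN status s ON s.id = t.status_id "
            "WHERE s.state = 'open'")
    plain = db.execute("SELECT COUNT(*) " + base).fetchone()[0]
    distinct = db.execute(
        "SELECT COUNT(DISTINCT t.ticket_id) " + base).fetchone()[0]
    return plain, distinct
```

The equivalence only holds while the join path is one-to-one or many-to-one from the ticket's side; a join that can match several rows per ticket would still need the distinct count.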
This removes two or three joins from queries which check ticket access, such as those on the queue pages, by checking the object_type and object_id directly. In general, this makes it easier for the MySQL query optimizer to realize that the joins are not necessary at all. It would be nice if the ORM recognized that a join made only to check the primary key value of the foreign table can stop one table short in the join path.
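The last sentence above can be sketched in isolation. This is a minimal illustration of the general idea only, using `sqlite3` in place of MySQL and a hypothetical `ticket`/`staff` schema; it does not reproduce the actual access-check queries or the object_type/object_id columns touched by the commit. Filtering on the foreign key column gives the same rows as joining to the foreign table and filtering on its primary key, assuming every foreign key value refers to an existing row.

```python
import sqlite3


def build_join_demo_db():
    """Hypothetical schema: each ticket points at one staff row."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE staff (staff_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE ticket (
            ticket_id INTEGER PRIMARY KEY,
            staff_id INTEGER REFERENCES staff(staff_id)
        );
        INSERT INTO staff VALUES (1, 'alice'), (2, 'bob');
        INSERT INTO ticket VALUES (10, 1), (11, 2), (12, 1);
    """)
    return db


def tickets_via_join(db, staff_id):
    """Join to staff just to compare its primary key."""
    rows = db.execute(
        "SELECT t.ticket_id FROM ticket t "
        "JOIN staff s ON s.staff_id = t.staff_id "
        "WHERE s.staff_id = ? ORDER BY t.ticket_id", (staff_id,))
    return [row[0] for row in rows]


def tickets_direct(db, staff_id):
    """Stop one table short: compare the foreign key column itself."""
    rows = db.execute(
        "SELECT ticket_id FROM ticket "
        "WHERE staff_id = ? ORDER BY ticket_id", (staff_id,))
    return [row[0] for row in rows]
```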
If an SQL statement has annotations but uses no aggregate functions (such as SUM or COUNT), then a GROUP BY clause is not technically required. Using one implies sorting the results to ensure uniqueness before sorting them again according to the requested sort in the ORDER BY clause.
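To make the GROUP BY point concrete, here is a small sketch with `sqlite3` standing in for MySQL; the table, the `is_open` annotation, and the function names are hypothetical. The annotated query has no aggregate, so grouping by the primary key returns exactly the same rows as leaving the GROUP BY out.

```python
import sqlite3


def build_groupby_demo_db():
    """Hypothetical ticket table with a creation date to sort on."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE ticket (
            ticket_id INTEGER PRIMARY KEY,
            status_id INTEGER,
            created TEXT
        );
        INSERT INTO ticket VALUES
            (1, 1, '2020-01-03'),
            (2, 2, '2020-01-01'),
            (3, 1, '2020-01-02');
    """)
    return db


def annotated_rows(db, with_group_by):
    """Select an annotation (is_open) with or without a GROUP BY."""
    sql = "SELECT ticket_id, (status_id = 1) AS is_open FROM ticket"
    if with_group_by:
        # One group per primary key: it cannot merge any rows.
        sql += " GROUP BY ticket_id"
    sql += " ORDER BY created"
    return db.execute(sql).fetchall()
```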
On large datasets (like >1M tickets), MySQL can misjudge which index will provide the best performance. Generally, as systems age, they accumulate significantly more closed tickets than open ones. It should therefore be safe to assume that scanning the `status_id` index on the ticket table for `open` tickets is the fastest way to arrive at the short list of tickets which may need to be aged.
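How the commit steers MySQL toward the `status_id` index is not shown in this conversation. As a sketch of the general technique, the example below pins a query to a named index and inspects the resulting plan. It uses SQLite's `INDEXED BY` clause purely so that it runs without a server (MySQL's counterpart would be an index hint such as `FORCE INDEX`); the table layout, the `duedate` column, and the index names are assumptions.

```python
import sqlite3


def aging_query_plan():
    """Return the query plan text for an 'aging' lookup pinned to one index."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE ticket (
            ticket_id INTEGER PRIMARY KEY,
            status_id INTEGER,
            duedate TEXT
        );
        CREATE INDEX idx_ticket_status ON ticket (status_id);
        CREATE INDEX idx_ticket_duedate ON ticket (duedate);
    """)
    rows = db.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT ticket_id FROM ticket INDEXED BY idx_ticket_status "
        "WHERE status_id = 1 AND duedate < '2024-01-01'").fetchall()
    # The human-readable detail is the last column of each plan row.
    return " ".join(row[-1] for row in rows)
```

Pinning an index trades optimizer freedom for predictability, which is only a good deal when the data distribution (few open tickets, many closed ones) is as stable as the commit message assumes.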
If APCu is available, then the queue counts can be cached between requests. They are automatically cleared and recalculated if the status of a ticket changes or if a queue or saved search is edited. Otherwise, the queue counts will expire after an hour and be recalculated anyway.
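The caching behaviour described above can be sketched as follows. This is an in-process Python stand-in, not APCu and not the PR's PHP code; the class name, method names, and the injectable clock are hypothetical. Counts are reused until either the one-hour lifetime passes or the cache is cleared, for example after a ticket status change or a queue edit.

```python
import time


class QueueCountCache:
    """Stand-in for an APCu-style cache of queue counts with a time-to-live."""

    def __init__(self, ttl_seconds=3600, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock          # injectable so expiry can be tested
        self._entries = {}          # queue_id -> (count, expires_at)

    def get(self, queue_id, compute):
        """Return the cached count, calling compute() on a miss or expiry."""
        now = self.clock()
        entry = self._entries.get(queue_id)
        if entry is not None and entry[1] > now:
            return entry[0]
        count = compute()
        self._entries[queue_id] = (count, now + self.ttl)
        return count

    def clear(self):
        """Drop all counts, e.g. when a ticket status or a queue changes."""
        self._entries.clear()
```

Clearing everything on any change is the simplest invalidation rule; it costs one recount per queue afterwards but never serves a count that is known to be stale.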
This changes the queue counts shown at the bottom of the page so that they are no longer calculated using the SQL_CALC_FOUND_ROWS method of MySQL, which is very slow for large recordsets. Instead, a rough count is computed based on the total number of tickets in the queue without respect for staff access. This is the fastest way to get the maximum number of tickets that could possibly be shown. The pagination interface should be changed to show only NEXT and PREVIOUS pages, where the rough estimate can be used to give a rough idea of whether or not another page of data would be available. Furthermore, if APCu is available, the rough count is stashed and kept between requests so that the rough counts do not need to be re-tallied until a ticket state change would alter them. Another optimization might be to increment and decrement the queue rough counts when tickets are created or change states. In such a case, it could be identified which queues the old ticket would have been in (and decrement those counts) and which queues the updated ticket would be in (and increment those counts).
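A minimal sketch of the two ideas in this message, under stated assumptions: `page_links` decides whether to offer PREVIOUS and NEXT from a rough total, and `apply_ticket_change` adjusts rough counts when a ticket moves between queues. Both function names and their signatures are hypothetical and are not taken from the PR.

```python
def page_links(page, page_size, rough_total):
    """Return (has_previous, has_next) for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    has_previous = page > 1
    # rough_total is an upper bound, so NEXT may occasionally lead to a
    # page with fewer rows than expected (or none) for a given agent.
    has_next = page * page_size < rough_total
    return has_previous, has_next


def apply_ticket_change(rough_counts, old_queues, new_queues):
    """Adjust per-queue rough counts in place after a ticket changes.

    old_queues / new_queues are sets of queue ids the ticket matched
    before and after the change (both empty-safe).
    """
    for queue_id in old_queues - new_queues:
        rough_counts[queue_id] = rough_counts.get(queue_id, 0) - 1
    for queue_id in new_queues - old_queues:
        rough_counts[queue_id] = rough_counts.get(queue_id, 0) + 1
    return rough_counts
```

Queues that the ticket matches both before and after the change are left untouched, so only the set differences need to be computed.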
For its own reasons, MySQL seems to pick a better index when the join between ticket and user is a left join.
Prefer the agent's queue count over the rough count when paginating tickets. This makes the initial queue load expensive but has the added advantage of making queue counts available thereafter for drop-downs. This commit also adds an entry to auto-cron to keep queue counts more up to date in the background. When APCu is not available, SESSION is used to cache the counts.
This adds an advanced option to the queue sort configuration: an index can be specified for use in the sorting operation. In some cases, the MySQL query optimizer cannot select the most efficient index when dealing with large querysets and sorting. This feature, if enabled, allows an administrator to specify an index which MySQL should use when applying the sort. To use the feature, an `extra` column must be added to the `%queue_sort` table to receive the index name.
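The effect of such an option on the generated SQL might look roughly like the sketch below. It is written in Python for illustration and is not the PR's implementation: the function name is hypothetical, and the choice of MySQL's `USE INDEX` hint (rather than `FORCE INDEX` or a `FOR ORDER BY` variant) is an assumption. Identifiers are checked against a strict pattern because hint names cannot be passed as bound parameters.

```python
import re

# Conservative identifier check: letters, digits and underscores only.
_IDENTIFIER = re.compile(r"[A-Za-z0-9_]+")


def build_sorted_query(table, order_column, index_name=None):
    """Build a SELECT with an optional MySQL index hint for the sort."""
    for name in (table, order_column):
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError("unsafe identifier: %r" % (name,))
    hint = ""
    if index_name is not None:
        if not _IDENTIFIER.fullmatch(index_name):
            raise ValueError("unsafe index name: %r" % (index_name,))
        # Assumed hint form; the PR may emit a different one.
        hint = " USE INDEX (`%s`)" % index_name
    return "SELECT * FROM `%s`%s ORDER BY `%s`" % (table, hint, order_column)
```

With no index configured the query is left unhinted, which matches the description of the feature as opt-in.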
…e/large-dataset-performance
This is useful for avoiding a blank page caused by `getCount` on a queue.
827d39f to dcc58bc
…e/large-dataset-performance

Conflicts:
	include/class.orm.php
	include/class.search.php
	include/staff/templates/queue-tickets.tmpl.php
Please see last commit and let's chat when you get a minute.