-
-
Notifications
You must be signed in to change notification settings - Fork 160
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve performance of drawing methods #2080
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd appreciate explanations/code comments to better understand these changes, so that this can be reviewed faster
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, this makes a reasonable amount of sense to me. Lines off the edges of surfaces are being clipped early so we don't walk the whole breshnam's line algorithm on them.
Visual tests and CI tests all look fine with the changes and I can confirm the speed ups of the bad case in some performance testing.
Test program:
from sys import stdout
from pstats import Stats
from cProfile import Profile
import pygame
pygame.init()
def draw_line_test():
lines = [
((-100, -100), (512, 512)),
((128, -1000), (128, 1000)),
((-1000, 50), (1000, 200)),
((256, 256), (0, 0)),
((128, 256), (128, 0)),
((-1280, 128), (128, 50)),
]
widths = [1, 2, 4, 6, 20]
surf = pygame.Surface((256, 256), pygame.SRCALPHA, 32)
surf.fill((255, 255, 255, 255))
for iterations in range(0, 1000):
for line in lines:
for width in widths:
pygame.draw.line(
surf, pygame.Color("red"), line[0], line[1], width=width
)
if __name__ == "__main__":
print("Draw Line")
profiler = Profile()
profiler.runcall(draw_line_test)
stats = Stats(profiler, stream=stdout)
stats.strip_dirs()
stats.sort_stats("cumulative")
stats.print_stats()
Results before this PR:
Draw Line
30003 function calls in 0.184 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.031 0.031 0.184 0.184 draw_line_performance.py:11(draw_line_test)
30000 0.153 0.000 0.153 0.000 {built-in method pygame.draw.line}
1 0.000 0.000 0.000 0.000 {method 'fill' of 'pygame.surface.Surface' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
Results after this PR:
Draw Line
30003 function calls in 0.128 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.029 0.029 0.128 0.128 draw_line_performance.py:11(draw_line_test)
30000 0.099 0.000 0.099 0.000 {built-in method pygame.draw.line}
1 0.000 0.000 0.000 0.000 {method 'fill' of 'pygame.surface.Surface' objects}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
It looks like a 50% speedup in this test with a mix of large overlap lines and regular lines.
However, it looks like there is a cost for the case of lines entirely within the surface of doing the extra clip testing. Running the same test above, but this time with only lines entirely within the bounds of the surface runs at half the speed after this PR.
I'm not sure if it is worth the performance trade off to speed up 'off-surface' line drawing at the expense of on surface line drawing.
Anyway, I'd like to hear more thoughts and see if anyone else sees similar results when testing this.
I've expanded my testing cases a bit further to figure out exactly what is happening: Test program: from sys import stdout
from pstats import Stats
from cProfile import Profile
import pygame
pygame.init()
def draw_line_short_inside():
lines = [
((256, 256), (0, 0)),
((128, 256), (128, 0)),
((50, 256), (128, 0)),
((0, 240), (12, 56)),
((50, 40), (60, 11)),
]
widths = [1, 2, 4, 6, 20]
surf = pygame.Surface((256, 256), pygame.SRCALPHA, 32)
surf.fill((255, 255, 255, 255))
for iterations in range(0, 10000):
for line in lines:
for width in widths:
pygame.draw.line(
surf, pygame.Color("red"), line[0], line[1], width=width
)
def draw_line_long_inside():
lines = [
((1256, 256), (0, 0)),
((1128, 256), (128, 0)),
((150, 1256), (128, 0)),
((0, 240), (1200, 56)),
((50, 40), (1060, 11)),
]
widths = [1, 2, 4, 6, 20]
surf = pygame.Surface((2048, 2048), pygame.SRCALPHA, 32)
surf.fill((255, 255, 255, 255))
for iterations in range(0, 10000):
for line in lines:
for width in widths:
pygame.draw.line(
surf, pygame.Color("red"), line[0], line[1], width=width
)
def draw_line_short_outside():
lines = [
((256, 256), (0, 0)),
((128, 256), (128, 0)),
((50, 256), (128, 0)),
((0, 240), (12, 56)),
((50, 40), (80, 11)),
]
widths = [1, 2, 4, 6, 20]
surf = pygame.Surface((64, 64), pygame.SRCALPHA, 32)
surf.fill((255, 255, 255, 255))
for iterations in range(0, 10000):
for line in lines:
for width in widths:
pygame.draw.line(
surf, pygame.Color("red"), line[0], line[1], width=width
)
def draw_line_long_outside():
lines = [
((2256, 256), (0, 0)),
((2128, 256), (128, 0)),
((150, 4256), (128, 0)),
((0, 240), (2200, 56)),
((50, 40), (4060, 11)),
]
widths = [1, 2, 4, 6, 20]
surf = pygame.Surface((1024, 1024), pygame.SRCALPHA, 32)
surf.fill((255, 255, 255, 255))
for iterations in range(0, 10000):
for line in lines:
for width in widths:
pygame.draw.line(
surf, pygame.Color("red"), line[0], line[1], width=width
)
if __name__ == "__main__":
print("Draw Line - short, inside surface")
profiler = Profile()
profiler.runcall(draw_line_short_inside)
stats = Stats(profiler, stream=stdout)
stats.strip_dirs()
stats.sort_stats("cumulative")
stats.print_stats()
print("\nDraw Line - long, inside surface")
profiler = Profile()
profiler.runcall(draw_line_long_inside)
stats = Stats(profiler, stream=stdout)
stats.strip_dirs()
stats.sort_stats("cumulative")
stats.print_stats()
print("\nDraw Line - short, edges outside surface")
profiler = Profile()
profiler.runcall(draw_line_short_outside)
stats = Stats(profiler, stream=stdout)
stats.strip_dirs()
stats.sort_stats("cumulative")
stats.print_stats()
print("\nDraw Line - long, edges a long way outside surface")
profiler = Profile()
profiler.runcall(draw_line_long_outside)
stats = Stats(profiler, stream=stdout)
stats.strip_dirs()
stats.sort_stats("cumulative")
stats.print_stats() Which produces these results on the current main branch, running on my system (Windows 11 with an AMD Ryzen 5 3600X CPU - with SDL 2.26.5, & Python 3.11.1):
And these results on this PR branch:
So you can see that it is just the short lines that don't get clipped at all that are losing performance. Which makes sense if the clipping code takes a certain set chunk of execution time that then gives benefits per line pixel. In the case of the short lines with no clipping the benefits don't accrue enough to pay for the cost of the clipping. |
Thanks a bunch for the test results, I'll work on refactoring the code so that it can perform equivalent or better for small lines (and overall if possible). |
src_c/draw.c
Outdated
while (x1 != end) { | ||
if (x1 != exit) { | ||
set_and_check_rect(surf, x1, y1, color, drawn_area); | ||
} | ||
else | ||
break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while (x1 != end) { | |
if (x1 != exit) { | |
set_and_check_rect(surf, x1, y1, color, drawn_area); | |
} | |
else | |
break; | |
while (!(x1 == end || x1 == exit)) { | |
set_and_check_rect(surf, x1, y1, color, drawn_area); | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't help with the speed at all, but is a little bit cleaner to read.
Merging this with main seems to have boosted the short lines inside surface performance in my testing. This is what I get on this branch this morning (tried several different runs):
So perhaps some other changes went into draw in the mean time and that is what was showing the difference earlier this week? Multiple runs show a bit of variation from the randomness, but the pattern is consistent in that only the long edges, a long way outside the surface are significantly changed by this PR - and they are changed for the better (+30% faster). I left a little code clean-up suggestion as well. But in general I would be happy to approve this on overall performance grounds |
Performance of the latest commits is looking good:
Seems to have got about 5% faster across all these tests since I last tested it. |
Improves the speed of methods that use draw_line_width, draw_line and draw_fillpoly when the points provided are not inside the surface being drawn on.