Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of drawing methods #2080

Draft
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

Temmie3754
Copy link
Member

@Temmie3754 Temmie3754 commented Mar 30, 2023

Improves the speed of methods that use draw_line_width, draw_line and draw_fillpoly when the points provided are not inside the surface being drawn on.

@Temmie3754 Temmie3754 requested a review from a team as a code owner March 30, 2023 20:17
Copy link
Member

@ankith26 ankith26 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd appreciate explanations/code comments to better understand these changes, so that this can be reviewed faster

@Temmie3754 Temmie3754 marked this pull request as draft April 13, 2023 20:38
@Temmie3754 Temmie3754 marked this pull request as ready for review April 19, 2023 10:40
@Temmie3754 Temmie3754 requested a review from ankith26 April 26, 2023 08:57
@yunline yunline added Performance Related to the speed or resource usage of the project draw pygame.draw labels Jun 4, 2023
Copy link
Member

@MyreMylar MyreMylar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, this makes a reasonable amount of sense to me. Lines off the edges of surfaces are being clipped early so we don't walk the whole breshnam's line algorithm on them.

Visual tests and CI tests all look fine with the changes and I can confirm the speed ups of the bad case in some performance testing.

Test program:

from sys import stdout
from pstats import Stats
from cProfile import Profile


import pygame

pygame.init()


def draw_line_test():
    lines = [
        ((-100, -100), (512, 512)),
        ((128, -1000), (128, 1000)),
        ((-1000, 50), (1000, 200)),
        ((256, 256), (0, 0)),
        ((128, 256), (128, 0)),
        ((-1280, 128), (128, 50)),
    ]
    widths = [1, 2, 4, 6, 20]
    surf = pygame.Surface((256, 256), pygame.SRCALPHA, 32)
    surf.fill((255, 255, 255, 255))
    for iterations in range(0, 1000):
        for line in lines:
            for width in widths:
                pygame.draw.line(
                    surf, pygame.Color("red"), line[0], line[1], width=width
                )


if __name__ == "__main__":
    print("Draw Line")
    profiler = Profile()
    profiler.runcall(draw_line_test)
    stats = Stats(profiler, stream=stdout)
    stats.strip_dirs()
    stats.sort_stats("cumulative")
    stats.print_stats()

Results before this PR:

Draw Line
         30003 function calls in 0.184 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.031    0.031    0.184    0.184 draw_line_performance.py:11(draw_line_test)
    30000    0.153    0.000    0.153    0.000 {built-in method pygame.draw.line}
        1    0.000    0.000    0.000    0.000 {method 'fill' of 'pygame.surface.Surface' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Results after this PR:

Draw Line
         30003 function calls in 0.128 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.029    0.029    0.128    0.128 draw_line_performance.py:11(draw_line_test)
    30000    0.099    0.000    0.099    0.000 {built-in method pygame.draw.line}
        1    0.000    0.000    0.000    0.000 {method 'fill' of 'pygame.surface.Surface' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

It looks like a 50% speedup in this test with a mix of large overlap lines and regular lines.

However, it looks like there is a cost for the case of lines entirely within the surface of doing the extra clip testing. Running the same test above, but this time with only lines entirely within the bounds of the surface runs at half the speed after this PR.

I'm not sure if it is worth the performance trade off to speed up 'off-surface' line drawing at the expense of on surface line drawing.

Anyway, I'd like to hear more thoughts and see if anyone else sees similar results when testing this.

@MyreMylar
Copy link
Member

MyreMylar commented Oct 9, 2023

I've expanded my testing cases a bit further to figure out exactly what is happening:

Test program:

from sys import stdout
from pstats import Stats
from cProfile import Profile


import pygame

pygame.init()


def draw_line_short_inside():
    lines = [
        ((256, 256), (0, 0)),
        ((128, 256), (128, 0)),
        ((50, 256), (128, 0)),
        ((0, 240), (12, 56)),
        ((50, 40), (60, 11)),
    ]
    widths = [1, 2, 4, 6, 20]
    surf = pygame.Surface((256, 256), pygame.SRCALPHA, 32)
    surf.fill((255, 255, 255, 255))
    for iterations in range(0, 10000):
        for line in lines:
            for width in widths:
                pygame.draw.line(
                    surf, pygame.Color("red"), line[0], line[1], width=width
                )


def draw_line_long_inside():
    lines = [
        ((1256, 256), (0, 0)),
        ((1128, 256), (128, 0)),
        ((150, 1256), (128, 0)),
        ((0, 240), (1200, 56)),
        ((50, 40), (1060, 11)),
    ]
    widths = [1, 2, 4, 6, 20]
    surf = pygame.Surface((2048, 2048), pygame.SRCALPHA, 32)
    surf.fill((255, 255, 255, 255))
    for iterations in range(0, 10000):
        for line in lines:
            for width in widths:
                pygame.draw.line(
                    surf, pygame.Color("red"), line[0], line[1], width=width
                )


def draw_line_short_outside():
    lines = [
        ((256, 256), (0, 0)),
        ((128, 256), (128, 0)),
        ((50, 256), (128, 0)),
        ((0, 240), (12, 56)),
        ((50, 40), (80, 11)),
    ]
    widths = [1, 2, 4, 6, 20]
    surf = pygame.Surface((64, 64), pygame.SRCALPHA, 32)
    surf.fill((255, 255, 255, 255))
    for iterations in range(0, 10000):
        for line in lines:
            for width in widths:
                pygame.draw.line(
                    surf, pygame.Color("red"), line[0], line[1], width=width
                )


def draw_line_long_outside():
    lines = [
        ((2256, 256), (0, 0)),
        ((2128, 256), (128, 0)),
        ((150, 4256), (128, 0)),
        ((0, 240), (2200, 56)),
        ((50, 40), (4060, 11)),
    ]
    widths = [1, 2, 4, 6, 20]
    surf = pygame.Surface((1024, 1024), pygame.SRCALPHA, 32)
    surf.fill((255, 255, 255, 255))
    for iterations in range(0, 10000):
        for line in lines:
            for width in widths:
                pygame.draw.line(
                    surf, pygame.Color("red"), line[0], line[1], width=width
                )


if __name__ == "__main__":
    print("Draw Line - short, inside surface")
    profiler = Profile()
    profiler.runcall(draw_line_short_inside)
    stats = Stats(profiler, stream=stdout)
    stats.strip_dirs()
    stats.sort_stats("cumulative")
    stats.print_stats()

    print("\nDraw Line - long, inside surface")
    profiler = Profile()
    profiler.runcall(draw_line_long_inside)
    stats = Stats(profiler, stream=stdout)
    stats.strip_dirs()
    stats.sort_stats("cumulative")
    stats.print_stats()

    print("\nDraw Line - short, edges outside surface")
    profiler = Profile()
    profiler.runcall(draw_line_short_outside)
    stats = Stats(profiler, stream=stdout)
    stats.strip_dirs()
    stats.sort_stats("cumulative")
    stats.print_stats()

    print("\nDraw Line - long, edges a long way outside surface")
    profiler = Profile()
    profiler.runcall(draw_line_long_outside)
    stats = Stats(profiler, stream=stdout)
    stats.strip_dirs()
    stats.sort_stats("cumulative")
    stats.print_stats()

Which produces these results on the current main branch, running on my system (Windows 11 with an AMD Ryzen 5 3600X CPU - with SDL 2.26.5, & Python 3.11.1):

Draw Line - short, inside surface
         250003 function calls in 1.056 seconds - main faster

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.247    0.247    1.056    1.056 draw_line_performance_2.py:11(draw_line_short_inside)
   250000    0.809    0.000    0.809    0.000 {built-in method pygame.draw.line}


Draw Line - long, inside surface
         250003 function calls in 5.984 seconds 

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.258    0.258    5.984    5.984 draw_line_performance_2.py:30(draw_line_long_inside)
   250000    5.725    0.000    5.725    0.000 {built-in method pygame.draw.line}


Draw Line - short, edges outside surface
         250003 function calls in 0.505 seconds

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.244    0.244    0.505    0.505 draw_line_performance_2.py:49(draw_line_short_outside)
   250000    0.261    0.000    0.261    0.000 {built-in method pygame.draw.line}


Draw Line - long, edges a long way outside surface
         250003 function calls in 6.891 seconds

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.251    0.251    6.891    6.891 draw_line_performance_2.py:68(draw_line_long_outside)
   250000    6.640    0.000    6.640    0.000 {built-in method pygame.draw.line}

And these results on this PR branch:

Draw Line - short, inside surface
         250003 function calls in 1.669 seconds

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.581    0.581    1.669    1.669 draw_line_performance_2.py:11(draw_line_short_inside)
   250000    1.088    0.000    1.088    0.000 {built-in method pygame.draw.line}


Draw Line - long, inside surface
         250003 function calls in 5.496 seconds

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.249    0.249    5.496    5.496 draw_line_performance_2.py:30(draw_line_long_inside)
   250000    5.246    0.000    5.246    0.000 {built-in method pygame.draw.line}


Draw Line - short, edges outside surface
         250003 function calls in 0.416 seconds

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.227    0.227    0.416    0.416 draw_line_performance_2.py:49(draw_line_short_outside)
   250000    0.189    0.000    0.189    0.000 {built-in method pygame.draw.line}


Draw Line - long, edges a long way outside surface
         250003 function calls in 4.910 seconds

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.251    0.251    4.910    4.910 draw_line_performance_2.py:68(draw_line_long_outside)
   250000    4.659    0.000    4.659    0.000 {built-in method pygame.draw.line}

So you can see that it is just the short lines that don't get clipped at all that are losing performance. Which makes sense if the clipping code takes a certain set chunk of execution time that then gives benefits per line pixel. In the case of the short lines with no clipping the benefits don't accrue enough to pay for the cost of the clipping.

@Temmie3754
Copy link
Member Author

Thanks a bunch for the test results, I'll work on refactoring the code so that it can perform equivalent or better for small lines (and overall if possible).

@Temmie3754 Temmie3754 marked this pull request as draft October 12, 2023 08:40
src_c/draw.c Outdated
Comment on lines 1793 to 1798
while (x1 != end) {
if (x1 != exit) {
set_and_check_rect(surf, x1, y1, color, drawn_area);
}
else
break;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
while (x1 != end) {
if (x1 != exit) {
set_and_check_rect(surf, x1, y1, color, drawn_area);
}
else
break;
while (!(x1 == end || x1 == exit)) {
set_and_check_rect(surf, x1, y1, color, drawn_area);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't help with the speed at all, but is a little bit cleaner to read.

@MyreMylar
Copy link
Member

Merging this with main seems to have boosted the short lines inside surface performance in my testing.

This is what I get on this branch this morning (tried several different runs):

Draw Line - short, inside surface
         250003 function calls in 1.059 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.254    0.254    1.059    1.059 draw_line_performance_2.py:11(draw_line_short_inside)
   250000    0.805    0.000    0.805    0.000 {built-in method pygame.draw.line}
        1    0.000    0.000    0.000    0.000 {method 'fill' of 'pygame.surface.Surface' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}



Draw Line - long, inside surface
         250003 function calls in 5.830 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.265    0.265    5.830    5.830 draw_line_performance_2.py:30(draw_line_long_inside)
   250000    5.564    0.000    5.564    0.000 {built-in method pygame.draw.line}
        1    0.001    0.001    0.001    0.001 {method 'fill' of 'pygame.surface.Surface' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}



Draw Line - short, edges outside surface
         250003 function calls in 0.450 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.247    0.247    0.450    0.450 draw_line_performance_2.py:49(draw_line_short_outside)
   250000    0.203    0.000    0.203    0.000 {built-in method pygame.draw.line}
        1    0.000    0.000    0.000    0.000 {method 'fill' of 'pygame.surface.Surface' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}



Draw Line - long, edges a long way outside surface
         250003 function calls in 4.990 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.261    0.261    4.990    4.990 draw_line_performance_2.py:68(draw_line_long_outside)
   250000    4.729    0.000    4.729    0.000 {built-in method pygame.draw.line}
        1    0.000    0.000    0.000    0.000 {method 'fill' of 'pygame.surface.Surface' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

So perhaps some other changes went into draw in the mean time and that is what was showing the difference earlier this week?

Multiple runs show a bit of variation from the randomness, but the pattern is consistent in that only the long edges, a long way outside the surface are significantly changed by this PR - and they are changed for the better (+30% faster).

I left a little code clean-up suggestion as well. But in general I would be happy to approve this on overall performance grounds

@MyreMylar
Copy link
Member

MyreMylar commented Oct 26, 2023

Performance of the latest commits is looking good:

Draw Line - short, inside surface
         250003 function calls in 1.022 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.253    0.253    1.022    1.022 draw_line_performance_2.py:11(draw_line_short_inside)
   250000    0.769    0.000    0.769    0.000 {built-in method pygame.draw.line}



Draw Line - long, inside surface
         250003 function calls in 5.594 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.266    0.266    5.594    5.594 draw_line_performance_2.py:30(draw_line_long_inside)
   250000    5.327    0.000    5.327    0.000 {built-in method pygame.draw.line}



Draw Line - short, edges outside surface
         250003 function calls in 0.451 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.253    0.253    0.451    0.451 draw_line_performance_2.py:49(draw_line_short_outside)
   250000    0.198    0.000    0.198    0.000 {built-in method pygame.draw.line}


Draw Line - long, edges a long way outside surface
         250003 function calls in 4.862 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.262    0.262    4.862    4.862 draw_line_performance_2.py:68(draw_line_long_outside)
   250000    4.600    0.000    4.600    0.000 {built-in method pygame.draw.line}

Seems to have got about 5% faster across all these tests since I last tested it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
draw pygame.draw Performance Related to the speed or resource usage of the project
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants