feat: Add OrganizationSpansFrequencyStatsEndpoint for Comparison Workflows project #84765

Merged

Mitan merged 38 commits into master from dmitrii/stats-endpoint

Feb 10, 2025

+266 −0

Member

Mitan commented Feb 7, 2025

For comparison workflows, we need the count distribution of all attribute/value pairs (both string and numeric, though we'll focus on string attributes first). This PR adds a Sentry backend endpoint for retrieving these stats.
The protos required for this task were already merged here
The snuba RPC was merged here

More info on the project can be found here
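
As a rough illustration of what "count distribution of attribute/value pairs" means, here is a minimal sketch in plain Python. It is not the endpoint's implementation (the real counts come from the Snuba RPC); the function name `attribute_distributions`, the sample spans, and the in-memory approach are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def attribute_distributions(spans, max_buckets=10):
    """Count how often each value occurs for each string attribute.

    `spans` is an iterable of dicts mapping attribute name -> value.
    Returns {attribute: [(value, count), ...]}, keeping at most
    `max_buckets` of the most frequent values per attribute.
    """
    counts = defaultdict(Counter)
    for span in spans:
        for name, value in span.items():
            # Only string attributes are counted, mirroring the
            # "string first" scope described in the PR.
            if isinstance(value, str):
                counts[name][value] += 1
    return {name: c.most_common(max_buckets) for name, c in counts.items()}


# Hypothetical sample data, for illustration only.
spans = [
    {"browser": "chrome", "region": "us"},
    {"browser": "chrome", "region": "eu"},
    {"browser": "firefox", "region": "us", "duration": 12.5},
]
result = attribute_distributions(spans, max_buckets=2)
```

Here `result["browser"]` lists `chrome` before `firefox` because it occurs more often, and the numeric `duration` attribute is skipped.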

Mitan added 21 commits

January 29, 2025 18:09


          (wip): add the OrganizationSpansFrequencyStatsEndpoint class skeleton.

e02ee7c


          Merge branch 'master' into dmitrii/stats-endpoint

b65d86c


          (wip): add metadata for the request.

5b014be


          (feat): add sample request

b86867e


          fix: minor import

5eb8fdc


          feat: add snuba rpc helper.

95ba11b


          feat: added implementation with mocked snuba rpc call result.

f9449a9


          Merge branch 'master' into dmitrii/stats-endpoint

63a3a2a


          move code to a separate file.

5a9df69


          bump protos version.

56dd532


          fix start and end time.

1f9aa63


          comment hardcoded responses.

4900fef


          Merge remote-tracking branch 'origin/master' into dmitrii/stats-endpoint

104a1d3


          pass through filter parameter

c12d390


          pass through StatsParams

d52f3e9


          add parameter validation

51827ae


          change referrer

fd78656


          cleanup

6afb0cf


          add tests skeleton class

9ced913


          add more tests.

2309ef5


          add check for max buckets value

2fc0212

Mitan linked an issue

that may be closed by this pull request

(workflows) Build a Sentry endpoint to query the stats getsentry/seer#1815

Closed

github-actions bot added the Scope: Backend label

vercel bot deployed to Preview

February 7, 2025 05:16

View deployment


          Merge remote-tracking branch 'origin/master' into dmitrii/stats-endpoint

4db71d4

vercel bot deployed to Preview

February 7, 2025 05:41

View deployment

Mitan assigned shruthilayaj

codecov bot commented Feb 7, 2025

❌ 3 Tests Failed:

Tests completed	Failed	Passed	Skipped
23473	3	23470	289

View the top 3 failed test(s) by shortest run time

tests.sentry.api.endpoints.test_organization_spans_fields_stats.OrganizationSpansFieldsStatsEndpointTest::test_max_buckets

Stack Traces | 3.04s run time

.../api/endpoints/test_organization_spans_fields_stats.py:120: in test_max_buckets
    assert response.status_code == 200, response.data
E   AssertionError: {'detail': 'Internal Error', 'errorId': None}
E   assert 500 == 200
E    +  where 500 = <Response status_code=500, "application/json">.status_code

tests.sentry.api.endpoints.test_organization_spans_fields_stats.OrganizationSpansFieldsStatsEndpointTest::test_distribution_values

Stack Traces | 3.81s run time

.../api/endpoints/test_organization_spans_fields_stats.py:140: in test_distribution_values
    assert response.status_code == 200, response.data
E   AssertionError: {'detail': 'Internal Error', 'errorId': None}
E   assert 500 == 200
E    +  where 500 = <Response status_code=500, "application/json">.status_code

tests.sentry.api.endpoints.test_organization_spans_fields_stats.OrganizationSpansFieldsStatsEndpointTest::test_filter_query

Stack Traces | 4.21s run time

.../api/endpoints/test_organization_spans_fields_stats.py:163: in test_filter_query
    assert response.status_code == 200, response.data
E   AssertionError: {'detail': 'Internal Error', 'errorId': None}
E   assert 500 == 200
E    +  where 500 = <Response status_code=500, "application/json">.status_code


Mitan changed the title ~~Dmitrii/stats endpoint~~ feat: Add OrganizationSpansFrequencyStatsEndpoint for Comparison Workflows project


          fix mypy

aae2c26

Mitan requested a review from a team as a code owner

February 7, 2025 16:37


          add test for fitler.

c49d920

vercel bot deployed to Preview

February 7, 2025 19:47

View deployment

shruthilayaj reviewed

View reviewed changes

src/sentry/api/endpoints/organization_spans_frequency_stats.py Outdated

            {"attributeDistributions": []}  # Empty response matching the expected structure
        )
    serializer = OrganizationSpansFieldsEndpointSerializer(data=request.GET)

Member

shruthilayaj Feb 7, 2025

Why are we validating our request with this serializer? Looks like you're already doing the dataset validation below and this serializer contains fields not pertinent to this endpoint

Member Author

Mitan Feb 7, 2025

thanks, added a serializer for this endpoint

src/sentry/api/endpoints/organization_spans_frequency_stats.py Outdated

    
    class OrganizationSpansFrequencyStatsEndpoint(OrganizationEventsV2EndpointBase):
        snuba_methods = ["GET"]
        publish_status = {
            "GET": ApiPublishStatus.PRIVATE,

Member

shruthilayaj Feb 7, 2025

Let's create a ticket to document this API as a follow up! This API should likely be a public one eventually so it would be good to document it while you have the context!

src/sentry/utils/snuba_rpc.py Outdated Show resolved Hide resolved

src/sentry/api/urls.py Outdated Show resolved Hide resolved


          suggestions from code review

8a42fb7

Co-authored-by: Shruthi <[email protected]>

vercel bot deployed to Preview

February 7, 2025 21:22

View deployment

Mitan added 2 commits

February 7, 2025 16:53


          adjust serializer

5d024d2


          Merge remote-tracking branch 'origin/master' into dmitrii/stats-endpoint

059c1ef

vercel bot deployed to Preview

February 7, 2025 22:09

View deployment

shruthilayaj reviewed

View reviewed changes

src/sentry/api/endpoints/organization_spans_frequency_stats.py Outdated Show resolved Hide resolved

src/sentry/api/endpoints/organization_spans_frequency_stats.py Outdated Show resolved Hide resolved

src/sentry/api/endpoints/organization_spans_frequency_stats.py Outdated

+                  dataset = serializers.ChoiceField(["spans", "spansIndexed"], required=False, default="spans")
+                  # if values are not provided, we will use zeros and then snuba RPC will set the defaults
+                  # Top number of frequencies to return for each attribute, defaults in snuba to 10 and can't be more than 100
+                  max_buckets = serializers.IntegerField(required=False, min_value=0, max_value=100, default=0)

Member

shruthilayaj Feb 10, 2025

let's pick a more reasonable default - like 5 or 10?

src/sentry/api/endpoints/organization_spans_frequency_stats.py Outdated

+                  # Top number of frequencies to return for each attribute, defaults in snuba to 10 and can't be more than 100
+                  max_buckets = serializers.IntegerField(required=False, min_value=0, max_value=100, default=0)
+                  # Total number of attributes to return, defaults in snuba to 10_000
+                  max_attributes = serializers.IntegerField(required=False, min_value=0, default=0)

Member

shruthilayaj Feb 10, 2025

any reason we're setting a 0 default? let's just not set anything here for default since it defeats the purpose of making it a not required field
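
The point about optional fields can be sketched in plain Python, without Django REST Framework. This is not the merged serializer; `validate_params` and its error messages are hypothetical, and only the bounds (`max_buckets` between 0 and 100, `max_attributes` at least 0) are taken from the diff above. The idea shown is that an optional parameter with no default is simply absent from the validated data, so the downstream defaults can apply.

```python
MAX_BUCKETS_LIMIT = 100  # upper bound quoted in the diff comment


def validate_params(query):
    """Validate optional integer query parameters.

    Returns a dict containing only the parameters the caller supplied,
    so absent keys can fall through to whatever defaults apply downstream.
    """
    # (min, max) per parameter; None means no upper bound.
    limits = {
        "max_buckets": (0, MAX_BUCKETS_LIMIT),
        "max_attributes": (0, None),
    }
    validated = {}
    for name, (low, high) in limits.items():
        if name not in query:
            continue  # optional and no default: leave it unset
        try:
            value = int(query[name])
        except (TypeError, ValueError):
            raise ValueError(f"{name} must be an integer")
        if value < low or (high is not None and value > high):
            raise ValueError(f"{name} is out of range")
        validated[name] = value
    return validated


nothing_given = validate_params({})
buckets_given = validate_params({"max_buckets": "5"})
```

With this shape, `nothing_given` is an empty dict rather than a dict of zeros, and an out-of-range value such as `max_buckets=101` raises instead of being passed through.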

src/sentry/api/endpoints/organization_spans_frequency_stats.py Outdated Show resolved Hide resolved

src/sentry/api/endpoints/organization_spans_frequency_stats.py Outdated



    class OrganizationSpansFrequencyStatsEndpointSerializer(serializers.Serializer):
        dataset = serializers.ChoiceField(["spans", "spansIndexed"], required=False, default="spans")

Member

shruthilayaj Feb 10, 2025

spansIndexed should not be a valid choice here

Member Author

Mitan Feb 10, 2025

then i can just remove dataset parameter, since technically we don't need it at all


          rename endpoint to SpansFieldsStats

c35547e

vercel bot deployed to Preview

February 10, 2025 18:41

View deployment

Mitan added 2 commits

February 10, 2025 14:01


          add a feature flag

5b0d44d


          Merge remote-tracking branch 'origin/master' into dmitrii/stats-endpoint

56d0753

vercel bot deployed to Preview

February 10, 2025 19:06

View deployment


          adjust params

61ae980

vercel bot deployed to Preview

February 10, 2025 19:14

View deployment


          fix tests

8567d9a

vercel bot deployed to Preview

February 10, 2025 20:11

View deployment


          Merge remote-tracking branch 'origin/master' into dmitrii/stats-endpoint

19bfbb9

sentaur-athena reviewed

View reviewed changes

src/sentry/api/endpoints/organization_spans_fields_stats.py Outdated

    
    class OrganizationSpansFieldsStatsEndpoint(OrganizationEventsV2EndpointBase):
        snuba_methods = ["GET"]
        publish_status = {
            "GET": ApiPublishStatus.PRIVATE,

Member

sentaur-athena Feb 10, 2025

Just to confirm, private means this endpoint will never be useful for customers calling our APIs directly and only used in Sentry frontend. Is that right? If it is marked as private only because it's not stable yet, please make it EXPERIMENTAL

Member Author

Mitan Feb 10, 2025

thanks, yes, we can mark it as experimental
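
The distinction drawn in this thread can be sketched as follows. `PublishStatus` is a hypothetical stand-in for `ApiPublishStatus`, and `is_internal_only` is an illustrative helper, not something from the Sentry codebase; the meanings attached to each status follow the reviewer's description above.

```python
from enum import Enum


class PublishStatus(Enum):
    """Hypothetical stand-in for ApiPublishStatus."""

    PRIVATE = "private"            # only ever used by the Sentry frontend
    EXPERIMENTAL = "experimental"  # may become public, not yet stable
    PUBLIC = "public"


class SpansFieldsStatsEndpoint:
    # Per the review, "not stable yet" maps to EXPERIMENTAL, not PRIVATE.
    publish_status = {"GET": PublishStatus.EXPERIMENTAL}


def is_internal_only(endpoint, method):
    """Return True if the given method is marked as frontend-only."""
    return endpoint.publish_status.get(method) is PublishStatus.PRIVATE


internal = is_internal_only(SpansFieldsStatsEndpoint, "GET")
```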


          change referrer

d894e4a

vercel bot deployed to Preview

February 10, 2025 22:27

View deployment


          make API experimental

070e87d

vercel bot deployed to Preview

February 10, 2025 22:38

View deployment

shruthilayaj approved these changes

View reviewed changes

Member

shruthilayaj left a comment

lgtm, just a small comment

src/sentry/api/endpoints/organization_spans_fields_stats.py Outdated

    
    stats_type = StatsType(
        attribute_distributions=AttributeDistributionsRequest(
            max_buckets=serialized["max_buckets"],
            max_attributes=serialized.get("max_attributes", 0),

Member

shruthilayaj Feb 10, 2025

don't default to 0 here, just set it to None

Suggested change

-    max_attributes=serialized.get("max_attributes", 0),
+    max_attributes=serialized.get("max_attributes"),
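
A small sketch of the difference between the two lines in this suggestion. `DistributionRequest` is a hypothetical dataclass standing in for `AttributeDistributionsRequest`, and the `serialized` dict is made-up sample data; how the real protobuf message treats `None` is not shown here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DistributionRequest:
    """Hypothetical stand-in for AttributeDistributionsRequest."""

    max_buckets: Optional[int] = None
    max_attributes: Optional[int] = None


serialized = {"max_buckets": 10}  # max_attributes was not supplied

# Original line: a missing key silently becomes an explicit 0.
with_zero = DistributionRequest(
    max_buckets=serialized["max_buckets"],
    max_attributes=serialized.get("max_attributes", 0),
)

# Suggested line: a missing key stays None, i.e. "not set".
with_none = DistributionRequest(
    max_buckets=serialized["max_buckets"],
    max_attributes=serialized.get("max_attributes"),
)
```

With `dict.get` and no second argument, a missing key yields `None`, so "the caller did not ask for a limit" stays distinguishable from "the caller asked for 0".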

src/sentry/api/endpoints/organization_spans_fields_stats.py Outdated

    
    @region_silo_endpoint
    class OrganizationSpansFieldsStatsEndpoint(OrganizationEventsV2EndpointBase):
        snuba_methods = ["GET"]

Member

shruthilayaj Feb 10, 2025

can remove


          minors

ac4b39b

vercel bot deployed to Preview

February 10, 2025 22:56

View deployment

Mitan merged commit ae8a52b into master

49 checks passed

Mitan deleted the dmitrii/stats-endpoint branch

February 10, 2025 23:30
