Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Folders are also considered for WalkWithObjects to adhere bulk insert failure on using GCS buckets #39150

Closed
wants to merge 1 commit into from

Conversation

gifi-siby
Copy link
Contributor

Related task: #39134

@sre-ci-robot sre-ci-robot added the size/S Denotes a PR that changes 10-29 lines. label Jan 10, 2025
Copy link
Contributor

mergify bot commented Jan 10, 2025

@gifi-siby Thanks for your contribution. Please submit with DCO, see the contributing guide https://github.com/milvus-io/milvus/blob/master/CONTRIBUTING.md#developer-certificate-of-origin-dco.

@mergify mergify bot added needs-dco DCO is missing in this pull request. kind/bug Issues or changes related a bug labels Jan 10, 2025
Copy link

codecov bot commented Jan 10, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 69.64%. Comparing base (75d7978) to head (1a3b062).

Current head 1a3b062 differs from pull request most recent head 0eeacbf

Please upload reports for the commit 0eeacbf to get more accurate results.

❗ There is a different number of reports uploaded between BASE (75d7978) and HEAD (1a3b062). Click for more details.

HEAD has 1 upload less than BASE
Flag BASE (75d7978) HEAD (1a3b062)
2 1
Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           master   #39150       +/-   ##
===========================================
- Coverage   81.03%   69.64%   -11.40%     
===========================================
  Files        1407      296     -1111     
  Lines      198527    26633   -171894     
===========================================
- Hits       160881    18548   -142333     
+ Misses      31987     8085    -23902     
+ Partials     5659        0     -5659     
Components Coverage Δ
Client ∅ <ø> (∅)
Core 69.64% <ø> (+0.15%) ⬆️
Go ∅ <ø> (∅)

see 1150 files with indirect coverage changes

@xiaofan-luan
Copy link
Collaborator

@gifi-siby

i think the implementation of gcp_native_object storage is wrong. becuase the test result of it is different from minio_object_storage test

	insertWithPrefixTests := []struct {
		recursive     bool
		prefix        string
		expectedValue []string
	}{
		{true, "abc/", []string{"abc/e/d"}},
		{true, "key_/", []string{"key_/1/1", "key_/2/3", "key_/test.txt"}},
		{false, "abc/", []string{}},
		{false, "key_/", []string{"key_/test.txt"}},
	}

	for _, test := range insertWithPrefixTests {
		t.Run(fmt.Sprintf("prefix: %s, recursive: %t", test.prefix, test.recursive), func(t *testing.T) {
			gotk, _, err := listAllObjectsWithPrefixAtBucket(ctx, testCM, config.bucketName,
				test.prefix, test.recursive)
			assert.NoError(t, err)
			assert.Equal(t, len(test.expectedValue), len(gotk))
			for _, key := range gotk {
				assert.Contains(t, test.expectedValue, key)
			}
		})
	}

this is the expected behaviour behaviour of minio storage

I think we need to modify gcp_native_object_storage to follow the behaviour of minio storage?

@gifi-siby
Copy link
Contributor Author

The prepared tests are different for minio and gcp_native and thus the expected results will be different right?

@xiaofan-luan
Copy link
Collaborator

The prepared tests are different for minio and gcp_native and thus the expected results will be different right?

that's the probably the core reason gcp_native object store is buggy.

We should get exact same test and test result for minio and gcp_native object store

@gifi-siby
Copy link
Contributor Author

GCS treats slashes (/) in object keys as part of a simulated folder structure. For example, when you try to create "abc/e/d" after "abc/d/e", GCS detects a conflict because it interprets the object paths as part of a hierarchy. It sees "abc/d/e" and "abc/e/d" as related and checks for consistency in the way these paths are structured.
MinIO, on the other hand, uses a flat object storage model, where slashes are simply characters in the object name and don't imply any folder structure. Therefore, MinIO allows both "abc/d/e" and "abc/e/d" without any issues.
This difference in behaviour is due to how GCS and MinIO handle object keys – it's not a bug in GCS.

But we can stimulate the behaviour of azure in GCS. So, I'll modify the test scenarios according to azure.

Copy link
Contributor

mergify bot commented Jan 16, 2025

@gifi-siby go-sdk check failed, comment rerun go-sdk can trigger the job again.

Copy link
Contributor

mergify bot commented Jan 16, 2025

@gifi-siby cpp-unit-test check failed, comment rerun cpp-unit-test can trigger the job again.

@gifi-siby gifi-siby force-pushed the 39134_bulk_insert branch 3 times, most recently from 6d8a417 to 1a3b062 Compare January 16, 2025 09:08
Copy link
Contributor

mergify bot commented Jan 16, 2025

@gifi-siby go-sdk check failed, comment rerun go-sdk can trigger the job again.

Copy link
Contributor

mergify bot commented Jan 16, 2025

@gifi-siby cpp-unit-test check failed, comment rerun cpp-unit-test can trigger the job again.

Copy link
Contributor

mergify bot commented Jan 16, 2025

@gifi-siby go-sdk check failed, comment rerun go-sdk can trigger the job again.

Copy link
Contributor

mergify bot commented Jan 16, 2025

@gifi-siby cpp-unit-test check failed, comment rerun cpp-unit-test can trigger the job again.

Copy link
Contributor

mergify bot commented Jan 16, 2025

@gifi-siby E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

Copy link
Contributor

mergify bot commented Jan 16, 2025

@gifi-siby go-sdk check failed, comment rerun go-sdk can trigger the job again.

Copy link
Contributor

mergify bot commented Jan 16, 2025

@gifi-siby cpp-unit-test check failed, comment rerun cpp-unit-test can trigger the job again.

@gifi-siby gifi-siby closed this Jan 16, 2025
@sre-ci-robot sre-ci-robot added size/XS Denotes a PR that changes 0-9 lines. and removed size/S Denotes a PR that changes 10-29 lines. labels Jan 16, 2025
@sre-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: gifi-siby

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@gifi-siby
Copy link
Contributor Author

Created new PR request as this branch had conflict issues: #39352

sre-ci-robot pushed a commit that referenced this pull request Jan 18, 2025
…nsert failure on using GCS buckets (#39352)

Related task: [#39134
](#39134)

Previous PR: [#39150 ](#39150)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci-passed kind/bug Issues or changes related a bug needs-dco DCO is missing in this pull request. size/XS Denotes a PR that changes 0-9 lines.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants