[Bug]: Bulk Insert failed during Milvus restore with GCS native buckets using Milvus Backup tool #39134
Comments
/assign @huanghaoyuanhhy
@yanliang567: GitHub didn't allow me to assign the following users: huanghaoyuanhhy. Note that only milvus-io members, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/assign @gifi-siby
I think for Zilliz Cloud, the backup tool is actually used in AWS, GCP, and Azure environments.
The error came up from Milvus with GCP native storage (using the Google Cloud Storage libraries).
Do you mean gcp_native_object_storage?
I am enhancing the Milvus Backup tool to support GCP native storage as well (related PR: #495), and I hit this issue during the restore process. The error I got was from the BulkInsert operation:
/assign |
It looks like both the Milvus and backup issues are caused by hierarchical namespaces; I am still trying to reproduce. As I mentioned in zilliztech/milvus-backup#446, most of our tests and users use MinIO or S3 generic buckets. Hierarchical namespaces may cause inconsistent behavior of listObjects.
The GCP native object storage implementation inside Milvus does not seem correct.
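A minimal sketch of the suspected listing pattern, assuming the implementation uses the cloud.google.com/go/storage client (the function name, bucket, and prefix below are assumptions for illustration, not the actual Milvus code):

```go
package main

import (
	"context"
	"fmt"
	"log"

	"cloud.google.com/go/storage"
	"google.golang.org/api/iterator"
)

// listLeafObjects lists one level under prefix using "/" as the delimiter,
// but keeps only concrete objects. With a delimiter set, GCS reports
// sub-"directories" as synthetic entries whose Prefix field is set and whose
// Name is empty; skipping those yields an empty result at any level that
// contains only folder-like entries.
func listLeafObjects(ctx context.Context, client *storage.Client, bucket, prefix string) ([]string, error) {
	it := client.Bucket(bucket).Objects(ctx, &storage.Query{
		Prefix:    prefix,
		Delimiter: "/",
	})
	var names []string
	for {
		attrs, err := it.Next()
		if err == iterator.Done {
			break
		}
		if err != nil {
			return nil, err
		}
		if attrs.Name == "" {
			continue // folder-like entry (only attrs.Prefix is set) is dropped
		}
		names = append(names, attrs.Name)
	}
	return names, nil
}

func main() {
	ctx := context.Background()
	client, err := storage.NewClient(ctx)
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// Hypothetical bucket/prefix: at a segment-level path only field-ID
	// "folders" exist, so this prints an empty slice.
	names, err := listLeafObjects(ctx, client, "my-bucket", "backup/binlogs/insert_log/coll/part/seg/")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(names)
}
```

If the implementation follows a pattern like this, a level that contains only folder-like entries would list as empty, which would match the "no binlog to import" error below.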
Is there an existing issue for this?
Environment
Current Behavior
Perform a Milvus restore operation using the Milvus Backup tool to restore data to a Milvus instance associated with a GCS native bucket. The following error is found:
workerpool: execute job no binlog to import, input=[paths:"nativemill/BackupMilvus864/flavorBack1/binlogs/insert_log/453580537997566884/453580537997566885/453580537997766901/" paths:"" ]: invalid parameter
Expected Behavior
The restore operation using the Milvus Backup tool should succeed without any error.
Steps To Reproduce
Milvus Log
No response
Anything else?
As part of BulkInsert, the ListObjects function of GCPNativeObjectStorage is called to list all the folders and files inside a certain path (backupPath/binlogs/insert_log/collectionID/partitionID/segmentID/groupID). Only leaf files were returned from ListObjects (folder-like objects are skipped), and thus an empty array of strings was returned and no binlogs were found.
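For comparison, a hedged sketch of a listing that also surfaces the folder-like entries, again assuming the cloud.google.com/go/storage client (this is illustrative only, not the actual GCPNativeObjectStorage code):

```go
package main

import (
	"context"

	"cloud.google.com/go/storage"
	"google.golang.org/api/iterator"
)

// listWithPrefixes returns both concrete objects and folder-like prefixes at
// one level, so a caller walking insert_log/collectionID/partitionID/... can
// still descend into sub-directories instead of receiving an empty slice.
func listWithPrefixes(ctx context.Context, client *storage.Client, bucket, prefix string) ([]string, error) {
	it := client.Bucket(bucket).Objects(ctx, &storage.Query{
		Prefix:    prefix,
		Delimiter: "/",
	})
	var results []string
	for {
		attrs, err := it.Next()
		if err == iterator.Done {
			break
		}
		if err != nil {
			return nil, err
		}
		if attrs.Prefix != "" {
			results = append(results, attrs.Prefix) // keep the "folder" entry
			continue
		}
		results = append(results, attrs.Name)
	}
	return results, nil
}
```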
This issue was found as part of the Milvus Backup tool enhancement:
Enhancement task: #444
PR raised: #495