[Bug]: [benchmark][standalone] Milvus panics with "counter cannot decrease in value" in concurrent DQL & DML scenario (SCANN index) #38769

Open
1 task done
wangting0128 opened this issue Dec 26, 2024 · 1 comment
Labels: kind/bug, test/benchmark, triage/accepted

wangting0128 commented Dec 26, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: master-20241225-f49d6183-amd64
- Deployment mode (standalone or cluster): standalone
- MQ type (rocksmq, pulsar or kafka): rocksmq
- SDK version (e.g. pymilvus v2.0.0rc2): 2.5.0rc124
- OS (Ubuntu or CentOS):
- CPU/Memory:
- GPU:
- Others:

Current Behavior

argo task: new-stable-master-1735131600

server:

[2024-12-25 18:17:02,285 -  INFO - fouram]: [Base] Deploy initial state: 
I1225 13:11:10.667643    3630 request.go:665] Waited for 1.174761357s due to client-side throttling, not priority and fairness, request: GET:https://kubernetes.default.svc.cluster.local/apis/kibana.k8s.elastic.co/v1beta1?timeout=32s
I1225 13:11:20.667853    3630 request.go:665] Waited for 11.174853774s due to client-side throttling, not priority and fairness, request: GET:https://kubernetes.default.svc.cluster.local/apis/elasticsearch.k8s.elastic.co/v1?timeout=32s
I1225 13:11:30.867587    3630 request.go:665] Waited for 8.195786748s due to client-side throttling, not priority and fairness, request: GET:https://kubernetes.default.svc.cluster.local/apis/resolution.tekton.dev/v1beta1?timeout=32s
NAME                                                              READY   STATUS                   RESTARTS        AGE     IP              NODE         NOMINATED NODE   READINESS GATES
new-stable-mast31600-2-57-1379-etcd-0                             1/1     Running                  0               5m30s   10.104.20.57    4am-node22   <none>           <none>
new-stable-mast31600-2-57-1379-milvus-standalone-65bbc5bfctlbqp   1/1     Running                  0               5m30s   10.104.19.222   4am-node28   <none>           <none>
new-stable-mast31600-2-57-1379-minio-fd76b67bb-dkbmn              1/1     Running                  0               5m30s   10.104.19.221   4am-node28   <none>           <none> (base.py:261)
[2024-12-25 18:17:02,285 -  INFO - fouram]: [Cmd Exe]  kubectl get pods  -n qa-milvus  -o wide | grep -E 'NAME|new-stable-mast31600-2-57-1379-milvus|new-stable-mast31600-2-57-1379-minio|new-stable-mast31600-2-57-1379-etcd|new-stable-mast31600-2-57-1379-pulsar|new-stable-mast31600-2-57-1379-zookeeper|new-stable-mast31600-2-57-1379-kafka|new-stable-mast31600-2-57-1379-log|new-stable-mast31600-2-57-1379-tikv'  (util_cmd.py:14)
[2024-12-25 18:17:31,788 -  INFO - fouram]: [CliClient] pod details of release(new-stable-mast31600-2-57-1379): 
 I1225 18:17:03.533653    3767 request.go:665] Waited for 1.174230381s due to client-side throttling, not priority and fairness, request: GET:https://kubernetes.default.svc.cluster.local/apis/source.toolkit.fluxcd.io/v1beta1?timeout=32s
I1225 18:17:13.733192    3767 request.go:665] Waited for 11.373660147s due to client-side throttling, not priority and fairness, request: GET:https://kubernetes.default.svc.cluster.local/apis/triggers.tekton.dev/v1beta1?timeout=32s
I1225 18:17:23.733567    3767 request.go:665] Waited for 8.197901226s due to client-side throttling, not priority and fairness, request: GET:https://kubernetes.default.svc.cluster.local/apis/maps.k8s.elastic.co/v1alpha1?timeout=32s
NAME                                                              READY   STATUS                   RESTARTS         AGE     IP              NODE         NOMINATED NODE   READINESS GATES
new-stable-mast31600-2-57-1379-etcd-0                             1/1     Running                  0                5h11m   10.104.20.57    4am-node22   <none>           <none>
new-stable-mast31600-2-57-1379-milvus-standalone-65bbc5bfctlbqp   0/1     CrashLoopBackOff         60 (2m5s ago)    5h11m   10.104.19.222   4am-node28   <none>           <none>
new-stable-mast31600-2-57-1379-minio-fd76b67bb-dkbmn              1/1     Running                  0                5h11m   10.104.19.221   4am-node28   <none>           <none> (cli_client.py:144)
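
The two pod listings above differ only in the standalone pod, which went from `1/1 Running` with 0 restarts to `0/1 CrashLoopBackOff` with 60 restarts. A minimal sketch of how such a listing could be checked automatically, assuming the column layout shown above (the helper name `unhealthy_pods` is hypothetical and not part of the fouram harness):

```python
import re

# Two rows copied from the second `kubectl get pods -o wide` listing above.
POD_LISTING = """\
NAME                                                              READY   STATUS                   RESTARTS         AGE     IP              NODE         NOMINATED NODE   READINESS GATES
new-stable-mast31600-2-57-1379-etcd-0                             1/1     Running                  0                5h11m   10.104.20.57    4am-node22   <none>           <none>
new-stable-mast31600-2-57-1379-milvus-standalone-65bbc5bfctlbqp   0/1     CrashLoopBackOff         60 (2m5s ago)    5h11m   10.104.19.222   4am-node28   <none>           <none>
"""

# NAME, READY (x/y), STATUS, RESTARTS (leading integer only).
ROW = re.compile(
    r"^(?P<name>\S+)\s+(?P<ready>\d+/\d+)\s+(?P<status>\S+)\s+(?P<restarts>\d+)"
)


def unhealthy_pods(listing):
    """Return (name, status, restarts) for pods that are not fully ready or have restarted."""
    bad = []
    for line in listing.splitlines():
        match = ROW.match(line.strip())
        if not match:
            continue  # header row or client-side throttling log line
        ready, total = match["ready"].split("/")
        restarts = int(match["restarts"])
        if ready != total or restarts > 0:
            bad.append((match["name"], match["status"], restarts))
    return bad


for name, status, restarts in unhealthy_pods(POD_LISTING):
    print(f"{name}: {status}, restarts={restarts}")
```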
[Images: Screenshot 2024-12-26 11 03 23, Screenshot 2024-12-26 11 06 44]

milvus_panic.log
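
For context, the phrase "counter cannot decrease in value" in the title matches the panic message the Prometheus Go client raises when `Counter.Add` is called with a negative value. The panic log is attached rather than quoted here, so which Milvus metric is involved is not established by this report. The following is only a toy Python model of that invariant; the class name is hypothetical and this is neither Milvus nor Prometheus code:

```python
class MonotonicCounter:
    """Toy model of a Prometheus-style counter: the value may only grow."""

    def __init__(self):
        self.value = 0.0

    def add(self, delta):
        if delta < 0:
            # The Go client panics at this point; this model raises instead.
            raise ValueError("counter cannot decrease in value")
        self.value += delta


counter = MonotonicCounter()
counter.add(1)           # allowed: value becomes 1.0
try:
    counter.add(-1)      # rejected: a counter must never go down
except ValueError as exc:
    print(exc)           # prints: counter cannot decrease in value
```

A value that can legitimately go down is normally tracked with a gauge rather than a counter; whether that distinction is relevant to this panic is an open question until the log is analyzed.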

Expected Behavior

No response

Steps To Reproduce

1. Create a collection with fields: 'id' (primary key), 'float_vector' (128 dim), 'float_1'
2. Build a SCANN index on field 'float_vector'
3. Insert 1M rows
4. Flush the collection
5. Rebuild the index
6. Load the collection
7. Send concurrent requests:
   - search
   - query
   - load
   - scene_insert_delete_flush
     (insert -> delete the inserted IDs -> flush)

Milvus Log

No response

Anything else?

server config: fouramf-server-standalone-8c16m

{
     "standalone": {
          "resources": {
               "limits": {
                    "cpu": 8,
                    "memory": "16Gi"
               },
               "requests": {
                    "cpu": 8,
                    "memory": "16Gi"
               }
          }
     },
     "cluster": {
          "enabled": false
     },
     "etcd": {
          "replicaCount": 1,
          "metrics": {
               "enabled": true,
               "podMonitor": {
                    "enabled": true
               }
          }
     },
     "minio": {
          "mode": "standalone",
          "metrics": {
               "podMonitor": {
                    "enabled": true
               }
          }
     },
     "pulsarv3": {
          "enabled": false
     },
     "metrics": {
          "serviceMonitor": {
               "enabled": true
          }
     },
     "log": {
          "level": "debug"
     },
     "image": {
          "all": {
               "repository": "harbor.milvus.io/milvus/milvus",
               "tag": "master-20241225-f49d6183-amd64"
          }
     }
}
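
To compare this server config against another run, it can help to flatten it into dotted key/value pairs. The sketch below assumes the JSON is applied as Helm chart values, which the report does not state explicitly; the function name `flatten` is hypothetical and only a subset of the config is embedded for brevity.

```python
import json

# Subset of the server config above, embedded so the example is self-contained.
SERVER_CONFIG = json.loads("""
{
  "standalone": {"resources": {"limits": {"cpu": 8, "memory": "16Gi"},
                               "requests": {"cpu": 8, "memory": "16Gi"}}},
  "cluster": {"enabled": false},
  "log": {"level": "debug"},
  "image": {"all": {"repository": "harbor.milvus.io/milvus/milvus",
                    "tag": "master-20241225-f49d6183-amd64"}}
}
""")


def flatten(values, prefix=""):
    """Turn nested dicts into (dotted.path, string value) pairs."""
    pairs = []
    for key, val in values.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(val, dict):
            pairs.extend(flatten(val, path))
        else:
            # Render booleans the JSON way (true/false) instead of True/False.
            text = json.dumps(val) if isinstance(val, bool) else str(val)
            pairs.append((path, text))
    return pairs


for path, value in flatten(SERVER_CONFIG):
    print(f"{path}={value}")
```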

client config: fouramf-client-scann-comapct

{
     "dataset_params": {
          "metric_type": "L2",
          "dim": 128,
          "dataset_name": "sift",
          "dataset_size": 1000000,
          "ni_per": 50000
     },
     "collection_params": {
          "other_fields": [
               "float_1"
          ],
          "shards_num": 2
     },
     "index_params": {
          "index_type": "SCANN",
          "index_param": {
               "nlist": 1024,
               "with_raw_data": true
          }
     },
     "concurrent_params": {
          "concurrent_number": 20,
          "during_time": "5h",
          "interval": 20
     },
     "concurrent_tasks": [
          {
               "type": "search",
               "weight": 10,
               "params": {
                    "nq": 10,
                    "top_k": 10,
                    "search_param": {
                         "radius": 126,
                         "range_filter": 0.5789
                    },
                    "ignore_growing": false,
                    "timeout": 60,
                    "random_data": true
               }
          },
          {
               "type": "query",
               "weight": 10,
               "params": {
                    "ids": [
                         0,
                         1,
                         2,
                         3,
                         4,
                         5,
                         6,
                         7,
                         8,
                         9
                    ],
                    "ignore_growing": false,
                    "timeout": 60
               }
          },
          {
               "type": "load",
               "weight": 1,
               "params": {
                    "replica_number": 1,
                    "timeout": 30
               }
          },
          {
               "type": "scene_insert_delete_flush",
               "weight": 1,
               "params": {
                    "check_tasks": {
                         "flush": {
                              "check_task": "check_ignore_rate_limit"
                         }
                    },
                    "insert_length": 1,
                    "delete_length": 1,
                    "random_id": true,
                    "random_vector": true,
                    "varchar_filled": true
               }
          }
     ]
}
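
The `weight` values in `concurrent_tasks` (10, 10, 1, 1) suggest how often each task type runs relative to the others. Assuming the weights are relative sampling frequencies (the harness's actual scheduler is not shown in this report), the expected mix can be sketched as follows; both function names are hypothetical:

```python
import random

# Task weights copied from the client config above.
WEIGHTS = {
    "search": 10,
    "query": 10,
    "load": 1,
    "scene_insert_delete_flush": 1,
}


def expected_share(weights):
    """Fraction of requests each task type would receive under weighted sampling."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def pick_tasks(weights, n, seed=0):
    """Draw n task names at random, proportionally to their weights."""
    rng = random.Random(seed)
    names = list(weights)
    return rng.choices(names, weights=[weights[k] for k in names], k=n)


for name, share in expected_share(WEIGHTS).items():
    print(f"{name}: {share:.1%}")
```

Under that assumption, search and query each account for 10/22 of requests, while load and the insert/delete/flush scene each account for 1/22, so DML is a small but steady fraction of the traffic.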
@wangting0128 wangting0128 added kind/bug Issues or changes related a bug needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. test/benchmark benchmark test labels Dec 26, 2024
@wangting0128 wangting0128 added this to the 3.0 milestone Dec 26, 2024
@yanliang567 yanliang567 added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Dec 26, 2024
@yanliang567 yanliang567 assigned sunby and unassigned yanliang567 Dec 26, 2024
xiaofan-luan commented

/assign @xiaocai2333
