Performance
The measurements below were taken with the following fixed parameters:
Parameter | Value
---|---
Test database | SimReady (50 GB)
Number of concurrent users | 30
Acceptable API request failure rate | 0
Number of CPU replicas of the API services (each replica uses a single CPU thread) | 10
Number of GPUs | 1
The average request rate (req/s) depends on database size, so the figures below apply only to the test database described above:
Text-based | Avg req/s | Average (ms) | 95th percentile (ms) | 99th percentile (ms)
---|---|---|---|---
A10g | 142.33 | 210 | 350 | 450
A100 | 159.55 | 187 | 300 | 380
H100 | 221.69 | 135 | 220 | 300
L40 | 160.76 | 186 | 320 | 400
Image-based | Avg req/s | Average (ms) | 95th percentile (ms) | 99th percentile (ms)
---|---|---|---|---
A10g | 38.89 | 766 | 780 | 840
A100 | 76.73 | 389 | 420 | 490
H100 | 143.73 | 208 | 240 | 290
L40 | 85.22 | 350 | 380 | 420
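As a sanity check on the two tables above, the throughput and average figures can be related through Little's law (requests in flight = throughput × average response time). The sketch below assumes the "Average" column is the mean response time in milliseconds; the function and variable names are illustrative and not part of USD Search. For every GPU the product comes out close to the 30 concurrent users used in the test.

```python
# Rows copied from the tables above: (GPU, avg req/s, average in ms).
TEXT_BASED = [
    ("A10g", 142.33, 210),
    ("A100", 159.55, 187),
    ("H100", 221.69, 135),
    ("L40", 160.76, 186),
]
IMAGE_BASED = [
    ("A10g", 38.89, 766),
    ("A100", 76.73, 389),
    ("H100", 143.73, 208),
    ("L40", 85.22, 350),
]


def implied_concurrency(req_per_s: float, avg_ms: float) -> float:
    """Little's law: requests in flight = throughput * average response time."""
    return req_per_s * (avg_ms / 1000.0)


def summarize(rows):
    """Map each GPU to its implied number of in-flight requests (1 decimal)."""
    return {gpu: round(implied_concurrency(rps, ms), 1) for gpu, rps, ms in rows}


if __name__ == "__main__":
    for label, rows in (("text", TEXT_BASED), ("image", IMAGE_BASED)):
        for gpu, value in summarize(rows).items():
            print(f"{label:5s} {gpu:5s} ~{value} requests in flight")
```

Because the product stays near 30 in every row, a faster GPU shows up as both higher req/s and lower average time at the same user count.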
To support more concurrent users, scale up the number of CPU replicas of the API service:
kubectl scale --replicas=<desired number of CPU replicas> deployment <deployment name>-ngsearch-search-rest-api
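If the scaling step is driven from a script, the command above can be assembled programmatically. This is a minimal sketch: the helper name `build_scale_command` and the prefix `my-release` are hypothetical, and only the `-ngsearch-search-rest-api` suffix and the `kubectl scale --replicas=` form come from the command above.

```python
import shlex

# Suffix taken from the command above; the prefix is deployment-specific.
DEPLOYMENT_SUFFIX = "-ngsearch-search-rest-api"


def build_scale_command(deployment_prefix: str, replicas: int) -> list[str]:
    """Return the kubectl argument list for scaling the REST API deployment."""
    if not deployment_prefix:
        raise ValueError("deployment_prefix must not be empty")
    if not isinstance(replicas, int) or replicas < 0:
        raise ValueError("replicas must be a non-negative integer")
    return [
        "kubectl",
        "scale",
        f"--replicas={replicas}",
        "deployment",
        f"{deployment_prefix}{DEPLOYMENT_SUFFIX}",
    ]


if __name__ == "__main__":
    # "my-release" is a placeholder for the actual deployment name prefix.
    cmd = build_scale_command("my-release", 20)
    print(shlex.join(cmd))
    # To apply the change: subprocess.run(cmd, check=True)
```

Passing the result to `subprocess.run` as a list avoids shell quoting issues; printing it first makes it easy to review before anything is changed in the cluster.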
Performance Tuning
USD Search API includes stateless microservices and stateful open-source components (such as OpenSearch, Redis, and Neo4j). The stateless USD Search API services can be scaled both vertically and horizontally based on specific use cases, allowing for flexible performance optimization.
Monitor resource usage in Grafana and adjust scaling as needed. The stateful components (Neo4j, OpenSearch, and Redis) are standard open-source components: all support vertical scaling, and some also support horizontal scaling.
For detailed performance tuning information, see the Helm chart installation documentation for each of these components.
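One way to turn a monitored utilization figure into a replica count for the stateless services is the proportional rule used by the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric). The sketch below only illustrates that arithmetic. Nothing here implies that USD Search ships with this rule; the function name and the 60% target are assumptions chosen for the example.

```python
import math


def desired_replicas(current_replicas: int,
                     current_utilization_pct: float,
                     target_utilization_pct: float) -> int:
    """Proportional scaling rule: ceil(current * observed / target), minimum 1."""
    if current_replicas < 1:
        raise ValueError("current_replicas must be at least 1")
    if target_utilization_pct <= 0:
        raise ValueError("target_utilization_pct must be positive")
    ratio = current_utilization_pct / target_utilization_pct
    return max(1, math.ceil(current_replicas * ratio))


if __name__ == "__main__":
    # Hypothetical readings: 10 replicas, 60% CPU utilization target.
    print(desired_replicas(10, 90, 60))  # over target, so scale out
    print(desired_replicas(10, 30, 60))  # under target, so scale in
```

The result can be used as the replica count in the `kubectl scale` command shown earlier.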