Performance

This performance measurement holds the following parameters fixed:

| Parameter | Value |
| --- | --- |
| Test database | SimReady (50 GB) |
| Number of concurrent users | 30 |
| Acceptable API request failure rate | 0 |
| Number of CPU replicas of the API services (each replica uses a single CPU thread) | 10 |
| Number of GPUs | 1 |

The average req/s depends on the database size. The results below are for the test database listed above:

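Text-based

| GPU | Avg req/s | Average latency (ms) | 95th percentile (ms) | 99th percentile (ms) |
| --- | --- | --- | --- | --- |
| A10g | 142.33 | 210 | 350 | 450 |
| A100 | 159.55 | 187 | 300 | 380 |
| H100 | 221.69 | 135 | 220 | 300 |
| L40 | 160.76 | 186 | 320 | 400 |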

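Image-based

| GPU | Avg req/s | Average latency (ms) | 95th percentile (ms) | 99th percentile (ms) |
| --- | --- | --- | --- | --- |
| A10g | 38.89 | 766 | 780 | 840 |
| A100 | 76.73 | 389 | 420 | 490 |
| H100 | 143.73 | 208 | 240 | 290 |
| L40 | 85.22 | 350 | 380 | 420 |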
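As a sanity check on the figures above: in a closed-loop test with a fixed number of users, throughput is roughly the number of concurrent users divided by the average latency. The sketch below assumes the benchmark ran that way, which the measurements are consistent with but the text does not state. `estimate_rps` is a hypothetical helper name, not part of USD Search API.

```shell
#!/usr/bin/env bash
# Hypothetical helper: estimate throughput (req/s) from the number of
# concurrent users and the average latency in milliseconds.
# Assumption: closed-loop load, where each user sends its next request
# as soon as the previous one returns.
estimate_rps() {
  local users="$1" avg_latency_ms="$2"
  LC_ALL=C awk -v u="$users" -v l="$avg_latency_ms" \
    'BEGIN { printf "%.1f\n", u * 1000 / l }'
}

# 30 users, H100 text-based average latency of 135 ms
estimate_rps 30 135
# 30 users, A10g image-based average latency of 766 ms
estimate_rps 30 766
```

Run as-is, this prints 222.2 and 39.2, close to the measured 221.69 and 38.89 req/s.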

To support more concurrent users, increase the number of CPU replicas of the API service:

kubectl scale --replicas=<desired number of CPU replicas> deployment <deployment name>-ngsearch-search-rest-api
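A rough sizing sketch for the command above: the benchmark served 30 concurrent users with 10 CPU replicas, so a starting estimate is one replica per 3 users. Linear scaling is an assumption here, not something the measurements establish, since the GPU or the stateful components may become the bottleneck first. `replicas_for_users`, `scale_command`, and the `my-release` deployment name are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: apply the benchmark ratio (10 replicas per 30 users)
# to a target user count, rounding up.
# Assumption: replicas scale linearly with users; verify under load.
replicas_for_users() {
  local users="$1"
  echo $(( (users * 10 + 29) / 30 ))
}

# Hypothetical helper: print the scale command rather than running it,
# so it can be reviewed before it touches the cluster.
scale_command() {
  local deployment_name="$1" users="$2"
  echo "kubectl scale --replicas=$(replicas_for_users "$users") deployment ${deployment_name}-ngsearch-search-rest-api"
}

scale_command my-release 60
# prints: kubectl scale --replicas=20 deployment my-release-ngsearch-search-rest-api
```

Treat the result as a starting point and confirm it against observed resource usage before settling on a replica count.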

Performance Tuning

USD Search API includes stateless microservices and stateful open-source components (such as OpenSearch, Redis, and Neo4j). The stateless USD Search API services can be scaled both vertically and horizontally based on specific use cases, allowing for flexible performance optimization.

Monitor resource usage in Grafana and adjust scaling as needed. The stateful components (Neo4j, OpenSearch, and Redis) are standard open-source components that all support vertical scaling; some also support horizontal scaling.

For detailed performance tuning information, refer to the Helm chart installation documentation for each of these components.