Performance

The measurements below were taken with the following parameters held fixed:

| Parameter | Value |
| --- | --- |
| Test database | SimReady (50 GB) |
| Number of concurrent users | 50 |
| Acceptable API request failure rate | 0 |
| Number of CPU replicas of the API services (each replica uses a single CPU thread) | 8 |
| Number of GPUs | 1 |

The average number of requests per second (req/s) varies with database size; the results below are for the test database listed above:

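Text-based (latency in msec)

| GPU | Avg req/s | Average latency | 95th percentile | 99th percentile |
| --- | --- | --- | --- | --- |
| A10g | 91.0 | 548 | 1100 | 1500 |
| L40 | 88.6 | 562 | 1300 | 1600 |
| A100 | 72.8 | 683 | 1500 | 1900 |
| H100 | 117.8 | 423 | 940 | 1300 |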

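Image-based (latency in msec)

| GPU | Avg req/s | Average latency | 95th percentile | 99th percentile |
| --- | --- | --- | --- | --- |
| A10g | 10.2 | 4795 | 4800 | 7700 |
| L40 | 25.1 | 1973 | 2100 | 2300 |
| A100 | 34.6 | 1436 | 1600 | 1800 |
| H100 | 53.4 | 932 | 1000 | 1200 |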
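As a sanity check on the figures above, throughput and average latency can be related through Little's law (in-flight requests ≈ req/s × average latency). The sketch below is not part of the USD Search tooling. It assumes the load test was closed-loop, with each of the 50 simulated users sending its next request as soon as the previous one returned, which the source does not state. The function name `implied_concurrency` is illustrative.

```python
# Little's law for a closed-loop load test: users ≈ throughput × latency.
# Rows are copied from the results above: (GPU, avg req/s, average latency in msec).
TEXT_BASED = [
    ("A10g", 91.0, 548),
    ("L40", 88.6, 562),
    ("A100", 72.8, 683),
    ("H100", 117.8, 423),
]
IMAGE_BASED = [
    ("A10g", 10.2, 4795),
    ("L40", 25.1, 1973),
    ("A100", 34.6, 1436),
    ("H100", 53.4, 932),
]


def implied_concurrency(req_per_s: float, avg_latency_ms: float) -> float:
    """Return the number of in-flight requests implied by Little's law."""
    return req_per_s * avg_latency_ms / 1000.0


for label, rows in (("text", TEXT_BASED), ("image", IMAGE_BASED)):
    for gpu, rps, latency_ms in rows:
        users = implied_concurrency(rps, latency_ms)
        print(f"{label:5s} {gpu:5s} {users:5.1f} implied in-flight requests")
```

Every row comes out between roughly 48.9 and 49.9, which is consistent with the 50 concurrent users in the test parameters. Under the same assumption, a GPU with lower average latency yields proportionally higher req/s at a fixed user count.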

To support more concurrent users, scale up the number of CPU replicas of the API service:

kubectl scale --replicas=<desired number of CPU replicas> deployment <deployment name>-ngsearch-search-rest-api
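For scripted deployments, the command above can be assembled programmatically. The following is a minimal sketch: the helper name `build_scale_command`, the prefix `my-release`, and the replica count 16 are hypothetical placeholders. Only the `kubectl scale --replicas` form and the `-ngsearch-search-rest-api` suffix come from the command above. The function only builds the argument list; actually running it requires `kubectl` configured for the target cluster.

```python
import shlex


def build_scale_command(deployment_prefix: str, replicas: int) -> list[str]:
    """Build the kubectl argument list that sets the REST API replica count."""
    # Scaling to zero would stop the API entirely, so this sketch rejects it.
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return [
        "kubectl",
        "scale",
        f"--replicas={replicas}",
        "deployment",
        f"{deployment_prefix}-ngsearch-search-rest-api",
    ]


# "my-release" and 16 are placeholder values, not recommendations.
command = build_scale_command("my-release", 16)
print(shlex.join(command))
# To apply it on a configured cluster: subprocess.run(command, check=True)
```

Building the command as a list rather than a single string avoids shell quoting issues if the deployment name is taken from user input.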

Performance Tuning

USD Search API includes stateless microservices and stateful open-source components (such as OpenSearch, Redis, and Neo4j). The stateless USD Search API services can be scaled both vertically and horizontally based on specific use cases, allowing for flexible performance optimization.

Monitor resource usage in Grafana and adjust scaling as needed. The stateful components (Neo4j, OpenSearch, and Redis) are standard open-source components; all support vertical scaling, and some also support horizontal scaling.

For detailed performance tuning information, refer to the Helm chart installation documentation for each of these components.