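General Benchmark Tool

We provide a benchmark tool for evaluating the performance of the interactive engine. The tool simulates multiple clients that send queries (Gremlin or Cypher) to the server through the corresponding endpoint exposed by the engine. It reports performance metrics such as latency and throughput, together with the query results.

The tool was recently upgraded to support comprehensive comparisons across different systems and multiple benchmark workloads, which allows query correctness and performance to be evaluated and compared in depth.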

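Benchmark Tool Overview

The key features of the benchmark tool are:

Multiple query languages. The tool works with several graph query languages, including Gremlin and Cypher, so each system can be configured according to the languages it supports.
Different graph computing systems. It supports comparisons between graph computing systems such as GraphScope GIE and KuzuDB. More systems will be integrated in the future.
Diverse workloads. The tool supports multiple workloads, including LDBC IC and BI, LSQB, and JOB.
Result evaluation. It supports both correctness validation and performance benchmarking, enabling detailed comparisons.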

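Benchmark Tool Usage

The benchmark tool is available here. The benchmark program sends a mix of queries to the server by reading query templates from queries and filling in their parameters from substitution_parameters. It uses a round-robin strategy to iterate over all enabled queries and their corresponding parameters.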
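
The round-robin dispatch described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual implementation: the `$name` placeholder syntax, the `round_robin` helper, and the sample templates and parameter values are all hypothetical.

```python
from itertools import cycle, islice
from string import Template


def round_robin(templates, parameters, total):
    """Yield `total` (query name, filled query) pairs, rotating over the queries.

    `templates` maps a query name to its template text; `parameters` maps a
    query name to a list of parameter dicts. Both layouts are assumptions
    made for this sketch.
    """
    # Each query keeps its own endless cursor over its parameter sets.
    cursors = {name: cycle(parameters.get(name) or [{}]) for name in templates}
    for name in islice(cycle(templates), total):
        filled = Template(templates[name]).safe_substitute(next(cursors[name]))
        yield name, filled


# Hypothetical templates and parameters, for illustration only.
templates = {
    "q1": "MATCH (p:Person {id: $personId}) RETURN p",
    "q2": "MATCH (p:Person) RETURN count(p)",
}
parameters = {"q1": [{"personId": 1}, {"personId": 2}]}

queries = list(round_robin(templates, parameters, 4))
for name, text in queries:
    print(name, text)
```

Each query advances through its own parameter list independently, so a query with few parameter sets simply wraps around while the others continue.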

Repository Contents

- bin
    - bench.sh                          // script for running benchmark for queries
    - collect.sh                        // script for collecting benchmark results
- config
    - interactive-benchmark.properties  // configurations for running benchmark
- data
    - substitution_parameters           // query parameter files used to fill the query templates
    - expected_results                  // expected query results for the running queries
- queries                               // query templates including LDBC queries, LSQB queries, JOB queries, customized queries, etc.
- dbs                                   // Other graph systems for comparison. Currently, KuzuDB is supported.
- example                               // an example to compare GraphScope GIE and Kuzu
- src                                   // source code of benchmark program

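Note: queries prefixed with ldbc_query implement the official LDBC Interactive Complex Reads, queries prefixed with bi_query implement the official LDBC Business Intelligence workload, queries prefixed with lsqb_query implement the LDBC Labelled Subgraph Query Benchmark (LSQB), and queries prefixed with job implement the JOB benchmark. Gremlin queries should use the .gremlin suffix, and Cypher queries should use the .cypher suffix. The corresponding parameters for the LDBC queries (scale factor 1) are generated by the official LDBC tools.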
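
As an illustration of these naming conventions, the sketch below maps a query file name to its workload and query language. The `classify` helper and the example file names are hypothetical; only the prefixes and suffixes come from the note above.

```python
from pathlib import PurePath

# Prefix-to-workload mapping taken from the note above.
WORKLOAD_PREFIXES = [
    ("ldbc_query", "LDBC Interactive Complex Reads"),
    ("bi_query", "LDBC Business Intelligence"),
    ("lsqb_query", "LSQB"),
    ("job", "JOB"),
]
LANGUAGE_SUFFIXES = {".gremlin": "Gremlin", ".cypher": "Cypher"}


def classify(file_name):
    """Return (workload, language); either element is None when unrecognized."""
    path = PurePath(file_name)
    workload = next(
        (label for prefix, label in WORKLOAD_PREFIXES if path.name.startswith(prefix)),
        None,
    )
    language = LANGUAGE_SUFFIXES.get(path.suffix)
    return workload, language


# Hypothetical file names, for illustration only.
print(classify("ldbc_query_1.gremlin"))
print(classify("bi_query_2.cypher"))
```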

Building the Benchmark

Build the benchmark program with Maven:

mvn clean package

All binaries and queries are packaged into target/benchmark-0.0.1-SNAPSHOT-dist.tar.gz. You can deploy this package anywhere that can connect to the Gremlin endpoint (which should be configured in interactive-benchmark.properties).

Running the Benchmark

Extract the built target/benchmark-0.0.1-SNAPSHOT-dist.tar.gz archive, then run the benchmark.

cd target
tar -xvf gaia-benchmark-0.0.1-SNAPSHOT-dist.tar.gz
cd gaia-benchmark-0.0.1-SNAPSHOT
./bin/bench.sh                             # run the benchmark program. You can also modify running configurations in config/interactive-benchmark.properties

With the example configuration file example/job_benchmark.properties, which compares GraphScope GIE and KuzuDB on the JOB benchmark, the output looks like this:

Start to benchmark system: GIE
QueryName[13a], Parameter[{}], ResultCount[1], ExecuteTimeMS[3638].
QueryName[32a], Parameter[{}], ResultCount[1], ExecuteTimeMS[266].
QueryName[9a], Parameter[{}], ResultCount[1], ExecuteTimeMS[3669].
QueryName[5c], Parameter[{}], ResultCount[1], ExecuteTimeMS[8603].
QueryName[3a], Parameter[{}], ResultCount[1], ExecuteTimeMS[613].
...
System: GIE; query count: 35; execute time(ms): xxx qps: xxx

Start to benchmark system: KuzuDb
QueryName[13a], Parameter[{}], ResultCount[1], ExecuteTimeMS[7068].
QueryName[32a], Parameter[{}], ResultCount[1], ExecuteTimeMS[253].
QueryName[9a], Parameter[{}], ResultCount[1], ExecuteTimeMS[5122].
QueryName[5c], Parameter[{}], ResultCount[1], ExecuteTimeMS[13623].
QueryName[3a], Parameter[{}], ResultCount[1], ExecuteTimeMS[4676].
...
System: KuzuDB; query count: 35; execute time(ms): xxx qps: xxx
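
For ad hoc analysis, per-query lines in this format can be parsed with a regular expression. The sketch below is only an illustration based on the sample output shown above; the `parse_line` helper is hypothetical and is not part of the tool.

```python
import re

# Pattern derived from the sample log lines above.
LINE_PATTERN = re.compile(
    r"QueryName\[(?P<name>[^\]]+)\], "
    r"Parameter\[(?P<parameter>.*)\], "
    r"ResultCount\[(?P<result_count>\d+)\], "
    r"ExecuteTimeMS\[(?P<execute_time_ms>\d+)\]"
)


def parse_line(line):
    """Return a dict for a per-query log line, or None for any other line."""
    match = LINE_PATTERN.search(line)
    if match is None:
        return None
    return {
        "name": match.group("name"),
        "parameter": match.group("parameter"),
        "result_count": int(match.group("result_count")),
        "execute_time_ms": int(match.group("execute_time_ms")),
    }


record = parse_line("QueryName[13a], Parameter[{}], ResultCount[1], ExecuteTimeMS[3638].")
print(record)
```

Lines that do not describe a single query, such as the "Start to benchmark system" header or the per-system summary, yield None and can be skipped.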

Collecting Results

./bin/collect.sh                      # run the result collection program to collect the results and generate a performance comparison table

Based on the benchmark results, the collected data and the final performance comparison are shown in the table below:

Query Name	GIE Avg	GIE P50	GIE P90	GIE P95	GIE P99	GIE Count	KuzuDb Avg	KuzuDb P50	KuzuDb P90	KuzuDb P95	KuzuDb P99	KuzuDb Count
3a	613.00	613	613	613	613	1	4676.00	4676	4676	4676	4676	1
5c	8603.00	8603	8603	8603	8603	1	13623.00	13623	13623	13623	13623	1
9a	3669.00	3669	3669	3669	3669	1	5122.00	5122	5122	5122	5122	1
13a	3638.00	3638	3638	3638	3638	1	7068.00	7068	7068	7068	7068	1
32a	266.00	266	266	266	266	1	253.00	253	253	253	253	1
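
The columns of this table can be reproduced from raw execution times with a few lines of code. The sketch below uses the nearest-rank percentile definition, which is an assumption; the collection script may compute percentiles differently. The `percentile` and `summarize` helpers are hypothetical.

```python
import math
from statistics import mean


def percentile(sorted_values, p):
    """Nearest-rank percentile: the smallest value with at least p% of samples at or below it."""
    rank = max(1, math.ceil(p / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def summarize(times_ms):
    """Summarize a non-empty list of execution times in milliseconds."""
    ordered = sorted(times_ms)
    return {
        "avg": mean(ordered),
        "p50": percentile(ordered, 50),
        "p90": percentile(ordered, 90),
        "p95": percentile(ordered, 95),
        "p99": percentile(ordered, 99),
        "count": len(ordered),
    }


# A single sample, as for query 3a on GIE in the table above.
print(summarize([613]))
```

With a single execution per query, as in the table, the average and every percentile equal that one measurement, so the percentile columns only become informative once each query runs multiple times.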

For a more detailed end-to-end example, see here.

Configuration

All configuration options can be found in config/interactive-benchmark.properties.

Below we highlight some key settings.

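Configuring the Compared Systems

The tool supports performance comparisons between different graph computing systems. For example, to compare GIE and Kuzu, configure interactive-benchmark.properties as follows. The benchmark tool then sends queries to both GIE and Kuzu and collects and analyzes the results they return.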

# The configuration for the compared systems.
# Currently, the supported systems include GIE and KuzuDb.
# For each system, from system.1 to system.n, the following configurations are needed:
# name: the name of the system, e.g., GIE, KuzuDb.
# client: the client of the system, e.g., for GIE, it can be cypher or gremlin; for KuzuDb, it should be kuzu.
# endpoint (optional): the endpoint of the system if the system provides a service endpoint, e.g., for GIE gremlin, it is 127.0.0.1:8182 by default.
# path (optional): the path to the database if the system is a local database that is accessed by path, e.g., for KuzuDb, it can be /path_to_db/example_db.
# Either endpoint or path must be provided, depending on how the system is accessed.
system.1.name = GIE
system.1.client = cypher
system.1.endpoint = 127.0.0.1:7687
system.1.path =
system.2.name = KuzuDb
system.2.client = kuzu
system.2.endpoint =
system.2.path = ./job_db
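
To make the system.N.* layout concrete, the sketch below groups these keys by system index and checks the rule that each system needs an endpoint or a path. It is an illustration only: `parse_properties` is a simplified reader that handles just `key = value` lines and `#` comments, and `load_systems` is a hypothetical helper, not the tool's own loader.

```python
import re


def parse_properties(text):
    """Parse `key = value` lines, skipping blank lines and `#` comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def load_systems(props):
    """Group system.N.* keys by index and require an endpoint or a path."""
    systems = {}
    for key, value in props.items():
        match = re.fullmatch(r"system\.(\d+)\.(name|client|endpoint|path)", key)
        if match:
            systems.setdefault(int(match.group(1)), {})[match.group(2)] = value
    for index in sorted(systems):
        system = systems[index]
        if not system.get("endpoint") and not system.get("path"):
            raise ValueError(f"system.{index}: either endpoint or path must be set")
    return [systems[index] for index in sorted(systems)]


sample = """
system.1.name = GIE
system.1.client = cypher
system.1.endpoint = 127.0.0.1:7687
system.1.path =
system.2.name = KuzuDb
system.2.client = kuzu
system.2.endpoint =
system.2.path = ./job_db
"""
systems = load_systems(parse_properties(sample))
print(systems)
```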

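Configuring the Workloads

We currently provide the commonly used benchmark workloads ic, bi, lsqb, and job. Users can also add their own benchmark queries to queries and the corresponding substitution parameters to substitution_parameters. Note that the file names of user-defined query templates should start with the prefix custom_query or custom_constant_query. The difference between the two is that custom_constant_query templates have no corresponding parameters.

Taking the JOB benchmark as an example, the relevant configuration is: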

# The configuration for the benchmarking workloads.
# the directory of query templates
query.dir = ./queries/cypher_queries/job
# the directory of query parameters. If the queries do not have parameters, leave it empty.
query.parameters.dir = 
# query file suffix, e.g., cypher (ldbc_query.cypher), gremlin (ldbc_query.gremlin), txt (ldbc_query.txt), etc.
query.file.suffix=cypher
# specify which kind of queries are sent.
# if query.all.enable is true, the benchmark will send all the queries in the query.dir.
query.all.enable=true
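
As a rough illustration of how query.dir and query.file.suffix interact when query.all.enable is true, the sketch below selects every file in a directory that has the configured suffix. The `list_query_files` helper and the file names are hypothetical, and the temporary directory only stands in for a real query directory.

```python
import tempfile
from pathlib import Path


def list_query_files(query_dir, suffix):
    """Return the sorted names of all files in `query_dir` ending in `.suffix`."""
    return sorted(p.name for p in Path(query_dir).glob(f"*.{suffix}") if p.is_file())


# Build a throwaway directory with hypothetical file names for the demonstration.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("13a.cypher", "32a.cypher", "README.md"):
        Path(tmp, name).write_text("")
    selected = list_query_files(tmp, "cypher")

print(selected)
```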

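Configuring Result Collection

By default, benchmark results are written to interactive-benchmark.log and interactive-benchmark-report.md, as shown in the earlier sections on running the benchmark and collecting results. If you also want to check query correctness for the current workload, provide the following configuration:

# the path to the expected query results (optional). If provided, the benchmark results will be compared with the expected results.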
query.expected.path = ./data/expected_results/job_expected.json

The benchmark tool then executes the queries and checks the correctness of the results automatically.
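
Conceptually, the correctness check compares each query's actual result with an expected one. The sketch below shows the idea under a strong assumption: the layout of the expected-results file is not described here, so the JSON object mapping query names to results, the `verify` helper, and the sample values are all hypothetical.

```python
import json


def verify(expected_json, actual):
    """Compare actual results against expected ones, keyed by query name."""
    expected = json.loads(expected_json)
    report = {}
    for name, result in actual.items():
        if name not in expected:
            report[name] = "no expected result"
        elif expected[name] == result:
            report[name] = "match"
        else:
            report[name] = "mismatch"
    return report


# Hypothetical expected and actual results, for illustration only.
expected_json = '{"q1": [["a", 1]], "q2": [["b", 2]]}'
actual = {"q1": [["a", 1]], "q2": [["b", 3]], "q3": [["c", 4]]}

report = verify(expected_json, actual)
print(report)
```

Results are compared after JSON decoding, so this sketch treats row order as significant; an order-insensitive comparison would need to normalize the rows first.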