MongoDB 查询优化
本文总结了 MongoDB 查询的执行过程,如何读懂 Explain 信息、执行统计信息,如何创建索引,如何建立数据归档。
MoongoDB 版本:v4.4
查询的执行过程
Explain 与 执行统计
Mongodb也提供了explain语句,可以获取query语句的查询计划(queryPlanner)、以及执行过程中的统计信息(executionStats)。
land> db.boxes.find({"stopped_at":{"$lt": ISODate("2023-12-07T21:12:00.957+0800")},"status":"running"}).sort({"created_at":1}).explain()
{
queryPlanner: {
plannerVersion: 1,
namespace: 'land.boxes',
indexFilterSet: false,
parsedQuery: {
'$and': [
{ status: { '$eq': 'running' } },
{ stopped_at: { '$lt': ISODate("2023-12-07T13:12:00.957Z") } }
]
},
winningPlan: {
stage: 'FETCH',
filter: {
'$and': [
{ status: { '$eq': 'running' } },
{
stopped_at: { '$lt': ISODate("2023-12-07T13:12:00.957Z") }
}
]
},
inputStage: {
stage: 'IXSCAN',
keyPattern: { created_at: 1 },
indexName: 'created_at_1',
isMultiKey: false,
multiKeyPaths: { created_at: [] },
isUnique: false,
isSparse: false,
isPartial: false,
indexVersion: 2,
direction: 'forward',
indexBounds: { created_at: [ '[MinKey, MaxKey]' ] }
}
},
rejectedPlans: []
},
serverInfo: {
host: 'mongo-0',
port: 27017,
version: '4.2.14',
gitVersion: '0e6db36e92d82cc81cbd40ffd607eae88dc1f09d'
},
ok: 1
}
结果分为四部分:queryPlanner、executionStats、serverInfo、ok,仅关注queryPlanner、executionStats这两部分。
查询计划是按 tree of stages
组织的:
"winningPlan" : {
"stage" : <STAGE1>,
...
"inputStage" : {
"stage" : <STAGE2>,
...
"inputStage" : {
"stage" : <STAGE3>,
...
}
}
}
每个阶段将其结果(即文档或索引键)传递给父节点。叶节点访问集合或索引。内部节点操作子节点产生的文档或索引键。根节点是MongoDB从中派生结果集的最后阶段。
阶段是对操作的描述,有以下几种类型:
-
COLLSCAN
for a collection scan -
IXSCAN
for scanning index keys -
FETCH
for retrieving documents -
SHARD_MERGE
for merging results from shards -
SHARDING_FILTER
for filtering out orphan documents from shards
executionStats 就是执行 queryPlanner.winningPlan 这个计划时的统计信息。在mongo shell中的使用方法是在query语句后面加上.explain('executionStats')。
land> db.boxes.find({"stopped_at":{"$lt": ISODate("2023-12-07T21:12:00.957+0800")},"status":"running"}).sort({"created_at":1}).explain("executionStats")
{
queryPlanner: {
plannerVersion: 1,
namespace: 'land.boxes',
indexFilterSet: false,
parsedQuery: {
'$and': [
{ status: { '$eq': 'running' } },
{ stopped_at: { '$lt': ISODate("2023-12-07T13:12:00.957Z") } }
]
},
winningPlan: {
stage: 'FETCH',
filter: {
'$and': [
{ status: { '$eq': 'running' } },
{
stopped_at: { '$lt': ISODate("2023-12-07T13:12:00.957Z") }
}
]
},
inputStage: {
stage: 'IXSCAN',
keyPattern: { created_at: 1 },
indexName: 'created_at_1',
isMultiKey: false,
multiKeyPaths: { created_at: [] },
isUnique: false,
isSparse: false,
isPartial: false,
indexVersion: 2,
direction: 'forward',
indexBounds: { created_at: [ '[MinKey, MaxKey]' ] }
}
},
rejectedPlans: []
},
executionStats: {
executionSuccess: true,
nReturned: 0,
executionTimeMillis: 15818,
totalKeysExamined: 87411,
totalDocsExamined: 87411,
executionStages: {
stage: 'FETCH',
filter: {
'$and': [
{ status: { '$eq': 'running' } },
{
stopped_at: { '$lt': ISODate("2023-12-07T13:12:00.957Z") }
}
]
},
nReturned: 0,
executionTimeMillisEstimate: 14065,
works: 87412,
advanced: 0,
needTime: 87411,
needYield: 0,
saveState: 892,
restoreState: 892,
isEOF: 1,
docsExamined: 87411,
alreadyHasObj: 0,
inputStage: {
stage: 'IXSCAN',
nReturned: 87411,
executionTimeMillisEstimate: 846,
works: 87412,
advanced: 87411,
needTime: 0,
needYield: 0,
saveState: 892,
restoreState: 892,
isEOF: 1,
keyPattern: { created_at: 1 },
indexName: 'created_at_1',
isMultiKey: false,
multiKeyPaths: { created_at: [] },
isUnique: false,
isSparse: false,
isPartial: false,
indexVersion: 2,
direction: 'forward',
indexBounds: { created_at: [ '[MinKey, MaxKey]' ] },
keysExamined: 87411,
seeks: 1,
dupsTested: 0,
dupsDropped: 0
}
}
},
serverInfo: {
host: 'mongo-0',
port: 27017,
version: '4.2.14',
gitVersion: '0e6db36e92d82cc81cbd40ffd607eae88dc1f09d'
},
ok: 1
}
explain.executionStats
winning plan 执行的统计信息
explain.executionStats.nReturned
符合条件的总行数
explain.executionStats.executionTimeMillis
执行时间,毫秒
explain.executionStats.totalKeysExamined
索引扫描条目总数
explain.executionStats.totalDocsExamined
文档扫描条目总数,对应 COLLSCAN
和 FETCH
索引操作
查看索引
db.boxes.getIndexes();
创建索引
db.boxes.createIndex({"status:": 1, "stopped_at": -1});
删除索引
db.boxes.dropIndex("created_at_1_status_1_deleted_at_1")
数据归档
将用不到的历史数据分离出来建立归档,也是一种有效减少数据量提升查询性能的手段。
1、备份 collection
db.boxes.aggregate([{$match: { created_at: { $lte: new Date(2023, 1, 1)} }}, {$out: "boxes_archive"}]);
2、删除历史数据
db.boxes.deleteMany({ created_at: { $lt: new Date(2023, 1, 1)}});