Bases: BaseNodePostprocessor
Models struggle to access significant details found in the middle of long contexts. A study (https://arxiv.org/abs/2307.03172) observed that the best performance typically arises when crucial data is positioned at the start or end of the input context. Additionally, performance drops notably as the input context lengthens, even in models designed for long contexts.
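A minimal usage sketch, assuming an index built with llama_index.core; the data directory, top-k value, and query string below are placeholders:

from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.postprocessor import LongContextReorder

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)

# Reorder retrieved nodes so the most relevant ones land at the start
# and end of the context passed to the LLM.
query_engine = index.as_query_engine(
    similarity_top_k=10,
    node_postprocessors=[LongContextReorder()],
)
response = query_engine.query("Summarize the key findings.")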
Source code in llama-index-core/llama_index/core/postprocessor/node.py
class LongContextReorder(BaseNodePostprocessor):
"""
Models struggle to access significant details found
in the center of extended contexts. A study
(https://arxiv.org/abs/2307.03172) observed that the best
performance typically arises when crucial data is positioned
at the start or conclusion of the input context. Additionally,
as the input context lengthens, performance drops notably, even
    in models designed for long contexts.
"""
@classmethod
def class_name(cls) -> str:
        return "LongContextReorder"

    def _postprocess_nodes(
self,
nodes: List[NodeWithScore],
query_bundle: Optional[QueryBundle] = None,
) -> List[NodeWithScore]:
"""Postprocess nodes."""
        reordered_nodes: List[NodeWithScore] = []
        # Sort ascending by score, treating a missing score as 0.
        ordered_nodes: List[NodeWithScore] = sorted(
            nodes, key=lambda x: x.score if x.score is not None else 0
        )
        for i, node in enumerate(ordered_nodes):
            if i % 2 == 0:
                # Alternate between prepending and appending so the
                # highest-scoring nodes land at the ends of the list
                # and the lowest-scoring nodes sit in the middle.
                reordered_nodes.insert(0, node)
            else:
                reordered_nodes.append(node)
return reordered_nodes
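A minimal sketch of the resulting order, using hypothetical scores; TextNode and NodeWithScore come from llama_index.core.schema:

from llama_index.core.postprocessor import LongContextReorder
from llama_index.core.schema import NodeWithScore, TextNode

# Five nodes with hypothetical relevance scores.
nodes = [
    NodeWithScore(node=TextNode(text=f"chunk {i}"), score=score)
    for i, score in enumerate([0.1, 0.4, 0.2, 0.5, 0.3])
]

reordered = LongContextReorder().postprocess_nodes(nodes)
print([n.score for n in reordered])
# Highest-scoring nodes are pushed to the ends, weakest to the middle:
# [0.5, 0.3, 0.1, 0.2, 0.4]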