
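Search and transform

Search and replace is a powerful tool in the developer toolbox that can save you time and effort... provided you can formulate the right regular expression.

Search and transform is a twist on the same concept, but we use an LLM to perform the transformation instead of a simple string replacement.

👩‍💻 Understanding the script code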

script({
    title: "Search and transform",
    description:
        "Search for a pattern in files and apply an LLM transformation to the match",
    parameters: {
        glob: {
            type: "string",
            description: "The glob pattern to filter files",
            default: "*",
        },
        pattern: {
            type: "string",
            description: "The text pattern (regular expression) to search for",
        },
        transform: {
            type: "string",
            description: "The LLM transformation to apply to the match",
        },
    },
})

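The script starts by calling the script function to define its purpose and parameters. Here we declare the title, the description, and the three parameters the script needs: glob to select files, pattern for the text to search for, and transform for the desired transformation.

Extracting and validating parameters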

const { pattern, glob, transform } = env.vars
if (!pattern) cancel("pattern is missing")
const patternRx = new RegExp(pattern, "g")
if (!transform) cancel("transform is missing")

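Next, we extract the pattern, glob, and transform parameters from env.vars and validate them. If pattern or transform is missing, the script cancels. We also compile pattern into a regular expression object with the global (g) flag for later use.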
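The g flag matters because the later steps rely on matchAll, which rejects non-global regular expressions. The following is a minimal sketch of the validation step in plain TypeScript; compilePattern is a hypothetical helper (the script itself calls cancel instead of throwing), and the TODO pattern is only sample input.

```typescript
// Hypothetical helper, not part of the script: compile a user-supplied
// pattern into a global RegExp and report a readable error on bad input.
function compilePattern(pattern: string | undefined): RegExp {
    if (!pattern) throw new Error("pattern is missing")
    try {
        // The "g" flag is required: String.prototype.matchAll throws a
        // TypeError when it is given a non-global regular expression.
        return new RegExp(pattern, "g")
    } catch (e) {
        throw new Error(`invalid regular expression: ${pattern}`)
    }
}

// Sample input (assumption): find TODO(name) markers.
const todoRx = compilePattern("TODO\\((\\w+)\\)")
const found = Array.from(
    "TODO(alice) and TODO(bob)".matchAll(todoRx),
    (m) => m[0]
)
// found is ["TODO(alice)", "TODO(bob)"]
```

Wrapping the RegExp constructor in a try/catch is a design choice for the sketch: an unbalanced pattern such as "(" otherwise surfaces as a raw SyntaxError.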

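Searching files and matches

const { files } = await workspace.grep(patternRx, glob)

Here, we use the grep function of the workspace API to search for files that match the glob pattern and contain the regular expression.

Transforming matches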

// cached computed transformations
const patches = {}
for (const file of files) {
    console.log(file.filename)
    const { content } = await workspace.readText(file.filename)
    // skip binary files
    if (!content) continue
    // compute transforms
    for (const match of content.matchAll(patternRx)) {
        console.log(` ${match[0]}`)
        if (patches[match[0]]) continue

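We initialize an object named patches to cache the transformations. Then we iterate over each file, read its content, and skip files with no readable content (such as binary files). For each match found in the file content, we check whether a transformation has already been computed for that exact match text, to avoid repeating work.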
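The caching idea can be sketched outside GenAIScript as follows. This is an illustrative variant, not the script's code: it uses a Map instead of a plain object, and fakeTransform is a hypothetical stand-in for the LLM call. A Map has no inherited keys, so matched text such as "constructor" is not mistaken for an existing cache entry, which can happen with a lookup on an object literal.

```typescript
// Sketch: a cache keyed by the matched text.
const cache = new Map<string, string>()

// Hypothetical stand-in for the LLM call; it only upper-cases the text
// and counts how often it is invoked.
let calls = 0
function fakeTransform(text: string): string {
    calls++
    return text.toUpperCase()
}

// Return the cached transformation, computing it on first use only.
function transformOnce(text: string): string {
    const cached = cache.get(text)
    if (cached !== undefined) return cached
    const result = fakeTransform(text)
    cache.set(text, result)
    return result
}

const words = ["foo", "constructor", "foo", "constructor"]
const results = words.map(transformOnce)
// calls is 2: each distinct match is transformed only once
```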

Generating the transformation prompt

        const res = await runPrompt(
            (_) => {
                _.$`
            ## Task
            Your task is to transform the MATCH using the following TRANSFORM.
            Return the transformed text.
            - do NOT add enclosing quotes.
            ## Context
            `
                _.def("MATCHED", match[0])
                _.def("TRANSFORM", transform)
            },
            { label: match[0], system: [], cache: "search-and-transform" }
        )

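For each unique match, we call runPrompt to run a prompt. The prompt states the transformation task and its context, and instructs the model to return the transformed text without enclosing quotes. We also define the matched text and the transformation to apply as prompt variables.

Applying the transformation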

        // use the first fenced block if there is one, otherwise the raw text
        const transformed = res.fences?.[0]?.content ?? res.text
        if (transformed) patches[match[0]] = transformed
        console.log(` ${match[0]} -> ${transformed ?? "?"}`)
    }
    // apply transforms
    const newContent = content.replace(
        patternRx,
        (match) => patches[match] ?? match
    )

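We then extract the transformed text from the prompt result (the first fenced block if there is one, otherwise the raw text) and store it in the patches object. Finally, we apply the transformations to the file content using String.prototype.replace.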
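As a small illustration of the replacement step, the sketch below passes a replacer callback to String.prototype.replace with a global regular expression. The sample text, the pattern, and the patch entries are assumptions made for the example; matches without a cached patch are left unchanged, mirroring the ?? match fallback in the script.

```typescript
// Sketch: apply cached replacements with a replacer callback.
// The patch entries below are made-up sample data.
const samplePatches = new Map<string, string>([
    ["colour", "color"],
    ["flavour", "flavor"],
])

// Matches every word ending in "our", including "your".
const ourRx = /\b\w+our\b/g
const sampleContent = "The colour and flavour of your tea."

// Look up each match; fall back to the original text when no patch exists.
const patched = sampleContent.replace(
    ourRx,
    (m) => samplePatches.get(m) ?? m
)
// patched is "The color and flavor of your tea."
```

Because the callback receives the full matched text, the same cache key used while computing transformations can be reused directly at replacement time.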

Saving the changes

    if (content !== newContent)
        await workspace.writeText(file.filename, newContent)
}

If the file content changed after applying the transformations, we save the updated content back to the file.
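To see the steps working together, here is a rough end-to-end sketch over a hypothetical in-memory workspace. It is not the GenAIScript implementation: the Files map stands in for the workspace API, and the transform callback is a synchronous stand-in for the asynchronous LLM call.

```typescript
// Hypothetical in-memory workspace: filename -> content.
type Files = Map<string, string>

// Synchronous stand-in for the LLM transformation (assumption).
type Transform = (match: string) => string

// Returns the names of the files whose content was rewritten.
function searchAndTransform(
    files: Files,
    pattern: RegExp, // must carry the "g" flag
    transform: Transform
): string[] {
    const patches = new Map<string, string>()
    const written: string[] = []
    for (const [filename, content] of files) {
        // Compute each distinct transformation once.
        for (const m of content.matchAll(pattern)) {
            if (!patches.has(m[0])) patches.set(m[0], transform(m[0]))
        }
        const newContent = content.replace(
            pattern,
            (m) => patches.get(m) ?? m
        )
        // Write only when something actually changed.
        if (newContent !== content) {
            files.set(filename, newContent)
            written.push(filename)
        }
    }
    return written
}

const demoFiles: Files = new Map([
    ["a.txt", "v1 and v1"],
    ["b.txt", "nothing here"],
])
const writtenFiles = searchAndTransform(demoFiles, /v\d/g, (m) =>
    m.toUpperCase()
)
// writtenFiles is ["a.txt"]; demoFiles.get("a.txt") is "V1 and V1"
```

Comparing the old and new content before writing keeps untouched files out of the change set, which is the same guard the script applies before calling writeText.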

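Running the script

To run this script, you need the GenAIScript CLI. See the installation guide if you need to set it up. Once the CLI is installed, run the script with the following command:

Terminal window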
genaiscript run st

Full source (GitHub)

st.genai.mts
script({
    title: "Search and transform",
    description:
        "Search for a pattern in files and apply an LLM transformation to the match",
    parameters: {
        glob: {
            type: "string",
            description: "The glob pattern to filter files",
        },
        pattern: {
            type: "string",
            description: "The text pattern (regular expression) to search for",
        },
        transform: {
            type: "string",
            description: "The LLM transformation to apply to the match",
        },
    },
})
// read parameters, prompting the user for any that are missing
let { pattern, glob, transform } = env.vars
if (!glob)
    glob =
        (await host.input(
            "Enter the glob pattern to filter files (default: *)"
        )) || "*"
if (!pattern)
    pattern = await host.input(
        "Enter the pattern to search for (regular expression)"
    )
if (!pattern) cancel("pattern is missing")
const patternRx = new RegExp(pattern, "g")
if (!transform)
    transform = await host.input(
        "Enter the LLM transformation to apply to the match"
    )
if (!transform) cancel("transform is missing")
const { files } = await workspace.grep(patternRx, { glob })
// cached computed transformations
const patches = {}
for (const file of files) {
    console.log(file.filename)
    const { content } = await workspace.readText(file.filename)
    // skip binary files
    if (!content) continue
    // compute transforms
    for (const match of content.matchAll(patternRx)) {
        console.log(` ${match[0]}`)
        if (patches[match[0]]) continue
        const res = await runPrompt(
            (_) => {
                _.$`
            ## Task
            Your task is to transform the MATCH with the following TRANSFORM.
            Return the transformed text.
            - do NOT add enclosing quotes.
            ## Context
            `
                _.def("MATCHED", match[0])
                _.def("TRANSFORM", transform, {
                    detectPromptInjection: "available",
                })
            },
            {
                label: match[0],
                system: [
                    "system.assistant",
                    "system.safety_jailbreak",
                    "system.safety_harmful_content",
                ],
                cache: "search-and-transform",
            }
        )
        // use the first fenced block if there is one, otherwise the raw text
        const transformed = res.fences?.[0]?.content ?? res.text
        if (transformed) patches[match[0]] = transformed
        console.log(` ${match[0]} -> ${transformed ?? "?"}`)
    }
    // apply transforms
    const newContent = content.replace(
        patternRx,
        (match) => patches[match] ?? match
    )
    // save results if file content is modified
    if (content !== newContent)
        await workspace.writeText(file.filename, newContent)
}

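Content safety

The following measures are taken to ensure the safety of the generated content: the full source runs the prompt with the system.safety_jailbreak and system.safety_harmful_content system prompts, and it defines the TRANSFORM variable with detectPromptInjection set to "available".

Additional measures to further enhance safety would be to run a model with a safety filter or to validate the messages with a content safety service.

For more information about content safety, see the Transparency Note.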