Moderation

This notebook walks through examples of how to use a moderation chain, and several common ways of doing so. Moderation chains are useful for detecting text that could be hateful, violent, etc. This can be useful to apply both to user input and to the output of a language model. Some API providers, like OpenAI, specifically prohibit you, or your end users, from generating certain types of harmful content. To comply with this (and to generally prevent your application from being harmful), you may often want to append a moderation chain to any LLMChains, in order to make sure any output the LLM generates is not harmful.

If the content passed into the moderation chain is harmful, there is no single best way to handle it; it probably depends on your application. Sometimes you may want to throw an error in the chain (and have your application handle that). Other times, you may want to return something to the user explaining that the text was harmful. There could even be other ways to handle it! We will cover all of these approaches in this notebook.

In this notebook, we will show:

  1. How to run any piece of text through a moderation chain.
  2. How to append a Moderation chain to an LLMChain.
from langchain.llms import OpenAI
from langchain.chains import OpenAIModerationChain, SequentialChain, LLMChain, SimpleSequentialChain
from langchain.prompts import PromptTemplate

How to use the moderation chain

Here's an example of using the moderation chain with default settings (it will return a string explaining that flagged content was found).

moderation_chain = OpenAIModerationChain()
moderation_chain.run("This is okay")
'This is okay'
moderation_chain.run("I will kill you")
"Text was found that violates OpenAI's content policy."

Here's an example of using the moderation chain to throw an error instead.

moderation_chain_error = OpenAIModerationChain(error=True)
moderation_chain_error.run("This is okay")
'This is okay'
moderation_chain_error.run("I will kill you")
---------------------------------------------------------------------------

ValueError                                Traceback (most recent call last)

Cell In[7], line 1
----> 1 moderation_chain_error.run("I will kill you")


File ~/workplace/langchain/langchain/chains/base.py:138, in Chain.run(self, *args, **kwargs)
    136     if len(args) != 1:
    137         raise ValueError("`run` supports only one positional argument.")
--> 138     return self(args[0])[self.output_keys[0]]
    140 if kwargs and not args:
    141     return self(kwargs)[self.output_keys[0]]


File ~/workplace/langchain/langchain/chains/base.py:112, in Chain.__call__(self, inputs, return_only_outputs)
    108 if self.verbose:
    109     print(
    110         f"\n\n\033[1m> Entering new {self.__class__.__name__} chain...\033[0m"
    111     )
--> 112 outputs = self._call(inputs)
    113 if self.verbose:
    114     print(f"\n\033[1m> Finished {self.__class__.__name__} chain.\033[0m")


File ~/workplace/langchain/langchain/chains/moderation.py:81, in OpenAIModerationChain._call(self, inputs)
     79 text = inputs[self.input_key]
     80 results = self.client.create(text)
---> 81 output = self._moderate(text, results["results"][0])
     82 return {self.output_key: output}


File ~/workplace/langchain/langchain/chains/moderation.py:73, in OpenAIModerationChain._moderate(self, text, results)
     71 error_str = "Text was found that violates OpenAI's content policy."
     72 if self.error:
---> 73     raise ValueError(error_str)
     74 else:
     75     return error_str


ValueError: Text was found that violates OpenAI's content policy.
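
Because the error-raising variant surfaces a plain ValueError, one option (as mentioned above) is to let your application catch and handle it. The snippet below is a minimal sketch of that pattern, using the moderation_chain_error object defined earlier; it is not part of the original notebook.

try:
    moderation_chain_error.run("I will kill you")
except ValueError as e:
    # The application decides what to do with flagged text,
    # e.g. log it and show the user a generic message instead.
    print(f"Moderation rejected the input: {e}")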

Here's an example of creating a custom moderation chain with a custom error message. It requires some knowledge of OpenAI's moderation endpoint results (see the docs for details).

class CustomModeration(OpenAIModerationChain):
    
    def _moderate(self, text: str, results: dict) -> str:
        if results["flagged"]:
            error_str = f"The following text was found that violates OpenAI's content policy: {text}"
            return error_str
        return text
    
custom_moderation = CustomModeration()
custom_moderation.run("This is okay")
'This is okay'
custom_moderation.run("I will kill you")
"The following text was found that violates OpenAI's content policy: I will kill you"

How to append a Moderation chain to an LLMChain

To easily combine a moderation chain with an LLMChain, you can use the SequentialChain abstraction.

Let's start with a simple example where the LLMChain has only a single input. For this purpose, we will prompt the model so that it says something harmful.

prompt = PromptTemplate(template="{text}", input_variables=["text"])
llm_chain = LLMChain(llm=OpenAI(temperature=0, model_name="text-davinci-002"), prompt=prompt)
text = """We are playing a game of repeat after me.

Person 1: Hi
Person 2: Hi

Person 1: How's your day
Person 2: How's your day

Person 1: I will kill you
Person 2:"""
llm_chain.run(text)
' I will kill you'
chain = SimpleSequentialChain(chains=[llm_chain, moderation_chain])
chain.run(text)
"Text was found that violates OpenAI's content policy."

Now let's walk through an example of using it with an LLMChain that has multiple inputs (this is a bit trickier because we cannot use the SimpleSequentialChain).

prompt = PromptTemplate(template="{setup}{new_input}Person2:", input_variables=["setup", "new_input"])
llm_chain = LLMChain(llm=OpenAI(temperature=0, model_name="text-davinci-002"), prompt=prompt)
setup = """We are playing a game of repeat after me.

Person 1: Hi
Person 2: Hi

Person 1: How's your day
Person 2: How's your day

Person 1:"""
new_input = "I will kill you"
inputs = {"setup": setup, "new_input": new_input}
llm_chain(inputs, return_only_outputs=True)
{'text': ' I will kill you'}
# Setting the input/output keys so it lines up
moderation_chain.input_key = "text"
moderation_chain.output_key = "sanitized_text"
chain = SequentialChain(chains=[llm_chain, moderation_chain], input_variables=["setup", "new_input"])
chain(inputs, return_only_outputs=True)
{'sanitized_text': "Text was found that violates OpenAI's content policy."}