camel.models.reward 包

本页内容

camel.models.reward 包#

子模块#

camel.models.reward.base_reward_model 模块#

class camel.models.reward.base_reward_model.BaseRewardModel(model_type: ~<unknown>.ModelType | str, api_key: str | None = None, url: str | None = None)[来源]#

基类: ABC

奖励模型的抽象基类。奖励模型用于评估消息并根据不同标准返回分数。

子类应实现'evaluate'和'get_scores_types'方法。

abstract evaluate(messages: List[Dict[str, str]]) → Dict[str, float][来源]#

评估消息并根据不同标准返回评分。

Parameters:: messages (List[Dict[str, str]]) – 一个消息列表，其中每条消息都是一个包含'role'和'content'的字典。
Returns:: 一个将评分类型映射到其值的字典。
Return type:: 字典[str, float]

abstract get_scores_types() → List[str][来源]#

获取奖励模型可以返回的分数类型列表。

Returns:: 奖励模型可以返回的评分类型列表。
Return type:: List[str]

camel.models.reward.evaluator 模块#

class camel.models.reward.evaluator.Evaluator(reward_model: BaseRewardModel)[来源]#

基类：object

Evaluator类用于通过奖励模型评估消息并根据分数过滤数据。

Parameters:: reward_model (BaseRewardModel) – 用于评估消息的奖励模型。

evaluate(messages: List[Dict[str, str]]) → Dict[str, float][来源]#

使用奖励模型评估消息。

Parameters:: messages (List[Dict[str, str]]) – 消息列表，其中每条消息都是一个包含'role'和'content'键的字典。
Returns:: 一个将评分类型映射到其值的字典。
Return type:: 字典[str, float]

filter_data(messages: List[Dict[str, str]], thresholds: Dict[str, float]) → bool[来源]#

根据分数筛选消息。

Parameters:

messages (List[Dict[str, str]]) - 一个消息列表，其中每个消息都是一个包含'role'和'content'的字典。
thresholds (Dict[str, float]) – 一个将分数类型映射到其值的字典。

Returns:

一个布尔值，表示消息是否通过过滤器。

Return type:

布尔值

camel.models.reward.nemotron_model 模块#

class camel.models.reward.nemotron_model.NemotronRewardModel(model_type: ~<unknown>.ModelType | str, api_key: str | None = None, url: str | None = None)[来源]#

基础类: BaseRewardModel

基于Nemotron模型并兼容OpenAI的奖励模型。

Parameters:

model_type (Union[ModelType, str]) - 创建后端所用的模型。
api_key (Optional[str], optional) – 用于模型服务认证的API密钥。(默认: None)
url (可选[str], optional) - 模型服务的URL地址。

注意

Nemotron模型不支持模型配置。

evaluate(messages: List[Dict[str, str]]) → Dict[str, float][来源]#

使用Nemotron模型评估消息。

Parameters:

messages (List[Dict[str, str]]) - 一个消息列表，其中每条消息都是字典格式。

Returns:

一个将评分类型映射到其对应: 值的字典。

Return type:

字典[str, float]

get_scores_types() → List[str][来源]#

获取奖励模型可以返回的分数类型列表。

Returns:: 奖励模型可以返回的评分类型列表。
Return type:: List[str]

camel.models.reward.skywork_model 模块#

基础类: BaseRewardModel

基于transformers的奖励模型，它将从huggingface下载模型。

Parameters:

model_type (Union[ModelType, str]) - 创建后端所用的模型。
api_key (Optional[str], optional) – 未使用。(默认值: None)
url (可选[str], 可选) – 未使用。(默认值: None)
device_map (Optional[str], optional) - 选择设备映射。 (默认: auto)
attn_implementation (Optional[str], optional) - 选择注意力实现方式。(默认: flash_attention_2)
offload_folder (可选[str], 可选) – 选择卸载文件夹。 (默认: offload)

evaluate(messages: List[Dict[str, str]]) → Dict[str, float][来源]#

使用Skywork模型评估消息。

Parameters:: messages (List[Dict[str, str]]) – 消息列表。
Returns:: 一个包含得分的ChatCompletion对象。
Return type:: ChatCompletion

get_scores_types() → List[str][来源]#

获取分数类型

Returns:: 分数类型列表
Return type:: List[str]

模块内容#

class camel.models.reward.BaseRewardModel(model_type: ~<unknown>.ModelType | str, api_key: str | None = None, url: str | None = None)[来源]#

基类: ABC

奖励模型的抽象基类。奖励模型用于评估消息并根据不同标准返回分数。

子类应实现'evaluate'和'get_scores_types'方法。

abstract evaluate(messages: List[Dict[str, str]]) → Dict[str, float][来源]#

评估消息并根据不同标准返回评分。

Parameters:: messages (List[Dict[str, str]]) – 消息列表，其中每条消息都是一个包含'role'和'content'键的字典。
Returns:: 一个将评分类型映射到其值的字典。
Return type:: 字典[str, float]

abstract get_scores_types() → List[str][来源]#

获取奖励模型可以返回的分数类型列表。

Returns:: 奖励模型可以返回的评分类型列表。
Return type:: List[str]

class camel.models.reward.Evaluator(reward_model: BaseRewardModel)[来源]#

基类：object

Evaluator类用于通过奖励模型评估消息并根据分数过滤数据。

Parameters:: reward_model (BaseRewardModel) – 用于评估消息的奖励模型。

evaluate(messages: List[Dict[str, str]]) → Dict[str, float][来源]#

使用奖励模型评估消息。

Parameters:: messages (List[Dict[str, str]]) – 消息列表，其中每条消息都是一个包含'role'和'content'键的字典。
Returns:: 一个将评分类型映射到其值的字典。
Return type:: 字典[str, float]

filter_data(messages: List[Dict[str, str]], thresholds: Dict[str, float]) → bool[来源]#

根据分数筛选消息。

Parameters:

messages (List[Dict[str, str]]) - 一个消息列表，其中每个消息都是一个包含'role'和'content'的字典。
thresholds (Dict[str, float]) – 一个将分数类型映射到其值的字典。

Returns:

一个布尔值，表示消息是否通过过滤器。

Return type:

布尔值

class camel.models.reward.NemotronRewardModel(model_type: ~<unknown>.ModelType | str, api_key: str | None = None, url: str | None = None)[来源]#

基础类: BaseRewardModel

基于Nemotron模型并兼容OpenAI的奖励模型。

Parameters:

model_type (Union[ModelType, str]) - 创建后端所用的模型。
api_key (Optional[str], optional) – 用于模型服务认证的API密钥。(默认: None)
url (可选[str], optional) - 模型服务的URL地址。

注意

Nemotron模型不支持模型配置。

evaluate(messages: List[Dict[str, str]]) → Dict[str, float][来源]#

使用Nemotron模型评估消息。

Parameters:

messages (List[Dict[str, str]]) - 一个消息列表，其中每条消息都是字典格式。

Returns:

一个将评分类型映射到其对应: 值的字典。

Return type:

字典[str, float]

get_scores_types() → List[str][来源]#

获取奖励模型可以返回的分数类型列表。

Returns:: 奖励模型可以返回的评分类型列表。
Return type:: List[str]

基础类: BaseRewardModel

基于transformers的奖励模型，它将从huggingface下载模型。

Parameters:

model_type (Union[ModelType, str]) - 创建后端所用的模型。
api_key (Optional[str], optional) – 未使用。(默认: None)
url (可选[str], 可选) – 未使用。(默认值: None)
device_map (Optional[str], optional) – 选择设备映射。 (默认值: auto)
attn_implementation (Optional[str], optional) – 选择注意力机制的实现方式。(默认: flash_attention_2)
offload_folder (Optional[str], optional) – 选择卸载文件夹。 (默认: offload)

evaluate(messages: List[Dict[str, str]]) → Dict[str, float][来源]#

使用Skywork模型评估消息。

Parameters:: messages (List[Dict[str, str]]) – 消息列表。
Returns:: 一个包含得分的ChatCompletion对象。
Return type:: ChatCompletion

get_scores_types() → List[str][来源]#

获取分数类型

Returns:: 分数类型列表
Return type:: List[str]