llama3_2_vision_11b

torchtune.models.llama3_2_vision.llama3_2_vision_11b(decoder_trainable: bool = False, encoder_trainable: bool = True, fusion_trainable: bool = True, image_size: int = 560) → DeepFusionModel

Llama 3.2 Vision 11B model.

Parameters:

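decoder_trainable (bool) – Whether to make the decoder parameters trainable. Default is False.
encoder_trainable (bool) – Whether to make the encoder parameters trainable. Default is True.
fusion_trainable (bool) – Whether to make the fusion parameters trainable. Default is True.
image_size (int) – Base image size that images are tiled and resized to. The default, 560, is for the Instruct weights; use 448 for the pre-trained weights.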
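As a rough illustration of what image_size controls, the sketch below computes the pixel size of a tiled canvas and the number of patches in one tile. The helper names are hypothetical and not part of torchtune, and the patch size of 14 is an assumption about the vision encoder rather than something stated on this page.

```python
def canvas_size(image_size: int, tiles_h: int, tiles_w: int) -> tuple[int, int]:
    """Pixel (height, width) of a canvas made of tiles_h x tiles_w square tiles."""
    return tiles_h * image_size, tiles_w * image_size


def patches_per_tile(image_size: int, patch_size: int = 14) -> int:
    """Number of square patches in one tile.

    patch_size=14 is an assumed value for the vision encoder; adjust it if
    your encoder configuration differs.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    side = image_size // patch_size
    return side * side


if __name__ == "__main__":
    # Compare the two image sizes named in the parameter description.
    for size in (560, 448):
        print(size, canvas_size(size, 2, 2), patches_per_tile(size))
```

Under that assumption, the larger base size yields more patches per tile, so switching between 560 and 448 changes the sequence length the encoder sees for each tile.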

Returns:

An instance of the Llama 3.2 Vision 11B model.

Return type:

DeepFusionModel
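A minimal usage sketch follows, calling the builder with the documented defaults and checking which share of the parameters ends up trainable. The helper names build_model and trainable_fraction are hypothetical. Building under torch.device("meta") is an assumption made here to avoid allocating memory for the full model; load real weights through your usual checkpointing path instead.

```python
def build_model(image_size: int = 560):
    """Build the 11B model with a frozen decoder (the documented defaults)."""
    # Imported lazily so this module can be imported without torchtune installed.
    import torch
    from torchtune.models.llama3_2_vision import llama3_2_vision_11b

    # Assumption: the "meta" device creates parameter shapes without storage,
    # which is enough to inspect requires_grad flags.
    with torch.device("meta"):
        return llama3_2_vision_11b(
            decoder_trainable=False,
            encoder_trainable=True,
            fusion_trainable=True,
            image_size=image_size,
        )


def trainable_fraction(model) -> float:
    """Fraction of parameter elements with requires_grad=True."""
    total = 0
    trainable = 0
    for param in model.parameters():
        count = param.numel()
        total += count
        if param.requires_grad:
            trainable += count
    return trainable / total if total else 0.0


if __name__ == "__main__":
    model = build_model()
    print(f"trainable fraction: {trainable_fraction(model):.3f}")
```

Passing image_size=448 to build_model matches the setting this page recommends for the pre-trained weights.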