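Kubernetes

This contains the TorchX Kubernetes scheduler, which can be used to run TorchX components on a Kubernetes cluster.

Prerequisites

The TorchX Kubernetes scheduler depends on Volcano. If you are performing an upgrade, you need to completely remove all non-Job Volcano resources and recreate them.

Install Volcano: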

kubectl apply -f https://raw.githubusercontent.com/volcano-sh/volcano/v1.6.0/installer/volcano-development.yaml

See the Volcano Quickstart for more information.

class torchx.schedulers.kubernetes_scheduler.KubernetesScheduler(session_name: str, client: Optional[ApiClient] = None, docker_client: Optional[DockerClient] = None)

Bases: DockerWorkspaceMixin, Scheduler[KubernetesOpts]

KubernetesScheduler is the TorchX scheduling interface to Kubernetes.

Important: Volcano must be installed on the Kubernetes cluster. TorchX requires gang scheduling for multi-replica/multi-role execution, and Volcano is currently the only supported scheduler for Kubernetes that provides it. For installation instructions see: https://github.com/volcano-sh/volcano

This has been confirmed to work with Volcano v1.3.0 and Kubernetes versions v1.18-1.21. See https://github.com/pytorch/torchx/issues/120, which tracks Volcano support for Kubernetes v1.22.

Note

AppDefs that have more than 0 retries may not be displayed as pods if they fail. This is due to a known bug in Volcano (as of release 1.4.0): https://github.com/volcano-sh/volcano/issues/1651

$ pip install torchx[kubernetes]
$ torchx run --scheduler kubernetes --scheduler_args namespace=default,queue=test utils.echo --image alpine:latest --msg hello
kubernetes://torchx_user/1234
$ torchx status kubernetes://torchx_user/1234
...

Config Options

    usage:
        queue=QUEUE,[namespace=NAMESPACE],[service_account=SERVICE_ACCOUNT],[priority_class=PRIORITY_CLASS],[image_repo=IMAGE_REPO],[quiet=QUIET]

    required arguments:
        queue=QUEUE (str)
            Volcano queue to schedule job in

    optional arguments:
        namespace=NAMESPACE (str, default)
            Kubernetes namespace to schedule job in
        service_account=SERVICE_ACCOUNT (str, None)
            The service account name to set on the pod specs
        priority_class=PRIORITY_CLASS (str, None)
            The name of the PriorityClass to set on the job specs
        image_repo=IMAGE_REPO (str, None)
            (remote jobs) the image repository to use when pushing patched images, must have push access. Ex: example.com/your/container
        quiet=QUIET (bool, False)
            whether to suppress verbose output for image building. Defaults to ``False``.
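The `--scheduler_args` value in the CLI example above is a comma-separated list of `key=value` pairs drawn from these options. As a minimal sketch, the helper below assembles such a string from a dict and rejects a config that lacks the required `queue` option. The function `format_scheduler_args` is hypothetical and not part of TorchX; only the option names come from the usage text.

```python
from typing import Dict

# Option names copied from the usage text; the helper itself is hypothetical.
REQUIRED = {"queue"}
OPTIONAL = {"namespace", "service_account", "priority_class", "image_repo", "quiet"}


def format_scheduler_args(cfg: Dict[str, object]) -> str:
    """Render cfg as the key=value,key=value string passed to --scheduler_args."""
    missing = REQUIRED - set(cfg)
    if missing:
        raise ValueError(f"missing required option(s): {sorted(missing)}")
    unknown = set(cfg) - REQUIRED - OPTIONAL
    if unknown:
        raise ValueError(f"unknown option(s): {sorted(unknown)}")
    # Dicts keep insertion order, so the output order is predictable.
    return ",".join(f"{key}={value}" for key, value in cfg.items())


if __name__ == "__main__":
    # Same options as the CLI example: namespace=default,queue=test
    print(format_scheduler_args({"namespace": "default", "queue": "test"}))
```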

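Mounts

Mounting external filesystems/volumes is supported via HostPath and PersistentVolumeClaim.

hostPath volumes: type=bind,src=<host path>,dst=<container path>[,readonly]
PersistentVolumeClaim: type=volume,src=<claim>,dst=<container path>[,readonly]
host devices: type=device,src=/dev/foo[,dst=<container path>][,perm=rwm] If you specify a host device, the job will run in privileged mode, since Kubernetes does not expose a way to pass --device to the underlying container runtime. Users should prefer device plugins.

See torchx.specs.parse_mounts() for more information.

External docs: https://kubernetes.io/docs/concepts/storage/persistent-volumes/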
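The three mount forms differ only in their `type` and in which optional fields they accept. The sketch below builds each form as a plain string, assuming that `src` names a host path for `bind`, a claim name for `volume`, and a device path for `device`. The helper names, paths, and claim name are hypothetical; TorchX itself parses such strings with torchx.specs.parse_mounts().

```python
from typing import Optional


def bind_mount(src: str, dst: str, readonly: bool = False) -> str:
    """hostPath volume: type=bind,src=...,dst=...[,readonly]"""
    spec = f"type=bind,src={src},dst={dst}"
    return spec + ",readonly" if readonly else spec


def volume_mount(claim: str, dst: str, readonly: bool = False) -> str:
    """PersistentVolumeClaim: type=volume,src=...,dst=...[,readonly]"""
    spec = f"type=volume,src={claim},dst={dst}"
    return spec + ",readonly" if readonly else spec


def device_mount(src: str, dst: Optional[str] = None, perm: Optional[str] = None) -> str:
    """Host device: type=device,src=...[,dst=...][,perm=...]"""
    parts = ["type=device", f"src={src}"]
    if dst is not None:
        parts.append(f"dst={dst}")
    if perm is not None:
        parts.append(f"perm={perm}")
    return ",".join(parts)


if __name__ == "__main__":
    print(bind_mount("/data", "/mnt/data", readonly=True))
    print(volume_mount("my-claim", "/mnt/claim"))
    print(device_mount("/dev/foo", perm="rwm"))
```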

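Resources / Allocation

To select a specific machine type, add a capability to your resources with node.kubernetes.io/instance-type. This constrains the launched job to nodes of that instance type.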

>>> from torchx import specs
>>> specs.Resource(
...     cpu=4,
...     memMB=16000,
...     gpu=2,
...     capabilities={
...         "node.kubernetes.io/instance-type": "<cloud instance type>",
...     },
... )
Resource(...)

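Kubernetes may reserve some memory for the host. TorchX assumes you are scheduling on whole hosts and therefore automatically reduces the resource request by a small amount to account for the CPU and memory reserved by the node. If you run into scheduling issues, you may need to reduce the requested CPU and memory below the host values.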
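The effect of this reduction can be sketched as follows. The reserve sizes below are hypothetical values chosen for illustration, not figures taken from this page, and the sketch assumes the limit stays at the full value while only the request is lowered.

```python
from typing import Dict, Tuple

# Hypothetical reserve sizes, for illustration only.
RESERVED_MILLICPU = 100
RESERVED_MEMMB = 1024


def requests_and_limits(cpu: int, mem_mb: int) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (requests, limits) with the requests reduced by the node reserve."""
    requests: Dict[str, str] = {}
    limits: Dict[str, str] = {}
    if cpu > 0:
        mcpu = cpu * 1000
        limits["cpu"] = f"{mcpu}m"
        # Never request a negative amount.
        requests["cpu"] = f"{max(mcpu - RESERVED_MILLICPU, 0)}m"
    if mem_mb > 0:
        limits["memory"] = f"{mem_mb}M"
        requests["memory"] = f"{max(mem_mb - RESERVED_MEMMB, 0)}M"
    return requests, limits


if __name__ == "__main__":
    # Same sizes as the Resource example: cpu=4, memMB=16000
    print(requests_and_limits(4, 16000))
```

If pods stay pending with these numbers, lowering `cpu` and `mem_mb` further below the host capacity is the adjustment the paragraph describes.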

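Compatibility

Feature	Scheduler Support
Fetch Logs	✔️
Distributed Jobs	✔️
Cancel Job	✔️
Describe Job	Partial support. KubernetesScheduler returns job and replica status but does not provide the complete original AppSpec.
Workspaces / Patching	✔️
Mounts	✔️
Elasticity	Requires Volcano >1.6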

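describe(app_id: str) → Optional[DescribeAppResponse]

Describes the specified application.

Returns: The AppDef description, or None if the app does not exist.

list() → List[ListAppResponse]

For apps launched on the scheduler, this API returns a list of ListAppResponse objects, each of which contains an app ID and its status. Note: this API is in the prototype phase and is subject to change.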

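log_iter(app_id: str, role_name: str, k: int = 0, regex: Optional[str] = None, since: Optional[datetime] = None, until: Optional[datetime] = None, should_tail: bool = False, streams: Optional[Stream] = None) → Iterable[str]

Returns an iterator over the log lines of the k-th replica of the role. The iterator ends when all qualifying log lines have been read.

If the scheduler supports time-based cursors for fetching log lines in custom time ranges, the since and until fields are honored; otherwise they are ignored. Not specifying since and until is equivalent to getting all available log lines. If until is empty, the iterator behaves like tail -f, following the log output until the job reaches a terminal state.

The exact definition of what constitutes a log is scheduler specific. Some schedulers may consider stderr or stdout as the log, while others may read the logs from a log file.

Behaviors and assumptions: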

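1. Produces undefined behavior if called on an app that does not exist. The caller should check that the app exists using exists(app_id) before calling this method.
2. Is not stateful: calling this method twice with the same parameters returns a new iterator. Prior iteration progress is lost.
3. Does not always support log-tailing. Not all schedulers support live log iteration (e.g. tailing logs while the app is running). Refer to the specific scheduler's documentation for the iterator's behavior.
   3.1. If the scheduler supports log-tailing, it should be controlled by the should_tail parameter.
4. Does not guarantee log retention. By the time this method is called, the underlying scheduler may have purged the log records for this application. If so, this method raises an arbitrary exception.
5. If should_tail is True, the method only raises a StopIteration exception when the accessible log lines have been fully exhausted and the app has reached a final state. For instance, if the app gets stuck and does not produce any log lines, the iterator blocks until the app is eventually killed (either via timeout or manually), at which point it raises StopIteration.
   If should_tail is False, the method raises StopIteration when there are no more logs.
6. Need not be supported by all schedulers.
7. Some schedulers may support line cursors by supporting __getitem__ (e.g. iter[50] seeks to the 50th log line).
8. Whitespace is preserved; each new line should include \n. To support interactive progress bars, the returned lines do not need to include \n, but should then be printed without a newline to correctly handle \r carriage returns.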

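Parameters: streams – The IO output streams to select. One of: combined, stdout, stderr. If the selected stream is not supported by the scheduler, it throws a ValueError.
Returns: An Iterator over the log lines of the specified role replica.
Raises: NotImplementedError – if the scheduler does not support log iteration.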
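Two of the rules above concern the consumer rather than the scheduler: lines must be written out exactly as returned, and each call yields a fresh iterator. The sketch below illustrates both with a stand-in generator. `fake_log_iter` and `write_lines` are hypothetical names, and no real scheduler is contacted.

```python
import sys
from typing import Iterable, Iterator, List, TextIO


def fake_log_iter(lines: List[str]) -> Iterator[str]:
    """Stand-in for a log iterator: yields each line, then ends with StopIteration."""
    yield from lines


def write_lines(lines: Iterable[str], out: TextIO) -> None:
    """Write lines as returned, without appending a newline.

    Lines that end in a carriage return can then overwrite each other
    on a terminal, which is what interactive progress bars rely on.
    """
    for line in lines:
        out.write(line)
    out.flush()


if __name__ == "__main__":
    log = ["epoch 1\n", "progress 10%\r", "progress 100%\n"]
    write_lines(fake_log_iter(log), sys.stdout)
```

Using `print(line)` instead would append a newline after every `\r` line and break the in-place progress update.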

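schedule(dryrun_info: AppDryRunInfo[KubernetesJob]) → str

Same as submit, except that it takes an AppDryRunInfo. Implementers are encouraged to implement this method rather than implementing submit directly, since submit can be trivially implemented by: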

dryrun_info = self.submit_dryrun(app, cfg)
return self.schedule(dryrun_info)
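To show how the pieces fit together, here is a self-contained toy version of that pattern. `ToyScheduler` and `DryRunInfo` are hypothetical stand-ins, not TorchX classes; only the split between submit_dryrun, schedule, and submit mirrors the text.

```python
from dataclasses import dataclass
from typing import Dict, Generic, TypeVar

T = TypeVar("T")


@dataclass
class DryRunInfo(Generic[T]):
    """Toy stand-in for AppDryRunInfo: wraps the scheduler-specific request."""

    request: T


class ToyScheduler:
    """Hypothetical scheduler that only hands out sequential app IDs."""

    def __init__(self) -> None:
        self._next_id = 0

    def submit_dryrun(self, app: str, cfg: Dict[str, str]) -> DryRunInfo[Dict[str, object]]:
        # Build the request without sending it anywhere.
        return DryRunInfo(request={"app": app, "cfg": dict(cfg)})

    def schedule(self, dryrun_info: DryRunInfo[Dict[str, object]]) -> str:
        # The real work lives here: consume the prepared request, return an app ID.
        self._next_id += 1
        return f"{dryrun_info.request['app']}-{self._next_id}"

    def submit(self, app: str, cfg: Dict[str, str]) -> str:
        # submit is just submit_dryrun followed by schedule.
        dryrun_info = self.submit_dryrun(app, cfg)
        return self.schedule(dryrun_info)


if __name__ == "__main__":
    print(ToyScheduler().submit("echo", {"queue": "test"}))
```

Keeping the request construction in submit_dryrun means the same request can be inspected without launching anything and then passed to schedule unchanged.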

class torchx.schedulers.kubernetes_scheduler.KubernetesJob(images_to_push: Dict[str, Tuple[str, str]], resource: Dict[str, object])

Reference

torchx.schedulers.kubernetes_scheduler.create_scheduler(session_name: str, client: Optional[ApiClient] = None, docker_client: Optional[DockerClient] = None, **kwargs: Any) → KubernetesScheduler

torchx.schedulers.kubernetes_scheduler.app_to_resource(app: AppDef, queue: str, service_account: Optional[str], priority_class: Optional[str] = None) → Dict[str, object]

app_to_resource creates a Volcano job Kubernetes resource definition from the provided AppDef. The resource definition can be used to launch the app on Kubernetes.

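To support macros, one task is generated per replica instead of using the Volcano replicas field, because macros change the arguments on a per-replica basis.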
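The reason for one task per replica can be seen in a small sketch: once a macro expands to a different value for each replica, every replica needs its own argument list. The macro names `${replica_id}` and `${app_id}`, the task dict layout, and the function `tasks_for_role` are assumptions made for illustration and do not reproduce the actual Volcano task schema.

```python
from string import Template
from typing import Dict, List


def tasks_for_role(role_name: str, args: List[str], num_replicas: int, app_id: str) -> List[Dict[str, object]]:
    """Expand a role into one task per replica, substituting macros in the args."""
    tasks: List[Dict[str, object]] = []
    for replica_id in range(num_replicas):
        values = {"app_id": app_id, "replica_id": str(replica_id)}
        tasks.append(
            {
                "name": f"{role_name}-{replica_id}",
                # Each task holds exactly one replica because its args are unique.
                "replicas": 1,
                # safe_substitute leaves unknown macros untouched instead of raising.
                "args": [Template(arg).safe_substitute(values) for arg in args],
            }
        )
    return tasks


if __name__ == "__main__":
    for task in tasks_for_role("trainer", ["--rank", "${replica_id}"], 2, "app1"):
        print(task)
```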

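Volcano has two levels of retries: one at the task level and one at the job level. When the APPLICATION retry policy is used, the job-level retry count is set to the minimum of the roles' max_retries values.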
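The job-level rule amounts to taking a minimum over the roles. A minimal sketch, using a hypothetical `ToyRole` class in place of the real role type:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToyRole:
    """Hypothetical stand-in for a role; only max_retries matters here."""

    name: str
    max_retries: int


def job_level_retries(roles: List[ToyRole]) -> int:
    """Job-level retry count: the smallest max_retries across all roles."""
    if not roles:
        raise ValueError("at least one role is required")
    return min(role.max_retries for role in roles)


if __name__ == "__main__":
    roles = [ToyRole("trainer", 3), ToyRole("reader", 1)]
    print(job_level_retries(roles))
```

Taking the minimum means the whole job is never restarted more often than its least retry-tolerant role allows.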

torchx.schedulers.kubernetes_scheduler.pod_labels(app: AppDef, role_idx: int, role: Role, replica_id: int, app_id: str) → Dict[str, str]

torchx.schedulers.kubernetes_scheduler.role_to_pod(name: str, role: Role, service_account: Optional[str]) → V1Pod

torchx.schedulers.kubernetes_scheduler.sanitize_for_serialization(obj: object) → object