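Container Images

About container images for Kubeflow Notebooks

Kubeflow Notebooks natively supports three types of notebooks: JupyterLab, RStudio, and Visual Studio Code (code-server). Any other web-based IDE should also work. Notebook servers run as containers inside a Kubernetes Pod, which means the type of IDE (and which packages are installed) is determined by the Docker image you pick for your server.

Official Images

Kubeflow provides a number of example container images to help you get started with Kubeflow Notebooks.

This chart shows how the images are related to each other (note that the nodes are clickable links to the Dockerfiles):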

%%{init: {'theme':'forest'}}%%
graph TD
  Base[Base] --> Jupyter[Jupyter]
  Base --> Code-Server[code-server]
  Base --> RStudio[RStudio]
  
  Jupyter --> PyTorch[PyTorch]
  Jupyter --> SciPy[SciPy]
  Jupyter --> TensorFlow[TensorFlow]
  
  Code-Server --> Code-Server-Conda-Python[Conda Python]
  RStudio --> Tidyverse[Tidyverse]

  PyTorch --> PyTorchFull[PyTorch Full]
  TensorFlow --> TensorFlowFull[TensorFlow Full]

  Jupyter --> PyTorchCuda[PyTorch CUDA]
  Jupyter --> TensorFlowCuda[TensorFlow CUDA]
  Jupyter --> PyTorchGaudi[PyTorch Gaudi]

  PyTorchCuda --> PyTorchCudaFull[PyTorch CUDA Full]
  TensorFlowCuda --> TensorFlowCudaFull[TensorFlow CUDA Full]
  PyTorchGaudi --> PyTorchGaudiFull[PyTorch Gaudi Full]

  click Base "https://github.com/kubeflow/kubeflow/tree/master/components/example-notebook-servers/base"
  click Jupyter "https://github.com/kubeflow/kubeflow/tree/master/components/example-notebook-servers/jupyter"
  click Code-Server "https://github.com/kubeflow/kubeflow/tree/master/components/example-notebook-servers/codeserver"
  click RStudio "https://github.com/kubeflow/kubeflow/tree/master/components/example-notebook-servers/rstudio"
  click PyTorch "https://github.com/kubeflow/kubeflow/tree/master/components/example-notebook-servers/jupyter-pytorch"
  click SciPy "https://github.com/kubeflow/kubeflow/tree/master/components/example-notebook-servers/jupyter-scipy"
  click TensorFlow "https://github.com/kubeflow/kubeflow/tree/master/components/example-notebook-servers/jupyter-tensorflow"
  click Code-Server-Conda-Python "https://github.com/kubeflow/kubeflow/tree/master/components/example-notebook-servers/codeserver-python"
  click Tidyverse "https://github.com/kubeflow/kubeflow/tree/master/components/example-notebook-servers/rstudio-tidyverse"
  click PyTorchFull "https://github.com/kubeflow/kubeflow/tree/master/components/example-notebook-servers/jupyter-pytorch-full"
  click TensorFlowFull "https://github.com/kubeflow/kubeflow/tree/master/components/example-notebook-servers/jupyter-tensorflow-full"
  click PyTorchCuda "https://github.com/kubeflow/kubeflow/tree/master/components/example-notebook-servers/jupyter-pytorch-cuda"
  click TensorFlowCuda "https://github.com/kubeflow/kubeflow/tree/master/components/example-notebook-servers/jupyter-tensorflow-cuda"
  click PyTorchCudaFull "https://github.com/kubeflow/kubeflow/tree/master/components/example-notebook-servers/jupyter-pytorch-cuda-full"
  click TensorFlowCudaFull "https://github.com/kubeflow/kubeflow/tree/master/components/example-notebook-servers/jupyter-tensorflow-cuda-full"
  click PyTorchGaudi "https://github.com/kubeflow/kubeflow/tree/master/components/example-notebook-servers/jupyter-pytorch-gaudi"
  click PyTorchGaudiFull "https://github.com/kubeflow/kubeflow/tree/master/components/example-notebook-servers/jupyter-pytorch-gaudi-full"

Base Images

These images provide a common starting point for Kubeflow Notebook containers.

Dockerfile	Container Registry	Notes
`./base`	`kubeflownotebookswg/base`	Common Base Image
`./codeserver`	`kubeflownotebookswg/codeserver`	code-server (Visual Studio Code)
`./jupyter`	`kubeflownotebookswg/jupyter`	JupyterLab
`./rstudio`	`kubeflownotebookswg/rstudio`	RStudio

Kubeflow Images

These images extend the base images with packages that are commonly used in the real world.

Dockerfile	Container Registry	Notes
`./codeserver-python`	`kubeflownotebookswg/codeserver-python`	code-server + Conda Python
`./rstudio-tidyverse`	`kubeflownotebookswg/rstudio-tidyverse`	RStudio + Tidyverse
`./jupyter-pytorch`	`kubeflownotebookswg/jupyter-pytorch`	JupyterLab + PyTorch
`./jupyter-pytorch-full`	`kubeflownotebookswg/jupyter-pytorch-full`	JupyterLab + PyTorch + Common Packages
`./jupyter-pytorch-cuda`	`kubeflownotebookswg/jupyter-pytorch-cuda`	JupyterLab + PyTorch + CUDA
`./jupyter-pytorch-cuda-full`	`kubeflownotebookswg/jupyter-pytorch-cuda-full`	JupyterLab + PyTorch + CUDA + Common Packages
`./jupyter-scipy`	`kubeflownotebookswg/jupyter-scipy`	JupyterLab + Common Packages
`./jupyter-tensorflow`	`kubeflownotebookswg/jupyter-tensorflow`	JupyterLab + TensorFlow
`./jupyter-tensorflow-full`	`kubeflownotebookswg/jupyter-tensorflow-full`	JupyterLab + TensorFlow + Common Packages
`./jupyter-tensorflow-cuda`	`kubeflownotebookswg/jupyter-tensorflow-cuda`	JupyterLab + TensorFlow + CUDA
`./jupyter-tensorflow-cuda-full`	`kubeflownotebookswg/jupyter-tensorflow-cuda-full`	JupyterLab + TensorFlow + CUDA + Common Packages
`./jupyter-pytorch-gaudi`	`kubeflownotebookswg/jupyter-pytorch-gaudi`	JupyterLab + PyTorch + Gaudi
`./jupyter-pytorch-gaudi-full`	`kubeflownotebookswg/jupyter-pytorch-gaudi-full`	JupyterLab + PyTorch + Gaudi + Common Packages

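Package Installation

Packages that users install after spawning a Kubeflow Notebook only last for the lifetime of the Pod (unless they are installed into a PVC-backed directory).

To ensure packages are preserved across Pod restarts, users need to either:

- build a custom image that includes them, or
- ensure they are installed in a PVC-backed directory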
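
As a rough illustration of the rule above, the following sketch checks whether an install location sits under a PVC-backed directory. It assumes the only PVC mount is /home/jovyan (the home directory mount described on this page); the function name `survives_restart` and the `PVC_MOUNTS` constant are hypothetical and not part of Kubeflow.

```python
from pathlib import PurePosixPath

# Assumption: the home directory is the only PVC-backed mount in the Pod.
PVC_MOUNTS = ("/home/jovyan",)


def survives_restart(install_path, pvc_mounts=PVC_MOUNTS):
    """Return True if install_path is a PVC mount or lies below one."""
    path = PurePosixPath(install_path)
    for mount in pvc_mounts:
        mount_path = PurePosixPath(mount)
        if path == mount_path or mount_path in path.parents:
            return True
    return False
```

For example, `survives_restart("/home/jovyan/.local/lib")` is true, while a path outside the home directory is not, so packages installed there would need to be baked into a custom image instead.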

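Custom Images

You can build your own custom images to use with Kubeflow Notebooks.

The easiest way to ensure your custom image meets the requirements is to extend one of our base images.

Image Requirements

For a container image to work with Kubeflow Notebooks, it must: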

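- expose an HTTP interface on port 8888:
  - Kubeflow sets an environment variable NB_PREFIX at runtime, containing the URL path under which the container is expected to listen
  - Kubeflow uses IFrames, so ensure your application sets Access-Control-Allow-Origin: * in its HTTP response headers
- run as a user called jovyan:
  - the home directory of jovyan should be /home/jovyan
  - the UID of jovyan should be 1000
- start successfully with an empty PVC mounted at /home/jovyan:
  - Kubeflow mounts a PVC at /home/jovyan to keep state across Pod restarts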
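
To make the HTTP requirements concrete, here is a minimal sketch of a web application that could sit inside a custom image. It is not how any of the official images are implemented; it only uses the Python standard library, and the names `resolve_prefix`, `is_under_prefix`, `Handler`, and `build_server` are hypothetical. It reads NB_PREFIX, answers only requests under that path, and sets the Access-Control-Allow-Origin header on every response.

```python
import os
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

PORT = 8888  # the port on which the container must expose HTTP


def resolve_prefix(environ):
    """Return the URL path from NB_PREFIX without a trailing slash."""
    return environ.get("NB_PREFIX", "").rstrip("/")


def is_under_prefix(path, prefix):
    """Return True if path is the prefix itself or lies below it."""
    return path == prefix or path.startswith(prefix + "/")


class Handler(BaseHTTPRequestHandler):
    # Read once at import time; Kubeflow sets NB_PREFIX at runtime.
    prefix = resolve_prefix(os.environ)

    def do_GET(self):
        path = self.path.split("?", 1)[0]  # ignore any query string
        ok = is_under_prefix(path, self.prefix)
        body = b"hello from a custom image\n" if ok else b"not found\n"
        self.send_response(200 if ok else 404)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        # The IDE is shown inside an IFrame, so allow any origin.
        self.send_header("Access-Control-Allow-Origin", "*")
        self.end_headers()
        self.wfile.write(body)


def build_server(port=PORT):
    """Create (but do not start) a server bound to all interfaces."""
    return ThreadingHTTPServer(("0.0.0.0", port), Handler)
```

Calling `build_server().serve_forever()` as the container's main process would start the server. A real IDE would of course serve its own UI under the prefix; the point here is only the port, the prefix handling, and the response header.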

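Installing Python Packages

You can extend one of the images and install any pip or conda packages your Kubeflow Notebook users are likely to need. As a guide, see ./jupyter-pytorch-full/Dockerfile for a pip install ... example, and ./rstudio-tidyverse/Dockerfile for a conda install ... example.

A common cause of errors is users running pip install --user ..., which leaves the home directory (backed by a PVC) with a different or incompatible version of a package that is also present in /opt/conda/...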
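
One way to spot this situation is to compare the packages in the user site directory with those shipped in the image. The sketch below does that with `importlib.metadata` from the standard library; the helper names `installed_versions` and `find_shadowed` are hypothetical, and the name normalization is deliberately simplistic.

```python
from importlib.metadata import distributions


def installed_versions(directory):
    """Map normalized package name -> version for one site-packages dir."""
    versions = {}
    for dist in distributions(path=[directory]):
        name = dist.metadata["Name"]
        if name:
            versions[name.lower().replace("_", "-")] = dist.version
    return versions


def find_shadowed(user_dir, image_dir):
    """Return {name: (user_version, image_version)} where versions differ."""
    user = installed_versions(user_dir)
    image = installed_versions(image_dir)
    return {
        name: (user[name], image[name])
        for name in user.keys() & image.keys()
        if user[name] != image[name]
    }
```

Inside a running notebook, `find_shadowed(site.getusersitepackages(), sysconfig.get_paths()["purelib"])` (after importing `site` and `sysconfig`) would list packages whose user-installed version differs from the one in the active environment. Whether a reported mismatch is actually harmful still needs a human judgment.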

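Installing Linux Packages

You can extend one of the images and install any apt-get packages your Kubeflow Notebook users are likely to need. Make sure you switch to root in the Dockerfile before running apt-get, and switch back to $NB_USER afterwards.

Configuring S6 Overlay

Some use cases require custom scripts to run when the Notebook Server container starts, and advanced users may want to add extra services that run inside the container (for example, an Apache or NGINX web server). To make this easier, we use s6-overlay.

s6-overlay differs from other init systems such as tini. While tini was created to handle a single process running in a container as PID 1, s6-overlay is built to manage multiple processes. It lets the creator of the image decide which process failures should be silently restarted and which should cause the container to exit.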

Creating Scripts

Scripts that need to run during container startup can be placed in /etc/cont-init.d/, and are executed in ascending alphanumeric order.
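
To preview the order in which a set of startup scripts would run, a small helper like the one below can be used. It assumes that "ascending alphanumeric order" means a plain lexicographic comparison of file names, which is an assumption on our part; the function name `startup_order` is hypothetical.

```python
import os


def startup_order(init_dir):
    """Return the file names in init_dir in ascending lexicographic order."""
    return sorted(os.listdir(init_dir))
```

Under that assumption, a script named 10-late sorts before one named 2-middle, because names are compared character by character. Zero-padded prefixes such as 02- and 10- avoid that surprise.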

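An example of a startup script can be found in ./rstudio/s6/cont-init.d/02-rstudio-env-fix. This script uses the with-contenv helper so that the environment variables passed to the container are available inside the script. Its purpose is to snapshot any KUBERNETES_* environment variables into Renviron.site at Pod startup, because kubectl does not work without them.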
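
The idea behind that script can be sketched in a few lines. This is not the actual startup script (which is a shell script run by s6-overlay); it is a Python rendering of the same logic, assuming the target file accepts one NAME=value entry per line. The function names and the `target_path` parameter are hypothetical.

```python
def kubernetes_env_lines(environ):
    """Return NAME=value lines for every KUBERNETES_* variable, sorted by name."""
    return [
        f"{name}={value}"
        for name, value in sorted(environ.items())
        if name.startswith("KUBERNETES_")
    ]


def snapshot_to_file(environ, target_path):
    """Append the KUBERNETES_* snapshot to target_path; return the line count."""
    lines = kubernetes_env_lines(environ)
    with open(target_path, "a", encoding="utf-8") as handle:
        for line in lines:
            handle.write(line + "\n")
    return len(lines)
```

Passing `os.environ` as `environ` would capture the variables of the current process. Appending rather than overwriting keeps any entries already present in the target file.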

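Creating Services

Extra services that should be monitored by s6-overlay belong in their own folder under /etc/services.d/. That folder contains a script called run and, optionally, a finishing script called finish.

An example of a service can be found in the run script of ./jupyter/s6/services.d/jupyterlab, which is used to start JupyterLab itself. For more information about the run and finish scripts, see the s6-overlay documentation.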

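Running Services as Root

There may be cases where you need to run a service as root. To do this, change the Dockerfile so that it ends with USER root, and then use s6-setuidgid to run the user-facing services as $NB_USER.

Our example images run s6-overlay as $NB_USER (not root), which means any files or scripts related to s6-overlay must be owned by the $NB_USER user to run successfully.

Next steps

Use your container image by specifying it when spawning your notebook server. (See the quickstart guide.)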


Last modified October 24, 2024: Operations: add references to Intel Gaudi container images (#3891) (bf0895e)