Notes on Debugging

This section describes techniques that can be useful when debugging the compilation and execution of generated code.

See also

Debugging JIT compiled code

Memcheck

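Memcheck is a memory error detector implemented using Valgrind. It is useful for detecting memory errors in compiled code, particularly out-of-bounds accesses and use-after-free errors. Buggy or miscompiled native code can produce these kinds of errors. The Memcheck documentation explains its usage; here, we discuss only the specifics of using it with Numba.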

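The Python interpreter and some of the libraries used by Numba can generate false positives under Memcheck; see the Valgrind manual for more information on why false positives occur. False positives can make it difficult to determine when an actual error has occurred, so it is helpful to suppress the known ones. This is done by supplying a suppressions file, which instructs Memcheck to ignore errors that match the suppressions defined in it.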

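The CPython source distribution includes a suppressions file, Misc/valgrind-python.supp. Using it prevents a large number of spurious errors generated by Python's memory allocation implementation. In addition, the Numba repository includes a suppressions file, contrib/valgrind-numba.supp.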

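Note

It is important to use the suppressions files that match the versions of the Python interpreter and Numba you are using. These files evolve over time, so an out-of-date version may fail to suppress some errors, or may wrongly suppress actual errors.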

To run the Python interpreter under Memcheck with both suppressions files, invoke it with the following command:

valgrind --tool=memcheck \
         --suppressions=${CPYTHON_SRC_DIR}/Misc/valgrind-python.supp \
         --suppressions=${NUMBA_SRC_DIR}/contrib/valgrind-numba.supp \
         python ${PYTHON_ARGS}

where ${CPYTHON_SRC_DIR} is set to the location of the CPython source distribution, ${NUMBA_SRC_DIR} is the location of the Numba source directory, and ${PYTHON_ARGS} are the arguments to the Python interpreter.
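
When the invocation is scripted, it can help to assemble the argument list in one place and to check that both suppressions files exist before starting a long Memcheck run. The following is a minimal sketch, not part of Numba: the helper names build_memcheck_command and missing_suppression_files are hypothetical, and the fallback directory names used when the environment variables are unset are placeholders.

```python
import os
import shlex


def build_memcheck_command(cpython_src, numba_src, python_args, extra_options=()):
    """Return the valgrind argument list for running Python under Memcheck."""
    suppressions = [
        os.path.join(cpython_src, "Misc", "valgrind-python.supp"),
        os.path.join(numba_src, "contrib", "valgrind-numba.supp"),
    ]
    command = ["valgrind", "--tool=memcheck"]
    command += [f"--suppressions={path}" for path in suppressions]
    command += list(extra_options)  # any further valgrind options
    command += ["python", *python_args]
    return command


def missing_suppression_files(command):
    """List suppressions files named in the command that are not on disk."""
    prefix = "--suppressions="
    paths = [arg[len(prefix):] for arg in command if arg.startswith(prefix)]
    return [path for path in paths if not os.path.isfile(path)]


if __name__ == "__main__":
    # The fallback values are placeholders, not real default locations.
    cmd = build_memcheck_command(
        os.environ.get("CPYTHON_SRC_DIR", "cpython"),
        os.environ.get("NUMBA_SRC_DIR", "numba"),
        shlex.split(os.environ.get("PYTHON_ARGS", "")),
    )
    for path in missing_suppression_files(cmd):
        print(f"warning: suppressions file not found: {path}")
    print(shlex.join(cmd))
```

The script only prints the command line; the resulting list could be passed to subprocess.run to launch Valgrind.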

If there are any errors, messages describing them are printed to standard error. An example of an error is:

==77113==    at 0x24169A: PyLong_FromLong (longobject.c:251)
==77113==    by 0x241881: striter_next (bytesobject.c:3084)
==77113==    by 0x2D3C95: _PyEval_EvalFrameDefault (ceval.c:2809)
==77113==    by 0x21B499: _PyEval_EvalCodeWithName (ceval.c:3930)
==77113==    by 0x26B436: _PyFunction_FastCallKeywords (call.c:433)
==77113==    by 0x2D3605: call_function (ceval.c:4616)
==77113==    by 0x2D3605: _PyEval_EvalFrameDefault (ceval.c:3124)
==77113==    by 0x21B977: _PyEval_EvalCodeWithName (ceval.c:3930)
==77113==    by 0x21C2A4: _PyFunction_FastCallDict (call.c:376)
==77113==    by 0x2D5129: do_call_core (ceval.c:4645)
==77113==    by 0x2D5129: _PyEval_EvalFrameDefault (ceval.c:3191)
==77113==    by 0x21B499: _PyEval_EvalCodeWithName (ceval.c:3930)
==77113==    by 0x26B436: _PyFunction_FastCallKeywords (call.c:433)
==77113==    by 0x2D46DA: call_function (ceval.c:4616)
==77113==    by 0x2D46DA: _PyEval_EvalFrameDefault (ceval.c:3139)
==77113==
==77113== Use of uninitialised value of size 8
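
In a report like the one above, the line marked "at" is the innermost frame and each "by" line is one of its callers. For long reports it can be convenient to reduce the frame lines to (function, file, line) tuples. The sketch below is a hypothetical helper, not part of Valgrind or Numba. It assumes frame lines have the shape shown above, and it skips frames that carry no file:line information or whose function name contains spaces.

```python
import re

# Matches frame lines such as
# "==77113==    at 0x24169A: PyLong_FromLong (longobject.c:251)"
FRAME_RE = re.compile(
    r"^==(?P<pid>\d+)==\s+(?P<kind>at|by) 0x(?P<addr>[0-9A-Fa-f]+): "
    r"(?P<func>\S+) \((?P<file>[^:()]+):(?P<line>\d+)\)\s*$"
)


def parse_frames(report):
    """Return (function, file, line) for each frame line, innermost first."""
    frames = []
    for text in report.splitlines():
        match = FRAME_RE.match(text)
        if match:  # lines that are not frames, such as the error summary, are ignored
            frames.append((match["func"], match["file"], int(match["line"])))
    return frames


if __name__ == "__main__":
    report = (
        "==77113==    at 0x24169A: PyLong_FromLong (longobject.c:251)\n"
        "==77113==    by 0x241881: striter_next (bytesobject.c:3084)\n"
        "==77113==\n"
        "==77113== Use of uninitialised value of size 8\n"
    )
    for func, filename, lineno in parse_frames(report):
        print(f"{func} at {filename}:{lineno}")
```

The first tuple names the function in which the error was detected; the remaining tuples give the chain of C callers.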

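The traceback provided outlines only the C call stack, which can make it difficult to determine what the Python interpreter was doing when the error occurred. More about the state of the stack can be learned by inspecting the backtrace in the GNU Debugger (GDB). Launch valgrind with the additional argument --vgdb-error=0 and attach to the process with GDB as instructed by the output. Once an error is encountered, GDB stops at the error and the stack can be inspected.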

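GDB does support backtracing through the Python stack, but this requires symbols that may not be readily available in your Python distribution. In that case it is still possible to determine some information about what was happening in Python, but doing so depends on examining the backtrace closely. For example, in a backtrace corresponding to the error above, we see items such as: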

#18 0x00000000002722da in slot_tp_call (
    self=<_wrap_impl(_callable=<_wrap_missing_loc(func=<function at remote
    0x1cf66c20>) at remote 0x1d200bd0>, _imp=<function at remote 0x1d0e7440>,
    _context=<CUDATargetContext(address_size=64,
    typing_context=<CUDATypingContext(_registries={<Registry(functions=[<type
    at remote 0x65be5e0>, <type at remote 0x65be9d0>, <type at remote
    0x65bedc0>, <type at remote 0x65bf1b0>, <type at remote 0x8b78000>, <type
    at remote 0x8b783f0>, <type at remote 0x8b787e0>, <type at remote
    0x8b78bd0>, <type at remote 0x8b78fc0>, <type at remote 0x8b793b0>, <type
    at remote 0x8b797a0>, <type at remote 0x8b79b90>, <type at remote
    0x8b79f80>, <type at remote 0x8b7a370>, <type at remote 0x8b7a760>, <type
    at remote 0x8b7ab50>, <type at remote 0x8b7af40>, <type at remote
    0x8b7b330>, <type at remote 0x8b7b720>, <type at remote 0x8b7bf00>, <type
    at remote 0x8b7c2f0>, <type at remote 0x8b7c6e0>], attributes=[<type at
    remote 0x8b7cad0>, <type at remote 0x8b7cec0>, <type at remote
    0x8b7d2b0>, <type at remote 0x8b7d6a0>, <type at remote 0x8b7da90>,
    <t...(truncated),
    args=(<Builder(_block=<Block(parent=<Function(parent=<Module(context=<Context(scope=<NameScope(_useset={''},
    _basenamemap={}) at remote 0xbb5ae10>, identified_types={}) at remote
    0xbb5add0>, name='cuconstRecAlign$7',
    data_layout='e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v16:16:16-v32:32:32-v64:64:64-v128:128:128-n16:32:64',
    scope=<NameScope(_useset={'',
    '_ZN08NumbaEnv5numba4cuda5tests6cudapy13test_constmem19cuconstRecAlign$247E5ArrayIdLi1E1C7mutable7alignedE5ArrayIdLi1E1C7mutable7alignedE5ArrayIdLi1E1C7mutable7alignedE5ArrayIdLi1E1C7mutable7alignedE5ArrayIdLi1E1C7mutable7alignedE',
    '_ZN5numba4cuda5tests6cudapy13test_constmem19cuconstRecAlign$247E5ArrayIdLi1E1C7mutable7alignedE5ArrayIdLi1E1C7mutable7alignedE5ArrayIdLi1E1C7mutable7alignedE5ArrayIdLi1E1C7mutable7alignedE5ArrayIdLi1E1C7mutable7alignedE'},
    _basenamemap={}) at remote 0x1d27bf10>, triple='nvptx64-nvidia-cuda',
    globals={'_ZN08NumbaEnv5numba4cuda5tests6cudapy13test_constmem19cuconstRecAlign$247E5ArrayIdLi1E1C7mutable7ali...(truncated),
    kwds=0x0)

We can see some of the arguments, in particular the names of the compiled functions, for example:

_ZN5numba4cuda5tests6cudapy13test_constmem19cuconstRecAlign$247E5ArrayIdLi1E1C7mutable7alignedE5ArrayIdLi1E1C7mutable7alignedE5ArrayIdLi1E1C7mutable7alignedE5ArrayIdLi1E1C7mutable7alignedE5ArrayIdLi1E1C7mutable7alignedE

We can run this through c++filt to get a more human-readable representation:

numba::cuda::tests::cudapy::test_constmem::cuconstRecAlign$247(
  Array<double, 1, C, mutable, aligned>,
  Array<double, 1, C, mutable, aligned>,
  Array<double, 1, C, mutable, aligned>,
  Array<double, 1, C, mutable, aligned>,
  Array<double, 1, C, mutable, aligned>)

This is the fully qualified name of a jitted function and the types with which it was called.
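
Picking mangled names out of a long GDB frame by eye is error-prone, so the search can be scripted. The sketch below is a hypothetical helper, not part of Numba or GDB. It assumes that mangled names start with _Z and consist only of letters, digits, underscores and $, and that each name sits on a single line of the saved GDB output. Names cut short by GDB's "...(truncated)" marker are skipped because they cannot be demangled. Demangling is delegated to c++filt when it is found on the PATH; otherwise the names are returned unchanged.

```python
import re
import shutil
import subprocess

# Mangled names begin with "_Z"; names generated by Numba may also contain "$".
MANGLED_RE = re.compile(r"_Z[A-Za-z0-9_$]+")


def find_mangled_names(text):
    """Return the unique, complete mangled names in order of first appearance."""
    seen = {}
    for match in MANGLED_RE.finditer(text):
        if text.startswith("...", match.end()):
            continue  # GDB truncated this value, so the name is incomplete
        seen.setdefault(match.group(0), None)
    return list(seen)


def demangle(names):
    """Demangle names with c++filt if it is installed, else return them as is."""
    names = list(names)
    tool = shutil.which("c++filt")
    if tool is None or not names:
        return names
    result = subprocess.run(
        [tool],
        input="\n".join(names) + "\n",
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    # "backtrace.txt" is a placeholder for a file holding saved GDB output.
    with open("backtrace.txt") as handle:
        for readable in demangle(find_mangled_names(handle.read())):
            print(readable)
```

Sending all names to a single c++filt process keeps the output in the same order as the input, one demangled name per line.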