Advanced plotting examples#

This notebook can also be run with a live Python kernel on mybinder.

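Vaex uses matplotlib to create plots, which allows for a great deal of flexibility. To avoid repetitive "boilerplate" code, Vaex tries to cover many use cases, so that one or more panels can be plotted in a simple declarative style.

The examples below use the example dataset, which is the result of a numerical simulation of the formation of a galaxy like our own Milky Way. The data contains the 3D position, velocity, angular momentum, energy, and iron content of each star particle in the simulation.

Let's start by loading the data: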

[1]:

import vaex
import numpy as np
import matplotlib.pyplot as plt

import warnings
warnings.filterwarnings('ignore')

[2]:

df = vaex.example()
df.head()

[2]:

#	id	x	y	z	vx	vy	vz	E	L	Lz	FeH
0	0	1.23187	-0.396929	-0.598058	301.155	174.059	27.4275	-149431	407.389	333.956	-1.00539
1	23	-0.163701	3.65422	-0.254906	-195	170.472	142.53	-124248	890.241	684.668	-1.70867
2	32	-2.12026	3.32605	1.70784	-48.6342	171.647	-2.07944	-138501	372.241	-202.176	-1.83361
3	8	4.71559	4.58525	2.25154	-232.421	-294.851	62.8586	-60037	1297.63	-324.688	-1.47869
4	16	7.21719	11.9947	-1.06456	-1.68917	181.329	-11.3336	-83206.8	1332.8	1328.95	-1.85705
5	16	-7.78437	5.98977	-0.682695	86.7009	-238.778	-2.31309	-86497.6	1353.25	1339.42	-1.91944
6	12	8.08373	-3.27348	5.54687	-57.4544	120.117	5.37438	-101867	1100.8	782.915	-1.93517
7	26	-3.55719	5.41363	0.0917156	-67.0511	-145.933	39.6374	-127682	921.008	882.101	-1.79423
8	25	3.9848	5.40691	2.57724	-38.7449	-152.407	-92.9073	-113632	493.316	-397.824	-1.18076
9	8	-20.8139	-3.29468	13.4866	99.4067	28.6749	-115.079	-55825.3	1088.46	-269.324	-1.28892

A single plot#

The simplest case is a single heatmap whose two axes are given by the first two arguments:

[3]:

df.viz.heatmap('x', 'y', title='Face on galaxy', limits='99%')

../_images/guides_advanced_plotting_5_0.png

Multiple plots of the same type#

The first argument can be a list of axis pairs. This produces one plot per pair:

[4]:

df.viz.heatmap([["x", "y"], ["x", "z"]], title="Face on and edge on", figsize=(10, 4), limits='99%');

../_images/guides_advanced_plotting_7_0.png

Multiple plots, same axes, different statistics#

If the what argument is a list, multiple subplots are created by default, one per statistic:

[5]:

df.viz.heatmap("x", "y", what=["count(*)", "mean(vx)", "correlation(vy,vz)"],
               title="Different statistics",
               figsize=(10, 5), limits='99%');

../_images/guides_advanced_plotting_9_0.png

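Multiple plots, different axes, different statistics#

Several axis pairs can be given as the first argument together with a list for the what argument. The resulting figure contains a grid of subplots in which the axis pairs form the rows and the what statistics form the columns: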

[6]:

df.viz.heatmap([["x", "y"], ["x", "z"], ["y", "z"]],
               what=["count(*)", "mean(vx)", "correlation(vx,vy)", "correlation(vx,vz)"],
               title="Different statistics and plots",
               figsize=(14,12),
               limits='99%');

../_images/guides_advanced_plotting_11_0.png

The layout of the figure can also be set with the visual argument, which can be used to swap the rows and columns of the subplots:

[7]:

df.viz.heatmap([["x", "y"], ["x", "z"], ["y", "z"]],
               what=["count(*)", "mean(vx)", "correlation(vx,vy)", "correlation(vx,vz)"],
               visual=dict(row="what", column="subspace"),
               title="Different statistics and plots",
               figsize=(14,12),
               limits='99%');

../_images/guides_advanced_plotting_13_0.png
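
The row and column bookkeeping described above can be sketched without Vaex. This is only an illustration of the layout rule stated in the text, not Vaex's internal code; the helper name panel_grid is hypothetical.

```python
from itertools import product

subspaces = [("x", "y"), ("x", "z"), ("y", "z")]
whats = ["count(*)", "mean(vx)", "correlation(vx,vy)", "correlation(vx,vz)"]

def panel_grid(subspaces, whats, row="subspace"):
    """Map (row, column) grid positions to (axis pair, statistic) panels.

    row="subspace" puts axis pairs on rows (the default described in the text);
    row="what" puts statistics on rows, mirroring
    visual=dict(row="what", column="subspace").
    Hypothetical helper, for illustration only.
    """
    grid = {}
    for (i, pair), (j, what) in product(enumerate(subspaces), enumerate(whats)):
        position = (i, j) if row == "subspace" else (j, i)
        grid[position] = (pair, what)
    return grid

default = panel_grid(subspaces, whats)
swapped = panel_grid(subspaces, whats, row="what")

print(len(default))      # number of panels in the figure
print(default[(2, 1)])   # third axis pair, second statistic
print(swapped[(1, 2)])   # the same panel after swapping rows and columns
```

Under this rule the default figure has 3 rows and 4 columns, while the swapped figure has 4 rows and 3 columns, with the same 12 panels in both.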

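Slices in a third dimension#

If a third axis (z) is given, you can "slice" through the data and display the z slices as rows. Note that the rows are wrapped here, which can be changed with the wrap_columns argument: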

[8]:

df.viz.heatmap("Lz", "E", z="FeH:-3,-1,8",
               visual=dict(row="z"),
               figsize=(12, 8),
               f="log",
               wrap_columns=3,
               limits='99%');

../_images/guides_advanced_plotting_15_0.png
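
To make the slicing concrete, here is a minimal sketch of the slice boundaries, assuming the string "FeH:-3,-1,8" is read as expression:min,max,number of slices. That reading, and the helper parse_slice_spec, are assumptions made for illustration and are not taken from Vaex's source.

```python
import math

def parse_slice_spec(spec):
    """Split 'expr:min,max,n' into the expression and its n + 1 slice edges.

    Hypothetical helper; the 'expr:min,max,n' reading is an assumption.
    """
    expression, _, rest = spec.partition(":")
    lo, hi, n = rest.split(",")
    lo, hi, n = float(lo), float(hi), int(n)
    width = (hi - lo) / n
    edges = [lo + i * width for i in range(n + 1)]
    return expression, edges

expression, edges = parse_slice_spec("FeH:-3,-1,8")
n_slices = len(edges) - 1
n_rows = math.ceil(n_slices / 3)  # wrap_columns=3, assuming panels fill row by row

print(expression, edges)
print(n_slices, n_rows)
```

With these assumptions each slice is 0.25 wide in FeH, and the 8 slices wrap into 3 rows of at most 3 panels.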

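Wrapping many plots#

If you create a figure with many subplots, they are wrapped into a tidy grid. Here we make heatmaps of all column combinations in the example dataset, sorted by their mutual information: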

[9]:

# Get all column pairs
pairs = df.combinations(exclude=['id'])
# Calculate the mutual information for each pair, sorted by the largest value
mi, pairs_sorted = df.mutual_information(pairs, sort=True)

# Create the figure
df.viz.heatmap(pairs_sorted, f='log', colorbar=False, figsize=(14, 20), limits='99%', wrap_columns=5);

../_images/guides_advanced_plotting_17_0.png
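
As a rough check on the size of this figure, the sketch below counts the column pairs with the standard library. It assumes that df.combinations(exclude=['id']) yields every unordered pair of the remaining columns listed in the table above, which is an assumption about its behaviour rather than something shown here.

```python
from itertools import combinations
import math

# Columns of the example table, without 'id'
columns = ["x", "y", "z", "vx", "vy", "vz", "E", "L", "Lz", "FeH"]

pairs = list(combinations(columns, 2))   # all unordered pairs of columns
n_rows = math.ceil(len(pairs) / 5)       # wrap_columns=5

print(len(pairs), n_rows)
```

Under that assumption there are 45 panels, which wrap into 9 rows of 5.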

Plotting selections#

If the selection argument is used, only the selected data is plotted:

[10]:

df.viz.heatmap("x", "y", selection="sqrt(x**2+y**2) < 5", limits=[-10, 10]);

../_images/guides_advanced_plotting_19_0.png

If a list of selections is given (where False or None means no selection), each selection forms a separate "layer" of the resulting figure by default:

[11]:

df.viz.heatmap("x", "y",
               selection=[None, "sqrt(x**2+y**2) < 5", "(sqrt(x**2+y**2) < 7) & (x < 0)"],
               limits=[-10, 10]);

../_images/guides_advanced_plotting_21_0.png
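
The two selection expressions are plain boolean conditions evaluated per row. As a small sketch, the same conditions can be applied by hand to the first five (x, y) rows of the example table shown earlier; the variable names are chosen only for this illustration.

```python
import math

# First five (x, y) rows of the example table shown earlier
points = [
    (1.23187, -0.396929),
    (-0.163701, 3.65422),
    (-2.12026, 3.32605),
    (4.71559, 4.58525),
    (7.21719, 11.9947),
]

# "sqrt(x**2+y**2) < 5": inside a circle of radius 5
inner = [math.hypot(x, y) < 5 for x, y in points]

# "(sqrt(x**2+y**2) < 7) & (x < 0)": left half of a circle of radius 7
left_half = [math.hypot(x, y) < 7 and x < 0 for x, y in points]

print(inner)
print(left_half)
```

The fourth row lies within radius 7 but has positive x, so it is excluded from the second selection as well.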

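Overlaying a vector field on a heatmap#

Astronomers believe that galaxies like our Milky Way formed through the merging and mixing of many pre-galactic clumps. One way to look for the original pre-galactic fragments is to examine the two-dimensional distribution of their energy (𝐸) and angular momentum (𝐿𝑧). So let's make such a plot: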

[12]:

df.viz.heatmap('Lz', 'E', f='log', figsize=(9, 6));

../_images/guides_advanced_plotting_23_0.png

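Now, to show that the stars in each clump in the plot above really do move coherently through space, we can overlay their velocity vectors on a heatmap of their positions.

First, let's select the stars that belong to one of the clumps: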

[13]:

# specify ranges of angular momentum (Lz) and energy (E)
limits_Lz_E_clump = (1181.770, 1291.92), (-70850.91, -68491.16)

# Use the rectangle selection method
df.select_rectangle("Lz", "E", limits_Lz_E_clump, name="stream")

# Check how many stars we have selected
print(f'Selection contains {df.count(selection="stream")} "stars".')

Selection contains 9556 "stars".

We can also overplot the selection to convince ourselves that we picked a sensible region:

[14]:

df.viz.heatmap("Lz", "E", selection=[None, "stream"], f="log", figsize=(9, 6));

../_images/guides_advanced_plotting_27_0.png

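Now let's plot the 𝑣𝑦 and 𝑣𝑧 velocity vectors on a 𝑦−𝑧 plot. First, we compute the mean 𝑣𝑦 and 𝑣𝑧 velocities on a grid. Note that the grid covers 𝑦 and 𝑧 from -20 to 20 (the limits apply to the binning axes, not to the velocities), at a resolution of 32x32 bins: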

[15]:

limits = [-20, 20]
shape_vector = 32
mean_vy = df.mean("vy", binby=["y", "z"], limits=limits, shape=shape_vector, selection='stream')
mean_vz = df.mean("vz", binby=["y", "z"], limits=limits, shape=shape_vector, selection='stream')

Next, let's create a grid that holds the centers of the bins:

[16]:

# create 2d arrays which hold the centers of the bins
centers = np.linspace(*limits, shape_vector, endpoint=False) + (limits[1] - limits[0])/2./shape_vector
z, y = np.meshgrid(centers, centers)
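
The centers expression above takes the left edge of every bin and shifts it by half a bin width. A dependency-free sketch of the same arithmetic, using a hypothetical helper bin_centers, makes the resulting values easy to check:

```python
def bin_centers(lo, hi, n):
    """Return the midpoints of n equal-width bins spanning [lo, hi].

    Hypothetical helper for illustration: left edge plus half a bin width,
    which is the same arithmetic as linspace(endpoint=False) + width / 2.
    """
    width = (hi - lo) / n
    return [lo + (i + 0.5) * width for i in range(n)]

centers_check = bin_centers(-20, 20, 32)
print(centers_check[0], centers_check[1], centers_check[-1])
```

For 32 bins between -20 and 20 the bin width is 1.25, so the first center sits at -19.375 and the last at 19.375.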

To keep the plot "clean", we also do not want to show velocities for bins with few counts:

[17]:

# we don't want to show bins with low number of counts
counts = df.count(binby=["y", "z"], limits=limits, shape=shape_vector, selection='stream')
mask = counts.flatten() > 10
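
The combination of a binned mean and a low-count mask can be illustrated in one dimension with plain Python. This is a sketch of the idea only; the helper binned_mean and the toy data are invented for the example and do not reflect how Vaex computes its statistics.

```python
def binned_mean(xs, values, lo, hi, nbins):
    """Count and mean of `values` in equal-width bins of `xs` over [lo, hi).

    Hypothetical helper; empty bins get a mean of NaN.
    """
    counts = [0] * nbins
    sums = [0.0] * nbins
    width = (hi - lo) / nbins
    for x, v in zip(xs, values):
        if lo <= x < hi:
            i = int((x - lo) / width)
            counts[i] += 1
            sums[i] += v
    means = [s / c if c else float("nan") for s, c in zip(sums, counts)]
    return counts, means

# Toy positions and velocities (made up for this illustration)
xs = [-1.5, -0.5, -0.4, 0.2, 0.3, 0.9]
vs = [10.0, 20.0, 40.0, 1.0, 3.0, 5.0]

toy_counts, toy_means = binned_mean(xs, vs, lo=-2, hi=2, nbins=4)
toy_mask = [c > 1 for c in toy_counts]   # keep only bins with more than one point

print(toy_counts)
print(toy_means)
print(toy_mask)
```

Bins that hold too few points still get a mean, but that mean is noisy, which is why the notebook hides bins with 10 or fewer stars before drawing arrows.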

Finally, we can plot the background density of 𝑦 versus 𝑧, and then use plt.quiver to overlay the velocity vectors:

[18]:

df.viz.heatmap("y", "z", limits=limits, f="log1p", figsize=(10, 9), selection=[None, "stream"], shape=128)

# overplot the mean velocity vectors
plt.quiver(y.flatten()[mask],
           z.flatten()[mask],
           mean_vy.flatten()[mask],
           mean_vz.flatten()[mask],
           color="white",
           alpha=0.75);

../_images/guides_advanced_plotting_35_0.png

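We do indeed see that the stars we selected move together and form a stream!

Plotting healpix maps#

Healpix is made available through the healpy package. Vaex does not need special support for healpix, but it provides a few helper functions to make working with healpix more convenient.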

Make sure you have healpy installed. If not, you can install it with one of the following commands:

!pip install healpy  # if you prefer pip
!conda install -c conda-forge healpy  # if you are using the conda package manager

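To understand this better, we will start from the beginning. If we want to make a density sky map, we want to pass healpy a 1D numpy array in which each value represents the density at one location on the sphere, where the location is determined by the array size (the healpix level) and the offset (the position within the array).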

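This example uses a simulated Gaia dataset. Gaia data includes the healpix index encoded in the source_id column. Dividing source_id by 34359738368 gives the healpix index at level 12, and dividing further takes you to lower levels: each level down is one more division by 4.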
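
The index arithmetic can be sketched with integers alone. The helpers healpix_index and npix below are hypothetical names used for illustration; they use floor division, whereas the notebook bins the floating-point quotient, and they assume the nested ordering implied by nest=True in the plotting calls.

```python
LEVEL12_DIVISOR = 34359738368  # 2**35, the value quoted in the text

def healpix_index(source_id, level):
    """Nested healpix index at `level` (0 to 12) encoded in a Gaia source_id.

    Hypothetical helper: each level below 12 divides the index by a further 4.
    """
    if not 0 <= level <= 12:
        raise ValueError("level must be between 0 and 12")
    return source_id // (LEVEL12_DIVISOR * 4 ** (12 - level))

def npix(level):
    """Number of healpix pixels at `level`: 12 * nside**2 with nside = 2**level."""
    return 12 * 4 ** level

source_id = 7627862074752  # first row of the table shown below
print(healpix_index(source_id, 12))
print(healpix_index(source_id, 11))
print(healpix_index(source_id, 2))
print(npix(2))
```

At level 2 there are 192 pixels, which matches the length of the counts array computed in the notebook.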

Let's start by fetching the dataset (note: the dataset is about 700 MB on disk).

[19]:

import healpy as hp

[20]:

df = vaex.datasets.tgas(full=True)
df.head()

[20]:

#	astrometric_delta_q	astrometric_excess_noise	astrometric_excess_noise_sig	astrometric_n_bad_obs_ac	astrometric_n_bad_obs_al	astrometric_n_good_obs_ac	astrometric_n_good_obs_al	astrometric_n_obs_ac	astrometric_n_obs_al	astrometric_primary_flag	astrometric_priors_used	astrometric_relegation_factor	astrometric_weight_ac	astrometric_weight_al	b	dec	dec_error	dec_parallax_corr	dec_pmdec_corr	dec_pmra_corr	duplicated_source	ecl_lat	ecl_lon	hip	l	matched_observations	parallax	parallax_error	parallax_pmdec_corr	parallax_pmra_corr	phot_g_mean_flux	phot_g_mean_flux_error	phot_g_mean_mag	phot_g_n_obs	phot_variable_flag	pmdec	pmdec_error	pmra	pmra_error	pmra_pmdec_corr	ra	ra_dec_corr	ra_error	ra_parallax_corr	ra_pmdec_corr	ra_pmra_corr	random_index	ref_epoch	scan_direction_mean_k1	scan_direction_mean_k2	scan_direction_mean_k3	scan_direction_mean_k4	scan_direction_strength_k1	scan_direction_strength_k2	scan_direction_strength_k3	scan_direction_strength_k4	solution_id	source_id	tycho2_id
0	1.91906	0.717101	412.606	1	0	78	79	79	79	84	3	2.9361	1.26696e-05	1.81816	-48.7144	0.235392	0.218802	-0.407338	0.0606588	-0.0994513	70	-16.1211	42.6418	13989	176.74	9	6.35295	0.30791	-0.101957	-0.00157679	1.03123e+07	10577.4	7.99138	77	b'NOT_AVAILABLE'	-7.64199	0.0874018	43.7523	0.0705422	0.214677	45.0343	-0.414972	0.305989	0.179966	-0.0857597	0.159207	243619	2015	-113.76	21.3929	-41.6784	26.2018	0.382348	0.538266	0.392379	0.916306	1635378410781933568	7627862074752	b''
1	nan	0.253463	47.3163	2	0	55	57	57	57	84	5	2.65231	3.16002e-05	12.8616	-48.645	0.200068	1.19779	0.837626	-0.975644	0.972577	70	-16.193	42.7612	-2147483648	176.916	8	3.90033	0.323488	-0.853779	0.839739	949565	1140.17	10.581	62	b'NOT_AVAILABLE'	-55.1092	2.52293	10.0363	4.61141	-0.996399	45.165	-0.995923	2.58388	-0.860911	0.97348	-0.972417	487238	2015	-156.433	22.7661	-36.2397	22.8906	0.711003	0.96597	0.646115	0.86716	1635378410781933568	9277129363072	b'55-28-1'
2	nan	0.398901	221.185	4	1	57	60	61	61	84	5	3.9934	2.56339e-05	5.76753	-48.6678	0.248825	0.180326	-0.391891	-0.193256	0.0894205	70	-16.1234	42.6975	-2147483648	176.78	7	3.15531	0.273484	-0.118552	-0.0418587	817838	1827.38	10.7431	60	b'NOT_AVAILABLE'	-1.60287	1.03526	2.93228	1.90864	-0.914271	45.0862	-0.177443	0.213836	0.307722	-0.184817	0.0468668	1948952	2015	-117.008	19.7722	-43.1082	26.7157	0.482528	0.428758	0.524153	0.903062	1635378410781933568	13297218905216	b'55-1191-1'
3	nan	0.422492	179.982	1	0	51	52	52	52	84	5	4.21516	2.86726e-05	5.36086	-48.6824	0.248211	0.200958	-0.337217	-0.223501	0.131811	70	-16.1182	42.6778	-2147483648	176.76	7	2.29237	0.280972	-0.109202	-0.0494409	602053	905.877	11.0757	61	b'NOT_AVAILABLE'	-18.4149	1.12985	3.66198	2.06505	-0.926177	45.0665	-0.365707	0.276039	0.202878	-0.0589288	-0.0509089	102321	2015	-132.421	22.5693	-38.9545	25.8786	0.494655	0.638456	0.509074	0.898918	1635378410781933568	13469017597184	b'55-624-1'
4	nan	0.3175	119.748	2	3	85	84	87	87	84	5	3.23564	2.22788e-05	8.08078	-48.572	0.335044	0.17013	-0.438708	-0.279349	0.121792	70	-16.0555	42.7734	-2147483648	176.739	11	1.58208	0.261539	-0.329196	0.100312	1.38812e+06	2826.43	10.1687	96	b'NOT_AVAILABLE'	-2.37939	0.710632	0.340802	1.22048	-0.833604	45.136	-0.0490526	0.170697	0.471425	-0.156392	-0.152076	409284	2015	-106.86	4.4521	-47.8954	26.7555	0.520654	0.23931	0.653377	0.863385	1635378410781933568	15736760328576	b'55-849-1'
5	nan	0.303723	64.6868	2	1	68	69	70	70	84	5	3.10892	2.22511e-05	9.65279	-48.5511	0.359618	0.179848	-0.437142	-0.376402	0.257906	70	-16.0335	42.7861	-2147483648	176.718	9	8.66308	0.255867	-0.297309	0.0791063	1.66384e+06	1381.58	9.97199	76	b'NOT_AVAILABLE'	-72.7114	0.720852	-52.8493	1.26429	-0.852784	45.1414	-0.264588	0.205008	0.39493	0.102073	-0.36853	204642	2015	-127.824	16.3828	-44.2417	25.1631	0.522809	0.479366	0.621515	0.847412	1635378410781933568	16527034310784	b'55-182-1'
6	nan	0.340405	118.911	2	1	76	77	78	78	84	5	3.44745	2.19728e-05	7.91894	-48.5242	0.386343	0.17188	-0.341053	-0.34408	0.1516	70	-16.0114	42.8058	-2147483648	176.701	9	5.6982	0.263677	-0.367848	0.0846782	1.821e+06	2755.91	9.874	77	b'NOT_AVAILABLE'	-3.35036	0.707184	24.5272	1.17738	-0.800098	45.153	-0.0412512	0.189524	0.488929	-0.163855	-0.195289	540954	2015	-114.478	11.0431	-46.4507	26.2651	0.512088	0.322961	0.637399	0.856398	1635378410781933568	16733192740608	b'55-867-1'
7	nan	0.253709	88.6261	3	0	76	79	79	79	84	5	2.65453	2.57372e-05	13.709	-48.5569	0.380844	0.150943	-0.139315	-0.358996	0.238914	70	-16.0049	42.7641	-2147483648	176.665	10	2.09081	0.222206	-0.277202	0.093748	967144	601.802	10.561	87	b'NOT_AVAILABLE'	-11.6616	0.982994	-1.57293	1.73319	-0.904223	45.1128	-0.187136	0.206981	0.412381	0.0994892	-0.284353	1081909	2015	-88.3027	14.7861	-47.9744	27.0228	0.39079	0.333692	0.400387	0.90071	1635378410781933568	16870631694208	b'55-72-1'
8	nan	0.401473	226.044	3	1	69	71	72	72	84	5	4.01755	2.45771e-05	5.41389	-48.6511	0.351099	0.169345	-0.276625	-0.175754	0.101633	70	-16.0034	42.6531	-2147483648	176.589	9	6.20249	0.247253	-0.139338	0.0669677	1.66582e+06	1233.43	9.9707	79	b'NOT_AVAILABLE'	9.19541	1.02832	26.308	2.03485	-0.905496	45.0103	-0.321544	0.243576	0.263603	-0.143727	0.107397	589318	2015	-106.23	19.3449	-44.7095	25.5226	0.335982	0.520842	0.35827	0.90504	1635378410781933568	26834955821312	b'55-912-1'
9	nan	0.235866	49.3216	2	0	51	53	53	53	84	5	2.49518	2.42543e-05	15.7304	-48.5912	0.473472	0.163531	-0.0605532	-0.242013	0.14566	70	-15.8759	42.6549	-2147483648	176.419	8	1.67767	0.222067	-0.18584	0.0668122	1.96682e+06	1184.17	9.79036	62	b'NOT_AVAILABLE'	-24.5264	1.1319	9.10421	2.20939	-0.92529	44.9747	-0.407078	0.267911	0.236157	-0.0912424	0.0305957	1178636	2015	-99.9696	19.5819	-46.0718	24.0416	0.217998	0.655547	0.219464	0.892649	1635378410781933568	33260226885120	b'48-1139-1'

Let's make a healpix plot at level 2. We can start by counting the number of stars in each healpix pixel:

[21]:

level = 2
factor = 34359738368 * (4**(12-level))
nmax = hp.nside2npix(2**level)
counts = df.count(binby="source_id/" + str(factor), limits=[0, nmax], shape=nmax)
counts

[21]:

array([ 4021,  6171,  5318,  7114,  5755, 13420, 12711, 10193,  7782,
       14187, 12578, 22038, 17313, 13064, 17298, 11887,  3859,  3488,
        9036,  5533,  4007,  3899,  4884,  5664, 10741,  7678, 12092,
       10182,  6652,  6793, 10117,  9614,  3727,  5849,  4028,  5505,
        8462, 10059,  6581,  8282,  4757,  5116,  4578,  5452,  6023,
        8340,  6440,  8623,  7308,  6197, 21271, 23176, 12975, 17138,
       26783, 30575, 31931, 29697, 17986, 16987, 19802, 15632, 14273,
       10594,  4807,  4551,  4028,  4357,  4067,  4206,  3505,  4137,
        3311,  3582,  3586,  4218,  4529,  4360,  6767,  7579, 14462,
       24291, 10638, 11250, 29619,  9678, 23322, 18205,  7625,  9891,
        5423,  5808, 14438, 17251,  7833, 15226,  7123,  3708,  6135,
        4110,  3587,  3222,  3074,  3941,  3846,  3402,  3564,  3425,
        4125,  4026,  3689,  4084, 16617, 13577,  6911,  4837, 13553,
       10074,  9534, 20824,  4976,  6707,  5396,  8366, 13494, 19766,
       11012, 16130,  8521,  8245,  6871,  5977,  8789, 10016,  6517,
        8019,  6122,  5465,  5414,  4934,  5788,  6139,  4310,  4144,
       11437, 30731, 13741, 27285, 40227, 16320, 23039, 10812, 14686,
       27690, 15155, 32701, 18780,  5895, 23348,  6081, 17050, 28498,
       35232, 26223, 22341, 15867, 17688,  8580, 24895, 13027, 11223,
        7880,  8386,  6988,  5815,  4717,  9088,  8283, 12059,  9161,
        6952,  4914,  6652,  4666, 12014, 10703, 16518, 10270,  6724,
        4553,  9282,  4981])

Using the healpy package, we can plot this in a Mollweide projection:

[22]:

hp.mollview(counts, nest=True);

../_images/guides_advanced_plotting_44_0.png

To avoid repeating the code above, we can use the df.healpix_count method instead:

[23]:

counts = df.healpix_count(healpix_level=6)
hp.mollview(counts, nest=True)

../_images/guides_advanced_plotting_46_0.png

Instead of using healpy, we can use vaex's df.viz.healpix_heatmap method:

[24]:

df.viz.healpix_heatmap(f="log1p", healpix_level=6, figsize=(10,8), healpix_output="ecliptic")

../_images/guides_advanced_plotting_48_0.png
