Covariance for Angular Power Spectra with Heracles¶
This notebook demonstrates how Heracles computes an estimate of the covariance of the two-point statistics from a 3×2pt catalogue using the DICES method. This tutorial is heavily based on the summary statisitcs tutorial so we will reproduce the results found in said tutorial without going into the details of the analysis. We will then show how to compute the covariance of the angular power spectra using Heracles.
Important note
This notebook is only meant to give you an idea of how Heracles works.
It does not show everything that Heracles can do.
This is a toy, treat it is such!
Setup¶
[1]:
import numpy as np
import healpy as hp
import matplotlib.pyplot as plt
import heracles
import heracles.healpy
from heracles.notebook import Progress
# Covariance code
import heracles.dices as dices
import helpers
with Progress("example data") as progress:
helpers.get_example_data(progress)
Basic parameters¶
This is the resolution parameter for measuring spectra from HEALPix maps. Here, we use nside = 128 since that is the resolution at which the example data has been created. A value lmax of approximate 1.5x nside is fairly safe in terms of errors introduced by HEALPix. For the purpose of this tutorial we will only use 3 tomographic bins.
[2]:
nside = 128
lmax = 140
nbins = 4
Prepare maps¶
For this tutorial we will use the same data as in the summary statistics tutorial. Note that the maps generated in the summary statistics tutorial might have a different resolution. Therefore we will have to up/down-scale them accordingly. Doing so, unfortunately destroys the metadata of the maps. Therefore we will have to manually re-enter the metadata.
[3]:
from heracles.healpy import HealpixMapper
from heracles.fields import Positions, Shears, Visibility, Weights
data_maps = heracles.read_maps("example-data_maps.fits")
mapper = HealpixMapper(nside=nside, lmax=lmax)
fields = {
"POS": Positions(mapper, mask="VIS"),
"SHE": Shears(mapper, mask="WHT"),
"VIS": Visibility(mapper),
"WHT": Weights(mapper),
}
[4]:
data_maps = heracles.read_maps("example-data_maps.fits")
for key in list(data_maps.keys()):
f, i = key
if i <= nbins - 1:
_map = data_maps[key]
meta = _map.dtype.metadata
new_map = hp.ud_grade(_map, nside)
heracles.update_metadata(
new_map,
nside=nside,
lmax=lmax,
bias=meta["bias"],
fsky=meta["fsky"],
spin=meta["spin"],
)
data_maps[key] = new_map
else:
data_maps.pop(key)
# load the FITS mask
vis_map = hp.read_map("vmap.fits.gz")
vis_map[vis_map == hp.UNSEEN] = 0.0
vis_map = hp.ud_grade(vis_map, nside)
heracles.update_metadata(
vis_map,
nside=nside,
lmax=lmax,
bias=0.0,
fsky=meta["fsky"],
spin=0,
)
vis_maps = {}
for key in list(data_maps.keys()):
f, i = key
if f == "POS":
f = "VIS"
if f == "SHE":
f = "WHT"
key = (f, i)
vis_maps[key] = vis_map
Jackknife Regions¶
The fundamental idea behind Jackknife approaches is to generate an ensemble of angular power spectra by leaving out one of the regions at a time. The covariance of the angular power spectra can then be estimated from the variance of the ensemble. Therefore, the first step is to divide the mask of the survey into regions. In order to do so, in this tutorial we will make use of the SkySegmentor library. However, any library that segments the sky into regions can be used.
[5]:
import skysegmentor
[6]:
jk_maps = {}
Njk = 30
jk_map = skysegmentor.segmentmapN(vis_map, Njk)
for key in list(vis_maps.keys()):
jk_maps[key] = jk_map
[7]:
hp.mollview(jk_maps[("VIS", 1)])
Two-point statistics¶
Since the computation of the two-point statistics was already covered in the summary statistics tutorial, we will not go into the details of the analysis. We will simply load the results from the summary statistics tutorial and use them to compute the covariance of the angular power spectra.
[8]:
cls0 = dices.jackknife_cls(data_maps, vis_maps, jk_maps, fields, nd=0)[()]
Theory¶
Similarly, in this tutorial we will limit ourselves to loading the thoeory power spectra from the summary statistics tutorial.
[9]:
theory = heracles.read("example-theory.fits")
theory_cls = {}
for key in list(cls0.keys()):
theory_cls[key] = theory[key][..., : lmax + 1]
[10]:
nlbins = 10
ell = np.arange(lmax + 1)
ledges = np.logspace(np.log10(10), np.log10(lmax), nlbins + 1)
lgrid = (ledges[1:] + ledges[:-1]) / 2
Ensemble Computation¶
Compute the ensemble of angular power spectrum effectively boils down to repeating the process presented in the two-point statistics section removing one of the jackknife regions at a time. However, a couple of extra considerations have to be made.
When computing the angular power spectra from catalogues, removing one of the jackknife regions changes the sample variance of the angular power spectra by a factor of
Moreover, removing of the jackknife regions also changes the footprint of the survey which can introduce additional mixing betweeen E- and B-modes. Heracles is equiped with a mask correction routine that corrects for this effect by transforming the angular power spectra to real space, dividing the correlation function of the ratio of new footprint to the old footprint and transforming back to harmonic space.
[11]:
with Progress("cls 1") as progress:
cls1 = dices.jackknife_cls(
data_maps, vis_maps, jk_maps, fields, nd=1, progress=progress
)
To add numerically stability to our computations, we will also bin the ensemble of angular power spectra.
[12]:
cqs0 = heracles.binned(cls0, ledges)
cqs1 = heracles.binned(cls1, ledges)
Jackknife Covariance¶
The delete-one covariance is computed by measuring the covariance matrix of the ensemble of delete-one angular power spectra. The covariance matrix of an ensemble of \(K\) vector variables \(x \in \mathbb{R}^N\) is defined as:
where \(\bar{x_i} = \mathbb{E}[x_{ik}]_k = \frac{1}{K} \sum_k^K x_{ik}\) is the mean over the ensemble. Therefore, one can define any covariance matrix as an expectation over an ensemble of matrices with entries:
such that \(\mathbb{C}_{ij} = \mathbb{E} [W(x)_{ijk}]_k\). Therefore and associating \(x\) with the angular power spectra, the computation of the delete-one covariance is then given by:
Analyzing the equation above we can understand better the Jackknife approach. As we increase the number of jackknife regions, the size of each indiviual regions decreases meaning that
such that \(W_{ijk} \rightarrow 0\) and thus \(\mathbb{E} [W_{ijk}]_k \rightarrow 0\). However, the prefactor \((K-1)\) ensures that the Jackknige covariance doesn’t vanish as \(K\) increases.
[13]:
cov_jk = dices.jackknife_covariance(cqs1)
[14]:
plt.imshow(
cov_jk[("POS", "POS", "POS", "POS", 1, 1, 1, 1)][:, :],
vmin=-2.5 * 10**-14,
vmax=2.5 * 10**-14,
cmap="seismic",
)
plt.colorbar()
plt.title("Delete-1 covariance")
plt.ylabel(r"$\ell$")
plt.xlabel(r"$\ell$")
plt.show()
Debiasing¶
It can be shown that the Jackknife covariance errors (i.e. its diagonal entries) tend to be biased high. In other words, the Jackknife covariance matrix overestimates the variances of the angular power spectra. In order to address this, the DICES covariances introduces a second ensemble of angular power spectra, the delete-two ensemble. The delete-two ensemble is computed by removing two jackknife regions at a time
[15]:
with Progress("cls 2") as progress:
cls2 = dices.jackknife_cls(
data_maps, vis_maps, jk_maps, fields, nd=2, progress=progress
)
cqs2 = heracles.binned(cls2, ledges)
Once, the delete-two ensemble is computed, the correction can be obtained by computing the expressions:
where \(x_0\) denotes the concatenated angular power spectra of the full survey, \(x_{1 \, k}\) and \(x_{1 \, k'}\) denote the concatenated angular power spectra of the survey with the \(k\)-th and \(k'\)-th jackknife regions removed respectively, and \(x_{2 \, p}\) denotes the concatenated angular power spectra of the survey with both the \(k\)-th and \(k'\)-th jackknife regions removed. Note that \(p\) is an iterator that runs from 1 to \(K(K-1)/2\) related to the combinations of \(k\) and \(k'\).
Now we need to compute the variance of the q-ensemble. In order to do so, we can use the same framework as for the delete-one matrices substituting \(q\) for \(x\) in the expression for \(W\). The correction is then given by:
Finally the debiased covariance matrix is given by:
[16]:
debiased_cov_jk = dices.debias_covariance(
cov_jk,
cqs0,
cqs1,
cqs2,
)
[17]:
plt.imshow(
debiased_cov_jk[("POS", "POS", "POS", "POS", 1, 1, 1, 1)][:, :],
vmin=-2.5 * 10**-14,
vmax=2.5 * 10**-14,
cmap="seismic",
)
plt.colorbar()
plt.title("Debiased Covariance")
plt.ylabel(r"$\ell$")
plt.xlabel(r"$\ell$")
plt.show()
Shrinkage¶
On top of debiasing, DICES also gives the user the option to shrink their covariance estimate to given target. This additional step can great help in reducing the noise in the covariance matrix. Moreover, shrinking towards a Gaussian estimate of the covariance has been shown to greatly help in reproducing the eigen-values of sample covariances from simulations. Thus, from here onward we will assume a Gaussian target given by:
[18]:
gauss_cov = dices.gaussian_covariance(cqs0)
[19]:
plt.imshow(
gauss_cov[("POS", "POS", "POS", "POS", 1, 1, 1, 1)][:, :],
vmin=1 * 10**-14,
vmax=5 * 10**-13,
cmap="seismic",
)
plt.colorbar()
plt.title("Target")
plt.ylabel(r"$\ell$")
plt.xlabel(r"$\ell$")
plt.show()
Once the target has been set, the shrunk covariance is given by:
where \(\lambda\) is the shrinkage factor and \(\mathbb{C}^{\rm target}\) is the target covariance matrix. We set the diagonal entries of target covariance to those of the debiased covariance matrix. This is done in order to ensure that the variances of the angular power spectra are not shrunk.
Moreover, we modify the off-diagonal entries of the target as:
Of course, the main challenge of shrinkage is finding the optimal shrinkage factor. Heracles currently provides a routine to compute an estimate of the optimal shrinkage factor based of the sample variace of the Jackknife covariance matrix computed as follows. First, \(C_\ell\)’s in the ensemble as well as the target covariance are concatenated to form a single vector and matrix. Then, the mean and variance of the W-matrices associated with the concatenated vectors are computed.
We now express the shrinkage factor as:
where:
and
[20]:
shrinkage_factor = dices.shrinkage_factor(cqs1, gauss_cov)
print(shrinkage_factor)
0.6450755713337696
[21]:
shrunk_cov_jk = dices.shrink(
cov_jk,
gauss_cov,
shrinkage_factor,
)
/home/jaimerzp/anaconda3/envs/gitd/lib/python3.13/site-packages/heracles/utils.py:133: RuntimeWarning: invalid value encountered in sqrt
a_std = np.sqrt(a_v[..., None, :])
/home/jaimerzp/anaconda3/envs/gitd/lib/python3.13/site-packages/heracles/utils.py:134: RuntimeWarning: invalid value encountered in sqrt
b_std = np.sqrt(b_v[..., None, :])
[22]:
plt.imshow(
shrunk_cov_jk[("POS", "POS", "POS", "POS", 1, 1, 1, 1)][:, :],
vmin=-2.5 * 10**-14,
vmax=2.5 * 10**-14,
cmap="seismic",
)
plt.colorbar()
plt.title("Shrunk Covariance")
plt.ylabel(r"$\ell$")
plt.xlabel(r"$\ell$")
plt.show()
DICES¶
Finally, the DICES covariance is obtained by imposing the correlation structure of the shrunk covariance to the debiased covariance.
[23]:
dices_cov = dices.impose_correlation(debiased_cov_jk, shrunk_cov_jk)
Results¶
[24]:
fig, axes = plt.subplots(1, 4, figsize=(10, 15))
# Flattened cov_jk
flat_cov_jk = dices.flatten(cov_jk)
flat_corr_jk = flat_cov_jk / np.sqrt(
np.diag(flat_cov_jk)[:, None] * np.diag(flat_cov_jk)[None, :]
)
im1 = axes[0].imshow(flat_corr_jk, cmap="seismic", vmin=-1, vmax=1)
axes[0].set_title("Jackknife Covariance")
# Flattened debiased_cov_jk
flat_debiased_cov = dices.flatten(debiased_cov_jk)
flat_corr_debiased = flat_debiased_cov / np.sqrt(
np.diag(flat_debiased_cov)[:, None] * np.diag(flat_debiased_cov)[None, :]
)
im2 = axes[1].imshow(flat_corr_debiased, cmap="seismic", vmin=-1, vmax=1)
axes[1].set_title("Debiased Covariance")
# Flattened shrunk_cov
flat_shrunk_cov = dices.flatten(shrunk_cov_jk)
flat_corr_shrunk = flat_shrunk_cov / np.sqrt(
np.diag(flat_shrunk_cov)[:, None] * np.diag(flat_shrunk_cov)[None, :]
)
im3 = axes[2].imshow(flat_corr_shrunk, cmap="seismic", vmin=-1, vmax=1)
axes[2].set_title("Shrunk Covariance")
# Flattened dices_cov
flat_dices_cov = dices.flatten(dices_cov)
flat_corr_dices = flat_dices_cov / np.sqrt(
np.diag(flat_dices_cov)[:, None] * np.diag(flat_dices_cov)[None, :]
)
im3 = axes[3].imshow(flat_corr_dices, cmap="seismic", vmin=-1, vmax=1)
axes[3].set_title("DICES Covariance")
plt.tight_layout()
plt.show()
/tmp/ipykernel_251916/416151296.py:13: RuntimeWarning: invalid value encountered in sqrt
flat_corr_debiased = flat_debiased_cov / np.sqrt(
[25]:
fig, ax = plt.subplots(
2 * (nbins - 1),
nbins - 1,
figsize=(9, 9),
gridspec_kw={"height_ratios": [3, 1, 3, 1, 3, 1]},
)
for i in range(1, nbins):
for j in range(1, i):
ax[j - 1, i - 1].axis("off")
ax[j, i - 1].axis("off")
ax[3, 2].axis("off")
for j in range(i, nbins):
key = ("POS", "POS", i, j)
cov_key = ("POS", "POS", "POS", "POS", i, j, i, j)
cov = dices_cov[cov_key]
c = cqs0[key].array
t = theory_cls[key]
t_itp = np.interp(lgrid, ell, t)
err = np.sqrt(np.diag(cov))
ax[2 * (j - 1), i - 1].errorbar(
lgrid, c, yerr=err, c="C0", lw=1.5, zorder=3.0, alpha=0.5
)
ax[2 * (j - 1), i - 1].plot(ell[10:], t[10:], c="C0", lw=1.0, zorder=4.0)
ax[2 * (j - 1), i - 1].axhline(0.0, c="k", lw=0.8, zorder=-1)
ax[2 * (j - 1) + 1, i - 1].errorbar(
lgrid,
(c - t_itp) / t_itp,
yerr=np.abs(err / t_itp),
fmt=".",
c="C0",
lw=1.5,
zorder=1.0,
alpha=0.5,
)
ax[2 * (j - 1) + 1, i - 1].axhline(0.0, c="k", lw=0.8, zorder=-1)
ax[2 * (j - 1), i - 1].tick_params(axis="both", which="both", direction="in")
ax[2 * (j - 1), i - 1].set_yscale(
"symlog", linthresh=1e-7, linscale=0.45, subs=np.arange(0.1, 1.0, 0.1)
)
ax[2 * (j - 1), i - 1].set_ylim(-3e-7, 5e-6)
ax[2 * (j - 1) + 1, i - 1].set_yscale(
"symlog", linthresh=0.1, linscale=0.45, subs=np.arange(0.1, 1.0, 0.1)
)
ax[2 * (j - 1) + 1, i - 1].set_ylim(-10, 10)
ax[2 * (j - 1), i - 1].set_xscale("log")
ax[2 * (j - 1), i - 1].set_xlim(5, lmax * 2)
ax[2 * (j - 1) + 1, i - 1].set_xscale("log")
ax[2 * (j - 1) + 1, i - 1].set_xlim(5, lmax * 2)
if i > 1:
ax[2 * (j - 1), i - 1].set_yticklabels([])
ax[2 * (j - 1) + 1, i - 1].set_yticklabels([])
fig.subplots_adjust(left=0.0, bottom=0.0, right=1.0, top=1.0, wspace=0.0, hspace=0.0)
fig.supxlabel("angular mode $\\ell$", y=-0.05, va="top")
fig.supylabel("galaxy clustering $C_\\ell$", x=-0.1, ha="right")
plt.show()
[26]:
fig, ax = plt.subplots(
2 * (nbins - 1),
nbins - 1,
figsize=(9, 9),
gridspec_kw={"height_ratios": [3, 1, 3, 1, 3, 1]},
)
for i in range(1, nbins):
for j in range(1, i):
ax[j - 1, i - 1].axis("off")
ax[j, i - 1].axis("off")
ax[3, 2].axis("off")
for j in range(i, nbins):
key = ("SHE", "SHE", i, j)
cov_key = ("SHE", "SHE", "SHE", "SHE", i, j, i, j)
e_cov = dices_cov[cov_key][0, 0, 0, 0, :, :]
e_err = np.sqrt(np.diag(e_cov))
b_cov = dices_cov[cov_key][1, 1, 1, 1, :, :]
b_err = np.sqrt(np.diag(b_cov))
e_c = cqs0[key][0, 0, :]
b_c = cqs0[key][1, 1, :]
e_t = theory_cls[key][0, 0, :]
b_t = theory_cls[key][1, 1, :]
e_t_itp = np.interp(lgrid, ell, e_t)
b_t_itp = np.interp(lgrid, ell, b_t)
ax[2 * (j - 1), i - 1].errorbar(
lgrid,
cqs0[key][0, 0, :],
yerr=e_err,
c="C0",
lw=1.5,
zorder=3.0,
alpha=0.5,
)
ax[2 * (j - 1), i - 1].plot(ell[10:], e_t[10:], c="C0", lw=1.0, zorder=4.0)
ax[2 * (j - 1) + 1, i - 1].errorbar(
lgrid,
(e_c - e_t_itp) / e_t_itp,
yerr=e_err / e_t_itp,
fmt=".",
c="C0",
lw=1.5,
zorder=1.0,
alpha=0.5,
)
ax[2 * (j - 1), i - 1].errorbar(
lgrid,
b_c,
yerr=b_err,
c="C1",
lw=1.5,
zorder=1.0,
alpha=0.5,
)
ax[2 * (j - 1), i - 1].plot(
ell[10:],
b_t[10:],
c="C1",
lw=1.0,
zorder=2.0,
)
ax[2 * (j - 1), i - 1].axhline(0.0, c="k", lw=0.8, zorder=-1)
ax[2 * (j - 1) + 1, i - 1].axhline(0.0, c="k", lw=0.8, zorder=-1)
ax[2 * (j - 1), i - 1].tick_params(axis="both", which="both", direction="in")
ax[2 * (j - 1), i - 1].set_yscale(
"symlog", linthresh=1e-10, linscale=0.45, subs=np.arange(0.1, 1.0, 0.1)
)
ax[2 * (j - 1), i - 1].set_ylim(-3e-9, 5e-9)
ax[2 * (j - 1) + 1, i - 1].set_yscale(
"symlog", linthresh=0.1, linscale=0.45, subs=np.arange(0.1, 1.0, 0.1)
)
ax[2 * (j - 1) + 1, i - 1].set_ylim(-2.5, 2.5)
ax[2 * (j - 1), i - 1].set_xscale("log")
ax[2 * (j - 1), i - 1].set_xlim(5, lmax * 2)
ax[2 * (j - 1) + 1, i - 1].set_xscale("log")
ax[2 * (j - 1) + 1, i - 1].set_xlim(5, lmax * 2)
if i > 1:
ax[2 * (j - 1), i - 1].set_yticklabels([])
ax[2 * (j - 1) + 1, i - 1].set_yticklabels([])
ax[0, 0].xaxis.get_major_locator().set_params(numticks=99)
ax[0, 0].xaxis.get_minor_locator().set_params(
numticks=99, subs=np.arange(0.1, 1.0, 0.1)
)
# ax[0, 0].set_yscale(
# "symlog", linthresh=1e-10, linscale=0.45, subs=np.arange(0.1, 1.0, 0.1)
# )
# ax[0, 0].set_ylim(-3e-7, 5e-7)
fig.subplots_adjust(left=0.0, bottom=0.0, right=1.0, top=1.0, wspace=0.0, hspace=0.0)
fig.supxlabel("angular mode $\\ell$", y=-0.05, va="top")
fig.supylabel("cosmic shear $C_\\ell$", x=-0.1, ha="right")
plt.show()
[27]:
fig, ax = plt.subplots(
2 * (nbins - 1),
(nbins - 1),
figsize=(9, 9),
gridspec_kw={"height_ratios": [3, 1, 3, 1, 3, 1]},
)
for i in range(1, nbins):
for j in range(1, nbins):
key = ("POS", "SHE", i, j)
cov_key = ("POS", "SHE", "POS", "SHE", i, j, i, j)
e_cov = dices_cov[cov_key][0, 0, :, :]
e_err = np.sqrt(np.diag(e_cov))
b_cov = dices_cov[cov_key][1, 1, :, :]
b_err = np.sqrt(np.diag(b_cov))
e_c = cqs0[key][0, :]
b_c = cqs0[key][1, :]
e_t = theory_cls[key][0, :]
b_t = theory_cls[key][1, :]
e_t_itp = np.interp(lgrid, ell, e_t)
b_t_itp = np.interp(lgrid, ell, b_t)
ax[2 * (j - 1), i - 1].errorbar(
lgrid,
e_c,
yerr=e_err,
c="C0",
lw=1.5,
zorder=3.0,
alpha=0.5,
)
ax[2 * (j - 1), i - 1].plot(ell[10:], e_t[10:], c="C0", lw=1.0, zorder=4.0)
ax[2 * (j - 1) + 1, i - 1].errorbar(
lgrid,
(e_c - e_t_itp) / e_t_itp,
yerr=e_err / np.abs(e_t_itp),
fmt=".",
c="C0",
lw=1.5,
zorder=1.0,
alpha=0.5,
)
ax[2 * (j - 1), i - 1].errorbar(
lgrid,
b_c,
yerr=b_err,
c="C1",
lw=1.5,
zorder=1.0,
alpha=0.5,
)
ax[2 * (j - 1), i - 1].plot(ell[10:], b_t[10:], c="C1", lw=1.0, zorder=2.0)
ax[2 * (j - 1), i - 1].axhline(0.0, c="k", lw=0.8, zorder=-1)
ax[2 * (j - 1) + 1, i - 1].axhline(0.0, c="k", lw=0.8, zorder=-1)
ax[2 * (j - 1), i - 1].tick_params(axis="both", which="both", direction="in")
ax[2 * (j - 1), i - 1].set_yscale(
"symlog", linthresh=1e-9, linscale=0.45, subs=np.arange(0.1, 1.0, 0.1)
)
ax[2 * (j - 1), i - 1].set_ylim(-8e-7, 8e-7)
ax[2 * (j - 1) + 1, i - 1].set_yscale(
"symlog",
linthresh=0.1,
linscale=0.1,
)
ax[2 * (j - 1) + 1, i - 1].set_ylim(-10, 10)
ax[2 * (j - 1), i - 1].set_xscale("log")
ax[2 * (j - 1), i - 1].set_xlim(5, lmax * 2)
ax[2 * (j - 1) + 1, i - 1].set_xscale("log")
ax[2 * (j - 1) + 1, i - 1].set_xlim(5, lmax * 2)
if i > 1:
ax[2 * (j - 1), i - 1].set_yticklabels([])
ax[2 * (j - 1) + 1, i - 1].set_yticklabels([])
# ax[0, 0].set_xscale("log")
# ax[0, 0].set_xlim(1 / 2, lmax * 2)
ax[0, 0].xaxis.get_major_locator().set_params(numticks=99)
ax[0, 0].xaxis.get_minor_locator().set_params(
numticks=99, subs=np.arange(0.1, 1.0, 0.1)
)
# ax[0, 0].set_yscale(
# "symlog", linthresh=1e-9, linscale=0.45, subs=np.arange(0.1, 1.0, 0.1)
# )
# ax[0, 0].set_ylim(-8e-7, 4e-7)
fig.subplots_adjust(left=0.0, bottom=0.0, right=1.0, top=1.0, wspace=0.0, hspace=0.0)
fig.supxlabel("angular mode $\\ell$", y=-0.05, va="top")
fig.supylabel("galaxy--galaxy lensing $C_\\ell$", x=-0.1, ha="right")
plt.show()