
Research Note

MCMC & Bayesian Inversion

Zhenyu He · Jobs Stroustrup

Proficiency: Proficient

Description

MCMC sampling

  • emcee (affine-invariant ensemble sampler, Foreman-Mackey 2013) on real problems
  • Log-space reparameterization for scale-spanning parameters
  • Trace + corner plots for chain diagnostics
  • Practical handling of MCMC multi-process × BLAS thread collision (OMP_NUM_THREADS=1 + 7 other env vars)

Gaussian Process regression

  • RBF kernel + hierarchical MCMC sampling of hyperparameters
  • Spatial × Temporal separable kernel K_S ⊗ K_T to avoid O(N³) joint covariance
  • HEALPix spherical grid as GP spatial support (healpy)
  • Snapshot-level and pixelwise posterior covariances implemented in closed form
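The separable-kernel point rests on the Kronecker identity (K_S ⊗ K_T) vec(X) = vec(K_T X K_S) for symmetric factors, which lets one avoid ever forming the joint covariance. A minimal sketch, using 1-D toy coordinates in place of the real HEALPix/time grids:

```python
import numpy as np

def rbf(x, length):
    # squared-exponential kernel matrix on 1-D inputs (toy stand-in)
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

rng = np.random.default_rng(0)
s = np.linspace(0, 1, 5)       # spatial coordinates (Ns = 5)
t = np.linspace(0, 1, 7)       # temporal coordinates (Nt = 7)
K_S, K_T = rbf(s, 0.3), rbf(t, 0.2)

X = rng.standard_normal((t.size, s.size))   # field as an Nt x Ns matrix

# Full Kronecker product: O((Ns*Nt)^2) memory -- exactly what we avoid at scale.
full = np.kron(K_S, K_T) @ X.flatten(order="F")

# Separable identity: (K_S ⊗ K_T) vec(X) = vec(K_T @ X @ K_S)
fast = (K_T @ X @ K_S).flatten(order="F")
```

The matrix-product form touches only the Ns×Ns and Nt×Nt factors, which is what breaks the O(N³) cost of the joint covariance.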

Regularization parameter selection for inverse problems

  • L-Curve method
  • L-Curve Curvature method
  • Generalized Cross Validation (GCV) — hand-implemented from Golub 1979
  • Morozov discrepancy principle
  • All four compared on Hansen gravity-problem example and on real DSCOVR data
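A hand-rolled GCV selector in the spirit of the Golub 1979 reproduction might look like the following. This is a sketch on an assumed toy Vandermonde problem, not the project's code; the Tikhonov filter factors f_i = σ_i² / (σ_i² + λ²) come from the SVD:

```python
import numpy as np

def gcv(A, b, lambdas):
    # GCV curve G(λ) = ||A x_λ - b||² / (m - Σ f_i)²
    # with SVD filter factors f_i = σ_i² / (σ_i² + λ²)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    beta = U.T @ b
    m = A.shape[0]
    G = []
    for lam in lambdas:
        f = s**2 / (s**2 + lam**2)
        # residual = filtered in-range part + component outside range(U)
        resid2 = np.sum(((1 - f) * beta) ** 2) + (b @ b - beta @ beta)
        G.append(resid2 / (m - f.sum()) ** 2)
    return np.array(G)

# Toy ill-conditioned system with small noise (illustrative only)
rng = np.random.default_rng(1)
A = np.vander(np.linspace(0, 1, 40), 12, increasing=True)
x_true = rng.standard_normal(12)
b = A @ x_true + 1e-3 * rng.standard_normal(40)

lambdas = np.logspace(-8, 1, 100)
lam_gcv = lambdas[np.argmin(gcv(A, b, lambdas))]
```

In practice one minimizes G on a log-spaced λ grid, as above, and compares the minimizer against the L-curve corner and the discrepancy-principle λ.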

Numerical linear algebra

  • scipy.linalg.solve(..., assume_a="pos") Cholesky instead of explicit inverse
  • np.linalg.slogdet vs log(det) for underflow safety
  • SVD + Tikhonov regularization (MATLAB csvd, etc.)
  • Matrix condition number analysis
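The first two bullets can be demonstrated in a few lines (an illustrative sketch on a random SPD matrix, not project code):

```python
import numpy as np
from scipy import linalg

rng = np.random.default_rng(2)
n = 200
M = rng.standard_normal((n, n))
K = M @ M.T + n * np.eye(n)      # symmetric positive definite "covariance"
y = rng.standard_normal(n)

# Cholesky-backed solve instead of forming K^{-1} explicitly:
# faster and better conditioned than inv(K) @ y.
alpha = linalg.solve(K, y, assume_a="pos")

# log|K| via slogdet: here det(K) itself overflows float64,
# while the (sign, log-magnitude) pair stays finite.
sign, logdet = np.linalg.slogdet(K)
```

The same pattern (Cholesky solve + slogdet) is what keeps GP marginal-likelihood evaluations stable when K is large.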

Toolchain

  • Python: emcee, scipy, numpy, healpy, astropy, pandas
  • MATLAB: method verification + Hansen-book reproductions
  • Mathematica: symbolic derivations / probability formalism checks

Self-study path (worth highlighting)

In 2021, self-studied Hansen’s Discrete Inverse Problems, Bishop’s PRML, and Golub’s original 1979 GCV paper, progressively reproducing textbook and original-paper numerical experiments: the “textbook → original paper → reproduction → applied to new problem” chain.

Used In

  • Undergraduate Research @ Peking University & Caltech/UCR — MCMC Retrieval Methods project (Mar 2021 — Mar 2022)
  • — full research-code + study-notes repo
  • Transferable to: any ill-posed inverse problem (climate retrieval, medical imaging, geophysical inversion, exoplanet spin-mapping)

Methodology Concept Pages

Caltech DSCOVR 4 Original Contributions (2021-2022, Advised by Yuk Yung + King-Fai Li)

Contribution (c): Hansen Textbook σ² vs σ⁴ Bug Catch (80× Deviation)

Core achievement: as an undergraduate, using an independent SVD + filter-factor derivation, found a formula-level bug in Hansen’s classic Discrete Inverse Problems (SIAM, 2010), a foundational textbook: not a typo, but a substantive error causing numerical results to differ by ~80×.

Numerical comparison (same toy model + PC2 data):

Formula                               Toy model λ   PC2 λ    Accuracy
Hansen-student correct version (σ⁴)   0.0251        0.0575   Consistent with naked-eye elbow + theoretical value under axis image
Hansen-book erroneous version (σ²)    1.995         1.995    80× overestimate, complete failure

Innovative discovery: independently identified an L-curve axis-scale ambiguity not discussed by Hansen:

“Hansen didn’t talk about whether we should control the units of x-axis and y-axis of L-curve to be the same. In his book, he showed two L-curve figures, but both didn’t control the units to be the same.” (discovered 2022-01-08)

3-step self-correction cycle (11/29 → 12/06 → 01/08): self-discovered the formula error → hypothesized accumulation of errors → refuted that and correctly attributed it to axis-scale ambiguity — demonstrates intellectual honesty + iteration.

Physical-evidence discipline: simultaneously preserved Hansen original-version PDF (denominator σ²) + Hansen-student correct-version PDF (denominator σ⁴) side-by-side under LCurve_HansenMistake_WeRectify_Test/ directory — “inconvenient-evidence retention” spirit, corresponding to .

Community contribution: posed 3 open questions to the academic community — (1) Should L-curve axes be controlled to the same units? (2) Is a senior collaborator’s elbow λ=1e-3 conclusion a scale artifact? (3) When applying L-curve, in which cases should one tune, and how?

Contribution (b): Pre-whitening L-curve Innovation + Methodological Restraint

Technical innovation: under an RBF kernel the noise covariance Σ is non-diagonal; transform into the whitened space via the Cholesky factorization Σ = LLᵀ, replacing A → L⁻¹A and b → L⁻¹b so the problem regains standard Tikhonov form.
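A minimal sketch of the pre-whitening step, assuming a toy RBF covariance over the observation index (not the project's actual kernel or data):

```python
import numpy as np
from scipy import linalg

rng = np.random.default_rng(3)
m, n = 30, 10
A = rng.standard_normal((m, n))
x_true = rng.standard_normal(n)

# Non-diagonal noise covariance from an RBF kernel over the observation
# index, plus a small jitter so the Cholesky factorization succeeds.
idx = np.arange(m, dtype=float)
Sigma = np.exp(-0.5 * ((idx[:, None] - idx[None, :]) / 3.0) ** 2) + 1e-6 * np.eye(m)
b = A @ x_true + rng.multivariate_normal(np.zeros(m), 0.01 * Sigma)

# Pre-whitening: Sigma = L L^T, then work with A_w = L^{-1} A, b_w = L^{-1} b.
# In the whitened coordinates the noise is iid, so the standard Tikhonov
# L-curve machinery applies.
L = np.linalg.cholesky(Sigma)
A_w = linalg.solve_triangular(L, A, lower=True)
b_w = linalg.solve_triangular(L, b, lower=True)
```

Triangular solves are used instead of explicitly inverting L, consistent with the numerical-linear-algebra practices listed above.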

Problem diagnosis: discovered that the L-curve x/y values become non-monotonic under a non-diagonal covariance → structural limit: the L-curve corresponds to pure Tikhonov, and the non-diagonal case cannot be derived into Tikhonov form; the “modified L-curve” is overly sensitive to priors.

Independent judgment — methodological restraint (2022-01-23 group meeting verbatim):

(Declined to tune the covariance or priors to engineer a target λ: a methodological commitment.)

Methodology-maturity evidence: “I systematically diagnose the limitations of a method I myself proposed” — an undergraduate refusing the advisor’s implicit preference to engineer the desired λ value, a research-integrity signature. Corresponds to Moment 3.

Contribution (a): Hardened Balancing Principle (HBP) 90-paragraph Writeup

Theoretical framework: starting from the Tikhonov functional min_x ‖Ax − b‖² + λ²‖x‖², via the Bauer 2007 exponential concentration assumption, derive the automatic parameter-selection HBP λ_{n+} formula.

Systematic stress test:

  • Hansen gravity problem (500-instance, condition number ~2.88e28 vs reported ~1.54e5)
  • Golub 1979 Laplace (condition number reaching 2.88e28)
  • 3-point linear regression (Prof. King-Fai Li × Zhenyu Bayesian-form closed-solution cross-validation)
  • DSCOVR real PC2 time series

Conclusion: HBP performs best, requiring no prior knowledge of the noise level δ or a tuning constant κ.

Output: a 90-paragraph math-section writeup in the internal-only manuscript, equivalent in scale to a standalone technical note/tutorial, containing the full derivation from Tikhonov to HBP.

Contribution (d): Direction of Noise Vector Influences Regularization (Original Discovery)

Original discovery (2021-09-20 verbatim):

“When I did numerical tests on L-Curve, GCV etc., I found out the direction of the noise vector plays a big role in the performance of statistical criteria. I did a literature search and it seems nobody has studied this topic.”

4-method failure-mode catalog:

Method                  Failure condition
L-curve                 |x_sol| large, or near EV1
Kawahara Bayesian       |x_sol| large (5/5 fail for x_sol = [10, 1])
GCV                     near EV1 and near EV2
Discrepancy principle   always works, but introduces systematic error
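A minimal way to probe this direction dependence (a sketch, not the original 500-instance harness) is to hold the noise norm fixed and vary only its direction along the left singular vectors; here "EV1" is read as the leading singular direction:

```python
import numpy as np

rng = np.random.default_rng(4)
A = np.vander(np.linspace(0, 1, 20), 8, increasing=True)  # toy ill-conditioned system
x_sol = np.ones(8)
b_clean = A @ x_sol

# Unit-norm noise directions: leading singular direction, smallest
# singular direction, and an isotropic random direction for comparison.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
g = rng.standard_normal(20)
noise_dirs = {
    "along EV1": U[:, 0],
    "along smallest singular direction": U[:, -1],
    "isotropic random": g / np.linalg.norm(g),
}

eps = 1e-3
# Each right-hand side carries the same noise norm eps; feeding these into
# the L-curve / GCV / discrepancy selectors exposes the direction dependence.
noisy_rhs = {name: b_clean + eps * d for name, d in noise_dirs.items()}
```

Aggregating the selected λ over many such instances is the pattern the 500-instance Hansen-gravity validation follows.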

Origin (accidental discovery): while reproducing Golub 1979 GCV (condition number 2.88e28), the 4th of 5 instances yielded a completely different λ_GCV; the Hansen-gravity 500-instance validation then aggregated this behavior, and it was systematized into contribution (d). Independent discovery plus literature-gap awareness (HBP was later extended by a subsequent collaborator to PC1/PC4 clouds, forming a methodology-influence chain).

Cross-language fluency (within this single 11-month project):

Language      Task taken
IDL           OOP MCMC class (inheriting Yuk Yung lab convention)
MATLAB        Hansen-book gravity problem replication + csvd
Python        Kawahara CUDA library bootstrap (remote workstation, Anaconda) + emcee + healpy
Mathematica   Symbolic curvature derivations (calculation_4_cal_deriva.nb + calculation_5_cal_deriva.nb)
R             Metropolis-Hastings baseline for cross-validation

The sustained trajectory of 29 group_meeting pptx files (Mar 2021 → Mar 2022), together with Walker paper v3→v6 drafting (Sep–Oct 2021) and the 4 nwp_hw assignments (Oct–Dec 2021), constitutes the “3-track parallel Sep–Dec 2021” evidence of Moment 4.