Research Note
MCMC & Bayesian Inversion
Proficiency
Proficient
Description
MCMC sampling
- emcee (affine-invariant ensemble sampler; Foreman-Mackey et al. 2013) on real problems
- Log-space reparameterization for scale-spanning parameters
- Trace + corner plots for chain diagnostics
- Practical handling of the MCMC multiprocessing × BLAS thread-oversubscription collision (`OMP_NUM_THREADS=1` plus 7 other env vars)
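The thread-pinning pattern can be sketched as follows. The exact set of environment variables beyond `OMP_NUM_THREADS` is an assumption (the usual BLAS/OpenMP backends); the key point is that pinning must happen before `numpy` is imported:

```python
import os

# Pin every common BLAS/OpenMP thread pool to one thread BEFORE importing
# numpy: otherwise each emcee worker process spawns its own thread pool,
# oversubscribing the CPU (the multi-process x BLAS collision).
for var in (
    "OMP_NUM_THREADS",          # OpenMP (and many BLAS builds)
    "OPENBLAS_NUM_THREADS",     # OpenBLAS
    "MKL_NUM_THREADS",          # Intel MKL
    "VECLIB_MAXIMUM_THREADS",   # macOS Accelerate
    "NUMEXPR_NUM_THREADS",      # numexpr
):
    os.environ[var] = "1"

import numpy as np  # safe to import now: BLAS sees the pinned limits
```

With threads pinned, parallelism comes solely from the sampler's process pool (e.g. passing a `multiprocessing.Pool` to `emcee.EnsembleSampler(..., pool=...)`).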
Gaussian Process regression
- RBF kernel + hierarchical MCMC sampling of hyperparameters
- Spatial × temporal separable kernel K_S ⊗ K_T to avoid the O(N³) joint covariance
- HEALPix spherical grid as GP spatial support (healpy)
- Snapshot-level and pixelwise posterior covariances implemented in closed form
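The separable-kernel saving rests on the standard Kronecker identity (K_S ⊗ K_T) vec(V) = vec(K_T V K_Sᵀ); a minimal sketch, with hypothetical grid sizes and lengthscales:

```python
import numpy as np

def rbf(x, y, ell):
    # Squared-exponential (RBF) kernel matrix between two 1-D point sets.
    d = x[:, None] - y[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)

# Hypothetical sizes: 20 spatial points (stand-in for HEALPix pixel
# coordinates) and 30 time snapshots.
xs = np.linspace(0.0, 1.0, 20)
ts = np.linspace(0.0, 1.0, 30)
K_S = rbf(xs, xs, 0.3)                    # 20 x 20 spatial kernel
K_T = rbf(ts, ts, 0.1)                    # 30 x 30 temporal kernel

# Kronecker identity (K_S ⊗ K_T) vec(V) = vec(K_T V K_S^T): apply the joint
# covariance without ever forming the 600 x 600 matrix explicitly.
V = np.random.default_rng(0).normal(size=(30, 20))   # time x space field
full = np.kron(K_S, K_T) @ V.reshape(-1, order="F")  # dense: O((nm)^2) memory
fast = (K_T @ V @ K_S.T).reshape(-1, order="F")      # O(n^2 m + n m^2) work
```

Both paths agree to rounding; only the second scales to realistic HEALPix × time grids.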
Regularization parameter selection for inverse problems
- L-Curve method
- L-Curve Curvature method
- Generalized Cross Validation (GCV) — hand-implemented from Golub 1979
- Morozov discrepancy principle
- All four compared on Hansen gravity-problem example and on real DSCOVR data
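A minimal GCV selector in SVD/filter-factor form, as a sketch of the Golub-Heath-Wahba 1979 criterion (the Vandermonde test problem below is hypothetical, not the Hansen gravity example):

```python
import numpy as np

def gcv_select(A, b, lambdas):
    """Pick the Tikhonov lambda minimizing the GCV function
    G(lam) = ||(I - A A_lam) b||^2 / trace(I - A A_lam)^2,
    written via the SVD filter factors f_i = s_i^2 / (s_i^2 + lam^2)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    beta = U.T @ b                          # data in the left singular basis
    oob = b @ b - beta @ beta               # out-of-range part ||b - U U^T b||^2
    best = None
    for lam in lambdas:
        f = s**2 / (s**2 + lam**2)          # Tikhonov filter factors
        resid2 = np.sum(((1.0 - f) * beta) ** 2) + oob
        g = resid2 / (len(b) - np.sum(f)) ** 2
        if best is None or g < best[0]:
            best = (g, lam)
    return best[1]

# Hypothetical ill-posed toy problem
rng = np.random.default_rng(0)
A = np.vander(np.linspace(0.0, 1.0, 30), 12, increasing=True)  # ill-conditioned
b = A @ rng.normal(size=12) + 1e-3 * rng.normal(size=30)
lam = gcv_select(A, b, np.logspace(-8, 0, 200))
```

A grid search over λ suffices here; on real problems the same G(λ) can be handed to a 1-D minimizer.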
Numerical linear algebra
- `scipy.linalg.solve(..., assume_a="pos")` (Cholesky path) instead of explicit inverse
- `np.linalg.slogdet` vs `log(det)` for underflow safety
- SVD + Tikhonov regularization (MATLAB `csvd`, etc.)
- Matrix condition number analysis
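Two of these habits in a minimal sketch (the matrix sizes are arbitrary): `slogdet` where a naive `log(det)` underflows, and a positive-definite solve instead of forming an inverse:

```python
import numpy as np
from scipy.linalg import solve

# 1) Log-determinant: det(1e-4 * I_100) = 1e-400 underflows float64 to 0.0,
#    so a naive log(det) yields -inf, while slogdet stays finite.
K = 1e-4 * np.eye(100)
with np.errstate(divide="ignore"):
    naive = np.log(np.linalg.det(K))      # -inf: determinant underflowed
sign, logdet = np.linalg.slogdet(K)       # finite: 100 * log(1e-4)

# 2) Solve K x = y through the Cholesky path instead of inv(K) @ y:
y = np.ones(100)
x = solve(K, y, assume_a="pos")           # exploits symmetric positive-definiteness
```

The same `slogdet` pattern is what keeps GP log-marginal-likelihood evaluations finite when the kernel matrix has tiny eigenvalues.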
Toolchain
- Python: emcee, scipy, numpy, healpy, astropy, pandas
- MATLAB: method verification + Hansen-book reproductions
- Mathematica: symbolic derivations / probability formalism checks
Self-study path (worth highlighting)
In 2021, self-studied Hansen's Discrete Inverse Problems, Bishop's PRML, and the original Golub 1979 GCV paper, progressively reproducing textbook and original-paper numerical experiments: the "textbook → original paper → reproduction → applied to a new problem" chain.
Used In
- Undergraduate Research @ Peking University & Caltech/UCR — MCMC Retrieval Methods project (Mar 2021 — Mar 2022)
- — full research-code + study-notes repo
- Transferable to: any ill-posed inverse problem (climate retrieval, medical imaging, geophysical inversion, exoplanet spin-mapping)
Methodology Concept Pages
- Inverse Problems & Regularization — frequentist view
- Gaussian Process Bayesian Inversion — Bayesian view
- Hardened Balancing Principle (HBP): automatic regularization-parameter selection
- L-Curve Axis-Scale Ambiguity (Hansen σ² vs σ⁴ bug catch): the 80× discrepancy and its open questions
Caltech DSCOVR: 4 Original Contributions (2021-2022, advised by Yuk Yung + King-Fai Li)
Contribution (c): Hansen Textbook σ² vs σ⁴ Bug Catch (80× Deviation)
Core achievement: as an undergraduate, using an independent SVD + filter-factor derivation, found a formula-level bug in Hansen's classic Discrete Inverse Problems (SIAM, 2010), a foundational textbook: not a typo, but a substantive error causing numerical results to differ by ~80×.
Numerical comparison (same toy model + PC2 data):
| Formula | Toy-model λ | PC2 λ | Assessment |
|---|---|---|---|
| Hansen-student corrected version (σ⁴) | 0.0251 ✓ | 0.0575 ✓ | Consistent with the naked-eye elbow + the theoretical value under equal (`axis image`) axis scaling |
| Hansen-book erroneous version (σ²) | 1.995 ✗ | 1.995 ✗ | ~80× overestimate; complete failure |
Innovative discovery: independently identified an L-curve axis-scale ambiguity not discussed by Hansen:
“Hansen didn’t talk about whether we should control the units of x-axis and y-axis of L-curve to be the same. In his book, he showed two L-curve figures, but both didn’t control the units to be the same.” (discovered 2022-01-08)
3-step self-correction cycle (11/29 → 12/06 → 01/08): self-discovered the formula error → hypothesized accumulation of errors → refuted that and correctly attributed it to axis-scale ambiguity — demonstrates intellectual honesty + iteration.
Physical-evidence discipline: preserved the Hansen original-version PDF (denominator σ²) and the Hansen-student corrected-version PDF (denominator σ⁴) side by side under the LCurve_HansenMistake_WeRectify_Test/ directory: the "inconvenient-evidence retention" spirit, corresponding to .
Community contribution: posed 3 open questions to the academic community — (1) Should L-curve axes be controlled to the same units? (2) Is a senior collaborator’s elbow λ=1e-3 conclusion a scale artifact? (3) When applying L-curve, in which cases should one tune, and how?
Contribution (b): Pre-whitening L-curve Innovation + Methodological Restraint
Technical innovation: under an RBF kernel the covariance is non-diagonal; transform the problem into the whitened space before applying the L-curve.
Problem diagnosis: discovered that the L-curve x/y values become non-monotonic under a non-diagonal covariance → structural limit: the L-curve corresponds to pure Tikhonov, and the non-diagonal case cannot be reduced to Tikhonov form; the "modified L-curve" is overly sensitive to priors.
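The whitening step can be sketched as a standard change of variables, assuming for illustration that the non-diagonal RBF covariance C = L Lᵀ sits on the data/residual side (grid, lengthscale, and sizes here are hypothetical):

```python
import numpy as np

def rbf_cov(t, ell=0.3):
    # RBF covariance on a 1-D grid, with jitter for a stable Cholesky factor.
    d = t[:, None] - t[None, :]
    return np.exp(-0.5 * (d / ell) ** 2) + 1e-8 * np.eye(len(t))

rng = np.random.default_rng(1)
n, m = 40, 25
A = rng.normal(size=(n, m))               # toy forward operator
b = rng.normal(size=n)                    # toy data
C = rbf_cov(np.linspace(0.0, 1.0, n))     # non-diagonal (RBF) covariance
L = np.linalg.cholesky(C)

# Pre-whitening: with W = L^{-1} A and c = L^{-1} b, the generalized
# residual (Ax-b)^T C^{-1} (Ax-b) equals the plain residual ||W x - c||^2,
# so the standard (pure-Tikhonov) L-curve machinery applies to (W, c).
W = np.linalg.solve(L, A)
c = np.linalg.solve(L, b)
```

After this change of variables the usual L-curve is drawn over (‖W x_λ − c‖, ‖x_λ‖); the note's diagnosis above concerns what this transform cannot absorb.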
Independent judgment — methodological restraint (2022-01-23 group meeting verbatim):
(Declined to tune priors to engineer a target λ: a methodological commitment.)
Methodology-maturity evidence: "I systematically diagnose the limitations of a method I myself proposed." An undergraduate declining an advisor's implicit preference to engineer the desired λ value is a research-integrity signature. Corresponds to Moment 3.
Contribution (a): Hardened Balancing Principle (HBP) 90-paragraph Writeup
Theoretical framework: starting from the Tikhonov functional min_x ‖Ax − b‖² + λ²‖x‖², via the Bauer 2007 exponential-concentration assumption → the HBP automatic parameter-selection formula for λ_{n+}.
Systematic stress test:
- Hansen gravity problem (500 instances; condition number ~2.88e28 vs the reported ~1.54e5)
- Golub 1979 Laplace problem (condition number reaching 2.88e28)
- 3-point linear regression (cross-checked against the Prof. King-Fai Li × Zhenyu closed-form Bayesian solution)
- DSCOVR real PC2 time series
Conclusion: HBP performed best, requiring no prior knowledge of the noise level δ and no tuning constant κ.
Output: the 90-paragraph math-section writeup in an internal-only manuscript, equivalent in scale to a standalone technical note or tutorial, containing the full derivation from Tikhonov to HBP.
Contribution (d): Direction of Noise Vector Influences Regularization (Original Discovery)
Original discovery (2021-09-20 verbatim):
“When I did numerical tests on L-Curve, GCV etc., I found out the direction of the noise vector plays a big role in the performance of statistical criteria. I did a literature search and it seems nobody has studied this topic.”
4-method failure-mode catalog:
| Method | Failure condition |
|---|---|
| L-curve | ‖x_sol‖ large, or noise near EV1 |
| Kawahara Bayesian | ‖x_sol‖ large (5/5 fail for x_sol = [10, 1]) |
| GCV | noise near EV1 and near EV2 |
| Discrepancy principle | Always works, but introduces systematic error |
Origin (accidental discovery): while reproducing Golub 1979 GCV (condition number 2.88e28), the 4th of 5 instances produced a completely different λ_GCV → aggregated over the 500-instance Hansen-gravity validation → systematized into contribution (d). Independent discovery + literature-gap awareness (HBP was later extended by a subsequent collaborator to the PC1/PC4 clouds, forming a methodology-influence chain).
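A toy sketch of the underlying experiment (the matrix, noise level, and λ grid are hypothetical; it illustrates the mechanism, not the catalogued failure modes): inject noise of fixed norm either along the leading left singular vector (EV1) or in a random direction, and compare the GCV-selected λ:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 30, 20
# Ill-conditioned toy forward operator (geometrically decaying spectrum)
A = rng.normal(size=(n, m)) @ np.diag(0.6 ** np.arange(m))
U, s, Vt = np.linalg.svd(A, full_matrices=False)
b_clean = A @ rng.normal(size=m)

def gcv_lambda(b, lams):
    # Minimal SVD-form GCV (numerator keeps the out-of-range residual).
    beta = U.T @ b
    g = []
    for lam in lams:
        f = s**2 / (s**2 + lam**2)
        resid2 = np.sum(((1 - f) * beta) ** 2) + (b @ b - beta @ beta)
        g.append(resid2 / (n - np.sum(f)) ** 2)
    return lams[int(np.argmin(g))]

lams = np.logspace(-8, 1, 300)
eps = 1e-2
lam_ev1 = gcv_lambda(b_clean + eps * U[:, 0], lams)        # noise along EV1
e = rng.normal(size=n)
lam_rand = gcv_lambda(b_clean + eps * e / np.linalg.norm(e), lams)
# The two selected lambdas generally differ: the criterion responds to the
# *direction* of the noise vector, not just its norm.
```

Repeating this over many noise draws per direction is what turns the anecdote (one deviant instance out of five) into a catalog of failure conditions.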
Cross-language fluency (within this single 11-month project):
| Language | Task |
|---|---|
| IDL | OOP MCMC class (inheriting Yuk Yung lab convention) |
| MATLAB | Hansen-book gravity problem replication + csvd |
| Python | Kawahara CUDA library bootstrap (remote workstation, Anaconda) + emcee + healpy |
| Mathematica | Symbolic curvature derivations (calculation_4_cal_deriva.nb + calculation_5_cal_deriva.nb) |
| R | Metropolis-Hastings baseline for cross-validation |
A sustained trajectory of 29 group_meeting pptx files (Mar 2021 → Mar 2022), together with the Walker paper v3→v6 drafting (Sep-Oct 2021) and the 4 nwp_hw assignments (Oct-Dec 2021), constitutes the "3-track parallel Sep-Dec 2021" evidence for Moment 4.