Research Note

Multi-Method Comparison Spine (2019-2025)

Zhenyu He · Jobs Stroustrup · 6 min read

Page Purpose

This page articulates Zhenyu’s pattern of systematic multi-method comparison: at least 8 instances over 7 years, as opposed to the ad-hoc pattern of “single method + post-hoc sanity check”. This spine is the cross-project methodology backbone. It runs from the sophomore first long report in November 2019 (earliest evidence) to the ongoing 2025 methane regressions, and it appears consistently across independent domains (climate dynamics / quantitative finance / inverse problem theory / aerosol retrieval / spectral methods).

Definition

Multi-method comparison means that, for a single scientific question, ≥ 2 independent methods are set up in parallel from the start and all results are preserved, in contrast to the ad-hoc “1 method + post-hoc sanity check” pattern. Key distinguishing features:

  1. Methods are independent (not sharing core assumptions or subroutines)
  2. Parallel execution (not a fallback from a failed method)
  3. All results preserved (even if some methods are disadvantageous to the narrative)
  4. Cross-validation as discovery mechanism (not just robustness check)
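
The four features above can be sketched as a small harness. This is a minimal illustration, not code from any of the projects: the names `MethodResult`, `run_all` and `spread` are hypothetical, and the three toy integrators of x² on [0, 1] are placeholders chosen only so the example runs (they are not meant to model real method independence).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class MethodResult:
    name: str
    value: Optional[float]       # None if the method failed
    error: Optional[str] = None  # failures are recorded, never dropped


def run_all(methods: Dict[str, Callable[[], float]]) -> List[MethodResult]:
    """Run every method up front and keep every outcome, including failures."""
    results = []
    for name, method in methods.items():
        try:
            results.append(MethodResult(name, method()))
        except Exception as exc:
            results.append(MethodResult(name, None, repr(exc)))
    return results


def spread(results: List[MethodResult]) -> Optional[float]:
    """Max minus min over the methods that returned a value."""
    values = [r.value for r in results if r.value is not None]
    return max(values) - min(values) if len(values) >= 2 else None


def f(x: float) -> float:
    return x * x


def closed_form() -> float:
    return 1.0 / 3.0


def trapezoid(n: int = 100) -> float:
    h = 1.0 / n
    inner = sum(f(i * h) for i in range(1, n))
    return h * (0.5 * f(0.0) + inner + 0.5 * f(1.0))


def simpson(n: int = 100) -> float:
    h = 1.0 / n
    odd = sum(f(i * h) for i in range(1, n, 2))
    even = sum(f(i * h) for i in range(2, n, 2))
    return h / 3.0 * (f(0.0) + 4.0 * odd + 2.0 * even + f(1.0))


results = run_all(
    {"closed_form": closed_form, "trapezoid": trapezoid, "simpson": simpson}
)
for r in results:
    print(r)
print("spread:", spread(results))
```

The spread is reported next to the individual results rather than used to pick a winner, so a method that disagrees stays visible in the output.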

Overview table: ≥ 8 instances across 7 years

Year | Context | Methods compared | Outcome
2019-11-04 | PKU sophomore first long report (43 slides, file “the impact on the strength of tropical circulations 2.pptx”) | 3 Walker strength metrics (Tanaka 2004 velocity potential + ω-based + u-based / equivalent) | Earliest multi-method evidence; origin of the capability spine
2020-2021 | Walker paper v6 (PKU Ji Nie) | ω-based vs u-based stream function: two NCL modules preserved in parallel (msf_top2bottom_omega2_WalkerStream.ncl, 63 lines; msf_top2bottom_u2_Walker.ncl, 61 lines) | Methodology bug catch: ω-based does not satisfy the “non-divergence” condition well → switch to u-based for paper v6, but ω-based module preserved as failed-approach evidence
Summer 2021 | QLBS self-study + NSD Summer School Final Project (joint course project) | DP backward Bellman + FQI off-policy RL + Black-Scholes closed-form benchmark: 3 solutions in a single 780-line notebook + 1606-line Coursera independent path | 780 lines of independent verification; earliest numerical multi-method instance, Summer 2021 (3 months earlier than Caltech, 4 years earlier than PhD)
2021-11 | Caltech DSCOVR (Yuk Yung + King-Fai Li) | Pre-whitening L-curve + L-curve + GCV + Morozov + Hardened Balancing Principle + Tikhonov + TSVD + Kawahara Bayesian cost + discrepancy principle (7+ methods) | Structural limits diagnosed: Hansen σ² vs σ⁴ bug (80×) + Method B/C failure + axis-scale ambiguity + non-diagonal self-limit
2022-01 | Phase 1 T,q vs h,q closure (PhD Nuclear Winter) | h-based RK4 + h-based forward Euler + T,q separate integrate + h_adj + T,q + cp_adj (4 Method variants) + conserved potential temperature 3rd-party ground truth | Root cause isolation (85 J/kg → 3 J/kg → 3.58e-5 J/kg), diagnosed hydrostatic balance violation as mechanism
2022-02 | Aerosol joint retrieval (PKU Jing Li senior thesis) | Log scale height regression vs 1/e decay method: two scale height inference methods + systematic comparison across 12-month China LUT | Convention harmonization: log-linear full-profile regression chosen as primary method, 1/e preserved; test case 2019-08-01 Beijing (0.66 km vs 4.1 km) based on physical sanity check
2025-09 | Phase 4.5 SWTG (PhD Nuclear Winter) | Spectral modes (Sturm-Liouville eigendecomposition) vs DAM 3-D LES ground truth + sanity-check experiments isolating mode evolution | 1-D model significantly overestimates height diagnostic: “failed cancellation” mechanism identified through multi-experiment comparison
Ongoing (2024-) | Methane mapping (UC Berkeley Inez Fung Lab) | Multiple regression types: quantile (upper 50% filter) + density (AvgHeadPer100Acres) + raw linear (MillionHeads) | Texas puzzle reformulation: recovered positive slope in Nebraska/Idaho after density + quantile; raw linear preserved to show original negative slope artifact
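
To make the 2022-02 row concrete, here is a hedged sketch of the two scale-height estimators on synthetic extinction profiles. It assumes “1/e decay” means the lowest height at which extinction falls to 1/e of its surface value. The profiles, grid, and function names are illustrative assumptions, not the thesis data or code, and the Beijing 0.66 km vs 4.1 km case is not reproduced here.

```python
import math


def log_linear_scale_height(z, ext):
    """Least-squares fit of ln(ext) = a - z / H over the full profile."""
    y = [math.log(e) for e in ext]
    n = len(z)
    z_mean = sum(z) / n
    y_mean = sum(y) / n
    sxy = sum((zi - z_mean) * (yi - y_mean) for zi, yi in zip(z, y))
    sxx = sum((zi - z_mean) ** 2 for zi in z)
    return -sxx / sxy  # fitted slope is sxy / sxx = -1 / H


def one_over_e_scale_height(z, ext):
    """Lowest height where ext drops to ext[0] / e (linear interpolation)."""
    threshold = ext[0] * math.exp(-1.0)
    for i in range(1, len(z)):
        if ext[i] <= threshold:
            frac = (ext[i - 1] - threshold) / (ext[i - 1] - ext[i])
            return z[i - 1] + frac * (z[i] - z[i - 1])
    return None  # profile never decays to 1/e within the grid


# Synthetic profiles on a 0-5 km grid (assumed values, for illustration only).
z = [0.1 * i for i in range(51)]
clean = [math.exp(-zi / 2.0) for zi in z]  # pure exponential, H = 2 km
layered = [
    math.exp(-zi / 1.0) + 0.3 * math.exp(-((zi - 3.0) / 0.5) ** 2) for zi in z
]  # H = 1 km near the surface plus an elevated layer at 3 km

for label, profile in (("clean", clean), ("layered", layered)):
    print(
        label,
        log_linear_scale_height(z, profile),
        one_over_e_scale_height(z, profile),
    )
```

On the pure exponential the two estimators agree. On the layered profile the full-profile fit is pulled by the elevated layer while the 1/e height only sees the near-surface decay, so the gap between them is itself a signal about profile shape.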

Meta pattern (what makes this distinct from ad-hoc sanity check)

Characteristic 1: Upfront design, not fallback

  • Methods in each instance are parallel by design from the beginning, not “method A failed → try method B”
  • Evidence: the single QLBS notebook implements 3 solutions simultaneously; the 4 Phase 1 method variants existed in parallel from the project’s first phase

Characteristic 2: Independent methods (not sharing core assumptions)

  • DP + FQI + BS closed-form: completely different mathematical frameworks (backward recursion / off-policy RL / analytical stochastic calculus)
  • ω-based vs u-based stream function: different momentum variable + different integration path
  • Phase 1: h-based vs T,q-based are different state-variable choices
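
As a rough illustration of the first bullet, the sketch below prices one European call with two of the frameworks named there: backward recursion and the analytical closed form. A Cox-Ross-Rubinstein binomial tree stands in for the backward-recursion side; it is a swapped-in textbook technique, not the QLBS DP solution from the notebook, and the parameter values are assumed for the example.

```python
import math


def black_scholes_call(s0, strike, rate, sigma, maturity):
    """Closed-form Black-Scholes price of a European call."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(s0 / strike) + (rate + 0.5 * sigma ** 2) * maturity) / (
        sigma * sqrt_t
    )
    d2 = d1 - sigma * sqrt_t

    def cdf(x):
        # Standard normal CDF via the error function.
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    return s0 * cdf(d1) - strike * math.exp(-rate * maturity) * cdf(d2)


def binomial_call(s0, strike, rate, sigma, maturity, steps=500):
    """Backward induction on a Cox-Ross-Rubinstein tree."""
    dt = maturity / steps
    up = math.exp(sigma * math.sqrt(dt))
    down = 1.0 / up
    p = (math.exp(rate * dt) - down) / (up - down)  # risk-neutral up-probability
    discount = math.exp(-rate * dt)
    # Terminal payoffs, indexed by the number of up-moves j.
    values = [
        max(s0 * up ** j * down ** (steps - j) - strike, 0.0)
        for j in range(steps + 1)
    ]
    # Step back through the tree one time level at a time.
    for _ in range(steps):
        values = [
            discount * (p * values[j + 1] + (1.0 - p) * values[j])
            for j in range(len(values) - 1)
        ]
    return values[0]


# Assumed example parameters: at-the-money call, one year to maturity.
params = dict(s0=100.0, strike=100.0, rate=0.05, sigma=0.2, maturity=1.0)
bs_price = black_scholes_call(**params)
tree_price = binomial_call(**params)
print(bs_price, tree_price, abs(bs_price - tree_price))
```

The two functions share only the contract parameters. One is a discrete recursion and the other an analytical formula, so agreement between them is evidence rather than a tautology.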

Characteristic 3: All results preserved, even disadvantageous

  • Walker v6 preserves the ω-based NCL module even though a methodology bug was caught in it
  • Caltech preserves the incorrect Hansen original PDF (σ²) side by side with the correct PDF from Hansen’s student (σ⁴)
  • Phase 1+2: the filename enforceEpsilon_z_iteration_v3.ipynb itself serves as a failure marker
  • Aerosol: the supplementary inconvenient-evidence docx (filename gloss: “Hard-to-Explain or inconvenient evidence Section.docx”) preserves the CALIPSO L2 4/4 advantage

Characteristic 4: Cross-validation as discovery mechanism, not just robustness

  • Caltech 7+ method comparison → Hansen textbook bug catch (80× overestimation)
  • Phase 1 T,q vs h,q comparison → hydrostatic balance violation mechanism identified
  • Methane density vs raw count → Texas free-range grazing hypothesis (Inez’s contribution)
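
The Phase 1 bullet pairs two integrators with an independently conserved quantity as ground truth. A toy version of that idea, assuming a unit harmonic oscillator in place of the actual T,q / h,q closure (which is not reproduced here), compares forward Euler and RK4 against the conserved energy. All function names are hypothetical.

```python
def rhs(state):
    """Unit harmonic oscillator: x' = v, v' = -x."""
    x, v = state
    return (v, -x)


def euler_step(state, dt):
    dx, dv = rhs(state)
    return (state[0] + dt * dx, state[1] + dt * dv)


def rk4_step(state, dt):
    def shifted(k, scale):
        return (state[0] + scale * k[0], state[1] + scale * k[1])

    k1 = rhs(state)
    k2 = rhs(shifted(k1, 0.5 * dt))
    k3 = rhs(shifted(k2, 0.5 * dt))
    k4 = rhs(shifted(k3, dt))
    return tuple(
        state[i] + dt / 6.0 * (k1[i] + 2.0 * k2[i] + 2.0 * k3[i] + k4[i])
        for i in range(2)
    )


def energy(state):
    """Conserved by the exact dynamics; used as the independent check."""
    return 0.5 * (state[0] ** 2 + state[1] ** 2)


def energy_drift(step, dt=0.01, n_steps=1000):
    state = (1.0, 0.0)
    e0 = energy(state)
    for _ in range(n_steps):
        state = step(state, dt)
    return abs(energy(state) - e0)


euler_drift = energy_drift(euler_step)
rk4_drift = energy_drift(rk4_step)
print("forward Euler energy drift:", euler_drift)
print("RK4 energy drift:", rk4_drift)
```

Because the drift shrinks by many orders of magnitude when only the integrator changes, the residual can be attributed to the time-stepping scheme rather than to the model equations. That is the kind of attribution a single-method run cannot provide.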

Relationships to other cross-project patterns

Relation to Evidence Preservation Discipline (private companion)

  • Multi-method comparison generates the “source pool” of inconvenient evidence (1 method “wins”, the others “lose” or “fail”), so Characteristic 3 directly overlaps with the inconvenient-evidence retention discipline
  • However, multi-method comparison’s focus is design-time upfront parallelism; inconvenient-evidence retention’s focus is result-time honest retention. The two patterns are complementary (the former upstream, the latter downstream)

Relation to PKU advisor methodology companion (private)

  • Both are capabilities at the methodology integrity layer. The methodology arc is advisor-level (credit honesty) and the multi-method spine is student-level (methodology honesty); they are two independent pieces of evidence in the same integrity cluster

Relation to Independent Judgment narrative (held until 2026-06-10)

  • Moment 3 (2022-01-23, refusing methodological-engineering pressure) is directly rooted in multi-method comparison Characteristic 4: the pre-whitening L-curve method showed non-monotonic behavior under the RBF kernel → Zhenyu articulated the method’s self-limit and refused to tune multiple Sa/Se bundles. The multi-method discipline let him diagnose method-level limits and, on that basis, refuse the pressure at the methodology-integrity level.

Distinct signature: sophomore earliest evidence

Critical observation: the origin of the capability spine is not the PhD, not Caltech, and not the senior thesis; it is the sophomore first long report (2019-11-04). This means:

  • Multi-method comparison is not a learned skill from research training but is Zhenyu’s personal epistemic habit — present from his entry into the field
  • It does not match the timeline of PhD-level multi-method training (typically developed after 1-2 years of literature exposure)
  • This explains why the Caltech DSCOVR 7-method comparison emerged in undergrad senior fall — not “suddenly learned” but “the capability already existed, fully expressed when the domain was ready”

Sources

  • research_wiki resume_angles/problem_solving.md whole page (350+ lines, all multi-method stories fully documented)
  • research_wiki resume_angles/technical_depth.md §1 Numerical Methods (7+ regularization methods + QLBS 3-way + Phase 1 4-method variants)
  • research_wiki projects/walker_hadley/overview.md §ω-based vs u-based stream function + sophomore 2019-11-04 first long report
  • research_wiki projects/mcmc_retrieval/overview.md §Caltech DSCOVR 7-method systematic comparison
  • research_wiki projects/aerosol_retrieval/overview.md §log fit vs 1/e method scale height
  • research_wiki projects/nuclear_winter/overview.md §Phase 1 T,q vs h,q + Phase 4.5 SWTG spectral modes
  • research_wiki projects/methane_mapping/overview.md §regression reformulation (density + quantile)
  • research_wiki projects/early_exploration/overview.md §2019-11-04 first long report 43 slides