
CSV Cross-check — Consistency Audit

Mapping every CSV in reports/ to its producing script, dataset N, and model config — identifying where numbers disagree across artifacts.

Scope: DL_Spectra project
Dataset: Java-Sunda HQ, 25,058 traces
Date: 2026-04-22

TL;DR: Three distinct model runs produce non-overlapping R² values, and the reports mix them without clear labeling. validation_evidence_report.md and manuscript_draft_IEEE.md currently pull from different runs, which makes them internally inconsistent. Recommendation: adopt a single authoritative source per evidence item.

The Three Runs

| Run ID | Produced by | Core config | Sa(0.3s) R² |
| --- | --- | --- | --- |
| RUN-A "Heavy stratified" | train_stratified_ida.py, train_eews_windows.py | XGBoost n_est=800, max_depth=12; GroupKFold 5-fold; full feature set. N=25,058 (Fixed PTW) or N=2,747 (IDA stratified subset). | 0.867 (Fixed 10s); 0.876 (IDA) |
| RUN-B "103 fast-marathon" | train_xgboost_103_marathon_all.py | XGBoost n_est=150, max_depth=8; 5-fold. "Full-Wave" here is 50 s, NOT 341 s. | 0.763 (Fixed 10s); 0.738 (IDA); 0.811 (50s) |
| RUN-C "End-to-end operational" | Not found in codebase; referenced only in manuscript | Includes Stage 1 routing uncertainty (8.91% critical miss); numbers quoted in manuscript Table 11/12. | 0.625 (IDA e2e) |

Key Discrepancies by Period

IDA-PTW Adaptive at Sa(0.3s)

| Source | Value | Run |
| --- | --- | --- |
| benchmark_results_ida.csv | 0.8759 | RUN-A · N=2,747 |
| spectral_r2_performance.csv @ T=0.3 | 0.8798 | RUN-A · N=2,747 |
| comparison_r2_table.csv | 0.8760 | RUN-A · N=2,747 |
| xgboost_103_all_baselines.csv @ T=0.3 | 0.7381 | RUN-B · N=25,058 |
| comparison_golden_metrics.csv @ T=0.300 | 0.7381 | RUN-B · N=25,058 |
| manuscript_draft_IEEE.md Table 11 "IDA-PTW Operational" | 0.6252 | RUN-C · no CSV source |
| validation_evidence_report.md | 0.8759 | RUN-A · N=2,747 |

Δ = 25 p.p. spread

Three documents label the same thing "IDA-PTW Adaptive" but the quoted R² values span from 0.876 → 0.738 → 0.625. The manuscript's "Operational" value has no reproducible CSV artifact.
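
As a quick check on the spread quoted above, the sketch below recomputes it from the table values and splits it into within-run and between-run parts. The dictionary is transcribed by hand from the table; the names `quoted_r2` and `spread_pp` are illustrative and do not exist in the DL_Spectra codebase.

```python
# R² values quoted for "IDA-PTW Adaptive" at Sa(0.3s), transcribed from the table.
# Keys are the source documents; values are (run label, quoted R²).
quoted_r2 = {
    "benchmark_results_ida.csv": ("RUN-A", 0.8759),
    "spectral_r2_performance.csv": ("RUN-A", 0.8798),
    "comparison_r2_table.csv": ("RUN-A", 0.8760),
    "xgboost_103_all_baselines.csv": ("RUN-B", 0.7381),
    "comparison_golden_metrics.csv": ("RUN-B", 0.7381),
    "manuscript_draft_IEEE.md Table 11": ("RUN-C", 0.6252),
    "validation_evidence_report.md": ("RUN-A", 0.8759),
}


def spread_pp(values):
    """Return max minus min of the given R² values, in percentage points."""
    values = list(values)
    return (max(values) - min(values)) * 100.0


# Spread across every source, regardless of run.
overall = spread_pp(value for _, value in quoted_r2.values())

# Spread inside each run, to separate labeling problems from rounding noise.
by_run = {}
for run, value in quoted_r2.values():
    by_run.setdefault(run, []).append(value)
within_run = {run: spread_pp(values) for run, values in by_run.items()}

print(f"overall spread: {overall:.1f} p.p.")
for run, spread in sorted(within_run.items()):
    print(f"{run}: within-run spread {spread:.2f} p.p.")
```

Almost all of the spread is between runs. Within RUN-A the sources differ by well under one percentage point, which is consistent with the reading that the inconsistency comes from mixing runs rather than from noise within a run.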

"Full-Wave" R² at Sa(0.3s) — label ambiguity

| Source | Value | Actual window |
| --- | --- | --- |
| validation_evidence_report.md Evidence B | 0.9508 | ~341 s |
| comparison_r2_table.csv | 0.9450 | not specified |
| xgboost_103_all_baselines.csv "Full_Wave" | 0.8110 | 50 s (RUN-B) |
| manuscript_draft_IEEE.md Table 12 "Post-P Full-Wave" | 0.951 | ~341 s |

Critical issue: "Full_Wave" in RUN-B is actually a 50 s window — not 341 s. The ~341 s values (0.951) in the evidence report trace back to "run c7a50193", but no CSV artifact for that run exists in the repo. It is quoted as scalars only.
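
One way to catch this kind of label ambiguity early is to require every window label to state its duration. The sketch below is only an illustration: `window_seconds` and `unlabeled` are hypothetical helpers, and the regular expression assumes durations are written as a number followed by `s` (for example `50s` or `341 s`). It would also match a spectral period such as `Sa(0.3s)`, so it should only be applied to window labels.

```python
import re

# Matches a duration such as "50s", "10 s", or "341 s" anywhere in a label.
_DURATION = re.compile(r"(\d+(?:\.\d+)?)\s*s\b")


def window_seconds(label):
    """Return the window length in seconds stated in a label, or None if absent."""
    match = _DURATION.search(label)
    return float(match.group(1)) if match else None


def unlabeled(labels):
    """Return the labels that do not state a window duration."""
    return [label for label in labels if window_seconds(label) is None]


labels = ["Full_Wave", "Post-P Full-Wave", "Fixed 10s", "Full_Wave_50s"]
print(unlabeled(labels))
```

Under this convention, both "Full_Wave" and "Post-P Full-Wave" are flagged, while a label such as "Full_Wave_50s" makes the RUN-B window explicit.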

Internal Discrepancies Within Documents

1. Manuscript Table 11 — mislabeled PTW rows

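Table 11 in manuscript_draft_IEEE.md labels its rows "Fixed 2 / 3 / 4 / 6 / 8". Judging from the corresponding item under Recommended Actions, the values in the rows labeled "Fixed 4" and "Fixed 6" appear to come from 5 s and 10 s windows, so those labels do not match the PTW actually used.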

2. intensity_correlation_metrics.csv N anomaly

Sum across 4 intensity bins: 7,356 + 8,309 + 8,698 + 3,903 = 28,266 ≠ 25,058. This suggests per-trace predictions were enumerated across multiple PTW outputs (double-counting), or a different training split was used. This file feeds Table 1 of the vFinal draft — if the N is inflated, the per-bin R² values may be biased too.
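
The arithmetic above, and a first test of the double-counting hypothesis, can be sketched as follows. The bin counts and dataset N are taken from this audit. The helper `duplicated_trace_ids` is hypothetical and assumes the per-trace predictions carry some trace identifier, which has not been verified against the actual CSV schema.

```python
from collections import Counter

DATASET_N = 25_058                           # Java-Sunda HQ traces
bin_counts = [7_356, 8_309, 8_698, 3_903]    # the four intensity bins

total = sum(bin_counts)
excess = total - DATASET_N
print(f"sum of bins = {total:,}, excess over dataset N = {excess:,} "
      f"({excess / DATASET_N:.1%} of N)")


def duplicated_trace_ids(trace_ids):
    """Return {trace_id: count} for every identifier that appears more than once."""
    counts = Counter(trace_ids)
    return {trace_id: n for trace_id, n in counts.items() if n > 1}


# Toy illustration only: trace "b" is enumerated under two PTW outputs.
toy_ids = ["a", "b", "b", "c"]
print(duplicated_trace_ids(toy_ids))
```

The bins exceed the dataset by 3,208 rows. If the duplicate count found in the real file matches that excess, double-counting is the likely cause; if no identifiers repeat, a different training split becomes the more plausible explanation.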

3. Physics ceiling N = 21,704 vs dataset N = 25,058

scwfparam_equivalence_golden.csv reports N=21,704, but the dataset description claims 25,058. The "golden" subset is a stricter filter (defined in metadata_golden.csv). That is legitimate, but it must be clearly distinguished from the 25,058-trace training dataset in any table that juxtaposes ML R² (N=25,058) against physics R² (N=21,704).

Authoritative Source Mapping (Proposed)

| Evidence | Use this CSV | Rationale |
| --- | --- | --- |
| A1. Fixed PTW benchmark | benchmark_results_fixed.csv | Heavy RUN-A on full N=25,058 |
| A2. IDA-PTW 3 anchor periods | benchmark_results_ida.csv | RUN-A stratified on N=2,747 |
| A3. IDA-PTW 103-period | spectral_r2_performance.csv | RUN-A stratified, 103 periods |
| B. Information ceiling | 🚨 No CSV exists | Must re-run or flag as legacy |
| C. Saturation test | saturation_test_results.csv | Direct match to evidence report |
| D. P-arrival sensitivity | p_arrival_sensitivity.csv | Direct match |
| E. Newmark-Beta ceiling | scwfparam_equivalence_golden.csv | Physics validation on N=21,704 |
| F. 103-period fast marathon | xgboost_103_all_baselines.csv | Secondary/exploratory only |
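
The proposed mapping can also be kept in machine-readable form so that a missing artifact is caught automatically. The following is a minimal sketch: the dictionary mirrors the table, while `AUTHORITATIVE` and `unresolved_evidence` are illustrative names rather than existing project code.

```python
from pathlib import Path

# Evidence key -> authoritative CSV file name (None where no CSV exists yet).
AUTHORITATIVE = {
    "A1": "benchmark_results_fixed.csv",
    "A2": "benchmark_results_ida.csv",
    "A3": "spectral_r2_performance.csv",
    "B": None,  # information ceiling: must be re-run or flagged as legacy
    "C": "saturation_test_results.csv",
    "D": "p_arrival_sensitivity.csv",
    "E": "scwfparam_equivalence_golden.csv",
    "F": "xgboost_103_all_baselines.csv",
}


def unresolved_evidence(reports_dir):
    """Return evidence keys that have no CSV assigned or whose CSV file is missing."""
    reports_dir = Path(reports_dir)
    unresolved = []
    for key, csv_name in AUTHORITATIVE.items():
        if csv_name is None or not (reports_dir / csv_name).is_file():
            unresolved.append(key)
    return unresolved
```

Calling `unresolved_evidence("/mnt/DL_Spectra/reports")` would be expected to return at least `["B"]` until the information-ceiling run is regenerated.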

Recommended Actions

  1. Rename or retire comparison_golden_metrics.csv. The "Golden" label implies it is authoritative; in fact it comes from the fast marathon (RUN-B). Rename it to comparison_marathon_metrics_preliminary.csv or delete it.
  2. Audit the Full-Wave and Total MiniSEED columns of comparison_r2_table.csv. These appear to be placeholders, not machine-computed values. Either regenerate them from a real 341 s run or remove the columns.
  3. Recover or document the "c7a50193" (Full-Wave 341 s) and "c3399cac" (Total MiniSEED) runs. Without CSV artifacts, the Evidence B numbers are not reproducible.
  4. Fix the Manuscript Table 11 labels. Either rename rows "Fixed 4/6" → "Fixed 5/10", or regenerate values for true 4 s and 6 s PTW.
  5. Verify N=28,266 in intensity_correlation_metrics.csv and re-run the per-trace grouping logic if double-counting is confirmed.
  6. Decide which IDA-PTW paradigm the paper defends: Stage-2 oracle (R²≈0.88) or end-to-end with routing uncertainty (R²≈0.62–0.73). Use ONE consistently throughout the abstract, tables, and conclusion.
  7. Add a PROVENANCE.md at the top of reports/ listing every CSV with its producing script, dataset N, CV config, model hyperparameters, and run date.
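
To illustrate the PROVENANCE.md recommendation, here is a minimal sketch of how such a file could be rendered. The field names follow the list in that recommendation. The `ProvenanceEntry` class, the `render_provenance` function, and the markdown table layout are assumptions, not an existing project convention. The single example entry uses the RUN-B details reported in this audit, with the run date left as "unknown" because it is not recorded here.

```python
from dataclasses import astuple, dataclass


@dataclass(frozen=True)
class ProvenanceEntry:
    csv_name: str
    producing_script: str
    dataset_n: str
    cv_config: str
    hyperparameters: str
    run_date: str


HEADER = ("CSV", "Producing script", "Dataset N", "CV config",
          "Model hyperparameters", "Run date")


def render_provenance(entries):
    """Render the entries as a markdown table, one row per CSV, sorted by file name."""
    lines = ["# PROVENANCE", ""]
    lines.append("| " + " | ".join(HEADER) + " |")
    lines.append("| " + " | ".join("---" for _ in HEADER) + " |")
    for entry in sorted(entries, key=lambda e: e.csv_name):
        lines.append("| " + " | ".join(astuple(entry)) + " |")
    return "\n".join(lines) + "\n"


entries = [
    ProvenanceEntry(
        csv_name="xgboost_103_all_baselines.csv",
        producing_script="train_xgboost_103_marathon_all.py",
        dataset_n="25,058",
        cv_config="5-fold",
        hyperparameters="XGBoost n_est=150, max_depth=8",
        run_date="unknown",  # not recorded in this audit
    ),
]

provenance_md = render_provenance(entries)
print(provenance_md)
```

Having each training script emit its own entry when it writes a CSV would keep the file in sync with the artifacts, rather than relying on manual updates.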

Report generated 2026-04-22 · Based on file state of /mnt/DL_Spectra/reports/ as of 2026-04-19.