Concluded

PIDS — Physics-Informed Deep Stereo

A five-month investigation of 45+ experiments into polarized stereo for transparent-object depth. It closed by showing that the path is an optical dead end, which is itself the result.

  • polarized stereo
  • transparent surfaces
  • stereo matching
  • surface reconstruction
  • negative result

Physics-Informed Deep Stereo (PIDS) · Late 2025 – 2026-02-26 (~5 months) · 45+ training experiments, 60+ chapters of development logs.

1. Project goal

Build a stereo camera system that uses active asymmetric polarization (left camera I∥, right camera I⊥) to recover the depth of transparent obstacles such as glass doors and acrylic panels.

2. Reasons for closure

2.1 Physical ceiling: stereo matching intrinsically fails on transparent surfaces

On glass, stereo matching converges to the background depth, not the glass surface depth.

The root cause is not “lack of texture” but left–right inconsistency (violation of photometric consistency):

  • Specular reflections are view-dependent.
  • Compound light paths (refraction + internal reflection + refraction) are extremely sensitive to viewing angle.
  • The left and right cameras observe completely different patterns.
  • Polarization can only change the ratio of reflection — it cannot create matchable feature points.

This is an optical ceiling that cannot be broken by algorithms or larger models.
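The view dependence of a specular highlight can be made concrete with a minimal geometric sketch. It assumes an ideal, mirror-like planar pane, a single point light, and two ideal pinhole cameras; every name and number (`GLASS_Z`, `BASELINE`, `FOCAL_PX`, the light position) is illustrative and not taken from the PIDS rig. Under these assumptions the two cameras see the highlight at different points on the pane, and triangulating it returns the depth of the light's mirror image behind the glass rather than the depth of the glass itself. The sketch covers only the reflected component, not the transmitted background.

```python
# Illustrative geometry only: ideal mirror plane, point light, pinhole cameras.
GLASS_Z = 2.0      # assumed glass depth in metres (plane z = GLASS_Z)
BASELINE = 0.10    # assumed stereo baseline in metres
FOCAL_PX = 800.0   # assumed focal length in pixels


def highlight_on_glass(cam, light):
    """Point on the plane z = GLASS_Z where `cam` sees the mirror reflection of `light`."""
    # Mirror the light across the glass plane to get its virtual image.
    virtual = (light[0], light[1], 2.0 * GLASS_Z - light[2])
    # The highlight lies where the camera-to-virtual-image ray crosses the plane.
    t = (GLASS_Z - cam[2]) / (virtual[2] - cam[2])
    return tuple(c + t * (v - c) for c, v in zip(cam, virtual))


def project_x(cam, point):
    """Horizontal pixel coordinate of `point` in a pinhole camera looking along +z."""
    return FOCAL_PX * (point[0] - cam[0]) / (point[2] - cam[2])


def triangulated_highlight_depth(light):
    """Return both highlight positions and the depth stereo would assign to them."""
    left = (-BASELINE / 2.0, 0.0, 0.0)
    right = (BASELINE / 2.0, 0.0, 0.0)
    p_left = highlight_on_glass(left, light)
    p_right = highlight_on_glass(right, light)
    disparity = project_x(left, p_left) - project_x(right, p_right)
    return p_left, p_right, FOCAL_PX * BASELINE / disparity


if __name__ == "__main__":
    light = (0.3, 0.0, 0.5)  # hypothetical light position, in front of the glass
    p_left, p_right, depth = triangulated_highlight_depth(light)
    print("highlight seen by left camera: ", p_left)
    print("highlight seen by right camera:", p_right)
    print("triangulated depth:", depth, "vs glass depth:", GLASS_Z)
```

The two highlight positions differ because each camera sees the reflection along its own ray to the virtual image, so the "same" feature is never the same surface point in both views.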

2.2 The polarization signal lacks a matchable peak in correlation space

| Metric | Value | Problem |
|---|---|---|
| Peak-to-Mean Ratio | 0.963 | < 1.0: the GT location scores even lower than the mean |
| Peak Sharpness | negative | fully inverted |
| I∥/I⊥ Ratio | 1.168 | a difference exists, but it is too small |

→ The polarization signal has no sharp, matchable peak in correlation space.
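The log does not define these metrics, so the following is only a sketch under assumed definitions: the peak-to-mean ratio is taken as the correlation at the ground-truth disparity divided by the mean correlation over all candidate disparities, and peak sharpness as the negated discrete second difference at the ground-truth disparity (positive for a local peak, negative for a local dip). The function names and the toy correlation curves are hypothetical.

```python
def peak_to_mean_ratio(corr, gt_index):
    """Correlation at the ground-truth disparity divided by the mean over all disparities.

    Assumed definition: a value below 1.0 means the true location scores
    worse than an average candidate.
    """
    mean = sum(corr) / len(corr)
    return corr[gt_index] / mean


def peak_sharpness(corr, gt_index):
    """Negated second difference at the ground-truth disparity (assumed definition).

    Positive for a local peak, negative when the curve dips at gt_index.
    """
    if not 0 < gt_index < len(corr) - 1:
        raise ValueError("gt_index needs a neighbour on each side")
    return 2.0 * corr[gt_index] - corr[gt_index - 1] - corr[gt_index + 1]


if __name__ == "__main__":
    # Toy correlation curves over 7 candidate disparities (made-up numbers).
    matchable = [0.20, 0.22, 0.30, 0.90, 0.31, 0.21, 0.20]  # sharp peak at index 3
    inverted = [0.52, 0.51, 0.50, 0.48, 0.50, 0.51, 0.52]   # dip at index 3
    for name, curve in (("matchable", matchable), ("inverted", inverted)):
        print(name, peak_to_mean_ratio(curve, 3), peak_sharpness(curve, 3))
```

Under these definitions, a ratio below 1.0 combined with negative sharpness means an argmax over disparities is actively pushed away from the true location, not merely left uncertain.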

2.3 Polarization forces us into “surface reconstruction” — which is equally unsolvable

Moving to surface reconstruction was not a voluntary decision to "abandon" stereo; it was a forced outcome. Active asymmetric polarization deliberately creates a left–right difference, which breaks photometric consistency. Stereo matching, which finds the same point by aligning the left and right views, therefore becomes fundamentally impossible, and surface reconstruction, rather than "measuring depth via correspondence", is the only remaining option.

But once forced into surface reconstruction, PIDS faces two problems that cannot be circumvented:

(a) Insufficient reflective-surface coverage. Polarization / active probing relies on specular reflection, and specular reflection on glass only appears within a specific incidence-angle region → it cannot cover the entire glass surface. Even reframed as a “probing” approach, it only yields scattered reflective patches and cannot measure depth over the full glass surface.
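The coverage problem in (a) can be illustrated with a rough estimate. The sketch below assumes a flat pane facing the camera, a point emitter mounted next to the camera, and a narrow specular lobe; the pane size, depth, emitter offset, and lobe half-angle are all made-up values rather than measurements from the PIDS hardware. It counts the fraction of pane grid points whose mirror reflection points back at the camera to within the lobe.

```python
import math

GLASS_Z = 1.0          # assumed glass depth in metres (plane z = GLASS_Z)
HALF_EXTENT = 0.5      # assumed pane half-size in metres (1 m x 1 m pane)
LOBE_HALF_ANGLE = 2.0  # assumed specular lobe half-width in degrees


def _normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def returns_specular(point, camera, light):
    """True if the mirror reflection of the light at `point` heads toward the camera."""
    d_in = _normalize(tuple(p - l for p, l in zip(point, light)))
    # The plane normal is along z, so mirror reflection just flips the z component.
    reflected = (d_in[0], d_in[1], -d_in[2])
    to_cam = _normalize(tuple(c - p for c, p in zip(camera, point)))
    cos_angle = sum(r * v for r, v in zip(reflected, to_cam))
    return cos_angle >= math.cos(math.radians(LOBE_HALF_ANGLE))


def specular_coverage(camera, light, steps=101):
    """Fraction of grid points on the pane that return a specular signal."""
    hits = 0
    for i in range(steps):
        for j in range(steps):
            x = -HALF_EXTENT + 2.0 * HALF_EXTENT * i / (steps - 1)
            y = -HALF_EXTENT + 2.0 * HALF_EXTENT * j / (steps - 1)
            hits += returns_specular((x, y, GLASS_Z), camera, light)
    return hits / (steps * steps)


if __name__ == "__main__":
    camera = (0.0, 0.0, 0.0)
    light = (0.05, 0.0, 0.0)  # hypothetical emitter mounted next to the camera
    print("fraction of pane with a specular return:", specular_coverage(camera, light))
```

With these assumed values only a small patch around the mirror point responds; widening the lobe or adding emitters enlarges the patch but does not turn it into full-surface coverage.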

(b) The same theoretical risk as ClearGrasp. Surface reconstruction must rely on surface normals, and our method still does not solve the problem of inferring normals via a neural network — bringing us right back to the same unsolved problem faced by ClearGrasp and similar methods. Polarization provides no shortcut around it.

Closure judgment

After 45+ experiments spanning three generations of architecture — RAFT-Stereo (PIDS 1.x), Two-Pass RGB (PIDS 2.0), and S2M2 (PIDS 3.0) — we confirm that “polarization + stereo matching” is the wrong question: the failure lies in optical principles, not engineering implementation. Cutting losses and closing the project is the correct call; proving that this path is a dead end is itself a valuable research conclusion.

3. Legacy assets

Hardware — dual-camera system (2× Raspberry Pi Global Shutter Camera, IMX296), polarizer set (0° / 0° / 90° linear polarizing film), calibration jigs and fixtures, camera synchronization circuit (XVS Master/Slave).

Software — calibration programs (including dark / flat-field radiometric calibration), a polarization image-processing library, Mitsuba 3 polarized rendering scripts (v1 → v7, including the physical-polarizer geometry architecture), Blender scene-randomization tools, and a benchmark / data-quality-check system (5-point quality validation).

Knowledge & documentation

  • An Architecture Design Compendium extracted from the 17,399-line development log, documenting every architecture design across three categories (renderer, depth model, training strategy) in chronological order:
    • Renderer architectures v3.1.0 → v7.2 (physical-polarizer geometry, parallel optical axis, textured RGB Stokes rendering)
    • The full depth-model main line: Baseline RAFT-Stereo → Dual-Stream → Polarization Volume V1–V2-E → PIDS 2.0 (Two-Pass RGB) → PIDS V3 (Dual Volume + FiLM) → V4 (True Dual-Stream) → V5 (Cost Concat) → V6 (Glass-Aware) → PIDS 3.0 / S2M2 (Transformer, 6-channel, ComplexCNNEncoder)
    • Training strategies (curriculum sampling, staged freezing/unfreezing, Directional Impulse Descent, sensor-realism augmentation)
    • Version-evolution tables, naming-conflict notes, and the three architectural “iron rules” for polarization injection
  • Practical polarization-optics know-how
  • The destructive effect of neural-network normalization (BatchNorm / LayerNorm / L2 / per-image p99) on physical signals
  • A cross-GPU-architecture (Ada / Hopper / Blackwell) TF32 consistency checklist
  • A grounded understanding of the theoretical limits of transparent-object depth measurement
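The normalization item above can be illustrated with a toy case. The sketch assumes the simplest possible situation, in which the two polarization channels differ only by a global intensity ratio, and applies per-image L2 normalization to each channel independently. The pixel values and helper names are hypothetical, and 1.168 is borrowed from the ratio reported in section 2.2 purely as an example number.

```python
def l2_normalize(image):
    """Scale a flat list of pixel intensities to unit L2 norm (per-image normalization)."""
    norm = sum(p * p for p in image) ** 0.5
    return [p / norm for p in image]


def mean_ratio(a, b):
    """Ratio of mean intensities between two images."""
    return (sum(a) / len(a)) / (sum(b) / len(b))


if __name__ == "__main__":
    # Toy data: the two channels differ only by a global intensity ratio.
    i_perp = [0.10, 0.40, 0.25, 0.80]
    i_par = [1.168 * p for p in i_perp]
    print("ratio before normalization:", mean_ratio(i_par, i_perp))
    print("ratio after normalization: ",
          mean_ratio(l2_normalize(i_par), l2_normalize(i_perp)))
```

After normalization the two channels are numerically indistinguishable, so any layer that rescales each input independently discards exactly the quantity the polarization setup was built to measure.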

Closing note

As a research project, PIDS successfully proved that one path does not work — and that itself is a valuable scientific contribution. We now know: polarized stereo vision cannot measure the surface depth of transparent objects, because stereo matching presupposes left–right photometric consistency, while transparent objects are inherently view-dependent and inconsistent. Once forced into surface reconstruction, it then fails due to insufficient reflective coverage and the unsolved normal-inference problem — bearing the same theoretical risk as ClearGrasp.

Many of these assets fed directly into later work — the Architecture Design Compendium became the Blueprints collection, and the TF32 consistency finding became its own writeup.
