OCEAN ENGINEERING, vol. 348, no. 124004, pp. 1-26, 2026 (SCI-Expanded, Scopus)
Autonomous Underwater Vehicles (AUVs) and Unmanned Surface Vehicles (USVs) are key enablers for underwater sensing, yet their performance degrades under turbulent currents and wave-induced disturbances. We present a unified simulation framework for cooperative USV-AUV missions that operationalizes the Fisher Information Matrix (FIM) within multi-agent reinforcement learning: FIM-guided USV positioning feeds into the agents’ observations and, when applicable, their reward signals to reduce localization uncertainty and improve coordination. We evaluate three policies under a single, reproducible setup: Proximal Policy Optimization (PPO), Curriculum Reinforcement Learning (CRL), and Twin Delayed Deep Deterministic Policy Gradient with FIM features (TD3-FIM). The evaluation covers two regimes (normal and extreme sea states), with training and testing performed in both, and uses task-level metrics (data rate, total throughput), a resource metric (energy), a system-health metric (overflow events), and tracking error. A composite Reliability Index (RI) summarizes multi-objective performance. Results show that PPO consistently achieves higher reliability and more stable data collection than CRL in both regimes, while FIM-guided USV adaptation markedly lowers tracking error relative to static baselines. The three-arm comparison establishes a practical benchmark for USV-AUV cooperation with physically motivated sea dynamics. Limitations include the simulation-only evaluation and simplified acoustic assumptions; future work will consider multi-USV coordination, latency and packet-loss models, and hardware-in-the-loop trials.
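To make the idea of FIM-guided USV positioning concrete, the following is a minimal sketch and not the paper's implementation. It assumes a 2-D, range-only acoustic measurement model with independent Gaussian noise of standard deviation `sigma`, and it scores candidate USV positions by the determinant of the accumulated FIM (a D-optimality criterion). The abstract does not specify the measurement model or the selection criterion, so both are assumptions, and the names `range_fim`, `fim_det`, and `best_next_position` are hypothetical.

```python
def range_fim(target, sensors, sigma=1.0):
    """FIM of a 2-D target position under an ASSUMED range-only model.

    Each measurement contributes u u^T / sigma^2, where u is the unit
    vector from the sensor position to the target.
    Returns the symmetric 2x2 matrix as (f_xx, f_xy, f_yy).
    """
    fxx = fxy = fyy = 0.0
    for sx, sy in sensors:
        dx, dy = target[0] - sx, target[1] - sy
        d2 = dx * dx + dy * dy
        if d2 == 0.0:
            raise ValueError("sensor coincides with target; bearing undefined")
        fxx += dx * dx / d2
        fxy += dx * dy / d2
        fyy += dy * dy / d2
    s2 = sigma ** 2
    return (fxx / s2, fxy / s2, fyy / s2)


def fim_det(fim):
    """Determinant of a symmetric 2x2 matrix given as (f_xx, f_xy, f_yy)."""
    fxx, fxy, fyy = fim
    return fxx * fyy - fxy * fxy


def best_next_position(target, visited, candidates, sigma=1.0):
    """Index of the candidate USV position that maximizes det(FIM)
    once added to the positions already visited (hypothetical helper)."""
    scores = [
        fim_det(range_fim(target, list(visited) + [c], sigma))
        for c in candidates
    ]
    return max(range(len(candidates)), key=scores.__getitem__)


# AUV estimate at the origin; the USV has already ranged from (10, 0).
choice = best_next_position(
    target=(0.0, 0.0),
    visited=[(10.0, 0.0)],
    candidates=[(20.0, 0.0), (0.0, 10.0), (10.0, 10.0)],
)
```

Under these assumptions, the candidate at (0, 10) is selected because it measures along a direction orthogonal to the earlier measurement, whereas a second measurement along the same bearing leaves the FIM singular. In a learning setup such as the one described, a score of this kind could plausibly be exposed as an observation feature or a reward term, but how the paper does so is not stated in the abstract.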