Tuesday, April 28th, 2026
Large language models and generative AI have arrived in marine science, as they have everywhere else. The tools are impressive. The claims made on their behalf are sometimes more impressive still. For researchers working on ocean ecosystems, who are already managing sprawling datasets, incomplete observations, and the pressure to produce policy-relevant results, the question is practical: what can these tools actually do for us today, and where should we be cautious?
This piece attempts an honest accounting.
Literature synthesis and knowledge extraction. The volume of published marine science is staggering. The Web of Science indexes over 100,000 articles annually under ocean and marine subject categories. No individual researcher can keep pace. Large language models are well suited to summarising bodies of literature, identifying thematic clusters, and extracting structured information from unstructured text. Recent work has shown that LLM-assisted systematic reviews can reduce screening time by 50 to 70 percent compared with manual approaches, without significant loss in recall. For fields like marine ecology, where evidence synthesis informs conservation policy and stock assessments, this is a meaningful gain.
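As a sketch of what LLM-assisted screening looks like in practice, the pipeline below separates the screening logic from the classifier itself, so that every inclusion decision can be logged and audited. The `stub_classify` function is a placeholder invented for illustration; a real pipeline would replace it with a call to an LLM, prompted with the review's inclusion criteria.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Abstract:
    title: str
    text: str

def screen(abstracts: list[Abstract],
           classify: Callable[[str], bool]) -> list[Abstract]:
    """Return only the abstracts the classifier flags as relevant.

    `classify` stands in for an LLM call (e.g. a prompt asking
    "does this abstract report field data on seagrass recovery?").
    Keeping it pluggable makes the screening step easy to audit
    and to re-run with a different model or prompt.
    """
    return [a for a in abstracts if classify(a.text)]

# Stub classifier, for illustration only.
def stub_classify(text: str) -> bool:
    return "seagrass" in text.lower()

papers = [
    Abstract("A", "Seagrass meadow recovery after a marine heatwave."),
    Abstract("B", "Sediment plumes from deep-sea mining operations."),
]
relevant = screen(papers, stub_classify)
print([a.title for a in relevant])  # ['A']
```

The point of the design is that the human-defined criteria live in one place (the classifier) and the bookkeeping lives in another, which is what makes the 50 to 70 percent time saving auditable rather than a black box.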
Tools such as Elicit, Semantic Scholar, and Consensus already allow researchers to query the literature in natural language and receive synthesised answers with citations. They are not perfect. They miss nuance, occasionally conflate findings, and can surface low-quality studies alongside rigorous ones. However, as a first pass through a large body of evidence, they are faster and more consistent than manual keyword searching.
Data gap-filling and interpolation. Ocean observation is inherently patchy. Cloud cover obscures satellite retrievals. Argo floats drift unevenly. Coastal monitoring stations have gaps in their records. Machine learning methods for filling spatial and temporal gaps in oceanographic data are well established. Neural networks have been used for sea surface temperature reconstruction since the early 2010s. Generative models, particularly variational autoencoders and diffusion models, extend this capability by learning the underlying statistical structure of oceanographic fields and generating plausible reconstructions that respect physical constraints. This is not merely cosmetic smoothing. When properly validated, these reconstructions can improve inputs to ecosystem models and climate reanalyses.
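A minimal illustration of the gap-filling idea, using plain linear interpolation in place of a learned generative model (the values below are invented for the example). VAEs and diffusion models replace the interpolation step with something far more expressive, but the validation logic is the same: hold out known values and measure reconstruction error against them.

```python
import numpy as np

# Illustrative daily SST record with missing days marked as NaN.
sst = np.array([14.2, 14.4, np.nan, np.nan, 15.1, 15.0, np.nan, 14.8])

def fill_gaps(series: np.ndarray) -> np.ndarray:
    """Fill NaN entries by linear interpolation between neighbours."""
    idx = np.arange(series.size)
    known = ~np.isnan(series)
    return np.interp(idx, idx[known], series[known])

filled = fill_gaps(sst)
print(np.round(filled, 3))
```

The held-out-value check is the part that carries over to generative methods: mask observations you actually have, reconstruct them, and report the error before trusting the filled field in a downstream model.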
Scenario generation and synthetic data. Marine ecosystem models often need to explore a wide range of future conditions, including climate pathways, management regimes, and land-use changes. Generative AI can produce synthetic datasets that capture the statistical properties of observed data while allowing researchers to explore conditions not yet encountered. This is particularly valuable for stress-testing models against extreme events, where historical observations are sparse by definition. Diffusion models have been applied to generate physically consistent weather and ocean state scenarios at a fraction of the computational cost of running full numerical simulations.
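The core idea can be sketched with the simplest possible generative model: estimate the mean and covariance of observed data, then sample synthetic points with the same statistics. The paired temperature and log-chlorophyll values below are invented for illustration; VAEs and diffusion models generalise this recipe to high-dimensional, non-Gaussian ocean fields.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy "observed" data: paired temperature (deg C) and log-chlorophyll
# values with a negative correlation. A real application would fit to
# Argo profiles or satellite retrievals.
obs = rng.multivariate_normal(mean=[15.0, 0.5],
                              cov=[[1.0, -0.3], [-0.3, 0.2]],
                              size=500)

# Fit the generative model: sample mean and covariance.
mu = obs.mean(axis=0)
sigma = np.cov(obs, rowvar=False)

# Draw synthetic data that shares the fitted statistics.
synthetic = rng.multivariate_normal(mu, sigma, size=500)

print(np.round(mu, 2), np.round(synthetic.mean(axis=0), 2))
```

Stress-testing then amounts to perturbing the fitted parameters, for example shifting `mu` towards a warming scenario, and sampling conditions the historical record never contained.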
Code generation and analysis workflows. A less glamorous but practically significant use case is writing and debugging code. Marine scientists spend considerable time processing data in R, Python, and MATLAB. LLMs are competent at generating boilerplate code, translating between languages, explaining unfamiliar scripts, and suggesting corrections. A 2024 survey of researchers across disciplines found that 42 percent reported using AI coding assistants regularly, with data processing and visualisation as the most common tasks. The time saved is real, even if the code produced requires careful review.
Causal reasoning about marine systems. LLMs are pattern-matching engines trained on text. They do not understand causal mechanisms in the way a biogeochemical model does. Ask a language model why a particular algal bloom occurred, and it will produce a plausible-sounding narrative, drawing on ideas such as nutrient loading, stratification, and temperature anomalies from patterns in its training data. However, it cannot distinguish correlation from causation, weigh competing hypotheses against site-specific evidence, or flag when its answer is based on analogy rather than analysis. This matters enormously in marine science, where management decisions hinge on understanding why something is happening, not just what it looks like.
Numerical accuracy and quantitative prediction. Generative AI models are fundamentally probabilistic text generators. They are not calculators. Studies have documented persistent issues with numerical reasoning in LLMs, including errors in unit conversion, statistical interpretation, and basic arithmetic. In a domain where getting a nutrient concentration wrong by an order of magnitude can invalidate a model run, this is not a minor limitation. Any quantitative output from an LLM must be independently verified, and the efficiency gain evaporates when verification takes as long as doing the calculation by hand.
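Verification need not be laborious, though. A few lines of code can check the kind of conversion an LLM might be asked to perform; the nitrate example below is hypothetical, and the molar mass is the only external fact it relies on (NO3- is approximately 62.00 g/mol).

```python
# Independent check of a unit conversion an LLM might produce:
# nitrate concentration from micromol/L to mg/L.
NITRATE_MOLAR_MASS = 62.00  # g/mol, approximate

def umol_per_l_to_mg_per_l(umol_l: float, molar_mass: float) -> float:
    """micromol/L x g/mol gives microg/L; divide by 1000 for mg/L."""
    return umol_l * molar_mass / 1000.0

value = umol_per_l_to_mg_per_l(10.0, NITRATE_MOLAR_MASS)
print(value)  # 0.62
```

A one-line deterministic check like this is exactly the kind of verification that keeps the order-of-magnitude errors the paragraph warns about out of a model run.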
Hallucination and fabricated references. The tendency of LLMs to generate plausible but false information, often referred to as hallucination, is well documented and not yet solved. In scientific contexts, this manifests as fabricated citations, invented datasets, and confident assertions that are simply wrong. Analyses have found that a substantial proportion of references generated by advanced language models in response to scientific queries are either non-existent or misattributed. For researchers under time pressure, the temptation to trust AI-generated references without checking each one is obvious, and risky.
Replacing domain expertise. There is a persistent idea that sufficiently powerful AI will make specialist knowledge unnecessary. In marine science, this is nowhere close to reality. Understanding tidal dynamics in a specific estuary, interpreting the behaviour of a particular fish stock, or recognising an artefact in a sonar record all require experience that cannot be replicated by pattern-matching over text corpora. Generative AI is a useful assistant to a knowledgeable researcher. It is not a substitute for one.
The productive approach is neither uncritical adoption nor reflexive rejection. Generative AI tools are most valuable when used by researchers who understand both the tools’ capabilities and the domain well enough to spot errors. This means investing in AI literacy within marine science teams, not to turn every ecologist into a machine learning engineer, but to ensure that people using these tools understand what they are doing and, critically, what they are not doing.
It also means insisting on transparency. When AI tools are used in research workflows, for example in literature screening, data processing, code generation, or scenario design, that use should be documented and reported. Several journals, including Nature and Science, now require disclosure of AI tool use in submitted manuscripts. This is a sensible baseline, but the marine science community would benefit from more specific guidance on validation standards for AI-assisted outputs.
Projects working at the intersection of data integration and ecosystem modelling, including initiatives like EcoTwin, will inevitably need to engage with these tools. The question is not whether to use generative AI, but how to use it critically, with appropriate validation, clear documentation, and an honest recognition of its limits.
The ocean is complicated enough without adding poorly understood tools to the mix. But used carefully, generative AI can help researchers work faster, synthesise more effectively, and explore a wider range of possibilities. That is worth pursuing, as long as we remain clear-eyed about what these tools actually are.