Agent-Based Modeling: 37 landmark papers, quick overview
Below are all 37 papers, each as a collapsible block with an outgoing link to the primary source (DOI / publisher / PDF / official page).
All 37 papers (collapsible)
Each block includes (a) the paper title + venue/year, (b) the report’s analytical summary text (where present in the upload), and (c) an outgoing link to the source.
1) Dynamic models of segregation (1971, Journal of Mathematical Sociology)
Schelling — micro preferences → macro segregation (“tipping”)
The research question is whether and how spatial segregation can emerge even when individuals have only “moderate” neighborhood preferences. The method is an early (but essentially classic) ABM: two agent types (residents belonging to two groups) occupy a discrete grid; the rule is a simple satisfaction function (e.g., if too few neighbors are from your own group, you move) and dynamics arise via repeated relocation. There is no empirical data source (the model is a theoretical “toy”), but validation is performed at the pattern level: the study examines which segregation patterns the system can generate and when “tipping” transitions appear. The main result is that macro-level segregation can be an endogenous outcome of the system, not necessarily a direct reflection of centralized coordination or strong discriminatory policy. The contribution to ABM practice is twofold: (i) it demonstrates the generative logic of micro-motives → macro-patterns and (ii) it provides one of the most reproducible baseline models in social simulation, later extended with networks, market mechanisms, and heterogeneous constraints. Limitations stem from simplification: the grid and moving rules are stylized; housing markets, socioeconomic stratification, and spatial push–pull functions are absent; and there is no empirical calibration. Natural next steps are to connect “satisfaction” and mobility to real data (prices, jobs, schools, transport), test structural identity (do different micro-rules yield the same macro-pattern), and evaluate policy instruments that act at the micro level (e.g., zoning, subsidies, information effects).
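The satisfaction-and-relocation rule described above can be sketched as a small, illustrative Python model. This is a toy reconstruction, not Schelling's original specification: the torus grid, 20x20 size, 10% vacancy, and random-relocation rule are all assumptions for illustration.

```python
import random

def schelling(size=20, vacancy=0.1, threshold=0.3, steps=50, seed=1):
    """Toy Schelling grid: an agent is satisfied when at least `threshold`
    of its occupied Moore neighbours share its type; unsatisfied agents
    relocate to a random empty cell. The grid wraps around (torus)."""
    rng = random.Random(seed)
    cells = [None] * (size * size)
    n_agents = int(size * size * (1 - vacancy))
    for i in range(n_agents):
        cells[i] = i % 2                       # two resident groups, 0 and 1
    rng.shuffle(cells)

    def neighbours(idx):
        r, c = divmod(idx, size)
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr or dc:
                    yield ((r + dr) % size) * size + (c + dc) % size

    def same_share(idx):
        me, same, occupied = cells[idx], 0, 0
        for n in neighbours(idx):
            if cells[n] is not None:
                occupied += 1
                same += cells[n] == me
        return 1.0 if occupied == 0 else same / occupied

    for _ in range(steps):
        movers = [i for i, v in enumerate(cells)
                  if v is not None and same_share(i) < threshold]
        if not movers:
            break                              # everyone is satisfied
        empties = [i for i, v in enumerate(cells) if v is None]
        rng.shuffle(movers)
        for m in movers:
            dest = empties.pop(rng.randrange(len(empties)))
            cells[dest], cells[m] = cells[m], None
            empties.append(m)

    occupied = [i for i, v in enumerate(cells) if v is not None]
    return sum(same_share(i) for i in occupied) / len(occupied)

segregation = schelling()   # mean same-type neighbour share after relocation
```

Even with the moderate threshold of 0.3, the final mean same-type share typically ends up well above the threshold, which is exactly the micro-motives-to-macro-pattern point.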
2) Flocks, Herds and Schools: A Distributed Behavioral Model (1987, ACM SIGGRAPH Computer Graphics)
Reynolds — Boids: separation + alignment + cohesion → flocking
This work asks how to model realistic collective movement (bird flocks, fish schools) without a global trajectory script—i.e., can local, individual rules produce visible “flocking”? The method is agent-based: each “boid” is an autonomous agent with state variables (position, direction/speed), and a perception neighborhood determines which neighbors are considered. The core rules are micro-policies in modern terms: (i) separation (avoid crowding), (ii) alignment (match heading with neighbors), and (iii) cohesion (move toward the neighborhood’s center of mass). There is no data-based calibration; validation is phenomenological—does the simulated motion qualitatively resemble flock dynamics? The key result is robust emergence: flocks form, turn, split, and rejoin without central control. The contribution to ABM theory is foundational: it is one of the clearest demonstrations that aggregate order can be a result of distributed control, and that “visual” validation can be useful when analytic equation models struggle. The limitations are also clear: rules are designed to evoke the phenomenon rather than derived from measurement; there is no explicit energy/goal function, fluid dynamics, or species-specific sensory constraints. Natural extensions include (a) fitting parameters to empirical trajectory data (distance/speed distributions), (b) adding environmental constraints (obstacles, predators, resources), (c) studying rule identifiability (many micro-rule sets can yield similar macro patterns), and (d) using modern calibration methods (Bayesian/ABC, surrogates) to separate “looks convincing” from “is scientifically grounded.”
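The three steering rules can be condensed into one velocity update per boid. A minimal 2-D sketch follows; the perception radius is assumed to have been applied upstream (only perceived neighbours are passed in), and the weights and separation radius are illustrative, not Reynolds's original values.

```python
import math

def boid_step(me, neighbours, sep_r=1.0, w_sep=1.5, w_ali=1.0, w_coh=1.0):
    """One velocity update for a boid `me` = ((x, y), (vx, vy)) given its
    perceived neighbours (same format). Combines Reynolds's three rules:
    separation, alignment, cohesion. Weights are illustrative."""
    (px, py), (vx, vy) = me
    if not neighbours:
        return (vx, vy)
    sx = sy = ax = ay = cx = cy = 0.0
    for (qx, qy), (qvx, qvy) in neighbours:
        dx, dy = px - qx, py - qy
        dist = math.hypot(dx, dy)
        if 0 < dist < sep_r:          # (i) separation: push away from crowders
            sx += dx / dist
            sy += dy / dist
        ax += qvx                     # (ii) alignment: match neighbour headings
        ay += qvy
        cx += qx                      # (iii) cohesion: steer to centre of mass
        cy += qy
    n = len(neighbours)
    ax, ay = ax / n - vx, ay / n - vy     # steer toward mean velocity
    cx, cy = cx / n - px, cy / n - py     # steer toward centre of mass
    return (vx + w_sep * sx + w_ali * ax + w_coh * cx,
            vy + w_sep * sy + w_ali * ay + w_coh * cy)

# a stationary boid with two neighbours ahead drifts toward and aligns with them
new_v = boid_step(((0, 0), (0, 0)), [((2, 0), (0, 1)), ((2, 1), (0, 1))])
```

Iterating this update over all boids each tick is the whole model; no boid ever receives a global trajectory.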
3) Agent-based modeling: Methods and techniques for simulating human systems (2002, PNAS)
Bonabeau — when ABM is justified; practical design cycle
Bonabeau focuses on the meta-question: when and why ABM is justified for studying “human systems” (organizations, supply chains, markets, traffic, epidemiology), and which practical design techniques matter. This is a methodological overview that formalizes a typical ABM “build cycle”: agent types and heterogeneity, local decision rules, interaction structure (space/network), timing and events, and how aggregate phenomena arise. The paper does not lock into a specific platform, but emphasizes ABM’s core strengths: agent-level nonlinearity, discreteness, heterogeneity, and structured local interaction. It also highlights ABM’s costs: computational burden and validation difficulty. A key practical contribution is a “decision-tree style” argument: ABM is especially appropriate when (i) there are many autonomous entities, (ii) interactions are local and structural, (iii) processes are non-normal or discontinuous, and (iv) the focus is emergence rather than equilibrium solutions. The limitations are scientific and technical: ABMs can “fit anything” without a strict validation plan; model descriptions can be too thin for replication; and parameter spaces are often large. Natural follow-ups include (a) standardized protocols (e.g., ODD), (b) validation taxonomies (especially in economics), (c) computational scaling (GPU/HPC), and (d) ML tools (surrogates, differentiable simulation) for calibration and sensitivity.
4) From Factors to Actors: Computational Sociology and Agent-Based Modeling (2002, Annual Review of Sociology)
Macy & Willer — mechanisms, feedback, networks; ABM as virtual lab
Macy & Willer ask why classic “variable-based” sociology (factors) fails for many phenomena, and what is gained by modeling social life as interacting agents (actors). The paper is a high-level synthesis that focuses on models explaining norms, conventions, information diffusion, collective behavior, and dynamic network effects. While not a single ABM, it makes a direct methodological contribution: (i) ABM as “virtual experimentation,” where structural conditions (network topology, mobility, stratification) can be manipulated; (ii) emphasis on feedback: “networks shape behavior; behavior reshapes networks”; (iii) ABM’s suitability where macro patterns are not a simple sum of micro attributes. The paper is cautious about validation: ABMs are theoretically powerful, but empirical linkage is often weak in social science, so one must be explicit whether the goal is mechanism demonstration or quantitative prediction. Limitations are inherent to the overview format: it maps research directions rather than offering a single “recipe” for deriving rules from data. Natural extensions include clearer model descriptions, validation frameworks and data types (microdata, digital traces), calibration methods that tolerate non-identifiability and stochasticity, and ABM+ML coupling for rule learning or simulation acceleration.
5) Tutorial on agent-based modelling and simulation (2010, Journal of Simulation)
Macal & North — ABM toolbox, terminology, V&V framing
Macal & North present an ABM “toolbox” with the goal of standardizing how ABM is understood and built: what an agent is (attributes + behavior + interaction), how to schedule events, how to collect outputs, and how ABM compares with other simulation paradigms. They argue ABM’s growth is driven by (i) maturing tools, (ii) emerging microdata, and (iii) increased compute. Methodologically, they emphasize heterogeneity, space and networks, and how interaction generates emergent system-level behavior. On quality, the tutorial stresses development discipline: distinguish conceptual model, implementation, and experiment; verify that the program does what the model specifies; validate that the model is reasonably consistent with reality. The practical contribution is a shared vocabulary and bridges to business (decision support) and research (computational laboratory). Limitations are typical: the tutorial cannot solve calibration/identifiability for you; ABM value still depends on justified rules and environment. Natural follow-ups include ODD-style protocols, data-driven calibration (UQ/ABC/Bayes), computational scaling (HPC/GPU), and ML integration for rule learning or surrogates.
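The tutorial's minimal anatomy of an ABM (agents with attributes and behaviour, a scheduler with randomized activation, output collection per tick) can be illustrated with the classic wealth-exchange toy. The toy model itself is an assumption chosen for brevity, not an example from the paper.

```python
import random

class WealthAgent:
    """An agent in the minimal sense: attributes plus a behaviour rule
    that interacts with other agents."""
    def __init__(self, uid):
        self.uid = uid
        self.wealth = 1
    def step(self, model):
        if self.wealth > 0:
            other = model.rng.choice(model.agents)
            if other is not self:            # interaction: pass one unit on
                self.wealth -= 1
                other.wealth += 1

class WealthModel:
    """Scheduler (uniform random activation each tick) + output collection."""
    def __init__(self, n=50, seed=0):
        self.rng = random.Random(seed)
        self.agents = [WealthAgent(i) for i in range(n)]
        self.max_wealth = []                 # per-tick output series
    def step(self):
        order = self.agents[:]
        self.rng.shuffle(order)              # randomized activation order
        for agent in order:
            agent.step(self)
        self.max_wealth.append(max(a.wealth for a in self.agents))

model = WealthModel()
for _ in range(100):
    model.step()
# total wealth is conserved, yet a skewed distribution typically emerges
```

Separating the agent class, the scheduler loop, and the output series mirrors the tutorial's distinction between conceptual model, implementation, and experiment.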
6) Pattern-Oriented Modeling of Agent-Based Complex Systems (2005, Science)
Grimm et al. — multi-pattern constraints to reduce “fit anything” ABMs
Grimm et al. ask how to make ABMs more scientifically convincing when there are many plausible behavior rules and models can be highly flexible. Their solution is Pattern-Oriented Modeling (POM): instead of fitting a model to a single target statistic, constrain structure and parameters using multiple independent patterns observed at different scales (e.g., spatial clustering, temporal cycles, distributional shapes). POM is not a platform, but a validation philosophy: design rules so the model reproduces known real-world patterns; a richer pattern set reduces free-parameter degrees of freedom and helps discriminate between competing behavioral hypotheses. The key contribution is POM as a practical framework linking ABM’s strength (mechanistic detail) to testability (patterns as tests). Limitations: pattern selection can be subjective; matching patterns doesn’t guarantee correct micro rules (equifinality); and collecting patterns can be data-intensive. Natural next steps include formalizing pattern selection (domain pattern catalogs), merging POM with Bayesian methods (patterns as ABC constraints), using surrogates/scaling to make multi-pattern fitting feasible, and linking model code to ODD-style descriptions for replication.
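The POM logic of using several independent patterns as joint filters can be sketched as follows. Both the toy model and the two patterns are invented for illustration; the point is only that requiring all patterns simultaneously prunes the parameter space harder than any single target would.

```python
import random

def toy_model(growth, noise, rng, steps=30):
    """Hypothetical two-parameter model emitting a time series."""
    x, series = 1.0, []
    for _ in range(steps):
        x = x * growth + rng.gauss(0, noise)
        series.append(x)
    return series

def matches_patterns(series):
    """POM-style test: a candidate passes only if it reproduces several
    independent patterns at once (both patterns here are invented)."""
    upward_trend = series[-1] > series[0]                  # pattern 1
    jumps = [abs(b - a) for a, b in zip(series, series[1:])]
    bounded_jumps = max(jumps) < 5.0                       # pattern 2
    return upward_trend and bounded_jumps

rng = random.Random(3)
accepted = []
for _ in range(500):
    growth = rng.uniform(0.9, 1.2)
    noise = rng.uniform(0.0, 2.0)
    if matches_patterns(toy_model(growth, noise, rng)):
        accepted.append((growth, noise))
# each extra pattern prunes the plausible (growth, noise) region further
```

This is also the bridge to ABC mentioned at the end of the summary: each pattern check is exactly the kind of tolerance test ABC formalizes.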
7) A standard protocol for describing individual-based and agent-based models (ODD) (2006, Ecological Modelling)
Grimm et al. — ABM description standard: Overview, Design concepts, Details
Grimm et al. address a direct problem: ABM/IBM papers often describe models inconsistently and incompletely, making understanding, replication, and critical evaluation difficult. Their solution is the ODD protocol (Overview, Design concepts, Details), providing a standard structure: (1) purpose, (2) state variables and scales, (3) process overview and scheduling, (4) design concepts, (5) initialization, (6) inputs, (7) submodels. ODD is “meta-infrastructure”: it doesn’t dictate platforms or rules, but forces authors to present enough detail that an independent reader can re-implement. The contribution is standardized communication—ODD becomes a de facto format in ecology and later spreads into social/economic simulation. Limitations are mainly in practice: ODD can become a box-ticking exercise if authors don’t link to code; complex models can yield long text; ODD is not itself a validation framework. Natural follow-up includes protocol updates and deeper integration with open-science workflows (code, tests, data) so “description” and “implementation” don’t drift apart.
8) The ODD protocol: A review and first update (2010, Ecological Modelling)
Grimm et al. — tighten definitions + reduce ambiguity in real usage
This paper asks whether ODD (2006) works in practice—did descriptions actually become clearer and more replicable, and which parts needed tightening? The method is meta-analytic: the authors review real uses of ODD and identify confusion points (e.g., process scheduling descriptions, input distinctions, design concept phrasing). The result is a “first update” that clarifies definitions and adds guidance to reduce ambiguity and increase completeness. The contribution is practical standard maturation: ABM criticism (“it’s code, not a model”) weakens when documentation becomes systematic. Limitations: the protocol cannot guarantee a correct model; it still requires honest, complete reporting; and very complex models may need additional structures (appendices, explicit code references). Natural follow-up is to keep aligning ODD with open-science practices (code + tests + environment capture) and to adapt it for hybrid/multilevel and code-heavy ABMs.
9) The ODD Protocol: A Second Update to Improve Clarity, Replication, and Structural Realism (2020, JASSS)
ODD 2020 — summary format + code-linking + structural realism emphasis
ODD 2020 responds to a practical tension: ABMs are getting more complex (more agents, sub-processes, data sources), while replication and “structural realism” demand better documentation. This second update adds concrete guidance (including an “ODD summary” format for journal articles, guidance on pushing details into supplements, and explicit practices for linking to code) and reorganizes structure so a reader can find the model “spine” quickly. The contribution is twofold: (i) ODD becomes more suitable for real policy- and data-driven ABMs; (ii) it supports a tighter connection between code, process diagrams, and text. Limitations remain: ODD doesn’t solve identifiability, stochasticity, or validation epistemology; it can’t rescue bad assumptions. Natural extensions include deeper integration with repositories, tests, and environment capture (containers), explicit linkage between ODD elements and calibration documentation (what was fit to what), and standards for ML-integrated/hybrid agents (how to describe an “agent” when a learned model sits inside it).
10) A Critical Guide to Empirical Validation of Agent-Based Models in Economics (2007, Computational Economics)
Fagiolo et al. — validation is a procedure, not a single metric
Fagiolo et al. tackle a core issue in economic ABMs: they often reproduce “stylized facts,” but empirical validation is fragmented and methodologically unclear. The research question is how to systematize validation problems, create a taxonomy, and propose procedures that make ABMs empirically assessable. The paper is a critical review plus taxonomy: it distinguishes ABM types, goals, and validation levels, and discusses procedural steps for comparing outputs, selecting targets, and handling multidimensional stochastic outputs and large parameter spaces. The contribution is a pragmatic checklist: validation is not “one number,” but a process aligned with the model’s purpose (mechanism vs prediction), data type, and stochasticity. Limitations: the paper cannot provide a universal likelihood or guarantee identifiability; equifinality is structural for many ABMs. Natural follow-up includes Bayesian/ABC approaches for complex likelihoods, surrogate models and intelligent parameter search, multi-level validation (microdata + macro moments), and open-science standards that enable independent replication.
11) Challenges, tasks, and opportunities in modeling agent-based complex systems (2021, Ecological Modelling)
An et al. — ABM pain points: realism, data granularity, scaling, calibration
An et al. map ABM’s broader status: it is widely used in social, ecological, and socio-ecological systems, but the community repeatedly hits “pain points”: clarity of purpose, realism of human decision models, data granularity, scaling, calibration, and reproducibility. The paper is a synthesis that compares ABM with equation-based models and provides guidance for newcomers and reviewers while identifying “tasks” the field still needs to complete. Empirically, the paper stresses that ABM’s strength (heterogeneity, feedback, space/network) becomes a weakness when data cannot support identification of agent rules and parameters—hence the need for better calibration strategies, micro-level data, and robust sensitivity analysis. The practical contribution is a framing: what to ask before you simulate (purpose, scales, mechanisms, data adequacy). Limitations: it prioritizes challenges rather than delivering a single technical “how.” Natural follow-up includes better behavioral grounding (experiments/behavioral models), hybrid models (ABM + ODE/SD + ML), scaling (HPC/GPU), and practical calibration tooling (surrogates, Bayes, UQ).
12) Agent-Based Modeling in Public Health: Current Applications and Future Directions (2018, Annual Review of Public Health)
Tracy et al. — ABM advantages + traps in public health policy modeling
Tracy et al. ask where ABM provides a special advantage in public health compared to compartment models (SEIR-like), and what the common traps are. The method is a structured overview covering infectious disease, chronic disease, health behavior, and social epidemiology. ABM’s “method bundle” in public health typically includes agents (individuals/households), contact networks (age, work/school, social ties), spatial mobility, and policy interventions (testing, tracing, closures, vaccination). On validation, the paper stresses ABMs often require multiple data types (demographics, contact matrices, incidence, mobility) and transparent calibration plans; otherwise, models get used too confidently in policy. The core takeaway is mapping rather than a single numeric result: ABMs are especially useful for heterogeneous risk, targeting, inequality, and network/spatial effects; the ongoing problems are compute cost, parameter overload, and output instability due to stochasticity. Natural follow-up: standardized calibration protocols, surrogate and differentiable ABMs for faster fit, and open code/data to allow independent checking.
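The "method bundle" of individuals, contact networks, and transmission/recovery rules can be sketched as a minimal network SIR ABM. This is a toy, not a model from the review: the random k-contact structure (a crude stand-in for household/school/work ties), the directed contacts, and all rates are assumptions.

```python
import random

def network_sir(n=200, k=6, beta=0.1, gamma=0.07, steps=100, seed=7):
    """Toy agent-based SIR: each agent keeps a fixed list of k contacts;
    each tick, infectious agents transmit to susceptible contacts with
    probability beta and recover with probability gamma."""
    rng = random.Random(seed)
    contacts = [rng.sample([j for j in range(n) if j != i], k)
                for i in range(n)]
    state = ["S"] * n
    for i in rng.sample(range(n), 5):        # seed five infections
        state[i] = "I"
    curve = []
    for _ in range(steps):
        nxt = state[:]
        for i in range(n):
            if state[i] == "I":
                for j in contacts[i]:        # transmit along contact ties
                    if state[j] == "S" and rng.random() < beta:
                        nxt[j] = "I"
                if rng.random() < gamma:     # recover
                    nxt[i] = "R"
        state = nxt
        curve.append(state.count("I"))
    return curve, state.count("R")

curve, recovered = network_sir()
```

Unlike a compartment model, interventions here can target specific agents or ties (e.g., removing school contacts), which is exactly the heterogeneity-and-targeting advantage the review emphasizes.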
13) NetLogo: A Simple Environment for Modeling Complexity (2004, ICCS proceedings)
Tisue & Wilensky — low-friction ABM prototyping + visualization
Tisue & Wilensky address a practical question: how can we make building agent models easy enough for both research and teaching without losing expressive power? NetLogo is a multi-agent language and environment with standard abstractions (“turtles,” “patches,” links) and a low barrier to experimentation. Agent rules are coded in a Logo-like syntax; time is typically discrete (“ticks”); and the design includes built-in visualization and experimentation support (parameter sweeps, output logging). Data sources/calibration are left to the user; NetLogo’s contribution is workflow speed (prototyping) and a widely used model library that spreads reproducible examples. The main theoretical contribution is indirect: it becomes a standard platform for reproducing ABM ideas (Schelling, emergence, spatial interaction) and exploring variations quickly. Limitations: performance and parallelism; data-heavy models may need integration with Python/R or HPC. Natural next steps include multilevel extensions (e.g., LevelSpace), better calibration/UQ tooling, and integration with ML for learning agent rules or building surrogates.
14) MASON: A Multiagent Simulation Environment (2005, Simulation)
Luke et al. — performance + modularity; headless batch runs
Luke et al. ask how to build a fast, extensible ABM framework suitable for social complexity as well as robotics/ML experiments. MASON is a Java discrete-event toolkit whose key architectural choice is separating the model from visualization: simulations can run “headless” (batch) and a GUI can attach or detach. Agents and environments are classes; scheduling is explicit; spatial structures and networks are included. Data/calibration are not built-in, but the framework supports large experimental batches—critical for stochastic replication and sensitivity analysis. The main contribution is infrastructural: MASON becomes a standard “mid-level” ABM platform where performance and modularity take priority over beginner friendliness. Limitations: Java can raise the entry cost compared with NetLogo; performance comes with engineering overhead. Natural follow-ups include distributed variants (e.g., D-MASON), integration with modern analytics/ML (surrogate calibration), and standardized documentation (ODD) to avoid “model glued to code” opacity.
15) Complex adaptive systems modeling with Repast Simphony (2013, Complex Adaptive Systems Modeling)
North et al. — modular ABM platform + tooling for experiments
North et al. describe Repast Simphony as an open-source ABM environment intended to provide both library components (networks, schedulers, logging) and GUI tooling (model building, results exploration). The practical design question is how to keep architecture modular so components can be swapped (e.g., alternative schedulers or network structures) while supporting multiple abstraction levels. Agent behavior is implemented in code (Java/Groovy) and scheduled via a scheduler; environments can be spatial or network-based. Calibration/validation are still model-specific, but Repast supports experiment automation and output collection. The contribution is positioning: a “mid-to-high ambition” platform usable for teaching and for larger research models. Limitations: tool richness can create complexity; integration with modern statistical pipelines can be less natural than in Python ecosystems. Natural follow-up includes statechart-style behavior documentation, stronger coupling with UQ/calibration frameworks (ABC, HM), and performance scaling (HPC/distributed/GPU) for million-agent demands.
16) A scalable distributed multi-agent simulation environment (D-MASON) (2018, Simulation Modelling Practice and Theory)
Cordasco et al. — run MASON models on clusters without rewriting
Cordasco et al. solve a concrete engineering problem: how to run existing MASON ABMs in distributed environments (clusters) without rewriting them from scratch. D-MASON adds mechanisms for space/agent partitioning, synchronization, and communication while keeping agent rules mostly unchanged. The main result is practical scalability: large models become feasible while preserving MASON’s architectural logic. The calibration/validation benefit is indirect: faster runs enable more replications, sensitivity sweeps, and fitting. Limitations: distribution raises determinism and repeatability issues (synchronization, random number streams); and developers must understand partitioning side effects (boundary effects, communication cost). Natural follow-up includes adaptive partitioning based on density/interaction structure, hybrid scaling (cluster + GPU), and standardized benchmarks linking performance to scientific equivalence (“faster and the same model,” not just faster).
17) FLAME GPU 2: A framework for flexible and performant agent based simulation on GPUs (2023, Software: Practice and Experience)
Richmond et al. — GPU ABM for millions of agents + ensembles
Richmond et al. target ABM’s biggest bottleneck: when agents reach millions and interactions are dense, CPU simulation becomes too slow. FLAME GPU 2 is a GPU-accelerated framework aiming to balance flexibility (general ABM expressiveness) with performance (CUDA parallelism). Agent behavior is expressed as functions that run in parallel on the GPU, with careful handling of competition and race conditions. The result is a framework that enables scalable simulations and ensembles (many runs) in scientifically practical time. The theoretical contribution is expanding what questions are computationally feasible: micro-detailed models that used to be “too expensive” become standard experiments. Limitations: GPU programming complexity; communication/randomness require careful design; and performance-driven implementation can hurt transparency if documentation lags. Natural next steps include higher-level DSLs that generate GPU kernels, stronger integration with Python analytics (calibration pipelines), and reproducibility bundles (code + configs + seeds + ODD).
18) Mesa 3: Agent-based modeling with Python in 2025 (2025, JOSS)
ter Hoeven et al. — modern Python ABM + reproducible software paper
The Mesa 3 paper is a software publication with a practical aim: support building, visualizing, and analyzing ABMs in Python without writing all infrastructure from scratch. Mesa provides abstractions for agent management, space representations, schedulers, and browser-based visualization. Because JOSS emphasizes reproducible software development (versioning, archiving, review), Mesa’s contribution is especially strong on open science: the tool becomes a citable scientific object with an auditable development history. Calibration/validation remain research tasks, but Mesa makes the broader analysis pipeline (pandas, Jupyter, ML libraries) immediately accessible. Limitations: Python performance can be a bottleneck for very large agent counts (often requiring vectorization, compiled extensions, or HPC/GPU). Natural follow-up includes stronger built-in batch/UQ tooling, standard reproducibility templates (data + config + seeds), and tighter connection to ML-based surrogate calibration and differentiable simulation workflows.
19) A survey on agent-based modelling assisted by machine learning (2023, Expert Systems)
Platas-López et al. — ML across the ABM lifecycle (design→analysis)
Platas-López et al. review how ML supports different stages of the ABM lifecycle: design, experimentation, calibration, validation, and analysis. The method is a literature survey (including bibliometric/Scopus-driven mapping and thematic categorization). It distinguishes online ML (during simulation, often RL for agent decisions) from offline ML (before/after simulation: learning rules from data, building surrogates, analyzing outputs). The core result is that “ML+ABM” is not one technique but a family: RL is natural online; supervised learning serves both rule specification and surrogates; unsupervised methods appear in output pattern discovery and clustering. The practical contribution is a map of what to use where—and the risks: black-box agents, interpretability loss, and strong data assumptions. Limitations: the view depends on search/classification schemes; parts of the field move fast via conferences/preprints and may be undercounted. Natural next steps include documentation standards for ML agents, reproducibility practices (code/data/seeds) when ABM and ML stochasticity stack, and tighter linkage to ODD-like descriptions.
20) Using Machine Learning for Agent Specifications in Agent-Based Models and Simulations: A Critical Review and Guidelines (2023, JASSS)
Dehkordi et al. — learning agent rules; transparency + robustness risks
Dehkordi et al. focus on a narrower but crucial question: how to use ML to specify agent behavior rules (not just accelerate analysis). The method is a critical review plus guidelines mapping real practices: when to learn an agent directly from data, when to use ML as heuristics, and when to prefer interpretable models (e.g., decision trees). The paper stresses a new transparency problem: ML-learned agent behavior can increase realism but makes the justification and inspection of “the rule” harder than in hand-coded heuristics. The theoretical shift is “agent as model”: agents can be statistical predictors or learned policies, not only if-then logic. The practical contribution is guidance tying together data quality, modeling goal (mechanism vs prediction), interpretability needs, and validation strategy. Limitations: the area evolves quickly; some best techniques are still emerging; and ML cannot fix missing micro-level evidence when data do not identify behavior. Natural next steps include standardized documentation of ML agents (architecture, training data, generalization limits), robustness testing under distribution shift, and connection to differentiable ABMs where policy and system dynamics sit in one optimizable graph.
21) Agent-based model calibration using machine learning surrogates (2018, Journal of Economic Dynamics & Control)
Lamperti et al. — surrogate-assisted calibration + sensitivity at scale
Lamperti et al. address a classic ABM pain point: calibration is expensive because the model must be run many times across a large parameter space. The research question is whether supervised ML surrogates can learn the ABM’s input→output mapping well enough to make parameter search and calibration much cheaper. The method builds an iterative meta-model that approximates the ABM; it is demonstrated on two economic ABMs (an asset-pricing ABM and a growth model), where the surrogate must capture nonlinear responses. Calibration here is fitting to data/target moments; the surrogate accelerates exploration. Results: surrogates make sensitivity analysis and calibration search practical by reducing computation. The practical recipe is: “run ABM at selected points → train surrogate → use surrogate for fast search → verify top candidates with ABM.” Limitations: surrogates can fail where the ABM is chaotic or exhibits sharp transitions; building the training set can still be costly for very slow ABMs. Natural extensions include uncertainty-aware surrogates (quantiles/ensembles), Bayesian integration (posterior over parameters), and standardized reporting of surrogate error to avoid overconfident policy conclusions.
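The four-step recipe quoted above can be sketched end-to-end. The "ABM" below is a cheap stand-in function, and a deliberately simple nearest-neighbour surrogate stands in for the learned meta-models used in practice; only the workflow shape is the point.

```python
import random

def expensive_abm(theta, rng):
    """Stand-in for a slow ABM run: a noisy summary statistic of theta."""
    return (theta - 0.6) ** 2 + rng.gauss(0, 0.01)

rng = random.Random(0)
target = 0.0                      # observed moment the model should match

# 1) run the ABM at a modest design of parameter points
design = [i / 20 for i in range(21)]
runs = [(t, expensive_abm(t, rng)) for t in design]

# 2) "train" a cheap surrogate (here: 3-nearest-neighbour averaging)
def surrogate(theta, k=3):
    nearest = sorted(runs, key=lambda r: abs(r[0] - theta))[:k]
    return sum(y for _, y in nearest) / k

# 3) search the surrogate densely: thousands of near-free evaluations
grid = [i / 1000 for i in range(1001)]
best = min(grid, key=lambda t: abs(surrogate(t) - target))

# 4) verify the surrogate's top candidate with the real ABM
check = expensive_abm(best, rng)
```

Step 4 is the guardrail the summary's limitations point to: where the surrogate extrapolates badly (chaotic regions, sharp transitions), only re-running the real ABM exposes it.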
22) Using machine learning as a surrogate model for agent-based simulations (2022, PLOS ONE)
Angione et al. — compare ML surrogates; speed up calibration + SA
Angione et al. tackle the same overall challenge but with a different methodological move: they compare multiple ML algorithms as ABM surrogates and link this to practical sensitivity analysis and calibration. The research question is which ML classes (e.g., neural networks, random forests, etc.) can emulate ABM output with adequate fidelity, and when this yields significant compute savings. The method uses a case ABM (a social care services model), generates a systematic dataset of ABM outputs under many input parameter combinations, and trains ML models to predict ABM outcomes. Validation is two-step: (i) surrogate accuracy vs ABM output, and (ii) discussion of how this enables faster parameter search and sensitivity workflows. Results show surrogates can substantially accelerate ABM analysis, but performance varies by algorithm and by how chaotic/thresholdy the ABM is. The contribution is a comparative framework for choosing surrogates. Limitations: surrogates learn the ABM, not reality—if the ABM is wrong, the surrogate helps you be wrong faster. Natural follow-up includes uncertainty-aware surrogates for stochastic ABMs, active learning for selecting training points, and fully reproducible pipelines (code + data).
23) Calibrating Agent-Based Models Using Uncertainty Quantification Methods (2022, JASSS)
McCulloch et al. — History Matching + ABC for stochastic ABMs
McCulloch et al. start from ABM stochasticity and high dimensionality: naïve “minimize error” search becomes expensive and unstable. The research question is how to bring uncertainty quantification tools into ABM calibration so that searching is efficient and uncertainty is quantified. The framework combines History Matching (to quickly rule out implausible parameter regions) with Approximate Bayesian Computation (to approximate a posterior without an explicit likelihood). Practically, this means evaluating whether candidate parameters fall within tolerances across multiple target statistics—rather than optimizing a single scalar. The contribution is an efficient staged approach: HM shrinks the search space before ABC samples expensively, especially valuable for environments where ABM building is easy but “serious calibration” is hard. Limitations: HM/ABC depends on choice of summary statistics and distance metrics; stochastic noise may require many replications. Natural follow-up includes surrogate models for HM/ABC, automated summary statistic selection (ML), and calibration documentation standards comparable to ODD (priors, tolerances, metrics).
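The staged HM-then-ABC logic can be sketched with a cheap stand-in for the stochastic ABM. The implausibility cutoff of 3, the tolerance of 0.2, and the summary statistic are all illustrative assumptions, not the paper's settings.

```python
import random

def abm(theta, rng, reps=20):
    """Cheap stand-in for a stochastic ABM: run `reps` replicates and
    return the mean and standard error of a summary statistic."""
    draws = [2 * theta + rng.gauss(0, 0.5) for _ in range(reps)]
    mean = sum(draws) / reps
    sd = (sum((d - mean) ** 2 for d in draws) / reps) ** 0.5
    return mean, sd / reps ** 0.5

rng = random.Random(1)
observed = 1.0                                    # data-generating theta is 0.5

# Stage 1 -- History Matching: rule out parameter regions where the
# observation sits many standard errors from the replicate mean.
plausible = []
for theta in [i / 100 for i in range(101)]:       # prior range [0, 1]
    mean, se = abm(theta, rng)
    if abs(mean - observed) / max(se, 1e-6) < 3.0:
        plausible.append(theta)

# Stage 2 -- ABC rejection on the surviving region: accept draws whose
# simulated summary lands within a tolerance of the observation.
accepted = []
for _ in range(2000):
    theta = rng.choice(plausible)
    mean, _ = abm(theta, rng, reps=5)
    if abs(mean - observed) < 0.2:
        accepted.append(theta)

posterior_mean = sum(accepted) / len(accepted)
```

The division of labour is visible even in the toy: HM discards most of [0, 1] with few replicates per point, so the expensive ABC sampling is concentrated where it can actually discriminate.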
24) Bayesian Calibration of Stochastic Agent Based Model via Random Forest (2025, Statistics in Medicine)
Robertson et al. — RF surrogate + PCA + MCMC for heavy ABM calibration
Robertson et al. address “practical impossibility” in large stochastic epidemiological ABMs: with millions of agents, full Bayesian MCMC calibration directly on the ABM is infeasible. The question is whether a random forest global surrogate plus dimension reduction (PCA) can deliver a useful parameter posterior and better predictive performance than earlier ABC-style calibration. The method: time-series outputs (hospitalizations and deaths) from the CityCOVID ABM are decomposed into latent modes (PCA); an RF surrogate learns the mapping from parameters to those components; then MCMC samples a posterior over parameters. Validation checks whether posterior samples, pushed through the ABM (or surrogate), reproduce observed dynamics; the paper compares with earlier IMABC. Results: calibration improves and computation drops, while still handling stochastic outputs via ensembles. Contribution: a transferable architecture “ABM → decomposition → surrogate → Bayes” (not just COVID), and RF gives sensitivity clues about important parameters. Limitations: surrogates fail outside training space; PCA may hide localized dynamics; posterior quality depends on the chosen targets. Natural follow-up includes richer latent time-series surrogates (e.g., RNN/Transformer), gradient-aware inference where possible, and standard scoring for stochastic calibration.
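The final link in the "ABM → decomposition → surrogate → Bayes" chain is MCMC over the cheap emulator. A minimal random-walk Metropolis sketch follows; a closed-form function stands in for the trained RF-over-PCA surrogate, and the prior, proposal scale, and chain length are illustrative assumptions.

```python
import math
import random

def surrogate_loglik(theta):
    """Stand-in for the emulator's log-likelihood of the data given theta;
    in the paper this role is played by a random-forest surrogate over
    PCA components of the ABM's simulated time series."""
    return -((theta - 0.3) ** 2) / (2 * 0.05 ** 2)

rng = random.Random(4)
theta, ll = 0.9, surrogate_loglik(0.9)            # deliberately bad start
samples = []
for _ in range(5000):
    prop = theta + rng.gauss(0, 0.05)             # random-walk proposal
    if 0.0 <= prop <= 1.0:                        # uniform prior support
        prop_ll = surrogate_loglik(prop)
        if math.log(rng.random()) < prop_ll - ll: # Metropolis accept step
            theta, ll = prop, prop_ll
    samples.append(theta)

post = samples[1000:]                             # discard burn-in
post_mean = sum(post) / len(post)                 # concentrates near 0.3
```

Because every likelihood evaluation hits the surrogate rather than the million-agent ABM, a chain of thousands of steps becomes affordable; the limitation noted above applies verbatim: the posterior is only as good as the emulator inside the training region.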
25) Differentiable Agent-based Epidemiology (GradABM) (2023, AAMAS)
Chopra et al. — differentiable ABM enables gradient-based learning
Note: In the uploaded report, the GradABM narrative summary begins but runs past the excerpt captured in the provided PDF section; this block therefore keeps to the same spirit without inventing missing specifics.
Differentiable ABM shifts the calibration logic: if the simulation is differentiable (or can be made differentiable via automatic differentiation), parameters can be learned with gradient-based methods rather than only via sampling or search. This can dramatically speed up fitting and enables end-to-end learning. It also introduces new risks: interpretability, the handling of stochasticity, and out-of-distribution behavior become even more critical when the agent rules themselves are being optimized.
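The idea can be shown on a toy one-parameter model (this is an illustration, not the GradABM code): a discrete epidemic-like recursion is differentiable in its transmission parameter, so the gradient of a fitting loss can be propagated forward through time by the chain rule, which is exactly what autodiff frameworks automate for full ABMs. The recursion, the true parameter value, and the learning rate below are all invented for the sketch.

```python
# Toy differentiable "epidemic": x[t+1] = x[t] + beta * x[t] * (1 - x[t]).
# dx tracks d(x)/d(beta), propagated forward by the chain rule -- the
# hand-written version of what automatic differentiation provides.
def simulate_with_grad(beta, steps=30, x0=0.01):
    x, dx = x0, 0.0
    traj, dtraj = [x], [dx]
    for _ in range(steps):
        dx = dx * (1 + beta * (1 - 2 * x)) + x * (1 - x)
        x = x + beta * x * (1 - x)
        traj.append(x)
        dtraj.append(dx)
    return traj, dtraj

# Synthetic "observations" from beta_true = 0.4, then gradient descent on
# the squared error between simulated and observed trajectories.
target, _ = simulate_with_grad(0.4)
beta, lr = 0.1, 0.005
for _ in range(300):
    traj, dtraj = simulate_with_grad(beta)
    grad = sum(2 * (x - y) * dx for x, y, dx in zip(traj, target, dtraj))
    beta = min(1.0, max(0.05, beta - lr * grad))   # clamped update
# beta converges toward 0.4
```

Compare this with the sampling-based calibration approaches above: here every iteration uses the gradient to move directly downhill in parameter space, rather than proposing and rejecting candidates.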
26) Machine learning surrogates for agent-based models in transportation (2025, Elsevier transport)
Natterer et al. — listed in corpus table; summary not included in narrative section
Note: This paper appears in the corpus overview table of the uploaded report, but the report's narrative "analytical summaries" section contains no full summary block for it, so none is reproduced here.
27) Modelling disease outbreaks in realistic urban social networks (2004, Nature)
Eubank et al. — realistic contact networks change outbreak dynamics
(As emphasized in the report.) Realistic urban-network ABM highlights that contact structure (households, schools, workplaces, spatial mobility) is part of the mechanism—not a detail you can safely “average away.” This makes ABMs especially suitable for policy evaluation where targeting and heterogeneity determine outcomes. The tradeoff is heavy data requirements and calibration complexity, which later work addresses via acceleration (surrogates) and stronger fitting (Bayes/UQ).
28) Strategies for mitigating an influenza pandemic (2006, Nature)
Ferguson et al. — intervention feasibility + timing dominate outcomes
(The report’s key emphasis.) Intervention impact depends not only on biology but on what is socially and operationally feasible: compliance, timing, school closures, movement restrictions, targeting. ABM lets you encode these assumptions explicitly and explore where strategies break.
29) Mitigation strategies for pandemic influenza in the United States (2006, PNAS)
Germann et al. — national-scale ABM; combined strategies matter
(As framed in the report.) At national scale, heterogeneity and mobility become decisive. Many “single lever” strategies mostly shift timing; combined interventions are more robust, especially when transmissibility is high and compliance is variable.
30) Covasim: An agent-based model of COVID-19 dynamics and interventions (2021, PLOS Computational Biology)
Kerr et al. — policy-usable ABM because it ships as a tool
(The report’s emphasis.) Covasim is a good example of ABM becoming practical decision support: open code, modular intervention definitions, and enough performance for ensembles make it usable beyond a single research group. This is “ABM as a product,” not only as a concept.
31) Role of heterogeneity… data-driven national-scale ABM using UVA-EpiHiper (2024, Elsevier)
Jia Chen et al. — listed in corpus table; summary not included in narrative section
Note: This paper appears in the corpus overview table of the uploaded report, but the report's narrative "analytical summaries" section contains no full summary block for it, so none is reproduced here.
32) UrbanSim: Modeling Urban Development for Land Use… (2002, Journal of the American Planning Association)
Waddell — land use ↔ transport feedback as an endogenous system
(As referenced in the report.) UrbanSim links household, firm, and developer decisions to accessibility, markets, and policy constraints, letting researchers examine feedback loops (transport → access → demand → development → new spatial pattern). The ABM value is making those feedbacks explicit and testable under policy scenarios.
33) Agent-based simulation of travel demand: MATSim-T… performance (2008, open access conference paper)
Balmer et al. — iterative replanning → emergent traffic equilibria
(As referenced in the report.) MATSim-T treats traffic as iterative plan adaptation: simulate, score plans, replan, repeat. This loop yields equilibrium-like outcomes without requiring an analytic solution, while retaining heterogeneity and spatial dynamics—exactly where ABM shines in transportation.
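The simulate → score → replan loop can be sketched in miniature. This is a deliberately tiny stand-in (two routes, linear congestion functions, a fixed 10% replanning rate, all numbers invented), not MATSim's actual scoring or co-evolutionary plan sets:

```python
import random

random.seed(42)

N_AGENTS = 1000
plans = ["A"] * N_AGENTS          # everyone starts on route A

def travel_time(route, load):
    # hypothetical congestion functions: A is short but congestible,
    # B is longer but nearly flat
    if route == "A":
        return 10 + 30 * load / N_AGENTS
    return 25 + 5 * load / N_AGENTS

for iteration in range(50):
    # simulate: travel times follow from current route loads
    load = {"A": plans.count("A"), "B": plans.count("B")}
    times = {r: travel_time(r, load[r]) for r in ("A", "B")}
    # score + replan: 10% of agents reconsider and switch to the
    # currently best-scoring route
    best = min(times, key=times.get)
    for i in range(N_AGENTS):
        if random.random() < 0.10:
            plans[i] = best

share_a = plans.count("A") / N_AGENTS
```

With these congestion functions the travel times equalize around a 4/7 share on route A, and the loop settles into a band around that split without any analytic equilibrium computation; damping the replanning rate over iterations would tighten the band further.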
34) Scaling and criticality in a stochastic multi-agent model of a financial market (1999, Nature)
Lux & Marchesi — stylized facts (fat tails, clustering) emerge endogenously
(Listed in the report as a classic milestone.) This financial-market ABM shows how heterogeneous traders and strategy switching can generate volatility clustering and heavy-tailed return distributions without external shocks. The ABM value here is explicitly modeling endogenous instability and crisis-like dynamics.
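The switching mechanism can be sketched in a deliberately simplified form, far from the paper's actual equations: agents move between fundamentalist and chartist strategies depending on recent price moves, and the changing mix modulates return dynamics. All coefficients, thresholds, and bounds below are invented for the illustration.

```python
import random

random.seed(7)

N = 100
chartists = 20                  # the rest are fundamentalists
log_p, log_f = 0.0, 0.0         # log price and log fundamental value
prev_ret = 0.0
returns = []

for t in range(2000):
    f_share = (N - chartists) / N
    c_share = chartists / N
    ret = (0.1 * f_share * (log_f - log_p)   # mean reversion to value
           + 0.9 * c_share * prev_ret        # trend chasing
           + random.gauss(0, 0.01))          # exogenous news
    log_p += ret
    returns.append(ret)
    prev_ret = ret
    # switching rule: large moves attract chartists, calm markets favour
    # fundamentalists (bounded so the dynamics stay stationary)
    if abs(ret) > 0.02 and chartists < 80:
        chartists += 1
    elif abs(ret) <= 0.02 and chartists > 10:
        chartists -= 1
```

The feedback loop is the point: a burst of volatility raises the chartist share, which strengthens trend chasing and prolongs the burst, so volatility clusters endogenously rather than being injected by external shocks.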
35) EURACE: A massively parallel agent-based model of the European economy (2008, Applied Mathematics and Computation)
Deissenberg et al. — macro ABM engineering + scaling
(Highlighted in the report as a “large application” case.) EURACE is an engineering proof that large parallel macro-ABMs are feasible. But as complexity grows, calibration and validation become the hardest part (not only compute). This pushes the field toward standards, UQ, and later acceleration methods.
36) The economy needs agent-based modelling (2009, Nature)
Farmer & Foley — agenda-setting: systemic risk needs ABM
(Referenced in the report as an ABM “justification manifesto” for economics.) The argument: equilibrium models fail in crises; systemic risk and nonlinear feedback require an agent-based view. This also means ABM must take standards, data, calibration, and reproducibility seriously—otherwise it stays at the level of “nice animation.”
37) Opening the black box—Development, testing and documentation… (2010, Ecological Modelling)
Topping et al. — transparency: development + testing + documentation as method
(Included in the report as an ecology-domain contribution.) The focus is opening the “black box” problem: ABM’s scientific value depends on whether model development, testing, and documentation are transparent enough for others to evaluate and replicate. This shifts ABM back from “program” to “model”—with a controllable, inspectable workflow.




