Rafal Kucharski Lab | Michał Hoffmann

I was pursuing a bachelor’s degree in Theoretical Computer Science at Jagiellonian University. My research focuses on competition in the transportation, where I combine Game Theory with Machine Learning to explore the impact of conflicts on everyday commuter. I love to work on intersection of various fields where I can use sometimes unorthodox methods solving my problems.

Outside of the lab I enjoy mountain trekking and I never pass up a good book!

List of main publications and preprints

arXiv

URB - Urban Routing Benchmark for RL-equipped Connected Autonomous Vehicles

Akman, Ahmet Onur, Psarou, Anastasia, Hoffmann, Michał, Gorczyca, Łukasz, Kowalski, Łukasz, Gora, Paweł, Jamróz, Grzegorz, and Kucharski, Rafał

arXiv preprint arXiv:2505.17734 2025

Abs HTML

Connected Autonomous Vehicles (CAVs) promise to reduce congestion in future urban networks, potentially by optimizing their routing decisions. Unlike for human drivers, these decisions can be made with collective, data-driven policies, developed by machine learning algorithms. Reinforcement learning (RL) can facilitate the development of such collective routing strategies, yet standardized and realistic benchmarks are missing. To that end, we present \our: Urban Routing Benchmark for RL-equipped Connected Autonomous Vehicles. \our is a comprehensive benchmarking environment that unifies evaluation across 29 real-world traffic networks paired with realistic demand patterns. \our comes with a catalog of predefined tasks, four state-of-the-art multi-agent RL (MARL) algorithm implementations, three baseline methods, domain-specific performance metrics, and a modular configuration scheme. Our results suggest that, despite the lengthy and costly training, state-of-the-art MARL algorithms rarely outperformed humans. Experimental results reported in this paper initiate the first leaderboard for MARL in large-scale urban routing optimization and reveal that current approaches struggle to scale, emphasizing the urgent need for advancements in this domain.
arXiv

Wardropian Cycles make traffic assignment both optimal and fair by eliminating price-of-anarchy with Cyclical User Equilibrium for compliant connected autonomous vehicles

Hoffmann, Michał, Bujak, Michał, Jamróz, Grzegorz, and Kucharski, Rafał

arXiv preprint arXiv:2507.19675 2025

Abs HTML

Connected and Autonomous Vehicles (CAVs) open the possibility for centralised routing with full compliance, making System Optimal traffic assignment attainable. However, as System Optimum makes some drivers better off than others, voluntary acceptance seems dubious. To overcome this issue, we propose a new concept of Wardropian cycles, which, in contrast to previous utopian visions, makes the assignment fair on top of being optimal, which amounts to satisfaction of both Wardrop’s principles. Such cycles, represented as sequences of permutations to the daily assignment matrices, always exist and equalise, after a limited number of days, average travel times among travellers (like in User Equilibrium) while preserving everyday optimality of path flows (like in System Optimum). We propose exact methods to compute such cycles and reduce their length and within-cycle inconvenience to the users. As identification of optimal cycles turns out to be NP-hard in many aspects, we introduce a greedy heuristic efficiently approximating the optimal solution. Finally, we introduce and discuss a new paradigm of Cyclical User Equilibrium, which ensures stability of optimal Wardropian Cycles under unilateral deviations. We complement our theoretical study with large-scale simulations. In Barcelona, 670 vehicle-hours of Price-of-Anarchy are eliminated using cycles with a median length of 11 days-though 5% of cycles exceed 90 days. However, in Berlin, just five days of applying the greedy assignment rule significantly reduces initial inequity. In Barcelona, Anaheim, and Sioux Falls, less than 7% of the initial inequity remains after 10 days, demonstrating the effectiveness of this approach in improving traffic performance with more ubiquitous social acceptability.
arXiv

Autonomous Vehicles Using Multi-Agent Reinforcement Learning for Routing Decisions Can Harm Urban Traffic

Psarou, Anastasia, Akman, Ahmet Onur, Gorczyca, Łukasz, Hoffmann, Michał, Varga, Zoltán György, Jamróz, Grzegorz, and Kucharski, Rafał

arXiv preprint arXiv:2502.13188 2025

Abs HTML

Autonomous vehicles (AVs) using Multi-Agent Reinforcement Learning (MARL) for simultaneous route optimization may destabilize traffic environments, with human drivers possibly experiencing longer travel times. We study this interaction by simulating human drivers and AVs. Our experiments with standard MARL algorithms reveal that, even in trivial cases, policies often fail to converge to an optimal solution or require long training periods. The problem is amplified by the fact that we cannot rely entirely on simulated training, as there are no accurate models of human routing behavior. At the same time, real-world training in cities risks destabilizing urban traffic systems, increasing externalities, such as CO2 emissions, and introducing non-stationarity as human drivers adapt unpredictably to AV behaviors. Centralization can improve convergence in some cases, however, it raises privacy concerns for the travelers’ destination data. In this position paper, we argue that future research must prioritize realistic benchmarks, cautious deployment strategies, and tools for monitoring and regulating AV routing behaviors to ensure sustainable and equitable urban mobility systems.