People sometimes ask how I ended up choosing the University of Groningen (RUG). The short answer is that it matched both what I wanted to study and the kind of experience I was looking for.
The university was highly ranked in the QS World University Rankings, and its software engineering and distributed systems programme fit my interests especially well.
Cross-cultural Collaboration in Code Reviews
In one particularly interesting software analytics course, I worked with others to study 3.3 million pull requests from 11,000+ open-source projects. We started with a simple question:
Can you buy faster code review with more reviewers?
We found that pull requests with 4 to 12 participants received the fastest first response. With fewer reviewers, the time to first response roughly tripled. Adding even more reviewers made merge times longer and less predictable.
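To give a feel for the comparison, here is a minimal Python sketch of the bucketing idea. The records, the bucket boundaries, and the function names are assumptions made up for illustration; this is not our dataset or the analysis code from the repository.

```python
from collections import defaultdict
from statistics import median

# Hypothetical toy records: (participants, hours until first response).
# The numbers are invented for illustration and are not from the study.
TOY_PRS = [
    (1, 30.0), (2, 26.0), (3, 28.0),
    (4, 9.0), (6, 8.0), (12, 10.0),
    (15, 20.0), (20, 35.0),
]


def bucket(participants: int) -> str:
    """Map a participant count to one of three illustrative ranges."""
    if participants <= 3:
        return "1-3"
    if participants <= 12:
        return "4-12"
    return "13+"


def median_first_response(prs):
    """Return the median hours to first response per participant bucket."""
    grouped = defaultdict(list)
    for participants, hours in prs:
        grouped[bucket(participants)].append(hours)
    return {name: median(values) for name, values in grouped.items()}


if __name__ == "__main__":
    for name, hours in median_first_response(TOY_PRS).items():
        print(f"{name:>5} participants: median {hours:.1f} h to first response")
```

The median is used here because response times tend to be heavily skewed, so a mean would be dominated by a few very slow PRs.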
The interesting catch is that 85% of the PRs we analysed had three or fewer participants. Software engineers already seem to understand that adding more reviewers helps only up to a point, after which it starts making things worse.
However, many studies had already analysed pull request speed. What interested us more was the personal and identity-based dimension of cross-cultural collaboration. So we asked a second question:
Does the national, cultural or group identity of the reviewer affect the results?
Here, the results were a bit uncomfortable. Same-country and same-affiliation programmer-reviewer pairs merged slightly faster, but the more important effect was on acceptance. Contributors reviewed by someone from a different country were measurably less likely to have their PR accepted.
The effect sizes may seem small, but the pattern is consistent. That makes the finding important for diverse teams that want to adjust their processes for fairer cross-cultural collaboration.
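The acceptance comparison above boils down to two rates. The following Python sketch shows the idea under stated assumptions: the `ReviewedPR` record, its fields, and the sample rows are hypothetical, and a raw comparison like this does not control for confounders such as project or PR size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewedPR:
    """Hypothetical record; field names are illustrative only."""
    author_country: str
    reviewer_country: str
    merged: bool


def acceptance_rates(prs):
    """Return (same_country_rate, cross_country_rate) as fractions in [0, 1]."""
    same = [pr.merged for pr in prs if pr.author_country == pr.reviewer_country]
    cross = [pr.merged for pr in prs if pr.author_country != pr.reviewer_country]

    def rate(flags):
        # An empty group has no defined rate, so report NaN instead of dividing by zero.
        return sum(flags) / len(flags) if flags else float("nan")

    return rate(same), rate(cross)


# Invented sample rows, not taken from the study.
SAMPLE = [
    ReviewedPR("NL", "NL", True),
    ReviewedPR("NL", "NL", True),
    ReviewedPR("SK", "SK", True),
    ReviewedPR("SK", "SK", False),
    ReviewedPR("NL", "SK", True),
    ReviewedPR("SK", "NL", False),
    ReviewedPR("NL", "US", True),
    ReviewedPR("US", "SK", False),
]

if __name__ == "__main__":
    same_rate, cross_rate = acceptance_rates(SAMPLE)
    print(f"same-country acceptance:  {same_rate:.2f}")
    print(f"cross-country acceptance: {cross_rate:.2f}")
```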
You can read the full write-up with charts and other findings in our Medium article, and inspect the data analysis in the accompanying GitHub repository.

Drone Bridge Inspection Architecture
Outside research, I also focused on the large-scale software architecture track at RUG. I started with a regular software architecture course and then continued with an advanced one. There, my team and I designed Standing Bridges, a system that completely automates routine bridge inspections using autonomous swarms of specialised drones.
We began by defining the goals and requirements. That led us to a multi-tenant SaaS model built around infrastructure owners, bridge engineers, system integrators, and the essential use cases: regular inspections, in-house drone management, expert data analysis, and inspection scheduling.
Our solution strategy focused on meeting quality goals such as reliability and safety. We did this by using proven industry standards, including ROS 2 for swarm coordination and TLS 1.3 for secure data transmission.
At first, we thought most of the obvious complexity would be in drone flight planning. We quickly realised that the practical bottleneck was the logistics supporting the entire process. The real challenge was keeping track of drones: which ones were charged, which were due for maintenance, where they needed to be tomorrow, which tasks they could perform, and which spare parts needed to be reordered so the drone warehouse could keep them operational.
We first decomposed the system into four cooperating subsystems: mission management, warehouse management, on-site deployment, and data analysis. While modelling the architecture with the C4 model, we found that warehouse management was the main bottleneck. We addressed it with an event-driven architecture inside that subsystem, using an asynchronous event bus to communicate drone, maintenance, and inventory status.
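The event-bus idea can be sketched in a few lines of Python. This is an in-process toy built on `asyncio`, not the Standing Bridges design itself: the `Event` and `EventBus` classes, the topic names, and the handlers are all assumptions for illustration, and a real deployment would more likely sit on a message broker.

```python
import asyncio
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    topic: str     # hypothetical topic names, e.g. "inventory.low"
    payload: dict


class EventBus:
    """In-process publish/subscribe bus with asynchronous handlers."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, topic, handler):
        """Register an async handler for one topic."""
        self._handlers[topic].append(handler)

    async def publish(self, event):
        # Fan out to every subscriber of the topic and wait for all of them.
        await asyncio.gather(*(h(event) for h in self._handlers[event.topic]))


async def demo():
    bus = EventBus()
    reorder_list = []
    maintenance_queue = []

    async def on_low_stock(event):
        reorder_list.append(event.payload["part"])

    async def on_maintenance_due(event):
        maintenance_queue.append(event.payload["drone_id"])

    bus.subscribe("inventory.low", on_low_stock)
    bus.subscribe("drone.maintenance_due", on_maintenance_due)

    await bus.publish(Event("inventory.low", {"part": "rotor-blade"}))
    await bus.publish(Event("drone.maintenance_due", {"drone_id": "D-07"}))
    return reorder_list, maintenance_queue


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The point of the pattern is that publishers of drone, maintenance, and inventory status do not need to know who consumes their events, so new consumers can be added without touching existing ones.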
All software components and processes were analysed and then translated into a physical deployment. The result was a hybrid cloud-edge architecture, with data and management functions running on AWS and time- and safety-sensitive drone operations handled by on-site edge devices.
We made sure to codify our architectural choices in Architecture Decision Records (ADRs), including the use of Docker and Kubernetes and the decision to establish Event-Driven Service-Oriented Architecture (ED-SOA) as a cross-cutting concept.
Finally, we concluded our work by taking a hard look at the future. We identified technical debt and operational risks, such as potential signal jamming or public discontent, and established a five-year roadmap to guide the project through deployment.
The full arc42 architecture document can be seen here. However, we also deployed an interactive C4 view, which is a much nicer way to explore the architecture design.

LLMs for Code Clone Detection
Code clones come in four flavours, from copy-pasted snippets that differ only in whitespace (Type-1) to functionally equivalent fragments implemented in completely different ways (Type-4). Type-4 is the interesting case: traditional tools rarely catch it because there is almost nothing textual to compare.
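To make the two ends of that spectrum concrete, here is a small Python illustration. The three snippets and the helper names are my own made-up examples, and the token comparison is only a naive stand-in for what textual clone detectors do, not any specific tool from the survey.

```python
import io
import tokenize

ORIGINAL = """
def sum_to_n(n):
    total = 0
    for i in range(1, n + 1):
        total += i
    return total
"""

# Type-1 clone: same code, only comments and blank lines differ.
TYPE_1 = """
def sum_to_n(n):
    total = 0   # running total

    for i in range(1, n + 1):
        total += i
    return total
"""

# Type-4 clone: same behaviour, completely different implementation.
TYPE_4 = """
def sum_to_n(n):
    return n * (n + 1) // 2
"""

KEPT = {tokenize.NAME, tokenize.NUMBER, tokenize.OP, tokenize.STRING}


def code_tokens(source):
    """Token strings with comments, blank lines, and indentation dropped."""
    readline = io.StringIO(source).readline
    return [tok.string for tok in tokenize.generate_tokens(readline)
            if tok.type in KEPT]


def load(source):
    """Execute a snippet and return the sum_to_n function it defines."""
    namespace = {}
    exec(source, namespace)
    return namespace["sum_to_n"]


def same_behaviour(source_a, source_b, inputs=range(20)):
    """Check that both snippets agree on a small set of sample inputs."""
    f, g = load(source_a), load(source_b)
    return all(f(n) == g(n) for n in inputs)


if __name__ == "__main__":
    print("Type-1 tokens identical:", code_tokens(ORIGINAL) == code_tokens(TYPE_1))
    print("Type-4 tokens identical:", code_tokens(ORIGINAL) == code_tokens(TYPE_4))
    print("Type-4 same behaviour:  ", same_behaviour(ORIGINAL, TYPE_4))
```

A token comparison settles the Type-1 case immediately, but for Type-4 the token streams share little, even though both functions agree on every sampled input.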
For our colloquium, we surveyed 25 papers published between 2020 and February 2025 to see how well LLMs were closing that gap.
A few headline numbers stuck with me. Across all clone types, the surveyed methods averaged ~74% recall and ~68% precision. For the hard Type-4 case, recall was far lower, although GPT-4 roughly tripled GPT-3.5’s recall, from 7% to 23%. At the same time, the small CodeBERT family (125M parameters) showed up in roughly a quarter of the studies and held its own, suggesting that “bigger” is not automatically “better” once training cost matters.
Two of the results we found were subtler. First, GPT models were noticeably better at spotting clones that other LLMs had generated than ones written by humans, a bias that may quietly matter as more production code is itself written by LLMs. Second, models trained on the popular BigCloneBench benchmark lost roughly 30 percentage points of precision when evaluated on a different dataset, a reminder that benchmark scores often flatter the benchmark as much as the model.
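For readers less familiar with the metrics quoted above, precision and recall follow the usual definitions over clone pairs. The Python sketch below uses invented counts purely to show the arithmetic; they are not numbers from any surveyed paper.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) for a clone detector's predictions.

    precision = reported clone pairs that are real clones
    recall    = real clone pairs that were actually reported
    """
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical counts: 100 real clone pairs, 23 of them found,
    # plus 7 pairs reported by mistake.
    precision, recall = precision_recall(
        true_positives=23, false_positives=7, false_negatives=77
    )
    print(f"precision={precision:.2f} recall={recall:.2f}")
```

A detector can score well on one metric and poorly on the other, which is why the survey numbers are always reported as a pair.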
Next Steps
During my studies in Groningen, I learned a great deal about academic research in computing science and paired that with sustainability-focused electives.
Still, my biggest takeaway from university life was the people I met. Working on projects with people from all over the globe was eye-opening, and building personal connections across vastly different cultures was one of the most rewarding parts of my time there. It changed me as a person: the way I think, work, and relate to others. It is probably what I will carry with me most from Groningen.
Along the way, I completed a second internship at ASML and also worked on an internship project with PosAm. After finishing my coursework, I started looking for a company to collaborate with, and the contents of my studies eventually led me to research LLMs for sustainable software engineering at TNO for my master's thesis.
