The Hidden Layer of IT Resilience: What a Monitoring Platform Migration Really Teaches You

Mar 12, 2026


We had been using PRTG for almost ten years. It was stable, well understood, and every engineer on the team knew it inside out. That is exactly the kind of situation that makes a monitoring platform migration feel unnecessary — until the costs of standing still start to outweigh the costs of change.

For Si Futures, the trigger was straightforward. Our client base was growing, and new devices meant new licences. PRTG’s pricing model meant that growth was starting to carry a real cost overhead, and we owed it to our clients to keep our operational costs sensible. We investigated alternatives, found that Zabbix covered all the same ground at a fraction of the cost, and made the decision to migrate.

What happened next taught us more about monitoring than a decade of steady-state operation had managed.

The migration was easier than we feared — in one specific way

Our biggest concern going in was data integrity. We had nearly ten years of host configurations sitting in PRTG, and we could not afford to lose anything in transit. What made the transfer manageable was something we had not fully anticipated. We recorded each host’s PRTG ID alongside its Zabbix ID, then used AI tools, including direct database-level access via Claude Code, to query both platforms side by side. That gave us a genuine one-to-one comparison and let us verify that every host had crossed over accurately. It was considerably faster than the manual export-and-import process we had started with, and it gave us real confidence in the completeness of the migration.
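To make the idea of a one-to-one comparison concrete, here is a minimal Python sketch of the kind of reconciliation described above. It is illustrative only and is not the tooling we actually ran: the `HostRecord` type, the `reconcile` function, and the choice of comparing name and address are all assumptions made for the example, and the records are presumed to have been exported from each platform already.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostRecord:
    """One host as exported from a platform (hypothetical shape)."""
    source_id: str   # the platform's own ID for the host
    name: str
    address: str


def reconcile(prtg_hosts, zabbix_hosts, id_map):
    """Compare hosts pairwise using id_map (PRTG ID -> Zabbix ID).

    Returns PRTG hosts with no Zabbix counterpart, pairs whose name or
    address differ, and Zabbix hosts that no PRTG host maps to.
    """
    zabbix_by_id = {h.source_id: h for h in zabbix_hosts}
    missing, mismatched = [], []

    for host in prtg_hosts:
        zabbix_id = id_map.get(host.source_id)
        target = zabbix_by_id.get(zabbix_id)
        if target is None:
            missing.append(host.source_id)
        elif (host.name.lower(), host.address) != (target.name.lower(), target.address):
            # Names are compared case-insensitively (an assumption).
            mismatched.append((host.source_id, zabbix_id))

    mapped = set(id_map.values())
    unexpected = sorted(z for z in zabbix_by_id if z not in mapped)
    return {"missing": missing, "mismatched": mismatched, "unexpected": unexpected}
```

An empty result in all three lists is what "every host crossed over accurately" looks like in this sketch. Anything else is a short, specific list to investigate rather than a full manual re-check.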

The host transfer turned out to be the easy part.

What was harder: the monitoring you did not know you were missing

Zabbix is, by default, considerably more thorough than PRTG at surfacing monitoring data. When we completed the migration, we found ourselves looking at a volume of alerts and metrics we had simply never seen before. Not because the infrastructure had changed. Because Zabbix was telling us things PRTG had never told us.

Firewall licence expiries. Certificate expirations. Health metrics that had previously required manual checks or email reminders. Zabbix surfaced all of them automatically, and suddenly we had a significant calibration task on our hands. Were these alerts valid? Were we checking at the right intervals? Was the method correct, or had we just automated a manual process that could be done better?
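As an illustration of what calibrating an expiry check involves, the sketch below classifies a certificate or licence by the number of whole days left before it lapses. The function name and the 30-day and 7-day thresholds are assumptions chosen for the example, not our production values or Zabbix defaults. Deciding those numbers per client is the calibration work.

```python
from datetime import datetime, timezone


def expiry_severity(not_after, now, warn_days=30, critical_days=7):
    """Classify an expiry date by whole days remaining (hypothetical thresholds)."""
    days_left = (not_after - now).days
    if days_left < 0:
        return "expired"
    if days_left <= critical_days:
        return "critical"
    if days_left <= warn_days:
        return "warning"
    return "ok"


# Example call with timezone-aware datetimes (illustrative values only).
severity = expiry_severity(
    not_after=datetime(2026, 4, 1, tzinfo=timezone.utc),
    now=datetime(2026, 3, 12, tzinfo=timezone.utc),
)
```

A daily check is usually enough for a value that changes this slowly. Polling it every minute adds load without adding information.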

The reality is that over the course of ten years you become accustomed to what your monitoring platform shows you. You stop questioning whether there are things it is not showing you. The migration forced that question into the open, and answering it properly took considerably more time than moving the hosts had.

Why alert quality matters more than alert volume in IT resilience strategy

This is worth explaining carefully, because it is easy to assume that more monitoring data is simply better. It is not.

If your monitoring checks a device too infrequently, you risk missing a failure window that matters. Check too often, and engineers start receiving alerts about momentary packet loss that resolves in sixty seconds — noise that pulls attention away from issues that actually require action. Alert fatigue is a genuine operational risk, and effective threat detection and response only works when engineers can trust that every alert reaching them is worth their time.
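One common way to filter out short-lived blips is to alert only when a breach persists across several consecutive checks. The Python sketch below shows that logic under stated assumptions: the function name, the packet-loss threshold, and the sample values are hypothetical, and it is meant to illustrate the idea rather than how PRTG or Zabbix implement it.

```python
def should_alert(samples, threshold, min_consecutive):
    """Return True only if the most recent `min_consecutive` samples all
    exceed `threshold`, so a brief spike does not raise an alert."""
    if min_consecutive < 1:
        raise ValueError("min_consecutive must be at least 1")
    if len(samples) < min_consecutive:
        return False
    return all(s > threshold for s in samples[-min_consecutive:])


# Packet loss (%) sampled every 30 seconds (illustrative values).
blip = [0, 40, 35, 0, 0]        # roughly sixty seconds of loss, then recovery
sustained = [0, 40, 35, 50]     # loss still present after three checks

blip_alerts = should_alert(blip, threshold=10, min_consecutive=3)
sustained_alerts = should_alert(sustained, threshold=10, min_consecutive=3)
```

The trade-off is explicit in the two parameters. A larger `min_consecutive` means less noise but slower detection, and the check interval determines how much wall-clock time each sample represents.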

There is also a client trust dimension that is easy to underestimate. When a monitoring system repeatedly flags problems that do not exist, it erodes confidence. You end up telling clients there is a problem when there is not one, and over time the service becomes associated with noise rather than intelligence.

The purpose of a monitoring platform is not to surface every possible data point — it is to surface precisely the right ones. Calibration is not configuration. It is the difference between intelligence and noise.

[Figure: IT monitoring calibration spectrum, showing the shift from alert fatigue to precision threat detection]

On running two systems in parallel — and why you still need a hard deadline

We currently run PRTG alongside Zabbix, with PRTG configured as a fallback: it alerts us only if Zabbix has not already done so. This redundancy gives us confidence during the transition, but it carries its own overhead. When PRTG generates an alert that Zabbix has not, someone still has to investigate whether it is meaningful. Two systems mean two workflows, two validation processes, and two potential sources of noise.
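The fallback rule can be sketched as a simple filter: keep a PRTG alert only when no Zabbix alert exists for the same host within a short time window. This is a minimal illustration, not our actual configuration. The `fallback_alerts` name, the `(host, timestamp)` tuple shape, and the five-minute window are assumptions made for the example.

```python
from datetime import datetime, timedelta


def fallback_alerts(prtg_alerts, zabbix_alerts, window=timedelta(minutes=5)):
    """Return the PRTG alerts that Zabbix did not also raise.

    Each alert is a (host, raised_at) tuple. A PRTG alert counts as covered
    if Zabbix raised one for the same host within `window` of it.
    """
    unmatched = []
    for host, raised_at in prtg_alerts:
        covered = any(
            z_host == host and abs(z_time - raised_at) <= window
            for z_host, z_time in zabbix_alerts
        )
        if not covered:
            unmatched.append((host, raised_at))
    return unmatched
```

Every entry this returns is exactly the case described above: an alert that someone has to investigate by hand, which is why the size of that list is a useful measure of how much calibration work remains.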

We have set a firm switch-off date for PRTG. Not because the migration will be perfectly complete by then (it will not be), but because projects without hard deadlines do not finish. The date forces prioritisation and keeps the remaining proxy installs, agent configurations, and calibration work moving forward. Our proactive IT support model depends on monitoring precision, and that precision requires the discipline of a committed completion date.

What we would tell another IT leader facing the same decision

Run a pilot before you commit. Import as many hosts as you can into the new platform before you stop updating the old one. You will almost certainly find more functionality than you expected, and you need to understand the scale of the calibration work ahead of you before you are fully committed.

If the pilot reveals that the scope is larger than you anticipated, do not let that be the reason you stop. The value on the other side is real. A monitoring platform that tells you more, with better precision, and at a lower operational cost, is worth the months of work it takes to get there. The sweat is temporary. The capability improvement is not.

If your organisation is reviewing its monitoring strategy — or questioning whether your current platform is showing you everything it should — speak with our team about a threat readiness and resilience assessment.
Geordie Hogarth
