Case Study: BGP Routing Investigation | Retail


A retail client’s critical network infrastructure faced severe performance degradation during a primary circuit outage, threatening their revenue-critical e-commerce operations. When their main IPT connection failed due to a fibre break at their Johannesburg DC facility, backup systems exhibited unexplained packet loss and bottlenecks that simple capacity constraints couldn’t explain. Si Futures conducted a systematic post-incident investigation that revealed complex BGP routing asymmetry issues causing traffic splitting across multiple paths during failover scenarios.

Challenge

The immediate crisis centred on a fibre break to the client’s Johannesburg DC facility, which took down the main IPT connection and required urgent repairs. During the 19-hour outage, however, the backup infrastructure struggled under load: it exhibited severe performance degradation and packet loss that simple capacity constraints could not explain, creating bottlenecks that threatened the client’s revenue-critical e-commerce operations.

Key technical challenges included:

  • Unexplained packet loss during failover scenarios
  • Backup wireless circuit saturation
  • Performance degradation beyond simple capacity constraints
  • Risk to revenue-critical e-commerce transaction processing

Approach

Si Futures conducted a comprehensive post-incident investigation using a systematic network analysis methodology to identify the root cause of performance issues during failover scenarios.

Technical Investigation Process:

  • Traffic Pattern Analysis – Examined output and input traffic graphs across all circuits to identify routing patterns
  • BGP Route Advertisement Investigation – Used RIPE’s BGPlay tool to analyse global internet routing table changes during the outage period
  • Circuit Utilisation Assessment – Monitored bandwidth consumption patterns across primary and backup connections under load conditions
  • Infrastructure Architecture Review – Analysed physical router patching and BGP peer configuration to understand failover behaviour

Key Technical Findings:

  • Asymmetric Routing Detected – Traffic graphs revealed output to the secondary IPT with no corresponding input traffic, indicating routing asymmetry
  • Dual Route Advertisement – BGP analysis showed the client’s prefix being advertised simultaneously via both the secondary and primary links during the outage
  • Wireless Circuit Saturation – The backup wireless link reached maximum capacity whilst the primary circuit remained down
  • Traffic Split Identified – Approximately 70% of the traffic arriving from the global internet was routed via the primary IPT provider and 30% via the secondary provider during the outage period
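
The asymmetry finding above can be illustrated with a minimal Python sketch. It assumes per-circuit throughput samples have already been exported from a monitoring system; the circuit names, sample values, and the `min_active_bps` and `ratio` thresholds are hypothetical and are not figures from the investigation.

```python
from dataclasses import dataclass


@dataclass
class CircuitSample:
    name: str
    out_bps: float  # traffic leaving the site on this circuit
    in_bps: float   # traffic arriving at the site on this circuit


def find_asymmetric(samples, min_active_bps=1_000_000, ratio=0.05):
    """Return names of circuits sending traffic out but receiving almost none back."""
    flagged = []
    for s in samples:
        carries_output = s.out_bps >= min_active_bps
        missing_input = s.in_bps < s.out_bps * ratio
        if carries_output and missing_input:
            flagged.append(s.name)
    return flagged


# Hypothetical readings taken during a failover (illustrative values only).
samples = [
    CircuitSample("backup-wireless", out_bps=20_000_000, in_bps=95_000_000),
    CircuitSample("secondary-ipt", out_bps=60_000_000, in_bps=0),
]
print(find_asymmetric(samples))
```

A circuit is flagged only when it is clearly in use for output, so idle links with no traffic in either direction are not reported as asymmetric.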

Architecture Discovery:

  • Routers were patched directly without intermediate switches, preventing cross-router circuit visibility
  • BGP peers were configured with IP SLA tracking on both routers independently
  • Router 1 was advertising the prefix to the primary provider over the backup wireless circuit whilst Router 2 simultaneously advertised it to the secondary provider
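
To show why independent tracking on each router can produce two simultaneous announcements, the following Python sketch models the behaviour described above. It is a simplified illustration, not the client’s configuration: the router dictionaries, the `priority` field, and the "highest-priority healthy router announces" rule used for the coordinated case are all assumptions.

```python
def advertised_paths(routers, coordinated=False):
    """Return the upstream paths on which the prefix is announced.

    Each router is a dict with 'name', 'upstream', 'priority' (lower wins)
    and 'link_up' (the result of that router's own reachability probe).
    """
    healthy = [r for r in routers if r["link_up"]]
    if not coordinated:
        # Independent tracking: every router with a healthy local link
        # announces the prefix, with no knowledge of its neighbour.
        return [r["upstream"] for r in healthy]
    if not healthy:
        return []
    # Coordinated (assumed policy): only one router announces at a time.
    best = min(healthy, key=lambda r: r["priority"])
    return [best["upstream"]]


# Hypothetical state during the outage: both routers see a working local link.
routers = [
    {"name": "router1", "upstream": "primary-provider-via-wireless",
     "priority": 1, "link_up": True},
    {"name": "router2", "upstream": "secondary-provider",
     "priority": 2, "link_up": True},
]
print(advertised_paths(routers))                    # both paths announced
print(advertised_paths(routers, coordinated=True))  # a single path announced
```

With independent tracking each router only knows its own link state, so both announce the prefix and inbound traffic splits across the two providers.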

Solution

Immediate Infrastructure Corrections:

  • BGP Configuration Optimisation – Route advertisement logic was reconfigured to prevent simultaneous dual-path announcements during failover scenarios
  • Router Architecture Enhancement – Infrastructure changes were recommended to enable cross-router circuit awareness and coordinated failover
  • SFP Replacement Protocol – Implemented systematic SFP testing and replacement procedures for the secondary connection experiencing packet loss

Monitoring and Alerting Improvements:

  • Aggressive Threshold Implementation – Reduced the packet loss alerting threshold from 15% over 5 minutes to 2% over 1 minute across all DC circuits
  • Enhanced NMS Deployment – Upgraded to an advanced network monitoring system providing improved circuit health visibility and intelligent alerting
  • Comprehensive Circuit Testing – Established load testing protocols using iPerf at 300 Mbps to replicate real-world traffic conditions
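
The effect of the tighter threshold can be sketched in Python as a sliding-window check. This is a minimal illustration rather than the logic of any particular NMS: the `LossAlert` class, the 10-second probe interval, and the 5% loss figure in the example are assumptions; only the 15% over 5 minutes and 2% over 1 minute settings come from the case study.

```python
from collections import deque


class LossAlert:
    """Fire when packet loss over a sliding time window reaches a threshold."""

    def __init__(self, threshold_pct, window_s):
        self.threshold_pct = threshold_pct
        self.window_s = window_s
        self.samples = deque()  # entries are (timestamp_s, sent, lost)

    def add(self, timestamp_s, sent, lost):
        """Record one probe interval and return True if the alert should fire."""
        self.samples.append((timestamp_s, sent, lost))
        # Drop samples that have aged out of the window.
        while self.samples and self.samples[0][0] <= timestamp_s - self.window_s:
            self.samples.popleft()
        total_sent = sum(s for _, s, _ in self.samples)
        total_lost = sum(l for _, _, l in self.samples)
        if total_sent == 0:
            return False
        return 100 * total_lost / total_sent >= self.threshold_pct


old_rule = LossAlert(threshold_pct=15, window_s=300)
new_rule = LossAlert(threshold_pct=2, window_s=60)

# Hypothetical circuit losing 5 packets in every 100, probed every 10 seconds.
for t in range(0, 60, 10):
    print(t, old_rule.add(t, 100, 5), new_rule.add(t, 100, 5))
```

In this example the sustained 5% loss never trips the old rule but trips the new one on the first sample, which is the kind of degradation the tighter threshold is intended to surface.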

Systematic Testing Validation:

  • Zero packet loss achieved during 30-minute sustained testing at 300 Mbps load
  • 5,000-packet rapid ping tests confirmed circuit stability under operational conditions
  • Multiple health check protocols validated across all connection paths
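
To put the test figures above in context, the short Python sketch below works through the arithmetic. The helper names are hypothetical; the inputs (300 Mbps, 30 minutes, 5,000 packets) are taken from the results listed above.

```python
def transferred_gigabytes(rate_mbps, duration_min):
    """Data moved by a sustained test, in decimal gigabytes."""
    bits = rate_mbps * 1_000_000 * duration_min * 60
    return bits / 8 / 1_000_000_000


def loss_percent(sent, received):
    """Packet loss as a percentage of packets sent."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return 100 * (sent - received) / sent


def smallest_detectable_loss(sent):
    """Loss percentage represented by a single dropped packet."""
    return 100 / sent


print(transferred_gigabytes(300, 30))   # 67.5
print(loss_percent(5000, 5000))         # 0.0
print(smallest_detectable_loss(5000))   # 0.02
```

A 30-minute run at 300 Mbps moves about 67.5 GB, and a 5,000-packet ping test can resolve loss down to 0.02%, well below the 2% alerting threshold.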

Outcome

Network Reliability Improvements:

  • Eliminated Asymmetric Routing – BGP configuration prevents traffic splitting across multiple paths during failover events
  • Resolved Wireless Circuit Bottlenecks – Backup infrastructure no longer saturates during primary circuit outages
  • Enhanced Monitoring Capability – A 7.5-fold tighter packet loss alerting threshold (15% reduced to 2%), evaluated over 1 minute instead of 5, supports earlier intervention
  • Verified Circuit Performance – Systematic testing confirmed zero packet loss under realistic load conditions

Infrastructure Resilience Enhancements:

  • Coordinated Failover Behaviour – Router architecture recommendations prevent independent BGP advertisements causing routing conflicts
  • Proactive Issue Detection – Advanced monitoring system deployment provides comprehensive circuit health visibility
  • Systematic Testing Protocols – Established load testing procedures ensure backup systems perform under real-world conditions

Business Continuity Assurance:

  • Revenue Protection – Eliminated network bottlenecks that could impact e-commerce transaction processing during outages
  • Improved Customer Experience – Consistent network performance maintained across primary and backup infrastructure scenarios
  • Reduced Incident Duration – Enhanced monitoring enables faster problem identification and resolution

What Made the Difference

Systematic Technical Investigation Methodology:
Si Futures’ approach went beyond reactive troubleshooting to conduct comprehensive root cause analysis using industry-standard tools like RIPE BGPlay and systematic traffic pattern investigation. This proactive technical methodology identified architectural issues that traditional monitoring had missed.

Integration of Multiple Technical Disciplines:
The investigation combined network analysis, BGP routing expertise, infrastructure architecture assessment, and advanced monitoring deployment. This comprehensive technical approach addressed both immediate issues and underlying architectural weaknesses.

Evidence-Based Problem Solving:
Rather than making assumptions about network behaviour, Si Futures used quantifiable evidence from traffic graphs, BGP routing tables, and systematic load testing to identify the precise technical causes of performance degradation.

Proactive Monitoring Enhancement:
The implementation of aggressive alerting thresholds and advanced monitoring systems demonstrates Si Futures’ commitment to preventing future issues rather than simply reacting to problems after they impact business operations.

Client Impact

The systematic investigation prevented future revenue-threatening outages for this major e-commerce platform by identifying and resolving critical BGP routing architecture issues that would have caused identical problems during subsequent primary circuit failures.

“This level of technical investigation goes far beyond standard MSP troubleshooting. The systematic approach and comprehensive analysis prevented what could have been repeated infrastructure failures affecting our core business operations.”

This major e-commerce platform now benefits from enterprise-grade network resilience with coordinated failover behaviour, enhanced monitoring capabilities, and systematic testing protocols that ensure backup infrastructure performs reliably under operational load conditions.

The client’s e-commerce infrastructure is now protected against BGP routing asymmetry, with comprehensive monitoring ensuring early detection of any potential issues before they impact revenue-critical operations.

Protect Your E-commerce Infrastructure from BGP Routing Issues

Does your organisation rely on mission-critical network infrastructure for revenue generation? Si Futures’ systematic network investigation methodology can identify hidden architectural issues before they cause costly outages.

Our comprehensive approach combines BGP routing expertise, advanced monitoring deployment, and systematic testing protocols to ensure your backup infrastructure performs reliably when you need it most.

Contact our network infrastructure specialists to discuss how systematic network analysis can protect your critical business operations from hidden routing issues and infrastructure bottlenecks.

Nicholas Broderick
