Technical Details of the 4th October Facebook Outage (BGP)

  • The Border Gateway Protocol (BGP) allows Internet Service Providers (ISPs) and enterprise networks (like Facebook) to announce routes towards their IP prefixes.
  • Simply put, BGP is the protocol that ties the entire internet together.
Simple LAN
  • As you see above, in an organization with a flat network, devices (A and B) can connect to each other using local network addresses.
  • If the organization has several different LANs, they can connect through a router, as seen above. The router here uses various routing protocols, including internal BGP.
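To make the difference between a flat network and separate LANs concrete, here is a minimal Python sketch using the standard `ipaddress` module. The addresses, the /24 prefix length, and the helper name `needs_router` are assumptions chosen for illustration; they do not describe any real network.

```python
import ipaddress

def needs_router(addr_a, addr_b, prefix_len=24):
    """Return True if the two hosts sit on different LANs (subnets).

    Hypothetical helper: assumes both hosts use the same prefix length.
    """
    net_a = ipaddress.ip_interface(f"{addr_a}/{prefix_len}").network
    net_b = ipaddress.ip_interface(f"{addr_b}/{prefix_len}").network
    return net_a != net_b

# Devices A and B on the same flat network: they can talk directly.
print(needs_router("192.168.1.10", "192.168.1.20"))  # False

# Devices on different LANs: traffic has to go through a router.
print(needs_router("192.168.1.10", "192.168.2.20"))  # True
```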
An autonomous system comprising ISP and branches.
  • Large organizations (such as Facebook) and ISPs manage internet connectivity from multiple sites, and this whole network is called an Autonomous System (AS).
  • An AS can handle its own local traffic, but BGP is needed to handle traffic between different Autonomous Systems.
  • Routers running external BGP maintain a routing table, which is used to find the best path to the destination for network packets/traffic.
  • As a path-vector protocol, BGP has two types of messages: updates and withdraws.
  • A BGP update is used to advertise a path towards a prefix or a change in a previously announced path towards a prefix. (Adding or updating a path)
  • In layman’s terms, Facebook tells the BGP network how to reach the Facebook servers by publishing paths.
  • A BGP withdraw indicates that a previously announced prefix has become unreachable. (Deleting paths from routing tables)
  • In layman’s terms, Facebook’s servers tell the BGP network to withdraw, or delete, the paths.
  • To conclude: as we know, Facebook’s servers reached out to the BGP network and withdrew the paths that allow the internet to reach Facebook. Hence, routing tables had no information on how to reach the Facebook servers, and this caused the downtime.
  • The research paper explains some of the reasons for path withdrawal, but they are not limited to these. Two of the reasons are added below.
  • First, some interdomain links are unstable and fail frequently. Each of these failures causes the transmission of a number of BGP withdraws.
  • Second, as BGP relies on path vectors, it suffers from the path exploration problem when a route becomes unavailable. When a route fails, a new BGP convergence starts. During this convergence, routers may advertise paths that they consider valid although they are also affected by the failure. These paths will be withdrawn later, causing another exchange of BGP messages.
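The update and withdraw behaviour described in the list above can be sketched in Python as a toy routing table. This is a simplified illustration, not how a real BGP implementation works: the class `ToyRoutingTable`, its methods, the AS numbers, and the prefix (taken from the 192.0.2.0/24 documentation range) are all assumptions made for the example.

```python
import ipaddress

class ToyRoutingTable:
    """Hypothetical, heavily simplified model of one router's BGP table."""

    def __init__(self):
        # Maps a prefix (ip_network object) to the AS path announced for it.
        self.routes = {}

    def update(self, prefix, as_path):
        """Handle an update: add a path or replace the existing one."""
        self.routes[ipaddress.ip_network(prefix)] = list(as_path)

    def withdraw(self, prefix):
        """Handle a withdraw: the prefix is no longer reachable."""
        self.routes.pop(ipaddress.ip_network(prefix), None)

    def lookup(self, address):
        """Return the AS path of the most specific matching prefix, or None."""
        addr = ipaddress.ip_address(address)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None
        best = max(matches, key=lambda net: net.prefixlen)
        return self.routes[best]

table = ToyRoutingTable()

# An update publishes a path towards the prefix.
table.update("192.0.2.0/24", [64500, 64501])
print(table.lookup("192.0.2.10"))  # [64500, 64501]

# A withdraw removes it; the router no longer knows how to reach the address.
table.withdraw("192.0.2.0/24")
print(table.lookup("192.0.2.10"))  # None
```

Once the only path to a prefix is withdrawn, the lookup returns nothing, which mirrors how the rest of the internet lost its way to Facebook during the outage.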




Kaushal Prajapati

