Rainbow Services experienced disruptions from Wednesday, 24th June at 08:45PM CEST to Thursday, 25th June at 10:00AM CEST.
What Happened:
Incident time frame:
- From 08:45PM CEST to 11:05PM CEST: Network outage in one of our provider's datacenters, impacting Rainbow users worldwide.
- From 11:05PM CEST to 09:00AM (day+1) CEST: Restoration of Rainbow Services by our team.
- At 10:00AM CEST: Stability confirmed after one hour of monitoring.
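For reference, the length of each phase can be derived from the timestamps above. The following is a minimal sketch: the year is not stated in this report, so 2020 is assumed here (a year in which 24th June fell on a Wednesday), and the variable names are illustrative only. All times are CEST, so no time zone conversion is needed.

```python
from datetime import datetime, timedelta

YEAR = 2020  # assumption: the report does not state the year

# Timestamps taken from the incident time frame (all CEST).
outage_start = datetime(YEAR, 6, 24, 20, 45)
network_restored = datetime(YEAR, 6, 24, 23, 5)
services_restored = datetime(YEAR, 6, 25, 9, 0)
stability_confirmed = datetime(YEAR, 6, 25, 10, 0)

phases = {
    "network outage": network_restored - outage_start,
    "service restoration": services_restored - network_restored,
    "monitoring": stability_confirmed - services_restored,
    "total": stability_confirmed - outage_start,
}


def fmt(duration: timedelta) -> str:
    """Format a duration as hours and minutes, e.g. 2h20m."""
    minutes = int(duration.total_seconds()) // 60
    return f"{minutes // 60}h{minutes % 60:02d}m"


for name, duration in phases.items():
    print(f"{name}: {fmt(duration)}")
```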
Incident impact:
- [All Regions] From 08:45PM CEST to 11:05PM CEST: All Rainbow Services (Connection, Messaging, Bubbles, Conferences, Telephony Services, ...) down for all Rainbow users.
- [Regions: NA, CALA, APAC] From 11:05PM CEST to 11:50PM CEST: Restoration of services in a phased manner.
  - Web Call came back at 11:10PM CEST.
  - Web Conference came back at 11:10PM CEST.
  - Telephony services came back at 11:50PM CEST.
- [Region: EMEA] From 11:05PM CEST to 11:50PM CEST: Restoration of services in a phased manner.
  - Web Call came back at 11:10PM CEST.
  - Telephony services came back at 11:50PM CEST.
  - Bubble (messages, creation, deletion) came back at 09:00AM (day+1) CEST.
  - Web Conference came back at 09:00AM (day+1) CEST.
Incident description:
Our datacenter provider faced a network issue in an EMEA datacenter. This issue caused servers to intermittently lose internal network access. Although high availability is in place, the secondary datacenter did not take over the traffic.
Once access to the datacenter was restored, restarting the application caused a problem with a component involved in Bubbles management. This component took a very long time to start and to make Bubbles operational. The Rainbow teams did everything possible to alleviate the problem and shorten that time.
Corrective Measures:
- Make sure that Rainbow does not load all the Bubbles at start-up, to reduce the load on the infrastructure during a massive restart. This improvement is already planned for release on 10th July.
- Add statistics on the components linked to this outage to better understand their activity and better estimate restart times.
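To illustrate the two measures above, here is a minimal sketch of loading Bubbles on first access rather than at start-up, with basic counters to help estimate load times. It is not Rainbow's actual implementation: the class `LazyBubbleCache`, the loader `fake_fetch`, and the counter names are all hypothetical and exist only for this example.

```python
import time
from typing import Callable, Dict


class LazyBubbleCache:
    """Hypothetical cache that loads a bubble on first access, not at start-up."""

    def __init__(self, fetch: Callable[[str], dict]) -> None:
        self._fetch = fetch        # hypothetical loader, e.g. a storage read
        self._cache: Dict[str, dict] = {}
        self.load_count = 0        # statistic: number of bubbles actually loaded
        self.load_seconds = 0.0    # statistic: cumulative time spent loading

    def get(self, bubble_id: str) -> dict:
        # Load only when a bubble is requested for the first time.
        if bubble_id not in self._cache:
            started = time.perf_counter()
            self._cache[bubble_id] = self._fetch(bubble_id)
            self.load_seconds += time.perf_counter() - started
            self.load_count += 1
        return self._cache[bubble_id]


def fake_fetch(bubble_id: str) -> dict:
    # Stand-in for the real storage read (assumption for this sketch).
    return {"id": bubble_id, "messages": []}


cache = LazyBubbleCache(fake_fetch)  # start-up: nothing is loaded yet
cache.get("bubble-1")                # first access triggers one load
cache.get("bubble-1")                # second access is served from the cache
print(cache.load_count, "bubble(s) loaded")
```

With this approach, a massive restart only pays the loading cost for Bubbles that users actually open, and the counters give a rough basis for estimating restart times.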