A failure in an external Google API caused spikes in the error rate on our internal jobs API, and requests went unserved. This was a recurrence of a problem previously observed on Sept 6.
The high error rate from the failing API put sudden erroneous load on our career website platform. That load in turn triggered sudden scaling, which compounded the application delay.
Our developers issued a patch for the affected API and deployed it to production. The issue was resolved as soon as the new instances running our web platform were deployed.
09/09/19 11:54am US EDT - Google API failures began impacting career website performance.
09/09/19 12:09pm US EDT - Issue identified in the API and patch created to disable affected Google API calls.
09/09/19 12:20pm US EDT - Updated patch issued to production environment.
09/09/19 12:25pm US EDT - Cache invalidations initiated to apply patch on all sites.
09/09/19 12:36pm US EDT - Career site services restored.
The affected Google API was disabled, as it was not necessary for the core functionality in use.
A secondary mitigation was also deployed at the backend layer to prevent this API from impacting the application backend.
Monitors for this API have been tuned further to alert us in the event this error occurs in the future.
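
For illustration only, the two mitigations above (disabling a non-essential external call, and shielding the backend from its failures) might be sketched as follows. This is a minimal sketch under stated assumptions, not the patch that was actually deployed: the names CircuitBreaker and fetch_enrichment, the enabled flag, and the thresholds are all hypothetical.

```python
import time


class CircuitBreaker:
    """Hypothetical helper: skips a failing dependency for a cooldown
    period once it has failed max_failures times in a row."""

    def __init__(self, max_failures=3, cooldown_seconds=60.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock  # injectable so the cooldown can be tested
        self.failures = 0
        self.opened_at = None

    def is_open(self):
        """Return True while calls to the dependency should be skipped."""
        if self.opened_at is None:
            return False
        if self.clock() - self.opened_at >= self.cooldown_seconds:
            # Cooldown elapsed: allow calls again and reset the count.
            self.opened_at = None
            self.failures = 0
            return False
        return True

    def call(self, func, fallback):
        """Run func, returning fallback instead of raising on failure."""
        if self.is_open():
            return fallback
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback
        self.failures = 0
        return result


def fetch_enrichment(external_call, breaker, enabled=True, fallback=None):
    """Hypothetical wrapper around a non-essential external API call.

    enabled stands in for a feature flag: when False, the external API
    is never contacted and the fallback is returned immediately.
    """
    if not enabled:
        return fallback
    return breaker.call(external_call, fallback)
```

With the flag off, the external API is never called; with it on, repeated failures open the breaker so that later requests return the fallback immediately instead of adding erroneous load to the platform.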