<?xml version="1.0" encoding="UTF-8"?>
<feed xml:lang="en-US" xmlns="http://www.w3.org/2005/Atom">
  <id>tag:docevent.instatus.com,2005:/history</id>
  <link rel="alternate" type="text/html" href="https://docevent.instatus.com"/>
  <link rel="self" type="application/atom+xml" href="https://docevent.instatus.com/history.atom"/>
  <title>DocEvent Status - Incident history</title>
  <updated>2024-06-12T04:25:00.000+00:00</updated>
  <author>
    <name>DocEvent</name>
  </author>
  
<entry>
  <id>tag:docevent.instatus.com,2005:Incident/clxbxvmap8427b7ob8xd7ayn8</id>
  <published>2024-06-12T04:25:00.000+00:00</published>
  <updated>2024-06-12T04:25:00.000+00:00</updated>
  <link rel="alternate" type="text/html" href="https://docevent.instatus.com/incident/clxbxvmap8427b7ob8xd7ayn8"/>
  <title>Issue with downloads for all services except for S3</title>

  <content type="html">
  <![CDATA[
    <p><strong>Type:</strong> Incident</p>
    <p><strong>Duration:</strong> 5 hours and 55 minutes</p>
    <p><strong>Affected Components:</strong> SFS, SFS, SFS</p>
    <p><small>Jun <var data-var='date'> 12</var>, <var data-var='time'>04:25:00</var> GMT+0</small><br /><strong>Identified</strong> -
  An update was released as a canary deployment, initially to the us-east-1 region and then to ap-southeast-2 and eu-west-1. This caused occasional errors on GET requests for all services except S3.
  
The deployment was later rolled out more widely across all regions, and as the rollout progressed, more errors were found with GET requests. The release has now been rolled back, and the issue has been resolved.

The root cause of the issue in the deployment has since been found.  
  
This incident has been resolved.</p>
<p><small>Jun <var data-var='date'> 12</var>, <var data-var='time'>14:00:00</var> GMT+0</small><br /><strong>Resolved</strong> -
  This incident has been resolved.</p>

        ]]>
  </content>
</entry>

<entry>
  <id>tag:docevent.instatus.com,2005:Incident/clw0rrni1125024bmol0eho1424</id>
  <published>2024-05-10T14:25:20.103+00:00</published>
  <updated>2024-05-10T15:17:20.399+00:00</updated>
  <link rel="alternate" type="text/html" href="https://docevent.instatus.com/incident/clw0rrni1125024bmol0eho1424"/>
  <title>Issue in authentication which is affecting all regions</title>

  <content type="html">
  <![CDATA[
    <p><strong>Type:</strong> Incident</p>
    <p><strong>Duration:</strong> 1 hour and 8 minutes</p>
    <p><strong>Affected Components:</strong> SFS, Public API, SFS, SFS, Public API, Channels, Console, Console, Console, Public API</p>
    <p><small>May <var data-var='date'> 10</var>, <var data-var='time'>15:17:20</var> GMT+0</small><br /><strong>Monitoring</strong> -
  We implemented a fix and are currently monitoring the result.</p>
<p><small>May <var data-var='date'> 10</var>, <var data-var='time'>14:43:46</var> GMT+0</small><br /><strong>Identified</strong> -
  We have identified the issue and are upgrading some clusters to manage database load.</p>
<p><small>May <var data-var='date'> 10</var>, <var data-var='time'>14:50:30</var> GMT+0</small><br /><strong>Identified</strong> -
  We are continuing to work on a fix for this incident.</p>
<p><small>May <var data-var='date'> 10</var>, <var data-var='time'>15:12:32</var> GMT+0</small><br /><strong>Identified</strong> -
  The fix to upgrade the deployment is currently in progress.</p>
<p><small>May <var data-var='date'> 10</var>, <var data-var='time'>15:32:52</var> GMT+0</small><br /><strong>Resolved</strong> -
  This incident has been resolved.</p>
<p><small>May <var data-var='date'> 10</var>, <var data-var='time'>14:25:20</var> GMT+0</small><br /><strong>Investigating</strong> -
  We are currently investigating this incident.</p>

        ]]>
  </content>
</entry>

<entry>
  <id>tag:docevent.instatus.com,2005:Incident/cliuqtg7v97628avoijyekdudj</id>
  <published>2023-06-13T20:00:00.000+00:00</published>
  <updated>2023-06-13T20:00:00.000+00:00</updated>
  <link rel="alternate" type="text/html" href="https://docevent.instatus.com/incident/cliuqtg7v97628avoijyekdudj"/>
  <title>us-east-1 AWS outage affecting services</title>

  <content type="html">
  <![CDATA[
    <p><strong>Type:</strong> Incident</p>
    <p><strong>Duration:</strong> 46 minutes</p>
    <p><strong>Affected Components:</strong> Channels, Console, Public API</p>
    <p><small>Jun <var data-var='date'> 13</var>, <var data-var='time'>20:00:00</var> GMT+0</small><br /><strong>Investigating</strong> -
  The us-east-1 console and the verification service for self-hosted instances are currently unavailable. All cloud FTP and SFTP services remain available.

AWS is experiencing an outage in the us-east-1 region affecting AWS Lambda and API Gateway, which these services rely on.

https://health.aws.amazon.com/health/status</p>
<p><small>Jun <var data-var='date'> 13</var>, <var data-var='time'>20:45:41</var> GMT+0</small><br /><strong>Resolved</strong> -
  AWS has implemented a fix, and we see services reconnecting to our APIs now.

This incident has been resolved.</p>

        ]]>
  </content>
</entry>

<entry>
  <id>tag:docevent.instatus.com,2005:Incident/clip6r2c6228179bencq39cbvxq</id>
  <published>2023-06-09T23:00:00.000+00:00</published>
  <updated>2023-06-09T23:23:30.014+00:00</updated>
  <link rel="alternate" type="text/html" href="https://docevent.instatus.com/incident/clip6r2c6228179bencq39cbvxq"/>
  <title>Elasticsearch cluster issue</title>

  <content type="html">
  <![CDATA[
    <p><strong>Type:</strong> Incident</p>
    <p><strong>Duration:</strong> 1 hour and 32 minutes</p>
    <p><strong>Affected Components:</strong> SFS, Public API, SFS, SFS, Public API, Channels, Console, Console, Console, Public API</p>
    <p><small>Jun <var data-var='date'> 9</var>, <var data-var='time'>23:23:30</var> GMT+0</small><br /><strong>Investigating</strong> -
  Our provider elastic.co is currently experiencing an outage, which is preventing our instances from authenticating users logging in to our servers.

https://status.elastic.co/incidents/07bw653d2677?u=kpnld432cry6</p>
<p><small>Jun <var data-var='date'> 9</var>, <var data-var='time'>23:43:40</var> GMT+0</small><br /><strong>Identified</strong> -
  A new update from elastic.co:

We have confirmed a proxy outage for us-east-1 that impacts all communication to Elastic Cloud. A resolution is being worked on. Another update will be provided in 30m or sooner.</p>
<p><small>Jun <var data-var='date'> 10</var>, <var data-var='time'>00:15:10</var> GMT+0</small><br /><strong>Identified</strong> -
  New update from elastic:

The issue has been identified and a fix applied to a subset of proxies. Full rollout to all proxies is in progress.</p>
<p><small>Jun <var data-var='date'> 10</var>, <var data-var='time'>00:28:33</var> GMT+0</small><br /><strong>Monitoring</strong> -
  elastic.co has updated its us-east-1 proxies, and our monitoring is returning successful results for connectivity and access.</p>
<p><small>Jun <var data-var='date'> 10</var>, <var data-var='time'>00:32:26</var> GMT+0</small><br /><strong>Resolved</strong> -
  This incident has been resolved.</p>
<p><small>Jun <var data-var='date'> 9</var>, <var data-var='time'>23:00:00</var> GMT+0</small><br /><strong>Investigating</strong> -
  We are currently investigating this incident.</p>

        ]]>
  </content>
</entry>

<entry>
  <id>tag:docevent.instatus.com,2005:Incident/clhqih4s4192831i4ncqcxko6ic</id>
  <published>2023-05-16T16:34:00.000+00:00</published>
  <updated>2023-05-16T16:34:00.000+00:00</updated>
  <link rel="alternate" type="text/html" href="https://docevent.instatus.com/incident/clhqih4s4192831i4ncqcxko6ic"/>
  <title>Instance failures in eu-west-1</title>

  <content type="html">
  <![CDATA[
    <p><strong>Type:</strong> Incident</p>
    <p><strong>Duration:</strong> 25 minutes</p>
    <p><strong>Affected Components:</strong> SFS</p>
    <p><small>May <var data-var='date'> 16</var>, <var data-var='time'>16:34:00</var> GMT+0</small><br /><strong>Investigating</strong> -
  We are currently investigating this incident.</p>
<p><small>May <var data-var='date'> 16</var>, <var data-var='time'>16:49:00</var> GMT+0</small><br /><strong>Monitoring</strong> -
  We have updated instance capacity and are monitoring.</p>
<p><small>May <var data-var='date'> 16</var>, <var data-var='time'>16:59:00</var> GMT+0</small><br /><strong>Resolved</strong> -
  Our monitoring has verified that everything has returned to normal. The incident is now resolved.</p>

        ]]>
  </content>
</entry>

<entry>
  <id>tag:docevent.instatus.com,2005:Incident/cl3a4hj8o235131nqoaldsku0fo</id>
  <published>2022-05-17T08:27:00.000+00:00</published>
  <updated>2022-05-17T12:27:45.704+00:00</updated>
  <link rel="alternate" type="text/html" href="https://docevent.instatus.com/incident/cl3a4hj8o235131nqoaldsku0fo"/>
  <title>Degradations due to DoS</title>

  <content type="html">
  <![CDATA[
    <p><strong>Type:</strong> Incident</p>
    <p><strong>Duration:</strong> 4 hours and 1 minute</p>
    <p><strong>Affected Components:</strong> SFS</p>
    <p><small>May <var data-var='date'> 17</var>, <var data-var='time'>12:27:45</var> GMT+0</small><br /><strong>Resolved</strong> -
  We&#039;re marking this as resolved and will continue monitoring.</p>
<p><small>May <var data-var='date'> 17</var>, <var data-var='time'>08:27:00</var> GMT+0</small><br /><strong>Investigating</strong> -
  We are currently investigating this incident.  We have put mitigations in place, and our monitoring has shown degraded service during this period.

This means some data IPs were unavailable. This only affects SFTP connections that do not go through our load balancer / static IPs.</p>
<p><small>May <var data-var='date'> 17</var>, <var data-var='time'>08:50:00</var> GMT+0</small><br /><strong>Monitoring</strong> -
  Note: this affects only eu-west-1, not us-east-1 as previously tagged.</p>
<p><small>May <var data-var='date'> 17</var>, <var data-var='time'>08:45:00</var> GMT+0</small><br /><strong>Monitoring</strong> -
  We have been blocking traffic, rerouting, and monitoring.</p>

        ]]>
  </content>
</entry>

<entry>
  <id>tag:docevent.instatus.com,2005:Incident/ckxz7ut684949868vom4c2r6h5n</id>
  <published>2022-01-03T21:50:08.233+00:00</published>
  <updated>2022-01-03T21:50:08.233+00:00</updated>
  <link rel="alternate" type="text/html" href="https://docevent.instatus.com/incident/ckxz7ut684949868vom4c2r6h5n"/>
  <title>SFTP temporary connection failures</title>

  <content type="html">
  <![CDATA[
    <p><strong>Type:</strong> Incident</p>
    
    <p><strong>Affected Components:</strong> SFS</p>
    <p><small>Jan <var data-var='date'> 3</var>, <var data-var='time'>21:50:08</var> GMT+0</small><br /><strong>Resolved</strong> -
  We resolved an issue where SFTP and SCP connections were receiving errors. This was happening temporarily on certain instances. These instances have been ejected, and monitoring is continuing.</p>

        ]]>
  </content>
</entry>

<entry>
  <id>tag:docevent.instatus.com,2005:Incident/ckwxd3x7d85998moaqe5yezda</id>
  <published>2021-12-07T21:00:00.000+00:00</published>
  <updated>2021-12-07T21:00:00.000+00:00</updated>
  <link rel="alternate" type="text/html" href="https://docevent.instatus.com/incident/ckwxd3x7d85998moaqe5yezda"/>
  <title>AWS us-east-1 API outage</title>

  <content type="html">
  <![CDATA[
    <p><strong>Type:</strong> Incident</p>
    <p><strong>Duration:</strong> 5 days</p>
    <p><strong>Affected Components:</strong> Console, Public API</p>
    <p><small>Dec <var data-var='date'> 7</var>, <var data-var='time'>21:00:00</var> GMT+0</small><br /><strong>Investigating</strong> -
  Our login console and public API are experiencing failures.

This is due to an AWS outage in the us-east-1 region that affects our API Gateway services in this region.</p>
<p><small>Dec <var data-var='date'> 7</var>, <var data-var='time'>23:00:00</var> GMT+0</small><br /><strong>Resolved</strong> -
  Update provided by AWS:

[9:37 AM PST] We are seeing impact to multiple AWS APIs in the US-EAST-1 Region. This issue is also affecting some of our monitoring and incident response tooling, which is delaying our ability to provide updates. We have identified the root cause and are actively working towards recovery.

[10:12 AM PST] We are seeing impact to multiple AWS APIs in the US-EAST-1 Region. This issue is also affecting some of our monitoring and incident response tooling, which is delaying our ability to provide updates. We have identified root cause of the issue causing service API and console issues in the US-EAST-1 Region, and are starting to see some signs of recovery. We do not have an ETA for full recovery at this time.

[11:26 AM PST] We are seeing impact to multiple AWS APIs in the US-EAST-1 Region. This issue is also affecting some of our monitoring and incident response tooling, which is delaying our ability to provide updates. Services impacted include: EC2, Connect, DynamoDB, Glue, Athena, Timestream, and Chime and other AWS Services in US-EAST-1. The root cause of this issue is an impairment of several network devices in the US-EAST-1 Region. We are pursuing multiple mitigation paths in parallel, and have seen some signs of recovery, but we do not have an ETA for full recovery at this time. Root logins for consoles in all AWS regions are affected by this issue, however customers can login to consoles other than US-EAST-1 by using an IAM role for authentication.

[12:34 PM PST] We continue to experience increased API error rates for multiple AWS Services in the US-EAST-1 Region. The root cause of this issue is an impairment of several network devices. We continue to work toward mitigation, and are actively working on a number of different mitigation and resolution actions. While we have observed some early signs of recovery, we do not have an ETA for full recovery. For customers experiencing issues signing-in to the AWS Management Console in US-EAST-1, we recommend retrying using a separate Management Console endpoint (such as https://us-west-2.console.aws.amazon.com/). Additionally, if you are attempting to login using root login credentials you may be unable to do so, even via console endpoints not in US-EAST-1. If you are impacted by this, we recommend using IAM Users or Roles for authentication. We will continue to provide updates here as we have more information to share.

[2:04 PM PST] We have executed a mitigation which is showing significant recovery in the US-EAST-1 Region. We are continuing to closely monitor the health of the network devices and we expect to continue to make progress towards full recovery. We still do not have an ETA for full recovery at this time.

[2:43 PM PST] We have mitigated the underlying issue that caused some network devices in the US-EAST-1 Region to be impaired. We are seeing improvement in availability across most AWS services. All services are now independently working through service-by-service recovery. We continue to work toward full recovery for all impacted AWS Services and API operations. In order to expedite overall recovery, we have temporarily disabled Event Deliveries for Amazon EventBridge in the US-EAST-1 Region. These events will still be received &amp; accepted, and queued for later delivery.

[3:03 PM PST] Many services have already recovered, however we are working towards full recovery across services. Services like SSO, Connect, API Gateway, ECS/Fargate, and EventBridge are still experiencing impact. Engineers are actively working on resolving impact to these services.

[4:35 PM PST] With the network device issues resolved, we are now working towards recovery of any impaired services. We will provide additional updates for impaired services within the appropriate entry in the Service Health Dashboard.</p>

        ]]>
  </content>
</entry>

<entry>
  <id>tag:docevent.instatus.com,2005:Incident/cksibgpu027085381oilwwqoy14</id>
  <published>2021-08-19T02:39:22.200+00:00</published>
  <updated>2021-08-19T02:43:40.992+00:00</updated>
  <link rel="alternate" type="text/html" href="https://docevent.instatus.com/incident/cksibgpu027085381oilwwqoy14"/>
  <title>Issues connecting, auto disconnecting (us-east-1, ap-southeast-2)</title>

  <content type="html">
  <![CDATA[
    <p><strong>Type:</strong> Incident</p>
    <p><strong>Duration:</strong> 13 minutes</p>
    <p><strong>Affected Components:</strong> SFS, SFS</p>
    <p><small>Aug <var data-var='date'> 19</var>, <var data-var='time'>02:43:40</var> GMT+0</small><br /><strong>Identified</strong> -
  It appears that a recent patch caused the us-east-1 and ap-southeast-2 regions to randomly disconnect users. We have identified the issue and are rolling out a fix.

The us-east-1 region is now resolved.</p>
<p><small>Aug <var data-var='date'> 19</var>, <var data-var='time'>02:50:39</var> GMT+0</small><br /><strong>Monitoring</strong> -
  A fix for ap-southeast-2 has been implemented and released.</p>
<p><small>Aug <var data-var='date'> 19</var>, <var data-var='time'>02:39:22</var> GMT+0</small><br /><strong>Investigating</strong> -
  We are currently investigating this incident.</p>
<p><small>Aug <var data-var='date'> 19</var>, <var data-var='time'>02:52:30</var> GMT+0</small><br /><strong>Resolved</strong> -
  We just resolved the issue!</p>

        ]]>
  </content>
</entry>

</feed>