At 11:05 we deployed a misconfiguration to a new staging-environment for collaboration services in DrPublish, which affected production. The misconfiguration have been resolved for good, and we have made sure a similar effect from staging to production cannot happen again for this reason in the future.
While issuing a fix at 11:10, we made a separate mistake and rolled back to a version with a known bug. This affected users until 11:34. At 11:34 the situation was permanently resolved, and all users were instructed to reload at 11:36.
In the bigger picture, we see that our introduction of a new technology, while having solved many problems for users in the daily, has caused some procedural changes in our development and operations team, which have caused two separate incidents (see also incident 12. February https://www.aptomastatus.com/incidents/657h91dlfj9d)
We are, and will still be, working to establish robust development and operations workflows to ensure intermittent problems caused by us will not persist over time.
We are sorry for the problems caused.