It is important that SMVS has both high performance and high availability so that medicines can be distributed to the public without delay. e-VIS therefore measures and continuously monitors the system’s availability and performance. According to Commission Delegated Regulation (EU) 2016/161 (Article 35), the response time, regardless of the speed of the internet connection, shall be less than 300 milliseconds for at least 95% of the requests.
The diagrams below show:
- The proportion of requests (in %) completed within 300 milliseconds, averaged over a month and split into verifications and status updates on packs.
- The same percentage of requests over the last 6 months.
- The average response times in milliseconds for verifications and status updates on packs over the last 30 days.
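As an illustration of how the 95% requirement can be checked against measured response times, here is a minimal Python sketch. It is not the method e-VIS uses for its diagrams. The function names and the sample values are hypothetical; only the 300 millisecond limit and the 95% share come from the text above.

```python
from typing import Iterable

THRESHOLD_MS = 300   # response-time limit cited from Article 35
TARGET_SHARE = 0.95  # at least 95% of requests must be faster than the limit


def share_within_threshold(response_times_ms: Iterable[float],
                           threshold_ms: float = THRESHOLD_MS) -> float:
    """Return the fraction of requests that completed in less than threshold_ms."""
    times = list(response_times_ms)
    if not times:
        raise ValueError("no response times supplied")
    fast = sum(1 for t in times if t < threshold_ms)
    return fast / len(times)


def meets_target(response_times_ms: Iterable[float],
                 threshold_ms: float = THRESHOLD_MS,
                 target_share: float = TARGET_SHARE) -> bool:
    """True if the share of fast requests reaches the target share."""
    return share_within_threshold(response_times_ms, threshold_ms) >= target_share


# Hypothetical sample of measured response times in milliseconds.
sample_ms = [120, 180, 250, 299, 310, 95, 140, 205, 280, 150]
share = share_within_threshold(sample_ms)
print(f"{share:.0%} within {THRESHOLD_MS} ms; target met: {meets_target(sample_ms)}")
```

In this sample one request out of ten exceeds the limit, so the share is 90% and the target is not met.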
2023-01-25: DEGRADED PERFORMANCE AND POSSIBLE TIMEOUTS ON SMVS
Starting at 07:05 UTC on 25 January 2023, Microsoft reported that customers might experience issues with networking connectivity, manifesting as network latency and/or timeouts when attempting to connect to Azure resources in Public Azure regions, as well as to other Microsoft services including M365 and Power BI. It was determined that a network connectivity issue was occurring with devices across the Microsoft Wide Area Network (WAN). This impacted connectivity from clients on the internet to Azure, as well as connectivity between services in data centres. The issue caused impact in waves, peaking approximately every 30 minutes. Microsoft identified a recent WAN update as the likely underlying cause and took steps to roll back this update.
Timeline of incident (CET)
- At 08:05 on 25 January 2023, Microsoft informed IT-suppliers that customers might experience issues with networking connectivity.
- At 09:30 e-VIS was made aware that Microsoft had global problems and started monitoring SMVS.
- At 09:52 an end-user contacted e-VIS about problems connecting to SMVS.
- At 10:01 e-VIS informed all end-users and end-user IT-suppliers about the ongoing incident.
- At 11:40 Microsoft had identified the issue and was remediating it. A recent WAN update was identified as the likely underlying cause, and Microsoft took steps to roll back this update. After the remediating actions were performed, multiple regions and services showed signs of recovery, and Microsoft continued to actively monitor the situation.
- At 12:30 Microsoft stated that the issue was fully mitigated.
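The incident summary gives the start time in UTC while the timeline uses CET. The following Python sketch shows the conversion between the two. It assumes a fixed UTC+1 offset, which holds for CET in winter but not during summer time (CEST), and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Fixed winter offset for CET (UTC+1). Assumption: no daylight saving in effect.
CET = timezone(timedelta(hours=1), name="CET")


def utc_to_cet(utc_time: datetime) -> datetime:
    """Convert a UTC timestamp to CET; naive datetimes are treated as UTC."""
    if utc_time.tzinfo is None:
        utc_time = utc_time.replace(tzinfo=timezone.utc)
    return utc_time.astimezone(CET)


# Start of the incident as stated in the summary (07:05 UTC).
start_utc = datetime(2023, 1, 25, 7, 5, tzinfo=timezone.utc)
print(utc_to_cet(start_utc).strftime("%Y-%m-%d %H:%M %Z"))
```

The result, 08:05 CET, matches the first entry in the timeline.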
2022-09-07: DEGRADED PERFORMANCE AND POSSIBLE TIMEOUTS ON SMVS
At 12:15 it was first noticed that monitoring tools showed failed requests. Solidsoft Reply investigated and found that all national systems hosted in the Microsoft Azure environment were impacted. A call was logged with Microsoft to investigate possible Microsoft issues, and at 14:20 Microsoft confirmed an Azure CosmosDB problem in Northern Europe that caused degraded performance and possible timeouts. Microsoft resolved the problem by applying mitigations. At 19:30 all monitoring tools showed successful requests and SMVS was back to full service and performance.
Timeline of incident (CEST)
- At 12:15: The incident was first identified when monitoring tools showed failed requests. Investigation started immediately, and analysis showed that all national systems hosted in Microsoft Azure were impacted. Decreased CosmosDB performance was suspected as the cause, and further investigation was performed.
- At 13:30: A call was logged with Microsoft to investigate possible Microsoft issues.
- At 14:20: Microsoft confirmed an Azure CosmosDB problem in Northern Europe that may result in timeouts.
- At 14:53: Work continued with Microsoft on resolving the problem, and an improvement was noted in the latency of the CosmosDB in Northern Europe. However, end-user requests were still timing out intermittently.
- At 17:00: Services gradually improved hour by hour as Microsoft continued to apply its mitigations.
- At 19:30: All monitoring showed successful requests and national systems were stable and operating in full service.
2022-01-18: NATIONAL SYSTEMS UNABLE TO RENEW TOKENS
At 23:53 on Monday 17th January, one of the two Firewalls within the shared infrastructure environment of the National Systems partially failed. This impacted all national e-verification systems using Solidsoft Reply as IT-provider. The Load Balancer, which distributes transactions between the two Firewalls, continued to pass transactions to the failed Firewall because the failure did not affect the Load Balancer's communications connection. However, the failure did prevent the affected server from passing these transactions on to SMVS.
The problem was escalated to a high priority incident the next morning as NMVOs and end-users reported it. The failing Firewall was restarted, after which the processes were performed successfully again.
- At 23:53 on Monday 17th January one of the two Firewalls within the shared infrastructure environment of the National Systems partially failed.
- At 06:59 on Tuesday 18th January 2022 a ticket was logged by one of the NMVOs concerning access to both the IQE and PRD National System environments. Attempts to access the NMVO Portal succeeded only intermittently; the majority of attempts returned an error.
- At 08:10 the incident was escalated to a high priority incident and Solidsoft Reply created a team to investigate the cause.
- At 08:23 a Swedish end-user called e-VIS to report problems when trying to verify and decommission packs.
- At 09:01 following investigations by Solidsoft Reply it was identified that one of the Firewalls was not processing requests and a process to restart the firewall was implemented.
- At 09:19 the firewall was restarted. Solidsoft Reply Operations monitored the system, and the monitoring indicated that processes were being performed successfully.
- At 09:50 the incident was marked as resolved and all of the impacted NMVOs were notified.
- At 10:29 e-VIS informed the Swedish end-users and their IT-suppliers that the incident had been resolved.
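The failure mode in this incident, where the Load Balancer kept routing to a firewall that was reachable but no longer forwarding traffic, can be illustrated with a small Python simulation. This is a sketch only and does not describe the actual Solidsoft Reply infrastructure or any remediation they applied. The `Firewall` class, the check functions and the firewall names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Firewall:
    """Hypothetical model of a firewall sitting behind a load balancer."""
    name: str
    link_up: bool = True    # can the load balancer connect to it?
    forwards: bool = True   # does it pass transactions on to the backend?


def link_check(fw: Firewall) -> bool:
    """Shallow check: only verifies the connection to the firewall."""
    return fw.link_up


def end_to_end_check(fw: Firewall) -> bool:
    """Deeper check: also verifies that traffic reaches the backend."""
    return fw.link_up and fw.forwards


def healthy_targets(firewalls: List[Firewall],
                    check: Callable[[Firewall], bool]) -> List[str]:
    """Names of the firewalls that a load balancer using `check` would route to."""
    return [fw.name for fw in firewalls if check(fw)]


# One healthy firewall and one that is reachable but no longer forwarding.
firewalls = [Firewall("fw-a"), Firewall("fw-b", link_up=True, forwards=False)]

print(healthy_targets(firewalls, link_check))        # shallow check keeps fw-b in rotation
print(healthy_targets(firewalls, end_to_end_check))  # deeper check removes fw-b
```

With the shallow check both firewalls stay in rotation, which corresponds to the intermittent errors reported during the incident. The end-to-end check takes the partially failed firewall out of rotation.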
2021-03-15: MICROSOFT AZURE ACTIVE DIRECTORY OUTAGE
Starting at 19:15 UTC on the 15th of March 2021 Microsoft Azure Active Directory Services experienced an outage resulting in Azure authentication errors.
The SMVS services that were already logged in continued to work as expected. The incident was global and it was expected that Microsoft would resolve the issue promptly (within hours).
The problem escalated the next morning, and end users experienced connectivity issues due to the global Microsoft incident. At 10:28 Microsoft reported that the incident was resolved, and by 11:00 end users in Sweden and other countries started reporting successful connectivity. Some intermittent issues still occurred during the day, and the IT-supplier Solidsoft restarted SMVS to fully resolve the issue.
- At 20:15 on the 15th of March 2021 Microsoft Azure Active Directory Services experienced an outage.
- At 08:15 on the 16th of March 2021 Microsoft updated the Azure Status page to show that errors were being experienced in attempts to access storage across all Azure regions. Over the following hours, several end users in different countries reported connectivity issues and failure to renew tokens.
- At 09:31 the Swedish end-users and their IT-suppliers were informed about the ongoing incident.
- At 10:28 Microsoft stated that the Azure Active Directory problem had been resolved and that it had been caused by an unsuccessful key rotation performed by Microsoft. By 11:00 end users were reporting successful connectivity and only intermittent connection issues.
2021-01-01: EU DOMAIN SUSPENDED
SMVS and the EU Hub were up and functioning but could not be contacted because the URLs under *.nmvo.eu failed to resolve. This was identified to be due to the EURid registry suspending the domain.
The EURid registry suspended all domains registered to a UK address on 1st January due to Brexit. It is no longer possible to have a .eu domain registered to a UK address.
As the *.nmvo.eu address was still registered to Solidsoft Reply’s office in the UK (Basingstoke), the domain was suspended.
Solidsoft Reply did not receive the domain registrars' notification that action was required, because the registrars were using a legacy email address.
Once the issue was identified, the registration address was updated (to Solidsoft Reply’s Italian Head Office address) and the domain registrars were contacted. An emergency change request was submitted to the EURid registry, which approved removal of the suspension at approximately 13:00 (UK time).
As a preventive measure, all .eu domain entries have been updated with the Reply corporate address in Italy, and the email address registered with the domain registrars has been updated to that of the Solidsoft Service Desk.
- At 01.00 (00.00 UK Time) on the 1st of January 2021 the *.nmvo.eu domain was suspended and no requests could reach SMVS.
- At 12.44 the same day the Swedish end users and their IT-suppliers were informed about the ongoing incident.
- At 14.24 the same day the problem was resolved, and the Swedish end users and their IT-suppliers were informed about this at 14.54.
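During this incident the services themselves were running, but their hostnames could not be resolved. A simple resolution check of the kind sketched below in Python would detect that condition. It is an illustration only, not a description of the monitoring used by e-VIS or Solidsoft Reply. The function names and hostnames are hypothetical, and the resolver is injectable so that the example runs without network access.

```python
import socket
from typing import Callable, Iterable, List


def resolves(hostname: str, resolver: Callable = socket.getaddrinfo) -> bool:
    """Return True if the hostname can be resolved, False on a DNS failure."""
    try:
        resolver(hostname, 443)
        return True
    except socket.gaierror:
        return False


def unresolved(hostnames: Iterable[str],
               resolver: Callable = socket.getaddrinfo) -> List[str]:
    """Return the hostnames that currently fail to resolve."""
    return [h for h in hostnames if not resolves(h, resolver)]


def fake_resolver(host: str, port: int):
    """Stand-in resolver for the demo: names under a suspended domain fail."""
    if host.endswith(".suspended.example"):
        raise socket.gaierror("name or service not known")
    return [("resolved", host, port)]


# Hypothetical hostnames; a real check would list the service's own endpoints.
print(unresolved(["api.suspended.example", "www.example.org"], fake_resolver))
```

Calling `unresolved` without the second argument uses the system resolver through `socket.getaddrinfo`.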