Akamai EAA Impersonation Vulnerability - A Deep Dive
In this post, we cover the technical details of CVE-2021-28091, the vulnerability impacting Akamai's Enterprise Application Access (EAA) platform. We cover our investigation, remediation and disclosure process for the vulnerability. For an overview of the vulnerability, the impact to Akamai, the impact to EAA customers and actions required, please see our companion report.
In this section, we will walk you through the history and anatomy of this vulnerability. Some readers may wish to skip this section for now and go directly to the Actions Required section, using this Overview for reference in any assessments that they need to conduct or for future reviews.
Before Akamai acquired the EAA technology through its 2016 acquisition of Soha Systems, a key feature was introduced to the platform that allows customers to make access control and authentication decisions based on identity information provided by a third-party identity provider. The EAA platform offers multiple methods for third-party identity integration. The notable method for this report is support for the Security Assertion Markup Language (SAML) v2.0 authentication protocol.
SAML is a widely used, open standard. It allows an Identity Provider (IdP) to assert a user's identity to a Service Provider (SP): the IdP cryptographically signs an assertion object and returns it to the client, and the client presents that assertion to the SP within a defined time period. (See the Background section below for more information.)
When third-party IdP support was added to EAA, the developers selected the open source library Lasso to implement SAML support within the platform. Based on Akamai's assessment of the code where Lasso verifies the SAML responses provided to it as a SP, we believe that at the time of initial integration, the developers implementing the third-party IdP feature did so in a reasonable way based on the test cases provided with the library. Further investigation revealed that the test suite used to exercise Akamai's implementation was not rigorous enough to identify this impersonation vulnerability or similar weaknesses in the authentication process. This shortcoming has been addressed as part of our response by expanding the test suite applied to new releases of the product to include all combinations of valid and invalidly signed responses and/or assertions as well as unsigned assertions and responses. These new tests are part of the standard QA process going forward.
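As a rough illustration of what "all combinations" means for such a test suite, the sketch below enumerates the signature states a response and its assertions can take. The state names, the `signature_matrix` helper, and the limit of two assertions per response are assumptions made for this example; they are not taken from Akamai's actual QA suite.

```python
from itertools import product

# Hypothetical labels for the three signature states discussed in this report.
STATES = ("valid", "invalid", "unsigned")


def signature_matrix(max_assertions=2):
    """Yield (response_state, assertion_states) for 1..max_assertions assertions."""
    for count in range(1, max_assertions + 1):
        for response_state in STATES:
            for assertion_states in product(STATES, repeat=count):
                yield response_state, assertion_states


cases = list(signature_matrix())
print(len(cases), "signature combinations to exercise")
```

Each generated case would then be turned into a concrete SAML response and submitted to the SP under test, with the expected accept or reject outcome recorded alongside it.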
In the following sections of the report, we break down the various weaknesses and contributing factors which made up the overall vulnerability. Our goal in providing this level of transparency is to help others understand the steps taken by Akamai and to allow them to avoid similar circumstances in their own environments.
System testing and assessment
Unit tests, integration tests, and regression tests are a critical aspect of any software development lifecycle (SDLC). While the sub-component which was implemented did have all of these testing methods associated with it, we have clearly learned that the tests were not rigorous enough. Additionally, this incident has illuminated an oversight where some third-party libraries are incorporated into projects under the false assumption that the SDLC of the dependency is itself rigorous and will be informed by domain-specific discoveries such as vulnerabilities in similar libraries.
While a rigorous SDLC for each component and each of its dependencies is necessary, often the testing incorporated in the component development and Quality Assurance (QA) plan is not sufficient. To supplement this testing, adversarial assessments, such as penetration tests or third party code reviews, can be employed. In the case of EAA, multiple external security and vulnerability assessments of the EAA platform have been conducted over the lifetime of the product, often by customers. Despite this, the report that started this response was the first time that this vulnerability had been reported to Akamai. Akamai has conducted targeted assessments against other portions of the EAA platform and its client application, but this specific component has not been subject to that level of scrutiny.
Avoiding premature disclosure
Early in this incident response process, Akamai started to write a high level customer notification to guide customers to start the investigations suggested in the Actions Required section in the companion post. In a pre-publication review of that document, Akamai staff who were not informed about the vulnerability were provided a copy of the customer-facing messaging. Within an hour of the message being provided, our reviewers were able to identify the protocol affected (SAML), affected package (Lasso), and with some recent activity from the Lasso project maintainer, a guess at what the vulnerability was. This revelation put an immediate pause to our partial notification plans to abide by the principles of responsible disclosure.
After further conversations with our reviewers, the incident team was able to learn the process by which they made these discoveries. A key finding reported by the reviewers was the error message returned by the IdP when an error condition occurred. Up until the fix for the vulnerability was released, SAML failures would return an error page which exposed the Lasso error to the end user, as seen in the image below. Forwarding an error, especially for critical security processes like authentication, is counter to best practices, which is why the error will not be visible to the end users starting in this release.
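One common way to avoid forwarding library errors is to log the detail server-side and show the end user only a generic message with a reference code. The sketch below assumes a hypothetical `render_auth_failure` helper and logger name; it is not the EAA implementation.

```python
import logging
import uuid

# Hypothetical logger name; the real service's logging setup is not described here.
logger = logging.getLogger("sp.auth")

GENERIC_MESSAGE = "Authentication failed. Please contact your administrator."


def render_auth_failure(internal_error):
    """Log the detailed error privately and return a generic user-facing message."""
    # The reference code lets support staff find the log entry without
    # revealing the library name or failure reason to the end user.
    incident_id = uuid.uuid4().hex
    logger.warning("SAML verification failed [%s]: %r", incident_id, internal_error)
    return f"{GENERIC_MESSAGE} (reference: {incident_id})"


message = render_auth_failure(ValueError("lasso: signature verification failed"))
```

The design choice is that the detail needed for debugging stays in the logs, while the page shown to the user carries nothing that identifies the protocol, the library, or the failing check.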
After Akamai's engineers had identified the weakness in the Lasso library, a targeted review of the Lasso codebase was undertaken. Before a report was provided to the maintainers of the library, the engineering team was able to recreate the vulnerability using none of our application specific code. The patch applied by the maintainer can be found here.
In coordination with the Lasso maintainers, Akamai reserved CVE ID CVE-2021-28091. The associated CVSS score for the CVE ID published is 8.2. Also in coordination with the Lasso maintainers, Akamai reported this issue to CERT/CC who ran the coordinated disclosure process.
To fully understand this issue, a working understanding of the SAML authentication process is helpful. An approachable introduction to this topic is The Beer Drinker's Guide to SAML published by DUO.
At the center of all of this lies the weakness which could be exploited by the attacker. To explain the issue in further detail, we start by covering how a SAML response is interpreted in the patched version of the library. We then discuss the cases where the weakness could be used to impersonate another user.
After a user authenticates to a SAML IdP, the IdP returns the SAML response to the SP through a method which is pre-negotiated by the SP and IdP administrators. Often this is achieved by using the client as an intermediary. The SP verifies that the client is authorized through this SAML response.
A SAML response is an XML document which will have roughly this form:
    <samlp:Response>
      <saml:Issuer>http://idp.example.com/metadata.php</saml:Issuer>
      <saml:Assertion>
        <saml:Issuer>http://idp.example.com/metadata.php</saml:Issuer>
        <ds:Signature> ... Assertion Signature ... </ds:Signature>
        <saml:AttributeStatement>
          <saml:Attribute Name="uid" NameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:basic">
            <saml:AttributeValue xsi:type="xs:string">test</saml:AttributeValue>
          </saml:Attribute>
          <saml:Attribute Name="eduPersonAffiliation" NameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:basic">
            <saml:AttributeValue xsi:type="xs:string">users</saml:AttributeValue>
          </saml:Attribute>
        </saml:AttributeStatement>
      </saml:Assertion>
    </samlp:Response>
The above XML document has been simplified for the purposes of this report, but the structure is the same. The outer, 'parent' document is the SAML response, which includes metadata about the request and an Assertion document. The Assertion, also called a SAML Assertion, is the data being provided from the IdP to the SP for use in the authentication process. Multiple assertions may be present in a single SAML response. In the above example, the content of the ds:Signature element is a cryptographic signature over the contents of its parent object, which is the Assertion in this case. The same kind of signature object can also be applied to the entire response object. The purpose of the signature is to allow the SP to validate that the data contained in the Assertion or response is legitimate and was provided by the IdP. In the case of the Assertion, the signature applies only to the data contained within that Assertion, such as a username, an email address, or group membership indications. Signatures applied at the response level apply to the full contents of the response and all assertions therein.
Verification of the various signatures in the SAML response is entrusted to the SP and is often configured at the time that the IdP is configured to communicate with the application. In our response to this issue, we believe that the default verification conditions for SAML responses should be as follows.
- When the entire SAML response is validly signed, all of the assertions in the response must be correctly signed or have no signature. If any invalid signatures are found, the verification must fail. This method relies on the IdP to be authoritative for the entire message body, which is signed.
- When the SAML response is unsigned, all assertions in the response must be correctly signed, otherwise the verification must fail.
- When the SAML response has an invalid signature, the verification must fail.
The above processing conditions are what Akamai's proposed patch to Lasso implemented.
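The three conditions above can be expressed as a small decision function. This is a minimal sketch, not the patch itself: the `Sig` enum and `verify_response` function are hypothetical names, signature checking is assumed to have already produced one state per element, and rejecting a response with no assertions is an added assumption.

```python
from enum import Enum
from typing import Sequence


class Sig(Enum):
    """Hypothetical signature states for a response or an assertion."""
    VALID = "valid"
    INVALID = "invalid"
    MISSING = "missing"


def verify_response(response_sig: Sig, assertion_sigs: Sequence[Sig]) -> bool:
    """Return True only when the conditions listed above are satisfied."""
    # Assumption: a response without any assertion has nothing to authenticate.
    if not assertion_sigs:
        return False
    # An invalid signature anywhere in the document fails verification.
    if response_sig is Sig.INVALID or Sig.INVALID in assertion_sigs:
        return False
    if response_sig is Sig.VALID:
        # The response signature covers the whole body, so each assertion
        # may be validly signed or unsigned.
        return True
    # Unsigned response: every assertion must carry its own valid signature.
    return all(sig is Sig.VALID for sig in assertion_sigs)


# An unsigned response with one signed and one unsigned assertion is rejected.
print(verify_response(Sig.MISSING, [Sig.VALID, Sig.MISSING]))
```

Note that every assertion is inspected, so the outcome does not depend on the order in which assertions appear in the response.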
The report provided to Akamai at the start of this issue showed the researcher submitting two SAML assertions in a single SAML response: the first was validly signed, but the second was unsigned. At the time, the default configuration for Lasso applied the following verification conditions.
- If the first SAML assertion in the response was validly signed, the verification passed, without regard for the full SAML response signature being valid or not.
- If the first SAML assertion was invalidly signed, the verification failed.
- If the SAML response was validly signed and none of the assertions were signed, the verification passed.
- Otherwise the verification would fail.
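Read literally, the four conditions above amount to the following decision function. This is a model of the described behavior only, written for illustration; `legacy_verify` and the string state labels are hypothetical and this is not Lasso's code.

```python
from typing import Sequence


def legacy_verify(response_sig: str, assertion_sigs: Sequence[str]) -> bool:
    """Model of the pre-patch conditions; states are 'valid', 'invalid', 'missing'."""
    first = assertion_sigs[0] if assertion_sigs else None
    # Only the first assertion's signature is consulted; the response
    # signature and any later assertions are ignored on this path.
    if first == "valid":
        return True
    if first == "invalid":
        return False
    # A validly signed response with no signed assertions also passes.
    if response_sig == "valid" and all(s == "missing" for s in assertion_sigs):
        return True
    return False


# The reported case: first assertion signed, second assertion unsigned.
print(legacy_verify("missing", ["valid", "missing"]))
```

The key point the model makes visible is that the decision depends on the position of the signed assertion rather than on every assertion being covered by a valid signature.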
To complicate matters, when the response was deemed valid by the library, the function to retrieve the assertion from the SAML response would return the last assertion in the response object, irrespective of it having a valid signature. By way of example, say an attacker obtains a valid SAML response with a single assertion from an IdP, like the one above, and adds the following as a second assertion:
    <saml:Assertion>
      <saml:AttributeStatement>
        <saml:Attribute Name="uid" NameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:basic">
          <saml:AttributeValue xsi:type="xs:string">superuser</saml:AttributeValue>
        </saml:Attribute>
        <saml:Attribute Name="eduPersonAffiliation" NameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:basic">
          <saml:AttributeValue xsi:type="xs:string">admins</saml:AttributeValue>
        </saml:Attribute>
      </saml:AttributeStatement>
    </saml:Assertion>
If the user named in the added assertion is a valid user in the organization with more privileges, the attacker would then have the following combined SAML response to submit to the SP:
    <samlp:Response>
      <saml:Issuer>http://idp.example.com/metadata.php</saml:Issuer>
      <saml:Assertion>
        <ds:Signature> ... Assertion Signature ... </ds:Signature>
        <saml:AttributeStatement>
          <saml:Attribute Name="uid" NameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:basic">
            <saml:AttributeValue xsi:type="xs:string">test</saml:AttributeValue>
          </saml:Attribute>
          <saml:Attribute Name="eduPersonAffiliation" NameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:basic">
            <saml:AttributeValue xsi:type="xs:string">users</saml:AttributeValue>
          </saml:Attribute>
        </saml:AttributeStatement>
      </saml:Assertion>
      <saml:Assertion>
        <saml:AttributeStatement>
          <saml:Attribute Name="uid" NameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:basic">
            <saml:AttributeValue xsi:type="xs:string">superuser</saml:AttributeValue>
          </saml:Attribute>
          <saml:Attribute Name="eduPersonAffiliation" NameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:basic">
            <saml:AttributeValue xsi:type="xs:string">admins</saml:AttributeValue>
          </saml:Attribute>
        </saml:AttributeStatement>
      </saml:Assertion>
    </samlp:Response>
When Lasso attempted to validate this SAML response, the result would be that the response was valid. When the calling application retrieved the assertion from the above response, the assertion with the User ID (uid) of superuser would be returned and likely assumed to be a valid assertion. In addition to the example shown above, if the SAML response itself had a valid signature, this same method of impersonation would be possible. This was the case with EAA's processing of SAML responses.
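The mismatch between the assertion that was checked and the assertion that was returned can be seen by parsing a reduced version of the combined response. The sketch below uses Python's standard `xml.etree.ElementTree` purely for illustration; the namespace declarations are the standard SAML 2.0 and XML Signature URIs, added here because the simplified samples in this report omit them, and the helper names are hypothetical. No signature is actually verified.

```python
import xml.etree.ElementTree as ET

NS = {
    "samlp": "urn:oasis:names:tc:SAML:2.0:protocol",
    "saml": "urn:oasis:names:tc:SAML:2.0:assertion",
    "ds": "http://www.w3.org/2000/09/xmldsig#",
}

# Reduced form of the combined response; the signature is a placeholder.
RESPONSE = """
<samlp:Response xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol"
                xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion"
                xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
  <saml:Assertion>
    <ds:Signature>placeholder</ds:Signature>
    <saml:AttributeStatement>
      <saml:Attribute Name="uid">
        <saml:AttributeValue>test</saml:AttributeValue>
      </saml:Attribute>
    </saml:AttributeStatement>
  </saml:Assertion>
  <saml:Assertion>
    <saml:AttributeStatement>
      <saml:Attribute Name="uid">
        <saml:AttributeValue>superuser</saml:AttributeValue>
      </saml:Attribute>
    </saml:AttributeStatement>
  </saml:Assertion>
</samlp:Response>
"""

UID_PATH = "saml:AttributeStatement/saml:Attribute[@Name='uid']/saml:AttributeValue"


def summarize(xml_text):
    """Return (uid, has_signature_element) for each assertion, in document order."""
    root = ET.fromstring(xml_text.strip())
    summary = []
    for assertion in root.findall("saml:Assertion", NS):
        uid = assertion.find(UID_PATH, NS)
        signed = assertion.find("ds:Signature", NS) is not None
        summary.append((uid.text if uid is not None else None, signed))
    return summary


assertions = summarize(RESPONSE)
first, last = assertions[0], assertions[-1]
print("checked:", first, "returned:", last)
```

In this sketch the first entry is the signed assertion for `test`, while the last entry, which a "return the last assertion" lookup would hand to the application, is the unsigned assertion for `superuser`.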
Conditions of exploitation
In order for the SAML response to be modified prior to submission to the SP, one of the following conditions must occur:
- The legitimate client, controlled by a valid authorized user and through which a SAML response is redirected, must alter the SAML response by injecting the additional assertion document as part of the SAML response. For example, this could be via a malicious browser extension or other malware on the client system, sometimes referred to as a "Man-in-the-Browser attack".
- An attacker must obtain a copy of a SAML response which is still valid, either because the assertion has not yet expired or, in some applications, because the assertion has not yet been presented to the SP. For example, an intermediate party can intercept and modify a SAML response through a proxy, often referred to as a "Man-in-the-Middle attack".
- An unauthorized client either knows or is able to guess the login information of an authorized user. Login information can be collected through many processes, including Phishing, Password Breaches, Guessing, or Brute Force attacks.
Each of the above conditions could result in a user's session becoming compromised, and if the SAML implementation is flawed as stated above, the SP would be vulnerable to an impersonation attack.
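The second condition refers to two notions of "still valid": the assertion has not expired, and it has not already been presented. A minimal sketch of how an SP might track both is shown below. The `ReplayGuard` class is hypothetical, the expiry value stands in for the assertion's validity window, and a real deployment would need shared, expiring storage rather than an in-memory set. These checks narrow the window for reusing a captured response; they do not by themselves prevent the assertion injection described earlier.

```python
from datetime import datetime, timedelta, timezone


class ReplayGuard:
    """Hypothetical tracker for assertion expiry and one-time presentation."""

    def __init__(self):
        self._seen = set()

    def accept(self, assertion_id, not_on_or_after, now):
        # Reject assertions whose validity window has closed.
        if now >= not_on_or_after:
            return False
        # Reject assertions that have already been presented once.
        if assertion_id in self._seen:
            return False
        self._seen.add(assertion_id)
        return True


guard = ReplayGuard()
now = datetime(2021, 3, 1, 12, 0, tzinfo=timezone.utc)
expiry = now + timedelta(minutes=5)
print(guard.accept("assertion-1", expiry, now))  # first presentation
print(guard.accept("assertion-1", expiry, now))  # replay of the same assertion
```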
History of the vulnerability
Review of the Lasso repositories indicates that the weakness was incorporated into the codebase as early as November 2005, well before our incorporation of the library and also before the disclosure of the earlier vulnerabilities announced for other platforms.
We also noticed during the investigation that the maintainers of the Lasso library had made a commit to the project shortly after the notice of the issue was sent to Akamai. In discussions with the reporter, this commit was not related at all to their report but was merely coincidental.
The fix that was proposed on February 24th, 2021 did partially resolve the impact of the exploit, but after further review we determined it was not a complete fix, which is why our patch was ultimately proposed to the maintainers.
A look at Akamai's incident response
Akamai follows a formal incident response process. Incidents are regularly handled by cooperative effort among engineering/systems development, network operations, and customer support personnel. In general, the more severe the incident, the more people are involved to work on it, and the more it is prioritized over planned operations and work. In all incidents, Akamai's goal is to:
- Limit the impact of the problem,
- Ensure continued, safe operation of our systems,
- Ensure the continued safety and care of our incident responders,
- Keep customers happy and their data secure,
- Adhere to various laws and regulations,
- Ensure that we are able to learn and improve from whatever hazards allowed the incident to occur.
As we described above, we engaged our incident response process upon notification of this vulnerability. That process allowed Akamai to align technical resources, communicate with internal stakeholders and management, communicate with external stakeholders, and coordinate all activities related to the incident in a timely and effective manner.
Patching & deployment process
The process of developing a fix for this vulnerability, and deploying the patch on the EAA network was very similar to the normal process followed for planned upgrades, only with a much smaller change and a much faster timeline.
Within the first hours of incident response, we prepared a draft timeline for the fix with a few key decisions taken into account. In that timeline, once the fix was ready, QA was expected to take 3 days following the standard QA process, and the deployment phase would take 48 hours. Following the deployment phase, we planned the communication of the issue in the form of a blog post and customer notifications. These timelines could have been accelerated, but as there was no evidence of active exploitation, we prioritized the stability of our network and ensured that our customers remained stable through the full process.
After the initial triage of the issue, the engineering team approached the fix via two paths, both using different engineering and QA resources.
One team investigated and developed a partial fix for the issue, closing the reported issue and constraining the requirements on the processing of SAML responses to what we believe is the normal presentation of a SAML response with an assertion. This method may have resulted in some responses from some IdPs being denied even if they were valid and safe. This approach also had the option to disable the more strict checking on a per customer basis to allow the rest of the customer base to be protected in the event of an unexpected interaction with a small number of IdPs.
The other team took a similar approach to the first team, but rather than a configurable, partial fix, worked on what we believe to be a complete fix. Their approach was scrutinized to ensure that all well-formed and correctly signed SAML responses would be accepted, reducing the complexity required to allow customers to be downgraded.
We took this concurrent approach because it would allow for one path to be blocked or run into QA challenges while still allowing the deployment of a fix. Late in the day on February 24th, US Eastern time, we expected the partial fix to be ready for QA three days after the incident started, while the full fix was expected to take a week longer to develop. Work continued overnight into the 25th, when the progress report from the engineering team showed that the full fix option was progressing well and was expected to ultimately be faster to develop and less complex to test.
Ultimately the full fix option was handed off to the QA team about a day before the partial fix was. Once the first option was handed over to the QA team, a patch notification to all EAA customers was posted noting the expected deployment timeframe.
This status remained the same through the QA process. Slightly ahead of the maintenance window, the full fix received QA team signoff, clearing the way for the deployment.
The deployment process started with a very lightly loaded Point of Presence (POP) upon which we ran an extended regression suite with the traffic being monitored carefully for potential disruption to customers. Another lightly loaded POP was upgraded, with a reduced testing process and a period of monitoring. In the early hours of the day on March 2nd, the POPs serving Akamai's internal EAA deployment were upgraded to allow for more load testing with our nearly 8000 end users. When we saw no issues with any of the deployments up to that time, the rest of the POPs were upgraded over the remaining 36 hours of the maintenance window, ultimately completing the deployment process before the close of the maintenance window.
During the upgrade process, most users who were interacting with their EAA applications likely saw one or more re-authentication interactions. These typically consist of a temporary session interruption and a redirect to their IdP in order to enter their credentials before being returned to their normal work. Users of the EAA Client may have also observed this behavior. The re-authentication attempts were clustered by EAA customers during the deployment. Following the code upgrade on each POP, the session cache that EAA maintains was cleared, which would, over a 5 minute window, trigger a re-authentication for all requests. Once re-authenticated, we observed no further impact on the end users.
Managing the incident team
One key aspect of developing, verifying and safely deploying a fix for significant issues like this is a careful focus on caring for the team working on the issue. Akamai's incident management process and the accompanying training include guidance on how to limit burnout for the technical and management teams who respond to an incident. This guidance includes verifying that the incident team:
- Eat (regularly)
- Sleep (ideally a full 'night' of sleep)
- Tend to personal obligations
- Stay healthy (COVID-19 Vaccines, exercise)
While remediating the incident at hand is critical, we find that keeping up with team care during the whole process aids in reaching a safer outcome for the impacted product and/or system, while also reducing the number of avoidable errors and potentially customer-impacting events often associated with high-stress incident response.
Another key aspect of Akamai's incident management process is the principle that we're all in this together, and we're not going to blame anyone now or in the future. The incident team works together to solve the issue, whatever it may be, in the best way we can, focusing on reducing impact first, then figuring out how to prevent it from happening again. Akamai focuses on finding the assumptions which led to an incident, learning from those incomplete assumptions, and making the appropriate modifications to reduce the chances of that or similar events happening again.
System owners who rely on Lasso for their SAML authentication should patch as soon as possible. Additional actions may be required to investigate the impact on the authenticated systems. Further information on what actions may be required can be found in the Actions Required section of the companion post to this writeup.
|Timestamp (All UTC)||Activity|
|2230 - 23 Feb 2021||External vulnerability report sent to Akamai's Information Security Group|
|1222 - 24 Feb 2021||Akamai's Information Security Team decrypted the report and began investigation of the issue|
|1242 - 24 Feb 2021||Responders initiated the Akamai Incident Management Process, gathering the necessary parties to investigate and fix the issue.|
|2000 - 24 Feb 2021||The issue was successfully recreated by the engineering team.|
|0132 - 27 Feb 2021||Patching notification posted to Akamai Control Center|
|1500 - 1 Mar 2021||First contact with the maintainer of Lasso|
|0100 - 2 Mar 2021||Deployment of fix begins|
|1126 - 2 Mar 2021||Akamai's production service was upgraded to conduct rigorous testing of the upgrade|
|2134 - 2 Mar 2021||Researchers confirmed that their exploit was not possible on the patched systems.|
|2336 - 4 Mar 2021||Deployment of fix complete|
|1646 - 8 Mar 2021||CVE ID CVE-2021-28091 Reserved|
|1747 - 8 Mar 2021||Initial contact with CERT/CC to report the vulnerability.|
|1200 - 1 Jun 2021||Embargo Completed|