npm Registry Spelunking: Dependencies Referenced by URL
- Adam Baldwin
- Research
- November 8, 2017
I learned a long time ago that not all security research pans out with a stack of vulnerabilities, but every time I venture down a rabbit hole I learn something along the way. This is one of those times.
During a recent assessment, ^Lift team member Jon Lamendola found an access_token in a URL for a project dependency, and that got us thinking:
“Wonder if any npm dependencies are using URLs that contain tokens or passwords.”
That was enough to nerd snipe an hour of my time as I answered his question and found a few more questions to answer along the way. Let me take you through the process and we’ll see what results shook out of the npm registry.
Gathering the URLs
To get the URLs, the first thing I had to do was pull the metadata from the registry. I do this with a command that looks like this:
wget "https://replicate.npmjs.com/registry/_all_docs?include_docs=true" -O registry.json
This will yield an 8GB JSON file, should you want to do your own spelunking. A stream and some for loops later, I had a file containing the URLs from every dependency that referenced its source as a URL. The result was 9,326 unique URLs from 7,482 modules. Not as many as I had suspected, but enough to keep going.
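In case you want to reproduce that step, here’s a minimal sketch of the stream-and-for-loops pass, assuming the JSONStream package and the registry.json dump from the wget command above; the URL-matching regex and the choice of dependency fields are my own rough heuristic, not the exact filter I used.

// extract-urls.js: rough sketch, pull URL-style dependency specs out of registry.json
const fs = require('fs');
const JSONStream = require('JSONStream'); // npm install JSONStream

const urls = new Set();
const looksLikeUrl = (spec) => /^(git\+)?https?:\/\//i.test(spec);

fs.createReadStream('registry.json')
  .pipe(JSONStream.parse('rows.*.doc')) // emit one package document at a time
  .on('data', (doc) => {
    for (const version of Object.values(doc.versions || {})) {
      const deps = Object.assign({}, version.dependencies, version.devDependencies);
      for (const spec of Object.values(deps)) {
        if (typeof spec === 'string' && looksLikeUrl(spec)) {
          urls.add(spec);
        }
      }
    }
  })
  .on('end', () => {
    fs.writeFileSync('urls.txt', [...urls].join('\n'));
    console.log(`found ${urls.size} unique URLs`);
  });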
Looking for access_tokens
A quick grep for access_token and various other keywords (key, token, secret, password) yielded no modules containing a token similar to what Jon had witnessed. For coverage’s sake I also looked for any URLs with query string parameters, as well as URLs that contained HTTP basic auth, and the count of interest was again 0. I was surprised at the random places (like dropbox.com) people were referencing to install dependencies from, so I had to keep digging around.
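For anyone following along, those checks boil down to something like the sketch below, which runs Node’s built-in WHATWG URL parser over the urls.txt list from the previous step; the keyword list and file name are my assumptions.

// find-credentials.js: flag URLs that might carry secrets
const fs = require('fs');
const { URL } = require('url');

const keywords = ['access_token', 'key', 'token', 'secret', 'password'];
const lines = fs.readFileSync('urls.txt', 'utf8').split('\n').filter(Boolean);

for (const line of lines) {
  let url;
  try {
    url = new URL(line.replace(/^git\+/, '')); // strip git+ so the parser accepts it
  } catch (e) {
    continue; // not a parseable URL, skip it
  }

  if (url.username || url.password) {
    console.log('basic auth:', line);
  }
  if (url.search && keywords.some((k) => url.search.toLowerCase().includes(k))) {
    console.log('suspicious query string:', line);
  }
}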
EDIT: Jon Lamendola looked over the URL list after I published this and found a needle in the haystack: one set of basic auth credentials, which appears to be invalid at this time.
It’s interesting to note that 845 of those URLs use plain HTTP, not HTTPS. Those modules could potentially be vulnerable to MITM if the attacker had the right network position (you never run npm i at the coffee shop, do you?).
Removing every github.com, github.io, and npmjs.org URL and filtering the list further by hand brought it down to 75 domains, and this is where things got interesting.
A few of those were very clearly Amazon S3 buckets. I resolved every domain to see which ones pointed at an S3 bucket, added those to the list of potential buckets, then visited each one to see which were no longer active. Since S3 is a global namespace, if I found an abandoned bucket still referenced by a dependency, I could grab it and fill it with whatever nefarious package material I wanted.
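My pass for this looked roughly like the sketch below. The amazonaws.com checks are a simple heuristic of mine and will miss buckets fronted by other layers, so treat it as a starting point rather than the exact process I used.

// find-s3-hosts.js: flag dependency hosts that point at S3
const dns = require('dns').promises;
const fs = require('fs');
const { URL } = require('url');

const hosts = new Set(
  fs.readFileSync('urls.txt', 'utf8')
    .split('\n')
    .filter(Boolean)
    .map((u) => { try { return new URL(u).hostname; } catch (e) { return null; } })
    .filter(Boolean)
);

(async () => {
  for (const host of hosts) {
    // direct S3/AWS endpoints, e.g. my-bucket.s3.amazonaws.com
    if (host.endsWith('.amazonaws.com')) {
      console.log('AWS endpoint:', host);
      continue;
    }
    // custom domains CNAMEd to an S3 (website) endpoint
    try {
      const cnames = await dns.resolveCname(host);
      if (cnames.some((c) => c.includes('amazonaws.com'))) {
        console.log('CNAME to S3:', host, '->', cnames.join(', '));
      }
    } catch (e) {
      // no CNAME record or the lookup failed; ignore for this pass
    }
  }
})();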
This was fun. I found 2 buckets to squat on. 🎉 But as I looked further, those dependencies were only referenced by really old versions of the modules, so while I grabbed the buckets for the sake of science, the likelihood of compromising anything went down greatly since neither belonged to a recent version or a popular module.
The rabbit hole continues… I wondered: are any of those domains expired? Same principle as the buckets: if a domain had expired, I could register it and take over the resources it used to serve. Again, nothing. 😭
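If you want to run the same check, a rough first pass is to flag the hosts that no longer resolve at all and then confirm their registration status by hand with whois; this sketch only does the DNS part, and the host list is a placeholder for the filtered domains.

// find-dead-domains.js: flag hosts that no longer resolve (candidates for expiry)
const dns = require('dns').promises;

const hosts = ['example-host.example']; // placeholder: the filtered list of domains goes here

(async () => {
  for (const host of hosts) {
    try {
      await dns.resolve4(host);
    } catch (e) {
      if (e.code === 'ENOTFOUND' || e.code === 'ENODATA') {
        console.log('does not resolve, check whois:', host);
      }
    }
  }
})();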
Most of these were long shots, but hey, I looked and learned a few things along the way.
Be cautious of packages with URLs for dependencies. What’s important to remember here is that these dependency resources are mutable. Maybe the hosting server decides that anybody coming from a .gov-owned address gets one file and everyone else gets another.
Make sure the dependencies you use don’t have dependencies that reference HTTP-only URLs. Those will be susceptible to tampering the first time you install them. What was fun to learn is that while HTTP-based dependencies don’t get an integrity check on their first download, npm will store a hash inside the package-lock.json file to verify integrity on future installations (see the example below). This is one good reason you should be checking in your package-lock.json.
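For example, a lock file entry looks roughly like this (the module name, URL, and hash are made up, and the exact shape varies by npm version); the integrity field is what protects every install after the first one.

"some-module": {
  "version": "1.2.3",
  "resolved": "http://example.com/some-module-1.2.3.tgz",
  "integrity": "sha512-<hash of the tarball npm downloaded>"
}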
Finally, if you use an S3 bucket for something, just assume you are going to have to keep that bucket around forever. Even if you empty it out, keep your hands on it and don’t let me have it.
Originally posted on Medium