Pillaging Distributed Version Control 5 Years Later
- Adam Baldwin
- Research
- November 13, 2016
Five years ago at DEFCON 19 I gave a talk titled “Pillaging DVCS repos for fun and profit.” The technique and tool I outlined in that talk have been very fruitful throughout the years, and plenty of security consultants have told me that they helped them make breakthroughs during penetration tests. If it’s useful to us, it’s also useful to attackers.
The basics of the vulnerability are that Git, Mercurial, and Bazaar are all distributed version control systems, and because of that, if you have access to the .git / .hg / .bzr metadata directory you can recreate some or all of the source code tree. This is particularly useful for security assessments because it turns them from black box to white box: it makes source code available and can leak keys, passwords, or other private information that may be checked into the repository.
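To make the mechanism concrete for git, here is a minimal sketch (not the tool from the talk) of the first two lookups a reconstruction relies on. The HEAD file names a ref, the ref file holds a commit hash, and a loose object with that hash is stored under objects/ in a directory named after the hash's first two hex characters. The function names are hypothetical, and packed refs and packfiles are ignored.

```shell
#!/bin/sh
# Sketch only: map git metadata contents to the next path worth requesting.
# Function names are illustrative, not part of git or of any published tool.

# Given the body of .git/HEAD, print the path of the ref file it points to.
# A detached HEAD holds a bare hash instead of "ref: ...", so this fails.
head_to_ref_path() {
  case "$1" in
    "ref: "*) printf '.git/%s\n' "${1#ref: }" ;;
    *) return 1 ;;
  esac
}

# Given a 40-character SHA-1 hash, print the path of the matching loose object.
object_path() {
  # Reject anything that is not exactly 40 lowercase hex characters.
  [ "${#1}" -eq 40 ] || return 1
  case "$1" in
    *[!0-9a-f]*) return 1 ;;
  esac
  rest="${1#??}"   # the hash minus its first two characters
  printf '.git/objects/%s/%s\n' "${1%"$rest"}" "$rest"
}
```

For example, `head_to_ref_path 'ref: refs/heads/deploy'` prints `.git/refs/heads/deploy`, which is the next file to request in order to learn the commit hash.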
This year I revisited that research to see what might have changed and I was fairly surprised by the results.
The original results
The original scan of the Alexa Top Million looked like this:
- GIT: 1,498 repos
- HG: 312 repos
- BZR: 235 repos
Possibly the most interesting part of this original research is that I found the source code to every product of a company. I’m under NDA to not discuss the story but suffice it to say it was worth the research.
The new results
- GIT: 10,357
- HG: 445
- BZR: 31
10,833 total (1.08% of the Alexa Top Million)
My thought is that the significant increase is directly tied to the rise in the popularity of git over the last 5 years. It’s unfortunate that these simple attacks continue to work. Maybe it’s time for web servers to stop serving dot files by default.
How to test yourself
For GIT, HG (Mercurial), or BZR (Bazaar), you can request the following URLs on your domains. Each is followed by what should be in the response (it might vary slightly).
GIT
curl http://example.com/.git/HEAD
ref: refs/heads/deploy
HG
curl http://example.com/.hg/requires
revlogv1
store
fncache
dotencode
BZR
curl http://example.com/.bzr/README
This is a Bazaar control directory.
Do not change any files in this directory.
See http://bazaar.canonical.com/ for more information about Bazaar.
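The three checks above can be wrapped in a small script. This is a sketch, assuming curl is installed and that you only point it at sites you are authorized to test. The function names are hypothetical, and the signatures simply mirror the sample responses shown above, so a detached git HEAD or an unusual Mercurial setup could be missed.

```shell
#!/bin/sh
# Sketch: probe a site for exposed VCS metadata files.

# Print the probe path for a VCS name (git, hg or bzr).
probe_path() {
  case "$1" in
    git) echo '/.git/HEAD' ;;
    hg)  echo '/.hg/requires' ;;
    bzr) echo '/.bzr/README' ;;
    *)   return 1 ;;
  esac
}

# Succeed if a response body looks like the real metadata file rather than
# an error page or a catch-all HTML response.
looks_exposed() {
  case "$1" in
    git) case "$2" in "ref: refs/"*) return 0 ;; esac ;;
    hg)  case "$2" in *revlogv1*) return 0 ;; esac ;;
    bzr) case "$2" in *"Bazaar control directory"*) return 0 ;; esac ;;
  esac
  return 1
}

# Fetch each probe path from a base URL such as http://example.com
# and report whether the response matches the expected signature.
check_site() {
  for vcs in git hg bzr; do
    body=$(curl -s --max-time 10 "$1$(probe_path "$vcs")") || body=''
    if looks_exposed "$vcs" "$body"; then
      echo "$vcs: exposed"
    else
      echo "$vcs: not detected"
    fi
  done
}
```

Running `check_site http://example.com` prints one line per VCS. Matching on the body rather than the status code avoids counting servers that answer every path with a 200 and an HTML page.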
How to protect yourself
The advice here is pretty simple: always block dot files (directories or files that start with a period). Since configurations differ, look up “how to block dot files” for your particular web server.
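Blocking rules are server specific, so none are shown here. As a complementary check on your own machines, a sketch like the following lists any VCS metadata directories sitting under a document root. The helper name is hypothetical and the path in the usage note is only an example.

```shell
#!/bin/sh
# Sketch: list VCS metadata directories under a directory tree.
# find_vcs_dirs is a hypothetical helper name; pass your own document root.
find_vcs_dirs() {
  # -prune stops find from descending into a matched directory,
  # so each repository is reported once.
  find "$1" -type d \( -name .git -o -name .hg -o -name .bzr \) -prune -print
}
```

For example, `find_vcs_dirs /var/www/html` prints one path per metadata directory found. Empty output means none of the three directory names exist under that root.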
Interesting findings
dot files as a honeypot
That file is there to fuck with people like you.
A number of the sites responded that these files were there as a honeypot / red herring, or possibly they were just saving face. I’ve seen that before, and then a week later the vulnerability is gone. I suppose it is a very cheap way to get a potential attacker to waste a little time, and it would cause false positives on a number of scanners.
Disclosing at scale
Probably the hardest part of all of this was disclosing the issues to the offending sites. I’ve always felt an ethical obligation to notify somebody when I find a vulnerability, but when you find 10k of them it suddenly becomes a rather large burden.
I asked my Twitter followers.
I went with the option of emailing security@domain, hopeful that it would get through to most of them with little cost or effort.
It’s interesting to me that most of my followers said that punting an email to security@ would be acceptable, yet most of them also thought that more than 75% would bounce. But this post isn’t about the science behind Twitter polls.
The final result? 74.82% failed to be delivered, a higher percentage than I had expected or hoped for.
We really need a universal API for discovery and disclosure. The current situation is quite frustrating. At the end of the day, despite the problem with spam, security@domain is a good fallback communication endpoint that everyone should have set up.
Standards like ISO/IEC 29147 exist as a great framework, but they only work if people implement them. Additionally, HackerOne has a community-curated resource for contacting security teams, but none of the 10k domains I had to contact showed up in there.
We have a long way to go.
Finding bugs at scale is a lot easier than disclosure at scale. It should be the other way around. Make sure your organization has a clearly defined process for ingress of security issues.
Originally posted on Medium