Codecov breach impacts GoDaddy, Atlassian, P&G etc. (reuters.com)
200 points by mmaunder on April 17, 2021 | 73 comments


At risk of over-fitting these observations to this scenario, there seem to be a few common trends here which compound into situations like this one:

1. The move to cloud means the "edge corporate firewall" monitoring point is forgotten about. DevOps infrastructure and the like now runs off-prem in the cloud, and the perimeter is written off as "zero trust" (but without actually adopting the right assumptions there).

2. The move away from fewer, better-tested releases means we see riskier distribution methods (curl | bash without hash pinning, etc.), and an encouragement to always pull the latest from upstream rather than internally vet and approve a pinned release.
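
Hash pinning doesn't have to be heavyweight, either; a minimal sketch of what a pinned fetch could look like (the URL, version, and hash below are placeholders, not real values):

    # sketch: pin a release and verify its checksum before executing;
    # VERSION and EXPECTED_SHA256 are values you vet once and commit
    # next to your CI config.
    VERSION="1.0.0"
    EXPECTED_SHA256="<sha256 you vetted>"
    curl -fsSL -o uploader "https://example.com/uploader-${VERSION}"
    echo "${EXPECTED_SHA256}  uploader" | sha256sum -c - || exit 1
    bash uploader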

3. A general trend towards more and more third party dependencies and external connections to enable basic functionality, meaning that everything assumes it runs in an internet-connected environment with DNS resolution and the ability to make requests to any hostname and IP the software chooses.

It feels like once again, as other commenters have pointed out, basic sandboxing principles and isolation would prevent this from being an issue, coupled with internal vetting of what is being executed in sensitive environments. If the environment is properly limited to read-only access to code, your biggest risk is exfiltration, so you should be focused on preventing exfiltration of anything (environment, state, code, secrets), rather than leaving the barn door wide open.

In the same way we do W^X in memory, to prevent an exploit writing to memory and then executing it, perhaps we need data^network: you can either have internet access or access to sensitive information, but not both. This breaks the X-aaS model, but after SolarWinds, this, and many other such failures, maybe it's time to really re-think the wisdom of giving third parties so much access to core systems.


You are absolutely correct, and corporate America is not going to do that until/unless they start to see corporations actually losing money vs. competitors if they do not. Right now, it is essentially just "well, that was embarrassing", not "I might lose my job as CEO because this happened".

In other words, it will have to get worse before it gets better.


Absolutely. SolarWinds is a great example that if you can become important enough, and in with the bricks, the rats won't flee the sinking ship, even as the deck goes below the waterline.

If investors lost confidence in a company and sold their shares (impacting the share price), could this create a sufficient paper loss for other investors to pursue the senior officers of the company?

I think you're right though - unless it is something that makes the CEO lose their job, or go to prison, it is going to be hard to get anyone to take on any (even marginal) increase in the cost of delivering the product.


Yep, that about nails it. Your OPEX for closing everything up and running it all yourself is pretty high. Undoubtedly using "the cloud" and lots of OSS stuff (that your company probably doesn't fund or contribute to either) is great for business. Unless the company's pocketbook is affected more and/or some folks at the top lose their jobs over it, I don't see anything changing.


That’s a really smart breach target, since reading their security notice it sounds like it works by you downloading and executing their code in your CI environment. So their code has access to your code and to all the secrets you inject into your CI environment. There are probably a lot of pipelines with deploy credentials available to other jobs. As an example I’m most familiar with: GitLab provides all environment variables to all jobs, so if you’re using them for deploy tokens then any job, like code coverage, can access them. They do document this and offer an alternative in something like HashiCorp Vault, but it’s a big target regardless.
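
For what it's worth, GitLab does let a single job opt out of inherited variables; a hedged .gitlab-ci.yml sketch (job and script names are made up):

    # by default every job sees all project-level CI variables;
    # `inherit: variables: false` shields an untrusted job from them.
    coverage:
      stage: test
      inherit:
        variables: false      # this job gets no project CI variables
      script:
        - ./run-coverage.sh   # placeholder for the coverage upload step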


The other thing that is interesting about Codecov is how they distribute their uploader. You download a fresh uploader for every CI run, even if you're using your CI platform's native integration (e.g. the CircleCI "orb"). I get why they do this -- if they instructed people to download a binary and store it, customers would do that exactly once and never update it, forcing Codecov to support old versions of their API indefinitely (or to greatly annoy their customers). So basically, we took a shortcut that triggers everyone's security spidey sense, and exactly as predicted, we all got burned.

(I will point out that their uploader is literally a curl | bash, but that isn't really the problem here. Installing a Debian package or downloading a statically linked Go binary would be just as risky. Actually, that would have made the underlying problem harder to find.)

All in all I'm pretty disappointed. We trusted Codecov with a shortcut that makes it easy for them to deliver a better product. They didn't have any security in place to ensure that what was checked into source control was what customers downloaded, and now we're all hacked.


You can use their bash uploader, but you can also use language-specific ones.

The Python one, for example, is downloaded via PyPI, and you’ll usually have a pinned version along with its hash. Any tampering would make pipelines fail. If a new release were pushed by rogue actors, someone would notice an unplanned version bump.
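
As a sketch of what that looks like in practice (the version and digest below are placeholders, not real values):

    # requirements.txt (e.g. generated by pip-compile --generate-hashes);
    # the digest below is a placeholder for the published wheel's hash:
    codecov==2.1.11 \
        --hash=sha256:<placeholder>
    # installing with hash checking enforced makes any tampering fail the build:
    pip install --require-hashes -r requirements.txt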


FYI, their support actively asks you to move to the bash uploader from the Python one, and the Python one had other compatibility issues.

Here's a snippet from a support thread with them last year:

> One thing I noticed is that you are using the python uploader to upload coverage reports to Codecov. We do not support this uploader as a first-party tool, but we do support the bash uploader (https://docs.codecov.io/v4.3.0/docs/about-the-codecov-bash-u...) with full-time staff. We want to make sure this issue isn't related to something in a third party uploader before troubleshooting further.

> Would you be able to switch to using this uploader to see if it's related?

I'd stopped using codecov last year for pricing reasons, when they moved to per-user pricing.


Debian packages can be signed, so depending on how the signing keys are managed - it might have been much easier to find the underlying problem.


True! I think since in the CI case you're bootstrapping a fresh machine for every build, you'd have to apt-key add their key as part of installing the package, so an attacker would just make a new one of those. I bet the compromise duration would have been shorter, though; there's got to be one engineer that installs the Debian package on their workstation and would be confused when their release system's signature stopped working.
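
A sketch of that bootstrap step (the host and package names are made up); note the key comes from the same host as the packages, which is exactly the weakness:

    # hypothetical bootstrap on a fresh CI machine:
    curl -fsSL https://example.com/archive-key.asc | apt-key add -
    echo "deb https://example.com/apt stable main" \
        > /etc/apt/sources.list.d/vendor.list
    apt-get update && apt-get install -y vendor-uploader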

(And honestly, an attacker that can build a Debian package and update the public key for apt-key is probably harder to find than one that can edit a cleartext bash script. Security through hoping your adversaries can't perform tedious devops operations is no security at all, as they say, but in the real world it probably helps a tiny bit.)


I'm not above throwing public keys in the git repo for use in CI/CD jobs.


They offer, but you don't have to. Only add the uploader that passed your review and use that, not just anything from the interwebs.


I'm not that familiar with Codecov, but it seems to me way worse than that.

The "bash uploader" that was compromised is designed to curl and run a script on codecov's server. This script was compromised to in turn silently run a script from the attacker's server. This could have done anything but I doubt they just dumped ENV, probably opened a reverse shell to literally do anything they want as the user that originally invoked the script. This could be anything from stealing ~/.kube/config and ~/.ssh/id_rsa... If one of the 29,000 customers invoked as root it could be even worse, such as adding users and compromising entire platforms.


Well, a minor clarification... If you have multiple projects you can set up different per-project environment variables. But it's fair to say that all jobs within a project have access to all environment variables in that project.


A search for their bash script on GitHub gives more than 400K results... The scale of this breach is staggering.

https://github.com/search?q=https%3A%2F%2Fcodecov.io%2Fbash&...


GitLab has protected variables, which are only made available to protected branches (i.e. the ones that would have deploy stages).

But yeah in general secrets management is a trainwreck.


Those branches still run tests and coverage, so it’s likely that Codecov would run on them anyway.


The design is that such tests run on non-protected PR branches and are thus reviewed before being merged into protected branches for later deployment; on that much we agree.

I guess the protected-branch segregation should go -both ways-, and more effort should be put into auditing and vetting which CI steps have access to secret-authorised branches.


Protected secrets have always seemed like such a rough barrier. No fine-tuning at all (e.g. if you have stage and prod creds that you want kept separate, but both "protected"... no good way to do that from my POV).


CI has always been a Yellow Smiley Face Post-it, and the only thing I can truly see combating this is reproducible builds: just as with a compromised repo, a dev is going to notice pretty fast if their local build differs from a compromised CI build.


Would they really notice? You can have reproducible builds, but unless someone explicitly compares checksums of local and CI builds, it won't be noticed.
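
The comparison itself is trivial; the missing piece is anyone actually running it routinely (paths are illustrative):

    # compare a local reproducible build against the CI artifact
    sha256sum local-build/app ci-artifact/app
    # identical digests mean the CI binary matches the local rebuild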


you can also scope down variables by branch


I am not surprised; I reported a security issue years ago and was completely ignored by them. Unfortunately I cannot find the ticket anymore, but the issue was that I could see coverage results for a different account. Since then I am still waiting for GitHub to support coverage results via GitHub Actions.


I think what we will need to see in the coming time is more security awareness wrt. development tooling and setup.

Like:

- always run all dev tools in a sandbox, including the compiler and language server

- properly secure (harden) developer systems

- don't install software outside of sandboxes that is known to often be less careful with security vulnerabilities, like Steam (or games in general). Firewalls can help wrt. offline-only games, but due to things like DRM, invites, game updates etc., most modern game software is not offline-only.

- split CI into components; make sure the part which builds (and potentially deploys) artifacts doesn't run in the same sandbox as additional analysis tools. Potentially run different analysis tools in different sandboxes. Don't give CI sandboxes permission to push directly to your repository, etc. If a tool needs to be able to push to git, consider limiting its access to a specific folder or, if that's not possible, a sub-module (which, tbh, are annoying).

- Limit internet access of CI sandboxes as far as possible.

Sadly some of these things are quite cumbersome or even impossible to set up with (at least non-enterprise) GitHub.


> Sadly some of these things are quite cumbersome or even impossible to set up with (at least non-enterprise) GitHub.

All of this can be set up in-house with open-source tooling; there really is no excuse. GitLab CI with your own runners that are forbidden from direct Internet access, Sonatype's Nexus to proxy Docker images, Debian/Ubuntu APT, NPM and Maven repositories, and stuff like SonarQube on Docker containers with a no-internet configuration. For stuff that absolutely needs Internet access (PHP's Composer comes to mind, sigh), set up a caching Squid instance.

You don't even need VMs for that any more, a single machine with Docker running is enough.
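
As a sketch of the no-internet wiring with plain Docker Compose (images and names are illustrative): a network marked internal: true has no outbound route, so build jobs can only reach the in-house proxy:

    services:
      nexus:                        # proxies APT/NPM/Maven/Docker upstreams
        image: sonatype/nexus3
        networks: [lan, egress]     # only the proxy may reach the Internet
      runner:
        image: gitlab/gitlab-runner
        networks: [lan]             # jobs see the proxy, nothing else
    networks:
      lan:
        internal: true              # no outbound route from this network
      egress: {}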


> Gitlab CI with your own runners that are forbidden from direct Internet access

Yes. Absolutely yes.

GitLab (CI) is SO. GOOD. Use it! It's incredibly flexible and powerful.


I agree, but it has a couple of annoyances.

- One thing I'm missing before I can ditch Jenkins permanently is general-purpose job stuff... like "synchronize the production database to integration". Yes, there are workarounds, but all of them are ugly.

- Developing pipelines in GitLab CI is tedious at best. With Jenkins I don't need to switch between applications during development; with GitLab I need to switch between the browser (to see logs), the command line (git), and an editor for the pipeline.

- Holy hell it's dog slow sometimes. A runner with Docker executors always builds a whole new instance for each goddamn step of the pipeline, ALWAYS. Jenkins is intelligent enough to re-purpose executors.


> - One thing I'm missing before I can ditch Jenkins permanently is general-purpose job stuff... like "synchronize the production database to integration". Yes, there are workarounds, but all of them are ugly.

We had a job that backed up some data to a git repo and the gitlab ci cronjobs were really easy to use. It was a simple java -jar, git pull, git commit, git push script and tbh it couldn't have been simpler.


> One thing I'm missing before I can ditch Jenkins permanently is general-purpose job stuff... like "synchronize the production database to integration". Yes, there are workarounds, but all of them are ugly.

Oh, I have a wonderful setup consisting of just 30 different tools. Let's start with proper Kubernetes cluster...


> with GitLab I need to switch between the browser (to see logs), the command line (git), and an editor for the pipeline.

Fwiw, you can do it all in-browser with GitLab's editor.


Nexus doesn't have proxy plugins for a lot of popular repositories. While looking at finishing the crates.io one that's out there, I got the impression that Nexus was kind of a dumpster fire. We decided to air-gap our dev network instead and mirror the stuff devs need, updating at regular intervals. Mostly works. Haven't figured out how to mirror crates.io, but it is supposed to be possible. Julia seems like it might not be.


One of the prices to pay for choosing an exotic stack, I'd say. How are your developers able to do any work in an airgapped network? I'd hand in my papers if I were not able to use Google on my development machine.


Rust isn't that exotic anymore, or did you mean air gapping? That is getting more common too, due to events like this. There were some efforts to mirror stuff like Stack Overflow, but eventually they gave all the devs internet-facing machines for research. No copy-paste, just paraphrasing, which is probably better. Software requests are handled quite promptly, which helps.


> All of this can be set up in-house with open-source tooling; there really is no excuse.

You use cloud services because they don't cost as much time and (normally) have a fixed amount of money you can sink into them. Also, there are people whose job it is to fix problems with the product. You can't just do everything on your own if you want to focus on your product.


> I think what we will need to see in the coming time is more security awareness wrt. development tooling and setup.

Management at a former employer quite rightly freaked out when they realised that their devs on a certain subcontinent were routinely pasting their code into external web-based prettifiers to format it nicely...


Ha. I bet if you were so inclined you could harvest a gold mine of sensitive information by having a public website which did:

1. JSON prettifying

2. JWT decoding/verification (bonus for "paste your signing key and we will generate JWTs for you too!")

3. PEM <-> DER conversion


The number of valid JWTs that jwt.io could have harvested by this point is incredible to even consider.


As per my sibling comment - we need easier/better CLI and/or offline GUI tools for simple stuff like this.


Yeah, I use bloop personally which is great for this stuff.


I've never heard of it. Is there a webpage or source repo?


Sorry, it's boop not bloop. Source is at https://github.com/IvanMathy/Boop


In my experience, relying on "prettifier" websites is directly a result of the lack of canonical, well-supported CLI tools for prettifying most programming and markup languages. It's such a weird glaring gap in dev tooling.

I'd love to see "standard" prettifiers for HTML, CSS, JSON, YAML, and TOML. The required libraries exist and the problem isn't terribly hard if you aren't picky about corner cases. I'm sure countless programmers have written such tools already. Where is the disconnect?
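
To be fair, a handful of CLI formatters do exist; they just don't feel canonical or uniformly maintained:

    python -m json.tool < data.json   # JSON pretty-printer, ships with Python
    jq . data.json                    # JSON
    npx prettier --write style.css    # CSS/JS/JSON/YAML/HTML, needs Node
    tidy -i page.html                 # HTML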


> In my experience, relying on "prettifier" websites is directly a result of the lack of canonical, well-supported CLI tools for prettifying most programming and markup languages. It's such a weird glaring gap in dev tooling.

There's a CLI linter for pretty much every language I've used, but how would you prettify a valid YAML file? Any reformatting will change it semantically.


You just parse and reserialize it. The trick is handling comments.
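
For example, a round-trip-aware tool can reformat while keeping comments; a sketch using mikefarah's yq v4, which documents comment preservation (with some caveats):

    # pretty-print in place, preserving comments and key order
    yq eval -i -P '.' config.yaml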


Don't underestimate YAML: not-fully-compliant YAML parsers(1) and serializers produce everything but nice results.

(1): Which can be necessary to prevent accidental problems like NO being parsed as false instead of "NO", and similar. I personally strongly recommend against optional string quotes in human-written/edited serialization formats (and comments are a must-have).


Codecov failed the IT approval process at two corps I worked at.

I never understood why anyone who is remotely serious about security would run something like Codecov within their walls. Maybe an air-gapped install, but I don't recall that even being an option.


How do you imagine an airgapped version of Codecov would work? You walk over to it with a USB stick with your branch on it every time you need to check the coverage stats?


Codecov as an appliance within your own network.


I’ve used it in lots of FLOSS projects, but the risk is lower there.

The level of access they get is too much for a secret-source project, especially given the little it returns.


> I’ve used it in lots of FLOSS projects, but the risk is lower there.

How is the risk lower? It's basically an RCE vuln which can poison open source artifacts. Massive blast radius.


If configured well, you run it in a CI stage that has no access to secrets whatsoever. All it can do is leak your source code. Which is an issue for closed source projects, but not open source ones.


Surely the "simple" fix for this is that Codecov runs (for public projects) by pulling the source from the public repo, like anyone else can, and gets a scoped "token" to post back a comment via the GitHub/GitLab API. That way, Codecov doesn't get privileged access.

I could almost see this work in private repos, albeit with an authenticated fetch token - I'm sure I've used a "deploy-only" key before with both GH and GL to do an authenticated pull. The repo would then be processed, and a response posted via commenting API.

Am I missing something here about why it's necessary to involve Codecov in your build/CI stage itself? Is it truly doing dynamic analysis, or is this some (in)convenience factor that could be eliminated? It just strikes me that this could sit entirely on the side, without access to anything (beyond the source itself). For enterprise, you could spin up an on-prem instance and give it the API tokens for authenticated pulls etc.
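
The read-only half of this already exists; GitLab, for instance, has project deploy tokens that can be scoped to read_repository (the hostname and token below are placeholders):

    # read-only clone with a deploy token scoped to read_repository
    git clone "https://gitlab+deploy-token-1:${DEPLOY_TOKEN}@gitlab.example.com/group/project.git"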


I submitted this PR to have them advocate minimal security measures in their README:

https://github.com/codecov/codecov-bash/pull/426

Days later and no reply except for the goddamn autogenerated codecov report.


Every time I read stuff like this I’m reminded why our company doesn’t like us using external services.


Same here. I even sometimes catch myself being kind of happy about incidents like this, because every single one gets us closer to the tipping point at which companies that stay away from third-party cloud services requiring unfettered access to some of their crown jewels (code, code documentation, issue data) finally have a serious moat again. Incentives are currently misaligned, in my opinion, mostly because of marketing hype around everything cloud-based and still too little awareness of the long-term consequences of outsourcing core competencies to whatever cloud service is currently hyped the most.

Our company is currently looking into alternatives to Jira due to Atlassian going the cloud-only route. One of the must-haves for whatever replaces Jira is the possibility of an on-premise installation. It's still early in the decision phase, but a surprisingly large number of candidates have already been ruled out due to this seemingly simple requirement.


Does your company really have enough people with the knowledge of how to securely run Jira/CI/GitHub/etc. tools, with minimal downtime, and with proper backups?

I'm sure there are companies out there where this is possible, but most of them (I'd wager it's more like 99%) won't be able to match cloud-based solutions.


It actually has been running Jira on-premise with zero unannounced downtime for ten years. So... yes, I don't think it's a problem to simply continue doing what we've done for the last decade.

CI and GitLab infrastructure are the same story: run on-premise on our own hardware for a decade, with only very minor downtime. It has also evolved from simple isolated Jenkins instances with manually configured build jobs to a shared build/CI cluster using containerization and a lot of automation for build-job management.

And it's not that we have an abundance of administrative people: a team of three (sometimes temporarily buffed up with a few more when large infrastructural tasks are planned) for about 200 developers in total.

I have a different suspicion: a lot of companies out there have simply forgotten that they are perfectly able to run such services themselves. Having people tell them all the time that they wouldn't ever be able to do it anyway (many of them with their own agenda, for example being employed by a cloud service provider) doesn't make things better in that regard.


> Does your company really have enough people with the knowledge of how to securely run Jira/CI/GitHub/etc. tools, with minimal downtime, and with proper backups?

Years ago, many companies ran similar services in-house, and downtime wasn't a common scenario.


It's becoming the exception though, sadly. And don't even think companies affected by this will reconsider their strategy here. Some manager in charge will remember the product $SNAKEOIL_COMPANY tried to shill last week and call back, ordering a round of AI-powered, cloud-based threat mitigation.

Well, let's see how this unfolds and who's behind it; maybe we get some nice surprise open-sourcing of interesting stuff. /schadenfreude


There's a reason I submitted this PR one year ago. https://github.com/graycoreio/daffodil/pull/625

Specifically,

> Relying on a curl'd bash script downloaded off your server seems inherently vulnerable (though this applies to many vendors in the space, apparently). Checksum procedures should be built into your docs, or you should be using a package registry (ala codecov-node).


curl | bash only highlights in a very obvious way how vulnerable we can be to supply chain attacks, but it is fundamentally not really different than downloading and running a setup.exe, a foo.dmg or a foo.deb, running npm/pip/gem/bundle install, or ./configure && make && sudo make install.

Even when (if, really) people check the hash, do they check that the hash was published and fetched through a second channel? Nope; people run arbitrary code unchecked all the time, so all of these are equivalent.

Even git clone can be dangerous. Do you audit all the repo branch names before cloning? Because depending on how your shell prompt shows e.g. branch names, it could be used for RCE.

EDIT: by this I don't mean to excuse curl | bash and such, I just mean to highlight that the issue runs deeper, curl | bash being the unburied part.


> is fundamentally not really different than downloading and running a setup.exe, a foo.dmg or a foo.deb, running npm/pip/gem/bundle install, or ./configure && make && sudo make install.

Well, it is, because any .EXE or .MSI or .DMG I download will pass through one or more virus scanners. You can also do this by forcing users through a proxy such as Artifactory, which can also scan Linux packages. But it's futile to rely on automated solutions for NPM or PyPI, where there is no curation of the repo, so anyone can do typosquatting.


A virus scanner doesn't help against supply chain attacks, where malicious code is often bespoke and looks entirely like original code to everyone except an actual human auditing the code/binary.


It does when the malicious file is discovered and its signature added to the database. Even if it was allowed to run before, subsequent runs can be blocked and the sysadmin notified.



This is a very concerning trend. The reliance on external tools is causing them to be a much more frequent attack vector.

SolarWinds, the GitHub Actions thing, this.


What does codecov offer as a service that you don't get with tools like Istanbul and Sonarqube? Is it just "ease of use"?


Ironically, our company started using Codecov as a measure to improve security. I don't know how far high code coverage correlates with a secure code base, but the security incident certainly had adverse effects.


> I don't know how far high code coverage correlates with a secure code base

This is a hard question. Basically, high code coverage in no way implies that your code is secure (or working correctly wrt. non-security aspects) at all.

But at the same time, not having reasonably good code coverage can often be an indicator that code is potentially not that good. Assuming non-malice of the author group, I would start by reviewing the parts of their code with little coverage (after reviewing the parts that are commonly messed up, and after running automatic analysis tools).

The important thing to realize is that most coverage is measured as a percentage of lines covered, and even 100% line coverage is far, far less than 100% of possible execution flows. And security vulnerabilities (as opposed to "just" normal bugs, assuming a security-aware dev) are not seldom tricky, unexpected execution flows through lines which may well have 100% line coverage.

So what I would say in general:

- Don't get fixated on any specific percentage of line coverage, especially not 100%.

- Focus on writing "good" tests instead of tests with high increase in coverage.

- Be security-aware wrt. supply chain attacks and similar. Due to the points above, I would argue there is in general no need to run coverage analysis anywhere but in CI, where you can put it into a different container, i.e. separate sandboxed CI stages for building artifacts from those which just do testing and/or analytics. Don't ever give CI write access to your repository directly (see the sketch below).
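
A hedged .gitlab-ci.yml sketch of that last point (job and script names are invented): build and analysis run as separate jobs in separate containers, and the analysis job inherits no secrets:

    stages: [build, analyze]
    build:
      stage: build
      script: [./build.sh]       # produces artifacts; may see deploy secrets
    coverage:
      stage: analyze
      inherit:
        variables: false         # third-party analysis gets no project secrets
      script: [./coverage.sh]    # coverage/analytics tooling runs isolated here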


I'm not impressed with Codecov in general -- I've got it on an open-source repo, and the reported results seem somewhat random. I've got PRs that only add test infra, and it complains that less than the target PR code-coverage percentage has been hit, when the PR was strictly "more testing".


You have to wonder how they use corporate logos in their customer lists. I am sure that there is no meaningful sense in which Google uses Codecov, but their logo is on the Codecov marketing site.


Google probably is a customer, but just for some open source stuff they share and where they can't expose their internal build system to the public.

Just an assumption though.


I don't know, it looks like pure bullshit to me. Under their customer profile for Google there is just a random blog post about how to set up their garbage for Kotlin.

https://about.codecov.io/blog/company/google/



