Open Source Security

Linus Torvalds On Git's Use Of SHA-1: 'The Sky Isn't Falling' (zdnet.com) 203

Google's researchers specifically cited Git when they announced a new SHA-1 attack vector, according to ZDNet. "The researchers highlight that Linus Torvalds' code version-control system Git 'strongly relies on SHA-1' for checking the integrity of file objects and commits. 'It is essentially possible to create two Git repositories with the same head commit hash and different contents, say, a benign source code and a backdoored one,' they note." Saturday morning, Linus responded: First off - the sky isn't falling. There's a big difference between using a cryptographic hash for things like security signing, and using one for generating a "content identifier" for a content-addressable system like git. Secondly, the nature of this particular SHA1 attack means that it's actually pretty easy to mitigate against, and there's already been two sets of patches posted for that mitigation. And finally, there's actually a reasonably straightforward transition to some other hash that won't break the world - or even old git repositories...

The reason for using a cryptographic hash in a project like git is because it pretty much guarantees that there is no accidental clashes, and it's also a really really good error detection thing. Think of it like "parity on steroids": it's not able to correct for errors, but it's really really good at detecting corrupt data... if you use git for source control like in the kernel, the stuff you really care about is source code, which is very much a transparent medium. If somebody inserts random odd generated crud in the middle of your source code, you will absolutely notice... It's not silently switching your data under from you... And finally, the "yes, git will eventually transition away from SHA1". There's a plan, it doesn't look all that nasty, and you don't even have to convert your repository. There's a lot of details to this, and it will take time, but because of the issues above, it's not like this is a critical "it has to happen now thing".

In addition, ZDNet reports, "Torvalds said on a mailing list yesterday that he's not concerned since 'Git doesn't actually just hash the data, it does prepend a type/length field to it', making it harder to attack than a PDF... Do we want to migrate to another hash? Yes. Is it game over for SHA-1 like people want to say? Probably not."
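
The type/length prefix Torvalds mentions can be illustrated with a short sketch. It assumes the blob header format `blob <size>\0` followed by the raw content, which is what `git hash-object` uses for files; the function names `git_blob_sha1` and `plain_sha1` are made up for this illustration and are not part of git.

```python
import hashlib


def git_blob_sha1(data: bytes) -> str:
    """Hash data the way git hashes a blob: header first, then content."""
    # Assumed header layout: object type, a space, decimal size, NUL byte.
    header = b"blob " + str(len(data)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + data).hexdigest()


def plain_sha1(data: bytes) -> str:
    """SHA-1 over the raw bytes only, as a tool like sha1sum would compute."""
    return hashlib.sha1(data).hexdigest()


if __name__ == "__main__":
    content = b"hello world\n"
    print("git blob id:", git_blob_sha1(content))
    print("plain sha1: ", plain_sha1(content))
```

Because the header is part of the hashed input, two files that collide under plain SHA-1 do not automatically collide as git blobs. An attacker would have to compute the collision with the header included.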

  • by ffkom ( 3519199 ) on Saturday February 25, 2017 @06:51PM (#53930751)
    Both happened in 2005. And SHA-2 was published 4 years earlier. So yes, the sky is not falling, and git can be made secure, but it also wasn't really wise to use SHA-1 when git was implemented, first.

    BTW: At the company I work for, we already replaced SHA-2 with SHA-3 for security reasons. Better safe than sorry.
    • So yes, the sky is not falling, and git can be made secure, but it also wasn't really wise to use SHA-1 when git was implemented, first.

      Why not? As the summary quotes Torvalds, this is simply used to guard against corruption. Using something stronger would have given people the wrong idea about what protections it offered. Sort of like how people who don't understand how HTTPS really works think "as long as I see the lock icon, I am OK and my transaction is safe". Even if it hadn't given people the wrong idea, it would have added computational overhead for no reason.

      If your repository needs to be cryptographically verifiable,

      • by ffkom ( 3519199 ) on Saturday February 25, 2017 @07:16PM (#53930939)
        If you read Linus' whole statement, you will also find the part where he writes "yes, in git we also end up using the SHA1 when we use "real" cryptography for signing the resulting trees, so the hash does end up being part of a certain chain of trust. So we do take advantage of some of the actual security features of a good cryptographic hash, and so breaking SHA1 does have real downsides for us."

        Regarding our use of SHA-3: We use cryptographic hash-sums as keys to cached data items that are not permitted for everyone to request. Thus we need to make sure that the cache keys cannot be "guessed" (like from knowing a valid cache key for a similar data item).
      • by Kjella ( 173770 )

        If you think that SHA-3 somehow magically makes everything more secure for verifying that data has not been modified in transit (e.g., installer gets corrupted while being downloaded) because you replaced all the SHA-2 hashes with SHA-3 hashes on the installer download page which is served over insecure HTTP, then I suspect you may not fully understand what threats you are trying to protect against.

        The point is that if you're trying to use a hash instead of a checksum, it'll actually work as advertised. If you only care about random bit flips CRC32 will work very well and be much faster than MD5 or SHA-1. If you're doing major overkill you might not care that a hash doesn't function as a hash because you don't actually need a hash but that's no reason to use a bad hash. You should either use a good hash or use a lesser solution that doesn't pretend to make promises it can't hold.
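
To make the checksum-versus-hash distinction concrete, here is a minimal sketch using only Python's standard library. The sample input and the helper name `flip_bit` are assumptions for illustration. Both CRC32 and SHA-1 notice an accidental single-bit flip; the difference between them only matters once someone is deliberately crafting the data.

```python
import hashlib
import zlib


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


original = b"int main(void) { return 0; }\n"
damaged = flip_bit(original, 42)  # simulate a random bit flip in transit

# A checksum and a cryptographic hash both change on accidental damage.
crc_changed = zlib.crc32(original) != zlib.crc32(damaged)
sha_changed = hashlib.sha1(original).digest() != hashlib.sha1(damaged).digest()

print("CRC32 detected the flip:", crc_changed)
print("SHA-1 detected the flip:", sha_changed)
```

CRC32 is only 32 bits wide and is linear, so it offers no resistance to deliberately constructed inputs. That is the "promise" a checksum never makes and a hash is supposed to keep.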

        • If you only care about random bit flips CRC32 will work very well and be much faster than MD5 or SHA-1.

          Well, not exactly.
          - MD5 and SHA-1 have fast hardware implementation on some CPUs. CRC32 won't necessarily be a huge performance gain.

          SHA-1 is used a bit more than a simple glorified checksum in GIT.
          It is also used to give a handy number by which you designate commits, etc.
          (i.e.: to compute a hash - e.g.: as would also be used in a hash look-up table).
          That requires good output uniformity.
          In other words you'd need a hashing function that "spreads" its output across the whole output domain.
          (to give an over-s

    • by geek ( 5680 )

      Linus really has no sense of security. He'll use whatever is expedient over what's wise. It's a shame really.

      • Linus really has no sense of security. He'll use whatever is expedient over what's wise. It's a shame really.

        How about describing the attack vector?

        • Linus really has no sense of security. He'll use whatever is expedient over what's wise. It's a shame really.

          How about describing the attack vector?

          The attack vector is straight out of OP's fuzzy behind.

          • Linus really has no sense of security. He'll use whatever is expedient over what's wise. It's a shame really.

            How about describing the attack vector?

            The attack vector is straight out of OP's fuzzy behind.

            It's like it's possible, for the most unlikely possible way you'd ever want to infect a computer. I'm more worried about getting hit by a meteor. Which is to say, not at all.

        • Linus really has no sense of security. He'll use whatever is expedient over what's wise. It's a shame really.

          How about describing the attack vector?

          Well, the "practical" attack, described here [techcrunch.com] required:

          This attack required over 9,223,372,036,854,775,808 SHA1 computations. This took the equivalent processing power as 6,500 years of single-CPU computations and 110 years of single-GPU computations.

          So, Step 1: Get a super-computer ... or rent a fuck-tonne of capacity at Amazon EC2 ...
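
For scale, the quoted figure works out to exactly 2^63 SHA-1 computations. The sketch below checks that arithmetic and compares it with the generic birthday bound of 2^80 for a 160-bit hash. The GPU count used for the wall-clock estimate is a made-up number, chosen only to show how the quoted 110 GPU-years shrinks when the work is spread out.

```python
# Figure quoted in the comment above.
quoted_computations = 9_223_372_036_854_775_808
assert quoted_computations == 2**63

# Generic birthday bound for finding any collision in a 160-bit hash.
birthday_bound = 2**80

# How much cheaper the published attack is than the generic approach.
speedup = birthday_bound // quoted_computations
print("speedup over birthday bound: 2^%d" % (speedup.bit_length() - 1))

# Quoted cost of the GPU part of the attack, in single-GPU years.
gpu_years = 110
hypothetical_gpus = 1000  # assumption for illustration, not from the source
days = gpu_years * 365 / hypothetical_gpus
print("wall-clock days on %d GPUs: %.1f" % (hypothetical_gpus, days))
```

So the attack is far out of reach for a hobbyist, but it is a rental bill rather than a physical impossibility.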

          • by HiThere ( 15173 )

            Even that wouldn't suffice, as your "corrupted copy" probably wouldn't compile.

          • This would be much harder than the PDF example by Google (which is quite impressive though). You'd need to generate a zlib compressed commit with a specific hash which references the correct parent commit's own hash, has consistent naming and dates and yields somehow valid code.

            This was said before, but again: the point of having SHA1 hashes in git is not security but to ensure reasonable uniqueness of objects (commit, trees, blobs or tags). SHA1 is (was?) a rather strong crypto hash so you do get some of i

            • by fisted ( 2295862 )

              and yields somehow valid code.

              Comments. You have infinite tries to get it right.

              • Not really. You can't just put random binary content in a comment.

                • by fisted ( 2295862 )

                  Yes you can. At least the compilers I know let you get away with it (you obviously have to strip possible comment terminators from the data but that goes without saying). As an example, I've just appended 1M worth of data from /dev/urandom (after sed(1)ing "*/" away) at the end of a hello world program. Compiled fine.

                  But the "random binary data" is a straw man anyway, because why would you even have to use random binary data? It's not like you don't have infinite tries with random printable ASCII.

              • and yields somehow valid code.

                Comments. You have infinite tries to get it right.

                But not infinite time, unless you're immortal - in which case you should have better things to do.

                • by fisted ( 2295862 )

                  But as of SHA1 being "broken", this is now considered possible in reasonable time. Currently it requires substantial computing power. Soon, it won't. Or might not anyway.

          • This attack required over 9,223,372,036,854,775,808 SHA1 computations. This took the equivalent processing power as 6,500 years of single-CPU computations and 110 years of single-GPU computations.

            So, Step 1: Get a super-computer ... or rent a fuck-tonne of capacity at Amazon EC2 ...

            Yeah, and don't forget that the (millions) of different hashed programs with the malevolent code will also need to replace the good copy. And compile. And install.

            Oy, such a lot of effort. And yup, there is a non-zero chance of this happening to me. Don't think I'll worry much though.

      • Care to substantiate that incredible claim?

    • BTW: At the company I work for, we already replaced SHA-2 with SHA-3 for security reasons. Better safe than sorry.

      This is a misunderstanding of the purpose of SHA-3. It was not designed to be a "successor" to SHA-2, but an alternative. There is no evidence that SHA-2 is insecure, or even that SHA-3 is more secure than SHA-2. It was simply picked because, at its core, the design is fundamentally different from SHA-2 so it is unlikely that both will be broken at the same time. There is no reason to move from SHA-2 to SHA-3 at this point.

    • by mrvan ( 973822 ) on Saturday February 25, 2017 @11:54PM (#53932089)

      BTW: At the company I work for, we already replaced SHA-2 with SHA-3 for security reasons. Better safe than sorry.

      In my country, we kicked out the Shas and migrated to Ayatollahs, who have a unique uniqueness guarantee!

    • by tlhIngan ( 30335 )

      Both happened in 2005. And SHA-2 was published 4 years earlier. So yes, the sky is not falling, and git can be made secure, but it also wasn't really wise to use SHA-1 when git was implemented, first.

      As a hash function, SHA-1 was perfectly adequate for how Git works.

      All Git uses SHA-1 for internally is to hash the contents of a file to turn it into a unique number. SHA-1 is a nice fast algorithm to do that, and 160 bits offers plenty of space to uniquely identify stuff. It's so good that all the other thing

  • By the time you could do something like trash it with crafted content, you could screw things over in less difficult ways...

    On the other hand, gpg still uses SHA1 for key fingerprints per the standard, which seems like that would be a much bigger risk. You can use other more secure hashes for digests, but fingerprint ids are SHA1, which was deemed inadequate for key fingerprints in X509...

    • by gweihir ( 88907 )

      Using SHA1 for fingerprints is not an issue at this time. All somebody could do is create two PGP keys with the same fingerprint. As keys do not really contain any critical information, unlike X.509 certificates, this matters little. So the reason SHA1 is "still" used there is that the people doing it actually understand what they are doing.

      • by Junta ( 36770 )

        If someone else makes a key with the same fingerprint as my key, couldn't they sign things and be trusted by people who have trusted my key, since keys are trusted by fingerprint?

        • by gweihir ( 88907 )

          They cannot do that. This is a two-sided hash-collision, i.e. the attacker needs to create both things that then collide when hashed. Creating a second key with the same hash as yours is still infeasible today, unless you cooperate with the attacker.
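
The gap between a two-sided collision and matching someone else's existing hash can be seen on a toy scale. The sketch below truncates SHA-1 to 16 bits so both searches finish quickly; the truncation, the message formats and the function names are all assumptions made for this demonstration, and the costs say nothing about full-length SHA-1 beyond the general shape of the difference.

```python
import hashlib
from itertools import count


def tiny_hash(data: bytes) -> int:
    """Toy 16-bit hash: the first two bytes of SHA-1, as an integer."""
    return int.from_bytes(hashlib.sha1(data).digest()[:2], "big")


def find_collision():
    """Attacker picks BOTH messages: stop at the first repeated hash value."""
    seen = {}
    for i in count():
        msg = b"msg-%d" % i
        h = tiny_hash(msg)
        if h in seen:
            return seen[h], msg, i + 1
        seen[h] = msg


def find_second_preimage(target: bytes):
    """Attacker must match a hash that somebody else already fixed."""
    want = tiny_hash(target)
    for i in count():
        msg = b"try-%d" % i
        if msg != target and tiny_hash(msg) == want:
            return msg, i + 1


if __name__ == "__main__":
    a, b, tries = find_collision()
    print("collision after", tries, "tries:", a, b)
    forged, tries = find_second_preimage(b"victim key")
    print("second preimage after", tries, "tries:", forged)
```

For an n-bit hash the first search needs on the order of 2^(n/2) tries and the second on the order of 2^n. The published SHA-1 attack improves on the first kind of search only.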

  • Yeah... (Score:3, Insightful)

    by Stormy Dragon ( 800799 ) on Saturday February 25, 2017 @09:11PM (#53931481) Homepage

    If somebody inserts random odd generated crud in the middle of your source code, you will absolutely notice

    Like how Heartbleed was immediately noticed and didn't sit there for two years.

    This is one of those things Open Source proponents keep saying that isn't actually true, because people aren't really auditing the code.

  • by gweihir ( 88907 ) on Sunday February 26, 2017 @07:02AM (#53933009)

    That really annoys me no end. There is some gradual improvement in a specific attack, expected by everybody that has a clue and not seen as anything dramatic by the same people. And immediately a horde of people with no understanding of crypto swoop in and declare the sky to be falling and all uses of this thing are now invalid. This is really just utterly pathetic.

    Example: I have to constantly defend the use of SHA1 for password hashing. (Sure, something like pbkdf2 or Argon2 should come later if the password may be low-entropy and gets stored. That is not always the case.) The thing is that password hashing has the purpose of preventing the hash being turned into a password again. Collision attacks have no impact on that at all. For a collision attack you would need to know the password and then you could find a second one with the same hash (or rather with the two-sided, much easier, variant you can find two passwords that map to the same hash). Now, these nitwits completely overlook that the situation when using hashes in signatures always is that you already have what gets signed, which is completely different to the password situation. Still they claim "SHA1 is broken!". No it is not. It is broken for some specific _different_ application.
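
For the stored, possibly low-entropy case mentioned in the parenthesis, a key-stretching step might look roughly like the sketch below. It uses `hashlib.pbkdf2_hmac` from Python's standard library; the iteration count, the salt length, the choice of SHA-256 as the underlying hash and the function names are assumptions for illustration, not a recommendation taken from this thread.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor; tune for your own hardware
SALT_BYTES = 16       # assumed salt length


def hash_password(password, salt=None):
    """Derive a slow, salted hash of the password with PBKDF2-HMAC."""
    if salt is None:
        salt = os.urandom(SALT_BYTES)
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, derived


def verify_password(password, salt, expected):
    """Recompute the hash and compare in constant time."""
    _, derived = hash_password(password, salt)
    return hmac.compare_digest(derived, expected)


if __name__ == "__main__":
    salt, stored = hash_password("correct horse battery staple")
    print(verify_password("correct horse battery staple", salt, stored))
    print(verify_password("wrong guess", salt, stored))
```

The security here comes from the salt and the iteration count slowing down guessing, which is a separate concern from collision resistance.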

    Why so many non-experts think they can voice a qualified opinion about a very hard mathematical topic is beyond me.

    What Linus says here is exactly right and it is a statement by an expert. All those criticizing him are basically people that can put on a band-aid telling a brain-surgeon how to do his work. They just do not get it at all.

    • There's nothing wrong with that use of SHA1, but I can't think of a threat model in which it actually accomplishes anything useful, not because SHA1 is defective, but because passwords are. If an attacker gets the hash, he can almost certainly recover the password. Further, your implied threat model seems to assume that an attacker may be inside the system (which is a good assumption), where he can grab the in-flight hashes. But if that's the case, what prevents the attacker from replaying the hashes? At th

  • Torvalds said on a mailing list yesterday that he's not concerned since 'Git doesn't actually just hash the data, it does prepend a type/length field to it', making it harder to attack than a PDF

    I didn't get that one. PDF has a preamble too, right? Which the researchers were able to reproduce just fine in both "shattered" PDFs.
