Forgot your password?
typodupeerror
Encryption Security Announcements

SHA-0 Broken, MD5 Rumored Broken 707

Posted by CowboyNeal
from the lucky-numbers dept.
An anonymous reader writes "Exciting advances in breaking hash functions this week at the CRYPTO conference. SHA-0 has definitely been broken (collision found in the full function). Rumors are that at the informal rump session, a researcher will announce a collision in full MD5 and RIPEMD-128. And Ed Felten is speculating about collisions in SHA-1! Many systems, especially those that use cryptography for digital signatures are most at risk here."
This discussion has been archived. No new comments can be posted.

SHA-0 Broken, MD5 Rumored Broken

Comments Filter:
  • by Anonymous Coward on Monday August 16, 2004 @09:57PM (#9987006)
    I picked the wrong week to quit sniffing MD5 hashes.
  • by Anonymous Coward on Monday August 16, 2004 @09:57PM (#9987007)
    d008960fa6b395dca1c8362165bb31be
  • Consequences? (Score:5, Insightful)

    by Fruny (194844) on Monday August 16, 2004 @10:00PM (#9987034)
    Can a crypto-geek sum up the consequences for all of us dummies? Thanks.
    • Re:Consequences? (Score:5, Informative)

      by mblase (200735) on Monday August 16, 2004 @10:12PM (#9987147)
      I looked up hash collision [wikipedia.org] (e2 [everything2.com]) and hash function [wikipedia.org] (e2 [everything2.com]) at Wikipedia and Everything2, which clarified the summary quite a bit.
  • by ThisNukes4u (752508) <tcoppiNO@SPAMgmail.com> on Monday August 16, 2004 @10:00PM (#9987037) Homepage
    What are the possible rammifications if MD5 is confirmed to have a collision? What other hashing algo's are there that may take its place? And even if it does have collisions, is that serious enough to break many security systems already in place?
  • by ejito (700826) on Monday August 16, 2004 @10:04PM (#9987067)
    "We are
    glad to announce that we found a collision for SHA-0." - from the article [mail-archive.com]
    I just found the wording kinda weird... I'm hoping to do research in cryptography in the future. I know I'd feel quite proud if I found a vulnerability like that, but is it appropriate to show such enthusiasm? Kinda like an overjoyed astronomer that finds a comet heading into a collision course with Earth.
  • Umm... (Score:5, Interesting)

    by Anonymous Coward on Monday August 16, 2004 @10:04PM (#9987069)
    I'm a neophyte, so excuse my ignorance, but how does the fact that a full-time researcher (working on the SHA-0 algorithm), using a computation requiring: (direct quote follows):
    The computation was performed on TERA NOVA (a 256 Intel-Itanium2 system developped by BULL SA, installed in the CEA DAM open laboratory TERA TECH). It required approximatively 80 000 CPU hours. The complexity of the attack was about 2^51.
    ... in order to find two different input values which produce the same output value (I presume that's what they did.. I could be wrong) make _any_ sort of practical difference?
    • Re:Umm... (Score:3, Insightful)

      I'm a neophyte, so excuse my ignorance, but how does the fact that a full-time researcher (working on the SHA-0 algorithm), using a computation requiring: (direct quote follows): ... to find two different input values which produce the same output value (I presume that's what they did.. I could be wrong) make _any_ sort of practical difference?

      Once upon a time, you needed millions of dollars and specialized hardware to crack 56-bit DES. Now it can be done with commodity hardware and a little time. Moo

    • Re:Umm... (Score:5, Informative)

      by addaon (41825) <addaon+slashdot@NOsPAM.gmail.com> on Monday August 16, 2004 @10:24PM (#9987252)
      Good question. :-)

      The complexity of the attack was around 2^51.

      SHA-0 is a 160-bit hash. Now, you might at first think that means it would take aorund 2^160 tries to get two strings that collided, but because of the birthday paradox (what's the probability that two people on a football field have the same birthday) it's actually supposed to be the square root of that, or 2^80.

      Unfortunately, 2^51 is much, much less than 2^80. Which means that this crack was not just brute force (which is kinda redundant, since the definition of a crack to a cryptosystem is a violation like this without using brute force, but whatever). That means they've found a mathematical technique for making this 'easy'. Generally, once the idea is out there for an easy crack, people can gnaw on it and make it more general or easier, bringing this down to every-day-concern level.

      • Re:Umm... (Score:5, Informative)

        by yppiz (574466) on Monday August 16, 2004 @10:41PM (#9987377) Homepage
        Mod parent up. I just waded through 50 posts, and this is the only one that got it right.

        The key is not that they found a collision, but that they did so with substantially less than brute-force effort. Substantially less here means 30 - 130 orders of magnitude less (depending on whether you call SHA-0 a 2^160 space or a 2^80 space because of the birthday paradox mentioned in the parent article).

        --Pat / zippy@cs.brandeis.edu

  • by WhatAmIDoingHere (742870) <sexwithanimals@gmail.com> on Monday August 16, 2004 @10:04PM (#9987073) Homepage
    As long as we don't tell anybody, it doesn't exist right?

    Oh...
  • Broken how? (Score:4, Funny)

    by Matey-O (518004) <michaeljohnmiller@mSPAMsSPAMnSPAM.com> on Monday August 16, 2004 @10:04PM (#9987076) Homepage Journal
    If it's brute force, I'm not worried. If it's a cryptologically trivial computation, I'll have to go back to ROT26.
  • by Josh Booth (588074) * <joshbooth2000NO@SPAMyahoo.com> on Monday August 16, 2004 @10:04PM (#9987077)
    Isn't it always going to happen when you make a hash code that is shorter than the string you make it from there is the possibility of collisions? Or is it just that there was found two equal length strings that have the same hash? Either way, is it that big of a deal?
    • by shadowmatter (734276) on Monday August 16, 2004 @11:15PM (#9987628)
      Indeed, there are infinitely many unique bit strings we may take the hash of, and yet only 2^160 hash values they can map to. By the pigeonhole principle, some hash value is mapped to by at least two unique bit strings.

      (The pigeonhole principle says that if you shoot N pigeons with N+1 bullets, and don't miss, one pigeon has two holes... or something like that.)

      Now hash functions can be classified as either weak collision resistant or strong collision resistant.

      A hash function h is weak collision resistant if, given h and a bitstring x, it is impossible to find some other x' such that h(x) = h(x'). Note that x is specified.

      A hash function h is strong collision resistant if it is infeasible to compute any collision (x, x') of h.

      So, if a collision was found in MD5, it's no longer strong collision resistant (since the person found a pair x, x' such that h(x) = h(x')). Its when MD5 is no longer weak collision resistant that things start to get scary -- meaning, given any arbitrary bit vector, I can find another bit vector that generates the same hash value in a reasonable amount of time. This means that, after you download your favorite Linux ISO, just making sure the MD5 hash checks out does not garuantee the authenticity of the file.

      But no worries. There's still SHA-1. And if that isn't good enough, there's its bigger cousins, SHA-256, SHA-384, and SHA-512 [nist.gov].

      - sm
  • by shawn_f (620177) * <wicked...vertigo@@@gmail...com> on Monday August 16, 2004 @10:06PM (#9987107) Homepage Journal
    I am curious about 2 things:

    Firstly, perhaps someone here on slashdot.org might be able to shed some light on the particular encryption algorythms mentioned in the article...thier basic differences, weaknesses and strengths.

    Secondly, I am assuming that a "collision" is not as seriuos as a crack...what is a collision, and what are the ramifications of this in my ssh sessions and the like...

    I have read about and used encryption, but have never really delved into it. I would venture to say that a good many of us would benefit from some enlightenment.
    • by ctr2sprt (574731) on Monday August 16, 2004 @10:43PM (#9987382)
      Well, basically, there is a kind of data structure called a hash table. The idea there is that you take an arbitrary-length key and convolve it so you get a fixed-length, shorter result called a "hash." The hash is then used as an index into some other kind of data structure, typically a flat array. This lets you get very fast accses to anything in the hash table, provided that your hashing function is a good one. The main factor which hurts hashes is a function that doesn't evenly spread data across the entire table. Consider the English language. Some letters start words much more frequently than others. Using the first letter of the key would therefore be a bad hash function, since there would be a lot of keys with the same hash and several with almost none.

      So, very simply, collisions are when a hash function takes two different input keys and produces the same hash for both as output. Going with our English-language library example from above, "Stevenson, Robert Louis" and "Sheridan, John" would both produce the same hash ("S"). That's all there is to a collision. They are inevitable when you take longer strings and turn them into shorter ones.

      Now what this has to do with security may be becoming clearer. Because SHA1 and MD5 are good, general-purpose algorithms, they are used in lots of places. They are used to store your password. (You enter your password, the system runs MD5 or whatever against it, then stores that. When you need to verify your password, it runs the same process and compares the hashed versions.) They are also used as "digital signatures" to verify that the content of an object hasn't changed. And they are, of course, used in hash tables with a lot of disparate entries.

      But this is genuinely not something you should be worried about. Like I said, collisions are inevitable. But the computational effort required to find one is far from trivial. It's much easier for someone to crack your password using a dictionary-based attack or social engineering or pretty much anything else, at least for now. When you should worry is if someone finds a way to take a hash and use that to produce something which, when hashed again, will result in the original: that is, two-way hashing. MD5 and SHA1 are both allegedly one-way hashes, so you cannot ever go from the hash to any sort of original data. This is why they're secure for passwords and the like. But if that should turn out to be wrong...

      I can't speak much to specifics about MD5 and SHA1 because I don't really have the background in math to do so. (I have the background to write a computer program implementing the algorithms, but I can't explain why they work so well.) MD5 is older and has been suspected to be not-so-good for a few years now. SHA1 is newer and better. That's about all I can tell you.

  • by Vilim (615798) <ryanNO@SPAMjabberwock.ca> on Monday August 16, 2004 @10:07PM (#9987111) Homepage

    I have been expecting the MD5 crack for a while, it just isn't a secure hash anymore. SHA-0 was proven to be mathematically weak back in '98. There was no real need to brute force it. I highly doubt that SHA-1 was cracked, if it was, we are in trouble, is there any better hash to replace it? I figured that we would get quite a few more years out of SHA-1

    What we really need is a mathematically strong hash which will let you user define its strength. For example the first byte of the hash tells the program how strong the hash is. As the strength byte increases, the mandatory execution time of the hash increases exponentially.

    • by g-san (93038) on Tuesday August 17, 2004 @12:22AM (#9987983)
      hmmmm....

      I have heard rumors of a cypher on the street called SHA-X. It's not mathematically strong, as you so eloquently put, but it's supposed to be really good, really stong stuff. And is really asymmetrical, meaning it takes less time to decypher the message after encryption. Unfortunately it uses a semi-random keysize, so you never know the strength until you try to decrypt. It also has a key that destroys itself 48 hours later so Alice or Bob can't even tell you were ever encrypted. Only problem is the algorithm tends to overuse one particular register resulting in spontaneous cpu burnout.

      But hey, if you got extra cpus...

  • How is this news? (Score:5, Informative)

    by 0racle (667029) on Monday August 16, 2004 @10:10PM (#9987131)
    Im reading 'Practical Cryptography' by Niels Ferguson and Bruce Schneier, a facinating read on current crypto and cypher systems btw. Now it was copyrighted 2003, and there (page 88) they talk about SHA-0 being broken when it was created and give the distinct impression that SHA-1 is not to be trusted.

    On a side note, I would recommend 'Practical Crytography' to anyone interested in, or working with modern crytography.
    • Re:How is this news? (Score:5, Informative)

      by Anonymous Coward on Tuesday August 17, 2004 @01:14AM (#9988256)
      Okay, the story goes like this:

      NIST (working with the NSA, but in a GOOD way) develops the SHS, of which SHA is part, during their effort to develop the DSS (of which the DSA standard is part). The standard is published, and everybody rejoices.

      A little while later, NIST says, "Hey! There's a new version of SHA! We're calling it SHA-1. There are, uh, 'security concerns' with the old SHA. We can't tell you what it is, but don't use it. By the way, the main difference between SHA and SHA-1 is the introduction of a shift instruction. Buh-bye!"

      The obvious inference here is that the NSA's crypto folks found a nasty problem with the original algorithm, and had it changed in the interest of keeping the Secure Hash Algorithm, well, secure.

      Now, this catches the interest of cryptologists. They spend a bunch of time analyzing the algorithm, and a couple of folks found that there are a few attacks-- very theoretical, very impractical-- that could find a collision in slightly less time than a brute force search or a birthday attack.

      With that, everybody stops trusting the original SHA. Most people decide to use SHA-1 or MD5 instead-- but SHA-1 has a longer hash length than MD5 (SHA-1 is designed for ~80 bits of security, MD5 designed for ~64 bits), so a lot of people decide to trust SHA-1 instead.

      What's happened at CRYPTO '04 is a significant improvement on the attacks discovered previously. A birthday attack on the SHA-0 algorithm would normally be on the order of 2^80 memory and 2^80 time. This attack used 2^51 time and significantly less memory-- in other words, this attack is somewhere in the area of 500 million times faster than previously known possible.

      As for the idea that "SHA-1 is not to be trusted", well-- maybe the more paranoid folks avoid it, but a lot of applications still use SHA-1:

      For the record, SHA-1 is used in PGP (but not exclusively, and not necessarily as the default hash algorithm), in SSL (again, this can be configured), in a number of TripWire-like programs, by Gentoo's "emerge" system, the *BSDs' "ports" trees, and-- this is important, here-- as the only officially recognized hash for use in the digital signature standard (DSS).

      So an attack on SHA-1 would be VERY significant.
  • Collision != Broken (Score:3, Informative)

    by scovetta (632629) on Monday August 16, 2004 @10:12PM (#9987146) Homepage
    A hashing algorithm H is "broken" when for an arbitrary input A, you can feasibly calculate another input B for which H(A) = H(B).

    All they found here were two values C, D for which H(C) = H(D). Anyone who thinks this was suprising needs to take a look at the idea of a hash:
    Start out with 512 bits (2^512 possible values)
    Hash it to 128 bits (2^128 possible values)
    For each possible value to come out of the hash, you're going to have (2^(512-128)) = 2^384 collisions.

    If they could find two data values with less than, say, 80 bits of data that both hash to the same, then that's something, 'cause that could be someone's password. And if two values of 80-bit data hash to the same thing, then the algorithm is bad.

    I just don't think this is anything special. It's sort of like saying, "look, i found my password in the newspaper by circling various letters and numbers thoughout the articles." Or maybe not, it's still just not that great.

    • by 0x0d0a (568518) on Monday August 16, 2004 @10:18PM (#9987197) Journal
      All they found here were two values C, D for which H(C) = H(D). Anyone who thinks this was suprising needs to take a look at the idea of a hash:

      The existence of collisions is obvious. The point is that it should be so phenomenally difficult to find a collision that you can't ever come up with one in a sane amount of time.

      If they could find two data values with less than, say, 80 bits of data that both hash to the same, then that's something, 'cause that could be someone's password.

      That would be unlikely to be a big deal. For passwords, I can just brute force the password, maybe precompute a dictionary of password hashes.
  • wha?? (Score:3, Funny)

    by flacco (324089) on Monday August 16, 2004 @10:15PM (#9987174)
    i don't care about the implications for crypto or the science behind all of this. i just want to know what the fuck a "rump session" is, and would appreciate tips on avoiding them if i should go to such a conference.
  • by Facekhan (445017) on Monday August 16, 2004 @10:24PM (#9987254)
    I would just like to mention that one of the biggest problems of it becoming feasible to find a collision in MD5 is that a lot of routers use MD5 to authenticate routing updates with one another. If a hash is sniffed and the password is cracked then it becomes a trivial matter to inject bad routing updates and crash large networks especially if the inter-ISP BGP links are cracked. Its not quite as simple as I put it here but it is possible.
  • Summary (Score:5, Informative)

    by TheRealFoxFire (523782) on Monday August 16, 2004 @10:35PM (#9987331)

    First, a little about SHA and SHA-1. SHA (0) was developed as a national standard hash function. Curiously, a last minute change was proposed by the NSA which resulted in SHA-1. The change was puzzling the cryptographers in academia. After some time, an attack was discovered on some classes of cryptographic hashes which SHA-1 turned out to be curiously immune to.

    All that aside, one should view cryptographers a bit like plumbers. Cryptographers have a bag full of equipment (algorithms), which they combine in many ways to accomplish what they want. After much research and testing, certain ways of combining those pipes, valves, and spigots are 'proven' to work well, within given assumptions, such as expecting an L-joint not to leak. SHA-1 likewise is an integral component in many cryptographic systems.

    SHA, MD5, RIPEM, and others are cryptographic hash functions. Cryptographers expect certain things out of a hash function. Its job is to take a variable amount of input bytes, and give you back a small, fixed number of bytes. Most importantly:

    1. The function is 'one-way'. You can't determine from the output what any of the input was.
    2. The probability that two inputs hash to the same output should be exceedingly low. In particular, an ideal hash function has a randomly distributed output for a large set of inputs.
    3. Given a hash output, it should be extremely difficult to construct another document which also hashes to that output.
    In this case, it seems that it may be much more easy than the designers had hoped to break the second condition. This tends to mean that 3 is easier as well, which has ramifications for security protocols.

    So, in summary, this discovery is a bit like finding out that an L-joint which has been used reliably by plumbers the world over is much more likely to leak than previously thought.

    But we haven't seen the results yet...

  • by roca (43122) on Monday August 16, 2004 @10:45PM (#9987398) Homepage
    It's true.

    Paper here:
    http://eprint.iacr.org/2004/199.pdf

    Independently verified here:
    http://www.mail-archive.com/cryptography@me tzdowd. com/msg02579.html
    • by RedSlash0 (744564) on Tuesday August 17, 2004 @12:22AM (#9987981) Homepage
      They didn't get a real MD5 collision. The authors of that paper got the IV vector endianness wrong on the MD5 algorithm and therefore the algorithm used wasn't MD5. However, the IV vectors don't seem to have any significance on the MD5, so I wouldn't think it'd be hard for them to produce a real collision after fixing up the algorithm.
  • by Pan T. Hose (707794) on Monday August 16, 2004 @11:24PM (#9987689) Homepage Journal
    Slashdot reports that CowboyNeal posts that an anonymous reader writes that rumors are that at the informal rump session, an unknown researcher will announce a collision in full MD5, two ACs confirm, all slashdotters consider MD5 definitely proved broken, film at eleven. That is what I call good journalism.
  • SHA-1 is not SHA-0 (Score:4, Informative)

    by Effugas (2378) * on Monday August 16, 2004 @11:39PM (#9987774) Homepage
    It's worth pointing out that it was widely assumed that there was a serious flaw in the original SHA, enough that NSA saw fit to add that final left shift at the end of each round. SHA-1 *exists* because there's a problem in SHA-0. The original SHA is not "slightly less secure" because it just lacks the fix; the nature of the algorithm is such that the slight variation NSA introduced created enormous deviations in the output function. Unless there's a fundamental architectural hole -- possible, see MD4/MD5 -- the original SHA could fall to pieces and it wouldn't mean SHA-1 was dead in the water.

    I don't really know what Ed Felten means regarding "weaker cousins" of SHA-1 being the only other popular hashes. SHA-256 is a cousin, but has a larger hash size and is generally considered to be stronger. SHA (SHA-0?) is a cousin, but I've never seen it deployed anywhere. MD4/MD5 are still mildly popular, but they're at best design ancestors and not cousins -- there's no way a break in SHA-1 is going to make them any less secure (they have their own issues, of course). RIPE-160 and TIGER are the only two other hashes I've seen in the field, and they too have nothing to do with SHA-1.

    There might be something here -- the left shift could have been a band-aid solution to an architectural fault in the SHA design, and there may be lots of curiosity about whether the new SHA-0 attack routes neatly around the fix. We'll see.

  • For those who are seriously following this, you've probably seen the paper claiming to break MD5 [iacr.org]. I immediately started playing with confirming their results, but failed. There was some seriously strange stuff going on.

    Eventually I gave up trying to reproduce the hashes, and went to looking online. I found a good summary explaining the mistake the authors made (endianness problems, mostly) at this website [rtfm.com].

    The end result is that they didn't break MD5 -- yet. But their result can probably be modified to break the real MD5. Looks like we have a few more days till the world ends. ;)

  • crypto news flash... (Score:5, Informative)

    by xquark (649804) on Tuesday August 17, 2004 @01:59AM (#9988437) Homepage
    This person's intro into this "news flash" is so out of wack I don't know where to begin!
    Lets start with SHA-0 collisions, methods for determining collisions in the original
    SHA algorithm have been known since 98'. In 95' the NSA issued a modification to the
    SHA-0 (original algorithm) which became SHA-1, the modification was to counter an attack
    unknown to open researchers known as parallel shifting. The two French guyz which found
    collision in SHA-0 in 98' were the first open researchers to encounter this attack method.
    A side point I would like to make is that the NSA made a similar change back in the early
    70's to the IBM DES algorithm which until 89-90 was not known why such a change was needed.
    The attack the modification was protecting against was differential cryptanalysis. early 90s
    there was a 20 year difference in knowledge late 90s there was only 3 years difference in
    knowledge GO FIGURE!

    So in theory the SHA news is old news, as far as MD5 and RipeMD, well anyone that has the
    slightest clue of the mathematics behind message digests will know that all MDs derive from
    the same logic and same mathematics, and that a flaw found in one algorithm can be easily
    transferred to another algorithm if that a particular algorithm hasn't already dealt with
    that specific attack/flaw.

    And on a final note:
    "Many systems, especially those that use cryptography for digital signatures are most at risk here."

    Tell me of system in the world that uses security and does not make use of hash functions?

    Arash Partow

    ________________________________________________ __
    Be one who knows what they don't know,
    Instead of being one who knows not what they don't know,
    Thinking they know everything about all things.
    http://www.partow.net
  • by flux (5274) on Tuesday August 17, 2004 @02:47AM (#9988582) Homepage
    Well, it's quite simple actually. Let's take an arbitrary md5sum for instance:

    d3b07384d113edec49eaa6238ad5ff00

    Now, we obviously can see that the beginning of the data is complete gibberish. However, may I point your attention to the trailing three nibbles: f00. This is a clear clue! Let's use that as a base for our educated guess:

    % echo foo | md5sum

    d3b07384d113edec49eaa6238ad5ff00 -

    And voilá, we're cracked it!
  • Sigh... (Score:5, Insightful)

    by Kjella (173770) on Tuesday August 17, 2004 @03:51AM (#9988765) Homepage
    1. If you can produce a hash collision between two random inputs, that is rarely a problem. Either input will be junk. It doesn't matter that you have two interchangable pieces of junk.

    2. If you can produce a hash identical to a desired hash, you can mostly only sabotage (poison) a transfer.

    3. If you can take an existing file and append data to match a hash, you have a total and very dangerous compromise.

    From what I can tell, we're at #1. Make it #2, and things will break as people move to a new algorithm, but I doubt there'll be much problems. Now do #3, and I'm worried...

    Kjella
  • by michael path (94586) * on Tuesday August 17, 2004 @11:18AM (#9991537) Homepage Journal
    Ed Felten posted a follow-up on this article here [freedom-to-tinker.com].

    As it turns out, there was an error in their findings - and that the MD5 value created is NOT the same in this incident.

(1) Never draw what you can copy. (2) Never copy what you can trace. (3) Never trace what you can cut out and paste down.

Working...