Communications Encryption Privacy Security News

Chapel Hill Computational Linguists Crack Skype Calls 156

Posted by timothy
from the hookt-on-fon-iks-indeed dept.
mikejuk writes "You might think of linguistics as being interesting but not really useful. Now computational linguistics [PDF of original paper] has been used to crack Skype encryption and reconstruct what is being said in a VoIP call. What is surprising is that though they are encrypted, the frames that make up a Skype call contain clues about what phonemes are being spoken."

  • Speach recognition (Score:4, Insightful)

    by city (1189205) on Thursday May 26, 2011 @04:23PM (#36255540)
    My Google Voice voicemail transcription gets about 1 out of 4 words correct. Can Google please buy this company already.
    • Re: (Score:2, Funny)

      by Anonymous Coward

      Do you speak as well as you spell?

    • by bhcompy (1877290)
      Vonage gets about 75%. Not bad. I think, secretly, that they hire people in India to do it.
      • by fragfoo (2018548)

        Vonage gets about 75%. Not bad. I think, secretly, that they hire people in India to do it.

        I should have guessed when the robotic voice sounded like Apoo! Vonaaage!

    • by city (1189205)
      or hire these researchers I should say.
    • by jmcbain (1233044) on Thursday May 26, 2011 @04:51PM (#36255948)
      The ignorance of the statement "You might think of linguistics as being interesting but not really useful" is simply astounding. Linguistics provides the foundation and formal frameworks for grammar, syntax, morphology, phonetics, and semantics that allow us to better understand language. From that basis, computational linguistics is seen simply as an application of linguistics, and computational linguistics of course leads to information retrieval, automatic speech recognition, text classification, and other fields that are among the most important computing topics of the 21st century. Ignorantly saying linguistics is interesting but not useful is like saying physics and chemistry are interesting but not useful.
      • Yet it is no less true that someone reading that statement may indeed hold that opinion. I've always found it very interesting... How else might we ever develop human/animal translators?
      • by sgt scrub (869860)

        Linguistics provides the foundation and formal frameworks for...

        Agreed. mikejuk was obviously feeling dyslexic. In point of fact, nobody likes to discuss linguistics. It is boring as hell. :p

      • by formfeed (703859)

        The ignorance of the statement "You might think of linguistics as being interesting but not really useful" is simply astounding.

        Right. Would they have a linguist on basically every interplanetary mission, if they were just a bunch of useless bookish nerds ?!!

      • by qc_dk (734452)

        The ignorance of the statement "You might think of linguistics as being interesting but not really useful" is simply astounding.

        What ignorance would that be? I read that statement as a hypothesis that the common man might hold linguistics to be not really useful. The statement makes no claim whatsoever that linguistics is in fact not useful. In fact it makes the exact opposite claim.

        Do you believe that the commonly held opinion is that linguistics is useful? or simply some academic pursuit for bearded people with leather patches on their elbows. I think you can easily get a feeling by looking at research grants to linguistics a

        • by rk (6314)

          The common man might think this, but this is slashdot, where I hope the level of computer science and software engineering clue is still a bit higher than the background levels. Such people should already be aware of the close linkages between linguistics and computer programs and systems.

          But maybe I'm just engaging in wishful thinking now.

    • by vegiVamp (518171)

      Right now that would probably entail buying Microsoft. What could possibly go wrong?

  • Side channel attack (Score:5, Informative)

    by betterunixthanunix (980855) on Thursday May 26, 2011 @04:25PM (#36255566)
    The wording in TFS is a little misleading; they did not "crack Skype encryption," they found an exploitable side channel in Skype. The crypto itself has not been cracked, but it was being used in a way that leaked lots of information.
    • Re: (Score:2, Interesting)

      by Anonymous Coward

      The simple description is: By looking at the size of the encrypted data packets you can guess what phonemes were spoken. Yes, that's all there is to it. They are just looking at how much data is sent and guessing what might be said that reasonably fits in that size.

      An obvious simple fix would be to vary the length of the packets with random padding (using a cryptographically secure random algorithm to determine the length). It would add overhead but probably not that much considering how small these pac

      • A simpler fix would be to use a different method of compression, which does not vary the length of its output frames.
      • by NoSig (1919688) on Thursday May 26, 2011 @05:57PM (#36256774)
        If the padding is random you'll decrease the amount of information leaked, but there may still be enough information leaked to reconstruct some conversations. What you really need for total security from this attack is to eliminate the side-channel completely, such as by sending packets of the same size and with the same frequency no matter how much data you've actually got that needs sending. That is a form of padding too, but it is better than random.
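A minimal sketch of the point above, under invented numbers: two hypothetical sounds encode to 20-byte and 40-byte frames, random padding adds 0 to 31 bytes, and the fixed packet size is 64 bytes. None of these values come from Skype or the paper. Over many observations the random padding averages out and the gap between the two sounds is still visible, while fixed-size padding leaves nothing to measure.

```python
import random
import statistics

def observed_len_random_pad(frame_len, max_pad, rng):
    """Length an eavesdropper sees when 0..max_pad random padding bytes are added."""
    return frame_len + rng.randint(0, max_pad)

def observed_len_fixed(frame_len, packet_size):
    """Length an eavesdropper sees when every frame is padded to packet_size."""
    if frame_len > packet_size:
        raise ValueError("frame does not fit in the fixed packet size")
    return packet_size

rng = random.Random(1)  # seeded only so the sketch is repeatable

# Two hypothetical sounds whose encoded frames are 20 and 40 bytes long.
short_seen = [observed_len_random_pad(20, 31, rng) for _ in range(1000)]
long_seen = [observed_len_random_pad(40, 31, rng) for _ in range(1000)]

# With random padding the average observed sizes still differ by about 20 bytes.
gap_random = statistics.mean(long_seen) - statistics.mean(short_seen)

# With fixed-size padding both sounds look identical on the wire.
gap_fixed = observed_len_fixed(40, 64) - observed_len_fixed(20, 64)
```

A real attacker does not get 1000 clean repeats of one sound, so this overstates how easy the averaging is, but it shows why random padding only blurs the side channel instead of closing it.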
        • If the padding is random you'll decrease the amount of information leaked, but there may still be enough information leaked to reconstruct some conversations. What you really need for total security from this attack is to eliminate the side-channel completely, such as by sending packets of the same size and with the same frequency no matter how much data you've actually got that needs sending. That is a form of padding too, but it is better than random.

          ^^ This. I'm actually surprised to hear that with Skype the packets are of variable length and (somewhat) a function of the contents. I would have imagined that, after encryption, the communication protocol would split the content into packets of either random or same size.

          But OTOH, there might have been performance implications that forced Skype to not do just that. After all, there are legit reasons to not do super encryptions (as with the Predator's unencrypted download links [schneier.com].)

      • by Eivind (15695)

        Yeah. Furthermore, this is a *really* old and *really* well-known side-channel. Everyone knows, and has known for many decades, that crypto by itself is no defence against traffic-analysis, that is, you still know what size the packets are, and who the sender and recipient is, and the frequency they're sent with.

        The only way to thwart that completely, would be to send a constant stream of constant-size packages regardless of if anything is being said or not, this is an easy fix, but it conflicts with the go

        • by nahdude812 (88157) *

          You could still save a lot of bandwidth, and protect against phoneme exposure by having fixed packet size transmission happen for a short interval after speaking occurred (perhaps randomly between 0.5 and 2 seconds). You'd be able to tell when people were speaking, but lose visibility into the cadence of the words. That shouldn't be that large an impact on the overall bandwidth consumption but should pretty much shut down this side channel.
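The hold-over idea above can be sketched as a per-tick send decision. This is only an illustration: the 20 ms tick, the function name and the voice-activity input are assumptions, and the 0.5 to 2 second hold range is the one suggested in the comment.

```python
import random

def transmit_schedule(voice_active, tick_s=0.02, min_hold_s=0.5, max_hold_s=2.0, rng=None):
    """Return one boolean per tick: send a fixed-size packet on this tick or not.

    voice_active: iterable of booleans, one per tick (True = speech detected).
    After speech stops, keep sending for a random hold time so that short
    pauses between words and syllables are not visible on the wire.
    """
    rng = rng or random.Random()
    hold_ticks_left = 0
    schedule = []
    for active in voice_active:
        if active:
            # Re-draw the hold time every time speech is seen.
            hold_ticks_left = round(rng.uniform(min_hold_s, max_hold_s) / tick_s)
            schedule.append(True)
        elif hold_ticks_left > 0:
            hold_ticks_left -= 1
            schedule.append(True)   # padding-only packet, same size as a speech packet
        else:
            schedule.append(False)  # long silence: stop sending to save bandwidth
    return schedule

# Speech, a 0.2 s pause, more speech, then 3 s of silence (20 ms ticks).
activity = [True] * 10 + [False] * 10 + [True] * 10 + [False] * 150
schedule = transmit_schedule(activity, rng=random.Random(0))
```

The 0.2 s pause is shorter than the minimum hold, so it never shows up in the schedule; only the long silence at the end does, which matches the "you can tell when people were speaking, but not the cadence" trade-off.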

    • by blair1q (305137)

      if your encryption leaves the message where it can be read without decrypting it, then it was never actually encrypted

      skype is using a lot more bandwidth than they need to. like single-sideband radio, they can drop at least half the channels they're sending and the information will still be perfectly intelligible on the other end. they've effectively done that by sending superfluous encrypted gibberish on their "main" channel.

      the bonus is, their method of sending the message in the side channel is probabl

      • if your encryption leaves the message where it can be read without decrypting it, then it was never actually encrypted

        While you are technically correct, you are not really contradicting what I said.

        The encryption algorithm itself does not allow you to obtain the plaintext without decrypting it (as far as we know); the problem is that the protocol requires many encrypted messages to be sent in a particular sequence, and the size and sequence of those messages leaks information about the plaintext. This is a side channel, not a break of the encryption algorithm itself, and the problem is solved without any change to the

        • There's a reason that SSH has inserted random padding into its packets since its inception. You would think that the folks at Skype might've done just a bit more research...
        • by blair1q (305137)

          The Cone of Silence was never really all that soundproof, either. Nor was it at all cone-like.

        • by AK Marc (707885)
          You are right and wrong at the same time. If the encryption still allows information to be accurately pulled from it after encryption, then the encryption is broken. The person that pointed out this flaw and exploited it is termed the person who broke it. So you are correcting someone that was actually correct. And you are correct as well. They aren't unencrypting the encrypted packet and extracting the information. But the encryption is broken because it allows information to be gained from the encry
          • Arguing about whether they broke "the encryption" or "the secure channel" or "the encryption machine" is a worthless rhetorical exercise.

            Except that it is not just rhetoric. Suppose I use PGP to encrypt all of my email, but then save copies of the plaintext on a "cloud system" and someone comes along and reads the plaintext. What was broken? It was not PGP; PGP, when used correctly is secure.

            Yes, if you use a cryptographic algorithm incorrectly, your security may be compromised. That does not mean the cryptographic algorithm was broken, it means your specific way of using it was bad. Just because someone managed to compute Sony's P

            • by AK Marc (707885)
              Are you being deliberately obtuse? The actual encryption method resulted in the vulnerability that resulted in the attack. That's completely unlike emailing a copy of your plaintext emails to the New York Post for safekeeping and then claiming that your disk encryption was broken. Now, if PGP, or your disk encryption were to email a copy of the key to a central server for safe keeping, then that would be similar to the situation described. However, your inability to form a related analogy indicates that
              • Then I guess by your definition, all encryption everywhere is "cracked," since there is always a way to get the plaintext without attacking the cipher itself. There are easy to implement side channel attacks on a common software implementation of AES, which is used in both OpenSSL and NSS, but nobody is claiming that AES, OpenSSL, or NSS have been "cracked." There are side channel attacks on pretty much every encryption system out there, which is why in places where security really matters you see a lot o
                • by AK Marc (707885)

                  Then I guess by your definition, all encryption everywhere is "cracked,"

                  No. It requires actual judgment of whether the flaw is in the standard as expected to be applied. You are apparently purposefully ignoring judgment in order to defend your pet idea. There exists no implementation of Skype's encryption (not talking about the cypher it's based on) which you couldn't "break" in this method. Thus, there is no Skype conversation free from this attack, regardless of platform, implementation, or anything else. I'd call that "cracked." You call that "secure." That's where ou

                  • Thus, there is no Skype conversation free from this attack, regardless of platform, implementation, or anything else. I'd call that "cracked." You call that "secure." That's where our opinion differs.

                    Except that is not what I said. I said that the encryption algorithm has not been cracked, because it has not. The attack is a side channel attack. This does not mean that Skype calls are secure, it means that an otherwise secure algorithm was applied in a way that undermined the security of the system.

                    There is a difference between the encryption algorithm, and the system that uses that algorithm. This same attack would have worked if a different encryption algorithm had been used, even one as wide

                    • by AK Marc (707885)

                      There is a difference between the encryption algorithm, and the system that uses that algorithm. This same attack would have worked if a different encryption algorithm had been used, even one as widely evaluated as AES. The encryption algorithm is not what was cracked here.

                      "They broke Skype's encryption" is a true statement. The encryption package Skype uses is broken. Whether they did that from breaking the algorithm (I see after I spend a post proving "cypher" to be a pointless red herring, you've swapped to a new red herring in substituting "encryption algorithm" for cypher without changing your statements at all), or by some other attack that compromised the security of the encrypted calls is irrelevant to the truth. "Skype's call encryption has been broken." Again, t

                    • by AK Marc (707885)

                      There is a difference between the encryption algorithm, and the system that uses that algorithm. The encryption algorithm is not what was cracked here.

                      "They broke Skype's encryption" is a true statement. The encryption package Skype uses is broken. Whether they did that from breaking the algorithm (I see after I spend a post proving "cypher" to be a pointless red herring, you've swapped to a new red herring in substituting "encryption algorithm" for cypher without changing your statements at all), or by some other attack that compromised the security of the encrypted calls is irrelevant to the truth. "Skype's call encryption has been broken." Again, that'

                    • "They broke Skype's encryption" is a true statement.

                      My point from the beginning is that that statement is ambiguous. It is not clear from that statement that the researchers did not crack the actual encryption algorithm. It does not make it clear that the problem has more to do with the compression than with cryptography.

                      Whether they did that from breaking the algorithm (I see after I spend a post proving "cypher" to be a pointless red herring, you've swapped to a new red herring in substituting "encryption algorithm" for cypher without changing your statements at all), or by some other attack that compromised the security of the encrypted calls is irrelevant to the truth.

                      It is relevant to whether or not the statement is clear about what the attack actually constitutes. Again, my point from the beginning was that TFS is ambiguous.

                    • by AK Marc (707885)

                      Again, my point from the beginning was that TFS is ambiguous.

                      No, your point from the beginning was that it was "misleading." Misleading is an opinion that, based on other comments and what's actually broken in the wild, is false. If you had asserted facts (ambiguous is a fact, misleading is an opinion), then there would have been no room for discussion.

                      When you hear that an application is broken, is it mostly because the underlying cypher was broken? I've never heard it that way. Because when someone broke the cypher, the statements were naming the cypher, not comm

    • by gatkinso (15975)

      Is that really side channel - by that I mean it seems to me like block cipher mode crypto on a per packet basis is being employed... which would make it akin to a watermarking attack.

    • by mgbastard (612419)

      um, this counts as cracking their encryption. Just because you can't efficiently perform a "cleartext" digital translation (it is analog sound...) doesn't mean you can't read the message.

      And now that Microsoft has bought them for 8.5 billion: LMAO.

      Fuck you Ballmer.

  • Looks like their karma isn't so good these days!
  • Seems like the Skype buy wasn't such a good thing for MS... it's been what... a week or two and already it's been down and compromised?
  • Encrypting a wave (Score:2, Informative)

    by Anonymous Coward

    Of course, since the data basically represents sound waves, there is a certain level of predictability and pattern on the data unlike normal data which is much more random.

    It would have to be a special encryption to get rid of this pattern using a more dynamic algorithm that changes as it progress (which can make it annoying to decrypt or simpler to detect) or disjoint the data over a greater amount of data (making it somewhat harder to find the patterns though still might be possible) of the encryption tho

    • normal data...is much more random.

      Actually, most data used in practice is not uniformly random. Text, images, and even computer programs tend to have significant biases.

      It would have to be a special encryption to get rid of this pattern using a more dynamic algorithm that changes as it progress

      http://en.wikipedia.org/wiki/Stream_cipher [wikipedia.org]

      We know how to get these things right, and the problem with Skype was not the type of data, but rather the way in which that data was compressed.
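To make the "the cipher is fine, the frame sizes are the problem" point concrete, here is a toy stream cipher sketch. The keystream construction (SHA-256 over key, nonce and counter) is an illustration only, not Skype's cipher and not something to deploy, and the frame sizes are made up. The ciphertext bytes look random, but each ciphertext is exactly as long as the compressed frame it hides.

```python
import hashlib

def keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    """Toy keystream: SHA-256 over key, nonce and a block counter (illustration only)."""
    out = bytearray()
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:n])

def xor_encrypt(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; applying it a second time decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, nonce, len(data))))

key = b"k" * 32
# Hypothetical variable-bitrate frames: different sounds, different sizes.
frames = [b"\x01" * 12, b"\x02" * 31, b"\x03" * 19]

# One nonce per frame, here simply the frame index.
ciphertexts = [xor_encrypt(key, i.to_bytes(8, "big"), f) for i, f in enumerate(frames)]

# What a passive observer learns without touching the cipher at all.
leaked_sizes = [len(c) for c in ciphertexts]
```

Swapping in any other length-preserving cipher gives the same `leaked_sizes`, which is why the fix has to happen in the codec or the packetization rather than in the encryption algorithm.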

    • by Ksevio (865461)
      What they need to do is add appropriate padding so the software just detects the people saying "Skype Skype Skype!" "Skype Skype?" "Skype Skype Skype Skype Skype!"
    • by Jonner (189691)

      Of course, since the data basically represents sound waves, there is a certain level of predictability and pattern on the data unlike normal data which is much more random.

      It would have to be a special encryption to get rid of this pattern using a more dynamic algorithm that changes as it progress (which can make it annoying to decrypt or simpler to detect) or disjoint the data over a greater amount of data (making it somewhat harder to find the patterns though still might be possible) of the encryption though that is difficult in a time sensitive app like Skype which encrypts and sends as it receives the data.

      It does not follow that encrypting sound waveforms leaks information just because they are predictable. If that were the case, encryption wouldn't be very useful in general. There is no such thing as "normal data" and most data people need to encrypt does have strong patterns. The entire purpose of encryption is to make non-random data look random.

      The method for guessing what people are saying described in TFA exploits specific properties of the most efficient voice compression algorithms coupled with timin

  • by youn (1516637) on Thursday May 26, 2011 @04:37PM (#36255742) Homepage

    I remember reading something similar with sip over encrypted channel... I guess it is the plague of all compressed communication even if encrypted... the only way to bypass that is use an uncompressed protocol and not blank out the silence. I guess what's new is they've done it with skype.

    • by afidel (530433)
      I believe if you use a CBR codec like G.711 without VAD or CNG you should be ok.
    • Or make it a constant bitrate.

    • by Jonner (189691)

      I remember reading something similar with sip over encrypted channel... I guess it is the plague of all compressed communication even if encrypted... the only way to bypass that is use an uncompressed protocol and not blank out the silence. I guess what's new is they've done it with skype.

      It's only a problem for variable bitrate compression algorithms, not less efficient fixed bitrate ones like the venerable G.722 [wikipedia.org]. It may only be a problem for voice-specific variable bitrate codecs, not general ones like MP3 or Vorbis. The risk from this type of attack may also be greatly mitigated by decoupling datagram size and timing from the output of the encoder, which would probably increase latency but still allow use of efficient codecs.

  • by HBI (604924) <kparadine@@@gmail...com> on Thursday May 26, 2011 @04:37PM (#36255746) Homepage Journal

    The reason why is that any serious encryption attempt of IP traffic would make all packets a constant size, significantly below expected MTU size (taking into account tunnels). This attack would not exist in that scenario. They are measuring the payload size of IP packets and matching it to phonemes spoken.

    I probably shouldn't blame them for this, but it's barely worth the effort of encrypting the traffic if it is this easy to sniff out the words being spoken.
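A toy version of "measure payload sizes and match them to sounds" might look like the sketch below. It is a nearest-neighbour lookup, far cruder than whatever the paper actually does, and every number in the table is invented for illustration; real packet sizes depend on the codec, the speaker and the network.

```python
# Invented reference data: packet-length sequences an observer might have
# recorded for known kinds of sound. These are NOT measurements.
TRAINING = {
    "vowel-like":     [(38, 41, 40), (39, 42, 41)],
    "fricative-like": [(27, 25, 26), (26, 26, 24)],
    "silence":        [(12, 12, 12), (13, 12, 12)],
}

def distance(a, b):
    """Squared Euclidean distance between two equal-length size sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def guess(observed):
    """Return the label whose reference example is nearest to the observed lengths."""
    candidates = [
        (distance(observed, example), label)
        for label, examples in TRAINING.items()
        for example in examples
    ]
    return min(candidates)[1]

# Sizes of encrypted packets sniffed off the wire; the contents are never decrypted.
sniffed = [(40, 41, 39), (12, 13, 12), (25, 26, 27)]
guesses = [guess(s) for s in sniffed]
```

The only input is packet length, which is exactly the information that constant-size packets would take away.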

    • by Anonymous Coward

      The reason why is that any serious encryption attempt of IP traffic would make all packets a constant size

      From TFA: A solution might be to break the data up into fixed sized frames but this would make it more difficult to reconstruct the data if there was packet loss.

      • by HBI (604924)

        Yeah, more difficult but not impossible. You just need a larger window and more buffers.

      • From TFA: A solution might be to break the data up into fixed sized frames but this would make it more difficult to reconstruct the data if there was packet loss.

        And even then, the data rate would leak some information about the content.

        The only trivial solution for zero leakage is to either use constant rate encoding, or use some kind of padding to make the data rate constant. Non-trivial solutions would include some random data rate variations to obfuscate the data rate of actual payload content. Unfortunately, all these methods will waste bandwidth.

    • by subreality (157447) on Thursday May 26, 2011 @04:56PM (#36256020)

      The reason why is that any serious encryption attempt of IP traffic would make all packets a constant size, significantly below expected MTU size (taking into account tunnels). This attack would not exist in that scenario.

      It's actually harder than that. You also have to generate the packets at an even rate as well, or you'll still have some leakage.

      Even after you do that, the presence or absence of a stream of packets will at the very least indicate if a call is in progress; to defend against that, you have to *always* transmit the stream.

      Even then you're leaking information about the maximum amount of data you could be communicating.

      The goalposts keep moving right on down the field when you're talking about side channels. You just have to pick the point where you're comfortable.
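The first two steps on that list (constant size, even rate) can be sketched as a sender that emits exactly one packet per tick, with a dummy packet when there is nothing to say. The 64-byte packet size and the one-byte length prefix are assumptions made for the sketch, and the padded payload would still be encrypted before it goes on the wire.

```python
PACKET_SIZE = 64  # hypothetical fixed payload size in bytes

def constant_rate_stream(frames_by_tick, packet_size=PACKET_SIZE):
    """Yield exactly one packet_size-byte payload per tick.

    frames_by_tick: list with one entry per tick, either a bytes frame or
    None when the encoder produced nothing. The first byte of each payload
    carries the real frame length; 0 marks a dummy (cover) packet.
    """
    for frame in frames_by_tick:
        payload = frame if frame is not None else b""
        if len(payload) > packet_size - 1:
            raise ValueError("frame does not fit in the fixed packet size")
        yield bytes([len(payload)]) + payload + bytes(packet_size - 1 - len(payload))

def unwrap(packet):
    """Recover the frame, or None for a dummy packet."""
    length = packet[0]
    return packet[1:1 + length] if length else None

ticks = [b"hi", None, None, b"there", None]
packets = list(constant_rate_stream(ticks))
recovered = [unwrap(p) for p in packets]
```

This hides size and timing within a call, but as the comment says, the stream's existence and its maximum capacity are still visible unless it runs all the time.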

      • by HBI (604924)

        "Somebody's talking" is information that it'll be hard to conceal without the measures you cite. I'd be ok with that, generally. Having 60% of what I say easily ferreted out is not ok, however.

  • TFA states that this is possible due to the codec that is used:

    the best...compression for voice data makes use of the structure of speech

    So using a not-optimized-for-speech codec (e.g. mp3 or wav) would defeat this.

    • by blair1q (305137)

      it could have been defeated by encrypting the entire data stream instead of just part of it.

      • by profplump (309017)

        No, it really can't. Essentially this same paper, but as an analysis of SIP-IPSec/SIP-TLS, was published not long ago. Any real-time, size-efficient voice codec leaks a ton of information about the underlying speech just in the rate and size of its packets, so any encryption system that is real-time and length-preserving (i.e. any system that would be considered suitable to be paired with the underlying codec) leaks the same information. You can add padding to hide this, but A) that defeats the purpose of y

        • by blair1q (305137)

          Okay, so, then, what are the teachers in the Charlie Brown specials saying?

          Huh? Mr. Smarty-pants?

  • TFA was TLDR, but a quick question to those of you with knowledge to understand this... Did a particular language help? Does this work on all languages? Are some languages more secure than others?

    IE - Esperanto - Easy to break, but languages with Click Consonants [wikipedia.org] are harder?

  • Huh? (Score:5, Insightful)

    by tthomas48 (180798) on Thursday May 26, 2011 @04:47PM (#36255888) Homepage

    No, I find linguistics pretty useful. Especially since it has some pretty 1:1 relationships with computer programming. And Larry Wall was a linguist. And what kind of lead in is that?

    • by blair1q (305137)

      i find linguistics pretty useful, too, since it's how translation of all kinds works (including code compilation). in fact, it's pretty silly to say anyone doesn't find it useful. maybe they meant studying linguistics is pretty useless, if you're not going to work in the construction of translators. but that could be said of any subject, and the continued propagation of that attitude across all subjects and throughout the population, in a nation operating democratically under the principle of majority ru

    • "You might", and apparently you're someone who "might not". It's the lead-in for its intended audience, which is non-linguists. And among non-linguists, it is possible that people might find it interesting but not useful. Perfectly accurate, audience specific.

      You'd think the linguists complaining about this would be able to parse out the "...and you might not" which is implied.

  • A December 2010 paper, "Uncovering Spoken Phrases in Encrypted Voice over IP Conversations", takes a similar approach.

    The article was published in ACM Transactions on Information and System Security, PDF version [unc.edu].

    The paper details a gap in the security of VBR compressed encrypted VoIP streams. The authors had earlier found that it is possible to determine the language that is spoken on such a VoIP call, based on packet lengths. Now they have expanded their research and show that it's possible
  • Not sure how it works with voice, but I know with text, if you have a part of the message, it's a lot easier to break the encryption method - assuming it's breakable. Security is just a cat and mouse game, anyway. Someone finds a hole, someone plugs the hole, then someone finds another hole...etc. Fun stuff though!
    • with text, if you have a part of the message, it's a lot easier to break the encryption method

      This is called a known plaintext attack, and any decent modern cipher should be secure against it (that is, you should learn very little even if I give you plaintext/ciphertext pairs). Modern ciphers are generally designed to be secure against this type of attack, as well as stronger attacks:

      • Chosen plaintext attacks -- the attacker is allowed to request ciphertexts for plaintexts of his choosing.
      • Chosen ciphertext attacks -- the attacker is allowed to request decryptions of ciphertexts of his choosing pr
    • Yes; this is follow-up work to the paper [acm.org] in that earlier article.

      Also important to note, neither paper is specific to Skype; their work is on encrypted VoIP in general. But apparently /. prefers things having to do with Skype for some reason.

  • If you can compress the data stream from the packet contents to just the lengths of the packets and still recover the word stream, that suggests two things: A) vocal inflection is worth 100 words per syllable, and B) you're not compressing enough in the first place. Yet there's a reason why compression sucks: the low latency requirement. Compression over 5 minute speech blocks would blow this side channel away.

    Were it not for the human tension of a conversation amounting to a group of people mutually waiti

  • I was hoping that Skype had been cracked so we can start using 3rd party messengers!
  • Fsck You, Slashdot (Score:3, Interesting)

    by theshibboleth (968645) on Thursday May 26, 2011 @05:33PM (#36256522)
    "You might think of linguistics as being interesting but not really useful" Way to go Slashdot, insult one of the most important fields in existence. Do the editors and readers really not realize how closely comp ling is related to AI? I have confidence that eventually computational linguistics will crack speech/language in general and lead to computers that can learn languages as readily as human infants. This will be momentous because it would allow communication between computers and humans. Now it wouldn't solve the consciousness problem, but it would be a step in the right direction.
    • I noticed that too. Replace "linguistics" with "the space program" and watch how many slashdotters go supercritical.
    • I have confidence that eventually computational linguistics will crack speech/language in general and lead to computers that can learn languages as readily as human infants.

      Why do you think this? Really, I want to know, because it is quite a claim.

      Actually, two claims:
      A) That computational linguistics will crack speech/language and
      B) That this will lead to computers that can learn languages as readily as human infants.

  • You mean Skype wasn't smart enough to mix in other sounds while encrypting the original sound?! That is just retarded. Note that I am not a mathematician or any sort of "really smart guy." But I can definitely picture in my mind why this would be somewhat trivial. Vocal sound is primarily frequency modulated which means that the flow of signal will vary in density on a constant carrier. If you mix up the numbers, you will still see a great deal of fidelity in the variations of the frequencies of data r

    • by LS (57954)

      if the signal were combined with another sound pattern which the receiving end would know how to properly remove after decryption ..... I have to wonder why this isn't being done. It is simply too obvious to patent.

      You obviously haven't been paying attention to the absolute nonsense that has been successfully receiving patents these days.

  • To demonstrate the obvious. What do you expect when using high complexity VBR codecs with no blinding of any kind. I sincerely hope this was not news to anyone.

  • I found it somewhat surprising that the Town name was used to identify the University. Would you say Ann Arbor or Ithaca or New Haven? You might say Berkeley or Princeton. So, I guess you might say Chapel Hill. OK, never mind.
    • by JSBiff (87824)

      That all depends on the name of the School, doesn't it?

      Private schools tend not to be named after the town. Public universities very often are named after the town they're in. I live in Ohio, I can name quite a few schools named after towns. . .

      U of Toledo, U of Akron, Youngstown State University, Cincinnati State, The U of Cincinnati, Kent State University. Out of state I can think of University of Chicago, UC San Diego, UC Berkeley, (pretty much all the UC schools are named for the towns they are in), Uni

  • This is not the exact same thing, but it's a great example of how encryption alone is not enough and it must be done right.

    Block cipher modes of operation [wikipedia.org]

    Scroll down til you see the penguins.
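For anyone who doesn't want to scroll: the penguin effect comes from encrypting each block independently, so equal plaintext blocks give equal ciphertext blocks. Here is a small sketch of that, using HMAC-SHA256 as a stand-in for a block cipher. That stand-in is an assumption made to keep the example in the standard library; it is not invertible and not a real cipher mode, it only reproduces the pattern leak.

```python
import hashlib
import hmac

BLOCK = 16

def prf(key: bytes, data: bytes) -> bytes:
    """Keyed pseudo-random function standing in for a block cipher (toy, not invertible)."""
    return hmac.new(key, data, hashlib.sha256).digest()[:BLOCK]

def ecb_like(key: bytes, blocks):
    """Each block is transformed independently: equal inputs give equal outputs."""
    return [prf(key, b) for b in blocks]

def ctr_like(key: bytes, blocks):
    """Each block is XORed with a keystream block derived from its position."""
    return [
        bytes(x ^ y for x, y in zip(b, prf(key, i.to_bytes(BLOCK, "big"))))
        for i, b in enumerate(blocks)
    ]

key = b"demo key"
blocks = [b"A" * BLOCK, b"B" * BLOCK, b"A" * BLOCK]  # blocks 0 and 2 repeat

ecb_out = ecb_like(key, blocks)  # the repeat is visible in the output
ctr_out = ctr_like(key, blocks)  # the repeat is hidden by the per-position keystream
```

A real counter mode also needs a nonce that is never reused under the same key; the sketch leaves that out because it only encrypts one message.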

  • So, as I understand, it may not be the obvious weakest potential link that has been compromised - the cipher itself for example - but rather a detail of implementation that paved way for their successful attack, right? If Skype fragment the encrypted data stream in variable sized frames that have also rather umm unpredictable (bear with me here) sizes, the attack, as stated by researchers themselves I believe, could not be instantiated in its current form? The entire weakness is based around the fact that i
