Using AES in ECB mode is clearly a bad choice, but honestly it's not that horrible for high entropy data like compressed audio/video. I'm sure someone could prove me wrong one day, but it seems hard to extract any useful patterns out of compressed audio/video. It does check the box of "uses encryption" for regulatory reasons (while missing the intent). It's pretty egregious considering how easy this is to get right.
The 128-bit key is not inherently wrong if they were rotating these out during the stream. That being said, there's no reason not to do it right and use a mode like GCM with a longer key - most hardware supports acceleration for AES-256 these days. It can actually be slower to use a 128-bit key on 64-bit systems.
While I respect the decision not to disclose the waiting room vulnerability, it's pretty obvious what's going on given the context. They probably shouldn't have mentioned where the vulnerability is.
I'm honestly surprised anyone with technical knowledge thought that Zoom was actually doing end-to-end encryption given how the software works. All of the video transcoding/downconversion is clearly happening on the server. Your client is not sending multiple compressed streams for varying connection bandwidths. That's the main reason a lot of people like Zoom - it actually works well with dozens or hundreds of participants.
Is it just me, or is it obvious why they've gone with ECB?
Zoom's design has a single key for everybody and for everything [ in the context of a particular video conference call ] . It's simpler and, to a layman, it sounds secure. [ We arguably contribute to this if we say e.g. "the key" implying it's just one thing when we mean something like a master secret in TLS used to derive lots of actual keys ].
Once you've committed to a single key ECB behaves exactly how you'd want.
You've got some audio, or video, ready to send? Just encrypt it with the key. Receive some encrypted data? Just decrypt it.
What happens if you have some network trouble briefly? Nothing, everybody decrypts whatever does arrive and maybe a few frames are missing.
None of the other modes work at all if you try to use them this way. They all expect you to have thought about the problem, to track a bunch more state, and then to maintain that state despite an unreliable network and other issues.
Unless there is somebody in the room who says we can't do ECB because it's fundamentally not a secure choice, ECB is what you're going to get from this design decision.
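To make the "no state" point concrete, here's a toy sketch (a hypothetical XOR-mask "cipher" stands in for AES, since Python's stdlib has none; the names are made up). The one ECB property that matters here is modeled: each block depends only on the key, so any packet that arrives decrypts on its own, and identical plaintext blocks encrypt identically.

```python
import hashlib

BLOCK = 16

def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    # Toy stand-in for AES: XOR the block with a key-derived mask.
    # NOT secure -- it only models the ECB property under discussion:
    # the same (key, plaintext block) always yields the same ciphertext
    # block, with no per-block state to track.
    mask = hashlib.sha256(key).digest()[:BLOCK]
    return bytes(a ^ b for a, b in zip(block, mask))

toy_block_decrypt = toy_block_encrypt  # XOR is its own inverse

def ecb_encrypt(key: bytes, data: bytes) -> bytes:
    assert len(data) % BLOCK == 0
    return b"".join(toy_block_encrypt(key, data[i:i + BLOCK])
                    for i in range(0, len(data), BLOCK))

key = b"conference-key"
packet1 = b"A" * 16 + b"B" * 16
packet2 = b"B" * 16 + b"C" * 16   # decryptable even if packet1 is lost

ct1, ct2 = ecb_encrypt(key, packet1), ecb_encrypt(key, packet2)

# The downside: the repeated b"B"*16 block encrypts identically in both
# packets, so repetition in the plaintext is visible in the ciphertext.
assert ct1[16:32] == ct2[0:16]
```

That statelessness is exactly why ECB "just works" over a lossy network, and exactly why it leaks.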
And I've been in rooms like that as the only voice, or at least as the only person who spoke up. I've been in rooms where I was part of a chorus too, but as organisations get bigger and "security is everyone's problem" becomes a phrase people learn but don't act on, it gets lonelier.
Also, I actually can't even work out what a "correct" key rotation strategy could be for ECB with a variable number of parties all encrypting stuff at once. As a result it seems unlikely that Zoom did figure out such a strategy and then correctly implemented it. Instead it seems safe to assume there is no key rotation, everything sent by every participant for the life of the stream is encrypted with the same key, even though that's a terrible idea.
> Using AES in ECB mode is clearly a bad choice, but honestly it's not that horrible for high entropy data like compressed audio/video. I'm sure someone could prove me wrong one day, but it seems hard to extract any useful patterns out of compressed audio/video.
...you're joking, right? The Wikipedia example for why ECB is not recommended is literally an image: the ECB-encrypted Tux penguin.
It's definitely a terrible choice for uncompressed images or video. I'm arguing it probably isn't that bad for highly compressed video. That being said, if you're encrypting any data stream you should use an appropriate stream cipher.
You're forgetting the technical intricacies of compressed video. Compressed video is a mix of high- and low-entropy content, with a predictable time pattern to it. For example, one can easily use traffic analysis to find B-frames, and run analysis on those. Bam: very low entropy, due to the stationary nature of a video conference.
Yeah, the B-frames were my thought too, but ordinary sensor noise would hopefully make the individual frames different enough. If you’re doing green-screen background switching, or transmitting a static image then it’s definitely going to be a problem.
But given all the other security issues that seem to hover around Zoom like a cloud of angry bees, this is probably all moot - if you really want to crack a video stream then there are probably easier ways.
Zoom does screen sharing, right? Surely it's not transmitted uncompressed, but it is stationary for a long time and perhaps only small parts changing when they do (eg, switching slides). Is there an ECB-based weakness here?
Those unchanged pixels won't be transmitted. Only the changed part of the screen needs to be sent to the client. For example, RFB [RFC 6143] would send a 16-byte header with the size and position of a rectangular area of the screen, followed by the pixels in that rectangle. Or multiple rectangles can be sent in one update message. But in the case of text being typed, there will be a single rectangle per keystroke (or per few keystrokes).
Now I wonder if a sequence of these rectangles, all the same size and in roughly the same area of the screen, would lend to some sort of statistical analysis. At the very least the timestamps of the updates would tell you how fast the operator is typing.
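For illustration, a single-rectangle update along the lines of RFC 6143's FramebufferUpdate can be sketched like this (field layout from the RFC; the helper name is made up). The 4-byte message header plus the 12-byte rectangle header is where the 16 bytes come from:

```python
import struct

def single_rect_update(x: int, y: int, w: int, h: int, encoding: int = 0) -> bytes:
    # RFC 6143 sec. 7.6.1: message-type u8 (0), 1 byte padding,
    # number-of-rectangles u16, then per rectangle: x, y, width, height
    # as u16 and a s32 encoding type. One rectangle => 16 bytes of header.
    return struct.pack(">BxHHHHHi", 0, 1, x, y, w, h, encoding)

hdr = single_rect_update(400, 300, 9, 14)  # e.g. one typed glyph's bounding box
assert len(hdr) == 16
```

Those headers have very guessable contents (small integers, repeated positions), which is exactly the kind of structure you don't want sitting in predictable block positions under ECB.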
Compression already makes a compressed file roughly indistinguishable from random noise (modulo access to the decompressor). So the patterns have been removed.
That doesn’t make this good, but it means that one specific example isn’t immediately applicable.
There's more in the stream than just compressed data. There'll be metadata info that you can make reasonable guesses about. ECB mode lets you take that information and apply it to other blocks in the ciphertext.
This thread is an excellent illustration of why you don't want your encryption implemented by merely good coders. You need people who know what they are doing.
I’m literally one of those people who “knows what they’re doing”. This is the problem with discussing ECB on an online forum. There’s no space to have a nuanced discussion without people cargo culting “ECB bad” over every comment.
Yes, ECB is almost always the wrong choice. Yes, there are other ways it’s going to fail in this use case. Yes, compression before encryption itself often enables other attacks. No, I should not have to prefix a comment about ECB with this type of disclaimer when I’m making (what should be) an uncontroversial statement that the tux attack doesn’t directly apply to compressed image data.
Ironically, when designing a protocol for my company, one of the reasons we didn’t use ECB when it would have been entirely justified (each chunk of data was precisely one block in size and keys were only ever used once) was because of potential backlash from people who only know “ECB bad” and nothing more.
I’m not arguing that. I’m saying that for compressed data, underlying patterns in data aren’t trivially exposed by ECB. Ergo, the “tux” attack on bitmap image files doesn’t really apply here. I meant nothing more and nothing less than that.
And I'm saying that they aren't just sending compressed data, nor would virtually any practical communication application, which makes the "well, maybe they have enough entropy that it doesn't matter?" argument moot.
It's harder to extract patterns from high entropy data. I don't think anyone's saying that this is even an OK thing to rely on, at all, just that the nature of the data means that this specific weakness is likely more difficult to take advantage of.
If zoom were transmitting text this would be relatively more serious.
What about the chat system? I doubt they're intentionally compressing the text there in order to increase the entropy. I guess they could be using gzip or whatever, but we'd need to look at how the protocol works in more detail. Or do they use a different system for the chat protocol altogether?
AES-ECB isn't necessarily insecure, it's just very easy to misuse (and I agree what the article described is a misuse). I think the argument is that if there are patterns in the input data, the same patterns will show up in the AES-ECB encrypted data, just with different values. Compressed data should be high entropy and hard to predict, so there really shouldn't be structure or patterns to the input data. There's no guarantee that any given compression algorithm provides sufficient randomness, though.
No, their argument is that if you have a higher entropy per byte, there will be more variation in the aligned 16-byte chunks that are relevant for attacking AES-128-ECB. This reduces the probability of the attacker being able to find equal blocks.
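That effect is easy to see with a toy measurement (random bytes stand in for ideally compressed data; the function name is made up):

```python
import os
from collections import Counter

def dup_block_rate(data: bytes, block: int = 16) -> float:
    # Fraction of aligned 16-byte blocks that occur more than once in the
    # data -- every such repeat is directly visible through ECB.
    blocks = [data[i:i + block] for i in range(0, len(data) - block + 1, block)]
    counts = Counter(blocks)
    return sum(c for c in counts.values() if c > 1) / len(blocks)

n = 1 << 16
low_entropy = (b"\x00" * 64 + b"\xff" * 64) * (n // 128)  # flat, image-like data
high_entropy = os.urandom(n)  # stand-in for well-compressed data

assert dup_block_rate(low_entropy) == 1.0   # every aligned block repeats
assert dup_block_rate(high_entropy) == 0.0  # collisions vanishingly unlikely
```

With 4096 random 16-byte blocks, the birthday probability of even one repeat is around 2^-104, which is the sense in which high entropy per byte blunts (but does not fix) ECB.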
And my argument is that if I know the video encoding and compression scheme, I wouldn't depend on AES-ECB, because I know the patterns that show up.
If I am encrypting something, I only want to depend on the strength of the encryption. I don't want to hope that something else ensures that an adversary cannot figure out my ciphertext. That is a very bad idea.
Sure, and nobody was arguing they should have used ECB or that they shouldn’t change it. Only that the ability to exploit this given compressed data is lower than the uncompressed penguin image example.
No comment on the original claim, but that example is encryption applied to an uncompressed image. (Adjacent identical pixels are not typically represented individually when compressed, and thus encryption could not cause the banding patterns seen in those regions of the image if it were compressed prior to encryption.)
The point is that any pattern in the plaintext data shows up in the encrypted data if you use AES-ECB.
Compression does not introduce entropy to a stream. So saying the stream is compressed and calling it good is a very bad idea. Please refer to Shannon's source coding theorem. If anything, compression reduces the entropy of the information.
I think you may want to look closer at Shannon's source coding theorem: the Shannon entropy per symbol of the output of a compression algorithm will be higher than that of the source, as identifiable patterns are eliminated. Otherwise the theorem would trivially contradict itself.
Shannon's source coding theorem says that the entropy of compressed information is at most the entropy of the uncompressed information. If you add entropy to a compressed stream, you are by definition adding noise to the signal.
We’re not talking about a noisy channel here, so I’m not sure where you’re getting the SNR from. I think we’re talking about entropy of different distributions here so let’s cut to a concrete example relevant to your original claim (that compression doesn’t help reduce the impact of repeated blocks in ECB by reducing the rate of repeated blocks).
Suppose we have some string of bytes. When we split it into aligned 16-byte blocks (let's assume it divides evenly for simplicity), we find that these blocks are not evenly distributed. For example, 1% of the blocks turn out to be the same, which given the number of possible symbols in this code is massively out of proportion.
We apply a Huffman code using the 16-byte blocks present in the message as the alphabet and their observed statistics for this particular message (if that aspect bothers you, you can assume we prepend the dictionary to the message). Huffman codes are optimal for per-symbol encoding.
Suppose we re-evaluate the distribution of 16-byte blocks in the compressed data; will this distribution have higher entropy (meaning there will be fewer duplicate blocks to exploit ECB with) or not?
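You can also see the direction of the effect without the hypothetical Huffman code; here's a quick sketch with zlib standing in for the compressor, counting how many aligned 16-byte blocks repeat before and after (helper name made up):

```python
import zlib

def dup_aligned_blocks(data: bytes, block: int = 16) -> int:
    # Number of aligned 16-byte blocks that are duplicates of an
    # earlier block -- the raw material an ECB attacker works with.
    usable = len(data) // block * block
    blocks = [data[i:i + block] for i in range(0, usable, block)]
    return len(blocks) - len(set(blocks))

# Highly repetitive plaintext: masses of identical aligned blocks.
plain = b"hello world, hello world! " * 1000
packed = zlib.compress(plain)

assert dup_aligned_blocks(plain) > 1000
# Compression squeezes the redundancy out, so far fewer aligned blocks
# repeat in the compressed stream than in the plaintext.
assert dup_aligned_blocks(packed) < dup_aligned_blocks(plain)
```

Which is the whole "compressed data blunts ECB" claim in miniature: the duplicates don't have to reach zero for the repeat rate to collapse.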
> The point is that any pattern in the plaintext data shows up in encrypted data if you use AES-ECB.
No, that's false. ECB reveals repeating plaintext blocks only when the repeats land on block boundaries. "X0123456789ABCDEF0123456789ABCDEF" contains a repeating block-length sequence, but would encrypt to three distinct blocks under ECB (with padding), because the repeats are not aligned to a block boundary.
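The alignment point is easy to check without any actual encryption, just by looking at aligned blocks (toy sketch; helper name made up):

```python
def aligned_blocks(data: bytes, block: int = 16) -> list:
    # Split into the aligned 16-byte blocks that ECB would encrypt,
    # dropping any trailing partial block.
    usable = len(data) // block * block
    return [data[i:i + block] for i in range(0, usable, block)]

pattern = b"0123456789ABCDEF"       # a 16-byte repeating sequence
aligned = pattern * 2               # repeat starts on a block boundary
shifted = b"x" + pattern * 2        # same repeat, shifted by one byte

# Aligned repetition => identical blocks, which ECB exposes directly...
assert aligned_blocks(aligned)[0] == aligned_blocks(aligned)[1]

# ...but the one-byte shift makes every aligned block distinct, so the
# ECB ciphertext would show no repeats despite the repeating plaintext.
assert len(set(aligned_blocks(shifted))) == len(aligned_blocks(shifted))
```

So ECB leaks equality of aligned blocks, not "any pattern" in the plaintext.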
Untrue. Many performance trade-offs have to be made and the entropy has to vary drastically with time. See for example B-Frames vs I-frames in compressed video. Couple that with the very low entropy video conference data and bam.
Even in uncompressed video it will be hard to see that "penguin image" effect, because the pixels that make up each block will be constantly changing in a random way; unlike in that synthetically generated image, it's highly unlikely for a block to be exactly the same as any other one in a given frame.
You greatly overestimate the image quality of crappy videoconferencing streamed video and the amount of pixel-wise sensor noise left after noise reduction (pretty low, actually), while underestimating the ingenuity of cryptanalysts and the power of having a lot of data. Seriously, the only way the shitty 1mm-or-less sensors on webcams can deliver HD video is through an abject amount of noise reduction, sharpening and filtering, all of which greatly reduce entropy.
Hint: you don't need to know the plaintext exactly. You just need to be able to build a reasonably precise probability distribution.
I'm actually not sure - that's a good point. The whole point of compressing audio for video conferencing is to preserve human speech, so things that produce radically different waveforms but "sound the same" to us might show up as patterns. I guess it's better to avoid the question entirely and use an appropriate stream cipher!
There are lots of embedded processors with hardware support for AES-128 only. I have to fight to keep AES-256 out of the ciphersuite list because of the performance regression. The rest of the world will probably force the issue eventually but the saving grace is that 3DES is still considered secure.
> the saving grace is that 3DES is still considered secure.
Nobody who wants to do AES-256 rather than AES-128 thinks 3DES is "still secure". 3DES is perhaps 112 bits of useful keyspace, but it has 64-bit blocks, which were already bad news when DES was invented.
TLS 1.3 doesn't have a 3DES option at all. You can do AES 128 or AES 256 (or ChaCha20).
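On the 64-bit block point: by the birthday bound, ciphertext block collisions become likely after roughly 2^32 blocks under a single key, and each CBC collision leaks the XOR of two plaintext blocks. That's the basis of the Sweet32 attack on 3DES. A back-of-the-envelope sketch:

```python
# Birthday bound for a 64-bit block cipher (e.g. 3DES): collisions among
# ciphertext blocks become likely around 2**(block_bits / 2) blocks
# encrypted under one key.
block_bits = 64
collision_blocks = 2 ** (block_bits // 2)   # ~4.3 billion blocks
data_bytes = collision_blocks * 8           # 3DES blocks are 8 bytes

# That's only ~32 GiB of traffic on one key -- entirely reachable for a
# long-lived connection, which is why Sweet32 is practical.
assert data_bytes == 32 * 2**30
```

For a 128-bit block cipher like AES the same bound sits at ~2^64 blocks, which is why block size matters independently of key size.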