
I've heard images are better modeled in DCT space (which isn't based on complex numbers), because the DCT is better at energy compaction than the FFT, and because it doesn't assume the image is periodic. Some people also think the FFT is insufficient even for audio, because it doesn't model time-domain hearing perception. Others say wavelets are better at modeling images than purely frequency-domain transforms, because they take spatiality more into account. From what I've heard, wavelets work well for modeling human vision (in fact, convolutional neural network input kernels tend to converge to Gabor filters, though I don't know how those differ from Gabor wavelets) and for noise reduction, but have fallen flat for image/video compression codec design.
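To make the "doesn't assume the image is periodic" point concrete: the DCT-II of a signal is equivalent (up to a phase factor) to the DFT of its mirrored, symmetrically extended copy, so there's no artificial jump at the block boundary the way there is when the FFT wraps a signal around. A rough pure-Python sketch (function names are mine, not from any library):

```python
import cmath
import math

def dct2(x):
    # Textbook DCT-II: X[k] = sum_n x[n] * cos(pi * (n + 0.5) * k / N)
    N = len(x)
    return [sum(x[n] * math.cos(math.pi * (n + 0.5) * k / N) for n in range(N))
            for k in range(N)]

def dct2_via_dft(x):
    # Same result via the DFT of the even (mirrored) extension [x, reversed(x)].
    # Mirroring is what removes the periodicity assumption: the extended
    # signal is continuous at the boundary, so no wraparound discontinuity.
    N = len(x)
    y = list(x) + list(reversed(x))  # length-2N symmetric extension
    Y = [sum(y[n] * cmath.exp(-1j * math.pi * k * n / N) for n in range(2 * N))
         for k in range(N)]
    # DCT-II falls out as a phase-rotated half of the length-2N DFT.
    return [(0.5 * cmath.exp(-1j * math.pi * k / (2 * N)) * Y[k]).real
            for k in range(N)]

x = [1.0, 5.0, 2.0, 8.0]
a = dct2(x)
b = dct2_via_dft(x)
```

The two lists agree to floating-point precision, which is the usual way the DCT's "half-sample even symmetry" story is demonstrated.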


All excellent points, and I think you should DM me on Twitter to chat about this more. (I hope you will!)

DCT is on my radar. But it has serious limitations that I think are overlooked. For example, under the DCT, convolution is no longer a simple component-wise multiplication the way it is with the FFT. That seems like a big deal to me.
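For anyone who hasn't seen it, the property being referred to is the convolution theorem: circular convolution in the time/spatial domain is exact pointwise multiplication in the DFT domain. The DCT has no equally simple analogue (its version involves symmetric convolution with extra bookkeeping). A small pure-Python check (helper names are mine):

```python
import cmath

def dft(x):
    # Direct O(N^2) DFT, enough to demonstrate the theorem.
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

def circular_conv(x, h):
    # Circular (wraparound) convolution, computed directly.
    N = len(x)
    return [sum(x[m] * h[(n - m) % N] for m in range(N)) for n in range(N)]

x = [1.0, 2.0, 3.0, 4.0]
h = [0.5, 0.25, 0.0, 0.25]

direct = circular_conv(x, h)
# Convolution theorem: conv(x, h) == IDFT(DFT(x) * DFT(h)), element-wise product.
via_dft = [c.real for c in idft([a * b for a, b in zip(dft(x), dft(h))])]
```

Losing that one-liner is exactly why spectral-domain tricks (e.g. FFT-based convolution layers) lean on the DFT rather than the DCT.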

Complex numbers are tricky to model, but I think most people have given up too easily, or haven't been creative enough in how they're modeling them. Some of my (outdated) ideas: https://gist.github.com/shawwn/c6865fccafac5066e1c7bab672781...
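For context on "tricky to model": the two most common real-valued parameterizations of a complex spectrum that I've seen are (real, imag) channel pairs and (log-magnitude, phase) pairs, and each has trade-offs; in particular, phase is wrapped into (-pi, pi], which is awkward for a network to regress directly. A tiny sketch of both round-trips (this is a generic illustration, not the approach from the gist):

```python
import cmath
import math

z = complex(3.0, -4.0)  # pretend this is one FFT bin

# Parameterization 1: two real channels, (real, imag).
# Simple and lossless, but magnitude/phase structure is implicit.
re, im = z.real, z.imag
z_from_rect = complex(re, im)

# Parameterization 2: (log-magnitude, phase).
# Makes magnitude well-scaled, but phase wraps into (-pi, pi],
# so nearby angles can look numerically far apart.
logmag = math.log(abs(z))
phase = cmath.phase(z)
z_from_polar = cmath.rect(math.exp(logmag), phase)
```

Both reconstruct the original bin exactly; the modeling question is which representation a network finds easier to learn, not whether the information is there.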

In other words, you're probably right, but I'm focused solely on FFTs on the (very low) chance that people have overlooked something that will work well.


Sorry, I don't work on neural networks much, and my plate is too full with other projects (and my DSP is a bit rusty) to hold a conversation on this right now. And I don't use Twitter much either.

Maybe we can talk later? Not sure.


No worries :) it was just an offer. It surprised me how much you knew about the domain. Good luck with your projects!



