The power of N log N:
FFT-based FIR filtering:
gr.fft_filter_ccc handles the float taps case too.
For ntaps = 64 with decimation = 1, it's more than 9 times faster than our best hand-coded SSE assembler on the Pentium M.
The speed-up you'll see depends on your machine architecture and microarchitecture, as well
as the configure options you use to build FFTW.
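To make the idea concrete, here is a minimal pure-Python sketch of FFT-based FIR filtering using the overlap-save method, checked against a plain time-domain FIR. It only illustrates the technique and is not the gr.fft_filter_ccc implementation: the function names (fft, direct_fir, fft_fir) and the fft_size parameter are made up for this example, and a real build would use FFTW rather than the toy recursive FFT shown here.

```python
import cmath
import random


def fft(x, inverse=False):
    """Toy recursive radix-2 FFT. len(x) must be a power of two.
    The inverse transform is left unscaled (caller divides by N)."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


def direct_fir(taps, samples):
    """Time-domain FIR: roughly len(taps) multiply-adds per output sample."""
    out = []
    for n in range(len(samples)):
        acc = 0j
        for k, t in enumerate(taps):
            if n - k >= 0:
                acc += t * samples[n - k]
        out.append(acc)
    return out


def fft_fir(taps, samples, fft_size=64):
    """Overlap-save FIR (hypothetical helper): filter one block at a time
    by multiplying spectra instead of convolving in the time domain."""
    ntaps = len(taps)
    step = fft_size - (ntaps - 1)          # new input samples per block
    assert step > 0, "fft_size must be at least ntaps"
    # Transform the zero-padded taps once, up front.
    H = fft(list(taps) + [0j] * (fft_size - ntaps))
    history = [0j] * (ntaps - 1)           # tail of the previous block
    out = []
    for start in range(0, len(samples), step):
        chunk = list(samples[start:start + step])
        combined = history + chunk
        block = combined + [0j] * (fft_size - len(combined))
        X = fft(block)
        y = fft([a * b for a, b in zip(X, H)], inverse=True)
        # The first ntaps-1 outputs are corrupted by circular wrap-around.
        valid = y[ntaps - 1:ntaps - 1 + len(chunk)]
        out.extend(v / fft_size for v in valid)
        history = combined[len(combined) - (ntaps - 1):]
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    taps = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(16)]
    data = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(300)]
    a = direct_fir(taps, data)
    b = fft_fir(taps, data)
    print("max difference:", max(abs(u - v) for u, v in zip(a, b)))
```

The direct form does work proportional to ntaps for every output sample, while the block form pays for two FFTs per block of fft_size - ntaps + 1 outputs, which is where the N log N advantage comes from as the tap count grows.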
See gnuradio-core/src/python/gnuradio/gr/benchmark_filters.py for code to compare gr.fir_filter_ccc to gr.fft_filter_ccc.
>>> gr_fir_ccc: using SSE
gr.fir_filter_ccc: taps: 256 input: 4e+07, time: 7.978 taps/sec: 1.283e+09
Using Volk machine: sse4_1_32
gr.fft_filter_ccc: taps: 256 input: 4e+07, time: 1.302 taps/sec: 7.863e+09
See the difference? With 256 taps, the FFT-based filter gets through the same input about 6 times faster.
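The figures in the log can be cross-checked with a little arithmetic. The sketch below assumes the taps/sec column is simply ntaps × input / time; that is an inference from the printed numbers (it reproduces them to within the rounding of the reported times), not something the log itself states. The helper name taps_per_sec is made up for this example.

```python
def taps_per_sec(ntaps, n_input, seconds):
    """Assumed definition of the log's taps/sec column."""
    return ntaps * n_input / seconds


# Values copied from the two benchmark lines above.
ntaps = 256
n_input = 4e7
fir_time = 7.978   # gr.fir_filter_ccc, seconds
fft_time = 1.302   # gr.fft_filter_ccc, seconds

fir_rate = taps_per_sec(ntaps, n_input, fir_time)
fft_rate = taps_per_sec(ntaps, n_input, fft_time)
speedup = fir_time / fft_time

print(f"fir: {fir_rate:.3e} taps/sec")
print(f"fft: {fft_rate:.3e} taps/sec")
print(f"speed-up: {speedup:.2f}x")
```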