Frequency xlating filter vs. complex multiplication

When I upgraded the simple GNU Radio receiver to single side band, I separated the band pass filter from the frequency xlating filter, and the frequency xlating filter was given a wide low pass filter instead. That low pass filter is useful but not essential. I therefore decided to replace the frequency xlating filter with a simple complex multiplication (a.k.a. local oscillator) to save some CPU cycles; I figured that removing a filter would save something.
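To make the two approaches concrete, here is a minimal pure-Python sketch of what each one computes. It is not GNU Radio code. The function names (mix_down, fir_decimate), the test tone, and the moving-average taps are assumptions made up for illustration. The local oscillator version is only the multiplication step, while a frequency xlating filter conceptually combines that shift with an FIR filter and decimation.

```python
import cmath
import math

def mix_down(samples, freq_offset_hz, sample_rate_hz):
    """Shift the spectrum down by freq_offset_hz via complex multiplication."""
    step = -2.0 * math.pi * freq_offset_hz / sample_rate_hz
    return [s * cmath.exp(1j * step * n) for n, s in enumerate(samples)]

def fir_decimate(samples, taps, decimation):
    """Direct-form FIR filter that only computes every decimation-th output."""
    out = []
    for n in range(len(taps) - 1, len(samples), decimation):
        acc = 0j
        for k, tap in enumerate(taps):
            acc += tap * samples[n - k]
        out.append(acc)
    return out

# Hypothetical test signal: a complex tone at 1 kHz sampled at 8 kHz.
sample_rate = 8000.0
tone = [cmath.exp(2j * math.pi * 1000.0 * n / sample_rate) for n in range(64)]

# Local oscillator only: the tone ends up at 0 Hz, still at the full rate.
baseband = mix_down(tone, 1000.0, sample_rate)

# Shift plus crude moving-average low pass plus decimation by 4
# (illustrative taps, not a designed filter).
taps = [0.25, 0.25, 0.25, 0.25]
narrow = fir_decimate(baseband, taps, 4)
```

Note that the mixer has to do one complex multiplication for every input sample, whereas the decimating filter only computes an output for every fourth sample in this sketch.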

To my great surprise, this exercise had the opposite effect. The CPU load with the frequency xlating filter was 30-40%, while the local oscillator (LO) version went up to 45-50%. The LO version had an additional rational resampler (RR) for narrowing the IF spectrum display, but removing it did not have any significant effect! The memory load, on the other hand, seems to be 3 MB higher with the frequency xlating filter.

CPU and memory load with freq xlating filter.
CPU and memory load with LO and RR.
CPU and memory load with LO and no RR.