Optimize vp9_get_sub_block_energy.

Because energy scaling is non-decreasing, we can work on the variance
and scale after the loop. This avoids costly computations (in
particular, log()) within the loop.
We've measured that we spend 0.8% of our total time computing the log.

Change-Id: I302fc0ecd9fd8cf96ee9f31b8673e82de1b2b3e2
1 file changed