ack-grep vs. grep

After watching Daniel Bachhuber’s talk, The Zen of WordPress Development, I’ve started to explore this magical ack tool, a replacement for the native grep.

ack vs. grep

ack can be downloaded from its official and quite modest website. It is also available in all sorts of software repositories, where it may be named ack-grep instead (due to a naming conflict).

ack is written in Perl, while grep is written in C. So why the heck does ack appear to be faster? Here are some tests with the latest WordPress package.

An example head-to-head run

## Regular recursive searches

grep -rni 'function ' wordpress > /dev/null

real 0m2.734s
user 0m2.704s
sys 0m0.028s

ack-grep -i 'function ' wordpress > /dev/null

real 0m2.008s
user 0m1.824s
sys 0m0.164s

## Only PHP files, show before context

find wordpress -type f -name '*.php' | xargs -n 1 grep -ni 'function ' --color --before-context 10 > /dev/null

real 0m2.520s
user 0m2.148s
sys 0m0.724s

ack-grep --php -i 'function ' > /dev/null

real 0m1.463s
user 0m1.316s
sys 0m0.144s

## Regular expressions

grep -rni 'function ...(' wordpress > /dev/null

real 0m0.182s
user 0m0.156s
sys 0m0.024s

ack-grep -i 'function ...\(' wordpress > /dev/null

real 0m0.653s
user 0m0.604s
sys 0m0.044s

## Exclude dir

grep -rni --exclude-dir wordpress/wp-admin 'function ' wordpress > /dev/null

real 0m1.909s
user 0m1.884s
sys 0m0.020s

ack-grep -i --ignore-dir=wordpress/wp-admin 'function ' wordpress > /dev/null

real 0m1.968s
user 0m1.808s
sys 0m0.156s

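The data doesn’t say much: the workload is light, each command was run only once, and something else may have been happening on the system during any of the runs. The comparisons are not strictly like for like either; the PHP-only grep run, for instance, starts one grep process per file (xargs -n 1) and asks for context lines, which the ack run does not. So this is far from conclusive, and you can run these tests more rigorously than I did. However, ack-grep does appear to outperform grep in regular searches. Why?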
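For a slightly more careful comparison, each command can be run several times and the fastest run kept, which reduces the influence of whatever else the system was doing. The following is only a minimal sketch: best_of is a hypothetical helper name, and it assumes GNU date (for the %N nanosecond format) and a shell with 64-bit arithmetic.

```shell
# best_of N CMD [ARGS...]
# Runs CMD N times with its output discarded and prints the fastest
# wall-clock time in milliseconds. The exit status of CMD is ignored.
# Assumption: GNU date, which supports %N (nanoseconds).
best_of() {
    runs=$1
    shift
    best=
    i=0
    while [ "$i" -lt "$runs" ]; do
        start=$(date +%s%N)
        "$@" > /dev/null 2>&1
        end=$(date +%s%N)
        elapsed=$(( (end - start) / 1000000 ))
        # Keep the smallest elapsed time seen so far.
        if [ -z "$best" ] || [ "$elapsed" -lt "$best" ]; then
            best=$elapsed
        fi
        i=$((i + 1))
    done
    echo "$best"
}

# Example usage (not executed here):
#   best_of 5 grep -rni 'function ' wordpress
#   best_of 5 ack-grep -i 'function ' wordpress
```

Taking the minimum rather than the average is a design choice: it approximates the time with warm caches and the least interference, which is usually what you want when comparing two tools on the same input.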

ack’s speed

One of the major reasons for ack’s speed lies in its whitelist of file types. ack skips file types that it does not know. To get the list of types that ack supports by default, run ack-grep --help-types. Moreover, ack does not search inside source control directories by default. It skips quite a lot, most of which you would usually want skipped anyway.
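To get a feel for how much this filtering matters, grep can be told to do roughly the same thing by hand. This is a rough sketch, not ack’s actual rule set: ackish is a hypothetical helper, the three extensions are an arbitrary sample, and --include and --exclude-dir are assumed to be available (they are GNU grep extensions).

```shell
# ackish PATTERN [DIR]
# Hypothetical helper: search only a few known source extensions and
# skip version-control directories, loosely imitating ack's defaults.
# Assumption: GNU grep (--include / --exclude-dir).
ackish() {
    pattern=$1
    dir=${2:-.}
    grep -rni \
        --include='*.php' --include='*.js' --include='*.css' \
        --exclude-dir=.git --exclude-dir=.svn --exclude-dir=.hg \
        -- "$pattern" "$dir"
}

# Example usage (not executed here):
#   ackish 'function ' wordpress
```

The number of flags needed here is a fair illustration of what ack saves you from typing.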

grep is still faster

Yes, grep itself is still faster, and it makes very few system calls. Compare strace grep on one file with strace ack on that same file: ack has to load Perl first, so its startup is slower. Both use the open() and read() system calls to read files, so there is nothing apparently special there.

Why ack still rules

ack can replace grep in 99% of cases, and it eliminates many of the flags that you would normally pass to grep, so you type less. The whitelist rules are fair enough for most uses, delivering faster results without the additional “rubbish” or the flags needed to filter it out.

However, I think that it is still very important to be able to write useful, working grep chains with find, xargs, etc.
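As one example of such a chain, here is a sketch of a PHP-only search with leading context. php_grep is a hypothetical helper name. Unlike a chain that uses xargs -n 1, it lets xargs pass many files to each grep process, and it uses -print0 with -0 (supported by GNU and BSD find and xargs) so that file names containing spaces survive the pipe.

```shell
# php_grep PATTERN [DIR]
# Hypothetical helper: grep all *.php files under DIR, case-insensitive,
# with line numbers and ten lines of leading context.
# -print0 / -0 separate file names with NUL bytes instead of whitespace.
# -H prints the file name even if grep happens to receive a single file.
php_grep() {
    pattern=$1
    dir=${2:-.}
    find "$dir" -type f -name '*.php' -print0 |
        xargs -0 grep -Hni --before-context=10 -- "$pattern"
}

# Example usage (not executed here):
#   php_grep 'function ' wordpress
```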

Further reading

Are you as excited about ack as the testimonials on the ack website? Do you still prefer grep?