ack-grep vs. grep
Following Daniel Bachhuber’s talk, The Zen of WordPress Development, I’ve started to explore this magical ack tool, a replacement for the native grep.
ack can be downloaded from the official and quite modest website, BetterThanGrep.com. ack is also available in all sorts of software repositories, where it may be named ack-grep instead (due to a naming conflict).
ack is written in Perl, while grep is written in C. So why the heck does ack appear to be faster? Here are some tests with the latest WordPress package.
An example head-to-head run
## Regular recursive searches
grep -rni 'function ' wordpress > /dev/null
real 0m2.734s
user 0m2.704s
sys 0m0.028s
ack-grep -i 'function ' wordpress > /dev/null
real 0m2.008s
user 0m1.824s
sys 0m0.164s
## Only PHP files, show before context
find wordpress -type f -name '*.php' | xargs -n 1 grep -ni 'function ' --color --before-context 10 > /dev/null
real 0m2.520s
user 0m2.148s
sys 0m0.724s
ack-grep --php -i 'function ' > /dev/null
real 0m1.463s
user 0m1.316s
sys 0m0.144s
## Regular expressions
grep -rni 'function ...(' wordpress > /dev/null
real 0m0.182s
user 0m0.156s
sys 0m0.024s
ack-grep -i 'function ...\(' wordpress > /dev/null
real 0m0.653s
user 0m0.604s
sys 0m0.044s
## Exclude dir
grep -rni --exclude-dir wordpress/wp-admin 'function ' wordpress > /dev/null
real 0m1.909s
user 0m1.884s
sys 0m0.020s
ack-grep -i --ignore-dir=wordpress/wp-admin 'function ' wordpress > /dev/null
real 0m1.968s
user 0m1.808s
sys 0m0.156s
The data doesn’t say much: the workload is light, each command was run only once, and something else may have been happening on the system while one of them was running. This is far from conclusive, and you can run these tests more rigorously than I did. However, ack-grep does appear to outperform grep in regular searches. Why?
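If you want numbers that are a bit less noisy, repeating each command a few times is a cheap first step. Here is a minimal sketch, assuming a POSIX shell and GNU date (for the %N nanoseconds format); bench is a made-up helper name, not part of grep or ack.

```shell
# bench RUNS COMMAND [ARGS...]
# Run COMMAND several times and print the wall-clock time of each run in
# milliseconds, so that one noisy measurement does not decide the comparison.
# Assumes GNU date: %N (nanoseconds) is not available in every date implementation.
bench() {
    runs=$1
    shift
    i=1
    while [ "$i" -le "$runs" ]; do
        start=$(date +%s%N)
        "$@" > /dev/null 2>&1      # discard output, as in the runs above
        end=$(date +%s%N)
        echo "run $i: $(( (end - start) / 1000000 )) ms"
        i=$((i + 1))
    done
}
```

For example, bench 5 grep -rni 'function ' wordpress followed by bench 5 ack-grep -i 'function ' wordpress. The first run of either tool will often be slower because the files are not yet in the page cache, so it is probably fairer to compare the later runs.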
ack’s speed
One of the major reasons for ack’s speed lies in its whitelist of file types. ack will skip file types that it does not know. To get a list of the default types that ack supports, run the ack-grep --help-types command. Moreover, ack will not search inside source control directories by default. It skips quite a lot, most of which you would usually want skipped anyway.
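To get a feel for how much that filtering saves, you can ask grep to do something similar by hand. This is only a rough approximation, assuming GNU grep: grep_src is a hypothetical helper, and the single *.php glob and the three directory names are my own picks, not ack’s actual rules.

```shell
# grep_src PATTERN [DIR]
# Recursive search that looks only at PHP files and skips common
# version-control directories, roughly imitating what ack does by default.
# Assumes GNU grep for --include and --exclude-dir.
grep_src() {
    pattern=$1
    dir=${2:-.}
    grep -rn --include='*.php' \
        --exclude-dir=.git --exclude-dir=.svn --exclude-dir=.hg \
        -- "$pattern" "$dir"
}
```

Something like grep_src 'function ' wordpress then searches far fewer files than a plain grep -rn, which is the kind of head start ack gets without any flags.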
grep is still faster
Yes, grep is still faster, with a very low system call count. Compare strace grep on one file with strace ack on that same file. ack has to load Perl, so its initialization is slower. Both use the open() and read() system calls to read files, nothing apparently special.
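One way to make that comparison is strace’s summary mode, which prints a table of system call counts instead of the full trace. A small sketch, assuming strace is installed and allowed to trace processes on your machine; syscall_summary is a hypothetical wrapper name.

```shell
# syscall_summary TOOL PATTERN FILE
# Print strace's per-syscall count table for a single search of one file.
# -c collects counts instead of printing every call, -f follows child processes.
# The table is written to stderr; the search results themselves are discarded.
syscall_summary() {
    command -v strace > /dev/null || { echo "strace not installed" >&2; return 127; }
    command -v "$1" > /dev/null || { echo "$1 not installed" >&2; return 127; }
    strace -c -f "$@" > /dev/null
}
```

Running syscall_summary grep 'function ' wordpress/wp-settings.php and then the same with ack-grep (any single file will do) should show the difference in total calls, most of it coming from Perl starting up.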
Why ack still rules
ack is a replacement for grep in 99% of cases, eliminating many flags that you would normally use with grep and making you type less. The set of whitelist rules is fair enough for most uses, delivering faster results without additional “rubbish” or the flags needed to remove it.
However, I think that it is very important to be able to write useful and working grep chains along with find, xargs, etc.
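As an illustration of such a chain, here is one possible variant of the find and xargs pipeline from the tests. It is a sketch rather than the one true way: php_grep is a made-up name, and it assumes a find and xargs that support -print0 and -0 (GNU and BSD both do).

```shell
# php_grep PATTERN [DIR]
# Case-insensitive search of all PHP files under DIR.
# -print0 / -0 keep file names with spaces or newlines intact, and letting
# xargs batch the files (no -n 1) avoids starting one grep per file.
# -H forces the file name to be printed even if a batch holds a single file.
php_grep() {
    pattern=$1
    dir=${2:-.}
    find "$dir" -type f -name '*.php' -print0 |
        xargs -0 grep -niH -- "$pattern"
}
```

Compared with xargs -n 1, this usually starts far fewer grep processes, which may account for part of the gap in the PHP-only test above.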
Further reading
- Interview With Andy Lester – Author of Grep Replacement Tool Ack
- Jonathan Hartley’s response to ack-grep – a source-aware grep replacement
- Combining ack-grep and xargs
Are you as excited about ack as the testimonials on the ack website? Do you still prefer grep?
Benchmarks like this only measure some of ack’s performance speedup. For example, it doesn’t take into account the speedup of typing
ack-grep --php -i 'function '
instead of
find wordpress -type f -name '*.php' | xargs -n 1 grep -ni 'function ' --color --before-context 10 > /dev/null
Also, it seems that you’re trying to only find the word “function” as a word, so you can use the -w flag: ack --php -i -w function
I’m glad you like ack, and appreciate you spreading the word.
Thanks for stopping by, Andy. Great tool, very well done. I use it every day when searching manually, and I even symlinked it to `ack` (an alias in `.bash_aliases` works as well) for a 25% faster launch 😉 However, I still prefer `grep` in my shell scripts for lower-level control and clarity.
I think this “find wordpress -type f -name '*.php' | xargs -n 1 grep -ni 'function ' --color --before-context 10 > /dev/null” can be written “grep -rni --include=\*.php --color --before-context 10 'function ' wordpress > /dev/null”
It would be interesting to see CPU profiling of both tools. I assume they both harness parallelism? I can’t see Perl being faster than C as a language, so it would be interesting to know the differences.
Today’s replacement for grep is ag: https://github.com/ggreer/the_silver_searcher