Anchors
The caret (^) and dollar sign ($) characters are treated as anchors in regular expressions. This means that they cause the match to occur only if the regular expression is found at the beginning of the line (^) or at the end of the line ($):
[me@linuxbox ~]$ grep -h '^zip' dirlist*.txt
zip
zipcloak
zipgrep
zipinfo
zipnote
zipsplit
[me@linuxbox ~]$ grep -h 'zip$' dirlist*.txt
gunzip
gzip
funzip
gpg-zip
preunzip
prezip
unzip
zip
[me@linuxbox ~]$ grep -h '^zip$' dirlist*.txt
zip
Here we searched the list of files for the string “zip” located at the beginning of the line, the end of the line, and on a line where it is at both the beginning and the end of the line (i.e., by itself on the line). Note that the regular expression ‘^$’ (a beginning and an end with nothing in between) will match empty lines, that is, lines containing no characters at all.
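To see the anchors at work without depending on the dirlist files, we can try them on a small file of our own. What follows is only a minimal sketch: the sample file and its contents are made up for illustration, and the -c option is used so that grep prints a count of matching lines rather than the lines themselves.

```shell
# Build a tiny sample file (hypothetical contents, for illustration only).
# The fifth line is deliberately empty.
sample=$(mktemp)
printf '%s\n' 'zip' 'zipinfo' 'gzip' 'unzip' '' 'bzip2' > "$sample"

# Lines that begin with "zip": zip, zipinfo
grep -c '^zip' "$sample"     # prints 2

# Lines that end with "zip": zip, gzip, unzip (bzip2 ends with "2")
grep -c 'zip$' "$sample"     # prints 3

# Lines consisting of exactly "zip"
grep -c '^zip$' "$sample"    # prints 1

# Lines with nothing between the beginning and the end
grep -c '^$' "$sample"       # prints 1
```

When we are finished, the temporary file can be removed with rm -- "$sample".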
A Crossword Puzzle Helper
Even with our limited knowledge of regular expressions at this point, we can do something useful.
My wife loves crossword puzzles and she will sometimes ask me for help with a particular question. Something like, “What’s a five-letter word whose third letter is ‘j’ and last letter is ‘r’ that means...?” This kind of question got me thinking.
Did you know that your Linux system contains a dictionary? It does. Take a look in the /usr/share/dict directory and you might find one, or several. The dictionary files located there are just long lists of words, one per line, arranged in alphabetical order. On my system, the words file contains just over 98,500 words. To find possible answers to the crossword puzzle question above, we could do this:
[me@linuxbox ~]$ grep -i '^..j.r$' /usr/share/dict/words
Major
major
Using this regular expression, we can find all the words in our dictionary file that are five letters long and have a “j” in the third position and an “r” in the last position.
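If questions like this come up often, the search could be wrapped in a small shell function. This is only a sketch under a few assumptions: the name crossword is invented here, the pattern is expected to use a dot for each unknown letter, and the default word list is the /usr/share/dict/words file mentioned above, which may be missing or named differently on some systems.

```shell
# crossword PATTERN [WORDLIST]
# Hypothetical helper, not a standard command.
#   PATTERN   uses "." for each unknown letter, e.g. "..j.r"
#   WORDLIST  optional file with one word per line; defaults to
#             /usr/share/dict/words (an assumption about the system)
crossword() {
    # ^ and $ anchor the pattern to the whole line, so only words of
    # exactly the pattern's length can match. -i ignores letter case.
    grep -i -- "^$1\$" "${2:-/usr/share/dict/words}"
}
```

With this function defined, crossword '..j.r' runs the same search as the grep command shown earlier, and a different word list can be given as a second argument. Anchoring both ends is what fixes the word length; without the anchors, the pattern would also match longer words that merely contain those five characters.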