rm: remove directory `archive'? y
We will discuss how to make this option the default in Chapter 7, which discusses customizing your shell environment.
3.3.3. Finding files
3.3.3.1. Using shell features
In the example on moving files we already saw how the shell can manipulate multiple files at once. In that example, the shell finds out automatically what the user means by the requirements between the square brackets "[" and "]". The shell can substitute ranges of numbers and upper or lower case characters alike. An asterisk stands for any number of characters, and a question mark for exactly one character.
All sorts of substitutions can be used simultaneously; the shell is very logical about it. The Bash shell, for instance, has no problem with expressions like ls dirname/*/*/*[2-3].
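The substitutions described above can be tried safely in a scratch directory. The following is only a minimal sketch: the directory is created with mktemp and the file names (file1, fileA, notes.txt and so on) are made up for the illustration.

```shell
# Work in a throwaway directory so that no real files are touched.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch file1 file2 file3 fileA notes.txt

# [1-2] matches exactly one character out of the range 1 to 2.
range_matches=$(echo file[1-2])

# ? matches exactly one character, whatever it is.
single_matches=$(echo file?)

# * matches any number of characters.
star_matches=$(echo *.txt)

echo "$range_matches"    # file1 file2
echo "$single_matches"   # all four names that start with "file"
echo "$star_matches"     # notes.txt

# Clean up the scratch directory.
cd / && rm -rf "$demo"
```

Using echo instead of ls is a convenient way to see what the shell makes of a pattern before handing it to a command that changes files.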
In other shells, the asterisk is commonly used to minimize the effort of typing: people would enter cd dir* instead of cd directory. In Bash however, this is not necessary because the GNU shell has a feature called file name completion. It means that you can type the first few characters of a command (anywhere) or a file (in the current directory) and if no confusion is possible, the shell will find out what you mean. For example, in a directory containing many files, you can check if there are any files beginning with the letter A just by typing ls A and pressing the Tab key twice, rather than pressing Enter. If there is only one file starting with "A", this file will be shown as the argument to ls (or any shell command, for that matter) immediately.
3.3.3.2. Which
A very simple way of looking up files is using the which command, to look in the directories listed in the user's search path for the required file. Of course, since the search path contains only paths to directories containing executable programs, the which command doesn't work for ordinary files. It is useful when troubleshooting "command not found" problems. In the example below, user tina can't use the acroread program, while her colleague has no troubles whatsoever on the same system. The problem is similar to the PATH problem in the previous part: Tina's colleague tells her that he can see the required program in /opt/acroread/bin, but this directory is not in her path:
tina:~> which acroread
/usr/bin/which: no acroread in (/bin:/usr/bin:/usr/bin/X11)
The problem can be solved by giving the full path to the command to run, or by re-exporting the content of the PATH variable:
tina:~> export PATH=$PATH:/opt/acroread/bin
tina:~> echo $PATH
/bin:/usr/bin:/usr/bin/X11:/opt/acroread/bin
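Running the export command several times appends the same directory again and again. The following is a small sketch, not a required procedure, of one way to add a directory only when it is missing; the variable name newdir is an assumption made for the illustration, and the directory is the one from tina's example.

```shell
# Directory from the example above; it does not need to exist for this test.
newdir=/opt/acroread/bin

# Wrapping both sides in colons makes the comparison match whole
# entries only, not parts of longer directory names.
case ":$PATH:" in
  *":$newdir:"*)
    ;;                            # already in the path, nothing to do
  *)
    PATH="$PATH:$newdir"          # append to the end of the search path
    export PATH
    ;;
esac

echo "$PATH"
```

The change only lasts for the current shell session, just like the plain export shown above.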
The which command can also show whether a command is an alias for another command:
gerrit:~> which -a ls
ls is aliased to `ls -F --color=auto'
ls is /bin/ls
If this does not work on your system, use the alias command:
tille@www:~/mail$ alias ls
alias ls='ls --color'
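Apart from which and alias, the shell itself can report how it would resolve a name, using the built-in commands command -v and type. The sketch below is only an illustration; the variable name resolved is an assumption. In a script, aliases are normally not active, so the output shows the path of the program; in an interactive shell with an alias defined for ls, the alias would be reported instead.

```shell
# command -v prints what the shell would run for a given name:
# the path for a program, or the definition for an alias.
resolved=$(command -v ls)
echo "$resolved"

# type describes the same thing in words.
type ls
```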
3.3.3.3. Find and locate
These are the real tools, used when searching other paths besides those listed in the search path. The find tool, known from UNIX, is very powerful, which may be the cause of a somewhat more difficult syntax. GNU find, however, deals with the syntax problems. This command not only allows you to search file names, it can also accept file size, date of last change and other file properties as criteria for a search. The most common use is for finding file names:
find <path> -name <searchstring>
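This can be interpreted as "Look in all files and subdirectories contained in a given path, and print the names of the files whose name matches the search string" (the name, not the content). The search string may contain wildcards such as the asterisk; quote it in that case, so that the shell hands it to find unchanged.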
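A name search can be tried out on a small scratch tree. This is a minimal sketch with made-up file and directory names; the output is piped through sort only to get a predictable order.

```shell
# Build a small directory tree to search in.
demo=$(mktemp -d)
mkdir -p "$demo/docs/old"
touch "$demo/report.txt" "$demo/docs/report.bak" "$demo/docs/old/summary.txt"

# The quotes keep the shell from expanding the asterisk itself;
# find compares the pattern with each file name, not with the full path.
reports=$(find "$demo" -name "report*" | sort)
echo "$reports"

# The same search for all names ending in .txt, at any depth.
texts=$(find "$demo" -name "*.txt" | sort)
echo "$texts"

# Clean up the scratch tree.
rm -rf "$demo"
```

The first search prints docs/report.bak and report.txt under the scratch directory, the second prints report.txt and docs/old/summary.txt.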
Another application of find is for searching files of a certain size, as in the example below, where user peter wants to find all files in the current directory or one of its subdirectories, that are bigger than 5 MB:
peter:~> find . -size +5000k
psychotic_chaos.mp3
If you dig in the man pages, you will see that find can also perform operations on the found files. A common example is removing files. It is best to run the command without the -exec option first, to check that the correct files are selected. After that, the command can be rerun to delete the selected files. Below, we search for files ending in .tmp:
peter:~> find . -name "*.tmp" -exec rm {} \;
peter:~>
Optimize!
This command will call on rm as many times as a file answering the requirements is found. In the worst case, this might be thousands or millions of times. This is quite a load on your system.
A more realistic way of working would be the use of a pipe (|) and the xargs tool with rm as an argument. This way, the rm command is only called when the command line is full, instead of for every file. See Chapter 5 for more on using I/O redirection to ease everyday tasks.
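The pipe-and-xargs approach from the note above might look like the sketch below. It runs in a scratch directory with made-up file names. The -print0 and -0 options are extensions available in GNU find and xargs; they separate the names with a NUL byte, so that file names containing spaces are handled correctly.

```shell
# Scratch directory with two temporary files and one file to keep.
demo=$(mktemp -d)
touch "$demo/a.tmp" "$demo/b c.tmp" "$demo/keep.txt"

# Step 1: a dry run that only lists what would be removed.
find "$demo" -name "*.tmp"

# Step 2: xargs collects many names and passes them to a single rm,
# instead of starting one rm per file.
find "$demo" -name "*.tmp" -print0 | xargs -0 rm -f

# Only keep.txt should be left.
remaining=$(ls "$demo")
echo "$remaining"

# Clean up the scratch directory.
rm -rf "$demo"
```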
Later on (in 1999 according to the man pages, after 20 years of find), locate was developed. This program is easier to use, but more restricted than find, since its output is based on a file index database that is updated only once every day. On the other hand, a search in the locate database uses fewer resources than find and therefore shows the results nearly instantly.
Most Linux distributions use slocate these days, security enhanced locate, the modern version of locate that prevents users from getting output they have no right to read. The files in root's home directory are such an example: these are not normally accessible to the public. A user who wants to find someone who knows about the C shell may issue the command locate .cshrc, to display all users who have a customized configuration file for the C shell. Supposing the users root and jenny are running C shell, then only the file /home/jenny/.cshrc will be displayed, and not the one in root's home directory. On most systems, locate is a symbolic link to the slocate program:
billy:~> ls -l /usr/bin/locate
lrwxrwxrwx 1 root slocate 7 Oct 28 14:18 /usr/bin/locate -> slocate*
User tina could have used locate to find the application she wanted:
tina:~> locate acroread
/usr/share/icons/hicolor/16x16/apps/acroread.png
/usr/share/icons/hicolor/32x32/apps/acroread.png
/usr/share/icons/locolor/16x16/apps/acroread.png
/usr/share/icons/locolor/32x32/apps/acroread.png
/usr/local/bin/acroread
/usr/local/Acrobat4/Reader/intellinux/bin/acroread
/usr/local/Acrobat4/bin/acroread
Directories that don't contain the name bin can't contain the program - they don't contain executable files. There are three possibilities left. The file in /usr/local/bin is the one tina would have wanted: it is a link to the shell script that starts the actual program:
tina:~> file /usr/local/bin/acroread
/usr/local/bin/acroread: symbolic link to ../Acrobat4/bin/acroread
tina:~> file /usr/local/Acrobat4/bin/acroread
/usr/local/Acrobat4/bin/acroread: Bourne shell script text executable
tina:~> file /usr/local/Acrobat4/Reader/intellinux/bin/acroread
/usr/local/Acrobat4/Reader/intellinux/bin/acroread: ELF 32-bit LSB executable, Intel 80386, version 1, dynamically linked (uses shared libs), not stripped
In order to keep the path as short as possible, so the system doesn't have to search too long every time a user wants to execute a command, we add /usr/local/bin to the path and not the other directories, which only contain the binary files of one specific program, while /usr/local/bin contains other useful programs as well.
Again, a description of the full features of find and locate can be found in the Info pages.
3.3.3.4. The grep command
3.3.3.4.1. General line filtering
A simple but powerful program, grep is used for filtering input lines and printing only those lines that match a given pattern. There are literally thousands of applications for the grep program. In the example below, jerry uses grep to see how he did the thing with find:
jerry:~> grep -a find .bash_history
find . -name userinfo
man find
find ../ -name common.cfg
Search history
Also useful in these cases is the search function in bash, activated by pressing Ctrl and R at the same time, such as in the example where we want to check how we did that last find again:
thomas ~> ^R
(reverse-i-search)`find': find `/home/thomas` -name *.xml
Type your search string at the search prompt. The more characters you type, the more restricted the search gets. This reads the command history for this shell session (which is written to .bash_history in your home directory when you quit that session). The most recent occurrence of your search string is shown. If you want to see previous commands containing the same string, type Ctrl+R again.
See the Info pages on bash for more.
All UNIXes with just a little bit of decency have an online dictionary. So does Linux. The dictionary is a list of known words in a file named words, located in /usr/share/dict. To quickly check the correct spelling of a word, no graphical application is needed:
william:~> grep pinguin /usr/share/dict/words
william:~> grep penguin /usr/share/dict/words
penguin
penguins
Dictionary vs. word list
Some distributions offer the dict command, which offers more features than simply searching words in a list.
Who is the owner of that home directory next to mine? Hey, there's his telephone number!
lisa:~> grep gdbruyne /etc/passwd
gdbruyne:x:981:981:Guy Debruyne, tel 203234:/home/gdbruyne:/bin/bash
And what was the E-mail address of Arno again?
serge:~/mail> grep -i arno *
sent-mail: To: <Arno.Hintjens@celeb.com>
sent-mail: On Mon, 24 Dec 2001, Arno.Hintjens@celeb.com wrote:
find and locate are often used in combination with grep to define some serious queries. For more information, see Chapter 5 on I/O redirection.
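As a rough illustration of such a combination, the sketch below first filters the list of file names that find prints, and then searches the contents of the files that find selects. The directory layout, the file names and the address arno@example.com are made up for the illustration; -print0 and -0 are extensions found in GNU find and xargs.

```shell
# Scratch mail directory with two small files.
demo=$(mktemp -d)
mkdir -p "$demo/mail"
printf 'To: arno@example.com\n' > "$demo/mail/sent-mail"
printf 'nothing of interest\n' > "$demo/mail/drafts"

# Filter on names: grep only sees the paths printed by find.
by_name=$(find "$demo" -type f | grep "drafts")
echo "$by_name"

# Filter on content: find selects the files, grep -il lists the
# files containing the string, ignoring upper and lower case.
by_content=$(find "$demo" -type f -print0 | xargs -0 grep -il "arno")
echo "$by_content"

# Clean up the scratch directory.
rm -rf "$demo"
```

The first command reports the drafts file because of its name; the second reports sent-mail because of what is written in it.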
3.3.3.4.2. Special characters
Characters that have a special meaning to the shell have to be escaped. The escape character in Bash is backslash, as in most shells; this takes away the special meaning of the following character. The shell knows about quite a few special characters; among the most common are /, ., ? and *. A full list can be found in the Info pages and documentation for your shell.
For instance, say that you want to display the file "*" instead of all the files in a directory, you would have to use