< Previous | Contents | Next >
String Operations
There is a large set of expansions that can be used to operate on strings. Many of these expansions are particularly well suited for operations on pathnames.
${#parameter}
expands into the length of the string contained by parameter. Normally, parameter is a string; however, if parameter is either @ or *, then the expansion results in the number of positional parameters.
[me@linuxbox ~]$ foo="This string is long."
[me@linuxbox ~]$ echo "'$foo' is ${#foo} characters long."
'This string is long.' is 20 characters long.
[me@linuxbox ~]$ foo="This string is long."
[me@linuxbox ~]$ echo "'$foo' is ${#foo} characters long."
'This string is long.' is 20 characters long.
${parameter:offset}
${parameter:offset:length}
These expansions are used to extract a portion of the string contained in parameter. The extraction begins at offset characters from the beginning of the string and continues until the end of the string, unless the length is specified.
[me@linuxbox ~]$ foo="This string is long."
[me@linuxbox ~]$ echo ${foo:5}
string is long.
[me@linuxbox ~]$ echo ${foo:5:6}
string
[me@linuxbox ~]$ foo="This string is long."
[me@linuxbox ~]$ echo ${foo:5}
string is long.
[me@linuxbox ~]$ echo ${foo:5:6}
string
If the value of offset is negative, it is taken to mean it starts from the end of the string rather than the beginning. Note that negative values must be preceded by a space to pre- vent confusion with the ${parameter:-word} expansion. length, if present, must not be less than zero.
If parameter is @, the result of the expansion is length positional parameters, starting at
offset.
[me@linuxbox ~]$ foo="This string is long."
[me@linuxbox ~]$ echo ${foo: -5}
long.
[me@linuxbox ~]$ echo ${foo: -5:2}
lo
[me@linuxbox ~]$ foo="This string is long."
[me@linuxbox ~]$ echo ${foo: -5}
long.
[me@linuxbox ~]$ echo ${foo: -5:2}
lo
${parameter#pattern}
${parameter##pattern}
These expansions remove a leading portion of the string contained in parameter defined by pattern. pattern is a wildcard pattern like those used in pathname expansion. The dif- ference in the two forms is that the # form removes the shortest match, while the ## form removes the longest match.
[me@linuxbox ~]$ foo=file.txt.zip [me@linuxbox ~]$ echo ${foo#*.} txt.zip
[me@linuxbox ~]$ echo ${foo##*.}
zip
[me@linuxbox ~]$ foo=file.txt.zip [me@linuxbox ~]$ echo ${foo#*.} txt.zip
[me@linuxbox ~]$ echo ${foo##*.}
zip
${parameter%pattern}
${parameter%%pattern}
These expansions are the same as the # and ## expansions above, except they remove text from the end of the string contained in parameter rather than from the beginning.
[me@linuxbox ~]$ foo=file.txt.zip
[me@linuxbox ~]$ foo=file.txt.zip
[me@linuxbox ~]$ echo ${foo%.*}
file.txt
[me@linuxbox ~]$ echo ${foo%%.*}
file
[me@linuxbox ~]$ echo ${foo%.*}
file.txt
[me@linuxbox ~]$ echo ${foo%%.*}
file
${parameter/pattern/string}
${parameter//pattern/string}
${parameter/#pattern/string}
${parameter/%pattern/string}
This expansion performs a search-and-replace upon the contents of parameter. If text is found matching wildcard pattern, it is replaced with the contents of string. In the normal form, only the first occurrence of pattern is replaced. In the // form, all occurrences are replaced. The /# form requires that the match occur at the beginning of the string, and the /% form requires the match to occur at the end of the string. In every form, /string may be omitted, causing the text matched by pattern to be deleted.
[me@linuxbox ~]$ foo=JPG.JPG [me@linuxbox ~]$ echo ${foo/JPG/jpg} jpg.JPG
[me@linuxbox ~]$ echo ${foo//JPG/jpg}
jpg.jpg
[me@linuxbox ~]$ echo ${foo/#JPG/jpg}
jpg.JPG
[me@linuxbox ~]$ echo ${foo/%JPG/jpg}
JPG.jpg
[me@linuxbox ~]$ foo=JPG.JPG [me@linuxbox ~]$ echo ${foo/JPG/jpg} jpg.JPG
[me@linuxbox ~]$ echo ${foo//JPG/jpg}
jpg.jpg
[me@linuxbox ~]$ echo ${foo/#JPG/jpg}
jpg.JPG
[me@linuxbox ~]$ echo ${foo/%JPG/jpg}
JPG.jpg
Parameter expansion is a good thing to know. The string manipulation expansions can be used as substitutes for other common commands such as sed and cut. Expansions can improve the efficiency of scripts by eliminating the use of external programs. As an ex- ample, we will modify the longest-word program discussed in the previous chapter to use the parameter expansion ${#j} in place of the command substitution $(echo
-n $j | wc -c) and its resulting subshell, like so:
#!/bin/bash
# longest-word3: find longest string in a file for i; do
if [[ -r $i ]]; then max_word= max_len=0
#!/bin/bash
# longest-word3: find longest string in a file for i; do
if [[ -r $i ]]; then max_word= max_len=0
for j in $(strings $i); do
len=${#j}
if (( len > max_len )); then max_len=$len max_word=$j
fi
done
echo "$i: '$max_word' ($max_len characters)"
fi done
for j in $(strings $i); do
len=${#j}
if (( len > max_len )); then max_len=$len max_word=$j
fi
done
echo "$i: '$max_word' ($max_len characters)"
fi done
Next, we will compare the efficiency of the two versions by using the time command:
[me@linuxbox ~]$ time longest-word2 dirlist-usr-bin.txt
dirlist-usr-bin.txt: 'scrollkeeper-get-extended-content-list' (38 characters)
[me@linuxbox ~]$ time longest-word2 dirlist-usr-bin.txt
dirlist-usr-bin.txt: 'scrollkeeper-get-extended-content-list' (38 characters)
real
user
0m3.618s
0m1.544s
real
user
sys 0m1.768s
[me@linuxbox ~]$ time longest-word3 dirlist-usr-bin.txt
dirlist-usr-bin.txt: 'scrollkeeper-get-extended-content-list' (38 characters)
sys 0m1.768s
[me@linuxbox ~]$ time longest-word3 dirlist-usr-bin.txt
dirlist-usr-bin.txt: 'scrollkeeper-get-extended-content-list' (38 characters)
real
user
0m0.060s
0m0.056s
real
user
sys 0m0.008s
sys 0m0.008s
The original version of the script takes 3.618 seconds to scan the text file, while the new version, using parameter expansion, takes only 0.06 seconds — a very significant im- provement.