cat

The cat program has a number of interesting options. Many of them are used to help better visualize text content. One example is the -A option, which is used to display non-printing characters in the text. There are times when we want to know if control characters are embedded in our otherwise visible text. The most common of these are tab characters (as opposed to spaces) and carriage returns, often present as end-of-line characters in MS-DOS-style text files. Another common situation is a file containing lines of text with trailing spaces.

Let’s create a test file using cat as a primitive word processor. To do this, we’ll just enter the command cat (along with specifying a file for redirected output) and type our text, followed by Enter to properly end the line, then Ctrl-d, to indicate to cat that we have reached end-of-file. In this example, we enter a leading tab character and follow the line with some trailing spaces:



[me@linuxbox ~]$ cat > foo.txt
	The quick brown fox jumped over the lazy dog.       
[me@linuxbox ~]$


Next, we will use cat with the -A option to display the text:


[me@linuxbox ~]$ cat -A foo.txt
^IThe quick brown fox jumped over the lazy dog.       $
[me@linuxbox ~]$


As we can see in the results, the tab character in our text is represented by ^I. This is a common notation that means “Control-I” which, as it turns out, is the same as a tab character. We also see that a $ appears at the true end of the line, indicating that our text contains trailing spaces.
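The same experiment can be repeated without typing the text interactively. The following is a minimal sketch, not part of the session above: it assumes GNU cat (the -A option is not available in every cat implementation) and uses printf and mktemp, neither of which appears in the original example, to write the tab and the trailing spaces explicitly into a scratch file.

```shell
# Work in a scratch directory so an existing foo.txt is not overwritten
# (assumption: mktemp is available on the system).
workdir=$(mktemp -d)

# \t writes a literal tab; three spaces are placed before the newline.
printf '\tThe quick brown fox jumped over the lazy dog.   \n' > "$workdir/foo.txt"

# With -A, cat shows the tab as ^I and marks the end of each line with $.
visible=$(cat -A "$workdir/foo.txt")
printf '%s\n' "$visible"

# Remove the scratch directory.
rm -r "$workdir"
```

Any spaces that appear between the final period and the $ marker are the trailing spaces that would otherwise be invisible.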


MS-DOS Text Vs. Unix Text

One of the reasons you may want to use cat to look for non-printing characters in text is to spot hidden carriage returns. Where do hidden carriage returns come from? DOS and Windows! Unix and DOS don’t define the end of a line the same way in text files. Unix ends a line with a linefeed character (ASCII 10) while MS-DOS and its derivatives use the sequence carriage return (ASCII 13) and linefeed to terminate each line of text.

There are several ways to convert files from DOS to Unix format. On many Linux systems, there are programs called dos2unix and unix2dos, which can convert text files to and from DOS format. However, if you don’t have dos2unix on your system, don’t worry. The process of converting text from DOS to Unix format is very simple; it only involves the removal of the offending carriage returns. That is easily accomplished by a couple of the programs discussed later in this chapter.
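As a hedged illustration of removing carriage returns, the sketch below uses tr, on the assumption that it is one of the programs meant here; the text above does not name them. The file names dos.txt and unix.txt are hypothetical, and cat -A again assumes GNU cat, which displays a carriage return as ^M.

```shell
# Scratch directory for the hypothetical test files.
workdir=$(mktemp -d)

# Build a two-line file with DOS-style line endings (carriage return + linefeed).
printf 'line one\r\nline two\r\n' > "$workdir/dos.txt"

# Each carriage return shows up as ^M just before the $ end-of-line marker.
before=$(cat -A "$workdir/dos.txt")
printf '%s\n' "$before"

# Delete every carriage return, leaving only the linefeeds.
tr -d '\r' < "$workdir/dos.txt" > "$workdir/unix.txt"

# The converted file no longer shows ^M at the end of each line.
after=$(cat -A "$workdir/unix.txt")
printf '%s\n' "$after"

rm -r "$workdir"
```

Note that tr -d removes every carriage return in the input, not only those at the ends of lines, which is normally what is wanted for plain text files.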


cat also has options that are used to modify text. The two most prominent are -n, which numbers lines, and -s, which suppresses the output of multiple blank lines. We can demonstrate thusly:


[me@linuxbox ~]$ cat > foo.txt
The quick brown fox


jumped over the lazy dog.
[me@linuxbox ~]$ cat -ns foo.txt
     1	The quick brown fox
     2	
     3	jumped over the lazy dog.
[me@linuxbox ~]$

In this example, we create a new version of our foo.txt test file, which contains two lines of text separated by two blank lines. After processing by cat with the -ns options, the extra blank line is removed and the remaining lines are numbered. While this is not much of a process to perform on text, it is a process.
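For a version that can be pasted into a script, here is a small sketch under the same assumptions as the session above. It writes the test file with printf rather than interactive input, and the scratch directory created with mktemp is an addition for safety rather than part of the original example.

```shell
# Scratch directory so an existing foo.txt is left alone.
workdir=$(mktemp -d)

# Two lines of text separated by two blank lines.
printf 'The quick brown fox\n\n\njumped over the lazy dog.\n' > "$workdir/foo.txt"

# -s squeezes the run of blank lines down to one; -n numbers each output line.
numbered=$(cat -ns "$workdir/foo.txt")
printf '%s\n' "$numbered"

rm -r "$workdir"
```

The four input lines become three numbered output lines, with the single surviving blank line numbered 2.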

