

Updated on September 16, 2010

Scripting Hints, Tips, and Tricks

Who's hogging the system (besides root, of course)?

ps -ef | tail +2 | grep -v "^ *root" | awk '{print $1}' | sort | uniq -c | sort -nr


ps -ef                 #  list all processes with the owner names

tail +2                #  strip off the header

grep -v "^ *root"      #  remove the root processes

awk '{print $1}'       #  print out only the first field which is the user-name

sort                   #  sort in ASCIIbetical sequence

uniq -c                #  give me unique user names with a count

sort -nr               #  re-sort in high to low sequence by process count

Notes:  If you remove the grep portion, you will include root's process count as well.  With this construct you could, for example, programmatically send an e-mail to anyone who has more than 50 processes running at any one time, log who has heavy process usage, and so on.
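The alerting idea in that note can be sketched as follows.  The 50-process threshold and the plain echo report are assumptions -- swap in mailx, logger, or whatever suits your site.  (tail -n +2 is the modern spelling of the older tail +2 used above.)

```shell
#!/bin/sh
# Report any non-root user owning more than MAX processes.
MAX=50
ps -ef | tail -n +2 | grep -v "^ *root" | awk '{print $1}' |
    sort | uniq -c | sort -nr |
    while read COUNT USER
    do
        if [ "$COUNT" -gt "$MAX" ]
        then
            echo "$USER is running $COUNT processes"
        fi
    done
```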

I need to grep something from a stream of data and still preserve my header. What should I do...what should I do?
Solution:  If your stream of data has only one header, it's pretty easy.  However, if your stream has multiple headers, e.g. one for every x lines of output, you may have to omit all but the first one.  Let's use the ps command as our first example, then netstat for the second example (this same technique could, of course, be applied to any command.)


psgrep cron         (psgrep is our script, or function if you prefer)

Here's the script:

[ $# -eq 0 ] && { echo "usage: `basename $0` pattern" >&2; exit 1; }
ps -f | egrep "PID|$1"

Explanation:  The first line checks to ensure there's a command line argument; if not, it aborts with an error message.  The second line runs your command, grepping for either a pattern in your header or the pattern you are looking for in your output (supplied as your command line argument.)  Just make sure the hard-coded pattern (PID in this case) is a unique value in your header.
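The same two lines package naturally as a shell function you can drop in your .profile.  The function form shown here is our own packaging; PID is assumed to be unique to your ps header, as above.

```shell
# psgrep as a function rather than a script file.
psgrep() {
    [ $# -eq 0 ] && { echo "usage: psgrep pattern" >&2; return 1; }
    ps -ef | egrep "PID|$1"
}
# psgrep cron     -- header line plus any matching processes
```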

Example 2:

netgrep established

Here's the script:

[ $# -eq 0 ] && { echo "usage: `basename $0` pattern" >&2; exit 1; }
HEADER=`netstat -an | grep -i swind | head -1`
BODY=`netstat -an | grep -i "$1"`
printf "%s\n%s\n" "$HEADER" "$BODY"

Explanation:  In this example, since we have multiple headers embedded throughout the netstat output, we'll need to isolate them all and just retrieve the first one.  Then, prefix the header to the stream of data where we grep for our pattern (swind, of course, is a pattern in the header that is unlikely to exist anywhere else in the output.)  
Notes:  Use the option of -i (dash-eye) with grep or egrep if you want your command line pattern to match case-insensitively.
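When the header is always the first line of the stream (true for ps, not for netstat), a single awk pass can do the same job.  This hgrep helper is a sketch of that alternative; the function name is our invention.

```shell
# hgrep: keep line 1 (the header) plus every line matching the pattern.
hgrep() {
    awk -v pat="$1" 'NR == 1 || $0 ~ pat'
}
ps -ef | hgrep cron      # header row plus any cron processes
```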

My "find" commands run too long.  How can I speed them up by skipping directories?
Solution:  Use the -prune option with find.  Just make sure you put your -prune before your positive criteria.

Example 1:

find / -name usr -prune -o -name "sh*" -print

The above command starts at the top of the directory structure (/) and then does one of two things.  Either:

     If the name is usr (as in /usr), skip it.  Or,

    if the name is "sh*" (begins with sh), print its name.

This will circumvent the entire /usr directory chain.

Example 2:

find / -name usr -prune -o \
     -name proc -prune -o \
     -name tmp -prune -o \
     -name "1*" -print  2> /dev/null
The above example searches the global file system (begin at the root (/)), but skips the /usr, /proc, and /tmp directory chains, finding all files whose name begins with a 1 (one.)
Notes:  The \ as the last character on a command line is the shell's line continuation character.  Also, remember if you throw away your errors (2> /dev/null), you'll be disregarding any command syntax errors as well.  Just make sure you have the command format correct before you add the 2> /dev/null.
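Here's the same idea in a disposable sandbox you can run safely; all of the directory and file names below are made up for the demo.

```shell
# Build a scratch tree, prune one branch, and search the rest.
T=/tmp/prunedemo.$$
mkdir -p "$T/skipme" "$T/keep"
touch "$T/skipme/shadow" "$T/keep/shells" "$T/keep/other"
find "$T" -name skipme -prune -o -name "sh*" -print    # only $T/keep/shells appears
rm -r "$T"
```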

How do I detect files that contain carriage-returns? (usually as the result of a binary ftp from Windows to UNIX).
Solution:  A carriage-return is an octal 015.  We can use the "od -b" command with grep to determine if the file contains carriage-returns.

cat my.file | od -b | cut -f 2- -d" " | grep 015

If you see any output, your file contains carriage returns

Following is script file detcr (detect carriage returns):

for FILE in $*
do
    [ ! -r "$FILE" -o ! -f "$FILE" ] && { echo "Skipping file $FILE"; continue; }
    cat "$FILE" | od -b | cut -f2- -d" " | grep -q 015
    [ $? -eq 0 ] && echo "File $FILE contains carriage returns"
done

In the above, the script looks at all of your command line arguments (which should be ASCII files to interrogate for carriage returns.)  If a file is not readable or is not a regular file, skip it.  Type out the contents of the file, pipe the output into od -b, strip off the byte-offset portion of the od -b output (with the cut command), and then check for a 015.

If the 015 was found by grep, grep's return code will be 0 (success.)  If so, print that the file contains carriage returns.
Note:  If you pass binary files, it is likely that you will find 015 patterns, even though they will not be treated as carriage-returns.
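To see the test in action without hunting for a real Windows transfer, you can manufacture two throwaway files; the file names here are arbitrary.

```shell
# One file with a carriage return, one without.
printf 'line one\r\n' > /tmp/with_cr.$$
printf 'line one\n'   > /tmp/without_cr.$$
for F in /tmp/with_cr.$$ /tmp/without_cr.$$
do
    if od -b "$F" | cut -f2- -d" " | grep -q 015
    then
        echo "$F contains carriage returns"
    else
        echo "$F is clean"
    fi
done
rm /tmp/with_cr.$$ /tmp/without_cr.$$
```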

I have several ways of accomplishing my UNIX task.  Which is the most efficient?
Solution:  Write a little, reusable test harness to iterate over your algorithms multiple (or many) times.  I have a test harness named mytester.  First, the invocation syntax:

time mytester          #  Run my algorithm 100 times (the default.)  Give me statistics.
time mytester 500   #  Run my algorithm 500 times.  Give me statistics


real    0m6.79s
user   0m0.14s
sys    0m0.56s

Following is script file mytester (my testing harness):
COUNT=$1                                      #  Place argument 1 in variable COUNT
echo "$1" | grep -q "[^0-9]" && COUNT=""      #  If the argument contains any non-numerics, blank COUNT out
[ "$COUNT" ] || COUNT=100                     #  Default COUNT to 100 if blank
LIMIT=0
while [ $LIMIT -lt $COUNT ]
do
    #  Place test code here *********************************
    #  As a sample, 3 different ways to uppercase a string.
    #  Keep two of the statements commented out for each invocation of this script.

    echo "abc" | tr "a-z" "A-Z"
    #  echo "abc" | awk '{print toupper($1)};'
    #  echo "abc" | perl -e 'print uc( <> )';

    #  End test code here ***********************************
    (( LIMIT += 1 ))
done
echo Executed $COUNT times


In the above, I've put 3 different variations of an uppercasing algorithm between the marker comments.  That's where you'll put whatever test code you're interested in evaluating.  Two of the methods are commented out.  I'll run mytester 3 different times with a different statement uncommented each time.  The invocation that produces the smallest times is usually the most efficient.
Note:  Ensure you invoke your script with the "time" command.  Otherwise, you'll not get your stats.
In the above example, all three methods produced extremely similar internal time measurements (sys and user.)  However, the Perl version was significantly slower in real time, most likely due to the cost of loading the Perl executable (which is much larger than awk or tr) many, many times.

I want to quickly get some information about a particular command, regardless of where on the file system it is stored.
Solution:  Use the which command (in backticks) with the command you're interested in.

ls -l  `which ps`

file  `which ps`

strings `which ps`

Solution 2 (for a little less typing):  Create an alias named w for the which command.

alias w=which


ls -l  `w ps`

file  `w ps`

strings `w ps`
Notes:  This tip, obviously, only works with commands that are physical commands (stored on the file system), not built-ins, keywords, or functions.  Also, the command would have to be in a directory that is noted in your PATH variable (most should be.)
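If you find yourself running all three inspections often, they roll up into one small function.  The name inspect is our invention, and it assumes which and file exist on your system.

```shell
# inspect: locate a command on your PATH and describe it.
inspect() {
    CMD=`which "$1"` || return 1     # bail out if the command isn't found
    ls -l "$CMD"                     # permissions, owner, size
    file "$CMD"                      # what kind of executable is it?
}
# inspect ps
```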

My shell terminates when I type <Ctrl>D.  I don't like that.  How do I make it stop?

First, take a look at your shell's current option settings:

set -o   #   set dash lowercase Oh

Look for the option named "ignoreeof".  If it is turned OFF, the shell will not ignore EOF (in other words, the shell will respond to a <Ctrl>D as an exit.)

To turn "ignoreeof" ON, so the shell will ignore the <Ctrl>D, your command is:

set -o ignoreeof
If you put the above command in your .profile (or .bash_profile) it will affect only your login shell.  All other subsequent shells should respond to <Ctrl>D as an "exit".  (That's the way I like to set my environment up.)
Notes:  You can, of course, turn ON any of your listed options in the same way we did above.  To turn a shell option OFF:

set +o option_name

Also, some of the options have shortcuts for turning them on and off.  For example:

set -x

is the same as:

set -o xtrace

The shell options are NOT  inherited by subsequent shells.  For that type of behavior, you would need to put your option settings in .kshrc or .bashrc.
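A quick, harmless way to watch an option change the shell's behavior is noclobber, which controls whether redirection may overwrite an existing file.  The file name below is arbitrary.

```shell
# Toggle noclobber and watch the same redirection fail, then succeed.
F=/tmp/clobberdemo.$$
echo first > "$F"
set -o noclobber
echo second > "$F" 2>/dev/null && echo "overwrote" || echo "blocked"
set +o noclobber
echo third > "$F" && echo "overwrote"
rm "$F"
```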

My terminal is hung up.  Nothing I type is visible.  No output is visible.  What's up with that?
Possibility 1:
You've inadvertently typed <Ctrl>S (I accidentally do this about once a week.)  That control sequence is usually defined to stop the transmission between your IO device and the shell.

Solution 1:
Type <Ctrl>Q.  Restart flow.  (By the way, the <Ctrl>S / <Ctrl>Q combination comes in handy for stopping and starting a long stream of output.)

Possibility 2:
A job is actually running in the foreground that you're not aware of.

Solution 2:
Type <Ctrl>Z to see if you can pause (STOP) the job.  If so, you should get to your launching shell.  From there, you can do the "ps" or "jobs" command to see what's going on.  If all is OK and you want to restart your job, use "fg" to continue it in foreground or "bg" to continue it in background.

Possibility 3:
The administrator has sent a STOP signal to your shell.

Solution 3:
This is a tough one.  You really don't have much control over this problem.  Fortunately, this shouldn't happen too often.  Logon to another session.  Type the command:

ps -u  your_user_id  -l  (That's a dash-ell.)

Look under the "S" column for any "T" values.  If a "T" appears on a line that contains a shell, that particular shell is Stopped.  Shoot it a Continue signal to see if that wakes it up:

kill -CONT PID (That shell's process id.)
Note:  The "ps" command in solution 3 might be a good candidate for an alias named "psu".

alias psu="ps -u $LOGNAME -l"

"Show all of the jobs that I own in long format"

I want to scriptomatically find files of a particular type.  I want the file type I'm searching for to be part of the name of the script.
First Off:
Run the file * command to see the types of records generated from it.  Find something unique to the file-type in the record.  For example, I see the word Perl
only on the lines containing my Perl scripts.

Step 1:  Create a script file named findperl containing the following.

file * | grep -i ":.*[^a-zA-Z]perl[^a-zA-Z]" | cut -f1 -d:    # locate the pattern perl, after a ":", with non-word characters on each side of it

Step 2:  Save the file.

Step 3:  chmod 740 findperl        #  Give the script execute privileges

Step 4:  ./findperl                         #  Run it.  The ./ is not necessary if your current
                                                      #  directory is in your PATH.

Step 5:  Modify your findperl script to look at the name of the script to yank the "perl" portion from it.  We'll then use the portion of the file name for the pattern
passed to grep.

PTRN=`basename $0`        #  Strip off any full qualification of the file name
PTRN=${PTRN#find}         #  Strip off "find" from the script file name
file * | grep -i ":.*[^a-zA-Z]$PTRN[^a-zA-Z]" | cut -f1 -d:     #  grep for the remainder of the file name (whatever follows "find")

Step 6:  Rerun it.  You should get results identical to the first version.

Step 7:  Create the ASCII file version:

ln findperl findascii       #  Just link to the file

Step 8:  Create the C version

ln findperl findc      #  Just link to the file
Note:  All your scripts will need to be named findx, where x is the pattern you're looking for.  The pattern has a non-alpha character list around it with the grep,
so make sure you identify a complete "word" as your pattern in the output of the file command.

Keep in mind that we're using hard links for the script file names.  A new name, via a hard link, only takes up one slot in a directory file.
It otherwise does not consume any additional disk space, so it's very sleek.
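The name-stripping half of the trick can be exercised on its own, without the file command; the path below is made up.

```shell
# What findperl sees when it inspects its own name.
SCRIPT=/some/path/findperl      # stand-in for $0
PTRN=`basename $SCRIPT`         # findperl
PTRN=${PTRN#find}               # perl
echo "$PTRN"                    # prints: perl
```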

I want to get a list of all my Perl Scripts, C Programs, ASCII files, etc.
The Problem:
I need to isolate all of my files, for some reason, of a particular type.

Step 1:  Run the "file" command on all of your files and find a very tight pattern that is unique to the file type you're interested in.

file *

Step 2:  Then, take the output of that command and grep for your tight pattern:

file * | grep " ASCII C Program"

Step 3:  Then, once you're sure you've not gotten too little or too much, cut out everything but the file name:

file * | grep " ASCII C Program" | cut -f1 -d:

Step 4:  Now, whenever you want to feed those file names to another command, wrap the entire sequence in backticks and use them.

Example 1 (Copy all my Perl Scripts to my /home/mark/perl directory)

cp  -i      `file * | grep "perl script" | cut -f1 -d:`     /home/mark/perl

Example 2 (Archive to a tar file all of my English text files)

tar  -cvf  english.tar      `file * | grep "English text" | cut -f1 -d:`

Example 3 (An alternative to the backticks)

tar  -cvf  english.tar      $(file * | grep "English text" | cut -f1 -d:)  
Note:  Make sure you understand that backticks have quite a different function than the similar-looking single quote.  Backticks always contain a command or command sequence that should return some output.  Backticks pass that output along to something else.

Next time: We'll see how to automate the above process in a tricky script file that uses part of the script file name as the pattern to search for.

I want to remove some files with names beginning with a hyphen.
The Problem:
Somehow, I've inadvertently created a file named -x, or -y, or -z.   Every time I attempt to remove it with rm -x, rm thinks -x is an option, not a file name.

For example:  I type the following command sequence:

rm -x

and receive this error:

rm: invalid option -x

Solution (fully qualify the file name)

rm /home/barney/-x

#  or

rm ./-x    #  The file is right here, in my current directory. (dot-slash prefixed)
Explanation:  A fully qualified file name tells the shell exactly where the file lives.  You fully qualify a file by prepending the file's directory in front of the file name.  The shell assumes that unqualified file names live in your current directory.

Additional Note:  You can certainly always qualify your file names for any command whether you need to or not (as a matter of fact, it's considered good practice in shell scripts.)  However, the non-full-qualification of file name reference, allowed by the shell, sure saves a lot of typing.
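On most modern systems there is a third escape hatch worth knowing: a bare -- argument tells rm (and many other utilities) that everything after it is a file name, never an option.  A quick round trip, assuming a writable current directory:

```shell
touch ./-x       # create the awkwardly named file (./ sidesteps option parsing)
ls ./-x          # prove it exists
rm -- -x         # "--" ends option processing; -x is now just a file name
```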

I want to save a man page.  I don't want to see those nasty backspaces.
The Problem:
I know how to save a man page via redirection.  How do I make it more readable?

For example:  I type the following command sequence:

man ps > psman.out

Then, I pull up psman.out in vi, and there are tons of strange character sequences.

Solution: (pipe the output through col -b before writing it)

man ps | col -b > psman.out
The col command with the -b option reads its input from its source (the man command, in this example) and assumes that the output device (STDOUT) has no ability to process backspaces.  Therefore, it removes the backspaces, keeping only the final character written at each print position.

The backspace characters are used in the man command's output in case the document is actually sent to an impact printer (dot-matrix, daisy-wheel, line, etc.)  It's sort of the poor-man's way of bolding (just keep hammering the same character in the same place a couple of times.)
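You can watch col -b do its work in miniature, no man page required; the sample string is ours, and we assume col is installed on your system.

```shell
# Each X\bX overstrike pair collapses to a single character.
printf 'N\bNA\bAM\bME\bE\n' | col -b     # prints: NAME
```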

Additional Note:  If you're not happy with the man command's paging mechanism (more, less, pg, etc.) you can change it to whatever you like with the command:

export PAGER=more
export PAGER=less
export PAGER=pg
export PAGER=cat

The last setting (PAGER=cat) essentially disables paging.

Just make sure you point the PAGER variable to a valid paging command.

I want to store my command history in separate files by month or day.
The Problem:
I would like to store my command history in smaller files that I can identify by month or day.  How can I do this?

For all of the solutions below, include the solution in your .profile (or .bash_profile if you're running bash).

Solution 1 (Store history in monthly files):
       export HISTFILE=.sh_history_`date +%Y_%b`
Solution 2 (Store history in daily files):
       export HISTFILE=.sh_history_`date +%Y_%m_%d`
Your HISTFILE variable controls which physical file your command history is written into.  By default, it is usually written into a hidden file named .sh_history.

Solution 1:
Your history will be written, this month, into a file named .sh_history_2010_Jan.
Beginning Feb 1st: .sh_history_2010_Feb.

Solution 2:
Your command history will be written, today, into a file named .sh_history_2010_01_25.  Beginning tomorrow: .sh_history_2010_01_26.

Notes:  Make sure that you use backticks around the date command.   Use the solutions above as a guide for formatting your command history file name as you would like it.  Check the man pages for the date command to see other format specifiers you might be interested in.
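To preview the file names each scheme will generate, run the date portions by themselves; the examples in the comments depend, of course, on today's date.

```shell
echo ".sh_history_`date +%Y_%b`"       # monthly scheme, e.g. .sh_history_2010_Sep
echo ".sh_history_`date +%Y_%m_%d`"    # daily scheme, e.g. .sh_history_2010_09_16
```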

After you've made the change in your .profile or .bash_profile, on next login type the following to verify the syntax of your HISTFILE assignment:

echo $HISTFILE

Then, after a few commands, run the history command to verify your command history:

history

How to quickly look for patterns through an entire directory chain.
The Problem:
I can identify a file by something in the file's contents.  However, I don't remember the file's name or location.  Essentially, I need to "grep" across multiple directories.

Solution 1:
       find  /home  -exec  grep  -H  "Mr. Peepers"  {}  \;  2> /dev/null
Solution 2:
       find  /home  -name  "p*.doc"  -exec  grep  -H  "Mr. Peepers"  {}  \;  2> /dev/null
Solution 3:
       find  /home  -exec  grep  -H  "Mr. Peepers"  {}  \;  2> /dev/null  |  cut  -f1  -d:

Solution 1:
Use the find command with grep.   In the above solution, /home is the starting point. We will look in all directories in /home and beneath.  We will take the default find criteria of "find all files" (-name  "*").

-exec tells find to execute a command for each file, in this case, the grep command.

-H is the grep option to display the file name, in addition to the record that was found.

"Mr. Peepers" is the pattern we are looking for in each record of the files.

Don't forget, when you use -exec, the {} is the placeholder that receives the file name of the "found" file.  Make sure you put a space before the protected semi-colon (\;).

Disregard the error messages (2> /dev/null)  This is a good extension for many commands if you are not interested in viewing errors.

Solution 2:
If you know a portion of the file's name, add the -name "pattern" to limit the grep to only those files that match that pattern.  Since grep will otherwise search through every file, this technique will drastically improve the speed of the search.  Of course, this would only work if you were sure of the filename match.

Solution 3:
If you just want the file name without the actual text of the record printed, you can pipe the output to a cut command to retrieve only the first portion of the value returned from grep.  After cut, put a -f1 -d: (that is, dash eff one dash dee colon).

Notes:  Some grep versions support a recursive option, which may give a more concise command; check your man pages for details on your version.  Note also that grep's -H option, while common, is not present in every older grep.
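If your grep lacks -H, a classic portable stand-in is to hand grep a second, empty file (/dev/null) so it always has multiple files and therefore always prints the file name.  A self-contained demo on a scratch tree (all names invented):

```shell
D=/tmp/peepers.$$
mkdir -p "$D/sub"
echo "Mr. Peepers was here" > "$D/sub/note.txt"
find "$D" -type f -exec grep "Mr. Peepers" {} /dev/null \; 2> /dev/null
# the matching line prints prefixed with its file name
rm -r "$D"
```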

^M at the end of records.
The Problem:
I've got a ^M at the end of each of my records.  What are they and how can I get rid of them?

Solution 1:

             tr  -d  "\015"  <  input.file > output.file
             mv  output.file   input.file

Solution 2:

             sed 's/^M$//'  input.file > output.file
             mv  output.file  input.file

UNIX uses a single character to specify newline.  Many other operating systems use 2 characters (carriage return and line-feed.)  When a file gets
transferred from location to location in a binary mode, no translation is made from the CR/LF pair to UNIX's preferred single-character newline.  Hence,
you get stuck with an additional character (the CR), and it looks funny, is irritating, and can cause problems.

CR happens to be octal value 015, which shows up as ^M (Ctrl-M) when displayed on most terminals.  All we need to do is remove that character.

Solution 1 uses the tr "translate" command which can only read from STDIN, hence the < for inputting.  The -d option is for deleting characters, as
opposed to tr's normal function of translating characters.

If you want to type a Ctrl-M (as is shown in solution 2)  on the command line, you will probably need to type Ctrl-V before the Ctrl-M.  That just tells
the shell that an embedded control sequence follows.  (That is not a carat M.  It is a Ctrl-M.)

Following either of the translating commands, rename your output file back to your original file name.  Make sure you don't redirect ( > ) over the
filename you want translated in whichever command you use.
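A safe way to rehearse the whole round trip on a throwaway file:

```shell
# Plant carriage returns, strip them, and verify they're gone.
printf 'hello\r\nworld\r\n' > /tmp/dos.$$
tr -d "\015" < /tmp/dos.$$ > /tmp/unix.$$
od -c /tmp/unix.$$        # no \r anywhere in the output
mv /tmp/unix.$$ /tmp/dos.$$
rm /tmp/dos.$$
```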

How to fix your backspace key.
The Problem:
I get weird characters printed to the terminal when I attempt to backspace, like ^H or ^?.

            If you see ^H run this command:
                    stty erase ^H

            If you see ^? run this command:
                    stty erase ^?

            Or, for either case:
                stty erase (and then press your backspace key, which will generate the proper character sequence; if the shell grabs the keystroke, press Ctrl-V first so it is taken literally)

Your backspace key emits character sequences, just like the "A" key or "Z" key.  However, backspace performs terminal manipulation other than just printing
a character.  Your emulator may be expecting a different character sequence than the one that is currently set up to backspace.  The stty command (set your
terminal characteristics) is used to attach character sequences to particular keystroke combinations on your tty (terminal.)

You may want to put either of the "stty erase" commands above in your .profile (or .bash_profile.)  Or, if you prefer, you could create an alias as in:
    alias fixtty="stty erase ^?"

    and put that statement in your .profile.  Then, whenever you need it, just type fixtty.

How to quickly create an empty file not knowing whether a file under that name already exists or not
The Problem:
            I need to create a new, empty file.  A current file with the same name may already be in existence.

Solution 1 (works):
             rm   my.new.file   2> /dev/null
             touch   my.new.file  

Solution 2 (better):
             cat   /dev/null  > my.new.file

Solution 3 (best):
             cp  /dev/null  my.new.file

Solution 1 works by attempting to remove the file.  If the file does not exist, rm generates an error, which we disregard.  We then unconditionally create the new file.  However, if the rm failed because we did not have write privileges on the file, we would not see that error either, and we would think that touch created a new file.  In fact, it would have left the existing file intact.

Solution 2 types the contents of an empty file and redirects it over the top of my.new.file.  Although we often use /dev/null to throw away output and errors,  we can also use it as a file containing nothing as input.  The problem with this solution is we are depending on the shell to allow us to clobber a file, via redirection, that may be in existence.  The shell, through its option settings, may disallow that.

Solution 3 merely copies the contents of the non-existent file to a new file.   Shell redirection does not get involved.

Note:  With all three solutions, of course, we would need write privileges on the directory (and the file, if in existence) to create the file.
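A quick demonstration that solution 3 really does truncate an existing file (the file name is arbitrary):

```shell
echo "old contents" > /tmp/emptydemo.$$
cp /dev/null /tmp/emptydemo.$$
ls -l /tmp/emptydemo.$$       # the size column now reads 0
rm /tmp/emptydemo.$$
```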

Removing the directory name from a fully qualified file name
The Problem:
            I want to turn /home/napoleon/sweet into just "sweet".

Solution 1: (slower)

           FILENAME=`basename $FILENAME`    #  Those are backticks, not apostrophes

Solution 2: (way faster)

             FILENAME=${FILENAME##*/}                #  Those are curly braces, not parentheses

Solution 1 works perfectly.  However, basename is an external command that has to be loaded before being executed.  Anytime an
external command has to be loaded by UNIX, a significant efficiency penalty is assessed.

Solution 2 also works perfectly, but it uses a built-in shell construct to extract the same result.  The $ sign, of course, indicates that we are working with a variable (FILENAME is the variable.)  The ## is a "left" match-and-strip construct: it tells the shell to remove the longest leading portion of the value that matches the wildcard pattern */ (anything (*) up through the last slash.)
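Both methods, side by side, on the example path:

```shell
FILENAME=/home/napoleon/sweet
echo `basename $FILENAME`     # prints: sweet
echo ${FILENAME##*/}          # prints: sweet
```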

Building Hang-Up immunity into your shell script so it survives you logging off
The Problem:

          My background jobs don't survive me logging off if I forget to "nohup" them.


          At the top of your shell script, after the #!/usr/bin/ksh (or whatever), do this:

          trap "" 1    #  That's trap space double-quote double-quote space 1

          At the bottom of your shell script:

          trap 1

trap alters the behavior of a received signal.  When you log out of your shell, the shell passes signal 1 (the HangUp signal) to all child
processes (unless you've otherwise set a shell option to disable that signal pass.)

The double-quote double-quote contains the altered functionality of signal 1 (in this case, do nothing.)  When your running child process
(your script file) receives the signal 1 from the parent, it has no effect, and the script file continues running.

The "trap 1" at the bottom of the script will not do anything unless you run your script as part of your current process (by sourcing the
script file.)  It untraps the signal 1 for that shell.  It's always a good practice to untrap any signals you trap in your shell script.
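You can watch the trap work without logging off anywhere by letting a throwaway shell hang itself up; the echoed messages here are our own invention.

```shell
# With the trap, the shell ignores its own HUP and reaches the echo.
sh -c 'trap "" 1; kill -HUP $$; echo "survived the HUP"'
# Without the trap, the HUP normally kills the shell before its echo runs.
sh -c 'kill -HUP $$; echo "never printed"' || echo "shell was hung up"
```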