Friday, 16 September 2011

Advanced techniques for using the UNIX Find command.

Find all files in the current directory and its subdirectories that are larger than a given size:

find . -size +1000c -exec ls -l {} \;

Always put a c after the number and give the size in bytes; otherwise the results can be confusing, because without a suffix find -size counts in 512-byte disk blocks. To search by a range of file sizes, put a minus or plus sign before the number: the minus sign means "less than," and the plus sign means "greater than." Combining a plus test and a minus test in the same command selects the files within a range. The GNU find manual page describes -size as follows:

-size n[cwbkMG]
              File uses n units of space.  The following suffixes can be used:
              `b'    for 512-byte blocks (this is the default if no suffix is used)
              `c'    for bytes
              `w'    for two-byte words
              `k'    for Kilobytes (units of 1024 bytes)
              `M'    for Megabytes (units of 1048576 bytes)
              `G'    for Gigabytes (units of 1073741824 bytes)
              The  size  does  not count indirect blocks, but it does count blocks in sparse files that are not actually allocated.
              Bear in mind that the `%k' and `%b' format specifiers of -printf handle sparse files  differently.   The  `b'  suffix
              always denotes 512-byte blocks and never 1 Kilobyte blocks, which is different to the behaviour of -ls.

+n matches sizes greater than n
-n matches sizes less than n


Examples:

find . -size +2G -exec ls -lh {} \;
Returns files that are larger than 2 GB.
find . -size +50000000c -exec ls -lh {} \;
Returns files that are larger than 50,000,000 bytes (about 50 MB).

find . -type f -name "*.java*" -print
Returns regular files whose names contain .java, optionally followed by other characters.
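
The examples above each use a single size test. As a minimal sketch of the range search, the following builds a scratch directory (the file names and sizes are made up for illustration) and combines a plus test with a minus test:

```shell
# Scratch directory with files of known sizes (hypothetical names).
demo=$(mktemp -d)
head -c 500  /dev/zero > "$demo/small.dat"
head -c 3000 /dev/zero > "$demo/medium.dat"
head -c 9000 /dev/zero > "$demo/large.dat"

# Two -size tests are ANDed together:
# larger than 1000 bytes AND smaller than 5000 bytes.
in_range=$(find "$demo" -type f -size +1000c -size -5000c -exec basename {} \;)
echo "$in_range"
```

Only medium.dat falls inside the range, so it should be the only name printed.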


There's nothing quite like the thrill of exploring, discovering new 
people, places, and things. The territory might change, but a few
principles remain the same. One of those principles is to keep a written
record of your journey; another is to know and use your tools.

The UNIX® operating system is much like a vast, uncharted wilderness.
As you travel the terrain, you can pick up tools that assist you later.
The find command is such a tool. The find command is capable of much
more than simply locating files; it can automatically execute sequences
of other UNIX commands, using the filenames found for input, as this
article explains.

Find with few limits

All operating systems worth their salt have a tool to assist you in
finding things. Unlike most of these tools, the UNIX find
command can automatically perform many operations for you on the files
it finds.

Standard find tools found in graphical user interfaces
(GUIs) allow you to do a
few common tasks with the files you find: You can mark them for cutting,
copying, and pasting; you can move them to a new location; and you can
open them with the program used to create them. These operations involve
two or more steps and aren't automatic -- you find the files first, and
then you use the GUI to mark them for the next operation. This approach
is fine for many users, but the explorer wants more.

The UNIX find command can delete, copy, move, and
execute files that it finds. In addition, with the
-exec parameter, it can automatically run files through any
sequence of UNIX commands you need. It can
even ask you before it performs such operations on any file.

Simplify management of your file system

The UNIX find command, like most UNIX commands, has an
intimidating array of options and switches that can discourage people
from learning its depth -- but true explorers aren't intimidated just
because the territory is vast. A good general principle goes a long way
toward simplifying a complex topic. Start up an xterm, and try the
following command:

$ find . -name '*.gif' -exec ls {} \;

The -exec parameter holds the real power. When a file is found that
matches the search criteria, the -exec parameter defines what to do
with the file. This example tells the computer to:

  1. Search from the current directory on down, using the dot (.)
    just after find.
  2. Locate all files that have a name ending in .gif (graphic files).
  3. List all found files, using the ls command.

The -exec parameter requires further scrutiny. When a filename is
found that matches the search criteria, the find command executes the
ls {} string, substituting the filename and path for the {} text. The
escaped semicolon (\;) marks the end of the command that -exec runs.
If saturn.gif was found in the search, find would execute this command:

$ ls ./gif_files/space/solar_system/saturn.gif
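
One way to see the substitution for yourself is to put echo in front of the command, so that find prints each command line instead of running it. This is only a sketch: the directory layout below is invented for the demonstration, and the pattern is quoted so that the shell passes it to find unexpanded.

```shell
# Scratch tree with two .gif files and one unrelated file (hypothetical names).
demo=$(mktemp -d)
mkdir -p "$demo/gif_files/space"
touch "$demo/logo.gif" "$demo/gif_files/space/saturn.gif" "$demo/notes.txt"

# echo shows the command line that -exec builds for each match.
found=$(cd "$demo" && find . -name '*.gif' -exec echo ls {} \; | LC_ALL=C sort)
echo "$found"
```

Each output line is the ls command that find would have run, with {} replaced by the path of one matching file.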

The rest of the article builds on this general principle: Thoughtful
use of the find command can make the management of UNIX
file systems a much easier task. For example, the find
command can process commands based on the type of file system where the
file is found, if you use the -fstype parameter. And it's
often useful to have the find command prompt you before it
executes commands on a found file; you can tell it to do so by using the
-ok parameter, as you'll see next.

Optional execution

An important alternative to the -exec parameter is -ok; it behaves the
same as -exec, but it prompts you to see if you want to run the
command on that file.

Suppose you want to remove most of the .txt files in your home
directory, but you wish to do it on a file-by-file basis. Delete
operations like the UNIX rm command are dangerous, because it's
possible to inadvertently delete files that are important when they're
found by an automated process like find; you might want to scrutinize
all the files the system finds before removing them.

The following command lists all the .txt files in your home
directory. To delete the files, you must enter Y or y when
the find command prompts you for action by
listing the filename:

$ find $HOME/. -name '*.txt' -ok rm {} \;

Each file found is listed, and the system pauses for you to enter
Y or y. If you press the Enter key, the system won't delete the file.
Listing 1 shows some sample results:


Listing 1. Sample results
< rm ... /home/bill/./.kde/share/apps/karm/karmdata.txt > ?
< rm ... /home/bill/./archives/LDDS.txt > ?
< rm ... /home/bill/./www/txt/textfile1.txt > ?
< rm ... /home/bill/./www/txt/faq.txt > ?
< rm ... /home/bill/./www/programs/MIKE.txt > ?
< rm ... /home/bill/./www/programs/EESTRING.txt > ?
.
.
.
After each question mark, the system paused; in this case, the Enter
key was pressed to continue to the next file. (No files were removed.)
The -ok parameter lets you control the automatic processing
of each found file,
adding a measure of safety to the danger of automatic file removal.

If too many files are involved for you to spend time with the -ok
parameter, a good rule of thumb is to run the find command
with -exec to list the files that would be deleted; then,
after examining the list to be sure no important files will be deleted,
run the command again, replacing ls with rm.
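
The preview-then-delete rule of thumb might look like the following sketch. It runs in a throwaway directory with made-up file names, and it adds -type f (an assumption, not part of the original commands) so that only regular files are touched.

```shell
# Scratch directory with two files to delete and one to keep.
demo=$(mktemp -d)
touch "$demo/keep.cfg" "$demo/old1.txt" "$demo/old2.txt"

# Step 1: preview. Nothing is removed; the matches are only listed.
preview=$(find "$demo" -type f -name '*.txt' -exec ls {} \; | LC_ALL=C sort)
echo "$preview"

# Step 2: after reviewing the list, rerun the identical search with rm.
find "$demo" -type f -name '*.txt' -exec rm {} \;

# What is left afterwards.
remaining=$(ls "$demo")
echo "$remaining"
```

Keeping the search expression identical between the two runs is the point: only the command after -exec changes.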

Both -exec and -ok are useful, and you must
decide which
works best for you in your current situation. Remember, safety first!

Use find creatively

You can perform myriad tasks with the find command. This
section provides some examples of ways you can put find to
work as you manage your file system.

To keep things simple, these examples avoid -exec
commands that involve the piping of output from one command to another.
However, you're free to use commands like these in a find's -exec
clause.

Clean out temporary files

You can use find to clean directories and subdirectories
of the temporary files generated during normal use, thereby saving disk
space. To do so, use the following command:

$ find . \( -name a.out -o -name '*.o' -o -name 'core' \) -exec rm {} \;

File masks identifying the file types to be removed are
located between the parentheses; each file mask is preceded by -name.
This list can be extended to include any temporary file types you can
come up with that need to be cleaned off the system. In the course of
compiling and linking code, programmers and their tools generate file
types like those shown in the example: a.out, *.o,
and core. Other users have similar commonly generated
temporary files and can edit the command accordingly, using file masks
like *.tmp,
*.junk, and so on. You might also find it useful to put the
command into a script called clean, which you can execute
whenever you need to clean a directory.
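
As a rough sketch of such a clean script, here it is written as a shell function. The function body, the default of the current directory, and the added -type f test are assumptions made for illustration, not part of the original command.

```shell
# clean: hypothetical helper that removes common build leftovers
# under the given directory (default: the current directory).
clean() {
    find "${1:-.}" -type f \( -name a.out -o -name '*.o' -o -name core \) -exec rm -f {} \;
}

# Try it on a scratch tree.
demo=$(mktemp -d)
mkdir "$demo/src"
touch "$demo/a.out" "$demo/core" "$demo/src/main.o" "$demo/src/main.c"

clean "$demo"

# Only the source file should survive.
left=$(cd "$demo" && find . -type f | LC_ALL=C sort)
echo "$left"
```

Restricting the match to regular files keeps a directory that happens to be named core from being handed to rm.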

Copy a directory's contents

The find command lets you copy the entire contents of a
directory
while preserving the permissions, times, and ownership of every file and
subdirectory. To do so,
combine find and the cpio command, like this:

Listing 2. Combining the find and cpio command

$ cd /path/to/source/dir
$ find . | cpio -pdumv /path/to/destination/dir

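The cpio command is a copy command designed to copy files into and
out of a cpio or tar archive, automatically preserving permissions,
times, and ownership of files and subdirectories. In the command
above, -p selects pass-through (directory-to-directory) mode, -d
creates directories as needed, -u overwrites existing files
unconditionally, -m preserves modification times, and -v lists each
file as it is copied.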

List the first lines of text files

Some people use the first line of every text file as a heading or
description of the file's contents. A report that lists the filenames
and first line of each text file can make sifting through several
hundred text files a lot easier. The following command lists the first
line in every text file in your home directory in a report, ready to be
examined at your leisure with the less command:


Listing 3. The less command

$ find $HOME/. -name '*.txt' -exec head -n 1 -v {} \; > report.txt

$ less < report.txt
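
The -v switch of head is not available in every implementation. On systems where it is missing, a similar report can be sketched by letting -exec run a small sh script that prints the header itself. The sample files below are invented, and the header format simply imitates what head -v prints.

```shell
# Two sample text files whose first line acts as a title.
demo=$(mktemp -d)
printf 'Shopping list\nmilk\n' > "$demo/a.txt"
printf 'Meeting notes\nitem\n' > "$demo/b.txt"

# For each match, print a header with the file name, then its first line.
# The trailing "sh {}" passes the found path to the script as $1.
report=$(find "$demo" -type f -name '*.txt' -exec sh -c '
    printf "==> %s <==\n" "$(basename "$1")"
    head -n 1 "$1"
' sh {} \;)
echo "$report"
```

The order of the entries follows the order in which find visits the files, which is not guaranteed to be alphabetical.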

Maintain LOG and TMP file storage spaces

To maintain LOG and TMP file storage space for applications that
generate a lot of these files, you can put the following commands into
a cron job that runs daily:

Listing 4. Maintaining LOG and TMP file storage spaces


$ find $LOGDIR -type d -mtime +0 -exec compress -r {} \;

$ find $LOGDIR -type d -mtime +5 -exec rm -f {} \;

The first command finds every directory (-type d) under $LOGDIR whose
modification time is at least 24 hours old (-mtime +0) and compresses
the files in it (compress -r {}) to save disk space. The second
command is meant to delete the directories once they are more than
five days old (-mtime +5), roughly a work-week, to increase the free
space on the disk. Note that rm -f on its own refuses to remove a
directory, so in practice the second command needs rm -rf {} (or
-type f, to remove individual files instead). In this way, the cron
job automatically keeps the directories for a window of time that you
specify.
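
Because the sign in front of the -mtime number is easy to get backwards, here is a small sketch that checks it on two sample files. The file names are hypothetical, and touch -t is used to give one of them an old timestamp.

```shell
# One file modified just now, one with a modification time far in the past.
demo=$(mktemp -d)
touch "$demo/fresh.log"
touch -t 200001010000 "$demo/stale.log"   # mtime set to 1 January 2000

# +0: last modified at least one full day ago.
old=$(find "$demo" -type f -mtime +0 -exec basename {} \;)

# -1: last modified less than one day ago.
recent=$(find "$demo" -type f -mtime -1 -exec basename {} \;)

echo "older than a day: $old"
echo "newer than a day: $recent"
```

A plus sign selects files older than the given number of days, and a minus sign selects files newer than it.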

Copy complex directory trees

If you want to copy complex directory trees from one machine to
another while preserving copy permissions and the User ID and Group ID
(UID and GID -- numbers used by the operating system to mark files for
ownership purposes), and leaving user files alone, find and
cpio once again come to the rescue:

Listing 5. Copying complex directory trees


$ cd /source/directory

$ find . -depth -print | cpio -o -O /target/directory

Find links that point to nothing

To find links that point to nothing, use the perl interpreter with
find, like this:

$ find / -type l -print | perl -nle '-e || print';

This command starts at the topmost directory (/) and lists all links
(-type l -print) that the perl interpreter determines point to nothing
(-nle '-e || print'). Here -n loops over each input line, -l strips
the trailing newline from each line and restores it on output, and the
script prints any name for which the -e (exists) file test fails. See
the Resources section for more information regarding this tip from the
Unix Guru Universe site. If you want to delete the broken links, you
can pass the resulting filenames on to rm -f. Perl is, of course, one
of the many powerful interpreted-language tools also found in most
UNIX toolkits.
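
If perl is not at hand, a similar check can be sketched with find alone by running the test command on each link. This is an alternative to the perl tip, not the method from the original source, and the link names below are made up.

```shell
# One link with a real target and one that points to nothing.
demo=$(mktemp -d)
touch "$demo/target.txt"
ln -s "$demo/target.txt"  "$demo/good_link"
ln -s "$demo/missing.txt" "$demo/dangling_link"

# test -e follows the link, so it fails when the target does not exist;
# the leading ! keeps only the links for which that test fails.
broken=$(find "$demo" -type l ! -exec test -e {} \; -exec basename {} \;)
echo "$broken"
```

This starts one test process per link, so on a very large tree it is slower than the perl pipeline.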

Locate and rename unprintable directories

It's possible in UNIX for an errant or malicious program to create a
directory with unprintable characters. Locating and renaming these
directories makes it easier to examine and remove them. To do so, you
first include the -i switch of ls to get the directory's inode number.
Then, use find to turn the inode number into a filename that can be
renamed with the mv command:

Listing 6. Locating and renaming unprintable directories


$ ls -ail
$ find . -inum 211028 -exec mv {} newname.dir \;
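
The inode number 211028 above is just the value from one particular system. The following sketch walks through the same steps end to end: it creates a directory whose name contains a control character, reads the inode number with ls -id, and renames the directory by inode. The names are hypothetical, and -depth is an addition here so that find finishes with the directory before it is moved.

```shell
# Create a directory whose name contains an unprintable character (Ctrl-A).
demo=$(mktemp -d)
bad_name=$(printf 'bad\001dir')
mkdir "$demo/$bad_name"

# ls -id prints the inode number followed by the name; keep the number.
inode=$(ls -id "$demo/$bad_name" | awk '{print $1}')

# Rename by inode number, so the awkward name never has to be typed.
find "$demo" -depth -inum "$inode" -exec mv {} "$demo/newname.dir" \;

ls "$demo"
```

In a real cleanup you would read the inode number off the ls -ail listing rather than from a variable that already holds the name.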

List zero-length files

To list all zero-length files, use this command:

$ find . -empty -exec ls {} \;

After finding empty files, you might choose to delete them by
replacing the ls command
with the rm command.
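
One caveat before switching to rm: the -empty test also matches empty directories. A minimal sketch of a safer variant, assuming you only want regular files, adds -type f. The sample names are invented.

```shell
# An empty directory, an empty file, and a non-empty file.
demo=$(mktemp -d)
mkdir "$demo/empty_dir"
: > "$demo/zero.dat"
echo data > "$demo/full.dat"

# -type f restricts the match to regular files of length zero.
empties=$(find "$demo" -type f -empty -exec basename {} \;)
echo "$empties"
```

Without -type f, empty_dir would be reported as well, and a plain rm would then fail on it.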

Clearly, your use of the UNIX find command is limited
only by your knowledge and creativity.
Resources

Learn
