find text in files

Status
Not open for further replies.

Guest_imported

Does anyone know how to do a "text search" across all Unix files using "find"?
 
From :

How can I recursively grep through sub-directories?

The problem with all the responses that invariably pop up for this type of question is that none of them are ever truly fast and most of them aren't truly robust.

Typically, the answer is to use find, xargs, and grep. That's horribly slow for a full filesystem search, and it's painfully difficult to properly construct a pipeline that will avoid searching binaries if you don't want to, won't get stuck on named pipes, and won't blow up on funky filenames (beginning with -, or sometimes containing spaces, punctuation, etc.). There are ways around all these things, but they are all ugly.

BTW, something that almost never gets mentioned, but that I frequently use when conditions are appropriate, is a simple

grep pattern * */* */*/* 2>/dev/null

It's not much use beyond that depth, and may not even be good at that except for certain starting points, but it's faster than any find/xargs pipeline can ever be if the set is small enough.
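As a rough sketch of that glob approach, here it is wrapped in a function. The name `shallow_grep` is made up for illustration, and a POSIX-style shell (sh, bash) is assumed, where an unmatched glob is passed through literally and grep's complaint about it is thrown away with the other errors.

```shell
# shallow_grep PATTERN
# Search files up to three levels below the current directory using
# shell globs only (no find, no xargs). Hypothetical helper name.
shallow_grep() {
    # "--" ends option parsing, so neither the pattern nor a file name
    # starting with "-" is mistaken for an option.
    # /dev/null as an extra operand makes grep always print file names.
    # 2>/dev/null hides complaints about directories and unmatched globs.
    grep -- "$1" * */* */*/* /dev/null 2>/dev/null
}
```

Something like `cd /etc && shallow_grep nameserver` would be a typical use. Because errors are discarded, grep's exit status is not a reliable signal here; look at the output instead.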

The simplistic approach using find is

find /whereveryouwantostart -exec grep whatever {} /dev/null \;


That's not necessarily very efficient. Using xargs can help

find . | xargs grep whatever


But it also has bugs if the filenames could have "-" at their beginning. Fixing that can be a little nasty.
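For what it's worth, here is a sketch of two ways to sidestep the filename problems. The function names `rgrep_exec` and `rgrep_xargs` are hypothetical. The first relies on the `-exec ... {} +` form of find, which POSIX specifies; the second assumes your find and xargs support `-print0` and `-0`, as the GNU and BSD versions do. Both restrict the search to regular files with `-type f`, so named pipes are never opened.

```shell
# rgrep_exec PATTERN [DIR]
# find batches the file names itself and passes them as arguments, so
# spaces or newlines in names are never re-parsed as text.
rgrep_exec() {
    # "--" protects a pattern that starts with "-";
    # /dev/null forces grep to print the file name even for a single file.
    find "${2:-.}" -type f -exec grep -- "$1" /dev/null {} +
}

# rgrep_xargs PATTERN [DIR]
# Same idea with xargs, using NUL-separated names.
# Assumption: find -print0 and xargs -0 are available.
rgrep_xargs() {
    find "${2:-.}" -type f -print0 | xargs -0 grep -- "$1" /dev/null
}
```

Usage would be along the lines of `rgrep_exec whatever /usr/local`. Neither function protects a starting directory whose own name begins with "-"; writing it as `./-dir` avoids that.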

You may not want to grep binary files:

find . -type f -print | xargs file | grep -i text | cut -f1 -d: | xargs grep whatever
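The `file` pipeline above also trips over file names that contain a colon or whitespace. A different technique, not the one from the article, is to let grep decide what counts as binary. This sketch assumes your grep supports `-I` (GNU and BSD grep do; it is not in POSIX), and the function name `rgrep_text` is invented for the example.

```shell
# rgrep_text PATTERN [DIR]
# Recursive search that skips files grep considers binary.
# Assumption: grep supports -I (treat binary files as non-matching).
rgrep_text() {
    find "${2:-.}" -type f -exec grep -I -- "$1" /dev/null {} +
}
```

Note that grep's idea of "binary" (roughly, the file contains NUL bytes) is not identical to what `file` reports, so the two approaches can disagree on borderline files.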


That's pretty awful, but it's what you have to get into if you have special cases. Special cases are what make this question more difficult. If you have a small number of files and subdirs to search, the simple approach may work fine for you. If not, you have to get more creative.

Bill Campbell offers this Perl script:

I have a perlscript I call ``textfiles'' that I use for many
things like this:
textfiles dirname [dirname... ] | xargs ...

Essentially it runs ``gfind @ARGV -type f'', then uses perl's -T
option on each file to determine whether it's a text file.

My textfiles script also has options to add options to the gnu
find command like -xdev, -mindepth, and -maxdepth.

Hell, it's short so I'm attaching it for anybody who wants to use
it. It does assume that the gnu version of find is in your PATH
named gfind (I make a symlink to /usr/bin/find on Linux systems
so that it works there as well).


#!/usr/local/bin/perl
eval ' exec /usr/local/bin/perl -S $0 "$@" '
if $running_under_some_shell;

# $Header: /u/usr/cvs/lbin/textfiles,v 1.7 2000/06/22 18:29:08 bill Exp $
# $Date: 2000/06/22 18:29:08 $
# @(#) $Id: textfiles,v 1.7 2000/06/22 18:29:08 bill Exp $
#
# find text files

( $progname = $0 ) =~ s!.*/!!; # save this very early

$USAGE = "
# Find text files
#
# Usage: $progname [-v] [file [file...]]
#
# Options   Argument    Description
#   -f                  Follow symlinks
#   -M      maxdepth    maxdepth argument to gfind
#   -m      mindepth    mindepth argument to gfind
#   -x                  Don't cross device boundaries
#   -v                  Verbose
#
";

sub usage {
die join("\n",@_) .
"\n$USAGE\n";
}

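# getopts.pl is not shipped with current Perl releases; Getopt::Std's
# getopts() sets the same $opt_* variables.
use Getopt::Std;

&usage("Invalid Option") unless getopts("fM:m:xvV");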

$verbose = '-v' if $opt_v;
$suffix = $$ unless $opt_v;

$\ = "\n"; # use newlines as separators.

# use current directory if there aren't any arguments
push(@ARGV, '.') unless defined($ARGV[0]);

$args = join(" ", @ARGV);
$xdev = '-xdev' if $opt_x;
$opt_f = '-follow' if $opt_f;
$opt_m = "-mindepth $opt_m" if $opt_m;
$opt_M = "-maxdepth $opt_M" if $opt_M;
$cmd = "gfind @ARGV -type f $xdev $opt_f $opt_m $opt_M |";
print STDERR "cmd = >$cmd<" if $verbose;

open(INPUT, $cmd);
while(<INPUT>) {
    chop($name = $_);
    print STDERR "testing $name..." if $verbose;
    print $name if -T $name;
}
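If Perl or gfind isn't handy, a rough shell stand-in for the same idea (print the names of text files, one per line, for feeding to something else) might look like the sketch below. This is not Bill Campbell's script: the name `list_textfiles` is hypothetical, it assumes a grep with `-I`, and grep's binary check is a different heuristic from Perl's `-T` test. In particular, empty files are not listed here.

```shell
# list_textfiles [DIR...]
# Print the names of regular files that grep does not consider binary.
# Assumption: grep supports -I. The empty pattern matches any line, and
# -l prints each matching file name once.
list_textfiles() {
    # Default to the current directory, as the Perl script does.
    [ "$#" -gt 0 ] || set -- .
    find "$@" -type f -exec grep -Il -- '' {} +
}
```

The output is line-oriented, like the original, so file names containing newlines will not survive a later `xargs` step.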



John Dubois also comments on Glimpse:

Glimpse indexes files by the words contained in the file. Then when you want to search all of the files, it only runs its equivalent of grep (agrep) on the files that contain the words you're looking for. You can search for partial words too, though it takes longer. I have the man pages, include files, rfcs, source trees, my home directory, web pages, etc. all separately glimpse-indexed.

Binaries & man pages for OpenServer are at ftp://deepthought.armory.com/pub/scobins/glimpse.tar.Z

A front end that allows you to easily search any of multiple glimpse databases is at: ftp://ftp.armory.com/pub/scripts/search

Tony Lawrence
SCO Unix/Linux Resources tony@pcunix.com
 