Glindra
Command Line File Handling and ASCII Tools

se - Search


Searches for strings or regular expressions in one or more files.

The output from the command can be either a listing of all the records that matched in each file, a list of totals showing the number of matches per file, or just a grand total.

When searching for regular expressions, the command accepts Perl-flavored regular expressions. The character classes are defined to reflect the full 8-bit Latin 1 character set, rather than just the lower 7 bits.

Example (Windows)
> se songs.txt \b(\w+)((?:\s)+)\1\b -regex -nonumber
Da doo ron ron
It's the posh posh traveling life
Up up and away

  3 records found
Example (Linux)
> se '*.html' abc -totals
       2  /home/glindra/doc/file_selection.html
       7  /home/glindra/doc/filename.html
       9  /home/glindra/doc/regex.html
       2  /home/glindra/doc/search.html
 -------
      20  records found in 4 files
Example (Windows)
> se \***\glindra\***\*.?pp regular
########################################################################

########  e:\Glindra\src\glindra\lib1\cli.cpp
 914:  //  The main difference between this routine and the regular 'full_parse cli_table'

########  e:\Glindra\src\glindra\lib2\on_disk.cpp
 124:    // Copy regular file.

########  e:\Glindra\src\glindra\lib2\regex.cpp
  70:    ask(clis, oval.regex_flag(), oval.regex_flag(), "-regexp", "Use regular expressions");
  78:      // Regular expressions

########  e:\Glindra\src\glindra\lib2\regex.hpp
 185:    if (slist.size() == 0) clis.complain("No regular expression found");

  5 records found in 4 files in 2 directories


Parameters

Input File Specification

Example (Windows)
> se ***\*.[ch]pp -exclude=(***\detail\) -before=today xyz
Example (Linux)
> se '***/*.[ch]pp' -exclude=('***/detail/') -before=today xyz


se accepts the same wildcards and file selection options as the d directory command.

Multiple Input Files Within Parentheses

If you want to give a list of input files, they must be enclosed in parentheses, to separate them from the search strings.

Example (Windows)
> se (*.html .txt) xyz
Example (Linux)
> se \('*.html' .txt\) xyz
Text Files With Windows Or Linux Line Endings
Text files on Windows are normally encoded so that the lines of text are separated by a carriage-return/newline character pair ('\r\n'). Under Linux, lines are normally separated by a newline character only ('\n').

se will read either kind of text file under both Windows and Linux, and convert it to the native encoding. The output file will always be in the native format for the platform.
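
The conversion described above can be pictured with a short Python sketch. This only illustrates the idea and is not how se is implemented; the function name to_native is hypothetical.

```python
import os

def to_native(text):
    """Convert text with Windows or Linux line endings to the native format.

    Hypothetical helper for illustration only, not part of Glindra.
    """
    # First unify both conventions to a bare newline ...
    unified = text.replace("\r\n", "\n")
    # ... then expand to whatever the current platform uses.
    return unified.replace("\n", os.linesep)

mixed = "first line\r\nsecond line\nthird line\r\n"
print(repr(to_native(mixed)))
```

On Linux every line in the result ends with '\n', and on Windows with '\r\n', regardless of which convention the input used.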

con: Is Standard Input

Specify the filename con: (including the trailing colon) to read from standard input. This makes it possible to pipe the result from some other program directly to se (just like with grep under Linux).

Note that this special meaning for con: works under Linux as well as Windows.

Example (Windows)
To find all filenames that contain a double letter:
> d *.txt -files | se con: -regex '"([a-z])\1"' -nohead -nonumber -nogrand
Example (Linux)
> d '*.txt' -files | se con: -regex '"([a-z])\1"' -nohead -nonumber -nogrand
(See below under Search Strings for an explanation why the regular expression is within both single and double quotes.)
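
As a rough analogy, the same kind of filter can be sketched in Python. The function name is hypothetical, io.StringIO stands in for the pipe, and the case-insensitive flag mirrors the default behaviour of se; in real use you would pass sys.stdin instead.

```python
import io
import re

# Same expression as in the example above: any letter followed by itself.
DOUBLE_LETTER = re.compile(r"([a-z])\1", re.IGNORECASE)

def lines_with_double_letter(stream):
    """Return the lines from a text stream that contain a double letter.

    Hypothetical helper for illustration; 'stream' can be sys.stdin,
    an open file, or any iterable of lines.
    """
    return [line.rstrip("\r\n") for line in stream if DOUBLE_LETTER.search(line)]

# Simulated output from a directory listing piped into the filter.
piped = io.StringIO("letter.txt\ndata.txt\nbook.txt\n")
for name in lines_with_double_letter(piped):
    print(name)
```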

Search Strings

All additional parameters after the input file specification are search strings. If they contain special characters that would cause confusion when the command line is being parsed, they must be enclosed in quotes.

Example (Windows)
> se *.txt alice bob 'mary ann' '-anne'


By default, the search strings are interpreted as literal strings. If you specify the -regex option, they will instead be interpreted as Perl flavored regular expressions.

Example (Linux)
> se '*.txt' '"\b(\w+)((?:\s)+)\1\b"' -regex
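
To see what this expression matches, here is a small Python sketch. It assumes that Python's re module handles this particular pattern the same way as the Perl flavor used by se, and it uses a case-insensitive match to mirror the se default. The sample lines are taken from the first example on this page, plus one non-matching line added for contrast.

```python
import re

# The expression from the example: a word, whitespace, and the same word again.
DOUBLED_WORD = re.compile(r"\b(\w+)((?:\s)+)\1\b", re.IGNORECASE)

def doubled_word_lines(lines):
    """Return the lines that contain the same word twice in a row."""
    return [line for line in lines if DOUBLED_WORD.search(line)]

songs = [
    "Da doo ron ron",
    "It's the posh posh traveling life",
    "Up up and away",
    "Nothing repeated here",   # added for contrast, not in the original file
]
for line in doubled_word_lines(songs):
    print(line)
```

The first three lines are printed; the last one is skipped because no word is immediately repeated.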
Quoted Strings

Complex regular expressions typically contain characters that have special meanings to either the shell (bash or Windows), or the Glindra command line parser. Examples of such characters are ()=\/<|> and others. Blanks are also special characters in this context.

To make sure that a string that contains such special characters is interpreted properly by both the shell and the Glindra command line parser, it may be necessary to enclose the string in both single and double quotes. This method always works under both Windows and Linux, so when in doubt, you can always use it just to be on the safe side.

Example (Windows and Linux)
Suppose we want to find all lines in a file that contain a dot, two blanks, and a lower case letter. The regular expression for this is \.  [a-z] (with two space characters in the middle).

To find all such strings in a file, we can use the following command under both Windows and Linux:

> se -regex -exact a.txt '"\.  [a-z]"'
Under Windows, the shell will interpret the " quote characters and make sure the two adjacent blanks in the string are preserved, and the Glindra command line parser will use the ' quotes to determine where the string begins and ends.

Under Linux, the bash shell will interpret the outer pair of quotes and make sure the blanks are preserved, and the Glindra command line parser will use the inner pair of quotes.
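
The two layers of quoting under Linux can be sketched in Python. shlex.split follows POSIX shell quoting rules, so it stands in for bash here. The second step, strip_inner_quotes, is a hypothetical stand-in for what the Glindra command line parser does with the inner pair of quotes; its real behaviour may differ in detail.

```python
import re
import shlex

command = r"""se -regex -exact a.txt '"\.  [a-z]"'"""

# Layer 1: the shell removes the outer single quotes and keeps
# everything inside them, including the two blanks, untouched.
argv = shlex.split(command)
shell_argument = argv[-1]      # still wrapped in double quotes

# Layer 2 (assumption): the application strips the remaining quote pair.
def strip_inner_quotes(arg):
    if len(arg) >= 2 and arg[0] == arg[-1] and arg[0] in "'\"":
        return arg[1:-1]
    return arg

pattern = strip_inner_quotes(shell_argument)
print(repr(shell_argument))
print(repr(pattern))

# The resulting pattern requires exactly a dot, two blanks, and a lower case letter.
print(bool(re.search(pattern, "end.  next")))
print(bool(re.search(pattern, "end. next")))
```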


Search Options

These options control how the search is conducted, and what output should be produced.

Example
> se '***' 'Colou?r' -regex -case_sensitive -totals
Options
-regex Interpret the search string(s) as regular expressions. Default is -noregex.
-case_sensitive Make the comparison case sensitive. Default is -nocase_sensitive.
-exact_case A synonym for -case_sensitive.
-number Print the line number in front of each record that is found. Specify -nonumber to suppress it.
-grand_total Print a grand total of the number of records found in all the files.
-totals Print only the number of records found for each file, instead of listing the individual occurrences.
-files Print only the full filenames for the files where at least one record was found. The output will be a list of file names that is suitable for use as input for other programs.
-header Print a header for each new file where a record is found. The default is to print such a header if more than one file is being searched.

-regex

Specifies that the search string(s) are regular expressions, rather than literal strings.
Example
> se '*' '\babc\b' -regex


The command accepts Perl flavor regular expressions.

The character classes are defined to reflect the Latin 1 (ISO 8859-1) character set. This means that letters like éñüöÉÑÜÖ belong to the alpha character class (together with a-zA-Z), and that the 8-bit punctuation characters in the upper half of the character set are correctly classified as punct.
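
The effect can be illustrated with Python, whose regular expression classes also treat these characters as letters once the bytes are decoded as Latin 1. This is only an analogy; se uses its own character class tables, and the helper name is hypothetical.

```python
import re

# Bytes E9 F1 FC F6 C9 D1 DC D6 are the letters éñüöÉÑÜÖ in Latin 1.
raw_letters = bytes([0xE9, 0xF1, 0xFC, 0xF6, 0xC9, 0xD1, 0xDC, 0xD6])
letters = raw_letters.decode("latin-1")

# Bytes A1 AB BF are punctuation marks in the upper half of Latin 1.
raw_punct = bytes([0xA1, 0xAB, 0xBF])
punct = raw_punct.decode("latin-1")

def is_all_letters(text):
    """True if every character is a letter (word character, but not digit or underscore)."""
    return re.fullmatch(r"[^\W\d_]+", text) is not None

print(letters, is_all_letters(letters))
print(punct, is_all_letters(punct))
```

A tool that only looked at the lower 7 bits would fail to classify any of these characters as letters or punctuation.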

See the page Regular Expression Syntax for a (compact) description of the exact syntax.

For a somewhat more gentle introduction to regular expressions, there is a good tutorial at http://www.regular-expressions.info/tutorial.html

-case_sensitive
-exact_case

By default, the comparison is not case sensitive. Both the letters A-Z and the national upper case letters in the Latin 1 alphabet are treated as equal to their lower case counterparts.

Specify the -case_sensitive or -exact_case option to get a case sensitive comparison, where a and A and ö and Ö are considered different.

The -case_sensitive and -exact_case options are synonyms, and have exactly the same meaning.

The option works the same for both regular expressions and normal strings.
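
A minimal Python sketch of this behaviour is shown below. The function record_matches and its parameters are hypothetical and only model the rules described in this section: literal strings are escaped before matching, and case is ignored unless case sensitivity is requested.

```python
import re

def record_matches(search_string, record, regex=False, case_sensitive=False):
    """Model of the matching rules: literal or regex, case sensitive or not.

    Hypothetical helper for illustration only.
    """
    # A literal search string must not be interpreted as a pattern.
    pattern = search_string if regex else re.escape(search_string)
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.search(pattern, record, flags) is not None

print(record_matches("öl", "ÖL och mat"))                       # default: case ignored
print(record_matches("öl", "ÖL och mat", case_sensitive=True))  # ö and Ö differ
print(record_matches("Colou?r", "color chart", regex=True))
print(record_matches("Colou?r", "color chart", regex=True, case_sensitive=True))
```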

-number

By default, the command prints the line number in front of each record that is found.

Specify -nonumber to suppress the line numbers.

-grand_total

The -grand_total option prints the total number of records found, the number of files, and in how many different directories they were found.

The option is on by default for normal listings, but not for -files listings. If you specify the -grand_total option by itself, it will just print the summary, and no file names.

The negative form of the option, -nogrand_total, suppresses the printing of the grand total for normal file listings.

-totals

The -totals option prints a listing with the number of records that matched in each file. The actual matching records are not printed.

Example
> se '***/*.?pp' virtual -totals
6 e:\glindra\src\glindra\lib1\glindra_exception.hpp
5 e:\glindra\src\glindra\lib2\file.hpp
1 e:\glindra\src\glindra\lib2\on_disk.hpp
-------
12 records found in 3 files in 2 directories


After the listing of the individual files, the grand total is printed by default. To suppress this, add the option -nogrand_total.

Example
> se '***/*.?pp' virtual -totals -nogrand
6 e:\glindra\src\glindra\lib1\glindra_exception.hpp
5 e:\glindra\src\glindra\lib2\file.hpp
1 e:\glindra\src\glindra\lib2\on_disk.hpp
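
The counting behind a -totals listing can be sketched in Python as follows. The function names are hypothetical, the files are given as an in-memory mapping instead of being read from disk, and the sketch assumes that files without any matching record are left out of the listing.

```python
def count_records(files, search_string):
    """Count matching records (lines) per file, ignoring case.

    'files' maps a filename to its text. Files without matches are
    omitted (an assumption made for this sketch).
    """
    needle = search_string.lower()
    totals = {}
    for name, text in files.items():
        hits = sum(1 for line in text.splitlines() if needle in line.lower())
        if hits:
            totals[name] = hits
    return totals

def format_totals(totals, grand_total=True):
    """Format the per-file counts, optionally followed by a grand total."""
    lines = [f"{hits:8d}  {name}" for name, hits in totals.items()]
    if grand_total:
        lines.append(" -------")
        lines.append(f"{sum(totals.values()):8d}  records found in {len(totals)} files")
    return lines

sources = {
    "a.hpp": "virtual void f();\nint x;\nVirtual base\n",
    "b.hpp": "int y;\n",
    "c.hpp": "virtual ~C();\n",
}
for line in format_totals(count_records(sources, "virtual")):
    print(line)
```

Passing grand_total=False corresponds to adding -nogrand_total: the separator and the summary line are dropped.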

-files

With the -files option, the command will list only the full filenames of those files that contained one or more records that matched. The output is suitable as input to programs that expect a list of files to be processed.

Example
> se '***/*.?pp' virtual -files
e:\glindra\src\glindra\lib1\glindra_exception.hpp
e:\glindra\src\glindra\lib2\file.hpp
e:\glindra\src\glindra\lib2\on_disk.hpp

-header

This option controls if a header should be printed out for each new input file. It can assume the values yes, no, or auto.
-header Print a header with the filename for each new file where the pattern is found.
-noheader Do not print any headers.
-header=auto Print headers if more than one input file is being searched, but do not print any header if there is only one input file. This is the default.
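
The three settings amount to a simple decision, sketched here in Python. The function name and the string values are hypothetical; "yes" stands for -header, "no" for -noheader, and "auto" for -header=auto.

```python
def should_print_header(mode, files_searched):
    """Decide whether a filename header is printed.

    mode is "yes" (-header), "no" (-noheader) or "auto" (-header=auto).
    Hypothetical helper for illustration only.
    """
    if mode == "auto":
        # Headers only help when the output mixes several files.
        return files_searched > 1
    return mode == "yes"

for mode in ("yes", "no", "auto"):
    print(mode, should_print_header(mode, 1), should_print_header(mode, 3))
```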


Standard Options

Output File

Example (Windows and Linux)
> se *.txt alice -output=mysearch.lis
Options
-output [= filename]
-o [= filename]
Specifies a single output file. If the option is not present, output will be sent to standard output.

If the -output option is given without a filename, the default filename is a.tmp in the current working directory.


The output file will always be a text file with line endings encoded in the format that is native for the platform. This means that the lines of text are separated by a carriage-return/newline character pair ('\r\n') under Windows, and by a newline character only ('\n') under Linux.


See File Version Numbers: Output Files for a description of how the version numbers work for output files.

File Selection Options

These options control which files are selected. See File Selection Options for more information.

Options
-since [= datetime] Select files that were created on or after a certain date/time.
-before [= datetime] Select files that were created before a certain date/time.
-min_size [= size] Specify a minimum file size in bytes, Mb, or some other unit.
-max_size [= size] Specify a maximum file size.
-directories [= only/also/not] List directory files (only/also/not). Default is not.
-hidden [= only/also/not] List hidden files (only/also/not). Default is not.
-nodefault Do not add any default wildcards or other filename parts to the filename.
-exclude [= filespec] Exclude files that match the file specification given as option value. If the option value is enclosed in parentheses, it can contain a full file specification, including recursive file selection options.

Help and Information Options

See Help and Information Options.

Options
-help   -h   -?
Print out a brief help text with a summary of each of the different options, and exit from the program.
-version Show the name and version number of the program, and exit. This option must be written out in full, and cannot be abbreviated.
-verbose   -v
-statistics
-noverbose
Specify the amount of informational messages.
-warning
-nowarning
-noerror
Specify the level of error reporting.