PURPOSE OPERATION COMMAND LINES PARAMETER FILE OPTIONS RELATED PROGRAMS
Author: Dan Mares, dmares @ maresware . com (you will be asked for e-mail address confirmation)
Portions Copyright © 1998-2023 by Dan Mares and Mares and Company, LLC
Phone: 678-427-3275
One liner: bsearch searches key field of a sorted file.
Sample Maresware Batches an executable with data that demonstrates various Maresware software. Download and run the appropriate _07_xx batch for bsearch demo.
This is a command line program.
MUST be run within a command window as administrator.
Bsearch will search a file using binary search routines. Binary searching of a file is extremely fast.
The file to search must be a fixed length record file. Since the input file MUST be of a fixed length record, carriage returns at the end of each record are unnecessary.
The input file MUST also be sorted on the field in which the key is located.
The program obtains from the user (on the command line) the name of an ascii text file (called a parameter file) containing the following: information about the structure of the input file; its record length; location of the key field to search; length of the key field (which must be fixed in size); and the keys to search for.
This program is mainly designed as a data file analysis tool to analyse data obtained from mainframe and Unix data base files.
When the input file is sorted on the key field, the program does a binary search for records that contain the search keys specified in the parameter file. If records are found, they are placed in the output file.
The number of keys permitted to search for is unlimited. Because the program only reads one search key at a time, there is theoretically no limit to the number of search keys the parameter file can contain. But if more than 50 are used, the program speed degrades to the point that the Search program would be more efficient.
The fact that the program does a binary search to find records with the search keys means that it is much faster than the Search program. However, the speed slows down in a linear fashion as the number of keys increases. If you have a very large sorted file, and a few search keys, this program is remarkably fast. The program can search for one key in about 2 seconds regardless of the size of the input file.
It will also allow you to search a second field of the record for a second key match. The binary search is done only on the first field. When the first field matches, it then goes through the keys in a second key file to determine if any of these keys match the one in the record.
There are two types of searches you might want to conduct. The first is "logical AND", meaning that one of the keys in the first key file must be present, AND one of the keys in the second key file must be present. The second type of search is "logical NOT", meaning that a key in the first parameter file must be present, and there are NOT any keys matching the second parameter key field(so, a key match will be found in field 1 but not in field 2).
The -k and -K options are used to designate that a second field is also to be searched. The second key parameter file is identical in structure to the first parameter key file. It must have the blocksize and record length the same as the parameter file. The displacement and length of the key field are naturally going to have to reflect the proper size and placement in the record of the second field to check. See parameter file descriptions and sample command lines in the Search program.
C:>bsearch input output paramfile -[options]
Item 1: Program name[bsearch]
Item 2: Input file
Item 3: Output file
Item 4: Parameter file name
Item 5: 0ptions -[aA1MmcikKnrR]
Samples:
C:> bsearch input.fle output.fle param.fle -aAThe parameter file is an ascii text file created with a text editor, not word processor. It has the following structure:
LINE 1: Input File Blocksize (must be A multiple of record length, MAX 32000)
LINE 2: Input File Record Length
LINE 3: Displacment (from 0) to field to search
LINE 4: Length of field to search
LINE 5-X: Keys to search for, one per line, with a blank line at end
Sample:
32000
80
21
9
123456789
abcdefghi
CAUTION NOTE:
When building the parameter file for either bsearch or search it is very important
that there is only one occurance of each search key. If by accident you duplicate a
search key, ie:
123456789
123456789
Then the output will contain multiple instances of the records which are hit upon.
And thus give eroneous output total counts. In testing, it took me two days to
figure out what was the problem. And it turned out, a simple error in forming the
parameter file.
-a (a)ppends output to existing output file
-A (A)ppends/creates accounting file (ACCT-ING) containing accounting information/statistics.
This ACCT-ING file is defaulted to the current working directory. (see next option -1 below)
-1 + accounting_filename (thats a one, not ell). This creates an accounting file in the specified path/filename provided.
-m: adds the entire parameter file to the accounting file identified in option -1 or -A
-M: adds the entire parameter file to the accounting file identified in option -1 or -A, AND all the keys in the parameter file followed the number of hits for each.
/A Turns off auto-accounting set in environment
-c (c)onverts all unprinting characters in input record to tilde, hex 7f == ~
-2 + # Converts unprinting characters to the decimal character indicated by #. (i.e. upper case A is 64)
-k + filename Filename is name of file containing keys to search for in a second field. See note below on 2nd key.
-K + # Indicates which logical search to perform on the second key field. They are:
1== Field 1 and field 2 must be matched with keys in
the respective key file. Logical (AND) (field 1 AND field 2)
2== Field one matches and none of the keys in field 2 match.
Logical (NOT) (field 1 and NOT 2)
-I (i)gnore case of search keys
-n Converting unprinting characters, leave carriage returns alone
-r or -R Adds carriage returns at end of output records. The small ‘r’ means add the carriage return that is native to that computer system, and the ‘R’ means add the carriage return that is foreign to that system. The native carriage return for IBM is a carriage return line feed pair which is 0x0d 0x0a. The native return for UNIX is single carriage return 0x0a. The foreign return for IBM is 0x0a, and for UNIX is 0x0d 0x0a.