sort240

Overview

Linux provides a useful command named sort that can be used to sort the lines in a text file. For this assignment you will write a minimal version of sort named sort240. Your program will be able to sort the lines in a text file based on the value of a particular field in each line. For example, the file below contains a list of student scores on an exam.
    Cartwright   Wendy    93
    Williamson   Mark     81
    Thompson     Mark     100
    Anderson     John     76
    Turner       Dennis   56
Each line consists of three fields: last name, first name, and exam score. The first two fields are text fields, and the third field is a numeric field. Assuming that this file is named grades.txt, the following command could be used to sort the lines in the file by last name:
    $ sort240 1 grades.txt
This command tells sort240 to sort the lines in the file named grades.txt based on the value of field number 1 in each line (i.e., the last name). sort240 always writes the sorted lines to standard output. By default, sort240 sorts the lines in ascending order. Therefore, the output from this command would be:
    Anderson     John     76
    Cartwright   Wendy    93
    Thompson     Mark     100
    Turner       Dennis   56
    Williamson   Mark     81
NOTE: This file containing student names and scores is just an example. Your sort240 program must be able to sort any text file that is formatted as rows and columns. Specifically, you may NOT assume that the file to be sorted contains student names and scores, or that it has three columns per row.

In some cases we may want to sort the lines in descending order rather than ascending order. For this purpose, sort240 supports the -r command-line option, which specifies that the lines should be sorted in descending order. For example, this command would sort the file in descending order based on first name:

    $ sort240 -r 2 grades.txt

    Cartwright   Wendy    93
    Thompson     Mark     100
    Williamson   Mark     81
    Anderson     John     76
    Turner       Dennis   56
Notice that in this case there are two students with the first name Mark. When two lines have identical values in the sort field, those lines may be placed in any order in the sorted output (i.e., it doesn't matter which one comes first).

By default, sort240 does a case-sensitive comparison of text field values. For example, MARK would come before Mark if we are sorting in ascending order based on first name. Sometimes it is desirable to do a case-insensitive sort which ignores case differences when comparing text field values. For this purpose, sort240 supports the -i command-line option, which allows the user to specify that a case-insensitive comparison should be used when comparing text field values. For example, this command would perform a case-insensitive sort of the file in ascending order based on last name:

    $ sort240 -i 1 grades.txt

    Anderson     John     76
    Cartwright   Wendy    93
    Thompson     Mark     100
    Turner       Dennis   56
    Williamson   Mark     81
Of course, in this case the -i flag makes no difference because none of the last names in the file differ only in case.
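Because the C++ string class is off limits for this assignment, a case-insensitive comparison has to work directly on C strings. The following is a minimal sketch of one way to do it, not a required design. The function name compareIgnoreCase is hypothetical; it returns a negative, zero, or positive value in the same way strcmp does.

```cpp
#include <ctype.h>

// Hypothetical helper: compares two C strings while ignoring case.
// Returns <0, 0, or >0, following the strcmp convention.
int compareIgnoreCase(const char* a, const char* b) {
    while (*a != '\0' && *b != '\0') {
        // Cast to unsigned char first: passing a negative char value
        // to tolower is undefined behavior.
        int ca = tolower((unsigned char)*a);
        int cb = tolower((unsigned char)*b);
        if (ca != cb) return ca - cb;
        ++a;
        ++b;
    }
    // At least one string has ended; the shorter one sorts first.
    return tolower((unsigned char)*a) - tolower((unsigned char)*b);
}
```

With this helper, MARK and Mark compare as equal, whereas the default case-sensitive comparison places MARK first.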

All of the examples so far have sorted the grades file based on the value of a text field (last name or first name). sort240 can also sort the lines in a file based on the value of a numeric field. In this case, the user must explicitly specify that the value of the sort field should be treated as a number rather than a text string. This is done by specifying the -n command-line flag. For example, the following command sorts the grades file in ascending order based on exam grade.

    $ sort240 -n 3 grades.txt

    Turner       Dennis   56
    Anderson     John     76
    Williamson   Mark     81
    Cartwright   Wendy    93
    Thompson     Mark     100
Notice that in this case the -n option is required to produce the correct sort order. If -n were not specified, the exam scores would be compared as text strings, character by character. Because the character 1 sorts before 5, 7, 8, and 9, the line with exam score 100 would appear first rather than last, which is not the desired result.

It is also possible to specify multiple command-line options at once. For example, this command would sort the grades file in descending order based on exam grade.

    $ sort240 -rn 3 grades.txt

    Thompson     Mark     100
    Cartwright   Wendy    93
    Williamson   Mark     81
    Anderson     John     76
    Turner       Dennis   56

The following sections provide more details on the function of sort240.

Command-Line

The executable file for your program must be named sort240 (notice the capitalization).

sort240's command-line has the following form:

    sort240 [options] field file
The options parameter is optional, and, if present, specifies one or more of the following options (in any order):
-r sort the data in descending order rather than the default ascending order
-i do a case-insensitive comparison of text field values rather than the default case-sensitive comparison
-n the sort field should be treated as a numeric value rather than a text string

The field parameter is required. It specifies the number of the field that should be used to sort the lines. If each line has N fields, then 1 <= field <= N.

The file parameter is required, and specifies the name of the file to be sorted.
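The command-line form described above could be parsed along the following lines. This is only a sketch under stated assumptions: the Options struct and the parseCommandLine function are hypothetical names, and the code accepts flags either combined (-rn) or as separate arguments (-r -n), which is one reasonable reading of "in any order". No error checking is done, since the command-line parameters may be assumed correct.

```cpp
#include <stdlib.h>
#include <string.h>

// Hypothetical container for the parsed command line (not part of the spec).
struct Options {
    bool reverse;          // -r: sort in descending order
    bool ignoreCase;       // -i: case-insensitive text comparison
    bool numeric;          // -n: treat the sort field as a number
    int field;             // 1-based number of the sort field
    const char* fileName;  // name of the file to sort
};

// Parses "sort240 [options] field file".
// Assumes the arguments are correct, so nothing is validated.
Options parseCommandLine(int argc, char* argv[]) {
    Options opts = { false, false, false, 0, 0 };
    int i = 1;
    // The last two arguments are always field and file. Every earlier
    // argument that starts with '-' is treated as a group of flags.
    for (; i < argc - 2 && argv[i][0] == '-'; ++i) {
        for (const char* p = argv[i] + 1; *p != '\0'; ++p) {
            if (*p == 'r') opts.reverse = true;
            else if (*p == 'i') opts.ignoreCase = true;
            else if (*p == 'n') opts.numeric = true;
        }
    }
    opts.field = atoi(argv[i]);
    opts.fileName = argv[i + 1];
    return opts;
}
```

A main function would typically call parseCommandLine(argc, argv) once and pass the resulting Options to the code that reads and sorts the file.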

Sorting the File

Each line in the input file should be parsed into a sequence of fields. Fields are separated by one or more whitespace characters.
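One possible way to locate a field is to walk the line, alternately skipping a run of whitespace and a run of non-whitespace characters, while counting fields. The sketch below illustrates the idea. The function name extractField and its fixed-size output buffer are assumptions made for illustration; the assignment does not prescribe them.

```cpp
#include <ctype.h>
#include <string.h>

// Hypothetical helper: copies field number `field` (1-based) of `line`
// into `out`, which has room for `outSize` bytes (outSize must be >= 1).
// Returns false if the line has fewer than `field` fields.
bool extractField(const char* line, int field, char* out, size_t outSize) {
    const char* p = line;
    for (int current = 1; ; ++current) {
        // Skip the whitespace that precedes the next field.
        while (*p != '\0' && isspace((unsigned char)*p)) ++p;
        if (*p == '\0') return false;  // ran out of fields

        // The field runs up to the next whitespace character.
        const char* start = p;
        while (*p != '\0' && !isspace((unsigned char)*p)) ++p;

        if (current == field) {
            size_t len = (size_t)(p - start);
            if (len >= outSize) len = outSize - 1;  // truncate to fit
            memcpy(out, start, len);
            out[len] = '\0';
            return true;
        }
    }
}
```

Because any run of whitespace counts as a single separator, leading indentation, tabs, and repeated spaces between columns are all handled the same way.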

Any lines in the input file that are empty (i.e., contain only whitespace characters) should be ignored.
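A check for such lines can be quite short. The following sketch uses a hypothetical function name, isBlankLine, and treats a line as empty when every character in it is whitespace (including the trailing newline, if the line was read with one).

```cpp
#include <ctype.h>

// Hypothetical helper: returns true if `line` contains nothing but
// whitespace characters (or no characters at all).
bool isBlankLine(const char* line) {
    for (const char* p = line; *p != '\0'; ++p) {
        if (!isspace((unsigned char)*p)) return false;
    }
    return true;
}
```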

Numeric fields should be converted from text form to numeric form using the standard library function atof, which C++ inherits from C and declares in <stdlib.h>. Because atof returns a double, numeric fields are stored internally as type double.
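As an illustration, a numeric comparison built on atof might look like the sketch below. The function name compareNumeric is hypothetical. It compares the two values explicitly rather than returning their difference, because converting a fractional difference such as 0.5 to int would truncate it to 0.

```cpp
#include <stdlib.h>

// Hypothetical helper: compares two field values as numbers.
// Returns -1, 0, or 1, which fits the strcmp sign convention.
int compareNumeric(const char* a, const char* b) {
    double x = atof(a);
    double y = atof(b);
    if (x < y) return -1;
    if (x > y) return 1;
    return 0;
}
```

With this comparison, 100 sorts after 56, which is the result the -n option is meant to produce.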

In order to actually sort the lines, you may write your own sorting function, or you may use the qsort function, which the C++ standard library inherits from C. It may be accessed by including the <stdlib.h> header file.
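If you choose qsort, you supply a comparison function that receives pointers to two array elements and returns a negative, zero, or positive value. The sketch below shows the general shape for an ascending, case-sensitive text sort. The Record struct and the names compareByKey and sortRecords are assumptions made for illustration, not a required design.

```cpp
#include <stdlib.h>
#include <string.h>

// Hypothetical record: the full line plus the text of its sort field.
struct Record {
    const char* line;  // complete original line, to be printed unchanged
    const char* key;   // text of the sort field taken from that line
};

// Comparison function in the form qsort expects: each argument points
// to one element of the array being sorted.
int compareByKey(const void* a, const void* b) {
    const Record* ra = (const Record*)a;
    const Record* rb = (const Record*)b;
    return strcmp(ra->key, rb->key);
}

// Sorts `count` records in ascending, case-sensitive order by key.
void sortRecords(Record* records, size_t count) {
    qsort(records, count, sizeof(Record), compareByKey);
}
```

A descending sort could reuse the same structure by negating the comparison result. Note that qsort makes no promise about the relative order of equal elements, which is acceptable here because lines with identical sort-field values may appear in any order.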

Simplifying Assumptions

You may assume that all command-line parameters are correct.

You may assume the following about the file being sorted:

Generating the Output

After lines have been sorted, they should be printed to the standard output in sorted order.

Although the sorting operation is based on only one field from each line, each line should be printed out in its entirety exactly as it appears in the input.
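One way to satisfy this requirement is to keep an untouched copy of every line as it was read, and to extract the sort field separately. The sketch below reads a line of any length into a heap-allocated buffer without using the string class. The function name readLine and the starting buffer size of 128 are assumptions made for illustration.

```cpp
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical helper: reads one line of any length from `in`, keeping
// its trailing newline if there is one. Returns a malloc'd string that
// the caller must free, or NULL at end of file or on allocation failure.
char* readLine(FILE* in) {
    size_t capacity = 128;
    size_t length = 0;
    char* buffer = (char*)malloc(capacity);
    if (buffer == NULL) return NULL;

    while (fgets(buffer + length, (int)(capacity - length), in) != NULL) {
        length += strlen(buffer + length);
        if (length > 0 && buffer[length - 1] == '\n') return buffer;

        // No newline yet: double the buffer and keep reading.
        char* bigger = (char*)realloc(buffer, capacity * 2);
        if (bigger == NULL) {
            free(buffer);
            return NULL;
        }
        buffer = bigger;
        capacity *= 2;
    }
    if (length > 0) return buffer;  // final line had no trailing newline
    free(buffer);
    return NULL;
}
```

Since the stored text includes the original newline, each line can later be written back with fputs(line, stdout). A final line that lacks a newline would need one added when printed.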

The sorted output should not include the empty lines from the original file.

Restrictions

You are not allowed to use the C++ string class or the classes and functions from the Standard Template Library (vector, list, map, set, etc.). Specifically, the following header files may not be used:


Ken Rodham