DSORT(1) Commands and Applications DSORT(1)
NAME
dsort, msort - sort text files lexicographically
SYNOPSIS
msort [ -hvV? ] [ -o outfile ] [ -n lines ] file1 [ file2 ... ]
dsort [ -hvV? ] [ -l length ] [ -n lines ] [ -o outfile ] [-t
path1[,path2[,path3[,path4]]]] infile
DESCRIPTION
This manual page documents dsort and msort version 1.0.
dsort and msort are robust text file sorting utilities. While they do
not support a lot of features, they are designed to sort large (and
small) files very quickly.
msort is an in-place memory sort. Since it uses the heapsort algorithm,
it is O(n lg n) both on average and in the worst case. Provided it has
enough memory, msort will sort files with lines of arbitrary length.
Unless overridden by the -n flag, msort will sort files of up to 1000
lines. Larger files can be sorted by raising this limit with -n,
provided there is sufficient core memory. If multiple input files are
given, the output is the concatenated result of sorting the input files
separately. Thus, the following would be equivalent:
% msort file1 file2 file3 >outfile
and
% msort file1 >file1out
% msort file2 >file2out
% msort file3 >file3out
% cat file1out file2out file3out >outfile
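The behaviour described above can be sketched in Python. This is only an illustration, not the msort source: the function names heapsort and sort_each_and_concatenate are hypothetical, and lines are compared by Python's default string ordering, which is assumed to stand in for msort's lexicographic comparison.

```python
def heapsort(lines):
    """Sort a list of strings in place using heapsort (O(n lg n) worst case)."""
    n = len(lines)

    def sift_down(root, end):
        # Push lines[root] down until the max-heap property holds below `end`.
        while True:
            child = 2 * root + 1
            if child >= end:
                return
            # Pick the larger of the two children.
            if child + 1 < end and lines[child] < lines[child + 1]:
                child += 1
            if lines[root] >= lines[child]:
                return
            lines[root], lines[child] = lines[child], lines[root]
            root = child

    # Build a max-heap, then repeatedly move the largest line to the end.
    for start in range(n // 2 - 1, -1, -1):
        sift_down(start, n)
    for end in range(n - 1, 0, -1):
        lines[0], lines[end] = lines[end], lines[0]
        sift_down(0, end)


def sort_each_and_concatenate(files):
    """Sort each input file separately and concatenate the results in order."""
    out = []
    for f in files:
        lines = f.read().splitlines()
        heapsort(lines)
        out.extend(lines)
    return out
```

Note that the combined output is not globally sorted: inputs containing "pear, apple" and "zebra, ant" come out as apple, pear, ant, zebra.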
dsort is a disk sort intended for files too large to be sorted in
memory. It uses a four-file polyphase merge algorithm. Since it is an
I/O-bound program, dsort's speed is very dependent on the speed of the
device used for temporary files. By default, dsort will sort files
with lines up to 512 characters long. Longer lines will be truncated
unless the -l flag is used to raise the limit. Also by default, 1000
lines at a time will be sorted in memory during the collection (first)
phase of the merge sort algorithm. This can be changed using the -n
flag. dsort will accept only one input file.
Both dsort and msort leave the input file(s) intact.
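The two phases of a disk sort can be sketched as follows. This is a simplified illustration and not dsort's implementation: it merges all runs at once with heapq.merge instead of performing a four-file polyphase merge, and it assumes the line length limit excludes the trailing newline. The names collect_runs and merge_runs are hypothetical.

```python
import heapq
import itertools
import tempfile


def collect_runs(infile, max_lines=1000, max_length=512, tmpdir=None):
    """Collection phase: sort chunks of max_lines lines into temp-file runs."""
    runs = []
    while True:
        # Read the next chunk, truncating each line to max_length characters
        # (assumption: the limit does not count the newline).
        chunk = [line.rstrip("\n")[:max_length]
                 for line in itertools.islice(infile, max_lines)]
        if not chunk:
            break
        chunk.sort()
        run = tempfile.TemporaryFile(mode="w+", dir=tmpdir)
        run.writelines(line + "\n" for line in chunk)
        run.seek(0)
        runs.append(run)
    return runs


def merge_runs(runs, outfile):
    """Merge phase (simplified): a single k-way merge of all sorted runs."""
    # Compare lines without their newline so ordering matches the chunk sort.
    for line in heapq.merge(*runs, key=lambda s: s.rstrip("\n")):
        outfile.write(line)
    for run in runs:
        run.close()
```

The input file is only read, never modified, which matches the statement that the input is left intact.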
OPTIONS
-h -? -- print version and usage info, then exit
-l length -- use a maximum line length of length (dsort only)
-n lines -- for dsort, sort lines lines at a time in memory; for
msort, do not try to sort files of more than lines lines
-o outfile -- send sorted output to outfile rather than to stdout
-t pathlist -- use pathlist as the locations of temp files. If any
of these are not specified, dsort will attempt to use
the directory specified by the environment variable
$(TMPDIR), then the system default temp path.
-v -- verbose operation
-V -- print version information
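The fallback order described for -t can be sketched in Python. This is an assumption-laden illustration, not dsort's code: the function name resolve_temp_paths is hypothetical, and it assumes that each of the four temporary files gets one path and that every unspecified slot receives the same fallback directory.

```python
import os
import tempfile


def resolve_temp_paths(pathlist="", environ=None, count=4):
    """Return `count` temp directories, filling unspecified slots with a fallback."""
    environ = os.environ if environ is None else environ
    # Fallback order: $(TMPDIR) first, then the system default temp path.
    fallback = environ.get("TMPDIR") or tempfile.gettempdir()
    given = [p for p in pathlist.split(",") if p]
    if len(given) > count:
        raise ValueError("at most %d temp paths may be given" % count)
    return given + [fallback] * (count - len(given))
```

For example, a path list of "/a,/b" with TMPDIR set to /ram would yield /a, /b, /ram, /ram under these assumptions.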
HINTS
If you have more than one fast drive, the speed of dsort can in general
be improved by using four different drives for the path list when using
-t. The best speed observed, however, has occurred when $(TMPDIR) or
/tmp resides on a RAM disk or ROM disk. Using floppies for temporary
files is not recommended.
RESOURCE USAGE
Both dsort and msort use 1k of stack space.
msort is an in-place sort, so in general the amount of core memory used
is the same as the size of the file to be sorted. When sorting multiple
files, msort's memory usage will match the size of the largest input
file, not the total of all files. It will use a minimum of
approximately 4k of core memory.
dsort by default uses approximately 512k of core memory. This can be
modified by changing the -l and -n parameters. Core memory usage is
approximately the product of these two parameters.
When using dsort, the amount of free space on the temporary path(s)
must be at least twice the size of the file to be sorted.
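The resource figures above can be checked with a little arithmetic. The Python sketch below is illustrative only: the function names are hypothetical, and the free-space check assumes all temporary files live in a single directory, which may not hold when several paths are given with -t.

```python
import os
import shutil


def estimated_core_bytes(length=512, lines=1000):
    """Approximate dsort core usage: the product of the -l and -n values."""
    return length * lines


def temp_space_ok(infile, tmpdir):
    """Check that tmpdir has free space of at least twice the input file size."""
    needed = 2 * os.path.getsize(infile)
    return shutil.disk_usage(tmpdir).free >= needed
```

With the defaults, estimated_core_bytes() gives 512 * 1000 = 512000 bytes, in line with the approximately 512k quoted above. Doubling both -l and -n would roughly quadruple the core memory required.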
AUTHOR
Devin Reade - glyn@cs.ualberta.ca
SEE ALSO
sort(1), uniq(1).
GNO 14 June 1994 DSORT(1)