 

(diff.info.gz) diff Performance

Info Catalog (diff.info.gz) Adjusting Output (diff.info.gz) Top (diff.info.gz) Comparing Three Files
 
 `diff' Performance Tradeoffs
 ****************************
 
    GNU `diff' runs quite efficiently; however, in some circumstances
 you can cause it to run faster or produce a more compact set of changes.
 
    One way to improve `diff' performance is to use hard or symbolic
 links to files instead of copies.  This improves performance because
 `diff' normally does not need to read two hard or symbolic links to the
 same file, since their contents must be identical.  For example,
 suppose you copy a large directory hierarchy, make a few changes to the
 copy, and then frequently run `diff -r' to compare the original to the copy.
 If the original files are read-only, you can greatly improve
 performance by creating the copy using hard or symbolic links (e.g.,
 with GNU `cp -lR' or `cp -sR').  Before editing a file in the copy for
 the first time, you should break the link and replace it with a regular
 copy.
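 
    The workflow above can be sketched as follows.  This is only an
 illustration: it assumes GNU `cp' (for the `-l' option) and a file
 system that supports hard links, and the directory and file names are
 made up for the example.

```shell
#!/bin/sh
# Sketch: compare a hard-linked copy of a tree against the original.
# Assumes GNU cp; all paths below are hypothetical.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Build a small "original" tree and make its files read-only.
mkdir -p "$work/orig/sub"
printf 'one\ntwo\n' > "$work/orig/a.txt"
printf 'alpha\nbeta\n' > "$work/orig/sub/b.txt"
chmod a-w "$work/orig/a.txt" "$work/orig/sub/b.txt"

# Create the copy as hard links instead of duplicating file contents.
cp -lR "$work/orig" "$work/copy"

# Before the first edit, break the link: make a regular copy and
# rename it over the linked name, so the original stays untouched.
cp "$work/copy/a.txt" "$work/copy/a.txt.new"
mv -f "$work/copy/a.txt.new" "$work/copy/a.txt"
chmod u+w "$work/copy/a.txt"
printf 'one\nTWO\n' > "$work/copy/a.txt"

# Compare the two trees; files that are still linked to the same
# file cannot differ.
out=$(diff -r "$work/orig" "$work/copy")
status=$?
printf '%s\n' "$out"
```

    Here only `a.txt' is reported as different; `sub/b.txt' is still a
 link to the original file.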
 
    You can also affect the performance of GNU `diff' by giving it
 options that change the way it compares files.  Performance has more
 than one dimension.  These options improve one aspect of performance at
 the cost of another, or they improve performance in some cases while
 hurting it in others.
 
    The way that GNU `diff' determines which lines have changed always
 comes up with a near-minimal set of differences.  Usually it is good
 enough for practical purposes.  If the `diff' output is large, you
 might want `diff' to use a modified algorithm that sometimes produces a
 smaller set of differences.  The `-d' or `--minimal' option does this;
 however, it can also cause `diff' to run more slowly than usual, so it
 is not the default behavior.
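 
    A minimal sketch of comparing the default output with the `-d'
 output is shown here.  The file names are hypothetical, `seq' is
 assumed to be available, and on small inputs like these the two
 outputs may well be identical.

```shell
#!/bin/sh
# Sketch: run diff with and without -d (--minimal) on the same inputs.
# File names are hypothetical; seq is assumed to be available.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

seq 1 2000 > "$work/old"
# Delete a block of lines and modify one line further down.
sed -e '100,120d' -e '1500s/$/ changed/' "$work/old" > "$work/new"

diff "$work/old" "$work/new" > "$work/default.diff"
default_status=$?
diff -d "$work/old" "$work/new" > "$work/minimal.diff"
minimal_status=$?

# Show the size of each set of differences for comparison.
wc -l "$work/default.diff" "$work/minimal.diff"
```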
 
    When the files you are comparing are large and have small groups of
 changes scattered throughout them, you can use the
 `--speed-large-files' option to make a different modification to the
 algorithm that `diff' uses.  If the input files have a constant small
 density of changes, this option speeds up the comparisons without
 changing the output.  If not, `diff' might produce a larger set of
 differences; however, the output will still be correct.
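 
    As an illustration, the following sketch builds two large files
 that differ in only two widely separated lines and compares them with
 `--speed-large-files'.  The file names and sizes are arbitrary choices
 for the example, and `seq' is assumed to be available.

```shell
#!/bin/sh
# Sketch: large inputs with a few scattered changes.
# File names and sizes are hypothetical.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

seq 1 200000 > "$work/old"
sed -e '50000s/.*/changed/' -e '150000s/.*/changed/' \
    "$work/old" > "$work/new"

diff --speed-large-files "$work/old" "$work/new" > "$work/fast.diff"
fast_status=$?
diff "$work/old" "$work/new" > "$work/default.diff"

# Report whether the option changed the output for these inputs.
if cmp -s "$work/fast.diff" "$work/default.diff"; then
    echo "same output with and without --speed-large-files"
else
    echo "outputs differ; both are still correct"
fi
```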
 
    Normally `diff' discards the prefix and suffix that are common to
 both files before it attempts to find a minimal set of differences.
 This makes `diff' run faster, but occasionally it may produce
 non-minimal output.  The `--horizon-lines=LINES' option prevents `diff'
 from discarding the last LINES lines of the prefix and the first LINES
 lines of the suffix.  This gives `diff' further opportunities to find a
 minimal output.
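 
    For example, the invocation below keeps 100 lines of common prefix
 and suffix.  The value 100 and the file names are arbitrary choices
 for this sketch; on inputs this simple the option is not expected to
 change the result.

```shell
#!/bin/sh
# Sketch: keep some of the common prefix and suffix when comparing.
# The horizon of 100 lines and the file names are hypothetical.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

seq 1 1000 > "$work/old"
# Lines 1-499 form a common prefix, lines 501-1000 a common suffix.
sed -e '500s/.*/changed/' "$work/old" > "$work/new"

diff --horizon-lines=100 "$work/old" "$work/new" > "$work/horizon.diff"
horizon_status=$?
cat "$work/horizon.diff"
```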
 
    Suppose a run of changed lines includes a sequence of lines at one
 end and there is an identical sequence of lines just outside the other
 end.  The `diff' command is free to choose which identical sequence is
 included in the hunk.  In this case, `diff' normally shifts the hunk's
 boundaries when doing so merges adjacent hunks, or shifts the hunk's
 lines towards the end of the file.  Merging hunks can make the output
 look nicer in some cases.
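 
    A small sketch of such an ambiguous hunk follows, using made-up
 file contents.  The two inserted lines can be reported either as
 `B', `X' after line 1 or as `X', `B' after line 2; both describe the
 same change, and which one is printed is up to `diff'.

```shell
#!/bin/sh
# Sketch: an insertion whose position in the output is ambiguous.
# File names and contents are hypothetical.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

printf 'A\nB\nC\n' > "$work/old"
printf 'A\nB\nX\nB\nC\n' > "$work/new"

out=$(diff "$work/old" "$work/new")
status=$?
printf '%s\n' "$out"

# Either placement of the hunk adds exactly two lines.
added=$(printf '%s\n' "$out" | grep -c '^>')
echo "added lines: $added"
```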
 