Four different readability statistics are calculated within analyze. Readability statistics combine measures such as the average number of words per sentence and the average number of letters and syllables per word to derive a formulaic estimate of the ``readability'' of the text. They do not take into account less quantifiable elements such as semantic content or grammatical correctness. Thus, there is no guarantee that a text that a readability test identifies as easy to understand actually is readable. However, in practice it has been found that real documents that the tests identify as ``easy to read'' are likely to be easier to comprehend at a structural level.
The four test formulae used in the analyze function are as follows:
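In their commonly published forms, the four scores can be written as shown below. This is a reference sketch rather than a transcription of analyze: only the Flesch constants are confirmed by the bc listing in this section, and the constants analyze uses for the other three tests may differ. The symbols c (letters per word), w (words per sentence), and s (syllables per word) are shorthand introduced here.

```latex
\begin{align*}
\text{ARI} &= 4.71\,c + 0.5\,w - 21.43 \\
\text{Kincaid} &= 0.39\,w + 11.8\,s - 15.59 \\
\text{Coleman-Liau} &= 0.0588\,(100\,c) - 0.296\,(100 / w) - 15.8 \\
\text{Flesch Reading Ease} &= 206.835 - 84.6\,s - 1.015\,w
\end{align*}
```

ARI, Kincaid, and Coleman-Liau approximate a school grade level, so higher values indicate harder text; Flesch Reading Ease runs the other way, with higher values indicating easier text.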
File rap-bat.wc contains:
243 words
95 lines
1768 characters
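Counts of this kind are what wc reports. A minimal sketch of collecting them follows; the function name basic_counts and the exact output layout are illustrative assumptions, not taken from analyze.

```shell
# basic_counts FILE
# Print the word, line, and character counts of FILE, one per line.
# (Hypothetical helper; analyze may gather these figures differently.)
basic_counts() {
    file=$1
    # Reading from standard input stops wc printing the filename;
    # tr removes the leading padding that some wc versions emit.
    words=$(wc -w < "$file" | tr -d ' ')
    lines=$(wc -l < "$file" | tr -d ' ')
    # wc -c counts bytes, which equals characters for plain ASCII text.
    chars=$(wc -c < "$file" | tr -d ' ')
    printf '%s words\n%s lines\n%s characters\n' "$words" "$lines" "$chars"
}
```

For example, basic_counts rap-bat would print three lines in the same layout as the listing above.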
Sentences are counted using a custom awk script, explained in ``Spanning multiple lines''. Then the number of letters is established (by stripping the white space from the file and counting the characters that remain), and the number of syllables is estimated using another awk script. Finally, these values are fed into four calculations that make use of bc, the SCO OpenServer binary calculator.
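The letter and syllable steps can be sketched as follows. Both function names are hypothetical, and the syllable heuristic (one syllable per run of consecutive vowels, with a minimum of one per word) is an assumption chosen for illustration; the awk script that analyze actually uses is not shown here and may work differently.

```shell
# count_letters FILE
# Delete all white space, then count what is left. As in the text, this
# counts every non-blank character, punctuation included.
count_letters() {
    tr -d '[:space:]' < "$1" | wc -c | tr -d ' '
}

# estimate_syllables FILE
# Rough estimate only: treat each run of vowels in a word as one syllable.
estimate_syllables() {
    awk '
    {
        for (i = 1; i <= NF; i++) {
            word = tolower($i)
            gsub(/[^a-z]/, "", word)          # drop digits and punctuation
            if (word == "")
                continue
            n = gsub(/[aeiouy]+/, "", word)   # gsub returns the number of vowel runs
            if (n == 0)
                n = 1                         # every word has at least one syllable
            total += n
        }
    }
    END { print total + 0 }' "$1"
}
```

A heuristic like this miscounts silent vowels (``time'' scores two) and some diphthongs, which is one reason the text describes the syllable figure as an estimate.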
bc is a simple programming language for calculations; it recognizes a syntax similar to C or awk, and can use variables and functions. It is fully described in bc(C), and is used here because, unlike the expr command, it can handle floating point arithmetic (that is, numbers with a decimal point are not truncated). Because bc reads commands from its standard input, the basic readability variables are substituted into a here-document which is fed to bc, and the output is captured in another shell variable. For example:
Flesch=`bc << %%
w = ($wordcount / $sentences)
s = ($sylcount / $wordcount)
206.835 - 84.6 * s - 1.015 * w
%%
`

analyze also prints the output from the tests, as follows:

ARI = -10.43
Kincaid= -7.01
Coleman-Liau = -17.00
Flesch Reading Ease = 184.505

Depending on the setting of $LOG (the variable that controls file logging), the output is printed to the terminal, or printed to the terminal and a logfile (the name of which is set by the variable $LOGFILE).
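One detail of the bc here-document is worth noting: bc starts with scale set to 0, so a division such as $wordcount / $sentences yields an integer unless scale is raised first. The sample value 184.505 equals 206.835 - 1.015 * 22, which would be consistent with w truncated to 22 and s truncated to 0, although that reading is an inference rather than something the text states. The sketch below sets scale explicitly and then reports the result in the manner just described. The function names, the positional arguments, and the test for LOG being ``yes'' are all assumptions made for illustration.

```shell
# flesch WORDS SENTENCES SYLLABLES
# Print the Flesch Reading Ease score, keeping four decimal places
# in the intermediate divisions.
flesch() {
    bc << EOF
scale = 4
w = $1 / $2
s = $3 / $1
206.835 - 84.6 * s - 1.015 * w
EOF
}

# report
# Copy standard input to the terminal; when $LOG is "yes" (assumed value),
# also append it to the file named by $LOGFILE.
report() {
    if [ "$LOG" = "yes" ]; then
        tee -a "$LOGFILE"
    else
        cat
    fi
}
```

With hypothetical counts of 100 words, 5 sentences, and 150 syllables, the two pieces combine as: echo "Flesch Reading Ease = `flesch 100 5 150`" | report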