(gawk.info.gz) Two-way I/O

Info Catalog (gawk.info.gz) Nondecimal Data (gawk.info.gz) Advanced Features (gawk.info.gz) TCP/IP Networking
 
 Two-Way Communications with Another Process
 ===========================================
 
      From: brennan@whidbey.com (Mike Brennan)
      Newsgroups: comp.lang.awk
      Subject: Re: Learn the SECRET to Attract Women Easily
      Date: 4 Aug 1997 17:34:46 GMT
      Message-ID: <5s53rm$eca@news.whidbey.com>
      
      On 3 Aug 1997 13:17:43 GMT, Want More Dates???
      <tracy78@kilgrona.com> wrote:
      >Learn the SECRET to Attract Women Easily
      >
      >The SCENT(tm)  Pheromone Sex Attractant For Men to Attract Women
      
      The scent of awk programmers is a lot more attractive to women than
      the scent of perl programmers.
      --
      Mike Brennan
 
    It is often useful to be able to send data to a separate program for
 processing and then read the result.  This can always be done with
 temporary files:
 
      # write the data for processing
      tempfile = ("/tmp/mydata." PROCINFO["pid"])
      while (NOT DONE WITH DATA)
          print DATA | ("subprogram > " tempfile)
      close("subprogram > " tempfile)
      
      # read the results, remove tempfile when done
      while ((getline newdata < tempfile) > 0)
          PROCESS newdata APPROPRIATELY
      close(tempfile)
      system("rm " tempfile)
 
 This works, but not elegantly.
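
    A minimal runnable sketch of the temporary-file approach above,
 assuming `tr a-z A-Z' as a stand-in for the subprogram.  The function
 name `upcase_via_tempfile' and the sample data are hypothetical:

```awk
# Hypothetical helper: upper-case words[1..n] by piping them through `tr'
# (a stand-in for "subprogram") and reading the results from a temp file.
function upcase_via_tempfile(words, n, results,    tempfile, cmd, i, line, count)
{
    tempfile = ("/tmp/mydata." PROCINFO["pid"])
    cmd = "tr a-z A-Z > " tempfile

    # write the data for processing
    for (i = 1; i <= n; i++)
        print words[i] | cmd
    close(cmd)              # waits until `tr' has finished writing tempfile

    # read the results, remove tempfile when done
    count = 0
    while ((getline line < tempfile) > 0)
        results[++count] = line
    close(tempfile)
    system("rm " tempfile)
    return count
}

BEGIN {
    n = split("alpha beta gamma", words, " ")
    m = upcase_via_tempfile(words, n, results)
    for (i = 1; i <= m; i++)
        print "got", results[i]
}
```

 The string passed to `close' must match the command string used with
 `print' exactly, which is why the sketch keeps it in one variable.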
 
    Starting with version 3.1 of `gawk', it is possible to open a
 _two-way_ pipe to another process.  The second process is termed a
 "coprocess", since it runs in parallel with `gawk'.  The two-way
 connection is created using the new `|&' operator (borrowed from the
 Korn shell, `ksh'):(1)
 
      do {
          print DATA |& "subprogram"
          "subprogram" |& getline results
      } while (DATA LEFT TO PROCESS)
      close("subprogram")
 
    The first time an I/O operation is executed using the `|&' operator,
 `gawk' creates a two-way pipeline to a child process that runs the
 other program.  Output created with `print' or `printf' is written to
 the program's standard input, and output from the program's standard
 output can be read by the `gawk' program using `getline'.  As is the
 case with processes started by `|', the subprogram can be any program,
 or pipeline of programs, that can be started by the shell.
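
    The exchange described above can be sketched as a runnable program.
 This is only an illustration: it assumes `cat -u' as a stand-in for
 the subprogram (it copies each input line straight back), and the
 function name `round_trip' is hypothetical:

```awk
# Hypothetical helper: send one record to the coprocess, read one reply.
function round_trip(command, record,    reply)
{
    print record |& command                 # to the coprocess's standard input
    if ((command |& getline reply) <= 0)    # from its standard output
        return ""
    return reply
}

BEGIN {
    command = "cat -u"      # assumption: stands in for "subprogram"
    n = split("one two three", data, " ")
    for (i = 1; i <= n; i++)
        print "sent", data[i], "received", round_trip(command, data[i])
    close(command)
}
```

 A line-at-a-time exchange like this one only works with a coprocess
 that writes each reply as soon as it has read the matching input.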
 
    There are some cautionary items to be aware of:
 
    * As the code inside `gawk' currently stands, the coprocess's
      standard error goes to the same place that the parent `gawk''s
      standard error goes. It is not possible to read the child's
      standard error separately.
 
    * I/O buffering may be a problem.  `gawk' automatically flushes all
      output down the pipe to the child process.  However, if the
      coprocess does not flush its output, `gawk' may hang when doing a
      `getline' in order to read the coprocess's results.  This could
      lead to a situation known as "deadlock", where each process is
      waiting for the other one to do something.
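
    Regarding the first caution: because the coprocess is started by
 the shell, a redirection such as `2>&1' in the command string can merge
 the child's standard error into its standard output.  The following is
 a sketch under that assumption; the child command and the function name
 `read_all' are hypothetical, and the merged lines cannot be told apart
 unless the child labels them itself:

```awk
# Hypothetical helper: collect every line the coprocess writes.
function read_all(command, lines,    n, line)
{
    n = 0
    while ((command |& getline line) > 0)
        lines[++n] = line
    close(command)
    return n
}

BEGIN {
    # Hypothetical child: writes one line to stdout and one to stderr.
    # The trailing `2>&1' is handled by the shell that starts the command.
    command = "sh -c 'echo to-stdout; echo to-stderr >&2' 2>&1"

    n = read_all(command, lines)
    for (i = 1; i <= n; i++)
        print "got", lines[i]
}
```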
 
    It is possible to close just one end of the two-way pipe to a
 coprocess, by supplying a second argument to the `close' function of
 either `"to"' or `"from"' (*note Closing Input and Output
 Redirections: Close Files And Pipes.).  These strings tell `gawk' to
 close the end of the pipe that sends data to the process or the end
 that reads from it, respectively.
 
    Closing just the write end is necessary, for example, in order to
 use the system `sort' utility as part of a coprocess; `sort' must read
 _all_ of its input data before it can produce any output.  The `sort'
 program does not receive an end-of-file indication until `gawk' closes
 the write end of the pipe.
 
    When you have finished writing data to the `sort' utility, you can
 close the `"to"' end of the pipe, and then start reading sorted data
 via `getline'.  For example:
 
      BEGIN {
          command = "LC_ALL=C sort"
          n = split("abcdefghijklmnopqrstuvwxyz", a, "")
      
          for (i = n; i > 0; i--)
              print a[i] |& command
          close(command, "to")
      
          while ((command |& getline line) > 0)
              print "got", line
          close(command)
      }
 
    This program writes the letters of the alphabet in reverse order, one
 per line, down the two-way pipe to `sort'.  It then closes the write
 end of the pipe, so that `sort' receives an end-of-file indication.
 This causes `sort' to sort the data and write the sorted data back to
 the `gawk' program.  Once all of the data has been read, `gawk'
 terminates the coprocess and exits.
 
    As a side note, the assignment `LC_ALL=C' in the `sort' command
 ensures traditional Unix (ASCII) sorting from `sort'.
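
    The same pattern can be wrapped in a function.  The sketch below is
 one possible arrangement, not part of `gawk' itself: the name
 `sort_lines' and the sample data are hypothetical.  Mixed-case input
 makes the effect of `LC_ALL=C' visible, since in ASCII order the
 uppercase letters sort before the lowercase ones:

```awk
# Hypothetical helper: sort data[1..n] through a `sort' coprocess and
# store the result in sorted[1..count].  Returns count.
function sort_lines(data, n, sorted,    command, i, line, count)
{
    command = "LC_ALL=C sort"
    for (i = 1; i <= n; i++)
        print data[i] |& command
    close(command, "to")        # `sort' now receives end-of-file

    count = 0
    while ((command |& getline line) > 0)
        sorted[++count] = line
    close(command)              # close the remaining read end
    return count
}

BEGIN {
    n = split("b A a B", data, " ")
    m = sort_lines(data, n, sorted)
    for (i = 1; i <= m; i++)
        print "got", sorted[i]  # order with LC_ALL=C: A, B, a, b
}
```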
 
    ---------- Footnotes ----------
 
    (1) This is very different from the same operator in the C shell,
 `csh'.
 