DOC HOME SITE MAP MAN PAGES GNU INFO SEARCH
 

/usr/man2/cat.3/Filter::Simple.3.Z





NAME

       Apache::Filter - Alter the output of previous handlers


SYNOPSIS

         #### In httpd.conf:
         PerlModule Apache::Filter
         # That's it - this isn't a handler.

         <Files ~ "*\.blah">
          SetHandler perl-script
          PerlSetVar Filter On
          PerlHandler Filter1 Filter2 Filter3
         </Files>

         #### In Filter1, Filter2, and Filter3:
         $r = $r->filter_register();  # Required
         my $fh = $r->filter_input(); # Optional (you might not need the input FH)
         while (<$fh>) {
           s/ something / something else /;
           print;
         }

         #### or, alternatively:
         $r = $r->filter_register();
         my ($fh, $status) = $r->filter_input(); # Get status information
         return $status unless $status == OK;
         while (<$fh>) {
           s/ something / something else /;
           print;
         }


DESCRIPTION

       In basic operation, each of the handlers Filter1, Filter2, and Filter3
       will make a call to $r->filter_input(), which will return a filehandle.
       For Filter1, the filehandle points to the requested file.  For Filter2,
       the filehandle contains whatever Filter1 wrote to STDOUT.  For Filter3,
       it contains whatever Filter2 wrote to STDOUT.  The output of Filter3
       goes directly to the browser.

       Note that the modules Filter1, Filter2, and Filter3 are listed in for-
       ward order, in contrast to the reverse-order listing of Apache::Out-
       putChain.

       When you've got this module, you can use the same handler both as a
       stand-alone handler, and as an element in a chain.  Just make sure that
       whenever you're chaining, all the handlers in the chain are "Fil-
       ter-aware," i.e. they each call $r->filter_register() exactly once,
       before they start printing to STDOUT.  There should be almost no over-
       head for doing this when there's only one element in the chain.

       Currently the following public modules are Filter-aware.  Please tell
       me of others you know about.

        Apache::Registry (using Apache::RegistryFilter, included here)
        Apache::SSI
        Apache::ASP
        HTML::Mason
        Apache::SimpleReplace
        Text::Forge


METHODS

       Apache::Filter is a subclass of Apache, so all Apache methods are
       available.

       This module doesn't create an Apache handler class of its own - rather,
       it adds some methods to the Apache:: class.  Thus, it's really a mix-in
       package that just adds functionality to the $r request object.

       * $r = $r->filter_register()
           Every Filter-aware module must call this method exactly once, so
           that Apache::filter can properly rotate its filters from previous
           handlers, and so it can know when the output should eventually go
           to the browser.

       * $r->filter_input()
           This method will give you a filehandle that contains either the
           file requested by the user ($r->filename), or the output of a pre-
           vious filter.  If called in a scalar context, that filehandle is
           all you'll get back.  If called in a list context, you'll also get
           an Apache status code (OK, NOT_FOUND, or FORBIDDEN) that tells you
           whether $r->filename was successfully found and opened.  If it was
           not, the filehandle returned will be undef.

       * $r->changed_since($time)
           Returns true or false based on whether the current input seems like
           it has changed since $time.  Currently the criteria to figure this
           out is this: if the file pointed to by "$r->finfo" hasn't changed
           since the time given, and if all previous filters in the chain are
           deterministic (see below), then we return false.  Otherwise we
           return true.

           This method is meant to be useful in implementing caching schemes.

           A caution: always call the "changed_since()" and "deterministic()"
           methods AFTER the "filter_register()" method.  This is because
           Apache::Filter uses a crude counting method to figure out which
           handler in the chain is currently executing, and calling these rou-
           tines out of order messes up the counting.

       * $r->deterministic(1|0);
           As of version 0.07, the concept of a "deterministic" filter is sup-
           ported.  A deterministic filter is one whose output is entirely
           determined by the contents of its input file (whether the $r->file-
           name file or the output of another filter), and doesn't depend at
           all on outside factors.  For example, a filter that translates all
           its output to upper-case is deterministic, but a filter that adds a
           date stamp to a page, or looks things up in a database which may
           vary over time, is not.

           Why is this a big deal?  Let's say you have the following setup:

            <Files ~ "\.boffo$">
             SetHandler perl-script
             PerlSetVar Filter On
             PerlHandler Apache::FormatNumbers Apache::DoBigCalculation
             # The above are fake modules, you get the idea
            </Files>

           Suppose the FormatNumbers module is deterministic, and the DoBig-
           Calculation module takes a long time to run.  The DoBigCalculation
           module can now cache its results, so that when an input file is
           unchanged on disk, its results will remain known when passed
           through the FormatNumbers module, and the DoBigCalculation module
           will be able to used cached results from a previous run.

           The guts of the modules would look something like this:

            sub Apache::FormatNumbers::handler {
               my $r = shift;
               $r->content_type("text/html");
               my ($fh, $status) = $r->filter_input();
               return $status unless $status == OK;
               $r->deterministic(1); # Set to true; default is false

               # ... do some formatting, print to STDOUT
               return OK;
            }

            sub Apache::DoBigCalculation::handler {
               my $r = shift;
               $r->content_type("text/html");
               my ($fh, $status) = $r->filter_input();
               return $status unless $status == OK;

               # This module implements a caching scheme by using the
               # %cache_time and %cache_content hashes.
               my $time = $cache_time{$r->filename};
               my $output;
               if ($r->changed_since($time)) {
                   # Read from <$fh>, perform a big calculation on it, and print to STDOUT
               } else {
                   print $cache_content{$r->filename};
               }

               return OK;
            }

           A caution: always call the "changed_since()" and "deterministic()"
           methods AFTER the "filter_register()" method.  This is because
           Apache::Filter uses a crude counting method to figure out which
           handler in the chain is currently executing, and calling these rou-
           tines out of order messes up the counting.


HEADERS

       In previous releases of this module, it was dangerous to call
       $r->send_http_header(), because a previous/postvious filter might also
       try to send headers, and then you'd have duplicate headers getting
       sent.  In current releases you can simply send the headers.  If the
       current filter is the last filter, the headers will be sent as usual,
       and otherwise send_http_header() is a no-op.


NOTES

       You'll notice in the SYNOPSIS that I say "PerlSetVar Filter On".  That
       information isn't actually used by this module, it's used by modules
       which are themselves filters (like Apache::SSI).  I hereby suggest that
       filtering modules use this parameter, using it as the switch to detect
       whether they should call $r->filter_register.  However, it's often not
       necessary - there is very little overhead in simply calling $r->fil-
       ter_register even when you don't need to do any filtering, and $r->fil-
       ter_input can be a handy way of opening the $r->filename file.

       VERY IMPORTANT: if one handler in a stacked handler chain uses
       "Apache::Filter", then THEY ALL MUST USE IT.  This means they all must
       call $r->filter_register exactly once.  Otherwise "Apache::Filter"
       couldn't capture the output of the handlers properly, and it wouldn't
       know when to release the output to the browser.

       The output of each filter (except the last) is accumulated in memory
       before it's passed to the next filter, so memory requirements are large
       for large pages.  Apache::OutputChain only needs to keep one item from
       print()'s argument list in memory at a time, so it doesn't have this
       problem, but there are others (each chunk is filtered independently, so
       content spanning several chunks won't be properly parsed).  In future
       versions I might find a way around this, or cache large pages to disk
       so memory requirements don't get out of hand.  We'll see whether it's a
       problem.

       A couple examples of filters are provided with this distribution in the
       t/ subdirectory: UC.pm converts all its input to upper-case, and
       Reverse.pm prints the lines of its input reversed.

       Finally, a caveat: in version 0.09 I started explicitly setting the
       Content-Length to undef.  This prevents early filters from incorrectly
       setting the content length, which will almost certainly be wrong if
       there are any filters after it.  This means that if you write any fil-
       ters which set the content length, they should do it after the $r->fil-
       ter_register call.


TO DO

       Add a buffered mode to the final output, so that we can send a proper
       Content-Length header. [gozer@hbesoftware.com (Philippe M. Chiasson)]


BUGS

       This uses some funny stuff to figure out when the currently executing
       handler is the last handler in the chain.  As a result, code that
       manipulates the handler list at runtime (using push_handlers and the
       like) might produce mayhem.  Poke around a bit in the code before you
       try anything.  Let me know if you have a better idea.

       As of 0.07, Apache::Filter will automatically return DECLINED when
       $r->filename points to a directory.  This is just because in most cases
       this is what you want to do (so that mod_dir can take care of the
       request), and because figuring out the "right" way to handle directo-
       ries seems pretty tough - the right way would allow a directory index-
       ing handler to be a filter, which isn't possible now.  Also, you can't
       properly pass control to a non-mod_perl indexer like mod_autoindex.
       Suggestions are welcome.

       I haven't considered what will happen if you use this and you haven't
       turned on PERL_STACKED_HANDLERS.  So don't do it.


AUTHOR

       Ken Williams (kwilliams@cpan.org)


COPYRIGHT

       Copyright 1998,1999,2000 Ken Williams.  All rights reserved.

       This library is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.


SEE ALSO

       perl(1).

perl v5.8.8                       2005-09-24                 Apache::Filter(3)
See also Filter::Simple(3)
See also Filter::Util::Call(3)
See also HTML::Filter(3)
See also Mail::Filter(3)
See also PAR::Filter(3)
See also PAR::Filter::Bleach(3)
See also PAR::Filter::Bytecode(3)
See also PAR::Filter::Obfuscate(3)
See also PAR::Filter::PatchContent(3)
See also PAR::Filter::PodStrip(3)
See also XML::Filter::BufferText(3)
See also XML::Filter::DetectWS(3)
See also XML::Filter::Reindent(3)
See also XML::Filter::SAXT(3)

Man(1) output converted with man2html