 
 17 Performance Considerations
 *****************************
 
 The main design goal of `flex' is that it generate high-performance
 scanners.  It has been optimized for dealing well with large sets of
 rules.  Aside from the effects on scanner speed of the table compression
 `-C' options outlined above, there are a number of options/actions
 which degrade performance.  These are, from most expensive to least:
 
 
          REJECT
          arbitrary trailing context
 
          pattern sets that require backing up
          %option yylineno
          %array
 
          %option interactive
          %option always-interactive
 
          `^' beginning-of-line operator
          yymore()
 
    with the first two being quite expensive and the last two being
 quite cheap.  Note also that `unput()' is implemented as a routine call
 that potentially does quite a bit of work, while `yyless()' is a
 quite-cheap macro. So if you are just putting back some excess text you
 scanned, use `yyless()'.
 
    `REJECT' should be avoided at all costs when performance is
 important.  It is a particularly expensive option.
 
    There is one case when `%option yylineno' can be expensive. That is
 when your patterns match long tokens that could _possibly_ contain a
 newline character. There is no performance penalty for rules that
 cannot possibly match newlines, since flex does not need to check them for
 newlines.  In general, you should avoid rules such as `[^f]+', which
 match very long tokens, including newlines, and may possibly match your
 entire file! A better approach is to separate `[^f]+' into two rules:
 
 
      %option yylineno
      %%
          [^f\n]+
          \n+
 
    The above scanner does not incur a performance penalty.
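
    To make the cost concrete, here is a minimal C sketch of the
 per-token bookkeeping that line counting implies.  It is not `flex's
 generated code: `count_newlines' is a hypothetical helper, and the only
 assumption is that a token which may contain a newline has to be walked
 once more to find them.

```c
#include <stddef.h>

// Hypothetical helper, not part of flex: return how many newlines occur
// in the first `len` bytes of a matched token, i.e. how far a line
// counter would have to advance after this match.
size_t count_newlines(const char *text, size_t len)
{
    size_t lines = 0;

    for (size_t i = 0; i < len; i++) {
        if (text[i] == '\n')
            lines++;
    }
    return lines;
}
```

    With a rule such as `[^f]+', `len' can approach the size of the
 whole file.  With the two-rule version, tokens matched by `[^f\n]+'
 never need this walk, and tokens matched by `\n+' are typically short.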
 
    Getting rid of backing up is messy and often may be an enormous
 amount of work for a complicated scanner.  In principle, one begins by
 using the `-b' flag to generate a `lex.backup' file.  For example, on
 the input:
 
 
          %%
          foo        return TOK_KEYWORD;
          foobar     return TOK_KEYWORD;
 
    the file looks like:
 
 
          State #6 is non-accepting -
           associated rule line numbers:
                 2       3
           out-transitions: [ o ]
           jam-transitions: EOF [ \001-n  p-\177 ]
 
          State #8 is non-accepting -
           associated rule line numbers:
                 3
           out-transitions: [ a ]
           jam-transitions: EOF [ \001-`  b-\177 ]
 
          State #9 is non-accepting -
           associated rule line numbers:
                 3
           out-transitions: [ r ]
           jam-transitions: EOF [ \001-q  s-\177 ]
 
          Compressed tables always back up.
 
    The first few lines tell us that there's a scanner state in which it
 can make a transition on an `o' but not on any other character, and
 that in that state the currently scanned text does not match any rule.
 The state occurs when trying to match the rules found at lines 2 and 3
 in the input file.  If the scanner is in that state and then reads
 something other than an `o', it will have to back up to find a rule
 which is matched.  With a bit of headscratching one can see that this
 must be the state it's in when it has seen `fo'.  When this has
 happened, if anything other than another `o' is seen, the scanner will
 have to back up to simply match the `f' (by the default rule).
 
    The comment regarding State #8 indicates there's a problem when
 `foob' has been scanned.  Indeed, on any character other than an `a',
 the scanner will have to back up to accept `foo'.  Similarly, the
 comment for State #9 concerns when `fooba' has been scanned and an `r'
 does not follow.
 
    The final comment reminds us that there's no point going to all the
 trouble of removing backing up from the rules unless we're using `-Cf'
 or `-CF', since there's no performance gain doing so with compressed
 scanners.
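
    The backing up described above can be sketched in C with a
 hand-coded model of the automaton for `foo', `foobar' and the default
 rule.  This is an illustration under stated assumptions, not what
 `flex' generates: `struct match' and `scan_token' are hypothetical
 names, and the model simply counts how many characters were read past
 the last accepting point.

```c
#include <stddef.h>

// Result of one longest-match attempt (illustrative type, not a flex API).
struct match {
    size_t length;     // characters in the token finally accepted
    size_t backed_up;  // characters read past that point and given back
};

// Hand-coded model of the automaton for the rules `foo` and `foobar`
// plus the default rule, which accepts any single character.
struct match scan_token(const char *input)
{
    static const char path[] = "foobar";   // the only non-jam path
    const size_t path_len = sizeof path - 1;
    struct match m = {0, 0};
    size_t consumed = 0;

    if (input[0] == '\0')
        return m;                          // end of input, nothing to match

    // Advance while the input follows the out-transitions.
    while (consumed < path_len && input[consumed] == path[consumed])
        consumed++;
    if (consumed == 0)
        consumed = 1;                      // any other character: default rule

    // Accepting prefixes are 1 (default rule), 3 (foo) and 6 (foobar).
    if (consumed >= 6)
        m.length = 6;
    else if (consumed >= 3)
        m.length = 3;
    else
        m.length = 1;
    m.backed_up = consumed - m.length;
    return m;
}
```

    In this model, stopping after 2, 4 or 5 characters corresponds to
 States #6, #8 and #9 in the listing: the scanner has read text that no
 rule accepts and must give some of it back.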
 
    The way to remove the backing up is to add "error" rules:
 
 
          %%
          foo         return TOK_KEYWORD;
          foobar      return TOK_KEYWORD;
 
          fooba       |
          foob        |
          fo          {
                      /* false alarm, not really a keyword */
                      return TOK_ID;
                      }
 
    Eliminating backing up among a list of keywords can also be done
 using a "catch-all" rule:
 
 
          %%
          foo         return TOK_KEYWORD;
          foobar      return TOK_KEYWORD;
 
          [a-z]+      return TOK_ID;
 
    This is usually the best solution when appropriate.
 
    Backing up messages tend to cascade.  With a complicated set of rules
 it's not uncommon to get hundreds of messages.  If one can decipher
 them, though, it often only takes a dozen or so rules to eliminate the
 backing up, though it's easy to make a mistake and have an error rule
 accidentally match a valid token.  A possible future `flex' feature
 will be to automatically add rules to eliminate backing up.
 
    It's important to keep in mind that you gain the benefits of
 eliminating backing up only if you eliminate _every_ instance of
 backing up.  Leaving just one means you gain nothing.
 
    _Variable_ trailing context (where both the leading and trailing
 parts do not have a fixed length) entails almost the same performance
 loss as `REJECT' (i.e., substantial).  So when possible a rule like:
 
 
          %%
          mouse|rat/(cat|dog)   run();
 
    is better written:
 
 
          %%
          mouse/cat|dog         run();
          rat/cat|dog           run();
 
    or as
 
 
          %%
          mouse|rat/cat         run();
          mouse|rat/dog         run();
 
    Note that here the special `|' action does _not_ provide any
 savings, and can even make things worse (see Limitations).
 
    Another area where the user can increase a scanner's performance (and
 one that's easier to implement) arises from the fact that the longer the
 tokens matched, the faster the scanner will run.  This is because with
 long tokens the processing of most input characters takes place in the
 (short) inner scanning loop, and does not often have to go through the
 additional work of setting up the scanning environment (e.g., `yytext')
 for the action.  Recall the scanner for C comments:
 
 
          %x comment
          %%
                  int line_num = 1;
 
          "/*"         BEGIN(comment);
 
          <comment>[^*\n]*
          <comment>"*"+[^*/\n]*
          <comment>\n             ++line_num;
          <comment>"*"+"/"        BEGIN(INITIAL);
 
    This could be sped up by writing it as:
 
 
          %x comment
          %%
                  int line_num = 1;
 
          "/*"         BEGIN(comment);
 
          <comment>[^*\n]*
          <comment>[^*\n]*\n      ++line_num;
          <comment>"*"+[^*/\n]*
          <comment>"*"+[^*/\n]*\n ++line_num;
          <comment>"*"+"/"        BEGIN(INITIAL);
 
    Now instead of each newline requiring the processing of another
 action, recognizing the newlines is distributed over the other rules to
 keep the matched text as long as possible.  Note that _adding_ rules
 does _not_ slow down the scanner!  The speed of the scanner is
 independent of the number of rules or (modulo the considerations given
 at the beginning of this section) how complicated the rules are with
 regard to operators such as `*' and `|'.
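
    The effect of the rewrite on the number of actions can be estimated
 with a small C model of the `<comment>' rules.  It does not run `flex';
 it hand-codes the longest-match behaviour of the two rule sets, and
 `comment_matches' and `merge_newlines' are hypothetical names
 introduced for this sketch.

```c
#include <stddef.h>
#include <string.h>

// Count how many rule matches (action invocations) are needed to consume
// `body`, the text that follows the opening delimiter of a C comment.
// merge_newlines == 0 models the first rule set, where a newline is
// always a match of its own; merge_newlines != 0 models the second,
// where a trailing newline is absorbed into the preceding match.
// The count includes the match that closes the comment.
size_t comment_matches(const char *body, int merge_newlines)
{
    size_t matches = 0;
    const char *p = body;

    while (*p != '\0') {
        size_t len;

        if (*p == '*') {
            size_t stars = strspn(p, "*");
            if (p[stars] == '/')
                return matches + 1;        // "*"+"/" ends the comment
            // "*"+[^*/\n]*  : stars followed by ordinary text
            len = stars + strcspn(p + stars, "*/\n");
        } else {
            // [^*\n]*  : ordinary text (empty when p is at a newline)
            len = strcspn(p, "*\n");
        }

        // Take the newline either as part of this match (second rule
        // set) or, when nothing else matched, as the lone \n rule.
        if (p[len] == '\n' && (merge_newlines || len == 0))
            len++;

        p += len;
        matches++;
    }
    return matches;                        // unterminated comment
}
```

    For a three-line comment body such as " one\n * two\n * three\n */"
 the model needs 10 matches with the first rule set and 7 with the
 second, since each line no longer costs a separate newline action.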
 
    A final example in speeding up a scanner: suppose you want to scan
 through a file containing identifiers and keywords, one per line and
 with no other extraneous characters, and recognize all the keywords.  A
 natural first approach is:
 
 
          %%
          asm      |
          auto     |
          break    |
          ... etc ...
          volatile |
          while    /* it's a keyword */
 
          .|\n     /* it's not a keyword */
 
    To eliminate the backing up, introduce a catch-all rule:
 
 
          %%
          asm      |
          auto     |
          break    |
          ... etc ...
          volatile |
          while    /* it's a keyword */
 
          [a-z]+   |
          .|\n     /* it's not a keyword */
 
    Now, if it's guaranteed that there's exactly one word per line, then
 we can reduce the total number of matches by half by merging in the
 recognition of newlines with that of the other tokens:
 
 
          %%
          asm\n    |
          auto\n   |
          break\n  |
          ... etc ...
          volatile\n |
          while\n  /* it's a keyword */
 
          [a-z]+\n |
          .|\n     /* it's not a keyword */
 
    One has to be careful here, as we have now reintroduced backing up
 into the scanner.  In particular, while _we_ know that there will never
 be any characters in the input stream other than letters or newlines,
 `flex' can't figure this out, and it will plan for possibly needing to
 back up when it has scanned a token like `auto' and then the next
 character is something other than a newline or a letter.  Previously it
 would then just match the `auto' rule and be done, but now it has no
 `auto' rule, only an `auto\n' rule.  To eliminate the possibility of
 backing up, we could either duplicate all rules but without final
 newlines, or, since we never expect to encounter such an input and
 therefore don't care how it's classified, we can introduce one more
 catch-all rule, this one which doesn't include a newline:
 
 
          %%
          asm\n    |
          auto\n   |
          break\n  |
          ... etc ...
          volatile\n |
          while\n  /* it's a keyword */
 
          [a-z]+\n |
          [a-z]+   |
          .|\n     /* it's not a keyword */
 
    Compiled with `-Cf', this is about as fast as one can get a `flex'
 scanner to go for this particular problem.
 
    A final note: `flex' is slow when matching `NUL's, particularly when
 a token contains multiple `NUL's.  It's best to write rules which match
 _short_ amounts of text if it's anticipated that the text will often
 include `NUL's.
 
    Another final note regarding performance: as mentioned in the section
 on Matching, dynamically resizing `yytext' to accommodate huge tokens is
 a slow process because it presently requires that the (huge) token be
 rescanned from the beginning.  Thus if performance is vital, you should
 attempt to match "large" quantities of text but not "huge" quantities,
 where the cutoff between the two is at about 8K characters per token.
 