   1 ANTLR v3.0.1 C Runtime
   2 ANTLR 3.0.1
   3 January 1, 2008
   4
   5 At the moment, the C runtime engine for the parser is generally not suited to
   6 the inexperienced C programmer. This is mainly because of the lack of
   7 documentation on its use, which will be corrected shortly. The C runtime
   8 code itself is, however, well documented with doxygen-style comments, and a
   9 reasonably experienced C programmer should be able to piece it together. You
  10 can visit the documentation at: http://www.antlr.org/api/C/index.html
  11
  12 The general makeup is that everything is implemented as a pseudo class/object
  13 initialized with pointers to its 'member' functions and data. All objects are
  14 (usually) created by factories, which automatically manage memory allocation
  15 and release and generally make life easier. If you remember this rule,
  16 everything should fall into place.
  17
  18 Jim Idle - Portland Oregon, Jan 2008
  19 jimi at idle ws
  20
  21 ===============================================================================
  22
  23 Terence Parr, parrt at cs usfca edu
  24 ANTLR project lead and supreme dictator for life
  25 University of San Francisco
  26
  27 INTRODUCTION
  28
  29 Welcome to ANTLR v3!  I've been working on this for nearly 4 years and it's
  30 almost ready!  I plan no feature additions between this beta and first
  31 3.0 release.  I have lots of features to add later, but this will be
  32 the first set.  Ultimately, I need to rewrite ANTLR v3 in itself (it's
  33 written in 2.7.7 at the moment and also needs StringTemplate 3.0 or
  34 later).
  35
  36 You should use v3 in conjunction with ANTLRWorks:
  37
  38     http://www.antlr.org/works/index.html
  39
  40 WARNING: We have bits of documentation started, but nothing super-complete
  41 yet.  The book will be printed May 2007:
  42
  43 http://www.pragmaticprogrammer.com/titles/tpantlr/index.html
  44
  45 but we should have a beta PDF available on that page in Feb 2007.
  46
  47 You also have the examples plus the source to guide you.
  48
  49 See the new wiki FAQ:
  50
  51     http://www.antlr.org/wiki/display/ANTLR3/ANTLR+v3+FAQ
  52
  53 and general doc root:
  54
  55     http://www.antlr.org/wiki/display/ANTLR3/ANTLR+3+Wiki+Home
  56
  57 Please help add/update FAQ entries.
  58
  59 I have made very little effort at this point to deal well with
  60 erroneous input (e.g., bad syntax might make ANTLR crash).  I will clean
  61 this up after I've rewritten v3 in v3.
  62
  63 Per the license in LICENSE.txt, this software is not guaranteed to
  64 work and might even destroy all life on this planet:
  65
  66 THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
  67 IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
  68 WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
  69 DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT,
  70 INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
  71 (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
  72 SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  73 HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
  74 STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
  75 IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
  76 POSSIBILITY OF SUCH DAMAGE.
  77
  78 EXAMPLES
  79
  80 ANTLR v3 sample grammars:
  81
  82     http://www.antlr.org/download/examples-v3.tar.gz
  83
  84 contains the following examples: LL-star, cminus, dynamic-scope,
  85 fuzzy, hoistedPredicates, island-grammar, java, python, scopes,
  86 simplecTreeParser, treeparser, tweak, xmlLexer.
  87
  88 Also check out Mantra Programming Language for a prototype (work in
  89 progress) using v3:
  90
  91     http://www.linguamantra.org/
  92
  93 ----------------------------------------------------------------------
  94
  95 What is ANTLR?
  96
  97 ANTLR stands for (AN)other (T)ool for (L)anguage (R)ecognition and was
  98 originally known as PCCTS.  ANTLR is a language tool that provides a
  99 framework for constructing recognizers, compilers, and translators
 100 from grammatical descriptions containing actions.  Target language list:
 101
 102 http://www.antlr.org/wiki/display/ANTLR3/Code+Generation+Targets
 103
 104 ----------------------------------------------------------------------
 105
 106 How is ANTLR v3 different than ANTLR v2?
 107
 108 See migration guide:
 109     http://www.antlr.org/wiki/display/ANTLR3/Migrating+from+ANTLR+2+to+ANTLR+3
 110
 111 ANTLR v3 has a far superior parsing algorithm called LL(*) that
 112 handles many more grammars than v2 does.  In practice, it means you
 113 can throw almost any grammar at ANTLR that is non-left-recursive and
 114 unambiguous (ambiguous means the same input can be matched by multiple
 115 rules); the cost is perhaps a tiny bit of backtracking, but with a DFA, not
 116 a full parser.  You can manually set the max lookahead k as an option for
 117 any decision, though.  The LL(*) algorithm ramps up to use more lookahead
 118 when it needs to and is much more efficient than normal LL backtracking.
 119 There is support for syntactic predicates (full LL backtracking) when LL(*)
 120 fails.
 121
 122 Lexers are much easier due to the LL(*) algorithm as well.  Previously
 123 these two lexer rules would cause trouble because ANTLR couldn't
 124 distinguish between them with finite lookahead to see the decimal
 125 point:
 126
 127 INT : ('0'..'9')+ ;
 128 FLOAT : INT '.' INT ;
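
The decision these two rules force can be sketched by hand.  This is only
an illustration, not code ANTLR generates; the class and method names are
made up.  The point is that the scanner has to look past an unbounded run
of digits before it knows whether a '.' follows, so no fixed k is enough.

```java
// Hand-written illustration only; NumberScanSketch is a hypothetical name.
final class NumberScanSketch {
    enum Kind { INT, FLOAT }

    /** Classifies the numeric token that starts at index start of text. */
    static Kind classify(String text, int start) {
        int i = start;
        // Scan past the leading digits; there is no fixed bound on how many.
        while (i < text.length() && Character.isDigit(text.charAt(i))) {
            i++;
        }
        if (i == start) {
            throw new IllegalArgumentException("no digit at index " + start);
        }
        // FLOAT needs a '.' followed by at least one more digit.
        boolean dot = i < text.length() && text.charAt(i) == '.';
        boolean digitAfterDot = dot && i + 1 < text.length()
                && Character.isDigit(text.charAt(i + 1));
        return digitAfterDot ? Kind.FLOAT : Kind.INT;
    }

    public static void main(String[] args) {
        System.out.println(classify("12345", 0));
        System.out.println(classify("12345.67", 0));
    }
}
```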
 129
 130 The syntax is almost identical for features in common, but you should
 131 note that labels are always '=' not ':'.  So do id=ID not id:ID.
 132
 133 You can do combined lexer/parser grammars again (a la PCCTS): both lexer
 134 and parser rules are defined in the same file.  See the examples.
 135 Really nice.  You can reference strings and characters in the grammar
 136 and ANTLR will generate the lexer for you.
 137
 138 The attribute structure has been enhanced.  Rules may have multiple
 139 return values, for example.  Further, there are dynamically scoped
 140 attributes whereby a rule may define a value usable by any rule it
 141 invokes directly or indirectly w/o having to pass a parameter all the
 142 way down.
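
One way to picture dynamically scoped attributes is a stack of scope
records that every rule method can see.  The sketch below is plain Java
written for illustration; MethodScope, the rule names, and the log field
are assumptions, and the code ANTLR actually generates may look different.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustration only: rule "method" defines a value that "statement" reads
// without it being passed down through "block" as a parameter.
final class DynamicScopeSketch {
    /** Hypothetical scope record; a grammar would declare its own fields. */
    static final class MethodScope {
        final String name;
        MethodScope(String name) { this.name = name; }
    }

    private final Deque<MethodScope> methodStack = new ArrayDeque<>();
    final StringBuilder log = new StringBuilder();

    void method(String name) {
        methodStack.push(new MethodScope(name)); // enter the scope
        try {
            block();
        } finally {
            methodStack.pop();                   // leave it even on error
        }
    }

    void block() { statement(); }                // no parameter needed here

    void statement() {
        // Reads the innermost enclosing scope directly.
        log.append("statement inside ").append(methodStack.peek().name).append('\n');
    }

    public static void main(String[] args) {
        DynamicScopeSketch s = new DynamicScopeSketch();
        s.method("foo");
        System.out.print(s.log);
    }
}
```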
 143
 144 ANTLR v3 tree construction is far superior--it provides tree rewrite
 145 rules where the right hand side is simply the tree grammar fragment
 146 describing the tree you want to build:
 147
 148 formalArgs
 149         :       typename declarator (',' typename declarator )*
 150                 -> ^(ARG typename declarator)+
 151         ;
 152
 153 That builds tree sequences like:
 154
 155 ^(ARG int v1) ^(ARG int v2)
 156
 157 ANTLR v3 also incorporates StringTemplate:
 158
 159       http://www.stringtemplate.org
 160
 161 just like AST support.  It is useful for generating output.  For
 162 example this rule creates a template called 'import' for each import
 163 definition found in the input stream:
 164
 165 grammar Java;
 166 options {
 167   output=template;
 168 }
 169 ...
 170 importDefinition
 171     :   'import' identifierStar SEMI
 172         -> import(name={$identifierStar.st},
 173                 begin={$identifierStar.start},
 174                 end={$identifierStar.stop})
 175     ;
 176
 177 The attributes are set via assignments in the argument list.  The
 178 arguments are actions with arbitrary expressions in the target
 179 language.  The .st label property is the result template from a rule
 180 reference.  There is a nice shorthand in actions too:
 181
 182     %foo(a={},b={},...) ctor
 183     %({name-expr})(a={},...) indirect template ctor reference
 184     %{string-expr} anonymous template from string expr
 185     %{expr}.y = z; set template attribute y of StringTemplate-typed expr to z
 186     %x.y = z; set template attribute y of x (always set never get attr)
 187               to z [languages like python without ';' must still use the
 188               ';' which the code generator is free to remove during code gen]
 189               Same as '(x).setAttribute("y", z);'
 190
 191 For ANTLR v3 I decided to make the most common tasks easy by default.
 192 This means that some of the basic objects are heavier weight
 193 than some speed demons would like, but they are free to pare it down
 194 leaving most programmers the luxury of having it "just work."  For
 195 example, to read in some input, tweak it, and write it back out
 196 preserving whitespace, is easy in v3.
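
The read/tweak/write-back idea can be sketched without the runtime.  The
class below is a stand-in written for illustration, not the runtime's
token stream: it keeps every token, whitespace included, records
replacements by token index, and only converts them to text when rendering.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration only; RewriteBufferSketch is a hypothetical name.
final class RewriteBufferSketch {
    private final List<String> tokens;
    private final Map<Integer, Object> replacements = new HashMap<>();

    RewriteBufferSketch(List<String> tokens) { this.tokens = tokens; }

    /** Records a replacement; the original token list is never modified. */
    void replace(int tokenIndex, Object replacement) {
        if (tokenIndex < 0 || tokenIndex >= tokens.size()) {
            throw new IndexOutOfBoundsException("no token " + tokenIndex);
        }
        replacements.put(tokenIndex, replacement);
    }

    /** Emits the tokens in order, converting replacements to strings last. */
    String render() {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < tokens.size(); i++) {
            Object r = replacements.get(i);
            out.append(r != null ? r.toString() : tokens.get(i));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        RewriteBufferSketch buf =
                new RewriteBufferSketch(Arrays.asList("int", "  ", "x", ";", "\n"));
        buf.replace(2, new StringBuilder("count"));
        System.out.print(buf.render());
    }
}
```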
 197
 198 The ANTLR source code is much prettier.  You'll also note that the
 199 run-time classes are conveniently encapsulated in the
 200 org.antlr.runtime package.
 201
 202 ----------------------------------------------------------------------
 203
 204 How do I install this damn thing?
 205
 206 Just untar and you'll get:
 207
 208 antlr-3.0b6/README.txt (this file)
 209 antlr-3.0b6/LICENSE.txt
 210 antlr-3.0b6/src/org/antlr/...
 211 antlr-3.0b6/lib/stringtemplate-3.0.jar (3.0b6 needs 3.0)
 212 antlr-3.0b6/lib/antlr-2.7.7.jar
 213 antlr-3.0b6/lib/antlr-3.0b6.jar
 214
 215 Then you need to add all the jars in lib to your CLASSPATH.
 216
 217 ----------------------------------------------------------------------
 218
 219 How do I use ANTLR v3?
 220
 221 [I am assuming you are only using the command-line (and not the
 222 ANTLRWorks GUI)].
 223
 224 Running ANTLR with no parameters shows you:
 225
 226 ANTLR Parser Generator   Early Access Version 3.0b6 (Jan 31, 2007) 1989-2007
 227 usage: java org.antlr.Tool [args] file.g [file2.g file3.g ...]
 228   -o outputDir          specify output directory where all output is generated
 229   -lib dir              specify location of token files
 230   -report               print out a report about the grammar(s) processed
 231   -print                print out the grammar without actions
 232   -debug                generate a parser that emits debugging events
 233   -profile              generate a parser that computes profiling information
 234   -nfa                  generate an NFA for each rule
 235   -dfa                  generate a DFA for each decision point
 236   -message-format name  specify output style for messages
 237   -X                    display extended argument list
 238
 239 For example, consider how to build the LL-star example from the examples
 240 tarball you can get at http://www.antlr.org/download/examples-v3.tar.gz:
 241
 242 $ cd examples/java/LL-star
 243 $ java org.antlr.Tool simplec.g
 244 $ jikes *.java
 245
 246 For input:
 247
 248 char c;
 249 int x;
 250 void bar(int x);
 251 int foo(int y, char d) {
 252   int i;
 253   for (i=0; i<3; i=i+1) {
 254     x=3;
 255     y=5;
 256   }
 257 }
 258
 259 you will see output as follows:
 260
 261 $ java Main input
 262 bar is a declaration
 263 foo is a definition
 264
 265 What if I want to test my parser without generating code?  Easy.  Just
 266 run ANTLR in interpreter mode.  It can't execute your actions, but it
 267 can create a parse tree from your input to show you how it would be
 268 matched.  Use the org.antlr.tool.Interp main class.  In the following,
 269 I interpret simplec.g on t.c, which contains "int x;"
 270
 271 $ java org.antlr.tool.Interp simplec.g WS program t.c
 272 ( <grammar SimpleC>
 273   ( program
 274     ( declaration
 275       ( variable
 276         ( type [@0,0:2='int',<14>,1:0] )
 277         ( declarator [@2,4:4='x',<2>,1:4] )
 278         [@3,5:5=';',<5>,1:5]
 279       )
 280     )
 281   )
 282 )
 283
 284 where I have formatted the output to make it more readable.  I have
 285 told it to ignore all WS tokens.
 286
 287 ----------------------------------------------------------------------
 288
 289 How do I rebuild ANTLR v3?
 290
 291 Make sure the following jars are in your CLASSPATH:
 292
 293 antlr-3.0b6/lib/stringtemplate-3.0.jar
 294 antlr-3.0b6/lib/antlr-2.7.7.jar
 295 junit.jar [if you want to build the test directories]
 296
 297 then jump into antlr-3.0b6/src directory and then type:
 298
 299 $ javac -d . org/antlr/Tool.java org/antlr/*/*.java org/antlr/*/*/*.java
 300
 301 Takes 9 seconds on my 1GHz laptop or 4 seconds with jikes.  Later I'll
 302 have a real build mechanism, though I must admit the one-liner appeals
 303 to me.  I use IntelliJ, so I never actually type anything to build.
 304
 305 There is also an Ant build.xml file, contributed by others; I know nothing
 306 of Ant (I'm opposed to any tool with an XML interface for humans).
 307
 308 -----------------------------------------------------------------------
 309 C# Target Notes
 310
 311 1. Auto-generated lexers do not inherit parent parser's @namespace
 312    {...} value.  Use @lexer::namespace{...}.
 313
 314 -----------------------------------------------------------------------
 315
 316 CHANGES
 317
 318 March 17, 2007
 319
 320 * Jonathan DeKlotz updated C# templates to be 3.0b6 current
 321
 322 March 14, 2007
 323
 324 * Manually-specified (...)=> force backtracking eval of that predicate.
 325   backtracking=true mode does not however.  Added unit test.
 326
 327 March 14, 2007
 328
 329 * Fixed bug in lexer where ~T didn't compute the set from rule T.
 330
 331 * Added -Xnoinlinedfa to make all DFA use tables; no inline prediction with IFs
 332
 333 * Fixed http://www.antlr.org:8888/browse/ANTLR-80.
 334   Sem pred states didn't define lookahead vars.
 335
 336 * Fixed http://www.antlr.org:8888/browse/ANTLR-91.
 337   When forcing some acyclic DFA to be state tables, they broke.
 338   Forcing all DFA to be state tables should give same results.
 339
 340 March 12, 2007
 341
 342 * setTokenSource in CommonTokenStream didn't clear tokens list.
 343   setCharStream calls reset in Lexer.
 344
 345 * Altered -depend.  No longer printing grammar files for multiple input
 346   files with -depend.  Doesn't show T__.g temp file anymore. Added
 347   TLexer.tokens.  Added .h files if defined.
 348
 349 February 11, 2007
 350
 351 * Added -depend command-line option that, instead of processing files,
 352   shows you what files the input grammar(s) depend on and what files
 353   they generate. For combined grammar T.g:
 354
 355   $ java org.antlr.Tool -depend T.g
 356
 357   You get:
 358
 359   TParser.java : T.g
 360   T.tokens : T.g
 361   T__.g : T.g
 362
 363   Now, assuming U.g is a tree grammar that refs T's tokens:
 364
 365   $ java org.antlr.Tool -depend T.g U.g
 366
 367   TParser.java : T.g
 368   T.tokens : T.g
 369   T__.g : T.g
 370   U.g: T.tokens
 371   U.java : U.g
 372   U.tokens : U.g
 373
 374   Handles spaces by escaping them.  Pays attention to -o, -fo and -lib.
 375   Dir 'x y' is a valid dir in current dir.
 376
 377   $ java org.antlr.Tool -depend -lib /usr/local/lib -o 'x y' T.g U.g
 378   x\ y/TParser.java : T.g
 379   x\ y/T.tokens : T.g
 380   x\ y/T__.g : T.g
 381   U.g: /usr/local/lib/T.tokens
 382   x\ y/U.java : U.g
 383   x\ y/U.tokens : U.g
 384
 385   You have API access via org.antlr.tool.BuildDependencyGenerator class:
 386   getGeneratedFileList(), getDependenciesFileList().  You can also access
 387   the output template: getDependencies().  The file
 388   org/antlr/tool/templates/depend.stg contains the template.  You can
 389   modify as you want.  File objects go in so you can play with path etc...
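
The space escaping shown in the 'x y' output above can be sketched as a
small formatter.  This is not the BuildDependencyGenerator code and does
not use its API; the class and method names are assumptions made for
illustration.

```java
// Illustration only; DependLineSketch is a hypothetical name.
final class DependLineSketch {
    /** Escapes spaces so a path survives as a single make-style word. */
    static String escapeSpaces(String path) {
        return path.replace(" ", "\\ ");
    }

    /** Formats "target : source", prefixing the output dir when given. */
    static String dependencyLine(String outputDir, String generated, String source) {
        String target = outputDir.isEmpty() ? generated : outputDir + "/" + generated;
        return escapeSpaces(target) + " : " + escapeSpaces(source);
    }

    public static void main(String[] args) {
        System.out.println(dependencyLine("x y", "TParser.java", "T.g"));
        System.out.println(dependencyLine("", "T.tokens", "T.g"));
    }
}
```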
 390
 391 February 10, 2007
 392
 393 * no more .gl files generated.  All .g all the time.
 394
 395 * changed @finally to be @after and added a finally clause to the
 396   exception stuff.  I also removed the superfluous "exception"
 397   keyword.  Here's what the new syntax looks like:
 398
 399   a
 400   @after { System.out.println("ick"); }
 401     : 'a'
 402     ;
 403     catch[RecognitionException e] { System.out.println("foo"); }
 404     catch[IOException e] { System.out.println("io"); }
 405     finally { System.out.println("foobar"); }
 406
 407   @after executes after bookkeeping to set $rule.stop, $rule.tree but
 408   before scopes pop and any memoization happens.  Dynamic scopes and
 409   memoization are still in generated finally block because they must
 410   exec even if error in rule.  The @after action and tree setting
 411   stuff can technically be skipped upon syntax error in rule.  [Later
 412   we might add something to finally to stick an ERROR token in the
 413   tree and set the return value.]  Sequence goes: set $stop, $tree (if
 414   any), @after (if any), pop scopes (if any), memoize (if needed),
 415   grammar finally clause.  Last 3 are in generated code's finally
 416   clause.
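
The sequence described above maps onto an ordinary Java try/catch/finally.
The sketch below only records step names in order; it is not generated
parser code, and IllegalStateException stands in for a syntax error.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only; RuleEpilogueSketch is a hypothetical name.
final class RuleEpilogueSketch {
    /** Returns the steps a rule would go through, in execution order. */
    static List<String> runRule(boolean syntaxError) {
        List<String> steps = new ArrayList<>();
        try {
            steps.add("match rule body");
            if (syntaxError) {
                // Stand-in for a RecognitionException thrown while matching.
                throw new IllegalStateException("syntax error");
            }
            steps.add("set $stop and $tree");
            steps.add("@after action");      // skipped when matching fails
        } catch (IllegalStateException e) {
            steps.add("catch clause");
        } finally {
            // These three always run, error or not.
            steps.add("pop dynamic scopes");
            steps.add("memoize");
            steps.add("grammar finally clause");
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(runRule(false));
        System.out.println(runRule(true));
    }
}
```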
 417
 418 3.0b6 - January 31, 2007
 419
 420 January 30, 2007
 421
 422 * Fixed bug in IntervalSet.and: it returned the same empty set all the time
 423   rather than new empty set.  Code altered the same empty set.
 424
 425 * Made analysis terminate faster upon a decision that takes too long;
 426   it seemed to keep doing work for a while.  Refactored some names
 427   and updated comments.  Also made it terminate when it realizes it's
 428   non-LL(*) due to recursion.  just added terminate conditions to loop
 429   in convert().
 430
 431 * Sometimes fatal non-LL(*) messages didn't appear; instead you got
 432   "antlr couldn't analyze", which is actually untrue.  I had the
 433   order of some prints wrong in the DecisionProbe.
 434
 435 * The code generator incorrectly detected when it could use a fixed,
 436   acyclic inline DFA (i.e., using an IF).  Upon non-LL(*) decisions
 437   with predicates, analysis made cyclic DFA.  But this stops
 438   the computation detecting whether they are cyclic.  I just added
 439   a protection in front of the acyclic DFA generator to avoid if
 440   non-LL(*).  Updated comments.
 441
 442 January 23, 2007
 443
 444 * Made tree node streams use adaptor to create navigation nodes.
 445   Thanks to Emond Papegaaij.
 446
 447 January 22, 2007
 448
 449 * Added lexer rule properties: start, stop
 450
 451 January 1, 2007
 452
 453 * analysis failsafe is back on; if a decision takes too long, it bails out
 454   and uses k=1
 455
 456 January 1, 2007
 457
 458 * += labels for rules only work for output option; previously elements
 459   of list were the return value structs, but are now either the tree or
 460   StringTemplate return value.  You can label different rules now
 461   x+=a x+=b.
 462
 463 December 30, 2006
 464
 465 * Allow \" to work correctly in "..." template.
 466
 467 December 28, 2006
 468
 469 * errors that are now warnings: missing AST label type in trees.
 470   Also "no start rule detected" is warning.
 471
 472 * tree grammars also can do rewrite=true for output=template.
 473   Only works for alts with single node or tree as alt elements.
 474   If you are going to use $text in a tree grammar or do rewrite=true
 475   for templates, you must use in your main:
 476
 477   nodes.setTokenStream(tokens);
 478
 479 * You get a warning for tree grammars that do rewrite=true and
 480   output=template and have -> for alts that are not simple nodes
 481   or simple trees.  new unit tests in TestRewriteTemplates at end.
 482
 483 December 27, 2006
 484
 485 * Error message appears when you use -> in tree grammar with
 486   output=template and rewrite=true for alt that is not simple
 487   node or tree ref.
 488
 489 * no more $stop attribute for tree parsers; meaningless/useless.
 490   Removed from TreeRuleReturnScope also.
 491
 492 * rule text attribute in tree parser must pull from token buffer.
 493   Makes no sense otherwise.  added getTokenStream to TreeNodeStream
 494   so rule $text attr works.  CommonTreeNodeStream etc... now let
 495   you set the token stream so you can access later from tree parser.
 496   $text is not well-defined for rules like
 497
 498      slist : stat+ ;
 499
 500   because stat is not a single node nor rooted with a single node.
 501   $slist.text will get only first stat.  I need to add a warning about
 502   this...
 503
 504 * Fixed http://www.antlr.org:8888/browse/ANTLR-76 for Java.
 505   Enhanced TokenRewriteStream so it accepts any object; converts
 506   to string at last second.  Allows you to rewrite with StringTemplate
 507   templates now :)
 508
 509 * added rewrite option that makes -> template rewrites do replace ops for
 510   TokenRewriteStream input stream.  In output=template and rewrite=true mode
 511   same as before 'cept that the parser does
 512
 513     ((TokenRewriteStream)input).replace(
 514               ((Token)retval.start).getTokenIndex(),
 515               input.LT(-1).getTokenIndex(),
 516               retval.st);
 517
 518   after each rewrite so that the input stream is altered.  Later refs to
 519   $text will have rewrites.  Here's a sample test program for grammar Rew.
 520
 521         FileReader groupFileR = new FileReader("Rew.stg");
 522         StringTemplateGroup templates = new StringTemplateGroup(groupFileR);
 523         ANTLRInputStream input = new ANTLRInputStream(System.in);
 524         RewLexer lexer = new RewLexer(input);
 525         TokenRewriteStream tokens = new TokenRewriteStream(lexer);
 526         RewParser parser = new RewParser(tokens);
 527         parser.setTemplateLib(templates);
 528         parser.program();
 529         System.out.println(tokens.toString());
 530         groupFileR.close();
 531
 532 December 26, 2006
 533
 534 * BaseTree.dupTree didn't dup recursively.
 535
 536 December 24, 2006
 537
 538 * Cleaned up some comments and removed field treeNode
 539   from MismatchedTreeNodeException class.  It is "node" in
 540   RecognitionException.
 541
 542 * Changed type from Object to BitSet for expecting fields in
 543   MismatchedSetException and MismatchedNotSetException
 544
 545 * Cleaned up error printing in lexers and the messages that it creates.
 546
 547 * Added this to TreeAdaptor:
 548         /** Return the token object from which this node was created.
 549          *  Currently used only for printing an error message.
 550          *  The error display routine in BaseRecognizer needs to
 551          *  display where in the input the error occurred. If your
 552          *  tree implementation does not store information that can
 553          *  lead you to the token, you can create a token filled with
 554          *  the appropriate information and pass that back.  See
 555          *  BaseRecognizer.getErrorMessage().
 556          */
 557         public Token getToken(Object t);
 558
 559 December 23, 2006
 560
 561 * made BaseRecognizer.displayRecognitionError nonstatic so people can
 562   override it. Not sure why it was static before.
 563
 564 * Removed state/decision message that comes out of no
 565   viable alternative exceptions, as that was too much.
 566   removed the decision number from the early exit exception
 567   also.  During development, you can simply override
 568   displayRecognitionError from BaseRecognizer to add the stuff
 569   back in if you want.
 570
 571 * made output go to an output method you can override: emitErrorMessage()
 572
 573 * general cleanup of the error emitting code in BaseRecognizer.  Lots
 574   more stuff you can override: getErrorHeader, getTokenErrorDisplay,
 575   emitErrorMessage, getErrorMessage.
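
Overriding one of these hooks might look like the sketch below.
RecognizerStandIn is a stand-in written for illustration, and its method
signatures are simplified assumptions; check the real BaseRecognizer for
the actual parameters.  The subclass collects messages instead of
printing them.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only; not the runtime's classes or signatures.
final class ErrorHookSketch {
    /** Stand-in base class with simplified, assumed signatures. */
    static class RecognizerStandIn {
        void displayRecognitionError(int line, int col, String problem) {
            emitErrorMessage(getErrorHeader(line, col) + " " + problem);
        }
        String getErrorHeader(int line, int col) {
            return "line " + line + ":" + col;
        }
        void emitErrorMessage(String msg) {
            System.err.println(msg);
        }
    }

    /** Overrides only the output hook; formatting is inherited. */
    static class CollectingRecognizer extends RecognizerStandIn {
        final List<String> errors = new ArrayList<>();
        @Override
        void emitErrorMessage(String msg) {
            errors.add(msg);
        }
    }

    public static void main(String[] args) {
        CollectingRecognizer r = new CollectingRecognizer();
        r.displayRecognitionError(3, 7, "no viable alternative");
        System.out.println(r.errors);
    }
}
```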
 576
 577 December 22, 2006
 578
 579 * Altered TreeParser.matchAny() so that it skips entire trees if
 580   node has children otherwise skips one node.  Now this works to
 581   skip entire body of function if single-rooted subtree:
 582   ^(FUNC name=ID arg=ID .)
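
Skipping a whole subtree in a flattened node stream can be sketched as
follows.  The sketch assumes the stream marks children with DOWN and UP
navigation nodes, represented here as plain strings; it is an
illustration, not the runtime's matchAny() code.

```java
import java.util.Arrays;
import java.util.List;

// Illustration only; MatchAnySketch is a hypothetical name.
final class MatchAnySketch {
    static final String DOWN = "DOWN";
    static final String UP = "UP";

    /** Returns the index just past the node at pos and all of its children. */
    static int skipAny(List<String> nodes, int pos) {
        int i = pos + 1;                       // consume the node itself
        if (i < nodes.size() && nodes.get(i).equals(DOWN)) {
            int depth = 0;
            do {                               // consume through the matching UP
                String n = nodes.get(i++);
                if (n.equals(DOWN)) depth++;
                else if (n.equals(UP)) depth--;
            } while (depth > 0 && i < nodes.size());
        }
        return i;
    }

    public static void main(String[] args) {
        // ^(FUNC f x ^(BLOCK a b)) flattened with navigation nodes.
        List<String> nodes = Arrays.asList(
                "FUNC", DOWN, "f", "x", "BLOCK", DOWN, "a", "b", UP, UP);
        System.out.println(skipAny(nodes, 4)); // index after the BLOCK subtree
    }
}
```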
 583
 584 * Added "reverse index" from node to stream index.  Override
 585   fillReverseIndex() in CommonTreeNodeStream if you want to change.
 586   Use getNodeIndex(node) to find stream index for a specific tree node.
 587   See getNodeIndex(), reverseIndex(Set tokenTypes),
 588   reverseIndex(int tokenType), fillReverseIndex().  The indexing
 589   costs time and memory to fill, but pulling stuff out will be lots
 590   faster as it can jump from a node ptr straight to a stream index.
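
The reverse index idea, as a rough sketch: pay once to build a map from
node object to stream position, then look positions up directly.  This is
not the CommonTreeNodeStream code; the names are made up, and keying by
object identity is an assumption of the sketch.

```java
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Illustration only; ReverseIndexSketch is a hypothetical name.
final class ReverseIndexSketch {
    /** Maps each node object (by identity) to its position in the stream. */
    static Map<Object, Integer> build(List<?> nodes) {
        Map<Object, Integer> index = new IdentityHashMap<>();
        for (int i = 0; i < nodes.size(); i++) {
            index.put(nodes.get(i), i);
        }
        return index;
    }

    public static void main(String[] args) {
        Object a = new Object();
        Object b = new Object();
        Map<Object, Integer> index = build(Arrays.asList(a, b));
        System.out.println(index.get(b));
    }
}
```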
 591
 592 * Added TreeNodeStream.get(index) to make it easier for interpreters to
 593   jump around in tree node stream.
 594
 595 * New CommonTreeNodeStream buffers all nodes in stream for fast jumping
 596   around.  It now has push/pop methods to invoke other locations in
 597   the stream for building interpreters.
 598
 599 * Moved CommonTreeNodeStream to UnBufferedTreeNodeStream and removed
 600   Iterator implementation.  moved toNodesOnlyString() to TestTreeNodeStream
 601
 602 * [BREAKS ANY TREE IMPLEMENTATION]
 603   made CommonTreeNodeStream work with any tree node type.  TreeAdaptor
 604   now declares isNil, so you must add it; trivial, but it does break
 605   backward compatibility.
 606
 607 December 17, 2006
 608
 609 * Added traceIn/Out methods to recognizers so that you can override them;
 610   previously they were in-line print statements. The message has also
 611   been slightly improved.
 612
 613 * Factored BuildParseTree into debug package; cleaned stuff up. Fixed
 614   unit tests.
 615
 616 December 15, 2006
 617
 618 * [BREAKS ANY TREE IMPLEMENTATION]
 619   org.antlr.runtime.tree.Tree; needed to add get/set for token start/stop
 620   index so CommonTreeAdaptor can assume Tree interface not CommonTree
 621   implementation.  Otherwise, no way to create your own nodes that satisfy
 622   Tree because CommonTreeAdaptor was doing
 623
 624         public int getTokenStartIndex(Object t) {
 625                 return ((CommonTree)t).startIndex;
 626         }
 627
 628   Added to Tree:
 629
 630         /**  What is the smallest token index (indexing from 0) for this node
 631          *   and its children?
 632          */
 633         int getTokenStartIndex();
 634
 635         void setTokenStartIndex(int index);
 636
 637         /**  What is the largest token index (indexing from 0) for this node
 638          *   and its children?
 639          */
 640         int getTokenStopIndex();
 641
 642         void setTokenStopIndex(int index);
 643
 644 December 13, 2006
 645
 646 * Added org.antlr.runtime.tree.DOTTreeGenerator so you can generate DOT
 647   diagrams easily from trees.
 648
 649         CharStream input = new ANTLRInputStream(System.in);
 650         TLexer lex = new TLexer(input);
 651         CommonTokenStream tokens = new CommonTokenStream(lex);
 652         TParser parser = new TParser(tokens);
 653         TParser.e_return r = parser.e();
 654         Tree t = (Tree)r.tree;
 655         System.out.println(t.toStringTree());
 656         DOTTreeGenerator gen = new DOTTreeGenerator();
 657         StringTemplate st = gen.toDOT(t);
 658         System.out.println(st);
 659
 660 * Changed the way mark()/rewind() work in CommonTreeNode stream to mirror
 661   more flexible solution in ANTLRStringStream.  Forgot to set lastMarker
 662   anyway.  Now you can rewind to non-most-recent marker.
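
Rewinding to a marker that is not the most recent one can be sketched
with a list of saved positions.  The class below is a stand-in for
illustration, not the runtime's stream; in particular, releasing the
rewound marker and every later one is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only; MarkRewindSketch is a hypothetical name.
final class MarkRewindSketch {
    private final String data;
    private int p = 0;                                 // current position
    private final List<Integer> markers = new ArrayList<>();

    MarkRewindSketch(String data) { this.data = data; }

    /** Saves the current position and returns a 1-based marker id. */
    int mark() {
        markers.add(p);
        return markers.size();
    }

    /** Restores the position saved by marker and drops it and later ones. */
    void rewind(int marker) {
        p = markers.get(marker - 1);
        markers.subList(marker - 1, markers.size()).clear();
    }

    void consume() { if (p < data.length()) p++; }

    int index() { return p; }

    public static void main(String[] args) {
        MarkRewindSketch s = new MarkRewindSketch("abcdef");
        int first = s.mark();
        s.consume();
        s.consume();
        s.mark();
        s.consume();
        s.rewind(first);               // skips over the more recent marker
        System.out.println(s.index());
    }
}
```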
 663
 664 December 12, 2006
 665
 666 * Temp lexers now end in .gl (T__.gl, for example)
 667
 668 * TreeParser suffix no longer generated for tree grammars
 669
 670 * Defined reset for lexer, parser, tree parser; rewinds the input stream also
 671
 672 December 10, 2006
 673
 674 * Made Grammar.abortNFAToDFAConversion() abort in middle of a DFA.
 675
 676 December 9, 2006
 677
 678 * fixed bug in OrderedHashSet.add().  It didn't track elements correctly.
 679
 680 December 6, 2006
 681
 682 * updated build.xml for future Ant compatibility, thanks to Matt Benson.
 683
 684 * various tests in TestRewriteTemplate and TestSyntacticPredicateEvaluation
 685   were using the old 'channel' vs. new '$channel' notation.
 686   TestInterpretedParsing didn't pick up an earlier change to CommonToken.
 687   Reported by Matt Benson.
 688
 689 * fixed platform dependent test failures in TestTemplates, supplied by Matt
 690   Benson.
 691
 692 November 29, 2006
 693
 694 *  optimized semantic predicate evaluation so that p||!p yields true.
 695
 696 November 22, 2006
 697
 698 * fixed bug that prevented var = $rule.some_retval from working in anything
 699   but the first alternative of a rule or subrule.
 700
 701 * attribute names containing digits were not allowed, this is now fixed,
 702   allowing attributes like 'name1' but not '1name1'.
 703
 704 November 19, 2006
 705
 706 * Removed LeftRecursionMessage and apparatus because it seems that I check
 707   for left recursion upfront before analysis and everything gets specified as
 708   recursion cycles at this point.
 709
 710 November 16, 2006
 711
 712 * TokenRewriteStream.replace was not passing programName to next method.
 713
 714 November 15, 2006
 715
 716 * updated DOT files for DFA generation to make smaller circles.
 717
 718 * made epsilon edges italics in the NFA diagrams.
 719
 720 3.0b5 - November 15, 2006
 721
 722 The biggest thing is that your grammar file names must match the grammar name
 723 inside (your generated class names will also be different) and we use
 724 $channel=HIDDEN now instead of channel=99 inside lexer actions.
 725 Should be compatible other than that.   Please look at complete list of
 726 changes.
 727
 728 November 14, 2006
 729
 730 * Force token index to be -1 for CommonToken in case not set.
 731
 732 November 11, 2006
 733
 734 * getUniqueID for TreeAdaptor now uses identityHashCode instead of hashCode.
 735
 736 November 10, 2006
 737
 738 * No grammar nondeterminism warning now when wildcard '.' is final alt.
 739   Examples:
 740
 741         a : A | B | . ;
 742
 743         A : 'a'
 744           | .
 745           ;
 746
 747         SL_COMMENT
 748             : '//' (options {greedy=false;} : .)* '\r'? '\n'
 749             ;
 750
 751         SL_COMMENT2
 752             : '//' (options {greedy=false;} : 'x'|.)* '\r'? '\n'
 753             ;
 754
 755
 756 November 8, 2006
 757
 758 * Syntactic predicates did not get hoisted properly upon non-LL(*) decision.  Other hoisting issues fixed.  Cleaned up code.
 759
 760 * Removed failsafe that checks to see if I'm spending too much time on a single DFA; I don't think we need it anymore.
 761
 762 November 3, 2006
 763
 764 * $text, $line, etc... were not working in assignments. Fixed and added
 765   test case.
 766
 767 * $label.text translated to label.getText in lexer even if label was on a char
 768
 769 November 2, 2006
 770
 771 * Added error if you don't specify what the AST type is; actions in tree
 772   grammar won't work without it.
 773
 774   $ cat x.g
 775   tree grammar x;
 776   a : ID {String s = $ID.text;} ;
 777
 778   ANTLR Parser Generator   Early Access Version 3.0b5 (??, 2006)  1989-2006
 779   error: x.g:0:0: (152) tree grammar x has no ASTLabelType option
 780
 781 November 1, 2006
 782
 783 * $text, $line, etc... were not working properly within lexer rule.
 784
 785 October 32, 2006
 786
 787 * Finally actions now execute before dynamic scopes are popped in the
 788   rule. Previously it was not possible to access the rule's scoped variables
 789   in a finally action.
 790
 791 October 29, 2006
 792
 793 * Altered ActionTranslator to emit errors on setting read-only attributes
 794   such as $start, $stop, $text in a rule. Also forbid setting any attributes
 795   in rules/tokens referenced by a label or name.
 796   Setting dynamic scopes' attributes and your own parameter attributes
 797   is legal.
 798
 799 October 27, 2006
 800
 801 * Altered how ANTLR figures out what decision is associated with which
 802   block of grammar.  Makes ANTLRWorks correctly find DFA for a block.
 803
 804 October 26, 2006
 805
 806 * Fixed bug where EOT transitions led to no NFA configs in a DFA state,
 807   yielding an error in DFA table generation.
 808
 809 * renamed action.g to ActionTranslator.g
 810   the ActionTranslator class is now called ActionTranslatorLexer, as ANTLR
 811   generates this classname now. Fixed rest of codebase accordingly.
 812
 813 * added rules recognizing setting of scopes' attributes to ActionTranslator.g
 814   the Objective C target needed access to the right-hand side of the assignment
 815   in order to generate correct code
 816
 817 * changed ANTLRCore.sti to reflect the new mandatory templates to support the above
 818   namely: scopeSetAttributeRef, returnSetAttributeRef and the ruleSetPropertyRef_*
 819   templates, with the exception of ruleSetPropertyRef_text. we cannot set this attribute
 820
 821 October 19, 2006
 822
 823 * Fixed 2 bugs in DFA conversion that caused exceptions.
 824   altered functionality of getMinElement so it ignores elements<0.
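
  A minimal sketch of the getMinElement change, assuming the set is a plain
  int array and that -1 is returned when no non-negative element exists.
  Both are assumptions for illustration; MinElementSketch and
  minNonNegative are hypothetical names, not ANTLR's.

```java
final class MinElementSketch {
    /** Smallest element that is >= 0, or -1 if there is none (the sentinel is an assumption). */
    static int minNonNegative(int[] elements) {
        int min = -1;
        for (int e : elements) {
            if (e < 0) continue;              // ignore negative elements
            if (min < 0 || e < min) min = e;  // first candidate or a smaller one
        }
        return min;
    }
}
```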
 825
 826 October 18, 2006
 827
 828 * moved resetStateNumbersToBeContiguous() to after issuing of warnings;
 829   an internal error in that routine should make more sense as issues
 830   with decision will appear first.
 831
 832 * fixed cut/paste bug I introduced when I fixed the EOF in min/max
 833   bug. Prevented C grammar from working briefly.
 834
 835 October 17, 2006
 836
 837 * Removed a failsafe that seems to be unnecessary that ensured DFA didn't
 838   get too big.  It was resulting in some failures in code generation that
 839   led me on quite a strange debugging trip.
 840
 841 October 16, 2006
 842
 843 * Use channel=HIDDEN not channel=99 to put tokens on hidden channel.
 844
 845 October 12, 2006
 846
 847 * ANTLR now has a customizable message format for errors and warnings,
 848   to make it easier to fulfill requirements by IDEs and such.
 849   The format to be used can be specified via the '-message-format name'
 850   command line switch. The default for name is 'antlr', also available
 851   at the moment is 'gnu'. This is done via StringTemplate, for details
 852   on the requirements look in org/antlr/tool/templates/messages/formats/
 853
 854 * line numbers for lexers in combined grammars are now reported correctly.
 855
 856 September 29, 2006
 857
 858 * ANTLRReaderStream improperly checked for end of input.
 859
 860 September 28, 2006
 861
 862 * For ANTLRStringStream, LA(-1) was off by one...gave you LA(-2).
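
  To make the intended lookahead indexing concrete, here is a rough sketch
  of a string stream where LA(1) is the next char and LA(-1) is the char
  just consumed. LookaheadSketch is a hypothetical stand-in, not the real
  ANTLRStringStream; returning 0 for LA(0) is an assumption.

```java
final class LookaheadSketch {
    static final int EOF = -1;
    private final char[] data;
    private int p = 0; // index of the next char to consume

    LookaheadSketch(String input) { data = input.toCharArray(); }

    void consume() { if (p < data.length) p++; }

    /** LA(1) is the next char, LA(-1) the previous one; LA(0) is treated as undefined. */
    int LA(int i) {
        if (i == 0) return 0;
        if (i < 0) i++;            // shift so that LA(-1) maps to index p-1, not p-2
        int index = p + i - 1;
        if (index < 0 || index >= data.length) return EOF;
        return data[index];
    }
}
```

  Without the i++ adjustment for negative i, LA(-1) would read index p-2,
  which is the off-by-one described above.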
 863
 864 3.0b4 - August 24, 2006
 865
 866 * error when no rules in grammar.  doesn't crash now.
 867
 868 * Token is now an interface.
 869
 870 * remove dependence on non runtime classes in runtime package.
 871
 872 * filename and grammar name must be same Foo in Foo.g.  Generates FooParser,
 873   FooLexer, ...  Combined grammar Foo generates Foo$Lexer.g which generates
 874   FooLexer.java.  tree grammars generate FooTreeParser.java
 875
 876 August 24, 2006
 877
 878 * added C# target to lib, codegen, templates
 879
 880 August 11, 2006
 881
 882 * added tree arg to navigation methods in treeadaptor
 883
 884 August 07, 2006
 885
 886 * fixed bug related to (a|)+ on end of lexer rules.  crashed instead
 887   of warning.
 888
 889 * added warning that interpreter doesn't do synpreds yet
 890
 891 * allow different source of classloader:
 892 ClassLoader cl = Thread.currentThread().getContextClassLoader();
 893 if ( cl==null ) {
 894     cl = this.getClass().getClassLoader();
 895 }
 896
 897
 898 July 26, 2006
 899
 900 * compressed DFA edge tables significantly.  All edge tables are
 901   unique. The transition table can reuse arrays.  Look like this now:
 902
 903      public static readonly short[] DFA30_transition0 =
 904          new short[] { 46, 46, -1, 46, 46, -1, -1, -1, -1, -1, -1, -1,...};
 905      public static readonly short[] DFA30_transition1 =
 906          new short[] { 21 };
 907      public static readonly short[][] DFA30_transition = {
 908          DFA30_transition0,
 909          DFA30_transition0,
 910          DFA30_transition1,
 911          ...
 912      };
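
  The row sharing shown above could be computed along these lines. This is
  only a sketch in Java under the assumption that identical rows are
  detected by value; EdgeTableSketch and shareRows are hypothetical names
  and this is not the code generator's actual implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class EdgeTableSketch {
    /** Replaces equal rows with one shared array; returns the number of unique rows. */
    static int shareRows(short[][] table) {
        Map<List<Short>, short[]> unique = new HashMap<>();
        for (int s = 0; s < table.length; s++) {
            List<Short> key = new ArrayList<>();
            for (short v : table[s]) key.add(v);      // value-based key for the row
            short[] shared = unique.putIfAbsent(key, table[s]);
            if (shared != null) table[s] = shared;    // reuse the earlier identical row
        }
        return unique.size();
    }
}
```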
 913
 914 * If you defined both a label like EQ and '=', sometimes the '=' was
 915   used instead of the EQ label.
 916
 917 * made headerFile template have same arg list as outputFile for consistency
 918
 919 * outputFile, lexer, genericParser, parser, treeParser templates
 920   reference cyclicDFAs attribute which was no longer used after I
 921   started the new table-based DFA.  I made cyclicDFADescriptors
 922   argument to outputFile and headerFile (only).  I think this is
 923   correct as only OO languages will want the DFA in the recognizer.
 924   At the top level, C and friends can use it.  Changed name to use
 925   cyclicDFAs again as it's a better name probably.  Removed parameter
 926   from the lexer, ...  For example, my parser template says this now:
 927
 928     <cyclicDFAs:cyclicDFA()> <! dump tables for all DFA !>
 929
 930 * made all token ref token types go thru code gen's
 931   getTokenTypeAsTargetLabel()
 932
 933 * no more computing DFA transition tables for acyclic DFA.
 934
 935 July 25, 2006
 936
 937 * fixed a place where I was adding syn predicates into rewrite stuff.
 938
 939 * turned off invalid token index warning in AW support; had a problem.
 940
 941 * bad location event generated with -debug for synpreds in autobacktrack mode.
 942
 943 July 24, 2006
 944
 945 * changed runtime.DFA so that it treats all chars and token types as
 946   char (unsigned 16 bit int).  -1 becomes '\uFFFF' then or 65535.
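
  The -1 to 65535 mapping is just Java's narrowing cast from int to char.
  A tiny illustration (CharSentinelSketch and toTableValue are hypothetical
  names used only here):

```java
final class CharSentinelSketch {
    /** Narrows an int char or token type to an unsigned 16-bit value. */
    static char toTableValue(int type) {
        return (char) type; // -1 wraps around to '\uFFFF' (65535)
    }
}
```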
 947
 948 * changed MAX_STATE_TRANSITIONS_FOR_TABLE to be 65534 by default
 949   now. This means that all states can use a table to do transitions.
 950
 951 * was not making synpreds on (C)* type loops with backtrack=true
 952
 953 * was copying tree stuff and actions into synpreds with backtrack=true
 954
 955 * was making synpreds on even single alt rules / blocks with backtrack=true
 956
 957 3.0b3 - July 21, 2006
 958
 959 * ANTLR fails to analyze complex decisions much less frequently.  It
 960   turns out that the set of decisions for which ANTLR fails (times
 961   out) is the same set (so far) of non-LL(*) decisions.  Moreover, I'm
 962   able to detect this situation quickly and report rather than timing
 963   out. Errors look like:
 964
 965   java.g:468:23: [fatal] rule concreteDimensions has non-LL(*)
 966     decision due to recursive rule invocations in alts 1,2.  Resolve
 967     by left-factoring or using syntactic predicates with fixed k
 968     lookahead or use backtrack=true option.
 969
 970   This message only appears when k=*.
 971
 972 * Shortened no viable alt messages to not include decision
 973   description:
 974
 975 [compilationUnit, declaration]: line 8:8 decision=<<67:1: declaration
 976 : ( ( fieldDeclaration )=> fieldDeclaration | ( methodDeclaration )=>
 977 methodDeclaration | ( constructorDeclaration )=>
 978 constructorDeclaration | ( classDeclaration )=> classDeclaration | (
 979 interfaceDeclaration )=> interfaceDeclaration | ( blockDeclaration )=>
 980 blockDeclaration | emptyDeclaration );>> state 3 (decision=14) no
 981 viable alt; token=[@1,184:187='java',<122>,8:8]
 982
 983   too long and hard to read.
 984
 985 July 19, 2006
 986
 987 * Code gen bug: states with no emanating edges were ignored by ST.
 988   Now an empty list is used.
 989
 990 * Added grammar parameter to recognizer templates so they can access
 991   properties like getName(), ...
 992
 993 July 10, 2006
 994
 995 * Fixed the gated pred merged state bug.  Added unit test.
 996
 997 * added new method to Target: getTokenTypeAsTargetLabel()
 998
 999 July 7, 2006
1000
1001 * I was doing an AND instead of OR in the gated predicate stuff.
1002   Thanks to Stephen Kou!
1003
1004 * Reduce op for combining predicates was insanely slow sometimes and
1005   didn't actually work well.  Now it's fast and works.
1006
1007 * There is a bug in merging of DFA stop states related to gated
1008   preds...turned it off for now.
1009
1010 3.0b2 - July 5, 2006
1011
1012 July 5, 2006
1013
1014 * token emission not properly protected in lexer filter mode.
1015
1016 * EOT, EOT DFA state transition tables should be init'd to -1 (only
1017   was doing this for compressed tables).  Fixed.
1018
1019 * in trace mode, exit method not shown for memoized rules
1020
1021 * added -Xmaxdfaedges to allow you to increase number of edges allowed
1022   for a single DFA state before it becomes "special" and can't fit in
1023   a simple table.
1024
1025 * Bug in tables.  Short are signed so min/max tables for DFA are now
1026   char[].  Bizarre.
1027
1028 July 3, 2006
1029
1030 * Added a method to reset the tool error state for current thread.
1031   See ErrorManager.java
1032
1033 * [Got this working properly today] backtrack mode that lets you type
1034   in any old crap and ANTLR will backtrack if it can't figure out what
1035   you meant.  No errors are reported by antlr during analysis.  It
1036   implicitly adds a syn pred in front of every production, using them
1037   only if static grammar LL(*) analysis fails.  Syn pred code is not
1038   generated if the pred is not used in a decision.
1039
1040   This is essentially a rapid prototyping mode.
1041
1042 * Added backtracking report to the -report option
1043
1044 * Added NFA->DFA conversion early termination report to the -report option
1045
1046 * Added grammar level k and backtrack options to -report
1047
1048 * Added a dozen unit tests to test autobacktrack NFA construction.
1049
1050 * If you are using filter mode, you must manually use option
1051   memoize=true now.
1052
1053 July 2, 2006
1054
1055 * Added k=* option so you can set k=2, for example, on whole grammar,
1056   but an individual decision can be LL(*).
1057
1058 * memoize option for grammars, rules, blocks.  Remove -nomemo cmd-line option
1059
1060 * bug in DOT generator for DFA; fixed.
1061
1062 * runtime.DFA reported errors even when backtracking
1063
1064 July 1, 2006
1065
1066 * Added -X option list to help
1067
1068 * Syn preds were being hoisted into other rules, causing lots of extra
1069   backtracking.
1070
1071 June 29, 2006
1072
1073 * unnecessary files removed during build.
1074
1075 * Matt Benson updated build.xml
1076
1077 * Detecting use of synpreds in analysis now instead of codegen.  In
1078   this way, I can avoid analyzing decisions in synpreds for synpreds
1079   not used in a DFA for a real rule.  This is used to optimize things
1080   for backtrack option.
1081
1082 * Code gen must add _fragment or whatever to end of pred name in
1083   template synpredRule to avoid having ANTLR know anything about
1084   method names.
1085
1086 * Added -IdbgST option to emit ST delimiters at start/stop of all
1087   templates spit out.
1088
1089 June 28, 2006
1090
1091 * Tweaked message when ANTLR cannot handle analysis.
1092
1093 3.0b1 - June 27, 2006
1094
1095 June 24, 2006
1096
1097 * syn preds no longer generate little static classes; they also don't
1098   generate a whole bunch of extra crap in the rules built to test syn
1099   preds.  Removed GrammarFragmentPointer class from runtime.
1100
1101 June 23-24, 2006
1102
1103 * added output option to -report output.
1104
1105 * added profiling info:
1106   Number of rule invocations in "guessing" mode
1107   number of rule memoization cache hits
1108   number of rule memoization cache misses
1109
1110 * made DFA DOT diagrams go left to right not top to bottom
1111
1112 * I try to avoid recursive overflow states now by resolving these states
1113   with semantic/syntactic predicates if they exist.  The DFA is then
1114   deterministic rather than simply resolving by choosing first
1115   nondeterministic alt.  I used to generate errors:
1116
1117 ~/tmp $ java org.antlr.Tool -dfa t.g
1118 ANTLR Parser Generator   Early Access Version 3.0b2 (July 5, 2006)  1989-2006
1119 t.g:2:5: Alternative 1: after matching input such as A A A A A decision cannot predict what comes next due to recursion overflow to b from b
1120 t.g:2:5: Alternative 2: after matching input such as A A A A A decision cannot predict what comes next due to recursion overflow to b from b
1121
1122   Now, I use predicates if available and emit no warnings.
1123
1124 * made sem preds share accept states.  Previously, multiple preds in a
1125 decision forked new accepts each time for each nondet state.
1126
1127 June 19, 2006
1128
1129 * Need parens around the prediction expressions in templates.
1130
1131 * Referencing $ID.text in an action forced bad code gen in lexer rule ID.
1132
1133 * Fixed a bug in how predicates are collected.  The definition of
1134   "last predicated alternative" was incorrect in the analysis.  Further,
1135   gated predicates incorrectly missed a case where an edge should become
1136   true (a tautology).
1137
1138 * Removed an unnecessary input.consume() reference in the runtime/DFA class.
1139
1140 June 14, 2006
1141
1142 * -> ($rulelabel)? didn't generate proper code for ASTs.
1143
1144 * bug in code gen (did not compile)
1145 a : ID -> ID
1146   | ID -> ID
1147   ;
1148 Problem is repeated ref to ID from left side.  Juergen pointed this out.
1149
1150 * use of tokenVocab with missing file yielded exception
1151
1152 * (A|B)=> foo yielded an exception as (A|B) is a set not a block. Fixed.
1153
1154 * Didn't set ID1= and INT1= for this alt:
1155   | ^(ID INT+ {System.out.print(\"^(\"+$ID+\" \"+$INT+\")\");})
1156
1157 * Fixed so repeated dangling state errors only occur once like:
1158 t.g:4:17: the decision cannot distinguish between alternative(s) 2,1 for at least one input sequence
1159
1160 * tracking of rule elements was on (making list defs at start of
1161   method) with templates instead of just with ASTs.  Turned off.
1162
1163 * Doesn't crash when you give it a missing file now.
1164
1165 * -report: add output info: how many LL(1) decisions.
1166
1167 June 13, 2006
1168
1169 * ^(ROOT ID?) Didn't work; nor did any other nullable child list such as
1170   ^(ROOT ID* INT?).  Now, I check to see if child list is nullable using
1171   Grammar.LOOK() and, if so, I generate an "IF lookahead is DOWN" gate
1172   around the child list so the whole thing is optional.
1173
1174 * Fixed a bug in LOOK that made it not look through nullable rules.
1175
1176 * Using AST suffixes or -> rewrite syntax now gives an error w/o a grammar
1177   output option.  Used to crash ;)
1178
1179 * References to EOF ended up with improper -1 refs instead of EOF in output.
1180
1181 * didn't warn of ambig ref to $expr in rewrite; fixed.
1182 list
1183      :  '[' expr 'for' type ID 'in' expr ']'
1184         -> comprehension(expr={$expr.st},type={},list={},i={})
1185         ;
1186
1187 June 12, 2006
1188
1189 * EOF works in the parser as a token name.
1190
1191 * Rule b:(A B?)*; didn't display properly in AW due to the way ANTLR
1192   generated NFA.
1193
1194 * "scope x;" in a rule for unknown x gives no error.  Fixed.  Added unit test.
1195
1196 * Label type for refs to start/stop in tree parser and other parsers was
1197   not used.  Lots of casting.  Ick. Fixed.
1198
1199 * couldn't refer to $tokenlabel in isolation; but need so we can test if
1200   something was matched.  Fixed.
1201
1202 * Lots of little bugs fixed in $x.y, %... translation due to new
1203   action translator.
1204
1205 * Improperly tracking block nesting level; result was that you couldn't
1206   see $ID in action of rule "a : A+ | ID {Token t = $ID;} | C ;"
1207
1208 * a : ID ID {$ID.text;} ; did not get a warning about ambiguous $ID ref.
1209
1210 * No error was found on $COMMENT.text:
1211
1212 COMMENT
1213     :   '/*' (options {greedy=false;} : . )* '*/'
1214         {System.out.println("found method "+$COMMENT.text);}
1215     ;
1216
1217   $enclosinglexerrule scope does not exist.  Use text or setText() here.
1218
1219 June 11, 2006
1220
1221 * Single return values are initialized now to default or to your spec.
1222
1223 * cleaned up input stream stuff.  Added ANTLRReaderStream, ANTLRInputStream
1224   and refactored.  You can specify encodings now on ANTLRFileStream (and
1225   ANTLRInputStream) now.
1226
1227 * You can set text local var now in a lexer rule and token gets that text.
1228   start/stop indexes are still set for the token.
1229
1230 * Changed lexer slightly.  Calling a nonfragment rule from a
1231   nonfragment rule does not set the overall token.
1232
1233 June 10, 2006
1234
1235 * Fixed bug where unnecessary escapes yield char==0 like '\{'.
1236
1237 * Fixed analysis bug.  This grammar didn't report a recursion warning:
1238 x   : y X
1239     | y Y
1240     ;
1241 y   : L y R
1242     | B
1243     ;
1244   The DFAState.equals() method was messed up.
1245
1246 * Added @synpredgate {...} action so you can tell ANTLR how to gate actions
1247   in/out during syntactic predicate evaluation.
1248
1249 * Fuzzy parsing should be more efficient.  It should backtrack over a rule
1250   and then rewind and do it again "with feeling" to exec actions.  It was
1251   actually doing it 3x not 2x.
1252
1253 June 9, 2006
1254
1255 * Gutted and rebuilt the action translator for $x.y, $x::y, ...
1256   Uses ANTLR v3 now for the first time inside v3 source. :)
1257   ActionTranslator.java
1258
1259 * Fixed a bug where referencing a return value on a rule didn't work
1260   because later a ref to that rule's predefined properties didn't
1261   properly force a return value struct to be built.  Added unit test.
1262
1263 June 6, 2006
1264
1265 * New DFA mechanisms.  Cyclic DFA are implemented as state tables,
1266   encoded via strings as java cannot handle large static arrays :(
1267   States with edges emanating that have predicates are specially
1268   treated.  A method is generated to do these states.  The DFA
1269   simulation routine uses the "special" array to figure out if the
1270   state is special.  See March 25, 2006 entry for description:
1271   http://www.antlr.org/blog/antlr3/codegen.tml.  analysis.DFA now has
1272   all the state tables generated for code gen.  CyclicCodeGenerator.java
1273   disappeared as it's unneeded code. :)
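
  One way to encode a state table in a string is run-length encoding with
  (count, value) char pairs. The sketch below assumes that scheme; the
  entry above does not spell out the encoding, and EncodedTableSketch,
  pack and unpack are hypothetical names.

```java
final class EncodedTableSketch {
    /** Packs values as (count, value) char pairs; a run is capped at 0xFFFF entries. */
    static String pack(short[] data) {
        StringBuilder sb = new StringBuilder();
        int i = 0;
        while (i < data.length) {
            int run = 1;
            while (i + run < data.length && data[i + run] == data[i] && run < 0xFFFF) run++;
            sb.append((char) run).append((char) data[i]); // -1 is stored as '\uFFFF'
            i += run;
        }
        return sb.toString();
    }

    /** Expands (count, value) pairs back into a short[] table. */
    static short[] unpack(String encoded) {
        int size = 0;
        for (int i = 0; i < encoded.length(); i += 2) size += encoded.charAt(i);
        short[] data = new short[size];
        int di = 0;
        for (int i = 0; i < encoded.length(); i += 2) {
            char n = encoded.charAt(i);
            char v = encoded.charAt(i + 1);
            for (int j = 0; j < n; j++) data[di++] = (short) v; // '\uFFFF' becomes -1 again
        }
        return data;
    }
}
```

  A string constant sidesteps the large static array initializer, and the
  table is rebuilt once when the class loads.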
1274
1275 * Internal general clean up of the DFA.states vs uniqueStates thing.
1276   Fixed lookahead decisions no longer fill uniqueStates.  Waste of
1277   time.  Also noted that when adding sem pred edges, I didn't check
1278   for state reuse.  Fixed.
1279
1280 June 4, 2006
1281
1282 * When resolving ambig DFA states predicates, I did not add the new states
1283   to the list of unique DFA states.  No observable effect on output except
1284   that DFA state numbers were not always contiguous for predicated decisions.
1285   I needed this fix for new DFA tables.
1286
1287 3.0ea10 - June 2, 2006
1288
1289 June 2, 2006
1290
1291 * Improved grammar stats and added syntactic pred tracking.
1292
1293 June 1, 2006
1294
1295 * Due to a type mismatch, the DebugParser.recoverFromMismatchedToken()
1296   method was not called.  Debug events for mismatched token error
1297   notification were not sent to ANTLRWorks probably
1298
1299 * Added getBacktrackingLevel() for any recognizer; needed for profiler.
1300
1301 * Only writes profiling data for antlr grammar analysis with -profile set
1302
1303 * Major update and bug fix to (runtime) Profiler.
1304
1305 May 27, 2006
1306
1307 * Added Lexer.skip() to force lexer to ignore current token and look for
1308   another; no token is created for current rule and is not passed on to
1309   parser (or other consumer of the lexer).
1310
1311 * Parsers are much faster now.  I removed use of java.util.Stack for pushing
1312   follow sets and use a hardcoded array stack instead.  Dropped from
1313   5900ms to 3900ms for parse+lex time parsing entire java 1.4.2 source.  Lex
1314   time alone was about 1500ms.  Just looking at parse time, we get about 2x
1315   speed improvement. :)
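
  A rough sketch of the kind of array-backed stack that can stand in for
  java.util.Stack here. The initial capacity, the doubling policy and the
  use of java.util.BitSet for follow sets are assumptions;
  FollowStackSketch is a hypothetical name, not the runtime's class.

```java
import java.util.BitSet;

final class FollowStackSketch {
    private BitSet[] following = new BitSet[100]; // initial capacity is an arbitrary choice
    private int sp = -1;                          // index of the top element, -1 when empty

    void push(BitSet follow) {
        if (sp + 1 >= following.length) {         // grow by doubling when full
            BitSet[] bigger = new BitSet[following.length * 2];
            System.arraycopy(following, 0, bigger, 0, following.length);
            following = bigger;
        }
        following[++sp] = follow;
    }

    /** Removes and returns the top element; the caller must not pop an empty stack. */
    BitSet pop() {
        BitSet top = following[sp];
        following[sp--] = null;                   // drop the reference so it can be collected
        return top;
    }

    int size() { return sp + 1; }
}
```

  Unlike java.util.Stack, nothing here is synchronized, which is a likely
  source of the saving on a push and pop per rule invocation.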
1316
1317 May 26, 2006
1318
1319 * Fixed NFA construction so it generates NFA for (A*)* such that ANTLRWorks
1320   can display it properly.
1321
1322 May 25, 2006
1323
1324 * added abort method to Grammar so AW can terminate the conversion if it's
1325   taking too long.
1326
1327 May 24, 2006
1328
1329 * added method to get left recursive rules from grammar without doing full
1330   grammar analysis.
1331
1332 * analysis, code gen not attempted if serious error (like
1333   left-recursion or missing rule definition) occurred while reading
1334   the grammar in and defining symbols.
1335
1336 * added amazing optimization; reduces analysis time by 90% for java
1337   grammar; simple IF statement addition!
1338
1339 3.0ea9 - May 20, 2006
1340
1341 * added global k value for grammar to limit lookahead for all decisions unless
1342 overridden in a particular decision.
1343
1344 * added failsafe so that any decision taking longer than 2 seconds to create
1345 the DFA will fall back on k=1.  Use -ImaxtimeforDFA n (in ms) to set the time.
1346
1347 * added an option (turned off for now) to use multiple threads to
1348 perform grammar analysis.  Not much help on a 2-CPU computer as
1349 garbage collection seems to peg the 2nd CPU already. :( Gotta wait for
1350 a 4 CPU box ;)
1351
1352 * switched from #src to // $ANTLR src directive.
1353
1354 * CommonTokenStream.getTokens() looked past end of buffer sometimes. fixed.
1355
1356 * unicode literals didn't really work in DOT output and generated code. fixed.
1357
1358 * fixed the unit test rig so it compiles nicely with Java 1.5
1359
1360 * Added ant build.xml file (reads build.properties file)
1361
1362 * predicates sometimes failed to compile/eval properly due to missing (...)
1363   in IF expressions.  Forced (..)
1364
1365 * (...)? with only one alt were not optimized.  Was:
1366
1367         // t.g:4:7: ( B )?
1368         int alt1=2;
1369         int LA1_0 = input.LA(1);
1370         if ( LA1_0==B ) {
1371             alt1=1;
1372         }
1373         else if ( LA1_0==-1 ) {
1374             alt1=2;
1375         }
1376         else {
1377             NoViableAltException nvae =
1378                 new NoViableAltException("4:7: ( B )?", 1, 0, input);
1379             throw nvae;
1380         }
1381
1382 is now:
1383
1384         // t.g:4:7: ( B )?
1385         int alt1=2;
1386         int LA1_0 = input.LA(1);
1387         if ( LA1_0==B ) {
1388             alt1=1;
1389         }
1390
1391   Smaller, faster and more readable.
1392
1393 * Allow manual init of return values now:
1394   functionHeader returns [int x=3*4, char (*f)()=null] : ... ;
1395
1396 * Added optimization for DFAs that fixed a codegen bug with rules in lexer:
1397    EQ                    : '=' ;
1398    ASSIGNOP              : '=' | '+=' ;
1399   EQ is a subset of other rule.  It did not give an error, which is
1400   correct, but generated bad code.
1401
1402 * ANTLR was sending column not char position to ANTLRWorks.
1403
1404 * Bug fix: location 0, 0 emitted for synpreds and empty alts.
1405
1406 * debugging event handshake now sends grammar file name.  Added getGrammarFileName() to recognizers.  Java.stg generates it:
1407
1408     public String getGrammarFileName() { return "<fileName>"; }
1409
1410 * tree parsers can do arbitrary lookahead now including backtracking.  I
1411   updated CommonTreeNodeStream.
1412
1413 * added events for debugging tree parsers:
1414
1415         /** Input for a tree parser is an AST, but we know nothing for sure
1416          *  about a node except its type and text (obtained from the adaptor).
1417          *  This is the analog of the consumeToken method.  Again, the ID is
1418          *  the hashCode usually of the node so it only works if hashCode is
1419          *  not implemented.
1420          */
1421         public void consumeNode(int ID, String text, int type);
1422
1423         /** The tree parser looked ahead */
1424         public void LT(int i, int ID, String text, int type);
1425
1426         /** The tree parser has popped back up from the child list to the
1427          *  root node.
1428          */
1429         public void goUp();
1430
1431         /** The tree parser has descended to the first child of the current
1432          *  root node.
1433          */
1434         public void goDown();
1435
1436 * Added DebugTreeNodeStream and DebugTreeParser classes
1437
1438 * Added ctor because the debug tree node stream will need to ask questions about nodes and since nodes are just Object, it needs an adaptor to decode the nodes and get text/type info for the debugger.
1439
1440 public CommonTreeNodeStream(TreeAdaptor adaptor, Tree tree);
1441
1442 * added getter to TreeNodeStream:
1443         public TreeAdaptor getTreeAdaptor();
1444
1445 * Implemented getText/getType in CommonTreeAdaptor.
1446
1447 * Added TraceDebugEventListener that can dump all events to stdout.
1448
1449 * I broke down and made Tree implement getText
1450
1451 * tree rewrites now gen location debug events.
1452
1453 * added AST debug events to listener; added blank listener for convenience
1454
1455 * updated debug events to send begin/end backtrack events for debugging
1456
1457 * with a : (b->b) ('+' b -> ^(PLUS $a b))* ; you get b[0] each time as
1458   there is no loop in rewrite rule itself.  Need to know context that
1459   the -> is inside the rule and hence b means last value of b not all
1460   values.
1461
1462 * Bug in TokenRewriteStream; ops at indexes < start index blocked proper op.
1463
1464 * Actions in ST rewrites "-> ({$op})()" were not translated
1465
1466 * Added new action name:
1467
1468 @rulecatch {
1469 catch (RecognitionException re) {
1470     reportError(re);
1471     recover(input,re);
1472 }
1473 catch (Throwable t) {
1474     System.err.println(t);
1475 }
1476 }
1477 Overrides rule catch stuff.
1478
1479 * Isolated $ refs caused exception
1480
1481 3.0ea8 - March 11, 2006
1482
1483 * added @finally {...} action like @init for rules.  Executes in
1484   finally block (java target) after all other stuff like rule memoization.
1485   No code changes needed; ST just refs a new action:
1486       <ruleDescriptor.actions.finally>
1487
1488 * hideous bug fixed: PLUS='+' didn't result in '+' rule in lexer
1489
1490 * TokenRewriteStream didn't do toString() right when no rewrites had been done.
1491
1492 * lexer errors in interpreter were not printed properly
1493
1494 * bitsets are dumped in hex not decimal now for FOLLOW sets
1495
1496 * /* epsilon */ is not printed now when printing out grammars with empty alts
1497
1498 * Fixed another bug in tree rewrite stuff where it was checking that elements
1499   had at least one element.  Strange...commented out for now to see if I can remember what's up.
1500
1501 * Tree rewrites had problems when you didn't have x+=FOO variables.  Rules
1502   like this work now:
1503
1504   a : (x=ID)? y=ID -> ($x $y)?;
1505
1506 * filter=true for lexers turns on k=1 and backtracking for every token
1507   alternative.  Put the rules in priority order.
1508
1509 * added getLine() etc... to Tree to support better error reporting for
1510   trees.  Added MismatchedTreeNodeException.
1511
1512 * $templates::foo() is gone.  added % as special template symbol.
1513   %foo(a={},b={},...) ctor (even shorter than $templates::foo(...))
1514   %({name-expr})(a={},...) indirect template ctor reference
1515
1516   The above are parsed by antlr.g and translated by codegen.g
1517   The following are parsed manually here:
1518
1519   %{string-expr} anonymous template from string expr
1520   %{expr}.y = z; set template attribute y of StringTemplate-typed expr to z
1521   %x.y = z; set template attribute y of x (always set never get attr)
1522             to z [languages like python without ';' must still use the
1523             ';' which the code generator is free to remove during code gen]
1524
1525 * -> ({expr})(a={},...) notation for indirect template rewrite.
1526   expr is the name of the template.
1527
1528 * $x[i]::y and $x[-i]::y notation for accessing absolute scope stack
1529   indexes and relative negative scopes.  $x[-1]::y is the y attribute
1530   of the previous scope (stack top - 1).
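
  The two index forms can be pictured as follows, assuming the scope stack
  is a List with the bottom of the stack at index 0. ScopeStackSketch and
  scopeAt are hypothetical names; the generated code may look quite
  different.

```java
import java.util.List;

final class ScopeStackSketch {
    /** i >= 0 is an absolute index from the bottom; i < 0 is relative to the top of the stack. */
    static <T> T scopeAt(List<T> stack, int i) {
        int index = (i >= 0) ? i : stack.size() - 1 + i; // -1 means "stack top - 1"
        return stack.get(index);
    }
}
```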
1531
1532 * filter=true mode for lexers; can do this now...upon mismatch, just
1533   consumes a char and tries again:
1534 lexer grammar FuzzyJava;
1535 options {filter=true;}
1536
1537 FIELD
1538     :   TYPE WS? name=ID WS? (';'|'=')
1539         {System.out.println("found var "+$name.text);}
1540     ;
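
  The filter behavior described above (try the rules in priority order; on
  mismatch, consume a char and try again) can be approximated without ANTLR
  using java.util.regex. This is only an illustration: FilterLexerSketch,
  scan and the regex-based rules are hypothetical and say nothing about how
  the generated lexer is actually built.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FilterLexerSketch {
    /** Tries rules in map iteration order at each position; on mismatch, skips one char. */
    static List<String> scan(Map<String, Pattern> rules, String input) {
        List<String> found = new ArrayList<>();
        int p = 0;
        while (p < input.length()) {
            boolean matched = false;
            for (Map.Entry<String, Pattern> rule : rules.entrySet()) {
                Matcher m = rule.getValue().matcher(input);
                m.region(p, input.length());
                if (m.lookingAt() && m.end() > p) {   // must match at p and consume something
                    found.add(rule.getKey() + ":" + m.group());
                    p = m.end();
                    matched = true;
                    break;                            // earlier rules win
                }
            }
            if (!matched) p++; // no rule applies here: consume a char and try again
        }
        return found;
    }
}
```

  Pass a LinkedHashMap so that iteration order is the priority order.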
1541
1542 * refactored char streams so ANTLRFileStream is now a subclass of
1543   ANTLRStringStream.
1544
1545 * char streams for lexer now allow nested backtracking in lexer.
1546
1547 * added TokenLabelType for lexer/parser for all token labels
1548
1549 * line numbers for error messages were not updated properly in antlr.g
1550   for strings, char literals and <<...>>
1551
1552 * init action in lexer rules was before the type,start,line,... decls.
1553
1554 * Tree grammars can now specify output; I've only tested output=template
1555   though.
1556
1557 * You can reference EOF now in the parser and lexer.  It's just token type
1558   or char value -1.
1559
1560 * Bug fix: $ID refs in the *lexer* were all messed up.  Cleaned up the
1561   set of properties available...
1562
1563 * Bug fix: .st not found in rule ref when rule has scope:
1564 field
1565 scope {
1566         StringTemplate funcDef;
1567 }
1568     :   ...
1569         {$field::funcDef = $field.st;}
1570     ;
1571 it gets field_stack.st instead
1572
1573 * return in backtracking must return retval or null if return value.
1574
1575 * $property within a rule now works like $text, $st, ...
1576
1577 * AST/Template Rewrites were not gated by backtracking==0 so they
1578   executed even when guessing.  Auto AST construction is now gated also.
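
  The gating idea in miniature: side effects run only when the backtracking
  depth is zero. GuessingGateSketch and its members are hypothetical names
  chosen for illustration and do not mirror the generated parser code.

```java
import java.util.ArrayList;
import java.util.List;

final class GuessingGateSketch {
    int backtracking = 0;                       // > 0 while evaluating a syntactic predicate
    final List<String> log = new ArrayList<>(); // records the side effects of executed actions

    /** "Matches" a rule; its action (or rewrite) runs only when not guessing. */
    void rule(String name) {
        if (backtracking == 0) log.add("action in " + name);
    }

    /** Speculatively runs a rule with actions gated off; returns whether it matched. */
    boolean synpred(String name) {
        backtracking++;
        try {
            rule(name);
            return true;
        } finally {
            backtracking--;
        }
    }
}
```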
1579
1580 * CommonTokenStream was somehow returning tokens not text in toString()
1581
1582 * added useful methods to runtime.BitSet and also to CommonToken so you can
1583   update the text.  Added nice Token stream method:
1584
1585   /** Given a start and stop index, return a List of all tokens in
1586    *  the token type BitSet.  Return null if no tokens were found.  This
1587    *  method looks at both on and off channel tokens.
1588    */
1589   public List getTokens(int start, int stop, BitSet types);
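
  A sketch of what such a method might do, using java.util.BitSet in place
  of runtime.BitSet and a hypothetical Tok class in place of Token.
  Clamping start and stop to the buffer bounds is an assumption.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

final class TokenFilterSketch {
    /** Hypothetical stand-in for a token: just a type and some text. */
    static final class Tok {
        final int type;
        final String text;
        Tok(int type, String text) { this.type = type; this.text = text; }
    }

    /** Tokens at indexes start..stop (inclusive) whose type is in the set; null if none. */
    static List<Tok> getTokens(List<Tok> tokens, int start, int stop, BitSet types) {
        List<Tok> filtered = new ArrayList<>();
        int last = Math.min(stop, tokens.size() - 1);
        for (int i = Math.max(start, 0); i <= last; i++) {
            Tok t = tokens.get(i);
            if (types.get(t.type)) filtered.add(t);
        }
        return filtered.isEmpty() ? null : filtered;
    }
}
```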
1590
1591 * literals are now passed in the .tokens files so you can ref them in
1592   tree parses, for example.
1593
1594 * added basic exception handling; no labels, just general catches:
1595
1596 a : {;}A | B ;
1597         exception
1598                 catch[RecognitionException re] {
1599                         System.out.println("recog error");
1600                 }
1601                 catch[Exception e] {
1602                         System.out.println("error");
1603                 }
1604
1605 * Added method to TokenStream:
1606   public String toString(Token start, Token stop);
1607
1608 * antlr generates #src lines in lexer grammars generated from combined grammars
1609   so error messages refer to original file.
1610
1611 * lexers generated from combined grammars now use original formatting.
1612
1613 * predicates have $x.y stuff translated now.  Warning: predicates might be
1614   hoisted out of context.
1615
1616 * return values in return val structs are now public.
1617
1618 * output=template with return values on rules was broken.  I assume return values with ASTs were broken too.  Fixed.
1619
1620 3.0ea7 - December 14, 2005
1621
1622 * Added -print option to print out grammar w/o actions
1623
1624 * Renamed BaseParser to be BaseRecognizer and even made Lexer derive from
1625   this; nice as it now shares backtracking support code.
1626
1627 * Added syntactic predicates (...)=>.  See December 4, 2005 entry:
1628
1629   http://www.antlr.org/blog/antlr3/lookahead.tml
1630
1631   Note that we have a new option for turning off rule memoization during
1632   backtracking:
1633
1634   -nomemo        when backtracking don't generate memoization code
1635
1636 * Predicates are now tested in order that you specify the alts.  If you
1637   leave the last alt "naked" (w/o pred), it will assume a true pred rather
1638   than union of other preds.
1639
1640 * Added gated predicates "{p}?=>" that literally turn off a production whereas
1641 disambiguating predicates are only hoisted into the predictor when syntax alone
1642 is not sufficient to uniquely predict alternatives.
1643
1644 A : {p}?  => "a" ;
1645 B : {!p}? => ("a"|"b")+ ;
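
  The difference between the two kinds of predicate can be sketched in plain
  Java. This is a toy predictor, not ANTLR's analysis or generated code; the
  Alt class, its fields, and predict() are hypothetical names. Each
  alternative lists the characters it can start with. A gated predicate
  removes the alternative outright when false; a disambiguating predicate is
  consulted only when more than one alternative is still viable, and then in
  the order the alternatives were specified. A "naked" alternative (no
  predicate) counts as true.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

// Toy model of alternative prediction; all names here are made up for illustration.
class PredicateSketch {
    static final class Alt {
        final String name;
        final String startChars;     // characters this alternative can begin with
        final BooleanSupplier pred;  // null means a "naked" alternative
        final boolean gated;         // true for {p}?=>, false for {p}?

        Alt(String name, String startChars, BooleanSupplier pred, boolean gated) {
            this.name = name;
            this.startChars = startChars;
            this.pred = pred;
            this.gated = gated;
        }

        boolean predHolds() {
            return pred == null || pred.getAsBoolean();
        }
    }

    // Returns the name of the predicted alternative, or null if none applies.
    static String predict(List<Alt> alts, char c) {
        List<Alt> viable = new ArrayList<>();
        for (Alt alt : alts) {
            if (alt.startChars.indexOf(c) < 0) {
                continue; // syntax alone rules this alternative out
            }
            if (alt.gated && !alt.predHolds()) {
                continue; // gated predicate is false: the production is switched off
            }
            viable.add(alt);
        }
        if (viable.size() == 1) {
            return viable.get(0).name; // syntax decides; disambiguating preds are not tested
        }
        for (Alt alt : viable) {
            if (alt.predHolds()) {
                return alt.name; // first alternative, in grammar order, whose predicate holds
            }
        }
        return null;
    }
}
```

  With p true, the gated pair above cannot match "b" at all (B is off and A
  does not start with "b"), whereas the same rules written with plain {p}?
  predicates still pick B for "b" because syntax alone is enough there.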
1646
1647 * bug fixed related to predicates in predictor
1648 lexer grammar w;
1649 A : {p}? "a" ;
1650 B : {!p}? ("a"|"b")+ ;
1651 DFA is correct.  A state splits for input "a" on the pred.
1652 Generated code though was hosed.  No pred tests in prediction code!
1653 I added testLexerPreds() and others in TestSemanticPredicateEvaluation.java
1654
1655 * added execAction template in case we want to do something in front of
1656   each action execution or something.
1657
1658 * left-recursive cycles from rules w/o decisions were not detected.
1659
1660 * undefined lexer rules were not announced! fixed.
1661
1662 * unreachable messages for Tokens rule now indicate rule name not alt. E.g.,
1663
1664   Ruby.lexer.g:24:1: The following token definitions are unreachable: IVAR
1665
1666 * nondeterminism warnings improved for Tokens rule:
1667
1668 Ruby.lexer.g:10:1: Multiple token rules can match input such as ""0".."9"": INT, FLOAT
1669 As a result, tokens(s) FLOAT were disabled for that input
1670
1671
1672 * DOT diagrams didn't show escaped char properly.
1673
1674 * Char/string literals are now all 'abc' not "abc".
1675
1676 * action syntax changed "@scope::actionname {action}" where scope defaults
1677   to "parser" if parser grammar or combined grammar, "lexer" if lexer grammar,
1678   and "treeparser" if tree grammar.  The code generation targets decide
1679   what scopes are available.  Each "scope" yields a hashtable for use in
1680   the output templates.  The scopes full of actions are sent to all output
1681   file templates (currently headerFile and outputFile) as attribute actions.
1682   Then you can reference <actions.scope> to get the map of actions associated
1683   with scope and <actions.parser.header> to get the parser's header action
1684   for example.  This should be very flexible.  The target should only have
1685   to define which scopes are valid, but the action names should be variable
1686   so we don't have to recompile ANTLR to add actions to code gen templates.
1687
1688   grammar T;
1689   options {language=Java;}
1690   @header { package foo; }
1691   @parser::stuff { int i; } // names within scope not checked; target dependent
1692   @members { int i; }
1693   @lexer::header {head}
1694   @lexer::members { int j; }
1695   @headerfile::blort {...} // error: this target doesn't have headerfile
1696   @treeparser::members {...} // error: this is not a tree parser
1697   a
1698   @init {int i;}
1699     : ID
1700     ;
1701   ID : 'a'..'z';
1702
1703   For now, the Java target uses members and header as valid names.  Within a
1704   rule, the init action name is valid.
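
  As a rough picture of the "hashtable per scope" idea, here is a small Java
  sketch. It does not reproduce ANTLR's internal classes; ActionScopeSketch,
  define(), and lookup() are hypothetical names, and the grammar-type strings
  passed to the constructor are an assumption of this example. It splits
  "scope::name", falls back to the default scope for the grammar type, and
  stores the action text in a nested map, which is roughly what a template
  reference such as <actions.parser.header> would walk.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: a nested map of scope -> (action name -> action text).
class ActionScopeSketch {
    final Map<String, Map<String, String>> actions = new LinkedHashMap<>();
    final String defaultScope;

    // grammarType is assumed to be "lexer", "tree", "parser", or "combined".
    ActionScopeSketch(String grammarType) {
        String scope = "parser"; // parser and combined grammars default to "parser"
        if ("lexer".equals(grammarType)) {
            scope = "lexer";
        } else if ("tree".equals(grammarType)) {
            scope = "treeparser";
        }
        this.defaultScope = scope;
    }

    // Accepts "name" or "scope::name", as in @header {...} or @lexer::members {...}.
    void define(String qualifiedName, String text) {
        String scope = defaultScope;
        String name = qualifiedName;
        int sep = qualifiedName.indexOf("::");
        if (sep >= 0) {
            scope = qualifiedName.substring(0, sep);
            name = qualifiedName.substring(sep + 2);
        }
        actions.computeIfAbsent(scope, k -> new LinkedHashMap<>()).put(name, text);
    }

    // Returns the action text, or null if the scope or the name is unknown.
    String lookup(String scope, String name) {
        Map<String, String> scoped = actions.get(scope);
        return scoped == null ? null : scoped.get(name);
    }
}
```

  Note that the sketch accepts any action name within a scope, matching the
  point above that names are target dependent; checking which scopes are
  valid would be the target's job and is left out here.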
1705
1706 * changed $dynamicscope.value to $dynamicscope::value even if value is defined
1707   in same rule such as $function::name where rule function defines name.
1708
1709 * $dynamicscope gets you the stack
1710
1711 * rule scopes go like this now:
1712
1713   rule
1714   scope {...}
1715   scope slist,Symbols;
1716         : ...
1717         ;
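
  One way to think about dynamic scopes is as a stack of scope objects, one
  pushed per rule invocation. The Java below is a sketch under that
  assumption, not the generated code; FunctionScope, functionStack, and the
  helper methods are hypothetical names. Reading $function::name corresponds
  to looking at the top of the stack, and $function by itself corresponds to
  the stack as a whole.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative model of a dynamically scoped attribute for a rule named "function".
class DynamicScopeSketch {
    static final class FunctionScope {
        String name; // the attribute declared in the rule's scope {...}
    }

    // Stands in for what $function refers to: the stack of active scopes.
    final Deque<FunctionScope> functionStack = new ArrayDeque<>();

    // Simulates invoking rule "function": push a scope, run the body, pop the scope.
    void function(String name, Runnable body) {
        FunctionScope scope = new FunctionScope();
        scope.name = name;
        functionStack.push(scope);
        try {
            body.run();
        } finally {
            functionStack.pop(); // the scope goes away when the rule returns
        }
    }

    // Stands in for $function::name as seen from any rule invoked below "function".
    String currentFunctionName() {
        return functionStack.peek().name;
    }
}
```

  Nested invocations see the innermost scope, and the outer value becomes
  visible again once the inner rule returns.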
1718
1719 * Created RuleReturnScope as a generic rule return value.  Makes it easier
1720   to do this:
1721     RuleReturnScope r = parser.program();
1722     System.out.println(r.getTemplate().toString());
1723
1724 * $template, $tree, $start, etc...
1725
1726 * $r.x in current rule.  $r is ignored as fully-qualified name. $r.start works too
1727
1728 * added warning about $r referring to both return value of rule and dynamic scope of rule
1729
1730 * integrated StringTemplate in a very simple manner
1731
1732 Syntax:
1733 -> template(arglist) "..."
1734 -> template(arglist) <<...>>
1735 -> namedTemplate(arglist)
1736 -> {free expression}
1737 -> // empty
1738
1739 Predicate syntax:
1740 a : A B -> {p1}? foo(a={$A.text})
1741         -> {p2}? foo(a={$B.text})
1742         -> // return nothing
1743
1744 An arg list is just a list of template attribute assignments to actions in curlies.
1745
1746 There is a setTemplateLib() method for you to use with named template rewrites.
1747
1748 Use a new option:
1749
1750 grammar t;
1751 options {output=template;}
1752 ...
1753
1754 This all should work for tree grammars too, but I'm still testing.
1755
1756 * fixed bugs where strings were improperly escaped in exceptions, comments, etc.  For example, newlines came out as newlines, not the escaped version.
1757
1758 3.0ea6 - November 13, 2005
1759
1760 * turned off -debug/-profile, which was on by default
1761
1762 * completely refactored the output templates; added some missing templates.
1763
1764 * dramatically improved infinite recursion error messages (actually
1765   left-recursion never even was printed out before).
1766
1767 * wasn't printing dangling state messages when it reanalyzes with k=1.
1768
1769 * fixed a nasty bug in the analysis engine dealing with infinite recursion.
1770   Spent all day thinking about it and cleaned up the code dramatically.
1771   Bug fixed and software is more powerful and I understand it better! :)
1772
1773 * improved verbose DFA nodes; organized by alt
1774
1775 * got much better random phrase generation.  For example:
1776
1777  $ java org.antlr.tool.RandomPhrase simple.g program
1778  int Ktcdn ';' method wh '(' ')' '{' return 5 ';' '}'
1779
1780 * empty rules like "a : ;" generated code that didn't compile due to
1781   try/catch for RecognitionException.  Generated code couldn't possibly
1782   throw that exception.
1783
1784 * when printing out a grammar, such as in comments in generated code,
1785   ANTLR didn't print ast suffix stuff back out for literals.
1786
1787 * This never exited loop:
1788   DATA : (options {greedy=false;}: .* '\n' )* '\n' '.' ;
1789   and now it works due to new default nongreedy .*  Also this works:
1790   DATA : (options {greedy=false;}: .* '\n' )* '.' ;
1791
1792 * Dot star ".*" syntax didn't work; in lexer it is nongreedy by
1793   default.  In parser it is on greedy but also k=1 by default.  Added
1794   unit tests.  Added blog entry to describe.
1795
1796 * ~T where T is the only token yielded an empty set but no error
1797
1798 * Used to generate unreachable message here:
1799
1800   parser grammar t;
1801   a : ID a
1802     | ID
1803     ;
1804
1805   z.g:3:11: The following alternatives are unreachable: 2
1806
1807   In fact it should really be an error; now it generates:
1808
1809   no start rule in grammar t (no rule can obviously be followed by EOF)
1810
1811   Per next change item, ANTLR cannot know that EOF follows rule 'a'.
1812
1813 * added error message indicating that ANTLR can't figure out what your
1814   start rule is.  Required to properly generate code in some cases.
1815
1816 * validating semantic predicates now work (if they are false, they
1817   throw a new FailedPredicateException)
1818
1819 * two hideous bug fixes in the IntervalSet, which made analysis go wrong
1820   in a few cases.  Thanks to Oliver Zeigermann for finding lots of bugs
1821   and making suggested fixes (including the next two items)!
1822
1823 * cyclic DFAs are now nonstatic and hence can access instance variables
1824
1825 * labels are now allowed on lexical elements (in the lexer)
1826
1827 * added some internal debugging options
1828
1829 * ~'a'* and ~('a')* were not working properly; refactored antlr.g grammar
1830
1831 3.0ea5 - July 5, 2005
1832
1833 * Using '\n' in a parser grammar resulted in a nonescaped version of '\n' in the token names table, making compilation fail.  I fixed this by reorganizing/cleaning up the portion of ANTLR that deals with literals.  See the comment in org.antlr.codegen.Target.
1834
1835 * Target.getMaxCharValue() did not use the appropriate max value constant.
1836
1837 * ALLCHAR was a constant when it should use the Target max value def.  set complement for wildcard also didn't use the Target def.  Generally cleaned up the max char value stuff.
1838
1839 * Code gen didn't deal with ASTLabelType properly...I think even the 3.0ea7 example tree parser was broken! :(
1840
1841 * Added a few more unit tests dealing with escaped literals
1842
1843 3.0ea4 - June 29, 2005
1844
1845 * tree parsers work; added CommonTreeNodeStream.  See simplecTreeParser
1846   example in examples-v3 tarball.
1847
1848 * added superClass and ASTLabelType options
1849
1850 * refactored Parser to have a BaseParser and added TreeParser
1851
1852 * bug fix: actions being dumped in description strings; compile errors
1853   resulted
1854
1855 3.0ea3 - June 23, 2005
1856
1857 Enhancements
1858
1859 * Automatic tree construction operators are in: ! ^ ^^
1860
1861 * Tree construction rewrite rules are in
1862         -> {pred1}? rewrite1
1863         -> {pred2}? rewrite2
1864         ...
1865         -> rewriteN
1866
1867   The rewrite rules may be elements like ID, expr, $label, {node expr}
1868   and trees ^( <root> <children> ).  You can have (...)?, (...)*, (...)+
1869   subrules as well.
1870
1871   You may have rewrites in subrules not just at outer level of rule, but
1872   any -> rewrite forces auto AST construction off for that alternative
1873   of that rule.
1874
1875   To avoid cycles, copy semantics are used:
1876
1877   r : INT -> INT INT ;
1878
1879   means make two new nodes from the same INT token.
1880
1881   Repeated references to a rule element imply a copy for at least one
1882   tree:
1883
1884   a : atom -> ^(atom atom) ; // NOT CYCLE! (dup atom tree)
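
  The copy semantics can be pictured with a few lines of Java. This is a
  sketch with made-up Token and Node classes, not the runtime's tree classes;
  rewriteIntInt() and rewriteAtomAtom() are hypothetical helpers that mimic
  the two rewrites above. Each reference to a token makes a fresh node, and a
  repeated reference to a subtree duplicates it so the result never points
  back at itself.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative tree classes; these are not the ANTLR runtime types.
class CopySemanticsSketch {
    static final class Token {
        final String text;

        Token(String text) {
            this.text = text;
        }
    }

    static final class Node {
        final Token token;
        final List<Node> children = new ArrayList<>();

        Node(Token token) {
            this.token = token;
        }

        // Deep copy: new nodes that share the same underlying tokens.
        Node dupTree() {
            Node copy = new Node(token);
            for (Node child : children) {
                copy.children.add(child.dupTree());
            }
            return copy;
        }
    }

    // r : INT -> INT INT ;   two new nodes made from the same INT token
    static List<Node> rewriteIntInt(Token intToken) {
        List<Node> result = new ArrayList<>();
        result.add(new Node(intToken));
        result.add(new Node(intToken));
        return result;
    }

    // a : atom -> ^(atom atom) ;   the second reference is a duplicate, so no cycle
    static Node rewriteAtomAtom(Node atomTree) {
        Node duplicate = atomTree.dupTree(); // copy first, before the root gains a child
        atomTree.children.add(duplicate);
        return atomTree;
    }
}
```

  Without the duplicate, the second rewrite would add the atom tree as a child
  of itself, which is exactly the cycle the copy semantics are there to avoid.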
1885
1886 * $ruleLabel.tree refers to tree created by matching the labeled element.
1887
1888 * A description of the blocks/alts is generated as a comment in output code
1889
1890 * A timestamp / signature is put at top of each generated code file
1891
1892 3.0ea2 - June 12, 2005
1893
1894 Bug fixes
1895
1896 * Some error messages were missing the stackTrace parameter
1897
1898 * Removed the file locking mechanism as it's not cross platform
1899
1900 * Some absolute vs relative path name problems with writing output
1901   files.  Rules are now more concrete.  -o option takes precedence
1902   // -o /tmp /var/lib/t.g => /tmp/T.java
1903   // -o subdir/output /usr/lib/t.g => subdir/output/T.java
1904   // -o . /usr/lib/t.g => ./T.java
1905   // -o /tmp subdir/t.g => /tmp/subdir/T.java
1906   // If they didn't specify a -o dir, just write to the location
1907   // where grammar is, absolute or relative
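
  The rules in those comments can be sketched as a small Java function. This
  is my reading of the examples above, not a copy of the Tool's code;
  OutputDirSketch and outputDirectory() are hypothetical names, and the paths
  assume a Unix-style file system. It returns only the directory; the output
  file name itself comes from the grammar.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative only: picks the directory that generated files would be written to.
class OutputDirSketch {
    // outOption is the argument of -o, or null if -o was not given.
    static Path outputDirectory(String outOption, String grammarFile) {
        Path grammar = Paths.get(grammarFile);
        Path grammarDir = grammar.getParent(); // null when the grammar has no directory part
        if (outOption == null) {
            // No -o: write next to the grammar, absolute or relative.
            return grammarDir == null ? Paths.get(".") : grammarDir;
        }
        Path out = Paths.get(outOption);
        if (grammar.isAbsolute() || grammarDir == null) {
            // -o takes precedence over an absolute grammar location.
            return out;
        }
        // Relative grammar path: keep its subdirectory underneath the -o directory.
        return out.resolve(grammarDir);
    }
}
```

  For example, outputDirectory("/tmp", "/var/lib/t.g") gives /tmp, while
  outputDirectory("/tmp", "subdir/t.g") gives /tmp/subdir.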
1908
1909 * does error checking on unknown option names now
1910
1911 * Using just language code not locale name for error message file.  I.e.,
1912   the default (and for any English speaking locale) is en.stg not en_US.stg
1913   anymore.
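
  In Java terms the change amounts to building the file name from the
  language code alone. A minimal sketch, where MessageFileSketch and
  messageFileName() are hypothetical names and only java.util.Locale is a
  real API:

```java
import java.util.Locale;

// Illustrative only: derives the message file name from a locale's language code.
class MessageFileSketch {
    static String messageFileName(Locale locale) {
        // getLanguage() drops the country part, so en_US and en_GB both map to "en".
        return locale.getLanguage() + ".stg";
    }
}
```

  Both Locale.US and Locale.UK therefore resolve to en.stg.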
1914
1915 * The error manager now asks the Tool to panic rather than simply doing
1916   a System.exit().
1917
1918 * Lots of refactoring concerning grammar, rule, subrule options.  Now
1919   detects invalid options.
1920
1921 3.0ea1 - June 1, 2005
1922
1923 Initial early access release
1924