This corrects a minor mistake I made earlier because I did not yet understand
the full generality of the option syntax.
Also fixes some minor markup errors in the manual.
This change was not strictly necessary to sever a preprocessor
dependency nor make the API uniform across both C and C++. But it
cried out to be made, because now *all* the rule hooks are in the yy*
namespace. This makes the API easier to document and remember.
unput() is left in place as a compatibility macro for existing
users, but only documented as a legacy interface. The "unput"
variants of switches and options have also been retained.
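
A minimal sketch of what "left in place as a compatibility macro" can look like. The pushback buffer, its size, the one-argument yyunput() signature, and next_pushed_back() are all hypothetical stand-ins for illustration; the signature of the real generated hook is not shown in this message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the scanner's pushback storage. */
#define PUSHBACK_MAX 16
static char pushback[PUSHBACK_MAX];
static size_t pushback_len = 0;

/* The yy-namespaced hook: push one character back onto the input.
 * Simplified signature, assumed for this sketch only. */
static void yyunput(int c)
{
    assert(pushback_len < PUSHBACK_MAX);
    pushback[pushback_len++] = (char)c;
}

/* Legacy spelling, kept only so existing rule actions still compile. */
#define unput(c) yyunput(c)

/* Hypothetical reader: return the most recently pushed character, or -1. */
static int next_pushed_back(void)
{
    return pushback_len > 0 ? (unsigned char)pushback[--pushback_len] : -1;
}
```

Both spellings reach the same function, so only the yy-prefixed name needs to be documented as the primary interface.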
...leaving an "input" macro in place for legacy compatibility.
input() had already become yyinput() in the C++ back end in order to
avoid collision with predefined C++ input. In a multi-language world,
this is good policy in general. There's no real reason for C to
be different, and excellent reason to pull all possible entry
points into the yy namespace.
Non-C/C++ back ends won't have macros, so the documentation should
treat macro-ness as an implementation detail when it has to be
mentioned at all - and usually it doesn't.
The key thing about rule hooks like yyless(), yymore() etc. isn't that
they're macros, it's that they can only be used in rule actions (i.e.
inside the body of the generated yylex code).
This documentation patch removes the term "macro" where it isn't needed.
...rather than splicing a bunch of exposed guts into the middle of
yylex(). yyread() is put in the set of functions that gets
prefix-modified.
This means buffer refill can be documented without C-specific
references to YY_INPUT.
It should also enable actually having a non-macro replacement
for YY_INPUT, with a bit more work.
No specific thing can be said about a non-C/C++ backend yet, but
this patch prepares the way by explaining which features and aspects
of the Flex interface are specific to C/C++.
It also fixes one pre-ANSI prototype - that of non-reentrant yylex(),
which should be declared yylex(void) in this day and age.
These changes make one new commitment. Observing that the YY_INPUT
macro is impossible to port out of the C/C++ context, I have noted that
it is probably extinct in the wild (due to the later introduction of
the multi-buffer primitives, though I don't say that explicitly). The
text says that people who really need the equivalent of this capability
in a non-C/C++ back end should file an issue with the Flex maintainers.
I don't actually expect this to happen.
This patch implements and documents a yyreject() macro to replace
argumentless REJECT. It does not remove REJECT, but warns that this
macro will not be supported in non-C languages and deprecates it.
This commit begins a new appendix in the Flex manual, to list
deprecated interfaces and explain why they have been superseded.
Flex has a strategy of packing its arrays with int16 or int32 depending
on length, but it wasn't applied consistently. While I don't think this
kind of space optimization matters a lot in 2020, if we're going to do it at
all we should do it thoroughly.
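
A rough sketch of the width-selection idea, under the assumption that the rule is "use 16-bit entries unless the table is long enough that an entry indexing into it could overflow 16 bits". The helper names and the exact threshold are hypothetical, not taken from the Flex sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical rule: entries index into the table, so a table shorter than
 * INT16_MAX can hold them in 16 bits; anything longer gets 32. */
static int table_elem_bits(size_t table_len)
{
    return table_len < (size_t)INT16_MAX ? 16 : 32;
}

/* Bytes the packed table occupies under that rule. */
static size_t table_bytes(size_t table_len)
{
    assert(table_len <= SIZE_MAX / 4);   /* guard the multiplication */
    return table_len * (size_t)(table_elem_bits(table_len) / 8);
}
```

Applying it thoroughly then means every table emitter goes through one such helper instead of hard-coding a width.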
No need for it, since the skel content is in core
and the relevant hook can be searched for.
This is a postscript to the retargeting series. It's
not necessary, but it improves the code slightly.
It will doubtless need expansion and revision when we actually
write one.
No diffs in generated test code.
#70 and last in the retargeting patch series
Everything Flex ships to the skeleton-file expansion phase is
now either a macro expansion or a macro call. This almost finishes
the retargeting patch series; the wrapup will be documentation.
Sadly, this does not get us *all* the way to target-syntax independence.
The problem is the inclusion of tables_shared.c when table serialization is
enabled. Which means table serialization is not practical to support
outside the C/C++ back end.
No diffs in generated test code from this commit.
#69 in the retargeting patch series
Produces a tediously large diff in generated test code that
is all table moving around. This is due to them being shipped
as macros and being substituted in a fixed order determined
by the calls in the skel file, rather than being generated as
the functions originally emitting the tables are called.
#68 in the retargeting patch series
What it used to do is now handled entirely by macro conditionals.
Besides being a good complexity reduction in itself, this is one
of the last steps in turning C backend methods into macro deliveries.
Order of the yydmap table is perturbed. No other non-whitespace
diffs and no logic changes.
#67 in the retargeting patch series
This commit collects several minor changes:
* Fix a minor type specification bug in a tablesext initializer.
* Macroize the trans_offset, mkctbl, and mkftbl methods.
* Fix a bug in footprint computation.
This commit produces no code diffs in the generated test code, but the
footprint reports change due to the bug fix.
#66 in the retargeting patch series
...to have an indent style uniform with the rest of the code,
and one that makes it easier not to miss the trailing table delimiters.
Not all tables are generated this way yet. I'm working on it.
Is isolated in its own commit so the format change can't confuse a
reviewer's eyeballs out of noticing real mutations in the table data.
#65 in the retargeting patch series
Also macro-generate yydmap entry for the yymeta table.
We're now about 75% of the way through pushing all
C syntax out of the method table.
Permutes table order in the generated code.
#64 in the retargeting patch series
This required adding a new 0.0 breakpoint right after the
M4_HOOK_* definitions so they will be visible early.
Produces no diffs in generated test code.
#62 in the retargeting patch series
They were: geneol, fulltable, eecs, and debug.
To accomplish this, dataend's emission of trailing } needed to be
suppressable.
Also, remove a %% mark that is no longer required.
This doesn't change any of the generated tables, but does change the
order in which they're generated, producing large diffs in the
generated test code that don't actually mean anything. The reason for
this is that tables used to come out in a variable order as functions
like geneecs were called at variable times depending on the
compression mode. Now, instead, the order is fixed by where the
table-body macros these functions define are expanded.
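
To illustrate why the order stops depending on call order, here is a small sketch. It assumes a skeleton with three expansion points named after the yy_accept, yy_ec, and yy_meta tables; the struct, the slot list, and expand_in_skel_order() are hypothetical and only model the idea, not the real m4 machinery.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One table-body macro: a name and the table text it carries. */
struct table_macro {
    const char *name;
    const char *body;
};

/* Expansion points, in the order they appear in the (hypothetical) skeleton. */
static const char *const skel_slots[] = { "yy_accept", "yy_ec", "yy_meta" };
#define N_SLOTS (sizeof skel_slots / sizeof skel_slots[0])

/* Fill out[] with the bodies in skeleton order, whatever order the macros
 * were defined in. Slots with no definition get NULL. Returns slots filled. */
static size_t expand_in_skel_order(const struct table_macro *defs, size_t ndefs,
                                   const char *out[N_SLOTS])
{
    size_t found = 0;

    assert(defs != NULL || ndefs == 0);
    for (size_t s = 0; s < N_SLOTS; s++) {
        out[s] = NULL;
        for (size_t d = 0; d < ndefs; d++) {
            if (strcmp(defs[d].name, skel_slots[s]) == 0) {
                out[s] = defs[d].body;
                found++;
                break;
            }
        }
    }
    return found;
}
```

Defining the macros in any order yields the same output order, which is the property that makes the one-time diff noise worth it.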
More methods remain to be turned into macro generators.
#61 in the retargeting patch series, following an unnumbered
bugfix patch that I shipped in too much of a hurry.
Do this for table opener/closer/continuation syntax, the trace-format
string, the state entry string, constant definitions, the state-dyad
format, and the three pieces of EOF state syntax. The documentation
appendix on how to write a back end is also updated.
There are comment diffs because I decided generating an
explicit fallthrough marker and some other new explanatory comments
was a good idea.
#58 in the retargeting patch series
All symbols except a handful dependent on nultrans and the number of
backups are now written in one visible group right at the start of m4
generation. The exceptions are exceptions because their values
are not known until after DFA computation.
Has comment diffs in generated test code due to one symbol rename and
symbols becoming visible. Should be the last time the latter happens.
#57 in the retargeting patch series
Includes handling of --nounistd, --always_interactive, --never_interactive, --stack,
their corresponding lexer items, and noinput.
An unavoidable side effect is that the place where "#define
YY_NO_INPUT 1" is inserted, if it's inserted, has to move because it's
done by a different route - m4 expansion rather than the action_define
function (which is now gone - this was the last use). I have put the
new insertion point just in time for the first reference to the macro.
Otherwise the only diffs in generated test code are symbol
definitions becoming visible.
#56 in the retargeting patch series
Formerly, Flex's own lexer and the logic for processing command-line
options both did calls to write M4 conditionals to a buffer that was
later dumped into the beginning of the text that m4 expands, before
the body of the skel file.
This was bad layering. Instead, both these places now set flags in
the ctrl structure. Later, (almost) all the generated m4 conditionals
are shipped at once.
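
A minimal sketch of the "set flags now, ship conditionals later" layering. The struct members and ship_m4_conditionals() are hypothetical, and the m4_define([[...]]) output form is an assumption about the quoting convention; only the two symbol names come from this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical subset of the option flags; the real structure has many more. */
struct ctrl_flags {
    bool reentrant;
    bool tables_verify;
};

/* Write one m4 definition per set flag into out, all in one pass.
 * Returns the number of definitions shipped. */
static int ship_m4_conditionals(const struct ctrl_flags *c, char *out, size_t outsz)
{
    const struct { bool on; const char *sym; } map[] = {
        { c->reentrant,     "M4_YY_REENTRANT" },
        { c->tables_verify, "M4_YY_TABLES_VERIFY" },
    };
    size_t used = 0;
    int shipped = 0;

    assert(out != NULL && outsz > 0);
    out[0] = '\0';
    for (size_t i = 0; i < sizeof map / sizeof map[0]; i++) {
        if (!map[i].on)
            continue;
        int w = snprintf(out + used, outsz - used,
                         "m4_define([[%s]], [[1]])\n", map[i].sym);
        if (w < 0 || (size_t)w >= outsz - used) {
            out[used] = '\0';   /* out of room: drop the partial line */
            break;
        }
        used += (size_t)w;
        shipped++;
    }
    return shipped;
}
```

The lexer and the option parser only ever touch the flags; nothing but this one function knows what an m4 definition looks like.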
It's "almost" because there are a couple of awkward cases to be
cleaned up. Again, this was the part that could be done
simply via almost mechanical cut and paste.
In generated code, there are some comment diffs because symbols that
used to be invisibly set are now visibly set - that is, shown at the
beginning of the generated C.
#55 in the retargeting patch series
This is separate from the big reorganization in commit #52 because
there's a comment about this variable in flexdef.h that makes me
nervous. According to the comment this variable is a trit, but
it looks to me like flexinit sets it to false and I can't find
anywhere in the code that sets it to a non-boolean value.
This commit assumes that the comment is stale and the member
can be typed boolean. Should be audited.
As I was working on some layer separation, I realized that I
was getting confused a lot by the huge pile of globals that
control this program.
In particular, I need to be able to clearly distinguish those that set
m4 conditional symbols from those that don't. So I've done something
about it. Almost all globals that can be set by options are now
bundled into two context structures, "ctrl" for options that have
corresponding m4 symbols and "env" for options that don't.
The few I haven't moved have sufficiently tricky interdependencies
that I'm going to break out any changes related to them into smaller
patches that can be easier to review. In this one I did only the bulk
of straightforward changes that could be done mechanically with search
and replace.
I changed one variable name to reflect its semantics better;
the performance_report global is now env.performance_hint.
Ideally there ought to be a third structure that bundles all the
shared state used by DFA/NDFSA table computation, so all globals would
live in one of three context structures. I may do that in a later
commit, but this patch is already unpleasantly large as it is.
No diffs in generated test code, nor any logic changes.
#52 in the retargeting patch series.
There were only two left, for YY_MAIN, and that definition
was moved so it's in the visible controls.
This is a step towards making *all* conditionalization symbols
visible in generated comments.
This commit also cleans up some misnamed mode symbols. There are
still a couple of duplicative pairs, to be cleaned up in a later
commit.
We can now report generated M4 symbols with values in the "m4
controls" part of a generated file. Partly as a result, the following
symbols become visible in generated code from the tests:
M4_MODE_PREFIX, M4_YY_TABLES_VERIFY, M4_YY_REENTRANT, and
M4_MODE_PREFIX.
No other diffs.
#51 in the retargeting patch series. #50 was accidentally
unnumbered.
It was a no-op anyway in the C version, there as a placeholder
in case other languages needed it. But in the new organization
of things, with everything being done by conditional expansion in
the skeleton file, there's no point.
No diffs at all in generated test code.
This does remove some code that was conditioned out, an abandoned
attempt to undefine all #defines at the end of code generation.
Now that all the mode conditionals are visible early, everything that
used to be done in the prolog can be done as conditionalized code in the
skeleton.
Whitespace and comment diffs only.
#49 in the retargeting patch series
Also, clean up some unused and duplicative symbols.
In generated test code, comment and whitespace diffs only
except for YY_INT_ALIGNED going away.
#48 in the retargeting patch series
The point of this change is to move the setting of the M4_MODE_*
controls up to the front of the generated code so that they can be
used for conditionalization earlier, notably in replacing the prolog
method. I tried to do this in #46 but didn't move the mode
setting far enough up.
(Also, rename instances of a duplicated mode switch.)
In generated code, the m4 controls move but nothing else changes.
#47 in the retargeting patch series
This should make it possible to eliminate much of the C-specific
prolog code.
Sadly, because of the moves of the generated comments this makes
a rather noisy diff. All comments and whitespace, though;
what looks like being other than that is pieces of generated code
being shifted around.
#46 in the retargeting patch series
Produces only whitespace diffs in generated code for tests, except the
order of items in the initializer for table serialization changes.
#45 in the retargeting patch series
This feature is better implemented with m4 macroexpansion;
that way skelout() does not have to know that #define is a thing.
Also in skelout(), use the backend comment method rather than
embedding knowledge about /* and */, and int_format_define
to factor out knowledge about #define.
Produces only comment diffs in the generated test code.
#44 in the retargeting patch series
This patch is a pure refactoring step. It changes the
interface between gen.c and the back end so that the
method table can shed a number of methods and no headers
are generated in gen.c any more.
Most methods now return the amount of memory they
allocate. Eventually this will be used to add
a report on this to the generated code.
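
A sketch of the "methods return what they allocate" convention, assuming a back-end method that emits a 16-bit table. The function name and the output layout are hypothetical; flex_int16_t is used only as the element type name in the emitted text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical back-end method: emit a table declaration and return the
 * number of bytes the table will occupy in the generated scanner. */
static size_t emit_int16_table(FILE *out, const char *name,
                               const int16_t *data, size_t n)
{
    assert(out != NULL && name != NULL);
    fprintf(out, "static const flex_int16_t %s[%zu] = {", name, n);
    for (size_t i = 0; i < n; i++)
        fprintf(out, "%s%d", i ? ", " : " ", data[i]);
    fprintf(out, " };\n");
    return n * sizeof(int16_t);
}
```

The caller can then sum the return values across all emitters to get a total footprint without any emitter having to know about the report.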
No diffs in generated code, even without ignoring whitespace.
#43 in the retargeting patch series, which turned out
not to be finished after all. There is ugly magic in skelout()
that needs to be factored out.