The Undecidability of Aliasing

G. RAMALINGAM
IBM T. J. Watson Research Center
Alias analysis is a prerequisite for performing most of the common program analyses such as reaching-definitions analysis or live-variables analysis. Landi [1992] recently established that it is impossible to compute statically precise alias information (either may-alias or must-alias) in languages with if statements, loops, dynamic storage, and recursive data structures: more precisely, he showed that the may-alias relation is not recursive, while the must-alias relation is not even recursively enumerable. This article presents simpler proofs of the same results.

Categories and Subject Descriptors: D.3.4 [Programming Languages]: Processors - compilers; optimization; F.4.1 [Mathematical Logic]: Computability Theory; F.4.3 [Formal Languages]: Decision Problems
General Terms: Languages, Theory
Additional Key Words and Phrases: Alias analysis, pointer analysis
1. INTRODUCTION

Compilers and various other programming tools make good use of static program analysis. To solve most program analysis problems, such as the problem of determining live variables, requires alias information: information about whether two L-valued expressions may or must have the same value at some program point. Informally, two names or L-valued expressions are said to alias each other at a particular point during program execution if both refer to the same location. In the may-alias problem, one is interested in identifying aliases that can occur during some execution of the program, while in the must-alias problem, one is interested in identifying aliases that occur in all executions of the program. Obviously, such information is relevant to most dataflow analysis problems.

Program analysis is commonly performed under the simplifying assumption that all paths in a program are executable, since deciding whether an arbitrary path is executable is undecidable. This conservative assumption makes it possible to solve a number of program analysis problems. Unfortunately, even this assumption is not sufficient to make the may-alias or must-alias problem decidable.

Author's address: IBM T. J. Watson Research Center, P.O. Box 704, Yorktown Heights, NY 10598.
Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
© 1994 ACM 0164-0925/94/0900-1467 $03.50
ACM Transactions on Programming Languages and Systems, Vol. 16, No. 5, September 1994, Pages 1467-1471.
Names a and b are said to may-alias each other at a program point if there exists a path P from the program entry to that program point such that a and b refer to the same location after execution along path P. Names a and b are said to must-alias each other at a program point if, for all paths P from the beginning of the program to the program point, a and b both refer to the same location after execution along path P.

Landi [1992] recently established that even the simpler intraprocedural versions of the may-alias and must-alias problems are undecidable for languages that permit recursive data structures to be built. Here, we present a simpler proof of the same result. We establish the undecidability of the may-alias problem by reducing Post's Correspondence Problem, or PCP, to it. Section 2 of the article presents proofs of the undecidability results, while Section 3 discusses related work.

2. THE UNDECIDABILITY OF ALIASING

The decision version of the may-alias problem is the following: given a program point in a program and two names, decide if the may-alias relation holds between the given names at the given program point. More generally, we are interested in generating the set of all may-alias facts that hold true. This set can be made finite by restricting attention to the names and L-valued expressions that occur in the program.

Definition 2.1. Post's Correspondence Problem, or PCP, is the following: Given arbitrary lists A and B of r strings each in $\{0, 1\}^+$, say

$$A = w_1, w_2, \ldots, w_r \quad \text{and} \quad B = z_1, z_2, \ldots, z_r,$$

does there exist a nonempty sequence of integers $i_1, i_2, \ldots, i_k$ such that

$$w_{i_1} w_{i_2} \cdots w_{i_k} = z_{i_1} z_{i_2} \cdots z_{i_k}?$$

THEOREM 2.2. PCP is undecidable [Hopcroft and Ullman 1979].

THEOREM 2.3. The intraprocedural may-alias problem is undecidable for languages with if statements, loops, dynamic storage, and recursive data structures.
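As an aside on Definition 2.1, the following is a minimal sketch in C of what it means for an index sequence to be a PCP solution. The function name `pcp_is_solution` and its parameters are hypothetical and do not appear in the article. Checking one candidate sequence is easy; what is undecidable is whether any such sequence exists.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the article): returns 1 if the 1-based
 * index sequence idx[0..k-1] is a solution of the PCP instance (w, z)
 * with r strings per list, i.e. if k >= 1 and
 *   w[idx[0]] ... w[idx[k-1]] == z[idx[0]] ... z[idx[k-1]].
 * Returns 0 otherwise. */
int pcp_is_solution(const char *const w[], const char *const z[], size_t r,
                    const size_t idx[], size_t k)
{
    if (k == 0)
        return 0;                      /* the sequence must be nonempty */
    for (size_t j = 0; j < k; j++)
        if (idx[j] < 1 || idx[j] > r)
            return 0;                  /* index outside 1..r */

    /* Walk both concatenations character by character without
     * materializing them: (wj, wp) and (zj, zp) are cursors giving the
     * current string in the sequence and the position inside it. */
    size_t wj = 0, wp = 0, zj = 0, zp = 0;
    for (;;) {
        while (wj < k && w[idx[wj] - 1][wp] == '\0') { wj++; wp = 0; }
        while (zj < k && z[idx[zj] - 1][zp] == '\0') { zj++; zp = 0; }
        if (wj == k || zj == k)
            return wj == k && zj == k; /* equal iff both are exhausted */
        if (w[idx[wj] - 1][wp] != z[idx[zj] - 1][zp])
            return 0;                  /* first mismatching character */
        wp++;
        zp++;
    }
}
```

For example, with A = 1, 10111, 10 and B = 111, 10, 0, the sequence 2, 1, 1, 3 is accepted, because both concatenations spell 101111110.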
PROOF. We relate PCP to the may-alias problem as follows. Consider a binary tree with root root. A binary string consisting of 0s and 1s can be interpreted as representing a path from the root of the binary tree, with 0 representing a left branch and 1 a right branch. Define branch(0) to be left and branch(1) to be right. For any binary string $b_0 b_1 \cdots b_n$, define path($b_0 b_1 \cdots b_n$) to be branch($b_0$) -> branch($b_1$) -> ... -> branch($b_n$). Let α and β be two binary strings. Then, α = β iff root -> path(α) and root -> path(β) refer to the same node in the binary tree. Essentially, this is the idea behind our reduction of PCP to the may-alias problem. Given an instance of PCP, we construct the program in Figure 1. The program is written in C, but it can be written in any language with if statements, loops, dynamic storage, and recursive data structures. The may-alias relation holds between node and *(q -> left) at line 39 iff the given instance of PCP has a solution, as explained below.

Ignore, for the moment, lines 7 through 19, and assume that root points to a binary tree at line 20. There are r different branches inside the loop of lines 22 through 35; number these branches 1 through r. Any path P from line 20 to line 36 that iterates through the loop (lines 22 through 35) t times corresponds to a sequence σ of t integers, where the jth element in the sequence denotes the branch taken during the jth time through the loop. Furthermore, *p and *q will alias each other at the end of path P iff the sequence σ is a solution to the given PCP instance, provided that root pointed to a "sufficiently large" binary tree at line 20.

Instead of actually constructing a binary tree, we use the code in lines 9 through 19 to "generate" all possible paths through a binary tree. In doing this, the pointer fields of newly allocated tree nodes are not initialized to a null pointer as might be done usually. Instead, these fields are initialized to point to a special node called undefined, whose left and right fields are initialized to point to itself. This ensures that every possible path from program beginning to line 36 can be executed without raising any errors such as dereferencing a null pointer. Consequently, at line 36, either pointer p has a "proper value" and points to some node allocated in line 12 or 15, or pointer p points to the node undefined. The same claim holds true for pointer q. Consequently, the given instance of PCP has a solution iff there exists some execution path to line 36, at the end of which p = q ≠ &undefined. Checking for this condition can be converted into checking for a may-alias fact using line 38. Obviously, the may-alias relation holds between node and *(q -> left) at line 39 iff the given instance of PCP has a solution. □

The may-alias relation is recursively enumerable because we can enumerate all the paths in the program, and we can determine the aliases that hold after execution along any given path. The must-alias relation, however, is not even recursively enumerable.

THEOREM 2.4. The intraprocedural must-alias problem is undecidable for languages with if statements, loops, dynamic storage, and recursive data structures. The intraprocedural must-alias relation is not even recursively enumerable.

PROOF. The undecidability of the must-alias problem follows immediately from the undecidability of the may-alias problem. Consider Figure 1. Line 40 shows how must-alias information can be used to compute may-alias information. The must-alias relation holds between node and *(node.left) in line 41 iff the may-alias relation does not hold between node and *(q -> left) in line 39. The complement of a recursively enumerable but nonrecursive set is not recursively enumerable. It follows that the must-alias relation is not even recursively enumerable. □

     [1]  main() {
     [2]    int i;
     [3]    struct tree_node {
     [4]      int value;
     [5]      struct tree_node *left, *right;
     [6]    } *root, *p, *q, *t, node, undefined;
     [7]    undefined.left = &undefined; undefined.right = &undefined;
     [8]    root = malloc(sizeof(struct tree_node));
     [9]    root->left = &undefined; root->right = &undefined;
    [10]    t = root;
    [11]    while (...) {
    [12]      if (...) { t->right = malloc(sizeof(struct tree_node));
    [13]        t = t->right;
    [14]      } else {
    [15]        t->left = malloc(sizeof(struct tree_node));
    [16]        t = t->left;
    [17]      }
    [18]      t->left = &undefined; t->right = &undefined;
    [19]    }
    [20]    p = root; q = root;
    [21]    do {
    [22]      if (i == 1) {
    [23]        p = p->path(w_1);
    [24]        q = q->path(z_1);
    [25]      } else if (i == 2) {
    [26]        p = p->path(w_2);
    [27]        q = q->path(z_2);
    [28]      } else if (i == 3) {
    [29]        ...
    [30]      } else if (i == r-1) {
    [31]        p = p->path(w_{r-1}); q = q->path(z_{r-1});
    [32]      } else {
    [33]        p = p->path(w_r);
    [34]        q = q->path(z_r);
    [35]      }
    [36]    } while (...);
    [37]    /* The given PCP instance has an affirmative answer iff there exists some execution path to this point after which p = q ≠ &undefined */
    [38]    q->left = &node; undefined.left = &undefined;
    [39]    /* The given PCP instance has an affirmative answer iff *(q->left) may-alias node at this point */
    [40]    node.left = &node; q->left->left = &undefined;
    [41]    /* node must-alias *(node.left) here iff not *(q->left) may-alias node in line 39 */
    [42]  }

Fig. 1. The program corresponding to an instance $w_1, \ldots, w_r, z_1, \ldots, z_r$ of the PCP problem. Note that p->path(α) is an abbreviation for a sequence of dereferences through the left and right fields as determined by the binary string α.

3. RELATED WORK

Kam and Ullman [1977] established that the problem of computing the meet-over-all-paths or MOP solution to a monotonic dataflow analysis framework, the monotone MOP problem, is undecidable by reducing a modified version of
PCP to the monotone MOP problem. The proof presented in this article is similar to the proof of Kam and Ullman. However, as Kam and Ullman observe, their result says only that there is no algorithm that solves every monotonic dataflow analysis problem. It does not rule out the existence of algorithms for a specific monotonic dataflow analysis problem, such as the may-alias problem. In other words, the meet-over-all-paths problem for arbitrary monotonic dataflow analysis frameworks is more general than the may-alias problem. Consequently, the undecidability of the latter problem is a stronger result than the undecidability of the former problem.

Larus [1988; 1989] showed that alias analysis was NP-hard in languages with recursive data structures. Landi [1992] presented the first proof that the may-alias problem is not recursive and that the must-alias problem is not even recursively enumerable. He established these results by reducing the halting problem to these problems, and this article presents simpler proofs for the same results.

In the absence of recursively defined data structures, the various aliasing problems become decidable, but remain difficult. Myers [1981] showed that many interprocedural static-analysis problems are NP-complete. Refer to Landi's thesis [1991] for a comprehensive classification of the complexity of various restricted versions of the aliasing problems.

Not surprisingly, the problem of computing conservative approximations to the may-alias and must-alias relations in the presence of pointers has attracted and continues to attract much attention. Pfeiffer's thesis [1991] presents a comprehensive overview of this area. Refer to Landi and Ryder [1992] and Choi et al. [1993] for more recent work in this area.
ACKNOWLEDGMENTS

The author thanks the anonymous referees, Charles Fischer, and William Landi for their comments, which helped improve this article and made the proofs more precise.
REFERENCES

CHOI, J.-D., BURKE, M. G., AND CARINI, P. 1993. Efficient flow-sensitive interprocedural computation of pointer-induced aliases and side effects. In Conference Record of the 20th ACM Symposium on Principles of Programming Languages (Charleston, S. Carolina). ACM, New York, 232-245.
HOPCROFT, J. E. AND ULLMAN, J. D. 1979. Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, Reading, Mass.
KAM, J. B. AND ULLMAN, J. D. 1977. Monotone data flow analysis frameworks. Acta Informatica 7, 305-317.
LANDI, W. 1991. Interprocedural aliasing in the presence of pointers. Ph.D. dissertation, Dept. of Computer Science, Rutgers Univ., New Brunswick, N.J.
LANDI, W. 1992. Undecidability of static analysis. ACM Lett. Program. Lang. Syst. 1, 4 (Dec.).
LANDI, W. AND RYDER, B. G. 1992. A safe approximate algorithm for pointer-induced aliasing. SIGPLAN Not. 27, 7 (July), 235-248.
LARUS, J. R. 1989. Restructuring symbolic programs for concurrent execution on multiprocessors. Ph.D. thesis, Univ. of California, Berkeley, Calif. (May).
LARUS, J. R. AND HILFINGER, P. N. 1988. Detecting conflicts between structure accesses. SIGPLAN Not. 23, 7 (July), 21-34.
MYERS, E. 1981. A precise inter-procedural data flow algorithm. In Conference Record of the 8th ACM Symposium on Principles of Programming Languages (Williamsburg, Va., Jan. 26-28). ACM, New York.
PFEIFFER, P. 1991. Dependence-based representations for programs with reference variables. Ph.D. thesis, Tech. Rep. TR-1037, Computer Sciences Dept., Univ. of Wisconsin, Madison, Wis. (Aug.).

Received June 1993; revised October 1993 and March 1994; accepted May 1994
A Generalized Theory of Bit Vector Data Flow Analysis

UDAY P. KHEDKER and DHANANJAY M. DHAMDHERE
Indian Institute of Technology

The classical theory of data flow analysis, which has its roots in unidirectional flows, is inadequate to characterize bidirectional data flow problems. We present a generalized theory of bit vector data flow analysis which explains the known results in unidirectional and bidirectional data flows and provides a deeper insight into the process of data flow analysis. Based on the theory, we develop a worklist-based generic algorithm which is uniformly applicable to unidirectional and bidirectional data flow problems. It is simple, versatile, and easy to adapt for a specific problem. We show that the theory and the algorithm are applicable to all bounded monotone data flow problems which possess the property of the separability of solution.

The theory yields valuable information about the complexity of data flow analysis. We show that the complexity of worklist-based iterative analysis is the same for unidirectional and bidirectional problems. We also define a measure of the complexity of round-robin iterative analysis. This measure, called width, is uniformly applicable to unidirectional and bidirectional problems and provides a tighter bound for unidirectional problems than the traditional measure of depth. Other applications include explanation of isolated results in efficient solution techniques and motivation of new techniques for bidirectional flows. In particular, we discuss edge splitting and edge placement and develop a feasibility criterion for decomposition of a bidirectional flow into a sequence of unidirectional flows.

Categories and Subject Descriptors: D.3.4 [Programming Languages]: Processors - compilers; optimization; F.2.2 [Analysis of Algorithms and Problem Complexity]: Nonnumerical Algorithms and Problems - complexity of proof procedures
General Terms: Algorithms, Theory
Additional Key Words and Phrases: Bidirectional data flows, data flow analysis, data flow frameworks

1. INTRODUCTION

Data flow analysis is the process of collecting information about the uses and definitions of data items in a program. This information is put to a variety of uses, viz., program design, debugging, optimization, maintenance, and documentation.

This work was done when the first author was a research student at the Indian Institute of Technology.
Authors' addresses: U. P. Khedker, Department of Computer Science, University of Poona, Pune 411007, India; email: [email protected]; D. M. Dhamdhere, Department of Computer Science and Engineering, Indian Institute of Technology, Bombay 400 076, India; email: [email protected].
Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
© 1994 ACM 0164-0925/94/0900-1472 $03.50
ACM Transactions on Programming Languages and Systems, Vol. 16, No. 5, September 1994, Pages 1472-1511.
Compilers
the
purpose
Data
flows
mostly
used
in
i.e., the data
graph
can
In
depends Algorithm
code
by
its
classified
MRA)
into
is that time
of code
optimizations
stren@h
decade,
it
has
using
the
solution
for
safety
of
an
though
[Dhamdhere In
article
bidirectional results into
data
in the
based
the
monotone
bit
data
solution.
be found
in
2
throughout analysis.
Section
traditional uniform
algorithm
is developed applicable
This
section
shows for
also the
flow latter
procedure
is ACM
this
can
Section
considered TransactIons
the
either
except for
analyzing
provides
deeper of the
as
known insights
theory
all
to
is
bounded
of separability
article
flow
a
for
of
brevity;
they
flow
problems. out
A
of
worklist-based
the
faciligeneric
theory,
data
flow
generic
iterative
flow the
which
of a generalized
of
data
generalizes
equations
bidirectional
worklist-based
example
theory
formally,
data
and
representative
classical
problems
performance
of the
the
exposition
property
as
the
5. Arising
bidirectional
be
article
used
generic
unidirectional
analyses and
analysis in
in
1992]. as well
explaining
the
been
obtained
et al.
is applicable the
this
[1992], is
vector
data
it
from
3 reviews
bit
flows,
the
of not
unidirectional from
possess
have been
Dhamdhere
Apart
a
and
Because
have
handles
theory
omitted
which
provides
complexity
unidirectional
1 Data the
that
the
over
bidirectional a fixed-point
problems
the
for
of
of information
results
1993;
Dhamdhere
of
to
movement,
known
formally.
Though
which
defines
specification
uniformly
Patil which
been
and
and
the
Reduction
code
Although
flow
hoc
bidirectional
Section
concepts,
unifies
Strength
been
analysis. the
ad
analysis.
MRA
4
flow
uniformly.
have
article.
of both
elimina-
intricacies
characterized
and
problems
Khedker
have
to bidirectional
problems,
introduces the
be
a theory
flow
proofs
et
a node
reducing
unifies
the
exists,
and
and
flow
[Aho
at
MRA
and
1982b]
explain
of data
problems
vector
Several
Section
tate
present
of data
data
The advantage
example,
Hoisting
problems to
solutions
unidirectional
process on
flow
isolated
flow
problem. For
1982a;
Dhamdhere
we
flows
common-subexpression
Composite
cannot
efficient
1988a;
this
flow
Such
available
optimizations
optimizer.
problem
some
program
data
backward
several
movement,
theory
assignment
the
successors.
information
bidirectional unify
possible
traditional
dependencies
in
optimization.
data been
lacuna,
found
loop
a bidirectional
theoretical
can
and
not
information
as its successors. The Morel and Renvoise elimination [Morel and Renvoise 1979]
Dhamdhere
bidirectional
and
the
of an
The
and
reduction,
Though
flows
optimization.
[Joshi
collect
unidirectional
or by its
forward
they
traditional
loop
to
at a node
predecessors
the size and the running and
involve
is a representative
problems
Algorithm
analysis
available
problems,
bidirectional
bidirectional
flow
optimization
on its predecessors as well for partial redundancy
(also called
tion,
data
information
either
be readily
1986].
use
optimization.1
flow
is influenced
flows
al.
typically
of code
1473
.
it
algorithm
analysis
is
problems.
is the
and same
problems.
znterprocedural that
the data
on Programmmg
or wztraprocedural.
interprocedural flows
within
Languages
VVe restrict
information the and
at
the
ourselves entry/exit
to of
procedure. Systems,
Vol
16, No.
5, September
1994.
a
Section
6 discusses
complexity graph
of data
for
number
a data
edge
We
Section in
known
the
section
DATA
is
into
theory
the
width
shown
to
unidirectional
provides
solution
and
a
the
bidirec-
bound
This
depth.
flows,
criterion
for
of unidirectional of the
for
section
data
a feasibility
applicability
the
of a
bound and
of bidirectional
develops
in (w)
tighter
of
a sequence
and
unifies
flows.
results
presented
framework
lies
in
as well
fact
the
size
the
size
and
a
reported
in
the
literature
at least
two
mented
in
has
inspired
and
Dhamdhere
The
data
Figure
several
that
PPIN,
for
property expression Local
to
(MRA)
which
is used
running
reduction
[Morel
and
movement,
The
in
the
Renvoise
1979].
a 35%
1988;
reduction
cost
It
MRA
optimizations
execution
compilers
[Chow
of the
classical
of an optimizer;
production
and
common-sub-
importance
of many
time
70%
[Morel
as a repre-
article.
of code
unification
unifications
has
(MIPS
has
been and
Dhamdhere
been imple-
PL.8)
1988a;
and Joshi
1982 b].
PPIN, all
the
optimization.
important
other
properties
1. Note
that
as the
30%
1982a;
flow
loop
Algorithm
elimination
optimizations
and the
and Renvoise throughout
traditional
elimination,
EXAMPLE
redundancy
problem
the
AN
the Morel
expression reduces
FLOWS:
for partial
bidirectional
MRA
and
the
is the
expressions,
data bit
flow
equations
vector
for
whereas
for
node
PPIN~
MRA
i which
is the
are
given
represents
bit
representing
in the the
el. property
ANTLOC
~ represents
local
anticipability,
i.e.,
the
existence
upward
transparency,
node.
which
measure
in the
flows
called
for
width
placement,
significance
introduces
1979]
sentative
of an
the
generalized
article.
Renvoise
in
defined
traditional
results
edge
the
analysis
the
of bidirectional
2. BIDIRECTIONAL This
is
that
than
and
7 discusses
this
show
several
splitting
decomposition
measure
new
framework
problems
also explains
of
A
of round-robin
problems.
unidirectional
applications
analysis.
flow
of iterations
tional
viz.,
several
flow
The
exposed expression $e_l$ in node $i$, while $TRANSP_i^l$ reflects transparency, i.e., the absence of definition(s) of any operand(s) of $e_l$ in the node. The global property of anticipability ($ANTIN_i^l$/$ANTOUT_i^l$) indicates whether expression $e_l$ is very busy at the entry/exit of node $i$, which is a necessary and sufficient condition for the safety of placing an evaluation of $e_l$ at the entry/exit of the node [Kennedy 1972]. Equations (1) and (2) do not use the $ANTIN_i$/$ANTOUT_i$ properties explicitly; they are implied by the $PPIN_i$/$PPOUT_i$ properties. The data flow property of availability ($AVIN_i$/$AVOUT_i$) is computed using the classical forward data flow problem [Aho et al. 1986]. The partial redundancy of an expression is represented by the partial availability of the expression ($PAVIN_i$) at the entry of node $i$. $PPIN_i^l$ indicates the feasibility of placing an evaluation of $e_l$ at the entry of $i$, while $PPOUT_i^l$ indicates the feasibility of placing it at the exit. Computations of an expression $e_l$ are inserted at the exit of node $i$ if $INSERT_i^l = T$. $REDUND_i^l$ indicates that the upward exposed occurrence of $e_l$ in node $i$ is redundant and may be deleted.
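The per-node properties described above map naturally onto machine words, with one bit per expression. The following is a minimal sketch in C of how the right-hand sides of the MRA equations of Figure 1 could be evaluated for one node. It reads the equations as PPIN_i = PAVIN_i · (ANTLOC_i + TRANSP_i · PPOUT_i) · Π over predecessors j of (AVOUT_j + PPOUT_j), PPOUT_i = Π over successors k of PPIN_k, INSERT_i = PPOUT_i · ¬AVOUT_i · (¬PPIN_i + ¬TRANSP_i), and REDUND_i = PPIN_i · ANTLOC_i. The type and function names (`bitvec`, `mra_node`, `mra_ppin`, and so on) are illustrative assumptions, as is the boundary treatment of nodes without predecessors or successors.

```c
#include <assert.h>
#include <stddef.h>

/* One bit per expression e_l; a set bit means the property is T. */
typedef unsigned long bitvec;

/* Per-node properties. Field names are illustrative, not the article's. */
typedef struct {
    bitvec transp, antloc, pavin, avout; /* computed beforehand          */
    bitvec ppin, ppout;                  /* unknowns of equations (1), (2) */
} mra_node;

/* Equation (2): PPOUT_i is the product (bitwise AND) of PPIN_k over all
 * successors k. Assumption: a node without successors gets PPOUT = F. */
bitvec mra_ppout(const mra_node n[], const size_t succ[], size_t nsucc)
{
    bitvec v = nsucc ? ~(bitvec)0 : 0;
    for (size_t s = 0; s < nsucc; s++)
        v &= n[succ[s]].ppin;
    return v;
}

/* Equation (1): PPIN_i = PAVIN_i . (ANTLOC_i + TRANSP_i . PPOUT_i)
 *                        . product over predecessors j of (AVOUT_j + PPOUT_j).
 * Assumption: a node without predecessors gets PPIN = F. */
bitvec mra_ppin(const mra_node n[], size_t i,
                const size_t pred[], size_t npred)
{
    if (npred == 0)
        return 0;
    bitvec v = n[i].pavin & (n[i].antloc | (n[i].transp & n[i].ppout));
    for (size_t p = 0; p < npred; p++)
        v &= n[pred[p]].avout | n[pred[p]].ppout;
    return v;
}

/* INSERT_i = PPOUT_i . not AVOUT_i . (not PPIN_i + not TRANSP_i) */
bitvec mra_insert(const mra_node *x)
{
    return x->ppout & ~x->avout & (~x->ppin | ~x->transp);
}

/* REDUND_i = PPIN_i . ANTLOC_i */
bitvec mra_redund(const mra_node *x)
{
    return x->ppin & x->antloc;
}
```

Because PPIN depends on PPOUT of the node and of its predecessors, while PPOUT depends on PPIN of the successors, a solver would re-evaluate `mra_ppin` and `mra_ppout` over all nodes until no bit changes; this sketch shows only the per-node step.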
LOCAL DATA FLOW PROPERTIES:
$ANTLOC_i^l$: Node $i$ contains a computation of $e_l$, not preceded by a definition of any of its operands.
$COMP_i^l$: Node $i$ contains a computation of $e_l$, not followed by a definition of any of its operands.
$TRANSP_i^l$: Node $i$ does not contain a definition of any operand of $e_l$.

GLOBAL DATA FLOW PROPERTIES:
$AVIN_i^l$/$AVOUT_i^l$: $e_l$ is available at the entry/exit of node $i$.
$PAVIN_i^l$/$PAVOUT_i^l$: $e_l$ is partially available at the entry/exit of node $i$.
$ANTIN_i^l$/$ANTOUT_i^l$: $e_l$ is anticipated at the entry/exit of node $i$.
$PPIN_i^l$/$PPOUT_i^l$: Computation of $e_l$ may be placed at the entry/exit of node $i$.
$INSERT_i^l$: Computation of $e_l$ should be inserted at the exit of node $i$.
$REDUND_i^l$: First computation of $e_l$ existing in node $i$ is redundant.

DATA FLOW EQUATIONS:

$$PPIN_i = PAVIN_i \cdot (ANTLOC_i + TRANSP_i \cdot PPOUT_i) \cdot \prod_{j \in pred(i)} (AVOUT_j + PPOUT_j) \qquad (1)$$

$$PPOUT_i = \prod_{k \in succ(i)} PPIN_k \qquad (2)$$

$$INSERT_i = PPOUT_i \cdot \neg AVOUT_i \cdot (\neg PPIN_i + \neg TRANSP_i)$$

$$REDUND_i = PPIN_i \cdot ANTLOC_i$$

Fig. 1. The Morel-Renvoise algorithm.

The $PPIN_i$ equation is slightly different from the original equation in MRA; the term $ANTIN_i \cdot (PAVIN_i + \neg ANTLOC_i \cdot TRANSP_i)$ is replaced by the term $PAVIN_i$ to prohibit redundant hoisting when the expression is not partially available. The $PAVIN_i$ term represents the profitability of hoisting in that there exists at least one possible execution path along which the expression is computed more than once. The other two terms in the $PPIN_i$ equation represent the feasibility of hoisting.

Bidirectional dependencies of MRA arise as follows. Redundancy elimination is based on the notion of availability of an expression, which gives rise to forward dependencies in the data flow problem (reflected by the $\prod$ term in the $PPIN_i$ equation). The safety of code movement is based on the notion of anticipability of the expression, which introduces backward dependencies in the data flow problem (reflected by the $\prod$ term in the $PPOUT_i$ equation).

Example 2.1. Consider the program flow graph in Figure 2. The partial redundancy elimination performed by MRA subsumes the following three optimizations:

- Loop-Invariant Movement: The computations of a * b in node 4 and node 5 are hoisted out of the loops and are inserted in node 2 ($REDUND_4$, $REDUND_5$, and $INSERT_2$ are T).
- Code Hoisting: The partially redundant computation of a * b in node 12 is hoisted to node 7. As a result of suppressing this partial redundancy, the path 1-8-11-12 would have only one computation of a * b; the unoptimized program has two.
- Common-Subexpression Elimination: The totally redundant computation of a * b in node 6 is deleted as an instance of common-subexpression elimination.

Note that the partially redundant computation of a * b in node 11 is not suppressed since hoisting it to node 8 would be unsafe: the path 1-8-9 had no computation of a * b in the original program.

    Node  Transp  Antloc  Pavin  Avout  Ppin  Ppout  Insert  Redund
    1     T       F       F      F      F     F      F       F
    2     T       F       F      F      F     T      T       F
    3     T       F       T      F      T     T      F       F
    4     T       T       T      T      T     F      F       T
    5     T       T       T      T      T     T      F       T
    6     T       T       T      T      T     F      F       T
    7     F       F       T      F      F     T      T       F
    8     T       F       F      F      F     F      F       F
    9     F       F       F      F      F     F      F       F
    10    T       T       F      T      F     F      F       F
    11    T       T       T      T      F     T      F       F
    12    T       T       T      T      T     F      F       T

Fig. 2. Program flow graph and properties for Example 2.1.

Example 2.2. Bidirectional data flows have also been used in register assignment and strength reduction optimizations. Figure 3 presents the data flow equations of two such algorithms. The SPPIN/SPPOUT problem of LSIA
Bit Vector Data Flow Analysls 0
13ASIC
LOAD
STORE
SPPIN,
INSERTION
=
ALGORITHM
~
(LSIA)
[Dhamdhere
.
1477
[Joshi
ad
1988b]
(SPPOUTJ)
j E ~re~(i) SPPOUT,
= DPANTOUT, ~
. ( DCOMP,
(DANTINk
+ DTRANSP,
SPPIN, )
+ SPPINk)
k E SUCC(i]
HOISTING AND STRENGTH REDUCTION ALGORITHM
COMPOSITE
0
Dhamdkere
1982.;
Joshi
and
Dharudhere
NOCOMIN,
=
CONSTA,
NOCOMOUT,
~ j NOCOMOUT,
=
(CHSA)
1982b] +
CONST13,
NOCOMOUT,
< pred(~)
CONSTC,
+
~
CONSTD,
. NOCOMIN,
CONSTE,
+
NOCOMIN,
k 6 SILCC(i) Fig. 3.
Data flow equations of some other bidirectional problems.

performs sinking of STORE instructions using partial redundancy elimination techniques [Dhamdhere 1988b]. The NOCOMIN/NOCOMOUT problem of CHSA is used to inhibit the placement of an update computation following a high-strength computation [Joshi and Dhamdhere 1982a; 1982b].

3. NOTIONS FROM CLASSICAL DATA FLOW ANALYSIS

This section presents an overview of the classical theory of data flow analysis and compares various solution methods and their complexities. Our description is mostly based on Graham and Wegman [1976], Hecht [1977], and Marlowe and Ryder [1990]. A more detailed treatment can be found in Aho et al. [1986], Graham and Wegman [1976], Hecht [1977], Kam and Ullman [1977], Kildall [1973], Marlowe and Ryder [1990], and Rosen [1980]. The concluding part of this section motivates the need for a more general setting.

3.1 Preliminaries

A data flow framework is defined as a triple $D = (\mathcal{L}, \sqcap, \mathcal{F})$ (Figure 4). Elements in $\mathcal{L}$ represent the information associated with the entry/exit of a basic block. $\sqcap$ is the set union or intersection operation which determines the way the global information is combined when it reaches a basic block. A function $f_i \in \mathcal{F}$ represents the effect on the information as it flows through basic block $i$.2

2 Alternatively, the flow functions can be associated with in-edges (out-edges) of node $i$ for forward (backward) problems.
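For bit vector problems, the three components of the framework triple have a direct machine representation. The sketch below, in C, is one possible reading under stated assumptions: lattice elements are machine words with one bit per fact, the confluence operation is bitwise OR (union) or AND (intersection), and each flow function has the classical gen/kill form f(x) = gen + x · ¬kill. The gen/kill form and all names (`bitvec`, `flow_fn`, `df_meet`, `df_apply`, `df_combine`) are assumptions made for illustration and are not taken from the article.

```c
#include <assert.h>
#include <stddef.h>

/* An element of the lattice: one bit per data flow fact. */
typedef unsigned long bitvec;

/* The confluence operation is either set union or set intersection. */
typedef enum { MEET_UNION, MEET_INTERSECTION } meet_kind;

/* Assumed gen/kill representation of a flow function f_i. */
typedef struct {
    bitvec gen;   /* facts generated by the basic block */
    bitvec kill;  /* facts invalidated by the basic block */
} flow_fn;

/* Combine two pieces of information arriving at a basic block. */
bitvec df_meet(meet_kind m, bitvec a, bitvec b)
{
    return m == MEET_UNION ? (a | b) : (a & b);
}

/* Effect of a basic block on the information flowing through it. */
bitvec df_apply(flow_fn f, bitvec x)
{
    return f.gen | (x & ~f.kill);
}

/* Fold the confluence over the information from n neighbouring nodes,
 * starting from the identity of the operation (empty set for union,
 * full set for intersection). */
bitvec df_combine(meet_kind m, const bitvec in[], size_t n)
{
    bitvec v = (m == MEET_UNION) ? 0 : ~(bitvec)0;
    for (size_t j = 0; j < n; j++)
        v = df_meet(m, v, in[j]);
    return v;
}
```

In a forward problem, `df_combine` would be applied to the exit information of the predecessors and `df_apply` to the result; in a backward problem the roles of predecessors and successors are exchanged.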