The Undecidability of Aliasing

G. RAMALINGAM
IBM T. J. Watson Research Center

Alias analysis is a prerequisite for performing most of the common program analyses such as reaching-definitions analysis or live-variables analysis. Landi [1992] recently established that it is impossible to compute statically precise alias information (either may-alias or must-alias) in languages with if statements, loops, dynamic storage, and recursive data structures: more precisely, he showed that the may-alias relation is not recursive, while the must-alias relation is not even recursively enumerable. This article presents simpler proofs of the same results.

Categories and Subject Descriptors: D.3.4 [Programming Languages]: Processors - compilers; optimization; F.4.1 [Mathematical Logic]: Computability Theory; F.4.3 [Formal Languages]: Decision Problems

General Terms: Languages, Theory

Additional Key Words and Phrases: Alias analysis, pointer analysis

1. INTRODUCTION

Compilers and various other programming tools make good use of static program analysis. To solve most program analysis problems, such as the problem of determining live variables or whether two expressions have the same value at some program point, requires information about the may/must aliases that occur in a program. Informally, two names or L-valued expressions are said to alias each other at a particular point during program execution if both refer to the same location. In the may-alias problem, one is interested in identifying aliases that can occur during some execution of the program, while in the must-alias problem, one is interested in identifying aliases that occur in all executions of the program. Obviously, such information is relevant to most dataflow analysis problems.

Program analysis is commonly performed under the conservative assumption that all paths in a program are executable, since deciding if an arbitrary path in a program is executable is undecidable. This simplifying assumption makes it possible to solve a number of program analysis problems. Unfortunately, even this assumption is not sufficient to make the may-alias or must-alias problem decidable.

Author's address: IBM T. J. Watson Research Center, P.O. Box 704, Yorktown Heights, NY 10598.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
© 1994 ACM 0164-0925/94/0900-1467 $03.50

ACM Transactions on Programming Languages and Systems, Vol 16, No 5, September 1994, Pages 1467-1471
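The informal distinction between may-alias and must-alias drawn in the introduction can be illustrated with a minimal C sketch. The function and variable names below are illustrative assumptions and do not come from the article.

```c
/* After the conditional initialization, p and q refer to the same
 * location only when flag is nonzero: *p and *q may-alias at the
 * stores below, but they do not must-alias. */
int may_alias_demo(int flag)
{
    int x = 0, y = 0;
    int *p = &x;
    int *q = flag ? &x : &y;   /* one path makes q point to x, the other to y */
    *p = 1;
    *q = 2;                    /* overwrites x only on the aliasing path */
    return x;
}

/* Here q refers to x on every path, so *p and *q must-alias. */
int must_alias_demo(int flag)
{
    int x = 0;
    int *p = &x;
    int *q = &x;
    if (flag)
        q = p;                 /* still the same location */
    *p = 1;
    *q = 2;                    /* always overwrites x */
    return x;
}
```

The value returned by `may_alias_demo` depends on which path was taken, which is exactly why a client analysis such as live variables needs alias information before it can reason about the store through `q`.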


Names a and b are said to may-alias each other at a program point if there exists a path P from the program entry to that program point, such that a and b refer to the same location after execution along path P. Names a and b are said to must-alias each other at a program point if, for all paths P from the beginning of the program to the program point, a and b both refer to the same location after execution along path P.

Landi [1992] recently established that even the simpler intraprocedural versions of the may-alias and must-alias problems are undecidable for languages that permit recursive data structures to be built. Here, we present a simpler proof of the same result. We establish the undecidability of the may-alias problem by reducing Post's Correspondence Problem, or PCP, to it. Section 2 of the article presents proofs of the undecidability results, while Section 3 discusses related work.

2. THE UNDECIDABILITY OF ALIASING

The decision version of the may-alias problem is the following: Given a program point in a program and two names, decide if the may-alias relation holds between the given names at the given program point. More generally, we are interested in generating the set of all may-alias facts that hold true. This set can be made finite by restricting attention to the names and L-valued expressions that occur in the program.

Definition 2.1. Post's Correspondence Problem, or PCP, is the following: given arbitrary lists A and B of r strings each in {0, 1}+, say

A = w_1, w_2, ..., w_r
B = z_1, z_2, ..., z_r

does there exist a nonempty sequence of integers i_1, i_2, ..., i_k such that

\[ w_{i_1} w_{i_2} \cdots w_{i_k} = z_{i_1} z_{i_2} \cdots z_{i_k} \, ? \]

THEOREM 2.2. PCP is undecidable [Hopcroft and Ullman 1979].

THEOREM 2.3. The intraprocedural may-alias problem is undecidable for languages with if statements, loops, dynamic storage, and recursive data structures.
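Definition 2.1 can be made concrete with a small checker. The sketch below is an illustration under stated assumptions, not part of the article: the function names, the fixed buffer size `PCP_BUF`, and the search bound `max_k` are all hypothetical. The bound is essential, since by Theorem 2.2 no search procedure can terminate on every instance.

```c
#include <string.h>

#define PCP_BUF 256   /* assumed limit on the length of a concatenation */

/* Returns 1 if idx[0..k-1] (1-based indices into the lists w and z of r
 * strings each) is a solution of the PCP instance, and 0 otherwise.
 * Sequences whose concatenation would not fit in PCP_BUF are rejected. */
int pcp_is_solution(const char *w[], const char *z[], int r,
                    const int idx[], int k)
{
    char top[PCP_BUF] = "", bottom[PCP_BUF] = "";
    if (k <= 0) return 0;                      /* the sequence must be nonempty */
    for (int j = 0; j < k; j++) {
        int i = idx[j];
        if (i < 1 || i > r) return 0;
        if (strlen(top) + strlen(w[i - 1]) >= PCP_BUF ||
            strlen(bottom) + strlen(z[i - 1]) >= PCP_BUF) return 0;
        strcat(top, w[i - 1]);                 /* w_{i_1} w_{i_2} ... */
        strcat(bottom, z[i - 1]);              /* z_{i_1} z_{i_2} ... */
    }
    return strcmp(top, bottom) == 0;
}

/* Tries every index sequence of length 1..max_k in lexicographic order.
 * Returns the length of the first solution found (stored in out), or 0
 * if no solution exists within the bound. */
int pcp_bounded_search(const char *w[], const char *z[], int r,
                       int max_k, int out[])
{
    for (int k = 1; k <= max_k; k++) {
        for (int j = 0; j < k; j++) out[j] = 1;
        for (;;) {
            if (pcp_is_solution(w, z, r, out, k)) return k;
            int pos = k - 1;                   /* advance like an odometer */
            while (pos >= 0 && out[pos] == r) out[pos--] = 1;
            if (pos < 0) break;                /* all sequences of length k tried */
            out[pos]++;
        }
    }
    return 0;
}
```

As a usage example, take the instance A = 1, 10111, 10 and B = 111, 10, 0 (a standard textbook instance, not one from this article). The sequence 2, 1, 1, 3 is a solution, since both concatenations equal 101111110, and `pcp_bounded_search` with `max_k = 4` finds it.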

PROOF. We relate PCP to the may-alias problem as follows. Consider a binary tree with root root. A binary string consisting of 0s and 1s can be interpreted as representing a path from the root of the binary tree, with 0 representing a left branch and 1 a right branch. Define branch(0) to be left and branch(1) to be right. For any binary string b_0 b_1 ... b_n, define path(b_0 b_1 ... b_n) to be branch(b_0)->branch(b_1)-> ... ->branch(b_n). Let α and β be two binary strings. Then, α = β iff root->path(α) and root->path(β) refer to the same node in the binary tree. Essentially, this is the idea behind our reduction of PCP to the may-alias problem.

Given an instance of PCP, we construct the program in Figure 1. The program is written in C, but it can be written in any language with if statements, loops, dynamic storage, and recursive data structures.

    [1]  main() {
    [2]    int i;
    [3]    struct tree_node {
    [4]      int value;
    [5]      struct tree_node *left, *right;
    [6]    } *root, *p, *q, *t, node, undefined;
    [7]    undefined.left = &undefined; undefined.right = &undefined;
    [8]    root = malloc(sizeof(struct tree_node));
    [9]    root->left = &undefined; root->right = &undefined;
    [10]   t = root;
    [11]   while (...) {
    [12]     if (...) { t->right = malloc(sizeof(struct tree_node));
    [13]       t = t->right;
    [14]     }
    [15]     else { t->left = malloc(sizeof(struct tree_node));
    [16]       t = t->left;
    [17]     }
    [18]     t->left = &undefined; t->right = &undefined;
    [19]   }
    [20]   p = root; q = root;
    [21]   do { i = ...;
    [22]     if (i == 1) {
    [23]       p = p->path(w_1); q = q->path(z_1);
    [24]     }
    [25]     else if (i == 2) {
    [26]       p = p->path(w_2); q = q->path(z_2);
    [27]     }
    [28]     else if (i == 3) {
    [29]       ...
    [30]     }
    [31]     else if (i == r-1) {
    [32]       p = p->path(w_{r-1}); q = q->path(z_{r-1});
    [33]     }
    [34]     else { p = p->path(w_r); q = q->path(z_r);
    [35]     }
    [36]   } while (...);
    [37]   /* The given PCP instance has an affirmative answer iff there is some
              execution path to this point after which p = q != &undefined */
    [38]   p->left = &node; undefined.left = &undefined;
    [39]   /* The given PCP instance has an affirmative answer iff *(q->left)
              may-alias node at this point */
    [40]   node.left = &node; q->left->left = &undefined;
    [41]   /* node must-alias *(node.left) here iff not *(q->left) may-alias node
              in line 39 */
    [42] }

Fig. 1. The program corresponding to an instance w_1, ..., w_r, z_1, ..., z_r of the PCP problem. Note that path(α) is an abbreviation for a sequence of dereferences through the left and right fields as determined by the binary string α.

The may-alias relation holds between node and *(q->left) at line 39 iff the given instance of PCP has a solution, as explained below.

Ignore, for the moment, lines 7 through 19, and assume that root points to a binary tree at line 20. There are r different branches inside the loop of lines 22 through 35 (number these branches 1 through r). Any path P from line 20 to line 36 that iterates t times through the loop (lines 22 through 35) corresponds to a sequence σ of t integers, where the jth element in the sequence denotes the branch taken during the jth time through the loop. Furthermore, *p and *q will alias each other at the end of path P iff the sequence σ is a solution to the given PCP instance, provided that root pointed to a "sufficiently large" binary tree at line 20.

Instead of actually constructing a binary tree, we use the code in lines 9 through 19 to "generate" all possible paths through a binary tree. In doing this, the pointer fields of newly allocated tree nodes are not initialized to a null pointer as might be done usually. Instead, these fields are initialized to point to a special node called undefined, whose left and right fields are initialized to point to itself. This ensures that every possible path from the program beginning to line 36 can be executed without raising any errors such as dereferencing a null pointer. Consequently, at line 36, either pointer p has a "proper value" and points to some node allocated in line 12 or 15, or pointer p points to the node undefined. The same claim holds true for pointer q. Consequently, the given instance of PCP has a solution iff there exists some execution path to line 36, at the end of which p = q ≠ &undefined. Checking for this condition can be converted into checking for a may-alias fact using line 38. Obviously, the may-alias relation holds between node and *(q->left) at line 39 iff the given instance of PCP has a solution. □

The may-alias relation is recursively enumerable because we can enumerate all the paths in the program, and for any given path, we can determine the aliases that hold after execution along that path. The must-alias relation, however, is not even recursively enumerable.

THEOREM 2.4. The intraprocedural must-alias problem is undecidable for languages with if statements, loops, dynamic storage, and recursive data structures. The intraprocedural must-alias relation is not even recursively enumerable.

PROOF. The undecidability of the must-alias problem follows immediately from the undecidability of the may-alias problem. Consider Figure 1. Line 40 shows how must-alias information can be used to compute may-alias information. The must-alias relation holds between node and *(node.left) in line 41 iff the may-alias relation does not hold between node and *(q->left) in line 39. The complement of a recursively enumerable but nonrecursive set is not recursively enumerable. It follows that the must-alias relation is not even recursively enumerable. □

3. RELATED WORK

Kam and Ullman [1977] established that the problem of computing the meet-over-all-paths or MOP solution to a monotone dataflow analysis framework, or a monotonic dataflow problem, is undecidable by reducing a modified version of PCP to the monotone MOP problem. The proof presented in this article is similar to the proof of Kam and Ullman. However, as Kam and Ullman observe, their result says only that no algorithm that solves any monotonic dataflow analysis problem exists. They do not rule out the existence of algorithms for a specific monotonic dataflow analysis problem, such as the may-alias problem. In other words, the meet-over-all-paths problem for arbitrary monotonic dataflow analysis frameworks is more general than the may-alias problem. Consequently, the undecidability of the latter problem is a stronger result than the undecidability of the former problem.

Larus [1988; 1989] showed that alias analysis was NP-hard in languages with recursive data structures. Landi [1992] presented the first proof that the may-alias problem is not recursive and that the must-alias problem is not even recursively enumerable. He established these results by reducing the halting problem to these problems, and this article presents simpler proofs for the same results.

In the absence of recursively defined data structures, various versions of the aliasing problems become decidable, but remain difficult. Myers [1981] showed that many interprocedural static-analysis problems are NP-complete. Refer to Landi's thesis [1991] for a comprehensive classification of the complexity of various restricted versions of the aliasing problems. Not surprisingly, the problem of computing conservative approximations to the may-alias and must-alias relations in the presence of pointers has attracted and continues to attract much attention. Pfeiffer's thesis [1991] presents a comprehensive overview of this area. Refer to Landi and Ryder [1992] and Choi et al. [1993] for more recent work in this area.

ACKNOWLEDGMENTS

The author thanks the anonymous referees, Charles Fischer, and William Landi for their comments, which helped improve this article and made the proofs more precise.

REFERENCES

CHOI, J.-D., BURKE, M. G., AND CARINI, P. 1993. Efficient flow-sensitive interprocedural computation of pointer-induced aliases and side effects. In Conference Record of the 20th ACM Symposium on Principles of Programming Languages (Charleston, S. Carolina). ACM, New York, 232-245.

HOPCROFT, J. E. AND ULLMAN, J. D. 1979. Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, Reading, Mass.

KAM, J. B. AND ULLMAN, J. D. 1977. Monotone data flow analysis frameworks. Acta Informatica 7, 305-317.

LANDI, W. 1991. Interprocedural aliasing in the presence of pointers. Ph.D. thesis, Dept. of Computer Science, Rutgers Univ., New Brunswick, N.J.

LANDI, W. 1992. Undecidability of static analysis. ACM Lett. Program. Lang. Syst. 1, 4 (Dec.).

LANDI, W. AND RYDER, B. G. 1992. A safe approximate algorithm for pointer-induced aliasing. SIGPLAN Not. 27, 7 (July), 235-248.

LARUS, J. R. 1989. Restructuring symbolic programs for concurrent execution on multiprocessors. Ph.D. thesis, Univ. of California, Berkeley, Calif. (May).

LARUS, J. R. AND HILFINGER, P. N. 1988. Detecting conflicts between structure accesses. SIGPLAN Not. 23, 7 (July), 21-34.

MYERS, E. 1981. A precise inter-procedural data flow algorithm. In Conference Record of the 8th ACM Symposium on Principles of Programming Languages (Williamsburg, Va., Jan. 26-28). ACM, New York.

PFEIFFER, P. 1991. Dependence-based representations for programs with reference variables. Ph.D. dissertation. Tech. Rep. TR-1037, Computer Sciences Dept., Univ. of Wisconsin, Madison, Wis. (Aug.).

Received June 1993; revised October 1993 and March 1994; accepted May 1994

A Generalized Theory of Bit Vector Data Flow Analysis

UDAY P. KHEDKER and DHANANJAY M. DHAMDHERE
Indian Institute of Technology

The classical theory of data flow analysis, which has its roots in unidirectional flows, is inadequate to characterize bidirectional data flow problems. We present a generalized theory of bit vector data flow analysis which explains the known results in unidirectional and bidirectional data flows and provides a deeper insight into the process of data flow analysis. Based on the theory, we develop a worklist-based generic algorithm which is uniformly applicable to unidirectional and bidirectional data flow problems. It is simple, versatile, and easy to adapt for a specific problem. We show that the theory and the algorithm are applicable to all bounded monotone data flow problems which possess the property of the separability of solution. The theory yields valuable information about the complexity of data flow analysis. We show that the complexity of worklist-based iterative analysis is the same for unidirectional and bidirectional problems. We also define a measure of the complexity of round-robin iterative analysis. This measure, called width, is uniformly applicable to unidirectional and bidirectional problems and provides a tighter bound for unidirectional problems than the traditional measure of depth. Other applications include explanation of isolated results in efficient solution techniques and motivation of new techniques for bidirectional data flows. In particular, we discuss edge splitting and edge placement and develop a feasibility criterion for decomposition of a bidirectional flow into a sequence of unidirectional flows.

Categories and Subject Descriptors: D.3.4 [Programming Languages]: Processors - compilers; optimization; F.2.2 [Analysis of Algorithms and Problem Complexity]: Nonnumerical Algorithms and Problems - complexity of proof procedures

General Terms: Algorithms, Theory

Additional Key Words and Phrases: Bidirectional data flows, data flow analysis, data flow frameworks

1. INTRODUCTION

Data flow analysis is the process of collecting information about the uses and definitions of data items in a program. This information is put to a variety of uses, viz., program design, debugging, optimization, maintenance, and documentation.

This work was done when the first author was a research student at the Indian Institute of Technology.
Authors' addresses: U. P. Khedker, Department of Computer Science, University of Poona, Pune 411007, India; email: [email protected]; D. M. Dhamdhere, Department of Computer Science and Engineering, Indian Institute of Technology, Bombay 400 076, India; email: [email protected].

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
© 1994 ACM 0164-0925/94/0900-1472 $03.50

ACM Transactions on Programming Languages and Systems, Vol 16, No 5, September 1994, Pages 1472-1511

Compilers

the

purpose

Data

flows

mostly

used

in

i.e., the data

graph

can

In

depends Algorithm

code

by

its

classified

MRA)

into

is that time

of code

optimizations

stren@h

decade,

it

has

using

the

solution

for

safety

of

an

though

[Dhamdhere In

article

bidirectional results into

data

in the

based

the

monotone

bit

data

solution.

be found

in

2

throughout analysis.

Section

traditional uniform

algorithm

is developed applicable

This

section

shows for

also the

flow latter

procedure

is ACM

this

can

Section

considered TransactIons

the

either

except for

analyzing

provides

deeper of the

as

known insights

theory

all

to

is

bounded

of separability

article

flow

a

for

of

brevity;

they

flow

problems. out

A

of

worklist-based

the

faciligeneric

theory,

data

flow

generic

iterative

flow the

which

of a generalized

of

data

generalizes

equations

bidirectional

worklist-based

example

theory

formally,

data

and

representative

classical

problems

performance

of the

the

exposition

property

as

the

5. Arising

bidirectional

be

article

used

generic

unidirectional

analyses and

analysis in

in

1992]. as well

explaining

the

been

obtained

et al.

is applicable the

this

[1992], is

vector

data

it

from

3 reviews

bit

flows,

the

of not

unidirectional from

possess

have been

Dhamdhere

Apart

a

and

Because

have

handles

theory

omitted

which

provides

complexity

unidirectional

1 Data the

that

the

over

bidirectional a fixed-point

problems

the

for

of

of information

results

1993;

Dhamdhere

of

to

movement,

known

formally.

Though

which

defines

specification

uniformly

Patil which

been

and

and

the

Reduction

code

Although

flow

hoc

bidirectional

Section

concepts,

unifies

Strength

been

analysis. the

ad

analysis.

MRA

4

flow

uniformly.

have

article.

of both

elimina-

intricacies

characterized

and

problems

Khedker

have

to bidirectional

problems,

introduces the

be

a theory

flow

proofs

et

a node

reducing

unifies

the

exists,

and

and

flow

[Aho

at

MRA

and

1982b]

explain

of data

problems

vector

Several

Section

tate

present

of data

data

The advantage

example,

Hoisting

problems to

solutions

unidirectional

process on

flow

isolated

flow

problem. For

1982a;

Dhamdhere

we

flows

common-subexpression

Composite

cannot

efficient

1988a;

this

flow

Such

available

optimizations

optimizer.

problem

some

program

data

backward

several

movement,

theory

assignment

the

successors.

information

bidirectional unify

possible

traditional

dependencies

in

optimization.

data been

lacuna,

found

loop

a bidirectional

theoretical

can

and

not

information

as its successors. The Morel and Renvoise elimination [Morel and Renvoise 1979]

Dhamdhere

bidirectional

and

the

of an

The

and

reduction,

Though

flows

optimization.

[Joshi

collect

unidirectional

or by its

forward

they

traditional

loop

to

at a node

predecessors

the size and the running and

involve

is a representative

problems

Algorithm

analysis

available

problems,

bidirectional

bidirectional

flow

optimization

on its predecessors as well for partial redundancy

(also called

tion,

data

information

either

be readily

1986].

use

optimization.1

flow

is influenced

flows

al.

typically

of code

1473

.

it

algorithm

analysis

is

problems.

is the

and same

problems.

znterprocedural that

the data

on Programmmg

or wztraprocedural.

interprocedural flows

within

Languages

VVe restrict

information the and

at

the

ourselves entry/exit

to of

procedure. Systems,

Vol

16, No.

5, September

1994.

a


Section

6 discusses

complexity graph

of data

for

number

a data

edge

We

Section in

known

the

section

DATA

is

into

theory

the

width

shown

to

unidirectional

provides

solution

and

a

the

bidirec-

bound

This

depth.

flows,

criterion

for

of unidirectional of the

for

section

data

a feasibility

applicability

the

of a

bound and

of bidirectional

develops

in (w)

tighter

of

a sequence

and

unifies

flows.

results

presented

framework

lies

in

as well

fact

the

size

the

size

and

a

reported

in

the

literature

at least

two

mented

in

has

inspired

and

Dhamdhere

The

data

Figure

several

that

PPIN,

for

property expression Local

to

(MRA)

which

is used

running

reduction

[Morel

and

movement,

The

in

the

Renvoise

1979].

a 35%

1988;

reduction

cost

It

MRA

optimizations

execution

compilers

[Chow

of the

classical

of an optimizer;

production

and

common-sub-

importance

of many

time

70%

[Morel

as a repre-

article.

of code

unification

unifications

has

(MIPS

has

been and

Dhamdhere

been imple-

PL.8)

1988a;

and Joshi

1982 b].

PPIN, all

the

optimization.

important

other

properties

1. Note

that

as the

30%

1982a;

flow

loop

Algorithm

elimination

optimizations

and the

and Renvoise throughout

traditional

elimination,

EXAMPLE

redundancy

problem

the

AN

the Morel

expression reduces

FLOWS:

for partial

bidirectional

MRA

and

the

is the

expressions,

data bit

flow

equations

vector

for

whereas

for

node

PPIN~

MRA

i which

is the

are

given

represents

bit

representing

in the the

el. property

ANTLOC

~ represents

local

anticipability,

i.e.,

the

existence

upward

transparency,

node.

which

measure

in the

flows

called

for

width

placement,

significance

introduces

1979]

sentative

of an

the

generalized

article.

Renvoise

in

defined

traditional

results

edge

the

analysis

the

of bidirectional

2. BIDIRECTIONAL This

is

that

than

and

7 discusses

this

show

several

splitting

decomposition

measure

new

framework

problems

also explains

of

A

of round-robin

problems.

unidirectional

applications

analysis.

flow

of iterations

tional

viz.,

several

flow

The

ANTLOC_i^l represents local anticipability, i.e., the existence of an upward exposed expression e_l in node i, while TRANSP_i^l reflects transparency, i.e., the absence of definition(s) of any operand(s) of e_l in the node. The global property of anticipability (ANTIN_i^l/ANTOUT_i^l) indicates whether expression e_l is very busy at the entry/exit of node i, a necessary and sufficient condition for the safety of placing an evaluation of e_l at the entry/exit of the node [Kennedy 1972]. Equations (1) and (2) do not use ANTIN_i/ANTOUT_i properties explicitly; they are implied by PPIN_i/PPOUT_i properties. The data flow property of availability (AVIN_i/AVOUT_i) is computed using the classical forward data flow problem [Aho et al. 1986]. The partial redundancy of an expression is represented by the partial availability of the expression (PAVIN_i^l) at the entry of node i. PPIN_i^l indicates the feasibility of placing an evaluation of e_l at the entry of i, while PPOUT_i^l indicates the feasibility of placing it at the exit. Computations of an expression e_l are inserted at the exit of node i if INSERT_i^l = T. REDUND_i^l indicates that the upward exposed occurrence of e_l in node i is redundant and may be deleted.
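The local properties described above can be computed by a single forward scan of a basic block. The following C sketch is an illustration under simplifying assumptions that are not from the article: every statement has the form `lhs = op1 * op2` over single-character variable names, and the expression of interest is given by its two operands. COMP is included using its definition from Figure 1 (a computation not followed by a definition of any operand).

```c
#include <stdbool.h>
#include <stddef.h>

struct stmt { char lhs, op1, op2; };          /* models lhs = op1 * op2 */
struct local_props { bool antloc, comp, transp; };

/* Computes ANTLOC, COMP and TRANSP of the expression e1 * e2 for one
 * basic block. Within a statement the right-hand side is evaluated
 * before the left-hand side is defined. */
struct local_props local_properties(const struct stmt *block, size_t n,
                                    char e1, char e2)
{
    struct local_props p = { false, false, true };
    bool operand_defined = false;
    for (size_t s = 0; s < n; s++) {
        bool computes = block[s].op1 == e1 && block[s].op2 == e2;
        bool defines  = block[s].lhs == e1 || block[s].lhs == e2;
        if (computes) {
            if (!operand_defined)
                p.antloc = true;              /* upward exposed computation */
            p.comp = true;                    /* latest computation so far */
        }
        if (defines) {
            operand_defined = true;
            p.transp = false;                 /* an operand is redefined */
            p.comp = false;                   /* earlier computations are killed */
        }
    }
    return p;
}
```

For the block `x = a * b; a = c * d` and the expression a * b, the sketch reports ANTLOC true, COMP false, and TRANSP false: the computation is upward exposed, but the later definition of a kills it and makes the node non-transparent.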

LOCAL DATA FLOW PROPERTIES:

ANTLOC_i^l: Node i contains a computation of e_l, not preceded by a definition of any of its operands.
COMP_i^l: Node i contains a computation of e_l, not followed by a definition of any of its operands.
TRANSP_i^l: Node i does not contain a definition of any operand of e_l.

GLOBAL DATA FLOW PROPERTIES:

AVIN_i^l/AVOUT_i^l: e_l is available at the entry/exit of node i.
PAVIN_i^l/PAVOUT_i^l: e_l is partially available at the entry/exit of node i.
ANTIN_i^l/ANTOUT_i^l: e_l is anticipated at the entry/exit of node i.
PPIN_i^l/PPOUT_i^l: Computation of e_l may be placed at the entry/exit of node i.
INSERT_i^l: Computation of e_l should be inserted at the exit of node i.
REDUND_i^l: First computation of e_l existing in node i is redundant.

DATA FLOW EQUATIONS:

\[ \mathit{PPIN}_i = \mathit{PAVIN}_i \cdot (\mathit{ANTLOC}_i + \mathit{TRANSP}_i \cdot \mathit{PPOUT}_i) \cdot \prod_{j \in pred(i)} (\mathit{AVOUT}_j + \mathit{PPOUT}_j) \qquad (1) \]

\[ \mathit{PPOUT}_i = \prod_{k \in succ(i)} \mathit{PPIN}_k \qquad (2) \]

\[ \mathit{INSERT}_i = \mathit{PPOUT}_i \cdot \neg \mathit{AVOUT}_i \cdot (\neg \mathit{PPIN}_i + \neg \mathit{TRANSP}_i) \]

\[ \mathit{REDUND}_i = \mathit{PPIN}_i \cdot \mathit{ANTLOC}_i \]
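The equations above map directly onto word-level bit operations, with one bit per expression. The C sketch below evaluates each equation once for a single node; it is not the iterative solution procedure, and the type name `bitvec`, the function names, and the value 0 returned for nodes without predecessors or successors are assumptions made for illustration.

```c
#include <stdint.h>
#include <stddef.h>

typedef uint32_t bitvec;                      /* bit l stands for expression e_l */

/* Equation (2): product (bitwise AND) of PPIN over the successors. */
bitvec mra_ppout(const bitvec *succ_ppin, size_t nsucc)
{
    if (nsucc == 0) return 0;                 /* assumed boundary value at exit */
    bitvec v = ~(bitvec)0;
    for (size_t k = 0; k < nsucc; k++) v &= succ_ppin[k];
    return v;
}

/* Equation (1): pred_avout[j] and pred_ppout[j] belong to predecessor j. */
bitvec mra_ppin(bitvec pavin, bitvec antloc, bitvec transp, bitvec ppout,
                const bitvec *pred_avout, const bitvec *pred_ppout,
                size_t npred)
{
    if (npred == 0) return 0;                 /* assumed boundary value at entry */
    bitvec v = pavin & (antloc | (transp & ppout));
    for (size_t j = 0; j < npred; j++) v &= (pred_avout[j] | pred_ppout[j]);
    return v;
}

/* INSERT_i = PPOUT_i . not AVOUT_i . (not PPIN_i + not TRANSP_i) */
bitvec mra_insert(bitvec ppout, bitvec avout, bitvec ppin, bitvec transp)
{
    return ppout & ~avout & (~ppin | ~transp);
}

/* REDUND_i = PPIN_i . ANTLOC_i */
bitvec mra_redund(bitvec ppin, bitvec antloc)
{
    return ppin & antloc;
}
```

With a single expression in bit 0 and the values read off the table of Example 2.1 for node 2 (PPOUT = T, AVOUT = F, PPIN = F, TRANSP = T), `mra_insert(1, 0, 0, 1)` yields 1, in line with the statement that INSERT_2 is T.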

Fig.

The PPIN, the

term

equation

ANTIN,

is replaced

when

expression

the path

along

two terms expression

the

is not

which

term

rise to forward

data

equation).

safety

The

Example

2.1.

)

the

arise

feasibility

by

term

than

flow

of hoisting.

on the notion

graph

in Figure

subsumes

execu-

of an

Redundancy

the

which

gives

in the PPIN, of anticipabil-

backward dependencies in the PPOUT, equation).

MRA

represents

once. The other

by the II term

is based

MRA hoisting

one possible

of the expression

(reflected

program

performed

more

in MRA;

original

redundant

PAVIN,

at least

as follows.

of availability

of code movement

the

The

exists

equation

in the

to prohibit

represent

dependencies

Consider

the original

is computed

of MRA

on the notion

elimination

-’IRANSP,

. TRANSP,)

there

ity of the expression which introduces flow problem (reflected by the 11 term

redundancy

from

available.

in that

equation

flow

+

algorithm.

PAVIN,

partially

dependencies

is based

different

the expression

in the PPIN,

Bidirectional

The Morel-Renvoise

+ ~ ANTLOC,

of hoisting

profitability

tion

by

(-PPIN,

ANTLOC,

is slightly

. (PAVIN,

equations the

1.

~AVOUT,

in the

2. The following

data

partial three

optimization:

- Loop-Invariant Movement: The computations of a * b in node 4 and node 5 are hoisted out of the loops and are inserted in node 2 (REDUND_4, REDUND_5, and INSERT_2 are T).
- Code Hoisting: The partially redundant computation of a * b in node 12 is hoisted to node 7. As a result of suppressing this partial redundancy, the path 1-8-11-12 would have only one computation of a * b; the unoptimized program has two.
- Common-Subexpression Elimination: The totally redundant computation of a * b in node 6 is deleted as an instance of common-subexpression elimination.

Note that the partially redundant computation of a * b in node 11 is not suppressed since hoisting it to node 8 would be unsafe: the path 1-8-9 had no computation of a * b in the original program.

Node  TRANSP  ANTLOC  PAVIN  AVOUT  PPIN  PPOUT  INSERT  REDUND
  1     T       F       F      F      F     F      F       F
  2     T       F       F      F      F     T      T       F
  3     T       F       T      F      T     T      F       F
  4     T       T       T      T      T     F      F       T
  5     T       T       T      T      T     T      F       T
  6     T       T       T      T      T     F      F       T
  7     F       F       T      F      F     T      T       F
  8     T       F       F      F      F     F      F       F
  9     F       F       F      F      F     F      F       F
 10     T       T       F      T      F     F      F       F
 11     T       T       T      T      F     T      F       F
 12     T       T       T      T      T     F      F       T

Fig. 2. Program flow graph and properties for Example 2.1.

Example 2.2. Bidirectional data flows have also been used in register assignment and strength reduction optimizations. Figure 3 presents the data flow equations of two such algorithms.

Bit Vector Data Flow Analysls 0

13ASIC

LOAD

STORE

SPPIN,

INSERTION

=

ALGORITHM

~

(LSIA)

[Dhamdhere

.

1477

[Joshi

ad

1988b]

(SPPOUTJ)

j E ~re~(i) SPPOUT,

= DPANTOUT, ~

. ( DCOMP,

(DANTINk

+ DTRANSP,

SPPIN, )

+ SPPINk)

k E SUCC(i]

HOISTING AND STRENGTH REDUCTION ALGORITHM

COMPOSITE

0

Dhamdkere

1982.;

Joshi

and

Dharudhere

NOCOMIN,

=

CONSTA,

NOCOMOUT,

~ j NOCOMOUT,

=

(CHSA)

1982b] +

CONST13,

NOCOMOUT,

< pred(~)

CONSTC,

+

~

CONSTD,

. NOCOMIN,

CONSTE,

+

NOCOMIN,

k ∈ succ(i)

Fig. 3. Data flow equations of some other bidirectional problems.

The SPPIN/SPPOUT problem of LSIA performs sinking of STORE instructions using partial redundancy elimination techniques [Dhamdhere 1988b]. The NOCOMIN/NOCOMOUT problem of CHSA is used to inhibit the placement of an update computation following a high-strength computation [Joshi and Dhamdhere 1982a; 1982b].

3. NOTIONS FROM CLASSICAL DATA FLOW ANALYSIS

This section presents an overview of the classical theory of data flow analysis and compares various solution methods and their complexities. Our description is mostly based on Graham and Wegman [1976], Hecht [1977], and Marlowe and Ryder [1990]. A more detailed treatment can be found in Aho et al. [1986], Graham and Wegman [1976], Hecht [1977], Kam and Ullman [1977], Kildall [1973], Marlowe and Ryder [1990], and Rosen [1980]. The concluding part of this section motivates the need for a more general setting.

3.1 Preliminaries

A data flow framework is defined as a triple D = (L, ⊓, F) (Figure 4). Elements in L represent the information associated with the entry/exit of a basic block. ⊓ is the set union or intersection operation which determines the way the global information is combined when it reaches a basic block. A function f_i ∈ F represents the effect on the information as it flows through basic block i.²

² Alternatively, the flow functions can be associated with in-edges (out-edges) of node i for forward (backward) problems.
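For bit vector problems, the three components of the framework have a compact machine representation. The C sketch below is a minimal illustration under stated assumptions: elements of L are 32-bit words, the meet is bitwise OR (union) or AND (intersection), and each flow function has the usual gen/kill form f(x) = gen + (x . not kill). The gen/kill form and all names are assumptions for illustration; the text above does not prescribe them.

```c
#include <stdint.h>
#include <stddef.h>

typedef uint32_t bitvec;                      /* an element of L */

enum meet_kind { MEET_UNION, MEET_INTERSECTION };

struct flow_fn { bitvec gen, kill; };         /* assumed gen/kill flow function */

/* Combines the information arriving at a basic block from n edges. */
bitvec meet(enum meet_kind m, const bitvec *in, size_t n)
{
    /* start from the identity of the chosen operation */
    bitvec v = (m == MEET_INTERSECTION) ? ~(bitvec)0 : 0;
    for (size_t i = 0; i < n; i++)
        v = (m == MEET_INTERSECTION) ? (v & in[i]) : (v | in[i]);
    return v;
}

/* Effect of a basic block on the information flowing through it. */
bitvec apply_flow(struct flow_fn f, bitvec x)
{
    return f.gen | (x & ~f.kill);
}
```

A forward problem such as available expressions would use `MEET_INTERSECTION` over the predecessors of a block and then call `apply_flow` with that block's gen and kill sets; a union problem such as reaching definitions differs only in the meet.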