<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div class="h5"><div dir="ltr"><div>Hi All,</div><div><br></div><div>This one looks somewhat similar to the last example, but is different.</div>
<div><br></div><div>int foo(int N, int j, int *x, int *z)</div><div>{</div>
<div>&nbsp;&nbsp;int y = N;</div>
<div>&nbsp;&nbsp;N += 7;</div><div>&nbsp;&nbsp;N &gt;&gt;= 3;</div><div>&nbsp;&nbsp;int i;</div><div>&nbsp;&nbsp;for(i = 0; i &lt; j; i++)</div><div>&nbsp;&nbsp;{</div>
<div>&nbsp;&nbsp; &nbsp;x += N*N &lt;&lt; 3;</div><div>&nbsp;&nbsp; &nbsp;z = x + N;</div>
<div>&nbsp;&nbsp; &nbsp;y = y + *x + *z;</div><div>&nbsp;&nbsp;}</div>
<div>&nbsp;&nbsp;return y;</div><div>}</div><div><br></div><div>Assembly of the loop at -O3.</div><div><div><span style="white-space:pre-wrap"> </span>.p2align 4,,15</div><div>.Lt_0_3586:</div><div>&nbsp;#&lt;loop&gt; Loop body line 7, nesting depth: 1, estimated iterations: 1000</div>
<div><span style="white-space:pre-wrap"> </span>.loc<span style="white-space:pre-wrap"> </span>1<span style="white-space:pre-wrap"> </span>9<span style="white-space:pre-wrap"> </span>0</div><div>&nbsp;# &nbsp; 8 &nbsp; &nbsp;{</div>
<div>&nbsp;# &nbsp; 9 &nbsp; &nbsp; &nbsp;x += N*N &lt;&lt; 3;</div><div><span style="white-space:pre-wrap"> </span>movl %eax,%ebx &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span style="white-space:pre-wrap"> </span># [0]&nbsp;</div><div><span style="white-space:pre-wrap"> </span>.loc<span style="white-space:pre-wrap"> </span>1<span style="white-space:pre-wrap"> </span>11<span style="white-space:pre-wrap"> </span>0</div>
<div>&nbsp;# &nbsp;10 &nbsp; &nbsp; &nbsp;z = x + N;</div><div>&nbsp;# &nbsp;11 &nbsp; &nbsp; &nbsp;y = y + *x + *z;</div><div><span style="white-space:pre-wrap"> </span>addl $1,%ebp &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span style="white-space:pre-wrap"> </span># [0]&nbsp;</div><div><span style="white-space:pre-wrap"> </span>.loc<span style="white-space:pre-wrap"> </span>1<span style="white-space:pre-wrap"> </span>9<span style="white-space:pre-wrap"> </span>0</div>
<div><span style="white-space:pre-wrap"> </span>imull %eax,%ebx &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="white-space:pre-wrap"> </span># [1]&nbsp;</div><div><span style="white-space:pre-wrap"> </span>shll $3,%ebx &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span style="white-space:pre-wrap"> </span># [4]&nbsp;</div>
<div><span style="white-space:pre-wrap"> </span>shll $2,%ebx &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span style="white-space:pre-wrap"> </span># [5]&nbsp;</div><div><span style="white-space:pre-wrap"> </span>addl %ebx,%edi &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span style="white-space:pre-wrap"> </span># [6]&nbsp;</div>
<div><span style="white-space:pre-wrap"> </span>addl %ebx,%esi &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span style="white-space:pre-wrap"> </span># [6]&nbsp;</div><div><span style="white-space:pre-wrap"> </span>.loc<span style="white-space:pre-wrap"> </span>1<span style="white-space:pre-wrap"> </span>11<span style="white-space:pre-wrap"> </span>0</div>
<div><span style="white-space:pre-wrap"> </span>movl 0(%edi),%ecx &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="white-space:pre-wrap"> </span># [7] id:23</div><div><span style="white-space:pre-wrap"> </span>addl 0(%esi),%ecx &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="white-space:pre-wrap"> </span># [10]&nbsp;</div>
<div><span style="white-space:pre-wrap"> </span>addl %ecx,%edx &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span style="white-space:pre-wrap"> </span># [13]&nbsp;</div><div><span style="white-space:pre-wrap"> </span>cmpl 36(%esp),%ebp &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span style="white-space:pre-wrap"> </span># [13] j</div>
<div><span style="white-space:pre-wrap"> </span>jl .Lt_0_3586 &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="white-space:pre-wrap"> </span># [16]&nbsp;</div></div><div><br></div><div>As we see, the imul instruction remains in the loop.&nbsp;</div>
<div>(So do two consecutive shll instructions. My guess is that CG does not expect such input from WOPT, so it does not fold them, although the fold is simple.)</div><div><br></div><div>It looks like SSA PRE skips the RHS of the Iv_update statement x += N*N &lt;&lt; 3, and VNFRE performs only one level of CSE, for example moving the ASHR + LDC 3 out of the loop.</div>
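<div><br></div><div>For comparison, here is a hand-written sketch of the result one might expect if the invariant N*N &lt;&lt; 3 were hoisted out of the loop. It is only an illustration of the expected transformation, not compiler output: foo_hoisted and step are hypothetical names, and foo repeats the test case so the two can be compared.</div>

```c
/* Test case from the message: N*N << 3 is evaluated on every iteration. */
int foo(int N, int j, int *x, int *z)
{
  int y = N;
  N += 7;
  N >>= 3;
  int i;
  for (i = 0; i < j; i++)
  {
    x += N*N << 3;
    z = x + N;
    y = y + *x + *z;
  }
  return y;
}

/* Hand-hoisted sketch (hypothetical name). N is not modified inside the
   loop, so the increment of x is loop invariant and is computed once. */
int foo_hoisted(int N, int j, int *x, int *z)
{
  int y = N;
  N += 7;
  N >>= 3;
  const int step = N*N << 3;  /* hypothetical temporary holding the invariant */
  int i;
  for (i = 0; i < j; i++)
  {
    x += step;
    z = x + N;
    y = y + *x + *z;
  }
  return y;
}
```

<div>With this form the multiply and the shifts would run once before the loop, leaving only adds and loads in the loop body.</div>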
<div><br></div><div>I am curious why SSA PRE omits the expression here. With this omission disabled in opt_etable.cxx, the result looks good for this test case. Could disabling it cause a correctness issue for other test cases, or a performance issue?</div>
<div><br></div><div>Note that one strength reduction transformation is done for z in this case. Also, replacing "N &gt;&gt;= 3;" with "N *= 5;" results in similarly sub-optimal code.</div><div><br>
</div>
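<div>To make the strength reduction on z concrete, here is a rough source-level sketch. It assumes the two addl %ebx instructions in the loop correspond to x and z each being advanced by the same step. The names foo_baseline, foo_strength_reduced and step are hypothetical.</div>

```c
/* Baseline: z is recomputed from x on every iteration. */
int foo_baseline(int N, int j, int *x, int *z)
{
  int y = N;
  N += 7;
  N >>= 3;
  int i;
  for (i = 0; i < j; i++)
  {
    x += N*N << 3;
    z = x + N;
    y = y + *x + *z;
  }
  return y;
}

/* Sketch of the strength-reduced form (hypothetical name): z becomes its
   own induction variable. Since z == x + N always holds after the update,
   advancing z by the same step as x preserves that relation. */
int foo_strength_reduced(int N, int j, int *x, int *z)
{
  int y = N;
  N += 7;
  N >>= 3;
  const int step = N*N << 3;  /* hypothetical temporary */
  z = x + N;                  /* established once, before the loop */
  int i;
  for (i = 0; i < j; i++)
  {
    x += step;
    z += step;                /* replaces z = x + N */
    y = y + *x + *z;
  }
  return y;
}
```

<div>The sketch also hoists the step so that it is self-contained. In the assembly above the step is still recomputed in every iteration, which is the problem described in this message.</div>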
<div>Best Regards,</div><div>Yiran Wang</div><div><br></div><div><br></div></div>
<br></div></div>_______________________________________________<br>
Open64-devel mailing list<br>
<a href="mailto:Open64-***@lists.sourceforge.net" target="_blank">Open64-***@lists.sourceforge.net</a><br>
<a href="https://lists.sourceforge.net/lists/listinfo/open64-devel" target="_blank">https://lists.sourceforge.net/lists/listinfo/open64-devel</a><br>
<br></blockquote></div><span class="HOEnZb"><font color="#888888"><br><br clear="all"><br>-- <br>Regards,<br>Lai Jian-Xin
</font></span></div>
</blockquote></div><br></div>