Discussion:
[Open64-devel] How to dump the ir tree after vectorization?
Huan Luo
2013-01-17 07:25:41 UTC
Permalink
Hi,
Lately we've been trying to use Open64 to vectorize our
program and dump the tree for later analysis, but we are not
sure which options are available. Could anybody tell us?
Something similar can be done with gcc. For example:
fun.c
===========================================
int i, j;
int a[1024], b[1024], c[1024];

main()
{
    for (i=0; i<100; i++) {
        a[i]=a[i]+b[i]*c[i];
    }
}
============================================
We use the option -ftree-vectorize to apply vectorization, and
-fdump-tree-uncprop to dump the functions after vectorization.
The command is:
gcc -O -dA -msse2 -ffast-math -ftree-vectorize -fdump-tree-uncprop fun.c
and the file looks like this:
fun.c.136t.uncprop
============================================

;; Function main (main) (executed once)

main ()
{
long unsigned int ivtmp.30;
vector(4) int vect_var_.21;
vector(4) int vect_var_.20;
vector(4) int vect_var_.19;
vector(4) int vect_var_.14;
vector(4) int vect_var_.9;

<bb 2>:

<bb 3>:
# ivtmp.30_15 = PHI <ivtmp.30_7(3), 0(2)>
vect_var_.9_18 = MEM[symbol: a, index: ivtmp.30_15, offset: 0B];
vect_var_.14_22 = MEM[symbol: b, index: ivtmp.30_15, offset: 0B];
vect_var_.19_26 = MEM[symbol: c, index: ivtmp.30_15, offset: 0B];
vect_var_.20_27 = vect_var_.14_22 * vect_var_.19_26;
vect_var_.21_28 = vect_var_.9_18 + vect_var_.20_27;
MEM[symbol: a, index: ivtmp.30_15, offset: 0B] = vect_var_.21_28;
ivtmp.30_7 = ivtmp.30_15 + 16;
if (ivtmp.30_7 != 400)
goto <bb 3>;
else
goto <bb 4>;

<bb 4>:
i = 100;
return;

}
============================================
However, the gcc dump is not enough for our analysis, so I wonder whether there is a way
to do this in Open64, which may turn out to be more convenient than gcc.
Your help would be greatly appreciated.




--

Best wishes.

Huan Luo
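For readers less used to GIMPLE dumps, the vectorized loop in <bb 3> above can be read roughly as the C below. This is only an illustrative sketch, not GCC output: it assumes the GCC/Clang vector_size extension, and the names v4si and fun_vectorized are made up here. The point is that ivtmp is a byte offset advanced by 16 (one vector of four ints) until it reaches 400 bytes, i.e. 100 ints.

```c
#include <string.h>

/* Hypothetical type name: four 32-bit ints in one 16-byte vector
   (GCC/Clang vector extension, not standard C). */
typedef int v4si __attribute__((vector_size(16)));

int a[1024], b[1024], c[1024];
int i;

/* Mirrors <bb 3> and <bb 4> of the dump. */
void fun_vectorized(void)
{
    unsigned long ivtmp;   /* byte offset into the arrays */

    for (ivtmp = 0; ivtmp != 400; ivtmp += 16) {
        v4si va, vb, vc;

        /* The three MEM[symbol: ..., index: ivtmp] loads. */
        memcpy(&va, (char *)a + ivtmp, sizeof va);
        memcpy(&vb, (char *)b + ivtmp, sizeof vb);
        memcpy(&vc, (char *)c + ivtmp, sizeof vc);

        /* vect_var_.20 = b * c; vect_var_.21 = a + vect_var_.20 */
        va = va + vb * vc;

        /* The store back to MEM[symbol: a, index: ivtmp]. */
        memcpy((char *)a + ivtmp, &va, sizeof va);
    }
    i = 100;               /* <bb 4>: final value of the loop counter */
}
```

Only the first 100 elements of a are updated, 25 iterations of four lanes each, which matches the exit test against 400 in the dump.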
Das, Dibyendu
2013-01-17 08:45:58 UTC
Permalink
At -O3 and above you should be able to use -LNO:simd=1,2,3 for vectorization.
-dibyendu

Huan Luo
2013-01-18 00:51:21 UTC
Permalink
Hi,
There seems to be a problem. Open64 cannot vectorize the loop...
$ opencc -O3 -msse2 -ffast-math -LNO:simd=2 -LNO:simd_verbose=ON fun.c
And the output is:
(fun.c:6) Expression rooted at op "OPC_I4MPY"(line 7) is not vectorizable. Loop was not vectorized.
I don't see why it cannot vectorize such a simple loop.

fun.c
==========================
int i, j;
int a[1024], b[1024], c[1024];

main()
{
    for (i=0; i<1024; i++) {
        a[i]=a[i]+b[i]*c[i];
    }
}
==========================





--

Best wishes.

Huan Luo


At 2013-01-17 17:07:13, "Das, Dibyendu" <***@amd.com> wrote:

Try -LNO:simd_verbose=ON

From: Huan Luo [mailto:***@126.com]
Sent: Thursday, January 17, 2013 2:31 PM
To: Das, Dibyendu
Subject: Re:RE: [Open64-devel] How to dump the ir tree after vectorization?

Thank you very much. That works.
And any idea on how to dump the internal structure after vectorization?

--
Best wishes.

Huan Luo




Sun Chan
2013-01-18 02:22:14 UTC
Permalink
Maybe others can check whether my suspicion is correct. The arrays are
global arrays. By the pre-emption rule, they are pre-emptible, meaning
their values can be changed by dependent libraries (shared libraries).
If you make them "static" or "local", things should work fine. If my claim
is correct, this is an ABI issue, not a compiler optimization issue.
Sun
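One way to test this suspicion is a variant of fun.c whose arrays have internal linkage. The sketch below is an assumption about what was meant, not something tried in the thread: the function name fun_static is made up, and making the loop counter local is an extra change on top of the suggestion.

```c
/* static gives the arrays internal linkage, so a definition in another
   shared object cannot pre-empt them. */
static int a[1024], b[1024], c[1024];

/* Hypothetical name; same loop body as the original fun.c. */
void fun_static(void)
{
    int i;   /* local counter (assumption: the original used a global i) */

    for (i = 0; i < 1024; i++) {
        a[i] = a[i] + b[i] * c[i];
    }
}
```

Rebuilding with the flags already used in the thread (opencc -O3 -LNO:simd=2 -LNO:simd_verbose=ON) would show whether the "not vectorizable" message goes away.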
_______________________________________________
Open64-devel mailing list
https://lists.sourceforge.net/lists/listinfo/open64-devel
Rao, Shivarama
2013-01-18 11:13:55 UTC
Permalink
Hi,

There are no SIMD multiply instructions that take two INT4 operands and produce an INT4 result. That is why this loop is not getting vectorized. It will get vectorized if you change the array types to short/float/double.

You can dump the internal structure after LNO using the -Wb,-trLNO option.

Regards,
Shivaram
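As a concrete version of the type change suggested above, here is the same loop with float arrays. It is only a sketch: the function name fun_float and the local counter are made up for illustration, and the thread does not show the resulting compiler output.

```c
/* Same loop as fun.c, but with float elements, so the multiply is a
   single-precision operation rather than a 4-byte integer one. */
float a[1024], b[1024], c[1024];

/* Hypothetical name, not from the original fun.c. */
void fun_float(void)
{
    int i;

    for (i = 0; i < 1024; i++) {
        a[i] = a[i] + b[i] * c[i];
    }
}
```

Adding -Wb,-trLNO to the opencc command line used earlier in the thread should then dump the internal structure after LNO for this version. Combining the flags this way is an assumption, not something shown in the thread.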


Rao, Shivarama
2013-01-18 11:23:47 UTC
Permalink
I meant for SSE2. These instructions are present in SSE4, and vectorization should work on the corresponding target.

Regards,
Shivaram
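To make the SSE2 versus SSE4 point concrete, here is a hand-written sketch using compiler intrinsics. It is not how Open64 generates code, and the names mul_i32x4 and madd_i32 are made up. It assumes an x86 compiler: with SSE4.1 enabled, a single _mm_mullo_epi32 (pmulld) does the four 32-bit multiplies, while plain SSE2 has to build the result from two _mm_mul_epu32 (pmuludq) products plus shuffles.

```c
#include <assert.h>
#include <emmintrin.h>   /* SSE2 intrinsics */
#ifdef __SSE4_1__
#include <smmintrin.h>   /* SSE4.1: _mm_mullo_epi32 */
#endif

/* Lane-wise multiply of four 32-bit ints, keeping the low 32 bits. */
static __m128i mul_i32x4(__m128i x, __m128i y)
{
#ifdef __SSE4_1__
    return _mm_mullo_epi32(x, y);
#else
    /* SSE2 fallback: 64-bit products of lanes 0 and 2, then of lanes
       1 and 3, then gather the low halves back into lane order. */
    __m128i even = _mm_mul_epu32(x, y);
    __m128i odd  = _mm_mul_epu32(_mm_srli_si128(x, 4), _mm_srli_si128(y, 4));
    even = _mm_shuffle_epi32(even, _MM_SHUFFLE(0, 0, 2, 0));
    odd  = _mm_shuffle_epi32(odd,  _MM_SHUFFLE(0, 0, 2, 0));
    return _mm_unpacklo_epi32(even, odd);
#endif
}

/* a[i] += b[i] * c[i] for n elements; n must be a multiple of 4. */
void madd_i32(int *a, const int *b, const int *c, int n)
{
    int i;

    assert(n % 4 == 0);
    for (i = 0; i < n; i += 4) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        __m128i vc = _mm_loadu_si128((const __m128i *)(c + i));
        va = _mm_add_epi32(va, mul_i32x4(vb, vc));
        _mm_storeu_si128((__m128i *)(a + i), va);
    }
}
```

The fallback works for signed values as well, because the low 32 bits of a product are the same for signed and unsigned operands.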




Rao, Shivarama
2013-01-18 12:29:43 UTC
Permalink
For some reason, we don't seem to handle this case and hence don't vectorize the loop.

The Is_Well_Formed_Simd function (simd.cxx) returns FALSE and stops vectorization.

if (WN_operator(wn) == OPR_MPY && WN_rtype(wn) == MTYPE_I4 &&
    WN_desc(parent) != MTYPE_I2) {
  if ((WN_operator(kid0) == OPR_INTCONST && WN_const_val(kid0) == 2 &&
       WN_operator(kid1) == OPR_ILOAD && WN_operator(WN_kid0(kid1)) == OPR_ARRAY)
      || (WN_operator(kid1) == OPR_INTCONST && WN_const_val(kid1) == 2 &&
          WN_operator(kid0) == OPR_ILOAD && WN_operator(WN_kid0(kid0)) == OPR_ARRAY)
     ); // bug 5844: 2*b[i] should be fine
  else return FALSE;
}


Regards,
Shivaram
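To see which expressions the quoted check lets through, here is a small stand-alone model of it. Everything in it is a simplified stand-in made up for illustration: the Node struct, the two enums and the function names are not Open64's real WHIRL definitions, and only the shape of the condition is taken from the snippet above.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for WHIRL operators/types. */
typedef enum { OPR_MPY, OPR_INTCONST, OPR_ILOAD, OPR_ARRAY, OPR_ISTORE } Operator;
typedef enum { MTYPE_I2, MTYPE_I4, MTYPE_F4 } Mtype;

typedef struct Node {
    Operator op;
    Mtype rtype;            /* result type */
    Mtype desc;             /* descriptor type */
    long const_val;         /* only meaningful for OPR_INTCONST */
    struct Node *kid0, *kid1;
} Node;

/* An ILOAD whose address operand is an ARRAY node, e.g. b[i]. */
static int is_array_load(const Node *n)
{
    return n->op == OPR_ILOAD && n->kid0 != NULL && n->kid0->op == OPR_ARRAY;
}

/* The integer constant 2. */
static int is_const_two(const Node *n)
{
    return n->op == OPR_INTCONST && n->const_val == 2;
}

/* Returns 0 where the quoted code would return FALSE, 1 otherwise. */
int mpy_passes_check(const Node *wn, const Node *parent)
{
    if (wn->op == OPR_MPY && wn->rtype == MTYPE_I4 &&
        parent->desc != MTYPE_I2) {
        if ((is_const_two(wn->kid0) && is_array_load(wn->kid1)) ||
            (is_const_two(wn->kid1) && is_array_load(wn->kid0)))
            return 1;       /* 2*b[i] is allowed (bug 5844) */
        return 0;           /* any other 4-byte integer multiply */
    }
    return 1;               /* not an I4 multiply: this check does not apply */
}
```

In this model b[i]*c[i] with int elements is rejected, while 2*b[i] and a float multiply pass, which is consistent with the OPC_I4MPY message reported earlier in the thread.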


