Forum Post: RE: Generating NEON instructions for Floating Point Operations on AM335x

Please see this wiki article. You'll see that you need to change --opt_level=off (you wrote the equivalent -Ooff) to --opt_level=2 or higher, and that you also need to use --opt_for_speed=3 or higher.

Changing to those options alone does not result in NEON instructions for your simple example, because the code offers too little optimization opportunity. I changed the code to this:

void tstfn(float *x, float *y, float * restrict z, float *k, int length)
{
    int i;

    for (i = 0; i < length; i++)
        z[i] = (x[i] + k[i]) * y[i];
}

That change, together with the option changes, results in NEON instructions.

Why the restrict on the z pointer? This wiki article describes restrict in detail. In this case, it tells the compiler that the memory written through z is accessed only through z, so a store to z[i] cannot change any value later read through x, y, or k. That allows the compiler to reorder memory accesses, and thus group them so that NEON instructions can be used.
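The restrict qualifier is a promise made by the caller, not something the compiler checks. As a minimal sketch of what that promise looks like in practice, here is the same function together with a call that honors it. The helper name demo_valid_call and the sample values are made up for illustration and are not from the original example.

```c
#include <assert.h>

/* Same loop as in the post: z is restrict-qualified, so the caller
   promises that the z buffer does not overlap x, y, or k. */
void tstfn(float *x, float *y, float * restrict z, float *k, int length)
{
    int i;

    for (i = 0; i < length; i++)
        z[i] = (x[i] + k[i]) * y[i];
}

/* Hypothetical helper: calls tstfn with four distinct buffers, which
   satisfies the restrict promise, and returns one computed element. */
float demo_valid_call(void)
{
    float x[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    float y[4] = {2.0f, 2.0f, 2.0f, 2.0f};
    float k[4] = {0.5f, 0.5f, 0.5f, 0.5f};
    float z[4] = {0.0f, 0.0f, 0.0f, 0.0f};

    tstfn(x, y, z, k, 4);

    /* z[2] = (x[2] + k[2]) * y[2] = (3.0 + 0.5) * 2.0 */
    assert(z[2] == 7.0f);
    return z[2];

    /* By contrast, a call such as tstfn(x, y, x + 1, k, 3) would make z
       overlap x. That breaks the promise and the behavior is undefined,
       so the vectorized code may produce different results than a
       strictly sequential loop. */
}
```

If the buffers in your real code can overlap, leave restrict off and accept that the compiler may not be able to use NEON for that loop.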

Thanks and regards,

-George