I'm not sure I understand what you are asking. So, I'll restate your question and then answer it. If I'm off a bit on the question part, please correct me.
You are looking at the SINGLE SCHEDULED ITERATION part of the compiler-generated output, and from that you think the loop should have ii=3. You don't understand why ii=2.
My answer is just an explanation of what the SINGLE SCHEDULED ITERATION represents. On page 12 of SPRA666, about halfway down, it says ...
If the source code is compiled with -mw, the software-pipelined loop information displays the scheduled instruction sequence for a single iteration of the software-pipelined loop. Examining this single scheduled iteration makes it easier to understand the compiler's output.
Looking at the single scheduled iteration shows you (among other things) the number of cycles to complete one full iteration of the loop. Do not confuse that with the initiation interval (ii). The ii is the number of cycles you have to wait before the next loop iteration can begin executing. Because iterations overlap in a software-pipelined loop, a new iteration can start before the previous one has finished, so the ii can be smaller than the length of a single iteration. With regard to performance, you don't worry about how long it takes for a single iteration to complete; you worry about making the ii as small as you can.
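To put rough numbers on that, here is a small sketch in C. It is only a back-of-the-envelope model, not anything the compiler emits: the function name pipelined_cycles and its parameters are made up for illustration, the iteration length of 3 cycles is my assumption based on your question, and the model ignores any setup or overhead outside the pipelined loop. The idea is that the first iteration costs its full length, and every iteration after that adds only ii cycles.

```c
#include <assert.h>

/* Hypothetical helper: approximate cycle count for a software-pipelined
 * loop. A new iteration starts every `ii` cycles, and the last iteration
 * to start still needs `iter_len` cycles to finish. Overhead outside the
 * pipelined loop is ignored. */
static unsigned long pipelined_cycles(unsigned long trip_count,
                                      unsigned long ii,
                                      unsigned long iter_len)
{
    assert(ii > 0);             /* an initiation interval of 0 makes no sense */
    if (trip_count == 0) {
        return 0;               /* loop body never runs */
    }
    return (trip_count - 1) * ii + iter_len;
}
```

With ii=2 and an assumed single-iteration length of 3, pipelined_cycles(100, 2, 3) comes to 201 cycles, versus 300 if each iteration had to finish before the next one started. For large trip counts the ii term dominates, which is why the ii is the number to watch.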
Hope this helps ...
-George