The main thing I think you always need is a good set of baseline performance stats to compare against. Regular (or constant) reports on the usage pattern of your storage will help no end when troubleshooting these types of issues.
I tend to use a combination of the following:
IOPS - This needs to be viewed against an expected level of performance, so good baselining and previous (non-issue) stats are essential to give you anything to go on.
Throughput (read Bps and write Bps) - Again, this is best compared against a good baseline of performance from previous days/hours etc., and should also be read alongside the IOPS value.
Read/Write ratio - This can also help you understand how the system is generating load on your storage. Depending on your storage model, you may find that writes are typically faster than reads, and a switch in the usage pattern from, say, 30% write / 70% read to 5% write / 95% read (or even the other way around) may be causing you more issues than a higher IOPS value or even more throughput. Typically this may lead to disk or transport latency (Fibre/iSCSI etc.) issues, which can then have a serious performance impact on the end system/service.
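For anyone wanting to pull those three numbers out of raw counters, here is a rough Python sketch. It assumes you can sample cumulative read/write operation and byte counters at two points in time; the `CounterSample` layout and the `storage_metrics` name are made up for illustration and are not tied to any particular array or tool.

```python
from dataclasses import dataclass


@dataclass
class CounterSample:
    """Cumulative counters read at one moment (hypothetical layout)."""
    timestamp: float   # seconds
    read_ops: int
    write_ops: int
    read_bytes: int
    write_bytes: int


def storage_metrics(prev: CounterSample, curr: CounterSample) -> dict:
    """Derive IOPS, throughput and read/write ratio between two samples."""
    elapsed = curr.timestamp - prev.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")

    reads = curr.read_ops - prev.read_ops
    writes = curr.write_ops - prev.write_ops
    total = reads + writes

    return {
        "iops": total / elapsed,
        "read_bps": (curr.read_bytes - prev.read_bytes) / elapsed,
        "write_bps": (curr.write_bytes - prev.write_bytes) / elapsed,
        # Ratio is reported as percentages of all operations in the interval.
        "read_pct": 100.0 * reads / total if total else 0.0,
        "write_pct": 100.0 * writes / total if total else 0.0,
    }
```

Feeding it samples taken a fixed interval apart (every minute, say) gives you a series you can store and later use as the baseline.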
It's kinda hard to give an answer that fits all situations, but a combination of the above, along with your typical measurements, helps build a picture of what is happening differently compared to a good day.
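As a rough illustration of the "compare against a good day" idea, the sketch below flags any metric whose current value sits far outside its baseline history. The function name, the dictionary layout and the three-standard-deviation threshold are all assumptions for the example; pick whatever threshold suits your environment.

```python
from statistics import mean, stdev


def deviations_from_baseline(baseline: dict, current: dict,
                             threshold: float = 3.0) -> dict:
    """Return metrics whose current value is at least `threshold`
    standard deviations away from the mean of their baseline history.

    baseline: metric name -> list of values from known-good periods
    current:  metric name -> latest value
    """
    flagged = {}
    for name, history in baseline.items():
        if name not in current or len(history) < 2:
            continue  # not enough data to judge this metric
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            continue  # flat history: a z-score is undefined
        z = (current[name] - mu) / sigma
        if abs(z) >= threshold:
            flagged[name] = z
    return flagged


# Made-up numbers: IOPS looks normal, but the read share has jumped.
good_days = {
    "iops": [1000, 1100, 900, 1050, 950],
    "read_pct": [70, 72, 68, 71, 69],
}
today = {"iops": 1020, "read_pct": 95}
flagged = deviations_from_baseline(good_days, today)
```

With these sample numbers only `read_pct` is flagged, which is the kind of usage-pattern shift that a plain IOPS graph would not show.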
I like how no one replied. Not important enough, I guess.
IOPS, and Growth per Month.