IB/mlx4: Add diagnostic hardware counters
Expose IB diagnostic hardware counters. The counters count IB events and are applicable for IB and RoCE. The counters can be divided into two groups, per device and per port. Device counters are always exposed. Port counters are exposed only if the firmware supports per port counters. rq_num_dup and sq_num_to are only exposed if we have firmware support for them, if we do, we expose them per device and per port. rq_num_udsdprd and num_cqovf are device only counters. rq - denotes responder. sq - denotes requester. |-----------------------|---------------------------------------| | Name | Description | |-----------------------|---------------------------------------| |rq_num_lle | Number of local length errors | |-----------------------|---------------------------------------| |sq_num_lle | number of local length errors | |-----------------------|---------------------------------------| |rq_num_lqpoe | Number of local QP operation errors | |-----------------------|---------------------------------------| |sq_num_lqpoe | Number of local QP operation errors | |-----------------------|---------------------------------------| |rq_num_lpe | Number of local protection errors | |-----------------------|---------------------------------------| |sq_num_lpe | Number of local protection errors | |-----------------------|---------------------------------------| |rq_num_wrfe | Number of CQEs with error | |-----------------------|---------------------------------------| |sq_num_wrfe | Number of CQEs with error | |-----------------------|---------------------------------------| |sq_num_mwbe | Number of Memory Window bind errors | |-----------------------|---------------------------------------| |sq_num_bre | Number of bad response errors | |-----------------------|---------------------------------------| |sq_num_rire | Number of Remote Invalid request | | | errors | |-----------------------|---------------------------------------| |rq_num_rire | Number of Remote Invalid request | | | errors | |-----------------------|---------------------------------------| |sq_num_rae | Number of remote access errors | |-----------------------|---------------------------------------| |rq_num_rae | Number of remote access errors | |-----------------------|---------------------------------------| |sq_num_roe | Number of remote operation errors | |-----------------------|---------------------------------------| |sq_num_tree | Number of transport retries exceeded | | | errors | |-----------------------|---------------------------------------| |sq_num_rree | Number of RNR NAK retries exceeded | | | errors | |-----------------------|---------------------------------------| |rq_num_rnr | Number of RNR NAKs sent | |-----------------------|---------------------------------------| |sq_num_rnr | Number of RNR NAKs received | |-----------------------|---------------------------------------| |rq_num_oos | Number of Out of Sequence requests | | | received | |-----------------------|---------------------------------------| |sq_num_oos | Number of Out of Sequence NAKs | | | received | |-----------------------|---------------------------------------| |rq_num_udsdprd | Number of UD packets silently | | | discarded on the Receive Queue due to | | | lack of receive descriptor | |-----------------------|---------------------------------------| |rq_num_dup | Number of duplicate requests received | |-----------------------|---------------------------------------| |sq_num_to | Number of time out received | |-----------------------|---------------------------------------| |num_cqovf | Number of CQ overflows | |-----------------------|---------------------------------------| Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
Showing