- 25 Apr, 2017 24 commits
-
-
Jason Gunthorpe authored
imm_data is copied directly from the ib_send_wr and ib_wc which have it marked as __be32, copy that mark into the uapi structures as well. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Tested-by: Adit Ranadive <aditr@vmware.com> Acked-by: Adit Ranadive <aditr@vmware.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Artemy Kovalyov authored
Internally MW implemented as KLM MKey and filled by userspace UMR postsends. Handle pagefault trigered by operations on this MKeys. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Artemy Kovalyov authored
To make page fault handling code more flexible split pagefault_single_data_segment() function. Keep MR resolution in pagefault_single_data_segment() and move actual updates into pagefault_single_mr(). Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Artemy Kovalyov authored
Add IB_ACCESS_HUGETLB ib_reg_mr flag. Hugetlb region registered with this flag will use single translation entry per huge page. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Artemy Kovalyov authored
Currenlty ODP supports only regular MMU pages. Add ODP support for regions consisting of physically contiguous chunks of arbitrary order (huge pages for instance) to improve performance. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Artemy Kovalyov authored
Currenlty ODP supports only regular MMU pages. Add ODP support for regions consisting of physically contiguous chunks of arbitrary order (huge pages for instance) to improve performance. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Artemy Kovalyov authored
Decrease verbosity level of ODP error flows messages to debug level. Remove one redundant print since debug level message already exists in this flow. Fixes: d9aaed83 ('{net,IB}/mlx5: Refactor page fault handling') Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Artemy Kovalyov authored
When implicit MR's leaf MKey becomes unused, i.e. when it's last page being released my MMU invalidation it is marked as "dying" and scheduled for release by garbage collector. Currentle consequent page fault may remove "dying" flag. Treat leaf MKey as non-existent once it was scheduled to removal by GC. Fixes: 81713d37 ('IB/mlx5: Add implicit MR support') Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Artemy Kovalyov authored
Translation table updates of large UMR may require multiple post send operations. The last operations can be in various lengths, but current code set them to be the same length. Fixes: 7d0cc6ed ('IB/mlx5: Add MR cache for large UMR regions') Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Artemy Kovalyov authored
In memory shortage path we fall back to use spare buffer. mlx5_ib_update_xlt() called from ib_uverbs_reg_mr when ibmr.ucontext not initialized yet. Scenario how to test it: 1. trigger memory exhaustion so __get_free_pages(GFP_KERNEL, 4) will fail 2. register MR 3. there should be no kernel oops Fixes: 7d0cc6ed ('IB/mlx5: Add MR cache for large UMR regions') Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Artemy Kovalyov authored
Size of pages are held by struct ib_umem in page_size field. It is better to store it as an exponent, because page size by nature is always power-of-two and used as a factor, divisor or ilog2's argument. The conversion of page_size to be page_shift allows to have portable code and avoid following error while compiling on ARM: ERROR: "__aeabi_uldivmod" [drivers/infiniband/core/ib_core.ko] undefined! CC: Selvin Xavier <selvin.xavier@broadcom.com> CC: Steve Wise <swise@chelsio.com> CC: Lijun Ou <oulijun@huawei.com> CC: Shiraz Saleem <shiraz.saleem@intel.com> CC: Adit Ranadive <aditr@vmware.com> CC: Dennis Dalessandro <dennis.dalessandro@intel.com> CC: Ram Amrani <Ram.Amrani@Cavium.com> Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Acked-by: Ram Amrani <Ram.Amrani@cavium.com> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Acked-by: Selvin Xavier <selvin.xavier@broadcom.com> Acked-by: Selvin Xavier <selvin.xavier@broadcom.com> Acked-by: Adit Ranadive <aditr@vmware.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Zhu Yanjun authored
The function ib_unregister_mad_agent always returns zero. And this returned value is not checked. As such, chane the return type to void. CC: Joe Jin <joe.jin@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Hal Rosenstock <hal@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Selvin Xavier authored
Since ocrdma driver is not going to be updated with any new development activity, except for critical bug fixes reported by partners or customers, changing the module status to "Odd Fixes". Also, updating the web page info and the maintainers email addresses. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Ira Weiny authored
Fix off by 1 error in comments documenting the sdma and send context mappings. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Geert Uytterhoeven authored
Submitters of device tree binding documentation may forget to CC the subsystem maintainer if this is missing. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Doug Ledford <dledford@redhat.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Cc: linux-rdma@vger.kernel.org Acked-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Vlad Tsyrklevich authored
The 'num_sge' variable is verfied to be smaller than the 'sge_count' variable; however, since both are user-controlled it's possible to cause an integer overflow for the kmalloc multiply on 32-bit platforms (num_sge and sge_count are both defined u32). By crafting an input that causes a smaller-than-expected allocation it's possible to write controlled data out-of-bounds. Signed-off-by: Vlad Tsyrklevich <vlad@tsyrklevich.net> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Arnd Bergmann authored
hns_roce_v1_cq_set_ci() calls roce_set_bit() on an uninitialized field, which will then change only a few of its bits, causing a warning with the latest gcc: infiniband/hw/hns/hns_roce_hw_v1.c: In function 'hns_roce_v1_cq_set_ci': infiniband/hw/hns/hns_roce_hw_v1.c:1854:23: error: 'doorbell[1]' is used uninitialized in this function [-Werror=uninitialized] roce_set_bit(doorbell[1], ROCEE_DB_OTHERS_H_ROCEE_DB_OTH_HW_SYNS_S, 1); The code is actually correct since we always set all bits of the port_vlan field, but gcc correctly points out that the first access does contain uninitialized data. This initializes the field to zero first before setting the individual bits. Fixes: 9a443537 ("IB/hns: Add driver files for hns RoCE driver") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Petr Mladek authored
Kthreads are currently implemented as an infinite loop. Each has its own variant of checks for terminating, freezing, awakening. In many cases it is unclear to say in which state it is and sometimes it is done a wrong way. The plan is to convert kthreads into kthread_worker or workqueues API. It allows to split the functionality into separate operations. It helps to make a better structure. Also it defines a clean state where no locks are taken, IRQs blocked, the kthread might sleep or even be safely migrated. The kthread worker API is useful when we want to have a dedicated single thread for the work. It helps to make sure that it is available when needed. Also it allows a better control, e.g. define a scheduling priority. This patch converts the frm_pool kthread into the kthread worker API because I am not sure how busy the thread is. It is well possible that it does not need a dedicated kthread and workqueues would be perfectly fine. Well, the conversion between kthread worker API and workqueues is pretty trivial. The patch moves one iteration from the kthread into the work function. It is queued only when there is a pending work. Therefore we do not need to compare flush_ser and req_ser at the beginning. On the contrary, the same work could be queued only once at a time. Therefore it has to re-queue itself if some requests are pending. Otherwise, wake_up_process() is replaced by queuing the work. Important: The change is only compile tested. I did not find an easy way how to check it in a real life. Signed-off-by: Petr Mladek <pmladek@suse.com> TO: Doug Ledford <dledford@redhat.com> CC: Sean Hefty <sean.hefty@intel.com> CC: Hal Rosenstock <hal.rosenstock@gmail.com> CC: linux-rdma@vger.kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Yuval Shaia authored
This logic seems to be duplicated in (at least) three separate files. Move it to one place so code can be re-use. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
-
Yuval Shaia authored
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Colin Ian King authored
trivial fix to spelling mistake in iser_err error message Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Acked-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Ganesh Goudar authored
Enable the use of dsgl by default and determine whether dsgl is supported from lld info. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Bharat Potnuri <bharat@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Doug Ledford authored
Constructs such as if (ptr && !IS_ERR(ptr)) can be shorted to just !IS_ERR_OR_NULL(ptr) instead. Make substitutions in the bnxt_re driver where appropriate. Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Colin Ian King authored
rc is initialized to zero but is then updated by calls to bnxt_qplib_free_fast_reg_page_list and/or bnxt_qpliob_free_mrw so the initialization is redundant and can be removed. Detected with CoverityScan, CID#1408448 ("Unused Value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Laurence Oberman <loberman@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
- 21 Apr, 2017 16 commits
-
-
Noa Osherovich authored
Add missing calculation and translation of active_width and active_speed for RoCE. Fixes: 3f89a643 ('IB/mlx5: Extend query_device/port to ...') Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Noa Osherovich authored
In case of an error, the properties reported to user are zeroed out, so no need for a return value. Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Noa Osherovich authored
Add high data rate speed to the ib_port_speed enumeration. Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Moni Shoua authored
There is a difference when parsing a completion entry between Ethernet and IB ports. When link layer is Ethernet the bits describe the type of L3 header in the packet. In the case when link layer is Ethernet and VLAN header is present the value of SL is equal to the 3 UP bits in the VLAN header. If VLAN header is not present then the SL is undefined and consumer of the completion should check if IB_WC_WITH_VLAN is set. While that, this patch also fills the vlan_id field in the completion if present. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Moni Shoua authored
Current implementation of RDMA_CM sends MRA (Message Receipt Acknowledgment) only for request messages but not for response messages. As a result, a slow active side of the connection may send a ready-to-use message to the passive side in a delay that is too long for the passive side to wait for. This patch adds a call to ib_send_cm_mra() upon receiving a response message and by this tells the other side to modify the service timeout to a bigger value, 16 times than before. As in the request case, MRA for reply will be sent only if a duplicate response has arrived. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Matan Barak <matan@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Parav Pandit authored
This patch adds support to query the congestion related hardware counters through new command and links them with other hw counters being available in hw_counters sysfs location. In order to reuse existing infrastructure it renames related q_counter data structures to more generic counters to reflect q_counters and congestion counters and maybe some other counters in the future. New hardware counters: * rp_cnp_handled - CNP packets handled by the reaction point * rp_cnp_ignored - CNP packets ignored by the reaction point * np_cnp_sent - CNP packets sent by notification point to respond to CE marked RoCE packets * np_ecn_marked_roce_packets - CE marked RoCE packets received by notification point It also avoids returning ENOSYS which is specific for invalid system call and produces the following checkpatch.pl warning. WARNING: ENOSYS means 'invalid syscall nr' and nothing else + return -ENOSYS; Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Leon Romanovsky authored
The mthca driver didn't check supplied pointer to functions mthca_cmd_poll() and mthca_cmd_wait(). This caused to the following smatch errors: drivers/infiniband/hw/mthca/mthca_cmd.c:371 mthca_cmd_poll() error: we previously assumed 'out_param' could be null (see line 353) drivers/infiniband/hw/mthca/mthca_cmd.c:454 mthca_cmd_wait() error: we previously assumed 'out_param' could be null (see line 432) In reality all callers of these functions are setting out_is_imm flag are providing pointer too. However it is better to check again to remove smatch errors to achieve warning free subsystem. Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Slava Shwartsman authored
A drop rule is described by an action drop and no destination. If a user specified IB_FLOW_SPEC_ACTION_DROP then set the action to MLX5_FLOW_CONTEXT_ACTION_DROP and clear the destination. Signed-off-by: Slava Shwartsman <slavash@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Slava Shwartsman authored
This flow steering specification identifies flow for drop by the HW. If user create a flow only with the drop specification, then all the packets that hit this flow will be dropped, otherwise the HW will drop only the packets that match the other L2/L3/L4 specifications. Signed-off-by: Slava Shwartsman <slavash@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Ariel Levkovich authored
This change adds the ability for flow steering to classify IPv4/6 packets with MPLS tag (Ethertype 0x8847 and 0x8848) as standard IP packets and hit IPv4/6 classifed steering rules. When user added a flow rule with IP classification, driver was implicitly adding ethertype matching to the created rule in order to distinguish between IPv4 and IPv6 protocols. Since IP packets with MPLS tag header have MPLS ethertype, they missed the rule and ended up hitting the default filters. Such behavior prevented from MPLS packets to undergo inbound traffic load balancing flows (if such were defined by configuring RSS) to achieve higher throughput - the way that non-MPLS IP packets performed. Since our device is able to look past the MPLS tag and identify the next protocol we introduce this solution which replaces Ethertype matching by the device's capability to perform IP version parsing and matching in order to distinguish between IPv4 and IPv6. Therefore, whenever a flow with IP spec is added and device support IP version matching, driver will implicitly add IP version matching to the rule (Based on the IP spec type) without Ethertype matching which will cause relevant MPLS tagged packets to hit this rule as well. Otherwise (device doesn't support IP version matching), we fall back to setting Ethertype matching. If the user's filters specify an L2 ethertype and an IP spec the rule will then match both the ethertype and the IP version. The device's support for IP version matching is reported by the device via dedicated capability bit in query_device_cap and named outer/inner_ip_version. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Ariel Levkovich authored
This change fixes an incomplete validation of the user's flow attributes list. Previous implementation validated only matching of IPv4 Ethertype to IPv4 spec of outer headers (in case both Ethernet with specified Ethertype and IP specs were present) and lacked the validation of: 1. Matching of IPv6 Ethertype in Ethernet spec (if such exists) to an IPv6 protocol spec (if such exists). 2. Validation of Ethertype to IP protocol matching on inner headers specs. Which could cause some combinations of unmatching Ethernet and IP protocols to pass validation and apply on the device. The fix adds validation of IPv6 Ethertype and IP spec as well as performing the scan on both outer and inner attributes. Fixes: 038d2ef8 ("Add flow steering support") Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Bodong Wang authored
The kfree was called to free cqb, while it should free *cqb. Fixes: 1cbe6fc8 ("IB/mlx5: Add support for CQE compressing") Signed-off-by: Bodong Wang <bodong@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Maor Gottlieb authored
In order to enlarge the flow group size to 8k, we decrease the number of flow group types to 6 and increase the flow table size to 64k. Flow group size is calculated as follow: group_size = table_size / (#group_types + 1) Fixes: 038d2ef8 ('IB/mlx5: Add flow steering support') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Maor Gottlieb authored
Check that the required flow table size is supported by device. Return ENOMEM error if no space left. In addition change the create flow table routine to return ENOMEM instead of ENOSPC. Fixes: 038d2ef8 ('IB/mlx5: Add flow steering support') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Maor Gottlieb authored
Anonymous VMA (->vm_ops == NULL) cannot be shared, otherwise it would lead to SIGBUS. Remove the shared flags from the vma after we change it to be anonymous. This is easily reproduced by doing modprobe -r while running a user-space application such as raw_ethernet_bw. Fixes: 7c2344c3 ('IB/mlx5: Implements disassociate_ucontext API') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Maor Gottlieb authored
When the driver disassociate user context, it changes the vma to anonymous by setting the vm_ops to null and zap the vma ptes. In order to avoid race in the kernel, we need to take write lock before we change the vma entries. Fixes: 7c2344c3 ('IB/mlx5: Implements disassociate_ucontext API') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-