1. 10 Jun, 2014 4 commits
    • Merge branches 'core', 'cxgb3', 'cxgb4', 'iser', 'iwpm', 'misc', 'mlx4',... · eeaddf36
      Roland Dreier authored
      Merge branches 'core', 'cxgb3', 'cxgb4', 'iser', 'iwpm', 'misc', 'mlx4', 'mlx5', 'noio', 'ocrdma', 'qib', 'srp' and 'usnic' into for-next
    • RDMA/cxgb4: Add support for iWARP Port Mapper user space service · 9eccfe10
      Steve Wise authored
      Based on original work by Vipul Pandya.
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>

      [ Fix htons -> ntohs to make sparse happy.  - Roland ]
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • RDMA/core: Add support for iWARP Port Mapper user space service · 30dc5e63
      Tatyana Nikolova authored
      This patch adds iWARP Port Mapper (IWPM) Version 2 support.  The iWARP
      Port Mapper implementation is based on the port mapper specification
      section in the Sockets Direct Protocol paper -
      http://www.rdmaconsortium.org/home/draft-pinkerton-iwarp-sdp-v1.0.pdf
      
      Existing iWARP RDMA providers use the same IP address as the native
      TCP/IP stack when creating RDMA connections.  They need a mechanism to
      claim the TCP ports used for RDMA connections to prevent TCP port
      collisions when other host applications use TCP ports.  The iWARP Port
      Mapper provides a standard mechanism to accomplish this.  Without this
      service it is possible for an RDMA application to bind/listen on the
      same port which is already being used by a native TCP host
      application.  If that happens, the incoming TCP connection data can be
      passed to the RDMA stack with error.

      The iWARP Port Mapper solution doesn't contain any changes to the
      existing network stack in the kernel space.  All the changes are
      contained within the infiniband tree and in user space.
      
      The iWARP Port Mapper service is implemented as a user space daemon
      process.  Source for the IWPM service is located at
      http://git.openfabrics.org/git?p=~tnikolova/libiwpm-1.0.0/.git;a=summary
      
      When starting a connection, the iWARP driver (port mapper client)
      sends the IWPM service the local IP address and TCP port it has
      received from the RDMA application.  The IWPM service performs a
      socket bind from user space to get an available TCP port, called a
      mapped port, and communicates it back to the client.  In that sense,
      the IWPM service maps the TCP port that the RDMA application uses to
      any port available from the host TCP port space.  The mapped ports are
      used in iWARP RDMA connections to avoid collisions with the native TCP
      stack, which is aware that these ports are taken.  When an RDMA
      connection using a mapped port is terminated, the client notifies the
      IWPM service, which then releases the TCP port.
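
      The "socket bind from user space" step can be sketched as follows.
      This is a minimal illustration, not code from the IWPM daemon: the
      function names are hypothetical, and it assumes that binding to port 0
      on the loopback address is an acceptable stand-in for the address the
      driver reports.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: reserve a free TCP port by binding a socket to
 * port 0 and asking the kernel which port it picked.  Returns the
 * socket fd (keep it open to hold the port) or -1 on error.  The port
 * is stored in *mapped_port in host byte order.
 */
int reserve_mapped_port(uint16_t *mapped_port)
{
	struct sockaddr_in addr;
	socklen_t len = sizeof(addr);
	int fd = socket(AF_INET, SOCK_STREAM, 0);

	if (fd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	addr.sin_port = 0;	/* 0 = let the TCP stack choose a free port */

	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
		close(fd);
		return -1;
	}

	*mapped_port = ntohs(addr.sin_port);
	return fd;
}

/* Hypothetical helper: give the port back once the connection is gone. */
void release_mapped_port(int fd)
{
	close(fd);
}
```

      Because the socket stays bound for as long as the descriptor is open,
      the native TCP stack will not hand the same port to another
      application until release_mapped_port() is called.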
      
      The message exchange between the IWPM service and the iWARP drivers
      (between user space and kernel space) is implemented using netlink
      sockets.
      
      1) Netlink interface functions are added: ibnl_unicast() and
         ibnl_multicast() for sending netlink messages to user space

      2) The signature of the existing ibnl_put_msg() is changed to be more
         generic

      3) Two netlink clients are added: RDMA_NL_NES, RDMA_NL_C4IW
         corresponding to the two iWARP drivers - nes and cxgb4 which use
         the IWPM service
      
      4) Enums are added to enumerate the attributes in the netlink
         messages, which are exchanged between the user space IWPM service
         and the iWARP drivers
      Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Reviewed-by: PJ Waskiewicz <pj.waskiewicz@solidfire.com>

      [ Fold in range checking fixes and nlh_next removal as suggested by Dan
        Carpenter and Steve Wise.  Fix sparse endianness in hash.  - Roland ]
      Signed-off-by: Roland Dreier <roland@purestorage.com>
  2. 09 Jun, 2014 1 commit
  3. 06 Jun, 2014 1 commit
    • IB/umad: Fix use-after-free on close · 60e1751c
      Bart Van Assche authored
      Closing /dev/infiniband/umad<n> or /dev/infiniband/issm<n> could
      trigger a use-after-free, because __fput() invokes f_op->release()
      before it invokes cdev_put().  Make sure that the ib_umad_device
      structure is freed by the cdev_put() call instead of f_op->release().
      This prevents the following kernel oops when the port mode is changed
      from IB to Ethernet and back to IB and opensmd is then restarted:
      
          general protection fault: 0000 [#1] PREEMPT SMP
          RIP: 0010:[<ffffffff810cc65c>]  [<ffffffff810cc65c>] module_put+0x2c/0x170
          Call Trace:
           [<ffffffff81190f20>] cdev_put+0x20/0x30
           [<ffffffff8118e2ce>] __fput+0x1ae/0x1f0
           [<ffffffff8118e35e>] ____fput+0xe/0x10
           [<ffffffff810723bc>] task_work_run+0xac/0xe0
           [<ffffffff81002a9f>] do_notify_resume+0x9f/0xc0
           [<ffffffff814b8398>] int_signal+0x12/0x17
      
      Reference: https://bugzilla.kernel.org/show_bug.cgi?id=75051
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Reviewed-by: Yann Droneaud <ydroneaud@opteya.com>
      Cc: <stable@vger.kernel.org> # 3.x: 8ec0a0e6: IB/umad: Fix error handling
      Signed-off-by: Roland Dreier <roland@purestorage.com>
  4. 05 Jun, 2014 2 commits
    • IB/core: Fix kobject leak on device register error flow · 584482ac
      Haggai Eran authored
      The ports kobject isn't being released during error flow in device
      registration.  This patch refactors the ports kobject cleanup into a
      single function called from both the error flow in device registration
      and from the unregistration function.
      
      A couple of attributes aren't being deleted (iw_stats_group, and
      ib_class_attributes).  While this may be handled implicitly by the
      destruction of their kobjects, it seems better to handle all the
      attributes the same way.
      Signed-off-by: Haggai Eran <haggaie@mellanox.com>

      [ Make free_port_list_attributes() static.  - Roland ]
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • RDMA/cxgb4: add missing padding at end of struct c4iw_alloc_ucontext_resp · b7dfa889
      Yann Droneaud authored
      The i386 ABI disagrees with most other ABIs regarding alignment of
      data types larger than 4 bytes: on most ABIs padding must be added
      at the end of the structures, while it is not required on i386.

      So for most ABIs, struct c4iw_alloc_ucontext_resp gets implicitly
      padded to a multiple of 8 bytes, while for i386 no such padding is
      added.
      
      The tool pahole can be used to find such implicit padding:
      
        $ pahole --anon_include \
                 --nested_anon_include \
                 --recursive \
                 --class_name c4iw_alloc_ucontext_resp \
                 drivers/infiniband/hw/cxgb4/iw_cxgb4.o
      
      Then, structure layout can be compared between i386 and x86_64:
      
        +++ obj-i386/drivers/infiniband/hw/cxgb4/iw_cxgb4.o.pahole.txt   2014-03-28 11:43:05.547432195 +0100
        --- obj-x86_64/drivers/infiniband/hw/cxgb4/iw_cxgb4.o.pahole.txt 2014-03-28 10:55:10.990133017 +0100
        @@ -2,9 +2,8 @@ struct c4iw_alloc_ucontext_resp {
                __u64                      status_page_key;      /*     0     8 */
                __u32                      status_page_size;     /*     8     4 */
      
        -       /* size: 12, cachelines: 1, members: 2 */
        -       /* last cacheline: 12 bytes */
        +       /* size: 16, cachelines: 1, members: 2 */
        +       /* padding: 4 */
        +       /* last cacheline: 16 bytes */
         };
      
      This ABI disagreement will make an x86_64 kernel try to write past the
      buffer provided by an i386 binary.
      
      Once boundary checking is implemented, the x86_64 kernel will refuse
      to write past the buffer provided by i386 userspace and the uverb
      will fail.
      
      If the structure is on a page boundary and the next page is not
      mapped, ib_copy_to_udata() will fail and the uverb will fail.
      
      Additionally, as reported by Dan Carpenter, without the implicit
      padding being properly cleared, an information leak would take place
      in most architectures.
      
      This patch adds explicit padding to struct c4iw_alloc_ucontext_resp
      and, like 92b0ca7c ("IB/mlx5: Fix stack info leak in
      mlx5_ib_alloc_ucontext()"), makes c4iw_alloc_ucontext() not write
      this padding field to userspace.  This way, an x86_64 kernel will be
      able to write struct c4iw_alloc_ucontext_resp as expected by both
      unpatched and patched i386 libcxgb4.
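
      The layout difference can be reproduced in user space.  The sketch
      below uses fixed-width stdint types in place of __u64/__u32, and the
      padding field name "reserved" is an assumption made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Layout with implicit padding: 12 bytes of members.  ABIs that align
 * 64-bit integers on 8 bytes round sizeof() up to 16, while i386 keeps
 * it at 12.
 */
struct resp_implicit {
	uint64_t status_page_key;
	uint32_t status_page_size;
};

/*
 * Layout with explicit padding.  The extra member fills the hole, so
 * the size no longer depends on the alignment of uint64_t.
 */
struct resp_explicit {
	uint64_t status_page_key;
	uint32_t status_page_size;
	uint32_t reserved;	/* explicit padding (assumed field name) */
};

/* 16 bytes whether uint64_t is 4-byte or 8-byte aligned. */
_Static_assert(sizeof(struct resp_explicit) == 16,
	       "explicit padding must pin the size to 16 bytes");

/* Size of the implicitly padded layout on the ABI being compiled for. */
size_t resp_implicit_size(void)
{
	return sizeof(struct resp_implicit);
}
```

      Building this once with -m32 and once with -m64 should show
      resp_implicit_size() changing between the two ABIs while the
      static assertion on the explicit layout holds for both.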
      
      Link: http://marc.info/?i=cover.1399309513.git.ydroneaud@opteya.com
      Link: http://marc.info/?i=1395848977.3297.15.camel@localhost.localdomain
      Link: http://marc.info/?i=20140328082428.GH25192@mwanda
      Cc: <stable@vger.kernel.org>
      Fixes: 05eb2389 ("cxgb4/iw_cxgb4: Doorbell Drop Avoidance Bug Fixes")
      Reported-by: Yann Droneaud <ydroneaud@opteya.com>
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
      Acked-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
  5. 04 Jun, 2014 4 commits
    • mlx4_core: Fix GFP flags parameters to be gfp_t · 4e2c341b
      Roland Dreier authored
      Otherwise sparse gives a bunch of warnings like
      
          drivers/net/ethernet/mellanox/mlx4/srq.c:110:66: sparse: incorrect type in argument 4 (different base types)
          drivers/net/ethernet/mellanox/mlx4/srq.c:110:66:    expected int [signed] gfp
          drivers/net/ethernet/mellanox/mlx4/srq.c:110:66:    got restricted gfp_t
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • IB/core: Fix port kobject deletion during error flow · cad6d02a
      Haggai Eran authored
      When encountering an error during the add_port function, adding a port
      to sysfs, the port kobject is freed without being deleted from sysfs.
      
      Instead of freeing it directly, the patch uses kobject_put to release
      the kobject and delete it.
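
      Why dropping a reference differs from freeing directly can be shown
      with a small user space model.  This is only an analogy: every demo_*
      name is hypothetical and none of it is the kobject API.

```c
#include <stdlib.h>

/* Minimal reference-counted object with a release callback. */
struct demo_obj {
	int refcount;
	void (*release)(struct demo_obj *obj);
};

/* Counts how many objects went through the release callback. */
static int demo_released;

static void demo_release(struct demo_obj *obj)
{
	demo_released++;	/* lifetime-bound cleanup would happen here */
	free(obj);
}

/* Create an object holding one reference, or return NULL on failure. */
struct demo_obj *demo_obj_create(void)
{
	struct demo_obj *obj = malloc(sizeof(*obj));

	if (!obj)
		return NULL;
	obj->refcount = 1;
	obj->release = demo_release;
	return obj;
}

/* Drop one reference; the last put runs the release callback. */
void demo_obj_put(struct demo_obj *obj)
{
	if (--obj->refcount == 0)
		obj->release(obj);
}

int demo_release_count(void)
{
	return demo_released;
}
```

      Calling free() on such an object in an error path would skip
      demo_release() entirely, which is the user space counterpart of
      freeing a kobject without removing it from sysfs.
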
      Signed-off-by: Haggai Eran <haggaie@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • IB/core: Remove unneeded kobject_get/put calls · 373c0ea1
      Haggai Eran authored
      The ib_core module will call kobject_get on the parent object of each
      kobject it creates.  This is redundant since kobject_add does that
      anyway.
      
      As a side effect, this patch should fix leaking the ports kobject and
      the device kobject during unregister flow, since the previous code
      didn't seem to take into account the kobject_get calls on behalf of
      the child kobjects.
      Signed-off-by: Haggai Eran <haggaie@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • IB/core: Fix sparse warnings about redeclared functions · 8385fd84
      Roland Dreier authored
      Fix a few functions that are declared with __attribute_const__ in the
      ib_verbs.h header file but defined without it in verbs.c.  This gets rid
      of the following sparse warnings:
      
          drivers/infiniband/core/verbs.c:51:5: error: symbol 'ib_rate_to_mult' redeclared with different type (originally declared at include/rdma/ib_verbs.h:469) - different modifiers
          drivers/infiniband/core/verbs.c:68:14: error: symbol 'mult_to_ib_rate' redeclared with different type (originally declared at include/rdma/ib_verbs.h:607) - different modifiers
          drivers/infiniband/core/verbs.c:85:5: error: symbol 'ib_rate_to_mbps' redeclared with different type (originally declared at include/rdma/ib_verbs.h:476) - different modifiers
          drivers/infiniband/core/verbs.c:111:1: error: symbol 'rdma_node_get_transport' redeclared with different type (originally declared at include/rdma/ib_verbs.h:84) - different modifiers
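
      The shape of the fix, keeping the attribute on both the declaration
      and the definition, looks roughly like this.  The function name and
      its mapping are made up for the example; __attribute__((const)) is
      the GCC/Clang attribute that the kernel's __attribute_const__ macro
      wraps.

```c
/* Declaration, as it would appear in a header. */
int demo_rate_to_mult(int rate) __attribute__((const));

/*
 * Definition carrying the same attribute, so that a checker comparing
 * the two sees identical modifiers.  The values are illustrative only.
 */
__attribute__((const)) int demo_rate_to_mult(int rate)
{
	switch (rate) {
	case 1:
		return 1;
	case 2:
		return 4;
	case 3:
		return 8;
	default:
		return -1;
	}
}
```
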
      Signed-off-by: Roland Dreier <roland@purestorage.com>
  6. 03 Jun, 2014 1 commit
    • IB/mad: Fix sparse warning about gfp_t use · 5343c00d
      Roland Dreier authored
      Properly convert gfp_t & result to bool to fix:
      
          drivers/infiniband/core/sa_query.c:621:33: warning: incorrect type in initializer (different base types)
          drivers/infiniband/core/sa_query.c:621:33:    expected bool [unsigned] [usertype] preload
          drivers/infiniband/core/sa_query.c:621:33:    got restricted gfp_t
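
      One common way to turn a flag test into a clean boolean is the "!!"
      idiom, sketched below.  User space C has no equivalent of sparse's
      restricted types, so demo_gfp_t and DEMO_GFP_WAIT are hypothetical
      stand-ins and the bit value is arbitrary.

```c
#include <stdbool.h>

/* Stand-ins for the kernel's gfp_t and one of its flag bits. */
typedef unsigned int demo_gfp_t;
#define DEMO_GFP_WAIT 0x10u	/* arbitrary value chosen for the sketch */

/*
 * "!!" collapses any non-zero mask result to 1, so the initializer is
 * a plain 0/1 integer instead of a flags-typed value.
 */
bool demo_may_preload(demo_gfp_t gfp_mask)
{
	bool preload = !!(gfp_mask & DEMO_GFP_WAIT);

	return preload;
}
```
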
      Signed-off-by: Roland Dreier <roland@purestorage.com>
  7. 02 Jun, 2014 4 commits
    • IB/mlx4: Implement IB_QP_CREATE_USE_GFP_NOIO · 40f2287b
      Jiri Kosina authored
      Modify the various routines used to allocate memory resources which
      serve QPs in mlx4 to get an input GFP directive.  Have the Ethernet
      driver use GFP_KERNEL in its QP allocations as done prior to this
      commit, and the IB driver use GFP_NOIO when the IB verbs
      IB_QP_CREATE_USE_GFP_NOIO QP creation flag is provided.
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • IB: Add a QP creation flag to use GFP_NOIO allocations · 09b93088
      Or Gerlitz authored
      This addresses a problem where NFS client writes over IPoIB connected
      mode may deadlock on memory allocation/writeback.
      
      The problem is not directly memory reclamation.  There is an indirect
      dependency between network filesystems writing back pages and
      ipoib_cm_tx_init() due to how a kworker is used.  Page reclaim cannot
      make forward progress until ipoib_cm_tx_init() succeeds and it is
      stuck in page reclaim itself waiting for network transmission.
      Ordinarily this situation may be avoided by having the caller use
      GFP_NOFS but ipoib_cm_tx_init() does not have that information.
      
      To address this, take a general approach and add a new QP creation
      flag that tells the low-level hardware driver to use GFP_NOIO for the
      memory allocations related to the new QP.
      
      Use the new flag in the ipoib connected mode path, and if the driver
      doesn't support it, re-issue the QP creation without the flag.
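
      The try-then-fall-back pattern described above can be modelled in
      user space as follows.  All demo_* names and the flag's bit value are
      hypothetical, and the sketch assumes that a driver signals an
      unsupported flag by returning -EINVAL.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_QP_CREATE_USE_GFP_NOIO	(1u << 0)	/* hypothetical bit */

/* Hypothetical driver that knows no creation flags at all. */
int demo_legacy_create_qp(uint32_t create_flags)
{
	if (create_flags)
		return -EINVAL;	/* reject flags it does not understand */
	return 0;
}

/* Hypothetical driver that accepts the NOIO flag and nothing else. */
int demo_noio_create_qp(uint32_t create_flags)
{
	if (create_flags & ~DEMO_QP_CREATE_USE_GFP_NOIO)
		return -EINVAL;
	return 0;
}

/*
 * Caller side: ask for NOIO allocations first and, if the driver
 * refuses, re-issue the request without the flag.  On success,
 * *used_flags reports which flags the driver accepted.
 */
int demo_create_qp_with_fallback(int (*create_qp)(uint32_t),
				 uint32_t *used_flags)
{
	uint32_t flags = DEMO_QP_CREATE_USE_GFP_NOIO;
	int ret = create_qp(flags);

	if (ret == -EINVAL) {
		flags = 0;
		ret = create_qp(flags);
	}
	if (!ret)
		*used_flags = flags;
	return ret;
}
```

      The fallback only works if drivers reject flags they do not know
      instead of silently ignoring them.
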
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • IB: Return error for unsupported QP creation flags · 60093dc0
      Or Gerlitz authored
      Fix the usnic and the qib drivers to return an error when QP creation
      flags that they don't understand are provided.
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • IB: Allow build of hw/ and ulp/ subdirectories independently · 729ee4ef
      Yann Droneaud authored
      It is not possible to build only the drivers/infiniband/hw/ (or ulp/)
      subdirectory with a command such as:
      
          $ make ARCH=x86_64 O=./obj-x86_64/ drivers/infiniband/hw/
      
      This fails with the following error messages:
      
          make[2]: Nothing to be done for `all'.
          make[2]: Nothing to be done for `relocs'.
            CHK     include/config/kernel.release
            Using /home/ydroneaud/src/linux as source for kernel
            GEN     /home/ydroneaud/src/linux/obj-x86_64/Makefile
            CHK     include/generated/uapi/linux/version.h
            CHK     include/generated/utsrelease.h
            CALL    /home/ydroneaud/src/linux/scripts/checksyscalls.sh
          /home/ydroneaud/src/linux/scripts/Makefile.build:44: /home/ydroneaud/src/linux/drivers/infiniband/hw/Makefile: No such file or directory
          make[2]: *** No rule to make target `/home/ydroneaud/src/linux/drivers/infiniband/hw/Makefile'.  Stop.
          make[1]: *** [drivers/infiniband/hw/] Error 2
          make: *** [sub-make] Error 2
      
      This patch creates a Makefile in hw/ and ulp/ and moves the
      corresponding parts of drivers/infiniband/Makefile into the new
      Makefiles.
      
      It should not break build except if some hw/ drivers or ulp/ were
      allowed previously to be built while CONFIG_INFINIBAND is set to 'n',
      but according to drivers/infiniband/Kconfig, it's not possible. So it
      should be safe to apply.
      Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
      Reviewed-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
  8. 30 May, 2014 11 commits
    • mlx4_core: Move handling of MLX4_QP_ST_MLX to proper switch statement · 165cb465
      Roland Dreier authored
      The handling of MLX4_QP_ST_MLX in verify_qp_parameters() was
      accidentally put inside the inner switch statement (that handles which
      transition of RC/UC/XRC QPs is happening).  Fix this by moving the case
      to the outer switch statement.
      
      The compiler pointed this out with:
      
          drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'verify_qp_parameters':
       >> drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:2875:3: warning: case value '7' not in enumerated type 'enum qp_transition' [-Wswitch]
             case MLX4_QP_ST_MLX:
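
      A reduced model of the corrected nesting is shown below.  The enums
      and return values are hypothetical; only the value 7 is taken from
      the compiler warning quoted above.

```c
/* Hypothetical stand-ins for the QP type and transition enums. */
enum demo_qp_type { DEMO_QP_RC = 1, DEMO_QP_MLX = 7 };
enum demo_transition { DEMO_INIT2RTR, DEMO_OTHER };

/*
 * The QP-type case sits in the outer switch.  Placing it in the inner
 * switch would compare a QP type against transition values, which is
 * what -Wswitch complains about.
 */
int demo_verify(enum demo_qp_type type, enum demo_transition t)
{
	switch (type) {			/* outer switch: QP type */
	case DEMO_QP_RC:
		switch (t) {		/* inner switch: transition */
		case DEMO_INIT2RTR:
			return 1;
		default:
			return 0;
		}
	case DEMO_QP_MLX:		/* handled at the outer level */
		return 2;
	default:
		return 0;
	}
}
```
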
      Reported-by: kbuild test robot <fengguang.wu@intel.com>
      Fixes: 99ec41d0 ("mlx4: Add infrastructure for selecting VFs to enable QP0 via MLX proxy QPs")
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • RDMA/cxgb4: Add missing padding at end of struct c4iw_create_cq_resp · b6f04d3d
      Yann Droneaud authored
      The i386 ABI disagrees with most other ABIs regarding alignment of
      data types larger than 4 bytes: on most ABIs padding must be added
      at the end of the structures, while it is not required on i386.

      So for most ABIs, struct c4iw_create_cq_resp gets implicitly padded
      to a multiple of 8 bytes, while for i386 no such padding is
      added.
      
      The tool pahole can be used to find such implicit padding:
      
        $ pahole --anon_include \
                 --nested_anon_include \
                 --recursive \
                 --class_name c4iw_create_cq_resp \
                 drivers/infiniband/hw/cxgb4/iw_cxgb4.o
      
      Then, structure layout can be compared between i386 and x86_64:
      
        +++ obj-i386/drivers/infiniband/hw/cxgb4/iw_cxgb4.o.pahole.txt   2014-03-28 11:43:05.547432195 +0100
        --- obj-x86_64/drivers/infiniband/hw/cxgb4/iw_cxgb4.o.pahole.txt 2014-03-28 10:55:10.990133017 +0100
        @@ -14,9 +13,8 @@ struct c4iw_create_cq_resp {
                __u32                      size;                 /*    28     4 */
                __u32                      qid_mask;             /*    32     4 */
      
        -       /* size: 36, cachelines: 1, members: 6 */
        -       /* last cacheline: 36 bytes */
        +       /* size: 40, cachelines: 1, members: 6 */
        +       /* padding: 4 */
        +       /* last cacheline: 40 bytes */
         };
      
      This ABI disagreement will make an x86_64 kernel try to write past the
      buffer provided by an i386 binary.
      
      Once boundary checking is implemented, the x86_64 kernel will refuse
      to write past the buffer provided by i386 userspace and the uverb
      will fail.
      
      If the structure is on a page boundary and the next page is not
      mapped, ib_copy_to_udata() will fail and the uverb will fail.
      
      This patch adds explicit padding at the end of struct
      c4iw_create_cq_resp and, like 92b0ca7c ("IB/mlx5: Fix stack info
      leak in mlx5_ib_alloc_ucontext()"), makes c4iw_create_cq() not write
      this padding field to userspace.  This way, an x86_64 kernel will be
      able to write struct c4iw_create_cq_resp as expected by both
      unpatched and patched i386 libcxgb4.
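
      "Not writing the padding field" amounts to copying everything except
      the trailing member.  The sketch below models that with memcpy() in
      user space; struct demo_resp and demo_copy_resp() are hypothetical
      and stand in for the real response structure and the kernel's copy
      helper.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical response structure ending in an explicit padding field. */
struct demo_resp {
	uint64_t key;
	uint32_t size;
	uint32_t reserved;	/* explicit padding, never sent out */
};

/*
 * Stand-in for the copy to user space: write the response into a
 * caller buffer of buf_len bytes, leaving out the trailing padding.
 * Returns the number of bytes written, or 0 if the buffer is too small.
 */
size_t demo_copy_resp(void *buf, size_t buf_len, const struct demo_resp *resp)
{
	size_t len = sizeof(*resp) - sizeof(resp->reserved);

	if (buf_len < len)
		return 0;
	memcpy(buf, resp, len);
	return len;
}
```

      With this scheme a caller that supplies only the 12 bytes an
      unpatched i386 library expects still receives a complete response,
      and no padding bytes leave the kernel side.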
      
      Link: http://marc.info/?i=cover.1399309513.git.ydroneaud@opteya.com
      Cc: <stable@vger.kernel.org>
      Fixes: cfdda9d7 ("RDMA/cxgb4: Add driver for Chelsio T4 RNIC")
      Fixes: e24a72a3 ("RDMA/cxgb4: Fix four byte info leak in c4iw_create_cq()")
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>
      Acked-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • IB/srp: Avoid problems if a header uses pr_fmt · d236cd0e
      Joe Perches authored
      SRP defines pr_fmt(fmt) to be "PFX fmt", and then includes a bunch of
      header files before it gets around to defining PFX.  This causes
      problems if any of the header files do a pr_... and use pr_fmt().
      
      Fix this by using KBUILD_MODNAME instead of the private PFX.
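
      The ordering issue can be modelled in user space.  In a kernel build
      KBUILD_MODNAME comes from kbuild; here it is defined by hand, and
      demo_pr() is a hypothetical stand-in for a pr_info()-style helper.
      The exact prefix format is an assumption.

```c
#include <stdio.h>

/* Supplied by kbuild in a real kernel build; defined by hand here. */
#define KBUILD_MODNAME "ib_srp"

/*
 * Defined before anything that might expand pr_fmt(), and built only
 * from a name that already exists at this point (no private PFX).
 */
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

/*
 * Stand-in for a pr_info()-style helper that formats into a buffer.
 * ##__VA_ARGS__ is a GNU extension supported by GCC and Clang.
 */
#define demo_pr(buf, len, fmt, ...) \
	snprintf(buf, len, pr_fmt(fmt), ##__VA_ARGS__)
```

      Any header included after the pr_fmt() definition can now expand it
      safely, since it no longer refers to a macro that is defined later.
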
      Acked-by: Chris Metcalf <cmetcalf@tilera.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • IB/umad: Fix error handling · 8ec0a0e6
      Bart Van Assche authored
      Avoid leaking a kref count in ib_umad_open() if port->ib_dev == NULL
      or if nonseekable_open() fails.
      
      In ib_umad_sm_open(), if nonseekable_open() fails, avoid leaking a
      kref count, leaving sm_sem held down, and leaving the IB_PORT_SM
      capability mask set.
      
      Since container_of() never returns NULL, remove the code that tests
      whether container_of() returns NULL.
      
      Moving the kref_get() call from the start of ib_umad_*open() to the
      end is safe since it is the responsibility of the caller of these
      functions to ensure that the cdev pointer remains valid until at least
      when these functions return.
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Cc: <stable@vger.kernel.org>

      [ydroneaud@opteya.com: rework a bit to reduce the amount of code changed]
      Signed-off-by: Yann Droneaud <ydroneaud@opteya.com>

      [ nonseekable_open() can't actually fail, but....  - Roland ]
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • IB/mlx4: Add interface for selecting VFs to enable QP0 via MLX proxy QPs · 65fed8a8
      Jack Morgenstein authored
      This commit adds the sysfs interface for enabling QP0 on VFs for
      selected VF/port.
      
      By default, no VFs are enabled for QP0 operation.
      
      To enable QP0 operation on a VF/port, under
      /sys/class/infiniband/mlx4_x/iov/<b:d:f>/ports/x there are two new entries:
      
      - smi_enabled (read-only). Indicates whether smi is currently
        enabled for the indicated VF/port
      
      - enable_smi_admin (rw). Used by the admin to request that smi
        capability be enabled or disabled for the indicated VF/port.
        0 = disable, 1 = enable.
        The requested enablement will occur at the next reset of the
        VF (e.g. driver restart on the VM which owns the VF).
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • mlx4: Add infrastructure for selecting VFs to enable QP0 via MLX proxy QPs · 99ec41d0
      Jack Morgenstein authored
      This commit adds the infrastructure for enabling selected VFs to
      operate SMI (QP0) MADs without restriction.
      
      Additionally, for these enabled VFs, their QP0 proxy and tunnel QPs
      are MLX QPs.  As such, they operate over VL15.  Therefore, they are
      not affected by "credit" problems or changes in the VLArb table (which
      may shut down VL0).
      
      Non-enabled VFs may only create UD proxy QP0 qps (which are forced by
      the hypervisor to send packets using the q-key it assigns and places
      in the qp-context).  Thus, non-enabled VFs will not pose a security
      risk.  The hypervisor discards any privileged MADs it receives from
      these non-enabled VFs.
      
      By default, all VFs are NOT enabled, and must explicitly be enabled
      by the administrator.
      
      The sysfs interface which operates the VF enablement infrastructure
      is provided in the next commit.
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • IB/mlx4: Preparation for VFs to issue/receive SMI (QP0) requests/responses · 97982f5a
      Jack Morgenstein authored
      Currently, SRIOV VFs are denied QP0 access.  The main reason
      for this decision is security, since Subnet Management Packets
      (SMPs) are not restricted by network partitioning and may affect the
      physical network topology.  Moreover, even the SM may be denied access
      from portions of the network by setting management keys unknown to the
      SM.
      
      However, it is desirable to grant SMI access to certain privileged
      VFs, so that certain network management activities may be conducted
      within virtual machines instead of the hypervisor.
      
      This commit does the following:
      
      1. Create QP0 tunnel QPs for all VFs.
      
      2. Discard SMI mads sent-from/received-for non-privileged VFs in the
         hypervisor MAD multiplex/demultiplex logic.  SMI mads from/for
         privileged VFs are allowed to pass.
      
      3. MAD_IFC wrapper changes/fixes.  For non-privileged VFs, only
         host-view MAD_IFC commands are allowed, and only for SMI LID-Routed
         GET mads.  For privileged VFs, there are no restrictions.
      
      This commit does not allow privileged VFs as yet.  To determine if a VF
      is privileged, it calls function mlx4_vf_smi_enabled().  This function
      returns 0 unconditionally for now.
      
      The next two commits allow defining and activating privileged VFs.
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • IB/mlx4: SET_PORT called by mlx4_ib_modify_port should be wrapped · 61565013
      Jack Morgenstein authored
      mlx4_ib_modify_port is invoked in IB for resetting the Q_Key violations
      counters and for modifying the IB port capability flags.
      
      For example, when opensm is started up on the hypervisor,
      mlx4_ib_modify_port is called to set the port's IsSM flag.
      
      In multifunction mode, the SET_PORT command used in this flow should
      be wrapped (so that the PF port capability flags are also tracked,
      thus enabling the aggregate of all the VF/PF capability flags to be
      tracked properly).
      
      The procedure mlx4_SET_PORT() in main.c is also renamed to mlx4_ib_SET_PORT()
      to differentiate it from procedure mlx4_SET_PORT() in port.c.
      mlx4_ib_SET_PORT() is used exclusively by mlx4_ib_modify_port().
      
      Finally, the CM invokes ib_modify_port() to set the IsCMSupported flag
      even when running over RoCE.  Therefore, when RoCE is active,
      mlx4_ib_modify_port should return OK unconditionally (since the
      capability flags and qkey violations counter are not relevant).
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • mlx4_core: Fix incorrect FLAGS1 bitmap test in mlx4_QUERY_FUNC_CAP · bc82878b
      Jack Morgenstein authored
      Commit eb17711b ("net/mlx4_core: Introduce nic_info new flag in
      QUERY_FUNC_CAP") did:
      
      	if (func_cap->flags1 & QUERY_FUNC_CAP_FLAGS1_OFFSET) {
      
      which should be:
      
      	if (func_cap->flags1 & QUERY_FUNC_CAP_FLAGS1_FORCE_VLAN) {
      
      Fix that.
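
      The effect of masking with the wrong constant can be seen in a small
      model.  Both values below are hypothetical; what matters is that one
      constant is an offset into the command output and the other is a bit
      inside the flags word.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical values chosen only to illustrate the mix-up. */
#define DEMO_FLAGS1_OFFSET	0x0cu	/* where the flags word lives */
#define DEMO_FLAGS1_FORCE_VLAN	0x40u	/* a bit inside the flags word */

/* Buggy form: masks with the offset, which tests unrelated bits. */
bool demo_force_vlan_buggy(uint32_t flags1)
{
	return (flags1 & DEMO_FLAGS1_OFFSET) != 0;
}

/* Fixed form: masks with the flag bit itself. */
bool demo_force_vlan_fixed(uint32_t flags1)
{
	return (flags1 & DEMO_FLAGS1_FORCE_VLAN) != 0;
}
```

      With these values the buggy test misses a flags word of 0x40 and
      fires on an unrelated 0x08, while the fixed test does the opposite.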
      
      Fixes: eb17711b ("net/mlx4_core: Introduce nic_info new flag in QUERY_FUNC_CAP")
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • mlx4_core: Fix memory leaks in SR-IOV error paths · b38f2879
      Dotan Barak authored
      Fix a few memory leaks that happen if errors happen in SR-IOV mode.
      Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
    • IB/qib: Additional Intel branding changes · 0a66d2bd
      Vinit Agnihotri authored
      This patch changes user-visible function names containing "qlogic"
      in module init and cleanup.
      Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Vinit Agnihotri <vinit.abhay.agnihotri@intel.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
  9. 28 May, 2014 4 commits
  10. 27 May, 2014 8 commits