1. 20 Jul, 2017 30 commits
  2. 18 Jul, 2017 10 commits
    • IB/core: Allow QP state transition from reset to error · ebc9ca43
      Tadeusz Struk authored
      Exercising an IPoIB interface can cause the warning message
      "ib0: Failed to modify QP to ERROR state" to be logged.
      This happens when the QP is in IB_QPS_RESET state and the stack
      is trying to transition it to IB_QPS_ERR state in ipoib_ib_dev_stop().
      
      According to the IB spec, Table 91 - "QP State Transition Properties"
      it looks like the transition from reset to error is valid:
      
      Transition: Any State to Error
      Required Attributes: None
      Optional Attributes: None allowed
      Actions: Queue processing is stopped. Work Requests pending or in
      process are completed in error, when possible.
      
      This patch allows the transition and quiets the message.
      Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
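The "Any State to Error" rule quoted from the spec can be pictured as a lookup table indexed by current and next state. The following is a minimal user-space sketch, not the kernel's code: the enum, the table name `qp_transition_valid`, and the helper `qp_transition_ok` are hypothetical, and the non-error entries are only illustrative.

```c
#include <stdbool.h>

/* Illustrative stand-ins for the kernel's IB_QPS_* states (assumed names). */
enum qp_state {
    QPS_RESET, QPS_INIT, QPS_RTR, QPS_RTS, QPS_SQD, QPS_SQE, QPS_ERR,
    QPS_COUNT
};

/* qp_transition_valid[cur][next] is true when the transition is allowed.
 * Entries other than the "to error" ones are illustrative only. */
static const bool qp_transition_valid[QPS_COUNT][QPS_COUNT] = {
    [QPS_RESET] = {
        [QPS_RESET] = true,
        [QPS_INIT]  = true,
        [QPS_ERR]   = true,  /* reset -> error: the case the commit describes */
    },
    [QPS_INIT] = { [QPS_RESET] = true, [QPS_INIT] = true,
                   [QPS_RTR] = true, [QPS_ERR] = true },
    [QPS_RTR]  = { [QPS_RESET] = true, [QPS_RTS] = true, [QPS_ERR] = true },
    [QPS_RTS]  = { [QPS_RESET] = true, [QPS_RTS] = true,
                   [QPS_SQD] = true, [QPS_ERR] = true },
    [QPS_SQD]  = { [QPS_RESET] = true, [QPS_RTS] = true,
                   [QPS_SQD] = true, [QPS_ERR] = true },
    [QPS_SQE]  = { [QPS_RESET] = true, [QPS_RTS] = true, [QPS_ERR] = true },
    [QPS_ERR]  = { [QPS_RESET] = true, [QPS_ERR] = true },
};

/* Returns true if moving from cur to next is allowed by the table above. */
static bool qp_transition_ok(enum qp_state cur, enum qp_state next)
{
    if ((unsigned)cur >= QPS_COUNT || (unsigned)next >= QPS_COUNT)
        return false;
    return qp_transition_valid[cur][next];
}
```

With the reset row containing an error entry, a caller that stops a device whose QP never left reset no longer hits the failure path that printed the warning.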
    • IB/hns: Fix for checkpatch.pl comment style warnings · 5f110ac4
      oulijun authored
      This patch corrects the comment style warnings reported by the
      checkpatch.pl script.
      Signed-off-by: Lijun Ou <oulijun@huawei.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/hns: Fix the bug with modifying the MAC address without removing the driver · d322f004
      oulijun authored
      When the MAC address is modified through the hns_roce_mac function, the
      reserved QP is released and created again. It is not necessary to use
      spin_lock_bh and spin_unlock_bh in handle_en_event; doing so causes an
      error. This patch fixes it.
      Signed-off-by: Lijun Ou <oulijun@huawei.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/hns: Fix the bug with rdma operation · 9de61d3f
      oulijun authored
      When the opcode of a work request is RDMA read or RDMA write, the
      driver should use rdma_wr to get remote_addr and rkey. This
      patch fixes it.
      Signed-off-by: Lijun Ou <oulijun@huawei.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
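The idea of reaching remote_addr and rkey through an rdma_wr-style accessor can be sketched in user space. This is only a model under stated assumptions: the struct layouts, the opcode enum, the `model_container_of` macro, and the helper `get_remote_info` are all hypothetical and merely mimic the pattern of embedding a generic send work request inside an RDMA-specific one.

```c
#include <stddef.h>
#include <stdint.h>

/* Recover the enclosing struct from a pointer to one of its members. */
#define model_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical opcodes for the sketch. */
enum wr_opcode_model { WR_SEND, WR_RDMA_WRITE, WR_RDMA_READ };

/* Generic send work request (model). */
struct send_wr_model {
    enum wr_opcode_model opcode;
};

/* RDMA work request: embeds the generic one and adds the remote fields. */
struct rdma_wr_model {
    struct send_wr_model wr;
    uint64_t remote_addr;
    uint32_t rkey;
};

/* Accessor in the spirit of rdma_wr(): generic WR -> enclosing RDMA WR. */
static struct rdma_wr_model *to_rdma_wr(struct send_wr_model *wr)
{
    return model_container_of(wr, struct rdma_wr_model, wr);
}

/* Fetch remote_addr and rkey only for RDMA read/write requests.
 * Returns 0 on success, -1 if the opcode carries no remote fields. */
static int get_remote_info(struct send_wr_model *wr,
                           uint64_t *remote_addr, uint32_t *rkey)
{
    struct rdma_wr_model *rwr;

    if (wr->opcode != WR_RDMA_READ && wr->opcode != WR_RDMA_WRITE)
        return -1;

    rwr = to_rdma_wr(wr);
    *remote_addr = rwr->remote_addr;
    *rkey = rwr->rkey;
    return 0;
}
```

Checking the opcode before converting matters: the conversion is only valid when the generic request really is embedded in an RDMA request.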
    • IB/hns: Fix the bug with wild pointer when destroy rc qp · 58c4f0d8
      oulijun authored
      When an RC QP is destroyed, hr_qp is used after it has been freed.
      This patch fixes it.
      Signed-off-by: Lijun Ou <oulijun@huawei.com>
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/hns: Fix the bug of polling cq failed for loopback Qps · 5802883d
      oulijun authored
      In the hip06 SoC, the RoCE driver creates 8 reserved loopback QPs to
      ensure zero WQEs when freeing an MR. However, if the number of enabled
      phy ports is less than 6, polling for CQEs fails with
      8 reserved loopback QPs.

      To solve this problem, the number of loopback QPs
      is adjusted based on the number of enabled phy ports.
      Signed-off-by: Shaobo Xu <xushaobo2@huawei.com>
      Signed-off-by: Lijun Ou <oulijun@huawei.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/rxe: Set dma_mask and coherent_dma_mask · 56012e1c
      yonatanc authored
      RXE coupled with a dummy device causes the kernel panic shown
      below.  The panic happens when ib_register_device tries to set dma_mask
      by accessing a NULL parent device.

      RXE does not actually use DMA, so the dma_mask can be set
      to the architecture value.
      
      [16240.199689] RIP: 0010:ib_register_device+0x468/0x5a0 [ib_core]
      [16240.205289] RSP: 0018:ffffc9000220fc10 EFLAGS: 00010246
      [16240.209909] RAX: 0000000000000024 RBX: ffff880220d1a2a8 RCX: 0000000000000000
      [16240.212244] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000009
      [16240.214385] RBP: ffffc9000220fcb0 R08: 0000000000000000 R09: 000000000000023f
      [16240.254465] R10: 0000000000000007 R11: 0000000000000000 R12: 0000000000000000
      [16240.259467] R13: 0000000000000000 R14: 0000000000000000 R15: ffff880220d1a2a8
      [16240.263314] FS:  00007fd8ecca0740(0000) GS:ffff8802364c0000(0000) knlGS:0000000000000000
      [16240.267292] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [16240.273503] CR2: 0000000000000218 CR3: 00000002253ba000 CR4: 00000000000006e0
      [16240.277066] Call Trace:
      [16240.281836]  ? __kmalloc+0x26f/0x280
      [16240.286596]  rxe_register_device+0x297/0x300 [rdma_rxe]
      [16240.291377]  rxe_add+0x535/0x5b0 [rdma_rxe]
      [16240.297586]  rxe_net_add+0x3e/0xc0 [rdma_rxe]
      [16240.302375]  rxe_param_set_add+0x65/0x144 [rdma_rxe]
      [16240.307769]  param_attr_store+0x68/0xd0
      [16240.311640]  module_attr_store+0x1d/0x30
      [16240.316421]  sysfs_kf_write+0x3a/0x50
      [16240.317802]  kernfs_fop_write+0xff/0x180
      [16240.322989]  __vfs_write+0x37/0x140
      [16240.328164]  ? handle_mm_fault+0xce/0x240
      [16240.333340]  vfs_write+0xb2/0x1b0
      [16240.335013]  SyS_write+0x55/0xc0
      [16240.340632]  entry_SYSCALL_64_fastpath+0x1a/0xa9
      
      Fixes: 8700e3e7 ("Soft RoCE driver")
      Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
      Reviewed-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/rxe: Fix kernel panic from skb destructor · fda85ce9
      Yonatan Cohen authored
      Between the time rxe_send finishes and the skb destructor is
      called, the QP's ref count might drop to 0, possibly leading to
      QP destruction. This leads to a kernel panic when the destructor
      dereferences the QP.

      Incrementing the QP ref count in rxe_send and decrementing it
      in the skb destructor prevents this crash.
      
      BUG: unable to handle kernel NULL pointer dereference at 000000000000072c
      IP: [<ffffffffa05df765>] rxe_skb_tx_dtor+0x15/0x50 [rdma_rxe]
      PGD 0 [16240.211178]
      Oops: 0002 [#1] SMP
      CPU: 3 PID: 0 Comm: swapper/3 Tainted: G           OE   4.9.0-mlnx #1
      Hardware name: Red Hat KVM, BIOS Bochs 01/01/2011
      task: ffff88042d6b1480 task.stack: ffffc90001904000
      RIP: 0010:[<ffffffffa05df765>]  [<ffffffffa05df765>] rxe_skb_tx_dtor+0x15/0x50 [rdma_rxe]
      RSP: 0018:ffff88043fcc3df0  EFLAGS: 00010246
      RAX: 0000000000000000 RBX: ffff880429684700 RCX: ffff88042d248200
      RDX: 00000000ffffffff RSI: 00000000fffffe01 RDI: ffff880429684700
      RBP: ffff88043fcc3e00 R08: ffff88043fcda240 R09: 00000000ff2d1de6
      R10: 0000000000000000 R11: 00000000f49cf6fe R12: ffff880429684700
      R13: ffffffff81893f96 R14: ffffffff817d66f0 R15: ffff880427f74200
      FS:  0000000000000000(0000) GS:ffff88043fcc0000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 000000000000072c CR3: 000000041d3df000 CR4: 00000000000006e0
      Stack:
       ffffffff817b29cf ffff880429684700 ffff88043fcc3e18 ffffffff817b42c2
       ffff880429684700 ffff88043fcc3e40 ffffffff817b4332 ffff880429684700
       ffff880427f74238 ffff880427f74228 ffff88043fcc3e58 ffffffff81893f96
      Call Trace:
       <IRQ> [16240.336345]  [<ffffffff817b29cf>] ? skb_release_head_state+0x4f/0xb0
       [<ffffffff817b42c2>] skb_release_all+0x12/0x30
       [<ffffffff817b4332>] kfree_skb+0x32/0x90
       [<ffffffff81893f96>] ndisc_error_report+0x36/0x40
       [<ffffffff817d4de1>] neigh_invalidate+0x81/0xf0
       [<ffffffff817d68f7>] neigh_timer_handler+0x207/0x2b0
       [<ffffffff81109295>] call_timer_fn+0x35/0x120
       [<ffffffff81109db7>] run_timer_softirq+0x1d7/0x460
       [<ffffffff8106155e>] ? kvm_sched_clock_read+0x1e/0x30
       [<ffffffff810366b9>] ? sched_clock+0x9/0x10
       [<ffffffff810cfed2>] ? sched_clock_cpu+0x72/0xa0
       [<ffffffff818dd537>] __do_softirq+0xd7/0x289
       [<ffffffff810a6c95>] irq_exit+0xb5/0xc0
       [<ffffffff818dd372>] smp_apic_timer_interrupt+0x42/0x50
       [<ffffffff818dc682>] apic_timer_interrupt+0x82/0x90
       <EOI> [16240.395776]  [<ffffffff818da156>] ? native_safe_halt+0x6/0x10
       [<ffffffff818d9e6e>] default_idle+0x1e/0xd0
       [<ffffffff8103797f>] arch_cpu_idle+0xf/0x20
       [<ffffffff818da2c5>] default_idle_call+0x35/0x40
       [<ffffffff810e3eb5>] cpu_startup_entry+0x185/0x210
       [<ffffffff81050433>] start_secondary+0x103/0x130
      RIP  [<ffffffffa05df765>] rxe_skb_tx_dtor+0x15/0x50 [rdma_rxe]
      
      Fixes: 8700e3e7 ("Soft RoCE driver")
      Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
      Reviewed-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
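The take-a-reference-on-send, drop-it-in-the-destructor pattern described in this commit can be sketched as a small user-space model. Everything here is an assumption made for illustration: `qp_model`, `skb_model`, and the helper names are hypothetical, and a plain integer stands in for the atomic reference counting that kernel code would use.

```c
#include <stdbool.h>
#include <stddef.h>

/* Model of a QP with a reference count; "destroyed" marks the free. */
struct qp_model {
    int refcnt;
    bool destroyed;
};

/* Model of an skb that remembers which QP sent it. */
struct skb_model {
    struct qp_model *qp;
};

static void qp_get(struct qp_model *qp)
{
    qp->refcnt++;
}

/* Drop one reference; the QP is destroyed when the last one goes away. */
static void qp_put(struct qp_model *qp)
{
    if (--qp->refcnt == 0)
        qp->destroyed = true;
}

/* Send path: pin the QP for as long as the skb is in flight. */
static void model_send(struct qp_model *qp, struct skb_model *skb)
{
    qp_get(qp);
    skb->qp = qp;
}

/* Destructor: the QP is still alive here because of the reference
 * taken in model_send(); release it once the skb is done. */
static void model_skb_destructor(struct skb_model *skb)
{
    qp_put(skb->qp);
    skb->qp = NULL;
}
```

In this model, even if the owner drops its own reference right after the send, the QP is destroyed only after the destructor has run.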
    • IB/ipoib: Let lower driver handle get_stats64 call · b6c871e5
      Erez Shitrit authored
      The driver checks whether the lower-level driver supports get_stats and,
      if so, calls it to get the updated statistics; otherwise it takes them
      from the current netdevice stats object.
      Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
      Reviewed-by: Alex Vesker <valex@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
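The delegate-or-fall-back logic described above might look roughly like the following user-space sketch. The types `stats_model` and `dev_model`, the optional callback `lower_get_stats`, and the sample callback are hypothetical names chosen for illustration, not the driver's actual structures.

```c
#include <stddef.h>
#include <stdint.h>

struct stats_model {
    uint64_t rx_packets;
    uint64_t tx_packets;
};

struct dev_model {
    /* Optional hook provided by the lower driver; NULL if unsupported. */
    void (*lower_get_stats)(struct dev_model *dev, struct stats_model *out);
    /* Counters kept in the netdevice itself, used as the fallback. */
    struct stats_model cached;
};

/* Prefer the lower driver's statistics when it offers them. */
static void model_get_stats64(struct dev_model *dev, struct stats_model *out)
{
    if (dev->lower_get_stats)
        dev->lower_get_stats(dev, out);
    else
        *out = dev->cached;
}

/* Sample lower-driver hook returning fixed numbers (for demonstration). */
static void sample_lower_get_stats(struct dev_model *dev,
                                   struct stats_model *out)
{
    (void)dev;
    out->rx_packets = 100;
    out->tx_packets = 200;
}
```

A device without the hook reports its own cached counters, so callers see the same interface either way.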
    • IB/core: Add ordered workqueue for RoCE GID management · 8fe8bacb
      Majd Dibbiny authored
      Currently the RoCE GID management uses the ib_wq to add and delete GIDs
      according to netdev events.

      The ib_wq isn't an ordered workqueue, so two work elements can be executed
      concurrently, which results in unexpected behavior and inconsistent
      GID cache content.
      
      Example:
      ifconfig eth1 11.11.11.11/16 up
      
      This command will invoke the following netdev events in the following order:
      1. NETDEV_UP
      2. NETDEV_DOWN
      3. NETDEV_UP
      
      If (2) and (3) are executed concurrently or in reverse order, instead of
      having a new GID with the 11.11.11.11 IP, we end up without any new GIDs.
      Signed-off-by: Majd Dibbiny <majd@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
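Why ordering matters for the event sequence in this commit can be shown with a tiny user-space model. It is a sketch under stated assumptions: the GID cache is reduced to a single "GID present" flag, and the enum and the helper `gid_present_after` are hypothetical. In the kernel, strict one-at-a-time, in-order execution is what a queue created with alloc_ordered_workqueue() provides.

```c
#include <stdbool.h>
#include <stddef.h>

/* Model of the two netdev events involved. */
enum netdev_event_model { EV_NETDEV_UP, EV_NETDEV_DOWN };

/* Process events strictly one at a time, in array order, as an ordered
 * queue would. Returns whether the GID is present after the last event:
 * UP adds the GID, DOWN removes it. */
static bool gid_present_after(const enum netdev_event_model *events, size_t n)
{
    bool present = false;

    for (size_t i = 0; i < n; i++)
        present = (events[i] == EV_NETDEV_UP);

    return present;
}
```

Feeding the model UP, DOWN, UP leaves the GID present, while swapping the last two events (UP, UP, DOWN) leaves the cache without it, which matches the failure the commit message describes.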