1. 22 May, 2019 7 commits
    • ARM: dts: qcom: ipq4019: enlarge PCIe BAR range · cc3d377c
      Christian Lamparter authored
      commit f3e35357 upstream.
      
      David Bauer reported that the VDSL modem (attached via PCIe)
      on his AVM Fritz!Box 7530 was complaining about not having
      enough space in the BAR. A closer inspection of the old
      qcom-ipq40xx.dtsi pulled from the GL-iNet repository listed:
      
      | qcom,pcie@80000 {
      |	compatible = "qcom,msm_pcie";
      |	reg = <0x80000 0x2000>,
      |	      <0x99000 0x800>,
      |	      <0x40000000 0xf1d>,
      |	      <0x40000f20 0xa8>,
      |	      <0x40100000 0x1000>,
      |	      <0x40200000 0x100000>,
      |	      <0x40300000 0xd00000>;
      |	reg-names = "parf", "phy", "dm_core", "elbi",
      |			"conf", "io", "bars";
      
      Matching the reg-names with the listed reg leads to
      <0xd00000> as the size for the "bars".
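
The matching described above can be sketched in plain C. This is only an illustration of pairing the n-th "reg-names" string with the n-th "reg" entry; the values are copied from the quoted snippet, while `struct reg_entry` and `find_reg()` are hypothetical names, not kernel or device tree APIs.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One <address size> pair from the quoted "reg" property. */
struct reg_entry {
    uint32_t base;
    uint32_t size;
};

/* Values copied from the quoted qcom-ipq40xx.dtsi snippet. */
static const struct reg_entry pcie_reg[] = {
    { 0x00080000, 0x00002000 },
    { 0x00099000, 0x00000800 },
    { 0x40000000, 0x00000f1d },
    { 0x40000f20, 0x000000a8 },
    { 0x40100000, 0x00001000 },
    { 0x40200000, 0x00100000 },
    { 0x40300000, 0x00d00000 },
};

/* The n-th name labels the n-th reg entry. */
static const char *const pcie_reg_names[] = {
    "parf", "phy", "dm_core", "elbi", "conf", "io", "bars",
};

/* Hypothetical helper: return the entry labelled by name, or NULL. */
static const struct reg_entry *find_reg(const char *name)
{
    size_t n = sizeof(pcie_reg_names) / sizeof(pcie_reg_names[0]);

    for (size_t i = 0; i < n; i++)
        if (strcmp(pcie_reg_names[i], name) == 0)
            return &pcie_reg[i];
    return NULL;
}
```

Looking up "bars" with this helper yields the seventh entry, whose size field is 0xd00000.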
      
      Cc: stable@vger.kernel.org
      BugLink: https://www.mail-archive.com/openwrt-devel@lists.openwrt.org/msg45212.html
      Reported-by: David Bauer <mail@david-bauer.net>
      Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
      Signed-off-by: Andy Gross <agross@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • arm64: dts: rockchip: Disable DCMDs on RK3399's eMMC controller. · e16cecbb
      Christoph Muellner authored
      commit a3eec13b upstream.
      
      When using direct commands (DCMDs) on an RK3399, we get spurious
      CQE completion interrupts for the DCMD transaction slot (#31):
      
      [  931.196520] ------------[ cut here ]------------
      [  931.201702] mmc1: cqhci: spurious TCN for tag 31
      [  931.206906] WARNING: CPU: 0 PID: 1433 at /usr/src/kernel/drivers/mmc/host/cqhci.c:725 cqhci_irq+0x2e4/0x490
      [  931.206909] Modules linked in:
      [  931.206918] CPU: 0 PID: 1433 Comm: irq/29-mmc1 Not tainted 4.19.8-rt6-funkadelic #1
      [  931.206920] Hardware name: Theobroma Systems RK3399-Q7 SoM (DT)
      [  931.206924] pstate: 40000005 (nZcv daif -PAN -UAO)
      [  931.206927] pc : cqhci_irq+0x2e4/0x490
      [  931.206931] lr : cqhci_irq+0x2e4/0x490
      [  931.206933] sp : ffff00000e54bc80
      [  931.206934] x29: ffff00000e54bc80 x28: 0000000000000000
      [  931.206939] x27: 0000000000000001 x26: ffff000008f217e8
      [  931.206944] x25: ffff8000f02ef030 x24: ffff0000091417b0
      [  931.206948] x23: ffff0000090aa000 x22: ffff8000f008b000
      [  931.206953] x21: 0000000000000002 x20: 000000000000001f
      [  931.206957] x19: ffff8000f02ef018 x18: ffffffffffffffff
      [  931.206961] x17: 0000000000000000 x16: 0000000000000000
      [  931.206966] x15: ffff0000090aa6c8 x14: 0720072007200720
      [  931.206970] x13: 0720072007200720 x12: 0720072007200720
      [  931.206975] x11: 0720072007200720 x10: 0720072007200720
      [  931.206980] x9 : 0720072007200720 x8 : 0720072007200720
      [  931.206984] x7 : 0720073107330720 x6 : 00000000000005a0
      [  931.206988] x5 : ffff00000860d4b0 x4 : 0000000000000000
      [  931.206993] x3 : 0000000000000001 x2 : 0000000000000001
      [  931.206997] x1 : 1bde3a91b0d4d900 x0 : 0000000000000000
      [  931.207001] Call trace:
      [  931.207005]  cqhci_irq+0x2e4/0x490
      [  931.207009]  sdhci_arasan_cqhci_irq+0x5c/0x90
      [  931.207013]  sdhci_irq+0x98/0x930
      [  931.207019]  irq_forced_thread_fn+0x2c/0xa0
      [  931.207023]  irq_thread+0x114/0x1c0
      [  931.207027]  kthread+0x128/0x130
      [  931.207032]  ret_from_fork+0x10/0x20
      [  931.207035] ---[ end trace 0000000000000002 ]---
      
      The driver shows this message only for the first spurious interrupt
      because it uses WARN_ONCE(). Changing this to WARN() shows that this
      happens quite frequently (up to once a second).
      
      Since the eMMC 5.1 specification, where CQE and CQHCI are specified,
      does not mention that spurious TCN interrupts for DCMDs can be simply
      ignored, we must assume that this feature does not work reliably.
      
      The current implementation uses DCMD for REQ_OP_FLUSH only, and
      I could not see any performance/power impact when disabling
      this optional feature for RK3399.
      
      Therefore this patch disables DCMDs for RK3399.

      Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
      Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
      Fixes: 84362d79 ("mmc: sdhci-of-arasan: Add CQHCI support for arasan,sdhci-5.1")
      Cc: stable@vger.kernel.org
      [the corresponding code changes are queued for 5.2 so doing that as well]
      Signed-off-by: Heiko Stuebner <heiko@sntech.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • arm64: dts: rockchip: fix IO domain voltage setting of APIO5 on rockpro64 · dfd36e38
      Katsuhiro Suzuki authored
      commit 798689e4 upstream.
      
      This patch fixes IO domain voltage setting that is related to
      audio_gpio3d4a_ms (bit 1) of GRF_IO_VSEL.
      
      This is because RockPro64 schematics P.16 says that the regulator
      supplies 3.0V power to APIO5_VDD. So the audio_gpio3d4a_ms bit should
      be cleared (which means 3.0V). The power domain map says something
      different (1.8V supply), but I believe P.16 shows the actual connections.
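
The bit handling described above can be sketched as follows. This is a minimal illustration, assuming that a set bit selects 1.8V (the text only states that a cleared bit means 3.0V); the macro and function names are hypothetical and do not come from the Rockchip driver.

```c
#include <stdint.h>

/* Bit 1 of GRF_IO_VSEL, as named in the commit message. */
#define AUDIO_GPIO3D4A_MS (1u << 1)

/*
 * Hypothetical helper: return the new GRF_IO_VSEL value for the given
 * APIO5 supply in millivolts. Cleared bit = 3.0V (from the text);
 * set bit = 1.8V (assumption).
 */
static uint32_t apio5_vsel(uint32_t vsel, unsigned int millivolts)
{
    if (millivolts == 1800)
        return vsel | AUDIO_GPIO3D4A_MS;
    return vsel & ~AUDIO_GPIO3D4A_MS;
}
```

With a 3.0V supply the helper leaves every other bit untouched and only clears bit 1.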
      
      Fixes: e4f3fb49 ("arm64: dts: rockchip: add initial dts support for Rockpro64")
      Cc: stable@vger.kernel.org
      Suggested-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Katsuhiro Suzuki <katsuhiro@katsuster.net>
      Signed-off-by: Heiko Stuebner <heiko@sntech.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • objtool: Fix function fallthrough detection · 01cdbf42
      Josh Poimboeuf authored
      commit e6f393bc upstream.
      
      When a function falls through to the next function due to a compiler
      bug, objtool prints some obscure warnings.  For example:
      
        drivers/regulator/core.o: warning: objtool: regulator_count_voltages()+0x95: return with modified stack frame
        drivers/regulator/core.o: warning: objtool: regulator_count_voltages()+0x0: stack state mismatch: cfa1=7+32 cfa2=7+8
      
      Instead it should be printing:
      
        drivers/regulator/core.o: warning: objtool: regulator_supply_is_couple() falls through to next function regulator_count_voltages()
      
      This used to work, but was broken by the following commit:
      
        13810435 ("objtool: Support GCC 8's cold subfunctions")
      
      The padding nops at the end of a function aren't actually part of the
      function, as defined by the symbol table.  So the 'func' variable in
      validate_branch() is getting cleared to NULL when a padding nop is
      encountered, breaking the fallthrough detection.
      
      If the current instruction doesn't have a function associated with it,
      just consider it to be part of the previously detected function by not
      overwriting the previous value of 'func'.
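
The effect of not overwriting 'func' can be sketched with a simplified model. This is not objtool's code: `struct insn` and `find_fallthrough()` are hypothetical, and the `keep_prev_func` flag exists only to contrast the broken and fixed behaviour.

```c
#include <stddef.h>
#include <string.h>

/* Simplified instruction record; objtool's real structure differs. */
struct insn {
    const char *func;  /* owning function, NULL for padding nops */
    int is_return;     /* non-zero if the instruction ends the function */
};

/*
 * Walk the instructions in order and return the name of the function
 * that falls through into a different one, or NULL if none does.
 * keep_prev_func != 0 models the fix: padding does not reset 'func'.
 */
static const char *find_fallthrough(const struct insn *insns, size_t n,
                                    int keep_prev_func)
{
    const char *func = NULL;

    for (size_t i = 0; i < n; i++) {
        const char *cur = insns[i].func;

        if (func && cur && strcmp(func, cur) != 0)
            return func;        /* reached the next function's code */
        if (cur || !keep_prev_func)
            func = cur;         /* broken variant clears func on padding */
        if (insns[i].is_return)
            return NULL;        /* function ended normally */
    }
    return NULL;
}
```

For a sequence "a, a, padding, b" without a return, the fixed variant reports "a", while the broken variant loses track of "a" at the padding and reports nothing.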

      Reported-by: kbuild test robot <lkp@intel.com>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: <stable@vger.kernel.org>
      Fixes: 13810435 ("objtool: Support GCC 8's cold subfunctions")
      Link: http://lkml.kernel.org/r/546d143820cd08a46624ae8440d093dd6c902cae.1557766718.git.jpoimboe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/speculation/mds: Improve CPU buffer clear documentation · 360b4bba
      Andy Lutomirski authored
      commit 9d8d0294 upstream.
      
      On x86_64, all returns to usermode go through
      prepare_exit_to_usermode(), with the sole exception of do_nmi().
      This even includes machine checks -- this was added several years
      ago to support MCE recovery.  Update the documentation.

      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Frederic Weisbecker <frederic@kernel.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Fixes: 04dcbdb8 ("x86/speculation/mds: Clear CPU buffers on exit to user")
      Link: http://lkml.kernel.org/r/999fa9e126ba6a48e9d214d2f18dbde5c62ac55c.1557865329.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/speculation/mds: Revert CPU buffer clear on double fault exit · dd5750d4
      Andy Lutomirski authored
      commit 88640e1d upstream.
      
      The double fault ESPFIX path doesn't return to user mode at all --
      it returns back to the kernel by simulating a #GP fault.
      prepare_exit_to_usermode() will run on the way out of
      general_protection before running user code.

      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Frederic Weisbecker <frederic@kernel.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Jon Masters <jcm@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Fixes: 04dcbdb8 ("x86/speculation/mds: Clear CPU buffers on exit to user")
      Link: http://lkml.kernel.org/r/ac97612445c0a44ee10374f6ea79c222fe22a5c4.1557865329.git.luto@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • locking/rwsem: Prevent decrement of reader count before increment · c48fddac
      Waiman Long authored
      [ Upstream commit a9e9bcb4 ]
      
      During my rwsem testing, it was found that after a down_read(), the
      reader count may occasionally become 0 or even negative. Consequently,
      a writer may steal the lock at that time and execute with the reader
      in parallel thus breaking the mutual exclusion guarantee of the write
      lock. In other words, both readers and writer can become rwsem owners
      simultaneously.
      
      The current reader wakeup code works in one pass: it clears waiter->task
      and puts the readers into wake_q before fully incrementing the reader
      count. Once waiter->task is cleared, the corresponding reader may see it,
      finish the critical section and unlock, decrementing the count before
      the count is incremented. This is not a problem if there is only one
      reader to wake up, as the count has been pre-incremented by 1.  It is
      a problem if there is more than one reader to be woken up, because a
      writer can then steal the lock.
      
      The wakeup was actually done in 2 passes before the following v4.9 commit:
      
        70800c3c ("locking/rwsem: Scan the wait_list for readers only once")
      
      To fix this problem, the wakeup is now done in two passes
      again. In the first pass, we collect the readers and count them.
      The reader count is then fully incremented. In the second pass, the
      waiter->task is then cleared and they are put into wake_q to be woken
      up later.
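
The difference between the two orderings can be sketched with a single-threaded model. This is a deliberately simplified illustration: the real rwsem count uses bias values and atomics, and both function names here are hypothetical. It assumes the worst case in which the first woken reader unlocks immediately while the remaining readers still own the lock.

```c
/*
 * Reader count visible right after the first woken reader has already
 * unlocked. two_pass != 0 models the fixed ordering.
 */
static long count_after_first_unlock(long nwaiters, int two_pass)
{
    long count = 1;             /* pre-incremented for the first reader */

    if (two_pass)
        count += nwaiters - 1;  /* pass 1: count readers, then increment */

    /* waiter->task is cleared; the reader runs and unlocks at once. */
    count -= 1;

    /* The one-pass code would add nwaiters - 1 only after this point. */
    return count;
}

/* A writer can steal the lock if it sees 0 while readers still own it. */
static int writer_can_steal(long nwaiters, int two_pass)
{
    return nwaiters > 1 && count_after_first_unlock(nwaiters, two_pass) == 0;
}
```

With three waiting readers, the one-pass ordering exposes a count of 0 to a writer, while the two-pass ordering still shows 2.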

      Signed-off-by: Waiman Long <longman@redhat.com>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: huang ying <huang.ying.caritas@gmail.com>
      Fixes: 70800c3c ("locking/rwsem: Scan the wait_list for readers only once")
      Link: http://lkml.kernel.org/r/20190428212557.13482-2-longman@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
  2. 16 May, 2019 33 commits
    • Linux 5.1.3 · 7cb9c5d3
      Greg Kroah-Hartman authored
    • f2fs: Fix use of number of devices · dabb99e0
      Damien Le Moal authored
      commit 0916878d upstream.
      
      For a single device mount using a zoned block device, the zone
      information for the device is stored in the sbi->devs single entry
      array and sbi->s_ndevs is set to 1. This differs from a single device
      mount using a regular block device which does not allocate sbi->devs
      and sets sbi->s_ndevs to 0.
      
      However, the sbi->s_ndevs == 0 condition is used throughout the code to
      differentiate a single device mount from a multi-device mount where
      sbi->s_ndevs is always larger than 1. This results in problems with
      single zoned block device volumes as these are treated as multi-device
      mounts but do not have the start_blk and end_blk information set. One
      of the problems observed is that zone discards are skipped, resulting in
      write commands being issued to full zones or unaligned to a zone write
      pointer.
      
      Fix this problem by simply treating the cases sbi->s_ndevs == 0 (single
      regular block device mount) and sbi->s_ndevs == 1 (single zoned block
      device mount) in the same manner. This is done by introducing the
      helper function f2fs_is_multi_device() and using this helper in place
      of direct tests of sbi->s_ndevs value, improving code readability.
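
The helper described above amounts to a single comparison. The following is a minimal userspace sketch, assuming the check is simply "more than one device"; `struct sbi_sketch` and `is_multi_device()` are stand-ins, not the f2fs definitions.

```c
#include <stdbool.h>

/* Stand-in for the relevant part of the f2fs superblock info. */
struct sbi_sketch {
    int s_ndevs;  /* 0: single regular device, 1: single zoned device */
};

/* Treat s_ndevs == 0 and s_ndevs == 1 alike: both are single-device. */
static bool is_multi_device(const struct sbi_sketch *sbi)
{
    return sbi->s_ndevs > 1;
}
```

Callers that previously tested the device count directly would use the predicate instead, so the zoned single-device case no longer takes the multi-device path.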
      
      Fixes: 7bb3a371 ("f2fs: Fix zoned block device support")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
      Reviewed-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • PCI: hv: Add pci_destroy_slot() in pci_devices_present_work(), if necessary · d82e3d5b
      Dexuan Cui authored
      commit 340d4556 upstream.
      
      When we hot-remove a device, usually the host sends us a PCI_EJECT message,
      and a PCI_BUS_RELATIONS message with bus_rel->device_count == 0.
      
      When we execute the quick hot-add/hot-remove test, the host may not send
      us the PCI_EJECT message if the guest has not fully finished the
      initialization by sending the PCI_RESOURCES_ASSIGNED* message to the
      host, so it's potentially unsafe to only depend on the
      pci_destroy_slot() in hv_eject_device_work() because the code path
      
      create_root_hv_pci_bus()
       -> hv_pci_assign_slots()
      
      is not called in this case. Note: in this case, the host still sends the
      guest a PCI_BUS_RELATIONS message with bus_rel->device_count == 0.
      
      In the quick hot-add/hot-remove test, we can have the following race:
      before the code path
      
      pci_devices_present_work()
       -> new_pcichild_device()
      
      adds the new device into the hbus->children list, we may have already
      received the PCI_EJECT message, and since the tasklet handler
      
      hv_pci_onchannelcallback()
      
      may fail to find the "hpdev" by calling
      
      get_pcichild_wslot(hbus, dev_message->wslot.slot)
      
      hv_pci_eject_device() is not called. Later, continuing execution,
      
      create_root_hv_pci_bus()
       -> hv_pci_assign_slots()
      
      creates the slot and the PCI_BUS_RELATIONS message with
      bus_rel->device_count == 0 removes the device from hbus->children, and
      we end up being unable to remove the slot in
      
      hv_pci_remove()
       -> hv_pci_remove_slots()
      
      Remove the slot in pci_devices_present_work() when the device
      is removed to address this race.
      
      pci_devices_present_work() and hv_eject_device_work() run in the
      single-threaded hbus->wq, so there is no double-remove issue for the
      slot.
      
      We cannot offload hv_pci_eject_device() from hv_pci_onchannelcallback()
      to the workqueue, because we need hv_pci_onchannelcallback() to
      synchronously call hv_pci_eject_device() to poll the channel
      ringbuffer to work around the "hangs in hv_compose_msi_msg()" issue
      fixed in commit de0aa7b2 ("PCI: hv: Fix 2 hang issues in
      hv_compose_msi_msg()").
      
      Fixes: a15f2c08 ("PCI: hv: support reporting serial number as slot information")
      Signed-off-by: Dexuan Cui <decui@microsoft.com>
      [lorenzo.pieralisi@arm.com: rewritten commit log]
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
      Reviewed-by: Michael Kelley <mikelley@microsoft.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • PCI: hv: Add hv_pci_remove_slots() when we unload the driver · 5c36f434
      Dexuan Cui authored
      commit 15becc2b upstream.
      
      When we unload the pci-hyperv host controller driver, the host does not
      send us a PCI_EJECT message.
      
      In this case we also need to make sure the sysfs PCI slot directory is
      removed, otherwise a command on a slot file, e.g.:
      
      "cat /sys/bus/pci/slots/2/address"
      
      will trigger a
      
      "BUG: unable to handle kernel paging request"
      
      and, if we unload/reload the driver several times we would end up with
      stale slot entries in PCI slot directories in /sys/bus/pci/slots/
      
      root@localhost:~# ls -rtl  /sys/bus/pci/slots/
      total 0
      drwxr-xr-x 2 root root 0 Feb  7 10:49 2
      drwxr-xr-x 2 root root 0 Feb  7 10:49 2-1
      drwxr-xr-x 2 root root 0 Feb  7 10:51 2-2
      
      Add the missing code to remove the PCI slot and fix the current
      behaviour.
      
      Fixes: a15f2c08 ("PCI: hv: support reporting serial number as slot information")
      Signed-off-by: Dexuan Cui <decui@microsoft.com>
      [lorenzo.pieralisi@arm.com: reformatted the log]
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reviewed-by: Stephen Hemminger <sthemmin@microsoft.com>
      Reviewed-by: Michael Kelley <mikelley@microsoft.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • PCI: hv: Fix a memory leak in hv_eject_device_work() · 91425cbe
      Dexuan Cui authored
      commit 05f151a7 upstream.
      
      When a device is created in new_pcichild_device(), hpdev->refs is set
      to 2 (i.e. the initial value of 1 plus the get_pcichild()).
      
      When we hot remove the device from the host, in a Linux VM we first call
      hv_pci_eject_device(), which increases hpdev->refs by get_pcichild() and
      then schedules a work of hv_eject_device_work(), so hpdev->refs becomes
      3 (let's ignore the paired get/put_pcichild() in other places). But in
      hv_eject_device_work(), currently we only call put_pcichild() twice,
      meaning the 'hpdev' struct can't be freed in put_pcichild().
      
      Add one put_pcichild() to fix the memory leak.
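
The reference counting described above can be replayed in a small model. This is only a sketch: `struct hpdev_sketch` and the helpers are hypothetical stand-ins for the driver's get_pcichild()/put_pcichild(), and the paired get/put calls in other places are ignored, as in the text.

```c
/* Stand-in for the per-device structure; only the refcount matters. */
struct hpdev_sketch {
    int refs;
    int freed;
};

static void get_child(struct hpdev_sketch *h)
{
    h->refs++;
}

static void put_child(struct hpdev_sketch *h)
{
    if (--h->refs == 0)
        h->freed = 1;   /* the real code frees the structure here */
}

/* Replay the hot-remove path; return non-zero if the device was freed. */
static int eject(int puts_in_eject_work)
{
    struct hpdev_sketch h = { 1, 0 };   /* initial value of 1 */

    get_child(&h);      /* new_pcichild_device(): refs == 2 */
    get_child(&h);      /* hv_pci_eject_device(): refs == 3 */

    for (int i = 0; i < puts_in_eject_work; i++)
        put_child(&h);  /* hv_eject_device_work() */

    return h.freed;
}
```

Two puts leave one reference behind, so the structure is never freed; the third put releases it.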
      
      The device can also be removed when we run "rmmod pci-hyperv". On this
      path (hv_pci_remove() -> hv_pci_bus_exit() -> hv_pci_devices_present()),
      hpdev->refs is 2, and we do correctly call put_pcichild() twice in
      pci_devices_present_work().
      
      Fixes: 4daace0d ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs")
      Signed-off-by: Dexuan Cui <decui@microsoft.com>
      [lorenzo.pieralisi@arm.com: commit log rework]
      Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
      Reviewed-by: Michael Kelley <mikelley@microsoft.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • virtio_ring: Fix potential mem leak in virtqueue_add_indirect_packed · 037ca765
      YueHaibing authored
      commit df0bfe75 upstream.
      
      'desc' should be freed before leaving the error handling path.
      
      Fixes: 1ce9e605 ("virtio_ring: introduce packed ring support")
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Acked-by: Jason Wang <jasowang@redhat.com>
      stable@vger.kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc/booke64: set RI in default MSR · 665c4bdd
      Laurentiu Tudor authored
      commit 5266e58d upstream.
      
      Set RI in the default kernel's MSR so that the architected way of
      detecting unrecoverable machine check interrupts has a chance to work.
      This is in line with the MSR setup of the rest of the booke powerpc
      architectures configured here.

      Signed-off-by: Laurentiu Tudor <laurentiu.tudor@nxp.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc/powernv/idle: Restore IAMR after idle · 2f855814
      Russell Currey authored
      commit a3f3072d upstream.
      
      Without restoring the IAMR after idle, execution prevention on POWER9
      with Radix MMU is overwritten and the kernel can freely execute
      userspace without faulting.
      
      This is necessary when returning from any stop state that modifies
      user state, as well as hypervisor state.
      
      To test how this fails without this patch, load the lkdtm driver and
      do the following:
      
        $ echo EXEC_USERSPACE > /sys/kernel/debug/provoke-crash/DIRECT
      
      which won't fault, then boot the kernel with powersave=off, where it
      will fault. Applying this patch will fix this.
      
      Fixes: 3b10d009 ("powerpc/mm/radix: Prevent kernel execution of user space")
      Cc: stable@vger.kernel.org # v4.10+
      Signed-off-by: Russell Currey <ruscur@russell.cc>
      Reviewed-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
      Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc/book3s/64: check for NULL pointer in pgd_alloc() · 6aa7a9ea
      Rick Lindsley authored
      commit f3935626 upstream.
      
      When the memset code was added to pgd_alloc(), it failed to consider
      that kmem_cache_alloc() can return NULL. It's uncommon, but not
      impossible under heavy memory contention. Example oops:
      
        Unable to handle kernel paging request for data at address 0x00000000
        Faulting instruction address: 0xc0000000000a4000
        Oops: Kernel access of bad area, sig: 11 [#1]
        LE SMP NR_CPUS=2048 NUMA pSeries
        CPU: 70 PID: 48471 Comm: entrypoint.sh Kdump: loaded Not tainted 4.14.0-115.6.1.el7a.ppc64le #1
        task: c000000334a00000 task.stack: c000000331c00000
        NIP:  c0000000000a4000 LR: c00000000012f43c CTR: 0000000000000020
        REGS: c000000331c039c0 TRAP: 0300   Not tainted  (4.14.0-115.6.1.el7a.ppc64le)
        MSR:  800000010280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>  CR: 44022840  XER: 20040000
        CFAR: c000000000008874 DAR: 0000000000000000 DSISR: 42000000 SOFTE: 1
        ...
        NIP [c0000000000a4000] memset+0x68/0x104
        LR [c00000000012f43c] mm_init+0x27c/0x2f0
        Call Trace:
          mm_init+0x260/0x2f0 (unreliable)
          copy_mm+0x11c/0x638
          copy_process.isra.28.part.29+0x6fc/0x1080
          _do_fork+0xdc/0x4c0
          ppc_clone+0x8/0xc
        Instruction dump:
        409e000c b0860000 38c60002 409d000c 90860000 38c60004 78a0d183 78a506a0
        7c0903a6 41820034 60000000 60420000 <f8860000> f8860008 f8860010 f8860018
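
The missing check can be sketched in userspace C. This is an illustration only: the allocator is passed in as a function pointer so that failure can be simulated, and `PGD_TABLE_SIZE_SKETCH`, `pgd_alloc_sketch()` and `failing_alloc()` are hypothetical names with an arbitrary size.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Arbitrary table size for the sketch; the kernel's value differs. */
#define PGD_TABLE_SIZE_SKETCH 4096

/* Allocate a table and zero it, but only if the allocation succeeded. */
static void *pgd_alloc_sketch(void *(*alloc)(size_t))
{
    void *pgd = alloc(PGD_TABLE_SIZE_SKETCH);

    if (!pgd)
        return NULL;    /* without this check, memset() hits address 0 */

    memset(pgd, 0, PGD_TABLE_SIZE_SKETCH);
    return pgd;
}

/* Simulates an allocator that fails under heavy memory contention. */
static void *failing_alloc(size_t size)
{
    (void)size;
    return NULL;
}
```

Passing `failing_alloc` returns NULL to the caller instead of faulting in memset(), which is the behaviour the oops above shows is otherwise missing.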
      
      Fixes: fc5c2f4a ("powerpc/mm/hash64: Zero PGD pages on allocation")
      Cc: stable@vger.kernel.org # v4.16+
      Signed-off-by: Rick Lindsley <ricklind@vnet.linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • drivers/virt/fsl_hypervisor.c: prevent integer overflow in ioctl · 85ee2081
      Dan Carpenter authored
      commit 6a024330 upstream.
      
      The "param.count" value is a u64 that comes from the user.  The code
      later in the function assumes that param.count is at least one and if
      it's not then it leads to an Oops when we dereference the ZERO_SIZE_PTR.
      
      Also the addition can have an integer overflow which would lead us to
      allocate a smaller "pages" array than required.  I can't immediately
      tell what the possible runtime implications are, but it's safest to
      prevent the overflow.
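
Both checks can be sketched in userspace C. This is a minimal illustration under stated assumptions: the page count is taken to be the byte count plus an in-page offset rounded up to whole pages, the offset is assumed to be smaller than a page, and all names and the 4 KiB page size are hypothetical.

```c
#include <errno.h>
#include <stdint.h>

#define PAGE_SHIFT_SKETCH 12
#define PAGE_SIZE_SKETCH  (UINT64_C(1) << PAGE_SHIFT_SKETCH)

/*
 * Compute how many pages cover 'count' bytes starting 'offset' bytes
 * into a page. Rejects a zero count and any count for which
 * count + offset + PAGE_SIZE - 1 would wrap around.
 */
static int num_pages_checked(uint64_t count, uint64_t offset,
                             uint64_t *num_pages)
{
    if (count == 0)
        return -EINVAL;
    if (count > UINT64_MAX - offset - PAGE_SIZE_SKETCH + 1)
        return -EINVAL;

    *num_pages = (count + offset + PAGE_SIZE_SKETCH - 1) >> PAGE_SHIFT_SKETCH;
    return 0;
}
```

The overflow test is written as a subtraction from the maximum value so that the check itself cannot wrap.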
      
      Link: http://lkml.kernel.org/r/20181218082129.GE32567@kadam
      Fixes: 6db71994 ("drivers/virt: introduce Freescale hypervisor management driver")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Timur Tabi <timur@freescale.com>
      Cc: Mihai Caraman <mihai.caraman@freescale.com>
      Cc: Kumar Gala <galak@kernel.crashing.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • drivers/virt/fsl_hypervisor.c: dereferencing error pointers in ioctl · 41771de5
      Dan Carpenter authored
      commit c8ea3663 upstream.
      
      strndup_user() returns error pointers on error, and then in the error
      handling we pass the error pointers to kfree().  It will cause an Oops.
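
The error-pointer problem can be sketched in userspace C. The helpers below only imitate the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() convention, and `dup_or_err()` is a hypothetical stand-in for strndup_user(); none of this is the driver's code. The point is that the error path must not hand an error pointer to the free routine.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ERRNO_SKETCH 4095

/* Userspace imitations of the kernel's error-pointer helpers. */
static void *err_ptr(long err) { return (void *)(intptr_t)err; }
static long ptr_err(const void *p) { return (long)(intptr_t)p; }
static int is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

/* Stand-in for strndup_user(): a copy on success, error pointer on failure. */
static char *dup_or_err(const char *src)
{
    char *copy;

    if (!src)
        return err_ptr(-EFAULT);
    copy = malloc(strlen(src) + 1);
    if (!copy)
        return err_ptr(-ENOMEM);
    strcpy(copy, src);
    return copy;
}

/* Duplicate two strings; free only what was really allocated. */
static long copy_two(const char *a, const char *b)
{
    char *path, *propname;
    long ret = 0;

    path = dup_or_err(a);
    if (is_err(path))
        return ptr_err(path);       /* nothing to free yet */

    propname = dup_or_err(b);
    if (is_err(propname)) {
        ret = ptr_err(propname);    /* do not free the error pointer */
        goto err_free_path;
    }

    free(propname);
err_free_path:
    free(path);
    return ret;
}
```

Each failure exits through a label that frees only the pointers known to be valid at that point.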
      
      Link: http://lkml.kernel.org/r/20181218082003.GD32567@kadam
      Fixes: 6db71994 ("drivers/virt: introduce Freescale hypervisor management driver")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Timur Tabi <timur@freescale.com>
      Cc: Mihai Caraman <mihai.caraman@freescale.com>
      Cc: Kumar Gala <galak@kernel.crashing.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • isdn: bas_gigaset: use usb_fill_int_urb() properly · de2dec85
      Paul Bolle authored
      [ Upstream commit 4014dfae ]
      
      The switch to make bas_gigaset use usb_fill_int_urb() - instead of
      filling that urb "by hand" - missed the subtle ordering of the previous
      code.
      
      See, before the switch urb->dev was set to a member somewhere deep in a
      complicated structure and then supplied to usb_rcvisocpipe() and
      usb_sndisocpipe(). After that switch urb->dev wasn't set to anything
      specific before being supplied to those two macros. This triggers a
      nasty oops:
      
          BUG: unable to handle kernel NULL pointer dereference at 00000000
          #PF error: [normal kernel read fault]
          *pde = 00000000
          Oops: 0000 [#1] SMP
          CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.1.0-0.rc4.1.local0.fc28.i686 #1
          Hardware name: IBM 2525FAG/2525FAG, BIOS 74ET64WW (2.09 ) 12/14/2006
          EIP: gigaset_init_bchannel+0x89/0x320 [bas_gigaset]
          Code: 75 07 83 8b 84 00 00 00 40 8d 47 74 c7 07 01 00 00 00 89 45 f0 8b 44 b7 68 85 c0 0f 84 6a 02 00 00 8b 48 28 8b 93 88 00 00 00 <8b> 09 8d 54 12 03 c1 e2 0f c1 e1 08 09 ca 8b 8b 8c 00 00 00 80 ca
          EAX: f05ec200 EBX: ed404200 ECX: 00000000 EDX: 00000000
          ESI: 00000000 EDI: f065a000 EBP: f30c9f40 ESP: f30c9f20
          DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010086
          CR0: 80050033 CR2: 00000000 CR3: 0ddc7000 CR4: 000006d0
          Call Trace:
           <SOFTIRQ>
           ? gigaset_isdn_connD+0xf6/0x140 [gigaset]
           gigaset_handle_event+0x173e/0x1b90 [gigaset]
           tasklet_action_common.isra.16+0x4e/0xf0
           tasklet_action+0x1e/0x20
           __do_softirq+0xb2/0x293
           ? __irqentry_text_end+0x3/0x3
           call_on_stack+0x45/0x50
           </SOFTIRQ>
           ? irq_exit+0xb5/0xc0
           ? do_IRQ+0x78/0xd0
           ? acpi_idle_enter_s2idle+0x50/0x50
           ? common_interrupt+0xd4/0xdc
           ? acpi_idle_enter_s2idle+0x50/0x50
           ? sched_cpu_activate+0x1b/0xf0
           ? acpi_fan_resume.cold.7+0x9/0x18
           ? cpuidle_enter_state+0x152/0x4c0
           ? cpuidle_enter+0x14/0x20
           ? call_cpuidle+0x21/0x40
           ? do_idle+0x1c8/0x200
           ? cpu_startup_entry+0x25/0x30
           ? rest_init+0x88/0x8a
           ? arch_call_rest_init+0xd/0x19
           ? start_kernel+0x42f/0x448
           ? i386_start_kernel+0xac/0xb0
           ? startup_32_smp+0x164/0x168
          Modules linked in: ppp_generic slhc capi bas_gigaset gigaset kernelcapi nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables sunrpc ipw2200 iTCO_wdt gpio_ich snd_intel8x0 libipw iTCO_vendor_support snd_ac97_codec lib80211 ppdev ac97_bus snd_seq cfg80211 snd_seq_device pcspkr thinkpad_acpi lpc_ich snd_pcm i2c_i801 snd_timer ledtrig_audio snd soundcore rfkill parport_pc parport pcc_cpufreq acpi_cpufreq i915 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sdhci_pci sysimgblt cqhci fb_sys_fops drm sdhci mmc_core tg3 ata_generic serio_raw yenta_socket pata_acpi video
          CR2: 0000000000000000
          ---[ end trace 1fe07487b9200c73 ]---
          EIP: gigaset_init_bchannel+0x89/0x320 [bas_gigaset]
          Code: 75 07 83 8b 84 00 00 00 40 8d 47 74 c7 07 01 00 00 00 89 45 f0 8b 44 b7 68 85 c0 0f 84 6a 02 00 00 8b 48 28 8b 93 88 00 00 00 <8b> 09 8d 54 12 03 c1 e2 0f c1 e1 08 09 ca 8b 8b 8c 00 00 00 80 ca
          EAX: f05ec200 EBX: ed404200 ECX: 00000000 EDX: 00000000
          ESI: 00000000 EDI: f065a000 EBP: f30c9f40 ESP: cddcb3bc
          DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010086
          CR0: 80050033 CR2: 00000000 CR3: 0ddc7000 CR4: 000006d0
          Kernel panic - not syncing: Fatal exception in interrupt
          Kernel Offset: 0xcc00000 from 0xc0400000 (relocation range: 0xc0000000-0xf6ffdfff)
          ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
      
      No-one noticed because this Oops is apparently only triggered by setting
      up an ISDN data connection on a live ISDN line on a gigaset base (i.e.,
      the PBX that the gigaset driver supports). Very few people do that
      while running present-day kernels.
      
      Anyhow, a little code reorganization makes this problem go away, while
      avoiding the subtle ordering that was used in the past. So let's do
      that.
      
      Fixes: 78c696c1 ("isdn: gigaset: use usb_fill_int_urb()")
      Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • flow_dissector: disable preemption around BPF calls · e106991b
      Eric Dumazet authored
      [ Upstream commit b1c17a9a ]
      
      Various things in eBPF really require us to disable preemption
      before running an eBPF program.
      
      syzbot reported:
      
      BUG: assuming atomic context at net/core/flow_dissector.c:737
      in_atomic(): 0, irqs_disabled(): 0, pid: 24710, name: syz-executor.3
      2 locks held by syz-executor.3/24710:
       #0: 00000000e81a4bf1 (&tfile->napi_mutex){+.+.}, at: tun_get_user+0x168e/0x3ff0 drivers/net/tun.c:1850
       #1: 00000000254afebd (rcu_read_lock){....}, at: __skb_flow_dissect+0x1e1/0x4bb0 net/core/flow_dissector.c:822
      CPU: 1 PID: 24710 Comm: syz-executor.3 Not tainted 5.1.0+ #6
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x172/0x1f0 lib/dump_stack.c:113
       __cant_sleep kernel/sched/core.c:6165 [inline]
       __cant_sleep.cold+0xa3/0xbb kernel/sched/core.c:6142
       bpf_flow_dissect+0xfe/0x390 net/core/flow_dissector.c:737
       __skb_flow_dissect+0x362/0x4bb0 net/core/flow_dissector.c:853
       skb_flow_dissect_flow_keys_basic include/linux/skbuff.h:1322 [inline]
       skb_probe_transport_header include/linux/skbuff.h:2500 [inline]
       skb_probe_transport_header include/linux/skbuff.h:2493 [inline]
       tun_get_user+0x2cfe/0x3ff0 drivers/net/tun.c:1940
       tun_chr_write_iter+0xbd/0x156 drivers/net/tun.c:2037
       call_write_iter include/linux/fs.h:1872 [inline]
       do_iter_readv_writev+0x5fd/0x900 fs/read_write.c:693
       do_iter_write fs/read_write.c:970 [inline]
       do_iter_write+0x184/0x610 fs/read_write.c:951
       vfs_writev+0x1b3/0x2f0 fs/read_write.c:1015
       do_writev+0x15b/0x330 fs/read_write.c:1058
       __do_sys_writev fs/read_write.c:1131 [inline]
       __se_sys_writev fs/read_write.c:1128 [inline]
       __x64_sys_writev+0x75/0xb0 fs/read_write.c:1128
       do_syscall_64+0x103/0x670 arch/x86/entry/common.c:298
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Fixes: d58e468b ("flow_dissector: implements flow dissector BPF hook")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Cc: Petar Penkov <ppenkov@google.com>
      Cc: Stanislav Fomichev <sdf@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: phy: fix phy_validate_pause · b58df9d7
      Heiner Kallweit authored
      [ Upstream commit b4010af9 ]
      
      We have valid scenarios where ETHTOOL_LINK_MODE_Pause_BIT doesn't
      need to be supported. Therefore extend the first check to check
      for rx_pause being set.
      
      See also phy_set_asym_pause:
      rx=0 and tx=1: advertise asym pause only
      rx=0 and tx=0: stop advertising both pause modes
      
      The fixed commit isn't wrong, it's just the one that introduced the
      linkmode bitmaps.
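The check described above can be modelled in plain C. This is a minimal userspace sketch, not the kernel function: `struct pause_caps` and `validate_pause` are hypothetical names, and the two booleans stand in for the Pause and Asym_Pause bits in the PHY's supported link modes. The second check (asymmetric pause) is an assumption, since the message only describes the first one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the PHY's supported link-mode bits. */
struct pause_caps {
	bool pause;      /* models ETHTOOL_LINK_MODE_Pause_BIT */
	bool asym_pause; /* models the asymmetric pause link-mode bit */
};

/* Returns true if the requested rx/tx pause settings can be honoured. */
bool validate_pause(const struct pause_caps *caps, bool rx_pause, bool tx_pause)
{
	assert(caps != NULL);

	/* Symmetric pause support is only required when rx pause is requested. */
	if (!caps->pause && rx_pause)
		return false;

	/* Assumption: differing rx/tx settings need asymmetric pause support. */
	if (!caps->asym_pause && rx_pause != tx_pause)
		return false;

	return true;
}
```

With a PHY that supports only asymmetric pause, both `rx=0, tx=1` and `rx=0, tx=0` pass this check, which matches the two phy_set_asym_pause cases listed above.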
      
      Fixes: 3c1bcc86 ("net: ethernet: Convert phydev advertize and supported from u32 to link mode")
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tuntap: synchronize through tfiles array instead of tun->numqueues · 0f852d10
      Jason Wang authored
      [ Upstream commit 9871a9e4 ]
      
      When a queue (tfile) is detached through __tun_detach(), we move the
      last enabled tfile to the position where the detached one sat, but we
      don't NULL out the last position. We expect to synchronize the datapath
      through tun->numqueues. Unfortunately, this won't work, since we lack a
      sufficient mechanism to order or synchronize access to
      tun->numqueues.
      
      To fix this, NULL out the last position when detaching, and check the
      RCU-protected tfile against NULL instead of checking tun->numqueues in
      the datapath.
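The detach-and-lookup scheme described above can be illustrated with a small single-threaded model. This is a sketch under stated assumptions: `tun_detach_model`, `tun_pick_queue_model` and `MAX_QUEUES` are hypothetical names, and plain pointer stores stand in for the kernel's RCU pointer assignment and dereference, so it shows the data layout only, not the memory ordering.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_QUEUES 4 /* hypothetical bound for this sketch */

struct tfile {
	int queue_index;
};

struct tun {
	struct tfile *tfiles[MAX_QUEUES];
	unsigned int numqueues;
};

/* Detach the queue at `index`: the last enabled queue takes over its slot
 * and the vacated last slot is cleared, so no stale pointer stays visible. */
void tun_detach_model(struct tun *tun, unsigned int index)
{
	unsigned int last;

	assert(tun->numqueues > 0 && index < tun->numqueues);
	last = tun->numqueues - 1;

	tun->tfiles[index] = tun->tfiles[last];
	tun->tfiles[index]->queue_index = (int)index;
	tun->tfiles[last] = NULL;
	tun->numqueues--;
}

/* Datapath lookup: rely on the slot contents rather than on numqueues. */
struct tfile *tun_pick_queue_model(const struct tun *tun, unsigned int slot)
{
	if (slot >= MAX_QUEUES)
		return NULL;
	return tun->tfiles[slot]; /* NULL means there is no queue in this slot */
}
```

The point of the design is that a reader which sees NULL can simply drop the packet, without needing a consistent view of the queue count.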
      
      Cc: YueHaibing <yuehaibing@huawei.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: weiyongjun (A) <weiyongjun1@huawei.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Fixes: c8d68e6b ("tuntap: multiqueue support")
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Reviewed-by: Wei Yongjun <weiyongjun1@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tuntap: fix dividing by zero in ebpf queue selection · 6e1d1a3a
      Jason Wang authored
      [ Upstream commit a35d310f ]
      
      We need to check whether tun->numqueues is zero (e.g. for a persistent
      device) before trying to use it for modular arithmetic.
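A minimal sketch of the guard, assuming a hypothetical helper `select_queue_model` that takes the value returned by the eBPF program and the current queue count. Returning queue 0 when no queues are attached is an assumption made for this illustration.

```c
#include <stdint.h>

/* Map an eBPF program's return value onto a queue index.
 * With zero queues the modulus would divide by zero, so bail out first. */
uint16_t select_queue_model(uint32_t prog_ret, uint32_t numqueues)
{
	if (numqueues == 0)
		return 0; /* assumption: fall back to queue 0 */

	return (uint16_t)(prog_ret % numqueues);
}
```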
      
      Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
      Fixes: 96f84061 ("tun: add eBPF based queue selection method")
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • vrf: sit mtu should not be updated when vrf netdev is the link · ee0d666d
      Stephen Suryaputra authored
      [ Upstream commit ff6ab32b ]
      
      The mtu of a VRF netdev isn't typically set, so it has an mtu of 65536.
      When the link of a tunnel is set, the tunnel mtu is changed from 1480 to
      the link mtu minus the tunnel header. When a VRF netdev is the link, the
      tunnel mtu therefore becomes 65516. Fix this by not setting the tunnel
      mtu in this case.
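The arithmetic above (65536 minus a 20-byte outer IPv4 header gives 65516) can be written out as a small sketch. The function name `sit_mtu_after_bind` and the `link_is_vrf` flag are hypothetical; the kernel uses its own predicate to recognise a VRF device, and any minimum-MTU clamping is left out here.

```c
#include <stdbool.h>

#define IPV4_HDR_LEN 20 /* outer IPv4 header added by the sit tunnel */

/* Return the tunnel MTU after binding to a link device. */
int sit_mtu_after_bind(int tunnel_mtu, int link_mtu, bool link_is_vrf)
{
	if (link_is_vrf)
		return tunnel_mtu; /* leave the tunnel MTU untouched */

	return link_mtu - IPV4_HDR_LEN;
}
```

For an ordinary 1500-byte link this yields the usual 1480; for a VRF link the unpatched computation would have produced 65516.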
      
      Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • vlan: disable SIOCSHWTSTAMP in container · 4e031338
      Hangbin Liu authored
      [ Upstream commit 873017af ]
      
      With NET_ADMIN enabled in a container, a normal user could be mapped to
      root and would be able to change the real device's rx filter via ioctl
      on the vlan, which would affect other ptp processes on the host. Fix it
      by disabling SIOCSHWTSTAMP in containers.
      
      Fixes: a6111d3c ("vlan: Pass SIOC[SG]HWTSTAMP ioctls to real device")
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Acked-by: Richard Cochran <richardcochran@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tipc: fix hanging clients using poll with EPOLLOUT flag · 5338e8ff
      Parthasarathy Bhuvaragan authored
      [ Upstream commit ff946833 ]
      
      commit 517d7c79 ("tipc: fix hanging poll() for stream sockets")
      introduced a regression for clients using non-blocking sockets.
      After the commit, we send EPOLLOUT event to the client even in
      TIPC_CONNECTING state. This causes the subsequent send() to fail
      with ENOTCONN, as the socket is still not in TIPC_ESTABLISHED state.
      
      In this commit, we:
      - improve the fix for hanging poll() by replacing sk_data_ready()
        with sk_state_change() to wake up all clients.
      - revert the faulty updates introduced by commit 517d7c79
        ("tipc: fix hanging poll() for stream sockets").
      
      Fixes: 517d7c79 ("tipc: fix hanging poll() for stream sockets")
      Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@gmail.com>
      Acked-by: Jon Maloy <jon.maloy@ericsson.se>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • selinux: do not report error on connect(AF_UNSPEC) · 17617cd5
      Paolo Abeni authored
      [ Upstream commit c7e0d6cc ]
      
      Calling connect(AF_UNSPEC) on an already connected TCP socket is an
      established way to disconnect() such a socket. After commit 68741a8a
      ("selinux: Fix ltp test connect-syscall failure") it no longer works
      and, in the above scenario, connect() fails with EAFNOSUPPORT.
      
      Fix the above by falling back to the generic/old code when the address
      family is not AF_INET{4,6}, but leave the SCTP code path untouched, as it
      has specific constraints.
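For readers unfamiliar with the idiom, this is roughly what the userspace side looks like on Linux. It is a sketch only: `disconnect_socket` is a hypothetical helper name, and the code simply passes an address whose family is AF_UNSPEC to connect(2).

```c
#include <string.h>
#include <sys/socket.h>

/* Dissolve the association of a connected socket.
 * Returns 0 on success, or -1 with errno set, exactly as connect(2) does. */
int disconnect_socket(int fd)
{
	struct sockaddr sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_family = AF_UNSPEC; /* "no address": asks the stack to disconnect */

	return connect(fd, &sa, sizeof(sa));
}
```

With the regression described above, a call like this on a connected TCP socket would fail with EAFNOSUPPORT under SELinux instead of returning 0.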
      
      Fixes: 68741a8a ("selinux: Fix ltp test connect-syscall failure")
      Reported-by: Tom Deseyn <tdeseyn@redhat.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • packet: Fix error path in packet_init · 7ad0ccae
      YueHaibing authored
      [ Upstream commit 36096f2f ]
      
      kernel BUG at lib/list_debug.c:47!
      invalid opcode: 0000 [#1
      CPU: 0 PID: 12914 Comm: rmmod Tainted: G        W         5.1.0+ #47
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
      RIP: 0010:__list_del_entry_valid+0x53/0x90
      Code: 48 8b 32 48 39 fe 75 35 48 8b 50 08 48 39 f2 75 40 b8 01 00 00 00 5d c3 48 89 fe 48 89 c2 48 c7 c7 18 75 fe 82 e8 cb 34 78 ff <0f> 0b 48 89 fe 48 c7 c7 50 75 fe 82 e8 ba 34 78 ff 0f 0b 48 89 f2
      RSP: 0018:ffffc90001c2fe40 EFLAGS: 00010286
      RAX: 000000000000004e RBX: ffffffffa0184000 RCX: 0000000000000000
      RDX: 0000000000000000 RSI: ffff888237a17788 RDI: 00000000ffffffff
      RBP: ffffc90001c2fe40 R08: 0000000000000000 R09: 0000000000000000
      R10: ffffc90001c2fe10 R11: 0000000000000000 R12: 0000000000000000
      R13: ffffc90001c2fe50 R14: ffffffffa0184000 R15: 0000000000000000
      FS:  00007f3d83634540(0000) GS:ffff888237a00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000555c350ea818 CR3: 0000000231677000 CR4: 00000000000006f0
      Call Trace:
       unregister_pernet_operations+0x34/0x120
       unregister_pernet_subsys+0x1c/0x30
       packet_exit+0x1c/0x369 [af_packet
       __x64_sys_delete_module+0x156/0x260
       ? lockdep_hardirqs_on+0x133/0x1b0
       ? do_syscall_64+0x12/0x1f0
       do_syscall_64+0x6e/0x1f0
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      When af_packet is loaded with modprobe and register_pernet_subsys
      fails, it cleans up and sets ops->list to LIST_POISON1, but the module
      init is still considered successful. A later rmmod then triggers the
      BUG() in __list_del_entry_valid, which is called from
      unregister_pernet_subsys. This patch fixes the error handling path in
      packet_init to avoid this issue when an error occurs.
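The general shape of such an error path, where each failed step returns its error and unwinds the steps already done, can be sketched in userspace C. Everything here is hypothetical (`struct init_state`, `step`, `module_init_model`, the three step names and the injected failure); it models the goto-unwind pattern, not the actual packet_init code.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical bookkeeping for three registration steps. */
struct init_state {
	bool proto;
	bool sock;
	bool pernet;
};

/* One registration step; `fail` injects an error for the demonstration. */
static int step(bool *registered, bool fail)
{
	if (fail)
		return -ENOMEM;
	*registered = true;
	return 0;
}

/* `fail_at` (1..3) forces that step to fail; 0 lets every step succeed. */
int module_init_model(struct init_state *st, int fail_at)
{
	int rc;

	rc = step(&st->proto, fail_at == 1);
	if (rc)
		goto out;
	rc = step(&st->sock, fail_at == 2);
	if (rc)
		goto out_proto;
	rc = step(&st->pernet, fail_at == 3);
	if (rc)
		goto out_sock;
	return 0;

out_sock:
	st->sock = false;  /* undo step 2 */
out_proto:
	st->proto = false; /* undo step 1 */
out:
	return rc;         /* propagate the error so init is seen to fail */
}
```

Because the error code is returned to the caller, the module is never considered loaded after a failed step, so a later unload cannot touch state that was already torn down.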
      
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: ucc_geth - fix Oops when changing number of buffers in the ring · be0e51d7
      Christophe Leroy authored
      [ Upstream commit ee0df193 ]
      
      When changing the number of buffers in the RX ring while the interface
      is running, the following Oops is encountered due to the new number
      of buffers being taken into account immediately while their allocation
      is done when opening the device only.
      
      [   69.882706] Unable to handle kernel paging request for data at address 0xf0000100
      [   69.890172] Faulting instruction address: 0xc033e164
      [   69.895122] Oops: Kernel access of bad area, sig: 11 [#1]
      [   69.900494] BE PREEMPT CMPCPRO
      [   69.907120] CPU: 0 PID: 0 Comm: swapper Not tainted 4.14.115-00006-g179ade8ce3-dirty #269
      [   69.915956] task: c0684310 task.stack: c06da000
      [   69.920470] NIP:  c033e164 LR: c02e44d0 CTR: c02e41fc
      [   69.925504] REGS: dfff1e20 TRAP: 0300   Not tainted  (4.14.115-00006-g179ade8ce3-dirty)
      [   69.934161] MSR:  00009032 <EE,ME,IR,DR,RI>  CR: 22004428  XER: 20000000
      [   69.940869] DAR: f0000100 DSISR: 20000000
      [   69.940869] GPR00: c0352d70 dfff1ed0 c0684310 f00000a4 00000040 dfff1f68 00000000 0000001f
      [   69.940869] GPR08: df53f410 1cc00040 00000021 c0781640 42004424 100c82b6 f00000a4 df53f5b0
      [   69.940869] GPR16: df53f6c0 c05daf84 00000040 00000000 00000040 c0782be4 00000000 00000001
      [   69.940869] GPR24: 00000000 df53f400 000001b0 df53f410 df53f000 0000003f df708220 1cc00044
      [   69.978348] NIP [c033e164] skb_put+0x0/0x5c
      [   69.982528] LR [c02e44d0] ucc_geth_poll+0x2d4/0x3f8
      [   69.987384] Call Trace:
      [   69.989830] [dfff1ed0] [c02e4554] ucc_geth_poll+0x358/0x3f8 (unreliable)
      [   69.996522] [dfff1f20] [c0352d70] net_rx_action+0x248/0x30c
      [   70.002099] [dfff1f80] [c04e93e4] __do_softirq+0xfc/0x310
      [   70.007492] [dfff1fe0] [c0021124] irq_exit+0xd0/0xd4
      [   70.012458] [dfff1ff0] [c000e7e0] call_do_irq+0x24/0x3c
      [   70.017683] [c06dbe80] [c0006bac] do_IRQ+0x64/0xc4
      [   70.022474] [c06dbea0] [c001097c] ret_from_except+0x0/0x14
      [   70.027964] --- interrupt: 501 at rcu_idle_exit+0x84/0x90
      [   70.027964]     LR = rcu_idle_exit+0x74/0x90
      [   70.037585] [c06dbf60] [20000000] 0x20000000 (unreliable)
      [   70.042984] [c06dbf80] [c004bb0c] do_idle+0xb4/0x11c
      [   70.047945] [c06dbfa0] [c004bd14] cpu_startup_entry+0x18/0x1c
      [   70.053682] [c06dbfb0] [c05fb034] start_kernel+0x370/0x384
      [   70.059153] [c06dbff0] [00003438] 0x3438
      [   70.063062] Instruction dump:
      [   70.066023] 38a00000 38800000 90010014 4bfff015 80010014 7c0803a6 3123ffff 7c691910
      [   70.073767] 38210010 4e800020 38600000 4e800020 <80e3005c> 80c30098 3107ffff 7d083910
      [   70.081690] ---[ end trace be7ccd9c1e1a9f12 ]---
      
      This patch forbids the modification of the number of buffers in the
      ring while the interface is running.
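In outline, the restriction amounts to a guard like the one below. This is a userspace sketch with hypothetical names (`struct nic_model`, `set_rx_ring_len`); the choice of -EBUSY as the error code is an assumption, not taken from the message above.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the interface state relevant to the fix. */
struct nic_model {
	bool running;    /* interface is up */
	int rx_ring_len; /* number of RX buffers, allocated at open time */
};

/* Change the RX ring length; refused while the interface is running,
 * because the buffers were sized when the device was opened. */
int set_rx_ring_len(struct nic_model *nic, int new_len)
{
	if (nic->running)
		return -EBUSY; /* assumption: error code chosen for the sketch */

	nic->rx_ring_len = new_len;
	return 0;
}
```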
      
      Fixes: ac421852 ("ucc_geth: add ethtool support")
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: seeq: fix crash caused by not set dev.parent · abda892d
      Thomas Bogendoerfer authored
      [ Upstream commit 5afcd14c ]
      
      The old MIPS implementation of dma_cache_sync() didn't use the dev argument,
      but commit c9eb6172 ("dma-mapping: turn dma_cache_sync into a
      dma_map_ops method") changed that, so we now need to set dev.parent.
      
      Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: macb: Change interrupt and napi enable order in open · 5f80c82f
      Harini Katakam authored
      [ Upstream commit 05044531 ]
      
      Current order in open:
      -> Enable interrupts (macb_init_hw)
      -> Enable NAPI
      -> Start PHY
      
      Sequence of RX handling:
      -> RX interrupt occurs
      -> Interrupt is cleared and interrupt bits disabled in handler
      -> NAPI is scheduled
      -> In NAPI, RX budget is processed and RX interrupts are re-enabled
      
      With the above, on QEMU or fixed link setups (where PHY state doesn't
      matter), there's a chance that a macb RX interrupt occurs before NAPI
      is enabled. This will result in NAPI being scheduled before it is
      enabled. Fix this in macb open by changing the order.
      
      Fixes: ae1f2a56 ("net: macb: Added support for many RX queues")
      Signed-off-by: Harini Katakam <harini.katakam@xilinx.com>
      Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: ethernet: stmmac: dwmac-sun8i: enable support of unicast filtering · a344dc5c
      Corentin Labbe authored
      [ Upstream commit d4c26eb6 ]
      
      When adding more MAC addresses to a dwmac-sun8i interface, the device goes
      directly into promiscuous mode.
      This is due to the missing IFF_UNICAST_FLT flag.
      
      Since the hardware supports unicast filtering, let's add IFF_UNICAST_FLT.
      
      Fixes: 9f93ac8d ("net-next: stmmac: Add dwmac-sun8i")
      Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: dsa: Fix error cleanup path in dsa_init_module · 2cb6d28b
      YueHaibing authored
      [ Upstream commit 68be9302 ]
      
      BUG: unable to handle kernel paging request at ffffffffa01c5430
      PGD 3270067 P4D 3270067 PUD 3271063 PMD 230bc5067 PTE 0
      Oops: 0000 [#1
      CPU: 0 PID: 6159 Comm: modprobe Not tainted 5.1.0+ #33
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
      RIP: 0010:raw_notifier_chain_register+0x16/0x40
      Code: 63 f8 66 90 e9 5d ff ff ff 90 90 90 90 90 90 90 90 90 90 90 55 48 8b 07 48 89 e5 48 85 c0 74 1c 8b 56 10 3b 50 10 7e 07 eb 12 <39> 50 10 7c 0d 48 8d 78 08 48 8b 40 08 48 85 c0 75 ee 48 89 46 08
      RSP: 0018:ffffc90001c33c08 EFLAGS: 00010282
      RAX: ffffffffa01c5420 RBX: ffffffffa01db420 RCX: 4fcef45928070a8b
      RDX: 0000000000000000 RSI: ffffffffa01db420 RDI: ffffffffa01b0068
      RBP: ffffc90001c33c08 R08: 000000003e0a33d0 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000094443661 R12: ffff88822c320700
      R13: ffff88823109be80 R14: 0000000000000000 R15: ffffc90001c33e78
      FS:  00007fab8bd08540(0000) GS:ffff888237a00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: ffffffffa01c5430 CR3: 00000002297ea000 CR4: 00000000000006f0
      Call Trace:
       register_netdevice_notifier+0x43/0x250
       ? 0xffffffffa01e0000
       dsa_slave_register_notifier+0x13/0x70 [dsa_core
       ? 0xffffffffa01e0000
       dsa_init_module+0x2e/0x1000 [dsa_core
       do_one_initcall+0x6c/0x3cc
       ? do_init_module+0x22/0x1f1
       ? rcu_read_lock_sched_held+0x97/0xb0
       ? kmem_cache_alloc_trace+0x325/0x3b0
       do_init_module+0x5b/0x1f1
       load_module+0x1db1/0x2690
       ? m_show+0x1d0/0x1d0
       __do_sys_finit_module+0xc5/0xd0
       __x64_sys_finit_module+0x15/0x20
       do_syscall_64+0x6b/0x1d0
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Clean up allocated resources if there are errors,
      otherwise it will trigger a memleak.
      
      Fixes: c9eb3e0f ("net: dsa: Add support for learning FDB through notification")
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ipv4: Fix raw socket lookup for local traffic · 353b3fd8
      David Ahern authored
      [ Upstream commit 19e4e768 ]
      
      inet_iif should be used for the raw socket lookup. inet_iif considers
      rt_iif which handles the case of local traffic.
      
      As it stands, ping to a local address with the '-I <dev>' option fails
      ever since ping was changed to use SO_BINDTODEVICE instead of
      cmsg + IP_PKTINFO.
      
      IPv6 works fine.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • fib_rules: return 0 directly if an exactly same rule exists when NLM_F_EXCL not supplied · e83a63de
      Hangbin Liu authored
      [ Upstream commit e9919a24 ]
      
      With commit 153380ec ("fib_rules: Added NLM_F_EXCL support to
      fib_nl_newrule") we are now able to check if a rule already exists. But
      this only works with iproute2. Other tools, like libnl and NetworkManager,
      can still add duplicate rules with only the NLM_F_CREATE flag, like
      
      [localhost ~ ]# ip rule
      0:      from all lookup local
      32766:  from all lookup main
      32767:  from all lookup default
      100000: from 192.168.7.5 lookup 5
      100000: from 192.168.7.5 lookup 5
      
      As it doesn't make sense to create two duplicate rules, let's just return
      0 if the rule exists.
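The resulting decision can be summarised in a few lines of C. This is a sketch with a hypothetical helper, `check_duplicate`, that only models the outcome; NLM_F_EXCL and NLM_F_CREATE are the real netlink flags from `<linux/netlink.h>`, and returning -EEXIST in the NLM_F_EXCL case is assumed from the usual netlink convention.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Decide what a "new rule" request should do.
 * Sets *do_insert when the caller should go on and add the rule. */
int check_duplicate(bool rule_exists, unsigned int nlmsg_flags, bool *do_insert)
{
	*do_insert = false;

	if (rule_exists) {
		/* The caller insisted on exclusivity: report the clash. */
		if (nlmsg_flags & NLM_F_EXCL)
			return -EEXIST;
		/* Otherwise treat the request as already satisfied. */
		return 0;
	}

	*do_insert = true;
	return 0;
}
```

A request carrying only NLM_F_CREATE for a rule that already exists therefore succeeds without adding a second copy.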
      
      Fixes: 153380ec ("fib_rules: Added NLM_F_EXCL support to fib_nl_newrule")
      Reported-by: Thomas Haller <thaller@redhat.com>
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • dpaa_eth: fix SG frame cleanup · e996c41b
      Laurentiu Tudor authored
      [ Upstream commit 17170e65 ]
      
      Fix issue with the entry indexing in the sg frame cleanup code being
      off-by-1. This problem showed up when doing some basic iperf tests and
      manifested in traffic coming to a halt.
      
      Signed-off-by: Laurentiu Tudor <laurentiu.tudor@nxp.com>
      Acked-by: Madalin Bucur <madalin.bucur@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • bridge: Fix error path for kobject_init_and_add() · 73082d24
      Tobin C. Harding authored
      [ Upstream commit bdfad5ae ]
      
      Currently an error return from kobject_init_and_add() is not followed by
      a call to kobject_put().  This means there is a memory leak.  We currently
      set p to NULL so that kfree() may be called on it as a noop; the code is
      arguably clearer if we move the kfree() up closer to where it is
      called (instead of after the goto jump).
      
      Remove a goto label 'err1' and jump to call to kobject_put() in error
      return from kobject_init_and_add() fixing the memory leak.  Re-name goto
      label 'put_back' to 'err1' now that we don't use err1, following current
      nomenclature (err1, err2 ...).  Move call to kfree out of the error
      code at bottom of function up to closer to where memory was allocated.
      Add comment to clarify call to kfree().
      
      Signed-off-by: Tobin C. Harding <tobin@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • bonding: fix arp_validate toggling in active-backup mode · 9035a941
      Jarod Wilson authored
      [ Upstream commit a9b8a2b3 ]
      
      There's currently a problem with toggling arp_validate on and off with an
      active-backup bond. At the moment, you can start up a bond, like so:
      
      modprobe bonding mode=1 arp_interval=100 arp_validate=0 arp_ip_targets=192.168.1.1
      ip link set bond0 down
      echo "ens4f0" > /sys/class/net/bond0/bonding/slaves
      echo "ens4f1" > /sys/class/net/bond0/bonding/slaves
      ip link set bond0 up
      ip addr add 192.168.1.2/24 dev bond0
      
      Pings to 192.168.1.1 work just fine. Now turn on arp_validate:
      
      echo 1 > /sys/class/net/bond0/bonding/arp_validate
      
      Pings to 192.168.1.1 continue to work just fine. Now when you go to turn
      arp_validate off again, the link falls flat on its face:
      
      echo 0 > /sys/class/net/bond0/bonding/arp_validate
      dmesg
      ...
      [133191.911987] bond0: Setting arp_validate to none (0)
      [133194.257793] bond0: bond_should_notify_peers: slave ens4f0
      [133194.258031] bond0: link status definitely down for interface ens4f0, disabling it
      [133194.259000] bond0: making interface ens4f1 the new active one
      [133197.330130] bond0: link status definitely down for interface ens4f1, disabling it
      [133197.331191] bond0: now running without any active interface!
      
      The problem lies in bond_options.c, where passing in arp_validate=0
      results in bond->recv_probe getting set to NULL. This flies directly in
      the face of commit 3fe68df9, which says we need to set recv_probe =
      bond_arp_rcv, even if we're not using arp_validate. Said commit fixed
      this in bond_option_arp_interval_set, but missed that we can get to that
      same state in bond_option_arp_validate_set as well.
      
      One solution would be to universally set recv_probe = bond_arp_rcv here
      as well, but I don't think bond_option_arp_validate_set has any business
      touching recv_probe at all, and that should be left to the arp_interval
      code, so we can just make things much tidier here.
      
      Fixes: 3fe68df9 ("bonding: always set recv_probe to bond_arp_rcv in arp monitor")
      CC: Jay Vosburgh <j.vosburgh@gmail.com>
      CC: Veaceslav Falico <vfalico@gmail.com>
      CC: Andy Gospodarek <andy@greyhouse.net>
      CC: "David S. Miller" <davem@davemloft.net>
      CC: netdev@vger.kernel.org
      Signed-off-by: Jarod Wilson <jarod@redhat.com>
      Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Don't jump to compute_result state from check_result state · 57e60bc8
      Nigel Croxon authored
      commit 4f4fd7c5 upstream.
      
      Changing state from check_state_check_result to
      check_state_compute_result not only is unsafe but also doesn't
      appear to serve a valid purpose.  A raid6 check should only be
      pushing out extra writes if doing repair and a mis-match occurs.
      The stripe dev management will already try and do repair writes
      for failing sectors.
      
      This patch makes the raid6 check_state_check_result handling
      work more like raid5's.  If there are somehow too many failures for a
      check, just quit the check operation for the stripe.  When any
      checks pass, don't try and use check_state_compute_result for
      a purpose it isn't needed for and is unsafe for.  Just mark the
      stripe as in sync for passing its parity checks and let the
      stripe dev read/write code and the bad blocks list do their
      job handling I/O errors.
      
      Repro steps from Xiao:
      
      These are the steps to reproduce this problem:
      1. redefine OPT_MEDIUM_ERR_ADDR to 12000 in scsi_debug.c
      2. insmod scsi_debug.ko dev_size_mb=11000  max_luns=1 num_tgts=1
      3. mdadm --create /dev/md127 --level=6 --raid-devices=5 /dev/sde1 /dev/sde2 /dev/sde3 /dev/sde5 /dev/sde6
         (sde is the disk created by scsi_debug)
      4. echo "2" >/sys/module/scsi_debug/parameters/opts
      5. raid-check
      
      It panics:
      [ 4854.730899] md: data-check of RAID array md127
      [ 4854.857455] sd 5:0:0:0: [sdr] tag#80 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
      [ 4854.859246] sd 5:0:0:0: [sdr] tag#80 Sense Key : Medium Error [current]
      [ 4854.860694] sd 5:0:0:0: [sdr] tag#80 Add. Sense: Unrecovered read error
      [ 4854.862207] sd 5:0:0:0: [sdr] tag#80 CDB: Read(10) 28 00 00 00 2d 88 00 04 00 00
      [ 4854.864196] print_req_error: critical medium error, dev sdr, sector 11656 flags 0
      [ 4854.867409] sd 5:0:0:0: [sdr] tag#100 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
      [ 4854.869469] sd 5:0:0:0: [sdr] tag#100 Sense Key : Medium Error [current]
      [ 4854.871206] sd 5:0:0:0: [sdr] tag#100 Add. Sense: Unrecovered read error
      [ 4854.872858] sd 5:0:0:0: [sdr] tag#100 CDB: Read(10) 28 00 00 00 2e e0 00 00 08 00
      [ 4854.874587] print_req_error: critical medium error, dev sdr, sector 12000 flags 4000
      [ 4854.876456] sd 5:0:0:0: [sdr] tag#101 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
      [ 4854.878552] sd 5:0:0:0: [sdr] tag#101 Sense Key : Medium Error [current]
      [ 4854.880278] sd 5:0:0:0: [sdr] tag#101 Add. Sense: Unrecovered read error
      [ 4854.881846] sd 5:0:0:0: [sdr] tag#101 CDB: Read(10) 28 00 00 00 2e e8 00 00 08 00
      [ 4854.883691] print_req_error: critical medium error, dev sdr, sector 12008 flags 4000
      [ 4854.893927] sd 5:0:0:0: [sdr] tag#166 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
      [ 4854.896002] sd 5:0:0:0: [sdr] tag#166 Sense Key : Medium Error [current]
      [ 4854.897561] sd 5:0:0:0: [sdr] tag#166 Add. Sense: Unrecovered read error
      [ 4854.899110] sd 5:0:0:0: [sdr] tag#166 CDB: Read(10) 28 00 00 00 2e e0 00 00 10 00
      [ 4854.900989] print_req_error: critical medium error, dev sdr, sector 12000 flags 0
      [ 4854.902757] md/raid:md127: read error NOT corrected!! (sector 9952 on sdr1).
      [ 4854.904375] md/raid:md127: read error NOT corrected!! (sector 9960 on sdr1).
      [ 4854.906201] ------------[ cut here ]------------
      [ 4854.907341] kernel BUG at drivers/md/raid5.c:4190!
      
      raid5.c:4190 above is this BUG_ON:
      
          handle_parity_checks6()
              ...
              BUG_ON(s->uptodate < disks - 1); /* We don't need Q to recover */
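
      As a rough illustration of that condition (a sketch only, not the
      kernel's code): with the 5-device array from step 3, the check
      requires at least disks - 1 = 4 up-to-date devices in the stripe.
      The struct and helper below are hypothetical stand-ins, and treating
      each failed read as one fewer up-to-date device is a simplifying
      assumption.

```c
/*
 * Hypothetical stand-in for the one field the check reads; the real
 * stripe state structure in the md driver carries many more fields.
 */
struct stripe_state {
    int uptodate;   /* devices in the stripe that hold valid data */
};

/* Returns 1 when the expression inside the BUG_ON would be true. */
static int parity_check_would_bug(const struct stripe_state *s, int disks)
{
    return s->uptodate < disks - 1;
}

/*
 * Assumption for illustration: every failed read leaves one fewer
 * up-to-date device in the stripe.
 */
static int would_bug_after_failures(int disks, int failed_reads)
{
    struct stripe_state s = { .uptodate = disks - failed_reads };

    return parity_check_would_bug(&s, disks);
}
```

      Under these assumptions, would_bug_after_failures(5, 0) and
      would_bug_after_failures(5, 1) return 0, while
      would_bug_after_failures(5, 2) returns 1.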
      
      Cc: <stable@vger.kernel.org> # v3.16+
      OriginalAuthor: David Jeffery <djeffery@redhat.com>
      Cc: Xiao Ni <xni@redhat.com>
      Tested-by: David Jeffery <djeffery@redhat.com>
      Signed-off-by: David Jeffy <djeffery@redhat.com>
      Signed-off-by: Nigel Croxon <ncroxon@redhat.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • rtlwifi: rtl8723ae: Fix missing break in switch statement · 4c72b658
      Gustavo A. R. Silva authored
      commit 84242b82 upstream.
      
      Add missing break statement in order to prevent the code from falling
      through to case 0x1025, and erroneously setting rtlhal->oem_id to
      RT_CID_819X_ACER when rtlefuse->eeprom_svid is equal to 0x10EC and
      none of the cases in switch (rtlefuse->eeprom_smid) match.
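
      The fall-through can be sketched as follows. This is a reduced
      model, not the driver source: the enum values and function names
      are hypothetical, and the inner switch keeps only its default
      branch to model the "no case matches" path.

```c
/* Hypothetical stand-ins for the driver's customer ID constants. */
enum oem_id { CID_DEFAULT, CID_ACER };

/* Buggy shape: the outer case 0x10EC has no break of its own. */
static enum oem_id pick_oem_id_buggy(unsigned int svid, unsigned int smid)
{
    enum oem_id id = CID_DEFAULT;

    switch (svid) {
    case 0x10EC:
        switch (smid) {
        default:
            id = CID_DEFAULT;
            break;      /* leaves only the inner switch */
        }
        /* The outer case is missing its break here. */
        /* falls through */
    case 0x1025:
        id = CID_ACER;  /* overwrites the value chosen above */
        break;
    default:
        break;
    }
    return id;
}

/* Fixed shape: a break ends the outer case 0x10EC. */
static enum oem_id pick_oem_id_fixed(unsigned int svid, unsigned int smid)
{
    enum oem_id id = CID_DEFAULT;

    switch (svid) {
    case 0x10EC:
        switch (smid) {
        default:
            id = CID_DEFAULT;
            break;
        }
        break;          /* the added statement */
    case 0x1025:
        id = CID_ACER;
        break;
    default:
        break;
    }
    return id;
}
```

      For svid 0x10EC with a non-matching smid, the buggy variant
      returns CID_ACER and the fixed variant returns CID_DEFAULT. Both
      still return CID_ACER for svid 0x1025.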
      
      This bug was found thanks to the ongoing efforts to enable
      -Wimplicit-fallthrough.
      
      Fixes: 238ad2dd ("rtlwifi: rtl8723ae: Clean up the hardware info routine")
      Cc: stable@vger.kernel.org
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>