1. 17 Apr, 2019 40 commits
    • arm64: futex: Fix FUTEX_WAKE_OP atomic ops with non-zero result value · b8dba39c
      Will Deacon authored
      commit 045afc24 upstream.
      
      Rather embarrassingly, our futex() FUTEX_WAKE_OP implementation doesn't
      explicitly set the return value on the non-faulting path and instead
      leaves it holding the result of the underlying atomic operation. This
      means that any FUTEX_WAKE_OP atomic operation which computes a non-zero
      value will be reported as having failed. Regrettably, I wrote the buggy
      code back in 2011 and it was upstreamed as part of the initial arm64
      support in 2012.
      
      The reasons we appear to get away with this are:
      
        1. FUTEX_WAKE_OP is rarely used and therefore doesn't appear to get
           exercised by futex() test applications
      
        2. If the result of the atomic operation is zero, the system call
           behaves correctly
      
        3. Prior to version 2.25, the only operation used by GLIBC set the
           futex to zero, and therefore worked as expected. From 2.25 onwards,
           FUTEX_WAKE_OP is not used by GLIBC at all.
      
      Fix the implementation by ensuring that the return value is either 0
      to indicate that the atomic operation completed successfully, or -EFAULT
      if we encountered a fault when accessing the user mapping.
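
      The difference between the buggy and the fixed return path can be
      sketched in plain C. This is only a userspace illustration: the
      function names, the "fault" flag and the add operation are assumptions
      made for the example, not the arm64 assembly from the patch.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the atomic op: adds arg to *uaddr. */
static int buggy_wake_op(int *uaddr, int arg, bool fault, int *oldval)
{
    int ret;

    if (fault)
        return -EFAULT;
    *oldval = *uaddr;
    ret = *uaddr + arg;   /* result of the "atomic" operation */
    *uaddr = ret;
    return ret;           /* BUG: a non-zero result is reported as failure */
}

/* Fixed variant: the return value is only ever 0 or -EFAULT. */
static int fixed_wake_op(int *uaddr, int arg, bool fault, int *oldval)
{
    if (fault)
        return -EFAULT;
    *oldval = *uaddr;
    *uaddr += arg;
    return 0;
}
```

      With a futex word of 1 and an add of 1, the buggy variant returns 2
      while the fixed variant returns 0 and still stores 2.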
      
      Cc: <stable@kernel.org>
      Fixes: 6170a974 ("arm64: Atomic operations")
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ARM: dts: at91: Fix typo in ISC_D0 on PC9 · 377b54a6
      David Engraf authored
      commit e7dfb6d0 upstream.
      
      The function argument for the ISC_D0 on PC9 was incorrect. According to
      the documentation it should be 'C' aka 3.

      Signed-off-by: David Engraf <david.engraf@sysgo.com>
      Reviewed-by: Nicolas Ferre <nicolas.ferre@microchip.com>
      Signed-off-by: Ludovic Desroches <ludovic.desroches@microchip.com>
      Fixes: 7f16cb67 ("ARM: at91/dt: add sama5d2 pinmux")
      Cc: <stable@vger.kernel.org> # v4.4+
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ARM: dts: am335x-evm: Correct the regulators for the audio codec · 84a8a44a
      Peter Ujfalusi authored
      commit 4f96dc0a upstream.
      
      Correctly map the regulators used by tlv320aic3106.
      Both the 1.8V and 3.3V supplies for the codec are derived from VBAT via fixed regulators.
      
      Cc: <Stable@vger.kernel.org> # v4.14+
      Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
      Signed-off-by: Tony Lindgren <tony@atomide.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ARM: dts: am335x-evmsk: Correct the regulators for the audio codec · 9af55767
      Peter Ujfalusi authored
      commit 66913706 upstream.
      
      Correctly map the regulators used by tlv320aic3106.
      Both the 1.8V and 3.3V supplies for the codec are derived from VBAT via fixed regulators.
      
      Cc: <Stable@vger.kernel.org> # v4.14+
      Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
      Signed-off-by: Tony Lindgren <tony@atomide.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • virtio: Honour 'may_reduce_num' in vring_create_virtqueue · 1b69a78a
      Cornelia Huck authored
      commit cf94db21 upstream.
      
      vring_create_virtqueue() allows the caller to specify via the
      may_reduce_num parameter whether the vring code is allowed to
      allocate a smaller ring than specified.
      
      However, the split ring allocation code tries to allocate a
      smaller ring on allocation failure regardless of what the
      caller specified. This may cause trouble for e.g. virtio-pci
      in legacy mode, which does not support ring resizing. (The
      packed ring code does not resize in any case.)
      
      Let's fix this by bailing out immediately in the split ring code
      if the requested size cannot be allocated and may_reduce_num has
      not been specified.
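
      A minimal sketch of that bail-out, assuming a hypothetical allocator
      callback in place of the real DMA allocation. create_ring(),
      ring_alloc_fn and fits_up_to_64() are illustrative names only, not the
      vring code itself.

```c
#include <stdbool.h>

/* Hypothetical allocator hook: true if a ring of `num` entries can be allocated. */
typedef bool (*ring_alloc_fn)(unsigned int num);

/* Returns the ring size actually allocated, or 0 on failure. */
static unsigned int create_ring(unsigned int num, bool may_reduce_num,
                                ring_alloc_fn try_alloc)
{
    for (; num; num /= 2) {
        if (try_alloc(num))
            return num;
        if (!may_reduce_num)
            return 0;   /* caller needs exactly this size: bail out */
    }
    return 0;
}

/* Example allocator that only succeeds for rings of at most 64 entries. */
static bool fits_up_to_64(unsigned int num)
{
    return num <= 64;
}
```

      A request for 256 entries shrinks to 64 when may_reduce_num is set and
      fails outright when it is not.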
      
      While at it, fix a typo in the usage instructions.
      
      Fixes: 2a2d1382 ("virtio: Add improved queue allocation API")
      Cc: stable@vger.kernel.org # v4.6+
      Signed-off-by: Cornelia Huck <cohuck@redhat.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
      Reviewed-by: Jens Freimann <jfreimann@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • genirq: Initialize request_mutex if CONFIG_SPARSE_IRQ=n · 82e1fb4d
      Kefeng Wang authored
      commit e8458e7a upstream.
      
      When CONFIG_SPARSE_IRQ is disabled, the request_mutex in struct irq_desc
      is not initialized, which causes malfunction.
      
      Fixes: 9114014c ("genirq: Add mutex to irq desc to serialize request/free_irq()")
      Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: <linux-arm-kernel@lists.infradead.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20190404074512.145533-1-wangkefeng.wang@huawei.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • genirq: Respect IRQCHIP_SKIP_SET_WAKE in irq_chip_set_wake_parent() · 3559f73e
      Stephen Boyd authored
      commit 325aa195 upstream.
      
      If a child irqchip calls irq_chip_set_wake_parent() but its parent irqchip
      has the IRQCHIP_SKIP_SET_WAKE flag set an error is returned.
      
      This is inconsistent behaviour vs. set_irq_wake_real() which returns 0 when
      the irqchip has the IRQCHIP_SKIP_SET_WAKE flag set. It doesn't attempt to
      walk the chain of parents and set irq wake on any chips that don't have the
      flag set either. If the intent is to call the .irq_set_wake() callback of
      the parent irqchip, then we expect irqchip implementations to omit the
      IRQCHIP_SKIP_SET_WAKE flag and implement an .irq_set_wake() function that
      calls irq_chip_set_wake_parent().
      
      The problem has been observed on a Qualcomm sdm845 device where set wake
      fails on any GPIO interrupts after applying work in progress wakeup irq
      patches to the GPIO driver. The chain of chips looks like this:
      
           QCOM GPIO -> QCOM PDC (SKIP) -> ARM GIC (SKIP)
      
      The GPIO controller's parent is the QCOM PDC irqchip which in turn has ARM
      GIC as parent.  The QCOM PDC irqchip has the IRQCHIP_SKIP_SET_WAKE flag
      set, and so does the grandparent ARM GIC.
      
      The GPIO driver doesn't know if the parent needs to set wake or not, so it
      unconditionally calls irq_chip_set_wake_parent() causing this function to
      return a failure because the parent irqchip (PDC) doesn't have the
      .irq_set_wake() callback set. Returning 0 instead makes everything work and
      irqs from the GPIO controller can be configured for wakeup.
      
      Make it consistent by returning 0 (success) from irq_chip_set_wake_parent()
      when a parent chip has IRQCHIP_SKIP_SET_WAKE set.
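
      The intended behaviour can be modelled in a few lines of userspace C.
      struct chip, SKIP_SET_WAKE and set_wake_parent() are simplified
      stand-ins chosen for this sketch, not the genirq data structures.

```c
#include <errno.h>
#include <stddef.h>

#define SKIP_SET_WAKE 0x1u   /* stand-in for IRQCHIP_SKIP_SET_WAKE */

struct chip {
    unsigned int flags;
    int (*set_wake)(struct chip *chip, int on);
    struct chip *parent;
};

/* Forward a set-wake request to the parent chip. */
static int set_wake_parent(struct chip *child, int on)
{
    struct chip *parent = child->parent;

    if (!parent)
        return -ENOSYS;
    if (parent->flags & SKIP_SET_WAKE)
        return 0;            /* parent opted out: report success */
    if (parent->set_wake)
        return parent->set_wake(parent, on);
    return -ENOSYS;          /* no flag and no callback: still an error */
}
```

      For the GPIO -> PDC (SKIP) -> GIC (SKIP) chain described above, the
      call made on behalf of the GPIO chip returns 0 in this model.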
      
      [ tglx: Massaged changelog ]
      
      Fixes: 08b55e2a ("genirq: Add irqchip_set_wake_parent")
      Signed-off-by: Stephen Boyd <swboyd@chromium.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-gpio@vger.kernel.org
      Cc: Lina Iyer <ilina@codeaurora.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20190325181026.247796-1-swboyd@chromium.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • block: fix the return errno for direct IO · b6991eb2
      Jason Yan authored
      commit a89afe58 upstream.
      
      If the last bio returned is not dio->bio, the status of that bio is
      not assigned to dio->bio when it completes with an error. This causes
      the status of the whole IO to be wrong.
      
          ksoftirqd/21-117   [021] ..s.  4017.966090:   8,0    C   N 4883648 [0]
                <idle>-0     [018] ..s.  4017.970888:   8,0    C  WS 4924800 + 1024 [0]
                <idle>-0     [018] ..s.  4017.970909:   8,0    D  WS 4935424 + 1024 [<idle>]
                <idle>-0     [018] ..s.  4017.970924:   8,0    D  WS 4936448 + 321 [<idle>]
          ksoftirqd/21-117   [021] ..s.  4017.995033:   8,0    C   R 4883648 + 336 [65475]
          ksoftirqd/21-117   [021] d.s.  4018.001988: myprobe1: (blkdev_bio_end_io+0x0/0x168) bi_status=7
          ksoftirqd/21-117   [021] d.s.  4018.001992: myprobe: (aio_complete_rw+0x0/0x148) x0=0xffff802f2595ad80 res=0x12a000 res2=0x0
      
      We always have to assign bio->bi_status to dio->bio.bi_status because we
      will only check dio->bio.bi_status when we return the whole IO to
      the upper layer.
      
      Fixes: 542ff7bf ("block: new direct I/O implementation")
      Cc: stable@vger.kernel.org
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Jens Axboe <axboe@kernel.dk>
      Reviewed-by: Ming Lei <ming.lei@redhat.com>
      Signed-off-by: Jason Yan <yanaijie@huawei.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • block: do not leak memory in bio_copy_user_iov() · 6ec54fc4
      Jérôme Glisse authored
      commit a3761c3c upstream.
      
      When bio_add_pc_page() fails in bio_copy_user_iov() we should free
      the page we just allocated otherwise we are leaking it.
      
      Cc: linux-block@vger.kernel.org
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: stable@vger.kernel.org
      Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • btrfs: prop: fix vanished compression property after failed set · 2fc37a0a
      Anand Jain authored
      commit 272e5326 upstream.
      
      The compression property resets to NULL, instead of the old value if we
      fail to set the new compression parameter.
      
        $ btrfs prop get /btrfs compression
          compression=lzo
        $ btrfs prop set /btrfs compression zli
          ERROR: failed to set compression for /btrfs: Invalid argument
        $ btrfs prop get /btrfs compression
      
      This is because the compression property ->validate() is successful for
      'zli' as the strncmp() used the length passed from the userspace.
      
      Fix it by using the expected string length in strncmp().
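
      The effect of the comparison length is easy to reproduce in userspace.
      The sketch below assumes NUL-terminated input and uses hypothetical
      helper names; it only mirrors the strncmp() usage described above.

```c
#include <stdbool.h>
#include <string.h>

/* Buggy: compares only the first `len` bytes supplied by the caller. */
static bool validate_user_len(const char *value, size_t len)
{
    return strncmp("lzo", value, len) == 0 ||
           strncmp("zlib", value, len) == 0;
}

/* Fixed: compares against the full length of each expected name. */
static bool validate_expected_len(const char *value)
{
    return strncmp("lzo", value, 3) == 0 ||
           strncmp("zlib", value, 4) == 0;
}
```

      Here validate_user_len("zli", 3) accepts the truncated name, while
      validate_expected_len("zli") rejects it and still accepts "zlib" and
      "lzo".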
      
      Fixes: 63541927 ("Btrfs: add support for inode properties")
      Fixes: 5c1aab1d ("btrfs: Add zstd support")
      CC: stable@vger.kernel.org # 4.14+
      Reviewed-by: Nikolay Borisov <nborisov@suse.com>
      Signed-off-by: Anand Jain <anand.jain@oracle.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • btrfs: prop: fix zstd compression parameter validation · 979409e6
      Anand Jain authored
      commit 50398fde upstream.
      
      We accept a zstd compression parameter even if it is not fully valid.
      For example:
      
        $ btrfs prop set /btrfs compression zst
        $ btrfs prop get /btrfs compression
           compression=zst
      
      zlib and lzo are fine.
      
      Fix it by checking the correct prefix length.
      
      Fixes: 5c1aab1d ("btrfs: Add zstd support")
      CC: stable@vger.kernel.org # 4.14+
      Reviewed-by: Nikolay Borisov <nborisov@suse.com>
      Signed-off-by: Anand Jain <anand.jain@oracle.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Btrfs: do not allow trimming when a fs is mounted with the nologreplay option · 3eb52487
      Filipe Manana authored
      commit f35f06c3 upstream.
      
      When a filesystem is mounted with the nologreplay mount option, which
      requires it to be mounted in RO mode as well, we can not allow discard on
      free space inside block groups, because log trees refer to extents that
      are not pinned in a block group's free space cache (pinning the extents is
      precisely the first phase of replaying a log tree).
      
      So do not allow the fitrim ioctl to do anything when the filesystem is
      mounted with the nologreplay option, because later it can be mounted RW
      without that option, which causes log replay to happen and result in
      either a failure to replay the log trees (leading to a mount failure), a
      crash or some silent corruption.

      Reported-by: Darrick J. Wong <darrick.wong@oracle.com>
      Fixes: 96da0919 ("btrfs: Introduce new mount option to disable tree log replay")
      CC: stable@vger.kernel.org # 4.9+
      Reviewed-by: Nikolay Borisov <nborisov@suse.com>
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ASoC: fsl_esai: fix channel swap issue when stream starts · 541e7568
      S.j. Wang authored
      commit 0ff4e8c6 upstream.
      
      There is a very low probability ( < 0.1% ) that a channel swap happens
      at the beginning of a stream when multiple output/input pins are
      enabled. The issue is that the hardware can't send data to the correct
      pin at the beginning with the normal enable flow.

      This is a hardware issue, but there is no erratum. The workaround flow
      is: each time playback/recording starts, first clear xSMA/xSMB, then
      enable TE/RE, then enable xSMB and xSMA (xSMB must be enabled before
      xSMA). This uses xSMA as the trigger start register; previously the
      xCR_TE or xCR_RE bit was used for starting.
      
      Fixes: 43d24e76 ("ASoC: fsl_esai: Add ESAI CPU DAI driver")
      Cc: <stable@vger.kernel.org>
      Reviewed-by: Fabio Estevam <festevam@gmail.com>
      Acked-by: Nicolin Chen <nicoleotsuka@gmail.com>
      Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • include/linux/bitrev.h: fix constant bitrev · ed031128
      Arnd Bergmann authored
      commit 6147e136 upstream.
      
      clang points out with hundreds of warnings that the bitrev macros have a
      problem with constant input:
      
        drivers/hwmon/sht15.c:187:11: error: variable '__x' is uninitialized when used within its own initialization
              [-Werror,-Wuninitialized]
                u8 crc = bitrev8(data->val_status & 0x0F);
                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        include/linux/bitrev.h:102:21: note: expanded from macro 'bitrev8'
                __constant_bitrev8(__x) :                       \
                ~~~~~~~~~~~~~~~~~~~^~~~
        include/linux/bitrev.h:67:11: note: expanded from macro '__constant_bitrev8'
                u8 __x = x;                     \
                   ~~~   ^
      
      Both the bitrev and the __constant_bitrev macros use an internal
      variable named __x, which goes horribly wrong when passing one to the
      other.
      
      The obvious fix is to rename one of the variables, so this adds an extra
      '_'.
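
      A userspace sketch of the name clash and its fix follows. It relies on
      GNU statement expressions and __builtin_constant_p, so it assumes gcc
      or clang; the macro and variable names are illustrative, not the ones
      in bitrev.h. If CONST_BITREV8 also named its temporary x_, the
      expansion inside BITREV8 would read "uint8_t x_ = (x_);", which
      initializes the variable from itself.

```c
#include <stdint.h>

/* Generic (non-constant) path: reverse the bits one at a time. */
static uint8_t slow_bitrev8(uint8_t v)
{
    uint8_t r = 0;
    int i;

    for (i = 0; i < 8; i++) {
        r = (uint8_t)((r << 1) | (v & 1));
        v >>= 1;
    }
    return r;
}

/* Constant path: the temporary is named cx_, distinct from x_ below. */
#define CONST_BITREV8(x)                                          \
({                                                                \
    uint8_t cx_ = (x);                                            \
    cx_ = (uint8_t)((cx_ >> 4) | (cx_ << 4));                     \
    cx_ = (uint8_t)(((cx_ & 0xCC) >> 2) | ((cx_ & 0x33) << 2));   \
    cx_ = (uint8_t)(((cx_ & 0xAA) >> 1) | ((cx_ & 0x55) << 1));   \
    cx_;                                                          \
})

/* Dispatcher: passes its own temporary x_ on to CONST_BITREV8. */
#define BITREV8(x)                                                \
({                                                                \
    uint8_t x_ = (x);                                             \
    __builtin_constant_p(x_) ? CONST_BITREV8(x_)                  \
                             : slow_bitrev8(x_);                  \
})
```

      Both paths compute the same value, so the result does not depend on
      which branch the compiler picks.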
      
      It seems we got away with this because
      
       - there are only a few drivers using bitrev macros
      
       - usually there are no constant arguments to those
      
       - when they are constant, they tend to be either 0 or (unsigned)-1
         (drivers/isdn/i4l/isdnhdlc.o, drivers/iio/amplifiers/ad8366.c) and
         give the correct result by pure chance.
      
      In fact, the only driver that I could find that gets different results
      with this is drivers/net/wan/slic_ds26522.c, which in turn is a driver
      for fairly rare hardware (adding the maintainer to Cc for testing).
      
      Link: http://lkml.kernel.org/r/20190322140503.123580-1-arnd@arndb.de
      Fixes: 556d2f05 ("ARM: 8187/1: add CONFIG_HAVE_ARCH_BITREVERSE to support rbit instruction")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Zhao Qiang <qiang.zhao@nxp.com>
      Cc: Yalin Wang <yalin.wang@sonymobile.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • drm/udl: add a release method and delay modeset teardown · f7a46b61
      Dave Airlie authored
      commit 9b39b013 upstream.
      
      If we unplug a udl device, the usb callback will deinit the
      mode_config struct, however userspace will still have an open
      file descriptor and a framebuffer on that device. When userspace
      closes the fd, we'll oops because it'll try and look stuff up
      in the object idr which we've destroyed.
      
      This punts destroying the mode objects until release time instead.
      
      Cc: stable@vger.kernel.org
      Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190405031715.5959-2-airlied@gmail.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • alarmtimer: Return correct remaining time · 753ff726
      Andrei Vagin authored
      commit 07d7e120 upstream.
      
      To calculate a remaining time, it's required to subtract the current time
      from the expiration time. In alarm_timer_remaining() the arguments of
      ktime_sub are swapped.
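
      As a worked example, with 64-bit nanosecond counts standing in for
      ktime_t (the type and function names below are assumptions made for
      this sketch):

```c
#include <stdint.h>

typedef int64_t ktime_ns;   /* stand-in for ktime_t, in nanoseconds */

/* Same argument order as ktime_sub(lhs, rhs): lhs - rhs. */
static ktime_ns time_sub(ktime_ns lhs, ktime_ns rhs)
{
    return lhs - rhs;
}

/* Swapped arguments: yields the negated remaining time. */
static ktime_ns remaining_buggy(ktime_ns expires, ktime_ns now)
{
    return time_sub(now, expires);
}

/* Correct order: expiration time minus current time. */
static ktime_ns remaining_fixed(ktime_ns expires, ktime_ns now)
{
    return time_sub(expires, now);
}
```

      For a timer expiring at 150 with the clock at 100, the swapped order
      gives -50 and the corrected order gives 50.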
      
      Fixes: d653d845 ("alarmtimer: Implement remaining callback")
      Signed-off-by: Andrei Vagin <avagin@gmail.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
      Cc: Stephen Boyd <sboyd@kernel.org>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20190408041542.26338-1-avagin@gmail.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • parisc: regs_return_value() should return gpr28 · 224f5ab9
      Sven Schnelle authored
      commit 45efd871 upstream.
      
      While working on kretprobes for PA-RISC I was wondering why the
      kprobes sanity test always fails on kretprobes. This is caused by
      returning gpr20 instead of gpr28.

      Signed-off-by: Sven Schnelle <svens@stackframe.org>
      Signed-off-by: Helge Deller <deller@gmx.de>
      Cc: stable@vger.kernel.org # 4.14+
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • parisc: Detect QEMU earlier in boot process · a1f52096
      Helge Deller authored
      commit d006e95b upstream.
      
      While adding LASI support to QEMU, I noticed that the QEMU detection in
      the kernel happens much too late. For example, when a LASI chip is found
      by the kernel, it registers the LASI LED driver as well.  But when we
      run on QEMU it makes sense to avoid spending unnecessary CPU cycles, so
      we need to access the running_on_QEMU flag earlier than before.
      
      This patch now makes the QEMU detection the first task of the Linux
      kernel by moving it to where the kernel enters C code.
      
      Fixes: 310d8278 ("parisc: qemu idle sleep support")
      Signed-off-by: Helge Deller <deller@gmx.de>
      Cc: stable@vger.kernel.org # v4.14+
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • arm64: dts: rockchip: fix rk3328 sdmmc0 write errors · c1d361d3
      Peter Geis authored
      commit 09f91381 upstream.
      
      Various rk3328 based boards experience occasional sdmmc0 write errors.
      This is due to the rk3328.dtsi tx drive levels being set to 4mA, vs
      8mA per the rk3328 datasheet default settings.

      Fix this by setting the tx signal pins to 8mA.
      Inspiration from tonymac32's patch,
      https://github.com/ayufan-rock64/linux-kernel/commit/dc1212b347e0da17c5460bcc0a56b07d02bac3f8
      
      Fixes issues on the rk3328-roc-cc and the rk3328-rock64 (as per the
      above commit message).
      
      Tested on the rk3328-roc-cc board.
      
      Fixes: 52e02d37 ("arm64: dts: rockchip: add core dtsi file for RK3328 SoCs")
      Cc: stable@vger.kernel.org
      Signed-off-by: Peter Geis <pgwipeout@gmail.com>
      Signed-off-by: Heiko Stuebner <heiko@sntech.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • hv_netvsc: Fix unwanted wakeup after tx_disable · 789185d4
      Haiyang Zhang authored
      [ Upstream commit 1b704c4a ]
      
      After a queue is stopped, the wakeup mechanism may wake it up again
      when ring buffer usage is lower than a threshold. This may cause a
      NULL pointer panic in the send path when we have stopped all tx queues
      in netvsc_detach and start removing the netvsc device.

      This patch fixes it by adding a tx_disable flag to prevent unwanted
      queue wakeup.
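
      The guard can be pictured with a small state model. Everything here
      (struct txq, the percentage field and the threshold value) is a
      hypothetical simplification, not the hv_netvsc code.

```c
#include <stdbool.h>

#define WAKE_THRESHOLD_PCT 20u   /* assumed threshold, for illustration only */

struct txq {
    bool stopped;            /* queue currently stopped */
    bool tx_disable;         /* set while the device is being detached */
    unsigned int avail_pct;  /* free ring buffer space, in percent */
};

/* Completion path: wake a stopped queue only if tx is not disabled. */
static void maybe_wake(struct txq *q)
{
    if (q->stopped && !q->tx_disable && q->avail_pct > WAKE_THRESHOLD_PCT)
        q->stopped = false;
}
```

      With tx_disable set, a stopped queue stays stopped no matter how much
      ring space becomes available.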
      
      Fixes: 7b2ee50c ("hv_netvsc: common detach logic")
      Reported-by: Mohammed Gamal <mgamal@redhat.com>
      Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ip6_tunnel: Match to ARPHRD_TUNNEL6 for dev type · bc280a1e
      Sheena Mira-ato authored
      [ Upstream commit b2e54b09 ]
      
      The device type for ip6 tunnels is set to
      ARPHRD_TUNNEL6. However, the ip4ip6_err function
      is expecting the device type of the tunnel to be
      ARPHRD_TUNNEL.  Since the device types do not
      match, the function exits and the ICMP error
      packet is not sent to the originating host. Note
      that the device type for IPv4 tunnels is set to
      ARPHRD_TUNNEL.
      
      Fix is to expect a tunnel device type of
      ARPHRD_TUNNEL6 instead.  Now the tunnel device
      type matches and the ICMP error packet is sent
      to the originating host.

      Signed-off-by: Sheena Mira-ato <sheena.mira-ato@alliedtelesis.co.nz>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ALSA: seq: Fix OOB-reads from strlcpy · 5589e51f
      Zubin Mithra authored
      commit 212ac181 upstream.
      
      When ioctl calls are made with non-null-terminated userspace strings,
      strlcpy causes an OOB-read from within strlen. Fix by changing to use
      strscpy instead.
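
      The reason is that strlcpy() returns the length of the source, so it
      has to walk the source until it finds a NUL, however far away that is.
      A bounded copy in the spirit of strscpy() can be sketched in userspace
      as follows; bounded_copy() is a hypothetical helper, not the kernel
      implementation.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Copy src into dst, reading at most `size` bytes of src.
 * Returns the number of characters copied (excluding the NUL),
 * or -E2BIG if the source did not fit and was truncated.
 */
static long bounded_copy(char *dst, const char *src, size_t size)
{
    size_t i;

    if (size == 0)
        return -E2BIG;
    for (i = 0; i < size; i++) {
        dst[i] = src[i];
        if (src[i] == '\0')
            return (long)i;
    }
    dst[size - 1] = '\0';   /* always terminate the destination */
    return -E2BIG;
}
```

      Given a 4-byte source without a terminator and a 4-byte destination,
      this reads exactly 4 source bytes, terminates the destination and
      reports truncation.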

      Signed-off-by: Zubin Mithra <zsm@chromium.org>
      Reviewed-by: Guenter Roeck <groeck@chromium.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: ethtool: not call vzalloc for zero sized memory request · eea06f38
      Li RongQing authored
      [ Upstream commit 3d883026 ]
      
      NULL or ZERO_SIZE_PTR will be returned for a zero-sized memory
      request, and dereferencing them will lead to a segfault.

      So it is unnecessary to call vzalloc for a zero-sized memory
      request, and we must not call functions which may dereference the
      NULL allocated memory.

      This also fixes a possible memory leak: if phy_ethtool_get_stats
      returns an error, memory should be freed before exit.
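
      The zero-size guard can be sketched in userspace with calloc() in
      place of vzalloc(). collect_stats(), fill_fn and fill_seq() are names
      invented for this illustration.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical driver callback that writes n counters into data. */
typedef void (*fill_fn)(uint64_t *data, unsigned int n);

/* Allocate and fill only when there is something to report. */
static int collect_stats(unsigned int n_stats, fill_fn fill, uint64_t **out)
{
    uint64_t *data = NULL;

    if (n_stats) {
        data = calloc(n_stats, sizeof(*data));
        if (!data)
            return -ENOMEM;
        fill(data, n_stats);   /* never called with a NULL buffer */
    }
    *out = data;               /* NULL when n_stats == 0; caller frees */
    return 0;
}

/* Example callback: counter i gets the value i. */
static void fill_seq(uint64_t *data, unsigned int n)
{
    unsigned int i;

    for (i = 0; i < n; i++)
        data[i] = i;
}
```

      A request for zero counters succeeds without allocating or calling the
      callback, so nothing can dereference an empty buffer.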

      Signed-off-by: Li RongQing <lirongqing@baidu.com>
      Reviewed-by: Wang Li <wangli39@baidu.com>
      Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • netns: provide pure entropy for net_hash_mix() · adbb8bdd
      Eric Dumazet authored
      [ Upstream commit 355b9855 ]
      
      net_hash_mix() currently uses kernel address of a struct net,
      and is used in many places that could be used to reveal this
      address to a patient attacker, thus defeating KASLR, for
      the typical case (initial net namespace, &init_net is
      not dynamically allocated)
      
      I believe the original implementation tried to avoid spending
      too many cycles in this function, but security comes first.
      
      Also provide entropy regardless of CONFIG_NET_NS.
      
      Fixes: 0b441916 ("netns: introduce the net_hash_mix "salt" for hashes")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Amit Klein <aksecurity@gmail.com>
      Reported-by: Benny Pinkas <benny@pinkas.net>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net/sched: act_sample: fix divide by zero in the traffic path · 0349ad06
      Davide Caratti authored
      [ Upstream commit fae27081 ]
      
      the control path of 'sample' action does not validate the value of 'rate'
      provided by the user, but then it uses it as divisor in the traffic path.
      Validate it in tcf_sample_init(), and return -EINVAL with a proper extack
      message in case that value is zero, to fix a splat with the script below:
      
       # tc f a dev test0 egress matchall action sample rate 0 group 1 index 2
       # tc -s a s action sample
       total acts 1
      
               action order 0: sample rate 1/0 group 1 pipe
                index 2 ref 1 bind 1 installed 19 sec used 19 sec
               Action statistics:
               Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
               backlog 0b 0p requeues 0
       # ping 192.0.2.1 -I test0 -c1 -q
      
       divide error: 0000 [#1] SMP PTI
       CPU: 1 PID: 6192 Comm: ping Not tainted 5.1.0-rc2.diag2+ #591
       Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
       RIP: 0010:tcf_sample_act+0x9e/0x1e0 [act_sample]
       Code: 6a f1 85 c0 74 0d 80 3d 83 1a 00 00 00 0f 84 9c 00 00 00 4d 85 e4 0f 84 85 00 00 00 e8 9b d7 9c f1 44 8b 8b e0 00 00 00 31 d2 <41> f7 f1 85 d2 75 70 f6 85 83 00 00 00 10 48 8b 45 10 8b 88 08 01
       RSP: 0018:ffffae320190ba30 EFLAGS: 00010246
       RAX: 00000000b0677d21 RBX: ffff8af1ed9ec000 RCX: 0000000059a9fe49
       RDX: 0000000000000000 RSI: 000000000c7e33b7 RDI: ffff8af23daa0af0
       RBP: ffff8af1ee11b200 R08: 0000000074fcaf7e R09: 0000000000000000
       R10: 0000000000000050 R11: ffffffffb3088680 R12: ffff8af232307f80
       R13: 0000000000000003 R14: ffff8af1ed9ec000 R15: 0000000000000000
       FS:  00007fe9c6d2f740(0000) GS:ffff8af23da80000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 00007fff6772f000 CR3: 00000000746a2004 CR4: 00000000001606e0
       Call Trace:
        tcf_action_exec+0x7c/0x1c0
        tcf_classify+0x57/0x160
        __dev_queue_xmit+0x3dc/0xd10
        ip_finish_output2+0x257/0x6d0
        ip_output+0x75/0x280
        ip_send_skb+0x15/0x40
        raw_sendmsg+0xae3/0x1410
        sock_sendmsg+0x36/0x40
        __sys_sendto+0x10e/0x140
        __x64_sys_sendto+0x24/0x30
        do_syscall_64+0x60/0x210
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
        [...]
        Kernel panic - not syncing: Fatal exception in interrupt
      
      Add a TDC selftest to document that 'rate' is now being validated.
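
      The split between control path validation and traffic path use can be
      sketched as below. struct sample_cfg, sample_init() and
      should_sample() are simplified stand-ins for this example and do not
      reproduce the act_sample code.

```c
#include <errno.h>
#include <stdint.h>

struct sample_cfg {
    uint32_t rate;   /* sample roughly one packet in `rate` */
};

/* Control path: reject a zero rate before it can reach the divisor. */
static int sample_init(struct sample_cfg *cfg, uint32_t rate)
{
    if (rate == 0)
        return -EINVAL;
    cfg->rate = rate;
    return 0;
}

/* Traffic path: safe because sample_init() guarantees rate != 0. */
static int should_sample(const struct sample_cfg *cfg, uint32_t rnd)
{
    return rnd % cfg->rate == 0;
}
```

      Rejecting the value once at configuration time keeps the per-packet
      path free of extra checks.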

      Reported-by: Matteo Croce <mcroce@redhat.com>
      Fixes: 5c5670fa ("net/sched: Introduce sample tc action")
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Acked-by: Yotam Gigi <yotam.gi@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • bnxt_en: Reset device on RX buffer errors. · 5df47bb6
      Michael Chan authored
      [ Upstream commit 8e44e96c ]
      
      If the RX completion indicates RX buffer errors, the RX ring will be
      disabled by firmware and no packets will be received on that ring from
      that point on.  Recover by resetting the device.
      
      Fixes: c0c050c5 ("bnxt_en: New Broadcom ethernet driver.")
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • bnxt_en: Improve RX consumer index validity check. · 46281ee8
      Michael Chan authored
      [ Upstream commit a1b0e4e6 ]
      
      There is logic to check that the RX/TPA consumer index is the expected
      index to work around a hardware problem.  However, the potentially bad
      consumer index is first used to index into an array to reference an entry.
      This can potentially crash if the bad consumer index is beyond legal
      range.  Improve the logic to use the consumer index for dereferencing
      after the validity check and log an error message.
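
      The ordering matters more than the check itself, as this userspace
      sketch tries to show. The ring layout and the names rx_ring, rx_buf
      and rx_lookup() are assumptions for illustration, not the bnxt_en
      structures.

```c
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8u

struct rx_buf {
    int data;
};

struct rx_ring {
    struct rx_buf bufs[RING_SIZE];
    uint16_t next_cons;   /* consumer index the driver expects next */
};

/* Validate the reported index first, and only then use it as an index. */
static struct rx_buf *rx_lookup(struct rx_ring *ring, uint16_t cons)
{
    if (cons != ring->next_cons)
        return NULL;   /* caller would log the error and recover */
    return &ring->bufs[cons % RING_SIZE];
}
```

      An out-of-range index such as 5000 is rejected before it is ever used
      to address the buffer array.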
      
      Fixes: fa7e2812 ("bnxt_en: Add workaround to detect bad opaque in rx completion (part 2)")
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      46281ee8
    • Jakub Kicinski's avatar
      nfp: validate the return code from dev_queue_xmit() · e26c79d2
      Jakub Kicinski authored
      [ Upstream commit c8ba5b91 ]
      
      dev_queue_xmit() may return error codes as well as netdev_tx_t,
      and it always consumes the skb.  Make sure we always return a
      correct netdev_tx_t value.
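
      A minimal userspace sketch of the idea, assuming stand-in types: since
      the lower layer always consumes the buffer, its result is used only for
      accounting and the caller is always told the packet was taken.
      tx_status_t, lower_xmit() and repr_xmit() are hypothetical names, and
      treating every non-zero result as an error is a simplification.

```c
/* Stand-in for netdev_tx_t; the names and values are assumptions. */
typedef enum { TX_OK = 0x00, TX_BUSY = 0x10 } tx_status_t;

struct tx_stats {
    unsigned long packets;
    unsigned long errors;
};

/* Hypothetical lower-layer transmit: always consumes the buffer and
 * returns 0 on success or a negative error code. */
static int lower_xmit(int should_fail)
{
    return should_fail ? -1 : 0;
}

/* Transmit wrapper: the lower layer's result feeds the statistics only.
 * The buffer is gone either way, so the stack is never asked to retry. */
static tx_status_t repr_xmit(struct tx_stats *stats, int should_fail)
{
    int ret = lower_xmit(should_fail);

    if (ret == 0)
        stats->packets++;
    else
        stats->errors++;

    return TX_OK;
}
```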
      
      Fixes: eadfa4c3 ("nfp: add stats and xmit helpers for representors")
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Reviewed-by: John Hurley <john.hurley@netronome.com>
      Reviewed-by: Simon Horman <simon.horman@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      e26c79d2
    • Yuval Avnery's avatar
      net/mlx5e: Add a lock on tir list · b5ba76a5
      Yuval Avnery authored
      [ Upstream commit 80a2a902 ]
      
      Refreshing TIRs loops over a global list of TIRs while netdevs are
      adding and removing TIRs from that list, so the list must be
      protected by a lock.
      
      Fixes: 724b2aa1 ("net/mlx5e: TIRs management refactoring")
      Signed-off-by: Yuval Avnery <yuvalav@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b5ba76a5
    • Gavi Teitz's avatar
      net/mlx5e: Fix error handling when refreshing TIRs · 7143c899
      Gavi Teitz authored
      [ Upstream commit bc87a003 ]
      
      Previously, refreshing an empty TIRs list reported a spurious error,
      since the err value was initialized to -ENOMEM and was only updated
      when a TIR was refreshed. This is resolved by initializing the
      err value to zero.
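
      The effect of the initial value can be shown with a small standalone
      sketch. struct tir, refresh_one() and refresh_all() are hypothetical
      stand-ins, and -1 is an arbitrary error code chosen for illustration.

```c
#include <stddef.h>

struct tir {
    int id;
    struct tir *next;
};

/* Hypothetical per-entry refresh: fails for negative ids. */
static int refresh_one(const struct tir *t)
{
    return t->id < 0 ? -1 : 0;
}

/* Walk the list and stop at the first failure. Because err starts at 0,
 * an empty list is reported as success rather than as an error. */
static int refresh_all(const struct tir *head)
{
    int err = 0;

    for (const struct tir *t = head; t != NULL; t = t->next) {
        err = refresh_one(t);
        if (err)
            break;
    }
    return err;
}
```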
      
      Fixes: b676f653 ("net/mlx5e: Refactor refresh TIRs")
      Signed-off-by: Gavi Teitz <gavi@mellanox.com>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      7143c899
    • Stephen Suryaputra's avatar
      vrf: check accept_source_route on the original netdevice · 16b71423
      Stephen Suryaputra authored
      [ Upstream commit 8c83f2df ]
      
      Configuration check to accept source route IP options should be made on
      the incoming netdevice when the skb->dev is an l3mdev master. The route
      lookup for the source route next hop also needs the incoming netdev.
      
      v2->v3:
      - Simplify by passing the original netdevice down the stack (per David
        Ahern).
      Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      16b71423
    • Koen De Schepper's avatar
      tcp: Ensure DCTCP reacts to losses · 2ff8616e
      Koen De Schepper authored
      [ Upstream commit aecfde23 ]
      
      RFC8257 §3.5 explicitly states that "A DCTCP sender MUST react to
      loss episodes in the same way as conventional TCP".
      
      Currently, Linux DCTCP performs no cwnd reduction when losses
      are encountered. Optionally, the dctcp_clamp_alpha_on_loss module
      parameter resets alpha to its maximal value if an RTO happens. This
      behavior is sub-optimal for at least two reasons: i) it ignores losses
      triggering fast retransmissions; and ii) it causes unnecessarily large
      cwnd reductions in the future if the loss was isolated, as it resets
      the historical term of DCTCP's alpha EWMA to its maximal value (i.e.,
      denoting total congestion). The second reason has an especially
      noticeable effect when using DCTCP in high BDP environments, where
      alpha normally stays at low values.
      
      This patch replaces the clamping of alpha with setting ssthresh to
      half of cwnd for both fast retransmissions and RTOs, at most once
      per RTT. Consequently, the dctcp_clamp_alpha_on_loss module parameter
      has been removed.
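
      As a rough sketch of the reaction described above: struct cc_state,
      the reacted_in_rtt flag and the floor of 2 segments are assumptions
      made for illustration and do not reproduce the kernel's data
      structures or its way of detecting a new RTT.

```c
struct cc_state {
    unsigned int cwnd;     /* congestion window, in segments */
    unsigned int ssthresh; /* slow-start threshold, in segments */
    int reacted_in_rtt;    /* hypothetical flag, cleared once per RTT */
};

/* React to a loss (fast retransmit or RTO): set ssthresh to half of
 * cwnd, but at most once per RTT and never below 2 segments. */
static void react_to_loss(struct cc_state *s)
{
    unsigned int half;

    if (s->reacted_in_rtt)
        return;

    half = s->cwnd / 2;
    s->ssthresh = half > 2 ? half : 2;
    s->reacted_in_rtt = 1;
}
```

      A second loss in the same RTT leaves ssthresh untouched, which is what
      keeps a burst of drops from collapsing the window repeatedly.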
      
      The table below shows experimental results where we measured the
      drop probability of a PIE AQM (not applying ECN marks) at a
      bottleneck in the presence of a single TCP flow with either the
      alpha-clamping option enabled or the cwnd halving proposed by this
      patch. Results using reno or cubic are given for comparison.
      
                                |  Link   |   RTT    |    Drop
                       TCP CC   |  speed  | base+AQM | probability
              ==================|=========|==========|============
                          CUBIC |  40Mbps |  7+20ms  |    0.21%
                           RENO |         |          |    0.19%
              DCTCP-CLAMP-ALPHA |         |          |   25.80%
               DCTCP-HALVE-CWND |         |          |    0.22%
              ------------------|---------|----------|------------
                          CUBIC | 100Mbps |  7+20ms  |    0.03%
                           RENO |         |          |    0.02%
              DCTCP-CLAMP-ALPHA |         |          |   23.30%
               DCTCP-HALVE-CWND |         |          |    0.04%
              ------------------|---------|----------|------------
                          CUBIC | 800Mbps |   1+1ms  |    0.04%
                           RENO |         |          |    0.05%
              DCTCP-CLAMP-ALPHA |         |          |   18.70%
               DCTCP-HALVE-CWND |         |          |    0.06%
      
      We see that, without halving its cwnd for all sources of loss,
      DCTCP drives the AQM to large drop probabilities in order to keep
      the queue length under control (i.e., it repeatedly faces RTOs).
      Instead, if DCTCP reacts to all sources of loss, it can then be
      controlled by the AQM using drop levels similar to those of cubic or reno.

      Signed-off-by: Koen De Schepper <koen.de_schepper@nokia-bell-labs.com>
      Signed-off-by: Olivier Tilmans <olivier.tilmans@nokia-bell-labs.com>
      Cc: Bob Briscoe <research@bobbriscoe.net>
      Cc: Lawrence Brakmo <brakmo@fb.com>
      Cc: Florian Westphal <fw@strlen.de>
      Cc: Daniel Borkmann <borkmann@iogearbox.net>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Neal Cardwell <ncardwell@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Andrew Shewmaker <agshew@gmail.com>
      Cc: Glenn Judd <glenn.judd@morganstanley.com>
      Acked-by: Florian Westphal <fw@strlen.de>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      2ff8616e
    • Xin Long's avatar
      sctp: initialize _pad of sockaddr_in before copying to user memory · a7bc830b
      Xin Long authored
      [ Upstream commit 09279e61 ]
      
      Syzbot reported a kernel-infoleak:
      
        BUG: KMSAN: kernel-infoleak in _copy_to_user+0x16b/0x1f0 lib/usercopy.c:32
        Call Trace:
          _copy_to_user+0x16b/0x1f0 lib/usercopy.c:32
          copy_to_user include/linux/uaccess.h:174 [inline]
          sctp_getsockopt_peer_addrs net/sctp/socket.c:5911 [inline]
          sctp_getsockopt+0x1668e/0x17f70 net/sctp/socket.c:7562
          ...
        Uninit was stored to memory at:
          sctp_transport_init net/sctp/transport.c:61 [inline]
          sctp_transport_new+0x16d/0x9a0 net/sctp/transport.c:115
          sctp_assoc_add_peer+0x532/0x1f70 net/sctp/associola.c:637
          sctp_process_param net/sctp/sm_make_chunk.c:2548 [inline]
          sctp_process_init+0x1a1b/0x3ed0 net/sctp/sm_make_chunk.c:2361
          ...
        Bytes 8-15 of 16 are uninitialized
      
      It was caused by the _pad field (bytes 8-15) of a v4 addr (saved in
      struct sockaddr_in) not being initialized before being copied directly
      to user memory in sctp_getsockopt_peer_addrs().
      
      So fix it by calling memset(addr->v4.sin_zero, 0, 8) to initialize _pad of
      sockaddr_in before copying it to user memory in sctp_v4_addr_to_user(), as
      sctp_v6_addr_to_user() does.
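
      The same idea in a userspace sketch, using the sin_zero padding that
      Linux exposes in struct sockaddr_in. v4_addr_prepare() is a
      hypothetical helper name and is not the kernel function.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Clear the padding bytes of an IPv4 socket address so that no stale
 * memory is exposed when the whole structure is copied elsewhere. */
static void v4_addr_prepare(struct sockaddr_in *sin)
{
    memset(sin->sin_zero, 0, sizeof(sin->sin_zero));
}
```

      Using sizeof(sin->sin_zero) instead of a literal 8 keeps the call
      correct if the padding size ever differs on a given platform.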
      
      Reported-by: syzbot+86b5c7c236a22616a72f@syzkaller.appspotmail.com
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Tested-by: Alexander Potapenko <glider@google.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      a7bc830b
    • Bjørn Mork's avatar
      qmi_wwan: add Olicard 600 · be7e16e5
      Bjørn Mork authored
      [ Upstream commit 6289d0fa ]
      
      This is a Qualcomm based device with a QMI function on interface 4.
      It is mode switched from 2020:2030 using a standard eject message.
      
      T:  Bus=01 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#=  6 Spd=480  MxCh= 0
      D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
      P:  Vendor=2020 ProdID=2031 Rev= 2.32
      S:  Manufacturer=Mobile Connect
      S:  Product=Mobile Connect
      S:  SerialNumber=0123456789ABCDEF
      C:* #Ifs= 6 Cfg#= 1 Atr=80 MxPwr=500mA
      I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)
      E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
      E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
      I:* If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
      E:  Ad=83(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
      E:  Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
      E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
      I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
      E:  Ad=85(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
      E:  Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
      E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
      I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
      E:  Ad=87(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
      E:  Ad=86(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
      E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
      I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)
      E:  Ad=89(I) Atr=03(Int.) MxPS=   8 Ivl=32ms
      E:  Ad=88(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
      E:  Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
      I:* If#= 5 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=(none)
      E:  Ad=8a(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
      E:  Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=125us
      Signed-off-by: Bjørn Mork <bjorn@mork.no>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      be7e16e5
    • Andrea Righi's avatar
      openvswitch: fix flow actions reallocation · 94ef6b98
      Andrea Righi authored
      [ Upstream commit f28cd2af ]
      
      The flow action buffer can be resized if it's not big enough to contain
      all the requested flow actions. However, this resize doesn't take into
      account the newly requested size: the buffer is only doubled. This
      might not be enough to contain the new data, causing a buffer
      overflow, for example:
      
      [   42.044472] =============================================================================
      [   42.045608] BUG kmalloc-96 (Not tainted): Redzone overwritten
      [   42.046415] -----------------------------------------------------------------------------
      
      [   42.047715] Disabling lock debugging due to kernel taint
      [   42.047716] INFO: 0x8bf2c4a5-0x720c0928. First byte 0x0 instead of 0xcc
      [   42.048677] INFO: Slab 0xbc6d2040 objects=29 used=18 fp=0xdc07dec4 flags=0x2808101
      [   42.049743] INFO: Object 0xd53a3464 @offset=2528 fp=0xccdcdebb
      
      [   42.050747] Redzone 76f1b237: cc cc cc cc cc cc cc cc                          ........
      [   42.051839] Object d53a3464: 6b 6b 6b 6b 6b 6b 6b 6b 0c 00 00 00 6c 00 00 00  kkkkkkkk....l...
      [   42.053015] Object f49a30cc: 6c 00 0c 00 00 00 00 00 00 00 00 03 78 a3 15 f6  l...........x...
      [   42.054203] Object acfe4220: 20 00 02 00 ff ff ff ff 00 00 00 00 00 00 00 00   ...............
      [   42.055370] Object 21024e91: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      [   42.056541] Object 070e04c3: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      [   42.057797] Object 948a777a: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
      [   42.059061] Redzone 8bf2c4a5: 00 00 00 00                                      ....
      [   42.060189] Padding a681b46e: 5a 5a 5a 5a 5a 5a 5a 5a                          ZZZZZZZZ
      
      Fix by making sure the new buffer is properly resized to contain all the
      requested data.
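
      One way to size the new buffer, sketched under the assumption that the
      caller knows the current capacity, the bytes already used and the size
      of the pending request. grow_size() is a hypothetical helper, and
      overflow of the size arithmetic is ignored for brevity.

```c
#include <stddef.h>

/* Pick a new capacity that is at least double the current one and also
 * large enough for the data already stored plus the pending request. */
static size_t grow_size(size_t cur_size, size_t used, size_t req)
{
    size_t needed = used + req;
    size_t doubled = cur_size * 2;

    return doubled > needed ? doubled : needed;
}
```

      Doubling alone covers small requests cheaply; taking the maximum with
      the required size covers a single request larger than the current
      buffer.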
      
      BugLink: https://bugs.launchpad.net/bugs/1813244
      Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
      Acked-by: Pravin B Shelar <pshelar@ovn.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      94ef6b98
    • Nicolas Dichtel's avatar
      net/sched: fix ->get helper of the matchall cls · a54dc7b6
      Nicolas Dichtel authored
      [ Upstream commit 0db6f8be ]
      
      It always returned NULL, so it was never possible to get the filter.
      
      Example:
      $ ip link add foo type dummy
      $ ip link add bar type dummy
      $ tc qdisc add dev foo clsact
      $ tc filter add dev foo protocol all pref 1 ingress handle 1234 \
      	matchall action mirred ingress mirror dev bar
      
      Before the patch:
      $ tc filter get dev foo protocol all pref 1 ingress handle 1234 matchall
      Error: Specified filter handle not found.
      We have an error talking to the kernel
      
      After:
      $ tc filter get dev foo protocol all pref 1 ingress handle 1234 matchall
      filter ingress protocol all pref 1 matchall chain 0 handle 0x4d2
        not_in_hw
              action order 1: mirred (Ingress Mirror to device bar) pipe
              index 1 ref 1 bind 1
      
      CC: Yotam Gigi <yotamg@mellanox.com>
      CC: Jiri Pirko <jiri@mellanox.com>
      Fixes: fd62d9f5 ("net/sched: matchall: Fix configuration race")
      Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      a54dc7b6
    • Mao Wenan's avatar
      net: rds: force to destroy connection if t_sock is NULL in rds_tcp_kill_sock(). · c8a88799
      Mao Wenan authored
      [ Upstream commit cb66ddd1 ]
      
      When a net namespace is cleaned up, rds_tcp_exit_net() calls
      rds_tcp_kill_sock(). If t_sock is NULL, it does not call
      rds_conn_destroy(), rds_conn_path_destroy() and rds_tcp_conn_free() to
      free the connection, and the worker cp_conn_w is not stopped. The net is
      then freed in net_drop_ns(), while the cp_conn_w worker,
      rds_connect_worker(), still calls rds_tcp_conn_path_connect() and
      references the 'net' that has already been freed.

      In rds_tcp_conn_path_connect(), rds_tcp_set_callbacks() sets
      t_sock = sock before sock->ops->connect, but if connect() fails,
      rds_tcp_restore_callbacks() is called and sets t_sock = NULL. If
      connect() always fails, rds_connect_worker() keeps trying to reconnect,
      so rds_tcp_kill_sock() never cancels the worker cp_conn_w or frees the
      connections.

      Therefore, the condition !tc->t_sock is not needed on the
      cleanup_net->rds_tcp_exit_net->rds_tcp_kill_sock path, because
      tc->t_sock is always NULL there and there is no other path to cancel
      cp_conn_w and free the connection. This patch fixes that.
      
      rds_tcp_kill_sock():
      ...
      if (net != c_net || !tc->t_sock)
      ...
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      
      ==================================================================
      BUG: KASAN: use-after-free in inet_create+0xbcc/0xd28
      net/ipv4/af_inet.c:340
      Read of size 4 at addr ffff8003496a4684 by task kworker/u8:4/3721
      
      CPU: 3 PID: 3721 Comm: kworker/u8:4 Not tainted 5.1.0 #11
      Hardware name: linux,dummy-virt (DT)
      Workqueue: krdsd rds_connect_worker
      Call trace:
       dump_backtrace+0x0/0x3c0 arch/arm64/kernel/time.c:53
       show_stack+0x28/0x38 arch/arm64/kernel/traps.c:152
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x120/0x188 lib/dump_stack.c:113
       print_address_description+0x68/0x278 mm/kasan/report.c:253
       kasan_report_error mm/kasan/report.c:351 [inline]
       kasan_report+0x21c/0x348 mm/kasan/report.c:409
       __asan_report_load4_noabort+0x30/0x40 mm/kasan/report.c:429
       inet_create+0xbcc/0xd28 net/ipv4/af_inet.c:340
       __sock_create+0x4f8/0x770 net/socket.c:1276
       sock_create_kern+0x50/0x68 net/socket.c:1322
       rds_tcp_conn_path_connect+0x2b4/0x690 net/rds/tcp_connect.c:114
       rds_connect_worker+0x108/0x1d0 net/rds/threads.c:175
       process_one_work+0x6e8/0x1700 kernel/workqueue.c:2153
       worker_thread+0x3b0/0xdd0 kernel/workqueue.c:2296
       kthread+0x2f0/0x378 kernel/kthread.c:255
       ret_from_fork+0x10/0x18 arch/arm64/kernel/entry.S:1117
      
      Allocated by task 687:
       save_stack mm/kasan/kasan.c:448 [inline]
       set_track mm/kasan/kasan.c:460 [inline]
       kasan_kmalloc+0xd4/0x180 mm/kasan/kasan.c:553
       kasan_slab_alloc+0x14/0x20 mm/kasan/kasan.c:490
       slab_post_alloc_hook mm/slab.h:444 [inline]
       slab_alloc_node mm/slub.c:2705 [inline]
       slab_alloc mm/slub.c:2713 [inline]
       kmem_cache_alloc+0x14c/0x388 mm/slub.c:2718
       kmem_cache_zalloc include/linux/slab.h:697 [inline]
       net_alloc net/core/net_namespace.c:384 [inline]
       copy_net_ns+0xc4/0x2d0 net/core/net_namespace.c:424
       create_new_namespaces+0x300/0x658 kernel/nsproxy.c:107
       unshare_nsproxy_namespaces+0xa0/0x198 kernel/nsproxy.c:206
       ksys_unshare+0x340/0x628 kernel/fork.c:2577
       __do_sys_unshare kernel/fork.c:2645 [inline]
       __se_sys_unshare kernel/fork.c:2643 [inline]
       __arm64_sys_unshare+0x38/0x58 kernel/fork.c:2643
       __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
       invoke_syscall arch/arm64/kernel/syscall.c:47 [inline]
       el0_svc_common+0x168/0x390 arch/arm64/kernel/syscall.c:83
       el0_svc_handler+0x60/0xd0 arch/arm64/kernel/syscall.c:129
       el0_svc+0x8/0xc arch/arm64/kernel/entry.S:960
      
      Freed by task 264:
       save_stack mm/kasan/kasan.c:448 [inline]
       set_track mm/kasan/kasan.c:460 [inline]
       __kasan_slab_free+0x114/0x220 mm/kasan/kasan.c:521
       kasan_slab_free+0x10/0x18 mm/kasan/kasan.c:528
       slab_free_hook mm/slub.c:1370 [inline]
       slab_free_freelist_hook mm/slub.c:1397 [inline]
       slab_free mm/slub.c:2952 [inline]
       kmem_cache_free+0xb8/0x3a8 mm/slub.c:2968
       net_free net/core/net_namespace.c:400 [inline]
       net_drop_ns.part.6+0x78/0x90 net/core/net_namespace.c:407
       net_drop_ns net/core/net_namespace.c:406 [inline]
       cleanup_net+0x53c/0x6d8 net/core/net_namespace.c:569
       process_one_work+0x6e8/0x1700 kernel/workqueue.c:2153
       worker_thread+0x3b0/0xdd0 kernel/workqueue.c:2296
       kthread+0x2f0/0x378 kernel/kthread.c:255
       ret_from_fork+0x10/0x18 arch/arm64/kernel/entry.S:1117
      
      The buggy address belongs to the object at ffff8003496a3f80
       which belongs to the cache net_namespace of size 7872
      The buggy address is located 1796 bytes inside of
       7872-byte region [ffff8003496a3f80, ffff8003496a5e40)
      The buggy address belongs to the page:
      page:ffff7e000d25a800 count:1 mapcount:0 mapping:ffff80036ce4b000
      index:0x0 compound_mapcount: 0
      flags: 0xffffe0000008100(slab|head)
      raw: 0ffffe0000008100 dead000000000100 dead000000000200 ffff80036ce4b000
      raw: 0000000000000000 0000000080040004 00000001ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff8003496a4580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff8003496a4600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      >ffff8003496a4680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                         ^
       ffff8003496a4700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff8003496a4780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      ==================================================================
      
      Fixes: 467fa153 ("RDS-TCP: Support multiple RDS-TCP listen endpoints, one per netns.")
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Mao Wenan <maowenan@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      c8a88799
    • Artemy Kovalyov's avatar
      net/mlx5: Decrease default mr cache size · 96d8f624
      Artemy Kovalyov authored
      [ Upstream commit e8b26b21 ]
      
      Delete initialization of high order entries in mr cache to decrease initial
      memory footprint. When required, the administrator can populate the
      entries with memory keys via the /sys interface.
      
      This approach is very helpful to significantly reduce the per HW function
      memory footprint in virtualization environments such as SRIOV.
      
      Fixes: 9603b61d ("mlx5: Move pci device handling from mlx5_ib to mlx5_core")
      Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
      Signed-off-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Reported-by: Shalom Toledo <shalomt@mellanox.com>
      Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      96d8f624
    • Steffen Klassert's avatar
      net-gro: Fix GRO flush when receiving a GSO packet. · 23bfd229
      Steffen Klassert authored
      [ Upstream commit 0ab03f35 ]
      
      Currently we may incorrectly merge a received GSO packet
      or a packet with frag_list into a packet sitting in the
      gro_hash list. skb_segment() may crash in this case because
      the assumptions on the skb layout are not met.
      The correct behaviour would be to flush the packet in the
      gro_hash list and send the received GSO packet directly
      afterwards. Commit d61d072e ("net-gro: avoid reorders")
      sets NAPI_GRO_CB(skb)->flush in this case, but this is not
      checked before merging. This patch makes sure to check this
      flag and to not merge in that case.
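
      A simplified sketch of checking the flush flag before merging. The
      struct gro_pkt type, gro_try_merge() and the GRO_MAX_LEN limit are
      assumptions for illustration and are not the kernel's skb layout.

```c
#define GRO_MAX_LEN 65536u /* assumed upper bound on a merged packet */

struct gro_pkt {
    unsigned int len;
    int flush; /* set when the packet must not be merged */
};

/* Try to append "incoming" to "held". Refuse when the incoming packet is
 * marked for flush or the result would be too long; the caller then
 * flushes "held" and delivers "incoming" on its own. */
static int gro_try_merge(struct gro_pkt *held, const struct gro_pkt *incoming)
{
    if (incoming->flush || held->len + incoming->len >= GRO_MAX_LEN)
        return -1;

    held->len += incoming->len;
    return 0;
}
```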
      
      Fixes: d61d072e ("net-gro: avoid reorders")
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      23bfd229
    • Jiri Slaby's avatar
      kcm: switch order of device registration to fix a crash · 393c8b4c
      Jiri Slaby authored
      [ Upstream commit 3c446e6f ]
      
      When kcm is loaded while many processes try to create a KCM socket, a
      crash occurs:
       BUG: unable to handle kernel NULL pointer dereference at 000000000000000e
       IP: mutex_lock+0x27/0x40 kernel/locking/mutex.c:240
       PGD 8000000016ef2067 P4D 8000000016ef2067 PUD 3d6e9067 PMD 0
       Oops: 0002 [#1] SMP KASAN PTI
       CPU: 0 PID: 7005 Comm: syz-executor.5 Not tainted 4.12.14-396-default #1 SLE15-SP1 (unreleased)
       RIP: 0010:mutex_lock+0x27/0x40 kernel/locking/mutex.c:240
       RSP: 0018:ffff88000d487a00 EFLAGS: 00010246
       RAX: 0000000000000000 RBX: 000000000000000e RCX: 1ffff100082b0719
       ...
       CR2: 000000000000000e CR3: 000000004b1bc003 CR4: 0000000000060ef0
       Call Trace:
        kcm_create+0x600/0xbf0 [kcm]
        __sock_create+0x324/0x750 net/socket.c:1272
       ...
      
      This is due to a race between sock_create and an unfinished
      register_pernet_device. kcm_create tries to do "net_generic(net,
      kcm_net_id)", but kcm_net_id is not initialized yet.
      
      So switch the order of the two to close the race.
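
      A toy model of why the order matters, with all names hypothetical:
      the per-net id must exist before the socket family becomes visible,
      otherwise a socket can be created against an unset id. This is
      single-threaded and only illustrates the dependency, not the race.

```c
enum { ERR_NO_FAMILY = -1, ERR_NO_NET_ID = -2 };

static int net_id = -1;    /* stands in for the per-net id */
static int family_visible; /* stands in for the registered socket family */

static int register_pernet(void) { net_id = 7; return 0; }
static int register_family(void) { family_visible = 1; return 0; }

/* Socket creation needs both the family and a valid per-net id. */
static int create_socket(void)
{
    if (!family_visible)
        return ERR_NO_FAMILY;
    return net_id >= 0 ? 0 : ERR_NO_NET_ID;
}

/* Fixed order: set up per-net state first, then expose the family. */
static int module_init_fixed(void)
{
    int err = register_pernet();

    if (err)
        return err;
    return register_family();
}
```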
      
      This can be reproduced with multiple processes doing socket(PF_KCM, ...)
      and one process doing module removal.
      
      Fixes: ab7ac4eb ("kcm: Kernel Connection Multiplexor module")
      Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      393c8b4c