1. 22 Jun, 2020 8 commits
    • Fix 'acccess_ok()' on alpha and SH · 3b051f17
      Linus Torvalds authored
      commit 94bd8a05 upstream.
      
      Commit 594cc251 ("make 'user_access_begin()' do 'access_ok()'")
      broke both alpha and SH booting in qemu, as noticed by Guenter Roeck.
      
      It turns out that the bug wasn't actually in that commit itself (which
      would have been surprising: it was mostly a no-op), but in how the
      addition of access_ok() to the strncpy_from_user() and strnlen_user()
      functions now triggered the case where those functions would test the
      access of the very last byte of the user address space.
      
      The string functions actually did that user range test before too, but
      they did it manually by just comparing against user_addr_max().  But
      with user_access_begin() doing the check (using "access_ok()"), it now
      exposed problems in the architecture implementations of that function.
      
      For example, on alpha, the access_ok() helper macro looked like this:
      
        #define __access_ok(addr, size) \
              ((get_fs().seg & (addr | size | (addr+size))) == 0)
      
      and what it basically tests is if any of the high bits get set (the
      USER_DS masking value is 0xfffffc0000000000).
      
      And that's completely wrong for the "addr+size" check.  Because it's
      off-by-one for the case where we check to the very end of the user
      address space, which is exactly what the strn*_user() functions do.
      
      Why? Because "addr+size" will be exactly the size of the address space,
      so trying to access the last byte of the user address space will fail
      the __access_ok() check, even though it shouldn't.  As a result, the
      user string accessor functions failed consistently - because they
      literally don't know how long the string is going to be, and the max
      access is going to be that last byte of the user address space.
      
      Side note: that alpha macro is buggy for another reason too - it re-uses
      the arguments twice.
      
      And SH has another version of almost the exact same bug:
      
        #define __addr_ok(addr) \
              ((unsigned long __force)(addr) < current_thread_info()->addr_limit.seg)
      
      so far so good: yes, a user address must be below the limit.  But then:
      
        #define __access_ok(addr, size)         \
              (__addr_ok((addr) + (size)))
      
      is wrong with the exact same off-by-one case: the case when "addr+size"
      is exactly _equal_ to the limit is actually perfectly fine (think "one
      byte access at the last address of the user address space")
      
      The SH version is actually seriously buggy in another way: it doesn't
      actually check for overflow, even though it did copy the _comment_ that
      talks about overflow.
      
      So it turns out that both SH and alpha actually have completely buggy
      implementations of access_ok(), but they happened to work in practice
      (although the SH overflow one is a serious serious security bug, not
      that anybody likely cares about SH security).
      
      This fixes the problems by using a similar macro on both alpha and SH.
      It isn't trying to be clever, the end address is based on this logic:
      
              unsigned long __ao_end = __ao_a + __ao_b - !!__ao_b;
      
      which basically says "add start and length, and then subtract one unless
      the length was zero".  We can't subtract one for a zero length, or we'd
      just hit an underflow instead.
      
      For a lot of access_ok() users the length is a constant, so this isn't
      actually as expensive as it initially looks.
      Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Miles Chen <miles.chen@mediatek.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
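
The off-by-one and the missing overflow check described in this commit can be reproduced in user space. The following is a minimal sketch, not the kernel macro: `LIMIT`, `range_ok_buggy()` and `range_ok_fixed()` are hypothetical names, and `LIMIT` is an assumed exclusive upper bound standing in for the address limit.

```c
#include <stdbool.h>

/* Hypothetical exclusive upper bound of the "user" range (illustration only). */
#define LIMIT 0x1000UL

/*
 * Buggy variant, modelled on the SH-style check quoted above: it tests
 * addr + size against the limit, so a range ending exactly at LIMIT is
 * rejected, and a wrapped-around sum is accepted.
 */
static bool range_ok_buggy(unsigned long addr, unsigned long size)
{
	return addr + size < LIMIT;
}

/*
 * Fixed variant: compute the address of the last byte accessed
 * (start + length - 1, or just start for a zero length), reject
 * wraparound, then compare that last byte against the limit.
 */
static bool range_ok_fixed(unsigned long addr, unsigned long size)
{
	unsigned long end = addr + size - !!size;

	return end >= addr && end < LIMIT;
}
```

With these definitions, a one-byte access at `LIMIT - 1` is rejected by the buggy check but accepted by the fixed one, while a huge length that wraps the sum is accepted by the buggy check and rejected by the fixed one.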
    • make 'user_access_begin()' do 'access_ok()' · 216284c4
      Linus Torvalds authored
      commit 594cc251 upstream.
      
      Originally, the rule used to be that you'd have to do access_ok()
      separately, and then user_access_begin() before actually doing the
      direct (optimized) user access.
      
      But experience has shown that people then decide not to do access_ok()
      at all, and instead rely on it being implied by other operations or
      similar.  Which makes it very hard to verify that the access has
      actually been range-checked.
      
      If you use the unsafe direct user accesses, hardware features (either
      SMAP - Supervisor Mode Access Protection - on x86, or PAN - Privileged
      Access Never - on ARM) do force you to use user_access_begin().  But
      nothing really forces the range check.
      
      By putting the range check into user_access_begin(), we actually force
      people to do the right thing (tm), and the range check will be visible
      near the actual accesses.  We have way too long a history of people
      trying to avoid them.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Miles Chen <miles.chen@mediatek.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
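
The idea of folding the range check into the "begin" step can be modelled in user space. This is a rough sketch under stated assumptions: every `model_*` name is hypothetical, the 64-byte array stands in for user memory, and the real kernel helpers have different signatures and semantics.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_EFAULT 14	/* stand-in error value for this sketch */

/* Hypothetical stand-in for the user address space. */
static unsigned char model_user_mem[64];
static bool model_access_open;

/* Range check: the whole [off, off + len) window must fit in the buffer. */
static bool model_access_ok(size_t off, size_t len)
{
	return off <= sizeof(model_user_mem) &&
	       len <= sizeof(model_user_mem) - off;
}

/* "begin" performs the range check itself, so callers cannot skip it. */
static bool model_user_access_begin(size_t off, size_t len)
{
	if (!model_access_ok(off, len))
		return false;
	model_access_open = true;
	return true;
}

static void model_user_access_end(void)
{
	model_access_open = false;
}

/* The check now sits right next to the access it protects. */
static int model_read_u32(size_t off, unsigned int *out)
{
	if (!model_user_access_begin(off, sizeof(*out)))
		return -MODEL_EFAULT;
	memcpy(out, model_user_mem + off, sizeof(*out));
	model_user_access_end();
	return 0;
}
```

A caller only ever sees `model_read_u32()` fail or succeed; there is no way to reach the copy without passing through the range check first.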
    • selftests: bpf: fix use of undeclared RET_IF macro · 6f89ad2e
      Lorenz Bauer authored
      commit 634efb75 ("selftests: bpf: Reset global state between
      reuseport test runs") uses a macro RET_IF which doesn't exist in
      the v4.19 tree. It is defined as follows:
      
              #define RET_IF(condition, tag, format...) ({ \
                      if (CHECK_FAIL(condition)) { \
                              printf(tag " " format); \
                              return; \
                      } \
              })
      
      CHECK_FAIL in turn is defined as:
      
              #define CHECK_FAIL(condition) ({ \
                      int __ret = !!(condition); \
                      int __save_errno = errno; \
                      if (__ret) { \
                              test__fail(); \
                              fprintf(stdout, "%s:FAIL:%d\n", __func__, __LINE__); \
                      } \
                      errno = __save_errno; \
                      __ret; \
              })
      
      Replace occurrences of RET_IF with CHECK. This will abort the test binary
      if clearing the intermediate state fails.
      
      Fixes: 634efb75 ("selftests: bpf: Reset global state between reuseport test runs")
      Reported-by: kernel test robot <rong.a.chen@intel.com>
      Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tun: correct header offsets in napi frags mode · 75e36c19
      Willem de Bruijn authored
      [ Upstream commit 96aa1b22 ]
      
      Tun in IFF_NAPI_FRAGS mode calls napi_gro_frags. Unlike netif_rx and
      netif_gro_receive, this expects skb->data to point to the mac layer.
      
      But skb_probe_transport_header, __skb_get_hash_symmetric, and
      xdp_do_generic in tun_get_user need skb->data to point to the network
      header. Flow dissection also needs skb->protocol set, so
      eth_type_trans has to be called.
      
      Ensure the link layer header lies in linear as eth_type_trans pulls
      ETH_HLEN. Then take the same code paths for frags as for not frags.
      Push the link layer header back just before calling napi_gro_frags.
      
      By pulling up to ETH_HLEN from frag0 into linear, this disables the
      frag0 optimization in the special case when IFF_NAPI_FRAGS is used
      with zero length iov[0] (and thus empty skb->linear).
      
      Fixes: 90e33d45 ("tun: enable napi_gro_frags() for TUN/TAP driver")
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Acked-by: Petar Penkov <ppenkov@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • vxlan: Avoid infinite loop when suppressing NS messages with invalid options · dbe7cfbf
      Ido Schimmel authored
      [ Upstream commit 8066e6b4 ]
      
      When proxy mode is enabled the vxlan device might reply to Neighbor
      Solicitation (NS) messages on behalf of remote hosts.
      
      In case the NS message includes the "Source link-layer address" option
      [1], the vxlan device will use the specified address as the link-layer
      destination address in its reply.
      
      To avoid an infinite loop, break out of the options parsing loop when
      encountering an option with length zero and disregard the NS message.
      
      This is consistent with the IPv6 ndisc code and RFC 4861 which states
      that "Nodes MUST silently discard an ND packet that contains an option
      with length zero" [2].
      
      [1] https://tools.ietf.org/html/rfc4861#section-4.3
      [2] https://tools.ietf.org/html/rfc4861#section-4.6
      
      Fixes: 4b29dba9 ("vxlan: fix nonfunctional neigh_reduce()")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
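
The loop hazard is easy to see in a standalone option parser. The following is a minimal sketch, not the vxlan or bridge code: `find_source_lladdr()` is a hypothetical helper, and for brevity it returns NULL both when the option is absent and when the option list is malformed. It relies on the RFC 4861 option layout, where the second byte of each option is its length in units of 8 octets.

```c
#include <stddef.h>
#include <stdint.h>

#define ND_OPT_SOURCE_LL_ADDR 1	/* RFC 4861 "Source link-layer address" */

/*
 * Walk a buffer of ND options and return a pointer to the link-layer
 * address carried by the first Source link-layer address option.
 */
static const uint8_t *find_source_lladdr(const uint8_t *opts, size_t len)
{
	size_t i = 0;

	while (i + 2 <= len) {
		uint8_t type = opts[i];
		size_t opt_len = (size_t)opts[i + 1] * 8;

		/*
		 * A zero length would leave 'i' unchanged forever, so stop
		 * parsing and treat the whole message as invalid.
		 */
		if (opt_len == 0)
			return NULL;
		/* Option claims more bytes than remain in the buffer. */
		if (opt_len > len - i)
			return NULL;
		if (type == ND_OPT_SOURCE_LL_ADDR)
			return &opts[i + 2];
		i += opt_len;
	}
	return NULL;
}
```

Without the zero-length test, the `i += opt_len` step makes no progress and the `while` condition never becomes false.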
    • bridge: Avoid infinite loop when suppressing NS messages with invalid options · 1e74500f
      Ido Schimmel authored
      [ Upstream commit 53fc6852 ]
      
      When neighbor suppression is enabled the bridge device might reply to
      Neighbor Solicitation (NS) messages on behalf of remote hosts.
      
      In case the NS message includes the "Source link-layer address" option
      [1], the bridge device will use the specified address as the link-layer
      destination address in its reply.
      
      To avoid an infinite loop, break out of the options parsing loop when
      encountering an option with length zero and disregard the NS message.
      
      This is consistent with the IPv6 ndisc code and RFC 4861 which states
      that "Nodes MUST silently discard an ND packet that contains an option
      with length zero" [2].
      
      [1] https://tools.ietf.org/html/rfc4861#section-4.3
      [2] https://tools.ietf.org/html/rfc4861#section-4.6
      
      Fixes: ed842fae ("bridge: suppress nd pkts on BR_NEIGH_SUPPRESS ports")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reported-by: Alla Segal <allas@mellanox.com>
      Tested-by: Alla Segal <allas@mellanox.com>
      Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net_failover: fixed rollback in net_failover_open() · 8e62792a
      Vasily Averin authored
      [ Upstream commit e8224bfe ]
      
      found by smatch:
      drivers/net/net_failover.c:65 net_failover_open() error:
       we previously assumed 'primary_dev' could be null (see line 43)
      
      Fixes: cfc80d9a ("net: Introduce net_failover driver")
      Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ipv6: fix IPV6_ADDRFORM operation logic · 470e709f
      Hangbin Liu authored
      [ Upstream commit 79a1f0cc ]
      
      Socket option IPV6_ADDRFORM supports UDP/UDPLITE and TCP at present.
      Previously the checking logic looked like:
      if (sk->sk_protocol == IPPROTO_UDP || sk->sk_protocol == IPPROTO_UDPLITE)
      	do_some_check;
      else if (sk->sk_protocol != IPPROTO_TCP)
      	break;
      
      After commit b6f61189 ("ipv6: restrict IPV6_ADDRFORM operation"), TCP
      was blocked as the logic changed to:
      if (sk->sk_protocol == IPPROTO_UDP || sk->sk_protocol == IPPROTO_UDPLITE)
      	do_some_check;
      else if (sk->sk_protocol == IPPROTO_TCP)
      	do_some_check;
      	break;
      else
      	break;
      
      Then after commit 82c9ae44 ("ipv6: fix restrict IPV6_ADDRFORM operation")
      UDP/UDPLITE were blocked as the logic changed to:
      if (sk->sk_protocol == IPPROTO_UDP || sk->sk_protocol == IPPROTO_UDPLITE)
      	do_some_check;
      if (sk->sk_protocol == IPPROTO_TCP)
      	do_some_check;
      
      if (sk->sk_protocol != IPPROTO_TCP)
      	break;
      
      Fix it by using Eric's code and simply removing the break in the TCP
      check, which looks like:
      if (sk->sk_protocol == IPPROTO_UDP || sk->sk_protocol == IPPROTO_UDPLITE)
      	do_some_check;
      else if (sk->sk_protocol == IPPROTO_TCP)
      	do_some_check;
      else
      	break;
      
      Fixes: 82c9ae44 ("ipv6: fix restrict IPV6_ADDRFORM operation")
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
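
The three stages of the protocol check can be compared side by side. This is a simplified model of the pseudocode in the message, not the kernel source: the function names are hypothetical, each function only reports whether processing continues past the protocol check, and `do_some_check` is assumed to pass.

```c
#include <stdbool.h>

/* IANA protocol numbers, defined locally to keep the sketch portable. */
enum { PROTO_TCP = 6, PROTO_UDP = 17, PROTO_UDPLITE = 136 };

/* After b6f61189: the TCP branch ends in an unconditional break. */
static bool continues_after_b6f61189(int proto)
{
	if (proto == PROTO_UDP || proto == PROTO_UDPLITE)
		return true;
	else if (proto == PROTO_TCP)
		return false;	/* check, then break: TCP is blocked */
	else
		return false;
}

/* After 82c9ae44: a trailing "not TCP" test breaks for UDP/UDPLITE too. */
static bool continues_after_82c9ae44(int proto)
{
	if (proto != PROTO_TCP)
		return false;	/* UDP/UDPLITE are blocked here */
	return true;
}

/* With this fix: all three supported protocols continue. */
static bool continues_after_fix(int proto)
{
	if (proto == PROTO_UDP || proto == PROTO_UDPLITE)
		return true;
	else if (proto == PROTO_TCP)
		return true;
	else
		return false;
}
```

Only the last variant lets UDP, UDPLITE and TCP through while still stopping every other protocol.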
  2. 10 Jun, 2020 26 commits
  3. 07 Jun, 2020 6 commits
    • Linux 4.19.127 · 106fa147
      Greg Kroah-Hartman authored
    • net: smsc911x: Fix runtime PM imbalance on error · 3fc8e9a7
      Dinghao Liu authored
      [ Upstream commit 539d39ad ]
      
      Remove runtime PM usage counter decrement when the
      increment function has not been called to keep the
      counter balanced.
      Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
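
The balancing rule behind this fix can be shown with a toy counter. This is an illustrative model only, not the smsc911x probe path: all `model_*` names are hypothetical, and the choice to keep the reference on success is an assumption of the sketch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the runtime PM usage counter. */
static int model_usage_count;

static void model_pm_get(void) { model_usage_count++; }
static void model_pm_put(void) { model_usage_count--; }

/*
 * fail_early: an error occurs before the counter is incremented.
 * fail_late:  an error occurs after the counter is incremented.
 */
static int model_probe(bool fail_early, bool fail_late)
{
	int ret = 0;

	if (fail_early) {
		ret = -1;
		goto out;	/* nothing was taken, so nothing to drop */
	}

	model_pm_get();

	if (fail_late) {
		ret = -1;
		goto out_put;	/* undo the increment taken above */
	}

	return 0;		/* success keeps the reference in this model */

out_put:
	model_pm_put();
out:
	return ret;
}
```

Routing the early failure through `out_put` instead of `out` would decrement a counter that was never incremented, which is the kind of imbalance the commit removes.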
    • net: ethernet: stmmac: Enable interface clocks on probe for IPQ806x · 4ed49848
      Jonathan McDowell authored
      [ Upstream commit a96ac8a0 ]
      
      The ipq806x_gmac_probe() function enables the PTP clock but not the
      appropriate interface clocks. This means that if the bootloader hasn't
      done so, attempting to bring up the interface will fail with an error
      like:
      
      [   59.028131] ipq806x-gmac-dwmac 37600000.ethernet: Failed to reset the dma
      [   59.028196] ipq806x-gmac-dwmac 37600000.ethernet eth1: stmmac_hw_setup: DMA engine initialization failed
      [   59.034056] ipq806x-gmac-dwmac 37600000.ethernet eth1: stmmac_open: Hw setup failed
      
      This patch, a slightly cleaned up version of one posted by Sergey
      Sergeev in:
      
      https://forum.openwrt.org/t/support-for-mikrotik-rb3011uias-rm/4064/257
      
      correctly enables the clock; we have already configured the source just
      before this.
      
      Tested on a MikroTik RB3011.
      Signed-off-by: Jonathan McDowell <noodles@earth.li>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net/ethernet/freescale: rework quiesce/activate for ucc_geth · 876119e5
      Valentin Longchamp authored
      [ Upstream commit 79dde73c ]
      
      ugeth_quiesce/activate are used to halt the controller when there is a
      link change that requires reconfiguring the MAC.
      
      The previous implementation called netif_device_detach(). This however
      causes the initial activation of the netdevice to fail precisely because
      it's detached. For details, see [1].
      
      A possible workaround was to revert the commit
      "net: linkwatch: add check for netdevice being present to linkwatch_do_dev".
      However, the check introduced in that commit is correct and shall be
      kept.
      
      The netif_device_detach() is thus replaced with
      netif_tx_stop_all_queues(), which prevents any transmission. This allows
      the MAC config change required by the link change to be performed
      without detaching the corresponding netdevice and thus without
      preventing its initial activation.
      
      [1] https://lists.openwall.net/netdev/2020/01/08/201

      Signed-off-by: Valentin Longchamp <valentin@longchamp.me>
      Acked-by: Matteo Ghidoni <matteo.ghidoni@ch.abb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • null_blk: return error for invalid zone size · fb1c56d1
      Chaitanya Kulkarni authored
      [ Upstream commit e2748325 ]
      
      In null_init_zoned_dev(), check whether the zone size is larger than the
      device capacity and return an error if it is.
      
      This also fixes the following oops:
      
      null_blk: changed the number of conventional zones to 4294967295
      BUG: kernel NULL pointer dereference, address: 0000000000000010
      PGD 7d76c5067 P4D 7d76c5067 PUD 7d240c067 PMD 0
      Oops: 0002 [#1] SMP NOPTI
      CPU: 4 PID: 5508 Comm: nullbtests.sh Tainted: G OE 5.7.0-rc4lblk-fnext0
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e4
      RIP: 0010:null_init_zoned_dev+0x17a/0x27f [null_blk]
      RSP: 0018:ffffc90007007e00 EFLAGS: 00010246
      RAX: 0000000000000020 RBX: ffff8887fb3f3c00 RCX: 0000000000000007
      RDX: 0000000000000000 RSI: ffff8887ca09d688 RDI: ffff888810fea510
      RBP: 0000000000000010 R08: ffff8887ca09d688 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: ffff8887c26e8000
      R13: ffffffffa05e9390 R14: 0000000000000000 R15: 0000000000000001
      FS:  00007fcb5256f740(0000) GS:ffff888810e00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000010 CR3: 000000081e8fe000 CR4: 00000000003406e0
      Call Trace:
       null_add_dev+0x534/0x71b [null_blk]
       nullb_device_power_store.cold.41+0x8/0x2e [null_blk]
       configfs_write_file+0xe6/0x150
       vfs_write+0xba/0x1e0
       ksys_write+0x5f/0xe0
       do_syscall_64+0x60/0x250
       entry_SYSCALL_64_after_hwframe+0x49/0xb3
      RIP: 0033:0x7fcb51c71840
      Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
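
The validation this commit describes amounts to a bounds check before any zone bookkeeping is set up. The sketch below is a user-space illustration under assumptions: `validate_zone_size()` and `zone_count()` are hypothetical helpers, both sizes are taken to be in the same unit, and the zero-size guard is an addition of the sketch rather than something stated in the commit.

```c
#include <errno.h>

/* Number of whole zones that fit in the device (caller validates first). */
static unsigned long zone_count(unsigned long capacity, unsigned long zone_size)
{
	return capacity / zone_size;
}

/*
 * Reject a zone size larger than the device capacity. The zero test is an
 * extra guard for this sketch so zone_count() never divides by zero.
 */
static int validate_zone_size(unsigned long capacity, unsigned long zone_size)
{
	if (zone_size == 0 || zone_size > capacity)
		return -EINVAL;
	return 0;
}
```

With an oversized zone, `zone_count()` would yield zero zones, so any later code that assumes at least one zone would be working with an empty layout; failing early avoids that.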
    • s390/mm: fix set_huge_pte_at() for empty ptes · 84fc4e58
      Gerald Schaefer authored
      [ Upstream commit ac8372f3 ]
      
      On s390, the layout of normal and large ptes (i.e. pmds/puds) differs.
      Therefore, set_huge_pte_at() does a conversion from a normal pte to
      the corresponding large pmd/pud. So, when converting an empty pte, this
      should result in an empty pmd/pud, which would return true for
      pmd/pud_none().
      
      However, after conversion we also mark the pmd/pud as large, and
      therefore present. For empty ptes, this will result in an empty pmd/pud
      that is also marked as large, and pmd/pud_none() would not return true.
      
      There is currently no issue with this behaviour, as set_huge_pte_at()
      does not seem to be called for empty ptes. It would be valid though, so
      let's fix this by not marking empty ptes as large in set_huge_pte_at().
      
      This was found by testing a patch from Anshuman Khandual, which is
      currently discussed on LKML ("mm/debug: Add more arch page table helper
      tests").
      Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
      Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>