1. 12 Oct, 2018 13 commits
  2. 11 Oct, 2018 27 commits
    • Merge branch 'net-dsa-bcm_sf2-Couple-of-fixes' · 6b9bab55
      David S. Miller authored
      Florian Fainelli says:
      
      ====================
      net: dsa: bcm_sf2: Couple of fixes
      
      Here are two fixes for the bcm_sf2 driver that were found during
      testing unbind and analysing another issue during system
      suspend/resume.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: bcm_sf2: Call setup during switch resume · 54baca09
      Florian Fainelli authored
      There is no reason to open code what the switch setup function does. In
      fact, because we just issued a switch reset, all the registers return to
      their default values, which means, for instance, that unused ports are
      enabled again, wasting power and leading to an inappropriate switch core
      clock being selected.
      
      Fixes: 8cfa9498 ("net: dsa: bcm_sf2: add suspend/resume callbacks")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: bcm_sf2: Fix unbind ordering · bf3b452b
      Florian Fainelli authored
      The order in which we release resources is unfortunately leading to bus
      errors while dismantling the port. This is because we set
      priv->wol_ports_mask to 0 to tell bcm_sf2_sw_suspend() that it is now
      permissible to clock gate the switch. Later on, when dsa_slave_destroy()
      comes in from dsa_unregister_switch() and calls
      dsa_switch_ops::port_disable, we perform the same dismantling again, and
      this time we hit registers that are clock gated.
      
      Make sure that dsa_unregister_switch() is the first thing that happens,
      which takes care of releasing all user visible resources, then proceed
      with clock gating hardware. We still need to set priv->wol_ports_mask to
      0 to make sure that an enabled port properly gets disabled in case it
      was previously used as part of Wake-on-LAN.
      
      Fixes: d9338023 ("net: dsa: bcm_sf2: Make it a real platform device driver")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
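The ordering problem described in the unbind fix can be illustrated with a small toy model. This is only a sketch, not driver code: `ToySwitch`, `BusError`, `remove_buggy` and `remove_fixed` are hypothetical names, and the model assumes that any port teardown touches registers that need the switch clock.

```python
class BusError(Exception):
    """Raised when a register is touched while the clock is gated."""


class ToySwitch:
    """Hypothetical stand-in for a clock-gated switch with user-visible ports."""

    def __init__(self, ports):
        self.clock_enabled = True
        self.enabled_ports = set(ports)
        self.wol_ports_mask = 0

    def port_disable(self, port):
        # Disabling a port writes switch registers, which needs the clock.
        if not self.clock_enabled:
            raise BusError(f"port {port}: register access while clock gated")
        self.enabled_ports.discard(port)

    def unregister(self):
        # Models releasing all user-visible resources, one port at a time.
        for port in sorted(self.enabled_ports):
            self.port_disable(port)

    def suspend(self):
        # Only gate the clock when no port is needed for Wake-on-LAN.
        if self.wol_ports_mask == 0:
            self.clock_enabled = False


def remove_buggy(sw):
    sw.wol_ports_mask = 0
    sw.suspend()      # clock is gated first ...
    sw.unregister()   # ... so port teardown hits gated registers


def remove_fixed(sw):
    sw.wol_ports_mask = 0
    sw.unregister()   # release user-visible resources while still clocked
    sw.suspend()      # only then gate the clock
```

In this model `remove_buggy()` raises `BusError` on the first port, while `remove_fixed()` disables every port and then gates the clock.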
    • vmlinux.lds.h: Fix linker warnings about orphan .LPBX sections · 52c8ee5b
      Peter Oberparleiter authored
      Enabling both CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=y and
      CONFIG_GCOV_PROFILE_ALL=y results in linker warnings:
      
        warning: orphan section `.data..LPBX1' being placed in
        section `.data..LPBX1'.
      
      LD_DEAD_CODE_DATA_ELIMINATION adds compiler flag -fdata-sections. This
      option causes GCC to create separate data sections for data objects,
      including those generated by GCC internally for gcov profiling. The
      names of these objects start with a dot (.LPBX0, .LPBX1), resulting in
      section names starting with 'data..'.
      
      As section names starting with 'data..' are used for specific purposes
      in the Linux kernel, the linker script does not automatically include
      them in the output data section, resulting in the "orphan section"
      linker warnings.
      
      Fix this by specifically including sections named "data..LPBX*" in the
      data section.
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Tested-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Tested-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com>
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
    • vmlinux.lds.h: Fix incomplete .text.exit discards · 8dcf86ca
      Peter Oberparleiter authored
      Enabling CONFIG_GCOV_PROFILE_ALL=y causes linker errors on ARM:
      
        `.text.exit' referenced in section `.ARM.exidx.text.exit':
        defined in discarded section `.text.exit'
      
        `.text.exit' referenced in section `.fini_array.00100':
        defined in discarded section `.text.exit'
      
      And related errors on NDS32:
      
        `.text.exit' referenced in section `.dtors.65435':
        defined in discarded section `.text.exit'
      
      The gcov compiler flags cause certain compiler versions to generate
      additional destructor-related sections that are not yet handled by the
      linker script, resulting in references between discarded and
      non-discarded sections.
      
      Since destructors are not used in the Linux kernel, fix this by
      discarding these additional sections.
      Reported-by: Arnd Bergmann <arnd@arndb.de>
      Tested-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Arnd Bergmann <arnd@arndb.de>
      Reported-by: Greentime Hu <green.hu@gmail.com>
      Tested-by: Masami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com>
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
    • net: phy: sfp: remove sfp_mutex's definition · 05285866
      Sebastian Andrzej Siewior authored
      The sfp_mutex variable is defined but never used in this file. Not even
      in the commit that introduced that variable.
      
      Remove sfp_mutex, it has no purpose.
      
      Cc: Andrew Lunn <andrew@lunn.ch>
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8169: set RX_MULTI_EN bit in RxConfig for 8168F-family chips · 511cfd58
      Maciej S. Szmigiero authored
      It has been reported that since
      commit 05212ba8 ("r8169: set RxConfig after tx/rx is enabled for RTL8169sb/8110sb devices")
      at least RTL_GIGA_MAC_VER_38 NICs work erratically after a resume from
      suspend.
      The problem has been traced to a missing RX_MULTI_EN bit in the RxConfig
      register.
      We already set this bit for RTL_GIGA_MAC_VER_35 NICs of the same 8168F
      chip family so let's do it also for its other siblings: RTL_GIGA_MAC_VER_36
      and RTL_GIGA_MAC_VER_38.
      
      Curiously, the NIC seems to work fine after a system boot without having
      this bit set as long as the system isn't suspended and resumed.
      
      Fixes: 05212ba8 ("r8169: set RxConfig after tx/rx is enabled for RTL8169sb/8110sb devices")
      Reported-by: Chris Clayton <chris2553@googlemail.com>
      Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
      Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
      Tested-by: Chris Clayton <chris2553@googlemail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: socionext: clear rx irq correctly · 2a1e89df
      Ilias Apalodimas authored
      commit 63ae7949 ("net: socionext: Use descriptor info instead of MMIO reads on Rx")
      removed constant mmio reads from the driver and started using a descriptor
      field to check if packet should be processed.
      This led to the napi rx handler being constantly called while no packets
      needed processing and ksoftirq getting 100% cpu usage. Issue one mmio read
      to clear the irq correctly after processing packets.
      Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
      Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx4_core: Fix warnings during boot on driverinit param set failures · 26450608
      Moshe Shemesh authored
      During boot, mlx4_core sets the driverinit configuration parameters and
      updates the devlink module with the initial values by calling
      devlink_param_driverinit_value_set().
      If devlink_param_driverinit_value_set() returns an error, mlx4_core
      reports a kernel module warning.
      
      This caused a false alarm during boot when the kernel was compiled with
      CONFIG_NET_DEVLINK off.
      Fix by removing the warning reported when
      devlink_param_driverinit_value_set() fails.
      
      This makes the function mlx4_devlink_set_init_value() redundant with
      calling devlink_param_driverinit_value_set() directly, so it is
      removed.
      
      It fixes the following kernel trace:
      
       mlx4_core 0000:00:06.0: devlink set parameter 0 value failed (err = -95)
       mlx4_core 0000:00:06.0: devlink set parameter 1 value failed (err = -95)
       mlx4_core 0000:00:06.0: devlink set parameter 4 value failed (err = -95)
       mlx4_core 0000:00:06.0: devlink set parameter 5 value failed (err = -95)
       mlx4_core 0000:00:06.0: devlink set parameter 3 value failed (err = -95)
      
      Fixes: bd1b51dc ("mlx4: Add mlx4 initial parameters table and register it")
      Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'for-4.19-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup · 0778a9f2
      Greg Kroah-Hartman authored
      Tejun writes:
        "cgroup fixes for v4.19-rc7
      
         One cgroup2 threaded mode fix for v4.19-rc7.  While threaded mode
         isn't used widely (yet) and the bug requires somewhat convoluted
         sequence of operations, it causes a userland visible malfunction -
         EINVAL on a valid attempt to enable threaded mode.  This pull request
         contains the fix"
      
      * 'for-4.19-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
        cgroup: Fix dom_cgrp propagation when enabling threaded mode
    • tipc: eliminate possible recursive locking detected by LOCKDEP · a1f8dd34
      Ying Xue authored
      When booting a kernel with the LOCKDEP option enabled, the warning below was found:
      
      WARNING: possible recursive locking detected
      4.19.0-rc7+ #14 Not tainted
      --------------------------------------------
      swapper/0/1 is trying to acquire lock:
      00000000dcfc0fc8 (&(&list->lock)->rlock#4){+...}, at: spin_lock_bh
      include/linux/spinlock.h:334 [inline]
      00000000dcfc0fc8 (&(&list->lock)->rlock#4){+...}, at:
      tipc_link_reset+0x125/0xdf0 net/tipc/link.c:850
      
      but task is already holding lock:
      00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at: spin_lock_bh
      include/linux/spinlock.h:334 [inline]
      00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at:
      tipc_link_reset+0xfa/0xdf0 net/tipc/link.c:849
      
      other info that might help us debug this:
       Possible unsafe locking scenario:
      
             CPU0
             ----
        lock(&(&list->lock)->rlock#4);
        lock(&(&list->lock)->rlock#4);
      
       *** DEADLOCK ***
      
       May be due to missing lock nesting notation
      
      2 locks held by swapper/0/1:
       #0: 00000000f7539d34 (pernet_ops_rwsem){+.+.}, at:
      register_pernet_subsys+0x19/0x40 net/core/net_namespace.c:1051
       #1: 00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at:
      spin_lock_bh include/linux/spinlock.h:334 [inline]
       #1: 00000000cbb9b036 (&(&list->lock)->rlock#4){+...}, at:
      tipc_link_reset+0xfa/0xdf0 net/tipc/link.c:849
      
      stack backtrace:
      CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc7+ #14
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x1af/0x295 lib/dump_stack.c:113
       print_deadlock_bug kernel/locking/lockdep.c:1759 [inline]
       check_deadlock kernel/locking/lockdep.c:1803 [inline]
       validate_chain kernel/locking/lockdep.c:2399 [inline]
       __lock_acquire+0xf1e/0x3c60 kernel/locking/lockdep.c:3411
       lock_acquire+0x1db/0x520 kernel/locking/lockdep.c:3900
       __raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline]
       _raw_spin_lock_bh+0x31/0x40 kernel/locking/spinlock.c:168
       spin_lock_bh include/linux/spinlock.h:334 [inline]
       tipc_link_reset+0x125/0xdf0 net/tipc/link.c:850
       tipc_link_bc_create+0xb5/0x1f0 net/tipc/link.c:526
       tipc_bcast_init+0x59b/0xab0 net/tipc/bcast.c:521
       tipc_init_net+0x472/0x610 net/tipc/core.c:82
       ops_init+0xf7/0x520 net/core/net_namespace.c:129
       __register_pernet_operations net/core/net_namespace.c:940 [inline]
       register_pernet_operations+0x453/0xac0 net/core/net_namespace.c:1011
       register_pernet_subsys+0x28/0x40 net/core/net_namespace.c:1052
       tipc_init+0x83/0x104 net/tipc/core.c:140
       do_one_initcall+0x109/0x70a init/main.c:885
       do_initcall_level init/main.c:953 [inline]
       do_initcalls init/main.c:961 [inline]
       do_basic_setup init/main.c:979 [inline]
       kernel_init_freeable+0x4bd/0x57f init/main.c:1144
       kernel_init+0x13/0x180 init/main.c:1063
       ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:413
      
      LOCKDEP complains because tipc_link_reset() takes l->wakeupq.lock and
      l->inputq->lock nested, and both belong to the same lock class. In fact
      it is unnecessary to hold the two locks at the same time while moving skb
      buffers from the l->wakeupq queue to the l->inputq queue. Instead, we can
      first move the skb buffers in the l->wakeupq queue to a temporary list
      and then move the buffers of the temporary list to the l->inputq queue,
      which is also safe for us.
      
      Fixes: 3f32d0be ("tipc: lock wakeup & inputq at tipc_link_reset()")
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Ying Xue <ying.xue@windriver.com>
      Acked-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
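The temporary-list approach from the tipc locking fix can be sketched in user space. This is an illustrative model only, not the kernel change: `LockedQueue`, `move_nested` and `move_via_temp` are hypothetical names, and `threading.Lock` stands in for the per-queue spinlocks.

```python
import threading
from collections import deque


class LockedQueue:
    """Toy stand-in for a packet queue protected by its own lock."""

    def __init__(self, items=()):
        self.lock = threading.Lock()
        self.items = deque(items)


def move_nested(src, dst):
    # Pattern the commit removes: both queue locks are held at once.
    with src.lock:
        with dst.lock:
            dst.items.extend(src.items)
            src.items.clear()


def move_via_temp(src, dst):
    # Pattern the commit describes: drain into a private list under the
    # source lock only, then append under the destination lock only.
    with src.lock:
        tmp = list(src.items)
        src.items.clear()
    with dst.lock:
        dst.items.extend(tmp)
```

Both functions leave the destination queue with the same contents; the difference is that `move_via_temp()` never holds two locks of the same class at the same time, which is the nesting the lock validator flags.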
    • Merge tag 'kbuild-fixes-v4.19-2' of... · e5337178
      Greg Kroah-Hartman authored
      Merge tag 'kbuild-fixes-v4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
      
      Masahiro writes:
        "Kbuild fixes for v4.19 (2nd)
         - Fix warnings from recordmcount.pl when building with Clang
         - Allow Clang to use GNU toolchains correctly
         - Disable CONFIG_SAMPLES for UML to avoid build error"
      
      * tag 'kbuild-fixes-v4.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
        samples: disable CONFIG_SAMPLES for UML
        kbuild: allow to use GCC toolchain not in Clang search path
        ftrace: Build with CPPFLAGS to get -Qunused-arguments
    • Merge branch 'net-explicitly-requires-bash-when-needed' · 26b1f4cb
      David S. Miller authored
      Paolo Abeni says:
      
      ====================
      net: explicitly requires bash when needed.
      
      Some test scripts require bash-only features but use the default shell.
      This may cause random failures if the default shell is not bash.
      Instead of doing a potentially complex rewrite of such scripts, these patches
      require the bash interpreter, where needed.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: udpgso_bench.sh explicitly requires bash · 12a2ea96
      Paolo Abeni authored
      The udpgso_bench.sh script requires several bash-only features. This
      may cause random failures if the default shell is not bash.
      Address the above by explicitly requiring bash as the script interpreter.
      
      Fixes: 3a687bef ("selftests: udp gso benchmark")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: rtnetlink.sh explicitly requires bash. · 3c718e67
      Paolo Abeni authored
      The script rtnetlink.sh requires a bash-only feature (sleep with sub-second
      precision). This may cause random test failures if the default shell is not
      bash.
      Address the above by explicitly requiring bash as the script interpreter.
      
      Fixes: 33b01b7b ("selftests: add rtnetlink test script")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'alloc-args-v4.19-rc8' of https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 834d3cd2
      Greg Kroah-Hartman authored
      Kees writes:
        "Fix open-coded multiplication arguments to allocators
      
         - Fixes several new open-coded multiplications added in the 4.19
           merge window."
      
      * tag 'alloc-args-v4.19-rc8' of https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        treewide: Replace more open-coded allocation size multiplications
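Why open-coded multiplications in allocator arguments are worth replacing can be shown with a short arithmetic model. It is a sketch under stated assumptions: `SIZE_MAX` assumes a 64-bit `size_t`, and `open_coded_size` and `checked_size` are hypothetical helper names that only mimic wrap-around versus saturating behaviour; they are not kernel APIs.

```python
SIZE_MAX = 2**64 - 1  # assumption: 64-bit size_t


def open_coded_size(count, elem_size):
    # What "count * elem_size" yields in C when the size_t product wraps.
    return (count * elem_size) & SIZE_MAX


def checked_size(count, elem_size):
    # Saturate to SIZE_MAX on overflow so that an allocator would fail the
    # request instead of returning a buffer that is too small.
    total = count * elem_size
    return SIZE_MAX if total > SIZE_MAX else total
```

With `count = 2**62` and `elem_size = 8`, the open-coded product wraps to a tiny size, while the checked variant saturates and makes the failure visible to the caller.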
    • Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 9f203e2f
      Greg Kroah-Hartman authored
      Ingo writes:
        "x86 fixes
      
         An intel_rdt memory access fix and a VLA fix in pgd_alloc()."
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/mm: Avoid VLA in pgd_alloc()
        x86/intel_rdt: Fix out-of-bounds memory access in CBM tests
    • Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a22dd362
      Greg Kroah-Hartman authored
      Ingo writes:
        "scheduler fix:
      
         Cleanup of dead code left over from the recent sched/numa fixes."
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        mm, sched/numa: Remove remaining traces of NUMA rate-limiting
    • Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 6302aad4
      Greg Kroah-Hartman authored
      Ingo, a man of few words, writes:
        "perf fixes:
      
         misc perf tooling fixes"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf record: Use unmapped IP for inline callchain cursors
        perf python: Use -Wno-redundant-decls to build with PYTHON=python3
        perf report: Don't try to map ip to invalid map
        perf script python: Fix export-to-sqlite.py sample columns
        perf script python: Fix export-to-postgresql.py occasional failure
    • qmi_wwan: Added support for Gemalto's Cinterion ALASxx WWAN interface · 4f761770
      Giacinto Cifelli authored
      Added support for Gemalto's Cinterion ALASxx WWAN interfaces
      by adding QMI_FIXED_INTF with Cinterion's VID and PID.
      Signed-off-by: Giacinto Cifelli <gciofono@gmail.com>
      Acked-by: Bjørn Mork <bjorn@mork.no>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: queue socket protocol error messages into socket receive buffer · e7eb0582
      Parthasarathy Bhuvaragan authored
      In tipc_sk_filter_rcv(), when we detect protocol messages with error we
      call tipc_sk_conn_proto_rcv() and let it reset the connection and notify
      the socket by calling sk->sk_state_change().
      
      However, tipc_sk_filter_rcv() may have been called from the function
      tipc_backlog_rcv(), in which case the socket lock is held and the socket
      already awake. This means that the sk_state_change() call is ignored and
      the error notification lost. Now the receive queue will remain empty and
      the socket sleeps forever.
      
      In this commit, we convert the protocol message into a connection abort
      message and enqueue it into the socket's receive queue. By this addition
      to the above state change we cover all conditions.
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: set link tolerance correctly in broadcast link · 047491ea
      Jon Maloy authored
      In the patch referred to below we added link tolerance as an additional
      criterion for declaring broadcast transmission "stale" and resetting the
      affected links.
      
      However, the 'tolerance' field of the broadcast link is never set, and
      remains at zero. This renders the whole commit without the intended
      improving effect, but luckily also with no negative effect.
      
      In this commit we add the missing initialization.
      
      Fixes: a4dc70d4 ("tipc: extend link reset criteria for stale packet retransmission")
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'net-ipv4-fixes-for-PMTU-when-link-MTU-changes' · 28b6bfeb
      David S. Miller authored
      Sabrina Dubroca says:
      
      ====================
      net: ipv4: fixes for PMTU when link MTU changes
      
      The first patch adapts the changes that commit e9fa1495 ("ipv6:
      Reflect MTU changes on PMTU of exceptions for MTU-less routes") did in
      IPv6 to IPv4: lower PMTU when the first hop's MTU drops below it, and
      raise PMTU when the first hop was limiting PMTU discovery and its MTU
      is increased.
      
      The second patch fixes bugs introduced in commit d52e5a7e ("ipv4:
      lock mtu in fnhe when received PMTU < net.ipv4.route.min_pmtu") that
      only appear once the first patch is applied.
      
      Selftests for these cases were introduced in net-next commit
      e44e428f ("selftests: pmtu: add basic IPv4 and IPv6 PMTU tests")
      
      v2: add cover letter, and fix a few small things in patch 1
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipv4: don't let PMTU updates increase route MTU · 28d35bcd
      Sabrina Dubroca authored
      When an MTU update with PMTU smaller than net.ipv4.route.min_pmtu is
      received, we must clamp its value. However, we can receive a PMTU
      exception with PMTU < old_mtu < ip_rt_min_pmtu, which would lead to an
      increase in PMTU.
      
      To fix this, take the smallest of the old MTU and ip_rt_min_pmtu.
      
      Before this patch, in case of an update, the exception's MTU would
      always change. Now, an exception can have only its lock flag updated,
      but not the MTU, so we need to add a check on locking to the following
      "is this exception getting updated, or close to expiring?" test.
      
      Fixes: d52e5a7e ("ipv4: lock mtu in fnhe when received PMTU < net.ipv4.route.min_pmtu")
      Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
      Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
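The clamping change described in the "don't let PMTU updates increase route MTU" commit can be modelled in a few lines. This is a sketch based only on the commit message: `clamp_before_fix` and `clamp_after_fix` are hypothetical names, and the default of 552 for `net.ipv4.route.min_pmtu` is an assumption used for illustration.

```python
MIN_PMTU = 552  # assumed value of net.ipv4.route.min_pmtu


def clamp_before_fix(old_mtu, pmtu, min_pmtu=MIN_PMTU):
    """Return (mtu, locked): a too-small PMTU is raised to min_pmtu."""
    if pmtu < min_pmtu:
        return min_pmtu, True
    return pmtu, False


def clamp_after_fix(old_mtu, pmtu, min_pmtu=MIN_PMTU):
    """Return (mtu, locked): never clamp above the MTU already in use."""
    if pmtu < min_pmtu:
        return min(old_mtu, min_pmtu), True
    return pmtu, False
```

For `old_mtu = 500` and a received PMTU of 400, the first variant returns 552, which is larger than the MTU that was in effect, while the second keeps 500. In both cases only the lock flag records that the received value was below the minimum.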
    • net: ipv4: update fnhe_pmtu when first hop's MTU changes · af7d6cce
      Sabrina Dubroca authored
      Since commit 5aad1de5 ("ipv4: use separate genid for next hop
      exceptions"), exceptions get deprecated separately from cached
      routes. In particular, administrative changes don't clear PMTU anymore.
      
      As Stefano described in commit e9fa1495 ("ipv6: Reflect MTU changes
      on PMTU of exceptions for MTU-less routes"), the PMTU discovered before
      the local MTU change can become stale:
       - if the local MTU is now lower than the PMTU, that PMTU is now
         incorrect
       - if the local MTU was the lowest value in the path, and is increased,
         we might discover a higher PMTU
      
      Similarly to what commit e9fa1495 did for IPv6, update PMTU in those
      cases.
      
      If the exception was locked, the discovered PMTU was smaller than the
      minimal accepted PMTU. In that case, if the new local MTU is smaller
      than the current PMTU, let PMTU discovery figure out if locking of the
      exception is still needed.
      
      To do this, we need to know the old link MTU in the NETDEV_CHANGEMTU
      notifier. By the time the notifier is called, dev->mtu has been
      changed. This patch adds the old MTU as additional information in the
      notifier structure, and a new call_netdevice_notifiers_u32() function.
      
      Fixes: 5aad1de5 ("ipv4: use separate genid for next hop exceptions")
      Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
      Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
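The cases listed in the fnhe_pmtu commit message (lower the PMTU when the link MTU drops below it, raise it when the link was the limiting hop, and unlock a locked exception when the new link MTU is not larger than it) can be written as a small decision function. This is a model derived from the text above, not the kernel code; `update_exception_pmtu` and its parameters are hypothetical names.

```python
def update_exception_pmtu(pmtu, locked, old_link_mtu, new_link_mtu):
    """Return the exception's (pmtu, locked) after a link MTU change."""
    if locked:
        # A locked exception saw a PMTU below the accepted minimum. If the
        # link MTU is now at or below it, follow the link and let PMTU
        # discovery decide whether locking is still needed.
        if new_link_mtu <= pmtu:
            return new_link_mtu, False
        return pmtu, True
    if new_link_mtu < pmtu:
        # The link can no longer carry packets of the cached PMTU.
        return new_link_mtu, False
    if old_link_mtu == pmtu:
        # The link was the limiting hop, so a higher PMTU may now be found.
        return new_link_mtu, False
    return pmtu, False
```

The function needs the old link MTU as input, which is why the commit adds that value to the notifier information: by the time the notifier runs, only the new MTU is available from the device.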
    • net/ipv6: stop leaking percpu memory in fib6 info · 7abab7b9
      Mike Rapoport authored
      The fib6_info_alloc() function allocates percpu memory to hold per CPU
      pointers to rt6_info, but this memory is never freed. Fix it.
      
      Fixes: a64efe14 ("net/ipv6: introduce fib6_info struct and helpers")
      Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'rxrpc-fixes-20181008' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs · 49b538e7
      David S. Miller authored
      David Howells says:
      
      ====================
      rxrpc: Fix packet reception code
      
      Here is a set of patches that prepare for and fix problems in rxrpc's
      packet reception code.  The serious problems are:
      
       (A) There's a window between binding the socket and setting the data_ready
           hook in which packets can find their way into the UDP socket's receive
           queues.
      
       (B) The skb_recv_udp() will return an error (and clear the error state) if
           there was an error on the Tx side.  rxrpc doesn't handle this.
      
       (C) The rxrpc data_ready handler doesn't fully drain the UDP receive
           queue.
      
       (D) The rxrpc data_ready handler assumes it is called in a non-reentrant
           state.
      
      The second patch fixes (A) - (C); the third patch renders (B) and (C)
      non-issues by using the encap_rcv hook instead of data_ready - and the
      final patch fixes (D).  That last is the most complex.
      
      The preparatory patches are:
      
       (1) Fix some places that are doing things in the wrong net namespace.
      
       (2) Stop taking the rcu read lock as it's held by the IP input routine in
           the call chain.
      
       (3) Only end the Tx phase if *we* rotated the final packet out of the Tx
           buffer.
      
       (4) Don't assume that the call state won't change after dropping the
           call_state lock.
      
       (5) Only take receive window and MTU size parameters from an ACK packet if
           it's the latest ACK packet.
      
       (6) Record connection-level abort information correctly.
      
       (7) Fix a trace line.
      
      And then there are three main patches - note that these are mixed in with
      the preparatory patches somewhat:
      
       (1) Fix the setup window (A), skb_recv_udp() error check (B) and packet
           drainage (C).
      
       (2) Switch to using the encap_rcv instead of data_ready to cut out the
           effects of the UDP read queues and get the packets delivered directly.
      
       (3) Add more locking into the various packet input paths to defend against
           re-entrance (D).
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>