1. 12 Mar, 2018 23 commits
    • l2tp: fix races with ipv4-mapped ipv6 addresses · b954f940
      Paolo Abeni authored
      The l2tp_tunnel_create() function checks for v4mapped ipv6
      sockets and caches that flag, so that l2tp core code can
      reuse it at xmit time.
      
      If the socket is provided by the userspace, the connection
      status of the tunnel sockets can change between the tunnel
      creation and the xmit call, so that syzbot is able to
      trigger the following splat:
      
      BUG: KASAN: use-after-free in ip6_dst_idev include/net/ip6_fib.h:192
      [inline]
      BUG: KASAN: use-after-free in ip6_xmit+0x1f76/0x2260
      net/ipv6/ip6_output.c:264
      Read of size 8 at addr ffff8801bd949318 by task syz-executor4/23448
      
      CPU: 0 PID: 23448 Comm: syz-executor4 Not tainted 4.16.0-rc4+ #65
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
      Google 01/01/2011
      Call Trace:
        __dump_stack lib/dump_stack.c:17 [inline]
        dump_stack+0x194/0x24d lib/dump_stack.c:53
        print_address_description+0x73/0x250 mm/kasan/report.c:256
        kasan_report_error mm/kasan/report.c:354 [inline]
        kasan_report+0x23c/0x360 mm/kasan/report.c:412
        __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
        ip6_dst_idev include/net/ip6_fib.h:192 [inline]
        ip6_xmit+0x1f76/0x2260 net/ipv6/ip6_output.c:264
        inet6_csk_xmit+0x2fc/0x580 net/ipv6/inet6_connection_sock.c:139
        l2tp_xmit_core net/l2tp/l2tp_core.c:1053 [inline]
        l2tp_xmit_skb+0x105f/0x1410 net/l2tp/l2tp_core.c:1148
        pppol2tp_sendmsg+0x470/0x670 net/l2tp/l2tp_ppp.c:341
        sock_sendmsg_nosec net/socket.c:630 [inline]
        sock_sendmsg+0xca/0x110 net/socket.c:640
        ___sys_sendmsg+0x767/0x8b0 net/socket.c:2046
        __sys_sendmsg+0xe5/0x210 net/socket.c:2080
        SYSC_sendmsg net/socket.c:2091 [inline]
        SyS_sendmsg+0x2d/0x50 net/socket.c:2087
        do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
        entry_SYSCALL_64_after_hwframe+0x42/0xb7
      RIP: 0033:0x453e69
      RSP: 002b:00007f819593cc68 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 00007f819593d6d4 RCX: 0000000000453e69
      RDX: 0000000000000081 RSI: 000000002037ffc8 RDI: 0000000000000004
      RBP: 000000000072bea0 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
      R13: 00000000000004c3 R14: 00000000006f72e8 R15: 0000000000000000
      
      This change addresses the issue by:
      * explicitly checking for TCP_ESTABLISHED for user space provided sockets
      * dropping the v4mapped flag usage - it can become outdated - and
        explicitly invoking ipv6_addr_v4mapped() instead
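
      The second point can be illustrated with a small userspace sketch. It is
      Python, not kernel code: the Tunnel class and its method names are
      hypothetical, and only the idea (re-check the address on every transmit
      instead of trusting a flag cached at creation time) comes from the
      message above.

```python
import ipaddress


def is_v4mapped(addr: str) -> bool:
    """Return True if addr is an IPv4-mapped IPv6 address (::ffff:a.b.c.d)."""
    return ipaddress.IPv6Address(addr).ipv4_mapped is not None


class Tunnel:
    """Hypothetical stand-in for a tunnel bound to a userspace-provided socket."""

    def __init__(self, daddr: str):
        self.daddr = daddr
        # Old behaviour: the flag is computed once, at creation time.
        self.cached_v4mapped = is_v4mapped(daddr)

    def xmit_path_cached(self) -> str:
        # Uses the possibly stale flag.
        return "ipv4" if self.cached_v4mapped else "ipv6"

    def xmit_path_checked(self) -> str:
        # New behaviour: evaluate the address again on every transmit.
        return "ipv4" if is_v4mapped(self.daddr) else "ipv6"
```

      If the peer address changes after creation, the cached variant keeps
      choosing the old path while the checked variant follows the current
      address.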
      
      The issue has apparently been there since ancient times.
      
      v1 -> v2: (many thanks to Guillaume)
       - with csum issue introduced in v1
       - replace pr_err with pr_debug
       - fix build issue with IPV6 disabled
       - move l2tp_sk_is_v4mapped in l2tp_core.c
      
      v2 -> v3:
       - don't update inet_daddr for v4mapped address, unneeded
       - drop redundant check at creation time
      
      Reported-and-tested-by: syzbot+92fa328176eb07e4ac1a@syzkaller.appspotmail.com
      Fixes: 3557baab ("[L2TP]: PPP over L2TP driver core")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipv6: keep sk status consistent after datagram connect failure · 2f987a76
      Paolo Abeni authored
      On unsuccessful ip6_datagram_connect(), if the failure is caused by
      ip6_datagram_dst_update(), the sk peer information is cleared, but
      the sk->sk_state is preserved.
      
      If the socket was already in an established status, the overall sk
      status is inconsistent and fouls later checks in datagram code.
      
      Fix this by saving the old peer information and restoring it in
      case of failure. This also aligns ipv6 datagram connect() behavior
      with ipv4.
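
      The save-and-restore pattern described above can be sketched as follows.
      This is an illustrative Python model only: the Sock dataclass, the state
      labels and the dst_update callback are assumptions made for the example
      and do not mirror the kernel structures.

```python
from dataclasses import dataclass

# Illustrative state labels, not the kernel's numeric constants.
STATE_CLOSE = "CLOSE"
STATE_ESTABLISHED = "ESTABLISHED"


@dataclass
class Sock:
    daddr: str = ""
    dport: int = 0
    state: str = STATE_CLOSE


def datagram_connect(sk: Sock, daddr: str, dport: int, dst_update) -> bool:
    """Connect sk to (daddr, dport); roll back peer info if dst_update fails."""
    old_daddr, old_dport = sk.daddr, sk.dport  # save before modifying sk
    sk.daddr, sk.dport = daddr, dport
    if not dst_update(sk):  # hypothetical step that may fail
        # Restore the previous peer so it stays consistent with sk.state.
        sk.daddr, sk.dport = old_daddr, old_dport
        return False
    sk.state = STATE_ESTABLISHED
    return True
```

      After a failed reconnect, an already established socket still carries
      the peer it was connected to before the call.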
      
      v1 -> v2:
       - added missing Fixes tag
      
      Fixes: 85cb73ff ("net: ipv6: reset daddr and dport in sk if connect() fails")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · b7475948
      David S. Miller authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following patchset contains Netfilter fixes for your net tree, they are:
      
      1) The fixed hashtable representation doesn't support the timeout flag;
         skip it, otherwise rules that add elements from the packet bogusly
         fail with EOPNOTSUPP.
      
      2) Fix bogus error with 32-bit ebtables userspace and 64-bit kernel,
         patch from Florian Westphal.
      
      3) Sanitize proc names in several x_tables extensions, also from Florian.
      
      4) Add sanitization to ebt_among wormhash logic, from Florian.
      
      5) Missing release of hook array in flowtable.
      ====================
    • Merge tag 'linux-can-fixes-for-4.16-20180312' of... · 4665c6b0
      David S. Miller authored
      Merge tag 'linux-can-fixes-for-4.16-20180312' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
      
      Marc Kleine-Budde says:
      
      ====================
      pull-request: can 2018-03-12
      
      This is a pull request of 6 patches for net/master.
      
      The first patch is by Wolfram Sang and fixes a bitshift vs. comparison mistake
      in the m_can driver. Two patches of Marek Vasut repair the error handling in
      the ifi driver. The two patches by Stephane Grosjean fix a "echo_skb is
      occupied!" bug in the peak/pcie_fd driver. Bich HEMON's patch adds pinctrl
      select state calls to the m_can driver to further improve power saving during
      suspend.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sock_diag: request _diag module only when the family or proto has been registered · bf2ae2e4
      Xin Long authored
      Currently, when using 'ss' from iproute2, the kernel tries to load all
      _diag modules, which also causes the corresponding family and proto
      modules to be loaded due to module dependencies.
      
      For instance, after running 'ss', sctp, dccp and af_packet (if built as
      a module) are loaded.
      
      For example:
      
        $ lsmod|grep sctp
        $ ss
        $ lsmod|grep sctp
        sctp_diag              16384  0
        sctp                  323584  5 sctp_diag
        inet_diag              24576  4 raw_diag,tcp_diag,sctp_diag,udp_diag
        libcrc32c              16384  3 nf_conntrack,nf_nat,sctp
      
      As these family and proto modules are loaded unintentionally, it
      could cause some problems, like:
      
      - Some debug tools use 'ss' to collect the socket info, which loads all
        those diag and family and protocol modules. It's noisy for identifying
        issues.
      
      - Users who are not aware of the sctp protocol usually expect sctp init
        packets to be dropped silently instead of having an abort sent back.
      
      - It wastes resources (especially with multiple netns), and SCTP module
        can't be unloaded once it's loaded.
      
      ...
      
      In short, it's really inappropriate to have these family and proto
      modules loaded unexpectedly when just doing debugging with inet_diag.
      
      This patch introduces sock_load_diag_module(), which loads
      the _diag module only when its corresponding family or proto has
      already been registered.
      
      Note that we can't just load _diag module without the family or
      proto loaded, as some symbols used in _diag module are from the
      family or proto module.
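
      The rule above (request a _diag module only when its family or proto is
      already registered) can be modelled in a few lines. This is a userspace
      Python sketch with hypothetical names; it only mimics the decision, not
      the kernel's module loading.

```python
def load_diag_module(name: str, registered: set, loaded: set) -> bool:
    """Load '<name>_diag' only if 'name' itself is already registered.

    'registered' models the families/protocols present in the kernel and
    'loaded' collects the diag modules requested so far (both hypothetical).
    """
    if name not in registered:
        # Do not pull in the family/proto module as a dependency.
        return False
    loaded.add(f"{name}_diag")
    return True
```

      With only tcp and udp registered, asking for sctp leaves the set of
      loaded modules untouched.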
      
      v1->v2:
        - move inet proto check to inet_diag to avoid a compile error.
      v2->v3:
        - define sock_load_diag_module in sock.c and export one symbol
          only.
        - improve the changelog.
      Reported-by: Sabrina Dubroca <sd@queasysnail.net>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: Phil Sutter <phil@nwl.cc>
      Acked-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'bnxt_en-Bug-fixes' · 9e5fb720
      David S. Miller authored
      Michael Chan says:
      
      ====================
      bnxt_en: Bug fixes.
      
      There are 3 bug fixes in this series to fix regressions recently
      introduced when adding the new ring reservations scheme.  2 minor
      fixes in the TC Flower code return standard errno values and
      elide some unnecessary dmesg warnings.  One fix corrects the VLAN TCI
      value passed to the stack by including the entire 16-bit VLAN TCI,
      and the last fix checks for a valid VNIC ID before setting up or
      shutting down LRO/GRO.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnxt_en: Check valid VNIC ID in bnxt_hwrm_vnic_set_tpa(). · 3c4fe80b
      Michael Chan authored
      During initialization, if we encounter errors, there is a code path that
      calls bnxt_hwrm_vnic_set_tpa() with an invalid VNIC ID.  This may cause a
      warning in firmware logs.
      
      Fixes: c0c050c5 ("bnxt_en: New Broadcom ethernet driver.")
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnxt_en: close & open NIC, only when the interface is in running state. · 1a037782
      Venkat Duvvuru authored
      bnxt_restore_pf_fw_resources routine frees PF resources by calling
      close_nic and allocates the resources back, by doing open_nic. However,
      this is not needed, if the PF is already in closed state.
      
      This bug causes the driver to open the device and call request_irq()
      when it is not needed.  Ultimately, pci_disable_msix() will crash
      when bnxt_en is unloaded.
      
      This patch fixes the problem by skipping __bnxt_close_nic and
      __bnxt_open_nic inside bnxt_restore_pf_fw_resources routine, if the
      interface is not running.
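
      The guard described above amounts to checking the running state before
      both the close and the open step. A minimal Python model, assuming a
      dict-based device and caller-supplied close_nic/open_nic callbacks (all
      hypothetical stand-ins for the driver routines):

```python
def restore_pf_fw_resources(dev: dict, close_nic, open_nic) -> None:
    """Re-reserve resources; close/open the NIC only if it is running."""
    running = dev["running"]
    if running:
        close_nic(dev)
    # Placeholder for the actual resource re-initialisation step.
    dev["resources_restored"] = True
    if running:
        open_nic(dev)
```

      For an interface that is down, neither callback is invoked, so nothing
      is opened that would later have to be torn down.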
      
      Fixes: 80fcaf46 ("bnxt_en: Restore MSIX after disabling SRIOV.")
      Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnxt_en: Return standard Linux error codes for hwrm flow cmds. · 6ae777ea
      Venkat Duvvuru authored
      Currently, an internal error value is returned by the driver when
      hwrm_cfa_flow_alloc() fails due to lack of resources.  We should return
      the Linux errno value -ENOSPC instead.
      
      This patch also converts other similar command errors to standard Linux errno
      code (-EIO) in bnxt_tc.c
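
      The mapping described in this message can be sketched as a small
      translation function. The firmware status values below are hypothetical
      placeholders; only the resulting errno values (-ENOSPC for lack of
      resources, -EIO for other command errors) come from the text above.

```python
import errno

# Hypothetical firmware status codes, chosen for illustration only.
FW_OK = 0
FW_ERR_NO_RESOURCES = 1
FW_ERR_OTHER = 2


def hwrm_status_to_errno(status: int) -> int:
    """Translate an internal firmware status into a negative Linux errno."""
    if status == FW_OK:
        return 0
    if status == FW_ERR_NO_RESOURCES:
        return -errno.ENOSPC
    # Any other command failure is reported as a generic I/O error.
    return -errno.EIO
```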
      
      Fixes: db1d36a2 ("bnxt_en: add TC flower offload flow_alloc/free FW cmds")
      Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnxt_en: Fix regressions when setting up MQPRIO TX rings. · 832aed16
      Michael Chan authored
      Recent changes added the bnxt_init_int_mode() call in the driver's open
      path whenever ring reservations are changed.  This call was previously
      only called in the probe path.  In the open path, if MQPRIO TC has been
      set up, the bnxt_init_int_mode() call would reset and mess up the MQPRIO
      per TC rings.
      
      Fix it by not re-initializing bp->tx_nr_rings_per_tc in
      bnxt_init_int_mode().  Instead, initialize it in the probe path only
      after the bnxt_init_int_mode() call.
      
      Fixes: 674f50a5 ("bnxt_en: Implement new method to reserve rings.")
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnxt_en: Pass complete VLAN TCI to the stack. · ed7bc602
      Michael Chan authored
      When receiving a packet with VLAN tag, pass the entire 16-bit TCI to the
      stack when calling __vlan_hwaccel_put_tag().  The current code is only
      passing the 12-bit tag and it is missing the priority bits.
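
      To see what is lost when only the 12-bit tag is passed on, the 16-bit
      TCI can be split into its standard 802.1Q fields. The helper name below
      is made up for this sketch; the field layout (3-bit priority, 1-bit
      DEI, 12-bit VLAN ID) is the standard one.

```python
def split_tci(tci: int):
    """Split a 16-bit VLAN TCI into (priority, dei, vlan_id)."""
    priority = (tci >> 13) & 0x7   # top 3 bits
    dei = (tci >> 12) & 0x1        # drop eligible indicator
    vlan_id = tci & 0x0FFF         # low 12 bits
    return priority, dei, vlan_id


# Priority 5 on VLAN 100, first as a full TCI and then truncated to 12 bits.
full_tci = (5 << 13) | 100
truncated = full_tci & 0x0FFF
```

      Splitting truncated yields priority 0, whereas full_tci keeps the
      priority bits intact.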
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnxt_en: Remove unwanted ovs-offload messages in some conditions · b9ecc340
      Sriharsha Basavapatna authored
      In some conditions when the driver fails to add a flow in HW and returns
      an error back to the stack, the stack continues to invoke get_flow_stats()
      and/or del_flow() on it. The driver fails these APIs with an error message
      "no flow_node for cookie". The message gets logged repeatedly as long as
      the stack keeps invoking these functions.
      
      Fix this by removing the corresponding netdev_info() calls from these
      functions.
      
      Fixes: d7bc7305 ("bnxt_en: add code to query TC flower offload stats")
      Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnxt_en: Fix vnic accounting in the bnxt_check_rings() path. · 6fc2ffdf
      Eddie Wai authored
      The number of vnics to check must be determined ahead of time because
      only standard RX rings require vnics to support RFS.  The logic is
      similar to the ring reservation logic and we can now use the
      refactored common functions to do most of the work in setting up
      the firmware message.
      
      Fixes: 8f23d638 ("bnxt_en: Expand bnxt_check_rings() to check all resources.")
      Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnxt_en: Refactor the functions to reserve hardware rings. · 4ed50ef4
      Michael Chan authored
      The bnxt_hwrm_reserve_{pf|vf}_rings() functions are very similar to
      the bnxt_hwrm_check_{pf|vf}_rings() functions.  Refactor the former
      so that the latter can make use of common code in the next patch.
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: Tell caller result of phy_change() · a2c054a8
      Brad Mouring authored
      In 664fcf12 (net: phy: Threaded interrupts allow some simplification)
      the phy_interrupt system was changed to use a traditional threaded
      interrupt scheme instead of a workqueue approach.
      
      With this change, the phy status check moved into phy_change, which
      did not report back to the caller whether or not the interrupt was
      handled. This means that, in the case of a shared phy interrupt,
      only the first phydev's interrupt registers are checked (since
      phy_interrupt() would always return IRQ_HANDLED). This leads to
      interrupt storms when it is a secondary device that's actually the
      interrupt source.
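
      The effect described above can be reproduced with a deliberately
      simplified model. Everything here is hypothetical: the Phy class, the
      two handler variants and the dispatcher (which stops at the first
      handler that claims the interrupt) are assumptions made to mirror the
      description, not a model of the kernel's IRQ core.

```python
IRQ_NONE, IRQ_HANDLED = 0, 1


class Phy:
    def __init__(self, pending: bool):
        self.pending = pending  # True if this PHY raised the interrupt

    def service(self) -> bool:
        """Clear the interrupt if pending; report whether it was ours."""
        was_pending, self.pending = self.pending, False
        return was_pending

    def handle_always_handled(self) -> int:
        # Old behaviour: the result of the status check is not reported.
        self.service()
        return IRQ_HANDLED

    def handle_reporting(self) -> int:
        # New behaviour: report IRQ_NONE when this PHY was not the source.
        return IRQ_HANDLED if self.service() else IRQ_NONE


def dispatch(handlers) -> int:
    """Hypothetical dispatcher: stop at the first handler claiming the IRQ."""
    for handler in handlers:
        if handler() == IRQ_HANDLED:
            return IRQ_HANDLED
    return IRQ_NONE
```

      With two PHYs on one line and the second one being the source, the old
      handler leaves the second PHY pending (the storm condition), while the
      reporting handler lets the dispatcher reach and service it.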
      Signed-off-by: Brad Mouring <brad.mouring@ni.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • can: m_can: select pinctrl state in each suspend/resume function · c9b3bce1
      Bich HEMON authored
      Make sure to apply the correct pin state in suspend/resume callbacks.
      Putting pins in sleep state saves power.
      Signed-off-by: Bich Hemon <bich.hemon@st.com>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
    • can: peak/pcie_fd: remove useless code when interface starts · ffd137f7
      Stephane Grosjean authored
      When an interface starts, the echo_skb array is empty and the network
      queue should be started only. This patch replaces useless code and locks
      when the internal RX_BARRIER message is received from the IP core, telling
      the driver that tx may start.
      Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com>
      Cc: linux-stable <stable@vger.kernel.org>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
    • can: peak/pcie_fd: fix echo_skb is occupied! bug · e6048a00
      Stephane Grosjean authored
      This patch makes the handling of the linux-can echo_skb array and the
      network tx queue atomic. This prevents the "BUG! echo_skb is occupied!"
      message from being printed by the linux-can core in SMP environments.
      Reported-by: Diana Burgess <diana@peloton-tech.com>
      Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com>
      Cc: linux-stable <stable@vger.kernel.org>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
    • can: ifi: Repair the error handling · 880dd464
      Marek Vasut authored
      The new version of the IFI CANFD core has significantly less complex
      error state indication logic. In particular, the warning/error state
      bits are no longer all over the place, but are all present in the
      STATUS register. Moreover, there is a new IRQ register bit indicating
      transition between error states (active/warning/passive/busoff).
      
      This patch makes use of this bit to weed out the obscure selective
      INTERRUPT register clearing, which was used to carry over the error
      state indication into the poll function. While at it, this patch
      fixes the handling of the ACTIVE state, since the hardware provides
      indication of the core being in ACTIVE state and that in turn fixes
      the state transition indication toward userspace. Finally, register
      reads in the poll function are moved to the matching subfunctions
      since those are also no longer needed in the poll function.
      Signed-off-by: Marek Vasut <marex@denx.de>
      Cc: Heiko Schocher <hs@denx.de>
      Cc: Markus Marb <markus@marb.org>
      Cc: Marc Kleine-Budde <mkl@pengutronix.de>
      Cc: linux-stable <stable@vger.kernel.org>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
    • can: ifi: Check core revision upon probe · 591d65d5
      Marek Vasut authored
      Older versions of the core are not compatible with the driver due
      to various intrusive fixes of the core. Read out the VER register,
      check the core revision bitfield and verify if the core in use is
      new enough (rev 2.1 or newer) to work correctly with this driver.
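
      A probe-time revision check of this kind boils down to masking a
      bitfield out of the version register and comparing it with a minimum.
      The sketch below assumes an 8-bit revision field in the low bits and
      encodes "rev 2.1" as the integer 21; both are assumptions made for
      illustration, not values taken from the driver.

```python
# Assumed layout and encoding, for illustration only.
ASSUMED_REV_MASK = 0xFF
ASSUMED_MIN_REV = 21  # stands for "rev 2.1" under the assumed encoding


def core_is_supported(ver_reg: int,
                      rev_mask: int = ASSUMED_REV_MASK,
                      min_rev: int = ASSUMED_MIN_REV) -> bool:
    """Return True if the revision bitfield of ver_reg is new enough."""
    revision = ver_reg & rev_mask
    return revision >= min_rev
```

      A probe routine would refuse to bind when this returns False rather
      than drive a core it cannot handle.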
      Signed-off-by: Marek Vasut <marex@denx.de>
      Cc: Heiko Schocher <hs@denx.de>
      Cc: Markus Marb <markus@marb.org>
      Cc: Marc Kleine-Budde <mkl@pengutronix.de>
      Cc: linux-stable <stable@vger.kernel.org>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
    • can: m_can: change comparison to bitshift when dealing with a mask · b7db978a
      Wolfram Sang authored
      Due to a typo, the mask was destroyed by a comparison instead of a bit
      shift.
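
      The difference between the typo and the intended expression is easy to
      demonstrate. The field width and shift below are hypothetical values
      picked for the example; the point is only that a comparison collapses
      the mask to 0 or 1, while a shift moves it to the field position.

```python
FIELD_SHIFT = 24     # hypothetical bit position of the field
FIELD_WIDTH = 0x7F   # hypothetical 7-bit field

# Typo: '<' compares the two numbers, so the "mask" is just 0 or 1.
buggy_mask = int(FIELD_WIDTH < FIELD_SHIFT)
# Intended: '<<' moves the field mask to its position in the register.
fixed_mask = FIELD_WIDTH << FIELD_SHIFT


def extract(reg: int, mask: int) -> int:
    """Extract the field from reg using the given mask."""
    return (reg & mask) >> FIELD_SHIFT


reg = 42 << FIELD_SHIFT  # register holding the value 42 in the field
```

      With these values extract(reg, buggy_mask) yields 0, while
      extract(reg, fixed_mask) recovers 42.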
      Reported-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
    • openvswitch: meter: fix the incorrect calculation of max delta_t · ddc502df
      zhangliping authored
      Max delta_t should be the full_bucket/rate instead of the full_bucket.
      Also report EINVAL if the rate is zero.
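
      A minimal sketch of the corrected calculation, assuming integer inputs
      in consistent units (the scaling used by the real meter code is not
      reproduced here) and using a Python OSError to stand in for the EINVAL
      report:

```python
import errno


def max_delta_t(full_bucket: int, rate: int) -> int:
    """Longest interval that can still add tokens: full_bucket / rate."""
    if rate == 0:
        # A zero rate would divide by zero; reject it as invalid input.
        raise OSError(errno.EINVAL, "meter rate must be non-zero")
    return full_bucket // rate
```

      For a bucket of 1000 units filled at 10 units per tick, any idle period
      beyond 100 ticks cannot add more tokens, so 100 is the useful upper
      bound rather than 1000.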
      
      Fixes: 96fbc13d ("openvswitch: Add meter infrastructure")
      Cc: Andy Zhou <azhou@ovn.org>
      Signed-off-by: zhangliping <zhangliping02@baidu.com>
      Acked-by: Pravin B Shelar <pshelar@ovn.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • macvlan: filter out unsupported feature flags · 13fbcc8d
      Shannon Nelson authored
      Adding a macvlan device on top of a lowerdev that supports
      the xfrm offloads fails with a new regression:
        # ip link add link ens1f0 mv0 type macvlan
        RTNETLINK answers: Operation not permitted
      
      Tracing down the failure shows that the macvlan device inherits
      the NETIF_F_HW_ESP and NETIF_F_HW_ESP_TX_CSUM feature flags
      from the lowerdev, but with no dev->xfrmdev_ops API filled
      in, it doesn't actually support xfrm.  When the request is
      made to add the new macvlan device, the XFRM listener for
      NETDEV_REGISTER calls xfrm_api_check() which fails the new
      registration because dev->xfrmdev_ops is NULL.
      
      The macvlan creation succeeds when we filter out the ESP
      feature flags in macvlan_fix_features(), so let's filter them
      out like we're already filtering out ~NETIF_F_NETNS_LOCAL.
      When XFRM support is added in the future, we can add the flags
      into MACVLAN_FEATURES.
      
      This same problem could crop up in the future with any other
      new feature flags, so let's filter out any flags that aren't
      defined as supported in macvlan.
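
      Filtering inherited flags against an allow-list is a single bitwise
      AND. In the sketch below the bit positions and the contents of
      MACVLAN_FEATURES are hypothetical, and the function is a reduced model
      of the idea rather than the driver's actual macvlan_fix_features().

```python
# Hypothetical bit positions; the real NETIF_F_* values differ.
NETIF_F_SG = 1 << 0
NETIF_F_HW_CSUM = 1 << 1
NETIF_F_HW_ESP = 1 << 2
NETIF_F_HW_ESP_TX_CSUM = 1 << 3

# Assumed allow-list of features the upper device really supports.
MACVLAN_FEATURES = NETIF_F_SG | NETIF_F_HW_CSUM


def fix_features(lowerdev_features: int) -> int:
    """Keep only the inherited flags that are in the allow-list."""
    return lowerdev_features & MACVLAN_FEATURES
```

      Because the allow-list is explicit, any flag added to the lower device
      later is dropped by default until it is added to MACVLAN_FEATURES.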
      
      Fixes: d77e38e6 ("xfrm: Add an IPsec hardware offloading API")
      Reported-by: Alexey Kodanev <alexey.kodanev@oracle.com>
      Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 11 Mar, 2018 4 commits
  3. 09 Mar, 2018 13 commits