1. 20 Nov, 2018 5 commits
    • Merge tag 'mlx5-fixes-2018-11-19' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · 1359f251
      David S. Miller authored
      Saeed Mahameed says:
      
      ====================
      Mellanox, mlx5 fixes 2018-11-19
      
      The following fixes are for mlx5 core and netdev driver.
      
      For -stable v4.16
      bc7fda7d4637 ('net/mlx5e: IPoIB, Reset QP after channels are closed')
      
      For -stable v4.17
      36917a270395 ('net/mlx5: IPSec, Fix the SA context hash key')
      
      For -stable v4.18
      6492a432be3a ('net/mlx5e: Always use the match level enum when parsing TC rule match')
      c3f81be236b1 ('net/mlx5e: Removed unnecessary warnings in FEC caps query')
      c5ce2e736b64 ('net/mlx5e: Fix selftest for small MTUs')
      
      For -stable v4.19
      effcd896b25e ('net/mlx5e: Adjust to max number of channles when re-attaching')
      394cbc5acd68 ('net/mlx5e: RX, verify received packet size in Linear Striding RQ')
      447cbb3613c8 ('net/mlx5e: Don't match on vlan non-existence if ethertype is wildcarded')
      c223c1574612 ('net/mlx5e: Claim TC hw offloads support only under a proper build config')
      
      Please pull and let me know if there's any problem.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/ibmnvic: Fix deadlock problem in reset · a5681e20
      Juliet Kim authored
      This patch changes to use rtnl_lock only during a reset to avoid
      deadlock that could occur when a thread operating close is holding
      rtnl_lock and waiting for reset_lock acquired by another thread,
      which is waiting for rtnl_lock in order to set the number of tx/rx
      queues during a reset.
      
      Also, we are now setting the number of tx/rx queues during a soft reset
      for failover or LPM events.
      Signed-off-by: Juliet Kim <julietk@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'qed-Fix-Queue-Manager-getters' · db9a0bae
      David S. Miller authored
      Denis Bolotin says:
      
      ====================
      qed: Fix Queue Manager getters
      
      This patch series fixes various queue manager getter functions. It is
      important to make sure the getter's caller receives a valid queue even
      in the error case, to prevent more serious bugs.
      Please consider applying to net.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qed: Fix QM getters to always return a valid pq · eb62cca9
      Denis Bolotin authored
      The getter callers don't know the valid Physical Queue (PQ) values.
      This patch makes sure that a valid PQ will always be returned.
      
      The patch consists of 3 fixes:
      
       - When qed_init_qm_get_idx_from_flags() receives a disabled flag, it
         returned PQ 0, which can potentially be another function's pq. Verify
         that flag is enabled, otherwise return default start_pq.
      
       - When qed_init_qm_get_idx_from_flags() receives an unknown flag, it
         returned NULL and could lead to a segmentation fault. Return default
         start_pq instead.
      
       - A modulo operation was added to MCOS/VFS PQ getters to make sure the
         PQ returned is in range of the required flag.
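
The three fixes listed above can be sketched in plain userspace C. This is a minimal illustration under stated assumptions, not the qed driver's code: `struct qm_info`, the `PQ_FLAG_*` values, and `qm_get_pq()` are all hypothetical names invented for the example.

```c
#include <stdint.h>

/* Hypothetical flag values; the real driver defines its own set. */
#define PQ_FLAG_MCOS (1u << 0)
#define PQ_FLAG_VFS  (1u << 1)

/* Hypothetical per-function queue manager state. */
struct qm_info {
    uint32_t enabled_flags; /* flags enabled for this function */
    uint16_t start_pq;      /* first PQ owned by this function (safe default) */
    uint16_t first_mcos_pq; /* offset of the first MCOS PQ from start_pq */
    uint16_t num_mcos_pqs;  /* number of MCOS PQs */
    uint16_t first_vf_pq;   /* offset of the first VF PQ from start_pq */
    uint16_t num_vf_pqs;    /* number of VF PQs */
};

/* Always returns a PQ that belongs to this function. */
uint16_t qm_get_pq(const struct qm_info *qm, uint32_t flag, uint16_t idx)
{
    uint16_t first, count;

    /* Fix 1: a disabled flag yields the default PQ, not PQ 0. */
    if (!(qm->enabled_flags & flag))
        return qm->start_pq;

    switch (flag) {
    case PQ_FLAG_MCOS:
        first = qm->first_mcos_pq;
        count = qm->num_mcos_pqs;
        break;
    case PQ_FLAG_VFS:
        first = qm->first_vf_pq;
        count = qm->num_vf_pqs;
        break;
    default:
        /* Fix 2: an unknown flag yields the default PQ. */
        return qm->start_pq;
    }

    if (count == 0)
        return qm->start_pq;

    /* Fix 3: the modulo keeps the index inside the flag's PQ range. */
    return (uint16_t)(qm->start_pq + first + idx % count);
}
```

With `start_pq = 16`, four MCOS PQs starting at offset 1, and an out-of-range index of 6, the sketch returns PQ 19 rather than a PQ past the MCOS range.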
      
      Fixes: b5a9ee7c ("qed: Revise QM cofiguration")
      Signed-off-by: Denis Bolotin <denis.bolotin@cavium.com>
      Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qed: Fix bitmap_weight() check · 276d43f0
      Denis Bolotin authored
      Fix the condition which verifies that only one flag is set. The API
      bitmap_weight() should receive size in bits instead of bytes.
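
A small userspace sketch of why the unit matters. `weight_bits()` is a hypothetical stand-in for the kernel's bitmap_weight(), written only for this illustration; it counts set bits among the first `nbits` bits, so passing a byte count silently ignores every higher flag.

```c
#include <limits.h>
#include <stddef.h>

#define TOY_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Count the set bits among the first nbits bits of bitmap. */
unsigned int weight_bits(const unsigned long *bitmap, size_t nbits)
{
    unsigned int count = 0;

    for (size_t i = 0; i < nbits; i++) {
        if (bitmap[i / TOY_BITS_PER_LONG] & (1UL << (i % TOY_BITS_PER_LONG)))
            count++;
    }
    return count;
}

/* Returns 1 when exactly one flag bit is set, with the size given in bits. */
int exactly_one_flag(unsigned long flags)
{
    return weight_bits(&flags, sizeof(flags) * CHAR_BIT) == 1;
}
```

For `flags = (1UL << 1) | (1UL << 20)`, calling `weight_bits(&flags, sizeof(flags))` inspects only the first 4 or 8 bits and reports one set bit, so a "single flag" check would pass by mistake; with the size in bits, both flags are counted.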
      
      Fixes: b5a9ee7c ("qed: Revise QM cofiguration")
      Signed-off-by: Denis Bolotin <denis.bolotin@cavium.com>
      Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 19 Nov, 2018 25 commits
    • net/mlx5e: Fix failing ethtool query on FEC query error · 9184e51b
      Shay Agroskin authored
      If the FEC caps query fails when executing 'ethtool <interface>',
      the whole callback fails unnecessarily. Fix that by replacing the
      error return code with debug logging only.
      
      Fixes: 6cfa9460 ("net/mlx5e: Ethtool driver callback for query/set FEC policy")
      Signed-off-by: Shay Agroskin <shayag@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Removed unnecessary warnings in FEC caps query · 64e28334
      Shay Agroskin authored
      Querying interface FEC caps with 'ethtool [int]' after link reset
      throws a warning regarding link speed.
      This warning is not needed as there is already an indication in
      user space that the link is not up.
      
      Fixes: 0696d608 ("net/mlx5e: Receive buffer configuration")
      Signed-off-by: Shay Agroskin <shayag@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Fix wrong field name in FEC related functions · febd72f2
      Shay Agroskin authored
      This bug would result in reading wrong FEC capabilities for 10G/40G.
      
      Fixes: 2095b264 ("net/mlx5e: Add port FEC get/set functions")
      Signed-off-by: Shay Agroskin <shayag@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Fix a bug in turning off FEC policy in unsupported speeds · 9cdeaab3
      Shay Agroskin authored
      Some speeds don't support turning FEC policy off. In case a requested
      FEC policy is not supported for a speed (including current speed), its new
      FEC policy would be:
      	no FEC - if disabling FEC is supported for that speed
      	unchanged - else
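
The per-speed rule above can be read as a small decision function. The sketch below is one interpretation, not the driver's code: the enum values, `struct speed_fec`, and `next_fec_policy()` are hypothetical, and the first branch (a supported request is applied as-is) is an assumption the message does not spell out.

```c
/* Hypothetical policy identifiers. */
enum fec_policy { FEC_NONE, FEC_FIRECODE, FEC_RS };

/* Hypothetical per-speed FEC state. */
struct speed_fec {
    unsigned int supported;  /* bitmask of (1u << policy) valid at this speed */
    enum fec_policy current; /* policy currently configured for this speed */
};

/* Choose the policy a speed ends up with after a request. */
enum fec_policy next_fec_policy(const struct speed_fec *s,
                                enum fec_policy requested)
{
    /* Assumed: a supported request is simply applied. */
    if (s->supported & (1u << requested))
        return requested;

    /* Unsupported request: fall back to no FEC if that is allowed ... */
    if (s->supported & (1u << FEC_NONE))
        return FEC_NONE;

    /* ... otherwise leave the speed's policy unchanged. */
    return s->current;
}
```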
      
      Fixes: 2095b264 ("net/mlx5e: Add port FEC get/set functions")
      Signed-off-by: Shay Agroskin <shayag@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • Merge branch 'ena-hibernation-and-rmmod-bug-fixes' · d7c60210
      David S. Miller authored
      Arthur Kiyanovski says:
      
      ====================
      net: ena: hibernation and rmmod bug fixes
      
      This patchset includes 2 bug fixes:
      1. A fix to a crash during resume from hibernation.
      2. A fix to an illegal memory access during driver removal (e.g. during rmmod)
         which might cause a crash in certain systems.
      
      The subminor number in the driver version is also bumped to indicate that
      the driver was changed.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ena: update driver version from 2.0.1 to 2.0.2 · 4c23738a
      Arthur Kiyanovski authored
      Update driver version due to critical bug fixes.
      Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ena: fix crash during ena_remove() · 58a54b9c
      Arthur Kiyanovski authored
      In ena_remove() we have the following call stack:
      ena_remove()
        unregister_netdev()
        ena_destroy_device()
          netif_carrier_off()
      
      Calling netif_carrier_off() causes linkwatch to try to handle the
      link change event on the already unregistered netdev, which leads
      to a read from an unreadable memory address.
      
      This patch switches the order of the two functions, so that
      netif_carrier_off() is called on a registered netdev.
      
      To accomplish this fix we also had to:
      1. Remove the set bit ENA_FLAG_TRIGGER_RESET
      2. Add a sanity check in ena_close()
      Both are needed to prevent a double device reset (when calling
      unregister_netdev(), ena_close() is called, but the device was already
      deleted in ena_destroy_device()).
      3. Set the admin_queue running state to false to avoid using it after
      device was reset (for example when calling ena_destroy_all_io_queues()
      right after ena_com_dev_reset() in ena_down)
      
      Fixes: 944b28aa ("net: ena: fix missing lock during device destruction")
      Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ena: fix crash during failed resume from hibernation · e76ad21d
      Arthur Kiyanovski authored
      During resume from hibernation if ena_restore_device fails,
      ena_com_dev_reset() is called, and uses the readless read mechanism,
      which was already destroyed by the call to
      ena_com_mmio_reg_read_request_destroy(). This causes a NULL pointer
      dereference.
      
      In this commit we switch the call order of the above two functions
      to avoid this crash.
      
      Fixes: d7703ddb ("net: ena: fix rare bug when failed restart/resume is followed by driver removal")
      Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: not increase stream's incnt before sending addstrm_in request · e1e46479
      Xin Long authored
      Unlike the processing of an addstrm_out request, the receiver handles
      an addstrm_in request by sending back an addstrm_out request to the
      sender, who will increase its stream's in and incnt later.
      
      Currently stream->incnt is already increased when the addstrm_in
      request is sent out in sctp_send_add_streams(). The wrong stream->incnt
      can even cause a crash when copying stream info from the old stream's in
      to the new one's in sctp_process_strreset_addstrm_out().
      
      This patch is to fix it by simply removing the stream->incnt change
      from sctp_send_add_streams().
      
      Fixes: 242bd2d5 ("sctp: implement sender-side procedures for Add Incoming/Outgoing Streams Request Parameter")
      Reported-by: Jianwen Ji <jiji@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5e: Fix selftest for small MTUs · 228c4cd0
      Valentine Fatiev authored
      The loopback test used a fixed packet size, which can be bigger than the
      configured MTU. Shorten the loopback packet size to be bigger than the
      minimal MTU allowed by the device. The text field is removed from struct
      'mlx5ehdr' as redundant, to allow sending packets as small as the minimal
      allowed MTU.
      
      Fixes: d605d668 ("net/mlx5e: Add support for ethtool self diagnostics test")
      Signed-off-by: Valentine Fatiev <valentinef@mellanox.com>
      Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: RX, verify received packet size in Linear Striding RQ · 0073c8f7
      Moshe Shemesh authored
      In case of striding RQ, we use MPWRQ (Multi Packet WQE RQ), which means
      that a WQE (RX descriptor) can be used for many packets and so the WQE is
      much bigger than the MTU. In virtualization setups where the port MTU can
      be larger than the VF MTU, if a received packet is bigger than the MTU, it
      won't be dropped by HW on a too-small receive WQE. If we use linear SKB in
      striding RQ, since each stride has room for an MTU-size payload and skb
      info, an oversized packet can lead to a crash by crossing the allocated
      page boundary upon the call to build_skb. So the driver needs to check the
      packet size and drop it.
      
      Introduce new SW rx counter, rx_oversize_pkts_sw_drop, which counts the
      number of packets dropped by the driver for being too large.
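
The software check and counter described above might look roughly like the following userspace sketch. It is a simplified illustration: `struct toy_rq`, its `hw_mtu` field, and `rx_accept()` are hypothetical names, and only the counter name is modeled on rx_oversize_pkts_sw_drop.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical receive-queue state for the illustration. */
struct toy_rq {
    uint32_t hw_mtu;                /* largest frame a linear stride was sized for */
    uint64_t oversize_pkts_sw_drop; /* packets dropped in software for being too large */
};

/* Returns true if the packet may be handed on, false if it was dropped. */
bool rx_accept(struct toy_rq *rq, uint32_t pkt_len)
{
    if (pkt_len > rq->hw_mtu) {
        /* The packet would overrun the stride: count it and drop it. */
        rq->oversize_pkts_sw_drop++;
        return false;
    }
    return true;
}
```

Doing the length check before any buffer is built is what keeps an oversized packet from ever touching memory past its stride.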
      
      As a new field is added to the RQ struct, re-open the channels whenever
      this field is being used in datapath (i.e., in the case of linear
      Striding RQ).
      
      Fixes: 619a8f2a ("net/mlx5e: Use linear SKB in Striding RQ")
      Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Apply the correct check for supporting TC esw rules split · 1392f44b
      Roi Dayan authored
      The mirror count, and not the output count, is the one denoting a split.
      Fix by conditioning the offload attempt on the mirror count being > 0
      and on the firmware having the related capability.
      
      Fixes: 592d3651 ("net/mlx5e: Parse mirroring action for offloaded TC eswitch flows")
      Signed-off-by: Roi Dayan <roid@mellanox.com>
      Reviewed-by: Yossi Kuperman <yossiku@mellanox.com>
      Reviewed-by: Chris Mi <chrism@mellanox.com>
      Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
      Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Adjust to max number of channles when re-attaching · a1f240f1
      Yuval Avnery authored
      When the core driver enters the detach/attach flow after a PCI reset,
      the number of logical CPUs may have changed.
      As a result we need to update the CPU-affiliated resource tables:
      	1. indirect rqt list
      	2. eq table
      
      Reproduction (PowerPC):
      	echo 1000 > /sys/kernel/debug/powerpc/eeh_max_freezes
      	ppc64_cpu --smt=on
      	# Restart driver
      	modprobe -r ... ; modprobe ...
      	# Link up
      	ifconfig ...
      	# Only physical CPUs
      	ppc64_cpu --smt=off
      	# Inject PCI errors so PCI will reset - calling the pci error handler
      	echo 0x8000000000000000 > /sys/kernel/debug/powerpc/<PCI BUS>/err_injct_inboundA
      
      Call trace when trying to add non-existing rqs to an indirect rqt:
      	mlx5e_redirect_rqt+0x84/0x260 [mlx5_core] (unreliable)
      	mlx5e_redirect_rqts+0x188/0x190 [mlx5_core]
      	mlx5e_activate_priv_channels+0x488/0x570 [mlx5_core]
      	mlx5e_open_locked+0xbc/0x140 [mlx5_core]
      	mlx5e_open+0x50/0x130 [mlx5_core]
      	mlx5e_nic_enable+0x174/0x1b0 [mlx5_core]
      	mlx5e_attach_netdev+0x154/0x290 [mlx5_core]
      	mlx5e_attach+0x88/0xd0 [mlx5_core]
      	mlx5_attach_device+0x168/0x1e0 [mlx5_core]
      	mlx5_load_one+0x1140/0x1210 [mlx5_core]
      	mlx5_pci_resume+0x6c/0xf0 [mlx5_core]
      
      Create cq will fail when trying to use non-existing EQ.
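
The idea of adjusting to the new maximum can be illustrated with a short userspace sketch. It is an assumption-laden toy, not the mlx5e code: `struct nic_params`, `clamp_channels()`, and `fill_indirection()` are hypothetical, and the round-robin table fill is only one plausible way to rebuild an indirection list.

```c
#include <stddef.h>

/* Hypothetical saved parameters that survive a detach/attach cycle. */
struct nic_params {
    unsigned int num_channels;
};

/* Clamp the saved channel count to what the system supports now. */
unsigned int clamp_channels(struct nic_params *p, unsigned int max_channels)
{
    if (p->num_channels > max_channels)
        p->num_channels = max_channels;
    return p->num_channels;
}

/* Rebuild an indirection table so it only references existing channels. */
void fill_indirection(unsigned int *indir, size_t len, unsigned int num_channels)
{
    if (num_channels == 0)
        return;
    for (size_t i = 0; i < len; i++)
        indir[i] = (unsigned int)(i % num_channels);
}
```

If 16 channels were configured with SMT on and only 4 CPUs remain after SMT is turned off, clamping first and then refilling the table means no entry points at a queue that no longer exists.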
      
      Fixes: 89d44f0a ("net/mlx5_core: Add pci error handlers to mlx5_core driver")
      Signed-off-by: Yuval Avnery <yuvalav@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Always use the match level enum when parsing TC rule match · 83621b7d
      Or Gerlitz authored
      We get the match level (none, l2, l3, l4) while going over the match
      dissectors of an offloaded tc rule. When doing this, the match level
      enum values, and not the min inline enum values, should be used; fix that.
      
      This worked accidentally b/c both enums have the same numerical values.
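
A toy illustration of why such a mix-up can go unnoticed in C. The two enums below are hypothetical stand-ins with made-up names; the only property carried over from the message is that both have the same numerical values.

```c
/* Hypothetical stand-in for the match level enum. */
enum match_level { MATCH_NONE, MATCH_L2, MATCH_L3, MATCH_L4 };

/* Hypothetical stand-in for the min inline mode enum. */
enum inline_mode { INLINE_NONE, INLINE_L2, INLINE_IP, INLINE_TCP_UDP };

/* Buggy pattern: a value of the wrong enum is used; C converts it silently. */
enum match_level level_for_ip_buggy(void)
{
    return (enum match_level)INLINE_IP;
}

/* Fixed pattern: the value comes from the enum the variable is declared with. */
enum match_level level_for_ip_fixed(void)
{
    return MATCH_L3;
}
```

Both functions return the same number today, which is why the original code worked; the fixed form stays correct even if one of the enums is later reordered or extended.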
      
      Fixes: d708f902 ('net/mlx5e: Get the required HW match level while parsing TC flow matches')
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Claim TC hw offloads support only under a proper build config · 077ecd78
      Or Gerlitz authored
      Currently, we are only supporting tc hw offloads when the eswitch
      support is compiled in, but we are not gating the advertisement
      of the NETIF_F_HW_TC feature on this config being set.
      
      Fix it, and while doing that, also avoid dealing with the feature
      on ethtool when the config is not set.
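
Gating a feature bit on a build option generally takes the shape below. This is a generic sketch, not the driver's code: `TOY_CONFIG_ESWITCH` stands in for the driver's eswitch build option and `TOY_FEATURE_HW_TC` for NETIF_F_HW_TC, and both names are invented for the example.

```c
#include <stdint.h>

#define TOY_FEATURE_HW_TC (1u << 0) /* hypothetical stand-in for NETIF_F_HW_TC */

/* Build the set of features to advertise for the device. */
uint32_t advertised_features(void)
{
    uint32_t features = 0;

#ifdef TOY_CONFIG_ESWITCH
    /* Advertise TC offload only when the support is compiled in. */
    features |= TOY_FEATURE_HW_TC;
#endif

    return features;
}
```

Built without `-DTOY_CONFIG_ESWITCH`, the function never sets the bit, so user space cannot enable an offload that has no implementation behind it.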
      
      Fixes: e8f887ac ('net/mlx5e: Introduce tc offload support')
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Don't match on vlan non-existence if ethertype is wildcarded · d3a80bb5
      Or Gerlitz authored
      For the "all" ethertype we should not care whether the packet has
      vlans. Besides being wrong, the way we did it caused FW error
      for rules such as:
      
      tc filter add dev eth0 protocol all parent ffff: \
      	prio 1 flower skip_sw action drop
      
      b/c the matching meta-data (outer headers bit in struct mlx5_flow_spec)
      wasn't set. Fix that by matching on vlan non-existence only if we were
      also told to match on the ethertype.
      
      Fixes: cee26487 ('net/mlx5e: Set vlan masks for all offloaded TC rules')
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Reported-by: Slava Ovsiienko <viacheslavo@mellanox.com>
      Reviewed-by: Jianbo Liu <jianbol@mellanox.com>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: IPoIB, Reset QP after channels are closed · acf3766b
      Denis Drozdov authored
      The mlx5e channels should be closed before mlx5i_uninit_underlay_qp
      puts the QP into RST (reset) state during mlx5i_close. Currently the QP
      state is incorrectly set to RST before the channels are deactivated and
      closed, although mlx5_post_send requests expect the QP to be in RTS
      (Ready To Send) state.
      
      The fix is to keep QP in RTS state until mlx5e channels get closed
      and to reset QP afterwards.
      
      This fix also keeps the open/close flow symmetric: since
      mlx5i_init_underlay_qp() is called first thing at open, the correct
      thing to do is to call mlx5i_uninit_underlay_qp() last thing at close,
      which is exactly what this patch is doing.
      
      Fixes: dae37456 ("net/mlx5: Support for attaching multiple underlay QPs to root flow table")
      Signed-off-by: Denis Drozdov <denisd@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5: IPSec, Fix the SA context hash key · f2b18732
      Raed Salem authored
      The commit "net/mlx5: Refactor accel IPSec code" introduced a
      bug where a short-lived asynchronous change in the hash key value,
      caused by create/release of an SA context, might happen during an
      asynchronous hash resize operation. This could cause a subsequent
      remove SA context operation to fail, as the key value used during
      the resize is not the same key value used when the remove SA context
      operation is invoked.
      
      This commit fixes the bug by defining the SA context hash key
      such that it includes only fields that never change during the
      lifetime of the SA context object.
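
The principle of hashing only immutable fields can be shown with a toy structure. Everything here is hypothetical: the message does not say which fields make up the key, so `spi` and `daddr` are placeholders, `refcount` stands for any field that changes over time, and FNV-1a is used only because it is short.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical SA context: two fields fixed at creation, one that changes. */
struct toy_sa_ctx {
    uint32_t spi;      /* assumed immutable for the object's lifetime */
    uint32_t daddr;    /* assumed immutable for the object's lifetime */
    uint32_t refcount; /* mutable: must not take part in the hash key */
};

/* 32-bit FNV-1a computed over the immutable fields only. */
uint32_t toy_sa_hash(const struct toy_sa_ctx *sa)
{
    const uint32_t key[2] = { sa->spi, sa->daddr };
    const unsigned char *p = (const unsigned char *)key;
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < sizeof(key); i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}
```

Because `refcount` is excluded, the object hashes to the same value when it is inserted, when the table is resized, and when it is removed, regardless of what happened to the mutable state in between.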
      
      Fixes: d6c4f029 ("net/mlx5: Refactor accel IPSec code")
      Signed-off-by: Raed Salem <raeds@mellanox.com>
      Reviewed-by: Aviad Yehezkel <aviadye@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • Revert "sctp: remove sctp_transport_pmtu_check" · 69fec325
      Xin Long authored
      This reverts commit 22d7be26.
      
      The dst's MTU in a transport can be updated from outside sctp, e.g. in
      xfrm, where the MTU information doesn't get synced between asoc,
      transport and dst, so the pmtu check in sctp_packet_config is still
      needed.
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: not allow to set asoc prsctp_enable by sockopt · cc3ccf26
      Xin Long authored
      As rfc7496#section4.5 says about SCTP_PR_SUPPORTED:
      
         This socket option allows the enabling or disabling of the
         negotiation of PR-SCTP support for future associations.  For existing
         associations, it allows one to query whether or not PR-SCTP support
         was negotiated on a particular association.
      
      It means only sctp sock's prsctp_enable can be set.
      
      Note that for the limitation of SCTP_{CURRENT|ALL}_ASSOC, we will
      add it when introducing SCTP_{FUTURE|CURRENT|ALL}_ASSOC for linux
      sctp in another patchset.
      
      v1->v2:
        - drop the params.assoc_id check as Neil suggested.
      
      Fixes: 28aa4c26 ("sctp: add SCTP_PR_SUPPORTED on sctp sockopt")
      Reported-by: Ying Xu <yinxu@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: count sk_wmem_alloc by skb truesize in sctp_packet_transmit · 02968ccf
      Xin Long authored
      Now sctp increases sk_wmem_alloc by 1 when doing set_owner_w for the
      skb allocated in sctp_packet_transmit and decreases by 1 when freeing
      this skb.
      
      But when this skb goes through the networking stack, some subcomponents
      might change skb->truesize and add the same amount to sk_wmem_alloc.
      However, sctp doesn't know the amount to decrease by, so this causes
      a leak on sk->sk_wmem_alloc and the sock can never be freed.
      
      Xiumei found this issue when it hit esp_output_head() by using sctp
      over ipsec, where skb->truesize is added and so is sk->sk_wmem_alloc.
      
      Since sctp has used sk_wmem_queued to count for writable space since
      Commit cd305c74 ("sctp: use sk_wmem_queued to check for writable
      space"), it's ok to fix it by counting sk_wmem_alloc by skb truesize
      in sctp_packet_transmit.
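
The accounting argument can be checked with a toy model. The types and helpers below are hypothetical userspace stand-ins, not kernel code: the owner is charged the buffer's truesize when it takes ownership, a lower layer that grows the buffer charges the growth too, and freeing uncharges whatever the truesize is at that point.

```c
#include <stdint.h>

/* Hypothetical socket with a write-memory counter. */
struct toy_sock {
    int64_t wmem_alloc;
};

/* Hypothetical buffer that knows its owner and its current true size. */
struct toy_skb {
    struct toy_sock *sk;
    uint32_t truesize;
};

/* Charge the owner with the buffer's current truesize. */
void toy_set_owner_w(struct toy_skb *skb, struct toy_sock *sk)
{
    skb->sk = sk;
    sk->wmem_alloc += skb->truesize;
}

/* A lower layer grows the buffer and charges the growth to the owner. */
void toy_grow(struct toy_skb *skb, uint32_t extra)
{
    skb->truesize += extra;
    skb->sk->wmem_alloc += extra;
}

/* On free, uncharge whatever the truesize is now. */
void toy_wfree(struct toy_skb *skb)
{
    skb->sk->wmem_alloc -= skb->truesize;
}
```

Charging 256, growing by 64, and then freeing brings the counter back to 0. Under a charge-1/uncharge-1 scheme the same sequence would leave 64 behind, which is the leak described above.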
      
      Fixes: cac2661c ("esp4: Avoid skb_cow_data whenever possible")
      Reported-by: Xiumei Mu <xmu@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • MAINTAINERS: Add myself as third phylib maintainer · a36b5444
      Heiner Kallweit authored
      Add myself as third phylib maintainer.
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Acked-by: Andrew Lunn <andrew@lunn.ch>
      Acked-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · f2ce1065
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix some potentially uninitialized variables and use-after-free in
          kvaser_usb CAN driver, from Jimmy Assarsson.
      
       2) Fix leaks in qed driver, from Denis Bolotin.
      
       3) Socket leak in l2tp, from Xin Long.
      
       4) RSS context allocation fix in bnxt_en from Michael Chan.
      
       5) Fix cxgb4 build errors, from Ganesh Goudar.
      
       6) Route leaks in ipv6 when removing exceptions, from Xin Long.
      
       7) Memory leak in IDR allocation handling of act_pedit, from Davide
          Caratti.
      
       8) Use-after-free of bridge vlan stats, from Nikolay Aleksandrov.
      
       9) When MTU is locked, do not force DF bit on ipv4 tunnels. From
          Sabrina Dubroca.
      
      10) When NAPI cached skb is reused, we must set it to the proper initial
          state which includes skb->pkt_type. From Eric Dumazet.
      
      11) Lockdep and non-linear SKB handling fix in tipc from Jon Maloy.
      
      12) Set RX queue properly in various tuntap receive paths, from Matthew
          Cover.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (61 commits)
        tuntap: fix multiqueue rx
        ipv6: Fix PMTU updates for UDP/raw sockets in presence of VRF
        tipc: don't assume linear buffer when reading ancillary data
        tipc: fix lockdep warning when reinitilaizing sockets
        net-gro: reset skb->pkt_type in napi_reuse_skb()
        tc-testing: tdc.py: Guard against lack of returncode in executed command
        tc-testing: tdc.py: ignore errors when decoding stdout/stderr
        ip_tunnel: don't force DF when MTU is locked
        MAINTAINERS: Add entry for CAKE qdisc
        net: bridge: fix vlan stats use-after-free on destruction
        socket: do a generic_file_splice_read when proto_ops has no splice_read
        net: phy: mdio-gpio: Fix working over slow can_sleep GPIOs
        Revert "net: phy: mdio-gpio: Fix working over slow can_sleep GPIOs"
        net: phy: mdio-gpio: Fix working over slow can_sleep GPIOs
        net/sched: act_pedit: fix memory leak when IDR allocation fails
        net: lantiq: Fix returned value in case of error in 'xrx200_probe()'
        ipv6: fix a dst leak when removing its exception
        net: mvneta: Don't advertise 2.5G modes
        drivers/net/ethernet/qlogic/qed/qed_rdma.h: fix typo
        net/mlx4: Fix UBSAN warning of signed integer overflow
        ...
    • tuntap: fix multiqueue rx · 8ebebcba
      Matthew Cover authored
      When writing packets to a descriptor associated with a combined queue, the
      packets should end up on that queue.
      
      Before this change all packets written to any descriptor associated with a
      tap interface end up on rx-0, even when the descriptor is associated with a
      different queue.
      
      The rx traffic can be generated by either of the following.
        1. a simple tap program which spins up multiple queues and writes packets
           to each of the file descriptors
        2. tx from a qemu vm with a tap multiqueue netdev
      
      The queue for rx traffic can be observed by either of the following (done
      on the hypervisor in the qemu case).
        1. a simple netmap program which opens and reads from per-queue
           descriptors
        2. configuring RPS and doing per-cpu captures with rxtxcpu
      
      Alternatively, if you printk() the return value of skb_get_rx_queue() just
      before each instance of netif_receive_skb() in tun.c, you will get 65535
      for every skb.
      
      Calling skb_record_rx_queue() to set the rx queue to the queue_index fixes
      the association between descriptor and rx queue.
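
The 65535 value mentioned above follows from 16-bit wraparound. The sketch below is a userspace model, assuming the helpers store the queue index plus one in a 16-bit field so that 0 can mean "nothing recorded"; `struct toy_skb`, `toy_record_rx_queue()`, and `toy_get_rx_queue()` are stand-ins written for this illustration.

```c
#include <stdint.h>

/* Hypothetical buffer: queue_mapping == 0 means no rx queue was recorded. */
struct toy_skb {
    uint16_t queue_mapping;
};

/* Record the rx queue as index + 1, keeping 0 free for "not recorded". */
void toy_record_rx_queue(struct toy_skb *skb, uint16_t rx_queue)
{
    skb->queue_mapping = (uint16_t)(rx_queue + 1);
}

/* Read the rx queue back; an unrecorded mapping wraps around to 65535. */
uint16_t toy_get_rx_queue(const struct toy_skb *skb)
{
    return (uint16_t)(skb->queue_mapping - 1);
}
```

Reading a buffer whose mapping was never recorded gives 0 - 1 truncated to 16 bits, that is 65535; after recording queue 3, the getter returns 3.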
      Signed-off-by: Matthew Cover <matthew.cover@stackpath.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Fix PMTU updates for UDP/raw sockets in presence of VRF · 7ddacfa5
      David Ahern authored
      Preethi reported that PMTU discovery for UDP/raw applications is not
      working in the presence of VRF when the socket is not bound to a device.
      The problem is that ip6_sk_update_pmtu does not consider the L3 domain
      of the skb device if the socket is not bound. Update the function to
      set oif to the L3 master device if relevant.
      
      Fixes: ca254490 ("net: Add VRF support to IPv6 stack")
      Reported-by: Preethi Ramachandra <preethir@juniper.net>
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 18 Nov, 2018 10 commits
    • Linux 4.20-rc3 · 9ff01193
      Linus Torvalds authored
    • Merge tag 'libnvdimm-fixes-4.20-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm · 25e19c1f
      Linus Torvalds authored
      Pull libnvdimm fixes from Dan Williams:
       "A small batch of fixes for v4.20-rc3.
      
        The overflow continuation fix addresses something that has been broken
        for several releases. Arguably it could wait even longer, but it's a
        one line fix and this finishes the last of the known address range
        scrub bug reports. The revert addresses a lockdep regression. The unit
        tests are not critical to fix, but no reason to hold this fix back.
      
        Summary:
      
         - Address Range Scrub overflow continuation handling has been broken
           since it was initially merged. It was only recently that error
           injection and platform-BIOS support enabled this corner case to be
           exercised.
      
         - The recent attempt to provide more isolation for the kernel Address
           Range Scrub state machine from userspace initiated sessions
           triggers a lockdep report. Revert and try again at the next merge
           window.
      
         - Fix a kasan reported buffer overflow in libnvdimm unit test
           infrastructure (nfit_test)"
      
      * tag 'libnvdimm-fixes-4.20-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
        Revert "acpi, nfit: Further restrict userspace ARS start requests"
        acpi, nfit: Fix ARS overflow continuation
        tools/testing/nvdimm: Fix the array size for dimm devices.
    • Merge branch 'akpm' (patches from Andrew) · c67a98c0
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "16 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        mm/memblock.c: fix a typo in __next_mem_pfn_range() comments
        mm, page_alloc: check for max order in hot path
        scripts/spdxcheck.py: make python3 compliant
        tmpfs: make lseek(SEEK_DATA/SEK_HOLE) return ENXIO with a negative offset
        lib/ubsan.c: don't mark __ubsan_handle_builtin_unreachable as noreturn
        mm/vmstat.c: fix NUMA statistics updates
        mm/gup.c: fix follow_page_mask() kerneldoc comment
        ocfs2: free up write context when direct IO failed
        scripts/faddr2line: fix location of start_kernel in comment
        mm: don't reclaim inodes with many attached pages
        mm, memory_hotplug: check zone_movable in has_unmovable_pages
        mm/swapfile.c: use kvzalloc for swap_info_struct allocation
        MAINTAINERS: update OMAP MMC entry
        hugetlbfs: fix kernel BUG at fs/hugetlbfs/inode.c:444!
        kernel/sched/psi.c: simplify cgroup_move_task()
        z3fold: fix possible reclaim races
    • Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 03582f33
      Linus Torvalds authored
      Pull scheduler fix from Ingo Molnar:
       "Fix an exec() related scalability/performance regression, which was
        caused by incorrectly calculating load and migrating tasks on exec()
        when they shouldn't be"
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/fair: Fix cpu_util_wake() for 'execl' type workloads
      03582f33
    • Linus Torvalds's avatar
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · b53e27f6
      Linus Torvalds authored
      Pull perf fixes from Ingo Molnar:
       "Fix uncore PMU enumeration for CoffeeLake CPUs"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/x86/intel/uncore: Support CoffeeLake 8th CBOX
        perf/x86/intel/uncore: Add more IMC PCI IDs for KabyLake and CoffeeLake CPUs
      b53e27f6
    • Linus Torvalds's avatar
      Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 743a4863
      Linus Torvalds authored
      Pull EFI fixes from Ingo Molnar:
       "Misc fixes: two warning splat fixes, a leak fix and persistent memory
        allocation fixes for ARM"
      
      * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        efi: Permit calling efi_mem_reserve_persistent() from atomic context
        efi/arm: Defer persistent reservations until after paging_init()
        efi/arm/libstub: Pack FDT after populating it
        efi/arm: Revert deferred unmap of early memmap mapping
        efi: Fix debugobjects warning on 'efi_rts_work'
      743a4863
    • Linus Torvalds's avatar
      Merge branch 'spectre' of git://git.armlinux.org.uk/~rmk/linux-arm · cfaa9f02
      Linus Torvalds authored
      Pull ARM spectre updates from Russell King:
       "These are the currently known final bits that resolve the Spectre
        issues. big.Little systems used to be sufficiently identical in that
        there were no differences between individual CPUs in the system that
        mattered to the kernel. With the advent of the Spectre problem, the
        CPUs now have differences in how the workaround is applied.
      
        As a result of previous Spectre patches, these systems ended up
        reporting quite a lot of:
      
           "CPUx: Spectre v2: incorrect context switching function, system vulnerable"
      
        messages due to the action of the big.Little switcher causing the CPUs
        to be re-initialised regularly. This series resolves that issue by
        making the CPU vtable unique to each CPU.
      
        However, since this is used very early, before per-cpu is setup,
        per-cpu can't be used. We also have a problem that two of the methods
        are not called from preempt-safe paths, but thankfully these remain
        identical between all CPUs in the system. To make sure, we validate
        that these are identical during boot"
      
      * 'spectre' of git://git.armlinux.org.uk/~rmk/linux-arm:
        ARM: spectre-v2: per-CPU vtables to work around big.Little systems
        ARM: add PROC_VTABLE and PROC_TABLE macros
        ARM: clean up per-processor check_bugs method call
        ARM: split out processor lookup
        ARM: make lookup_processor_type() non-__init
      cfaa9f02
    • Michal Hocko's avatar
      mm, page_alloc: check for max order in hot path · c63ae43b
      Michal Hocko authored
      Konstantin has noticed that kvmalloc might trigger the following
      warning:
      
        WARNING: CPU: 0 PID: 6676 at mm/vmstat.c:986 __fragmentation_index+0x54/0x60
        [...]
        Call Trace:
         fragmentation_index+0x76/0x90
         compaction_suitable+0x4f/0xf0
         shrink_node+0x295/0x310
         node_reclaim+0x205/0x250
         get_page_from_freelist+0x649/0xad0
         __alloc_pages_nodemask+0x12a/0x2a0
         kmalloc_large_node+0x47/0x90
         __kmalloc_node+0x22b/0x2e0
         kvmalloc_node+0x3e/0x70
         xt_alloc_table_info+0x3a/0x80 [x_tables]
         do_ip6t_set_ctl+0xcd/0x1c0 [ip6_tables]
         nf_setsockopt+0x44/0x60
         SyS_setsockopt+0x6f/0xc0
         do_syscall_64+0x67/0x120
         entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      
      The problem is that we only check for an out of bound order in the slow
      path and the node reclaim might happen from the fast path already.  This
      is fixable by making sure that kvmalloc doesn't ever use kmalloc for
      requests that are larger than KMALLOC_MAX_SIZE but this also shows that
      the code is rather fragile.  A recent UBSAN report underlines that:
      
        UBSAN: Undefined behaviour in mm/page_alloc.c:3117:19
        shift exponent 51 is too large for 32-bit type 'int'
        CPU: 0 PID: 6520 Comm: syz-executor1 Not tainted 4.19.0-rc2 #1
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
        Call Trace:
         __dump_stack lib/dump_stack.c:77 [inline]
         dump_stack+0xd2/0x148 lib/dump_stack.c:113
         ubsan_epilogue+0x12/0x94 lib/ubsan.c:159
         __ubsan_handle_shift_out_of_bounds+0x2b6/0x30b lib/ubsan.c:425
         __zone_watermark_ok+0x2c7/0x400 mm/page_alloc.c:3117
         zone_watermark_fast mm/page_alloc.c:3216 [inline]
         get_page_from_freelist+0xc49/0x44c0 mm/page_alloc.c:3300
         __alloc_pages_nodemask+0x21e/0x640 mm/page_alloc.c:4370
         alloc_pages_current+0xcc/0x210 mm/mempolicy.c:2093
         alloc_pages include/linux/gfp.h:509 [inline]
         __get_free_pages+0x12/0x60 mm/page_alloc.c:4414
         dma_mem_alloc+0x36/0x50 arch/x86/include/asm/floppy.h:156
         raw_cmd_copyin drivers/block/floppy.c:3159 [inline]
         raw_cmd_ioctl drivers/block/floppy.c:3206 [inline]
         fd_locked_ioctl+0xa00/0x2c10 drivers/block/floppy.c:3544
         fd_ioctl+0x40/0x60 drivers/block/floppy.c:3571
         __blkdev_driver_ioctl block/ioctl.c:303 [inline]
         blkdev_ioctl+0xb3c/0x1a30 block/ioctl.c:601
         block_ioctl+0x105/0x150 fs/block_dev.c:1883
         vfs_ioctl fs/ioctl.c:46 [inline]
         do_vfs_ioctl+0x1c0/0x1150 fs/ioctl.c:687
         ksys_ioctl+0x9e/0xb0 fs/ioctl.c:702
         __do_sys_ioctl fs/ioctl.c:709 [inline]
         __se_sys_ioctl fs/ioctl.c:707 [inline]
         __x64_sys_ioctl+0x7e/0xc0 fs/ioctl.c:707
         do_syscall_64+0xc4/0x510 arch/x86/entry/common.c:290
         entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Note that this is not a kvmalloc path.  It is just that the fast path
      really depends on having sanitized order as well.  Therefore move the
      order check to the fast path.
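
      As a rough illustration of the idea only, the following toy model (not
      kernel code) rejects an out-of-range order at the allocator entry point,
      before any fast-path arithmetic runs.  The names alloc_pages() and
      zone_watermark_fast() are borrowed from the traces above, but their
      bodies, the MAX_ORDER value of 11 and the free_pages default are
      assumptions made for this sketch.

```python
MAX_ORDER = 11  # assumed value for this sketch; the real limit is configuration dependent


def zone_watermark_fast(order, free_pages):
    """Toy fast path: models a C 'int' shift, which is undefined for order >= 32."""
    if order >= 32:
        raise OverflowError(
            f"shift exponent {order} is too large for 32-bit type 'int'"
        )
    return free_pages - ((1 << order) - 1) > 0


def alloc_pages(order, free_pages=4096):
    """Toy entry point: validate the order first, then take the fast path."""
    if order >= MAX_ORDER:
        # Hypothetical stand-in for failing the allocation early.
        return None
    return order if zone_watermark_fast(order, free_pages) else None


print(alloc_pages(2))
print(alloc_pages(51))
```

      In this model a request with order 51 fails cleanly at the entry point,
      whereas calling zone_watermark_fast(51, 4096) directly would hit the
      oversized shift.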
      
      Link: http://lkml.kernel.org/r/20181113094305.GM15120@dhcp22.suse.cz
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Reported-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Reported-by: Kyungtae Kim <kt0755@gmail.com>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Pavel Tatashin <pavel.tatashin@microsoft.com>
      Cc: Oscar Salvador <osalvador@suse.de>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Aaron Lu <aaron.lu@intel.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Byoungyoung Lee <lifeasageek@gmail.com>
      Cc: "Dae R. Jeong" <threeearcat@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c63ae43b
    • Uwe Kleine-König's avatar
      scripts/spdxcheck.py: make python3 compliant · 6f4d29df
      Uwe Kleine-König authored
      Without this change the following happens when using Python3 (3.6.6):
      
      	$ echo "GPL-2.0" | python3 scripts/spdxcheck.py -
      	FAIL: 'str' object has no attribute 'decode'
      	Traceback (most recent call last):
      	  File "scripts/spdxcheck.py", line 253, in <module>
      	    parser.parse_lines(sys.stdin, args.maxlines, '-')
      	  File "scripts/spdxcheck.py", line 171, in parse_lines
      	    line = line.decode(locale.getpreferredencoding(False), errors='ignore')
      	AttributeError: 'str' object has no attribute 'decode'
      
      So as the line is already a string, there is no need to decode it and
      the line can be dropped.
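
      The failure can be reproduced outside the script with a small sketch.
      The helper names read_lines_py2_style() and read_lines() are
      hypothetical, and io.StringIO stands in for sys.stdin under Python 3;
      only the decode call mirrors the line quoted in the traceback above.

```python
import io
import locale


def read_lines_py2_style(stream):
    """Mimics the old code path: assumes each line is bytes and decodes it."""
    encoding = locale.getpreferredencoding(False)
    return [line.decode(encoding, errors="ignore") for line in stream]


def read_lines(stream):
    """Text streams already yield str, so the lines are used as they are."""
    return [line for line in stream]


text_stream = io.StringIO("GPL-2.0\n")  # stand-in for sys.stdin on Python 3

try:
    read_lines_py2_style(text_stream)
    outcome = "decoded"
except AttributeError as exc:
    outcome = str(exc)

text_stream.seek(0)
lines = read_lines(text_stream)

print(outcome)
print(lines)
```

      Under Python 3 the first helper raises AttributeError because str has
      no decode method, while the second returns the lines unchanged.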
      
      /usr/bin/python on Arch is Python 3.  So this would indeed be worth
      going into 4.19.
      
      Link: http://lkml.kernel.org/r/20181023070802.22558-1-u.kleine-koenig@pengutronix.de
      Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Joe Perches <joe@perches.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6f4d29df