1. 09 Sep, 2022 5 commits
  2. 08 Sep, 2022 9 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 9f8f1933
      Paolo Abeni authored
      drivers/net/ethernet/freescale/fec.h
        7d650df9 ("net: fec: add pm_qos support on imx6q platform")
        40c79ce1 ("net: fec: add stop mode support for imx8 platform")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      9f8f1933
    • net: sparx5: fix function return type to match actual type · 75554fe0
      Casper Andersson authored
      Function returns error integer, not bool.
      
      Does not have any impact on functionality.
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Casper Andersson <casper.casan@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Link: https://lore.kernel.org/r/20220906065815.3856323-1-casper.casan@gmail.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      75554fe0
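The commit above notes that this particular mismatch had no functional impact. As a general illustration of why a `bool` return type is a poor fit for a function that returns error integers, here is a minimal userspace sketch. The names `lookup`, `lookup_as_bool` and `lookup_as_int` are hypothetical and are not taken from the sparx5 driver.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical lookup that fails with a negative errno value. */
static int lookup(int key)
{
	return key >= 0 ? 0 : -ENOENT;
}

/* Mismatched type: the int is converted to bool, so -ENOENT becomes true. */
static bool lookup_as_bool(int key)
{
	return lookup(key);
}

/* Matching type: the caller still sees the original error code. */
static int lookup_as_int(int key)
{
	return lookup(key);
}
```

Callers that only test the result for non-zero behave the same with either type, which is consistent with the "no impact on functionality" remark; only callers that inspect the error value would notice the difference.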
    • Merge tag 'net-6.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 26b12249
      Linus Torvalds authored
      Pull networking fixes from Paolo Abeni:
       "Including fixes from rxrpc, netfilter, wireless and bluetooth
        subtrees.
      
        Current release - regressions:
      
         - skb: export skb drop reaons to user by TRACE_DEFINE_ENUM
      
         - bluetooth: fix regression preventing ACL packet transmission
      
        Current release - new code bugs:
      
         - dsa: microchip: fix kernel oops on ksz8 switches
      
         - dsa: qca8k: fix NULL pointer dereference for
           of_device_get_match_data
      
        Previous releases - regressions:
      
         - netfilter: clean up hook list when offload flags check fails
      
         - wifi: mt76: fix crash in chip reset fail
      
         - rxrpc: fix ICMP/ICMP6 error handling
      
         - ice: fix DMA mappings leak
      
         - i40e: fix kernel crash during module removal
      
        Previous releases - always broken:
      
         - ipv6: sr: fix out-of-bounds read when setting HMAC data.
      
         - tcp: TX zerocopy should not sense pfmemalloc status
      
         - sch_sfb: don't assume the skb is still around after
           enqueueing to child
      
         - netfilter: drop dst references before setting
      
         - wifi: wilc1000: fix DMA on stack objects
      
         - rxrpc: fix an insufficiently large sglist in
           rxkad_verify_packet_2()
      
         - fec: use a spinlock to guard `fep->ptp_clk_on`
      
        Misc:
      
         - usb: qmi_wwan: add Quectel RM520N"
      
      * tag 'net-6.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (50 commits)
        sch_sfb: Also store skb len before calling child enqueue
        net: phy: lan87xx: change interrupt src of link_up to comm_ready
        net/smc: Fix possible access to freed memory in link clear
        net: ethernet: mtk_eth_soc: check max allowed hash in mtk_ppe_check_skb
        net: skb: export skb drop reaons to user by TRACE_DEFINE_ENUM
        net: ethernet: mtk_eth_soc: fix typo in __mtk_foe_entry_clear
        net: dsa: felix: access QSYS_TAG_CONFIG under tas_lock in vsc9959_sched_speed_set
        net: dsa: felix: disable cut-through forwarding for frames oversized for tc-taprio
        net: dsa: felix: tc-taprio intervals smaller than MTU should send at least one packet
        net: usb: qmi_wwan: add Quectel RM520N
        net: dsa: qca8k: fix NULL pointer dereference for of_device_get_match_data
        tcp: fix early ETIMEDOUT after spurious non-SACK RTO
        stmmac: intel: Simplify intel_eth_pci_remove()
        net: mvpp2: debugfs: fix memory leak when using debugfs_lookup()
        ipv6: sr: fix out-of-bounds read when setting HMAC data.
        bonding: accept unsolicited NA message
        bonding: add all node mcast address when slave up
        bonding: use unspecified address if no available link local address
        wifi: use struct_group to copy addresses
        wifi: mac80211_hwsim: check length for virtio packets
        ...
      26b12249
    • fs: only do a memory barrier for the first set_buffer_uptodate() · 2f79cdfe
      Linus Torvalds authored
      Commit d4252071 ("add barriers to buffer_uptodate and
      set_buffer_uptodate") added proper memory barriers to the buffer head
      BH_Uptodate bit, so that anybody who tests a buffer for being up-to-date
      will be guaranteed to actually see initialized state.
      
      However, that commit didn't _just_ add the memory barrier, it also ended
      up dropping the "was it already set" logic that the BUFFER_FNS() macro
      had.
      
      That's conceptually the right thing for a generic "this is a memory
      barrier" operation, but in the case of the buffer contents, we really
      only care about the memory barrier for the _first_ time we set the bit,
      in that the only memory ordering protection we need is to avoid anybody
      seeing uninitialized memory contents.
      
      Any other access ordering wouldn't be about the BH_Uptodate bit anyway,
      and would require some other proper lock (typically BH_Lock or the folio
      lock).  A reader that races with somebody invalidating the buffer head
      isn't an issue wrt the memory ordering, it's a serialization issue.
      
      Now, you'd think that the buffer head operations don't matter in this
      day and age (and I certainly thought so), but apparently some loads
      still end up being heavy users of buffer heads.  In particular, the
      kernel test robot reported that not having this bit access optimization
      in place caused a noticeable direct IO performance regression on ext4:
      
        fxmark.ssd_ext4_no_jnl_DWTL_54_directio.works/sec -26.5% regression
      
      although you presumably need a fast disk and a lot of cores to actually
      notice.
      
      Link: https://lore.kernel.org/all/Yw8L7HTZ%2FdE2%2Fo9C@xsang-OptiPlex-9020/
      Reported-by: kernel test robot <oliver.sang@intel.com>
      Tested-by: Fengwei Yin <fengwei.yin@intel.com>
      Cc: Mikulas Patocka <mpatocka@redhat.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: stable@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      2f79cdfe
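The "only pay for the barrier on the first set" idea described in the commit above can be sketched in userspace with C11 atomics. This is an analogue under stated assumptions, not the kernel code: `struct buf`, `FLAG_UPTODATE`, `buf_set_uptodate` and `buf_uptodate` are hypothetical names, and C11 release/acquire ordering stands in for the kernel's barrier primitives.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define FLAG_UPTODATE 1UL	/* hypothetical bit, standing in for BH_Uptodate */

struct buf {
	atomic_ulong state;
	int data;
};

/* Writer: publish the contents, ordering the data only on the first set. */
static void buf_set_uptodate(struct buf *b)
{
	/* Already set: whoever set it first has already ordered the data. */
	if (atomic_load_explicit(&b->state, memory_order_relaxed) & FLAG_UPTODATE)
		return;
	/* First set: release ordering makes earlier data writes visible. */
	atomic_fetch_or_explicit(&b->state, FLAG_UPTODATE, memory_order_release);
}

/* Reader: the acquire load pairs with the release in the writer. */
static bool buf_uptodate(struct buf *b)
{
	return atomic_load_explicit(&b->state, memory_order_acquire) & FLAG_UPTODATE;
}
```

The early return turns repeated calls into a plain read of the flag word, which is the kind of saving the commit message credits for recovering the direct IO regression.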
    • Merge tag 'efi-urgent-for-v6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi · f280b987
      Linus Torvalds authored
      Pull EFI fixes from Ard Biesheuvel:
       "A couple of low-priority EFI fixes:
      
         - prevent the randstruct plugin from re-ordering EFI protocol
           definitions
      
         - fix a use-after-free in the capsule loader
      
         - drop unused variable"
      
      * tag 'efi-urgent-for-v6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
        efi: capsule-loader: Fix use-after-free in efi_capsule_write
        efi/x86: libstub: remove unused variable
        efi: libstub: Disable struct randomization
      f280b987
    • r8169: merge support for chip versions 10, 13, 16 · e66d6586
      Heiner Kallweit authored
      These chip versions are closely related and all of them have no
      chip-specific MAC/PHY initialization. Therefore merge support
      for the three chip versions.
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Link: https://lore.kernel.org/r/469d27e0-1d06-9b15-6c96-6098b3a52e35@gmail.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      e66d6586
    • sch_sfb: Also store skb len before calling child enqueue · 2f09707d
      Toke Høiland-Jørgensen authored
      Cong Wang noticed that the previous fix for sch_sfb accessing the queued
      skb after enqueueing it to a child qdisc was incomplete: the SFB enqueue
      function was also calling qdisc_qstats_backlog_inc() after enqueue, which
      reads the pkt len from the skb cb field. Fix this by also storing the skb
      len, and using the stored value to increment the backlog after enqueueing.
      
      Fixes: 9efd2329 ("sch_sfb: Don't assume the skb is still around after enqueueing to child")
      Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
      Acked-by: Cong Wang <cong.wang@bytedance.com>
      Link: https://lore.kernel.org/r/20220905192137.965549-1-toke@toke.dk
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      2f09707d
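The pattern behind the sch_sfb fix above (copy what you need out of an object before handing ownership to code that may free it) can be sketched in plain C. `struct pkt`, `struct queue`, `child_enqueue` and `parent_enqueue` are hypothetical stand-ins, not the qdisc API.

```c
#include <stdlib.h>

struct pkt { unsigned int len; };
struct queue { unsigned int backlog; };

/* Hypothetical child enqueue: takes ownership and may free the packet. */
static int child_enqueue(struct pkt *p)
{
	free(p);
	return 0;
}

static int parent_enqueue(struct queue *q, struct pkt *p)
{
	unsigned int len = p->len;	/* read while p still belongs to us */
	int ret = child_enqueue(p);	/* p must not be touched after this */

	if (ret == 0)
		q->backlog += len;	/* uses the saved copy, not p->len */
	return ret;
}
```

Reading `p->len` after `child_enqueue()` would be a use-after-free in this sketch, which mirrors the backlog accounting problem the commit describes.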
    • net: phy: lan87xx: change interrupt src of link_up to comm_ready · 5382033a
      Arun Ramadoss authored
      Currently phy link up/down interrupt is enabled using the
      LAN87xx_INTERRUPT_MASK register. In the lan87xx_read_status function,
      phy link is determined using the T1_MODE_STAT_REG register comm_ready bit.
      comm_ready bit is set using the loc_rcvr_status & rem_rcvr_status.
      Whenever the phy link is up, LAN87xx_INTERRUPT_SOURCE link_up bit is set
      first but comm_ready bit takes some time to set based on local and
      remote receiver status.
      As per the current implementation, interrupt is triggered using link_up
      but the comm_ready bit is still cleared in the read_status function. So,
      link is always down.  Initially tested with the shared interrupt
      mechanism with switch and internal phy which is working, but after
      implementing interrupt controller it is not working.
      It can be fixed either by updating the read_status function to read from
      the LAN87XX_INTERRUPT_SOURCE register or by enabling the interrupt mask
      for the comm_ready bit. The validation team recommends the use of
      comm_ready for link detection.
      This patch fixes it by enabling the comm_ready bit for link_up in the
      LAN87XX_INTERRUPT_MASK_2 register (MISC Bank) and link_down in the
      LAN87xx_INTERRUPT_MASK register.
      
      Fixes: 8a1b415d ("net: phy: added ethtool master-slave configuration support")
      Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Link: https://lore.kernel.org/r/20220905152750.5079-1-arun.ramadoss@microchip.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      5382033a
    • net: stmmac: Disable automatic FCS/Pad stripping · 929d4342
      Kurt Kanzenbach authored
      The stmmac has the possibility to automatically strip the padding/FCS for IEEE
      802.3 type frames. This feature is enabled conditionally. Therefore, the stmmac
      receive path has to have a determination logic whether the FCS has to be
      stripped in software or not.
      
      In fact, for DSA this ACS feature is disabled and the determination logic
      doesn't check for it properly. For instance, when using DSA in combination with
      an older stmmac (pre version 4), the FCS is not stripped by hardware or software
      which is problematic.
      
      So either add another check for DSA to the fast path or simply disable ACS
      feature completely. The latter approach has been chosen, because most of the
      time the FCS is stripped in software anyway and it removes conditionals from the
      receive fast path.
      Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Link: https://lore.kernel.org/r/87v8q8jjgh.fsf@kurt/
      Link: https://lore.kernel.org/r/20220905130155.193640-1-kurt@linutronix.de
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      929d4342
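With hardware stripping disabled, as the commit above chooses, the receive path removes the FCS in software. A minimal sketch of that length adjustment follows. `FCS_LEN` and `frame_len_without_fcs` are hypothetical names, and the only fact assumed is that the Ethernet FCS is a 4-byte CRC at the end of the frame.

```c
#include <stddef.h>

#define FCS_LEN 4	/* Ethernet frame check sequence: 4-byte CRC-32 */

/* Length of a received frame once the trailing FCS is dropped in software. */
static size_t frame_len_without_fcs(size_t rx_len)
{
	return rx_len > FCS_LEN ? rx_len - FCS_LEN : 0;
}
```

Doing this unconditionally is what lets the driver drop the "did the hardware already strip it?" checks from the fast path.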
  3. 07 Sep, 2022 26 commits
    • efi: capsule-loader: Fix use-after-free in efi_capsule_write · 9cb636b5
      Hyunwoo Kim authored
      A race condition may occur if the user calls close() on another thread
      during a write() operation on the device node of the efi capsule.
      
      This is a race condition that occurs between the efi_capsule_write() and
      efi_capsule_flush() functions of efi_capsule_fops, which ultimately
      results in UAF.
      
      So, the page freeing process is modified to be done in
      efi_capsule_release() instead of efi_capsule_flush().
      
      Cc: <stable@vger.kernel.org> # v4.9+
      Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
      Link: https://lore.kernel.org/all/20220907102920.GA88602@ubuntu/
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      9cb636b5
    • Merge branch 'hns3-new-features' · 418b0866
      David S. Miller authored
      Guangbin Huang says:
      
      ====================
      hns3: add some new features
      
      This series adds some new features for the HNS3 ethernet driver.
      
      Patches #1~#3 support configuring dscp map to tc.

      Patch #4 supports querying FEC statistics by command "ethtool -I --show-fec eth0".

      Patch #5 supports querying and setting Serdes lane number.
      
      Change logs:
      V1 -> V2:
       - fix build error of patch #1 reported by robot lkp@intel.com.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      418b0866
    • net: hns3: add support to query and set lane number by ethtool · 0f032f93
      Hao Chen authored
      When a serdes lane supports 25Gb/s or 50Gb/s and the user wants to set
      the port speed to 50Gb/s, the port can be configured as one 50Gb/s serdes
      lane or two 25Gb/s serdes lanes.

      So, this patch adds support to query and set lane number by ethtool
      to satisfy this scenario.
      Signed-off-by: Hao Chen <chenhao418@huawei.com>
      Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0f032f93
    • net: hns3: add querying fec statistics · 2cb343b9
      Hao Lan authored
      FEC statistics can be used to check the transmission quality of links.
      This patch implements the get_fec_stats callback of ethtool_ops to support
      querying FEC statistics by command "ethtool -I --show-fec eth0".
      Signed-off-by: Hao Lan <lanhao@huawei.com>
      Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2cb343b9
    • net: hns3: debugfs add dump dscp map info · fddc02eb
      Guangbin Huang authored
      This patch adds a debugfs dump of the mapping between dscp, priority
      and TC, and of the current tc map mode.
      Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fddc02eb
    • net: hns3: support ndo_select_queue() · f6e32724
      Guangbin Huang authored
      To let tx packets select a queue according to their dscp field once the
      dscp and tc map relationship has been set, this patch implements
      ndo_select_queue() to set skb->priority according to the user-configured
      dscp and priority map relationship.
      Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f6e32724
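As a rough illustration of choosing a priority from the dscp field, the sketch below reads the DSCP from the upper six bits of the IPv4 TOS byte and looks it up in a user-configured table. The table layout, `PRIO_UNSET` marker and function names are assumptions made for the example and are not taken from the hns3 driver.

```c
#include <stdint.h>
#include <string.h>

#define DSCP_MAX   64	/* DSCP is a 6-bit field */
#define PRIO_UNSET 0xff	/* hypothetical marker: no mapping configured */

static uint8_t dscp_prio[DSCP_MAX];

/* Mark every DSCP value as unmapped. */
static void dscp_map_init(void)
{
	memset(dscp_prio, PRIO_UNSET, sizeof(dscp_prio));
}

/* DSCP occupies the upper six bits of the IPv4 TOS byte. */
static uint8_t tos_to_dscp(uint8_t tos)
{
	return tos >> 2;
}

/* Use the configured priority if there is one, else keep the current one. */
static uint8_t pick_priority(uint8_t tos, uint8_t current_prio)
{
	uint8_t prio = dscp_prio[tos_to_dscp(tos)];

	return prio == PRIO_UNSET ? current_prio : prio;
}
```

For example, a TOS byte of 0xb8 carries DSCP 46, so a table entry `dscp_prio[46] = 5` would make `pick_priority(0xb8, 0)` select priority 5.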
    • net: hns3: add support config dscp map to tc · 0ba22bcb
      Guangbin Huang authored
      This patch adds support for configuring the dscp to tc map by implementing
      ieee_setapp and ieee_delapp of struct dcbnl_rtnl_ops. The driver converts
      the mapping relationship from dscp-prio to dscp-tc.
      Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0ba22bcb
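The dscp-prio to dscp-tc conversion mentioned above amounts to composing two lookup tables. The following is a minimal sketch assuming 64 DSCP values and 8 priorities; the array names and `build_dscp_tc` are hypothetical and do not reflect the driver's data structures.

```c
#include <stdint.h>

#define NUM_DSCP 64
#define NUM_PRIO 8

/* Compose dscp -> prio with prio -> tc to obtain dscp -> tc. */
static void build_dscp_tc(const uint8_t dscp_prio[NUM_DSCP],
			  const uint8_t prio_tc[NUM_PRIO],
			  uint8_t dscp_tc[NUM_DSCP])
{
	for (int d = 0; d < NUM_DSCP; d++)
		dscp_tc[d] = prio_tc[dscp_prio[d] % NUM_PRIO];
}
```

The modulo only guards the sketch against out-of-range priorities; real code would presumably validate the value when the mapping is configured.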
    • net/smc: Fix possible access to freed memory in link clear · e9b1a4f8
      Yacan Liu authored
      After modifying the QP to the Error state, all RX WRs are completed
      with a WC in IB_WC_WR_FLUSH_ERR status. The current implementation does
      not wait for this to finish, but destroys the QP and frees the link group
      directly. So there is a risk of accessing freed memory in tasklet context.
      
      Here is a crash example:
      
       BUG: unable to handle page fault for address: ffffffff8f220860
       #PF: supervisor write access in kernel mode
       #PF: error_code(0x0002) - not-present page
       PGD f7300e067 P4D f7300e067 PUD f7300f063 PMD 8c4e45063 PTE 800ffff08c9df060
       Oops: 0002 [#1] SMP PTI
       CPU: 1 PID: 0 Comm: swapper/1 Kdump: loaded Tainted: G S         OE     5.10.0-0607+ #23
       Hardware name: Inspur NF5280M4/YZMB-00689-101, BIOS 4.1.20 07/09/2018
       RIP: 0010:native_queued_spin_lock_slowpath+0x176/0x1b0
       Code: f3 90 48 8b 32 48 85 f6 74 f6 eb d5 c1 ee 12 83 e0 03 83 ee 01 48 c1 e0 05 48 63 f6 48 05 00 c8 02 00 48 03 04 f5 00 09 98 8e <48> 89 10 8b 42 08 85 c0 75 09 f3 90 8b 42 08 85 c0 74 f7 48 8b 32
       RSP: 0018:ffffb3b6c001ebd8 EFLAGS: 00010086
       RAX: ffffffff8f220860 RBX: 0000000000000246 RCX: 0000000000080000
       RDX: ffff91db1f86c800 RSI: 000000000000173c RDI: ffff91db62bace00
       RBP: ffff91db62bacc00 R08: 0000000000000000 R09: c00000010000028b
       R10: 0000000000055198 R11: ffffb3b6c001ea58 R12: ffff91db80e05010
       R13: 000000000000000a R14: 0000000000000006 R15: 0000000000000040
       FS:  0000000000000000(0000) GS:ffff91db1f840000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: ffffffff8f220860 CR3: 00000001f9580004 CR4: 00000000003706e0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
       Call Trace:
        <IRQ>
        _raw_spin_lock_irqsave+0x30/0x40
        mlx5_ib_poll_cq+0x4c/0xc50 [mlx5_ib]
        smc_wr_rx_tasklet_fn+0x56/0xa0 [smc]
        tasklet_action_common.isra.21+0x66/0x100
        __do_softirq+0xd5/0x29c
        asm_call_irq_on_stack+0x12/0x20
        </IRQ>
        do_softirq_own_stack+0x37/0x40
        irq_exit_rcu+0x9d/0xa0
        sysvec_call_function_single+0x34/0x80
        asm_sysvec_call_function_single+0x12/0x20
      
      Fixes: bd4ad577 ("smc: initialize IB transport incl. PD, MR, QP, CQ, event, WR")
      Signed-off-by: Yacan Liu <liuyacan@corp.netease.com>
      Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e9b1a4f8
    • Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue · 2018b22a
      David S. Miller authored
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2022-09-06 (i40e, iavf)
      
      This series contains updates to i40e and iavf drivers.
      
      Stanislaw adds support for new device id for i40e.
      
      Jaroslaw tidies up some code around MSI-X configuration by adding/
      reworking comments and introducing a couple of macros for i40e.
      
      Michal resolves some races around reset and close by deferring and deleting
      some pending AdminQ operations and reworking filter additions and deletions
      during these operations for iavf.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2018b22a
    • Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue · 29796143
      David S. Miller authored
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2022-09-06 (ice)
      
      This series contains updates to ice driver only.
      
      Tony reduces device MSI-X request/usage when entire request can't be fulfilled.
      
      Michal adds check for reset when waiting for PTP offsets.
      
      Paul refactors firmware version checks to use a common helper.
      
      Christophe Jaillet changes a couple of local memory allocations to not
      use the devm variant.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      29796143
    • netfilter: nat: avoid long-running port range loop · adda60cc
      Florian Westphal authored
      Looping a large port range takes too long. Instead select a random
      offset within [ntohs(exp->saved_proto.tcp.port), 65535] and try 128
      ports.
      
      This is a rehash of an earlier patch to do the same, but generalized
      to handle other helpers as well.

      Link: https://patchwork.ozlabs.org/project/netfilter-devel/patch/20210920204439.13179-2-Cole.Dishington@alliedtelesis.co.nz/
      Signed-off-by: Florian Westphal <fw@strlen.de>
      adda60cc
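The bounded search described above (random offsets within [port, 65535], at most 128 tries) could look roughly like the sketch below. It is a userspace illustration only: `find_port`, the `in_use` callback and `demo_in_use` are hypothetical, `rand()` stands in for the kernel's random helper, and trying the requested port first is an assumption of this sketch. It also assumes `first` is non-zero, since 0 is used as the failure value.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ATTEMPTS 128

/* Returns a free port, or 0 if none was found within MAX_ATTEMPTS tries. */
static uint16_t find_port(uint16_t first, bool (*in_use)(uint16_t))
{
	uint32_t range = 65535u - first + 1;	/* size of [first, 65535] */
	uint16_t port = first;			/* try the requested port first */

	for (int i = 0; i < MAX_ATTEMPTS; i++) {
		if (!in_use(port))
			return port;
		/* Jump to a random port in the range instead of walking it. */
		port = (uint16_t)(first + (uint32_t)rand() % range);
	}
	return 0;
}

/* Demo predicate: only port 1000 is taken. */
static bool demo_in_use(uint16_t port)
{
	return port == 1000;
}
```

Capping the attempts bounds the worst case at 128 probes, whereas a sequential walk over a nearly full range could take tens of thousands.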
    • netfilter: nat: move repetitive nat port reserve loop to a helper · c92c2717
      Florian Westphal authored
      Almost all nat helpers reserve an expectation port the same way:
      Try the port indicated by the peer, then move to next port if that
      port is already in use.

      We can squash this into a helper.
      Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      c92c2717
    • netfilter: move from strlcpy with unused retval to strscpy · 8556bceb
      Wolfram Sang authored
      Follow the advice of the below link and prefer 'strscpy' in this
      subsystem. Conversion is 1:1 because the return value is not used.
      Generated by a coccinelle script.
      
      Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
      Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
      Reviewed-by: Simon Horman <horms@verge.net.au>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      8556bceb
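The conversion above is 1:1 only because the return value is ignored. Where it is used, the two functions differ: strlcpy() returns the length of the source, while strscpy() returns the number of characters copied or -E2BIG on truncation. The userspace sketch below mimics the strscpy()-style contract; `bounded_copy` is a hypothetical stand-in, not the kernel implementation.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>

/* Userspace stand-in with strscpy()-like semantics (hypothetical name). */
static ssize_t bounded_copy(char *dst, const char *src, size_t size)
{
	size_t len;

	if (size == 0)
		return -E2BIG;
	len = strnlen(src, size);	/* never scans more than size bytes */
	if (len == size) {		/* src does not fit: truncate */
		memcpy(dst, src, size - 1);
		dst[size - 1] = '\0';
		return -E2BIG;
	}
	memcpy(dst, src, len + 1);	/* includes the terminating NUL */
	return (ssize_t)len;
}
```

Unlike a strlcpy()-style function, this sketch never scans the source beyond `size` bytes, so an unterminated source cannot cause an unbounded read.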
    • netfilter: remove NFPROTO_DECNET · a0a4de4d
      Florian Westphal authored
      Decnet has been removed, so there is no need to reserve space in arrays for it.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      a0a4de4d
    • netfilter: conntrack: reduce timeout when receiving out-of-window fin or rst · 628d6943
      Florian Westphal authored
      In case the endpoints and conntrack go out-of-sync, i.e. there is
      disagreement wrt. validity of sequence/ack numbers between conntrack's
      internal state and those of the endpoints, connections can hang for a
      long time (until ESTABLISHED timeout).

      This adds a check to detect a fin/fin exchange even if those are
      invalid.  The timeout is then lowered to UNACKED (default 300s).
      Signed-off-by: Florian Westphal <fw@strlen.de>
      628d6943
    • netfilter: conntrack: remove unneeded indent level · 09a59001
      Florian Westphal authored
      After previous patch, the conditional branch is obsolete, reformat it.
      gcc generates same code as before this change.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      09a59001
    • net: sysctl: remove unused variable long_max · 53fc01a0
      Liu Shixin authored
      The variable long_max has been replaced by bpf_jit_limit_max and is no
      longer used, so remove it.

      No functional change.
      Signed-off-by: Liu Shixin <liushixin2@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      53fc01a0
    • net: ethernet: mtk_eth_soc: remove mtk_foe_entry_timestamp · c9daab32
      Lorenzo Bianconi authored
      Get rid of mtk_foe_entry_timestamp routine since it is no longer used.
      Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c9daab32
    • net: ethernet: mtk_eth_soc: check max allowed hash in mtk_ppe_check_skb · f27b405e
      Lorenzo Bianconi authored
      Even if max hash configured in hw in mtk_ppe_hash_entry is
      MTK_PPE_ENTRIES - 1, check theoretical OOB accesses in
      mtk_ppe_check_skb routine
      
      Fixes: c4f033d9 ("net: ethernet: mtk_eth_soc: rework hardware flow table management")
      Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f27b405e
    • net: skb: export skb drop reaons to user by TRACE_DEFINE_ENUM · 9cb252c4
      Menglong Dong authored
      As Eric reported, the 'reason' field is not presented when tracing the
      kfree_skb event with perf:
      
      $ perf record -e skb:kfree_skb -a sleep 10
      $ perf script
        ip_defrag 14605 [021]   221.614303:   skb:kfree_skb:
        skbaddr=0xffff9d2851242700 protocol=34525 location=0xffffffffa39346b1
        reason:
      
      The cause seems to be passing kernel address directly to TP_printk(),
      which is not right. As the enum 'skb_drop_reason' is not exported to
      user space through TRACE_DEFINE_ENUM(), perf can't get the drop reason
      string from the 'reason' field, which is a number.
      
      Therefore, we introduce the macro DEFINE_DROP_REASON(), which is used
      to define the trace enum by TRACE_DEFINE_ENUM(). With the help of
      DEFINE_DROP_REASON(), we can now remove the auto-generation that we
      introduced in the commit ec43908d
      ("net: skb: use auto-generation to convert skb drop reason to string"),
      and define the string array 'drop_reasons'.

      Hmmmm...now we are back to the situation where we have to maintain drop
      reasons in both enum skb_drop_reason and DEFINE_DROP_REASON. But they
      are both in dropreason.h, which makes it easier.
      
      After this commit, now the format of kfree_skb is like this:
      
      $ cat /tracing/events/skb/kfree_skb/format
      name: kfree_skb
      ID: 1524
      format:
              field:unsigned short common_type;       offset:0;       size:2; signed:0;
              field:unsigned char common_flags;       offset:2;       size:1; signed:0;
              field:unsigned char common_preempt_count;       offset:3;       size:1; signed:0;
              field:int common_pid;   offset:4;       size:4; signed:1;
      
              field:void * skbaddr;   offset:8;       size:8; signed:0;
              field:void * location;  offset:16;      size:8; signed:0;
              field:unsigned short protocol;  offset:24;      size:2; signed:0;
              field:enum skb_drop_reason reason;      offset:28;      size:4; signed:0;
      
      print fmt: "skbaddr=%p protocol=%u location=%p reason: %s", REC->skbaddr, REC->protocol, REC->location, __print_symbolic(REC->reason, { 1, "NOT_SPECIFIED" }, { 2, "NO_SOCKET" } ......
      
      Fixes: ec43908d ("net: skb: use auto-generation to convert skb drop reason to string")
      Link: https://lore.kernel.org/netdev/CANn89i+bx0ybvE55iMYf5GJM48WwV1HNpdm9Q6t-HaEstqpCSA@mail.gmail.com/
      Reported-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Menglong Dong <imagedong@tencent.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9cb252c4
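The arrangement described above (a hand-maintained enum plus a list macro that expands into a name table) can be sketched in plain C. This is a simplified userspace illustration: `enum reason`, `DEFINE_REASONS`, `reason_names` and `reason_name` are hypothetical, and the real macro additionally feeds TRACE_DEFINE_ENUM(), which this sketch does not model.

```c
/* Hand-maintained enum, kept next to the list macro below. */
enum reason {
	REASON_NONE,
	REASON_NOT_SPECIFIED,
	REASON_NO_SOCKET,
	REASON_MAX
};

/* One list of names; FN decides what each entry expands to. */
#define DEFINE_REASONS(FN)	\
	FN(NOT_SPECIFIED)	\
	FN(NO_SOCKET)

/* Expand the list into designated initializers for the string table. */
#define FN(name) [REASON_##name] = #name,
static const char *const reason_names[REASON_MAX] = {
	[REASON_NONE] = "NONE",
	DEFINE_REASONS(FN)
};
#undef FN

/* Map an enum value to its name, tolerating out-of-range input. */
static const char *reason_name(enum reason r)
{
	return (unsigned int)r < REASON_MAX ? reason_names[r] : "UNKNOWN";
}
```

As the commit message admits, the enum and the list must be kept in sync by hand; keeping them adjacent in one header is what makes that manageable.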
    • net: ethernet: mtk_eth_soc: fix typo in __mtk_foe_entry_clear · 0e80707d
      Lorenzo Bianconi authored
      Set ib1 state to MTK_FOE_STATE_UNBIND in __mtk_foe_entry_clear routine.
      
      Fixes: 33fc42de ("net: ethernet: mtk_eth_soc: support creating mac address based offload entries")
      Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0e80707d
    • netfilter: conntrack: ignore overly delayed tcp packets · 6e250dcb
      Florian Westphal authored
      If 'nf_conntrack_tcp_loose' is off (the default), tcp packets that are
      outside of the current window are marked as INVALID.
      
      nf/iptables rulesets often drop such packets via 'ct state invalid' or
      similar checks.
      
      For overly delayed acks, this can be a nuisance if such 'invalid' packets
      are also logged.
      
      Since they are not invalid in a strict sense, just ignore them, i.e.
      conntrack won't extend timeout or change state so that they do not match
      invalid state rules anymore.
      
      This also avoids unwanted connection stalls in case conntrack considers
      retransmission (of data that did not reach the peer) as too old.

      The else branch of the conditional becomes obsolete.
      Next patch will reformat the now always-true if condition.
      
      The existing workaround for data that exceeds the calculated receive
      window is adjusted to use the 'ignore' state so that these packets do
      not refresh the timeout or change state other than updating ->td_end.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      6e250dcb
    • netfilter: conntrack: prepare tcp_in_window for ternary return value · d9a6f0d0
      Florian Westphal authored
      tcp_in_window returns true if the packet is in window and false if it is
      not.
      
      If it's outside of window, packet will be treated as INVALID.
      
      There are corner cases where the packet should still be tracked, because
      rulesets may drop or log such packets, even though they can occur during
      normal operation, such as overly delayed acks.
      
      In extreme cases, connection may hang forever because conntrack state
      differs from real state.
      
      There is no retransmission for ACKs.
      
      In case of ACK loss after conntrack processing, it's possible that a
      connection can be stuck because the actual retransmits are considered
      stale ("SEQ is under the lower bound (already ACKed data
      retransmitted)").
      
      The problem is made worse by carrier-grade-nat which can also result
      in stale packets from old connections to get treated as 'recent' packets
      in conntrack (it doesn't support tcp timestamps at this time).
      
      Prepare tcp_in_window() to return an enum that tells the desired
      action (in-window/accept, bogus/drop).
      
      A third action (accept the packet as in-window, but do not change
      state) is added in a followup patch.
      Signed-off-by: Florian Westphal <fw@strlen.de>
      d9a6f0d0
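To illustrate the move from a boolean result to an action enum, here is a small userspace sketch. The enum and function names are hypothetical (the commit message does not list the kernel's identifiers), and the window test is a deliberately simplified sequence-space range check rather than conntrack's actual logic.

```c
#include <stdint.h>

/* Hypothetical action names for the two outcomes the commit describes. */
enum win_action {
	WIN_ACCEPT,	/* in window: packet is accepted */
	WIN_INVALID,	/* bogus: packet is treated as INVALID */
};

/* Wraparound-safe "a is before b" for 32-bit sequence numbers. */
static int seq_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/* Accept only if lo <= seq <= hi in sequence space. */
static enum win_action check_window(uint32_t seq, uint32_t lo, uint32_t hi)
{
	if (seq_before(seq, lo) || seq_before(hi, seq))
		return WIN_INVALID;
	return WIN_ACCEPT;
}
```

Returning an enum rather than a bool leaves room for the third outcome the commit message announces for a follow-up patch, without changing every caller again.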
    • Merge branch 'macsec-offload-mlx5' · 016eb590
      David S. Miller authored
      Saeed Mahameed says:
      
      ====================
      Introduce MACsec skb_metadata_dst and mlx5 macsec offload
      
      v1->v2:
         - attach mlx5 implementation patches.
      
      This patchset introduces MACsec skb_metadata_dst to lay the ground
      for MACsec HW offload.
      
      MACsec is an IEEE standard (IEEE 802.1AE) for MAC security.
      It defines a way to establish a protocol independent connection
      between two hosts with data confidentiality, authenticity and/or
      integrity, using GCM-AES. MACsec operates on the Ethernet layer and
      as such is a layer 2 protocol, which means it’s designed to secure
      traffic within a layer 2 network, including DHCP or ARP requests.
      
      Linux has a software implementation of the MACsec standard and
      HW offloading support.
      The offloading is re-using the logic, netlink API and data
      structures of the existing MACsec software implementation.
      
      For Tx:
      In the current MACsec offload implementation, MACsec interfaces share
      the same MAC address by default.
      Therefore, HW can't distinguish which MACsec interface the traffic
      originated from.
      
      The MACsec stack will use skb_metadata_dst to store the SCI value, which
      is unique per MACsec interface. skb_metadata_dst will be used later by
      the offloading device driver to associate the SKB with the corresponding
      offloaded interface (SCI) to facilitate HW MACsec offload.
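
      The Tx association above could be modelled roughly as follows. This is a
      user-space sketch under stated assumptions: pkt_meta, offload_ctx and
      tx_lookup are hypothetical names, and the SCI is modelled as a plain
      64-bit value. It is not the kernel or mlx5 implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-packet metadata carrying the SCI set by the MACsec stack. */
struct pkt_meta {
	uint64_t sci;
};

/* Hypothetical driver-side record of one offloaded MACsec interface. */
struct offload_ctx {
	uint64_t sci;	/* unique per MACsec interface */
	int hw_id;	/* handle the hardware would use for this interface */
	int in_use;
};

/*
 * Return the offload context whose SCI matches the packet metadata, or NULL.
 * The MAC address is not consulted, since interfaces may share it.
 */
static const struct offload_ctx *
tx_lookup(const struct offload_ctx *tbl, size_t n, const struct pkt_meta *md)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].in_use && tbl[i].sci == md->sci)
			return &tbl[i];
	return NULL;
}
```

      Keying the lookup on the SCI rather than the source MAC is what lets
      several interfaces with an identical MAC address be told apart.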
      
      For Rx:
      As in the Tx changes, if there is more than one MACsec device with
      the same MAC address as the packet's destination MAC, the packet will
      be forwarded to only one of the devices, and not necessarily to the desired one.
      
      The offloading device driver sets the MACsec skb_metadata_dst sci
      field with the appropriate Rx SCI for each SKB so the MACsec rx handler
      will know to which port to divert those skbs, instead of wrongly
      relying solely on dst MAC address comparison.
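
      A rough sketch of the Rx selection described above, assuming
      hypothetical types (rx_frame, macsec_port) and a hypothetical helper
      rx_select_port. It only illustrates why an SCI match is unambiguous
      where a destination MAC match is not; the real rx handler differs.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical received frame; has_sci is set when the driver attached metadata. */
struct rx_frame {
	uint8_t dst_mac[6];
	int has_sci;
	uint64_t sci;
};

/* Hypothetical MACsec port; several ports may share the same MAC. */
struct macsec_port {
	uint8_t mac[6];
	uint64_t sci;
};

/*
 * Return the index of the port that should receive the frame, or -1.
 * With metadata, match on SCI. Without it, fall back to the first port
 * whose MAC matches, which may be the wrong one when MACs are shared.
 */
static int rx_select_port(const struct macsec_port *ports, size_t n,
			  const struct rx_frame *f)
{
	for (size_t i = 0; i < n; i++) {
		if (f->has_sci) {
			if (ports[i].sci == f->sci)
				return (int)i;
		} else if (memcmp(ports[i].mac, f->dst_mac, 6) == 0) {
			return (int)i;
		}
	}
	return -1;
}
```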
      
      1) patch 1,2, Add support to skb_metadata_dst in MACsec code:
      net/macsec: Add MACsec skb_metadata_dst Tx Data path support
      net/macsec: Add MACsec skb_metadata_dst Rx Data path support
      
      2) patch 3, Move some MACsec driver code for sharing with various
      drivers that implement offload:
      net/macsec: Move some code for sharing with various drivers that
      implements offload
      
      3) The rest of the patches introduce mlx5 implementation for macsec
      offloads TX and RX via steering tables.
        a) TX, intercept skbs with macsec offload mark in skb_metadata_dst and mark
      the descriptor for offload.
        b) RX, intercept offloaded frames and prepare the proper
      skb_metadata_dst to mark offloaded rx frames.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      016eb590
    • Lior Nahmanson's avatar
      net/mlx5e: Add support to configure more than one macsec offload device · 99d4dc66
      Lior Nahmanson authored
      Add the ability to add up to 16 MACsec offload interfaces
      over the same physical interface.
      Signed-off-by: Lior Nahmanson <liorna@nvidia.com>
      Reviewed-by: Raed Salem <raeds@nvidia.com>
      Signed-off-by: Raed Salem <raeds@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      99d4dc66
    • Lior Nahmanson's avatar
      net/mlx5e: Add MACsec stats support for Rx/Tx flows · 807a1b76
      Lior Nahmanson authored
      Add the following statistics:
      Rx successfully decrypted MACsec packets:
      macsec_rx_pkts : Number of packets decrypted successfully
      macsec_rx_bytes : Number of bytes decrypted successfully
      
      Rx dropped MACsec packets:
      macsec_rx_pkts_drop : Number of MACsec packets dropped
      macsec_rx_bytes_drop : Number of MACsec bytes dropped
      
      Tx successfully encrypted MACsec packets:
      macsec_tx_pkts : Number of packets encrypted/authenticated successfully
      macsec_tx_bytes : Number of bytes encrypted/authenticated successfully
      
      Tx dropped MACsec packets:
      macsec_tx_pkts_drop : Number of MACsec packets dropped
      macsec_tx_bytes_drop : Number of MACsec bytes dropped
      
      The above can be seen using:
      ethtool -S <ifc> | grep macsec
      Signed-off-by: Lior Nahmanson <liorna@nvidia.com>
      Reviewed-by: Raed Salem <raeds@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      807a1b76