1. 25 Dec, 2023 1 commit
    • nfc: llcp_core: Hold a ref to llcp_local->dev when holding a ref to llcp_local · c95f9195
      Siddh Raman Pant authored
      llcp_sock_sendmsg() calls nfc_llcp_send_ui_frame() which in turn calls
      nfc_alloc_send_skb(), which accesses the nfc_dev from the llcp_sock for
      getting the headroom and tailroom needed for skb allocation.
      
      In parallel, the nfc_dev can be freed, as the refcount is decreased via
      nfc_free_device(), leading to a UAF reported by Syzkaller, which can
      be summarized as follows:
      
      (1) llcp_sock_sendmsg() -> nfc_llcp_send_ui_frame()
      	-> nfc_alloc_send_skb() -> Dereference *nfc_dev
      (2) virtual_ncidev_close() -> nci_free_device() -> nfc_free_device()
      	-> put_device() -> nfc_release() -> Free *nfc_dev
      
      When a reference to llcp_local is acquired, we do not acquire the same
      for the nfc_dev. This leads to freeing even when the llcp_local is in
      use, and this is the case with the UAF described above too.
      
      Thus, when we acquire a reference to llcp_local, we should acquire a
      reference to nfc_dev, and release the references appropriately later.
      
      The reference count of llcp_local is initialized in nfc_llcp_register_device()
      (which is called by nfc_register_device()). Thus, we should acquire a
      reference to nfc_dev there.
      
      nfc_unregister_device() calls nfc_llcp_unregister_device() which in
      turn calls nfc_llcp_local_put(). Thus, the reference to nfc_dev is
      appropriately released later.
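
      The pairing described above can be sketched in userspace C. This is
      only a toy model: struct toy_dev, struct toy_local and every toy_*
      helper are hypothetical stand-ins, not the kernel's nfc_dev,
      nfc_llcp_local or their real get/put functions. It shows the device
      staying alive until the last reference to the local object is dropped.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins for the device and the LLCP local; not kernel types. */
struct toy_dev {
	int refs;	/* models the device reference count */
	int freed;	/* set once the last reference is dropped */
};

struct toy_local {
	int refs;
	struct toy_dev *dev;
};

static void toy_dev_get(struct toy_dev *dev)
{
	dev->refs++;
}

static void toy_dev_put(struct toy_dev *dev)
{
	assert(dev->refs > 0);
	if (--dev->refs == 0)
		dev->freed = 1;	/* real code would release the device here */
}

/* Creating the local also pins the device it points to. */
static struct toy_local *toy_local_create(struct toy_dev *dev)
{
	struct toy_local *local = calloc(1, sizeof(*local));

	if (!local)
		return NULL;
	toy_dev_get(dev);
	local->dev = dev;
	local->refs = 1;
	return local;
}

static void toy_local_get(struct toy_local *local)
{
	local->refs++;
}

/* Returns 1 when this call released the local (and its device reference). */
static int toy_local_put(struct toy_local *local)
{
	assert(local->refs > 0);
	if (--local->refs > 0)
		return 0;
	toy_dev_put(local->dev);
	free(local);
	return 1;
}
```

      In this model a socket that holds a reference to the local keeps the
      device alive even after the driver drops its own device reference.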
      
      Reported-and-tested-by: syzbot+bbe84a4010eeea00982d@syzkaller.appspotmail.com
      Closes: https://syzkaller.appspot.com/bug?extid=bbe84a4010eeea00982d
      Fixes: c7aa1225 ("NFC: Take a reference on the LLCP local pointer when creating a socket")
      Reviewed-by: Suman Ghosh <sumang@marvell.com>
      Signed-off-by: Siddh Raman Pant <code@siddh.me>
      Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c95f9195
  2. 21 Dec, 2023 12 commits
  3. 20 Dec, 2023 11 commits
  4. 19 Dec, 2023 16 commits
    • Kent Overstreet's avatar
    • Kent Overstreet's avatar
    • Merge tag 'trace-v6.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace · 55cb5f43
      Linus Torvalds authored
      Pull tracing fix from Steven Rostedt:
       "While working on the ring buffer, I found one more bug with the
        timestamp code, and the fix for this removed the need for the final
        64-bit cmpxchg!
      
        The ring buffer events hold a "delta" from the previous event. If it
        is determined that the delta can not be calculated, it falls back to
        adding an absolute timestamp value. The way to know if the delta can
        be used is via two stored timestamps in the per-cpu buffer meta data:
      
         before_stamp and write_stamp
      
        The before_stamp is written by every event before it tries to allocate
        its space on the ring buffer. The write_stamp is written after it
        allocates its space and knows that nothing came in after it read the
        previous before_stamp and write_stamp and the two matched.
      
        A previous fix dd939425 ("ring-buffer: Do not try to put back
        write_stamp") removed putting back the write_stamp to match the
        before_stamp so that the next event could use the delta, but races
        were found where the two would match, but not be for the previous
        event.
      
        It was determined to allow the event reservation to not have a valid
        write_stamp when it is finished, and this fixed a lot of races.
      
        The last use of the 64-bit timestamp cmpxchg depended on the
        write_stamp being valid after an interruption. But this is no longer
        the case, as if an event is interrupted by a softirq that writes an
        event, and that event gets interrupted by a hardirq or NMI and that
        writes an event, then the softirq could finish its reservation without
        a valid write_stamp.
      
        In the slow path of the event reservation, a delta can still be used
        if the write_stamp is valid. Instead of using a cmpxchg against the
        write stamp, the before_stamp needs to be read again to validate the
        write_stamp. The cmpxchg is not needed.
      
        This updates the slowpath to validate the write_stamp by comparing it
        to the before_stamp and removes all rb_time_cmpxchg() as there are no
        more users of that function.
      
        The removal of the 32-bit updates of rb_time_t will be done in the
        next merge window"
      
      * tag 'trace-v6.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
        ring-buffer: Fix slowpath of interrupted event
      55cb5f43
    • Merge tag 'arc-6.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc · 9c749e61
      Linus Torvalds authored
      Pull ARC fixes from Vineet Gupta:
      
       - build error for hugetlb, sparse and smatch fixes
      
       - removal of VIPT aliasing cache code
      
      * tag 'arc-6.7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
        ARC: add hugetlb definitions
        ARC: fix smatch warning
        ARC: fix spare error
        ARC: mm: retire support for aliasing VIPT D$
        ARC: entry: move ARCompact specific bits out of entry.h
        ARC: entry: SAVE_ABI_CALLEE_REG: ISA/ABI specific helper
      9c749e61
    • cifs: do not let cifs_chan_update_iface deallocate channels · 12d1e301
      Shyam Prasad N authored
      cifs_chan_update_iface is meant to check and update the server
      interface used for a channel when the existing server interface
      is no longer available.
      
      So far, this handler had the code to remove an interface entry
      even if a new candidate interface is not available. Allowing
      this leads to several corner cases to handle.
      
      This change makes the logic much simpler by not deallocating
      the current channel interface entry if a new interface is not
      found to replace it with.
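
      A minimal sketch of the simplified rule, assuming a toy channel and
      interface model (struct toy_iface, struct toy_chan and
      toy_chan_update_iface() are hypothetical and do not mirror the cifs
      structures): the current entry is released only once a replacement
      has been found.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy interface entry; refs counts the channels currently using it. */
struct toy_iface {
	bool is_active;
	int refs;
};

struct toy_chan {
	struct toy_iface *iface;
};

/* Returns true if the channel was moved to a new interface. */
static bool toy_chan_update_iface(struct toy_chan *chan,
				  struct toy_iface *candidates, size_t n)
{
	struct toy_iface *next = NULL;

	for (size_t i = 0; i < n; i++) {
		if (candidates[i].is_active && &candidates[i] != chan->iface) {
			next = &candidates[i];
			break;
		}
	}

	/* No replacement: keep the current entry instead of dropping it. */
	if (!next)
		return false;

	if (chan->iface)
		chan->iface->refs--;
	next->refs++;
	chan->iface = next;
	return true;
}
```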
      Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      12d1e301
    • cifs: fix a pending undercount of srv_count · f30bbc38
      Shyam Prasad N authored
      The following commit reverted the changes to ref count
      the server struct while scheduling a reconnect work:
      82334252 Revert "cifs: reconnect work should have reference on server struct"
      
      However, a following change also introduced scheduling
      of reconnect work, and assumed ref counting. This change
      fixes that as well.
      
      Fixes umount problems like:
      
      [73496.157838] CPU: 5 PID: 1321389 Comm: umount Tainted: G        W  OE      6.7.0-060700rc6-generic #202312172332
      [73496.157841] Hardware name: LENOVO 20MAS08500/20MAS08500, BIOS N2CET67W (1.50 ) 12/15/2022
      [73496.157843] RIP: 0010:cifs_put_tcp_session+0x17d/0x190 [cifs]
      [73496.157906] Code: 5d 31 c0 31 d2 31 f6 31 ff c3 cc cc cc cc e8 4a 6e 14 e6 e9 f6 fe ff ff be 03 00 00 00 48 89 d7 e8 78 26 b3 e5 e9 e4 fe ff ff <0f> 0b e9 b1 fe ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90
      [73496.157908] RSP: 0018:ffffc90003bcbcb8 EFLAGS: 00010286
      [73496.157911] RAX: 00000000ffffffff RBX: ffff8885830fa800 RCX: 0000000000000000
      [73496.157913] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
      [73496.157915] RBP: ffffc90003bcbcc8 R08: 0000000000000000 R09: 0000000000000000
      [73496.157917] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
      [73496.157918] R13: ffff8887d56ba800 R14: 00000000ffffffff R15: ffff8885830fa800
      [73496.157920] FS:  00007f1ff0e33800(0000) GS:ffff88887ba80000(0000) knlGS:0000000000000000
      [73496.157922] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [73496.157924] CR2: 0000115f002e2010 CR3: 00000003d1e24005 CR4: 00000000003706f0
      [73496.157926] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [73496.157928] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [73496.157929] Call Trace:
      [73496.157931]  <TASK>
      [73496.157933]  ? show_regs+0x6d/0x80
      [73496.157936]  ? __warn+0x89/0x160
      [73496.157939]  ? cifs_put_tcp_session+0x17d/0x190 [cifs]
      [73496.157976]  ? report_bug+0x17e/0x1b0
      [73496.157980]  ? handle_bug+0x51/0xa0
      [73496.157983]  ? exc_invalid_op+0x18/0x80
      [73496.157985]  ? asm_exc_invalid_op+0x1b/0x20
      [73496.157989]  ? cifs_put_tcp_session+0x17d/0x190 [cifs]
      [73496.158023]  ? cifs_put_tcp_session+0x1e/0x190 [cifs]
      [73496.158057]  __cifs_put_smb_ses+0x2b5/0x540 [cifs]
      [73496.158090]  ? tconInfoFree+0xc2/0x120 [cifs]
      [73496.158130]  cifs_put_tcon.part.0+0x108/0x2b0 [cifs]
      [73496.158173]  cifs_put_tlink+0x49/0x90 [cifs]
      [73496.158220]  cifs_umount+0x56/0xb0 [cifs]
      [73496.158258]  cifs_kill_sb+0x52/0x60 [cifs]
      [73496.158306]  deactivate_locked_super+0x32/0xc0
      [73496.158309]  deactivate_super+0x46/0x60
      [73496.158311]  cleanup_mnt+0xc3/0x170
      [73496.158314]  __cleanup_mnt+0x12/0x20
      [73496.158330]  task_work_run+0x5e/0xa0
      [73496.158333]  exit_to_user_mode_loop+0x105/0x130
      [73496.158336]  exit_to_user_mode_prepare+0xa5/0xb0
      [73496.158338]  syscall_exit_to_user_mode+0x29/0x60
      [73496.158341]  do_syscall_64+0x6c/0xf0
      [73496.158344]  ? syscall_exit_to_user_mode+0x37/0x60
      [73496.158346]  ? do_syscall_64+0x6c/0xf0
      [73496.158349]  ? exit_to_user_mode_prepare+0x30/0xb0
      [73496.158353]  ? syscall_exit_to_user_mode+0x37/0x60
      [73496.158355]  ? do_syscall_64+0x6c/0xf0
      Reported-by: Robert Morris <rtm@csail.mit.edu>
      Fixes: 705fc522 ("cifs: handle when server starts supporting multichannel")
      Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      f30bbc38
    • s390: update defconfigs · 3d940bb1
      Heiko Carstens authored
      Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
      Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
      3d940bb1
    • fs: cifs: Fix atime update check · 01fe654f
      Zizhi Wo authored
      Commit 9b9c5bea ("cifs: do not return atime less than mtime") indicates
      that in cifs, if atime is less than mtime, some apps will break.
      Therefore, it introduced a function to compare these two variables in two
      places where atime is updated. If atime is less than mtime, update it to
      mtime.
      
      However, the patch was handled incorrectly, resulting in atime and mtime
      being exactly equal. A previous commit 69738cfd ("fs: cifs: Fix atime
      update check vs mtime") fixed one place and forgot to fix another. Fix it.
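
      The intended check can be sketched in userspace C. ts_compare() and
      clamp_atime() are hypothetical helpers written for this illustration,
      with ts_compare() playing the role of a three-way timespec comparison.
      One way such a check can go wrong is testing the comparison result for
      non-zero instead of for "less than zero", which overwrites atime
      whenever the two differ; that reading is an assumption, not a quote
      from the patch.

```c
#include <assert.h>
#include <time.h>

/* Three-way compare: negative if a < b, zero if equal, positive if a > b. */
static int ts_compare(const struct timespec *a, const struct timespec *b)
{
	if (a->tv_sec != b->tv_sec)
		return a->tv_sec < b->tv_sec ? -1 : 1;
	if (a->tv_nsec != b->tv_nsec)
		return a->tv_nsec < b->tv_nsec ? -1 : 1;
	return 0;
}

/* Raise atime to mtime only when atime is strictly older than mtime. */
static struct timespec clamp_atime(struct timespec atime, struct timespec mtime)
{
	if (ts_compare(&atime, &mtime) < 0)
		return mtime;
	return atime;
}
```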
      
      Fixes: 9b9c5bea ("cifs: do not return atime less than mtime")
      Cc: stable@vger.kernel.org
      Signed-off-by: Zizhi Wo <wozizhi@huawei.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      01fe654f
    • smb: client: fix potential OOB in smb2_dump_detail() · 567320c4
      Paulo Alcantara authored
      Validate SMB message with ->check_message() before calling
      ->calc_smb_size().
      
      This fixes CVE-2023-6610.
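
      The validate-before-size ordering can be illustrated with a toy wire
      format. Everything here is hypothetical (a 2-byte little-endian body
      length followed by the body, and the toy_* function names); it is not
      the SMB2 header layout or the client's actual callbacks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical format: 2-byte little-endian body length, then the body. */
#define TOY_HDR_LEN 2

/* Returns 0 if the declared body length fits inside the received buffer. */
static int toy_check_message(const uint8_t *buf, size_t len)
{
	size_t body;

	if (len < TOY_HDR_LEN)
		return -1;
	body = (size_t)buf[0] | ((size_t)buf[1] << 8);
	if (body > len - TOY_HDR_LEN)
		return -1;
	return 0;
}

/* Trusts the header; only safe to call on a validated buffer. */
static size_t toy_calc_size(const uint8_t *buf)
{
	return TOY_HDR_LEN + ((size_t)buf[0] | ((size_t)buf[1] << 8));
}

/* Validate first, then compute the size; returns 0 for a bad message. */
static size_t toy_dump_detail(const uint8_t *buf, size_t len)
{
	if (toy_check_message(buf, len) != 0)
		return 0;
	return toy_calc_size(buf);
}
```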
      
      Reported-by: j51569436@gmail.com
      Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218219
      Cc: stable@vger.kernel.org
      Signed-off-by: Paulo Alcantara <pc@manguebit.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      567320c4
    • Merge branch 'check-vlan-filter-feature-in-vlan_vids_add_by_dev-and-vlan_vids_del_by_dev' · 8353c2ab
      Paolo Abeni authored
      Liu Jian says:
      
      ====================
      check vlan filter feature in vlan_vids_add_by_dev() and vlan_vids_del_by_dev()
      
      v2->v3:
      	Filter using vlan_hw_filter_capable().
      	Add one basic test.
      ====================
      
      Link: https://lore.kernel.org/r/20231216075219.2379123-1-liujian56@huawei.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      8353c2ab
    • selftests: add vlan hw filter tests · 2258b666
      Liu Jian authored
      Add one basic vlan hw filter test.
      Signed-off-by: Liu Jian <liujian56@huawei.com>
      Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      2258b666
    • net: check vlan filter feature in vlan_vids_add_by_dev() and vlan_vids_del_by_dev() · 01a564ba
      Liu Jian authored
      I got the below warning trace:
      
      WARNING: CPU: 4 PID: 4056 at net/core/dev.c:11066 unregister_netdevice_many_notify
      CPU: 4 PID: 4056 Comm: ip Not tainted 6.7.0-rc4+ #15
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
      RIP: 0010:unregister_netdevice_many_notify+0x9a4/0x9b0
      Call Trace:
       rtnl_dellink
       rtnetlink_rcv_msg
       netlink_rcv_skb
       netlink_unicast
       netlink_sendmsg
       __sock_sendmsg
       ____sys_sendmsg
       ___sys_sendmsg
       __sys_sendmsg
       do_syscall_64
       entry_SYSCALL_64_after_hwframe
      
      It can be reproduced via:
      
          ip netns add ns1
          ip netns exec ns1 ip link add bond0 type bond mode 0
          ip netns exec ns1 ip link add bond_slave_1 type veth peer veth2
          ip netns exec ns1 ip link set bond_slave_1 master bond0
      [1] ip netns exec ns1 ethtool -K bond0 rx-vlan-filter off
      [2] ip netns exec ns1 ip link add link bond_slave_1 name bond_slave_1.0 type vlan id 0
      [3] ip netns exec ns1 ip link add link bond0 name bond0.0 type vlan id 0
      [4] ip netns exec ns1 ip link set bond_slave_1 nomaster
      [5] ip netns exec ns1 ip link del veth2
          ip netns del ns1
      
      This is all caused by command [1] turning off the rx-vlan-filter function
      of bond0. The reason is the same as commit 01f4fd27 ("bonding: Fix
      incorrect deletion of ETH_P_8021AD protocol vid from slaves"). Commands
      [2] [3] add the same vid to slave and master respectively, causing
      command [4] to empty slave->vlan_info. The following command [5] triggers
      this problem.
      
      To fix this problem, we should add VLAN_FILTER feature checks in
      vlan_vids_add_by_dev() and vlan_vids_del_by_dev() to prevent incorrect
      addition or deletion of vlan_vid information.
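
      A toy model of the feature check, for illustration only. The types,
      the TOY_F_VLAN_FILTER bit and the toy_* functions are hypothetical,
      and testing the capability of the device whose vids are being
      mirrored (by_dev) is an assumption about how the check is applied.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_F_VLAN_FILTER 0x1u	/* hypothetical feature bit */
#define TOY_MAX_VID 8

struct toy_netdev {
	unsigned int features;
	int vid_refs[TOY_MAX_VID];	/* per-vid reference counts */
};

static bool toy_filter_capable(const struct toy_netdev *dev)
{
	return (dev->features & TOY_F_VLAN_FILTER) != 0;
}

/* Mirror by_dev's vids onto dev, but only if by_dev filters vlans. */
static void toy_vids_add_by_dev(struct toy_netdev *dev,
				const struct toy_netdev *by_dev)
{
	if (!toy_filter_capable(by_dev))
		return;
	for (int vid = 0; vid < TOY_MAX_VID; vid++)
		if (by_dev->vid_refs[vid])
			dev->vid_refs[vid]++;
}

/* Undo the mirroring under the same condition, so add and del stay paired. */
static void toy_vids_del_by_dev(struct toy_netdev *dev,
				const struct toy_netdev *by_dev)
{
	if (!toy_filter_capable(by_dev))
		return;
	for (int vid = 0; vid < TOY_MAX_VID; vid++)
		if (by_dev->vid_refs[vid] && dev->vid_refs[vid] > 0)
			dev->vid_refs[vid]--;
}
```

      With filtering off on the master, the delete path leaves the slave's
      own vid reference untouched, which is the situation commands [1] to
      [4] set up.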
      
      Fixes: 348a1443 ("vlan: introduce functions to do mass addition/deletion of vids by another device")
      Signed-off-by: Liu Jian <liujian56@huawei.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      01a564ba
    • net: hns3: add new maintainer for the HNS3 ethernet driver · fa94a0c8
      Jijie Shao authored
      Jijie Shao will be responsible for maintaining the hns3 driver's code
      in the future, so add Jijie to the hns3 driver's maintainer list.
      Signed-off-by: Jijie Shao <shaojijie@huawei.com>
      Link: https://lore.kernel.org/r/20231216070413.233668-1-shaojijie@huawei.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      fa94a0c8
    • net: mana: select PAGE_POOL · 340943fb
      Yury Norov authored
      Mana uses PAGE_POOL API. x86_64 defconfig doesn't select it:
      
      ld: vmlinux.o: in function `mana_create_page_pool.isra.0':
      mana_en.c:(.text+0x9ae36f): undefined reference to `page_pool_create'
      ld: vmlinux.o: in function `mana_get_rxfrag':
      mana_en.c:(.text+0x9afed1): undefined reference to `page_pool_alloc_pages'
      make[3]: *** [/home/yury/work/linux/scripts/Makefile.vmlinux:37: vmlinux] Error 1
      make[2]: *** [/home/yury/work/linux/Makefile:1154: vmlinux] Error 2
      make[1]: *** [/home/yury/work/linux/Makefile:234: __sub-make] Error 2
      make[1]: Leaving directory '/home/yury/work/build-linux-x86_64'
      make: *** [Makefile:234: __sub-make] Error 2
      
      So we need to select it explicitly.
      Signed-off-by: Yury Norov <yury.norov@gmail.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Tested-by: Simon Horman <horms@kernel.org> # build-tested
      Fixes: ca9c54d2 ("net: mana: Add a driver for Microsoft Azure Network Adapter")
      Link: https://lore.kernel.org/r/20231215203353.635379-1-yury.norov@gmail.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      340943fb
    • net: ks8851: Fix TX stall caused by TX buffer overrun · 3dc5d445
      Ronald Wahl authored
      There is a bug in the ks8851 Ethernet driver where more data is written
      to the hardware TX buffer than is actually available. This is caused by
      wrong accounting of the free TX buffer space.
      
      The driver maintains a tx_space variable that represents the TX buffer
      space that is deemed to be free. The ks8851_start_xmit_spi() function
      adds an SKB to a queue if tx_space is large enough and reduces tx_space
      by the amount of buffer space it will later need in the TX buffer and
      then schedules a work item. If there is not enough space then the TX
      queue is stopped.
      
      The worker function ks8851_tx_work() dequeues all the SKBs and writes
      the data into the hardware TX buffer. The last packet will trigger an
      interrupt after it has been sent. Here it is assumed that all data fits into
      the TX buffer.
      
      In the interrupt routine (which runs asynchronously because it is a
      threaded interrupt) tx_space is updated with the current value from the
      hardware. Also the TX queue is woken up again.
      
      Now it could happen that after data was sent to the hardware and before
      handling the TX interrupt new data is queued in ks8851_start_xmit_spi()
      when the TX buffer space had still some space left. When the interrupt
      is actually handled tx_space is updated from the hardware but now we
      already have new SKBs queued that have not been written to the hardware
      TX buffer yet. Since tx_space has been overwritten by the value from the
      hardware the space is not accounted for.
      
      Now we have more data queued than buffer space available in the hardware
      and ks8851_tx_work() will potentially overrun the hardware TX buffer. In
      many cases it will still work because often the buffer is written out
      fast enough so that no overrun occurs but for example if the peer
      throttles us via flow control then an overrun may happen.
      
      This can be fixed in different ways. The most simple way would be to set
      tx_space to 0 before writing data to the hardware TX buffer preventing
      the queuing of more SKBs until the TX interrupt has been handled. I have
      chosen a slightly more efficient (and still rather simple) way and
      track the amount of data that is already queued and not yet written to
      the hardware. When new SKBs are to be queued the already queued amount
      of data is honoured when checking free TX buffer space.
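
      The accounting idea can be modelled in a few lines of C. This is a
      sketch under stated assumptions: struct toy_tx, its field names and
      the toy_* functions are hypothetical, locking is omitted, and the
      hardware is reduced to a single "free space" number handed to the
      interrupt handler.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical accounting state; names are illustrative only. */
struct toy_tx {
	unsigned int tx_space;	/* free hardware TX space as last known */
	unsigned int queued;	/* bytes queued but not yet written to hardware */
};

/* xmit path: accept a packet only if it fits next to what is already queued. */
static bool toy_xmit(struct toy_tx *tx, unsigned int needed)
{
	if (tx->queued + needed > tx->tx_space)
		return false;	/* the real driver would stop the TX queue */
	tx->queued += needed;
	return true;
}

/* Worker: everything queued is written into the hardware TX buffer. */
static void toy_tx_work(struct toy_tx *tx)
{
	assert(tx->queued <= tx->tx_space);
	tx->tx_space -= tx->queued;
	tx->queued = 0;
}

/* Interrupt: refresh tx_space from hardware; queued bytes stay accounted. */
static void toy_irq(struct toy_tx *tx, unsigned int hw_free)
{
	tx->tx_space = hw_free;
}
```

      Because the queued amount lives in its own counter, overwriting
      tx_space from the hardware value no longer forgets data that was
      queued between the worker run and the interrupt.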
      
      I tested this with a setup of two linked KS8851 running iperf3 between
      the two in bidirectional mode. Before the fix I got a stall after some
      minutes. With the fix I saw no issues anymore after hours.
      
      Fixes: 3ba81f3e ("net: Micrel KS8851 SPI network driver")
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Cc: Ben Dooks <ben.dooks@codethink.co.uk>
      Cc: Tristram Ha <Tristram.Ha@microchip.com>
      Cc: netdev@vger.kernel.org
      Cc: stable@vger.kernel.org # 5.10+
      Signed-off-by: Ronald Wahl <ronald.wahl@raritan.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20231214181112.76052-1-rwahl@gmx.de
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      3dc5d445
    • ring-buffer: Fix slowpath of interrupted event · b803d7c6
      Steven Rostedt (Google) authored
      To synchronize the timestamps with the ring buffer reservation, there are
      two timestamps that are saved in the buffer meta data.
      
      1. before_stamp
      2. write_stamp
      
      When the two are equal, the write_stamp is considered valid, as in, it may
      be used to calculate the delta of the next event as the write_stamp is the
      timestamp of the previous reserved event on the buffer.
      
      This is done by the following:
      
       /*A*/	w = current position on the ring buffer
      	before = before_stamp
      	after = write_stamp
      	ts = read current timestamp
      
      	if (before != after) {
      		write_stamp is not valid, force adding an absolute
      		timestamp.
      	}
      
       /*B*/	before_stamp = ts
      
       /*C*/	write = local_add_return(event length, position on ring buffer)
      
      	if (w == write - event length) {
      		/* Nothing interrupted between A and C */
       /*E*/		write_stamp = ts;
      		delta = ts - after
      		/*
      		 * If nothing interrupted again,
      		 * before_stamp == write_stamp and write_stamp
      		 * can be used to calculate the delta for
      		 * events that come in after this one.
      		 */
      	} else {
      
      		/*
      		 * The slow path!
      		 * Was interrupted between A and C.
      		 */
      
      This is the place that there's a bug. We currently have:
      
      		after = write_stamp
      		ts = read current timestamp
      
       /*F*/		if (write == current position on the ring buffer &&
      		    after < ts && cmpxchg(write_stamp, after, ts)) {
      
      			delta = ts - after;
      
      		} else {
      			delta = 0;
      		}
      
      The assumption is that if the current position on the ring buffer hasn't
      moved between C and F, then it also was not interrupted, and that the last
      event written has a timestamp that matches the write_stamp. That is the
      write_stamp is valid.
      
      But this may not be the case:
      
      If a task context event was interrupted by softirq between B and C.
      
      And the softirq wrote an event that got interrupted by a hard irq between
      C and E.
      
      and the hard irq wrote an event (does not need to be interrupted)
      
      We have:
      
       /*B*/ before_stamp = ts of normal context
      
         ---> interrupted by softirq
      
      	/*B*/ before_stamp = ts of softirq context
      
      	  ---> interrupted by hardirq
      
      		/*B*/ before_stamp = ts of hard irq context
      		/*E*/ write_stamp = ts of hard irq context
      
      		/* matches and write_stamp valid */
      	  <----
      
      	/*E*/ write_stamp = ts of softirq context
      
      	/* No longer matches before_stamp, write_stamp is not valid! */
      
         <---
      
       w != write - length, go to slow path
      
      // Right now the order of events in the ring buffer is:
      //
      // |-- softirq event --|-- hard irq event --|-- normal context event --|
      //
      
       after = write_stamp (this is the ts of softirq)
       ts = read current timestamp
      
       if (write == current position on the ring buffer [true] &&
           after < ts [true] && cmpxchg(write_stamp, after, ts) [true]) {
      
      	delta = ts - after  [Wrong!]
      
      The delta is to be between the hard irq event and the normal context
      event, but the above logic made the delta between the softirq event and
      the normal context event, where the hard irq event is between the two. This
      will shift all the remaining event timestamps on the sub-buffer
      incorrectly.
      
      The write_stamp is only valid if it matches the before_stamp. The cmpxchg
      does nothing to help this.
      
      Instead, the following logic can be done to fix this:
      
      	before = before_stamp
      	ts = read current timestamp
      	before_stamp = ts
      
      	after = write_stamp
      
      	if (write == current position on the ring buffer &&
      	    after == before && after < ts) {
      
      		delta = ts - after
      
      	} else {
      		delta = 0;
      	}
      
      The above will only use the write_stamp if it still matches before_stamp
      and was tested to not have changed since C.
      
      As a bonus, with this logic we do not need any 64-bit cmpxchg() at all!
      
      This means the 32-bit rb_time_t workaround can finally be removed. But
      that's for a later time.
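
      The fixed slow-path decision can be written as a small pure function.
      This is a sketch only: struct slowpath_view and slowpath_delta() are
      hypothetical names, and the struct simply holds the values the
      pseudocode above reads, not the ring buffer's real data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of what the slow path has read. */
struct slowpath_view {
	uint64_t before;	/* before_stamp read on entry to the slow path */
	uint64_t after;		/* write_stamp read after before_stamp = ts */
	uint64_t ts;		/* current timestamp */
	bool tail_unchanged;	/* write == current position on the ring buffer */
};

/*
 * Use the write_stamp for a delta only if it still matches before_stamp,
 * nothing was added to the buffer since C, and time moved forward.
 * Otherwise return 0, as the pseudocode's else branch does.
 */
static uint64_t slowpath_delta(const struct slowpath_view *v)
{
	if (v->tail_unchanged && v->after == v->before && v->after < v->ts)
		return v->ts - v->after;
	return 0;
}
```

      In the nested softirq and hard irq scenario, before_stamp holds the
      hard irq timestamp while write_stamp holds the softirq one, so the
      comparison fails and no delta is taken from the stale write_stamp.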
      
      Link: https://lore.kernel.org/linux-trace-kernel/20231218175229.58ec3daf@gandalf.local.home/
      Link: https://lore.kernel.org/linux-trace-kernel/20231218230712.3a76b081@gandalf.local.home
      
      Cc: stable@vger.kernel.org
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Fixes: dd939425 ("ring-buffer: Do not try to put back write_stamp")
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      b803d7c6