1. 08 Jan, 2024 1 commit
  2. 29 Dec, 2023 1 commit
    • sched/fair: Fix tg->load when offlining a CPU · f60a631a
      Vincent Guittot authored
      When a CPU is taken offline, the contribution of its cfs_rqs to task_groups'
      load may remain and will negatively impact the calculation of the share of
      the online CPUs.
      
      To fix this bug, clear the contribution of an offlining CPU to task groups'
      load and skip its contribution while it is inactive.
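      
      To make the effect concrete, here is a minimal userspace sketch, not the
      kernel code. It assumes a simplified share formula (group weight times this
      CPU's load, divided by the group-wide load sum). The names tg_sketch,
      set_contrib(), cpu_share() and clear_offline_contrib() are hypothetical.

```c
#include <stddef.h>

#define NR_CPUS_SKETCH 8

/* Hypothetical, simplified stand-in for a task group's load tracking. */
struct tg_sketch {
	unsigned long weight;			/* group weight, e.g. 1024 */
	unsigned long load_avg;			/* sum of all per-CPU contributions */
	unsigned long contrib[NR_CPUS_SKETCH];	/* what each CPU last contributed */
};

/* Replace one CPU's contribution and keep the group-wide sum in sync. */
void set_contrib(struct tg_sketch *tg, size_t cpu, unsigned long load)
{
	tg->load_avg -= tg->contrib[cpu];
	tg->load_avg += load;
	tg->contrib[cpu] = load;
}

/* Simplified share: this CPU's slice of the group weight. */
unsigned long cpu_share(const struct tg_sketch *tg, size_t cpu)
{
	if (tg->load_avg == 0)
		return tg->weight;
	return tg->weight * tg->contrib[cpu] / tg->load_avg;
}

/* What the fix amounts to in this sketch: drop an offlining CPU's load. */
void clear_offline_contrib(struct tg_sketch *tg, size_t cpu)
{
	set_contrib(tg, cpu, 0);
}
```

      With a stale contribution of 3000 left behind by an offlined CPU, an online
      CPU contributing 1000 gets only a quarter of the group weight in this model.
      Clearing the stale value restores its full share.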
      
      Here's the reproducer of the anomaly, by Imran Khan:
      
      	"So far I have encountered only one rather lengthy way of reproducing this issue,
      	which is as follows:
      
      	1. Take a KVM guest (booted with 4 CPUs and can be scaled up to 124 CPUs) and
      	   create 2 custom cgroups: /sys/fs/cgroup/cpu/test_group_1 and
      	   /sys/fs/cgroup/cpu/test_group_2
      
      	2. Assign a CPU intensive workload to each of these cgroups and start the
      	   workload.
      
      	For my tests I am using following app:
      
      	#include <stdio.h>
      	#include <stdlib.h>
      
      	/* CPU hog: generate <count> pseudo-random numbers and discard them. */
      	int main(int argc, char *argv[])
      	{
      		unsigned long count, i, val;
      
      		if (argc != 2) {
      			printf("usage: ./a.out <number of random nums to generate> \n");
      			return 0;
      		}
      
      		count = strtoul(argv[1], NULL, 10);
      
      		printf("Generating %lu random numbers \n", count);
      		for (i = 0; i < count; i++) {
      			val = rand();
      			val = val % 2;
      			//usleep(1);
      		}
      		printf("Generated %lu random numbers \n", count);
      		return 0;
      	}
      
      	Also since the system is booted with 4 CPUs, in order to completely load the
      	system I am also launching 4 instances of same test app under:
      
      	   /sys/fs/cgroup/cpu/
      
      	3. We can see that both of the cgroups get similar CPU time:
      
              # systemd-cgtop --depth 1
      	Path                                 Tasks    %CPU  Memory  Input/s    Output/s
      	/                                      659      -     5.5G        -        -
      	/system.slice                            -      -     5.7G        -        -
      	/test_group_1                            4      -        -        -        -
      	/test_group_2                            3      -        -        -        -
      	/user.slice                             31      -    56.5M        -        -
      
      	Path                                 Tasks   %CPU   Memory  Input/s    Output/s
      	/                                      659  394.6     5.5G        -        -
      	/test_group_2                            3   65.7        -        -        -
      	/user.slice                             29   55.1    48.0M        -        -
      	/test_group_1                            4   47.3        -        -        -
      	/system.slice                            -    2.2     5.7G        -        -
      
      	Path                                 Tasks  %CPU    Memory  Input/s    Output/s
      	/                                      659  394.8     5.5G        -        -
      	/test_group_1                            4   62.9        -        -        -
      	/user.slice                             28   44.9    54.2M        -        -
      	/test_group_2                            3   44.7        -        -        -
      	/system.slice                            -    0.9     5.7G        -        -
      
      	Path                                 Tasks  %CPU    Memory  Input/s     Output/s
      	/                                      659  394.4     5.5G        -        -
      	/test_group_2                            3   58.8        -        -        -
      	/test_group_1                            4   51.9        -        -        -
      	/user.slice                              30   39.3    59.6M        -        -
      	/system.slice                            -    1.9     5.7G        -        -
      
      	Path                                 Tasks  %CPU     Memory  Input/s    Output/s
      	/                                      659  394.7     5.5G        -        -
      	/test_group_1                            4   60.9        -        -        -
      	/test_group_2                            3   57.9        -        -        -
      	/user.slice                             28   43.5    36.9M        -        -
      	/system.slice                            -    3.0     5.7G        -        -
      
      	Path                                 Tasks  %CPU     Memory  Input/s     Output/s
      	/                                      659  395.0     5.5G        -        -
      	/test_group_1                            4   66.8        -        -        -
      	/test_group_2                            3   56.3        -        -        -
      	/user.slice                             29   43.1    51.8M        -        -
      	/system.slice                            -    0.7     5.7G        -        -
      
      	4. Now move systemd-udevd to one of these test groups, say test_group_1, and
      	   perform scale up to 124 CPUs followed by scale down back to 4 CPUs from the
      	   host side.
      
      	5. Run the same workload, i.e. 4 instances of CPU hogger under /sys/fs/cgroup/cpu
      	   and one instance of CPU hogger each in /sys/fs/cgroup/cpu/test_group_1 and
      	   /sys/fs/cgroup/cpu/test_group_2.
      
      	It can be seen that test_group_1 (the one where systemd-udevd was moved) is getting
      	much less CPU time than the test_group_2, even though at this point of time both of
      	these groups have only CPU hogger running:
      
              # systemd-cgtop --depth 1
      	Path                                   Tasks   %CPU   Memory  Input/s   Output/s
      	/                                      1219     -     5.4G        -        -
      	/system.slice                           -       -     5.6G        -        -
      	/test_group_1                           4       -        -        -        -
      	/test_group_2                           3       -        -        -        -
      	/user.slice                            26       -    91.3M        -        -
      
      	Path                                   Tasks  %CPU     Memory  Input/s   Output/s
      	/                                      1221  394.3     5.4G        -        -
      	/test_group_2                             3   82.7        -        -        -
      	/test_group_1                             4   14.3        -        -        -
      	/system.slice                             -    0.8     5.6G        -        -
      	/user.slice                              26    0.4    91.2M        -        -
      
      	Path                                   Tasks  %CPU    Memory  Input/s    Output/s
      	/                                      1221  394.6     5.4G        -        -
      	/test_group_2                             3   67.4        -        -        -
      	/system.slice                             -   24.6     5.6G        -        -
      	/test_group_1                             4   12.5        -        -        -
      	/user.slice                              26    0.4    91.2M        -        -
      
      	Path                                  Tasks  %CPU    Memory  Input/s    Output/s
      	/                                     1221  395.2     5.4G        -        -
      	/test_group_2                            3   60.9        -        -        -
      	/system.slice                            -   27.9     5.6G        -        -
      	/test_group_1                            4   12.2        -        -        -
      	/user.slice                             26    0.4    91.2M        -        -
      
      	Path                                  Tasks  %CPU    Memory  Input/s    Output/s
      	/                                     1221  395.2     5.4G        -        -
      	/test_group_2                            3   69.4        -        -        -
      	/test_group_1                            4   13.9        -        -        -
      	/user.slice                             28    1.6    92.0M        -        -
      	/system.slice                            -    1.0     5.6G        -        -
      
      	Path                                  Tasks  %CPU    Memory  Input/s    Output/s
      	/                                      1221  395.6     5.4G        -        -
      	/test_group_2                             3   59.3        -        -        -
      	/test_group_1                             4   14.1        -        -        -
      	/user.slice                              28    1.3    92.2M        -        -
      	/system.slice                             -    0.7     5.6G        -        -
      
      	Path                                  Tasks  %CPU    Memory  Input/s    Output/s
      	/                                      1221  395.5     5.4G        -        -
      	/test_group_2                            3   67.2        -        -        -
      	/test_group_1                            4   11.5        -        -        -
      	/user.slice                             28    1.3    92.5M        -        -
      	/system.slice                            -    0.6     5.6G        -        -
      
      	Path                                  Tasks  %CPU    Memory  Input/s    Output/s
      	/                                      1221  395.1     5.4G        -        -
      	/test_group_2                             3   76.8        -        -        -
      	/test_group_1                             4   12.9        -        -        -
      	/user.slice                              28    1.3    92.8M        -        -
      	/system.slice                             -    1.2     5.6G        -        -
      
      	From sched_debug data it can be seen that in bad case the load.weight of per-CPU
      	sched entities corresponding to test_group_1 has reduced significantly and
      	also load_avg of test_group_1 remains much higher than that of test_group_2,
      	even though systemd-udevd stopped running long time back and at this point of
      	time both cgroups just have the CPU hogger app as running entity."
      
      [ mingo: Added details from the original discussion, plus minor edits to the patch. ]
      Reported-by: Imran Khan <imran.f.khan@oracle.com>
      Tested-by: Imran Khan <imran.f.khan@oracle.com>
      Tested-by: Aaron Lu <aaron.lu@intel.com>
      Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Reviewed-by: Imran Khan <imran.f.khan@oracle.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Link: https://lore.kernel.org/r/20231223111545.62135-1-vincent.guittot@linaro.org
  3. 23 Dec, 2023 14 commits
  4. 22 Dec, 2023 12 commits
  5. 21 Dec, 2023 12 commits
    • Merge tag 'nvme-6.7-2023-12-21' of git://git.infradead.org/nvme into block-6.7 · 13d822bf
      Jens Axboe authored
      Pull NVMe fixes from Keith:
      
      "nvme fixes for Linux 6.7
      
       - Revert a commit with improper sleep context (Keith)
       - Fix async event handling sleep context (Maurizio)"
      
      * tag 'nvme-6.7-2023-12-21' of git://git.infradead.org/nvme:
        nvme-pci: fix sleeping function called from interrupt context
        Revert "nvme-fc: fix race between error recovery and creating association"
    • afs: Fix use-after-free due to get/remove race in volume tree · 9a6b294a
      David Howells authored
      When an afs_volume struct is put, its refcount is reduced to 0 before
      the cell->volume_lock is taken and the volume removed from the
      cell->volumes tree.
      
      Unfortunately, this means that the lookup code can race and see a volume
      with a zero ref in the tree, resulting in a use-after-free:
      
          refcount_t: addition on 0; use-after-free.
          WARNING: CPU: 3 PID: 130782 at lib/refcount.c:25 refcount_warn_saturate+0x7a/0xda
          ...
          RIP: 0010:refcount_warn_saturate+0x7a/0xda
          ...
          Call Trace:
           afs_get_volume+0x3d/0x55
           afs_create_volume+0x126/0x1de
           afs_validate_fc+0xfe/0x130
           afs_get_tree+0x20/0x2e5
           vfs_get_tree+0x1d/0xc9
           do_new_mount+0x13b/0x22e
           do_mount+0x5d/0x8a
           __do_sys_mount+0x100/0x12a
           do_syscall_64+0x3a/0x94
           entry_SYSCALL_64_after_hwframe+0x62/0x6a
      
      Fix this by:
      
       (1) When putting, use a flag to indicate if the volume has been removed
           from the tree and skip the rb_erase if it has.
      
       (2) When looking up, use a conditional ref increment and if it fails
           because the refcount is 0, replace the node in the tree and set the
           removal flag.
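      
      The two steps above can be sketched in userspace with C11 atomics. This is
      an illustration under stated assumptions, not the afs code: vol_sketch,
      vol_try_get(), vol_put() and vol_lookup() are hypothetical names, and the
      tree itself and its locking are left out.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted object kept in a lookup tree. */
struct vol_sketch {
	atomic_int ref;
	bool removed_from_tree;	/* set once the node has left the tree */
};

/* Conditional increment: take a reference only if ref is still above 0. */
bool vol_try_get(struct vol_sketch *v)
{
	int old = atomic_load(&v->ref);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&v->ref, &old, old + 1))
			return true;
	}
	return false;	/* already at 0: the object is being torn down */
}

/* Drop a reference; returns true when the caller dropped the last one. */
bool vol_put(struct vol_sketch *v)
{
	return atomic_fetch_sub(&v->ref, 1) == 1;
}

/*
 * Lookup side: if the node found in the tree is dying, flag it as removed
 * (so the put side skips its own erase) and use the replacement instead.
 */
struct vol_sketch *vol_lookup(struct vol_sketch *found,
			      struct vol_sketch *fresh)
{
	if (vol_try_get(found))
		return found;
	found->removed_from_tree = true;
	return fresh;
}
```

      The point of the conditional increment is that a lookup can never revive
      an object whose count has already reached zero.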
      
      Fixes: 20325960 ("afs: Reorganise volume and server trees to be rooted on the cell")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • ida: Fix crash in ida_free when the bitmap is empty · af73483f
      Matthew Wilcox (Oracle) authored
      The IDA usually detects double-frees, but that detection failed to
      consider the case when there are no nearby IDs allocated and so we have a
      NULL bitmap rather than simply having a clear bit.  Add some tests to the
      test-suite to be sure we don't inadvertently reintroduce this problem.
      Unfortunately they're quite noisy so include a message to disregard
      the warnings.
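      
      A rough sketch of the missing case, assuming a much simpler layout than the
      real IDA: bitmaps are allocated lazily per chunk, so a chunk with no IDs in
      use is represented by a NULL pointer. The names ids_sketch and ids_free()
      are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define CHUNK_BITS 64
#define NR_CHUNKS  4

/* Hypothetical allocator state: one lazily allocated bitmap per chunk. */
struct ids_sketch {
	unsigned long long *chunk[NR_CHUNKS];	/* NULL = nothing allocated here */
};

/* Returns true if 'id' was in use and is now free, false on a bad free. */
bool ids_free(struct ids_sketch *s, unsigned int id)
{
	unsigned int idx = id / CHUNK_BITS;
	unsigned long long bit = 1ULL << (id % CHUNK_BITS);

	if (idx >= NR_CHUNKS)
		return false;
	if (s->chunk[idx] == NULL)	/* no bitmap at all: also a bad free */
		return false;
	if (!(*s->chunk[idx] & bit))	/* bitmap present but bit already clear */
		return false;
	*s->chunk[idx] &= ~bit;
	return true;
}
```

      Without the NULL check, the bit test in this sketch would dereference a
      missing bitmap, which is the kind of crash the commit title describes.
      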
      
      Reported-by: Zhenghan Wang <wzhmmmmm@gmail.com>
      Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • afs: Fix overwriting of result of DNS query · a9e01ac8
      David Howells authored
      In afs_update_cell(), ret is the result of the DNS lookup and the errors
      are to be handled by a switch - however, the value gets clobbered in
      between by setting it to -ENOMEM in case afs_alloc_vlserver_list()
      fails.
      
      Fix this by moving the setting of -ENOMEM into the error handling for
      OOM failure.  Further, only do it if we don't have an alternative error
      to return.
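      
      The shape of the bug and of the fix can be sketched as follows. This is a
      simplified illustration, not the body of afs_update_cell(): both functions
      are hypothetical and simply return the value that the later switch would
      get to see.

```c
#include <errno.h>
#include <stdbool.h>

/* Buggy shape: ret is preloaded with -ENOMEM before the allocation. */
int result_seen_buggy(int dns_ret, bool alloc_ok)
{
	int ret = dns_ret;

	ret = -ENOMEM;		/* clobbers the DNS result unconditionally */
	if (!alloc_ok)
		return ret;
	return ret;		/* the switch sees -ENOMEM, never dns_ret */
}

/* Fixed shape: -ENOMEM is set only on the OOM path, and only if needed. */
int result_seen_fixed(int dns_ret, bool alloc_ok)
{
	int ret = dns_ret;

	if (!alloc_ok) {
		if (ret >= 0)	/* keep an existing error if there is one */
			ret = -ENOMEM;
		return ret;
	}
	return ret;		/* the switch sees the real DNS result */
}
```

      In the fixed shape an -ENODATA lookup result survives a successful
      allocation, and it also wins over -ENOMEM when the allocation fails.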
      
      Found by Linux Verification Center (linuxtesting.org) with SVACE.  Based
      on a patch from Anastasia Belova [1].
      
      Fixes: d5c32c89 ("afs: Fix cell DNS lookup")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
      cc: Anastasia Belova <abelova@astralinux.ru>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
      cc: lvc-project@linuxtesting.org
      Link: https://lore.kernel.org/r/20231221085849.1463-1-abelova@astralinux.ru/ [1]
      Link: https://lore.kernel.org/r/1700862.1703168632@warthog.procyon.org.uk/ # v1
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge tag 'afs-fixes-20231221' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs · 937fd403
      Linus Torvalds authored
      Pull AFS fixes from David Howells:
       "Improve the interaction of arbitrary lookups in the AFS dynamic root
        that hit DNS lookup failures [1] where kafs behaves differently from
        openafs and causes some applications to fail that aren't expecting
        that. Further, negative DNS results aren't getting removed and are
        causing failures to persist.
      
         - Always delete unused (particularly negative) dentries as soon as
           possible so that they don't prevent future lookups from retrying.
      
         - Fix the handling of new-style negative DNS lookups in ->lookup() to
           make them return ENOENT so that userspace doesn't get confused when
           stat succeeds but the following open on the looked up file then
           fails.
      
         - Fix key handling so that DNS lookup results are reclaimed almost as
           soon as they expire rather than sitting round either forever or for
           an additional 5 mins beyond a set expiry time returning
           EKEYEXPIRED. They persist for 1s as /bin/ls will do a second stat
           call if the first fails"
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=216637 [1]
      Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
      
      * tag 'afs-fixes-20231221' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
        keys, dns: Allow key types (eg. DNS) to be reclaimed immediately on expiry
        afs: Fix dynamic root lookup DNS check
        afs: Fix the dynamic root's d_delete to always delete unused dentries
    • Merge tag 'trace-v6.7-rc6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace · 13b73446
      Linus Torvalds authored
      Pull tracing fixes from Steven Rostedt:
      
       - Fix another kerneldoc warning
      
       - Fix eventfs files to inherit the ownership of their parent directory.
      
         The dynamic creation of dentries in eventfs did not take into account
         if the tracefs file system was mounted with a gid/uid, and would
         still default to the gid/uid of root. This is a regression.
      
       - Fix warning when synthetic event testing is enabled along with
         startup event tracing testing
      
      * tag 'trace-v6.7-rc6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
        tracing / synthetic: Disable events after testing in synth_event_gen_test_init()
        eventfs: Have event files and directories default to parent uid and gid
        tracing/synthetic: fix kernel-doc warnings
    • Merge tag 'net-6.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 7c5e046b
      Linus Torvalds authored
      Pull networking fixes from Paolo Abeni:
       "Including fixes from WiFi and bpf.
      
        Current release - regressions:
      
         - bpf: syzkaller found null ptr deref in unix_bpf proto add
      
         - eth: i40e: fix ST code value for clause 45
      
        Previous releases - regressions:
      
         - core: return error from sk_stream_wait_connect() if sk_wait_event()
           fails
      
         - ipv6: revert remove expired routes with a separated list of routes
      
         - wifi rfkill:
             - set GPIO direction
             - fix crash with WED rx support enabled
      
         - bluetooth:
             - fix deadlock in vhci_send_frame
             - fix use-after-free in bt_sock_recvmsg
      
         - eth: mlx5e: fix a race in command alloc flow
      
         - eth: ice: fix PF with enabled XDP going no-carrier after reset
      
         - eth: bnxt_en: do not map packet buffers twice
      
        Previous releases - always broken:
      
         - core:
             - check vlan filter feature in vlan_vids_add_by_dev() and
               vlan_vids_del_by_dev()
             - check dev->gso_max_size in gso_features_check()
      
         - mptcp: fix inconsistent state on fastopen race
      
         - phy: skip LED triggers on PHYs on SFP modules
      
         - eth: mlx5e:
             - fix double free of encap_header
             - fix slab-out-of-bounds in mlx5_query_nic_vport_mac_list()"
      
      * tag 'net-6.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (69 commits)
        net: check dev->gso_max_size in gso_features_check()
        kselftest: rtnetlink.sh: use grep_fail when expecting the cmd fail
        net/ipv6: Revert remove expired routes with a separated list of routes
        net: avoid build bug in skb extension length calculation
        net: ethernet: mtk_wed: fix possible NULL pointer dereference in mtk_wed_wo_queue_tx_clean()
        net: stmmac: fix incorrect flag check in timestamp interrupt
        selftests: add vlan hw filter tests
        net: check vlan filter feature in vlan_vids_add_by_dev() and vlan_vids_del_by_dev()
        net: hns3: add new maintainer for the HNS3 ethernet driver
        net: mana: select PAGE_POOL
        net: ks8851: Fix TX stall caused by TX buffer overrun
        ice: Fix PF with enabled XDP going no-carrier after reset
        ice: alter feature support check for SRIOV and LAG
        ice: stop trashing VF VSI aggregator node ID information
        mailmap: add entries for Geliang Tang
        mptcp: fill in missing MODULE_DESCRIPTION()
        mptcp: fix inconsistent state on fastopen race
        selftests: mptcp: join: fix subflow_send_ack lookup
        net: phy: skip LED triggers on PHYs on SFP modules
        bpf: Add missing BPF_LINK_TYPE invocations
        ...
    • tracing / synthetic: Disable events after testing in synth_event_gen_test_init() · 88b30c7f
      Steven Rostedt (Google) authored
      The synth_event_gen_test module can be built in, if someone wants to run
      the tests at boot up and not have to load them.
      
      The synth_event_gen_test_init() function creates and enables the synthetic
      events and runs its tests.
      
      The synth_event_gen_test_exit() disables the events it created and
      destroys the events.
      
      If the module is builtin, the events are never disabled. The issue is, the
      events should be disabled after the tests are run. This could be an issue
      if the rest of the boot up tests are enabled, as they expect the events to
      be in a known state before testing. That known state happens to be
      disabled.
      
      When CONFIG_SYNTH_EVENT_GEN_TEST=y and CONFIG_EVENT_TRACE_STARTUP_TEST=y
      a warning will trigger:
      
       Running tests on trace events:
       Testing event create_synth_test:
       Enabled event during self test!
       ------------[ cut here ]------------
       WARNING: CPU: 2 PID: 1 at kernel/trace/trace_events.c:4150 event_trace_self_tests+0x1c2/0x480
       Modules linked in:
       CPU: 2 PID: 1 Comm: swapper/0 Not tainted 6.7.0-rc2-test-00031-gb803d7c6-dirty #276
       Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
       RIP: 0010:event_trace_self_tests+0x1c2/0x480
       Code: bb e8 a2 ab 5d fc 48 8d 7b 48 e8 f9 3d 99 fc 48 8b 73 48 40 f6 c6 01 0f 84 d6 fe ff ff 48 c7 c7 20 b6 ad bb e8 7f ab 5d fc 90 <0f> 0b 90 48 89 df e8 d3 3d 99 fc 48 8b 1b 4c 39 f3 0f 85 2c ff ff
       RSP: 0000:ffffc9000001fdc0 EFLAGS: 00010246
       RAX: 0000000000000029 RBX: ffff88810399ca80 RCX: 0000000000000000
       RDX: 0000000000000000 RSI: ffffffffb9f19478 RDI: ffff88823c734e64
       RBP: ffff88810399f300 R08: 0000000000000000 R09: fffffbfff79eb32a
       R10: ffffffffbcf59957 R11: 0000000000000001 R12: ffff888104068090
       R13: ffffffffbc89f0a0 R14: ffffffffbc8a0f08 R15: 0000000000000078
       FS:  0000000000000000(0000) GS:ffff88823c700000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000000000000 CR3: 00000001f6282001 CR4: 0000000000170ef0
       Call Trace:
        <TASK>
        ? __warn+0xa5/0x200
        ? event_trace_self_tests+0x1c2/0x480
        ? report_bug+0x1f6/0x220
        ? handle_bug+0x6f/0x90
        ? exc_invalid_op+0x17/0x50
        ? asm_exc_invalid_op+0x1a/0x20
        ? tracer_preempt_on+0x78/0x1c0
        ? event_trace_self_tests+0x1c2/0x480
        ? __pfx_event_trace_self_tests_init+0x10/0x10
        event_trace_self_tests_init+0x27/0xe0
        do_one_initcall+0xd6/0x3c0
        ? __pfx_do_one_initcall+0x10/0x10
        ? kasan_set_track+0x25/0x30
        ? rcu_is_watching+0x38/0x60
        kernel_init_freeable+0x324/0x450
        ? __pfx_kernel_init+0x10/0x10
        kernel_init+0x1f/0x1e0
        ? _raw_spin_unlock_irq+0x33/0x50
        ret_from_fork+0x34/0x60
        ? __pfx_kernel_init+0x10/0x10
        ret_from_fork_asm+0x1b/0x30
        </TASK>
      
      This is because the synth_event_gen_test_init() left the synthetic events
      that it created enabled. By having it disable them after testing, the
      other selftests will run fine.
      
      Link: https://lore.kernel.org/linux-trace-kernel/20231220111525.2f0f49b0@gandalf.local.home
      
      Cc: stable@vger.kernel.org
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Tom Zanussi <zanussi@kernel.org>
      Fixes: 9fe41efa ("tracing: Add synth event generation test module")
      Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Reported-by: Alexander Graf <graf@amazon.com>
      Tested-by: Alexander Graf <graf@amazon.com>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    • eventfs: Have event files and directories default to parent uid and gid · 0dfc852b
      Steven Rostedt (Google) authored
      Dongliang reported:
      
        I found that in the latest version, the nodes of tracefs have been
        changed to dynamically created.
      
        This has caused me to encounter a problem where the gid I specified in
        the mounting parameters cannot apply to all files, as in the following
        situation:
      
        /data/tmp/events # mount | grep tracefs
        tracefs on /data/tmp type tracefs (rw,seclabel,relatime,gid=3012)
      
        gid 3012 = readtracefs
      
        /data/tmp # ls -lh
        total 0
        -r--r-----   1 root readtracefs 0 1970-01-01 08:00 README
        -r--r-----   1 root readtracefs 0 1970-01-01 08:00 available_events
      
        ums9621_1h10:/data/tmp/events # ls -lh
        total 0
        drwxr-xr-x 2 root root 0 2023-12-19 00:56 alarmtimer
        drwxr-xr-x 2 root root 0 2023-12-19 00:56 asoc
      
        It will prevent certain applications from accessing tracefs properly, I
        try to avoid this issue by making the following modifications.
      
      To fix this, have the files created default to taking the ownership of
      the parent dentry unless the ownership was previously set by the user.
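      
      The rule in the previous paragraph can be sketched like this. It is a
      minimal illustration, assuming a per-entry record of saved attributes with
      "explicitly set" flags; owner_sketch and resolve_owner() are hypothetical
      names rather than eventfs internals.

```c
#include <stdbool.h>

/* Hypothetical saved attributes for a lazily created entry. */
struct owner_sketch {
	unsigned int uid;
	unsigned int gid;
	bool uid_set;	/* true once the user has explicitly changed the uid */
	bool gid_set;	/* true once the user has explicitly changed the gid */
};

/* Pick the ownership to apply when the entry is instantiated. */
void resolve_owner(const struct owner_sketch *saved,
		   unsigned int parent_uid, unsigned int parent_gid,
		   unsigned int *uid, unsigned int *gid)
{
	/* Default to the parent unless the user set a value earlier. */
	*uid = saved->uid_set ? saved->uid : parent_uid;
	*gid = saved->gid_set ? saved->gid : parent_gid;
}
```

      With a parent mounted with gid=3012, as in the report, an untouched entry
      resolves to gid 3012 in this sketch, while an entry whose group was changed
      by the user keeps that group.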
      
      Link: https://lore.kernel.org/linux-trace-kernel/1703063706-30539-1-git-send-email-dongliang.cui@unisoc.com/
      Link: https://lore.kernel.org/linux-trace-kernel/20231220105017.1489d790@gandalf.local.home
      
      Cc: stable@vger.kernel.org
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Hongyu Jin  <hongyu.jin@unisoc.com>
      Fixes: 28e12c09 ("eventfs: Save ownership and mode")
      Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
      Reported-by: Dongliang Cui <cuidongliang390@gmail.com>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
    • keys, dns: Allow key types (eg. DNS) to be reclaimed immediately on expiry · 39299bdd
      David Howells authored
      If a key has an expiration time, then when that time passes, the key is
      left around for a certain amount of time before being collected (5 mins by
      default) so that EKEYEXPIRED can be returned instead of ENOKEY.  This is a
      problem for DNS keys because we want to redo the DNS lookup immediately at
      that point.
      
      Fix this by allowing key types to be marked such that keys of that type
      don't have this extra period, but are reclaimed as soon as they expire and
      turn this on for dns_resolver-type keys.  To make this easier to handle,
      key->expiry is changed to be permanent if TIME64_MAX rather than 0.
      
      Furthermore, give such new-style negative DNS results a 1s default expiry
      if no other expiry time is set rather than allowing them to stick around
      indefinitely.  This shouldn't be zero as ls will follow a failing stat call
      immediately with a second one that adds AT_SYMLINK_NOFOLLOW.
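      
      A small sketch of the expiry rules described above, under stated
      assumptions: EXPIRY_NEVER stands in for TIME64_MAX, GC_DELAY_SECS for the
      default 5 minute grace period, and the instant_reap flag, key_sketch,
      key_expired() and key_collectable() are hypothetical names. The exact
      comparison operators are also an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

#define EXPIRY_NEVER  INT64_MAX	/* stand-in for TIME64_MAX: permanent key */
#define GC_DELAY_SECS 300	/* default grace period of 5 minutes */

struct key_sketch {
	int64_t expiry;		/* absolute time in seconds, or EXPIRY_NEVER */
	bool instant_reap;	/* hypothetical per-type "no grace period" flag */
};

/* Has the key expired at time 'now'? Permanent keys never expire. */
bool key_expired(const struct key_sketch *k, int64_t now)
{
	return k->expiry != EXPIRY_NEVER && now >= k->expiry;
}

/* May the garbage collector reclaim the key at time 'now'? */
bool key_collectable(const struct key_sketch *k, int64_t now)
{
	if (k->expiry == EXPIRY_NEVER)
		return false;
	if (k->instant_reap)
		return now >= k->expiry;	/* reclaimed as soon as it expires */
	/* Otherwise linger so lookups can still report "expired". */
	return now - GC_DELAY_SECS >= k->expiry;
}
```

      In this model a DNS-style key with instant_reap set is collectable at its
      expiry time, whereas an ordinary key with the same expiry stays around for
      another 300 seconds.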
      
      Fixes: 1a4240f4 ("DNS: Separate out CIFS DNS Resolver code")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Tested-by: Markus Suvanto <markus.suvanto@gmail.com>
      cc: Wang Lei <wang840925@gmail.com>
      cc: Jeff Layton <jlayton@redhat.com>
      cc: Steve French <smfrench@gmail.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: Jarkko Sakkinen <jarkko@kernel.org>
      cc: "David S. Miller" <davem@davemloft.net>
      cc: Eric Dumazet <edumazet@google.com>
      cc: Jakub Kicinski <kuba@kernel.org>
      cc: Paolo Abeni <pabeni@redhat.com>
      cc: linux-afs@lists.infradead.org
      cc: linux-cifs@vger.kernel.org
      cc: linux-nfs@vger.kernel.org
      cc: ceph-devel@vger.kernel.org
      cc: keyrings@vger.kernel.org
      cc: netdev@vger.kernel.org
    • Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 74769d81
      Paolo Abeni authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2023-12-21
      
      Hi David, hi Jakub, hi Paolo, hi Eric,
      
      The following pull-request contains BPF updates for your *net* tree.
      
      We've added 3 non-merge commits during the last 5 day(s) which contain
      a total of 4 files changed, 45 insertions(+).
      
      The main changes are:
      
      1) Fix a syzkaller splat which triggered an oob issue in bpf_link_show_fdinfo(),
         from Jiri Olsa.
      
      2) Fix another syzkaller-found issue which triggered a NULL pointer dereference
         in BPF sockmap for unconnected unix sockets, from John Fastabend.
      
      bpf-for-netdev
      
      * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
        bpf: Add missing BPF_LINK_TYPE invocations
        bpf: sockmap, test for unconnected af_unix sock
        bpf: syzkaller found null ptr deref in unix_bpf proto add
      ====================
      
      Link: https://lore.kernel.org/r/20231221104844.1374-1-daniel@iogearbox.net
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • gpio: dwapb: mask/unmask IRQ when disable/enale it · 1cc3542c
      xiongxin authored
      In the hardware implementation of the I2C HID driver based on DesignWare
      GPIO IRQ chip, when the user continues to use the I2C HID device in the
      suspend process, the I2C HID interrupt will be masked after the resume
      process is finished.
      
      This is because the disable_irq()/enable_irq() of the DesignWare GPIO
      driver does not synchronize the IRQ mask register state. In normal use
      of the I2C HID procedure, the GPIO IRQ irq_mask()/irq_unmask() functions
      are called in pairs. In case of an exception, i2c_hid_core_suspend()
      calls disable_irq() to disable the GPIO IRQ. With low probability, this
      causes irq_unmask() to not be called, which causes the GPIO IRQ to be
      masked and not unmasked in enable_irq(), raising an exception.
      
      Synchronize the mask register state in the dwapb_irq_enable() and
      dwapb_irq_disable() functions: mask the GPIO IRQ before disabling it,
      and unmask it after enabling it.
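      
      The ordering can be sketched on a plain struct instead of memory-mapped
      registers. This is an illustration only: gpio_regs_sketch models an
      interrupt-enable word and an interrupt-mask word, and irq_enable_sketch()
      and irq_disable_sketch() are hypothetical names, not the driver's
      callbacks. Locking and register I/O are omitted.

```c
#include <stdint.h>

/* Hypothetical model of the two per-line interrupt control words. */
struct gpio_regs_sketch {
	uint32_t inten;		/* bit set = interrupt enabled for that line */
	uint32_t intmask;	/* bit set = interrupt masked for that line */
};

/* Enable a line, then unmask it so a stale mask bit cannot linger. */
void irq_enable_sketch(struct gpio_regs_sketch *r, unsigned int line)
{
	uint32_t bit = UINT32_C(1) << line;

	r->inten |= bit;
	r->intmask &= ~bit;
}

/* Mask a line first, then disable it. */
void irq_disable_sketch(struct gpio_regs_sketch *r, unsigned int line)
{
	uint32_t bit = UINT32_C(1) << line;

	r->intmask |= bit;
	r->inten &= ~bit;
}
```

      Starting from a line that was left masked, a single enable call in this
      model leaves it both enabled and unmasked, which is the state the resume
      path needs.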
      
      Fixes: 7779b345 ("gpio: add a driver for the Synopsys DesignWare APB GPIO block")
      Cc: stable@kernel.org
      Co-developed-by: Riwen Lu <luriwen@kylinos.cn>
      Signed-off-by: Riwen Lu <luriwen@kylinos.cn>
      Signed-off-by: xiongxin <xiongxin@kylinos.cn>
      Acked-by: Serge Semin <fancer.lancer@gmail.com>
      Reviewed-by: Andy Shevchenko <andy@kernel.org>
      Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>