- 26 Aug, 2020 7 commits
-
-
Valentin Schneider authored
SD_DEGENERATE_GROUPS_MASK is only useful for sched/topology.c, but still gets defined for anyone who imports topology.h, leading to a flurry of unused variable warnings. Move it out of the header and place it next to the SD degeneration functions in sched/topology.c. Fixes: 4ee4ea44 ("sched/topology: Introduce SD metaflag for flags needing > 1 groups") Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200825133216.9163-2-valentin.schneider@arm.com
-
Valentin Schneider authored
Defining an array in a header imported all over the place clearly is a daft idea, that still didn't stop me from doing it. Leave a declaration of sd_flag_debug in topology.h and move its definition to sched/debug.c. Fixes: b6e862f3 ("sched/topology: Define and assign sched_domain flag metadata") Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200825133216.9163-1-valentin.schneider@arm.com
-
Sebastian Andrzej Siewior authored
sched_submit_work() is considered to be a hot path. The preempt_disable() instruction is a compiler barrier and forces the compiler to load task_struct::flags for the second comparison. By using a local variable, the compiler can load the value once and keep it in a register for the second comparison. Verified on x86-64 with gcc-10. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200819200025.lqvmyefqnbok5i4f@linutronix.de
-
Sebastian Andrzej Siewior authored
The bits PF_IO_WORKER and PF_WQ_WORKER are tested together in sched_submit_work() which is considered to be a hot path. If the two bits cross the 8 or 16 bit boundary then most architecture require multiple load instructions in order to create the constant value. Also, such a value can not be encoded within the compare opcode. By moving the bit definition within the same block, the compiler can create/use one immediate value. For some reason gcc-10 on ARM64 requires both bits to be next to each other in order to issue "tst reg, val; bne label". Otherwise the result is "mov reg1, val; tst reg, reg1; bne label". Move PF_VCPU out of the way so that PF_IO_WORKER can be next to PF_WQ_WORKER. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200819195505.y3fxk72sotnrkczi@linutronix.de
-
Jiang Biao authored
The code in reweight_entity() can be simplified. For a sched entity on the rq, the entity accounting can be replaced by cfs_rq instantaneous load updates currently called from within the entity accounting. Even though an entity on the rq can't represent a task in reweight_entity() (a task is always dequeued before calling this function) and so the numa task accounting and the rq->cfs_tasks list management of the entity accounting are never called, the redundant cfs_rq->nr_running decrement/increment will be avoided. Signed-off-by: Jiang Biao <benbjiang@tencent.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org> Link: https://lkml.kernel.org/r/20200811113209.34057-1-benbjiang@tencent.com
-
Lukasz Luba authored
In find_energy_efficient_cpu() 'cpu_cap' could be less that 'util'. It might be because of RT, DL (so higher sched class than CFS), irq or thermal pressure signal, which reduce the capacity value. In such situation the result of 'cpu_cap - util' might be negative but stored in the unsigned long. Then it might be compared with other unsigned long when uclamp_rq_util_with() reduced the 'util' such that is passes the fits_capacity() check. Prevent this situation and make the arithmetic more safe. Fixes: 1d42509e ("sched/fair: Make EAS wakeup placement consider uclamp restrictions") Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lkml.kernel.org/r/20200810083004.26420-1-lukasz.luba@arm.com
-
Josh Don authored
SMT siblings share caches, so cache hotness should be irrelevant for cross-sibling migration. Signed-off-by: Josh Don <joshdon@google.com> Proposed-by: Venkatesh Pallipadi <venki@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20200804193413.510651-1-joshdon@google.com
-
- 19 Aug, 2020 16 commits
-
-
Valentin Schneider authored
There would be no point in preserving a sched_domain with a single group just because it has this flag set. Add it to SD_DEGENERATE_GROUPS_MASK. Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lore.kernel.org/r/20200817113003.20802-17-valentin.schneider@arm.com
-
Valentin Schneider authored
A sched_domain can only have overlapping sched_groups if it has more than one group. Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lore.kernel.org/r/20200817113003.20802-16-valentin.schneider@arm.com
-
Valentin Schneider authored
Being a load-balancing flag, it requires 2+ groups to have any effect. Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lore.kernel.org/r/20200817113003.20802-15-valentin.schneider@arm.com
-
Valentin Schneider authored
There would be no point in preserving a sched_domain with a single group just because it has this flag set. Add it to SD_DEGENERATE_GROUPS_MASK. Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lore.kernel.org/r/20200817113003.20802-14-valentin.schneider@arm.com
-
Valentin Schneider authored
Even if no mainline topology uses this flag, it is a load balancing flag just like SD_BALANCE_FORK and requires 2+ groups to have any effect. Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lore.kernel.org/r/20200817113003.20802-13-valentin.schneider@arm.com
-
Valentin Schneider authored
SD_PREFER_SIBLING is currently considered in sd_parent_degenerate() but not in sd_degenerate(). It too hinges on load balancing, and thus won't have any effect when set on a domain with a single group. Add it to SD_DEGENERATE_GROUPS_MASK. Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lore.kernel.org/r/20200817113003.20802-12-valentin.schneider@arm.com
-
Valentin Schneider authored
We currently set this flag *only* on domains whose topology level exactly match the level where we detect asymmetry (as returned by asym_cpu_capacity_level()). This is rather problematic. Say there are two clusters in the system, one with a lone big CPU and the other with a mix of big and LITTLE CPUs (as is allowed by DynamIQ): DIE [ ] MC [ ][ ] 0 1 2 3 4 L L B B B asym_cpu_capacity_level() will figure out that the MC level is the one where all CPUs can see a CPU of max capacity, and we will thus set SD_ASYM_CPUCAPACITY at MC level for all CPUs. That lone big CPU will degenerate its MC domain, since it would be alone in there, and will end up with just a DIE domain. Since the flag was only set at MC, this CPU ends up not seeing any SD with the flag set, which is broken. Rather than clearing dflags at every topology level, clear it before entering the topology level loop. This will properly propagate upwards flags that are set starting from a certain level. Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Quentin Perret <qperret@google.com> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lore.kernel.org/r/20200817113003.20802-11-valentin.schneider@arm.com
-
Valentin Schneider authored
If there is only a single NUMA node in the system, the only NUMA topology level that will be generated will be NODE (identity distance), which doesn't have SD_SERIALIZE. This means we don't need this special case in sd_parent_degenerate(), as having the NODE level "naturally" covers it. Thus, remove it. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lore.kernel.org/r/20200817113003.20802-10-valentin.schneider@arm.com
-
Valentin Schneider authored
Leverage SD_DEGENERATE_GROUPS_MASK in sd_degenerate() and sd_parent_degenerate(). Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lore.kernel.org/r/20200817113003.20802-9-valentin.schneider@arm.com
-
Valentin Schneider authored
In preparation of cleaning up the sd_degenerate*() functions, mark flags used in sd_degenerate() with the new SDF_NEEDS_GROUPS flag. With this, build a compile-time mask of those SD flags. Note that sd_parent_degenerate() uses an extra flag in its mask, SD_PREFER_SIBLING, which remains singled out for now. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lore.kernel.org/r/20200817113003.20802-8-valentin.schneider@arm.com
-
Valentin Schneider authored
Decoding the output of /proc/sys/kernel/sched_domain/cpu*/domain*/flags has always been somewhat annoying, as one needs to go fetch the bit -> name mapping from the source code itself. This encoding can be saved in a script somewhere, but that isn't safe from flags being added, removed or even shuffled around. What matters for debugging purposes is to get *which* flags are set in a given domain, their associated value is pretty much meaningless. Make the sd flags debug file output flag names. Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lore.kernel.org/r/20200817113003.20802-7-valentin.schneider@arm.com
-
Valentin Schneider authored
Now that we have some description of what we expect the flags layout to be, we can use that to assert at runtime that the actual layout is sane. Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lore.kernel.org/r/20200817113003.20802-6-valentin.schneider@arm.com
-
Valentin Schneider authored
There are some expectations regarding how sched domain flags should be laid out, but none of them are checked or asserted in sched_domain_debug_one(). After staring at said flags for a while, I've come to realize there's two repeating patterns: - Shared with children: those flags are set from the base CPU domain upwards. Any domain that has it set will have it set in its children. It hints at "some property holds true / some behaviour is enabled until this level". - Shared with parents: those flags are set from the topmost domain downwards. Any domain that has it set will have it set in its parents. It hints at "some property isn't visible / some behaviour is disabled until this level". There are two outliers that (currently) do not map to either of these: o SD_PREFER_SIBLING, which is cleared below levels with SD_ASYM_CPUCAPACITY. The change was introduced by commit: 9c63e84d ("sched/core: Disable SD_PREFER_SIBLING on asymmetric CPU capacity domains") as it could break misfit migration on some systems. In light of this, we might want to change it back to make it fit one of the two categories and fix the issue another way. o SD_ASYM_CPUCAPACITY, which gets set on a single level and isn't propagated up nor down. From a topology description point of view, it really wants to be SDF_SHARED_PARENT; this will be rectified in a later patch. Tweak the sched_domain flag declaration to assign each flag an expected layout, and include the rationale for each flag "meta type" assignment as a comment. Consolidate the flag metadata into an array; the index of a flag's metadata can easily be found with log2(flag), IOW __ffs(flag). Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lore.kernel.org/r/20200817113003.20802-5-valentin.schneider@arm.com
-
Valentin Schneider authored
To associate the SD flags with some metadata, we need some more structure in the way they are declared. Rather than shove that in a free-standing macro list, move the declaration in a separate file that can be re-imported with different SD_FLAG definitions. This is inspired by what is done with the syscall table (see uapi/asm/unistd.h and sys_call_table). The value assigned to a given SD flag now depends on the order it appears in sd_flags.h. No change in functionality. Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lore.kernel.org/r/20200817113003.20802-4-valentin.schneider@arm.com
-
Valentin Schneider authored
The ARM-specific GMC level is meant to be built using the thread sibling mask, but no devicetree in arch/arm/boot/dts uses the 'thread' cpu-map binding. With SD_SHARE_POWERDOMAIN gone, this topology level can be removed, at which point ARM no longer benefits from having a custom defined topology table. Delete the GMC topology level by making ARM use the default scheduler topology table. This essentially reverts commit: fb2aa855 ("sched, ARM: Create a dedicated scheduler topology table") No change in functionality is expected. Suggested-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lore.kernel.org/r/20200817113003.20802-3-valentin.schneider@arm.com
-
Valentin Schneider authored
This flag was introduced in 2014 by commit: d77b3ed5 ("sched: Add a new SD_SHARE_POWERDOMAIN for sched_domain") but AFAIA it was never leveraged by the scheduler. The closest thing I can think of is EAS caring about frequency domains, and it does that by leveraging performance domains. Remove the flag. No change in functionality is expected. Suggested-by: Morten Rasmussen <morten.rasmussen@arm.com> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lore.kernel.org/r/20200817113003.20802-2-valentin.schneider@arm.com
-
- 18 Aug, 2020 5 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spiLinus Torvalds authored
Pull spi fixes from Mark Brown: "A bunch of fixes that came in for SPI during the merge window. Some from ST and others for their controller, one from Lukas for a race between device addition and controller unregistration and one from fix from Geert for the DT bindings which unbreaks validation" * tag 'spi-fix-v5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: dt-bindings: lpspi: Add missing boolean type for fsl,spi-only-use-cs1-sel spi: stm32: always perform registers configuration prior to transfer spi: stm32: fixes suspend/resume management spi: stm32: fix stm32_spi_prepare_mbr in case of odd clk_rate spi: stm32: fix fifo threshold level in case of short transfer spi: stm32h7: fix race condition at end of transfer spi: stm32: clear only asserted irq flags on interrupt spi: Prevent adding devices below an unregistering controller
-
git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblockLinus Torvalds authored
Pull ia64 page table fix from Mike Rapoport: "Fix regression in IA-64 caused by page table allocation refactoring The refactoring and consolidation of <asm/pgalloc.h> caused regression on parisc and ia64. The fix for parisc made it into v5.9-rc1 while the fix ia64 got delayed a bit and here it is" * tag 'fixes-2020-08-18' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock: arch/ia64: Restore arch-specific pgd_offset_k implementation
-
Yang Shi authored
Recently we found regression when running will_it_scale/page_fault3 test on ARM64. Over 70% down for the multi processes cases and over 20% down for the multi threads cases. It turns out the regression is caused by commit 89b15332 ("mm: drop mmap_sem before calling balance_dirty_pages() in write fault"). The test mmaps a memory size file then write to the mapping, this would make all memory dirty and trigger dirty pages throttle, that upstream commit would release mmap_sem then retry the page fault. The retried page fault would see correct PTEs installed then just fall through to spurious TLB flush. The regression is caused by the excessive spurious TLB flush. It is fine on x86 since x86's spurious TLB flush is no-op. We could just skip the spurious TLB flush to mitigate the regression. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Reported-by: Xu Yu <xuyu@linux.alibaba.com> Debugged-by: Xu Yu <xuyu@linux.alibaba.com> Tested-by: Xu Yu <xuyu@linux.alibaba.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Yang Shi <shy828301@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linuxLinus Torvalds authored
Pull mailmap update from Kees Cook: "This was originally part of my pstore tree, but when I realized that mailmap needed re-alphabetizing, I decided to wait until -rc1 to send this, as I saw a lot of mailmap additions pending in -next for the merge window. It's a programmatic reordering and the addition of a pstore contributor's preferred email address" * tag 'pstore-v5.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: mailmap: Add WeiXiong Liao mailmap: Restore dictionary sorting
-
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds authored
Pull networking fixes from David Miller: "Another batch of fixes: 1) Remove nft_compat counter flush optimization, it generates warnings from the refcount infrastructure. From Florian Westphal. 2) Fix BPF to search for build id more robustly, from Jiri Olsa. 3) Handle bogus getopt lengths in ebtables, from Florian Westphal. 4) Infoleak and other fixes to j1939 CAN driver, from Eric Dumazet and Oleksij Rempel. 5) Reset iter properly on mptcp sendmsg() error, from Florian Westphal. 6) Show a saner speed in bonding broadcast mode, from Jarod Wilson. 7) Various kerneldoc fixes in bonding and elsewhere, from Lee Jones. 8) Fix double unregister in bonding during namespace tear down, from Cong Wang. 9) Disable RP filter during icmp_redirect selftest, from David Ahern" * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (75 commits) otx2_common: Use devm_kcalloc() in otx2_config_npa() net: qrtr: fix usage of idr in port assignment to socket selftests: disable rp_filter for icmp_redirect.sh Revert "net: xdp: pull ethernet header off packet after computing skb->protocol" phylink: <linux/phylink.h>: fix function prototype kernel-doc warning mptcp: sendmsg: reset iter on error redux net: devlink: Remove overzealous WARN_ON with snapshots tipc: not enable tipc when ipv6 works as a module tipc: fix uninit skb->data in tipc_nl_compat_dumpit() net: Fix potential wrong skb->protocol in skb_vlan_untag() net: xdp: pull ethernet header off packet after computing skb->protocol ipvlan: fix device features bonding: fix a potential double-unregister can: j1939: add rxtimer for multipacket broadcast session can: j1939: abort multipacket broadcast session when timeout occurs can: j1939: cancel rxtimer on multipacket broadcast session complete can: j1939: fix support for multipacket broadcast message net: fddi: skfp: cfm: Remove seemingly unused variable 'ID_sccs' net: fddi: skfp: cfm: Remove set but unused variable 'oldstate' net: fddi: skfp: smt: Remove seemingly unused variable 'ID_sccs' ...
-
- 17 Aug, 2020 12 commits
-
-
Xu Wang authored
A multiplication for the size determination of a memory allocation indicated that an array data structure should be processed. Thus use the corresponding function "devm_kcalloc". Signed-off-by: Xu Wang <vulab@iscas.ac.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Necip Fazil Yildiran authored
Passing large uint32 sockaddr_qrtr.port numbers for port allocation triggers a warning within idr_alloc() since the port number is cast to int, and thus interpreted as a negative number. This leads to the rejection of such valid port numbers in qrtr_port_assign() as idr_alloc() fails. To avoid the problem, switch to idr_alloc_u32() instead. Fixes: bdabad3e ("net: Add Qualcomm IPC router") Reported-by: syzbot+f31428628ef672716ea8@syzkaller.appspotmail.com Signed-off-by: Necip Fazil Yildiran <necip@google.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Ahern authored
h1 is initially configured to reach h2 via r1 rather than the more direct path through r2. If rp_filter is set and inherited for r2, forwarding fails since the source address of h1 is reachable from eth0 vs the packet coming to it via r1 and eth1. Since rp_filter setting affects the test, explicitly reset it. Signed-off-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kees Cook authored
WeiXiong Liao noted to me offlist that his preference for email address had changed and that he'd like it updated in the mailmap so people discussing pstore/blk would be able to reach him. Cc: WeiXiong Liao <gmpy.liaowx@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org>
-
Kees Cook authored
Several names had been recently appended (instead of inserted). While git-shortlog doesn't need this file to be sorted, it helps humans to keep it organized this way. Sort the entire file (which includes some minor shuffling for dictionary order). Done with the following commands: grep -E '^(#|$)' .mailmap > .mailmap.head grep -Ev '^(#|$)' .mailmap > .mailmap.body sort -f .mailmap.body > .mailmap.body.sort cat .mailmap.head .mailmap.body.sort > .mailmap rm .mailmap.head .mailmap.body.sort Signed-off-by: Kees Cook <keescook@chromium.org>
-
Jessica Clarke authored
IA-64 is special and treats pgd_offset_k() differently to pgd_offset(), using different formulae to calculate the indices into the kernel and user PGDs. The index into the user PGDs takes into account the region number, but the index into the kernel (init_mm) PGD always assumes a predefined kernel region number. Commit 974b9b2c ("mm: consolidate pte_index() and pte_offset_*() definitions") made IA-64 use a generic pgd_offset_k() which incorrectly used pgd_index() for kernel page tables. As a result, the index into the kernel PGD was going out of bounds and the kernel hung during early boot. Allow overrides of pgd_offset_k() and override it on IA-64 with the old implementation that will correctly index the kernel PGD. Fixes: 974b9b2c ("mm: consolidate pte_index() and pte_offset_*() definitions") Reported-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: Jessica Clarke <jrtc27@jrtc27.com> Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Acked-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
-
David S. Miller authored
This reverts commit f8414a8d. eth_type_trans() does the necessary pull on the skb. Signed-off-by: David S. Miller <davem@davemloft.net>
-
Randy Dunlap authored
Fix a kernel-doc warning for the pcs_config() function prototype: ../include/linux/phylink.h:406: warning: Excess function parameter 'permit_pause_to_mac' description in 'pcs_config' Fixes: 7137e18f ("net: phylink: add struct phylink_pcs") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Russell King <linux@armlinux.org.uk> Cc: David S. Miller <davem@davemloft.net> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Howells authored
Impose a limit on the number of watches that a user can hold so that they can't use this mechanism to fill up all the available memory. This is done by putting a counter in user_struct that's incremented when a watch is allocated and decreased when it is released. If the number exceeds the RLIMIT_NOFILE limit, the watch is rejected with EAGAIN. This can be tested by the following means: (1) Create a watch queue and attach it to fd 5 in the program given - in this case, bash: keyctl watch_session /tmp/nlog /tmp/gclog 5 bash (2) In the shell, set the maximum number of files to, say, 99: ulimit -n 99 (3) Add 200 keyrings: for ((i=0; i<200; i++)); do keyctl newring a$i @s || break; done (4) Try to watch all of the keyrings: for ((i=0; i<200; i++)); do echo $i; keyctl watch_add 5 %:a$i || break; done This should fail when the number of watches belonging to the user hits 99. (5) Remove all the keyrings and all of those watches should go away: for ((i=0; i<200; i++)); do keyctl unlink %:a$i; done (6) Kill off the watch queue by exiting the shell spawned by watch_session. Fixes: c73be61c ("pipe: Add general notification queue support") Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Florian Westphal authored
This fix wasn't correct: When this function is invoked from the retransmission worker, the iterator contains garbage and resetting it causes a crash. As the work queue should not be performance critical also zero the msghdr struct. Fixes: 35759383 "(mptcp: sendmsg: reset iter on error)" Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andrew Lunn authored
It is possible to trigger this WARN_ON from user space by triggering a devlink snapshot with an ID which already exists. We don't need both -EEXISTS being reported and spamming the kernel log. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Chris Healy <cphealy@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Xin Long authored
When using ipv6_dev_find() in one module, it requires ipv6 not to work as a module. Otherwise, this error occurs in build: undefined reference to `ipv6_dev_find'. So fix it by adding "depends on IPV6 || IPV6=n" to tipc/Kconfig, as it does in sctp/Kconfig. Fixes: 5a6f6f57 ("tipc: set ub->ifindex for local ipv6 address") Reported-by: kernel test robot <lkp@intel.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-