1. 15 Jan, 2024 9 commits
  2. 07 Jan, 2024 1 commit
  3. 06 Jan, 2024 2 commits
  4. 05 Jan, 2024 25 commits
  5. 04 Jan, 2024 3 commits
    • Linus Torvalds's avatar
      x86/csum: clean up `csum_partial' further · a476aae3
      Linus Torvalds authored
      Commit 688eb819 ("x86/csum: Improve performance of `csum_partial`")
      ended up improving the code generation for the IP csum calculations, and
      in particular special-casing the 40-byte case that is a hot case for
      IPv6 headers.
      
      It then had _another_ special case for the 64-byte unrolled loop, which
      did two chains of 32-byte blocks, which allows modern CPU's to improve
      performance by doing the chains in parallel thanks to renaming the carry
      flag.
      
      This just unifies the special cases and combines them into just one
      single helper the 40-byte csum case, and replaces the 64-byte case by a
      80-byte case that just does that single helper twice.  It avoids having
      all these different versions of inline assembly, and actually improved
      performance further in my tests.
      
      There was never anything magical about the 64-byte unrolled case, even
      though it happens to be a common size (and typically is the cacheline
      size).
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      a476aae3
    • Noah Goldstein's avatar
      x86/csum: Remove unnecessary odd handling · 5d4acb62
      Noah Goldstein authored
      The special case for odd aligned buffers is unnecessary and mostly
      just adds overhead. Aligned buffers is the expectations, and even for
      unaligned buffer, the only case that was helped is if the buffer was
      1-byte from word aligned which is ~1/7 of the cases. Overall it seems
      highly unlikely to be worth to extra branch.
      
      It was left in the previous perf improvement patch because I was
      erroneously comparing the exact output of `csum_partial(...)`, but
      really we only need `csum_fold(csum_partial(...))` to match so its
      safe to remove.
      
      All csum kunit tests pass.
      Signed-off-by: default avatarNoah Goldstein <goldstein.w.n@gmail.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Reviewed-by: default avatarDavid Laight <david.laight@aculab.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      5d4acb62
    • Linus Torvalds's avatar
      Merge tag 'platform-drivers-x86-v6.7-7' of... · 5eff55d7
      Linus Torvalds authored
      Merge tag 'platform-drivers-x86-v6.7-7' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
      
      Pull x86 platform driver fix from Ilpo Järvinen:
       "Unfortunately the P2SB deadlock fix broke some older HW and we need
        some time to figure out the best way to fix the issue so reverting the
        deadlock fix for now"
      
      * tag 'platform-drivers-x86-v6.7-7' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
        Revert "platform/x86: p2sb: Allow p2sb_bar() calls during PCI device probe"
      5eff55d7