1. 21 Jun, 2016 10 commits
  2. 20 Jun, 2016 1 commit
  3. 17 Jun, 2016 1 commit
  4. 15 Jun, 2016 2 commits
    • clk: Use _rcuidle suffix to allow clk_core_enable() to be used from idle · f17a0dd1
      Paul E. McKenney authored
      This commit fixes the RCU use-from-idle bug corresponding to the
      following splat:
      
      > [ INFO: suspicious RCU usage. ]
      > 4.6.0-rc5-next-20160426+ #1127 Not tainted
      > -------------------------------
      > include/trace/events/clk.h:45 suspicious rcu_dereference_check() usage!
      >
      > other info that might help us debug this:
      >
      >
      > RCU used illegally from idle CPU!
      > rcu_scheduler_active = 1, debug_locks = 0
      > RCU used illegally from extended quiescent state!
      > 2 locks held by swapper/0/0:
      >  #0:  (&oh->hwmod_key#30){......}, at: [<c0121afc>] omap_hwmod_enable+0x18/0x44
      >  #1:  (enable_lock){......}, at: [<c0630684>] clk_enable_lock+0x18/0x124
      >
      > stack backtrace:
      > CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.6.0-rc5-next-20160426+ #1127
      > Hardware name: Generic OMAP36xx (Flattened Device Tree)
      > [<c0110290>] (unwind_backtrace) from [<c010c3a8>] (show_stack+0x10/0x14)
      > [<c010c3a8>] (show_stack) from [<c047fd68>] (dump_stack+0xb0/0xe4)
      > [<c047fd68>] (dump_stack) from [<c06315c0>] (clk_core_enable+0x1e0/0x36c)
      > [<c06315c0>] (clk_core_enable) from [<c0632298>] (clk_enable+0x1c/0x38)
      > [<c0632298>] (clk_enable) from [<c01204e0>] (_enable_clocks+0x18/0x7c)
      > [<c01204e0>] (_enable_clocks) from [<c012137c>] (_enable+0x114/0x2ec)
      > [<c012137c>] (_enable) from [<c0121b08>] (omap_hwmod_enable+0x24/0x44)
      > [<c0121b08>] (omap_hwmod_enable) from [<c0122ad0>] (omap_device_enable+0x3c/0x90)
      > [<c0122ad0>] (omap_device_enable) from [<c0122b34>] (_od_runtime_resume+0x10/0x38)
      > [<c0122b34>] (_od_runtime_resume) from [<c052cc00>] (__rpm_callback+0x2c/0x60)
      > [<c052cc00>] (__rpm_callback) from [<c052cc54>] (rpm_callback+0x20/0x80)
      > [<c052cc54>] (rpm_callback) from [<c052df7c>] (rpm_resume+0x3d0/0x6f0)
      > [<c052df7c>] (rpm_resume) from [<c052e2e8>] (__pm_runtime_resume+0x4c/0x64)
      > [<c052e2e8>] (__pm_runtime_resume) from [<c04bf2c4>] (omap2_gpio_resume_after_idle+0x54/0x68)
      > [<c04bf2c4>] (omap2_gpio_resume_after_idle) from [<c01269dc>] (omap3_enter_idle_bm+0xfc/0x1ec)
      > [<c01269dc>] (omap3_enter_idle_bm) from [<c0601888>] (cpuidle_enter_state+0x80/0x3d4)
      > [<c0601888>] (cpuidle_enter_state) from [<c0183b08>] (cpu_startup_entry+0x198/0x3a0)
      > [<c0183b08>] (cpu_startup_entry) from [<c0b00c0c>] (start_kernel+0x354/0x3c8)
      > [<c0b00c0c>] (start_kernel) from [<8000807c>] (0x8000807c)
      Reported-by: Tony Lindgren <tony@atomide.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Tony Lindgren <tony@atomide.com>
      Tested-by: Guenter Roeck <linux@roeck-us.net>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Michael Turquette <mturquette@baylibre.com>
      Cc: Stephen Boyd <sboyd@codeaurora.org>
      Cc: <linux-omap@vger.kernel.org>
      Cc: <linux-arm-kernel@lists.infradead.org>
      Cc: <linux-clk@vger.kernel.org>
      f17a0dd1
    • clk: Add _rcuidle tracepoints to allow clk_core_disable() use from idle · 2f87a6ea
      Paul E. McKenney authored
      This commit adds an _rcuidle suffix to a pair of trace events to
      prevent the following splat:
      
      > ===============================
      > [ INFO: suspicious RCU usage. ]
      > 4.6.0-rc5-next-20160426+ #1114 Not tainted
      > -------------------------------
      > include/trace/events/clk.h:59 suspicious rcu_dereference_check() usage!
      >
      > other info that might help us debug this:
      >
      >
      > RCU used illegally from idle CPU!
      > rcu_scheduler_active = 1, debug_locks = 0
      > RCU used illegally from extended quiescent state!
      > 2 locks held by swapper/0/0:
      >  #0:  (&oh->hwmod_key#30){......}, at: [<c0121b40>] omap_hwmod_idle+0x18/0x44
      >  #1:  (enable_lock){......}, at: [<c0630998>] clk_enable_lock+0x18/0x124
      >
      > stack backtrace:
      > CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.6.0-rc5-next-20160426+ #1114
      > Hardware name: Generic OMAP36xx (Flattened Device Tree)
      > [<c0110290>] (unwind_backtrace) from [<c010c3a8>] (show_stack+0x10/0x14)
      > [<c010c3a8>] (show_stack) from [<c047fd68>] (dump_stack+0xb0/0xe4)
      > [<c047fd68>] (dump_stack) from [<c0631618>] (clk_core_disable+0x17c/0x348)
      > [<c0631618>] (clk_core_disable) from [<c0632774>] (clk_disable+0x24/0x30)
      > [<c0632774>] (clk_disable) from [<c0120590>] (_disable_clocks+0x18/0x7c)
      > [<c0120590>] (_disable_clocks) from [<c0121680>] (_idle+0x12c/0x230)
      > [<c0121680>] (_idle) from [<c0121b4c>] (omap_hwmod_idle+0x24/0x44)
      > [<c0121b4c>] (omap_hwmod_idle) from [<c0122c24>] (omap_device_idle+0x3c/0x90)
      > [<c0122c24>] (omap_device_idle) from [<c052cc00>] (__rpm_callback+0x2c/0x60)
      > [<c052cc00>] (__rpm_callback) from [<c052cc54>] (rpm_callback+0x20/0x80)
      > [<c052cc54>] (rpm_callback) from [<c052d150>] (rpm_suspend+0x100/0x768)
      > [<c052d150>] (rpm_suspend) from [<c052ec58>] (__pm_runtime_suspend+0x64/0x84)
      > [<c052ec58>] (__pm_runtime_suspend) from [<c04bf25c>] (omap2_gpio_prepare_for_idle+0x5c/0x70)
      > [<c04bf25c>] (omap2_gpio_prepare_for_idle) from [<c0125568>] (omap_sram_idle+0x140/0x244)
      > [<c0125568>] (omap_sram_idle) from [<c01269dc>] (omap3_enter_idle_bm+0xfc/0x1ec)
      > [<c01269dc>] (omap3_enter_idle_bm) from [<c0601bdc>] (cpuidle_enter_state+0x80/0x3d4)
      > [<c0601bdc>] (cpuidle_enter_state) from [<c0183b08>] (cpu_startup_entry+0x198/0x3a0)
      > [<c0183b08>] (cpu_startup_entry) from [<c0b00c0c>] (start_kernel+0x354/0x3c8)
      > [<c0b00c0c>] (start_kernel) from [<8000807c>] (0x8000807c)
      Reported-by: Tony Lindgren <tony@atomide.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Tested-by: Tony Lindgren <tony@atomide.com>
      Tested-by: Guenter Roeck <linux@roeck-us.net>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Michael Turquette <mturquette@baylibre.com>
      Cc: Stephen Boyd <sboyd@codeaurora.org>
      Cc: <linux-omap@vger.kernel.org>
      Cc: <linux-arm-kernel@lists.infradead.org>
      Cc: <linux-clk@vger.kernel.org>
      2f87a6ea
  5. 03 Jun, 2016 1 commit
  6. 01 Jun, 2016 6 commits
  7. 30 May, 2016 5 commits
    • clk: rockchip: fix cpuclk registration error handling · 3183c0d5
      Xing Zheng authored
      This was probably a copy-paste error: the error handling should check
      cclk, not clk, when testing whether the cpuclk registration succeeded.
      Reported-by: Lin Huang <lin.huang@rock-chips.com>
      Signed-off-by: Xing Zheng <zhengxing@rock-chips.com>
      Signed-off-by: Heiko Stuebner <heiko@sntech.de>
      3183c0d5
    • clk: rockchip: Revert "clk: rockchip: reset init state before mmc card initialization" · 4715f81a
      Douglas Anderson authored
      This reverts commit 7a03fe6f ("clk: rockchip: reset init state
      before mmc card initialization").
      
      Though not totally obvious from the commit message nor from the source
      code, that commit appears to be trying to reset the "_drv" MMC clocks to
      90 degrees (note that the "_sample" MMC clocks have a shift of 0 so are
      not touched).
      
      The major problem here is that it doesn't properly reset things.  The
      phase is a two bit field and the commit only touches one of the two
      bits.  Thus the commit had the following effect:
      - phase   0  => phase  90
      - phase  90  => phase  90
      - phase 180  => phase 270
      - phase 270  => phase 270
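
The mapping above is what one would expect from forcing a single bit of a two-bit field. A minimal sketch, assuming the field encodes the phase in 90-degree steps and that the reverted commit set only the low bit; the function names are hypothetical and are not taken from the Rockchip driver:

```c
/*
 * Assumption: the 2-bit phase field counts in 90-degree steps
 * (0 -> 0, 1 -> 90, 2 -> 180, 3 -> 270).
 */
static unsigned int phase_to_field(unsigned int degrees)
{
	return (degrees / 90) & 0x3;
}

static unsigned int field_to_phase(unsigned int field)
{
	return (field & 0x3) * 90;
}

/* Models the reverted behaviour: only the low bit is forced to 1. */
static unsigned int broken_reset(unsigned int field)
{
	return (field | 0x1) & 0x3;
}

/* A real "reset to 90 degrees" has to write both bits of the field. */
static unsigned int proper_reset(void)
{
	return phase_to_field(90);
}
```

Under these assumptions, `broken_reset()` leaves the high bit untouched, so a clock that starts at 180 degrees ends up at 270 rather than 90.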
      
      Things get even weirder if you happen to have a bootloader that was
      actually using delay elements (should be no reason to, but you never
      know), since those are additional bits that weren't touched by the
      original patch.
      
      This is unlikely to be what we actually want.  Checking on rk3288-veyron
      devices, I can see that the bootloader leaves these clocks as:
      - emmc:  phase 180
      - sdmmc: phase 90
      - sdio0: phase 90
      
      Thus on rk3288-veyron devices the commit we're reverting had the effect
      of changing the eMMC clock to phase 270.  This probably explains the
      scattered reports I've heard of eMMC devices not working on some veyron
      devices when using the upstream kernel.
      
      The original commit was presumably made because previously the kernel
      didn't touch the "_drv" phase at all and relied on whatever value was
      there when the kernel started.  If someone was using a bootloader that
      touched the "_drv" phase then, indeed, we should have code in the kernel
      to fix that.  ...and also, to get ideal timings, we should also have the
      kernel change the phase depending on the speed mode.  In fact, that's
      the subject of a recent patch I posted at
      <https://patchwork.kernel.org/patch/9075141/>.
      
      Ideally, we should take both the patch posted to dw_mmc and this
      revert.  Since those will likely go through different trees, here I
      describe behavior with the combos:
      
      1. Just this revert: likely will fix rk3288-veyron eMMC on some devices
         + other cases; might break someone with a strange bootloader that
         sets the phase to 0 or one that uses delay elements (pretty
         unpredictable what would happen in that case).
      2. Just dw_mmc patch: fixes everyone.  Effectively the dw_mmc patch will
         totally override the broken patch and fix everything.
      3. Both patches: fixes everyone.  Once dw_mmc is initting properly then
         any defaults from the clock code don't matter.
      
      Fixes: 7a03fe6f ("clk: rockchip: reset init state before mmc card initialization")
      Signed-off-by: Douglas Anderson <dianders@chromium.org>
      Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com>
      
      [emmc and sdmmc still work on all current boards in mainline after this
      revert, so they should take precedence over any out-of-tree board that
      will hopefully again get fixed with the better upcoming dw_mmc change.]
      Signed-off-by: Heiko Stuebner <heiko@sntech.de>
      4715f81a
    • clk: rockchip: fix incorrect parent for rk3399's {c,g}pll_aclk_perihp_src · 3bd14ae9
      Xing Zheng authored
      There was a typo, swapping 'c' <--> 'g'.
      Signed-off-by: Xing Zheng <zhengxing@rock-chips.com>
      Signed-off-by: Brian Norris <briannorris@chromium.org>
      Reviewed-by: Douglas Anderson <dianders@chromium.org>
      Signed-off-by: Heiko Stuebner <heiko@sntech.de>
      3bd14ae9
    • clk: rockchip: mark rk3399 GIC clocks as critical · 176df69c
      Brian Norris authored
      We never want to kill the GIC.
      
      Noticed when making other clock fixups, and seeing the newly-constructed
      clock tree try to disable cpll, where we had this parent structure:
      
        aclk_gic <------\
                        |--- aclk_gic_pre <-- cpll <-- pll_cpll
        aclk_gic_noc <--/
      Signed-off-by: Brian Norris <briannorris@chromium.org>
      Reviewed-by: Douglas Anderson <dianders@chromium.org>
      Signed-off-by: Heiko Stuebner <heiko@sntech.de>
      176df69c
    • clk: rockchip: initialize flags of clk_init_data in mmc-phase clock · 595144c1
      Heiko Stuebner authored
      The flags element of clk_init_data was never initialized for
      mmc-phase clocks, resulting in the element containing a random value
      and thus possibly enabling unwanted clock flags.
      
      Fixes: 89bf26cb ("clk: rockchip: Add support for the mmc clock phases using the framework")
      Cc: stable@vger.kernel.org
      Signed-off-by: Heiko Stuebner <heiko@sntech.de>
      595144c1
  8. 29 May, 2016 3 commits
  9. 28 May, 2016 11 commits
    • hpfs: implement the show_options method · 037369b8
      Mikulas Patocka authored
      The HPFS filesystem used generic_show_options to produce the string that is
      displayed in /proc/mounts.  However, there is a problem that the options
      may disappear after remount.  If we mount the filesystem with option1
      and then remount it with option2, /proc/mounts should show both option1
      and option2, however it only shows option2 because the whole option
      string is replaced with replace_mount_options in hpfs_remount_fs.
      
      To fix this bug, implement the hpfs_show_options function that prints
      options that are currently selected.
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      037369b8
    • affs: fix remount failure when there are no options changed · 01d6e087
      Mikulas Patocka authored
      Commit c8f33d0b ("affs: kstrdup() memory handling") checks if the
      kstrdup function returns NULL due to out-of-memory condition.
      
      However, if we are remounting a filesystem with no change to
      filesystem-specific options, the parameter data is NULL.  In this case,
      kstrdup returns NULL (because it was passed NULL parameter), although no
      out of memory condition exists.  The mount syscall then fails with
      ENOMEM.
      
      This patch fixes the bug.  We fail with ENOMEM only if data is non-NULL.
      
      The patch also changes the call to replace_mount_options - if we didn't
      pass any filesystem-specific options, we don't call
      replace_mount_options (thus we don't erase existing reported options).
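
The logic described above can be modelled in userspace. This is only a sketch: `dup_or_null()` stands in for kstrdup(), and `remount_model()` is a hypothetical helper, not the affs remount function. It assumes the two rules stated in the message: report ENOMEM only when options were actually passed, and leave the recorded options alone when none were passed.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Userspace stand-in for kstrdup(): a NULL input yields a NULL result. */
static char *dup_or_null(const char *s)
{
	size_t len;
	char *copy;

	if (!s)
		return NULL;
	len = strlen(s) + 1;
	copy = malloc(len);
	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/*
 * Hypothetical remount helper.  Returns 0 on success or -ENOMEM.
 * *saved_opts models the option string reported to userspace.
 */
static int remount_model(const char *data, char **saved_opts)
{
	char *new_opts = dup_or_null(data);

	/* A NULL copy is an allocation failure only if data was non-NULL. */
	if (data && !new_opts)
		return -ENOMEM;

	/* Replace the recorded options only when new ones were passed. */
	if (new_opts) {
		free(*saved_opts);
		*saved_opts = new_opts;
	}
	return 0;
}
```

With this shape, a remount that passes no filesystem-specific options succeeds and keeps the previously recorded option string.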
      
      Fixes: c8f33d0b ("affs: kstrdup() memory handling")
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Cc: stable@vger.kernel.org	# v4.1+
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      01d6e087
    • hpfs: fix remount failure when there are no options changed · 44d51706
      Mikulas Patocka authored
      Commit ce657611 ("hpfs: kstrdup() out of memory handling") checks if
      the kstrdup function returns NULL due to out-of-memory condition.
      
      However, if we are remounting a filesystem with no change to
      filesystem-specific options, the parameter data is NULL.  In this case,
      kstrdup returns NULL (because it was passed NULL parameter), although no
      out of memory condition exists.  The mount syscall then fails with
      ENOMEM.
      
      This patch fixes the bug.  We fail with ENOMEM only if data is non-NULL.
      
      The patch also changes the call to replace_mount_options - if we didn't
      pass any filesystem-specific options, we don't call
      replace_mount_options (thus we don't erase existing reported options).
      
      Fixes: ce657611 ("hpfs: kstrdup() out of memory handling")
      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      44d51706
    • Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus · 4029632c
      Linus Torvalds authored
      Pull more MIPS updates from Ralf Baechle:
       "This is the second batch of MIPS patches for 4.7. Summary:
      
        CPS:
         - Copy EVA configuration when starting secondary VPs.
      
        EIC:
         - Clear Status IPL.
      
        Lasat:
         - Fix a few off by one bugs.
      
        lib:
         - Mark intrinsics notrace.  Not only are the intrinsics
           uninteresting, it would cause infinite recursion.
      
        MAINTAINERS:
         - Add file patterns for MIPS BRCM device tree bindings.
         - Add file patterns for mips device tree bindings.
      
        MT7628:
         - Fix MT7628 pinmux typos.
         - wled_an pinmux gpio.
         - EPHY LEDs pinmux support.
      
        Pistachio:
         - Enable KASLR
      
        VDSO:
         - Build microMIPS VDSO for microMIPS kernels.
         - Fix aliasing warning by building with `-fno-strict-aliasing'.
      
        Misc:
         - Add missing FROZEN hotplug notifier transitions.
         - Fix clk binding example for various PIC32 devices.
         - Fix cpu interrupt controller node-names in the DT files.
         - Fix XPA CPU feature separation.
         - Fix write_gc0_* macros when writing zero.
         - Add inline asm encoding helpers.
         - Add missing VZ accessor microMIPS encodings.
         - Fix little endian microMIPS MSA encodings.
         - Add 64-bit HTW fields and fix its configuration.
         - Fix sigreturn via VDSO on microMIPS kernel.
         - Lots of typo fixes.
         - Add definitions of SegCtl registers and use them"
      
      * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (49 commits)
        MIPS: Add missing FROZEN hotplug notifier transitions
        MIPS: Build microMIPS VDSO for microMIPS kernels
        MIPS: Fix sigreturn via VDSO on microMIPS kernel
        MIPS: devicetree: fix cpu interrupt controller node-names
        MIPS: VDSO: Build with `-fno-strict-aliasing'
        MIPS: Pistachio: Enable KASLR
        MIPS: lib: Mark intrinsics notrace
        MIPS: Fix 64-bit HTW configuration
        MIPS: Add 64-bit HTW fields
        MAINTAINERS: Add file patterns for mips device tree bindings
        MAINTAINERS: Add file patterns for mips brcm device tree bindings
        MIPS: Simplify DSP instruction encoding macros
        MIPS: Add missing tlbinvf/XPA microMIPS encodings
        MIPS: Fix little endian microMIPS MSA encodings
        MIPS: Add missing VZ accessor microMIPS encodings
        MIPS: Add inline asm encoding helpers
        MIPS: Spelling fix lets -> let's
        MIPS: VR41xx: Fix typo
        MIPS: oprofile: Fix typo
        MIPS: math-emu: Fix typo
        ...
      4029632c
    • fs: fix binfmt_aout.c build error · d66492bc
      Guenter Roeck authored
      Various builds (such as i386:allmodconfig) fail with
      
        fs/binfmt_aout.c:133:2: error: expected identifier or '(' before 'return'
        fs/binfmt_aout.c:134:1: error: expected identifier or '(' before '}' token
      
      [ Oops. My bad, I had stupidly thought that "allmodconfig" covered this
        on x86-64 too, but it obviously doesn't.  Egg on my face.  - Linus ]
      
      Fixes: 5d22fc25 ("mm: remove more IS_ERR_VALUE abuses")
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d66492bc
    • Merge branch 'hash' of git://ftp.sciencehorizons.net/linux · 7e0fb73c
      Linus Torvalds authored
      Pull string hash improvements from George Spelvin:
       "This series does several related things:
      
         - Makes the dcache hash (fs/namei.c) useful for general kernel use.
      
           (Thanks to Bruce for noticing the zero-length corner case)
      
         - Converts the string hashes in <linux/sunrpc/svcauth.h> to use the
           above.
      
         - Avoids 64-bit multiplies in hash_64() on 32-bit platforms.  Two
           32-bit multiplies will do well enough.
      
         - Rids the world of the bad hash multipliers in hash_32.
      
           This finishes the job started in commit 689de1d6 ("Minimal
           fix-up of bad hashing behavior of hash_64()")
      
           The vast majority of Linux architectures have hardware support for
           32x32-bit multiply and so derive no benefit from "simplified"
           multipliers.
      
           The few processors that do not (68000, h8/300 and some models of
           Microblaze) have arch-specific implementations added.  Those
           patches are last in the series.
      
         - Overhauls the dcache hash mixing.
      
           The patch in commit 0fed3ac8 ("namei: Improve hash mixing if
           CONFIG_DCACHE_WORD_ACCESS") was an off-the-cuff suggestion.
           Replaced with a much more careful design that's simultaneously
           faster and better.  (My own invention, as there was nothing suitable
           in the literature I could find.  Comments welcome!)
      
         - Modify the hash_name() loop to skip the initial HASH_MIX().  This
           would let us salt the hash if we ever wanted to.
      
         - Sort out partial_name_hash().
      
           The hash function is declared as using a long state, even though
           it's truncated to 32 bits at the end and the extra internal state
           contributes nothing to the result.  And some callers do odd things:
      
            - fs/hfs/string.c only allocates 32 bits of state
            - fs/hfsplus/unicode.c uses it to hash 16-bit unicode symbols not bytes
      
         - Modify bytemask_from_count to handle inputs of 1..sizeof(long)
           rather than 0..sizeof(long)-1.  This would simplify users other
           than full_name_hash
      
        Special thanks to Bruce Fields for testing and finding bugs in v1.  (I
        learned some humbling lessons about "obviously correct" code.)
      
        On the arch-specific front, the m68k assembly has been tested in a
        standalone test harness, I've been in contact with the Microblaze
        maintainers who mostly don't care, as the hardware multiplier is never
        omitted in real-world applications, and I haven't heard anything from
        the H8/300 world"
      
      * 'hash' of git://ftp.sciencehorizons.net/linux:
        h8300: Add <asm/hash.h>
        microblaze: Add <asm/hash.h>
        m68k: Add <asm/hash.h>
        <linux/hash.h>: Add support for architecture-specific functions
        fs/namei.c: Improve dcache hash function
        Eliminate bad hash multipliers from hash_32() and  hash_64()
        Change hash_64() return value to 32 bits
        <linux/sunrpc/svcauth.h>: Define hash_str() in terms of hashlen_string()
        fs/namei.c: Add hashlen_string() function
        Pull out string hash to <linux/stringhash.h>
      7e0fb73c
    • h8300: Add <asm/hash.h> · 4684fe95
      George Spelvin authored
      This will improve the performance of hash_32() and hash_64(), but due
      to complete lack of multi-bit shift instructions on H8, performance will
      still be bad in surrounding code.
      
      Designing H8-specific hash algorithms to work around that is a separate
      project.  (But if the maintainers would like to get in touch...)
      Signed-off-by: George Spelvin <linux@sciencehorizons.net>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: uclinux-h8-devel@lists.sourceforge.jp
      4684fe95
    • microblaze: Add <asm/hash.h> · 7b13277b
      George Spelvin authored
      Microblaze is an FPGA soft core that can be configured various ways.
      
      If it is configured without a multiplier, the standard __hash_32()
      will require a call to __mulsi3, which is a slow software loop.
      
      Instead, use a shift-and-add sequence for the constant multiply.
      GCC knows how to do this, but it's not as clever as some.
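
To illustrate the idea of replacing a constant multiply with shifts and adds, here is a naive sketch. It is not the sequence used by the patch: it is derived directly from the binary expansion of 0x61C88647 (the GOLDEN_RATIO_32 value quoted in the m68k commit in this log), with runs of set bits collapsed into one subtraction. The function name is hypothetical.

```c
#include <stdint.h>

#define GOLDEN_RATIO_32 0x61C88647u

/*
 * Multiply by 0x61C88647 using only shifts, adds and subtracts.
 * Set bits of the constant: 0-2, 6, 9, 10, 15, 19, 22-24, 29, 30.
 * All arithmetic is unsigned and wraps modulo 2^32.
 */
static uint32_t mul_golden_shift_add(uint32_t x)
{
	uint32_t r;

	r  = (x << 3) - x;		/* bits 0..2:   8x - x          */
	r += x << 6;			/* bit 6                        */
	r += (x << 9) + (x << 10);	/* bits 9, 10                   */
	r += x << 15;			/* bit 15                       */
	r += x << 19;			/* bit 19                       */
	r += (x << 25) - (x << 22);	/* bits 22..24: 2^25 - 2^22     */
	r += (x << 29) + (x << 30);	/* bits 29, 30                  */
	return r;
}
```

A tuned shift-add chain would reuse intermediate values to cut the operation count well below this one; the sketch only shows that the result matches an ordinary multiply.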
      Signed-off-by: George Spelvin <linux@sciencehorizons.net>
      Cc: Alistair Francis <alistair.francis@xilinx.com>
      Cc: Michal Simek <michal.simek@xilinx.com>
      7b13277b
    • m68k: Add <asm/hash.h> · 14c44b95
      George Spelvin authored
      This provides a multiply by constant GOLDEN_RATIO_32 = 0x61C88647
      for the original mc68000, which lacks a 32x32-bit multiply instruction.
      
      Yes, the amount of optimization effort put in is excessive. :-)
      
      Shift-add chain found by Yevgen Voronenko's Hcub algorithm at
      http://spiral.ece.cmu.edu/mcm/gen.html
      Signed-off-by: George Spelvin <linux@sciencehorizons.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Andreas Schwab <schwab@linux-m68k.org>
      Cc: Philippe De Muyter <phdm@macq.eu>
      Cc: linux-m68k@lists.linux-m68k.org
      14c44b95
    • <linux/hash.h>: Add support for architecture-specific functions · 468a9428
      George Spelvin authored
      This is just the infrastructure; there are no users yet.
      
      This is modelled on CONFIG_ARCH_RANDOM; a CONFIG_ symbol declares
      the existence of <asm/hash.h>.
      
      That file may define its own versions of various functions, and define
      HAVE_* symbols (no CONFIG_ prefix!) to suppress the generic ones.
      
      Included is a self-test (in lib/test_hash.c) that verifies the basics.
      It is NOT in general required that the arch-specific functions compute
      the same thing as the generic, but if a HAVE_* symbol is defined with
      the value 1, then equality is tested.
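
The override mechanism described above can be sketched in plain C. The names here (`HAVE_ARCH_HASH32`, `hash32`, `hash32_generic`, `hash32_selftest`) are made up for illustration and are not the kernel's identifiers; the "arch" version is just the same multiply written differently so that there is something to compare.

```c
#include <stdint.h>

#define GOLDEN_RATIO_32 0x61C88647u

/* --- Stand-in for an arch header (hypothetical) ------------------- */
/* Defined with value 1: claims to compute the same result as generic. */
#define HAVE_ARCH_HASH32 1
static inline uint32_t hash32(uint32_t val)
{
	/* 0x61C88000 + 0x647 == 0x61C88647, split only to differ in form. */
	return val * 0x61C88000u + val * 0x647u;
}

/* --- Generic header ------------------------------------------------ */
/* The generic version keeps its own name so a self-test can call it. */
static inline uint32_t hash32_generic(uint32_t val)
{
	return val * GOLDEN_RATIO_32;
}

/* Fall back to the generic version unless the arch supplied its own. */
#ifndef HAVE_ARCH_HASH32
#define hash32 hash32_generic
#endif

/* --- Self-test: compare only when the symbol has the value 1 ------- */
static int hash32_selftest(void)
{
#if defined(HAVE_ARCH_HASH32) && HAVE_ARCH_HASH32 == 1
	uint32_t i;

	for (i = 0; i < 1000; i++) {
		uint32_t v = i * 2654435761u;	/* arbitrary spread of inputs */

		if (hash32(v) != hash32_generic(v))
			return -1;
	}
#endif
	return 0;
}
```

An arch that deliberately computes a different hash would define the symbol with some other value, and the equality check would then be compiled out.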
      Signed-off-by: George Spelvin <linux@sciencehorizons.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greg Ungerer <gerg@linux-m68k.org>
      Cc: Andreas Schwab <schwab@linux-m68k.org>
      Cc: Philippe De Muyter <phdm@macq.eu>
      Cc: linux-m68k@lists.linux-m68k.org
      Cc: Alistair Francis <alistai@xilinx.com>
      Cc: Michal Simek <michal.simek@xilinx.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: uclinux-h8-devel@lists.sourceforge.jp
      468a9428
    • fs/namei.c: Improve dcache hash function · 2a18da7a
      George Spelvin authored
      Patch 0fed3ac8 improved the hash mixing, but the function is slower
      than necessary; there's a 7-instruction dependency chain (10 on x86)
      each loop iteration.
      
      Word-at-a-time access is a very tight loop (which is good, because
      link_path_walk() is one of the hottest code paths in the entire kernel),
      and the hash mixing function must not have a longer latency to avoid
      slowing it down.
      
      There do not appear to be any published fast hash functions that:
      1) Operate on the input a word at a time, and
      2) Don't need to know the length of the input beforehand, and
      3) Have a single iterated mixing function, not needing conditional
         branches or unrolling to distinguish different loop iterations.
      
      One of the algorithms which comes closest is Yann Collet's xxHash, but
      that's two dependent multiplies per word, which is too much.
      
      The key insights in this design are:
      
      1) Barring expensive ops like multiplies, to diffuse one input bit
         across 64 bits of hash state takes at least log2(64) = 6 sequentially
         dependent instructions.  That is more cycles than we'd like.
      2) An operation like "hash ^= hash << 13" requires a second temporary
         register anyway, and on a 2-operand machine like x86, it's three
         instructions.
      3) A better use of a second register is to hold a two-word hash state.
         With careful design, no temporaries are needed at all, so it doesn't
         increase register pressure.  And this gets rid of register copying
         on 2-operand machines, so the code is smaller and faster.
      4) Using two words of state weakens the requirement for one-round mixing;
         we now have two rounds of mixing before cancellation is possible.
      5) A two-word hash state also allows operations on both halves to be
         done in parallel, so on a superscalar processor we get more mixing
         in fewer cycles.
      
      I ended up using a mixing function inspired by the ChaCha and Speck
      round functions.  It is 6 simple instructions and 3 cycles per iteration
      (assuming multiply by 9 can be done by an "lea" instruction):
      
      		x ^= *input++;
      	y ^= x;	x = ROL(x, K1);
      	x += y;	y = ROL(y, K2);
      	y *= 9;
      
      Not only is this reversible, two consecutive rounds are reversible:
      if you are given the initial and final states, but not the intermediate
      state, it is possible to compute both input words.  This means that at
      least 3 words of input are required to create a collision.
      
      (It also has the property, used by hash_name() to avoid a branch, that
      it hashes all-zero to all-zero.)
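
The single-round properties claimed above can be checked with a small userspace model. This is a sketch under assumptions: the message does not state K1 and K2, so the rotate amounts below are placeholders chosen for illustration, and `mix()`/`unmix()` are hypothetical names. It demonstrates that one round is invertible when the input word is known and that all-zero state plus all-zero input stays all-zero; it does not attempt the two-round input recovery.

```c
#include <stdint.h>

/* Placeholder rotate amounts; the commit message does not give K1/K2. */
#define K1 12
#define K2 45

/* Multiplicative inverse of 9 modulo 2^64: 9 * INV9 == 1. */
#define INV9 0x8E38E38E38E38E39ull

static inline uint64_t rol64(uint64_t v, unsigned int s)
{
	return (v << s) | (v >> (64 - s));	/* valid for 0 < s < 64 */
}

static inline uint64_t ror64(uint64_t v, unsigned int s)
{
	return (v >> s) | (v << (64 - s));	/* valid for 0 < s < 64 */
}

struct hash_state {
	uint64_t x, y;
};

/* One round of the mixing function quoted in the commit message. */
static void mix(struct hash_state *s, uint64_t input)
{
	s->x ^= input;
	s->y ^= s->x;	s->x = rol64(s->x, K1);
	s->x += s->y;	s->y = rol64(s->y, K2);
	s->y *= 9;
}

/* Undo one round, given the input word that was mixed in. */
static void unmix(struct hash_state *s, uint64_t input)
{
	s->y *= INV9;
	s->y = ror64(s->y, K2);
	s->x -= s->y;
	s->x = ror64(s->x, K1);
	s->y ^= s->x;
	s->x ^= input;
}
```

Each step of `mix()` is a bijection on the 128-bit state (XOR, rotate, add, multiply by an odd constant), which is why `unmix()` can simply replay the inverse steps in reverse order.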
      
      The rotate constants K1 and K2 were found by experiment.  The search took
      a sample of random initial states (I used 1023) and considered the effect
      of flipping each of the 64 input bits on each of the 128 output bits two
      rounds later.  Each of the 8192 pairs can be considered a biased coin, and
      adding up the Shannon entropy of all of them produces a score.
      
      The best-scoring shifts also did well in other tests (flipping bits in y,
      trying 3 or 4 rounds of mixing, flipping all 64*63/2 pairs of input bits),
      so the choice was made with the additional constraint that the sum of the
      shifts is odd and not too close to the word size.
      
      The final state is then folded into a 32-bit hash value by a less carefully
      optimized multiply-based scheme.  This also has to be fast, as pathname
      components tend to be short (the most common case is one iteration!), but
      there's some room for latency, as there is a fair bit of intervening logic
      before the hash value is used for anything.
      
      (Performance verified with "bonnie++ -s 0 -n 1536:-2" on tmpfs.  I need
      a better benchmark; the numbers seem to show a slight dip in performance
      between 4.6.0 and this patch, but they're too noisy to quote.)
      
      Special thanks to Bruce Fields for diligent testing which uncovered a
      nasty fencepost error in an earlier version of this patch.
      
      [checkpatch.pl formatting complaints noted and respectfully disagreed with.]
      Signed-off-by: George Spelvin <linux@sciencehorizons.net>
      Tested-by: J. Bruce Fields <bfields@redhat.com>
      2a18da7a