1. 14 Oct, 2014 1 commit
    • Prarit Bhargava's avatar
      modules, lock around setting of MODULE_STATE_UNFORMED · d3051b48
      Prarit Bhargava authored
      A panic was seen in the following sitation.
      
      There are two threads running on the system. The first thread is a system
      monitoring thread that is reading /proc/modules. The second thread is
      loading and unloading a module (in this example I'm using my simple
      dummy-module.ko).  Note, in the "real world" this occurred with the qlogic
      driver module.
      
      When doing this, the following panic occurred:
      
       ------------[ cut here ]------------
       kernel BUG at kernel/module.c:3739!
       invalid opcode: 0000 [#1] SMP
       Modules linked in: binfmt_misc sg nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw igb gf128mul glue_helper iTCO_wdt iTCO_vendor_support ablk_helper ptp sb_edac cryptd pps_core edac_core shpchp i2c_i801 pcspkr wmi lpc_ich ioatdma mfd_core dca ipmi_si nfsd ipmi_msghandler auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c sr_mod cdrom sd_mod crc_t10dif crct10dif_common mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ttm isci drm libsas ahci libahci scsi_transport_sas libata i2c_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: dummy_module]
       CPU: 37 PID: 186343 Comm: cat Tainted: GF          O--------------   3.10.0+ #7
       Hardware name: Intel Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.00.29.D696.1311111329 11/11/2013
       task: ffff8807fd2d8000 ti: ffff88080fa7c000 task.ti: ffff88080fa7c000
       RIP: 0010:[<ffffffff810d64c5>]  [<ffffffff810d64c5>] module_flags+0xb5/0xc0
       RSP: 0018:ffff88080fa7fe18  EFLAGS: 00010246
       RAX: 0000000000000003 RBX: ffffffffa03b5200 RCX: 0000000000000000
       RDX: 0000000000001000 RSI: ffff88080fa7fe38 RDI: ffffffffa03b5000
       RBP: ffff88080fa7fe28 R08: 0000000000000010 R09: 0000000000000000
       R10: 0000000000000000 R11: 000000000000000f R12: ffffffffa03b5000
       R13: ffffffffa03b5008 R14: ffffffffa03b5200 R15: ffffffffa03b5000
       FS:  00007f6ae57ef740(0000) GS:ffff88101e7a0000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000000404f70 CR3: 0000000ffed48000 CR4: 00000000001407e0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
       Stack:
        ffffffffa03b5200 ffff8810101e4800 ffff88080fa7fe70 ffffffff810d666c
        ffff88081e807300 000000002e0f2fbf 0000000000000000 ffff88100f257b00
        ffffffffa03b5008 ffff88080fa7ff48 ffff8810101e4800 ffff88080fa7fee0
       Call Trace:
        [<ffffffff810d666c>] m_show+0x19c/0x1e0
        [<ffffffff811e4d7e>] seq_read+0x16e/0x3b0
        [<ffffffff812281ed>] proc_reg_read+0x3d/0x80
        [<ffffffff811c0f2c>] vfs_read+0x9c/0x170
        [<ffffffff811c1a58>] SyS_read+0x58/0xb0
        [<ffffffff81605829>] system_call_fastpath+0x16/0x1b
       Code: 48 63 c2 83 c2 01 c6 04 03 29 48 63 d2 eb d9 0f 1f 80 00 00 00 00 48 63 d2 c6 04 13 2d 41 8b 0c 24 8d 50 02 83 f9 01 75 b2 eb cb <0f> 0b 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41
       RIP  [<ffffffff810d64c5>] module_flags+0xb5/0xc0
        RSP <ffff88080fa7fe18>
      
          Consider the two processes running on the system.
      
          CPU 0 (/proc/modules reader)
          CPU 1 (loading/unloading module)
      
          CPU 0 opens /proc/modules, and starts displaying data for each module by
          traversing the modules list via fs/seq_file.c:seq_open() and
          fs/seq_file.c:seq_read().  For each module in the modules list, seq_read
          does
      
                  op->start()  <-- this is a pointer to m_start()
                  op->show()   <- this is a pointer to m_show()
                  op->stop()   <-- this is a pointer to m_stop()
      
          The m_start(), m_show(), and m_stop() module functions are defined in
          kernel/module.c. The m_start() and m_stop() functions acquire and release
          the module_mutex respectively.
      
          ie) When reading /proc/modules, the module_mutex is acquired and released
          for each module.
      
          m_show() is called with the module_mutex held.  It accesses the module
          struct data and attempts to write out module data.  It is in this code
          path that the above BUG_ON() warning is encountered, specifically m_show()
          calls
      
          static char *module_flags(struct module *mod, char *buf)
          {
                  int bx = 0;
      
                  BUG_ON(mod->state == MODULE_STATE_UNFORMED);
          ...
      
          The other thread, CPU 1, in unloading the module calls the syscall
          delete_module() defined in kernel/module.c.  The module_mutex is acquired
          for a short time, and then released.  free_module() is called without the
          module_mutex.  free_module() then sets mod->state = MODULE_STATE_UNFORMED,
          also without the module_mutex.  Some additional code is called and then the
          module_mutex is reacquired to remove the module from the modules list:
      
              /* Now we can delete it from the lists */
              mutex_lock(&module_mutex);
              stop_machine(__unlink_module, mod, NULL);
              mutex_unlock(&module_mutex);
      
      This is the sequence of events that leads to the panic.
      
      CPU 1 is removing dummy_module via delete_module().  It acquires the
      module_mutex, and then releases it.  CPU 1 has NOT set dummy_module->state to
      MODULE_STATE_UNFORMED yet.
      
      CPU 0, which is reading the /proc/modules, acquires the module_mutex and
      acquires a pointer to the dummy_module which is still in the modules list.
      CPU 0 calls m_show for dummy_module.  The check in m_show() for
      MODULE_STATE_UNFORMED passed for dummy_module even though it is being
      torn down.
      
      Meanwhile CPU 1, which has been continuing to remove dummy_module without
      holding the module_mutex, now calls free_module() and sets
      dummy_module->state to MODULE_STATE_UNFORMED.
      
      CPU 0 now calls module_flags() with dummy_module and ...
      
      static char *module_flags(struct module *mod, char *buf)
      {
              int bx = 0;
      
              BUG_ON(mod->state == MODULE_STATE_UNFORMED);
      
      and BOOM.
      
      Acquire and release the module_mutex lock around the setting of
      MODULE_STATE_UNFORMED in the teardown path, which should resolve the
      problem.
      
      Testing: In the unpatched kernel I can panic the system within 1 minute by
      doing
      
      while (true) do insmod dummy_module.ko; rmmod dummy_module.ko; done
      
      and
      
      while (true) do cat /proc/modules; done
      
      in separate terminals.
      
      In the patched kernel I was able to run just over one hour without seeing
      any issues.  I also verified the output of panic via sysrq-c and the output
      of /proc/modules looks correct for all three states for the dummy_module.
      
              dummy_module 12661 0 - Unloading 0xffffffffa03a5000 (OE-)
              dummy_module 12661 0 - Live 0xffffffffa03bb000 (OE)
              dummy_module 14015 1 - Loading 0xffffffffa03a5000 (OE+)
      Signed-off-by: default avatarPrarit Bhargava <prarit@redhat.com>
      Reviewed-by: default avatarOleg Nesterov <oleg@redhat.com>
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      Cc: stable@kernel.org
      d3051b48
  2. 11 Sep, 2014 1 commit
    • Mark Rustad's avatar
      moduleparam: Resolve missing-field-initializer warning · 184c3fc3
      Mark Rustad authored
      Resolve a missing-field-initializer warning, that is produced
      by every reference to module_param_call, by using designated
      initialization for the first field. That is enough to silence
      the complaint.
      
      The message is only seen when doing a W=2 build. I happened to be using gcc
      4.8.3, but I think most versions would produce the warning when it is
      enabled. It can either be silenced by using even a single designated
      initializer as I did here, or providing values for all of the fields. Because
      of the number of references to the macro, this change silences many warnings
      in W=2 builds.
      
      One instance of the full warning message looks like this:
      
      /home/share/git/nn-mdr/include/linux/moduleparam.h:198:16: warning: missing
      initializer for field ‘free’ of ‘struct kernel_param_ops’
      [-Wmissing-field-initializers]
        static struct kernel_param_ops __param_ops_##name =  \
      		  ^
      /home/share/git/nn-mdr/fs/fuse/inode.c:35:1: note: in expansion of macro
      ‘module_param_call’
       module_param_call(max_user_bgreq, set_global_limit, param_get_uint,
       ^
      /home/share/git/nn-mdr/include/linux/moduleparam.h:56:9: note: ‘free’
      declared here
        void (*free)(void *arg);
      Signed-off-by: default avatarMark Rustad <mark.d.rustad@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      184c3fc3
  3. 27 Aug, 2014 10 commits
  4. 26 Aug, 2014 3 commits
  5. 25 Aug, 2014 5 commits
    • Linus Torvalds's avatar
      Linux 3.17-rc2 · 52addcf9
      Linus Torvalds authored
      52addcf9
    • Linus Torvalds's avatar
      Merge tag 'nfs-for-3.17-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · f01bfc97
      Linus Torvalds authored
      Pull NFS client fixes from Trond Myklebust:
       "Highlights:
      
         - more fixes for read/write codepath regressions
           * sleeping while holding the inode lock
           * stricter enforcement of page contiguity when coalescing requests
           * fix up error handling in the page coalescing code
      
         - don't busy wait on SIGKILL in the file locking code"
      
      * tag 'nfs-for-3.17-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
        nfs: Don't busy-wait on SIGKILL in __nfs_iocounter_wait
        nfs: can_coalesce_requests must enforce contiguity
        nfs: disallow duplicate pages in pgio page vectors
        nfs: don't sleep with inode lock in lock_and_join_requests
        nfs: fix error handling in lock_and_join_requests
        nfs: use blocking page_group_lock in add_request
        nfs: fix nonblocking calls to nfs_page_group_lock
        nfs: change nfs_page_group_lock argument
      f01bfc97
    • Linus Torvalds's avatar
      Merge tag 'renesas-sh-drivers-for-v3.17' of... · dd5957b7
      Linus Torvalds authored
      Merge tag 'renesas-sh-drivers-for-v3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas
      
      Pull SH driver fix from Simon Horman:
       "Confine SH_INTC to platforms that need it"
      
      * tag 'renesas-sh-drivers-for-v3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
        sh: intc: Confine SH_INTC to platforms that need it
      dd5957b7
    • Linus Torvalds's avatar
      Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus · 497c01dd
      Linus Torvalds authored
      Pull MIPS fixes from Ralf Baechle:
       "Pretty much all across the field so with this we should be in
        reasonable shape for the upcoming -rc2"
      
      * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
        MIPS: OCTEON: make get_system_type() thread-safe
        MIPS: CPS: Initialize EVA before bringing up VPEs from secondary cores
        MIPS: Malta: EVA: Rename 'eva_entry' to 'platform_eva_init'
        MIPS: EVA: Add new EVA header
        MIPS: scall64-o32: Fix indirect syscall detection
        MIPS: syscall: Fix AUDIT value for O32 processes on MIPS64
        MIPS: Loongson: Fix COP2 usage for preemptible kernel
        MIPS: NL: Fix nlm_xlp_defconfig build error
        MIPS: Remove race window in page fault handling
        MIPS: Malta: Improve system memory detection for '{e, }memsize' >= 2G
        MIPS: Alchemy: Fix db1200 PSC clock enablement
        MIPS: BCM47XX: Fix reboot problem on BCM4705/BCM4785
        MIPS: Remove duplicated include from numa.c
        MIPS: Add common plat_irq_dispatch declaration
        MIPS: MSP71xx: remove unused plat_irq_dispatch() argument
        MIPS: GIC: Remove useless parens from GICBIS().
        MIPS: perf: Mark pmu interupt IRQF_NO_THREAD
      497c01dd
    • Linus Torvalds's avatar
      Merge tag 'trace-fixes-v3.17-rc1' of... · 01e9982a
      Linus Torvalds authored
      Merge tag 'trace-fixes-v3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
      
      Pull fix for ftrace function tracer/profiler conflict from Steven Rostedt:
       "The rewrite of the ftrace code that makes it possible to allow for
        separate trampolines had a design flaw with the interaction between
        the function and function_graph tracers.
      
        The main flaw was the simplification of the use of multiple tracers
        having the same filter (like function and function_graph, that use the
        set_ftrace_filter file to filter their code).  The design assumed that
        the two tracers could never run simultaneously as only one tracer can
        be used at a time.  The problem with this assumption was that the
        function profiler could be implemented on top of the function graph
        tracer, and the function profiler could run at the same time as the
        function tracer.  This caused the assumption to be broken and when
        ftrace detected this failed assumpiton it would spit out a nasty
        warning and shut itself down.
      
        Instead of using a single ftrace_ops that switches between the
        function and function_graph callbacks, the two tracers can again use
        their own ftrace_ops.  But instead of having a complex hierarchy of
        ftrace_ops, the filter fields are placed in its own structure and the
        ftrace_ops can carefully use the same filter.  This change took a bit
        to be able to allow for this and currently only the global_ops can
        share the same filter, but this new design can easily be modified to
        allow for any ftrace_ops to share its filter with another ftrace_ops.
      
        The first four patches deal with the change of allowing the ftrace_ops
        to share the filter (and this needs to go to 3.16 as well).
      
        The fifth patch fixes a bug that was also caused by the new changes
        but only for archs other than x86, and only if those archs implement a
        direct call to the function_graph tracer which they do not do yet but
        will in the future.  It does not need to go to stable, but needs to be
        fixed before the other archs update their code to allow direct calls
        to the function_graph trampoline"
      
      * tag 'trace-fixes-v3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        ftrace: Use current addr when converting to nop in __ftrace_replace_code()
        ftrace: Fix function_profiler and function tracer together
        ftrace: Fix up trampoline accounting with looping on hash ops
        ftrace: Update all ftrace_ops for a ftrace_hash_ops update
        ftrace: Allow ftrace_ops to use the hashes from other ops
      01e9982a
  6. 24 Aug, 2014 12 commits
  7. 23 Aug, 2014 5 commits
    • Heiko Stuebner's avatar
      MAINTAINERS: add new Rockchip SoC list · 00250b52
      Heiko Stuebner authored
      Add the new list that Rockchip-specific patches should also be directed to.
      Signed-off-by: default avatarHeiko Stuebner <heiko@sntech.de>
      00250b52
    • Heiko Stuebner's avatar
      ARM: dts: rockchip: readd missing mmc0 pinctrl settings · 1302d32c
      Heiko Stuebner authored
      During the restructuring of the Rockchip Cortex-A9 dtsi files it seems
      like the pinctrl settings vanished at some point from the mmc0 support.
      
      This of course renders them unusable, so readd the necessary pinctrl
      properties.
      Signed-off-by: default avatarHeiko Stuebner <heiko@sntech.de>
      1302d32c
    • Olof Johansson's avatar
      Merge tag 'sunxi-dt-for-3.17-2' of... · 2136edf3
      Olof Johansson authored
      Merge tag 'sunxi-dt-for-3.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux into fixes
      
      Merge "Allwinner DT changes, take 2" from Maxime Ripard:
      
      Only a single patch in here that fixes a DTC warning.
      
      * tag 'sunxi-dt-for-3.17-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux:
        ARM: dt: sun6i: Add #address-cells and #size-cells to i2c controller nodes
      Signed-off-by: default avatarOlof Johansson <olof@lixom.net>
      2136edf3
    • Steven Rostedt (Red Hat)'s avatar
      ftrace: Use current addr when converting to nop in __ftrace_replace_code() · 39b5552c
      Steven Rostedt (Red Hat) authored
      In __ftrace_replace_code(), when converting the call to a nop in a function
      it needs to compare against the "curr" (current) value of the ftrace ops, and
      not the "new" one. It currently does not affect x86 which is the only arch
      to do the trampolines with function graph tracer, but when other archs that do
      depend on this code implement the function graph trampoline, it can crash.
      
      Here's an example when ARM uses the trampolines (in the future):
      
       ------------[ cut here ]------------
       WARNING: CPU: 0 PID: 9 at kernel/trace/ftrace.c:1716 ftrace_bug+0x17c/0x1f4()
       Modules linked in: omap_rng rng_core ipv6
       CPU: 0 PID: 9 Comm: migration/0 Not tainted 3.16.0-test-10959-gf0094b28-dirty #52
       [<c02188f4>] (unwind_backtrace) from [<c021343c>] (show_stack+0x20/0x24)
       [<c021343c>] (show_stack) from [<c095a674>] (dump_stack+0x78/0x94)
       [<c095a674>] (dump_stack) from [<c02532a0>] (warn_slowpath_common+0x7c/0x9c)
       [<c02532a0>] (warn_slowpath_common) from [<c02532ec>] (warn_slowpath_null+0x2c/0x34)
       [<c02532ec>] (warn_slowpath_null) from [<c02cbac4>] (ftrace_bug+0x17c/0x1f4)
       [<c02cbac4>] (ftrace_bug) from [<c02cc44c>] (ftrace_replace_code+0x80/0x9c)
       [<c02cc44c>] (ftrace_replace_code) from [<c02cc658>] (ftrace_modify_all_code+0xb8/0x164)
       [<c02cc658>] (ftrace_modify_all_code) from [<c02cc718>] (__ftrace_modify_code+0x14/0x1c)
       [<c02cc718>] (__ftrace_modify_code) from [<c02c7244>] (multi_cpu_stop+0xf4/0x134)
       [<c02c7244>] (multi_cpu_stop) from [<c02c6e90>] (cpu_stopper_thread+0x54/0x130)
       [<c02c6e90>] (cpu_stopper_thread) from [<c0271cd4>] (smpboot_thread_fn+0x1ac/0x1bc)
       [<c0271cd4>] (smpboot_thread_fn) from [<c026ddf0>] (kthread+0xe0/0xfc)
       [<c026ddf0>] (kthread) from [<c020f318>] (ret_from_fork+0x14/0x20)
       ---[ end trace dc9ce72c5b617d8f ]---
      [   65.047264] ftrace failed to modify [<c0208580>] asm_do_IRQ+0x10/0x1c
      [   65.054070]  actual: 85:1b:00:eb
      
      Fixes: 7413af1f "ftrace: Make get_ftrace_addr() and get_ftrace_addr_old() global"
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      39b5552c
    • Steven Rostedt (Red Hat)'s avatar
      ftrace: Fix function_profiler and function tracer together · 5f151b24
      Steven Rostedt (Red Hat) authored
      The latest rewrite of ftrace removed the separate ftrace_ops of
      the function tracer and the function graph tracer and had them
      share the same ftrace_ops. This simplified the accounting by removing
      the multiple layers of functions called, where the global_ops func
      would call a special list that would iterate over the other ops that
      were registered within it (like function and function graph), which
      itself was registered to the ftrace ops list of all functions
      currently active. If that sounds confusing, the code that implemented
      it was also confusing and its removal is a good thing.
      
      The problem with this change was that it assumed that the function
      and function graph tracer can never be used at the same time.
      This is mostly true, but there is an exception. That is when the
      function profiler uses the function graph tracer to profile.
      The function profiler can be activated the same time as the function
      tracer, and this breaks the assumption and the result is that ftrace
      will crash (it detects the error and shuts itself down, it does not
      cause a kernel oops).
      
      To solve this issue, a previous change allowed the hash tables
      for the functions traced by a ftrace_ops to be a pointer and let
      multiple ftrace_ops share the same hash. This allows the function
      and function_graph tracer to have separate ftrace_ops, but still
      share the hash, which is what is done.
      
      Now the function and function graph tracers have separate ftrace_ops
      again, and the function tracer can be run while the function_profile
      is active.
      
      Cc: stable@vger.kernel.org # 3.16 (apply after 3.17-rc4 is out)
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      5f151b24
  8. 22 Aug, 2014 3 commits