1. 22 Jul, 2014 1 commit
    • Don't trigger congestion wait on dirty-but-not-writeout pages · e43bbc2c
      Linus Torvalds authored
      commit b738d764 upstream.
      
      shrink_inactive_list() used to wait 0.1s to avoid congestion when all
      the pages that were isolated from the inactive list were dirty but not
      under active writeback.  That makes no real sense, and apparently causes
      major interactivity issues under some loads since 3.11.
      
      The ostensible reason for it was to wait for kswapd to start writing
      pages, but that seems questionable as well, since the congestion wait
      code seems to trigger for kswapd itself as well.  Also, the logic behind
      delaying anything when we haven't actually started writeback is not
      clear - it only delays actually starting that writeback.
      
      We'll still trigger the congestion waiting if
      
       (a) the process is kswapd, and we hit pages flagged for immediate
           reclaim
      
       (b) the process is not kswapd, and the zone backing dev writeback is
           actually congested.
      
      This probably needs to be revisited, but as it is this fixes a reported
      regression.
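
The two remaining cases, (a) and (b) above, can be read as a small predicate. What follows is a minimal Python model of that decision, not the kernel's C code; the function and parameter names are hypothetical stand-ins for the state that shrink_inactive_list() inspects.

```python
def should_congestion_wait(is_kswapd: bool,
                           saw_immediate_reclaim_pages: bool,
                           backing_dev_congested: bool) -> bool:
    """Model of when reclaim still stalls after this change.

    All three parameters are illustrative names, not kernel identifiers.
    """
    if is_kswapd:
        # (a) kswapd waits only when it hit pages flagged for immediate reclaim.
        return saw_immediate_reclaim_pages
    # (b) direct reclaim waits only when writeback is actually congested.
    return backing_dev_congested
```

Note that "all isolated pages were dirty but not under writeback" is no longer an input at all, which is the point of the change.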
      
      [mhocko@suse.cz: backport to 3.12 stable tree]
      Fixes: e2be15f6 ('mm: vmscan: stall page reclaim and writeback pages based on dirty/writepage pages encountered')
      Reported-by: Felipe Contreras <felipe.contreras@gmail.com>
      Pinpointed-by: Hillf Danton <dhillf@gmail.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Mel Gorman <mgorman@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Michal Hocko <mhocko@suse.cz>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
  2. 18 Jul, 2014 39 commits
    • Linux 3.12.25 · 0b8e8118
      Jiri Slaby authored
    • ACPI / battery: Retry to get battery information if failed during probing · b72ab222
      Lan Tianyu authored
      commit 75646e75 upstream.
      
      On some machines (e.g. Lenovo Z480) the EC is not stable during boot,
      so the battery driver sometimes fails to load because it cannot get
      battery information from the EC. After several retries the operation
      succeeds. This patch retries getting the battery information 5
      times if the first try fails.
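
The retry idea can be sketched as a bounded loop. This is a hedged Python model rather than the driver code: `probe_with_retries`, `get_info`, and `delay_s` are hypothetical names, and treating "5 times" as five attempts in total is an assumption.

```python
import time

def probe_with_retries(get_info, max_attempts=5, delay_s=0.0):
    """Call get_info() until it returns 0 or max_attempts is used up.

    get_info is a hypothetical callable returning 0 on success and a
    negative error code on failure, mirroring kernel conventions.
    The attempt count and the delay are assumptions for illustration.
    """
    result = -1
    for _ in range(max_attempts):
        result = get_info()
        if result == 0:
            break
        time.sleep(delay_s)  # give the EC time to settle before retrying
    return result
```

If every attempt fails, the last error code is returned unchanged, so the caller still sees why probing failed.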
      
      [ backport to 3.14.5: removed second parameter in acpi_battery_update(),
      introduced by the commit 9e50bc14 (ACPI /
      battery: Accelerate battery resume callback)]
      
      [naszar <naszar@ya.ru>: backport to 3.14.5]
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=75581
      Reported-and-tested-by: naszar <naszar@ya.ru>
      Cc: All applicable <stable@vger.kernel.org>
      Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • x86, ioremap: Speed up check for RAM pages · d30ab279
      Roland Dreier authored
      commit c81c8a1e upstream.
      
      In __ioremap_caller() (the guts of ioremap), we loop over the range of
      pfns being remapped and check each one individually with page_is_ram().
      For large ioremaps, this can be very slow.  For example, we have a
      device with a 256 GiB PCI BAR, and ioremapping this BAR can take 20+
      seconds -- sometimes long enough to trigger the soft lockup detector!
      
      Internally, page_is_ram() calls walk_system_ram_range() on a single
      page.  Instead, we can make a single call to walk_system_ram_range()
      from __ioremap_caller(), and do our further checks only for any RAM
      pages that we find.  For the common case of MMIO, this saves an enormous
      amount of work, since the range being ioremapped doesn't intersect
      system RAM at all.
      
      With this change, ioremap on our 256 GiB BAR takes less than 1 second.
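
To illustrate why one range walk beats a per-page check, here is a small Python model. It is a sketch under stated assumptions, not the x86 code: RAM is represented as a hypothetical list of half-open (start, end) pfn ranges, and both function names are made up for the example.

```python
def is_ram_per_page(pfn, ram_ranges):
    """Old approach: one lookup per page frame number (pfn)."""
    return any(start <= pfn < end for start, end in ram_ranges)

def ram_pages_in_range(start_pfn, nr_pages, ram_ranges):
    """New approach: intersect the whole request with each RAM range once.

    ram_ranges is a hypothetical list of half-open (start, end) pfn pairs.
    Returns the sub-ranges of the request that overlap RAM.
    """
    end_pfn = start_pfn + nr_pages
    hits = []
    for start, end in ram_ranges:
        lo, hi = max(start, start_pfn), min(end, end_pfn)
        if lo < hi:
            hits.append((lo, hi))
    return hits
```

In this model the per-page cost is proportional to the number of pages being remapped, while the range intersection depends only on the number of RAM ranges. For an MMIO region that does not touch RAM, the result is an empty list and no per-page work remains.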
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      Link: http://lkml.kernel.org/r/1399054721-1331-1-git-send-email-roland@kernel.org
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • powerpc: Disable RELOCATABLE for COMPILE_TEST with PPC64 · 8310f53f
      Guenter Roeck authored
      commit fb43e847 upstream.
      
      powerpc:allmodconfig has been failing for some time with the following
      error.
      
      arch/powerpc/kernel/exceptions-64s.S: Assembler messages:
      arch/powerpc/kernel/exceptions-64s.S:1312: Error: attempt to move .org backwards
      make[1]: *** [arch/powerpc/kernel/head_64.o] Error 1
      
      A number of attempts to fix the problem by moving around code have been
      unsuccessful and resulted in failed builds for some configurations and
      the discovery of toolchain bugs.
      
      Fix the problem by disabling RELOCATABLE for COMPILE_TEST builds instead.
      While this is less than perfect, it avoids substantial code changes
      which would otherwise be necessary just to make COMPILE_TEST builds
      happy and might have undesired side effects.
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • ring-buffer: Check if buffer exists before polling · ca846a7a
      Steven Rostedt (Red Hat) authored
      commit 8b8b3683 upstream.
      
      The per_cpu buffers are created one per possible CPU. But this does
      not mean that those CPUs are online, or that they even exist.

      The ring buffer polling code assumes that the caller polls on an
      existing buffer. But this is not the case if the user reads
      trace_pipe from a CPU that does not exist, and this causes the
      kernel to crash.

      The simple fix is to check the cpu against the buffer bitmask to see
      if the buffer was allocated or not, and return -ENODEV if it is
      not.
      
      More updates were done to pass the -ENODEV back up to userspace.
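
The bitmask check can be modelled in a few lines of Python. This is only a sketch: `poll_cpu_buffer` and `allocated_mask` are hypothetical names, and the integer mask stands in for the kernel's cpumask of allocated buffers.

```python
import errno

def poll_cpu_buffer(cpu, allocated_mask):
    """Return 0 if a buffer exists for cpu, else -ENODEV.

    allocated_mask is a hypothetical integer bitmask with bit N set when
    a per-CPU buffer was allocated for CPU N.
    """
    if cpu < 0 or not (allocated_mask >> cpu) & 1:
        return -errno.ENODEV
    return 0  # the real code would go on to wait on the buffer here
```

Checking before touching the buffer turns a crash into an ordinary error code that can be passed back to the caller.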
      
      Link: http://lkml.kernel.org/r/5393DB61.6060707@oracle.com
      Reported-by: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • DMA, CMA: fix possible memory leak · 378005bb
      Joonsoo Kim authored
      commit fe8eea4f upstream.
      
      We should free the bitmap memory when we find a zone mismatch;
      otherwise this memory will leak.

      Additionally, copy the code comment from PPC KVM's CMA code to explain
      why we need to check for a zone mismatch.
      Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Acked-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
      Reviewed-by: Michal Nazarewicz <mina86@mina86.com>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Acked-by: Minchan Kim <minchan@kernel.org>
      Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Michal Nazarewicz <mina86@mina86.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: Alexander Graf <agraf@suse.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • drm/i915: Don't clobber the GTT when it's within stolen memory · f29c738c
      Ville Syrjälä authored
      commit f1e1c212 upstream.
      
      On most gen2-4 platforms the GTT can be (or maybe always is?)
      inside the stolen memory region. If that's the case, reduce the
      size of the stolen memory appropriately to make sure we
      don't clobber the GTT.
      
      v2: Deal with gen4 36 bit physical address
      Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80151
      Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • drm/radeon: stop poisoning the GART TLB · d5a2e149
      Christian König authored
      commit 0986c1a5 upstream.
      
      When we set the valid bit on invalid GART entries they are
      loaded into the TLB when an adjacent entry is loaded. This
      poisons the TLB with invalid entries which are sometimes
      not correctly removed on TLB flush.
      
      For stable inclusion the patch probably needs to be modified a bit.
      Signed-off-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • drm/radeon: fix typo in golden register setup on evergreen · 70db0157
      Alex Deucher authored
      commit 6abafb78 upstream.
      
      Fixes hangs on driver load on some cards.
      
      bug:
      https://bugs.freedesktop.org/show_bug.cgi?id=76998
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • drm/radeon: fix typo in ci_stop_dpm() · d52023d8
      Alex Deucher authored
      commit ed963771 upstream.
      
      Need to use the RREG32_SMC() accessor since the register
      is an smc indirect index.
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • drm/radeon/dpm: Reenabling SS on Cayman · d5d31b71
      Alexandre Demers authored
      commit 41959341 upstream.
      
      It reverts commit c745fe61 now that
      Cayman is stable since VDDCI fix. Spread spectrum was not the culprit.
      
      This depends on b0880e87
      (drm/radeon/dpm: fix vddci setup typo on cayman).
      Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • ext4: fix a potential deadlock in __ext4_es_shrink() · e4b456d4
      Theodore Ts'o authored
      commit 3f1f9b85 upstream.
      
      This fixes the following lockdep complaint:
      
      [ INFO: possible circular locking dependency detected ]
      3.16.0-rc2-mm1+ #7 Tainted: G           O
      -------------------------------------------------------
      kworker/u24:0/4356 is trying to acquire lock:
       (&(&sbi->s_es_lru_lock)->rlock){+.+.-.}, at: [<ffffffff81285fff>] __ext4_es_shrink+0x4f/0x2e0
      
      but task is already holding lock:
       (&ei->i_es_lock){++++-.}, at: [<ffffffff81286961>] ext4_es_insert_extent+0x71/0x180
      
      which lock already depends on the new lock.
      
       Possible unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(&ei->i_es_lock);
                                     lock(&(&sbi->s_es_lru_lock)->rlock);
                                     lock(&ei->i_es_lock);
        lock(&(&sbi->s_es_lru_lock)->rlock);
      
       *** DEADLOCK ***
      
      6 locks held by kworker/u24:0/4356:
       #0:  ("writeback"){.+.+.+}, at: [<ffffffff81071d00>] process_one_work+0x180/0x560
       #1:  ((&(&wb->dwork)->work)){+.+.+.}, at: [<ffffffff81071d00>] process_one_work+0x180/0x560
       #2:  (&type->s_umount_key#22){++++++}, at: [<ffffffff811a9c74>] grab_super_passive+0x44/0x90
       #3:  (jbd2_handle){+.+...}, at: [<ffffffff812979f9>] start_this_handle+0x189/0x5f0
       #4:  (&ei->i_data_sem){++++..}, at: [<ffffffff81247062>] ext4_map_blocks+0x132/0x550
       #5:  (&ei->i_es_lock){++++-.}, at: [<ffffffff81286961>] ext4_es_insert_extent+0x71/0x180
      
      stack backtrace:
      CPU: 0 PID: 4356 Comm: kworker/u24:0 Tainted: G           O   3.16.0-rc2-mm1+ #7
      Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
      Workqueue: writeback bdi_writeback_workfn (flush-253:0)
       ffffffff8213dce0 ffff880014b07538 ffffffff815df0bb 0000000000000007
       ffffffff8213e040 ffff880014b07588 ffffffff815db3dd ffff880014b07568
       ffff880014b07610 ffff88003b868930 ffff88003b868908 ffff88003b868930
      Call Trace:
       [<ffffffff815df0bb>] dump_stack+0x4e/0x68
       [<ffffffff815db3dd>] print_circular_bug+0x1fb/0x20c
       [<ffffffff810a7a3e>] __lock_acquire+0x163e/0x1d00
       [<ffffffff815e89dc>] ? retint_restore_args+0xe/0xe
       [<ffffffff815ddc7b>] ? __slab_alloc+0x4a8/0x4ce
       [<ffffffff81285fff>] ? __ext4_es_shrink+0x4f/0x2e0
       [<ffffffff810a8707>] lock_acquire+0x87/0x120
       [<ffffffff81285fff>] ? __ext4_es_shrink+0x4f/0x2e0
       [<ffffffff8128592d>] ? ext4_es_free_extent+0x5d/0x70
       [<ffffffff815e6f09>] _raw_spin_lock+0x39/0x50
       [<ffffffff81285fff>] ? __ext4_es_shrink+0x4f/0x2e0
       [<ffffffff8119760b>] ? kmem_cache_alloc+0x18b/0x1a0
       [<ffffffff81285fff>] __ext4_es_shrink+0x4f/0x2e0
       [<ffffffff812869b8>] ext4_es_insert_extent+0xc8/0x180
       [<ffffffff812470f4>] ext4_map_blocks+0x1c4/0x550
       [<ffffffff8124c4c4>] ext4_writepages+0x6d4/0xd00
      	...
      Reported-by: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Cc: Zheng Liu <gnehzuil.liu@gmail.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • ext4: disable synchronous transaction batching if max_batch_time==0 · 38020539
      Eric Sandeen authored
      commit 5dd21424 upstream.
      
      The mount manpage says of the max_batch_time option,
      
      	This optimization can be turned off entirely
      	by setting max_batch_time to 0.
      
      But the code doesn't do that.  So fix the code to do
      that.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • ext4: clarify ext4_error message in ext4_mb_generate_buddy_error() · 5795af2e
      Theodore Ts'o authored
      commit 94d4c066 upstream.
      
      We are spending a lot of time explaining to users what this error
      means.  Let's try to improve the message to avoid this problem.
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • ext4: clarify error count warning messages · f17c60ee
      Theodore Ts'o authored
      commit ae0f78de upstream.
      
      Make it clear that the values printed are times, and that the error
      count is since the last fsck. Also add a note about the required fsck
      version.
      Signed-off-by: Pavel Machek <pavel@ucw.cz>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Andreas Dilger <adilger@dilger.ca>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • ext4: fix unjournalled bg descriptor while initializing inode bitmap · 95f670b4
      Theodore Ts'o authored
      commit 61c219f5 upstream.
      
      The first time that we allocate from an uninitialized inode allocation
      bitmap, if the block allocation bitmap is also uninitialized, we need
      to get write access to the block group descriptor before we start
      modifying the block group descriptor flags and updating the free block
      count, etc.  Otherwise, there is the potential of a bad journal
      checksum (if journal checksums are enabled), and of the file system
      becoming inconsistent if we crash at exactly the wrong time.
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • intel_pstate: Set CPU number before accessing MSRs · 07c07190
      Vincent Minet authored
      commit 179e8471 upstream.
      
      Ensure that cpu->cpu is set before writing MSR_IA32_PERF_CTL during CPU
      initialization. Otherwise only cpu0 has its P-state set and all other
      cores are left with their values unchanged.
      
      In most cases, this is not too serious because the P-states will be set
      correctly when the timer function is run.  But when the default governor
      is set to performance, the per-CPU current_pstate stays the same forever
      and no attempts are made to write the MSRs again.
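
The ordering problem can be reproduced in a toy Python model. Everything here is an illustrative assumption rather than driver code: `FakeMsrBus`, `CpuData`, `set_pstate`, and `init_cpu` are hypothetical names, and the shift applied to the P-state value is only there to make the write look register-like.

```python
MSR_IA32_PERF_CTL = 0x199  # MSR index of IA32_PERF_CTL

class FakeMsrBus:
    """Hypothetical stand-in for per-CPU MSR writes; records the target CPU."""
    def __init__(self):
        self.writes = {}

    def wrmsr_on_cpu(self, cpu, msr, value):
        self.writes[(cpu, msr)] = value

class CpuData:
    def __init__(self):
        self.cpu = 0  # zero-initialized, like a freshly zeroed struct

def set_pstate(bus, cpudata, pstate):
    # The write is addressed by cpudata.cpu, so that field must be valid.
    bus.wrmsr_on_cpu(cpudata.cpu, MSR_IA32_PERF_CTL, pstate << 8)

def init_cpu(bus, cpunum, pstate, set_cpu_first):
    data = CpuData()
    if set_cpu_first:
        data.cpu = cpunum          # fixed order: assign the CPU number first
    set_pstate(bus, data, pstate)  # initial P-state write
    data.cpu = cpunum              # buggy order assigned it only here
    return data
```

With `set_cpu_first=False`, every initial write in this model lands on CPU 0, matching the symptom described above; with `set_cpu_first=True`, each CPU receives its own write.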
      Signed-off-by: Vincent Minet <vincent@vincent-minet.net>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • dm io: fix a race condition in the wake up code for sync_io · 86def865
      Joe Thornber authored
      commit 10f1d5d1 upstream.
      
      There's a race condition between the atomic_dec_and_test(&io->count)
      in dec_count() and the waking of the sync_io() thread.  If the thread
      is spuriously woken immediately after the decrement it may exit,
      making the on stack io struct invalid, yet the dec_count could still
      be using it.
      
      Fix this race by using a completion in sync_io() and dec_count().
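
The completion pattern can be sketched in Python with `threading.Event` playing the role of a completion. This is a model under assumptions, not the dm-io code: the `Io` class and its fields are hypothetical, and Python objects do not have stack lifetimes, so the sketch shows only the ordering rule (signal last, wait on the signal), not the memory hazard itself.

```python
import threading

class Io:
    """Hypothetical stand-in for the on-stack io struct used by sync_io()."""
    def __init__(self, count):
        self.count = count
        self.error_bits = 0
        self.lock = threading.Lock()
        self.done = threading.Event()  # plays the role of a completion

def dec_count(io, error=0):
    with io.lock:
        io.error_bits |= error
        io.count -= 1
        last = io.count == 0
    if last:
        io.done.set()  # signalling is the final access to io

def sync_io(num_regions):
    io = Io(num_regions)
    workers = [threading.Thread(target=dec_count, args=(io,))
               for _ in range(num_regions)]
    for w in workers:
        w.start()
    io.done.wait()  # returns only after the last dec_count() has signalled
    for w in workers:
        w.join()
    return io.error_bits
```

Because the waiter blocks on the event rather than on a bare wakeup, a spurious wakeup cannot let it return while the last `dec_count()` is still running.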
      Reported-by: Minfei Huang <huangminfei@ucloud.cn>
      Signed-off-by: Joe Thornber <thornber@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Acked-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • Drivers: hv: vmbus: Fix a bug in the channel callback dispatch code · 71a046b8
      K. Y. Srinivasan authored
      commit affb1aff upstream.
      
      Starting with Win8, we have implemented several optimizations to improve the
      scalability and performance of the VMBUS transport between the Host and the
      Guest. Some of the non-performance critical services cannot leverage these
      optimizations since they only read and process one message at a time.
      Make adjustments to the callback dispatch code to account for the way
      non-performance critical drivers handle reading of the channel.
      Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • clk: s2mps11: Fix double free corruption during driver unbind · c2602aa9
      Krzysztof Kozlowski authored
      commit 2a96dfa4 upstream.
      
      After unbinding the driver, memory was corrupted by a double free of
      the clk_lookup structure. This led to an OOPS when re-binding the
      driver.

      The driver allocated memory for 'clk_lookup' with devm_kzalloc. During
      driver removal this memory was freed twice: once by clkdev_drop() and
      a second time by the devm code.
      
      Kernel panic log:
      [   30.839284] Unable to handle kernel paging request at virtual address 5f343173
      [   30.846476] pgd = dee14000
      [   30.849165] [5f343173] *pgd=00000000
      [   30.852703] Internal error: Oops: 805 [#1] PREEMPT SMP ARM
      [   30.858166] Modules linked in:
      [   30.861208] CPU: 0 PID: 1 Comm: bash Not tainted 3.16.0-rc2-00239-g94bdf617b07e-dirty #40
      [   30.869364] task: df478000 ti: df480000 task.ti: df480000
      [   30.874752] PC is at clkdev_add+0x2c/0x38
      [   30.878738] LR is at clkdev_add+0x18/0x38
      [   30.882732] pc : [<c0350908>]    lr : [<c03508f4>]    psr: 60000013
      [   30.882732] sp : df481e78  ip : 00000001  fp : c0700ed8
      [   30.894187] r10: 0000000c  r9 : 00000000  r8 : c07b0e3c
      [   30.899396] r7 : 00000002  r6 : df45f9d0  r5 : df421390  r4 : c0700d6c
      [   30.905906] r3 : 5f343173  r2 : c0700d84  r1 : 60000013  r0 : c0700d6c
      [   30.912417] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
      [   30.919534] Control: 10c53c7d  Table: 5ee1406a  DAC: 00000015
      [   30.925262] Process bash (pid: 1, stack limit = 0xdf480240)
      [   30.930817] Stack: (0xdf481e78 to 0xdf482000)
      [   30.935159] 1e60:                                                       00001000 df6de610
      [   30.943321] 1e80: df7f4558 c0355650 c05ec6ec c0700eb0 df6de600 df7f4510 dec9d69c 00000014
      [   30.951480] 1ea0: 00167b48 df6de610 c0700e30 c0713518 00000000 c0700e30 dec9d69c 00000006
      [   30.959639] 1ec0: 00167b48 c02c1b7c c02c1b64 df6de610 c07aff48 c02c0420 c06fb150 c047cc20
      [   30.967798] 1ee0: df6de610 df6de610 c0700e30 df6de644 c06fb150 0000000c dec9d690 c02bef90
      [   30.975957] 1f00: dec9c6c0 dece4c00 df481f80 dece4c00 0000000c c02be73c 0000000c c016ca8c
      [   30.984116] 1f20: c016ca48 00000000 00000000 c016c1f4 00000000 00000000 b6f18000 df481f80
      [   30.992276] 1f40: df7f66c0 0000000c df480000 df480000 b6f18000 c011094c df47839c 60000013
      [   31.000435] 1f60: 00000000 00000000 df7f66c0 df7f66c0 0000000c df480000 b6f18000 c0110dd4
      [   31.008594] 1f80: 00000000 00000000 0000000c b6ec05d8 0000000c b6f18000 00000004 c000f2a8
      [   31.016753] 1fa0: 00001000 c000f0e0 b6ec05d8 0000000c 00000001 b6f18000 0000000c 00000000
      [   31.024912] 1fc0: b6ec05d8 0000000c b6f18000 00000004 0000000c 00000001 00000000 00167b48
      [   31.033071] 1fe0: 00000000 bed83a80 b6e004f0 b6e5122c 60000010 00000001 ffffffff ffffffff
      [   31.041248] [<c0350908>] (clkdev_add) from [<c0355650>] (s2mps11_clk_probe+0x2b4/0x3b4)
      [   31.049223] [<c0355650>] (s2mps11_clk_probe) from [<c02c1b7c>] (platform_drv_probe+0x18/0x48)
      [   31.057728] [<c02c1b7c>] (platform_drv_probe) from [<c02c0420>] (driver_probe_device+0x13c/0x384)
      [   31.066579] [<c02c0420>] (driver_probe_device) from [<c02bef90>] (bind_store+0x88/0xd8)
      [   31.074564] [<c02bef90>] (bind_store) from [<c02be73c>] (drv_attr_store+0x20/0x2c)
      [   31.082118] [<c02be73c>] (drv_attr_store) from [<c016ca8c>] (sysfs_kf_write+0x44/0x48)
      [   31.090016] [<c016ca8c>] (sysfs_kf_write) from [<c016c1f4>] (kernfs_fop_write+0xc0/0x17c)
      [   31.098176] [<c016c1f4>] (kernfs_fop_write) from [<c011094c>] (vfs_write+0xa0/0x1c4)
      [   31.105899] [<c011094c>] (vfs_write) from [<c0110dd4>] (SyS_write+0x40/0x8c)
      [   31.112931] [<c0110dd4>] (SyS_write) from [<c000f0e0>] (ret_fast_syscall+0x0/0x3c)
      [   31.120481] Code: e2842018 e584501c e1a00004 e885000c (e5835000)
      [   31.126596] ---[ end trace efad45bfa3a61b05 ]---
      [   31.131181] Kernel panic - not syncing: Fatal exception
      [   31.136368] CPU1: stopping
      [   31.139054] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G      D       3.16.0-rc2-00239-g94bdf617b07e-dirty #40
      [   31.148697] [<c0016480>] (unwind_backtrace) from [<c0012950>] (show_stack+0x10/0x14)
      [   31.156419] [<c0012950>] (show_stack) from [<c0480db8>] (dump_stack+0x80/0xcc)
      [   31.163622] [<c0480db8>] (dump_stack) from [<c001499c>] (handle_IPI+0x130/0x15c)
      [   31.170998] [<c001499c>] (handle_IPI) from [<c000862c>] (gic_handle_irq+0x60/0x68)
      [   31.178549] [<c000862c>] (gic_handle_irq) from [<c0013480>] (__irq_svc+0x40/0x70)
      [   31.186009] Exception stack(0xdf4bdf88 to 0xdf4bdfd0)
      [   31.191046] df80:                   ffffffed 00000000 00000000 00000000 df4bc000 c06d042c
      [   31.199207] dfa0: 00000000 ffffffed c06d03c0 00000000 c070c288 00000000 00000000 df4bdfd0
      [   31.207363] dfc0: c0010324 c0010328 60000013 ffffffff
      [   31.212402] [<c0013480>] (__irq_svc) from [<c0010328>] (arch_cpu_idle+0x28/0x30)
      [   31.219783] [<c0010328>] (arch_cpu_idle) from [<c005f150>] (cpu_startup_entry+0x2c4/0x3f0)
      [   31.228027] [<c005f150>] (cpu_startup_entry) from [<400086c4>] (0x400086c4)
      [   31.234968] ---[ end Kernel panic - not syncing: Fatal exception
      
      Fixes: 7cc560de ("clk: s2mps11: Add support for s2mps11")
      Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
      Reviewed-by: Yadwinder Singh Brar <yadi.brar@samsung.com>
      Signed-off-by: Mike Turquette <mturquette@linaro.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • clk: spear3xx: Use proper control register offset · dce080c4
      Thomas Gleixner authored
      commit 15ebb052 upstream.
      
      The control register is at offset 0x10, not 0x0. This has been broken
      since commit 5df33a62 (SPEAr: Switch to common clock framework).
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Mike Turquette <mturquette@linaro.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • arm64: implement TASK_SIZE_OF · ccca5d34
      Colin Cross authored
      commit fa2ec3ea upstream.
      
      include/linux/sched.h implements TASK_SIZE_OF as TASK_SIZE if it
      is not set by the architecture headers.  TASK_SIZE uses the
      current task to determine the size of the virtual address space.
      On a 64-bit kernel this will cause reading /proc/pid/pagemap of a
      64-bit process from a 32-bit process to return EOF when it reads
      past 0xffffffff.
      
      Implement TASK_SIZE_OF exactly the same as TASK_SIZE with
      test_tsk_thread_flag instead of test_thread_flag.
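
The difference between the two macros comes down to whose flag is tested. The Python model below is a sketch with assumed values: the `Task` class is hypothetical, `is_32bit` stands in for the per-thread 32-bit flag, and the 64-bit size is an illustrative constant that in practice depends on the configured virtual address width.

```python
TASK_SIZE_32 = 0x1_0000_0000   # 4 GiB, illustrative value for 32-bit tasks
TASK_SIZE_64 = 1 << 39         # illustrative; depends on the configured VA bits

class Task:
    def __init__(self, is_32bit):
        self.is_32bit = is_32bit  # stands in for the 32-bit thread flag

def task_size(current):
    """TASK_SIZE: looks at the flag of the *calling* task."""
    return TASK_SIZE_32 if current.is_32bit else TASK_SIZE_64

def task_size_of(tsk):
    """TASK_SIZE_OF(tsk): looks at the flag of the *inspected* task."""
    return TASK_SIZE_32 if tsk.is_32bit else TASK_SIZE_64
```

When a 32-bit reader inspects a 64-bit target, `task_size(reader)` stops at 4 GiB, which corresponds to the early EOF described above, while `task_size_of(target)` covers the target's full address space.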
      Signed-off-by: Colin Cross <ccross@android.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • crypto: sha512_ssse3 - fix byte count to bit count conversion · 49049cd0
      Jussi Kivilinna authored
      commit cfe82d4f upstream.
      
      Byte-to-bit-count computation is only partly converted to big-endian and is
      mixing in CPU-endian values. The problem was noticed by sparse, which warned:
      
        CHECK   arch/x86/crypto/sha512_ssse3_glue.c
      arch/x86/crypto/sha512_ssse3_glue.c:144:19: warning: restricted __be64 degrades to integer
      arch/x86/crypto/sha512_ssse3_glue.c:144:17: warning: incorrect type in assignment (different base types)
      arch/x86/crypto/sha512_ssse3_glue.c:144:17:    expected restricted __be64 <noident>
      arch/x86/crypto/sha512_ssse3_glue.c:144:17:    got unsigned long long
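
A correct conversion does all arithmetic on native integers and converts to big-endian only at the end. The following Python sketch shows that ordering for a 128-bit byte count held in two 64-bit words; `bit_count_be`, `count_lo`, and `count_hi` are hypothetical names, and this is a model of the technique rather than the glue code itself.

```python
import struct

MASK64 = (1 << 64) - 1

def bit_count_be(count_lo, count_hi):
    """Convert a 128-bit byte count (two 64-bit words) into the 16-byte
    big-endian bit count that SHA-512 padding appends.

    count_lo / count_hi are hypothetical names for the low and high words.
    """
    bits_lo = (count_lo << 3) & MASK64
    # The top 3 bits of the low word carry into the high word.
    bits_hi = ((count_hi << 3) | (count_lo >> 61)) & MASK64
    # Convert to big-endian only once, after all arithmetic is done.
    return struct.pack(">QQ", bits_hi, bits_lo)
```

Mixing an already byte-swapped word into the shift-and-or step is the kind of error the sparse warnings point at: the operands of the `|` must both be in CPU byte order.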
      Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
      Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • cpufreq: Makefile: fix compilation for davinci platform · 52df57db
      Prabhakar Lad authored
      commit 5a90af67 upstream.
      
      Commit 8a7b1227 (cpufreq: davinci: move cpufreq driver to
      drivers/cpufreq) added a dependency only on CONFIG_ARCH_DAVINCI_DA850,
      whereas the davinci_cpufreq_init() call is used by all davinci platforms.

      This patch fixes the following build error:
      
      arch/arm/mach-davinci/built-in.o: In function `davinci_init_late':
      :(.init.text+0x928): undefined reference to `davinci_cpufreq_init'
      make: *** [vmlinux] Error 1
      
      Fixes: 8a7b1227 (cpufreq: davinci: move cpufreq driver to drivers/cpufreq)
      Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • powerpc/perf: Clear MMCR2 when enabling PMU · 19528a53
      Joel Stanley authored
      commit b50a6c58 upstream.
      
      On POWER8 when switching to a KVM guest we set bits in MMCR2 to freeze
      the PMU counters. Aside from on boot they are then never reset,
      resulting in stuck perf counters for any user in the guest or host.
      
      We now set MMCR2 to 0 whenever enabling the PMU, which provides a sane
      state for perf to use the PMU counters under either the guest or the
      host.
      
      This was manifesting as a bug with ppc64_cpu --frequency:
      
          $ sudo ppc64_cpu --frequency
          WARNING: couldn't run on cpu 0
          WARNING: couldn't run on cpu 8
            ...
          WARNING: couldn't run on cpu 144
          WARNING: couldn't run on cpu 152
          min:    18446744073.710 GHz (cpu -1)
          max:    0.000 GHz (cpu -1)
          avg:    0.000 GHz
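
As a side note on the `min` line in this output: 18446744073.710 GHz is what 2^64 - 1 Hz looks like when printed in GHz with three decimals. That is consistent with, though not proof of, a minimum that started at the largest unsigned 64-bit value and was never lowered because no CPU produced a measurement. The helper name below is hypothetical and the arithmetic is the only claim being made.

```python
UINT64_MAX = (1 << 64) - 1

def hz_to_ghz_string(hz):
    """Format a frequency in Hz as GHz with three decimals."""
    return f"{hz / 1e9:.3f}"

print(hz_to_ghz_string(UINT64_MAX))  # 18446744073.710
```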
      
      The command uses a perf counter to measure CPU cycles over a fixed
      amount of time, in order to approximate the frequency of the machine.
      The counters were returning zero once a guest was started, regardless of
      whether it was still running or had been shut down.
      
      By dumping the value of MMCR2, it was observed that once a guest is
      running MMCR2 is set to 1s - which stops counters from running:
      
          $ sudo sh -c 'echo p > /proc/sysrq-trigger'
          CPU: 0 PMU registers, ppmu = POWER8 n_counters = 6
          PMC1:  5b635e38 PMC2: 00000000 PMC3: 00000000 PMC4: 00000000
          PMC5:  1bf5a646 PMC6: 5793d378 PMC7: deadbeef PMC8: deadbeef
          MMCR0: 0000000080000000 MMCR1: 000000001e000000 MMCRA: 0000040000000000
          MMCR2: fffffffffffffc00 EBBHR: 0000000000000000
          EBBRR: 0000000000000000 BESCR: 0000000000000000
          SIAR:  00000000000a51cc SDAR:  c00000000fc40000 SIER:  0000000001000000
      
      This is done unconditionally in book3s_hv_interrupts.S upon entering the
      guest, and the original value is only saved/restored if the host has
      indicated it was using the PMU. This is okay, however the user of the
      PMU needs to ensure that it is in a defined state when it starts using
      it.
      
      Fixes: e05b9b9e ("powerpc/perf: Power8 PMU support")
      Signed-off-by: Joel Stanley <joel@jms.id.au>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • powerpc/perf: Add PPMU_ARCH_207S define · 25744477
      Joel Stanley authored
      commit 4d9690dd upstream.
      
      Instead of separate bits for every POWER8 PMU feature, have a single one
      for v2.07 of the architecture.
      
      This saves us adding a MMCR2 define for a future patch.
      Signed-off-by: Joel Stanley <joel@jms.id.au>
      Acked-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      25744477
    • Anton Blanchard's avatar
      powerpc/perf: Never program book3s PMCs with values >= 0x80000000 · a348a994
      Anton Blanchard authored
      commit f5602941 upstream.
      
      We are seeing a lot of PMU warnings on POWER8:
      
          Can't find PMC that caused IRQ
      
      Looking closer, the active PMC is 0 at this point and we took a PMU
      exception on the transition from negative to 0. Some versions of POWER8
      have an issue where they edge detect and not level detect PMC overflows.
      
      A number of places program the PMC with (0x80000000 - period_left),
      where period_left can be negative. We can either fix all of these or
      just ensure that period_left is always >= 1.
      
      This patch takes the second option.
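
As a rough illustration of the second option, the clamp can be sketched in plain C. This is a user-space model, not the kernel's implementation: the helper names `clamp_period_left` and `pmc_start_value` are hypothetical, and the handling of very long periods is an assumption. Only the `0x80000000 - period_left` expression and the `>= 1` bound come from the text above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: force the remaining period to be at least 1. */
static int64_t clamp_period_left(int64_t left)
{
    return left < 1 ? 1 : left;
}

/*
 * Hypothetical model of computing the value written to a PMC.
 * With left >= 1 the result is always below 0x80000000, so the counter
 * is never programmed with a value that already has the top bit set.
 */
static uint32_t pmc_start_value(int64_t left)
{
    left = clamp_period_left(left);
    assert(left >= 1);              /* invariant the patch relies on */

    if (left >= 0x80000000LL)
        return 0;                   /* assumption: longest period starts at 0 */
    return (uint32_t)(0x80000000LL - left);
}
```

With the clamp in place, a zero or negative `period_left` yields 0x7fffffff rather than a value at or above 0x80000000.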
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      a348a994
    • Lv Zheng's avatar
      ACPI / EC: Fix race condition in ec_transaction_completed() · 8c1eb039
      Lv Zheng authored
      commit c0d65341 upstream.
      
      There is a race condition in ec_transaction_completed().
      
      When ec_transaction_completed() is called in the GPE handler, it could
      return true because of (ec->curr == NULL). Then the wake_up() invocation
      could complete the next command unexpectedly since there is no lock between
      the 2 invocations. With the previous cleanup, the IBF=0 waiter race need
      not be handled any more. It's now safe to return a flag from
      advance_condition() to indicate that a wakeup is required; the flag is
      returned from a locked context.
      
      The ec_transaction_completed() is now only invoked by the ec_poll() where
      the ec->curr is ensured to be different from NULL.
      
      After cleaning up, the EVT_SCI=1 check should be moved out of the wakeup
      condition so that an EVT_SCI raised with (ec->curr == NULL) can trigger a
      QR_SC command.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=70891
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=63931
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=59911
      Reported-and-tested-by: Gareth Williams <gareth@garethwilliams.me.uk>
      Reported-and-tested-by: Hans de Goede <jwrdegoede@fedoraproject.org>
      Reported-by: Barton Xu <tank.xuhan@gmail.com>
      Tested-by: Steffen Weber <steffen.weber@gmail.com>
      Tested-by: Arthur Chen <axchen@nvidia.com>
      Signed-off-by: Lv Zheng <lv.zheng@intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      8c1eb039
    • Lv Zheng's avatar
      ACPI / EC: Remove duplicated ec_wait_ibf0() waiter · 208614e5
      Lv Zheng authored
      commit 9b80f0f7 upstream.
      
      After we've added the first command byte write into advance_transaction(),
      the IBF=0 waiter is duplicated with the command completion waiter
      implemented in the ec_poll() because:
         If IBF=1 blocked the first command byte write invoked in the task
         context ec_poll(), it would be kicked off upon IBF=0 interrupt or timed
         out and retried again in the task context.
      
      Remove this separate and duplicate IBF=0 waiter.  By doing so we can
      reduce the overall number of times to access the EC_SC(R) status
      register.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=70891
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=63931
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=59911
      Reported-and-tested-by: Gareth Williams <gareth@garethwilliams.me.uk>
      Reported-and-tested-by: Hans de Goede <jwrdegoede@fedoraproject.org>
      Reported-by: Barton Xu <tank.xuhan@gmail.com>
      Tested-by: Steffen Weber <steffen.weber@gmail.com>
      Tested-by: Arthur Chen <axchen@nvidia.com>
      Signed-off-by: Lv Zheng <lv.zheng@intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      208614e5
    • Lv Zheng's avatar
      ACPI / EC: Add asynchronous command byte write support · 6fb7e70d
      Lv Zheng authored
      commit f92fca00 upstream.
      
      Move the first command byte write into advance_transaction() so that all
      EC register accesses that can affect the command processing state machine
      can happen in this asynchronous state machine advancement function.
      
      The advance_transaction() function then can be a complete implementation
      of an asynchronous transaction for a single command so that:
       1. The first command byte can be written in the interrupt context;
       2. The command completion waiter can also be used to wait for the first
          command byte's timeout;
       3. In BURST mode, the follow-up command bytes can be written in the
          interrupt context directly, so that it doesn't need to return to the
          task context. Returning to the task context reduces the throughput of
          the BURST mode and in the worst cases where the system workload is very
          high, this leads to the hardware driven automatic BURST mode exit.
      
      In order not to increase memory consumption, convert 'done' into 'flags'
      to contain multiple indications:
       1. ACPI_EC_COMMAND_COMPLETE: converting from original 'done' condition,
          indicating the completion of the command transaction.
       2. ACPI_EC_COMMAND_POLL: indicating the availability of writing the first
          command byte. A new command can utilize this flag to compete for the
          right of accessing the underlying hardware. There is a follow-up bug
          fix that has utilized this new flag.
      
      The 2 flags are important because they also reflect a key concept of IO
      program design used in system software. Normally an IO program
      running in the kernel should first be implemented in an asynchronous way,
      and the 2 flags are the most common way to implement its synchronous
      operations on top of the asynchronous operations:
      1. POLL: This flag can be used to block until the asynchronous operations
               can happen.
      2. COMPLETE: This flag can be used to block until the asynchronous
                   operations have completed.
      By constructing code cleanly in this way, many difficult problems can be
      solved smoothly.
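
A minimal sketch of the 'done' to 'flags' conversion follows. The two flag names come from the text above; the bit values, the simplified `struct transaction`, and the helper names are assumptions made purely for illustration and do not reproduce the driver's code.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit values are assumed for this sketch. */
#define ACPI_EC_COMMAND_POLL     0x01  /* first command byte may be written */
#define ACPI_EC_COMMAND_COMPLETE 0x02  /* command transaction has finished */

/* Hypothetical, simplified transaction: one byte of flags replaces 'done'. */
struct transaction {
    unsigned char flags;
};

static void transaction_start(struct transaction *t)
{
    t->flags = ACPI_EC_COMMAND_POLL;
}

static void transaction_finish(struct transaction *t)
{
    /* A transaction must have been started before it can complete. */
    assert(t->flags & ACPI_EC_COMMAND_POLL);
    t->flags |= ACPI_EC_COMMAND_COMPLETE;
}

static bool transaction_completed(const struct transaction *t)
{
    return (t->flags & ACPI_EC_COMMAND_COMPLETE) != 0;
}
```

Because both indications share one byte, the structure stays the same size as it was with a single 'done' field of that width, which matches the stated goal of not increasing memory consumption.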
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=70891
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=63931
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=59911
      Reported-and-tested-by: Gareth Williams <gareth@garethwilliams.me.uk>
      Reported-and-tested-by: Hans de Goede <jwrdegoede@fedoraproject.org>
      Reported-by: Barton Xu <tank.xuhan@gmail.com>
      Tested-by: Steffen Weber <steffen.weber@gmail.com>
      Tested-by: Arthur Chen <axchen@nvidia.com>
      Signed-off-by: Lv Zheng <lv.zheng@intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      6fb7e70d
    • Lv Zheng's avatar
      ACPI / EC: Avoid race condition related to advance_transaction() · e9b3f9c5
      Lv Zheng authored
      commit 66b42b78 upstream.
      
      advance_transaction() is invoked from the IRQ context GPE handler
      and from the task context ec_poll(). The body of this function is locked
      so that the EC state machine is guaranteed to advance sequentially.
      
      But there is a problem. EC_SC(R) is read before advance_transaction() is
      invoked, so the two contexts can race for the lock. The context that
      read the register first can lose this race; when it then passes the
      stale register value to the state machine advancement code, the hardware
      condition is totally different from when the register was read. Hardware
      accesses determined from the wrong hardware status can break the EC
      state machine, and in some cases the functionality of the platform
      firmware is seriously affected.
      For example:
       1. When 2 EC_DATA(W) writes compete for IBF=0, the 2nd EC_DATA(W) write
          may be invalid due to IBF=1 after the 1st EC_DATA(W) write. The
          hardware will then either refuse to respond to the EC_SC(W) write of
          the next command, or discard the current WR_EC command when it
          receives the EC_SC(W) write of the next command.
       2. When 1 EC_SC(W) write and 1 EC_DATA(W) write compete for IBF=0, the
          EC_DATA(W) write may be invalid due to IBF=1 after the EC_SC(W) write.
          The hardware may then never respond to the next EC_DATA(R). This is
          the root cause of the reported issue.
      
      Fix this issue by moving the EC_SC(R) access into the lock so that we can
      ensure that the state machine is advanced consistently.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=70891
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=63931
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=59911
      Reported-and-tested-by: Gareth Williams <gareth@garethwilliams.me.uk>
      Reported-and-tested-by: Hans de Goede <jwrdegoede@fedoraproject.org>
      Reported-by: Barton Xu <tank.xuhan@gmail.com>
      Tested-by: Steffen Weber <steffen.weber@gmail.com>
      Tested-by: Arthur Chen <axchen@nvidia.com>
      Signed-off-by: Lv Zheng <lv.zheng@intel.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      e9b3f9c5
    • Axel Lin's avatar
      hwmon: (adm1021) Fix cache problem when writing temperature limits · 9d3d2645
      Axel Lin authored
      commit c024044d upstream.
      
      The module test script for the adm1021 driver exposes a cache problem
      when writing temperature limits. temp_min and temp_max are expected
      to be stored in milli-degrees C but are stored in degrees C.
      Reported-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Axel Lin <axel.lin@ingics.com>
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      9d3d2645
    • Axel Lin's avatar
      hwmon: (adm1029) Ensure the fan_div cache is updated in set_fan_div · fc3ea423
      Axel Lin authored
      commit 1035a9e3 upstream.
      
      Writing to fanX_div does not clear the cache. As a result, reading
      from fanX_div may return the old value for up to two seconds
      after writing a new value.
      
      This patch ensures the fan_div cache is updated in set_fan_div().
      Reported-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Axel Lin <axel.lin@ingics.com>
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      fc3ea423
    • Guenter Roeck's avatar
      hwmon: (adm1031) Fix writes to limit registers · b375dd27
      Guenter Roeck authored
      commit 145e74a4 upstream.
      
      Upper limit for write operations to temperature limit registers
      was clamped to a fractional value. However, limit registers do
      not support fractional values. As a result, upper limits of 127.5
      degrees C or higher resulted in a rounded limit of 128 degrees C.
      Since limit registers are signed, this was stored as -128 degrees C.
      Clamp limits to (-55, +127) degrees C to solve the problem.
      
      Value on writes to auto_temp[12]_min and auto_temp[12]_max were not
      clamped at all, but masked. As a result, out-of-range writes resulted
      in a more or less arbitrary limit. Clamp those attributes to (0, 127)
      degrees C for more predictable results.
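
The clamping described above can be sketched as follows. This is an illustrative model rather than the driver's code: the helper names are hypothetical, and the input is assumed to arrive in millidegrees C, as hwmon sysfs temperature attributes conventionally do. The (-55, +127) range is the one given in the text.

```c
#include <assert.h>
#include <stdint.h>

static long clamp_long(long val, long lo, long hi)
{
    return val < lo ? lo : (val > hi ? hi : val);
}

/*
 * Hypothetical conversion from a value in millidegrees C to a signed
 * 8-bit limit register holding whole degrees C.
 */
static int8_t temp_limit_to_reg(long millideg)
{
    /* Clamp to whole-degree bounds first, so rounding can never reach 128. */
    long clamped = clamp_long(millideg, -55000, 127000);
    long deg = clamped / 1000;

    assert(deg >= -55 && deg <= 127);   /* always representable in int8_t */
    return (int8_t)deg;
}
```

Clamping to a whole-degree upper bound before converting is what keeps a request such as 127.5 degrees C from being rounded up to 128 and then wrapping in the signed register.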
      
      Cc: Axel Lin <axel.lin@ingics.com>
      Reviewed-by: Jean Delvare <jdelvare@suse.de>
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      b375dd27
    • Guenter Roeck's avatar
      hwmon: (emc2103) Clamp limits instead of bailing out · 08efa418
      Guenter Roeck authored
      commit f6c2dd20 upstream.
      
      It is customary to clamp limits instead of bailing out with an error
      if a configured limit is out of the range supported by the driver.
      This simplifies limit configuration, since the user will not typically
      know chip and/or driver specific limits.
      Reviewed-by: Jean Delvare <jdelvare@suse.de>
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      08efa418
    • Axel Lin's avatar
      hwmon: (amc6821) Fix permissions for temp2_input · b57dd1ad
      Axel Lin authored
      commit df86754b upstream.
      
      temp2_input should not be writable, fix it.
      Reported-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Axel Lin <axel.lin@ingics.com>
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      b57dd1ad
    • Aaron Lu's avatar
      thermal: hwmon: Make the check for critical temp valid consistent · c8052537
      Aaron Lu authored
      commit e8db5d67 upstream.
      
      On 05/21/2014 04:22 PM, Aaron Lu wrote:
      > On 05/21/2014 01:57 PM, Kui Zhang wrote:
      >> Hello,
      >>
      >> I get following error when rmmod thermal.
      >>
      >> rmmod  thermal
      >> Killed
      
      While dealing with this problem, I found another problem that also
      results in a kernel crash on thermal module removal:
      
      From: Aaron Lu <aaron.lu@intel.com>
      Date: Wed, 21 May 2014 16:05:38 +0800
      Subject: thermal: hwmon: Make the check for critical temp valid consistent
      
      We use tz->ops->get_crit_temp && !tz->ops->get_crit_temp(tz, temp)
      to decide whether to create the temp_crit attribute file, but we only
      check whether tz->ops->get_crit_temp exists to decide whether to remove
      that attribute file. Some ACPI thermal zones don't have a valid critical
      trip point, and that results in removing a non-existent device file
      on thermal module unload.
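
One way to keep the two checks consistent is to route both the create and the remove path through a single predicate. The sketch below models that idea in user space; the `struct zone` types, the callbacks, and all function names are hypothetical stand-ins, and only the `get_crit_temp && !get_crit_temp(tz, temp)` condition is taken from the text.

```c
#include <assert.h>
#include <stdbool.h>

struct zone;

struct zone_ops {
    /* Returns 0 on success, non-zero if there is no valid critical trip. */
    int (*get_crit_temp)(struct zone *tz, long *temp);
};

struct zone {
    const struct zone_ops *ops;
    bool crit_attr_created;     /* stands in for the temp_crit file */
};

/* Demo callbacks for the sketch. */
static int crit_temp_ok(struct zone *tz, long *temp)
{
    (void)tz;
    *temp = 100000;
    return 0;
}

static int crit_temp_invalid(struct zone *tz, long *temp)
{
    (void)tz;
    (void)temp;
    return -1;
}

/* Single predicate shared by the create and remove paths. */
static bool zone_has_valid_crit_temp(struct zone *tz)
{
    long temp;

    return tz->ops->get_crit_temp && !tz->ops->get_crit_temp(tz, &temp);
}

static void zone_add_attrs(struct zone *tz)
{
    if (zone_has_valid_crit_temp(tz))
        tz->crit_attr_created = true;
}

static void zone_remove_attrs(struct zone *tz)
{
    if (zone_has_valid_crit_temp(tz)) {
        assert(tz->crit_attr_created);  /* never remove what was not created */
        tz->crit_attr_created = false;
    }
}
```

With a zone whose callback exists but reports no valid trip point, neither path touches the attribute, which is the situation that previously led to removing a file that had never been created.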
      Signed-off-by: Aaron Lu <aaron.lu@intel.com>
      Signed-off-by: Zhang Rui <rui.zhang@intel.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      c8052537
    • Yasuaki Ishimatsu's avatar
      workqueue: zero cpumask of wq_numa_possible_cpumask on init · 9b8fa806
      Yasuaki Ishimatsu authored
      commit 5a6024f1 upstream.
      
      When hot-adding and onlining a CPU, a kernel panic occurs with the
      following call trace.
      
        BUG: unable to handle kernel paging request at 0000000000001d08
        IP: [<ffffffff8114acfd>] __alloc_pages_nodemask+0x9d/0xb10
        PGD 0
        Oops: 0000 [#1] SMP
        ...
        Call Trace:
         [<ffffffff812b8745>] ? cpumask_next_and+0x35/0x50
         [<ffffffff810a3283>] ? find_busiest_group+0x113/0x8f0
         [<ffffffff81193bc9>] ? deactivate_slab+0x349/0x3c0
         [<ffffffff811926f1>] new_slab+0x91/0x300
         [<ffffffff815de95a>] __slab_alloc+0x2bb/0x482
         [<ffffffff8105bc1c>] ? copy_process.part.25+0xfc/0x14c0
         [<ffffffff810a3c78>] ? load_balance+0x218/0x890
         [<ffffffff8101a679>] ? sched_clock+0x9/0x10
         [<ffffffff81105ba9>] ? trace_clock_local+0x9/0x10
         [<ffffffff81193d1c>] kmem_cache_alloc_node+0x8c/0x200
         [<ffffffff8105bc1c>] copy_process.part.25+0xfc/0x14c0
         [<ffffffff81114d0d>] ? trace_buffer_unlock_commit+0x4d/0x60
         [<ffffffff81085a80>] ? kthread_create_on_node+0x140/0x140
         [<ffffffff8105d0ec>] do_fork+0xbc/0x360
         [<ffffffff8105d3b6>] kernel_thread+0x26/0x30
         [<ffffffff81086652>] kthreadd+0x2c2/0x300
         [<ffffffff81086390>] ? kthread_create_on_cpu+0x60/0x60
         [<ffffffff815f20ec>] ret_from_fork+0x7c/0xb0
         [<ffffffff81086390>] ? kthread_create_on_cpu+0x60/0x60
      
      In my investigation, I found the root cause is wq_numa_possible_cpumask.
      All entries of wq_numa_possible_cpumask are allocated by
      alloc_cpumask_var_node() and are then used without being initialized,
      so these entries contain wrong values.
      
      When hot-adding and onlining a CPU, wq_update_unbound_numa() is called.
      wq_update_unbound_numa() calls alloc_unbound_pwq(), and alloc_unbound_pwq()
      calls get_unbound_pool(). In get_unbound_pool(), worker_pool->node is set
      as follows:
      
          /* if cpumask is contained inside a NUMA node, we belong to that node */
          if (wq_numa_enabled) {
                  for_each_node(node) {
                          if (cpumask_subset(pool->attrs->cpumask,
                                             wq_numa_possible_cpumask[node])) {
                                  pool->node = node;
                                  break;
                          }
                  }
          }
      
      But wq_numa_possible_cpumask[node] does not hold a correct cpumask, so the
      wrong node is selected. As a result, a kernel panic occurs.
      
      With this patch, all entries of wq_numa_possible_cpumask are allocated by
      zalloc_cpumask_var_node() so that they are initialized, and the panic
      disappears.
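
The effect of stale bits on the subset test can be modelled in a few lines of user-space C. This is only a sketch: a 64-bit word stands in for a cpumask, the names `mask_t`, `mask_subset` and `pick_node` are hypothetical, and leftover memory contents are represented by an all-ones mask rather than by actually reading uninitialized memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One bit per CPU; a 64-bit word stands in for a cpumask. */
typedef uint64_t mask_t;

/* True if every CPU in a is also in b. */
static bool mask_subset(mask_t a, mask_t b)
{
    return (a & ~b) == 0;
}

/* Models the node-selection loop: first node whose mask contains pool_mask. */
static int pick_node(mask_t pool_mask, const mask_t *node_masks, int nr_nodes)
{
    assert(nr_nodes > 0);

    for (int node = 0; node < nr_nodes; node++) {
        if (mask_subset(pool_mask, node_masks[node]))
            return node;
    }
    return -1;  /* no single node contains all of the pool's CPUs */
}
```

If node 0's mask still holds leftover bits (modelled here as all ones), a pool that really belongs to node 1 matches node 0 first; when the masks start from zero and only the real CPUs are set, the same pool resolves to node 1.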
      Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
      Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Fixes: bce90380 ("workqueue: add wq_numa_tbl_len and wq_numa_possible_cpumask[]")
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      9b8fa806
    • Gu Zheng's avatar
      cpuset,mempolicy: fix sleeping function called from invalid context · d9e8b4f6
      Gu Zheng authored
      commit 391acf97 upstream.
      
      When running with the kernel (3.15-rc7+), the following bug occurs:
      [ 9969.258987] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:586
      [ 9969.359906] in_atomic(): 1, irqs_disabled(): 0, pid: 160655, name: python
      [ 9969.441175] INFO: lockdep is turned off.
      [ 9969.488184] CPU: 26 PID: 160655 Comm: python Tainted: G       A      3.15.0-rc7+ #85
      [ 9969.581032] Hardware name: FUJITSU-SV PRIMEQUEST 1800E/SB, BIOS PRIMEQUEST 1000 Series BIOS Version 1.39 11/16/2012
      [ 9969.706052]  ffffffff81a20e60 ffff8803e941fbd0 ffffffff8162f523 ffff8803e941fd18
      [ 9969.795323]  ffff8803e941fbe0 ffffffff8109995a ffff8803e941fc58 ffffffff81633e6c
      [ 9969.884710]  ffffffff811ba5dc ffff880405c6b480 ffff88041fdd90a0 0000000000002000
      [ 9969.974071] Call Trace:
      [ 9970.003403]  [<ffffffff8162f523>] dump_stack+0x4d/0x66
      [ 9970.065074]  [<ffffffff8109995a>] __might_sleep+0xfa/0x130
      [ 9970.130743]  [<ffffffff81633e6c>] mutex_lock_nested+0x3c/0x4f0
      [ 9970.200638]  [<ffffffff811ba5dc>] ? kmem_cache_alloc+0x1bc/0x210
      [ 9970.272610]  [<ffffffff81105807>] cpuset_mems_allowed+0x27/0x140
      [ 9970.344584]  [<ffffffff811b1303>] ? __mpol_dup+0x63/0x150
      [ 9970.409282]  [<ffffffff811b1385>] __mpol_dup+0xe5/0x150
      [ 9970.471897]  [<ffffffff811b1303>] ? __mpol_dup+0x63/0x150
      [ 9970.536585]  [<ffffffff81068c86>] ? copy_process.part.23+0x606/0x1d40
      [ 9970.613763]  [<ffffffff810bf28d>] ? trace_hardirqs_on+0xd/0x10
      [ 9970.683660]  [<ffffffff810ddddf>] ? monotonic_to_bootbased+0x2f/0x50
      [ 9970.759795]  [<ffffffff81068cf0>] copy_process.part.23+0x670/0x1d40
      [ 9970.834885]  [<ffffffff8106a598>] do_fork+0xd8/0x380
      [ 9970.894375]  [<ffffffff81110e4c>] ? __audit_syscall_entry+0x9c/0xf0
      [ 9970.969470]  [<ffffffff8106a8c6>] SyS_clone+0x16/0x20
      [ 9971.030011]  [<ffffffff81642009>] stub_clone+0x69/0x90
      [ 9971.091573]  [<ffffffff81641c29>] ? system_call_fastpath+0x16/0x1b
      
      The cause is that cpuset_mems_allowed() tries to take
      mutex_lock(&callback_mutex) under rcu_read_lock() (which is held in
      __mpol_dup()). In cpuset_mems_allowed(), the access to the cpuset is
      already under rcu_read_lock(), so in __mpol_dup() we can shrink the
      rcu_read_lock() region to protect only the access to the cpuset in
      current_cpuset_is_being_rebound(). That avoids this bug.
      
      This patch is a temporary solution that only addresses the bug
      mentioned above; it does not fix the long-standing issue with cpuset.mems
      rebinding on fork():
      
      "When the forker's task_struct is duplicated (which includes
       ->mems_allowed) and it races with an update to cpuset_being_rebound
       in update_tasks_nodemask() then the task's mems_allowed doesn't get
       updated. And the child task's mems_allowed can be wrong if the
       cpuset's nodemask changes before the child has been added to the
       cgroup's tasklist."
      Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      d9e8b4f6