1. 08 May, 2013 40 commits
    • Dmitry Monakhov's avatar
      ext4: unregister es_shrinker if mount failed · 80fcee2b
      Dmitry Monakhov authored
      commit a75ae78f upstream.
      
      Otherwise the destroyed ext4_sb_info will remain on the global shrinker list
      and result in the following OOPS:
      
      JBD2: corrupted journal superblock
      JBD2: recovery failed
      EXT4-fs (dm-2): error loading journal
      general protection fault: 0000 [#1] SMP
      Modules linked in: fuse acpi_cpufreq freq_table mperf coretemp kvm_intel kvm crc32c_intel microcode sg button sd_mod crc_t10dif ahci libahci pata_acpi ata_generic dm_mirror dm_region_hash dm_log dm_mod
      CPU 1
      Pid: 2758, comm: mount Not tainted 3.8.0-rc3+ #136                  /DH55TC
      RIP: 0010:[<ffffffff811bfb2d>]  [<ffffffff811bfb2d>] unregister_shrinker+0xad/0xe0
      RSP: 0000:ffff88011d5cbcd8  EFLAGS: 00010207
      RAX: 6b6b6b6b6b6b6b6b RBX: 6b6b6b6b6b6b6b53 RCX: 0000000000000006
      RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000246
      RBP: ffff88011d5cbce8 R08: 0000000000000002 R09: 0000000000000001
      R10: 0000000000000001 R11: 0000000000000000 R12: ffff88011cd3f848
      R13: ffff88011cd3f830 R14: ffff88011cd3f000 R15: 0000000000000000
      FS:  00007f7b721dd7e0(0000) GS:ffff880121a00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 00007fffa6f75038 CR3: 000000011bc1c000 CR4: 00000000000007e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process mount (pid: 2758, threadinfo ffff88011d5ca000, task ffff880116aacb80)
      Stack:
      ffff88011cd3f000 ffffffff8209b6c0 ffff88011d5cbd18 ffffffff812482f1
      00000000000003f3 00000000ffffffea ffff880115f4c200 0000000000000000
      ffff88011d5cbda8 ffffffff81249381 ffff8801219d8bf8 ffffffff00000000
      Call Trace:
      [<ffffffff812482f1>] deactivate_locked_super+0x91/0xb0
      [<ffffffff81249381>] mount_bdev+0x331/0x340
      [<ffffffff81376730>] ? ext4_alloc_flex_bg_array+0x180/0x180
      [<ffffffff81362035>] ext4_mount+0x15/0x20
      [<ffffffff8124869a>] mount_fs+0x9a/0x2e0
      [<ffffffff81277e25>] vfs_kern_mount+0xc5/0x170
      [<ffffffff81279c02>] do_new_mount+0x172/0x2e0
      [<ffffffff8127aa56>] do_mount+0x376/0x380
      [<ffffffff8127ab98>] sys_mount+0x138/0x150
      [<ffffffff818ffed9>] system_call_fastpath+0x16/0x1b
      Code: 8b 05 88 04 eb 00 48 3d 90 ff 06 82 48 8d 58 e8 75 19 4c 89 e7 e8 e4 d7 2c 00 48 c7 c7 00 ff 06 82 e8 58 5f ef ff 5b 41 5c c9 c3 <48> 8b 4b 18 48 8b 73 20 48 89 da 31 c0 48 c7 c7 c5 a0 e4 81 e8
      RIP  [<ffffffff811bfb2d>] unregister_shrinker+0xad/0xe0
      RSP <ffff88011d5cbcd8>
      Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
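The cleanup ordering behind the es_shrinker fix can be sketched in user space. This is a minimal illustration, not the ext4 code: `toy_register_shrinker`, `toy_unregister_shrinker`, `toy_mount`, and the `shrinker_list` global are hypothetical stand-ins for the kernel's shrinker registration and the mount error path.

```c
#include <stddef.h>

/* Toy stand-in for the kernel's global shrinker list (hypothetical). */
struct shrinker {
    struct shrinker *next;
    int registered;
};

static struct shrinker *shrinker_list;

static void toy_register_shrinker(struct shrinker *s)
{
    s->next = shrinker_list;
    shrinker_list = s;
    s->registered = 1;
}

static void toy_unregister_shrinker(struct shrinker *s)
{
    struct shrinker **pp = &shrinker_list;

    /* Walk the singly linked list until we find the entry or the end. */
    while (*pp != NULL && *pp != s)
        pp = &(*pp)->next;
    if (*pp == s)
        *pp = s->next;
    s->registered = 0;
}

/* Hypothetical mount: returns 0 on success, -1 if the journal fails to load. */
static int toy_mount(struct shrinker *s, int journal_ok)
{
    toy_register_shrinker(s);
    if (!journal_ok) {
        /* Undo the registration before the caller frees the owner. */
        toy_unregister_shrinker(s);
        return -1;
    }
    return 0;
}
```

If the failure path skipped the unregister call, the global list would keep a pointer into memory that the caller is about to free, which is the situation the OOPS above reports.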
    • Dmitry Monakhov's avatar
      ext4: fix journal callback list traversal · 699ce64d
      Dmitry Monakhov authored
      commit 5d3ee208 upstream.
      
      It is incorrect to use list_for_each_entry_safe() for journal callback
      traversal because ->next may be removed by another task:
      ->ext4_mb_free_metadata()
        ->ext4_mb_free_metadata()
          ->ext4_journal_callback_del()
      
      This results in the following issue:
      
      WARNING: at lib/list_debug.c:62 __list_del_entry+0x1c0/0x250()
      Hardware name:
      list_del corruption. prev->next should be ffff88019a4ec198, but was 6b6b6b6b6b6b6b6b
      Modules linked in: cpufreq_ondemand acpi_cpufreq freq_table mperf coretemp kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode sg xhci_hcd button sd_mod crc_t10dif aesni_intel ablk_helper cryptd lrw aes_x86_64 xts gf128mul ahci libahci pata_acpi ata_generic dm_mirror dm_region_hash dm_log dm_mod
      Pid: 16400, comm: jbd2/dm-1-8 Tainted: G        W    3.8.0-rc3+ #107
      Call Trace:
       [<ffffffff8106fb0d>] warn_slowpath_common+0xad/0xf0
       [<ffffffff8106fc06>] warn_slowpath_fmt+0x46/0x50
       [<ffffffff813637e9>] ? ext4_journal_commit_callback+0x99/0xc0
       [<ffffffff8148cae0>] __list_del_entry+0x1c0/0x250
       [<ffffffff813637bf>] ext4_journal_commit_callback+0x6f/0xc0
       [<ffffffff813ca336>] jbd2_journal_commit_transaction+0x23a6/0x2570
       [<ffffffff8108aa42>] ? try_to_del_timer_sync+0x82/0xa0
       [<ffffffff8108b491>] ? del_timer_sync+0x91/0x1e0
       [<ffffffff813d3ecf>] kjournald2+0x19f/0x6a0
       [<ffffffff810ad630>] ? wake_up_bit+0x40/0x40
       [<ffffffff813d3d30>] ? bit_spin_lock+0x80/0x80
       [<ffffffff810ac6be>] kthread+0x10e/0x120
       [<ffffffff810ac5b0>] ? __init_kthread_worker+0x70/0x70
       [<ffffffff818ff6ac>] ret_from_fork+0x7c/0xb0
       [<ffffffff810ac5b0>] ? __init_kthread_worker+0x70/0x70
      
      This patch fixes the issue as follows:
      - ext4_journal_commit_callback() makes list traversal truly safe
        simply by always starting from list_head
      - fix race between two ext4_journal_callback_del() and
        ext4_journal_callback_try_del()
      Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
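The "always start from list_head" idea in the journal callback commit can be sketched with a small circular list. This is a single-threaded illustration only: the `node` type, `drain`, and `cb_remove_victim` are hypothetical, and the locking that a real fix would also need is omitted.

```c
#include <stddef.h>

struct node {
    struct node *prev, *next;
    int id;
};

/* Circular list with a sentinel head, in the style of the kernel's list_head. */
static void list_init(struct node *head)
{
    head->prev = head->next = head;
}

static void list_add_tail(struct node *n, struct node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

static void list_del_init(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n->next = n;
}

static int list_empty(const struct node *head)
{
    return head->next == head;
}

/* Example callback: removes another entry, as a concurrent task might. */
static struct node *victim;

static void cb_remove_victim(struct node *n)
{
    (void)n;
    if (victim != NULL) {
        list_del_init(victim);
        victim = NULL;
    }
}

/*
 * Drain the list by re-reading head->next on every iteration, so no
 * cached "next" pointer can go stale. Returns the number of callbacks run.
 */
static int drain(struct node *head, void (*cb)(struct node *))
{
    int n = 0;

    while (!list_empty(head)) {
        struct node *first = head->next;

        list_del_init(first);
        cb(first);
        n++;
    }
    return n;
}
```

A loop that cached the successor before running the callback would follow a pointer to the removed entry; re-reading the head avoids that by construction.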
    • Dmitry Monakhov's avatar
      jbd2: fix race between jbd2_journal_remove_checkpoint and ->j_commit_callback · ec60dced
      Dmitry Monakhov authored
      commit 794446c6 upstream.
      
      The following race is possible:
      
      [kjournald2]                              other_task
      jbd2_journal_commit_transaction()
        j_state = T_FINISHED;
        spin_unlock(&journal->j_list_lock);
                                               ->jbd2_journal_remove_checkpoint()
      					   ->jbd2_journal_free_transaction();
      					     ->kmem_cache_free(transaction)
        ->j_commit_callback(journal, transaction);
          -> USE_AFTER_FREE
      
      WARNING: at lib/list_debug.c:62 __list_del_entry+0x1c0/0x250()
      Hardware name:
      list_del corruption. prev->next should be ffff88019a4ec198, but was 6b6b6b6b6b6b6b6b
      Modules linked in: cpufreq_ondemand acpi_cpufreq freq_table mperf coretemp kvm_intel kvm crc32c_intel ghash_clmulni_intel microcode sg xhci_hcd button sd_mod crc_t10dif aesni_intel ablk_helper cryptd lrw aes_x86_64 xts gf128mul ahci libahci pata_acpi ata_generic dm_mirror dm_region_hash dm_log dm_mod
      Pid: 16400, comm: jbd2/dm-1-8 Tainted: G        W    3.8.0-rc3+ #107
      Call Trace:
       [<ffffffff8106fb0d>] warn_slowpath_common+0xad/0xf0
       [<ffffffff8106fc06>] warn_slowpath_fmt+0x46/0x50
       [<ffffffff813637e9>] ? ext4_journal_commit_callback+0x99/0xc0
       [<ffffffff8148cae0>] __list_del_entry+0x1c0/0x250
       [<ffffffff813637bf>] ext4_journal_commit_callback+0x6f/0xc0
       [<ffffffff813ca336>] jbd2_journal_commit_transaction+0x23a6/0x2570
       [<ffffffff8108aa42>] ? try_to_del_timer_sync+0x82/0xa0
       [<ffffffff8108b491>] ? del_timer_sync+0x91/0x1e0
       [<ffffffff813d3ecf>] kjournald2+0x19f/0x6a0
       [<ffffffff810ad630>] ? wake_up_bit+0x40/0x40
       [<ffffffff813d3d30>] ? bit_spin_lock+0x80/0x80
       [<ffffffff810ac6be>] kthread+0x10e/0x120
       [<ffffffff810ac5b0>] ? __init_kthread_worker+0x70/0x70
       [<ffffffff818ff6ac>] ret_from_fork+0x7c/0xb0
       [<ffffffff810ac5b0>] ? __init_kthread_worker+0x70/0x70
      
      To demonstrate this issue, mount ext4 with the mount -o
      discard option on an SSD.  This makes the callback take longer, so the race
      window becomes wider.
      
      To fix this, we should mark the transaction as finished only after
      the callbacks have completed.
      Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Theodore Ts'o's avatar
      ext4/jbd2: don't wait (forever) for stale tid caused by wraparound · bf170962
      Theodore Ts'o authored
      commit d76a3a77 upstream.
      
      In the case where an inode has a very stale transaction id (tid) in
      i_datasync_tid or i_sync_tid, it's possible that after a very large
      (2**31) number of transactions, that the tid number space might wrap,
      causing tid_geq()'s calculations to fail.
      
      Commit deeeaf13 "jbd2: fix fsync() tid wraparound bug", later modified
      by commit e7b04ac0 "jbd2: don't wake kjournald unnecessarily",
      attempted to fix this problem, but it only avoided kjournald spinning
      forever by fixing the logic in jbd2_log_start_commit().
      
      Unfortunately, in the codepaths in fs/ext4/fsync.c and fs/ext4/inode.c
      that might call jbd2_log_start_commit() with a stale tid, those
      functions will subsequently call jbd2_log_wait_commit() with the same
      stale tid, and then wait for a very long time.  To fix this, we
      replace the calls to jbd2_log_start_commit() and
      jbd2_log_wait_commit() with a call to a new function,
      jbd2_complete_transaction(), which will correctly handle stale tid's.
      
      As a bonus, jbd2_complete_transaction() will avoid locking
      j_state_lock for writing unless a commit needs to be started.  This
      should have a small (but probably not measurable) improvement for
      ext4's scalability.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Reported-by: Ben Hutchings <ben@decadent.org.uk>
      Reported-by: George Barnett <gbarnett@atlassian.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
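The wraparound failure described in the stale-tid commit can be reproduced with a small comparison helper. This sketch assumes a 32-bit unsigned tid and a comparison based on the signed difference; it is modeled on the description of tid_geq() above rather than copied from the jbd2 headers.

```c
#include <stdint.h>

typedef uint32_t tid_t;

/* Wraparound-aware "x >= y": compare via the signed 32-bit difference. */
static int tid_geq(tid_t x, tid_t y)
{
    int32_t difference = (int32_t)(x - y);

    return difference >= 0;
}
```

The comparison is correct while the two tids are less than 2**31 apart, including across the wrap from 0xFFFFFFFF to 0. Once a stale tid falls more than 2**31 behind, the signed difference changes sign and the newer tid appears older, which is the case the commit guards against.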
    • H. Peter Anvin's avatar
      x86-64, init: Do not set NX bits on non-NX capable hardware · b10a9054
      H. Peter Anvin authored
      commit 78d77df7 upstream.
      
      During early init, we would incorrectly set the NX bit even if the NX
      feature was not supported.  Instead, only set this bit if NX is
      actually available and enabled.  We already do very early detection of
      the NX bit to enable it in EFER; this simply extends this detection to
      the early page table mask.
      Reported-by: Fernando Luis Vázquez Cao <fernando@oss.ntt.co.jp>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      Link: http://lkml.kernel.org/r/1367476850.5660.2.camel@nexus
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Richard Cochran's avatar
      e1000e: fix numeric overflow in phc settime method · 851b7ca2
      Richard Cochran authored
      commit 73e3dd6b upstream.
      
      The PTP Hardware Clock settime function in the e1000e driver
      computes nanoseconds from a struct timespec. The code converts the
      seconds field .tv_sec by multiplying it with NSEC_PER_SEC. However,
      both operands are of type long, resulting in an unintended overflow.
      The patch fixes the issue by using the helper function from time.h.
      Signed-off-by: Richard Cochran <richardcochran@gmail.com>
      Tested-by: Aaron Brown <aaron.f.brown@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
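The overflow in the e1000e settime commit comes from multiplying two narrow operands before widening. A minimal sketch of the safe conversion follows; `to_ns` is a hypothetical helper, and `int32_t` stands in for a 32-bit `long` tv_sec as an assumption for illustration. The kernel helper the commit refers to is not reproduced here.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000L

/* int32_t models a 32-bit `long` tv_sec (assumption for illustration). */
static int64_t to_ns(int32_t tv_sec, int32_t tv_nsec)
{
    /* Widen before multiplying so the product is computed in 64 bits. */
    return (int64_t)tv_sec * NSEC_PER_SEC + tv_nsec;
}
```

With 32-bit operands, any tv_sec above 2 already produces a product that does not fit in 32 bits, so the cast has to happen before the multiplication, not after it.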
    • Jacob Keller's avatar
      ixgbe: fix EICR write in ixgbe_msix_other · f5f5be06
      Jacob Keller authored
      commit d87d8307 upstream.
      
      Previously, ixgbe_msix_other wrote back the full 32 bits of the set
      interrupts, instead of only the ones which ixgbe_msix_other
      handles. This resulted in a loss of performance when the X540's PPS feature is
      enabled, because it sometimes cleared queue interrupts, so the driver
      did not get the interrupt for cleaning the q_vector rings often enough. The fix
      is to simply mask off the lower 16 bits so that this handler does not write them
      to the EICR; they then remain high and are properly handled by the
      clean_rings interrupt routine as normal.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
      Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
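The masking step described in the ixgbe commit amounts to a single bitwise AND. In this sketch the mask name `OTHER_CAUSE_MASK` and the helper `other_causes` are hypothetical; the only detail taken from the commit message is that the lower 16 bits are the ones to leave alone.

```c
#include <stdint.h>

/* Assumed mask: keep the upper 16 bits, drop the 16 queue-interrupt bits. */
#define OTHER_CAUSE_MASK 0xFFFF0000u

/* Value the "other" handler would write back, leaving queue bits untouched. */
static uint32_t other_causes(uint32_t eicr)
{
    return eicr & OTHER_CAUSE_MASK;
}
```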
    • Robin Holt's avatar
      ipc: sysv shared memory limited to 8TiB · c10d1bc5
      Robin Holt authored
      commit d69f3bad upstream.
      
      While trying to run an application that puts data into half of memory
      using shmget(), we found that a shmall value below 8EiB-8TiB prevented
      us from using anything more than 8TiB.  Setting kernel.shmall greater
      than 8EiB-8TiB made the job work.
      
      In the newseg() function, the limit comes from ns->shm_tot, which at 8TiB
      reaches INT_MAX.
      
      ipc/shm.c:
       static int newseg(struct ipc_namespace *ns, struct ipc_params *params)
       {
      ...
               int numpages = (size + PAGE_SIZE -1) >> PAGE_SHIFT;
      ...
               if (ns->shm_tot + numpages > ns->shm_ctlall)
                       return -ENOSPC;
      
      [akpm@linux-foundation.org: make ipc/shm.c:newseg()'s numpages size_t, not int]
      Signed-off-by: Robin Holt <holt@sgi.com>
      Reported-by: Alex Thorlton <athorlton@sgi.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
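The 8TiB boundary in the shared-memory commit follows from the page arithmetic. The sketch below assumes 4 KiB pages (PAGE_SHIFT of 12) and uses hypothetical helpers `pages_needed` and `overflows_int` to show where a page count stops fitting in a signed 32-bit int.

```c
#include <stdint.h>

#define PAGE_SHIFT 12                          /* assumes 4 KiB pages */
#define PAGE_SIZE  (UINT64_C(1) << PAGE_SHIFT)

/* Round a byte size up to whole pages, using a 64-bit count. */
static uint64_t pages_needed(uint64_t size)
{
    return (size + PAGE_SIZE - 1) >> PAGE_SHIFT;
}

/* Returns 1 if the page count no longer fits in a signed 32-bit int. */
static int overflows_int(uint64_t size)
{
    return pages_needed(size) > (uint64_t)INT32_MAX;
}
```

Under that assumption, 8TiB is 2^43 bytes, or 2^31 pages, which is one more than INT32_MAX; a segment one page smaller still fits.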
    • Johannes Berg's avatar
      wireless: regulatory: fix channel disabling race condition · 50f3a76b
      Johannes Berg authored
      commit 990de49f upstream.
      
      When a full 2.4 and 5 GHz scan is scheduled, but then the 2.4 GHz
      part of the scan disables a 5.2 GHz channel due to, e.g. receiving
      country or frequency information, that 5.2 GHz channel might already
      be in the list of channels to scan next. Then, when the driver checks
      if it should do a passive scan, that will return false and attempt an
      active scan. This is not only wrong but can also lead to the iwlwifi
      device firmware crashing since it checks regulatory as well.
      
      Fix this by not setting the channel flags to just disabled but rather
      OR'ing in the disabled flag. That way, even if the race happens, the
      channel will be scanned passively which is still (mostly) correct.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
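The difference between overwriting the channel flags and OR'ing in the disabled flag, as the regulatory commit describes, can be shown with two one-line helpers. The flag names and values here are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* Hypothetical flag values for illustration only. */
#define CHAN_DISABLED     (1u << 0)
#define CHAN_PASSIVE_SCAN (1u << 1)

/* Old behaviour: every other flag, including passive-scan, is lost. */
static uint32_t disable_overwrite(uint32_t flags)
{
    (void)flags;
    return CHAN_DISABLED;
}

/* Fixed behaviour: the disabled bit is added, existing flags survive. */
static uint32_t disable_or(uint32_t flags)
{
    return flags | CHAN_DISABLED;
}
```

With the OR form, a scan that races with the update still sees the passive-scan flag and does not fall back to an active scan.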
    • Bryan Schumaker's avatar
      nfsd: Decode and send 64bit time values · c8a2df2b
      Bryan Schumaker authored
      commit bf8d9097 upstream.
      
      The seconds field of an nfstime4 structure is 64bit, but we are assuming
      that the first 32bits are zero-filled.  So if the client tries to set
      atime to a value before the epoch (touch -t 196001010101), then the
      server will save the wrong value on disk.
      Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
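The pre-epoch case in the nfsd time commit can be illustrated by assembling a 64-bit seconds value from two 32-bit words. The helpers below are hypothetical and assume both words are already in host byte order; they are not the nfsd decode routines.

```c
#include <stdint.h>

/* Combine the high and low 32-bit words into a signed 64-bit seconds value. */
static int64_t decode_seconds(uint32_t hi, uint32_t lo)
{
    return (int64_t)(((uint64_t)hi << 32) | lo);
}

/* Buggy variant: assumes the high word is zero and ignores it. */
static int64_t decode_seconds_low_only(uint32_t hi, uint32_t lo)
{
    (void)hi;
    return (int64_t)lo;
}
```

For a time ten seconds before the epoch, the high word is all ones; dropping it turns a small negative value into a large positive one.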
    • Wei Yongjun's avatar
      nfsd: use kmem_cache_free() instead of kfree() · 2696526a
      Wei Yongjun authored
      commit 2c44a234 upstream.
      
      Memory allocated by kmem_cache_alloc() should be freed using
      kmem_cache_free(), not kfree().
      Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • fanchaoting's avatar
      nfsd: don't run get_file if nfs4_preprocess_stateid_op return error · 69aa67b1
      fanchaoting authored
      commit b022032e upstream.
      
      We should return the error status directly when nfs4_preprocess_stateid_op
      returns an error.
      Signed-off-by: fanchaoting <fanchaoting@cn.fujitsu.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • J. Bruce Fields's avatar
      nfsd4: don't close read-write opens too soon · 4ef63fed
      J. Bruce Fields authored
      commit 0c7c3e67 upstream.
      
      Don't actually close any opens until we don't need them at all.
      
      This means being left with write access when it's not really necessary,
      but that's better than putting a file that might still have posix locks
      held on it, as we have been.
      Reported-by: Toralf Förster <toralf.foerster@gmx.de>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Trond Myklebust's avatar
      NFSv4: Handle NFS4ERR_DELAY and NFS4ERR_GRACE in nfs4_open_delegation_recall · a81dc6b7
      Trond Myklebust authored
      commit 8b6cc4d6 upstream.
      
      A server shouldn't normally return NFS4ERR_GRACE if the client holds a
      delegation, since no conflicting lock reclaims can be granted, however
      the spec does not require the server to grant the open in this
      instance.
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Trond Myklebust's avatar
      NFSv4: Handle NFS4ERR_DELAY and NFS4ERR_GRACE in nfs4_lock_delegation_recall · 82f09f78
      Trond Myklebust authored
      commit dbb21c25 upstream.
      
      A server shouldn't normally return NFS4ERR_GRACE if the client holds a
      delegation, since no conflicting lock reclaims can be granted, however
      the spec does not require the server to grant the lock in this
      instance.
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Shaohua Li's avatar
      MD: ignore discard request for hard disks of hybid raid1/raid10 array · 4e8ff554
      Shaohua Li authored
      commit 32f9f570 upstream.
      
      In SSD/hard disk hybrid storage, discard requests should be ignored for hard
      disks. We used to do it this way, but the unplug path forgot it.
      
      This is suitable for stable tree since v3.6.
      Reported-and-tested-by: Markus <M4rkusXXL@web.de>
      Signed-off-by: Shaohua Li <shli@fusionio.com>
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • NeilBrown's avatar
      md: bad block list should default to disabled. · aaf49388
      NeilBrown authored
      commit 486adf72 upstream.
      
      Maintenance of a bad-block-list currently defaults to 'enabled'
      and is then disabled when it cannot be supported.
      This is backwards and causes problems for dm-raid, which didn't know
      to disable it.
      
      So fix the defaults, and only enable it for v1.x metadata which
      explicitly has bad blocks enabled.
      
      The problem with dm-raid has been present since badblock support was
      added in v3.1, so this patch is suitable for any -stable from 3.1
      onwards.
      Reported-by: Jonathan Brassow <jbrassow@redhat.com>
      Signed-off-by: NeilBrown <neilb@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Trond Myklebust's avatar
      LOCKD: Ensure that nlmclnt_block resets block->b_status after a server reboot · 17f978dd
      Trond Myklebust authored
      commit 1dfd89af upstream.
      
      After a server reboot, the reclaimer thread will recover all the existing
      locks. For locks that are blocked, however, it will change the value
      of block->b_status to nlm_lck_denied_grace_period in order to signal that
      they need to wake up and resend the original blocking lock request.
      
      Due to a bug, however, the block->b_status never gets reset after the
      blocked locks have been woken up, and so the process goes into an
      infinite loop of resends until the blocked lock is satisfied.
      Reported-by: Marc Eshel <eshel@us.ibm.com>
      Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Oleg Nesterov's avatar
      exec: do not abuse ->cred_guard_mutex in threadgroup_lock() · 0565d711
      Oleg Nesterov authored
      commit e56fb287 upstream.
      
      threadgroup_lock() takes signal->cred_guard_mutex to ensure that
      thread_group_leader() is stable.  This doesn't look nice, the scope of
      this lock in do_execve() is huge.
      
      And as Dave pointed out this can lead to deadlock, we have the
      following dependencies:
      
      	do_execve:		cred_guard_mutex -> i_mutex
      	cgroup_mount:		i_mutex -> cgroup_mutex
      	attach_task_by_pid:	cgroup_mutex -> cred_guard_mutex
      
      Change de_thread() to take threadgroup_change_begin() around the
      switch-the-leader code and change threadgroup_lock() to avoid
      ->cred_guard_mutex.
      
      Note that de_thread() can't sleep with ->group_rwsem held, this can
      obviously deadlock with the exiting leader if the writer is active, so it
      does threadgroup_change_end() before schedule().
      Reported-by: Dave Jones <davej@redhat.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Greg Thelen's avatar
      fs/dcache.c: add cond_resched() to shrink_dcache_parent() · 9c3d6c10
      Greg Thelen authored
      commit 421348f1 upstream.
      
      Call cond_resched() in shrink_dcache_parent() to maintain interactivity.
      
      Before this patch:
      
      	void shrink_dcache_parent(struct dentry * parent)
      	{
      		while ((found = select_parent(parent, &dispose)) != 0)
      			shrink_dentry_list(&dispose);
      	}
      
      select_parent() populates the dispose list with dentries which
      shrink_dentry_list() then deletes.  select_parent() carefully uses
      need_resched() to avoid doing too much work at once.  But neither
      shrink_dcache_parent() nor its called functions call cond_resched().  So
      once need_resched() is set, select_parent() will return a single-dentry
      dispose list which is then deleted by shrink_dentry_list().  This is
      inefficient when there are a lot of dentries to process.  This can cause
      softlockups and hurts interactivity on non-preemptible kernels.
      
      This change adds cond_resched() in shrink_dcache_parent().  The benefit
      of this is that need_resched() is quickly cleared so that future calls
      to select_parent() are able to efficiently return a big batch of dentries.
      
      These additional cond_resched() do not seem to impact performance, at
      least for the workload below.
      
      Here is a program which can cause soft lockup if other system activity
      sets need_resched().
      
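      	#include <err.h>
      	#include <fcntl.h>
      	#include <stdio.h>
      	#include <stdlib.h>
      	#include <sys/resource.h>
      	#include <sys/stat.h>
      	#include <sys/time.h>
      	#include <unistd.h>

      	int main(void)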
      	{
      	        struct rlimit rlim;
      	        int i;
      	        int f[100000];
      	        char buf[20];
      	        struct timeval t1, t2;
      	        double diff;
      
      	        /* cleanup past run */
      	        system("rm -rf x");
      
      	        /* boost nfile rlimit */
      	        rlim.rlim_cur = 200000;
      	        rlim.rlim_max = 200000;
      	        if (setrlimit(RLIMIT_NOFILE, &rlim))
      	                err(1, "setrlimit");
      
      	        /* make directory for files */
      	        if (mkdir("x", 0700))
      	                err(1, "mkdir");
      
      	        if (gettimeofday(&t1, NULL))
      	                err(1, "gettimeofday");
      
      	        /* populate directory with open files */
      	        for (i = 0; i < 100000; i++) {
      	                snprintf(buf, sizeof(buf), "x/%d", i);
      	                f[i] = open(buf, O_CREAT, 0600);
      	                if (f[i] == -1)
      	                        err(1, "open");
      	        }
      
      	        /* close some of the files */
      	        for (i = 0; i < 85000; i++)
      	                close(f[i]);
      
      	        /* unlink all files, even open ones */
      	        system("rm -rf x");
      
      	        if (gettimeofday(&t2, NULL))
      	                err(1, "gettimeofday");
      
      	        diff = (((double)t2.tv_sec * 1000000 + t2.tv_usec) -
      	                ((double)t1.tv_sec * 1000000 + t1.tv_usec));
      
      	        printf("done: %g elapsed\n", diff/1e6);
      	        return 0;
      	}
      Signed-off-by: Greg Thelen <gthelen@google.com>
      Signed-off-by: Dave Chinner <david@fromorbit.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Zhao Hongjiang's avatar
      inotify: invalid mask should return a error number but not set it · 550fbb43
      Zhao Hongjiang authored
      commit 04df32fa upstream.
      
      When we run the crackerjack testsuite, the inotify_add_watch test is
      stalled.
      
      This is caused by the invalid mask 0 - the task is waiting for the event
      but it never comes.  inotify_add_watch() should return -EINVAL as it did
      before commit 676a0675 ("inotify: remove broken mask checks causing
      unmount to be EINVAL").  That commit removes the invalid mask check, but
      that check is needed.
      
      Check the mask's ALL_INOTIFY_BITS before the inotify_arg_to_mask() call.
      If none are set, just return -EINVAL.
      
      Because IN_UNMOUNT is in ALL_INOTIFY_BITS, this change will not trigger
      the problem that the above commit fixed.
      
      [akpm@linux-foundation.org: fix build]
      Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
      Acked-by: Jim Somerville <Jim.Somerville@windriver.com>
      Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: Eric Paris <eparis@parisplace.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
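The mask check described in the inotify commit can be sketched as a small validation helper. The constants below are hypothetical stand-ins (the kernel's ALL_INOTIFY_BITS is not reproduced), chosen so that the toy "unmount" bit is part of the accepted set, mirroring the reasoning in the commit message.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical bit layout for illustration only. */
#define TOY_EVENT_BITS 0x00000FFFu
#define TOY_UNMOUNT    0x00002000u
#define TOY_ALL_BITS   (TOY_EVENT_BITS | TOY_UNMOUNT)

/* Reject a mask that selects no known bit at all; accept everything else. */
static int check_mask(uint32_t mask)
{
    if (!(mask & TOY_ALL_BITS))
        return -EINVAL;
    return 0;
}
```

A mask of 0 is rejected up front, so no caller is left waiting for an event that can never arrive, while a mask containing only the unmount bit is still accepted.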
    • Robert Richter's avatar
      sata_highbank: Rename proc_name to the module name · 4ad8e5d3
      Robert Richter authored
      commit 2cc1144a upstream.
      
      mkinitrd looks at /sys/class/scsi_host/host$hostnum/proc_name to find
      the module name of a disk driver. Current name is "highbank-ahci" but
      the module is "sata_highbank". Rename it to match the module name.
      Signed-off-by: Robert Richter <robert.richter@calxeda.com>
      Cc: Rob Herring <rob.herring@calxeda.com>
      Cc: Alexander Graf <agraf@suse.de>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Thomas Gleixner's avatar
      clockevents: Set dummy handler on CPU_DEAD shutdown · ee50f837
      Thomas Gleixner authored
      commit 6f7a05d7 upstream.
      
      Vitaliy reported that a per cpu HPET timer interrupt crashes the
      system during hibernation. What happens is that the per cpu HPET timer
      gets shut down when the nonboot cpus are stopped. When the nonboot
      cpus are onlined again the HPET code sets up the MSI interrupt which
      fires before the clock event device is registered. The event handler
      is still set to hrtimer_interrupt, which then crashes the machine due
      to highres mode not being active.
      
      See http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=700333
      
      There is no really good way to avoid that in the HPET code. The HPET
      code already has a mechanism to detect spurious interrupts when event
      handler == NULL for a similar reason.
      
      We can handle that in the clockevent/tick layer and replace the
      previous functional handler with a dummy handler like we do in
      tick_setup_new_device().
      
      The original clockevents code did this in clockevents_exchange_device(),
      but that got removed by commit 7c1e7689 (clockevents: prevent
      clockevent event_handler ending up handler_noop) which forgot to fix
      it up in tick_shutdown(). Same issue with the broadcast device.
      Reported-by: Vitaliy Fillipov <vitalif@yourcmc.ru>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      Cc: 700333@bugs.debian.org
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Steven Rostedt's avatar
      localmodconfig: Process source kconfig files as they are found · 3e3745f5
      Steven Rostedt authored
      commit ced9cb1a upstream.
      
      A bug was reported that caused localmodconfig to not keep all the
      dependencies of ATH9K. This was caused by the kconfig file:
      
      In drivers/net/wireless/ath/Kconfig:
    • Li Zefan's avatar
      cgroup: fix broken file xattrs · ae373596
      Li Zefan authored
      commit 712317ad upstream.
      
      We should store file xattrs in struct cfent instead of struct cftype,
      because cftype is a type while cfent is an object instance of cftype.
      
      For example each cgroup has a tasks file, and each tasks file is
      associated with a unique cfent, but all those files share the same
      struct cftype.
      
      Alexey Kodanev reported a crash, which can be reproduced:
      
        # mount -t cgroup -o xattr /sys/fs/cgroup
        # mkdir /sys/fs/cgroup/test
        # setfattr -n trusted.value -v test_value /sys/fs/cgroup/tasks
        # rmdir /sys/fs/cgroup/test
        # umount /sys/fs/cgroup
        oops!
      
      In this case, simple_xattrs_free() will free the same struct simple_xattrs
      twice.
      
      tj: Dropped unused local variable @cft from cgroup_diput().
      Reported-by: Alexey Kodanev <alexey.kodanev@oracle.com>
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Li Zefan's avatar
      cgroup: fix an off-by-one bug which may trigger BUG_ON() · d52008e4
      Li Zefan authored
      commit 3ac1707a upstream.
      
      The 3rd parameter of flex_array_prealloc() is the number of elements,
      not the index of the last element.
      
      The effect of the bug is that, when opening cgroup.procs, a flex array
      is allocated and all elements of the array are allocated with the
      GFP_KERNEL flag except the last one, which is allocated with GFP_ATOMIC;
      if that allocation fails, it triggers a BUG_ON().
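      
      The count-versus-index confusion can be modelled in userspace. The
      sketch below is a stand-in, not kernel code: toy_array, toy_prealloc()
      and toy_all_preallocated() are hypothetical names, and only the meaning
      of the third parameter (a number of elements) is taken from the
      description above.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX 16

/* Hypothetical stand-in for a flex array: tracks which slots are backed. */
struct toy_array {
	bool backed[TOY_MAX];
};

/*
 * Preallocate nr_elements slots starting at index start.
 * The third parameter is a count, not the index of the last element.
 * Returns 0 on success, -1 if the range does not fit.
 */
int toy_prealloc(struct toy_array *a, size_t start, size_t nr_elements)
{
	if (start > TOY_MAX || nr_elements > TOY_MAX - start)
		return -1;
	for (size_t i = start; i < start + nr_elements; i++)
		a->backed[i] = true;
	return 0;
}

/* True when the first length slots have all been preallocated. */
bool toy_all_preallocated(const struct toy_array *a, size_t length)
{
	for (size_t i = 0; i < length; i++)
		if (!a->backed[i])
			return false;
	return true;
}
```

      In this model, passing length - 1 as the count leaves the last slot
      unbacked, while passing length covers every slot.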
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Zhang Rui's avatar
      ACPI / thermal: do not always return THERMAL_TREND_RAISING for active trip points · 0252cb3c
      Zhang Rui authored
      commit 94a40931 upstream.
      
      Commit 4ae46bef "Thermal: Introduce thermal_zone_trip_update()"
      introduced a regression causing the fan to be always on even when
      the system is idle.
      
      My original idea in that commit was that:
       - when the current temperature is above the trip point,
         keep the fan on, even if the temperature is dropping.
       - when the current temperature is below the trip point,
         turn on the fan when the temperature is rising,
         turn off the fan when the temperature is dropping.
      
      But this is what the code actually does:
       - when the current temperature is above the trip point,
         the fan stays on.
       - when the current temperature is below the trip point,
         the fan is always on because thermal_get_trend()
         in drivers/acpi/thermal.c returns THERMAL_TREND_RAISING.
      Thus the fan keeps running even if the system is idle.
      
      Fix this in drivers/acpi/thermal.c.
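      
      The intended policy listed above can be written as a small decision
      function. This is only a sketch of the stated idea, not the code in
      drivers/acpi/thermal.c: toy_trend, fan_should_run() and its parameters
      are hypothetical, and treating a temperature exactly at the trip point
      as "not above" is an assumption, since the description does not cover
      that case.

```c
#include <stdbool.h>

/* Hypothetical trend values for this sketch (not the kernel's enum). */
enum toy_trend { TOY_TREND_DROPPING, TOY_TREND_RISING };

/*
 * Intended policy for an active trip point:
 *  - above the trip point: keep the fan on regardless of the trend;
 *  - otherwise: run the fan only while the temperature is rising.
 */
bool fan_should_run(long temp, long trip_temp, enum toy_trend trend)
{
	if (temp > trip_temp)
		return true;
	return trend == TOY_TREND_RISING;
}
```

      The regression described above corresponds to the second branch
      always seeing a rising trend, so the function would never return false.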
      
      [rjw: Changelog]
      References: https://bugzilla.kernel.org/show_bug.cgi?id=56591
      References: https://bugzilla.kernel.org/show_bug.cgi?id=56601
      References: https://bugzilla.kernel.org/show_bug.cgi?id=50041#c45
      Signed-off-by: Zhang Rui <rui.zhang@intel.com>
      Tested-by: Matthias <morpheusxyz123@yahoo.de>
      Tested-by: Ville Syrjälä <syrjala@sci.fi>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Wang YanQing's avatar
      ACPI: Fix wrong parameter passed to memblock_reserve · 9c2455ef
      Wang YanQing authored
      commit a6432ded upstream.
      
      Commit 53aac44c (ACPI: Store valid ACPI tables passed via early initrd
      in reserved memblock areas) introduced acpi_initrd_override() that
      passes a wrong value as the second argument to memblock_reserve().
      
      Namely, the second argument of memblock_reserve() is the size of the
      region, not the address of the top of it, so make
      acpi_initrd_override() pass the size in there as appropriate.
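      
      The difference between passing a size and passing a top address can be
      shown with a toy reservation helper. toy_region, toy_reserve(),
      region_size() and toy_region_end() are hypothetical names used only
      for illustration, and the half-open range [start, end) is an
      assumption; only the (base, size) argument order comes from the
      description above.

```c
#include <stdint.h>

/* Hypothetical record of a reserved range [base, base + size). */
struct toy_region {
	uint64_t base;
	uint64_t size;
};

/* The second argument is the size of the region, not its top address. */
struct toy_region toy_reserve(uint64_t base, uint64_t size)
{
	struct toy_region r = { .base = base, .size = size };
	return r;
}

/* Convert a [start, end) pair into the size a (base, size) API expects. */
uint64_t region_size(uint64_t start, uint64_t end)
{
	return end - start;
}

/* First address past the reserved range. */
uint64_t toy_region_end(struct toy_region r)
{
	return r.base + r.size;
}
```

      With start = 0x1000 and end = 0x3000, passing end as the second
      argument makes the range extend to 0x4000, while passing
      region_size(start, end) stops it at 0x3000.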
      
      [rjw: Changelog]
      Signed-off-by: Wang YanQing <udknight@gmail.com>
      Acked-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Aaron Lu's avatar
      libata: acpi: make ata_ap_acpi_handle not block · 98ab042f
      Aaron Lu authored
      commit d66af4df upstream.
      
      Since commit 30dcf76a, ata_ap_acpi_handle will always do a namespace
      walk, which requires acquiring an acpi namespace mutex. This made it
      impossible to use when the calling path already holds a spinlock.
      
      For example, it can occur in the following code path for pata_acpi:
      ata_scsi_queuecmd (ap->lock is acquired)
        __ata_scsi_queuecmd
          ata_scsi_translate
            ata_qc_issue
              pacpi_qc_issue
                ata_acpi_stm
                  ata_ap_acpi_handle
                    acpi_get_child
                      acpi_walk_namespace
                        acpi_ut_acquire_mutex (acquire mutex while holding lock)
      This caused a scheduling-while-atomic bug, as reported in bug #56781.
      
      Actually, ata_ap_acpi_handle doesn't have to walk the namespace every
      time it is called; it can simply return the bound acpi handle on the
      corresponding SCSI host. The reason it was not done this way before is
      that ata_ap_acpi_handle is used in the binding function
      ata_acpi_bind_host by ata_acpi_gtm, when the handle is not yet bound to
      the SCSI host. Since we already have the ATA port's handle in its
      binding function, we can simply use it instead of calling
      ata_ap_acpi_handle there. So introduce a new function, __ata_acpi_gtm,
      which receives an acpi handle parameter in addition to the ATA port,
      the latter being used solely for a debug statement. With this change,
      ata_ap_acpi_handle can simply return the bound handle for the SCSI host
      instead of walking the acpi namespace.
      
      Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=56781
      Reported-and-tested-by: <kenzopl@o2.pl>
      Signed-off-by: Aaron Lu <aaron.lu@intel.com>
      Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Johan Hovold's avatar
      drivers/rtc/rtc-at91rm9200.c: fix missing iounmap · 5b6a8e8e
      Johan Hovold authored
      commit 3427de92 upstream.
      
      Add missing iounmap to probe error path and remove.
      Signed-off-by: Johan Hovold <jhovold@gmail.com>
      Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Derek Basehore's avatar
      drivers/rtc/rtc-cmos.c: don't disable hpet emulation on suspend · ec551567
      Derek Basehore authored
      commit e005715e upstream.
      
      There's a bug where rtc alarms are ignored after the rtc cmos suspends
      but before the system finishes suspend.  Since hpet emulation is
      disabled while the hpet still handles the interrupts, the wake event,
      which is registered from the rtc layer, is never registered.
      
      This patch reverts commit d1b2efa8 ("rtc: disable hpet emulation on
      suspend") which disabled hpet emulation.  To fix the problem mentioned
      in that commit, hpet_rtc_timer_init() is called directly on resume.
      Signed-off-by: Derek Basehore <dbasehore@chromium.org>
      Cc: Maxim Levitsky <maximlevitsky@gmail.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Mel Gorman's avatar
      mm: swap: mark swap pages writeback before queueing for direct IO · 9940e550
      Mel Gorman authored
      commit 0cdc444a upstream.
      
      As pointed out by Andrew Morton, the swap-over-NFS writeback is not
      setting PageWriteback before it is queued for direct IO.  While swap
      pages do not participate in BDI or process dirty accounting and the IO
      is synchronous, the writeback bit is still required and not setting it
      in this case was an oversight.  swapoff depends on the page writeback to
      synchronise all pending writes on a swap page before it is reused.
      Swapcache freeing and reuse depend on checking the PageWriteback under
      lock to ensure the page is safe to reuse.
      
      Direct IO handlers and the direct IO handler for NFS do not deal with
      PageWriteback as they are synchronous writes.  In the case of NFS, it
      schedules pages (or a page in the case of swap) for IO and then waits
      synchronously for IO to complete in nfs_direct_write().  It is
      recognised that this is a slowdown from normal swap handling which is
      asynchronous and uses a completion handler.  Shoving PageWriteback
      handling down into direct IO handlers looks like a bad fit to handle the
      swap case although it may have to be dealt with some day if swap is
      converted to use direct IO in general and bmap is finally done away
      with.  At that point it will be necessary to refit asynchronous direct
      IO with completion handlers onto the swap subsystem.
      
      As swapcache currently depends on PageWriteback to protect against
      races, this patch sets PageWriteback under the page lock before queueing
      it for direct IO.  It is cleared when the direct IO handler returns.  IO
      errors are treated similarly to the direct-to-bio case except PageError
      is not set as in the case of swap-over-NFS, it is likely to be a
      transient error.
      
      It was asked what prevents such a page being reclaimed in parallel.
      With this patch applied, such a page will now be skipped (most of the
      time) or blocked until the writeback completes.  Reclaim checks
      PageWriteback under the page lock before calling try_to_free_swap and
      the page lock should prevent the page being requeued for IO before it is
      freed.
      
      This and Jerome's related patch should be considered for -stable as far
      back as 3.6 when swap-over-NFS was introduced.
      
      [akpm@linux-foundation.org: use pr_err_ratelimited()]
      [akpm@linux-foundation.org: remove hopefully-unneeded cast in printk]
      Signed-off-by: Mel Gorman <mgorman@suse.de>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Jerome Marchand's avatar
      swap: redirty page if page write fails on swap file · 4e5d8307
      Jerome Marchand authored
      commit 2d30d31e upstream.
      
      Since commit 62c230bc ("mm: add support for a filesystem to activate
      swap files and use direct_IO for writing swap pages"), swap_writepage()
      calls direct_IO on swap files.  However, in that case the page isn't
      redirtied if I/O fails, and is therefore handled afterwards as if it has
      been successfully written to the swap file, leading to memory corruption
      when the page is eventually swapped back in.
      
      This patch sets the page dirty when direct_IO() fails.  It fixes a
      memory corruption that happened while using swap-over-NFS.
      Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Acked-by: Mel Gorman <mgorman@suse.de>
      Cc: Hugh Dickins <hughd@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Prarit Bhargava's avatar
      hrtimer: Add expiry time overflow check in hrtimer_interrupt · 66e79283
      Prarit Bhargava authored
      commit 8f294b5a upstream.
      
      The settimeofday01 test in the LTP testsuite effectively does
      
              gettimeofday(current time);
              settimeofday(Jan 1, 1970 + 100 seconds);
              settimeofday(current time);
      
      This test causes a stack trace to be displayed on the console during the
      setting of timeofday to Jan 1, 1970 + 100 seconds:
      
      [  131.066751] ------------[ cut here ]------------
      [  131.096448] WARNING: at kernel/time/clockevents.c:209 clockevents_program_event+0x135/0x140()
      [  131.104935] Hardware name: Dinar
      [  131.108150] Modules linked in: sg nfsv3 nfs_acl nfsv4 auth_rpcgss nfs dns_resolver fscache lockd sunrpc nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables kvm_amd kvm sp5100_tco bnx2 i2c_piix4 crc32c_intel k10temp fam15h_power ghash_clmulni_intel amd64_edac_mod pcspkr serio_raw edac_mce_amd edac_core microcode xfs libcrc32c sr_mod sd_mod cdrom ata_generic crc_t10dif pata_acpi radeon i2c_algo_bit drm_kms_helper ttm drm ahci pata_atiixp libahci libata usb_storage i2c_core dm_mirror dm_region_hash dm_log dm_mod
      [  131.176784] Pid: 0, comm: swapper/28 Not tainted 3.8.0+ #6
      [  131.182248] Call Trace:
      [  131.184684]  <IRQ>  [<ffffffff810612af>] warn_slowpath_common+0x7f/0xc0
      [  131.191312]  [<ffffffff8106130a>] warn_slowpath_null+0x1a/0x20
      [  131.197131]  [<ffffffff810b9fd5>] clockevents_program_event+0x135/0x140
      [  131.203721]  [<ffffffff810bb584>] tick_program_event+0x24/0x30
      [  131.209534]  [<ffffffff81089ab1>] hrtimer_interrupt+0x131/0x230
      [  131.215437]  [<ffffffff814b9600>] ? cpufreq_p4_target+0x130/0x130
      [  131.221509]  [<ffffffff81619119>] smp_apic_timer_interrupt+0x69/0x99
      [  131.227839]  [<ffffffff8161805d>] apic_timer_interrupt+0x6d/0x80
      [  131.233816]  <EOI>  [<ffffffff81099745>] ? sched_clock_cpu+0xc5/0x120
      [  131.240267]  [<ffffffff814b9ff0>] ? cpuidle_wrap_enter+0x50/0xa0
      [  131.246252]  [<ffffffff814b9fe9>] ? cpuidle_wrap_enter+0x49/0xa0
      [  131.252238]  [<ffffffff814ba050>] cpuidle_enter_tk+0x10/0x20
      [  131.257877]  [<ffffffff814b9c89>] cpuidle_idle_call+0xa9/0x260
      [  131.263692]  [<ffffffff8101c42f>] cpu_idle+0xaf/0x120
      [  131.268727]  [<ffffffff815f8971>] start_secondary+0x255/0x257
      [  131.274449] ---[ end trace 1151a50552231615 ]---
      
      When we change the system time to a low value like this, the value of
      timekeeper->offs_real will be a negative value.
      
      It seems that the WARN occurs because an hrtimer has been started in the time
      between the releasing of the timekeeper lock and the IPI call (via a call to
      on_each_cpu) in clock_was_set() in the do_settimeofday() code.  The end result
      is that a REALTIME_CLOCK timer has been added with softexpires = expires =
      KTIME_MAX.  The hrtimer_interrupt() fires/is called and the loop at
      kernel/hrtimer.c:1289 is executed.  In this loop the code subtracts the
      clock base's offset (which was set to timekeeper->offs_real in
      do_settimeofday()) from the current hrtimer_cpu_base->expiry value (which
      was KTIME_MAX):
      
      	KTIME_MAX - (a negative value) = overflow
      
      A simple check for an overflow can resolve this problem.  Using KTIME_MAX
      instead of the overflow value will result in the hrtimer function being run,
      and the reprogramming of the timer after that.
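      
      The overflow and the clamp to KTIME_MAX can be illustrated outside the
      kernel with plain 64-bit integers. This is a sketch under stated
      assumptions, not the patch itself: TOY_KTIME_MAX and
      expiry_minus_offset() are hypothetical names, and the function checks
      the range before subtracting (so the signed arithmetic never
      overflows) rather than inspecting the result afterwards.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_KTIME_MAX INT64_MAX

/*
 * Compute expires - offset, returning TOY_KTIME_MAX when the true result
 * would exceed it. A negative offset (as after setting the clock close to
 * Jan 1, 1970) is the case that can overflow.
 * Assumes expires >= 0, as for an absolute expiry time.
 */
int64_t expiry_minus_offset(int64_t expires, int64_t offset)
{
	assert(expires >= 0);
	/* TOY_KTIME_MAX + offset cannot overflow when offset is negative. */
	if (offset < 0 && expires > TOY_KTIME_MAX + offset)
		return TOY_KTIME_MAX;
	return expires - offset;
}
```

      For a timer whose expiry is already the maximum value, any negative
      offset yields the maximum again instead of a wrapped result.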
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Prarit Bhargava <prarit@redhat.com>
      [jstultz: Tweaked commit subject]
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • David Engraf's avatar
      hrtimer: Fix ktime_add_ns() overflow on 32bit architectures · f931d5e4
      David Engraf authored
      commit 51fd36f3 upstream.
      
      One can trigger an overflow when using ktime_add_ns() on a 32bit
      architecture not supporting CONFIG_KTIME_SCALAR.
      
      When passing a very high value for u64 nsec, e.g. 7881299347898368000,
      the do_div() function converts this value to seconds (7881299347), which
      is still too high to pass to the ktime_set() function as a long. The
      result is a negative value.
      
      The problem on my system occurs in the tick-sched.c,
      tick_nohz_stop_sched_tick() when time_delta is set to
      timekeeping_max_deferment(). The check for time_delta < KTIME_MAX is
      valid, thus ktime_add_ns() is called with a too large value resulting in
      a negative expire value. This leads to an endless loop in the ticker code:
      
      time_delta: 7881299347898368000
      expires = ktime_add_ns(last_update, time_delta)
      expires: negative value
      
      This fix caps the value to KTIME_MAX.
      
      This error doesn't occur on 64bit or on architectures supporting
      CONFIG_KTIME_SCALAR (e.g. ARM, x86-32).
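      
      The arithmetic behind the failure can be checked with a small
      userspace model. This is a sketch only: toy_ktime, toy_split_ns() and
      TOY_NSEC_PER_SEC are hypothetical names, and the 32-bit seconds field
      stands in for a long on a 32bit architecture. Instead of storing a
      truncated value, the helper reports that the seconds do not fit so
      that a caller can cap the result.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_NSEC_PER_SEC 1000000000ULL

/* Hypothetical split representation with a 32-bit seconds field. */
struct toy_ktime {
	int32_t sec;
	int32_t nsec;
};

/*
 * Split a nanosecond count into seconds and a nanosecond remainder.
 * Returns false when the seconds do not fit in the 32-bit field, so the
 * caller can cap the value instead of storing a truncated one.
 */
bool toy_split_ns(uint64_t nsec, struct toy_ktime *out)
{
	uint64_t sec = nsec / TOY_NSEC_PER_SEC;

	if (sec > (uint64_t)INT32_MAX)
		return false;
	out->sec = (int32_t)sec;
	out->nsec = (int32_t)(nsec % TOY_NSEC_PER_SEC);
	return true;
}
```

      With the value from the example above, the quotient is 7881299347
      seconds, which is larger than INT32_MAX (2147483647), so the helper
      returns false.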
      Signed-off-by: David Engraf <david.engraf@sysgo.com>
      [jstultz: Minor tweaks to commit message & header]
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Dylan Reid's avatar
      ASoC: max98088: Fix logging of hardware revision. · bee59d68
      Dylan Reid authored
      commit 98682063 upstream.
      
      The hardware revision of the codec is based at 0x40.  Subtract that
      before converting to ASCII, the same as is done for the 98095.
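      
      As a rough illustration of the conversion, the helper below maps a raw
      revision value to a printable character. revision_letter() is a
      hypothetical name, and starting the letters at 'A' is an assumption
      made for this sketch; only the 0x40 base comes from the description
      above.

```c
/*
 * Map a raw revision register value to a printable letter.
 * The 0x40 base is subtracted first; mapping the result onto letters
 * starting at 'A' is an assumption of this sketch.
 */
char revision_letter(unsigned int raw)
{
	return (char)(raw - 0x40 + 'A');
}
```

      Under that assumption a raw value of 0x40 is reported as 'A' and 0x42
      as 'C'.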
      Signed-off-by: Dylan Reid <dgreid@chromium.org>
      Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Kailang Yang's avatar
      ALSA: hda - Add the support for ALC286 codec · febafacf
      Kailang Yang authored
      commit 7fc7d047 upstream.
      
      It's yet another ALC269-variant.
      Signed-off-by: Kailang Yang <kailang@realtek.com>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Takashi Iwai's avatar
      ALSA: hda - Fix aamix activation with loopback control on VIA codecs · 02cd3482
      Takashi Iwai authored
      commit 65033cc8 upstream.
      
      When we have a loopback mixer control, this should manage the state
      whether the output paths include the aamix or not.  But the current
      code blindly initializes the output paths with aamix = true, thus the
      aamix is enabled unless the loopback mixer control is changed.
      
      Also, update_aamix_paths() called by the loopback mixer control put
      callback invokes snd_hda_activate_path() with aamix = true even for
      disabling the mixing.  This leaves the aamix path even though the
      loopback control is turned off.
      
      This patch fixes these issues:
      - Introduced aamix_default() helper to indicate whether with_aamix is
        true or false as default
      - Fix the argument in update_aamix_paths() for disabling loopback
      Reported-by: Lydia Wang <LydiaWang@viatech.com.cn>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Clemens Ladisch's avatar
      ALSA: USB: adjust for changed 3.8 USB API · ff865cae
      Clemens Ladisch authored
      commit c75c5ab5 upstream.
      
      The recent changes in the USB API ("implement new semantics for
      URB_ISO_ASAP") made the former meaning of the URB_ISO_ASAP flag the
      default, and changed this flag to mean that URBs can be delayed.
      This is not the behaviour wanted by any of the audio drivers because
      it leads to discontinuous playback with very small period sizes.
      Therefore, our URBs need to be submitted without this flag.
      Reported-by: Joe Rayhawk <jrayhawk@fairlystable.org>
      Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>