1. 25 Jan, 2014 14 commits
    • ftrace/x86: Load ftrace_ops in parameter not the variable holding it · bc84f635
      Steven Rostedt authored
      commit 1739f09e upstream.
      
      Function tracing callbacks expect to have the ftrace_ops that registered it
      passed to them, not the address of the variable that holds the ftrace_ops
      that registered it.
      
      Use a mov instead of a lea to store the ftrace_ops into the parameter
      of the function tracing callback.
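In C terms, `lea` corresponds to taking the address of the variable and `mov` to reading the pointer stored in it. The following is only a minimal sketch of that distinction; `demo_ops`, `current_op`, and `demo_callback` are hypothetical names and do not come from the kernel source.

```c
/* Hypothetical stand-in for struct ftrace_ops. */
struct demo_ops {
    int id;
};

static struct demo_ops registered_op = { 42 };

/* Variable that holds a pointer to the registered ops (assumed name). */
static struct demo_ops *current_op = &registered_op;

/* A callback that expects the ops itself and reports whether it got it. */
static int demo_callback(struct demo_ops *op)
{
    return op == &registered_op;
}

/* Analogue of "lea": passes the address of the holding variable (wrong). */
static int call_with_lea(void)
{
    return demo_callback((struct demo_ops *)(void *)&current_op);
}

/* Analogue of "mov": passes the pointer stored in the variable (right). */
static int call_with_mov(void)
{
    return demo_callback(current_op);
}
```

Only `call_with_mov()` hands the callback the object that was registered; `call_with_lea()` hands it a pointer to a pointer, which the callback would misinterpret as the ops structure.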
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Link: http://lkml.kernel.org/r/20131113152004.459787f9@gandalf.local.home
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • thp: fix copy_page_rep GPF by testing is_huge_zero_pmd once only · 49426b1d
      Hugh Dickins authored
      commit eecc1e42 upstream.
      
      We see General Protection Fault on RSI in copy_page_rep: that RSI is
      what you get from a NULL struct page pointer.
      
        RIP: 0010:[<ffffffff81154955>]  [<ffffffff81154955>] copy_page_rep+0x5/0x10
        RSP: 0000:ffff880136e15c00  EFLAGS: 00010286
        RAX: ffff880000000000 RBX: ffff880136e14000 RCX: 0000000000000200
        RDX: 6db6db6db6db6db7 RSI: db73880000000000 RDI: ffff880dd0c00000
        RBP: ffff880136e15c18 R08: 0000000000000200 R09: 000000000005987c
        R10: 000000000005987c R11: 0000000000000200 R12: 0000000000000001
        R13: ffffea00305aa000 R14: 0000000000000000 R15: 0000000000000000
        FS:  00007f195752f700(0000) GS:ffff880c7fc20000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 0000000093010000 CR3: 00000001458e1000 CR4: 00000000000027e0
        Call Trace:
          copy_user_huge_page+0x93/0xab
          do_huge_pmd_wp_page+0x710/0x815
          handle_mm_fault+0x15d8/0x1d70
          __do_page_fault+0x14d/0x840
          do_page_fault+0x2f/0x90
          page_fault+0x22/0x30
      
      do_huge_pmd_wp_page() tests is_huge_zero_pmd(orig_pmd) four times: but
      since shrink_huge_zero_page() can free the huge_zero_page, and we have
      no hold of our own on it here (except where the fourth test holds
      page_table_lock and has checked pmd_same), it's possible for it to
      answer yes the first time, but no to the second or third test.  Change
      all those last three to tests for NULL page.
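The pattern described above (ask the racy question once, then key every later decision off the NULL page pointer) can be sketched in user-space C. This is an illustrative model only: `demo_page`, `demo_shrinker`, and both `wp_fault_*` functions are hypothetical, and the "interleave" callback merely simulates the shrinker running between two tests.

```c
#include <stddef.h>

struct demo_page { int data; };

static struct demo_page zero_page_storage;
/* Shared pointer that a shrinker may reset at any time (hypothetical model). */
static struct demo_page *huge_zero_page = &zero_page_storage;

static int is_huge_zero(const struct demo_page *p)
{
    return huge_zero_page != NULL && p == huge_zero_page;
}

/* Simulates the shrinker freeing the huge zero page. */
static void demo_shrinker(void) { huge_zero_page = NULL; }

enum demo_action { DEMO_CLEAR, DEMO_COPY, DEMO_COPY_FROM_NULL };

static enum demo_action finish(int treat_as_zero, struct demo_page *src)
{
    if (treat_as_zero)
        return DEMO_CLEAR;                       /* zero-fill the new page */
    return src ? DEMO_COPY : DEMO_COPY_FROM_NULL; /* NULL source is the bug */
}

/* Racy: asks "is it the zero page?" twice; the answer can change in between. */
static enum demo_action wp_fault_racy(struct demo_page *mapped,
                                      void (*interleave)(void))
{
    struct demo_page *src = is_huge_zero(mapped) ? NULL : mapped;

    if (interleave)
        interleave();
    return finish(is_huge_zero(mapped), src);
}

/* Fixed: asks once, then relies on src == NULL for every later decision. */
static enum demo_action wp_fault_fixed(struct demo_page *mapped,
                                       void (*interleave)(void))
{
    struct demo_page *src = is_huge_zero(mapped) ? NULL : mapped;

    if (interleave)
        interleave();
    return finish(src == NULL, src);
}
```

In the racy variant the second test can answer "no" after the first answered "yes", which leaves a NULL source page headed for the copy path, matching the symptom in the trace above.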
      
      (Note: this is not the same issue as trinity's DEBUG_PAGEALLOC BUG
      in copy_page_rep with RSI: ffff88009c422000, reported by Sasha Levin
      in https://lkml.org/lkml/2013/3/29/103.  I believe that one is due
      to the source page being split, and a tail page freed, while copy
      is in progress; and not a problem without DEBUG_PAGEALLOC, since
      the pmd_same check will prevent a miscopy from being made visible.)
      
      Fixes: 97ae1749 ("thp: implement refcounting for huge zero page")
      Signed-off-by: Hugh Dickins <hughd@google.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • SELinux: Fix possible NULL pointer dereference in selinux_inode_permission() · 3cb7f44b
      Steven Rostedt authored
      commit 3dc91d43 upstream.
      
      While running stress tests on adding and deleting ftrace instances I hit
      this bug:
      
        BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
        IP: selinux_inode_permission+0x85/0x160
        PGD 63681067 PUD 7ddbe067 PMD 0
        Oops: 0000 [#1] PREEMPT
        CPU: 0 PID: 5634 Comm: ftrace-test-mki Not tainted 3.13.0-rc4-test-00033-gd2a6dde-dirty #20
        Hardware name:                  /DG965MQ, BIOS MQ96510J.86A.0372.2006.0605.1717 06/05/2006
        task: ffff880078375800 ti: ffff88007ddb0000 task.ti: ffff88007ddb0000
        RIP: 0010:[<ffffffff812d8bc5>]  [<ffffffff812d8bc5>] selinux_inode_permission+0x85/0x160
        RSP: 0018:ffff88007ddb1c48  EFLAGS: 00010246
        RAX: 0000000000000000 RBX: 0000000000800000 RCX: ffff88006dd43840
        RDX: 0000000000000001 RSI: 0000000000000081 RDI: ffff88006ee46000
        RBP: ffff88007ddb1c88 R08: 0000000000000000 R09: ffff88007ddb1c54
        R10: 6e6576652f6f6f66 R11: 0000000000000003 R12: 0000000000000000
        R13: 0000000000000081 R14: ffff88006ee46000 R15: 0000000000000000
        FS:  00007f217b5b6700(0000) GS:ffffffff81e21000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 0000000000000020 CR3: 000000006a0fe000 CR4: 00000000000007f0
        Call Trace:
          security_inode_permission+0x1c/0x30
          __inode_permission+0x41/0xa0
          inode_permission+0x18/0x50
          link_path_walk+0x66/0x920
          path_openat+0xa6/0x6c0
          do_filp_open+0x43/0xa0
          do_sys_open+0x146/0x240
          SyS_open+0x1e/0x20
          system_call_fastpath+0x16/0x1b
        Code: 84 a1 00 00 00 81 e3 00 20 00 00 89 d8 83 c8 02 40 f6 c6 04 0f 45 d8 40 f6 c6 08 74 71 80 cf 02 49 8b 46 38 4c 8d 4d cc 45 31 c0 <0f> b7 50 20 8b 70 1c 48 8b 41 70 89 d9 8b 78 04 e8 36 cf ff ff
        RIP  selinux_inode_permission+0x85/0x160
        CR2: 0000000000000020
      
      Investigating, I found that the inode->i_security was NULL, and the
      dereference of it caused the oops.
      
      in selinux_inode_permission():
      
      	isec = inode->i_security;
      
      	rc = avc_has_perm_noaudit(sid, isec->sid, isec->sclass, perms, 0, &avd);
      
      Note, the crash came from stressing the deletion and reading of debugfs
      files.  I was not able to recreate this via normal files.  But I'm not
      sure they are safe.  It may just be that the race window is much harder
      to hit.
      
      What seems to have happened (and what I have traced), is the file is
      being opened at the same time the file or directory is being deleted.
      As the dentry and inode locks are not held during the path walk, nor are
      the inode's ref counts being incremented, there is nothing saving these
      structures from being discarded except for an rcu_read_lock().
      
      The rcu_read_lock() protects against freeing of the inode, but it does
      not protect freeing of the inode_security_struct.  Now if the freeing of
      the i_security happens with a call_rcu(), and the i_security field of
      the inode is not changed (it gets freed as the inode gets freed) then
      there will be no issue here.  (Linus Torvalds suggested not setting the
      field to NULL such that we do not need to check if it is NULL in the
      permission check).
      
      Note, this is a hack, but it fixes the problem at hand.  A real fix is
      to restructure the destroy_inode() to call all the destructor handlers
      from the RCU callback.  But that is a major job to do, and requires a
      lot of work.  For now, we just band-aid this bug with this fix (it
      works), and work on a more maintainable solution in the future.
      
      Link: http://lkml.kernel.org/r/20140109101932.0508dec7@gandalf.local.home
      Link: http://lkml.kernel.org/r/20140109182756.17abaaa8@gandalf.local.home
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • writeback: Fix data corruption on NFS · 0e177339
      Jan Kara authored
      commit f9b0e058 upstream.
      
      Commit 4f8ad655 "writeback: Refactor writeback_single_inode()" added
      a condition to skip clean inodes. However, this is wrong in WB_SYNC_ALL
      mode because there we also want to wait for outstanding writeback on
      a possibly clean inode. This was causing occasional data corruption issues
      on NFS because it uses sync_inode() to make sure all outstanding writes
      are flushed to the server before truncating the inode, and with
      sync_inode() returning prematurely the file was sometimes extended back
      by an outstanding write after it was truncated.
      
      So modify the test to also check for pages under writeback in
      WB_SYNC_ALL mode.
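The change in the skip condition can be modelled as a pair of predicates. This is a simplified sketch, not the kernel code: `demo_inode`, `demo_sync_mode`, and both function names are hypothetical, and the two boolean fields stand in for the inode's dirty state and for "has pages under writeback".

```c
#include <stdbool.h>

enum demo_sync_mode { DEMO_SYNC_NONE, DEMO_SYNC_ALL };

struct demo_inode {
    bool dirty;                 /* inode has dirty state to write */
    bool pages_under_writeback; /* earlier writes are still in flight */
};

/* Before: a clean inode is always skipped, whatever the sync mode. */
static bool skip_before(const struct demo_inode *inode,
                        enum demo_sync_mode mode)
{
    (void)mode;
    return !inode->dirty;
}

/* After: in sync-all mode a clean inode is skipped only if nothing is
 * still under writeback, so the caller waits for in-flight writes. */
static bool skip_after(const struct demo_inode *inode,
                       enum demo_sync_mode mode)
{
    return !inode->dirty &&
           (mode != DEMO_SYNC_ALL || !inode->pages_under_writeback);
}
```

The interesting case is a clean inode with writes still in flight: the old predicate returns early, while the new one lets a sync-all caller proceed to the wait.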
      
      Fixes: 4f8ad655
      Reported-and-tested-by: Dan Duval <dan.duval@oracle.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • hwmon: (coretemp) Fix truncated name of alarm attributes · 19519d5b
      Jean Delvare authored
      commit 3f9aec76 upstream.
      
      When the core number exceeds 9, the size of the buffer storing the
      alarm attribute name is insufficient and the attribute name is
      truncated. This causes libsensors to skip these attributes as the
      truncated name is not recognized.
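The truncation is easy to reproduce with `snprintf()`. In this sketch the format string `temp%d_crit_alarm` and the buffer sizes 17 and 19 are assumptions chosen for illustration; the commit message above does not state them, and `format_alarm_name` is a hypothetical helper.

```c
#include <stdio.h>

#define DEMO_OLD_LEN 17 /* assumed old buffer size */
#define DEMO_NEW_LEN 19 /* assumed enlarged buffer size */

/* Formats the attribute name; returns 1 if it was truncated, else 0. */
static int format_alarm_name(char *buf, size_t len, int index)
{
    int needed = snprintf(buf, len, "temp%d_crit_alarm", index);

    return needed < 0 || (size_t)needed >= len;
}
```

With a one-digit index the name is 16 characters and fits in 17 bytes including the terminator. A two-digit index needs 18 bytes, so the smaller buffer loses the final character and user space sees an attribute name it does not recognize.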
      Reported-by: Andreas Hollmann <hollmann@in.tum.de>
      Signed-off-by: Jean Delvare <khali@linux-fr.org>
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • i2c: Re-instate body of i2c_parent_is_i2c_adapter() · ae117c4f
      Stephen Warren authored
      commit 2fac2b89 upstream.
      
      The body of i2c_parent_is_i2c_adapter() is currently guarded by
      I2C_MUX. It should be CONFIG_I2C_MUX instead.
      
      Among potentially other problems, this resulted in i2c_lock_adapter()
      only locking I2C mux child adapters, and not the parent adapter. In
      turn, this could allow inter-mingling of mux child selection and I2C
      transactions, which could result in I2C transactions being directed to
      the wrong I2C bus, and possibly even switching between busses in the
      middle of a transaction.
      
      One concrete issue caused by this bug was corrupted HDMI EDID reads
      during boot on the NVIDIA Tegra Seaboard system, although this only
      became apparent in recent linux-next, when the boot timing was changed
      just enough to trigger the race condition.
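A guard that names a macro without the `CONFIG_` prefix fails silently, because the preprocessor treats an undefined macro as "off" and compiles the body out without any diagnostic. The sketch below shows that effect with a plain `#ifdef`; `CONFIG_DEMO_MUX`, `demo_adapter`, and both functions are hypothetical, and the kernel's actual guard expression may differ.

```c
#include <stddef.h>

/* Stand-in for a Kconfig symbol; real ones carry the CONFIG_ prefix. */
#define CONFIG_DEMO_MUX 1

struct demo_adapter {
    struct demo_adapter *parent;
};

/* Buggy guard: DEMO_MUX is never defined, so the body is compiled out
 * and every adapter appears to have no parent. */
static struct demo_adapter *parent_buggy(struct demo_adapter *a)
{
#ifdef DEMO_MUX
    return a->parent;
#else
    (void)a;
    return NULL;
#endif
}

/* Fixed guard: tests the symbol that is actually defined. */
static struct demo_adapter *parent_fixed(struct demo_adapter *a)
{
#ifdef CONFIG_DEMO_MUX
    return a->parent;
#else
    (void)a;
    return NULL;
#endif
}
```

A caller that walks to the parent in order to lock it would, with the buggy guard, stop at the child and never take the parent's lock.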
      
      Fixes: 3923172b ("i2c: reduce parent checking to a NOOP in non-I2C_MUX case")
      Cc: Phil Carmody <phil.carmody@partner.samsung.com>
      Signed-off-by: Stephen Warren <swarren@nvidia.com>
      Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • fork: Allow CLONE_PARENT after setns(CLONE_NEWPID) · a88576fc
      Eric W. Biederman authored
      commit 1f7f4dde upstream.
      
      Serge Hallyn <serge.hallyn@ubuntu.com> writes:
      > Hi Oleg,
      >
      > commit 40a0d32d :
      > "fork: unify and tighten up CLONE_NEWUSER/CLONE_NEWPID checks"
      > breaks lxc-attach in 3.12.  That code forks a child which does
      > setns() and then does a clone(CLONE_PARENT).  That way the
      > grandchild can be in the right namespaces (which the child was
      > not) and be a child of the original task, which is the monitor.
      >
      > lxc-attach in 3.11 was working fine with no side effects that I
      > could see.  Is there a real danger in allowing CLONE_PARENT
      > when current->nsproxy->pidns_for_children is not our pidns,
      > or was this done out of an "over-abundance of caution"?  Can we
      > safely revert that new extra check?
      
      The two fundamental things I know we can not allow are:
      - A shared signal queue aka CLONE_THREAD.  Because we compute the pid
        and uid of the signal when we place it in the queue.
      
      - Changing the pid and by extension pid_namespace of an existing
        process.
      
      From a parent's perspective there is nothing special about the pid
      namespace, to deny CLONE_PARENT, because the parent simply won't know or
      care.
      
      From the child's perspective all that is special really are shared signal
      queues.
      
      User mode threading with CLONE_PARENT|CLONE_VM|CLONE_SIGHAND and tasks
      in different pid namespaces is almost certainly going to break because
      it is complicated.  But shared signal handlers can look at per thread
      information to know which pid namespace a process is in, so I don't know
      of any reason not to support CLONE_PARENT|CLONE_VM|CLONE_SIGHAND threads
      at the kernel level.  It would be absolutely stupid to implement but
      that is a different thing.
      
      So hmm.
      
      Because it can do no harm, and because it is a regression let's remove
      the CLONE_PARENT check and send it stable.
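The effect of dropping the CLONE_PARENT check can be sketched as two predicates over the clone flags. This is a model built from the description above, not the kernel's code: the flag values are hypothetical placeholders, and `pidns_differs` stands for "children will be created in a pid namespace other than the caller's own".

```c
#include <stdbool.h>

/* Hypothetical flag values, for illustration only. */
#define DEMO_CLONE_THREAD 0x1u
#define DEMO_CLONE_PARENT 0x2u

/* Before: both flags are refused once the pid namespace for children
 * differs from the caller's own. */
static bool clone_allowed_before(unsigned int flags, bool pidns_differs)
{
    if ((flags & (DEMO_CLONE_THREAD | DEMO_CLONE_PARENT)) && pidns_differs)
        return false;
    return true;
}

/* After: only the shared signal queue case (thread creation) is refused. */
static bool clone_allowed_after(unsigned int flags, bool pidns_differs)
{
    if ((flags & DEMO_CLONE_THREAD) && pidns_differs)
        return false;
    return true;
}
```

Under this model, the lxc-attach sequence quoted above (setns() followed by a clone with CLONE_PARENT) is rejected by the first predicate and accepted by the second, while thread creation stays refused in both.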
      Acked-by: Oleg Nesterov <oleg@redhat.com>
      Acked-by: Andy Lutomirski <luto@amacapital.net>
      Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • vfs: Fix a regression in mounting proc · 71a34246
      Eric W. Biederman authored
      commit 41301ae7 upstream.
      
      Gao feng <gaofeng@cn.fujitsu.com> reported that commit e51db735
      ("userns: Better restrictions on when proc and sysfs can be mounted")
      caused a regression on mounting a new instance of proc in a mount
      namespace created with user namespace privileges, when binfmt_misc
      is mounted on /proc/sys/fs/binfmt_misc.
      
      This is an unintended regression caused by the absolutely bogus empty
      directory check in fs_fully_visible.  The check fs_fully_visible replaced
      didn't even bother to attempt to verify proc was fully visible and
      hiding proc files with any kind of mount is rare.  So for now fix
      the userspace regression by allowing directory with nlink == 1
      as /proc/sys/fs/binfmt_misc has.
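The relaxed empty-directory heuristic can be pictured as a change in a link-count test. The exact comparisons below are assumptions made for illustration; the commit message only states that a directory with nlink == 1 is now allowed, and both function names are hypothetical.

```c
#include <stdbool.h>

/* Assumed "before" check: only a link count of exactly 2 counts as empty. */
static bool looks_empty_before(unsigned int nlink)
{
    return nlink == 2;
}

/* Assumed "after" check: a link count of 1 is accepted as well. */
static bool looks_empty_after(unsigned int nlink)
{
    return nlink <= 2;
}
```

A mount point such as /proc/sys/fs/binfmt_misc, which reports nlink == 1, fails the first test and passes the second.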
      
      I will have a better patch but it is not stable material, or
      last minute kernel material.  So it will have to wait.
      Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
      Acked-by: Gao feng <gaofeng@cn.fujitsu.com>
      Tested-by: Gao feng <gaofeng@cn.fujitsu.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • vfs: In d_path don't call d_dname on a mount point · 0489953b
      Eric W. Biederman authored
      commit f48cfddc upstream.
      
      Aditya Kali (adityakali@google.com) wrote:
      > Commit bf056bfa:
      > "proc: Fix the namespace inode permission checks." converted
      > the namespace files into symlinks. The same commit changed
      > the way namespace bind mounts appear in /proc/mounts:
      >   $ mount --bind /proc/self/ns/ipc /mnt/ipc
      > Originally:
      >   $ cat /proc/mounts | grep ipc
      >   proc /mnt/ipc proc rw,nosuid,nodev,noexec 0 0
      >
      > After commit bf056bfa:
      >   $ cat /proc/mounts | grep ipc
      >   proc ipc:[4026531839] proc rw,nosuid,nodev,noexec 0 0
      >
      > This breaks userspace which expects the 2nd field in
      > /proc/mounts to be a valid path.
      
      The symlinks /proc/<pid>/ns/{ipc,mnt,net,pid,user,uts} point to
      dentries allocated with d_alloc_pseudo that we can mount, and
      that have interesting names printed out with d_dname.
      
      When these files are bind mounted /proc/mounts is not currently
      displaying the mount point correctly because d_dname is called instead
      of just displaying the path where the file is mounted.
      
      Solve this by adding an explicit check to distinguish mounted pseudo
      inodes and unmounted pseudo inodes.  Unmounted pseudo inodes always
      use the mount of their filesystem as the mnt_root in their path, making
      these two cases easy to distinguish.
      Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
      Reported-by: Aditya Kali <adityakali@google.com>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • staging: comedi: adl_pci9111: fix incorrect irq passed to request_irq() · e2fcb249
      H Hartley Sweeten authored
      commit 48108fe3 upstream.
      
      The dev->irq passed to request_irq() will always be 0 when the auto_attach
      function is called. The pcidev->irq should be used instead to get the correct
      irq number.
      Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
      Reviewed-by: Ian Abbott <abbotti@mev.co.uk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • staging: comedi: addi_apci_1032: fix subdevice type/flags bug · 10d00b05
      H Hartley Sweeten authored
      commit 90daf69a upstream.
      
      The SDF_CMD_READ should be one of the s->subdev_flags not part of
      the s->type.
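The mix-up between the two fields can be shown with a small model. The constants below are hypothetical values picked for the sketch (they are not taken from the comedi headers), and `demo_subdevice` only mirrors the two fields named above.

```c
/* Hypothetical constants, for illustration only. */
#define DEMO_SUBD_DI      3        /* a subdevice type (an enumeration value) */
#define DEMO_SDF_READABLE 0x10000u /* a capability flag */
#define DEMO_SDF_CMD_READ 0x8000u  /* another capability flag */

struct demo_subdevice {
    int type;
    unsigned int subdev_flags;
};

/* Buggy: the flag is OR-ed into the type, corrupting the type value and
 * leaving the flag missing from subdev_flags. */
static void init_buggy(struct demo_subdevice *s)
{
    s->type = DEMO_SUBD_DI | DEMO_SDF_CMD_READ;
    s->subdev_flags = DEMO_SDF_READABLE;
}

/* Fixed: the type stays a plain enumeration value and the flag goes
 * where flags are looked up. */
static void init_fixed(struct demo_subdevice *s)
{
    s->type = DEMO_SUBD_DI;
    s->subdev_flags = DEMO_SDF_READABLE | DEMO_SDF_CMD_READ;
}
```

Any code that compares `type` against a known type value, or tests `subdev_flags` for the command-read capability, gets the wrong answer in the buggy variant.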
      Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
      Reviewed-by: Ian Abbott <abbotti@mev.co.uk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • GFS2: Increase i_writecount during gfs2_setattr_chown · dfc74e9c
      Bob Peterson authored
      commit 62e96cf8 upstream.
      
      This patch calls get_write_access in function gfs2_setattr_chown,
      which merely increases inode->i_writecount for the duration of the
      function. That will ensure that any file closes won't delete the
      inode's multi-block reservation while the function is running.
      It also ensures that a multi-block reservation exists when needed
      for quota change operations during the chown.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • perf/x86/amd/ibs: Fix waking up from S3 for AMD family 10h · 280f5dcc
      Robert Richter authored
      commit bee09ed9 upstream.
      
      On AMD family 10h we see the following error messages while waking up
      from S3 for all non-boot CPUs, leading to a failed IBS initialization:
      
       Enabling non-boot CPUs ...
       smpboot: Booting Node 0 Processor 1 APIC 0x1
       [Firmware Bug]: cpu 1, try to use APIC500 (LVT offset 0) for vector 0x400, but the register is already in use for vector 0xf9 on another cpu
       perf: IBS APIC setup failed on cpu #1
       process: Switch to broadcast mode on CPU1
       CPU1 is up
       ...
       ACPI: Waking up from system sleep state S3
      
      The reason for this is that during suspend the LVT offset for the IBS
      vector gets lost and needs to be reinitialized while resuming.
      
      The offset is read from the IBSCTL msr. On family 10h the offset needs
      to be 1 as offset 0 is used for the MCE threshold interrupt, but
      firmware assigns it for IBS to 0 too. The kernel needs to reprogram
      the vector. The msr is a read-only node msr, but a new value can be
      written via pci config space access. The reinitialization is
      implemented for family 10h in setup_ibs_ctl() which is forced during
      IBS setup.
      
      This patch fixes IBS setup after waking up from S3 by adding
      resume/suspend hooks for the boot cpu which do the offset
      reinitialization.
      
      Marking it as stable to let distros pick up this fix.
      Signed-off-by: Robert Richter <rric@kernel.org>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1389797849-5565-1-git-send-email-rric.net@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Revert "ACPI: Add BayTrail SoC GPIO and LPSS ACPI IDs" · d2dca1c6
      Rafael J. Wysocki authored
      commit 2b844ba7 upstream.
      
      This reverts commit f6308b36 (ACPI: Add BayTrail SoC GPIO and LPSS
      ACPI IDs), because it causes Alan Cox's ASUS T100TA to "crash and
      burn" during boot if the Baytrail pinctrl driver is compiled in.
      
      Fixes: f6308b36 (ACPI: Add BayTrail SoC GPIO and LPSS ACPI IDs)
      Reported-by: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
      Requested-by: Linus Walleij <linus.walleij@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  2. 15 Jan, 2014 26 commits