1. 07 Jun, 2014 40 commits
    • Joe Thornber's avatar
      dm thin: fix discard corruption · 87dba703
      Joe Thornber authored
      commit f046f89a upstream.
      
      Fix a bug in dm_btree_remove that could leave leaf values with incorrect
      reference counts.  The effect of this was that removal of a shared block
      could result in the space maps thinking the block was no longer used.
      More concretely, if you have a thin device and a snapshot of it, sending
      a discard to a shared region of the thin could corrupt the snapshot.
      
      Thinp uses a 2-level nested btree to store it's mappings.  This first
      level is indexed by thin device, and the second level by logical
      block.
      
      Often when we're removing an entry in this mapping tree we need to
      rebalance nodes, which can involve shadowing them, possibly creating a
      copy if the block is shared.  If we do create a copy then children of
      that node need to have their reference counts incremented.  In this
      way reference counts percolate down the tree as shared trees diverge.
      
      The rebalance functions were incrementing the children at the
      appropriate time, but they were always assuming the children were
      internal nodes.  This meant the leaf values (in our case packed
      block/flags entries) were not being incremented.
      Signed-off-by: default avatarJoe Thornber <ejt@redhat.com>
      Signed-off-by: default avatarAlasdair G Kergon <agk@redhat.com>
      [bwh: Backported to 3.2: bump target version numbers from 1.0.1 to 1.0.2]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      [xr: Backported to 3.4: bump target version numbers to 1.1.1]
      Signed-off-by: default avatarRui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      87dba703
    • Shiva Krishna Merla's avatar
      dm mpath: fix race condition between multipath_dtr and pg_init_done · 5e301eba
      Shiva Krishna Merla authored
      commit 954a73d5 upstream.
      
      Whenever multipath_dtr() is happening we must prevent queueing any
      further path activation work.  Implement this by adding a new
      'pg_init_disabled' flag to the multipath structure that denotes future
      path activation work should be skipped if it is set.  By disabling
      pg_init and then re-enabling in flush_multipath_work() we also avoid the
      potential for pg_init to be initiated while suspending an mpath device.
      
      Without this patch a race condition exists that may result in a kernel
      panic:
      
      1) If after pg_init_done() decrements pg_init_in_progress to 0, a call
         to wait_for_pg_init_completion() assumes there are no more pending path
         management commands.
      2) If pg_init_required is set by pg_init_done(), due to retryable
         mode_select errors, then process_queued_ios() will again queue the
         path activation work.
      3) If free_multipath() completes before activate_path() work is called a
         NULL pointer dereference like the following can be seen when
         accessing members of the recently destructed multipath:
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000090
      RIP: 0010:[<ffffffffa003db1b>]  [<ffffffffa003db1b>] activate_path+0x1b/0x30 [dm_multipath]
      [<ffffffff81090ac0>] worker_thread+0x170/0x2a0
      [<ffffffff81096c80>] ? autoremove_wake_function+0x0/0x40
      
      [switch to disabling pg_init in flush_multipath_work & header edits by Mike Snitzer]
      Signed-off-by: default avatarShiva Krishna Merla <shivakrishna.merla@netapp.com>
      Reviewed-by: default avatarKrishnasamy Somasundaram <somasundaram.krishnasamy@netapp.com>
      Tested-by: default avatarSpeagle Andy <Andy.Speagle@netapp.com>
      Acked-by: default avatarJunichi Nomura <j-nomura@ce.jp.nec.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      [bwh: Backported to 3.2:
       - Adjust context
       - Bump version to 1.3.2 not 1.6.0]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      [xr: Backported to 3.4: Adjust context]
      Signed-off-by: default avatarRui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5e301eba
    • Mikulas Patocka's avatar
      dm snapshot: avoid snapshot space leak on crash · d110fd51
      Mikulas Patocka authored
      commit 230c83af upstream.
      
      There is a possible leak of snapshot space in case of crash.
      
      The reason for space leaking is that chunks in the snapshot device are
      allocated sequentially, but they are finished (and stored in the metadata)
      out of order, depending on the order in which copying finished.
      
      For example, supposed that the metadata contains the following records
      SUPERBLOCK
      METADATA (blocks 0 ... 250)
      DATA 0
      DATA 1
      DATA 2
      ...
      DATA 250
      
      Now suppose that you allocate 10 new data blocks 251-260. Suppose that
      copying of these blocks finish out of order (block 260 finished first
      and the block 251 finished last). Now, the snapshot device looks like
      this:
      SUPERBLOCK
      METADATA (blocks 0 ... 250, 260, 259, 258, 257, 256)
      DATA 0
      DATA 1
      DATA 2
      ...
      DATA 250
      DATA 251
      DATA 252
      DATA 253
      DATA 254
      DATA 255
      METADATA (blocks 255, 254, 253, 252, 251)
      DATA 256
      DATA 257
      DATA 258
      DATA 259
      DATA 260
      
      Now, if the machine crashes after writing the first metadata block but
      before writing the second metadata block, the space for areas DATA 250-255
      is leaked, it contains no valid data and it will never be used in the
      future.
      
      This patch makes dm-snapshot complete exceptions in the same order they
      were allocated, thus fixing this bug.
      
      Note: when backporting this patch to the stable kernel, change the version
      field in the following way:
      * if version in the stable kernel is {1, 11, 1}, change it to {1, 12, 0}
      * if version in the stable kernel is {1, 10, 0} or {1, 10, 1}, change it
        to {1, 10, 2}
      Userspace reads the version to determine if the bug was fixed, so the
      version change is needed.
      Signed-off-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      [xr: Backported to 3.4: adjust version]
      Signed-off-by: default avatarRui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d110fd51
    • Harshula Jayasuriya's avatar
      nfsd: nfsd_open: when dentry_open returns an error do not propagate as struct file · 4834ca94
      Harshula Jayasuriya authored
      commit e4daf1ff upstream.
      
      The following call chain:
      ------------------------------------------------------------
      nfs4_get_vfs_file
      - nfsd_open
        - dentry_open
          - do_dentry_open
            - __get_file_write_access
              - get_write_access
                - return atomic_inc_unless_negative(&inode->i_writecount) ? 0 : -ETXTBSY;
      ------------------------------------------------------------
      
      can result in the following state:
      ------------------------------------------------------------
      struct nfs4_file {
      ...
        fi_fds = {0xffff880c1fa65c80, 0xffffffffffffffe6, 0x0},
        fi_access = {{
            counter = 0x1
          }, {
            counter = 0x0
          }},
      ...
      ------------------------------------------------------------
      
      1) First time around, in nfs4_get_vfs_file() fp->fi_fds[O_WRONLY] is
      NULL, hence nfsd_open() is called where we get status set to an error
      and fp->fi_fds[O_WRONLY] to -ETXTBSY. Thus we do not reach
      nfs4_file_get_access() and fi_access[O_WRONLY] is not incremented.
      
      2) Second time around, in nfs4_get_vfs_file() fp->fi_fds[O_WRONLY] is
      NOT NULL (-ETXTBSY), so nfsd_open() is NOT called, but
      nfs4_file_get_access() IS called and fi_access[O_WRONLY] is incremented.
      Thus we leave a landmine in the form of the nfs4_file data structure in
      an incorrect state.
      
      3) Eventually, when __nfs4_file_put_access() is called it finds
      fi_access[O_WRONLY] being non-zero, it decrements it and calls
      nfs4_file_put_fd() which tries to fput -ETXTBSY.
      ------------------------------------------------------------
      ...
           [exception RIP: fput+0x9]
           RIP: ffffffff81177fa9  RSP: ffff88062e365c90  RFLAGS: 00010282
           RAX: ffff880c2b3d99cc  RBX: ffff880c2b3d9978  RCX: 0000000000000002
           RDX: dead000000100101  RSI: 0000000000000001  RDI: ffffffffffffffe6
           RBP: ffff88062e365c90   R8: ffff88041fe797d8   R9: ffff88062e365d58
           R10: 0000000000000008  R11: 0000000000000000  R12: 0000000000000001
           R13: 0000000000000007  R14: 0000000000000000  R15: 0000000000000000
           ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
        #9 [ffff88062e365c98] __nfs4_file_put_access at ffffffffa0562334 [nfsd]
       #10 [ffff88062e365cc8] nfs4_file_put_access at ffffffffa05623ab [nfsd]
       #11 [ffff88062e365ce8] free_generic_stateid at ffffffffa056634d [nfsd]
       #12 [ffff88062e365d18] release_open_stateid at ffffffffa0566e4b [nfsd]
       #13 [ffff88062e365d38] nfsd4_close at ffffffffa0567401 [nfsd]
       #14 [ffff88062e365d88] nfsd4_proc_compound at ffffffffa0557f28 [nfsd]
       #15 [ffff88062e365dd8] nfsd_dispatch at ffffffffa054543e [nfsd]
       #16 [ffff88062e365e18] svc_process_common at ffffffffa04ba5a4 [sunrpc]
       #17 [ffff88062e365e98] svc_process at ffffffffa04babe0 [sunrpc]
       #18 [ffff88062e365eb8] nfsd at ffffffffa0545b62 [nfsd]
       #19 [ffff88062e365ee8] kthread at ffffffff81090886
       #20 [ffff88062e365f48] kernel_thread at ffffffff8100c14a
      ------------------------------------------------------------
      Signed-off-by: default avatarHarshula Jayasuriya <harshula@redhat.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      [xr: Backported to 3.4: adjust context]
      Signed-off-by: default avatarRui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4834ca94
    • NeilBrown's avatar
      md/raid10: fix "enough" function for detecting if array is failed. · 352f526f
      NeilBrown authored
      commit 80b48124 upstream.
      
      The 'enough' function is written to work with 'near' arrays only
      in that is implicitly assumes that the offset from one 'group' of
      devices to the next is the same as the number of copies.
      In reality it is the number of 'near' copies.
      
      So change it to make this number explicit.
      
      This bug makes it possible to run arrays without enough drives
      present, which is dangerous.
      It is appropriate for an -stable kernel, but will almost certainly
      need to be modified for some of them.
      Reported-by: default avatarJakub Husák <jakub@gooseman.cz>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      [bwh: Backported to 3.2: s/geo->/conf->/]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      352f526f
    • Mikulas Patocka's avatar
      dm snapshot: add missing module aliases · e4bf9301
      Mikulas Patocka authored
      commit 23cb2109 upstream.
      
      Add module aliases so that autoloading works correctly if the user
      tries to activate "snapshot-origin" or "snapshot-merge" targets.
      
      Reference: https://bugzilla.redhat.com/889973Reported-by: default avatarChao Yang <chyang@redhat.com>
      Signed-off-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: default avatarAlasdair G Kergon <agk@redhat.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e4bf9301
    • Mikulas Patocka's avatar
      dm bufio: avoid a possible __vmalloc deadlock · bed74df4
      Mikulas Patocka authored
      commit 502624bd upstream.
      
      This patch uses memalloc_noio_save to avoid a possible deadlock in
      dm-bufio.  (it could happen only with large block size, at most
      PAGE_SIZE << MAX_ORDER (typically 8MiB).
      
      __vmalloc doesn't fully respect gfp flags. The specified gfp flags are
      used for allocation of requested pages, structures vmap_area, vmap_block
      and vm_struct and the radix tree nodes.
      
      However, the kernel pagetables are allocated always with GFP_KERNEL.
      Thus the allocation of pagetables can recurse back to the I/O layer and
      cause a deadlock.
      
      This patch uses the function memalloc_noio_save to set per-process
      PF_MEMALLOC_NOIO flag and the function memalloc_noio_restore to restore
      it. When this flag is set, all allocations in the process are done with
      implied GFP_NOIO flag, thus the deadlock can't happen.
      
      This should be backported to stable kernels, but they don't have the
      PF_MEMALLOC_NOIO flag and memalloc_noio_save/memalloc_noio_restore
      functions. So, PF_MEMALLOC should be set and restored instead.
      Signed-off-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: default avatarAlasdair G Kergon <agk@redhat.com>
      [bwh: Backported to 3.2 as recommended]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bed74df4
    • Trond Myklebust's avatar
      NFSv4.1: Handle NFS4ERR_DELAY when resetting the NFSv4.1 session · ba795927
      Trond Myklebust authored
      commit c489ee29 upstream.
      
      NFS4ERR_DELAY is a legal reply when we call DESTROY_SESSION. It
      usually means that the server is busy handling an unfinished RPC
      request. Just sleep for a second and then retry.
      We also need to be able to handle the NFS4ERR_BACK_CHAN_BUSY return
      value. If the NFS server has outstanding callbacks, we just want to
      similarly sleep & retry.
      Signed-off-by: default avatarTrond Myklebust <Trond.Myklebust@netapp.com>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ba795927
    • Weston Andros Adamson's avatar
      NFSv4.1: Don't decode skipped layoutgets · 0c5fa16a
      Weston Andros Adamson authored
      commit 085b7a45 upstream.
      
      layoutget's prepare hook can call rpc_exit with status = NFS4_OK (0).
      Because of this, nfs4_proc_layoutget can't depend on a 0 status to mean
      that the RPC was successfully sent, received and parsed.
      
      To fix this, use the result's len member to see if parsing took place.
      
      This fixes the following OOPS -- calling xdr_init_decode() with a buffer length
      0 doesn't set the stream's 'p' member and ends up using uninitialized memory
      in filelayout_decode_layout.
      
      BUG: unable to handle kernel paging request at 0000000000008050
      IP: [<ffffffff81282e78>] memcpy+0x18/0x120
      PGD 0
      Oops: 0000 [#1] SMP
      last sysfs file: /sys/devices/pci0000:00/0000:00:11.0/0000:02:01.0/irq
      CPU 1
      Modules linked in: nfs_layout_nfsv41_files nfs lockd fscache auth_rpcgss nfs_acl autofs4 sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log dm_mod ppdev parport_pc parport snd_ens1371 snd_rawmidi snd_ac97_codec ac97_bus snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc e1000 microcode vmware_balloon i2c_piix4 i2c_core sg shpchp ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif pata_acpi ata_generic ata_piix mptspi mptscsih mptbase scsi_transport_spi [last unloaded: speedstep_lib]
      
      Pid: 1665, comm: flush-0:22 Not tainted 2.6.32-356-test-2 #2 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
      RIP: 0010:[<ffffffff81282e78>]  [<ffffffff81282e78>] memcpy+0x18/0x120
      RSP: 0018:ffff88003dfab588  EFLAGS: 00010206
      RAX: ffff88003dc42000 RBX: ffff88003dfab610 RCX: 0000000000000009
      RDX: 000000003f807ff0 RSI: 0000000000008050 RDI: ffff88003dc42000
      RBP: ffff88003dfab5b0 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000080 R12: 0000000000000024
      R13: ffff88003dc42000 R14: ffff88003f808030 R15: ffff88003dfab6a0
      FS:  0000000000000000(0000) GS:ffff880003420000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
      CR2: 0000000000008050 CR3: 000000003bc92000 CR4: 00000000001407e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process flush-0:22 (pid: 1665, threadinfo ffff88003dfaa000, task ffff880037f77540)
      Stack:
      ffffffffa0398ac1 ffff8800397c5940 ffff88003dfab610 ffff88003dfab6a0
      <d> ffff88003dfab5d0 ffff88003dfab680 ffffffffa01c150b ffffea0000d82e70
      <d> 000000508116713b 0000000000000000 0000000000000000 0000000000000000
      Call Trace:
      [<ffffffffa0398ac1>] ? xdr_inline_decode+0xb1/0x120 [sunrpc]
      [<ffffffffa01c150b>] filelayout_decode_layout+0xeb/0x350 [nfs_layout_nfsv41_files]
      [<ffffffffa01c17fc>] filelayout_alloc_lseg+0x8c/0x3c0 [nfs_layout_nfsv41_files]
      [<ffffffff8150e6ce>] ? __wait_on_bit+0x7e/0x90
      Signed-off-by: default avatarWeston Andros Adamson <dros@netapp.com>
      Signed-off-by: default avatarTrond Myklebust <Trond.Myklebust@netapp.com>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0c5fa16a
    • Trond Myklebust's avatar
      NFSv4.1: Fix a race in pNFS layoutcommit · 079ee145
      Trond Myklebust authored
      commit a073dbff upstream.
      
      We need to clear the NFS_LSEG_LAYOUTCOMMIT bits atomically with the
      NFS_INO_LAYOUTCOMMIT bit, otherwise we may end up with situations
      where the two are out of sync.
      The first half of the problem is to ensure that pnfs_layoutcommit_inode
      clears the NFS_LSEG_LAYOUTCOMMIT bit through pnfs_list_write_lseg.
      We still need to keep the reference to those segments until the RPC call
      is finished, so in order to make it clear _where_ those references come
      from, we add a helper pnfs_list_write_lseg_done() that cleans up after
      pnfs_list_write_lseg.
      Signed-off-by: default avatarTrond Myklebust <Trond.Myklebust@netapp.com>
      Acked-by: default avatarBenny Halevy <bhalevy@tonian.com>
      [bwh: Backported to 3.2: s/pnfs_put_lseg/put_lseg/]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      079ee145
    • Chuck Lever's avatar
      NFS: nfs_getaclargs.acl_len is a size_t · f40661f4
      Chuck Lever authored
      commit 56d08fef upstream.
      
      Squelch compiler warnings:
      
      fs/nfs/nfs4proc.c: In function ‘__nfs4_get_acl_uncached’:
      fs/nfs/nfs4proc.c:3811:14: warning: comparison between signed and
      	unsigned integer expressions [-Wsign-compare]
      fs/nfs/nfs4proc.c:3818:15: warning: comparison between signed and
      	unsigned integer expressions [-Wsign-compare]
      
      Introduced by commit bf118a34 "NFSv4: include bitmap in nfsv4 get
      acl data", Dec 7, 2011.
      Signed-off-by: default avatarChuck Lever <chuck.lever@oracle.com>
      Signed-off-by: default avatarTrond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f40661f4
    • fanchaoting's avatar
      nfsd: don't run get_file if nfs4_preprocess_stateid_op return error · 79854e6e
      fanchaoting authored
      commit b022032e upstream.
      
      we should return error status directly when nfs4_preprocess_stateid_op
      return error.
      Signed-off-by: default avatarfanchaoting <fanchaoting@cn.fujitsu.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      79854e6e
    • Dan Carpenter's avatar
      NFSv4.1: integer overflow in decode_cb_sequence_args() · 42b607da
      Dan Carpenter authored
      commit 0439f31c upstream.
      
      This seems like it could overflow on 32 bits.  Use kmalloc_array() which
      has overflow protection built in.
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarTrond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      42b607da
    • J. Bruce Fields's avatar
      nfsd4: fix xdr decoding of large non-write compounds · ca36e74e
      J. Bruce Fields authored
      commit 365da4ad upstream.
      
      This fixes a regression from 24750082
      "nfsd4: fix decoding of compounds across page boundaries".  The previous
      code was correct: argp->pagelist is initialized in
      nfs4svc_deocde_compoundargs to rqstp->rq_arg.pages, and is therefore a
      pointer to the page *after* the page we are currently decoding.
      
      The reason that patch nevertheless fixed a problem with decoding
      compounds containing write was a bug in the write decoding introduced by
      5a80a54d "nfsd4: reorganize write
      decoding", after which write decoding no longer adhered to the rule that
      argp->pagelist point to the next page.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      [bwh: Backported to 3.2: adjust context; there is only one instance to fix]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ca36e74e
    • Andy Adamson's avatar
      NFSv4 wait on recovery for async session errors · e9d735ee
      Andy Adamson authored
      commit 4a82fd7c upstream.
      
      When the state manager is processing the NFS4CLNT_DELEGRETURN flag, session
      draining is off, but DELEGRETURN can still get a session error.
      The async handler calls nfs4_schedule_session_recovery returns -EAGAIN, and
      the DELEGRETURN done then restarts the RPC task in the prepare state.
      With the state manager still processing the NFS4CLNT_DELEGRETURN flag with
      session draining off, these DELEGRETURNs will cycle with errors filling up the
      session slots.
      
      This prevents OPEN reclaims (from nfs_delegation_claim_opens) required by the
      NFS4CLNT_DELEGRETURN state manager processing from completing, hanging the
      state manager in the __rpc_wait_for_completion_task in nfs4_run_open_task
      as seen in this kernel thread dump:
      
      kernel: 4.12.32.53-ma D 0000000000000000     0  3393      2 0x00000000
      kernel: ffff88013995fb60 0000000000000046 ffff880138cc5400 ffff88013a9df140
      kernel: ffff8800000265c0 ffffffff8116eef0 ffff88013fc10080 0000000300000001
      kernel: ffff88013a4ad058 ffff88013995ffd8 000000000000fbc8 ffff88013a4ad058
      kernel: Call Trace:
      kernel: [<ffffffff8116eef0>] ? cache_alloc_refill+0x1c0/0x240
      kernel: [<ffffffffa0358110>] ? rpc_wait_bit_killable+0x0/0xa0 [sunrpc]
      kernel: [<ffffffffa0358152>] rpc_wait_bit_killable+0x42/0xa0 [sunrpc]
      kernel: [<ffffffff8152914f>] __wait_on_bit+0x5f/0x90
      kernel: [<ffffffffa0358110>] ? rpc_wait_bit_killable+0x0/0xa0 [sunrpc]
      kernel: [<ffffffff815291f8>] out_of_line_wait_on_bit+0x78/0x90
      kernel: [<ffffffff8109b520>] ? wake_bit_function+0x0/0x50
      kernel: [<ffffffffa035810d>] __rpc_wait_for_completion_task+0x2d/0x30 [sunrpc]
      kernel: [<ffffffffa040d44c>] nfs4_run_open_task+0x11c/0x160 [nfs]
      kernel: [<ffffffffa04114e7>] nfs4_open_recover_helper+0x87/0x120 [nfs]
      kernel: [<ffffffffa0411646>] nfs4_open_recover+0xc6/0x150 [nfs]
      kernel: [<ffffffffa040cc6f>] ? nfs4_open_recoverdata_alloc+0x2f/0x60 [nfs]
      kernel: [<ffffffffa0414e1a>] nfs4_open_delegation_recall+0x6a/0xa0 [nfs]
      kernel: [<ffffffffa0424020>] nfs_end_delegation_return+0x120/0x2e0 [nfs]
      kernel: [<ffffffff8109580f>] ? queue_work+0x1f/0x30
      kernel: [<ffffffffa0424347>] nfs_client_return_marked_delegations+0xd7/0x110 [nfs]
      kernel: [<ffffffffa04225d8>] nfs4_run_state_manager+0x548/0x620 [nfs]
      kernel: [<ffffffffa0422090>] ? nfs4_run_state_manager+0x0/0x620 [nfs]
      kernel: [<ffffffff8109b0f6>] kthread+0x96/0xa0
      kernel: [<ffffffff8100c20a>] child_rip+0xa/0x20
      kernel: [<ffffffff8109b060>] ? kthread+0x0/0xa0
      kernel: [<ffffffff8100c200>] ? child_rip+0x0/0x20
      
      The state manager can not therefore process the DELEGRETURN session errors.
      Change the async handler to wait for recovery on session errors.
      Signed-off-by: default avatarAndy Adamson <andros@netapp.com>
      Signed-off-by: default avatarTrond Myklebust <Trond.Myklebust@netapp.com>
      [bwh: Backported to 3.2:
       - Adjust context
       - There's no restart_call label]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e9d735ee
    • Mateusz Guzik's avatar
      cifs: delay super block destruction until all cifsFileInfo objects are gone · 38feb080
      Mateusz Guzik authored
      commit 24261fc2 upstream.
      
      cifsFileInfo objects hold references to dentries and it is possible that
      these will still be around in workqueues when VFS decides to kill super
      block during unmount.
      
      This results in panics like this one:
      BUG: Dentry ffff88001f5e76c0{i=66b4a,n=1M-2} still in use (1) [unmount of cifs cifs]
      ------------[ cut here ]------------
      kernel BUG at fs/dcache.c:943!
      [..]
      Process umount (pid: 1781, threadinfo ffff88003d6e8000, task ffff880035eeaec0)
      [..]
      Call Trace:
       [<ffffffff811b44f3>] shrink_dcache_for_umount+0x33/0x60
       [<ffffffff8119f7fc>] generic_shutdown_super+0x2c/0xe0
       [<ffffffff8119f946>] kill_anon_super+0x16/0x30
       [<ffffffffa036623a>] cifs_kill_sb+0x1a/0x30 [cifs]
       [<ffffffff8119fcc7>] deactivate_locked_super+0x57/0x80
       [<ffffffff811a085e>] deactivate_super+0x4e/0x70
       [<ffffffff811bb417>] mntput_no_expire+0xd7/0x130
       [<ffffffff811bc30c>] sys_umount+0x9c/0x3c0
       [<ffffffff81657c19>] system_call_fastpath+0x16/0x1b
      
      Fix this by making each cifsFileInfo object hold a reference to cifs
      super block, which implicitly keeps VFS super block around as well.
      Signed-off-by: default avatarMateusz Guzik <mguzik@redhat.com>
      Reviewed-by: default avatarJeff Layton <jlayton@redhat.com>
      Reported-and-Tested-by: default avatarBen Greear <greearb@candelatech.com>
      Signed-off-by: default avatarSteve French <sfrench@us.ibm.com>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      [xr: Backported to 3.4: adjust context]
      Signed-off-by: default avatarRui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      38feb080
    • Linus Torvalds's avatar
      VFS: make vfs_fstat() use f[get|put]_light() · 9b67aeff
      Linus Torvalds authored
      commit e994defb upstream.
      
      Use the *_light() versions that properly avoid doing the file user count
      updates when they are unnecessary.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [xr: Backported to 3.4: adjust function name]
      Signed-off-by: default avatarRui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9b67aeff
    • Steven Rostedt (Red Hat)'s avatar
      tracing: Keep overwrite in sync between regular and snapshot buffers · 4652951d
      Steven Rostedt (Red Hat) authored
      commit 80902822 upstream.
      
      Changing the overwrite mode for the ring buffer via the trace
      option only sets the normal buffer. But the snapshot buffer could
      swap with it, and then the snapshot would be in non overwrite mode
      and the normal buffer would be in overwrite mode, even though the
      option flag states otherwise.
      
      Keep the two buffers overwrite modes in sync.
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4652951d
    • Wei Yongjun's avatar
      perf: Fix error return code · 926685e9
      Wei Yongjun authored
      commit c4814202 upstream.
      
      Fix to return -ENOMEM in the allocation error case instead of 0
      (if pmu_bus_running == 1), as done elsewhere in this function.
      Signed-off-by: default avatarWei Yongjun <yongjun_wei@trendmicro.com.cn>
      Cc: a.p.zijlstra@chello.nl
      Cc: paulus@samba.org
      Cc: acme@ghostprotocols.net
      Link: http://lkml.kernel.org/r/CAPgLHd8j_fWcgqe%3DKLWjpBj%2B%3Do0Pw6Z-SEq%3DNTPU08c2w1tngQ@mail.gmail.com
      [ Tweaked the error code setting placement and the changelog. ]
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      926685e9
    • libin's avatar
      sched/debug: Fix sd->*_idx limit range avoiding overflow · e5f1ec5d
      libin authored
      commit fd9b86d3 upstream.
      
      Commit 201c373e ("sched/debug: Limit sd->*_idx range on
      sysctl") was an incomplete bug fix.
      
      This patch fixes sd->*_idx limit range to [0 ~ CPU_LOAD_IDX_MAX-1]
      avoiding array overflow caused by setting sd->*_idx to CPU_LOAD_IDX_MAX
      on sysctl.
      Signed-off-by: default avatarLibin <huawei.libin@huawei.com>
      Cc: <jiang.liu@huawei.com>
      Cc: <guohanjun@huawei.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/51626610.2040607@huawei.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e5f1ec5d
    • Namhyung Kim's avatar
      sched/debug: Limit sd->*_idx range on sysctl · 4aff95ab
      Namhyung Kim authored
      commit 201c373e upstream.
      
      Various sd->*_idx's are used for refering the rq's load average table
      when selecting a cpu to run.  However they can be set to any number
      with sysctl knobs so that it can crash the kernel if something bad is
      given. Fix it by limiting them into the actual range.
      Signed-off-by: default avatarNamhyung Kim <namhyung@kernel.org>
      Signed-off-by: default avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1345104204-8317-1-git-send-email-namhyung@kernel.orgSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4aff95ab
    • Steven Rostedt (Red Hat)'s avatar
      ftrace: Check module functions being traced on reload · 74d86ed7
      Steven Rostedt (Red Hat) authored
      commit 8c4f3c3f upstream.
      
      There's been a nasty bug that would show up and not give much info.
      The bug displayed the following warning:
      
       WARNING: at kernel/trace/ftrace.c:1529 __ftrace_hash_rec_update+0x1e3/0x230()
       Pid: 20903, comm: bash Tainted: G           O 3.6.11+ #38405.trunk
       Call Trace:
        [<ffffffff8103e5ff>] warn_slowpath_common+0x7f/0xc0
        [<ffffffff8103e65a>] warn_slowpath_null+0x1a/0x20
        [<ffffffff810c2ee3>] __ftrace_hash_rec_update+0x1e3/0x230
        [<ffffffff810c4f28>] ftrace_hash_move+0x28/0x1d0
        [<ffffffff811401cc>] ? kfree+0x2c/0x110
        [<ffffffff810c68ee>] ftrace_regex_release+0x8e/0x150
        [<ffffffff81149f1e>] __fput+0xae/0x220
        [<ffffffff8114a09e>] ____fput+0xe/0x10
        [<ffffffff8105fa22>] task_work_run+0x72/0x90
        [<ffffffff810028ec>] do_notify_resume+0x6c/0xc0
        [<ffffffff8126596e>] ? trace_hardirqs_on_thunk+0x3a/0x3c
        [<ffffffff815c0f88>] int_signal+0x12/0x17
       ---[ end trace 793179526ee09b2c ]---
      
      It was finally narrowed down to unloading a module that was being traced.
      
      It was actually more than that. When functions are being traced, there's
      a table of all functions that have a ref count of the number of active
      tracers attached to that function. When a function trace callback is
      registered to a function, the function's record ref count is incremented.
      When it is unregistered, the function's record ref count is decremented.
      If an inconsistency is detected (ref count goes below zero) the above
      warning is shown and the function tracing is permanently disabled until
      reboot.
      
      The ftrace callback ops holds a hash of functions that it filters on
      (and/or filters off). If the hash is empty, the default means to filter
      all functions (for the filter_hash) or to disable no functions (for the
      notrace_hash).
      
      When a module is unloaded, it frees the function records that represent
      the module functions. These records exist on their own pages, that is
      function records for one module will not exist on the same page as
      function records for other modules or even the core kernel.
      
      Now when a module unloads, the records that represents its functions are
      freed. When the module is loaded again, the records are recreated with
      a default ref count of zero (unless there's a callback that traces all
      functions, then they will also be traced, and the ref count will be
      incremented).
      
      The problem is that if an ftrace callback hash includes functions of the
      module being unloaded, those hash entries will not be removed. If the
      module is reloaded in the same location, the hash entries still point
      to the functions of the module but the module's ref counts do not reflect
      that.
      
      With the help of Steve and Joern, we found a reproducer:
      
       Using uinput module and uinput_release function.
      
       cd /sys/kernel/debug/tracing
       modprobe uinput
       echo uinput_release > set_ftrace_filter
       echo function > current_tracer
       rmmod uinput
       modprobe uinput
       # check /proc/modules to see if loaded in same addr, otherwise try again
       echo nop > current_tracer
      
       [BOOM]
      
      The above loads the uinput module, which creates a table of functions that
      can be traced within the module.
      
      We add uinput_release to the filter_hash to trace just that function.
      
      Enable function tracincg, which increments the ref count of the record
      associated to uinput_release.
      
      Remove uinput, which frees the records including the one that represents
      uinput_release.
      
      Load the uinput module again (and make sure it's at the same address).
      This recreates the function records all with a ref count of zero,
      including uinput_release.
      
      Disable function tracing, which will decrement the ref count for uinput_release
      which is now zero because of the module removal and reload, and we have
      a mismatch (below zero ref count).
      
      The solution is to check all currently tracing ftrace callbacks to see if any
      are tracing any of the module's functions when a module is loaded (it already does
      that with callbacks that trace all functions). If a callback happens to have
      a module function being traced, it increments that records ref count and starts
      tracing that function.
      
      There may be a strange side effect with this, where tracing module functions
      on unload and then reloading a new module may have that new module's functions
      being traced. This may be something that confuses the user, but it's not
      a big deal. Another approach is to disable all callback hashes on module unload,
      but this leaves some ftrace callbacks that may not be registered, but can
      still have hashes tracing the module's function where ftrace doesn't know about
      it. That situation can cause the same bug. This solution solves that case too.
      Another benefit of this solution, is it is possible to trace a module's
      function on unload and load.
      
      Link: http://lkml.kernel.org/r/20130705142629.GA325@redhat.comReported-by: default avatarJörn Engel <joern@logfs.org>
      Reported-by: default avatarDave Jones <davej@redhat.com>
      Reported-by: default avatarSteve Hodgson <steve@purestorage.com>
      Tested-by: default avatarSteve Hodgson <steve@purestorage.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      74d86ed7
    • Peter Zijlstra's avatar
      perf: Fix perf ring buffer memory ordering · 1fbbea7b
      Peter Zijlstra authored
      commit bf378d34 upstream.
      
      The PPC64 people noticed a missing memory barrier and crufty old
      comments in the perf ring buffer code. So update all the comments and
      add the missing barrier.
      
      When the architecture implements local_t using atomic_long_t there
      will be double barriers issued; but short of introducing more
      conditional barrier primitives this is the best we can do.
      Reported-by: default avatarVictor Kaplansky <victork@il.ibm.com>
      Tested-by: default avatarVictor Kaplansky <victork@il.ibm.com>
      Signed-off-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: michael@ellerman.id.au
      Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Michael Neuling <mikey@neuling.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: anton@samba.org
      Cc: benh@kernel.crashing.org
      Link: http://lkml.kernel.org/r/20131025173749.GG19466@laptop.lanSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      [bwh: Backported to 3.2: adjust filename]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1fbbea7b
    • Justin Lecher's avatar
      fs: cachefiles: add support for large files in filesystem caching · 76504c24
      Justin Lecher authored
      commit 98c350cd upstream.
      
      Support the caching of large files.
      
      Addresses https://bugzilla.kernel.org/show_bug.cgi?id=31182Signed-off-by: default avatarJustin Lecher <jlec@gentoo.org>
      Signed-off-by: default avatarSuresh Jayaraman <sjayaraman@suse.com>
      Tested-by: default avatarSuresh Jayaraman <sjayaraman@suse.com>
      Acked-by: default avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [bwh: Backported to 3.2:
       - Adjust context
       - dentry_open() takes dentry and vfsmount pointers, not a path pointer]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      76504c24
    • Geyslan G. Bem's avatar
      ecryptfs: Fix memory leakage in keystore.c · 2e4191b3
      Geyslan G. Bem authored
      commit 3edc8376 upstream.
      
      In 'decrypt_pki_encrypted_session_key' function:
      
      Initializes 'payload' pointer and releases it on exit.
      Signed-off-by: default avatarGeyslan G. Bem <geyslan@gmail.com>
      Signed-off-by: default avatarTyler Hicks <tyhicks@canonical.com>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2e4191b3
    • Pavel Shilovsky's avatar
      CIFS: Fix error handling in cifs_push_mandatory_locks · 47532a29
      Pavel Shilovsky authored
      commit e2f2886a upstream.
      Signed-off-by: default avatarPavel Shilovsky <pshilovsky@etersoft.ru>
      Signed-off-by: default avatarSteve French <smfrench@gmail.com>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      47532a29
    • Steve French's avatar
      setfacl removes part of ACL when setting POSIX ACLs to Samba · afaf7f61
      Steve French authored
      commit b1d93356 upstream.
      
      setfacl over cifs mounts can remove the default ACL when setting the
      (non-default part of) the ACL and vice versa (we were leaving at 0
      rather than setting to -1 the count field for the unaffected
      half of the ACL.  For example notice the setfacl removed
      the default ACL in this sequence:
      
      steven@steven-GA-970A-DS3:~/cifs-2.6$ getfacl /mnt/test-dir ; setfacl
      -m default:user:test:rwx,user:test:rwx /mnt/test-dir
      getfacl: Removing leading '/' from absolute path names
      user::rwx
      group::r-x
      other::r-x
      default:user::rwx
      default:user:test:rwx
      default:group::r-x
      default:mask::rwx
      default:other::r-x
      
      steven@steven-GA-970A-DS3:~/cifs-2.6$ getfacl /mnt/test-dir
      getfacl: Removing leading '/' from absolute path names
      user::rwx
      user:test:rwx
      group::r-x
      mask::rwx
      other::r-x
      Signed-off-by: default avatarSteve French <smfrench@gmail.com>
      Acked-by: default avatarJeremy Allison <jra@samba.org>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      Cc: Rui Xiang <rui.xiang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      afaf7f61
    • Horia Geanta's avatar
      crypto: caam - add allocation failure handling in SPRINTFCAT macro · 0c5e98d5
      Horia Geanta authored
      commit 27c5fb7a upstream.
      
      GFP_ATOMIC memory allocation could fail.
      In this case, avoid NULL pointer dereference and notify user.
      
      Cc: Kim Phillips <kim.phillips@freescale.com>
      Signed-off-by: default avatarHoria Geanta <horia.geanta@freescale.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0c5e98d5
    • Du, Wenkai's avatar
      i2c: designware: Mask all interrupts during i2c controller enable · 7ad8d3db
      Du, Wenkai authored
      commit 47bb27e7 upstream.
      
      There have been "i2c_designware 80860F41:00: controller timed out" errors
      on a number of Baytrail platforms. The issue is caused by incorrect value in
      Interrupt Mask Register (DW_IC_INTR_MASK)  when i2c core is being enabled.
      This causes call to __i2c_dw_enable() to immediately start the transfer which
      leads to timeout. There are 3 failure modes observed:
      
      1. Failure in S0 to S3 resume path
      
      The default value after reset for DW_IC_INTR_MASK is 0x8ff. When we start
      the first transaction after resuming from system sleep, TX_EMPTY interrupt
      is already unmasked because of the hardware default.
      
      2. Failure in normal operational path
      
      This failure happens rarely and is hard to reproduce. Debug trace showed that
      DW_IC_INTR_MASK had value of 0x254 when failure occurred, which meant
      TX_EMPTY was unmasked.
      
      3. Failure in S3 to S0 suspend path
      
      This failure also happens rarely and is hard to reproduce. Adding debug trace
      that read DW_IC_INTR_MASK made this failure not reproducible. But from ISR
      call trace we could conclude TX_EMPTY was unmasked when problem occurred.
      
      The patch masks all interrupts before the controller is enabled to resolve the
      faulty DW_IC_INTR_MASK conditions.
      Signed-off-by: default avatarWenkai Du <wenkai.du@intel.com>
      Acked-by: default avatarMika Westerberg <mika.westerberg@linux.intel.com>
      [wsa: improved the comment and removed typo in commit msg]
      Signed-off-by: default avatarWolfram Sang <wsa@the-dreams.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7ad8d3db
    • Hans de Goede's avatar
      ACPI / blacklist: Add dmi_enable_osi_linux quirk for Asus EEE PC 1015PX · bb093785
      Hans de Goede authored
      commit f6e6e1b9 upstream.
      
      Without this this EEE PC exports a non working WMI interface, with this it
      exports a working "good old" eeepc_laptop interface, fixing brightness control
      not working as well as rfkill being stuck in a permanent wireless blocked
      state.
      
      This is not an ideal way to fix this, but various attempts to fix this
      otherwise have failed, see:
      
      References: https://bugzilla.redhat.com/show_bug.cgi?id=1067181
      Reported-and-tested-by: lou.cardone@gmail.com
      Signed-off-by: default avatarHans de Goede <hdegoede@redhat.com>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bb093785
    • Marcel Apfelbaum's avatar
      PCI: shpchp: Check bridge's secondary (not primary) bus speed · e5cc5b9c
      Marcel Apfelbaum authored
      commit 93fa9d32 upstream.
      
      When a new device is added below a hotplug bridge, the bridge's secondary
      bus speed and the device's bus speed must match.  The shpchp driver
      previously checked the bridge's *primary* bus speed, not the secondary bus
      speed.
      
      This caused hot-add errors like:
      
        shpchp 0000:00:03.0: Speed of bus ff and adapter 0 mismatch
      
      Check the secondary bus speed instead.
      
      [bhelgaas: changelog]
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=75251
      Fixes: 3749c51a ("PCI: Make current and maximum bus speeds part of the PCI core")
      Signed-off-by: default avatarMarcel Apfelbaum <marcel.a@redhat.com>
      Signed-off-by: default avatarBjorn Helgaas <bhelgaas@google.com>
      Acked-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e5cc5b9c
    • Linus Torvalds's avatar
      x86-64, modify_ldt: Make support for 16-bit segments a runtime option · 215990aa
      Linus Torvalds authored
      commit fa81511b upstream.
      
      Checkin:
      
      b3b42ac2 x86-64, modify_ldt: Ban 16-bit segments on 64-bit kernels
      
      disabled 16-bit segments on 64-bit kernels due to an information
      leak.  However, it does seem that people are genuinely using Wine to
      run old 16-bit Windows programs on Linux.
      
      A proper fix for this ("espfix64") is coming in the upcoming merge
      window, but as a temporary fix, create a sysctl to allow the
      administrator to re-enable support for 16-bit segments.
      
      It adds a "/proc/sys/abi/ldt16" sysctl that defaults to zero (off). If
      you hit this issue and care about your old Windows program more than
      you care about a kernel stack address information leak, you can do
      
         echo 1 > /proc/sys/abi/ldt16
      
      as root (add it to your startup scripts), and you should be ok.
      
      The sysctl table is only added if you have COMPAT support enabled on
      x86-64, but I assume anybody who runs old windows binaries very much
      does that ;)
      Signed-off-by: default avatarH. Peter Anvin <hpa@linux.intel.com>
      Link: http://lkml.kernel.org/r/CA%2B55aFw9BPoD10U1LfHbOMpHWZkvJTkMcfCs9s3urPr1YyWBxw@mail.gmail.comSigned-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      215990aa
    • Charles Keepax's avatar
      ASoC: wm8962: Update register CLASS_D_CONTROL_1 to be non-volatile · 18390a2a
      Charles Keepax authored
      commit 44330ab5 upstream.
      
      The register CLASS_D_CONTROL_1 is marked as volatile because it contains
      a bit, DAC_MUTE, which is also mirrored in the ADC_DAC_CONTROL_1
      register. This causes problems for the "Speaker Switch" control, which
      will report an error if the CODEC is suspended because it relies on a
      volatile register.
      
      To resolve this issue mark CLASS_D_CONTROL_1 as non-volatile and
      manually keep the register cache in sync by updating both bits when
      changing the mute status.
      Reported-by: default avatarShawn Guo <shawn.guo@linaro.org>
      Signed-off-by: default avatarCharles Keepax <ckeepax@opensource.wolfsonmicro.com>
      Tested-by: default avatarShawn Guo <shawn.guo@linaro.org>
      Signed-off-by: default avatarMark Brown <broonie@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      18390a2a
    • Jianyu Zhan's avatar
      percpu: make pcpu_alloc_chunk() use pcpu_mem_free() instead of kfree() · 6baddd03
      Jianyu Zhan authored
      commit 5a838c3b upstream.
      
      pcpu_chunk_struct_size = sizeof(struct pcpu_chunk) +
      	BITS_TO_LONGS(pcpu_unit_pages) * sizeof(unsigned long)
      
      It hardly could be ever bigger than PAGE_SIZE even for large-scale machine,
      but for consistency with its couterpart pcpu_mem_zalloc(),
      use pcpu_mem_free() instead.
      
      Commit b4916cb1 ("percpu: make pcpu_free_chunk() use
      pcpu_mem_free() instead of kfree()") addressed this problem, but
      missed this one.
      
      tj: commit message updated
      Signed-off-by: default avatarJianyu Zhan <nasa4836@gmail.com>
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Fixes: 099a19d9 ("percpu: allow limited allocation before slab is online)
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6baddd03
    • J. Bruce Fields's avatar
      nfsd4: remove lockowner when removing lock stateid · a18e4c11
      J. Bruce Fields authored
      commit a1b8ff4c upstream.
      
      The nfsv4 state code has always assumed a one-to-one correspondance
      between lock stateid's and lockowners even if it appears not to in some
      places.
      
      We may actually change that, but for now when FREE_STATEID releases a
      lock stateid it also needs to release the parent lockowner.
      
      Symptoms were a subsequent LOCK crashing in find_lockowner_str when it
      calls same_lockowner_ino on a lockowner that unexpectedly has an empty
      so_stateids list.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a18e4c11
    • J. Bruce Fields's avatar
      nfsd4: warn on finding lockowner without stateid's · d9eea1cc
      J. Bruce Fields authored
      commit 27b11428 upstream.
      
      The current code assumes a one-to-one lockowner<->lock stateid
      correspondance.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d9eea1cc
    • Kinglong Mee's avatar
      NFSD: Call ->set_acl with a NULL ACL structure if no entries · 6ba3ac44
      Kinglong Mee authored
      commit aa07c713 upstream.
      
      After setting ACL for directory, I got two problems that caused
      by the cached zero-length default posix acl.
      
      This patch make sure nfsd4_set_nfs4_acl calls ->set_acl
      with a NULL ACL structure if there are no entries.
      
      Thanks for Christoph Hellwig's advice.
      
      First problem:
      ............ hang ...........
      
      Second problem:
      [ 1610.167668] ------------[ cut here ]------------
      [ 1610.168320] kernel BUG at /root/nfs/linux/fs/nfsd/nfs4acl.c:239!
      [ 1610.168320] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
      [ 1610.168320] Modules linked in: nfsv4(OE) nfs(OE) nfsd(OE)
      rpcsec_gss_krb5 fscache ip6t_rpfilter ip6t_REJECT cfg80211 xt_conntrack
      rfkill ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables
      ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6
      ip6table_mangle ip6table_security ip6table_raw ip6table_filter
      ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4
      nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw
      auth_rpcgss nfs_acl snd_intel8x0 ppdev lockd snd_ac97_codec ac97_bus
      snd_pcm snd_timer e1000 pcspkr parport_pc snd parport serio_raw joydev
      i2c_piix4 sunrpc(OE) microcode soundcore i2c_core ata_generic pata_acpi
      [last unloaded: nfsd]
      [ 1610.168320] CPU: 0 PID: 27397 Comm: nfsd Tainted: G           OE
      3.15.0-rc1+ #15
      [ 1610.168320] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS
      VirtualBox 12/01/2006
      [ 1610.168320] task: ffff88005ab653d0 ti: ffff88005a944000 task.ti:
      ffff88005a944000
      [ 1610.168320] RIP: 0010:[<ffffffffa034d5ed>]  [<ffffffffa034d5ed>]
      _posix_to_nfsv4_one+0x3cd/0x3d0 [nfsd]
      [ 1610.168320] RSP: 0018:ffff88005a945b00  EFLAGS: 00010293
      [ 1610.168320] RAX: 0000000000000001 RBX: ffff88006700bac0 RCX:
      0000000000000000
      [ 1610.168320] RDX: 0000000000000000 RSI: ffff880067c83f00 RDI:
      ffff880068233300
      [ 1610.168320] RBP: ffff88005a945b48 R08: ffffffff81c64830 R09:
      0000000000000000
      [ 1610.168320] R10: ffff88004ea85be0 R11: 000000000000f475 R12:
      ffff880068233300
      [ 1610.168320] R13: 0000000000000003 R14: 0000000000000002 R15:
      ffff880068233300
      [ 1610.168320] FS:  0000000000000000(0000) GS:ffff880077800000(0000)
      knlGS:0000000000000000
      [ 1610.168320] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [ 1610.168320] CR2: 00007f5bcbd3b0b9 CR3: 0000000001c0f000 CR4:
      00000000000006f0
      [ 1610.168320] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
      0000000000000000
      [ 1610.168320] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
      0000000000000400
      [ 1610.168320] Stack:
      [ 1610.168320]  ffffffff00000000 0000000b67c83500 000000076700bac0
      0000000000000000
      [ 1610.168320]  ffff88006700bac0 ffff880068233300 ffff88005a945c08
      0000000000000002
      [ 1610.168320]  0000000000000000 ffff88005a945b88 ffffffffa034e2d5
      000000065a945b68
      [ 1610.168320] Call Trace:
      [ 1610.168320]  [<ffffffffa034e2d5>] nfsd4_get_nfs4_acl+0x95/0x150 [nfsd]
      [ 1610.168320]  [<ffffffffa03400d6>] nfsd4_encode_fattr+0x646/0x1e70 [nfsd]
      [ 1610.168320]  [<ffffffff816a6e6e>] ? kmemleak_alloc+0x4e/0xb0
      [ 1610.168320]  [<ffffffffa0327962>] ?
      nfsd_setuser_and_check_port+0x52/0x80 [nfsd]
      [ 1610.168320]  [<ffffffff812cd4bb>] ? selinux_cred_prepare+0x1b/0x30
      [ 1610.168320]  [<ffffffffa0341caa>] nfsd4_encode_getattr+0x5a/0x60 [nfsd]
      [ 1610.168320]  [<ffffffffa0341e07>] nfsd4_encode_operation+0x67/0x110
      [nfsd]
      [ 1610.168320]  [<ffffffffa033844d>] nfsd4_proc_compound+0x21d/0x810 [nfsd]
      [ 1610.168320]  [<ffffffffa0324d9b>] nfsd_dispatch+0xbb/0x200 [nfsd]
      [ 1610.168320]  [<ffffffffa00850cd>] svc_process_common+0x46d/0x6d0 [sunrpc]
      [ 1610.168320]  [<ffffffffa0085433>] svc_process+0x103/0x170 [sunrpc]
      [ 1610.168320]  [<ffffffffa032472f>] nfsd+0xbf/0x130 [nfsd]
      [ 1610.168320]  [<ffffffffa0324670>] ? nfsd_destroy+0x80/0x80 [nfsd]
      [ 1610.168320]  [<ffffffff810a5202>] kthread+0xd2/0xf0
      [ 1610.168320]  [<ffffffff810a5130>] ? insert_kthread_work+0x40/0x40
      [ 1610.168320]  [<ffffffff816c1ebc>] ret_from_fork+0x7c/0xb0
      [ 1610.168320]  [<ffffffff810a5130>] ? insert_kthread_work+0x40/0x40
      [ 1610.168320] Code: 78 02 e9 e7 fc ff ff 31 c0 31 d2 31 c9 66 89 45 ce
      41 8b 04 24 66 89 55 d0 66 89 4d d2 48 8d 04 80 49 8d 5c 84 04 e9 37 fd
      ff ff <0f> 0b 90 0f 1f 44 00 00 55 8b 56 08 c7 07 00 00 00 00 8b 46 0c
      [ 1610.168320] RIP  [<ffffffffa034d5ed>] _posix_to_nfsv4_one+0x3cd/0x3d0
      [nfsd]
      [ 1610.168320]  RSP <ffff88005a945b00>
      [ 1610.257313] ---[ end trace 838254e3e352285b ]---
      Signed-off-by: default avatarKinglong Mee <kinglongmee@gmail.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6ba3ac44
    • Romain Izard's avatar
      trace: module: Maintain a valid user count · efb36edc
      Romain Izard authored
      commit 098507ae upstream.
      
      The replacement of the 'count' variable by two variables 'incs' and
      'decs' to resolve some race conditions during module unloading was done
      in parallel with some cleanup in the trace subsystem, and was integrated
      as a merge.
      
      Unfortunately, the formula for this replacement was wrong in the tracing
      code, and the refcount in the traces was not usable as a result.
      
      Use 'count = incs - decs' to compute the user count.
      
      Link: http://lkml.kernel.org/p/1393924179-9147-1-git-send-email-romain.izard.pro@gmail.comAcked-by: default avatarIngo Molnar <mingo@kernel.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Fixes: c1ab9cab "merge conflict resolution"
      Signed-off-by: default avatarRomain Izard <romain.izard.pro@gmail.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      efb36edc
    • Salva Peiró's avatar
    • Tim Chen's avatar
      crypto: crypto_wq - Fix late crypto work queue initialization · 2e6c336b
      Tim Chen authored
      commit 130fa5bc upstream.
      
      The crypto algorithm modules utilizing the crypto daemon could
      be used early when the system start up.  Using module_init
      does not guarantee that the daemon's work queue is initialized
      when the cypto alorithm depending on crypto_wq starts.  It is necessary
      to initialize the crypto work queue earlier at the subsystem
      init time to make sure that it is initialized
      when used.
      Signed-off-by: default avatarTim Chen <tim.c.chen@linux.intel.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2e6c336b