1. 30 Sep, 2015 40 commits
    • md/raid10: always set reshape_safe when initializing reshape_position. · c7af5eb4
      NeilBrown authored
      commit 299b0685 upstream.
      
      'reshape_position' tracks where in the reshape we have reached.
      'reshape_safe' tracks where in the reshape we have safely recorded
      in the metadata.
      
      These are compared to determine when to update the metadata.
      So it is important that reshape_safe is initialised properly.
      Currently it isn't.  When starting a reshape from the beginning
      it usually has the correct value by luck.  But when reducing the
      number of devices in a RAID10, it has the wrong value and this leads
      to the metadata not being updated correctly.
      This can lead to corruption if the reshape is not allowed to complete.
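The relationship between the two fields can be sketched in plain C. This is a userspace illustration only, not the md/raid10 code: the struct and both helper names are hypothetical, and the only point it makes is that the two values are set together so that the first comparison starts from "nothing unrecorded".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two reshape fields described above. */
struct reshape_state {
    uint64_t reshape_position; /* how far the reshape has progressed */
    uint64_t reshape_safe;     /* how far the metadata has recorded it */
};

/* Set both fields together, so the first comparison starts from equality. */
static void reshape_init(struct reshape_state *rs, uint64_t start)
{
    assert(rs != NULL);
    rs->reshape_position = start;
    rs->reshape_safe = start;
}

/* Progress not yet recorded in the metadata, in either direction. */
static uint64_t reshape_unrecorded(const struct reshape_state *rs)
{
    return rs->reshape_position > rs->reshape_safe
               ? rs->reshape_position - rs->reshape_safe
               : rs->reshape_safe - rs->reshape_position;
}
```

If reshape_safe were left at a stale value while reshape_position started somewhere else, the difference computed here would be meaningless from the first step on.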
      
      This patch is suitable for any -stable kernel which supports RAID10
      reshape, which is 3.5 and later.
      
      Fixes: 3ea7daa5 ("md/raid10: add reshape support")
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • mmc: core: fix race condition in mmc_wait_data_done · cefcb16c
      Jialing Fu authored
      commit 71f8a4b8 upstream.
      
      The following panic was captured on kernel 3.14, but the issue still exists
      in the latest kernel.
      ---------------------------------------------------------------------
      [   20.738217] c0 3136 (Compiler) Unable to handle kernel NULL pointer dereference
      at virtual address 00000578
      ......
      [   20.738499] c0 3136 (Compiler) PC is at _raw_spin_lock_irqsave+0x24/0x60
      [   20.738527] c0 3136 (Compiler) LR is at _raw_spin_lock_irqsave+0x20/0x60
      [   20.740134] c0 3136 (Compiler) Call trace:
      [   20.740165] c0 3136 (Compiler) [<ffffffc0008ee900>] _raw_spin_lock_irqsave+0x24/0x60
      [   20.740200] c0 3136 (Compiler) [<ffffffc0000dd024>] __wake_up+0x1c/0x54
      [   20.740230] c0 3136 (Compiler) [<ffffffc000639414>] mmc_wait_data_done+0x28/0x34
      [   20.740262] c0 3136 (Compiler) [<ffffffc0006391a0>] mmc_request_done+0xa4/0x220
      [   20.740314] c0 3136 (Compiler) [<ffffffc000656894>] sdhci_tasklet_finish+0xac/0x264
      [   20.740352] c0 3136 (Compiler) [<ffffffc0000a2b58>] tasklet_action+0xa0/0x158
      [   20.740382] c0 3136 (Compiler) [<ffffffc0000a2078>] __do_softirq+0x10c/0x2e4
      [   20.740411] c0 3136 (Compiler) [<ffffffc0000a24bc>] irq_exit+0x8c/0xc0
      [   20.740439] c0 3136 (Compiler) [<ffffffc00008489c>] handle_IRQ+0x48/0xac
      [   20.740469] c0 3136 (Compiler) [<ffffffc000081428>] gic_handle_irq+0x38/0x7c
      ----------------------------------------------------------------------
      On SMP, "mrq" is subject to a race condition between the following two paths:
      path1: CPU0: <tasklet context>
        static void mmc_wait_data_done(struct mmc_request *mrq)
        {
           mrq->host->context_info.is_done_rcv = true;
           //
           // Suppose CPU0 has just finished "is_done_rcv = true" in path1 and,
           // at this moment, an IRQ or an I-cache miss occurs on CPU0.
           // What happens on CPU1 (path2)?
           //
           // If the mmcqd thread on CPU1 (path2) has not yet gone to sleep,
           // path2 can break out of wait_event_interruptible in
           // mmc_wait_for_data_req_done and go on to prepare the next
           // mmc_request (mmc_blk_rw_rq_prep).
           //
           // Within mmc_blk_rw_rq_prep, mrq is cleared to 0.
           // If the compiled code for the line below still loads host from
           // "mrq", the panic shown in the trace above occurs.
           wake_up_interruptible(&mrq->host->context_info.wait);
        }
      
      path2: CPU1: <The mmcqd thread runs mmc_queue_thread>
        static int mmc_wait_for_data_req_done(...
        {
           ...
           while (1) {
                 wait_event_interruptible(context_info->wait,
                         (context_info->is_done_rcv ||
                          context_info->is_new_req));
           	   static void mmc_blk_rw_rq_prep(...
                 {
                 ...
                 memset(brq, 0, sizeof(struct mmc_blk_request));
      
      This issue occurs only rarely; however, adding mdelay(1) in
      mmc_wait_data_done as below reproduces it easily.
      
         static void mmc_wait_data_done(struct mmc_request *mrq)
         {
           mrq->host->context_info.is_done_rcv = true;
      +    mdelay(1);
           wake_up_interruptible(&mrq->host->context_info.wait);
         }
      
      At runtime, an IRQ or an I-cache miss may occur at exactly the point where
      the mdelay(1) was inserted.

      This patch fetches the mmc_context_info pointer at the beginning of the
      function, which avoids this race condition.
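A minimal userspace sketch of that idea follows. The structures are simplified stand-ins, and wake_up_stub() is a hypothetical replacement for wake_up_interruptible(); nothing here is the actual driver code. What it shows is the single read of mrq->host before completion is signalled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the kernel structures. */
struct mmc_context_info { bool is_done_rcv; int wakeups; };
struct mmc_host { struct mmc_context_info context_info; };
struct mmc_request { struct mmc_host *host; };

/* Stand-in for wake_up_interruptible(): it only counts calls. */
static void wake_up_stub(struct mmc_context_info *ci)
{
    ci->wakeups++;
}

static void mmc_wait_data_done_sketch(struct mmc_request *mrq)
{
    /* Read mrq->host exactly once, before signalling completion. */
    struct mmc_context_info *ci;

    assert(mrq != NULL && mrq->host != NULL);
    ci = &mrq->host->context_info;

    ci->is_done_rcv = true;
    /* From here on the other path may clear *mrq, so it is not touched. */
    wake_up_stub(ci);
}
```

Because the wake-up goes through the cached pointer, a concurrent memset of the request no longer affects it.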
      Signed-off-by: Jialing Fu <jlfu@marvell.com>
      Tested-by: Shawn Lin <shawn.lin@rock-chips.com>
      Fixes: 2220eedf ("mmc: fix async request mechanism ....")
      Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
      Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • fs: if a coredump already exists, unlink and recreate with O_EXCL · 7e73d2c5
      Jann Horn authored
      commit fbb18169 upstream.
      
      It was possible for an attacking user to trick root (or another user) into
      writing his coredumps into an attacker-readable, pre-existing file using
      rename() or link(), causing the disclosure of secret data from the victim
      process' virtual memory.  Depending on the configuration, it was also
      possible to trick root into overwriting system files with coredumps.  Fix
      that issue by never writing coredumps into existing files.
      
      Requirements for the attack:
       - The attack only applies if the victim's process has a nonzero
         RLIMIT_CORE and is dumpable.
       - The attacker can trick the victim into coredumping into an
         attacker-writable directory D, either because the core_pattern is
         relative and the victim's cwd is attacker-writable or because an
         absolute core_pattern pointing to a world-writable directory is used.
       - The attacker has one of these:
        A: on a system with protected_hardlinks=0:
           execute access to a folder containing a victim-owned,
           attacker-readable file on the same partition as D, and the
           victim-owned file will be deleted before the main part of the attack
           takes place. (In practice, there are lots of files that fulfill
           this condition, e.g. entries in Debian's /var/lib/dpkg/info/.)
           This does not apply to most Linux systems because most distros set
           protected_hardlinks=1.
        B: on a system with protected_hardlinks=1:
           execute access to a folder containing a victim-owned,
           attacker-readable and attacker-writable file on the same partition
           as D, and the victim-owned file will be deleted before the main part
           of the attack takes place.
           (This seems to be uncommon.)
        C: on any system, independent of protected_hardlinks:
           write access to a non-sticky folder containing a victim-owned,
           attacker-readable file on the same partition as D
           (This seems to be uncommon.)
      
      The basic idea is that the attacker moves the victim-owned file to where
      he expects the victim process to dump its core.  The victim process dumps
      its core into the existing file, and the attacker reads the coredump from
      it.
      
      If the attacker can't move the file because he does not have write access
      to the containing directory, he can instead link the file to a directory
      he controls, then wait for the original link to the file to be deleted
      (because the kernel checks that the link count of the corefile is 1).
      
      A less reliable variant that requires D to be non-sticky works with link()
      and does not require deletion of the original link: link() the file into
      D, but then unlink() it directly before the kernel performs the link count
      check.
      
      On systems with protected_hardlinks=0, this variant allows an attacker to
      not only gain information from coredumps, but also clobber existing,
      victim-writable files with coredumps.  (This could theoretically lead to a
      privilege escalation.)
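The "never write into an existing file" rule has a direct userspace analogue, sketched here with plain POSIX calls. This is not the kernel's coredump path; create_fresh_file() is a hypothetical helper name, and the 0600 mode is chosen only for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Userspace analogue of "unlink, then recreate with O_EXCL".
 * Returns a descriptor for a freshly created file, or -1 on error.
 */
static int create_fresh_file(const char *path)
{
    assert(path != NULL);

    /* Remove whatever name is there; a missing file is not an error. */
    if (unlink(path) != 0 && errno != ENOENT)
        return -1;

    /*
     * With O_CREAT | O_EXCL, open() fails with EEXIST if the name has
     * reappeared in the meantime, so data never lands in an inode that
     * somebody else created or linked into place.
     */
    return open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
}
```

A file moved or linked into place by another user is either unlinked first or makes the open fail; in neither case is it written to.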
      Signed-off-by: Jann Horn <jann@thejh.net>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • vmscan: fix increasing nr_isolated incurred by putback unevictable pages · 9ee9b7b6
      Jaewon Kim authored
      commit c54839a7 upstream.
      
      reclaim_clean_pages_from_list() assumes that shrink_page_list() returns
      the number of pages removed from the candidate list.  But shrink_page_list()
      puts back mlocked pages without passing them to the caller and without
      counting them as nr_reclaimed.  This increases nr_isolated.

      To fix this, this patch changes shrink_page_list() to pass unevictable
      pages back to the caller.  The caller will take care of those pages.
      
      Minchan said:
      
      It fixes two issues.
      
      1. With unevictable pages, cma_alloc will succeed.

      Strictly speaking, cma_alloc in the current kernel fails because of
      unevictable pages.

      2. It fixes the leak of the NR_ISOLATED vmstat counter.

      With it, too_many_isolated works.  Otherwise, it could hang until
      the process gets SIGKILL.
      Signed-off-by: Jaewon Kim <jaewon31.kim@samsung.com>
      Acked-by: Minchan Kim <minchan@kernel.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • parisc: Filter out spurious interrupts in PA-RISC irq handler · 768cb8d9
      Helge Deller authored
      commit b1b4e435 upstream.
      
      When detecting a serial port on newer PA-RISC machines (with iosapic) we have a
      long way to go to find the right IRQ line, registering it, then registering the
      serial port and the irq handler for the serial port. During this phase spurious
      interrupts for the serial port may happen which then crashes the kernel because
      the action handler might not have been set up yet.
      
      So, basically it's a race condition between the serial port hardware and the
      CPU which sets up the necessary fields in the irq structs. The main reason for
      this race is that we unmask the serial port irqs too early without having set
      up everything properly before (which isn't easily possible because we need the
      IRQ number to register the serial ports).
      
      This patch is a work-around for this problem. It adds checks to the CPU irq
      handler to verify if the IRQ action field has been initialized already. If not,
      we just skip this interrupt (which isn't critical for a serial port at bootup).
      The real fix would probably involve rewriting all PA-RISC specific IRQ code
      (for CPU, IOSAPIC, GSC and EISA) to use IRQ domains with proper parenting of
      the irq chips and proper irq enabling along this line.
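The work-around amounts to a guard in the dispatch path, which can be sketched as below. The struct, the spurious counter, and the demo handler are all hypothetical and exist only to make the example self-contained; the PA-RISC handler itself looks different.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-IRQ slot: the handler may not be installed yet. */
struct irq_slot {
    void (*action)(int irq);
    unsigned int spurious; /* interrupts skipped for lack of a handler */
};

static int handled_count;

/* Demo handler, used only to exercise the sketch. */
static void demo_handler(int irq)
{
    (void)irq;
    handled_count++;
}

/* Returns 1 if the interrupt was dispatched, 0 if it was skipped. */
static int handle_irq_sketch(struct irq_slot *slot, int irq)
{
    assert(slot != NULL);

    /* Action not set up yet: treat the interrupt as spurious and skip it. */
    if (slot->action == NULL) {
        slot->spurious++;
        return 0;
    }
    slot->action(irq);
    return 1;
}
```

An interrupt that arrives before the handler is registered is counted and dropped instead of dereferencing a missing action.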
      
      This bug has been in the PA-RISC port since the beginning, but the crashes
      happened very rarely with currently used hardware.  But on the latest machine
      which I bought (a C8000 workstation), which uses the fastest CPUs (4 x PA8900,
      1GHz) and which has the largest possible L1 cache size (64MB each), the kernel
      crashed at every boot because of this race. So, without this patch the machine
      would currently be unusable.
      
      For the record, here is the flow logic:
      1. serial_init_chip() in 8250_gsc.c calls iosapic_serial_irq().
      2. iosapic_serial_irq() calls txn_alloc_irq() to find the irq.
      3. iosapic_serial_irq() calls cpu_claim_irq() to register the CPU irq
      4. cpu_claim_irq() unmasks the CPU irq (which it shouldn't!)
      5. serial_init_chip() then registers the 8250 port.
      Problems:
      - In step 4 the CPU irq shouldn't have been registered yet, but only after step 5
      - If a serial irq happens after step 4 but before step 5 has finished, the kernel will crash
      Signed-off-by: Helge Deller <deller@gmx.de>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • NFS: nfs_set_pgio_error sometimes misses errors · a6209e19
      Trond Myklebust authored
      commit e9ae58ae upstream.
      
      We should ensure that we always set the pgio_header's error field
      if a READ or WRITE RPC call returns an error. The current code depends
      on 'hdr->good_bytes' always being initialised to a large value, which
      is not always done correctly by callers.
      When this happens, applications may end up missing important errors.
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • NFSv4: don't set SETATTR for O_RDONLY|O_EXCL · d71ea882
      NeilBrown authored
      commit efcbc04e upstream.
      
      It is unusual to combine the open flags O_RDONLY and O_EXCL, but
      it appears that libre-office does just that.
      
      [pid  3250] stat("/home/USER/.config", {st_mode=S_IFDIR|0700, st_size=8192, ...}) = 0
      [pid  3250] open("/home/USER/.config/libreoffice/4-suse/user/extensions/buildid", O_RDONLY|O_EXCL <unfinished ...>
      
      NFSv4 takes O_EXCL as a sign that a setattr command should be sent,
      probably to reset the timestamps.
      
      When it was an O_RDONLY open, the SETATTR command does not
      identify any actual attributes to change.
      If no delegation was provided to the open, the SETATTR uses the
      all-zeros stateid and the request is accepted (at least by the
      Linux NFS server - no harm, no foul).
      
      If a read-delegation was provided, this is used in the SETATTR
      request, and a Netapp filer will justifiably claim
      NFS4ERR_BAD_STATEID, which the Linux client takes as a sign
      to retry - indefinitely.
      
      So only treat O_EXCL specially if O_CREAT was also given.
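The resulting rule is a two-flag test. The helper name below is hypothetical and the function is a plain userspace illustration of the condition, not the NFS client code.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>

/* Hypothetical helper: does this open need the O_EXCL special handling? */
static bool excl_needs_setattr(int open_flags)
{
    /* O_EXCL is treated specially only when O_CREAT was also given. */
    return (open_flags & O_EXCL) != 0 && (open_flags & O_CREAT) != 0;
}
```

With this test, the O_RDONLY|O_EXCL open from the trace above no longer qualifies, while an exclusive create still does.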
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • Btrfs: check if previous transaction aborted to avoid fs corruption · 50ee5ae5
      Filipe Manana authored
      commit 1f9b8c8f upstream.
      
      While we are committing a transaction, it's possible the previous one is
      still finishing its commit and therefore we wait for it to finish first.
      However we were not checking if that previous transaction ended up getting
      aborted after we waited for it to commit, so we ended up committing the
      current transaction which can lead to fs corruption because the new
      superblock can point to trees that have had one or more nodes/leafs that
      were never durably persisted.
      The following sequence diagram exemplifies how this is possible:
      
                CPU 0                                                        CPU 1
      
        transaction N starts
      
        (...)
      
        btrfs_commit_transaction(N)
      
          cur_trans->state = TRANS_STATE_COMMIT_START;
          (...)
          cur_trans->state = TRANS_STATE_COMMIT_DOING;
          (...)
      
          cur_trans->state = TRANS_STATE_UNBLOCKED;
          root->fs_info->running_transaction = NULL;
      
                                                                    btrfs_start_transaction()
                                                                       --> starts transaction N + 1
      
          btrfs_write_and_wait_transaction(trans, root);
            --> starts writing all new or COWed ebs created
                at transaction N
      
                                                                    creates some new ebs, COWs some
                                                                    existing ebs but doesn't COW or
                                                                    deletes eb X
      
                                                                    btrfs_commit_transaction(N + 1)
                                                                      (...)
                                                                      cur_trans->state = TRANS_STATE_COMMIT_START;
                                                                      (...)
                                                                      wait_for_commit(root, prev_trans);
                                                                        --> prev_trans == transaction N
      
          btrfs_write_and_wait_transaction() continues
          writing ebs
             --> fails writing eb X, we abort transaction N
                 and set bit BTRFS_FS_STATE_ERROR on
                 fs_info->fs_state, so no new transactions
                 can start after setting that bit
      
             cleanup_transaction()
               btrfs_cleanup_one_transaction()
                 wakes up task at CPU 1
      
                                                                      continues, doesn't abort because
                                                                      cur_trans->aborted (transaction N + 1)
                                                                      is zero, and no checks for bit
                                                                      BTRFS_FS_STATE_ERROR in fs_info->fs_state
                                                                      are made
      
                                                                      btrfs_write_and_wait_transaction(trans, root);
                                                                        --> succeeds, no errors during writeback
      
                                                                      write_ctree_super(trans, root, 0);
                                                                        --> succeeds
                                                                        --> we have now a superblock that points us
                                                                            to some root that uses eb X, which was
                                                                            never written to disk
      
      In this scenario future attempts to read eb X from disk result in an
      error message like "parent transid verify failed on X wanted Y found Z".
      
      So fix this by aborting the current transaction if after waiting for the
      previous transaction we verify that it was aborted.
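The added check can be sketched in userspace as follows. The struct, the stub, and the choice of -EIO as the returned error are assumptions made for illustration; only the order of operations (wait, then re-check the previous transaction) comes from the description above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical stand-in for a transaction. */
struct txn {
    bool aborted;
};

/* Stand-in for wait_for_commit(): assume prev has finished on return. */
static void wait_for_commit_stub(const struct txn *prev)
{
    (void)prev;
}

/* Returns 0 if the commit may proceed, or a negative errno otherwise. */
static int commit_check_prev(const struct txn *cur, const struct txn *prev)
{
    assert(cur != NULL);

    if (prev != NULL) {
        wait_for_commit_stub(prev);
        /* The step described above: look at prev again after waiting. */
        if (prev->aborted)
            return -EIO; /* error code chosen for illustration */
    }
    return cur->aborted ? -EIO : 0;
}
```

In the sequence diagram, this is the point on CPU 1 right after wait_for_commit() returns, where transaction N + 1 would now stop instead of writing its superblock.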
      Signed-off-by: Filipe Manana <fdmanana@suse.com>
      Reviewed-by: Josef Bacik <jbacik@fb.com>
      Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
      Signed-off-by: Chris Mason <clm@fb.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • v4l: omap3isp: Fix sub-device power management code · 887d34e9
      Sakari Ailus authored
      commit 9d39f054 upstream.
      
      Commit 813f5c0a ("media: Change media device link_notify behaviour")
      modified the media controller link setup notification API and updated the
      OMAP3 ISP driver accordingly. As a side effect it introduced a bug by
      turning power on after setting the link instead of before. This results in
      sub-devices not being powered down in some cases when they should be. Fix
      it.
      
      Fixes: 813f5c0a [media] media: Change media device link_notify behaviour
      Signed-off-by: Sakari Ailus <sakari.ailus@iki.fi>
      Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • rc-core: fix remove uevent generation · 152824ba
      David Härdeman authored
      commit a66b0c41 upstream.
      
      The input_dev is already gone when the rc device is being unregistered
      so checking for its presence only means that no remove uevent will be
      generated.
      Signed-off-by: David Härdeman <david@hardeman.nu>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • x86/mm: Initialize pmd_idx in page_table_range_init_count() · a6a47b40
      Minfei Huang authored
      commit 9962eea9 upstream.
      
      The variable pmd_idx is not initialized for the first iteration of the
      for loop.
      
      Assign the proper value which indexes the start address.
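The shape of the loop can be illustrated with a toy two-level walk. The geometry constants and the function name count_pmd_steps() are assumptions for this sketch and are not taken from the kernel source; what matters is that the inner index is seeded from the start address and only reset to zero for later outer iterations.

```c
#include <assert.h>
#include <stdint.h>

/* Toy geometry, chosen for illustration only. */
#define PMD_SHIFT    21
#define PTRS_PER_PMD 512
#define PGDIR_SHIFT  30
#define PTRS_PER_PGD 4
#define PMD_SIZE     (UINT64_C(1) << PMD_SHIFT)

static unsigned int pgd_index(uint64_t va)
{
    return (unsigned int)((va >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1));
}

static unsigned int pmd_index(uint64_t va)
{
    return (unsigned int)((va >> PMD_SHIFT) & (PTRS_PER_PMD - 1));
}

/* Count PMD-sized steps in [start, end); both must be PMD_SIZE aligned. */
static uint64_t count_pmd_steps(uint64_t start, uint64_t end)
{
    uint64_t vaddr = start;
    uint64_t count = 0;
    unsigned int pgd_idx = pgd_index(vaddr);
    /* Seed the inner index from the start address for the first pass. */
    unsigned int pmd_idx = pmd_index(vaddr);

    for (; pgd_idx < PTRS_PER_PGD && vaddr != end; pgd_idx++) {
        for (; pmd_idx < PTRS_PER_PMD && vaddr != end; pmd_idx++) {
            count++;
            vaddr += PMD_SIZE;
        }
        pmd_idx = 0; /* later outer iterations start at the first entry */
    }
    return count;
}
```

Without the seeding line, the first pass of the inner loop would start from whatever value pmd_idx happened to hold.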
      
      Fixes: 719272c4 'x86, mm: only call early_ioremap_page_table_range_init() once'
      Signed-off-by: Minfei Huang <mnfhuang@gmail.com>
      Cc: tony.luck@intel.com
      Cc: wangnan0@huawei.com
      Cc: david.vrabel@citrix.com
      Reviewed-by: yinghai@kernel.org
      Link: http://lkml.kernel.org/r/1436703522-29552-1-git-send-email-mhuang@redhat.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • Add radeon suspend/resume quirk for HP Compaq dc5750. · bb64a76e
      Jeffery Miller authored
      commit 09bfda10 upstream.
      
      With the radeon driver loaded the HP Compaq dc5750
      Small Form Factor machine fails to resume from suspend.
      Adding a quirk similar to other devices avoids
      the problem and the system resumes properly.
      Signed-off-by: Jeffery Miller <jmiller@neverware.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • powerpc/mm: Recompute hash value after a failed update · bdb0c266
      Aneesh Kumar K.V authored
      commit 36b35d5d upstream.
      
      If the secondary hash flag was set, we ended up modifying the hash value in
      the updatepp code path. Hence, after a failed updatepp, we would use
      a wrong hash value for the following hash insert. Fix this by
      recomputing the hash before the insert.

      Without this patch we can end up using the wrong slot number in the Linux
      pte. That can result in us missing a hash pte update or invalidate,
      which can cause memory corruption or even a machine check.
      
      Fixes: 6d492ecc ("powerpc/THP: Add code to handle HPTE faults for hugepages")
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Reviewed-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • powerpc/rtas: Introduce rtas_get_sensor_fast() for IRQ handlers · 108ff598
      Thomas Huth authored
      commit 1c2cb594 upstream.
      
      The EPOW interrupt handler uses rtas_get_sensor(), which in turn
      uses rtas_busy_delay() to wait for RTAS becoming ready in case it
      is necessary. But rtas_busy_delay() is annotated with might_sleep()
      and thus may not be used by interrupt handlers like the EPOW handler!
      This leads to the following BUG when CONFIG_DEBUG_ATOMIC_SLEEP is
      enabled:
      
       BUG: sleeping function called from invalid context at arch/powerpc/kernel/rtas.c:496
       in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/1
       CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.2.0-rc2-thuth #6
       Call Trace:
       [c00000007ffe7b90] [c000000000807670] dump_stack+0xa0/0xdc (unreliable)
       [c00000007ffe7bc0] [c0000000000e1f14] ___might_sleep+0x134/0x180
       [c00000007ffe7c20] [c00000000002aec0] rtas_busy_delay+0x30/0xd0
       [c00000007ffe7c50] [c00000000002bde4] rtas_get_sensor+0x74/0xe0
       [c00000007ffe7ce0] [c000000000083264] ras_epow_interrupt+0x44/0x450
       [c00000007ffe7d90] [c000000000120260] handle_irq_event_percpu+0xa0/0x300
       [c00000007ffe7e70] [c000000000120524] handle_irq_event+0x64/0xc0
       [c00000007ffe7eb0] [c000000000124dbc] handle_fasteoi_irq+0xec/0x260
       [c00000007ffe7ef0] [c00000000011f4f0] generic_handle_irq+0x50/0x80
       [c00000007ffe7f20] [c000000000010f3c] __do_irq+0x8c/0x200
       [c00000007ffe7f90] [c0000000000236cc] call_do_irq+0x14/0x24
       [c00000007e6f39e0] [c000000000011144] do_IRQ+0x94/0x110
       [c00000007e6f3a30] [c000000000002594] hardware_interrupt_common+0x114/0x180
      
      Fix this issue by introducing a new rtas_get_sensor_fast() function
      that does not use rtas_busy_delay() - and thus can only be used for
      sensors that do not cause a BUSY condition - known as "fast" sensors.
      
      The EPOW sensor is defined to be "fast" in sPAPR - mpe.
      
      Fixes: 587f83e8 ("powerpc/pseries: Use rtas_get_sensor in RAS code")
      Signed-off-by: Thomas Huth <thuth@redhat.com>
      Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • powerpc/mm: Fix pte_pagesize_index() crash on 4K w/64K hash · 33493c71
      Michael Ellerman authored
      commit 74b5037b upstream.
      
      The powerpc kernel can be built to have either a 4K PAGE_SIZE or a 64K
      PAGE_SIZE.
      
      However when built with a 4K PAGE_SIZE there is an additional config
      option which can be enabled, PPC_HAS_HASH_64K, which means the kernel
      also knows how to hash a 64K page even though the base PAGE_SIZE is 4K.
      
      This is used in one obscure configuration, to support 64K pages for SPU
      local store on the Cell processor when the rest of the kernel is using
      4K pages.
      
      In this configuration, pte_pagesize_index() is defined to just pass
      through its arguments to get_slice_psize(). However pte_pagesize_index()
      is called for both user and kernel addresses, whereas get_slice_psize()
      only knows how to handle user addresses.
      
      This has been broken forever, however until recently it happened to
      work. That was because in get_slice_psize() the large kernel address
      would cause the right shift of the slice mask to return zero.
      
      However in commit 7aa0727f ("powerpc/mm: Increase the slice range to
      64TB"), the get_slice_psize() code was changed so that instead of a
      right shift we do an array lookup based on the address. When passed a
      kernel address this means we index way off the end of the slice array
      and return random junk.
      
      That is only fatal if we happen to hit something non-zero, but when we
      do return a non-zero value we confuse the MMU code and eventually cause
      a check stop.
      
      This fix is ugly, but simple. When we're called for a kernel address we
      return 4K, which is always correct in this configuration, otherwise we
      use the slice mask.
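The decision described here can be sketched as a small userspace function. The kernel base address, the slice table, the 1 TB slice size, and every name ending in _sketch are assumptions made for this illustration; the real macro and get_slice_psize() differ in detail.

```c
#include <assert.h>
#include <stdint.h>

enum { PSIZE_4K = 0, PSIZE_64K = 1 };  /* hypothetical index values */

/* Assumed start of the kernel address range, for illustration. */
#define KERNEL_START_SKETCH UINT64_C(0xc000000000000000)
#define SLICE_SHIFT_SKETCH  40         /* toy slices of 1 TB each */
#define NUM_SLICES_SKETCH   64         /* covers 64 TB of user space */

/* Toy per-slice table: slice 1 uses 64K pages, the others use 4K. */
static const int slice_psize_sketch[NUM_SLICES_SKETCH] = { [1] = PSIZE_64K };

/* Stand-in for get_slice_psize(): only valid for user addresses. */
static int user_slice_psize_sketch(uint64_t addr)
{
    uint64_t idx = addr >> SLICE_SHIFT_SKETCH;

    assert(idx < NUM_SLICES_SKETCH); /* a kernel address would index past the end */
    return slice_psize_sketch[idx];
}

static int pte_pagesize_index_sketch(uint64_t addr)
{
    /* Kernel addresses never reach the slice lookup; they are always 4K. */
    if (addr >= KERNEL_START_SKETCH)
        return PSIZE_4K;
    return user_slice_psize_sketch(addr);
}
```

The assert in the lookup marks the spot where, before the fix, a kernel address would have indexed far beyond the array.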
      
      Fixes: 7aa0727f ("powerpc/mm: Increase the slice range to 64TB")
      Reported-by: Cyril Bur <cyrilbur@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • ALSA: hda - Use ALC880_FIXUP_FUJITSU for FSC Amilo M1437 · bbe16223
      Takashi Iwai authored
      commit a161574e upstream.
      
      It turned out that the machine has a bass speaker, so take a correct
      fixup entry.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=102501
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • ALSA: hda - Enable headphone jack detect on old Fujitsu laptops · 56286045
      Takashi Iwai authored
      commit bb148bde upstream.
      
      According to the bug report, FSC Amilo laptops with ALC880 can detect
      the headphone jack, but currently the driver disables it.  This is partly
      intentional, as non-working jack detection was reported in the past.
      Let's enable it now.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=102501
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • Input: evdev - do not report errors form flush() · a9e13f8b
      Takashi Iwai authored
      commit eb38f3a4 upstream.
      
      We've got bug reports showing the old systemd-logind (at least
      systemd-210) aborting unexpectedly, and this turned out to be because
      of an invalid error code from the close() call to evdev devices.  close()
      is supposed to return only either EINTR or EBADF, while the device
      returned ENODEV.  logind was overreacting to it and decided to kill
      itself when an unexpected error code was received.  What a tragedy.
      
      The bad error code comes from flush fops, and actually evdev_flush()
      returns ENODEV when device is disconnected or client's access to it is
      revoked. But in these cases the fact that flush did not actually happen is
      not an error, but rather normal behavior. For non-disconnected devices
      result of flush is also not that interesting as there is no potential of
      data loss and even if it fails application has no way of handling the
      error. Because of that we are better off always returning success from
      evdev_flush().
      
      Also returning EINTR from flush()/close() is discouraged (as it is not
      clear how application should handle this error), so let's stop taking
      evdev->mutex interruptibly.
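The intended behaviour can be sketched in a few lines of userspace C. The struct and its fields are hypothetical stand-ins, and the counter replaces the real flush work; locking is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the state the flush handler looks at. */
struct evdev_sketch {
    bool exist;    /* device is still connected */
    bool revoked;  /* client's access has been revoked */
    int flushes;   /* counts flushes that were actually performed */
};

static int evdev_flush_sketch(struct evdev_sketch *ev)
{
    assert(ev != NULL);

    /* Flush only when it makes sense; skipping it is not an error. */
    if (ev->exist && !ev->revoked)
        ev->flushes++; /* stand-in for the real flush work */

    return 0; /* never hand an error code back to close() */
}
```

A disconnected or revoked device simply skips the flush, and the caller of close() sees success in every case.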
      
      Bugzilla: http://bugzilla.suse.com/show_bug.cgi?id=939834
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
    • arm64: KVM: Disable virtual timer even if the guest is not using it · eaebd1fb
      Marc Zyngier authored
      commit c4cbba9f upstream.
      
      When running a guest with the architected timer disabled (with QEMU and
      the kernel_irqchip=off option, for example), it is important to make
      sure the timer gets turned off. Otherwise, the guest may try to
      enable it anyway, leading to a screaming HW interrupt.
      
      The fix is to unconditionally turn off the virtual timer on guest
      exit.

      Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      eaebd1fb
    • Will Deacon's avatar
      arm64: errata: add module build workaround for erratum #843419 · 86230818
      Will Deacon authored
      commit df057cc7 upstream.
      
      Cortex-A53 processors <= r0p4 are affected by erratum #843419 which can
      lead to a memory access using an incorrect address in certain sequences
      headed by an ADRP instruction.
      
      There is a linker fix to generate veneers for ADRP instructions, but
      this doesn't work for kernel modules which are built as unlinked ELF
      objects.
      
      This patch adds a new config option for the erratum which, when enabled,
      builds kernel modules with the mcmodel=large flag. This uses absolute
      addressing for all kernel symbols, thereby removing the use of ADRP as
      a PC-relative form of addressing. The ADRP relocs are removed from the
      module loader so that we fail to load any potentially affected modules.

      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      86230818
    • Will Deacon's avatar
      arm64: head.S: initialise mdcr_el2 in el2_setup · 3ebb3728
      Will Deacon authored
      commit d10bcd47 upstream.
      
      When entering the kernel at EL2, we fail to initialise the MDCR_EL2
      register which controls debug access and PMU capabilities at EL1.
      
      This patch ensures that the register is initialised so that all traps
      are disabled and all the PMU counters are available to the host. When a
      guest is scheduled, KVM takes care to configure trapping appropriately.

      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      3ebb3728
    • Will Deacon's avatar
      arm64: compat: fix vfp save/restore across signal handlers in big-endian · cebbd84c
      Will Deacon authored
      commit bdec97a8 upstream.
      
      When saving/restoring the VFP registers from a compat (AArch32)
      signal frame, we rely on the compat registers forming a prefix of the
      native register file and therefore make use of copy_{to,from}_user to
      transfer between the native fpsimd_state and the compat_vfp_sigframe.
      
      Unfortunately, this doesn't work so well in a big-endian environment.
      Our fpsimd save/restore code operates directly on 128-bit quantities
      (Q registers) whereas the compat_vfp_sigframe represents the registers
      as an array of 64-bit (D) registers. The architecture packs the compat D
      registers into the Q registers, with the least significant bytes holding
      the lower register. Consequently, we need to swap the 64-bit halves when
      converting between these two representations on a big-endian machine.
      
      This patch replaces the __copy_{to,from}_user invocations in our
      compat VFP signal handling code with explicit __put_user loops that
      operate on 64-bit values and swap them accordingly.

      Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      cebbd84c
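
      The half-swap described in the compat VFP commit above can be sketched
      as follows. This is a minimal illustration, not the kernel code: the
      function name q_words_to_d_regs and the big_endian parameter are
      hypothetical (the kernel decides endianness at build time), and the
      registers are modelled as plain 64-bit words in memory order.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * q_words: the native 128-bit Q registers as they sit in memory, viewed as
 *          pairs of 64-bit words (hypothetical model for illustration).
 * d:       the compat view, an array of 64-bit D registers, where D(2i) is
 *          the least significant half of Q(i) and D(2i+1) the most significant.
 *
 * On a little-endian machine the low half comes first in memory, so the two
 * layouts agree. On a big-endian machine the high half comes first, so the
 * two 64-bit halves of every Q register have to be swapped.
 */
static void q_words_to_d_regs(const uint64_t *q_words, uint64_t *d,
                              size_t nr_q, int big_endian)
{
    for (size_t i = 0; i < nr_q; i++) {
        if (big_endian) {
            d[2 * i]     = q_words[2 * i + 1];
            d[2 * i + 1] = q_words[2 * i];
        } else {
            d[2 * i]     = q_words[2 * i];
            d[2 * i + 1] = q_words[2 * i + 1];
        }
    }
}
```

      A plain block copy corresponds to the little-endian branch only, which
      is why it breaks once the halves are stored in the opposite order.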
    • Jeff Vander Stoep's avatar
      arm64: kconfig: Move LIST_POISON to a safe value · 584f19f3
      Jeff Vander Stoep authored
      commit bf0c4e04 upstream.
      
      Move the poison pointer offset to 0xdead000000000000, a
      recognized value that is not mappable by user-space exploits.

      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Thierry Strudel <tstrudel@google.com>
      Signed-off-by: Jeff Vander Stoep <jeffv@google.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      584f19f3
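
      To illustrate why the LIST_POISON commit above moves the offset, here
      is a rough sketch. The helper names, the small 0x100 poison offset and
      the 48-bit user address limit are assumptions chosen for illustration,
      not values taken from the patch; only 0xdead000000000000 comes from the
      commit message.

```c
#include <stdint.h>

/* Offset named in the commit message. */
#define POISON_POINTER_DELTA 0xdead000000000000ULL

/* Hypothetical helper: a poison value is a small constant plus the offset. */
static inline uint64_t poison_pointer(uint64_t delta, uint64_t small_offset)
{
    return delta + small_offset;
}

/*
 * Hypothetical check, assuming user-space virtual addresses occupy at most
 * the low 48 bits of the address space.
 */
static inline int in_low_user_range(uint64_t addr)
{
    return addr < (1ULL << 48);
}
```

      With a zero offset the poison value lands at a low address that an
      exploit may be able to map; with the new offset it falls outside the
      range assumed here for user space.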
    • Bob Copeland's avatar
      mac80211: enable assoc check for mesh interfaces · aa21b9a6
      Bob Copeland authored
      commit 3633ebeb upstream.
      
      We already set a station to be associated when peering completes, both
      in user space and in the kernel.  Thus we should always have an
      associated sta before sending data frames to that station.
      
      Failure to check assoc state can cause crashes in the lower-level driver
      due to transmitting unicast data frames before driver sta structures
      (e.g. ampdu state in ath9k) are initialized.  This occurred when
      forwarding in the presence of fixed mesh paths: frames were transmitted
      to stations with whom we hadn't yet completed peering.

      Reported-by: Alexis Green <agreen@cococorp.com>
      Tested-by: Jesse Jones <jjones@cococorp.com>
      Signed-off-by: Bob Copeland <me@bobcopeland.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      aa21b9a6
    • Jean Delvare's avatar
      tg3: Fix temperature reporting · 8363652f
      Jean Delvare authored
      commit d3d11fe0 upstream.
      
      The temperature registers appear to report values in degrees Celsius
      while the hwmon API mandates values to be exposed in millidegrees
      Celsius. Do the conversion so that the values reported by "sensors"
      are correct.
      
      Fixes: aed93e0b ("tg3: Add hwmon support for temperature")
      Signed-off-by: Jean Delvare <jdelvare@suse.de>
      Cc: Prashant Sreedharan <prashant@broadcom.com>
      Cc: Michael Chan <mchan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      8363652f
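
      The unit conversion described in the tg3 commit above amounts to a
      single multiplication. A minimal sketch, assuming the register value is
      whole degrees Celsius; the helper name is hypothetical and is not the
      driver's actual function:

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert a raw register reading in degrees Celsius
 * into the millidegrees Celsius that hwmon expects.
 */
static inline long celsius_to_millicelsius(uint32_t reg_celsius)
{
    return (long)reg_celsius * 1000;
}
```

      Without the conversion, a 45 degree reading would be reported to user
      space as 0.045 degrees.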
    • Adrien Schildknecht's avatar
      rtlwifi: rtl8192cu: Add new device ID · 79b69ca8
      Adrien Schildknecht authored
      commit 1642d09f upstream.
      
      The v2 of NetGear WNA1000M uses a different idProduct: USB ID 0846:9043

      Signed-off-by: Adrien Schildknecht <adrien+dev@schischi.me>
      Acked-by: Larry Finger <Larry.Finger@lwfinger.net>
      Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      79b69ca8
    • Eric W. Biederman's avatar
      unshare: Unsharing a thread does not require unsharing a vm · 40608bef
      Eric W. Biederman authored
      commit 12c641ab upstream.
      
      The logic in the initial commit of unshare made creating a new
      thread group for a process contingent upon creating a new memory
      address space for that process.  That is wrong.  Two separate
      processes in different thread groups can share a memory address space,
      and clone allows creation of such processes.
      
      This is significant because it was observed that mm_users > 1 does not
      mean that a process is multi-threaded, as reading /proc/PID/maps
      temporarily increments mm_users, which allows other processes to
      (accidentally) interfere with unshare() calls.
      
      Correct the check in check_unshare_flags() to test for
      !thread_group_empty() for CLONE_THREAD, CLONE_SIGHAND, and CLONE_VM;
      for sighand->count > 1 for CLONE_SIGHAND and CLONE_VM; and
      for !current_is_single_threaded() instead of mm_users > 1 for CLONE_VM.
      
      By using the correct checks in unshare this removes the possibility of
      an accidental denial of service attack.
      
      Additionally, using the correct checks in unshare ensures that only an
      explicit unshare(CLONE_VM) can possibly trigger the slow path of
      current_is_single_threaded().  As an explicit unshare(CLONE_VM) is
      pointless, it is not expected that many applications make
      that call.
      
      Fixes: b2e0d987 ("userns: Implement unshare of the user namespace")
      Reported-by: Ricky Zhou <rickyz@chromium.org>
      Reported-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      40608bef
    • Imre Deak's avatar
      tty/vt: don't set font mappings on vc not supporting this · 8f5ea57d
      Imre Deak authored
      commit 9e326f78 upstream.
      
      We can call this function for a dummy console that doesn't support
      setting the font mapping, which will result in a null ptr BUG. So check
      for this case and return error for consoles w/o font mapping support.
      
      Reference: https://bugzilla.kernel.org/show_bug.cgi?id=59321
      Signed-off-by: Imre Deak <imre.deak@intel.com>
      Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      8f5ea57d
    • Mikulas Patocka's avatar
      hpfs: update ctime and mtime on directory modification · 83bd3842
      Mikulas Patocka authored
      commit f49a26e7 upstream.
      
      Update ctime and mtime when a directory is modified (though OS/2 doesn't
      update them anyway).

      Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      83bd3842
    • Grant Likely's avatar
      drivercore: Fix unregistration path of platform devices · 65181120
      Grant Likely authored
      commit 7f5dcaf1 upstream.
      
      The unregister path of platform_device is broken. On registration, it
      will register all resources with either a parent already set, or
      type==IORESOURCE_{IO,MEM}. However, on unregister it will release
      everything with type==IORESOURCE_{IO,MEM}, but ignore the others. There
      are also cases where resources don't get registered in the first place,
      like with devices created by of_platform_populate()*.
      
      Fix the unregister path to be symmetrical with the register path by
      checking the parent pointer instead of the type field to decide which
      resources to unregister. This is safe because the upshot of the
      registration path algorithm is that registered resources have a parent
      pointer, and non-registered resources do not.
      
      * It can be argued that of_platform_populate() should be registering
        its resources, and that argument has some merit. However, there are
        quite a few platforms that end up broken if we try to do that due to
        overlapping resources in the device tree. Until that is fixed, we need
        to solve the immediate problem.
      
      Cc: Pantelis Antoniou <pantelis.antoniou@konsulko.com>
      Cc: Wolfram Sang <wsa@the-dreams.de>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
      Signed-off-by: Grant Likely <grant.likely@linaro.org>
      Tested-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
      Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
      Signed-off-by: Rob Herring <robh@kernel.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      65181120
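
      The symmetry argument in the drivercore commit above can be sketched
      with a toy model. Everything here (the struct, the enum, root_resource
      and both functions) is hypothetical and far simpler than the real
      resource tree; it only shows that keying the release on the parent
      pointer undoes exactly what registration did.

```c
#include <stddef.h>

enum res_type { RES_IO, RES_MEM, RES_IRQ };

struct res {
    enum res_type type;
    struct res *parent;   /* non-NULL means "inserted into a resource tree" */
};

/* Hypothetical default parent for IO/MEM resources without a preset parent. */
static struct res root_resource = { RES_MEM, NULL };

/* Registration: keep a preset parent, else pick a default for IO/MEM only. */
static void register_resources(struct res *r, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!r[i].parent && (r[i].type == RES_IO || r[i].type == RES_MEM))
            r[i].parent = &root_resource;
    }
}

/* Unregistration: release whatever has a parent, regardless of type. */
static int unregister_resources(struct res *r, size_t n)
{
    int released = 0;

    for (size_t i = 0; i < n; i++) {
        if (r[i].parent) {
            r[i].parent = NULL;
            released++;
        }
    }
    return released;
}
```

      A type-based release would skip an IRQ resource that was registered
      through a preset parent, which is the leak the commit describes.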
    • Vignesh R's avatar
      ARM: OMAP2+: DRA7: clockdomain: change l4per2_7xx_clkdm to SW_WKUP · b92a551e
      Vignesh R authored
      commit b9e23f32 upstream.
      
      Legacy IPs like PWMSS, present under l4per2_7xx_clkdm, cannot support
      smart-idle when their clock domain is in HW_AUTO on DRA7 SoCs. Hence,
      program the clock domain to SW_WKUP.

      Signed-off-by: Vignesh R <vigneshr@ti.com>
      Acked-by: Tero Kristo <t-kristo@ti.com>
      Reviewed-by: Paul Walmsley <paul@pwsan.com>
      Signed-off-by: Paul Walmsley <paul@pwsan.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      b92a551e
    • David Daney's avatar
      of/address: Don't loop forever in of_find_matching_node_by_address(). · 6618a374
      David Daney authored
      commit 3a496b00 upstream.
      
      If the internal call to of_address_to_resource() fails, we end up
      looping forever in of_find_matching_node_by_address().  This can be
      caused by a defective device tree, or calling with an incorrect
      matches argument.
      
      Fix by calling of_find_matching_node() unconditionally at the end of
      the loop.

      Signed-off-by: David Daney <david.daney@cavium.com>
      Signed-off-by: Rob Herring <robh@kernel.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      6618a374
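
      The loop shape that the of/address commit above fixes can be sketched
      on a plain linked list. The struct and function below are hypothetical
      stand-ins, not the device-tree API; has_valid_address models the
      address lookup succeeding or failing.

```c
#include <stdbool.h>
#include <stddef.h>

struct node {
    struct node *next;
    bool has_valid_address;   /* models the address lookup succeeding */
    unsigned long base;
};

/* Hypothetical search: return the first node whose valid address matches. */
static struct node *find_node_by_address(struct node *head, unsigned long base)
{
    struct node *n = head;

    while (n) {
        if (n->has_valid_address && n->base == base)
            return n;
        /*
         * Advance unconditionally. If this step were skipped whenever the
         * address lookup failed, the loop would revisit the same node
         * forever.
         */
        n = n->next;
    }
    return NULL;
}
```

      Keeping the advance as the last statement of the loop body means no
      early exit from an iteration can bypass it.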
    • Sudip Mukherjee's avatar
      auxdisplay: ks0108: fix refcount · 067f3f38
      Sudip Mukherjee authored
      commit bab383de upstream.
      
      parport_find_base() will implicitly do parport_get_port(), which
      increases the refcount. Then parport_register_device() will again
      increment the refcount. But while unloading the module we are only
      doing parport_unregister_device(), decrementing the refcount only once.
      We add a parport_put_port() to neutralize the effect of
      parport_get_port().

      Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      067f3f38
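
      The refcount imbalance in the ks0108 commit above can be modelled with
      a bare counter. All names below are hypothetical stand-ins for the
      parport calls, and where exactly the driver places the extra put is
      not shown here; the sketch only checks that gets and puts pair up.

```c
/* Toy port object: just a reference counter. */
struct port {
    int refcount;
};

/* Models a lookup that returns the port with a reference taken. */
static struct port *find_port(struct port *p) { p->refcount++; return p; }

/* Models device registration and unregistration, one reference each way. */
static void register_device(struct port *p)   { p->refcount++; }
static void unregister_device(struct port *p) { p->refcount--; }

/* Models dropping the reference taken by the lookup. */
static void put_port(struct port *p)          { p->refcount--; }

/* Run one load/unload cycle and report the references left behind. */
static int load_and_unload(struct port *p, int balance_lookup)
{
    find_port(p);
    register_device(p);
    if (balance_lookup)
        put_port(p);
    unregister_device(p);
    return p->refcount;
}
```

      Without the extra put, each load/unload cycle leaves one reference
      behind; with it, the count returns to where it started.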
    • Masahiro Yamada's avatar
      devres: fix devres_get() · 1e412fe3
      Masahiro Yamada authored
      commit 64526370 upstream.
      
      Currently, devres_get() passes devres_free() the pointer to the devres
      struct, but devres_free() should be given the pointer to the resource data.
      
      Fixes: 9ac7849e ("devres: device resource management")
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Acked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      1e412fe3
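
      The pointer mix-up in the devres commit above comes from a common
      layout: a bookkeeping header followed by the caller's data. A minimal
      user-space sketch of that layout, with hypothetical names (devres_hdr,
      res_alloc, res_to_hdr, res_free) standing in for the real devres
      internals:

```c
#include <stddef.h>
#include <stdlib.h>

/* Bookkeeping header followed by the data area handed out to callers. */
struct devres_hdr {
    void *list_node;             /* stand-in for internal bookkeeping */
    unsigned long long data[];   /* resource data starts here */
};

/* Allocate header plus data, but return only the data pointer. */
static void *res_alloc(size_t size)
{
    struct devres_hdr *dr = calloc(1, sizeof(*dr) + size);

    return dr ? dr->data : NULL;
}

/* Recover the header from a data pointer. */
static struct devres_hdr *res_to_hdr(void *res)
{
    return (struct devres_hdr *)((char *)res - offsetof(struct devres_hdr, data));
}

/* Free expects the data pointer and steps back to the header itself. */
static void res_free(void *res)
{
    if (res)
        free(res_to_hdr(res));
}
```

      Because res_free() steps back by the header size, handing it the
      header instead of the data pointer would make it free an address
      before the real allocation.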
    • Max Filippov's avatar
      xtensa: fix kernel register spilling · f29cc860
      Max Filippov authored
      commit 77d6273e upstream.
      
      call12 can't be safely used as the first call in the inline function,
      because the compiler does not extend the stack frame of the bounding
      function accordingly, which may result in corruption of local variables.
      
      If a call needs to be done, do call8 first followed by call12.
      
      For pure assembly code in _switch_to, increase the stack frame size of
      the bounding function.

      Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      f29cc860
    • Max Filippov's avatar
      xtensa: fix threadptr reload on return to userspace · cc8fd338
      Max Filippov authored
      commit 4229fb12 upstream.
      
      Userspace return code may skip restoring THREADPTR register if there are
      no registers that need to be zeroed. This leads to spurious failures in
      libc NPTL tests.
      
      Always restore THREADPTR on return to userspace.

      Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      cc8fd338
    • Xiao Guangrong's avatar
      KVM: MMU: fix validation of mmio page fault · 776a1169
      Xiao Guangrong authored
      commit 6f691251 upstream.
      
      We hit a bug where qemu complained with "KVM: unknown exit, hardware
      reason 31" and KVM showed this info:
      [84245.284948] EPT: Misconfiguration.
      [84245.285056] EPT: GPA: 0xfeda848
      [84245.285154] ept_misconfig_inspect_spte: spte 0x5eaef50107 level 4
      [84245.285344] ept_misconfig_inspect_spte: spte 0x5f5fadc107 level 3
      [84245.285532] ept_misconfig_inspect_spte: spte 0x5141d18107 level 2
      [84245.285723] ept_misconfig_inspect_spte: spte 0x52e40dad77 level 1
      
      This is because we got an mmio #PF and the handler saw that the mmio
      spte had become normal (pointing to a ram page).

      However, this is valid after introducing fast mmio spte invalidation, which
      increases the generation number instead of zapping mmio sptes. An example
      is as follows:
      1. QEMU drops mmio region by adding a new memslot
      2. invalidate all mmio sptes
      3.
      
              VCPU 0                        VCPU 1
          access the invalid mmio spte
                                  access the region that was originally MMIO
                                  set the spte to the normal ram map
      
          mmio #PF
          check the spte and see it becomes normal ram mapping !!!
      
      This patch fixes the bug simply by dropping the check in the mmio handler,
      which makes it good for backporting. A full check will be introduced in
      later patches.

      Reported-by: Pavel Shirshov <ru.pchel@gmail.com>
      Tested-by: Pavel Shirshov <ru.pchel@gmail.com>
      Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      776a1169
    • Don Zickus's avatar
      HID: usbhid: Fix the check for HID_RESET_PENDING in hid_io_error · 26db2c34
      Don Zickus authored
      commit 3af4e5a9 upstream.
      
      It was reported that after 10-20 reboots, a usb keyboard plugged
      into a docking station would not work unless it was replugged in.
      
      Using usbmon, it turns out the interrupt URBs were streaming with
      callback errors of -71 for some reason.  The hid-core.c::hid_io_error was
      supposed to retry and then reset, but the reset wasn't really happening.
      
      The check for HID_NO_BANDWIDTH was inverted.  Fix was simple.
      
      Tested by reporter and locally by me by unplugging a keyboard halfway until I
      could recreate a stream of errors but no disconnect.

      Signed-off-by: Don Zickus <dzickus@redhat.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      26db2c34
    • Andrey Ryabinin's avatar
      crypto: ghash-clmulni: specify context size for ghash async algorithm · 9cc6ecb5
      Andrey Ryabinin authored
      commit 71c6da84 upstream.
      
      Currently the context size (cra_ctxsize) isn't specified for
      ghash_async_alg, which means it's zero. Thus crypto_create_tfm()
      doesn't allocate the space needed for ghash_async_ctx, so any
      read/write to ctx (e.g. in ghash_async_init_tfm()) is not valid.

      Signed-off-by: Andrey Ryabinin <aryabinin@odin.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      9cc6ecb5
    • Maciej S. Szmigiero's avatar
      serial: 8250: don't bind to SMSC IrCC IR port · 01e34fe8
      Maciej S. Szmigiero authored
      commit ffa34de0 upstream.
      
      The SMSC IrCC SIR/FIR port should not be bound by the
      (legacy) serial driver, so that its own driver (smsc-ircc2)
      can bind to it.

      Signed-off-by: Maciej Szmigiero <mail@maciej.szmigiero.name>
      Signed-off-by: Jiri Slaby <jslaby@suse.cz>
      01e34fe8