1. 05 Mar, 2019 20 commits
    • Peng Hao's avatar
      x86/mm/mem_encrypt: Fix erroneous sizeof() · 7ff77864
      Peng Hao authored
      [ Upstream commit bf7d28c5 ]
      
      Using sizeof(pointer) for determining the size of a memset() only works
      when the size of the pointer and the size of type to which it points are
      the same. For pte_t this is only true for 64bit and 32bit-NONPAE. On 32bit
      PAE systems this is wrong as the pointer size is 4 byte but the PTE entry
      is 8 bytes. It's actually not a real world issue as this code depends on
      64bit, but it's wrong nevertheless.
      
      Use sizeof(*p) for correctness sake.
      
      Fixes: aad98391 ("x86/mm/encrypt: Simplify sme_populate_pgd() and sme_populate_pgd_large()")
      Signed-off-by: default avatarPeng Hao <peng.hao2@zte.com.cn>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: dave.hansen@linux.intel.com
      Cc: peterz@infradead.org
      Cc: luto@kernel.org
      Link: https://lkml.kernel.org/r/1546065252-97996-1-git-send-email-peng.hao2@zte.com.cnSigned-off-by: default avatarSasha Levin <sashal@kernel.org>
      7ff77864
    • Srinivas Ramana's avatar
      genirq: Make sure the initial affinity is not empty · 17fab891
      Srinivas Ramana authored
      [ Upstream commit bddda606 ]
      
      If all CPUs in the irq_default_affinity mask are offline when an interrupt
      is initialized then irq_setup_affinity() can set an empty affinity mask for
      a newly allocated interrupt.
      
      Fix this by falling back to cpu_online_mask in case the resulting affinity
      mask is zero.
      Signed-off-by: default avatarSrinivas Ramana <sramana@codeaurora.org>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: linux-arm-msm@vger.kernel.org
      Link: https://lkml.kernel.org/r/1545312957-8504-1-git-send-email-sramana@codeaurora.orgSigned-off-by: default avatarSasha Levin <sashal@kernel.org>
      17fab891
    • Alexandre Belloni's avatar
      selftests: rtc: rtctest: add alarm test on minute boundary · 7746dd64
      Alexandre Belloni authored
      [ Upstream commit 7b302772 ]
      
      Unfortunately, some RTC don't have a second resolution for alarm so also
      test for alarm on a minute boundary.
      Signed-off-by: default avatarAlexandre Belloni <alexandre.belloni@bootlin.com>
      Signed-off-by: default avatarShuah Khan <shuah@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      7746dd64
    • Alexandre Belloni's avatar
      selftests: rtc: rtctest: fix alarm tests · 2409a869
      Alexandre Belloni authored
      [ Upstream commit fdac9448 ]
      
      Return values for select are not checked properly and timeouts may not be
      detected.
      Signed-off-by: default avatarAlexandre Belloni <alexandre.belloni@bootlin.com>
      Signed-off-by: default avatarShuah Khan <shuah@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      2409a869
    • Dan Carpenter's avatar
      usb: gadget: Potential NULL dereference on allocation error · 4670e839
      Dan Carpenter authored
      [ Upstream commit df28169e ]
      
      The source_sink_alloc_func() function is supposed to return error
      pointers on error.  The function is called from usb_get_function() which
      doesn't check for NULL returns so it would result in an Oops.
      
      Of course, in the current kernel, small allocations always succeed so
      this doesn't affect runtime.
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarFelipe Balbi <felipe.balbi@linux.intel.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      4670e839
    • Zeng Tao's avatar
      usb: dwc3: gadget: Fix the uninitialized link_state when udc starts · 08c937f9
      Zeng Tao authored
      [ Upstream commit 88b1bb1f ]
      
      Currently the link_state is uninitialized and the default value is 0(U0)
      before the first time we start the udc, and after we start the udc then
       stop the udc, the link_state will be undefined.
      We may have the following warnings if we start the udc again with
      an undefined link_state:
      
      WARNING: CPU: 0 PID: 327 at drivers/usb/dwc3/gadget.c:294 dwc3_send_gadget_ep_cmd+0x304/0x308
      dwc3 100e0000.hidwc3_0: wakeup failed --> -22
      [...]
      Call Trace:
      [<c010f270>] (unwind_backtrace) from [<c010b3d8>] (show_stack+0x10/0x14)
      [<c010b3d8>] (show_stack) from [<c034a4dc>] (dump_stack+0x84/0x98)
      [<c034a4dc>] (dump_stack) from [<c0118000>] (__warn+0xe8/0x100)
      [<c0118000>] (__warn) from [<c0118050>](warn_slowpath_fmt+0x38/0x48)
      [<c0118050>] (warn_slowpath_fmt) from [<c0442ec0>](dwc3_send_gadget_ep_cmd+0x304/0x308)
      [<c0442ec0>] (dwc3_send_gadget_ep_cmd) from [<c0445e68>](dwc3_ep0_start_trans+0x48/0xf4)
      [<c0445e68>] (dwc3_ep0_start_trans) from [<c0446750>](dwc3_ep0_out_start+0x64/0x80)
      [<c0446750>] (dwc3_ep0_out_start) from [<c04451c0>](__dwc3_gadget_start+0x1e0/0x278)
      [<c04451c0>] (__dwc3_gadget_start) from [<c04452e0>](dwc3_gadget_start+0x88/0x10c)
      [<c04452e0>] (dwc3_gadget_start) from [<c045ee54>](udc_bind_to_driver+0x88/0xbc)
      [<c045ee54>] (udc_bind_to_driver) from [<c045f29c>](usb_gadget_probe_driver+0xf8/0x140)
      [<c045f29c>] (usb_gadget_probe_driver) from [<bf005424>](gadget_dev_desc_UDC_store+0xac/0xc4 [libcomposite])
      [<bf005424>] (gadget_dev_desc_UDC_store [libcomposite]) from[<c023d8e0>] (configfs_write_file+0xd4/0x160)
      [<c023d8e0>] (configfs_write_file) from [<c01d51e8>] (__vfs_write+0x1c/0x114)
      [<c01d51e8>] (__vfs_write) from [<c01d5ff4>] (vfs_write+0xa4/0x168)
      [<c01d5ff4>] (vfs_write) from [<c01d6d40>] (SyS_write+0x3c/0x90)
      [<c01d6d40>] (SyS_write) from [<c0107400>] (ret_fast_syscall+0x0/0x3c)
      Signed-off-by: default avatarZeng Tao <prime.zeng@hisilicon.com>
      Signed-off-by: default avatarFelipe Balbi <felipe.balbi@linux.intel.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      08c937f9
    • Bo He's avatar
      usb: dwc3: gadget: synchronize_irq dwc irq in suspend · 03a5d4d5
      Bo He authored
      [ Upstream commit 01c10880 ]
      
      We see dwc3 endpoint stopped by unwanted irq during
      suspend resume test, which is caused dwc3 ep can't be started
      with error "No Resource".
      
      Here, add synchronize_irq before suspend to sync the
      pending IRQ handlers complete.
      Signed-off-by: default avatarBo He <bo.he@intel.com>
      Signed-off-by: default avatarYu Wang <yu.y.wang@intel.com>
      Signed-off-by: default avatarFelipe Balbi <felipe.balbi@linux.intel.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      03a5d4d5
    • Dan Carpenter's avatar
      thermal: int340x_thermal: Fix a NULL vs IS_ERR() check · f29024c0
      Dan Carpenter authored
      [ Upstream commit 3fe931b3 ]
      
      The intel_soc_dts_iosf_init() function doesn't return NULL, it returns
      error pointers.
      
      Fixes: 4d0dd6c1 ("Thermal/int340x/processor_thermal: Enable auxiliary DTS for Braswell")
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarZhang Rui <rui.zhang@intel.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      f29024c0
    • Marek Vasut's avatar
      clk: vc5: Abort clock configuration without upstream clock · fc1073df
      Marek Vasut authored
      [ Upstream commit 2137a109 ]
      
      In case the upstream clock are not set, which can happen in case the
      VC5 has no valid upstream clock, the $src variable is used uninited
      by regmap_update_bits(). Check for this condition and return -EINVAL
      in such case.
      
      Note that in case the VC5 has no valid upstream clock, the VC5 can
      not operate correctly. That is a hardware property of the VC5. The
      internal oscilator present in some VC5 models is also considered
      upstream clock.
      Signed-off-by: default avatarMarek Vasut <marek.vasut+renesas@gmail.com>
      Cc: Alexey Firago <alexey_firago@mentor.com>
      Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
      Cc: Stephen Boyd <sboyd@kernel.org>
      Cc: linux-renesas-soc@vger.kernel.org
      [sboyd@kernel.org: Added comment about probe preventing this from
      happening in the first place]
      Signed-off-by: default avatarStephen Boyd <sboyd@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      fc1073df
    • Lubomir Rintel's avatar
      clk: sysfs: fix invalid JSON in clk_dump · 71943c38
      Lubomir Rintel authored
      [ Upstream commit c6e90997 ]
      
      Add a missing comma so that the output is valid JSON format again.
      
      Fixes: 9fba738a ("clk: add duty cycle support")
      Signed-off-by: default avatarLubomir Rintel <lkundrak@v3.sk>
      Signed-off-by: default avatarStephen Boyd <sboyd@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      71943c38
    • Dan Carpenter's avatar
      clk: tegra: dfll: Fix a potential Oop in remove() · acc934f5
      Dan Carpenter authored
      [ Upstream commit d39eca54 ]
      
      If tegra_dfll_unregister() fails then "soc" is an error pointer.  We
      should just return instead of dereferencing it.
      
      Fixes: 1752c9ee ("clk: tegra: dfll: Fix drvdata overwriting issue")
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarStephen Boyd <sboyd@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      acc934f5
    • Yizhuo's avatar
      ASoC: Variable "val" in function rt274_i2c_probe() could be uninitialized · 651023ed
      Yizhuo authored
      [ Upstream commit 8c3590de ]
      
      Inside function rt274_i2c_probe(), if regmap_read() function
      returns -EINVAL, then local variable "val" leaves uninitialized
      but used in if statement. This is potentially unsafe.
      Signed-off-by: default avatarYizhuo <yzhai003@ucr.edu>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      651023ed
    • Dan Carpenter's avatar
      ALSA: compress: prevent potential divide by zero bugs · e7b2f9f2
      Dan Carpenter authored
      [ Upstream commit 678e2b44 ]
      
      The problem is seen in the q6asm_dai_compr_set_params() function:
      
      	ret = q6asm_map_memory_regions(dir, prtd->audio_client, prtd->phys,
      				       (prtd->pcm_size / prtd->periods),
                                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      				       prtd->periods);
      
      In this code prtd->pcm_size is the buffer_size and prtd->periods comes
      from params->buffer.fragments.  If we allow the number of fragments to
      be zero then it results in a divide by zero bug.  One possible fix would
      be to use prtd->pcm_count directly instead of using the division to
      re-calculate it.  But I decided that it doesn't really make sense to
      allow zero fragments.
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      e7b2f9f2
    • Rander Wang's avatar
      ASoC: Intel: Haswell/Broadwell: fix setting for .dynamic field · a4964959
      Rander Wang authored
      [ Upstream commit 906a9abc ]
      
      For some reason this field was set to zero when all other drivers use
      .dynamic = 1 for front-ends. This change was tested on Dell XPS13 and
      has no impact with the existing legacy driver. The SOF driver also works
      with this change which enables it to override the fixed topology.
      Signed-off-by: default avatarRander Wang <rander.wang@linux.intel.com>
      Acked-by: default avatarPierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      a4964959
    • Kristian H. Kristensen's avatar
      drm/msm: Unblock writer if reader closes file · 5a700533
      Kristian H. Kristensen authored
      [ Upstream commit 99c66bc0 ]
      
      Prevents deadlock when fifo is full and reader closes file.
      Signed-off-by: default avatarKristian H. Kristensen <hoegsberg@chromium.org>
      Signed-off-by: default avatarRob Clark <robdclark@gmail.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      5a700533
    • John Garry's avatar
      scsi: libsas: Fix rphy phy_identifier for PHYs with end devices attached · 0f978ec3
      John Garry authored
      commit ffeafdd2 upstream.
      
      The sysfs phy_identifier attribute for a sas_end_device comes from the rphy
      phy_identifier value.
      
      Currently this is not being set for rphys with an end device attached, so
      we see incorrect symlinks from systemd disk/by-path:
      
      root@localhost:~# ls -l /dev/disk/by-path/
      total 0
      lrwxrwxrwx 1 root root  9 Feb 13 12:26 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy0-lun-0 -> ../../sdb
      lrwxrwxrwx 1 root root 10 Feb 13 12:26 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy0-lun-0-part1 -> ../../sdb1
      lrwxrwxrwx 1 root root 10 Feb 13 12:26 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy0-lun-0-part2 -> ../../sdb2
      lrwxrwxrwx 1 root root 10 Feb 13 12:26 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy0-lun-0-part3 -> ../../sdc3
      
      Indeed, each sas_end_device phy_identifier value is 0:
      
      root@localhost:/# more sys/class/sas_device/end_device-0\:0\:2/phy_identifier
      0
      root@localhost:/# more sys/class/sas_device/end_device-0\:0\:10/phy_identifier
      0
      
      This patch fixes the discovery code to set the phy_identifier.  With this,
      we now get proper symlinks:
      
      root@localhost:~# ls -l /dev/disk/by-path/
      total 0
      lrwxrwxrwx 1 root root  9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy10-lun-0 -> ../../sdg
      lrwxrwxrwx 1 root root  9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy11-lun-0 -> ../../sdh
      lrwxrwxrwx 1 root root  9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy2-lun-0 -> ../../sda
      lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy2-lun-0-part1 -> ../../sda1
      lrwxrwxrwx 1 root root  9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy3-lun-0 -> ../../sdb
      lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy3-lun-0-part1 -> ../../sdb1
      lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy3-lun-0-part2 -> ../../sdb2
      lrwxrwxrwx 1 root root  9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy4-lun-0 -> ../../sdc
      lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy4-lun-0-part1 -> ../../sdc1
      lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy4-lun-0-part2 -> ../../sdc2
      lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy4-lun-0-part3 -> ../../sdc3
      lrwxrwxrwx 1 root root  9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy5-lun-0 -> ../../sdd
      lrwxrwxrwx 1 root root  9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy7-lun-0 -> ../../sde
      lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy7-lun-0-part1 -> ../../sde1
      lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy7-lun-0-part2 -> ../../sde2
      lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy7-lun-0-part3 -> ../../sde3
      lrwxrwxrwx 1 root root  9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy8-lun-0 -> ../../sdf
      lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy8-lun-0-part1 -> ../../sdf1
      lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy8-lun-0-part2 -> ../../sdf2
      lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy8-lun-0-part3 -> ../../sdf3
      
      Fixes: 2908d778 ("[SCSI] aic94xx: new driver")
      Reported-by: default avatardann frazier <dann.frazier@canonical.com>
      Signed-off-by: default avatarJohn Garry <john.garry@huawei.com>
      Reviewed-by: default avatarJason Yan <yanaijie@huawei.com>
      Tested-by: default avatardann frazier <dann.frazier@canonical.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0f978ec3
    • Toke Høiland-Jørgensen's avatar
      mac80211: Change default tx_sk_pacing_shift to 7 · a7c6cf3b
      Toke Høiland-Jørgensen authored
      commit 5c14a4d0 upstream.
      
      When we did the original tests for the optimal value of sk_pacing_shift, we
      came up with 6 ms of buffering as the default. Sadly, 6 is not a power of
      two, so when picking the shift value I erred on the size of less buffering
      and picked 4 ms instead of 8. This was probably wrong; those 2 ms of extra
      buffering makes a larger difference than I thought.
      
      So, change the default pacing shift to 7, which corresponds to 8 ms of
      buffering. The point of diminishing returns really kicks in after 8 ms, and
      so having this as a default should cut down on the need for extensive
      per-device testing and overrides needed in the drivers.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarToke Høiland-Jørgensen <toke@redhat.com>
      Signed-off-by: default avatarJohannes Berg <johannes.berg@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      a7c6cf3b
    • Long Li's avatar
      genirq/matrix: Improve target CPU selection for managed interrupts. · 765c30b3
      Long Li authored
      [ Upstream commit e8da8794 ]
      
      On large systems with multiple devices of the same class (e.g. NVMe disks,
      using managed interrupts), the kernel can affinitize these interrupts to a
      small subset of CPUs instead of spreading them out evenly.
      
      irq_matrix_alloc_managed() tries to select the CPU in the supplied cpumask
      of possible target CPUs which has the lowest number of interrupt vectors
      allocated.
      
      This is done by searching the CPU with the highest number of available
      vectors. While this is correct for non-managed CPUs it can select the wrong
      CPU for managed interrupts. Under certain constellations this results in
      affinitizing the managed interrupts of several devices to a single CPU in
      a set.
      
      The book keeping of available vectors works the following way:
      
       1) Non-managed interrupts:
      
          available is decremented when the interrupt is actually requested by
          the device driver and a vector is assigned. It's incremented when the
          interrupt and the vector are freed.
      
       2) Managed interrupts:
      
          Managed interrupts guarantee vector reservation when the MSI/MSI-X
          functionality of a device is enabled, which is achieved by reserving
          vectors in the bitmaps of the possible target CPUs. This reservation
          decrements the available count on each possible target CPU.
      
          When the interrupt is requested by the device driver then a vector is
          allocated from the reserved region. The operation is reversed when the
          interrupt is freed by the device driver. Neither of these operations
          affect the available count.
      
          The reservation persist up to the point where the MSI/MSI-X
          functionality is disabled and only this operation increments the
          available count again.
      
      For non-managed interrupts the available count is the correct selection
      criterion because the guaranteed reservations need to be taken into
      account. Using the allocated counter could lead to a failing allocation in
      the following situation (total vector space of 10 assumed):
      
      		 CPU0	CPU1
       available:	    2	   0
       allocated:	    5	   3   <--- CPU1 is selected, but available space = 0
       managed reserved:  3	   7
      
       while available yields the correct result.
      
      For managed interrupts the available count is not the appropriate
      selection criterion because as explained above the available count is not
      affected by the actual vector allocation.
      
      The following example illustrates that. Total vector space of 10
      assumed. The starting point is:
      
      		 CPU0	CPU1
       available:	    5	   4
       allocated:	    2	   3
       managed reserved:  3	   3
      
       Allocating vectors for three non-managed interrupts will result in
       affinitizing the first two to CPU0 and the third one to CPU1 because the
       available count is adjusted with each allocation:
      
      		  CPU0	CPU1
       available:	     5	   4	<- Select CPU0 for 1st allocation
       --> allocated:	     3	   3
      
       available:	     4	   4	<- Select CPU0 for 2nd allocation
       --> allocated:	     4	   3
      
       available:	     3	   4	<- Select CPU1 for 3rd allocation
       --> allocated:	     4	   4
      
       But the allocation of three managed interrupts starting from the same
       point will affinitize all of them to CPU0 because the available count is
       not affected by the allocation (see above). So the end result is:
      
      		  CPU0	CPU1
       available:	     5	   4
       allocated:	     5	   3
      
      Introduce a "managed_allocated" field in struct cpumap to track the vector
      allocation for managed interrupts separately. Use this information to
      select the target CPU when a vector is allocated for a managed interrupt,
      which results in more evenly distributed vector assignments. The above
      example results in the following allocations:
      
      		 CPU0	CPU1
       managed_allocated: 0	   0	<- Select CPU0 for 1st allocation
       --> allocated:	    3	   3
      
       managed_allocated: 1	   0	<- Select CPU1 for 2nd allocation
       --> allocated:	    3	   4
      
       managed_allocated: 1	   1	<- Select CPU0 for 3rd allocation
       --> allocated:	    4	   4
      
      The allocation of non-managed interrupts is not affected by this change and
      is still evaluating the available count.
      
      The overall distribution of interrupt vectors for both types of interrupts
      might still not be perfectly even depending on the number of non-managed
      and managed interrupts in a system, but due to the reservation guarantee
      for managed interrupts this cannot be avoided.
      
      Expose the new field in debugfs as well.
      
      [ tglx: Clarified the background of the problem in the changelog and
        	described it independent of NVME ]
      Signed-off-by: default avatarLong Li <longli@microsoft.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: Michael Kelley <mikelley@microsoft.com>
      Link: https://lkml.kernel.org/r/20181106040000.27316-1-longli@linuxonhyperv.comSigned-off-by: default avatarSasha Levin <sashal@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      765c30b3
    • Dou Liyang's avatar
      irq/matrix: Spread managed interrupts on allocation · 8cae7757
      Dou Liyang authored
      [ Upstream commit 76f99ae5 ]
      
      Linux spreads out the non managed interrupt across the possible target CPUs
      to avoid vector space exhaustion.
      
      Managed interrupts are treated differently, as for them the vectors are
      reserved (with guarantee) when the interrupt descriptors are initialized.
      
      When the interrupt is requested a real vector is assigned. The assignment
      logic uses the first CPU in the affinity mask for assignment. If the
      interrupt has more than one CPU in the affinity mask, which happens when a
      multi queue device has less queues than CPUs, then doing the same search as
      for non managed interrupts makes sense as it puts the interrupt on the
      least interrupt plagued CPU. For single CPU affine vectors that's obviously
      a NOOP.
      
      Restructre the matrix allocation code so it does the 'best CPU' search, add
      the sanity check for an empty affinity mask and adapt the call site in the
      x86 vector management code.
      
      [ tglx: Added the empty mask check to the core and improved change log ]
      Signed-off-by: default avatarDou Liyang <douly.fnst@cn.fujitsu.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: hpa@zytor.com
      Link: https://lkml.kernel.org/r/20180908175838.14450-2-dou_liyang@163.comSigned-off-by: default avatarSasha Levin <sashal@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8cae7757
    • Dou Liyang's avatar
      irq/matrix: Split out the CPU selection code into a helper · 2948b887
      Dou Liyang authored
      [ Upstream commit 8ffe4e61 ]
      
      Linux finds the CPU which has the lowest vector allocation count to spread
      out the non managed interrupts across the possible target CPUs, but does
      not do so for managed interrupts.
      
      Split out the CPU selection code into a helper function for reuse. No
      functional change.
      Signed-off-by: default avatarDou Liyang <douly.fnst@cn.fujitsu.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: hpa@zytor.com
      Link: https://lkml.kernel.org/r/20180908175838.14450-1-dou_liyang@163.comSigned-off-by: default avatarSasha Levin <sashal@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2948b887
  2. 27 Feb, 2019 20 commits