1. 13 Mar, 2019 18 commits
    • selftests: cpu-hotplug: fix case where CPUs offline > CPUs present · 0165df14
      Colin Ian King authored
      [ Upstream commit 2b531b61 ]
      
      The cpu-hotplug test assumes that we can offline the maximum CPU as
      described by /sys/devices/system/cpu/offline.  However, in the case
      where the number of CPUs exceeds the kernel configuration then
      the offline count can be greater than the present count and we end
      up trying to test the offlining of a CPU that is not available to
      offline.  Fix this by testing the maximum present CPU instead.
      
      Also, the test currently offlines the CPU and does not online it,
      so fix this by onlining the CPU after the test.
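
      The difference between the two lists can be sketched in C. This is
      only an illustration and not the selftest's own shell code:
      max_cpu_in_list() is a hypothetical helper that returns the highest
      CPU number in a sysfs-style list, and the example values in the note
      below are assumptions.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper: highest CPU number found in a sysfs-style
 * list such as "0-3" or "0,2-5". Returns -1 for an empty list. */
int max_cpu_in_list(const char *list)
{
	long max = -1;
	const char *p = list;

	assert(list != NULL);
	while (*p) {
		char *end;
		long v;

		/* Skip separators such as ',', '-' and newlines. */
		if (!isdigit((unsigned char)*p)) {
			p++;
			continue;
		}
		v = strtol(p, &end, 10);
		if (v > max)
			max = v;
		p = end;
	}
	return (int)max;
}
```

      With a present list of "0-3" and an offline list of "4-127", the
      helper yields 3 and 127. Only the first is a CPU that can actually
      be offlined, which is why the test should use the present list.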
      
      Fixes: d89dffa9 ("fault-injection: add selftests for cpu and memory hotplug")
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: Shuah Khan <shuah@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • IB/ipoib: Fix for use-after-free in ipoib_cm_tx_start · 1ee82160
      Feras Daoud authored
      [ Upstream commit 6ab4aba0 ]
      
      The following BUG was reported by kasan:
      
       BUG: KASAN: use-after-free in ipoib_cm_tx_start+0x430/0x1390 [ib_ipoib]
       Read of size 80 at addr ffff88034c30bcd0 by task kworker/u16:1/24020
      
       Workqueue: ipoib_wq ipoib_cm_tx_start [ib_ipoib]
       Call Trace:
        dump_stack+0x9a/0xeb
        print_address_description+0xe3/0x2e0
        kasan_report+0x18a/0x2e0
        ? ipoib_cm_tx_start+0x430/0x1390 [ib_ipoib]
        memcpy+0x1f/0x50
        ipoib_cm_tx_start+0x430/0x1390 [ib_ipoib]
        ? kvm_clock_read+0x1f/0x30
        ? ipoib_cm_skb_reap+0x610/0x610 [ib_ipoib]
        ? __lock_is_held+0xc2/0x170
        ? process_one_work+0x880/0x1960
        ? process_one_work+0x912/0x1960
        process_one_work+0x912/0x1960
        ? wq_pool_ids_show+0x310/0x310
        ? lock_acquire+0x145/0x440
        worker_thread+0x87/0xbb0
        ? process_one_work+0x1960/0x1960
        kthread+0x314/0x3d0
        ? kthread_create_worker_on_cpu+0xc0/0xc0
        ret_from_fork+0x3a/0x50
      
       Allocated by task 0:
        kasan_kmalloc+0xa0/0xd0
        kmem_cache_alloc_trace+0x168/0x3e0
        path_rec_create+0xa2/0x1f0 [ib_ipoib]
        ipoib_start_xmit+0xa98/0x19e0 [ib_ipoib]
        dev_hard_start_xmit+0x159/0x8d0
        sch_direct_xmit+0x226/0xb40
        __dev_queue_xmit+0x1d63/0x2950
        neigh_update+0x889/0x1770
        arp_process+0xc47/0x21f0
        arp_rcv+0x462/0x760
        __netif_receive_skb_core+0x1546/0x2da0
        netif_receive_skb_internal+0xf2/0x590
        napi_gro_receive+0x28e/0x390
        ipoib_ib_handle_rx_wc_rss+0x873/0x1b60 [ib_ipoib]
        ipoib_rx_poll_rss+0x17d/0x320 [ib_ipoib]
        net_rx_action+0x427/0xe30
        __do_softirq+0x28e/0xc42
      
       Freed by task 26680:
        __kasan_slab_free+0x11d/0x160
        kfree+0xf5/0x360
        ipoib_flush_paths+0x532/0x9d0 [ib_ipoib]
        ipoib_set_mode_rss+0x1ad/0x560 [ib_ipoib]
        set_mode+0xc8/0x150 [ib_ipoib]
        kernfs_fop_write+0x279/0x440
        __vfs_write+0xd8/0x5c0
        vfs_write+0x15e/0x470
        ksys_write+0xb8/0x180
        do_syscall_64+0x9b/0x420
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
       The buggy address belongs to the object at ffff88034c30bcc8
                      which belongs to the cache kmalloc-512 of size 512
       The buggy address is located 8 bytes inside of
                      512-byte region [ffff88034c30bcc8, ffff88034c30bec8)
       The buggy address belongs to the page:
      
      The following race between change mode and xmit flow is the reason for
      this use-after-free:
      
      Change mode     Send packet 1 to GID XX      Send packet 2 to GID XX
           |                    |                             |
         start                  |                             |
           |                    |                             |
           |                    |                             |
           |         Create new path for GID XX               |
           |           and update neigh path                  |
           |                    |                             |
           |                    |                             |
           |                    |                             |
       flush_paths              |                             |
                                |                             |
                     queue_work(cm.start_task)                |
                                |                 Path for GID XX not found
                                |                      create new path
                                |
                                |
                     start_task runs with old
                          released path
      
      There is no locking to protect the lifetime of the path through the
      ipoib_cm_tx struct, so delete it entirely and always use the newly looked
      up path under the priv->lock.
      
      Fixes: 546481c2 ("IB/ipoib: Fix memory corruption in ipoib cm mode connect flow")
      Signed-off-by: Feras Daoud <ferasda@mellanox.com>
      Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • riscv: Adjust mmap base address at a third of task size · dc04a00b
      Alexandre Ghiti authored
      [ Upstream commit ae662eec ]
      
      This ratio is the most used among all other architectures and makes
      the icache_hygiene libhugetlbfs test pass: this test mmaps lots of
      hugepages whose addresses, without this patch, reach the end of
      the process user address space.
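
      As a rough sketch of the arithmetic, assuming the base is obtained
      by page-aligning one third of the task size and that pages are
      4 KiB (both are assumptions, and all names are hypothetical):

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL	/* assumed 4 KiB pages */

/* Round x up to the next page boundary. */
uint64_t sketch_page_align(uint64_t x)
{
	return (x + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/* mmap base placed at one third of the task's address space. */
uint64_t sketch_mmap_base(uint64_t task_size)
{
	return sketch_page_align(task_size / 3);
}
```

      A base at one third leaves two thirds of the address space above it
      for mappings.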
      
      Signed-off-by: Alexandre Ghiti <aghiti@upmem.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • xtensa: SMP: fix ccount_timer_shutdown · f43e42f4
      Max Filippov authored
      [ Upstream commit 4fe8713b ]
      
      ccount_timer_shutdown is called from the atomic context in the
      secondary_start_kernel, resulting in the following BUG:
      
      BUG: sleeping function called from invalid context
      in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/1
      Preemption disabled at:
        secondary_start_kernel+0xa1/0x130
      Call Trace:
        ___might_sleep+0xe7/0xfc
        __might_sleep+0x41/0x44
        synchronize_irq+0x24/0x64
        disable_irq+0x11/0x14
        ccount_timer_shutdown+0x12/0x20
        clockevents_switch_state+0x82/0xb4
        clockevents_exchange_device+0x54/0x60
        tick_check_new_device+0x46/0x70
        clockevents_register_device+0x8c/0xc8
        clockevents_config_and_register+0x1d/0x2c
        local_timer_setup+0x75/0x7c
        secondary_start_kernel+0xb4/0x130
        should_never_return+0x32/0x35
      
      Use disable_irq_nosync instead of disable_irq to avoid it.
      This is safe because the ccount timer IRQ is per-CPU, and once IRQ is
      masked the ISR will not be called.
      
      Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • clk: qcom: gcc: Use active only source for CPUSS clocks · aad4dc74
      Taniya Das authored
      [ Upstream commit 9ff1a3b4 ]
      
      CPUSS clocks such as "gcc_cpuss_ahb_clk_src" are CRITICAL clocks and
      need to vote on the active-only source of XO, so that the vote is kept
      as long as CPUSS is active. Similarly, rbcpr_clk_src has the same
      requirement.
      
      Signed-off-by: Taniya Das <tdas@codeaurora.org>
      Fixes: 06391edd ("clk: qcom: Add Global Clock controller (GCC) driver for SDM845")
      Signed-off-by: Stephen Boyd <sboyd@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • clk: ti: Fix error handling in ti_clk_parse_divider_data() · cf872189
      Dan Carpenter authored
      [ Upstream commit 303aef8b ]
      
      The ti_clk_parse_divider_data() function is only called from
      _get_div_table_from_setup().  That function doesn't look at the return
      value but instead looks at the "*table" pointer.  In this case, if the
      kcalloc() fails then *table is NULL (which means success).  It should
      instead be an error pointer.
      
      The ti_clk_parse_divider_data() function has two callers.  One checks
      for errors and the other doesn't.  I have fixed it so now both handle
      errors.
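
      The NULL-versus-error-pointer distinction can be sketched in user
      space. The helpers below imitate the kernel's ERR_PTR()/IS_ERR()
      convention but are written for illustration only; every sketch_*
      name and the simulate_alloc_failure parameter are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *sketch_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* True if the pointer lies in the top SKETCH_MAX_ERRNO addresses. */
static inline int sketch_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Recover the errno value from an error pointer. */
static inline long sketch_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* Hypothetical parser: NULL means "no table needed" (success),
 * while an error pointer means the allocation failed. */
int *sketch_parse_table(int num_dividers, int simulate_alloc_failure)
{
	int *table;

	assert(num_dividers >= 0);
	if (num_dividers == 0)
		return NULL;

	table = simulate_alloc_failure ?
		NULL : calloc((size_t)num_dividers + 1, sizeof(*table));
	if (!table)
		return sketch_err_ptr(-ENOMEM);

	return table;
}
```

      A caller that only inspects the returned pointer can now tell the
      three outcomes apart: NULL, a valid table, or an encoded error.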
      
      Fixes: 4f6be565 ("clk: ti: divider: add driver internal API for parsing divider data")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Acked-by: Tero Kristo <t-kristo@ti.com>
      Signed-off-by: Stephen Boyd <sboyd@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • iommu/amd: Fix IOMMU page flush when detach device from a domain · a038ed68
      Suravee Suthikulpanit authored
      [ Upstream commit 9825bd94 ]
      
      When a VM is terminated, the VFIO driver detaches all pass-through
      devices from VFIO domain by clearing domain id and page table root
      pointer from each device table entry (DTE), and then invalidates
      the DTE. Then, the VFIO driver unmaps pages and invalidates IOMMU pages.
      
      Currently, the IOMMU driver keeps track of which IOMMU and how many
      devices are attached to the domain. When invalidating IOMMU pages,
      the driver checks if the IOMMU is still attached to the domain before
      issuing the invalidate page command.
      
      However, since VFIO has already detached all devices from the domain,
      the subsequent INVALIDATE_IOMMU_PAGES commands are being skipped as
      there is no IOMMU attached to the domain. This results in data
      corruption and could cause the PCI device to end up in an indeterminate
      state.
      
      Fix this by invalidating IOMMU pages when detaching a device, and
      before decrementing the per-domain device reference counts.
      
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Suggested-by: Joerg Roedel <joro@8bytes.org>
      Co-developed-by: Brijesh Singh <brijesh.singh@amd.com>
      Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
      Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Fixes: 6de8ad9b ('x86/amd-iommu: Make iommu_flush_pages aware of multiple IOMMUs')
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ipvs: Fix signed integer overflow when setsockopt timeout · e0b03a6b
      ZhangXiaoxu authored
      [ Upstream commit 53ab60ba ]
      
      There is a UBSAN bug report as below:
      UBSAN: Undefined behaviour in net/netfilter/ipvs/ip_vs_ctl.c:2227:21
      signed integer overflow:
      -2147483647 * 1000 cannot be represented in type 'int'
      
      Reproduce program:
      	#include <stdio.h>
      	#include <sys/types.h>
      	#include <sys/socket.h>
      
      	#define IPPROTO_IP 0
      	#define IPPROTO_RAW 255
      
      	#define IP_VS_BASE_CTL		(64+1024+64)
      	#define IP_VS_SO_SET_TIMEOUT	(IP_VS_BASE_CTL+10)
      
      	/* The argument to IP_VS_SO_GET_TIMEOUT */
      	struct ipvs_timeout_t {
      		int tcp_timeout;
      		int tcp_fin_timeout;
      		int udp_timeout;
      	};
      
      	int main() {
      		int ret = -1;
      		int sockfd = -1;
      		struct ipvs_timeout_t to;
      
      		sockfd = socket(AF_INET, SOCK_RAW, IPPROTO_RAW);
      		if (sockfd == -1) {
      			printf("socket init error\n");
      			return -1;
      		}
      
      		to.tcp_timeout = -2147483647;
      		to.tcp_fin_timeout = -2147483647;
      		to.udp_timeout = -2147483647;
      
      		ret = setsockopt(sockfd,
      				 IPPROTO_IP,
      				 IP_VS_SO_SET_TIMEOUT,
      				 (char *)(&to),
      				 sizeof(to));
      
      		printf("setsockopt return %d\n", ret);
      		return ret;
      	}
      
      Return -EINVAL if the timeout value is negative or greater than 'INT_MAX / HZ'.
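
      The range check can be sketched as a stand-alone function. This is
      an illustration of the stated rule rather than the kernel code
      itself; sketch_check_timeout() and its hz parameter (standing in
      for HZ) are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

/* Returns 0 if timeout_secs * hz is non-negative and fits in an int,
 * -EINVAL otherwise. Dividing INT_MAX by hz avoids performing the
 * multiplication that could overflow. */
int sketch_check_timeout(int timeout_secs, int hz)
{
	assert(hz > 0);
	if (timeout_secs < 0 || timeout_secs > INT_MAX / hz)
		return -EINVAL;
	return 0;
}
```

      With hz set to 1000, the reproducer's value of -2147483647 is
      rejected before any multiplication takes place.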
      
      Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com>
      Acked-by: Simon Horman <horms@verge.net.au>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • riscv: fixup max_low_pfn with PFN_DOWN. · ffabf74c
      Guo Ren authored
      [ Upstream commit 28198c46 ]
      
      max_low_pfn should be a page frame number, not a byte address.
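
      The conversion that PFN_DOWN performs can be sketched as follows,
      assuming 4 KiB pages (a page shift of 12). The names are
      hypothetical and the address in the note is only an example.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumed 4 KiB pages */

/* Convert a byte address to the number of the page frame containing
 * it, rounding down. */
uint64_t sketch_pfn_down(uint64_t addr)
{
	return addr >> SKETCH_PAGE_SHIFT;
}
```

      For an end-of-memory address of 0x80000000 the page frame number is
      0x80000, which is 4096 times smaller than the byte value.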
      
      Signed-off-by: Guo Ren <ren_guo@c-sky.com>
      Signed-off-by: Mao Han <mao_han@c-sky.com>
      Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • iommu/amd: Unmap all mapped pages in error path of map_sg · 9e1f977d
      Jerry Snitselaar authored
      [ Upstream commit f1724c08 ]
      
      In the error path of map_sg there is an incorrect if condition
      for breaking out of the loop that searches the scatterlist
      for mapped pages to unmap. Instead of breaking out of the
      loop once all the pages that were mapped have been unmapped,
      it will break out of the loop after it has unmapped 1 page.
      Fix the condition, so it breaks out of the loop only after
      all the mapped pages have been unmapped.
      
      Fixes: 80187fd3 ("iommu/amd: Optimize map_sg and unmap_sg")
      Cc: Joerg Roedel <joro@8bytes.org>
      Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • iommu/amd: Call free_iova_fast with pfn in map_sg · 697863bf
      Jerry Snitselaar authored
      [ Upstream commit 51d8838d ]
      
      In the error path of map_sg, free_iova_fast is being called with
      address instead of the pfn. This results in a bad value getting into
      the rcache, and can result in hitting a BUG_ON when
      iova_magazine_free_pfns is called.
      
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
      Fixes: 80187fd3 ("iommu/amd: Optimize map_sg and unmap_sg")
      Signed-off-by: Joerg Roedel <jroedel@suse.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • IB/{hfi1, qib}: Fix WC.byte_len calculation for UD_SEND_WITH_IMM · 43b0c939
      Brian Welty authored
      [ Upstream commit 904bba21 ]
      
      The work completion length for receiving a UD send with immediate is
      short by 4 bytes, causing applications using this opcode to fail.
      
      The UD receive logic incorrectly subtracts 4 bytes for the immediate
      value. These bytes are already included in the header length and are used
      to calculate the header/payload split, so the result is that these 4 bytes
      are subtracted twice: once when the header length is subtracted from the
      overall length and once again in the UD opcode specific path.
      
      Remove the extra subtraction when handling the opcode.
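
      The double subtraction can be shown with plain arithmetic. The
      sketch below is not the driver code; the function names and the
      sample lengths in the note are hypothetical, and hdr_len is assumed
      to already include the 4-byte immediate as described above.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_IMM_LEN 4u	/* size of the immediate value in bytes */

/* Correct: the header length already covers the immediate. */
uint32_t sketch_wc_byte_len(uint32_t total_len, uint32_t hdr_len)
{
	assert(total_len >= hdr_len);
	return total_len - hdr_len;
}

/* Buggy variant: subtracts the immediate a second time. */
uint32_t sketch_wc_byte_len_buggy(uint32_t total_len, uint32_t hdr_len,
				  int has_imm)
{
	uint32_t len = total_len - hdr_len;

	if (has_imm)
		len -= SKETCH_IMM_LEN;
	return len;
}
```

      For a 128-byte packet with a 28-byte header, the correct length is
      100 bytes, while the buggy variant reports 96.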
      
      Fixes: 77241056 ("IB/hfi1: add driver files")
      Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
      Signed-off-by: Brian Welty <brian.welty@intel.com>
      Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • perf script: Fix crash when processing recorded stat data · d5f05016
      Tony Jones authored
      [ Upstream commit 8bf8c6da ]
      
      While updating perf to work with Python3 and Python2 I noticed that the
      stat-cpi script was dumping core.
      
      $ perf  stat -e cycles,instructions record -o /tmp/perf.data /bin/false
      
       Performance counter stats for '/bin/false':
      
                 802,148      cycles
      
                 604,622      instructions                                                       802,148      cycles
                 604,622      instructions
      
             0.001445842 seconds time elapsed
      
      $ perf script -i /tmp/perf.data -s scripts/python/stat-cpi.py
      Segmentation fault (core dumped)
      ...
      ...
          rblist=rblist@entry=0xb2a200 <rt_stat>,
          new_entry=new_entry@entry=0x7ffcb755c310) at util/rblist.c:33
          ctx=<optimized out>, type=<optimized out>, create=<optimized out>,
          cpu=<optimized out>, evsel=<optimized out>) at util/stat-shadow.c:118
          ctx=<optimized out>, type=<optimized out>, st=<optimized out>)
          at util/stat-shadow.c:196
          count=count@entry=727442, cpu=cpu@entry=0, st=0xb2a200 <rt_stat>)
          at util/stat-shadow.c:239
          config=config@entry=0xafeb40 <stat_config>,
          counter=counter@entry=0x133c6e0) at util/stat.c:372
      ...
      ...
      
      The issue is that since 1fcd0394 perf_stat__update_shadow_stats now calls
      update_runtime_stat passing rt_stat rather than calling update_stats but
      perf_stat__init_shadow_stats has never been called to initialize rt_stat in
      the script path processing recorded stat data.
      
      Since I can't see any reason why perf_stat__init_shadow_stats() is presently
      initialized like it is in builtin-script.c::perf_sample__fprint_metric()
      [4bd1bef8], I'm proposing it instead be initialized once in __cmd_script.
      
      Committer testing:
      
      After applying the patch:
      
        # perf script -i /tmp/perf.data -s tools/perf/scripts/python/stat-cpi.py
             0.001970: cpu -1, thread -1 -> cpi 1.709079 (1075684/629394)
        #
      
      No segfault.
      
      Signed-off-by: Tony Jones <tonyj@suse.de>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jin Yao <yao.jin@linux.intel.com>
      Fixes: 1fcd0394 ("perf stat: Update per-thread shadow stats")
      Link: http://lkml.kernel.org/r/20190120191414.12925-1-tonyj@suse.de
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • perf tools: Handle TOPOLOGY headers with no CPU · 1e4b7541
      Stephane Eranian authored
      [ Upstream commit 1497e804 ]
      
      This patch fixes an issue in cpumap.c when used with the TOPOLOGY
      header. In some configurations, some NUMA nodes may have no CPU (empty
      cpulist). Yet a cpumap must be created, otherwise perf aborts with an
      error. This patch handles this case by creating a dummy map.
      
        Before:
      
        $ perf record -o - -e cycles noploop 2 | perf script -i -
        0x6e8 [0x6c]: failed to process type: 80
      
        After:
      
        $ perf record -o - -e cycles noploop 2 | perf script -i -
        noploop for 2 seconds
      
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1547885559-1657-1-git-send-email-eranian@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • perf core: Fix perf_proc_update_handler() bug · 6ec0698f
      Stephane Eranian authored
      [ Upstream commit 1a51c5da ]
      
      perf_proc_update_handler() handles the /proc/sys/kernel/perf_event_max_sample_rate
      sysctl variable.  When the PMU IRQ handler timing monitoring is disabled, i.e.,
      when /proc/sys/kernel/perf_cpu_time_max_percent is equal to 0 or 100,
      then no modification to sysctl_perf_event_sample_rate is allowed, to prevent
      a possible hang from wrong values.
      
      The problem is that the test to prevent modification is made after the
      sysctl variable is modified in perf_proc_update_handler().
      
      You get an error:
      
        $ echo 10001 >/proc/sys/kernel/perf_event_max_sample_rate
        echo: write error: invalid argument
      
      But the value is still modified causing all sorts of inconsistencies:
      
        $ cat /proc/sys/kernel/perf_event_max_sample_rate
        10001
      
      This patch fixes the problem by moving the parsing of the value after
      the test.
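
      The ordering issue can be sketched in user space as "test first,
      then parse and store". This is not the kernel handler; the struct,
      its fields and sketch_update_rate() are hypothetical stand-ins for
      the sysctl state described above.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the two sysctl values involved. */
struct sketch_state {
	int sample_rate;
	int cpu_time_max_percent;
};

/* Reject the write before touching the stored value. */
int sketch_update_rate(struct sketch_state *s, const char *input)
{
	assert(s != NULL && input != NULL);

	/* Test first: refuse while timing monitoring is disabled. */
	if (s->cpu_time_max_percent == 0 || s->cpu_time_max_percent == 100)
		return -EINVAL;

	/* Only now parse the input and store the new value. */
	s->sample_rate = atoi(input);
	return 0;
}
```

      A rejected write then leaves sample_rate untouched, so the error
      returned to the caller and the stored value stay consistent.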
      
      Committer testing:
      
        # echo 100 > /proc/sys/kernel/perf_cpu_time_max_percent
        # echo 10001 > /proc/sys/kernel/perf_event_max_sample_rate
        -bash: echo: write error: Invalid argument
        # cat /proc/sys/kernel/perf_event_max_sample_rate
        10001
        #
      
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1547169436-6266-1-git-send-email-eranian@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • perf script: Fix crash with printing mixed trace point and other events · 5d1dc10b
      Andi Kleen authored
      [ Upstream commit 96167167 ]
      
      'perf script' crashes currently when printing mixed trace points and
      other events because the trace format does not handle events without
      trace meta data. Add a simple check to avoid that.
      
        % cat > test.c
        main()
        {
            printf("Hello world\n");
        }
        ^D
        % gcc -g -o test test.c
        % sudo perf probe -x test 'test.c:3'
        % perf record -e '{cpu/cpu-cycles,period=10000/,probe_test:main}:S' ./test
        % perf script
        <segfault>
      
      Committer testing:
      
      Before:
      
        # perf probe -x /lib64/libc-2.28.so malloc
        Added new event:
          probe_libc:malloc    (on malloc in /usr/lib64/libc-2.28.so)
      
        You can now use it in all perf tools, such as:
      
      	perf record -e probe_libc:malloc -aR sleep 1
      
        # perf probe -l
        probe_libc:malloc    (on __libc_malloc@malloc/malloc.c in /usr/lib64/libc-2.28.so)
        # perf record -e '{cpu/cpu-cycles,period=10000/,probe_libc:*}:S' sleep 1
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.023 MB perf.data (40 samples) ]
        # perf script
        Segmentation fault (core dumped)
        ^C
        #
      
      After:
      
        # perf script | head -6
           sleep 2888 94796.944981: 16198 cpu/cpu-cycles,period=10000/: ffffffff925dc04f get_random_u32+0x1f (/lib/modules/5.0.0-rc2+/build/vmlinux)
           sleep 2888 [-01] 94796.944981: probe_libc:malloc:
           sleep 2888 94796.944983:  4713 cpu/cpu-cycles,period=10000/: ffffffff922763af change_protection+0xcf (/lib/modules/5.0.0-rc2+/build/vmlinux)
           sleep 2888 [-01] 94796.944983: probe_libc:malloc:
           sleep 2888 94796.944986:  9934 cpu/cpu-cycles,period=10000/: ffffffff922777e0 move_page_tables+0x0 (/lib/modules/5.0.0-rc2+/build/vmlinux)
           sleep 2888 [-01] 94796.944986: probe_libc:malloc:
        #
      
      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20190117194834.21940-1-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel · 8ce41db0
      Su Yanjun authored
      [ Upstream commit dd9ee344 ]
      
      Recently we ran a network test over an ipcomp virtual tunnel. We found that
      if an ipv4 packet needs fragmentation, then the peer can't receive
      it.
      
      We dug into the code and found that when a packet needs fragmentation, the
      smaller fragment is encapsulated by ipip, not ipcomp. So when the ipip packet
      goes into xfrm, its skb->dev is not properly set. The ipv4 reassembly code
      always sets skb->dev to the last fragment's dev. After ipv4 defrag processing,
      when the kernel rp_filter parameter is set, the skb is dropped with an -EXDEV
      error.
      
      This patch adds compatible support for the ipip process in the ipcomp virtual tunnel.
      
      Signed-off-by: Su Yanjun <suyj.fnst@cn.fujitsu.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • media: uvcvideo: Fix 'type' check leading to overflow · ac8befb6
      Alistair Strachan authored
      commit 47bb1179 upstream.
      
      When initially testing the Camera Terminal Descriptor wTerminalType
      field (buffer[4]), no mask is used. Later in the function, the MSB is
      overloaded to store the descriptor subtype, and so a mask of 0x7fff
      is used to check the type.
      
      If a descriptor is specially crafted to set this overloaded bit in the
      original wTerminalType field, the initial type check will fail (falling
      through, without adjusting the buffer size), but the later type checks
      will pass, assuming the buffer has been made suitably large, causing an
      overflow.
      
      Avoid this problem by checking for the MSB in the wTerminalType field.
      If the bit is set, assume the descriptor is bad, and abort parsing it.
      
      Originally reported here:
      https://groups.google.com/forum/#!topic/syzkaller/Ot1fOE6v1d8
      A similar (non-compiling) patch was provided at that time.
      
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Alistair Strachan <astrachan@google.com>
      Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  2. 10 Mar, 2019 22 commits
    • Linux 4.19.28 · 6a31767f
      Greg Kroah-Hartman authored
    • bpf: fix sanitation rewrite in case of non-pointers · ca490a98
      Daniel Borkmann authored
      commit 3612af78 upstream.
      
      Marek reported that he saw an issue with the below snippet in that
      timing measurements were off when loaded as unpriv while results
      were reasonable when loaded as privileged:
      
          [...]
          uint64_t a = bpf_ktime_get_ns();
          uint64_t b = bpf_ktime_get_ns();
          uint64_t delta = b - a;
          if ((int64_t)delta > 0) {
          [...]
      
      Turns out there is a bug where a corner case is missing in the fix
      d3bd7413 ("bpf: fix sanitation of alu op with pointer / scalar
      type from different paths"), namely fixup_bpf_calls() only checks
      whether aux has a non-zero alu_state, but it also needs to test for
      the case of BPF_ALU_NON_POINTER since in both occasions we need to
      skip the masking rewrite (as there is nothing to mask).
      
      Fixes: d3bd7413 ("bpf: fix sanitation of alu op with pointer / scalar type from different paths")
      Reported-by: Marek Majkowski <marek@cloudflare.com>
      Reported-by: Arthur Fabre <afabre@cloudflare.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/netdev/CAJPywTJqP34cK20iLM5YmUMz9KXQOdu1-+BZrGMAGgLuBWz7fg@mail.gmail.com/T/
      Acked-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • scsi: core: reset host byte in DID_NEXUS_FAILURE case · ebfb07e8
      Martin Wilck authored
      commit 4a067cf8 upstream.
      
      Up to 4.12, __scsi_error_from_host_byte() would reset the host byte to
      DID_OK for various cases including DID_NEXUS_FAILURE.  Commit
      2a842aca ("block: introduce new block status code type") replaced this
      function with scsi_result_to_blk_status() and removed the host-byte
      resetting code for the DID_NEXUS_FAILURE case.  As the line
      set_host_byte(cmd, DID_OK) was preserved for the other cases, I suppose
      this was an editing mistake.
      
      The fact that the host byte remains set after 4.13 is causing problems with
      the sg_persist tool, which now returns success rather than exit status 24
      when a RESERVATION CONFLICT error is encountered.
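
      What resetting the host byte means can be sketched with a packed
      32-bit result word. The sketch assumes the host byte occupies bits
      16 to 23 of the result, that DID_NEXUS_FAILURE is 0x11 and that the
      RESERVATION CONFLICT status byte is 0x18; the sketch_* names are
      hypothetical.

```c
#include <stdint.h>

#define SKETCH_DID_OK			0x00
#define SKETCH_DID_NEXUS_FAILURE	0x11	/* assumed value */

/* Replace bits 16..23 (the host byte) of a result word. */
uint32_t sketch_set_host_byte(uint32_t result, uint8_t host)
{
	return (result & 0xff00ffffu) | ((uint32_t)host << 16);
}

/* Extract the host byte from a result word. */
uint8_t sketch_host_byte(uint32_t result)
{
	return (uint8_t)((result >> 16) & 0xff);
}
```

      Resetting the host byte to SKETCH_DID_OK leaves only the status
      byte in the low bits, which is what a tool inspecting the result
      would then see.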
      
      Fixes: 2a842aca "block: introduce new block status code type"
      Signed-off-by: Martin Wilck <mwilck@suse.com>
      Reviewed-by: Hannes Reinecke <hare@suse.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • exec: Fix mem leak in kernel_read_file · b60d90b2
      YueHaibing authored
      commit f612acfa upstream.
      
      syzkaller reports this:
      BUG: memory leak
      unreferenced object 0xffffc9000488d000 (size 9195520):
        comm "syz-executor.0", pid 2752, jiffies 4294787496 (age 18.757s)
        hex dump (first 32 bytes):
          ff ff ff ff ff ff ff ff a8 00 00 00 01 00 00 00  ................
          02 00 00 00 00 00 00 00 80 a1 7a c1 ff ff ff ff  ..........z.....
        backtrace:
          [<000000000863775c>] __vmalloc_node mm/vmalloc.c:1795 [inline]
          [<000000000863775c>] __vmalloc_node_flags mm/vmalloc.c:1809 [inline]
          [<000000000863775c>] vmalloc+0x8c/0xb0 mm/vmalloc.c:1831
          [<000000003f668111>] kernel_read_file+0x58f/0x7d0 fs/exec.c:924
          [<000000002385813f>] kernel_read_file_from_fd+0x49/0x80 fs/exec.c:993
          [<0000000011953ff1>] __do_sys_finit_module+0x13b/0x2a0 kernel/module.c:3895
          [<000000006f58491f>] do_syscall_64+0x147/0x600 arch/x86/entry/common.c:290
          [<00000000ee78baf4>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
          [<00000000241f889b>] 0xffffffffffffffff
      
      It should goto the 'out_free' label to free the allocated buf when
      kernel_read fails.
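
The cleanup pattern described above can be sketched in standalone C. This is an illustration under stated assumptions, not the code in fs/exec.c: `read_file_sketch()`, the reader callback type, and the two helper readers are hypothetical names, and `malloc()`/`free()` stand in for the kernel allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reader: returns bytes read, 0 at EOF, negative on error. */
typedef long (*read_fn_sketch)(void *ctx, char *dst, size_t len);

/* Helper reader that always fails, to exercise the error path. */
static long failing_read_sketch(void *ctx, char *dst, size_t len)
{
    (void)ctx; (void)dst; (void)len;
    return -1;
}

/* Helper reader that fills the whole request with 'x'. */
static long fill_read_sketch(void *ctx, char *dst, size_t len)
{
    (void)ctx;
    memset(dst, 'x', len);
    return (long)len;
}

/*
 * Read exactly size bytes into a freshly allocated buffer.
 * Success: returns 0 and stores the buffer in *out (caller frees it).
 * Failure: returns a negative value, frees the buffer, sets *out to NULL.
 */
static int read_file_sketch(read_fn_sketch rd, void *ctx, size_t size, char **out)
{
    int ret = 0;
    size_t pos = 0;

    assert(rd != NULL && out != NULL);
    *out = malloc(size ? size : 1);
    if (!*out)
        return -1;

    while (pos < size) {
        long n = rd(ctx, *out + pos, size - pos);

        if (n < 0) {
            ret = -1;
            goto out_free;  /* jumping past the free here would leak *out */
        }
        if (n == 0)
            break;
        pos += (size_t)n;
    }
    if (pos != size) {
        ret = -1;
        goto out_free;
    }
    return 0;

out_free:
    free(*out);
    *out = NULL;
    return ret;
}
```

Every failure after the allocation funnels through `out_free`, so the buffer has a single release point.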
      
      Fixes: 39d637af ("vfs: forbid write access when reading a file into memory")
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Thibaut Sautereau <thibaut@sautereau.fr>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b60d90b2
    • Matthias Kaehlcke's avatar
      Bluetooth: Fix locking in bt_accept_enqueue() for BH context · 8d368fc5
      Matthias Kaehlcke authored
      commit c4f5627f upstream.
      
      With commit e1633762 ("Bluetooth: Handle bt_accept_enqueue() socket
      atomically") lock_sock[_nested]() is used to acquire the socket lock
      before manipulating the socket. lock_sock[_nested]() may block, which
      is problematic since bt_accept_enqueue() can be called in bottom half
      context (e.g. from rfcomm_connect_ind()):
      
      [<ffffff80080d81ec>] __might_sleep+0x4c/0x80
      [<ffffff800876c7b0>] lock_sock_nested+0x24/0x58
      [<ffffff8000d7c27c>] bt_accept_enqueue+0x48/0xd4 [bluetooth]
      [<ffffff8000e67d8c>] rfcomm_connect_ind+0x190/0x218 [rfcomm]
      
      Add a parameter to bt_accept_enqueue() to indicate whether the
      function is called from BH context, and acquire the socket lock
      with bh_lock_sock_nested() if that's the case.
      
      Also adapt all callers of bt_accept_enqueue() to pass the new
      parameter:
      
      - l2cap_sock_new_connection_cb()
        - uses lock_sock() to lock the parent socket => process context
      
      - rfcomm_connect_ind()
        - acquires the parent socket lock with bh_lock_sock() => BH
          context
      
      - __sco_chan_add()
        - called from sco_chan_add(), which is called from sco_connect().
          parent is NULL, hence bt_accept_enqueue() isn't called in this
          code path and we can ignore it
        - also called from sco_conn_ready(). uses bh_lock_sock() to acquire
          the parent lock => BH context
      
      Fixes: e1633762 ("Bluetooth: Handle bt_accept_enqueue() socket atomically")
      Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
      Reviewed-by: Douglas Anderson <dianders@chromium.org>
      Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      8d368fc5
    • Kai-Heng Feng's avatar
      Bluetooth: btrtl: Restore old logic to assume firmware is already loaded · 43593a30
      Kai-Heng Feng authored
      commit 00df214b upstream.
      
      Realtek bluetooth may not work after reboot:
      [   12.446130] Bluetooth: hci0: RTL: rtl: unknown IC info, lmp subver a99e, hci rev 826c, hci ver 0008
      
      This is a regression introduced by commit 26503ad2 ("Bluetooth:
      btrtl: split the device initialization into smaller parts"). The new
      logic errors out early when no matching IC info can be found, although in
      this case the missing match means the firmware is already loaded.
      
      So let's assume the firmware is already loaded when we can't find
      matching IC info, like the old logic did.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201921
      Fixes: 26503ad2 ("Bluetooth: btrtl: split the device initialization into smaller parts")
      Cc: stable@vger.kernel.org # 4.19+
      Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
      Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      43593a30
    • Luis Chamberlain's avatar
      selftests: firmware: fix verify_reqs() return value · cd61d473
      Luis Chamberlain authored
      commit 344c0152 upstream.
      
      commit a6a9be92 ("selftests: firmware: return Kselftest Skip code
      for skipped tests") by Shuah modified failures to return the special
      error code of $ksft_skip (4). We have a corner case issue where we
      *do* want to verify_reqs().
      
      Cc: <stable@vger.kernel.org> # >= 4.18
      Fixes: a6a9be92 ("selftests: firmware: return Kselftest Skip code for skipped tests")
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      cd61d473
    • Karoly Pados's avatar
      USB: serial: cp210x: fix GPIO in autosuspend · 9765ec7f
      Karoly Pados authored
      commit 7b0b644b upstream.
      
      Current GPIO code in cp210x fails to take USB autosuspend into account,
      making it practically impossible to use GPIOs with autosuspend enabled
      without user configuration. Fix this like for ftdi_sio in a previous patch.
      Tested on a CP2102N.
      Signed-off-by: Karoly Pados <pados@pados.hu>
      Fixes: cf5276ce ("USB: serial: cp210x: Adding GPIO support for CP2105")
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: Johan Hovold <johan@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      9765ec7f
    • Johan Hovold's avatar
      gnss: sirf: fix premature wakeup interrupt enable · 09675c2f
      Johan Hovold authored
      commit 82f844c2 upstream.
      
      Make sure the receiver is powered (and booted) before enabling the
      wakeup interrupt to avoid spurious interrupts due to a floating input.
      
      Similarly, disable the interrupt before powering off on probe errors and
      on unbind.
      
      Fixes: d2efbbd1 ("gnss: add driver for sirfstar-based receivers")
      Cc: stable <stable@vger.kernel.org>	# 4.19
      Signed-off-by: Johan Hovold <johan@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      09675c2f
    • Max Filippov's avatar
      xtensa: fix get_wchan · c426de69
      Max Filippov authored
      commit d90b88fd upstream.
      
      Stack unwinding is implemented incorrectly in xtensa get_wchan: instead
      of extracting a0 and a1 registers from the spill location under the
      stack pointer it extracts a word pointed to by the stack pointer and
      subtracts 4 or 3 from it.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      c426de69
    • Bart Van Assche's avatar
      aio: Fix locking in aio_poll() · f5e66cdb
      Bart Van Assche authored
      commit d3d6a18d upstream.
      
      wake_up_locked() may but does not have to be called with interrupts
      disabled. Since the fuse filesystem calls wake_up_locked() without
      disabling interrupts aio_poll_wake() may be called with interrupts
      enabled. Since the kioctx.ctx_lock may be acquired from IRQ context,
      all code that acquires that lock from thread context must disable
      interrupts. Hence change the spin_trylock() call in aio_poll_wake()
      into a spin_trylock_irqsave() call. This patch fixes the following
      lockdep complaint:
      
      =====================================================
      WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
      5.0.0-rc4-next-20190131 #23 Not tainted
      -----------------------------------------------------
      syz-executor2/13779 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
      0000000098ac1230 (&fiq->waitq){+.+.}, at: spin_lock include/linux/spinlock.h:329 [inline]
      0000000098ac1230 (&fiq->waitq){+.+.}, at: aio_poll fs/aio.c:1772 [inline]
      0000000098ac1230 (&fiq->waitq){+.+.}, at: __io_submit_one fs/aio.c:1875 [inline]
      0000000098ac1230 (&fiq->waitq){+.+.}, at: io_submit_one+0xedf/0x1cf0 fs/aio.c:1908
      
      and this task is already holding:
      000000003c46111c (&(&ctx->ctx_lock)->rlock){..-.}, at: spin_lock_irq include/linux/spinlock.h:354 [inline]
      000000003c46111c (&(&ctx->ctx_lock)->rlock){..-.}, at: aio_poll fs/aio.c:1771 [inline]
      000000003c46111c (&(&ctx->ctx_lock)->rlock){..-.}, at: __io_submit_one fs/aio.c:1875 [inline]
      000000003c46111c (&(&ctx->ctx_lock)->rlock){..-.}, at: io_submit_one+0xeb6/0x1cf0 fs/aio.c:1908
      which would create a new lock dependency:
       (&(&ctx->ctx_lock)->rlock){..-.} -> (&fiq->waitq){+.+.}
      
      but this new dependency connects a SOFTIRQ-irq-safe lock:
       (&(&ctx->ctx_lock)->rlock){..-.}
      
      ... which became SOFTIRQ-irq-safe at:
        lock_acquire+0x16f/0x3f0 kernel/locking/lockdep.c:3826
        __raw_spin_lock_irq include/linux/spinlock_api_smp.h:128 [inline]
        _raw_spin_lock_irq+0x60/0x80 kernel/locking/spinlock.c:160
        spin_lock_irq include/linux/spinlock.h:354 [inline]
        free_ioctx_users+0x2d/0x4a0 fs/aio.c:610
        percpu_ref_put_many include/linux/percpu-refcount.h:285 [inline]
        percpu_ref_put include/linux/percpu-refcount.h:301 [inline]
        percpu_ref_call_confirm_rcu lib/percpu-refcount.c:123 [inline]
        percpu_ref_switch_to_atomic_rcu+0x3e7/0x520 lib/percpu-refcount.c:158
        __rcu_reclaim kernel/rcu/rcu.h:240 [inline]
        rcu_do_batch kernel/rcu/tree.c:2486 [inline]
        invoke_rcu_callbacks kernel/rcu/tree.c:2799 [inline]
        rcu_core+0x928/0x1390 kernel/rcu/tree.c:2780
        __do_softirq+0x266/0x95a kernel/softirq.c:292
        run_ksoftirqd kernel/softirq.c:654 [inline]
        run_ksoftirqd+0x8e/0x110 kernel/softirq.c:646
        smpboot_thread_fn+0x6ab/0xa10 kernel/smpboot.c:164
        kthread+0x357/0x430 kernel/kthread.c:247
        ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352
      
      to a SOFTIRQ-irq-unsafe lock:
       (&fiq->waitq){+.+.}
      
      ... which became SOFTIRQ-irq-unsafe at:
      ...
        lock_acquire+0x16f/0x3f0 kernel/locking/lockdep.c:3826
        __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
        _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:144
        spin_lock include/linux/spinlock.h:329 [inline]
        flush_bg_queue+0x1f3/0x3c0 fs/fuse/dev.c:415
        fuse_request_queue_background+0x2d1/0x580 fs/fuse/dev.c:676
        fuse_request_send_background+0x58/0x120 fs/fuse/dev.c:687
        fuse_send_init fs/fuse/inode.c:989 [inline]
        fuse_fill_super+0x13bb/0x1730 fs/fuse/inode.c:1214
        mount_nodev+0x68/0x110 fs/super.c:1392
        fuse_mount+0x2d/0x40 fs/fuse/inode.c:1239
        legacy_get_tree+0xf2/0x200 fs/fs_context.c:590
        vfs_get_tree+0x123/0x450 fs/super.c:1481
        do_new_mount fs/namespace.c:2610 [inline]
        do_mount+0x1436/0x2c40 fs/namespace.c:2932
        ksys_mount+0xdb/0x150 fs/namespace.c:3148
        __do_sys_mount fs/namespace.c:3162 [inline]
        __se_sys_mount fs/namespace.c:3159 [inline]
        __x64_sys_mount+0xbe/0x150 fs/namespace.c:3159
        do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      other info that might help us debug this:
      
       Possible interrupt unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(&fiq->waitq);
                                     local_irq_disable();
                                     lock(&(&ctx->ctx_lock)->rlock);
                                     lock(&fiq->waitq);
        <Interrupt>
          lock(&(&ctx->ctx_lock)->rlock);
      
       *** DEADLOCK ***
      
      1 lock held by syz-executor2/13779:
       #0: 000000003c46111c (&(&ctx->ctx_lock)->rlock){..-.}, at: spin_lock_irq include/linux/spinlock.h:354 [inline]
       #0: 000000003c46111c (&(&ctx->ctx_lock)->rlock){..-.}, at: aio_poll fs/aio.c:1771 [inline]
       #0: 000000003c46111c (&(&ctx->ctx_lock)->rlock){..-.}, at: __io_submit_one fs/aio.c:1875 [inline]
       #0: 000000003c46111c (&(&ctx->ctx_lock)->rlock){..-.}, at: io_submit_one+0xeb6/0x1cf0 fs/aio.c:1908
      
      the dependencies between SOFTIRQ-irq-safe lock and the holding lock:
      -> (&(&ctx->ctx_lock)->rlock){..-.} {
         IN-SOFTIRQ-W at:
                          lock_acquire+0x16f/0x3f0 kernel/locking/lockdep.c:3826
                          __raw_spin_lock_irq include/linux/spinlock_api_smp.h:128 [inline]
                          _raw_spin_lock_irq+0x60/0x80 kernel/locking/spinlock.c:160
                          spin_lock_irq include/linux/spinlock.h:354 [inline]
                          free_ioctx_users+0x2d/0x4a0 fs/aio.c:610
                          percpu_ref_put_many include/linux/percpu-refcount.h:285 [inline]
                          percpu_ref_put include/linux/percpu-refcount.h:301 [inline]
                          percpu_ref_call_confirm_rcu lib/percpu-refcount.c:123 [inline]
                          percpu_ref_switch_to_atomic_rcu+0x3e7/0x520 lib/percpu-refcount.c:158
                          __rcu_reclaim kernel/rcu/rcu.h:240 [inline]
                          rcu_do_batch kernel/rcu/tree.c:2486 [inline]
                          invoke_rcu_callbacks kernel/rcu/tree.c:2799 [inline]
                          rcu_core+0x928/0x1390 kernel/rcu/tree.c:2780
                          __do_softirq+0x266/0x95a kernel/softirq.c:292
                          run_ksoftirqd kernel/softirq.c:654 [inline]
                          run_ksoftirqd+0x8e/0x110 kernel/softirq.c:646
                          smpboot_thread_fn+0x6ab/0xa10 kernel/smpboot.c:164
                          kthread+0x357/0x430 kernel/kthread.c:247
                          ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352
         INITIAL USE at:
                         lock_acquire+0x16f/0x3f0 kernel/locking/lockdep.c:3826
                         __raw_spin_lock_irq include/linux/spinlock_api_smp.h:128 [inline]
                         _raw_spin_lock_irq+0x60/0x80 kernel/locking/spinlock.c:160
                         spin_lock_irq include/linux/spinlock.h:354 [inline]
                         __do_sys_io_cancel fs/aio.c:2052 [inline]
                         __se_sys_io_cancel fs/aio.c:2035 [inline]
                         __x64_sys_io_cancel+0xd5/0x5a0 fs/aio.c:2035
                         do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
                         entry_SYSCALL_64_after_hwframe+0x49/0xbe
       }
       ... key      at: [<ffffffff8a574140>] __key.52370+0x0/0x40
       ... acquired at:
         lock_acquire+0x16f/0x3f0 kernel/locking/lockdep.c:3826
         __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
         _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:144
         spin_lock include/linux/spinlock.h:329 [inline]
         aio_poll fs/aio.c:1772 [inline]
         __io_submit_one fs/aio.c:1875 [inline]
         io_submit_one+0xedf/0x1cf0 fs/aio.c:1908
         __do_sys_io_submit fs/aio.c:1953 [inline]
         __se_sys_io_submit fs/aio.c:1923 [inline]
         __x64_sys_io_submit+0x1bd/0x580 fs/aio.c:1923
         do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
         entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      the dependencies between the lock to be acquired
       and SOFTIRQ-irq-unsafe lock:
      -> (&fiq->waitq){+.+.} {
         HARDIRQ-ON-W at:
                          lock_acquire+0x16f/0x3f0 kernel/locking/lockdep.c:3826
                          __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
                          _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:144
                          spin_lock include/linux/spinlock.h:329 [inline]
                          flush_bg_queue+0x1f3/0x3c0 fs/fuse/dev.c:415
                          fuse_request_queue_background+0x2d1/0x580 fs/fuse/dev.c:676
                          fuse_request_send_background+0x58/0x120 fs/fuse/dev.c:687
                          fuse_send_init fs/fuse/inode.c:989 [inline]
                          fuse_fill_super+0x13bb/0x1730 fs/fuse/inode.c:1214
                          mount_nodev+0x68/0x110 fs/super.c:1392
                          fuse_mount+0x2d/0x40 fs/fuse/inode.c:1239
                          legacy_get_tree+0xf2/0x200 fs/fs_context.c:590
                          vfs_get_tree+0x123/0x450 fs/super.c:1481
                          do_new_mount fs/namespace.c:2610 [inline]
                          do_mount+0x1436/0x2c40 fs/namespace.c:2932
                          ksys_mount+0xdb/0x150 fs/namespace.c:3148
                          __do_sys_mount fs/namespace.c:3162 [inline]
                          __se_sys_mount fs/namespace.c:3159 [inline]
                          __x64_sys_mount+0xbe/0x150 fs/namespace.c:3159
                          do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
                          entry_SYSCALL_64_after_hwframe+0x49/0xbe
         SOFTIRQ-ON-W at:
                          lock_acquire+0x16f/0x3f0 kernel/locking/lockdep.c:3826
                          __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
                          _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:144
                          spin_lock include/linux/spinlock.h:329 [inline]
                          flush_bg_queue+0x1f3/0x3c0 fs/fuse/dev.c:415
                          fuse_request_queue_background+0x2d1/0x580 fs/fuse/dev.c:676
                          fuse_request_send_background+0x58/0x120 fs/fuse/dev.c:687
                          fuse_send_init fs/fuse/inode.c:989 [inline]
                          fuse_fill_super+0x13bb/0x1730 fs/fuse/inode.c:1214
                          mount_nodev+0x68/0x110 fs/super.c:1392
                          fuse_mount+0x2d/0x40 fs/fuse/inode.c:1239
                          legacy_get_tree+0xf2/0x200 fs/fs_context.c:590
                          vfs_get_tree+0x123/0x450 fs/super.c:1481
                          do_new_mount fs/namespace.c:2610 [inline]
                          do_mount+0x1436/0x2c40 fs/namespace.c:2932
                          ksys_mount+0xdb/0x150 fs/namespace.c:3148
                          __do_sys_mount fs/namespace.c:3162 [inline]
                          __se_sys_mount fs/namespace.c:3159 [inline]
                          __x64_sys_mount+0xbe/0x150 fs/namespace.c:3159
                          do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
                          entry_SYSCALL_64_after_hwframe+0x49/0xbe
         INITIAL USE at:
                         lock_acquire+0x16f/0x3f0 kernel/locking/lockdep.c:3826
                         __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
                         _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:144
                         spin_lock include/linux/spinlock.h:329 [inline]
                         flush_bg_queue+0x1f3/0x3c0 fs/fuse/dev.c:415
                         fuse_request_queue_background+0x2d1/0x580 fs/fuse/dev.c:676
                         fuse_request_send_background+0x58/0x120 fs/fuse/dev.c:687
                         fuse_send_init fs/fuse/inode.c:989 [inline]
                         fuse_fill_super+0x13bb/0x1730 fs/fuse/inode.c:1214
                         mount_nodev+0x68/0x110 fs/super.c:1392
                         fuse_mount+0x2d/0x40 fs/fuse/inode.c:1239
                         legacy_get_tree+0xf2/0x200 fs/fs_context.c:590
                         vfs_get_tree+0x123/0x450 fs/super.c:1481
                         do_new_mount fs/namespace.c:2610 [inline]
                         do_mount+0x1436/0x2c40 fs/namespace.c:2932
                         ksys_mount+0xdb/0x150 fs/namespace.c:3148
                         __do_sys_mount fs/namespace.c:3162 [inline]
                         __se_sys_mount fs/namespace.c:3159 [inline]
                         __x64_sys_mount+0xbe/0x150 fs/namespace.c:3159
                         do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
                         entry_SYSCALL_64_after_hwframe+0x49/0xbe
       }
       ... key      at: [<ffffffff8a60dec0>] __key.43450+0x0/0x40
       ... acquired at:
         lock_acquire+0x16f/0x3f0 kernel/locking/lockdep.c:3826
         __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
         _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:144
         spin_lock include/linux/spinlock.h:329 [inline]
         aio_poll fs/aio.c:1772 [inline]
         __io_submit_one fs/aio.c:1875 [inline]
         io_submit_one+0xedf/0x1cf0 fs/aio.c:1908
         __do_sys_io_submit fs/aio.c:1953 [inline]
         __se_sys_io_submit fs/aio.c:1923 [inline]
         __x64_sys_io_submit+0x1bd/0x580 fs/aio.c:1923
         do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
         entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      stack backtrace:
      CPU: 0 PID: 13779 Comm: syz-executor2 Not tainted 5.0.0-rc4-next-20190131 #23
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x172/0x1f0 lib/dump_stack.c:113
       print_bad_irq_dependency kernel/locking/lockdep.c:1573 [inline]
       check_usage.cold+0x60f/0x940 kernel/locking/lockdep.c:1605
       check_irq_usage kernel/locking/lockdep.c:1650 [inline]
       check_prev_add_irq kernel/locking/lockdep_states.h:8 [inline]
       check_prev_add kernel/locking/lockdep.c:1860 [inline]
       check_prevs_add kernel/locking/lockdep.c:1968 [inline]
       validate_chain kernel/locking/lockdep.c:2339 [inline]
       __lock_acquire+0x1f12/0x4790 kernel/locking/lockdep.c:3320
       lock_acquire+0x16f/0x3f0 kernel/locking/lockdep.c:3826
       __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
       _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:144
       spin_lock include/linux/spinlock.h:329 [inline]
       aio_poll fs/aio.c:1772 [inline]
       __io_submit_one fs/aio.c:1875 [inline]
       io_submit_one+0xedf/0x1cf0 fs/aio.c:1908
       __do_sys_io_submit fs/aio.c:1953 [inline]
       __se_sys_io_submit fs/aio.c:1923 [inline]
       __x64_sys_io_submit+0x1bd/0x580 fs/aio.c:1923
       do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Avi Kivity <avi@scylladb.com>
      Cc: Miklos Szeredi <miklos@szeredi.hu>
      Cc: <stable@vger.kernel.org>
      Fixes: e8693bcf ("aio: allow direct aio poll comletions for keyed wakeups") # v4.19
      Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
      [ bvanassche: added a comment ]
      Reluctantly-Acked-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Bart Van Assche <bvanassche@acm.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      f5e66cdb
    • Liu Xiang's avatar
      MIPS: irq: Allocate accurate order pages for irq stack · 88793c03
      Liu Xiang authored
      commit 72faa7a7 upstream.
      
      irq_pages is the number of pages for the irq stack, not the
      order that __get_free_pages() expects.
      Use get_order() to calculate the correct order.
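
To make the difference between a page count and an allocation order concrete, here is a small userspace sketch. It is not the kernel's get_order(); `get_order_sketch()` and the 4096-byte `PAGE_SIZE_SKETCH` are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE_SKETCH 4096u  /* assumed page size for this illustration */

/* Smallest order such that (PAGE_SIZE_SKETCH << order) >= size. */
static unsigned int get_order_sketch(size_t size)
{
    unsigned int order = 0;
    size_t span = PAGE_SIZE_SKETCH;

    assert(size > 0);
    while (span < size) {
        span <<= 1;   /* each order doubles the number of pages */
        order++;
    }
    return order;
}
```

Under these assumptions a four-page stack needs order 2, since 2^2 = 4 pages. Passing the page count 4 as the order would instead request 2^4 = 16 pages.
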
      Signed-off-by: Liu Xiang <liu.xiang6@zte.com.cn>
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Fixes: fe8bd18f ("MIPS: Introduce irq_stack")
      Cc: linux-mips@vger.kernel.org
      Cc: stable@vger.kernel.org # v4.11+
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      88793c03
    • Gustavo A. R. Silva's avatar
      applicom: Fix potential Spectre v1 vulnerabilities · 5691b93f
      Gustavo A. R. Silva authored
      commit d7ac3c6e upstream.
      
      IndexCard is indirectly controlled by user-space, hence leading to
      a potential exploitation of the Spectre variant 1 vulnerability.
      
      This issue was detected with the help of Smatch:
      
      drivers/char/applicom.c:418 ac_write() warn: potential spectre issue 'apbs' [r]
      drivers/char/applicom.c:728 ac_ioctl() warn: potential spectre issue 'apbs' [r] (local cap)
      
      Fix this by sanitizing IndexCard before using it to index apbs.
      
      Notice that given that speculation windows are large, the policy is
      to kill the speculation on the first load and not worry if it can be
      completed with a dependent load/store [1].
      
      [1] https://lore.kernel.org/lkml/20180423164740.GY17484@dhcp22.suse.cz/
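
The idea of sanitizing an index can be sketched as a branchless clamp. The kernel provides array_index_nospec() for this kind of sanitization; the userspace code below only imitates the arithmetic and makes no claim about speculation behavior once compiled outside the kernel. The `*_sketch` names are hypothetical, and the code assumes that right-shifting a negative signed value is an arithmetic shift (true for GCC and Clang, implementation-defined in ISO C).

```c
#include <assert.h>
#include <limits.h>

/*
 * All-ones when index < size, zero otherwise, computed without a branch.
 * Assumes size <= LONG_MAX, two's complement conversion to long, and an
 * arithmetic right shift of negative values.
 */
static unsigned long index_mask_sketch(unsigned long index, unsigned long size)
{
    const unsigned int top = (unsigned int)(sizeof(long) * CHAR_BIT - 1);

    return (unsigned long)(~(long)(index | (size - 1UL - index)) >> top);
}

/* Returns index unchanged when it is in bounds, and 0 when it is not. */
static unsigned long clamp_index_sketch(unsigned long index, unsigned long size)
{
    assert(size > 0 && size <= (unsigned long)LONG_MAX);
    return index & index_mask_sketch(index, size);
}
```

The clamp is applied after the ordinary bounds check, so an out-of-range value can only ever index element 0 even if the check is bypassed speculatively.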
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      5691b93f
    • Balaji Manoharan's avatar
      usb: xhci: Fix for Enabling USB ROLE SWITCH QUIRK on INTEL_SUNRISEPOINT_LP_XHCI · 9d53e36c
      Balaji Manoharan authored
      commit 8fde481e upstream.
      
      This fix enables the USB role switch feature on the Intel commercial NUC
      platform, which is based on the Kaby Lake chipset.
      Signed-off-by: Balaji Manoharan <m.balaji@intel.com>
      Reviewed-by: Hans de Goede <hdegoede@redhat.com>
      Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
      Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      9d53e36c
    • Pavel Tikhomirov's avatar
      tracing: Fix event filters and triggers to handle negative numbers · 690e939d
      Pavel Tikhomirov authored
      commit 6a072128 upstream.
      
      When tracing syscall exit events it is extremely useful to filter on exit
      codes equal to some negative value, to react only to the required errors.
      But negative numbers do not work:
      
      [root@snorch sys_exit_read]# echo "ret == -1" > filter
      bash: echo: write error: Invalid argument
      [root@snorch sys_exit_read]# cat filter
      ret == -1
              ^
      parse_error: Invalid value (did you forget quotes)?
      
      A similar thing happens when setting triggers.
      
      This is a regression in v4.17 introduced by the commit mentioned below;
      testing without this commit shows no problem with negative numbers.
      
      Link: http://lkml.kernel.org/r/20180823102534.7642-1-ptikhomirov@virtuozzo.com
      
      Cc: stable@vger.kernel.org
      Fixes: 80765597 ("tracing: Rewrite filter logic to be simpler and faster")
      Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      690e939d
    • Kirill A. Shutemov's avatar
      x86/boot/compressed/64: Do not read legacy ROM on EFI system · 51c53180
      Kirill A. Shutemov authored
      commit 6f913de3 upstream.
      
      EFI systems do not necessarily provide a legacy ROM. If the ROM is missing
      the memory is not mapped at all.
      
      Trying to dereference values in the legacy ROM area leads to a crash on
      Macbook Pro.
      
      Only look for values in the legacy ROM area on non-EFI systems.
      
      Fixes: 3548e131 ("x86/boot/compressed/64: Find a place for 32-bit trampoline")
      Reported-by: Pitam Mitra <pitamm@gmail.com>
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Tested-by: Bockjoo Kim <bockjoo@phys.ufl.edu>
      Cc: bp@alien8.de
      Cc: hpa@zytor.com
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20190219075224.35058-1-kirill.shutemov@linux.intel.com
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=202351
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      51c53180
    • Jiaxun Yang's avatar
      x86/CPU/AMD: Set the CPB bit unconditionally on F17h · eab5ea25
      Jiaxun Yang authored
      commit 02371991 upstream.
      
      Some F17h models do not have CPB set in CPUID even though the CPU
      supports it. Set the feature bit unconditionally on all F17h.
      
       [ bp: Rewrite commit message and patch. ]
      Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Sherry Hurwitz <sherry.hurwitz@amd.com>
      Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/20181120030018.5185-1-jiaxun.yang@flygoat.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      eab5ea25
    • Vlad Buslov's avatar
      net: sched: act_tunnel_key: fix NULL pointer dereference during init · 38460809
      Vlad Buslov authored
      [ Upstream commit a3df633a ]
      
      Metadata pointer is only initialized for action TCA_TUNNEL_KEY_ACT_SET, but
      it is unconditionally dereferenced in tunnel_key_init() error handler.
      Verify that metadata pointer is not NULL before dereferencing it in
      tunnel_key_init error handling code.
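
A simplified model of the error path described above, written as standalone C. It is not the act_tunnel_key code: the enum, struct, and function names are hypothetical, and `calloc()`/`free()` stand in for the kernel's metadata allocation and release.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum act_sketch { ACT_SET_SKETCH, ACT_RELEASE_SKETCH };

struct metadata_sketch { int refs; };

/*
 * Metadata is allocated only for ACT_SET_SKETCH. fail_late simulates an
 * error that occurs after that step.
 * Success: returns 0, *out holds the metadata (NULL for the release action).
 * Failure: returns -1 and *out is NULL.
 */
static int tunnel_init_sketch(enum act_sketch act, int fail_late,
                              struct metadata_sketch **out)
{
    struct metadata_sketch *md = NULL;

    assert(out != NULL);
    *out = NULL;

    if (act == ACT_SET_SKETCH) {
        md = calloc(1, sizeof(*md));
        if (!md)
            return -1;
        md->refs = 1;
    }

    if (fail_late)
        goto err;

    *out = md;
    return 0;

err:
    /* md is NULL for ACT_RELEASE_SKETCH, so check before touching it. */
    if (md) {
        md->refs--;
        free(md);
    }
    return -1;
}
```

Without the `if (md)` guard, the failure path for the release action would write through a NULL pointer.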
      
      Fixes: ee28bb56 ("net/sched: fix memory leak in act_tunnel_key_init()")
      Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
      Reviewed-by: Davide Caratti <dcaratti@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      38460809
    • Davide Caratti's avatar
      net/sched: act_skbedit: fix refcount leak when replace fails · 69e6fb18
      Davide Caratti authored
      [ Upstream commit 6191da98 ]
      
      When act_skbedit was converted to use RCU in the data plane, we added an
      error path, but we forgot to drop the action refcount in case of failure
      during a 'replace' operation:
      
       # tc actions add action skbedit ptype otherhost pass index 100
       # tc action show action skbedit
       total acts 1
      
               action order 0: skbedit  ptype otherhost pass
                index 100 ref 1 bind 0
       # tc actions replace action skbedit ptype otherhost drop index 100
       RTNETLINK answers: Cannot allocate memory
       We have an error talking to the kernel
       # tc action show action skbedit
       total acts 1
      
               action order 0: skbedit  ptype otherhost pass
                index 100 ref 2 bind 0
      
      Ensure we call tcf_idr_release(), in case 'params_new' allocation failed,
      also when the action is being replaced.
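
The leak shown in the tc output above (ref going from 1 to 2 after a failed replace) can be modeled in a few lines of standalone C. This is a deliberately simplified sketch, not the act_skbedit code: `struct action_sketch`, `replace_sketch()`, and the `alloc_ok` switch that simulates the allocation failure are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct action_sketch {
    int refcnt;   /* references held on the action */
    int *params;  /* current parameters, owned by the action */
};

/*
 * Replace the parameters of an existing action.
 * alloc_ok == 0 simulates failure to allocate the new parameters.
 * Returns 0 on success, -1 on failure; refcnt is unchanged either way.
 */
static int replace_sketch(struct action_sketch *a, int new_value, int alloc_ok)
{
    int *params_new;

    assert(a != NULL);
    a->refcnt++;  /* reference taken when the existing action is looked up */

    params_new = alloc_ok ? malloc(sizeof(*params_new)) : NULL;
    if (!params_new) {
        a->refcnt--;  /* the release that was missing on the replace path */
        return -1;
    }

    *params_new = new_value;
    free(a->params);
    a->params = params_new;
    a->refcnt--;  /* lookup reference dropped once the replace is done */
    return 0;
}
```

In this model, omitting the decrement in the failure branch reproduces the symptom: the action survives with one reference more than it had before the failed replace.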
      
      Fixes: c749cdda ("net/sched: act_skbedit: don't use spinlock in the data path")
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      69e6fb18
    • net/sched: act_ipt: fix refcount leak when replace fails · f1446b16
      Davide Caratti authored
      [ Upstream commit 8f67c90e ]
      
      After commit 4e8ddd7f ("net: sched: don't release reference on action
      overwrite"), the error paths of all actions were converted to drop the
      refcount also when the action was being overwritten. But act_ipt_init()
      was missed for the case where the allocation of 'tname' fails:
      
       # tc action add action xt -j LOG --log-prefix hello index 100
       tablename: mangle hook: NF_IP_POST_ROUTING
               target:  LOG level warning prefix "hello" index 100
       # tc action show action xt
       total acts 1
      
               action order 0: tablename: mangle  hook: NF_IP_POST_ROUTING
               target  LOG level warning prefix "hello"
               index 100 ref 1 bind 0
       # tc action replace action xt -j LOG --log-prefix world index 100
       tablename: mangle hook: NF_IP_POST_ROUTING
               target:  LOG level warning prefix "world" index 100
       RTNETLINK answers: Cannot allocate memory
       We have an error talking to the kernel
       # tc action show action xt
       total acts 1
      
               action order 0: tablename: mangle  hook: NF_IP_POST_ROUTING
               target  LOG level warning prefix "hello"
               index 100 ref 2 bind 0
      
      Ensure tcf_idr_release() is called when the 'tname' allocation fails,
      including when the action is being replaced.
      
      Fixes: 4e8ddd7f ("net: sched: don't release reference on action overwrite")
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      f1446b16
    • net: dsa: mv88e6xxx: prevent interrupt storm caused by mv88e6390x_port_set_cmode · 4d8f5df0
      Heiner Kallweit authored
      [ Upstream commit ed8fe202 ]
      
      While debugging another issue I hit an interrupt storm in this
      driver (88E6390, port 9 in SGMII mode), consisting of alternating
      link-up / link-down interrupts. Analysis showed that the driver
      wanted to set a cmode that was already set. But so far
      mv88e6390x_port_set_cmode() doesn't check this and powers down
      SERDES, which causes the link to break and eventually results in
      the described interrupt storm.
      
      Fix this by checking whether the cmode actually changes. We want
      the very first call to mv88e6390x_port_set_cmode() to always
      configure the registers, therefore initialize port.cmode with
      a value that is different from any supported cmode value.
      We have to take care that we only init the ports' cmode once
      chip->info->num_ports is set.
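
      The check and the sentinel initialization can be sketched in user
      space as follows. This is a model under stated assumptions, not the
      driver code: NUM_PORTS, CMODE_INVALID, struct chip and both functions
      are hypothetical, and the counter merely stands in for the SERDES
      power-down/power-up sequence.

```c
#include <stdint.h>

#define NUM_PORTS     11    /* hypothetical port count */
#define CMODE_INVALID 0xff  /* hypothetical sentinel, not a supported cmode */

struct port {
    uint8_t cmode;
};

struct chip {
    struct port ports[NUM_PORTS];
    unsigned int num_ports;
    unsigned int serdes_power_cycles;  /* counts simulated SERDES cycles */
};

/* Run once num_ports is known, so the first real set always differs. */
static void chip_init_cmodes(struct chip *chip)
{
    for (unsigned int i = 0; i < chip->num_ports; i++)
        chip->ports[i].cmode = CMODE_INVALID;
}

/* Only touch the SERDES when the requested cmode differs from the cached one. */
static int port_set_cmode(struct chip *chip, unsigned int port, uint8_t cmode)
{
    if (port >= chip->num_ports)
        return -1;

    if (chip->ports[port].cmode == cmode)
        return 0;                   /* nothing changes: leave the link alone */

    chip->serdes_power_cycles++;    /* stands in for power down, set, power up */
    chip->ports[port].cmode = cmode;
    return 0;
}
```

      Setting the same cmode twice on one port cycles the simulated SERDES
      only once, which is the property that avoids the repeated link
      breaks.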
      
      v2:
      - add small helper and init the number of actual ports only
      
      Fixes: 364e9d77 ("net: dsa: mv88e6xxx: Power on/off SERDES on cmode change")
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      4d8f5df0
    • net: dsa: mv88e6xxx: power serdes on/off for 10G interfaces on 6390X · 457c1190
      Maxime Chevallier authored
      [ Upstream commit d235c48b ]
      
      Upon setting the cmode on 6390 and 6390X, the associated serdes
      interfaces must be powered off/on.
      
      Both 6390X and 6390 share code to do so, but it currently uses the 6390
      specific helper mv88e6390_serdes_power() to disable and enable the
      serdes interface.
      
      This call fails silently on 6390X when trying to set a 10G interface
      such as XAUI or RXAUI, since mv88e6390_serdes_power() internally looks
      up the lane number based on the modes supported by the 6390, and
      returns 0 when it gets -ENODEV as a lane number.
      
      Using mv88e6390x_serdes_power() should be safe here, since we explicitly
      rule out all ports except 9 and 10, and because the modes supported by
      6390 ports 9 and 10 are a subset of those supported on 6390X.
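
      The silent failure can be illustrated with a simplified user-space
      model. Everything here is hypothetical (the enum values, the lane
      ids, the function names and the reduced mode lists); it only
      reproduces the shape of the problem: a lane lookup limited to the
      smaller chip's modes returns -ENODEV for a 10G mode, and the caller
      turns that into a successful no-op.

```c
#include <errno.h>

/* Hypothetical, reduced set of port modes. */
enum cmode { CMODE_SGMII, CMODE_1000BASEX, CMODE_XAUI, CMODE_RXAUI };

/* Lane lookup that knows only the modes of the smaller chip. */
static int lane_6390(int port, enum cmode mode)
{
    if ((port == 9 || port == 10) &&
        (mode == CMODE_SGMII || mode == CMODE_1000BASEX))
        return port;    /* hypothetical lane id */
    return -ENODEV;
}

/* Lane lookup that also accepts the 10G modes on ports 9 and 10. */
static int lane_6390x(int port, enum cmode mode)
{
    (void)mode;         /* every mode in this reduced model is accepted */
    if (port == 9 || port == 10)
        return port;    /* hypothetical lane id */
    return -ENODEV;
}

/* Returns 0 even when no lane was found, so the caller sees success. */
static int serdes_power_6390(int port, enum cmode mode, int *powered)
{
    if (lane_6390(port, mode) == -ENODEV)
        return 0;       /* silent no-op */
    *powered = 1;
    return 0;
}

/* Same contract, but the lookup covers the 10G modes as well. */
static int serdes_power_6390x(int port, enum cmode mode, int *powered)
{
    if (lane_6390x(port, mode) == -ENODEV)
        return 0;
    *powered = 1;
    return 0;
}
```

      For port 9 in RXAUI mode, the first helper reports success without
      powering anything, while the second actually powers the lane. For
      SGMII on the same port both behave identically, which reflects the
      subset argument above.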
      
      This was tested on 6390X using RXAUI mode.
      
      Fixes: 364e9d77 ("net: dsa: mv88e6xxx: Power on/off SERDES on cmode change")
      Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      457c1190