1. 11 Jul, 2018 7 commits
  2. 08 Jul, 2018 33 commits
    • Linux 4.14.54 · 5893f4c3
      Greg Kroah-Hartman authored
    • net: dsa: b53: Add BCM5389 support · 88b01cac
      Damien Thébault authored
      [ Upstream commit a95691bc ]
      
      This patch adds support for the BCM5389 switch connected through MDIO.
      Signed-off-by: Damien Thébault <damien.thebault@vitec.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net/sonic: Use dma_mapping_error() · 28b64cc7
      Finn Thain authored
      [ Upstream commit 26de0b76 ]
      
      With CONFIG_DMA_API_DEBUG=y, calling sonic_open() produces the
      message, "DMA-API: device driver failed to check map error".
      Add the missing dma_mapping_error() call.
      
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
      Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • platform/x86: asus-wmi: Fix NULL pointer dereference · 4888ced6
      João Paulo Rechi Vita authored
      [ Upstream commit 32ffd6e8 ]
      
      Do not perform the rfkill cleanup routine when
      (asus->driver->wlan_ctrl_by_user && ashs_present()) is true, since
      nothing is registered with the rfkill subsystem in that case. Doing so
      leads to the following kernel NULL pointer dereference:
      
        BUG: unable to handle kernel NULL pointer dereference at           (null)
        IP: [<ffffffff816c7348>] __mutex_lock_slowpath+0x98/0x120
        PGD 1a3aa8067
        PUD 1a3b3d067
        PMD 0
      
        Oops: 0002 [#1] PREEMPT SMP
        Modules linked in: bnep ccm binfmt_misc uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core hid_a4tech videodev x86_pkg_temp_thermal intel_powerclamp coretemp ath3k btusb btrtl btintel bluetooth kvm_intel snd_hda_codec_hdmi kvm snd_hda_codec_realtek snd_hda_codec_generic irqbypass crc32c_intel arc4 i915 snd_hda_intel snd_hda_codec ath9k ath9k_common ath9k_hw ath i2c_algo_bit snd_hwdep mac80211 ghash_clmulni_intel snd_hda_core snd_pcm snd_timer cfg80211 ehci_pci xhci_pci drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm xhci_hcd ehci_hcd asus_nb_wmi(-) asus_wmi sparse_keymap r8169 rfkill mxm_wmi serio_raw snd mii mei_me lpc_ich i2c_i801 video soundcore mei i2c_smbus wmi i2c_core mfd_core
        CPU: 3 PID: 3275 Comm: modprobe Not tainted 4.9.34-gentoo #34
        Hardware name: ASUSTeK COMPUTER INC. K56CM/K56CM, BIOS K56CM.206 08/21/2012
        task: ffff8801a639ba00 task.stack: ffffc900014cc000
        RIP: 0010:[<ffffffff816c7348>]  [<ffffffff816c7348>] __mutex_lock_slowpath+0x98/0x120
        RSP: 0018:ffffc900014cfce0  EFLAGS: 00010282
        RAX: 0000000000000000 RBX: ffff8801a54315b0 RCX: 00000000c0000100
        RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8801a54315b4
        RBP: ffffc900014cfd30 R08: 0000000000000000 R09: 0000000000000002
        R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801a54315b4
        R13: ffff8801a639ba00 R14: 00000000ffffffff R15: ffff8801a54315b8
        FS:  00007faa254fb700(0000) GS:ffff8801aef80000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        CR2: 0000000000000000 CR3: 00000001a3b1b000 CR4: 00000000001406e0
        Stack:
         ffff8801a54315b8 0000000000000000 ffffffff814733ae ffffc900014cfd28
         ffffffff8146a28c ffff8801a54315b0 0000000000000000 ffff8801a54315b0
         ffff8801a66f3820 0000000000000000 ffffc900014cfd48 ffffffff816c73e7
        Call Trace:
         [<ffffffff814733ae>] ? acpi_ut_release_mutex+0x5d/0x61
         [<ffffffff8146a28c>] ? acpi_ns_get_node+0x49/0x52
         [<ffffffff816c73e7>] mutex_lock+0x17/0x30
         [<ffffffffa00a3bb4>] asus_rfkill_hotplug+0x24/0x1a0 [asus_wmi]
         [<ffffffffa00a4421>] asus_wmi_rfkill_exit+0x61/0x150 [asus_wmi]
         [<ffffffffa00a49f1>] asus_wmi_remove+0x61/0xb0 [asus_wmi]
         [<ffffffff814a5128>] platform_drv_remove+0x28/0x40
         [<ffffffff814a2901>] __device_release_driver+0xa1/0x160
         [<ffffffff814a29e3>] device_release_driver+0x23/0x30
         [<ffffffff814a1ffd>] bus_remove_device+0xfd/0x170
         [<ffffffff8149e5a9>] device_del+0x139/0x270
         [<ffffffff814a5028>] platform_device_del+0x28/0x90
         [<ffffffff814a50a2>] platform_device_unregister+0x12/0x30
         [<ffffffffa00a4209>] asus_wmi_unregister_driver+0x19/0x30 [asus_wmi]
         [<ffffffffa00da0ea>] asus_nb_wmi_exit+0x10/0xf26 [asus_nb_wmi]
         [<ffffffff8110c692>] SyS_delete_module+0x192/0x270
         [<ffffffff810022b2>] ? exit_to_usermode_loop+0x92/0xa0
         [<ffffffff816ca560>] entry_SYSCALL_64_fastpath+0x13/0x94
        Code: e8 5e 30 00 00 8b 03 83 f8 01 0f 84 93 00 00 00 48 8b 43 10 4c 8d 7b 08 48 89 63 10 41 be ff ff ff ff 4c 89 3c 24 48 89 44 24 08 <48> 89 20 4c 89 6c 24 10 eb 1d 4c 89 e7 49 c7 45 08 02 00 00 00
        RIP  [<ffffffff816c7348>] __mutex_lock_slowpath+0x98/0x120
         RSP <ffffc900014cfce0>
        CR2: 0000000000000000
        ---[ end trace 8d484233fa7cb512 ]---
        note: modprobe[3275] exited with preempt_count 2
      
      https://bugzilla.kernel.org/show_bug.cgi?id=196467
      
      Reported-by: red.f0xyz@gmail.com
      Signed-off-by: João Paulo Rechi Vita <jprvita@endlessm.com>
      Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sched/core: Require cpu_active() in select_task_rq(), for user tasks · 0d5e04e2
      Paul Burton authored
      [ Upstream commit 7af443ee ]
      
      select_task_rq() is used in a few paths to select the CPU upon which a
      thread should be run - for example it is used by try_to_wake_up() & by
      fork or exec balancing. As-is it allows use of any online CPU that is
      present in the task's cpus_allowed mask.
      
      This presents a problem because there is a period whilst CPUs are
      brought online where a CPU is marked online, but is not yet fully
      initialized - i.e. the period where CPUHP_AP_ONLINE_IDLE <= state <
      CPUHP_ONLINE. Usually we don't run any user tasks during this window,
      but there are corner cases where this can happen. An example observed
      is:
      
        - Some user task A, running on CPU X, forks to create task B.
      
        - sched_fork() calls __set_task_cpu() with cpu=X, setting task B's
          task_struct::cpu field to X.
      
        - CPU X is offlined.
      
        - Task A, currently somewhere between the __set_task_cpu() in
          copy_process() and the call to wake_up_new_task(), is migrated to
          CPU Y by migrate_tasks() when CPU X is offlined.
      
        - CPU X is onlined, but still in the CPUHP_AP_ONLINE_IDLE state. The
          scheduler is now active on CPU X, but there are no user tasks on
          the runqueue.
      
        - Task A runs on CPU Y & reaches wake_up_new_task(). This calls
          select_task_rq() with cpu=X, taken from task B's task_struct,
          and select_task_rq() allows CPU X to be returned.
      
        - Task A enqueues task B on CPU X's runqueue, via activate_task() &
          enqueue_task().
      
        - CPU X now has a user task on its runqueue before it has reached the
          CPUHP_ONLINE state.
      
      In most cases, the user tasks that schedule on the newly onlined CPU
      have no idea that anything went wrong, but one case observed to be
      problematic is if the task goes on to invoke the sched_setaffinity
      syscall. The newly onlined CPU reaches the CPUHP_AP_ONLINE_IDLE state
      before the CPU that brought it online calls stop_machine_unpark(). This
      means that for a portion of the window of time between
      CPUHP_AP_ONLINE_IDLE & CPUHP_ONLINE the newly onlined CPU's struct
      cpu_stopper has its enabled field set to false. If a user thread is
      executed on the CPU during this window and it invokes sched_setaffinity
      with a CPU mask that does not include the CPU it's running on, then when
      __set_cpus_allowed_ptr() calls stop_one_cpu() intending to invoke
      migration_cpu_stop() and perform the actual migration away from the CPU
      it will simply return -ENOENT rather than calling migration_cpu_stop().
      We then return from the sched_setaffinity syscall back to the user task
      that is now running on a CPU which it just asked not to run on, and
      which is not present in its cpus_allowed mask.
      
      This patch resolves the problem by having select_task_rq() enforce that
      user tasks run on CPUs that are active - the same requirement that
      select_fallback_rq() already enforces. This should ensure that newly
      onlined CPUs reach the CPUHP_AP_ACTIVE state before being able to
      schedule user tasks, and also implies that bringup_wait_for_ap() will
      have called stop_machine_unpark() which resolves the sched_setaffinity
      issue above.
      
      I haven't yet investigated them, but it may be of interest to review
      whether any of the actions performed by hotplug states between
      CPUHP_AP_ONLINE_IDLE & CPUHP_AP_ACTIVE could have similar unintended
      effects on user tasks that might schedule before they are reached, which
      might widen the scope of the problem from just affecting the behaviour
      of sched_setaffinity.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20180526154648.11635-2-paul.burton@mips.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sched/core: Fix rules for running on online && !active CPUs · e4c55e0e
      Peter Zijlstra authored
      [ Upstream commit 175f0e25 ]
      
      As already enforced by the WARN() in __set_cpus_allowed_ptr(), the rules
      for running on an online && !active CPU are stricter than just being a
      kthread: you need to be a per-CPU kthread.
      
      If you're not strictly per-CPU, you have better CPUs to run on and
      don't need the partially booted one to get your work done.
      
      The exception is to allow smpboot threads to bootstrap the CPU itself
      and get kernel 'services' initialized before we allow userspace on it.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 955dbdf4 ("sched: Allow migrating kthreads into online but inactive CPUs")
      Link: http://lkml.kernel.org/r/20170725165821.cejhb7v2s3kecems@hirez.programming.kicks-ass.net
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • fs: clear writeback errors in inode_init_always · 93b84462
      Darrick J. Wong authored
      [ Upstream commit 829bc787 ]
      
      In inode_init_always(), we clear the inode mapping flags, which clears
      any retained error (AS_EIO, AS_ENOSPC) bits.  Unfortunately, we do not
      also clear wb_err, which means that old mapping errors can leak through
      to new inodes.
      
      This is crucial for the XFS inode allocation path because we recycle old
      in-core inodes and we do not want error state from an old file to leak
      into the new file.  This bug was discovered by running generic/036 and
      generic/047 in a loop and noticing that the EIOs generated by the
      collision of direct and buffered writes in generic/036 would survive the
      remount between 036 and 047, and get reported to the fsyncs (on
      different files!) in generic/047.
      Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
      Reviewed-by: Jeff Layton <jlayton@kernel.org>
      Reviewed-by: Brian Foster <bfoster@redhat.com>
      Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • perf bpf: Fix NULL return handling in bpf__prepare_load() · ae14c044
      YueHaibing authored
      [ Upstream commit ab4e32ff ]
      
      bpf_object__open() and bpf_object__open_buffer() can return an error
      pointer or NULL. Check their return values with IS_ERR_OR_NULL() in
      bpf__prepare_load() and bpf__prepare_load_buffer().
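
Why a plain NULL check (or a plain error-pointer check) is not enough can be seen in a small userspace model of the error-pointer convention. The `model_*` names and `MODEL_MAX_ERRNO` are assumptions made for this sketch; they mimic the idea behind the kernel-style helpers named above but are not those helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed for the model: error codes occupy the top 4095 addresses. */
#define MODEL_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
void *model_err_ptr(long err)
{
    assert(err < 0 && err >= -MODEL_MAX_ERRNO);
    return (void *)(intptr_t)err;
}

/* True if the pointer value falls inside the reserved error range. */
bool model_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MODEL_MAX_ERRNO;
}

/* A caller that can receive either failure style must test for both. */
bool model_is_err_or_null(const void *p)
{
    return p == NULL || model_is_err(p);
}
```

Note that `model_is_err(NULL)` is false, so a caller that only tests for error pointers would go on to dereference NULL; the combined check covers both failure styles.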
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: netdev@vger.kernel.org
      Link: https://lkml.kernel.org/n/tip-psf4xwc09n62al2cb9s33v9h@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • perf test: "Session topology" dumps core on s390 · be5af6be
      Thomas Richter authored
      [ Upstream commit d1211091 ]
      
      The "perf test Session topology" entry fails with a core dump on s390. The root
      cause is a NULL pointer dereference in the function check_cpu_topology() at line 76
      (or line 82 without -v).
      
      The session->header.env.cpu variable is NULL because on s390 the function
      process_cpu_topology() returns with the error:
      
          socket_id number is too big.
          You may need to upgrade the perf tool.
      
      and releases the env.cpu variable via zfree() and sets it to NULL.
      
      Here is the gdb output:
      (gdb) n
      76                      pr_debug("CPU %d, core %d, socket %d\n", i,
      (gdb) n
      
      Program received signal SIGSEGV, Segmentation fault.
      0x00000000010f4d9e in check_cpu_topology (path=0x3ffffffd6c8
      	"/tmp/perf-test-J6CHMa", map=0x14a1740) at tests/topology.c:76
      76  pr_debug("CPU %d, core %d, socket %d\n", i,
      (gdb)
      
      Make sure the env.cpu variable is not used when it is NULL.
      Test for NULL pointer and return TEST_SKIP if so.
      
      Output before:
      
        [root@p23lp27 perf]# ./perf test -F 39
        39: Session topology  :Segmentation fault (core dumped)
        [root@p23lp27 perf]#
      
      Output after:
      
        [root@p23lp27 perf]# ./perf test -vF 39
        39: Session topology                                      :
        --- start ---
        templ file: /tmp/perf-test-Ajx59D
        socket_id number is too big.You may need to upgrade the perf tool.
        ---- end ----
        Session topology: Skip
        [root@p23lp27 perf]#
      Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Link: http://lkml.kernel.org/r/20180528073657.11743-1-tmricht@linux.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: qmi_wwan: Add Netgear Aircard 779S · d689ad5c
      Josh Hill authored
      [ Upstream commit 2415f3bd ]
      
      Add support for Netgear Aircard 779S
      Signed-off-by: Josh Hill <josh@joshuajhill.com>
      Acked-by: Bjørn Mork <bjorn@mork.no>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • atm: zatm: fix memcmp casting · d20dcd2f
      Ivan Bornyakov authored
      [ Upstream commit f9c6442a ]
      
      memcmp() returns int, but eprom_try_esi() casts it to unsigned char. The
      cast can drop significant bits, turning a non-zero memcmp() result
      into 0.
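
A minimal userspace sketch of the truncation hazard follows. The function names are hypothetical and this is not the zatm driver code; it only shows that narrowing an `int` comparison result to `unsigned char` keeps the low 8 bits, so any non-zero result that is a multiple of 256 looks like "equal".

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Buggy pattern: narrowing the int result keeps only its low 8 bits. */
bool differs_truncated(int cmp_result)
{
    unsigned char narrowed = (unsigned char)cmp_result;
    return narrowed != 0;
}

/* Safe pattern: test the full int result against zero. */
bool differs_full(int cmp_result)
{
    return cmp_result != 0;
}

/* Comparing buffers without ever narrowing the memcmp() result. */
bool buffers_differ(const void *a, const void *b, size_t n)
{
    assert(a != NULL && b != NULL);
    return memcmp(a, b, n) != 0;
}
```

The C standard only promises the sign of the memcmp() result, so an implementation is free to return a value such as 256, which `differs_truncated()` would misreport as a match.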
      Signed-off-by: Ivan Bornyakov <brnkv.i1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • iwlwifi: pcie: compare with number of IRQs requested for, not number of CPUs · 3ee6bd94
      Hao Wei Tee authored
      [ Upstream commit ab1068d6 ]
      
      When there are 16 or more logical CPUs, we request only
      `IWL_MAX_RX_HW_QUEUES` (16) IRQs, since we cap the request at that
      number. Later on, however, we compare the number of IRQs returned to
      nr_online_cpus+2 instead of max_irqs, the latter being what we
      actually asked for. This ends up setting num_rx_queues to 17, which
      causes many out-of-bounds array accesses later on.
      
      Compare to max_irqs instead, and also add an assertion in case
      num_rx_queues > IWL_MAX_RX_HW_QUEUES.
      
      This fixes https://bugzilla.kernel.org/show_bug.cgi?id=199551
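
The arithmetic behind the value 17 can be reproduced with a simplified model. Only the cap of 16, the comparison against nr_online_cpus+2 versus max_irqs, and the resulting 17 come from the commit text; the three-way branch structure and the names `model_max_irqs()` and `model_num_rx_queues()` are assumptions made for illustration and may not match the driver.

```c
#include <assert.h>

/* Value of IWL_MAX_RX_HW_QUEUES as stated in the commit text. */
#define MODEL_MAX_RX_HW_QUEUES 16u

/* Number of vectors requested: online CPUs + 2, capped at the maximum. */
unsigned int model_max_irqs(unsigned int online_cpus)
{
    unsigned int want = online_cpus + 2;
    return want < MODEL_MAX_RX_HW_QUEUES ? want : MODEL_MAX_RX_HW_QUEUES;
}

/*
 * Assumed decision logic: derive the RX queue count from how many
 * vectors were granted relative to a reference count. Passing
 * online_cpus + 2 as the reference models the old behaviour; passing
 * model_max_irqs() models the fixed behaviour.
 */
unsigned int model_num_rx_queues(unsigned int num_irqs, unsigned int reference)
{
    assert(reference >= 2);

    if (num_irqs <= reference - 2)
        return num_irqs + 1;    /* vectors are shared, extra queue */
    if (num_irqs == reference - 1)
        return num_irqs;
    return num_irqs - 1;        /* one vector reserved for non-RX causes */
}
```

With 16 online CPUs and 16 granted vectors, the old reference of 18 yields 17 queues in this model, one more than the array bound, while the capped reference of 16 yields 15.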
      
      Fixes: 2e5d4a8f ("iwlwifi: pcie: Add new configuration to enable MSIX")
      Signed-off-by: Hao Wei Tee <angelsl@in04.sg>
      Tested-by: Sara Sharon <sara.sharon@intel.com>
      Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
      Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
      Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ipvs: fix buffer overflow with sync daemon and service · 4abab5dc
      Julian Anastasov authored
      [ Upstream commit 52f96757 ]
      
      syzkaller reports a buffer overflow for the interface name
      when starting sync daemons [1].
      
      We copy the user structure into a larger stack buffer, but
      later search for the NUL terminator past the end of that buffer.
      The same happens for sched_name when adding/editing a virtual server.
      
      We are restricted by IP_VS_SCHEDNAME_MAXLEN and IP_VS_IFNAME_MAXLEN
      being used as size in include/uapi/linux/ip_vs.h, so they
      include the space for NUL.
      
      As strlcpy is wrong for an unsafe source, replace it with
      strscpy and add checks to return EINVAL if the source string is not
      NUL-terminated. The incomplete strlcpy fix comes from 2.6.13.
      
      For the netlink interface reduce the len parameter for
      IPVS_DAEMON_ATTR_MCAST_IFN and IPVS_SVC_ATTR_SCHED_NAME,
      so that we get proper EINVAL.
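
The underlying idea, never scanning past the end of an untrusted fixed-size field, can be sketched in userspace as follows. `copy_checked()` is a hypothetical helper written for this illustration; it is not strscpy() and not the code from this patch, and its choice of error codes is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Copy a string out of a fixed-size, untrusted field.
 * Reads at most src_size bytes of src, so an unterminated field is
 * rejected instead of being scanned past its end.
 * Returns the string length on success, -EINVAL if src holds no NUL
 * within src_size bytes, or -E2BIG if the string does not fit in dst.
 */
long copy_checked(char *dst, size_t dst_size,
                  const char *src, size_t src_size)
{
    assert(dst != NULL && src != NULL);

    const char *nul = memchr(src, '\0', src_size);
    if (nul == NULL)
        return -EINVAL;

    size_t len = (size_t)(nul - src);
    if (len >= dst_size)
        return -E2BIG;

    memcpy(dst, src, len + 1);  /* include the terminating NUL */
    return (long)len;
}
```

A strlcpy()-style copy would instead call strlen() on the source first, which is exactly the read past the buffer that the trace below reports.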
      
      [1]
      kernel BUG at lib/string.c:1052!
      invalid opcode: 0000 [#1] SMP KASAN
      Dumping ftrace buffer:
          (ftrace buffer empty)
      Modules linked in:
      CPU: 1 PID: 373 Comm: syz-executor936 Not tainted 4.17.0-rc4+ #45
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
      Google 01/01/2011
      RIP: 0010:fortify_panic+0x13/0x20 lib/string.c:1051
      RSP: 0018:ffff8801c976f800 EFLAGS: 00010282
      RAX: 0000000000000022 RBX: 0000000000000040 RCX: 0000000000000000
      RDX: 0000000000000022 RSI: ffffffff8160f6f1 RDI: ffffed00392edef6
      RBP: ffff8801c976f800 R08: ffff8801cf4c62c0 R09: ffffed003b5e4fb0
      R10: ffffed003b5e4fb0 R11: ffff8801daf27d87 R12: ffff8801c976fa20
      R13: ffff8801c976fae4 R14: ffff8801c976fae0 R15: 000000000000048b
      FS:  00007fd99f75e700(0000) GS:ffff8801daf00000(0000)
      knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000200001c0 CR3: 00000001d6843000 CR4: 00000000001406e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
        strlen include/linux/string.h:270 [inline]
        strlcpy include/linux/string.h:293 [inline]
        do_ip_vs_set_ctl+0x31c/0x1d00 net/netfilter/ipvs/ip_vs_ctl.c:2388
        nf_sockopt net/netfilter/nf_sockopt.c:106 [inline]
        nf_setsockopt+0x7d/0xd0 net/netfilter/nf_sockopt.c:115
        ip_setsockopt+0xd8/0xf0 net/ipv4/ip_sockglue.c:1253
        udp_setsockopt+0x62/0xa0 net/ipv4/udp.c:2487
        ipv6_setsockopt+0x149/0x170 net/ipv6/ipv6_sockglue.c:917
        tcp_setsockopt+0x93/0xe0 net/ipv4/tcp.c:3057
        sock_common_setsockopt+0x9a/0xe0 net/core/sock.c:3046
        __sys_setsockopt+0x1bd/0x390 net/socket.c:1903
        __do_sys_setsockopt net/socket.c:1914 [inline]
        __se_sys_setsockopt net/socket.c:1911 [inline]
        __x64_sys_setsockopt+0xbe/0x150 net/socket.c:1911
        do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
        entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x447369
      RSP: 002b:00007fd99f75dda8 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
      RAX: ffffffffffffffda RBX: 00000000006e39e4 RCX: 0000000000447369
      RDX: 000000000000048b RSI: 0000000000000000 RDI: 0000000000000003
      RBP: 0000000000000000 R08: 0000000000000018 R09: 0000000000000000
      R10: 00000000200001c0 R11: 0000000000000246 R12: 00000000006e39e0
      R13: 75a1ff93f0896195 R14: 6f745f3168746576 R15: 0000000000000001
      Code: 08 5b 41 5c 41 5d 41 5e 41 5f 5d c3 0f 0b 48 89 df e8 d2 8f 48 fa eb
      de 55 48 89 fe 48 c7 c7 60 65 64 88 48 89 e5 e8 91 dd f3 f9 <0f> 0b 90 90
      90 90 90 90 90 90 90 90 90 55 48 89 e5 41 57 41 56
      RIP: fortify_panic+0x13/0x20 lib/string.c:1051 RSP: ffff8801c976f800
      
      Reported-and-tested-by: syzbot+aac887f77319868646df@syzkaller.appspotmail.com
      Fixes: e4ff6751 ("ipvs: add sync_maxlen parameter for the sync daemon")
      Fixes: 4da62fc7 ("[IPVS]: Fix for overflows")
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Acked-by: Simon Horman <horms+renesas@verge.net.au>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • netfilter: nft_limit: fix packet ratelimiting · 27aa533f
      Pablo Neira Ayuso authored
      [ Upstream commit 3e0f64b7 ]
      
      Credit calculations for packet ratelimiting are not correct. With
      the applied ratelimit of 25/second and burst 8, a total of 33 packets
      should have been accepted. This is true in iptables (33) but not in
      nftables (~65). For packet ratelimiting, use:
      
      	div_u64(limit->nsecs, limit->rate) * limit->burst;
      
      to calculate credit, just as iptables' xt_limit does.
      
      Moreover, use the same default burst as iptables, since users expect
      similar behaviour.
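
As a worked example of the formula quoted above, here is a minimal userspace sketch. `packet_credit()` is a hypothetical name, plain 64-bit division stands in for div_u64(), and treating `nsecs` as the length of the rate interval in nanoseconds is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Credit for packet ratelimiting: the cost of one packet
 * (interval / rate) multiplied by the burst size.
 */
uint64_t packet_credit(uint64_t nsecs, uint64_t rate, uint64_t burst)
{
    assert(rate != 0);
    return (nsecs / rate) * burst;
}
```

For 25 packets per second with burst 8, one packet costs 1000000000 / 25 = 40000000 ns, so the credit is 320000000 ns, which is exactly 8 packets' worth.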
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • s390/dasd: use blk_mq_rq_from_pdu for per request data · 510e1e80
      Sebastian Ott authored
      [ Upstream commit f0f59a2f ]
      
      Dasd uses completion_data from struct request to store per request
      private data - this is problematic since this member is part of a
      union which is also used by IO schedulers.
      Let the block layer maintain space for per request data behind each
      struct request.
      
      Fixes crashes on block layer timeouts like this one:
      
      Unable to handle kernel pointer dereference in virtual kernel address space
      Failing address: 0000000000000000 TEID: 0000000000000483
      Fault in home space mode while using kernel ASCE.
      AS:0000000001308007 R3:00000000fffc8007 S:00000000fffcc000 P:000000000000013d
      Oops: 0004 ilc:2 [#1] PREEMPT SMP
      Modules linked in: [...]
      CPU: 0 PID: 1480 Comm: kworker/0:2H Not tainted 4.17.0-rc4-00046-gaa3bcd43b5af #203
      Hardware name: IBM 3906 M02 702 (LPAR)
      Workqueue: kblockd blk_mq_timeout_work
      Krnl PSW : 0000000067ac406b 00000000b6960308 (do_raw_spin_trylock+0x30/0x78)
                 R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
      Krnl GPRS: 0000000000000c00 0000000000000000 0000000000000000 0000000000000001
                 0000000000b9d3c8 0000000000000000 0000000000000001 00000000cf9639d8
                 0000000000000000 0700000000000000 0000000000000000 000000000099f09e
                 0000000000000000 000000000076e9d0 000000006247bb08 000000006247bae0
      Krnl Code: 00000000001c159c: b90400c2           lgr     %r12,%r2
                 00000000001c15a0: a7180000           lhi     %r1,0
                #00000000001c15a4: 583003a4           l       %r3,932
                >00000000001c15a8: ba132000           cs      %r1,%r3,0(%r2)
                 00000000001c15ac: a7180001           lhi     %r1,1
                 00000000001c15b0: a784000b           brc     8,1c15c6
                 00000000001c15b4: c0e5004e72aa       brasl   %r14,b8fb08
                 00000000001c15ba: 1812               lr      %r1,%r2
      Call Trace:
      ([<0700000000000000>] 0x700000000000000)
       [<0000000000b9d3d2>] _raw_spin_lock_irqsave+0x7a/0xb8
       [<000000000099f09e>] dasd_times_out+0x46/0x278
       [<000000000076ea6e>] blk_mq_terminate_expired+0x9e/0x108
       [<000000000077497a>] bt_for_each+0x102/0x130
       [<0000000000774e54>] blk_mq_queue_tag_busy_iter+0x74/0xd8
       [<000000000076fea0>] blk_mq_timeout_work+0x260/0x320
       [<0000000000169dd4>] process_one_work+0x3bc/0x708
       [<000000000016a382>] worker_thread+0x262/0x408
       [<00000000001723a8>] kthread+0x160/0x178
       [<0000000000b9e73a>] kernel_thread_starter+0x6/0xc
       [<0000000000b9e734>] kernel_thread_starter+0x0/0xc
      INFO: lockdep is turned off.
      Last Breaking-Event-Address:
       [<0000000000b9d3cc>] _raw_spin_lock_irqsave+0x74/0xb8
      
      Kernel panic - not syncing: Fatal exception: panic_on_oops
      Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
      Reviewed-by: Stefan Haberland <sth@linux.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • netfilter: ebtables: handle string from userspace with care · db73501e
      Paolo Abeni authored
      [ Upstream commit 94c752f9 ]
      
      strlcpy() can't be safely used on a user-space provided string,
      as it can try to read beyond the buffer's end if the string is
      not NUL-terminated.
      
      Leveraging the above, syzbot has been able to trigger the following
      splat:
      
      BUG: KASAN: stack-out-of-bounds in strlcpy include/linux/string.h:300
      [inline]
      BUG: KASAN: stack-out-of-bounds in compat_mtw_from_user
      net/bridge/netfilter/ebtables.c:1957 [inline]
      BUG: KASAN: stack-out-of-bounds in ebt_size_mwt
      net/bridge/netfilter/ebtables.c:2059 [inline]
      BUG: KASAN: stack-out-of-bounds in size_entry_mwt
      net/bridge/netfilter/ebtables.c:2155 [inline]
      BUG: KASAN: stack-out-of-bounds in compat_copy_entries+0x96c/0x14a0
      net/bridge/netfilter/ebtables.c:2194
      Write of size 33 at addr ffff8801b0abf888 by task syz-executor0/4504
      
      CPU: 0 PID: 4504 Comm: syz-executor0 Not tainted 4.17.0-rc2+ #40
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
      Google 01/01/2011
      Call Trace:
        __dump_stack lib/dump_stack.c:77 [inline]
        dump_stack+0x1b9/0x294 lib/dump_stack.c:113
        print_address_description+0x6c/0x20b mm/kasan/report.c:256
        kasan_report_error mm/kasan/report.c:354 [inline]
        kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412
        check_memory_region_inline mm/kasan/kasan.c:260 [inline]
        check_memory_region+0x13e/0x1b0 mm/kasan/kasan.c:267
        memcpy+0x37/0x50 mm/kasan/kasan.c:303
        strlcpy include/linux/string.h:300 [inline]
        compat_mtw_from_user net/bridge/netfilter/ebtables.c:1957 [inline]
        ebt_size_mwt net/bridge/netfilter/ebtables.c:2059 [inline]
        size_entry_mwt net/bridge/netfilter/ebtables.c:2155 [inline]
        compat_copy_entries+0x96c/0x14a0 net/bridge/netfilter/ebtables.c:2194
        compat_do_replace+0x483/0x900 net/bridge/netfilter/ebtables.c:2285
        compat_do_ebt_set_ctl+0x2ac/0x324 net/bridge/netfilter/ebtables.c:2367
        compat_nf_sockopt net/netfilter/nf_sockopt.c:144 [inline]
        compat_nf_setsockopt+0x9b/0x140 net/netfilter/nf_sockopt.c:156
        compat_ip_setsockopt+0xff/0x140 net/ipv4/ip_sockglue.c:1279
        inet_csk_compat_setsockopt+0x97/0x120 net/ipv4/inet_connection_sock.c:1041
        compat_tcp_setsockopt+0x49/0x80 net/ipv4/tcp.c:2901
        compat_sock_common_setsockopt+0xb4/0x150 net/core/sock.c:3050
        __compat_sys_setsockopt+0x1ab/0x7c0 net/compat.c:403
        __do_compat_sys_setsockopt net/compat.c:416 [inline]
        __se_compat_sys_setsockopt net/compat.c:413 [inline]
        __ia32_compat_sys_setsockopt+0xbd/0x150 net/compat.c:413
        do_syscall_32_irqs_on arch/x86/entry/common.c:323 [inline]
        do_fast_syscall_32+0x345/0xf9b arch/x86/entry/common.c:394
        entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139
      RIP: 0023:0xf7fb3cb9
      RSP: 002b:00000000fff0c26c EFLAGS: 00000282 ORIG_RAX: 000000000000016e
      RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000000000
      RDX: 0000000000000080 RSI: 0000000020000300 RDI: 00000000000005f4
      RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
      R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
      
      The buggy address belongs to the page:
      page:ffffea0006c2afc0 count:0 mapcount:0 mapping:0000000000000000 index:0x0
      flags: 0x2fffc0000000000()
      raw: 02fffc0000000000 0000000000000000 0000000000000000 00000000ffffffff
      raw: 0000000000000000 ffffea0006c20101 0000000000000000 0000000000000000
      page dumped because: kasan: bad access detected
      
      Fix the issue by replacing the unsafe strlcpy() with strscpy() and
      handling possible errors.
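The difference between the two copy helpers can be sketched in userspace. The function below is a hypothetical stand-in, not the kernel's strscpy(): it assumes only that the fix relies on a copy that never reads the source past the given size and that reports truncation as an error, whereas strlcpy() measures the whole source string first.

```c
#include <stddef.h>

/*
 * Hypothetical userspace stand-in for a strscpy()-style copy.
 * Reads at most `size` bytes of src, always NUL-terminates dst,
 * and returns the copied length, or -1 if src did not fit
 * (the kernel helper returns -E2BIG in that case).
 */
static long bounded_copy(char *dst, const char *src, size_t size)
{
    size_t i;

    if (size == 0)
        return -1;

    for (i = 0; i < size - 1; i++) {
        dst[i] = src[i];
        if (src[i] == '\0')
            return (long)i;
    }

    /* size - 1 bytes copied without seeing a terminator. */
    dst[i] = '\0';
    return src[i] == '\0' ? (long)i : -1;
}
```

A caller would treat a negative return as a malformed (unterminated or oversized) name and reject the request instead of continuing with a truncated string.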
      
      Fixes: 81e675c2 ("netfilter: ebtables: add CONFIG_COMPAT support")
      Reported-and-tested-by: syzbot+4e42a04e0bc33cb6c087@syzkaller.appspotmail.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      db73501e
    • David Howells's avatar
      afs: Fix directory permissions check · e36bc993
      David Howells authored
      [ Upstream commit 378831e4 ]
      
      Doing faccessat("/afs/some/directory", 0) triggers a BUG in the permissions
      check code.
      
      Fix this by just removing the BUG section.  If no permissions are asked
      for, just return okay if the file exists.
      
      Also:
      
       (1) Split up the directory check so that it has separate if-statements
           rather than if-else-if (e.g. checking for MAY_EXEC shouldn't skip the
           check for MAY_READ and MAY_WRITE).
      
       (2) Check for MAY_CHDIR as MAY_EXEC.
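The shape of the reworked check can be sketched as follows. All names here (the REQ_* request bits, the RIGHT_* access bits, and the mapping between them) are hypothetical and chosen for illustration; the sketch only shows an empty mask succeeding, independent if-statements, and the chdir request being treated like exec.

```c
#include <stdbool.h>

/* Hypothetical request bits, standing in for MAY_EXEC and friends. */
#define REQ_EXEC  0x1u
#define REQ_WRITE 0x2u
#define REQ_READ  0x4u
#define REQ_CHDIR 0x8u

/* Hypothetical directory access rights granted to the caller. */
#define RIGHT_LOOKUP 0x1u
#define RIGHT_LIST   0x2u
#define RIGHT_MODIFY 0x4u

static bool dir_access_ok(unsigned int mask, unsigned int rights)
{
    /* Nothing requested: succeed as long as the directory exists. */
    if (mask == 0)
        return true;

    /* Separate ifs: a failed or passed exec check must not skip the rest. */
    if ((mask & (REQ_EXEC | REQ_CHDIR)) && !(rights & RIGHT_LOOKUP))
        return false;
    if ((mask & REQ_WRITE) && !(rights & RIGHT_MODIFY))
        return false;
    if ((mask & REQ_READ) && !(rights & RIGHT_LIST))
        return false;

    return true;
}
```

With an if-else-if chain, a request for exec plus write would stop after the exec branch and never look at the write right; the independent checks above examine every requested bit.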
      
      Without the main fix, the following BUG may occur:
      
       kernel BUG at fs/afs/security.c:386!
       invalid opcode: 0000 [#1] SMP PTI
       ...
       RIP: 0010:afs_permission+0x19d/0x1a0 [kafs]
       ...
       Call Trace:
        ? inode_permission+0xbe/0x180
        ? do_faccessat+0xdc/0x270
        ? do_syscall_64+0x60/0x1f0
        ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Fixes: 00d3b7a4 ("[AFS]: Add security support.")
      Reported-by: Jonathan Billings <jsbillings@jsbillings.org>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      e36bc993
    • Eric Dumazet's avatar
      xfrm6: avoid potential infinite loop in _decode_session6() · 4cf1fbcd
      Eric Dumazet authored
      [ Upstream commit d9f92772 ]
      
      syzbot found a way to trigger an infinite loop by overflowing
      the @offset variable, which was forced to be a u16 for some very
      obscure reason in the past.
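Why a 16-bit offset can loop forever is easy to show outside the kernel. The sketch below is not the _decode_session6() logic; it assumes a hypothetical walk over fixed-size headers and uses a step limit to stand in for the soft lockup. The only point is that a u16 offset wraps at 65536 and so can never reach a larger bound.

```c
#include <stdint.h>

/*
 * Advance a 16-bit offset by `step` until it reaches `end`.
 * Returns the number of steps, or -1 once `max_steps` is hit
 * (which stands in for the watchdog firing).
 */
static int walk_u16(uint32_t end, uint16_t step, int max_steps)
{
    uint16_t offset = 0;   /* narrow type: wraps at 65536 */
    int steps = 0;

    while (offset < end) {
        if (steps >= max_steps)
            return -1;
        steps++;
        offset = (uint16_t)(offset + step);
    }
    return steps;
}

/* Same walk with an offset wide enough to hold `end`. */
static int walk_wide(uint32_t end, uint32_t step, int max_steps)
{
    uint32_t offset = 0;
    int steps = 0;

    while (offset < end) {
        if (steps >= max_steps)
            return -1;
        steps++;
        offset += step;
    }
    return steps;
}
```

For bounds that fit in 16 bits both variants terminate identically; only a bound above 65535 exposes the wraparound.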
      
      We probably want to look at NEXTHDR_FRAGMENT handling which looks
      wrong, in a separate patch.
      
      In net-next, we shall try to use skb_header_pointer() instead of
      pskb_may_pull().
      
      watchdog: BUG: soft lockup - CPU#1 stuck for 134s! [syz-executor738:4553]
      Modules linked in:
      irq event stamp: 13885653
      hardirqs last  enabled at (13885652): [<ffffffff878009d5>] restore_regs_and_return_to_kernel+0x0/0x2b
      hardirqs last disabled at (13885653): [<ffffffff87800905>] interrupt_entry+0xb5/0xf0 arch/x86/entry/entry_64.S:625
      softirqs last  enabled at (13614028): [<ffffffff84df0809>] tun_napi_alloc_frags drivers/net/tun.c:1478 [inline]
      softirqs last  enabled at (13614028): [<ffffffff84df0809>] tun_get_user+0x1dd9/0x4290 drivers/net/tun.c:1825
      softirqs last disabled at (13614032): [<ffffffff84df1b6f>] tun_get_user+0x313f/0x4290 drivers/net/tun.c:1942
      CPU: 1 PID: 4553 Comm: syz-executor738 Not tainted 4.17.0-rc3+ #40
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:check_kcov_mode kernel/kcov.c:67 [inline]
      RIP: 0010:__sanitizer_cov_trace_pc+0x20/0x50 kernel/kcov.c:101
      RSP: 0018:ffff8801d8cfe250 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
      RAX: ffff8801d88a8080 RBX: ffff8801d7389e40 RCX: 0000000000000006
      RDX: 0000000000000000 RSI: ffffffff868da4ad RDI: ffff8801c8a53277
      RBP: ffff8801d8cfe250 R08: ffff8801d88a8080 R09: ffff8801d8cfe3e8
      R10: ffffed003b19fc87 R11: ffff8801d8cfe43f R12: ffff8801c8a5327f
      R13: 0000000000000000 R14: ffff8801c8a4e5fe R15: ffff8801d8cfe3e8
      FS:  0000000000d88940(0000) GS:ffff8801daf00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: ffffffffff600400 CR3: 00000001acab3000 CR4: 00000000001406e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       _decode_session6+0xc1d/0x14f0 net/ipv6/xfrm6_policy.c:150
       __xfrm_decode_session+0x71/0x140 net/xfrm/xfrm_policy.c:2368
       xfrm_decode_session_reverse include/net/xfrm.h:1213 [inline]
       icmpv6_route_lookup+0x395/0x6e0 net/ipv6/icmp.c:372
       icmp6_send+0x1982/0x2da0 net/ipv6/icmp.c:551
       icmpv6_send+0x17a/0x300 net/ipv6/ip6_icmp.c:43
       ip6_input_finish+0x14e1/0x1a30 net/ipv6/ip6_input.c:305
       NF_HOOK include/linux/netfilter.h:288 [inline]
       ip6_input+0xe1/0x5e0 net/ipv6/ip6_input.c:327
       dst_input include/net/dst.h:450 [inline]
       ip6_rcv_finish+0x29c/0xa10 net/ipv6/ip6_input.c:71
       NF_HOOK include/linux/netfilter.h:288 [inline]
       ipv6_rcv+0xeb8/0x2040 net/ipv6/ip6_input.c:208
       __netif_receive_skb_core+0x2468/0x3650 net/core/dev.c:4646
       __netif_receive_skb+0x2c/0x1e0 net/core/dev.c:4711
       netif_receive_skb_internal+0x126/0x7b0 net/core/dev.c:4785
       napi_frags_finish net/core/dev.c:5226 [inline]
       napi_gro_frags+0x631/0xc40 net/core/dev.c:5299
       tun_get_user+0x3168/0x4290 drivers/net/tun.c:1951
       tun_chr_write_iter+0xb9/0x154 drivers/net/tun.c:1996
       call_write_iter include/linux/fs.h:1784 [inline]
       do_iter_readv_writev+0x859/0xa50 fs/read_write.c:680
       do_iter_write+0x185/0x5f0 fs/read_write.c:959
       vfs_writev+0x1c7/0x330 fs/read_write.c:1004
       do_writev+0x112/0x2f0 fs/read_write.c:1039
       __do_sys_writev fs/read_write.c:1112 [inline]
       __se_sys_writev fs/read_write.c:1109 [inline]
       __x64_sys_writev+0x75/0xb0 fs/read_write.c:1109
       do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Reported-by: syzbot+0053c8...@syzkaller.appspotmail.com
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      4cf1fbcd
    • Abhishek Sahu's avatar
      mtd: rawnand: fix return value check for bad block status · 693d06df
      Abhishek Sahu authored
      commit e9893e6f upstream.
      
      A positive return value from read_oob() is producing false BAD
      blocks. For some NAND controllers, OOB bytes are protected
      with ECC and read_oob() returns the number of bitflips.
      If there is any bitflip in the ECC-protected OOB bytes of the BAD block
      status page, that block is treated as BAD.
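The return-value convention described above can be sketched like this. The helper and its parameters are hypothetical, and the 0xFF "good block" marker is an assumption made for the example; the sketch only illustrates treating negative values as errors while letting a positive bitflip count fall through to the marker check.

```c
#include <stdint.h>

/*
 * read_ret: result of a hypothetical OOB read. Negative means a real
 *           error; zero or positive is the number of corrected bitflips.
 * marker:   bad block marker byte read from the OOB area
 *           (0xFF is assumed to mean "good" in this sketch).
 *
 * Returns a negative error, 0 for a good block, or 1 for a bad block.
 */
static int block_is_bad(int read_ret, uint8_t marker)
{
    /* Only negative values are failures; bitflip counts are not. */
    if (read_ret < 0)
        return read_ret;

    return marker != 0xFF;
}
```

Testing `if (read_ret)` instead of `if (read_ret < 0)` would return the bitflip count to the caller, which reads any nonzero result as "bad block".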
      
      Fixes: c120e75e ("mtd: nand: use read_oob() instead of cmdfunc() for bad block check")
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
      Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
      Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
      [backported to 4.14.y]
      Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      693d06df
    • Sean Nyekjaer's avatar
      ARM: dts: imx6q: Use correct SDMA script for SPI5 core · 0ed70f20
      Sean Nyekjaer authored
      commit df07101e upstream.
      
      According to the reference manual the shp_2_mcu / mcu_2_shp
      scripts must be used for devices connected through the SPBA.
      
      This fixes an issue we saw with DMA transfers.
      Sometimes the SPI controller RX FIFO was not empty after a DMA
      transfer and the driver got stuck in the next PIO transfer when
      it read one word more than expected.
      
      commit dd4b487b ("ARM: dts: imx6: Use correct SDMA script
      for SPI cores") is fixing the same issue but only for SPI1 - 4.
      
      Fixes: 67794025 ("ARM: dts: imx6q: enable dma for ecspi5")
      Signed-off-by: Sean Nyekjaer <sean.nyekjaer@prevas.dk>
      Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com>
      Signed-off-by: Shawn Guo <shawnguo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      0ed70f20
    • Taehee Yoo's avatar
      netfilter: nf_tables: use WARN_ON_ONCE instead of BUG_ON in nft_do_chain() · 259cc05c
      Taehee Yoo authored
      commit adc972c5 upstream.
      
      When the chain depth exceeds NFT_JUMP_STACK_SIZE, nft_do_chain()
      crashes. But there is no need to crash hard here.
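The general idea of warning once and failing softly, rather than halting, can be sketched in userspace. The stack type, the limit of 16, and the choice to return false to the caller are all assumptions made for this example; what the kernel does with the packet after the warning is not stated in the commit message.

```c
#include <stdbool.h>
#include <stdio.h>

#define JUMP_STACK_SIZE 16   /* hypothetical limit for this sketch */

struct jump_stack {
    int depth;
    int entries[JUMP_STACK_SIZE];
};

/*
 * Push a jump target. On overflow, print a warning the first time only
 * and report failure, instead of aborting the whole program.
 */
static bool push_jump(struct jump_stack *s, int rule)
{
    static bool warned;

    if (s->depth >= JUMP_STACK_SIZE) {
        if (!warned) {
            warned = true;
            fprintf(stderr, "jump stack overflow\n");
        }
        return false;
    }

    s->entries[s->depth++] = rule;
    return true;
}
```

The caller stays in control: it can drop the current unit of work and carry on, which is the trade the commit makes by replacing a hard crash with a one-time warning.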
      
      Suggested-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Acked-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      259cc05c
    • Vincent Bernat's avatar
      netfilter: ip6t_rpfilter: provide input interface for route lookup · 5acd6488
      Vincent Bernat authored
      commit cede24d1 upstream.
      
      In commit 47b7e7f8, this bit was removed at the same time the
      RT6_LOOKUP_F_IFACE flag was removed. However, it is needed when
      link-local addresses are used, which is a very common case: when
      packets are routed, neighbor solicitations are done using link-local
      addresses. For example, the following neighbor solicitation is not
      matched by "-m rpfilter":
      
          IP6 fe80::5254:33ff:fe00:1 > ff02::1:ff00:3: ICMP6, neighbor
          solicitation, who has 2001:db8::5254:33ff:fe00:3, length 32
      
      Commit 47b7e7f8 doesn't quite explain why we shouldn't use
      RT6_LOOKUP_F_IFACE in the rpfilter case. I suppose the interface check
      later in the function would make it redundant. However, the rest
      of the routing code is using RT6_LOOKUP_F_IFACE when there is no
      source address (which matches rpfilter's case with a non-unicast
      destination, like with neighbor solicitation).
      
      Signed-off-by: Vincent Bernat <vincent@bernat.im>
      Fixes: 47b7e7f8 ("netfilter: don't set F_IFACE on ipv6 fib lookups")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      5acd6488
    • Florian Westphal's avatar
      netfilter: don't set F_IFACE on ipv6 fib lookups · 3f8e85fb
      Florian Westphal authored
      commit 47b7e7f8 upstream.
      
      "fib" starts to behave strangely when an ipv6 default route is
      added - the FIB lookup returns a route using 'oif' in this case.
      
      This behaviour was inherited from ip6tables rpfilter so change
      this as well.
      
      Bugzilla: https://bugzilla.netfilter.org/show_bug.cgi?id=1221
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      3f8e85fb
    • NeilBrown's avatar
      md: remove special meaning of ->quiesce(.., 2) · 2fc45ef9
      NeilBrown authored
      commit b03e0ccb upstream.
      
      The '2' argument means "wake up anything that is waiting".
      This is an inelegant part of the design and was added
      to help support management of suspend_lo/suspend_hi setting.
      Now that suspend_lo/hi is managed in mddev_suspend/resume,
      that need is gone.
      There are still a couple of places where we call 'quiesce'
      with an argument of '2', but they can safely be changed to
      call ->quiesce(.., 1); ->quiesce(.., 0) which
      achieve the same result at the small cost of pausing IO
      briefly.
      
      This removes a small "optimization" from suspend_{hi,lo}_store,
      but it isn't clear that optimization served a useful purpose.
      The code now is a lot clearer.
      
      Suggested-by: Shaohua Li <shli@kernel.org>
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Shaohua Li <shli@fb.com>
      Signed-off-by: Jack Wang <jinpu.wang@profitbricks.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      2fc45ef9
    • NeilBrown's avatar
      md: allow metadata update while suspending. · ce57466d
      NeilBrown authored
      commit 35bfc521 upstream.
      
      There are various deadlocks that can occur
      when a thread holds reconfig_mutex and calls
      ->quiesce(mddev, 1).
      As some write requests block waiting for
      metadata to be updated (e.g. to record device
      failure), and as the md thread updates the metadata
      while the reconfig mutex is held, holding the mutex
      can stop write requests completing, and this prevents
      ->quiesce(mddev, 1) from completing.
      
      ->quiesce() is now usually called from mddev_suspend(),
      and it is always called with reconfig_mutex held.  So
      at this time it is safe for the thread to update metadata
      without explicitly taking the lock.
      
      So add 2 new flags, one which says that unlocked updates are
      allowed, and one which says one is happening.  Then allow it
      while the quiesce completes, and then wait for it to finish.
      
      Reported-and-tested-by: Xiao Ni <xni@redhat.com>
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Shaohua Li <shli@fb.com>
      Signed-off-by: Jack Wang <jinpu.wang@profitbricks.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      ce57466d
    • NeilBrown's avatar
      md: use mddev_suspend/resume instead of ->quiesce() · 7c435e22
      NeilBrown authored
      commit 9e1cc0a5 upstream.
      
      mddev_suspend() is a more general interface than
      calling ->quiesce() and so is more extensible.  A
      future patch will make use of this.
      
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Shaohua Li <shli@fb.com>
      Signed-off-by: Jack Wang <jinpu.wang@profitbricks.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      7c435e22
    • NeilBrown's avatar
      md: move suspend_hi/lo handling into core md code · feabea21
      NeilBrown authored
      commit b3143b9a upstream.
      
      Responding to ->suspend_lo and ->suspend_hi is similar
      to responding to ->suspended.  It is best to wait in
      the common core code without incrementing ->active_io.
      This allows mddev_suspend()/mddev_resume() to work while
      requests are waiting for suspend_lo/hi to change.
      This will be important after a subsequent patch
      which uses mddev_suspend() to synchronize updating of
      suspend_lo/hi.
      
      So move the code for testing suspend_lo/hi out of raid1.c
      and raid5.c, and place it in md.c
      
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Shaohua Li <shli@fb.com>
      Signed-off-by: Jack Wang <jinpu.wang@profitbricks.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      feabea21
    • NeilBrown's avatar
      md: don't call bitmap_create() while array is quiesced. · cc091f3f
      NeilBrown authored
      commit 52a0d49d upstream.
      
      bitmap_create() allocates memory with GFP_KERNEL and
      so can wait for IO.
      If called while the array is quiesced, it could wait indefinitely
      for write out to the array - deadlock.
      So call bitmap_create() before quiescing the array.
      
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Shaohua Li <shli@fb.com>
      Signed-off-by: Jack Wang <jinpu.wang@profitbricks.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      cc091f3f
    • NeilBrown's avatar
      md: always hold reconfig_mutex when calling mddev_suspend() · e44e4cf3
      NeilBrown authored
      commit 4d5324f7 upstream.
      
      Most often mddev_suspend() is called with
      reconfig_mutex held.  Make this a requirement in
      preparation for a subsequent patch.  Also require
      reconfig_mutex to be held for mddev_resume(),
      partly for symmetry and partly to guarantee
      no races with incr/decr of mddev->suspend.
      
      Taking the mutex in r5c_disable_writeback_async() is
      a little tricky as this is called from a work queue
      via log->disable_writeback_work, and flush_work()
      is called on that while holding ->reconfig_mutex.
      If the work item hasn't run before flush_work()
      is called, the work function will not be able to
      get the mutex.
      
      So we use mddev_trylock() inside the wait_event() call, and have that
      abort when conf->log is set to NULL, which happens before
      flush_work() is called.
      We wait in mddev->sb_wait and ensure this is woken
      when any of the conditions change.  This requires
      waking mddev->sb_wait in mddev_unlock().  This is only
      likely to trigger extra wake_ups of threads that needn't
      be woken when metadata is being written, and that
      doesn't happen often enough that the cost would be
      noticeable.
      
      Signed-off-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Shaohua Li <shli@fb.com>
      Signed-off-by: Jack Wang <jinpu.wang@profitbricks.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      e44e4cf3
    • Taehee Yoo's avatar
      netfilter: nf_tables: fix NULL-ptr in nf_tables_dump_obj() · b8d8cde4
      Taehee Yoo authored
      commit 360cc79d upstream.
      
      The table field in nft_obj_filter is not an array. In order to check
      tablename, we should check if the pointer is set.
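The pointer-versus-array distinction can be sketched as below. The struct and function are hypothetical stand-ins for the filter described above; the assumption is only that the name is held through a pointer that may be NULL when no table was requested.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical filter: `table` is a pointer, so it can be NULL. */
struct obj_filter {
    const char *table;
};

/*
 * Returns 1 if `name` passes the filter, 0 otherwise.
 * The "is a table name set?" test must look at the pointer itself.
 * Testing f->table[0], which is fine for an embedded char array,
 * would dereference NULL here.
 */
static int table_matches(const struct obj_filter *f, const char *name)
{
    if (f == NULL || f->table == NULL)
        return 1;   /* no table filter: everything matches */

    return strcmp(f->table, name) == 0;
}
```

When the field used to be a fixed-size array, checking its first byte was a valid emptiness test; once the storage becomes a pointer, the same expression turns into a NULL dereference whenever no name was supplied.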
      
      Test commands:
      
         %nft add table ip filter
         %nft add counter ip filter ct1
         %nft reset counters
      
      Splat looks like:
      
      [  306.510504] kasan: CONFIG_KASAN_INLINE enabled
      [  306.516184] kasan: GPF could be caused by NULL-ptr deref or user memory access
      [  306.524775] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
      [  306.528284] Modules linked in: nft_objref nft_counter nf_tables nfnetlink ip_tables x_tables
      [  306.528284] CPU: 0 PID: 1488 Comm: nft Not tainted 4.17.0-rc4+ #17
      [  306.528284] Hardware name: To be filled by O.E.M. To be filled by O.E.M./Aptio CRB, BIOS 5.6.5 07/08/2015
      [  306.528284] RIP: 0010:nf_tables_dump_obj+0x52c/0xa70 [nf_tables]
      [  306.528284] RSP: 0018:ffff8800b6cb7520 EFLAGS: 00010246
      [  306.528284] RAX: 0000000000000000 RBX: ffff8800b6c49820 RCX: 0000000000000000
      [  306.528284] RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffffed0016d96e9a
      [  306.528284] RBP: ffff8800b6cb75c0 R08: ffffed00236fce7c R09: ffffed00236fce7b
      [  306.528284] R10: ffffffff9f6241e8 R11: ffffed00236fce7c R12: ffff880111365108
      [  306.528284] R13: 0000000000000000 R14: ffff8800b6c49860 R15: ffff8800b6c49860
      [  306.528284] FS:  00007f838b007700(0000) GS:ffff88011b600000(0000) knlGS:0000000000000000
      [  306.528284] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  306.528284] CR2: 00007ffeafabcf78 CR3: 00000000b6cbe000 CR4: 00000000001006f0
      [  306.528284] Call Trace:
      [  306.528284]  netlink_dump+0x470/0xa20
      [  306.528284]  __netlink_dump_start+0x5ae/0x690
      [  306.528284]  ? nf_tables_getobj+0x1b3/0x740 [nf_tables]
      [  306.528284]  nf_tables_getobj+0x2f5/0x740 [nf_tables]
      [  306.528284]  ? nft_obj_notify+0x100/0x100 [nf_tables]
      [  306.528284]  ? nf_tables_getobj+0x740/0x740 [nf_tables]
      [  306.528284]  ? nf_tables_dump_flowtable_done+0x70/0x70 [nf_tables]
      [  306.528284]  ? nft_obj_notify+0x100/0x100 [nf_tables]
      [  306.528284]  nfnetlink_rcv_msg+0x8ff/0x932 [nfnetlink]
      [  306.528284]  ? nfnetlink_rcv_msg+0x216/0x932 [nfnetlink]
      [  306.528284]  netlink_rcv_skb+0x1c9/0x2f0
      [  306.528284]  ? nfnetlink_bind+0x1d0/0x1d0 [nfnetlink]
      [  306.528284]  ? debug_check_no_locks_freed+0x270/0x270
      [  306.528284]  ? netlink_ack+0x7a0/0x7a0
      [  306.528284]  ? ns_capable_common+0x6e/0x110
      [ ... ]
      
      Fixes: e46abbcc ("netfilter: nf_tables: Allow table names of up to 255 chars")
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Acked-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b8d8cde4
    • Florian Westphal's avatar
      netfilter: nf_tables: add missing netlink attrs to policies · 44956f98
      Florian Westphal authored
      commit 467697d2 upstream.
      
      Fixes: 8aeff920 ("netfilter: nf_tables: add stateful object reference to set elements")
      Fixes: f25ad2e9 ("netfilter: nf_tables: prepare for expressions associated to set elements")
      Fixes: 1a94e38d ("netfilter: nf_tables: add NFTA_RULE_ID attribute")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      44956f98
    • Colin Ian King's avatar
      netfilter: nf_tables: fix memory leak on error exit return · 082711fa
      Colin Ian King authored
      commit f0dfd7a2 upstream.
      
      Currently the -EBUSY error return path is not freeing resources
      allocated earlier, leaving a memory leak. Fix this by exiting via the
      error exit label err5 that performs the necessary resource clean
      up.
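The error-label pattern the fix relies on can be sketched in userspace. The function, the label name, and the allocation counter are hypothetical and exist only to make the leak observable; the sketch assumes a single resource, whereas the kernel function unwinds several.

```c
#include <errno.h>
#include <stdlib.h>

static int live_allocs;   /* demo-only counter of outstanding allocations */

/*
 * Allocate an element and store it in *slot, unless `conflict` is set,
 * in which case fail with -EBUSY. Every failure after the allocation
 * leaves through the cleanup label rather than a bare return.
 */
static int add_element(char **slot, int conflict)
{
    int err;
    char *elem = malloc(32);

    if (elem == NULL)
        return -ENOMEM;
    live_allocs++;

    if (conflict) {
        err = -EBUSY;
        goto err_free;   /* a plain "return -EBUSY" here would leak elem */
    }

    *slot = elem;        /* ownership passes to the caller */
    return 0;

err_free:
    free(elem);
    live_allocs--;
    return err;
}
```

Routing each late failure through one label keeps the release logic in a single place, so a newly added error path cannot forget it.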
      
      Detected by CoverityScan, CID#1432975 ("Resource leak")
      
      Fixes: 9744a6fc ("netfilter: nf_tables: check if same extensions are set when adding elements")
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      082711fa
    • Taehee Yoo's avatar
      netfilter: nf_tables: increase nft_counters_enabled in nft_chain_stats_replace() · 174757e2
      Taehee Yoo authored
      commit bbb8c61f upstream.
      
      When a chain is updated, a counter can be attached. If so,
      nft_counters_enabled should be incremented.
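The imbalance behind the "negative count" warning can be sketched with a plain integer standing in for the static key. Everything here is hypothetical, including the assumption that the increment belongs only on the path where a chain gains counters for the first time; the sketch just shows that the destroy path's decrement needs a matching increment on attach.

```c
/* Stand-in for the static key's enable count. */
static int enabled_count;

/* Hypothetical chain that may or may not carry counters. */
struct chain {
    int has_stats;
};

/* Attach counters on update; count the chain only the first time. */
static void chain_set_stats(struct chain *c)
{
    if (!c->has_stats) {
        c->has_stats = 1;
        enabled_count++;   /* the increment that was missing */
    }
}

/* Destroy path: drops the count taken when counters were attached. */
static int chain_destroy(struct chain *c)
{
    if (c->has_stats) {
        c->has_stats = 0;
        enabled_count--;
    }
    return enabled_count;
}
```

Without the increment in the attach path, the destroy path would take the count from 0 to -1, which corresponds to the warning shown in the log.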
      
      test commands:
      
         %nft add table ip filter
         %nft add chain ip filter input { type filter hook input priority 4\; }
         %iptables-compat -Z input
         %nft delete chain ip filter input
      
      We then see the messages below.
      
      [  286.443720] jump label: negative count!
      [  286.448278] WARNING: CPU: 0 PID: 1459 at kernel/jump_label.c:197 __static_key_slow_dec_cpuslocked+0x6f/0xf0
      [  286.449144] Modules linked in: nf_tables nfnetlink ip_tables x_tables
      [  286.449144] CPU: 0 PID: 1459 Comm: nft Tainted: G        W         4.17.0-rc2+ #12
      [  286.449144] RIP: 0010:__static_key_slow_dec_cpuslocked+0x6f/0xf0
      [  286.449144] RSP: 0018:ffff88010e5176f0 EFLAGS: 00010286
      [  286.449144] RAX: 000000000000001b RBX: ffffffffc0179500 RCX: ffffffffb8a82522
      [  286.449144] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff88011b7e5eac
      [  286.449144] RBP: 0000000000000000 R08: ffffed00236fce5c R09: ffffed00236fce5b
      [  286.449144] R10: ffffffffc0179503 R11: ffffed00236fce5c R12: 0000000000000000
      [  286.449144] R13: ffff88011a28e448 R14: ffff88011a28e470 R15: dffffc0000000000
      [  286.449144] FS:  00007f0384328700(0000) GS:ffff88011b600000(0000) knlGS:0000000000000000
      [  286.449144] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  286.449144] CR2: 00007f038394bf10 CR3: 0000000104a86000 CR4: 00000000001006f0
      [  286.449144] Call Trace:
      [  286.449144]  static_key_slow_dec+0x6a/0x70
      [  286.449144]  nf_tables_chain_destroy+0x19d/0x210 [nf_tables]
      [  286.449144]  nf_tables_commit+0x1891/0x1c50 [nf_tables]
      [  286.449144]  nfnetlink_rcv+0x1148/0x13d0 [nfnetlink]
      [ ... ]
      
      Signed-off-by: Taehee Yoo <ap420073@gmail.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      174757e2