1. 15 Mar, 2018 23 commits
    • tpm: only attempt to disable the LPC CLKRUN if is already enabled · 7b6f41b7
      Javier Martinez Canillas authored
      commit 6c9f0ce0 upstream.
      
      Commit 5e572cab ("tpm: Enable CLKRUN protocol for Braswell systems")
      added logic in the TPM TIS driver to disable the Low Pin Count CLKRUN
      signal during TPM transactions.
      
      Unfortunately this breaks other devices that are attached to the LPC bus
      like for example PS/2 mouse and keyboards.
      
      One flaw with the logic is that it assumes that the CLKRUN is always
      enabled, and so it unconditionally enables it after a TPM transaction.
      
      But it could be that the CLKRUN# signal was already disabled in the LPC
      bus and so after the driver probes, CLKRUN_EN will remain enabled which
      may break other devices that are attached to the LPC bus but don't have
      support for the CLKRUN protocol.
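
The idea above can be sketched as follows. This is only a minimal illustration, not the driver code: the register layout, the `CLKRUN_EN` bit position, and all function names here are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit position; the real LPC register layout is not shown here. */
#define CLKRUN_EN (1u << 2)

struct lpc_bus {
    uint32_t lpc_cntl;      /* stands in for the LPC control register */
};

struct clkrun_state {
    bool was_enabled;       /* CLKRUN_EN state seen before the transaction */
};

/* Disable CLKRUN for a TPM transaction, remembering whether it was on. */
static void clkrun_disable_for_tpm(struct lpc_bus *bus, struct clkrun_state *st)
{
    st->was_enabled = (bus->lpc_cntl & CLKRUN_EN) != 0;
    if (st->was_enabled)
        bus->lpc_cntl &= ~CLKRUN_EN;
}

/* Re-enable CLKRUN only if this code was the one that turned it off. */
static void clkrun_restore_after_tpm(struct lpc_bus *bus,
                                     const struct clkrun_state *st)
{
    if (st->was_enabled)
        bus->lpc_cntl |= CLKRUN_EN;
}

/* Run one simulated transaction and return the register value afterwards. */
uint32_t tpm_transaction(uint32_t initial_cntl)
{
    struct lpc_bus bus = { .lpc_cntl = initial_cntl };
    struct clkrun_state st;

    clkrun_disable_for_tpm(&bus, &st);
    assert((bus.lpc_cntl & CLKRUN_EN) == 0);  /* off during the transfer */
    clkrun_restore_after_tpm(&bus, &st);
    return bus.lpc_cntl;
}
```

With this shape, a bus that started with CLKRUN_EN cleared ends with it cleared, so devices without CLKRUN support are left alone.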
      
      Fixes: 5e572cab ("tpm: Enable CLKRUN protocol for Braswell systems")
      Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
      Tested-by: James Ettle <james@ettle.org.uk>
      Tested-by: Jeffery Miller <jmiller@neverware.com>
      Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tpm: remove unused variables · 30c3b70e
      Arnd Bergmann authored
      commit 68021bf4 upstream.
      
      The CLKRUN fix caused a few harmless compile-time warnings:
      
      drivers/char/tpm/tpm_tis.c: In function 'tpm_tis_pnp_remove':
      drivers/char/tpm/tpm_tis.c:274:23: error: unused variable 'priv' [-Werror=unused-variable]
      drivers/char/tpm/tpm_tis.c: In function 'tpm_tis_plat_remove':
      drivers/char/tpm/tpm_tis.c:324:23: error: unused variable 'priv' [-Werror=unused-variable]
      
      This removes the variables that have now become unused.
      
      Fixes: 6d0866cbc2d3 ("tpm: Keep CLKRUN enabled throughout the duration of transmit_cmd()")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Reviewed-by: James Morris <jmorris@namei.org>
      Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tpm: delete the TPM_TIS_CLK_ENABLE flag · 1ef7d99c
      Javier Martinez Canillas authored
      commit 764325ad upstream.
      
      This flag is only used to warn if CLKRUN_EN wasn't disabled on Braswell
      systems, but the only way this can happen is if the code is not correct.
      
      So it's an unnecessary check that just makes the code harder to read.

      Suggested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
      Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tpm: Keep CLKRUN enabled throughout the duration of transmit_cmd() · 7cea3381
      Azhar Shaikh authored
      commit b3e958ce upstream.
      
      Commit 5e572cab ("tpm: Enable CLKRUN protocol for Braswell
      systems") disabled the CLKRUN protocol during TPM transactions and
      re-enabled it once the transaction completed. But there were still some
      corner cases where reading the TPM header failed for the savestate
      command while going to suspend, which resulted in suspend failure.
      To fix this issue, keep the CLKRUN protocol disabled for the entire
      duration of a single TPM command instead of disabling and re-enabling
      it for every TPM transaction. For the other TPM accesses outside the
      TPM command flow, disable and re-enable the CLKRUN protocol at a
      higher level instead of doing so for every TPM transaction.
      
      Fixes: 5e572cab ("tpm: Enable CLKRUN protocol for Braswell systems")
      Signed-off-by: Azhar Shaikh <azhar.shaikh@intel.com>
      Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tpm_tis: Move ilb_base_addr to tpm_tis_data · f1bb2393
      Azhar Shaikh authored
      commit c382babc upstream.
      
      Move static variable ilb_base_addr to tpm_tis_data.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Azhar Shaikh <azhar.shaikh@intel.com>
      Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • netfilter: use skb_to_full_sk in ip6_route_me_harder · 9131a1b3
      Eric Dumazet authored
      commit 7d98386d upstream.
      
      For some reason, Florian forgot to apply to ip6_route_me_harder
      the fix that went in commit 29e09229 ("netfilter: use
      skb_to_full_sk in ip_route_me_harder")
      
      Fixes: ca6fb065 ("tcp: attach SYNACK messages to request sockets instead of listener")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • netfilter: ipv6: fix use-after-free Write in nf_nat_ipv6_manip_pkt · 39f154fa
      Florian Westphal authored
      commit b078556a upstream.
      
      l4proto->manip_pkt() can cause reallocation of skb head so pointer
      to the ipv6 header must be reloaded.
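
The pattern can be shown with a toy buffer in plain C. This is a sketch under stated assumptions: `struct pkt`, `pkt_grow()` and the other names are hypothetical stand-ins, not the skb API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy packet buffer; stands in for an skb whose head may be reallocated. */
struct pkt {
    uint8_t *head;
    size_t len;
};

/* Hypothetical helper: grows the buffer, so p->head may move. */
static int pkt_grow(struct pkt *p, size_t extra)
{
    uint8_t *n = realloc(p->head, p->len + extra);

    if (!n)
        return -1;              /* old buffer is still valid on failure */
    memset(n + p->len, 0, extra);
    p->head = n;
    p->len += extra;
    return 0;
}

/* Returns the previous first byte, or -1 if the buffer could not grow. */
int rewrite_first_byte(struct pkt *p, uint8_t value)
{
    uint8_t *hdr;
    int old;

    assert(p != NULL && p->head != NULL && p->len > 0);
    hdr = p->head;              /* pointer taken before the call */
    old = hdr[0];

    if (pkt_grow(p, 4096) < 0)
        return -1;

    hdr = p->head;              /* reload: the earlier pointer may be stale */
    hdr[0] = value;
    return old;
}

/* Builds a 16-byte packet, rewrites byte 0, and reports whether it stuck. */
int demo_reload(void)
{
    struct pkt p = { .head = calloc(1, 16), .len = 16 };
    int ok;

    if (!p.head)
        return 0;
    ok = rewrite_first_byte(&p, 0x60) == 0 &&
         p.head[0] == 0x60 && p.len == 16 + 4096;
    free(p.head);
    return ok;
}
```

Writing through the pointer taken before `pkt_grow()` would be the use-after-free; re-deriving it from `p->head` afterwards avoids that.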
      
      Reported-and-tested-by: <syzbot+10005f4292fc9cc89de7@syzkaller.appspotmail.com>
      Fixes: 58a317f1 ("netfilter: ipv6: add IPv6 NAT support")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • netfilter: bridge: ebt_among: add missing match size checks · 2d7e0700
      Florian Westphal authored
      commit c4585a28 upstream.
      
      ebt_among is special, it has a dynamic match size and is exempt
      from the central size checks.
      
      Therefore it must check that the size of the match structure
      provided from userspace is sane by making sure em->match_size
      is at least the minimum size of the expected structure.
      
      The module has such a check, but it's only done after accessing
      a structure that might be out of bounds.
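
The ordering matters more than the check itself. A minimal sketch, assuming a made-up `struct among_info` layout (the real ebt_among structure is not reproduced here):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, loosely modelled on a variable-size match blob. */
struct among_info {
    uint32_t dst_ofs;
    uint32_t src_ofs;
    uint32_t bitmask;
};

/* Validate a blob of match_size bytes. Returns 0 if sane, -1 otherwise. */
int among_check(const void *blob, size_t match_size)
{
    const struct among_info *info = blob;

    assert(blob != NULL);

    /* Size check comes first: no field of *info is read before it. */
    if (match_size < sizeof(*info))
        return -1;

    /* Only now is it safe to look at offsets stored inside the blob. */
    if (info->dst_ofs > match_size || info->src_ofs > match_size)
        return -1;

    return 0;
}
```

Swapping the two `if` statements would reproduce the problem described above: the offsets would be read from memory that userspace never supplied.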
      
      tested with: ebtables -A INPUT ... \
      --among-dst fe:fe:fe:fe:fe:fe
      --among-dst fe:fe:fe:fe:fe:fe --among-src fe:fe:fe:fe:ff:f,fe:fe:fe:fe:fe:fb,fe:fe:fe:fe:fc:fd,fe:fe:fe:fe:fe:fd,fe:fe:fe:fe:fe:fe
      --among-src fe:fe:fe:fe:ff:f,fe:fe:fe:fe:fe:fa,fe:fe:fe:fe:fe:fd,fe:fe:fe:fe:fe:fe,fe:fe:fe:fe:fe:fe
      
      Reported-by: <syzbot+fe0b19af568972814355@syzkaller.appspotmail.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • netfilter: ebtables: CONFIG_COMPAT: don't trust userland offsets · eaa06bfb
      Florian Westphal authored
      commit b7181216 upstream.
      
      We need to make sure the offsets are not out of range of the
      total size.
      Also check that they are in ascending order.
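
The two checks described above could look roughly like this. It is a sketch only: the function name and signature are hypothetical, and "ascending" is assumed here to allow equal neighbouring offsets.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Returns 0 when every offset lies within total_size and the sequence
 * never decreases; returns -1 on the first violation.
 */
int check_offsets(const unsigned int *offsets, size_t n,
                  unsigned int total_size)
{
    unsigned int prev = 0;
    size_t i;

    assert(offsets != NULL || n == 0);

    for (i = 0; i < n; i++) {
        if (offsets[i] > total_size)
            return -1;          /* points past the end of the blob */
        if (offsets[i] < prev)
            return -1;          /* out of order */
        prev = offsets[i];
    }
    return 0;
}
```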
      
      The WARN_ON triggered by syzkaller (it sets panic_on_warn) is
      changed to also bail out, no point in continuing parsing.
      
      Briefly tested with simple ruleset of
      -A INPUT --limit 1/s --log
      plus jump to custom chains using 32bit ebtables binary.
      
      Reported-by: <syzbot+845a53d13171abf8bf29@syzkaller.appspotmail.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • netfilter: IDLETIMER: be syzkaller friendly · c89e04e5
      Eric Dumazet authored
      commit cfc2c740 upstream.
      
      We had one report from syzkaller [1]
      
      First issue is that INIT_WORK() should be done before mod_timer()
      or we risk timer being fired too soon, even with a 1 second timer.
      
      Second issue is that we need to reject too big info->timeout
      to avoid overflows in msecs_to_jiffies(info->timeout * 1000), or
      risk looping, if result after overflow is 0.
      
      [1]
      WARNING: CPU: 1 PID: 5129 at kernel/workqueue.c:1444 __queue_work+0xdf4/0x1230 kernel/workqueue.c:1444
      Kernel panic - not syncing: panic_on_warn set ...
      
      CPU: 1 PID: 5129 Comm: syzkaller159866 Not tainted 4.16.0-rc1+ #230
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       <IRQ>
       __dump_stack lib/dump_stack.c:17 [inline]
       dump_stack+0x194/0x257 lib/dump_stack.c:53
       panic+0x1e4/0x41c kernel/panic.c:183
       __warn+0x1dc/0x200 kernel/panic.c:547
       report_bug+0x211/0x2d0 lib/bug.c:184
       fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:178
       fixup_bug arch/x86/kernel/traps.c:247 [inline]
       do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296
       do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
       invalid_op+0x22/0x40 arch/x86/entry/entry_64.S:988
      RIP: 0010:__queue_work+0xdf4/0x1230 kernel/workqueue.c:1444
      RSP: 0018:ffff8801db507538 EFLAGS: 00010006
      RAX: ffff8801aeb46080 RBX: ffff8801db530200 RCX: ffffffff81481404
      RDX: 0000000000000100 RSI: ffffffff86b42640 RDI: 0000000000000082
      RBP: ffff8801db507758 R08: 1ffff1003b6a0de5 R09: 000000000000000c
      R10: ffff8801db5073f0 R11: 0000000000000020 R12: 1ffff1003b6a0eb6
      R13: ffff8801b1067ae0 R14: 00000000000001f8 R15: dffffc0000000000
       queue_work_on+0x16a/0x1c0 kernel/workqueue.c:1488
       queue_work include/linux/workqueue.h:488 [inline]
       schedule_work include/linux/workqueue.h:546 [inline]
       idletimer_tg_expired+0x44/0x60 net/netfilter/xt_IDLETIMER.c:116
       call_timer_fn+0x228/0x820 kernel/time/timer.c:1326
       expire_timers kernel/time/timer.c:1363 [inline]
       __run_timers+0x7ee/0xb70 kernel/time/timer.c:1666
       run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692
       __do_softirq+0x2d7/0xb85 kernel/softirq.c:285
       invoke_softirq kernel/softirq.c:365 [inline]
       irq_exit+0x1cc/0x200 kernel/softirq.c:405
       exiting_irq arch/x86/include/asm/apic.h:541 [inline]
       smp_apic_timer_interrupt+0x16b/0x700 arch/x86/kernel/apic/apic.c:1052
       apic_timer_interrupt+0xa9/0xb0 arch/x86/entry/entry_64.S:829
       </IRQ>
      RIP: 0010:arch_local_irq_restore arch/x86/include/asm/paravirt.h:777 [inline]
      RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160 [inline]
      RIP: 0010:_raw_spin_unlock_irqrestore+0x5e/0xba kernel/locking/spinlock.c:184
      RSP: 0018:ffff8801c20173c8 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff12
      RAX: dffffc0000000000 RBX: 0000000000000282 RCX: 0000000000000006
      RDX: 1ffffffff0d592cd RSI: 1ffff10035d68d23 RDI: 0000000000000282
      RBP: ffff8801c20173d8 R08: 1ffff10038402e47 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff8820e5c8
      R13: ffff8801b1067ad8 R14: ffff8801aea7c268 R15: ffff8801aea7c278
       __debug_object_init+0x235/0x1040 lib/debugobjects.c:378
       debug_object_init+0x17/0x20 lib/debugobjects.c:391
       __init_work+0x2b/0x60 kernel/workqueue.c:506
       idletimer_tg_create net/netfilter/xt_IDLETIMER.c:152 [inline]
       idletimer_tg_checkentry+0x691/0xb00 net/netfilter/xt_IDLETIMER.c:213
       xt_check_target+0x22c/0x7d0 net/netfilter/x_tables.c:850
       check_target net/ipv6/netfilter/ip6_tables.c:533 [inline]
       find_check_entry.isra.7+0x935/0xcf0 net/ipv6/netfilter/ip6_tables.c:575
       translate_table+0xf52/0x1690 net/ipv6/netfilter/ip6_tables.c:744
       do_replace net/ipv6/netfilter/ip6_tables.c:1160 [inline]
       do_ip6t_set_ctl+0x370/0x5f0 net/ipv6/netfilter/ip6_tables.c:1686
       nf_sockopt net/netfilter/nf_sockopt.c:106 [inline]
       nf_setsockopt+0x67/0xc0 net/netfilter/nf_sockopt.c:115
       ipv6_setsockopt+0x10b/0x130 net/ipv6/ipv6_sockglue.c:927
       udpv6_setsockopt+0x45/0x80 net/ipv6/udp.c:1422
       sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2976
       SYSC_setsockopt net/socket.c:1850 [inline]
       SyS_setsockopt+0x189/0x360 net/socket.c:1829
       do_syscall_64+0x282/0x940 arch/x86/entry/common.c:287
      
      Fixes: 0902b469 ("netfilter: xtables: idletimer target implementation")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzkaller <syzkaller@googlegroups.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • netfilter: nat: cope with negative port range · 53f94e61
      Paolo Abeni authored
      commit db57ccf0 upstream.
      
      syzbot reported a division by 0 bug in the netfilter nat code:
      
      divide error: 0000 [#1] SMP KASAN
      Dumping ftrace buffer:
          (ftrace buffer empty)
      Modules linked in:
      CPU: 1 PID: 4168 Comm: syzkaller034710 Not tainted 4.16.0-rc1+ #309
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
      Google 01/01/2011
      RIP: 0010:nf_nat_l4proto_unique_tuple+0x291/0x530
      net/netfilter/nf_nat_proto_common.c:88
      RSP: 0018:ffff8801b2466778 EFLAGS: 00010246
      RAX: 000000000000f153 RBX: ffff8801b2466dd8 RCX: ffff8801b2466c7c
      RDX: 0000000000000000 RSI: ffff8801b2466c58 RDI: ffff8801db5293ac
      RBP: ffff8801b24667d8 R08: ffff8801b8ba6dc0 R09: ffffffff88af5900
      R10: ffff8801b24666f0 R11: 0000000000000000 R12: 000000002990f153
      R13: 0000000000000001 R14: 0000000000000000 R15: ffff8801b2466c7c
      FS:  00000000017e3880(0000) GS:ffff8801db500000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000208fdfe4 CR3: 00000001b5340002 CR4: 00000000001606e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
        dccp_unique_tuple+0x40/0x50 net/netfilter/nf_nat_proto_dccp.c:30
        get_unique_tuple+0xc28/0x1c10 net/netfilter/nf_nat_core.c:362
        nf_nat_setup_info+0x1c2/0xe00 net/netfilter/nf_nat_core.c:406
        nf_nat_redirect_ipv6+0x306/0x730 net/netfilter/nf_nat_redirect.c:124
        redirect_tg6+0x7f/0xb0 net/netfilter/xt_REDIRECT.c:34
        ip6t_do_table+0xc2a/0x1a30 net/ipv6/netfilter/ip6_tables.c:365
        ip6table_nat_do_chain+0x65/0x80 net/ipv6/netfilter/ip6table_nat.c:41
        nf_nat_ipv6_fn+0x594/0xa80 net/ipv6/netfilter/nf_nat_l3proto_ipv6.c:302
        nf_nat_ipv6_local_fn+0x33/0x5d0
      net/ipv6/netfilter/nf_nat_l3proto_ipv6.c:407
        ip6table_nat_local_fn+0x2c/0x40 net/ipv6/netfilter/ip6table_nat.c:69
        nf_hook_entry_hookfn include/linux/netfilter.h:120 [inline]
        nf_hook_slow+0xba/0x1a0 net/netfilter/core.c:483
        nf_hook include/linux/netfilter.h:243 [inline]
        NF_HOOK include/linux/netfilter.h:286 [inline]
        ip6_xmit+0x10ec/0x2260 net/ipv6/ip6_output.c:277
        inet6_csk_xmit+0x2fc/0x580 net/ipv6/inet6_connection_sock.c:139
        dccp_transmit_skb+0x9ac/0x10f0 net/dccp/output.c:142
        dccp_connect+0x369/0x670 net/dccp/output.c:564
        dccp_v6_connect+0xe17/0x1bf0 net/dccp/ipv6.c:946
        __inet_stream_connect+0x2d4/0xf00 net/ipv4/af_inet.c:620
        inet_stream_connect+0x58/0xa0 net/ipv4/af_inet.c:684
        SYSC_connect+0x213/0x4a0 net/socket.c:1639
        SyS_connect+0x24/0x30 net/socket.c:1620
        do_syscall_64+0x282/0x940 arch/x86/entry/common.c:287
        entry_SYSCALL_64_after_hwframe+0x26/0x9b
      RIP: 0033:0x441c69
      RSP: 002b:00007ffe50cc0be8 EFLAGS: 00000217 ORIG_RAX: 000000000000002a
      RAX: ffffffffffffffda RBX: ffffffffffffffff RCX: 0000000000441c69
      RDX: 000000000000001c RSI: 00000000208fdfe4 RDI: 0000000000000003
      RBP: 00000000006cc018 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000538 R11: 0000000000000217 R12: 0000000000403590
      R13: 0000000000403620 R14: 0000000000000000 R15: 0000000000000000
      Code: 48 89 f0 83 e0 07 83 c0 01 38 d0 7c 08 84 d2 0f 85 46 02 00 00 48 8b
      45 c8 44 0f b7 20 e8 88 97 04 fd 31 d2 41 0f b7 c4 4c 89 f9 <41> f7 f6 48
      c1 e9 03 48 b8 00 00 00 00 00 fc ff df 0f b6 0c 01
      RIP: nf_nat_l4proto_unique_tuple+0x291/0x530
      net/netfilter/nf_nat_proto_common.c:88 RSP: ffff8801b2466778
      
      The problem is that currently we don't have any check on the
      configured port range. A port range == -1 triggers the bug, while
      other negative values may require a very long time to complete the
      following loop.
      
      This commit addresses the issue by swapping the two ends on negative
      ranges. The check is performed in nf_nat_l4proto_unique_tuple() since
      the nft nat loads the port values from nft registers at runtime.
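
A reduced model of the swap, for illustration only. `pick_port()` and its seed parameter are assumptions; the real function works on conntrack tuples rather than bare integers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pick a port from the configured range using seed as the offset.
 * If the range is reversed, swap the ends first, so the range size
 * can never become 0 (the divide error) or wrap to a huge value.
 */
uint16_t pick_port(uint16_t min, uint16_t max, uint32_t seed)
{
    uint32_t range_size;

    if (max < min) {
        uint16_t tmp = min;

        min = max;
        max = tmp;
    }

    range_size = (uint32_t)max - min + 1;
    assert(range_size >= 1);    /* holds for every input after the swap */

    return (uint16_t)(min + seed % range_size);
}
```

Without the swap, `min = 2000, max = 1999` gives `max - min + 1 == 0`, which is the "port range == -1" case from the report.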
      
      v1 -> v2: use the correct 'Fixes' tag
      v2 -> v3: update commit message, drop unneeded READ_ONCE()
      
      Fixes: 5b1158e9 ("[NETFILTER]: Add NAT support for nf_conntrack")
      Reported-by: syzbot+8012e198bd037f4871e5@syzkaller.appspotmail.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • netfilter: x_tables: fix missing timer initialization in xt_LED · ab737b02
      Paolo Abeni authored
      commit 10414014 upstream.
      
      syzbot reported that xt_LED may try to use the ledinternal->timer
      without previously initializing it:
      
      ------------[ cut here ]------------
      kernel BUG at kernel/time/timer.c:958!
      invalid opcode: 0000 [#1] SMP KASAN
      Dumping ftrace buffer:
          (ftrace buffer empty)
      Modules linked in:
      CPU: 1 PID: 1826 Comm: kworker/1:2 Not tainted 4.15.0+ #306
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
      Google 01/01/2011
      Workqueue: ipv6_addrconf addrconf_dad_work
      RIP: 0010:__mod_timer kernel/time/timer.c:958 [inline]
      RIP: 0010:mod_timer+0x7d6/0x13c0 kernel/time/timer.c:1102
      RSP: 0018:ffff8801d24fe9f8 EFLAGS: 00010293
      RAX: ffff8801d25246c0 RBX: ffff8801aec6cb50 RCX: ffffffff816052c6
      RDX: 0000000000000000 RSI: 00000000fffbd14b RDI: ffff8801aec6cb68
      RBP: ffff8801d24fec98 R08: 0000000000000000 R09: 1ffff1003a49fd6c
      R10: ffff8801d24feb28 R11: 0000000000000005 R12: dffffc0000000000
      R13: ffff8801d24fec70 R14: 00000000fffbd14b R15: ffff8801af608f90
      FS:  0000000000000000(0000) GS:ffff8801db500000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000206d6fd0 CR3: 0000000006a22001 CR4: 00000000001606e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
        led_tg+0x1db/0x2e0 net/netfilter/xt_LED.c:75
        ip6t_do_table+0xc2a/0x1a30 net/ipv6/netfilter/ip6_tables.c:365
        ip6table_raw_hook+0x65/0x80 net/ipv6/netfilter/ip6table_raw.c:42
        nf_hook_entry_hookfn include/linux/netfilter.h:120 [inline]
        nf_hook_slow+0xba/0x1a0 net/netfilter/core.c:483
        nf_hook.constprop.27+0x3f6/0x830 include/linux/netfilter.h:243
        NF_HOOK include/linux/netfilter.h:286 [inline]
        ndisc_send_skb+0xa51/0x1370 net/ipv6/ndisc.c:491
        ndisc_send_ns+0x38a/0x870 net/ipv6/ndisc.c:633
        addrconf_dad_work+0xb9e/0x1320 net/ipv6/addrconf.c:4008
        process_one_work+0xbbf/0x1af0 kernel/workqueue.c:2113
        worker_thread+0x223/0x1990 kernel/workqueue.c:2247
        kthread+0x33c/0x400 kernel/kthread.c:238
        ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:429
      Code: 85 2a 0b 00 00 4d 8b 3c 24 4d 85 ff 75 9f 4c 8b bd 60 fd ff ff e8 bb
      57 10 00 65 ff 0d 94 9a a1 7e e9 d9 fc ff ff e8 aa 57 10 00 <0f> 0b e8 a3
      57 10 00 e9 14 fb ff ff e8 99 57 10 00 4c 89 bd 70
      RIP: __mod_timer kernel/time/timer.c:958 [inline] RSP: ffff8801d24fe9f8
      RIP: mod_timer+0x7d6/0x13c0 kernel/time/timer.c:1102 RSP: ffff8801d24fe9f8
      ---[ end trace f661ab06f5dd8b3d ]---
      
      The ledinternal struct can be shared between several different
      xt_LED targets, but the related timer is currently initialized only
      if the first target requires it. Fix it by unconditionally
      initializing the timer struct.
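
A toy model of the sharing problem, with every name invented for the example (`toy_timer`, `led_shared` and friends are not kernel types):

```c
#include <assert.h>
#include <stdbool.h>

struct toy_timer {
    bool initialized;
    unsigned long expires;
};

/* State shared by all targets that refer to the same LED trigger. */
struct led_shared {
    int refcnt;
    struct toy_timer timer;
};

static void toy_timer_setup(struct toy_timer *t)
{
    t->initialized = true;
    t->expires = 0;
}

/* Arming an uninitialised timer is the reported bug; -1 stands in for it. */
static int toy_mod_timer(struct toy_timer *t, unsigned long expires)
{
    if (!t->initialized)
        return -1;
    t->expires = expires;
    return 0;
}

/* The first target creates the shared state and always sets up the timer. */
static void led_shared_create(struct led_shared *s, unsigned long first_delay)
{
    (void)first_delay;          /* deliberately not consulted here */
    s->refcnt = 1;
    toy_timer_setup(&s->timer);
}

/* First target needs no timer (delay 0); a second, delayed target arms it. */
int demo_second_target_arms_timer(void)
{
    struct led_shared s = { 0 };

    led_shared_create(&s, 0);
    s.refcnt++;
    assert(s.refcnt == 2);
    return toy_mod_timer(&s.timer, 100);
}
```

If `led_shared_create()` only set up the timer when `first_delay` was non-zero, the second target's call would hit the uninitialised case.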
      
      v1 -> v2: call del_timer_sync() unconditionally, too.
      
      Fixes: 268cb38e ("netfilter: x_tables: add LED trigger target")
      Reported-by: syzbot+10c98dc5725c6c8fc7fb@syzkaller.appspotmail.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • netfilter: xt_hashlimit: fix lock imbalance · 2a7ebc07
      Eric Dumazet authored
      commit de526f40 upstream.
      
      syzkaller found that rcu was not held in hashlimit_mt_common().
      
      We only need to enable BH at this point.
      
      Fixes: bea74641 ("netfilter: xt_hashlimit: add rate match mode")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzkaller <syzkaller@googlegroups.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • netfilter: ipt_CLUSTERIP: fix a race condition of proc file creation · 4514a597
      Cong Wang authored
      commit b3e456fc upstream.
      
      There is a race condition between clusterip_config_entry_put()
      and clusterip_config_init(). After we release the spinlock in
      clusterip_config_entry_put(), a new proc file with the same IP could
      be created immediately, since the entry is already removed from the
      configs list. This triggers the following warning:
      
      ------------[ cut here ]------------
      proc_dir_entry 'ipt_CLUSTERIP/172.20.0.170' already registered
      WARNING: CPU: 1 PID: 4152 at fs/proc/generic.c:330 proc_register+0x2a4/0x370 fs/proc/generic.c:329
      Kernel panic - not syncing: panic_on_warn set ...
      
      As a quick fix, just move the proc_remove() inside the spinlock.
      
      Reported-by: <syzbot+03218bcdba6aa76441a3@syzkaller.appspotmail.com>
      Fixes: 6c5d5cfb ("netfilter: ipt_CLUSTERIP: check duplicate config when initializing")
      Tested-by: Paolo Abeni <pabeni@redhat.com>
      Cc: Xin Long <lucien.xin@gmail.com>
      Cc: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Reviewed-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • netfilter: add back stackpointer size checks · 638c2e4e
      Florian Westphal authored
      commit 57ebd808 upstream.
      
      The rationale for removing the check is only correct for rulesets
      generated by ip(6)tables.
      
      In iptables, a jump can only occur to a user-defined chain, i.e.
      because we size the stack based on number of user-defined chains we
      cannot exceed stack size.
      
      However, the underlying binary format has no such restriction,
      and the validation step only ensures that the jump target is a
      valid rule start point.
      
      IOW, it's possible to build a rule blob that has no user-defined
      chains but does contain a jump.
      
      If this happens, no jump stack gets allocated and a crash occurs
      when the jump is taken.
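
A bounds-checked push is enough to illustrate the check being added back. This is a sketch with hypothetical names; how the real traversal code reacts to a full or missing stack is not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Toy jump stack: size is 0 when no stack was allocated at all. */
struct jumpstack {
    const void **slots;
    unsigned int size;
    unsigned int ptr;
};

/* Push a return position. Returns 0 on success, -1 if there is no room. */
int jumpstack_push(struct jumpstack *js, const void *ret)
{
    assert(js != NULL);

    /* Also covers size == 0, i.e. a blob with a jump but no user chains. */
    if (js->ptr >= js->size)
        return -1;

    js->slots[js->ptr++] = ret;
    return 0;
}
```

A caller that gets -1 can stop processing the packet instead of writing through an unallocated stack.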
      
      Fixes: 7814b6ec ("netfilter: xtables: don't save/restore jumpstack offset")
      Reported-by: syzbot+e783f671527912cd9403@syzkaller.appspotmail.com
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ASoC: Intel: kbl: fix jack name · 310f286d
      Vinod Koul authored
      commit cedb6415 upstream.
      
      Commit d1c4cb44 ("ASoC: Intel: Skylake: Fix jack name format
      substitution") added Jack name but erroneously added a space as well,
      so remove the space in Jack name.
      
      Fixes: d1c4cb44 ("ASoC: Intel: Skylake: Fix jack name format substitution")
      Signed-off-by: Vinod Koul <vinod.koul@intel.com>
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ASoC: Intel: Skylake: Fix jack name format substitution · 314b54aa
      Chintan Patel authored
      commit d1c4cb44 upstream.
      
      Jack name is not getting formatted correctly hence resulting
      in invalid name for HDMI/DP input devices.
      
      This was recently exposed due to changes brought by MST:
      commit 3a13347f ("ASoC: Intel: kbl: Add jack port initialize
      in kbl machine drivers")

      Signed-off-by: Chintan Patel <chintan.m.patel@intel.com>
      Acked-By: Vinod Koul <vinod.koul@intel.com>
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ARM: omap2: hide omap3_save_secure_ram on non-OMAP3 builds · c116baf7
      Arnd Bergmann authored
      commit 863204cf upstream.
      
      In configurations without CONFIG_OMAP3 but with secure RAM support,
      we now run into a link failure:
      
      arch/arm/mach-omap2/omap-secure.o: In function `omap3_save_secure_ram':
      omap-secure.c:(.text+0x130): undefined reference to `save_secure_ram_context'
      
      The omap3_save_secure_ram() function is only called from the OMAP34xx
      power management code, so we can simply hide that function in the
      appropriate #ifdef.
      
      Fixes: d09220a8 ("ARM: OMAP2+: Fix SRAM virt to phys translation for save_secure_ram_context")
      Acked-by: Tony Lindgren <tony@atomide.com>
      Tested-by: Dan Murphy <dmurphy@ti.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • watchdog: hpwdt: Remove legacy NMI sourcing. · 77fbdd1e
      Jerry Hoemann authored
      commit 2b3d89b4 upstream.
      
      Gen8 and prior Proliant systems supported the "CRU" interface
      to firmware.  This interface allows Linux to "call back" into firmware
      to source the cause of an NMI.  This feature isn't fully utilized
      as the actual source of the NMI isn't printed, the driver only
      indicates that the source couldn't be determined when the call
      fails.
      
      With the advent of Gen9, iCRU replaces the CRU. The call back
      feature is no longer available in firmware.  To be compatible and
      not attempt to call back into firmware on systems not supporting CRU,
      the SMBIOS table is consulted to determine if it is safe to
      make the call back or not.
      
      This results in about half of the driver code being devoted
      to either making CRU calls or determining if it is safe to make
      CRU calls.  As noted, the driver isn't really using the results of
      the CRU calls.
      
      Furthermore, as a consequence of the Spectre security issue, the
      BIOS/EFI calls are being wrapped into a Spectre-disabling section.
      Removing the call back in hpwdt_pretimeout assists in this effort.
      
      As the CRU sourcing of the NMI isn't required for handling the
      NMI and there are security concerns with making the call back, remove
      the legacy (pre Gen9) NMI sourcing and the DMI code to determine if
      the system had the CRU interface.

      Signed-off-by: Jerry Hoemann <jerry.hoemann@hpe.com>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Reviewed-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • watchdog: hpwdt: fix unused variable warning · 41da51db
      Arnd Bergmann authored
      commit aeebc6ba upstream.
      
      The new hpwdt_my_nmi() function is used conditionally, which produces
      a harmless warning in some configurations:
      
      drivers/watchdog/hpwdt.c:478:12: error: 'hpwdt_my_nmi' defined but not used [-Werror=unused-function]
      
      This moves it inside of the #ifdef that protects its caller, to silence
      the warning.
      
      Fixes: 621174a92851 ("watchdog: hpwdt: Check source of NMI")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Reviewed-by: Jerry Hoemann <jerry.hoemann@hpe.com>
      Reviewed-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • watchdog: hpwdt: Check source of NMI · d40d7b33
      Jerry Hoemann authored
      commit 838534e5 upstream.
      
      Do not claim the NMI (i.e. return NMI_DONE) if the source of
      the NMI isn't the iLO watchdog or debug.
      Signed-off-by: Jerry Hoemann <jerry.hoemann@hpe.com>
      Reviewed-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d40d7b33
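The rule in "watchdog: hpwdt: Check source of NMI" can be sketched in user-space C as follows. This is a minimal illustration, not the driver's code: the status-bit masks, the `ex_` function names, and the idea of passing the status byte as a parameter are assumptions made for the example; only the "return NMI_DONE when the NMI is not ours" behaviour comes from the commit message.

```c
#include <stdint.h>

/* Local stand-ins for the kernel's NMI handler return codes. */
enum { NMI_DONE = 0, NMI_HANDLED = 1 };

/* Hypothetical status bits: the commit message does not give the register layout. */
#define EX_NMISTAT_WATCHDOG 0x2u
#define EX_NMISTAT_DEBUG    0x4u

/* Nonzero if the (hypothetical) status byte names the iLO watchdog or debug as source. */
int ex_nmi_is_ours(uint8_t nmistat)
{
	return (nmistat & (EX_NMISTAT_WATCHDOG | EX_NMISTAT_DEBUG)) != 0;
}

/* Pretimeout handler sketch: claim the NMI only when we are its source. */
int ex_pretimeout(uint8_t nmistat)
{
	if (!ex_nmi_is_ours(nmistat))
		return NMI_DONE;	/* not ours: leave it for other handlers */

	/* A real handler would act on the watchdog event here. */
	return NMI_HANDLED;
}
```

Returning `NMI_DONE` for foreign NMIs matters because a handler that claims every NMI hides events that belong to other sources.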
    • Jerry Hoemann's avatar
      watchdog: hpwdt: SMBIOS check · 9a07f4a6
      Jerry Hoemann authored
      commit c42cbe41 upstream.
      
      This corrects:
      commit cce78da7 ("watchdog: hpwdt: Add check for UEFI bits")
      
      The test on HPE SMBIOS extension type 219 record "Misc Features"
      bits for UEFI support is incorrect.  The definition of the Misc Features
      bits in the HPE SMBIOS OEM Extensions specification (and related
      firmware) was changed to use a different pair of bits to
      represent UEFI supported.  However, a corresponding change
      to Linux was missed.
      
      Current code and platforms work because the iCRU test is working.
      But the purpose of cce78da7 is to ensure correct functionality
      on future systems where iCRU isn't supported.
      Signed-off-by: Jerry Hoemann <jerry.hoemann@hpe.com>
      Reviewed-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      9a07f4a6
    • Masahiro Yamada's avatar
      kbuild: move "_all" target out of $(KBUILD_SRC) conditional · 31c4bc6e
      Masahiro Yamada authored
      commit ba634ece upstream.
      
      The first "_all" occurrence around line 120 is only visible when
      KBUILD_SRC is unset.
      
      If O=... is specified, the working directory is relocated, and then
      only the second occurrence around line 193 is visible, which is not
      set to PHONY.
      
      Move the first one to an always visible place.  This clarifies "_all"
      is our default target and it is always set to PHONY.
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Reviewed-by: Douglas Anderson <dianders@chromium.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      31c4bc6e
  2. 11 Mar, 2018 10 commits
    • Greg Kroah-Hartman's avatar
      Linux 4.14.26 · 96427a51
      Greg Kroah-Hartman authored
      96427a51
    • Radim Krčmář's avatar
      KVM: x86: fix backward migration with async_PF · dc6fb79d
      Radim Krčmář authored
      commit fe2a3027 upstream.
      
      Guests on new hypervisors might set the KVM_ASYNC_PF_DELIVERY_AS_PF_VMEXIT
      bit when enabling async_PF, but this bit is reserved on old hypervisors,
      which results in a failure upon migration.
      
      To avoid breaking different cases, we check only the CPUID feature bit
      before enabling the feature.
      
      Fixes: 52a5c155 ("KVM: async_pf: Let guest support delivery of async_pf from guest mode")
      Cc: <stable@vger.kernel.org>
      Reviewed-by: Wanpeng Li <wanpengli@tencent.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      [jwang: port to 4.14]
      Signed-off-by: Jack Wang <jinpu.wang@profitbricks.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      dc6fb79d
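The guest-side rule described in "KVM: x86: fix backward migration with async_PF" (set the delivery bit only if the hypervisor advertises it) might look roughly like this. The constants and the function name below are hypothetical placeholders chosen for the sketch, not the kernel's definitions.

```c
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
#define EX_FEATURE_ASYNC_PF_VMEXIT  (1u << 10)	/* advertised via CPUID */
#define EX_ASYNC_PF_ENABLED         (1ull << 0)
#define EX_ASYNC_PF_DELIVERY_VMEXIT (1ull << 2)

/* Build the value a guest would write when enabling async_PF. */
uint64_t ex_async_pf_enable_value(uint64_t pa, uint32_t cpuid_features)
{
	uint64_t val = pa | EX_ASYNC_PF_ENABLED;

	/* Only set the optional bit when the feature bit is advertised;
	 * on an older hypervisor the bit is reserved and must stay clear. */
	if (cpuid_features & EX_FEATURE_ASYNC_PF_VMEXIT)
		val |= EX_ASYNC_PF_DELIVERY_VMEXIT;

	return val;
}
```

With this shape, a guest that migrates to a hypervisor without the feature never writes a reserved bit, assuming the feature bits it sees reflect the destination's capabilities.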
    • Daniel Borkmann's avatar
      bpf, ppc64: fix out of bounds access in tail call · a91064ff
      Daniel Borkmann authored
      [ upstream commit d269176e ]
      
      While working on 16338a9b ("bpf, arm64: fix out of bounds access in
      tail call") I noticed that the ppc64 JIT is partially affected as well.
      While the bounds check is correctly performed as an unsigned comparison,
      the register holding the index value is never truncated into 32 bit
      space, so e.g. an index value of 0x100000000ULL with a map of 1 element
      would pass with PPC_CMPLW(), whereas we later continue with the full
      64 bit register value. Therefore, as we do in the interpreter and other
      JITs, truncate the value to 32 bit initially in order to fix the access.
      
      Fixes: ce076141 ("powerpc/bpf: Implement support for tail calls")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      a91064ff
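The ppc64 problem above comes down to a 32-bit compare followed by a 64-bit use of the same register. A small user-space C model, using the 0x100000000ULL example from the commit message, may make this concrete; the `ex_` function names are invented for the sketch and do not correspond to JIT code.

```c
#include <stdint.h>

/* Models a 32-bit unsigned compare: only the low word of the register is tested. */
int ex_bounds_check_passes(uint64_t index_reg, uint32_t max_entries)
{
	return (uint32_t)index_reg < max_entries;
}

/* Before the fix: the full 64-bit register value is used to index the array. */
uint64_t ex_index_used_before(uint64_t index_reg)
{
	return index_reg;
}

/* After the fix: the value is truncated to 32 bits before anything else. */
uint64_t ex_index_used_after(uint64_t index_reg)
{
	return (uint32_t)index_reg;
}
```

For an index register of 0x100000000ULL and a one-element map, the check passes because the low word is 0, yet the untruncated index points far outside the array; after truncation the index used is 0, which is the value the check actually validated.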
    • Daniel Borkmann's avatar
      bpf: allow xadd only on aligned memory · 3e272a8c
      Daniel Borkmann authored
      [ upstream commit ca369602 ]
      
      The requirements around atomic_add() / atomic64_add() and their
      JIT implementations differ across architectures. E.g. while x86_64
      seems just fine with BPF's xadd on unaligned memory, on arm64 it
      triggers the following crash, via the interpreter as well as the JIT:
      
        [  830.864985] Unable to handle kernel paging request at virtual address ffff8097d7ed6703
        [...]
        [  830.916161] Internal error: Oops: 96000021 [#1] SMP
        [  830.984755] CPU: 37 PID: 2788 Comm: test_verifier Not tainted 4.16.0-rc2+ #8
        [  830.991790] Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.29 07/17/2017
        [  830.998998] pstate: 80400005 (Nzcv daif +PAN -UAO)
        [  831.003793] pc : __ll_sc_atomic_add+0x4/0x18
        [  831.008055] lr : ___bpf_prog_run+0x1198/0x1588
        [  831.012485] sp : ffff00001ccabc20
        [  831.015786] x29: ffff00001ccabc20 x28: ffff8017d56a0f00
        [  831.021087] x27: 0000000000000001 x26: 0000000000000000
        [  831.026387] x25: 000000c168d9db98 x24: 0000000000000000
        [  831.031686] x23: ffff000008203878 x22: ffff000009488000
        [  831.036986] x21: ffff000008b14e28 x20: ffff00001ccabcb0
        [  831.042286] x19: ffff0000097b5080 x18: 0000000000000a03
        [  831.047585] x17: 0000000000000000 x16: 0000000000000000
        [  831.052885] x15: 0000ffffaeca8000 x14: 0000000000000000
        [  831.058184] x13: 0000000000000000 x12: 0000000000000000
        [  831.063484] x11: 0000000000000001 x10: 0000000000000000
        [  831.068783] x9 : 0000000000000000 x8 : 0000000000000000
        [  831.074083] x7 : 0000000000000000 x6 : 000580d428000000
        [  831.079383] x5 : 0000000000000018 x4 : 0000000000000000
        [  831.084682] x3 : ffff00001ccabcb0 x2 : 0000000000000001
        [  831.089982] x1 : ffff8097d7ed6703 x0 : 0000000000000001
        [  831.095282] Process test_verifier (pid: 2788, stack limit = 0x0000000018370044)
        [  831.102577] Call trace:
        [  831.105012]  __ll_sc_atomic_add+0x4/0x18
        [  831.108923]  __bpf_prog_run32+0x4c/0x70
        [  831.112748]  bpf_test_run+0x78/0xf8
        [  831.116224]  bpf_prog_test_run_xdp+0xb4/0x120
        [  831.120567]  SyS_bpf+0x77c/0x1110
        [  831.123873]  el0_svc_naked+0x30/0x34
        [  831.127437] Code: 97fffe97 17ffffec 00000000 f9800031 (885f7c31)
      
      The reason is that the memory is required to be aligned. In
      case of BPF, we always enforce alignment for stack access,
      but not when accessing map values or packet data when the underlying
      arch (e.g. arm64) has CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS set.
      
      xadd on packet data that is local to us anyway is just wrong, so
      forbid this case entirely. The only place where xadd in fact makes
      sense is map values; xadd on stack is wrong as well, but it's been
      around for much longer. Specifically enforce strict alignment in case
      of xadd, so that we handle this case generically and avoid such crashes
      in the first place.
      
      Fixes: 17a52670 ("bpf: verifier (add verifier core)")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      3e272a8c
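The alignment requirement behind "bpf: allow xadd only on aligned memory" can be illustrated with a natural-alignment test. This is a simplified model on absolute addresses with an invented function name; the verifier itself reasons about register offsets at load time, which this sketch does not attempt to reproduce.

```c
#include <stdint.h>

/* Nonzero if addr is naturally aligned for an access of `size` bytes.
 * `size` is assumed to be a power of two (4 or 8 for xadd). */
int ex_xadd_is_aligned(uint64_t addr, uint32_t size)
{
	return (addr & (uint64_t)(size - 1)) == 0;
}
```

Applied to the address in x1 from the oops above, ffff8097d7ed6703, the test fails for a 4-byte access because the low two bits are set, which is consistent with the fault in __ll_sc_atomic_add.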
    • Eric Dumazet's avatar
      bpf: add schedule points in percpu arrays management · e1760b35
      Eric Dumazet authored
      [ upstream commit 32fff239 ]
      
      syzbot managed to trigger RCU detected stalls in
      bpf_array_free_percpu().
      
      It takes time to allocate a huge percpu map, but even more time to free
      it.
      
      Since we run in process context, use cond_resched() to yield cpu if
      needed.
      
      Fixes: a10423b8 ("bpf: introduce BPF_MAP_TYPE_PERCPU_ARRAY map")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      e1760b35
    • Daniel Borkmann's avatar
      bpf, arm64: fix out of bounds access in tail call · 03549a34
      Daniel Borkmann authored
      [ upstream commit 16338a9b ]
      
      I recently noticed a crash on arm64 when feeding a bogus index
      into BPF tail call helper. The crash would not occur when the
      interpreter is used, but only in case of JIT. Output looks as
      follows:
      
        [  347.007486] Unable to handle kernel paging request at virtual address fffb850e96492510
        [...]
        [  347.043065] [fffb850e96492510] address between user and kernel address ranges
        [  347.050205] Internal error: Oops: 96000004 [#1] SMP
        [...]
        [  347.190829] x13: 0000000000000000 x12: 0000000000000000
        [  347.196128] x11: fffc047ebe782800 x10: ffff808fd7d0fd10
        [  347.201427] x9 : 0000000000000000 x8 : 0000000000000000
        [  347.206726] x7 : 0000000000000000 x6 : 001c991738000000
        [  347.212025] x5 : 0000000000000018 x4 : 000000000000ba5a
        [  347.217325] x3 : 00000000000329c4 x2 : ffff808fd7cf0500
        [  347.222625] x1 : ffff808fd7d0fc00 x0 : ffff808fd7cf0500
        [  347.227926] Process test_verifier (pid: 4548, stack limit = 0x000000007467fa61)
        [  347.235221] Call trace:
        [  347.237656]  0xffff000002f3a4fc
        [  347.240784]  bpf_test_run+0x78/0xf8
        [  347.244260]  bpf_prog_test_run_skb+0x148/0x230
        [  347.248694]  SyS_bpf+0x77c/0x1110
        [  347.251999]  el0_svc_naked+0x30/0x34
        [  347.255564] Code: 9100075a d280220a 8b0a002a d37df04b (f86b694b)
        [...]
      
      In this case the index used in BPF r3 is the same as in r1
      at the time of the call, meaning we fed a pointer as index;
      here, it had the value 0xffff808fd7cf0500 which sits in x2.
      
      While I found tail calls to be working in general (also for
      hitting the error cases), I noticed the following in the code
      emission:
      
        # bpftool p d j i 988
        [...]
        38:   ldr     w10, [x1,x10]
        3c:   cmp     w2, w10
        40:   b.ge    0x000000000000007c              <-- signed cmp
        44:   mov     x10, #0x20                      // #32
        48:   cmp     x26, x10
        4c:   b.gt    0x000000000000007c
        50:   add     x26, x26, #0x1
        54:   mov     x10, #0x110                     // #272
        58:   add     x10, x1, x10
        5c:   lsl     x11, x2, #3
        60:   ldr     x11, [x10,x11]                  <-- faulting insn (f86b694b)
        64:   cbz     x11, 0x000000000000007c
        [...]
      
      Meaning, the tests passed because commit ddb55992 ("arm64:
      bpf: implement bpf_tail_call() helper") used signed compares
      instead of unsigned ones, so the bounds test wrongly passed.
      
      Change this test and the tail call count test to unsigned compares,
      and cap the index as u32. The latter we also did in 90caccdd
      ("bpf: fix bpf_tail_call() x64 JIT") and it is needed here as
      well. Tested on HiSilicon Hi1616.
      
      Result after patch:
      
        # bpftool p d j i 268
        [...]
        38:	ldr	w10, [x1,x10]
        3c:	add	w2, w2, #0x0
        40:	cmp	w2, w10
        44:	b.cs	0x0000000000000080
        48:	mov	x10, #0x20                  	// #32
        4c:	cmp	x26, x10
        50:	b.hi	0x0000000000000080
        54:	add	x26, x26, #0x1
        58:	mov	x10, #0x110                 	// #272
        5c:	add	x10, x1, x10
        60:	lsl	x11, x2, #3
        64:	ldr	x11, [x10,x11]
        68:	cbz	x11, 0x0000000000000080
        [...]
      
      Fixes: ddb55992 ("arm64: bpf: implement bpf_tail_call() helper")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      03549a34
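The signed-versus-unsigned issue in the arm64 tail call fix can be reproduced in plain C. The sketch below models the 32-bit compare on the low word of the index register, using the pointer value 0xffff808fd7cf0500 from the commit message; the map size of 1 used in the note afterwards and the `ex_` names are assumptions for illustration.

```c
#include <stdint.h>

/* Signed 32-bit compare, as with b.ge: reject when index >= max_entries.
 * Converting an out-of-range value to int32_t is implementation-defined
 * in C; mainstream compilers wrap, which matches the hardware behaviour. */
int ex_rejects_signed(uint64_t index_reg, uint32_t max_entries)
{
	return (int32_t)(uint32_t)index_reg >= (int32_t)max_entries;
}

/* Unsigned 32-bit compare, as with b.cs: reject when index >= max_entries. */
int ex_rejects_unsigned(uint64_t index_reg, uint32_t max_entries)
{
	return (uint32_t)index_reg >= max_entries;
}
```

With a one-element map, the low word 0xd7cf0500 is negative when read as signed, so the signed test lets it through, while the unsigned test rejects it. Capping the index to u32 is still needed in addition, since the later address computation would otherwise use the full 64-bit register.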
    • Daniel Borkmann's avatar
      bpf, x64: implement retpoline for tail call · 7e657aa3
      Daniel Borkmann authored
      [ upstream commit a493a87f ]
      
      Implement a retpoline [0] for the BPF tail call JIT'ing that converts
      the indirect jump via jmp %rax that is used to make the long jump into
      another JITed BPF image. Since this is subject to speculative execution,
      we need to control the transient instruction sequence here as well
      when CONFIG_RETPOLINE is set, and direct it into a pause + lfence loop.
      The latter aligns also with what gcc / clang emits (e.g. [1]).
      
      JIT dump after patch:
      
        # bpftool p d x i 1
         0: (18) r2 = map[id:1]
         2: (b7) r3 = 0
         3: (85) call bpf_tail_call#12
         4: (b7) r0 = 2
         5: (95) exit
      
      With CONFIG_RETPOLINE:
      
        # bpftool p d j i 1
        [...]
        33:	cmp    %edx,0x24(%rsi)
        36:	jbe    0x0000000000000072  |*
        38:	mov    0x24(%rbp),%eax
        3e:	cmp    $0x20,%eax
        41:	ja     0x0000000000000072  |
        43:	add    $0x1,%eax
        46:	mov    %eax,0x24(%rbp)
        4c:	mov    0x90(%rsi,%rdx,8),%rax
        54:	test   %rax,%rax
        57:	je     0x0000000000000072  |
        59:	mov    0x28(%rax),%rax
        5d:	add    $0x25,%rax
        61:	callq  0x000000000000006d  |+
        66:	pause                      |
        68:	lfence                     |
        6b:	jmp    0x0000000000000066  |
        6d:	mov    %rax,(%rsp)         |
        71:	retq                       |
        72:	mov    $0x2,%eax
        [...]
      
        * relative fall-through jumps in error case
        + retpoline for indirect jump
      
      Without CONFIG_RETPOLINE:
      
        # bpftool p d j i 1
        [...]
        33:	cmp    %edx,0x24(%rsi)
        36:	jbe    0x0000000000000063  |*
        38:	mov    0x24(%rbp),%eax
        3e:	cmp    $0x20,%eax
        41:	ja     0x0000000000000063  |
        43:	add    $0x1,%eax
        46:	mov    %eax,0x24(%rbp)
        4c:	mov    0x90(%rsi,%rdx,8),%rax
        54:	test   %rax,%rax
        57:	je     0x0000000000000063  |
        59:	mov    0x28(%rax),%rax
        5d:	add    $0x25,%rax
        61:	jmpq   *%rax               |-
        63:	mov    $0x2,%eax
        [...]
      
        * relative fall-through jumps in error case
        - plain indirect jump as before
      
        [0] https://support.google.com/faqs/answer/7625886
        [1] https://github.com/gcc-mirror/gcc/commit/a31e654fa107be968b802786d747e962c2fcdb2b
      
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      7e657aa3
    • Yonghong Song's avatar
      bpf: fix rcu lockdep warning for lpm_trie map_free callback · 853223c2
      Yonghong Song authored
      [ upstream commit 6c5f6102 ]
      
      Commit 9a3efb6b ("bpf: fix memory leak in lpm_trie map_free callback function")
      fixed a memory leak and removed unnecessary locks in map_free callback function.
      Unfortunately, it introduced a lockdep warning. When lockdep checking is turned on,
      running tools/testing/selftests/bpf/test_lpm_map will have:
      
        [   98.294321] =============================
        [   98.294807] WARNING: suspicious RCU usage
        [   98.295359] 4.16.0-rc2+ #193 Not tainted
        [   98.295907] -----------------------------
        [   98.296486] /home/yhs/work/bpf/kernel/bpf/lpm_trie.c:572 suspicious rcu_dereference_check() usage!
        [   98.297657]
        [   98.297657] other info that might help us debug this:
        [   98.297657]
        [   98.298663]
        [   98.298663] rcu_scheduler_active = 2, debug_locks = 1
        [   98.299536] 2 locks held by kworker/2:1/54:
        [   98.300152]  #0:  ((wq_completion)"events"){+.+.}, at: [<00000000196bc1f0>] process_one_work+0x157/0x5c0
        [   98.301381]  #1:  ((work_completion)(&map->work)){+.+.}, at: [<00000000196bc1f0>] process_one_work+0x157/0x5c0
      
      Since actual trie tree removal happens only after no other
      accesses to the tree are possible, replacing
        rcu_dereference_protected(*slot, lockdep_is_held(&trie->lock))
      with
        rcu_dereference_protected(*slot, 1)
      fixed the issue.
      
      Fixes: 9a3efb6b ("bpf: fix memory leak in lpm_trie map_free callback function")
      Reported-by: Eric Dumazet <edumazet@google.com>
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Acked-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      853223c2
    • Yonghong Song's avatar
      bpf: fix memory leak in lpm_trie map_free callback function · 62a2caa5
      Yonghong Song authored
      [ upstream commit 9a3efb6b ]
      
      There is a memory leak happening in lpm_trie map_free callback
      function trie_free. The trie structure itself does not get freed.
      
      Also, the trie_free function did not call synchronize_rcu before freeing
      various data structures. This is incorrect, as some rcu_read_lock
      region(s) for lookup, update, delete or get_next_key may not have completed yet.
      The fix is to add synchronize_rcu at the beginning of trie_free.
      The useless spin_lock is removed from this function as well.
      
      Fixes: b95a5c4d ("bpf: add a longest prefix match trie map implementation")
      Reported-by: Mathieu Malaterre <malat@debian.org>
      Reported-by: Alexei Starovoitov <ast@kernel.org>
      Tested-by: Mathieu Malaterre <malat@debian.org>
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      62a2caa5
    • Daniel Borkmann's avatar
      bpf: fix mlock precharge on arraymaps · d9fd73c6
      Daniel Borkmann authored
      [ upstream commit 9c2d63b8 ]
      
      syzkaller recently triggered OOM during percpu map allocation;
      while there is work in progress by Dennis Zhou to add __GFP_NORETRY
      semantics for the percpu allocator under pressure, there also seems
      to be a missing bpf_map_precharge_memlock() check in array map
      allocation.
      
      Given today the actual bpf_map_charge_memlock() happens after the
      find_and_alloc_map() in syscall path, the bpf_map_precharge_memlock()
      is there to bail out early before we go and do the map setup work
      when we find that we hit the limits anyway. Therefore add this for
      array map as well.
      
      Fixes: 6c905981 ("bpf: pre-allocate hash map elements")
      Fixes: a10423b8 ("bpf: introduce BPF_MAP_TYPE_PERCPU_ARRAY map")
      Reported-by: syzbot+adb03f3f0bb57ce3acda@syzkaller.appspotmail.com
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Dennis Zhou <dennisszhou@gmail.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d9fd73c6
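The "bail out early" idea in "bpf: fix mlock precharge on arraymaps" can be sketched as a limit check that runs before any allocation work. The function below is a user-space model with an invented name and explicit parameters; in the kernel, the current locked-page count and the limit come from the calling user's accounting and RLIMIT_MEMLOCK rather than from arguments, and the choice of -EPERM as the error code here is an assumption of the sketch.

```c
#include <errno.h>
#include <stdint.h>

/* Precharge sketch: refuse up front if charging `pages` more pages would
 * exceed the memlock limit. Nothing is charged here; the real charge
 * would happen later, after the map has been set up. */
int ex_precharge_memlock(unsigned long locked_pages,
			 unsigned long limit_pages,
			 uint32_t pages)
{
	if (locked_pages + pages > limit_pages)
		return -EPERM;	/* would exceed the limit: skip the setup work */

	return 0;
}
```

The point of checking first is cost: a huge percpu allocation that is going to be rejected by the later charge anyway is never attempted.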
  3. 09 Mar, 2018 7 commits