Commits · 680f6af5337c98d116e4f127cea7845339dba8da · Kirill Smelkov / linux

09 May, 2019 1 commit

netfilter: ebtables: CONFIG_COMPAT: reject trailing data after last rule · 680f6af5

Florian Westphal authored May 05, 2019

If userspace provides a rule blob with trailing data after last target,
we trigger a splat, then convert ruleset to 64bit format (with trailing
data), then pass that to do_replace_finish() which then returns -EINVAL.

Erroring out right away avoids the splat plus unneeded translation and
error unwind.

Fixes: 81e675c2 ("netfilter: ebtables: add CONFIG_COMPAT support")
Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

680f6af5

06 May, 2019 3 commits

netfilter: nf_flow_table: do not flow offload deleted conntrack entries · 8cd2bc98

Taehee Yoo authored Apr 30, 2019

Conntrack entries can be deleted by the masquerade module. In that case,
flow offload should be deleted too, but GC and data-path of flow offload
do not check for conntrack status bits, hence flow offload entries will
be removed only by the timeout.

Update garbage collector and data-path to check for ct->status. If
IPS_DYING_BIT is set, garbage collector removes flow offload entries and
data-path routine ignores them.
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

8cd2bc98

netfilter: nf_conntrack_h323: Remove deprecated config check · b33c448c

Subash Abhinov Kasiviswanathan authored May 03, 2019

CONFIG_NF_CONNTRACK_IPV6 has been deprecated so replace it with a check
for IPV6 instead.

Use nf_ip6_route6() instead of v6ops->route() and keep the IS_MODULE()
in nf_ipv6_ops as mentioned by Florian so that direct calls are used
when IPV6 is builtin and indirect calls are used only when IPV6 is a
module.

Fixes: a0ae2562 ("netfilter: conntrack: remove l3proto abstraction")
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

b33c448c

netfilter: ctnetlink: Resolve conntrack L3-protocol flush regression · f8e60898

Kristian Evensen authored May 03, 2019

Commit 59c08c69 ("netfilter: ctnetlink: Support L3 protocol-filter
on flush") introduced a user-space regression when flushing connection
track entries. Before this commit, the nfgen_family field was not used
by the kernel and all entries were removed. Since this commit,
nfgen_family is used to filter out entries that should not be removed.
One example a broken tool is conntrack. conntrack always sets
nfgen_family to AF_INET, so after 59c08c69 only IPv4 entries were
removed with the -F parameter.

Pablo Neira Ayuso suggested using nfgenmsg->version to resolve the
regression, and this commit implements his suggestion. nfgenmsg->version
is so far set to zero, so it is well-suited to be used as a flag for
selecting old or new flush behavior. If version is 0, nfgen_family is
ignored and all entries are used. If user-space sets the version to one
(or any other value than 0), then the new behavior is used. As version
only can have two valid values, I chose not to add a new
NFNETLINK_VERSION-constant.

Fixes: 59c08c69 ("netfilter: ctnetlink: Support L3 protocol-filter on flush")
Reported-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com>
Tested-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

f8e60898

05 May, 2019 3 commits

netfilter: nf_flow_table: fix missing error check for rhashtable_insert_fast · 43c8f131

Taehee Yoo authored May 03, 2019

rhashtable_insert_fast() may return an error value when memory
allocation fails, but flow_offload_add() does not check for errors.
This patch just adds missing error checking.

Fixes: ac2a6666 ("netfilter: add generic flow table infrastructure")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

43c8f131

netfilter: nf_tables: fix base chain stat rcu_dereference usage · edbd82c5

Florian Westphal authored Apr 30, 2019

Following splat gets triggered when nfnetlink monitor is running while
xtables-nft selftests are running:

net/netfilter/nf_tables_api.c:1272 suspicious rcu_dereference_check() usage!
other info that might help us debug this:

1 lock held by xtables-nft-mul/27006:
 #0: 00000000e0f85be9 (&net->nft.commit_mutex){+.+.}, at: nf_tables_valid_genid+0x1a/0x50
Call Trace:
 nf_tables_fill_chain_info.isra.45+0x6cc/0x6e0
 nf_tables_chain_notify+0xf8/0x1a0
 nf_tables_commit+0x165c/0x1740

nf_tables_fill_chain_info() can be called both from dumps (rcu read locked)
or from the transaction path if a userspace process subscribed to nftables
notifications.

In the 'table dump' case, rcu_access_pointer() cannot be used: We do not
hold transaction mutex so the pointer can be NULLed right after the check.
Just unconditionally fetch the value, then have the helper return
immediately if its NULL.

In the notification case we don't hold the rcu read lock, but updates are
prevented due to transaction mutex. Use rcu_dereference_check() to make lockdep
aware of this.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

edbd82c5

netfilter: nf_conntrack_h323: restore boundary check correctness · f5e85ce8

Jakub Jankowski authored Apr 25, 2019

Since commit bc7d811a ("netfilter: nf_ct_h323: Convert
CHECK_BOUND macro to function"), NAT traversal for H.323
doesn't work, failing to parse H323-UserInformation.
nf_h323_error_boundary() compares contents of the bitstring,
not the addresses, preventing valid H.323 packets from being
conntrack'd.

This looks like an oversight from when CHECK_BOUND macro was
converted to a function.

To fix it, stop dereferencing bs->cur and bs->end.

Fixes: bc7d811a ("netfilter: nf_ct_h323: Convert CHECK_BOUND macro to function")
Signed-off-by: Jakub Jankowski <shasta@toxcorp.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

f5e85ce8

30 Apr, 2019 7 commits

netfilter: nf_flow_table: check ttl value in flow offload data path · 33cc3c0c

Taehee Yoo authored Apr 30, 2019

nf_flow_offload_ip_hook() and nf_flow_offload_ipv6_hook() do not check
ttl value. So, ttl value overflow may occur.

Fixes: 97add9f0 ("netfilter: flow table support for IPv4")
Fixes: 09952107 ("netfilter: flow table support for IPv6")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

33cc3c0c

netfilter: nf_flow_table: fix netdev refcnt leak · 26a302af

Taehee Yoo authored Apr 30, 2019

flow_offload_alloc() calls nf_route() to get a dst_entry. Internally,
nf_route() calls ip_route_output_key() that allocates a dst_entry and
holds it. So, a dst_entry should be released by dst_release() if
nf_route() is successful.

Otherwise, netns exit routine cannot be finished and the following
message is printed:

[  257.490952] unregister_netdevice: waiting for lo to become free. Usage count = 1

Fixes: ac2a6666 ("netfilter: add generic flow table infrastructure")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

26a302af

netfilter: nft_flow_offload: add entry to flowtable after confirmation · 270a8a29

Pablo Neira Ayuso authored Apr 29, 2019

This is fixing flow offload for UDP traffic where packets only follow
one single direction.

The flow_offload_fixup_tcp() mechanism works fine in case that the
offloaded entry remains in SYN_RECV state, given sequence tracking is
reset and that conntrack handles syn+ack packets as a retransmission, ie.

	sES + synack => sIG

for reply traffic.

Fixes: a3c90f7a ("netfilter: nf_tables: flow offload expression")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

270a8a29

netfilter: nf_tables: delay chain policy update until transaction is complete · 66293c46

Florian Westphal authored Apr 12, 2019

When we process a long ruleset of the form

chain input {
   type filter hook input priority filter; policy drop;
   ...
}

Then the base chain gets registered early on, we then continue to
process/validate the next messages coming in the same transaction.

Problem is that if the base chain policy is 'drop', it will take effect
immediately, which causes all traffic to get blocked until the
transaction completes or is aborted.

Fix this by deferring the policy until the transaction has been
processed and all of the rules have been flagged as active.
Reported-by: Jann Haber <jann.haber@selfnet.de>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

66293c46

ipv6/flowlabel: wait rcu grace period before put_pid() · 6c0afef5

Eric Dumazet authored Apr 27, 2019

syzbot was able to catch a use-after-free read in pid_nr_ns() [1]

ip6fl_seq_show() seems to use RCU protection, dereferencing fl->owner.pid
but fl_free() releases fl->owner.pid before rcu grace period is started.

[1]

BUG: KASAN: use-after-free in pid_nr_ns+0x128/0x140 kernel/pid.c:407
Read of size 4 at addr ffff888094012a04 by task syz-executor.0/18087

CPU: 0 PID: 18087 Comm: syz-executor.0 Not tainted 5.1.0-rc6+ #89
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x172/0x1f0 lib/dump_stack.c:113
 print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187
 kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
 __asan_report_load4_noabort+0x14/0x20 mm/kasan/generic_report.c:131
 pid_nr_ns+0x128/0x140 kernel/pid.c:407
 ip6fl_seq_show+0x2f8/0x4f0 net/ipv6/ip6_flowlabel.c:794
 seq_read+0xad3/0x1130 fs/seq_file.c:268
 proc_reg_read+0x1fe/0x2c0 fs/proc/inode.c:227
 do_loop_readv_writev fs/read_write.c:701 [inline]
 do_loop_readv_writev fs/read_write.c:688 [inline]
 do_iter_read+0x4a9/0x660 fs/read_write.c:922
 vfs_readv+0xf0/0x160 fs/read_write.c:984
 kernel_readv fs/splice.c:358 [inline]
 default_file_splice_read+0x475/0x890 fs/splice.c:413
 do_splice_to+0x12a/0x190 fs/splice.c:876
 splice_direct_to_actor+0x2d2/0x970 fs/splice.c:953
 do_splice_direct+0x1da/0x2a0 fs/splice.c:1062
 do_sendfile+0x597/0xd00 fs/read_write.c:1443
 __do_sys_sendfile64 fs/read_write.c:1498 [inline]
 __se_sys_sendfile64 fs/read_write.c:1490 [inline]
 __x64_sys_sendfile64+0x15a/0x220 fs/read_write.c:1490
 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x458da9
Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f300d24bc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000028
RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 0000000000458da9
RDX: 00000000200000c0 RSI: 0000000000000008 RDI: 0000000000000007
RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 000000000000005a R11: 0000000000000246 R12: 00007f300d24c6d4
R13: 00000000004c5fa3 R14: 00000000004da748 R15: 00000000ffffffff

Allocated by task 17543:
 save_stack+0x45/0xd0 mm/kasan/common.c:75
 set_track mm/kasan/common.c:87 [inline]
 __kasan_kmalloc mm/kasan/common.c:497 [inline]
 __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:470
 kasan_slab_alloc+0xf/0x20 mm/kasan/common.c:505
 slab_post_alloc_hook mm/slab.h:437 [inline]
 slab_alloc mm/slab.c:3393 [inline]
 kmem_cache_alloc+0x11a/0x6f0 mm/slab.c:3555
 alloc_pid+0x55/0x8f0 kernel/pid.c:168
 copy_process.part.0+0x3b08/0x7980 kernel/fork.c:1932
 copy_process kernel/fork.c:1709 [inline]
 _do_fork+0x257/0xfd0 kernel/fork.c:2226
 __do_sys_clone kernel/fork.c:2333 [inline]
 __se_sys_clone kernel/fork.c:2327 [inline]
 __x64_sys_clone+0xbf/0x150 kernel/fork.c:2327
 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Freed by task 7789:
 save_stack+0x45/0xd0 mm/kasan/common.c:75
 set_track mm/kasan/common.c:87 [inline]
 __kasan_slab_free+0x102/0x150 mm/kasan/common.c:459
 kasan_slab_free+0xe/0x10 mm/kasan/common.c:467
 __cache_free mm/slab.c:3499 [inline]
 kmem_cache_free+0x86/0x260 mm/slab.c:3765
 put_pid.part.0+0x111/0x150 kernel/pid.c:111
 put_pid+0x20/0x30 kernel/pid.c:105
 fl_free+0xbe/0xe0 net/ipv6/ip6_flowlabel.c:102
 ip6_fl_gc+0x295/0x3e0 net/ipv6/ip6_flowlabel.c:152
 call_timer_fn+0x190/0x720 kernel/time/timer.c:1325
 expire_timers kernel/time/timer.c:1362 [inline]
 __run_timers kernel/time/timer.c:1681 [inline]
 __run_timers kernel/time/timer.c:1649 [inline]
 run_timer_softirq+0x652/0x1700 kernel/time/timer.c:1694
 __do_softirq+0x266/0x95a kernel/softirq.c:293

The buggy address belongs to the object at ffff888094012a00
 which belongs to the cache pid_2 of size 88
The buggy address is located 4 bytes inside of
 88-byte region [ffff888094012a00, ffff888094012a58)
The buggy address belongs to the page:
page:ffffea0002500480 count:1 mapcount:0 mapping:ffff88809a483080 index:0xffff888094012980
flags: 0x1fffc0000000200(slab)
raw: 01fffc0000000200 ffffea00018a3508 ffffea0002524a88 ffff88809a483080
raw: ffff888094012980 ffff888094012000 000000010000001b 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff888094012900: fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc
 ffff888094012980: fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc
>ffff888094012a00: fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc
                   ^
 ffff888094012a80: fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc
 ffff888094012b00: fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc

Fixes: 4f82f457 ("net ip6 flowlabel: Make owner a union of struct pid * and kuid_t")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6c0afef5

vrf: Use orig netdev to count Ip6InNoRoutes and a fresh route lookup when sending dest unreach · 1d3fd8a1

Stephen Suryaputra authored Apr 27, 2019

When there is no route to an IPv6 dest addr, skb_dst(skb) points
to loopback dev in the case of that the IP6CB(skb)->iif is
enslaved to a vrf. This causes Ip6InNoRoutes to be incremented on the
loopback dev. This also causes the lookup to fail on icmpv6_send() and
the dest unreachable to not sent and Ip6OutNoRoutes gets incremented on
the loopback dev.

To reproduce:
* Gateway configuration:
        ip link add dev vrf_258 type vrf table 258
        ip link set dev enp0s9 master vrf_258
        ip addr add 66:1/64 dev enp0s9
        ip -6 route add unreachable default metric 8192 table 258
        sysctl -w net.ipv6.conf.all.forwarding=1
        sysctl -w net.ipv6.conf.enp0s9.forwarding=1
* Sender configuration:
        ip addr add 66::2/64 dev enp0s9
        ip -6 route add default via 66::1
and ping 67::1 for example from the sender.

Fix this by counting on the original netdev and reset the skb dst to
force a fresh lookup.

v2: Fix typo of destination address in the repro steps.
v3: Simplify the loopback check (per David Ahern) and use reverse
    Christmas tree format (per David Miller).
Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Tested-by: David Ahern <dsahern@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1d3fd8a1

tcp: add sanity tests in tcp_add_backlog() · ca2fe295

Eric Dumazet authored Apr 26, 2019

Richard and Bruno both reported that my commit added a bug,
and Bruno was able to determine the problem came when a segment
wih a FIN packet was coalesced to a prior one in tcp backlog queue.

It turns out the header prediction in tcp_rcv_established()
looks back to TCP headers in the packet, not in the metadata
(aka TCP_SKB_CB(skb)->tcp_flags)

The fast path in tcp_rcv_established() is not supposed to
handle a FIN flag (it does not call tcp_fin())

Therefore we need to make sure to propagate the FIN flag,
so that the coalesced packet does not go through the fast path,
the same than a GRO packet carrying a FIN flag.

While we are at it, make sure we do not coalesce packets with
RST or SYN, or if they do not have ACK set.

Many thanks to Richard and Bruno for pinpointing the bad commit,
and to Richard for providing a first version of the fix.

Fixes: 4f693b55 ("tcp: implement coalescing on backlog queue")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Richard Purdie <richard.purdie@linuxfoundation.org>
Reported-by: Bruno Prémont <bonbons@sysophe.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>

ca2fe295

29 Apr, 2019 3 commits

ipv6: invert flowlabel sharing check in process and user mode · 95c16925

Willem de Bruijn authored Apr 25, 2019

A request for a flowlabel fails in process or user exclusive mode must
fail if the caller pid or uid does not match. Invert the test.

Previously, the test was unsafe wrt PID recycling, but indeed tested
for inequality: fl1->owner != fl->owner

Fixes: 4f82f457 ("net ip6 flowlabel: Make owner a union of struct pid* and kuid_t")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

95c16925

Merge branch 'ieee802154-for-davem-2019-04-25' of... · 6ee12b7b

David S. Miller authored Apr 29, 2019

Merge branch 'ieee802154-for-davem-2019-04-25' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan

Stefan Schmidt says:

====================
ieee802154 for net 2019-04-25

An update from ieee802154 for your *net* tree.

Another fix from Kangjie Lu to ensure better checking regmap updates in the
mcr20a driver. Nothing else I have pending for the final release.

If there are any problems let me know.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

6ee12b7b

Merge tag 'mac80211-for-davem-2019-04-26' of... · 2ae7a397

David S. Miller authored Apr 29, 2019

Merge tag 'mac80211-for-davem-2019-04-26' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
 * fix use-after-free in mac80211 TXQs
 * fix RX STBC byte order
 * fix debugfs rename crashing due to ERR_PTR()
 * fix missing regulatory notification
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

2ae7a397

28 Apr, 2019 4 commits

udp: fix GRO reception in case of length mismatch · 21f1b8a6

Paolo Abeni authored Apr 26, 2019

Currently, the UDP GRO code path does bad things on some edge
conditions - Aggregation can happen even on packet with different
lengths.

Fix the above by rewriting the 'complete' condition for GRO
packets. While at it, note explicitly that we allow merging the
first packet per burst below gso_size.
Reported-by: Sean Tong <seantong114@gmail.com>
Fixes: e20cf8d3 ("udp: implement GRO for plain UDP sockets.")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

21f1b8a6

Merge branch 'tls-data-copies' · fbef9478

David S. Miller authored Apr 27, 2019

Jakub Kicinski says:

====================
net/tls: fix data copies in tls_device_reencrypt()

This series fixes the tls_device_reencrypt() which is broken
if record starts in the frags of the message skb.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

fbef9478

net/tls: fix copy to fragments in reencrypt · eb3d38d5

Jakub Kicinski authored Apr 25, 2019

Fragments may contain data from other records so we have to account
for that when we calculate the destination and max length of copy we
can perform.  Note that 'offset' is the offset within the message,
so it can't be passed as offset within the frag..

Here skb_store_bits() would have realised the call is wrong and
simply not copy data.

Fixes: 4799ac81 ("tls: Add rx inline crypto offload")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

eb3d38d5

net/tls: don't copy negative amounts of data in reencrypt · 97e1caa5

Jakub Kicinski authored Apr 25, 2019

There is no guarantee the record starts before the skb frags.
If we don't check for this condition copy amount will get
negative, leading to reads and writes to random memory locations.
Familiar hilarity ensues.

Fixes: 4799ac81 ("tls: Add rx inline crypto offload")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

97e1caa5

27 Apr, 2019 7 commits

Merge branch 'bnxt_en-Misc-bug-fixes' · b2a20fd0

David S. Miller authored Apr 27, 2019

Michael Chan says:

====================
bnxt_en: Misc. bug fixes.

6 miscellaneous bug fixes covering several issues in error code paths,
a setup issue for statistics DMA, and an improvement for setting up
multicast address filters.

Please queue these for stable as well.
Patch #5 (bnxt_en: Fix statistics context reservation logic) is for the
most recent 5.0 stable only.  Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

b2a20fd0

bnxt_en: Fix uninitialized variable usage in bnxt_rx_pkt(). · 0b397b17

Michael Chan authored Apr 25, 2019

In bnxt_rx_pkt(), if the driver encounters BD errors, it will recycle
the buffers and jump to the end where the uninitailized variable "len"
is referenced.  Fix it by adding a new jump label that will skip
the length update.  This is the most correct fix since the length
may not be valid when we get this type of error.

Fixes: 6a8788f2 ("bnxt_en: add support for software dynamic interrupt moderation")
Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Cc: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0b397b17

bnxt_en: Fix statistics context reservation logic. · 3f93cd3f

Michael Chan authored Apr 25, 2019

In an earlier commit that fixes the number of stats contexts to
reserve for the RDMA driver, we added a function parameter to pass in
the number of stats contexts to all the relevant functions.  The passed
in parameter should have been used to set the enables field of the
firmware message.

Fixes: 780baad4 ("bnxt_en: Reserve 1 stat_ctx for RDMA driver.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3f93cd3f

bnxt_en: Pass correct extended TX port statistics size to firmware. · ad361adf

Michael Chan authored Apr 25, 2019

If driver determines that extended TX port statistics are not supported
or allocation of the data structure fails, make sure to pass 0 TX stats
size to firmware to disable it. The firmware returned TX stats size should
also be set to 0 for consistency. This will prevent
bnxt_get_ethtool_stats() from accessing the NULL TX stats pointer in
case there is mismatch between firmware and driver.

Fixes: 36e53349 ("bnxt_en: Add additional extended port statistics.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ad361adf

bnxt_en: Fix possible crash in bnxt_hwrm_ring_free() under error conditions. · 1f83391b

Michael Chan authored Apr 25, 2019

If we encounter errors during open and proceed to clean up,
bnxt_hwrm_ring_free() may crash if the rings we try to free have never
been allocated.  bnxt_cp_ring_for_rx() or bnxt_cp_ring_for_tx()
may reference pointers that have not been allocated.

Fix it by checking for valid fw_ring_id first before calling
bnxt_cp_ring_for_rx() or bnxt_cp_ring_for_tx().

Fixes: 2c61d211 ("bnxt_en: Add helper functions to get firmware CP ring ID.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1f83391b

bnxt_en: Free short FW command HWRM memory in error path in bnxt_init_one() · f9099d61

Vasundhara Volam authored Apr 25, 2019

In the bnxt_init_one() error path, short FW command request memory
is not freed. This patch fixes it.

Fixes: e605db80 ("bnxt_en: Support for Short Firmware Message")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f9099d61

bnxt_en: Improve multicast address setup logic. · b4e30e8e

Michael Chan authored Apr 25, 2019

The driver builds a list of multicast addresses and sends it to the
firmware when the driver's ndo_set_rx_mode() is called.  In rare
cases, the firmware can fail this call if internal resources to
add multicast addresses are exhausted.  In that case, we should
try the call again by setting the ALL_MCAST flag which is more
guaranteed to succeed.

Fixes: c0c050c5 ("bnxt_en: New Broadcom ethernet driver.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b4e30e8e

26 Apr, 2019 12 commits

net: phy: marvell: Fix buffer overrun with stats counters · fdfdf867

Andrew Lunn authored Apr 25, 2019

marvell_get_sset_count() returns how many statistics counters there
are. If the PHY supports fibre, there are 3, otherwise two.

marvell_get_strings() does not make this distinction, and always
returns 3 strings. This then often results in writing past the end
of the buffer for the strings.

Fixes: 2170fef7 ("Marvell phy: add field to get errors from fiber link.")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fdfdf867

genetlink: use idr_alloc_cyclic for family->id assignment · 4e43df38

Marcel Holtmann authored Apr 24, 2019

When allocating the next family->id it makes more sense to use
idr_alloc_cyclic to avoid re-using a previously used family->id as much
as possible.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

4e43df38

qmi_wwan: new Wistron, ZTE and D-Link devices · 88ef66a2

Bjørn Mork authored Apr 24, 2019

Adding device entries found in vendor modified versions of this
driver.  Function maps for some of the devices follow:

WNC D16Q1, D16Q5, D18Q1 LTE CAT3 module (1435:0918)

MI_00 Qualcomm HS-USB Diagnostics
MI_01 Android Debug interface
MI_02 Qualcomm HS-USB Modem
MI_03 Qualcomm Wireless HS-USB Ethernet Adapter
MI_04 Qualcomm Wireless HS-USB Ethernet Adapter
MI_05 Qualcomm Wireless HS-USB Ethernet Adapter
MI_06 USB Mass Storage Device

 T:  Bus=02 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#=  2 Spd=480 MxCh= 0
 D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
 P:  Vendor=1435 ProdID=0918 Rev= 2.32
 S:  Manufacturer=Android
 S:  Product=Android
 S:  SerialNumber=0123456789ABCDEF
 C:* #Ifs= 7 Cfg#= 1 Atr=80 MxPwr=500mA
 I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
 E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
 E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
 I:* If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none)
 E:  Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
 E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
 I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
 E:  Ad=84(I) Atr=03(Int.) MxPS=  64 Ivl=32ms
 E:  Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
 E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
 I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
 E:  Ad=86(I) Atr=03(Int.) MxPS=  64 Ivl=32ms
 E:  Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
 E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
 I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
 E:  Ad=88(I) Atr=03(Int.) MxPS=  64 Ivl=32ms
 E:  Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
 I:* If#= 5 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
 E:  Ad=8a(I) Atr=03(Int.) MxPS=  64 Ivl=32ms
 E:  Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
 E:  Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms

WNC D18 LTE CAT3 module (1435:d182)

MI_00 Qualcomm HS-USB Diagnostics
MI_01 Androd Debug interface
MI_02 Qualcomm HS-USB Modem
MI_03 Qualcomm HS-USB NMEA
MI_04 Qualcomm Wireless HS-USB Ethernet Adapter
MI_05 Qualcomm Wireless HS-USB Ethernet Adapter
MI_06 USB Mass Storage Device

ZM8510/ZM8620/ME3960 (19d2:0396)

MI_00 ZTE Mobile Broadband Diagnostics Port
MI_01 ZTE Mobile Broadband AT Port
MI_02 ZTE Mobile Broadband Modem
MI_03 ZTE Mobile Broadband NDIS Port (qmi_wwan)
MI_04 ZTE Mobile Broadband ADB Port

ME3620_X (19d2:1432)

MI_00 ZTE Diagnostics Device
MI_01 ZTE UI AT Interface
MI_02 ZTE Modem Device
MI_03 ZTE Mobile Broadband Network Adapter
MI_04 ZTE Composite ADB Interface
Reported-by: Lars Melin <larsm17@gmail.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>

88ef66a2

net: ethernet: stmmac: manage the get_irq probe defer case · 56c5bc18

Fabien Dessenne authored Apr 24, 2019

Manage the -EPROBE_DEFER error case for "stm32_pwr_wakeup"  IRQ.
Signed-off-by: Fabien Dessenne <fabien.dessenne@st.com>
Acked-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

56c5bc18

l2tp: use rcu_dereference_sk_user_data() in l2tp_udp_encap_recv() · c1c47721

Eric Dumazet authored Apr 23, 2019

Canonical way to fetch sk_user_data from an encap_rcv() handler called
from UDP stack in rcu protected section is to use rcu_dereference_sk_user_data(),
otherwise compiler might read it multiple times.

Fixes: d00fa9ad ("il2tp: fix races with tunnel socket close")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: James Chapman <jchapman@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c1c47721

Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · ad759c90

David S. Miller authored Apr 26, 2019

Alexei Starovoitov says:

====================
pull-request: bpf 2019-04-25

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) the bpf verifier fix to properly mark registers in all stack frames, from Paul.

2) preempt_enable_no_resched->preempt_enable fix, from Peter.

3) other misc fixes.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

ad759c90

bpf: Fix preempt_enable_no_resched() abuse · 0edd6b64

Peter Zijlstra authored Apr 23, 2019

Unless the very next line is schedule(), or implies it, one must not use
preempt_enable_no_resched(). It can cause a preemption to go missing and
thereby cause arbitrary delays, breaking the PREEMPT=y invariant.

Cc: Roman Gushchin <guro@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

0edd6b64

selftests/bpf: test cases for pkt/null checks in subprogs · 6dd7f140

Paul Chaignon authored Apr 24, 2019

The first test case, for pointer null checks, is equivalent to the
following pseudo-code.  It checks that the verifier does not complain on
line 6 and recognizes that ptr isn't null.

1: ptr = bpf_map_lookup_elem(map, &key);
2: ret = subprog(ptr) {
3:   return ptr != NULL;
4: }
5: if (ret)
6:   value = *ptr;

The second test case, for packet bound checks, is equivalent to the
following pseudo-code.  It checks that the verifier does not complain on
line 7 and recognizes that the packet is at least 1 byte long.

1: pkt_end = ctx.pkt_end;
2: ptr = ctx.pkt + 8;
3: ret = subprog(ptr, pkt_end) {
4:   return ptr <= pkt_end;
5: }
6: if (ret)
7:   value = *(u8 *)ctx.pkt;
Signed-off-by: Paul Chaignon <paul.chaignon@orange.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

6dd7f140

bpf: mark registers in all frames after pkt/null checks · c6a9efa1

Paul Chaignon authored Apr 24, 2019

In case of a null check on a pointer inside a subprog, we should mark all
registers with this pointer as either safe or unknown, in both the current
and previous frames.  Currently, only spilled registers and registers in
the current frame are marked.  Packet bound checks in subprogs have the
same issue.  This patch fixes it to mark registers in previous frames as
well.

A good reproducer for null checks looks as follow:

1: ptr = bpf_map_lookup_elem(map, &key);
2: ret = subprog(ptr) {
3:   return ptr != NULL;
4: }
5: if (ret)
6:   value = *ptr;

With the above, the verifier will complain on line 6 because it sees ptr
as map_value_or_null despite the null check in subprog 1.

Note that this patch fixes another resulting bug when using
bpf_sk_release():

1: sk = bpf_sk_lookup_tcp(...);
2: subprog(sk) {
3:   if (sk)
4:     bpf_sk_release(sk);
5: }
6: if (!sk)
7:   return 0;
8: return 1;

In the above, mark_ptr_or_null_regs will warn on line 6 because it will
try to free the reference state, even though it was already freed on
line 3.

Fixes: f4d7e40a ("bpf: introduce function calls (verification)")
Signed-off-by: Paul Chaignon <paul.chaignon@orange.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

c6a9efa1

libbpf: add binary to gitignore · 39391377

Matteo Croce authored Apr 13, 2019

Some binaries are generated when building libbpf from tools/lib/bpf/,
namely libbpf.so.0.0.2 and libbpf.so.0.
Add them to the local .gitignore.
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

39391377

tools: bpftool: fix infinite loop in map create · 8694d8c1

Alban Crequy authored Apr 12, 2019

"bpftool map create" has an infinite loop on "while (argc)". The error
case is missing.

Symptoms: when forgetting to type the keyword 'type' in front of 'hash':
$ sudo bpftool map create /sys/fs/bpf/dir/foobar hash key 8 value 8 entries 128
(infinite loop, taking all the CPU)
^C

After the patch:
$ sudo bpftool map create /sys/fs/bpf/dir/foobar hash key 8 value 8 entries 128
Error: unknown arg hash

Fixes: 0b592b5a ("tools: bpftool: add map create command")
Signed-off-by: Alban Crequy <alban@kinvolk.io>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Song Liu <songliubraving@fb.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

8694d8c1

MIPS: eBPF: Make ebpf_to_mips_reg() static · ecfc3fca

YueHaibing authored Apr 10, 2019

Fix sparse warning:

arch/mips/net/ebpf_jit.c:196:5: warning:
 symbol 'ebpf_to_mips_reg' was not declared. Should it be static?
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

ecfc3fca