- 08 Apr, 2015 13 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextPablo Neira Ayuso authored
Resolve conflicts between 5888b93b ("Merge branch 'nf-hook-compress'") and Florian Westphal br_netfilter works. Conflicts: net/bridge/br_netfilter.c Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
-
Patrick McHardy authored
Add an userdata set extension and allow the user to attach arbitrary data to set elements. This is intended to hold TLV encoded data like comments or DNS annotations that have no meaning to the kernel. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
-
Patrick McHardy authored
Add a new "dynset" expression for dynamic set updates. A new set op ->update() is added which, for non existant elements, invokes an initialization callback and inserts the new element. For both new or existing elements the extenstion pointer is returned to the caller to optionally perform timer updates or other actions. Element removal is not supported so far, however that seems to be a rather exotic need and can be added later on. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
-
Patrick McHardy authored
Currently a set binding is assumed to be related to a lookup and, in case of maps, a data load. In order to use bindings for set updates, the loop detection checks must be restricted to map operations only. Add a flags member to the binding struct to hold the set "action" flags such as NFT_SET_MAP, and perform loop detection based on these. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
-
Patrick McHardy authored
Use atomic operations for the element count to avoid races with async updates. To properly handle the transactional semantics during netlink updates, deleted but not yet committed elements are accounted for seperately and are treated as being already removed. This means for the duration of a netlink transaction, the limit might be exceeded by the amount of elements deleted. Set implementations must be prepared to handle this. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
-
Patrick McHardy authored
The NFT_SET_TIMEOUT flag is ignore in nft_select_set_ops, which may lead to selection of a set implementation that doesn't actually support timeouts. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
-
Florian Westphal authored
nf_bridge_info->mask is used for several things, for example to remember if skb->pkt_type was set to OTHER_HOST. For a bridge, OTHER_HOST is expected case. For ip forward its a non-starter though -- routing expects PACKET_HOST. Bridge netfilter thus changes OTHER_HOST to PACKET_HOST before hook invocation and then un-does it after hook traversal. This information is irrelevant outside of br_netfilter. After this change, ->mask now only contains flags that need to be known outside of br_netfilter in fast-path. Future patch changes mask into a 2bit state field in sk_buff, so that we can remove skb->nf_bridge pointer for good and consider all remaining places that access nf_bridge info content a not-so fastpath. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
-
Florian Westphal authored
->mask is a bit info field that mixes various use cases. In particular, we have flags that are mutually exlusive, and flags that are only used within br_netfilter while others need to be exposed to other parts of the kernel. Remove BRNF_8021Q/PPPoE flags. They're mutually exclusive and only needed within br_netfilter context. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
-
Florian Westphal authored
Don't access skb->nf_bridge directly, this pointer will be removed soon. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
-
Florian Westphal authored
Avoid skb->nf_bridge accesses where possible. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
-
Florian Westphal authored
right now we store this in the nf_bridge_info struct, accessible via skb->nf_bridge. This patch prepares removal of this pointer from skb: Instead of using skb->nf_bridge->x, we use helpers to obtain the in/out device (or ifindexes). Followup patches to netfilter will then allow nf_bridge_info to be obtained by a call into the br_netfilter core, rather than keeping a pointer to it in sk_buff. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
-
Florian Westphal authored
br_netfilter maintains an extra state, nf_bridge_info, which is attached to skb via skb->nf_bridge pointer. Amongst other things we use skb->nf_bridge->data to store the original mac header for every processed skb. This is required for ip refragmentation when using conntrack on top of bridge, because ip_fragment doesn't copy it from original skb. However there is no need anymore to do this unconditionally. Move this to the one place where its needed -- when br_netfilter calls ip_fragment(). Also switch to percpu storage for this so we can handle fragmenting without accessing nf_bridge meta data. Only user left is neigh resolution when DNAT is detected, to hold the original source mac address (neigh resolution builds new mac header using bridge mac), so rename ->data and reduce its size to whats needed. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
-
Daniel Borkmann authored
Currently in xt_socket, we take advantage of early demuxed sockets since commit 00028aa3 ("netfilter: xt_socket: use IP early demux") in order to avoid a second socket lookup in the fast path, but we only make partial use of this: We still unnecessarily parse headers, extract proto, {s,d}addr and {s,d}ports from the skb data, accessing possible conntrack information, etc even though we were not even calling into the socket lookup via xt_socket_get_sock_{v4,v6}() due to skb->sk hit, meaning those cycles can be spared. After this patch, we only proceed the slower, manual lookup path when we have a skb->sk miss, thus time to match verdict for early demuxed sockets will improve further, which might be i.e. interesting for use cases such as mentioned in 681f130f ("netfilter: xt_socket: add XT_SOCKET_NOWILDCARD flag"). Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
-
- 07 Apr, 2015 20 commits
-
-
Haiyang Zhang authored
In the two places changed, we now use netvsc_xmit_completion() which properly frees hv_netvsc_packet in or not in skb headroom. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Haiyang Zhang authored
The sum of RNDIS msg and PPI struct sizes is used in multiple places, so we define a macro for them. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Haiyang Zhang authored
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Lee authored
Fast Open has been using an experimental option with a magic number (RFC6994). This patch makes the client by default use the RFC7413 option (34) to get and send Fast Open cookies. This patch makes the client solicit cookies from a given server first with the RFC7413 option. If that fails to elicit a cookie, then it tries the RFC6994 experimental option. If that also fails, it uses the RFC7413 option on all subsequent connect attempts. If the server returns a Fast Open cookie then the client caches the form of the option that successfully elicited a cookie, and uses that form on later connects when it presents that cookie. The idea is to gradually obsolete the use of experimental options as the servers and clients upgrade, while keeping the interoperability meanwhile. Signed-off-by: Daniel Lee <Longinus00@gmail.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Lee authored
Fast Open has been using the experimental option with a magic number (RFC6994) to request and grant Fast Open cookies. This patch enables the server to support the official IANA option 34 in RFC7413 in addition. The change has passed all existing Fast Open tests with both old and new options at Google. Signed-off-by: Daniel Lee <Longinus00@gmail.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Hariprasad Shenai authored
Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nicolas Dichtel authored
Also move 'group' description to match the order of the net_device structure. Fixes: 7a66bbc9 ("net: remove iflink field from struct net_device") Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Nicolas Dichtel says: ==================== netns: enhance netlink interface for nsid The first patch is a small cleanup. The second patch implements notifications for netns id events. And the last one allows to dump existing netns id from userland. iproute2 patches are available, I can send them on demand. v2: drop the first patch (the fix is now in net-next) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nicolas Dichtel authored
Which this patch, it's possible to dump the list of ids allocated for peer netns. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nicolas Dichtel authored
With this patch, netns ids that are created and deleted are advertised into the group RTNLGRP_NSID. Because callers of rtnl_net_notifyid() already know the id of the peer, there is no need to call __peernet2id() in rtnl_net_fill(). Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nicolas Dichtel authored
No need to initialize err, it will be overridden by the value of nlmsg_parse(). Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Prevent UDP tunnels from operating on garbage socket So this should do the rest of the work such that when we encapsulate into a UDP tunnel, the output path works on the UDP tunnel's socket rather than skb->sk. Part of this work is based upon changes done by Jiri Pirko some time ago. Basically the first step is to pass the socket through the nf_hook okfn(), and then next we do the same for the UDP tunnel xmit routines. Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Miller authored
That was we can make sure the output path of ipv4/ipv6 operate on the UDP socket rather than whatever random thing happens to be in skb->sk. Based upon a patch by Jiri Pirko. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
-
David Miller authored
On the output paths in particular, we have to sometimes deal with two socket contexts. First, and usually skb->sk, is the local socket that generated the frame. And second, is potentially the socket used to control a tunneling socket, such as one the encapsulates using UDP. We do not want to disassociate skb->sk when encapsulating in order to fix this, because that would break socket memory accounting. The most extreme case where this can cause huge problems is an AF_PACKET socket transmitting over a vxlan device. We hit code paths doing checks that assume they are dealing with an ipv4 socket, but are actually operating upon the AF_PACKET one. Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Miller authored
It is currently always set to NULL, but nf_queue is adjusted to be prepared for it being set to a real socket by taking and releasing a reference to that socket when necessary. Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Miller authored
This way we can consolidate where we setup new nf_hook_state objects, to make sure the entire thing is initialized. The only other place an nf_hook_object is instantiated is nf_queue, wherein a structure copy is used. Signed-off-by: David S. Miller <davem@davemloft.net>
-
Julia Lawall authored
Return a negative error code on failure. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ identifier ret; expression e1,e2; @@ ( if (\(ret < 0\|ret != 0\)) { ... return ret; } | ret = 0 ) ... when != ret = e1 when != &ret *if(...) { ... when != ret = e2 when forall return ret; } // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Julia Lawall authored
Return a negative error code on failure. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ identifier ret; expression e1,e2; @@ ( if (\(ret < 0\|ret != 0\)) { ... return ret; } | ret = 0 ) ... when != ret = e1 when != &ret *if(...) { ... when != ret = e2 when forall return ret; } // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2015-04-04 Here's what's probably the last bluetooth-next pull request for 4.1: - Fixes for LE advertising data & advertising parameters - Fix for race condition with HCI_RESET flag - New BNEPGETSUPPFEAT ioctl, needed for certification - New HCI request callback type to get the resulting skb - Cleanups to use BIT() macro wherever possible - Consolidate Broadcom device entries in the btusb HCI driver - Check for valid flags in CMTP, HIDP & BNEP - Disallow local privacy & OOB data combo to prevent a potential race - Expose SMP & ECDH selftest results through debugfs - Expose current Device ID info through debugfs Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller authored
Conflicts: drivers/net/ethernet/mellanox/mlx4/cmd.c net/core/fib_rules.c net/ipv4/fib_frontend.c The fib_rules.c and fib_frontend.c conflicts were locking adjustments in 'net' overlapping addition and removal of code in 'net-next'. The mlx4 conflict was a bug fix in 'net' happening in the same place a constant was being replaced with a more suitable macro. Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 06 Apr, 2015 7 commits
-
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds authored
Pull networking fixes from David Miller: 1) In TCP, don't register an FRTO for cumulatively ACK'd data that was previously SACK'd, from Neal Cardwell. 2) Need to hold RNL mutex in ipv4 multicast code namespace cleanup, from Cong WANG. 3) Similarly we have to hold RNL mutex for fib_rules_unregister(), also from Cong WANG. 4) Revert and rework netns nsid allocation fix, from Nicolas Dichtel. 5) When we encapsulate for a tunnel device, skb->sk still points to the user socket. So this leads to cases where we retraverse the ipv4/ipv6 output path with skb->sk being of some other address family (f.e. AF_PACKET). This can cause things to crash since the ipv4 output path is dereferencing an AF_PACKET socket as if it were an ipv4 one. The short term fix for 'net' and -stable is to elide these socket checks once we've entered an encapsulation sequence by testing xmit_recursion. Longer term we have a better solution wherein we pass the tunnel's socket down through the output paths, but that is way too invasive for 'net' and -stable. From Hannes Frederic Sowa. 6) l2tp_init() failure path forgets to unregister per-net ops, from Cong WANG. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: net/mlx4_core: Fix error message deprecation for ConnectX-2 cards net: dsa: fix filling routing table from OF description l2tp: unregister l2tp_net_ops on failure path mvneta: dont call mvneta_adjust_link() manually ipv6: protect skb->sk accesses from recursive dereference inside the stack netns: don't allocate an id for dead netns Revert "netns: don't clear nsid too early on removal" ip6mr: call del_timer_sync() in ip6mr_free_table() net: move fib_rules_unregister() under rtnl lock ipv4: take rtnl_lock and mark mrt table as freed on namespace cleanup tcp: fix FRTO undo on cumulative ACK of SACKed range xen-netfront: transmit fully GSO-sized packets
-
Jack Morgenstein authored
Commit 1daa4303 ("net/mlx4_core: Deprecate error message at ConnectX-2 cards startup to debug") did the deprecation only for port 1 of the card. Need to deprecate for port 2 as well. Fixes: 1daa4303 ("net/mlx4_core: Deprecate error message at ConnectX-2 cards startup to debug") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Pavel Nakonechny authored
According to description in 'include/net/dsa.h', in cascade switches configurations where there are more than one interconnected devices, 'rtable' array in 'dsa_chip_data' structure is used to indicate which port on this switch should be used to send packets to that are destined for corresponding switch. However, dsa_of_setup_routing_table() fills 'rtable' with port numbers of the _target_ switch, but not current one. This commit removes redundant devicetree parsing and adds needed port number as a function argument. So dsa_of_setup_routing_table() now just looks for target switch number by parsing parent of 'link' device node. To remove possible misunderstandings with the way of determining target switch number, a corresponding comment was added to the source code and to the DSA device tree bindings documentation file. This was tested on a custom board with two Marvell 88E6095 switches with following corresponding routing tables: { -1, 10 } and { 8, -1 }. Signed-off-by: Pavel Nakonechny <pavel.nakonechny@skitlab.ru> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/inputLinus Torvalds authored
Pull input fixes from Dmitry Torokhov: "Updates for the input subsystem - two more tweaks for ALPS driver to work out kinks after splitting the touchpad, trackstick, and potential external PS/2 mouse into separate input devices. Changes to support ALPS SS4 devices (protocol V8) will be coming in 4.1..." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: alps - document stick behavior for protocol V2 Input: alps - report V2 Dualpoint Stick events via the right evdev node Input: alps - report interleaved bare PS/2 packets via dev3
-
WANG Cong authored
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael S. Tsirkin authored
commit d631b94e virtio: change comment in transmit started clarifying the logic behind queue state management, but introduced an inaccuracy: TX_BUSY does not cause a BUG message. Clean this up some more, explaining the tradeoffs in detail. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-