- 13 Nov, 2017 40 commits
-
-
David Howells authored
The AFS abort code space is shared across all services, so there's no need for separate abort_to_error translators for each service. Consolidate them into a single function and remove the function pointers for them. Signed-off-by: David Howells <dhowells@redhat.com>
-
David Howells authored
Allow VL server specifications to be given IPv6 addresses as well as IPv4 addresses, for example as: echo add foo.org 1111:2222:3333:0:4444:5555:6666:7777 >/proc/fs/afs/cells Note that ':' is the expected separator for separating IPv4 addresses, but if a ',' is detected or no '.' is detected in the string, the delimiter is switched to ','. This also works with DNS AFSDB or SRV record strings fetched by upcall from userspace. Signed-off-by: David Howells <dhowells@redhat.com>
-
David Howells authored
Keep and pass sockaddr_rxrpc addresses around rather than keeping and passing in_addr addresses to allow for the use of IPv6 and non-standard port numbers in future. This also allows the port and service_id fields to be removed from the afs_call struct. Signed-off-by: David Howells <dhowells@redhat.com>
-
David Howells authored
Update the cache index structure in the following ways: (1) Don't use the volume name followed by the volume type as levels in the cache index. Volumes can be renamed. Use the volume ID instead. (2) Don't store the VLDB data for a volume in the tree. If the volume database should be cached locally, then it should be done in a separate tree. (3) Expand the volume ID stored in the cache to 64 bits. (4) Expand the file/vnode ID stored in the cache to 96 bits. (5) Increment the cache structure version number to 1. Signed-off-by: David Howells <dhowells@redhat.com>
-
David Howells authored
Add some protocol definitions, including max field lengths, flag defs, an XDR-encoded UUID def, more VL operation IDs and more fileserver abort codes. Signed-off-by: David Howells <dhowells@redhat.com>
-
David Howells authored
Push the network namespace pointer to more places in AFS, including the afs_server structure (which doesn't hold a ref on the netns). In particular, afs_put_cell() now takes requires a net ns parameter so that it can safely alter the netns after decrementing the cell usage count - the cell will be deallocated by a background thread after being cached for a period, which means that it's not safe to access it after reducing its usage count. Signed-off-by: David Howells <dhowells@redhat.com>
-
David Howells authored
Keep a reference to the cell in the superblock info structure in addition to the volume and net pointers. This will make it easier to clean up in a future patch in which afs_put_volume() will need the cell pointer. Whilst we're at it, make the cell and volume getting functions return a pointer to the object got to make the call sites look neater. Signed-off-by: David Howells <dhowells@redhat.com>
-
David Howells authored
Fix server reaping and make sure it's all done before we start trying to purge cells, given that servers currently pin cells. Signed-off-by: David Howells <dhowells@redhat.com>
-
David Howells authored
Close the rxrpc socket only after we've purged the server records (and also cell and volume records which might refer to servers) so that we can give up the callbacks on each server. Signed-off-by: David Howells <dhowells@redhat.com>
-
David Howells authored
Lay the groundwork for supporting network namespaces (netns) to the AFS filesystem by moving various global features to a network-namespace struct (afs_net) and providing an instance of this as a temporary global variable that everything uses via accessor functions for the moment. The following changes have been made: (1) Store the netns in the superblock info. This will be obtained from the mounter's nsproxy on a manual mount and inherited from the parent superblock on an automount. (2) The cell list is made per-netns. It can be viewed through /proc/net/afs/cells and also be modified by writing commands to that file. (3) The local workstation cell is set per-ns in /proc/net/afs/rootcell. This is unset by default. (4) The 'rootcell' module parameter, which sets a cell and VL server list modifies the init net namespace, thereby allowing an AFS root fs to be theoretically used. (5) The volume location lists and the file lock manager are made per-netns. (6) The AF_RXRPC socket and associated I/O bits are made per-ns. The various workqueues remain global for the moment. Changes still to be made: (1) /proc/fs/afs/ should be moved to /proc/net/afs/ and a symlink emplaced from the old name. (2) A per-netns subsys needs to be registered for AFS into which it can store its per-netns data. (3) Rather than the AF_RXRPC socket being opened on module init, it needs to be opened on the creation of a superblock in that netns. (4) The socket needs to be closed when the last superblock using it is destroyed and all outstanding client calls on it have been completed. This prevents a reference loop on the namespace. (5) It is possible that several namespaces will want to use AFS, in which case each one will need its own UDP port. These can either be set through /proc/net/afs/cm_port or the kernel can pick one at random. The init_ns gets 7001 by default. Other issues that need resolving: (1) The DNS keyring needs net-namespacing. (2) Where do upcalls go (eg. DNS request-key upcall)? (3) Need something like open_socket_in_file_ns() syscall so that AFS command line tools attempting to operate on an AFS file/volume have their RPC calls go to the right place. Signed-off-by: David Howells <dhowells@redhat.com>
-
David Howells authored
Make wait_on_atomic_t() pass the TASK_* mode onto its action function as an extra argument and make it 'unsigned int throughout. Also, consolidate a bunch of identical action functions into a default function that can do the appropriate thing for the mode. Also, change the argument name in the bit_wait*() function declarations to reflect the fact that it's the mode and not the bit number. [Peter Z gives this a grudging ACK, but thinks that the whole atomic_t wait should be done differently, though he's not immediately sure as to how] Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Peter Zijlstra <peterz@infradead.org> cc: Ingo Molnar <mingo@kernel.org>
-
David Howells authored
These AFS patches need the timer_reduce() patch from timers/core. Signed-off-by: David Howells <dhowells@redhat.com>
-
David S. Miller authored
Xin Long says: ==================== net: improve the process of redirect and toobig for ipv6 tunnels Now let's say there are 3 kinds of icmp packets to process for tunnels, toobig(needfrag), redirect, others, their process should be: - toobig(needfrag) update the lower dst's pmtu by route cache, also update sk dst's pmtu if possible, or it will be fine if sk dst pmtu will get updated on tx path. - redirect update the lower dst's gw by route cache and return, no need to send this redirect packet to user sk. - others send the packet to user's sk, or it will also be fine to use err_count to count it and report fail link on tx path. All ipv4 tunnels basically follow this while some of ipv6 tunnels are doing in different ways, like ip6gre and ip6_tunnels update tnl dev's mtu instead of updating lower dst pmtu, no redirect process on their err_handlers, which doesn't make any sense and even causes performance problems. This patchset is to improve the process of redirect and toobig for ip6gre ip4ip6, ip6ip6 tunnels, as in ipv4 tunnels. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Xin Long authored
This patch is to remove some useless codes of redirect and fix some indents on ip4ip6 and ip6ip6's err_handlers. Note that redirect icmp packet is already processed in ip6_tnl_err, the old redirect codes in ip4ip6_err actually never worked even before this patch. Besides, there's no need to send redirect to user's sk, it's for lower dst, so just remove it in this patch. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Xin Long authored
The same improvement in "ip6_gre: process toobig in a better way" is needed by ip4ip6 and ip6ip6 as well. Note that ip4ip6 and ip6ip6 will also update sk dst pmtu in their err_handlers. Like I said before, gre6 could not do this as it's inner proto is not certain. But for all of them, sk dst pmtu will be updated in tx path if in need. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Xin Long authored
The same process for redirect in "ip6_gre: add the process for redirect in ip6gre_err" is needed by ip4ip6 and ip6ip6 as well. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Xin Long authored
Now ip6gre processes toobig icmp packet by setting gre dev's mtu in ip6gre_err, which would cause few things not good: - It couldn't set mtu with dev_set_mtu due to it's not in user context, which causes route cache and idev->cnf.mtu6 not to be updated. - It has to update sk dst pmtu in tx path according to gredev->mtu for ip6gre, while it updates pmtu again according to lower dst pmtu in ip6_tnl_xmit. - To change dev->mtu by toobig icmp packet is not a good idea, it should only work on pmtu. This patch is to process toobig by updating the lower dst's pmtu, as later sk dst pmtu will be updated in ip6_tnl_xmit, the same way as in ip4gre. Note that gre dev's mtu will not be updated any more, it doesn't make any sense to change dev's mtu after receiving a toobig packet. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Xin Long authored
This patch is to add redirect icmp packet process for ip6gre by calling ip6_redirect() in ip6gre_err(), as in vti6_err. Prior to this patch, there's even no route cache generated after receiving redirect. Reported-by: Jianlin Shi <jishi@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Zhu Yanjun authored
In xmit process, the variables are set many times. In fact, it is enough for these variables to be set once. After a long time test, the throughput performance is better than before. CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Joe Jin <joe.jin@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-nextDavid S. Miller authored
Samuel Ortiz says: ==================== NFC 4.15 pull request This is the NFC pull request for 4.15. We have: - A new netlink command for explicitly deactivating NFC targets - i2c constification for all NFC drivers - One NFC device allocation error path fix ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Andy Zhou says: ==================== Openvswitch meter action This patch series is the first attempt to add openvswitch meter support. We have previously experimented with adding metering support in nftables. However 1) It was not clear how to expose a named nftables object cleanly, and 2) the logic that implements metering is quite small, < 100 lines of code. With those two observations, it seems cleaner to add meter support in the openvswitch module directly. --- v1(RFC)->v2: remove unused code improve locking and other review comments v2 -> v3: rebase v3 -> v4: fix undefined "__udivdi3" references on 32 bit builds. use div_u64() instead. v4 -> v5: rebase ==================== Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andy Zhou authored
Implements OVS kernel meter action support. Signed-off-by: Andy Zhou <azhou@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andy Zhou authored
OVS kernel datapath so far does not support Openflow meter action. This is the first stab at adding kernel datapath meter support. This implementation supports only drop band type. Signed-off-by: Andy Zhou <azhou@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andy Zhou authored
Later patches will invoke get_dp() outside of datapath.c. Export it. Signed-off-by: Andy Zhou <azhou@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andy Zhou authored
Meter has its own netlink family. Define netlink messages and attributes for communicating with the user space programs. Signed-off-by: Andy Zhou <azhou@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Florian Fainelli says: ==================== net: dsa: b53: Support prepended Broadcom tags This patch series adds support for prepended 4-bytes Broadcom tags that we already support. This type of tag will typically be used when interfaced to a SoC like BCM58xx (NorthStar Plus) which supports a Flow Accelerator (WIP). In that case, we need to support a slightly different tagging format. The first patch does a bit of re-factoring and passes a port index to the get_tag_protocol() function since at least two different drivers need that type of information (mt7530, b53) to support tagging or not. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
On BCM58xx devices (Northstar Plus), there is an accelerator attached to port 8 which would only work if we use prepended Broadcom tags. Resolve that difference in our get_tag_protocol() function by setting the appropriate tagging protocol in that case. We need to change b53_brcm_hdr_setup() a little bit now since we can deal with two types of Broadcom tags. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
Add a new type: DSA_TAG_PROTO_PREPEND which allows us to support for the 4-bytes Broadcom tag that we already support, but in a format where it is pre-pended to the packet instead of located between the MAC SA and the Ethertyper (DSA_TAG_PROTO_BRCM). Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
In preparation for supporting the same Broadcom tag format, but instead of inserted between the MAC SA and EtherType, prepended to the Ethernet frame, restructure the code a little bit to make that possible and take an offset parameter. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
A number of drivers want to check whether the configured CPU port is a possible configuration for enabling tagging, pass down the CPU port number so they verify that. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andrew Morton authored
gcc-4.4.4 (at lest) has issues with initializers and anonymous unions: net/sched/sch_red.c: In function 'red_dump_offload': net/sched/sch_red.c:282: error: unknown field 'stats' specified in initializer net/sched/sch_red.c:282: warning: initialization makes integer from pointer without a cast net/sched/sch_red.c:283: error: unknown field 'stats' specified in initializer net/sched/sch_red.c:283: warning: initialization makes integer from pointer without a cast net/sched/sch_red.c: In function 'red_dump_stats': net/sched/sch_red.c:352: error: unknown field 'xstats' specified in initializer net/sched/sch_red.c:352: warning: initialization makes integer from pointer without a cast Work around this. Fixes: 602f3baf ("net_sch: red: Add offload ability to RED qdisc") Cc: Nogah Frankel <nogahf@mellanox.com> Cc: Jiri Pirko <jiri@mellanox.com> Cc: Simon Horman <simon.horman@netronome.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Slava Shwartsman authored
Since Mellanox focus is on newer adapters, we would like to have the ability to disable the support for old gen2 adapters. This can be done by turning off the MLX4_CORE_GEN2 Kconfig flag. We keep it turned on by default. Signed-off-by: Slava Shwartsman <slavash@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jason A. Donenfeld authored
The way people generally use netlink_dump is that they fill in the skb as much as possible, breaking when nla_put returns an error. Then, they get called again and start filling out the next skb, and again, and so forth. The mechanism at work here is the ability for the iterative dumping function to detect when the skb is filled up and not fill it past the brim, waiting for a fresh skb for the rest of the data. However, if the attributes are small and nicely packed, it is possible that a dump callback function successfully fills in attributes until the skb is of size 4080 (libmnl's default page-sized receive buffer size). The dump function completes, satisfied, and then, if it happens to be that this is actually the last skb, and no further ones are to be sent, then netlink_dump will add on the NLMSG_DONE part: nlh = nlmsg_put_answer(skb, cb, NLMSG_DONE, sizeof(len), NLM_F_MULTI); It is very important that netlink_dump does this, of course. However, in this example, that call to nlmsg_put_answer will fail, because the previous filling by the dump function did not leave it enough room. And how could it possibly have done so? All of the nla_put variety of functions simply check to see if the skb has enough tailroom, independent of the context it is in. In order to keep the important assumptions of all netlink dump users, it is therefore important to give them an skb that has this end part of the tail already reserved, so that the call to nlmsg_put_answer does not fail. Otherwise, library authors are forced to find some bizarre sized receive buffer that has a large modulo relative to the common sizes of messages received, which is ugly and buggy. This patch thus saves the NLMSG_DONE for an additional message, for the case that things are dangerously close to the brim. This requires keeping track of the errno from ->dump() across calls. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Dave Taht says: ==================== netem: add nsec scheduling and slot feature This patch series converts netem away from the old "ticks" interface and userspace API, and adds support for a new "slot" feature intended to emulate bursty macs such as WiFi and LTE better. Changes since v2: Use u64 for packet_len_sched_time() Use simpler max(time_to_send,q->slot.slot_next) Changes since v1: Always pass new nanosecond APIs to userspace ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dave Taht authored
Slotting is a crude approximation of the behaviors of shared media such as cable, wifi, and LTE, which gather up a bunch of packets within a varying delay window and deliver them, relative to that, nearly all at once. It works within the existing loss, duplication, jitter and delay parameters of netem. Some amount of inherent latency must be specified, regardless. The new "slot" parameter specifies a minimum and maximum delay between transmission attempts. The "bytes" and "packets" parameters can be used to limit the amount of information transferred per slot. Examples of use: tc qdisc add dev eth0 root netem delay 200us \ slot 800us 10ms bytes 64k packets 42 A more correct example, using stacked netem instances and a packet limit to emulate a tail drop wifi queue with slots and variable packet delivery, with a 200Mbit isochronous underlying rate, and 20ms path delay: tc qdisc add dev eth0 root handle 1: netem delay 20ms rate 200mbit \ limit 10000 tc qdisc add dev eth0 parent 1:1 handle 10:1 netem delay 200us \ slot 800us 10ms bytes 64k packets 42 limit 512 Signed-off-by: Dave Taht <dave.taht@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dave Taht authored
netem userspace has long relied on a horrible /proc/net/psched hack to translate the current notion of "ticks" to nanoseconds. Expressing latency and jitter instead, in well defined nanoseconds, increases the dynamic range of emulated delays and jitter in netem. It will also ease a transition where reducing a tick to nsec equivalence would constrain the max delay in prior versions of netem to only 4.3 seconds. Signed-off-by: Dave Taht <dave.taht@gmail.com> Suggested-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dave Taht authored
Upgrade the internal netem scheduler to use nanoseconds rather than ticks throughout. Convert to and from the std "ticks" userspace api automatically, while allowing for finer grained scheduling to take place. Signed-off-by: Dave Taht <dave.taht@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Francesco Ruggeri authored
Avoid traversing the list of mr6_tables (which requires the rtnl_lock) in ip6mr_sk_done(), when we know in advance that a match will not be found. This can happen when rawv6_close()/ip6mr_sk_done() is invoked on non-mroute6 sockets. This patch helps reduce rtnl_lock contention when destroying a large number of net namespaces, each having a non-mroute6 raw socket. v2: same patch, only fixed subject line and expanded comment. Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Colin Ian King authored
The variable giga_ctrl is being assigned to zero however this is never read and hence the assignment is redundant, so remove it. Cleans up clang warning: drivers/net/ethernet/realtek/r8169.c:1978:3: warning: Value stored to 'giga_ctrl' is never read Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Egil Hjelmeland authored
Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no> Signed-off-by: David S. Miller <davem@davemloft.net>
-