- 25 Aug, 2014 24 commits
-
-
Daniel Borkmann authored
This implements the deferred tail pointer flush API for the ixgbe driver. A similar version was proposed some time ago by Alexander Duyck. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
As reported by Jesper Dangaard Brouer, for high packet rates the overhead of having another indirect call in the TX path is non-trivial. There is the indirect call itself, and then there is all of the reloading of the state to refetch the tail pointer value and then write the device register. Move to a more passive scheme, which requires very light modifications to the device drivers. The signal is a new skb->xmit_more value: if it is non-zero, more SKBs are pending to be transmitted on the same queue as the current SKB, and the driver may therefore elide the tail pointer update. Right now skb->xmit_more is always zero. Signed-off-by: David S. Miller <davem@davemloft.net>
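The passive scheme described above can be sketched in userspace C. This is only an illustration under stated assumptions, not driver code: `struct fake_skb`, `struct fake_ring`, `fake_start_xmit()` and `fake_send_burst()` are hypothetical names, and a counter stands in for the MMIO write to the tail register.

```c
#include <stdint.h>

/* Hypothetical stand-in for an SKB; only the xmit_more hint matters here. */
struct fake_skb {
    unsigned int xmit_more; /* non-zero: more packets follow on this queue */
};

/* Hypothetical TX ring; tail_writes counts simulated register writes. */
struct fake_ring {
    uint32_t next_to_use;     /* software producer index */
    uint32_t hw_tail;         /* last value "written" to the tail register */
    unsigned int tail_writes; /* how many times the register was touched */
};

/* Queue one descriptor; update the tail only when nothing else is pending. */
static void fake_start_xmit(struct fake_ring *ring, const struct fake_skb *skb)
{
    ring->next_to_use++; /* descriptor setup elided in this sketch */

    if (!skb->xmit_more) {
        ring->hw_tail = ring->next_to_use; /* simulated MMIO write */
        ring->tail_writes++;
    }
}

/* Send n packets back to back, flagging all but the last with xmit_more. */
static void fake_send_burst(struct fake_ring *ring, unsigned int n)
{
    for (unsigned int i = 0; i < n; i++) {
        struct fake_skb skb = { .xmit_more = (i + 1 < n) };

        fake_start_xmit(ring, &skb);
    }
}
```

With a burst of four packets, the sketch touches the simulated tail register once instead of four times, which is the saving the scheme is after.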
-
David S. Miller authored
Amir Vadai says:
====================
Make is_kdump_kernel() accessible from modules

I'm re-spinning this patchset. At the beginning it was suggested to use a different name for the parameter, but in the end [3] the resolution was to leave it as it is in this patch.

Drivers need to know whether they are running in a kdump kernel in order to change their memory profile, since the kdump environment is limited in available memory. Currently there are drivers that use reset_devices, as suggested in [2]. However, the context in [2] was to let a driver know when the hardware device needs to be reset, not whether this is a kdump environment. We think that is_kdump_kernel() is better suited to select between different memory profiles.

The first patch in this patchset exports a symbol needed to make is_kdump_kernel() accessible from the drivers. The rest of the patches change from reset_devices to is_kdump_kernel() in 2 networking drivers.

The idea of this patchset was suggested by Vivek Goyal.

Tested (build only) and applied on top of commit 8fc54f68: ("net: use reciprocal_scale() helper")

[1] - ea1c1af1: ("net/mlx4_en: Reduce memory consumption on kdump kernel")
[2] - https://lkml.org/lkml/2011/1/27/341
[3] - http://www.spinics.net/lists/netdev/msg291492.html
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Amir Vadai authored
Use is_kdump_kernel() to detect kdump kernel, instead of reset_devices. CC: Ariel Elior <ariel.elior@qlogic.com> CC: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Amir Vadai authored
Use is_kdump_kernel() to detect kdump kernel, instead of reset_devices. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Amir Vadai authored
In order to make is_kdump_kernel() accessible from modules, elfcorehdr_addr needs to be exported. This was rejected in the past [1] because reset_devices was preferred in that context (resetting the device in a kdump kernel), but now there are some network drivers that need to reduce memory usage when loaded from a kdump kernel. In that context, is_kdump_kernel() is a better fit. [1] - https://lkml.org/lkml/2011/1/27/341 CC: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
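A minimal userspace sketch of why the export matters: is_kdump_kernel() is an inline helper that compares elfcorehdr_addr against a sentinel, so a module calling it needs that symbol. The definitions below are stand-ins written for illustration (in the kernel they live in the crash dump header), and the ring sizes and `pick_rx_ring_size()` are hypothetical, not taken from any driver.

```c
#include <stdbool.h>

/* Userspace stand-ins for the kernel symbol and its "not set" sentinel. */
#define ELFCORE_ADDR_MAX (-1ULL)
static unsigned long long elfcorehdr_addr = ELFCORE_ADDR_MAX;

/* A kdump kernel is one that was handed an ELF core header address. */
static bool is_kdump_kernel(void)
{
    return elfcorehdr_addr != ELFCORE_ADDR_MAX;
}

/* Hypothetical memory profiles for an imaginary NIC driver. */
enum { RX_RING_DEFAULT = 1024, RX_RING_KDUMP = 64 };

/* Pick a small ring when memory is scarce, as in a kdump environment. */
static unsigned int pick_rx_ring_size(void)
{
    return is_kdump_kernel() ? RX_RING_KDUMP : RX_RING_DEFAULT;
}
```

The point of the sketch is the question being asked: the driver selects a memory profile based on "am I in a kdump kernel?", which is a different question from "should I reset the device?".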
-
Pavel Machek authored
This adds simple cleanups for stmmac: removing a test we know is always true, fixing whitespace, and moving code out of if(). Signed-off-by: Pavel Machek <pavel@denx.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
hayeswang authored
626: CHECK: Alignment should match open parenthesis
646: CHECK: Alignment should match open parenthesis
655: CHECK: Alignment should match open parenthesis
695: CHECK: Alignment should match open parenthesis
729: CHECK: Alignment should match open parenthesis
739: CHECK: Alignment should match open parenthesis
976: WARNING: externs should be avoided in .c files
1314: CHECK: Alignment should match open parenthesis
1358: WARNING: networking block comments don't use an empty /* line, use /* Comment...
1402: WARNING: networking block comments don't use an empty /* line, use /* Comment...
1521: CHECK: multiple assignments should be avoided
1775: CHECK: Alignment should match open parenthesis
1838: CHECK: multiple assignments should be avoided
1843: CHECK: multiple assignments should be avoided
1847: CHECK: multiple assignments should be avoided
1850: WARNING: Missing a blank line after declarations
1864: CHECK: Alignment should match open parenthesis
1872: CHECK: braces {} should be used on all arms of this statement
1906: CHECK: usleep_range is preferred over udelay
2865: WARNING: networking block comments don't use an empty /* line, use /* Comment...
3088: CHECK: Alignment should match open parenthesis
total: 0 errors, 5 warnings, 16 checks, 3567 lines checked
Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Basic deferred TX queue flushing infrastructure. Over time, and most recently at the Networking Workshop during Kernel Summit in Chicago, we have discussed the idea of having some way to optimize transmits of multiple TX packets at a time. There are several areas of overhead that could be amortized with such schemes. One has to do with locking and transactional overhead, the other has to do with device specific costs. This patch set is aimed more at device specific costs. Typically a driver queues up a packet in the TX queue and then has to do something to have the device start processing that new entry. Sometimes this is composed of doing an MMIO write to a "tail" register, and in other cases it can involve something as expensive as a hypervisor call. The basic setup defined here is that when the driver supports deferred TX queue flushing, ndo_start_xmit should no longer perform that operation. Instead a new operation, ndo_xmit_flush, should do it. I have converted IGB and virtio_net as example initial users. The IGB conversion is tested, virtio_net is not but it does compile :-) All ndo_start_xmit call sites have been abstracted behind a new helper called netdev_start_xmit(). This just adds the infrastructure; it does not add any instances of actually doing multiple ndo_start_xmit calls per ndo_xmit_flush invocation. Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ian Morris authored
This patch makes no changes to the logic of the code but simply addresses coding style issues as detected by checkpatch. Both objdump and diff -w show no differences. This patch removes some blank lines between the end of a function definition and the EXPORT_SYMBOL_GPL macro in order to prevent checkpatch warning that EXPORT_SYMBOL must immediately follow a function. Signed-off-by: Ian Morris <ipm@chirality.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ian Morris authored
This patch makes no changes to the logic of the code but simply addresses coding style issues as detected by checkpatch. Both objdump and diff -w show no differences. This patch addresses structure definitions, specifically it cleanses the brace placement and replaces spaces with tabs in a few places. Signed-off-by: Ian Morris <ipm@chirality.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ian Morris authored
This patch makes no changes to the logic of the code but simply addresses coding style issues as detected by checkpatch. Both objdump and diff -w show no differences. A number of items are addressed in this patch:
* Multiple spaces converted to tabs
* Spaces before tabs removed.
* Spaces in pointer typing cleansed (char *)foo etc.
* Remove space after sizeof
* Ensure spacing around comparators such as if statements.
Signed-off-by: Ian Morris <ipm@chirality.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Darek Marcinkiewicz authored
This cuts down the amount of debug information spit out by the driver. Signed-off-by: Dariusz Marcinkiewicz <reksio@newterm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Borkmann authored
This patch addresses a couple of minor items, mostly addressing prandom_bytes():
1) prandom_bytes{,_state}() should use size_t for length arguments,
2) We can use put_unaligned() when filling the array instead of open coding it [ perhaps some archs will further benefit from their own arch specific implementation when GCC cannot make up for it ],
3) Fix a typo,
4) Better use unsigned int as type for getting the arch seed,
5) Make use of prandom_u32_max() for timer slack.

Regarding the change to put_unaligned(): callers of prandom_bytes(), which internally invokes prandom_bytes_state(), are not affected, as they expect the array to be filled randomly and have no control over the internal state whatsoever (that's also why we have periodic reseeding there, etc), so they really don't care.

Now for the direct callers of prandom_bytes_state(), which are solely located in test cases for MTD devices, that is, drivers/mtd/tests/{oobtest.c,pagetest.c,subpagetest.c}: these tests basically fill a test write-vector through prandom_bytes_state() with an a-priori defined seed each time and write that to an MTD device. Later on, they set up a read-vector and read back those blocks from the device. So in the verification phase, the write-vector is set up again [ so same seed and prandom_bytes_state() called ], and then memcmp()'ed against the read-vector to check if the data is the same.

Akinobu, Lothar and I also tested this patch and it runs through the 3 relevant MTD test cases w/o any errors on the nandsim device (simulator for MTD devs) for x86_64, ppc64, ARM (i.MX28, i.MX53 and i.MX6):

  # modprobe nandsim first_id_byte=0x20 second_id_byte=0xac \
                     third_id_byte=0x00 fourth_id_byte=0x15
  # modprobe mtd_oobtest dev=0
  # modprobe mtd_pagetest dev=0
  # modprobe mtd_subpagetest dev=0

We also don't have any users depending directly on a particular result of the PRNG (except the PRNG self-test itself), and that's just fine as it e.g. allowed us to easily do things like upgrading from taus88 to taus113.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Tested-by: Akinobu Mita <akinobu.mita@gmail.com> Tested-by: Lothar Waßmann <LW@KARO-electronics.de> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
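The write-then-verify pattern that the MTD tests rely on can be sketched in userspace C. This is an illustration only: the generator below is a plain xorshift32 chosen for brevity (it is not taus113 and needs a non-zero seed), `memcpy()` stands in for put_unaligned(), and all `fake_*` names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical PRNG state; the seed must be non-zero for xorshift32. */
struct fake_rnd_state {
    uint32_t s;
};

/* One xorshift32 step: a stand-in generator, not the kernel's. */
static uint32_t fake_next(struct fake_rnd_state *st)
{
    uint32_t x = st->s;

    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    st->s = x;
    return x;
}

/* Fill buf with len pseudo-random bytes; size_t length, no alignment needs. */
static void fake_bytes_state(struct fake_rnd_state *st, void *buf, size_t len)
{
    uint8_t *p = buf;

    while (len >= sizeof(uint32_t)) {
        uint32_t v = fake_next(st);

        memcpy(p, &v, sizeof(v)); /* userspace analogue of put_unaligned() */
        p += sizeof(v);
        len -= sizeof(v);
    }
    if (len > 0) {
        uint32_t v = fake_next(st);

        memcpy(p, &v, len); /* partial tail */
    }
}

/*
 * Verification phase: regenerate the write-vector from the same seed and
 * compare it with what was read back. Returns 1 on match, 0 on mismatch,
 * -1 if len exceeds the sketch's scratch buffer.
 */
static int fake_verify(uint32_t seed, const void *readbuf, size_t len)
{
    uint8_t expect[64];
    struct fake_rnd_state st = { .s = seed };

    if (len > sizeof(expect))
        return -1;
    fake_bytes_state(&st, expect, len);
    return memcmp(expect, readbuf, len) == 0;
}
```

Because both phases run the same fill routine from the same seed, the comparison holds regardless of how the bytes are stored internally, which is why the switch to put_unaligned() does not affect these tests.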
-
David S. Miller authored
Tom Herbert says:
====================
net: Checksum offload changes - Part V

I am working on overhauling RX checksum offload. Goals of this effort are:

  - Specify what exactly it means when driver returns CHECKSUM_UNNECESSARY
  - Preserve CHECKSUM_COMPLETE through encapsulation layers
  - Don't do skb_checksum more than once per packet
  - Unify GRO and non-GRO csum verification as much as possible
  - Unify the checksum functions (checksum_init)
  - Simplify code

What is in this fifth patch set:

  - Added GRO checksum validation functions
  - Call the GRO validation functions from TCP and GRE gro_receive
  - Perform checksum verification in the UDP gro_receive path using GRO functions and add support for gro_receive in UDP6

Changes in V2:

  - Change ip_summed to CHECKSUM_UNNECESSARY instead of moving it to CHECKSUM_COMPLETE from GRO checksum validation. This avoids a performance penalty in checksumming bytes which are before the header GRO is at.

Please review carefully and test if possible, mucking with basic checksum functions is always a little precarious :-)

----

Test results with this patch set are below. I did not notice any performance regression.

Tests run:
  TCP_STREAM: super_netperf with 200 streams
  TCP_RR: super_netperf with 200 streams and -r 1,1

Device bnx2x (10Gbps):
  No GRE RSS hash (RX interrupts occur on one core)
  UDP RSS port hashing enabled.

* GRE with checksum with IPv4 encapsulated packets
  With fix:
    TCP_STREAM  9.91% CPU utilization  5163.78 Mbps
    TCP_RR      50.64% CPU utilization  219/347/502 90/95/99% latencies  834103 tps
  Without fix:
    TCP_STREAM  10.05% CPU utilization  5186.22 Mbps
    TCP_RR      49.70% CPU utilization  227/338/486 90/95/99% latencies  813450 tps

* GRE without checksum with IPv4 encapsulated packets
  With fix:
    TCP_STREAM  10.18% CPU utilization  5159 Mbps
    TCP_RR      51.86% CPU utilization  214/325/471 90/95/99% latencies  865943 tps
  Without fix:
    TCP_STREAM  10.26% CPU utilization  5307.87 Mbps
    TCP_RR      50.59% CPU utilization  224/325/476 90/95/99% latencies  846429 tps

*** Simulate device returns CHECKSUM_COMPLETE

* VXLAN with checksum
  With fix:
    TCP_STREAM  13.03% CPU utilization  9093.9 Mbps
    TCP_RR      95.96% CPU utilization  161/259/474 90/95/99% latencies  1.14806e+06 tps
  Without fix:
    TCP_STREAM  13.59% CPU utilization  9093.97 Mbps
    TCP_RR      93.95% CPU utilization  160/259/484 90/95/99% latencies  1.10262e+06 tps

* VXLAN without checksum
  With fix:
    TCP_STREAM  13.28% CPU utilization  9093.87 Mbps
    TCP_RR      95.04% CPU utilization  155/246/439 90/95/99% latencies  1.15e+06 tps
  Without fix:
    TCP_STREAM  13.37% CPU utilization  9178.45 Mbps
    TCP_RR      93.74% CPU utilization  161/257/469 90/95/99% latencies  1.1068e+06 tps
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tom Herbert authored
In GRE demux, if the GRE checksum is present, pop the receive encapsulation so that any encapsulated checksums are treated as tunnel checksums. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tom Herbert authored
Implement GRO for UDPv6. Add UDP checksum verification in gro_receive for both UDP4 and UDP6 calling skb_gro_checksum_validate_zero_check. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
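The idea behind a "zero check" validation can be sketched in userspace C for UDP over IPv4, where a checksum field of zero means the sender did not compute one. This is a from-scratch illustration, not the kernel helpers: every `sketch_*` name is hypothetical, addresses are passed as host-order 32-bit values for simplicity, and the GRO-specific bookkeeping is left out entirely.

```c
#include <stddef.h>
#include <stdint.h>

/* One's-complement sum of a buffer on top of an initial sum, folded to 16 bits. */
static uint16_t sketch_csum_fold(const uint8_t *data, size_t len, uint32_t sum)
{
    for (size_t i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((data[i] << 8) | data[i + 1]);
    if (len & 1)
        sum += (uint32_t)(data[len - 1] << 8); /* pad odd byte with zero */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Unfolded sum of the IPv4 pseudo-header fields. */
static uint32_t sketch_pseudo(uint32_t saddr, uint32_t daddr,
                              uint8_t proto, uint16_t len)
{
    return (saddr >> 16) + (saddr & 0xffff) +
           (daddr >> 16) + (daddr & 0xffff) + proto + len;
}

/*
 * Validate a UDP datagram (header + payload). If the checksum field is zero,
 * there is nothing to verify and the result is zero_ok. Otherwise the sum over
 * pseudo-header and datagram must fold to 0xffff. Returns 1 if accepted.
 */
static int sketch_udp_validate(const uint8_t *udp, size_t len,
                               uint32_t saddr, uint32_t daddr, int zero_ok)
{
    if (len < 8 || len > 0xffff)
        return 0;
    if (udp[6] == 0 && udp[7] == 0)
        return zero_ok;
    return sketch_csum_fold(udp, len,
                            sketch_pseudo(saddr, daddr, 17, (uint16_t)len))
           == 0xffff;
}
```

The pseudo-header sum is computed separately and fed in as the starting value, which mirrors the split the commit messages describe between a "compute pseudo" helper and the validation step.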
-
Tom Herbert authored
In tcp[64]_gro_receive call skb_gro_checksum_validate to validate TCP checksum in the gro context. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tom Herbert authored
Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tom Herbert authored
Add inet_gro_compute_pseudo and ip6_gro_compute_pseudo. These are the logical equivalents of inet_compute_pseudo and ip6_compute_pseudo for GRO path. The IP header is taken from skb_gro_network_header. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tom Herbert authored
Add skb_gro_checksum_validate, skb_gro_checksum_validate_zero_check, skb_gro_checksum_simple_validate, and __skb_gro_checksum_complete. These are the counterparts of the normal checksum functions but are used in the gro_receive path and operate on GRO related fields in sk_buffs. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 23 Aug, 2014 16 commits
-
-
Daniel Borkmann authored
Replace open codings of (((u64) <x> * <y>) >> 32) with reciprocal_scale(). Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
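For reference, the open-coded expression and the helper that replaces it amount to the following. The userspace version below uses `<stdint.h>` types in place of the kernel's u32/u64; it maps a value spread over the full 32-bit range into [0, ep_ro) with a multiply and a shift instead of a modulo.

```c
#include <stdint.h>

/*
 * Scale val, assumed uniformly distributed over [0, 2^32), into the
 * half-open interval [0, ep_ro) without a division.
 */
static inline uint32_t reciprocal_scale(uint32_t val, uint32_t ep_ro)
{
    return (uint32_t)(((uint64_t)val * ep_ro) >> 32);
}
```

A typical use is picking a bucket or queue index from a 32-bit hash, for example `reciprocal_scale(hash, num_queues)`, where `hash` and `num_queues` are placeholder names.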
-
David S. Miller authored
Drivers, and perhaps other entities we have not yet considered, sometimes want to know how deep the protocol headers go before deciding how large of an SKB to allocate and how much of the packet to place into the linear SKB area. For example, consider a driver which has a device which DMAs into pools of pages and then tells the driver where the data went in the DMA descriptor(s). The driver can then build an SKB and reference most of the data via SKB fragments (which are page/offset/length triplets). However at least some of the front of the packet should be placed into the linear SKB area, which comes before the fragments, so that packet processing can get at the headers efficiently. The first thing each protocol layer is going to do is a "pskb_may_pull()" so we might as well aggregate as much of this as possible while we're building the SKB in the driver. Part of supporting this is that we don't have an SKB yet, so we want to be able to let the flow dissector operate on a raw buffer in order to compute the offset of the end of the headers. So now we have a __skb_flow_dissect() which takes an explicit data pointer and length. Signed-off-by: David S. Miller <davem@davemloft.net>
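To make the "dissect a raw buffer to find the end of the headers" idea concrete, here is a deliberately small userspace sketch. It is not the flow dissector: it only understands untagged Ethernet, IPv4, and TCP or UDP, ignores fragments, and `sketch_headers_len()` and its constants are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ETH_HLEN  14      /* untagged Ethernet header */
#define SKETCH_ETH_IPV4  0x0800  /* EtherType for IPv4 */
#define SKETCH_PROTO_TCP 6
#define SKETCH_PROTO_UDP 17

/*
 * Return the offset of the end of the headers this sketch understands,
 * or 0 if the buffer is too short or is not IPv4 over Ethernet.
 */
static size_t sketch_headers_len(const uint8_t *data, size_t len)
{
    size_t off = SKETCH_ETH_HLEN;
    size_t ihl, doff;
    uint8_t proto;

    if (len < off + 20) /* Ethernet plus minimal IPv4 header */
        return 0;
    if (((data[12] << 8) | data[13]) != SKETCH_ETH_IPV4)
        return 0;
    if ((data[off] >> 4) != 4) /* IP version field */
        return 0;

    ihl = (size_t)(data[off] & 0x0f) * 4; /* IPv4 header length in bytes */
    if (ihl < 20 || len < off + ihl)
        return 0;
    proto = data[off + 9];
    off += ihl;

    if (proto == SKETCH_PROTO_TCP) {
        if (len < off + 20)
            return 0;
        doff = (size_t)(data[off + 12] >> 4) * 4; /* TCP data offset */
        if (doff < 20 || len < off + doff)
            return 0;
        return off + doff;
    }
    if (proto == SKETCH_PROTO_UDP)
        return len < off + 8 ? 0 : off + 8;

    return off; /* unknown transport: stop after the IP header */
}
```

A driver could use such an offset to decide how many bytes to copy into the linear area before attaching the rest of the packet as page fragments.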
-
David S. Miller authored
Florian Fainelli says: ==================== net: phy: bcm7xxx: APD and EEE support This patch series enables Auto-power down and EEE for the BCM7xxx integrated Gigabit PHYs. I also included a fix for the fixed PHY, which would allow Clause 45 over Clause 22 reads/writes but would return bogus data, e.g. when using ethtool --show-eee ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
The 28nm Gigabit PHY on BCM7xxx chips comes out of reset with absolutely no EEE capabilities, such that we would actually return that we do not support EEE when accessing 3.20 (MDIO_PCS_EEE_ABLE) registers. Poke through the vendor-specific C45 register to enable EEE globally at the PHY level, and advertise supported EEE modes. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
Internal PHYs do not have any specific phy_interface_t defined because they are within an Ethernet MAC or a larger IC, so they fail the early check in phy_init_eee(). Allow these PHYs to proceed with EEE initialization and report error/success by checking the standard C45 EEE-related registers. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
Some PHY drivers might need to access Clause 45 registers in Clause 22 compatibility mode, e.g. to properly advertise EEE support when it is disabled by default. Export these two helper functions: phy_read_mmd_indirect() and phy_write_mmd_indirect() for drivers to use them. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
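The Clause 45 over Clause 22 access sequence goes through two Clause 22 registers: 13 (MMD access control) and 14 (MMD access address/data). The userspace sketch below models that sequence against a simulated PHY. It is not the phylib implementation: `struct sim_phy`, the `sim_mdio_*` bus functions, and the `sketch_*` helpers are hypothetical, and the simulated MMD space is truncated to 64 registers per device.

```c
#include <stdint.h>

#define MII_MMD_CTRL        0x0d   /* MMD access control register */
#define MII_MMD_DATA        0x0e   /* MMD access address/data register */
#define MII_MMD_CTRL_NOINCR 0x4000 /* function: data, no post increment */

/* Simulated PHY with a tiny MMD register space: [devad][reg & 63]. */
struct sim_phy {
    uint16_t ctrl; /* last value written to register 13 */
    uint16_t addr; /* latched MMD register address */
    uint16_t mmd[32][64];
};

/* Simulated Clause 22 write: reg 14 latches an address or stores data. */
static void sim_mdio_write(struct sim_phy *phy, int reg, uint16_t val)
{
    if (reg == MII_MMD_CTRL) {
        phy->ctrl = val;
    } else if (reg == MII_MMD_DATA) {
        if (phy->ctrl & MII_MMD_CTRL_NOINCR)
            phy->mmd[phy->ctrl & 0x1f][phy->addr & 63] = val;
        else
            phy->addr = val;
    }
}

/* Simulated Clause 22 read: reg 14 returns data only in data mode. */
static uint16_t sim_mdio_read(struct sim_phy *phy, int reg)
{
    if (reg == MII_MMD_DATA && (phy->ctrl & MII_MMD_CTRL_NOINCR))
        return phy->mmd[phy->ctrl & 0x1f][phy->addr & 63];
    return 0;
}

/* Steps 1-3: select the device, latch the address, switch to data mode. */
static void sketch_mmd_select(struct sim_phy *phy, int devad, int regnum)
{
    sim_mdio_write(phy, MII_MMD_CTRL, (uint16_t)devad);
    sim_mdio_write(phy, MII_MMD_DATA, (uint16_t)regnum);
    sim_mdio_write(phy, MII_MMD_CTRL, (uint16_t)(devad | MII_MMD_CTRL_NOINCR));
}

/* Step 4 for reads: fetch the value through the data register. */
static uint16_t sketch_read_mmd_indirect(struct sim_phy *phy, int devad, int regnum)
{
    sketch_mmd_select(phy, devad, regnum);
    return sim_mdio_read(phy, MII_MMD_DATA);
}

/* Step 4 for writes: store the value through the data register. */
static void sketch_write_mmd_indirect(struct sim_phy *phy, int devad, int regnum,
                                      uint16_t val)
{
    sketch_mmd_select(phy, devad, regnum);
    sim_mdio_write(phy, MII_MMD_DATA, val);
}
```

For instance, writing the EEE advertisement register at device 7, register 60 and reading it back exercises the full four-step sequence twice.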
-
Florian Fainelli authored
The fixed PHY driver does not properly emulate Clause 45 over Clause 22 MDIO reads, and as such, will return bogus values when we access such registers. Return an error when accessing these registers in order to prevent advertising bogus capabilities such as EEE support. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
The 28nm process BCM7xxx internal Gigabit PHYs all support automatic power down; turn on that feature as part of the configuration initialization callback. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
The shadow register 0x1C is used both by the BCM54xxx PHYs and the BCM7xxx internal PHYs; move the accessors to a common location so both drivers can use them. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
Commit 439d39a9 ("net: phy: broadcom: extract register definitions") added a bunch of registers to brcmphy.h but left some in broadcom.c. Move all of them to the header file, since the BCM54xx and BCM7xxx PHY drivers share all of these registers. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Jon Maloy says: ==================== tipc: Merge port and socket layer code After the removal of the TIPC native interface, there is no reason to keep a distinction between a "generic" port layer and a "specific" socket layer in the code. Over the last months, we have posted several series that aimed at facilitating removal of the port layer, and in particular the port_lock spinlock, which in reality duplicates the role normally played by lock_sock()/bh_lock_sock(). In this series, we finalize this work by making a significant number of changes to the link, node, port and socket code, all with the aim of reducing dependencies between the layers. In the final commits, we then remove the port spinlock, port.c and port.h altogether. After this series, we have a socket layer that has only a few dependencies on the rest of the stack, so that it should be possible to continue cleanups of its code without significantly affecting other code. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jon Paul Maloy authored
We complete the merging of the port and socket layer by aggregating the fields of struct tipc_port directly into struct tipc_sock, and moving the combined structure into socket.c. We also move all functions and macros that are no longer exposed to the rest of the stack into socket.c, and rename them accordingly. Despite the size of this commit, there are no functional changes. We have only made such changes that are necessary due to the removal of struct tipc_port. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jon Paul Maloy authored
The reference table is now 'socket aware' instead of being generic, and has in reality become a socket internal table. In order to be able to minimize the API exposed by the socket layer towards the rest of the stack, we now move the reference table definitions and functions into the file socket.c, and rename the functions accordingly. There are no functional changes in this commit. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jon Paul Maloy authored
We move the inline functions in the file port.h to socket.c, and modify their names accordingly. We move struct tipc_port and some macros to socket.h. Finally, we remove the file port.h. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jon Paul Maloy authored
In this commit, we move the remaining functions in port.c to socket.c, and give them new names that correspond to their new location. We then remove the file port.c. There are only cosmetic changes to the moved functions. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jon Paul Maloy authored
In previous commits we have reduced usage of port_lock to a minimum, and complemented it with usage of bh_lock_sock() at the remaining locations. The purpose has been to remove this lock altogether, since it largely duplicates the role of bh_lock_sock. We are now ready to do this. However, we still need to protect the BH callers from inadvertent release of the socket while they hold a reference to it. We do this by replacing port_lock by a combination of a rw-lock protecting the reference table as such, and updating the socket reference counter while the socket is referenced from BH. This technique is more standard and comprehensible than the previous approach, and turns out to have a positive effect on overall performance. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-