- 10 Jan, 2013 20 commits
-
-
Eric Dumazet authored
In various network workloads, __do_softirq() latencies can be up to 20 ms if HZ=1000, and 200 ms if HZ=100. This is because we iterate 10 times in the softirq dispatcher, and some actions can consume a lot of cycles.

This patch changes the fallback to ksoftirqd condition to:
- A time limit of 2 ms.
- need_resched() being set on current task.

When one of these conditions is met, we wake up ksoftirqd for further softirq processing if we still have pending softirqs. Using need_resched() as the only condition can trigger RCU stalls, as we can keep BH disabled for too long.

I ran several benchmarks and got no significant difference in throughput, but a very significant reduction of latencies (one order of magnitude).

In the following bench, 200 antagonist "netperf -t TCP_RR" are started in background, using all available cpus. Then we start one "netperf -t TCP_RR", bound to the cpu handling the NIC IRQ (hard+soft).

Before patch:
# netperf -H 7.7.7.84 -t TCP_RR -T2,2 -- -k RT_LATENCY,MIN_LATENCY,MAX_LATENCY,P50_LATENCY,P90_LATENCY,P99_LATENCY,MEAN_LATENCY,STDDEV_LATENCY
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.7.84 () port 0 AF_INET : first burst 0 : cpu bind
RT_LATENCY=550110.424
MIN_LATENCY=146858
MAX_LATENCY=997109
P50_LATENCY=305000
P90_LATENCY=550000
P99_LATENCY=710000
MEAN_LATENCY=376989.12
STDDEV_LATENCY=184046.92

After patch:
# netperf -H 7.7.7.84 -t TCP_RR -T2,2 -- -k RT_LATENCY,MIN_LATENCY,MAX_LATENCY,P50_LATENCY,P90_LATENCY,P99_LATENCY,MEAN_LATENCY,STDDEV_LATENCY
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 7.7.7.84 () port 0 AF_INET : first burst 0 : cpu bind
RT_LATENCY=40545.492
MIN_LATENCY=9834
MAX_LATENCY=78366
P50_LATENCY=33583
P90_LATENCY=59000
P99_LATENCY=69000
MEAN_LATENCY=38364.67
STDDEV_LATENCY=12865.26

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: David Miller <davem@davemloft.net>
Cc: Tom Herbert <therbert@google.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
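The fallback condition described in the commit above can be sketched as a small simulation. This is only an illustrative model, not the kernel code: `do_softirq`, `monotonic_us`, and the list-of-callables representation of pending softirqs are hypothetical names chosen for this sketch, and the clock is in microseconds rather than jiffies. Only the 2 ms limit and the need_resched() check come from the commit message.

```python
import time

SOFTIRQ_TIME_LIMIT_US = 2000  # the 2 ms limit named in the commit message


def monotonic_us():
    """Monotonic clock in microseconds (stand-in for jiffies in this sketch)."""
    return time.monotonic_ns() // 1000


def do_softirq(pending, need_resched, clock=monotonic_us,
               time_limit_us=SOFTIRQ_TIME_LIMIT_US):
    """Run pending actions in passes; return (passes, wake_ksoftirqd).

    Each action is a callable that may return a list of newly raised actions.
    """
    end = clock() + time_limit_us
    passes = 0
    while pending:
        batch = list(pending)   # snapshot the pending set, then clear it
        pending.clear()
        for action in batch:
            pending.extend(action() or [])
        passes += 1
        # Restart only while work remains, the time budget is not spent,
        # and the current task has not been asked to reschedule.
        if pending and (clock() >= end or need_resched()):
            return passes, True     # defer the rest to ksoftirqd
    return passes, False
```

With a fixed iteration count, a handler that keeps re-raising itself holds the CPU for as long as ten passes take. With a time budget, the worst case is bounded by roughly the budget plus one pass.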
-
Eric Dumazet authored
One long standing problem with TSO/GSO/GRO packets is that skb->len doesn't represent a precise amount of bytes on wire. Headers are only accounted for the first segment. For TCP, that's typically 66 bytes per 1448 bytes segment missing, an error of 4.5 % for normal MSS value.

As a consequence:
1) TBF/CBQ/HTB/NETEM/... can send more bytes than the assigned limits.
2) Device stats are slightly underestimated as well.

Fix this by taking account of headers in qdisc_skb_cb(skb)->pkt_len computation. Packet schedulers should use qdisc pkt_len instead of skb->len for their bandwidth limitations, and TSO-enabled device drivers could use pkt_len if their statistics are not hardware assisted, and if they don't scratch skb->cb[] first word.

Both egress and ingress paths work, thanks to commit fda55eca (net: introduce skb_transport_header_was_set()): if GRO built a GSO packet, it also set the transport header for us.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Paolo Valente <paolo.valente@unimore.it>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
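The accounting error in the commit above can be worked through with a short calculation. This is a sketch under stated assumptions: `qdisc_pkt_len` here is a hypothetical Python helper (not the kernel function), and the split of the 66 header bytes into 14 (Ethernet) + 20 (IPv4) + 32 (TCP with timestamps) is my reading of the figure, not something the commit message spells out.

```python
def qdisc_pkt_len(skb_len, gso_segs, hdr_len):
    """Estimate bytes on the wire for a GSO packet.

    skb_len counts the headers once; every additional segment repeats them.
    """
    if gso_segs <= 1:
        return skb_len
    return skb_len + (gso_segs - 1) * hdr_len


HDR_LEN = 66   # per-segment headers, figure taken from the commit message
MSS = 1448     # payload bytes per segment, figure taken from the commit message

segs = 10
skb_len = HDR_LEN + segs * MSS                 # what skb->len reports: 14546
wire_len = qdisc_pkt_len(skb_len, segs, HDR_LEN)

print(skb_len, wire_len)                       # 14546 15140
print(round(100 * HDR_LEN / MSS, 2))           # 4.56 (percent missing per extra segment)
```

Ten segments of 1514 bytes each give 15140 bytes on the wire, while skb->len alone reports 14546, so a rate limiter using skb->len lets about 4 % more traffic through than configured.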
-
Vijay Subramanian authored
Recent commit 7e3a2dc5 (doc: make the description of how tcp_ecn works more explicit and clear) clarified the behavior of the tcp_ecn sysctl variable, but the description is inconsistent. When requested by incoming connections, ECN is enabled with not just tcp_ecn = 2 but also with tcp_ecn = 1. This patch makes it clear that with tcp_ecn = 1, ECN is enabled when requested by incoming connections. Also fix spelling of 'incoming'. Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
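The three tcp_ecn settings can be summarised as a small lookup. This is a sketch for illustration: `ecn_behaviour` is a hypothetical helper, the incoming-connection column follows the commit message above, and the outgoing-connection column reflects my understanding of the kernel's ip-sysctl documentation rather than anything stated in this log.

```python
def ecn_behaviour(tcp_ecn):
    """Return (accept_ecn_on_incoming, request_ecn_on_outgoing) for a tcp_ecn value."""
    table = {
        0: (False, False),  # ECN disabled
        1: (True, True),    # accept when requested, and request on outgoing
        2: (True, False),   # accept when requested, never request it ourselves
    }
    if tcp_ecn not in table:
        raise ValueError("tcp_ecn must be 0, 1 or 2")
    return table[tcp_ecn]


for value in (0, 1, 2):
    print(value, ecn_behaviour(value))
```

The point of the documentation fix is visible in the first column: both 1 and 2 accept ECN on incoming connections, and they differ only in whether outgoing connections ask for it.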
-
Ariel Elior authored
Static checkers complained that the E1H_FUNC_MAX define is used incorrectly in bnx2x_pretend_func(). The complaint was justified, although it's not a real bug, as the first part of the conditional protects us in this case (a real bug would happen if a VF tried to use the pretend func, but there are no VFs in E1H chips). Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
In commit d0e2c55e (veth: avoid a NULL deref in veth_stats_one) we now clear the peer pointers in veth_dellink(). veth_close() must therefore make sure the peer pointer is set. Reported-by: Tom Parkin <tom.parkin@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vipul Pandya authored
With Hard-Wired firmware configuration it was incorrectly provisioning the VFs Channel Access Rights Mask. Signed-off-by: Jay Hernandez <jay@chelsio.com> Signed-off-by: Vipul Pandya <vipul@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Barry Grussling authored
Fix DSA whitespace issues reported by checkpatch.pl Signed-off-by: Barry Grussling <barry@grussling.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Barry Grussling authored
Convert DSA printk calls to netdev_info calls as recommended by checkpatch.pl. Signed-off-by: Barry Grussling <barry@grussling.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Barry Grussling authored
Convert DSA msleep calls to timeout/usleep_range calls as reported by checkpatch.pl. Signed-off-by: Barry Grussling <barry@grussling.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Barry Grussling authored
Convert DSA driver comments to network-style comments as reported by checkpatch.pl. Fix spelling error. Signed-off-by: Barry Grussling <barry@grussling.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
YOSHIFUJI Hideaki / 吉藤英明 authored
Do not convert endianness back and forth. If the caller uses a constant "mask" argument (and most callers do), we can omit the runtime endian conversion here. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
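The idea in the commit above can be illustrated generically. The log does not name the function that was changed, so this sketch uses two hypothetical helpers, `set_bits_roundtrip` and `set_bits_direct`, to show why converting the mask once gives the same result as converting the data word back and forth: a byte swap only permutes bits, so it commutes with bitwise AND, OR and NOT.

```python
import socket

U32 = 0xFFFFFFFF


def set_bits_roundtrip(word_be, mask, value):
    """Convert to host order, update the masked bits, convert back."""
    host = socket.ntohl(word_be)
    host = (host & ~mask & U32) | (value & mask)
    return socket.htonl(host)


def set_bits_direct(word_be, mask_be, value_be):
    """Update the masked bits directly in network byte order.

    mask_be and value_be are already in network order; when they are
    constants, that conversion can be done once, ahead of time.
    """
    return (word_be & ~mask_be & U32) | (value_be & mask_be)


word_be = socket.htonl(0x60012345)
mask, value = 0x000FFFFF, 0x000ABCDE

a = set_bits_roundtrip(word_be, mask, value)
b = set_bits_direct(word_be, socket.htonl(mask), socket.htonl(value))
print(a == b, hex(socket.ntohl(b)))   # True 0x600abcde
```

In C, a conversion applied to a compile-time constant can be folded by the compiler, which is what makes the direct form cheaper at run time.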
-
YOSHIFUJI Hideaki / 吉藤英明 authored
In ipv6_recv_error(), addr_offset points to the daddr field of the IP header. To get the ipv6 header, use the container_of() macro instead of subtracting a magic number (24). Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
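Where the magic number 24 comes from can be checked with a small layout model. This is a sketch using Python's ctypes, not kernel code: `Ipv6Hdr` is a simplified stand-in for the kernel's struct ipv6hdr (the version, priority and flow label fields are folded into one 32-bit word), and `container_of` is a hypothetical Python imitation of the C macro.

```python
import ctypes


class Ipv6Hdr(ctypes.Structure):
    """Simplified fixed IPv6 header layout (40 bytes)."""
    _fields_ = [
        ("ver_prio_flow", ctypes.c_uint32),   # version, traffic class, flow label
        ("payload_len", ctypes.c_uint16),
        ("nexthdr", ctypes.c_uint8),
        ("hop_limit", ctypes.c_uint8),
        ("saddr", ctypes.c_uint8 * 16),
        ("daddr", ctypes.c_uint8 * 16),
    ]


def container_of(member_addr, struct_type, member_name):
    """Recover the enclosing structure from the address of one of its members."""
    offset = getattr(struct_type, member_name).offset
    return struct_type.from_address(member_addr - offset)


hdr = Ipv6Hdr(hop_limit=64)
daddr_addr = ctypes.addressof(hdr) + Ipv6Hdr.daddr.offset
recovered = container_of(daddr_addr, Ipv6Hdr, "daddr")

print(Ipv6Hdr.daddr.offset, ctypes.sizeof(Ipv6Hdr), recovered.hop_limit)  # 24 40 64
```

Deriving the offset from the field name keeps the code correct if the structure definition ever changes, and tells the reader which field the pointer refers to.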
-
Dan Carpenter authored
"vfop" is NULL here. I've changed the debugging to not use it. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Ariel Elior <ariele@broadcom.com> Acked-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
YOSHIFUJI Hideaki / 吉藤英明 authored
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
YOSHIFUJI Hideaki / 吉藤英明 authored
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
YOSHIFUJI Hideaki / 吉藤英明 authored
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
YOSHIFUJI Hideaki / 吉藤英明 authored
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
YOSHIFUJI Hideaki / 吉藤英明 authored
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
YOSHIFUJI Hideaki / 吉藤英明 authored
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Rafał Miłecki authored
BCMA is a Broadcom specific bus with devices AKA cores. All recent BCMA based SoCs have gigabit ethernet provided by the GBit MAC core. This patch adds a driver for such cores, registering itself as a netdev. It has been tested on BCM4706 and BCM4718 chipsets.

In the kernel tree there is already the b44 driver, which has some things in common with bgmac; however, there are many differences that have led to the decision to write a new driver:
1) GBit MAC cores appear on BCMA bus (not SSB as in case of b44)
2) There is 64bit DMA engine which differs from 32bit one
3) There is no CAM (Content Addressable Memory) in GBit MAC
4) We have 4 TX queues on GBit MAC devices (instead of 1)
5) Many registers have different addresses/values
6) RX header flags are also different

The driver in its current state is functional; however, there is of course room for improvement:
1) Supporting more net_device_ops
2) Supporting more ethtool_ops
3) Unaligned addressing in DMA
4) Writing a separate PHY driver

Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 09 Jan, 2013 9 commits
-
-
Jiri Pirko authored
perm_addr is initialized correctly in register_netdevice(), so initializing it in drivers is no longer needed. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
Benefit from the fact that dev->addr_assign_type is set to NET_ADDR_PERM in case the device has permanent address. This also fixes the problem that many drivers do not set perm_addr at all. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Cong Wang authored
Update the netconsole document as well. Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Cong Wang authored
Currently, netpoll only supports IPv4. This patch adds IPv6 support to netpoll so that we can run netconsole over IPv6 network. Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Cong Wang authored
As suggested by David, udp6_csum_init() is too big to be inlined, move it to ipv6 static library, net/ipv6/ip6_checksum.c. And the generic csum_ipv6_magic() too. Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Cong Wang authored
This patch adjusts some struct and functions, to prepare for supporting IPv6. Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Fabio Estevam authored
Fix the following warning when building with W=1 option: drivers/net/ethernet/freescale/fec.c:810:1: warning: '__inline__' is not at beginning of declaration [-Wold-style-declaration] The inline declaration is pointless in this function, so just remove it. While at it, also remove the other 'inline' declarations. Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
We have skb_mac_header_was_set() helper to tell if mac_header was set on a skb. We would like the same for transport_header. __netif_receive_skb() doesn't reset the transport header if already set by GRO layer. Note that network stacks usually reset the transport header anyway, after pulling the network header, so this change only allows a followup patch to have more precise qdisc pkt_len computation for GSO packets at ingress side. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux
David S. Miller authored
Paul Gortmaker says: ==================== I'd like to propose that we get rid of these old 8390 EISA drivers. Of the five deleted here, I wrote four -- and while that doesn't give me any authority for deletion above anyone else, it does at least allow me to comment on the absolute absence of anyone reaching out to the driver author for assistance in the last dozen years. Eventually we'll probably get rid of EISA bus support, since in x86, the hardware is close to 20 years old and already too resource constrained to be useful today. However there might still be a few DEC Alpha enthusiasts with old EISA machines kept alive, and so I expect we'll have to wait a bit longer to get unanimous agreement to proceed with the full EISA removal (although I'd love to be proven wrong on that). Most of the DEC Alpha machines shipped in a PCI configuration, and even the few that were EISA had DEC tulip based ethernet and no reason to be needing the inferior 8390 technology. So the interest here for any possible DEC enthusiasts with EISA boxes about these old 8390 drivers should be nil. These really were rare cards -- in fact the smc-ultra32 is the only one that I'd ever seen in person. Even back in the mid 90's when the drivers were written, I would guess that the user base was less than 10 people across all of them. The following patch was created with --irreversible-delete for ease of review (it skips showing the content of files that are deleted); however the complete patch can be pulled as per below. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 08 Jan, 2013 5 commits
-
-
Jiri Pirko authored
No need to check if ethtool_ops == NULL since it can't be. Use local variable "ops" in functions where it is present instead of dev->ethtool_ops. Introduce local variable "ops" in functions where dev->ethtool_ops is used many times. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Reviewed-by: Ben Hutchings <bhutchings@solarflare.com> Reviewed-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ariel Elior authored
In this patch the SR-IOV code is segregated from the main bulk of the bnx2x code. The CONFIG_BNX2X_SRIOV define is added to Broadcom's Kconfig, and allows all the SR-IOV support code to be left out of the driver build. The define is dependent on the kernel CONFIG_PCI_IOV configuration define. Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Frank Li authored
Report the correct hardware timestamping capability through the ethtool interface. ptp4l v1.0 checks it. Signed-off-by: Frank Li <Frank.Li@freescale.com> Acked-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
commit 2681128f (veth: extend device features) added a NULL deref in veth_stats_one(), as veth_get_stats64() was not testing whether the peer device was set up or not. At init time, we call dev_get_stats() before the veth pair is fully set up.

[ 178.854758] [<ffffffffa00f5677>] veth_get_stats64+0x47/0x70 [veth]
[ 178.861013] [<ffffffff814f0a2d>] dev_get_stats+0x6d/0x130
[ 178.866486] [<ffffffff81504efc>] rtnl_fill_ifinfo+0x47c/0x930
[ 178.872299] [<ffffffff81505b93>] rtmsg_ifinfo+0x83/0x100
[ 178.877678] [<ffffffff81505cc6>] rtnl_configure_link+0x76/0xa0
[ 178.883580] [<ffffffffa00f52fa>] veth_newlink+0x16a/0x350 [veth]
[ 178.889654] [<ffffffff815061cc>] rtnl_newlink+0x4dc/0x5e0
[ 178.895128] [<ffffffff81505e1e>] ? rtnl_newlink+0x12e/0x5e0
[ 178.900769] [<ffffffff8150587d>] rtnetlink_rcv_msg+0x11d/0x310
[ 178.906669] [<ffffffff81505760>] ? __rtnl_unlock+0x20/0x20
[ 178.912225] [<ffffffff81521f89>] netlink_rcv_skb+0xa9/0xd0
[ 178.917779] [<ffffffff81502d55>] rtnetlink_rcv+0x25/0x40
[ 178.923159] [<ffffffff815218d1>] netlink_unicast+0x1b1/0x230
[ 178.928887] [<ffffffff81521c4e>] netlink_sendmsg+0x2fe/0x3b0
[ 178.934615] [<ffffffff814dbe22>] sock_sendmsg+0xd2/0xf0

So we must check if the peer was set up in veth_get_stats64().

As pointed out by Ben Hutchings, priv->peer is missing proper synchronization. Adding RCU protection is a safe and well documented way to make sure we don't access about to be freed or already freed data.

Reported-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
CC: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Joe Perches authored
Use more current logging styles. Convert printks to pr_<level> and printks with ("%s: ...", dev->name to netdev_<level>(dev, "... Add pr_fmt #defines where appropriate. Coalesce formats. Use pr_<level>_once where appropriate. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 07 Jan, 2013 6 commits
-
-
Paul Gortmaker authored
The NS8390 chip was essentially the 1st widespread PC ethernet chip, starting its life on 8 bit ISA cards in the late 1980s. Even with better technologies available (bus mastering etc) the 8390 managed to get used on a few rare EISA cards in the early to mid 1990s. The EISA bus in the x86 world was largely confined to systems ranging from 486 to 586 (essentially 200MHz or lower, and less than 100MB RAM) -- i.e. machines unlikely to be still in service, and even less likely to be running a 3.9+ kernel. On top of that, only one of the five really ever was considered non-experimental; the smc-ultra32 was the one -- since it was largely just an EISA version of the popular smc-ultra ISA card. All the others had such a tiny user base that they simply never could be considered anything more than experimental. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
-
Paul Gortmaker authored
We threw away the microchannel support, but the removal wasn't completely trivial since there was namespace overlap with the machine check support, and hence some orphaned dependencies survived the deletion. This attempts to sweep those up and send them to the bit-bucket. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Flavio Leitner authored
The fields must be null-terminated. Signed-off-by: Flavio Leitner <fbl@redhat.com> Acked-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Hannes Frederic Sowa authored
As per suggestion from Eric Dumazet this patch makes tcp_ecn sysctl namespace aware. The reason behind this patch is to ease the testing of ecn problems on the internet and allows applications to tune their own use of ecn. Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: David Miller <davem@davemloft.net> Cc: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
YOSHIFUJI Hideaki / 吉藤英明 authored
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
splice() can handle pages of any order, but network code tries hard to split them in PAGE_SIZE units. Not quite successfully anyway, as __splice_segment() assumed poff < PAGE_SIZE. This is true for the skb->data part, not necessarily for the fragments. This patch removes this logic to give the pages as they are in the skb. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
-