Commits · aec1e8452dc364cffd0333e5632ec482f6322593 · Kirill Smelkov / linux

17 Sep, 2010 7 commits

Amit Kumar Salecha authored Sep 16, 2010

LRO + GRO + vlan rx accleration support, performance increases
around 20% and cpu utilization reduces around 70% on vlan interface.
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

aec1e845

qlcnic: vlan gro support · 5718d3b4

Amit Kumar Salecha authored Sep 16, 2010

GRO support + vlan rx accleration, boost around 9%
performance and reduces 25% of cpu utilization on vlan
interface.
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5718d3b4

qlcnic: support vlan rx accleration · d5790663

Amit Kumar Salecha authored Sep 16, 2010

Implemented vlan rx accleration in driver.
This helps in increasing significant performance and
reduces cpu utilization with GRO and LRO.

Eric Dumazet:
	"Its a bit strange you use dev_kfree_skb_any(skb) here."
	"We run in NAPI mode, so you can use dev_kfree_skb()."
Amit:
	Done. Using dev_kfree_skb();
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d5790663

drivers/net/tulip/de4x5.c: fix union member name in DE4X5_GET_REG ioctl · 0c796f91

Dan Rosenberg authored Sep 16, 2010

This was previously reported as a security issue due to leakage of
uninitialized stack memory.  Jeff Mahoney pointed out that this is
incorrect since the copied data is from a union (rather than a struct).
Therefore, this patch is only under consideration for the sake of
correctness, and is not security relevant. 
Signed-off-by: Dan Rosenberg <dan.j.rosenberg@gmail.com>
Acked-by: Grant Grundler <grundler@parisc-linux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

0c796f91

net: shrinks struct net_device · cd13539b

Eric Dumazet authored Sep 16, 2010

commit ab95bfe0 (net: replace hooks in __netif_receive_skb) added
rx_handler at wrong place, between two cache line aligned objects,
creating a big hole (a full cache line)

Move rx_handler and rx_handler_data before rx_queue, filling existing
hole.

Move master field in the cache line(s) used in receive path.

This saves 64 bytes (or L1_CACHE_BYTES), and avoids two possible
cache misses in receive path.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cd13539b

ip6tnl: get rid of ip6_tnl_lock · 94767632

Eric Dumazet authored Sep 15, 2010

As RTNL is held while doing tunnels inserts and deletes, we can remove
ip6_tnl_lock spinlock. My initial RCU conversion was conservative and
converted the rwlock to spinlock, with no RTNL requirement.

Use appropriate rcu annotations and modern lockdep checks as well.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

94767632

net: include inetdevice.h for rcu_dereference_raw api change · caeda9b9

Stephen Rothwell authored Sep 16, 2010

rcu_dereference_raw() now needs to know the type of its argument.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

caeda9b9

16 Sep, 2010 19 commits

net: enable GRO by default for vlan devices · 16c3ea78

Brandon Philips authored Sep 15, 2010

Currently vlan devices don't have GRO by default as none of the Ethernet
drivers add NETIF_F_GRO to their vlan_features.

As GRO is a software feature add GRO to dev->vlan_features in
register_netdevice() and let vlan_dev_init() take care that it gets
enabled only when dev->features has NETIF_F_GRO too.
Signed-off-by: Brandon Philips <bphilips@suse.de>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

16c3ea78

ehea: Remove a silly return · b1cbd5f9

Breno Leitao authored Sep 15, 2010

This patch removes the unconditional return in the end of the
function check_sqs()
Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b1cbd5f9

sky2: enable GRO by default · 1953925e

stephen hemminger authored Sep 15, 2010

The driver has supported GRO for a while, but it was not enabled
by default.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1953925e

ipv4: ip_ptr cleanups · 95ae6b22

Eric Dumazet authored Sep 15, 2010

dev->ip_ptr is protected by rtnl and rcu.

Yet some places dont use appropriate primitives and/or locking rules.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

95ae6b22

phonet: Fix build warning. · 9e0064a5

David S. Miller authored Sep 15, 2010

net/phonet/socket.c: In function ‘pn_res_seq_show’:
net/phonet/socket.c:726: warning: format ‘%02X’ expects type ‘unsigned int’, but argument 3 has type ‘long int’
Signed-off-by: David S. Miller <davem@davemloft.net>

9e0064a5

Phonet: resource routing documentation · 274a517e

Rémi Denis-Courmont authored Sep 15, 2010

Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

274a517e

Phonet: list subscribed resources via proc_fs · 507215f8

Rémi Denis-Courmont authored Sep 15, 2010

Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

507215f8

Phonet: look up the resource routing table when forwarding · b6a563b2

Rémi Denis-Courmont authored Sep 15, 2010

Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b6a563b2

Phonet: hook resource routing to userspace via ioctl()'s · 7417fa83

Rémi Denis-Courmont authored Sep 15, 2010

I wish we could use something cleaner, such as bind(). But that would
not work since resource subscription is orthogonal/in addition to the
normal object ID allocated via bind(). This is similar to multicasting
which also uses ioctl()'s.
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7417fa83

Phonet: resource routing backend · 4e3d16ce

Rémi Denis-Courmont authored Sep 15, 2010

When both destination device and object are nul, Phonet routes the
packet according to the resource field. In fact, this is the most
common pattern when sending Phonet "request" packets. In this case,
the packet is delivered to whichever endpoint (socket) has
registered the resource.

This adds a new table so that Linux processes can register their
Phonet sockets to Phonet resources, if they have adequate privileges.

(Namespace support is not implemented at the moment.)
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4e3d16ce

Phonet: remove dangling pipe if an endpoint is closed early · 6482f554

Rémi Denis-Courmont authored Sep 15, 2010

Closing a pipe endpoint is not normally allowed by the Phonet pipe,
other than as a side after-effect of removing the pipe between two
endpoints. But there is no way to prevent Linux userspace processes
from being killed or suffering from bugs, so this can still happen.
We might as well forcefully close Phonet pipe endpoints then.

The cellular modem supports only a few existing pipes at a time. So we
really should not leak them. This change instructs the modem to destroy
the pipe if either of the pipe's endpoint (Linux socket) is closed too
early.
Signed-off-by: Rémi Denis-Courmont <remi.denis-courmont@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6482f554

Merge branch 'dccp' of git://eden-feed.erg.abdn.ac.uk/net-next-2.6 · 7fedd7e5
David S. Miller authored Sep 15, 2010

7fedd7e5

misdn: kill big kernel lock · 1a19eb75

Arnd Bergmann authored Sep 14, 2010

The use of the big kernel lock in misdn is completely
bogus, so let's just remove it.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Karsten Keil <isdn@linux-pingi.de>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>

1a19eb75

i4l: kill big kernel lock · 72250d44

Arnd Bergmann authored Sep 14, 2010

The isdn4linux driver uses the big kernel lock only
to serialize access to a few fields in its own
modem_info structure.

The easiest replacement is a driver-wide mutex.
More fine-grained locking would be more appropriate
here, but likely harder to implement.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Karsten Keil <isdn@linux-pingi.de>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>

72250d44

irda/irnet: use noop_llseek · 0a5f1d47

Arnd Bergmann authored Sep 14, 2010

There may be applications trying to seek
on the irnet character device, so we should
use noop_llseek to avoid returning an error
when the default llseek changes to no_llseek.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Samuel Ortiz <samuel@sortiz.org>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>

0a5f1d47

sit: get rid of ipip6_lock · 3a43be3c

Eric Dumazet authored Sep 15, 2010

As RTNL is held while doing tunnels inserts and deletes, we can remove
ipip6_lock spinlock. My initial RCU conversion was conservative and
converted the rwlock to spinlock, with no RTNL requirement.

Use appropriate rcu annotations and modern lockdep checks as well.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3a43be3c

gre: get rid of ipgre_lock · 1507850b

Eric Dumazet authored Sep 15, 2010

As RTNL is held while doing tunnels inserts and deletes, we can remove
ipgre_lock spinlock. My initial RCU conversion was conservative and
converted the rwlock to spinlock, with no RTNL requirement.

Use appropriate rcu annotations and modern lockdep checks as well.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1507850b

ipip: get rid of ipip_lock · b7285b79

Eric Dumazet authored Sep 15, 2010

As RTNL is held while doing tunnels inserts and deletes, we can remove
ipip_lock spinlock. My initial RCU conversion was conservative and
converted the rwlock to spinlock, with no RTNL requirement.

Use appropriate rcu annotations and modern lockdep checks as well.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b7285b79

net: add rtnl_dereference() · 7dff59ef

Eric Dumazet authored Sep 15, 2010

We sometime want to dereference an rcu protected pointer while
holding RTNL. Use a macro to hide all lockdep details.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7dff59ef

15 Sep, 2010 14 commits

ethtool: Remove unimplemented flow specification types · e0de7c93

Ben Hutchings authored Sep 14, 2010

struct ethtool_rawip4_spec and struct ethtool_ether_spec are neither
commented nor used by any driver, so remove them.  Adjust padding in
the user-visible unions that included these structures.

Fix references to struct ethtool_rawip4_spec in
ethtool_get_rx_ntuple(), which should use struct ethtool_usrip4_spec.

struct ethtool_usrip4_spec cannot hold IPv6 host addresses and there
is no separate structure that can, so remove ETH_RX_NFC_IP6 and the
reference to it in niu.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e0de7c93

ethtool: Complete kernel-doc comments for RX flow filter and hash control · e0355873

Ben Hutchings authored Sep 14, 2010

There are now several interfaces within the ethtool API for getting
and setting RX flow filtering and hashing behaviour, most of which are
poorly documented.  This adds kernel-doc comments for all these
interfaces, based on the existing incomplete comments and on the
initial implementations.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e0355873

tg3: phy tmp variable roundup · f833c4c1

Matt Carlson authored Sep 15, 2010

The tg3's phy routines define temporary variables in many locations
within the same routine.  This patch unifies all temporary variables
into one location.
Reviewed-by: Benjamin Li <benli@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f833c4c1

tg3: Dynamically allocate VPD data memory · a4a8bb15

Matt Carlson authored Sep 15, 2010

This patch eases stack pressure by dynamically allocating the memory
used to temporarily store VPD data.
Reviewed-by: Benjamin Li <benli@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a4a8bb15

tg3: Use skb_is_gso_v6() · 02e96080

Matt Carlson authored Sep 15, 2010

This patch converts the driver to prefer the skb_is_gso_v6() helper over
the explicit inlined version.
Reviewed-by: Benjamin Li <benli@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

02e96080

tg3: Move producer ring struct to tg3_napi · 8fea32b9

Matt Carlson authored Sep 15, 2010

Now that each NAPI instance has its own producer ring, it no longer
makes sense to keep the producer ring structure external.  This patch
migrates the producer ring struct to tg3_napi and pivots the code to the
new implementation.
Reviewed-by: Benjamin Li <benli@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8fea32b9

tg3: Clarify semantics of TG3_IRQ_MAX_VECS · 6fd45cb8

Matt Carlson authored Sep 15, 2010

TG3_IRQ_MAX_VECS should be seen as the maximum number of vectors that
any device could be expected to use.  tp->irq_max represents the maximum
number of vectors the current device can use.  This patch clarifies the
semantics of the code to match the above description.
Reviewed-by: Benjamin Li <benli@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6fd45cb8

tg3: Unlock 5717 B0+ support · 2e9f7a74

Matt Carlson authored Sep 15, 2010

This patch adjusts the driver to use the tg3_start_xmit_dma_bug()
transmit routine for all revisions of 5717 asic rev devices and then
allows the driver to attach to B0 and later devices.
Reviewed-by: Benjamin Li <benli@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2e9f7a74

tg3: Don't send APE events for NCSI firmware · dc6d0744

Matt Carlson authored Sep 15, 2010

NCSI firmware does not accept APE events.  It relies on a "driver state"
location in shared memory to tell it what the driver's current state is.

This patch pivots the code to use the new driver state scheme.
Reviewed-by: Benjamin Li <benli@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dc6d0744

tg3: Disable TSS · f0392d24

Matt Carlson authored Sep 15, 2010

It was recently discovered that enabling TSS can lockup the device.
This patch disables the feature until a suitable workaround can be
found.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f0392d24

tg3: Fix read DMA FIFO overruns on recent devices · 41a8a7ee

Matt Carlson authored Sep 15, 2010

Earlier versions of tg3 devices had a problem where the read DMA FIFO
could be overrun in certain edge conditions.  The fix was to limit the
number of rx BDs the hardware would fetch at a time.  For later devices
(5761, 5784 and later ASIC revs), there is a hardware fix that must be
enabled to fix the same problem.  This patch adds that hardware fix.

There is a gap in the ASIC revision lineage where neither fix is
applied.  This is intentional as these ASIC revisions are not afflicted
by the bug.
Reviewed-by: Benjamin Li <benli@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

41a8a7ee

dccp ccid-3: Simplify and consolidate tx_parse_options · 37efb03f

Gerrit Renker authored Sep 14, 2010

This simplifies and consolidates the TX option-parsing code:

 1. The Loss Intervals option is not currently used, so dead code related to
    this option is removed. I am aware of no plans to support the option, but
    if someone wants to implement it (e.g. for inter-op tests), it is better
    to start afresh than having to also update currently unused code.

 2. The Loss Event and Receive Rate options have a lot of code in common (both
    are 32 bit, both have same length etc.), so this is consolidated.

 3. The test against GSR is not necessary, because
    - on first loading CCID3, ccid_new() zeroes out all fields in the socket;
    - ccid3_hc_tx_packet_recv() treats 0 and ~0U equivalently, due to

	pinv = opt_recv->ccid3or_loss_event_rate;
	if (pinv == ~0U || pinv == 0)
		hctx->p = 0;

    - as a result, the sequence number field is removed from opt_recv.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>

37efb03f

dccp ccid-3: remove buggy RTT-sampling history lookup · d2c72630

Gerrit Renker authored Sep 14, 2010

This removes the RTT-sampling function tfrc_tx_hist_rtt(), since

1. it suffered from complex passing of return values (the return value both
indicated successful lookup while the value doubled as RTT sample);

2. when for some odd reason the sample value equalled 0, this triggered a bug
warning about "bogus Ack", due to the ambiguity of the return value;

3. on a passive host which has not sent anything the TX history is empty and
thus will lead to unwanted "bogus Ack" warnings such as
ccid3_hc_tx_packet_recv: server(e7b7d518): DATAACK with bogus ACK-28197148
ccid3_hc_tx_packet_recv: server(e7b7d518): DATAACK with bogus ACK-26641606.

The fix is to replace the implicit encoding by performing the steps manually.

Furthermore, the "bogus Ack" warning has been removed, since it can actually be
triggered due to several reasons (network reordering, old packet, (3) above),
hence it is not very useful.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>

d2c72630

dccp ccid-3: A lower bound for the inter-packet scheduling algorithm · 20cbd3e1

Gerrit Renker authored Sep 14, 2010

This fixes a subtle bug in the calculation of the inter-packet gap and shows
that t_delta, as it is currently used, is not needed.

The algorithm from RFC 5348, 8.3 below continually computes a send time t_nom,
which is initialised with the current time t_now; t_gran = 1E6 / HZ specifies
the scheduling granularity, s the packet size, and X the sending rate:

  t_distance = t_nom - t_now;		// in microseconds
  t_delta    = min(t_ipi, t_gran) / 2;	// `delta' parameter in microseconds

  if (t_distance >= t_delta) {
	reschedule after (t_distance / 1000) milliseconds;
  } else {
  	t_ipi  = s / X;			// inter-packet interval in usec
	t_nom += t_ipi;			// compute the next send time
	send packet now;
  }

Problem:
--------
Rescheduling requires a conversion into milliseconds (sk_reset_timer()). The
highest jiffy resolution with HZ=1000 is 1 millisecond, so using a higher
granularity does not make much sense here.

As a consequence, values of t_distance < 1000 are truncated to 0. This issue
has so far been resolved by using instead

  if (t_distance >= t_delta + 1000)
	reschedule after (t_distance / 1000) milliseconds;

This is unnecessarily large, a lower bound is t_delta' = max(t_delta, 1000).
And it implies a further simplification:

 a) when HZ >= 500, then t_delta <= t_gran/2 = 10^6/(2*HZ) <= 1000, so that
    t_delta' = MAX(1000, t_delta) = 1000 (constant value);

 b) when HZ < 500, then t_delta = 1/2*MIN(rtt, t_ipi, t_gran) <= t_gran/2,
    so that 1000 <= t_delta' <= t_gran/2.

The maximum error of using a constant t_delta in (b) is less than half a jiffy.

Fix:
----
The patch replaces t_delta with a constant, whose value depends on CONFIG_HZ,
changing the above algorithm to:

  if (t_distance >= t_delta')
	reschedule after (t_distance / 1000) milliseconds;

where t_delta' = 10^6/(2*HZ) if HZ < 500, and t_delta' = 1000 otherwise.
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>

20cbd3e1