Commits · 265ec9275809067fa6925508bed0a590f48286d3 · nexedi / linux

29 Mar, 2015 6 commits

be2net: bump up the driver version to 10.6.0.1 · 265ec927

Sathya Perla authored Mar 26, 2015

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: setup xps queue mapping · 73f394e6

Sathya Perla authored Mar 26, 2015

This patch sets up xps queue mapping on load, so that TX traffic is
steered to the queue whose irqs are being processed by the current cpu.
This helps in avoiding TX lock contention.
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

be2net: assign CPU affinity hints to be2net IRQs · d658d98a

Padmanabh Ratnakar authored Mar 26, 2015

This patch provides hints to irqbalance to map be2net IRQs to
specific CPU cores. cpumask_set_cpu_local_first() is used, which first
maps IRQs to near NUMA cores; when those cores are exhausted, IRQs are
mapped to far NUMA cores.
Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
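
The "near cores first, then far cores" ordering described above can be illustrated with a small userspace sketch. This is not the kernel's cpumask_set_cpu_local_first(); pick_cpu_local_first(), near_cpus and far_cpus are hypothetical names, and wrapping around once every CPU has been used is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy "local first" CPU choice for the i-th IRQ.
 * near_cpus: CPU ids on the device's NUMA node (hypothetical input).
 * far_cpus:  CPU ids on other NUMA nodes (hypothetical input).
 */
static int pick_cpu_local_first(size_t i,
                                const int *near_cpus, size_t n_near,
                                const int *far_cpus, size_t n_far)
{
    assert(n_near + n_far > 0);

    /* Assumption: start over once all CPUs have been handed out. */
    i %= n_near + n_far;

    if (i < n_near)
        return near_cpus[i];      /* near cores are used first */
    return far_cpus[i - n_near];  /* then the far cores */
}
```

With near CPUs {0, 1} and far CPUs {8, 9}, IRQs 0 to 3 get CPUs 0, 1, 8 and 9 in that order.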

tcp: tcp_syn_flood_action() can be static · 41d25fe0

Eric Dumazet authored Mar 25, 2015

After commit 1fb6f159 ("tcp: add tcp_conn_request"),
tcp_syn_flood_action() is no longer used from IPv6.

We can make it static, by moving it above tcp_conn_request()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Octavian Purdila <octavian.purdila@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cxgb4: fix boolreturn.cocci warnings · 1fb7cd4e

Wu Fengguang authored Mar 26, 2015

drivers/net/ethernet/chelsio/cxgb4/cxgb4_fcoe.c:49:9-10: WARNING: return of 0/1 in function 'cxgb_fcoe_sof_eof_supported' with return type bool

 Return statements in functions returning bool should use
 true/false instead of 1/0.
Generated by: scripts/coccinelle/misc/boolreturn.cocci

CC: Varun Prakash <varun@chelsio.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
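
As a minimal illustration of what the boolreturn.cocci rule asks for, the sketch below uses a hypothetical predicate (delimiter_supported() and its constants are made up, not taken from cxgb4_fcoe.c) that returns true/false instead of 1/0.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical predicate: the accepted codes are illustrative only. */
static bool delimiter_supported(unsigned int code)
{
    if (code == 0x2e || code == 0x36)
        return true;    /* rather than "return 1;" */

    return false;       /* rather than "return 0;" */
}

/* Small usage example. */
static void delimiter_demo(void)
{
    assert(delimiter_supported(0x2e));
    assert(!delimiter_supported(0x00));
}
```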

fib6: install fib6 ops in the last step · 85b99092

WANG Cong authored Mar 25, 2015

We should not commit the new ops until we finish
all the setup, otherwise we have to NULL it on failure.
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
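
The "publish last" pattern described above can be sketched in userspace C. The structures and netns_init() below are hypothetical stand-ins, not the fib6 code; the fail_table_alloc parameter exists only to exercise the error path.

```c
#include <assert.h>
#include <stdlib.h>

struct ops { int family; };      /* hypothetical ops structure */

struct netns {                   /* hypothetical per-namespace state */
    struct ops *ops;
    int *table;
};

/* Returns 0 on success, -1 on failure. */
static int netns_init(struct netns *ns, int fail_table_alloc)
{
    assert(ns != NULL);

    struct ops *ops = calloc(1, sizeof(*ops));
    if (!ops)
        return -1;
    ops->family = 10;

    int *table = fail_table_alloc ? NULL : calloc(16, sizeof(*table));
    if (!table) {
        free(ops);       /* ns->ops was never written, nothing to reset */
        return -1;
    }

    ns->table = table;
    ns->ops = ops;       /* last step: commit the new ops */
    return 0;
}
```

Because ns->ops is assigned only after every fallible step has succeeded, the failure path does not need to set it back to NULL.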

27 Mar, 2015 12 commits

Merge branch 'bcmgenet-next' · 7145074b

David S. Miller authored Mar 27, 2015

Petri Gynther says:

====================
net: bcmgenet: multiple Rx queues support

Final patch set to add support for multiple Rx queues:
1. remove priv->int0_mask and priv->int1_mask
2. modify Tx ring int_enable and int_disable vectors
3. simplify bcmgenet_init_dma()
4. tweak init_umac()
5. rework Tx NAPI code
6. rework Rx NAPI code
7. add support for multiple Rx queues
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

net: bcmgenet: add support for multiple Rx queues · 4055eaef

Petri Gynther authored Mar 25, 2015

Add support for multiple Rx queues:
1. Add NAPI context per Rx queue
2. Modify Rx interrupt and Rx NAPI code to handle multiple Rx queues
Signed-off-by: Petri Gynther <pgynther@google.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: bcmgenet: rework Rx NAPI code · 3ab11339

Petri Gynther authored Mar 25, 2015

Introduce new bcmgenet functions to handle the NAPI calls to:
netif_napi_add()
napi_enable()
napi_disable()
netif_napi_del()
Signed-off-by: Petri Gynther <pgynther@google.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: bcmgenet: rework Tx NAPI code · e2aadb4a

Petri Gynther authored Mar 25, 2015

Introduce new bcmgenet functions to handle the NAPI calls to:
netif_napi_add()
napi_enable()
napi_disable()
netif_napi_del()
Signed-off-by: Petri Gynther <pgynther@google.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: bcmgenet: tweak init_umac() · b2e97eca

Petri Gynther authored Mar 25, 2015

Use more meaningful variable names int0_enable and int1_enable when
enabling bcmgenet interrupts.

For Rx default queue interrupts, use:
UMAC_IRQ_RXDMA_BDONE | UMAC_IRQ_RXDMA_PDONE

For Tx default queue interrupts, use:
UMAC_IRQ_TXDMA_BDONE | UMAC_IRQ_TXDMA_PDONE
Signed-off-by: Petri Gynther <pgynther@google.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: bcmgenet: simplify bcmgenet_init_dma() · ebbd96fb

Petri Gynther authored Mar 25, 2015

Do the two kcalloc() calls first, before proceeding into Rx/Tx DMA init.
Makes the error case handling much simpler.
Signed-off-by: Petri Gynther <pgynther@google.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Jaedon Shin <jaedon.shin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
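
A rough userspace sketch of the "allocate first" idea follows. It is not the bcmgenet code: struct ring_state, init_dma() and the hw_ready flag are hypothetical, and calloc() stands in for kcalloc().

```c
#include <assert.h>
#include <stdlib.h>

struct ring_state {              /* hypothetical driver state */
    int *tx_cbs;
    int *rx_cbs;
    int hw_ready;
};

/* Returns 0 on success, -1 if either allocation fails. */
static int init_dma(struct ring_state *st, size_t n_tx, size_t n_rx)
{
    assert(n_tx > 0 && n_rx > 0);

    /* Both allocations happen before any ring setup. */
    st->tx_cbs = calloc(n_tx, sizeof(*st->tx_cbs));
    st->rx_cbs = calloc(n_rx, sizeof(*st->rx_cbs));
    if (!st->tx_cbs || !st->rx_cbs) {
        free(st->tx_cbs);        /* free(NULL) is a no-op */
        free(st->rx_cbs);
        st->tx_cbs = NULL;
        st->rx_cbs = NULL;
        return -1;
    }

    /* Ring setup starts here; no allocation can fail past this point. */
    st->hw_ready = 1;
    return 0;
}
```

The only error path sits before any ring setup, so it never has to undo partially initialised rings.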

net: bcmgenet: modify Tx ring int_enable and int_disable vectors · 9dbac28f

Petri Gynther authored Mar 25, 2015

Remove unnecessary function parameter priv. Use ring->priv instead.
Signed-off-by: Petri Gynther <pgynther@google.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: bcmgenet: remove priv->int0_mask and priv->int1_mask · e412b104

Petri Gynther authored Mar 25, 2015

Remove unused priv->int0_mask and priv->int1_mask.
Signed-off-by: Petri Gynther <pgynther@google.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'xgene-next' · 19cc2dec

David S. Miller authored Mar 27, 2015

Iyappan Subramanian says:

====================
drivers: net: xgene: Add separate tx completion ring

SGMII-based 1GbE and 10GbE interfaces support multiple interrupts.
This series adds a separate TX completion descriptor ring and associates a dedicated IRQ with TX completion.
====================
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: Keyur Chudgar <kchudgar@apm.com>

drivers: net: xgene: Add separate tx completion ring · 6772b653

Iyappan Subramanian authored Mar 25, 2015

- Added wrapper functions around napi_add, napi_del, napi_enable and napi_disable
- Moved platform_get_irq function call after reading phy_mode
- Associating the new irq to tx completion for the supported ethernet interfaces
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: Keyur Chudgar <kchudgar@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dtb: xgene: Add interrupt for Tx completion · d3134649

Iyappan Subramanian authored Mar 25, 2015

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: Keyur Chudgar <kchudgar@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Documentation: dts: xgene: Update interrupt field description · 7e7d638a

Iyappan Subramanian authored Mar 25, 2015

Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: Keyur Chudgar <kchudgar@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

25 Mar, 2015 20 commits

ipv6: hash net ptr into fragmentation bucket selection · 5a352dd0

Hannes Frederic Sowa authored Mar 25, 2015

As namespaces are sometimes used with overlapping ip address ranges,
we should also use the namespace as input to the hash to select the ip
fragmentation counter bucket.

Cc: Eric Dumazet <edumazet@google.com>
Cc: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
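
The effect of adding the namespace to the hash input can be shown with a toy hash. Everything here is an assumption made for illustration: the kernel uses its own hash helpers and the net pointer, whereas this sketch uses a 32-bit ns_id and the MurmurHash3 finalizer as the mixing step.

```c
#include <assert.h>
#include <stdint.h>

/* MurmurHash3 32-bit finalizer, used here only as a generic mixer. */
static uint32_t mix32(uint32_t h)
{
    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

/* Hash of the address pair plus a namespace identifier (hypothetical). */
static uint32_t frag_hash(uint32_t saddr, uint32_t daddr, uint32_t ns_id)
{
    return mix32(daddr ^ mix32(saddr ^ mix32(ns_id)));
}

/* Bucket index; nbuckets must be a power of two. */
static uint32_t frag_bucket(uint32_t saddr, uint32_t daddr,
                            uint32_t ns_id, uint32_t nbuckets)
{
    assert(nbuckets != 0 && (nbuckets & (nbuckets - 1)) == 0);
    return frag_hash(saddr, daddr, ns_id) & (nbuckets - 1);
}
```

Each mixing step is invertible, so two namespaces with the same address pair always produce different 32-bit hashes; after reduction to a bucket index they can still collide, just as any two flows can.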

ipv4: hash net ptr into fragmentation bucket selection · b6a7719a

Hannes Frederic Sowa authored Mar 25, 2015

As namespaces are sometimes used with overlapping ip address ranges,
we should also use the namespace as input to the hash to select the ip
fragmentation counter bucket.

Cc: Eric Dumazet <edumazet@google.com>
Cc: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'tipc-next' · 8fa38a38

David S. Miller authored Mar 25, 2015

Jon Maloy says:

====================
tipc: some improvements and fixes

We introduce a better algorithm for selecting when and which
users should be subject to link congestion control, plus clean
up some code for that mechanism.
Commit #3 fixes another rare race condition during packet reception.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: eliminate race condition at dual link establishment · 8b4ed863

Jon Paul Maloy authored Mar 25, 2015

Despite recent improvements, the establishment of dual parallel
links still has a small glitch where messages can bypass each
other. When the second link in a dual-link configuration is
established, part of the first link's traffic will be steered over
to the new link. Although we do have a mechanism to ensure that
packets sent before and after the establishment of the new link
arrive in sequence to the destination node, this is not enough.
The arriving messages will still be delivered upwards in different
threads, something entailing a risk of message disordering during
the transition phase.

To fix this, we introduce a synchronization mechanism between the
two parallel links, so that traffic arriving on the new link cannot
be added to its input queue until we are guaranteed that all
pre-establishment messages have been delivered on the old, parallel
link.

This problem seems to always have been around, but its occurrence is
so rare that it has not been noticed until recent intensive testing.
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: clean up handling of link congestion · 3127a020

Jon Paul Maloy authored Mar 25, 2015

After the recent changes in message importance handling it becomes
possible to simplify handling of messages and sockets when we
encounter link congestion.

We merge the function tipc_link_cong() into link_schedule_user(),
and simplify the code of the latter. The code should now be
easier to follow, especially regarding return codes and handling
of the message that caused the situation.

In case the scheduling function is unable to pre-allocate a wakeup
message buffer, it now returns -ENOBUFS, which is a more correct
code than the previously used -EHOSTUNREACH.
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: introduce starvation free send algorithm · 1f66d161

Jon Paul Maloy authored Mar 25, 2015

Currently, we only use a single counter, the length of the backlog
queue, to determine whether a message should be accepted to the queue
or not. Each time a message is being sent, the queue length is compared
to a threshold value for the message's importance priority. If the queue
length is beyond this threshold, the message is rejected. This algorithm
implies a risk of starvation of low importance senders during very high
load, because it may take a long time before the backlog queue has
decreased enough to accept a lower level message.

We now eliminate this risk by introducing a counter for each importance
priority. When a message is sent, we check only the queue level for that
particular message's priority. If that is ok, the message can be added
to the backlog, irrespective of the queue level for other priorities.
This way, each level is guaranteed a certain portion of the total
bandwidth, and any risk of starvation is eliminated.
Reviewed-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
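
The per-priority accounting described above can be sketched as follows. The names (struct backlog, backlog_accept(), the IMP_* levels) and the limits are hypothetical; the sketch only models the admission check, not TIPC's queues or wakeup handling.

```c
#include <assert.h>
#include <stdbool.h>

enum { IMP_LOW, IMP_MEDIUM, IMP_HIGH, IMP_CRITICAL, IMP_LEVELS };

struct backlog {
    unsigned int len[IMP_LEVELS];    /* queued messages per level */
    unsigned int limit[IMP_LEVELS];  /* admission limit per level */
};

/*
 * Admit a message of importance imp if its own level has room.
 * The fill level of the other priorities is deliberately ignored.
 */
static bool backlog_accept(struct backlog *b, int imp)
{
    assert(imp >= 0 && imp < IMP_LEVELS);

    if (b->len[imp] >= b->limit[imp])
        return false;

    b->len[imp]++;
    return true;
}
```

Even when the high-importance level is full, a low-importance sender is still admitted as long as its own level has room, which is the starvation case the single shared counter could not handle.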

net: dsa: Handle non-bridge master change · b06b107a

Guenter Roeck authored Mar 25, 2015

Master change notifications may occur other than when joining or
leaving a bridge, for example when being added to or removed from
a bond or Open vSwitch. In that case, do nothing instead of asking
the switch driver to remove a port from a bridge that it didn't join.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

crypto: algif - fix warn: unsigned 'used' is never less than zero · ac110f49

tadeusz.struk@intel.com authored Mar 25, 2015

Change type from unsigned long to int to fix an issue reported by kbuild robot:
crypto/algif_skcipher.c:596 skcipher_recvmsg_async() warn: unsigned 'used' is
never less than zero.
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
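
The class of bug behind this warning can be reproduced in a few lines. bytes_available() is a hypothetical helper that returns a negative value on error; it is not the algif_skcipher code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: returns a byte count, or a negative error code. */
static int bytes_available(int simulate_error)
{
    return simulate_error ? -11 : 64;
}

/* Buggy variant: the negative value wraps to a huge unsigned number. */
static bool failed_unsigned(int simulate_error)
{
    unsigned long used = bytes_available(simulate_error);

    return used < 0;    /* always false for an unsigned type */
}

/* Fixed variant: a signed type keeps the error visible. */
static bool failed_signed(int simulate_error)
{
    int used = bytes_available(simulate_error);

    return used < 0;
}

/* Small usage example. */
static void used_demo(void)
{
    assert(!failed_unsigned(1));    /* error silently missed */
    assert(failed_signed(1));       /* error detected */
}
```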

tipc: fix a link reset issue due to retransmission failures · bc14b8d6

Ying Xue authored Mar 25, 2015

When a node joins a cluster while we are transmitting a fragment
stream over the broadcast link, it's missing the preceding fragments
needed to build a meaningful message. As a result, the node has to
drop it. However, as the fragment message is not acknowledged to
its sender before it's dropped, it accidentally causes a link reset
due to retransmission failure on the node.
Reported-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Reviewed-by: Erik Hugne <erik.hugne@ericsson.com>
Tested-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

s390: fix /proc/interrupts output · 358e048d

Sebastian Ott authored Mar 25, 2015

The irqclass_sub_desc array and enum interruption_class are out of sync,
so /proc/interrupts is broken. Remove IRQIO_CLW.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sctp: avoid to repeatedly declare external variables · 7e3ea6d5

Ying Xue authored Mar 25, 2015

Move the declarations of external variables into sctp.h, to avoid
repeatedly declaring them with the extern keyword.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: fix ipv4 mapped request socks · 0144a81c

Eric Dumazet authored Mar 24, 2015

ss should display ipv4 mapped request sockets like this :

tcp    SYN-RECV   0      0  ::ffff:192.168.0.1:8080   ::ffff:192.0.2.1:35261

and not like this :

tcp    SYN-RECV   0      0  192.168.0.1:8080   192.0.2.1:35261

We should init ireq->ireq_family based on listener sk_family,
not the actual protocol carried by SYN packet.

This means we can set ireq_family in inet_reqsk_alloc()

Fixes: 3f66b083 ("inet: introduce ireq_family")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
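
For readers unfamiliar with the ::ffff:a.b.c.d notation in the expected ss output, the userspace sketch below builds an IPv4-mapped IPv6 address from a dotted-quad string and prints it with inet_ntop(). v4_mapped_text() is a hypothetical helper written for this illustration; it does not touch ireq_family or any kernel structure.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Write the IPv4-mapped IPv6 text form of the dotted-quad string v4
 * into out. Returns 0 on success, -1 on a parse or formatting error.
 */
static int v4_mapped_text(const char *v4, char *out, socklen_t outlen)
{
    struct in_addr a4;
    struct in6_addr a6;

    assert(out != NULL);

    if (inet_pton(AF_INET, v4, &a4) != 1)
        return -1;

    /* ::ffff:0:0/96 prefix followed by the 32-bit IPv4 address. */
    memset(&a6, 0, sizeof(a6));
    a6.s6_addr[10] = 0xff;
    a6.s6_addr[11] = 0xff;
    memcpy(&a6.s6_addr[12], &a4, sizeof(a4));

    return inet_ntop(AF_INET6, &a6, out, outlen) ? 0 : -1;
}
```

A socket accepted on an AF_INET6 listener carries addresses in this form even when the peer speaks IPv4, which is why the family has to follow the listener rather than the SYN packet.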

virtio: change comment in transmit · d631b94e

stephen hemminger authored Mar 24, 2015

The original comment was neither informative nor funny, and it was
sexist. Replace it with a better explanation of
why the driver stops and what the impacts are.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

tools: bpf_asm: cleanup vlan extension related token · 835c3d9b

Daniel Borkmann authored Mar 24, 2015

We now have K_VLANT, K_VLANP and K_VLANTPID. Clean them up into more
descriptive tokens, namely K_VLAN_TCI, K_VLAN_AVAIL and K_VLAN_TPID.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'listener_refactor_16' · b1275eb3

David S. Miller authored Mar 24, 2015

Eric Dumazet says:

====================
tcp: listener refactor part 16

A CONFIG_PROVE_RCU=y build revealed an RCU splat I had to fix.

I added const qualifiers to various md5 methods, as I expect
to call them on behalf of request sock traffic even if
the listener socket is not locked. This seems ok, but adding
const makes the contract clearer. Note a good reduction
of code size thanks to request/establish sockets convergence.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: md5: get rid of tcp_v[46]_reqsk_md5_lookup() · fd3a154a

Eric Dumazet authored Mar 24, 2015

With request socks convergence, we no longer need
different lookup methods. A request socket can
use the generic lookup function.

Add const qualifier to 2nd tcp_v[46]_md5_lookup() parameter.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: md5: remove request sock argument of calc_md5_hash() · 39f8e58e

Eric Dumazet authored Mar 24, 2015

Since request and established sockets now have the same base,
there is no need to pass two pointers to tcp_v4_md5_hash_skb()
or tcp_v6_md5_hash_skb().

Also add a const qualifier to their struct tcp_md5sig_key argument.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: md5: input path is run under rcu protected sections · ff74e23f

Eric Dumazet authored Mar 24, 2015

It is guaranteed that both tcp_v4_rcv() and tcp_v6_rcv()
run from rcu read locked sections :

ip_local_deliver_finish() and ip6_input_finish() both
use rcu_read_lock()

Also align tcp_v6_inbound_md5_hash() on tcp_v4_inbound_md5_hash()
by returning a boolean.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: use C99 initializers in new_state[] · 0980c1e3

Eric Dumazet authored Mar 24, 2015

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
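
As a reminder of what C99 designated initializers look like in a state table, here is a small sketch. The states and the next_on_close[] table are hypothetical and much simpler than TCP's new_state[].

```c
#include <assert.h>

enum conn_state {                /* hypothetical states, not TCP's */
    ST_INVALID,
    ST_ESTABLISHED,
    ST_FIN_WAIT,
    ST_CLOSE_WAIT,
    ST_LAST_ACK,
    ST_CLOSED,
    ST_MAX
};

/* Each entry names its index, so the order of the lines is irrelevant. */
static const unsigned char next_on_close[ST_MAX] = {
    [ST_ESTABLISHED] = ST_FIN_WAIT,
    [ST_CLOSE_WAIT]  = ST_LAST_ACK,
    [ST_FIN_WAIT]    = ST_FIN_WAIT,
    [ST_LAST_ACK]    = ST_LAST_ACK,
    [ST_CLOSED]      = ST_CLOSED,
    /* ST_INVALID is not listed and is therefore zero-initialized. */
};

static enum conn_state close_transition(enum conn_state s)
{
    assert(s < ST_MAX);
    return (enum conn_state)next_on_close[s];
}
```

Compared with a positional initializer, the mapping stays correct if the enum is reordered, and each row documents which state it belongs to.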

tcp: md5: fix rcu lockdep splat · 80f03e27

Eric Dumazet authored Mar 24, 2015

While the timer handler effectively runs in an RCU read-locked section,
there are no explicit rcu_read_lock()/rcu_read_unlock() annotations,
and lockdep can be confused here:

net/ipv4/tcp_ipv4.c-906- /* caller either holds rcu_read_lock() or socket lock */
net/ipv4/tcp_ipv4.c:907: md5sig = rcu_dereference_check(tp->md5sig_info,
net/ipv4/tcp_ipv4.c-908- sock_owned_by_user(sk) ||
net/ipv4/tcp_ipv4.c-909- lockdep_is_held(&sk->sk_lock.slock));

Let's explicitly acquire rcu_read_lock() in tcp_make_synack().

Before commit fa76ce73 ("inet: get rid of central tcp/dccp listener
timer"), we were holding listener lock so lockdep was happy.

Fixes: fa76ce73 ("inet: get rid of central tcp/dccp listener timer")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

24 Mar, 2015 2 commits

Merge branch 'rhashtable-next' · 9ead3527

David S. Miller authored Mar 24, 2015

Thomas Graf says:

====================
rhashtable updates on top of Herbert's work

Patch 1 is a bugfix for an RCU splat I encountered while testing.
Patches 2 and 3 are pure cleanups. Patch 4 disables automatic shrinking
by default, as discussed in a previous thread. Patch 5 removes some
rhashtable internal knowledge from nft_hash and fixes another RCU
splat.

I've pushed various rhashtable tests (Netlink, nft) together with a
Makefile to a git tree [0] for easier stress testing.

[0] https://github.com/tgraf/rhashtable
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

rhashtable: Add rhashtable_free_and_destroy() · 6b6f302c

Thomas Graf authored Mar 24, 2015

A rhashtable_destroy() variant which stops rehashes, iterates over
the table and calls a callback to release resources.

Avoids the need for nft_hash to embed rhashtable internals and allows
getting rid of the being_destroyed flag. It also saves a 2nd mutex
lock upon destruction.

Also fixes an RCU lockdep splat on nft set destruction due to
calling rht_for_each_entry_safe() without holding bucket locks.
Open code this loop, as we know that no mutations may occur in
parallel.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
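
The "walk the table and hand every entry to a callback" idea can be sketched with a toy chained hash table. None of this is the rhashtable API: struct table, table_free_and_destroy() and the callback type are hypothetical, and the sketch has no rehashing, locking or RCU.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    struct node *next;
    int key;
};

struct table {
    struct node **buckets;
    size_t nbuckets;
};

static int table_init(struct table *t, size_t nbuckets)
{
    assert(nbuckets > 0);
    t->buckets = calloc(nbuckets, sizeof(*t->buckets));
    t->nbuckets = t->buckets ? nbuckets : 0;
    return t->buckets ? 0 : -1;
}

static int table_insert(struct table *t, int key)
{
    assert(t->nbuckets > 0);

    struct node *n = malloc(sizeof(*n));
    if (!n)
        return -1;

    size_t i = (size_t)(unsigned int)key % t->nbuckets;
    n->key = key;
    n->next = t->buckets[i];     /* push onto the bucket's chain */
    t->buckets[i] = n;
    return 0;
}

/*
 * Call free_fn on every entry, then release the bucket array.
 * The caller must guarantee that nothing else mutates the table.
 * If free_fn is NULL, the entries themselves are not freed.
 */
static void table_free_and_destroy(struct table *t,
                                   void (*free_fn)(struct node *n, void *arg),
                                   void *arg)
{
    for (size_t i = 0; i < t->nbuckets; i++) {
        struct node *n = t->buckets[i];

        while (n) {
            struct node *next = n->next;  /* read before n is freed */

            if (free_fn)
                free_fn(n, arg);
            n = next;
        }
    }
    free(t->buckets);
    t->buckets = NULL;
    t->nbuckets = 0;
}

/* Example callback: count the entries and free them. */
static void count_and_free(struct node *n, void *arg)
{
    ++*(size_t *)arg;
    free(n);
}
```

Folding the walk into the destroy call means the user of the table no longer needs to know how buckets are laid out, which is the decoupling the commit describes for nft_hash.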
