- 24 Aug, 2012 16 commits
-
-
Ben Hutchings authored
We try to defer resets while the device is not READY, but we're not doing this quite correctly. In particular, changes to efx_nic::state are documented as serialised by the RTNL lock, but they aren't. 1. We check whether a reset was requested during probe (suggesting broken hardware) before we allow requested resets to be scheduled. This leaves a window where a requested reset would be deferred indefinitely. 2. Although we cancel the reset work item during device removal, there are still later operations that can cause it to be scheduled again. We need to check the state before scheduling it. 3. Since the state can change between scheduling and running of the work item, we still need to check it there, and we need to do so *after* acquiring the RTNL lock which serialises state changes. 4. We must cancel the reset work item during device removal, if the state could ever have been READY. This wasn't done in some of the failure paths from efx_pci_probe(). Move the cancellation to efx_pci_remove_main(). Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Ben Hutchings authored
The current informational message doesn't properly explain what happens, and could also appear if we defer a reset during suspend/resume. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Ben Hutchings authored
efx_change_mtu() and efx_realloc_channels() each stop and start much of the NIC, even if it has been disabled. Since efx_start_all() is a no-op when the NIC is disabled, this is probably harmless in the case of efx_change_mtu(), but efx_realloc_channels() also reenables interrupts which could be a bad thing to do. Change efx_start_all() and efx_start_interrupts() to assert that the NIC is not disabled, but make efx_stop_interrupts() do nothing if the NIC is disabled (since it is already stopped), consistent with efx_stop_all(). Update comments for efx_start_all() and efx_stop_all() to describe their purpose and preconditions more accurately. Add a common function to check and log if the NIC is disabled, and use it in efx_net_open(), efx_change_mtu() and efx_realloc_channels(). Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Ben Hutchings authored
Interrupt state should be consistently guarded by the RTNL lock once the net device is registered. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Ben Hutchings authored
Currently we ignore and clear the disabled state. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Ben Hutchings authored
I don't think these PM functions can race with userland net device operations, but it's much easier to reason about locking if state is consistently guarded by the same lock. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Ben Hutchings authored
STATE_INIT and STATE_FINI are equivalent and represent incompletely initialised states; combine them as STATE_UNINIT. Rename STATE_RUNNING to STATE_READY, to avoid confusion with netif_running() and IFF_RUNNING. The comments do not quite match current usage, but this will be corrected in subsequent fixes. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Ben Hutchings authored
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Ben Hutchings authored
We only use tso_state::full_packet_space to calculate the IPv4 tot_len or IPv6 payload_len, not to set tso_state::packet_space. Replace it with an ip_base_len field holding the value of tot_len or payload_len before including the TCP payload, which is much more useful when constructing the new headers. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Ben Hutchings authored
TSO header buffers contain a control structure immediately followed by the packet headers, and are kept on a free list when not in use. This complicates buffer management and tends to result in cache read misses when we recycle such buffers (particularly if DMA-coherent memory requires caches to be disabled). Replace the free list with a simple mapping by descriptor index. We know that there is always a payload descriptor between any two descriptors with TSO header buffers, so we can allocate only one such buffer for each two descriptors. While we're at it, use a standard error code for allocation failure, not -1. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Ben Hutchings authored
We now have a definite upper bound on the number of descriptors per skb; use that to stop the queue when the next packet might not fit. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Ben Hutchings authored
Add a flags field to struct efx_tx_buffer, replacing the continuation and map_single booleans. Since a single descriptor cannot be both a TSO header and the last descriptor for an skb, unionise efx_tx_buffer::{skb,tsoh} and add flags for validity of these fields. Clear all flags in free buffers (whereas previously the continuation flag would be set). Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Ben Hutchings authored
The operstate of a device is initially IF_OPER_UNKNOWN and is updated asynchronously by linkwatch after each change of carrier state reported by the driver. The default carrier state of a net device is on, and this will never be changed on drivers that do not support carrier detection, thus the operstate remains IF_OPER_UNKNOWN. For devices that do support carrier detection, the driver must set the carrier state to off initially, then poll the hardware state when the device is opened. However, we must not activate linkwatch for a unregistered device, and commit b4730016 ('net: Do not fire linkwatch events until the device is registered.') ensured that we don't. But this means that the operstate for many devices that support carrier detection remains IF_OPER_UNKNOWN when it should be IF_OPER_DOWN. The same issue exists with the dormant state. The proper initialisation sequence, avoiding a race with opening of the device, is: rtnl_lock(); rc = register_netdevice(dev); if (rc) goto out_unlock; netif_carrier_off(dev); /* or netif_dormant_on(dev) */ rtnl_unlock(); but it seems silly that this should have to be repeated in so many drivers. Further, the operstate seen immediately after opening the device may still be IF_OPER_UNKNOWN due to the asynchronous nature of linkwatch. Commit 22604c86 ('net: Fix for initial link state in 2.6.28') attempted to fix this by setting the operstate synchronously, but it was reverted as it could lead to deadlock. This initialises the operstate synchronously at registration time only. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Timur Tabi authored
Similar to fsl_pq_mdio.c, this driver is for the 10G MDIO controller on Freescale Frame Manager Ethernet controllers. Signed-off-by: Timur Tabi <timur@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Neil Horman authored
The network classifier cgroup initalizes each cgroups instance classid value to 0. However, the sock_update_classid function only updates classid's in sockets if the tasks cgroup classid is not zero, and if it differs from the current classid. The later check is to prevent cache line dirtying, but the former is detrimental, as it prevents resetting a classid for a cgroup to 0. While this is not a common action, it has administrative usefulness (if the admin wants to disable classification of a certain group temporarily for instance). Easy fix, just remove the zero check. Tested successfully by myself Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.open-mesh.org/linux-mergeDavid S. Miller authored
Antonio Quartulli says: ==================== Included changes: - a set of codestyle rearrangements/fixes - new feature to early detect new joining (mesh-unaware) clients - a minor fix for the gw-feature - substitution of shift operations with the BIT() macro - reorganization of the main batman-adv structure (struct batadv_priv) - some more (very) minor cleanups and fixes =================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 23 Aug, 2012 24 commits
-
-
Rami Rosen authored
This patch fixes a broken build due to a missing header: ... CC net/ipv4/proc.o In file included from include/net/net_namespace.h:15, from net/ipv4/proc.c:35: include/net/netns/packet.h:11: error: field 'sklist_lock' has incomplete type ... The lock of netns_packet has been replaced by a recent patch to be a mutex instead of a spinlock, but we need to replace the header file to be linux/mutex.h instead of linux/spinlock.h as well. See commit 0fa7fa98: packet: Protect packet sk list with mutex (v2) patch, Signed-off-by: Rami Rosen <rosenr@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Eric Biederman pointed out that not holding RTNL while calling call_netdevice_notifiers() was racy. This patch is a direct transcription his feedback against commit 0115e8e3 (net: remove delay at device dismantle) Thanks Eric ! Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Mahesh Bandewar <maheshb@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Gao feng <gaofeng@cn.fujitsu.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sven Eckelmann authored
Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
-
Antonio Quartulli authored
In order to understand where a broadcast packet is coming from and use this information to detect not yet announced clients, this patch modifies the interface_rx() function by passing a new argument: the orig node corresponding to the node that originated the received packet (if known). This new argument if not NULL for broadcast packets only (other packets does not have source field). Signed-off-by: Antonio Quartulli <ordex@autistici.org>
-
Antonio Quartulli authored
With the current TT mechanism a new client joining the network is not immediately able to communicate with other hosts because its MAC address has not been announced yet. This situation holds until the first OGM containing its joining event will be spread over the mesh network. This behaviour can be acceptable in networks where the originator interval is a small value (e.g. 1sec) but if that value is set to an higher time (e.g. 5secs) the client could suffer from several malfunctions like DHCP client timeouts, etc. This patch adds an early detection mechanism that makes nodes in the network able to recognise "not yet announced clients" by means of the broadcast packets they emitted on connection (e.g. ARP or DHCP request). The added client will then be confirmed upon receiving the OGM claiming it or purged if such OGM is not received within a fixed amount of time. Signed-off-by: Antonio Quartulli <ordex@autistici.org>
-
Sven Eckelmann authored
Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
-
Sven Eckelmann authored
Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
-
Sven Eckelmann authored
Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
-
Martin Hundebøll authored
When enabling promiscuous mode, tt queries for other hosts might be received. Before this patch, "foreign" tt queries were processed like any other query and thus forwarded to its destination again and thereby causing a loop. This patch adds a check to drop foreign tt queries. Signed-off-by: Martin Hundebøll <martin@hundeboll.net> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
-
Martin Hundebøll authored
batadv_check_unicast_packet() is needed in batadv_recv_tt_query(), so move the former to before the latter. Signed-off-by: Martin Hundebøll <martin@hundeboll.net> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
-
Sven Eckelmann authored
The structure batadv_priv grows everytime a new feature is introduced. It gets hard to find the parts of the struct that belongs to a specific feature. This becomes even harder by the fact that not every feature uses a prefix in the member name. The variables for bridge loop avoidence, gateway handling, translation table and visualization server are moved into separate structs that are included in the bat_priv main struct. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
-
Simon Wunderlich authored
If this call fails, some of the orig_nodes spaces may have been resized for the increased number of interface, and some may not. If we would just continue with the larger number of interfaces, this would lead to access to not allocated memory later. We better check the return code, and don't add the interface if no memory is available. OTOH, keeping some of the orig_nodes with too much memory allocated should hurt no one (except for a few too many bytes allocated). Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
-
Antonio Quartulli authored
the word millisecond is misspelled in several comments. This patch fixes it. Signed-off-by: Antonio Quartulli <ordex@autistici.org>
-
Antonio Quartulli authored
The batadv_tt_orig_list_entry structure didn't have any refcounting mechanism so far. This patch introduces it and makes the structure being usable in much more complex context. Signed-off-by: Antonio Quartulli <ordex@autistici.org>
-
Jonathan Corbet authored
As much as I'm happy to see LWN links sprinkled through the kernel by the dozen, this one in particular reflects a very old state of reality; the associated comment is now incorrect. So just delete it. Signed-off-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
-
Marek Lindner authored
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Acked-by: Martin Hundebøll <martin@hundeboll.net> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
-
Simon Wunderlich authored
for consistency reasons within the code and with the documentation, we should always call it "claim" and "unclaim". Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
-
Simon Wunderlich authored
Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
-
Simon Wunderlich authored
This is especially useful if there are no claims yet, but we still want to know which gateways are using bridge loop avoidance in the network. Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
-
Antonio Quartulli authored
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
-
Pavel Emelyanov authored
Change since v1: * Fixed inuse counters access spotted by Eric In patch eea68e2f (packet: Report socket mclist info via diag module) I've introduced a "scheduling in atomic" problem in packet diag module -- the socket list is traversed under rcu_read_lock() while performed under it sk mclist access requires rtnl lock (i.e. -- mutex) to be taken. [152363.820563] BUG: scheduling while atomic: crtools/12517/0x10000002 [152363.820573] 4 locks held by crtools/12517: [152363.820581] #0: (sock_diag_mutex){+.+.+.}, at: [<ffffffff81a2dcb5>] sock_diag_rcv+0x1f/0x3e [152363.820613] #1: (sock_diag_table_mutex){+.+.+.}, at: [<ffffffff81a2de70>] sock_diag_rcv_msg+0xdb/0x11a [152363.820644] #2: (nlk->cb_mutex){+.+.+.}, at: [<ffffffff81a67d01>] netlink_dump+0x23/0x1ab [152363.820693] #3: (rcu_read_lock){.+.+..}, at: [<ffffffff81b6a049>] packet_diag_dump+0x0/0x1af Similar thing was then re-introduced by further packet diag patches (fanount mutex and pgvec mutex for rings) :( Apart from being terribly sorry for the above, I propose to change the packet sk list protection from spinlock to mutex. This lock currently protects two modifications: * sklist * prot inuse counters The sklist modifications can be just reprotected with mutex since they already occur in a sleeping context. The inuse counters modifications are trickier -- the __this_cpu_-s are used inside, thus requiring the caller to handle the potential issues with contexts himself. Since packet sockets' counters are modified in two places only (packet_create and packet_release) we only need to protect the context from being preempted. BH disabling is not required in this case. Signed-off-by: Pavel Emelyanov <xemul@parallels.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Allan, Bruce W authored
The helper functions which translate IEEE MDIO Manageable Device (MMD) Energy-Efficient Ethernet (EEE) registers 3.20, 7.60 and 7.61 to and from the comparable ethtool supported/advertised settings will be needed by drivers other than those in PHYLIB (e.g. e1000e in a follow-on patch). In the same fashion as similar translation functions in linux/mii.h, move these functions from the PHYLIB core to the linux/mdio.h header file so the code will not have to be duplicated in each driver needing MMD-to-ethtool (and vice-versa) translations. The function and some variable names have been renamed to be more descriptive. Not tested on the only hardware that currently calls the related functions, stmmac, because I don't have access to any. Has been compile tested and the translations have been tested on a locally modified version of e1000e. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
danborkmann@iogearbox.net authored
Instead of using a hard-coded value for the status variable, it would make the code more readable to use its destined define from linux/if_packet.h. Signed-off-by: daniel.borkmann@tik.ee.ethz.ch Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ying Xue authored
Since we have already in BH context when *_write_space(), *_data_ready() as well as *_state_change() are called, it's unnecessary to disable BH. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-