Commit 7e299113 authored by Shmulik Hen, committed by Stephen Hemminger

[PATCH] Add support for HW accel. slaves

Now that David Miller accepted the first half of this set into 2.6,
I'm resending the last half to you for inclusion in netdev-2.6.

Tested against latest netdev-2.6. Summary:

Change the bond interface to publish full VLAN hardware acceleration
offloading capabilities, and add capability in all xmit functions to
take special care for VLAN HW accel. tagged skb's that are going out
through a slave that is not offloading capable.

Add a mechanism to collect and save the VLAN Id's that have been
added on top of a bond interface, and propagate the register/add/kill
operations to the slaves.

Add blocking mechanism to prevent adding VLAN interfaces on top of a
bond that contains VLAN challenged slaves and to prevent adding VLAN
challenged slaves to a bond that already has VLAN interfaces on top
of it.

Add a section about VLAN to Documentation/networking/bonding.txt and
also correct some minor spelling/grammar errors.
parent 5055a79b
...@@ -31,6 +31,7 @@ Verifying Bond Configuration ...@@ -31,6 +31,7 @@ Verifying Bond Configuration
Frequently Asked Questions Frequently Asked Questions
High Availability High Availability
Promiscuous Sniffing notes Promiscuous Sniffing notes
8021q VLAN support
Limitations Limitations
Resources and Links Resources and Links
...@@ -140,10 +141,6 @@ probeall bond0 eth0 eth1 bonding ...@@ -140,10 +141,6 @@ probeall bond0 eth0 eth1 bonding
Be careful not to reference bond0 itself at the end of the line, or modprobe Be careful not to reference bond0 itself at the end of the line, or modprobe
will die in an endless recursive loop. will die in an endless recursive loop.
To have device characteristics (such as MTU size) propagate to slave devices,
set the bond characteristics before enslaving the device. The characteristics
are propagated during the enslave process.
If running SNMP agents, the bonding driver should be loaded before any network If running SNMP agents, the bonding driver should be loaded before any network
drivers participating in a bond. This requirement is due to the the interface drivers participating in a bond. This requirement is due to the the interface
index (ipAdEntIfIndex) being associated to the first interface found with a index (ipAdEntIfIndex) being associated to the first interface found with a
...@@ -601,7 +598,7 @@ Frequently Asked Questions ...@@ -601,7 +598,7 @@ Frequently Asked Questions
For ethernet cards not supporting MII status, the arp_interval and For ethernet cards not supporting MII status, the arp_interval and
arp_ip_target parameters must be specified for bonding to work arp_ip_target parameters must be specified for bonding to work
correctly. If packets have not been sent or received during the correctly. If packets have not been sent or received during the
specified arp_interval durration, an ARP request is sent to the specified arp_interval duration, an ARP request is sent to the
targets to generate send and receive traffic. If after this targets to generate send and receive traffic. If after this
interval, either the successful send and/or receive count has not interval, either the successful send and/or receive count has not
incremented, the next slave in the sequence will become the active incremented, the next slave in the sequence will become the active
...@@ -669,16 +666,8 @@ Frequently Asked Questions ...@@ -669,16 +666,8 @@ Frequently Asked Questions
that will be added. that will be added.
To restore your slaves' MAC addresses, you need to detach them To restore your slaves' MAC addresses, you need to detach them
from the bond (`ifenslave -d bond0 eth0'), set them down from the bond (`ifenslave -d bond0 eth0'). The bonding driver will then
(`ifconfig eth0 down'), unload the drivers (`rmmod 3c59x', for restore the MAC addresses that the slaves had before they were enslaved.
example) and reload them to get the MAC addresses from their
eeproms. If the driver is shared by several devices, you need
to turn them all down. Another solution is to look for the MAC
address at boot time (dmesg or tail /var/log/messages) and to
reset it by hand with ifconfig :
# ifconfig eth0 down
# ifconfig eth0 hw ether 00:20:40:60:80:A0
9. Which transmit polices can be used? 9. Which transmit polices can be used?
...@@ -843,7 +832,7 @@ point of failure" solution. ...@@ -843,7 +832,7 @@ point of failure" solution.
In this configuration, there is an ISL - Inter Switch Link (could be a trunk), In this configuration, there is an ISL - Inter Switch Link (could be a trunk),
several servers (host1, host2 ...) attached to both switches each, and one or several servers (host1, host2 ...) attached to both switches each, and one or
more ports to the outside world (port3...). One an only one slave on each host more ports to the outside world (port3...). One and only one slave on each host
is active at a time, while all links are still monitored (the system can is active at a time, while all links are still monitored (the system can
detect a failure of active and backup links). detect a failure of active and backup links).
...@@ -933,6 +922,41 @@ capacity aggregating; but it works fine for unnumbered interfaces; ...@@ -933,6 +922,41 @@ capacity aggregating; but it works fine for unnumbered interfaces;
just ignore all the warnings it emits. just ignore all the warnings it emits.
8021q VLAN support
==================
It is possible to configure VLAN devices over a bond interface using the 8021q
driver. However, only packets coming from the 8021q driver and passing through
bonding will be tagged by default. Self generated packets, like bonding's
learning packets or ARP packets generated by either ALB mode or the ARP
monitor mechanism, are tagged internally by bonding itself. As a result,
bonding has to "learn" what VLAN IDs are configured on top of it, and it uses
those IDs to tag self generated packets.
For simplicity reasons, and to support the use of adapters that can do VLAN
hardware acceleration offloading, the bonding interface declares itself as
fully hardware offloading capable. It gets the add_vid/kill_vid notifications
to gather the necessary information, and it propagates those actions to the
slaves.
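The bookkeeping described above, where the bond records each VLAN ID on add_vid and forgets it on kill_vid, can be sketched in userspace C. This is only an illustration: the names vlan_node, vlan_list_add, vlan_list_kill and vlan_list_count are hypothetical, and the driver itself keeps its entries in a struct vlan_entry on a kernel list_head rather than in the hand-rolled list used here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the per-bond VLAN ID list. */
struct vlan_node {
	struct vlan_node *next;
	unsigned short vlan_id;
};

/* Record a VLAN ID. Returns 0 on success, -1 on duplicate or allocation failure. */
static int vlan_list_add(struct vlan_node **head, unsigned short vlan_id)
{
	struct vlan_node *n;

	assert(head != NULL);
	for (n = *head; n; n = n->next) {
		if (n->vlan_id == vlan_id)
			return -1;	/* already known */
	}
	n = malloc(sizeof(*n));
	if (!n)
		return -1;
	n->vlan_id = vlan_id;
	n->next = *head;
	*head = n;
	return 0;
}

/* Forget a VLAN ID. Returns 0 if it was removed, -1 if it was not found. */
static int vlan_list_kill(struct vlan_node **head, unsigned short vlan_id)
{
	struct vlan_node **pp;

	assert(head != NULL);
	for (pp = head; *pp; pp = &(*pp)->next) {
		if ((*pp)->vlan_id == vlan_id) {
			struct vlan_node *dead = *pp;

			*pp = dead->next;
			free(dead);
			return 0;
		}
	}
	return -1;
}

/* Count the VLAN IDs currently recorded. */
static int vlan_list_count(const struct vlan_node *head)
{
	int count = 0;

	for (; head; head = head->next)
		count++;
	return count;
}
```

In the driver, the same add and kill requests are also forwarded to every slave, a step this sketch leaves out.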
In case of mixed adapter types, hardware accelerated tagged packets that should
go through an adapter that is not offloading capable are "un-accelerated" by the
bonding driver so the VLAN tag sits in the regular location.
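To illustrate what "the regular location" means, the following userspace sketch inserts an 802.1Q tag directly after the destination and source MAC addresses of an untagged Ethernet frame. It is not the driver's code path, which operates on an sk_buff; the function name vlan_insert_tag and the macro names are hypothetical, and only the tag layout (TPID 0x8100 followed by the 16-bit TCI) comes from the 802.1Q standard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ADDRS_LEN	12	/* destination MAC + source MAC */
#define VLAN_TAG_LEN	4	/* TPID (2 bytes) + TCI (2 bytes) */
#define VLAN_TPID	0x8100	/* 802.1Q tag protocol identifier */

/*
 * Insert an 802.1Q tag into an untagged frame of 'len' bytes held in a
 * buffer of 'cap' bytes. Returns the new frame length, or 0 if the frame
 * is too short or the buffer has no room for the tag.
 */
static size_t vlan_insert_tag(uint8_t *frame, size_t len, size_t cap,
			      uint16_t tci)
{
	assert(frame != NULL);
	if (len < ETH_ADDRS_LEN + 2 || cap < len + VLAN_TAG_LEN)
		return 0;

	/* shift the EtherType and payload to make room for the tag */
	memmove(frame + ETH_ADDRS_LEN + VLAN_TAG_LEN,
		frame + ETH_ADDRS_LEN, len - ETH_ADDRS_LEN);

	/* write the tag in network byte order */
	frame[ETH_ADDRS_LEN + 0] = (uint8_t)(VLAN_TPID >> 8);
	frame[ETH_ADDRS_LEN + 1] = (uint8_t)(VLAN_TPID & 0xff);
	frame[ETH_ADDRS_LEN + 2] = (uint8_t)(tci >> 8);
	frame[ETH_ADDRS_LEN + 3] = (uint8_t)(tci & 0xff);

	return len + VLAN_TAG_LEN;
}
```

A caller passes the frame buffer, its current length, its capacity and the tag control information; the original EtherType ends up immediately after the new tag.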
VLAN interfaces *must* be added on top of a bonding interface only after
enslaving at least one slave. This is because until the first slave is added the
bonding interface has a HW address of 00:00:00:00:00:00, which will be copied by
the VLAN interface when it is created.
Notice that a problem would occur if all slaves are released from a bond that
still has VLAN interfaces on top of it. When later coming to add new slaves, the
bonding interface would get a HW address from the first slave, which might not
match that of the VLAN interfaces. It is recommended either to remove all
VLANs and then re-add them, or to manually set the bonding interface's HW
address so it matches the VLANs'. (Note: changing a VLAN interface's HW address
would set the underlying device -- i.e. the bonding interface -- to promiscuous
mode, which might not be what you want).
Limitations Limitations
=========== ===========
The main limitations are : The main limitations are :
......
...@@ -2362,6 +2362,7 @@ int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev) ...@@ -2362,6 +2362,7 @@ int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev)
int agg_id; int agg_id;
int i; int i;
struct ad_info ad_info; struct ad_info ad_info;
int res = 1;
/* make sure that the slaves list will /* make sure that the slaves list will
* not change during tx * not change during tx
...@@ -2369,12 +2370,12 @@ int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev) ...@@ -2369,12 +2370,12 @@ int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev)
read_lock(&bond->lock); read_lock(&bond->lock);
if (!BOND_IS_OK(bond)) { if (!BOND_IS_OK(bond)) {
goto free_out; goto out;
} }
if (bond_3ad_get_active_agg_info(bond, &ad_info)) { if (bond_3ad_get_active_agg_info(bond, &ad_info)) {
printk(KERN_DEBUG "ERROR: bond_3ad_get_active_agg_info failed\n"); printk(KERN_DEBUG "ERROR: bond_3ad_get_active_agg_info failed\n");
goto free_out; goto out;
} }
slaves_in_agg = ad_info.ports; slaves_in_agg = ad_info.ports;
...@@ -2383,7 +2384,7 @@ int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev) ...@@ -2383,7 +2384,7 @@ int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev)
if (slaves_in_agg == 0) { if (slaves_in_agg == 0) {
/*the aggregator is empty*/ /*the aggregator is empty*/
printk(KERN_DEBUG "ERROR: active aggregator is empty\n"); printk(KERN_DEBUG "ERROR: active aggregator is empty\n");
goto free_out; goto out;
} }
slave_agg_no = (data->h_dest[5]^bond->dev->dev_addr[5]) % slaves_in_agg; slave_agg_no = (data->h_dest[5]^bond->dev->dev_addr[5]) % slaves_in_agg;
...@@ -2401,7 +2402,7 @@ int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev) ...@@ -2401,7 +2402,7 @@ int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev)
if (slave_agg_no >= 0) { if (slave_agg_no >= 0) {
printk(KERN_ERR DRV_NAME ": Error: Couldn't find a slave to tx on for aggregator ID %d\n", agg_id); printk(KERN_ERR DRV_NAME ": Error: Couldn't find a slave to tx on for aggregator ID %d\n", agg_id);
goto free_out; goto out;
} }
start_at = slave; start_at = slave;
...@@ -2414,24 +2415,19 @@ int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev) ...@@ -2414,24 +2415,19 @@ int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev)
slave_agg_id = agg->aggregator_identifier; slave_agg_id = agg->aggregator_identifier;
} }
if (SLAVE_IS_OK(slave) && if (SLAVE_IS_OK(slave) && agg && (slave_agg_id == agg_id)) {
agg && (slave_agg_id == agg_id)) { res = bond_dev_queue_xmit(bond, skb, slave->dev);
skb->dev = slave->dev; break;
skb->priority = 1;
dev_queue_xmit(skb);
goto out;
} }
} }
out: out:
if (res) {
/* no suitable interface, frame not sent */
dev_kfree_skb(skb);
}
read_unlock(&bond->lock); read_unlock(&bond->lock);
return 0; return 0;
free_out:
/* no suitable interface, frame not sent */
dev_kfree_skb(skb);
goto out;
} }
int bond_3ad_lacpdu_recv(struct sk_buff *skb, struct net_device *dev, struct packet_type* ptype) int bond_3ad_lacpdu_recv(struct sk_buff *skb, struct net_device *dev, struct packet_type* ptype)
......
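The reworked exit path in bond_3ad_xmit_xor above replaces the separate free_out label with a result flag that starts at 1 and is checked once at out. A minimal userspace sketch of that pattern follows. The types pkt and port and the functions port_xmit and xmit_single_exit are hypothetical stand-ins, and the sketch assumes that the transmit helper returns 0 on success, which is what the diff implies but does not show, since the body of bond_dev_queue_xmit is in the collapsed file.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for sk_buff and a slave; not kernel types. */
struct pkt {
	int freed;	/* set to 1 when the packet is dropped */
	int sent_on;	/* id of the port that took the packet */
};

struct port {
	int id;
	int usable;
};

/* Assumed convention: 0 means the packet was handed off successfully. */
static int port_xmit(struct pkt *p, const struct port *port)
{
	p->sent_on = port->id;
	return 0;
}

static int xmit_single_exit(struct pkt *p, const struct port *ports, size_t n)
{
	int res = 1;	/* assume failure until a port accepts the packet */
	size_t i;

	assert(p != NULL);
	for (i = 0; i < n; i++) {
		if (ports[i].usable) {
			res = port_xmit(p, &ports[i]);
			break;
		}
	}

	if (res) {
		/* no suitable port, packet dropped here and nowhere else */
		p->freed = 1;
	}
	return 0;	/* the caller always treats the packet as consumed */
}
```

Keeping the drop in one place means every early exit only has to jump to the common label, instead of choosing between two labels depending on whether the packet still needs to be freed.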
...@@ -1193,6 +1193,7 @@ int bond_alb_xmit(struct sk_buff *skb, struct net_device *bond_dev) ...@@ -1193,6 +1193,7 @@ int bond_alb_xmit(struct sk_buff *skb, struct net_device *bond_dev)
int do_tx_balance = 1; int do_tx_balance = 1;
u32 hash_index = 0; u32 hash_index = 0;
u8 *hash_start = NULL; u8 *hash_start = NULL;
int res = 1;
/* make sure that the curr_active_slave and the slaves list do /* make sure that the curr_active_slave and the slaves list do
* not change during tx * not change during tx
...@@ -1201,7 +1202,7 @@ int bond_alb_xmit(struct sk_buff *skb, struct net_device *bond_dev) ...@@ -1201,7 +1202,7 @@ int bond_alb_xmit(struct sk_buff *skb, struct net_device *bond_dev)
read_lock(&bond->curr_slave_lock); read_lock(&bond->curr_slave_lock);
if (!BOND_IS_OK(bond)) { if (!BOND_IS_OK(bond)) {
goto free_out; goto out;
} }
switch (ntohs(skb->protocol)) { switch (ntohs(skb->protocol)) {
...@@ -1266,29 +1267,27 @@ int bond_alb_xmit(struct sk_buff *skb, struct net_device *bond_dev) ...@@ -1266,29 +1267,27 @@ int bond_alb_xmit(struct sk_buff *skb, struct net_device *bond_dev)
} }
if (tx_slave && SLAVE_IS_OK(tx_slave)) { if (tx_slave && SLAVE_IS_OK(tx_slave)) {
skb->dev = tx_slave->dev;
if (tx_slave != bond->curr_active_slave) { if (tx_slave != bond->curr_active_slave) {
memcpy(eth_data->h_source, memcpy(eth_data->h_source,
tx_slave->dev->dev_addr, tx_slave->dev->dev_addr,
ETH_ALEN); ETH_ALEN);
} }
dev_queue_xmit(skb);
res = bond_dev_queue_xmit(bond, skb, tx_slave->dev);
} else { } else {
/* no suitable interface, frame not sent */
if (tx_slave) { if (tx_slave) {
tlb_clear_slave(bond, tx_slave, 0); tlb_clear_slave(bond, tx_slave, 0);
} }
goto free_out;
} }
out: out:
if (res) {
/* no suitable interface, frame not sent */
dev_kfree_skb(skb);
}
read_unlock(&bond->curr_slave_lock); read_unlock(&bond->curr_slave_lock);
read_unlock(&bond->lock); read_unlock(&bond->lock);
return 0; return 0;
free_out:
dev_kfree_skb(skb);
goto out;
} }
void bond_alb_monitor(struct bonding *bond) void bond_alb_monitor(struct bonding *bond)
......
This diff is collapsed.
...@@ -147,6 +147,11 @@ struct bond_params { ...@@ -147,6 +147,11 @@ struct bond_params {
u32 arp_targets[BOND_MAX_ARP_TARGETS]; u32 arp_targets[BOND_MAX_ARP_TARGETS];
}; };
struct vlan_entry {
struct list_head vlan_list;
unsigned short vlan_id;
};
struct slave { struct slave {
struct net_device *dev; /* first - usefull for panic debug */ struct net_device *dev; /* first - usefull for panic debug */
struct slave *next; struct slave *next;
...@@ -196,6 +201,8 @@ struct bonding { ...@@ -196,6 +201,8 @@ struct bonding {
struct ad_bond_info ad_info; struct ad_bond_info ad_info;
struct alb_bond_info alb_info; struct alb_bond_info alb_info;
struct bond_params params; struct bond_params params;
struct list_head vlan_list;
struct vlan_group *vlgrp;
}; };
/** /**
...@@ -238,5 +245,7 @@ extern inline void bond_set_slave_active_flags(struct slave *slave) ...@@ -238,5 +245,7 @@ extern inline void bond_set_slave_active_flags(struct slave *slave)
slave->dev->flags &= ~IFF_NOARP; slave->dev->flags &= ~IFF_NOARP;
} }
int bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb, struct net_device *slave_dev);
#endif /* _LINUX_BONDING_H */ #endif /* _LINUX_BONDING_H */