Commit ec041cd9 authored by Jeff Garzik

Merge redhat.com:/spare/repo/netdev-2.6/bonding-1

into redhat.com:/spare/repo/net-drivers-2.5
parents 56549962 61b6a04a
...@@ -21,7 +21,7 @@ userspace tools, please follow the links at the end of this file. ...@@ -21,7 +21,7 @@ userspace tools, please follow the links at the end of this file.
Table of Contents Table of Contents
================= =================
Installation Installation
Bond Configuration Bond Configuration
Module Parameters Module Parameters
...@@ -66,7 +66,7 @@ of the -I option on the ifenslave compile line is to make sure it uses ...@@ -66,7 +66,7 @@ of the -I option on the ifenslave compile line is to make sure it uses
/usr/include/linux. /usr/include/linux.
To install ifenslave.c, do: To install ifenslave.c, do:
# gcc -Wall -Wstrict-prototypes -O -I/usr/src/linux/include ifenslave.c -o ifenslave # gcc -Wall -Wstrict-prototypes -O -I/usr/src/linux/include ifenslave.c -o ifenslave
# cp ifenslave /sbin/ifenslave # cp ifenslave /sbin/ifenslave
...@@ -74,10 +74,10 @@ Bond Configuration ...@@ -74,10 +74,10 @@ Bond Configuration
================== ==================
You will need to add at least the following line to /etc/modules.conf You will need to add at least the following line to /etc/modules.conf
so the bonding driver will automatically load when the bond0 interface is so the bonding driver will automatically load when the bond0 interface is
configured. Refer to the modules.conf manual page for specific modules.conf configured. Refer to the modules.conf manual page for specific modules.conf
syntax details. The Module Parameters section of this document describes each syntax details. The Module Parameters section of this document describes each
bonding driver parameter. bonding driver parameter.
alias bond0 bonding alias bond0 bonding
...@@ -113,7 +113,7 @@ bonding interface (bond1), use MASTER=bond1 in the config file to make the ...@@ -113,7 +113,7 @@ bonding interface (bond1), use MASTER=bond1 in the config file to make the
network interface be a slave of bond1. network interface be a slave of bond1.
Restart the networking subsystem or just bring up the bonding device if your Restart the networking subsystem or just bring up the bonding device if your
administration tools allow it. Otherwise, reboot. On Red Hat distros you can administration tools allow it. Otherwise, reboot. On Red Hat distros you can
issue `ifup bond0' or `/etc/rc.d/init.d/network restart'. issue `ifup bond0' or `/etc/rc.d/init.d/network restart'.
If the administration tools of your distribution do not support If the administration tools of your distribution do not support
...@@ -128,30 +128,30 @@ manually configure the bonding device with the following commands: ...@@ -128,30 +128,30 @@ manually configure the bonding device with the following commands:
(use appropriate values for your network above) (use appropriate values for your network above)
You can then create a script containing these commands and place it in the You can then create a script containing these commands and place it in the
appropriate rc directory. appropriate rc directory.
If you specifically need all network drivers loaded before the bonding driver, If you specifically need all network drivers loaded before the bonding driver,
adding the following line to modules.conf will cause the network driver for adding the following line to modules.conf will cause the network driver for
eth0 and eth1 to be loaded before the bonding driver. eth0 and eth1 to be loaded before the bonding driver.
probeall bond0 eth0 eth1 bonding probeall bond0 eth0 eth1 bonding
Be careful not to reference bond0 itself at the end of the line, or modprobe Be careful not to reference bond0 itself at the end of the line, or modprobe
will die in an endless recursive loop. will die in an endless recursive loop.
To have device characteristics (such as MTU size) propagate to slave devices, To have device characteristics (such as MTU size) propagate to slave devices,
set the bond characteristics before enslaving the device. The characteristics set the bond characteristics before enslaving the device. The characteristics
are propagated during the enslave process. are propagated during the enslave process.
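A minimal sketch of that ordering follows. The helper name bond_mtu_plan and
the MTU value 9000 are illustrative assumptions, not taken from this document.
It only prints the commands: the bond's MTU is set first, then the slaves are
attached so they can pick the value up during the enslave step.

```shell
# Hypothetical helper: print the commands in the recommended order.
# Nothing is executed here; pipe the output to "sh" as root to apply it.
bond_mtu_plan() {
    bond=$1
    mtu=$2
    shift 2
    # 1. Set the characteristic on the bond before any slave is attached.
    echo "ifconfig $bond mtu $mtu"
    # 2. Enslave the devices; the remaining arguments are the slave names.
    echo "ifenslave $bond $*"
}

bond_mtu_plan bond0 9000 eth0 eth1
```

Whether a given MTU is accepted depends on the slave drivers, so treat the
value above as a placeholder for your own network.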
If running SNMP agents, the bonding driver should be loaded before any network If running SNMP agents, the bonding driver should be loaded before any network
drivers participating in a bond. This requirement is due to the interface drivers participating in a bond. This requirement is due to the interface
index (ipAdEntIfIndex) being associated with the first interface found with a index (ipAdEntIfIndex) being associated with the first interface found with a
given IP address. That is, there is only one ipAdEntIfIndex for each IP given IP address. That is, there is only one ipAdEntIfIndex for each IP
address. For example, if eth0 and eth1 are slaves of bond0 and the driver for address. For example, if eth0 and eth1 are slaves of bond0 and the driver for
eth0 is loaded before the bonding driver, the interface for the IP address eth0 is loaded before the bonding driver, the interface for the IP address
will be associated with the eth0 interface. This configuration is shown below: will be associated with the eth0 interface. This configuration is shown below:
the IP address 192.168.1.1 has an interface index of 2 which indexes to eth0 the IP address 192.168.1.1 has an interface index of 2 which indexes to eth0
in the ifDescr table (ifDescr.2). in the ifDescr table (ifDescr.2).
interfaces.ifTable.ifEntry.ifDescr.1 = lo interfaces.ifTable.ifEntry.ifDescr.1 = lo
...@@ -189,10 +189,10 @@ functions such as Interface_Scan_Next will report that association. ...@@ -189,10 +189,10 @@ functions such as Interface_Scan_Next will report that association.
Module Parameters Module Parameters
================= =================
Optional parameters for the bonding driver can be supplied as command line Optional parameters for the bonding driver can be supplied as command line
arguments to the insmod command. Typically, these parameters are specified in arguments to the insmod command. Typically, these parameters are specified in
the file /etc/modules.conf (see the manual page for modules.conf). The the file /etc/modules.conf (see the manual page for modules.conf). The
available bonding driver parameters are listed below. If a parameter is not available bonding driver parameters are listed below. If a parameter is not
specified the default value is used. When initially configuring a bond, it specified the default value is used. When initially configuring a bond, it
is recommended "tail -f /var/log/messages" be run in a separate window to is recommended "tail -f /var/log/messages" be run in a separate window to
watch for bonding driver error messages. watch for bonding driver error messages.
...@@ -202,19 +202,19 @@ parameters be specified, otherwise serious network degradation will occur ...@@ -202,19 +202,19 @@ parameters be specified, otherwise serious network degradation will occur
during link failures. during link failures.
arp_interval arp_interval
Specifies the ARP monitoring frequency in milli-seconds. Specifies the ARP monitoring frequency in milli-seconds.
If ARP monitoring is used in a load-balancing mode (mode 0 or 2), the If ARP monitoring is used in a load-balancing mode (mode 0 or 2), the
switch should be configured in a mode that evenly distributes packets switch should be configured in a mode that evenly distributes packets
across all links - such as round-robin. If the switch is configured to across all links - such as round-robin. If the switch is configured to
distribute the packets in an XOR fashion, all replies from the ARP distribute the packets in an XOR fashion, all replies from the ARP
targets will be received on the same link which could cause the other targets will be received on the same link which could cause the other
team members to fail. ARP monitoring should not be used in conjunction team members to fail. ARP monitoring should not be used in conjunction
with miimon. A value of 0 disables ARP monitoring. The default value with miimon. A value of 0 disables ARP monitoring. The default value
is 0. is 0.
arp_ip_target arp_ip_target
Specifies the ip addresses to use when arp_interval is > 0. These Specifies the ip addresses to use when arp_interval is > 0. These
are the targets of the ARP request sent to determine the health of are the targets of the ARP request sent to determine the health of
the link to the targets. Specify these values in ddd.ddd.ddd.ddd the link to the targets. Specify these values in ddd.ddd.ddd.ddd
...@@ -223,8 +223,8 @@ arp_ip_target ...@@ -223,8 +223,8 @@ arp_ip_target
maximum number of targets that can be specified is set at 16. maximum number of targets that can be specified is set at 16.
downdelay downdelay
Specifies the delay time in milli-seconds to disable a link after a Specifies the delay time in milli-seconds to disable a link after a
link failure has been detected. This should be a multiple of miimon link failure has been detected. This should be a multiple of miimon
value, otherwise the value will be rounded. The default value is 0. value, otherwise the value will be rounded. The default value is 0.
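The rounding can be sketched as below, assuming the value is rounded down to
the nearest multiple of miimon (the text above does not say in which direction
it rounds, so this is an assumption). The function name round_to_miimon is
hypothetical. The same arithmetic would apply to updelay.

```shell
# Hypothetical helper: round a delay (ms) down to a multiple of miimon (ms).
# Assumption: the driver truncates rather than rounding up or to nearest.
round_to_miimon() {
    delay=$1
    miimon=$2
    # With MII monitoring disabled (miimon=0) there is nothing to round to.
    if [ "$miimon" -le 0 ]; then
        echo "$delay"
        return
    fi
    echo $(( ($delay / $miimon) * $miimon ))
}

round_to_miimon 250 100
```

Choosing a delay that is already a multiple of miimon avoids the question
entirely.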
...@@ -247,7 +247,7 @@ max_bonds ...@@ -247,7 +247,7 @@ max_bonds
and bond2 will be created. The default value is 1. and bond2 will be created. The default value is 1.
miimon miimon
Specifies the frequency in milli-seconds that MII link monitoring Specifies the frequency in milli-seconds that MII link monitoring
will occur. A value of zero disables MII link monitoring. A value will occur. A value of zero disables MII link monitoring. A value
of 100 is a good starting point. See High Availability section for of 100 is a good starting point. See High Availability section for
...@@ -258,7 +258,7 @@ mode ...@@ -258,7 +258,7 @@ mode
Specifies one of the bonding policies. The default is Specifies one of the bonding policies. The default is
round-robin (balance-rr). Possible values are (you can use round-robin (balance-rr). Possible values are (you can use
either the text or numeric option): either the text or numeric option):
balance-rr or 0 balance-rr or 0
Round-robin policy: Transmit in a sequential order Round-robin policy: Transmit in a sequential order
...@@ -273,7 +273,7 @@ mode ...@@ -273,7 +273,7 @@ mode
externally visible on only one port (network adapter) externally visible on only one port (network adapter)
to avoid confusing the switch. This mode provides to avoid confusing the switch. This mode provides
fault tolerance. fault tolerance.
balance-xor or 2 balance-xor or 2
XOR policy: Transmit based on [(source MAC address XOR policy: Transmit based on [(source MAC address
...@@ -293,7 +293,7 @@ mode ...@@ -293,7 +293,7 @@ mode
groups that share the same speed and duplex settings. groups that share the same speed and duplex settings.
Transmits and receives on all slaves in the active Transmits and receives on all slaves in the active
aggregator. aggregator.
Pre-requisites: Pre-requisites:
1. Ethtool support in the base drivers for retrieving the 1. Ethtool support in the base drivers for retrieving the
...@@ -317,7 +317,7 @@ mode ...@@ -317,7 +317,7 @@ mode
Ethtool support in the base drivers for retrieving the Ethtool support in the base drivers for retrieving the
speed of each slave. speed of each slave.
balance-alb or 6 balance-alb or 6
Adaptive load balancing: includes balance-tlb + receive Adaptive load balancing: includes balance-tlb + receive
load balancing (rlb) for IPV4 traffic and does not require load balancing (rlb) for IPV4 traffic and does not require
...@@ -327,7 +327,7 @@ mode ...@@ -327,7 +327,7 @@ mode
overwrites the src hw address with the unique hw address of overwrites the src hw address with the unique hw address of
one of the slaves in the bond such that different clients one of the slaves in the bond such that different clients
use different hw addresses for the server. use different hw addresses for the server.
Receive traffic from connections created by the server is Receive traffic from connections created by the server is
also balanced. When the server sends an ARP Request the also balanced. When the server sends an ARP Request the
bonding driver copies and saves the client's IP information bonding driver copies and saves the client's IP information
...@@ -363,25 +363,11 @@ mode ...@@ -363,25 +363,11 @@ mode
2. Base driver support for setting the hw address of a 2. Base driver support for setting the hw address of a
device also when it is open. This is required so that there device also when it is open. This is required so that there
will always be one slave in the team using the bond hw will always be one slave in the team using the bond hw
address (the current_slave) while having a unique hw address (the curr_active_slave) while having a unique hw
address for each slave in the bond. If the current_slave address for each slave in the bond. If the curr_active_slave
fails its hw address is swapped with the new current_slave fails its hw address is swapped with the new curr_active_slave
that was chosen. that was chosen.
multicast
Option specifying the mode of operation for multicast support.
Possible values are:
disabled or 0
Disabled (no multicast support)
active or 1
Enabled on active slave only, useful in active-backup mode
all or 2
Enabled on all slaves, this is the default
primary primary
A string (eth0, eth2, etc) to equate to a primary device. If this A string (eth0, eth2, etc) to equate to a primary device. If this
...@@ -397,11 +383,11 @@ primary ...@@ -397,11 +383,11 @@ primary
primary is only valid in active-backup mode. primary is only valid in active-backup mode.
updelay updelay
Specifies the delay time in milli-seconds to enable a link after a Specifies the delay time in milli-seconds to enable a link after a
link up status has been detected. This should be a multiple of miimon link up status has been detected. This should be a multiple of miimon
value, otherwise the value will be rounded. The default value is 0. value, otherwise the value will be rounded. The default value is 0.
use_carrier use_carrier
Specifies whether or not miimon should use MII or ETHTOOL Specifies whether or not miimon should use MII or ETHTOOL
...@@ -529,20 +515,20 @@ Verifying Bond Configuration ...@@ -529,20 +515,20 @@ Verifying Bond Configuration
---------------------------- ----------------------------
The bonding driver information files reside in the /proc/net/bonding directory. The bonding driver information files reside in the /proc/net/bonding directory.
Sample contents of /proc/net/bonding/bond0 after the driver is loaded with Sample contents of /proc/net/bonding/bond0 after the driver is loaded with
parameters of mode=0 and miimon=1000 are shown below. parameters of mode=0 and miimon=1000 are shown below.
Bonding Mode: load balancing (round-robin) Bonding Mode: load balancing (round-robin)
Currently Active Slave: eth0 Currently Active Slave: eth0
MII Status: up MII Status: up
MII Polling Interval (ms): 1000 MII Polling Interval (ms): 1000
Up Delay (ms): 0 Up Delay (ms): 0
Down Delay (ms): 0 Down Delay (ms): 0
Slave Interface: eth1 Slave Interface: eth1
MII Status: up MII Status: up
Link Failure Count: 1 Link Failure Count: 1
Slave Interface: eth0 Slave Interface: eth0
MII Status: up MII Status: up
Link Failure Count: 1 Link Failure Count: 1
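For scripted checks, the per-slave status can be pulled out of that file with
awk. This is a sketch that assumes the file layout matches the sample above;
the function name slave_status is hypothetical.

```shell
# Hypothetical helper: print "<slave> <MII status>" for each slave listed
# in a bonding information file (default: /proc/net/bonding/bond0).
slave_status() {
    awk '
        # Remember which slave the following lines belong to.
        /^Slave Interface:/ { slave = $3 }
        # The first "MII Status" line belongs to the bond itself; it is
        # skipped because no slave has been seen yet at that point.
        /^MII Status:/ && slave != "" { print slave, $3; slave = "" }
    ' "${1:-/proc/net/bonding/bond0}"
}

# Usage, on a host with the bonding driver loaded:
#   slave_status
#   slave_status /proc/net/bonding/bond1
```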
...@@ -550,34 +536,34 @@ parameters of mode=0 and miimon=1000 is shown below. ...@@ -550,34 +536,34 @@ parameters of mode=0 and miimon=1000 is shown below.
2) Network verification 2) Network verification
----------------------- -----------------------
The network configuration can be verified using the ifconfig command. In The network configuration can be verified using the ifconfig command. In
the example below, the bond0 interface is the master (MASTER) while eth0 and the example below, the bond0 interface is the master (MASTER) while eth0 and
eth1 are slaves (SLAVE). Notice all slaves of bond0 have the same MAC address eth1 are slaves (SLAVE). Notice all slaves of bond0 have the same MAC address
(HWaddr) as bond0 for all modes except TLB and ALB that require a unique MAC (HWaddr) as bond0 for all modes except TLB and ALB that require a unique MAC
address for each slave. address for each slave.
[root]# /sbin/ifconfig [root]# /sbin/ifconfig
bond0 Link encap:Ethernet HWaddr 00:C0:F0:1F:37:B4 bond0 Link encap:Ethernet HWaddr 00:C0:F0:1F:37:B4
inet addr:XXX.XXX.XXX.YYY Bcast:XXX.XXX.XXX.255 Mask:255.255.252.0 inet addr:XXX.XXX.XXX.YYY Bcast:XXX.XXX.XXX.255 Mask:255.255.252.0
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1 UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
RX packets:7224794 errors:0 dropped:0 overruns:0 frame:0 RX packets:7224794 errors:0 dropped:0 overruns:0 frame:0
TX packets:3286647 errors:1 dropped:0 overruns:1 carrier:0 TX packets:3286647 errors:1 dropped:0 overruns:1 carrier:0
collisions:0 txqueuelen:0 collisions:0 txqueuelen:0
eth0 Link encap:Ethernet HWaddr 00:C0:F0:1F:37:B4 eth0 Link encap:Ethernet HWaddr 00:C0:F0:1F:37:B4
inet addr:XXX.XXX.XXX.YYY Bcast:XXX.XXX.XXX.255 Mask:255.255.252.0 inet addr:XXX.XXX.XXX.YYY Bcast:XXX.XXX.XXX.255 Mask:255.255.252.0
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1 UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:3573025 errors:0 dropped:0 overruns:0 frame:0 RX packets:3573025 errors:0 dropped:0 overruns:0 frame:0
TX packets:1643167 errors:1 dropped:0 overruns:1 carrier:0 TX packets:1643167 errors:1 dropped:0 overruns:1 carrier:0
collisions:0 txqueuelen:100 collisions:0 txqueuelen:100
Interrupt:10 Base address:0x1080 Interrupt:10 Base address:0x1080
eth1 Link encap:Ethernet HWaddr 00:C0:F0:1F:37:B4 eth1 Link encap:Ethernet HWaddr 00:C0:F0:1F:37:B4
inet addr:XXX.XXX.XXX.YYY Bcast:XXX.XXX.XXX.255 Mask:255.255.252.0 inet addr:XXX.XXX.XXX.YYY Bcast:XXX.XXX.XXX.255 Mask:255.255.252.0
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1 UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:3651769 errors:0 dropped:0 overruns:0 frame:0 RX packets:3651769 errors:0 dropped:0 overruns:0 frame:0
TX packets:1643480 errors:0 dropped:0 overruns:0 carrier:0 TX packets:1643480 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:100 collisions:0 txqueuelen:100
Interrupt:9 Base address:0x1400 Interrupt:9 Base address:0x1400
Frequently Asked Questions Frequently Asked Questions
...@@ -605,9 +591,9 @@ Frequently Asked Questions ...@@ -605,9 +591,9 @@ Frequently Asked Questions
5. What happens when a slave link dies? 5. What happens when a slave link dies?
If your ethernet cards support MII or ETHTOOL link status monitoring If your ethernet cards support MII or ETHTOOL link status monitoring
and the MII monitoring has been enabled in the driver (see description and the MII monitoring has been enabled in the driver (see description
of module parameters), there will be no adverse consequences. This of module parameters), there will be no adverse consequences. This
release of the bonding driver knows how to get the MII information and release of the bonding driver knows how to get the MII information and
enables or disables its slaves according to their link status. enables or disables its slaves according to their link status.
See section on High Availability for additional information. See section on High Availability for additional information.
...@@ -622,8 +608,8 @@ Frequently Asked Questions ...@@ -622,8 +608,8 @@ Frequently Asked Questions
slave. slave.
If neither miimon nor arp_interval is configured, the bonding If neither miimon nor arp_interval is configured, the bonding
driver will not handle this situation very well. The driver will driver will not handle this situation very well. The driver will
continue to send packets but some packets will be lost. Retransmits continue to send packets but some packets will be lost. Retransmits
will cause serious degradation of performance (in the case when one will cause serious degradation of performance (in the case when one
of two slave links fails, 50% of packets will be lost, which is a serious of two slave links fails, 50% of packets will be lost, which is a serious
problem for both TCP and UDP). problem for both TCP and UDP).
...@@ -636,9 +622,9 @@ Frequently Asked Questions ...@@ -636,9 +622,9 @@ Frequently Asked Questions
7. Which switches/systems does it work with? 7. Which switches/systems does it work with?
In round-robin and XOR mode, it works with systems that support In round-robin and XOR mode, it works with systems that support
trunking: trunking:
* Many Cisco switches and routers (look for EtherChannel support). * Many Cisco switches and routers (look for EtherChannel support).
* SunTrunking software. * SunTrunking software.
* Alteon AceDirector switches / WebOS (use Trunks). * Alteon AceDirector switches / WebOS (use Trunks).
...@@ -646,7 +632,7 @@ Frequently Asked Questions ...@@ -646,7 +632,7 @@ Frequently Asked Questions
models (450) can define trunks between ports on different physical models (450) can define trunks between ports on different physical
units. units.
* Linux bonding, of course ! * Linux bonding, of course !
In 802.3ad mode, it works with systems that support IEEE 802.3ad In 802.3ad mode, it works with systems that support IEEE 802.3ad
Dynamic Link Aggregation: Dynamic Link Aggregation:
...@@ -667,21 +653,21 @@ Frequently Asked Questions ...@@ -667,21 +653,21 @@ Frequently Asked Questions
is then passed to all following slaves and remains persistent (even if is then passed to all following slaves and remains persistent (even if
the first slave is removed) until the bonding device is brought the first slave is removed) until the bonding device is brought
down or reconfigured. down or reconfigured.
If you wish to change the MAC address, you can set it with ifconfig: If you wish to change the MAC address, you can set it with ifconfig:
# ifconfig bond0 hw ether 00:11:22:33:44:55 # ifconfig bond0 hw ether 00:11:22:33:44:55
The MAC address can also be changed by bringing down/up the device The MAC address can also be changed by bringing down/up the device
and then changing its slaves (or their order): and then changing its slaves (or their order):
# ifconfig bond0 down ; modprobe -r bonding # ifconfig bond0 down ; modprobe -r bonding
# ifconfig bond0 .... up # ifconfig bond0 .... up
# ifenslave bond0 eth... # ifenslave bond0 eth...
This method will automatically take the address from the next slave This method will automatically take the address from the next slave
that will be added. that will be added.
To restore your slaves' MAC addresses, you need to detach them To restore your slaves' MAC addresses, you need to detach them
from the bond (`ifenslave -d bond0 eth0'), set them down from the bond (`ifenslave -d bond0 eth0'), set them down
(`ifconfig eth0 down'), unload the drivers (`rmmod 3c59x', for (`ifconfig eth0 down'), unload the drivers (`rmmod 3c59x', for
...@@ -729,27 +715,27 @@ High Availability ...@@ -729,27 +715,27 @@ High Availability
================= =================
To implement high availability using the bonding driver, the driver needs to be To implement high availability using the bonding driver, the driver needs to be
compiled as a module, because currently it is the only way to pass parameters compiled as a module, because currently it is the only way to pass parameters
to the driver. This may change in the future. to the driver. This may change in the future.
High availability is achieved by using MII or ETHTOOL status reporting. You High availability is achieved by using MII or ETHTOOL status reporting. You
need to verify that all your interfaces support MII or ETHTOOL link status need to verify that all your interfaces support MII or ETHTOOL link status
reporting. On Linux kernel 2.2.17, all the 100 Mbps capable drivers and reporting. On Linux kernel 2.2.17, all the 100 Mbps capable drivers and
yellowfin gigabit driver support MII. To determine if ETHTOOL link reporting yellowfin gigabit driver support MII. To determine if ETHTOOL link reporting
is available for interface eth0, type "ethtool eth0" and the "Link detected:" is available for interface eth0, type "ethtool eth0" and the "Link detected:"
line should contain the correct link status. If your system has an interface line should contain the correct link status. If your system has an interface
that does not support MII or ETHTOOL status reporting, a failure of its link that does not support MII or ETHTOOL status reporting, a failure of its link
will not be detected! A message indicating MII and ETHTOOL is not supported by will not be detected! A message indicating MII and ETHTOOL is not supported by
a network driver is logged when the bonding driver is loaded with a non-zero a network driver is logged when the bonding driver is loaded with a non-zero
miimon value. miimon value.
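The "Link detected:" check can be scripted as in the sketch below. The
function name link_state is hypothetical, and the sketch assumes ethtool
prints the line in the form "Link detected: yes" or "Link detected: no".

```shell
# Hypothetical helper: read "ethtool <iface>" output on stdin and print
# the value that follows "Link detected:".
link_state() {
    awk -F': *' '/Link detected:/ { print $2 }'
}

# Usage (needs ethtool installed, usually run as root):
#   ethtool eth0 | link_state
```

An empty result suggests the driver does not report link status through
ETHTOOL, in which case a link failure on that interface would go unnoticed.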
The bonding driver can regularly check all its slaves' links using the ETHTOOL The bonding driver can regularly check all its slaves' links using the ETHTOOL
IOCTL (ETHTOOL_GLINK command) or by checking the MII status registers. The IOCTL (ETHTOOL_GLINK command) or by checking the MII status registers. The
check interval is specified by the module argument "miimon" (MII monitoring). check interval is specified by the module argument "miimon" (MII monitoring).
It takes an integer that represents the checking time in milliseconds. It It takes an integer that represents the checking time in milliseconds. It
should not come too close to (1000/HZ) (10 milli-seconds on i386) because it should not come too close to (1000/HZ) (10 milli-seconds on i386) because it
may then reduce the system interactivity. A value of 100 seems to be a good may then reduce the system interactivity. A value of 100 seems to be a good
starting point. It means that a dead link will be detected at most 100 starting point. It means that a dead link will be detected at most 100
milli-seconds after it goes down. milli-seconds after it goes down.
Example: Example:
...@@ -761,7 +747,7 @@ Or, put the following lines in /etc/modules.conf: ...@@ -761,7 +747,7 @@ Or, put the following lines in /etc/modules.conf:
alias bond0 bonding alias bond0 bonding
options bond0 miimon=100 options bond0 miimon=100
There are currently two policies for high availability. They are dependent on There are currently two policies for high availability. They are dependent on
whether: whether:
a) hosts are connected to a single host or switch that supports trunking a) hosts are connected to a single host or switch that supports trunking
...@@ -811,7 +797,7 @@ Example 2 : host to switch at twice the speed ...@@ -811,7 +797,7 @@ Example 2 : host to switch at twice the speed
# ifenslave bond0 eth0 eth1 # ifenslave bond0 eth0 eth1
2) High Availability on two or more switches (or a single switch without 2) High Availability on two or more switches (or a single switch without
trunking support) trunking support)
--------------------------------------------------------------------------- ---------------------------------------------------------------------------
This mode is more problematic because it relies on the fact that there This mode is more problematic because it relies on the fact that there
...@@ -870,10 +856,10 @@ by another external mechanism, it is good to have host1's active interface ...@@ -870,10 +856,10 @@ by another external mechanism, it is good to have host1's active interface
connected to one switch and host2's to the other. Such a system will survive connected to one switch and host2's to the other. Such a system will survive
a failure of a single host, cable, or switch. The worst thing that may happen a failure of a single host, cable, or switch. The worst thing that may happen
in the case of a switch failure is that half of the hosts will be temporarily in the case of a switch failure is that half of the hosts will be temporarily
unreachable until the other switch expires its tables. unreachable until the other switch expires its tables.
Example 2: Using multiple ethernet cards connected to a switch to configure Example 2: Using multiple ethernet cards connected to a switch to configure
NIC failover (switch is not required to support trunking). NIC failover (switch is not required to support trunking).
+----------+ +----------+ +----------+ +----------+
...@@ -957,7 +943,7 @@ The main limitations are : ...@@ -957,7 +943,7 @@ The main limitations are :
servers, but may be useful when the front switches send multicast servers, but may be useful when the front switches send multicast
information on their links (e.g. VRRP), or even health-check the servers. information on their links (e.g. VRRP), or even health-check the servers.
Use the arp_interval/arp_ip_target parameters to count incoming/outgoing Use the arp_interval/arp_ip_target parameters to count incoming/outgoing
frames. frames.
...@@ -973,13 +959,12 @@ Donald Becker's Ethernet Drivers and diag programs may be found at : ...@@ -973,13 +959,12 @@ Donald Becker's Ethernet Drivers and diag programs may be found at :
You will also find a lot of information regarding Ethernet, NWay, MII, etc. at You will also find a lot of information regarding Ethernet, NWay, MII, etc. at
www.scyld.com. www.scyld.com.
For new versions of the driver, patches for older kernels and the updated Patches for 2.2 kernels are at Willy Tarreau's site :
userspace tools, take a look at Willy Tarreau's site :
- http://wtarreau.free.fr/pub/bonding/ - http://wtarreau.free.fr/pub/bonding/
- http://www-miaif.lip6.fr/willy/pub/bonding/ - http://www-miaif.lip6.fr/~tarreau/pub/bonding/
To get the latest information about Linux Kernel development, please consult To get the latest information about Linux Kernel development, please consult
the Linux Kernel Mailing List Archives at : the Linux Kernel Mailing List Archives at :
http://boudicca.tux.org/hypermail/linux-kernel/latest/ http://www.ussg.iu.edu/hypermail/linux/kernel/
-- END -- -- END --
...@@ -4,8 +4,6 @@ ...@@ -4,8 +4,6 @@
* This program controls the Linux implementation of running multiple * This program controls the Linux implementation of running multiple
* network interfaces in parallel. * network interfaces in parallel.
* *
* Usage: ifenslave [-v] master-interface < slave-interface [metric <N>] > ...
*
* Author: Donald Becker <becker@cesdis.gsfc.nasa.gov> * Author: Donald Becker <becker@cesdis.gsfc.nasa.gov>
* Copyright 1994-1996 Donald Becker * Copyright 1994-1996 Donald Becker
* *
...@@ -90,24 +88,30 @@ ...@@ -90,24 +88,30 @@
* - For opt_c: slave should not be set to the master's setting * - For opt_c: slave should not be set to the master's setting
* while it is running. It was already set during enslave. To * while it is running. It was already set during enslave. To
* simplify things, it is now handled separately. * simplify things, it is now handled separately.
*
* - 2003/09/24 - Shmulik Hen <shmulik.hen at intel dot com>
* - Code cleanup and style changes
* set version to 1.1.0
*/ */
#define APP_VERSION "1.0.12" #define APP_VERSION "1.1.0"
#define APP_RELDATE "June 30, 2003" #define APP_RELDATE "September 24, 2003"
#define APP_NAME "ifenslave" #define APP_NAME "ifenslave"
static char *version = static char *version =
APP_NAME ".c:v" APP_VERSION " (" APP_RELDATE ") " "\nDonald Becker (becker@cesdis.gsfc.nasa.gov).\n" APP_NAME ".c:v" APP_VERSION " (" APP_RELDATE ")\n"
"detach support added on 2000/10/02 by Willy Tarreau (willy at meta-x.org).\n" "o Donald Becker (becker@cesdis.gsfc.nasa.gov).\n"
"2.4 kernel support added on 2001/02/16 by Chad N. Tindel (ctindel at ieee dot org.\n"; "o Detach support added on 2000/10/02 by Willy Tarreau (willy at meta-x.org).\n"
"o 2.4 kernel support added on 2001/02/16 by Chad N. Tindel\n"
" (ctindel at ieee dot org).\n";
static const char *usage_msg = static const char *usage_msg =
"Usage: ifenslave [-adfrvVh] <master-interface> < <slave-if> [metric <N>] > ...\n" "Usage: ifenslave [-f] <master-if> <slave-if> [<slave-if>...]\n"
" ifenslave -c master-interface slave-if\n"; " ifenslave -d <master-if> <slave-if> [<slave-if>...]\n"
" ifenslave -c <master-if> <slave-if>\n"
" ifenslave --help\n";
static const char *howto_msg = static const char *help_msg =
"Usage: ifenslave [-adfrvVh] <master-interface> < <slave-if> [metric <N>] > ...\n"
" ifenslave -c master-interface slave-if\n"
"\n" "\n"
" To create a bond device, simply follow these three steps :\n" " To create a bond device, simply follow these three steps :\n"
" - ensure that the required drivers are properly loaded :\n" " - ensure that the required drivers are properly loaded :\n"
...@@ -115,18 +119,32 @@ static const char *howto_msg = ...@@ -115,18 +119,32 @@ static const char *howto_msg =
" - assign an IP address to the bond device :\n" " - assign an IP address to the bond device :\n"
" # ifconfig bond0 <addr> netmask <mask> broadcast <bcast>\n" " # ifconfig bond0 <addr> netmask <mask> broadcast <bcast>\n"
" - attach all the interfaces you need to the bond device :\n" " - attach all the interfaces you need to the bond device :\n"
" # ifenslave bond0 eth0 eth1 eth2\n" " # ifenslave [{-f|--force}] bond0 eth0 [eth1 [eth2]...]\n"
" If bond0 didn't have a MAC address, it will take eth0's. Then, all\n" " If bond0 didn't have a MAC address, it will take eth0's. Then, all\n"
" interfaces attached AFTER this assignment will get the same MAC addr.\n" " interfaces attached AFTER this assignment will get the same MAC addr.\n"
"\n" " (except for ALB/TLB modes)\n"
" To detach a dead interface without setting the bond device down :\n"
" # ifenslave -d bond0 eth1\n"
"\n" "\n"
" To set the bond device down and automatically release all the slaves :\n" " To set the bond device down and automatically release all the slaves :\n"
" # ifconfig bond0 down\n" " # ifconfig bond0 down\n"
"\n" "\n"
" To detach a dead interface without setting the bond device down :\n"
" # ifenslave {-d|--detach} bond0 eth0 [eth1 [eth2]...]\n"
"\n"
" To change active slave :\n" " To change active slave :\n"
" # ifenslave -c bond0 eth0\n" " # ifenslave {-c|--change-active} bond0 eth0\n"
"\n"
" To show master interface info\n"
" # ifenslave bond0\n"
"\n"
" To show all interfaces info\n"
" # ifenslave {-a|--all-interfaces}\n"
"\n"
" To be more verbose\n"
" # ifenslave {-v|--verbose} ...\n"
"\n"
" # ifenslave {-u|--usage} Show usage\n"
" # ifenslave {-V|--version} Show version\n"
" # ifenslave {-h|--help} This message\n"
"\n"; "\n";
#include <unistd.h> #include <unistd.h>
...@@ -153,476 +171,332 @@ typedef __uint8_t u8; /* ditto */ ...@@ -153,476 +171,332 @@ typedef __uint8_t u8; /* ditto */
#include <linux/ethtool.h> #include <linux/ethtool.h>
struct option longopts[] = { struct option longopts[] = {
/* { name has_arg *flag val } */ /* { name has_arg *flag val } */
{"all-interfaces", 0, 0, 'a'}, /* Show all interfaces. */ {"all-interfaces", 0, 0, 'a'}, /* Show all interfaces. */
{"force", 0, 0, 'f'}, /* Force the operation. */ {"change-active", 0, 0, 'c'}, /* Change the active slave. */
{"help", 0, 0, '?'}, /* Give help */ {"detach", 0, 0, 'd'}, /* Detach a slave interface. */
{"howto", 0, 0, 'h'}, /* Give some more help */ {"force", 0, 0, 'f'}, /* Force the operation. */
{"receive-slave", 0, 0, 'r'}, /* Make a receive-only slave. */ {"help", 0, 0, 'h'}, /* Give help */
{"verbose", 0, 0, 'v'}, /* Report each action taken. */ {"usage", 0, 0, 'u'}, /* Give usage */
{"version", 0, 0, 'V'}, /* Emit version information. */ {"verbose", 0, 0, 'v'}, /* Report each action taken. */
{"detach", 0, 0, 'd'}, /* Detach a slave interface. */ {"version", 0, 0, 'V'}, /* Emit version information. */
{"change-active", 0, 0, 'c'}, /* Change the active slave. */ { 0, 0, 0, 0}
{ 0, 0, 0, 0 }
}; };
/* Command-line flags. */ /* Command-line flags. */
unsigned int unsigned int
opt_a = 0, /* Show-all-interfaces flag. */ opt_a = 0, /* Show-all-interfaces flag. */
opt_f = 0, /* Force the operation. */ opt_c = 0, /* Change-active-slave flag. */
opt_r = 0, /* Set up a Rx-only slave. */ opt_d = 0, /* Detach a slave interface. */
opt_d = 0, /* detach a slave interface. */ opt_f = 0, /* Force the operation. */
opt_c = 0, /* change-active-slave flag. */ opt_h = 0, /* Help */
verbose = 0, /* Verbose flag. */ opt_u = 0, /* Usage */
opt_version = 0, opt_v = 0, /* Verbose flag. */
opt_howto = 0; opt_V = 0; /* Version */
int skfd = -1; /* AF_INET socket for ioctl() calls. */
int skfd = -1; /* AF_INET socket for ioctl() calls.*/
int abi_ver = 0; /* userland - kernel ABI version */
int hwaddr_set = 0; /* Master's hwaddr is set */
int saved_errno;
struct ifreq master_mtu, master_flags, master_hwaddr;
struct ifreq slave_mtu, slave_flags, slave_hwaddr;
struct dev_ifr {
struct ifreq *req_ifr;
char *req_name;
int req_type;
};
static void if_print(char *ifname); struct dev_ifr master_ifra[] = {
static int get_abi_ver(char *master_ifname); {&master_mtu, "SIOCGIFMTU", SIOCGIFMTU},
{&master_flags, "SIOCGIFFLAGS", SIOCGIFFLAGS},
{&master_hwaddr, "SIOCGIFHWADDR", SIOCGIFHWADDR},
{NULL, "", 0}
};
struct dev_ifr slave_ifra[] = {
{&slave_mtu, "SIOCGIFMTU", SIOCGIFMTU},
{&slave_flags, "SIOCGIFFLAGS", SIOCGIFFLAGS},
{&slave_hwaddr, "SIOCGIFHWADDR", SIOCGIFHWADDR},
{NULL, "", 0}
};
int static void if_print(char *ifname);
main(int argc, char **argv) static int get_drv_info(char *master_ifname);
static int get_if_settings(char *ifname, struct dev_ifr ifra[]);
static int get_slave_flags(char *slave_ifname);
static int set_master_hwaddr(char *master_ifname, struct sockaddr *hwaddr);
static int set_slave_hwaddr(char *slave_ifname, struct sockaddr *hwaddr);
static int set_slave_mtu(char *slave_ifname, int mtu);
static int set_if_flags(char *ifname, short flags);
static int set_if_up(char *ifname, short flags);
static int set_if_down(char *ifname, short flags);
static int clear_if_addr(char *ifname);
static int set_if_addr(char *master_ifname, char *slave_ifname);
static int change_active(char *master_ifname, char *slave_ifname);
static int enslave(char *master_ifname, char *slave_ifname);
static int release(char *master_ifname, char *slave_ifname);
#define v_print(fmt, args...) \
if (opt_v) \
fprintf(stderr, fmt, ## args )
int main(int argc, char *argv[])
{ {
struct ifreq ifr2, if_hwaddr, if_ipaddr, if_metric, if_mtu, if_dstaddr;
struct ifreq if_netmask, if_brdaddr, if_flags;
int rv, goterr = 0;
int c, errflag = 0;
sa_family_t master_family;
char **spp, *master_ifname, *slave_ifname; char **spp, *master_ifname, *slave_ifname;
int hwaddr_notset; int c, i, rv;
int abi_ver = 0; int res = 0;
int exclusive = 0;
while ((c = getopt_long(argc, argv, "acdfrvV?h", longopts, 0)) != EOF) while ((c = getopt_long(argc, argv, "acdfhuvV", longopts, 0)) != EOF) {
switch (c) { switch (c) {
case 'a': opt_a++; break; case 'a': opt_a++; exclusive++; break;
case 'f': opt_f++; break; case 'c': opt_c++; exclusive++; break;
case 'r': opt_r++; break; case 'd': opt_d++; exclusive++; break;
case 'd': opt_d++; break; case 'f': opt_f++; exclusive++; break;
case 'c': opt_c++; break; case 'h': opt_h++; exclusive++; break;
case 'v': verbose++; break; case 'u': opt_u++; exclusive++; break;
case 'V': opt_version++; break; case 'v': opt_v++; break;
case 'h': opt_howto++; break; case 'V': opt_V++; exclusive++; break;
case '?': errflag++;
} case '?':
/* option check */
if (opt_c)
if(opt_a || opt_f || opt_r || opt_d || verbose || opt_version ||
opt_howto || errflag ) {
fprintf(stderr, usage_msg); fprintf(stderr, usage_msg);
return 2; res = 2;
goto out;
} }
}
if (errflag) { /* options check */
if (exclusive > 1) {
fprintf(stderr, usage_msg); fprintf(stderr, usage_msg);
return 2; res = 2;
goto out;
} }
if (opt_howto) { if (opt_v || opt_V) {
fprintf(stderr, howto_msg); printf(version);
return 0; if (opt_V) {
res = 0;
goto out;
}
} }
if (verbose || opt_version) { if (opt_u) {
printf(version); printf(usage_msg);
if (opt_version) res = 0;
exit(0); goto out;
} }
/* Open a basic socket. */ if (opt_h) {
if ((skfd = socket(AF_INET, SOCK_DGRAM,0)) < 0) { printf(usage_msg);
perror("socket"); printf(help_msg);
exit(-1); res = 0;
goto out;
} }
if (verbose) /* Open a basic socket */
fprintf(stderr, "DEBUG: argc=%d, optind=%d and argv[optind] is %s.\n", if ((skfd = socket(AF_INET, SOCK_DGRAM, 0)) < 0) {
argc, optind, argv[optind]); perror("socket");
res = 1;
goto out;
}
/* No remaining args means show all interfaces. */ if (opt_a) {
if (optind == argc) { if (optind == argc) {
if_print((char *)NULL); /* No remaining args */
(void) close(skfd); /* show all interfaces */
exit(0); if_print((char *)NULL);
goto out;
} else {
/* Just show usage */
fprintf(stderr, usage_msg);
res = 2;
goto out;
}
} }
/* Copy the interface name. */ /* Copy the interface name */
spp = argv + optind; spp = argv + optind;
master_ifname = *spp++; master_ifname = *spp++;
slave_ifname = *spp++;
/* Check command line. */ if (master_ifname == NULL) {
if (opt_c) { fprintf(stderr, usage_msg);
char **tempp = spp; res = 2;
if ((master_ifname == NULL)||(slave_ifname == NULL)||(*tempp++ != NULL)) { goto out;
fprintf(stderr, usage_msg);
(void) close(skfd);
return 2;
}
} }
/* A single args means show the configuration for this interface. */ /* exchange abi version with bonding module */
if (slave_ifname == NULL) { res = get_drv_info(master_ifname);
if_print(master_ifname); if (res) {
(void) close(skfd); fprintf(stderr,
exit(0); "Master '%s': Error: handshake with driver failed. "
} "Aborting\n",
master_ifname);
/* exchange abi version with bonding driver */ goto out;
abi_ver = get_abi_ver(master_ifname); }
if (abi_ver < 0) {
(void) close(skfd);
exit(1);
}
/* Get the vitals from the master interface. */
{
struct ifreq *ifra[7] = { &if_ipaddr, &if_mtu, &if_dstaddr,
&if_brdaddr, &if_netmask, &if_flags,
&if_hwaddr };
const char *req_name[7] = {
"IP address", "MTU", "destination address",
"broadcast address", "netmask", "status flags",
"hardware address" };
const int ioctl_req_type[7] = {
SIOCGIFADDR, SIOCGIFMTU, SIOCGIFDSTADDR,
SIOCGIFBRDADDR, SIOCGIFNETMASK, SIOCGIFFLAGS,
SIOCGIFHWADDR };
int i;
for (i = 0; i < 7; i++) {
strncpy(ifra[i]->ifr_name, master_ifname, IFNAMSIZ);
if (ioctl(skfd, ioctl_req_type[i], ifra[i]) < 0) {
fprintf(stderr,
"Something broke getting the master's %s: %s.\n",
req_name[i], strerror(errno));
}
}
/* check if master is up; if not then fail any operation */ slave_ifname = *spp++;
if (!(if_flags.ifr_flags & IFF_UP)) {
fprintf(stderr, "Illegal operation; the specified master interface '%s' is not up.\n", master_ifname);
(void) close(skfd);
exit (1);
}
hwaddr_notset = 1; /* assume master's address not set yet */ if (slave_ifname == NULL) {
for (i = 0; hwaddr_notset && (i < 6); i++) { if (opt_d || opt_c) {
hwaddr_notset &= ((unsigned char *)if_hwaddr.ifr_hwaddr.sa_data)[i] == 0; fprintf(stderr, usage_msg);
res = 2;
goto out;
} }
/* The family '1' is ARPHRD_ETHER for ethernet. */ /* A single arg means show the
if (if_hwaddr.ifr_hwaddr.sa_family != 1 && !opt_f) { * configuration for this interface
fprintf(stderr, "The specified master interface '%s' is not" */
" ethernet-like.\n This program is designed to work" if_print(master_ifname);
" with ethernet-like network interfaces.\n" goto out;
" Use the '-f' option to force the operation.\n",
master_ifname);
(void) close(skfd);
exit (1);
}
master_family = if_hwaddr.ifr_hwaddr.sa_family;
if (verbose) {
unsigned char *hwaddr = (unsigned char *)if_hwaddr.ifr_hwaddr.sa_data;
printf("The current hardware address (SIOCGIFHWADDR) of %s is type %d "
"%2.2x:%2.2x:%2.2x:%2.2x:%2.2x:%2.2x.\n", master_ifname,
if_hwaddr.ifr_hwaddr.sa_family, hwaddr[0], hwaddr[1],
hwaddr[2], hwaddr[3], hwaddr[4], hwaddr[5]);
}
} }
res = get_if_settings(master_ifname, master_ifra);
if (res) {
/* Probably a good reason not to go on */
fprintf(stderr,
"Master '%s': Error: get settings failed: %s. "
"Aborting\n",
master_ifname, strerror(res));
goto out;
}
/* do this when enslaving interfaces */ /* check if master is indeed a master;
do { * if not then fail any operation
if (opt_d) { /* detach a slave interface from the master */ */
strncpy(if_flags.ifr_name, master_ifname, IFNAMSIZ); if (!(master_flags.ifr_flags & IFF_MASTER)) {
strncpy(if_flags.ifr_slave, slave_ifname, IFNAMSIZ); fprintf(stderr,
if ((ioctl(skfd, SIOCBONDRELEASE, &if_flags) < 0) && "Illegal operation; the specified interface '%s' "
(ioctl(skfd, BOND_RELEASE_OLD, &if_flags) < 0)) { "is not a master. Aborting\n",
fprintf(stderr, "SIOCBONDRELEASE: cannot detach %s from %s. errno=%s.\n", master_ifname);
slave_ifname, master_ifname, strerror(errno)); res = 1;
} goto out;
else if (abi_ver < 1) { }
/* The driver is using an old ABI, so we'll set the interface
* down to avoid any conflicts due to same IP/MAC
*/
strncpy(ifr2.ifr_name, slave_ifname, IFNAMSIZ);
if (ioctl(skfd, SIOCGIFFLAGS, &ifr2) < 0) {
int saved_errno = errno;
fprintf(stderr, "SIOCGIFFLAGS on %s failed: %s\n", slave_ifname,
strerror(saved_errno));
}
else {
ifr2.ifr_flags &= ~(IFF_UP | IFF_RUNNING);
if (ioctl(skfd, SIOCSIFFLAGS, &ifr2) < 0) {
int saved_errno = errno;
fprintf(stderr, "Shutting down interface %s failed: %s\n",
slave_ifname, strerror(saved_errno));
}
}
}
} else if (opt_c) { /* change primary slave */
strncpy(if_flags.ifr_name, master_ifname, IFNAMSIZ);
strncpy(if_flags.ifr_slave, slave_ifname, IFNAMSIZ);
if ((ioctl(skfd, SIOCBONDCHANGEACTIVE, &if_flags) < 0) &&
(ioctl(skfd, BOND_CHANGE_ACTIVE_OLD, &if_flags) < 0)) {
fprintf(stderr, "SIOCBONDCHANGEACTIVE: %s.\n", strerror(errno));
}
} else { /* attach a slave interface to the master */
strncpy(ifr2.ifr_name, slave_ifname, IFNAMSIZ);
if (ioctl(skfd, SIOCGIFFLAGS, &ifr2) < 0) {
int saved_errno = errno;
fprintf(stderr, "SIOCGIFFLAGS on %s failed: %s\n", slave_ifname,
strerror(saved_errno));
(void) close(skfd);
return 1;
}
if ((ifr2.ifr_flags & IFF_SLAVE) && !opt_r) {
fprintf(stderr, "%s is already a slave\n", slave_ifname);
(void) close(skfd);
return 1;
}
/* if hwaddr_notset, assign the slave hw address to the master */
if (hwaddr_notset) {
/* assign the slave hw address to the
* master since it currently does not
* have one; otherwise, slaves may
* have different hw addresses in
* active-backup mode as seen when enslaving
* using "ifenslave bond0 eth0 eth1" because
* hwaddr_notset is set outside this loop.
* TODO: put this and the "else" portion in
* a function.
*/
/* get the slaves MAC address */
strncpy(if_hwaddr.ifr_name, slave_ifname,
IFNAMSIZ);
rv = ioctl(skfd, SIOCGIFHWADDR, &if_hwaddr);
if (-1 == rv) {
fprintf(stderr, "Could not get MAC "
"address of %s: %s\n",
slave_ifname,
strerror(errno));
strncpy(if_hwaddr.ifr_name,
master_ifname, IFNAMSIZ);
goterr = 1;
}
if (!goterr) {
if (abi_ver < 1) {
/* In ABI versions older than 1, the
* master's set_mac routine couldn't
* work if it was up, because it
* used the default ethernet set_mac
* function.
*/
/* bring master down */
if_flags.ifr_flags &= ~IFF_UP;
if (ioctl(skfd, SIOCSIFFLAGS,
&if_flags) < 0) {
goterr = 1;
fprintf(stderr,
"Shutting down "
"interface %s failed: "
"%s\n",
master_ifname,
strerror(errno));
}
}
strncpy(if_hwaddr.ifr_name,
master_ifname, IFNAMSIZ);
if (ioctl(skfd, SIOCSIFHWADDR,
&if_hwaddr) < 0) {
fprintf(stderr,
"Could not set MAC "
"address of %s: %s\n",
master_ifname,
strerror(errno));
goterr=1;
} else {
hwaddr_notset = 0;
}
if (abi_ver < 1) {
/* bring master back up */
if_flags.ifr_flags |= IFF_UP;
if (ioctl(skfd, SIOCSIFFLAGS,
&if_flags) < 0) {
fprintf(stderr,
"Bringing up interface "
"%s failed: %s\n",
master_ifname,
strerror(errno));
}
}
}
} else if (abi_ver < 1) { /* if (hwaddr_notset) */
/* The driver is using an old ABI, so we'll set the interface
* down and assign the master's hwaddr to it
*/
if (ifr2.ifr_flags & IFF_UP) {
ifr2.ifr_flags &= ~IFF_UP;
if (ioctl(skfd, SIOCSIFFLAGS, &ifr2) < 0) {
int saved_errno = errno;
fprintf(stderr, "Shutting down interface %s failed: %s\n",
slave_ifname, strerror(saved_errno));
}
}
strncpy(if_hwaddr.ifr_name, slave_ifname, IFNAMSIZ);
if (ioctl(skfd, SIOCSIFHWADDR, &if_hwaddr) < 0) {
int saved_errno = errno;
fprintf(stderr, "SIOCSIFHWADDR on %s failed: %s\n", if_hwaddr.ifr_name,
strerror(saved_errno));
if (saved_errno == EBUSY)
fprintf(stderr, " The slave device %s is busy: it must be"
" idle before running this command.\n", slave_ifname);
else if (saved_errno == EOPNOTSUPP)
fprintf(stderr, " The slave device you specified does not support"
" setting the MAC address.\n Your kernel likely does not"
" support slave devices.\n");
else if (saved_errno == EINVAL)
fprintf(stderr, " The slave device's address type does not match"
" the master's address type.\n");
} else {
if (verbose) {
unsigned char *hwaddr = if_hwaddr.ifr_hwaddr.sa_data;
printf("Slave's (%s) hardware address set to "
"%2.2x:%2.2x:%2.2x:%2.2x:%2.2x:%2.2x.\n", slave_ifname,
hwaddr[0], hwaddr[1], hwaddr[2], hwaddr[3], hwaddr[4], hwaddr[5]);
}
}
}
if (*spp && !strcmp(*spp, "metric")) { /* check if master is up; if not then fail any operation */
if (*++spp == NULL) { if (!(master_flags.ifr_flags & IFF_UP)) {
fprintf(stderr, usage_msg); fprintf(stderr,
(void) close(skfd); "Illegal operation; the specified master interface "
exit(2); "'%s' is not up.\n",
} master_ifname);
if_metric.ifr_metric = atoi(*spp); res = 1;
strncpy(if_metric.ifr_name, slave_ifname, IFNAMSIZ); goto out;
if (ioctl(skfd, SIOCSIFMETRIC, &if_metric) < 0) { }
fprintf(stderr, "SIOCSIFMETRIC on %s: %s\n", slave_ifname,
strerror(errno));
goterr = 1;
}
spp++;
}
if (strncpy(if_ipaddr.ifr_name, slave_ifname, IFNAMSIZ) <= 0 /* Only for enslaving */
|| ioctl(skfd, SIOCSIFADDR, &if_ipaddr) < 0) { if (!opt_c && !opt_d) {
fprintf(stderr, sa_family_t master_family = master_hwaddr.ifr_hwaddr.sa_family;
"Something broke setting the slave's address: %s.\n", unsigned char *hwaddr =
strerror(errno)); (unsigned char *)master_hwaddr.ifr_hwaddr.sa_data;
} else {
if (verbose) {
unsigned char *ipaddr = if_ipaddr.ifr_addr.sa_data;
printf("Set the slave's (%s) IP address to %d.%d.%d.%d.\n",
slave_ifname, ipaddr[0], ipaddr[1], ipaddr[2], ipaddr[3]);
}
}
if (strncpy(if_mtu.ifr_name, slave_ifname, IFNAMSIZ) <= 0 /* The family '1' is ARPHRD_ETHER for ethernet. */
|| ioctl(skfd, SIOCSIFMTU, &if_mtu) < 0) { if (master_family != 1 && !opt_f) {
fprintf(stderr, "Something broke setting the slave MTU: %s.\n", fprintf(stderr,
strerror(errno)); "Illegal operation: The specified master "
} else { "interface '%s' is not ethernet-like.\n "
if (verbose) "This program is designed to work with "
printf("Set the slave's (%s) MTU to %d.\n", slave_ifname, if_mtu.ifr_mtu); "ethernet-like network interfaces.\n "
} "Use the '-f' option to force the "
"operation.\n",
master_ifname);
res = 1;
goto out;
}
if (strncpy(if_dstaddr.ifr_name, slave_ifname, IFNAMSIZ) <= 0 /* Check master's hw addr */
|| ioctl(skfd, SIOCSIFDSTADDR, &if_dstaddr) < 0) { for (i = 0; i < 6; i++) {
fprintf(stderr, "Error setting the slave (%s) with SIOCSIFDSTADDR: %s.\n", if (hwaddr[i] != 0) {
slave_ifname, strerror(errno)); hwaddr_set = 1;
} else { break;
if (verbose) {
unsigned char *ipaddr = if_dstaddr.ifr_dstaddr.sa_data;
printf("Set the slave's (%s) destination address to %d.%d.%d.%d.\n",
slave_ifname, ipaddr[0], ipaddr[1], ipaddr[2], ipaddr[3]);
}
} }
}
if (strncpy(if_brdaddr.ifr_name, slave_ifname, IFNAMSIZ) <= 0 if (hwaddr_set) {
|| ioctl(skfd, SIOCSIFBRDADDR, &if_brdaddr) < 0) { v_print("current hardware address of master '%s' "
fprintf(stderr, "is %2.2x:%2.2x:%2.2x:%2.2x:%2.2x:%2.2x, "
"Something broke setting the slave (%s) broadcast address: %s.\n", "type %d\n",
slave_ifname, strerror(errno)); master_ifname,
} else { hwaddr[0], hwaddr[1],
if (verbose) { hwaddr[2], hwaddr[3],
unsigned char *ipaddr = if_brdaddr.ifr_broadaddr.sa_data; hwaddr[4], hwaddr[5],
printf("Set the slave's (%s) broadcast address to %d.%d.%d.%d.\n", master_family);
slave_ifname, ipaddr[0], ipaddr[1], ipaddr[2], ipaddr[3]); }
} }
}
if (strncpy(if_netmask.ifr_name, slave_ifname, IFNAMSIZ) <= 0 /* Accepts only one slave */
|| ioctl(skfd, SIOCSIFNETMASK, &if_netmask) < 0) { if (opt_c) {
fprintf(stderr, /* change active slave */
"Something broke setting the slave (%s) netmask: %s.\n", res = get_slave_flags(slave_ifname);
slave_ifname, strerror(errno)); if (res) {
} else { fprintf(stderr,
if (verbose) { "Slave '%s': Error: get flags failed. "
unsigned char *ipaddr = if_netmask.ifr_netmask.sa_data; "Aborting\n",
printf("Set the slave's (%s) netmask to %d.%d.%d.%d.\n", slave_ifname);
slave_ifname, ipaddr[0], ipaddr[1], ipaddr[2], ipaddr[3]); goto out;
}
res = change_active(master_ifname, slave_ifname);
if (res) {
fprintf(stderr,
"Master '%s', Slave '%s': Error: "
"Change active failed\n",
master_ifname, slave_ifname);
}
} else {
/* Accept multiple slaves */
do {
if (opt_d) {
/* detach a slave interface from the master */
rv = get_slave_flags(slave_ifname);
if (rv) {
/* Can't work with this slave. */
/* remember the error and skip it*/
fprintf(stderr,
"Slave '%s': Error: get flags "
"failed. Skipping\n",
slave_ifname);
res = rv;
continue;
} }
} rv = release(master_ifname, slave_ifname);
if (rv) {
if (abi_ver < 1) { fprintf(stderr,
"Master '%s', Slave '%s': Error: "
/* The driver is using an old ABI, so we'll set the interface "Release failed\n",
* up before enslaving it master_ifname, slave_ifname);
*/ res = rv;
ifr2.ifr_flags |= IFF_UP;
if ((ifr2.ifr_flags &= ~(IFF_SLAVE | IFF_MASTER)) == 0
|| strncpy(ifr2.ifr_name, slave_ifname, IFNAMSIZ) <= 0
|| ioctl(skfd, SIOCSIFFLAGS, &ifr2) < 0) {
fprintf(stderr,
"Something broke setting the slave (%s) flags: %s.\n",
slave_ifname, strerror(errno));
} else {
if (verbose)
printf("Set the slave's (%s) flags %4.4x.\n",
slave_ifname, if_flags.ifr_flags);
} }
} else { } else {
/* the bonding module takes care of setting the slave's mac address /* attach a slave interface to the master */
* and opening its interface rv = get_if_settings(slave_ifname, slave_ifra);
*/ if (rv) {
if (ifr2.ifr_flags & IFF_UP) { /* the interface will need to be down */ /* Can't work with this slave. */
ifr2.ifr_flags &= ~IFF_UP; /* remember the error and skip it*/
if (ioctl(skfd, SIOCSIFFLAGS, &ifr2) < 0) { fprintf(stderr,
int saved_errno = errno; "Slave '%s': Error: get "
fprintf(stderr, "Shutting down interface %s failed: %s\n", "settings failed: %s. "
slave_ifname, strerror(saved_errno)); "Skipping\n",
} slave_ifname, strerror(rv));
res = rv;
continue;
} }
} rv = enslave(master_ifname, slave_ifname);
if (rv) {
/* Do the real thing */ fprintf(stderr,
if (!opt_r) { "Master '%s', Slave '%s': Error: "
strncpy(if_flags.ifr_name, master_ifname, IFNAMSIZ); "Enslave failed\n",
strncpy(if_flags.ifr_slave, slave_ifname, IFNAMSIZ); master_ifname, slave_ifname);
if ((ioctl(skfd, SIOCBONDENSLAVE, &if_flags) < 0) && res = rv;
(ioctl(skfd, BOND_ENSLAVE_OLD, &if_flags) < 0)) {
fprintf(stderr, "SIOCBONDENSLAVE: %s.\n", strerror(errno));
} }
} }
} } while ((slave_ifname = *spp++) != NULL);
} while ( (slave_ifname = *spp++) != NULL); }
/* Close the socket. */ out:
(void) close(skfd); if (skfd >= 0) {
close(skfd);
}
return(goterr); return res;
} }
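The reworked main() above rejects conflicting modes by counting them in a single `exclusive` counter instead of testing flag combinations by hand. A minimal sketch of that idea follows. It assumes a reduced option set; `count_exclusive` is a hypothetical helper and not part of ifenslave.c, which also counts -a, -f, -h, -u and -V.

```c
#include <getopt.h>
#include <stddef.h>

/* Hypothetical helper (not in ifenslave.c): returns how many mutually
 * exclusive mode options were given, or -1 on an unknown option.
 */
static int count_exclusive(int argc, char *argv[])
{
	static const struct option opts[] = {
		{"change-active", no_argument, NULL, 'c'},
		{"detach",        no_argument, NULL, 'd'},
		{"verbose",       no_argument, NULL, 'v'},
		{NULL, 0, NULL, 0}
	};
	int c, exclusive = 0;

	optind = 1;	/* restart scanning so the helper can be called again */
	while ((c = getopt_long(argc, argv, "cdv", opts, NULL)) != -1) {
		switch (c) {
		case 'c':
		case 'd':
			exclusive++;	/* a mode: at most one is allowed */
			break;
		case 'v':
			break;		/* a modifier: valid with any mode */
		default:
			return -1;	/* getopt_long() returned '?' */
		}
	}
	return exclusive;
}
```

A caller would print the usage message and exit with status 2 when the result is negative or greater than 1, which mirrors the `exclusive > 1` check in the diff.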
static short mif_flags; static short mif_flags;
...@@ -631,35 +505,34 @@ static short mif_flags; ...@@ -631,35 +505,34 @@ static short mif_flags;
static int if_getconfig(char *ifname) static int if_getconfig(char *ifname)
{ {
struct ifreq ifr; struct ifreq ifr;
int metric, mtu; /* Parameters of the master interface. */ int metric, mtu; /* Parameters of the master interface. */
struct sockaddr dstaddr, broadaddr, netmask; struct sockaddr dstaddr, broadaddr, netmask;
unsigned char *hwaddr;
strcpy(ifr.ifr_name, ifname); strcpy(ifr.ifr_name, ifname);
if (ioctl(skfd, SIOCGIFFLAGS, &ifr) < 0) if (ioctl(skfd, SIOCGIFFLAGS, &ifr) < 0)
return -1; return -1;
mif_flags = ifr.ifr_flags; mif_flags = ifr.ifr_flags;
printf("The result of SIOCGIFFLAGS on %s is %x.\n", printf("The result of SIOCGIFFLAGS on %s is %x.\n",
ifname, ifr.ifr_flags); ifname, ifr.ifr_flags);
strcpy(ifr.ifr_name, ifname); strcpy(ifr.ifr_name, ifname);
if (ioctl(skfd, SIOCGIFADDR, &ifr) < 0) if (ioctl(skfd, SIOCGIFADDR, &ifr) < 0)
return -1; return -1;
printf("The result of SIOCGIFADDR is %2.2x.%2.2x.%2.2x.%2.2x.\n", printf("The result of SIOCGIFADDR is %2.2x.%2.2x.%2.2x.%2.2x.\n",
ifr.ifr_addr.sa_data[0], ifr.ifr_addr.sa_data[1], ifr.ifr_addr.sa_data[0], ifr.ifr_addr.sa_data[1],
ifr.ifr_addr.sa_data[2], ifr.ifr_addr.sa_data[3]); ifr.ifr_addr.sa_data[2], ifr.ifr_addr.sa_data[3]);
strcpy(ifr.ifr_name, ifname); strcpy(ifr.ifr_name, ifname);
if (ioctl(skfd, SIOCGIFHWADDR, &ifr) < 0) if (ioctl(skfd, SIOCGIFHWADDR, &ifr) < 0)
return -1; return -1;
{ /* Gotta convert from 'char' to unsigned for printf(). */
/* Gotta convert from 'char' to unsigned for printf(). */ hwaddr = (unsigned char *)ifr.ifr_hwaddr.sa_data;
unsigned char *hwaddr = (unsigned char *)ifr.ifr_hwaddr.sa_data; printf("The result of SIOCGIFHWADDR is type %d "
printf("The result of SIOCGIFHWADDR is type %d " "%2.2x:%2.2x:%2.2x:%2.2x:%2.2x:%2.2x.\n",
"%2.2x:%2.2x:%2.2x:%2.2x:%2.2x:%2.2x.\n", ifr.ifr_hwaddr.sa_family, hwaddr[0], hwaddr[1],
ifr.ifr_hwaddr.sa_family, hwaddr[0], hwaddr[1], hwaddr[2], hwaddr[3], hwaddr[4], hwaddr[5]);
hwaddr[2], hwaddr[3], hwaddr[4], hwaddr[5]);
}
strcpy(ifr.ifr_name, ifname); strcpy(ifr.ifr_name, ifname);
if (ioctl(skfd, SIOCGIFMETRIC, &ifr) < 0) { if (ioctl(skfd, SIOCGIFMETRIC, &ifr) < 0) {
...@@ -691,7 +564,7 @@ static int if_getconfig(char *ifname) ...@@ -691,7 +564,7 @@ static int if_getconfig(char *ifname)
} else } else
netmask = ifr.ifr_netmask; netmask = ifr.ifr_netmask;
return(0); return 0;
} }
static void if_print(char *ifname) static void if_print(char *ifname)
...@@ -705,15 +578,16 @@ static void if_print(char *ifname) ...@@ -705,15 +578,16 @@ static void if_print(char *ifname)
ifc.ifc_len = sizeof(buff); ifc.ifc_len = sizeof(buff);
ifc.ifc_buf = buff; ifc.ifc_buf = buff;
if (ioctl(skfd, SIOCGIFCONF, &ifc) < 0) { if (ioctl(skfd, SIOCGIFCONF, &ifc) < 0) {
fprintf(stderr, "SIOCGIFCONF: %s\n", strerror(errno)); perror("SIOCGIFCONF failed");
return; return;
} }
ifr = ifc.ifc_req; ifr = ifc.ifc_req;
for (i = ifc.ifc_len / sizeof(struct ifreq); --i >= 0; ifr++) { for (i = ifc.ifc_len / sizeof(struct ifreq); --i >= 0; ifr++) {
if (if_getconfig(ifr->ifr_name) < 0) { if (if_getconfig(ifr->ifr_name) < 0) {
fprintf(stderr, "%s: unknown interface.\n", fprintf(stderr,
ifr->ifr_name); "%s: unknown interface.\n",
ifr->ifr_name);
continue; continue;
} }
...@@ -721,16 +595,18 @@ static void if_print(char *ifname) ...@@ -721,16 +595,18 @@ static void if_print(char *ifname)
/*ife_print(&ife);*/ /*ife_print(&ife);*/
} }
} else { } else {
if (if_getconfig(ifname) < 0) if (if_getconfig(ifname) < 0) {
fprintf(stderr, "%s: unknown interface.\n", ifname); fprintf(stderr,
"%s: unknown interface.\n", ifname);
}
} }
} }
static int get_abi_ver(char *master_ifname) static int get_drv_info(char *master_ifname)
{ {
struct ifreq ifr; struct ifreq ifr;
struct ethtool_drvinfo info; struct ethtool_drvinfo info;
int abi_ver = 0; char *endptr;
memset(&ifr, 0, sizeof(ifr)); memset(&ifr, 0, sizeof(ifr));
strncpy(ifr.ifr_name, master_ifname, IFNAMSIZ); strncpy(ifr.ifr_name, master_ifname, IFNAMSIZ);
...@@ -739,24 +615,487 @@ static int get_abi_ver(char *master_ifname) ...@@ -739,24 +615,487 @@ static int get_abi_ver(char *master_ifname)
info.cmd = ETHTOOL_GDRVINFO; info.cmd = ETHTOOL_GDRVINFO;
strncpy(info.driver, "ifenslave", 32); strncpy(info.driver, "ifenslave", 32);
snprintf(info.fw_version, 32, "%d", BOND_ABI_VERSION); snprintf(info.fw_version, 32, "%d", BOND_ABI_VERSION);
if (ioctl(skfd, SIOCETHTOOL, &ifr) >= 0) {
char *endptr; if (ioctl(skfd, SIOCETHTOOL, &ifr) < 0) {
if (errno == EOPNOTSUPP) {
abi_ver = strtoul(info.fw_version, &endptr, 0); goto out;
if (*endptr) {
fprintf(stderr, "Error: got invalid string as an ABI "
"version from the bonding module\n");
return -1;
} }
saved_errno = errno;
v_print("Master '%s': Error: get bonding info failed %s\n",
master_ifname, strerror(saved_errno));
return 1;
}
abi_ver = strtoul(info.fw_version, &endptr, 0);
if (*endptr) {
v_print("Master '%s': Error: got invalid string as an ABI "
"version from the bonding module\n",
master_ifname);
return 1;
}
out:
v_print("ABI ver is %d\n", abi_ver);
return 0;
}
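get_drv_info() above treats the driver's fw_version string as a number and rejects it when strtoul() leaves unparsed characters behind. The sketch below shows that validation in isolation. It is slightly stricter than the diff, as an assumption of this example, in that it also rejects empty and out-of-range input; `parse_abi_version` is a hypothetical name.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper (not in ifenslave.c): parses a version string
 * into *out. Returns 0 on success, or 1 if the string is empty, has
 * trailing characters, or does not fit in an int.
 */
static int parse_abi_version(const char *s, int *out)
{
	char *endptr;
	unsigned long v;

	errno = 0;
	v = strtoul(s, &endptr, 0);	/* base 0 also accepts 0x.. and 0.. */
	if (endptr == s || *endptr != '\0')
		return 1;		/* no digits, or junk after them */
	if (errno == ERANGE || v > INT_MAX)
		return 1;		/* value out of range */
	*out = (int)v;
	return 0;
}
```

The value is left untouched on failure, so a caller can keep a default such as 0, the way the diff keeps `abi_ver` at 0 when the driver does not support the query.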
static int change_active(char *master_ifname, char *slave_ifname)
{
struct ifreq ifr;
int res = 0;
if (!(slave_flags.ifr_flags & IFF_SLAVE)) {
fprintf(stderr,
"Illegal operation: The specified slave interface "
"'%s' is not a slave\n",
slave_ifname);
return 1;
} }
if (verbose) { strncpy(ifr.ifr_name, master_ifname, IFNAMSIZ);
printf("ABI ver is %d\n", abi_ver); strncpy(ifr.ifr_slave, slave_ifname, IFNAMSIZ);
if ((ioctl(skfd, SIOCBONDCHANGEACTIVE, &ifr) < 0) &&
(ioctl(skfd, BOND_CHANGE_ACTIVE_OLD, &ifr) < 0)) {
saved_errno = errno;
v_print("Master '%s': Error: SIOCBONDCHANGEACTIVE failed: "
"%s\n",
master_ifname, strerror(saved_errno));
res = 1;
} }
return abi_ver;
return res;
} }
static int enslave(char *master_ifname, char *slave_ifname)
{
struct ifreq ifr;
int res = 0;
if (slave_flags.ifr_flags & IFF_SLAVE) {
fprintf(stderr,
"Illegal operation: The specified slave interface "
"'%s' is already a slave\n",
slave_ifname);
return 1;
}
res = set_if_down(slave_ifname, slave_flags.ifr_flags);
if (res) {
fprintf(stderr,
"Slave '%s': Error: bring interface down failed\n",
slave_ifname);
return res;
}
if (abi_ver < 2) {
/* Older bonding versions would panic if the slave has no IP
* address, so get the IP setting from the master.
*/
res = set_if_addr(master_ifname, slave_ifname);
if (res) {
fprintf(stderr,
"Slave '%s': Error: set address failed\n",
slave_ifname);
return res;
}
} else {
res = clear_if_addr(slave_ifname);
if (res) {
fprintf(stderr,
"Slave '%s': Error: clear address failed\n",
slave_ifname);
return res;
}
}
if (master_mtu.ifr_mtu != slave_mtu.ifr_mtu) {
res = set_slave_mtu(slave_ifname, master_mtu.ifr_mtu);
if (res) {
fprintf(stderr,
"Slave '%s': Error: set MTU failed\n",
slave_ifname);
return res;
}
}
if (hwaddr_set) {
/* Master already has an hwaddr
* so set its hwaddr to the slave
*/
if (abi_ver < 1) {
/* The driver is using an old ABI, so
* the application sets the slave's
* hwaddr
*/
res = set_slave_hwaddr(slave_ifname,
&(master_hwaddr.ifr_hwaddr));
if (res) {
fprintf(stderr,
"Slave '%s': Error: set hw address "
"failed\n",
slave_ifname);
goto undo_mtu;
}
/* For old ABI the application needs to bring the
* slave back up
*/
res = set_if_up(slave_ifname, slave_flags.ifr_flags);
if (res) {
fprintf(stderr,
"Slave '%s': Error: bring interface "
"up failed\n",
slave_ifname);
goto undo_slave_mac;
}
}
/* The driver is using a new ABI,
* so the driver takes care of setting
* the slave's hwaddr and bringing
* it up again
*/
} else {
/* No hwaddr for master yet, so
* set the slave's hwaddr to it
*/
if (abi_ver < 1) {
/* For old ABI, the master needs to be
* down before setting its hwaddr
*/
res = set_if_down(master_ifname, master_flags.ifr_flags);
if (res) {
fprintf(stderr,
"Master '%s': Error: bring interface "
"down failed\n",
master_ifname);
goto undo_mtu;
}
}
res = set_master_hwaddr(master_ifname,
&(slave_hwaddr.ifr_hwaddr));
if (res) {
fprintf(stderr,
"Master '%s': Error: set hw address "
"failed\n",
master_ifname);
goto undo_mtu;
}
if (abi_ver < 1) {
/* For old ABI, bring the master
* back up
*/
res = set_if_up(master_ifname, master_flags.ifr_flags);
if (res) {
fprintf(stderr,
"Master '%s': Error: bring interface "
"up failed\n",
master_ifname);
goto undo_master_mac;
}
}
hwaddr_set = 1;
}
/* Do the real thing */
strncpy(ifr.ifr_name, master_ifname, IFNAMSIZ);
strncpy(ifr.ifr_slave, slave_ifname, IFNAMSIZ);
if ((ioctl(skfd, SIOCBONDENSLAVE, &ifr) < 0) &&
(ioctl(skfd, BOND_ENSLAVE_OLD, &ifr) < 0)) {
saved_errno = errno;
v_print("Master '%s': Error: SIOCBONDENSLAVE failed: %s\n",
master_ifname, strerror(saved_errno));
res = 1;
}
if (res) {
goto undo_master_mac;
}
return 0;
/* rollback (best effort) */
undo_master_mac:
set_master_hwaddr(master_ifname, &(master_hwaddr.ifr_hwaddr));
hwaddr_set = 0;
goto undo_mtu;
undo_slave_mac:
set_slave_hwaddr(slave_ifname, &(slave_hwaddr.ifr_hwaddr));
undo_mtu:
set_slave_mtu(slave_ifname, slave_mtu.ifr_mtu);
return res;
}
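The error paths in enslave() unwind in reverse order of setup by jumping to a label and falling through the labels after it. The following is a minimal, self-contained sketch of that pattern, not the ifenslave code itself: `do_step`, `note`, `trace`, and `setup_with_rollback` are hypothetical stand-ins that only record which actions ran.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical trace buffer: each action appends one letter so the
 * order of setup and rollback can be inspected afterwards. */
static char trace[64];

static void note(const char *s)
{
	assert(strlen(trace) + strlen(s) < sizeof(trace));
	strcat(trace, s);
}

/* Hypothetical setup step; fails when its number equals fail_at. */
static int do_step(int n, int fail_at, const char *tag)
{
	note(tag);
	return n == fail_at ? -1 : 0;
}

/* Same shape as enslave(): set MTU, set MAC, then enslave. On error,
 * jump to the label that undoes everything completed so far. */
static int setup_with_rollback(int fail_at)
{
	int res;

	trace[0] = '\0';

	res = do_step(1, fail_at, "M");	/* set MTU */
	if (res)
		goto out;
	res = do_step(2, fail_at, "A");	/* set MAC address */
	if (res)
		goto undo_mtu;
	res = do_step(3, fail_at, "E");	/* enslave */
	if (res)
		goto undo_mac;
	return 0;

undo_mac:
	note("a");	/* restore MAC address, then fall through */
undo_mtu:
	note("m");	/* restore MTU */
out:
	return res;
}
```

Ordering the labels so that later setup steps are undone first keeps each failure path to a single `goto`, at the cost of requiring the labels to mirror the setup order exactly.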
static int release(char *master_ifname, char *slave_ifname)
{
struct ifreq ifr;
int res = 0;
if (!(slave_flags.ifr_flags & IFF_SLAVE)) {
fprintf(stderr,
"Illegal operation: The specified slave interface "
"'%s' is not a slave\n",
slave_ifname);
return 1;
}
strncpy(ifr.ifr_name, master_ifname, IFNAMSIZ);
strncpy(ifr.ifr_slave, slave_ifname, IFNAMSIZ);
if ((ioctl(skfd, SIOCBONDRELEASE, &ifr) < 0) &&
(ioctl(skfd, BOND_RELEASE_OLD, &ifr) < 0)) {
saved_errno = errno;
v_print("Master '%s': Error: SIOCBONDRELEASE failed: %s\n",
master_ifname, strerror(saved_errno));
return 1;
} else if (abi_ver < 1) {
/* The driver is using an old ABI, so we'll set the interface
* down to avoid any conflicts due to same MAC/IP
*/
res = set_if_down(slave_ifname, slave_flags.ifr_flags);
if (res) {
fprintf(stderr,
"Slave '%s': Error: bring interface "
"down failed\n",
slave_ifname);
}
}
/* set to default mtu */
set_slave_mtu(slave_ifname, 1500);
return res;
}
static int get_if_settings(char *ifname, struct dev_ifr ifra[])
{
int i;
int res = 0;
for (i = 0; ifra[i].req_ifr; i++) {
strncpy(ifra[i].req_ifr->ifr_name, ifname, IFNAMSIZ);
res = ioctl(skfd, ifra[i].req_type, ifra[i].req_ifr);
if (res < 0) {
saved_errno = errno;
v_print("Interface '%s': Error: %s failed: %s\n",
ifname, ifra[i].req_name,
strerror(saved_errno));
return saved_errno;
}
}
return 0;
}
static int get_slave_flags(char *slave_ifname)
{
int res = 0;
strncpy(slave_flags.ifr_name, slave_ifname, IFNAMSIZ);
res = ioctl(skfd, SIOCGIFFLAGS, &slave_flags);
if (res < 0) {
saved_errno = errno;
v_print("Slave '%s': Error: SIOCGIFFLAGS failed: %s\n",
slave_ifname, strerror(saved_errno));
} else {
v_print("Slave %s: flags %04X.\n",
slave_ifname, slave_flags.ifr_flags);
}
return res;
}
static int set_master_hwaddr(char *master_ifname, struct sockaddr *hwaddr)
{
unsigned char *addr = (unsigned char *)hwaddr->sa_data;
struct ifreq ifr;
int res = 0;
strncpy(ifr.ifr_name, master_ifname, IFNAMSIZ);
memcpy(&(ifr.ifr_hwaddr), hwaddr, sizeof(struct sockaddr));
res = ioctl(skfd, SIOCSIFHWADDR, &ifr);
if (res < 0) {
saved_errno = errno;
v_print("Master '%s': Error: SIOCSIFHWADDR failed: %s\n",
master_ifname, strerror(saved_errno));
return res;
} else {
v_print("Master '%s': hardware address set to "
"%2.2x:%2.2x:%2.2x:%2.2x:%2.2x:%2.2x.\n",
master_ifname, addr[0], addr[1], addr[2],
addr[3], addr[4], addr[5]);
}
return res;
}
static int set_slave_hwaddr(char *slave_ifname, struct sockaddr *hwaddr)
{
unsigned char *addr = (unsigned char *)hwaddr->sa_data;
struct ifreq ifr;
int res = 0;
strncpy(ifr.ifr_name, slave_ifname, IFNAMSIZ);
memcpy(&(ifr.ifr_hwaddr), hwaddr, sizeof(struct sockaddr));
res = ioctl(skfd, SIOCSIFHWADDR, &ifr);
if (res < 0) {
saved_errno = errno;
v_print("Slave '%s': Error: SIOCSIFHWADDR failed: %s\n",
slave_ifname, strerror(saved_errno));
if (saved_errno == EBUSY) {
v_print(" The device is busy: it must be idle "
"before running this command.\n");
} else if (saved_errno == EOPNOTSUPP) {
v_print(" The device does not support setting "
"the MAC address.\n"
" Your kernel likely does not support slave "
"devices.\n");
} else if (saved_errno == EINVAL) {
v_print(" The device's address type does not match "
"the master's address type.\n");
}
return res;
} else {
v_print("Slave '%s': hardware address set to "
"%2.2x:%2.2x:%2.2x:%2.2x:%2.2x:%2.2x.\n",
slave_ifname, addr[0], addr[1], addr[2],
addr[3], addr[4], addr[5]);
}
return res;
}
static int set_slave_mtu(char *slave_ifname, int mtu)
{
struct ifreq ifr;
int res = 0;
ifr.ifr_mtu = mtu;
strncpy(ifr.ifr_name, slave_ifname, IFNAMSIZ);
res = ioctl(skfd, SIOCSIFMTU, &ifr);
if (res < 0) {
saved_errno = errno;
v_print("Slave '%s': Error: SIOCSIFMTU failed: %s\n",
slave_ifname, strerror(saved_errno));
} else {
v_print("Slave '%s': MTU set to %d.\n", slave_ifname, mtu);
}
return res;
}
static int set_if_flags(char *ifname, short flags)
{
struct ifreq ifr;
int res = 0;
ifr.ifr_flags = flags;
strncpy(ifr.ifr_name, ifname, IFNAMSIZ);
res = ioctl(skfd, SIOCSIFFLAGS, &ifr);
if (res < 0) {
saved_errno = errno;
v_print("Interface '%s': Error: SIOCSIFFLAGS failed: %s\n",
ifname, strerror(saved_errno));
} else {
v_print("Interface '%s': flags set to %04X.\n", ifname, flags);
}
return res;
}
static int set_if_up(char *ifname, short flags)
{
return set_if_flags(ifname, flags | IFF_UP);
}
static int set_if_down(char *ifname, short flags)
{
return set_if_flags(ifname, flags & ~IFF_UP);
}
static int clear_if_addr(char *ifname)
{
struct ifreq ifr;
int res = 0;
strncpy(ifr.ifr_name, ifname, IFNAMSIZ);
ifr.ifr_addr.sa_family = AF_INET;
memset(ifr.ifr_addr.sa_data, 0, sizeof(ifr.ifr_addr.sa_data));
res = ioctl(skfd, SIOCSIFADDR, &ifr);
if (res < 0) {
saved_errno = errno;
v_print("Interface '%s': Error: SIOCSIFADDR failed: %s\n",
ifname, strerror(saved_errno));
} else {
v_print("Interface '%s': address cleared\n", ifname);
}
return res;
}
static int set_if_addr(char *master_ifname, char *slave_ifname)
{
struct ifreq ifr;
int res;
unsigned char *ipaddr;
int i;
struct {
char *req_name;
char *desc;
int g_ioctl;
int s_ioctl;
} ifra[] = {
{"IFADDR", "addr", SIOCGIFADDR, SIOCSIFADDR},
{"DSTADDR", "destination addr", SIOCGIFDSTADDR, SIOCSIFDSTADDR},
{"BRDADDR", "broadcast addr", SIOCGIFBRDADDR, SIOCSIFBRDADDR},
{"NETMASK", "netmask", SIOCGIFNETMASK, SIOCSIFNETMASK},
{NULL, NULL, 0, 0},
};
for (i = 0; ifra[i].req_name; i++) {
strncpy(ifr.ifr_name, master_ifname, IFNAMSIZ);
res = ioctl(skfd, ifra[i].g_ioctl, &ifr);
if (res < 0) {
int saved_errno = errno;
v_print("Interface '%s': Error: SIOCG%s failed: %s\n",
master_ifname, ifra[i].req_name,
strerror(saved_errno));
ifr.ifr_addr.sa_family = AF_INET;
memset(ifr.ifr_addr.sa_data, 0,
sizeof(ifr.ifr_addr.sa_data));
}
strncpy(ifr.ifr_name, slave_ifname, IFNAMSIZ);
res = ioctl(skfd, ifra[i].s_ioctl, &ifr);
if (res < 0) {
int saved_errno = errno;
v_print("Interface '%s': Error: SIOCS%s failed: %s\n",
slave_ifname, ifra[i].req_name,
strerror(saved_errno));
return res;
}
/* For AF_INET, sa_data starts with the 2-byte port; the address follows */
ipaddr = (unsigned char *)ifr.ifr_addr.sa_data + 2;
v_print("Interface '%s': set IP %s to %d.%d.%d.%d\n",
slave_ifname, ifra[i].desc,
ipaddr[0], ipaddr[1], ipaddr[2], ipaddr[3]);
}
return 0;
}
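set_if_addr() prints the address held in `ifr.ifr_addr`, which is a generic `struct sockaddr`. For AF_INET the four address bytes live in the `sin_addr` field of `struct sockaddr_in`, after the two-byte port, rather than at the start of `sa_data`. A minimal sketch of reading them through the typed structure follows; `format_ipv4` is a hypothetical helper, not an ifenslave function.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: format the IPv4 address stored in a generic
 * sockaddr as a dotted quad. Returns 0 on success, or -1 if the
 * address family is not AF_INET. */
static int format_ipv4(const struct sockaddr *sa, char *out, size_t outlen)
{
	struct sockaddr_in sin;
	const unsigned char *b;

	assert(sa != NULL && out != NULL && outlen > 0);
	if (sa->sa_family != AF_INET)
		return -1;
	/* Copy instead of casting to sidestep alignment and aliasing. */
	memcpy(&sin, sa, sizeof(sin));
	b = (const unsigned char *)&sin.sin_addr;
	snprintf(out, outlen, "%d.%d.%d.%d", b[0], b[1], b[2], b[3]);
	return 0;
}
```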
 /*
 * Local variables:
@@ -768,3 +1107,4 @@ static int get_abi_ver(char *master_ifname)
 * compile-command: "gcc -Wall -Wstrict-prototypes -O -I/usr/src/linux/include ifenslave.c -o ifenslave"
 * End:
 */
@@ -47,8 +47,13 @@
 * - Send LACPDU as highest priority packet to further fix the above
 * problem on very high Tx traffic load where packets may get dropped
 * by the slave.
+*
+* 2003/09/24 - Shmulik Hen <shmulik.hen at intel dot com>
+* - Code cleanup and style changes
 */
+//#define BONDING_DEBUG 1
 #include <linux/skbuff.h>
 #include <linux/if_ether.h>
 #include <linux/netdevice.h>
@@ -119,6 +124,7 @@
 static struct mac_addr null_mac_addr = {{0, 0, 0, 0, 0, 0}};
 static u16 ad_ticks_per_sec;
+static const int ad_delta_in_ticks = (AD_TIMER_INTERVAL * HZ) / 1000;
 // ================= 3AD api to bonding and kernel code ==================
 static u16 __get_link_speed(struct port *port);
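The new `ad_delta_in_ticks` constant converts a millisecond interval into timer ticks once, instead of recomputing it on every re-arm. The sketch below shows the arithmetic in isolation. `TIMER_INTERVAL_MS` is set to 100 as an assumption (the diff only says the handler runs "about every 100ms"), and `delta_in_ticks` is a hypothetical function that takes the tick rate as a parameter rather than using the kernel's `HZ`.

```c
#include <assert.h>

/* Assumed interval; stands in for AD_TIMER_INTERVAL (milliseconds). */
#define TIMER_INTERVAL_MS 100

/* Convert the interval to timer ticks for a tick rate of hz per second.
 * Multiplying before dividing matters: hz / 1000 alone would truncate
 * to 0 for any tick rate below 1000. */
static int delta_in_ticks(int hz)
{
	assert(hz > 0);
	return (TIMER_INTERVAL_MS * hz) / 1000;
}
```

Integer division still truncates, so a tick rate that is not a multiple of 10 gives a slightly shorter period than 100 ms.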
@@ -196,13 +202,11 @@ static inline struct bonding *__get_bond_by_port(struct port *port)
 */
 static inline struct port *__get_first_port(struct bonding *bond)
 {
-struct slave *slave = bond->next;
-if (slave == (struct slave *)bond) {
+if (bond->slave_cnt == 0) {
 return NULL;
 }
-return &(SLAVE_AD_INFO(slave).port);
+return &(SLAVE_AD_INFO(bond->first_slave).port);
 }
 /**
@@ -218,7 +222,7 @@ static inline struct port *__get_next_port(struct port *port)
 struct slave *slave = port->slave;
 // If there's no bond for this port, or this is the last slave
-if ((bond == NULL) || (slave->next == bond->next)) {
+if ((bond == NULL) || (slave->next == bond->first_slave)) {
 return NULL;
 }
@@ -236,12 +240,12 @@ static inline struct aggregator *__get_first_agg(struct port *port)
 {
 struct bonding *bond = __get_bond_by_port(port);
-// If there's no bond for this port, or this is the last slave
-if ((bond == NULL) || (bond->next == (struct slave *)bond)) {
+// If there's no bond for this port, or bond has no slaves
+if ((bond == NULL) || (bond->slave_cnt == 0)) {
 return NULL;
 }
-return &(SLAVE_AD_INFO(bond->next).aggregator);
+return &(SLAVE_AD_INFO(bond->first_slave).aggregator);
 }
 /**
@@ -257,7 +261,7 @@ static inline struct aggregator *__get_next_agg(struct aggregator *aggregator)
 struct bonding *bond = bond_get_bond_by_slave(slave);
 // If there's no bond for this aggregator, or this is the last slave
-if ((bond == NULL) || (slave->next == bond->first_slave)) {
+if ((bond == NULL) || (slave->next == bond->first_slave)) {
 return NULL;
 }
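The accessor changes above suggest that slaves sit on a circular list: the walk ends when the next element would be `bond->first_slave`, and an empty bond is detected through `bond->slave_cnt` instead of comparing against the bond itself. A minimal sketch of that traversal, using hypothetical stripped-down types (`struct node`, `struct ring`) rather than the driver's own structures:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical element on a circular, singly linked list. */
struct node {
	struct node *next;
	int id;
};

/* Hypothetical list head. */
struct ring {
	struct node *first;	/* plays the role of bond->first_slave */
	int cnt;		/* plays the role of bond->slave_cnt */
};

/* First element, or NULL when the ring is empty. */
static struct node *ring_first(const struct ring *r)
{
	return r->cnt == 0 ? NULL : r->first;
}

/* Next element, or NULL once the walk would wrap to the first one. */
static struct node *ring_next(const struct ring *r, const struct node *n)
{
	return n->next == r->first ? NULL : n->next;
}

/* Visit every element exactly once and add up the ids. */
static int ring_sum(const struct ring *r)
{
	struct node *n;
	int sum = 0;

	assert(r != NULL);
	for (n = ring_first(r); n; n = ring_next(r, n))
		sum += n->id;
	return sum;
}
```

Keeping an explicit count means the head no longer has to double as a list element, which is what the removed `(struct slave *)bond` casts relied on.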
@@ -392,7 +396,7 @@ static u16 __get_link_speed(struct port *port)
 }
 }
-BOND_PRINT_DBG(("Port %d Received link speed %d update from adapter", port->actor_port_number, speed));
+dprintk("Port %d Received link speed %d update from adapter\n", port->actor_port_number, speed);
 return speed;
 }
@@ -418,12 +422,12 @@ static u8 __get_duplex(struct port *port)
 switch (slave->duplex) {
 case DUPLEX_FULL:
 retval=0x1;
-BOND_PRINT_DBG(("Port %d Received status full duplex update from adapter", port->actor_port_number));
+dprintk("Port %d Received status full duplex update from adapter\n", port->actor_port_number);
 break;
 case DUPLEX_HALF:
 default:
 retval=0x0;
-BOND_PRINT_DBG(("Port %d Received status NOT full duplex update from adapter", port->actor_port_number));
+dprintk("Port %d Received status NOT full duplex update from adapter\n", port->actor_port_number);
 break;
 }
 }
@@ -1059,7 +1063,7 @@ static void ad_mux_machine(struct port *port)
 // check if the state machine was changed
 if (port->sm_mux_state != last_state) {
-BOND_PRINT_DBG(("Mux Machine: Port=%d, Last State=%d, Curr State=%d", port->actor_port_number, last_state, port->sm_mux_state));
+dprintk("Mux Machine: Port=%d, Last State=%d, Curr State=%d\n", port->actor_port_number, last_state, port->sm_mux_state);
 switch (port->sm_mux_state) {
 case AD_MUX_DETACHED:
 __detach_bond_from_agg(port);
@@ -1158,7 +1162,7 @@ static void ad_rx_machine(struct lacpdu *lacpdu, struct port *port)
 // check if the State machine was changed or new lacpdu arrived
 if ((port->sm_rx_state != last_state) || (lacpdu)) {
-BOND_PRINT_DBG(("Rx Machine: Port=%d, Last State=%d, Curr State=%d", port->actor_port_number, last_state, port->sm_rx_state));
+dprintk("Rx Machine: Port=%d, Last State=%d, Curr State=%d\n", port->actor_port_number, last_state, port->sm_rx_state);
 switch (port->sm_rx_state) {
 case AD_RX_INITIALIZE:
 if (!(port->actor_oper_port_key & AD_DUPLEX_KEY_BITS)) {
@@ -1204,7 +1208,7 @@ static void ad_rx_machine(struct lacpdu *lacpdu, struct port *port)
 // detect loopback situation
 if (!MAC_ADDRESS_COMPARE(&(lacpdu->actor_system), &(port->actor_system))) {
 // INFO_RECEIVED_LOOPBACK_FRAMES
-printk(KERN_ERR "bonding: An illegal loopback occurred on adapter (%s)\n",
+printk(KERN_ERR DRV_NAME ": An illegal loopback occurred on adapter (%s)\n",
 port->slave->dev->name);
 printk(KERN_ERR "Check the configuration to verify that all Adapters "
 "are connected to 802.3ad compliant switch ports\n");
@@ -1245,7 +1249,7 @@ static void ad_tx_machine(struct port *port)
 __update_lacpdu_from_port(port);
 // send the lacpdu
 if (ad_lacpdu_send(port) >= 0) {
-BOND_PRINT_DBG(("Sent LACPDU on port %d", port->actor_port_number));
+dprintk("Sent LACPDU on port %d\n", port->actor_port_number);
 // mark ntt as false, so it will not be sent again until demanded
 port->ntt = 0;
 }
@@ -1318,7 +1322,7 @@ static void ad_periodic_machine(struct port *port)
 // check if the state machine was changed
 if (port->sm_periodic_state != last_state) {
-BOND_PRINT_DBG(("Periodic Machine: Port=%d, Last State=%d, Curr State=%d", port->actor_port_number, last_state, port->sm_periodic_state));
+dprintk("Periodic Machine: Port=%d, Last State=%d, Curr State=%d\n", port->actor_port_number, last_state, port->sm_periodic_state);
 switch (port->sm_periodic_state) {
 case AD_NO_PERIODIC:
 port->sm_periodic_timer_counter = 0; // zero timer
@@ -1375,7 +1379,7 @@ static void ad_port_selection_logic(struct port *port)
 port->next_port_in_aggregator=NULL;
 port->actor_port_aggregator_identifier=0;
-BOND_PRINT_DBG(("Port %d left LAG %d", port->actor_port_number, temp_aggregator->aggregator_identifier));
+dprintk("Port %d left LAG %d\n", port->actor_port_number, temp_aggregator->aggregator_identifier);
 // if the aggregator is empty, clear its parameters, and set it ready to be attached
 if (!temp_aggregator->lag_ports) {
 ad_clear_agg(temp_aggregator);
@@ -1384,7 +1388,7 @@ static void ad_port_selection_logic(struct port *port)
 }
 }
 if (!curr_port) { // meaning: the port was related to an aggregator but was not on the aggregator port list
-printk(KERN_WARNING "bonding: Warning: Port %d (on %s) was "
+printk(KERN_WARNING DRV_NAME ": Warning: Port %d (on %s) was "
 "related to aggregator %d but was not on its port list\n",
 port->actor_port_number, port->slave->dev->name,
 port->aggregator->aggregator_identifier);
@@ -1417,7 +1421,7 @@ static void ad_port_selection_logic(struct port *port)
 port->next_port_in_aggregator=aggregator->lag_ports;
 port->aggregator->num_of_ports++;
 aggregator->lag_ports=port;
-BOND_PRINT_DBG(("Port %d joined LAG %d(existing LAG)", port->actor_port_number, port->aggregator->aggregator_identifier));
+dprintk("Port %d joined LAG %d(existing LAG)\n", port->actor_port_number, port->aggregator->aggregator_identifier);
 // mark this port as selected
 port->sm_vars |= AD_PORT_SELECTED;
@@ -1454,9 +1458,9 @@ static void ad_port_selection_logic(struct port *port)
 // mark this port as selected
 port->sm_vars |= AD_PORT_SELECTED;
-BOND_PRINT_DBG(("Port %d joined LAG %d(new LAG)", port->actor_port_number, port->aggregator->aggregator_identifier));
+dprintk("Port %d joined LAG %d(new LAG)\n", port->actor_port_number, port->aggregator->aggregator_identifier);
 } else {
-printk(KERN_ERR "bonding: Port %d (on %s) did not find a suitable aggregator\n",
+printk(KERN_ERR DRV_NAME ": Port %d (on %s) did not find a suitable aggregator\n",
 port->actor_port_number, port->slave->dev->name);
 }
 }
@@ -1580,30 +1584,30 @@ static void ad_agg_selection_logic(struct aggregator *aggregator)
 aggregator;
 aggregator = __get_next_agg(aggregator)) {
-BOND_PRINT_DBG(("Agg=%d; Ports=%d; a key=%d; p key=%d; Indiv=%d; Active=%d",
+dprintk("Agg=%d; Ports=%d; a key=%d; p key=%d; Indiv=%d; Active=%d\n",
 aggregator->aggregator_identifier, aggregator->num_of_ports,
 aggregator->actor_oper_aggregator_key, aggregator->partner_oper_aggregator_key,
-aggregator->is_individual, aggregator->is_active));
+aggregator->is_individual, aggregator->is_active);
 }
 // check if any partner replys
 if (best_aggregator->is_individual) {
-printk(KERN_WARNING "bonding: Warning: No 802.3ad response from the link partner "
+printk(KERN_WARNING DRV_NAME ": Warning: No 802.3ad response from the link partner "
 "for any adapters in the bond\n");
 }
 // check if there are more than one aggregator
 if (num_of_aggs > 1) {
-BOND_PRINT_DBG(("Warning: More than one Link Aggregation Group was "
-"found in the bond. Only one group will function in the bond"));
+dprintk("Warning: More than one Link Aggregation Group was "
+"found in the bond. Only one group will function in the bond\n");
 }
 best_aggregator->is_active = 1;
-BOND_PRINT_DBG(("LAG %d choosed as the active LAG", best_aggregator->aggregator_identifier));
-BOND_PRINT_DBG(("Agg=%d; Ports=%d; a key=%d; p key=%d; Indiv=%d; Active=%d",
+dprintk("LAG %d choosed as the active LAG\n", best_aggregator->aggregator_identifier);
+dprintk("Agg=%d; Ports=%d; a key=%d; p key=%d; Indiv=%d; Active=%d\n",
 best_aggregator->aggregator_identifier, best_aggregator->num_of_ports,
 best_aggregator->actor_oper_aggregator_key, best_aggregator->partner_oper_aggregator_key,
-best_aggregator->is_individual, best_aggregator->is_active));
+best_aggregator->is_individual, best_aggregator->is_active);
 // disable the ports that were related to the former active_aggregator
 if (last_active_aggregator) {
@@ -1644,7 +1648,7 @@ static void ad_clear_agg(struct aggregator *aggregator)
 aggregator->lag_ports = NULL;
 aggregator->is_active = 0;
 aggregator->num_of_ports = 0;
-BOND_PRINT_DBG(("LAG %d was cleared", aggregator->aggregator_identifier));
+dprintk("LAG %d was cleared\n", aggregator->aggregator_identifier);
 }
 }
@@ -1729,7 +1733,7 @@ static void ad_initialize_port(struct port *port, int lacp_fast)
 static void ad_enable_collecting_distributing(struct port *port)
 {
 if (port->aggregator->is_active) {
-BOND_PRINT_DBG(("Enabling port %d(LAG %d)", port->actor_port_number, port->aggregator->aggregator_identifier));
+dprintk("Enabling port %d(LAG %d)\n", port->actor_port_number, port->aggregator->aggregator_identifier);
 __enable_port(port);
 }
 }
@@ -1742,7 +1746,7 @@ static void ad_enable_collecting_distributing(struct port *port)
 static void ad_disable_collecting_distributing(struct port *port)
 {
 if (port->aggregator && MAC_ADDRESS_COMPARE(&(port->aggregator->partner_system), &(null_mac_addr))) {
-BOND_PRINT_DBG(("Disabling port %d(LAG %d)", port->actor_port_number, port->aggregator->aggregator_identifier));
+dprintk("Disabling port %d(LAG %d)\n", port->actor_port_number, port->aggregator->aggregator_identifier);
 __disable_port(port);
 }
 }
@@ -1780,7 +1784,7 @@ static void ad_marker_info_send(struct port *port)
 // send the marker information
 if (ad_marker_send(port, &marker) >= 0) {
-BOND_PRINT_DBG(("Sent Marker Information on port %d", port->actor_port_number));
+dprintk("Sent Marker Information on port %d\n", port->actor_port_number);
 }
 }
 #endif
@@ -1803,7 +1807,7 @@ static void ad_marker_info_received(struct marker *marker_info,struct port *port
 // send the marker response
 if (ad_marker_send(port, &marker) >= 0) {
-BOND_PRINT_DBG(("Sent Marker Response on port %d", port->actor_port_number));
+dprintk("Sent Marker Response on port %d\n", port->actor_port_number);
 }
 }
@@ -1890,13 +1894,13 @@ static u16 aggregator_identifier;
 void bond_3ad_initialize(struct bonding *bond, u16 tick_resolution, int lacp_fast)
 {
 // check that the bond is not initialized yet
-if (MAC_ADDRESS_COMPARE(&(BOND_AD_INFO(bond).system.sys_mac_addr), &(bond->device->dev_addr))) {
+if (MAC_ADDRESS_COMPARE(&(BOND_AD_INFO(bond).system.sys_mac_addr), &(bond->dev->dev_addr))) {
 aggregator_identifier = 0;
 BOND_AD_INFO(bond).lacp_fast = lacp_fast;
 BOND_AD_INFO(bond).system.sys_priority = 0xFFFF;
-BOND_AD_INFO(bond).system.sys_mac_addr = *((struct mac_addr *)bond->device->dev_addr);
+BOND_AD_INFO(bond).system.sys_mac_addr = *((struct mac_addr *)bond->dev->dev_addr);
 // initialize how many times this module is called in one second(should be about every 100ms)
 ad_ticks_per_sec = tick_resolution;
@@ -1921,7 +1925,7 @@ int bond_3ad_bind_slave(struct slave *slave)
 struct aggregator *aggregator;
 if (bond == NULL) {
-printk(KERN_CRIT "The slave %s is not attached to its bond\n", slave->dev->name);
+printk(KERN_ERR "The slave %s is not attached to its bond\n", slave->dev->name);
 return -1;
 }
@@ -1964,7 +1968,7 @@ int bond_3ad_bind_slave(struct slave *slave)
 ad_initialize_agg(aggregator);
-aggregator->aggregator_mac_address = *((struct mac_addr *)bond->device->dev_addr);
+aggregator->aggregator_mac_address = *((struct mac_addr *)bond->dev->dev_addr);
 aggregator->aggregator_identifier = (++aggregator_identifier);
 aggregator->slave = slave;
 aggregator->is_active = 0;
@@ -1996,11 +2000,11 @@ void bond_3ad_unbind_slave(struct slave *slave)
 // if slave is null, the whole port is not initialized
 if (!port->slave) {
-printk(KERN_WARNING "bonding: Trying to unbind an uninitialized port on %s\n", slave->dev->name);
+printk(KERN_WARNING DRV_NAME ": Trying to unbind an uninitialized port on %s\n", slave->dev->name);
 return;
 }
-BOND_PRINT_DBG(("Unbinding Link Aggregation Group %d", aggregator->aggregator_identifier));
+dprintk("Unbinding Link Aggregation Group %d\n", aggregator->aggregator_identifier);
 /* Tell the partner that this port is not suitable for aggregation */
 port->actor_oper_port_state &= ~AD_STATE_AGGREGATION;
@@ -2024,10 +2028,10 @@ void bond_3ad_unbind_slave(struct slave *slave)
 // if new aggregator found, copy the aggregator's parameters
 // and connect the related lag_ports to the new aggregator
 if ((new_aggregator) && ((!new_aggregator->lag_ports) || ((new_aggregator->lag_ports == port) && !new_aggregator->lag_ports->next_port_in_aggregator))) {
-BOND_PRINT_DBG(("Some port(s) related to LAG %d - replaceing with LAG %d", aggregator->aggregator_identifier, new_aggregator->aggregator_identifier));
+dprintk("Some port(s) related to LAG %d - replaceing with LAG %d\n", aggregator->aggregator_identifier, new_aggregator->aggregator_identifier);
 if ((new_aggregator->lag_ports == port) && new_aggregator->is_active) {
-printk(KERN_INFO "bonding: Removing an active aggregator\n");
+printk(KERN_INFO DRV_NAME ": Removing an active aggregator\n");
 // select new active aggregator
 select_new_active_agg = 1;
 }
@@ -2057,7 +2061,7 @@ void bond_3ad_unbind_slave(struct slave *slave)
 ad_agg_selection_logic(__get_first_agg(port));
 }
 } else {
-printk(KERN_WARNING "bonding: Warning: unbinding aggregator, "
+printk(KERN_WARNING DRV_NAME ": Warning: unbinding aggregator, "
 "and could not find a new aggregator for its ports\n");
 }
 } else { // in case that the only port related to this aggregator is the one we want to remove
@@ -2072,7 +2076,7 @@ void bond_3ad_unbind_slave(struct slave *slave)
 }
 }
-BOND_PRINT_DBG(("Unbinding port %d", port->actor_port_number));
+dprintk("Unbinding port %d\n", port->actor_port_number);
 // find the aggregator that this port is connected to
 temp_aggregator = __get_first_agg(port);
 for (; temp_aggregator; temp_aggregator = __get_next_agg(temp_aggregator)) {
@@ -2123,13 +2127,13 @@ void bond_3ad_state_machine_handler(struct bonding *bond)
 read_lock(&bond->lock);
-//check if there are any slaves
-if (bond->next == (struct slave *)bond) {
-goto end;
+if (bond->kill_timers) {
+goto out;
 }
-if ((bond->device->flags & IFF_UP) != IFF_UP) {
-goto end;
+//check if there are any slaves
+if (bond->slave_cnt == 0) {
+goto re_arm;
 }
 // check if agg_select_timer timer after initialize is timed out
@@ -2137,8 +2141,8 @@ void bond_3ad_state_machine_handler(struct bonding *bond)
 // select the active aggregator for the bond
 if ((port = __get_first_port(bond))) {
 if (!port->slave) {
-printk(KERN_WARNING "bonding: Warning: bond's first port is uninitialized\n");
-goto end;
+printk(KERN_WARNING DRV_NAME ": Warning: bond's first port is uninitialized\n");
+goto re_arm;
 }
 aggregator = __get_first_agg(port);
@@ -2149,8 +2153,8 @@ void bond_3ad_state_machine_handler(struct bonding *bond)
 // for each port run the state machines
 for (port = __get_first_port(bond); port; port = __get_next_port(port)) {
 if (!port->slave) {
-printk(KERN_WARNING "bonding: Warning: Found an uninitialized port\n");
-goto end;
+printk(KERN_WARNING DRV_NAME ": Warning: Found an uninitialized port\n");
+goto re_arm;
 }
 ad_rx_machine(NULL, port);
@@ -2165,14 +2169,10 @@ void bond_3ad_state_machine_handler(struct bonding *bond)
 		}
 	}
-end:
+re_arm:
+	mod_timer(&(BOND_AD_INFO(bond).ad_timer), jiffies + ad_delta_in_ticks);
+out:
 	read_unlock(&bond->lock);
-	if ((bond->device->flags & IFF_UP) == IFF_UP) {
-		/* re-arm the timer */
-		mod_timer(&(BOND_AD_INFO(bond).ad_timer), jiffies + (AD_TIMER_INTERVAL * HZ / 1000));
-	}
 }
 /**
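Review note: the handler now has two exit labels. `re_arm` re-arms the timer and then falls through to `out`, while `out` only drops the lock, so a bond whose `kill_timers` flag is set stops polling. The user-space sketch below models just that control flow; `struct fake_bond` and `fake_state_machine_handler()` are hypothetical stand-ins, not driver code, and the integer fields merely model the lock and the timer.

```c
#include <assert.h>

/* Hypothetical stand-in for struct bonding; only the fields the sketch needs. */
struct fake_bond {
	int kill_timers;  /* set when the bond is being torn down */
	int slave_cnt;    /* number of enslaved devices */
	int locked;       /* models read_lock()/read_unlock() */
	int timer_armed;  /* models mod_timer() having been called */
	int passes;       /* how many times the state machines ran */
};

static void fake_state_machine_handler(struct fake_bond *bond)
{
	bond->locked = 1;
	bond->timer_armed = 0;

	if (bond->kill_timers)
		goto out;       /* shutting down: leave the timer unarmed */

	if (bond->slave_cnt == 0)
		goto re_arm;    /* nothing to do now, but keep polling */

	bond->passes++;         /* stands in for running the state machines */

re_arm:
	assert(bond->locked);   /* the timer is re-armed while the lock is held */
	bond->timer_armed = 1;
out:
	bond->locked = 0;
}
```

Ordering the labels this way keeps a single unlock site; every early exit picks how much of the tail it wants by choosing a label.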
@@ -2194,14 +2194,14 @@ void bond_3ad_rx_indication(struct lacpdu *lacpdu, struct slave *slave, u16 leng
 		port = &(SLAVE_AD_INFO(slave).port);
 		if (!port->slave) {
-			printk(KERN_WARNING "bonding: Warning: port of slave %s is uninitialized\n", slave->dev->name);
+			printk(KERN_WARNING DRV_NAME ": Warning: port of slave %s is uninitialized\n", slave->dev->name);
 			return;
 		}
 		switch (lacpdu->subtype) {
 		case AD_TYPE_LACPDU:
 			__ntohs_lacpdu(lacpdu);
-			BOND_PRINT_DBG(("Received LACPDU on port %d", port->actor_port_number));
+			dprintk("Received LACPDU on port %d\n", port->actor_port_number);
 			ad_rx_machine(lacpdu, port);
 			break;
@@ -2210,17 +2210,17 @@ void bond_3ad_rx_indication(struct lacpdu *lacpdu, struct slave *slave, u16 leng
 			switch (((struct marker *)lacpdu)->tlv_type) {
 			case AD_MARKER_INFORMATION_SUBTYPE:
-				BOND_PRINT_DBG(("Received Marker Information on port %d", port->actor_port_number));
+				dprintk("Received Marker Information on port %d\n", port->actor_port_number);
 				ad_marker_info_received((struct marker *)lacpdu, port);
 				break;
 			case AD_MARKER_RESPONSE_SUBTYPE:
-				BOND_PRINT_DBG(("Received Marker Response on port %d", port->actor_port_number));
+				dprintk("Received Marker Response on port %d\n", port->actor_port_number);
 				ad_marker_response_received((struct marker *)lacpdu, port);
 				break;
 			default:
-				BOND_PRINT_DBG(("Received an unknown Marker subtype on slot %d", port->actor_port_number));
+				dprintk("Received an unknown Marker subtype on slot %d\n", port->actor_port_number);
 			}
 		}
 	}
@@ -2240,14 +2240,14 @@ void bond_3ad_adapter_speed_changed(struct slave *slave)
 	// if slave is null, the whole port is not initialized
 	if (!port->slave) {
-		printk(KERN_WARNING "bonding: Warning: speed changed for uninitialized port on %s\n",
+		printk(KERN_WARNING DRV_NAME ": Warning: speed changed for uninitialized port on %s\n",
 		       slave->dev->name);
 		return;
 	}
 	port->actor_admin_port_key &= ~AD_SPEED_KEY_BITS;
 	port->actor_oper_port_key=port->actor_admin_port_key |= (__get_link_speed(port) << 1);
-	BOND_PRINT_DBG(("Port %d changed speed", port->actor_port_number));
+	dprintk("Port %d changed speed\n", port->actor_port_number);
 	// there is no need to reselect a new aggregator, just signal the
 	// state machines to reinitialize
 	port->sm_vars |= AD_PORT_BEGIN;
@@ -2267,14 +2267,14 @@ void bond_3ad_adapter_duplex_changed(struct slave *slave)
 	// if slave is null, the whole port is not initialized
 	if (!port->slave) {
-		printk(KERN_WARNING "bonding: Warning: duplex changed for uninitialized port on %s\n",
+		printk(KERN_WARNING DRV_NAME ": Warning: duplex changed for uninitialized port on %s\n",
 		       slave->dev->name);
 		return;
 	}
 	port->actor_admin_port_key &= ~AD_DUPLEX_KEY_BITS;
 	port->actor_oper_port_key=port->actor_admin_port_key |= __get_duplex(port);
-	BOND_PRINT_DBG(("Port %d changed duplex", port->actor_port_number));
+	dprintk("Port %d changed duplex\n", port->actor_port_number);
 	// there is no need to reselect a new aggregator, just signal the
 	// state machines to reinitialize
 	port->sm_vars |= AD_PORT_BEGIN;
@@ -2295,10 +2295,8 @@ void bond_3ad_handle_link_change(struct slave *slave, char link)
 	// if slave is null, the whole port is not initialized
 	if (!port->slave) {
-#ifdef BONDING_DEBUG
-		printk(KERN_WARNING "bonding: Warning: link status changed for uninitialized port on %s\n",
-		       slave->dev->name);
-#endif
+		printk(KERN_WARNING DRV_NAME ": Warning: link status changed for uninitialized port on %s\n",
+		       slave->dev->name);
 		return;
 	}
@@ -2356,41 +2354,27 @@ int bond_3ad_get_active_agg_info(struct bonding *bond, struct ad_info *ad_info)
 int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev)
 {
-	slave_t *slave, *start_at;
-	struct bonding *bond = (struct bonding *) dev->priv;
+	struct slave *slave, *start_at;
+	struct bonding *bond = dev->priv;
 	struct ethhdr *data = (struct ethhdr *)skb->data;
 	int slave_agg_no;
 	int slaves_in_agg;
 	int agg_id;
+	int i;
 	struct ad_info ad_info;
-	if (!IS_UP(dev)) { /* bond down */
-		dev_kfree_skb(skb);
-		return 0;
-	}
-	if (bond == NULL) {
-		printk(KERN_CRIT "bonding: Error: bond is NULL on device %s\n", dev->name);
-		dev_kfree_skb(skb);
-		return 0;
-	}
+	/* make sure that the slaves list will
+	 * not change during tx
+	 */
 	read_lock(&bond->lock);
-	slave = bond->prev;
-	/* check if bond is empty */
-	if ((slave == (struct slave *) bond) || (bond->slave_cnt == 0)) {
-		printk(KERN_DEBUG "ERROR: bond is empty\n");
-		dev_kfree_skb(skb);
-		read_unlock(&bond->lock);
-		return 0;
-	}
+	if (!BOND_IS_OK(bond)) {
+		goto free_out;
+	}
 	if (bond_3ad_get_active_agg_info(bond, &ad_info)) {
 		printk(KERN_DEBUG "ERROR: bond_3ad_get_active_agg_info failed\n");
-		dev_kfree_skb(skb);
-		read_unlock(&bond->lock);
-		return 0;
+		goto free_out;
 	}
 	slaves_in_agg = ad_info.ports;
@@ -2399,21 +2383,12 @@ int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev)
 	if (slaves_in_agg == 0) {
 		/*the aggregator is empty*/
 		printk(KERN_DEBUG "ERROR: active aggregator is empty\n");
-		dev_kfree_skb(skb);
-		read_unlock(&bond->lock);
-		return 0;
+		goto free_out;
 	}
-	/* we're at the root, get the first slave */
-	if ((slave == NULL) || (slave->dev == NULL)) {
-		/* no suitable interface, frame not sent */
-		dev_kfree_skb(skb);
-		read_unlock(&bond->lock);
-		return 0;
-	}
-	slave_agg_no = (data->h_dest[5]^slave->dev->dev_addr[5]) % slaves_in_agg;
-	while (slave != (slave_t *)bond) {
+	slave_agg_no = (data->h_dest[5]^bond->dev->dev_addr[5]) % slaves_in_agg;
+	bond_for_each_slave(bond, slave, i) {
 		struct aggregator *agg = SLAVE_AD_INFO(slave).port.aggregator;
 		if (agg && (agg->aggregator_identifier == agg_id)) {
@@ -2422,37 +2397,18 @@ int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev)
 				break;
 			}
 		}
-		slave = slave->prev;
-		if (slave == NULL) {
-			printk(KERN_ERR "bonding: Error: slave is NULL\n");
-			dev_kfree_skb(skb);
-			read_unlock(&bond->lock);
-			return 0;
-		}
 	}
-	if (slave == (slave_t *)bond) {
-		printk(KERN_ERR "bonding: Error: Couldn't find a slave to tx on for aggregator ID %d\n", agg_id);
-		dev_kfree_skb(skb);
-		read_unlock(&bond->lock);
-		return 0;
+	if (slave_agg_no >= 0) {
+		printk(KERN_ERR DRV_NAME ": Error: Couldn't find a slave to tx on for aggregator ID %d\n", agg_id);
+		goto free_out;
 	}
 	start_at = slave;
-	do {
+	bond_for_each_slave_from(bond, slave, i, start_at) {
 		int slave_agg_id = 0;
-		struct aggregator *agg;
-		if (slave == NULL) {
-			printk(KERN_ERR "bonding: Error: slave is NULL\n");
-			dev_kfree_skb(skb);
-			read_unlock(&bond->lock);
-			return 0;
-		}
-		agg = SLAVE_AD_INFO(slave).port.aggregator;
+		struct aggregator *agg = SLAVE_AD_INFO(slave).port.aggregator;
 		if (agg) {
 			slave_agg_id = agg->aggregator_identifier;
@@ -2463,20 +2419,24 @@ int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev)
 			skb->dev = slave->dev;
 			skb->priority = 1;
 			dev_queue_xmit(skb);
-			read_unlock(&bond->lock);
-			return 0;
+			goto out;
 		}
-	} while ((slave = slave->next) != start_at);
+	}
-	/* no suitable interface, frame not sent */
-	dev_kfree_skb(skb);
+out:
 	read_unlock(&bond->lock);
 	return 0;
+free_out:
+	/* no suitable interface, frame not sent */
+	dev_kfree_skb(skb);
+	goto out;
 }
 int bond_3ad_lacpdu_recv(struct sk_buff *skb, struct net_device *dev, struct packet_type* ptype)
 {
-	struct bonding *bond = (struct bonding *)dev->priv;
+	struct bonding *bond = dev->priv;
 	struct slave *slave = NULL;
 	int ret = NET_RX_DROP;
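Review note: after this change, `bond_3ad_xmit_xor()` derives `slave_agg_no` from the bond's own MAC address instead of the address of whichever slave happened to be last in the list. The selection itself is an XOR of the last address bytes reduced modulo the number of ports in the active aggregator. A minimal user-space sketch of that arithmetic follows; `xor_slave_index()` is a hypothetical helper name and `ETH_ALEN` is redefined locally so the block builds outside the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define ETH_ALEN 6 /* length of an Ethernet MAC address in bytes */

/*
 * Hypothetical helper: index of the aggregator port to transmit on,
 * computed the same way as slave_agg_no in the new code.
 */
static int xor_slave_index(const uint8_t dest[ETH_ALEN],
			   const uint8_t bond_addr[ETH_ALEN],
			   int slaves_in_agg)
{
	/* The driver bails out earlier when the aggregator is empty. */
	assert(slaves_in_agg > 0);

	return (dest[ETH_ALEN - 1] ^ bond_addr[ETH_ALEN - 1]) % slaves_in_agg;
}
```

Because only the last byte of each address takes part, all frames for a given destination map to the same port, which preserves per-destination ordering.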
......
@@ -28,6 +28,9 @@
  * 2003/05/01 - Shmulik Hen <shmulik.hen at intel dot com>
  *	- Renamed bond_3ad_link_status_changed() to
  *	  bond_3ad_handle_link_change() for compatibility with TLB.
+ *
+ * 2003/09/24 - Shmulik Hen <shmulik.hen at intel dot com>
+ *	- Code cleanup and style changes
  */
 #ifndef __BOND_3AD_H__
......
@@ -28,8 +28,13 @@
  * 2003/08/06 - Amir Noam <amir.noam at intel dot com>
  *	- Add support for setting bond's MAC address with special
  *	  handling required for ALB/TLB.
+ *
+ * 2003/09/24 - Shmulik Hen <shmulik.hen at intel dot com>
+ *	- Code cleanup and style changes
  */
+//#define BONDING_DEBUG 1
 #include <linux/skbuff.h>
 #include <linux/netdevice.h>
 #include <linux/etherdevice.h>
@@ -50,11 +55,11 @@
 #define ALB_TIMER_TICKS_PER_SEC 10 /* should be a divisor of HZ */
-#define BOND_TLB_REBALANCE_INTERVAL 10 /* in seconds, periodic re-balancing
-					 * used for division - never set
+#define BOND_TLB_REBALANCE_INTERVAL 10 /* In seconds, periodic re-balancing.
+					 * Used for division - never set
 					 * to zero !!!
 					 */
-#define BOND_ALB_LP_INTERVAL 1 /* in seconds periodic send of
+#define BOND_ALB_LP_INTERVAL 1 /* In seconds, periodic send of
 					 * learning packets to the switch
 					 */
@@ -66,7 +71,7 @@
 #define TLB_HASH_TABLE_SIZE 256 /* The size of the clients hash table.
 				 * Note that this value MUST NOT be smaller
-				 * because the key hash table BYTE wide !
+				 * because the key hash table is BYTE wide !
 				 */
@@ -86,12 +91,15 @@
  */
 #define RLB_PROMISC_TIMEOUT 10*ALB_TIMER_TICKS_PER_SEC
+static const u8 mac_bcast[ETH_ALEN] = {0xff,0xff,0xff,0xff,0xff,0xff};
+static const int alb_delta_in_ticks = HZ / ALB_TIMER_TICKS_PER_SEC;
 #pragma pack(1)
 struct learning_pkt {
 	u8 mac_dst[ETH_ALEN];
 	u8 mac_src[ETH_ALEN];
 	u16 type;
-	u8 padding[ETH_ZLEN - (2*ETH_ALEN + 2)];
+	u8 padding[ETH_ZLEN - ETH_HLEN];
 };
 struct arp_pkt {
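Review note: the `padding` change is cosmetic. `ETH_HLEN` is the 14-byte Ethernet header, which is exactly two MAC addresses plus the 2-byte type field, so the array length stays the same. The sketch below checks this in user space; the three constants are repeated locally with the values from `<linux/if_ether.h>`, and the two `padding_len_*()` helpers and `learning_pkt_size()` are hypothetical names added for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as in <linux/if_ether.h>, repeated so the sketch builds in user space. */
#define ETH_ALEN 6   /* octets in one Ethernet address */
#define ETH_HLEN 14  /* octets in the Ethernet header */
#define ETH_ZLEN 60  /* minimum frame length without FCS */

#pragma pack(1)
struct learning_pkt {
	uint8_t mac_dst[ETH_ALEN];
	uint8_t mac_src[ETH_ALEN];
	uint16_t type;
	uint8_t padding[ETH_ZLEN - ETH_HLEN];
};
#pragma pack()

/* Padding length as spelled before the change. */
static int padding_len_old(void)
{
	return ETH_ZLEN - (2 * ETH_ALEN + 2);
}

/* Padding length as spelled after the change. */
static int padding_len_new(void)
{
	return ETH_ZLEN - ETH_HLEN;
}

/* Size of the packed packet; it should equal the minimum frame length. */
static size_t learning_pkt_size(void)
{
	return sizeof(struct learning_pkt);
}
```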
@@ -110,13 +118,12 @@ struct arp_pkt {
 /* Forward declaration */
 static void alb_send_learning_packets(struct slave *slave, u8 mac_addr[]);
-static inline u8
-_simple_hash(u8 *hash_start, int hash_size)
+static inline u8 _simple_hash(u8 *hash_start, int hash_size)
 {
 	int i;
 	u8 hash = 0;
-	for (i=0; i<hash_size; i++) {
+	for (i = 0; i < hash_size; i++) {
 		hash ^= hash_start[i];
 	}
@@ -125,193 +132,151 @@ _simple_hash(u8 *hash_start, int hash_size)
 /*********************** tlb specific functions ***************************/
-static inline void
-_lock_tx_hashtbl(struct bonding *bond)
+static inline void _lock_tx_hashtbl(struct bonding *bond)
 {
 	spin_lock(&(BOND_ALB_INFO(bond).tx_hashtbl_lock));
 }
-static inline void
-_unlock_tx_hashtbl(struct bonding *bond)
+static inline void _unlock_tx_hashtbl(struct bonding *bond)
 {
 	spin_unlock(&(BOND_ALB_INFO(bond).tx_hashtbl_lock));
 }
 /* Caller must hold tx_hashtbl lock */
-static inline void
-tlb_init_table_entry(struct bonding *bond, u8 index, u8 save_load)
+static inline void tlb_init_table_entry(struct tlb_client_info *entry, int save_load)
 {
-	struct tlb_client_info *entry;
-	if (BOND_ALB_INFO(bond).tx_hashtbl == NULL) {
-		return;
-	}
-	entry = &(BOND_ALB_INFO(bond).tx_hashtbl[index]);
-	/* at end of cycle, save the load that was transmitted to the client
-	 * during the cycle, and set the tx_bytes counter to 0 for counting
-	 * the load during the next cycle
-	 */
 	if (save_load) {
 		entry->load_history = 1 + entry->tx_bytes /
 			BOND_TLB_REBALANCE_INTERVAL;
 		entry->tx_bytes = 0;
 	}
 	entry->tx_slave = NULL;
 	entry->next = TLB_NULL_INDEX;
 	entry->prev = TLB_NULL_INDEX;
 }
-static inline void
-tlb_init_slave(struct slave *slave)
+static inline void tlb_init_slave(struct slave *slave)
 {
-	struct tlb_slave_info *slave_info = &(SLAVE_TLB_INFO(slave));
-	slave_info->load = 0;
-	slave_info->head = TLB_NULL_INDEX;
+	SLAVE_TLB_INFO(slave).load = 0;
+	SLAVE_TLB_INFO(slave).head = TLB_NULL_INDEX;
 }
 /* Caller must hold bond lock for read */
-static inline void
-tlb_clear_slave(struct bonding *bond, struct slave *slave, u8 save_load)
+static void tlb_clear_slave(struct bonding *bond, struct slave *slave, int save_load)
 {
-	struct tlb_client_info *tx_hash_table = NULL;
-	u32 index, next_index;
-	/* clear slave from tx_hashtbl */
+	struct tlb_client_info *tx_hash_table;
+	u32 index;
 	_lock_tx_hashtbl(bond);
+	/* clear slave from tx_hashtbl */
 	tx_hash_table = BOND_ALB_INFO(bond).tx_hashtbl;
-	if (tx_hash_table) {
-		index = SLAVE_TLB_INFO(slave).head;
-		while (index != TLB_NULL_INDEX) {
-			next_index = tx_hash_table[index].next;
-			tlb_init_table_entry(bond, index, save_load);
-			index = next_index;
-		}
-	}
+	index = SLAVE_TLB_INFO(slave).head;
+	while (index != TLB_NULL_INDEX) {
+		u32 next_index = tx_hash_table[index].next;
+		tlb_init_table_entry(&tx_hash_table[index], save_load);
+		index = next_index;
+	}
 	_unlock_tx_hashtbl(bond);
 	tlb_init_slave(slave);
 }
 /* Must be called before starting the monitor timer */
-static int
-tlb_initialize(struct bonding *bond)
+static int tlb_initialize(struct bonding *bond)
 {
 	struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
+	int size = TLB_HASH_TABLE_SIZE * sizeof(struct tlb_client_info);
 	int i;
-	size_t size;
-#if(TLB_HASH_TABLE_SIZE != 256)
-	/* Key to the hash table is byte wide. Check the size! */
-	#error Hash Table size is wrong.
-#endif
 	spin_lock_init(&(bond_info->tx_hashtbl_lock));
 	_lock_tx_hashtbl(bond);
-	if (bond_info->tx_hashtbl != NULL) {
-		printk (KERN_ERR "%s: TLB hash table is not NULL\n",
-			bond->device->name);
-		_unlock_tx_hashtbl(bond);
-		return -1;
-	}
-	size = TLB_HASH_TABLE_SIZE * sizeof(struct tlb_client_info);
 	bond_info->tx_hashtbl = kmalloc(size, GFP_KERNEL);
-	if (bond_info->tx_hashtbl == NULL) {
-		printk (KERN_ERR "%s: Failed to allocate TLB hash table\n",
-			bond->device->name);
+	if (!bond_info->tx_hashtbl) {
+		printk(KERN_ERR DRV_NAME
+		       ": Error: %s: Failed to allocate TLB hash table\n",
+		       bond->dev->name);
 		_unlock_tx_hashtbl(bond);
 		return -1;
 	}
 	memset(bond_info->tx_hashtbl, 0, size);
-	for (i=0; i<TLB_HASH_TABLE_SIZE; i++) {
-		tlb_init_table_entry(bond, i, 1);
+	for (i = 0; i < TLB_HASH_TABLE_SIZE; i++) {
+		tlb_init_table_entry(&bond_info->tx_hashtbl[i], 1);
 	}
 	_unlock_tx_hashtbl(bond);
 	return 0;
 }
 /* Must be called only after all slaves have been released */
-static void
-tlb_deinitialize(struct bonding *bond)
+static void tlb_deinitialize(struct bonding *bond)
 {
 	struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
 	_lock_tx_hashtbl(bond);
-	if (bond_info->tx_hashtbl == NULL) {
-		_unlock_tx_hashtbl(bond);
-		return;
-	}
 	kfree(bond_info->tx_hashtbl);
 	bond_info->tx_hashtbl = NULL;
 	_unlock_tx_hashtbl(bond);
 }
 /* Caller must hold bond lock for read */
-static struct slave*
-tlb_get_least_loaded_slave(struct bonding *bond)
+static struct slave *tlb_get_least_loaded_slave(struct bonding *bond)
 {
-	struct slave *slave;
-	struct slave *least_loaded;
-	s64 curr_gap, max_gap;
+	struct slave *slave, *least_loaded;
+	s64 max_gap;
+	int i, found = 0;
 	/* Find the first enabled slave */
-	slave = bond_get_first_slave(bond);
-	while (slave) {
+	bond_for_each_slave(bond, slave, i) {
 		if (SLAVE_IS_OK(slave)) {
+			found = 1;
 			break;
 		}
-		slave = bond_get_next_slave(bond, slave);
 	}
-	if (!slave) {
+	if (!found) {
 		return NULL;
 	}
 	least_loaded = slave;
-	max_gap = (s64)(slave->speed * 1000000) -
-			(s64)(SLAVE_TLB_INFO(slave).load * 8);
+	max_gap = (s64)(slave->speed << 20) - /* Convert to Megabit per sec */
+			(s64)(SLAVE_TLB_INFO(slave).load << 3); /* Bytes to bits */
 	/* Find the slave with the largest gap */
-	slave = bond_get_next_slave(bond, slave);
-	while (slave) {
+	bond_for_each_slave_from(bond, slave, i, least_loaded) {
 		if (SLAVE_IS_OK(slave)) {
-			curr_gap = (s64)(slave->speed * 1000000) -
-					(s64)(SLAVE_TLB_INFO(slave).load * 8);
-			if (max_gap < curr_gap) {
+			s64 gap = (s64)(slave->speed << 20) -
+					(s64)(SLAVE_TLB_INFO(slave).load << 3);
+			if (max_gap < gap) {
 				least_loaded = slave;
-				max_gap = curr_gap;
+				max_gap = gap;
 			}
 		}
-		slave = bond_get_next_slave(bond, slave);
 	}
 	return least_loaded;
 }
 /* Caller must hold bond lock for read */
-struct slave*
-tlb_choose_channel(struct bonding *bond, u32 hash_index, u32 skb_len)
+struct slave *tlb_choose_channel(struct bonding *bond, u32 hash_index, u32 skb_len)
 {
 	struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
-	struct tlb_client_info *hash_table = NULL;
-	struct slave *assigned_slave = NULL;
+	struct tlb_client_info *hash_table;
+	struct slave *assigned_slave;
 	_lock_tx_hashtbl(bond);
 	hash_table = bond_info->tx_hashtbl;
-	if (hash_table == NULL) {
-		printk (KERN_ERR "%s: TLB hash table is NULL\n",
-			bond->device->name);
-		_unlock_tx_hashtbl(bond);
-		return NULL;
-	}
 	assigned_slave = hash_table[hash_index].tx_slave;
 	if (!assigned_slave) {
 		assigned_slave = tlb_get_least_loaded_slave(bond);
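Review note: `tlb_get_least_loaded_slave()` picks the usable slave with the largest gap between link capacity and assigned load. The change swaps `* 1000000` for `<< 20` and `* 8` for `<< 3`; the first is an approximation (1,048,576 rather than 1,000,000 per Mbit), the second is exact. The sketch below reproduces the comparison in user space under stated assumptions: `struct fake_slave`, `slave_gap()` and `least_loaded_index()` are hypothetical, speed is taken to be in Mbit/s and load in bytes per second, and the operands are widened to 64 bits before shifting, which the driver code does not do.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-slave fields the comparison reads. */
struct fake_slave {
	int ok;          /* models SLAVE_IS_OK() */
	uint32_t speed;  /* link speed, assumed to be in Mbit/s */
	uint32_t load;   /* assigned load, assumed to be in bytes/s */
};

/* Spare capacity in (approximate) bits per second. */
static int64_t slave_gap(const struct fake_slave *s)
{
	return ((int64_t)s->speed << 20) - ((int64_t)s->load << 3);
}

/* Index of the usable slave with the largest gap, or -1 if none is usable. */
static int least_loaded_index(const struct fake_slave *slaves, int n)
{
	int best = -1;
	int i;

	for (i = 0; i < n; i++) {
		if (!slaves[i].ok)
			continue;
		if (best < 0 || slave_gap(&slaves[best]) < slave_gap(&slaves[i]))
			best = i;
	}

	return best;
}
```

The driver finds the first usable slave and then walks the rest of the list from there; the single pass above yields the same winner because ties keep the earlier candidate in both versions.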
@@ -345,14 +310,12 @@ tlb_choose_channel(struct bonding *bond, u32 hash_index, u32 skb_len)
 }
 /*********************** rlb specific functions ***************************/
-static inline void
-_lock_rx_hashtbl(struct bonding *bond)
+static inline void _lock_rx_hashtbl(struct bonding *bond)
 {
 	spin_lock(&(BOND_ALB_INFO(bond).rx_hashtbl_lock));
 }
-static inline void
-_unlock_rx_hashtbl(struct bonding *bond)
+static inline void _unlock_rx_hashtbl(struct bonding *bond)
 {
 	spin_unlock(&(BOND_ALB_INFO(bond).rx_hashtbl_lock));
 }
@@ -360,26 +323,20 @@ _unlock_rx_hashtbl(struct bonding *bond)
 /* when an ARP REPLY is received from a client update its info
  * in the rx_hashtbl
  */
-static void
-rlb_update_entry_from_arp(struct bonding *bond, struct arp_pkt *arp)
+static void rlb_update_entry_from_arp(struct bonding *bond, struct arp_pkt *arp)
 {
-	u32 hash_index;
-	struct rlb_client_info *client_info = NULL;
 	struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
+	struct rlb_client_info *client_info;
+	u32 hash_index;
 	_lock_rx_hashtbl(bond);
-	if (bond_info->rx_hashtbl == NULL) {
-		_unlock_rx_hashtbl(bond);
-		return;
-	}
-	hash_index = _simple_hash((u8*)&(arp->ip_src), 4);
+	hash_index = _simple_hash((u8*)&(arp->ip_src), sizeof(arp->ip_src));
 	client_info = &(bond_info->rx_hashtbl[hash_index]);
 	if ((client_info->assigned) &&
 	    (client_info->ip_src == arp->ip_dst) &&
 	    (client_info->ip_dst == arp->ip_src)) {
 		/* update the clients MAC address */
 		memcpy(client_info->mac_dst, arp->mac_src, ETH_ALEN);
 		client_info->ntt = 1;
@@ -389,66 +346,60 @@ rlb_update_entry_from_arp(struct bonding *bond, struct arp_pkt *arp)
 	_unlock_rx_hashtbl(bond);
 }
-static int
-rlb_arp_recv(struct sk_buff *skb,
-	     struct net_device *dev,
-	     struct packet_type* ptype)
+static int rlb_arp_recv(struct sk_buff *skb, struct net_device *bond_dev, struct packet_type *ptype)
 {
-	struct bonding *bond = (struct bonding *)dev->priv;
-	int ret = NET_RX_DROP;
+	struct bonding *bond = bond_dev->priv;
 	struct arp_pkt *arp = (struct arp_pkt *)skb->data;
+	int res = NET_RX_DROP;
-	if (!(dev->flags & IFF_MASTER)) {
+	if (!(bond_dev->flags & IFF_MASTER)) {
 		goto out;
 	}
 	if (!arp) {
-		printk(KERN_ERR "Packet has no ARP data\n");
+		dprintk("Packet has no ARP data\n");
 		goto out;
 	}
 	if (skb->len < sizeof(struct arp_pkt)) {
-		printk(KERN_ERR "Packet is too small to be an ARP\n");
+		dprintk("Packet is too small to be an ARP\n");
 		goto out;
 	}
 	if (arp->op_code == htons(ARPOP_REPLY)) {
 		/* update rx hash table for this ARP */
 		rlb_update_entry_from_arp(bond, arp);
-		BOND_PRINT_DBG(("Server received an ARP Reply from client"));
+		dprintk("Server received an ARP Reply from client\n");
 	}
-	ret = NET_RX_SUCCESS;
+	res = NET_RX_SUCCESS;
 out:
 	dev_kfree_skb(skb);
-	return ret;
+	return res;
 }
 /* Caller must hold bond lock for read */
-static struct slave*
-rlb_next_rx_slave(struct bonding *bond)
+static struct slave *rlb_next_rx_slave(struct bonding *bond)
 {
-	struct slave *rx_slave = NULL, *slave = NULL;
-	unsigned int i = 0;
 	struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
+	struct slave *rx_slave, *slave, *start_at;
+	int i = 0;
-	slave = bond_info->next_rx_slave;
-	if (slave == NULL) {
-		slave = bond->next;
+	if (bond_info->next_rx_slave) {
+		start_at = bond_info->next_rx_slave;
+	} else {
+		start_at = bond->first_slave;
 	}
-	/* this loop uses the circular linked list property of the
-	 * slave's list to go through all slaves
-	 */
-	for (i = 0; i < bond->slave_cnt; i++, slave = slave->next) {
+	rx_slave = NULL;
+	bond_for_each_slave_from(bond, slave, i, start_at) {
 		if (SLAVE_IS_OK(slave)) {
 			if (!rx_slave) {
 				rx_slave = slave;
-			}
-			else if (slave->speed > rx_slave->speed) {
+			} else if (slave->speed > rx_slave->speed) {
 				rx_slave = slave;
 			}
 		}
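Review note: the loop in `rlb_next_rx_slave()` visits every slave once, starting at `start_at` and wrapping around, and keeps the first usable slave unless a strictly faster one turns up later. The array-based sketch below models that scan; `struct rx_candidate` and `next_rx_index()` are hypothetical, and an index with modulo arithmetic stands in for the driver's circular slave list.

```c
#include <assert.h>

/* Hypothetical stand-in for the two slave properties the scan reads. */
struct rx_candidate {
	int ok;          /* models SLAVE_IS_OK() */
	unsigned speed;  /* link speed; only the relative order matters here */
};

/*
 * Scan all n candidates once, beginning at `start` and wrapping around.
 * Returns the index of the chosen candidate, or -1 if none is usable.
 */
static int next_rx_index(const struct rx_candidate *c, int n, int start)
{
	int best = -1;
	int i;

	for (i = 0; i < n; i++) {
		int idx = (start + i) % n;

		if (!c[idx].ok)
			continue;
		/* Strict '>' means equal speeds keep the earlier candidate. */
		if (best < 0 || c[idx].speed > c[best].speed)
			best = idx;
	}

	return best;
}
```

Among equally fast slaves the starting point decides the winner, which is what lets successive calls rotate receive traffic across them.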
@@ -464,48 +415,41 @@ rlb_next_rx_slave(struct bonding *bond)
 /* teach the switch the mac of a disabled slave
  * on the primary for fault tolerance
  *
- * Caller must hold bond->ptrlock for write or bond lock for write
+ * Caller must hold bond->curr_slave_lock for write or bond lock for write
  */
-static void
-rlb_teach_disabled_mac_on_primary(struct bonding *bond, u8 addr[])
+static void rlb_teach_disabled_mac_on_primary(struct bonding *bond, u8 addr[])
 {
-	if (!bond->current_slave) {
+	if (!bond->curr_active_slave) {
 		return;
 	}
 	if (!bond->alb_info.primary_is_promisc) {
 		bond->alb_info.primary_is_promisc = 1;
-		dev_set_promiscuity(bond->current_slave->dev, 1);
+		dev_set_promiscuity(bond->curr_active_slave->dev, 1);
 	}
 	bond->alb_info.rlb_promisc_timeout_counter = 0;
-	alb_send_learning_packets(bond->current_slave, addr);
+	alb_send_learning_packets(bond->curr_active_slave, addr);
 }
 /* slave being removed should not be active at this point
  *
  * Caller must hold bond lock for read
  */
-static void
-rlb_clear_slave(struct bonding *bond, struct slave *slave)
+static void rlb_clear_slave(struct bonding *bond, struct slave *slave)
 {
-	struct rlb_client_info *rx_hash_table = NULL;
 	struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
-	u8 mac_bcast[ETH_ALEN] = {0xff,0xff,0xff,0xff,0xff,0xff};
+	struct rlb_client_info *rx_hash_table;
 	u32 index, next_index;
 	/* clear slave from rx_hashtbl */
 	_lock_rx_hashtbl(bond);
-	rx_hash_table = bond_info->rx_hashtbl;
-	if (rx_hash_table == NULL) {
-		_unlock_rx_hashtbl(bond);
-		return;
-	}
+	rx_hash_table = bond_info->rx_hashtbl;
 	index = bond_info->rx_hashtbl_head;
 	for (; index != RLB_NULL_INDEX; index = next_index) {
 		next_index = rx_hash_table[index].next;
 		if (rx_hash_table[index].slave == slave) {
 			struct slave *assigned_slave = rlb_next_rx_slave(bond);
@@ -533,23 +477,24 @@ rlb_clear_slave(struct bonding *bond, struct slave *slave)
 	_unlock_rx_hashtbl(bond);
-	write_lock(&bond->ptrlock);
-	if (slave != bond->current_slave) {
+	write_lock(&bond->curr_slave_lock);
+	if (slave != bond->curr_active_slave) {
 		rlb_teach_disabled_mac_on_primary(bond, slave->dev->dev_addr);
 	}
-	write_unlock(&bond->ptrlock);
+	write_unlock(&bond->curr_slave_lock);
 }
-static void
-rlb_update_client(struct rlb_client_info *client_info)
+static void rlb_update_client(struct rlb_client_info *client_info)
 {
-	int i = 0;
+	int i;
-	if (client_info->slave == NULL) {
+	if (!client_info->slave) {
 		return;
 	}
-	for (i=0; i<RLB_ARP_BURST_SIZE; i++) {
+	for (i = 0; i < RLB_ARP_BURST_SIZE; i++) {
 		arp_send(ARPOP_REPLY, ETH_P_ARP,
 			 client_info->ip_dst,
 			 client_info->slave->dev,
@@ -561,20 +506,14 @@ rlb_update_client(struct rlb_client_info *client_info)
 }
 /* sends ARP REPLIES that update the clients that need updating */
-static void
-rlb_update_rx_clients(struct bonding *bond)
+static void rlb_update_rx_clients(struct bonding *bond)
 {
-	u32 hash_index;
-	struct rlb_client_info *client_info = NULL;
 	struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
+	struct rlb_client_info *client_info;
+	u32 hash_index;
 	_lock_rx_hashtbl(bond);
-	if (bond_info->rx_hashtbl == NULL) {
-		_unlock_rx_hashtbl(bond);
-		return;
-	}
 	hash_index = bond_info->rx_hashtbl_head;
 	for (; hash_index != RLB_NULL_INDEX; hash_index = client_info->next) {
 		client_info = &(bond_info->rx_hashtbl[hash_index]);
@@ -595,22 +534,15 @@ rlb_update_rx_clients(struct bonding *bond)
 }
 /* The slave was assigned a new mac address - update the clients */
-static void
-rlb_req_update_slave_clients(struct bonding *bond, struct slave *slave)
+static void rlb_req_update_slave_clients(struct bonding *bond, struct slave *slave)
 {
-	u32 hash_index;
-	u8 ntt = 0;
 	struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
-	u8 mac_bcast[ETH_ALEN] = {0xff,0xff,0xff,0xff,0xff,0xff};
-	struct rlb_client_info* client_info = NULL;
+	struct rlb_client_info *client_info;
+	int ntt = 0;
+	u32 hash_index;
 	_lock_rx_hashtbl(bond);
-	if (bond_info->rx_hashtbl == NULL) {
-		_unlock_rx_hashtbl(bond);
-		return;
-	}
 	hash_index = bond_info->rx_hashtbl_head;
 	for (; hash_index != RLB_NULL_INDEX; hash_index = client_info->next) {
 		client_info = &(bond_info->rx_hashtbl[hash_index]);
...@@ -633,37 +565,31 @@ rlb_req_update_slave_clients(struct bonding *bond, struct slave *slave) ...@@ -633,37 +565,31 @@ rlb_req_update_slave_clients(struct bonding *bond, struct slave *slave)
} }
/* mark all clients using src_ip to be updated */ /* mark all clients using src_ip to be updated */
static void static void rlb_req_update_subnet_clients(struct bonding *bond, u32 src_ip)
rlb_req_update_subnet_clients(struct bonding *bond, u32 src_ip)
{ {
u32 hash_index;
struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond)); struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
u8 mac_bcast[ETH_ALEN] = {0xff,0xff,0xff,0xff,0xff,0xff}; struct rlb_client_info *client_info;
struct rlb_client_info *client_info = NULL; u32 hash_index;
_lock_rx_hashtbl(bond); _lock_rx_hashtbl(bond);
if (bond_info->rx_hashtbl == NULL) {
_unlock_rx_hashtbl(bond);
return;
}
hash_index = bond_info->rx_hashtbl_head; hash_index = bond_info->rx_hashtbl_head;
for (; hash_index != RLB_NULL_INDEX; hash_index = client_info->next) { for (; hash_index != RLB_NULL_INDEX; hash_index = client_info->next) {
client_info = &(bond_info->rx_hashtbl[hash_index]); client_info = &(bond_info->rx_hashtbl[hash_index]);
if (!client_info->slave) { if (!client_info->slave) {
printk(KERN_ERR "Bonding: Error: found a client with no" printk(KERN_ERR DRV_NAME
" channel in the client's hash table\n"); ": Error: found a client with no channel in "
"the client's hash table\n");
continue; continue;
} }
/*update all clients using this src_ip, that are not assigned /*update all clients using this src_ip, that are not assigned
* to the team's address (current_slave) and have a known * to the team's address (curr_active_slave) and have a known
* unicast mac address. * unicast mac address.
*/ */
if ((client_info->ip_src == src_ip) && if ((client_info->ip_src == src_ip) &&
memcmp(client_info->slave->dev->dev_addr, memcmp(client_info->slave->dev->dev_addr,
bond->device->dev_addr, ETH_ALEN) && bond->dev->dev_addr, ETH_ALEN) &&
memcmp(client_info->mac_dst, mac_bcast, ETH_ALEN)) { memcmp(client_info->mac_dst, mac_bcast, ETH_ALEN)) {
client_info->ntt = 1; client_info->ntt = 1;
bond_info->rx_ntt = 1; bond_info->rx_ntt = 1;
...@@ -674,30 +600,22 @@ rlb_req_update_subnet_clients(struct bonding *bond, u32 src_ip) ...@@ -674,30 +600,22 @@ rlb_req_update_subnet_clients(struct bonding *bond, u32 src_ip)
} }
/* Caller must hold both bond and ptr locks for read */ /* Caller must hold both bond and ptr locks for read */
struct slave* struct slave *rlb_choose_channel(struct bonding *bond, struct arp_pkt *arp)
rlb_choose_channel(struct bonding *bond, struct arp_pkt *arp)
{ {
struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond)); struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
struct rlb_client_info *client_info = NULL; struct slave *assigned_slave;
struct rlb_client_info *client_info;
u32 hash_index = 0; u32 hash_index = 0;
struct slave *assigned_slave = NULL;
u8 mac_bcast[ETH_ALEN] = {0xff,0xff,0xff,0xff,0xff,0xff};
_lock_rx_hashtbl(bond); _lock_rx_hashtbl(bond);
if (bond_info->rx_hashtbl == NULL) { hash_index = _simple_hash((u8 *)&arp->ip_dst, sizeof(arp->ip_src));
_unlock_rx_hashtbl(bond);
return NULL;
}
hash_index = _simple_hash((u8 *)&arp->ip_dst, 4);
client_info = &(bond_info->rx_hashtbl[hash_index]); client_info = &(bond_info->rx_hashtbl[hash_index]);
if (client_info->assigned == 1) { if (client_info->assigned) {
if ((client_info->ip_src == arp->ip_src) && if ((client_info->ip_src == arp->ip_src) &&
(client_info->ip_dst == arp->ip_dst)) { (client_info->ip_dst == arp->ip_dst)) {
/* the entry is already assigned to this client */ /* the entry is already assigned to this client */
if (memcmp(arp->mac_dst, mac_bcast, ETH_ALEN)) { if (memcmp(arp->mac_dst, mac_bcast, ETH_ALEN)) {
/* update mac address from arp */ /* update mac address from arp */
memcpy(client_info->mac_dst, arp->mac_dst, ETH_ALEN); memcpy(client_info->mac_dst, arp->mac_dst, ETH_ALEN);
...@@ -710,12 +628,12 @@ rlb_choose_channel(struct bonding *bond, struct arp_pkt *arp) ...@@ -710,12 +628,12 @@ rlb_choose_channel(struct bonding *bond, struct arp_pkt *arp)
} }
} else { } else {
/* the entry is already assigned to some other client, /* the entry is already assigned to some other client,
* move the old client to primary (current_slave) so * move the old client to primary (curr_active_slave) so
* that the new client can be assigned to this entry. * that the new client can be assigned to this entry.
*/ */
if (bond->current_slave && if (bond->curr_active_slave &&
client_info->slave != bond->current_slave) { client_info->slave != bond->curr_active_slave) {
client_info->slave = bond->current_slave; client_info->slave = bond->curr_active_slave;
rlb_update_client(client_info); rlb_update_client(client_info);
} }
} }
...@@ -736,8 +654,7 @@ rlb_choose_channel(struct bonding *bond, struct arp_pkt *arp) ...@@ -736,8 +654,7 @@ rlb_choose_channel(struct bonding *bond, struct arp_pkt *arp)
if (memcmp(client_info->mac_dst, mac_bcast, ETH_ALEN)) { if (memcmp(client_info->mac_dst, mac_bcast, ETH_ALEN)) {
client_info->ntt = 1; client_info->ntt = 1;
bond->alb_info.rx_ntt = 1; bond->alb_info.rx_ntt = 1;
} } else {
else {
client_info->ntt = 0; client_info->ntt = 0;
} }
...@@ -760,10 +677,9 @@ rlb_choose_channel(struct bonding *bond, struct arp_pkt *arp) ...@@ -760,10 +677,9 @@ rlb_choose_channel(struct bonding *bond, struct arp_pkt *arp)
/* chooses (and returns) transmit channel for arp reply /* chooses (and returns) transmit channel for arp reply
* does not choose channel for other arp types since they are * does not choose channel for other arp types since they are
* sent on the current_slave * sent on the curr_active_slave
*/ */
static struct slave* static struct slave *rlb_arp_xmit(struct sk_buff *skb, struct bonding *bond)
rlb_arp_xmit(struct sk_buff *skb, struct bonding *bond)
{ {
struct arp_pkt *arp = (struct arp_pkt *)skb->nh.raw; struct arp_pkt *arp = (struct arp_pkt *)skb->nh.raw;
struct slave *tx_slave = NULL; struct slave *tx_slave = NULL;
...@@ -776,9 +692,8 @@ rlb_arp_xmit(struct sk_buff *skb, struct bonding *bond) ...@@ -776,9 +692,8 @@ rlb_arp_xmit(struct sk_buff *skb, struct bonding *bond)
if (tx_slave) { if (tx_slave) {
memcpy(arp->mac_src,tx_slave->dev->dev_addr, ETH_ALEN); memcpy(arp->mac_src,tx_slave->dev->dev_addr, ETH_ALEN);
} }
BOND_PRINT_DBG(("Server sent ARP Reply packet")); dprintk("Server sent ARP Reply packet\n");
} else if (arp->op_code == __constant_htons(ARPOP_REQUEST)) { } else if (arp->op_code == __constant_htons(ARPOP_REQUEST)) {
/* Create an entry in the rx_hashtbl for this client as a /* Create an entry in the rx_hashtbl for this client as a
* place holder. * place holder.
* When the arp reply is received the entry will be updated * When the arp reply is received the entry will be updated
...@@ -797,34 +712,29 @@ rlb_arp_xmit(struct sk_buff *skb, struct bonding *bond) ...@@ -797,34 +712,29 @@ rlb_arp_xmit(struct sk_buff *skb, struct bonding *bond)
* updated with their assigned mac. * updated with their assigned mac.
*/ */
rlb_req_update_subnet_clients(bond, arp->ip_src); rlb_req_update_subnet_clients(bond, arp->ip_src);
BOND_PRINT_DBG(("Server sent ARP Request packet")); dprintk("Server sent ARP Request packet\n");
} }
return tx_slave; return tx_slave;
} }
/* Caller must hold bond lock for read */ /* Caller must hold bond lock for read */
static void static void rlb_rebalance(struct bonding *bond)
rlb_rebalance(struct bonding *bond)
{ {
struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond)); struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
struct slave *assigned_slave = NULL; struct slave *assigned_slave;
struct rlb_client_info *client_info;
int ntt;
u32 hash_index; u32 hash_index;
struct rlb_client_info *client_info = NULL;
u8 ntt = 0;
_lock_rx_hashtbl(bond); _lock_rx_hashtbl(bond);
if (bond_info->rx_hashtbl == NULL) { ntt = 0;
_unlock_rx_hashtbl(bond);
return;
}
hash_index = bond_info->rx_hashtbl_head; hash_index = bond_info->rx_hashtbl_head;
for (; hash_index != RLB_NULL_INDEX; hash_index = client_info->next) { for (; hash_index != RLB_NULL_INDEX; hash_index = client_info->next) {
client_info = &(bond_info->rx_hashtbl[hash_index]); client_info = &(bond_info->rx_hashtbl[hash_index]);
assigned_slave = rlb_next_rx_slave(bond); assigned_slave = rlb_next_rx_slave(bond);
if (assigned_slave && (client_info->slave != assigned_slave)){ if (assigned_slave && (client_info->slave != assigned_slave)) {
client_info->slave = assigned_slave; client_info->slave = assigned_slave;
client_info->ntt = 1; client_info->ntt = 1;
ntt = 1; ntt = 1;
...@@ -839,96 +749,83 @@ rlb_rebalance(struct bonding *bond) ...@@ -839,96 +749,83 @@ rlb_rebalance(struct bonding *bond)
} }
/* Caller must hold rx_hashtbl lock */ /* Caller must hold rx_hashtbl lock */
static inline void static void rlb_init_table_entry(struct rlb_client_info *entry)
rlb_init_table_entry(struct rlb_client_info *entry)
{ {
memset(entry, 0, sizeof(struct rlb_client_info));
entry->next = RLB_NULL_INDEX; entry->next = RLB_NULL_INDEX;
entry->prev = RLB_NULL_INDEX; entry->prev = RLB_NULL_INDEX;
entry->assigned = 0;
entry->ntt = 0;
} }
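Review note: the new column of this hunk zeroes the whole entry with memset() and then sets only the two link fields that must not be zero. Below is a minimal user-space sketch of that initialization order. The struct layout, `TABLE_SIZE` and `NULL_INDEX` are hypothetical stand-ins for illustration, not the driver's definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TABLE_SIZE 256          /* hypothetical stand-in for RLB_HASH_TABLE_SIZE */
#define NULL_INDEX 0xffffffffu  /* hypothetical stand-in for RLB_NULL_INDEX */

/* Hypothetical, reduced version of a client table entry. */
struct client_entry {
	uint32_t ip_src;
	uint32_t ip_dst;
	uint32_t next;      /* index of the next used entry */
	uint32_t prev;      /* index of the previous used entry */
	uint8_t  assigned;
	uint8_t  ntt;       /* "need to transmit" flag */
};

/* Zero every field first, then set the links whose "empty" value is not 0. */
static void init_entry(struct client_entry *e)
{
	memset(e, 0, sizeof(*e));
	e->next = NULL_INDEX;
	e->prev = NULL_INDEX;
}

/* Initialize all n entries of a table. */
static void init_table(struct client_entry *tbl, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		init_entry(&tbl[i]);
}
```

Zeroing first means that fields added to the struct later start out cleared without the init function having to list each one.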
static int static int rlb_initialize(struct bonding *bond)
rlb_initialize(struct bonding *bond)
{ {
struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond)); struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
struct packet_type *pk_type = &(BOND_ALB_INFO(bond).rlb_pkt_type); struct packet_type *pk_type = &(BOND_ALB_INFO(bond).rlb_pkt_type);
int size = RLB_HASH_TABLE_SIZE * sizeof(struct rlb_client_info);
int i; int i;
size_t size;
spin_lock_init(&(bond_info->rx_hashtbl_lock)); spin_lock_init(&(bond_info->rx_hashtbl_lock));
_lock_rx_hashtbl(bond); _lock_rx_hashtbl(bond);
if (bond_info->rx_hashtbl != NULL) {
printk (KERN_ERR "%s: RLB hash table is not NULL\n",
bond->device->name);
_unlock_rx_hashtbl(bond);
return -1;
}
size = RLB_HASH_TABLE_SIZE * sizeof(struct rlb_client_info);
bond_info->rx_hashtbl = kmalloc(size, GFP_KERNEL); bond_info->rx_hashtbl = kmalloc(size, GFP_KERNEL);
if (bond_info->rx_hashtbl == NULL) { if (!bond_info->rx_hashtbl) {
printk (KERN_ERR "%s: Failed to allocate" printk(KERN_ERR DRV_NAME
" RLB hash table\n", bond->device->name); ": Error: %s: Failed to allocate RLB hash table\n",
bond->dev->name);
_unlock_rx_hashtbl(bond); _unlock_rx_hashtbl(bond);
return -1; return -1;
} }
bond_info->rx_hashtbl_head = RLB_NULL_INDEX; bond_info->rx_hashtbl_head = RLB_NULL_INDEX;
for (i=0; i<RLB_HASH_TABLE_SIZE; i++) { for (i = 0; i < RLB_HASH_TABLE_SIZE; i++) {
rlb_init_table_entry(bond_info->rx_hashtbl + i); rlb_init_table_entry(bond_info->rx_hashtbl + i);
} }
_unlock_rx_hashtbl(bond);
/* register to receive ARPs */ _unlock_rx_hashtbl(bond);
/*initialize packet type*/ /*initialize packet type*/
pk_type->type = __constant_htons(ETH_P_ARP); pk_type->type = __constant_htons(ETH_P_ARP);
pk_type->dev = bond->device; pk_type->dev = bond->dev;
pk_type->func = rlb_arp_recv; pk_type->func = rlb_arp_recv;
/* register to receive ARPs */
dev_add_pack(pk_type); dev_add_pack(pk_type);
return 0; return 0;
} }
static void static void rlb_deinitialize(struct bonding *bond)
rlb_deinitialize(struct bonding *bond)
{ {
struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond)); struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
dev_remove_pack(&(bond_info->rlb_pkt_type)); dev_remove_pack(&(bond_info->rlb_pkt_type));
_lock_rx_hashtbl(bond); _lock_rx_hashtbl(bond);
if (bond_info->rx_hashtbl == NULL) {
_unlock_rx_hashtbl(bond);
return;
}
kfree(bond_info->rx_hashtbl); kfree(bond_info->rx_hashtbl);
bond_info->rx_hashtbl = NULL; bond_info->rx_hashtbl = NULL;
_unlock_rx_hashtbl(bond); _unlock_rx_hashtbl(bond);
} }
/*********************** tlb/rlb shared functions *********************/ /*********************** tlb/rlb shared functions *********************/
static void static void alb_send_learning_packets(struct slave *slave, u8 mac_addr[])
alb_send_learning_packets(struct slave *slave, u8 mac_addr[])
{ {
struct sk_buff *skb = NULL;
struct learning_pkt pkt; struct learning_pkt pkt;
char *data = NULL; int size = sizeof(struct learning_pkt);
int i; int i;
unsigned int size = sizeof(struct learning_pkt);
memset(&pkt, 0, size); memset(&pkt, 0, size);
memcpy(pkt.mac_dst, mac_addr, ETH_ALEN); memcpy(pkt.mac_dst, mac_addr, ETH_ALEN);
memcpy(pkt.mac_src, mac_addr, ETH_ALEN); memcpy(pkt.mac_src, mac_addr, ETH_ALEN);
pkt.type = __constant_htons(ETH_P_LOOP); pkt.type = __constant_htons(ETH_P_LOOP);
for (i=0; i < MAX_LP_RETRY; i++) { for (i = 0; i < MAX_LP_RETRY; i++) {
skb = NULL; struct sk_buff *skb;
char *data;
skb = dev_alloc_skb(size); skb = dev_alloc_skb(size);
if (!skb) { if (!skb) {
return; return;
...@@ -936,28 +833,26 @@ alb_send_learning_packets(struct slave *slave, u8 mac_addr[]) ...@@ -936,28 +833,26 @@ alb_send_learning_packets(struct slave *slave, u8 mac_addr[])
data = skb_put(skb, size); data = skb_put(skb, size);
memcpy(data, &pkt, size); memcpy(data, &pkt, size);
skb->mac.raw = data; skb->mac.raw = data;
skb->nh.raw = data + ETH_HLEN; skb->nh.raw = data + ETH_HLEN;
skb->protocol = pkt.type; skb->protocol = pkt.type;
skb->priority = TC_PRIO_CONTROL; skb->priority = TC_PRIO_CONTROL;
skb->dev = slave->dev; skb->dev = slave->dev;
dev_queue_xmit(skb); dev_queue_xmit(skb);
} }
} }
/* hw is a boolean parameter that determines whether we should try and /* hw is a boolean parameter that determines whether we should try and
* set the hw address of the device as well as the hw address of the * set the hw address of the device as well as the hw address of the
* net_device * net_device
*/ */
static int static int alb_set_slave_mac_addr(struct slave *slave, u8 addr[], int hw)
alb_set_slave_mac_addr(struct slave *slave, u8 addr[], int hw)
{ {
struct net_device *dev = NULL; struct net_device *dev = slave->dev;
struct sockaddr s_addr; struct sockaddr s_addr;
dev = slave->dev;
if (!hw) { if (!hw) {
memcpy(dev->dev_addr, addr, dev->addr_len); memcpy(dev->dev_addr, addr, dev->addr_len);
return 0; return 0;
...@@ -968,26 +863,23 @@ alb_set_slave_mac_addr(struct slave *slave, u8 addr[], int hw) ...@@ -968,26 +863,23 @@ alb_set_slave_mac_addr(struct slave *slave, u8 addr[], int hw)
memcpy(s_addr.sa_data, addr, dev->addr_len); memcpy(s_addr.sa_data, addr, dev->addr_len);
s_addr.sa_family = dev->type; s_addr.sa_family = dev->type;
if (dev->set_mac_address(dev, &s_addr)) { if (dev->set_mac_address(dev, &s_addr)) {
printk(KERN_DEBUG "bonding: Error: alb_set_slave_mac_addr:" printk(KERN_ERR DRV_NAME
" dev->set_mac_address of dev %s failed!" ": Error: dev->set_mac_address of dev %s failed! ALB "
" ALB mode requires that the base driver" "mode requires that the base driver support setting "
" support setting the hw address also when" "the hw address also when the network device's "
" the network device's interface is open\n", "interface is open\n",
dev->name); dev->name);
return -EOPNOTSUPP; return -EOPNOTSUPP;
} }
return 0; return 0;
} }
/* Caller must hold bond lock for write or ptrlock for write*/ /* Caller must hold bond lock for write or curr_slave_lock for write*/
static void static void alb_swap_mac_addr(struct bonding *bond, struct slave *slave1, struct slave *slave2)
alb_swap_mac_addr(struct bonding *bond,
struct slave *slave1,
struct slave *slave2)
{ {
u8 tmp_mac_addr[ETH_ALEN];
struct slave *disabled_slave = NULL; struct slave *disabled_slave = NULL;
u8 slaves_state_differ; u8 tmp_mac_addr[ETH_ALEN];
int slaves_state_differ;
slaves_state_differ = (SLAVE_IS_OK(slave1) != SLAVE_IS_OK(slave2)); slaves_state_differ = (SLAVE_IS_OK(slave1) != SLAVE_IS_OK(slave2));
...@@ -1004,8 +896,7 @@ alb_swap_mac_addr(struct bonding *bond, ...@@ -1004,8 +896,7 @@ alb_swap_mac_addr(struct bonding *bond,
*/ */
rlb_req_update_slave_clients(bond, slave1); rlb_req_update_slave_clients(bond, slave1);
} }
} } else {
else {
disabled_slave = slave1; disabled_slave = slave1;
} }
...@@ -1017,15 +908,14 @@ alb_swap_mac_addr(struct bonding *bond, ...@@ -1017,15 +908,14 @@ alb_swap_mac_addr(struct bonding *bond,
*/ */
rlb_req_update_slave_clients(bond, slave2); rlb_req_update_slave_clients(bond, slave2);
} }
} } else {
else {
disabled_slave = slave2; disabled_slave = slave2;
} }
if (bond->alb_info.rlb_enabled && slaves_state_differ) { if (bond->alb_info.rlb_enabled && slaves_state_differ) {
/* A disabled slave was assigned an active mac addr */ /* A disabled slave was assigned an active mac addr */
rlb_teach_disabled_mac_on_primary(bond, rlb_teach_disabled_mac_on_primary(bond,
disabled_slave->dev->dev_addr); disabled_slave->dev->dev_addr);
} }
} }
...@@ -1043,10 +933,8 @@ alb_swap_mac_addr(struct bonding *bond, ...@@ -1043,10 +933,8 @@ alb_swap_mac_addr(struct bonding *bond,
* *
* Caller must hold bond lock * Caller must hold bond lock
*/ */
static void static void alb_change_hw_addr_on_detach(struct bonding *bond, struct slave *slave)
alb_change_hw_addr_on_detach(struct bonding *bond, struct slave *slave)
{ {
struct slave *tmp_slave;
int perm_curr_diff; int perm_curr_diff;
int perm_bond_diff; int perm_bond_diff;
...@@ -1054,20 +942,23 @@ alb_change_hw_addr_on_detach(struct bonding *bond, struct slave *slave) ...@@ -1054,20 +942,23 @@ alb_change_hw_addr_on_detach(struct bonding *bond, struct slave *slave)
slave->dev->dev_addr, slave->dev->dev_addr,
ETH_ALEN); ETH_ALEN);
perm_bond_diff = memcmp(slave->perm_hwaddr, perm_bond_diff = memcmp(slave->perm_hwaddr,
bond->device->dev_addr, bond->dev->dev_addr,
ETH_ALEN); ETH_ALEN);
if (perm_curr_diff && perm_bond_diff) { if (perm_curr_diff && perm_bond_diff) {
tmp_slave = bond_get_first_slave(bond); struct slave *tmp_slave;
while (tmp_slave) { int i, found = 0;
bond_for_each_slave(bond, tmp_slave, i) {
if (!memcmp(slave->perm_hwaddr, if (!memcmp(slave->perm_hwaddr,
tmp_slave->dev->dev_addr, tmp_slave->dev->dev_addr,
ETH_ALEN)) { ETH_ALEN)) {
found = 1;
break; break;
} }
tmp_slave = bond_get_next_slave(bond, tmp_slave);
} }
if (tmp_slave) { if (found) {
alb_swap_mac_addr(bond, slave, tmp_slave); alb_swap_mac_addr(bond, slave, tmp_slave);
} }
} }
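Review note: the old loops stopped when the cursor became NULL, so "cursor != NULL" doubled as "match found". The new loops use bond_for_each_slave() together with an explicit `found` flag. The sketch below assumes, as the flag suggests, that the iterator is a counted walk over a circular list, so the cursor is never NULL after the loop. `struct node`, `struct ring` and `ring_for_each` are hypothetical names for illustration, not the driver's types or macro.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAC_LEN 6

struct node {
	unsigned char addr[MAC_LEN];
	struct node *next;   /* circular: the last node points back to the first */
};

struct ring {
	struct node *first;
	int count;
};

/* Counted walk over a circular list. After a full pass the cursor has
 * wrapped around to the first node, so it cannot signal "not found".
 */
#define ring_for_each(r, pos, i) \
	for ((i) = 0, (pos) = (r)->first; (i) < (r)->count; \
	     (i)++, (pos) = (pos)->next)

/* Return the node that carries addr, or NULL when no node matches. */
static struct node *ring_find_addr(struct ring *r, const unsigned char *addr)
{
	struct node *pos;
	int i, found = 0;

	ring_for_each(r, pos, i) {
		if (!memcmp(pos->addr, addr, MAC_LEN)) {
			found = 1;
			break;
		}
	}

	/* pos is meaningful only when found is set */
	return found ? pos : NULL;
}
```

Without the flag, a failed search over a non-empty ring would hand back the first node as if it had matched.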
...@@ -1098,10 +989,10 @@ alb_change_hw_addr_on_detach(struct bonding *bond, struct slave *slave) ...@@ -1098,10 +989,10 @@ alb_change_hw_addr_on_detach(struct bonding *bond, struct slave *slave)
* caller must hold the bond lock for write since the mac addresses are compared * caller must hold the bond lock for write since the mac addresses are compared
* and may be swapped. * and may be swapped.
*/ */
static int static int alb_handle_addr_collision_on_attach(struct bonding *bond, struct slave *slave)
alb_handle_addr_collision_on_attach(struct bonding *bond, struct slave *slave)
{ {
struct slave *tmp_slave1, *tmp_slave2; struct slave *tmp_slave1, *tmp_slave2, *free_mac_slave;
int i, j, found = 0;
if (bond->slave_cnt == 0) { if (bond->slave_cnt == 0) {
/* this is the first slave */ /* this is the first slave */
...@@ -1112,65 +1003,68 @@ alb_handle_addr_collision_on_attach(struct bonding *bond, struct slave *slave) ...@@ -1112,65 +1003,68 @@ alb_handle_addr_collision_on_attach(struct bonding *bond, struct slave *slave)
* check uniqueness of slave's mac address against the other * check uniqueness of slave's mac address against the other
* slaves in the bond. * slaves in the bond.
*/ */
if (memcmp(slave->perm_hwaddr, bond->device->dev_addr, ETH_ALEN)) { if (memcmp(slave->perm_hwaddr, bond->dev->dev_addr, ETH_ALEN)) {
tmp_slave1 = bond_get_first_slave(bond); bond_for_each_slave(bond, tmp_slave1, i) {
for (; tmp_slave1; tmp_slave1 = bond_get_next_slave(bond, tmp_slave1)) {
if (!memcmp(tmp_slave1->dev->dev_addr, slave->dev->dev_addr, if (!memcmp(tmp_slave1->dev->dev_addr, slave->dev->dev_addr,
ETH_ALEN)) { ETH_ALEN)) {
found = 1;
break; break;
} }
} }
if (tmp_slave1) {
if (found) {
/* a slave was found that is using the mac address /* a slave was found that is using the mac address
* of the new slave * of the new slave
*/ */
printk(KERN_ERR "bonding: Warning: the hw address " printk(KERN_ERR DRV_NAME
"of slave %s is not unique - cannot enslave it!" ": Error: the hw address of slave %s is not "
, slave->dev->name); "unique - cannot enslave it!",
slave->dev->name);
return -EINVAL; return -EINVAL;
} }
return 0; return 0;
} }
/* the slave's address is equal to the address of the bond /* The slave's address is equal to the address of the bond.
* search for a spare address in the bond for this slave. * Search for a spare address in the bond for this slave.
*/ */
tmp_slave1 = bond_get_first_slave(bond); free_mac_slave = NULL;
for (; tmp_slave1; tmp_slave1 = bond_get_next_slave(bond, tmp_slave1)) {
tmp_slave2 = bond_get_first_slave(bond);
for (; tmp_slave2; tmp_slave2 = bond_get_next_slave(bond, tmp_slave2)) {
bond_for_each_slave(bond, tmp_slave1, i) {
found = 0;
bond_for_each_slave(bond, tmp_slave2, j) {
if (!memcmp(tmp_slave1->perm_hwaddr, if (!memcmp(tmp_slave1->perm_hwaddr,
tmp_slave2->dev->dev_addr, tmp_slave2->dev->dev_addr,
ETH_ALEN)) { ETH_ALEN)) {
found = 1;
break; break;
} }
} }
if (!tmp_slave2) { if (!found) {
/* no slave has tmp_slave1's perm addr /* no slave has tmp_slave1's perm addr
* as its curr addr * as its curr addr
*/ */
free_mac_slave = tmp_slave1;
break; break;
} }
} }
if (tmp_slave1) { if (free_mac_slave) {
alb_set_slave_mac_addr(slave, tmp_slave1->perm_hwaddr, alb_set_slave_mac_addr(slave, free_mac_slave->perm_hwaddr,
bond->alb_info.rlb_enabled); bond->alb_info.rlb_enabled);
printk(KERN_WARNING "bonding: Warning: the hw address " printk(KERN_WARNING DRV_NAME
"of slave %s is in use by the bond; " ": Warning: the hw address of slave %s is in use by "
"giving it the hw address of %s\n", "the bond; giving it the hw address of %s\n",
slave->dev->name, tmp_slave1->dev->name); slave->dev->name, free_mac_slave->dev->name);
} else { } else {
printk(KERN_CRIT "bonding: Error: the hw address " printk(KERN_ERR DRV_NAME
"of slave %s is in use by the bond; " ": Error: the hw address of slave %s is in use by the "
"couldn't find a slave with a free hw " "bond; couldn't find a slave with a free hw address to "
"address to give it (this should not have " "give it (this should not have happened)\n",
"happened)\n", slave->dev->name); slave->dev->name);
return -EFAULT; return -EFAULT;
} }
...@@ -1188,37 +1082,36 @@ alb_handle_addr_collision_on_attach(struct bonding *bond, struct slave *slave) ...@@ -1188,37 +1082,36 @@ alb_handle_addr_collision_on_attach(struct bonding *bond, struct slave *slave)
* *
* For each slave, this function sets the interface to the new address and then * For each slave, this function sets the interface to the new address and then
* changes its dev_addr field to its previous value. * changes its dev_addr field to its previous value.
* *
* Unwinding assumes bond's mac address has not yet changed. * Unwinding assumes bond's mac address has not yet changed.
*/ */
static inline int static int alb_set_mac_address(struct bonding *bond, void *addr)
alb_set_mac_address(struct bonding *bond, void *addr)
{ {
struct sockaddr sa; struct sockaddr sa;
struct slave *slave; struct slave *slave, *stop_at;
char tmp_addr[ETH_ALEN]; char tmp_addr[ETH_ALEN];
int error; int res;
int i;
if (bond->alb_info.rlb_enabled) { if (bond->alb_info.rlb_enabled) {
return 0; return 0;
} }
slave = bond_get_first_slave(bond); bond_for_each_slave(bond, slave, i) {
for (; slave; slave = bond_get_next_slave(bond, slave)) {
if (slave->dev->set_mac_address == NULL) { if (slave->dev->set_mac_address == NULL) {
error = -EOPNOTSUPP; res = -EOPNOTSUPP;
goto unwind; goto unwind;
} }
/* save net_device's current hw address */ /* save net_device's current hw address */
memcpy(tmp_addr, slave->dev->dev_addr, ETH_ALEN); memcpy(tmp_addr, slave->dev->dev_addr, ETH_ALEN);
error = slave->dev->set_mac_address(slave->dev, addr); res = slave->dev->set_mac_address(slave->dev, addr);
/* restore net_device's hw address */ /* restore net_device's hw address */
memcpy(slave->dev->dev_addr, tmp_addr, ETH_ALEN); memcpy(slave->dev->dev_addr, tmp_addr, ETH_ALEN);
if (error) { if (res) {
goto unwind; goto unwind;
} }
} }
...@@ -1226,22 +1119,23 @@ alb_set_mac_address(struct bonding *bond, void *addr) ...@@ -1226,22 +1119,23 @@ alb_set_mac_address(struct bonding *bond, void *addr)
return 0; return 0;
unwind: unwind:
memcpy(sa.sa_data, bond->device->dev_addr, bond->device->addr_len); memcpy(sa.sa_data, bond->dev->dev_addr, bond->dev->addr_len);
sa.sa_family = bond->device->type; sa.sa_family = bond->dev->type;
slave = bond_get_first_slave(bond);
for (; slave; slave = bond_get_next_slave(bond, slave)) { /* unwind from head to the slave that failed */
stop_at = slave;
bond_for_each_slave_from_to(bond, slave, i, bond->first_slave, stop_at) {
memcpy(tmp_addr, slave->dev->dev_addr, ETH_ALEN); memcpy(tmp_addr, slave->dev->dev_addr, ETH_ALEN);
slave->dev->set_mac_address(slave->dev, &sa); slave->dev->set_mac_address(slave->dev, &sa);
memcpy(slave->dev->dev_addr, tmp_addr, ETH_ALEN); memcpy(slave->dev->dev_addr, tmp_addr, ETH_ALEN);
} }
return error; return res;
} }
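Review note: the new unwind path records where the forward loop failed and restores only the slaves from the head up to that point, instead of walking the whole list again. Here is a reduced sketch of that shape, using an array in place of the slave list. `struct dev`, `set_addr()` and `set_addr_all()` are hypothetical. Whether the failed element itself is revisited is a choice made in this sketch (it is skipped here) and is not a statement about bond_for_each_slave_from_to().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct dev {
	int addr;      /* stand-in for a hardware address */
	int fail_set;  /* when nonzero, set_addr() rejects the change */
};

static int set_addr(struct dev *d, int addr)
{
	if (d->fail_set)
		return -EOPNOTSUPP;
	d->addr = addr;
	return 0;
}

/* Apply new_addr to every device; on the first failure, put old_addr
 * back on the devices that were already changed and return the error.
 */
static int set_addr_all(struct dev *devs, size_t n, int old_addr, int new_addr)
{
	size_t i, stop_at;
	int res = 0;

	for (i = 0; i < n; i++) {
		res = set_addr(&devs[i], new_addr);
		if (res)
			goto unwind;
	}
	return 0;

unwind:
	/* unwind from the head up to, but not including, the failed device */
	stop_at = i;
	for (i = 0; i < stop_at; i++)
		set_addr(&devs[i], old_addr);

	return res;
}
```

Bounding the rollback by the failure point leaves devices that were never modified untouched.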
/************************ exported alb funcions ************************/ /************************ exported alb funcions ************************/
int int bond_alb_initialize(struct bonding *bond, int rlb_enabled)
bond_alb_initialize(struct bonding *bond, int rlb_enabled)
{ {
int res; int res;
...@@ -1263,8 +1157,7 @@ bond_alb_initialize(struct bonding *bond, int rlb_enabled) ...@@ -1263,8 +1157,7 @@ bond_alb_initialize(struct bonding *bond, int rlb_enabled)
return 0; return 0;
} }
void void bond_alb_deinitialize(struct bonding *bond)
bond_alb_deinitialize(struct bonding *bond)
{ {
struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond)); struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
...@@ -1275,49 +1168,38 @@ bond_alb_deinitialize(struct bonding *bond) ...@@ -1275,49 +1168,38 @@ bond_alb_deinitialize(struct bonding *bond)
} }
} }
int int bond_alb_xmit(struct sk_buff *skb, struct net_device *bond_dev)
bond_alb_xmit(struct sk_buff *skb, struct net_device *dev)
{ {
struct bonding *bond = (struct bonding *) dev->priv; struct bonding *bond = bond_dev->priv;
struct ethhdr *eth_data = (struct ethhdr *)skb->data; struct ethhdr *eth_data = (struct ethhdr *)skb->mac.raw = skb->data;
struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond)); struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
struct slave *tx_slave = NULL; struct slave *tx_slave = NULL;
char do_tx_balance = 1; static u32 ip_bcast = 0xffffffff;
int hash_size = 0; int hash_size = 0;
int do_tx_balance = 1;
u32 hash_index = 0; u32 hash_index = 0;
u8 *hash_start = NULL; u8 *hash_start = NULL;
u8 mac_bcast[ETH_ALEN] = {0xff,0xff,0xff,0xff,0xff,0xff};
if (!IS_UP(dev)) { /* bond down */
dev_kfree_skb(skb);
return 0;
}
/* make sure that the current_slave and the slaves list do /* make sure that the curr_active_slave and the slaves list do
* not change during tx * not change during tx
*/ */
read_lock(&bond->lock); read_lock(&bond->lock);
read_lock(&bond->curr_slave_lock);
if (bond->slave_cnt == 0) { if (!BOND_IS_OK(bond)) {
/* no suitable interface, frame not sent */ goto free_out;
dev_kfree_skb(skb);
read_unlock(&bond->lock);
return 0;
} }
read_lock(&bond->ptrlock);
switch (ntohs(skb->protocol)) { switch (ntohs(skb->protocol)) {
case ETH_P_IP: case ETH_P_IP:
if ((memcmp(eth_data->h_dest, mac_bcast, ETH_ALEN) == 0) || if ((memcmp(eth_data->h_dest, mac_bcast, ETH_ALEN) == 0) ||
(skb->nh.iph->daddr == 0xffffffff)) { (skb->nh.iph->daddr == ip_bcast)) {
do_tx_balance = 0; do_tx_balance = 0;
break; break;
} }
hash_start = (char*)&(skb->nh.iph->daddr); hash_start = (char*)&(skb->nh.iph->daddr);
hash_size = 4; hash_size = sizeof(skb->nh.iph->daddr);
break; break;
case ETH_P_IPV6: case ETH_P_IPV6:
if (memcmp(eth_data->h_dest, mac_bcast, ETH_ALEN) == 0) { if (memcmp(eth_data->h_dest, mac_bcast, ETH_ALEN) == 0) {
do_tx_balance = 0; do_tx_balance = 0;
...@@ -1325,9 +1207,8 @@ bond_alb_xmit(struct sk_buff *skb, struct net_device *dev) ...@@ -1325,9 +1207,8 @@ bond_alb_xmit(struct sk_buff *skb, struct net_device *dev)
} }
hash_start = (char*)&(skb->nh.ipv6h->daddr); hash_start = (char*)&(skb->nh.ipv6h->daddr);
hash_size = 16; hash_size = sizeof(skb->nh.ipv6h->daddr);
break; break;
case ETH_P_IPX: case ETH_P_IPX:
if (ipx_hdr(skb)->ipx_checksum != if (ipx_hdr(skb)->ipx_checksum !=
__constant_htons(IPX_NO_CHECKSUM)) { __constant_htons(IPX_NO_CHECKSUM)) {
...@@ -1349,14 +1230,12 @@ bond_alb_xmit(struct sk_buff *skb, struct net_device *dev) ...@@ -1349,14 +1230,12 @@ bond_alb_xmit(struct sk_buff *skb, struct net_device *dev)
hash_start = (char*)eth_data->h_dest; hash_start = (char*)eth_data->h_dest;
hash_size = ETH_ALEN; hash_size = ETH_ALEN;
break; break;
case ETH_P_ARP: case ETH_P_ARP:
do_tx_balance = 0; do_tx_balance = 0;
if (bond_info->rlb_enabled) { if (bond_info->rlb_enabled) {
tx_slave = rlb_arp_xmit(skb, bond); tx_slave = rlb_arp_xmit(skb, bond);
} }
break; break;
default: default:
do_tx_balance = 0; do_tx_balance = 0;
break; break;
...@@ -1369,16 +1248,16 @@ bond_alb_xmit(struct sk_buff *skb, struct net_device *dev) ...@@ -1369,16 +1248,16 @@ bond_alb_xmit(struct sk_buff *skb, struct net_device *dev)
if (!tx_slave) { if (!tx_slave) {
/* unbalanced or unassigned, send through primary */ /* unbalanced or unassigned, send through primary */
tx_slave = bond->current_slave; tx_slave = bond->curr_active_slave;
bond_info->unbalanced_load += skb->len; bond_info->unbalanced_load += skb->len;
} }
if (tx_slave && SLAVE_IS_OK(tx_slave)) { if (tx_slave && SLAVE_IS_OK(tx_slave)) {
skb->dev = tx_slave->dev; skb->dev = tx_slave->dev;
if (tx_slave != bond->current_slave) { if (tx_slave != bond->curr_active_slave) {
memcpy(eth_data->h_source, memcpy(eth_data->h_source,
tx_slave->dev->dev_addr, tx_slave->dev->dev_addr,
ETH_ALEN); ETH_ALEN);
} }
dev_queue_xmit(skb); dev_queue_xmit(skb);
} else { } else {
...@@ -1386,26 +1265,35 @@ bond_alb_xmit(struct sk_buff *skb, struct net_device *dev) ...@@ -1386,26 +1265,35 @@ bond_alb_xmit(struct sk_buff *skb, struct net_device *dev)
if (tx_slave) { if (tx_slave) {
tlb_clear_slave(bond, tx_slave, 0); tlb_clear_slave(bond, tx_slave, 0);
} }
dev_kfree_skb(skb); goto free_out;
} }
read_unlock(&bond->ptrlock); out:
read_unlock(&bond->curr_slave_lock);
read_unlock(&bond->lock); read_unlock(&bond->lock);
return 0; return 0;
free_out:
dev_kfree_skb(skb);
goto out;
} }
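Review note: the rewritten transmit path replaces several early "free, unlock, return" sequences with two labels: `out` releases both locks, and `free_out` drops the buffer and then jumps to `out`. The sketch below models only that control flow. `struct ctx` and its counters are hypothetical stand-ins for the locks and the skb, so it shows the exit structure and nothing about the locking primitives.

```c
#include <assert.h>

struct ctx {
	int lock_depth;  /* number of locks currently held */
	int ready;       /* stand-in for a BOND_IS_OK-style check */
	int sent;        /* buffers handed on for transmission */
	int freed;       /* buffers dropped */
};

static int xmit(struct ctx *c)
{
	c->lock_depth++;  /* outer lock */
	c->lock_depth++;  /* inner lock */

	if (!c->ready)
		goto free_out;

	c->sent++;

out:
	/* single place where the locks are released, in reverse order */
	c->lock_depth--;
	c->lock_depth--;
	return 0;

free_out:
	c->freed++;
	goto out;
}
```

With one unlock site, an error path added later cannot return while a lock is still held.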
void void bond_alb_monitor(struct bonding *bond)
bond_alb_monitor(struct bonding *bond)
{ {
struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond)); struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
struct slave *slave = NULL; struct slave *slave;
int i;
read_lock(&bond->lock); read_lock(&bond->lock);
if ((bond->slave_cnt == 0) || !(bond->device->flags & IFF_UP)) { if (bond->kill_timers) {
goto out;
}
if (bond->slave_cnt == 0) {
bond_info->tx_rebalance_counter = 0; bond_info->tx_rebalance_counter = 0;
bond_info->lp_counter = 0; bond_info->lp_counter = 0;
goto out; goto re_arm;
} }
bond_info->tx_rebalance_counter++; bond_info->tx_rebalance_counter++;
...@@ -1413,51 +1301,53 @@ bond_alb_monitor(struct bonding *bond) ...@@ -1413,51 +1301,53 @@ bond_alb_monitor(struct bonding *bond)
/* send learning packets */ /* send learning packets */
if (bond_info->lp_counter >= BOND_ALB_LP_TICKS) { if (bond_info->lp_counter >= BOND_ALB_LP_TICKS) {
/* change of current_slave involves swapping of mac addresses. /* change of curr_active_slave involves swapping of mac addresses.
* in order to avoid this swapping from happening while * in order to avoid this swapping from happening while
* sending the learning packets, the ptrlock must be held for * sending the learning packets, the curr_slave_lock must be held for
* read. * read.
*/ */
read_lock(&bond->ptrlock); read_lock(&bond->curr_slave_lock);
slave = bond_get_first_slave(bond);
while (slave) { bond_for_each_slave(bond, slave, i) {
alb_send_learning_packets(slave,slave->dev->dev_addr); alb_send_learning_packets(slave,slave->dev->dev_addr);
slave = bond_get_next_slave(bond, slave);
} }
read_unlock(&bond->ptrlock);
read_unlock(&bond->curr_slave_lock);
bond_info->lp_counter = 0; bond_info->lp_counter = 0;
} }
/* rebalance tx traffic */ /* rebalance tx traffic */
if (bond_info->tx_rebalance_counter >= BOND_TLB_REBALANCE_TICKS) { if (bond_info->tx_rebalance_counter >= BOND_TLB_REBALANCE_TICKS) {
read_lock(&bond->ptrlock);
slave = bond_get_first_slave(bond); read_lock(&bond->curr_slave_lock);
while (slave) {
bond_for_each_slave(bond, slave, i) {
tlb_clear_slave(bond, slave, 1); tlb_clear_slave(bond, slave, 1);
if (slave == bond->current_slave) { if (slave == bond->curr_active_slave) {
SLAVE_TLB_INFO(slave).load = SLAVE_TLB_INFO(slave).load =
bond_info->unbalanced_load / bond_info->unbalanced_load /
BOND_TLB_REBALANCE_INTERVAL; BOND_TLB_REBALANCE_INTERVAL;
bond_info->unbalanced_load = 0; bond_info->unbalanced_load = 0;
} }
slave = bond_get_next_slave(bond, slave);
} }
read_unlock(&bond->ptrlock);
read_unlock(&bond->curr_slave_lock);
bond_info->tx_rebalance_counter = 0; bond_info->tx_rebalance_counter = 0;
} }
/* handle rlb stuff */ /* handle rlb stuff */
if (bond_info->rlb_enabled) { if (bond_info->rlb_enabled) {
/* the following code changes the promiscuity of the /* the following code changes the promiscuity of the
* the current_slave. It needs to be locked with a * the curr_active_slave. It needs to be locked with a
* write lock to protect from other code that also * write lock to protect from other code that also
* sets the promiscuity. * sets the promiscuity.
*/ */
write_lock(&bond->ptrlock); write_lock(&bond->curr_slave_lock);
if (bond_info->primary_is_promisc && if (bond_info->primary_is_promisc &&
(++bond_info->rlb_promisc_timeout_counter >= (++bond_info->rlb_promisc_timeout_counter >= RLB_PROMISC_TIMEOUT)) {
RLB_PROMISC_TIMEOUT)) {
bond_info->rlb_promisc_timeout_counter = 0; bond_info->rlb_promisc_timeout_counter = 0;
...@@ -1465,12 +1355,13 @@ bond_alb_monitor(struct bonding *bond) ...@@ -1465,12 +1355,13 @@ bond_alb_monitor(struct bonding *bond)
* because a slave was disabled then * because a slave was disabled then
* it can now leave promiscuous mode. * it can now leave promiscuous mode.
*/ */
dev_set_promiscuity(bond->current_slave->dev, -1); dev_set_promiscuity(bond->curr_active_slave->dev, -1);
bond_info->primary_is_promisc = 0; bond_info->primary_is_promisc = 0;
} }
write_unlock(&bond->ptrlock);
if (bond_info->rlb_rebalance == 1) { write_unlock(&bond->curr_slave_lock);
if (bond_info->rlb_rebalance) {
bond_info->rlb_rebalance = 0; bond_info->rlb_rebalance = 0;
rlb_rebalance(bond); rlb_rebalance(bond);
} }
...@@ -1490,28 +1381,23 @@ bond_alb_monitor(struct bonding *bond) ...@@ -1490,28 +1381,23 @@ bond_alb_monitor(struct bonding *bond)
} }
} }
re_arm:
mod_timer(&(bond_info->alb_timer), jiffies + alb_delta_in_ticks);
out: out:
read_unlock(&bond->lock); read_unlock(&bond->lock);
if (bond->device->flags & IFF_UP) {
/* re-arm the timer */
mod_timer(&(bond_info->alb_timer),
jiffies + (HZ/ALB_TIMER_TICKS_PER_SEC));
}
} }
/* assumption: called before the slave is attched to the bond /* assumption: called before the slave is attached to the bond
* and not locked by the bond lock * and not locked by the bond lock
*/ */
int int bond_alb_init_slave(struct bonding *bond, struct slave *slave)
bond_alb_init_slave(struct bonding *bond, struct slave *slave)
{ {
int err = 0; int res;
err = alb_set_slave_mac_addr(slave, slave->perm_hwaddr, res = alb_set_slave_mac_addr(slave, slave->perm_hwaddr,
bond->alb_info.rlb_enabled); bond->alb_info.rlb_enabled);
if (err) { if (res) {
return err; return res;
} }
/* caller must hold the bond lock for write since the mac addresses /* caller must hold the bond lock for write since the mac addresses
...@@ -1519,12 +1405,12 @@ bond_alb_init_slave(struct bonding *bond, struct slave *slave) ...@@ -1519,12 +1405,12 @@ bond_alb_init_slave(struct bonding *bond, struct slave *slave)
*/ */
write_lock_bh(&bond->lock); write_lock_bh(&bond->lock);
err = alb_handle_addr_collision_on_attach(bond, slave); res = alb_handle_addr_collision_on_attach(bond, slave);
write_unlock_bh(&bond->lock); write_unlock_bh(&bond->lock);
if (err) { if (res) {
return err; return res;
} }
tlb_init_slave(slave); tlb_init_slave(slave);
...@@ -1540,8 +1426,7 @@ bond_alb_init_slave(struct bonding *bond, struct slave *slave) ...@@ -1540,8 +1426,7 @@ bond_alb_init_slave(struct bonding *bond, struct slave *slave)
} }
/* Caller must hold bond lock for write */ /* Caller must hold bond lock for write */
void void bond_alb_deinit_slave(struct bonding *bond, struct slave *slave)
bond_alb_deinit_slave(struct bonding *bond, struct slave *slave)
{ {
if (bond->slave_cnt > 1) { if (bond->slave_cnt > 1) {
alb_change_hw_addr_on_detach(bond, slave); alb_change_hw_addr_on_detach(bond, slave);
...@@ -1556,9 +1441,7 @@ bond_alb_deinit_slave(struct bonding *bond, struct slave *slave) ...@@ -1556,9 +1441,7 @@ bond_alb_deinit_slave(struct bonding *bond, struct slave *slave)
} }
/* Caller must hold bond lock for read */ /* Caller must hold bond lock for read */
void void bond_alb_handle_link_change(struct bonding *bond, struct slave *slave, char link)
bond_alb_handle_link_change(struct bonding *bond, struct slave *slave,
char link)
{ {
struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond)); struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
...@@ -1582,109 +1465,111 @@ bond_alb_handle_link_change(struct bonding *bond, struct slave *slave, ...@@ -1582,109 +1465,111 @@ bond_alb_handle_link_change(struct bonding *bond, struct slave *slave,
} }
/** /**
* bond_alb_assign_current_slave - assign new current_slave * bond_alb_handle_active_change - assign new curr_active_slave
* @bond: our bonding struct * @bond: our bonding struct
* @new_slave: new slave to assign * @new_slave: new slave to assign
* *
* Set the bond->current_slave to @new_slave and handle * Set the bond->curr_active_slave to @new_slave and handle
* mac address swapping and promiscuity changes as needed. * mac address swapping and promiscuity changes as needed.
* *
* Caller must hold bond ptrlock for write (or bond lock for write) * Caller must hold bond curr_slave_lock for write (or bond lock for write)
*/ */
void void bond_alb_handle_active_change(struct bonding *bond, struct slave *new_slave)
bond_alb_assign_current_slave(struct bonding *bond, struct slave *new_slave)
{ {
struct slave *swap_slave = bond->current_slave; struct slave *swap_slave;
int i;
if (bond->current_slave == new_slave) { if (bond->curr_active_slave == new_slave) {
return; return;
} }
if (bond->current_slave && bond->alb_info.primary_is_promisc) { if (bond->curr_active_slave && bond->alb_info.primary_is_promisc) {
dev_set_promiscuity(bond->current_slave->dev, -1); dev_set_promiscuity(bond->curr_active_slave->dev, -1);
bond->alb_info.primary_is_promisc = 0; bond->alb_info.primary_is_promisc = 0;
bond->alb_info.rlb_promisc_timeout_counter = 0; bond->alb_info.rlb_promisc_timeout_counter = 0;
} }
bond->current_slave = new_slave; swap_slave = bond->curr_active_slave;
bond->curr_active_slave = new_slave;
if (!new_slave || (bond->slave_cnt == 0)) { if (!new_slave || (bond->slave_cnt == 0)) {
return; return;
} }
/* set the new current_slave to the bonds mac address /* set the new curr_active_slave to the bonds mac address
* i.e. swap mac addresses of old current_slave and new current_slave * i.e. swap mac addresses of old curr_active_slave and new curr_active_slave
*/ */
if (!swap_slave) { if (!swap_slave) {
struct slave *tmp_slave;
/* find slave that is holding the bond's mac address */ /* find slave that is holding the bond's mac address */
swap_slave = bond_get_first_slave(bond); bond_for_each_slave(bond, tmp_slave, i) {
while (swap_slave) { if (!memcmp(tmp_slave->dev->dev_addr,
if (!memcmp(swap_slave->dev->dev_addr, bond->dev->dev_addr, ETH_ALEN)) {
bond->device->dev_addr, ETH_ALEN)) { swap_slave = tmp_slave;
break; break;
} }
swap_slave = bond_get_next_slave(bond, swap_slave);
} }
} }
/* current_slave must be set before calling alb_swap_mac_addr */ /* curr_active_slave must be set before calling alb_swap_mac_addr */
if (swap_slave) { if (swap_slave) {
/* swap mac address */ /* swap mac address */
alb_swap_mac_addr(bond, swap_slave, new_slave); alb_swap_mac_addr(bond, swap_slave, new_slave);
} else { } else {
/* set the new_slave to the bond mac address */ /* set the new_slave to the bond mac address */
alb_set_slave_mac_addr(new_slave, bond->device->dev_addr, alb_set_slave_mac_addr(new_slave, bond->dev->dev_addr,
bond->alb_info.rlb_enabled); bond->alb_info.rlb_enabled);
/* fasten bond mac on new current slave */ /* fasten bond mac on new current slave */
alb_send_learning_packets(new_slave, bond->device->dev_addr); alb_send_learning_packets(new_slave, bond->dev->dev_addr);
} }
} }
int int bond_alb_set_mac_address(struct net_device *bond_dev, void *addr)
bond_alb_set_mac_address(struct net_device *dev, void *addr)
{ {
struct bonding *bond = dev->priv; struct bonding *bond = bond_dev->priv;
struct sockaddr *sa = addr; struct sockaddr *sa = addr;
struct slave *swap_slave = NULL; struct slave *slave, *swap_slave;
int error = 0; int res;
int i;
if (!is_valid_ether_addr(sa->sa_data)) { if (!is_valid_ether_addr(sa->sa_data)) {
return -EADDRNOTAVAIL; return -EADDRNOTAVAIL;
} }
error = alb_set_mac_address(bond, addr); res = alb_set_mac_address(bond, addr);
if (error) { if (res) {
return error; return res;
} }
memcpy(dev->dev_addr, sa->sa_data, dev->addr_len); memcpy(bond_dev->dev_addr, sa->sa_data, bond_dev->addr_len);
/* If there is no current_slave there is nothing else to do. /* If there is no curr_active_slave there is nothing else to do.
* Otherwise we'll need to pass the new address to it and handle * Otherwise we'll need to pass the new address to it and handle
* duplications. * duplications.
*/ */
if (bond->current_slave == NULL) { if (!bond->curr_active_slave) {
return 0; return 0;
} }
swap_slave = bond_get_first_slave(bond); swap_slave = NULL;
while (swap_slave) {
if (!memcmp(swap_slave->dev->dev_addr, dev->dev_addr, ETH_ALEN)) { bond_for_each_slave(bond, slave, i) {
if (!memcmp(slave->dev->dev_addr, bond_dev->dev_addr, ETH_ALEN)) {
swap_slave = slave;
break; break;
} }
swap_slave = bond_get_next_slave(bond, swap_slave);
} }
if (swap_slave) { if (swap_slave) {
alb_swap_mac_addr(bond, swap_slave, bond->current_slave); alb_swap_mac_addr(bond, swap_slave, bond->curr_active_slave);
} else { } else {
alb_set_slave_mac_addr(bond->current_slave, dev->dev_addr, alb_set_slave_mac_addr(bond->curr_active_slave, bond_dev->dev_addr,
bond->alb_info.rlb_enabled); bond->alb_info.rlb_enabled);
alb_send_learning_packets(bond->current_slave, dev->dev_addr); alb_send_learning_packets(bond->curr_active_slave, bond_dev->dev_addr);
if (bond->alb_info.rlb_enabled) { if (bond->alb_info.rlb_enabled) {
/* inform clients mac address has changed */ /* inform clients mac address has changed */
rlb_req_update_slave_clients(bond, bond->current_slave); rlb_req_update_slave_clients(bond, bond->curr_active_slave);
} }
} }
......
...@@ -24,6 +24,9 @@ ...@@ -24,6 +24,9 @@
* 2003/08/06 - Amir Noam <amir.noam at intel dot com> * 2003/08/06 - Amir Noam <amir.noam at intel dot com>
* - Add support for setting bond's MAC address with special * - Add support for setting bond's MAC address with special
* handling required for ALB/TLB. * handling required for ALB/TLB.
*
* 2003/09/24 - Shmulik Hen <shmulik.hen at intel dot com>
* - Code cleanup and style changes
*/ */
#ifndef __BOND_ALB_H__ #ifndef __BOND_ALB_H__
...@@ -126,10 +129,10 @@ void bond_alb_deinitialize(struct bonding *bond); ...@@ -126,10 +129,10 @@ void bond_alb_deinitialize(struct bonding *bond);
int bond_alb_init_slave(struct bonding *bond, struct slave *slave); int bond_alb_init_slave(struct bonding *bond, struct slave *slave);
void bond_alb_deinit_slave(struct bonding *bond, struct slave *slave); void bond_alb_deinit_slave(struct bonding *bond, struct slave *slave);
void bond_alb_handle_link_change(struct bonding *bond, struct slave *slave, char link); void bond_alb_handle_link_change(struct bonding *bond, struct slave *slave, char link);
void bond_alb_assign_current_slave(struct bonding *bond, struct slave *new_slave); void bond_alb_handle_active_change(struct bonding *bond, struct slave *new_slave);
int bond_alb_xmit(struct sk_buff *skb, struct net_device *dev); int bond_alb_xmit(struct sk_buff *skb, struct net_device *bond_dev);
void bond_alb_monitor(struct bonding *bond); void bond_alb_monitor(struct bonding *bond);
int bond_alb_set_mac_address(struct net_device *dev, void *addr); int bond_alb_set_mac_address(struct net_device *bond_dev, void *addr);
#endif /* __BOND_ALB_H__ */ #endif /* __BOND_ALB_H__ */
This source diff could not be displayed because it is too large. You can view the blob instead.
...@@ -9,7 +9,7 @@ ...@@ -9,7 +9,7 @@
* *
* This software may be used and distributed according to the terms * This software may be used and distributed according to the terms
* of the GNU Public License, incorporated herein by reference. * of the GNU Public License, incorporated herein by reference.
* *
* *
* 2003/03/18 - Amir Noam <amir.noam at intel dot com>, * 2003/03/18 - Amir Noam <amir.noam at intel dot com>,
* Tsippy Mendelson <tsippy.mendelson at intel dot com> and * Tsippy Mendelson <tsippy.mendelson at intel dot com> and
...@@ -22,159 +22,205 @@ ...@@ -22,159 +22,205 @@
* *
* 2003/05/01 - Shmulik Hen <shmulik.hen at intel dot com> * 2003/05/01 - Shmulik Hen <shmulik.hen at intel dot com>
* - Added support for Transmit load balancing mode. * - Added support for Transmit load balancing mode.
*
* 2003/09/24 - Shmulik Hen <shmulik.hen at intel dot com>
* - Code cleanup and style changes
*/ */
#ifndef _LINUX_BONDING_H #ifndef _LINUX_BONDING_H
#define _LINUX_BONDING_H #define _LINUX_BONDING_H
#include <linux/timer.h> #include <linux/timer.h>
#include <linux/proc_fs.h> #include <linux/proc_fs.h>
#include <linux/if_bonding.h>
#include "bond_3ad.h" #include "bond_3ad.h"
#include "bond_alb.h" #include "bond_alb.h"
#ifdef BONDING_DEBUG #define DRV_VERSION "2.5.0"
#define DRV_RELDATE "December 1, 2003"
// use this like so: BOND_PRINT_DBG(("foo = %d, bar = %d", foo, bar)); #define DRV_NAME "bonding"
#define BOND_PRINT_DBG(X) \ #define DRV_DESCRIPTION "Ethernet Channel Bonding Driver"
do { \
printk(KERN_DEBUG "%s (%d)", __FUNCTION__, __LINE__); \
printk X; \
printk("\n"); \
} while(0)
#ifdef BONDING_DEBUG
#define dprintk(fmt, args...) \
printk(KERN_DEBUG \
DRV_NAME ": %s() %d: " fmt, __FUNCTION__, __LINE__ , ## args )
#else #else
#define BOND_PRINT_DBG(X) #define dprintk(fmt, args...)
#endif /* BONDING_DEBUG */ #endif /* BONDING_DEBUG */
#define IS_UP(dev) ((((dev)->flags & (IFF_UP)) == (IFF_UP)) && \ #define IS_UP(dev) \
(netif_running(dev) && netif_carrier_ok(dev))) ((((dev)->flags & IFF_UP) == IFF_UP) && \
netif_running(dev) && \
netif_carrier_ok(dev))
/*
* Checks whether bond is ready for transmit.
*
* Caller must hold bond->lock
*/
#define BOND_IS_OK(bond) \
(((bond)->dev->flags & IFF_UP) && \
netif_running((bond)->dev) && \
((bond)->slave_cnt > 0))
/* Checks whether the dev is ready for transmit. We do not check netif_running /*
* since a device can be stopped by the driver for short periods of time for * Checks whether slave is ready for transmit.
* maintainance. dev_queue_xmit() handles this by queing the packet until the
* the dev is running again. Keeping packets ordering requires sticking the
* same dev as much as possible
*/ */
#define SLAVE_IS_OK(slave) \ #define SLAVE_IS_OK(slave) \
((((slave)->dev->flags & (IFF_UP)) == (IFF_UP)) && \ (((slave)->dev->flags & IFF_UP) && \
netif_carrier_ok((slave)->dev) && \ netif_running((slave)->dev) && \
((slave)->link == BOND_LINK_UP) && \ ((slave)->link == BOND_LINK_UP) && \
((slave)->state == BOND_STATE_ACTIVE)) ((slave)->state == BOND_STATE_ACTIVE))
typedef struct slave { #define USES_PRIMARY(mode) \
(((mode) == BOND_MODE_ACTIVEBACKUP) || \
((mode) == BOND_MODE_TLB) || \
((mode) == BOND_MODE_ALB))
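
The `USES_PRIMARY()` macro added above groups the modes that rely on a single active (primary) slave. A minimal user-space sketch of how such a predicate behaves follows. The numeric mode values are an assumption for illustration (the real constants come from `linux/if_bonding.h`), and `count_primary_modes()` is a hypothetical helper that is not part of the driver.

```c
#include <assert.h>

/* Assumed mode numbering, for illustration only; the driver takes the
 * real constants from <linux/if_bonding.h>. */
enum {
	BOND_MODE_ROUNDROBIN   = 0,
	BOND_MODE_ACTIVEBACKUP = 1,
	BOND_MODE_XOR          = 2,
	BOND_MODE_BROADCAST    = 3,
	BOND_MODE_8023AD       = 4,
	BOND_MODE_TLB          = 5,
	BOND_MODE_ALB          = 6
};

/* Same shape as the macro in the diff: true for modes that keep one
 * "current active" slave. */
#define USES_PRIMARY(mode)				\
	(((mode) == BOND_MODE_ACTIVEBACKUP) ||		\
	 ((mode) == BOND_MODE_TLB)          ||		\
	 ((mode) == BOND_MODE_ALB))

/* Hypothetical helper: count how many of the listed modes use a primary. */
static int count_primary_modes(void)
{
	int mode, n = 0;

	for (mode = BOND_MODE_ROUNDROBIN; mode <= BOND_MODE_ALB; mode++) {
		if (USES_PRIMARY(mode))
			n++;
	}
	return n;
}
```

Every use of the argument is parenthesized, so an expression such as `USES_PRIMARY(a | b)` expands safely.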
/*
* Less bad way to call ioctl from within the kernel; this needs to be
* done some other way to get the call out of interrupt context.
* Needs "ioctl" variable to be supplied by calling context.
*/
#define IOCTL(dev, arg, cmd) ({ \
int res = 0; \
mm_segment_t fs = get_fs(); \
set_fs(get_ds()); \
res = ioctl(dev, arg, cmd); \
set_fs(fs); \
res; })
/**
* bond_for_each_slave_from - iterate the slaves list from a starting point
* @bond: the bond holding this list.
* @pos: current slave.
* @cnt: counter for max number of moves
* @start: starting point.
*
* Caller must hold bond->lock
*/
#define bond_for_each_slave_from(bond, pos, cnt, start) \
for (cnt = 0, pos = start; \
cnt < (bond)->slave_cnt; \
cnt++, pos = (pos)->next)
/**
* bond_for_each_slave_from_to - iterate the slaves list from start point to stop point
* @bond: the bond holding this list.
* @pos: current slave.
* @cnt: counter for number max of moves
* @start: start point.
* @stop: stop point.
*
* Caller must hold bond->lock
*/
#define bond_for_each_slave_from_to(bond, pos, cnt, start, stop) \
for (cnt = 0, pos = start; \
((cnt < (bond)->slave_cnt) && (pos != (stop)->next)); \
cnt++, pos = (pos)->next)
/**
* bond_for_each_slave - iterate the slaves list from head
* @bond: the bond holding this list.
* @pos: current slave.
* @cnt: counter for max number of moves
*
* Caller must hold bond->lock
*/
#define bond_for_each_slave(bond, pos, cnt) \
bond_for_each_slave_from(bond, pos, cnt, (bond)->first_slave)
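
The iteration macros above walk a circular list and stop on a counter rather than on a sentinel pointer, which is what lets a traversal start from any slave. The following stand-alone sketch illustrates the same idea in user space. The names `struct node`, `struct ring`, `ring_for_each_from`, `ring_for_each`, `ring_add_tail`, `ring_visit` and `ring_sum` are all hypothetical and exist only for this example; the sketch does no locking, whereas the driver requires `bond->lock` to be held.

```c
#include <assert.h>
#include <stddef.h>

struct node {
	int id;
	struct node *next;
	struct node *prev;
};

struct ring {
	struct node *first;	/* NULL while the ring is empty */
	int cnt;		/* number of nodes in the ring */
};

/* Visit exactly ring->cnt nodes, beginning at 'start'. With cnt == 0 the
 * body never runs, so a NULL start is harmless. */
#define ring_for_each_from(ring, pos, i, start)		\
	for ((i) = 0, (pos) = (start);			\
	     (i) < (ring)->cnt;				\
	     (i)++, (pos) = (pos)->next)

#define ring_for_each(ring, pos, i)			\
	ring_for_each_from(ring, pos, i, (ring)->first)

/* Append a node just before 'first', i.e. at the tail of the ring. */
static void ring_add_tail(struct ring *ring, struct node *n)
{
	if (!ring->first) {
		n->next = n->prev = n;
		ring->first = n;
	} else {
		n->prev = ring->first->prev;
		n->next = ring->first;
		n->prev->next = n;
		ring->first->prev = n;
	}
	ring->cnt++;
}

/* Record the ids seen when walking from 'start'; returns how many were stored. */
static int ring_visit(struct ring *ring, struct node *start, int *ids, int max)
{
	struct node *pos;
	int i, n = 0;

	ring_for_each_from(ring, pos, i, start) {
		if (n < max)
			ids[n++] = pos->id;
	}
	return n;
}

/* Sum all ids, walking from the head of the ring. */
static int ring_sum(struct ring *ring)
{
	struct node *pos;
	int i, sum = 0;

	ring_for_each(ring, pos, i)
		sum += pos->id;
	return sum;
}
```

Because termination depends only on the counter, the loop variable is not NULL after a complete pass over a non-empty ring. Code that searches for a match therefore needs to record the hit explicitly, as `bond_alb_set_mac_address()` does with `swap_slave` in this commit.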
struct slave {
struct net_device *dev; /* first - usefull for panic debug */
struct slave *next; struct slave *next;
struct slave *prev; struct slave *prev;
struct net_device *dev; s16 delay;
short delay; u32 jiffies;
unsigned long jiffies; s8 link; /* one of BOND_LINK_XXXX */
char link; /* one of BOND_LINK_XXXX */ s8 state; /* one of BOND_STATE_XXXX */
char state; /* one of BOND_STATE_XXXX */ u32 original_flags;
unsigned short original_flags; u32 link_failure_count;
u32 link_failure_count;
u16 speed; u16 speed;
u8 duplex; u8 duplex;
u8 perm_hwaddr[ETH_ALEN]; u8 perm_hwaddr[ETH_ALEN];
struct ad_slave_info ad_info; /* HUGE - better to dynamically alloc */ struct ad_slave_info ad_info; /* HUGE - better to dynamically alloc */
struct tlb_slave_info tlb_info; struct tlb_slave_info tlb_info;
} slave_t; };
/* /*
* Here are the locking policies for the two bonding locks: * Here are the locking policies for the two bonding locks:
* *
* 1) Get bond->lock when reading/writing slave list. * 1) Get bond->lock when reading/writing slave list.
* 2) Get bond->ptrlock when reading/writing bond->current_slave. * 2) Get bond->curr_slave_lock when reading/writing bond->curr_active_slave.
* (It is unnecessary when the write-lock is put with bond->lock.) * (It is unnecessary when the write-lock is put with bond->lock.)
* 3) When we lock with bond->ptrlock, we must lock with bond->lock * 3) When we lock with bond->curr_slave_lock, we must lock with bond->lock
* beforehand. * beforehand.
*/ */
typedef struct bonding { struct bonding {
slave_t *next; struct net_device *dev; /* first - usefull for panic debug */
slave_t *prev; struct slave *first_slave;
slave_t *current_slave; struct slave *curr_active_slave;
slave_t *primary_slave; struct slave *current_arp_slave;
slave_t *current_arp_slave; struct slave *primary_slave;
__s32 slave_cnt; s32 slave_cnt; /* never change this value outside the attach/detach wrappers */
rwlock_t lock; rwlock_t lock;
rwlock_t ptrlock; rwlock_t curr_slave_lock;
struct timer_list mii_timer; struct timer_list mii_timer;
struct timer_list arp_timer; struct timer_list arp_timer;
struct net_device_stats stats; s8 kill_timers;
struct net_device_stats stats;
#ifdef CONFIG_PROC_FS #ifdef CONFIG_PROC_FS
struct proc_dir_entry *bond_proc_file; struct proc_dir_entry *proc_entry;
char procdir_name[IFNAMSIZ]; char proc_file_name[IFNAMSIZ];
#endif /* CONFIG_PROC_FS */ #endif /* CONFIG_PROC_FS */
struct list_head bond_list; struct list_head bond_list;
struct net_device *device; struct dev_mc_list *mc_list;
struct dev_mc_list *mc_list; u16 flags;
unsigned short flags; struct ad_bond_info ad_info;
struct ad_bond_info ad_info; struct alb_bond_info alb_info;
struct alb_bond_info alb_info; };
} bonding_t;
/* Forward declarations */
void bond_set_slave_active_flags(slave_t *slave);
void bond_set_slave_inactive_flags(slave_t *slave);
/**
* These functions can be used for iterating the slave list
* (which is circular)
* Caller must hold bond lock for read
*/
extern inline struct slave*
bond_get_first_slave(struct bonding *bond)
{
/* if there are no slaves return NULL */
if (bond->next == (slave_t *)bond) {
return NULL;
}
return bond->next;
}
/**
* Caller must hold bond lock for read
*/
extern inline struct slave*
bond_get_next_slave(struct bonding *bond, struct slave *slave)
{
/* If we have reached the last slave return NULL */
if (slave->next == bond->next) {
return NULL;
}
return slave->next;
}
/** /**
* Returns NULL if the net_device does not belong to any of the bond's slaves * Returns NULL if the net_device does not belong to any of the bond's slaves
* *
* Caller must hold bond lock for read * Caller must hold bond lock for read
*/ */
extern inline struct slave* extern inline struct slave *bond_get_slave_by_dev(struct bonding *bond, struct net_device *slave_dev)
bond_get_slave_by_dev(struct bonding *bond, struct net_device *slave_dev)
{ {
struct slave *our_slave = bond->next; struct slave *slave = NULL;
int i;
/* check if the list of slaves is empty */ bond_for_each_slave(bond, slave, i) {
if (our_slave == (slave_t *)bond) { if (slave->dev == slave_dev) {
return NULL;
}
for (; our_slave; our_slave = bond_get_next_slave(bond, our_slave)) {
if (our_slave->dev == slave_dev) {
break; break;
} }
} }
return our_slave;
return slave;
} }
extern inline struct bonding* extern inline struct bonding *bond_get_bond_by_slave(struct slave *slave)
bond_get_bond_by_slave(struct slave *slave)
{ {
if (!slave || !slave->dev->master) { if (!slave || !slave->dev->master) {
return NULL; return NULL;
} }
return (struct bonding *)(slave->dev->master->priv); return (struct bonding *)slave->dev->master->priv;
}
extern inline void bond_set_slave_inactive_flags(struct slave *slave)
{
slave->state = BOND_STATE_BACKUP;
slave->dev->flags |= IFF_NOARP;
}
extern inline void bond_set_slave_active_flags(struct slave *slave)
{
slave->state = BOND_STATE_ACTIVE;
slave->dev->flags &= ~IFF_NOARP;
} }
#endif /* _LINUX_BONDING_H */ #endif /* _LINUX_BONDING_H */
......
/* /*
* Bond several ethernet interfaces into a Cisco, running 'Etherchannel'. * Bond several ethernet interfaces into a Cisco, running 'Etherchannel'.
* *
* *
* Portions are (c) Copyright 1995 Simon "Guru Aleph-Null" Janes * Portions are (c) Copyright 1995 Simon "Guru Aleph-Null" Janes
* NCM: Network and Communications Management, Inc. * NCM: Network and Communications Management, Inc.
* *
...@@ -10,11 +10,11 @@ ...@@ -10,11 +10,11 @@
* *
* This software may be used and distributed according to the terms * This software may be used and distributed according to the terms
* of the GNU Public License, incorporated herein by reference. * of the GNU Public License, incorporated herein by reference.
* *
* 2003/03/18 - Amir Noam <amir.noam at intel dot com> * 2003/03/18 - Amir Noam <amir.noam at intel dot com>
* - Added support for getting slave's speed and duplex via ethtool. * - Added support for getting slave's speed and duplex via ethtool.
* Needed for 802.3ad and other future modes. * Needed for 802.3ad and other future modes.
* *
* 2003/03/18 - Tsippy Mendelson <tsippy.mendelson at intel dot com> and * 2003/03/18 - Tsippy Mendelson <tsippy.mendelson at intel dot com> and
* Shmulik Hen <shmulik.hen at intel dot com> * Shmulik Hen <shmulik.hen at intel dot com>
* - Enable support of modes that need to use the unique mac address of * - Enable support of modes that need to use the unique mac address of
...@@ -42,7 +42,7 @@ ...@@ -42,7 +42,7 @@
#include <linux/if_ether.h> #include <linux/if_ether.h>
/* userland - kernel ABI version (2003/05/08) */ /* userland - kernel ABI version (2003/05/08) */
#define BOND_ABI_VERSION 1 #define BOND_ABI_VERSION 2
/* /*
* We can remove these ioctl definitions in 2.5. People should use the * We can remove these ioctl definitions in 2.5. People should use the
...@@ -77,10 +77,6 @@ ...@@ -77,10 +77,6 @@
#define BOND_DEFAULT_MAX_BONDS 1 /* Default maximum number of devices to support */ #define BOND_DEFAULT_MAX_BONDS 1 /* Default maximum number of devices to support */
#define BOND_MULTICAST_DISABLED 0
#define BOND_MULTICAST_ACTIVE 1
#define BOND_MULTICAST_ALL 2
typedef struct ifbond { typedef struct ifbond {
__s32 bond_mode; __s32 bond_mode;
__s32 num_slaves; __s32 num_slaves;
...@@ -90,9 +86,9 @@ typedef struct ifbond { ...@@ -90,9 +86,9 @@ typedef struct ifbond {
typedef struct ifslave typedef struct ifslave
{ {
__s32 slave_id; /* Used as an IN param to the BOND_SLAVE_INFO_QUERY ioctl */ __s32 slave_id; /* Used as an IN param to the BOND_SLAVE_INFO_QUERY ioctl */
char slave_name[IFNAMSIZ]; __s8 slave_name[IFNAMSIZ];
char link; __s8 link;
char state; __s8 state;
__u32 link_failure_count; __u32 link_failure_count;
} ifslave; } ifslave;
...@@ -115,3 +111,4 @@ struct ad_info { ...@@ -115,3 +111,4 @@ struct ad_info {
* tab-width: 8 * tab-width: 8
* End: * End:
*/ */