Commit ec041cd9 authored by Jeff Garzik's avatar Jeff Garzik

Merge redhat.com:/spare/repo/netdev-2.6/bonding-1

into redhat.com:/spare/repo/net-drivers-2.5
parents 56549962 61b6a04a
......@@ -21,7 +21,7 @@ userspace tools, please follow the links at the end of this file.
Table of Contents
=================
Installation
Bond Configuration
Module Parameters
......@@ -66,7 +66,7 @@ of the -I option on the ifenslave compile line is to make sure it uses
/usr/include/linux.
To install ifenslave.c, do:
# gcc -Wall -Wstrict-prototypes -O -I/usr/src/linux/include ifenslave.c -o ifenslave
# gcc -Wall -Wstrict-prototypes -O -I/usr/src/linux/include ifenslave.c -o ifenslave
# cp ifenslave /sbin/ifenslave
......@@ -74,10 +74,10 @@ Bond Configuration
==================
You will need to add at least the following line to /etc/modules.conf
so the bonding driver will automatically load when the bond0 interface is
configured. Refer to the modules.conf manual page for specific modules.conf
syntax details. The Module Parameters section of this document describes each
bonding driver parameter.
so the bonding driver will automatically load when the bond0 interface is
configured. Refer to the modules.conf manual page for specific modules.conf
syntax details. The Module Parameters section of this document describes each
bonding driver parameter.
alias bond0 bonding
......@@ -113,7 +113,7 @@ bonding interface (bond1), use MASTER=bond1 in the config file to make the
network interface be a slave of bond1.
Restart the networking subsystem or just bring up the bonding device if your
administration tools allow it. Otherwise, reboot. On Red Hat distros you can
administration tools allow it. Otherwise, reboot. On Red Hat distros you can
issue `ifup bond0' or `/etc/rc.d/init.d/network restart'.
If the administration tools of your distribution do not support
......@@ -128,30 +128,30 @@ manually configure the bonding device with the following commands:
(use appropriate values for your network above)
You can then create a script containing these commands and place it in the
You can then create a script containing these commands and place it in the
appropriate rc directory.
If you specifically need all network drivers loaded before the bonding driver,
adding the following line to modules.conf will cause the network driver for
adding the following line to modules.conf will cause the network driver for
eth0 and eth1 to be loaded before the bonding driver.
probeall bond0 eth0 eth1 bonding
Be careful not to reference bond0 itself at the end of the line, or modprobe
Be careful not to reference bond0 itself at the end of the line, or modprobe
will die in an endless recursive loop.
To have device characteristics (such as MTU size) propagate to slave devices,
set the bond characteristics before enslaving the device. The characteristics
To have device characteristics (such as MTU size) propagate to slave devices,
set the bond characteristics before enslaving the device. The characteristics
are propagated during the enslave process.
If running SNMP agents, the bonding driver should be loaded before any network
drivers participating in a bond. This requirement is due to the the interface
index (ipAdEntIfIndex) being associated to the first interface found with a
given IP address. That is, there is only one ipAdEntIfIndex for each IP
address. For example, if eth0 and eth1 are slaves of bond0 and the driver for
eth0 is loaded before the bonding driver, the interface for the IP address
will be associated with the eth0 interface. This configuration is shown below,
the IP address 192.168.1.1 has an interface index of 2 which indexes to eth0
If running SNMP agents, the bonding driver should be loaded before any network
drivers participating in a bond. This requirement is due to the the interface
index (ipAdEntIfIndex) being associated to the first interface found with a
given IP address. That is, there is only one ipAdEntIfIndex for each IP
address. For example, if eth0 and eth1 are slaves of bond0 and the driver for
eth0 is loaded before the bonding driver, the interface for the IP address
will be associated with the eth0 interface. This configuration is shown below,
the IP address 192.168.1.1 has an interface index of 2 which indexes to eth0
in the ifDescr table (ifDescr.2).
interfaces.ifTable.ifEntry.ifDescr.1 = lo
......@@ -189,10 +189,10 @@ functions such as Interface_Scan_Next will report that association.
Module Parameters
=================
Optional parameters for the bonding driver can be supplied as command line
arguments to the insmod command. Typically, these parameters are specified in
the file /etc/modules.conf (see the manual page for modules.conf). The
available bonding driver parameters are listed below. If a parameter is not
Optional parameters for the bonding driver can be supplied as command line
arguments to the insmod command. Typically, these parameters are specified in
the file /etc/modules.conf (see the manual page for modules.conf). The
available bonding driver parameters are listed below. If a parameter is not
specified the default value is used. When initially configuring a bond, it
is recommended "tail -f /var/log/messages" be run in a separate window to
watch for bonding driver error messages.
......@@ -202,19 +202,19 @@ parameters be specified, otherwise serious network degradation will occur
during link failures.
arp_interval
Specifies the ARP monitoring frequency in milli-seconds.
If ARP monitoring is used in a load-balancing mode (mode 0 or 2), the
switch should be configured in a mode that evenly distributes packets
across all links - such as round-robin. If the switch is configured to
distribute the packets in an XOR fashion, all replies from the ARP
targets will be received on the same link which could cause the other
Specifies the ARP monitoring frequency in milli-seconds.
If ARP monitoring is used in a load-balancing mode (mode 0 or 2), the
switch should be configured in a mode that evenly distributes packets
across all links - such as round-robin. If the switch is configured to
distribute the packets in an XOR fashion, all replies from the ARP
targets will be received on the same link which could cause the other
team members to fail. ARP monitoring should not be used in conjunction
with miimon. A value of 0 disables ARP monitoring. The default value
with miimon. A value of 0 disables ARP monitoring. The default value
is 0.
arp_ip_target
Specifies the ip addresses to use when arp_interval is > 0. These
are the targets of the ARP request sent to determine the health of
the link to the targets. Specify these values in ddd.ddd.ddd.ddd
......@@ -223,8 +223,8 @@ arp_ip_target
maximum number of targets that can be specified is set at 16.
downdelay
Specifies the delay time in milli-seconds to disable a link after a
Specifies the delay time in milli-seconds to disable a link after a
link failure has been detected. This should be a multiple of miimon
value, otherwise the value will be rounded. The default value is 0.
......@@ -247,7 +247,7 @@ max_bonds
and bond2 will be created. The default value is 1.
miimon
Specifies the frequency in milli-seconds that MII link monitoring
will occur. A value of zero disables MII link monitoring. A value
of 100 is a good starting point. See High Availability section for
......@@ -258,7 +258,7 @@ mode
Specifies one of the bonding policies. The default is
round-robin (balance-rr). Possible values are (you can use
either the text or numeric option):
balance-rr or 0
Round-robin policy: Transmit in a sequential order
......@@ -273,7 +273,7 @@ mode
externally visible on only one port (network adapter)
to avoid confusing the switch. This mode provides
fault tolerance.
balance-xor or 2
XOR policy: Transmit based on [(source MAC address
......@@ -293,7 +293,7 @@ mode
groups that share the same speed and duplex settings.
Transmits and receives on all slaves in the active
aggregator.
Pre-requisites:
1. Ethtool support in the base drivers for retrieving the
......@@ -317,7 +317,7 @@ mode
Ethtool support in the base drivers for retrieving the
speed of each slave.
balance-alb or 6
balance-alb or 6
Adaptive load balancing: includes balance-tlb + receive
load balancing (rlb) for IPV4 traffic and does not require
......@@ -327,7 +327,7 @@ mode
overwrites the src hw address with the unique hw address of
one of the slaves in the bond such that different clients
use different hw addresses for the server.
Receive traffic from connections created by the server is
also balanced. When the server sends an ARP Request the
bonding driver copies and saves the client's IP information
......@@ -363,25 +363,11 @@ mode
2. Base driver support for setting the hw address of a
device also when it is open. This is required so that there
will always be one slave in the team using the bond hw
address (the current_slave) while having a unique hw
address for each slave in the bond. If the current_slave
fails it's hw address is swapped with the new current_slave
address (the curr_active_slave) while having a unique hw
address for each slave in the bond. If the curr_active_slave
fails it's hw address is swapped with the new curr_active_slave
that was chosen.
multicast
Option specifying the mode of operation for multicast support.
Possible values are:
disabled or 0
Disabled (no multicast support)
active or 1
Enabled on active slave only, useful in active-backup mode
all or 2
Enabled on all slaves, this is the default
primary
A string (eth0, eth2, etc) to equate to a primary device. If this
......@@ -397,11 +383,11 @@ primary
primary is only valid in active-backup mode.
updelay
Specifies the delay time in milli-seconds to enable a link after a
Specifies the delay time in milli-seconds to enable a link after a
link up status has been detected. This should be a multiple of miimon
value, otherwise the value will be rounded. The default value is 0.
use_carrier
Specifies whether or not miimon should use MII or ETHTOOL
......@@ -529,20 +515,20 @@ Verifying Bond Configuration
----------------------------
The bonding driver information files reside in the /proc/net/bonding directory.
Sample contents of /proc/net/bonding/bond0 after the driver is loaded with
Sample contents of /proc/net/bonding/bond0 after the driver is loaded with
parameters of mode=0 and miimon=1000 is shown below.
Bonding Mode: load balancing (round-robin)
Currently Active Slave: eth0
MII Status: up
MII Polling Interval (ms): 1000
Up Delay (ms): 0
Down Delay (ms): 0
Slave Interface: eth1
MII Status: up
Link Failure Count: 1
Slave Interface: eth0
MII Status: up
Link Failure Count: 1
......@@ -550,34 +536,34 @@ parameters of mode=0 and miimon=1000 is shown below.
2) Network verification
-----------------------
The network configuration can be verified using the ifconfig command. In
the example below, the bond0 interface is the master (MASTER) while eth0 and
eth1 are slaves (SLAVE). Notice all slaves of bond0 have the same MAC address
the example below, the bond0 interface is the master (MASTER) while eth0 and
eth1 are slaves (SLAVE). Notice all slaves of bond0 have the same MAC address
(HWaddr) as bond0 for all modes except TLB and ALB that require a unique MAC
address for each slave.
[root]# /sbin/ifconfig
bond0 Link encap:Ethernet HWaddr 00:C0:F0:1F:37:B4
bond0 Link encap:Ethernet HWaddr 00:C0:F0:1F:37:B4
inet addr:XXX.XXX.XXX.YYY Bcast:XXX.XXX.XXX.255 Mask:255.255.252.0
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
RX packets:7224794 errors:0 dropped:0 overruns:0 frame:0
TX packets:3286647 errors:1 dropped:0 overruns:1 carrier:0
collisions:0 txqueuelen:0
collisions:0 txqueuelen:0
eth0 Link encap:Ethernet HWaddr 00:C0:F0:1F:37:B4
eth0 Link encap:Ethernet HWaddr 00:C0:F0:1F:37:B4
inet addr:XXX.XXX.XXX.YYY Bcast:XXX.XXX.XXX.255 Mask:255.255.252.0
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:3573025 errors:0 dropped:0 overruns:0 frame:0
TX packets:1643167 errors:1 dropped:0 overruns:1 carrier:0
collisions:0 txqueuelen:100
Interrupt:10 Base address:0x1080
collisions:0 txqueuelen:100
Interrupt:10 Base address:0x1080
eth1 Link encap:Ethernet HWaddr 00:C0:F0:1F:37:B4
eth1 Link encap:Ethernet HWaddr 00:C0:F0:1F:37:B4
inet addr:XXX.XXX.XXX.YYY Bcast:XXX.XXX.XXX.255 Mask:255.255.252.0
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:3651769 errors:0 dropped:0 overruns:0 frame:0
TX packets:1643480 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:100
Interrupt:9 Base address:0x1400
collisions:0 txqueuelen:100
Interrupt:9 Base address:0x1400
Frequently Asked Questions
......@@ -605,9 +591,9 @@ Frequently Asked Questions
5. What happens when a slave link dies?
If your ethernet cards support MII or ETHTOOL link status monitoring
and the MII monitoring has been enabled in the driver (see description
of module parameters), there will be no adverse consequences. This
If your ethernet cards support MII or ETHTOOL link status monitoring
and the MII monitoring has been enabled in the driver (see description
of module parameters), there will be no adverse consequences. This
release of the bonding driver knows how to get the MII information and
enables or disables its slaves according to their link status.
See section on High Availability for additional information.
......@@ -622,8 +608,8 @@ Frequently Asked Questions
slave.
If neither mii_monitor and arp_interval is configured, the bonding
driver will not handle this situation very well. The driver will
continue to send packets but some packets will be lost. Retransmits
driver will not handle this situation very well. The driver will
continue to send packets but some packets will be lost. Retransmits
will cause serious degradation of performance (in the case when one
of two slave links fails, 50% packets will be lost, which is a serious
problem for both TCP and UDP).
......@@ -636,9 +622,9 @@ Frequently Asked Questions
7. Which switches/systems does it work with?
In round-robin and XOR mode, it works with systems that support
In round-robin and XOR mode, it works with systems that support
trunking:
* Many Cisco switches and routers (look for EtherChannel support).
* SunTrunking software.
* Alteon AceDirector switches / WebOS (use Trunks).
......@@ -646,7 +632,7 @@ Frequently Asked Questions
models (450) can define trunks between ports on different physical
units.
* Linux bonding, of course !
In 802.3ad mode, it works with with systems that support IEEE 802.3ad
Dynamic Link Aggregation:
......@@ -667,21 +653,21 @@ Frequently Asked Questions
is then passed to all following slaves and remains persistent (even if
the the first slave is removed) until the bonding device is brought
down or reconfigured.
If you wish to change the MAC address, you can set it with ifconfig:
# ifconfig bond0 hw ether 00:11:22:33:44:55
The MAC address can be also changed by bringing down/up the device
and then changing its slaves (or their order):
# ifconfig bond0 down ; modprobe -r bonding
# ifconfig bond0 .... up
# ifenslave bond0 eth...
This method will automatically take the address from the next slave
that will be added.
To restore your slaves' MAC addresses, you need to detach them
from the bond (`ifenslave -d bond0 eth0'), set them down
(`ifconfig eth0 down'), unload the drivers (`rmmod 3c59x', for
......@@ -729,27 +715,27 @@ High Availability
=================
To implement high availability using the bonding driver, the driver needs to be
compiled as a module, because currently it is the only way to pass parameters
compiled as a module, because currently it is the only way to pass parameters
to the driver. This may change in the future.
High availability is achieved by using MII or ETHTOOL status reporting. You
need to verify that all your interfaces support MII or ETHTOOL link status
reporting. On Linux kernel 2.2.17, all the 100 Mbps capable drivers and
yellowfin gigabit driver support MII. To determine if ETHTOOL link reporting
is available for interface eth0, type "ethtool eth0" and the "Link detected:"
line should contain the correct link status. If your system has an interface
that does not support MII or ETHTOOL status reporting, a failure of its link
will not be detected! A message indicating MII and ETHTOOL is not supported by
a network driver is logged when the bonding driver is loaded with a non-zero
High availability is achieved by using MII or ETHTOOL status reporting. You
need to verify that all your interfaces support MII or ETHTOOL link status
reporting. On Linux kernel 2.2.17, all the 100 Mbps capable drivers and
yellowfin gigabit driver support MII. To determine if ETHTOOL link reporting
is available for interface eth0, type "ethtool eth0" and the "Link detected:"
line should contain the correct link status. If your system has an interface
that does not support MII or ETHTOOL status reporting, a failure of its link
will not be detected! A message indicating MII and ETHTOOL is not supported by
a network driver is logged when the bonding driver is loaded with a non-zero
miimon value.
The bonding driver can regularly check all its slaves links using the ETHTOOL
IOCTL (ETHTOOL_GLINK command) or by checking the MII status registers. The
check interval is specified by the module argument "miimon" (MII monitoring).
It takes an integer that represents the checking time in milliseconds. It
should not come to close to (1000/HZ) (10 milli-seconds on i386) because it
may then reduce the system interactivity. A value of 100 seems to be a good
starting point. It means that a dead link will be detected at most 100
IOCTL (ETHTOOL_GLINK command) or by checking the MII status registers. The
check interval is specified by the module argument "miimon" (MII monitoring).
It takes an integer that represents the checking time in milliseconds. It
should not come to close to (1000/HZ) (10 milli-seconds on i386) because it
may then reduce the system interactivity. A value of 100 seems to be a good
starting point. It means that a dead link will be detected at most 100
milli-seconds after it goes down.
Example:
......@@ -761,7 +747,7 @@ Or, put the following lines in /etc/modules.conf:
alias bond0 bonding
options bond0 miimon=100
There are currently two policies for high availability. They are dependent on
There are currently two policies for high availability. They are dependent on
whether:
a) hosts are connected to a single host or switch that support trunking
......@@ -811,7 +797,7 @@ Example 2 : host to switch at twice the speed
# ifenslave bond0 eth0 eth1
2) High Availability on two or more switches (or a single switch without
2) High Availability on two or more switches (or a single switch without
trunking support)
---------------------------------------------------------------------------
This mode is more problematic because it relies on the fact that there
......@@ -870,10 +856,10 @@ by another external mechanism, it is good to have host1's active interface
connected to one switch and host2's to the other. Such system will survive
a failure of a single host, cable, or switch. The worst thing that may happen
in the case of a switch failure is that half of the hosts will be temporarily
unreachable until the other switch expires its tables.
unreachable until the other switch expires its tables.
Example 2: Using multiple ethernet cards connected to a switch to configure
NIC failover (switch is not required to support trunking).
NIC failover (switch is not required to support trunking).
+----------+ +----------+
......@@ -957,7 +943,7 @@ The main limitations are :
servers, but may be useful when the front switches send multicast
information on their links (e.g. VRRP), or even health-check the servers.
Use the arp_interval/arp_ip_target parameters to count incoming/outgoing
frames.
frames.
......@@ -973,13 +959,12 @@ Donald Becker's Ethernet Drivers and diag programs may be found at :
You will also find a lot of information regarding Ethernet, NWay, MII, etc. at
www.scyld.com.
For new versions of the driver, patches for older kernels and the updated
userspace tools, take a look at Willy Tarreau's site :
Patches for 2.2 kernels are at Willy Tarreau's site :
- http://wtarreau.free.fr/pub/bonding/
- http://www-miaif.lip6.fr/willy/pub/bonding/
- http://www-miaif.lip6.fr/~tarreau/pub/bonding/
To get latest informations about Linux Kernel development, please consult
the Linux Kernel Mailing List Archives at :
http://boudicca.tux.org/hypermail/linux-kernel/latest/
http://www.ussg.iu.edu/hypermail/linux/kernel/
-- END --
......@@ -4,8 +4,6 @@
* This program controls the Linux implementation of running multiple
* network interfaces in parallel.
*
* Usage: ifenslave [-v] master-interface < slave-interface [metric <N>] > ...
*
* Author: Donald Becker <becker@cesdis.gsfc.nasa.gov>
* Copyright 1994-1996 Donald Becker
*
......@@ -90,24 +88,30 @@
* - For opt_c: slave should not be set to the master's setting
* while it is running. It was already set during enslave. To
* simplify things, it is now handeled separately.
*
* - 2003/09/24 - Shmulik Hen <shmulik.hen at intel dot com>
* - Code cleanup and style changes
* set version to 1.1.0
*/
#define APP_VERSION "1.0.12"
#define APP_RELDATE "June 30, 2003"
#define APP_VERSION "1.1.0"
#define APP_RELDATE "Septemer 24, 2003"
#define APP_NAME "ifenslave"
static char *version =
APP_NAME ".c:v" APP_VERSION " (" APP_RELDATE ") " "\nDonald Becker (becker@cesdis.gsfc.nasa.gov).\n"
"detach support added on 2000/10/02 by Willy Tarreau (willy at meta-x.org).\n"
"2.4 kernel support added on 2001/02/16 by Chad N. Tindel (ctindel at ieee dot org.\n";
APP_NAME ".c:v" APP_VERSION " (" APP_RELDATE ")\n"
"o Donald Becker (becker@cesdis.gsfc.nasa.gov).\n"
"o Detach support added on 2000/10/02 by Willy Tarreau (willy at meta-x.org).\n"
"o 2.4 kernel support added on 2001/02/16 by Chad N. Tindel\n"
" (ctindel at ieee dot org).\n";
static const char *usage_msg =
"Usage: ifenslave [-adfrvVh] <master-interface> < <slave-if> [metric <N>] > ...\n"
" ifenslave -c master-interface slave-if\n";
"Usage: ifenslave [-f] <master-if> <slave-if> [<slave-if>...]\n"
" ifenslave -d <master-if> <slave-if> [<slave-if>...]\n"
" ifenslave -c <master-if> <slave-if>\n"
" ifenslave --help\n";
static const char *howto_msg =
"Usage: ifenslave [-adfrvVh] <master-interface> < <slave-if> [metric <N>] > ...\n"
" ifenslave -c master-interface slave-if\n"
static const char *help_msg =
"\n"
" To create a bond device, simply follow these three steps :\n"
" - ensure that the required drivers are properly loaded :\n"
......@@ -115,18 +119,32 @@ static const char *howto_msg =
" - assign an IP address to the bond device :\n"
" # ifconfig bond0 <addr> netmask <mask> broadcast <bcast>\n"
" - attach all the interfaces you need to the bond device :\n"
" # ifenslave bond0 eth0 eth1 eth2\n"
" # ifenslave [{-f|--force}] bond0 eth0 [eth1 [eth2]...]\n"
" If bond0 didn't have a MAC address, it will take eth0's. Then, all\n"
" interfaces attached AFTER this assignment will get the same MAC addr.\n"
"\n"
" To detach a dead interface without setting the bond device down :\n"
" # ifenslave -d bond0 eth1\n"
" (except for ALB/TLB modes)\n"
"\n"
" To set the bond device down and automatically release all the slaves :\n"
" # ifconfig bond0 down\n"
"\n"
" To detach a dead interface without setting the bond device down :\n"
" # ifenslave {-d|--detach} bond0 eth0 [eth1 [eth2]...]\n"
"\n"
" To change active slave :\n"
" # ifenslave -c bond0 eth0\n"
" # ifenslave {-c|--change-active} bond0 eth0\n"
"\n"
" To show master interface info\n"
" # ifenslave bond0\n"
"\n"
" To show all interfaces info\n"
" # ifenslave {-a|--all-interfaces}\n"
"\n"
" To be more verbose\n"
" # ifenslave {-v|--verbose} ...\n"
"\n"
" # ifenslave {-u|--usage} Show usage\n"
" # ifenslave {-V|--version} Show version\n"
" # ifenslave {-h|--help} This message\n"
"\n";
#include <unistd.h>
......@@ -153,476 +171,332 @@ typedef __uint8_t u8; /* ditto */
#include <linux/ethtool.h>
struct option longopts[] = {
/* { name has_arg *flag val } */
{"all-interfaces", 0, 0, 'a'}, /* Show all interfaces. */
{"force", 0, 0, 'f'}, /* Force the operation. */
{"help", 0, 0, '?'}, /* Give help */
{"howto", 0, 0, 'h'}, /* Give some more help */
{"receive-slave", 0, 0, 'r'}, /* Make a receive-only slave. */
{"verbose", 0, 0, 'v'}, /* Report each action taken. */
{"version", 0, 0, 'V'}, /* Emit version information. */
{"detach", 0, 0, 'd'}, /* Detach a slave interface. */
{"change-active", 0, 0, 'c'}, /* Change the active slave. */
{ 0, 0, 0, 0 }
/* { name has_arg *flag val } */
{"all-interfaces", 0, 0, 'a'}, /* Show all interfaces. */
{"change-active", 0, 0, 'c'}, /* Change the active slave. */
{"detach", 0, 0, 'd'}, /* Detach a slave interface. */
{"force", 0, 0, 'f'}, /* Force the operation. */
{"help", 0, 0, 'h'}, /* Give help */
{"usage", 0, 0, 'u'}, /* Give usage */
{"verbose", 0, 0, 'v'}, /* Report each action taken. */
{"version", 0, 0, 'V'}, /* Emit version information. */
{ 0, 0, 0, 0}
};
/* Command-line flags. */
unsigned int
opt_a = 0, /* Show-all-interfaces flag. */
opt_f = 0, /* Force the operation. */
opt_r = 0, /* Set up a Rx-only slave. */
opt_d = 0, /* detach a slave interface. */
opt_c = 0, /* change-active-slave flag. */
verbose = 0, /* Verbose flag. */
opt_version = 0,
opt_howto = 0;
int skfd = -1; /* AF_INET socket for ioctl() calls. */
opt_a = 0, /* Show-all-interfaces flag. */
opt_c = 0, /* Change-active-slave flag. */
opt_d = 0, /* Detach a slave interface. */
opt_f = 0, /* Force the operation. */
opt_h = 0, /* Help */
opt_u = 0, /* Usage */
opt_v = 0, /* Verbose flag. */
opt_V = 0; /* Version */
int skfd = -1; /* AF_INET socket for ioctl() calls.*/
int abi_ver = 0; /* userland - kernel ABI version */
int hwaddr_set = 0; /* Master's hwaddr is set */
int saved_errno;
struct ifreq master_mtu, master_flags, master_hwaddr;
struct ifreq slave_mtu, slave_flags, slave_hwaddr;
struct dev_ifr {
struct ifreq *req_ifr;
char *req_name;
int req_type;
};
static void if_print(char *ifname);
static int get_abi_ver(char *master_ifname);
struct dev_ifr master_ifra[] = {
{&master_mtu, "SIOCGIFMTU", SIOCGIFMTU},
{&master_flags, "SIOCGIFFLAGS", SIOCGIFFLAGS},
{&master_hwaddr, "SIOCGIFHWADDR", SIOCGIFHWADDR},
{NULL, "", 0}
};
struct dev_ifr slave_ifra[] = {
{&slave_mtu, "SIOCGIFMTU", SIOCGIFMTU},
{&slave_flags, "SIOCGIFFLAGS", SIOCGIFFLAGS},
{&slave_hwaddr, "SIOCGIFHWADDR", SIOCGIFHWADDR},
{NULL, "", 0}
};
int
main(int argc, char **argv)
static void if_print(char *ifname);
static int get_drv_info(char *master_ifname);
static int get_if_settings(char *ifname, struct dev_ifr ifra[]);
static int get_slave_flags(char *slave_ifname);
static int set_master_hwaddr(char *master_ifname, struct sockaddr *hwaddr);
static int set_slave_hwaddr(char *slave_ifname, struct sockaddr *hwaddr);
static int set_slave_mtu(char *slave_ifname, int mtu);
static int set_if_flags(char *ifname, short flags);
static int set_if_up(char *ifname, short flags);
static int set_if_down(char *ifname, short flags);
static int clear_if_addr(char *ifname);
static int set_if_addr(char *master_ifname, char *slave_ifname);
static int change_active(char *master_ifname, char *slave_ifname);
static int enslave(char *master_ifname, char *slave_ifname);
static int release(char *master_ifname, char *slave_ifname);
#define v_print(fmt, args...) \
if (opt_v) \
fprintf(stderr, fmt, ## args )
int main(int argc, char *argv[])
{
struct ifreq ifr2, if_hwaddr, if_ipaddr, if_metric, if_mtu, if_dstaddr;
struct ifreq if_netmask, if_brdaddr, if_flags;
int rv, goterr = 0;
int c, errflag = 0;
sa_family_t master_family;
char **spp, *master_ifname, *slave_ifname;
int hwaddr_notset;
int abi_ver = 0;
int c, i, rv;
int res = 0;
int exclusive = 0;
while ((c = getopt_long(argc, argv, "acdfrvV?h", longopts, 0)) != EOF)
while ((c = getopt_long(argc, argv, "acdfhuvV", longopts, 0)) != EOF) {
switch (c) {
case 'a': opt_a++; break;
case 'f': opt_f++; break;
case 'r': opt_r++; break;
case 'd': opt_d++; break;
case 'c': opt_c++; break;
case 'v': verbose++; break;
case 'V': opt_version++; break;
case 'h': opt_howto++; break;
case '?': errflag++;
}
/* option check */
if (opt_c)
if(opt_a || opt_f || opt_r || opt_d || verbose || opt_version ||
opt_howto || errflag ) {
case 'a': opt_a++; exclusive++; break;
case 'c': opt_c++; exclusive++; break;
case 'd': opt_d++; exclusive++; break;
case 'f': opt_f++; exclusive++; break;
case 'h': opt_h++; exclusive++; break;
case 'u': opt_u++; exclusive++; break;
case 'v': opt_v++; break;
case 'V': opt_V++; exclusive++; break;
case '?':
fprintf(stderr, usage_msg);
return 2;
res = 2;
goto out;
}
}
if (errflag) {
/* options check */
if (exclusive > 1) {
fprintf(stderr, usage_msg);
return 2;
res = 2;
goto out;
}
if (opt_howto) {
fprintf(stderr, howto_msg);
return 0;
if (opt_v || opt_V) {
printf(version);
if (opt_V) {
res = 0;
goto out;
}
}
if (verbose || opt_version) {
printf(version);
if (opt_version)
exit(0);
if (opt_u) {
printf(usage_msg);
res = 0;
goto out;
}
/* Open a basic socket. */
if ((skfd = socket(AF_INET, SOCK_DGRAM,0)) < 0) {
perror("socket");
exit(-1);
if (opt_h) {
printf(usage_msg);
printf(help_msg);
res = 0;
goto out;
}
if (verbose)
fprintf(stderr, "DEBUG: argc=%d, optind=%d and argv[optind] is %s.\n",
argc, optind, argv[optind]);
/* Open a basic socket */
if ((skfd = socket(AF_INET, SOCK_DGRAM, 0)) < 0) {
perror("socket");
res = 1;
goto out;
}
/* No remaining args means show all interfaces. */
if (optind == argc) {
if_print((char *)NULL);
(void) close(skfd);
exit(0);
if (opt_a) {
if (optind == argc) {
/* No remaining args */
/* show all interfaces */
if_print((char *)NULL);
goto out;
} else {
/* Just show usage */
fprintf(stderr, usage_msg);
res = 2;
goto out;
}
}
/* Copy the interface name. */
/* Copy the interface name */
spp = argv + optind;
master_ifname = *spp++;
slave_ifname = *spp++;
/* Check command line. */
if (opt_c) {
char **tempp = spp;
if ((master_ifname == NULL)||(slave_ifname == NULL)||(*tempp++ != NULL)) {
fprintf(stderr, usage_msg);
(void) close(skfd);
return 2;
}
if (master_ifname == NULL) {
fprintf(stderr, usage_msg);
res = 2;
goto out;
}
/* A single args means show the configuration for this interface. */
if (slave_ifname == NULL) {
if_print(master_ifname);
(void) close(skfd);
exit(0);
}
/* exchange abi version with bonding driver */
abi_ver = get_abi_ver(master_ifname);
if (abi_ver < 0) {
(void) close(skfd);
exit(1);
}
/* Get the vitals from the master interface. */
{
struct ifreq *ifra[7] = { &if_ipaddr, &if_mtu, &if_dstaddr,
&if_brdaddr, &if_netmask, &if_flags,
&if_hwaddr };
const char *req_name[7] = {
"IP address", "MTU", "destination address",
"broadcast address", "netmask", "status flags",
"hardware address" };
const int ioctl_req_type[7] = {
SIOCGIFADDR, SIOCGIFMTU, SIOCGIFDSTADDR,
SIOCGIFBRDADDR, SIOCGIFNETMASK, SIOCGIFFLAGS,
SIOCGIFHWADDR };
int i;
for (i = 0; i < 7; i++) {
strncpy(ifra[i]->ifr_name, master_ifname, IFNAMSIZ);
if (ioctl(skfd, ioctl_req_type[i], ifra[i]) < 0) {
fprintf(stderr,
"Something broke getting the master's %s: %s.\n",
req_name[i], strerror(errno));
}
}
/* exchange abi version with bonding module */
res = get_drv_info(master_ifname);
if (res) {
fprintf(stderr,
"Master '%s': Error: handshake with driver failed. "
"Aborting\n",
master_ifname);
goto out;
}
/* check if master is up; if not then fail any operation */
if (!(if_flags.ifr_flags & IFF_UP)) {
fprintf(stderr, "Illegal operation; the specified master interface '%s' is not up.\n", master_ifname);
(void) close(skfd);
exit (1);
}
slave_ifname = *spp++;
hwaddr_notset = 1; /* assume master's address not set yet */
for (i = 0; hwaddr_notset && (i < 6); i++) {
hwaddr_notset &= ((unsigned char *)if_hwaddr.ifr_hwaddr.sa_data)[i] == 0;
if (slave_ifname == NULL) {
if (opt_d || opt_c) {
fprintf(stderr, usage_msg);
res = 2;
goto out;
}
/* The family '1' is ARPHRD_ETHER for ethernet. */
if (if_hwaddr.ifr_hwaddr.sa_family != 1 && !opt_f) {
fprintf(stderr, "The specified master interface '%s' is not"
" ethernet-like.\n This program is designed to work"
" with ethernet-like network interfaces.\n"
" Use the '-f' option to force the operation.\n",
master_ifname);
(void) close(skfd);
exit (1);
}
master_family = if_hwaddr.ifr_hwaddr.sa_family;
if (verbose) {
unsigned char *hwaddr = (unsigned char *)if_hwaddr.ifr_hwaddr.sa_data;
printf("The current hardware address (SIOCGIFHWADDR) of %s is type %d "
"%2.2x:%2.2x:%2.2x:%2.2x:%2.2x:%2.2x.\n", master_ifname,
if_hwaddr.ifr_hwaddr.sa_family, hwaddr[0], hwaddr[1],
hwaddr[2], hwaddr[3], hwaddr[4], hwaddr[5]);
}
/* A single arg means show the
* configuration for this interface
*/
if_print(master_ifname);
goto out;
}
res = get_if_settings(master_ifname, master_ifra);
if (res) {
/* Probably a good reason not to go on */
fprintf(stderr,
"Master '%s': Error: get settings failed: %s. "
"Aborting\n",
master_ifname, strerror(res));
goto out;
}
/* do this when enslaving interfaces */
do {
if (opt_d) { /* detach a slave interface from the master */
strncpy(if_flags.ifr_name, master_ifname, IFNAMSIZ);
strncpy(if_flags.ifr_slave, slave_ifname, IFNAMSIZ);
if ((ioctl(skfd, SIOCBONDRELEASE, &if_flags) < 0) &&
(ioctl(skfd, BOND_RELEASE_OLD, &if_flags) < 0)) {
fprintf(stderr, "SIOCBONDRELEASE: cannot detach %s from %s. errno=%s.\n",
slave_ifname, master_ifname, strerror(errno));
}
else if (abi_ver < 1) {
/* The driver is using an old ABI, so we'll set the interface
* down to avoid any conflicts due to same IP/MAC
*/
strncpy(ifr2.ifr_name, slave_ifname, IFNAMSIZ);
if (ioctl(skfd, SIOCGIFFLAGS, &ifr2) < 0) {
int saved_errno = errno;
fprintf(stderr, "SIOCGIFFLAGS on %s failed: %s\n", slave_ifname,
strerror(saved_errno));
}
else {
ifr2.ifr_flags &= ~(IFF_UP | IFF_RUNNING);
if (ioctl(skfd, SIOCSIFFLAGS, &ifr2) < 0) {
int saved_errno = errno;
fprintf(stderr, "Shutting down interface %s failed: %s\n",
slave_ifname, strerror(saved_errno));
}
}
}
} else if (opt_c) { /* change primary slave */
strncpy(if_flags.ifr_name, master_ifname, IFNAMSIZ);
strncpy(if_flags.ifr_slave, slave_ifname, IFNAMSIZ);
if ((ioctl(skfd, SIOCBONDCHANGEACTIVE, &if_flags) < 0) &&
(ioctl(skfd, BOND_CHANGE_ACTIVE_OLD, &if_flags) < 0)) {
fprintf(stderr, "SIOCBONDCHANGEACTIVE: %s.\n", strerror(errno));
}
} else { /* attach a slave interface to the master */
strncpy(ifr2.ifr_name, slave_ifname, IFNAMSIZ);
if (ioctl(skfd, SIOCGIFFLAGS, &ifr2) < 0) {
int saved_errno = errno;
fprintf(stderr, "SIOCGIFFLAGS on %s failed: %s\n", slave_ifname,
strerror(saved_errno));
(void) close(skfd);
return 1;
}
if ((ifr2.ifr_flags & IFF_SLAVE) && !opt_r) {
fprintf(stderr, "%s is already a slave\n", slave_ifname);
(void) close(skfd);
return 1;
}
/* if hwaddr_notset, assign the slave hw address to the master */
if (hwaddr_notset) {
/* assign the slave hw address to the
* master since it currently does not
* have one; otherwise, slaves may
* have different hw addresses in
* active-backup mode as seen when enslaving
* using "ifenslave bond0 eth0 eth1" because
* hwaddr_notset is set outside this loop.
* TODO: put this and the "else" portion in
* a function.
*/
/* get the slaves MAC address */
strncpy(if_hwaddr.ifr_name, slave_ifname,
IFNAMSIZ);
rv = ioctl(skfd, SIOCGIFHWADDR, &if_hwaddr);
if (-1 == rv) {
fprintf(stderr, "Could not get MAC "
"address of %s: %s\n",
slave_ifname,
strerror(errno));
strncpy(if_hwaddr.ifr_name,
master_ifname, IFNAMSIZ);
goterr = 1;
}
if (!goterr) {
if (abi_ver < 1) {
/* In ABI versions older than 1, the
* master's set_mac routine couldn't
* work if it was up, because it
* used the default ethernet set_mac
* function.
*/
/* bring master down */
if_flags.ifr_flags &= ~IFF_UP;
if (ioctl(skfd, SIOCSIFFLAGS,
&if_flags) < 0) {
goterr = 1;
fprintf(stderr,
"Shutting down "
"interface %s failed: "
"%s\n",
master_ifname,
strerror(errno));
}
}
strncpy(if_hwaddr.ifr_name,
master_ifname, IFNAMSIZ);
if (ioctl(skfd, SIOCSIFHWADDR,
&if_hwaddr) < 0) {
fprintf(stderr,
"Could not set MAC "
"address of %s: %s\n",
master_ifname,
strerror(errno));
goterr=1;
} else {
hwaddr_notset = 0;
}
if (abi_ver < 1) {
/* bring master back up */
if_flags.ifr_flags |= IFF_UP;
if (ioctl(skfd, SIOCSIFFLAGS,
&if_flags) < 0) {
fprintf(stderr,
"Bringing up interface "
"%s failed: %s\n",
master_ifname,
strerror(errno));
}
}
}
} else if (abi_ver < 1) { /* if (hwaddr_notset) */
/* The driver is using an old ABI, so we'll set the interface
* down and assign the master's hwaddr to it
*/
if (ifr2.ifr_flags & IFF_UP) {
ifr2.ifr_flags &= ~IFF_UP;
if (ioctl(skfd, SIOCSIFFLAGS, &ifr2) < 0) {
int saved_errno = errno;
fprintf(stderr, "Shutting down interface %s failed: %s\n",
slave_ifname, strerror(saved_errno));
}
}
strncpy(if_hwaddr.ifr_name, slave_ifname, IFNAMSIZ);
if (ioctl(skfd, SIOCSIFHWADDR, &if_hwaddr) < 0) {
int saved_errno = errno;
fprintf(stderr, "SIOCSIFHWADDR on %s failed: %s\n", if_hwaddr.ifr_name,
strerror(saved_errno));
if (saved_errno == EBUSY)
fprintf(stderr, " The slave device %s is busy: it must be"
" idle before running this command.\n", slave_ifname);
else if (saved_errno == EOPNOTSUPP)
fprintf(stderr, " The slave device you specified does not support"
" setting the MAC address.\n Your kernel likely does not"
" support slave devices.\n");
else if (saved_errno == EINVAL)
fprintf(stderr, " The slave device's address type does not match"
" the master's address type.\n");
} else {
if (verbose) {
unsigned char *hwaddr = if_hwaddr.ifr_hwaddr.sa_data;
printf("Slave's (%s) hardware address set to "
"%2.2x:%2.2x:%2.2x:%2.2x:%2.2x:%2.2x.\n", slave_ifname,
hwaddr[0], hwaddr[1], hwaddr[2], hwaddr[3], hwaddr[4], hwaddr[5]);
}
}
}
/* check if master is indeed a master;
* if not then fail any operation
*/
if (!(master_flags.ifr_flags & IFF_MASTER)) {
fprintf(stderr,
"Illegal operation; the specified interface '%s' "
"is not a master. Aborting\n",
master_ifname);
res = 1;
goto out;
}
if (*spp && !strcmp(*spp, "metric")) {
if (*++spp == NULL) {
fprintf(stderr, usage_msg);
(void) close(skfd);
exit(2);
}
if_metric.ifr_metric = atoi(*spp);
strncpy(if_metric.ifr_name, slave_ifname, IFNAMSIZ);
if (ioctl(skfd, SIOCSIFMETRIC, &if_metric) < 0) {
fprintf(stderr, "SIOCSIFMETRIC on %s: %s\n", slave_ifname,
strerror(errno));
goterr = 1;
}
spp++;
}
/* check if master is up; if not then fail any operation */
if (!(master_flags.ifr_flags & IFF_UP)) {
fprintf(stderr,
"Illegal operation; the specified master interface "
"'%s' is not up.\n",
master_ifname);
res = 1;
goto out;
}
if (strncpy(if_ipaddr.ifr_name, slave_ifname, IFNAMSIZ) <= 0
|| ioctl(skfd, SIOCSIFADDR, &if_ipaddr) < 0) {
fprintf(stderr,
"Something broke setting the slave's address: %s.\n",
strerror(errno));
} else {
if (verbose) {
unsigned char *ipaddr = if_ipaddr.ifr_addr.sa_data;
printf("Set the slave's (%s) IP address to %d.%d.%d.%d.\n",
slave_ifname, ipaddr[0], ipaddr[1], ipaddr[2], ipaddr[3]);
}
}
/* Only for enslaving */
if (!opt_c && !opt_d) {
sa_family_t master_family = master_hwaddr.ifr_hwaddr.sa_family;
unsigned char *hwaddr =
(unsigned char *)master_hwaddr.ifr_hwaddr.sa_data;
if (strncpy(if_mtu.ifr_name, slave_ifname, IFNAMSIZ) <= 0
|| ioctl(skfd, SIOCSIFMTU, &if_mtu) < 0) {
fprintf(stderr, "Something broke setting the slave MTU: %s.\n",
strerror(errno));
} else {
if (verbose)
printf("Set the slave's (%s) MTU to %d.\n", slave_ifname, if_mtu.ifr_mtu);
}
/* The family '1' is ARPHRD_ETHER for ethernet. */
if (master_family != 1 && !opt_f) {
fprintf(stderr,
"Illegal operation: The specified master "
"interface '%s' is not ethernet-like.\n "
"This program is designed to work with "
"ethernet-like network interfaces.\n "
"Use the '-f' option to force the "
"operation.\n",
master_ifname);
res = 1;
goto out;
}
if (strncpy(if_dstaddr.ifr_name, slave_ifname, IFNAMSIZ) <= 0
|| ioctl(skfd, SIOCSIFDSTADDR, &if_dstaddr) < 0) {
fprintf(stderr, "Error setting the slave (%s) with SIOCSIFDSTADDR: %s.\n",
slave_ifname, strerror(errno));
} else {
if (verbose) {
unsigned char *ipaddr = if_dstaddr.ifr_dstaddr.sa_data;
printf("Set the slave's (%s) destination address to %d.%d.%d.%d.\n",
slave_ifname, ipaddr[0], ipaddr[1], ipaddr[2], ipaddr[3]);
}
/* Check master's hw addr */
for (i = 0; i < 6; i++) {
if (hwaddr[i] != 0) {
hwaddr_set = 1;
break;
}
}
if (strncpy(if_brdaddr.ifr_name, slave_ifname, IFNAMSIZ) <= 0
|| ioctl(skfd, SIOCSIFBRDADDR, &if_brdaddr) < 0) {
fprintf(stderr,
"Something broke setting the slave (%s) broadcast address: %s.\n",
slave_ifname, strerror(errno));
} else {
if (verbose) {
unsigned char *ipaddr = if_brdaddr.ifr_broadaddr.sa_data;
printf("Set the slave's (%s) broadcast address to %d.%d.%d.%d.\n",
slave_ifname, ipaddr[0], ipaddr[1], ipaddr[2], ipaddr[3]);
}
}
if (hwaddr_set) {
v_print("current hardware address of master '%s' "
"is %2.2x:%2.2x:%2.2x:%2.2x:%2.2x:%2.2x, "
"type %d\n",
master_ifname,
hwaddr[0], hwaddr[1],
hwaddr[2], hwaddr[3],
hwaddr[4], hwaddr[5],
master_family);
}
}
if (strncpy(if_netmask.ifr_name, slave_ifname, IFNAMSIZ) <= 0
|| ioctl(skfd, SIOCSIFNETMASK, &if_netmask) < 0) {
fprintf(stderr,
"Something broke setting the slave (%s) netmask: %s.\n",
slave_ifname, strerror(errno));
} else {
if (verbose) {
unsigned char *ipaddr = if_netmask.ifr_netmask.sa_data;
printf("Set the slave's (%s) netmask to %d.%d.%d.%d.\n",
slave_ifname, ipaddr[0], ipaddr[1], ipaddr[2], ipaddr[3]);
/* Accepts only one slave */
if (opt_c) {
/* change active slave */
res = get_slave_flags(slave_ifname);
if (res) {
fprintf(stderr,
"Slave '%s': Error: get flags failed. "
"Aborting\n",
slave_ifname);
goto out;
}
res = change_active(master_ifname, slave_ifname);
if (res) {
fprintf(stderr,
"Master '%s', Slave '%s': Error: "
"Change active failed\n",
master_ifname, slave_ifname);
}
} else {
/* Accept multiple slaves */
do {
if (opt_d) {
/* detach a slave interface from the master */
rv = get_slave_flags(slave_ifname);
if (rv) {
/* Can't work with this slave. */
/* remember the error and skip it*/
fprintf(stderr,
"Slave '%s': Error: get flags "
"failed. Skipping\n",
slave_ifname);
res = rv;
continue;
}
}
if (abi_ver < 1) {
/* The driver is using an old ABI, so we'll set the interface
* up before enslaving it
*/
ifr2.ifr_flags |= IFF_UP;
if ((ifr2.ifr_flags &= ~(IFF_SLAVE | IFF_MASTER)) == 0
|| strncpy(ifr2.ifr_name, slave_ifname, IFNAMSIZ) <= 0
|| ioctl(skfd, SIOCSIFFLAGS, &ifr2) < 0) {
fprintf(stderr,
"Something broke setting the slave (%s) flags: %s.\n",
slave_ifname, strerror(errno));
} else {
if (verbose)
printf("Set the slave's (%s) flags %4.4x.\n",
slave_ifname, if_flags.ifr_flags);
rv = release(master_ifname, slave_ifname);
if (rv) {
fprintf(stderr,
"Master '%s', Slave '%s': Error: "
"Release failed\n",
master_ifname, slave_ifname);
res = rv;
}
} else {
/* the bonding module takes care of setting the slave's mac address
* and opening its interface
*/
if (ifr2.ifr_flags & IFF_UP) { /* the interface will need to be down */
ifr2.ifr_flags &= ~IFF_UP;
if (ioctl(skfd, SIOCSIFFLAGS, &ifr2) < 0) {
int saved_errno = errno;
fprintf(stderr, "Shutting down interface %s failed: %s\n",
slave_ifname, strerror(saved_errno));
}
/* attach a slave interface to the master */
rv = get_if_settings(slave_ifname, slave_ifra);
if (rv) {
/* Can't work with this slave. */
/* remember the error and skip it*/
fprintf(stderr,
"Slave '%s': Error: get "
"settings failed: %s. "
"Skipping\n",
slave_ifname, strerror(rv));
res = rv;
continue;
}
}
/* Do the real thing */
if (!opt_r) {
strncpy(if_flags.ifr_name, master_ifname, IFNAMSIZ);
strncpy(if_flags.ifr_slave, slave_ifname, IFNAMSIZ);
if ((ioctl(skfd, SIOCBONDENSLAVE, &if_flags) < 0) &&
(ioctl(skfd, BOND_ENSLAVE_OLD, &if_flags) < 0)) {
fprintf(stderr, "SIOCBONDENSLAVE: %s.\n", strerror(errno));
rv = enslave(master_ifname, slave_ifname);
if (rv) {
fprintf(stderr,
"Master '%s', Slave '%s': Error: "
"Enslave failed\n",
master_ifname, slave_ifname);
res = rv;
}
}
}
} while ( (slave_ifname = *spp++) != NULL);
} while ((slave_ifname = *spp++) != NULL);
}
/* Close the socket. */
(void) close(skfd);
out:
if (skfd >= 0) {
close(skfd);
}
return(goterr);
return res;
}
static short mif_flags;
......@@ -631,35 +505,34 @@ static short mif_flags;
static int if_getconfig(char *ifname)
{
struct ifreq ifr;
int metric, mtu; /* Parameters of the master interface. */
int metric, mtu; /* Parameters of the master interface. */
struct sockaddr dstaddr, broadaddr, netmask;
unsigned char *hwaddr;
strcpy(ifr.ifr_name, ifname);
if (ioctl(skfd, SIOCGIFFLAGS, &ifr) < 0)
return -1;
mif_flags = ifr.ifr_flags;
printf("The result of SIOCGIFFLAGS on %s is %x.\n",
ifname, ifr.ifr_flags);
ifname, ifr.ifr_flags);
strcpy(ifr.ifr_name, ifname);
if (ioctl(skfd, SIOCGIFADDR, &ifr) < 0)
return -1;
printf("The result of SIOCGIFADDR is %2.2x.%2.2x.%2.2x.%2.2x.\n",
ifr.ifr_addr.sa_data[0], ifr.ifr_addr.sa_data[1],
ifr.ifr_addr.sa_data[2], ifr.ifr_addr.sa_data[3]);
ifr.ifr_addr.sa_data[0], ifr.ifr_addr.sa_data[1],
ifr.ifr_addr.sa_data[2], ifr.ifr_addr.sa_data[3]);
strcpy(ifr.ifr_name, ifname);
if (ioctl(skfd, SIOCGIFHWADDR, &ifr) < 0)
return -1;
{
/* Gotta convert from 'char' to unsigned for printf(). */
unsigned char *hwaddr = (unsigned char *)ifr.ifr_hwaddr.sa_data;
printf("The result of SIOCGIFHWADDR is type %d "
"%2.2x:%2.2x:%2.2x:%2.2x:%2.2x:%2.2x.\n",
ifr.ifr_hwaddr.sa_family, hwaddr[0], hwaddr[1],
hwaddr[2], hwaddr[3], hwaddr[4], hwaddr[5]);
}
/* Gotta convert from 'char' to unsigned for printf(). */
hwaddr = (unsigned char *)ifr.ifr_hwaddr.sa_data;
printf("The result of SIOCGIFHWADDR is type %d "
"%2.2x:%2.2x:%2.2x:%2.2x:%2.2x:%2.2x.\n",
ifr.ifr_hwaddr.sa_family, hwaddr[0], hwaddr[1],
hwaddr[2], hwaddr[3], hwaddr[4], hwaddr[5]);
strcpy(ifr.ifr_name, ifname);
if (ioctl(skfd, SIOCGIFMETRIC, &ifr) < 0) {
......@@ -691,7 +564,7 @@ static int if_getconfig(char *ifname)
} else
netmask = ifr.ifr_netmask;
return(0);
return 0;
}
static void if_print(char *ifname)
......@@ -705,15 +578,16 @@ static void if_print(char *ifname)
ifc.ifc_len = sizeof(buff);
ifc.ifc_buf = buff;
if (ioctl(skfd, SIOCGIFCONF, &ifc) < 0) {
fprintf(stderr, "SIOCGIFCONF: %s\n", strerror(errno));
perror("SIOCGIFCONF failed");
return;
}
ifr = ifc.ifc_req;
for (i = ifc.ifc_len / sizeof(struct ifreq); --i >= 0; ifr++) {
if (if_getconfig(ifr->ifr_name) < 0) {
fprintf(stderr, "%s: unknown interface.\n",
ifr->ifr_name);
fprintf(stderr,
"%s: unknown interface.\n",
ifr->ifr_name);
continue;
}
......@@ -721,16 +595,18 @@ static void if_print(char *ifname)
/*ife_print(&ife);*/
}
} else {
if (if_getconfig(ifname) < 0)
fprintf(stderr, "%s: unknown interface.\n", ifname);
if (if_getconfig(ifname) < 0) {
fprintf(stderr,
"%s: unknown interface.\n", ifname);
}
}
}
static int get_abi_ver(char *master_ifname)
static int get_drv_info(char *master_ifname)
{
struct ifreq ifr;
struct ethtool_drvinfo info;
int abi_ver = 0;
char *endptr;
memset(&ifr, 0, sizeof(ifr));
strncpy(ifr.ifr_name, master_ifname, IFNAMSIZ);
......@@ -739,24 +615,487 @@ static int get_abi_ver(char *master_ifname)
info.cmd = ETHTOOL_GDRVINFO;
strncpy(info.driver, "ifenslave", 32);
snprintf(info.fw_version, 32, "%d", BOND_ABI_VERSION);
if (ioctl(skfd, SIOCETHTOOL, &ifr) >= 0) {
char *endptr;
abi_ver = strtoul(info.fw_version, &endptr, 0);
if (*endptr) {
fprintf(stderr, "Error: got invalid string as an ABI "
"version from the bonding module\n");
return -1;
if (ioctl(skfd, SIOCETHTOOL, &ifr) < 0) {
if (errno == EOPNOTSUPP) {
goto out;
}
saved_errno = errno;
v_print("Master '%s': Error: get bonding info failed %s\n",
master_ifname, strerror(saved_errno));
return 1;
}
abi_ver = strtoul(info.fw_version, &endptr, 0);
if (*endptr) {
v_print("Master '%s': Error: got invalid string as an ABI "
"version from the bonding module\n",
master_ifname);
return 1;
}
out:
v_print("ABI ver is %d\n", abi_ver);
return 0;
}
static int change_active(char *master_ifname, char *slave_ifname)
{
struct ifreq ifr;
int res = 0;
if (!(slave_flags.ifr_flags & IFF_SLAVE)) {
fprintf(stderr,
"Illegal operation: The specified slave interface "
"'%s' is not a slave\n",
slave_ifname);
return 1;
}
if (verbose) {
printf("ABI ver is %d\n", abi_ver);
strncpy(ifr.ifr_name, master_ifname, IFNAMSIZ);
strncpy(ifr.ifr_slave, slave_ifname, IFNAMSIZ);
if ((ioctl(skfd, SIOCBONDCHANGEACTIVE, &ifr) < 0) &&
(ioctl(skfd, BOND_CHANGE_ACTIVE_OLD, &ifr) < 0)) {
saved_errno = errno;
v_print("Master '%s': Error: SIOCBONDCHANGEACTIVE failed: "
"%s\n",
master_ifname, strerror(saved_errno));
res = 1;
}
return abi_ver;
return res;
}
static int enslave(char *master_ifname, char *slave_ifname)
{
struct ifreq ifr;
int res = 0;
if (slave_flags.ifr_flags & IFF_SLAVE) {
fprintf(stderr,
"Illegal operation: The specified slave interface "
"'%s' is already a slave\n",
slave_ifname);
return 1;
}
res = set_if_down(slave_ifname, slave_flags.ifr_flags);
if (res) {
fprintf(stderr,
"Slave '%s': Error: bring interface down failed\n",
slave_ifname);
return res;
}
if (abi_ver < 2) {
/* Older bonding versions would panic if the slave has no IP
* address, so get the IP setting from the master.
*/
res = set_if_addr(master_ifname, slave_ifname);
if (res) {
fprintf(stderr,
"Slave '%s': Error: set address failed\n",
slave_ifname);
return res;
}
} else {
res = clear_if_addr(slave_ifname);
if (res) {
fprintf(stderr,
"Slave '%s': Error: clear address failed\n",
slave_ifname);
return res;
}
}
if (master_mtu.ifr_mtu != slave_mtu.ifr_mtu) {
res = set_slave_mtu(slave_ifname, master_mtu.ifr_mtu);
if (res) {
fprintf(stderr,
"Slave '%s': Error: set MTU failed\n",
slave_ifname);
return res;
}
}
if (hwaddr_set) {
/* Master already has an hwaddr
* so set it's hwaddr to the slave
*/
if (abi_ver < 1) {
/* The driver is using an old ABI, so
* the application sets the slave's
* hwaddr
*/
res = set_slave_hwaddr(slave_ifname,
&(master_hwaddr.ifr_hwaddr));
if (res) {
fprintf(stderr,
"Slave '%s': Error: set hw address "
"failed\n",
slave_ifname);
goto undo_mtu;
}
/* For old ABI the application needs to bring the
* slave back up
*/
res = set_if_up(slave_ifname, slave_flags.ifr_flags);
if (res) {
fprintf(stderr,
"Slave '%s': Error: bring interface "
"down failed\n",
slave_ifname);
goto undo_slave_mac;
}
}
/* The driver is using a new ABI,
* so the driver takes care of setting
* the slave's hwaddr and bringing
* it up again
*/
} else {
/* No hwaddr for master yet, so
* set the slave's hwaddr to it
*/
if (abi_ver < 1) {
/* For old ABI, the master needs to be
* down before setting it's hwaddr
*/
res = set_if_down(master_ifname, master_flags.ifr_flags);
if (res) {
fprintf(stderr,
"Master '%s': Error: bring interface "
"down failed\n",
master_ifname);
goto undo_mtu;
}
}
res = set_master_hwaddr(master_ifname,
&(slave_hwaddr.ifr_hwaddr));
if (res) {
fprintf(stderr,
"Master '%s': Error: set hw address "
"failed\n",
master_ifname);
goto undo_mtu;
}
if (abi_ver < 1) {
/* For old ABI, bring the master
* back up
*/
res = set_if_up(master_ifname, master_flags.ifr_flags);
if (res) {
fprintf(stderr,
"Master '%s': Error: bring interface "
"up failed\n",
master_ifname);
goto undo_master_mac;
}
}
hwaddr_set = 1;
}
/* Do the real thing */
strncpy(ifr.ifr_name, master_ifname, IFNAMSIZ);
strncpy(ifr.ifr_slave, slave_ifname, IFNAMSIZ);
if ((ioctl(skfd, SIOCBONDENSLAVE, &ifr) < 0) &&
(ioctl(skfd, BOND_ENSLAVE_OLD, &ifr) < 0)) {
saved_errno = errno;
v_print("Master '%s': Error: SIOCBONDENSLAVE failed: %s\n",
master_ifname, strerror(saved_errno));
res = 1;
}
if (res) {
goto undo_master_mac;
}
return 0;
/* rollback (best effort) */
undo_master_mac:
set_master_hwaddr(master_ifname, &(master_hwaddr.ifr_hwaddr));
hwaddr_set = 0;
goto undo_mtu;
undo_slave_mac:
set_slave_hwaddr(slave_ifname, &(slave_hwaddr.ifr_hwaddr));
undo_mtu:
set_slave_mtu(slave_ifname, slave_mtu.ifr_mtu);
return res;
}
static int release(char *master_ifname, char *slave_ifname)
{
struct ifreq ifr;
int res = 0;
if (!(slave_flags.ifr_flags & IFF_SLAVE)) {
fprintf(stderr,
"Illegal operation: The specified slave interface "
"'%s' is not a slave\n",
slave_ifname);
return 1;
}
strncpy(ifr.ifr_name, master_ifname, IFNAMSIZ);
strncpy(ifr.ifr_slave, slave_ifname, IFNAMSIZ);
if ((ioctl(skfd, SIOCBONDRELEASE, &ifr) < 0) &&
(ioctl(skfd, BOND_RELEASE_OLD, &ifr) < 0)) {
saved_errno = errno;
v_print("Master '%s': Error: SIOCBONDRELEASE failed: %s\n",
master_ifname, strerror(saved_errno));
return 1;
} else if (abi_ver < 1) {
/* The driver is using an old ABI, so we'll set the interface
* down to avoid any conflicts due to same MAC/IP
*/
res = set_if_down(slave_ifname, slave_flags.ifr_flags);
if (res) {
fprintf(stderr,
"Slave '%s': Error: bring interface "
"down failed\n",
slave_ifname);
}
}
/* set to default mtu */
set_slave_mtu(slave_ifname, 1500);
return res;
}
static int get_if_settings(char *ifname, struct dev_ifr ifra[])
{
int i;
int res = 0;
for (i = 0; ifra[i].req_ifr; i++) {
strncpy(ifra[i].req_ifr->ifr_name, ifname, IFNAMSIZ);
res = ioctl(skfd, ifra[i].req_type, ifra[i].req_ifr);
if (res < 0) {
saved_errno = errno;
v_print("Interface '%s': Error: %s failed: %s\n",
ifname, ifra[i].req_name,
strerror(saved_errno));
return saved_errno;
}
}
return 0;
}
static int get_slave_flags(char *slave_ifname)
{
int res = 0;
strncpy(slave_flags.ifr_name, slave_ifname, IFNAMSIZ);
res = ioctl(skfd, SIOCGIFFLAGS, &slave_flags);
if (res < 0) {
saved_errno = errno;
v_print("Slave '%s': Error: SIOCGIFFLAGS failed: %s\n",
slave_ifname, strerror(saved_errno));
} else {
v_print("Slave %s: flags %04X.\n",
slave_ifname, slave_flags.ifr_flags);
}
return res;
}
static int set_master_hwaddr(char *master_ifname, struct sockaddr *hwaddr)
{
unsigned char *addr = (unsigned char *)hwaddr->sa_data;
struct ifreq ifr;
int res = 0;
strncpy(ifr.ifr_name, master_ifname, IFNAMSIZ);
memcpy(&(ifr.ifr_hwaddr), hwaddr, sizeof(struct sockaddr));
res = ioctl(skfd, SIOCSIFHWADDR, &ifr);
if (res < 0) {
saved_errno = errno;
v_print("Master '%s': Error: SIOCSIFHWADDR failed: %s\n",
master_ifname, strerror(saved_errno));
return res;
} else {
v_print("Master '%s': hardware address set to "
"%2.2x:%2.2x:%2.2x:%2.2x:%2.2x:%2.2x.\n",
master_ifname, addr[0], addr[1], addr[2],
addr[3], addr[4], addr[5]);
}
return res;
}
static int set_slave_hwaddr(char *slave_ifname, struct sockaddr *hwaddr)
{
unsigned char *addr = (unsigned char *)hwaddr->sa_data;
struct ifreq ifr;
int res = 0;
strncpy(ifr.ifr_name, slave_ifname, IFNAMSIZ);
memcpy(&(ifr.ifr_hwaddr), hwaddr, sizeof(struct sockaddr));
res = ioctl(skfd, SIOCSIFHWADDR, &ifr);
if (res < 0) {
saved_errno = errno;
v_print("Slave '%s': Error: SIOCSIFHWADDR failed: %s\n",
slave_ifname, strerror(saved_errno));
if (saved_errno == EBUSY) {
v_print(" The device is busy: it must be idle "
"before running this command.\n");
} else if (saved_errno == EOPNOTSUPP) {
v_print(" The device does not support setting "
"the MAC address.\n"
" Your kernel likely does not support slave "
"devices.\n");
} else if (saved_errno == EINVAL) {
v_print(" The device's address type does not match "
"the master's address type.\n");
}
return res;
} else {
v_print("Slave '%s': hardware address set to "
"%2.2x:%2.2x:%2.2x:%2.2x:%2.2x:%2.2x.\n",
slave_ifname, addr[0], addr[1], addr[2],
addr[3], addr[4], addr[5]);
}
return res;
}
static int set_slave_mtu(char *slave_ifname, int mtu)
{
struct ifreq ifr;
int res = 0;
ifr.ifr_mtu = mtu;
strncpy(ifr.ifr_name, slave_ifname, IFNAMSIZ);
res = ioctl(skfd, SIOCSIFMTU, &ifr);
if (res < 0) {
saved_errno = errno;
v_print("Slave '%s': Error: SIOCSIFMTU failed: %s\n",
slave_ifname, strerror(saved_errno));
} else {
v_print("Slave '%s': MTU set to %d.\n", slave_ifname, mtu);
}
return res;
}
static int set_if_flags(char *ifname, short flags)
{
struct ifreq ifr;
int res = 0;
ifr.ifr_flags = flags;
strncpy(ifr.ifr_name, ifname, IFNAMSIZ);
res = ioctl(skfd, SIOCSIFFLAGS, &ifr);
if (res < 0) {
saved_errno = errno;
v_print("Interface '%s': Error: SIOCSIFFLAGS failed: %s\n",
ifname, strerror(saved_errno));
} else {
v_print("Interface '%s': flags set to %04X.\n", ifname, flags);
}
return res;
}
static int set_if_up(char *ifname, short flags)
{
return set_if_flags(ifname, flags | IFF_UP);
}
static int set_if_down(char *ifname, short flags)
{
return set_if_flags(ifname, flags & ~IFF_UP);
}
static int clear_if_addr(char *ifname)
{
struct ifreq ifr;
int res = 0;
strncpy(ifr.ifr_name, ifname, IFNAMSIZ);
ifr.ifr_addr.sa_family = AF_INET;
memset(ifr.ifr_addr.sa_data, 0, sizeof(ifr.ifr_addr.sa_data));
res = ioctl(skfd, SIOCSIFADDR, &ifr);
if (res < 0) {
saved_errno = errno;
v_print("Interface '%s': Error: SIOCSIFADDR failed: %s\n",
ifname, strerror(saved_errno));
} else {
v_print("Interface '%s': address cleared\n", ifname);
}
return res;
}
static int set_if_addr(char *master_ifname, char *slave_ifname)
{
struct ifreq ifr;
int res;
unsigned char *ipaddr;
int i;
struct {
char *req_name;
char *desc;
int g_ioctl;
int s_ioctl;
} ifra[] = {
{"IFADDR", "addr", SIOCGIFADDR, SIOCSIFADDR},
{"DSTADDR", "destination addr", SIOCGIFDSTADDR, SIOCSIFDSTADDR},
{"BRDADDR", "broadcast addr", SIOCGIFBRDADDR, SIOCSIFBRDADDR},
{"NETMASK", "netmask", SIOCGIFNETMASK, SIOCSIFNETMASK},
{NULL, NULL, 0, 0},
};
for (i = 0; ifra[i].req_name; i++) {
strncpy(ifr.ifr_name, master_ifname, IFNAMSIZ);
res = ioctl(skfd, ifra[i].g_ioctl, &ifr);
if (res < 0) {
int saved_errno = errno;
v_print("Interface '%s': Error: SIOCG%s failed: %s\n",
master_ifname, ifra[i].req_name,
strerror(saved_errno));
ifr.ifr_addr.sa_family = AF_INET;
memset(ifr.ifr_addr.sa_data, 0,
sizeof(ifr.ifr_addr.sa_data));
}
strncpy(ifr.ifr_name, slave_ifname, IFNAMSIZ);
res = ioctl(skfd, ifra[i].s_ioctl, &ifr);
if (res < 0) {
int saved_errno = errno;
v_print("Interface '%s': Error: SIOCS%s failed: %s\n",
slave_ifname, ifra[i].req_name,
strerror(saved_errno));
return res;
}
ipaddr = ifr.ifr_addr.sa_data;
v_print("Interface '%s': set IP %s to %d.%d.%d.%d\n",
slave_ifname, ifra[i].desc,
ipaddr[0], ipaddr[1], ipaddr[2], ipaddr[3]);
}
return 0;
}
/*
* Local variables:
......@@ -768,3 +1107,4 @@ static int get_abi_ver(char *master_ifname)
* compile-command: "gcc -Wall -Wstrict-prototypes -O -I/usr/src/linux/include ifenslave.c -o ifenslave"
* End:
*/
......@@ -47,8 +47,13 @@
* - Send LACPDU as highest priority packet to further fix the above
* problem on very high Tx traffic load where packets may get dropped
* by the slave.
*
* 2003/09/24 - Shmulik Hen <shmulik.hen at intel dot com>
* - Code cleanup and style changes
*/
//#define BONDING_DEBUG 1
#include <linux/skbuff.h>
#include <linux/if_ether.h>
#include <linux/netdevice.h>
......@@ -119,6 +124,7 @@
static struct mac_addr null_mac_addr = {{0, 0, 0, 0, 0, 0}};
static u16 ad_ticks_per_sec;
static const int ad_delta_in_ticks = (AD_TIMER_INTERVAL * HZ) / 1000;
// ================= 3AD api to bonding and kernel code ==================
static u16 __get_link_speed(struct port *port);
......@@ -196,13 +202,11 @@ static inline struct bonding *__get_bond_by_port(struct port *port)
*/
static inline struct port *__get_first_port(struct bonding *bond)
{
struct slave *slave = bond->next;
if (slave == (struct slave *)bond) {
if (bond->slave_cnt == 0) {
return NULL;
}
return &(SLAVE_AD_INFO(slave).port);
return &(SLAVE_AD_INFO(bond->first_slave).port);
}
/**
......@@ -218,7 +222,7 @@ static inline struct port *__get_next_port(struct port *port)
struct slave *slave = port->slave;
// If there's no bond for this port, or this is the last slave
if ((bond == NULL) || (slave->next == bond->next)) {
if ((bond == NULL) || (slave->next == bond->first_slave)) {
return NULL;
}
......@@ -236,12 +240,12 @@ static inline struct aggregator *__get_first_agg(struct port *port)
{
struct bonding *bond = __get_bond_by_port(port);
// If there's no bond for this port, or this is the last slave
if ((bond == NULL) || (bond->next == (struct slave *)bond)) {
// If there's no bond for this port, or bond has no slaves
if ((bond == NULL) || (bond->slave_cnt == 0)) {
return NULL;
}
return &(SLAVE_AD_INFO(bond->next).aggregator);
return &(SLAVE_AD_INFO(bond->first_slave).aggregator);
}
/**
......@@ -257,7 +261,7 @@ static inline struct aggregator *__get_next_agg(struct aggregator *aggregator)
struct bonding *bond = bond_get_bond_by_slave(slave);
// If there's no bond for this aggregator, or this is the last slave
if ((bond == NULL) || (slave->next == bond->next)) {
if ((bond == NULL) || (slave->next == bond->first_slave)) {
return NULL;
}
......@@ -392,7 +396,7 @@ static u16 __get_link_speed(struct port *port)
}
}
BOND_PRINT_DBG(("Port %d Received link speed %d update from adapter", port->actor_port_number, speed));
dprintk("Port %d Received link speed %d update from adapter\n", port->actor_port_number, speed);
return speed;
}
......@@ -418,12 +422,12 @@ static u8 __get_duplex(struct port *port)
switch (slave->duplex) {
case DUPLEX_FULL:
retval=0x1;
BOND_PRINT_DBG(("Port %d Received status full duplex update from adapter", port->actor_port_number));
dprintk("Port %d Received status full duplex update from adapter\n", port->actor_port_number);
break;
case DUPLEX_HALF:
default:
retval=0x0;
BOND_PRINT_DBG(("Port %d Received status NOT full duplex update from adapter", port->actor_port_number));
dprintk("Port %d Received status NOT full duplex update from adapter\n", port->actor_port_number);
break;
}
}
......@@ -1059,7 +1063,7 @@ static void ad_mux_machine(struct port *port)
// check if the state machine was changed
if (port->sm_mux_state != last_state) {
BOND_PRINT_DBG(("Mux Machine: Port=%d, Last State=%d, Curr State=%d", port->actor_port_number, last_state, port->sm_mux_state));
dprintk("Mux Machine: Port=%d, Last State=%d, Curr State=%d\n", port->actor_port_number, last_state, port->sm_mux_state);
switch (port->sm_mux_state) {
case AD_MUX_DETACHED:
__detach_bond_from_agg(port);
......@@ -1158,7 +1162,7 @@ static void ad_rx_machine(struct lacpdu *lacpdu, struct port *port)
// check if the State machine was changed or new lacpdu arrived
if ((port->sm_rx_state != last_state) || (lacpdu)) {
BOND_PRINT_DBG(("Rx Machine: Port=%d, Last State=%d, Curr State=%d", port->actor_port_number, last_state, port->sm_rx_state));
dprintk("Rx Machine: Port=%d, Last State=%d, Curr State=%d\n", port->actor_port_number, last_state, port->sm_rx_state);
switch (port->sm_rx_state) {
case AD_RX_INITIALIZE:
if (!(port->actor_oper_port_key & AD_DUPLEX_KEY_BITS)) {
......@@ -1204,7 +1208,7 @@ static void ad_rx_machine(struct lacpdu *lacpdu, struct port *port)
// detect loopback situation
if (!MAC_ADDRESS_COMPARE(&(lacpdu->actor_system), &(port->actor_system))) {
// INFO_RECEIVED_LOOPBACK_FRAMES
printk(KERN_ERR "bonding: An illegal loopback occurred on adapter (%s)\n",
printk(KERN_ERR DRV_NAME ": An illegal loopback occurred on adapter (%s)\n",
port->slave->dev->name);
printk(KERN_ERR "Check the configuration to verify that all Adapters "
"are connected to 802.3ad compliant switch ports\n");
......@@ -1245,7 +1249,7 @@ static void ad_tx_machine(struct port *port)
__update_lacpdu_from_port(port);
// send the lacpdu
if (ad_lacpdu_send(port) >= 0) {
BOND_PRINT_DBG(("Sent LACPDU on port %d", port->actor_port_number));
dprintk("Sent LACPDU on port %d\n", port->actor_port_number);
// mark ntt as false, so it will not be sent again until demanded
port->ntt = 0;
}
......@@ -1318,7 +1322,7 @@ static void ad_periodic_machine(struct port *port)
// check if the state machine was changed
if (port->sm_periodic_state != last_state) {
BOND_PRINT_DBG(("Periodic Machine: Port=%d, Last State=%d, Curr State=%d", port->actor_port_number, last_state, port->sm_periodic_state));
dprintk("Periodic Machine: Port=%d, Last State=%d, Curr State=%d\n", port->actor_port_number, last_state, port->sm_periodic_state);
switch (port->sm_periodic_state) {
case AD_NO_PERIODIC:
port->sm_periodic_timer_counter = 0; // zero timer
......@@ -1375,7 +1379,7 @@ static void ad_port_selection_logic(struct port *port)
port->next_port_in_aggregator=NULL;
port->actor_port_aggregator_identifier=0;
BOND_PRINT_DBG(("Port %d left LAG %d", port->actor_port_number, temp_aggregator->aggregator_identifier));
dprintk("Port %d left LAG %d\n", port->actor_port_number, temp_aggregator->aggregator_identifier);
// if the aggregator is empty, clear its parameters, and set it ready to be attached
if (!temp_aggregator->lag_ports) {
ad_clear_agg(temp_aggregator);
......@@ -1384,7 +1388,7 @@ static void ad_port_selection_logic(struct port *port)
}
}
if (!curr_port) { // meaning: the port was related to an aggregator but was not on the aggregator port list
printk(KERN_WARNING "bonding: Warning: Port %d (on %s) was "
printk(KERN_WARNING DRV_NAME ": Warning: Port %d (on %s) was "
"related to aggregator %d but was not on its port list\n",
port->actor_port_number, port->slave->dev->name,
port->aggregator->aggregator_identifier);
......@@ -1417,7 +1421,7 @@ static void ad_port_selection_logic(struct port *port)
port->next_port_in_aggregator=aggregator->lag_ports;
port->aggregator->num_of_ports++;
aggregator->lag_ports=port;
BOND_PRINT_DBG(("Port %d joined LAG %d(existing LAG)", port->actor_port_number, port->aggregator->aggregator_identifier));
dprintk("Port %d joined LAG %d(existing LAG)\n", port->actor_port_number, port->aggregator->aggregator_identifier);
// mark this port as selected
port->sm_vars |= AD_PORT_SELECTED;
......@@ -1454,9 +1458,9 @@ static void ad_port_selection_logic(struct port *port)
// mark this port as selected
port->sm_vars |= AD_PORT_SELECTED;
BOND_PRINT_DBG(("Port %d joined LAG %d(new LAG)", port->actor_port_number, port->aggregator->aggregator_identifier));
dprintk("Port %d joined LAG %d(new LAG)\n", port->actor_port_number, port->aggregator->aggregator_identifier);
} else {
printk(KERN_ERR "bonding: Port %d (on %s) did not find a suitable aggregator\n",
printk(KERN_ERR DRV_NAME ": Port %d (on %s) did not find a suitable aggregator\n",
port->actor_port_number, port->slave->dev->name);
}
}
......@@ -1580,30 +1584,30 @@ static void ad_agg_selection_logic(struct aggregator *aggregator)
aggregator;
aggregator = __get_next_agg(aggregator)) {
BOND_PRINT_DBG(("Agg=%d; Ports=%d; a key=%d; p key=%d; Indiv=%d; Active=%d",
dprintk("Agg=%d; Ports=%d; a key=%d; p key=%d; Indiv=%d; Active=%d\n",
aggregator->aggregator_identifier, aggregator->num_of_ports,
aggregator->actor_oper_aggregator_key, aggregator->partner_oper_aggregator_key,
aggregator->is_individual, aggregator->is_active));
aggregator->is_individual, aggregator->is_active);
}
// check if any partner replys
if (best_aggregator->is_individual) {
printk(KERN_WARNING "bonding: Warning: No 802.3ad response from the link partner "
printk(KERN_WARNING DRV_NAME ": Warning: No 802.3ad response from the link partner "
"for any adapters in the bond\n");
}
// check if there are more than one aggregator
if (num_of_aggs > 1) {
BOND_PRINT_DBG(("Warning: More than one Link Aggregation Group was "
"found in the bond. Only one group will function in the bond"));
dprintk("Warning: More than one Link Aggregation Group was "
"found in the bond. Only one group will function in the bond\n");
}
best_aggregator->is_active = 1;
BOND_PRINT_DBG(("LAG %d choosed as the active LAG", best_aggregator->aggregator_identifier));
BOND_PRINT_DBG(("Agg=%d; Ports=%d; a key=%d; p key=%d; Indiv=%d; Active=%d",
dprintk("LAG %d choosed as the active LAG\n", best_aggregator->aggregator_identifier);
dprintk("Agg=%d; Ports=%d; a key=%d; p key=%d; Indiv=%d; Active=%d\n",
best_aggregator->aggregator_identifier, best_aggregator->num_of_ports,
best_aggregator->actor_oper_aggregator_key, best_aggregator->partner_oper_aggregator_key,
best_aggregator->is_individual, best_aggregator->is_active));
best_aggregator->is_individual, best_aggregator->is_active);
// disable the ports that were related to the former active_aggregator
if (last_active_aggregator) {
......@@ -1644,7 +1648,7 @@ static void ad_clear_agg(struct aggregator *aggregator)
aggregator->lag_ports = NULL;
aggregator->is_active = 0;
aggregator->num_of_ports = 0;
BOND_PRINT_DBG(("LAG %d was cleared", aggregator->aggregator_identifier));
dprintk("LAG %d was cleared\n", aggregator->aggregator_identifier);
}
}
......@@ -1729,7 +1733,7 @@ static void ad_initialize_port(struct port *port, int lacp_fast)
static void ad_enable_collecting_distributing(struct port *port)
{
if (port->aggregator->is_active) {
BOND_PRINT_DBG(("Enabling port %d(LAG %d)", port->actor_port_number, port->aggregator->aggregator_identifier));
dprintk("Enabling port %d(LAG %d)\n", port->actor_port_number, port->aggregator->aggregator_identifier);
__enable_port(port);
}
}
......@@ -1742,7 +1746,7 @@ static void ad_enable_collecting_distributing(struct port *port)
static void ad_disable_collecting_distributing(struct port *port)
{
if (port->aggregator && MAC_ADDRESS_COMPARE(&(port->aggregator->partner_system), &(null_mac_addr))) {
BOND_PRINT_DBG(("Disabling port %d(LAG %d)", port->actor_port_number, port->aggregator->aggregator_identifier));
dprintk("Disabling port %d(LAG %d)\n", port->actor_port_number, port->aggregator->aggregator_identifier);
__disable_port(port);
}
}
......@@ -1780,7 +1784,7 @@ static void ad_marker_info_send(struct port *port)
// send the marker information
if (ad_marker_send(port, &marker) >= 0) {
BOND_PRINT_DBG(("Sent Marker Information on port %d", port->actor_port_number));
dprintk("Sent Marker Information on port %d\n", port->actor_port_number);
}
}
#endif
......@@ -1803,7 +1807,7 @@ static void ad_marker_info_received(struct marker *marker_info,struct port *port
// send the marker response
if (ad_marker_send(port, &marker) >= 0) {
BOND_PRINT_DBG(("Sent Marker Response on port %d", port->actor_port_number));
dprintk("Sent Marker Response on port %d\n", port->actor_port_number);
}
}
......@@ -1890,13 +1894,13 @@ static u16 aggregator_identifier;
void bond_3ad_initialize(struct bonding *bond, u16 tick_resolution, int lacp_fast)
{
// check that the bond is not initialized yet
if (MAC_ADDRESS_COMPARE(&(BOND_AD_INFO(bond).system.sys_mac_addr), &(bond->device->dev_addr))) {
if (MAC_ADDRESS_COMPARE(&(BOND_AD_INFO(bond).system.sys_mac_addr), &(bond->dev->dev_addr))) {
aggregator_identifier = 0;
BOND_AD_INFO(bond).lacp_fast = lacp_fast;
BOND_AD_INFO(bond).system.sys_priority = 0xFFFF;
BOND_AD_INFO(bond).system.sys_mac_addr = *((struct mac_addr *)bond->device->dev_addr);
BOND_AD_INFO(bond).system.sys_mac_addr = *((struct mac_addr *)bond->dev->dev_addr);
// initialize how many times this module is called in one second(should be about every 100ms)
ad_ticks_per_sec = tick_resolution;
......@@ -1921,7 +1925,7 @@ int bond_3ad_bind_slave(struct slave *slave)
struct aggregator *aggregator;
if (bond == NULL) {
printk(KERN_CRIT "The slave %s is not attached to its bond\n", slave->dev->name);
printk(KERN_ERR "The slave %s is not attached to its bond\n", slave->dev->name);
return -1;
}
......@@ -1964,7 +1968,7 @@ int bond_3ad_bind_slave(struct slave *slave)
ad_initialize_agg(aggregator);
aggregator->aggregator_mac_address = *((struct mac_addr *)bond->device->dev_addr);
aggregator->aggregator_mac_address = *((struct mac_addr *)bond->dev->dev_addr);
aggregator->aggregator_identifier = (++aggregator_identifier);
aggregator->slave = slave;
aggregator->is_active = 0;
......@@ -1996,11 +2000,11 @@ void bond_3ad_unbind_slave(struct slave *slave)
// if slave is null, the whole port is not initialized
if (!port->slave) {
printk(KERN_WARNING "bonding: Trying to unbind an uninitialized port on %s\n", slave->dev->name);
printk(KERN_WARNING DRV_NAME ": Trying to unbind an uninitialized port on %s\n", slave->dev->name);
return;
}
BOND_PRINT_DBG(("Unbinding Link Aggregation Group %d", aggregator->aggregator_identifier));
dprintk("Unbinding Link Aggregation Group %d\n", aggregator->aggregator_identifier);
/* Tell the partner that this port is not suitable for aggregation */
port->actor_oper_port_state &= ~AD_STATE_AGGREGATION;
......@@ -2024,10 +2028,10 @@ void bond_3ad_unbind_slave(struct slave *slave)
// if new aggregator found, copy the aggregator's parameters
// and connect the related lag_ports to the new aggregator
if ((new_aggregator) && ((!new_aggregator->lag_ports) || ((new_aggregator->lag_ports == port) && !new_aggregator->lag_ports->next_port_in_aggregator))) {
BOND_PRINT_DBG(("Some port(s) related to LAG %d - replaceing with LAG %d", aggregator->aggregator_identifier, new_aggregator->aggregator_identifier));
dprintk("Some port(s) related to LAG %d - replaceing with LAG %d\n", aggregator->aggregator_identifier, new_aggregator->aggregator_identifier);
if ((new_aggregator->lag_ports == port) && new_aggregator->is_active) {
printk(KERN_INFO "bonding: Removing an active aggregator\n");
printk(KERN_INFO DRV_NAME ": Removing an active aggregator\n");
// select new active aggregator
select_new_active_agg = 1;
}
......@@ -2057,7 +2061,7 @@ void bond_3ad_unbind_slave(struct slave *slave)
ad_agg_selection_logic(__get_first_agg(port));
}
} else {
printk(KERN_WARNING "bonding: Warning: unbinding aggregator, "
printk(KERN_WARNING DRV_NAME ": Warning: unbinding aggregator, "
"and could not find a new aggregator for its ports\n");
}
} else { // in case that the only port related to this aggregator is the one we want to remove
......@@ -2072,7 +2076,7 @@ void bond_3ad_unbind_slave(struct slave *slave)
}
}
BOND_PRINT_DBG(("Unbinding port %d", port->actor_port_number));
dprintk("Unbinding port %d\n", port->actor_port_number);
// find the aggregator that this port is connected to
temp_aggregator = __get_first_agg(port);
for (; temp_aggregator; temp_aggregator = __get_next_agg(temp_aggregator)) {
......@@ -2123,13 +2127,13 @@ void bond_3ad_state_machine_handler(struct bonding *bond)
read_lock(&bond->lock);
//check if there are any slaves
if (bond->next == (struct slave *)bond) {
goto end;
if (bond->kill_timers) {
goto out;
}
if ((bond->device->flags & IFF_UP) != IFF_UP) {
goto end;
//check if there are any slaves
if (bond->slave_cnt == 0) {
goto re_arm;
}
// check if agg_select_timer timer after initialize is timed out
......@@ -2137,8 +2141,8 @@ void bond_3ad_state_machine_handler(struct bonding *bond)
// select the active aggregator for the bond
if ((port = __get_first_port(bond))) {
if (!port->slave) {
printk(KERN_WARNING "bonding: Warning: bond's first port is uninitialized\n");
goto end;
printk(KERN_WARNING DRV_NAME ": Warning: bond's first port is uninitialized\n");
goto re_arm;
}
aggregator = __get_first_agg(port);
......@@ -2149,8 +2153,8 @@ void bond_3ad_state_machine_handler(struct bonding *bond)
// for each port run the state machines
for (port = __get_first_port(bond); port; port = __get_next_port(port)) {
if (!port->slave) {
printk(KERN_WARNING "bonding: Warning: Found an uninitialized port\n");
goto end;
printk(KERN_WARNING DRV_NAME ": Warning: Found an uninitialized port\n");
goto re_arm;
}
ad_rx_machine(NULL, port);
......@@ -2165,14 +2169,10 @@ void bond_3ad_state_machine_handler(struct bonding *bond)
}
}
end:
re_arm:
mod_timer(&(BOND_AD_INFO(bond).ad_timer), jiffies + ad_delta_in_ticks);
out:
read_unlock(&bond->lock);
if ((bond->device->flags & IFF_UP) == IFF_UP) {
/* re-arm the timer */
mod_timer(&(BOND_AD_INFO(bond).ad_timer), jiffies + (AD_TIMER_INTERVAL * HZ / 1000));
}
}
/**
......@@ -2194,14 +2194,14 @@ void bond_3ad_rx_indication(struct lacpdu *lacpdu, struct slave *slave, u16 leng
port = &(SLAVE_AD_INFO(slave).port);
if (!port->slave) {
printk(KERN_WARNING "bonding: Warning: port of slave %s is uninitialized\n", slave->dev->name);
printk(KERN_WARNING DRV_NAME ": Warning: port of slave %s is uninitialized\n", slave->dev->name);
return;
}
switch (lacpdu->subtype) {
case AD_TYPE_LACPDU:
__ntohs_lacpdu(lacpdu);
BOND_PRINT_DBG(("Received LACPDU on port %d", port->actor_port_number));
dprintk("Received LACPDU on port %d\n", port->actor_port_number);
ad_rx_machine(lacpdu, port);
break;
......@@ -2210,17 +2210,17 @@ void bond_3ad_rx_indication(struct lacpdu *lacpdu, struct slave *slave, u16 leng
switch (((struct marker *)lacpdu)->tlv_type) {
case AD_MARKER_INFORMATION_SUBTYPE:
BOND_PRINT_DBG(("Received Marker Information on port %d", port->actor_port_number));
dprintk("Received Marker Information on port %d\n", port->actor_port_number);
ad_marker_info_received((struct marker *)lacpdu, port);
break;
case AD_MARKER_RESPONSE_SUBTYPE:
BOND_PRINT_DBG(("Received Marker Response on port %d", port->actor_port_number));
dprintk("Received Marker Response on port %d\n", port->actor_port_number);
ad_marker_response_received((struct marker *)lacpdu, port);
break;
default:
BOND_PRINT_DBG(("Received an unknown Marker subtype on slot %d", port->actor_port_number));
dprintk("Received an unknown Marker subtype on slot %d\n", port->actor_port_number);
}
}
}
......@@ -2240,14 +2240,14 @@ void bond_3ad_adapter_speed_changed(struct slave *slave)
// if slave is null, the whole port is not initialized
if (!port->slave) {
printk(KERN_WARNING "bonding: Warning: speed changed for uninitialized port on %s\n",
printk(KERN_WARNING DRV_NAME ": Warning: speed changed for uninitialized port on %s\n",
slave->dev->name);
return;
}
port->actor_admin_port_key &= ~AD_SPEED_KEY_BITS;
port->actor_oper_port_key=port->actor_admin_port_key |= (__get_link_speed(port) << 1);
BOND_PRINT_DBG(("Port %d changed speed", port->actor_port_number));
dprintk("Port %d changed speed\n", port->actor_port_number);
// there is no need to reselect a new aggregator, just signal the
// state machines to reinitialize
port->sm_vars |= AD_PORT_BEGIN;
......@@ -2267,14 +2267,14 @@ void bond_3ad_adapter_duplex_changed(struct slave *slave)
// if slave is null, the whole port is not initialized
if (!port->slave) {
printk(KERN_WARNING "bonding: Warning: duplex changed for uninitialized port on %s\n",
printk(KERN_WARNING DRV_NAME ": Warning: duplex changed for uninitialized port on %s\n",
slave->dev->name);
return;
}
port->actor_admin_port_key &= ~AD_DUPLEX_KEY_BITS;
port->actor_oper_port_key=port->actor_admin_port_key |= __get_duplex(port);
BOND_PRINT_DBG(("Port %d changed duplex", port->actor_port_number));
dprintk("Port %d changed duplex\n", port->actor_port_number);
// there is no need to reselect a new aggregator, just signal the
// state machines to reinitialize
port->sm_vars |= AD_PORT_BEGIN;
......@@ -2295,10 +2295,8 @@ void bond_3ad_handle_link_change(struct slave *slave, char link)
// if slave is null, the whole port is not initialized
if (!port->slave) {
#ifdef BONDING_DEBUG
printk(KERN_WARNING "bonding: Warning: link status changed for uninitialized port on %s\n",
slave->dev->name);
#endif
printk(KERN_WARNING DRV_NAME ": Warning: link status changed for uninitialized port on %s\n",
slave->dev->name);
return;
}
......@@ -2356,41 +2354,27 @@ int bond_3ad_get_active_agg_info(struct bonding *bond, struct ad_info *ad_info)
int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev)
{
slave_t *slave, *start_at;
struct bonding *bond = (struct bonding *) dev->priv;
struct slave *slave, *start_at;
struct bonding *bond = dev->priv;
struct ethhdr *data = (struct ethhdr *)skb->data;
int slave_agg_no;
int slaves_in_agg;
int agg_id;
int i;
struct ad_info ad_info;
if (!IS_UP(dev)) { /* bond down */
dev_kfree_skb(skb);
return 0;
}
if (bond == NULL) {
printk(KERN_CRIT "bonding: Error: bond is NULL on device %s\n", dev->name);
dev_kfree_skb(skb);
return 0;
}
/* make sure that the slaves list will
* not change during tx
*/
read_lock(&bond->lock);
slave = bond->prev;
/* check if bond is empty */
if ((slave == (struct slave *) bond) || (bond->slave_cnt == 0)) {
printk(KERN_DEBUG "ERROR: bond is empty\n");
dev_kfree_skb(skb);
read_unlock(&bond->lock);
return 0;
if (!BOND_IS_OK(bond)) {
goto free_out;
}
if (bond_3ad_get_active_agg_info(bond, &ad_info)) {
printk(KERN_DEBUG "ERROR: bond_3ad_get_active_agg_info failed\n");
dev_kfree_skb(skb);
read_unlock(&bond->lock);
return 0;
goto free_out;
}
slaves_in_agg = ad_info.ports;
......@@ -2399,21 +2383,12 @@ int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev)
if (slaves_in_agg == 0) {
/*the aggregator is empty*/
printk(KERN_DEBUG "ERROR: active aggregator is empty\n");
dev_kfree_skb(skb);
read_unlock(&bond->lock);
return 0;
goto free_out;
}
/* we're at the root, get the first slave */
if ((slave == NULL) || (slave->dev == NULL)) {
/* no suitable interface, frame not sent */
dev_kfree_skb(skb);
read_unlock(&bond->lock);
return 0;
}
slave_agg_no = (data->h_dest[5]^bond->dev->dev_addr[5]) % slaves_in_agg;
slave_agg_no = (data->h_dest[5]^slave->dev->dev_addr[5]) % slaves_in_agg;
while (slave != (slave_t *)bond) {
bond_for_each_slave(bond, slave, i) {
struct aggregator *agg = SLAVE_AD_INFO(slave).port.aggregator;
if (agg && (agg->aggregator_identifier == agg_id)) {
......@@ -2422,37 +2397,18 @@ int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev)
break;
}
}
slave = slave->prev;
if (slave == NULL) {
printk(KERN_ERR "bonding: Error: slave is NULL\n");
dev_kfree_skb(skb);
read_unlock(&bond->lock);
return 0;
}
}
if (slave == (slave_t *)bond) {
printk(KERN_ERR "bonding: Error: Couldn't find a slave to tx on for aggregator ID %d\n", agg_id);
dev_kfree_skb(skb);
read_unlock(&bond->lock);
return 0;
if (slave_agg_no >= 0) {
printk(KERN_ERR DRV_NAME ": Error: Couldn't find a slave to tx on for aggregator ID %d\n", agg_id);
goto free_out;
}
start_at = slave;
do {
bond_for_each_slave_from(bond, slave, i, start_at) {
int slave_agg_id = 0;
struct aggregator *agg;
if (slave == NULL) {
printk(KERN_ERR "bonding: Error: slave is NULL\n");
dev_kfree_skb(skb);
read_unlock(&bond->lock);
return 0;
}
agg = SLAVE_AD_INFO(slave).port.aggregator;
struct aggregator *agg = SLAVE_AD_INFO(slave).port.aggregator;
if (agg) {
slave_agg_id = agg->aggregator_identifier;
......@@ -2463,20 +2419,24 @@ int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev)
skb->dev = slave->dev;
skb->priority = 1;
dev_queue_xmit(skb);
read_unlock(&bond->lock);
return 0;
goto out;
}
} while ((slave = slave->next) != start_at);
}
/* no suitable interface, frame not sent */
dev_kfree_skb(skb);
out:
read_unlock(&bond->lock);
return 0;
free_out:
/* no suitable interface, frame not sent */
dev_kfree_skb(skb);
goto out;
}
int bond_3ad_lacpdu_recv(struct sk_buff *skb, struct net_device *dev, struct packet_type* ptype)
{
struct bonding *bond = (struct bonding *)dev->priv;
struct bonding *bond = dev->priv;
struct slave *slave = NULL;
int ret = NET_RX_DROP;
......
......@@ -28,6 +28,9 @@
* 2003/05/01 - Shmulik Hen <shmulik.hen at intel dot com>
* - Renamed bond_3ad_link_status_changed() to
* bond_3ad_handle_link_change() for compatibility with TLB.
*
* 2003/09/24 - Shmulik Hen <shmulik.hen at intel dot com>
* - Code cleanup and style changes
*/
#ifndef __BOND_3AD_H__
......
......@@ -28,8 +28,13 @@
* 2003/08/06 - Amir Noam <amir.noam at intel dot com>
* - Add support for setting bond's MAC address with special
* handling required for ALB/TLB.
*
* 2003/09/24 - Shmulik Hen <shmulik.hen at intel dot com>
* - Code cleanup and style changes
*/
//#define BONDING_DEBUG 1
#include <linux/skbuff.h>
#include <linux/netdevice.h>
#include <linux/etherdevice.h>
......@@ -50,11 +55,11 @@
#define ALB_TIMER_TICKS_PER_SEC 10 /* should be a divisor of HZ */
#define BOND_TLB_REBALANCE_INTERVAL 10 /* in seconds, periodic re-balancing
* used for division - never set
#define BOND_TLB_REBALANCE_INTERVAL 10 /* In seconds, periodic re-balancing.
* Used for division - never set
* to zero !!!
*/
#define BOND_ALB_LP_INTERVAL 1 /* in seconds periodic send of
#define BOND_ALB_LP_INTERVAL 1 /* In seconds, periodic send of
* learning packets to the switch
*/
......@@ -66,7 +71,7 @@
#define TLB_HASH_TABLE_SIZE 256 /* The size of the clients hash table.
* Note that this value MUST NOT be smaller
* because the key hash table BYTE wide !
* because the key hash table is BYTE wide !
*/
......@@ -86,12 +91,15 @@
*/
#define RLB_PROMISC_TIMEOUT 10*ALB_TIMER_TICKS_PER_SEC
static const u8 mac_bcast[ETH_ALEN] = {0xff,0xff,0xff,0xff,0xff,0xff};
static const int alb_delta_in_ticks = HZ / ALB_TIMER_TICKS_PER_SEC;
#pragma pack(1)
struct learning_pkt {
u8 mac_dst[ETH_ALEN];
u8 mac_src[ETH_ALEN];
u16 type;
u8 padding[ETH_ZLEN - (2*ETH_ALEN + 2)];
u8 padding[ETH_ZLEN - ETH_HLEN];
};
struct arp_pkt {
......@@ -110,13 +118,12 @@ struct arp_pkt {
/* Forward declaration */
static void alb_send_learning_packets(struct slave *slave, u8 mac_addr[]);
static inline u8
_simple_hash(u8 *hash_start, int hash_size)
static inline u8 _simple_hash(u8 *hash_start, int hash_size)
{
int i;
u8 hash = 0;
for (i=0; i<hash_size; i++) {
for (i = 0; i < hash_size; i++) {
hash ^= hash_start[i];
}
......@@ -125,193 +132,151 @@ _simple_hash(u8 *hash_start, int hash_size)
/*********************** tlb specific functions ***************************/
static inline void
_lock_tx_hashtbl(struct bonding *bond)
static inline void _lock_tx_hashtbl(struct bonding *bond)
{
spin_lock(&(BOND_ALB_INFO(bond).tx_hashtbl_lock));
}
static inline void
_unlock_tx_hashtbl(struct bonding *bond)
static inline void _unlock_tx_hashtbl(struct bonding *bond)
{
spin_unlock(&(BOND_ALB_INFO(bond).tx_hashtbl_lock));
}
/* Caller must hold tx_hashtbl lock */
static inline void
tlb_init_table_entry(struct bonding *bond, u8 index, u8 save_load)
static inline void tlb_init_table_entry(struct tlb_client_info *entry, int save_load)
{
struct tlb_client_info *entry;
if (BOND_ALB_INFO(bond).tx_hashtbl == NULL) {
return;
}
entry = &(BOND_ALB_INFO(bond).tx_hashtbl[index]);
/* at end of cycle, save the load that was transmitted to the client
* during the cycle, and set the tx_bytes counter to 0 for counting
* the load during the next cycle
*/
if (save_load) {
entry->load_history = 1 + entry->tx_bytes /
BOND_TLB_REBALANCE_INTERVAL;
BOND_TLB_REBALANCE_INTERVAL;
entry->tx_bytes = 0;
}
entry->tx_slave = NULL;
entry->next = TLB_NULL_INDEX;
entry->prev = TLB_NULL_INDEX;
}
static inline void
tlb_init_slave(struct slave *slave)
static inline void tlb_init_slave(struct slave *slave)
{
struct tlb_slave_info *slave_info = &(SLAVE_TLB_INFO(slave));
slave_info->load = 0;
slave_info->head = TLB_NULL_INDEX;
SLAVE_TLB_INFO(slave).load = 0;
SLAVE_TLB_INFO(slave).head = TLB_NULL_INDEX;
}
/* Caller must hold bond lock for read */
static inline void
tlb_clear_slave(struct bonding *bond, struct slave *slave, u8 save_load)
static void tlb_clear_slave(struct bonding *bond, struct slave *slave, int save_load)
{
struct tlb_client_info *tx_hash_table = NULL;
u32 index, next_index;
struct tlb_client_info *tx_hash_table;
u32 index;
/* clear slave from tx_hashtbl */
_lock_tx_hashtbl(bond);
/* clear slave from tx_hashtbl */
tx_hash_table = BOND_ALB_INFO(bond).tx_hashtbl;
if (tx_hash_table) {
index = SLAVE_TLB_INFO(slave).head;
while (index != TLB_NULL_INDEX) {
next_index = tx_hash_table[index].next;
tlb_init_table_entry(bond, index, save_load);
index = next_index;
}
index = SLAVE_TLB_INFO(slave).head;
while (index != TLB_NULL_INDEX) {
u32 next_index = tx_hash_table[index].next;
tlb_init_table_entry(&tx_hash_table[index], save_load);
index = next_index;
}
_unlock_tx_hashtbl(bond);
tlb_init_slave(slave);
}
/* Must be called before starting the monitor timer */
static int
tlb_initialize(struct bonding *bond)
static int tlb_initialize(struct bonding *bond)
{
struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
int size = TLB_HASH_TABLE_SIZE * sizeof(struct tlb_client_info);
int i;
size_t size;
#if(TLB_HASH_TABLE_SIZE != 256)
/* Key to the hash table is byte wide. Check the size! */
#error Hash Table size is wrong.
#endif
spin_lock_init(&(bond_info->tx_hashtbl_lock));
_lock_tx_hashtbl(bond);
if (bond_info->tx_hashtbl != NULL) {
printk (KERN_ERR "%s: TLB hash table is not NULL\n",
bond->device->name);
_unlock_tx_hashtbl(bond);
return -1;
}
size = TLB_HASH_TABLE_SIZE * sizeof(struct tlb_client_info);
bond_info->tx_hashtbl = kmalloc(size, GFP_KERNEL);
if (bond_info->tx_hashtbl == NULL) {
printk (KERN_ERR "%s: Failed to allocate TLB hash table\n",
bond->device->name);
if (!bond_info->tx_hashtbl) {
printk(KERN_ERR DRV_NAME
": Error: %s: Failed to allocate TLB hash table\n",
bond->dev->name);
_unlock_tx_hashtbl(bond);
return -1;
}
memset(bond_info->tx_hashtbl, 0, size);
for (i=0; i<TLB_HASH_TABLE_SIZE; i++) {
tlb_init_table_entry(bond, i, 1);
for (i = 0; i < TLB_HASH_TABLE_SIZE; i++) {
tlb_init_table_entry(&bond_info->tx_hashtbl[i], 1);
}
_unlock_tx_hashtbl(bond);
return 0;
}
/* Must be called only after all slaves have been released */
static void
tlb_deinitialize(struct bonding *bond)
static void tlb_deinitialize(struct bonding *bond)
{
struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
_lock_tx_hashtbl(bond);
if (bond_info->tx_hashtbl == NULL) {
_unlock_tx_hashtbl(bond);
return;
}
kfree(bond_info->tx_hashtbl);
bond_info->tx_hashtbl = NULL;
_unlock_tx_hashtbl(bond);
}
/* Caller must hold bond lock for read */
static struct slave*
tlb_get_least_loaded_slave(struct bonding *bond)
static struct slave *tlb_get_least_loaded_slave(struct bonding *bond)
{
struct slave *slave;
struct slave *least_loaded;
s64 curr_gap, max_gap;
struct slave *slave, *least_loaded;
s64 max_gap;
int i, found = 0;
/* Find the first enabled slave */
slave = bond_get_first_slave(bond);
while (slave) {
bond_for_each_slave(bond, slave, i) {
if (SLAVE_IS_OK(slave)) {
found = 1;
break;
}
slave = bond_get_next_slave(bond, slave);
}
if (!slave) {
if (!found) {
return NULL;
}
least_loaded = slave;
max_gap = (s64)(slave->speed * 1000000) -
(s64)(SLAVE_TLB_INFO(slave).load * 8);
max_gap = (s64)(slave->speed << 20) - /* Convert to Megabit per sec */
(s64)(SLAVE_TLB_INFO(slave).load << 3); /* Bytes to bits */
/* Find the slave with the largest gap */
slave = bond_get_next_slave(bond, slave);
while (slave) {
bond_for_each_slave_from(bond, slave, i, least_loaded) {
if (SLAVE_IS_OK(slave)) {
curr_gap = (s64)(slave->speed * 1000000) -
(s64)(SLAVE_TLB_INFO(slave).load * 8);
if (max_gap < curr_gap) {
s64 gap = (s64)(slave->speed << 20) -
(s64)(SLAVE_TLB_INFO(slave).load << 3);
if (max_gap < gap) {
least_loaded = slave;
max_gap = curr_gap;
max_gap = gap;
}
}
slave = bond_get_next_slave(bond, slave);
}
return least_loaded;
}
/* Caller must hold bond lock for read */
struct slave*
tlb_choose_channel(struct bonding *bond, u32 hash_index, u32 skb_len)
struct slave *tlb_choose_channel(struct bonding *bond, u32 hash_index, u32 skb_len)
{
struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
struct tlb_client_info *hash_table = NULL;
struct slave *assigned_slave = NULL;
struct tlb_client_info *hash_table;
struct slave *assigned_slave;
_lock_tx_hashtbl(bond);
hash_table = bond_info->tx_hashtbl;
if (hash_table == NULL) {
printk (KERN_ERR "%s: TLB hash table is NULL\n",
bond->device->name);
_unlock_tx_hashtbl(bond);
return NULL;
}
assigned_slave = hash_table[hash_index].tx_slave;
if (!assigned_slave) {
assigned_slave = tlb_get_least_loaded_slave(bond);
......@@ -345,14 +310,12 @@ tlb_choose_channel(struct bonding *bond, u32 hash_index, u32 skb_len)
}
/*********************** rlb specific functions ***************************/
static inline void
_lock_rx_hashtbl(struct bonding *bond)
static inline void _lock_rx_hashtbl(struct bonding *bond)
{
spin_lock(&(BOND_ALB_INFO(bond).rx_hashtbl_lock));
}
static inline void
_unlock_rx_hashtbl(struct bonding *bond)
static inline void _unlock_rx_hashtbl(struct bonding *bond)
{
spin_unlock(&(BOND_ALB_INFO(bond).rx_hashtbl_lock));
}
......@@ -360,26 +323,20 @@ _unlock_rx_hashtbl(struct bonding *bond)
/* when an ARP REPLY is received from a client update its info
* in the rx_hashtbl
*/
static void
rlb_update_entry_from_arp(struct bonding *bond, struct arp_pkt *arp)
static void rlb_update_entry_from_arp(struct bonding *bond, struct arp_pkt *arp)
{
u32 hash_index;
struct rlb_client_info *client_info = NULL;
struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
struct rlb_client_info *client_info;
u32 hash_index;
_lock_rx_hashtbl(bond);
if (bond_info->rx_hashtbl == NULL) {
_unlock_rx_hashtbl(bond);
return;
}
hash_index = _simple_hash((u8*)&(arp->ip_src), 4);
hash_index = _simple_hash((u8*)&(arp->ip_src), sizeof(arp->ip_src));
client_info = &(bond_info->rx_hashtbl[hash_index]);
if ((client_info->assigned) &&
(client_info->ip_src == arp->ip_dst) &&
(client_info->ip_dst == arp->ip_src)) {
/* update the clients MAC address */
memcpy(client_info->mac_dst, arp->mac_src, ETH_ALEN);
client_info->ntt = 1;
......@@ -389,66 +346,60 @@ rlb_update_entry_from_arp(struct bonding *bond, struct arp_pkt *arp)
_unlock_rx_hashtbl(bond);
}
static int
rlb_arp_recv(struct sk_buff *skb,
struct net_device *dev,
struct packet_type* ptype)
static int rlb_arp_recv(struct sk_buff *skb, struct net_device *bond_dev, struct packet_type *ptype)
{
struct bonding *bond = (struct bonding *)dev->priv;
int ret = NET_RX_DROP;
struct bonding *bond = bond_dev->priv;
struct arp_pkt *arp = (struct arp_pkt *)skb->data;
int res = NET_RX_DROP;
if (!(dev->flags & IFF_MASTER)) {
if (!(bond_dev->flags & IFF_MASTER)) {
goto out;
}
if (!arp) {
printk(KERN_ERR "Packet has no ARP data\n");
dprintk("Packet has no ARP data\n");
goto out;
}
if (skb->len < sizeof(struct arp_pkt)) {
printk(KERN_ERR "Packet is too small to be an ARP\n");
dprintk("Packet is too small to be an ARP\n");
goto out;
}
if (arp->op_code == htons(ARPOP_REPLY)) {
/* update rx hash table for this ARP */
rlb_update_entry_from_arp(bond, arp);
BOND_PRINT_DBG(("Server received an ARP Reply from client"));
dprintk("Server received an ARP Reply from client\n");
}
ret = NET_RX_SUCCESS;
res = NET_RX_SUCCESS;
out:
dev_kfree_skb(skb);
return ret;
return res;
}
/* Caller must hold bond lock for read */
static struct slave*
rlb_next_rx_slave(struct bonding *bond)
static struct slave *rlb_next_rx_slave(struct bonding *bond)
{
struct slave *rx_slave = NULL, *slave = NULL;
unsigned int i = 0;
struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
struct slave *rx_slave, *slave, *start_at;
int i = 0;
slave = bond_info->next_rx_slave;
if (slave == NULL) {
slave = bond->next;
if (bond_info->next_rx_slave) {
start_at = bond_info->next_rx_slave;
} else {
start_at = bond->first_slave;
}
/* this loop uses the circular linked list property of the
* slave's list to go through all slaves
*/
for (i = 0; i < bond->slave_cnt; i++, slave = slave->next) {
rx_slave = NULL;
bond_for_each_slave_from(bond, slave, i, start_at) {
if (SLAVE_IS_OK(slave)) {
if (!rx_slave) {
rx_slave = slave;
}
else if (slave->speed > rx_slave->speed) {
} else if (slave->speed > rx_slave->speed) {
rx_slave = slave;
}
}
......@@ -464,48 +415,41 @@ rlb_next_rx_slave(struct bonding *bond)
/* teach the switch the mac of a disabled slave
* on the primary for fault tolerance
*
* Caller must hold bond->ptrlock for write or bond lock for write
* Caller must hold bond->curr_slave_lock for write or bond lock for write
*/
static void
rlb_teach_disabled_mac_on_primary(struct bonding *bond, u8 addr[])
static void rlb_teach_disabled_mac_on_primary(struct bonding *bond, u8 addr[])
{
if (!bond->current_slave) {
if (!bond->curr_active_slave) {
return;
}
if (!bond->alb_info.primary_is_promisc) {
bond->alb_info.primary_is_promisc = 1;
dev_set_promiscuity(bond->current_slave->dev, 1);
dev_set_promiscuity(bond->curr_active_slave->dev, 1);
}
bond->alb_info.rlb_promisc_timeout_counter = 0;
alb_send_learning_packets(bond->current_slave, addr);
alb_send_learning_packets(bond->curr_active_slave, addr);
}
/* slave being removed should not be active at this point
*
* Caller must hold bond lock for read
*/
static void
rlb_clear_slave(struct bonding *bond, struct slave *slave)
static void rlb_clear_slave(struct bonding *bond, struct slave *slave)
{
struct rlb_client_info *rx_hash_table = NULL;
struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
u8 mac_bcast[ETH_ALEN] = {0xff,0xff,0xff,0xff,0xff,0xff};
struct rlb_client_info *rx_hash_table;
u32 index, next_index;
/* clear slave from rx_hashtbl */
_lock_rx_hashtbl(bond);
rx_hash_table = bond_info->rx_hashtbl;
if (rx_hash_table == NULL) {
_unlock_rx_hashtbl(bond);
return;
}
rx_hash_table = bond_info->rx_hashtbl;
index = bond_info->rx_hashtbl_head;
for (; index != RLB_NULL_INDEX; index = next_index) {
next_index = rx_hash_table[index].next;
if (rx_hash_table[index].slave == slave) {
struct slave *assigned_slave = rlb_next_rx_slave(bond);
......@@ -533,23 +477,24 @@ rlb_clear_slave(struct bonding *bond, struct slave *slave)
_unlock_rx_hashtbl(bond);
write_lock(&bond->ptrlock);
if (slave != bond->current_slave) {
write_lock(&bond->curr_slave_lock);
if (slave != bond->curr_active_slave) {
rlb_teach_disabled_mac_on_primary(bond, slave->dev->dev_addr);
}
write_unlock(&bond->ptrlock);
write_unlock(&bond->curr_slave_lock);
}
static void
rlb_update_client(struct rlb_client_info *client_info)
static void rlb_update_client(struct rlb_client_info *client_info)
{
int i = 0;
int i;
if (client_info->slave == NULL) {
if (!client_info->slave) {
return;
}
for (i=0; i<RLB_ARP_BURST_SIZE; i++) {
for (i = 0; i < RLB_ARP_BURST_SIZE; i++) {
arp_send(ARPOP_REPLY, ETH_P_ARP,
client_info->ip_dst,
client_info->slave->dev,
......@@ -561,20 +506,14 @@ rlb_update_client(struct rlb_client_info *client_info)
}
/* sends ARP REPLIES that update the clients that need updating */
static void
rlb_update_rx_clients(struct bonding *bond)
static void rlb_update_rx_clients(struct bonding *bond)
{
u32 hash_index;
struct rlb_client_info *client_info = NULL;
struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
struct rlb_client_info *client_info;
u32 hash_index;
_lock_rx_hashtbl(bond);
if (bond_info->rx_hashtbl == NULL) {
_unlock_rx_hashtbl(bond);
return;
}
hash_index = bond_info->rx_hashtbl_head;
for (; hash_index != RLB_NULL_INDEX; hash_index = client_info->next) {
client_info = &(bond_info->rx_hashtbl[hash_index]);
......@@ -595,22 +534,15 @@ rlb_update_rx_clients(struct bonding *bond)
}
/* The slave was assigned a new mac address - update the clients */
static void
rlb_req_update_slave_clients(struct bonding *bond, struct slave *slave)
static void rlb_req_update_slave_clients(struct bonding *bond, struct slave *slave)
{
u32 hash_index;
u8 ntt = 0;
struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
u8 mac_bcast[ETH_ALEN] = {0xff,0xff,0xff,0xff,0xff,0xff};
struct rlb_client_info* client_info = NULL;
struct rlb_client_info *client_info;
int ntt = 0;
u32 hash_index;
_lock_rx_hashtbl(bond);
if (bond_info->rx_hashtbl == NULL) {
_unlock_rx_hashtbl(bond);
return;
}
hash_index = bond_info->rx_hashtbl_head;
for (; hash_index != RLB_NULL_INDEX; hash_index = client_info->next) {
client_info = &(bond_info->rx_hashtbl[hash_index]);
......@@ -633,37 +565,31 @@ rlb_req_update_slave_clients(struct bonding *bond, struct slave *slave)
}
/* mark all clients using src_ip to be updated */
static void
rlb_req_update_subnet_clients(struct bonding *bond, u32 src_ip)
static void rlb_req_update_subnet_clients(struct bonding *bond, u32 src_ip)
{
u32 hash_index;
struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
u8 mac_bcast[ETH_ALEN] = {0xff,0xff,0xff,0xff,0xff,0xff};
struct rlb_client_info *client_info = NULL;
struct rlb_client_info *client_info;
u32 hash_index;
_lock_rx_hashtbl(bond);
if (bond_info->rx_hashtbl == NULL) {
_unlock_rx_hashtbl(bond);
return;
}
hash_index = bond_info->rx_hashtbl_head;
for (; hash_index != RLB_NULL_INDEX; hash_index = client_info->next) {
client_info = &(bond_info->rx_hashtbl[hash_index]);
if (!client_info->slave) {
printk(KERN_ERR "Bonding: Error: found a client with no"
" channel in the client's hash table\n");
printk(KERN_ERR DRV_NAME
": Error: found a client with no channel in "
"the client's hash table\n");
continue;
}
/*update all clients using this src_ip, that are not assigned
* to the team's address (current_slave) and have a known
* to the team's address (curr_active_slave) and have a known
* unicast mac address.
*/
if ((client_info->ip_src == src_ip) &&
memcmp(client_info->slave->dev->dev_addr,
bond->device->dev_addr, ETH_ALEN) &&
bond->dev->dev_addr, ETH_ALEN) &&
memcmp(client_info->mac_dst, mac_bcast, ETH_ALEN)) {
client_info->ntt = 1;
bond_info->rx_ntt = 1;
......@@ -674,30 +600,22 @@ rlb_req_update_subnet_clients(struct bonding *bond, u32 src_ip)
}
/* Caller must hold both bond and ptr locks for read */
struct slave*
rlb_choose_channel(struct bonding *bond, struct arp_pkt *arp)
struct slave *rlb_choose_channel(struct bonding *bond, struct arp_pkt *arp)
{
struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
struct rlb_client_info *client_info = NULL;
struct slave *assigned_slave;
struct rlb_client_info *client_info;
u32 hash_index = 0;
struct slave *assigned_slave = NULL;
u8 mac_bcast[ETH_ALEN] = {0xff,0xff,0xff,0xff,0xff,0xff};
_lock_rx_hashtbl(bond);
if (bond_info->rx_hashtbl == NULL) {
_unlock_rx_hashtbl(bond);
return NULL;
}
hash_index = _simple_hash((u8 *)&arp->ip_dst, 4);
hash_index = _simple_hash((u8 *)&arp->ip_dst, sizeof(arp->ip_src));
client_info = &(bond_info->rx_hashtbl[hash_index]);
if (client_info->assigned == 1) {
if (client_info->assigned) {
if ((client_info->ip_src == arp->ip_src) &&
(client_info->ip_dst == arp->ip_dst)) {
/* the entry is already assigned to this client */
if (memcmp(arp->mac_dst, mac_bcast, ETH_ALEN)) {
/* update mac address from arp */
memcpy(client_info->mac_dst, arp->mac_dst, ETH_ALEN);
......@@ -710,12 +628,12 @@ rlb_choose_channel(struct bonding *bond, struct arp_pkt *arp)
}
} else {
/* the entry is already assigned to some other client,
* move the old client to primary (current_slave) so
* move the old client to primary (curr_active_slave) so
* that the new client can be assigned to this entry.
*/
if (bond->current_slave &&
client_info->slave != bond->current_slave) {
client_info->slave = bond->current_slave;
if (bond->curr_active_slave &&
client_info->slave != bond->curr_active_slave) {
client_info->slave = bond->curr_active_slave;
rlb_update_client(client_info);
}
}
......@@ -736,8 +654,7 @@ rlb_choose_channel(struct bonding *bond, struct arp_pkt *arp)
if (memcmp(client_info->mac_dst, mac_bcast, ETH_ALEN)) {
client_info->ntt = 1;
bond->alb_info.rx_ntt = 1;
}
else {
} else {
client_info->ntt = 0;
}
......@@ -760,10 +677,9 @@ rlb_choose_channel(struct bonding *bond, struct arp_pkt *arp)
/* chooses (and returns) transmit channel for arp reply
* does not choose channel for other arp types since they are
* sent on the current_slave
* sent on the curr_active_slave
*/
static struct slave*
rlb_arp_xmit(struct sk_buff *skb, struct bonding *bond)
static struct slave *rlb_arp_xmit(struct sk_buff *skb, struct bonding *bond)
{
struct arp_pkt *arp = (struct arp_pkt *)skb->nh.raw;
struct slave *tx_slave = NULL;
......@@ -776,9 +692,8 @@ rlb_arp_xmit(struct sk_buff *skb, struct bonding *bond)
if (tx_slave) {
memcpy(arp->mac_src,tx_slave->dev->dev_addr, ETH_ALEN);
}
BOND_PRINT_DBG(("Server sent ARP Reply packet"));
dprintk("Server sent ARP Reply packet\n");
} else if (arp->op_code == __constant_htons(ARPOP_REQUEST)) {
/* Create an entry in the rx_hashtbl for this client as a
* place holder.
* When the arp reply is received the entry will be updated
......@@ -797,34 +712,29 @@ rlb_arp_xmit(struct sk_buff *skb, struct bonding *bond)
* updated with their assigned mac.
*/
rlb_req_update_subnet_clients(bond, arp->ip_src);
BOND_PRINT_DBG(("Server sent ARP Request packet"));
dprintk("Server sent ARP Request packet\n");
}
return tx_slave;
}
/* Caller must hold bond lock for read */
static void
rlb_rebalance(struct bonding *bond)
static void rlb_rebalance(struct bonding *bond)
{
struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
struct slave *assigned_slave = NULL;
struct slave *assigned_slave;
struct rlb_client_info *client_info;
int ntt;
u32 hash_index;
struct rlb_client_info *client_info = NULL;
u8 ntt = 0;
_lock_rx_hashtbl(bond);
if (bond_info->rx_hashtbl == NULL) {
_unlock_rx_hashtbl(bond);
return;
}
ntt = 0;
hash_index = bond_info->rx_hashtbl_head;
for (; hash_index != RLB_NULL_INDEX; hash_index = client_info->next) {
client_info = &(bond_info->rx_hashtbl[hash_index]);
assigned_slave = rlb_next_rx_slave(bond);
if (assigned_slave && (client_info->slave != assigned_slave)){
if (assigned_slave && (client_info->slave != assigned_slave)) {
client_info->slave = assigned_slave;
client_info->ntt = 1;
ntt = 1;
......@@ -839,96 +749,83 @@ rlb_rebalance(struct bonding *bond)
}
/* Caller must hold rx_hashtbl lock */
static inline void
rlb_init_table_entry(struct rlb_client_info *entry)
static void rlb_init_table_entry(struct rlb_client_info *entry)
{
memset(entry, 0, sizeof(struct rlb_client_info));
entry->next = RLB_NULL_INDEX;
entry->prev = RLB_NULL_INDEX;
entry->assigned = 0;
entry->ntt = 0;
}
static int
rlb_initialize(struct bonding *bond)
static int rlb_initialize(struct bonding *bond)
{
struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
struct packet_type *pk_type = &(BOND_ALB_INFO(bond).rlb_pkt_type);
int size = RLB_HASH_TABLE_SIZE * sizeof(struct rlb_client_info);
int i;
size_t size;
spin_lock_init(&(bond_info->rx_hashtbl_lock));
_lock_rx_hashtbl(bond);
if (bond_info->rx_hashtbl != NULL) {
printk (KERN_ERR "%s: RLB hash table is not NULL\n",
bond->device->name);
_unlock_rx_hashtbl(bond);
return -1;
}
size = RLB_HASH_TABLE_SIZE * sizeof(struct rlb_client_info);
bond_info->rx_hashtbl = kmalloc(size, GFP_KERNEL);
if (bond_info->rx_hashtbl == NULL) {
printk (KERN_ERR "%s: Failed to allocate"
" RLB hash table\n", bond->device->name);
if (!bond_info->rx_hashtbl) {
printk(KERN_ERR DRV_NAME
": Error: %s: Failed to allocate RLB hash table\n",
bond->dev->name);
_unlock_rx_hashtbl(bond);
return -1;
}
bond_info->rx_hashtbl_head = RLB_NULL_INDEX;
for (i=0; i<RLB_HASH_TABLE_SIZE; i++) {
for (i = 0; i < RLB_HASH_TABLE_SIZE; i++) {
rlb_init_table_entry(bond_info->rx_hashtbl + i);
}
_unlock_rx_hashtbl(bond);
/* register to receive ARPs */
_unlock_rx_hashtbl(bond);
/*initialize packet type*/
pk_type->type = __constant_htons(ETH_P_ARP);
pk_type->dev = bond->device;
pk_type->dev = bond->dev;
pk_type->func = rlb_arp_recv;
/* register to receive ARPs */
dev_add_pack(pk_type);
return 0;
}
static void
rlb_deinitialize(struct bonding *bond)
static void rlb_deinitialize(struct bonding *bond)
{
struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
dev_remove_pack(&(bond_info->rlb_pkt_type));
_lock_rx_hashtbl(bond);
if (bond_info->rx_hashtbl == NULL) {
_unlock_rx_hashtbl(bond);
return;
}
kfree(bond_info->rx_hashtbl);
bond_info->rx_hashtbl = NULL;
_unlock_rx_hashtbl(bond);
}
/*********************** tlb/rlb shared functions *********************/
static void
alb_send_learning_packets(struct slave *slave, u8 mac_addr[])
static void alb_send_learning_packets(struct slave *slave, u8 mac_addr[])
{
struct sk_buff *skb = NULL;
struct learning_pkt pkt;
char *data = NULL;
int size = sizeof(struct learning_pkt);
int i;
unsigned int size = sizeof(struct learning_pkt);
memset(&pkt, 0, size);
memcpy(pkt.mac_dst, mac_addr, ETH_ALEN);
memcpy(pkt.mac_src, mac_addr, ETH_ALEN);
pkt.type = __constant_htons(ETH_P_LOOP);
for (i=0; i < MAX_LP_RETRY; i++) {
skb = NULL;
for (i = 0; i < MAX_LP_RETRY; i++) {
struct sk_buff *skb;
char *data;
skb = dev_alloc_skb(size);
if (!skb) {
return;
......@@ -936,28 +833,26 @@ alb_send_learning_packets(struct slave *slave, u8 mac_addr[])
data = skb_put(skb, size);
memcpy(data, &pkt, size);
skb->mac.raw = data;
skb->nh.raw = data + ETH_HLEN;
skb->protocol = pkt.type;
skb->priority = TC_PRIO_CONTROL;
skb->dev = slave->dev;
dev_queue_xmit(skb);
}
}
/* hw is a boolean parameter that determines whether we should try and
* set the hw address of the device as well as the hw address of the
* net_device
*/
static int
alb_set_slave_mac_addr(struct slave *slave, u8 addr[], int hw)
static int alb_set_slave_mac_addr(struct slave *slave, u8 addr[], int hw)
{
struct net_device *dev = NULL;
struct net_device *dev = slave->dev;
struct sockaddr s_addr;
dev = slave->dev;
if (!hw) {
memcpy(dev->dev_addr, addr, dev->addr_len);
return 0;
......@@ -968,26 +863,23 @@ alb_set_slave_mac_addr(struct slave *slave, u8 addr[], int hw)
memcpy(s_addr.sa_data, addr, dev->addr_len);
s_addr.sa_family = dev->type;
if (dev->set_mac_address(dev, &s_addr)) {
printk(KERN_DEBUG "bonding: Error: alb_set_slave_mac_addr:"
" dev->set_mac_address of dev %s failed!"
" ALB mode requires that the base driver"
" support setting the hw address also when"
" the network device's interface is open\n",
dev->name);
printk(KERN_ERR DRV_NAME
": Error: dev->set_mac_address of dev %s failed! ALB "
"mode requires that the base driver support setting "
"the hw address also when the network device's "
"interface is open\n",
dev->name);
return -EOPNOTSUPP;
}
return 0;
}
/* Caller must hold bond lock for write or ptrlock for write*/
static void
alb_swap_mac_addr(struct bonding *bond,
struct slave *slave1,
struct slave *slave2)
/* Caller must hold bond lock for write or curr_slave_lock for write*/
static void alb_swap_mac_addr(struct bonding *bond, struct slave *slave1, struct slave *slave2)
{
u8 tmp_mac_addr[ETH_ALEN];
struct slave *disabled_slave = NULL;
u8 slaves_state_differ;
u8 tmp_mac_addr[ETH_ALEN];
int slaves_state_differ;
slaves_state_differ = (SLAVE_IS_OK(slave1) != SLAVE_IS_OK(slave2));
......@@ -1004,8 +896,7 @@ alb_swap_mac_addr(struct bonding *bond,
*/
rlb_req_update_slave_clients(bond, slave1);
}
}
else {
} else {
disabled_slave = slave1;
}
......@@ -1017,15 +908,14 @@ alb_swap_mac_addr(struct bonding *bond,
*/
rlb_req_update_slave_clients(bond, slave2);
}
}
else {
} else {
disabled_slave = slave2;
}
if (bond->alb_info.rlb_enabled && slaves_state_differ) {
/* A disabled slave was assigned an active mac addr */
rlb_teach_disabled_mac_on_primary(bond,
disabled_slave->dev->dev_addr);
/* A disabled slave was assigned an active mac addr */
rlb_teach_disabled_mac_on_primary(bond,
disabled_slave->dev->dev_addr);
}
}
......@@ -1043,10 +933,8 @@ alb_swap_mac_addr(struct bonding *bond,
*
* Caller must hold bond lock
*/
static void
alb_change_hw_addr_on_detach(struct bonding *bond, struct slave *slave)
static void alb_change_hw_addr_on_detach(struct bonding *bond, struct slave *slave)
{
struct slave *tmp_slave;
int perm_curr_diff;
int perm_bond_diff;
......@@ -1054,20 +942,23 @@ alb_change_hw_addr_on_detach(struct bonding *bond, struct slave *slave)
slave->dev->dev_addr,
ETH_ALEN);
perm_bond_diff = memcmp(slave->perm_hwaddr,
bond->device->dev_addr,
bond->dev->dev_addr,
ETH_ALEN);
if (perm_curr_diff && perm_bond_diff) {
tmp_slave = bond_get_first_slave(bond);
while (tmp_slave) {
struct slave *tmp_slave;
int i, found = 0;
bond_for_each_slave(bond, tmp_slave, i) {
if (!memcmp(slave->perm_hwaddr,
tmp_slave->dev->dev_addr,
ETH_ALEN)) {
tmp_slave->dev->dev_addr,
ETH_ALEN)) {
found = 1;
break;
}
tmp_slave = bond_get_next_slave(bond, tmp_slave);
}
if (tmp_slave) {
if (found) {
alb_swap_mac_addr(bond, slave, tmp_slave);
}
}
......@@ -1098,10 +989,10 @@ alb_change_hw_addr_on_detach(struct bonding *bond, struct slave *slave)
* caller must hold the bond lock for write since the mac addresses are compared
* and may be swapped.
*/
static int
alb_handle_addr_collision_on_attach(struct bonding *bond, struct slave *slave)
static int alb_handle_addr_collision_on_attach(struct bonding *bond, struct slave *slave)
{
struct slave *tmp_slave1, *tmp_slave2;
struct slave *tmp_slave1, *tmp_slave2, *free_mac_slave;
int i, j, found = 0;
if (bond->slave_cnt == 0) {
/* this is the first slave */
......@@ -1112,65 +1003,68 @@ alb_handle_addr_collision_on_attach(struct bonding *bond, struct slave *slave)
* check uniqueness of slave's mac address against the other
* slaves in the bond.
*/
if (memcmp(slave->perm_hwaddr, bond->device->dev_addr, ETH_ALEN)) {
tmp_slave1 = bond_get_first_slave(bond);
for (; tmp_slave1; tmp_slave1 = bond_get_next_slave(bond, tmp_slave1)) {
if (memcmp(slave->perm_hwaddr, bond->dev->dev_addr, ETH_ALEN)) {
bond_for_each_slave(bond, tmp_slave1, i) {
if (!memcmp(tmp_slave1->dev->dev_addr, slave->dev->dev_addr,
ETH_ALEN)) {
found = 1;
break;
}
}
if (tmp_slave1) {
if (found) {
/* a slave was found that is using the mac address
* of the new slave
*/
printk(KERN_ERR "bonding: Warning: the hw address "
"of slave %s is not unique - cannot enslave it!"
, slave->dev->name);
printk(KERN_ERR DRV_NAME
": Error: the hw address of slave %s is not "
"unique - cannot enslave it!",
slave->dev->name);
return -EINVAL;
}
return 0;
}
/* the slave's address is equal to the address of the bond
* search for a spare address in the bond for this slave.
/* The slave's address is equal to the address of the bond.
* Search for a spare address in the bond for this slave.
*/
tmp_slave1 = bond_get_first_slave(bond);
for (; tmp_slave1; tmp_slave1 = bond_get_next_slave(bond, tmp_slave1)) {
tmp_slave2 = bond_get_first_slave(bond);
for (; tmp_slave2; tmp_slave2 = bond_get_next_slave(bond, tmp_slave2)) {
free_mac_slave = NULL;
bond_for_each_slave(bond, tmp_slave1, i) {
found = 0;
bond_for_each_slave(bond, tmp_slave2, j) {
if (!memcmp(tmp_slave1->perm_hwaddr,
tmp_slave2->dev->dev_addr,
ETH_ALEN)) {
found = 1;
break;
}
}
if (!tmp_slave2) {
if (!found) {
/* no slave has tmp_slave1's perm addr
* as its curr addr
*/
free_mac_slave = tmp_slave1;
break;
}
}
if (tmp_slave1) {
alb_set_slave_mac_addr(slave, tmp_slave1->perm_hwaddr,
if (free_mac_slave) {
alb_set_slave_mac_addr(slave, free_mac_slave->perm_hwaddr,
bond->alb_info.rlb_enabled);
printk(KERN_WARNING "bonding: Warning: the hw address "
"of slave %s is in use by the bond; "
"giving it the hw address of %s\n",
slave->dev->name, tmp_slave1->dev->name);
printk(KERN_WARNING DRV_NAME
": Warning: the hw address of slave %s is in use by "
"the bond; giving it the hw address of %s\n",
slave->dev->name, free_mac_slave->dev->name);
} else {
printk(KERN_CRIT "bonding: Error: the hw address "
"of slave %s is in use by the bond; "
"couldn't find a slave with a free hw "
"address to give it (this should not have "
"happened)\n", slave->dev->name);
printk(KERN_ERR DRV_NAME
": Error: the hw address of slave %s is in use by the "
"bond; couldn't find a slave with a free hw address to "
"give it (this should not have happened)\n",
slave->dev->name);
return -EFAULT;
}
......@@ -1188,37 +1082,36 @@ alb_handle_addr_collision_on_attach(struct bonding *bond, struct slave *slave)
*
* For each slave, this function sets the interface to the new address and then
* changes its dev_addr field to its previous value.
*
*
* Unwinding assumes bond's mac address has not yet changed.
*/
static inline int
alb_set_mac_address(struct bonding *bond, void *addr)
static int alb_set_mac_address(struct bonding *bond, void *addr)
{
struct sockaddr sa;
struct slave *slave;
struct slave *slave, *stop_at;
char tmp_addr[ETH_ALEN];
int error;
int res;
int i;
if (bond->alb_info.rlb_enabled) {
return 0;
}
slave = bond_get_first_slave(bond);
for (; slave; slave = bond_get_next_slave(bond, slave)) {
bond_for_each_slave(bond, slave, i) {
if (slave->dev->set_mac_address == NULL) {
error = -EOPNOTSUPP;
res = -EOPNOTSUPP;
goto unwind;
}
/* save net_device's current hw address */
memcpy(tmp_addr, slave->dev->dev_addr, ETH_ALEN);
error = slave->dev->set_mac_address(slave->dev, addr);
res = slave->dev->set_mac_address(slave->dev, addr);
/* restore net_device's hw address */
memcpy(slave->dev->dev_addr, tmp_addr, ETH_ALEN);
if (error) {
if (res) {
goto unwind;
}
}
......@@ -1226,22 +1119,23 @@ alb_set_mac_address(struct bonding *bond, void *addr)
return 0;
unwind:
memcpy(sa.sa_data, bond->device->dev_addr, bond->device->addr_len);
sa.sa_family = bond->device->type;
slave = bond_get_first_slave(bond);
for (; slave; slave = bond_get_next_slave(bond, slave)) {
memcpy(sa.sa_data, bond->dev->dev_addr, bond->dev->addr_len);
sa.sa_family = bond->dev->type;
/* unwind from head to the slave that failed */
stop_at = slave;
bond_for_each_slave_from_to(bond, slave, i, bond->first_slave, stop_at) {
memcpy(tmp_addr, slave->dev->dev_addr, ETH_ALEN);
slave->dev->set_mac_address(slave->dev, &sa);
memcpy(slave->dev->dev_addr, tmp_addr, ETH_ALEN);
}
return error;
return res;
}
/************************ exported alb funcions ************************/
int
bond_alb_initialize(struct bonding *bond, int rlb_enabled)
int bond_alb_initialize(struct bonding *bond, int rlb_enabled)
{
int res;
......@@ -1263,8 +1157,7 @@ bond_alb_initialize(struct bonding *bond, int rlb_enabled)
return 0;
}
void
bond_alb_deinitialize(struct bonding *bond)
void bond_alb_deinitialize(struct bonding *bond)
{
struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
......@@ -1275,49 +1168,38 @@ bond_alb_deinitialize(struct bonding *bond)
}
}
int
bond_alb_xmit(struct sk_buff *skb, struct net_device *dev)
int bond_alb_xmit(struct sk_buff *skb, struct net_device *bond_dev)
{
struct bonding *bond = (struct bonding *) dev->priv;
struct ethhdr *eth_data = (struct ethhdr *)skb->data;
struct bonding *bond = bond_dev->priv;
struct ethhdr *eth_data = (struct ethhdr *)skb->mac.raw = skb->data;
struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
struct slave *tx_slave = NULL;
char do_tx_balance = 1;
static u32 ip_bcast = 0xffffffff;
int hash_size = 0;
int do_tx_balance = 1;
u32 hash_index = 0;
u8 *hash_start = NULL;
u8 mac_bcast[ETH_ALEN] = {0xff,0xff,0xff,0xff,0xff,0xff};
if (!IS_UP(dev)) { /* bond down */
dev_kfree_skb(skb);
return 0;
}
/* make sure that the current_slave and the slaves list do
/* make sure that the curr_active_slave and the slaves list do
* not change during tx
*/
read_lock(&bond->lock);
read_lock(&bond->curr_slave_lock);
if (bond->slave_cnt == 0) {
/* no suitable interface, frame not sent */
dev_kfree_skb(skb);
read_unlock(&bond->lock);
return 0;
if (!BOND_IS_OK(bond)) {
goto free_out;
}
read_lock(&bond->ptrlock);
switch (ntohs(skb->protocol)) {
case ETH_P_IP:
if ((memcmp(eth_data->h_dest, mac_bcast, ETH_ALEN) == 0) ||
(skb->nh.iph->daddr == 0xffffffff)) {
(skb->nh.iph->daddr == ip_bcast)) {
do_tx_balance = 0;
break;
}
hash_start = (char*)&(skb->nh.iph->daddr);
hash_size = 4;
hash_size = sizeof(skb->nh.iph->daddr);
break;
case ETH_P_IPV6:
if (memcmp(eth_data->h_dest, mac_bcast, ETH_ALEN) == 0) {
do_tx_balance = 0;
......@@ -1325,9 +1207,8 @@ bond_alb_xmit(struct sk_buff *skb, struct net_device *dev)
}
hash_start = (char*)&(skb->nh.ipv6h->daddr);
hash_size = 16;
hash_size = sizeof(skb->nh.ipv6h->daddr);
break;
case ETH_P_IPX:
if (ipx_hdr(skb)->ipx_checksum !=
__constant_htons(IPX_NO_CHECKSUM)) {
......@@ -1349,14 +1230,12 @@ bond_alb_xmit(struct sk_buff *skb, struct net_device *dev)
hash_start = (char*)eth_data->h_dest;
hash_size = ETH_ALEN;
break;
case ETH_P_ARP:
do_tx_balance = 0;
if (bond_info->rlb_enabled) {
tx_slave = rlb_arp_xmit(skb, bond);
}
break;
default:
do_tx_balance = 0;
break;
......@@ -1369,16 +1248,16 @@ bond_alb_xmit(struct sk_buff *skb, struct net_device *dev)
if (!tx_slave) {
/* unbalanced or unassigned, send through primary */
tx_slave = bond->current_slave;
tx_slave = bond->curr_active_slave;
bond_info->unbalanced_load += skb->len;
}
if (tx_slave && SLAVE_IS_OK(tx_slave)) {
skb->dev = tx_slave->dev;
if (tx_slave != bond->current_slave) {
if (tx_slave != bond->curr_active_slave) {
memcpy(eth_data->h_source,
tx_slave->dev->dev_addr,
ETH_ALEN);
tx_slave->dev->dev_addr,
ETH_ALEN);
}
dev_queue_xmit(skb);
} else {
......@@ -1386,26 +1265,35 @@ bond_alb_xmit(struct sk_buff *skb, struct net_device *dev)
if (tx_slave) {
tlb_clear_slave(bond, tx_slave, 0);
}
dev_kfree_skb(skb);
goto free_out;
}
read_unlock(&bond->ptrlock);
out:
read_unlock(&bond->curr_slave_lock);
read_unlock(&bond->lock);
return 0;
free_out:
dev_kfree_skb(skb);
goto out;
}
void
bond_alb_monitor(struct bonding *bond)
void bond_alb_monitor(struct bonding *bond)
{
struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
struct slave *slave = NULL;
struct slave *slave;
int i;
read_lock(&bond->lock);
if ((bond->slave_cnt == 0) || !(bond->device->flags & IFF_UP)) {
if (bond->kill_timers) {
goto out;
}
if (bond->slave_cnt == 0) {
bond_info->tx_rebalance_counter = 0;
bond_info->lp_counter = 0;
goto out;
goto re_arm;
}
bond_info->tx_rebalance_counter++;
......@@ -1413,51 +1301,53 @@ bond_alb_monitor(struct bonding *bond)
/* send learning packets */
if (bond_info->lp_counter >= BOND_ALB_LP_TICKS) {
/* change of current_slave involves swapping of mac addresses.
/* change of curr_active_slave involves swapping of mac addresses.
* in order to avoid this swapping from happening while
* sending the learning packets, the ptrlock must be held for
* sending the learning packets, the curr_slave_lock must be held for
* read.
*/
read_lock(&bond->ptrlock);
slave = bond_get_first_slave(bond);
while (slave) {
read_lock(&bond->curr_slave_lock);
bond_for_each_slave(bond, slave, i) {
alb_send_learning_packets(slave,slave->dev->dev_addr);
slave = bond_get_next_slave(bond, slave);
}
read_unlock(&bond->ptrlock);
read_unlock(&bond->curr_slave_lock);
bond_info->lp_counter = 0;
}
/* rebalance tx traffic */
if (bond_info->tx_rebalance_counter >= BOND_TLB_REBALANCE_TICKS) {
read_lock(&bond->ptrlock);
slave = bond_get_first_slave(bond);
while (slave) {
read_lock(&bond->curr_slave_lock);
bond_for_each_slave(bond, slave, i) {
tlb_clear_slave(bond, slave, 1);
if (slave == bond->current_slave) {
if (slave == bond->curr_active_slave) {
SLAVE_TLB_INFO(slave).load =
bond_info->unbalanced_load /
BOND_TLB_REBALANCE_INTERVAL;
bond_info->unbalanced_load = 0;
}
slave = bond_get_next_slave(bond, slave);
}
read_unlock(&bond->ptrlock);
read_unlock(&bond->curr_slave_lock);
bond_info->tx_rebalance_counter = 0;
}
/* handle rlb stuff */
if (bond_info->rlb_enabled) {
/* the following code changes the promiscuity of the
* the current_slave. It needs to be locked with a
* the curr_active_slave. It needs to be locked with a
* write lock to protect from other code that also
* sets the promiscuity.
*/
write_lock(&bond->ptrlock);
write_lock(&bond->curr_slave_lock);
if (bond_info->primary_is_promisc &&
(++bond_info->rlb_promisc_timeout_counter >=
RLB_PROMISC_TIMEOUT)) {
(++bond_info->rlb_promisc_timeout_counter >= RLB_PROMISC_TIMEOUT)) {
bond_info->rlb_promisc_timeout_counter = 0;
......@@ -1465,12 +1355,13 @@ bond_alb_monitor(struct bonding *bond)
* because a slave was disabled then
* it can now leave promiscuous mode.
*/
dev_set_promiscuity(bond->current_slave->dev, -1);
dev_set_promiscuity(bond->curr_active_slave->dev, -1);
bond_info->primary_is_promisc = 0;
}
write_unlock(&bond->ptrlock);
if (bond_info->rlb_rebalance == 1) {
write_unlock(&bond->curr_slave_lock);
if (bond_info->rlb_rebalance) {
bond_info->rlb_rebalance = 0;
rlb_rebalance(bond);
}
......@@ -1490,28 +1381,23 @@ bond_alb_monitor(struct bonding *bond)
}
}
re_arm:
mod_timer(&(bond_info->alb_timer), jiffies + alb_delta_in_ticks);
out:
read_unlock(&bond->lock);
if (bond->device->flags & IFF_UP) {
/* re-arm the timer */
mod_timer(&(bond_info->alb_timer),
jiffies + (HZ/ALB_TIMER_TICKS_PER_SEC));
}
}
/* assumption: called before the slave is attched to the bond
/* assumption: called before the slave is attached to the bond
* and not locked by the bond lock
*/
int
bond_alb_init_slave(struct bonding *bond, struct slave *slave)
int bond_alb_init_slave(struct bonding *bond, struct slave *slave)
{
int err = 0;
int res;
err = alb_set_slave_mac_addr(slave, slave->perm_hwaddr,
res = alb_set_slave_mac_addr(slave, slave->perm_hwaddr,
bond->alb_info.rlb_enabled);
if (err) {
return err;
if (res) {
return res;
}
/* caller must hold the bond lock for write since the mac addresses
......@@ -1519,12 +1405,12 @@ bond_alb_init_slave(struct bonding *bond, struct slave *slave)
*/
write_lock_bh(&bond->lock);
err = alb_handle_addr_collision_on_attach(bond, slave);
res = alb_handle_addr_collision_on_attach(bond, slave);
write_unlock_bh(&bond->lock);
if (err) {
return err;
if (res) {
return res;
}
tlb_init_slave(slave);
......@@ -1540,8 +1426,7 @@ bond_alb_init_slave(struct bonding *bond, struct slave *slave)
}
/* Caller must hold bond lock for write */
void
bond_alb_deinit_slave(struct bonding *bond, struct slave *slave)
void bond_alb_deinit_slave(struct bonding *bond, struct slave *slave)
{
if (bond->slave_cnt > 1) {
alb_change_hw_addr_on_detach(bond, slave);
......@@ -1556,9 +1441,7 @@ bond_alb_deinit_slave(struct bonding *bond, struct slave *slave)
}
/* Caller must hold bond lock for read */
void
bond_alb_handle_link_change(struct bonding *bond, struct slave *slave,
char link)
void bond_alb_handle_link_change(struct bonding *bond, struct slave *slave, char link)
{
struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
......@@ -1582,109 +1465,111 @@ bond_alb_handle_link_change(struct bonding *bond, struct slave *slave,
}
/**
* bond_alb_assign_current_slave - assign new current_slave
* bond_alb_handle_active_change - assign new curr_active_slave
* @bond: our bonding struct
* @new_slave: new slave to assign
*
* Set the bond->current_slave to @new_slave and handle
* Set the bond->curr_active_slave to @new_slave and handle
* mac address swapping and promiscuity changes as needed.
*
* Caller must hold bond ptrlock for write (or bond lock for write)
* Caller must hold bond curr_slave_lock for write (or bond lock for write)
*/
void
bond_alb_assign_current_slave(struct bonding *bond, struct slave *new_slave)
void bond_alb_handle_active_change(struct bonding *bond, struct slave *new_slave)
{
struct slave *swap_slave = bond->current_slave;
struct slave *swap_slave;
int i;
if (bond->current_slave == new_slave) {
if (bond->curr_active_slave == new_slave) {
return;
}
if (bond->current_slave && bond->alb_info.primary_is_promisc) {
dev_set_promiscuity(bond->current_slave->dev, -1);
if (bond->curr_active_slave && bond->alb_info.primary_is_promisc) {
dev_set_promiscuity(bond->curr_active_slave->dev, -1);
bond->alb_info.primary_is_promisc = 0;
bond->alb_info.rlb_promisc_timeout_counter = 0;
}
bond->current_slave = new_slave;
swap_slave = bond->curr_active_slave;
bond->curr_active_slave = new_slave;
if (!new_slave || (bond->slave_cnt == 0)) {
return;
}
/* set the new current_slave to the bonds mac address
* i.e. swap mac addresses of old current_slave and new current_slave
/* set the new curr_active_slave to the bonds mac address
* i.e. swap mac addresses of old curr_active_slave and new curr_active_slave
*/
if (!swap_slave) {
struct slave *tmp_slave;
/* find slave that is holding the bond's mac address */
swap_slave = bond_get_first_slave(bond);
while (swap_slave) {
if (!memcmp(swap_slave->dev->dev_addr,
bond->device->dev_addr, ETH_ALEN)) {
bond_for_each_slave(bond, tmp_slave, i) {
if (!memcmp(tmp_slave->dev->dev_addr,
bond->dev->dev_addr, ETH_ALEN)) {
swap_slave = tmp_slave;
break;
}
swap_slave = bond_get_next_slave(bond, swap_slave);
}
}
/* current_slave must be set before calling alb_swap_mac_addr */
/* curr_active_slave must be set before calling alb_swap_mac_addr */
if (swap_slave) {
/* swap mac address */
alb_swap_mac_addr(bond, swap_slave, new_slave);
} else {
/* set the new_slave to the bond mac address */
alb_set_slave_mac_addr(new_slave, bond->device->dev_addr,
alb_set_slave_mac_addr(new_slave, bond->dev->dev_addr,
bond->alb_info.rlb_enabled);
/* fasten bond mac on new current slave */
alb_send_learning_packets(new_slave, bond->device->dev_addr);
alb_send_learning_packets(new_slave, bond->dev->dev_addr);
}
}
int
bond_alb_set_mac_address(struct net_device *dev, void *addr)
int bond_alb_set_mac_address(struct net_device *bond_dev, void *addr)
{
struct bonding *bond = dev->priv;
struct bonding *bond = bond_dev->priv;
struct sockaddr *sa = addr;
struct slave *swap_slave = NULL;
int error = 0;
struct slave *slave, *swap_slave;
int res;
int i;
if (!is_valid_ether_addr(sa->sa_data)) {
return -EADDRNOTAVAIL;
}
error = alb_set_mac_address(bond, addr);
if (error) {
return error;
res = alb_set_mac_address(bond, addr);
if (res) {
return res;
}
memcpy(dev->dev_addr, sa->sa_data, dev->addr_len);
memcpy(bond_dev->dev_addr, sa->sa_data, bond_dev->addr_len);
/* If there is no current_slave there is nothing else to do.
/* If there is no curr_active_slave there is nothing else to do.
* Otherwise we'll need to pass the new address to it and handle
* duplications.
*/
if (bond->current_slave == NULL) {
if (!bond->curr_active_slave) {
return 0;
}
swap_slave = bond_get_first_slave(bond);
while (swap_slave) {
if (!memcmp(swap_slave->dev->dev_addr, dev->dev_addr, ETH_ALEN)) {
swap_slave = NULL;
bond_for_each_slave(bond, slave, i) {
if (!memcmp(slave->dev->dev_addr, bond_dev->dev_addr, ETH_ALEN)) {
swap_slave = slave;
break;
}
swap_slave = bond_get_next_slave(bond, swap_slave);
}
if (swap_slave) {
alb_swap_mac_addr(bond, swap_slave, bond->current_slave);
alb_swap_mac_addr(bond, swap_slave, bond->curr_active_slave);
} else {
alb_set_slave_mac_addr(bond->current_slave, dev->dev_addr,
alb_set_slave_mac_addr(bond->curr_active_slave, bond_dev->dev_addr,
bond->alb_info.rlb_enabled);
alb_send_learning_packets(bond->current_slave, dev->dev_addr);
alb_send_learning_packets(bond->curr_active_slave, bond_dev->dev_addr);
if (bond->alb_info.rlb_enabled) {
/* inform clients mac address has changed */
rlb_req_update_slave_clients(bond, bond->current_slave);
rlb_req_update_slave_clients(bond, bond->curr_active_slave);
}
}
......
......@@ -24,6 +24,9 @@
* 2003/08/06 - Amir Noam <amir.noam at intel dot com>
* - Add support for setting bond's MAC address with special
* handling required for ALB/TLB.
*
* 2003/09/24 - Shmulik Hen <shmulik.hen at intel dot com>
* - Code cleanup and style changes
*/
#ifndef __BOND_ALB_H__
......@@ -126,10 +129,10 @@ void bond_alb_deinitialize(struct bonding *bond);
int bond_alb_init_slave(struct bonding *bond, struct slave *slave);
void bond_alb_deinit_slave(struct bonding *bond, struct slave *slave);
void bond_alb_handle_link_change(struct bonding *bond, struct slave *slave, char link);
void bond_alb_assign_current_slave(struct bonding *bond, struct slave *new_slave);
int bond_alb_xmit(struct sk_buff *skb, struct net_device *dev);
void bond_alb_handle_active_change(struct bonding *bond, struct slave *new_slave);
int bond_alb_xmit(struct sk_buff *skb, struct net_device *bond_dev);
void bond_alb_monitor(struct bonding *bond);
int bond_alb_set_mac_address(struct net_device *dev, void *addr);
int bond_alb_set_mac_address(struct net_device *bond_dev, void *addr);
#endif /* __BOND_ALB_H__ */
This source diff could not be displayed because it is too large. You can view the blob instead.
......@@ -9,7 +9,7 @@
*
* This software may be used and distributed according to the terms
* of the GNU Public License, incorporated herein by reference.
*
*
*
* 2003/03/18 - Amir Noam <amir.noam at intel dot com>,
* Tsippy Mendelson <tsippy.mendelson at intel dot com> and
......@@ -22,159 +22,205 @@
*
* 2003/05/01 - Shmulik Hen <shmulik.hen at intel dot com>
* - Added support for Transmit load balancing mode.
*
* 2003/09/24 - Shmulik Hen <shmulik.hen at intel dot com>
* - Code cleanup and style changes
*/
#ifndef _LINUX_BONDING_H
#define _LINUX_BONDING_H
#include <linux/timer.h>
#include <linux/proc_fs.h>
#include <linux/if_bonding.h>
#include "bond_3ad.h"
#include "bond_alb.h"
#ifdef BONDING_DEBUG
// use this like so: BOND_PRINT_DBG(("foo = %d, bar = %d", foo, bar));
#define BOND_PRINT_DBG(X) \
do { \
printk(KERN_DEBUG "%s (%d)", __FUNCTION__, __LINE__); \
printk X; \
printk("\n"); \
} while(0)
#define DRV_VERSION "2.5.0"
#define DRV_RELDATE "December 1, 2003"
#define DRV_NAME "bonding"
#define DRV_DESCRIPTION "Ethernet Channel Bonding Driver"
#ifdef BONDING_DEBUG
#define dprintk(fmt, args...) \
printk(KERN_DEBUG \
DRV_NAME ": %s() %d: " fmt, __FUNCTION__, __LINE__ , ## args )
#else
#define BOND_PRINT_DBG(X)
#define dprintk(fmt, args...)
#endif /* BONDING_DEBUG */
#define IS_UP(dev) ((((dev)->flags & (IFF_UP)) == (IFF_UP)) && \
(netif_running(dev) && netif_carrier_ok(dev)))
#define IS_UP(dev) \
((((dev)->flags & IFF_UP) == IFF_UP) && \
netif_running(dev) && \
netif_carrier_ok(dev))
/*
* Checks whether bond is ready for transmit.
*
* Caller must hold bond->lock
*/
#define BOND_IS_OK(bond) \
(((bond)->dev->flags & IFF_UP) && \
netif_running((bond)->dev) && \
((bond)->slave_cnt > 0))
/* Checks whether the dev is ready for transmit. We do not check netif_running
* since a device can be stopped by the driver for short periods of time for
* maintainance. dev_queue_xmit() handles this by queing the packet until the
* the dev is running again. Keeping packets ordering requires sticking the
* same dev as much as possible
/*
* Checks whether slave is ready for transmit.
*/
#define SLAVE_IS_OK(slave) \
((((slave)->dev->flags & (IFF_UP)) == (IFF_UP)) && \
netif_carrier_ok((slave)->dev) && \
#define SLAVE_IS_OK(slave) \
(((slave)->dev->flags & IFF_UP) && \
netif_running((slave)->dev) && \
((slave)->link == BOND_LINK_UP) && \
((slave)->state == BOND_STATE_ACTIVE))
typedef struct slave {
#define USES_PRIMARY(mode) \
(((mode) == BOND_MODE_ACTIVEBACKUP) || \
((mode) == BOND_MODE_TLB) || \
((mode) == BOND_MODE_ALB))
/*
* Less bad way to call ioctl from within the kernel; this needs to be
* done some other way to get the call out of interrupt context.
* Needs "ioctl" variable to be supplied by calling context.
*/
#define IOCTL(dev, arg, cmd) ({ \
int res = 0; \
mm_segment_t fs = get_fs(); \
set_fs(get_ds()); \
res = ioctl(dev, arg, cmd); \
set_fs(fs); \
res; })
/**
* bond_for_each_slave_from - iterate the slaves list from a starting point
* @bond: the bond holding this list.
* @pos: current slave.
* @cnt: counter for max number of moves
* @start: starting point.
*
* Caller must hold bond->lock
*/
#define bond_for_each_slave_from(bond, pos, cnt, start) \
for (cnt = 0, pos = start; \
cnt < (bond)->slave_cnt; \
cnt++, pos = (pos)->next)
/**
* bond_for_each_slave_from_to - iterate the slaves list from start point to stop point
* @bond: the bond holding this list.
* @pos: current slave.
* @cnt: counter for number max of moves
* @start: start point.
* @stop: stop point.
*
* Caller must hold bond->lock
*/
#define bond_for_each_slave_from_to(bond, pos, cnt, start, stop) \
for (cnt = 0, pos = start; \
((cnt < (bond)->slave_cnt) && (pos != (stop)->next)); \
cnt++, pos = (pos)->next)
/**
* bond_for_each_slave - iterate the slaves list from head
* @bond: the bond holding this list.
* @pos: current slave.
* @cnt: counter for max number of moves
*
* Caller must hold bond->lock
*/
#define bond_for_each_slave(bond, pos, cnt) \
bond_for_each_slave_from(bond, pos, cnt, (bond)->first_slave)
struct slave {
struct net_device *dev; /* first - usefull for panic debug */
struct slave *next;
struct slave *prev;
struct net_device *dev;
short delay;
unsigned long jiffies;
char link; /* one of BOND_LINK_XXXX */
char state; /* one of BOND_STATE_XXXX */
unsigned short original_flags;
u32 link_failure_count;
s16 delay;
u32 jiffies;
s8 link; /* one of BOND_LINK_XXXX */
s8 state; /* one of BOND_STATE_XXXX */
u32 original_flags;
u32 link_failure_count;
u16 speed;
u8 duplex;
u8 perm_hwaddr[ETH_ALEN];
struct ad_slave_info ad_info; /* HUGE - better to dynamically alloc */
struct tlb_slave_info tlb_info;
} slave_t;
};
/*
* Here are the locking policies for the two bonding locks:
*
* 1) Get bond->lock when reading/writing slave list.
* 2) Get bond->ptrlock when reading/writing bond->current_slave.
* 2) Get bond->curr_slave_lock when reading/writing bond->curr_active_slave.
* (It is unnecessary when the write-lock is put with bond->lock.)
* 3) When we lock with bond->ptrlock, we must lock with bond->lock
* 3) When we lock with bond->curr_slave_lock, we must lock with bond->lock
* beforehand.
*/
typedef struct bonding {
slave_t *next;
slave_t *prev;
slave_t *current_slave;
slave_t *primary_slave;
slave_t *current_arp_slave;
__s32 slave_cnt;
struct bonding {
struct net_device *dev; /* first - usefull for panic debug */
struct slave *first_slave;
struct slave *curr_active_slave;
struct slave *current_arp_slave;
struct slave *primary_slave;
s32 slave_cnt; /* never change this value outside the attach/detach wrappers */
rwlock_t lock;
rwlock_t ptrlock;
struct timer_list mii_timer;
struct timer_list arp_timer;
struct net_device_stats stats;
rwlock_t curr_slave_lock;
struct timer_list mii_timer;
struct timer_list arp_timer;
s8 kill_timers;
struct net_device_stats stats;
#ifdef CONFIG_PROC_FS
struct proc_dir_entry *bond_proc_file;
char procdir_name[IFNAMSIZ];
struct proc_dir_entry *proc_entry;
char proc_file_name[IFNAMSIZ];
#endif /* CONFIG_PROC_FS */
struct list_head bond_list;
struct net_device *device;
struct dev_mc_list *mc_list;
unsigned short flags;
struct ad_bond_info ad_info;
struct alb_bond_info alb_info;
} bonding_t;
/* Forward declarations */
void bond_set_slave_active_flags(slave_t *slave);
void bond_set_slave_inactive_flags(slave_t *slave);
/**
* These functions can be used for iterating the slave list
* (which is circular)
* Caller must hold bond lock for read
*/
extern inline struct slave*
bond_get_first_slave(struct bonding *bond)
{
/* if there are no slaves return NULL */
if (bond->next == (slave_t *)bond) {
return NULL;
}
return bond->next;
}
/**
* Caller must hold bond lock for read
*/
extern inline struct slave*
bond_get_next_slave(struct bonding *bond, struct slave *slave)
{
/* If we have reached the last slave return NULL */
if (slave->next == bond->next) {
return NULL;
}
return slave->next;
}
struct list_head bond_list;
struct dev_mc_list *mc_list;
u16 flags;
struct ad_bond_info ad_info;
struct alb_bond_info alb_info;
};
/**
* Returns NULL if the net_device does not belong to any of the bond's slaves
*
* Caller must hold bond lock for read
*/
extern inline struct slave*
bond_get_slave_by_dev(struct bonding *bond, struct net_device *slave_dev)
extern inline struct slave *bond_get_slave_by_dev(struct bonding *bond, struct net_device *slave_dev)
{
struct slave *our_slave = bond->next;
struct slave *slave = NULL;
int i;
/* check if the list of slaves is empty */
if (our_slave == (slave_t *)bond) {
return NULL;
}
for (; our_slave; our_slave = bond_get_next_slave(bond, our_slave)) {
if (our_slave->dev == slave_dev) {
bond_for_each_slave(bond, slave, i) {
if (slave->dev == slave_dev) {
break;
}
}
return our_slave;
return slave;
}
extern inline struct bonding*
bond_get_bond_by_slave(struct slave *slave)
extern inline struct bonding *bond_get_bond_by_slave(struct slave *slave)
{
if (!slave || !slave->dev->master) {
return NULL;
}
return (struct bonding *)(slave->dev->master->priv);
return (struct bonding *)slave->dev->master->priv;
}
extern inline void bond_set_slave_inactive_flags(struct slave *slave)
{
slave->state = BOND_STATE_BACKUP;
slave->dev->flags |= IFF_NOARP;
}
extern inline void bond_set_slave_active_flags(struct slave *slave)
{
slave->state = BOND_STATE_ACTIVE;
slave->dev->flags &= ~IFF_NOARP;
}
#endif /* _LINUX_BONDING_H */
......
/*
* Bond several ethernet interfaces into a Cisco, running 'Etherchannel'.
*
*
*
* Portions are (c) Copyright 1995 Simon "Guru Aleph-Null" Janes
* NCM: Network and Communications Management, Inc.
*
......@@ -10,11 +10,11 @@
*
* This software may be used and distributed according to the terms
* of the GNU Public License, incorporated herein by reference.
*
*
* 2003/03/18 - Amir Noam <amir.noam at intel dot com>
* - Added support for getting slave's speed and duplex via ethtool.
* Needed for 802.3ad and other future modes.
*
*
* 2003/03/18 - Tsippy Mendelson <tsippy.mendelson at intel dot com> and
* Shmulik Hen <shmulik.hen at intel dot com>
* - Enable support of modes that need to use the unique mac address of
......@@ -42,7 +42,7 @@
#include <linux/if_ether.h>
/* userland - kernel ABI version (2003/05/08) */
#define BOND_ABI_VERSION 1
#define BOND_ABI_VERSION 2
/*
* We can remove these ioctl definitions in 2.5. People should use the
......@@ -77,10 +77,6 @@
#define BOND_DEFAULT_MAX_BONDS 1 /* Default maximum number of devices to support */
#define BOND_MULTICAST_DISABLED 0
#define BOND_MULTICAST_ACTIVE 1
#define BOND_MULTICAST_ALL 2
typedef struct ifbond {
__s32 bond_mode;
__s32 num_slaves;
......@@ -90,9 +86,9 @@ typedef struct ifbond {
typedef struct ifslave
{
__s32 slave_id; /* Used as an IN param to the BOND_SLAVE_INFO_QUERY ioctl */
char slave_name[IFNAMSIZ];
char link;
char state;
__s8 slave_name[IFNAMSIZ];
__s8 link;
__s8 state;
__u32 link_failure_count;
} ifslave;
......@@ -115,3 +111,4 @@ struct ad_info {
* tab-width: 8
* End:
*/
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment