Skip to content
Projects
Groups
Snippets
Help
Loading...
Help
Support
Keyboard shortcuts
?
Submit feedback
Contribute to GitLab
Sign in / Register
Toggle navigation
L
linux
Project overview
Project overview
Details
Activity
Releases
Repository
Repository
Files
Commits
Branches
Tags
Contributors
Graph
Compare
Issues
0
Issues
0
List
Boards
Labels
Milestones
Merge Requests
0
Merge Requests
0
Analytics
Analytics
Repository
Value Stream
Wiki
Wiki
Snippets
Snippets
Members
Members
Collapse sidebar
Close sidebar
Activity
Graph
Create a new issue
Commits
Issue Boards
Open sidebar
nexedi
linux
Commits
d8908c80
Commit
d8908c80
authored
Dec 20, 2002
by
Chad N. Tindel
Committed by
David S. Miller
Dec 20, 2002
Browse files
Options
Browse Files
Download
Email Patches
Plain Diff
[BONDING]: Update to version 2.4.20-20021210.
parent
0dfc54e4
Changes
4
Hide whitespace changes
Inline
Side-by-side
Showing
4 changed files
with
1456 additions
and
354 deletions
+1456
-354
Documentation/networking/bonding.txt
Documentation/networking/bonding.txt
+377
-133
Documentation/networking/ifenslave.c
Documentation/networking/ifenslave.c
+90
-3
drivers/net/bonding.c
drivers/net/bonding.c
+982
-215
include/linux/if_bonding.h
include/linux/if_bonding.h
+7
-3
No files found.
Documentation/networking/bonding.txt
View file @
d8908c80
...
...
@@ -17,6 +17,23 @@ extreme-linux and beowulf sites will not work with this version of the driver.
For new versions of the driver, patches for older kernels and the updated
userspace tools, please follow the links at the end of this file.
Table of Contents
=================
Installation
Bond Configuration
Module Parameters
Configuring Multiple Bonds
Switch Configuration
Verifying Bond Configuration
Frequently Asked Questions
High Availability
Promiscuous Sniffing notes
Limitations
Resources and Links
Installation
============
...
...
@@ -51,16 +68,21 @@ To install ifenslave.c, do:
# gcc -Wall -Wstrict-prototypes -O -I/usr/src/linux/include ifenslave.c -o ifenslave
# cp ifenslave /sbin/ifenslave
3) Configure your system
------------------------
Also see the following section on the module parameters. You will need to add
at least the following line to /etc/conf.modules (or /etc/modules.conf):
Bond Configuration
==================
You will need to add at least the following line to /etc/modules.conf
so the bonding driver will automatically load when the bond0 interface is
configured. Refer to the modules.conf manual page for specific modules.conf
syntax details. The Module Parameters section of this document describes each
bonding driver parameter.
alias bond0 bonding
Use standard distribution techniques to define bond0 network interface. For
example, on modern Red
Hat distributions, create
ifcfg-bond0 file in
/etc/sysconfig/network-scripts directory that looks like this
:
Use standard distribution techniques to define
the
bond0 network interface. For
example, on modern Red
Hat distributions, create an
ifcfg-bond0 file in
the /etc/sysconfig/network-scripts directory that resembles the following
:
DEVICE=bond0
IPADDR=192.168.1.1
...
...
@@ -71,12 +93,12 @@ ONBOOT=yes
BOOTPROTO=none
USERCTL=no
(
put the appropriate values for you network instead of 192.168.1).
(
use appropriate values for your network above)
All interfaces that are part of
the trunk,
should have SLAVE and MASTER
definitions. For example, in the case of RedHat, if you wish to make eth0 and
eth1
(or other interfaces) a part of the bonding interface bond0, their config
files (ifcfg-eth0, ifcfg-eth1, etc.) should look like this
:
All interfaces that are part of
a bond
should have SLAVE and MASTER
definitions. For example, in the case of Red
Hat, if you wish to make eth0 and
eth1
a part of the bonding interface bond0, their config files (ifcfg-eth0 and
ifcfg-eth1) should resemble the following
:
DEVICE=eth0
USERCTL=no
...
...
@@ -85,89 +107,261 @@ MASTER=bond0
SLAVE=yes
BOOTPROTO=none
(use DEVICE=eth1 for eth1 and MASTER=bond1 for bond1 if you have configured
second bonding interface).
Use DEVICE=eth1 in the ifcfg-eth1 config file. If you configure a second bonding
interface (bond1), use MASTER=bond1 in the config file to make the network
interface be a slave of bond1.
Restart the networking subsystem or just bring up the bonding device if your
administration tools allow it. Otherwise, reboot.
(For the case of RedHat
distros, you can do `ifup bond0' or `/etc/rc.d/init.d/network restart'.)
administration tools allow it. Otherwise, reboot.
On Red Hat distros you can
issue `ifup bond0' or `/etc/rc.d/init.d/network restart'.
If the administration tools of your distribution do not support master/slave
notation in configuration of network interfaces, you will need to configure
the bonding device with the following commands manually:
notation in configuring network interfaces, you will need to manually configure
the bonding device with the following commands:
# /sbin/ifconfig bond0 192.168.1.1 netmask 255.255.255.0 \
broadcast 192.168.1.255 up
# /sbin/ifconfig bond0 192.168.1.1 up
# /sbin/ifenslave bond0 eth0
# /sbin/ifenslave bond0 eth1
(substitute 192.168.1.1 with your IP address and add custom network and custom
netmask to the arguments of ifconfig if required).
(use appropriate values for your network above)
You can then create a script
with these commands and put it into the appropriate
rc directory.
You can then create a script
containing these commands and place it in the
appropriate
rc directory.
If you specifically need
that all your network drivers are loaded before the
bonding driver, use one of modutils' powerful features : in your modules.conf,
tell that when asked for bond0, modprobe should first load all your interfaces :
If you specifically need
all network drivers loaded before the bonding driver,
adding the following line to modules.conf will cause the network driver for
eth0 and eth1 to be loaded before the bonding driver.
probeall bond0 eth0 eth1 bonding
Be careful not to reference bond0 itself at the end of the line, or modprobe will
die in an endless recursive loop.
4) Module parameters.
---------------------
The following module parameters can be passed:
mode=
Possible values are 0 (round robin policy, default) and 1 (active backup
policy), and 2 (XOR). See question 9 and the HA section for additional info.
miimon=
Use integer value for the frequency (in ms) of MII link monitoring. Zero value
is default and means the link monitoring will be disabled. A good value is 100
if you wish to use link monitoring. See HA section for additional info.
downdelay=
Use integer value for delaying disabling a link by this number (in ms) after
the link failure has been detected. Must be a multiple of miimon. Default
value is zero. See HA section for additional info.
updelay=
Use integer value for delaying enabling a link by this number (in ms) after
the "link up" status has been detected. Must be a multiple of miimon. Default
value is zero. See HA section for additional info.
Be careful not to reference bond0 itself at the end of the line, or modprobe
will die in an endless recursive loop.
To have device characteristics (such as MTU size) propagate to slave devices,
set the bond characteristics before enslaving the device. The characteristics
are propagated during the enslave process.
If running SNMP agents, the bonding driver should be loaded before any network
drivers participating in a bond. This requirement is due to the the interface
index (ipAdEntIfIndex) being associated to the first interface found with a
given IP address. That is, there is only one ipAdEntIfIndex for each IP
address. For example, if eth0 and eth1 are slaves of bond0 and the driver for
eth0 is loaded before the bonding driver, the interface for the IP address
will be associated with the eth0 interface. This configuration is shown below,
the IP address 192.168.1.1 has an interface index of 2 which indexes to eth0
in the ifDescr table (ifDescr.2).
interfaces.ifTable.ifEntry.ifDescr.1 = lo
interfaces.ifTable.ifEntry.ifDescr.2 = eth0
interfaces.ifTable.ifEntry.ifDescr.3 = eth1
interfaces.ifTable.ifEntry.ifDescr.4 = eth2
interfaces.ifTable.ifEntry.ifDescr.5 = eth3
interfaces.ifTable.ifEntry.ifDescr.6 = bond0
ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.10.10.10.10 = 5
ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.192.168.1.1 = 2
ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.10.74.20.94 = 4
ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.127.0.0.1 = 1
This problem is avoided by loading the bonding driver before any network
drivers participating in a bond. Below is an example of loading the bonding
driver first, the IP address 192.168.1.1 is correctly associated with ifDescr.2.
interfaces.ifTable.ifEntry.ifDescr.1 = lo
interfaces.ifTable.ifEntry.ifDescr.2 = bond0
interfaces.ifTable.ifEntry.ifDescr.3 = eth0
interfaces.ifTable.ifEntry.ifDescr.4 = eth1
interfaces.ifTable.ifEntry.ifDescr.5 = eth2
interfaces.ifTable.ifEntry.ifDescr.6 = eth3
ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.10.10.10.10 = 6
ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.192.168.1.1 = 2
ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.10.74.20.94 = 5
ip.ipAddrTable.ipAddrEntry.ipAdEntIfIndex.127.0.0.1 = 1
While some distributions may not report the interface name in ifDescr,
the association between the IP address and IfIndex remains and SNMP
functions such as Interface_Scan_Next will report that association.
Module Parameters
=================
arp_interval=
Optional parameters for the bonding driver can be supplied as command line
arguments to the insmod command. Typically, these parameters are specified in
the file /etc/modules.conf (see the manual page for modules.conf). The
available bonding driver parameters are listed below. If a parameter is not
specified the default value is used. When initially configuring a bond, it
is recommended "tail -f /var/log/messages" be run in a separate window to
watch for bonding driver error messages.
It is critical that either the miimon or arp_interval and arp_ip_target
parameters be specified, otherwise serious network degradation will occur
during link failures.
mode
Specifies one of four bonding policies. The default is round-robin.
Possible values are:
0 Round-robin policy: Transmit in a sequential order from the
first available slave through the last. This mode provides
load balancing and fault tolerance.
1 Active-backup policy: Only one slave in the bond is active. A
different slave becomes active if, and only if, the active slave
fails. The bond's MAC address is externally visible on only
one port (network adapter) to avoid confusing the switch.
This mode provides fault tolerance.
2 XOR policy: Transmit based on [(source MAC address XOR'd with
destination MAC address) modula slave count]. This selects the
same slave for each destination MAC address. This mode provides
load balancing and fault tolerance.
3 Broadcast policy: transmits everything on all slave interfaces.
This mode provides fault tolerance.
miimon
Specifies the frequency in milli-seconds that MII link monitoring will
occur. A value of zero disables MII link monitoring. A value of
100 is a good starting point. See High Availability section for
additional information. The default value is 0.
downdelay
Specifies the delay time in milli-seconds to disable a link after a
link failure has been detected. This should be a multiple of miimon
value, otherwise the value will be rounded. The default value is 0.
updelay
Specifies the delay time in milli-seconds to enable a link after a
link up status has been detected. This should be a multiple of miimon
value, otherwise the value will be rounded. The default value is 0.
arp_interval
Specifies the ARP monitoring frequency in milli-seconds.
If ARP monitoring is used in a load-balancing mode (mode 0 or 2), the
switch should be configured in a mode that evenly distributes packets
across all links - such as round-robin. If the switch is configured to
distribute the packets in an XOR fashion, all replies from the ARP
targets will be received on the same link which could cause the other
team members to fail. ARP monitoring should not be used in conjunction
with miimon. A value of 0 disables ARP monitoring. The default value
is 0.
arp_ip_target
Specifies the ip addresses to use when arp_interval is > 0. These are
the targets of the ARP request sent to determine the health of the link
to the targets. Specify these values in ddd.ddd.ddd.ddd format.
Multiple ip adresses must be seperated by a comma. At least one ip
address needs to be given for ARP monitoring to work. The maximum number
of targets that can be specified is set at 16.
primary
A string (eth0, eth2, etc) to equate to a primary device. If this
value is entered, and the device is on-line, it will be used first as
the output media. Only when this device is off-line, will alternate
devices be used. Otherwise, once a failover is detected and a new
default output is chosen, it will remain the output media until it too
fails. This is useful when one slave was preferred over another, i.e.
when one slave is 1000Mbps and another is 100Mbps. If the 1000Mbps
slave fails and is later restored, it may be preferred the faster slave
gracefully become the active slave - without deliberately failing the
100Mbps slave. Specifying a primary is only valid in active-backup mode.
multicast
Integer value for the mode of operation for multicast support.
Possible values are:
0 Disabled (no multicast support)
1 Enabled on active slave only, useful in active-backup mode
2 Enabled on all slaves, this is the default
Configuring Multiple Bonds
==========================
If several bonding interfaces are required, the driver must be loaded
multiple times. For example, to configure two bonding interfaces with link
monitoring performed every 100 milli-seconds, the /etc/conf.modules should
resemble the following:
Use integer value for the frequency (in ms) of arp monitoring. Zero value
is default and means the arp monitoring will be disabled. See HA section
for additional info. This field is value in active_backup mode only.
alias bond0 bonding
alias bond1 bonding
arp_ip_target=
options bond0 miimon=100
options bond1 -o bonding1 miimon=100
An ip address to use when arp_interval is > 0. This is the target of the
arp request sent to determine the health of the link to the target.
Specify this value in ddd.ddd.ddd.ddd format.
Configuring Multiple ARP Targets
================================
If you need to configure several bonding devices, the driver must be loaded
several times. I.e. for two bonding devices, your /etc/conf.modules must look
like this:
While ARP monitoring can be done with just one target, it can be usefull
in a High Availability setup to have several targets to monitor. In the
case of just one target, the target itself may go down or have a problem
making it unresponsive to ARP requests. Having an additional target (or
several) would increase the reliability of the ARP monitoring.
Multiple ARP targets must be seperated by commas as follows:
# example options for ARP monitoring with three targets
alias bond0 bonding
alias bond1 bonding
options bond0 arp_interval=60 arp_ip_target=192.168.0.1,192.168.0.3,192.168.0.9
options bond0 miimon=100
options bond1 -o bonding1 miimon=100
For just a single target the options would resemble:
5) Testing configuration
------------------------
You can test the configuration and transmit policy with ifconfig. For example,
for round robin policy, you should get something like this:
# example options for ARP monitoring with one target
alias bond0 bonding
options bond0 arp_interval=60 arp_ip_target=192.168.0.100
Switch Configuration
====================
While the switch does not need to be configured when the active-backup
policy is used (mode=1), it does need to be configured for the round-robin,
XOR, and broadcast policies (mode=0, mode=2, and mode=3).
Verifying Bond Configuration
============================
1) Bonding information files
----------------------------
The bonding driver information files reside in the /proc/net/bond* directories.
Sample contents of /proc/net/bond0/info after the driver is loaded with
parameters of mode=0 and miimon=1000 is shown below.
Bonding Mode: load balancing (round-robin)
Currently Active Slave: eth0
MII Status: up
MII Polling Interval (ms): 1000
Up Delay (ms): 0
Down Delay (ms): 0
Slave Interface: eth1
MII Status: up
Link Failure Count: 1
Slave Interface: eth0
MII Status: up
Link Failure Count: 1
2) Network verification
-----------------------
The network configuration can be verified using the ifconfig command. In
the example below, the bond0 interface is the master (MASTER) while eth0 and
eth1 are slaves (SLAVE). Notice all slaves of bond0 have the same MAC address
(HWaddr) as bond0.
[root]# /sbin/ifconfig
bond0 Link encap:Ethernet HWaddr 00:C0:F0:1F:37:B4
...
...
@@ -193,12 +387,13 @@ eth1 Link encap:Ethernet HWaddr 00:C0:F0:1F:37:B4
collisions:0 txqueuelen:100
Interrupt:9 Base address:0x1400
Questions :
===========
Frequently Asked Questions
==========================
1. Is it SMP safe?
Yes.
The old 2.0.xx channel bonding patch was not SMP safe.
Yes. The old 2.0.xx channel bonding patch was not SMP safe.
The new driver was designed to be SMP safe from the start.
2. What type of cards will work with it?
...
...
@@ -209,31 +404,30 @@ Questions :
3. How many bonding devices can I have?
One for each module you load. See section on
module p
arameters for how
One for each module you load. See section on
Module P
arameters for how
to accomplish this.
4. How many slaves can a bonding device have?
Limited by the number of network interfaces Linux supports and the
number of cards you can place in your system.
Limited by the number of network interfaces Linux supports and
/or
the
number of
network
cards you can place in your system.
5. What happens when a slave link dies?
If your ethernet cards support MII
status monitoring and the MII
monitoring has been enabled in the driver (see description of module
parameters), there will be no adverse consequences. This release
of the bonding driver knows how to get the MII information and
If your ethernet cards support MII
or ETHTOOL link status monitoring
and the MII monitoring has been enabled in the driver (see description
of module parameters), there will be no adverse consequences. This
release
of the bonding driver knows how to get the MII information and
enables or disables its slaves according to their link status.
See section on H
A
for additional information.
See section on H
igh Availability
for additional information.
For ethernet cards not supporting MII status, or if you wish to
verify that packets have been both send and received, you may
configure the arp_interval and arp_ip_target. If packets have
not been sent or received during this interval, an arp request
is sent to the target to generate send and receive traffic.
If after this interval, either the successful send and/or
receive count has not incremented, the next slave in the sequence
will become the active slave.
For ethernet cards not supporting MII status, the arp_interval and
arp_ip_target parameters must be specified for bonding to work
correctly. If packets have not been sent or received during the
specified arp_interval durration, an ARP request is sent to the targets
to generate send and receive traffic. If after this interval, either
the successful send and/or receive count has not incremented, the next
slave in the sequence will become the active slave.
If neither mii_monitor and arp_interval is configured, the bonding
driver will not handle this situation very well. The driver will
...
...
@@ -245,11 +439,12 @@ Questions :
6. Can bonding be used for High Availability?
Yes, if you use MII monitoring and ALL your cards support MII link
status reporting. See section on H
A
for more information.
status reporting. See section on H
igh Availability
for more information.
7. Which switches/systems does it work with?
In round-robin mode, it works with systems that support trunking:
In round-robin and XOR mode, it works with systems that support
trunking:
* Cisco 5500 series (look for EtherChannel support).
* SunTrunking software.
...
...
@@ -259,7 +454,8 @@ Questions :
units.
* Linux bonding, of course !
In Active-backup mode, it should work with any Layer-II switches.
In active-backup mode, it should work with any Layer-II switche.
8. Where does a bonding device get its MAC address from?
...
...
@@ -297,55 +493,68 @@ Questions :
9. Which transmit polices can be used?
Round
robin, based on the order of enslaving, the output device
is selected base on the next available slave.
Regardless of
Round
-
robin, based on the order of enslaving, the output device
is selected base on the next available slave. Regardless of
the source and/or destination of the packet.
XOR, based on (src hw addr XOR dst hw addr) % slave cnt. This
selects the same slave for each destination hw address.
Active-backup policy that ensures that one and only one device will
transmit at any given moment. Active-backup policy is useful for
implementing high availability solutions using two hubs (see
section on H
A
).
section on H
igh Availability
).
High availability
=================
XOR, based on (src hw addr XOR dst hw addr) % slave count. This
policy selects the same slave for each destination hw address.
Broadcast policy transmits everything on all slave interfaces.
To implement high availability using the bonding driver, you need to
compile the driver as module because currently it is the only way to pass
parameters to the driver. This may change in the future.
High availability is achieved by using MII status reporting. You need to
verify that all your interfaces support MII link status reporting. On Linux
kernel 2.2.17, all the 100 Mbps capable drivers and yellowfin gigabit driver
support it. If your system has an interface that does not support MII status
reporting, a failure of its link will not be detected!
High Availability
=================
The bonding driver can regularly check all its slaves links by checking the
MII status registers. The check interval is specified by the module argument
"miimon" (MII monitoring). It takes an integer that represents the
checking time in milliseconds. It should not come to close to (1000/HZ)
(10 ms on i386) because it may then reduce the system interactivity. 100 ms
seems to be a good value. It means that a dead link will be detected at most
100 ms after it goes down.
To implement high availability using the bonding driver, the driver needs to be
compiled as a module, because currently it is the only way to pass parameters
to the driver. This may change in the future.
High availability is achieved by using MII or ETHTOOL status reporting. You
need to verify that all your interfaces support MII or ETHTOOL link status
reporting. On Linux kernel 2.2.17, all the 100 Mbps capable drivers and
yellowfin gigabit driver support MII. To determine if ETHTOOL link reporting
is available for interface eth0, type "ethtool eth0" and the "Link detected:"
line should contain the correct link status. If your system has an interface
that does not support MII or ETHTOOL status reporting, a failure of its link
will not be detected! A message indicating MII and ETHTOOL is not supported by
a network driver is logged when the bonding driver is loaded with a non-zero
miimon value.
The bonding driver can regularly check all its slaves links using the ETHTOOL
IOCTL (ETHTOOL_GLINK command) or by checking the MII status registers. The
check interval is specified by the module argument "miimon" (MII monitoring).
It takes an integer that represents the checking time in milliseconds. It
should not come to close to (1000/HZ) (10 milli-seconds on i386) because it
may then reduce the system interactivity. A value of 100 seems to be a good
starting point. It means that a dead link will be detected at most 100
milli-seconds after it goes down.
Example:
# modprobe bonding miimon=100
Or, put
in your /etc/modules.conf
:
Or, put
the following lines in /etc/modules.conf
:
alias bond0 bonding
options bond0 miimon=100
There are currently two policies for high availability, depending on whether
a) hosts are connected to a single host or switch that support trunking
b) hosts are connected to several different switches or a single switch that
does not support trunking.
There are currently two policies for high availability. They are dependent on
whether:
a) hosts are connected to a single host or switch that support trunking
b) hosts are connected to several different switches or a single switch that
does not support trunking
1) HA on a single switch or host - load balancing
-------------------------------------------------
1) High Availability on a single switch or host - load balancing
----------------------------------------------------------------
It is the easiest to set up and to understand. Simply configure the
remote equipment (host or switch) to aggregate traffic over several
ports (Trunk, EtherChannel, etc.) and configure the bonding interfaces.
...
...
@@ -356,7 +565,7 @@ encounter problems on some buggy switches that disable the trunk for a
long time if all ports in a trunk go down. This is not Linux, but really
the switch (reboot it to ensure).
Example 1 : host to host at
doubl
e speed
Example 1 : host to host at
twice th
e speed
+----------+ +----------+
| |eth0 eth0| |
...
...
@@ -370,7 +579,7 @@ Example 1 : host to host at double speed
# ifconfig bond0 addr
# ifenslave bond0 eth0 eth1
Example 2 : host to switch at
doubl
e speed
Example 2 : host to switch at
twice th
e speed
+----------+ +----------+
| |eth0 port1| |
...
...
@@ -384,7 +593,9 @@ Example 2 : host to switch at double speed
# ifconfig bond0 addr and port2
# ifenslave bond0 eth0 eth1
2) HA on two or more switches (or a single switch without trunking support)
2) High Availability on two or more switches (or a single switch without
trunking support)
---------------------------------------------------------------------------
This mode is more problematic because it relies on the fact that there
are multiple ports and the host's MAC address should be visible on one
...
...
@@ -423,14 +634,14 @@ point of failure" solution.
+--------------+ host2 +----------------+
eth0 +-------+ eth1
In this configuration, there
are
an ISL - Inter Switch Link (could be a trunk),
In this configuration, there
is
an ISL - Inter Switch Link (could be a trunk),
several servers (host1, host2 ...) attached to both switches each, and one or
more ports to the outside world (port3...). One an only one slave on each host
is active at a time, while all links are still monitored (the system can
detect a failure of active and backup links).
Each time a host changes its active interface, it sticks to the new one until
it goes down. In this example, the hosts are n
ot too much
affected by the
it goes down. In this example, the hosts are n
egligibly
affected by the
expiration time of the switches' forwarding tables.
If host1 and host2 have the same functionality and are used in load balancing
...
...
@@ -460,6 +671,7 @@ Each time the host changes its active interface, it sticks to the new one until
it goes down. In this example, the host is strongly affected by the expiration
time of the switch forwarding table.
3) Adapting to your switches' timing
------------------------------------
If your switches take a long time to go into backup mode, it may be
...
...
@@ -488,8 +700,34 @@ Examples :
# modprobe bonding miimon=100 mode=1 downdelay=2000 updelay=5000
# modprobe bonding miimon=100 mode=0 downdelay=0 updelay=5000
4) Limitations
--------------
Promiscuous Sniffing notes
==========================
If you wish to bond channels together for a network sniffing
application --- you wish to run tcpdump, or ethereal, or an IDS like
snort, with its input aggregated from multiple interfaces using the
bonding driver --- then you need to handle the Promiscuous interface
setting by hand. Specifically, when you "ifconfing bond0 up" you
must add the promisc flag there; it will be propagated down to the
slave interfaces at ifenslave time; a full example might look like:
grep bond0 /etc/modules.conf || echo alias bond0 bonding >/etc/modules.conf
ifconfig bond0 promisc up
for if in eth1 eth2 ...;do
ifconfig $if up
ifenslave bond0 $if
done
snort ... -i bond0 ...
Ifenslave also wants to propagate addresses from interface to
interface, appropriately for its design functions in HA and channel
capacity aggregating; but it works fine for unnumbered interfaces;
just ignore all the warnings it emits.
Limitations
===========
The main limitations are :
- only the link status is monitored. If the switch on the other side is
partially down (e.g. doesn't forward anymore, but the link is OK), the link
...
...
@@ -500,7 +738,13 @@ The main limitations are :
Use the arp_interval/arp_ip_target parameters to count incoming/outgoing
frames.
Resources and links
- A Transmit Load Balancing policy is not currently available. This mode
allows every slave in the bond to transmit while only one receives. If
the "receiving" slave fails, another slave takes over the MAC address of
the failed receiving slave.
Resources and Links
===================
Current development on this driver is posted to:
...
...
Documentation/networking/ifenslave.c
View file @
d8908c80
...
...
@@ -41,6 +41,16 @@
* - 2002/02/18 Erik Habbinga <erik_habbinga @ hp dot com> :
* - ifr2.ifr_flags was not initialized in the hwaddr_notset case,
* SIOCGIFFLAGS now called before hwaddr_notset test
*
* - 2002/10/31 Tony Cureington <tony.cureington * hp_com> :
* - If the master does not have a hardware address when the first slave
* is enslaved, the master is assigned the hardware address of that
* slave - there is a comment in bonding.c stating "ifenslave takes
* care of this now." This corrects the problem of slaves having
* different hardware addresses in active-backup mode when
* multiple interfaces are specified on a single ifenslave command
* (ifenslave bond0 eth0 eth1).
*
*/
static
char
*
version
=
...
...
@@ -131,6 +141,7 @@ main(int argc, char **argv)
sa_family_t
master_family
;
char
**
spp
,
*
master_ifname
,
*
slave_ifname
;
int
hwaddr_notset
;
int
master_up
;
while
((
c
=
getopt_long
(
argc
,
argv
,
"acdfrvV?h"
,
longopts
,
0
))
!=
EOF
)
switch
(
c
)
{
...
...
@@ -300,10 +311,86 @@ main(int argc, char **argv)
return
1
;
}
if
(
hwaddr_notset
)
{
/* we do nothing */
if
(
hwaddr_notset
)
{
/* assign the slave hw address to the
* master since it currently does not
* have one; otherwise, slaves may
* have different hw addresses in
* active-backup mode as seen when enslaving
* using "ifenslave bond0 eth0 eth1" because
* hwaddr_notset is set outside this loop.
* TODO: put this and the "else" portion in
* a function.
*/
goterr
=
0
;
master_up
=
0
;
if
(
if_flags
.
ifr_flags
&
IFF_UP
)
{
if_flags
.
ifr_flags
&=
~
IFF_UP
;
if
(
ioctl
(
skfd
,
SIOCSIFFLAGS
,
&
if_flags
)
<
0
)
{
goterr
=
1
;
fprintf
(
stderr
,
"Shutting down "
"interface %s failed: "
"%s
\n
"
,
master_ifname
,
strerror
(
errno
));
}
else
{
/* we took the master down,
* so we must bring it up
*/
master_up
=
1
;
}
}
}
else
{
/* we'll assign master's hwaddr to this slave */
if
(
!
goterr
)
{
/* get the slaves MAC address */
strncpy
(
if_hwaddr
.
ifr_name
,
slave_ifname
,
IFNAMSIZ
);
if
(
ioctl
(
skfd
,
SIOCGIFHWADDR
,
&
if_hwaddr
)
<
0
)
{
fprintf
(
stderr
,
"Could not get MAC "
"address of %s: %s
\n
"
,
slave_ifname
,
strerror
(
errno
));
strncpy
(
if_hwaddr
.
ifr_name
,
master_ifname
,
IFNAMSIZ
);
goterr
=
1
;
}
}
if
(
!
goterr
)
{
strncpy
(
if_hwaddr
.
ifr_name
,
master_ifname
,
IFNAMSIZ
);
if
(
ioctl
(
skfd
,
SIOCSIFHWADDR
,
&
if_hwaddr
)
<
0
)
{
fprintf
(
stderr
,
"Could not set MAC "
"address of %s: %s
\n
"
,
master_ifname
,
strerror
(
errno
));
goterr
=
1
;
}
else
{
hwaddr_notset
=
0
;
}
}
if
(
master_up
)
{
if_flags
.
ifr_flags
|=
IFF_UP
;
if
(
ioctl
(
skfd
,
SIOCSIFFLAGS
,
&
if_flags
)
<
0
)
{
fprintf
(
stderr
,
"Bringing up interface "
"%s failed: %s
\n
"
,
master_ifname
,
strerror
(
errno
));
}
}
}
else
{
/* we'll assign master's hwaddr to this slave */
if
(
ifr2
.
ifr_flags
&
IFF_UP
)
{
ifr2
.
ifr_flags
&=
~
IFF_UP
;
if
(
ioctl
(
skfd
,
SIOCSIFFLAGS
,
&
ifr2
)
<
0
)
{
...
...
drivers/net/bonding.c
View file @
d8908c80
...
...
@@ -177,20 +177,91 @@
* - Port Gleb Natapov's multicast support patchs from 2.4.12
* to 2.4.18 adding support for multicast.
*
* 2002/06/1
7
- Tony Cureington <tony.cureington * hp_com>
* 2002/06/1
0
- Tony Cureington <tony.cureington * hp_com>
* - corrected uninitialized pointer (ifr.ifr_data) in bond_check_dev_link;
* actually changed function to use
ETHTOOL, then MIIPHY
, and finally
*
MIIREG
to determine the link status
* actually changed function to use
MIIPHY, then MIIREG
, and finally
*
ETHTOOL
to determine the link status
* - fixed bad ifr_data pointer assignments in bond_ioctl
* - corrected mode 1 being reported as active-backup in bond_get_info;
* also added text to distinguish type of load balancing (rr or xor)
* - change arp_ip_target module param from "1-12s" (array of 12 ptrs)
* to "s" (a single ptr)
*
* 2002/08/30 - Jay Vosburgh <fubar at us dot ibm dot com>
* - Removed acquisition of xmit_lock in set_multicast_list; caused
* deadlock on SMP (lock is held by caller).
* - Revamped SIOCGMIIPHY, SIOCGMIIREG portion of bond_check_dev_link().
*
* 2002/09/18 - Jay Vosburgh <fubar at us dot ibm dot com>
* - Fixed up bond_check_dev_link() (and callers): removed some magic
* numbers, banished local MII_ defines, wrapped ioctl calls to
* prevent EFAULT errors
*
* 2002/9/30 - Jay Vosburgh <fubar at us dot ibm dot com>
* - make sure the ip target matches the arp_target before saving the
* hw address.
*
* 2002/9/30 - Dan Eisner <eisner at 2robots dot com>
* - make sure my_ip is set before taking down the link, since
* not all switches respond if the source ip is not set.
*
* 2002/10/8 - Janice Girouard <girouard at us dot ibm dot com>
* - read in the local ip address when enslaving a device
* - add primary support
* - make sure 2*arp_interval has passed when a new device
* is brought on-line before taking it down.
*
* 2002/09/11 - Philippe De Muyter <phdm at macqel dot be>
* - Added bond_xmit_broadcast logic.
* - Added bond_mode() support function.
*
* 2002/10/26 - Laurent Deniel <laurent.deniel at free.fr>
* - allow to register multicast addresses only on active slave
* (useful in active-backup mode)
* - add multicast module parameter
* - fix deletion of multicast groups after unloading module
*
* 2002/11/06 - Kameshwara Rayaprolu <kameshwara.rao * wipro_com>
* - Changes to prevent panic from closing the device twice; if we close
* the device in bond_release, we must set the original_flags to down
* so it won't be closed again by the network layer.
*
* 2002/11/07 - Tony Cureington <tony.cureington * hp_com>
* - Fix arp_target_hw_addr memory leak
* - Created activebackup_arp_monitor function to handle arp monitoring
* in active backup mode - the bond_arp_monitor had several problems...
* such as allowing slaves to tx arps sequentially without any delay
* for a response
* - Renamed bond_arp_monitor to loadbalance_arp_monitor and re-wrote
* this function to just handle arp monitoring in load-balancing mode;
* it is a lot more compact now
* - Changes to ensure one and only one slave transmits in active-backup
* mode
* - Robustesize parameters; warn users about bad combinations of
* parameters; also if miimon is specified and a network driver does
* not support MII or ETHTOOL, inform the user of this
* - Changes to support link_failure_count when in arp monitoring mode
* - Fix up/down delay reported in /proc
* - Added version; log version; make version available from "modinfo -d"
* - Fixed problem in bond_check_dev_link - if the first IOCTL (SIOCGMIIPH)
* failed, the ETHTOOL ioctl never got a chance
*
* 2002/11/16 - Laurent Deniel <laurent.deniel at free.fr>
* - fix multicast handling in activebackup_arp_monitor
* - remove one unnecessary and confusing current_slave == slave test
* in activebackup_arp_monitor
*
* 2002/11/17 - Laurent Deniel <laurent.deniel at free.fr>
* - fix bond_slave_info_query when slave_id = num_slaves
*
* 2002/11/19 - Janice Girouard <girouard at us dot ibm dot com>
* - correct ifr_data reference. Update ifr_data reference
* to mii_ioctl_data struct values to avoid confusion.
*
*
* 2002/11/22 - Bert Barbe <bert.barbe at oracle dot com>
* - Add support for multiple arp_ip_target
*
*/
#include <linux/config.h>
...
...
@@ -201,6 +272,7 @@
#include <linux/interrupt.h>
#include <linux/ioport.h>
#include <linux/in.h>
#include <linux/ip.h>
#include <linux/slab.h>
#include <linux/string.h>
#include <linux/init.h>
...
...
@@ -208,6 +280,7 @@
#include <linux/socket.h>
#include <linux/errno.h>
#include <linux/netdevice.h>
#include <linux/inetdevice.h>
#include <linux/etherdevice.h>
#include <linux/skbuff.h>
#include <net/sock.h>
...
...
@@ -225,6 +298,13 @@
#include <asm/dma.h>
#include <asm/uaccess.h>
#define DRV_VERSION "2.4.20-20021210"
#define DRV_RELDATE "December 10, 2002"
#define DRV_NAME "bonding"
#define DRV_DESCRIPTION "Ethernet Channel Bonding Driver"
static
const
char
*
version
=
DRV_NAME
".c:v"
DRV_VERSION
" ("
DRV_RELDATE
")
\n
"
;
/* monitor all links that often (in milliseconds). <=0 disables monitoring */
#ifndef BOND_LINK_MON_INTERV
...
...
@@ -235,20 +315,31 @@
#define BOND_LINK_ARP_INTERV 0
#endif
#ifndef MAX_ARP_IP_TARGETS
#define MAX_ARP_IP_TARGETS 16
#endif
static
int
arp_interval
=
BOND_LINK_ARP_INTERV
;
static
char
*
arp_ip_target
=
NULL
;
static
unsigned
long
arp_target
=
0
;
static
char
*
arp_ip_target
[
MAX_ARP_IP_TARGETS
]
=
{
NULL
,
};
static
unsigned
long
arp_target
[
MAX_ARP_IP_TARGETS
]
=
{
0
,
}
;
static
int
arp_ip_count
=
0
;
static
u32
my_ip
=
0
;
char
*
arp_target_hw_addr
=
NULL
;
static
char
*
primary
=
NULL
;
static
int
max_bonds
=
BOND_DEFAULT_MAX_BONDS
;
static
int
miimon
=
BOND_LINK_MON_INTERV
;
static
int
mode
=
BOND_MODE_ROUNDROBIN
;
static
int
updelay
=
0
;
static
int
downdelay
=
0
;
#define BOND_MULTICAST_DISABLED 0
#define BOND_MULTICAST_ACTIVE 1
#define BOND_MULTICAST_ALL 2
static
int
multicast
=
BOND_MULTICAST_ALL
;
static
int
first_pass
=
1
;
int
bond_cnt
;
static
struct
bonding
*
these_bonds
=
NULL
;
static
struct
net_device
*
dev_bonds
=
NULL
;
...
...
@@ -259,13 +350,17 @@ MODULE_PARM_DESC(miimon, "Link check interval in milliseconds");
MODULE_PARM
(
mode
,
"i"
);
MODULE_PARM
(
arp_interval
,
"i"
);
MODULE_PARM_DESC
(
arp_interval
,
"arp interval in milliseconds"
);
MODULE_PARM
(
arp_ip_target
,
"s"
);
MODULE_PARM_DESC
(
arp_ip_target
,
"arp target in n.n.n.n form"
);
MODULE_PARM
(
arp_ip_target
,
"
1-"
__MODULE_STRING
(
MAX_ARP_IP_TARGETS
)
"
s"
);
MODULE_PARM_DESC
(
arp_ip_target
,
"arp target
s
in n.n.n.n form"
);
MODULE_PARM_DESC
(
mode
,
"Mode of operation : 0 for round robin, 1 for active-backup, 2 for xor"
);
MODULE_PARM
(
updelay
,
"i"
);
MODULE_PARM_DESC
(
updelay
,
"Delay before considering link up, in milliseconds"
);
MODULE_PARM
(
downdelay
,
"i"
);
MODULE_PARM_DESC
(
downdelay
,
"Delay before considering link down, in milliseconds"
);
MODULE_PARM
(
primary
,
"s"
);
MODULE_PARM_DESC
(
primary
,
"Primary network device to use"
);
MODULE_PARM
(
multicast
,
"i"
);
MODULE_PARM_DESC
(
multicast
,
"Mode for multicast support : 0 for none, 1 for active slave, 2 for all slaves (default)"
);
extern
void
arp_send
(
int
type
,
int
ptype
,
u32
dest_ip
,
struct
net_device
*
dev
,
u32
src_ip
,
unsigned
char
*
dest_hw
,
unsigned
char
*
src_hw
,
...
...
@@ -276,7 +371,8 @@ static int bond_xmit_xor(struct sk_buff *skb, struct net_device *dev);
static
int
bond_xmit_activebackup
(
struct
sk_buff
*
skb
,
struct
net_device
*
dev
);
static
struct
net_device_stats
*
bond_get_stats
(
struct
net_device
*
dev
);
static
void
bond_mii_monitor
(
struct
net_device
*
dev
);
static
void
bond_arp_monitor
(
struct
net_device
*
dev
);
static
void
loadbalance_arp_monitor
(
struct
net_device
*
dev
);
static
void
activebackup_arp_monitor
(
struct
net_device
*
dev
);
static
int
bond_event
(
struct
notifier_block
*
this
,
unsigned
long
event
,
void
*
ptr
);
static
void
bond_restore_slave_flags
(
slave_t
*
slave
);
static
void
bond_mc_list_destroy
(
struct
bonding
*
bond
);
...
...
@@ -287,6 +383,7 @@ static inline int dmi_same(struct dev_mc_list *dmi1, struct dev_mc_list *dmi2);
static
void
bond_set_promiscuity
(
bonding_t
*
bond
,
int
inc
);
static
void
bond_set_allmulti
(
bonding_t
*
bond
,
int
inc
);
static
struct
dev_mc_list
*
bond_mc_list_find_dmi
(
struct
dev_mc_list
*
dmi
,
struct
dev_mc_list
*
mc_list
);
static
void
bond_mc_update
(
bonding_t
*
bond
,
slave_t
*
new
,
slave_t
*
old
);
static
void
bond_set_slave_inactive_flags
(
slave_t
*
slave
);
static
void
bond_set_slave_active_flags
(
slave_t
*
slave
);
static
int
bond_enslave
(
struct
net_device
*
master
,
struct
net_device
*
slave
);
...
...
@@ -308,6 +405,47 @@ static int bond_get_info(char *buf, char **start, off_t offset, int length);
#define IS_UP(dev) ((((dev)->flags & (IFF_UP)) == (IFF_UP)) && \
(netif_running(dev) && netif_carrier_ok(dev)))
static
void
arp_send_all
(
slave_t
*
slave
)
{
int
i
;
for
(
i
=
0
;
(
i
<
MAX_ARP_IP_TARGETS
)
&&
arp_target
[
i
];
i
++
)
{
arp_send
(
ARPOP_REQUEST
,
ETH_P_ARP
,
arp_target
[
i
],
slave
->
dev
,
my_ip
,
arp_target_hw_addr
,
slave
->
dev
->
dev_addr
,
arp_target_hw_addr
);
}
}
static
const
char
*
bond_mode
(
void
)
{
switch
(
mode
)
{
case
BOND_MODE_ROUNDROBIN
:
return
"load balancing (round-robin)"
;
case
BOND_MODE_ACTIVEBACKUP
:
return
"fault-tolerance (active-backup)"
;
case
BOND_MODE_XOR
:
return
"load balancing (xor)"
;
case
BOND_MODE_BROADCAST
:
return
"fault-tolerance (broadcast)"
;
default
:
return
"unknown"
;
}
}
static
const
char
*
multicast_mode
(
void
)
{
switch
(
multicast
)
{
case
BOND_MULTICAST_DISABLED
:
return
"disabled"
;
case
BOND_MULTICAST_ACTIVE
:
return
"active slave only"
;
case
BOND_MULTICAST_ALL
:
return
"all slaves"
;
default
:
return
"unknown"
;
}
}
static
void
bond_restore_slave_flags
(
slave_t
*
slave
)
{
...
...
@@ -415,11 +553,24 @@ static u16 bond_check_dev_link(struct net_device *dev)
/* call it and not the others for that team */
/* member. */
/* try SOICETHTOOL ioctl, some drivers cache ETHTOOL_GLINK */
/* for a period of time; we need to encourage link status */
/* be reported by network drivers in real time; if the */
/* value is cached, the mmimon module parm may have no */
/* effect... */
/*
* We cannot assume that SIOCGMIIPHY will also read a
* register; not all network drivers (e.g., e100)
* support that.
*/
/* Yes, the mii is overlaid on the ifreq.ifr_ifru */
mii
=
(
struct
mii_ioctl_data
*
)
&
ifr
.
ifr_data
;
if
(
IOCTL
(
dev
,
&
ifr
,
SIOCGMIIPHY
)
==
0
)
{
mii
->
reg_num
=
MII_BMSR
;
if
(
IOCTL
(
dev
,
&
ifr
,
SIOCGMIIREG
)
==
0
)
{
return
mii
->
val_out
&
BMSR_LSTATUS
;
}
}
/* try SIOCETHTOOL ioctl, some drivers cache ETHTOOL_GLINK */
/* for a period of time so we attempt to get link status */
/* from it last if the above MII ioctls fail... */
etool
.
cmd
=
ETHTOOL_GLINK
;
ifr
.
ifr_data
=
(
char
*
)
&
etool
;
if
(
IOCTL
(
dev
,
&
ifr
,
SIOCETHTOOL
)
==
0
)
{
...
...
@@ -427,26 +578,14 @@ static u16 bond_check_dev_link(struct net_device *dev)
return
BMSR_LSTATUS
;
}
else
{
#ifdef BONDING_DEBUG
printk
(
KERN_INFO
":: SIOCETHTOOL shows failure
\n
"
);
#endif
return
(
0
);
}
}
/*
* We cannot assume that SIOCGMIIPHY will also read a
* register; not all network drivers support that.
*/
/* Yes, the mii is overlaid on the ifreq.ifr_ifru */
mii
=
(
struct
mii_ioctl_data
*
)
&
ifr
.
ifr_data
;
if
(
IOCTL
(
dev
,
&
ifr
,
SIOCGMIIPHY
)
!=
0
)
{
return
BMSR_LSTATUS
;
/* can't tell */
}
mii
->
reg_num
=
MII_BMSR
;
if
(
IOCTL
(
dev
,
&
ifr
,
SIOCGMIIREG
)
==
0
)
{
return
mii
->
val_out
&
BMSR_LSTATUS
;
}
}
return
BMSR_LSTATUS
;
/* spoof link up ( we can't check it) */
}
...
...
@@ -483,7 +622,11 @@ static int bond_open(struct net_device *dev)
init_timer
(
arp_timer
);
arp_timer
->
expires
=
jiffies
+
(
arp_interval
*
HZ
/
1000
);
arp_timer
->
data
=
(
unsigned
long
)
dev
;
arp_timer
->
function
=
(
void
*
)
&
bond_arp_monitor
;
if
(
mode
==
BOND_MODE_ACTIVEBACKUP
)
{
arp_timer
->
function
=
(
void
*
)
&
activebackup_arp_monitor
;
}
else
{
arp_timer
->
function
=
(
void
*
)
&
loadbalance_arp_monitor
;
}
add_timer
(
arp_timer
);
}
return
0
;
...
...
@@ -501,6 +644,10 @@ static int bond_close(struct net_device *master)
}
if
(
arp_interval
>
0
)
{
/* arp interval, in milliseconds. */
del_timer
(
&
bond
->
arp_timer
);
if
(
arp_target_hw_addr
!=
NULL
)
{
kfree
(
arp_target_hw_addr
);
arp_target_hw_addr
=
NULL
;
}
}
/* Release the bonded slaves */
...
...
@@ -545,9 +692,18 @@ static void bond_mc_list_destroy(struct bonding *bond)
static
void
bond_mc_add
(
bonding_t
*
bond
,
void
*
addr
,
int
alen
)
{
slave_t
*
slave
;
for
(
slave
=
bond
->
prev
;
slave
!=
(
slave_t
*
)
bond
;
slave
=
slave
->
prev
)
{
dev_mc_add
(
slave
->
dev
,
addr
,
alen
,
0
);
switch
(
multicast
)
{
case
BOND_MULTICAST_ACTIVE
:
/* write lock already acquired */
if
(
bond
->
current_slave
!=
NULL
)
dev_mc_add
(
bond
->
current_slave
->
dev
,
addr
,
alen
,
0
);
break
;
case
BOND_MULTICAST_ALL
:
for
(
slave
=
bond
->
prev
;
slave
!=
(
slave_t
*
)
bond
;
slave
=
slave
->
prev
)
dev_mc_add
(
slave
->
dev
,
addr
,
alen
,
0
);
break
;
case
BOND_MULTICAST_DISABLED
:
break
;
}
}
...
...
@@ -557,9 +713,19 @@ static void bond_mc_add(bonding_t *bond, void *addr, int alen)
static
void
bond_mc_delete
(
bonding_t
*
bond
,
void
*
addr
,
int
alen
)
{
slave_t
*
slave
;
for
(
slave
=
bond
->
prev
;
slave
!=
(
slave_t
*
)
bond
;
slave
=
slave
->
prev
)
dev_mc_delete
(
slave
->
dev
,
addr
,
alen
,
0
);
switch
(
multicast
)
{
case
BOND_MULTICAST_ACTIVE
:
/* write lock already acquired */
if
(
bond
->
current_slave
!=
NULL
)
dev_mc_delete
(
bond
->
current_slave
->
dev
,
addr
,
alen
,
0
);
break
;
case
BOND_MULTICAST_ALL
:
for
(
slave
=
bond
->
prev
;
slave
!=
(
slave_t
*
)
bond
;
slave
=
slave
->
prev
)
dev_mc_delete
(
slave
->
dev
,
addr
,
alen
,
0
);
break
;
case
BOND_MULTICAST_DISABLED
:
break
;
}
}
/*
...
...
@@ -603,9 +769,19 @@ static inline int dmi_same(struct dev_mc_list *dmi1, struct dev_mc_list *dmi2)
static
void
bond_set_promiscuity
(
bonding_t
*
bond
,
int
inc
)
{
slave_t
*
slave
;
for
(
slave
=
bond
->
prev
;
slave
!=
(
slave_t
*
)
bond
;
slave
=
slave
->
prev
)
dev_set_promiscuity
(
slave
->
dev
,
inc
);
switch
(
multicast
)
{
case
BOND_MULTICAST_ACTIVE
:
/* write lock already acquired */
if
(
bond
->
current_slave
!=
NULL
)
dev_set_promiscuity
(
bond
->
current_slave
->
dev
,
inc
);
break
;
case
BOND_MULTICAST_ALL
:
for
(
slave
=
bond
->
prev
;
slave
!=
(
slave_t
*
)
bond
;
slave
=
slave
->
prev
)
dev_set_promiscuity
(
slave
->
dev
,
inc
);
break
;
case
BOND_MULTICAST_DISABLED
:
break
;
}
}
/*
...
...
@@ -614,9 +790,19 @@ static void bond_set_promiscuity(bonding_t *bond, int inc)
static
void
bond_set_allmulti
(
bonding_t
*
bond
,
int
inc
)
{
slave_t
*
slave
;
for
(
slave
=
bond
->
prev
;
slave
!=
(
slave_t
*
)
bond
;
slave
=
slave
->
prev
)
dev_set_allmulti
(
slave
->
dev
,
inc
);
switch
(
multicast
)
{
case
BOND_MULTICAST_ACTIVE
:
/* write lock already acquired */
if
(
bond
->
current_slave
!=
NULL
)
dev_set_allmulti
(
bond
->
current_slave
->
dev
,
inc
);
break
;
case
BOND_MULTICAST_ALL
:
for
(
slave
=
bond
->
prev
;
slave
!=
(
slave_t
*
)
bond
;
slave
=
slave
->
prev
)
dev_set_allmulti
(
slave
->
dev
,
inc
);
break
;
case
BOND_MULTICAST_DISABLED
:
break
;
}
}
/*
...
...
@@ -641,6 +827,8 @@ static void set_multicast_list(struct net_device *master)
struct
dev_mc_list
*
dmi
;
unsigned
long
flags
=
0
;
if
(
multicast
==
BOND_MULTICAST_DISABLED
)
return
;
/*
* Lock the private data for the master
*/
...
...
@@ -682,6 +870,43 @@ static void set_multicast_list(struct net_device *master)
write_unlock_irqrestore
(
&
bond
->
lock
,
flags
);
}
/*
* Update the mc list and multicast-related flags for the new and
* old active slaves (if any) according to the multicast mode
*/
static
void
bond_mc_update
(
bonding_t
*
bond
,
slave_t
*
new
,
slave_t
*
old
)
{
struct
dev_mc_list
*
dmi
;
switch
(
multicast
)
{
case
BOND_MULTICAST_ACTIVE
:
if
(
bond
->
device
->
flags
&
IFF_PROMISC
)
{
if
(
old
!=
NULL
&&
new
!=
old
)
dev_set_promiscuity
(
old
->
dev
,
-
1
);
dev_set_promiscuity
(
new
->
dev
,
1
);
}
if
(
bond
->
device
->
flags
&
IFF_ALLMULTI
)
{
if
(
old
!=
NULL
&&
new
!=
old
)
dev_set_allmulti
(
old
->
dev
,
-
1
);
dev_set_allmulti
(
new
->
dev
,
1
);
}
/* first remove all mc addresses from old slave if any,
and _then_ add them to new active slave */
if
(
old
!=
NULL
&&
new
!=
old
)
{
for
(
dmi
=
bond
->
device
->
mc_list
;
dmi
!=
NULL
;
dmi
=
dmi
->
next
)
dev_mc_delete
(
old
->
dev
,
dmi
->
dmi_addr
,
dmi
->
dmi_addrlen
,
0
);
}
for
(
dmi
=
bond
->
device
->
mc_list
;
dmi
!=
NULL
;
dmi
=
dmi
->
next
)
dev_mc_add
(
new
->
dev
,
dmi
->
dmi_addr
,
dmi
->
dmi_addrlen
,
0
);
break
;
case
BOND_MULTICAST_ALL
:
/* nothing to do: mc list is already up-to-date on all slaves */
break
;
case
BOND_MULTICAST_DISABLED
:
break
;
}
}
/*
* This function counts the number of attached
* slaves for use by bond_xmit_xor.
...
...
@@ -703,9 +928,16 @@ static int bond_enslave(struct net_device *master_dev,
bonding_t
*
bond
=
NULL
;
slave_t
*
new_slave
=
NULL
;
unsigned
long
flags
=
0
;
unsigned
long
rflags
=
0
;
int
ndx
=
0
;
int
err
=
0
;
struct
dev_mc_list
*
dmi
;
struct
in_ifaddr
**
ifap
;
struct
in_ifaddr
*
ifa
;
static
int
(
*
ioctl
)(
struct
net_device
*
,
struct
ifreq
*
,
int
);
struct
ifreq
ifr
;
struct
ethtool_value
etool
;
int
link_reporting
=
0
;
if
(
master_dev
==
NULL
||
slave_dev
==
NULL
)
{
return
-
ENODEV
;
...
...
@@ -758,17 +990,19 @@ static int bond_enslave(struct net_device *master_dev,
new_slave
->
dev
=
slave_dev
;
/* set promiscuity level to new slave */
if
(
master_dev
->
flags
&
IFF_PROMISC
)
dev_set_promiscuity
(
slave_dev
,
1
);
if
(
multicast
==
BOND_MULTICAST_ALL
)
{
/* set promiscuity level to new slave */
if
(
master_dev
->
flags
&
IFF_PROMISC
)
dev_set_promiscuity
(
slave_dev
,
1
);
/* set allmulti level to new slave */
if
(
master_dev
->
flags
&
IFF_ALLMULTI
)
dev_set_allmulti
(
slave_dev
,
1
);
/* upload master's mc_list to new slave */
for
(
dmi
=
master_dev
->
mc_list
;
dmi
!=
NULL
;
dmi
=
dmi
->
next
)
dev_mc_add
(
slave_dev
,
dmi
->
dmi_addr
,
dmi
->
dmi_addrlen
,
0
);
/* set allmulti level to new slave */
if
(
master_dev
->
flags
&
IFF_ALLMULTI
)
dev_set_allmulti
(
slave_dev
,
1
);
/* upload master's mc_list to new slave */
for
(
dmi
=
master_dev
->
mc_list
;
dmi
!=
NULL
;
dmi
=
dmi
->
next
)
dev_mc_add
(
slave_dev
,
dmi
->
dmi_addr
,
dmi
->
dmi_addrlen
,
0
);
}
/*
* queue to the end of the slaves list, make the first element its
...
...
@@ -799,6 +1033,56 @@ static int bond_enslave(struct net_device *master_dev,
new_slave
->
delay
=
0
;
new_slave
->
link_failure_count
=
0
;
if
(
miimon
>
0
)
{
/* if the network driver for the slave does not support
* ETHTOOL/MII link status reporting, warn the user of this
*/
if
((
ioctl
=
slave_dev
->
do_ioctl
)
!=
NULL
)
{
etool
.
cmd
=
ETHTOOL_GLINK
;
ifr
.
ifr_data
=
(
char
*
)
&
etool
;
if
(
IOCTL
(
slave_dev
,
&
ifr
,
SIOCETHTOOL
)
==
0
)
{
link_reporting
=
1
;
}
else
{
if
(
IOCTL
(
slave_dev
,
&
ifr
,
SIOCGMIIPHY
)
==
0
)
{
/* Yes, the mii is overlaid on the
* ifreq.ifr_ifru
*/
((
struct
mii_ioctl_data
*
)
(
&
ifr
.
ifr_data
))
->
reg_num
=
1
;
if
(
IOCTL
(
slave_dev
,
&
ifr
,
SIOCGMIIREG
)
==
0
)
{
link_reporting
=
1
;
}
}
}
}
if
((
link_reporting
==
0
)
&&
(
arp_interval
==
0
))
{
/* miimon is set but a bonded network driver does
* not support ETHTOOL/MII and arp_interval is
* not set
*/
printk
(
KERN_ERR
"bond_enslave(): MII and ETHTOOL support not "
"available for interface %s, and "
"arp_interval/arp_ip_target module parameters "
"not specified, thus bonding will not detect "
"link failures! see bonding.txt for details.
\n
"
,
slave_dev
->
name
);
}
else
if
(
link_reporting
==
0
)
{
/* unable get link status using mii/ethtool */
printk
(
KERN_WARNING
"bond_enslave: can't get link status from "
"interface %s; the network driver associated "
"with this interface does not support "
"MII or ETHTOOL link status reporting, thus "
"miimon has no effect on this interface.
\n
"
,
slave_dev
->
name
);
}
}
/* check for initial state */
if
((
miimon
<=
0
)
||
(
bond_check_dev_link
(
slave_dev
)
==
BMSR_LSTATUS
))
{
...
...
@@ -806,6 +1090,7 @@ static int bond_enslave(struct net_device *master_dev,
printk
(
KERN_CRIT
"Initial state of slave_dev is BOND_LINK_UP
\n
"
);
#endif
new_slave
->
link
=
BOND_LINK_UP
;
new_slave
->
jiffies
=
jiffies
;
}
else
{
#ifdef BONDING_DEBUG
...
...
@@ -832,6 +1117,7 @@ static int bond_enslave(struct net_device *master_dev,
is OK, so make this interface the active one */
bond
->
current_slave
=
new_slave
;
bond_set_slave_active_flags
(
new_slave
);
bond_mc_update
(
bond
,
new_slave
,
NULL
);
}
else
{
#ifdef BONDING_DEBUG
...
...
@@ -839,15 +1125,24 @@ static int bond_enslave(struct net_device *master_dev,
#endif
bond_set_slave_inactive_flags
(
new_slave
);
}
read_lock_irqsave
(
&
(((
struct
in_device
*
)
slave_dev
->
ip_ptr
)
->
lock
),
rflags
);
ifap
=
&
(((
struct
in_device
*
)
slave_dev
->
ip_ptr
)
->
ifa_list
);
ifa
=
*
ifap
;
my_ip
=
ifa
->
ifa_address
;
read_unlock_irqrestore
(
&
(((
struct
in_device
*
)
slave_dev
->
ip_ptr
)
->
lock
),
rflags
);
/* if there is a primary slave, remember it */
if
(
primary
!=
NULL
)
if
(
strcmp
(
primary
,
new_slave
->
dev
->
name
)
==
0
)
bond
->
primary_slave
=
new_slave
;
}
else
{
#ifdef BONDING_DEBUG
printk
(
KERN_CRIT
"This slave is always active in trunk mode
\n
"
);
#endif
/* always active in trunk mode */
new_slave
->
state
=
BOND_STATE_ACTIVE
;
if
(
bond
->
current_slave
==
NULL
)
{
if
(
bond
->
current_slave
==
NULL
)
bond
->
current_slave
=
new_slave
;
}
}
update_slave_cnt
(
bond
);
...
...
@@ -938,6 +1233,7 @@ static int bond_change_active(struct net_device *master_dev, struct net_device *
IS_UP
(
newactive
->
dev
))
{
bond_set_slave_inactive_flags
(
oldactive
);
bond_set_slave_active_flags
(
newactive
);
bond_mc_update
(
bond
,
newactive
,
oldactive
);
bond
->
current_slave
=
newactive
;
printk
(
"%s : activate %s(old : %s)
\n
"
,
master_dev
->
name
,
newactive
->
dev
->
name
,
...
...
@@ -978,6 +1274,7 @@ slave_t *change_active_interface(bonding_t *bond)
newslave
=
bond
->
current_slave
=
bond
->
next
;
write_unlock
(
&
bond
->
ptrlock
);
}
else
{
printk
(
" but could not find any %s interface.
\n
"
,
(
mode
==
BOND_MODE_ACTIVEBACKUP
)
?
"backup"
:
"other"
);
write_lock
(
&
bond
->
ptrlock
);
...
...
@@ -985,16 +1282,38 @@ slave_t *change_active_interface(bonding_t *bond)
write_unlock
(
&
bond
->
ptrlock
);
return
NULL
;
/* still no slave, return NULL */
}
}
else
if
(
mode
==
BOND_MODE_ACTIVEBACKUP
)
{
/* make sure oldslave doesn't send arps - this could
* cause a ping-pong effect between interfaces since they
* would be able to tx arps - in active backup only one
* slave should be able to tx arps, and that should be
* the current_slave; the only exception is when all
* slaves have gone down, then only one non-current slave can
* send arps at a time; clearing oldslaves' mc list is handled
* later in this function.
*/
bond_set_slave_inactive_flags
(
oldslave
);
}
mintime
=
updelay
;
/* first try the primary link; if arping, a link must tx/rx traffic
* before it can be considered the current_slave - also, we would skip
* slaves between the current_slave and primary_slave that may be up
* and able to arp
*/
if
((
bond
->
primary_slave
!=
NULL
)
&&
(
arp_interval
==
0
))
{
if
(
IS_UP
(
bond
->
primary_slave
->
dev
))
newslave
=
bond
->
primary_slave
;
}
do
{
if
(
IS_UP
(
newslave
->
dev
))
{
if
(
newslave
->
link
==
BOND_LINK_UP
)
{
/* this one is immediately usable */
if
(
mode
==
BOND_MODE_ACTIVEBACKUP
)
{
bond_set_slave_active_flags
(
newslave
);
bond_mc_update
(
bond
,
newslave
,
oldslave
);
printk
(
" and making interface %s the active one.
\n
"
,
newslave
->
dev
->
name
);
}
...
...
@@ -1030,14 +1349,30 @@ slave_t *change_active_interface(bonding_t *bond)
bestslave
->
delay
=
0
;
bestslave
->
link
=
BOND_LINK_UP
;
bestslave
->
jiffies
=
jiffies
;
bond_set_slave_active_flags
(
bestslave
);
bond_mc_update
(
bond
,
bestslave
,
oldslave
);
write_lock
(
&
bond
->
ptrlock
);
bond
->
current_slave
=
bestslave
;
write_unlock
(
&
bond
->
ptrlock
);
return
bestslave
;
}
if
((
mode
==
BOND_MODE_ACTIVEBACKUP
)
&&
(
multicast
==
BOND_MULTICAST_ACTIVE
)
&&
(
oldslave
!=
NULL
))
{
/* flush bonds (master's) mc_list from oldslave since it wasn't
* updated (and deleted) above
*/
bond_mc_list_flush
(
oldslave
->
dev
,
bond
->
device
);
if
(
bond
->
device
->
flags
&
IFF_PROMISC
)
{
dev_set_promiscuity
(
oldslave
->
dev
,
-
1
);
}
if
(
bond
->
device
->
flags
&
IFF_ALLMULTI
)
{
dev_set_allmulti
(
oldslave
->
dev
,
-
1
);
}
}
printk
(
" but could not find any %s interface.
\n
"
,
(
mode
==
BOND_MODE_ACTIVEBACKUP
)
?
"backup"
:
"other"
);
...
...
@@ -1081,6 +1416,7 @@ static int bond_release(struct net_device *master, struct net_device *slave)
return
-
EINVAL
;
}
bond
->
current_arp_slave
=
NULL
;
our_slave
=
(
slave_t
*
)
bond
;
old_current
=
bond
->
current_slave
;
while
((
our_slave
=
our_slave
->
prev
)
!=
(
slave_t
*
)
bond
)
{
...
...
@@ -1101,16 +1437,18 @@ static int bond_release(struct net_device *master, struct net_device *slave)
/* release the slave from its bond */
/* flush master's mc_list from slave */
bond_mc_list_flush
(
slave
,
master
);
/* unset promiscuity level from slave */
if
(
master
->
flags
&
IFF_PROMISC
)
dev_set_promiscuity
(
slave
,
-
1
);
if
(
multicast
==
BOND_MULTICAST_ALL
)
{
/* flush master's mc_list from slave */
bond_mc_list_flush
(
slave
,
master
);
/* unset promiscuity level from slave */
if
(
master
->
flags
&
IFF_PROMISC
)
dev_set_promiscuity
(
slave
,
-
1
);
/* unset allmulti level from slave */
if
(
master
->
flags
&
IFF_ALLMULTI
)
dev_set_allmulti
(
slave
,
-
1
);
/* unset allmulti level from slave */
if
(
master
->
flags
&
IFF_ALLMULTI
)
dev_set_allmulti
(
slave
,
-
1
);
}
netdev_set_master
(
slave
,
NULL
);
...
...
@@ -1122,6 +1460,7 @@ static int bond_release(struct net_device *master, struct net_device *slave)
if
(
slave
->
flags
&
IFF_NOARP
||
bond
->
current_slave
!=
NULL
)
{
dev_close
(
slave
);
our_slave
->
original_flags
&=
~
IFF_UP
;
}
bond_restore_slave_flags
(
our_slave
);
...
...
@@ -1135,6 +1474,10 @@ static int bond_release(struct net_device *master, struct net_device *slave)
update_slave_cnt
(
bond
);
if
(
bond
->
primary_slave
==
our_slave
)
{
bond
->
primary_slave
=
NULL
;
}
write_unlock_irqrestore
(
&
bond
->
lock
,
flags
);
return
0
;
/* deletion OK */
}
...
...
@@ -1166,12 +1509,28 @@ static int bond_release_all(struct net_device *master)
}
bond
=
(
struct
bonding
*
)
master
->
priv
;
bond
->
current_slave
=
NULL
;
bond
->
current_
arp_
slave
=
NULL
;
while
((
our_slave
=
bond
->
prev
)
!=
(
slave_t
*
)
bond
)
{
slave_dev
=
our_slave
->
dev
;
bond
->
prev
=
our_slave
->
prev
;
if
(
multicast
==
BOND_MULTICAST_ALL
||
(
multicast
==
BOND_MULTICAST_ACTIVE
&&
bond
->
current_slave
==
our_slave
))
{
/* flush master's mc_list from slave */
bond_mc_list_flush
(
slave_dev
,
master
);
/* unset promiscuity level from slave */
if
(
master
->
flags
&
IFF_PROMISC
)
dev_set_promiscuity
(
slave_dev
,
-
1
);
/* unset allmulti level from slave */
if
(
master
->
flags
&
IFF_ALLMULTI
)
dev_set_allmulti
(
slave_dev
,
-
1
);
}
kfree
(
our_slave
);
netdev_set_master
(
slave_dev
,
NULL
);
...
...
@@ -1183,9 +1542,12 @@ static int bond_release_all(struct net_device *master)
if
(
slave_dev
->
flags
&
IFF_NOARP
)
dev_close
(
slave_dev
);
}
bond
->
current_slave
=
NULL
;
bond
->
next
=
(
slave_t
*
)
bond
;
bond
->
slave_cnt
=
0
;
printk
(
KERN_INFO
"%s: releases all slaves
\n
"
,
master
->
name
);
bond
->
primary_slave
=
NULL
;
printk
(
KERN_INFO
"%s: released all slaves
\n
"
,
master
->
name
);
return
0
;
}
...
...
@@ -1291,6 +1653,7 @@ static void bond_mii_monitor(struct net_device *master)
}
else
{
/* link up again */
slave
->
link
=
BOND_LINK_UP
;
slave
->
jiffies
=
jiffies
;
printk
(
KERN_INFO
"%s: link status up again after %d ms "
"for interface %s.
\n
"
,
...
...
@@ -1343,8 +1706,10 @@ static void bond_mii_monitor(struct net_device *master)
if
(
slave
->
delay
==
0
)
{
/* now the link has been up for long time enough */
slave
->
link
=
BOND_LINK_UP
;
slave
->
jiffies
=
jiffies
;
if
(
mode
==
BOND_MODE_ACTIVEBACKUP
)
{
if
(
(
mode
==
BOND_MODE_ACTIVEBACKUP
)
||
(
slave
!=
bond
->
primary_slave
)
)
{
/* prevent it from being the active one */
slave
->
state
=
BOND_STATE_BACKUP
;
}
...
...
@@ -1358,15 +1723,26 @@ static void bond_mii_monitor(struct net_device *master)
"for interface %s.
\n
"
,
master
->
name
,
dev
->
name
);
if
(
(
bond
->
primary_slave
!=
NULL
)
&&
(
slave
==
bond
->
primary_slave
)
)
change_active_interface
(
bond
);
}
else
slave
->
delay
--
;
/* we'll also look for the mostly eligible slave */
if
(
IS_UP
(
dev
)
&&
(
slave
->
delay
<
mindelay
))
{
if
(
bond
->
primary_slave
==
NULL
)
{
if
(
IS_UP
(
dev
)
&&
(
slave
->
delay
<
mindelay
))
{
mindelay
=
slave
->
delay
;
bestslave
=
slave
;
}
}
else
if
(
(
IS_UP
(
bond
->
primary_slave
->
dev
))
||
(
(
!
IS_UP
(
bond
->
primary_slave
->
dev
))
&&
(
IS_UP
(
dev
)
&&
(
slave
->
delay
<
mindelay
))
)
)
{
mindelay
=
slave
->
delay
;
bestslave
=
slave
;
}
}
}
break
;
}
/* end of switch */
...
...
@@ -1380,7 +1756,8 @@ static void bond_mii_monitor(struct net_device *master)
oldcurrent
=
bond
->
current_slave
;
read_unlock
(
&
bond
->
ptrlock
);
if
(
oldcurrent
==
NULL
)
{
/* no active interface at the moment */
/* no active interface at the moment or need to bring up the primary */
if
(
oldcurrent
==
NULL
)
{
/* no active interface at the moment */
if
(
bestslave
!=
NULL
)
{
/* last chance to find one ? */
if
(
bestslave
->
link
==
BOND_LINK_UP
)
{
printk
(
KERN_INFO
...
...
@@ -1395,10 +1772,12 @@ static void bond_mii_monitor(struct net_device *master)
bestslave
->
delay
=
0
;
bestslave
->
link
=
BOND_LINK_UP
;
bestslave
->
jiffies
=
jiffies
;
}
if
(
mode
==
BOND_MODE_ACTIVEBACKUP
)
{
bond_set_slave_active_flags
(
bestslave
);
bond_mc_update
(
bond
,
bestslave
,
NULL
);
}
else
{
bestslave
->
state
=
BOND_STATE_ACTIVE
;
}
...
...
@@ -1420,10 +1799,12 @@ static void bond_mii_monitor(struct net_device *master)
/*
* this function is called regularly to monitor each slave's link
* insuring that traffic is being sent and received. If the adapter
* has been dormant, then an arp is transmitted to generate traffic
* ensuring that traffic is being sent and received when arp monitoring
* is used in load-balancing mode. if the adapter has been dormant, then an
* arp is transmitted to generate traffic. see activebackup_arp_monitor for
* arp monitoring in active backup mode.
*/
static
void
bond
_arp_monitor
(
struct
net_device
*
master
)
static
void
loadbalance
_arp_monitor
(
struct
net_device
*
master
)
{
bonding_t
*
bond
;
unsigned
long
flags
;
...
...
@@ -1439,147 +1820,358 @@ static void bond_arp_monitor(struct net_device *master)
read_lock_irqsave
(
&
bond
->
lock
,
flags
);
if
(
!
IS_UP
(
master
))
{
/* TODO: investigate why rtnl_shlock_nowait and rtnl_exlock_nowait
* are called below and add comment why they are required...
*/
if
((
!
IS_UP
(
master
))
||
rtnl_shlock_nowait
())
{
mod_timer
(
&
bond
->
arp_timer
,
next_timer
);
goto
arp_monitor_out
;
}
if
(
rtnl_shlock_nowait
())
{
goto
arp_monitor_out
;
read_unlock_irqrestore
(
&
bond
->
lock
,
flags
);
return
;
}
if
(
rtnl_exlock_nowait
())
{
rtnl_shunlock
();
goto
arp_monitor_out
;
mod_timer
(
&
bond
->
arp_timer
,
next_timer
);
read_unlock_irqrestore
(
&
bond
->
lock
,
flags
);
return
;
}
/* see if any of the previous devices are up now (i.e. they have seen a
* response from an arp request sent by another adapter, since they
* have the same hardware address).
/* see if any of the previous devices are up now (i.e. they have
* xmt and rcv traffic). the current_slave does not come into
* the picture unless it is null. also, slave->jiffies is not needed
* here because we send an arp on each slave and give a slave as
* long as it needs to get the tx/rx within the delta.
* TODO: what about up/down delay in arp mode? it wasn't here before
* so it can wait
*/
slave
=
(
slave_t
*
)
bond
;
while
((
slave
=
slave
->
prev
)
!=
(
slave_t
*
)
bond
)
{
read_lock
(
&
bond
->
ptrlock
);
if
(
(
!
(
slave
->
link
==
BOND_LINK_UP
))
&&
(
slave
!=
bond
->
current_slave
)
)
{
read_unlock
(
&
bond
->
ptrlock
);
if
(
slave
->
link
!=
BOND_LINK_UP
)
{
if
(
((
jiffies
-
slave
->
dev
->
trans_start
)
<=
if
(((
jiffies
-
slave
->
dev
->
trans_start
)
<=
the_delta_in_ticks
)
&&
((
jiffies
-
slave
->
dev
->
last_rx
)
<=
the_delta_in_ticks
)
)
{
the_delta_in_ticks
))
{
slave
->
link
=
BOND_LINK_UP
;
write_lock
(
&
bond
->
ptrlock
);
slave
->
state
=
BOND_STATE_ACTIVE
;
/* primary_slave has no meaning in round-robin
* mode. the window of a slave being up and
* current_slave being null after enslaving
* is closed.
*/
read_lock
(
&
bond
->
ptrlock
);
if
(
bond
->
current_slave
==
NULL
)
{
slave
->
state
=
BOND_STATE_ACTIVE
;
read_unlock
(
&
bond
->
ptrlock
);
printk
(
KERN_INFO
"%s: link status definitely up "
"for interface %s, "
,
master
->
name
,
slave
->
dev
->
name
);
change_active_interface
(
bond
);
}
else
{
read_unlock
(
&
bond
->
ptrlock
);
printk
(
KERN_INFO
"%s: interface %s is now up
\n
"
,
master
->
name
,
slave
->
dev
->
name
);
}
}
}
else
{
/* slave->link == BOND_LINK_UP */
/* not all switches will respond to an arp request
* when the source ip is 0, so don't take the link down
* if we don't know our ip yet
*/
if
(((
jiffies
-
slave
->
dev
->
trans_start
)
>=
(
2
*
the_delta_in_ticks
))
||
(((
jiffies
-
slave
->
dev
->
last_rx
)
>=
(
2
*
the_delta_in_ticks
))
&&
my_ip
!=
0
))
{
slave
->
link
=
BOND_LINK_DOWN
;
slave
->
state
=
BOND_STATE_BACKUP
;
if
(
slave
->
link_failure_count
<
UINT_MAX
)
{
slave
->
link_failure_count
++
;
}
printk
(
KERN_INFO
"%s: interface %s is now down.
\n
"
,
master
->
name
,
slave
->
dev
->
name
);
read_lock
(
&
bond
->
ptrlock
);
if
(
slave
==
bond
->
current_slave
)
{
read_unlock
(
&
bond
->
ptrlock
);
change_active_interface
(
bond
);
}
else
{
read_unlock
(
&
bond
->
ptrlock
);
}
}
}
/* note: if switch is in round-robin mode, all links
* must tx arp to ensure all links rx an arp - otherwise
* links may oscillate or not come up at all; if switch is
* in something like xor mode, there is nothing we can
* do - all replies will be rx'ed on same link causing slaves
* to be unstable during low/no traffic periods
*/
if
(
IS_UP
(
slave
->
dev
))
{
arp_send_all
(
slave
);
}
}
rtnl_exunlock
();
rtnl_shunlock
();
read_unlock_irqrestore
(
&
bond
->
lock
,
flags
);
/* re-arm the timer */
mod_timer
(
&
bond
->
arp_timer
,
next_timer
);
}
/*
* When using arp monitoring in active-backup mode, this function is
* called to determine if any backup slaves have went down or a new
* current slave needs to be found.
* The backup slaves never generate traffic, they are considered up by merely
* receiving traffic. If the current slave goes down, each backup slave will
* be given the opportunity to tx/rx an arp before being taken down - this
* prevents all slaves from being taken down due to the current slave not
* sending any traffic for the backups to receive. The arps are not necessarily
* necessary, any tx and rx traffic will keep the current slave up. While any
* rx traffic will keep the backup slaves up, the current slave is responsible
* for generating traffic to keep them up regardless of any other traffic they
* may have received.
* see loadbalance_arp_monitor for arp monitoring in load balancing mode
*/
static
void
activebackup_arp_monitor
(
struct
net_device
*
master
)
{
bonding_t
*
bond
;
unsigned
long
flags
;
slave_t
*
slave
;
int
the_delta_in_ticks
=
arp_interval
*
HZ
/
1000
;
int
next_timer
=
jiffies
+
(
arp_interval
*
HZ
/
1000
);
bond
=
(
struct
bonding
*
)
master
->
priv
;
if
(
master
->
priv
==
NULL
)
{
mod_timer
(
&
bond
->
arp_timer
,
next_timer
);
return
;
}
read_lock_irqsave
(
&
bond
->
lock
,
flags
);
if
(
!
IS_UP
(
master
))
{
mod_timer
(
&
bond
->
arp_timer
,
next_timer
);
read_unlock_irqrestore
(
&
bond
->
lock
,
flags
);
return
;
}
/* determine if any slave has come up or any backup slave has
* gone down
* TODO: what about up/down delay in arp mode? it wasn't here before
* so it can wait
*/
slave
=
(
slave_t
*
)
bond
;
while
((
slave
=
slave
->
prev
)
!=
(
slave_t
*
)
bond
)
{
if
(
slave
->
link
!=
BOND_LINK_UP
)
{
if
((
jiffies
-
slave
->
dev
->
last_rx
)
<=
the_delta_in_ticks
)
{
slave
->
link
=
BOND_LINK_UP
;
write_lock
(
&
bond
->
ptrlock
);
if
((
bond
->
current_slave
==
NULL
)
&&
((
jiffies
-
slave
->
dev
->
trans_start
)
<=
the_delta_in_ticks
))
{
bond
->
current_slave
=
slave
;
bond_set_slave_active_flags
(
slave
);
bond_mc_update
(
bond
,
slave
,
NULL
);
bond
->
current_arp_slave
=
NULL
;
}
else
if
(
bond
->
current_slave
!=
slave
)
{
/* this slave has just come up but we
* already have a current slave; this
* can also happen if bond_enslave adds
* a new slave that is up while we are
* searching for a new slave
*/
bond_set_slave_inactive_flags
(
slave
);
bond
->
current_arp_slave
=
NULL
;
}
if
(
slave
!=
bond
->
current_slave
)
{
slave
->
dev
->
flags
|=
IFF_NOARP
;
if
(
slave
==
bond
->
current_slave
)
{
printk
(
KERN_INFO
"%s: %s is up and now the "
"active interface
\n
"
,
master
->
name
,
slave
->
dev
->
name
);
}
else
{
printk
(
KERN_INFO
"%s: backup interface %s is "
"now up
\n
"
,
master
->
name
,
slave
->
dev
->
name
);
}
write_unlock
(
&
bond
->
ptrlock
);
}
else
{
if
((
jiffies
-
slave
->
dev
->
last_rx
)
<=
the_delta_in_ticks
)
{
arp_send
(
ARPOP_REQUEST
,
ETH_P_ARP
,
arp_target
,
slave
->
dev
,
my_ip
,
arp_target_hw_addr
,
slave
->
dev
->
dev_addr
,
arp_target_hw_addr
);
}
}
else
{
read_lock
(
&
bond
->
ptrlock
);
if
((
slave
!=
bond
->
current_slave
)
&&
(
bond
->
current_arp_slave
==
NULL
)
&&
(((
jiffies
-
slave
->
dev
->
last_rx
)
>=
3
*
the_delta_in_ticks
)
&&
(
my_ip
!=
0
)))
{
/* a backup slave has gone down; three times
* the delta allows the current slave to be
* taken out before the backup slave.
* note: a non-null current_arp_slave indicates
* the current_slave went down and we are
* searching for a new one; under this
* condition we only take the current_slave
* down - this gives each slave a chance to
* tx/rx traffic before being taken out
*/
read_unlock
(
&
bond
->
ptrlock
);
slave
->
link
=
BOND_LINK_DOWN
;
if
(
slave
->
link_failure_count
<
UINT_MAX
)
{
slave
->
link_failure_count
++
;
}
bond_set_slave_inactive_flags
(
slave
);
printk
(
KERN_INFO
"%s: backup interface %s is now down
\n
"
,
master
->
name
,
slave
->
dev
->
name
);
}
else
{
read_unlock
(
&
bond
->
ptrlock
);
}
}
else
read_unlock
(
&
bond
->
ptrlock
);
}
}
read_lock
(
&
bond
->
ptrlock
);
slave
=
bond
->
current_slave
;
read_unlock
(
&
bond
->
ptrlock
);
if
(
slave
!=
0
)
{
/* see if you need to take down the current_slave, since
* you haven't seen an arp in 2*arp_intervals
*/
if
(
((
jiffies
-
slave
->
dev
->
trans_start
)
>=
(
2
*
the_delta_in_ticks
))
||
((
jiffies
-
slave
->
dev
->
last_rx
)
>=
(
2
*
the_delta_in_ticks
))
)
{
if
(
slave
!=
NULL
)
{
if
(
slave
->
link
==
BOND_LINK_UP
)
{
slave
->
link
=
BOND_LINK_DOWN
;
slave
->
state
=
BOND_STATE_BACKUP
;
/*
* we want to see arps, otherwise we couldn't
* bring the adapter back online...
*/
printk
(
KERN_INFO
"%s: link status definitely "
"down for interface %s, "
"disabling it"
,
slave
->
dev
->
master
->
name
,
slave
->
dev
->
name
);
/* find a new interface and be verbose */
change_active_interface
(
bond
);
read_lock
(
&
bond
->
ptrlock
);
slave
=
bond
->
current_slave
;
read_unlock
(
&
bond
->
ptrlock
);
/* if we have sent traffic in the past 2*arp_intervals but
* haven't xmit and rx traffic in that time interval, select
* a different slave. slave->jiffies is only updated when
* a slave first becomes the current_slave - not necessarily
* after every arp; this ensures the slave has a full 2*delta
* before being taken out. if a primary is being used, check
* if it is up and needs to take over as the current_slave
*/
if
((((
jiffies
-
slave
->
dev
->
trans_start
)
>=
(
2
*
the_delta_in_ticks
))
||
(((
jiffies
-
slave
->
dev
->
last_rx
)
>=
(
2
*
the_delta_in_ticks
))
&&
(
my_ip
!=
0
)))
&&
((
jiffies
-
slave
->
jiffies
)
>=
2
*
the_delta_in_ticks
))
{
slave
->
link
=
BOND_LINK_DOWN
;
if
(
slave
->
link_failure_count
<
UINT_MAX
)
{
slave
->
link_failure_count
++
;
}
printk
(
KERN_INFO
"%s: link status down for "
"active interface %s, disabling it"
,
master
->
name
,
slave
->
dev
->
name
);
slave
=
change_active_interface
(
bond
);
bond
->
current_arp_slave
=
slave
;
if
(
slave
!=
NULL
)
{
slave
->
jiffies
=
jiffies
;
}
}
/*
* ok, we know up/down, so just send a arp out if there has
* been no activity for a while
}
else
if
((
bond
->
primary_slave
!=
NULL
)
&&
(
bond
->
primary_slave
!=
slave
)
&&
(
bond
->
primary_slave
->
link
==
BOND_LINK_UP
))
{
/* at this point, slave is the current_slave */
printk
(
KERN_INFO
"%s: changing from interface %s to primary "
"interface %s
\n
"
,
master
->
name
,
slave
->
dev
->
name
,
bond
->
primary_slave
->
dev
->
name
);
/* primary is up so switch to it */
bond_set_slave_inactive_flags
(
slave
);
bond_mc_update
(
bond
,
bond
->
primary_slave
,
slave
);
write_lock
(
&
bond
->
ptrlock
);
bond
->
current_slave
=
bond
->
primary_slave
;
write_unlock
(
&
bond
->
ptrlock
);
slave
=
bond
->
primary_slave
;
bond_set_slave_active_flags
(
slave
);
slave
->
jiffies
=
jiffies
;
}
else
{
bond
->
current_arp_slave
=
NULL
;
}
/* the current slave must tx an arp to ensure backup slaves
* rx traffic
*/
if
((
slave
!=
NULL
)
&&
(((
jiffies
-
slave
->
dev
->
last_rx
)
>=
the_delta_in_ticks
)
&&
(
my_ip
!=
0
)))
{
arp_send_all
(
slave
);
}
}
if
(
slave
!=
NULL
)
{
if
(
((
jiffies
-
slave
->
dev
->
trans_start
)
>=
the_delta_in_ticks
)
||
((
jiffies
-
slave
->
dev
->
last_rx
)
>=
the_delta_in_ticks
)
)
{
arp_send
(
ARPOP_REQUEST
,
ETH_P_ARP
,
arp_target
,
slave
->
dev
,
my_ip
,
arp_target_hw_addr
,
slave
->
dev
->
dev_addr
,
arp_target_hw_addr
);
}
/* if we don't have a current_slave, search for the next available
* backup slave from the current_arp_slave and make it the candidate
* for becoming the current_slave
*/
if
(
slave
==
NULL
)
{
if
((
bond
->
current_arp_slave
==
NULL
)
||
(
bond
->
current_arp_slave
==
(
slave_t
*
)
bond
))
{
bond
->
current_arp_slave
=
bond
->
prev
;
}
}
if
(
bond
->
current_arp_slave
!=
(
slave_t
*
)
bond
)
{
bond_set_slave_inactive_flags
(
bond
->
current_arp_slave
);
slave
=
bond
->
current_arp_slave
->
next
;
/* search for next candidate */
do
{
if
(
IS_UP
(
slave
->
dev
))
{
slave
->
link
=
BOND_LINK_BACK
;
bond_set_slave_active_flags
(
slave
);
arp_send_all
(
slave
);
slave
->
jiffies
=
jiffies
;
bond
->
current_arp_slave
=
slave
;
break
;
}
/* if we have no current slave.. try sending
* an arp on all of the interfaces
*/
/* if the link state is up at this point, we
* mark it down - this can happen if we have
* simultaneous link failures and
* change_active_interface doesn't make this
* one the current slave so it is still marked
* up when it is actually down
*/
if
(
slave
->
link
==
BOND_LINK_UP
)
{
slave
->
link
=
BOND_LINK_DOWN
;
if
(
slave
->
link_failure_count
<
UINT_MAX
)
{
slave
->
link_failure_count
++
;
}
read_lock
(
&
bond
->
ptrlock
);
if
(
bond
->
current_slave
==
NULL
)
{
read_unlock
(
&
bond
->
ptrlock
);
slave
=
(
slave_t
*
)
bond
;
while
((
slave
=
slave
->
prev
)
!=
(
slave_t
*
)
bond
)
{
arp_send
(
ARPOP_REQUEST
,
ETH_P_ARP
,
arp_target
,
slave
->
dev
,
my_ip
,
arp_target_hw_addr
,
slave
->
dev
->
dev_addr
,
arp_target_hw_addr
);
bond_set_slave_inactive_flags
(
slave
);
printk
(
KERN_INFO
"%s: backup interface "
"%s is now down.
\n
"
,
master
->
name
,
slave
->
dev
->
name
);
}
}
while
((
slave
=
slave
->
next
)
!=
bond
->
current_arp_slave
->
next
);
}
}
else
{
read_unlock
(
&
bond
->
ptrlock
);
}
rtnl_exunlock
();
rtnl_shunlock
();
arp_monitor_out:
read_unlock_irqrestore
(
&
bond
->
lock
,
flags
);
/* re-arm the timer */
mod_timer
(
&
bond
->
arp_timer
,
next_timer
);
read_unlock_irqrestore
(
&
bond
->
lock
,
flags
);
}
#define isdigit(c) (c >= '0' && c <= '9')
__inline
static
int
atoi
(
char
**
s
)
{
...
...
@@ -1720,7 +2312,7 @@ static int bond_slave_info_query(struct net_device *master,
}
read_unlock_irqrestore
(
&
bond
->
lock
,
flags
);
if
(
cur_ndx
==
info
->
slave_i
d
)
{
if
(
slave
!=
(
slave_t
*
)
bon
d
)
{
strcpy
(
info
->
slave_name
,
slave
->
dev
->
name
);
info
->
link
=
slave
->
link
;
info
->
state
=
slave
->
state
;
...
...
@@ -1737,7 +2329,7 @@ static int bond_ioctl(struct net_device *master_dev, struct ifreq *ifr, int cmd)
struct
net_device
*
slave_dev
=
NULL
;
struct
ifbond
*
u_binfo
=
NULL
,
k_binfo
;
struct
ifslave
*
u_sinfo
=
NULL
,
k_sinfo
;
u16
*
data
=
NULL
;
struct
mii_ioctl_data
*
mii
=
NULL
;
int
ret
=
0
;
#ifdef BONDING_DEBUG
...
...
@@ -1747,23 +2339,23 @@ static int bond_ioctl(struct net_device *master_dev, struct ifreq *ifr, int cmd)
switch
(
cmd
)
{
case
SIOCGMIIPHY
:
data
=
(
u16
*
)
ifr
->
ifr_data
;
if
(
data
==
NULL
)
{
mii
=
(
struct
mii_ioctl_data
*
)
&
ifr
->
ifr_data
;
if
(
mii
==
NULL
)
{
return
-
EINVAL
;
}
data
[
0
]
=
0
;
mii
->
phy_id
=
0
;
/* Fall Through */
case
SIOCGMIIREG
:
/*
* We do this again just in case we were called by SIOCGMIIREG
* instead of SIOCGMIIPHY.
*/
data
=
(
u16
*
)
ifr
->
ifr_data
;
if
(
data
==
NULL
)
{
mii
=
(
struct
mii_ioctl_data
*
)
&
ifr
->
ifr_data
;
if
(
mii
==
NULL
)
{
return
-
EINVAL
;
}
if
(
data
[
1
]
==
1
)
{
data
[
3
]
=
bond_check_mii_link
(
if
(
mii
->
reg_num
==
1
)
{
mii
->
val_out
=
bond_check_mii_link
(
(
struct
bonding
*
)
master_dev
->
priv
);
}
return
0
;
...
...
@@ -1846,6 +2438,65 @@ static int bond_accept_fastpath(struct net_device *dev, struct dst_entry *dst)
}
#endif
/*
* in broadcast mode, we send everything to all usable interfaces.
*/
static
int
bond_xmit_broadcast
(
struct
sk_buff
*
skb
,
struct
net_device
*
dev
)
{
slave_t
*
slave
,
*
start_at
;
struct
bonding
*
bond
=
(
struct
bonding
*
)
dev
->
priv
;
unsigned
long
flags
;
struct
net_device
*
device_we_should_send_to
=
0
;
if
(
!
IS_UP
(
dev
))
{
/* bond down */
dev_kfree_skb
(
skb
);
return
0
;
}
read_lock_irqsave
(
&
bond
->
lock
,
flags
);
read_lock
(
&
bond
->
ptrlock
);
slave
=
start_at
=
bond
->
current_slave
;
read_unlock
(
&
bond
->
ptrlock
);
if
(
slave
==
NULL
)
{
/* we're at the root, get the first slave */
/* no suitable interface, frame not sent */
read_unlock_irqrestore
(
&
bond
->
lock
,
flags
);
dev_kfree_skb
(
skb
);
return
0
;
}
do
{
if
(
IS_UP
(
slave
->
dev
)
&&
(
slave
->
link
==
BOND_LINK_UP
)
&&
(
slave
->
state
==
BOND_STATE_ACTIVE
))
{
if
(
device_we_should_send_to
)
{
struct
sk_buff
*
skb2
;
if
((
skb2
=
skb_clone
(
skb
,
GFP_ATOMIC
))
==
NULL
)
{
printk
(
KERN_ERR
"bond_xmit_broadcast: skb_clone() failed
\n
"
);
continue
;
}
skb2
->
dev
=
device_we_should_send_to
;
skb2
->
priority
=
1
;
dev_queue_xmit
(
skb2
);
}
device_we_should_send_to
=
slave
->
dev
;
}
}
while
((
slave
=
slave
->
next
)
!=
start_at
);
if
(
device_we_should_send_to
)
{
skb
->
dev
=
device_we_should_send_to
;
skb
->
priority
=
1
;
dev_queue_xmit
(
skb
);
}
else
dev_kfree_skb
(
skb
);
/* frame sent to all suitable interfaces */
read_unlock_irqrestore
(
&
bond
->
lock
,
flags
);
return
0
;
}
static
int
bond_xmit_roundrobin
(
struct
sk_buff
*
skb
,
struct
net_device
*
dev
)
{
slave_t
*
slave
,
*
start_at
;
...
...
@@ -1978,14 +2629,25 @@ static int bond_xmit_activebackup(struct sk_buff *skb, struct net_device *dev)
}
/* if we are sending arp packets and don't know
the target hw address, save it so we don't need
to use a broadcast address */
if
(
(
arp_interval
>
0
)
&&
(
arp_target_hw_addr
==
NULL
)
&&
* the target hw address, save it so we don't need
* to use a broadcast address.
* don't do this if in active backup mode because the slaves must
* receive packets to stay up, and the only ones they receive are
* broadcasts.
*/
if
(
(
mode
!=
BOND_MODE_ACTIVEBACKUP
)
&&
(
arp_ip_count
==
1
)
&&
(
arp_interval
>
0
)
&&
(
arp_target_hw_addr
==
NULL
)
&&
(
skb
->
protocol
==
__constant_htons
(
ETH_P_IP
)
)
)
{
struct
ethhdr
*
eth_hdr
=
(
struct
ethhdr
*
)
(((
char
*
)
skb
->
data
));
arp_target_hw_addr
=
kmalloc
(
ETH_ALEN
,
GFP_KERNEL
);
memcpy
(
arp_target_hw_addr
,
eth_hdr
->
h_dest
,
ETH_ALEN
);
struct
iphdr
*
ip_hdr
=
(
struct
iphdr
*
)(
eth_hdr
+
1
);
if
(
arp_target
[
0
]
==
ip_hdr
->
daddr
)
{
arp_target_hw_addr
=
kmalloc
(
ETH_ALEN
,
GFP_KERNEL
);
if
(
arp_target_hw_addr
!=
NULL
)
memcpy
(
arp_target_hw_addr
,
eth_hdr
->
h_dest
,
ETH_ALEN
);
}
}
read_lock_irqsave
(
&
bond
->
lock
,
flags
);
...
...
@@ -2074,29 +2736,7 @@ static int bond_get_info(char *buf, char **start, off_t offset, int length)
*/
link
=
bond_check_mii_link
(
bond
);
len
+=
sprintf
(
buf
+
len
,
"Bonding Mode: "
);
switch
(
mode
)
{
case
BOND_MODE_ACTIVEBACKUP
:
len
+=
sprintf
(
buf
+
len
,
"%s
\n
"
,
"active-backup"
);
break
;
case
BOND_MODE_ROUNDROBIN
:
len
+=
sprintf
(
buf
+
len
,
"%s
\n
"
,
"load balancing (round-robin)"
);
break
;
case
BOND_MODE_XOR
:
len
+=
sprintf
(
buf
+
len
,
"%s
\n
"
,
"load balancing (xor)"
);
break
;
default:
len
+=
sprintf
(
buf
+
len
,
"%s
\n
"
,
"unknown"
);
break
;
}
len
+=
sprintf
(
buf
+
len
,
"Bonding Mode: %s
\n
"
,
bond_mode
());
if
(
mode
==
BOND_MODE_ACTIVEBACKUP
)
{
read_lock_irqsave
(
&
bond
->
lock
,
flags
);
...
...
@@ -2115,8 +2755,11 @@ static int bond_get_info(char *buf, char **start, off_t offset, int length)
link
==
BMSR_LSTATUS
?
"up
\n
"
:
"down
\n
"
);
len
+=
sprintf
(
buf
+
len
,
"MII Polling Interval (ms): %d
\n
"
,
miimon
);
len
+=
sprintf
(
buf
+
len
,
"Up Delay (ms): %d
\n
"
,
updelay
);
len
+=
sprintf
(
buf
+
len
,
"Down Delay (ms): %d
\n
"
,
downdelay
);
len
+=
sprintf
(
buf
+
len
,
"Up Delay (ms): %d
\n
"
,
updelay
*
miimon
);
len
+=
sprintf
(
buf
+
len
,
"Down Delay (ms): %d
\n
"
,
downdelay
*
miimon
);
len
+=
sprintf
(
buf
+
len
,
"Multicast Mode: %s
\n
"
,
multicast_mode
());
read_lock_irqsave
(
&
bond
->
lock
,
flags
);
for
(
slave
=
bond
->
prev
;
slave
!=
(
slave_t
*
)
bond
;
...
...
@@ -2205,6 +2848,7 @@ static struct notifier_block bond_netdev_notifier = {
static
int
__init
bond_init
(
struct
net_device
*
dev
)
{
bonding_t
*
bond
,
*
this_bond
,
*
last_bond
;
int
count
;
#ifdef BONDING_DEBUG
printk
(
KERN_INFO
"Begin bond_init for %s
\n
"
,
dev
->
name
);
...
...
@@ -2228,6 +2872,7 @@ static int __init bond_init(struct net_device *dev)
bond
->
next
=
bond
->
prev
=
(
slave_t
*
)
bond
;
bond
->
current_slave
=
NULL
;
bond
->
current_arp_slave
=
NULL
;
bond
->
device
=
dev
;
dev
->
priv
=
bond
;
...
...
@@ -2238,6 +2883,8 @@ static int __init bond_init(struct net_device *dev)
dev
->
hard_start_xmit
=
bond_xmit_roundrobin
;
}
else
if
(
mode
==
BOND_MODE_XOR
)
{
dev
->
hard_start_xmit
=
bond_xmit_xor
;
}
else
if
(
mode
==
BOND_MODE_BROADCAST
)
{
dev
->
hard_start_xmit
=
bond_xmit_broadcast
;
}
else
{
printk
(
KERN_ERR
"Unknown bonding mode %d
\n
"
,
mode
);
kfree
(
bond
->
stats
);
...
...
@@ -2272,7 +2919,18 @@ static int __init bond_init(struct net_device *dev)
}
else
{
printk
(
"out MII link monitoring"
);
}
printk
(
", in %s mode.
\n
"
,
mode
?
"active-backup"
:
"bonding"
);
printk
(
", in %s mode.
\n
"
,
bond_mode
());
printk
(
KERN_INFO
"%s registered with"
,
dev
->
name
);
if
(
arp_interval
>
0
)
{
printk
(
" ARP monitoring set to %d ms with %d target(s):"
,
arp_interval
,
arp_ip_count
);
for
(
count
=
0
;
count
<
arp_ip_count
;
count
++
)
printk
(
" %s"
,
arp_ip_target
[
count
]);
printk
(
"
\n
"
);
}
else
{
printk
(
"out ARP monitoring
\n
"
);
}
#ifdef CONFIG_PROC_FS
bond
->
bond_proc_dir
=
proc_mkdir
(
dev
->
name
,
proc_net
);
...
...
@@ -2329,6 +2987,8 @@ static int __init bonding_init(void)
/* Find a name for this unit */
static
struct
net_device
*
dev_bond
=
NULL
;
printk
(
KERN_INFO
"%s"
,
version
);
if
(
max_bonds
<
1
||
max_bonds
>
INT_MAX
)
{
printk
(
KERN_WARNING
"bonding_init(): max_bonds (%d) not in range %d-%d, "
...
...
@@ -2343,6 +3003,14 @@ static int __init bonding_init(void)
}
memset
(
dev_bonds
,
0
,
max_bonds
*
sizeof
(
struct
net_device
));
if
(
miimon
<
0
)
{
printk
(
KERN_WARNING
"bonding_init(): miimon module parameter (%d), "
"not in range 0-%d, so it was reset to %d
\n
"
,
miimon
,
INT_MAX
,
BOND_LINK_MON_INTERV
);
miimon
=
BOND_LINK_MON_INTERV
;
}
if
(
updelay
<
0
)
{
printk
(
KERN_WARNING
"bonding_init(): updelay module parameter (%d), "
...
...
@@ -2359,6 +3027,52 @@ static int __init bonding_init(void)
downdelay
=
0
;
}
if
(
miimon
==
0
)
{
if
((
updelay
!=
0
)
||
(
downdelay
!=
0
))
{
/* just warn the user the up/down delay will have
* no effect since miimon is zero...
*/
printk
(
KERN_WARNING
"bonding_init(): miimon module parameter not "
"set and updelay (%d) or downdelay (%d) module "
"parameter is set; updelay and downdelay have "
"no effect unless miimon is set
\n
"
,
updelay
,
downdelay
);
}
}
else
{
/* don't allow arp monitoring */
if
(
arp_interval
!=
0
)
{
printk
(
KERN_WARNING
"bonding_init(): miimon (%d) and arp_interval "
"(%d) can't be used simultaneously, "
"disabling ARP monitoring
\n
"
,
miimon
,
arp_interval
);
arp_interval
=
0
;
}
if
((
updelay
%
miimon
)
!=
0
)
{
/* updelay will be rounded in bond_init() when it
* is divided by miimon, we just inform user here
*/
printk
(
KERN_WARNING
"bonding_init(): updelay (%d) is not a multiple "
"of miimon (%d), updelay rounded to %d ms
\n
"
,
updelay
,
miimon
,
(
updelay
/
miimon
)
*
miimon
);
}
if
((
downdelay
%
miimon
)
!=
0
)
{
/* downdelay will be rounded in bond_init() when it
* is divided by miimon, we just inform user here
*/
printk
(
KERN_WARNING
"bonding_init(): downdelay (%d) is not a "
"multiple of miimon (%d), downdelay rounded "
"to %d ms
\n
"
,
downdelay
,
miimon
,
(
downdelay
/
miimon
)
*
miimon
);
}
}
if
(
arp_interval
<
0
)
{
printk
(
KERN_WARNING
"bonding_init(): arp_interval module parameter (%d), "
...
...
@@ -2367,11 +3081,63 @@ static int __init bonding_init(void)
arp_interval
=
BOND_LINK_ARP_INTERV
;
}
if
(
arp_ip_target
)
{
/* TODO: check and log bad ip address */
if
(
my_inet_aton
(
arp_ip_target
,
&
arp_target
)
==
0
)
{
arp_interval
=
0
;
for
(
arp_ip_count
=
0
;
(
arp_ip_count
<
MAX_ARP_IP_TARGETS
)
&&
arp_ip_target
[
arp_ip_count
];
arp_ip_count
++
)
{
/* TODO: check and log bad ip address */
if
(
my_inet_aton
(
arp_ip_target
[
arp_ip_count
],
&
arp_target
[
arp_ip_count
])
==
0
)
{
printk
(
KERN_WARNING
"bonding_init(): bad arp_ip_target module "
"parameter (%s), ARP monitoring will not be "
"performed
\n
"
,
arp_ip_target
[
arp_ip_count
]);
arp_interval
=
0
;
}
}
if
(
(
arp_interval
>
0
)
&&
(
arp_ip_count
==
0
))
{
/* don't allow arping if no arp_ip_target given... */
printk
(
KERN_WARNING
"bonding_init(): arp_interval module parameter "
"(%d) specified without providing an arp_ip_target "
"parameter, arp_interval was reset to 0
\n
"
,
arp_interval
);
arp_interval
=
0
;
}
if
((
miimon
==
0
)
&&
(
arp_interval
==
0
))
{
/* miimon and arp_interval not set, we need one so things
* work as expected, see bonding.txt for details
*/
printk
(
KERN_ERR
"bonding_init(): either miimon or "
"arp_interval and arp_ip_target module parameters "
"must be specified, otherwise bonding will not detect "
"link failures! see bonding.txt for details.
\n
"
);
}
if
((
primary
!=
NULL
)
&&
(
mode
!=
BOND_MODE_ACTIVEBACKUP
)){
/* currently, using a primary only makes sence
* in active backup mode
*/
printk
(
KERN_WARNING
"bonding_init(): %s primary device specified but has "
" no effect in %s mode
\n
"
,
primary
,
bond_mode
());
primary
=
NULL
;
}
if
(
multicast
!=
BOND_MULTICAST_DISABLED
&&
multicast
!=
BOND_MULTICAST_ACTIVE
&&
multicast
!=
BOND_MULTICAST_ALL
)
{
printk
(
KERN_WARNING
"bonding_init(): unknown multicast module "
"parameter (%d), multicast reset to %d
\n
"
,
multicast
,
BOND_MULTICAST_ALL
);
multicast
=
BOND_MULTICAST_ALL
;
}
for
(
no
=
0
;
no
<
max_bonds
;
no
++
)
{
...
...
@@ -2420,6 +3186,7 @@ static void __exit bonding_exit(void)
module_init
(
bonding_init
);
module_exit
(
bonding_exit
);
MODULE_LICENSE
(
"GPL"
);
MODULE_DESCRIPTION
(
DRV_DESCRIPTION
", v"
DRV_VERSION
);
/*
* Local variables:
...
...
include/linux/if_bonding.h
View file @
d8908c80
...
...
@@ -37,9 +37,10 @@
#define BOND_CHECK_MII_STATUS (SIOCGMIIPHY)
#define BOND_MODE_ROUNDROBIN 0
#define BOND_MODE_ACTIVEBACKUP 1
#define BOND_MODE_XOR 2
#define BOND_MODE_ROUNDROBIN 0
#define BOND_MODE_ACTIVEBACKUP 1
#define BOND_MODE_XOR 2
#define BOND_MODE_BROADCAST 3
/* each slave's link has 4 states */
#define BOND_LINK_UP 0
/* link is up and running */
...
...
@@ -74,6 +75,7 @@ typedef struct slave {
struct
slave
*
prev
;
struct
net_device
*
dev
;
short
delay
;
unsigned
long
jiffies
;
char
link
;
/* one of BOND_LINK_XXXX */
char
state
;
/* one of BOND_STATE_XXXX */
unsigned
short
original_flags
;
...
...
@@ -93,6 +95,8 @@ typedef struct bonding {
slave_t
*
next
;
slave_t
*
prev
;
slave_t
*
current_slave
;
slave_t
*
primary_slave
;
slave_t
*
current_arp_slave
;
__s32
slave_cnt
;
rwlock_t
lock
;
rwlock_t
ptrlock
;
...
...
Write
Preview
Markdown
is supported
0%
Try again
or
attach a new file
Attach a file
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Cancel
Please
register
or
sign in
to comment