Commits · 7c0c6095d48dcd0e67c917aa73cdbb2715aafc36 · nexedi / linux

23 May, 2019 32 commits

selftests/bpf: adjust verifier scale test · 7c0c6095

Alexei Starovoitov authored May 21, 2019

Adjust scale tests to check for new jmp sequence limit.

BPF_JGT had to be changed to BPF_JEQ because the verifier was
too smart. It tracked the known safe range of R0 values
and pruned the search earlier before hitting exact 8192 limit.
bpf_semi_rand_get() was too (un)?lucky.

k = 0; was missing in bpf_fill_scale2.
It was testing a bit shorter sequence of jumps than intended.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

7c0c6095

bpf: bump jmp sequence limit · b285fcb7

Alexei Starovoitov authored May 21, 2019

The limit of 1024 subsequent jumps was causing otherwise valid
programs to be rejected. Bump it to 8192 and make the error more verbose.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

b285fcb7

libbpf: emit diff of mismatched public API, if any · 9efc7794

Andrii Nakryiko authored May 22, 2019

It's easy to have a mismatch of "intended to be public" vs really
exposed API functions. While Makefile does check for this mismatch, if
it actually occurs it's not trivial to determine which functions are
accidentally exposed. This patch dumps out a diff showing what's not
supposed to be exposed facilitating easier fixing.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

9efc7794

hv_sock: perf: loop in send() to maximize bandwidth · 14a1eaa8

Sunil Muthuswamy authored May 22, 2019

Currently, the hv_sock send() iterates once over the buffer, puts data into
the VMBUS channel and returns. It doesn't maximize on the case when there
is a simultaneous reader draining data from the channel. In such a case,
the send() can maximize the bandwidth (and consequently minimize the cpu
cycles) by iterating until the channel is found to be full.

Perf data:
Total Data Transfer: 10GB/iteration
Single threaded reader/writer, Linux hvsocket writer with Windows hvsocket
reader
Packet size: 64KB
CPU sys time was captured using the 'time' command for the writer to send
10GB of data.
'Send Buffer Loop' is with the patch applied.
The values below are over 10 iterations.

|--------------------------------------------------------|
|        |        Current        |   Send Buffer Loop    |
|--------------------------------------------------------|
|        | Throughput | CPU sys  | Throughput | CPU sys  |
|        | (MB/s)     | time (s) | (MB/s)     | time (s) |
|--------------------------------------------------------|
| Min    |     407    |   7.048  |    401     |  5.958   |
|--------------------------------------------------------|
| Max    |     455    |   7.563  |    542     |  6.993   |
|--------------------------------------------------------|
| Avg    |     440    |   7.411  |    451     |  6.639   |
|--------------------------------------------------------|
| Median |     446    |   7.417  |    447     |  6.761   |
|--------------------------------------------------------|

Observation:
1. The avg throughput doesn't really change much with this change for this
scenario. This is most probably because the bottleneck on throughput is
somewhere else.
2. The average system (or kernel) cpu time goes down by 10%+ with this
change, for the same amount of data transfer.
Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

14a1eaa8

hv_sock: perf: Allow the socket buffer size options to influence the actual socket buffers · ac383f58

Sunil Muthuswamy authored May 22, 2019

Currently, the hv_sock buffer size is static and can't scale to the
bandwidth requirements of the application. This change allows the
applications to influence the socket buffer sizes using the SO_SNDBUF and
the SO_RCVBUF socket options.

Few interesting points to note:
1. Since the VMBUS does not allow a resize operation of the ring size, the
socket buffer size option should be set prior to establishing the
connection for it to take effect.
2. Setting the socket option comes with the cost of that much memory being
reserved/allocated by the kernel, for the lifetime of the connection.

Perf data:
Total Data Transfer: 1GB
Single threaded reader/writer
Results below are summarized over 10 iterations.

Linux hvsocket writer + Windows hvsocket reader:
|---------------------------------------------------------------------------------------------|
|Packet size ->   |      128B       |       1KB       |       4KB       |        64KB         |
|---------------------------------------------------------------------------------------------|
|SO_SNDBUF size | |                 Throughput in MB/s (min/max/avg/median):                  |
|               v |                                                                           |
|---------------------------------------------------------------------------------------------|
|      Default    | 109/118/114/116 | 636/774/701/700 | 435/507/480/476 |   410/491/462/470   |
|      16KB       | 110/116/112/111 | 575/705/662/671 | 749/900/854/869 |   592/824/692/676   |
|      32KB       | 108/120/115/115 | 703/823/767/772 | 718/878/850/866 | 1593/2124/2000/2085 |
|      64KB       | 108/119/114/114 | 592/732/683/688 | 805/934/903/911 | 1784/1943/1862/1843 |
|---------------------------------------------------------------------------------------------|

Windows hvsocket writer + Linux hvsocket reader:
|---------------------------------------------------------------------------------------------|
|Packet size ->   |     128B    |      1KB        |          4KB        |        64KB         |
|---------------------------------------------------------------------------------------------|
|SO_RCVBUF size | |               Throughput in MB/s (min/max/avg/median):                    |
|               v |                                                                           |
|---------------------------------------------------------------------------------------------|
|      Default    | 69/82/75/73 | 313/343/333/336 |   418/477/446/445   |   659/701/676/678   |
|      16KB       | 69/83/76/77 | 350/401/375/382 |   506/548/517/516   |   602/624/615/615   |
|      32KB       | 62/83/73/73 | 471/529/496/494 |   830/1046/935/939  | 944/1180/1070/1100  |
|      64KB       | 64/70/68/69 | 467/533/501/497 | 1260/1590/1430/1431 | 1605/1819/1670/1660 |
|---------------------------------------------------------------------------------------------|
Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ac383f58

ipv4/igmp: shrink struct ip_sf_list · 0db355d4

Eric Dumazet authored May 22, 2019

Removing two 4 bytes holes allows to use kmalloc-32
kmem cache instead of kmalloc-64 on 64bit kernels.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0db355d4

neighbor: Add tracepoint to __neigh_create · fc651001

David Ahern authored May 22, 2019

Add tracepoint to __neigh_create to enable debugging of new entries.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fc651001

selftests: pmtu: Simplify cleanup and namespace names · a92a0a7b

David Ahern authored May 22, 2019

The point of the pause-on-fail argument is to leave the setup as is after
a test fails to allow a user to debug why it failed. Move the cleanup
after posting the result to the user to make it so.

Random names for the namespaces are not user friendly when trying to
debug a failure. Make them simpler and more direct for the tests. Run
cleanup at the beginning to ensure they are cleaned up if they already
exist.

Remove cleanup_done. There is no harm in doing cleanup twice; just
ignore any errors related to not existing - which is already done.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a92a0a7b

selftests: fib-onlink: Make quiet by default · 9b7e94e6

David Ahern authored May 22, 2019

Add VERBOSE argument to fib-onlink-tests.sh and make output quiet by
default. Add getopt parsing of inputs and support for -v (verbose) and
-p (pause on fail).
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9b7e94e6

net: Set strict_start_type for routes and rules · 75425657

David Ahern authored May 22, 2019

New userspace on an older kernel can send unknown and unsupported
attributes resulting in an incompelete config which is almost
always wrong for routing (few exceptions are passthrough settings
like the protocol that installed the route).

Set strict_start_type in the policies for IPv4 and IPv6 routes and
rules to detect new, unsupported attributes and fail the route add.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

75425657

Merge branch 'net-Export-functions-for-nexthop-code' · e38f7cbd

David S. Miller authored May 22, 2019

David Ahern says:

====================
net: Export functions for nexthop code

This set exports ipv4 and ipv6 fib functions for use by the nexthop
code. It also adds new ones to send route notifications if a nexthop
configuration changes.

v2
- repost of patches dropped at the end of the last dev window
  added patch 8 which exports nh_update_mtu since it is inline with
  the other patches
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

e38f7cbd

ipv4: Rename and export nh_update_mtu · 06c77c3e

David Ahern authored May 22, 2019

Rename nh_update_mtu to fib_nhc_update_mtu and export for use by the
nexthop code.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

06c77c3e

ipv4: export fib_info_update_nh_saddr · c3669486

David Ahern authored May 22, 2019

Add scope as input argument versus relying on fib_info reference in
fib_nh, and export fib_info_update_nh_saddr.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c3669486

ipv4: export fib_flush · 9bd83667

David Ahern authored May 22, 2019

As nexthops are deleted, fib entries referencing it are marked dead.
Export fib_flush so those entries can be removed in a timely manner.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9bd83667

ipv4: export fib_check_nh · ac1fab2d

David Ahern authored May 22, 2019

Change fib_check_nh to take net, table and scope as input arguments
over struct fib_config and export for use by nexthop code.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ac1fab2d

ipv4: Add function to send route updates · 1bff1a0c

David Ahern authored May 22, 2019

Add fib_info_notify_update to walk the fib and send RTM_NEWROUTE
notifications with NLM_F_REPLACE set for entries linked to a fib_info
that have nh_updated flag set. This helper will be used by the nexthop
code to notify userspace of routes that are impacted when a nexthop
config is updated via replace. The new function and its helper are
similar to how fib_flush and fib_table_flush work for address delete
and link down events.

This notification is needed for legacy apps that do not understand
the new nexthop object. Apps that are nexthop aware can use the
RTA_NH_ID attribute in the route notification to just ignore it.

In the future this should be wrapped in a sysctl to allow OS'es that
are fully updated to avoid the notificaton storm.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1bff1a0c

ipv6: export function to send route updates · 19a3b7ee

David Ahern authored May 22, 2019

Add fib6_rt_update to send RTM_NEWROUTE with NLM_F_REPLACE set. This
helper will be used by the nexthop code to notify userspace of routes
that are impacted when a nexthop config is updated via replace.

This notification is needed for legacy apps that do not understand
the new nexthop object. Apps that are nexthop aware can use the
RTA_NH_ID attribute in the route notification to just ignore it.

In the future this should be wrapped in a sysctl to allow OS'es that
are fully updated to avoid the notificaton storm.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

19a3b7ee

ipv6: Add hook to bump sernum for a route to stubs · cdaa16a4

David Ahern authored May 22, 2019

Add hook to ipv6 stub to bump the sernum up to the root node for a
route. This is needed by the nexthop code when a nexthop config changes.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cdaa16a4

ipv6: Add delete route hook to stubs · 68a9b13d

David Ahern authored May 22, 2019

Add ip6_del_rt to the IPv6 stub. The hook is needed by the nexthop
code to remove entries linked to a nexthop that is getting deleted.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

68a9b13d

Merge branch 'net-phy-T1-support' · 26b1b8d7

David S. Miller authored May 22, 2019

Andrew Lunn says:

====================
net: phy: T1 support

T1 PHYs make use of a single twisted pair, rather than the traditional
2 pair for 100BaseT or 4 pair for 1000BaseT. This patchset adds link
modes for 100BaseT1 and 1000BaseT1, and them makes use of 100BaseT1 in
the list of PHY features used by current T1 drivers.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

26b1b8d7

net: phy: Make phy_basic_t1_features use base100t1. · e5fb32c6

Andrew Lunn authored May 22, 2019

Now that there is a link mode for 100BaseT1, use it in
phy_basic_t1_features so T1 PHY drivers will indicate this mode via
the Ethtool API.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

e5fb32c6

net: phy: Add support for 100BaseT1 and 1000BaseT1 · b2557764

Andrew Lunn authored May 22, 2019

Add link modes for 100Mbps and 1Gbps over a single pair.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

b2557764

net: phy: dp83867: Allocate state struct in probe · 565d9d22

Trent Piepho authored May 22, 2019

This was being done in config the first time the phy was configured.
Should be in the probe method.

Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Trent Piepho <tpiepho@impinj.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

565d9d22

net: phy: dp83867: Validate FIFO depth property · f8bbf417

Trent Piepho authored May 22, 2019

Insure property is in valid range and fail when reading DT if it is not.
Also add error message for existing failure if required property is not
present.

Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Trent Piepho <tpiepho@impinj.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

f8bbf417

net: phy: dp83867: IO impedance is not dependent on RGMII delay · 27708eb5

Trent Piepho authored May 22, 2019

The driver would only set the IO impedance value when RGMII internal
delays were enabled.  There is no reason for this.  Move the IO
impedance block out of the RGMII delay block.

Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Trent Piepho <tpiepho@impinj.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

27708eb5

net: phy: dp83867: Use unsigned variables to store unsigned properties · 1b9b2954

Trent Piepho authored May 22, 2019

The variables used to store u32 DT properties were signed ints.  This
doesn't work properly if the value of the property were to overflow.
Use unsigned variables so this doesn't happen.

Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Trent Piepho <tpiepho@impinj.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

1b9b2954

net: phy: dp83867: Rework delay rgmii delay handling · c11669a2

Trent Piepho authored May 22, 2019

The code was assuming the reset default of the delay control register
was to have delay disabled.  This is what the datasheet shows as the
register's initial value.  However, that's not actually true: the
default is controlled by the PHY's pin strapping.

If the interface mode is selected as RX or TX delay only, insure the
other direction's delay is disabled.

If the interface mode is just "rgmii", with neither TX or RX internal
delay, one might expect that the driver should disable both delays.  But
this is not what the driver does.  It leaves the setting at the PHY's
strapping's default.  And that default, for no pins with strapping
resistors, is to have delay enabled and 2.00 ns.

Rather than change this behavior, I've kept it the same and documented
it.  No delay will most likely not work and will break ethernet on any
board using "rgmii" mode.  If the board is strapped to have a delay and
is configured to use "rgmii" mode a warning is generated that "rgmii-id"
should have been used.

Also validate the delay values and fail if they are not in range.

Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Trent Piepho <tpiepho@impinj.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

c11669a2

net: phy: dp83867: Add ability to disable output clock · 13c83cf8

Trent Piepho authored May 22, 2019

Generally, the output clock pin is only used for testing and only serves
as a source of RF noise after this.  It could be used to daisy-chain
PHYs, but this is uncommon.  Since the PHY can disable the output, make
doing so an option.  I do this by adding another enumeration to the
allowed values of ti,clk-output-sel.

The code was not using the value DP83867_CLK_O_SEL_REF_CLK as one might
expect: to select the REF_CLK as the output.  Rather it meant "keep
clock output setting as is", which, depending on PHY strapping, might
not be outputting REF_CLK.

Change this so DP83867_CLK_O_SEL_REF_CLK means enable REF_CLK output.
Omitting the property will leave the setting as is (which was the
previous behavior in this case).

Out of range values were silently converted into
DP83867_CLK_O_SEL_REF_CLK.  Change this so they generate an error.

Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Trent Piepho <tpiepho@impinj.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

13c83cf8

dt-bindings: phy: dp83867: Add documentation for disabling clock output · 980066e6

Trent Piepho authored May 22, 2019

The clock output is generally only used for testing and development and
not used to daisy-chain PHYs.  It's just a source of RF noise afterward.

Add a mux value for "off".  I've added it as another enumeration to the
output property.  In the actual PHY, the mux and the output enable are
independently controllable.  However, it doesn't seem useful to be able
to describe the mux setting when the output is disabled.

Document that PHY's default setting will be left as is if the property
is omitted.

Cc: Rob Herring <robh+dt@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Trent Piepho <tpiepho@impinj.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

980066e6

dt-bindings: phy: dp83867: Describe how driver behaves w.r.t rgmii delay · 9c3f3410

Trent Piepho authored May 22, 2019

Add a note to make it more clear how the driver behaves when "rgmii" vs
"rgmii-id", "rgmii-idrx", or "rgmii-idtx" interface modes are selected.

Cc: Rob Herring <robh+dt@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Trent Piepho <tpiepho@impinj.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

9c3f3410

cxgb4: Enable hash filter with offload · 74dd5aa1

Vishal Kulkarni authored May 22, 2019

Hash (exact-match) filters used for offloading flows share the
same active region resources on the chip with upper layer drivers,
like iw_cxgb4, chcr, etc. Currently, only either Hash filters
or ULDs can use the active region resources, but not both. Hence,
use the new firmware configuration parameters (when available)
to allow both the Hash filters and ULDs to share the
active region simultaneously.
Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

74dd5aa1

net: fec: remove redundant ipg clock disable · 2bb0f3b4

Baruch Siach authored May 22, 2019

Don't disable the ipg clock in the regulator error path. The clock is
disable unconditionally two lines below the failed_regulator label.
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>

2bb0f3b4

22 May, 2019 6 commits

net: Add UNIX_DIAG_UID to Netlink UNIX socket diagnostics. · cae9910e

Felipe Gasper authored May 20, 2019

This adds the ability for Netlink to report a socket's UID along with the
other UNIX diagnostic information that is already available. This will
allow diagnostic tools greater insight into which users control which
socket.

To test this, do the following as a non-root user:

    unshare -U -r bash
    nc -l -U user.socket.$$ &

.. and verify from within that same session that Netlink UNIX socket
diagnostics report the socket's UID as 0. Also verify that Netlink UNIX
socket diagnostics report the socket's UID as the user's UID from an
unprivileged process in a different session. Verify the same from
a root process.
Signed-off-by: Felipe Gasper <felipe@felipegasper.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cae9910e

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 54dee406

Linus Torvalds authored May 22, 2019

Pull arm64 fixes from Will Deacon:

 - Fix SPE probe failure when backing auxbuf with high-order pages

 - Fix handling of DMA allocations from outside of the vmalloc area

 - Fix generation of build-id ELF section for vDSO object

 - Disable huge I/O mappings if kernel page table dumping is enabled

 - A few other minor fixes (comments, kconfig etc)

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: vdso: Explicitly add build-id option
  arm64/mm: Inhibit huge-vmap with ptdump
  arm64: Print physical address of page table base in show_pte()
  arm64: don't trash config with compat symbol if COMPAT is disabled
  arm64: assembler: Update comment above cond_yield_neon() macro
  drivers/perf: arm_spe: Don't error on high-order pages for aux buf
  arm64/iommu: handle non-remapped addresses in ->mmap and ->get_sgtable

54dee406

Merge tag 'gfs2-5.1.fixes2' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 · 651bae98

Linus Torvalds authored May 22, 2019

Pull gfs2 fix from Andreas Gruenbacher:
 "Fix a gfs2 sign extension bug introduced in v4.3"

* tag 'gfs2-5.1.fixes2' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
  gfs2: Fix sign extension bug in gfs2_update_stats

651bae98

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · f75b6f30

Linus Torvalds authored May 22, 2019

Pull networking fixes from David Miller:

 1) Clear up some recent tipc regressions because of registration
    ordering. Fix from Junwei Hu.

 2) tipc's TLV_SET() can read past the end of the supplied buffer during
    the copy. From Chris Packham.

 3) ptp example program doesn't match the kernel, from Richard Cochran.

 4) Outgoing message type fix in qrtr, from Bjorn Andersson.

 5) Flow control regression in stmmac, from Tan Tee Min.

 6) Fix inband autonegotiation in phylink, from Russell King.

 7) Fix sk_bound_dev_if handling in rawv6_bind(), from Mike Manning.

 8) Fix usbnet crash after disconnect, from Kloetzke Jan.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (21 commits)
  usbnet: fix kernel crash after disconnect
  selftests: fib_rule_tests: use pre-defined DEV_ADDR
  net-next: net: Fix typos in ip-sysctl.txt
  ipv6: Consider sk_bound_dev_if when binding a raw socket to an address
  net: phylink: ensure inband AN works correctly
  usbnet: ipheth: fix racing condition
  net: stmmac: dma channel control register need to be init first
  net: stmmac: fix ethtool flow control not able to get/set
  net: qrtr: Fix message type of outgoing packets
  networking: : fix typos in code comments
  ptp: Fix example program to match kernel.
  fddi: fix typos in code comments
  selftests: fib_rule_tests: enable forwarding before ipv4 from/iif test
  selftests: fib_rule_tests: fix local IPv4 address typo
  tipc: Avoid copying bytes beyond the supplied data
  2/2] net: xilinx_emaclite: use readx_poll_timeout() in mdio wait function
  1/2] net: axienet: use readx_poll_timeout() in mdio wait function
  vlan: Mark expected switch fall-through
  macvlan: Mark expected switch fall-through
  net/mlx4_en: ethtool, Remove unsupported SFP EEPROM high pages query
  ...

f75b6f30

Merge tag 'for-5.2/dm-fix-1' of... · 86f9e56d

Linus Torvalds authored May 22, 2019

Merge tag 'for-5.2/dm-fix-1' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper fix from Mike Snitzer:
 "Fix a particularly glaring oversight in a DM core commit from 5.1 that
  doesn't properly trim special IOs (e.g. discards) relative to
  corresponding target's max_io_len_target_boundary()"

* tag 'for-5.2/dm-fix-1' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm: make sure to obey max_io_len_target_boundary

86f9e56d

gfs2: Fix sign extension bug in gfs2_update_stats · 5a5ec83d

Andreas Gruenbacher authored May 17, 2019

Commit 4d207133 changed the types of the statistic values in struct
gfs2_lkstats from s64 to u64. Because of that, what should be a signed
value in gfs2_update_stats turned into an unsigned value. When shifted
right, we end up with a large positive value instead of a small negative
value, which results in an incorrect variance estimate.

Fixes: 4d207133 ("gfs2: Make statistics unsigned, suitable for use with do_div()")
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Cc: stable@vger.kernel.org # v4.4+

5a5ec83d

21 May, 2019 2 commits

dm: make sure to obey max_io_len_target_boundary · 51b86f9a

Michael Lass authored May 21, 2019

Commit 61697a6a ("dm: eliminate 'split_discard_bios' flag from DM
target interface") incorrectly removed code from
__send_changing_extent_only() that is required to impose a per-target IO
boundary on IO that exceeds max_io_len_target_boundary().  Otherwise
"special" IO (e.g. DISCARD, WRITE SAME, WRITE ZEROES) can write beyond
where allowed.

Fix this by restoring the max_io_len_target_boundary() limit in
__send_changing_extent_only()

Fixes: 61697a6a ("dm: eliminate 'split_discard_bios' flag from DM target interface")
Cc: stable@vger.kernel.org # 5.1+
Signed-off-by: Michael Lass <bevan@bi-co.net>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>

51b86f9a

usbnet: fix kernel crash after disconnect · ad70411a

Kloetzke Jan authored May 21, 2019

When disconnecting cdc_ncm the kernel sporadically crashes shortly
after the disconnect:

  [   57.868812] Unable to handle kernel NULL pointer dereference at virtual address 00000000
  ...
  [   58.006653] PC is at 0x0
  [   58.009202] LR is at call_timer_fn+0xec/0x1b4
  [   58.013567] pc : [<0000000000000000>] lr : [<ffffff80080f5130>] pstate: 00000145
  [   58.020976] sp : ffffff8008003da0
  [   58.024295] x29: ffffff8008003da0 x28: 0000000000000001
  [   58.029618] x27: 000000000000000a x26: 0000000000000100
  [   58.034941] x25: 0000000000000000 x24: ffffff8008003e68
  [   58.040263] x23: 0000000000000000 x22: 0000000000000000
  [   58.045587] x21: 0000000000000000 x20: ffffffc68fac1808
  [   58.050910] x19: 0000000000000100 x18: 0000000000000000
  [   58.056232] x17: 0000007f885aff8c x16: 0000007f883a9f10
  [   58.061556] x15: 0000000000000001 x14: 000000000000006e
  [   58.066878] x13: 0000000000000000 x12: 00000000000000ba
  [   58.072201] x11: ffffffc69ff1db30 x10: 0000000000000020
  [   58.077524] x9 : 8000100008001000 x8 : 0000000000000001
  [   58.082847] x7 : 0000000000000800 x6 : ffffff8008003e70
  [   58.088169] x5 : ffffffc69ff17a28 x4 : 00000000ffff138b
  [   58.093492] x3 : 0000000000000000 x2 : 0000000000000000
  [   58.098814] x1 : 0000000000000000 x0 : 0000000000000000
  ...
  [   58.205800] [<          (null)>]           (null)
  [   58.210521] [<ffffff80080f5298>] expire_timers+0xa0/0x14c
  [   58.215937] [<ffffff80080f542c>] run_timer_softirq+0xe8/0x128
  [   58.221702] [<ffffff8008081120>] __do_softirq+0x298/0x348
  [   58.227118] [<ffffff80080a6304>] irq_exit+0x74/0xbc
  [   58.232009] [<ffffff80080e17dc>] __handle_domain_irq+0x78/0xac
  [   58.237857] [<ffffff8008080cf4>] gic_handle_irq+0x80/0xac
  ...

The crash happens roughly 125..130ms after the disconnect. This
correlates with the 'delay' timer that is started on certain USB tx/rx
errors in the URB completion handler.

The problem is a race of usbnet_stop() with usbnet_start_xmit(). In
usbnet_stop() we call usbnet_terminate_urbs() to cancel all URBs in
flight. This only makes sense if no new URBs are submitted
concurrently, though. But the usbnet_start_xmit() can run at the same
time on another CPU which almost unconditionally submits an URB. The
error callback of the new URB will then schedule the timer after it was
already stopped.

The fix adds a check if the tx queue is stopped after the tx list lock
has been taken. This should reliably prevent the submission of new URBs
while usbnet_terminate_urbs() does its job. The same thing is done on
the rx side even though it might be safe due to other flags that are
checked there.
Signed-off-by: Jan Klötzke <Jan.Kloetzke@preh.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

ad70411a