- 18 Jun, 2019 17 commits
-
-
Ido Schimmel authored
Allow the driver to create an IPv6 multipath route in one go by passing an array of sibling routes and iterating over them. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
Currently, the functions that take care of populating IPv6 nexthop groups only add / delete a single nexthop. Prepare them to handle multiple routes in one notification by passing an array of routes and adding / deleting all of them. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
Prepare the driver to handle multiple routes in a single notification by passing an array of routes to the functions that actually add / delete a route. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
Previously, IPv6 replace notifications were only sent from fib6_add_rt2node(). The function only emitted such notifications if a route actually replaced another route. A previous patch added another call site in ip6_route_multipath_add() from which such notification can be emitted even if a route was merely added and did not replace another route. Adjust the driver to take this into account and potentially set the 'replace' flag to 'false' if the notified route did not replace an existing route. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
Prepare the driver to process IPv6 multipath notifications by passing an array of 'struct fib6_info' instead of just one route. A reference is taken on each sibling route in order to prevent them from being freed until they are processed by the workqueue. v2: * Remove 'multipath_rt' usage Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
The function mlxsw_sp_router_fib6_event() takes care of preparing the needed information for the work item that actually inserts the route into the device. When processing an IPv6 multipath route, the function will need to allocate an array to store pointers to all the sibling routes. Change the function's signature to return an error code and adjust the single call site. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
No such notifications are sent by the IPv6 code, so remove them. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
If all the nexthops of a multipath route are being deleted, send one notification for the entire route, instead of one per-nexthop. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
Emit a notification when a multipath routes is added or replace. Note that unlike the replace notifications sent from fib6_add_rt2node(), it is possible we are sending a 'FIB_EVENT_ENTRY_REPLACE' when a route was merely added and not replaced. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
In a similar fashion to previous patch, have netdevsim ignore IPv6 multipath notifications for now. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
IPv6 multipath notifications are about to be sent, but mlxsw is not ready to process them, so ignore them. The limitation will be lifted by a subsequent patch which will also stop the kernel from sending a notification for each nexthop. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
Extend the IPv6 FIB notifier info with number of sibling routes being notified. This will later allow listeners to process one notification for a multipath routes instead of N, where N is the number of nexthops. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
The struct includes a 'skip_notify' flag that indicates if netlink notifications to user space should be suppressed. As explained in commit 3b1137fe ("net: ipv6: Change notifications for multipath add to RTA_MULTIPATH"), this is useful to suppress per-nexthop RTM_NEWROUTE notifications when an IPv6 multipath route is added / deleted. Instead, one notification is sent for the entire multipath route. This concept is also useful for in-kernel notifications. Sending one in-kernel notification for the addition / deletion of an IPv6 multipath route - instead of one per-nexthop - provides a significant increase in the insertion / deletion rate to underlying devices. Add a 'skip_notify_kernel' flag to suppress in-kernel notifications. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
Some fields were not documented. Add documentation. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queueDavid S. Miller authored
Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2019-06-17 This series contains updates to the iavf driver only. Akeem updates the driver to change how VLAN tags are being populated and programmed into the hardware by starting from the first member of the list until the number of allowed VLAN tags is exhausted. Mitch fixed the variable type since the variable counter starts out negative and climbs to zero, so use a signed integer instead of unsigned. Also increase the timeout to avoid erroneous errors. Fixed the driver to be able to handle when the hardware hands us a null receive descriptor with no data attached, yet is still valid. Aleksandr fixes the driver to use GFP_ATOMIC when allocating memory in atomic context. Avinash updates the driver to fix a calculation error in virtchnl regarding the valid length. Jakub does some refactoring of the commands processing the watchdog state machine to reduce the length and complexity of the function. Also decalre watchdog task as delayed work and use a dedicated work queue to service the driver tasks. Paul updated the iavf_process_aq_command to call the necessary functions to be able to clear cloud filter bits that need to be cleared. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Shalom Toledo authored
Compilation on 32-bit ARM fails after commit 992aa864 ("mlxsw: spectrum_ptp: Add implementation for physical hardware clock operations") because of 64-bit division: arm-linux-gnueabi-ld: drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.o: in function `mlxsw_sp1_ptp_phc_settime': spectrum_ptp.c:(.text+0x39c): undefined reference to `__aeabi_uldivmod' Fix by using div_u64(). Fixes: 992aa864 ("mlxsw: spectrum_ptp: Add implementation for physical hardware clock operations") Signed-off-by: Shalom Toledo <shalomt@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller authored
Honestly all the conflicts were simple overlapping changes, nothing really interesting to report. Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 17 Jun, 2019 23 commits
-
-
David S. Miller authored
Fred Klassen says: ==================== UDP GSO audit tests Updates to UDP GSO selftests ot optionally stress test CMSG subsytem, and report the reliability and performance of both TX Timestamping and ZEROCOPY messages. ==================== Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Fred Klassen authored
Ensure that failure on any individual test results in an overall failure of the test script. Signed-off-by: Fred Klassen <fklassen@appneta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Fred Klassen authored
Audit tests count the total number of messages sent and compares with total number of CMSG received on error queue. Example: udp gso zerocopy timestamp audit udp rx: 1599 MB/s 1166414 calls/s udp tx: 1615 MB/s 27395 calls/s 27395 msg/s udp rx: 1634 MB/s 1192261 calls/s udp tx: 1633 MB/s 27699 calls/s 27699 msg/s udp rx: 1633 MB/s 1191358 calls/s udp tx: 1631 MB/s 27678 calls/s 27678 msg/s Summary over 4.000 seconds... sum udp tx: 1665 MB/s 82772 calls (27590/s) 82772 msgs (27590/s) Tx Timestamps: 82772 received 0 errors Zerocopy acks: 82772 received Errors are thrown if CMSG count does not equal send count, example: Summary over 4.000 seconds... sum tcp tx: 7451 MB/s 493706 calls (123426/s) 493706 msgs (123426/s) ./udpgso_bench_tx: Unexpected number of Zerocopy completions: 493706 expected 493704 received Also reduce individual test time from 4 to 3 seconds so that overall test time does not increase significantly. v3: Enhancements as per Willem de Bruijn <willemb@google.com> - document -P option for TCP audit Signed-off-by: Fred Klassen <fklassen@appneta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Fred Klassen authored
This enhancement adds options that facilitate load testing with additional TX CMSG options, and to optionally print results of various send CMSG operations. These options are especially useful in isolating situations where error-queue messages are lost when combined with other CMSG operations (e.g. SO_ZEROCOPY). New options: -a - count all CMSG messages and match to sent messages -T - add TX CMSG that requests TX software timestamps -H - similar to -T except request TX hardware timestamps -P - call poll() before reading error queue -v - print detailed results v2: Enhancements as per Willem de Bruijn <willemb@google.com> - Updated control and buffer parameters for recvmsg - poll() parameter cleanup - fail on bad audit results - remove TOS options - improved reporting v3: Enhancements as per Willem de Bruijn <willemb@google.com> - add SOF_TIMESTAMPING_OPT_TSONLY to eliminate MSG_TRUNC - general code cleanup Signed-off-by: Fred Klassen <fklassen@appneta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds authored
Pull vfs fixes from Al Viro: "MS_MOVE regression fix + breakage in fsmount(2) (also introduced in this cycle, along with fsmount(2) itself). I'm still digging through the piles of mail, so there might be more fixes to follow, but these two are obvious and self-contained, so there's no point delaying those..." * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs/namespace: fix unprivileged mount propagation vfs: fsmount: add missing mntget()
-
David S. Miller authored
Florian Westphal says: ==================== net: ipv4: remove erroneous advancement of list pointer Tariq reported a soft lockup on net-next that Mellanox was able to bisect to 2638eb8b ("net: ipv4: provide __rcu annotation for ifa_list"). While reviewing above patch I found a regression when addresses have a lifetime specified. Second patch extends rtnetlink.sh to trigger crash (without first patch applied). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Westphal authored
This exercises kernel code path that deal with addresses that have a limited lifetime. Without previous fix, this triggers following crash on net-next: BUG: KASAN: null-ptr-deref in check_lifetime+0x403/0x670 Read of size 8 at addr 0000000000000010 by task kworker [..] Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Westphal authored
Causes crash when lifetime expires on an adress as garbage is dereferenced soon after. This used to look like this: for (ifap = &ifa->ifa_dev->ifa_list; *ifap != NULL; ifap = &(*ifap)->ifa_next) { if (*ifap == ifa) ... but this was changed to: struct in_ifaddr *tmp; ifap = &ifa->ifa_dev->ifa_list; tmp = rtnl_dereference(*ifap); while (tmp) { tmp = rtnl_dereference(tmp->ifa_next); // Bogus if (rtnl_dereference(*ifap) == ifa) { ... ifap = &tmp->ifa_next; // Can be NULL tmp = rtnl_dereference(*ifap); // Dereference } } Remove the bogus assigment/list entry skip. Fixes: 2638eb8b ("net: ipv4: provide __rcu annotation for ifa_list") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Arnd Bergmann authored
Due to a reversed dependency, it is possible to build the lower ptp driver as a loadable module and the actual driver using it as built-in, causing a link error: drivers/net/dsa/sja1105/sja1105_spi.o: In function `sja1105_static_config_upload': sja1105_spi.c:(.text+0x6f0): undefined reference to `sja1105_ptp_reset' drivers/net/dsa/sja1105/sja1105_spi.o:(.data+0x2d4): undefined reference to `sja1105et_ptp_cmd' drivers/net/dsa/sja1105/sja1105_spi.o:(.data+0x604): undefined reference to `sja1105pqrs_ptp_cmd' drivers/net/dsa/sja1105/sja1105_main.o: In function `sja1105_remove': sja1105_main.c:(.text+0x8d4): undefined reference to `sja1105_ptp_clock_unregister' drivers/net/dsa/sja1105/sja1105_main.o: In function `sja1105_rxtstamp_work': sja1105_main.c:(.text+0x964): undefined reference to `sja1105_tstamp_reconstruct' drivers/net/dsa/sja1105/sja1105_main.o: In function `sja1105_setup': sja1105_main.c:(.text+0xb7c): undefined reference to `sja1105_ptp_clock_register' drivers/net/dsa/sja1105/sja1105_main.o: In function `sja1105_port_deferred_xmit': sja1105_main.c:(.text+0x1fa0): undefined reference to `sja1105_ptpegr_ts_poll' sja1105_main.c:(.text+0x1fc4): undefined reference to `sja1105_tstamp_reconstruct' drivers/net/dsa/sja1105/sja1105_main.o:(.rodata+0x5b0): undefined reference to `sja1105_get_ts_info' Change the Makefile logic to always build the ptp module the same way as the rest. Another option would be to just add it to the same module and remove the exports, but I don't know if there was a good reason to keep them separate. Fixes: bb77f36a ("net: dsa: sja1105: Add support for the PTP clock") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Arnd Bergmann authored
When building without CONFIG_OF, we get a harmless build warning: drivers/net/ethernet/stmicro/stmmac/stmmac_main.c: In function 'stmmac_phy_setup': drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:973:22: error: unused variable 'node' [-Werror=unused-variable] struct device_node *node = priv->plat->phy_node; Reword it so we always use the local variable, by making it the fwnode pointer instead of the device_node. Fixes: 74371272 ("net: stmmac: Convert to phylink and remove phylib logic") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds authored
Pull networking fixes from David Miller: "Lots of bug fixes here: 1) Out of bounds access in __bpf_skc_lookup, from Lorenz Bauer. 2) Fix rate reporting in cfg80211_calculate_bitrate_he(), from John Crispin. 3) Use after free in psock backlog workqueue, from John Fastabend. 4) Fix source port matching in fdb peer flow rule of mlx5, from Raed Salem. 5) Use atomic_inc_not_zero() in fl6_sock_lookup(), from Eric Dumazet. 6) Network header needs to be set for packet redirect in nfp, from John Hurley. 7) Fix udp zerocopy refcnt, from Willem de Bruijn. 8) Don't assume linear buffers in vxlan and geneve error handlers, from Stefano Brivio. 9) Fix TOS matching in mlxsw, from Jiri Pirko. 10) More SCTP cookie memory leak fixes, from Neil Horman. 11) Fix VLAN filtering in rtl8366, from Linus Walluij. 12) Various TCP SACK payload size and fragmentation memory limit fixes from Eric Dumazet. 13) Use after free in pneigh_get_next(), also from Eric Dumazet. 14) LAPB control block leak fix from Jeremy Sowden" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (145 commits) lapb: fixed leak of control-blocks. tipc: purge deferredq list for each grp member in tipc_group_delete ax25: fix inconsistent lock state in ax25_destroy_timer neigh: fix use-after-free read in pneigh_get_next tcp: fix compile error if !CONFIG_SYSCTL hv_sock: Suppress bogus "may be used uninitialized" warnings be2net: Fix number of Rx queues used for flow hashing net: handle 802.1P vlan 0 packets properly tcp: enforce tcp_min_snd_mss in tcp_mtu_probing() tcp: add tcp_min_snd_mss sysctl tcp: tcp_fragment() should apply sane memory limits tcp: limit payload size of sacked skbs Revert "net: phylink: set the autoneg state in phylink_phy_change" bpf: fix nested bpf tracepoints with per-cpu data bpf: Fix out of bounds memory access in bpf_sk_storage vsock/virtio: set SOCK_DONE on peer shutdown net: dsa: rtl8366: Fix up VLAN filtering net: phylink: set the autoneg state in phylink_phy_change net: add high_order_alloc_disable sysctl/static key tcp: add tcp_tx_skb_cache sysctl ...
-
Mitch Williams authored
In some circumstances, the hardware can hand us a null receive descriptor, with no data attached but otherwise valid. Unfortunately, the driver was ill-equipped to handle such an event, and would stop processing packets at that point. To fix this, use the Descriptor Done bit instead of the size to determine whether or not a descriptor is ready to be processed. Add some checks to allow for unused buffers. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Paul Greenwalt authored
Add call to iavf_add_cloud_filter and iavf_del_cloud_filter from iavf_process_aq_command to clear aq_required IAVF_FLAG_AQ_ADD_CLOUD_FILTER and IAVF_FLAG_AQ_DEL_CLOUD_FILTER bits. aq_required IAVF_FLAG_AQ_DEL_CLOUD_FILTER bit is being set in iavf_down and iavf_delete_clsflower, and are never cleared. aq_required IAVF_FLAG_AQ_ADD_CLOUD_FILTER bit is being set in iavf_handle_reset and iavf_configure_clsflower, and are never cleared. Since the aq_required is not zero, iavf_watchdog_task is setting the queue_delayed_work to 20 msec instead of the longer delay. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Jakub Pawlak authored
Cleanup of init state machine, move state specific code to separate functions and rewrite the iavf_init_task() function. Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Jan Sokolowski authored
Refactor the watchdog state machine implementation. Add the additional state __IAVF_COMM_FAILED to process the PF communication fails. Prepare the watchdog state machine to integrate with init state machine. Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Jakub Pawlak authored
Remove the watchdog timer, instead declare watchdog task as delayed work and use dedicated workqueue to service driver tasks. The dedicated driver workqueue iavf_wq is common for all driver instances. Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Jakub Pawlak authored
Move the commands processing outside the watchdog_task() function. This reduce length and complexity of the function which is mainly designed to process the watchdog state machine. Signed-off-by: Jakub Pawlak <jakub.pawlak@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Avinash Dayanand authored
There was a calculation error in virtchnl regarding the valid length which was fixed recently and a corresponding change needs to go into the code while we enable ADq. Signed-off-by: Avinash Dayanand <avinash.dayanand@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Aleksandr Loktionov authored
iavf_add_vlan() is being called in atomic context so kzalloc() needs GFP_ATOMIC. This patch fixes it. Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Mitch Williams authored
On some hardware/driver/architecture combinations, it may take longer than 200msec for all close operations to be completed, causing a spurious error message to be logged. Increase the timeout value to 500msec to avoid this erroneous error. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Mitch Williams authored
The counter variable in iavf_clean_tx_irq starts out negative and climbs to 0. So allocating it as u16 is actually a really bad idea that just happens to work because the value underflows and overflows consistently on most architectures. Replace the u16 with an int so signed math works as expected. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Akeem G Abodunrin authored
This patch changes how VLAN tag are being populated and programmed into the HW - Instead of start adding VF VLAN tag from the last member of the element list, start from the first member of the list, until number of allowed VLAN tags is exhausted in the HW. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Christian Brauner authored
When propagating mounts across mount namespaces owned by different user namespaces it is not possible anymore to move or umount the mount in the less privileged mount namespace. Here is a reproducer: sudo mount -t tmpfs tmpfs /mnt sudo --make-rshared /mnt # create unprivileged user + mount namespace and preserve propagation unshare -U -m --map-root --propagation=unchanged # now change back to the original mount namespace in another terminal: sudo mkdir /mnt/aaa sudo mount -t tmpfs tmpfs /mnt/aaa # now in the unprivileged user + mount namespace mount --move /mnt/aaa /opt Unfortunately, this is a pretty big deal for userspace since this is e.g. used to inject mounts into running unprivileged containers. So this regression really needs to go away rather quickly. The problem is that a recent change falsely locked the root of the newly added mounts by setting MNT_LOCKED. Fix this by only locking the mounts on copy_mnt_ns() and not when adding a new mount. Fixes: 3bd045cc ("separate copying and locking mount tree on cross-userns copies") Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> Tested-by: Christian Brauner <christian@brauner.io> Acked-by: Christian Brauner <christian@brauner.io> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Christian Brauner <christian@brauner.io> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-