An error occurred fetching the project authors.
- 05 Aug, 2023 1 commit
-
-
Paolo Abeni authored
Since the blamed commit, the MPTCP protocol unconditionally sends TCP resets on all the subflows on disconnect(). That fits full-blown MPTCP sockets - to implement the fastclose mechanism - but causes unexpected corruption of the data stream, caught as sporadic self-tests failures. Fixes: d21f8348 ("mptcp: use fastclose on more edge scenarios") Cc: stable@vger.kernel.org Tested-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/419Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Reviewed-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Link: https://lore.kernel.org/r/20230803-upstream-net-20230803-misc-fixes-6-5-v1-3-6671b1ab11cc@tessares.netSigned-off-by:
Jakub Kicinski <kuba@kernel.org>
-
- 04 Aug, 2023 1 commit
-
-
Xiang Yang authored
Coccicheck reports the error below: net/mptcp/protocol.c:3330:15-28: ERROR: test of a variable/field address Since the address of msk->cb_flags is used in __test_and_clear_bit, the address should not be NULL. The judgment for if (unlikely(msk->cb_flags)) will always be true, we should check the real value of msk->cb_flags here. Fixes: 65a569b0 ("mptcp: optimize release_cb for the common case") Signed-off-by:
Xiang Yang <xiangyang3@huawei.com> Reviewed-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Link: https://lore.kernel.org/r/20230803072438.1847500-1-xiangyang3@huawei.comSigned-off-by:
Jakub Kicinski <kuba@kernel.org>
-
- 26 Jul, 2023 1 commit
-
-
Paolo Abeni authored
Currently the mptcp code generate a "new listener" event even if the actual listen() syscall fails. Address the issue moving the event generation call under the successful branch. Cc: stable@vger.kernel.org Fixes: f8c9dfbd ("mptcp: add pm listener events") Reviewed-by:
Mat Martineau <martineau@kernel.org> Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Signed-off-by:
Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20230725-send-net-20230725-v1-2-6f60fe7137a9@kernel.orgSigned-off-by:
Jakub Kicinski <kuba@kernel.org>
-
- 05 Jul, 2023 2 commits
-
-
Paolo Abeni authored
Since the blamed commit, closing the first subflow resets the first subflow socket state to SS_UNCONNECTED. The current mptcp listen implementation relies only on such state to prevent touching not-fully-disconnected sockets. Incoming mptcp fastclose (or paired endpoint removal) unconditionally closes the first subflow. All the above allows an incoming fastclose followed by a listen() call to successfully race with a blocking recvmsg(), potentially causing the latter to hit a divide by zero bug in cleanup_rbuf/__tcp_select_window(). Address the issue explicitly checking the msk socket state in mptcp_listen(). An alternative solution would be moving the first subflow socket state update into mptcp_disconnect(), but in the long term the first subflow socket should be removed: better avoid relaying on it for internal consistency check. Fixes: b29fcfb5 ("mptcp: full disconnect implementation") Cc: stable@vger.kernel.org Reported-by:
Christoph Paasch <cpaasch@apple.com> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/414Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Reviewed-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Paolo Abeni authored
While tacking care of the mptcp-level listener I unintentionally moved the subflow level unhash after the subflow listener backlog cleanup. That could cause some nasty race and makes the code harder to read. Address the issue restoring the proper order of operations. Fixes: 57fc0f1c ("mptcp: ensure listener is unhashed before updating the sk status") Cc: stable@vger.kernel.org Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Reviewed-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 24 Jun, 2023 1 commit
-
-
David Howells authored
Remove ->sendpage() and ->sendpage_locked(). sendmsg() with MSG_SPLICE_PAGES should be used instead. This allows multiple pages and multipage folios to be passed through. Signed-off-by:
David Howells <dhowells@redhat.com> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for net/can cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> cc: linux-afs@lists.infradead.org cc: mptcp@lists.linux.dev cc: rds-devel@oss.oracle.com cc: tipc-discussion@lists.sourceforge.net cc: virtualization@lists.linux-foundation.org Link: https://lore.kernel.org/r/20230623225513.2732256-16-dhowells@redhat.comSigned-off-by:
Jakub Kicinski <kuba@kernel.org>
-
- 22 Jun, 2023 10 commits
-
-
Paolo Abeni authored
The MPTCP code always set the msk state to TCP_CLOSE before calling performing the fast-close. Move such state transition in mptcp_do_fastclose() to avoid some code duplication. Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Reviewed-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Jakub Kicinski <kuba@kernel.org>
-
Paolo Abeni authored
The user-space need to properly account the data received/sent by individual subflows. When additional subflows are created and/or closed during the MPTCP socket lifetime, the information currently exposed via MPTCP_TCPINFO are not enough: subflows are identified only by the sequential position inside the info dumps, and that will change with the above mentioned events. To solve the above problem, this patch introduces a new subflow identifier that is unique inside the given MPTCP socket scope. The initial subflow get the id 1 and the other subflows get incremental values at join time. Link: https://github.com/multipath-tcp/mptcp_net-next/issues/388Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Reviewed-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Jakub Kicinski <kuba@kernel.org>
-
Paolo Abeni authored
Currently there are no data transfer counters accounting for all the subflows used by a given MPTCP socket. The user-space can compute such figures aggregating the subflow info, but that is inaccurate if any subflow is closed before the MPTCP socket itself. Add the new counters in the MPTCP socket itself and expose them via the existing diag and sockopt. While touching mptcp_diag_fill_info(), acquire the relevant locks before fetching the msk data, to ensure better data consistency Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/385Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Reviewed-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Jakub Kicinski <kuba@kernel.org>
-
Paolo Abeni authored
That will avoid an unneeded conditional in both the fast-path and in the fallback case and will simplify a bit the next patch. Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Reviewed-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Jakub Kicinski <kuba@kernel.org>
-
Paolo Abeni authored
The MPTCP protocol access the listener subflow in a lockless manner in a couple of places (poll, diag). That works only if the msk itself leaves the listener status only after that the subflow itself has been closed/disconnected. Otherwise we risk deadlock in diag, as reported by Christoph. Address the issue ensuring that the first subflow (the listener one) is always disconnected before updating the msk socket status. Reported-by:
Christoph Paasch <cpaasch@apple.com> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/407 Fixes: b29fcfb5 ("mptcp: full disconnect implementation") Cc: stable@vger.kernel.org Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Reviewed-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Jakub Kicinski <kuba@kernel.org>
-
Paolo Abeni authored
Thanks to the previous patch -- "mptcp: consolidate fallback and non fallback state machine" -- we can finally drop the "temporary hack" used to detect rx eof. Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Reviewed-by:
Mat Martineau <martineau@kernel.org> Signed-off-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Jakub Kicinski <kuba@kernel.org>
-
Paolo Abeni authored
An orphaned msk releases the used resources via the worker, when the latter first see the msk in CLOSED status. If the msk status transitions to TCP_CLOSE in the release callback invoked by the worker's final release_sock(), such instance of the workqueue will not take any action. Additionally the MPTCP code prevents scheduling the worker once the socket reaches the CLOSE status: such msk resources will be leaked. The only code path that can trigger the above scenario is the __mptcp_check_send_data_fin() in fallback mode. Address the issue removing the special handling of fallback socket in __mptcp_check_send_data_fin(), consolidating the state machine for fallback and non fallback socket. Since non-fallback sockets do not send and do not receive data_fin, the mptcp code can update the msk internal status to match the next step in the SM every time data fin (ack) should be generated or received. As a consequence we can remove a bunch of checks for fallback from the fastpath. Fixes: 6e628cd3 ("mptcp: use mptcp release_cb for delayed tasks") Cc: stable@vger.kernel.org Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Reviewed-by:
Mat Martineau <martineau@kernel.org> Signed-off-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Jakub Kicinski <kuba@kernel.org>
-
Paolo Abeni authored
At passive MPJ time, if the msk socket lock is held by the user, the new subflow is appended to the msk->join_list under the msk data lock. In mptcp_release_cb()/__mptcp_flush_join_list(), the subflows in that list are moved from the join_list into the conn_list under the msk socket lock. Append and removal could race, possibly corrupting such list. Address the issue splicing the join list into a temporary one while still under the msk data lock. Found by code inspection, the race itself should be almost impossible to trigger in practice. Fixes: 3e501490 ("mptcp: cleanup MPJ subflow list handling") Cc: stable@vger.kernel.org Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Reviewed-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Jakub Kicinski <kuba@kernel.org>
-
Paolo Abeni authored
Christoph reported a divide by zero bug in mptcp_recvmsg(): divide error: 0000 [#1] PREEMPT SMP CPU: 1 PID: 19978 Comm: syz-executor.6 Not tainted 6.4.0-rc2-gffcc7899081b #20 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 RIP: 0010:__tcp_select_window+0x30e/0x420 net/ipv4/tcp_output.c:3018 Code: 11 ff 0f b7 cd c1 e9 0c b8 ff ff ff ff d3 e0 89 c1 f7 d1 01 cb 21 c3 eb 17 e8 2e 83 11 ff 31 db eb 0e e8 25 83 11 ff 89 d8 99 <f7> 7c 24 04 29 d3 65 48 8b 04 25 28 00 00 00 48 3b 44 24 10 75 60 RSP: 0018:ffffc90000a07a18 EFLAGS: 00010246 RAX: 000000000000ffd7 RBX: 000000000000ffd7 RCX: 0000000000040000 RDX: 0000000000000000 RSI: 000000000003ffff RDI: 0000000000040000 RBP: 000000000000ffd7 R08: ffffffff820cf297 R09: 0000000000000001 R10: 0000000000000000 R11: ffffffff8103d1a0 R12: 0000000000003f00 R13: 0000000000300000 R14: ffff888101cf3540 R15: 0000000000180000 FS: 00007f9af4c09640(0000) GS:ffff88813bd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b33824000 CR3: 000000012f241001 CR4: 0000000000170ee0 Call Trace: <TASK> __tcp_cleanup_rbuf+0x138/0x1d0 net/ipv4/tcp.c:1611 mptcp_recvmsg+0xcb8/0xdd0 net/mptcp/protocol.c:2034 inet_recvmsg+0x127/0x1f0 net/ipv4/af_inet.c:861 ____sys_recvmsg+0x269/0x2b0 net/socket.c:1019 ___sys_recvmsg+0xe6/0x260 net/socket.c:2764 do_recvmmsg+0x1a5/0x470 net/socket.c:2858 __do_sys_recvmmsg net/socket.c:2937 [inline] __se_sys_recvmmsg net/socket.c:2953 [inline] __x64_sys_recvmmsg+0xa6/0x130 net/socket.c:2953 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x47/0xa0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x72/0xdc RIP: 0033:0x7f9af58fc6a9 Code: 5c c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 4f 37 0d 00 f7 d8 64 89 01 48 RSP: 002b:00007f9af4c08cd8 EFLAGS: 00000246 ORIG_RAX: 000000000000012b RAX: ffffffffffffffda RBX: 00000000006bc050 RCX: 00007f9af58fc6a9 RDX: 0000000000000001 RSI: 0000000020000140 RDI: 0000000000000004 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000f00 R11: 0000000000000246 R12: 00000000006bc05c R13: fffffffffffffea8 R14: 00000000006bc050 R15: 000000000001fe40 </TASK> mptcp_recvmsg is allowed to release the msk socket lock when blocking, and before re-acquiring it another thread could have switched the sock to TCP_LISTEN status - with a prior connect(AF_UNSPEC) - also clearing icsk_ack.rcv_mss. Address the issue preventing the disconnect if some other process is concurrently performing a blocking syscall on the same socket, alike commit 4faeee0c ("tcp: deny tcp_disconnect() when threads are waiting"). Fixes: a6b118fe ("mptcp: add receive buffer auto-tuning") Cc: stable@vger.kernel.org Reported-by:
Christoph Paasch <cpaasch@apple.com> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/404Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Tested-by:
Christoph Paasch <cpaasch@apple.com> Reviewed-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Jakub Kicinski <kuba@kernel.org>
-
Paolo Abeni authored
Currently the mptcp code has assumes that disconnect() can fail only at mptcp_sendmsg_fastopen() time - to avoid a deadlock scenario - and don't even bother returning an error code. Soon mptcp_disconnect() will handle more error conditions: let's track them explicitly. As a bonus, explicitly annotate TCP-level disconnect as not failing: the mptcp code never blocks for event on the subflows. Fixes: 7d803344 ("mptcp: fix deadlock in fastopen error path") Cc: stable@vger.kernel.org Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Tested-by:
Christoph Paasch <cpaasch@apple.com> Reviewed-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Jakub Kicinski <kuba@kernel.org>
-
- 16 Jun, 2023 1 commit
-
-
Breno Leitao authored
Most of the ioctls to net protocols operates directly on userspace argument (arg). Usually doing get_user()/put_user() directly in the ioctl callback. This is not flexible, because it is hard to reuse these functions without passing userspace buffers. Change the "struct proto" ioctls to avoid touching userspace memory and operate on kernel buffers, i.e., all protocol's ioctl callbacks is adapted to operate on a kernel memory other than on userspace (so, no more {put,get}_user() and friends being called in the ioctl callback). This changes the "struct proto" ioctl format in the following way: int (*ioctl)(struct sock *sk, int cmd, - unsigned long arg); + int *karg); (Important to say that this patch does not touch the "struct proto_ops" protocols) So, the "karg" argument, which is passed to the ioctl callback, is a pointer allocated to kernel space memory (inside a function wrapper). This buffer (karg) may contain input argument (copied from userspace in a prep function) and it might return a value/buffer, which is copied back to userspace if necessary. There is not one-size-fits-all format (that is I am using 'may' above), but basically, there are three type of ioctls: 1) Do not read from userspace, returns a result to userspace 2) Read an input parameter from userspace, and does not return anything to userspace 3) Read an input from userspace, and return a buffer to userspace. The default case (1) (where no input parameter is given, and an "int" is returned to userspace) encompasses more than 90% of the cases, but there are two other exceptions. Here is a list of exceptions: * Protocol RAW: * cmd = SIOCGETVIFCNT: * input and output = struct sioc_vif_req * cmd = SIOCGETSGCNT * input and output = struct sioc_sg_req * Explanation: for the SIOCGETVIFCNT case, userspace passes the input argument, which is struct sioc_vif_req. Then the callback populates the struct, which is copied back to userspace. * Protocol RAW6: * cmd = SIOCGETMIFCNT_IN6 * input and output = struct sioc_mif_req6 * cmd = SIOCGETSGCNT_IN6 * input and output = struct sioc_sg_req6 * Protocol PHONET: * cmd == SIOCPNADDRESOURCE | SIOCPNDELRESOURCE * input int (4 bytes) * Nothing is copied back to userspace. For the exception cases, functions sock_sk_ioctl_inout() will copy the userspace input, and copy it back to kernel space. The wrapper that prepare the buffer and put the buffer back to user is sk_ioctl(), so, instead of calling sk->sk_prot->ioctl(), the callee now calls sk_ioctl(), which will handle all cases. Signed-off-by:
Breno Leitao <leitao@debian.org> Reviewed-by:
Willem de Bruijn <willemb@google.com> Reviewed-by:
David Ahern <dsahern@kernel.org> Reviewed-by:
Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20230609152800.830401-1-leitao@debian.orgSigned-off-by:
Jakub Kicinski <kuba@kernel.org>
-
- 01 Jun, 2023 6 commits
-
-
Paolo Abeni authored
Active subflow are inserted into the connection list at creation time. When the MPJ handshake completes successfully, a new subflow creation netlink event is generated correctly, but the current code wrongly avoid initializing a couple of subflow data. The above will cause misbehavior on a few exceptional events: unneeded mptcp-level retransmission on msk-level sequence wrap-around and infinite mapping fallback even when a MPJ socket is present. Address the issue factoring out the needed initialization in a new helper and invoking the latter from __mptcp_finish_join() time for passive subflow and from mptcp_finish_join() for active ones. Fixes: 0530020a ("mptcp: track and update contiguous data status") Cc: stable@vger.kernel.org Reviewed-by:
Mat Martineau <martineau@kernel.org> Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Signed-off-by:
Mat Martineau <martineau@kernel.org> Signed-off-by:
Jakub Kicinski <kuba@kernel.org>
-
Paolo Abeni authored
Christoph reported the mptcp variant of a recently addressed plain TCP issue. Similar to commit e14cadfd ("tcp: add annotations around sk->sk_shutdown accesses") add READ/WRITE ONCE annotations to silence KCSAN reports around lockless sk_shutdown access. Fixes: 71ba088c ("mptcp: cleanup accept and poll") Reported-by:
Christoph Paasch <cpaasch@apple.com> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/401Reviewed-by:
Mat Martineau <martineau@kernel.org> Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Signed-off-by:
Mat Martineau <martineau@kernel.org> Signed-off-by:
Jakub Kicinski <kuba@kernel.org>
-
Paolo Abeni authored
The first subflow socket is accessed outside the msk socket lock by mptcp_subflow_fail(), we need to annotate each write access with WRITE_ONCE, but a few spots still lacks it. Fixes: 76a13b31 ("mptcp: invoke MP_FAIL response when needed") Reviewed-by:
Mat Martineau <martineau@kernel.org> Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Signed-off-by:
Mat Martineau <martineau@kernel.org> Signed-off-by:
Jakub Kicinski <kuba@kernel.org>
-
Paolo Abeni authored
When the msk socket is cloned at MPC handshake time, a few fields are initialized in a racy way outside mptcp_sk_clone() and the msk socket lock. The above is due historical reasons: before commit a88d0092 ("mptcp: simplify subflow_syn_recv_sock()") as the first subflow socket carrying all the needed date was not available yet at msk creation time We can now refactor the code moving the missing initialization bit under the socket lock, removing the init race and avoiding some code duplication. This will also simplify the next patch, as all msk->first write access are now under the msk socket lock. Fixes: 0397c6d8 ("mptcp: keep unaccepted MPC subflow into join list") Reviewed-by:
Mat Martineau <martineau@kernel.org> Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Signed-off-by:
Mat Martineau <martineau@kernel.org> Signed-off-by:
Jakub Kicinski <kuba@kernel.org>
-
Paolo Abeni authored
The MPTCP can access the first subflow socket in a few spots outside the socket lock scope. That is actually safe, as MPTCP will delete the socket itself only after the msk sock close(). Still the such accesses causes a few KCSAN splats, as reported by Christoph. Silence the harmless warning adding a few annotation around the relevant accesses. Fixes: 71ba088c ("mptcp: cleanup accept and poll") Reported-by:
Christoph Paasch <cpaasch@apple.com> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/402Reviewed-by:
Mat Martineau <martineau@kernel.org> Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Signed-off-by:
Mat Martineau <martineau@kernel.org> Signed-off-by:
Jakub Kicinski <kuba@kernel.org>
-
Paolo Abeni authored
Ondrej reported a functional issue WRT timeout handling on connect with a nice reproducer. The problem is that the current mptcp connect waits for both the MPTCP socket level timeout, and the first subflow socket timeout. The latter is not influenced/touched by the exposed setsockopt(). Overall the above makes the SO_SNDTIMEO a no-op on connect. Since mptcp_connect is invoked via inet_stream_connect and the latter properly handle the MPTCP level timeout, we can address the issue making the nested subflow level connect always unblocking. This also allow simplifying a bit the code, dropping an ugly hack to handle the fastopen and custom proto_ops connect. The issues predates the blamed commit below, but the current resolution requires the infrastructure introduced there. Fixes: 54f1944e ("mptcp: factor out mptcp_connect()") Reported-by:
Ondrej Mosnacek <omosnace@redhat.com> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/399 Cc: stable@vger.kernel.org Reviewed-by:
Mat Martineau <martineau@kernel.org> Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Signed-off-by:
Mat Martineau <martineau@kernel.org> Signed-off-by:
Jakub Kicinski <kuba@kernel.org>
-
- 19 May, 2023 1 commit
-
-
Paolo Abeni authored
Rewrite the mptcp socket accept op, leveraging the new __inet_accept() helper. This way we can avoid acquiring the new socket lock twice and we can avoid a couple of indirect calls. Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Reviewed-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Mat Martineau <martineau@kernel.org> Signed-off-by:
Jakub Kicinski <kuba@kernel.org>
-
- 19 Apr, 2023 2 commits
-
-
Paolo Abeni authored
The mptcp worker and mptcp_accept() can race, as reported by Christoph: refcount_t: addition on 0; use-after-free. WARNING: CPU: 1 PID: 14351 at lib/refcount.c:25 refcount_warn_saturate+0x105/0x1b0 lib/refcount.c:25 Modules linked in: CPU: 1 PID: 14351 Comm: syz-executor.2 Not tainted 6.3.0-rc1-gde5e8fd0123c #11 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 RIP: 0010:refcount_warn_saturate+0x105/0x1b0 lib/refcount.c:25 Code: 02 31 ff 89 de e8 1b f0 a7 ff 84 db 0f 85 6e ff ff ff e8 3e f5 a7 ff 48 c7 c7 d8 c7 34 83 c6 05 6d 2d 0f 02 01 e8 cb 3d 90 ff <0f> 0b e9 4f ff ff ff e8 1f f5 a7 ff 0f b6 1d 54 2d 0f 02 31 ff 89 RSP: 0018:ffffc90000a47bf8 EFLAGS: 00010282 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: ffff88802eae98c0 RSI: ffffffff81097d4f RDI: 0000000000000001 RBP: ffff88802e712180 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000001 R11: ffff88802eaea148 R12: ffff88802e712100 R13: ffff88802e712a88 R14: ffff888005cb93a8 R15: ffff88802e712a88 FS: 0000000000000000(0000) GS:ffff88803ed00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f277fd89120 CR3: 0000000035486002 CR4: 0000000000370ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> __refcount_add include/linux/refcount.h:199 [inline] __refcount_inc include/linux/refcount.h:250 [inline] refcount_inc include/linux/refcount.h:267 [inline] sock_hold include/net/sock.h:775 [inline] __mptcp_close+0x4c6/0x4d0 net/mptcp/protocol.c:3051 mptcp_close+0x24/0xe0 net/mptcp/protocol.c:3072 inet_release+0x56/0xa0 net/ipv4/af_inet.c:429 __sock_release+0x51/0xf0 net/socket.c:653 sock_close+0x18/0x20 net/socket.c:1395 __fput+0x113/0x430 fs/file_table.c:321 task_work_run+0x96/0x100 kernel/task_work.c:179 exit_task_work include/linux/task_work.h:38 [inline] do_exit+0x4fc/0x10c0 kernel/exit.c:869 do_group_exit+0x51/0xf0 kernel/exit.c:1019 get_signal+0x12b0/0x1390 kernel/signal.c:2859 arch_do_signal_or_restart+0x25/0x260 arch/x86/kernel/signal.c:306 exit_to_user_mode_loop kernel/entry/common.c:168 [inline] exit_to_user_mode_prepare+0x131/0x1a0 kernel/entry/common.c:203 __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline] syscall_exit_to_user_mode+0x19/0x40 kernel/entry/common.c:296 do_syscall_64+0x46/0x90 arch/x86/entry/common.c:86 entry_SYSCALL_64_after_hwframe+0x72/0xdc RIP: 0033:0x7fec4b4926a9 Code: Unable to access opcode bytes at 0x7fec4b49267f. RSP: 002b:00007fec49f9dd78 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca RAX: fffffffffffffe00 RBX: 00000000006bc058 RCX: 00007fec4b4926a9 RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00000000006bc058 RBP: 00000000006bc050 R08: 00000000007df998 R09: 00000000007df998 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006bc05c R13: fffffffffffffea8 R14: 000000000000000b R15: 000000000001fe40 </TASK> The root cause is that the worker can force fallback to TCP the first mptcp subflow, actually deleting the unaccepted msk socket. We can explicitly prevent the race delaying the unaccepted msk deletion at listener shutdown time. In case the closed subflow is later accepted, just drop the mptcp context and let the user-space deal with the paired mptcp socket. Fixes: b6985b9b ("mptcp: use the workqueue to destroy unaccepted sockets") Cc: stable@vger.kernel.org Reported-by:
Christoph Paasch <cpaasch@apple.com> Link: https://github.com/multipath-tcp/mptcp_net-next/issues/375Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Reviewed-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Tested-by:
Christoph Paasch <cpaasch@apple.com> Signed-off-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Paolo Abeni authored
This is a partial revert of the blamed commit, with a relevant change: mptcp_subflow_queue_clean() now just change the msk socket status and stop the worker, so that the UaF issue addressed by the blamed commit is not re-introduced. The above prevents the mptcp worker from running concurrently with inet_csk_listen_stop(), as such race would trigger a warning, as reported by Christoph: RSP: 002b:00007f784fe09cd8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e WARNING: CPU: 0 PID: 25807 at net/ipv4/inet_connection_sock.c:1387 inet_csk_listen_stop+0x664/0x870 net/ipv4/inet_connection_sock.c:1387 RAX: ffffffffffffffda RBX: 00000000006bc050 RCX: 00007f7850afd6a9 RDX: 0000000000000000 RSI: 0000000020000340 RDI: 0000000000000004 Modules linked in: RBP: 0000000000000002 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006bc05c R13: fffffffffffffea8 R14: 00000000006bc050 R15: 000000000001fe40 </TASK> CPU: 0 PID: 25807 Comm: syz-executor.7 Not tainted 6.2.0-g778e54711659 #7 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 RIP: 0010:inet_csk_listen_stop+0x664/0x870 net/ipv4/inet_connection_sock.c:1387 RAX: 0000000000000000 RBX: ffff888100dfbd40 RCX: 0000000000000000 RDX: ffff8881363aab80 RSI: ffffffff81c494f4 RDI: 0000000000000005 RBP: ffff888126dad080 R08: 0000000000000005 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000000 R12: ffff888100dfe040 R13: 0000000000000001 R14: 0000000000000000 R15: ffff888100dfbdd8 FS: 00007f7850a2c800(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b32d26000 CR3: 000000012fdd8006 CR4: 0000000000770ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> __tcp_close+0x5b2/0x620 net/ipv4/tcp.c:2875 __mptcp_close_ssk+0x145/0x3d0 net/mptcp/protocol.c:2427 mptcp_destroy_common+0x8a/0x1c0 net/mptcp/protocol.c:3277 mptcp_destroy+0x41/0x60 net/mptcp/protocol.c:3304 __mptcp_destroy_sock+0x56/0x140 net/mptcp/protocol.c:2965 __mptcp_close+0x38f/0x4a0 net/mptcp/protocol.c:3057 mptcp_close+0x24/0xe0 net/mptcp/protocol.c:3072 inet_release+0x53/0xa0 net/ipv4/af_inet.c:429 __sock_release+0x4e/0xf0 net/socket.c:651 sock_close+0x15/0x20 net/socket.c:1393 __fput+0xff/0x420 fs/file_table.c:321 task_work_run+0x8b/0xe0 kernel/task_work.c:179 resume_user_mode_work include/linux/resume_user_mode.h:49 [inline] exit_to_user_mode_loop kernel/entry/common.c:171 [inline] exit_to_user_mode_prepare+0x113/0x120 kernel/entry/common.c:203 __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline] syscall_exit_to_user_mode+0x1d/0x40 kernel/entry/common.c:296 do_syscall_64+0x46/0x90 arch/x86/entry/common.c:86 entry_SYSCALL_64_after_hwframe+0x72/0xdc RIP: 0033:0x7f7850af70dc RAX: 0000000000000000 RBX: 0000000000000004 RCX: 00007f7850af70dc RDX: 00007f7850a2c800 RSI: 0000000000000002 RDI: 0000000000000003 RBP: 00000000006bd980 R08: 0000000000000000 R09: 00000000000018a0 R10: 00000000316338a4 R11: 0000000000000293 R12: 0000000000211e31 R13: 00000000006bc05c R14: 00007f785062c000 R15: 0000000000211af0 Fixes: 0a3f4f1f ("mptcp: fix UaF in listener shutdown") Cc: stable@vger.kernel.org Reported-by:
Christoph Paasch <cpaasch@apple.com> Link: https://github.com/multipath-tcp/mptcp_net-next/issues/371Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Reviewed-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 17 Apr, 2023 4 commits
-
-
Paolo Abeni authored
When cleaning up unaccepted mptcp socket still laying inside the listener queue at listener close time, such sockets will go through a regular close, waiting for a timeout before shutting down the subflows. There is no need to keep the kernel resources in use for such a possibly long time: short-circuit to fast-close. Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Reviewed-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Paolo Abeni authored
In the long run this will simplify the mptcp code and will allow for more consistent behavior. Move the first subflow allocation out of the sock->init ops into the __mptcp_nmpc_socket() helper. Since the first subflow creation can now happen after the first setsockopt() we additionally need to invoke mptcp_sockopt_sync() on it. Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Reviewed-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Paolo Abeni authored
So that we can avoid a bunch of check in fastpath. Additionally we can specialize such check according to the specific fastopen method - defer_connect vs MSG_FASTOPEN. The latter bits will simplify the next patches. Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Reviewed-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
Paolo Abeni authored
In a few spots, the mptcp code invokes the __mptcp_nmpc_socket() helper multiple times under the same socket lock scope. Additionally, in such places, the socket status ensures that there is no MP capable handshake running. Under the above condition we can replace the later __mptcp_nmpc_socket() helper invocation with direct access to the msk->subflow pointer and better document such access is not supposed to fail with WARN(). Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Reviewed-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 13 Apr, 2023 1 commit
-
-
Paolo Abeni authored
As reported by Christoph, the mptcp protocol can run the worker when the relevant msk socket is in an unexpected state: connect() // incoming reset + fastclose // the mptcp worker is scheduled mptcp_disconnect() // msk is now CLOSED listen() mptcp_worker() Leading to the following splat: divide error: 0000 [#1] PREEMPT SMP CPU: 1 PID: 21 Comm: kworker/1:0 Not tainted 6.3.0-rc1-gde5e8fd0123c #11 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014 Workqueue: events mptcp_worker RIP: 0010:__tcp_select_window+0x22c/0x4b0 net/ipv4/tcp_output.c:3018 RSP: 0018:ffffc900000b3c98 EFLAGS: 00010293 RAX: 000000000000ffd7 RBX: 000000000000ffd7 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffffffff8214ce97 RDI: 0000000000000004 RBP: 000000000000ffd7 R08: 0000000000000004 R09: 0000000000010000 R10: 000000000000ffd7 R11: ffff888005afa148 R12: 000000000000ffd7 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88803ed00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000405270 CR3: 000000003011e006 CR4: 0000000000370ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> tcp_select_window net/ipv4/tcp_output.c:262 [inline] __tcp_transmit_skb+0x356/0x1280 net/ipv4/tcp_output.c:1345 tcp_transmit_skb net/ipv4/tcp_output.c:1417 [inline] tcp_send_active_reset+0x13e/0x320 net/ipv4/tcp_output.c:3459 mptcp_check_fastclose net/mptcp/protocol.c:2530 [inline] mptcp_worker+0x6c7/0x800 net/mptcp/protocol.c:2705 process_one_work+0x3bd/0x950 kernel/workqueue.c:2390 worker_thread+0x5b/0x610 kernel/workqueue.c:2537 kthread+0x138/0x170 kernel/kthread.c:376 ret_from_fork+0x2c/0x50 arch/x86/entry/entry_64.S:308 </TASK> This change addresses the issue explicitly checking for bad states before running the mptcp worker. Fixes: e16163b6 ("mptcp: refactor shutdown and close") Cc: stable@vger.kernel.org Reported-by:
Christoph Paasch <cpaasch@apple.com> Link: https://github.com/multipath-tcp/mptcp_net-next/issues/374Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Reviewed-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Tested-by:
Christoph Paasch <cpaasch@apple.com> Signed-off-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Jakub Kicinski <kuba@kernel.org>
-
- 18 Mar, 2023 1 commit
-
-
Eric Dumazet authored
We can change mptcp_sk() to propagate its argument const qualifier, thanks to container_of_const(). We need to change few things to avoid build errors: mptcp_set_datafin_timeout() and mptcp_rtx_head() have to accept non-const sk pointers. @msk local variable in mptcp_pending_tail() must be const. Signed-off-by:
Eric Dumazet <edumazet@google.com> Cc: Matthieu Baerts <matthieu.baerts@tessares.net> Reviewed-by:
Simon Horman <simon.horman@corigine.com> Reviewed-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 17 Mar, 2023 1 commit
-
-
Eric Dumazet authored
mptcp_poll() reads sk->sk_err without socket lock held/owned. Add READ_ONCE() and WRITE_ONCE() to avoid load/store tearing. Signed-off-by:
Eric Dumazet <edumazet@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 11 Mar, 2023 3 commits
-
-
Paolo Abeni authored
As reported by Christoph after having refactored the passive socket initialization, the mptcp listener shutdown path is prone to an UaF issue. BUG: KASAN: use-after-free in _raw_spin_lock_bh+0x73/0xe0 Write of size 4 at addr ffff88810cb23098 by task syz-executor731/1266 CPU: 1 PID: 1266 Comm: syz-executor731 Not tainted 6.2.0-rc59af4eaa31c1f6c00c8f1e448ed99a45c66340dd5 #6 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x6e/0x91 print_report+0x16a/0x46f kasan_report+0xad/0x130 kasan_check_range+0x14a/0x1a0 _raw_spin_lock_bh+0x73/0xe0 subflow_error_report+0x6d/0x110 sk_error_report+0x3b/0x190 tcp_disconnect+0x138c/0x1aa0 inet_child_forget+0x6f/0x2e0 inet_csk_listen_stop+0x209/0x1060 __mptcp_close_ssk+0x52d/0x610 mptcp_destroy_common+0x165/0x640 mptcp_destroy+0x13/0x80 __mptcp_destroy_sock+0xe7/0x270 __mptcp_close+0x70e/0x9b0 mptcp_close+0x2b/0x150 inet_release+0xe9/0x1f0 __sock_release+0xd2/0x280 sock_close+0x15/0x20 __fput+0x252/0xa20 task_work_run+0x169/0x250 exit_to_user_mode_prepare+0x113/0x120 syscall_exit_to_user_mode+0x1d/0x40 do_syscall_64+0x48/0x90 entry_SYSCALL_64_after_hwframe+0x72/0xdc The msk grace period can legitly expire in between the last reference count dropped in mptcp_subflow_queue_clean() and the later eventual access in inet_csk_listen_stop() After the previous patch we don't need anymore special-casing msk listener socket cleanup: the mptcp worker will process each of the unaccepted msk sockets. Just drop the now unnecessary code. Please note this commit depends on the two parent ones: mptcp: refactor passive socket initialization mptcp: use the workqueue to destroy unaccepted sockets Fixes: 6aeed904 ("mptcp: fix race on unaccepted mptcp sockets") Cc: stable@vger.kernel.org Reported-and-tested-by:
Christoph Paasch <cpaasch@apple.com> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/346Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Reviewed-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Jakub Kicinski <kuba@kernel.org>
-
Paolo Abeni authored
Christoph reported a UaF at token lookup time after having refactored the passive socket initialization part: BUG: KASAN: use-after-free in __token_bucket_busy+0x253/0x260 Read of size 4 at addr ffff88810698d5b0 by task syz-executor653/3198 CPU: 1 PID: 3198 Comm: syz-executor653 Not tainted 6.2.0-rc59af4eaa31c1f6c00c8f1e448ed99a45c66340dd5 #6 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x6e/0x91 print_report+0x16a/0x46f kasan_report+0xad/0x130 __token_bucket_busy+0x253/0x260 mptcp_token_new_connect+0x13d/0x490 mptcp_connect+0x4ed/0x860 __inet_stream_connect+0x80e/0xd90 tcp_sendmsg_fastopen+0x3ce/0x710 mptcp_sendmsg+0xff1/0x1a20 inet_sendmsg+0x11d/0x140 __sys_sendto+0x405/0x490 __x64_sys_sendto+0xdc/0x1b0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x72/0xdc We need to properly clean-up all the paired MPTCP-level resources and be sure to release the msk last, even when the unaccepted subflow is destroyed by the TCP internals via inet_child_forget(). We can re-use the existing MPTCP_WORK_CLOSE_SUBFLOW infra, explicitly checking that for the critical scenario: the closed subflow is the MPC one, the msk is not accepted and eventually going through full cleanup. With such change, __mptcp_destroy_sock() is always called on msk sockets, even on accepted ones. We don't need anymore to transiently drop one sk reference at msk clone time. Please note this commit depends on the parent one: mptcp: refactor passive socket initialization Fixes: 58b09919 ("mptcp: create msk early") Cc: stable@vger.kernel.org Reported-and-tested-by:
Christoph Paasch <cpaasch@apple.com> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/347Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Reviewed-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Jakub Kicinski <kuba@kernel.org>
-
Paolo Abeni authored
After commit 30e51b92 ("mptcp: fix unreleased socket in accept queue") unaccepted msk sockets go throu complete shutdown, we don't need anymore to delay inserting the first subflow into the subflow lists. The reference counting deserve some extra care, as __mptcp_close() is unaware of the request socket linkage to the first subflow. Please note that this is more a refactoring than a fix but because this modification is needed to include other corrections, see the following commits. Then a Fixes tag has been added here to help the stable team. Fixes: 30e51b92 ("mptcp: fix unreleased socket in accept queue") Cc: stable@vger.kernel.org Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Reviewed-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Tested-by:
Christoph Paasch <cpaasch@apple.com> Signed-off-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Jakub Kicinski <kuba@kernel.org>
-
- 15 Feb, 2023 1 commit
-
-
Jason Xing authored
Commit e48c414e ("[INET]: Generalise the TCP sock ID lookup routines") commented out the definition of SOCK_REFCNT_DEBUG in 2005 and later another commit 463c84b9 ("[NET]: Introduce inet_connection_sock") removed it. Since we could track all of them through bpf and kprobe related tools and the feature could print loads of information which might not be that helpful even under a little bit pressure, the whole feature which has been inactive for many years is no longer supported. Link: https://lore.kernel.org/lkml/20230211065153.54116-1-kerneljasonxing@gmail.com/Suggested-by:
Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by:
Jason Xing <kernelxing@tencent.com> Reviewed-by:
Kuniyuki Iwashima <kuniyu@amazon.com> Acked-by:
Wenjia Zhang <wenjia@linux.ibm.com> Reviewed-by:
Eric Dumazet <edumazet@google.com> Acked-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 08 Feb, 2023 1 commit
-
-
Paolo Abeni authored
If the peer closes all the existing subflows for a given mptcp socket and later the application closes it, the current implementation let it survive until the timewait timeout expires. While the above is allowed by the protocol specification it consumes resources for almost no reason and additionally causes sporadic self-tests failures. Let's move the mptcp socket to the TCP_CLOSE state when there are no alive subflows at close time, so that the allocated resources will be freed immediately. Fixes: e16163b6 ("mptcp: refactor shutdown and close") Cc: stable@vger.kernel.org Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Reviewed-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
David S. Miller <davem@davemloft.net>
-
- 14 Jan, 2023 1 commit
-
-
Paolo Abeni authored
Let the caller specify the to-be-created subflow family. For a given MPTCP socket created with the AF_INET6 family, the current userspace PM can already ask the kernel to create subflows in v4 and v6. If "plain" IPv4 addresses are passed to the kernel, they are automatically mapped in v6 addresses "by accident". This can be problematic because the userspace will need to pass different addresses, now the v4-mapped-v6 addresses to destroy this new subflow. On the other hand, if the MPTCP socket has been created with the AF_INET family, the command to create a subflow in v6 will be accepted but the result will not be the one as expected as new subflow will be created in IPv4 using part of the v6 addresses passed to the kernel: not creating the expected subflow then. No functional change intended for the in-kernel PM where an explicit enforcement is currently in place. This arbitrary enforcement will be leveraged by other patches in a future version. Fixes: 702c2f64 ("mptcp: netlink: allow userspace-driven subflow establishment") Cc: stable@vger.kernel.org Co-developed-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Paolo Abeni <pabeni@redhat.com> Reviewed-by:
Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by:
Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by:
Jakub Kicinski <kuba@kernel.org>
-
- 09 Jan, 2023 1 commit
-
-
Menglong Dong authored
Do the statistics of mptcp socket in use with sock_prot_inuse_add(). Therefore, we can get the count of used mptcp socket from /proc/net/protocols: & cat /proc/net/protocols protocol size sockets memory press maxhdr slab module cl co di ac io in de sh ss gs se re sp bi br ha uh gp em MPTCPv6 2048 0 0 no 0 yes kernel y n y y y y y y y y y y n n n y y y n MPTCP 1896 1 0 no 0 yes kernel y n y y y y y y y y y y n n n y y y n Acked-by:
Paolo Abeni <pabeni@redhat.com> Signed-off-by:
Menglong Dong <imagedong@tencent.com> Signed-off-by:
Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by:
David S. Miller <davem@davemloft.net>
-