- 05 Jul, 2014 4 commits
-
-
Jason Low authored
The mutex_trylock() function calls into __mutex_trylock_fastpath() when trying to obtain the mutex. On 32 bit x86, in the !__HAVE_ARCH_CMPXCHG case, __mutex_trylock_fastpath() calls directly into __mutex_trylock_slowpath() regardless of whether or not the mutex is locked. In __mutex_trylock_slowpath(), we then acquire the wait_lock spinlock, xchg() lock->count with -1, then set lock->count back to 0 if there are no waiters, and return true if the prev lock count was 1. However, if the mutex is already locked, then there isn't much point in attempting all of the above expensive operations. In this patch, we only attempt the above trylock operations if the mutex is unlocked. Signed-off-by: Jason Low <jason.low2@hp.com> Reviewed-by: Davidlohr Bueso <davidlohr@hp.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: akpm@linux-foundation.org Cc: tim.c.chen@linux.intel.com Cc: paulmck@linux.vnet.ibm.com Cc: rostedt@goodmis.org Cc: Waiman.Long@hp.com Cc: scott.norton@hp.com Cc: aswin@hp.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1402511843-4721-5-git-send-email-jason.low2@hp.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Jason Low authored
Upon entering the slowpath in __mutex_lock_common(), we try once more to acquire the mutex. We only try to acquire if (lock->count >= 0). However, what we actually want here is to try to acquire if the mutex is unlocked (lock->count == 1). This patch changes it so that we only try-acquire the mutex upon entering the slowpath if it is unlocked, rather than if the lock count is non-negative. This helps further reduce unnecessary atomic xchg() operations. Furthermore, this patch uses !mutex_is_locked(lock) to do the initial checks for if the lock is free rather than directly calling atomic_read() on the lock->count, in order to improve readability. Signed-off-by: Jason Low <jason.low2@hp.com> Acked-by: Waiman Long <Waiman.Long@hp.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: akpm@linux-foundation.org Cc: tim.c.chen@linux.intel.com Cc: paulmck@linux.vnet.ibm.com Cc: rostedt@goodmis.org Cc: davidlohr@hp.com Cc: scott.norton@hp.com Cc: aswin@hp.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1402511843-4721-4-git-send-email-jason.low2@hp.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Jason Low authored
MUTEX_SHOW_NO_WAITER() is a macro which checks for if there are "no waiters" on a mutex by checking if the lock count is non-negative. Based on feedback from the discussion in the earlier version of this patchset, the macro is not very readable. Furthermore, checking lock->count isn't always the correct way to determine if there are "no waiters" on a mutex. For example, a negative count on a mutex really only means that there "potentially" are waiters. Likewise, there can be waiters on the mutex even if the count is non-negative. Thus, "MUTEX_SHOW_NO_WAITER" doesn't always do what the name of the macro suggests. So this patch deletes the MUTEX_SHOW_NO_WAITERS() macro, directly use atomic_read() instead of the macro, and adds comments which elaborate on how the extra atomic_read() checks can help reduce unnecessary xchg() operations. Signed-off-by: Jason Low <jason.low2@hp.com> Acked-by: Waiman Long <Waiman.Long@hp.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: akpm@linux-foundation.org Cc: tim.c.chen@linux.intel.com Cc: paulmck@linux.vnet.ibm.com Cc: rostedt@goodmis.org Cc: davidlohr@hp.com Cc: scott.norton@hp.com Cc: aswin@hp.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1402511843-4721-3-git-send-email-jason.low2@hp.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Jason Low authored
The mutex optimistic spinning documentation states that we spin for acquisition when we find that there are no pending waiters. However, in actuality, whether or not there are waiters for the mutex doesn't determine if we will spin for it. This patch removes that statement and also adds a comment which mentions that we spin for the mutex while we don't need to reschedule. Signed-off-by: Jason Low <jason.low2@hp.com> Acked-by: Davidlohr Bueso <davidlohr@hp.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: akpm@linux-foundation.org Cc: tim.c.chen@linux.intel.com Cc: paulmck@linux.vnet.ibm.com Cc: rostedt@goodmis.org Cc: Waiman.Long@hp.com Cc: scott.norton@hp.com Cc: aswin@hp.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1402511843-4721-2-git-send-email-jason.low2@hp.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
- 21 Jun, 2014 16 commits
-
-
Thomas Gleixner authored
It has been broken for quite some time. Just the recent updates made it compile time broken. Make it depend on BROKEN instead of removing it right away as we want a proper replacement. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Thomas Gleixner authored
futex_lock_pi_atomic() is a maze of retry hoops and loops. Reduce it to simple and understandable states: First step is to lookup existing waiters (state) in the kernel. If there is an existing waiter, validate it and attach to it. If there is no existing waiter, check the user space value If the TID encoded in the user space value is 0, take over the futex preserving the owner died bit. If the TID encoded in the user space value is != 0, lookup the owner task, validate it and attach to it. Reduces text size by 128 bytes on x8664. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Davidlohr Bueso <davidlohr@hp.com> Cc: Kees Cook <kees@outflux.net> Cc: wad@chromium.org Cc: Darren Hart <darren@dvhart.com> Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1406131137020.5170@nanosSigned-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Thomas Gleixner authored
We want to be a bit more clever in futex_lock_pi_atomic() and separate the possible states. Split out the code which attaches the first waiter to the owner into a separate function. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Darren Hart <darren@dvhart.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Davidlohr Bueso <davidlohr@hp.com> Cc: Kees Cook <kees@outflux.net> Cc: wad@chromium.org Link: http://lkml.kernel.org/r/20140611204237.271300614@linutronix.deSigned-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Thomas Gleixner authored
We want to be a bit more clever in futex_lock_pi_atomic() and separate the possible states. Split out the waiter verification into a separate function. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Darren Hart <darren@dvhart.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Davidlohr Bueso <davidlohr@hp.com> Cc: Kees Cook <kees@outflux.net> Cc: wad@chromium.org Link: http://lkml.kernel.org/r/20140611204237.180458410@linutronix.deSigned-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Thomas Gleixner authored
No point in open coding the same function again. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Darren Hart <darren@dvhart.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Davidlohr Bueso <davidlohr@hp.com> Cc: Kees Cook <kees@outflux.net> Cc: wad@chromium.org Link: http://lkml.kernel.org/r/20140611204237.092947239@linutronix.deSigned-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Thomas Gleixner authored
The kernel tries to atomically unlock the futex without checking whether there is kernel state associated to the futex. So if user space manipulated the user space value, this will leave kernel internal state around associated to the owner task. For robustness sake, lookup first whether there are waiters on the futex. If there are waiters, wake the top priority waiter with all the proper sanity checks applied. If there are no waiters, do the atomic release. We do not have to preserve the waiters bit in this case, because a potentially incoming waiter is blocked on the hb->lock and will acquire the futex atomically. We neither have to preserve the owner died bit. The caller is the owner and it was supposed to cleanup the mess. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Darren Hart <darren@dvhart.com> Cc: Davidlohr Bueso <davidlohr@hp.com> Cc: Kees Cook <kees@outflux.net> Cc: wad@chromium.org Link: http://lkml.kernel.org/r/20140611204237.016987332@linutronix.deSigned-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Thomas Gleixner authored
In case the dead lock detector is enabled we follow the lock chain to the end in rt_mutex_adjust_prio_chain, even if we could stop earlier due to the priority/waiter constellation. But once we are no longer the top priority waiter in a certain step or the task holding the lock has already the same priority then there is no point in dequeing and enqueing along the lock chain as there is no change at all. So stop the queueing at this point. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Link: http://lkml.kernel.org/r/20140522031950.280830190@linutronix.deSigned-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Thomas Gleixner authored
The conditions under which deadlock detection is conducted are unclear and undocumented. Add constants instead of using 0/1 and provide a selection function which hides the additional debug dependency from the calling code. Add comments where needed. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Link: http://lkml.kernel.org/r/20140522031949.947264874@linutronix.deSigned-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Thomas Gleixner authored
The deadlock logic is only required for futexes. Remove the extra arguments for the public functions and also for the futex specific ones which get always called with deadlock detection enabled. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
-
Thomas Gleixner authored
Exit right away, when the removed waiter was not the top priority waiter on the lock. Get rid of the extra indent level. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
-
Thomas Gleixner authored
Add commentry to document the chain walk and the protection mechanisms and their scope. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
-
Thomas Gleixner authored
Add a separate local variable for the boost/deboost logic to make the code more readable. Add comments where appropriate. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
-
Thomas Gleixner authored
There is no point to keep the task ref across the check for lock owner. Drop the ref before that, so the protection context is clear. Found while documenting the chain walk. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
-
Thomas Gleixner authored
The current implementation of try_to_take_rtmutex() is correct, but requires more than a single brain twist to understand the clever encoded conditionals. Untangle it and document the cases proper. Looks less efficient at the first glance, but actually reduces the binary code size on x8664 by 80 bytes. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
-
Thomas Gleixner authored
Oleg noticed that rtmutex_slowtrylock() has a pointless check for rt_mutex_owner(lock) != current. To avoid calling try_to_take_rtmutex() we really want to check whether the lock has an owner at all or whether the trylock failed because the owner is NULL, but the RT_MUTEX_HAS_WAITERS bit is set. This covers the lock is owned by caller situation as well. We can actually do this check lockless. trylock is taking a chance whether we take lock->wait_lock to do the check or not. Add comments to the function while at it. Reported-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
-
Thomas Gleixner authored
Reason: Required to add more rtmutex robustness changes on top of those already in mainline. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
- 18 Jun, 2014 2 commits
-
-
Trond Myklebust authored
This commit reverts the addition of lockdep checking to raw_seqcount_begin for the following reasons: 1) It violates the naming convention that raw_* functions should not do lockdep checks (a convention that is also followed by the other raw_*_seqcount_begin functions). 2) raw_seqcount_begin does not spin, so it can only be part of an ABBA deadlock in very special circumstances (for instance if a lock is held across the entire raw_seqcount_begin()+read_seqcount_retry() loop while also being taken inside the write_seqcount protected area). 3) It is causing false positives with some existing callers, and there is no non-lockdep alternative for those callers to use. None of the three existing callers (__d_lookup_rcu, netdev_get_name, and the NFS state code) appear to use the function in a manner that is ABBA deadlock prone. Fixes: 1ca7d67c: seqcount: Add lockdep functionality to seqcount/seqlock Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: John Stultz <john.stultz@linaro.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Waiman Long <Waiman.Long@hp.com> Cc: Stephen Boyd <sboyd@codeaurora.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/CAHQdGtRR6SvEhXiqWo24hoUh9AU9cL82Z8Z-d8-7u951F_d+5g@mail.gmail.comSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
Peter Zijlstra authored
Paul reported that X86_OOSTORE is dead, yay! Update a comment and remove a newly added reference. Reported-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Dave Jones <davej@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Waiman Long <Waiman.Long@hp.com> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/20140611090509.GC3588@twins.programming.kicks-ass.netSigned-off-by: Ingo Molnar <mingo@kernel.org>
-
- 17 Jun, 2014 1 commit
-
-
Konstantin Khlebnikov authored
This fixes use-after-free of epi->fllink.next inside list loop macro. This loop actually releases elements in the body. The list is rcu-protected but here we cannot hold rcu_read_lock because we need to lock mutex inside. The obvious solution is to use list_for_each_entry_safe(). RCU-ness isn't essential because nobody can change this list under us, it's final fput for this file. The bug was introduced by ae10b2b4 ("epoll: optimize EPOLL_CTL_DEL using rcu") Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com> Reported-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Stable <stable@vger.kernel.org> # 3.13+ Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Jason Baron <jbaron@akamai.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 16 Jun, 2014 6 commits
-
-
Benjamin Herrenschmidt authored
This reverts commit e1edf18b. This patch was a misguided attempt at fixing offb for LE ppc64 kernels on BE qemu but is just wrong ... it breaks real LE/LE setups, LE with real HW, and existing mixed endian systems that did the fight thing with the appropriate device-tree property. Bad reviewing on my part, sorry. The right fix is to either make qemu change its endian when the guest changes endian (working on that) or to use the existing foreign endian support. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: <stable@vger.kernel.org> [v3.13+] ---
-
Thomas Gleixner authored
When the rtmutex fast path is enabled the slow unlock function can create the following situation: spin_lock(foo->m->wait_lock); foo->m->owner = NULL; rt_mutex_lock(foo->m); <-- fast path free = atomic_dec_and_test(foo->refcnt); rt_mutex_unlock(foo->m); <-- fast path if (free) kfree(foo); spin_unlock(foo->m->wait_lock); <--- Use after free. Plug the race by changing the slow unlock to the following scheme: while (!rt_mutex_has_waiters(m)) { /* Clear the waiters bit in m->owner */ clear_rt_mutex_waiters(m); owner = rt_mutex_owner(m); spin_unlock(m->wait_lock); if (cmpxchg(m->owner, owner, 0) == owner) return; spin_lock(m->wait_lock); } So in case of a new waiter incoming while the owner tries the slow path unlock we have two situations: unlock(wait_lock); lock(wait_lock); cmpxchg(p, owner, 0) == owner mark_rt_mutex_waiters(lock); acquire(lock); Or: unlock(wait_lock); lock(wait_lock); mark_rt_mutex_waiters(lock); cmpxchg(p, owner, 0) != owner enqueue_waiter(); unlock(wait_lock); lock(wait_lock); wakeup_next waiter(); unlock(wait_lock); lock(wait_lock); acquire(lock); If the fast path is disabled, then the simple m->owner = NULL; unlock(m->wait_lock); is sufficient as all access to m->owner is serialized via m->wait_lock; Also document and clarify the wakeup_next_waiter function as suggested by Oleg Nesterov. Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20140611183852.937945560@linutronix.de Cc: stable@vger.kernel.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds authored
Pull networking fixes from David Miller: 1) Fix checksumming regressions, from Tom Herbert. 2) Undo unintentional permissions changes for SCTP rto_alpha and rto_beta sysfs knobs, from Denial Borkmann. 3) VXLAN, like other IP tunnels, should advertize it's encapsulation size using dev->needed_headroom instead of dev->hard_header_len. From Cong Wang. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: net: sctp: fix permissions for rto_alpha and rto_beta knobs vxlan: Checksum fixes net: add skb_pop_rcv_encapsulation udp: call __skb_checksum_complete when doing full checksum net: Fix save software checksum complete net: Fix GSO constants to match NETIF flags udp: ipv4: do not waste time in __udp4_lib_mcast_demux_lookup vxlan: use dev->needed_headroom instead of dev->hard_header_len MAINTAINERS: update cxgb4 maintainer
-
git://git.linaro.org/people/mike.turquette/linuxLinus Torvalds authored
Pull more clock framework updates from Mike Turquette: "This contains the second half the of the clk changes for 3.16. They are simply fixes and code refactoring for the OMAP clock drivers. The sunxi clock driver changes include splitting out the one mega-driver into several smaller pieces and adding support for the A31 SoC clocks" * tag 'clk-for-linus-3.16-part2' of git://git.linaro.org/people/mike.turquette/linux: (25 commits) clk: sunxi: document PRCM clock compatible strings clk: sunxi: add PRCM (Power/Reset/Clock Management) clks support clk: sun6i: Protect SDRAM gating bit clk: sun6i: Protect CPU clock clk: sunxi: Rework clock protection code clk: sunxi: Move the GMAC clock to a file of its own clk: sunxi: Move the 24M oscillator to a file of its own clk: sunxi: Remove calls to clk_put clk: sunxi: document new A31 USB clock compatible clk: sunxi: Implement A31 USB clock ARM: dts: OMAP5/DRA7: use omap5-mpu-dpll-clock capable of dealing with higher frequencies CLK: TI: dpll: support OMAP5 MPU DPLL that need special handling for higher frequencies ARM: OMAP5+: dpll: support Duty Cycle Correction(DCC) CLK: TI: clk-54xx: Set the rate for dpll_abe_m2x2_ck CLK: TI: Driver for DRA7 ATL (Audio Tracking Logic) dt:/bindings: DRA7 ATL (Audio Tracking Logic) clock bindings ARM: dts: dra7xx-clocks: Correct name for atl clkin3 clock CLK: TI: gate: add composite interface clock to OMAP2 only build ARM: OMAP2: clock: add DT boot support for cpufreq_ck CLK: TI: OMAP2: add clock init support ...
-
git://git.infradead.org/users/willy/linux-nvmeLinus Torvalds authored
Pull NVMe update from Matthew Wilcox: "Mostly bugfixes again for the NVMe driver. I'd like to call out the exported tracepoint in the block layer; I believe Keith has cleared this with Jens. We've had a few reports from people who're really pounding on NVMe devices at scale, hence the timeout changes (and new module parameters), hotplug cpu deadlock, tracepoints, and minor performance tweaks" [ Jens hadn't seen that tracepoint thing, but is ok with it - it will end up going away when mq conversion happens ] * git://git.infradead.org/users/willy/linux-nvme: (22 commits) NVMe: Fix START_STOP_UNIT Scsi->NVMe translation. NVMe: Use Log Page constants in SCSI emulation NVMe: Define Log Page constants NVMe: Fix hot cpu notification dead lock NVMe: Rename io_timeout to nvme_io_timeout NVMe: Use last bytes of f/w rev SCSI Inquiry NVMe: Adhere to request queue block accounting enable/disable NVMe: Fix nvme get/put queue semantics NVMe: Delete NVME_GET_FEAT_TEMP_THRESH NVMe: Make admin timeout a module parameter NVMe: Make iod bio timeout a parameter NVMe: Prevent possible NULL pointer dereference NVMe: Fix the buffer size passed in GetLogPage(CDW10.NUMD) NVMe: Update data structures for NVMe 1.2 NVMe: Enable BUILD_BUG_ON checks NVMe: Update namespace and controller identify structures to the 1.1a spec NVMe: Flush with data support NVMe: Configure support for block flush NVMe: Add tracepoints NVMe: Protect against badly formatted CQEs ...
-
- 15 Jun, 2014 11 commits
-
-
Daniel Borkmann authored
Commit 3fd091e7 ("[SCTP]: Remove multiple levels of msecs to jiffies conversions.") has silently changed permissions for rto_alpha and rto_beta knobs from 0644 to 0444. The purpose of this was to discourage users from tweaking rto_alpha and rto_beta knobs in production environments since they are key to correctly compute rtt/srtt. RFC4960 under section 6.3.1. RTO Calculation says regarding rto_alpha and rto_beta under rule C3 and C4: [...] C3) When a new RTT measurement R' is made, set RTTVAR <- (1 - RTO.Beta) * RTTVAR + RTO.Beta * |SRTT - R'| and SRTT <- (1 - RTO.Alpha) * SRTT + RTO.Alpha * R' Note: The value of SRTT used in the update to RTTVAR is its value before updating SRTT itself using the second assignment. After the computation, update RTO <- SRTT + 4 * RTTVAR. C4) When data is in flight and when allowed by rule C5 below, a new RTT measurement MUST be made each round trip. Furthermore, new RTT measurements SHOULD be made no more than once per round trip for a given destination transport address. There are two reasons for this recommendation: First, it appears that measuring more frequently often does not in practice yield any significant benefit [ALLMAN99]; second, if measurements are made more often, then the values of RTO.Alpha and RTO.Beta in rule C3 above should be adjusted so that SRTT and RTTVAR still adjust to changes at roughly the same rate (in terms of how many round trips it takes them to reflect new values) as they would if making only one measurement per round-trip and using RTO.Alpha and RTO.Beta as given in rule C3. However, the exact nature of these adjustments remains a research issue. [...] While it is discouraged to adjust rto_alpha and rto_beta and not further specified how to adjust them, the RFC also doesn't explicitly forbid it, but rather gives a RECOMMENDED default value (rto_alpha=3, rto_beta=2). We have a couple of users relying on the old permissions before they got changed. That said, if someone really has the urge to adjust them, we could allow it with a warning in the log. Fixes: 3fd091e7 ("[SCTP]: Remove multiple levels of msecs to jiffies conversions.") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Tom Herbert says: ==================== Fixes related to some recent checksum modifications. - Fix GSO constants to match NETIF flags - Fix logic in saving checksum complete in __skb_checksum_complete - Call __skb_checksum_complete from UDP if we are checksumming over whole packet in order to save checksum. - Fixes to VXLAN to work correctly with checksum complete ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tom Herbert authored
Call skb_pop_rcv_encapsulation and postpull_rcsum for the Ethernet header to work properly with checksum complete. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tom Herbert authored
This function is used by UDP encapsulation protocols in RX when crossing encapsulation boundary. If ip_summed is set to CHECKSUM_UNNECESSARY and encapsulation is not set, change to CHECKSUM_NONE since the checksum has not been validated within the encapsulation. Clears csum_valid by the same rationale. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tom Herbert authored
In __udp_lib_checksum_complete check if checksum is being done over all the data (len is equal to skb->len) and if it is call __skb_checksum_complete instead of __skb_checksum_complete_head. This allows checksum to be saved in checksum complete. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tom Herbert authored
Geert reported issues regarding checksum complete and UDP. The logic introduced in commit 7e3cead5 ("net: Save software checksum complete") is not correct. This patch: 1) Restores code in __skb_checksum_complete_header except for setting CHECKSUM_UNNECESSARY. This function may be calculating checksum on something less than skb->len. 2) Adds saving checksum to __skb_checksum_complete. The full packet checksum 0..skb->len is calculated without adding in pseudo header. This value is saved in skb->csum and then the pseudo header is added to that to derive the checksum for validation. 3) In both __skb_checksum_complete_header and __skb_checksum_complete, set skb->csum_valid to whether checksum of zero was computed. This allows skb_csum_unnecessary to return true without changing to CHECKSUM_UNNECESSARY which was done previously. 4) Copy new csum related bits in __copy_skb_header. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tom Herbert authored
Joseph Gasparakis reported that VXLAN GSO offload stopped working with i40e device after recent UDP changes. The problem is that the SKB_GSO_* bits are out of sync with the corresponding NETIF flags. This patch fixes that. Also, we add BUILD_BUG_ONs in net_gso_ok for several GSO constants that were missing to avoid the problem in the future. Reported-by: Joseph Gasparakis <joseph.gasparakis@intel.com> Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds authored
Pull more SCSI updates from James Bottomley: "This is just a couple of drivers (hpsa and lpfc) that got left out for further testing in linux-next. We also have one fix to a prior submission (qla2xxx sparse)" * tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (36 commits) qla2xxx: fix sparse warnings introduced by previous target mode t10-dif patch lpfc: Update lpfc version to driver version 10.2.8001.0 lpfc: Fix ExpressLane priority setup lpfc: mark old devices as obsolete lpfc: Fix for initializing RRQ bitmap lpfc: Fix for cleaning up stale ring flag and sp_queue_event entries lpfc: Update lpfc version to driver version 10.2.8000.0 lpfc: Update Copyright on changed files from 8.3.45 patches lpfc: Update Copyright on changed files lpfc: Fixed locking for scsi task management commands lpfc: Convert runtime references to old xlane cfg param to fof cfg param lpfc: Fix FW dump using sysfs lpfc: Fix SLI4 s abort loop to process all FCP rings and under ring_lock lpfc: Fixed kernel panic in lpfc_abort_handler lpfc: Fix locking for postbufq when freeing lpfc: Fix locking for lpfc_hba_down_post lpfc: Fix dynamic transitions of FirstBurst from on to off hpsa: fix handling of hpsa_volume_offline return value hpsa: return -ENOMEM not -1 on kzalloc failure in hpsa_get_device_id hpsa: remove messages about volume status VPD inquiry page not supported ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfsLinus Torvalds authored
Pull more btrfs updates from Chris Mason: "This has a few fixes since our last pull and a new ioctl for doing btree searches from userland. It's very similar to the existing ioctl, but lets us return larger items back down to the app" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: btrfs: fix error handling in create_pending_snapshot btrfs: fix use of uninit "ret" in end_extent_writepage() btrfs: free ulist in qgroup_shared_accounting() error path Btrfs: fix qgroups sanity test crash or hang btrfs: prevent RCU warning when dereferencing radix tree slot Btrfs: fix unfinished readahead thread for raid5/6 degraded mounting btrfs: new ioctl TREE_SEARCH_V2 btrfs: tree_search, search_ioctl: direct copy to userspace btrfs: new function read_extent_buffer_to_user btrfs: tree_search, copy_to_sk: return needed size on EOVERFLOW btrfs: tree_search, copy_to_sk: return EOVERFLOW for too small buffer btrfs: tree_search, search_ioctl: accept varying buffer btrfs: tree_search: eliminate redundant nr_items check
-
git://git.kvack.org/~bcrl/aio-nextLinus Torvalds authored
Pull aio fix and cleanups from Ben LaHaise: "This consists of a couple of code cleanups plus a minor bug fix" * git://git.kvack.org/~bcrl/aio-next: aio: cleanup: flatten kill_ioctx() aio: report error from io_destroy() when threads race in io_destroy() fs/aio.c: Remove ctx parameter in kiocb_cancel
-
Al Viro authored
Tetsuo Handa wrote: "Commit 62a8067a ("bio_vec-backed iov_iter") introduced an unnamed union inside a struct which gcc-4.4.7 cannot handle. Name the unnamed union as u in order to fix build failure" Let's do this instead: there is only one place in the entire tree that steps into this breakage. Anon structs and unions work in older gcc versions; as the matter of fact, we have those in the tree - see e.g. struct ieee80211_tx_info in include/net/mac80211.h What doesn't work is handling their initializers: struct { int a; union { int b; char c; }; } x[2] = {{.a = 1, .c = 'a'}, {.a = 0, .b = 1}}; is the obvious syntax for initializer, perfectly fine for C11 and handled correctly by gcc-4.7 or later. Earlier versions, though, break on it - declaration is fine and so's access to fields (i.e. x[0].c = 'a'; would produce the right code), but members of the anon structs and unions are not inserted into the right namespace. Tellingly, those older versions will not barf on struct {int a; struct {int a;};}; - looks like they just have it hacked up somewhere around the handling of . and -> instead of doing the right thing. The easiest way to deal with that crap is to turn initialization of those fields (in the only place where we have such initializer of iov_iter) into plain assignment. Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-