1. 22 Oct, 2015 25 commits
  2. 03 Oct, 2015 15 commits
    • Linux 4.1.10 · 27f1b7fe
      Greg Kroah-Hartman authored
    • hp-wmi: limit hotkey enable · 85695fdc
      Kyle Evans authored
      commit 8a1513b4 upstream.
      
      Do not write initialize magic on systems that do not have
      feature query 0xb. Fixes Bug #82451.
      
      Redefine FEATURE_QUERY to align with 0xb and FEATURE2 with 0xd
      for code clarity.
      
      Add a new test function, hp_wmi_bios_2008_later() & simplify
      hp_wmi_bios_2009_later(), which fixes a bug in cases where
      an improper value is returned. Probably also fixes Bug #69131.
      
      Add missing __init tag.
      Signed-off-by: Kyle Evans <kvans32@gmail.com>
      Signed-off-by: Darren Hart <dvhart@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • zram: fix possible use after free in zcomp_create() · 6e3105d5
      Luis Henriques authored
      commit 3aaf14da upstream.
      
      zcomp_create() verifies the success of zcomp_strm_{multi,single}_create()
      through comp->stream, which can potentially be pointing to memory that
      was freed if these functions returned an error.
      
      While at it, replace a 'ERR_PTR(-ENOMEM)' by a more generic
      'ERR_PTR(error)' as in the future zcomp_strm_{multi,single}_create()
      could return other error codes.  Function documentation updated
      accordingly.
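
The pattern described above can be sketched in userspace C. This is only an illustration under stated assumptions, not the zram code: `struct comp`, `stream_create()`, `comp_create()` and `comp_destroy()` are hypothetical names. The point is that the caller decides success from the helper's return value and propagates that error code, rather than inspecting a pointer field that may refer to freed memory.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the compression backend object. */
struct comp {
    void *stream;
};

/*
 * Hypothetical helper modelled on the stream-create functions:
 * returns 0 on success or a negative errno. On failure it frees what
 * it allocated and leaves comp->stream dangling, which is why the
 * caller must not look at that field to detect failure.
 */
static int stream_create(struct comp *comp, int fail)
{
    comp->stream = malloc(64);
    if (!comp->stream)
        return -ENOMEM;
    if (fail) {
        free(comp->stream);   /* comp->stream now points to freed memory */
        return -EINVAL;
    }
    return 0;
}

/*
 * Returns the new object, or NULL with *err set to the helper's error
 * code. The decision uses the return value only, never comp->stream.
 */
static struct comp *comp_create(int fail, int *err)
{
    struct comp *comp = calloc(1, sizeof(*comp));

    if (!comp) {
        *err = -ENOMEM;
        return NULL;
    }
    *err = stream_create(comp, fail);
    if (*err) {
        free(comp);
        return NULL;
    }
    return comp;
}

static void comp_destroy(struct comp *comp)
{
    free(comp->stream);
    free(comp);
}
```

Passing the helper's own error code through, instead of a fixed -ENOMEM, keeps the caller correct if the helper later gains other failure modes.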
      
      Fixes: beca3ec7 ("zram: add multi stream functionality")
      Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
      Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Acked-by: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • netlink: Replace rhash_portid with bound · d4862367
      Herbert Xu authored
      [ Upstream commit da314c99 ]
      
      On Mon, Sep 21, 2015 at 02:20:22PM -0400, Tejun Heo wrote:
      >
      > store_release and load_acquire are different from the usual memory
      > barriers and can't be paired this way.  You have to pair store_release
      > and load_acquire.  Besides, it isn't a particularly good idea to
      
      OK I've decided to drop the acquire/release helpers as they don't
      help us at all and simply pessimise the code by using full memory
      barriers (on some architectures) where only a write or read barrier
      is needed.
      
      > depend on memory barriers embedded in other data structures like the
      > above.  Here, especially, rhashtable_insert() would have write barrier
      > *before* the entry is hashed not necessarily *after*, which means that
      > in the above case, a socket which appears to have set bound to a
      > reader might not visible when the reader tries to look up the socket
      > on the hashtable.
      
      But you are right we do need an explicit write barrier here to
      ensure that the hashing is visible.
      
      > There's no reason to be overly smart here.  This isn't a crazy hot
      > path, write barriers tend to be very cheap, store_release more so.
      > Please just do smp_store_release() and note what it's paired with.
      
      It's not about being overly smart.  It's about actually understanding
      what's going on with the code.  I've seen too many instances of
      people simply sprinkling synchronisation primitives around without
      any knowledge of what is happening underneath, which is just a recipe
      for creating hard-to-debug races.
      
      > > @@ -1539,7 +1546,7 @@ static int netlink_bind(struct socket *sock, struct sockaddr *addr,
      > >  		}
      > >  	}
      > >
      > > -	if (!nlk->portid) {
      > > +	if (!nlk->bound) {
      >
      > I don't think you can skip load_acquire here just because this is the
      > second deref of the variable.  That doesn't change anything.  Race
      > condition could still happen between the first and second tests and
      > skipping the second would lead to the same kind of bug.
      
      The reason this one is OK is because we do not use nlk->portid or
      try to get nlk from the hash table before we return to user-space.
      
      However, there is a real bug here that none of these acquire/release
      helpers discovered.  The two bound tests here used to be a single
      one.  Now that they are separate it is entirely possible for another
      thread to come in the middle and bind the socket.  So we need to
      repeat the portid check in order to maintain consistency.
      
      > > @@ -1587,7 +1594,7 @@ static int netlink_connect(struct socket *sock, struct sockaddr *addr,
      > >  	    !netlink_allowed(sock, NL_CFG_F_NONROOT_SEND))
      > >  		return -EPERM;
      > >
      > > -	if (!nlk->portid)
      > > +	if (!nlk->bound)
      >
      > Don't we need load_acquire here too?  Is this path holding a lock
      > which makes that unnecessary?
      
      Ditto.
      
      ---8<---
      The commit 1f770c0a ("netlink: Fix autobind race condition that
      leads to zero port ID") created some new races that can occur due
      to inconsistencies between the two port IDs.
      
      Tejun is right that a barrier is unavoidable.  Therefore I am
      reverting to the original patch that used a boolean to indicate
      that a user netlink socket has been bound.
      
      Barriers have been added where necessary to ensure that a valid
      portid and the hashed socket are visible.
      
      I have also changed netlink_insert to only return EBUSY if the
      socket is bound to a portid different to the requested one.  This
      combined with only reading nlk->bound once in netlink_bind fixes
      a race where two threads that bind the socket at the same time
      with different port IDs may both succeed.
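
The "bound" flag scheme can be approximated in portable C11. This is a hedged userspace sketch, not the netlink code: `struct sock_state`, `sock_bind()` and `sock_portid()` are hypothetical names, C11 fences stand in for the kernel's write and read barriers, and binders are assumed to be serialised by a lock that is not shown.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical socket state: portid is plain data published via 'bound'. */
struct sock_state {
    uint32_t portid;
    atomic_bool bound;
};

/*
 * Bind to 'portid'. Assumes callers are serialised by an external lock.
 * Returns -EBUSY only if the socket is already bound to a different ID.
 */
static int sock_bind(struct sock_state *s, uint32_t portid)
{
    if (atomic_load_explicit(&s->bound, memory_order_relaxed)) {
        /* Read barrier: order the flag read before the portid read. */
        atomic_thread_fence(memory_order_acquire);
        return s->portid == portid ? 0 : -EBUSY;
    }

    s->portid = portid;
    /* Write barrier: make portid visible before the flag is set. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&s->bound, true, memory_order_relaxed);
    return 0;
}

/* Lock-free reader: returns true and fills *out only once bound. */
static bool sock_portid(struct sock_state *s, uint32_t *out)
{
    if (!atomic_load_explicit(&s->bound, memory_order_relaxed))
        return false;
    atomic_thread_fence(memory_order_acquire);
    *out = s->portid;
    return true;
}
```

A reader that observes `bound == true` is guaranteed to see the port ID written before the flag, and a second bind with a different ID is rejected instead of silently succeeding.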
      
      Fixes: 1f770c0a ("netlink: Fix autobind race condition that leads to zero port ID")
      Reported-by: Tejun Heo <tj@kernel.org>
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Nacked-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • netlink: Fix autobind race condition that leads to zero port ID · 4e277624
      Herbert Xu authored
      [ Upstream commit 1f770c0a ]
      
      The commit c0bb07df ("netlink: Reset portid after netlink_insert
      failure") introduced a race condition where if two threads try to
      autobind the same socket one of them may end up with a zero port
      ID.  This led to kernel deadlocks that were observed by multiple
      people.
      
      This patch reverts that commit and instead fixes it by introducing
      a separate rhash_portid variable so that the real portid is only set
      after the socket has been successfully hashed.
      
      Fixes: c0bb07df ("netlink: Reset portid after netlink_insert failure")
      Reported-by: Tejun Heo <tj@kernel.org>
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mvneta: use inband status only when explicitly enabled · d6001764
      Stas Sergeev authored
      [ Upstream commit f8af8e6e in net-next tree,
        will be pushed to Linus very soon. ]
      
      The commit 898b2970 ("mvneta: implement SGMII-based in-band link state
      signaling") implemented the link parameters auto-negotiation unconditionally.
      Unfortunately it appears that some HW that implements the SGMII protocol
      doesn't generate the inband status, so it is not possible to auto-negotiate
      anything with such HW.
      
      This patch enables the auto-negotiation only if explicitly requested with
      the 'managed' DT property.
      
      This patch fixes the following regression:
      https://lkml.org/lkml/2015/7/8/865

      Signed-off-by: Stas Sergeev <stsp@users.sourceforge.net>

      CC: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
      CC: netdev@vger.kernel.org
      CC: linux-kernel@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • of_mdio: add new DT property 'managed' to specify the PHY management type · ebfd3e10
      Stas Sergeev authored
      [ Upstream commit 4cba5c21 in net-next tree,
        will be pushed to Linus very soon. ]
      
      Currently the PHY management type is selected by the MAC driver arbitrarily.
      The decision is based on the presence of the "fixed-link" node and on the
      will of the driver's authors.
      This caused a regression recently, when the mvneta driver suddenly started
      to use the in-band status for auto-negotiation on fixed links.
      It appears the auto-negotiation may not work when expected by the MAC driver.
      Sebastien Rannou explains:
      << Yes, I confirm that my HW does not generate an in-band status. AFAIK, it's
      a PHY that aggregates 4xSGMIIs to 1xQSGMII ; the MAC side of the PHY (with
      inband status) is connected to the switch through QSGMII, and in this context
      we are on the media side of the PHY. >>
      https://lkml.org/lkml/2015/7/10/206
      
      This patch introduces the new string property 'managed' that allows
      the user to set the management type explicitly.
      The supported values are:
      "auto" - default. Uses either MDIO or nothing, depending on the presence
      of the fixed-link node
      "in-band-status" - use in-band status
      Signed-off-by: Stas Sergeev <stsp@users.sourceforge.net>
      
      CC: Rob Herring <robh+dt@kernel.org>
      CC: Pawel Moll <pawel.moll@arm.com>
      CC: Mark Rutland <mark.rutland@arm.com>
      CC: Ian Campbell <ijc+devicetree@hellion.org.uk>
      CC: Kumar Gala <galak@codeaurora.org>
      CC: Florian Fainelli <f.fainelli@gmail.com>
      CC: Grant Likely <grant.likely@linaro.org>
      CC: devicetree@vger.kernel.org
      CC: linux-kernel@vger.kernel.org
      CC: netdev@vger.kernel.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: phy: fixed_phy: handle link-down case · 282117ac
      Stas Sergeev authored
      [ Upstream 868a4215 in net-next tree,
        will be pushed to Linus very soon. ]
      
      fixed_phy_register() currently hardcodes the fixed PHY link to 1, and
      expects to find a "speed" parameter to provide correct information
      towards the fixed PHY consumer.
      
      In a subsequent change, where we allow "managed" (e.g: (RS)GMII in-band
      status auto-negotiation) fixed PHYs, none of these parameters can be
      provided since they will be auto-negotiated, hence, we just provide a
      zero-initialized fixed_phy_status to fixed_phy_register() which makes it
      fail when we call fixed_phy_update_regs() since status.speed = 0 which
      makes us hit the "default" label and error out.
      
      Without this change, we would also see potentially inconsistent
      speed/duplex parameters for fixed PHYs when the link is DOWN.
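
The link-down handling can be pictured with a small sketch. It is an assumption-laden illustration, not the fixed_phy code: `struct fixed_status` and `fixed_status_check()` are hypothetical, and the accepted speeds (10, 100, 1000) are chosen only for the example. Speed is validated only while the link is up, so a zero-initialized status no longer reaches the "default" error path.

```c
#include <errno.h>

/* Hypothetical, simplified view of a fixed PHY status. */
struct fixed_status {
    int link;    /* 0 = down, 1 = up */
    int speed;   /* Mbit/s, meaningful only when link is up */
    int duplex;  /* meaningful only when link is up */
};

/*
 * Returns 0 if the status is acceptable, -EINVAL otherwise.
 * With the link down, speed and duplex are ignored, so an all-zero
 * status is valid and can be filled in later by auto-negotiation.
 */
static int fixed_status_check(const struct fixed_status *st)
{
    if (!st->link)
        return 0;

    switch (st->speed) {
    case 10:
    case 100:
    case 1000:
        return 0;
    default:
        return -EINVAL;   /* unknown speed on a link that claims to be up */
    }
}
```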
      
      CC: netdev@vger.kernel.org
      CC: linux-kernel@vger.kernel.org
      Signed-off-by: Stas Sergeev <stsp@users.sourceforge.net>
      [florian: add more background to why this is correct and desirable]
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: dsa: bcm_sf2: Do not override speed settings · 90eb52c9
      Florian Fainelli authored
      [ Upstream d2eac98f in net-next tree,
        will be pushed to Linus very soon. ]
      
      The SF2 driver currently overrides speed settings for its port
      configured using a fixed PHY. This is both unnecessary and incorrect,
      because we keep feeding back to the hardware the parameters that we read
      from the PHY device, which in the case of a fixed PHY cannot possibly
      change speed.
      
      This change is required to let the fixed PHY code register a PHY
      with a link configured as DOWN by default, and to avoid some sort of
      circular dependency where we require the link_update callback to run
      to program the hardware, and we then utilize the fixed PHY parameters
      to program the hardware with the same settings.
      
      Fixes: 246d7f77 ("net: dsa: add Broadcom SF2 switch driver")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • fib_rules: fix fib rule dumps across multiple skbs · fce13464
      Wilson Kok authored
      [ Upstream commit 41fc0143 ]
      
      dump_rules returns skb length and not error.
      But when family == AF_UNSPEC, the caller of dump_rules
      assumes that it returns an error. Hence, when family == AF_UNSPEC,
      we continue trying to dump on -EMSGSIZE errors resulting in
      incorrect dump idx carried between skbs belonging to the same dump.
      This results in fib rule dump always only dumping rules that fit
      into the first skb.
      
      This patch fixes dump_rules to return error so that we exit correctly
      and idx is correctly maintained between skbs that are part of the
      same dump.
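
The return-value contract matters because the caller uses it to decide whether to stop and resume later. A minimal sketch, assuming a flat array of rules and a fixed-capacity output buffer in place of an skb (`struct dump_state` and this userspace `dump_rules()` are hypothetical, not the kernel function):

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical resume cursor carried between buffers of one dump. */
struct dump_state {
    size_t idx;   /* index of the next rule to emit */
};

/*
 * Copies rules starting at st->idx into 'out' (capacity 'cap').
 * Returns 0 when every rule has been emitted, or -EMSGSIZE when the
 * buffer filled up first. In the error case st->idx points at the
 * first rule that did not fit, so the next call resumes correctly.
 */
static int dump_rules(const int *rules, size_t n, struct dump_state *st,
                      int *out, size_t cap, size_t *written)
{
    *written = 0;
    while (st->idx < n) {
        if (*written == cap)
            return -EMSGSIZE;   /* caller must stop and retry with a new buffer */
        out[(*written)++] = rules[st->idx++];
    }
    return 0;
}
```

If the function reported a length instead of an error here, a caller testing for a negative value would keep going with a full buffer and lose track of where to resume.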
      Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com>
      Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: revert "net_sched: move tp->root allocation into fw_init()" · 74bff4a0
      WANG Cong authored
      [ Upstream commit d8aecb10 ]
      
      The fw filter uses tp->root==NULL to check if it is the old method,
      so it doesn't need allocation at all in this case. This patch
      reverts the offending commit and adds some comments for the old
      method to make it obvious.
      
      Fixes: 33f8b9ec ("net_sched: move tp->root allocation into fw_init()")
      Reported-by: Akshat Kakkar <akshat.1984@gmail.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tcp: add proper TS val into RST packets · c3647e60
      Eric Dumazet authored
      [ Upstream commit 675ee231 ]
      
      RST packets sent on behalf of TCP connections with TS option (RFC 7323
      TCP timestamps) have incorrect TS val (set to 0), but correct TS ecr.
      
      A > B: Flags [S], seq 0, win 65535, options [mss 1000,nop,nop,TS val 100
      ecr 0], length 0
      B > A: Flags [S.], seq 2444755794, ack 1, win 28960, options [mss
      1460,nop,nop,TS val 7264344 ecr 100], length 0
      A > B: Flags [.], ack 1, win 65535, options [nop,nop,TS val 110 ecr
      7264344], length 0
      
      B > A: Flags [R.], seq 1, ack 1, win 28960, options [nop,nop,TS val 0
      ecr 110], length 0
      
      We need to call skb_mstamp_get() to get a proper TS val,
      derived from skb->skb_mstamp.
      
      Note that RFC 1323 was advocating to not send TS option in RST segment,
      but RFC 7323 recommends the opposite :
      
        Once TSopt has been successfully negotiated, that is both <SYN> and
        <SYN,ACK> contain TSopt, the TSopt MUST be sent in every non-<RST>
        segment for the duration of the connection, and SHOULD be sent in an
        <RST> segment (see Section 5.2 for details)
      
      Note this RFC recommends to send TS val = 0, but we believe it is
      premature : We do not know if all TCP stacks are properly
      handling the receive side :
      
         When an <RST> segment is
         received, it MUST NOT be subjected to the PAWS check by verifying an
         acceptable value in SEG.TSval, and information from the Timestamps
         option MUST NOT be used to update connection state information.
         SEG.TSecr MAY be used to provide stricter <RST> acceptance checks.
      
      In 5 years, if/when all TCP stacks are RFC 7323 ready, we might consider
      sending TS val = 0, if it buys something.
      
      Fixes: 7faee5c0 ("tcp: remove TCP_SKB_CB(skb)->when")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • openvswitch: Zero flows on allocation. · 6d80e350
      Jesse Gross authored
      [ Upstream commit ae5f2fb1 ]
      
      When support for megaflows was introduced, OVS needed to start
      installing flows with a mask applied to them. Since masking is an
      expensive operation, OVS also had an optimization that would only
      take the parts of the flow keys that were covered by a non-zero
      mask. The values stored in the remaining pieces should not matter
      because they are masked out.
      
      While this works fine for the purposes of matching (which must always
      look at the mask), serialization to netlink can be problematic. Since
      the flow and the mask are serialized separately, the uninitialized
      portions of the flow can be encoded with whatever values happen to be
      present.
      
      In terms of functionality, this has little effect since these fields
      will be masked out by definition. However, it leaks kernel memory to
      userspace, which is a potential security vulnerability. It is also
      possible that other code paths could look at the masked key and get
      uninitialized data, although this does not currently appear to be an
      issue in practice.
      
      This removes the mask optimization for flows that are being installed.
      This was always intended to be the case as the mask optimizations were
      really targeting per-packet flow operations.
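
The difference between the optimized and the full masking can be shown on a byte array. This is a simplified sketch, not the OVS implementation: `mask_key_range()` and `mask_key_full()` are hypothetical helpers, and the key is modelled as raw bytes. The range-limited variant leaves bytes outside the masked range untouched, so whatever the destination held before would be serialized along with the key.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Optimized variant: only bytes in [start, end), the part covered by a
 * non-zero mask, are written. Bytes outside the range keep their
 * previous (possibly uninitialized) contents.
 */
static void mask_key_range(uint8_t *dst, const uint8_t *src,
                           const uint8_t *mask, size_t start, size_t end)
{
    for (size_t i = start; i < end; i++)
        dst[i] = src[i] & mask[i];
}

/*
 * Full variant: every byte is written, so masked-out bytes become zero
 * and nothing stale can leak when the key is copied out.
 */
static void mask_key_full(uint8_t *dst, const uint8_t *src,
                          const uint8_t *mask, size_t len)
{
    for (size_t i = 0; i < len; i++)
        dst[i] = src[i] & mask[i];
}
```

The range variant remains adequate for per-packet matching, where the mask is always applied before comparison; the full variant is the safe choice for keys that are stored and later exported.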
      
      Fixes: 03f0d916 ("openvswitch: Mega flow implementation")
      Signed-off-by: Jesse Gross <jesse@nicira.com>
      Acked-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • macvtap: fix TUNSETSNDBUF values > 64k · cf9cf6bc
      Michael S. Tsirkin authored
      [ Upstream commit 3ea79249 ]
      
      Upon TUNSETSNDBUF, macvtap reads the requested sndbuf size into
      a local variable u.
      Commit 39ec7de7 ("macvtap: fix uninitialized access on
      TUNSETIFF") changed its type to u16 (which is the right thing to
      do for all other macvtap ioctls), breaking all values > 64k.
      
      The value of TUNSETSNDBUF is actually a signed 32 bit integer, so
      the right thing to do is to read it into an int.
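
The truncation is easy to reproduce outside the kernel. In this sketch `sndbuf_via_u16()` and `sndbuf_via_int()` are hypothetical helpers, and the narrowing is assumed to behave like an ordinary conversion of the 32-bit value to a 16-bit unsigned type:

```c
#include <stdint.h>

/* Stores the requested size in a 16-bit variable, as the regressed code did. */
static int sndbuf_via_u16(int requested)
{
    uint16_t u = (uint16_t)requested;   /* keeps only the low 16 bits */
    return u;
}

/* Stores the requested size in an int, preserving the full 32-bit value. */
static int sndbuf_via_int(int requested)
{
    int s = requested;
    return s;
}
```

With these helpers, a request of 131072 bytes (128k) comes back as 0 through the 16-bit path and unchanged through the int path, while 65535 survives both.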
      
      Cc: David S. Miller <davem@davemloft.net>
      Fixes: 39ec7de7 ("macvtap: fix uninitialized access on TUNSETIFF")
      Reported-by: Mark A. Peloquin
      Bisected-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
      Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Tested-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
      Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net/mlx4_en: really allow to change RSS key · fd0a1a9d
      Eric Dumazet authored
      [ Upstream commit 4671fc6d ]
      
      When changing rss key, we do not want to overwrite user provided key
      by the one provided by netdev_rss_key_fill(), which is the host random
      key generated at boot time.
      
      Fixes: 947cbb0a ("net/mlx4_en: Support for configurable RSS hash function")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Eyal Perry <eyalpe@mellanox.com>
      CC: Amir Vadai <amirv@mellanox.com>
      Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>