1. 17 Aug, 2015 3 commits
    • ppp: fix device unregistration upon netns deletion · 8cb775bc
      Guillaume Nault authored
      PPP devices may get automatically unregistered when their network
      namespace is getting removed. This happens if the ppp control plane
      daemon (e.g. pppd) exits while it is the last user of this namespace.
      
      This leads to several races:
      
        * ppp_exit_net() may destroy the per namespace idr (pn->units_idr)
          before all file descriptors were released. Successive ppp_release()
          calls may then cleanup PPP devices with ppp_shutdown_interface() and
          try to use the already destroyed idr.
      
        * Automatic device unregistration may also happen before the
          ppp_release() call for that device gets executed. Once called on
          the file owning the device, ppp_release() will then clean it up and
          try to unregister it a second time.
      
      To fix these issues, operations defined in ppp_shutdown_interface() are
      moved to the PPP device's ndo_uninit() callback. This allows PPP
      devices to be properly cleaned up by unregister_netdev() and friends.
      So checking for ppp->owner is now an accurate test to decide if a PPP
      device should be unregistered.
      
      Setting ppp->owner is done in ppp_create_interface(), before device
      registration, in order to avoid unprotected modification of this field.
      
      Finally, ppp_exit_net() now starts by unregistering all remaining PPP
      devices to ensure that none will get unregistered after the call to
      idr_destroy().
      Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: fix PHY_RUNNING in phy_state_machine · 11e122cb
      Shaohui Xie authored
      Currently, if the PHY state is PHY_RUNNING, we always register a CHANGE
      when the PHY works in polling mode or with interrupts ignored. This
      causes adjust_link to be called even when the PHY link did not change.
      
      Check the PHY link to make sure it did change before we register a
      CHANGE; if the link did not change, do nothing.
      Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
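The check described in the PHY_RUNNING commit above can be pictured with a small toy model. This is only a sketch in Python, not the kernel code: `poll_running` and `read_link_status` are hypothetical names, and using `PHY_CHANGELINK` as the state behind "register a CHANGE" is an assumption based on the commit text.

```python
# Toy model of the PHY_RUNNING branch for a polled PHY.
# All names here are illustrative, not kernel API.
PHY_RUNNING = "PHY_RUNNING"
PHY_CHANGELINK = "PHY_CHANGELINK"


def poll_running(old_link, read_link_status):
    """Return (next_state, link) for a polled PHY currently in PHY_RUNNING."""
    new_link = read_link_status()  # re-read the link status from the PHY
    if new_link != old_link:
        # Only a real link change registers a CHANGE.
        return PHY_CHANGELINK, new_link
    # Link unchanged: stay in PHY_RUNNING, so adjust_link is not triggered.
    return PHY_RUNNING, new_link
```

With the link unchanged, the model stays in `PHY_RUNNING`; before the fix, the equivalent path would have moved to the change state unconditionally.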
    • Revert "net: limit tcp/udp rmem/wmem to SOCK_{RCV,SND}BUF_MIN" · 5d37852b
      Calvin Owens authored
      Commit 8133534c ("net: limit tcp/udp rmem/wmem to
      SOCK_{RCV,SND}BUF_MIN") modified four sysctls to enforce that the values
      written to them are not less than SOCK_MIN_{RCV,SND}BUF.
      
      That change causes 4096 to no longer be accepted as a valid value for
      'min' in tcp_wmem and udp_wmem_min. 4096 has been the default for both
      of those sysctls for a long time, and unfortunately seems to be an
      extremely popular setting. This change breaks a large number of sysctl
      configurations at Facebook.
      
      That commit referred to b1cb59cf ("net: sysctl_net_core: check
      SNDBUF and RCVBUF for min length"), which chose to use the SOCK_MIN
      constants as the lower limits to avoid nasty bugs. But AFAICS, a limit
      of SOCK_MIN_SNDBUF isn't necessary to do that: the BUG_ON cited in the
      commit message seems to have happened because unix_stream_sendmsg()
      expects a minimum of a full page (ie SK_MEM_QUANTUM) and the math broke,
      not because it had less than SOCK_MIN_SNDBUF allocated.
      
      This particular issue doesn't seem to affect TCP however: using a
      setting of "1 1 1" for tcp_{r,w}mem works, although it's obviously
      suboptimal. SK_MEM_QUANTUM would be a nice minimum, but it's 64K on
      some archs, so there would still be breakage.
      
      Since a value of one doesn't seem to cause any problems, we can drop the
      minimum 8133534c added to fix this.
      
      This reverts commit 8133534c.
      
      Fixes: 8133534c ("net: limit tcp/udp rmem/wmem to SOCK_MIN...")
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Cc: Sorin Dumitru <sorin@returnze.ro>
      Signed-off-by: Calvin Owens <calvinowens@fb.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 14 Aug, 2015 4 commits
    • inet: fix potential deadlock in reqsk_queue_unlink() · 83fccfc3
      Eric Dumazet authored
      When replacing del_timer() with del_timer_sync(), I introduced
      a deadlock condition:
      
      reqsk_queue_unlink() is called from inet_csk_reqsk_queue_drop()
      
      inet_csk_reqsk_queue_drop() can be called from many contexts,
      one being the timer handler itself (reqsk_timer_handler()).
      
      In this case, del_timer_sync() loops forever.
      
      Simple fix is to test if timer is pending.
      
      Fixes: 2235f2ac ("inet: fix races with reqsk timers")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
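The deadlock in the reqsk_queue_unlink() commit above can be illustrated with a toy model. This is a sketch under stated assumptions, not the kernel implementation: the `Timer` class, `Deadlock` exception, and `unlink_*` helpers are hypothetical, and the model reports a deadlock for any synchronous delete issued while the handler runs, which only approximates the self-wait case the commit describes. It also assumes a timer is no longer pending once its handler has started.

```python
class Deadlock(RuntimeError):
    """Raised where the modelled code would spin forever."""


class Timer:
    def __init__(self, handler):
        self.handler = handler
        self.pending = False   # armed and not yet fired
        self.running = False   # handler currently executing

    def arm(self):
        self.pending = True

    def fire(self):
        # Assumption: the timer stops being pending before its handler runs.
        self.pending = False
        self.running = True
        try:
            self.handler(self)
        finally:
            self.running = False


def del_timer_sync(timer):
    """Model of a synchronous delete: it must wait for a running handler."""
    if timer.running:
        # Called from the handler itself: it would wait on its own caller.
        raise Deadlock("del_timer_sync() called from the timer handler")
    was_pending = timer.pending
    timer.pending = False
    return was_pending


def unlink_buggy(timer):
    del_timer_sync(timer)          # unconditional: deadlocks in the handler


def unlink_fixed(timer):
    if timer.pending:              # the "simple fix": test if timer is pending
        del_timer_sync(timer)
```

Firing a timer whose handler is `unlink_buggy` raises `Deadlock` in this model, while `unlink_fixed` returns normally because the pending test fails inside the handler and the synchronous delete is skipped.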
    • gianfar: Restore link state settings after MAC reset · 2a4eebf0
      Claudiu Manoil authored
      There are some MAC registers that need to be kept in sync
      with the link state parameters, see adjust_link().
      However, after a MAC soft reset default values for
      these registers are assumed.  In some cases (other than
      ifdown/ifup, for example) adjust_link() does not see
      that these values were reset to default because the
      priv->old* link parameters were left unchanged.
      So, reset the priv->old* link params as well during a
      MAC reset to let adjust_link() restore the MAC link
      settings to the actual link state values.
      
      Fixes following case, for example:
      Setting link to 100M, changing MTU (implies MAC reset),
      link state remains unchanged to 100M but MAC registers
      were reset to default (1G) breaking the connectivity w/
      the PHY.  Closing and re-opening the interface would
      restore the MAC link parameters to the correct values.
      Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
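The stale-cache problem in the gianfar commit above can be sketched with a toy model. Everything here is hypothetical (the `Mac` class, its fields, and the `clear_cached` flag); it only mirrors the idea that adjust_link() compares against cached "old" link parameters and therefore misses a register reset unless the cache is cleared too.

```python
DEFAULT_SPEED = 1000  # speed the MAC registers fall back to after a soft reset


class Mac:
    def __init__(self):
        self.reg_speed = DEFAULT_SPEED  # stands in for the MAC registers
        self.old_speed = None           # stands in for the priv->old* cache

    def adjust_link(self, phy_speed):
        # Registers are only rewritten when the link appears to have changed.
        if phy_speed != self.old_speed:
            self.reg_speed = phy_speed
            self.old_speed = phy_speed

    def reset(self, clear_cached):
        self.reg_speed = DEFAULT_SPEED
        if clear_cached:
            # The fix: forget the cached parameters so that the next
            # adjust_link() call restores the registers.
            self.old_speed = None
```

Setting the link to 100M, resetting without clearing the cache, and calling `adjust_link(100)` again leaves `reg_speed` at 1000 in this model, which matches the 100M/MTU-change case in the commit message.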
    • ipv4: off-by-one in continuation handling in /proc/net/route · 25b97c01
      Andy Whitcroft authored
      When generating /proc/net/route we emit a header followed by a line for
      each route.  When a short read is performed we will restart this process
      based on the open file descriptor.  When calculating the start point we
      fail to take into account that the 0th entry is the header.  This leads
      us to skip the first entry when doing a continuation read.
      
      This can be easily seen with the comparison below:
      
        while read l; do echo "$l"; done </proc/net/route >A
        cat /proc/net/route >B
        diff -bu A B | grep '^[+-]'
      
      On my example machine I have approximately 10KB of route output.  There we
      see the very first non-title element is lost in the while read case,
      and an entry around the 8K mark in the cat case:
      
        +wlan0 00000000 02021EAC 0003 0 0 400 00000000 0 0 0
        -tun1  00C0AC0A 00000000 0001 0 0 950 00C0FFFF 0 0 0
      
      Fix up the off-by-one when reacquiring position on continuation.
      
      Fixes: 8be33e95 ("fib_trie: Fib walk rcu should take a tnode and key instead of a trie and a leaf")
      BugLink: http://bugs.launchpad.net/bugs/1483440
      Acked-by: Alexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: Andy Whitcroft <apw@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
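The off-by-one in the /proc/net/route commit above can be reproduced in a toy model of a positioned reader. This is a sketch, not the fib_trie code: `read_all`, `HEADER`, and the `chunk` and `fixed` parameters are hypothetical, and the model simply assumes that position 0 is the header and position n (n >= 1) is route n - 1.

```python
HEADER = "Iface Destination Gateway"


def read_all(routes, chunk, fixed):
    """Read header plus routes using short reads of `chunk` records each."""
    out = []
    pos = 0                      # saved position between short reads
    total = len(routes) + 1      # header + one record per route
    while pos < total:
        if pos == 0:
            out.append(HEADER)   # position 0 is the header line
            emitted = 1
            idx = 0
        else:
            # Re-acquire our place. The buggy variant forgets that the
            # 0th entry is the header and lands one route too far.
            idx = pos - 1 if fixed else pos
            emitted = 0
        while emitted < chunk and idx < len(routes):
            out.append(routes[idx])
            idx += 1
            emitted += 1
        pos = idx + 1            # next unread record, header included
    return out
```

For example, with four routes and two records per read, the buggy variant drops the route that should open the second read, while the fixed variant returns the header followed by every route.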
    • net: fix wrong skb_get() usage / crash in IGMP/MLD parsing code · a516993f
      Linus Lüssing authored
      The recent refactoring of the IGMP and MLD parsing code into
      ipv6_mc_check_mld() / ip_mc_check_igmp() introduced a potential crash /
      BUG() invocation for bridges:
      
      I wrongly assumed that skb_get() could be used as a simple reference
      counter for an skb which is not the case. skb_get() bears additional
      semantics, a user count. This leads to a BUG() invocation in
      pskb_expand_head() / kernel panic if pskb_may_pull() is called on an skb
      with a user count greater than one - unfortunately the refactoring did
      just that.
      
      Fixing this by removing the skb_get() call and changing the API: The
      caller of ipv6_mc_check_mld() / ip_mc_check_igmp() now needs to
      additionally check whether the returned skb_trimmed is a clone.
      
      Fixes: 9afd85c9 ("net: Export IGMP/MLD message validation code")
      Reported-by: Brenden Blanco <bblanco@plumgrid.com>
      Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
      Acked-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 13 Aug, 2015 7 commits
    • Merge tag 'dm-4.2-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm · 5b3e2e14
      Linus Torvalds authored
      Pull device mapper fixes from Mike Snitzer:
      
       - two stable fixes for corruption seen in a snapshot of thinp metadata;
         metadata snapshots aren't widely used but help provide a consistent
         view of the metadata associated with an active thin-pool.
      
       - a dm-cache fix for the 4.2 "default" policy switch from "mq" to "smq"
      
      * tag 'dm-4.2-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
        dm cache policy smq: move 'dm-cache-default' module alias to SMQ
        dm btree: add ref counting ops for the leaves of top level btrees
        dm thin metadata: delete btrees when releasing metadata snapshot
    • Merge branch 'for-linus' of git://git.kernel.dk/linux-block · ebcbf166
      Linus Torvalds authored
      Pull xen block driver fixes from Jens Axboe:
       "A few small bug fixes for xen-blk{front,back} that have been sitting
        over my vacation"
      
      * 'for-linus' of git://git.kernel.dk/linux-block:
        xen-blkback: replace work_pending with work_busy in purge_persistent_gnt()
        xen-blkfront: don't add indirect pages to list when !feature_persistent
        xen-blkfront: introduce blkfront_gather_backend_features()
    • Merge tag 'for-linus-4.2-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · 6b476e11
      Linus Torvalds authored
      Pull xen bug fixes from David Vrabel:
      
       - revert a fix from 4.2-rc5 that was causing lots of WARNING spam.
      
       - fix a memory leak affecting backends in HVM guests.
      
       - fix PV domU hang with certain configurations.
      
      * tag 'for-linus-4.2-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        xen/xenbus: Don't leak memory when unmapping the ring on HVM backend
        Revert "xen/events/fifo: Handle linked events when closing a port"
        x86/xen: build "Xen PV" APIC driver for domU as well
    • Revert x86 sigcontext cleanups · ed596cde
      Linus Torvalds authored
      This reverts commits 9a036b93 ("x86/signal/64: Remove 'fs' and 'gs'
      from sigcontext") and c6f20629 ("x86/signal/64: Fix SS handling for
      signals delivered to 64-bit programs").
      
      They were cleanups, but they break dosemu by changing the signal return
      behavior (and removing 'fs' and 'gs' from the sigcontext struct - while
      not actually changing any behavior - causes build problems).
      Reported-and-tested-by: Stas Sergeev <stsp@list.ru>
      Acked-by: Andy Lutomirski <luto@amacapital.net>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 26b552e0
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Workaround hw bug when acquiring PCI bos ownership of iwlwifi
          devices, from Emmanuel Grumbach.
      
       2) Falling back to vmalloc in conntrack should not emit a warning, from
          Pablo Neira Ayuso.
      
       3) Fix NULL deref when rtlwifi driver is used as an AP, from Luis
          Felipe Dominguez Vega.
      
       4) Rocker doesn't free netdev on device removal, from Ido Schimmel.
      
       5) UDP multicast early sock demux has route handling races, from Eric
          Dumazet.
      
       6) Fix L4 checksum handling in openvswitch, from Glenn Griffin.
      
       7) Fix use-after-free in skb_set_peeked, from Herbert Xu.
      
       8) Don't advertize NETIF_F_FRAGLIST in virtio_net driver, this can lead
          to fraglists longer than the driver can support.  From Jason Wang.
      
       9) Fix mlx5 on non-4k-pagesize systems, from Carol L Soto.
      
      10) Fix interrupt storm in bna driver, from Ivan Vecera.
      
      11) Don't propagate -EBUSY from netlink_insert(), from Daniel Borkmann.
      
      12) Fix inet request sock leak, from Eric Dumazet.
      
      13) Fix TX interrupt masking and marking in TX descriptors of fs_enet
          driver, from LEROY Christophe.
      
      14) Get rid of rule optimizer in gianfar driver, it's buggy and unlikely
          to get fixed any time soon.  From Jakub Kicinski
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (61 commits)
        cosa: missing error code on failure in probe()
        gianfar: remove faulty filer optimizer
        gianfar: correct list membership accounting
        gianfar: correct filer table writing
        bonding: Gratuitous ARP gets dropped when first slave added
        net: dsa: Do not override PHY interface if already configured
        net: fs_enet: mask interrupts for TX partial frames.
        net: fs_enet: explicitly remove I flag on TX partial frames
        inet: fix possible request socket leak
        inet: fix races with reqsk timers
        mkiss: Fix error handling in mkiss_open()
        bnx2x: Free NVRAM lock at end of each page
        bnx2x: Prevent null pointer dereference on SKB release
        cxgb4: missing curly braces in t4_setup_debugfs()
        net-timestamp: Update skb_complete_tx_timestamp comment
        ipv6: don't reject link-local nexthop on other interface
        netlink: make sure -EBUSY won't escape from netlink_insert
        bna: fix interrupts storm caused by erroneous packets
        net: mvpp2: replace TX coalescing interrupts with hrtimer
        net: mvpp2: enable proper per-CPU TX buffers unmapping
        ...
    • Merge tag 'edac_fix_for_4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp · 2331d30d
      Linus Torvalds authored
      Pull EDAC fix from Borislav Petkov:
       "A ppc4xx_edac fix for accessing ->csrows properly.  This driver was
        missed during the conversion a couple of years ago"
      
      * tag 'edac_fix_for_4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp:
        EDAC, ppc4xx: Access mci->csrows array elements properly
    • EDAC, ppc4xx: Access mci->csrows array elements properly · 5c16179b
      Michael Walle authored
      The commit
      
        de3910eb ("edac: change the mem allocation scheme to
      		 make Documentation/kobject.txt happy")
      
      changed the memory allocation for the csrows member. But ppc4xx_edac was
      forgotten in the patch. Fix it.
      Signed-off-by: Michael Walle <michael@walle.cc>
      Cc: <stable@vger.kernel.org>
      Cc: linux-edac <linux-edac@vger.kernel.org>
      Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
      Link: http://lkml.kernel.org/r/1437469253-8611-1-git-send-email-michael@walle.cc
      Signed-off-by: Borislav Petkov <bp@suse.de>
  4. 12 Aug, 2015 16 commits
  5. 11 Aug, 2015 10 commits