  1. 05 May, 2017 2 commits
    • ACPI / sleep: Ignore spurious SCI wakeups from suspend-to-idle · eed4d47e
      Rafael J. Wysocki authored
      The ACPI SCI (System Control Interrupt) is set up as a wakeup IRQ
      during suspend-to-idle transitions and, consequently, any events
      signaled through it wake up the system from that state.  However,
      on some systems some of the events signaled via the ACPI SCI while
      suspended to idle should not cause the system to wake up.  In fact,
      quite often they should just be discarded.
      
      Arguably, systems should not resume entirely on such events, but in
      order to decide which events really should cause the system to resume
      and which are spurious, it is necessary to resume up to the point
      when ACPI SCIs are actually handled and processed, which is after
      executing dpm_resume_noirq() in the system resume path.
      
      For this reason, add a loop around freeze_enter() in which the
      platforms can process events signaled via multiplexed IRQ lines
      like the ACPI SCI and add suspend-to-idle hooks that can be
      used for this purpose to struct platform_freeze_ops.
      
      In the ACPI case, the ->wake hook is used for checking if the SCI
      has triggered while suspended and deferring the interrupt-induced
      system wakeup until the events signaled through it are actually
      processed sufficiently to decide whether or not the system should
      resume.  In turn, the ->sync hook allows all of the relevant event
      queues to be flushed so as to prevent events from being missed due
      to race conditions.
      
      In addition to that, some ACPI code processing wakeup events needs
      to be modified to use the "hard" version of wakeup triggers, so that
      it will cause a system resume to happen on device-induced wakeup
      events even if the "soft" mechanism to prevent the system from
      suspending is not enabled (that also helps to catch device-induced
      wakeup events occurring during suspend transitions in progress).
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
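The loop described in the commit message above can be pictured with a small user-space model. This is only a sketch under stated assumptions: `struct freeze_hooks`, `sim_wake()`, `sim_sync()` and `s2idle_loop_sim()` are hypothetical names invented for illustration, and the event queue (an array of booleans, `true` meaning a genuine wakeup) stands in for whatever the SCI actually signals. It is not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two hooks described in the commit message. */
struct freeze_hooks {
	void (*wake)(void);	/* runs right after leaving the idle state */
	void (*sync)(void);	/* runs after pending interrupts were handled */
};

/* Simulated platform state; none of this is kernel code. */
static const bool *sim_events;	/* true = genuine wakeup, false = spurious */
static size_t sim_count, sim_pos;
static bool sim_sci_seen;	/* set by ->wake when the SCI has fired */
static bool sim_wakeup_pending;	/* set once a genuine event is processed */

static void sim_wake(void)
{
	/* Only record that the SCI fired; the wakeup decision is deferred. */
	sim_sci_seen = sim_pos < sim_count;
}

static void sim_sync(void)
{
	/* Process the queued event and decide whether it is a real wakeup. */
	if (sim_sci_seen && sim_events[sim_pos++])
		sim_wakeup_pending = true;
	sim_sci_seen = false;
}

const struct freeze_hooks sim_hooks = { .wake = sim_wake, .sync = sim_sync };

/* Returns how many times the simulated idle state was entered. */
size_t s2idle_loop_sim(const struct freeze_hooks *ops,
		       const bool *events, size_t count)
{
	size_t entries = 0;

	assert(ops != NULL);
	sim_events = events;
	sim_count = count;
	sim_pos = 0;
	sim_sci_seen = false;
	sim_wakeup_pending = false;

	while (sim_pos < sim_count && !sim_wakeup_pending) {
		entries++;		/* stands in for freeze_enter() */
		if (ops->wake)
			ops->wake();
		/* The real path resumes far enough to handle SCIs here. */
		if (ops->sync)
			ops->sync();
	}
	return entries;
}
```

With the events `{ false, false, true }`, the model enters the idle state three times: two spurious events are discarded and only the third ends the loop.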
    • PM / wakeup: Integrate mechanism to abort transitions in progress · 8a537ece
      Rafael J. Wysocki authored
      The system wakeup framework is not very consistent with respect to
      the way it handles suspend-to-idle and generally wakeup events
      occurring during transitions to system low-power states.
      
      First off, system transitions in progress are aborted by the event
      reporting helpers like pm_wakeup_event() only if the wakeup_count
      sysfs attribute is in use (as documented), but there are cases in
      which system-wide transitions should be aborted even if that is
      not the case.  For example, a wakeup signal from a designated
      wakeup device during a system-wide PM transition should cause
      the transition to be aborted right away.
      
      Moreover, there is a freeze_wake() call in wakeup_source_activate(),
      but that really is only effective after suspend_freeze_state has
      been set to FREEZE_STATE_ENTER by freeze_enter().  However, it
      is very unlikely that wakeup_source_activate() will ever be called
      at that time, as it could only be triggered by an IRQF_NO_SUSPEND
      interrupt handler, so wakeups from suspend-to-idle don't really
      occur in wakeup_source_activate().
      
      At the same time there is a way to abort a system suspend in
      progress (or wake up the system from suspend-to-idle), which is by
      calling pm_system_wakeup(), but in turn that doesn't cause any
      wakeup source objects to be activated, so it will not be covered
      by wakeup source statistics and will not prevent the system from
      suspending again immediately (in case autosleep is used, for
      example).  Consequently, if anyone wants to abort system transitions
      in progress and allow the wakeup_count mechanism to work, they need
      to use both pm_system_wakeup() and pm_wakeup_event(), say, at the
      same time, which is awkward.
      
      For the above reasons, make it possible to trigger
      pm_system_wakeup() from within wakeup_source_activate() and
      provide a new pm_wakeup_hard_event() helper to do so within the
      wakeup framework.
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
  2. 01 May, 2017 1 commit
  3. 30 Apr, 2017 3 commits
  4. 29 Apr, 2017 3 commits
  5. 28 Apr, 2017 15 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 0e911788
      Linus Torvalds authored
      Pull networking fixes from David Miller:
       "Just a couple more stragglers, I really hope this is it.
      
        1) Don't let frags slip down into the GRO segmentation handlers, from
           Steffen Klassert.
      
        2) Truesize under-estimation triggers warnings in TCP over loopback
           with socket filters, 2 part fix from Eric Dumazet.
      
        3) Fix undesirable reset of bonding MTU to ETH_HLEN on slave removal,
           from Paolo Abeni.
      
        4) If we flush the XFRM policy after garbage collection, it doesn't
           work because stray entries can be created afterwards. Fix from Xin
           Long.
      
        5) Hung socket connection fixes in TIPC from Parthasarathy Bhuvaragan.
      
        6) Fix GRO regression with IPSEC when netfilter is disabled, from
           Sabrina Dubroca.
      
        7) Fix cpsw driver Kconfig dependency regression, from Arnd Bergmann"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
        net: hso: register netdev later to avoid a race condition
        net: adjust skb->truesize in ___pskb_trim()
        tcp: do not underestimate skb->truesize in tcp_trim_head()
        bonding: avoid defaulting hard_header_len to ETH_HLEN on slave removal
        ipv4: Don't pass IP fragments to upper layer GRO handlers.
        cpsw/netcp: refine cpts dependency
        tipc: close the connection if protocol messages contain errors
        tipc: improve error validations for sockets in CONNECTING state
        tipc: Fix missing connection request handling
        xfrm: fix GRO for !CONFIG_NETFILTER
        xfrm: do the garbage collection after flushing policy
    • net: hso: register netdev later to avoid a race condition · 4c761daf
      Andreas Kemnade authored
      If the netdev is accessed before the urbs are initialized,
      there will be NULL pointer dereferences. That is avoided by
      registering it when it is fully initialized.
      
      This case occurs e.g. if dhcpcd is running in the background
      and the device is probed, either after insmod hso or
      when the device appears on the usb bus.
      
      A backtrace is the following:
      
      [ 1357.356048] usb 1-2: new high-speed USB device number 12 using ehci-omap
      [ 1357.551177] usb 1-2: New USB device found, idVendor=0af0, idProduct=8800
      [ 1357.558654] usb 1-2: New USB device strings: Mfr=3, Product=2, SerialNumber=0
      [ 1357.568572] usb 1-2: Product: Globetrotter HSUPA Modem
      [ 1357.574096] usb 1-2: Manufacturer: Option N.V.
      [ 1357.685882] hso 1-2:1.5: Not our interface
      [ 1460.886352] hso: unloaded
      [ 1460.889984] usbcore: deregistering interface driver hso
      [ 1513.769134] hso: ../drivers/net/usb/hso.c: Option Wireless
      [ 1513.846771] Unable to handle kernel NULL pointer dereference at virtual address 00000030
      [ 1513.887664] hso 1-2:1.5: Not our interface
      [ 1513.906890] usbcore: registered new interface driver hso
      [ 1513.937988] pgd = ecdec000
      [ 1513.949890] [00000030] *pgd=acd15831, *pte=00000000, *ppte=00000000
      [ 1513.956573] Internal error: Oops: 817 [#1] PREEMPT SMP ARM
      [ 1513.962371] Modules linked in: hso usb_f_ecm omap2430 bnep bluetooth g_ether usb_f_rndis u_ether libcomposite configfs ipv6 arc4 wl18xx wlcore mac80211 cfg80211 bq27xxx_battery panel_tpo_td028ttec1 omapdrm drm_kms_helper cfbfillrect snd_soc_simple_card syscopyarea cfbimgblt snd_soc_simple_card_utils sysfillrect sysimgblt fb_sys_fops snd_soc_omap_twl4030 cfbcopyarea encoder_opa362 drm twl4030_madc_hwmon wwan_on_off snd_soc_gtm601 pwm_omap_dmtimer generic_adc_battery connector_analog_tv pwm_bl extcon_gpio omap3_isp wlcore_sdio videobuf2_dma_contig videobuf2_memops w1_bq27000 videobuf2_v4l2 videobuf2_core omap_hdq snd_soc_omap_mcbsp ov9650 snd_soc_omap bmp280_i2c bmg160_i2c v4l2_common snd_pcm_dmaengine bmp280 bmg160_core at24 bmc150_magn_i2c nvmem_core videodev phy_twl4030_usb bmc150_accel_i2c tsc2007
      [ 1514.037384]  bmc150_magn bmc150_accel_core media leds_tca6507 bno055 industrialio_triggered_buffer kfifo_buf gpio_twl4030 musb_hdrc snd_soc_twl4030 twl4030_vibra twl4030_madc twl4030_pwrbutton twl4030_charger industrialio w2sg0004 ehci_omap omapdss [last unloaded: hso]
      [ 1514.062622] CPU: 0 PID: 3433 Comm: dhcpcd Tainted: G        W       4.11.0-rc8-letux+ #1
      [ 1514.071136] Hardware name: Generic OMAP36xx (Flattened Device Tree)
      [ 1514.077758] task: ee748240 task.stack: ecdd6000
      [ 1514.082580] PC is at hso_start_net_device+0x50/0xc0 [hso]
      [ 1514.088287] LR is at hso_net_open+0x68/0x84 [hso]
      [ 1514.093231] pc : [<bf79c304>]    lr : [<bf79ced8>]    psr: a00f0013
      sp : ecdd7e20  ip : 00000000  fp : ffffffff
      [ 1514.105316] r10: 00000000  r9 : ed0e080c  r8 : ecd8fe2c
      [ 1514.110839] r7 : bf79cef4  r6 : ecd8fe00  r5 : 00000000  r4 : ed0dbd80
      [ 1514.117706] r3 : 00000000  r2 : c0020c80  r1 : 00000000  r0 : ecdb7800
      [ 1514.124572] Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
      [ 1514.132110] Control: 10c5387d  Table: acdec019  DAC: 00000051
      [ 1514.138153] Process dhcpcd (pid: 3433, stack limit = 0xecdd6218)
      [ 1514.144470] Stack: (0xecdd7e20 to 0xecdd8000)
      [ 1514.149078] 7e20: ed0dbd80 ecd8fe98 00000001 00000000 ecd8f800 ecd8fe00 ecd8fe60 00000000
      [ 1514.157714] 7e40: ed0e080c bf79ced8 bf79ce70 ecd8f800 00000001 bf7a0258 ecd8f830 c068d958
      [ 1514.166320] 7e60: c068d8b8 ecd8f800 00000001 00001091 00001090 c068dba4 ecd8f800 00001090
      [ 1514.174926] 7e80: ecd8f940 ecd8f800 00000000 c068dc60 00000000 00000001 ed0e0800 ecd8f800
      [ 1514.183563] 7ea0: 00000000 c06feaa8 c0ca39c2 beea57dc 00000020 00000000 306f7368 00000000
      [ 1514.192169] 7ec0: 00000000 00000000 00001091 00000000 00000000 00000000 00000000 00008914
      [ 1514.200805] 7ee0: eaa9ab60 beea57dc c0c9bfc0 eaa9ab40 00000006 00000000 00046858 c066a948
      [ 1514.209411] 7f00: beea57dc eaa9ab60 ecc6b0c0 c02837b0 00000006 c0282c90 0000c000 c0283654
      [ 1514.218017] 7f20: c09b0c00 c098bc31 00000001 c0c5e513 c0c5e513 00000000 c0151354 c01a20c0
      [ 1514.226654] 7f40: c0c5e513 c01a3134 ecdd6000 c01a3160 ee7487f0 600f0013 00000000 ee748240
      [ 1514.235260] 7f60: ee748734 00000000 ecc6b0c0 ecc6b0c0 beea57dc 00008914 00000006 00000000
      [ 1514.243896] 7f80: 00046858 c02837b0 00001091 0003a1f0 00046608 0003a248 00000036 c01071e4
      [ 1514.252502] 7fa0: ecdd6000 c0107040 0003a1f0 00046608 00000006 00008914 beea57dc 00001091
      [ 1514.261108] 7fc0: 0003a1f0 00046608 0003a248 00000036 0003ac0c 00046608 00046610 00046858
      [ 1514.269744] 7fe0: 0003a0ac beea57d4 000167eb b6f23106 400f0030 00000006 00000000 00000000
      [ 1514.278411] [<bf79c304>] (hso_start_net_device [hso]) from [<bf79ced8>] (hso_net_open+0x68/0x84 [hso])
      [ 1514.288238] [<bf79ced8>] (hso_net_open [hso]) from [<c068d958>] (__dev_open+0xa0/0xf4)
      [ 1514.296600] [<c068d958>] (__dev_open) from [<c068dba4>] (__dev_change_flags+0x8c/0x130)
      [ 1514.305023] [<c068dba4>] (__dev_change_flags) from [<c068dc60>] (dev_change_flags+0x18/0x48)
      [ 1514.313934] [<c068dc60>] (dev_change_flags) from [<c06feaa8>] (devinet_ioctl+0x348/0x714)
      [ 1514.322540] [<c06feaa8>] (devinet_ioctl) from [<c066a948>] (sock_ioctl+0x2b0/0x308)
      [ 1514.330627] [<c066a948>] (sock_ioctl) from [<c0282c90>] (vfs_ioctl+0x20/0x34)
      [ 1514.338165] [<c0282c90>] (vfs_ioctl) from [<c0283654>] (do_vfs_ioctl+0x82c/0x93c)
      [ 1514.346038] [<c0283654>] (do_vfs_ioctl) from [<c02837b0>] (SyS_ioctl+0x4c/0x74)
      [ 1514.353759] [<c02837b0>] (SyS_ioctl) from [<c0107040>] (ret_fast_syscall+0x0/0x1c)
      [ 1514.361755] Code: e3822103 e3822080 e1822781 e5981014 (e5832030)
      [ 1514.510833] ---[ end trace dfb3e53c657f34a0 ]---
      Reported-by: H. Nikolaus Schaller <hns@goldelico.com>
      Signed-off-by: Andreas Kemnade <andreas@kemnade.info>
      Reviewed-by: Johan Hovold <johan@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: adjust skb->truesize in ___pskb_trim() · c21b48cc
      Eric Dumazet authored
      Andrey found a way to trigger the WARN_ON_ONCE(delta < len) in
      skb_try_coalesce() using syzkaller and a filter attached to a TCP
      socket.
      
      As we did recently in commit 158f323b ("net: adjust skb->truesize in
      pskb_expand_head()") we can adjust skb->truesize from ___pskb_trim(),
      via a call to skb_condense().
      
      If all frags were freed, then skb->truesize can be recomputed.
      
      This call can be done if skb is not yet owned, or destructor is
      sock_edemux().
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Andrey Konovalov <andreyknvl@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: do not underestimate skb->truesize in tcp_trim_head() · 7162fb24
      Eric Dumazet authored
      Andrey found a way to trigger the WARN_ON_ONCE(delta < len) in
      skb_try_coalesce() using syzkaller and a filter attached to a TCP
      socket over loopback interface.
      
      I believe one issue with looped skbs is that tcp_trim_head() can end up
      producing an skb with underestimated truesize.
      
      It hardly matters for normal conditions, since packets sent over
      loopback are never truncated.
      
      Bytes trimmed from skb->head should not change skb truesize, since
      skb->head is not reallocated.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Andrey Konovalov <andreyknvl@google.com>
      Tested-by: Andrey Konovalov <andreyknvl@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
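The accounting rule stated in the commit above (bytes trimmed from skb->head leave truesize alone, because the head buffer is not reallocated) can be illustrated with a toy model. This is a sketch only: `struct skb_model` and `trim_head_model()` are hypothetical, and the model treats all paged data as a single byte count rather than as individual fragments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the few sk_buff fields that matter here. */
struct skb_model {
	size_t headlen;		/* bytes held in the linear head buffer */
	size_t frag_bytes;	/* bytes held in paged fragments */
	size_t truesize;	/* memory charged for the whole buffer */
};

/*
 * Trim len bytes from the front of the buffer.  Bytes taken from the
 * head do not change truesize; only bytes released from the fragments
 * are subtracted.  Returns the amount subtracted from truesize.
 */
size_t trim_head_model(struct skb_model *skb, size_t len)
{
	size_t from_head, from_frags;

	assert(skb != NULL);
	assert(len <= skb->headlen + skb->frag_bytes);

	from_head = len < skb->headlen ? len : skb->headlen;
	from_frags = len - from_head;
	assert(from_frags <= skb->truesize);

	skb->headlen -= from_head;
	skb->frag_bytes -= from_frags;
	skb->truesize -= from_frags;
	return from_frags;
}
```

Subtracting the full trimmed length instead would charge the buffer for less memory than it still holds, which is the underestimation the commit message refers to.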
    • bonding: avoid defaulting hard_header_len to ETH_HLEN on slave removal · 19cdead3
      Paolo Abeni authored
      On slave list updates, the bonding driver computes its hard_header_len
      as the maximum of all enslaved devices' hard_header_len.
      If the slave list is empty, e.g. on last enslaved device removal,
      ETH_HLEN is used.
      
      Since the bonding header_ops are set only when the first enslaved
      device is attached, the above can lead to header_ops->create()
      being called with the wrong skb headroom in place.
      
      If bond0 is configured on top of ipoib devices, with the
      following commands:
      
      ifup bond0
      for slave in $BOND_SLAVES_LIST; do
      	ip link set dev $slave nomaster
      done
      ping -c 1 <ip on bond0 subnet>
      
      we will obtain a skb_under_panic() with a similar call trace:
      	skb_push+0x3d/0x40
      	push_pseudo_header+0x17/0x30 [ib_ipoib]
      	ipoib_hard_header+0x4e/0x80 [ib_ipoib]
      	arp_create+0x12f/0x220
      	arp_send_dst.part.19+0x28/0x50
      	arp_solicit+0x115/0x290
      	neigh_probe+0x4d/0x70
      	__neigh_event_send+0xa7/0x230
      	neigh_resolve_output+0x12e/0x1c0
      	ip_finish_output2+0x14b/0x390
      	ip_finish_output+0x136/0x1e0
      	ip_output+0x76/0xe0
      	ip_local_out+0x35/0x40
      	ip_send_skb+0x19/0x40
      	ip_push_pending_frames+0x33/0x40
      	raw_sendmsg+0x7d3/0xb50
      	inet_sendmsg+0x31/0xb0
      	sock_sendmsg+0x38/0x50
      	SYSC_sendto+0x102/0x190
      	SyS_sendto+0xe/0x10
      	do_syscall_64+0x67/0x180
      	entry_SYSCALL64_slow_path+0x25/0x25
      
      This change addresses the issue by not updating the bonding device
      hard_header_len when the slave list becomes empty, so that it cannot
      shrink below the value used by header_ops->create().
      
      The bug is there since commit 54ef3137 ("[PATCH] bonding: Handle large
      hard_header_len") but the panic can be triggered only since
      commit fc791b63 ("IB/ipoib: move back IB LL address into the hard
      header").
      Reported-by: Norbert P <noe@physik.uzh.ch>
      Fixes: 54ef3137 ("[PATCH] bonding: Handle large hard_header_len")
      Fixes: fc791b63 ("IB/ipoib: move back IB LL address into the hard header")
      Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
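The behaviour change described in the bonding commit above can be sketched as a pure function. This is a minimal model, not the driver code: `bond_header_len_model()` and `ETH_HLEN_MODEL` are hypothetical names, and the slave header lengths are passed in as a plain array.

```c
#include <assert.h>
#include <stddef.h>

#define ETH_HLEN_MODEL 14	/* Ethernet header length in bytes */

/*
 * Recompute the bond's hard_header_len from its slaves.  When the slave
 * list is empty, the current value is kept rather than falling back to
 * the Ethernet default, because header_ops may still expect the larger
 * headroom.
 */
unsigned short bond_header_len_model(unsigned short current_len,
				     const unsigned short *slave_lens,
				     size_t nslaves)
{
	unsigned short max_len = ETH_HLEN_MODEL;
	size_t i;

	assert(nslaves == 0 || slave_lens != NULL);

	if (nslaves == 0)
		return current_len;	/* never shrink on last removal */

	for (i = 0; i < nslaves; i++)
		if (slave_lens[i] > max_len)
			max_len = slave_lens[i];
	return max_len;
}
```

For example, with a current length of 24 (an illustrative value for a slave with a larger link-layer header) and an empty slave list, the function returns 24 instead of 14.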
    • ipv4: Don't pass IP fragments to upper layer GRO handlers. · 9b83e031
      Steffen Klassert authored
      Upper layer GRO handlers cannot handle IP fragments, so
      exit GRO processing in this case.
      
      This fixes ESP GRO because the packet must be reassembled
      before we can decapsulate, otherwise we get authentication
      failures.
      
      It also aligns IPv4 to IPv6 where packets with fragmentation
      headers are not passed to upper layer GRO handlers.
      
      Fixes: 7785bba2 ("esp: Add a software GRO codepath")
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
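What "is an IP fragment" means for the check in the commit above can be shown on a raw IPv4 header. This is a user-space sketch: `ipv4_header_is_fragment()` and the `_MODEL` macros are hypothetical names, while the bit layout (More Fragments flag and 13-bit fragment offset in bytes 6 and 7) follows the IPv4 header format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define IP_MF_MODEL	0x2000u	/* "more fragments" flag */
#define IP_OFFSET_MODEL	0x1FFFu	/* 13-bit fragment offset */

/*
 * Return true if the IPv4 header belongs to a fragment: either more
 * fragments follow, or this packet starts at a non-zero offset.  The
 * "don't fragment" flag (0x4000) is deliberately ignored.
 */
bool ipv4_header_is_fragment(const uint8_t *hdr, size_t len)
{
	uint16_t frag_off;

	assert(hdr != NULL && len >= 20);	/* minimal IPv4 header */

	/* Bytes 6 and 7 hold flags and offset in network byte order. */
	frag_off = (uint16_t)((hdr[6] << 8) | hdr[7]);
	return (frag_off & (IP_MF_MODEL | IP_OFFSET_MODEL)) != 0;
}
```

A receive path modelled on the commit would stop aggregating as soon as this predicate is true and leave the packet to normal reassembly.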
    • cpsw/netcp: refine cpts dependency · 504926df
      Arnd Bergmann authored
      Tony Lindgren reports a kernel oops that resulted from my compile-time
      fix on the default config. This shows two problems:
      
      a) configurations that did not already enable PTP_1588_CLOCK will
         now miss the cpts driver
      
      b) when cpts support is disabled, the driver crashes. This is a
         preexisting problem that we did not notice before my patch.
      
      While the second problem is still being investigated, this modifies
      the dependencies again, getting us back to the original state, with
      another 'select NET_PTP_CLASSIFY' added in to avoid the original
      link error we got, and the 'depends on POSIX_TIMERS' to hide
      the CPTS support when turning it on would be useless.
      
      Cc: stable@vger.kernel.org # 4.11 needs this
      Fixes: 07fef362 ("cpsw/netcp: cpts depends on posix_timers")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Tested-by: Tony Lindgren <tony@atomide.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec · 5577e679
      David S. Miller authored
      Steffen Klassert says:
      
      ====================
      pull request (net): ipsec 2017-04-28
      
      1) Do garbage collecting after a policy flush to remove old
         bundles immediately. From Xin Long.
      
      2) Fix GRO if netfilter is not defined.
         From Sabrina Dubroca.
      
      Please pull or let me know if there are problems.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · affb852d
      Linus Torvalds authored
      Pull input fix from Dmitry Torokhov:
       "Yet another quirk to i8042 to get touchpad recognized on some laptops"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
        Input: i8042 - add Clevo P650RS to the i8042 reset list
    • clk: sunxi-ng: always select CCU_GATE · 36c02d0b
      Arnd Bergmann authored
      When the base driver is enabled but all SoC specific drivers are turned
      off, we now get a build error after code was added to always refer to the
      clk gates:
      
      drivers/clk/built-in.o: In function `ccu_pll_notifier_cb':
      :(.text+0x154f8): undefined reference to `ccu_gate_helper_disable'
      :(.text+0x15504): undefined reference to `ccu_gate_helper_enable'
      
      This changes the Kconfig to always require the gate code to be built-in
      when CONFIG_SUNXI_CCU is set.
      
      Fixes: 02ae2bc6 ("clk: sunxi-ng: Add clk notifier to gate then ungate PLL clocks")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
      Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
    • Merge branch 'for-linus-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs · 28b20135
      Linus Torvalds authored
      Pull btrfs fix from Chris Mason:
       "We have one more fix for btrfs.
      
        This gets rid of a new WARN_ON from rc1 that ended up making more
        noise than we really want. The larger fix for the underflow got
        delayed a bit and it's better for now to put it under
        CONFIG_BTRFS_DEBUG"
      
      * 'for-linus-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
        btrfs: qgroup: move noisy underflow warning to debugging build
    • Merge branch 'tipc-socket-connection-hangs' · c5184717
      David S. Miller authored
      Parthasarathy Bhuvaragan says:
      
      ====================
      tipc: fix hanging socket connections
      
      This patch series contains fixes for the socket layer to
      prevent hanging / stale connections.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: close the connection if protocol messages contain errors · c1be7756
      Parthasarathy Bhuvaragan authored
      When a socket is shutting down, we notify the peer node about the
      connection termination by reusing an incoming message if possible.
      If the last received message was a connection acknowledgment
      message, we reverse this message and set the error code to
      TIPC_ERR_NO_PORT and send it to peer.
      
      In tipc_sk_proto_rcv(), we never check for message errors while
      processing the connection acknowledgment or probe messages. Thus
      this message performs the usual flow control accounting and leaves
      the session hanging.
      
      In this commit, we terminate the connection when we receive such
      error messages.
      Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
      Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: improve error validations for sockets in CONNECTING state · 4e0df495
      Parthasarathy Bhuvaragan authored
      Until now, the checks for sockets in CONNECTING state were based on
      the assumption that the incoming message was always from the
      peer's accepted data socket.
      
      However, when an application using a non-blocking socket sends an
      implicit connect, this socket, which is in CONNECTING state, can
      receive error messages from the peer's listening socket. As we discard
      these messages, the application socket hangs there due to inactivity.
      In addition to this, there are other places where we process errors
      but do not notify the user.
      
      In this commit, we process such incoming error messages and notify
      our users about them using sk_state_change().
      Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
      Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: Fix missing connection request handling · 42b531de
      Parthasarathy Bhuvaragan authored
      In filter_connect, we use waitqueue_active() to check for any
      connections to wake up. But waitqueue_active() is missing memory
      barriers while accessing the critical sections, leading to
      inconsistent results.
      
      In this commit, we replace this with an SMP safe wq_has_sleeper()
      using the generic socket callback sk_data_ready().
      Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
      Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 27 Apr, 2017 7 commits
    • Merge tag 'nfsd-4.11-3' of git://linux-nfs.org/~bfields/linux · 8b5d11e4
      Linus Torvalds authored
      Pull nfsd fixes from Bruce Fields:
       "Thanks to Ari Kauppi and Tuomas Haanpää at Synopsis for spotting bugs
        in our NFSv2/v3 xdr code that could crash the server or leak memory"
      
      * tag 'nfsd-4.11-3' of git://linux-nfs.org/~bfields/linux:
        nfsd: stricter decoding of write-like NFSv2/v3 ops
        nfsd4: minor NFSv2/v3 write decoding cleanup
        nfsd: check for oversized NFSv2/v3 arguments
    • Merge tag 'ceph-for-4.11-rc9' of git://github.com/ceph/ceph-client · 19ac4474
      Linus Torvalds authored
      Pull ceph fix from Ilya Dryomov:
       "A fix for a kernel stack overflow bug in ceph setattr code, marked for
        stable"
      
      * tag 'ceph-for-4.11-rc9' of git://github.com/ceph/ceph-client:
        ceph: fix recursion between ceph_set_acl() and __ceph_setattr()
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · f56fc7bd
      Linus Torvalds authored
      Pull vfs fixes from Al Viro:
      
       - fix orangefs handling of faults on write() - I'd missed that one back
         when orangefs was going through review.
      
       - readdir counterpart of "9p: cope with bogus responses from server in
         p9_client_{read,write}" - server might be lying or broken, and we'd
         better not overrun the kmalloc'ed buffer we are copying the results
         into.
      
       - NFS O_DIRECT read/write can leave iov_iter advanced by too much;
         that's what had been causing iov_iter_pipe() warnings davej had been
         seeing.
      
       - statx_timestamp.tv_nsec type fix (s32 -> u32). That one really should
         go in before 4.11.
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        uapi: change the type of struct statx_timestamp.tv_nsec to unsigned
        fix nfs O_DIRECT advancing iov_iter too much
        p9_client_readdir() fix
        orangefs_bufmap_copy_from_iovec(): fix EFAULT handling
    • statx: correct error handling of NULL pathname · 59372bbf
      Michael Kerrisk (man-pages) authored
      The change in commit 1e2f82d1 ("statx: Kill fd-with-NULL-path
      support in favour of AT_EMPTY_PATH") to error on a NULL pathname to
      statx() is inconsistent.
      
      It results in the error EINVAL for a NULL pathname.  Other system calls
      with similar APIs (fchownat(), fstatat(), linkat()), return EFAULT.
      
      The solution is simply to remove the EINVAL check.  As I already pointed
      out in [1], user_path_at*() and filename_lookup() will handle the NULL
      pathname as per the other APIs, to correctly produce the error EFAULT.
      
      [1] https://lkml.org/lkml/2017/4/26/561
      Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Eric Sandeen <sandeen@sandeen.net>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • xfrm: fix GRO for !CONFIG_NETFILTER · cfcf99f9
      Sabrina Dubroca authored
      In xfrm_input() when called from GRO, async == 0, and we end up
      skipping the processing in xfrm4_transport_finish(). GRO path will
      always skip the NF_HOOK, so we don't need the special-case for
      !NETFILTER during GRO processing.
      
      Fixes: 7785bba2 ("esp: Add a software GRO codepath")
      Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
    • sched/cputime: Fix ksoftirqd cputime accounting regression · 25e2d8c1
      Frederic Weisbecker authored
      irq_time_read() returns the irqtime minus the ksoftirqd time. This
      is necessary because irq_time_read() is used to subtract the IRQ time
      from the sum_exec_runtime of a task. If we were to include the softirq
      time of ksoftirqd, this task would subtract its own CPU time every time
      it updates ksoftirqd->sum_exec_runtime which would therefore never
      progress.
      
      But this behaviour got broken by:
      
        a499a5a1 ("sched/cputime: Increment kcpustat directly on irqtime account")
      
      ... which now includes ksoftirqd softirq time in the time returned by
      irq_time_read().
      
      This has resulted in wrong ksoftirqd cputime reported to userspace
      through /proc/stat and thus "top" not showing ksoftirqd when it should
      after intense networking load.
      
      ksoftirqd->stime happens to be correct but it gets scaled down by
      sum_exec_runtime through task_cputime_adjusted().
      
      To fix this, just account the strict IRQ time in a separate counter and
      use it to report the IRQ time.
      Reported-and-tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Reviewed-by: Rik van Riel <riel@redhat.com>
      Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stanislaw Gruszka <sgruszka@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Wanpeng Li <wanpeng.li@hotmail.com>
      Link: http://lkml.kernel.org/r/1493129448-5356-1-git-send-email-fweisbec@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
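The idea of keeping the "strict" IRQ time in its own counter, as the commit above describes, can be sketched with a simplified model. Everything here is an assumption made for illustration: `struct irqtime_model`, `enum irq_ctx_model` and both functions are hypothetical, and the model collapses per-CPU state and synchronisation into one plain struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Where a slice of time was spent (simplified). */
enum irq_ctx_model { CTX_HARDIRQ, CTX_SOFTIRQ, CTX_KSOFTIRQD };

struct irqtime_model {
	uint64_t stat_irq;	/* hardirq time reported to userspace */
	uint64_t stat_softirq;	/* softirq time reported to userspace */
	uint64_t total;		/* strict IRQ time, excludes ksoftirqd */
};

/* Account delta nanoseconds spent in the given context. */
void irqtime_account_model(struct irqtime_model *t, enum irq_ctx_model ctx,
			   uint64_t delta)
{
	assert(t != NULL);
	switch (ctx) {
	case CTX_HARDIRQ:
		t->stat_irq += delta;
		t->total += delta;
		break;
	case CTX_SOFTIRQ:
		t->stat_softirq += delta;
		t->total += delta;
		break;
	case CTX_KSOFTIRQD:
		/* Softirq work done by the ksoftirqd task is task time. */
		t->stat_softirq += delta;
		break;
	}
}

/* Time to subtract from a task's runtime: the strict counter only. */
uint64_t irq_time_read_model(const struct irqtime_model *t)
{
	assert(t != NULL);
	return t->total;
}
```

Reading `stat_irq + stat_softirq` instead of `total` would fold ksoftirqd's own work into the value that gets subtracted from task runtime, which corresponds to the regression described in the commit message.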
    • uapi: change the type of struct statx_timestamp.tv_nsec to unsigned · 1741937d
      Dmitry V. Levin authored
      The comment asserting that the value of struct statx_timestamp.tv_nsec
      must be negative when statx_timestamp.tv_sec is negative, is wrong, as
      could be seen from the following example:
      
      	#define _FILE_OFFSET_BITS 64
      	#include <assert.h>
      	#include <fcntl.h>
      	#include <stdio.h>
      	#include <sys/stat.h>
      	#include <unistd.h>
      	#include <asm/unistd.h>
      	#include <linux/stat.h>
      
      	int main(void)
      	{
      		static const struct timespec ts[2] = {
      			{ .tv_nsec = UTIME_OMIT },
      			{ .tv_sec = -2, .tv_nsec = 42 }
      		};
      		assert(utimensat(AT_FDCWD, ".", ts, 0) == 0);
      
      		struct stat st;
      		assert(stat(".", &st) == 0);
      		printf("st_mtim.tv_sec = %lld, st_mtim.tv_nsec = %lu\n",
      		       (long long) st.st_mtim.tv_sec,
      		       (unsigned long) st.st_mtim.tv_nsec);
      
      		struct statx stx;
      		assert(syscall(__NR_statx, AT_FDCWD, ".", 0, 0, &stx) == 0);
      		printf("stx_mtime.tv_sec = %lld, stx_mtime.tv_nsec = %lu\n",
      		       (long long) stx.stx_mtime.tv_sec,
      		       (unsigned long) stx.stx_mtime.tv_nsec);
      
      		return 0;
      	}
      
      It expectedly prints:
      st_mtim.tv_sec = -2, st_mtim.tv_nsec = 42
      stx_mtime.tv_sec = -2, stx_mtime.tv_nsec = 42
      
      The more generic comment asserting that the value of struct
      statx_timestamp.tv_nsec might be negative is confusing to say the least.
      
      It contradicts both the struct stat.st_[acm]time_nsec tradition and
      struct timespec.tv_nsec requirements in utimensat syscall.
      If statx syscall ever returns a stx_[acm]time containing a negative
      tv_nsec that cannot be passed unmodified to utimensat syscall,
      it will cause immense confusion.
      
      Fix this source of confusion by changing the type of struct
      statx_timestamp.tv_nsec from __s32 to __u32.
      
      Fixes: a528d35e ("statx: Add a system call to make enhanced file info available")
      Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: linux-api@vger.kernel.org
      cc: mtk.manpages@gmail.com
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      1741937d
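The convention the commit above relies on (a signed seconds field paired with a nanoseconds field that is never negative) can be illustrated with a small helper. This is only a sketch: struct ts_model and ts_from_ns() are hypothetical names, not part of the uapi headers.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Illustrative stand-in for a timestamp with a signed seconds field and an
 * unsigned nanoseconds field; not the kernel's struct statx_timestamp. */
struct ts_model {
	int64_t  tv_sec;
	uint32_t tv_nsec; /* always in [0, 999999999] */
};

/* Split a signed nanosecond count relative to the epoch into
 * (tv_sec, tv_nsec), rounding tv_sec toward negative infinity so that
 * tv_nsec stays non-negative. */
struct ts_model ts_from_ns(int64_t ns)
{
	int64_t sec = ns / NSEC_PER_SEC; /* C division truncates toward zero */
	int64_t rem = ns % NSEC_PER_SEC; /* so rem may be negative here */

	if (rem < 0) {
		sec -= 1;
		rem += NSEC_PER_SEC;
	}

	struct ts_model ts = { .tv_sec = sec, .tv_nsec = (uint32_t)rem };
	return ts;
}
```

With this rule, the instant 42 ns after second -2 is stored as tv_sec = -2, tv_nsec = 42, matching the output shown in the commit message.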
  7. 26 Apr, 2017 9 commits
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · f8324608
      Linus Torvalds authored
      Pull sparc fixes from David Miller:
       "I didn't want the release to go out without the statx system call
        properly hooked up"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
        sparc: Update syscall tables.
        sparc64: Fill in rest of HAVE_REGS_AND_STACK_ACCESS_API
      f8324608
    • David Howells's avatar
      statx: Kill fd-with-NULL-path support in favour of AT_EMPTY_PATH · 1e2f82d1
      David Howells authored
      With the new statx() syscall, the following both allow the attributes of
      the file attached to a file descriptor to be retrieved:
      
      	statx(dfd, NULL, 0, ...);
      
      and:
      
      	statx(dfd, "", AT_EMPTY_PATH, ...);
      
      Change the code to reject the first option, though this means copying
      the path and engaging pathwalk for the fstat() equivalent.  dfd can be a
      non-directory provided path is "".
      
      [ The timing of this isn't wonderful, but applying this now before we
        have statx() in any released kernel, before anybody starts using the
        NULL special case.    - Linus ]
      
      Fixes: a528d35e ("statx: Add a system call to make enhanced file info available")
      Reported-by: Michael Kerrisk <mtk.manpages@gmail.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Eric Sandeen <sandeen@sandeen.net>
      cc: fstests@vger.kernel.org
      cc: linux-api@vger.kernel.org
      cc: linux-man@vger.kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1e2f82d1
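For readers who want to try the surviving form from user space, here is a minimal sketch. It assumes a C library that ships a statx() wrapper (glibc does from version 2.28); the helper name statx_by_fd() is hypothetical.

```c
#define _GNU_SOURCE /* exposes statx() and AT_EMPTY_PATH in glibc */
#include <fcntl.h>
#include <sys/stat.h>

/* Query an already-open descriptor, i.e. the fstat() equivalent.
 * Returns 0 on success, -1 with errno set on failure. */
int statx_by_fd(int fd, struct statx *stx)
{
	/* An empty string plus AT_EMPTY_PATH makes statx() operate on fd
	 * itself; fd need not refer to a directory in this case. */
	return statx(fd, "", AT_EMPTY_PATH, STATX_BASIC_STATS, stx);
}
```

A caller would open a file or directory, pass the descriptor to statx_by_fd(), and then read fields such as stx_mode or stx_size from the result.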
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · fc08b197
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) MLX5 bug fixes from Saeed Mahameed et al:
           - released wrong resources when firmware timeout happens
           - fix wrong check for encapsulation size limits
           - UAR memory leak
           - ETHTOOL_GRXCLSRLALL failed to fill in info->data
      
       2) Don't cache l3mdev on a mismatched local route, which causes net
          devices to leak refs. From Robert Shearman.
      
       3) Handle fragmented SKBs properly in macsec driver, the problem is
          that we were mis-sizing the sgvec table. From Jason A. Donenfeld.
      
       4) We cannot have checksum offload enabled for inner UDP tunneled
          packet during IPSEC, from Ansis Atteka.
      
       5) Fix double SKB free in ravb driver, from Dan Carpenter.
      
       6) Fix CPU port handling in b53 DSA driver, from Florian Fainelli.
      
       7) Don't use on-stack buffers for usb_control_msg() in CAN usb driver,
          from Maksim Salau.
      
       8) Fix device leak in macvlan driver, from Herbert Xu. We have to purge
          the broadcast queue properly on port destroy.
      
       9) Fix tx ring entry limit on EF10 devices in sfc driver. From Bert
          Kenward.
      
      10) Fix memory leaks in team driver, from Pan Bian.
      
      11) Don't setup ipv6_stub before it can be actually used, from Paolo
          Abeni.
      
      12) Fix tipc socket flow control accounting, from Parthasarathy
          Bhuvaragan.
      
      13) Fix crash on module unload in hso driver, from Andreas Kemnade.
      
      14) Fix purging of bridge multicast entries, the problem is that if we
          don't defer it to ndo_uninit it's possible for new entries to get
          added after we purge. Fix from Xin Long.
      
      15) Don't return garbage for PACKET_HDRLEN getsockopt, from Alexander
          Potapenko.
      
      16) Fix autoneg stall properly in PHY layer, and revert micrel driver
          change that was papering over it. From Alexander Kochetkov.
      
      17) Don't dereference an ipv4 route as an ipv6 one in the ip6_tunnel
          code, from Cong Wang.
      
      18) Clear out the congestion control private of the TCP socket in all of
          the right places, from Wei Wang.
      
      19) rawv6_ioctl measures SKB length incorrectly, fix from Jamie
          Bainbridge.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits)
        ipv6: check raw payload size correctly in ioctl
        tcp: memset ca_priv data to 0 properly
        ipv6: check skb->protocol before lookup for nexthop
        net: core: Prevent from dereferencing null pointer when releasing SKB
        macsec: dynamically allocate space for sglist
        Revert "phy: micrel: Disable auto negotiation on startup"
        net: phy: fix auto-negotiation stall due to unavailable interrupt
        net/packet: check length in getsockopt() called with PACKET_HDRLEN
        net: ipv6: regenerate host route if moved to gc list
        bridge: move bridge multicast cleanup to ndo_uninit
        ipv6: fix source routing
        qed: Fix error in the dcbx app meta data initialization.
        netvsc: fix calculation of available send sections
        net: hso: fix module unloading
        tipc: fix socket flow control accounting error at tipc_recv_stream
        tipc: fix socket flow control accounting error at tipc_send_stream
        ipv6: move stub initialization after ipv6 setup completion
        team: fix memory leaks
        sfc: tx ring can only have 2048 entries for all EF10 NICs
        macvlan: Fix device ref leak when purging bc_queue
        ...
      fc08b197
    • Jamie Bainbridge's avatar
      ipv6: check raw payload size correctly in ioctl · 105f5528
      Jamie Bainbridge authored
      In situations where an skb is paged, the transport header pointer and
      tail pointer can be the same because the skb contents are in frags.
      
      This results in ioctl(SIOCINQ/FIONREAD) incorrectly returning a
      length of 0 when the length to receive is actually greater than zero.
      
      skb->len is already correctly set in ip6_input_finish() with
      pskb_pull(), so use skb->len as it always returns the correct result
      for both linear and paged data.
      
      Signed-off-by: Jamie Bainbridge <jbainbri@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      105f5528
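The difference between the two length calculations in the commit above can be shown with a simplified model. Everything here is hypothetical: struct skb_model only mimics the few sk_buff properties the commit message mentions, using offsets instead of pointers.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct sk_buff. */
struct skb_model {
	size_t len;              /* total payload bytes: linear + paged */
	size_t transport_offset; /* start of the transport header in the linear area */
	size_t tail_offset;      /* end of the linear area */
};

/* Old approach: measures only what sits in the linear area. */
size_t inq_from_pointers(const struct skb_model *skb)
{
	return skb->tail_offset - skb->transport_offset;
}

/* New approach: the total length covers linear and paged data alike. */
size_t inq_from_len(const struct skb_model *skb)
{
	return skb->len;
}
```

For a fully paged buffer the two offsets coincide, so the pointer-based result is 0 even though len is non-zero; for a fully linear buffer both functions agree.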
    • Wei Wang's avatar
      tcp: memset ca_priv data to 0 properly · c1201444
      Wei Wang authored
      Always zero out ca_priv data in tcp_assign_congestion_control() so that
      ca_priv data is cleared out during socket creation.
      Also always zero out ca_priv data in tcp_reinit_congestion_control() so
      that when cc algorithm is changed, ca_priv data is cleared out as well.
      We should still zero out ca_priv data even in TCP_CLOSE state because
      user could call connect() on AF_UNSPEC to disconnect the socket and
      leave it in TCP_CLOSE state and later call setsockopt() to switch cc
      algorithm on this socket.
      
      Fixes: 2b0a8c9e ("tcp: add CDG congestion control")
      Reported-by: Andrey Konovalov <andreyknvl@google.com>
      Signed-off-by: Wei Wang <weiwan@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c1201444
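The rule stated in the commit above (clear the private congestion-control area whenever an algorithm is assigned or switched, regardless of socket state) can be sketched as follows. The types, the array size, and the function name are hypothetical and do not reflect the kernel's actual layout.

```c
#include <stdint.h>
#include <string.h>

#define CA_PRIV_WORDS 8 /* arbitrary size chosen for this sketch */

/* Hypothetical stand-ins for the congestion control ops and the socket. */
struct cc_ops_model {
	const char *name;
};

struct sock_model {
	const struct cc_ops_model *ca_ops;
	uint64_t ca_priv[CA_PRIV_WORDS]; /* scratch space owned by the algorithm */
};

/* Install an algorithm and unconditionally zero its private area, so the
 * new algorithm never interprets state left behind by the previous one. */
void set_congestion_control(struct sock_model *sk,
			    const struct cc_ops_model *ops)
{
	sk->ca_ops = ops;
	memset(sk->ca_priv, 0, sizeof(sk->ca_priv));
}
```

The point of the sketch is that the memset() is not guarded by any state check, which is what closes the TCP_CLOSE corner case described in the message.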
    • WANG Cong's avatar
      ipv6: check skb->protocol before lookup for nexthop · 199ab00f
      WANG Cong authored
      Andrey reported an out-of-bounds access in ip6_tnl_xmit(). This
      happens because we use an IPv4 dst in ip6_tnl_xmit() and cast an IPv4
      neigh key as an IPv6 address:
      
              neigh = dst_neigh_lookup(skb_dst(skb),
                                       &ipv6_hdr(skb)->daddr);
              if (!neigh)
                      goto tx_err_link_failure;
      
              addr6 = (struct in6_addr *)&neigh->primary_key; // <=== HERE
              addr_type = ipv6_addr_type(addr6);
      
              if (addr_type == IPV6_ADDR_ANY)
                      addr6 = &ipv6_hdr(skb)->daddr;
      
              memcpy(&fl6->daddr, addr6, sizeof(fl6->daddr));
      
      Also, the network header of the skb at this point should still be IPv4
      for 4in6 tunnels, so we should not just use it as an IPv6 header.
      
      This patch fixes it by checking if skb->protocol is ETH_P_IPV6: if it
      is, we are safe to do the nexthop lookup using skb_dst() and
      ipv6_hdr(skb)->daddr; if not (aka IPv4), we have no clue about which
      dest address we can pick here, we have to rely on callers to fill it
      from tunnel config, so just fall to ip6_route_output() to make the
      decision.
      
      Fixes: ea3dc960 ("ip6_tunnel: Add support for wildcard tunnel endpoints.")
      Reported-by: Andrey Konovalov <andreyknvl@google.com>
      Tested-by: Andrey Konovalov <andreyknvl@google.com>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      199ab00f
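The decision the commit above introduces can be reduced to a single protocol check. The sketch below is a user-space illustration only: the enum, the function name, and the locally defined EtherType constant are assumptions made for the example, and the real code works on struct sk_buff.

```c
#include <stdint.h>
#include <arpa/inet.h> /* htons() */

/* EtherType for IPv6, defined locally to keep the sketch self-contained. */
#define ETHERTYPE_IPV6_MODEL 0x86DD

/* Hypothetical labels for the two ways of choosing the next hop. */
enum nexthop_source {
	NEXTHOP_FROM_NEIGH_LOOKUP, /* safe: dst and header are both IPv6 */
	NEXTHOP_FROM_ROUTE_OUTPUT  /* fall back to the route lookup */
};

/* skb_protocol is in network byte order, as skb->protocol is. */
enum nexthop_source pick_nexthop_source(uint16_t skb_protocol)
{
	if (skb_protocol == htons(ETHERTYPE_IPV6_MODEL))
		return NEXTHOP_FROM_NEIGH_LOOKUP;

	/* Anything else (IPv4 for 4in6 tunnels) must not be read as IPv6. */
	return NEXTHOP_FROM_ROUTE_OUTPUT;
}
```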
    • Myungho Jung's avatar
      net: core: Prevent from dereferencing null pointer when releasing SKB · 9899886d
      Myungho Jung authored
      Added NULL check to make __dev_kfree_skb_irq consistent with kfree
      family of functions.
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=195289
      Signed-off-by: Myungho Jung <mhjungk@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9899886d
    • Jason A. Donenfeld's avatar
      macsec: dynamically allocate space for sglist · 5294b830
      Jason A. Donenfeld authored
      We call skb_cow_data, which is good anyway to ensure we can actually
      modify the skb as such (another error from prior). Now that we have the
      number of fragments required, we can safely allocate exactly that amount
      of memory.
      
      Fixes: c09440f7 ("macsec: introduce IEEE 802.1AE driver")
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Acked-by: Sabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5294b830
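The allocation pattern described in the commit above (learn the fragment count first, then allocate exactly that many scatterlist entries) looks roughly like this in user space. The struct and function names are hypothetical, and calloc() stands in for the kernel allocator.

```c
#include <stdlib.h>

/* Hypothetical stand-in for one scatterlist entry. */
struct sg_entry_model {
	void *page;
	unsigned int offset;
	unsigned int length;
};

/* nfrags models the value returned by the fragment-counting step: a
 * positive count on success, or a negative error code.
 * Returns a zeroed table with exactly nfrags entries, or NULL on error. */
struct sg_entry_model *sg_alloc_for(int nfrags)
{
	if (nfrags <= 0)
		return NULL;

	return calloc((size_t)nfrags, sizeof(struct sg_entry_model));
}
```

Sizing the table from the measured count avoids the mismatch that a fixed-size table can have with heavily fragmented buffers; the caller releases the table with free().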
    • David S. Miller's avatar
      Revert "phy: micrel: Disable auto negotiation on startup" · b43bd728
      David S. Miller authored
      This reverts commit 99f81afc.
      
      It was papering over the real problem, which is fixed by commit
      f555f34f ("net: phy: fix auto-negotiation stall due to unavailable
      interrupt")
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b43bd728