1. 03 Jul, 2014 10 commits
    • Alexander Aring's avatar
      ieee802154: reassembly: fix possible buffer overflow · 48bc0343
      Alexander Aring authored
      The max_dsize attribute in ctl_table for lowpan_frags_ns_ctl_table is
      configured with integer accessing methods. This patch change the
      max_dsize attribute to int to avoid a possible buffer overflow.
      Signed-off-by: default avatarAlexander Aring <alex.aring@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      48bc0343
    • David S. Miller's avatar
      Merge branch 'mlx4' · 98846e69
      David S. Miller authored
      Amir Vadai says:
      
      ====================
      Mellanox EN driver fixes 2014-06-23
      
      Below are some fixes to patches submitted to 3.16.
      
      First patch is according to discussions with Ben [1] and Thomas [2] - to do not
      use affinity notifier, since it breaks RFS. Instead detect changes in IRQ
      affinity map, by checking if current CPU is set in affinity map on NAPI poll.
      The two other patches fix some bugs introduced in commit [3].
      
      Patches were applied and tested over commit dba63115: ('powerpc: bpf: Fix the
      broken LD_VLAN_TAG_PRESENT test')
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      98846e69
    • Amir Vadai's avatar
      net/mlx4_en: IRQ affinity hint is not cleared on port down · bb273617
      Amir Vadai authored
      Need to remove affinity hint at mlx4_en_deactivate_cq() and not at
      mlx4_en_destroy_cq() - since affinity_mask might be free'd while still
      being used by procfs.
      Signed-off-by: default avatarAmir Vadai <amirv@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      bb273617
    • Amir Vadai's avatar
      lib/cpumask: cpumask_set_cpu_local_first to use all cores when numa node is not defined · 143b5ba2
      Amir Vadai authored
      When device is non numa aware (numa_node == -1), use all online cpu's.
      Signed-off-by: default avatarAmir Vadai <amirv@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      143b5ba2
    • Amir Vadai's avatar
      net/mlx4_en: Don't use irq_affinity_notifier to track changes in IRQ affinity map · 35f6f453
      Amir Vadai authored
      IRQ affinity notifier can only have a single notifier - cpu_rmap
      notifier. Can't use it to track changes in IRQ affinity map.
      Detect IRQ affinity changes by comparing CPU to current IRQ affinity map
      during NAPI poll thread.
      
      CC: Thomas Gleixner <tglx@linutronix.de>
      CC: Ben Hutchings <ben@decadent.org.uk>
      Fixes: 2eacc23c ("net/mlx4_core: Enforce irq affinity changes immediatly")
      Signed-off-by: default avatarAmir Vadai <amirv@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      35f6f453
    • Maciej W. Rozycki's avatar
      defxx: Fix !DYNAMIC_BUFFERS compilation warnings · 1b037474
      Maciej W. Rozycki authored
      This fixes compilation warnings:
      
      drivers/net/fddi/defxx.c:294: warning: 'dfx_rcv_flush' declared inline after being called
      drivers/net/fddi/defxx.c:294: warning: previous declaration of 'dfx_rcv_flush' was here
      drivers/net/fddi/defxx.c:2854: warning: 'my_skb_align' defined but not used
      
      triggered when the driver is built with DYNAMIC_BUFFERS undefined.  Code
      tested to work just fine with these changes and a few DEFPA and DEFTA
      boards.
      Signed-off-by: default avatarMaciej W. Rozycki <macro@linux-mips.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1b037474
    • Maciej W. Rozycki's avatar
      defxx: Remove an incorrectly inverted preprocessor conditional · f46d53d0
      Maciej W. Rozycki authored
      The RX handler of the driver has two paths switched between, depending on
      the size of the frame received, as determined by SKBUFF_RX_COPYBREAK.
      
      When a small frame is received, a new skb allocated has data space large
      enough to hold the incoming frame only, and data is copied there from the
      original skb whose buffer is returned to the DMA RX ring; in that case
      `rx_in_place' is 0.  When a large frame is received, a new skb allocated
      has data space large enough to hold the largest frame possible, including
      the overhead for alignment, the receive status and padding, over 4.5kiB
      overall, and its buffer is placed on the DMA RX ring while the original
      buffer is passed up to the network stack avoiding the need to copy data;
      in that case `rx_in_place' is 1.
      
      However the latter scenario is only possible when dynamic buffers are
      used, as determined by DYNAMIC_BUFFERS, because otherwise the buffers used
      for the DMA RX ring are fixed at the time the interface is brought up.
      
      That leads to an observation that the preprocessor conditional around the
      `rx_in_place' check is inverted, the check only really matters when
      dynamic buffers are in use.  It has gone unnoticed for many years since
      support for using dynamic buffers on the DMA RX ring was introduced in
      2.1.40 -- because the only problem that results is in the case where
      `rx_in_place' is 1 frame data received is unnecessarily copied to the
      newly-allocated buffer, before the buffer placed on the the DMA receive RX
      and its contents ignored.  Therefore the only symptom is some performance
      loss.
      
      Rather than flipping the condition though I decided to discard the
      conditional altogether -- in the case of static buffers `rx_in_place' is
      always 0 so GCC will optimise the C conditional away instead.
      
      Tested on a few DEFPA and DEFTA boards successfully using both small and
      large frames, both with DYNAMIC_BUFFERS defined and with the macro
      undefined.
      Signed-off-by: default avatarMaciej W. Rozycki <macro@linux-mips.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f46d53d0
    • Christoph Paasch's avatar
      tcp: Fix divide by zero when pushing during tcp-repair · 5924f17a
      Christoph Paasch authored
      When in repair-mode and TCP_RECV_QUEUE is set, we end up calling
      tcp_push with mss_now being 0. If data is in the send-queue and
      tcp_set_skb_tso_segs gets called, we crash because it will divide by
      mss_now:
      
      [  347.151939] divide error: 0000 [#1] SMP
      [  347.152907] Modules linked in:
      [  347.152907] CPU: 1 PID: 1123 Comm: packetdrill Not tainted 3.16.0-rc2 #4
      [  347.152907] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
      [  347.152907] task: f5b88540 ti: f3c82000 task.ti: f3c82000
      [  347.152907] EIP: 0060:[<c1601359>] EFLAGS: 00210246 CPU: 1
      [  347.152907] EIP is at tcp_set_skb_tso_segs+0x49/0xa0
      [  347.152907] EAX: 00000b67 EBX: f5acd080 ECX: 00000000 EDX: 00000000
      [  347.152907] ESI: f5a28f40 EDI: f3c88f00 EBP: f3c83d10 ESP: f3c83d00
      [  347.152907]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
      [  347.152907] CR0: 80050033 CR2: 083158b0 CR3: 35146000 CR4: 000006b0
      [  347.152907] Stack:
      [  347.152907]  c167f9d9 f5acd080 000005b4 00000002 f3c83d20 c16013e6 f3c88f00 f5acd080
      [  347.152907]  f3c83da0 c1603b5a f3c83d38 c10a0188 00000000 00000000 f3c83d84 c10acc85
      [  347.152907]  c1ad5ec0 00000000 00000000 c1ad679c 010003e0 00000000 00000000 f3c88fc8
      [  347.152907] Call Trace:
      [  347.152907]  [<c167f9d9>] ? apic_timer_interrupt+0x2d/0x34
      [  347.152907]  [<c16013e6>] tcp_init_tso_segs+0x36/0x50
      [  347.152907]  [<c1603b5a>] tcp_write_xmit+0x7a/0xbf0
      [  347.152907]  [<c10a0188>] ? up+0x28/0x40
      [  347.152907]  [<c10acc85>] ? console_unlock+0x295/0x480
      [  347.152907]  [<c10ad24f>] ? vprintk_emit+0x1ef/0x4b0
      [  347.152907]  [<c1605716>] __tcp_push_pending_frames+0x36/0xd0
      [  347.152907]  [<c15f4860>] tcp_push+0xf0/0x120
      [  347.152907]  [<c15f7641>] tcp_sendmsg+0xf1/0xbf0
      [  347.152907]  [<c116d920>] ? kmem_cache_free+0xf0/0x120
      [  347.152907]  [<c106a682>] ? __sigqueue_free+0x32/0x40
      [  347.152907]  [<c106a682>] ? __sigqueue_free+0x32/0x40
      [  347.152907]  [<c114f0f0>] ? do_wp_page+0x3e0/0x850
      [  347.152907]  [<c161c36a>] inet_sendmsg+0x4a/0xb0
      [  347.152907]  [<c1150269>] ? handle_mm_fault+0x709/0xfb0
      [  347.152907]  [<c15a006b>] sock_aio_write+0xbb/0xd0
      [  347.152907]  [<c1180b79>] do_sync_write+0x69/0xa0
      [  347.152907]  [<c1181023>] vfs_write+0x123/0x160
      [  347.152907]  [<c1181d55>] SyS_write+0x55/0xb0
      [  347.152907]  [<c167f0d8>] sysenter_do_call+0x12/0x28
      
      This can easily be reproduced with the following packetdrill-script (the
      "magic" with netem, sk_pacing and limit_output_bytes is done to prevent
      the kernel from pushing all segments, because hitting the limit without
      doing this is not so easy with packetdrill):
      
      0   socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
      +0  setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
      
      +0  bind(3, ..., ...) = 0
      +0  listen(3, 1) = 0
      
      +0  < S 0:0(0) win 32792 <mss 1460>
      +0  > S. 0:0(0) ack 1 <mss 1460>
      +0.1  < . 1:1(0) ack 1 win 65000
      
      +0  accept(3, ..., ...) = 4
      
      // This forces that not all segments of the snd-queue will be pushed
      +0 `tc qdisc add dev tun0 root netem delay 10ms`
      +0 `sysctl -w net.ipv4.tcp_limit_output_bytes=2`
      +0 setsockopt(4, SOL_SOCKET, 47, [2], 4) = 0
      
      +0 write(4,...,10000) = 10000
      +0 write(4,...,10000) = 10000
      
      // Set tcp-repair stuff, particularly TCP_RECV_QUEUE
      +0 setsockopt(4, SOL_TCP, 19, [1], 4) = 0
      +0 setsockopt(4, SOL_TCP, 20, [1], 4) = 0
      
      // This now will make the write push the remaining segments
      +0 setsockopt(4, SOL_SOCKET, 47, [20000], 4) = 0
      +0 `sysctl -w net.ipv4.tcp_limit_output_bytes=130000`
      
      // Now we will crash
      +0 write(4,...,1000) = 1000
      
      This happens since ec342325 (tcp: fix retransmission in repair
      mode). Prior to that, the call to tcp_push was prevented by a check for
      tp->repair.
      
      The patch fixes it, by adding the new goto-label out_nopush. When exiting
      tcp_sendmsg and a push is not required, which is the case for tp->repair,
      we go to this label.
      
      When repairing and calling send() with TCP_RECV_QUEUE, the data is
      actually put in the receive-queue. So, no push is required because no
      data has been added to the send-queue.
      
      Cc: Andrew Vagin <avagin@openvz.org>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Fixes: ec342325 (tcp: fix retransmission in repair mode)
      Signed-off-by: default avatarChristoph Paasch <christoph.paasch@uclouvain.be>
      Acked-by: default avatarAndrew Vagin <avagin@openvz.org>
      Acked-by: default avatarPavel Emelyanov <xemul@parallels.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      5924f17a
    • Eric Dumazet's avatar
      net: fix sparse warning in sk_dst_set() · 5925a055
      Eric Dumazet authored
      sk_dst_cache has __rcu annotation, so we need a cast to avoid
      following sparse error :
      
      include/net/sock.h:1774:19: warning: incorrect type in initializer (different address spaces)
      include/net/sock.h:1774:19:    expected struct dst_entry [noderef] <asn:4>*__ret
      include/net/sock.h:1774:19:    got struct dst_entry *dst
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarkbuild test robot <fengguang.wu@intel.com>
      Fixes: 7f502361 ("ipv4: irq safe sk_dst_[re]set() and ipv4_sk_update_pmtu() fix")
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      5925a055
    • Eric Dumazet's avatar
      vlan: free percpu stats in device destructor · a48e5faf
      Eric Dumazet authored
      Madalin-Cristian reported crashs happening after a recent commit
      (5a4ae5f6 "vlan: unnecessary to check if vlan_pcpu_stats is NULL")
      
      -----------------------------------------------------------------------
      root@p5040ds:~# vconfig add eth8 1
      root@p5040ds:~# vconfig rem eth8.1
      Unable to handle kernel paging request for data at address 0x2bc88028
      Faulting instruction address: 0xc058e950
      Oops: Kernel access of bad area, sig: 11 [#1]
      SMP NR_CPUS=8 CoreNet Generic
      Modules linked in:
      CPU: 3 PID: 2167 Comm: vconfig Tainted: G        W     3.16.0-rc3-00346-g65e85bf #2
      task: e7264d90 ti: e2c2c000 task.ti: e2c2c000
      NIP: c058e950 LR: c058ea30 CTR: c058e900
      REGS: e2c2db20 TRAP: 0300   Tainted: G        W      (3.16.0-rc3-00346-g65e85bf)
      MSR: 00029002 <CE,EE,ME>  CR: 48000428  XER: 20000000
      DEAR: 2bc88028 ESR: 00000000
      GPR00: c047299c e2c2dbd0 e7264d90 00000000 2bc88000 00000000 ffffffff 00000000
      GPR08: 0000000f 00000000 000000ff 00000000 28000422 10121928 10100000 10100000
      GPR16: 10100000 00000000 c07c5968 00000000 00000000 00000000 e2c2dc48 e7838000
      GPR24: c07c5bac c07c58a8 e77290cc c07b0000 00000000 c05de6c0 e7838000 e2c2dc48
      NIP [c058e950] vlan_dev_get_stats64+0x50/0x170
      LR [c058ea30] vlan_dev_get_stats64+0x130/0x170
      Call Trace:
      [e2c2dbd0] [ffffffea] 0xffffffea (unreliable)
      [e2c2dc20] [c047299c] dev_get_stats+0x4c/0x140
      [e2c2dc40] [c0488ca8] rtnl_fill_ifinfo+0x3d8/0x960
      [e2c2dd70] [c0489f4c] rtmsg_ifinfo+0x6c/0x110
      [e2c2dd90] [c04731d4] rollback_registered_many+0x344/0x3b0
      [e2c2ddd0] [c047332c] rollback_registered+0x2c/0x50
      [e2c2ddf0] [c0476058] unregister_netdevice_queue+0x78/0xf0
      [e2c2de00] [c058d800] unregister_vlan_dev+0xc0/0x160
      [e2c2de20] [c058e360] vlan_ioctl_handler+0x1c0/0x550
      [e2c2de90] [c045d11c] sock_ioctl+0x28c/0x2f0
      [e2c2deb0] [c010d070] do_vfs_ioctl+0x90/0x7b0
      [e2c2df20] [c010d7d0] SyS_ioctl+0x40/0x80
      [e2c2df40] [c000f924] ret_from_syscall+0x0/0x3c
      
      Fix this problem by freeing percpu stats from dev->destructor() instead
      of ndo_uninit()
      Reported-by: default avatarMadalin-Cristian Bucur <madalin.bucur@freescale.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Tested-by: default avatarMadalin-Cristian Bucur <madalin.bucur@freescale.com>
      Fixes: 5a4ae5f6 ("vlan: unnecessary to check if vlan_pcpu_stats is NULL")
      Cc: Li RongQing <roy.qing.li@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a48e5faf
  2. 02 Jul, 2014 8 commits
    • Daniel Mack's avatar
      net: fix circular dependency in of_mdio code · d9daa247
      Daniel Mack authored
      Commit 86f6cf41 (net: of_mdio: add of_mdiobus_link_phydev()) introduced a
      circular dependency between libphy and of_mdio.
      
      depmod: ERROR: <modroot>/kernel/drivers/net/phy/libphy.ko in
      dependency cycle!
      depmod: ERROR: <modroot>/kernel/drivers/of/of_mdio.ko in dependency cycle!
      
      The problem is that of_mdio.c references &mdio_bus_type and libphy now
      references of_mdiobus_link_phydev.
      
      Fix this by not exporting of_mdiobus_link_phydev() from of_mdio.ko.
      Make it a static function in mdio_bus.c instead.
      Signed-off-by: default avatarDaniel Mack <zonque@gmail.com>
      Reported-by: default avatarJeff Mahoney <jeffm@suse.com>
      Fixes: 86f6cf41 (net: of_mdio: add of_mdiobus_link_phydev())
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d9daa247
    • David S. Miller's avatar
      Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless · eb608d2b
      David S. Miller authored
      John W. Linville says:
      
      ====================
      pull request: wireless 2014-06-27
      
      Please pull the following batch of fixes for the 3.16 stream...
      
      For the mac80211 bits, Johannes says:
      
      "We have a fix from Eliad for a time calculation, a fix from Max for
      head/tailroom when sending authentication packets, a revert that Felix
      requested since the patch in question broke regulatory and a fix from
      myself for an issue with a new command that we advertised in the wrong
      place."
      
      For the bluetooth bits, Gustavo says:
      
      "A few fixes for 3.16. This pull request contains a NULL dereference fix,
      and some security/pairing fixes."
      
      For the iwlwifi bits, Emmanuel says:
      
      "I have here a fix from Eliad for scheduled scan: it fixes a firmware
      assertion. Arik reverts a patch I made that didn't take into account
      that 3160 doesn't have UAPSD and hence, we can't assume that all
      newer firmwares support the feature. Here too, the visible effect
      is a firmware assertion. Along with that, we have a few fixes and
      additions to the device list."
      
      For the ath10k bits, Kalle says:
      
      "Bartosz fixed an issue where we were not able to create 8 vdevs when
      using DFS. Michal removed a false warning which was just confusing
      people."
      
      On top of that...
      
      Arend van Spriel fixes a 'divide by zero' regression in brcmfmac.
      
      Amitkumar Karwar corrects a transmit timeout in mwifiex.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      eb608d2b
    • Florian Fainelli's avatar
      net: bcmgenet: do not set packet length for RX buffers · b758858c
      Florian Fainelli authored
      Hardware will provide this information as soon as we will start
      processing incoming packets, so there is no need to set the RX buffer
      length during buffer allocation.
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b758858c
    • Florian Fainelli's avatar
      net: bcmgenet: start with carrier off · 219575eb
      Florian Fainelli authored
      We use the PHY library which will determine the link state for us, make
      sure we start with a carrier off until libphy has completed the link
      training.
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      219575eb
    • Florian Fainelli's avatar
      net: bcmgenet: disable clock before register_netdev · 0f50ce96
      Florian Fainelli authored
      As soon as register_netdev() is called, the network device notifiers are
      running which means that other parts of the kernel, or user-space
      programs can call the network device ndo_open() callback and use the
      interface.
      
      Disable the Ethernet device clock before we register the network device
      such that we do not create the following situation:
      
      CPU0				CPU1
      register_netdev()
      				bcmgenet_open()
      				clk_prepare_enable()
      clk_disable_unprepare()
      
      and leave the hardware block gated off, while we think it should be
      gated on.
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0f50ce96
    • Florian Fainelli's avatar
      net: systemport: fix TX NAPI work done return value · 16f62d9b
      Florian Fainelli authored
      Although we do not limit the number of packets the TX completion
      function bcm_sysport_tx_reclaim() is allowed to reclaim, we were still
      using its return value as-is. This means that we could hit the WARN() in
      net/core/dev.c where work_done >= budget.
      
      Make sure we do exit the NAPI context when the TX ring is empty, and
      pretend there was no work to do.
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      16f62d9b
    • Florian Fainelli's avatar
      net: systemport: fix UniMAC reset logic · 412bce83
      Florian Fainelli authored
      The UniMAC CMD_SW_RESET bit is not a self-clearing bit, so we need to
      assert it, wait a bit and clear it manually. As a result, umac_reset()
      is updated not to return any value. The previous version of the code
      simply wrote 0 to the CMD register, which would make the busy-waiting
      loop exit immediately, having zero effect.
      
      By writing 0 to the CMD register, we were clearing all bits in the CMD
      register, and not using the hardware reset default values which are
      set on purpose.
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      412bce83
    • Florian Fainelli's avatar
      net: systemport: do not clear IFF_MULTICAST flag · 3b140a67
      Florian Fainelli authored
      The SYSTEMPORT Ethernet MAC supports multicast just fine, it just lacks
      any sort of Unicast/Broadcast/Multicasting filtering at the Ethernet MAC
      level since that is handled by the front end Ethernet switch, but that
      is properly handled by bcm_sysport_set_rx_mode().
      
      Some user-space applications might be relying on the presence of this
      flag to prevent using multicast sockets, this also prevents that
      interface from joining the IPv6 all-router mcast group.
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3b140a67
  3. 01 Jul, 2014 2 commits
    • Eric Dumazet's avatar
      bnx2x: fix possible panic under memory stress · 07b0f009
      Eric Dumazet authored
      While it is legal to kfree(NULL), it is not wise to use :
      put_page(virt_to_head_page(NULL))
      
       BUG: unable to handle kernel paging request at ffffeba400000000
       IP: [<ffffffffc01f5928>] virt_to_head_page+0x36/0x44 [bnx2x]
      Reported-by: default avatarMichel Lespinasse <walken@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Ariel Elior <ariel.elior@qlogic.com>
      Fixes: d46d132c ("bnx2x: use netdev_alloc_frag()")
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      07b0f009
    • Eric Dumazet's avatar
      ipv4: irq safe sk_dst_[re]set() and ipv4_sk_update_pmtu() fix · 7f502361
      Eric Dumazet authored
      We have two different ways to handle changes to sk->sk_dst
      
      First way (used by TCP) assumes socket lock is owned by caller, and use
      no extra lock : __sk_dst_set() & __sk_dst_reset()
      
      Another way (used by UDP) uses sk_dst_lock because socket lock is not
      always taken. Note that sk_dst_lock is not softirq safe.
      
      These ways are not inter changeable for a given socket type.
      
      ipv4_sk_update_pmtu(), added in linux-3.8, added a race, as it used
      the socket lock as synchronization, but users might be UDP sockets.
      
      Instead of converting sk_dst_lock to a softirq safe version, use xchg()
      as we did for sk_rx_dst in commit e47eb5df ("udp: ipv4: do not use
      sk_dst_lock from softirq context")
      
      In a follow up patch, we probably can remove sk_dst_lock, as it is
      only used in IPv6.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Fixes: 9cb3a50c ("ipv4: Invalidate the socket cached route on pmtu events if possible")
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7f502361
  4. 27 Jun, 2014 6 commits
  5. 26 Jun, 2014 7 commits
    • John W. Linville's avatar
    • Linus Torvalds's avatar
      Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6 · d7933ab7
      Linus Torvalds authored
      Pull CIFS fixes from Steve French:
       "Small set of misc cifs/smb3 fixes"
      
      * 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
        [CIFS] fix mount failure with broken pathnames when smb3 mount with mapchars option
        cifs: revalidate mapping prior to satisfying read_iter request with cache=loose
        fs/cifs: fix regression in cifs_create_mf_symlink()
      d7933ab7
    • Linus Torvalds's avatar
      Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging · 2a80ff86
      Linus Torvalds authored
      Pull hwmon fixes from Guenter Roeck:
       "Various minor fixes"
      
      * tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
        hwmon: (gpio-fan) Change name used in hwmon_device_register_with_groups
        hwmon: (emc1403) Fix missing 'select REGMAP_I2C' in Kconfig
        hwmon: (ntc_thermistor) Use the manufacturer name properly
        devicetree: bindings: Document murata vendor prefix
        hwmon: (w83l786ng) Report correct minimum fan speed
      2a80ff86
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · f40ede39
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix crash in ipvs tot_stats estimator, from Julian Anastasov.
      
       2) Fix OOPS in nf_nat on netns removal, from Florian Westphal.
      
       3) Really really really fix locking issues in slip and slcan tty write
          wakeups, from Tyler Hall.
      
       4) Fix checksum offloading in fec driver, from Fugang Duan.
      
       5) Off by one in BPF instruction limit test, from Kees Cook.
      
       6) Need to clear all TSO capability flags when doing software TSO in
          tg3 driver, from Prashant Sreedharan.
      
       7) Fix memory leak in vlan_reorder_header() error path, from Li
          RongQing.
      
       8) Fix various bugs in xen-netfront and xen-netback multiqueue support,
          from David Vrabel and Wei Liu.
      
       9) Fix deadlock in cxgb4 driver, from Li RongQing.
      
      10) Prevent double free of no-cache DST entries, from Eric Dumazet.
      
      11) Bad csum_start handling in skb_segment() leads to crashes when
          forwarding, from Tom Herbert.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (76 commits)
        net: fix setting csum_start in skb_segment()
        ipv4: fix dst race in sk_dst_get()
        net: filter: Use kcalloc/kmalloc_array to allocate arrays
        trivial: net: filter: Change kerneldoc parameter order
        trivial: net: filter: Fix typo in comment
        net: allwinner: emac: Add missing free_irq
        cxgb4: use dev_port to identify ports
        xen-netback: bookkeep number of active queues in our own module
        tg3: Change nvram command timeout value to 50ms
        cxgb4: Not need to hold the adap_rcu_lock lock when read adap_rcu_list
        be2net: fix qnq mode detection on VFs
        of: mdio: fixup of_phy_register_fixed_link parsing of new bindings
        at86rf230: fix irq setup
        net: phy: at803x: fix coccinelle warnings
        net/mlx4_core: Fix the error flow when probing with invalid VF configuration
        tulip: Poll link status more frequently for Comet chips
        net: huawei_cdc_ncm: increase command buffer size
        drivers: net: cpsw: fix dual EMAC stall when connected to same switch
        xen-netfront: recreate queues correctly when reconnecting
        xen-netfront: fix oops when disconnected from backend
        ...
      f40ede39
    • Tom Herbert's avatar
      net: fix setting csum_start in skb_segment() · de843723
      Tom Herbert authored
      Dave Jones reported that a crash is occurring in
      
      csum_partial
      tcp_gso_segment
      inet_gso_segment
      ? update_dl_migration
      skb_mac_gso_segment
      __skb_gso_segment
      dev_hard_start_xmit
      sch_direct_xmit
      __dev_queue_xmit
      ? dev_hard_start_xmit
      dev_queue_xmit
      ip_finish_output
      ? ip_output
      ip_output
      ip_forward_finish
      ip_forward
      ip_rcv_finish
      ip_rcv
      __netif_receive_skb_core
      ? __netif_receive_skb_core
      ? trace_hardirqs_on
      __netif_receive_skb
      netif_receive_skb_internal
      napi_gro_complete
      ? napi_gro_complete
      dev_gro_receive
      ? dev_gro_receive
      napi_gro_receive
      
      It looks like a likely culprit is that SKB_GSO_CB()->csum_start is
      not set correctly when doing non-scatter gather. We are using
      offset as opposed to doffset.
      Reported-by: default avatarDave Jones <davej@redhat.com>
      Tested-by: default avatarDave Jones <davej@redhat.com>
      Signed-off-by: default avatarTom Herbert <therbert@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Fixes: 7e2b10c1 ("net: Support for multiple checksums with gso")
      Acked-by: default avatarTom Herbert <therbert@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      de843723
    • Linus Torvalds's avatar
      Merge tag 'nfs-for-3.16-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · ec71feae
      Linus Torvalds authored
      Pull NFS client fixes from Trond Myklebust:
       "Highlights include:
      
         - Stable fix for a data corruption case due to incorrect cache
           validation
         - Fix a couple of false positive cache invalidations
         - Fix NFSv4 security negotiation issues"
      
      * tag 'nfs-for-3.16-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
        NFSv4: test SECINFO RPC_AUTH_GSS pseudoflavors for support
        NFS Return -EPERM if no supported or matching SECINFO flavor
        NFS check the return of nfs4_negotiate_security in nfs4_submount
        NFS: Don't mark the data cache as invalid if it has been flushed
        NFS: Clear NFS_INO_REVAL_PAGECACHE when we update the file size
        nfs: Fix cache_validity check in nfs_write_pageuptodate()
      ec71feae
    • Eric Dumazet's avatar
      ipv4: fix dst race in sk_dst_get() · f8864972
      Eric Dumazet authored
      When IP route cache had been removed in linux-3.6, we broke assumption
      that dst entries were all freed after rcu grace period. DST_NOCACHE
      dst were supposed to be freed from dst_release(). But it appears
      we want to keep such dst around, either in UDP sockets or tunnels.
      
      In sk_dst_get() we need to make sure dst refcount is not 0
      before incrementing it, or else we might end up freeing a dst
      twice.
      
      DST_NOCACHE set on a dst does not mean this dst can not be attached
      to a socket or a tunnel.
      
      Then, before actual freeing, we need to observe a rcu grace period
      to make sure all other cpus can catch the fact the dst is no longer
      usable.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarDormando <dormando@rydia.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f8864972
  6. 25 Jun, 2014 7 commits