1. 19 Oct, 2015 1 commit
    • netlink: Trim skb to alloc size to avoid MSG_TRUNC · db65a3aa
      Arad, Ronen authored
      netlink_dump() allocates skb based on the calculated min_dump_alloc or
      a per socket max_recvmsg_len.
      min_dump_alloc is the maximum space required for any single netdev's
      attributes, as calculated by rtnl_calcit().
      max_recvmsg_len tracks the user-provided buffer passed to netlink_recvmsg().
      It is capped at 16KiB.
      The intention is to avoid small allocations and to minimize the number
      of calls required to obtain dump information for all net devices.
      
      netlink_dump() packs as many small messages as can fit within an skb
      that was sized for the largest single netdev's information. The actual
      space available within an skb is larger than what was requested. It can
      be much larger, up to nearly 2x when the allocation is rounded up to the
      next power of 2.
      
      Allowing netlink_dump() to use all the space available within the
      allocated skb increases the buffer size a user has to provide to avoid
      truncation (i.e. the MSG_TRUNC flag being set).
      
      It was observed that with many VLANs configured on at least one netdev,
      a larger buffer of nearly 64KiB was necessary to avoid the "Message truncated"
      error in "ip link" or "bridge [-c[ompressvlans]] vlan show" when
      min_dump_alloc was only a little over 32KiB.
      
      This patch trims the skb to the allocated size in order to allow the user to
      avoid truncation with a more reasonable buffer size.
      Signed-off-by: Ronen Arad <ronen.arad@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
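The sizing arithmetic in this commit message can be sketched in userspace C. This is only an illustration: the helper names `round_up_pow2` and `trim_to_request` are hypothetical, skb overhead is ignored, and the only assumption taken from the message is that an allocation may be rounded up to the next power of 2.

```c
#include <stddef.h>

/* Hypothetical model: the allocator rounds a request up to the next power of 2. */
static size_t round_up_pow2(size_t n)
{
    size_t p = 1;

    while (p < n)
        p <<= 1;
    return p;
}

/*
 * Hypothetical model of the trim: the dump may fill at most the size that
 * was requested, even if the allocator handed back a larger buffer.
 */
static size_t trim_to_request(size_t actual, size_t requested)
{
    return actual < requested ? actual : requested;
}
```

For a request a little over 32KiB, say 33000 bytes, `round_up_pow2(33000)` is 65536, which is consistent with the observation that a buffer of nearly 64KiB was needed before trimming; `trim_to_request(65536, 33000)` brings the usable space back to 33000.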
  2. 17 Oct, 2015 1 commit
    • net: add pfmemalloc check in sk_add_backlog() · c7c49b8f
      Eric Dumazet authored
      Greg reported crashes hitting the following check in __sk_backlog_rcv()
      
      	BUG_ON(!sock_flag(sk, SOCK_MEMALLOC));
      
      The pfmemalloc bit is currently checked in sk_filter().
      
      This works correctly for TCP, because sk_filter() is run in
      tcp_v[46]_rcv() before hitting the prequeue or backlog checks.
      
      For UDP or other protocols, this does not work, because sk_filter()
      is run from sock_queue_rcv_skb(), which might be called _after_ backlog
      queuing if the socket is owned by the user by the time the packet is
      processed by the softirq handler.
      
      Fixes: b4b9e355 ("netvm: set PF_MEMALLOC as appropriate during SKB processing")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Greg Thelen <gthelen@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
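The idea named in the commit title, checking the pfmemalloc bit at backlog-queuing time, can be modelled with a small userspace sketch. Everything here is hypothetical: `toy_skb`, `toy_sock` and `toy_add_backlog` are stand-ins for the kernel structures and for sk_add_backlog(), and the `-ENOMEM` return value is an assumption rather than something stated in the message.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's sk_buff and sock. */
struct toy_skb {
    bool pfmemalloc;    /* packet was allocated from emergency reserves */
};

struct toy_sock {
    bool memalloc;      /* models the SOCK_MEMALLOC flag */
    int backlog_len;    /* number of packets queued on the backlog */
};

/*
 * Refuse to queue a pfmemalloc packet on a socket that is not allowed to
 * use the reserves, so the later backlog processing never sees one.
 */
static int toy_add_backlog(struct toy_sock *sk, const struct toy_skb *skb)
{
    if (skb->pfmemalloc && !sk->memalloc)
        return -ENOMEM;     /* assumed error code */

    sk->backlog_len++;
    return 0;
}
```

Doing the check at queuing time covers every protocol that goes through the backlog, regardless of where that protocol runs its filter.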
  3. 16 Oct, 2015 4 commits
    • via-rhine: fix VLAN receive handling regression. · 5f715c09
      Andrej Ota authored
      Because eth_type_trans() consumes an Ethernet header's worth of bytes, a
      call to read the TCI from the end of the packet using rhine_rx_vlan_tag()
      no longer works, as it reads from an invalid offset.
      
      Tested to be working on PCEngines Alix board.
      
      Fixes: 810f19bc ("via-rhine: add consistent memory barrier in vlan receive code.")
      Signed-off-by: Andrej Ota <andrej@ota.si>
      Acked-by: Francois Romieu <romieu@fr.zoreil.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'ipv6-blackhole-route-fix' · 7de88271
      David S. Miller authored
      Martin KaFai Lau says:
      
      ====================
      ipv6: Initialize rt6_info properly in ip6_blackhole_route()
      
      This patchset ensures the rt6_info's fields are initialized properly
      in ip6_blackhole_route(), where xfrm_policy is the primary user.
      The first patch is prep work.  The second patch is the fix.  It
      fixes d52d3997 ("ipv6: Create percpu rt6_info").
      
      Here is the oops reported by Phil Sutter <phil@nwl.cc>:
      
      BUG: unable to handle kernel NULL pointer dereference at 00000000000000a0
      IP: [<ffffffff8171a95e>] __ip6_datagram_connect+0x71e/0xa20
      PGD c2cb1067 PUD c2d7a067 PMD 0
      Oops: 0000 [#1] PREEMPT SMP
      Modules linked in: cmac nfs lockd grace sunrpc bridge stp llc nvidia(PO) snd_usb_audio snd_usbmidi_lib iTCO_wdt
      CPU: 1 PID: 2964 Comm: ping6 Tainted: P           O    4.2.1-aufs #10
      Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./4Core1333-Viiv, BIOS P1.60 07/01/2008
      task: ffff8800ca62bc00 ti: ffff880129a14000 task.ti: ffff880129a14000
      RIP: 0010:[<ffffffff8171a95e>]  [<ffffffff8171a95e>] __ip6_datagram_connect+0x71e/0xa20
      RSP: 0018:ffff880129a17da8  EFLAGS: 00010296
      RAX: 000000000000000b RBX: 0000000000000000 RCX: 0000000000000006
      RDX: 0000000000000007 RSI: 0000000000000246 RDI: ffff88012fc8d5a0
      RBP: ffff8800cb9a9048 R08: 756e207369207472 R09: 216c6c756e207369
      R10: 0000000000000665 R11: 0000000000000006 R12: ffff8800cb9a8cf8
      R13: ffff8800cb9a8cf8 R14: 0000000000000000 R15: ffff8800cb9a8cc0
      FS:  00007fb76ad74700(0000) GS:ffff88012fc80000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000000000a0 CR3: 00000000c2dba000 CR4: 00000000000406e0
      Stack:
       ffff8800cb9a9048 ffff8800cb9a8de0 ffff8800cb9feb70 ffffffff816b2c41
       00007fb70000000b ffffea0000df7200 ffff8800cb9f5cfc ffff8800cb9a8cc0
       03fffffffe068a20 ffff8800cb9a8cc0 ffffffff817097c0 0000000100000000
      Call Trace:
       [<ffffffff816b2c41>] ? udp_lib_get_port+0x1a1/0x380
       [<ffffffff817097c0>] ? udpv6_rcv+0x20/0x20
       [<ffffffff8171ac82>] ? ip6_datagram_connect+0x22/0x40
       [<ffffffff8163ae9b>] ? SyS_connect+0x6b/0xb0
       [<ffffffff810767ac>] ? __do_page_fault+0x15c/0x380
       [<ffffffff8163a8d3>] ? SyS_socket+0x63/0xa0
       [<ffffffff81741957>] ? entry_SYSCALL_64_fastpath+0x12/0x6a
      Code: ba ae 00 00 00 48 c7 c6 7b 71 94 81 48 c7 c7 63 71 94 81 e8 6c 0f 02 00 48 85 db 75 0e 48 c7 c7 9f 71 94 81 31 c0 e8 59 0f 02 00 <48> 83 bb a0 00 00 00 00 75 0e 48 c7 c7 ae 71 94 81 31 c0 e8 41
      RIP  [<ffffffff8171a95e>] __ip6_datagram_connect+0x71e/0xa20
       RSP <ffff880129a17da8>
      CR2: 00000000000000a0
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Initialize rt6_info properly in ip6_blackhole_route() · 0a1f5962
      Martin KaFai Lau authored
      ip6_blackhole_route() does not initialize the newly allocated
      rt6_info properly.  This patch:
      1. Call rt6_info_init() to initialize rt6i_siblings and rt6i_uncached
      
      2. The current rt->dst._metrics init code is incorrect:
         - 'rt->dst._metrics = ort->dst._metrics' is not always safe
         - Not sure what dst_copy_metrics() is trying to do here
           considering ip6_rt_blackhole_cow_metrics() always returns
           NULL
      
         Fix:
         - Always do dst_copy_metrics()
         - Replace ip6_rt_blackhole_cow_metrics() with
           dst_cow_metrics_generic()
      
      3. Mask out the RTF_PCPU bit from the newly allocated blackhole route.
         This bug triggers an oops (reported by Phil Sutter) in rt6_get_cookie().
         It is because RTF_PCPU is set while rt->dst.from is NULL.
      
      Fixes: d52d3997 ("ipv6: Create percpu rt6_info")
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Reported-by: Phil Sutter <phil@nwl.cc>
      Tested-by: Phil Sutter <phil@nwl.cc>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Cc: Julian Anastasov <ja@ssi.bg>
      Cc: Phil Sutter <phil@nwl.cc>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
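Item 3 of this commit message, masking out the RTF_PCPU bit when copying flags to the new blackhole route, amounts to a single bitwise operation. The sketch below is a userspace illustration only: `TOY_RTF_PCPU` and its value are hypothetical placeholders, and `blackhole_flags` is not a kernel function.

```c
#include <stdint.h>

/* Hypothetical flag bit standing in for RTF_PCPU; the value is illustrative. */
#define TOY_RTF_PCPU 0x40000000u

/*
 * Copy the original route's flags to the new blackhole route, but never
 * carry over the per-cpu marker, since the copy is not a per-cpu route.
 */
static uint32_t blackhole_flags(uint32_t orig_flags)
{
    return orig_flags & ~TOY_RTF_PCPU;
}
```

All other flag bits pass through unchanged, so a route that never had the per-cpu bit set is unaffected.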
    • ipv6: Move common init code for rt6_info to a new function rt6_info_init() · ebfa45f0
      Martin KaFai Lau authored
      Introduce rt6_info_init() to do the common init work for
      'struct rt6_info' (after calling dst_alloc).
      
      It is prep work to fix the rt6_info init logic in
      ip6_blackhole_route().
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Cc: Julian Anastasov <ja@ssi.bg>
      Cc: Phil Sutter <phil@nwl.cc>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 15 Oct, 2015 4 commits
  5. 14 Oct, 2015 1 commit
    • tipc: eliminate risk of stalled link synchronization · 0f8b8e28
      Jon Paul Maloy authored
      In commit 6e498158 ("tipc: move link synch and failover to link aggregation level")
      we introduced a new mechanism for performing link failover and
      synchronization. We have now detected a bug in this mechanism.
      
      During link synchronization we use the arrival of any packet on
      the tunnel link to trigger a check for whether it has reached the
      synchronization point or not. This has turned out to be too
      permissive, since it may cause an arriving non-last SYNCH packet to
      end the synch state, just to see the next SYNCH packet initiate a
      new synch state with a new, higher synch point. This is not fatal,
      but should be avoided, because it may significantly extend the
      synchronization period, while at the same time we are not allowed
      to send NACKs if packets are lost. In the worst case, a low-traffic
      user may see its traffic stall until a LINK_PROTOCOL state message
      triggers the link to leave synchronization state.
      
      At the same time, LINK_PROTOCOL packets which happen to have a (non-
      valid) sequence number lower than the tunnel link's rcv_nxt value will
      be consistently dropped, and will never be able to resolve the situation
      described above.
      
      We fix this by exempting LINK_PROTOCOL packets from the sequence number
      check, as they should be. We also reduce (but don't completely
      eliminate) the risk of entering multiple synchronization states by only
      allowing the (logically) first SYNCH packet to initiate a synchronization
      state. This works independently of actual packet arrival order.
      
      Fixes: commit 6e498158 ("tipc: move link synch and failover to link aggregation level")
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  6. 13 Oct, 2015 12 commits
    • ipv6: Don't call with rt6_uncached_list_flush_dev · e332bc67
      Eric W. Biederman authored
      As originally written, rt6_uncached_list_flush_dev makes no sense when
      called with dev == NULL, as it attempts to flush all uncached routes
      regardless of network namespace when dev == NULL, which is simply
      incorrect behavior.
      
      Furthermore, at the point rt6_ifdown is called with dev == NULL, no more
      network devices exist in the network namespace, so even if the code in
      rt6_uncached_list_flush_dev were to attempt something sensible, it
      would be meaningless.
      
      Therefore, remove support in rt6_uncached_list_flush_dev for handling
      network devices where dev == NULL, and only call rt6_uncached_list_flush_dev
      when rt6_ifdown is called with a network device.
      
      Fixes: 8d0b94af ("ipv6: Keep track of DST_NOCACHE routes in case of iface down/unregister")
      Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Reviewed-by: Martin KaFai Lau <kafai@fb.com>
      Tested-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • switchdev: check if the vlan id is in the proper vlan range · 87aaf2ca
      Nikolay Aleksandrov authored
      VLANs 0 and 4095 are reserved and shouldn't be used, so add checks to
      switchdev similar to those in the bridge. Also make sure ids above 4095
      cannot be passed either.
      
      Fixes: 47f8328b ("switchdev: add new switchdev bridge setlink")
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Acked-by: Scott Feldman <sfeldma@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
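The range rule stated in this commit message (0 and 4095 are reserved, and nothing above 4095 is accepted) can be written as one predicate. This is a minimal userspace sketch; the function name `vid_is_usable` and the constant `TOY_VID_RESERVED_MAX` are hypothetical and do not come from the switchdev code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for the upper reserved VLAN id mentioned in the message. */
#define TOY_VID_RESERVED_MAX 4095u

/* A VLAN id is usable only if it lies strictly between the reserved values. */
static bool vid_is_usable(uint16_t vid)
{
    return vid > 0 && vid < TOY_VID_RESERVED_MAX;
}
```

Because the parameter is 16 bits wide, values above 4095 are representable and have to be rejected explicitly, which the single upper-bound comparison does.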
    • Merge branch 'be2net-fixes' · 1f225031
      David S. Miller authored
      Sathya Perla says:
      
      ====================
      be2net: patch set
      
      Patch 1 fixes a FW image compatibility check in the driver that
      prevents certain FW images from being flashed on BE3 (not BE3-R)
      adapters.
      
      Patch 2 fixes a spin_lock not being released in a failure case in
      be_cmd_notify_wait().
      
      Patch 3 extends a workaround that pads packets that are only 32B long or
      less so that it applies to BE3 too. Until now this workaround was applied
      only to Skyhawk and Lancer chips. Such packets cause BE3's TX path to stall
      in an SR-IOV config.
      
      Patch 4 fixes the be_cmd_get_profile_config() routine to set the pf_num
      field in the cmd request. The FW requires this field to be set for it to
      return the specific function's descriptors. If not set, the FW returns
      the descriptors of all the functions on the device. If the first descriptor
      is not what is being queried for, the driver will read wrong data.
      This patch fixes this issue by using the GET_CNTL_ATTRIB cmd to query the
      real pci_func_num of a function and then uses it in the GET_PROFILE_CONFIG
      cmd.
      
      Patch 5 completes an earlier fix that removed the vlan promisc capability
      for VFs. The earlier fix did not remove this capability from
      the profile descriptor of the VF. This causes the VF driver to request this
      capability when it tries to create its interface at probe time. This could
      potentially cause the VF probe to fail if the FW enforces strict checking of
      the flags based on what was provisioned by the PF.  This strict checking is
      not being done by FW currently but will be fixed in a future version. This
      patch fixes this issue by updating the VF's profile descriptor so that it
      matches the interface capability flags provisioned by the PF.
      
      Pls consider adding these patches to the net tree. Thanks!
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • be2net: remove vlan promisc capability from VF's profile descriptors · 196e3735
      Kalesh AP authored
      The commit 435452aa ("Prevent VFs from enabling VLAN promiscuous mode")
      fixed the PF driver to not include the VLAN promisc capability while
      provisioning the interface for a VF. But the fix did not remove this
      capability from the profile descriptor of the VF. This causes the VF
      driver to request this capability when it tries to create its interface
      at probe time.  This could potentially cause the VF probe to fail if the
      FW enforces strict checking of the flags based on what was provisioned
      by the PF.  This strict checking is not being done by FW currently but
      will be fixed in a future version. This patch fixes this issue by updating
      the VF's profile descriptor so that it matches the interface capability
      flags provisioned by the PF.
      
      Fixes: 435452aa ("Prevent VFs from enabling VLAN promiscuous mode")
      Signed-off-by: Kalesh AP <kalesh.purayil@avagotech.com>
      Signed-off-by: Sathya Perla <sathya.perla@avagotech.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • be2net: set pci_func_num while issuing GET_PROFILE_CONFIG cmd · 72ef3a88
      Somnath Kotur authored
      The FW requires the pf_num field in the cmd hdr to be set for it to return
      the specific function's descriptors in the GET_PROFILE_CONFIG cmd. If not
      set, the FW returns the descriptors of all the functions on the device.
      If the first descriptor is not what is being queried for, the driver will
      read wrong data. This patch fixes this issue by using the GET_CNTL_ATTRIB
      cmd to query the real pci_func_num of a function and then uses it in the
      GET_PROFILE_CONFIG cmd.
      Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>
      Signed-off-by: Sathya Perla <sathya.perla@avagotech.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • be2net: pad skb to meet minimum TX pkt size in BE3 · 8227e990
      Suresh Reddy authored
      On BE3 chips in SRIOV configs, the TX path stalls when a packet less
      than 32B is received from the host. A workaround to pad such packets
      already exists for the Skyhawk and Lancer chips. Use the same workaround
      for BE3 chips too.
      Signed-off-by: Suresh Reddy <suresh.reddy@avagotech.com>
      Signed-off-by: Sathya Perla <sathya.perla@avagotech.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • be2net: release mcc-lock in a failure case in be_cmd_notify_wait() · 0c884567
      Suresh Reddy authored
      The mcc/mbox lock is not being released when be_cmd_copy() returns
      an error.
      Signed-off-by: Suresh Reddy <suresh.reddy@avagotech.com>
      Signed-off-by: Sathya Perla <sathya.perla@avagotech.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
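The bug class described here, a lock left held on an error return, is commonly avoided by routing every exit through one unlock label. The sketch below shows that pattern in userspace C. All names (`toy_lock`, `toy_copy`, `toy_notify_wait`) are hypothetical stand-ins, and the sketch does not claim to reproduce the driver's actual code.

```c
#include <stdbool.h>

/* Hypothetical toy lock; the driver uses real kernel locking primitives. */
struct toy_lock {
    bool held;
};

static void toy_lock_acquire(struct toy_lock *l) { l->held = true; }
static void toy_lock_release(struct toy_lock *l) { l->held = false; }

/* Hypothetical stand-in for a copy step that can fail. */
static int toy_copy(bool should_fail)
{
    return should_fail ? -1 : 0;
}

static int toy_notify_wait(struct toy_lock *lock, bool copy_fails)
{
    int status;

    toy_lock_acquire(lock);

    status = toy_copy(copy_fails);
    if (status)
        goto unlock;    /* the error path still drops the lock */

    /* ... the command would be issued and waited on here ... */

unlock:
    toy_lock_release(lock);
    return status;
}
```

With a single exit label, adding a new failure case later cannot reintroduce the leak as long as it jumps to `unlock`.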
    • be2net: fix BE3-R FW download compatibility check · ae4a9d6a
      Kalesh AP authored
      In the BE3 FW image, unlike Skyhawk's, the "asic_type_rev" field doesn't
      track the asic_rev of chip it is compatible with. When asic_type_rev
      is 0 the image is compatible only with pre-BE3-R chips (asic_rev < 0x10).
      Fix the current compatibility check to take care of this.
      We hit this issue when we try to flash old BE3 images (used prior to the
      release of BE3-R) on pre-BE3-R adapters.
      
      Fixes: a6e6ff6e ("be2net: simplify UFI compatibility checking")
      Signed-off-by: Kalesh AP <kalesh.purayil@avagotech.com>
      Signed-off-by: Sathya Perla <sathya.perla@avagotech.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/fsl_pq_mdio: fix computed address for the TBI register · 3bb35ac4
      Gerlando Falauto authored
      commit afae5ad7
        "net/fsl_pq_mdio: streamline probing of MDIO nodes"
      
      added support for different types of MDIO devices:
      1) Gianfar MDIO nodes that only map the MII registers
      2) Gianfar MDIO nodes that map the full MDIO register set
      3) eTSEC2 MDIO nodes (which map the full MDIO register set)
      4) QE MDIO nodes (which map only the MII registers)
      
      However, the implementation for types 1 and 4 would mistakenly assume
      a mapping of the full MDIO register set, thereby computing the address
      for the TBI register starting from the containing structure.
      The TBI register would therefore be accessed at a wrong (much bigger)
      address, not giving the expected result at all.
      This patch restores the correct behavior we had prior to the above one.
      
      The consequences of this bug are apparent when trying to access a PHY
      with the same address as the value contained in the initial value of
      the TBI register (normally 0); in that case you'll get answers from the
      internal TBI device (even though MDIO/MDC pins are actually *also*
      toggling on the physical bus!).
      Beware that you also need to add a fake tbi node to your device tree
      with an unused address.
      
      Notice how this fix is related to commit
      22066949
        "powerpc: Add TBI PHY node to first MDIO bus"
      
      which fixed the behavior in kernel 3.3, which was later broken by the
      above commit on kernel 3.7.
      Signed-off-by: Gerlando Falauto <gerlando.falauto@keymile.com>
      Cc: Timur Tabi <timur@tabi.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Kumar Gala <galak@kernel.crashing.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/fsl_pq_mdio: check TBI address for consistency with mapped range · 3dd03e52
      Gerlando Falauto authored
      When configuring the MDIO subsystem it is also necessary to configure
      the TBI register. Make sure the TBI is contained within the mapped
      register range in order to:
      a) make sure the address is computed correctly
      b) make users aware that we're actually accessing that register
      
      In case of error, print a message but continue anyway.
      Signed-off-by: Gerlando Falauto <gerlando.falauto@keymile.com>
      Cc: Timur Tabi <timur@tabi.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Kumar Gala <galak@kernel.crashing.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mac80211: Fix hwflags debugfs file format · 4633dfc3
      Mohammed Shafi Shajakhan authored
      Commit 30686bf7 ("mac80211: convert HW flags to unsigned long
      bitmap") accidentally removed the newline delimiter from the hwflags
      debugfs file. Fix this by adding back the newline between the HW flags.
      
      Cc: stable@vger.kernel.org [4.2]
      Signed-off-by: Mohammed Shafi Shajakhan <mohammed@qti.qualcomm.com>
      [fix commit log]
      Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • rtnetlink: fix gcc -Wconversion warning · e8444637
      Arad, Ronen authored
      RTA_ALIGNTO is currently defined as 4. It has to be 4U to prevent warnings
      for RTA_ALIGN and RTA_DATA expansions when the -Wconversion gcc option is
      enabled.
      This follows the NLMSG_ALIGNTO definition in <include/uapi/linux/netlink.h>.
      Signed-off-by: Ronen Arad <ronen.arad@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
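The effect of the `U` suffix can be seen in a small userspace sketch of a round-up-to-alignment macro. The names `TOY_ALIGNTO`, `TOY_ALIGN` and `toy_align` are hypothetical; the macro shape is assumed to resemble the usual round-up idiom and is not copied from the rtnetlink header.

```c
/*
 * Hypothetical alignment macros. With an unsigned constant, the mask
 * ~(TOY_ALIGNTO - 1) is computed in unsigned arithmetic, so combining it
 * with an unsigned length involves no signed-to-unsigned conversion.
 */
#define TOY_ALIGNTO 4U
#define TOY_ALIGN(len) (((len) + TOY_ALIGNTO - 1) & ~(TOY_ALIGNTO - 1))

/* Round a length up to the next multiple of TOY_ALIGNTO. */
static unsigned int toy_align(unsigned int len)
{
    return TOY_ALIGN(len);
}
```

If the constant were a plain `4`, the mask would be the negative `int` value -4, and mixing it with an unsigned length would rely on an implicit conversion, which is the kind of expression that conversion warnings are meant to flag.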
  7. 11 Oct, 2015 7 commits
  8. 09 Oct, 2015 5 commits
  9. 08 Oct, 2015 5 commits