1. 03 Dec, 2018 5 commits
    • Yoshihiro Shimoda's avatar
      net: phy: micrel: add toggling phy reset if PHY is not attached · 8c85f4b8
      Yoshihiro Shimoda authored
      This patch adds toggling phy reset if PHY is not attached. Otherwise,
      some boards (e.g. R-Car H3 Salvator-XS) cannot link up correctly if
      we do the following method:
      
       1) Kernel boots by using initramfs.
       --> No open the nic, so phy_device_register() and phy_probe()
           deasserts the reset.
       2) Kernel enters the suspend.
       --> So, keep the reset signal as deassert.
       --> On R-Car Salvator-XS board, unfortunately, the board power is
           turned off.
       3) Kernel returns from suspend.
       4) ifconfig eth0 up
       --> Then, since edge signal of the reset doesn't happen,
           it cannot link up.
       5) ifconfig eth0 down
       6) ifconfig eth0 up
       --> In this case, it can link up.
      Reported-by: default avatarHiromitsu Yamasaki <hiromitsu.yamasaki.ym@renesas.com>
      Signed-off-by: default avatarYoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8c85f4b8
    • Yoshihiro Shimoda's avatar
      net: phy: Fix not to call phy_resume() if PHY is not attached · ef1b5bf5
      Yoshihiro Shimoda authored
      This patch fixes an issue that mdio_bus_phy_resume() doesn't call
      phy_resume() if the PHY is not attached.
      
      Fixes: 803dd9c7 ("net: phy: avoid suspending twice a PHY")
      Signed-off-by: default avatarYoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ef1b5bf5
    • Prashant Bhole's avatar
      tun: remove skb access after netif_receive_skb · 4e4b08e5
      Prashant Bhole authored
      In tun.c skb->len was accessed while doing stats accounting after a
      call to netif_receive_skb. We can not access skb after this call
      because buffers may be dropped.
      
      The fix for this bug would be to store skb->len in local variable and
      then use it after netif_receive_skb(). IMO using xdp data size for
      accounting bytes will be better because input for tun_xdp_one() is
      xdp_buff.
      
      Hence this patch:
      - fixes a bug by removing skb access after netif_receive_skb()
      - uses xdp data size for accounting bytes
      
      [613.019057] BUG: KASAN: use-after-free in tun_sendmsg+0x77c/0xc50 [tun]
      [613.021062] Read of size 4 at addr ffff8881da9ab7c0 by task vhost-1115/1155
      [613.023073]
      [613.024003] CPU: 0 PID: 1155 Comm: vhost-1115 Not tainted 4.20.0-rc3-vm+ #232
      [613.026029] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
      [613.029116] Call Trace:
      [613.031145]  dump_stack+0x5b/0x90
      [613.032219]  print_address_description+0x6c/0x23c
      [613.034156]  ? tun_sendmsg+0x77c/0xc50 [tun]
      [613.036141]  kasan_report.cold.5+0x241/0x308
      [613.038125]  tun_sendmsg+0x77c/0xc50 [tun]
      [613.040109]  ? tun_get_user+0x1960/0x1960 [tun]
      [613.042094]  ? __isolate_free_page+0x270/0x270
      [613.045173]  vhost_tx_batch.isra.14+0xeb/0x1f0 [vhost_net]
      [613.047127]  ? peek_head_len.part.13+0x90/0x90 [vhost_net]
      [613.049096]  ? get_tx_bufs+0x5a/0x2c0 [vhost_net]
      [613.051106]  ? vhost_enable_notify+0x2d8/0x420 [vhost]
      [613.053139]  handle_tx_copy+0x2d0/0x8f0 [vhost_net]
      [613.053139]  ? vhost_net_buf_peek+0x340/0x340 [vhost_net]
      [613.053139]  ? __mutex_lock+0x8d9/0xb30
      [613.053139]  ? finish_task_switch+0x8f/0x3f0
      [613.053139]  ? handle_tx+0x32/0x120 [vhost_net]
      [613.053139]  ? mutex_trylock+0x110/0x110
      [613.053139]  ? finish_task_switch+0xcf/0x3f0
      [613.053139]  ? finish_task_switch+0x240/0x3f0
      [613.053139]  ? __switch_to_asm+0x34/0x70
      [613.053139]  ? __switch_to_asm+0x40/0x70
      [613.053139]  ? __schedule+0x506/0xf10
      [613.053139]  handle_tx+0xc7/0x120 [vhost_net]
      [613.053139]  vhost_worker+0x166/0x200 [vhost]
      [613.053139]  ? vhost_dev_init+0x580/0x580 [vhost]
      [613.053139]  ? __kthread_parkme+0x77/0x90
      [613.053139]  ? vhost_dev_init+0x580/0x580 [vhost]
      [613.053139]  kthread+0x1b1/0x1d0
      [613.053139]  ? kthread_park+0xb0/0xb0
      [613.053139]  ret_from_fork+0x35/0x40
      [613.088705]
      [613.088705] Allocated by task 1155:
      [613.088705]  kasan_kmalloc+0xbf/0xe0
      [613.088705]  kmem_cache_alloc+0xdc/0x220
      [613.088705]  __build_skb+0x2a/0x160
      [613.088705]  build_skb+0x14/0xc0
      [613.088705]  tun_sendmsg+0x4f0/0xc50 [tun]
      [613.088705]  vhost_tx_batch.isra.14+0xeb/0x1f0 [vhost_net]
      [613.088705]  handle_tx_copy+0x2d0/0x8f0 [vhost_net]
      [613.088705]  handle_tx+0xc7/0x120 [vhost_net]
      [613.088705]  vhost_worker+0x166/0x200 [vhost]
      [613.088705]  kthread+0x1b1/0x1d0
      [613.088705]  ret_from_fork+0x35/0x40
      [613.088705]
      [613.088705] Freed by task 1155:
      [613.088705]  __kasan_slab_free+0x12e/0x180
      [613.088705]  kmem_cache_free+0xa0/0x230
      [613.088705]  ip6_mc_input+0x40f/0x5a0
      [613.088705]  ipv6_rcv+0xc9/0x1e0
      [613.088705]  __netif_receive_skb_one_core+0xc1/0x100
      [613.088705]  netif_receive_skb_internal+0xc4/0x270
      [613.088705]  br_pass_frame_up+0x2b9/0x2e0
      [613.088705]  br_handle_frame_finish+0x2fb/0x7a0
      [613.088705]  br_handle_frame+0x30f/0x6c0
      [613.088705]  __netif_receive_skb_core+0x61a/0x15b0
      [613.088705]  __netif_receive_skb_one_core+0x8e/0x100
      [613.088705]  netif_receive_skb_internal+0xc4/0x270
      [613.088705]  tun_sendmsg+0x738/0xc50 [tun]
      [613.088705]  vhost_tx_batch.isra.14+0xeb/0x1f0 [vhost_net]
      [613.088705]  handle_tx_copy+0x2d0/0x8f0 [vhost_net]
      [613.088705]  handle_tx+0xc7/0x120 [vhost_net]
      [613.088705]  vhost_worker+0x166/0x200 [vhost]
      [613.088705]  kthread+0x1b1/0x1d0
      [613.088705]  ret_from_fork+0x35/0x40
      [613.088705]
      [613.088705] The buggy address belongs to the object at ffff8881da9ab740
      [613.088705]  which belongs to the cache skbuff_head_cache of size 232
      
      Fixes: 043d222f ("tuntap: accept an array of XDP buffs through sendmsg()")
      Reviewed-by: default avatarToshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Signed-off-by: default avatarPrashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
      Acked-by: default avatarJason Wang <jasowang@redhat.com>
      Acked-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4e4b08e5
    • Su Yanjun's avatar
      net: 8139cp: fix a BUG triggered by changing mtu with network traffic · a5d4a892
      Su Yanjun authored
      When changing mtu many times with traffic, a bug is triggered:
      
      [ 1035.684037] kernel BUG at lib/dynamic_queue_limits.c:26!
      [ 1035.684042] invalid opcode: 0000 [#1] SMP
      [ 1035.684049] Modules linked in: loop binfmt_misc 8139cp(OE) macsec
      tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag tcp_lp
      fuse uinput xt_CHECKSUM iptable_mangle ipt_MASQUERADE
      nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4
      nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun
      bridge stp llc ebtable_filter ebtables ip6table_filter devlink
      ip6_tables iptable_filter sunrpc snd_hda_codec_generic snd_hda_intel
      snd_hda_codec snd_hda_core snd_hwdep ppdev snd_seq iosf_mbi crc32_pclmul
      parport_pc snd_seq_device ghash_clmulni_intel parport snd_pcm
      aesni_intel joydev lrw snd_timer virtio_balloon sg gf128mul glue_helper
      ablk_helper cryptd snd soundcore i2c_piix4 pcspkr ip_tables xfs
      libcrc32c sr_mod sd_mod cdrom crc_t10dif crct10dif_generic ata_generic
      [ 1035.684102]  pata_acpi virtio_console qxl drm_kms_helper syscopyarea
      sysfillrect sysimgblt floppy fb_sys_fops crct10dif_pclmul
      crct10dif_common ttm crc32c_intel serio_raw ata_piix drm libata 8139too
      virtio_pci drm_panel_orientation_quirks virtio_ring virtio mii dm_mirror
      dm_region_hash dm_log dm_mod [last unloaded: 8139cp]
      [ 1035.684132] CPU: 9 PID: 25140 Comm: if-mtu-change Kdump: loaded
      Tainted: G           OE  ------------ T 3.10.0-957.el7.x86_64 #1
      [ 1035.684134] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
      [ 1035.684136] task: ffff8f59b1f5a080 ti: ffff8f5a2e32c000 task.ti:
      ffff8f5a2e32c000
      [ 1035.684149] RIP: 0010:[<ffffffffba3a40d0>]  [<ffffffffba3a40d0>]
      dql_completed+0x180/0x190
      [ 1035.684162] RSP: 0000:ffff8f5a75483e50  EFLAGS: 00010093
      [ 1035.684162] RAX: 00000000000000c2 RBX: ffff8f5a6f91c000 RCX:
      0000000000000000
      [ 1035.684162] RDX: 0000000000000000 RSI: 0000000000000184 RDI:
      ffff8f599fea3ec0
      [ 1035.684162] RBP: ffff8f5a75483ea8 R08: 00000000000000c2 R09:
      0000000000000000
      [ 1035.684162] R10: 00000000000616ef R11: ffff8f5a75483b56 R12:
      ffff8f599fea3e00
      [ 1035.684162] R13: 0000000000000001 R14: 0000000000000000 R15:
      0000000000000184
      [ 1035.684162] FS:  00007fa8434de740(0000) GS:ffff8f5a75480000(0000)
      knlGS:0000000000000000
      [ 1035.684162] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 1035.684162] CR2: 00000000004305d0 CR3: 000000024eb66000 CR4:
      00000000001406e0
      [ 1035.684162] Call Trace:
      [ 1035.684162]  <IRQ>
      [ 1035.684162]  [<ffffffffc08cbaf8>] ? cp_interrupt+0x478/0x580 [8139cp]
      [ 1035.684162]  [<ffffffffba14a294>]
      __handle_irq_event_percpu+0x44/0x1c0
      [ 1035.684162]  [<ffffffffba14a442>] handle_irq_event_percpu+0x32/0x80
      [ 1035.684162]  [<ffffffffba14a4cc>] handle_irq_event+0x3c/0x60
      [ 1035.684162]  [<ffffffffba14db29>] handle_fasteoi_irq+0x59/0x110
      [ 1035.684162]  [<ffffffffba02e554>] handle_irq+0xe4/0x1a0
      [ 1035.684162]  [<ffffffffba7795dd>] do_IRQ+0x4d/0xf0
      [ 1035.684162]  [<ffffffffba76b362>] common_interrupt+0x162/0x162
      [ 1035.684162]  <EOI>
      [ 1035.684162]  [<ffffffffba0c2ae4>] ? __wake_up_bit+0x24/0x70
      [ 1035.684162]  [<ffffffffba1e46f5>] ? do_set_pte+0xd5/0x120
      [ 1035.684162]  [<ffffffffba1b64fb>] unlock_page+0x2b/0x30
      [ 1035.684162]  [<ffffffffba1e4879>] do_read_fault.isra.61+0x139/0x1b0
      [ 1035.684162]  [<ffffffffba1e9134>] handle_pte_fault+0x2f4/0xd10
      [ 1035.684162]  [<ffffffffba1ebc6d>] handle_mm_fault+0x39d/0x9b0
      [ 1035.684162]  [<ffffffffba76f5e3>] __do_page_fault+0x203/0x500
      [ 1035.684162]  [<ffffffffba76f9c6>] trace_do_page_fault+0x56/0x150
      [ 1035.684162]  [<ffffffffba76ef42>] do_async_page_fault+0x22/0xf0
      [ 1035.684162]  [<ffffffffba76b788>] async_page_fault+0x28/0x30
      [ 1035.684162] Code: 54 c7 47 54 ff ff ff ff 44 0f 49 ce 48 8b 35 48 2f
      9c 00 48 89 77 58 e9 fe fe ff ff 0f 1f 80 00 00 00 00 41 89 d1 e9 ef fe
      ff ff <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 55 8d 42 ff 48
      [ 1035.684162] RIP  [<ffffffffba3a40d0>] dql_completed+0x180/0x190
      [ 1035.684162]  RSP <ffff8f5a75483e50>
      
      It's not the same as in 7fe0ee09 patch described.
      As 8139cp uses shared irq mode, other device irq will trigger
      cp_interrupt to execute.
      
      cp_change_mtu
       -> cp_close
       -> cp_open
      
      In cp_close routine  just before free_irq(), some interrupt may occur.
      In my environment, cp_interrupt exectutes and IntrStatus is 0x4,
      exactly TxOk. That will cause cp_tx to wake device queue.
      
      As device queue is started, cp_start_xmit and cp_open will run at same
      time which will cause kernel BUG.
      
      For example:
      [#] for tx descriptor
      
      At start:
      
      [#][#][#]
      num_queued=3
      
      After cp_init_hw->cp_start_hw->netdev_reset_queue:
      
      [#][#][#]
      num_queued=0
      
      When 8139cp starts to work then cp_tx will check
      num_queued mismatchs the complete_bytes.
      
      The patch will check IntrMask before check IntrStatus in cp_interrupt.
      When 8139cp interrupt is disabled, just return.
      Signed-off-by: default avatarSu Yanjun <suyj.fnst@cn.fujitsu.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a5d4a892
    • Heiner Kallweit's avatar
      net: phy: don't allow __set_phy_supported to add unsupported modes · d2a36971
      Heiner Kallweit authored
      Currently __set_phy_supported allows to add modes w/o checking whether
      the PHY supports them. This is wrong, it should never add modes but
      only remove modes we don't want to support.
      
      The commit marked as fixed didn't do anything wrong, it just copied
      existing functionality to the helper which is being fixed now.
      
      Fixes: f3a6bd39 ("phylib: Add phy_set_max_speed helper")
      Signed-off-by: default avatarHeiner Kallweit <hkallweit1@gmail.com>
      Reviewed-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d2a36971
  2. 01 Dec, 2018 10 commits
  3. 30 Nov, 2018 9 commits
    • John Hurley's avatar
      nfp: flower: prevent offload if rhashtable insert fails · b5f0cf08
      John Hurley authored
      For flow offload adds, if the rhash insert code fails, the flow will still
      have been offloaded but the reference to it in the driver freed.
      
      Re-order the offload setup calls to ensure that a flow will only be written
      to FW if a kernel reference is held and stored in the rhashtable. Remove
      this hashtable entry if the offload fails.
      
      Fixes: c01d0efa ("nfp: flower: use rhashtable for flow caching")
      Signed-off-by: default avatarJohn Hurley <john.hurley@netronome.com>
      Reviewed-by: default avatarPieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
      Reviewed-by: default avatarJakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b5f0cf08
    • John Hurley's avatar
      nfp: flower: release metadata on offload failure · 11664948
      John Hurley authored
      Calling nfp_compile_flow_metadata both assigns a stats context and
      increments a ref counter on (or allocates) a mask id table entry. These
      are released by the nfp_modify_flow_metadata call on flow deletion,
      however, if a flow add fails after metadata is set then the flow entry
      will be deleted but the metadata assignments leaked.
      
      Add an error path to the flow add offload function to ensure allocated
      metadata is released in the event of an offload fail.
      
      Fixes: 81f3ddf2 ("nfp: add control message passing capabilities to flower offloads")
      Signed-off-by: default avatarJohn Hurley <john.hurley@netronome.com>
      Reviewed-by: default avatarPieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
      Reviewed-by: default avatarJakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      11664948
    • Toni Peltonen's avatar
      bonding: fix 802.3ad state sent to partner when unbinding slave · 3b5b3a33
      Toni Peltonen authored
      Previously when unbinding a slave the 802.3ad implementation only told
      partner that the port is not suitable for aggregation by setting the port
      aggregation state from aggregatable to individual. This is not enough. If the
      physical layer still stays up and we only unbinded this port from the bond there
      is nothing in the aggregation status alone to prevent the partner from sending
      traffic towards us. To ensure that the partner doesn't consider this
      port at all anymore we should also disable collecting and distributing to
      signal that this actor is going away. Also clear AD_STATE_SYNCHRONIZATION to
      ensure partner exits collecting + distributing state.
      
      I have tested this behaviour againts Arista EOS switches with mlx5 cards
      (physical link stays up even when interface is down) and simulated
      the same situation virtually Linux <-> Linux with two network namespaces
      running two veth device pairs. In both cases setting aggregation to
      individual doesn't alone prevent traffic from being to sent towards this
      port given that the link stays up in partners end. Partner still keeps
      it's end in collecting + distributing state and continues until timeout is
      reached. In most cases this means we are losing the traffic partner sends
      towards our port while we wait for timeout. This is most visible with slow
      periodic time (LACP rate slow).
      
      Other open source implementations like Open VSwitch and libreswitch, and
      vendor implementations like Arista EOS, seem to disable collecting +
      distributing to when doing similar port disabling/detaching/removing change.
      With this patch kernel implementation would behave the same way and ensure
      partner doesn't consider our actor viable anymore.
      Signed-off-by: default avatarToni Peltonen <peltzi@peltzi.fi>
      Signed-off-by: default avatarJay Vosburgh <jay.vosburgh@canonical.com>
      Acked-by: default avatarJonathan Toppins <jtoppins@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3b5b3a33
    • Dmitry Bogdanov's avatar
      net: aquantia: fix rx checksum offload bits · 37c4b91f
      Dmitry Bogdanov authored
      The last set of csum offload fixes had a leak:
      
      Checksum enabled status bits from rx descriptor were incorrectly
      interpreted. Consequently all the other valid logic worked on zero bits.
      That caused rx checksum offloads never to trigger.
      
      Tested by dumping rx descriptors and validating resulting csum_level.
      Reported-by: default avatarIgor Russkikh <igor.russkikh@aquantia.com>
      Signed-off-by: default avatarDmitry Bogdanov <dmitry.bogdanov@aquantia.com>
      Signed-off-by: default avatarIgor Russkikh <igor.russkikh@aquantia.com>
      Fixes: ad703c2b ("net: aquantia: invalid checksumm offload implementation")
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      37c4b91f
    • Colin Ian King's avatar
      openvswitch: fix spelling mistake "execeeds" -> "exceeds" · 43d0e960
      Colin Ian King authored
      There is a spelling mistake in a net_warn_ratelimited message, fix this.
      Signed-off-by: default avatarColin Ian King <colin.king@canonical.com>
      Reviewed-by: default avatarSimon Horman <simon.horman@netronome.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      43d0e960
    • Colin Ian King's avatar
      liquidio: fix spelling mistake "deferal" -> "deferral" · 56e0e295
      Colin Ian King authored
      There is a spelling mistake in the oct_stats_strings array, fix it.
      Signed-off-by: default avatarColin Ian King <colin.king@canonical.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      56e0e295
    • Thierry Reding's avatar
      net: stmmac: Move debugfs init/exit to ->probe()/->remove() · 5f2b8b62
      Thierry Reding authored
      Setting up and tearing down debugfs is current unbalanced, as seen by
      this error during resume from suspend:
      
          [  752.134067] dwc-eth-dwmac 2490000.ethernet eth0: ERROR failed to create debugfs directory
          [  752.134347] dwc-eth-dwmac 2490000.ethernet eth0: stmmac_hw_setup: failed debugFS registration
      
      The imbalance happens because the driver creates the debugfs hierarchy
      when the device is opened and tears it down when the device is closed.
      There's little gain in that, and it could be argued that it is even
      surprising because it's not usually done for other devices. Fix the
      imbalance by moving the debugfs creation and teardown to the driver's
      ->probe() and ->remove() implementations instead.
      
      Note that the ring descriptors cannot be read while the interface is
      down, so make sure to return an empty file when the descriptors_status
      debugfs file is read.
      Signed-off-by: default avatarThierry Reding <treding@nvidia.com>
      Acked-by: default avatarJose Abreu <joabreu@synopsys.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      5f2b8b62
    • Xin Long's avatar
      sctp: update frag_point when stream_interleave is set · 4135cce7
      Xin Long authored
      sctp_assoc_update_frag_point() should be called whenever asoc->pathmtu
      changes, but we missed one place in sctp_association_init(). It would
      cause frag_point is zero when sending data.
      
      As says in Jakub's reproducer, if sp->pathmtu is set by socketopt, the
      new asoc->pathmtu inherits it in sctp_association_init(). Later when
      transports are added and their pmtu >= asoc->pathmtu, it will never
      call sctp_assoc_update_frag_point() to set frag_point.
      
      This patch is to fix it by updating frag_point after asoc->pathmtu is
      set as sp->pathmtu in sctp_association_init(). Note that it moved them
      after sctp_stream_init(), as stream->si needs to be set first.
      
      Frag_point's calculation is also related with datachunk's type, so it
      needs to update frag_point when stream->si may be changed in
      sctp_process_init().
      
      v1->v2:
        - call sctp_assoc_update_frag_point() separately in sctp_process_init
          and sctp_association_init, per Marcelo's suggestion.
      
      Fixes: 2f5e3c9d ("sctp: introduce sctp_assoc_update_frag_point")
      Reported-by: default avatarJakub Audykowicz <jakub.audykowicz@gmail.com>
      Signed-off-by: default avatarXin Long <lucien.xin@gmail.com>
      Acked-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4135cce7
    • Christoph Paasch's avatar
      net: Prevent invalid access to skb->prev in __qdisc_drop_all · 9410d386
      Christoph Paasch authored
      __qdisc_drop_all() accesses skb->prev to get to the tail of the
      segment-list.
      
      With commit 68d2f84a ("net: gro: properly remove skb from list")
      the skb-list handling has been changed to set skb->next to NULL and set
      the list-poison on skb->prev.
      
      With that change, __qdisc_drop_all() will panic when it tries to
      dereference skb->prev.
      
      Since commit 992cba7e ("net: Add and use skb_list_del_init().")
      __list_del_entry is used, leaving skb->prev unchanged (thus,
      pointing to the list-head if it's the first skb of the list).
      This will make __qdisc_drop_all modify the next-pointer of the list-head
      and result in a panic later on:
      
      [   34.501053] general protection fault: 0000 [#1] SMP KASAN PTI
      [   34.501968] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.20.0-rc2.mptcp #108
      [   34.502887] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.5.1 01/01/2011
      [   34.504074] RIP: 0010:dev_gro_receive+0x343/0x1f90
      [   34.504751] Code: e0 48 c1 e8 03 42 80 3c 30 00 0f 85 4a 1c 00 00 4d 8b 24 24 4c 39 65 d0 0f 84 0a 04 00 00 49 8d 7c 24 38 48 89 f8 48 c1 e8 03 <42> 0f b6 04 30 84 c0 74 08 3c 04
      [   34.507060] RSP: 0018:ffff8883af507930 EFLAGS: 00010202
      [   34.507761] RAX: 0000000000000007 RBX: ffff8883970b2c80 RCX: 1ffff11072e165a6
      [   34.508640] RDX: 1ffff11075867008 RSI: ffff8883ac338040 RDI: 0000000000000038
      [   34.509493] RBP: ffff8883af5079d0 R08: ffff8883970b2d40 R09: 0000000000000062
      [   34.510346] R10: 0000000000000034 R11: 0000000000000000 R12: 0000000000000000
      [   34.511215] R13: 0000000000000000 R14: dffffc0000000000 R15: ffff8883ac338008
      [   34.512082] FS:  0000000000000000(0000) GS:ffff8883af500000(0000) knlGS:0000000000000000
      [   34.513036] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   34.513741] CR2: 000055ccc3e9d020 CR3: 00000003abf32000 CR4: 00000000000006e0
      [   34.514593] Call Trace:
      [   34.514893]  <IRQ>
      [   34.515157]  napi_gro_receive+0x93/0x150
      [   34.515632]  receive_buf+0x893/0x3700
      [   34.516094]  ? __netif_receive_skb+0x1f/0x1a0
      [   34.516629]  ? virtnet_probe+0x1b40/0x1b40
      [   34.517153]  ? __stable_node_chain+0x4d0/0x850
      [   34.517684]  ? kfree+0x9a/0x180
      [   34.518067]  ? __kasan_slab_free+0x171/0x190
      [   34.518582]  ? detach_buf+0x1df/0x650
      [   34.519061]  ? lapic_next_event+0x5a/0x90
      [   34.519539]  ? virtqueue_get_buf_ctx+0x280/0x7f0
      [   34.520093]  virtnet_poll+0x2df/0xd60
      [   34.520533]  ? receive_buf+0x3700/0x3700
      [   34.521027]  ? qdisc_watchdog_schedule_ns+0xd5/0x140
      [   34.521631]  ? htb_dequeue+0x1817/0x25f0
      [   34.522107]  ? sch_direct_xmit+0x142/0xf30
      [   34.522595]  ? virtqueue_napi_schedule+0x26/0x30
      [   34.523155]  net_rx_action+0x2f6/0xc50
      [   34.523601]  ? napi_complete_done+0x2f0/0x2f0
      [   34.524126]  ? kasan_check_read+0x11/0x20
      [   34.524608]  ? _raw_spin_lock+0x7d/0xd0
      [   34.525070]  ? _raw_spin_lock_bh+0xd0/0xd0
      [   34.525563]  ? kvm_guest_apic_eoi_write+0x6b/0x80
      [   34.526130]  ? apic_ack_irq+0x9e/0xe0
      [   34.526567]  __do_softirq+0x188/0x4b5
      [   34.527015]  irq_exit+0x151/0x180
      [   34.527417]  do_IRQ+0xdb/0x150
      [   34.527783]  common_interrupt+0xf/0xf
      [   34.528223]  </IRQ>
      
      This patch makes sure that skb->prev is set to NULL when entering
      netem_enqueue.
      
      Cc: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
      Cc: Tyler Hicks <tyhicks@canonical.com>
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Fixes: 68d2f84a ("net: gro: properly remove skb from list")
      Suggested-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: default avatarChristoph Paasch <cpaasch@apple.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9410d386
  4. 29 Nov, 2018 12 commits
  5. 28 Nov, 2018 4 commits
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 60b54823
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) ARM64 JIT fixes for subprog handling from Daniel Borkmann.
      
       2) Various sparc64 JIT bug fixes (fused branch convergance, frame
          pointer usage detection logic, PSEODU call argument handling).
      
       3) Fix to use BH locking in nf_conncount, from Taehee Yoo.
      
       4) Fix race of TX skb freeing in ipheth driver, from Bernd Eckstein.
      
       5) Handle return value of TX NAPI completion properly in lan743x
          driver, from Bryan Whitehead.
      
       6) MAC filter deletion in i40e driver clears wrong state bit, from
          Lihong Yang.
      
       7) Fix use after free in rionet driver, from Pan Bian.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (53 commits)
        s390/qeth: fix length check in SNMP processing
        net: hisilicon: remove unexpected free_netdev
        rapidio/rionet: do not free skb before reading its length
        i40e: fix kerneldoc for xsk methods
        ixgbe: recognize 1000BaseLX SFP modules as 1Gbps
        i40e: Fix deletion of MAC filters
        igb: fix uninitialized variables
        netfilter: nf_tables: deactivate expressions in rule replecement routine
        lan743x: Enable driver to work with LAN7431
        tipc: fix lockdep warning during node delete
        lan743x: fix return value for lan743x_tx_napi_poll
        net: via: via-velocity: fix spelling mistake "alignement" -> "alignment"
        qed: fix spelling mistake "attnetion" -> "attention"
        net: thunderx: fix NULL pointer dereference in nic_remove
        sctp: increase sk_wmem_alloc when head->truesize is increased
        firestream: fix spelling mistake: "Inititing" -> "Initializing"
        net: phy: add workaround for issue where PHY driver doesn't bind to the device
        usbnet: ipheth: fix potential recvmsg bug and recvmsg bug 2
        sparc: Adjust bpf JIT prologue for PSEUDO calls.
        bpf, doc: add entries of who looks over which jits
        ...
      60b54823
    • Linus Torvalds's avatar
      Merge tag 'xtensa-20181128' of git://github.com/jcmvbkbc/linux-xtensa · b26b2b24
      Linus Torvalds authored
      Pull Xtensa fixes from Max Filippov:
      
       - fix kernel exception on userspace access to a currently disabled
         coprocessor
      
       - fix coprocessor data saving/restoring in configurations with multiple
         coprocessors
      
       - fix ptrace access to coprocessor data on configurations with multiple
         coprocessors with high alignment requirements
      
      * tag 'xtensa-20181128' of git://github.com/jcmvbkbc/linux-xtensa:
        xtensa: fix coprocessor part of ptrace_{get,set}xregs
        xtensa: fix coprocessor context offset definitions
        xtensa: enable coprocessors that are being flushed
      b26b2b24
    • David S. Miller's avatar
      Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue · d78a5ebd
      David S. Miller authored
      Jeff Kirsher says:
      
      ====================
      Intel Wired LAN Driver Fixes 2018-11-28
      
      This series contains fixes to igb, ixgbe and i40e.
      
      Yunjian Wang from Huawei resolves a variable that could potentially be
      NULL before it is used.
      
      Lihong fixes an i40e issue which goes back to 4.17 kernels, where
      deleting any of the MAC filters was causing the incorrect syncing for
      the PF.
      
      Josh Elsasser caught that there were missing enum values in the link
      capabilities for x550 devices, which was preventing link for 1000BaseLX
      SFP modules.
      
      Jan fixes the function header comments for XSK methods.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d78a5ebd
    • Julian Wiedmann's avatar
      s390/qeth: fix length check in SNMP processing · 9a764c1e
      Julian Wiedmann authored
      The response for a SNMP request can consist of multiple parts, which
      the cmd callback stages into a kernel buffer until all parts have been
      received. If the callback detects that the staging buffer provides
      insufficient space, it bails out with error.
      This processing is buggy for the first part of the response - while it
      initially checks for a length of 'data_len', it later copies an
      additional amount of 'offsetof(struct qeth_snmp_cmd, data)' bytes.
      
      Fix the calculation of 'data_len' for the first part of the response.
      This also nicely cleans up the memcpy code.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: default avatarJulian Wiedmann <jwi@linux.ibm.com>
      Reviewed-by: default avatarUrsula Braun <ubraun@linux.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9a764c1e