1. 26 Aug, 2015 2 commits
    • drivers: net: xgene: fix: Oops in linkwatch_fire_event · ccc02ddb
      Iyappan Subramanian authored
      [ 1065.801569] Internal error: Oops: 96000006 [#1] SMP
      ...
      [ 1065.866655] Hardware name: AppliedMicro Mustang/Mustang, BIOS 1.1.0 Apr 22 2015
      [ 1065.873937] Workqueue: events_power_efficient phy_state_machine
      [ 1065.879837] task: fffffe01de105e80 ti: fffffe00bcf18000 task.ti: fffffe00bcf18000
      [ 1065.887288] PC is at linkwatch_fire_event+0xac/0xc0
      [ 1065.892141] LR is at linkwatch_fire_event+0xa0/0xc0
      [ 1065.896995] pc : [<fffffe000060284c>] lr : [<fffffe0000602840>] pstate: 200001c5
      [ 1065.904356] sp : fffffe00bcf1bd00
      ...
      [ 1066.196813] Call Trace:
      [ 1066.199248] [<fffffe000060284c>] linkwatch_fire_event+0xac/0xc0
      [ 1066.205140] [<fffffe000061167c>] netif_carrier_off+0x54/0x64
      [ 1066.210773] [<fffffe00004f1654>] phy_state_machine+0x120/0x3bc
      [ 1066.216578] [<fffffe00000d8d10>] process_one_work+0x15c/0x3a8
      [ 1066.222296] [<fffffe00000d9090>] worker_thread+0x134/0x470
      [ 1066.227757] [<fffffe00000df014>] kthread+0xe0/0xf8
      [ 1066.232525] Code: 97f65ee9 f9420660 d538d082 8b000042 (885f7c40)
      
      The fix is to call phy_disconnect() from xgene_enet_mdio_remove(),
      which in turn calls cancel_delayed_work_sync().
      Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • cls_u32: complete the check for non-forced case in u32_destroy() · a6c1aea0
      WANG Cong authored
      In commit 1e052be6 ("net_sched: destroy proto tp when all filters are gone")
      I added a check in u32_destroy() to see if all real filters are gone
      for each tp. However, that check is only done for root_ht; the same
      check is needed for the other hash tables.
      
      This can be reproduced by the following tc commands:
      
      tc filter add dev eth0 parent 1:0 prio 5 handle 15: protocol ip u32 divisor 256
      tc filter add dev eth0 protocol ip parent 1: prio 5 handle 15:2:2 u32 \
          ht 15:2: match ip src 10.0.0.2 flowid 1:10
      tc filter add dev eth0 protocol ip parent 1: prio 5 handle 15:2:3 u32 \
          ht 15:2: match ip src 10.0.0.3 flowid 1:10
      
      Fixes: 1e052be6 ("net_sched: destroy proto tp when all filters are gone")
      Reported-by: Akshat Kakkar <akshat.1984@gmail.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: Cong Wang <cwang@twopensource.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 25 Aug, 2015 8 commits
  3. 24 Aug, 2015 3 commits
    • net: Fix RCU splat in af_key · ba51b6be
      David Ahern authored
      Hit the following splat testing VRF change for ipsec:
      
      [  113.475692] ===============================
      [  113.476194] [ INFO: suspicious RCU usage. ]
      [  113.476667] 4.2.0-rc6-1+deb7u2+clUNRELEASED #3.2.65-1+deb7u2+clUNRELEASED Not tainted
      [  113.477545] -------------------------------
      [  113.478013] /work/monster-14/dsa/kernel.git/include/linux/rcupdate.h:568 Illegal context switch in RCU read-side critical section!
      [  113.479288]
      [  113.479288] other info that might help us debug this:
      [  113.479288]
      [  113.480207]
      [  113.480207] rcu_scheduler_active = 1, debug_locks = 1
      [  113.480931] 2 locks held by setkey/6829:
      [  113.481371]  #0:  (&net->xfrm.xfrm_cfg_mutex){+.+.+.}, at: [<ffffffff814e9887>] pfkey_sendmsg+0xfb/0x213
      [  113.482509]  #1:  (rcu_read_lock){......}, at: [<ffffffff814e767f>] rcu_read_lock+0x0/0x6e
      [  113.483509]
      [  113.483509] stack backtrace:
      [  113.484041] CPU: 0 PID: 6829 Comm: setkey Not tainted 4.2.0-rc6-1+deb7u2+clUNRELEASED #3.2.65-1+deb7u2+clUNRELEASED
      [  113.485422] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5.1-0-g8936dbb-20141113_115728-nilsson.home.kraxel.org 04/01/2014
      [  113.486845]  0000000000000001 ffff88001d4c7a98 ffffffff81518af2 ffffffff81086962
      [  113.487732]  ffff88001d538480 ffff88001d4c7ac8 ffffffff8107ae75 ffffffff8180a154
      [  113.488628]  0000000000000b30 0000000000000000 00000000000000d0 ffff88001d4c7ad8
      [  113.489525] Call Trace:
      [  113.489813]  [<ffffffff81518af2>] dump_stack+0x4c/0x65
      [  113.490389]  [<ffffffff81086962>] ? console_unlock+0x3d6/0x405
      [  113.491039]  [<ffffffff8107ae75>] lockdep_rcu_suspicious+0xfa/0x103
      [  113.491735]  [<ffffffff81064032>] rcu_preempt_sleep_check+0x45/0x47
      [  113.492442]  [<ffffffff8106404d>] ___might_sleep+0x19/0x1c8
      [  113.493077]  [<ffffffff81064268>] __might_sleep+0x6c/0x82
      [  113.493681]  [<ffffffff81133190>] cache_alloc_debugcheck_before.isra.50+0x1d/0x24
      [  113.494508]  [<ffffffff81134876>] kmem_cache_alloc+0x31/0x18f
      [  113.495149]  [<ffffffff814012b5>] skb_clone+0x64/0x80
      [  113.495712]  [<ffffffff814e6f71>] pfkey_broadcast_one+0x3d/0xff
      [  113.496380]  [<ffffffff814e7b84>] pfkey_broadcast+0xb5/0x11e
      [  113.497024]  [<ffffffff814e82d1>] pfkey_register+0x191/0x1b1
      [  113.497653]  [<ffffffff814e9770>] pfkey_process+0x162/0x17e
      [  113.498274]  [<ffffffff814e9895>] pfkey_sendmsg+0x109/0x213
      
      In pfkey_sendmsg() the net mutex is taken, and then pfkey_broadcast()
      takes the RCU lock.
      
      Because pfkey_broadcast() takes the RCU lock, its allocation argument is
      pointless: GFP_ATOMIC must be used between rcu_read_lock() and
      rcu_read_unlock(). The one call outside of the RCU read-side section can
      be done with GFP_KERNEL.
      
      Fixes: 7f6b9dbd ("af_key: locking change")
      Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bcmgenet: fix uncleaned dma flags · b6df7d61
      Jaedon Shin authored
      Clean the DMA flags of the multiq ring buffers in the interface stop
      process. This patch fixes the genet not running after the interface
      is re-enabled.
      
      $ ifup eth0 - running after booting
      $ ifdown eth0
      $ ifup eth0 - not running, and tx_timeout occurs
      
      bcmgenet_dma_disable() in bcmgenet_open() cleans only the ring16 DMA
      flag. If the genet has multiq, the DMA register is not fully cleaned
      and bcmgenet_init_dma() is not done correctly. In the case of
      GENET_V2 (tx_queues=4), tdma_ctrl still holds 0x1e after running
      bcmgenet_dma_disable().
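
The 0x1e figure can be reproduced with a little bit arithmetic. The sketch below is a userspace illustration, not driver code. The bit layout (master enable in bit 0, per-ring enable bits starting at bit 1, the default ring at index 16) is an assumption chosen to be consistent with the 0x1e quoted in the message, and all helper names are hypothetical.

```c
#include <stdint.h>

/* Assumed bit layout, for illustration only. */
#define DMA_EN                (1u << 0)  /* master DMA enable */
#define DMA_RING_BUF_EN_SHIFT 1          /* ring N enable lives at bit N + 1 */
#define DESC_INDEX            16         /* default ring ("ring16") */

/* Control value with the master enable, ring16, and tx_queues rings enabled. */
static uint32_t tdma_ctrl_enabled(unsigned int tx_queues)
{
    uint32_t reg = DMA_EN | (1u << (DESC_INDEX + DMA_RING_BUF_EN_SHIFT));

    for (unsigned int q = 0; q < tx_queues; q++)
        reg |= 1u << (q + DMA_RING_BUF_EN_SHIFT);
    return reg;
}

/* Clears only the master enable and ring16, as the message describes. */
static uint32_t dma_disable_ring16_only(uint32_t reg)
{
    return reg & ~(DMA_EN | (1u << (DESC_INDEX + DMA_RING_BUF_EN_SHIFT)));
}

/* Also clears the enable bit of every multiq ring. */
static uint32_t dma_disable_all_rings(uint32_t reg, unsigned int tx_queues)
{
    uint32_t mask = DMA_EN | (1u << (DESC_INDEX + DMA_RING_BUF_EN_SHIFT));

    for (unsigned int q = 0; q < tx_queues; q++)
        mask |= 1u << (q + DMA_RING_BUF_EN_SHIFT);
    return reg & ~mask;
}
```

Under these assumptions, dma_disable_ring16_only(tdma_ctrl_enabled(4)) leaves bits 1 through 4 set, which is 0x1e, while dma_disable_all_rings() with the same input returns 0.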
      Signed-off-by: Jaedon Shin <jaedon.shin@gmail.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bcmgenet: Avoid sleeping in bcmgenet_timeout · eed63569
      Florian Fainelli authored
      bcmgenet_timeout() executes in atomic context, yet we will invoke
      napi_disable() which does sleep. Looking back at the changes, disabling
      TX napi and re-enabling it is completely useless, since we reclaim all
      TX buffers and re-enable interrupts, and wake up the TX queues.
      
      Fixes: 13ea6578 ("net: bcmgenet: improve TX timeout")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 23 Aug, 2015 1 commit
  5. 21 Aug, 2015 1 commit
  6. 20 Aug, 2015 7 commits
    • Merge branches 'acpi-video' and 'cpufreq-fixes' · b8a1171f
      Rafael J. Wysocki authored
      * acpi-video:
        ACPI / video: Fix circular lock dependency issue in the video-detect code
      
      * cpufreq-fixes:
        cpufreq: exynos: Fix for memory leak in case SoC name does not match
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 28e55d07
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Out of bounds array access in 802.11 minstrel code, from Adrien
          Schildknecht.
      
       2) Don't use skb_get() in IGMP/MLD code paths, as this makes
          pskb_may_pull() BUG.  From Linus Luessing.
      
       3) Fix off by one in ipv4 route dumping code, from Andy Whitcroft.
      
       4) Fix deadlock in reqsk_queue_unlink(), from Eric Dumazet.
      
       5) Fix ppp device deregistration wrt.  netns deletion, from Guillaume
          Nault.
      
       6) Fix deadlock when creating per-cpu ipv6 routes, from Martin KaFai
          Lau.
      
       7) Fix memory leak in batman-adv code, from Sven Eckelmann.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
        batman-adv: Fix memory leak on tt add with invalid vlan
        net: phy: fix semicolon.cocci warnings
        net: qmi_wwan: add HP lt4111 LTE/EV-DO/HSPA+ Gobi 4G Module
        be2net: avoid vxlan offloading on multichannel configs
        ipv6: Fix a potential deadlock when creating pcpu rt
        ipv6: Add rt6_make_pcpu_route()
        ipv6: Remove un-used argument from ip6_dst_alloc()
        net: phy: workaround for buggy cable detection by LAN8700 after cable plugging
        net: ethernet: micrel: fix an error code
        ppp: fix device unregistration upon netns deletion
        net: phy: fix PHY_RUNNING in phy_state_machine
        Revert "net: limit tcp/udp rmem/wmem to SOCK_{RCV,SND}BUF_MIN"
        inet: fix potential deadlock in reqsk_queue_unlink()
        gianfar: Restore link state settings after MAC reset
        ipv4: off-by-one in continuation handling in /proc/net/route
        net: fix wrong skb_get() usage / crash in IGMP/MLD parsing code
        mac80211: fix invalid read in minstrel_sort_best_tp_rates()
    • Merge tag 'for-linus-4.2-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · 3d3e66ba
      Linus Torvalds authored
      Pull xen build fix from David Vrabel:
       "Fix i386 build with an (uncommon) configuration"
      
      * tag 'for-linus-4.2-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        x86/xen: make CONFIG_XEN depend on CONFIG_X86_LOCAL_APIC
    • Merge tag 'sound-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · a971dbca
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "Here is a small collection of sound fix patches.
      
        The most significant one is the disablement of newly introduced
        topology API.  Its ABI couldn't be stabilized enough, so we decided to
        delay for 4.3 in the end.  Other than that, all oneliner fixes: a
        USB-audio runtime PM fix and a couple of HD-audio quirks"
      
      * tag 'sound-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: hda - Add dock support for Thinkpad W541 (17aa:2211)
        ALSA: usb-audio: Fix runtime PM unbalance
        ASoC: topology: Disable use from userspace
        ASoC: topology: Add Kconfig option for topology
        ALSA: hda - Fix the white noise on Dell laptop
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending · 3243f50b
      Linus Torvalds authored
      Pull SCSI target fixes from Nicholas Bellinger:
       "This contains a v4.2-rc specific RCU module unload regression bug-fix,
        a long-standing iscsi-target bug-fix for duplicate target_xfer_tags
        during NOP processing from Alexei, and two more small REPORT_LUNs
        emulation related patches to make Solaris FC host LUN scanning happy
        from Roland.
      
        There is also one patch not included that allows target-core to limit
        the number of fabric driver SGLs per I/O request using residuals, that
        is currently required as a work-around for FC hosts which don't honor
        EVPD block-limits settings.  At this point, it will most likely become
        for-next material"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
        target: Fix handling of small allocation lengths in REPORT LUNS
        target: REPORT LUNS should return LUN 0 even for dynamic ACLs
        target/iscsi: Fix double free of a TUR followed by a solicited NOPOUT
        target: Perform RCU callback barrier before backend/fabric unload
    • Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal · 3bd8f7d8
      Linus Torvalds authored
      Pull thermal fixes from Eduardo Valentin:
       "Last minute fixes on the thermal-soc tree.  There is a fix for a
        long-standing bug in the cpu cooling device, thanks to RMK for
        pushing this"
      
      * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal:
        thermal/cpu_cooling: update policy limits if clipped_freq < policy->max
        thermal/cpu_cooling: rename max_freq as clipped_freq in notifier
        thermal/cpu_cooling: rename cpufreq_val as clipped_freq
        thermal/cpu_cooling: convert 'switch' block to 'if' block in notifier
        thermal/cpu_cooling: quit early after updating policy
        thermal/cpu_cooling: No need to initialize max_freq to 0
        thermal: cpu_cooling: fix lockdep problems in cpu_cooling
        thermal: power_allocator: do not use devm* interfaces
    • x86/xen: make CONFIG_XEN depend on CONFIG_X86_LOCAL_APIC · 87ffd2b9
      David Vrabel authored
      Since commit feb44f1f (x86/xen:
      Provide a "Xen PV" APIC driver to support >255 VCPUs) Xen guests need
      a full APIC driver and thus should depend on X86_LOCAL_APIC.
      
      This fixes an i386 build failure with !SMP && !CONFIG_X86_UP_APIC by
      disabling Xen support in this configuration.
      
      Users needing Xen support in a non-SMP i386 kernel will need to enable
      CONFIG_X86_UP_APIC.
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
      Cc: <stable@vger.kernel.org>
  7. 19 Aug, 2015 5 commits
  8. 18 Aug, 2015 4 commits
  9. 17 Aug, 2015 9 commits
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma · bf674028
      Linus Torvalds authored
      Pull rdma bugfix from Doug Ledford:
       "Bugfix in iw_cxgb4"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
        iw_cxgb4: gracefully handle unknown CQE status errors
    • Merge branch 'for-4.2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata · 4e7fca0d
      Linus Torvalds authored
      Pull libata fixes from Tejun Heo:
       "Three minor device-specific fixes and revert of NCQ autosense added
        during this -rc1.
      
        It turned out that NCQ autosense as currently implemented interferes
        with the usual error handling behavior.  It will be revisited in the
        near future"
      
      * 'for-4.2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
        ata: ahci_brcmstb: Fix misuse of IS_ENABLED
        sata_sx4: Check return code from pdc20621_i2c_read()
        Revert "libata: Implement NCQ autosense"
        Revert "libata: Implement support for sense data reporting"
        Revert "libata-eh: Set 'information' field for autosense"
        ata: ahci_brcmstb: Fix warnings with CONFIG_PM_SLEEP=n
    • Merge branch 'for-4.2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup · e9ab22d2
      Linus Torvalds authored
      Pull cgroup fix from Tejun Heo:
       "A fix for a subtle bug introduced back during 3.17 cycle which
        interferes with setting configurations under specific conditions"
      
      * 'for-4.2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
        cpuset: use trialcs->mems_allowed as a temp variable
    • net: phy: fix semicolon.cocci warnings · ff94c742
      kbuild test robot authored
      drivers/net/phy/smsc.c:127:3-4: Unneeded semicolon
      
       Remove unneeded semicolon.
      
      Generated by: scripts/coccinelle/misc/semicolon.cocci
      
      CC: Igor Plyatov <plyatov@gmail.com>
      Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • David Ward's avatar
      a8079092
    • be2net: avoid vxlan offloading on multichannel configs · af19e686
      Ivan Vecera authored
      VxLAN offloading is not functional if the NIC is running in multichannel
      mode (UMC, FLEX-10, VNIC...). Enabling it additionally kills all
      connectivity through the NIC, and the device needs to be brought down and
      up to restore it. The firmware should take care of this: it should either
      disallow the conversion of the interface to tunnel type
      (be_cmd_manage_iface) or support VxLAN offloading when a multichannel
      config is enabled.
      I have tested this on the latest available firmware (10.6.144.21).
      
      Result:
      [root@sm-04 ~]# ip link set enp5s0f0 up
      [root@sm-04 ~]# ip addr add 172.30.10.50/24 dev enp5s0f0
      [root@sm-04 ~]# ping -c 3 172.30.10.254
      PING 172.30.10.254 (172.30.10.254) 56(84) bytes of data.
      64 bytes from 172.30.10.254: icmp_seq=1 ttl=64 time=0.317 ms
      64 bytes from 172.30.10.254: icmp_seq=2 ttl=64 time=0.187 ms
      64 bytes from 172.30.10.254: icmp_seq=3 ttl=64 time=0.188 ms
      
       --- 172.30.10.254 ping statistics ---
      3 packets transmitted, 3 received, 0% packet loss, time 2000ms
      rtt min/avg/max/mdev = 0.187/0.230/0.317/0.063 ms
      [root@sm-04 ~]# ip link add link enp5s0f0 vxlan10 type vxlan id 10 remote 172.30.10.60 dstport 4789
      [root@sm-04 ~]# ip link set vxlan10 up
      [ 7900.442811] be2net 0000:05:00.0: Enabled VxLAN offloads for UDP port 4789
      [ 7900.455722] be2net 0000:05:00.1: Enabled VxLAN offloads for UDP port 4789
      [ 7900.468635] be2net 0000:05:00.2: Enabled VxLAN offloads for UDP port 4789
      [ 7900.481553] be2net 0000:05:00.3: Enabled VxLAN offloads for UDP port 4789
      [root@sm-04 ~]# ping -c 3 172.30.10.254
      PING 172.30.10.254 (172.30.10.254) 56(84) bytes of data.
      
       --- 172.30.10.254 ping statistics ---
      3 packets transmitted, 0 received, 100% packet loss, time 1999ms
      
      [root@sm-04 ~]# ip link set vxlan10 down
      [ 7959.434093] be2net 0000:05:00.0: Disabled VxLAN offloads for UDP port 4789
      [ 7959.444792] be2net 0000:05:00.1: Disabled VxLAN offloads for UDP port 4789
      [ 7959.455592] be2net 0000:05:00.2: Disabled VxLAN offloads for UDP port 4789
      [ 7959.466416] be2net 0000:05:00.3: Disabled VxLAN offloads for UDP port 4789
      [root@sm-04 ~]# ip link del vxlan10
      [root@sm-04 ~]# ping -c 3 172.30.10.254
      PING 172.30.10.254 (172.30.10.254) 56(84) bytes of data.
      
       --- 172.30.10.254 ping statistics ---
      3 packets transmitted, 0 received, 100% packet loss, time 1999ms
      
      [root@sm-04 ~]# ip link set enp5s0f0 down
      [root@sm-04 ~]# ip link set enp5s0f0 up
      [ 8071.019003] be2net 0000:05:00.0 enp5s0f0: Link is Up
      [root@sm-04 ~]# ping -c 3 172.30.10.254
      PING 172.30.10.254 (172.30.10.254) 56(84) bytes of data.
      64 bytes from 172.30.10.254: icmp_seq=1 ttl=64 time=0.318 ms
      64 bytes from 172.30.10.254: icmp_seq=2 ttl=64 time=0.196 ms
      64 bytes from 172.30.10.254: icmp_seq=3 ttl=64 time=0.194 ms
      
       --- 172.30.10.254 ping statistics ---
      3 packets transmitted, 3 received, 0% packet loss, time 2000ms
      rtt min/avg/max/mdev = 0.194/0.236/0.318/0.057 ms
      
      Cc: Sathya Perla <sathya.perla@avagotech.com>
      Cc: Ajit Khaparde <ajit.khaparde@avagotech.com>
      Cc: Padmanabh Ratnakar <padmanabh.ratnakar@avagotech.com>
      Cc: Sriharsha Basavapatna <sriharsha.basavapatna@avagotech.com>
      Signed-off-by: Ivan Vecera <ivecera@redhat.com>
      Acked-by: Ajit Khaparde <ajit.khaparde@avagotech.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'ipv6_percpu_rt_deadlock' · 1f979b11
      David S. Miller authored
      Martin KaFai Lau says:
      
      ====================
      ipv6: Fix a potential deadlock when creating pcpu rt
      
      v1 -> v2:
      A minor change in the commit message of patch 2.
      
      This patch series fixes a potential deadlock when creating a pcpu rt.
      It happens when dst_alloc() decided to run gc. Something like this:
      
      read_lock(&table->tb6_lock);
      ip6_rt_pcpu_alloc()
      => dst_alloc()
      => ip6_dst_gc()
      => write_lock(&table->tb6_lock); /* oops */
      
      Patches 1 and 2 are prep work.
      Patch 3 is the fix.
      
      Original report: https://bugzilla.kernel.org/show_bug.cgi?id=102291
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Fix a potential deadlock when creating pcpu rt · 9c7370a1
      Martin KaFai Lau authored
      rt6_make_pcpu_route() is called under read_lock(&table->tb6_lock).
      rt6_make_pcpu_route() calls ip6_rt_pcpu_alloc(rt) which then
      calls dst_alloc().  dst_alloc() _may_ call ip6_dst_gc() which takes
      the write_lock(&table->tb6_lock).  A visualized version:
      
      read_lock(&table->tb6_lock);
      rt6_make_pcpu_route();
      => ip6_rt_pcpu_alloc();
      => dst_alloc();
      => ip6_dst_gc();
      => write_lock(&table->tb6_lock); /* oops */
      
      The fix is to do a read_unlock first before calling ip6_rt_pcpu_alloc().
      
      A reported stack:
      
      [141625.537638] INFO: rcu_sched self-detected stall on CPU { 27}  (t=60000 jiffies g=4159086 c=4159085 q=2139)
      [141625.547469] Task dump for CPU 27:
      [141625.550881] mtr             R  running task        0 22121  22081 0x00000008
      [141625.558069]  0000000000000000 ffff88103f363d98 ffffffff8106e488 000000000000001b
      [141625.565641]  ffffffff81684900 ffff88103f363db8 ffffffff810702b0 0000000008000000
      [141625.573220]  ffffffff81684900 ffff88103f363de8 ffffffff8108df9f ffff88103f375a00
      [141625.580803] Call Trace:
      [141625.583345]  <IRQ>  [<ffffffff8106e488>] sched_show_task+0xc1/0xc6
      [141625.589650]  [<ffffffff810702b0>] dump_cpu_task+0x35/0x39
      [141625.595144]  [<ffffffff8108df9f>] rcu_dump_cpu_stacks+0x6a/0x8c
      [141625.601320]  [<ffffffff81090606>] rcu_check_callbacks+0x1f6/0x5d4
      [141625.607669]  [<ffffffff810940c8>] update_process_times+0x2a/0x4f
      [141625.613925]  [<ffffffff8109fbee>] tick_sched_handle+0x32/0x3e
      [141625.619923]  [<ffffffff8109fc2f>] tick_sched_timer+0x35/0x5c
      [141625.625830]  [<ffffffff81094a1f>] __hrtimer_run_queues+0x8f/0x18d
      [141625.632171]  [<ffffffff81094c9e>] hrtimer_interrupt+0xa0/0x166
      [141625.638258]  [<ffffffff8102bf2a>] local_apic_timer_interrupt+0x4e/0x52
      [141625.645036]  [<ffffffff8102c36f>] smp_apic_timer_interrupt+0x39/0x4a
      [141625.651643]  [<ffffffff8140b9e8>] apic_timer_interrupt+0x68/0x70
      [141625.657895]  <EOI>  [<ffffffff81346ee8>] ? dst_destroy+0x7c/0xb5
      [141625.664188]  [<ffffffff813d45b5>] ? fib6_flush_trees+0x20/0x20
      [141625.670272]  [<ffffffff81082b45>] ? queue_write_lock_slowpath+0x60/0x6f
      [141625.677140]  [<ffffffff8140aa33>] _raw_write_lock_bh+0x23/0x25
      [141625.683218]  [<ffffffff813d4553>] __fib6_clean_all+0x40/0x82
      [141625.689124]  [<ffffffff813d45b5>] ? fib6_flush_trees+0x20/0x20
      [141625.695207]  [<ffffffff813d6058>] fib6_clean_all+0xe/0x10
      [141625.700854]  [<ffffffff813d60d3>] fib6_run_gc+0x79/0xc8
      [141625.706329]  [<ffffffff813d0510>] ip6_dst_gc+0x85/0xf9
      [141625.711718]  [<ffffffff81346d68>] dst_alloc+0x55/0x159
      [141625.717105]  [<ffffffff813d09b5>] __ip6_dst_alloc.isra.32+0x19/0x63
      [141625.723620]  [<ffffffff813d1830>] ip6_pol_route+0x36a/0x3e8
      [141625.729441]  [<ffffffff813d18d6>] ip6_pol_route_output+0x11/0x13
      [141625.735700]  [<ffffffff813f02c8>] fib6_rule_action+0xa7/0x1bf
      [141625.741698]  [<ffffffff813d18c5>] ? ip6_pol_route_input+0x17/0x17
      [141625.748043]  [<ffffffff81357c48>] fib_rules_lookup+0xb5/0x12a
      [141625.754050]  [<ffffffff81141628>] ? poll_select_copy_remaining+0xf9/0xf9
      [141625.761002]  [<ffffffff813f0535>] fib6_rule_lookup+0x37/0x5c
      [141625.766914]  [<ffffffff813d18c5>] ? ip6_pol_route_input+0x17/0x17
      [141625.773260]  [<ffffffff813d008c>] ip6_route_output+0x7a/0x82
      [141625.779177]  [<ffffffff813c44c8>] ip6_dst_lookup_tail+0x53/0x112
      [141625.785437]  [<ffffffff813c45c3>] ip6_dst_lookup_flow+0x2a/0x6b
      [141625.791604]  [<ffffffff813ddaab>] rawv6_sendmsg+0x407/0x9b6
      [141625.797423]  [<ffffffff813d7914>] ? do_ipv6_setsockopt.isra.8+0xd87/0xde2
      [141625.804464]  [<ffffffff8139d4b4>] inet_sendmsg+0x57/0x8e
      [141625.810028]  [<ffffffff81329ba3>] sock_sendmsg+0x2e/0x3c
      [141625.815588]  [<ffffffff8132be57>] SyS_sendto+0xfe/0x143
      [141625.821063]  [<ffffffff813dd551>] ? rawv6_setsockopt+0x5e/0x67
      [141625.827146]  [<ffffffff8132c9f8>] ? sock_common_setsockopt+0xf/0x11
      [141625.833660]  [<ffffffff8132c08c>] ? SyS_setsockopt+0x81/0xa2
      [141625.839565]  [<ffffffff8140ac17>] entry_SYSCALL_64_fastpath+0x12/0x6a
      
      Fixes: d52d3997 ("ipv6: Create percpu rt6_info")
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      CC: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Reported-by: Steinar H. Gunderson <sgunderson@bigfoot.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Add rt6_make_pcpu_route() · a73e4195
      Martin KaFai Lau authored
      This is prep work for fixing a potential deadlock when creating
      a pcpu rt.
      
      The current rt6_get_pcpu_route() will also create a pcpu rt if one does not
      exist.  This patch moves the pcpu rt creation logic into another function,
      rt6_make_pcpu_route().
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      CC: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>