1. 27 Jun, 2014 15 commits
    • Ben Hutchings's avatar
      net/compat: Fix minor information leak in siocdevprivate_ioctl() · e45145b6
      Ben Hutchings authored
      commit 417c3522 upstream.
      
      We don't need to check that ifr_data itself is a valid user pointer,
      but we should check &ifr_data is.  Thankfully the copy of ifr_name is
      checked, so this can only leak a few bytes from immediately above the
      user address limit.
      Signed-off-by: default avatarBen Hutchings <bhutchings@solarflare.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      e45145b6
    • Benjamin Poirier's avatar
      net: Do not enable tx-nocache-copy by default · 9f6e089c
      Benjamin Poirier authored
      commit cdb3f4a3 upstream.
      
      There are many cases where this feature does not improve performance or even
      reduces it.
      
      For example, here are the results from tests that I've run using 3.12.6 on one
      Intel Xeon W3565 and one i7 920 connected by ixgbe adapters. The results are
      from the Xeon, but they're similar on the i7. All numbers report the
      mean±stddev over 10 runs of 10s.
      
      1) latency tests similar to what is described in "c6e1a0d1 net: Allow no-cache
      copy from user on transmit"
      There is no statistically significant difference between tx-nocache-copy
      on/off.
      nic irqs spread out (one queue per cpu)
      
      200x netperf -r 1400,1
      tx-nocache-copy off
              692000±1000 tps
              50/90/95/99% latency (us): 275±2/643.8±0.4/799±1/2474.4±0.3
      tx-nocache-copy on
              693000±1000 tps
              50/90/95/99% latency (us): 274±1/644.1±0.7/800±2/2474.5±0.7
      
      200x netperf -r 14000,14000
      tx-nocache-copy off
              86450±80 tps
              50/90/95/99% latency (us): 334.37±0.02/838±1/2100±20/3990±40
      tx-nocache-copy on
              86110±60 tps
              50/90/95/99% latency (us): 334.28±0.01/837±2/2110±20/3990±20
      
      2) single stream throughput tests
      tx-nocache-copy leads to higher service demand
      
                              throughput  cpu0        cpu1        demand
                              (Gb/s)      (Gcycle)    (Gcycle)    (cycle/B)
      
      nic irqs and netperf on cpu0 (1x netperf -T0,0 -t omni -- -d send)
      
      tx-nocache-copy off     9402±5      9.4±0.2                 0.80±0.01
      tx-nocache-copy on      9403±3      9.85±0.04               0.838±0.004
      
      nic irqs on cpu0, netperf on cpu1 (1x netperf -T1,1 -t omni -- -d send)
      
      tx-nocache-copy off     9401±5      5.83±0.03   5.0±0.1     0.923±0.007
      tx-nocache-copy on      9404±2      5.74±0.03   5.523±0.009 0.958±0.002
      
      As a second example, here are some results from Eric Dumazet with latest
      net-next.
      tx-nocache-copy also leads to higher service demand
      
      (cpu is Intel(R) Xeon(R) CPU X5660  @ 2.80GHz)
      
      lpq83:~# ./ethtool -K eth0 tx-nocache-copy on
      lpq83:~# perf stat ./netperf -H lpq84 -c
      MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lpq84.prod.google.com () port 0 AF_INET
      Recv   Send    Send                          Utilization       Service Demand
      Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
      Size   Size    Size     Time     Throughput  local    remote   local   remote
      bytes  bytes   bytes    secs.    10^6bits/s  % S      % U      us/KB   us/KB
      
       87380  16384  16384    10.00      9407.44   2.50     -1.00    0.522   -1.000
      
       Performance counter stats for './netperf -H lpq84 -c':
      
             4282.648396 task-clock                #    0.423 CPUs utilized
                   9,348 context-switches          #    0.002 M/sec
                      88 CPU-migrations            #    0.021 K/sec
                     355 page-faults               #    0.083 K/sec
          11,812,797,651 cycles                    #    2.758 GHz                     [82.79%]
           9,020,522,817 stalled-cycles-frontend   #   76.36% frontend cycles idle    [82.54%]
           4,579,889,681 stalled-cycles-backend    #   38.77% backend  cycles idle    [67.33%]
           6,053,172,792 instructions              #    0.51  insns per cycle
                                                   #    1.49  stalled cycles per insn [83.64%]
             597,275,583 branches                  #  139.464 M/sec                   [83.70%]
               8,960,541 branch-misses             #    1.50% of all branches         [83.65%]
      
            10.128990264 seconds time elapsed
      
      lpq83:~# ./ethtool -K eth0 tx-nocache-copy off
      lpq83:~# perf stat ./netperf -H lpq84 -c
      MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lpq84.prod.google.com () port 0 AF_INET
      Recv   Send    Send                          Utilization       Service Demand
      Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
      Size   Size    Size     Time     Throughput  local    remote   local   remote
      bytes  bytes   bytes    secs.    10^6bits/s  % S      % U      us/KB   us/KB
      
       87380  16384  16384    10.00      9412.45   2.15     -1.00    0.449   -1.000
      
       Performance counter stats for './netperf -H lpq84 -c':
      
             2847.375441 task-clock                #    0.281 CPUs utilized
                  11,632 context-switches          #    0.004 M/sec
                      49 CPU-migrations            #    0.017 K/sec
                     354 page-faults               #    0.124 K/sec
           7,646,889,749 cycles                    #    2.686 GHz                     [83.34%]
           6,115,050,032 stalled-cycles-frontend   #   79.97% frontend cycles idle    [83.31%]
           1,726,460,071 stalled-cycles-backend    #   22.58% backend  cycles idle    [66.55%]
           2,079,702,453 instructions              #    0.27  insns per cycle
                                                   #    2.94  stalled cycles per insn [83.22%]
             363,773,213 branches                  #  127.757 M/sec                   [83.29%]
               4,242,732 branch-misses             #    1.17% of all branches         [83.51%]
      
            10.128449949 seconds time elapsed
      
      CC: Tom Herbert <therbert@google.com>
      Signed-off-by: default avatarBenjamin Poirier <bpoirier@suse.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      9f6e089c
    • Prarit Bhargava's avatar
      ACPI / memhotplug: add parameter to disable memory hotplug · e801ecec
      Prarit Bhargava authored
      commit 00159a20 upstream.
      
      When booting a kexec/kdump kernel on a system that has specific memory
      hotplug regions the boot will fail with warnings like:
      
       swapper/0: page allocation failure: order:9, mode:0x84d0
       CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.0-65.el7.x86_64 #1
       Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.S013.032920111005 03/29/2011
        0000000000000000 ffff8800341bd8c8 ffffffff815bcc67 ffff8800341bd950
        ffffffff8113b1a0 ffff880036339b00 0000000000000009 00000000000084d0
        ffff8800341bd950 ffffffff815b87ee 0000000000000000 0000000000000200
       Call Trace:
        [<ffffffff815bcc67>] dump_stack+0x19/0x1b
        [<ffffffff8113b1a0>] warn_alloc_failed+0xf0/0x160
        [<ffffffff815b87ee>] ?  __alloc_pages_direct_compact+0xac/0x196
        [<ffffffff8113f14f>] __alloc_pages_nodemask+0x7ff/0xa00
        [<ffffffff815b417c>] vmemmap_alloc_block+0x62/0xba
        [<ffffffff815b41e9>] vmemmap_alloc_block_buf+0x15/0x3b
        [<ffffffff815b1ff6>] vmemmap_populate+0xb4/0x21b
        [<ffffffff815b461d>] sparse_mem_map_populate+0x27/0x35
        [<ffffffff815b400f>] sparse_add_one_section+0x7a/0x185
        [<ffffffff815a1e9f>] __add_pages+0xaf/0x240
        [<ffffffff81047359>] arch_add_memory+0x59/0xd0
        [<ffffffff815a21d9>] add_memory+0xb9/0x1b0
        [<ffffffff81333b9c>] acpi_memory_device_add+0x18d/0x26d
        [<ffffffff81309a01>] acpi_bus_device_attach+0x7d/0xcd
        [<ffffffff8132379d>] acpi_ns_walk_namespace+0xc8/0x17f
        [<ffffffff81309984>] ? acpi_bus_type_and_status+0x90/0x90
        [<ffffffff81309984>] ? acpi_bus_type_and_status+0x90/0x90
        [<ffffffff81323c8c>] acpi_walk_namespace+0x95/0xc5
        [<ffffffff8130a6d6>] acpi_bus_scan+0x8b/0x9d
        [<ffffffff81a2019a>] acpi_scan_init+0x63/0x160
        [<ffffffff81a1ffb5>] acpi_init+0x25d/0x2a6
        [<ffffffff81a1fd58>] ? acpi_sleep_proc_init+0x2a/0x2a
        [<ffffffff810020e2>] do_one_initcall+0xe2/0x190
        [<ffffffff819e20c4>] kernel_init_freeable+0x17c/0x207
        [<ffffffff819e18d0>] ? do_early_param+0x88/0x88
        [<ffffffff8159fea0>] ? rest_init+0x80/0x80
        [<ffffffff8159feae>] kernel_init+0xe/0x180
        [<ffffffff815cca2c>] ret_from_fork+0x7c/0xb0
        [<ffffffff8159fea0>] ? rest_init+0x80/0x80
       Mem-Info:
       Node 0 DMA per-cpu:
       CPU    0: hi:    0, btch:   1 usd:   0
       Node 0 DMA32 per-cpu:
       CPU    0: hi:   42, btch:   7 usd:   0
       active_anon:0 inactive_anon:0 isolated_anon:0
        active_file:0 inactive_file:0 isolated_file:0
        unevictable:0 dirty:0 writeback:0 unstable:0
        free:872 slab_reclaimable:13 slab_unreclaimable:1880
        mapped:0 shmem:0 pagetables:0 bounce:0
        free_cma:0
      
      because the system has run out of memory at boot time.  This occurs
      because of the following sequence in the boot:
      
      Main kernel boots and sets E820 map.  The second kernel is booted with a
      map generated by the kdump service using memmap= and memmap=exactmap.
      These parameters are added to the kernel parameters of the kexec/kdump
      kernel.   The kexec/kdump kernel has limited memory resources so as not
      to severely impact the main kernel.
      
      The system then panics and the kdump/kexec kernel boots (which is a
      completely new kernel boot).  During this boot ACPI is initialized and the
      kernel (as can be seen above) traverses the ACPI namespace and finds an
      entry for a memory device to be hotadded.
      
      ie)
      
        [<ffffffff815a1e9f>] __add_pages+0xaf/0x240
        [<ffffffff81047359>] arch_add_memory+0x59/0xd0
        [<ffffffff815a21d9>] add_memory+0xb9/0x1b0
        [<ffffffff81333b9c>] acpi_memory_device_add+0x18d/0x26d
        [<ffffffff81309a01>] acpi_bus_device_attach+0x7d/0xcd
        [<ffffffff8132379d>] acpi_ns_walk_namespace+0xc8/0x17f
        [<ffffffff81309984>] ? acpi_bus_type_and_status+0x90/0x90
        [<ffffffff81309984>] ? acpi_bus_type_and_status+0x90/0x90
        [<ffffffff81323c8c>] acpi_walk_namespace+0x95/0xc5
        [<ffffffff8130a6d6>] acpi_bus_scan+0x8b/0x9d
        [<ffffffff81a2019a>] acpi_scan_init+0x63/0x160
        [<ffffffff81a1ffb5>] acpi_init+0x25d/0x2a6
      
      At this point the kernel adds page table information and the the kexec/kdump
      kernel runs out of memory.
      
      This can also be reproduced by using the memmap=exactmap and mem=X
      parameters on the main kernel and booting.
      
      This patchset resolves the problem by adding a kernel parameter,
      acpi_no_memhotplug, to disable ACPI memory hotplug.
      Signed-off-by: default avatarPrarit Bhargava <prarit@redhat.com>
      Acked-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Acked-by: default avatarToshi Kani <toshi.kani@hp.com>
      Acked-by: default avatarDavid Rientjes <rientjes@google.com>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      e801ecec
    • Peter Zijlstra's avatar
      sched: Make scale_rt_power() deal with backward clocks · 5ff029e2
      Peter Zijlstra authored
      commit cadefd3d upstream.
      
      Mike reported that, while unlikely, its entirely possible for
      scale_rt_power() to see the time go backwards. This yields rather
      'interesting' results.
      
      So like all other sites that deal with clocks; make this one ignore
      backward clock movement too.
      Reported-by: default avatarMike Galbraith <bitbucket@online.de>
      Signed-off-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20140227094035.GZ9987@twins.programming.kicks-ass.net
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      5ff029e2
    • Wendy Xiong's avatar
      [SCSI] ipr: Add new CCIN definition for Grand Canyon support · dcc23f13
      Wendy Xiong authored
      commit 5eeac3e9 upstream.
      
      Add the appropriate definition and table entry for new hardware support.
      Signed-off-by: default avatarWen Xiong <wenxiong@linux.vnet.ibm.com>
      Acked-by: default avatarBrian King <brking@linux.vnet.ibm.com>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Parallels.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      dcc23f13
    • Mike Qiu's avatar
      powerpc/mm: fix ".__node_distance" undefined · 311222a9
      Mike Qiu authored
      commit 12c743eb upstream.
      
        CHK     include/config/kernel.release
        CHK     include/generated/uapi/linux/version.h
        CHK     include/generated/utsrelease.h
        ...
        Building modules, stage 2.
      WARNING: 1 bad relocations
      c0000000013d6a30 R_PPC64_ADDR64    uprobes_fetch_type_table
        WRAP    arch/powerpc/boot/zImage.pseries
        WRAP    arch/powerpc/boot/zImage.epapr
        MODPOST 1849 modules
      ERROR: ".__node_distance" [drivers/block/nvme.ko] undefined!
      make[1]: *** [__modpost] Error 1
      make: *** [modules] Error 2
      make: *** Waiting for unfinished jobs....
      
      The reason is symbol "__node_distance" not been exported in powerpc.
      Signed-off-by: default avatarMike Qiu <qiudayu@linux.vnet.ibm.com>
      Acked-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Cc: Jesse Larrew <jlarrew@linux.vnet.ibm.com>
      Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
      Cc: Alistair Popple <alistair@popple.id.au>
      Cc: Mike Qiu <qiudayu@linux.vnet.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      311222a9
    • Petr Mladek's avatar
      ftrace/x86: Call text_ip_addr() instead of the duplicated code · 38a572d0
      Petr Mladek authored
      commit 964f7b6b upstream.
      
      I just went over this when looking at some Xen-related ftrace initialization
      problems. They were related to Xen code that is not upstream but this clean up
      would make sense here.
      
      I think that this was already the intention when text_ip_addr() was introduced
      in the commit 87fbb2ac (ftrace/x86: Use breakpoints for converting
      function graph caller). Anyway, better do it now before it shots people into
      their leg ;-)
      
      Link: http://lkml.kernel.org/p/1401812601-2359-1-git-send-email-pmladek@suse.czSigned-off-by: default avatarPetr Mladek <pmladek@suse.cz>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      38a572d0
    • J. Bruce Fields's avatar
      nfsd4: fix FREE_STATEID lockowner leak · 2eaaa8d2
      J. Bruce Fields authored
      commit 48385408 upstream.
      
      27b11428 ("nfsd4: remove lockowner when removing lock stateid")
      introduced a memory leak.
      
      Cc: stable@vger.kernel.org
      Reported-by: default avatarJeff Layton <jeff.layton@primarydata.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      2eaaa8d2
    • Ying Xue's avatar
      tipc: fix memory leak of publications · 48195684
      Ying Xue authored
      commit 1621b94d upstream.
      
      Commit 1bb8dce5 ("tipc: fix memory
      leak during module removal") introduced a memory leak issue: when
      name table is stopped, it's forgotten that publication instances are
      freed properly. Additionally the useless "continue" statement in
      tipc_nametbl_stop() is removed as well.
      Reported-by: default avatarJason <huzhijiang@gmail.com>
      Signed-off-by: default avatarYing Xue <ying.xue@windriver.com>
      Acked-by: default avatarErik Hugne <erik.hugne@ericsson.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      48195684
    • Jiang Liu's avatar
      intel_idle: close avn_cstates array with correct marker · 4c65d4f6
      Jiang Liu authored
      commit 88390996 upstream.
      
      Close avn_cstates array with correct marker to avoid overflow
      in function intel_idle_cpu_init().
      
      [rjw: The problem was introduced when commit 22e580d0 was merged
       on top of eba682a5 (intel_idle: shrink states tables).]
      
      Fixes: 22e580d0 (intel_idle: Fixed C6 state on Avoton/Rangeley processors)
      Signed-off-by: default avatarJiang Liu <jiang.liu@linux.intel.com>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      4c65d4f6
    • Viresh Kumar's avatar
      tick-sched: Check tick_nohz_enabled in tick_nohz_switch_to_nohz() · e54e6e8e
      Viresh Kumar authored
      commit 27630532 upstream.
      
      Since commit d689fe22 (NOHZ: Check for nohz active instead of nohz
      enabled) the tick_nohz_switch_to_nohz() function returns because it
      checks for the tick_nohz_active flag. This can't be set, because the
      function itself sets it.
      
      Undo the change in tick_nohz_switch_to_nohz().
      Signed-off-by: default avatarViresh Kumar <viresh.kumar@linaro.org>
      Cc: linaro-kernel@lists.linaro.org
      Cc: fweisbec@gmail.com
      Cc: Arvind.Chauhan@arm.com
      Cc: linaro-networking@linaro.org
      Cc: <stable@vger.kernel.org> # 3.13+
      Link: http://lkml.kernel.org/r/40939c05f2d65d781b92b20302b02243d0654224.1397537987.git.viresh.kumar@linaro.orgSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      e54e6e8e
    • Konstantin Khlebnikov's avatar
      epoll: fix use-after-free in eventpoll_release_file · c79460f6
      Konstantin Khlebnikov authored
      commit ebe06187 upstream.
      
      This fixes use-after-free of epi->fllink.next inside list loop macro.
      This loop actually releases elements in the body.  The list is
      rcu-protected but here we cannot hold rcu_read_lock because we need to
      lock mutex inside.
      
      The obvious solution is to use list_for_each_entry_safe().  RCU-ness
      isn't essential because nobody can change this list under us, it's final
      fput for this file.
      
      The bug was introduced by ae10b2b4 ("epoll: optimize EPOLL_CTL_DEL
      using rcu")
      Signed-off-by: default avatarKonstantin Khlebnikov <koct9i@gmail.com>
      Reported-by: default avatarCyrill Gorcunov <gorcunov@openvz.org>
      Cc: Stable <stable@vger.kernel.org> # 3.13+
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Cc: Jason Baron <jbaron@akamai.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      c79460f6
    • Li Zhong's avatar
      powerpc: Fix Oops in rtas_stop_self() · f116dbc9
      Li Zhong authored
      commit 4fb8d027 upstream.
      
      commit 41dd03a9 may cause Oops in rtas_stop_self().
      
      The reason is that the rtas_args was moved into stack space. For a box
      with more that 4GB RAM, the stack could easily be outside 32bit range,
      but RTAS is 32bit.
      
      So the patch moves rtas_args away from stack by adding static before
      it.
      Signed-off-by: default avatarLi Zhong <zhong@linux.vnet.ibm.com>
      Signed-off-by: default avatarAnton Blanchard <anton@samba.org>
      Cc: stable@vger.kernel.org # 3.14+
      Signed-off-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      f116dbc9
    • J. Bruce Fields's avatar
      GFS2: revert "GFS2: d_splice_alias() can't return error" · 22d3112a
      J. Bruce Fields authored
      commit d57b9c9a upstream.
      
      0d0d1107 asserts that "d_splice_alias()
      can't return error unless it was given an IS_ERR(inode)".
      
      That was true of the implementation of d_splice_alias, but this is
      really a problem with d_splice_alias: at a minimum it should be able to
      return -ELOOP in the case where inserting the given dentry would cause a
      directory loop.
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarSteven Whitehouse <swhiteho@redhat.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      22d3112a
    • Jiri Slaby's avatar
      Revert "bio-integrity: Fix bio_integrity_verify segment start bug" · 63a97385
      Jiri Slaby authored
      This reverts commit 7cbcb219,
      misapplied upstream commit 5837c80e.
      
      The upstream commit was applied twice to stable-3.12, the second time
      to bio_integrity_generate. Revert this second application.
      
      Cc: Martin K. Petersen <martin.petersen@oracle.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarNicholas Bellinger <nab@linux-iscsi.org>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      63a97385
  2. 23 Jun, 2014 25 commits
    • Vincent Guittot's avatar
      Revert "sched: Fix sleep time double accounting in enqueue entity" · 61844d8e
      Vincent Guittot authored
      commit 9390675a upstream.
      
      This reverts commit 282cf499.
      
      With the current implementation, the load average statistics of a sched entity
      change according to other activity on the CPU even if this activity is done
      between the running window of the sched entity and have no influence on the
      running duration of the task.
      
      When a task wakes up on the same CPU, we currently update last_runnable_update
      with the return  of __synchronize_entity_decay without updating the
      runnable_avg_sum and runnable_avg_period accordingly. In fact, we have to sync
      the load_contrib of the se with the rq's blocked_load_contrib before removing
      it from the latter (with __synchronize_entity_decay) but we must keep
      last_runnable_update unchanged for updating runnable_avg_sum/period during the
      next update_entity_load_avg.
      Signed-off-by: default avatarVincent Guittot <vincent.guittot@linaro.org>
      Signed-off-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Reviewed-by: default avatarBen Segall <bsegall@google.com>
      Cc: pjt@google.com
      Cc: alex.shi@linaro.org
      Link: http://lkml.kernel.org/r/1390376734-6800-1-git-send-email-vincent.guittot@linaro.orgSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      61844d8e
    • Jiri Slaby's avatar
      Linux 3.12.23 · 85ee5c00
      Jiri Slaby authored
      85ee5c00
    • Boris BREZILLON's avatar
      ARM: at91: fix at91_sysirq_mask_rtc for sam9x5 SoCs · 8d8a2761
      Boris BREZILLON authored
      commit 9dcc87fe upstream.
      
      sam9x5 SoCs have the following errata:
       "RTC: Interrupt Mask Register cannot be used
        Interrupt Mask Register read always returns 0."
      
      Hence we should not rely on what IMR claims about already masked IRQs
      and just disable all IRQs.
      Signed-off-by: default avatarBoris BREZILLON <boris.brezillon@free-electrons.com>
      Reported-by: default avatarBryan Evenson <bevenson@melinkcorp.com>
      Reviewed-by: default avatarJohan Hovold <johan@hovold.com>
      Acked-by: default avatarNicolas Ferre <nicolas.ferre@atmel.com>
      Cc: Bryan Evenson <bevenson@melinkcorp.com>
      Cc: Andrew Victor <linux@maxim.org.za>
      Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com>
      Cc: Alessandro Zummo <a.zummo@towertech.it>
      Cc: Mark Roszko <mark.roszko@gmail.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      8d8a2761
    • Cong Wang's avatar
      vxlan: use dev->needed_headroom instead of dev->hard_header_len · 0067390a
      Cong Wang authored
      [ Upstream commit 2853af6a ]
      
      When we mirror packets from a vxlan tunnel to other device,
      the mirror device should see the same packets (that is, without
      outer header). Because vxlan tunnel sets dev->hard_header_len,
      tcf_mirred() resets mac header back to outer mac, the mirror device
      actually sees packets with outer headers
      
      Vxlan tunnel should set dev->needed_headroom instead of
      dev->hard_header_len, like what other ip tunnels do. This fixes
      the above problem.
      
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: stephen hemminger <stephen@networkplumber.org>
      Cc: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: default avatarCong Wang <cwang@twopensource.com>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      0067390a
    • Michal Schmidt's avatar
      rtnetlink: fix userspace API breakage for iproute2 < v3.9.0 · 76acb942
      Michal Schmidt authored
      [ Upstream commit e5eca6d4 ]
      
      When running RHEL6 userspace on a current upstream kernel, "ip link"
      fails to show VF information.
      
      The reason is a kernel<->userspace API change introduced by commit
      88c5b5ce ("rtnetlink: Call nlmsg_parse() with correct header length"),
      after which the kernel does not see iproute2's IFLA_EXT_MASK attribute
      in the netlink request.
      
      iproute2 adjusted for the API change in its commit 63338dca4513
      ("libnetlink: Use ifinfomsg instead of rtgenmsg in rtnl_wilddump_req_filter").
      
      The problem has been noticed before:
      http://marc.info/?l=linux-netdev&m=136692296022182&w=2
      (Subject: Re: getting VF link info seems to be broken in 3.9-rc8)
      
      We can do better than tell those with old userspace to upgrade. We can
      recognize the old iproute2 in the kernel by checking the netlink message
      length. Even when including the IFLA_EXT_MASK attribute, its netlink
      message is shorter than struct ifinfomsg.
      
      With this patch "ip link" shows VF information in both old and new
      iproute2 versions.
      Signed-off-by: default avatarMichal Schmidt <mschmidt@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      76acb942
    • Xufeng Zhang's avatar
      sctp: Fix sk_ack_backlog wrap-around problem · ddb638e6
      Xufeng Zhang authored
      [ Upstream commit d3217b15 ]
      
      Consider the scenario:
      For a TCP-style socket, while processing the COOKIE_ECHO chunk in
      sctp_sf_do_5_1D_ce(), after it has passed a series of sanity check,
      a new association would be created in sctp_unpack_cookie(), but afterwards,
      some processing maybe failed, and sctp_association_free() will be called to
      free the previously allocated association, in sctp_association_free(),
      sk_ack_backlog value is decremented for this socket, since the initial
      value for sk_ack_backlog is 0, after the decrement, it will be 65535,
      a wrap-around problem happens, and if we want to establish new associations
      afterward in the same socket, ABORT would be triggered since sctp deem the
      accept queue as full.
      Fix this issue by only decrementing sk_ack_backlog for associations in
      the endpoint's list.
      Fix-suggested-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: default avatarXufeng Zhang <xufeng.zhang@windriver.com>
      Acked-by: default avatarDaniel Borkmann <dborkman@redhat.com>
      Acked-by: default avatarVlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      ddb638e6
    • Eric Dumazet's avatar
      ipv4: fix a race in ip4_datagram_release_cb() · c671113b
      Eric Dumazet authored
      [ Upstream commit 9709674e ]
      
      Alexey gave a AddressSanitizer[1] report that finally gave a good hint
      at where was the origin of various problems already reported by Dormando
      in the past [2]
      
      Problem comes from the fact that UDP can have a lockless TX path, and
      concurrent threads can manipulate sk_dst_cache, while another thread,
      is holding socket lock and calls __sk_dst_set() in
      ip4_datagram_release_cb() (this was added in linux-3.8)
      
      It seems that all we need to do is to use sk_dst_check() and
      sk_dst_set() so that all the writers hold same spinlock
      (sk->sk_dst_lock) to prevent corruptions.
      
      TCP stack do not need this protection, as all sk_dst_cache writers hold
      the socket lock.
      
      [1]
      https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernel
      
      AddressSanitizer: heap-use-after-free in ipv4_dst_check
      Read of size 2 by thread T15453:
       [<ffffffff817daa3a>] ipv4_dst_check+0x1a/0x90 ./net/ipv4/route.c:1116
       [<ffffffff8175b789>] __sk_dst_check+0x89/0xe0 ./net/core/sock.c:531
       [<ffffffff81830a36>] ip4_datagram_release_cb+0x46/0x390 ??:0
       [<ffffffff8175eaea>] release_sock+0x17a/0x230 ./net/core/sock.c:2413
       [<ffffffff81830882>] ip4_datagram_connect+0x462/0x5d0 ??:0
       [<ffffffff81846d06>] inet_dgram_connect+0x76/0xd0 ./net/ipv4/af_inet.c:534
       [<ffffffff817580ac>] SYSC_connect+0x15c/0x1c0 ./net/socket.c:1701
       [<ffffffff817596ce>] SyS_connect+0xe/0x10 ./net/socket.c:1682
       [<ffffffff818b0a29>] system_call_fastpath+0x16/0x1b
      ./arch/x86/kernel/entry_64.S:629
      
      Freed by thread T15455:
       [<ffffffff8178d9b8>] dst_destroy+0xa8/0x160 ./net/core/dst.c:251
       [<ffffffff8178de25>] dst_release+0x45/0x80 ./net/core/dst.c:280
       [<ffffffff818304c1>] ip4_datagram_connect+0xa1/0x5d0 ??:0
       [<ffffffff81846d06>] inet_dgram_connect+0x76/0xd0 ./net/ipv4/af_inet.c:534
       [<ffffffff817580ac>] SYSC_connect+0x15c/0x1c0 ./net/socket.c:1701
       [<ffffffff817596ce>] SyS_connect+0xe/0x10 ./net/socket.c:1682
       [<ffffffff818b0a29>] system_call_fastpath+0x16/0x1b
      ./arch/x86/kernel/entry_64.S:629
      
      Allocated by thread T15453:
       [<ffffffff8178d291>] dst_alloc+0x81/0x2b0 ./net/core/dst.c:171
       [<ffffffff817db3b7>] rt_dst_alloc+0x47/0x50 ./net/ipv4/route.c:1406
       [<     inlined    >] __ip_route_output_key+0x3e8/0xf70
      __mkroute_output ./net/ipv4/route.c:1939
       [<ffffffff817dde08>] __ip_route_output_key+0x3e8/0xf70 ./net/ipv4/route.c:2161
       [<ffffffff817deb34>] ip_route_output_flow+0x14/0x30 ./net/ipv4/route.c:2249
       [<ffffffff81830737>] ip4_datagram_connect+0x317/0x5d0 ??:0
       [<ffffffff81846d06>] inet_dgram_connect+0x76/0xd0 ./net/ipv4/af_inet.c:534
       [<ffffffff817580ac>] SYSC_connect+0x15c/0x1c0 ./net/socket.c:1701
       [<ffffffff817596ce>] SyS_connect+0xe/0x10 ./net/socket.c:1682
       [<ffffffff818b0a29>] system_call_fastpath+0x16/0x1b
      ./arch/x86/kernel/entry_64.S:629
      
      [2]
      <4>[196727.311203] general protection fault: 0000 [#1] SMP
      <4>[196727.311224] Modules linked in: xt_TEE xt_dscp xt_DSCP macvlan bridge coretemp crc32_pclmul ghash_clmulni_intel gpio_ich microcode ipmi_watchdog ipmi_devintf sb_edac edac_core lpc_ich mfd_core tpm_tis tpm tpm_bios ipmi_si ipmi_msghandler isci igb libsas i2c_algo_bit ixgbe ptp pps_core mdio
      <4>[196727.311333] CPU: 17 PID: 0 Comm: swapper/17 Not tainted 3.10.26 #1
      <4>[196727.311344] Hardware name: Supermicro X9DRi-LN4+/X9DR3-LN4+/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.0 07/05/2013
      <4>[196727.311364] task: ffff885e6f069700 ti: ffff885e6f072000 task.ti: ffff885e6f072000
      <4>[196727.311377] RIP: 0010:[<ffffffff815f8c7f>]  [<ffffffff815f8c7f>] ipv4_dst_destroy+0x4f/0x80
      <4>[196727.311399] RSP: 0018:ffff885effd23a70  EFLAGS: 00010282
      <4>[196727.311409] RAX: dead000000200200 RBX: ffff8854c398ecc0 RCX: 0000000000000040
      <4>[196727.311423] RDX: dead000000100100 RSI: dead000000100100 RDI: dead000000200200
      <4>[196727.311437] RBP: ffff885effd23a80 R08: ffffffff815fd9e0 R09: ffff885d5a590800
      <4>[196727.311451] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
      <4>[196727.311464] R13: ffffffff81c8c280 R14: 0000000000000000 R15: ffff880e85ee16ce
      <4>[196727.311510] FS:  0000000000000000(0000) GS:ffff885effd20000(0000) knlGS:0000000000000000
      <4>[196727.311554] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      <4>[196727.311581] CR2: 00007a46751eb000 CR3: 0000005e65688000 CR4: 00000000000407e0
      <4>[196727.311625] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      <4>[196727.311669] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      <4>[196727.311713] Stack:
      <4>[196727.311733]  ffff8854c398ecc0 ffff8854c398ecc0 ffff885effd23ab0 ffffffff815b7f42
      <4>[196727.311784]  ffff88be6595bc00 ffff8854c398ecc0 0000000000000000 ffff8854c398ecc0
      <4>[196727.311834]  ffff885effd23ad0 ffffffff815b86c6 ffff885d5a590800 ffff8816827821c0
      <4>[196727.311885] Call Trace:
      <4>[196727.311907]  <IRQ>
      <4>[196727.311912]  [<ffffffff815b7f42>] dst_destroy+0x32/0xe0
      <4>[196727.311959]  [<ffffffff815b86c6>] dst_release+0x56/0x80
      <4>[196727.311986]  [<ffffffff81620bd5>] tcp_v4_do_rcv+0x2a5/0x4a0
      <4>[196727.312013]  [<ffffffff81622b5a>] tcp_v4_rcv+0x7da/0x820
      <4>[196727.312041]  [<ffffffff815fd9e0>] ? ip_rcv_finish+0x360/0x360
      <4>[196727.312070]  [<ffffffff815de02d>] ? nf_hook_slow+0x7d/0x150
      <4>[196727.312097]  [<ffffffff815fd9e0>] ? ip_rcv_finish+0x360/0x360
      <4>[196727.312125]  [<ffffffff815fda92>] ip_local_deliver_finish+0xb2/0x230
      <4>[196727.312154]  [<ffffffff815fdd9a>] ip_local_deliver+0x4a/0x90
      <4>[196727.312183]  [<ffffffff815fd799>] ip_rcv_finish+0x119/0x360
      <4>[196727.312212]  [<ffffffff815fe00b>] ip_rcv+0x22b/0x340
      <4>[196727.312242]  [<ffffffffa0339680>] ? macvlan_broadcast+0x160/0x160 [macvlan]
      <4>[196727.312275]  [<ffffffff815b0c62>] __netif_receive_skb_core+0x512/0x640
      <4>[196727.312308]  [<ffffffff811427fb>] ? kmem_cache_alloc+0x13b/0x150
      <4>[196727.312338]  [<ffffffff815b0db1>] __netif_receive_skb+0x21/0x70
      <4>[196727.312368]  [<ffffffff815b0fa1>] netif_receive_skb+0x31/0xa0
      <4>[196727.312397]  [<ffffffff815b1ae8>] napi_gro_receive+0xe8/0x140
      <4>[196727.312433]  [<ffffffffa00274f1>] ixgbe_poll+0x551/0x11f0 [ixgbe]
      <4>[196727.312463]  [<ffffffff815fe00b>] ? ip_rcv+0x22b/0x340
      <4>[196727.312491]  [<ffffffff815b1691>] net_rx_action+0x111/0x210
      <4>[196727.312521]  [<ffffffff815b0db1>] ? __netif_receive_skb+0x21/0x70
      <4>[196727.312552]  [<ffffffff810519d0>] __do_softirq+0xd0/0x270
      <4>[196727.312583]  [<ffffffff816cef3c>] call_softirq+0x1c/0x30
      <4>[196727.312613]  [<ffffffff81004205>] do_softirq+0x55/0x90
      <4>[196727.312640]  [<ffffffff81051c85>] irq_exit+0x55/0x60
      <4>[196727.312668]  [<ffffffff816cf5c3>] do_IRQ+0x63/0xe0
      <4>[196727.312696]  [<ffffffff816c5aaa>] common_interrupt+0x6a/0x6a
      <4>[196727.312722]  <EOI>
      <1>[196727.313071] RIP  [<ffffffff815f8c7f>] ipv4_dst_destroy+0x4f/0x80
      <4>[196727.313100]  RSP <ffff885effd23a70>
      <4>[196727.313377] ---[ end trace 64b3f14fae0f2e29 ]---
      <0>[196727.380908] Kernel panic - not syncing: Fatal exception in interrupt
      Reported-by: default avatarAlexey Preobrazhensky <preobr@google.com>
      Reported-by: default avatardormando <dormando@rydia.ne>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Fixes: 8141ed9f ("ipv4: Add a socket release callback for datagram sockets")
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      c671113b
    • Dmitry Popov's avatar
      ipip, sit: fix ipv4_{update_pmtu,redirect} calls · 4b5b1dd6
      Dmitry Popov authored
      [ Upstream commit 2346829e ]
      
      ipv4_{update_pmtu,redirect} were called with tunnel's ifindex (t->dev is a
      tunnel netdevice). It caused wrong route lookup and failure of pmtu update or
      redirect. We should use the same ifindex that we use in ip_route_output_* in
      *tunnel_xmit code. It is t->parms.link .
      Signed-off-by: default avatarDmitry Popov <ixaphire@qrator.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      4b5b1dd6
    • Eric Dumazet's avatar
      net: force a list_del() in unregister_netdevice_many() · cbaf35e5
      Eric Dumazet authored
      [ Upstream commit 87757a91 ]
      
      unregister_netdevice_many() API is error prone and we had too
      many bugs because of dangling LIST_HEAD on stacks.
      
      See commit f87e6f47 ("net: dont leave active on stack LIST_HEAD")
      
      In fact, instead of making sure no caller leaves an active list_head,
      just force a list_del() in the callee. No one seems to need to access
      the list after unregister_netdevice_many()
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      cbaf35e5
    • Bjørn Mork's avatar
      net: qmi_wwan: add Olivetti Olicard modems · ab7dcb1f
      Bjørn Mork authored
      [ Upstream commit ba6de0f5 ]
      
      Lars writes: "I'm only 99% sure that the net interfaces are qmi
      interfaces, nothing to lose by adding them in my opinion."
      
      And I tend to agree based on the similarity with the two Olicard
      modems we already have here.
      Reported-by: default avatarLars Melin <larsm17@gmail.com>
      Signed-off-by: default avatarBjørn Mork <bjorn@mork.no>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      ab7dcb1f
    • Alexei Starovoitov's avatar
      net: filter: fix sparc32 typo · 099d431e
      Alexei Starovoitov authored
      [ Upstream commit 588f5d62 ]
      
      Fixes: 569810d1 ("net: filter: fix typo in sparc BPF JIT")
      Signed-off-by: default avatarAlexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      099d431e
    • Alexei Starovoitov's avatar
      net: filter: fix typo in sparc BPF JIT · eadac3f2
      Alexei Starovoitov authored
      [ Upstream commit 569810d1 ]
      
      fix typo in sparc codegen for SKF_AD_IFINDEX and SKF_AD_HATYPE
      classic BPF extensions
      
      Fixes: 2809a208 ("net: filter: Just In Time compiler for sparc")
      Signed-off-by: default avatarAlexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      eadac3f2
    • Sergei Shtylyov's avatar
      sh_eth: fix SH7619/771x support · b4e3e36a
      Sergei Shtylyov authored
      [ Upstream commit d8b0426a ]
      
      Commit 4a55530f (net: sh_eth: modify the definitions of register) managed
      to leave out the E-DMAC register entries in sh_eth_offset_fast_sh3_sh2[], thus
      totally breaking SH7619/771x support.  Add the missing entries using  the data
      from before that commit.
      Signed-off-by: default avatarSergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Acked-by: default avatarYoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      b4e3e36a
    • Ben Dooks's avatar
      sh_eth: use RNC mode for packet reception · 84a6e2bd
      Ben Dooks authored
      [ Upstream commit 530aa2d0 ]
      
      The current behaviour of the sh_eth driver is not to use the RNC bit
      for the receive ring. This means that every packet recieved is not only
      generating an IRQ but it also stops the receive ring DMA as well until
      the driver re-enables it after unloading the packet.
      
      This means that a number of the following errors are generated due to
      the receive packet FIFO overflowing due to nowhere to put packets:
      
      	net eth0: Receive FIFO Overflow
      
      Since feedback from Yoshihiro Shimoda shows that every supported LSI
      for this driver should have the bit enabled it seems the best way is
      to remove the RMCR default value from the per-system data and just
      write it when initialising the RMCR value. This is discussed in
      the message (http://www.spinics.net/lists/netdev/msg284912.html).
      
      I have tested the 0x00000001 configuration with NFS root filesystem and
      the driver has not failed yet.  There are further test reports from
      Sergei Shtylov and others for both the R8A7790 and R8A7791.
      
      There is also feedback fron Cao Minh Hiep[1] which reports the
      same issue in (http://comments.gmane.org/gmane.linux.network/316285)
      showing this fixes issues with losing UDP datagrams under iperf.
      Tested-by: default avatarSergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Signed-off-by: default avatarBen Dooks <ben.dooks@codethink.co.uk>
      Acked-by: default avatarYoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
      Acked-by: default avatarSimon Horman <horms+renesas@verge.net.au>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      84a6e2bd
    • Yuchung Cheng's avatar
      tcp: fix cwnd undo on DSACK in F-RTO · dc4d3702
      Yuchung Cheng authored
      [ Upstream commit 0cfa5c07 ]
      
      This bug is discovered by an recent F-RTO issue on tcpm list
      https://www.ietf.org/mail-archive/web/tcpm/current/msg08794.html
      
      The bug is that currently F-RTO does not use DSACK to undo cwnd in
      certain cases: upon receiving an ACK after the RTO retransmission in
      F-RTO, and the ACK has DSACK indicating the retransmission is spurious,
      the sender only calls tcp_try_undo_loss() if some never retransmisted
      data is sacked (FLAG_ORIG_DATA_SACKED).
      
      The correct behavior is to unconditionally call tcp_try_undo_loss so
      the DSACK information is used properly to undo the cwnd reduction.
      Signed-off-by: default avatarYuchung Cheng <ycheng@google.com>
      Signed-off-by: default avatarNeal Cardwell <ncardwell@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      dc4d3702
    • Jiri Pirko's avatar
      team: fix mtu setting · 14bf635b
      Jiri Pirko authored
      [ Upstream commit 9d0d68fa ]
      
      Now it is not possible to set mtu to team device which has a port
      enslaved to it. The reason is that when team_change_mtu() calls
      dev_set_mtu() for port device, notificator for NETDEV_PRECHANGEMTU
      event is called and team_device_event() returns NOTIFY_BAD forbidding
      the change. So fix this by returning NOTIFY_DONE here in case team is
      changing mtu in team_change_mtu().
      
      Introduced-by: 3d249d4c "net: introduce ethernet teaming device"
      Signed-off-by: default avatarJiri Pirko <jiri@resnulli.us>
      Acked-by: default avatarFlavio Leitner <fbl@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      14bf635b
    • Eric Dumazet's avatar
      net: fix inet_getid() and ipv6_select_ident() bugs · 12b4aba2
      Eric Dumazet authored
      [ Upstream commit 39c36094 ]
      
      I noticed we were sending wrong IPv4 ID in TCP flows when MTU discovery
      is disabled.
      Note how GSO/TSO packets do not have monotonically incrementing ID.
      
      06:37:41.575531 IP (id 14227, proto: TCP (6), length: 4396)
      06:37:41.575534 IP (id 14272, proto: TCP (6), length: 65212)
      06:37:41.575544 IP (id 14312, proto: TCP (6), length: 57972)
      06:37:41.575678 IP (id 14317, proto: TCP (6), length: 7292)
      06:37:41.575683 IP (id 14361, proto: TCP (6), length: 63764)
      
      It appears I introduced this bug in linux-3.1.
      
      inet_getid() must return the old value of peer->ip_id_count,
      not the new one.
      
      Lets revert this part, and remove the prevention of
      a null identification field in IPv6 Fragment Extension Header,
      which is dubious and not even done properly.
      
      Fixes: 87c48fa3 ("ipv6: make fragment identifications less predictable")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      12b4aba2
    • Tom Gundersen's avatar
      net: tunnels - enable module autoloading · fe2fe574
      Tom Gundersen authored
      [ Upstream commit f98f89a0 ]
      
      Enable the module alias hookup to allow tunnel modules to be autoloaded on demand.
      
      This is in line with how most other netdev kinds work, and will allow userspace
      to create tunnels without having CAP_SYS_MODULE.
      Signed-off-by: default avatarTom Gundersen <teg@jklm.no>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      fe2fe574
    • Toshiaki Makita's avatar
      bridge: Prevent insertion of FDB entry with disallowed vlan · 3ed50132
      Toshiaki Makita authored
      [ Upstream commit e0d7968a ]
      
      br_handle_local_finish() is allowing us to insert an FDB entry with
      disallowed vlan. For example, when port 1 and 2 are communicating in
      vlan 10, and even if vlan 10 is disallowed on port 3, port 3 can
      interfere with their communication by spoofed src mac address with
      vlan id 10.
      
      Note: Even if it is judged that a frame should not be learned, it should
      not be dropped because it is destined for not forwarding layer but higher
      layer. See IEEE 802.1Q-2011 8.13.10.
      Signed-off-by: default avatarToshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Acked-by: default avatarVlad Yasevich <vyasevic@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      3ed50132
    • Michal Schmidt's avatar
      netlink: rate-limit leftover bytes warning and print process name · ec2ab4bc
      Michal Schmidt authored
      [ Upstream commit bfc5184b ]
      
      Any process is able to send netlink messages with leftover bytes.
      Make the warning rate-limited to prevent too much log spam.
      
      The warning is supposed to help find userspace bugs, so print the
      triggering command name to implicate the buggy program.
      
      [v2: Use pr_warn_ratelimited instead of printk_ratelimited.]
      Signed-off-by: default avatarMichal Schmidt <mschmidt@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      ec2ab4bc
    • Dan Carpenter's avatar
      qlcnic: info leak in qlcnic_dcb_peer_app_info() · d8981d30
      Dan Carpenter authored
      [ Upstream commit 7df566bb ]
      
      This function is called from dcbnl_build_peer_app().  The "info"
      struct isn't initialized at all so we disclose 2 bytes of uninitialized
      stack data.  We should clear it before passing it to the user.
      
      Fixes: 48365e48 ('qlcnic: dcb: Add support for CEE Netlink interface.')
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      d8981d30
    • Eric W. Biederman's avatar
      netlink: Only check file credentials for implicit destinations · bf5e2def
      Eric W. Biederman authored
      [ Upstream commit 2d7a85f4 ]
      
      It was possible to get a setuid root or setcap executable to write to
      it's stdout or stderr (which has been set made a netlink socket) and
      inadvertently reconfigure the networking stack.
      
      To prevent this we check that both the creator of the socket and
      the currentl applications has permission to reconfigure the network
      stack.
      
      Unfortunately this breaks Zebra which always uses sendto/sendmsg
      and creates it's socket without any privileges.
      
      To keep Zebra working don't bother checking if the creator of the
      socket has privilege when a destination address is specified.  Instead
      rely exclusively on the privileges of the sender of the socket.
      
      Note from Andy: This is exactly Eric's code except for some comment
      clarifications and formatting fixes.  Neither I nor, I think, anyone
      else is thrilled with this approach, but I'm hesitant to wait on a
      better fix since 3.15 is almost here.
      
      Note to stable maintainers: This is a mess.  An earlier series of
      patches in 3.15 fix a rather serious security issue (CVE-2014-0181),
      but they did so in a way that breaks Zebra.  The offending series
      includes:
      
          commit aa4cf945
          Author: Eric W. Biederman <ebiederm@xmission.com>
          Date:   Wed Apr 23 14:28:03 2014 -0700
      
              net: Add variants of capable for use on netlink messages
      
      If a given kernel version is missing that series of fixes, it's
      probably worth backporting it and this patch.  if that series is
      present, then this fix is critical if you care about Zebra.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarAndy Lutomirski <luto@amacapital.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      bf5e2def
    • Eric W. Biederman's avatar
      net: Use netlink_ns_capable to verify the permisions of netlink messages · 50b8b6e7
      Eric W. Biederman authored
      [ Upstream commit 90f62cf3 ]
      
      It is possible by passing a netlink socket to a more privileged
      executable and then to fool that executable into writing to the socket
      data that happens to be valid netlink message to do something that
      privileged executable did not intend to do.
      
      To keep this from happening replace bare capable and ns_capable calls
      with netlink_capable, netlink_net_calls and netlink_ns_capable calls.
      Which act the same as the previous calls except they verify that the
      opener of the socket had the desired permissions as well.
      Reported-by: default avatarAndy Lutomirski <luto@amacapital.net>
      Signed-off-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      50b8b6e7
    • Eric W. Biederman's avatar
      net: Add variants of capable for use on netlink messages · 3b175c7b
      Eric W. Biederman authored
      [ Upstream commit aa4cf945 ]
      
      netlink_net_capable - The common case use, for operations that are safe on a network namespace
      netlink_capable - For operations that are only known to be safe for the global root
      netlink_ns_capable - The general case of capable used to handle special cases
      
      __netlink_ns_capable - Same as netlink_ns_capable except taking a netlink_skb_parms instead of
      		       the skbuff of a netlink message.
      Signed-off-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      3b175c7b
    • Eric W. Biederman's avatar
      net: Add variants of capable for use on on sockets · 8c91777b
      Eric W. Biederman authored
      [ Upstream commit a3b299da ]
      
      sk_net_capable - The common case, operations that are safe in a network namespace.
      sk_capable - Operations that are not known to be safe in a network namespace
      sk_ns_capable - The general case for special cases.
      Signed-off-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      8c91777b