1. 09 Oct, 2013 4 commits
    • David S. Miller's avatar
      Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec · f6063850
      David S. Miller authored
      Steffen Klassert says:
      
      ====================
      1) We used the wrong netlink attribute to verify the
         lenght of the replay window on async events. Fix this by
         using the right netlink attribute.
      
      2) Policy lookups can not match the output interface on forwarding.
         Add the needed informations to the flow informations.
      
      3) We update the pmtu when we receive a ICMPV6_DEST_UNREACH message
         on IPsec with ipv6. This is wrong and leads to strange fragmented
         packets, only ICMPV6_PKT_TOOBIG messages should update the pmtu.
         Fix this by removing the ICMPV6_DEST_UNREACH check from the IPsec
         protocol error handlers.
      
      4) The legacy IPsec anti replay mechanism supports anti replay
         windows up to 32 packets. If a user requests for a bigger
         anti replay window, we use 32 packets but pretend that we use
         the requested window size. Fix from Fan Du.
      
      5) If asynchronous events are enabled and replay_maxdiff is set to
         zero, we generate an async event for every received packet instead
         of checking whether a timeout occurred. Fix from Thomas Egerer.
      
      6) Policies need a refcount when the state resolution timer is armed.
         Otherwise the timer can fire after the policy is deleted.
      
      7) We might dreference a NULL pointer if the hold_queue is empty,
         add a check to avoid this.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f6063850
    • Fabio Estevam's avatar
      net: secure_seq: Fix warning when CONFIG_IPV6 and CONFIG_INET are not selected · cb03db9d
      Fabio Estevam authored
      net_secret() is only used when CONFIG_IPV6 or CONFIG_INET are selected.
      
      Building a defconfig with both of these symbols unselected (Using the ARM
      at91sam9rl_defconfig, for example) leads to the following build warning:
      
      $ make at91sam9rl_defconfig
      #
      # configuration written to .config
      #
      
      $ make net/core/secure_seq.o
      scripts/kconfig/conf --silentoldconfig Kconfig
        CHK     include/config/kernel.release
        CHK     include/generated/uapi/linux/version.h
        CHK     include/generated/utsrelease.h
      make[1]: `include/generated/mach-types.h' is up to date.
        CALL    scripts/checksyscalls.sh
        CC      net/core/secure_seq.o
      net/core/secure_seq.c:17:13: warning: 'net_secret_init' defined but not used [-Wunused-function]
      
      Fix this warning by protecting the definition of net_secret() with these
      symbols.
      Reported-by: default avatarOlof Johansson <olof@lixom.net>
      Signed-off-by: default avatarFabio Estevam <fabio.estevam@freescale.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      cb03db9d
    • David S. Miller's avatar
      Merge branch 'sfc-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc · 9684d7b0
      David S. Miller authored
      Ben Hutchings says:
      
      ====================
      Some more fixes for EF10 support; hopefully the last lot:
      
      1. Fixes for reading statistics, from Edward Cree and Jon Cooper.
      2. Addition of ethtool statistics for packets dropped by the hardware
      before they were associated with a specific function, from Edward Cree.
      3. Only bind to functions that are in control of their associated port,
      as the driver currently assumes this is the case.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9684d7b0
    • Eric Dumazet's avatar
      pkt_sched: fq: fix non TCP flows pacing · 7eec4174
      Eric Dumazet authored
      Steinar reported FQ pacing was not working for UDP flows.
      
      It looks like the initial sk->sk_pacing_rate value of 0 was
      a wrong choice. We should init it to ~0U (unlimited)
      
      Then, TCA_FQ_FLOW_DEFAULT_RATE should be removed because it makes
      no real sense. The default rate is really unlimited, and we
      need to avoid a zero divide.
      Reported-by: default avatarSteinar H. Gunderson <sesse@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7eec4174
  2. 08 Oct, 2013 21 commits
  3. 07 Oct, 2013 9 commits
    • Eric W. Biederman's avatar
      net: Update the sysctl permissions handler to test effective uid/gid · 88ba09df
      Eric W. Biederman authored
      On Tue, 20 Aug 2013 11:40:04 -0500 Eric Sandeen <sandeen@redhat.com> wrote:
      > This was brought up in a Red Hat bug (which may be marked private, I'm sorry):
      >
      > Bug 987055 - open O_WRONLY succeeds on some root owned files in /proc for process running with unprivileged EUID
      >
      > "On RHEL7 some of the files in /proc can be opened for writing by an unprivileged EUID."
      >
      > The flaw existed upstream as well last I checked.
      >
      > This commit in kernel v3.8 caused the regression:
      >
      > commit cff10976
      > Author: Eric W. Biederman <ebiederm@xmission.com>
      > Date:   Fri Nov 16 03:03:01 2012 +0000
      >
      >     net: Update the per network namespace sysctls to be available to the network namespace owner
      >
      >     - Allow anyone with CAP_NET_ADMIN rights in the user namespace of the
      >       the netowrk namespace to change sysctls.
      >     - Allow anyone the uid of the user namespace root the same
      >       permissions over the network namespace sysctls as the global root.
      >     - Allow anyone with gid of the user namespace root group the same
      >       permissions over the network namespace sysctl as the global root group.
      >
      >     Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
      >     Signed-off-by: David S. Miller <davem@davemloft.net>
      >
      > because it changed /sys/net's special permission handler to test current_uid, not
      > current_euid; same for current_gid/current_egid.
      >
      > So in this case, root cannot drop privs via set[ug]id, and retains all privs
      > in this codepath.
      
      Modify the code to use current_euid(), and in_egroup_p, as in done
      in fs/proc/proc_sysctl.c:test_perm()
      
      Cc: stable@vger.kernel.org
      Reviewed-by: default avatarEric Sandeen <sandeen@redhat.com>
      Reported-by: default avatarEric Sandeen <sandeen@redhat.com>
      Signed-off-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      88ba09df
    • Marc Kleine-Budde's avatar
      can: dev: fix nlmsg size calculation in can_get_size() · fe119a05
      Marc Kleine-Budde authored
      This patch fixes the calculation of the nlmsg size, by adding the missing
      nla_total_size().
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      fe119a05
    • Jiri Benc's avatar
      ipv4: fix ineffective source address selection · 0a7e2260
      Jiri Benc authored
      When sending out multicast messages, the source address in inet->mc_addr is
      ignored and rewritten by an autoselected one. This is caused by a typo in
      commit 813b3b5d ("ipv4: Use caller's on-stack flowi as-is in output
      route lookups").
      Signed-off-by: default avatarJiri Benc <jbenc@redhat.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0a7e2260
    • Markus Pargmann's avatar
    • Markus Pargmann's avatar
      net: ethernet: cpsw: Search childs for slave nodes · f468b10e
      Markus Pargmann authored
      The current implementation searches the whole DT for nodes named
      "slave".
      
      This patch changes it to search only child nodes for slaves.
      Signed-off-by: default avatarMarkus Pargmann <mpa@pengutronix.de>
      Acked-by: default avatarMugunthan V N <mugunthanvnm@ti.com>
      Acked-by: default avatarPeter Korsgaard <jacmet@sunsite.dk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f468b10e
    • Alexei Starovoitov's avatar
      net: fix unsafe set_memory_rw from softirq · d45ed4a4
      Alexei Starovoitov authored
      on x86 system with net.core.bpf_jit_enable = 1
      
      sudo tcpdump -i eth1 'tcp port 22'
      
      causes the warning:
      [   56.766097]  Possible unsafe locking scenario:
      [   56.766097]
      [   56.780146]        CPU0
      [   56.786807]        ----
      [   56.793188]   lock(&(&vb->lock)->rlock);
      [   56.799593]   <Interrupt>
      [   56.805889]     lock(&(&vb->lock)->rlock);
      [   56.812266]
      [   56.812266]  *** DEADLOCK ***
      [   56.812266]
      [   56.830670] 1 lock held by ksoftirqd/1/13:
      [   56.836838]  #0:  (rcu_read_lock){.+.+..}, at: [<ffffffff8118f44c>] vm_unmap_aliases+0x8c/0x380
      [   56.849757]
      [   56.849757] stack backtrace:
      [   56.862194] CPU: 1 PID: 13 Comm: ksoftirqd/1 Not tainted 3.12.0-rc3+ #45
      [   56.868721] Hardware name: System manufacturer System Product Name/P8Z77 WS, BIOS 3007 07/26/2012
      [   56.882004]  ffffffff821944c0 ffff88080bbdb8c8 ffffffff8175a145 0000000000000007
      [   56.895630]  ffff88080bbd5f40 ffff88080bbdb928 ffffffff81755b14 0000000000000001
      [   56.909313]  ffff880800000001 ffff880800000000 ffffffff8101178f 0000000000000001
      [   56.923006] Call Trace:
      [   56.929532]  [<ffffffff8175a145>] dump_stack+0x55/0x76
      [   56.936067]  [<ffffffff81755b14>] print_usage_bug+0x1f7/0x208
      [   56.942445]  [<ffffffff8101178f>] ? save_stack_trace+0x2f/0x50
      [   56.948932]  [<ffffffff810cc0a0>] ? check_usage_backwards+0x150/0x150
      [   56.955470]  [<ffffffff810ccb52>] mark_lock+0x282/0x2c0
      [   56.961945]  [<ffffffff810ccfed>] __lock_acquire+0x45d/0x1d50
      [   56.968474]  [<ffffffff810cce6e>] ? __lock_acquire+0x2de/0x1d50
      [   56.975140]  [<ffffffff81393bf5>] ? cpumask_next_and+0x55/0x90
      [   56.981942]  [<ffffffff810cef72>] lock_acquire+0x92/0x1d0
      [   56.988745]  [<ffffffff8118f52a>] ? vm_unmap_aliases+0x16a/0x380
      [   56.995619]  [<ffffffff817628f1>] _raw_spin_lock+0x41/0x50
      [   57.002493]  [<ffffffff8118f52a>] ? vm_unmap_aliases+0x16a/0x380
      [   57.009447]  [<ffffffff8118f52a>] vm_unmap_aliases+0x16a/0x380
      [   57.016477]  [<ffffffff8118f44c>] ? vm_unmap_aliases+0x8c/0x380
      [   57.023607]  [<ffffffff810436b0>] change_page_attr_set_clr+0xc0/0x460
      [   57.030818]  [<ffffffff810cfb8d>] ? trace_hardirqs_on+0xd/0x10
      [   57.037896]  [<ffffffff811a8330>] ? kmem_cache_free+0xb0/0x2b0
      [   57.044789]  [<ffffffff811b59c3>] ? free_object_rcu+0x93/0xa0
      [   57.051720]  [<ffffffff81043d9f>] set_memory_rw+0x2f/0x40
      [   57.058727]  [<ffffffff8104e17c>] bpf_jit_free+0x2c/0x40
      [   57.065577]  [<ffffffff81642cba>] sk_filter_release_rcu+0x1a/0x30
      [   57.072338]  [<ffffffff811108e2>] rcu_process_callbacks+0x202/0x7c0
      [   57.078962]  [<ffffffff81057f17>] __do_softirq+0xf7/0x3f0
      [   57.085373]  [<ffffffff81058245>] run_ksoftirqd+0x35/0x70
      
      cannot reuse jited filter memory, since it's readonly,
      so use original bpf insns memory to hold work_struct
      
      defer kfree of sk_filter until jit completed freeing
      
      tested on x86_64 and i386
      Signed-off-by: default avatarAlexei Starovoitov <ast@plumgrid.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d45ed4a4
    • Ben Hutchings's avatar
      sfc: Only bind to EF10 functions with the LinkCtrl and Trusted flags · ecb1c9cc
      Ben Hutchings authored
      Although we do not yet enable multiple PFs per port, it is possible
      that a board will be reconfigured to enable them while the driver has
      not yet been updated to fully support this.
      
      The most obvious problem is that multiple functions may try to set
      conflicting link settings.  But we will also run into trouble if the
      firmware doesn't consider us fully trusted.  So, abort probing unless
      both the LinkCtrl and Trusted flags are set for this function.
      Signed-off-by: default avatarBen Hutchings <bhutchings@solarflare.com>
      ecb1c9cc
    • Oussama Ghorbel's avatar
      ipv6: Allow the MTU of ipip6 tunnel to be set below 1280 · 582442d6
      Oussama Ghorbel authored
      The (inner) MTU of a ipip6 (IPv4-in-IPv6) tunnel cannot be set below 1280, which is the minimum MTU in IPv6.
      However, there should be no IPv6 on the tunnel interface at all, so the IPv6 rules should not apply.
      More info at https://bugzilla.kernel.org/show_bug.cgi?id=15530
      
      This patch allows to check the minimum MTU for ipv6 tunnel according to these rules:
      -In case the tunnel is configured with ipip6 mode the minimum MTU is 68.
      -In case the tunnel is configured with ip6ip6 or any mode the minimum MTU is 1280.
      Signed-off-by: default avatarOussama Ghorbel <ou.ghorbel@gmail.com>
      Acked-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      582442d6
    • Michael S. Tsirkin's avatar
      netif_set_xps_queue: make cpu mask const · 3573540c
      Michael S. Tsirkin authored
      virtio wants to pass in cpumask_of(cpu), make parameter
      const to avoid build warnings.
      Signed-off-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3573540c
  4. 04 Oct, 2013 6 commits
    • Edward Cree's avatar
      sfc: Add PM and RXDP drop counters to ethtool stats · 568d7a00
      Edward Cree authored
      Recognise the new Packet Memory and RX Data Path counters.
      
      The following counters are added:
      rx_pm_{trunc,discard}_bb_overflow - burst buffer overflowed.  This should not
       occur if BB correctly configured.
      rx_pm_{trunc,discard}_vfifo_full - not enough space in packet memory.  May
       indicate RX performance problems.
      rx_pm_{trunc,discard}_qbb - dropped by 802.1Qbb early discard mechanism.
       Since Qbb is not supported at present, this should not occur.
      rx_pm_discard_mapping - 802.1p priority configured to be dropped.  This should
       not occur in normal operation.
      rx_dp_q_disabled_packets - packet was to be delivered to a queue but queue is
       disabled.  May indicate misconfiguration by the driver.
      rx_dp_di_dropped_packets - parser-dispatcher indicated that a packet should be
       dropped.
      rx_dp_streaming_packets - packet was sent to the RXDP streaming bus, ie. a
       filter directed the packet to the MCPU.
      rx_dp_emerg_{fetch,wait} - RX datapath had to wait for descriptors to be
       loaded.  Indicates performance problems but not drops.
      
      These are only provided if the MC firmware has the
      PM_AND_RXDP_COUNTERS capability.  Otherwise, mask them out.
      Signed-off-by: default avatarBen Hutchings <bhutchings@solarflare.com>
      568d7a00
    • Matthew Slattery's avatar
    • Edward Cree's avatar
      sfc: Refactor EF10 stat mask code to allow for more conditional stats · 4bae913b
      Edward Cree authored
      Previously, efx_ef10_stat_mask returned a static const unsigned long[], which
      meant that each possible mask had to be declared statically with
      STAT_MASK_BITMAP.  Since adding a condition would double the size of the
      decision tree, we now create the bitmask dynamically.
      
      To do this, we have two functions efx_ef10_raw_stat_mask, which returns a u64,
      and efx_ef10_get_stat_mask, which fills in an unsigned long * argument.
      Signed-off-by: default avatarBen Hutchings <bhutchings@solarflare.com>
      4bae913b
    • Edward Cree's avatar
      sfc: Fix internal indices of ethtool stats for EF10 · 87648cc9
      Edward Cree authored
      The indices in nic_data->stats need to match the EF10_STAT_whatever
      enum values.  In efx_nic_update_stats, only mask; gaps are removed in
      efx_ef10_update_stats.
      Signed-off-by: default avatarBen Hutchings <bhutchings@solarflare.com>
      87648cc9
    • Jon Cooper's avatar
    • Eric Dumazet's avatar
      tcp: do not forget FIN in tcp_shifted_skb() · 5e8a402f
      Eric Dumazet authored
      Yuchung found following problem :
      
       There are bugs in the SACK processing code, merging part in
       tcp_shift_skb_data(), that incorrectly resets or ignores the sacked
       skbs FIN flag. When a receiver first SACK the FIN sequence, and later
       throw away ofo queue (e.g., sack-reneging), the sender will stop
       retransmitting the FIN flag, and hangs forever.
      
      Following packetdrill test can be used to reproduce the bug.
      
      $ cat sack-merge-bug.pkt
      `sysctl -q net.ipv4.tcp_fack=0`
      
      // Establish a connection and send 10 MSS.
      0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
      +.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
      +.000 bind(3, ..., ...) = 0
      +.000 listen(3, 1) = 0
      
      +.050 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
      +.000 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 6>
      +.001 < . 1:1(0) ack 1 win 1024
      +.000 accept(3, ..., ...) = 4
      
      +.100 write(4, ..., 12000) = 12000
      +.000 shutdown(4, SHUT_WR) = 0
      +.000 > . 1:10001(10000) ack 1
      +.050 < . 1:1(0) ack 2001 win 257
      +.000 > FP. 10001:12001(2000) ack 1
      +.050 < . 1:1(0) ack 2001 win 257 <sack 10001:11001,nop,nop>
      +.050 < . 1:1(0) ack 2001 win 257 <sack 10001:12002,nop,nop>
      // SACK reneg
      +.050 < . 1:1(0) ack 12001 win 257
      +0 %{ print "unacked: ",tcpi_unacked }%
      +5 %{ print "" }%
      
      First, a typo inverted left/right of one OR operation, then
      code forgot to advance end_seq if the merged skb carried FIN.
      
      Bug was added in 2.6.29 by commit 832d11c5
      ("tcp: Try to restore large SKBs while SACK processing")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarYuchung Cheng <ycheng@google.com>
      Acked-by: default avatarNeal Cardwell <ncardwell@google.com>
      Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Acked-by: default avatarIlpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      5e8a402f