1. 01 Oct, 2008 21 commits
    • udp: Export UDP socket lookup function · bcd41303
      KOVACS Krisztian authored
      The iptables tproxy code has to be able to do UDP socket hash lookups,
      so we have to provide an exported lookup function for this purpose.
      Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: Port redirection support for TCP · a3116ac5
      KOVACS Krisztian authored
      Current TCP code relies on the local port of the listening socket
      being the same as the destination port of the incoming
      connection. Port redirection, used by many transparent proxying
      techniques, breaks this, so we have to store the original
      destination port.
      
      This patch extends struct inet_request_sock and stores the incoming
      destination port value there. It also modifies the handshake code to
      use that value as the source port when sending reply packets.
      Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
      Signed-off-by: David S. Miller <davem@davemloft.net>
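The idea can be sketched in user-space C. This is a minimal model, not the kernel code: the structs `listener`, `request`, and `reply` and the helpers `request_init()` and `build_synack()` are hypothetical names chosen for illustration. It shows the reply taking its source port from the value saved at SYN time rather than from the listening socket.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-ins for kernel structures. */
struct listener {
    uint16_t local_port;      /* port the proxy actually listens on */
};

struct request {
    uint16_t orig_dst_port;   /* destination port seen in the incoming SYN */
    uint16_t remote_port;     /* client's source port */
};

struct reply {
    uint16_t src_port;
    uint16_t dst_port;
};

/* Record the port the client addressed; after redirection it may differ
 * from the listener's own port. */
static void request_init(struct request *req, uint16_t syn_src, uint16_t syn_dst)
{
    req->remote_port = syn_src;
    req->orig_dst_port = syn_dst;
}

/* Build the SYN+ACK: the source port comes from the request, not the listener. */
static struct reply build_synack(const struct listener *lsk, const struct request *req)
{
    struct reply r;

    (void)lsk;  /* the listener's port is deliberately not used here */
    r.src_port = req->orig_dst_port;
    r.dst_port = req->remote_port;
    return r;
}
```

With a listener on port 50080 and a client that connected to port 80, the modelled reply leaves with source port 80, which is what the client expects to see.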
    • ipv4: Make Netfilter's ip_route_me_harder() non-local address compatible · 86b08d86
      KOVACS Krisztian authored
      Netfilter's ip_route_me_harder() tries to re-route packets either
      generated or re-routed by Netfilter. This patch changes
      ip_route_me_harder() to handle packets from non-locally-bound sockets
      with IP_TRANSPARENT set as local and to set the appropriate flowi
      flags when re-doing the routing lookup.
      Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: Handle TCP SYN+ACK/ACK/RST transparency · 88ef4a5a
      KOVACS Krisztian authored
      The TCP stack sends out SYN+ACK/ACK/RST reply packets in response to
      incoming packets. The non-local source address check on output bites
      us again, as replies for transparently redirected traffic won't have a
      chance to leave the node.
      
      This patch selectively sets the FLOWI_FLAG_ANYSRC flag when doing the
      route lookup for those replies. Transparent replies are enabled if the
      listening socket has the transparent socket flag set.
      Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Conditionally enable transparent flow flag when connecting · 79876874
      KOVACS Krisztian authored
      Set FLOWI_FLAG_ANYSRC in flowi->flags if the socket has the
      transparent socket option set. This way we selectively enable certain
      connections with non-local source addresses to be routed.
      Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Make inet_sock.h independent of route.h · 1668e010
      KOVACS Krisztian authored
      inet_iif() in inet_sock.h requires route.h. Since users of inet_iif()
      usually require other route.h functionality anyway this patch moves
      inet_iif() to route.h.
      Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Allow binding to non-local addresses if IP_TRANSPARENT is set · b9fb1506
      Tóth László Attila authored
      Setting IP_TRANSPARENT is not really useful without allowing non-local
      binds for the socket. To make user-space code simpler we allow these
      binds even if IP_TRANSPARENT is set but IP_FREEBIND is not.
      Signed-off-by: Tóth László Attila <panther@balabit.hu>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: Implement IP_TRANSPARENT socket option · f5715aea
      KOVACS Krisztian authored
      This patch introduces the IP_TRANSPARENT socket option: enabling it
      makes IPv4 routing omit the non-local source address check on
      output. Setting IP_TRANSPARENT requires the CAP_NET_ADMIN capability.
      Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
      Signed-off-by: David S. Miller <davem@davemloft.net>
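From user space, the option would typically be set with `setsockopt()`. The following is a minimal sketch, assuming a Linux system; the helper name `enable_ip_transparent()` is hypothetical, and the fallback value 19 for `IP_TRANSPARENT` is taken from `linux/in.h` for libc headers that do not define it.

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>

#ifndef IP_TRANSPARENT
#define IP_TRANSPARENT 19   /* value from linux/in.h; fallback for older headers */
#endif

/* Try to mark the socket transparent. Returns 0 on success, or -1 with
 * errno set on failure (EPERM when the caller lacks CAP_NET_ADMIN). */
static int enable_ip_transparent(int fd)
{
    int one = 1;

    return setsockopt(fd, IPPROTO_IP, IP_TRANSPARENT, &one, sizeof(one));
}
```

A proxy would call this on a freshly created socket before `bind()` or `connect()`, and should treat an `EPERM` result as "transparent mode unavailable" rather than a fatal error.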
    • ipv4: Loosen source address check on IPv4 output · a210d01a
      Julian Anastasov authored
      ip_route_output() contains a check to make sure that no flows with
      non-local source IP addresses are routed. This obviously makes using
      such addresses impossible.
      
      This patch introduces a flowi flag which makes omitting this check
      possible. The new flag provides a way of handling transparent and
      non-transparent connections differently.
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: KOVACS Krisztian <hidden@sch.bme.hu>
      Signed-off-by: David S. Miller <davem@davemloft.net>
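The effect of such a flag can be modelled in a few lines of C. This is an illustrative sketch only: `struct flow`, `addr_is_local()`, and `route_output_check()` are hypothetical stand-ins, the numeric value of `FLOWI_FLAG_ANYSRC` is assumed, and the "local" address table is reduced to a single hard-coded address.

```c
#include <stdbool.h>
#include <stdint.h>

#define FLOWI_FLAG_ANYSRC 0x01   /* flag name from the commits; value assumed */

struct flow {
    uint32_t saddr;   /* requested source address, 0 = unspecified */
    uint8_t  flags;
};

/* Hypothetical helper: pretend only 192.0.2.1 is configured locally. */
static bool addr_is_local(uint32_t addr)
{
    return addr == 0xC0000201u;
}

/* Returns 0 if the flow may be routed, -1 if the source check rejects it. */
static int route_output_check(const struct flow *fl)
{
    if (fl->saddr != 0 &&
        !(fl->flags & FLOWI_FLAG_ANYSRC) &&
        !addr_is_local(fl->saddr))
        return -1;
    return 0;
}
```

Flows without the flag keep the strict behaviour, so only callers that explicitly opt in can route with a non-local source address.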
    • net: BUG instead of corrupting memory in pskb_expand_head · 4edd87ad
      Herbert Xu authored
      If the caller of pskb_expand_head specifies a negative nhead
      we'll silently overwrite other people's memory.  This patch
      makes it BUG instead.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
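A user-space analogy may help show why a negative headroom request is dangerous. The sketch below is a simplified model, not pskb_expand_head() itself: `struct buf` and `expand_head()` are hypothetical, and `abort()` stands in for the kernel's BUG().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical flat buffer laid out as [headroom | data]. */
struct buf {
    unsigned char *head;   /* start of the allocation */
    size_t headroom;       /* bytes before the data */
    size_t len;            /* bytes of data */
};

/* Grow the headroom by nhead bytes. A negative nhead could place the copy
 * before the start of the new allocation, so stop instead of corrupting
 * memory. Returns 0 on success, -1 if allocation fails. */
static int expand_head(struct buf *b, int nhead)
{
    if (nhead < 0)
        abort();   /* stands in for BUG() */

    size_t new_headroom = b->headroom + (size_t)nhead;
    unsigned char *n = malloc(new_headroom + b->len);
    if (!n)
        return -1;

    /* Copy the payload to its new position after the enlarged headroom. */
    memcpy(n + new_headroom, b->head + b->headroom, b->len);
    free(b->head);
    b->head = n;
    b->headroom = new_headroom;
    return 0;
}
```

Failing loudly at the check makes the offending caller visible, whereas a silent out-of-bounds write would surface later as unrelated corruption.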
    • ipsec: Put dumpers on the dump list · 12a169e7
      Herbert Xu authored
      Herbert Xu came up with the idea and the original patch to make
      the xfrm_state dump list also contain dumpers:
      
      As it is we go to extraordinary lengths to ensure that states
      don't go away while dumpers go to sleep.  It's much easier if
      we just put the dumpers themselves on the list since they can't
      go away while they're going.
      
      I've also changed the order of addition on new states to prevent
      a never-ending dump.
      
      Timo Teräs improved the patch to apply cleanly to the latest tree,
      made the iteration code more readable by using a common
      struct for entries in the list, implemented the same idea for
      xfrm_policy dumping, and moved the af_key-specific "last" entry
      caching to af_key.
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Timo Teras <timo.teras@iki.fi>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 · b262e603
      David S. Miller authored
      Conflicts:
      
      	drivers/net/wireless/ath9k/core.c
      	drivers/net/wireless/ath9k/main.c
      	net/core/dev.c
    • af_key: Free dumping state on socket close · 05238204
      Timo Teras authored
      Fix an xfrm_{state,policy}_walk leak if the pfkey socket is closed while
      a dump is in progress.
      Signed-off-by: Timo Teras <timo.teras@iki.fi>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: almost identical frag hashing funcs combined · 93c8b90f
      Ilpo Järvinen authored
      $ diff-funcs ip6qhashfn reassembly.c netfilter/nf_conntrack_reasm.c
       --- reassembly.c:ip6qhashfn()
       +++ netfilter/nf_conntrack_reasm.c:ip6qhashfn()
      @@ -1,5 +1,5 @@
      -static unsigned int ip6qhashfn(__be32 id, struct in6_addr *saddr,
      -			       struct in6_addr *daddr)
      +static unsigned int ip6qhashfn(__be32 id, const struct in6_addr *saddr,
      +			       const struct in6_addr *daddr)
       {
       	u32 a, b, c;
      
      @@ -9,7 +9,7 @@
      
       	a += JHASH_GOLDEN_RATIO;
       	b += JHASH_GOLDEN_RATIO;
      -	c += ip6_frags.rnd;
      +	c += nf_frags.rnd;
       	__jhash_mix(a, b, c);
      
       	a += (__force u32)saddr->s6_addr32[3];
      
      And codiff xx.o.old xx.o.new:
      
      net/ipv6/netfilter/nf_conntrack_reasm.c:
        ip6qhashfn         | -512
        nf_hashfn          |   +6
        nf_ct_frag6_gather |  +36
       3 functions changed, 42 bytes added, 512 bytes removed, diff: -470
      net/ipv6/reassembly.c:
        ip6qhashfn    | -512
        ip6_hashfn    |   +7
        ipv6_frag_rcv |  +89
       3 functions changed, 96 bytes added, 512 bytes removed, diff: -416
      
      net/ipv6/reassembly.c:
        inet6_hash_frag | +510
       1 function changed, 510 bytes added, diff: +510
      
      Total: -376
      
      Compile tested.
      Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
      Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
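The diff above shows that the two functions differ only in which random seed they read, so the shared version can take the seed as a parameter. The sketch below illustrates that refactoring pattern under loud assumptions: `toy_mix()` is a made-up mixing step and not `__jhash_mix()`, addresses are plain arrays of four 32-bit words, and all function and variable names are hypothetical.

```c
#include <stdint.h>

/* Toy mixing step; the kernel's __jhash_mix() is not reproduced here. */
static uint32_t toy_mix(uint32_t h, uint32_t v)
{
    h ^= v;
    h *= 0x9E3779B1u;
    return h ^ (h >> 15);
}

/* Shared helper: the per-table random seed is a parameter, not a global. */
static uint32_t toy_hash_frag(uint32_t id, const uint32_t saddr[4],
                              const uint32_t daddr[4], uint32_t rnd)
{
    uint32_t h = toy_mix(rnd, id);

    for (int i = 0; i < 4; i++) {
        h = toy_mix(h, saddr[i]);
        h = toy_mix(h, daddr[i]);
    }
    return h;
}

/* Hypothetical per-subsystem seeds. */
static uint32_t reasm_rnd = 0x12345678u;
static uint32_t conntrack_rnd = 0x9ABCDEF0u;

/* Each subsystem keeps a thin wrapper that only supplies its own seed. */
static uint32_t reasm_hashfn(uint32_t id, const uint32_t s[4], const uint32_t d[4])
{
    return toy_hash_frag(id, s, d, reasm_rnd);
}

static uint32_t conntrack_hashfn(uint32_t id, const uint32_t s[4], const uint32_t d[4])
{
    return toy_hash_frag(id, s, d, conntrack_rnd);
}
```

The wrappers stay tiny, which is consistent with the codiff output above: the large function body exists once, and each caller gains only a few bytes.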
    • XFRM,IPv6: initialize ip6_dst_blackhole_ops.kmem_cachep · 5dc121e9
      Arnaud Ebalard authored
      ip6_dst_blackhole_ops.kmem_cachep is expected to be non-NULL (i.e.
      initialized) when dst_alloc() is called from ip6_dst_blackhole().
      Otherwise, the following oops results (xfrm_larval_drop is now set to
      1 by default):
      
      [   78.697642] Unable to handle kernel paging request for data at address 0x0000004c
      [   78.703449] Faulting instruction address: 0xc0097f54
      [   78.786896] Oops: Kernel access of bad area, sig: 11 [#1]
      [   78.792791] PowerMac
      [   78.798383] Modules linked in: btusb usbhid bluetooth b43 mac80211 cfg80211 ehci_hcd ohci_hcd sungem sungem_phy usbcore ssb
      [   78.804263] NIP: c0097f54 LR: c0334a28 CTR: c002d430
      [   78.809997] REGS: eef19ad0 TRAP: 0300   Not tainted  (2.6.27-rc5)
      [   78.815743] MSR: 00001032 <ME,IR,DR>  CR: 22242482  XER: 20000000
      [   78.821550] DAR: 0000004c, DSISR: 40000000
      [   78.827278] TASK = eef0df40[3035] 'mip6d' THREAD: eef18000
      [   78.827408] GPR00: 00001032 eef19b80 eef0df40 00000000 00008020 eef19c30 00000001 00000000
      [   78.833249] GPR08: eee5101c c05a5c10 ef9ad500 00000000 24242422 1005787c 00000000 1004f960
      [   78.839151] GPR16: 00000000 10024e90 10050040 48030018 0fe44150 00000000 00000000 eef19c30
      [   78.845046] GPR24: eef19e44 00000000 eef19bf8 efb37c14 eef19bf8 00008020 00009032 c0596064
      [   78.856671] NIP [c0097f54] kmem_cache_alloc+0x20/0x94
      [   78.862581] LR [c0334a28] dst_alloc+0x40/0xc4
      [   78.868451] Call Trace:
      [   78.874252] [eef19b80] [c03c1810] ip6_dst_lookup_tail+0x1c8/0x1dc (unreliable)
      [   78.880222] [eef19ba0] [c0334a28] dst_alloc+0x40/0xc4
      [   78.886164] [eef19bb0] [c03cd698] ip6_dst_blackhole+0x28/0x1cc
      [   78.892090] [eef19be0] [c03d9be8] rawv6_sendmsg+0x75c/0xc88
      [   78.897999] [eef19cb0] [c038bca4] inet_sendmsg+0x4c/0x78
      [   78.903907] [eef19cd0] [c03207c8] sock_sendmsg+0xac/0xe4
      [   78.909734] [eef19db0] [c03209e4] sys_sendmsg+0x1e4/0x2a0
      [   78.915540] [eef19f00] [c03220a8] sys_socketcall+0xfc/0x210
      [   78.921406] [eef19f40] [c0014b3c] ret_from_syscall+0x0/0x38
      [   78.927295] --- Exception: c01 at 0xfe2d730
      [   78.927297]     LR = 0xfe2d71c
      [   78.939019] Instruction dump:
      [   78.944835] 91640018 9144001c 900a0000 4bffff44 9421ffe0 7c0802a6 bf810010 7c9d2378
      [   78.950694] 90010024 7fc000a6 57c0045e 7c000124 <83e3004c> 8383005c 2f9f0000 419e0050
      [   78.956464] ---[ end trace 05fa1ed7972487a1 ]---
      
      As noted by Benjamin Thery, the bug was introduced by
      f2fc6a54, which added network
      namespace support to IPv6 routes.
      Signed-off-by: Arnaud Ebalard <arno@natisbad.org>
      Acked-by: Benjamin Thery <benjamin.thery@bull.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mv643xx_eth: hook up skb recycling · 2bcb4b0f
      Lennert Buytenhek authored
      This gives a nice increase in the maximum loss-free packet forwarding
      rate in routing workloads.
      Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: add skb_recycle_check() to enable netdriver skb recycling · 04a4bb55
      Lennert Buytenhek authored
      This patch adds skb_recycle_check(), which can be used by a network
      driver after transmitting an skb to check whether this skb can be
      recycled as a receive buffer.
      
      skb_recycle_check() checks that the skb is not shared or cloned, and
      that it is linear and its head portion large enough (as determined by
      the driver) to be recycled as a receive buffer.  If these conditions
      are met, it does any necessary reference count dropping and cleans
      up the skbuff as if it just came from __alloc_skb().
      Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
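The conditions listed in the message can be written down as a small decision function. This is a simplified user-space model, not the kernel's skb_recycle_check(): `struct toy_skb`, its fields, and `toy_recycle_check()` are hypothetical, and the "clean up" step is reduced to resetting a length field.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for struct sk_buff. */
struct toy_skb {
    int    users;        /* reference count; more than 1 means shared */
    bool   cloned;       /* data area is shared with another skb */
    bool   nonlinear;    /* has paged fragments */
    size_t head_size;    /* size of the linear head buffer */
    size_t len;          /* current payload length */
};

/* Returns true and resets the skb if it can be reused as a receive buffer
 * of at least rx_size bytes; otherwise leaves it untouched. */
static bool toy_recycle_check(struct toy_skb *skb, size_t rx_size)
{
    if (skb->users > 1 || skb->cloned)
        return false;
    if (skb->nonlinear)
        return false;
    if (skb->head_size < rx_size)
        return false;

    skb->len = 0;   /* make it look like a freshly allocated buffer */
    return true;
}
```

A driver's transmit-completion path would call such a check and, on success, put the buffer on its receive ring instead of freeing it.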
    • ipv6: NULL pointer dereference in tcp_v6_send_ack · 2a5b8275
      Denis V. Lunev authored
      The following actions are possible:
      tcp_v6_rcv
        skb->dev = NULL;
        tcp_v6_do_rcv
          tcp_v6_hnd_req
            tcp_check_req
              req->rsk_ops->send_ack == tcp_v6_send_ack
      
      So skb->dev can be NULL in tcp_v6_send_ack. We must obtain the namespace
      from the dst entry.
      
      Thanks to Vitaliy Gusev <vgusev@openvz.org> for initially finding the
      problem in the IPv4 code.
      Signed-off-by: Denis V. Lunev <den@openvz.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
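The shape of the fix can be illustrated with toy types. This is a sketch under assumptions, not the patch itself: all struct layouts and the helper `pkt_net_from_dst()` are hypothetical, and it assumes the packet always carries a valid route entry on this path.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for kernel types. */
struct net { int id; };
struct net_device { struct net *nd_net; };
struct dst_entry { struct net_device *dev; };

struct toy_pkt {
    struct net_device *dev;   /* may be NULL on this path */
    struct dst_entry  *dst;   /* route attached to the packet */
};

/* Look up the namespace through the route's device instead of pkt->dev,
 * which may have been reset to NULL earlier in the receive path. */
static struct net *pkt_net_from_dst(const struct toy_pkt *pkt)
{
    return pkt->dst->dev->nd_net;
}
```

The lookup works even when `dev` is NULL, because it never touches that field.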
    • David S. Miller's avatar
    • tcp: Fix NULL dereference in tcp_v4_send_ack() · 4dd7972d
      Vitaliy Gusev authored
      Fix NULL dereference in tcp_v4_send_ack().
      
      Because skb->dev is reset to NULL in tcp_v4_rcv(), the following oops occurs:
      
      BUG: unable to handle kernel NULL pointer dereference at 00000000000004d0
      IP: [<ffffffff80498503>] tcp_v4_send_ack+0x203/0x250
      
      Stack:  ffff810005dbb000 ffff810015c8acc0 e77b2c6e5f861600 a01610802e90cb6d
       0a08010100000000 88afffff88afffff 0000000080762be8 0000000115c872e8
       0004122000000000 0000000000000001 ffffffff80762b88 0000000000000020
      Call Trace:
       <IRQ>  [<ffffffff80499c33>] tcp_v4_reqsk_send_ack+0x20/0x22
       [<ffffffff8049bce5>] tcp_check_req+0x108/0x14c
       [<ffffffff8047aaf7>] ? rt_intern_hash+0x322/0x33c
       [<ffffffff80499846>] tcp_v4_do_rcv+0x399/0x4ec
       [<ffffffff8045ce4b>] ? skb_checksum+0x4f/0x272
       [<ffffffff80485b74>] ? __inet_lookup_listener+0x14a/0x15c
       [<ffffffff8049babc>] tcp_v4_rcv+0x6a1/0x701
       [<ffffffff8047e739>] ip_local_deliver_finish+0x157/0x24a
       [<ffffffff8047ec9a>] ip_local_deliver+0x72/0x7c
       [<ffffffff8047e5bd>] ip_rcv_finish+0x38d/0x3b2
       [<ffffffff803d3548>] ? scsi_io_completion+0x19d/0x39e
       [<ffffffff8047ebe5>] ip_rcv+0x2a2/0x2e5
       [<ffffffff80462faa>] netif_receive_skb+0x293/0x303
       [<ffffffff80465a9b>] process_backlog+0x80/0xd0
       [<ffffffff802630b4>] ? __rcu_process_callbacks+0x125/0x1b4
       [<ffffffff8046560e>] net_rx_action+0xb9/0x17f
       [<ffffffff80234cc5>] __do_softirq+0xa3/0x164
       [<ffffffff8020c52c>] call_softirq+0x1c/0x28
       <EOI>  [<ffffffff8020de1c>] do_softirq+0x34/0x72
       [<ffffffff80234b8e>] local_bh_enable_ip+0x3f/0x50
       [<ffffffff804d43ca>] _spin_unlock_bh+0x12/0x14
       [<ffffffff804599cd>] release_sock+0xb8/0xc1
       [<ffffffff804a6f9a>] inet_stream_connect+0x146/0x25c
       [<ffffffff80243078>] ? autoremove_wake_function+0x0/0x38
       [<ffffffff8045751f>] sys_connect+0x68/0x8e
       [<ffffffff80291818>] ? fd_install+0x5f/0x68
       [<ffffffff80457784>] ? sock_map_fd+0x55/0x62
       [<ffffffff8020b39b>] system_call_after_swapgs+0x7b/0x80
      
      Code: 41 10 11 d0 83 d0 00 4d 85 ed 89 45 c0 c7 45 c4 08 00 00 00 74 07 41 8b 45 04 89 45 c8 48 8b 43 20 8b 4d b8 48 8d 55 b0 48 89 de <48> 8b 80 d0 04 00 00 48 8b b8 60 01 00 00 e8 20 ae fe ff 65 48
      RIP  [<ffffffff80498503>] tcp_v4_send_ack+0x203/0x250
       RSP <ffffffff80762b78>
      CR2: 00000000000004d0
      Signed-off-by: Vitaliy Gusev <vgusev@openvz.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • phonet: Protect if_phonet.h against multiple inclusions. · 6e50e8a2
      Remi Denis-Courmont authored
      From: Remi Denis-Courmont <remi.denis-courmont@nokia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 30 Sep, 2008 19 commits