- 15 Mar, 2011 9 commits
-
-
Julian Anastasov authored
Move the estimator reading from estimation_timer to user context. ip_vs_read_estimator() will be used to decode the rate values. As the decoded rates are not set by estimation timer there is no need to reset them in ip_vs_zero_stats. There is no need ip_vs_new_estimator() to encode stats to rates, if the destination is in trash both the stats and the rates are inactive. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
-
Julian Anastasov authored
Remove ustats_seq, IPVS_STAT_INC and IPVS_STAT_ADD because they are not used. They were replaced with u64_stats. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
-
Julian Anastasov authored
Currently, the new percpu counters are not zeroed and the zero commands do not work as expected, we still show the old sum of percpu values. OTOH, we can not reset the percpu counters from user context without causing the incrementing to use old and bogus values. So, as Eric Dumazet suggested fix that by moving all overhead to stats reading in user context. Do not introduce overhead in timer context (estimator) and incrementing (packet handling in softirqs). The new ustats0 field holds the zero point for all counter values, the rates always use 0 as base value as before. When showing the values to user space just give the difference between counters and the base values. The only drawback is that percpu stats are not zeroed, they are accessible only from /proc and are new interface, so it should not be a compatibility problem as long as the sum stats are correct after zeroing. Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Simon Horman <horms@verge.net.au>
-
Julian Anastasov authored
The global tot_stats contains cpustats field just like the stats for dest and svc, so better use it to simplify the usage in estimation_timer. As tot_stats is registered as estimator we can remove the special ip_vs_read_cpu_stats call for tot_stats. Fix ip_vs_read_cpu_stats to be called under stats lock because it is still used as synchronization between estimation timer and user context (the stats readers). Also, make sure ip_vs_stats_percpu_show reads properly the u64 stats from user context. Signed-off-by: Julian Anastasov <ja@ssi.bg> Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Simon Horman <horms@verge.net.au>
-
Julian Anastasov authored
Remove include/net/netns/ip_vs.h because it depends on structures from include/net/ip_vs.h. As ipvs is pointer in struct net it is better to move struct netns_ipvs into include/net/ip_vs.h, so that we can easily use other structures in struct netns_ipvs. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
-
Jesper Juhl authored
There's no sense to 'ct = ct = ' in ip_vs_notrack(). Just assign nf_ct_get()'s return value directly to the pointer variable 'ct' once. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Simon Horman <horms@verge.net.au>
-
Shan Wei authored
The semantic patch that makes this output is available in scripts/coccinelle/api/memdup.cocci. More information about semantic patching is available at http://coccinelle.lip6.fr/Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: Simon Horman <horms@verge.net.au>
-
Julian Anastasov authored
ip_vs_read_cpu_stats is called only from timer, so no need for _bh locks. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Hans Schillstrom <hans@schillstrom.com> Signed-off-by: Simon Horman <horms@verge.net.au>
-
Julian Anastasov authored
Restore the previous behaviour to lookup for fwmark service only when fwmark is non-null. This saves only CPU. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Hans Schillstrom <hans@schillstrom.com> Signed-off-by: Simon Horman <horms@verge.net.au>
-
- 14 Mar, 2011 2 commits
-
-
Stephen Hemminger authored
Message in log because sysctl table was not empty at netns exit WARNING: at net/sysctl_net.c:84 sysctl_net_exit+0x2a/0x2c() Instrumenting showed that the nf_conntrack_timestamp was the entry that was being created but not cleared. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
-
Patrick McHardy authored
As Stephen correctly points out, we need to return -ENOENT in xt_find_match()/xt_find_target() after the patch "netfilter: x_tables: misuse of try_then_request_module" in order to properly indicate a non-existant module to the caller. Signed-off-by: Patrick McHardy <kaber@trash.net>
-
- 09 Mar, 2011 1 commit
-
-
Stephen Hemminger authored
Since xt_find_match() returns ERR_PTR(xx) on error not NULL, the macro try_then_request_module won't work correctly here. The macro expects its first argument will be zero if condition fails. But ERR_PTR(-ENOENT) is not zero. The correct solution is to propagate the error value back. Found by inspection, and compile tested only. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
-
- 08 Mar, 2011 1 commit
-
-
Shan Wei authored
net/netfilter/ipset/ip_set_core.c:615: warning: ‘clash’ may be used uninitialized in this function Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
-
- 28 Feb, 2011 1 commit
-
-
Pablo Neira Ayuso authored
This patch fixes the out of sync scenarios while in SYN_RECV state. Quoting Jozsef, what it happens if we are out of sync if the following: > > b. conntrack entry is outdated, new SYN received > > - (b1) we ignore it but save the initialization data from it > > - (b2) when the reply SYN/ACK receives and it matches the saved data, > > we pick up the new connection This is what it should happen if we are in SYN_RECV state. Initially, the SYN packet hits b1, thus we save data from it. But the SYN/ACK packet is considered a retransmission given that we're in SYN_RECV state. Therefore, we never hit b2 and we don't get in sync. To fix this, we ignore SYN/ACK if we are in SYN_RECV. If the previous packet was a SYN, then we enter the ignore case that get us in sync. This patch helps a lot to conntrackd in stress scenarios (assumming a client that generates lots of small TCP connections). During the failover, consider that the new primary has injected one outdated flow in SYN_RECV state (this is likely to happen if the conntrack event rate is high because the backup will be a bit delayed from the primary). With the current code, if the client starts a new fresh connection that matches the tuple, the SYN packet will be ignored without updating the state tracking, and the SYN+ACK in reply will blocked as it will not pass checkings III or IV (since all state tracking in the original direction is not initialized because of the SYN packet was ignored and the ignore case that get us in sync is not applied). I posted a couple of patches before this one. Changli Gao spotted a simpler way to fix this problem. This patch implements his idea. Cc: Changli Gao <xiaosuo@gmail.com> Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
-
- 25 Feb, 2011 1 commit
-
-
Changli Gao authored
lc and wlc use the same formula, but lblc and lblcr use another one. There is no reason for using two different formulas for the lc variants. The formula used by lc is used by all the lc variants in this patch. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Acked-by: Wensong Zhang <wensong@linux-vs.org> Signed-off-by: Simon Horman <horms@verge.net.au>
-
- 24 Feb, 2011 1 commit
-
-
Changli Gao authored
Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Simon Horman <horms@verge.net.au>
-
- 22 Feb, 2011 1 commit
-
-
Changli Gao authored
Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Simon Horman <horms@verge.net.au>
-
- 16 Feb, 2011 1 commit
-
-
Patrick Schaaf authored
When IP_VS schedulers do not find a destination, they output a terse "WLC: no destination available" message through kernel syslog, which I can not only make sense of because syslog puts them in a logfile together with keepalived checker results. This patch makes the output a bit more informative, by telling you which virtual service failed to find a destination. Example output: kernel: [1539214.552233] IPVS: wlc: TCP 192.168.8.30:22 - no destination available kernel: [1539299.674418] IPVS: wlc: FWM 22 0x00000016 - no destination available I have tested the code for IPv4 and FWM services, as you can see from the example; I do not have an IPv6 setup to test the third code path with. To avoid code duplication, I put a new function ip_vs_scheduler_err() into ip_vs_sched.c, and use that from the schedulers instead of calling IP_VS_ERR_RL directly. Signed-off-by: Patrick Schaaf <netdev@bof.de> Signed-off-by: Simon Horman <horms@verge.net.au>
-
- 15 Feb, 2011 3 commits
-
-
Julian Anastasov authored
Remove code that should not be called anymore. Now when ip_vs_out handles replies for local clients at LOCAL_IN hook we do not need to call conn_out_get and handle_response_icmp from ip_vs_in_icmp* because such lookups were already performed for the ICMP packet and no connection was found. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
-
Tinggong Wang authored
Fix get_curr_sync_buff to keep buffer for 2 seconds as intended, not just for the current jiffie. By this way we will sync more connection structures with single packet. Signed-off-by: Tinggong Wang <wangtinggong@gmail.com> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
-
Florian Westphal authored
Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
-
- 14 Feb, 2011 3 commits
-
-
Jan Engelhardt authored
nfct happens to run after the raw table only. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
-
Stefan Berger authored
This reverts commit 44bd4de9. I have to revert the early loop termination in connlimit since it generates problems when an iptables statement does not use -m state --state NEW before the connlimit match extension. Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
-
Vasiliy Kulikov authored
Struct tmp is copied from userspace. It is not checked whether the "name" field is NULL terminated. This may lead to buffer overflow and passing contents of kernel stack as a module name to try_then_request_module() and, consequently, to modprobe commandline. It would be seen by all userspace processes. Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
-
- 11 Feb, 2011 1 commit
-
-
Stefan Berger authored
The patch below introduces an early termination of the loop that is counting matches. It terminates once the counter has exceeded the threshold provided by the user. There's no point in continuing the loop afterwards and looking at other entries. It plays together with the following code further below: return (connections > info->limit) ^ info->inverse; where connections is the result of the counted connection, which in turn is the matches variable in the loop. So once -> matches = info->limit + 1 alias -> matches > info->limit alias -> matches > threshold we can terminate the loop. Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
-
- 10 Feb, 2011 1 commit
-
-
Patrick McHardy authored
When SYSCTL and PROC_FS and NETFILTER_NETLINK are not enabled: net/built-in.o: In function `try_to_load_type': ip_set_core.c:(.text+0x3ab49): undefined reference to `nfnl_unlock' ip_set_core.c:(.text+0x3ab4e): undefined reference to `nfnl_lock' ... Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
-
- 07 Feb, 2011 1 commit
-
-
Dan Carpenter authored
'!' has higher precedence than '&'. IP_VS_STATE_MASTER is 0x1 so the original code is equivelent to if (!ipvs->sync_state) ... Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com> Signed-off-by: Simon Horman <horms@verge.net.au>
-
- 03 Feb, 2011 1 commit
-
-
Simon Horman authored
Use sctp_app_lock instead of tcp_app_lock in the SCTP protocol module. This appears to be a typo introduced by the netns changes. Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
-
- 02 Feb, 2011 4 commits
-
-
Patrick McHardy authored
Add a new 'devgroup' match to match on the device group of the incoming and outgoing network device of a packet. Signed-off-by: Patrick McHardy <kaber@trash.net>
-
Jozsef Kadlecsik authored
When a message carries multiple commands and one of them triggers an error, we have to report to the userspace which one was that. The line number of the command plays this role and there's an attribute reserved in the header part of the message to be filled out with the error line number. In order not to modify the original message received from the userspace, we construct a new, complete netlink error message and modifies the attribute there, then send it. Netlink is notified not to send its ACK/error message. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu Signed-off-by: Patrick McHardy <kaber@trash.net>
-
Patrick McHardy authored
Add a dummy ip_set_get_ip6_port function that unconditionally returns false for CONFIG_IPV6=n and convert the real function to ipv6_skip_exthdr() to avoid pulling in the ip6_tables module when loading ipset. Signed-off-by: Patrick McHardy <kaber@trash.net>
-
Patrick McHardy authored
Don't fall through in the switch statement, otherwise IPv4 headers are incorrectly parsed again as IPv6 and the return value will always be 'false'. Signed-off-by: Patrick McHardy <kaber@trash.net>
-
- 01 Feb, 2011 8 commits
-
-
Patrick McHardy authored
Signed-off-by: Patrick McHardy <kaber@trash.net>
-
Simon Horman authored
ip_vs_sync_cleanup() may be called from ip_vs_init() on error and thus needs to be accesible from section __init Reporte-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Simon Horman <horms@verge.net.au> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Hans Schillstrom <hans@schillstrom.com> Tested-by: Hans Schillstrom <hans@schillstrom.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
-
Simon Horman authored
This is a rather naieve approach to allowing PVS to compile with CONFIG_SYSCTL disabled. I am working on a more comprehensive patch which will remove compilation of all sysctl-related IPVS code when CONFIG_SYSCTL is disabled. Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Simon Horman <horms@verge.net.au> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Hans Schillstrom <hans@schillstrom.com> Tested-by: Hans Schillstrom <hans@schillstrom.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
-
Simon Horman authored
These variables are unused as a result of the recent netns work. Signed-off-by: Simon Horman <horms@verge.net.au> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Hans Schillstrom <hans@schillstrom.com> Tested-by: Hans Schillstrom <hans@schillstrom.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
-
Simon Horman authored
Signed-off-by: Simon Horman <horms@verge.net.au> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Hans Schillstrom <hans@schillstrom.com> Tested-by: Hans Schillstrom <hans@schillstrom.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
-
Simon Horman authored
Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Simon Horman <horms@verge.net.au> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Hans Schillstrom <hans@schillstrom.com> Tested-by: Hans Schillstrom <hans@schillstrom.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
-
Patrick McHardy authored
net/netfilter/nf_conntrack_netlink.c: In function 'ctnetlink_parse_tuple': net/netfilter/nf_conntrack_netlink.c:832:11: warning: comparison between 'enum ctattr_tuple' and 'enum ctattr_type' Use ctattr_type for the 'type' parameter since that's the type of all attributes passed to this function. Signed-off-by: Patrick McHardy <kaber@trash.net>
-
Patrick McHardy authored
None of the set types need uaccess.h since this is handled centrally in ip_set_core. Most set types additionally don't need bitops.h and spinlock.h since they use neither. tcp.h is only needed by those using before(), udp.h is not needed at all. Signed-off-by: Patrick McHardy <kaber@trash.net>
-