    net/smc: Fix setsockopt and sysctl to specify same buffer size again · 833bac7e
    Gerd Bayer authored
    Commit 0227f058 ("net/smc: Unbind r/w buffer size from clcsock
    and make them tunable") introduced the net.smc.rmem and net.smc.wmem
    sysctls to specify the size of buffers to be used for SMC type
    connections. This created a regression for users that specified the
    buffer size via setsockopt() as the effective buffer size was now
    doubled.
    
    Re-introduce the division by 2 in the SMC buffer create code and
    compensate for it by doubling the net.smc.[rw]mem values used for
    initializing sk_rcvbuf/sk_sndbuf at socket creation time. This gives
    users of both methods (setsockopt or sysctl) the effective buffer size
    that they expect.
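
The arithmetic described above can be modelled in a few lines of
user-space C. This is only a sketch under stated assumptions: the function
names are hypothetical stand-ins rather than kernel identifiers, and the
clamping that setsockopt() applies against the system limits is left out.

```c
/* All names below are hypothetical stand-ins, not kernel identifiers. */

/* Socket creation: the sysctl value is doubled into sk_rcvbuf/sk_sndbuf. */
int model_sk_buf_from_sysctl(int sysctl_mem)
{
	return 2 * sysctl_mem;
}

/* setsockopt(): the kernel doubles the user's value (clamping omitted). */
int model_sk_buf_from_setsockopt(int user_val)
{
	return 2 * user_val;
}

/* Buffer creation: the search starts at half of sk_rcvbuf/sk_sndbuf. */
int model_smc_buf_size(int sk_buf)
{
	return sk_buf / 2;
}
```

In this model both paths end up with the size the user asked for, because
each doubling is undone by the single division by 2 at buffer creation.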
    
    Initialize net.smc.rmem and net.smc.wmem from a dedicated constant of 64kB each.
    Internal performance tests show that this value is a good compromise
    between throughput/latency and memory consumption. Also, this decouples
    it from any tuning that was done to net.ipv4.tcp_[rw]mem[1] before the
    module for SMC protocol was loaded. Check that no more than INT_MAX / 2
    is assigned to net.smc.[rw]mem, in order to avoid any overflow condition
    when that is doubled for use in sk_sndbuf or sk_rcvbuf.
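
Why INT_MAX / 2 is the right upper bound can be shown with a small
user-space sketch. The helper names are hypothetical and this is not the
sysctl handler itself; it only illustrates that any value within the bound
can be doubled without leaving the range of int.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical check: accept only values that can be doubled safely. */
bool model_smc_mem_in_range(int val)
{
	return val >= 0 && val <= INT_MAX / 2;
}

/* Doubling as done for sk_sndbuf/sk_rcvbuf; call only on accepted values. */
int model_smc_mem_doubled(int val)
{
	return 2 * val;
}
```

The largest accepted value, INT_MAX / 2, doubles to INT_MAX - 1, which
still fits into an int; anything larger would overflow.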
    
    While at it, drop the confusing sk_buf_size variable from
    __smc_buf_create and name "compressed" buffer size variables more
    consistently.
    
    Background:
    
    Before the commit mentioned above, SMC's buffer allocator in
    __smc_buf_create() always used half of the socket's sk_rcvbuf/sk_sndbuf
    value as the initial value to search for appropriate buffers. If the
    search resorted to using a bigger buffer when all buffers of the
    specified size were busy, twice the effective buffer size actually used
    was stored back to sk_rcvbuf/sk_sndbuf.
    
    When available, buffers of exactly the size that a user had specified as
    input to setsockopt() were used, despite setsockopt()'s documentation in
    "man 7 socket" talking of a mandatory doubling:
    
    [...]
           SO_SNDBUF
                  Sets  or  gets the maximum socket send buffer in bytes.
                  The kernel doubles this value (to allow space for book‐
                  keeping  overhead)  when it is set using setsockopt(2),
                  and this doubled value is  returned  by  getsockopt(2).
                  The     default     value     is     set     by     the
                  /proc/sys/net/core/wmem_default file  and  the  maximum
                  allowed value is set by the /proc/sys/net/core/wmem_max
                  file.  The minimum (doubled) value for this  option  is
                  2048.
    [...]
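
The doubling described in the man page excerpt can be observed from user
space. The sketch below uses a plain TCP socket rather than an SMC socket,
and the helper name sndbuf_after_set() is made up for illustration.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/*
 * Request `requested` bytes via SO_SNDBUF on a fresh TCP socket and return
 * the value reported by getsockopt(), or -1 on any error.
 */
int sndbuf_after_set(int requested)
{
	int reported = 0;
	socklen_t len = sizeof(reported);
	int fd = socket(AF_INET, SOCK_STREAM, 0);

	if (fd < 0)
		return -1;
	if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &requested, sizeof(requested)) < 0 ||
	    getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &reported, &len) < 0)
		reported = -1;
	close(fd);
	return reported;
}
```

On a Linux system the reported value would be expected to be twice the
requested one, unless the request is clamped by net.core.wmem_max or by
the kernel's minimum.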
    
    Fixes: 0227f058 ("net/smc: Unbind r/w buffer size from clcsock and make them tunable")
    Co-developed-by: Jan Karcher <jaka@linux.ibm.com>
    Signed-off-by: Jan Karcher <jaka@linux.ibm.com>
    Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com>
    Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
    Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>