Commit 06b19fe9 authored by David S. Miller

Merge branch 'chelsio-inline-tls'

Atul Gupta says:

====================
Chelsio Inline TLS

Series for Chelsio Inline TLS driver (chtls)

Use the tls ULP infrastructure to register chtls as an Inline TLS driver.
chtls uses TCP sockets to Tx/Rx TLS records.
TCP sk_proto APIs are enhanced to offload TLS records.

T6 adapter provides the following features:
        -TLS record offload, TLS header, encrypt, digest and transmit
        -TLS record receive and decrypt
        -TLS key store
        -TCP/IP engine
        -TLS engine
        -GCM crypto engine [CBC is also supported]

TLS provides security at the transport layer. It uses TCP to provide
reliable end-to-end transport of application data.
It relies on TCP for any retransmission.
A TLS session comprises three parts:
a. TCP/IP connection
b. TLS handshake
c. Record layer processing

The TLS handshake state machine is executed on the host (refer to a
standard implementation, e.g. OpenSSL). setsockopt [SOL_TCP, TCP_ULP]
initializes the TCP proto-ops for Chelsio inline TLS support.
setsockopt(sock, SOL_TCP, TCP_ULP, "tls", sizeof("tls"));
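
A minimal userspace sketch of the call above, with error reporting added. The helper name attach_tls_ulp() is hypothetical and not part of chtls or OpenSSL; the fallback values for SOL_TCP and TCP_ULP are the Linux UAPI values and are only used when the libc headers do not provide them. The sketch assumes the socket is already connected, which the kernel generally expects before a ULP is attached.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Fallbacks for older libc headers; values match the Linux UAPI. */
#ifndef SOL_TCP
#define SOL_TCP 6
#endif
#ifndef TCP_ULP
#define TCP_ULP 31
#endif

/*
 * Hypothetical helper: attach the "tls" ULP to a TCP socket.
 * Returns 0 on success or a negative errno value on failure
 * (for example when the tls module is not available).
 */
int attach_tls_ulp(int fd)
{
	if (setsockopt(fd, SOL_TCP, TCP_ULP, "tls", sizeof("tls")) < 0)
		return -errno;
	return 0;
}
```

A caller would typically run this right after connect() or accept() and fall back to userspace TLS when it returns an error.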

Tx and Rx keys are decided during the handshake and programmed on
the chip after CCS (ChangeCipherSpec) is exchanged.
struct tls12_crypto_info_aes_gcm_128 crypto_info;
setsockopt(sock, SOL_TLS, TLS_TX, &crypto_info, sizeof(crypto_info));
Finished is the first message encrypted/decrypted inline on tx/rx.
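
The key programming step can be sketched from userspace as follows. This is an illustration under stated assumptions, not the code OpenSSL uses: fill_gcm128_info() and program_tx_key() are hypothetical helper names, the key material is assumed to come from the completed handshake, and SOL_TLS falls back to its Linux value (282) when libc does not define it. The struct and the TLS_* constants come from <linux/tls.h>.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/tls.h>

#ifndef SOL_TLS
#define SOL_TLS 282
#endif

/* Hypothetical helper: pack negotiated TLS 1.2 AES-GCM-128 key material. */
void fill_gcm128_info(struct tls12_crypto_info_aes_gcm_128 *ci,
		      const unsigned char *key,  /* 16-byte write key */
		      const unsigned char *salt, /* 4-byte implicit nonce */
		      const unsigned char *iv,   /* 8-byte explicit nonce */
		      const unsigned char *seq)  /* 8-byte record sequence */
{
	assert(ci && key && salt && iv && seq);
	memset(ci, 0, sizeof(*ci));
	ci->info.version = TLS_1_2_VERSION;
	ci->info.cipher_type = TLS_CIPHER_AES_GCM_128;
	memcpy(ci->key, key, TLS_CIPHER_AES_GCM_128_KEY_SIZE);
	memcpy(ci->salt, salt, TLS_CIPHER_AES_GCM_128_SALT_SIZE);
	memcpy(ci->iv, iv, TLS_CIPHER_AES_GCM_128_IV_SIZE);
	memcpy(ci->rec_seq, seq, TLS_CIPHER_AES_GCM_128_REC_SEQ_SIZE);
}

/*
 * Hypothetical helper: hand the Tx key to the kernel.
 * Returns 0 on success or a negative errno value.
 */
int program_tx_key(int fd, const struct tls12_crypto_info_aes_gcm_128 *ci)
{
	if (setsockopt(fd, SOL_TLS, TLS_TX, ci, sizeof(*ci)) < 0)
		return -errno;
	return 0;
}
```

The Rx direction would use TLS_RX with the peer's key material in the same way, on kernels that support it.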

On the Tx path, the TLS engine receives plain text from openssl, inserts
the IV, fetches the tx key, creates cipher text records and generates
the MAC.

The TLS header is added to the cipher text, which is forwarded to the
TCP/IP engine for transport layer processing and transmission on the wire.
TX PATH:
Apps--openssl--chtls---TLS engine---encrypt/auth---TCP/IP engine---wire
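
To make the Tx framing concrete, the sketch below computes how many bytes reach the wire for a given amount of plain text, assuming standard TLS 1.2 AES-GCM framing (header, explicit nonce, cipher text, tag). The constant values mirror those defined in chtls.h in this series; the helper names are hypothetical, and the driver's own expansion calculation may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Values mirror the constants defined in chtls.h in this series. */
#define TLS_HEADER_LENGTH       5     /* type(1) + version(2) + length(2) */
#define AEAD_EXPLICIT_DATA_SIZE 8     /* explicit nonce sent with each record */
#define GCM_TAG_SIZE            16    /* authentication tag */
#define TLS_MFS                 16384 /* maximum fragment size */

/* Hypothetical helper: records needed to carry plain_len bytes. */
size_t tls_record_count(size_t plain_len, size_t mfs)
{
	assert(mfs > 0);
	return (plain_len + mfs - 1) / mfs;
}

/* Hypothetical helper: total bytes on the wire for plain_len bytes. */
size_t tls_gcm_wire_len(size_t plain_len, size_t mfs)
{
	size_t per_record = TLS_HEADER_LENGTH + AEAD_EXPLICIT_DATA_SIZE +
			    GCM_TAG_SIZE;

	return plain_len + tls_record_count(plain_len, mfs) * per_record;
}
```

For example, one full 16384-byte fragment expands by 29 bytes, and one extra byte of plain text starts a second record that costs another 29 bytes of overhead.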

On the Rx side, received data is PDU-aligned at record boundaries.
The TLS engine processes only complete records. If the rx key is
programmed on CCS receive, data is decrypted and plain text is posted
to the host.
RX PATH:
Wire--cipher-text--TCP/IP engine [PDU align]---TLS engine---
decrypt/auth---plain-text--chtls--openssl--application
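
The "complete record" rule on the Rx side can be illustrated in software. The adapter performs this alignment in hardware; the sketch below only shows the check itself, assuming the standard 5-byte TLS record header with a big-endian length field. The helper name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TLS_HEADER_LENGTH 5 /* type(1) + version(2) + length(2) */

/*
 * Hypothetical helper: return the total size (header plus body) of the
 * record at the start of buf if all of it is available, otherwise 0.
 */
size_t tls_complete_record_len(const uint8_t *buf, size_t avail)
{
	size_t body_len;

	assert(buf != NULL || avail == 0);
	if (avail < TLS_HEADER_LENGTH)
		return 0; /* header itself is incomplete */

	/* The length field is carried in network byte order. */
	body_len = ((size_t)buf[3] << 8) | buf[4];
	if (avail - TLS_HEADER_LENGTH < body_len)
		return 0; /* body is incomplete */

	return TLS_HEADER_LENGTH + body_len;
}
```

A receiver built on this check would keep buffering while the helper returns 0 and hand off exactly the returned number of bytes once the record is whole.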

v15: indent fix in mark_urg
     -removed unwanted checks in sendmsg, sendpage, recvmsg,
      close, disconnect, shutdown, destroy sock [Sabrina]
     - removed unused chtls_free_kmap [chtls.h]
     - rebase to top of net-next

v14: -Reverse christmas tree style for variable declarations for
     various functions in chtls_hw.c, chtls_io.c [Stefano Brivio]
     - replaced break with return in tcp_state_to_flowc_state
       [Stefano Brivio]
     - renamed tlstx_seq_number to tlstx_incr_seqnum [Stefano Brivio]
     - use bool for corked, should_push and send_should_push
       [Stefano Brivio]
     - removed "Reviewed-by" tag for Stefano, Sabrina, Dave Watson

v13: handle clean ctx free for HW_RECORD in tls_sk_proto_close
    -removed SOCK_INLINE [chtls.h], using csk_conn_inline instead
     in send_abort_rpl,chtls_send_abort_rpl,chtls_sendmsg,chtls_sendpage
    -removed sk_no_receive [chtls_io.c] replaced with sk_shutdown &
     RCV_SHUTDOWN in chtls_pt_recvmsg, peekmsg and chtls_recvmsg
    -cleaned chtls_expansion_size [Stefano Brivio]
    - u8 conf:3 in tls_sw_context to add TLS_HW_RECORD
    -removed is_tls_skb, using tls_skb_inline [Stefano Brivio]
    -reverse christmas tree formatting in chtls_io.c, chtls_cm.c
     [Stefano Brivio]
    -fixed build warning reported by kbuild robot
    -retained ctx conf enum in chtls_main vs earlier versions, tls_prots
     not used in chtls.
    -cleanup [removed syn_sent, base_prot, added synq] [Michael Werner]
    - passing struct fw_wr_hdr * to ofldtxq_stop [Casey]
    - rebased on top of the current net-next

v12: patch against net-next
    -fixed build error [reported by Julia]
    -replace set_queue with skb_set_queue_mapping [Sabrina]
    -copyright year correction [chtls]

v11: formatting and cleanup, few function rename and error
     handling [Stefano Brivio]
     - ctx freed later for TLS_HW_RECORD
     - split tx and rx in different patch

v10: fixed following based on the review comments of Sabrina Dubroca
     -docs header added for struct tls_device [tls.h]
     -changed TLS_FULL_HW to TLS_HW_RECORD
     -similarly using tls-hw-record instead of tls-inline for
     ethtool feature config
     -added more description to patch sets
     -replaced kmalloc/vmalloc/kfree with kvzalloc/kvfree
     -reordered the patch sequence
     -formatted entire patch for func return values

v9: corrected __u8 and similar usage
    -create_ctx to alloc tls_context
    -tls_hw_prot before sk !establish check

v8: tls_main.c cleanup comment [Dave Watson]

v7: func name change, use sk->sk_prot where required

v6: modify prot only for FULL_HW
   -corrected commit message for patch 11

v5: set TLS_FULL_HW for registered inline tls drivers
   -set TLS_FULL_HW prot for offload connection else move
    to TLS_SW_TX
   -Case handled for interface with same IP [Dave Miller]
   -Removed Specific IP and INADDR_ANY handling [v4]

v4: removed chtls ULP type, retained tls ULP
   -registered chtls with net tls
   -defined struct tls_device to register the Inline drivers
   -ethtool interface tls-inline to enable Inline TLS for interface
   -prot update to support inline TLS

v3: fixed the kbuild test issues
   -made a few functions static
   -initialized a few variables

v2: fixed the following based on the review comments of Stephan Mueller,
    Stefano Brivio and Hannes Frederic
    -Added more details in cover letter
    -Fixed indentation and formatting issues
    -Using aes instead of aes-generic
    -memset key info after programming the key on chip
    -reordered the patch sequence
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
parents d4069fe6 bd7f4857
@@ -29,3 +29,14 @@ config CHELSIO_IPSEC_INLINE
default n
---help---
Enable support for IPSec Tx Inline.
config CRYPTO_DEV_CHELSIO_TLS
tristate "Chelsio Crypto Inline TLS Driver"
depends on CHELSIO_T4
depends on TLS
select CRYPTO_DEV_CHELSIO
---help---
Support Chelsio Inline TLS with Chelsio crypto accelerator.
To compile this driver as a module, choose M here: the module
will be called chtls.
@@ -3,3 +3,4 @@ ccflags-y := -Idrivers/net/ethernet/chelsio/cxgb4
obj-$(CONFIG_CRYPTO_DEV_CHELSIO) += chcr.o
chcr-objs := chcr_core.o chcr_algo.o
chcr-$(CONFIG_CHELSIO_IPSEC_INLINE) += chcr_ipsec.o
obj-$(CONFIG_CRYPTO_DEV_CHELSIO_TLS) += chtls/
@@ -86,6 +86,39 @@
KEY_CONTEXT_OPAD_PRESENT_M)
#define KEY_CONTEXT_OPAD_PRESENT_F KEY_CONTEXT_OPAD_PRESENT_V(1U)
#define TLS_KEYCTX_RXFLIT_CNT_S 24
#define TLS_KEYCTX_RXFLIT_CNT_V(x) ((x) << TLS_KEYCTX_RXFLIT_CNT_S)
#define TLS_KEYCTX_RXPROT_VER_S 20
#define TLS_KEYCTX_RXPROT_VER_M 0xf
#define TLS_KEYCTX_RXPROT_VER_V(x) ((x) << TLS_KEYCTX_RXPROT_VER_S)
#define TLS_KEYCTX_RXCIPH_MODE_S 16
#define TLS_KEYCTX_RXCIPH_MODE_M 0xf
#define TLS_KEYCTX_RXCIPH_MODE_V(x) ((x) << TLS_KEYCTX_RXCIPH_MODE_S)
#define TLS_KEYCTX_RXAUTH_MODE_S 12
#define TLS_KEYCTX_RXAUTH_MODE_M 0xf
#define TLS_KEYCTX_RXAUTH_MODE_V(x) ((x) << TLS_KEYCTX_RXAUTH_MODE_S)
#define TLS_KEYCTX_RXCIAU_CTRL_S 11
#define TLS_KEYCTX_RXCIAU_CTRL_V(x) ((x) << TLS_KEYCTX_RXCIAU_CTRL_S)
#define TLS_KEYCTX_RX_SEQCTR_S 9
#define TLS_KEYCTX_RX_SEQCTR_M 0x3
#define TLS_KEYCTX_RX_SEQCTR_V(x) ((x) << TLS_KEYCTX_RX_SEQCTR_S)
#define TLS_KEYCTX_RX_VALID_S 8
#define TLS_KEYCTX_RX_VALID_V(x) ((x) << TLS_KEYCTX_RX_VALID_S)
#define TLS_KEYCTX_RXCK_SIZE_S 3
#define TLS_KEYCTX_RXCK_SIZE_M 0x7
#define TLS_KEYCTX_RXCK_SIZE_V(x) ((x) << TLS_KEYCTX_RXCK_SIZE_S)
#define TLS_KEYCTX_RXMK_SIZE_S 0
#define TLS_KEYCTX_RXMK_SIZE_M 0x7
#define TLS_KEYCTX_RXMK_SIZE_V(x) ((x) << TLS_KEYCTX_RXMK_SIZE_S)
#define CHCR_HASH_MAX_DIGEST_SIZE 64
#define CHCR_MAX_SHA_DIGEST_SIZE 64
@@ -176,6 +209,15 @@
KEY_CONTEXT_SALT_PRESENT_V(1) | \
KEY_CONTEXT_CTX_LEN_V((ctx_len)))
#define FILL_KEY_CRX_HDR(ck_size, mk_size, d_ck, opad, ctx_len) \
htonl(TLS_KEYCTX_RXMK_SIZE_V(mk_size) | \
TLS_KEYCTX_RXCK_SIZE_V(ck_size) | \
TLS_KEYCTX_RX_VALID_V(1) | \
TLS_KEYCTX_RX_SEQCTR_V(3) | \
TLS_KEYCTX_RXAUTH_MODE_V(4) | \
TLS_KEYCTX_RXCIPH_MODE_V(2) | \
TLS_KEYCTX_RXFLIT_CNT_V((ctx_len)))
#define FILL_WR_OP_CCTX_SIZE \
htonl( \
FW_CRYPTO_LOOKASIDE_WR_OPCODE_V( \
......
@@ -65,10 +65,58 @@ struct uld_ctx;
struct _key_ctx {
__be32 ctx_hdr;
u8 salt[MAX_SALT];
__be64 reserverd;
__be64 iv_to_auth;
unsigned char key[0];
};
#define KEYCTX_TX_WR_IV_S 55
#define KEYCTX_TX_WR_IV_M 0x1ffULL
#define KEYCTX_TX_WR_IV_V(x) ((x) << KEYCTX_TX_WR_IV_S)
#define KEYCTX_TX_WR_IV_G(x) \
(((x) >> KEYCTX_TX_WR_IV_S) & KEYCTX_TX_WR_IV_M)
#define KEYCTX_TX_WR_AAD_S 47
#define KEYCTX_TX_WR_AAD_M 0xffULL
#define KEYCTX_TX_WR_AAD_V(x) ((x) << KEYCTX_TX_WR_AAD_S)
#define KEYCTX_TX_WR_AAD_G(x) (((x) >> KEYCTX_TX_WR_AAD_S) & \
KEYCTX_TX_WR_AAD_M)
#define KEYCTX_TX_WR_AADST_S 39
#define KEYCTX_TX_WR_AADST_M 0xffULL
#define KEYCTX_TX_WR_AADST_V(x) ((x) << KEYCTX_TX_WR_AADST_S)
#define KEYCTX_TX_WR_AADST_G(x) \
(((x) >> KEYCTX_TX_WR_AADST_S) & KEYCTX_TX_WR_AADST_M)
#define KEYCTX_TX_WR_CIPHER_S 30
#define KEYCTX_TX_WR_CIPHER_M 0x1ffULL
#define KEYCTX_TX_WR_CIPHER_V(x) ((x) << KEYCTX_TX_WR_CIPHER_S)
#define KEYCTX_TX_WR_CIPHER_G(x) \
(((x) >> KEYCTX_TX_WR_CIPHER_S) & KEYCTX_TX_WR_CIPHER_M)
#define KEYCTX_TX_WR_CIPHERST_S 23
#define KEYCTX_TX_WR_CIPHERST_M 0x7f
#define KEYCTX_TX_WR_CIPHERST_V(x) ((x) << KEYCTX_TX_WR_CIPHERST_S)
#define KEYCTX_TX_WR_CIPHERST_G(x) \
(((x) >> KEYCTX_TX_WR_CIPHERST_S) & KEYCTX_TX_WR_CIPHERST_M)
#define KEYCTX_TX_WR_AUTH_S 14
#define KEYCTX_TX_WR_AUTH_M 0x1ff
#define KEYCTX_TX_WR_AUTH_V(x) ((x) << KEYCTX_TX_WR_AUTH_S)
#define KEYCTX_TX_WR_AUTH_G(x) \
(((x) >> KEYCTX_TX_WR_AUTH_S) & KEYCTX_TX_WR_AUTH_M)
#define KEYCTX_TX_WR_AUTHST_S 7
#define KEYCTX_TX_WR_AUTHST_M 0x7f
#define KEYCTX_TX_WR_AUTHST_V(x) ((x) << KEYCTX_TX_WR_AUTHST_S)
#define KEYCTX_TX_WR_AUTHST_G(x) \
(((x) >> KEYCTX_TX_WR_AUTHST_S) & KEYCTX_TX_WR_AUTHST_M)
#define KEYCTX_TX_WR_AUTHIN_S 0
#define KEYCTX_TX_WR_AUTHIN_M 0x7f
#define KEYCTX_TX_WR_AUTHIN_V(x) ((x) << KEYCTX_TX_WR_AUTHIN_S)
#define KEYCTX_TX_WR_AUTHIN_G(x) \
(((x) >> KEYCTX_TX_WR_AUTHIN_S) & KEYCTX_TX_WR_AUTHIN_M)
struct chcr_wr {
struct fw_crypto_lookaside_wr wreq;
struct ulp_txpkt ulptx;
@@ -90,6 +138,11 @@ struct uld_ctx {
struct chcr_dev *dev;
};
struct sge_opaque_hdr {
void *dev;
dma_addr_t addr[MAX_SKB_FRAGS + 1];
};
struct chcr_ipsec_req {
struct ulp_txpkt ulptx;
struct ulptx_idata sc_imm;
......
ccflags-y := -Idrivers/net/ethernet/chelsio/cxgb4 -Idrivers/crypto/chelsio/
obj-$(CONFIG_CRYPTO_DEV_CHELSIO_TLS) += chtls.o
chtls-objs := chtls_main.o chtls_cm.o chtls_io.o chtls_hw.o
/*
* Copyright (c) 2018 Chelsio Communications, Inc.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#ifndef __CHTLS_H__
#define __CHTLS_H__
#include <crypto/aes.h>
#include <crypto/algapi.h>
#include <crypto/hash.h>
#include <crypto/sha.h>
#include <crypto/authenc.h>
#include <crypto/ctr.h>
#include <crypto/gf128mul.h>
#include <crypto/internal/aead.h>
#include <crypto/null.h>
#include <crypto/internal/skcipher.h>
#include <crypto/aead.h>
#include <crypto/scatterwalk.h>
#include <crypto/internal/hash.h>
#include <linux/tls.h>
#include <net/tls.h>
#include "t4fw_api.h"
#include "t4_msg.h"
#include "cxgb4.h"
#include "cxgb4_uld.h"
#include "l2t.h"
#include "chcr_algo.h"
#include "chcr_core.h"
#include "chcr_crypto.h"
#define MAX_IVS_PAGE 256
#define TLS_KEY_CONTEXT_SZ 64
#define CIPHER_BLOCK_SIZE 16
#define GCM_TAG_SIZE 16
#define KEY_ON_MEM_SZ 16
#define AEAD_EXPLICIT_DATA_SIZE 8
#define TLS_HEADER_LENGTH 5
#define SCMD_CIPH_MODE_AES_GCM 2
/* Any MFS size should work; the value comes from openssl */
#define TLS_MFS 16384
#define RSS_HDR sizeof(struct rss_header)
#define TLS_WR_CPL_LEN \
(sizeof(struct fw_tlstx_data_wr) + sizeof(struct cpl_tx_tls_sfo))
enum {
CHTLS_KEY_CONTEXT_DSGL,
CHTLS_KEY_CONTEXT_IMM,
CHTLS_KEY_CONTEXT_DDR,
};
enum {
CHTLS_LISTEN_START,
CHTLS_LISTEN_STOP,
};
/* Flags for return value of CPL message handlers */
enum {
CPL_RET_BUF_DONE = 1, /* buffer processing done */
CPL_RET_BAD_MSG = 2, /* bad CPL message */
CPL_RET_UNKNOWN_TID = 4 /* unexpected unknown TID */
};
#define TLS_RCV_ST_READ_HEADER 0xF0
#define TLS_RCV_ST_READ_BODY 0xF1
#define TLS_RCV_ST_READ_DONE 0xF2
#define TLS_RCV_ST_READ_NB 0xF3
#define LISTEN_INFO_HASH_SIZE 32
#define RSPQ_HASH_BITS 5
struct listen_info {
struct listen_info *next; /* Link to next entry */
struct sock *sk; /* The listening socket */
unsigned int stid; /* The server TID */
};
enum {
T4_LISTEN_START_PENDING,
T4_LISTEN_STARTED
};
enum csk_flags {
CSK_CALLBACKS_CHKD, /* socket callbacks have been sanitized */
CSK_ABORT_REQ_RCVD, /* received one ABORT_REQ_RSS message */
CSK_TX_MORE_DATA, /* sending ULP data; don't set SHOVE bit */
CSK_TX_WAIT_IDLE, /* suspend Tx until in-flight data is ACKed */
CSK_ABORT_SHUTDOWN, /* shouldn't send more abort requests */
CSK_ABORT_RPL_PENDING, /* expecting an abort reply */
CSK_CLOSE_CON_REQUESTED,/* we've sent a close_conn_req */
CSK_TX_DATA_SENT, /* sent a TX_DATA WR on this connection */
CSK_TX_FAILOVER, /* Tx traffic failing over */
CSK_UPDATE_RCV_WND, /* Need to update rcv window */
CSK_RST_ABORTED, /* outgoing RST was aborted */
CSK_TLS_HANDSHK, /* TLS Handshake */
CSK_CONN_INLINE, /* Connection on HW */
};
struct listen_ctx {
struct sock *lsk;
struct chtls_dev *cdev;
struct sk_buff_head synq;
u32 state;
};
struct key_map {
unsigned long *addr;
unsigned int start;
unsigned int available;
unsigned int size;
spinlock_t lock; /* lock for key id request from map */
} __packed;
struct tls_scmd {
u32 seqno_numivs;
u32 ivgen_hdrlen;
};
struct chtls_dev {
struct tls_device tlsdev;
struct list_head list;
struct cxgb4_lld_info *lldi;
struct pci_dev *pdev;
struct listen_info *listen_hash_tab[LISTEN_INFO_HASH_SIZE];
spinlock_t listen_lock; /* lock for listen list */
struct net_device **ports;
struct tid_info *tids;
unsigned int pfvf;
const unsigned short *mtus;
struct idr hwtid_idr;
struct idr stid_idr;
spinlock_t idr_lock ____cacheline_aligned_in_smp;
struct net_device *egr_dev[NCHAN * 2];
struct sk_buff *rspq_skb_cache[1 << RSPQ_HASH_BITS];
struct sk_buff *askb;
struct sk_buff_head deferq;
struct work_struct deferq_task;
struct list_head list_node;
struct list_head rcu_node;
struct list_head na_node;
unsigned int send_page_order;
struct key_map kmap;
};
struct chtls_hws {
struct sk_buff_head sk_recv_queue;
u8 txqid;
u8 ofld;
u16 type;
u16 rstate;
u16 keyrpl;
u16 pldlen;
u16 rcvpld;
u16 compute;
u16 expansion;
u16 keylen;
u16 pdus;
u16 adjustlen;
u16 ivsize;
u16 txleft;
u32 mfs;
s32 txkey;
s32 rxkey;
u32 fcplenmax;
u32 copied_seq;
u64 tx_seq_no;
struct tls_scmd scmd;
struct tls12_crypto_info_aes_gcm_128 crypto_info;
};
struct chtls_sock {
struct sock *sk;
struct chtls_dev *cdev;
struct l2t_entry *l2t_entry; /* pointer to the L2T entry */
struct net_device *egress_dev; /* TX_CHAN for act open retry */
struct sk_buff_head txq;
struct sk_buff *wr_skb_head;
struct sk_buff *wr_skb_tail;
struct sk_buff *ctrl_skb_cache;
struct sk_buff *txdata_skb_cache; /* abort path messages */
struct kref kref;
unsigned long flags;
u32 opt2;
u32 wr_credits;
u32 wr_unacked;
u32 wr_max_credits;
u32 wr_nondata;
u32 hwtid; /* TCP Control Block ID */
u32 txq_idx;
u32 rss_qid;
u32 tid;
u32 idr;
u32 mss;
u32 ulp_mode;
u32 tx_chan;
u32 rx_chan;
u32 sndbuf;
u32 txplen_max;
u32 mtu_idx; /* MTU table index */
u32 smac_idx;
u8 port_id;
u8 tos;
u16 resv2;
u32 delack_mode;
u32 delack_seq;
void *passive_reap_next; /* placeholder for passive */
struct chtls_hws tlshws;
struct synq {
struct sk_buff *next;
struct sk_buff *prev;
} synq;
struct listen_ctx *listen_ctx;
};
struct tls_hdr {
u8 type;
u16 version;
u16 length;
} __packed;
struct tlsrx_cmp_hdr {
u8 type;
u16 version;
u16 length;
u64 tls_seq;
u16 reserved1;
u8 res_to_mac_error;
} __packed;
/* res_to_mac_error fields */
#define TLSRX_HDR_PKT_INT_ERROR_S 4
#define TLSRX_HDR_PKT_INT_ERROR_M 0x1
#define TLSRX_HDR_PKT_INT_ERROR_V(x) \
((x) << TLSRX_HDR_PKT_INT_ERROR_S)
#define TLSRX_HDR_PKT_INT_ERROR_G(x) \
(((x) >> TLSRX_HDR_PKT_INT_ERROR_S) & TLSRX_HDR_PKT_INT_ERROR_M)
#define TLSRX_HDR_PKT_INT_ERROR_F TLSRX_HDR_PKT_INT_ERROR_V(1U)
#define TLSRX_HDR_PKT_SPP_ERROR_S 3
#define TLSRX_HDR_PKT_SPP_ERROR_M 0x1
#define TLSRX_HDR_PKT_SPP_ERROR_V(x) ((x) << TLSRX_HDR_PKT_SPP_ERROR_S)
#define TLSRX_HDR_PKT_SPP_ERROR_G(x) \
(((x) >> TLSRX_HDR_PKT_SPP_ERROR_S) & TLSRX_HDR_PKT_SPP_ERROR_M)
#define TLSRX_HDR_PKT_SPP_ERROR_F TLSRX_HDR_PKT_SPP_ERROR_V(1U)
#define TLSRX_HDR_PKT_CCDX_ERROR_S 2
#define TLSRX_HDR_PKT_CCDX_ERROR_M 0x1
#define TLSRX_HDR_PKT_CCDX_ERROR_V(x) ((x) << TLSRX_HDR_PKT_CCDX_ERROR_S)
#define TLSRX_HDR_PKT_CCDX_ERROR_G(x) \
(((x) >> TLSRX_HDR_PKT_CCDX_ERROR_S) & TLSRX_HDR_PKT_CCDX_ERROR_M)
#define TLSRX_HDR_PKT_CCDX_ERROR_F TLSRX_HDR_PKT_CCDX_ERROR_V(1U)
#define TLSRX_HDR_PKT_PAD_ERROR_S 1
#define TLSRX_HDR_PKT_PAD_ERROR_M 0x1
#define TLSRX_HDR_PKT_PAD_ERROR_V(x) ((x) << TLSRX_HDR_PKT_PAD_ERROR_S)
#define TLSRX_HDR_PKT_PAD_ERROR_G(x) \
(((x) >> TLSRX_HDR_PKT_PAD_ERROR_S) & TLSRX_HDR_PKT_PAD_ERROR_M)
#define TLSRX_HDR_PKT_PAD_ERROR_F TLSRX_HDR_PKT_PAD_ERROR_V(1U)
#define TLSRX_HDR_PKT_MAC_ERROR_S 0
#define TLSRX_HDR_PKT_MAC_ERROR_M 0x1
#define TLSRX_HDR_PKT_MAC_ERROR_V(x) ((x) << TLSRX_HDR_PKT_MAC_ERROR_S)
#define TLSRX_HDR_PKT_MAC_ERROR_G(x) \
(((x) >> TLSRX_HDR_PKT_MAC_ERROR_S) & TLSRX_HDR_PKT_MAC_ERROR_M)
#define TLSRX_HDR_PKT_MAC_ERROR_F TLSRX_HDR_PKT_MAC_ERROR_V(1U)
#define TLSRX_HDR_PKT_ERROR_M 0x1F
struct ulp_mem_rw {
__be32 cmd;
__be32 len16; /* command length */
__be32 dlen; /* data length in 32-byte units */
__be32 lock_addr;
};
struct tls_key_wr {
__be32 op_to_compl;
__be32 flowid_len16;
__be32 ftid;
u8 reneg_to_write_rx;
u8 protocol;
__be16 mfs;
};
struct tls_key_req {
struct tls_key_wr wr;
struct ulp_mem_rw req;
struct ulptx_idata sc_imm;
};
/*
* This lives in skb->cb and is used to chain WRs in a linked list.
*/
struct wr_skb_cb {
struct l2t_skb_cb l2t; /* reserve space for l2t CB */
struct sk_buff *next_wr; /* next write request */
};
/* Per-skb backlog handler. Run when a socket's backlog is processed. */
struct blog_skb_cb {
void (*backlog_rcv)(struct sock *sk, struct sk_buff *skb);
struct chtls_dev *cdev;
};
/*
* Similar to tcp_skb_cb but with ULP elements added to support TLS,
* etc.
*/
struct ulp_skb_cb {
struct wr_skb_cb wr; /* reserve space for write request */
u16 flags; /* TCP-like flags */
u8 psh;
u8 ulp_mode; /* ULP mode/submode of sk_buff */
u32 seq; /* TCP sequence number */
union { /* ULP-specific fields */
struct {
u8 type;
u8 ofld;
u8 iv;
} tls;
} ulp;
};
#define ULP_SKB_CB(skb) ((struct ulp_skb_cb *)&((skb)->cb[0]))
#define BLOG_SKB_CB(skb) ((struct blog_skb_cb *)(skb)->cb)
/*
* Flags for ulp_skb_cb.flags.
*/
enum {
ULPCB_FLAG_NEED_HDR = 1 << 0, /* packet needs a TX_DATA_WR header */
ULPCB_FLAG_NO_APPEND = 1 << 1, /* don't grow this skb */
ULPCB_FLAG_BARRIER = 1 << 2, /* set TX_WAIT_IDLE after sending */
ULPCB_FLAG_HOLD = 1 << 3, /* skb not ready for Tx yet */
ULPCB_FLAG_COMPL = 1 << 4, /* request WR completion */
ULPCB_FLAG_URG = 1 << 5, /* urgent data */
ULPCB_FLAG_TLS_ND = 1 << 6, /* payload of zero length */
ULPCB_FLAG_NO_HDR = 1 << 7, /* not an ofld wr */
};
/* The ULP mode/submode of an skbuff */
#define skb_ulp_mode(skb) (ULP_SKB_CB(skb)->ulp_mode)
#define TCP_PAGE(sk) (sk->sk_frag.page)
#define TCP_OFF(sk) (sk->sk_frag.offset)
static inline struct chtls_dev *to_chtls_dev(struct tls_device *tlsdev)
{
return container_of(tlsdev, struct chtls_dev, tlsdev);
}
static inline void csk_set_flag(struct chtls_sock *csk,
enum csk_flags flag)
{
__set_bit(flag, &csk->flags);
}
static inline void csk_reset_flag(struct chtls_sock *csk,
enum csk_flags flag)
{
__clear_bit(flag, &csk->flags);
}
static inline bool csk_conn_inline(const struct chtls_sock *csk)
{
return test_bit(CSK_CONN_INLINE, &csk->flags);
}
static inline int csk_flag(const struct sock *sk, enum csk_flags flag)
{
struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
if (!csk_conn_inline(csk))
return 0;
return test_bit(flag, &csk->flags);
}
static inline int csk_flag_nochk(const struct chtls_sock *csk,
enum csk_flags flag)
{
return test_bit(flag, &csk->flags);
}
static inline void *cplhdr(struct sk_buff *skb)
{
return skb->data;
}
static inline int is_neg_adv(unsigned int status)
{
return status == CPL_ERR_RTX_NEG_ADVICE ||
status == CPL_ERR_KEEPALV_NEG_ADVICE ||
status == CPL_ERR_PERSIST_NEG_ADVICE;
}
static inline void process_cpl_msg(void (*fn)(struct sock *, struct sk_buff *),
struct sock *sk,
struct sk_buff *skb)
{
skb_reset_mac_header(skb);
skb_reset_network_header(skb);
skb_reset_transport_header(skb);
bh_lock_sock(sk);
if (unlikely(sock_owned_by_user(sk))) {
BLOG_SKB_CB(skb)->backlog_rcv = fn;
__sk_add_backlog(sk, skb);
} else {
fn(sk, skb);
}
bh_unlock_sock(sk);
}
static inline void chtls_sock_free(struct kref *ref)
{
struct chtls_sock *csk = container_of(ref, struct chtls_sock,
kref);
kfree(csk);
}
static inline void __chtls_sock_put(const char *fn, struct chtls_sock *csk)
{
kref_put(&csk->kref, chtls_sock_free);
}
static inline void __chtls_sock_get(const char *fn,
struct chtls_sock *csk)
{
kref_get(&csk->kref);
}
static inline void send_or_defer(struct sock *sk, struct tcp_sock *tp,
struct sk_buff *skb, int through_l2t)
{
struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
if (through_l2t) {
/* send through L2T */
cxgb4_l2t_send(csk->egress_dev, skb, csk->l2t_entry);
} else {
/* send directly */
cxgb4_ofld_send(csk->egress_dev, skb);
}
}
typedef int (*chtls_handler_func)(struct chtls_dev *, struct sk_buff *);
extern chtls_handler_func chtls_handlers[NUM_CPL_CMDS];
void chtls_install_cpl_ops(struct sock *sk);
int chtls_init_kmap(struct chtls_dev *cdev, struct cxgb4_lld_info *lldi);
void chtls_listen_stop(struct chtls_dev *cdev, struct sock *sk);
int chtls_listen_start(struct chtls_dev *cdev, struct sock *sk);
void chtls_close(struct sock *sk, long timeout);
int chtls_disconnect(struct sock *sk, int flags);
void chtls_shutdown(struct sock *sk, int how);
void chtls_destroy_sock(struct sock *sk);
int chtls_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
int chtls_recvmsg(struct sock *sk, struct msghdr *msg,
size_t len, int nonblock, int flags, int *addr_len);
int chtls_sendpage(struct sock *sk, struct page *page,
int offset, size_t size, int flags);
int send_tx_flowc_wr(struct sock *sk, int compl,
u32 snd_nxt, u32 rcv_nxt);
void chtls_tcp_push(struct sock *sk, int flags);
int chtls_push_frames(struct chtls_sock *csk, int comp);
int chtls_set_tcb_tflag(struct sock *sk, unsigned int bit_pos, int val);
int chtls_setkey(struct chtls_sock *csk, u32 keylen, u32 mode);
void skb_entail(struct sock *sk, struct sk_buff *skb, int flags);
unsigned int keyid_to_addr(int start_addr, int keyid);
void free_tls_keyid(struct sock *sk);
#endif
/*
* Copyright (c) 2018 Chelsio Communications, Inc.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*
* Written by: Atul Gupta (atul.gupta@chelsio.com)
*/
#include <linux/module.h>
#include <linux/list.h>
#include <linux/workqueue.h>
#include <linux/skbuff.h>
#include <linux/timer.h>
#include <linux/notifier.h>
#include <linux/inetdevice.h>
#include <linux/ip.h>
#include <linux/tcp.h>
#include <linux/sched/signal.h>
#include <linux/kallsyms.h>
#include <linux/kprobes.h>
#include <linux/if_vlan.h>
#include <net/tcp.h>
#include <net/dst.h>
#include "chtls.h"
#include "chtls_cm.h"
/*
* State transitions and actions for close. Note that if we are in SYN_SENT
* we remain in that state as we cannot control a connection while it's in
* SYN_SENT; such connections are allowed to establish and are then aborted.
*/
static unsigned char new_state[16] = {
/* current state: new state: action: */
/* (Invalid) */ TCP_CLOSE,
/* TCP_ESTABLISHED */ TCP_FIN_WAIT1 | TCP_ACTION_FIN,
/* TCP_SYN_SENT */ TCP_SYN_SENT,
/* TCP_SYN_RECV */ TCP_FIN_WAIT1 | TCP_ACTION_FIN,
/* TCP_FIN_WAIT1 */ TCP_FIN_WAIT1,
/* TCP_FIN_WAIT2 */ TCP_FIN_WAIT2,
/* TCP_TIME_WAIT */ TCP_CLOSE,
/* TCP_CLOSE */ TCP_CLOSE,
/* TCP_CLOSE_WAIT */ TCP_LAST_ACK | TCP_ACTION_FIN,
/* TCP_LAST_ACK */ TCP_LAST_ACK,
/* TCP_LISTEN */ TCP_CLOSE,
/* TCP_CLOSING */ TCP_CLOSING,
};
static struct chtls_sock *chtls_sock_create(struct chtls_dev *cdev)
{
struct chtls_sock *csk = kzalloc(sizeof(*csk), GFP_ATOMIC);
if (!csk)
return NULL;
csk->txdata_skb_cache = alloc_skb(TXDATA_SKB_LEN, GFP_ATOMIC);
if (!csk->txdata_skb_cache) {
kfree(csk);
return NULL;
}
kref_init(&csk->kref);
csk->cdev = cdev;
skb_queue_head_init(&csk->txq);
csk->wr_skb_head = NULL;
csk->wr_skb_tail = NULL;
csk->mss = MAX_MSS;
csk->tlshws.ofld = 1;
csk->tlshws.txkey = -1;
csk->tlshws.rxkey = -1;
csk->tlshws.mfs = TLS_MFS;
skb_queue_head_init(&csk->tlshws.sk_recv_queue);
return csk;
}
static void chtls_sock_release(struct kref *ref)
{
struct chtls_sock *csk =
container_of(ref, struct chtls_sock, kref);
kfree(csk);
}
static struct net_device *chtls_ipv4_netdev(struct chtls_dev *cdev,
struct sock *sk)
{
struct net_device *ndev = cdev->ports[0];
if (likely(!inet_sk(sk)->inet_rcv_saddr))
return ndev;
ndev = ip_dev_find(&init_net, inet_sk(sk)->inet_rcv_saddr);
if (!ndev)
return NULL;
if (is_vlan_dev(ndev))
return vlan_dev_real_dev(ndev);
return ndev;
}
static void assign_rxopt(struct sock *sk, unsigned int opt)
{
const struct chtls_dev *cdev;
struct chtls_sock *csk;
struct tcp_sock *tp;
csk = rcu_dereference_sk_user_data(sk);
tp = tcp_sk(sk);
cdev = csk->cdev;
tp->tcp_header_len = sizeof(struct tcphdr);
tp->rx_opt.mss_clamp = cdev->mtus[TCPOPT_MSS_G(opt)] - 40;
tp->mss_cache = tp->rx_opt.mss_clamp;
tp->rx_opt.tstamp_ok = TCPOPT_TSTAMP_G(opt);
tp->rx_opt.snd_wscale = TCPOPT_SACK_G(opt);
tp->rx_opt.wscale_ok = TCPOPT_WSCALE_OK_G(opt);
SND_WSCALE(tp) = TCPOPT_SND_WSCALE_G(opt);
if (!tp->rx_opt.wscale_ok)
tp->rx_opt.rcv_wscale = 0;
if (tp->rx_opt.tstamp_ok) {
tp->tcp_header_len += TCPOLEN_TSTAMP_ALIGNED;
tp->rx_opt.mss_clamp -= TCPOLEN_TSTAMP_ALIGNED;
} else if (csk->opt2 & TSTAMPS_EN_F) {
csk->opt2 &= ~TSTAMPS_EN_F;
csk->mtu_idx = TCPOPT_MSS_G(opt);
}
}
static void chtls_purge_receive_queue(struct sock *sk)
{
struct sk_buff *skb;
while ((skb = __skb_dequeue(&sk->sk_receive_queue)) != NULL) {
skb_dst_set(skb, (void *)NULL);
kfree_skb(skb);
}
}
static void chtls_purge_write_queue(struct sock *sk)
{
struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
struct sk_buff *skb;
while ((skb = __skb_dequeue(&csk->txq))) {
sk->sk_wmem_queued -= skb->truesize;
__kfree_skb(skb);
}
}
static void chtls_purge_recv_queue(struct sock *sk)
{
struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
struct chtls_hws *tlsk = &csk->tlshws;
struct sk_buff *skb;
while ((skb = __skb_dequeue(&tlsk->sk_recv_queue)) != NULL) {
skb_dst_set(skb, NULL);
kfree_skb(skb);
}
}
static void abort_arp_failure(void *handle, struct sk_buff *skb)
{
struct cpl_abort_req *req = cplhdr(skb);
struct chtls_dev *cdev;
cdev = (struct chtls_dev *)handle;
req->cmd = CPL_ABORT_NO_RST;
cxgb4_ofld_send(cdev->lldi->ports[0], skb);
}
static struct sk_buff *alloc_ctrl_skb(struct sk_buff *skb, int len)
{
if (likely(skb && !skb_shared(skb) && !skb_cloned(skb))) {
__skb_trim(skb, 0);
refcount_add(2, &skb->users);
} else {
skb = alloc_skb(len, GFP_KERNEL | __GFP_NOFAIL);
}
return skb;
}
static void chtls_send_abort(struct sock *sk, int mode, struct sk_buff *skb)
{
struct cpl_abort_req *req;
struct chtls_sock *csk;
struct tcp_sock *tp;
csk = rcu_dereference_sk_user_data(sk);
tp = tcp_sk(sk);
if (!skb)
skb = alloc_ctrl_skb(csk->txdata_skb_cache, sizeof(*req));
req = (struct cpl_abort_req *)skb_put(skb, sizeof(*req));
INIT_TP_WR_CPL(req, CPL_ABORT_REQ, csk->tid);
skb_set_queue_mapping(skb, (csk->txq_idx << 1) | CPL_PRIORITY_DATA);
req->rsvd0 = htonl(tp->snd_nxt);
req->rsvd1 = !csk_flag_nochk(csk, CSK_TX_DATA_SENT);
req->cmd = mode;
t4_set_arp_err_handler(skb, csk->cdev, abort_arp_failure);
send_or_defer(sk, tp, skb, mode == CPL_ABORT_SEND_RST);
}
static void chtls_send_reset(struct sock *sk, int mode, struct sk_buff *skb)
{
struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
if (unlikely(csk_flag_nochk(csk, CSK_ABORT_SHUTDOWN) ||
!csk->cdev)) {
if (sk->sk_state == TCP_SYN_RECV)
csk_set_flag(csk, CSK_RST_ABORTED);
goto out;
}
if (!csk_flag_nochk(csk, CSK_TX_DATA_SENT)) {
struct tcp_sock *tp = tcp_sk(sk);
if (send_tx_flowc_wr(sk, 0, tp->snd_nxt, tp->rcv_nxt) < 0)
WARN_ONCE(1, "send tx flowc error");
csk_set_flag(csk, CSK_TX_DATA_SENT);
}
csk_set_flag(csk, CSK_ABORT_RPL_PENDING);
chtls_purge_write_queue(sk);
csk_set_flag(csk, CSK_ABORT_SHUTDOWN);
if (sk->sk_state != TCP_SYN_RECV)
chtls_send_abort(sk, mode, skb);
else
goto out;
return;
out:
if (skb)
kfree_skb(skb);
}
static void release_tcp_port(struct sock *sk)
{
if (inet_csk(sk)->icsk_bind_hash)
inet_put_port(sk);
}
static void tcp_uncork(struct sock *sk)
{
struct tcp_sock *tp = tcp_sk(sk);
if (tp->nonagle & TCP_NAGLE_CORK) {
tp->nonagle &= ~TCP_NAGLE_CORK;
chtls_tcp_push(sk, 0);
}
}
static void chtls_close_conn(struct sock *sk)
{
struct cpl_close_con_req *req;
struct chtls_sock *csk;
struct sk_buff *skb;
unsigned int tid;
unsigned int len;
len = roundup(sizeof(struct cpl_close_con_req), 16);
csk = rcu_dereference_sk_user_data(sk);
tid = csk->tid;
skb = alloc_skb(len, GFP_KERNEL | __GFP_NOFAIL);
req = (struct cpl_close_con_req *)__skb_put(skb, len);
memset(req, 0, len);
req->wr.wr_hi = htonl(FW_WR_OP_V(FW_TP_WR) |
FW_WR_IMMDLEN_V(sizeof(*req) -
sizeof(req->wr)));
req->wr.wr_mid = htonl(FW_WR_LEN16_V(DIV_ROUND_UP(sizeof(*req), 16)) |
FW_WR_FLOWID_V(tid));
OPCODE_TID(req) = htonl(MK_OPCODE_TID(CPL_CLOSE_CON_REQ, tid));
tcp_uncork(sk);
skb_entail(sk, skb, ULPCB_FLAG_NO_HDR | ULPCB_FLAG_NO_APPEND);
if (sk->sk_state != TCP_SYN_SENT)
chtls_push_frames(csk, 1);
}
/*
* Perform a state transition during close and return the actions indicated
* for the transition. Do not make this function inline: the main reason
* it exists at all is to avoid multiple inlining of tcp_set_state.
*/
static int make_close_transition(struct sock *sk)
{
int next = (int)new_state[sk->sk_state];
tcp_set_state(sk, next & TCP_STATE_MASK);
return next & TCP_ACTION_FIN;
}
void chtls_close(struct sock *sk, long timeout)
{
int data_lost, prev_state;
struct chtls_sock *csk;
csk = rcu_dereference_sk_user_data(sk);
lock_sock(sk);
sk->sk_shutdown |= SHUTDOWN_MASK;
data_lost = skb_queue_len(&sk->sk_receive_queue);
data_lost |= skb_queue_len(&csk->tlshws.sk_recv_queue);
chtls_purge_recv_queue(sk);
chtls_purge_receive_queue(sk);
if (sk->sk_state == TCP_CLOSE) {
goto wait;
} else if (data_lost || sk->sk_state == TCP_SYN_SENT) {
chtls_send_reset(sk, CPL_ABORT_SEND_RST, NULL);
release_tcp_port(sk);
goto unlock;
} else if (sock_flag(sk, SOCK_LINGER) && !sk->sk_lingertime) {
sk->sk_prot->disconnect(sk, 0);
} else if (make_close_transition(sk)) {
chtls_close_conn(sk);
}
wait:
if (timeout)
sk_stream_wait_close(sk, timeout);
unlock:
prev_state = sk->sk_state;
sock_hold(sk);
sock_orphan(sk);
release_sock(sk);
local_bh_disable();
bh_lock_sock(sk);
if (prev_state != TCP_CLOSE && sk->sk_state == TCP_CLOSE)
goto out;
if (sk->sk_state == TCP_FIN_WAIT2 && tcp_sk(sk)->linger2 < 0 &&
!csk_flag(sk, CSK_ABORT_SHUTDOWN)) {
struct sk_buff *skb;
skb = alloc_skb(sizeof(struct cpl_abort_req), GFP_ATOMIC);
if (skb)
chtls_send_reset(sk, CPL_ABORT_SEND_RST, skb);
}
if (sk->sk_state == TCP_CLOSE)
inet_csk_destroy_sock(sk);
out:
bh_unlock_sock(sk);
local_bh_enable();
sock_put(sk);
}
/*
* Wait until a socket enters one of the given states.
*/
static int wait_for_states(struct sock *sk, unsigned int states)
{
DECLARE_WAITQUEUE(wait, current);
struct socket_wq _sk_wq;
long current_timeo;
int err = 0;
current_timeo = 200;
/*
* We want this to work even when there's no associated struct socket.
* In that case we provide a temporary wait_queue_head_t.
*/
if (!sk->sk_wq) {
init_waitqueue_head(&_sk_wq.wait);
_sk_wq.fasync_list = NULL;
init_rcu_head_on_stack(&_sk_wq.rcu);
RCU_INIT_POINTER(sk->sk_wq, &_sk_wq);
}
add_wait_queue(sk_sleep(sk), &wait);
while (!sk_in_state(sk, states)) {
if (!current_timeo) {
err = -EBUSY;
break;
}
if (signal_pending(current)) {
err = sock_intr_errno(current_timeo);
break;
}
set_current_state(TASK_UNINTERRUPTIBLE);
release_sock(sk);
if (!sk_in_state(sk, states))
current_timeo = schedule_timeout(current_timeo);
__set_current_state(TASK_RUNNING);
lock_sock(sk);
}
remove_wait_queue(sk_sleep(sk), &wait);
if (rcu_dereference(sk->sk_wq) == &_sk_wq)
sk->sk_wq = NULL;
return err;
}
int chtls_disconnect(struct sock *sk, int flags)
{
struct chtls_sock *csk;
struct tcp_sock *tp;
int err;
tp = tcp_sk(sk);
csk = rcu_dereference_sk_user_data(sk);
chtls_purge_recv_queue(sk);
chtls_purge_receive_queue(sk);
chtls_purge_write_queue(sk);
if (sk->sk_state != TCP_CLOSE) {
sk->sk_err = ECONNRESET;
chtls_send_reset(sk, CPL_ABORT_SEND_RST, NULL);
err = wait_for_states(sk, TCPF_CLOSE);
if (err)
return err;
}
chtls_purge_recv_queue(sk);
chtls_purge_receive_queue(sk);
tp->max_window = 0xFFFF << (tp->rx_opt.snd_wscale);
return tcp_disconnect(sk, flags);
}
#define SHUTDOWN_ELIGIBLE_STATE (TCPF_ESTABLISHED | \
TCPF_SYN_RECV | TCPF_CLOSE_WAIT)
void chtls_shutdown(struct sock *sk, int how)
{
if ((how & SEND_SHUTDOWN) &&
sk_in_state(sk, SHUTDOWN_ELIGIBLE_STATE) &&
make_close_transition(sk))
chtls_close_conn(sk);
}
void chtls_destroy_sock(struct sock *sk)
{
struct chtls_sock *csk;
csk = rcu_dereference_sk_user_data(sk);
chtls_purge_recv_queue(sk);
csk->ulp_mode = ULP_MODE_NONE;
chtls_purge_write_queue(sk);
free_tls_keyid(sk);
kref_put(&csk->kref, chtls_sock_release);
sk->sk_prot = &tcp_prot;
sk->sk_prot->destroy(sk);
}
static void reset_listen_child(struct sock *child)
{
struct chtls_sock *csk = rcu_dereference_sk_user_data(child);
struct sk_buff *skb;
skb = alloc_ctrl_skb(csk->txdata_skb_cache,
sizeof(struct cpl_abort_req));
chtls_send_reset(child, CPL_ABORT_SEND_RST, skb);
sock_orphan(child);
INC_ORPHAN_COUNT(child);
if (child->sk_state == TCP_CLOSE)
inet_csk_destroy_sock(child);
}
static void chtls_disconnect_acceptq(struct sock *listen_sk)
{
struct request_sock **pprev;
pprev = ACCEPT_QUEUE(listen_sk);
while (*pprev) {
struct request_sock *req = *pprev;
if (req->rsk_ops == &chtls_rsk_ops) {
struct sock *child = req->sk;
*pprev = req->dl_next;
sk_acceptq_removed(listen_sk);
reqsk_put(req);
sock_hold(child);
local_bh_disable();
bh_lock_sock(child);
release_tcp_port(child);
reset_listen_child(child);
bh_unlock_sock(child);
local_bh_enable();
sock_put(child);
} else {
pprev = &req->dl_next;
}
}
}
/* Map a listening socket pointer to a bucket index in listen_hash_tab. */
static int listen_hashfn(const struct sock *sk)
{
return ((unsigned long)sk >> 10) & (LISTEN_INFO_HASH_SIZE - 1);
}
static struct listen_info *listen_hash_add(struct chtls_dev *cdev,
struct sock *sk,
unsigned int stid)
{
struct listen_info *p = kmalloc(sizeof(*p), GFP_KERNEL);
if (p) {
int key = listen_hashfn(sk);
p->sk = sk;
p->stid = stid;
spin_lock(&cdev->listen_lock);
p->next = cdev->listen_hash_tab[key];
cdev->listen_hash_tab[key] = p;
spin_unlock(&cdev->listen_lock);
}
return p;
}
/* Return the STID registered for a listening socket, or -1 if there is none. */
static int listen_hash_find(struct chtls_dev *cdev,
struct sock *sk)
{
struct listen_info *p;
int stid = -1;
int key;
key = listen_hashfn(sk);
spin_lock(&cdev->listen_lock);
for (p = cdev->listen_hash_tab[key]; p; p = p->next)
if (p->sk == sk) {
stid = p->stid;
break;
}
spin_unlock(&cdev->listen_lock);
return stid;
}
/* Unhash a listening socket and return its STID, or -1 if it was not hashed. */
static int listen_hash_del(struct chtls_dev *cdev,
struct sock *sk)
{
struct listen_info *p, **prev;
int stid = -1;
int key;
key = listen_hashfn(sk);
prev = &cdev->listen_hash_tab[key];
spin_lock(&cdev->listen_lock);
for (p = *prev; p; prev = &p->next, p = p->next)
if (p->sk == sk) {
stid = p->stid;
*prev = p->next;
kfree(p);
break;
}
spin_unlock(&cdev->listen_lock);
return stid;
}
static void cleanup_syn_rcv_conn(struct sock *child, struct sock *parent)
{
struct request_sock *req;
struct chtls_sock *csk;
csk = rcu_dereference_sk_user_data(child);
req = csk->passive_reap_next;
reqsk_queue_removed(&inet_csk(parent)->icsk_accept_queue, req);
__skb_unlink((struct sk_buff *)&csk->synq, &csk->listen_ctx->synq);
chtls_reqsk_free(req);
csk->passive_reap_next = NULL;
}
static void chtls_reset_synq(struct listen_ctx *listen_ctx)
{
struct sock *listen_sk = listen_ctx->lsk;
while (!skb_queue_empty(&listen_ctx->synq)) {
struct chtls_sock *csk =
container_of((struct synq *)__skb_dequeue
(&listen_ctx->synq), struct chtls_sock, synq);
struct sock *child = csk->sk;
cleanup_syn_rcv_conn(child, listen_sk);
sock_hold(child);
local_bh_disable();
bh_lock_sock(child);
release_tcp_port(child);
reset_listen_child(child);
bh_unlock_sock(child);
local_bh_enable();
sock_put(child);
}
}
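/*
* Start offloaded listening on an IPv4 socket: allocate a server TID (STID),
* record it in the listen hash, and create the hardware server.
*/
int chtls_listen_start(struct chtls_dev *cdev, struct sock *sk)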
{
struct net_device *ndev;
struct listen_ctx *ctx;
struct adapter *adap;
struct port_info *pi;
int stid;
int ret;
if (sk->sk_family != PF_INET)
return -EAGAIN;
rcu_read_lock();
ndev = chtls_ipv4_netdev(cdev, sk);
rcu_read_unlock();
if (!ndev)
return -EBADF;
pi = netdev_priv(ndev);
adap = pi->adapter;
if (!(adap->flags & FULL_INIT_DONE))
return -EBADF;
if (listen_hash_find(cdev, sk) >= 0) /* already have it */
return -EADDRINUSE;
ctx = kmalloc(sizeof(*ctx), GFP_KERNEL);
if (!ctx)
return -ENOMEM;
__module_get(THIS_MODULE);
ctx->lsk = sk;
ctx->cdev = cdev;
ctx->state = T4_LISTEN_START_PENDING;
skb_queue_head_init(&ctx->synq);
stid = cxgb4_alloc_stid(cdev->tids, sk->sk_family, ctx);
if (stid < 0)
goto free_ctx;
sock_hold(sk);
if (!listen_hash_add(cdev, sk, stid))
goto free_stid;
ret = cxgb4_create_server(ndev, stid,
inet_sk(sk)->inet_rcv_saddr,
inet_sk(sk)->inet_sport, 0,
cdev->lldi->rxq_ids[0]);
if (ret > 0)
ret = net_xmit_errno(ret);
if (ret)
goto del_hash;
return 0;
del_hash:
listen_hash_del(cdev, sk);
free_stid:
cxgb4_free_stid(cdev->tids, stid, sk->sk_family);
sock_put(sk);
free_ctx:
kfree(ctx);
module_put(THIS_MODULE);
return -EBADF;
}
void chtls_listen_stop(struct chtls_dev *cdev, struct sock *sk)
{
struct listen_ctx *listen_ctx;
int stid;
stid = listen_hash_del(cdev, sk);
if (stid < 0)
return;
listen_ctx = (struct listen_ctx *)lookup_stid(cdev->tids, stid);
chtls_reset_synq(listen_ctx);
cxgb4_remove_server(cdev->lldi->ports[0], stid,
cdev->lldi->rxq_ids[0], 0);
chtls_disconnect_acceptq(sk);
}
static int chtls_pass_open_rpl(struct chtls_dev *cdev, struct sk_buff *skb)
{
struct cpl_pass_open_rpl *rpl = cplhdr(skb) + RSS_HDR;
unsigned int stid = GET_TID(rpl);
struct listen_ctx *listen_ctx;
listen_ctx = (struct listen_ctx *)lookup_stid(cdev->tids, stid);
if (!listen_ctx)
return CPL_RET_BUF_DONE;
if (listen_ctx->state == T4_LISTEN_START_PENDING) {
listen_ctx->state = T4_LISTEN_STARTED;
return CPL_RET_BUF_DONE;
}
if (rpl->status != CPL_ERR_NONE) {
pr_info("Unexpected PASS_OPEN_RPL status %u for STID %u\n",
rpl->status, stid);
return CPL_RET_BUF_DONE;
}
cxgb4_free_stid(cdev->tids, stid, listen_ctx->lsk->sk_family);
sock_put(listen_ctx->lsk);
kfree(listen_ctx);
module_put(THIS_MODULE);
return 0;
}
static int chtls_close_listsrv_rpl(struct chtls_dev *cdev, struct sk_buff *skb)
{
struct cpl_close_listsvr_rpl *rpl = cplhdr(skb) + RSS_HDR;
struct listen_ctx *listen_ctx;
unsigned int stid;
void *data;
stid = GET_TID(rpl);
data = lookup_stid(cdev->tids, stid);
listen_ctx = (struct listen_ctx *)data;
if (rpl->status != CPL_ERR_NONE) {
pr_info("Unexpected CLOSE_LISTSRV_RPL status %u for STID %u\n",
rpl->status, stid);
return CPL_RET_BUF_DONE;
}
cxgb4_free_stid(cdev->tids, stid, listen_ctx->lsk->sk_family);
sock_put(listen_ctx->lsk);
kfree(listen_ctx);
module_put(THIS_MODULE);
return 0;
}
static void chtls_release_resources(struct sock *sk)
{
struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
struct chtls_dev *cdev = csk->cdev;
unsigned int tid = csk->tid;
struct tid_info *tids;
if (!cdev)
return;
tids = cdev->tids;
kfree_skb(csk->txdata_skb_cache);
csk->txdata_skb_cache = NULL;
if (csk->l2t_entry) {
cxgb4_l2t_release(csk->l2t_entry);
csk->l2t_entry = NULL;
}
cxgb4_remove_tid(tids, csk->port_id, tid, sk->sk_family);
sock_put(sk);
}
static void chtls_conn_done(struct sock *sk)
{
if (sock_flag(sk, SOCK_DEAD))
chtls_purge_receive_queue(sk);
sk_wakeup_sleepers(sk, 0);
tcp_done(sk);
}
static void do_abort_syn_rcv(struct sock *child, struct sock *parent)
{
/*
* If the server is still open we clean up the child connection,
* otherwise the server already did the clean up as it was purging
* its SYN queue and the skb was just sitting in its backlog.
*/
if (likely(parent->sk_state == TCP_LISTEN)) {
cleanup_syn_rcv_conn(child, parent);
/* Without the sock_orphan() call below, the socket leaks
* under a SYN-flood test: tcp_done() does not call
* inet_csk_destroy_sock() unless SOCK_DEAD is set.
* The kernel stack avoids this because it creates the
* new socket only after the 3-way handshake completes.
*/
sock_orphan(child);
percpu_counter_inc((child)->sk_prot->orphan_count);
chtls_release_resources(child);
chtls_conn_done(child);
} else {
if (csk_flag(child, CSK_RST_ABORTED)) {
chtls_release_resources(child);
chtls_conn_done(child);
}
}
}
static void pass_open_abort(struct sock *child, struct sock *parent,
struct sk_buff *skb)
{
do_abort_syn_rcv(child, parent);
kfree_skb(skb);
}
static void bl_pass_open_abort(struct sock *lsk, struct sk_buff *skb)
{
pass_open_abort(skb->sk, lsk, skb);
}
static void chtls_pass_open_arp_failure(struct sock *sk,
struct sk_buff *skb)
{
const struct request_sock *oreq;
struct chtls_sock *csk;
struct chtls_dev *cdev;
struct sock *parent;
void *data;
csk = rcu_dereference_sk_user_data(sk);
cdev = csk->cdev;
/*
* If the connection is being aborted due to the parent listening
* socket going away there's nothing to do, the ABORT_REQ will close
* the connection.
*/
if (csk_flag(sk, CSK_ABORT_RPL_PENDING)) {
kfree_skb(skb);
return;
}
oreq = csk->passive_reap_next;
data = lookup_stid(cdev->tids, oreq->ts_recent);
parent = ((struct listen_ctx *)data)->lsk;
bh_lock_sock(parent);
if (!sock_owned_by_user(parent)) {
pass_open_abort(sk, parent, skb);
} else {
BLOG_SKB_CB(skb)->backlog_rcv = bl_pass_open_abort;
__sk_add_backlog(parent, skb);
}
bh_unlock_sock(parent);
}
static void chtls_accept_rpl_arp_failure(void *handle,
struct sk_buff *skb)
{
struct sock *sk = (struct sock *)handle;
sock_hold(sk);
process_cpl_msg(chtls_pass_open_arp_failure, sk, skb);
sock_put(sk);
}
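/*
* Derive tp->advmss from the route metric, the user MSS, the path MTU, and the
* peer's MSS option, align it to the adapter's MTU table, and return the index
* of the chosen table entry.
*/
static unsigned int chtls_select_mss(const struct chtls_sock *csk,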
unsigned int pmtu,
struct cpl_pass_accept_req *req)
{
struct chtls_dev *cdev;
struct dst_entry *dst;
unsigned int tcpoptsz;
unsigned int iphdrsz;
unsigned int mtu_idx;
struct tcp_sock *tp;
unsigned int mss;
struct sock *sk;
mss = ntohs(req->tcpopt.mss);
sk = csk->sk;
dst = __sk_dst_get(sk);
cdev = csk->cdev;
tp = tcp_sk(sk);
tcpoptsz = 0;
iphdrsz = sizeof(struct iphdr) + sizeof(struct tcphdr);
if (req->tcpopt.tstamp)
tcpoptsz += round_up(TCPOLEN_TIMESTAMP, 4);
tp->advmss = dst_metric_advmss(dst);
if (USER_MSS(tp) && tp->advmss > USER_MSS(tp))
tp->advmss = USER_MSS(tp);
if (tp->advmss > pmtu - iphdrsz)
tp->advmss = pmtu - iphdrsz;
if (mss && tp->advmss > mss)
tp->advmss = mss;
tp->advmss = cxgb4_best_aligned_mtu(cdev->lldi->mtus,
iphdrsz + tcpoptsz,
tp->advmss - tcpoptsz,
8, &mtu_idx);
tp->advmss -= iphdrsz;
inet_csk(sk)->icsk_pmtu_cookie = pmtu;
return mtu_idx;
}
static unsigned int select_rcv_wnd(struct chtls_sock *csk)
{
unsigned int rcvwnd;
unsigned int wnd;
struct sock *sk;
sk = csk->sk;
wnd = tcp_full_space(sk);
if (wnd < MIN_RCV_WND)
wnd = MIN_RCV_WND;
rcvwnd = MAX_RCV_WND;
csk_set_flag(csk, CSK_UPDATE_RCV_WND);
return min(wnd, rcvwnd);
}
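/*
* Pick the smallest window-scale shift (at most 14) for which 65535 << shift
* covers the receive space, after clamping the space to MAX_RCV_WND and to a
* non-zero win_clamp. For example, a 1 MiB space (1048576 bytes, assuming it
* is below both clamps) yields 5: 65535 << 4 = 1048560 falls just short,
* while 65535 << 5 = 2097120 does not. Returns 0 if scaling is not allowed.
*/
static unsigned int select_rcv_wscale(int space, int wscale_ok, int win_clamp)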
{
int wscale = 0;
if (space > MAX_RCV_WND)
space = MAX_RCV_WND;
if (win_clamp && win_clamp < space)
space = win_clamp;
if (wscale_ok) {
while (wscale < 14 && (65535 << wscale) < space)
wscale++;
}
return wscale;
}
static void chtls_pass_accept_rpl(struct sk_buff *skb,
struct cpl_pass_accept_req *req,
unsigned int tid)
{
struct cpl_t5_pass_accept_rpl *rpl5;
struct cxgb4_lld_info *lldi;
const struct tcphdr *tcph;
const struct tcp_sock *tp;
struct chtls_sock *csk;
unsigned int len;
struct sock *sk;
u32 opt2, hlen;
u64 opt0;
sk = skb->sk;
tp = tcp_sk(sk);
csk = sk->sk_user_data;
csk->tid = tid;
lldi = csk->cdev->lldi;
len = roundup(sizeof(*rpl5), 16);
rpl5 = __skb_put_zero(skb, len);
INIT_TP_WR(rpl5, tid);
OPCODE_TID(rpl5) = cpu_to_be32(MK_OPCODE_TID(CPL_PASS_ACCEPT_RPL,
csk->tid));
csk->mtu_idx = chtls_select_mss(csk, dst_mtu(__sk_dst_get(sk)),
req);
opt0 = TCAM_BYPASS_F |
WND_SCALE_V((tp)->rx_opt.rcv_wscale) |
MSS_IDX_V(csk->mtu_idx) |
L2T_IDX_V(csk->l2t_entry->idx) |
NAGLE_V(!(tp->nonagle & TCP_NAGLE_OFF)) |
TX_CHAN_V(csk->tx_chan) |
SMAC_SEL_V(csk->smac_idx) |
DSCP_V(csk->tos >> 2) |
ULP_MODE_V(ULP_MODE_TLS) |
RCV_BUFSIZ_V(min(tp->rcv_wnd >> 10, RCV_BUFSIZ_M));
opt2 = RX_CHANNEL_V(0) |
RSS_QUEUE_VALID_F | RSS_QUEUE_V(csk->rss_qid);
if (!is_t5(lldi->adapter_type))
opt2 |= RX_FC_DISABLE_F;
if (req->tcpopt.tstamp)
opt2 |= TSTAMPS_EN_F;
if (req->tcpopt.sack)
opt2 |= SACK_EN_F;
hlen = ntohl(req->hdr_len);
tcph = (struct tcphdr *)((u8 *)(req + 1) +
T6_ETH_HDR_LEN_G(hlen) + T6_IP_HDR_LEN_G(hlen));
if (tcph->ece && tcph->cwr)
opt2 |= CCTRL_ECN_V(1);
opt2 |= CONG_CNTRL_V(CONG_ALG_NEWRENO);
opt2 |= T5_ISS_F;
opt2 |= T5_OPT_2_VALID_F;
rpl5->opt0 = cpu_to_be64(opt0);
rpl5->opt2 = cpu_to_be32(opt2);
rpl5->iss = cpu_to_be32((prandom_u32() & ~7UL) - 1);
set_wr_txq(skb, CPL_PRIORITY_SETUP, csk->port_id);
t4_set_arp_err_handler(skb, sk, chtls_accept_rpl_arp_failure);
cxgb4_l2t_send(csk->egress_dev, skb, csk->l2t_entry);
}
static void inet_inherit_port(struct inet_hashinfo *hash_info,
struct sock *lsk, struct sock *newsk)
{
local_bh_disable();
__inet_inherit_port(lsk, newsk);
local_bh_enable();
}
static int chtls_backlog_rcv(struct sock *sk, struct sk_buff *skb)
{
if (skb->protocol) {
kfree_skb(skb);
return 0;
}
BLOG_SKB_CB(skb)->backlog_rcv(sk, skb);
return 0;
}
static struct sock *chtls_recv_sock(struct sock *lsk,
struct request_sock *oreq,
void *network_hdr,
const struct cpl_pass_accept_req *req,
struct chtls_dev *cdev)
{
const struct tcphdr *tcph;
struct inet_sock *newinet;
const struct iphdr *iph;
struct net_device *ndev;
struct chtls_sock *csk;
struct dst_entry *dst;
struct neighbour *n;
struct tcp_sock *tp;
struct sock *newsk;
u16 port_id;
int rxq_idx;
int step;
iph = (const struct iphdr *)network_hdr;
newsk = tcp_create_openreq_child(lsk, oreq, cdev->askb);
if (!newsk)
goto free_oreq;
dst = inet_csk_route_child_sock(lsk, newsk, oreq);
if (!dst)
goto free_sk;
tcph = (struct tcphdr *)(iph + 1);
n = dst_neigh_lookup(dst, &iph->saddr);
if (!n)
goto free_sk;
ndev = n->dev;
if (!ndev)
goto free_dst;
port_id = cxgb4_port_idx(ndev);
csk = chtls_sock_create(cdev);
if (!csk)
goto free_dst;
csk->l2t_entry = cxgb4_l2t_get(cdev->lldi->l2t, n, ndev, 0);
if (!csk->l2t_entry)
goto free_csk;
newsk->sk_user_data = csk;
newsk->sk_backlog_rcv = chtls_backlog_rcv;
tp = tcp_sk(newsk);
newinet = inet_sk(newsk);
newinet->inet_daddr = iph->saddr;
newinet->inet_rcv_saddr = iph->daddr;
newinet->inet_saddr = iph->daddr;
oreq->ts_recent = PASS_OPEN_TID_G(ntohl(req->tos_stid));
sk_setup_caps(newsk, dst);
csk->sk = newsk;
csk->passive_reap_next = oreq;
csk->tx_chan = cxgb4_port_chan(ndev);
csk->port_id = port_id;
csk->egress_dev = ndev;
csk->tos = PASS_OPEN_TOS_G(ntohl(req->tos_stid));
csk->ulp_mode = ULP_MODE_TLS;
step = cdev->lldi->nrxq / cdev->lldi->nchan;
csk->rss_qid = cdev->lldi->rxq_ids[port_id * step];
rxq_idx = port_id * step;
csk->txq_idx = (rxq_idx < cdev->lldi->ntxq) ? rxq_idx :
port_id * step;
csk->sndbuf = newsk->sk_sndbuf;
csk->smac_idx = cxgb4_tp_smt_idx(cdev->lldi->adapter_type,
cxgb4_port_viid(ndev));
tp->rcv_wnd = select_rcv_wnd(csk);
RCV_WSCALE(tp) = select_rcv_wscale(tcp_full_space(newsk),
WSCALE_OK(tp),
tp->window_clamp);
neigh_release(n);
inet_inherit_port(&tcp_hashinfo, lsk, newsk);
csk_set_flag(csk, CSK_CONN_INLINE);
bh_unlock_sock(newsk); /* tcp_create_openreq_child ->sk_clone_lock */
return newsk;
free_csk:
chtls_sock_release(&csk->kref);
free_dst:
dst_release(dst);
free_sk:
inet_csk_prepare_forced_close(newsk);
tcp_done(newsk);
free_oreq:
chtls_reqsk_free(oreq);
return NULL;
}
/*
* Populate a TID_RELEASE WR. The skb must already be properly sized.
*/
static void mk_tid_release(struct sk_buff *skb,
unsigned int chan, unsigned int tid)
{
struct cpl_tid_release *req;
unsigned int len;
len = roundup(sizeof(struct cpl_tid_release), 16);
req = (struct cpl_tid_release *)__skb_put(skb, len);
memset(req, 0, len);
set_wr_txq(skb, CPL_PRIORITY_SETUP, chan);
INIT_TP_WR_CPL(req, CPL_TID_RELEASE, tid);
}
static int chtls_get_module(struct sock *sk)
{
struct inet_connection_sock *icsk = inet_csk(sk);
if (!try_module_get(icsk->icsk_ulp_ops->owner))
return -1;
return 0;
}
static void chtls_pass_accept_request(struct sock *sk,
struct sk_buff *skb)
{
struct cpl_t5_pass_accept_rpl *rpl;
struct cpl_pass_accept_req *req;
struct listen_ctx *listen_ctx;
struct request_sock *oreq;
struct sk_buff *reply_skb;
struct chtls_sock *csk;
struct chtls_dev *cdev;
struct tcphdr *tcph;
struct sock *newsk;
struct ethhdr *eh;
struct iphdr *iph;
void *network_hdr;
unsigned int stid;
unsigned int len;
unsigned int tid;
req = cplhdr(skb) + RSS_HDR;
tid = GET_TID(req);
cdev = BLOG_SKB_CB(skb)->cdev;
newsk = lookup_tid(cdev->tids, tid);
stid = PASS_OPEN_TID_G(ntohl(req->tos_stid));
if (newsk) {
pr_info("tid (%u) already in use\n", tid);
return;
}
len = roundup(sizeof(*rpl), 16);
reply_skb = alloc_skb(len, GFP_ATOMIC);
if (!reply_skb) {
cxgb4_remove_tid(cdev->tids, 0, tid, sk->sk_family);
kfree_skb(skb);
return;
}
if (sk->sk_state != TCP_LISTEN)
goto reject;
if (inet_csk_reqsk_queue_is_full(sk))
goto reject;
if (sk_acceptq_is_full(sk))
goto reject;
oreq = inet_reqsk_alloc(&chtls_rsk_ops, sk, true);
if (!oreq)
goto reject;
oreq->rsk_rcv_wnd = 0;
oreq->rsk_window_clamp = 0;
oreq->cookie_ts = 0;
oreq->mss = 0;
oreq->ts_recent = 0;
eh = (struct ethhdr *)(req + 1);
iph = (struct iphdr *)(eh + 1);
if (iph->version != 0x4)
goto free_oreq;
network_hdr = (void *)(eh + 1);
tcph = (struct tcphdr *)(iph + 1);
tcp_rsk(oreq)->tfo_listener = false;
tcp_rsk(oreq)->rcv_isn = ntohl(tcph->seq);
chtls_set_req_port(oreq, tcph->source, tcph->dest);
inet_rsk(oreq)->ecn_ok = 0;
chtls_set_req_addr(oreq, iph->daddr, iph->saddr);
if (req->tcpopt.wsf <= 14) {
inet_rsk(oreq)->wscale_ok = 1;
inet_rsk(oreq)->snd_wscale = req->tcpopt.wsf;
}
inet_rsk(oreq)->ir_iif = sk->sk_bound_dev_if;
newsk = chtls_recv_sock(sk, oreq, network_hdr, req, cdev);
if (!newsk)
goto reject;
if (chtls_get_module(newsk))
goto reject;
inet_csk_reqsk_queue_added(sk);
reply_skb->sk = newsk;
chtls_install_cpl_ops(newsk);
cxgb4_insert_tid(cdev->tids, newsk, tid, newsk->sk_family);
csk = rcu_dereference_sk_user_data(newsk);
listen_ctx = (struct listen_ctx *)lookup_stid(cdev->tids, stid);
csk->listen_ctx = listen_ctx;
__skb_queue_tail(&listen_ctx->synq, (struct sk_buff *)&csk->synq);
chtls_pass_accept_rpl(reply_skb, req, tid);
kfree_skb(skb);
return;
free_oreq:
chtls_reqsk_free(oreq);
reject:
mk_tid_release(reply_skb, 0, tid);
cxgb4_ofld_send(cdev->lldi->ports[0], reply_skb);
kfree_skb(skb);
}
/*
* Handle a CPL_PASS_ACCEPT_REQ message.
*/
static int chtls_pass_accept_req(struct chtls_dev *cdev, struct sk_buff *skb)
{
struct cpl_pass_accept_req *req = cplhdr(skb) + RSS_HDR;
struct listen_ctx *ctx;
unsigned int stid;
unsigned int tid;
struct sock *lsk;
void *data;
stid = PASS_OPEN_TID_G(ntohl(req->tos_stid));
tid = GET_TID(req);
data = lookup_stid(cdev->tids, stid);
if (!data)
return 1;
ctx = (struct listen_ctx *)data;
lsk = ctx->lsk;
if (unlikely(tid >= cdev->tids->ntids)) {
pr_info("passive open TID %u too large\n", tid);
return 1;
}
BLOG_SKB_CB(skb)->cdev = cdev;
process_cpl_msg(chtls_pass_accept_request, lsk, skb);
return 0;
}
/*
* Completes some final bits of initialization for just established connections
* and changes their state to TCP_ESTABLISHED.
*
* snd_isn here is the ISN after the SYN, i.e., the true ISN + 1.
*/
static void make_established(struct sock *sk, u32 snd_isn, unsigned int opt)
{
struct tcp_sock *tp = tcp_sk(sk);
tp->pushed_seq = snd_isn;
tp->write_seq = snd_isn;
tp->snd_nxt = snd_isn;
tp->snd_una = snd_isn;
inet_sk(sk)->inet_id = tp->write_seq ^ jiffies;
assign_rxopt(sk, opt);
if (tp->rcv_wnd > (RCV_BUFSIZ_M << 10))
tp->rcv_wup -= tp->rcv_wnd - (RCV_BUFSIZ_M << 10);
smp_mb();
tcp_set_state(sk, TCP_ESTABLISHED);
}
static void chtls_abort_conn(struct sock *sk, struct sk_buff *skb)
{
struct sk_buff *abort_skb;
abort_skb = alloc_skb(sizeof(struct cpl_abort_req), GFP_ATOMIC);
if (abort_skb)
chtls_send_reset(sk, CPL_ABORT_SEND_RST, abort_skb);
}
static struct sock *reap_list;
static DEFINE_SPINLOCK(reap_list_lock);
/*
* Process the reap list.
*/
DECLARE_TASK_FUNC(process_reap_list, task_param)
{
spin_lock_bh(&reap_list_lock);
while (reap_list) {
struct sock *sk = reap_list;
struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
reap_list = csk->passive_reap_next;
csk->passive_reap_next = NULL;
spin_unlock(&reap_list_lock);
sock_hold(sk);
bh_lock_sock(sk);
chtls_abort_conn(sk, NULL);
sock_orphan(sk);
if (sk->sk_state == TCP_CLOSE)
inet_csk_destroy_sock(sk);
bh_unlock_sock(sk);
sock_put(sk);
spin_lock(&reap_list_lock);
}
spin_unlock_bh(&reap_list_lock);
}
static DECLARE_WORK(reap_task, process_reap_list);
static void add_to_reap_list(struct sock *sk)
{
struct chtls_sock *csk = sk->sk_user_data;
local_bh_disable();
bh_lock_sock(sk);
release_tcp_port(sk); /* release the port immediately */
spin_lock(&reap_list_lock);
csk->passive_reap_next = reap_list;
reap_list = sk;
if (!csk->passive_reap_next)
schedule_work(&reap_task);
spin_unlock(&reap_list_lock);
bh_unlock_sock(sk);
local_bh_enable();
}
static void add_pass_open_to_parent(struct sock *child, struct sock *lsk,
struct chtls_dev *cdev)
{
struct request_sock *oreq;
struct chtls_sock *csk;
if (lsk->sk_state != TCP_LISTEN)
return;
csk = child->sk_user_data;
oreq = csk->passive_reap_next;
csk->passive_reap_next = NULL;
reqsk_queue_removed(&inet_csk(lsk)->icsk_accept_queue, oreq);
__skb_unlink((struct sk_buff *)&csk->synq, &csk->listen_ctx->synq);
if (sk_acceptq_is_full(lsk)) {
chtls_reqsk_free(oreq);
add_to_reap_list(child);
} else {
refcount_set(&oreq->rsk_refcnt, 1);
inet_csk_reqsk_queue_add(lsk, oreq, child);
lsk->sk_data_ready(lsk);
}
}
static void bl_add_pass_open_to_parent(struct sock *lsk, struct sk_buff *skb)
{
struct sock *child = skb->sk;
skb->sk = NULL;
add_pass_open_to_parent(child, lsk, BLOG_SKB_CB(skb)->cdev);
kfree_skb(skb);
}
static int chtls_pass_establish(struct chtls_dev *cdev, struct sk_buff *skb)
{
struct cpl_pass_establish *req = cplhdr(skb) + RSS_HDR;
struct chtls_sock *csk;
struct sock *lsk, *sk;
unsigned int hwtid;
hwtid = GET_TID(req);
sk = lookup_tid(cdev->tids, hwtid);
if (!sk)
return (CPL_RET_UNKNOWN_TID | CPL_RET_BUF_DONE);
bh_lock_sock(sk);
if (unlikely(sock_owned_by_user(sk))) {
kfree_skb(skb);
} else {
unsigned int stid;
void *data;
csk = sk->sk_user_data;
csk->wr_max_credits = 64;
csk->wr_credits = 64;
csk->wr_unacked = 0;
make_established(sk, ntohl(req->snd_isn), ntohs(req->tcp_opt));
stid = PASS_OPEN_TID_G(ntohl(req->tos_stid));
sk->sk_state_change(sk);
if (unlikely(sk->sk_socket))
sk_wake_async(sk, 0, POLL_OUT);
data = lookup_stid(cdev->tids, stid);
lsk = ((struct listen_ctx *)data)->lsk;
bh_lock_sock(lsk);
if (unlikely(skb_queue_empty(&csk->listen_ctx->synq))) {
/* removed from synq */
bh_unlock_sock(lsk);
kfree_skb(skb);
goto unlock;
}
if (likely(!sock_owned_by_user(lsk))) {
kfree_skb(skb);
add_pass_open_to_parent(sk, lsk, cdev);
} else {
skb->sk = sk;
BLOG_SKB_CB(skb)->cdev = cdev;
BLOG_SKB_CB(skb)->backlog_rcv =
bl_add_pass_open_to_parent;
__sk_add_backlog(lsk, skb);
}
bh_unlock_sock(lsk);
}
unlock:
bh_unlock_sock(sk);
return 0;
}
/*
* Handle receipt of an urgent pointer.
*/
static void handle_urg_ptr(struct sock *sk, u32 urg_seq)
{
struct tcp_sock *tp = tcp_sk(sk);
urg_seq--;
if (tp->urg_data && !after(urg_seq, tp->urg_seq))
return; /* duplicate pointer */
sk_send_sigurg(sk);
if (tp->urg_seq == tp->copied_seq && tp->urg_data &&
!sock_flag(sk, SOCK_URGINLINE) &&
tp->copied_seq != tp->rcv_nxt) {
struct sk_buff *skb = skb_peek(&sk->sk_receive_queue);
tp->copied_seq++;
if (skb && tp->copied_seq - ULP_SKB_CB(skb)->seq >= skb->len)
chtls_free_skb(sk, skb);
}
tp->urg_data = TCP_URG_NOTYET;
tp->urg_seq = urg_seq;
}
static void check_sk_callbacks(struct chtls_sock *csk)
{
struct sock *sk = csk->sk;
if (unlikely(sk->sk_user_data &&
!csk_flag_nochk(csk, CSK_CALLBACKS_CHKD)))
csk_set_flag(csk, CSK_CALLBACKS_CHKD);
}
/*
* Handles Rx data that arrives in a state where the socket isn't accepting
* new data.
*/
static void handle_excess_rx(struct sock *sk, struct sk_buff *skb)
{
if (!csk_flag(sk, CSK_ABORT_SHUTDOWN))
chtls_abort_conn(sk, skb);
kfree_skb(skb);
}
static void chtls_recv_data(struct sock *sk, struct sk_buff *skb)
{
struct cpl_rx_data *hdr = cplhdr(skb) + RSS_HDR;
struct chtls_sock *csk;
struct tcp_sock *tp;
csk = rcu_dereference_sk_user_data(sk);
tp = tcp_sk(sk);
if (unlikely(sk->sk_shutdown & RCV_SHUTDOWN)) {
handle_excess_rx(sk, skb);
return;
}
ULP_SKB_CB(skb)->seq = ntohl(hdr->seq);
ULP_SKB_CB(skb)->psh = hdr->psh;
skb_ulp_mode(skb) = ULP_MODE_NONE;
skb_reset_transport_header(skb);
__skb_pull(skb, sizeof(*hdr) + RSS_HDR);
if (!skb->data_len)
__skb_trim(skb, ntohs(hdr->len));
if (unlikely(hdr->urg))
handle_urg_ptr(sk, tp->rcv_nxt + ntohs(hdr->urg));
if (unlikely(tp->urg_data == TCP_URG_NOTYET &&
tp->urg_seq - tp->rcv_nxt < skb->len))
tp->urg_data = TCP_URG_VALID |
skb->data[tp->urg_seq - tp->rcv_nxt];
if (unlikely(hdr->dack_mode != csk->delack_mode)) {
csk->delack_mode = hdr->dack_mode;
csk->delack_seq = tp->rcv_nxt;
}
tcp_hdr(skb)->fin = 0;
tp->rcv_nxt += skb->len;
__skb_queue_tail(&sk->sk_receive_queue, skb);
if (!sock_flag(sk, SOCK_DEAD)) {
check_sk_callbacks(csk);
sk->sk_data_ready(sk);
}
}
static int chtls_rx_data(struct chtls_dev *cdev, struct sk_buff *skb)
{
struct cpl_rx_data *req = cplhdr(skb) + RSS_HDR;
unsigned int hwtid = GET_TID(req);
struct sock *sk;
sk = lookup_tid(cdev->tids, hwtid);
skb_dst_set(skb, NULL);
process_cpl_msg(chtls_recv_data, sk, skb);
return 0;
}
static void chtls_recv_pdu(struct sock *sk, struct sk_buff *skb)
{
struct cpl_tls_data *hdr = cplhdr(skb);
struct chtls_sock *csk;
struct chtls_hws *tlsk;
struct tcp_sock *tp;
csk = rcu_dereference_sk_user_data(sk);
tlsk = &csk->tlshws;
tp = tcp_sk(sk);
if (unlikely(sk->sk_shutdown & RCV_SHUTDOWN)) {
handle_excess_rx(sk, skb);
return;
}
ULP_SKB_CB(skb)->seq = ntohl(hdr->seq);
ULP_SKB_CB(skb)->flags = 0;
skb_ulp_mode(skb) = ULP_MODE_TLS;
skb_reset_transport_header(skb);
__skb_pull(skb, sizeof(*hdr));
if (!skb->data_len)
__skb_trim(skb,
CPL_TLS_DATA_LENGTH_G(ntohl(hdr->length_pkd)));
if (unlikely(tp->urg_data == TCP_URG_NOTYET && tp->urg_seq -
tp->rcv_nxt < skb->len))
tp->urg_data = TCP_URG_VALID |
skb->data[tp->urg_seq - tp->rcv_nxt];
tcp_hdr(skb)->fin = 0;
tlsk->pldlen = CPL_TLS_DATA_LENGTH_G(ntohl(hdr->length_pkd));
__skb_queue_tail(&tlsk->sk_recv_queue, skb);
}
static int chtls_rx_pdu(struct chtls_dev *cdev, struct sk_buff *skb)
{
struct cpl_tls_data *req = cplhdr(skb);
unsigned int hwtid = GET_TID(req);
struct sock *sk;
sk = lookup_tid(cdev->tids, hwtid);
skb_dst_set(skb, NULL);
process_cpl_msg(chtls_recv_pdu, sk, skb);
return 0;
}
static void chtls_set_hdrlen(struct sk_buff *skb, unsigned int nlen)
{
struct tlsrx_cmp_hdr *tls_cmp_hdr = cplhdr(skb);
skb->hdr_len = ntohs((__force __be16)tls_cmp_hdr->length);
tls_cmp_hdr->length = ntohs((__force __be16)nlen);
}
static void chtls_rx_hdr(struct sock *sk, struct sk_buff *skb)
{
struct cpl_rx_tls_cmp *cmp_cpl = cplhdr(skb);
struct sk_buff *skb_rec;
struct chtls_sock *csk;
struct chtls_hws *tlsk;
struct tcp_sock *tp;
csk = rcu_dereference_sk_user_data(sk);
tlsk = &csk->tlshws;
tp = tcp_sk(sk);
ULP_SKB_CB(skb)->seq = ntohl(cmp_cpl->seq);
ULP_SKB_CB(skb)->flags = 0;
skb_reset_transport_header(skb);
__skb_pull(skb, sizeof(*cmp_cpl));
if (!skb->data_len)
__skb_trim(skb, CPL_RX_TLS_CMP_LENGTH_G
(ntohl(cmp_cpl->pdulength_length)));
tp->rcv_nxt +=
CPL_RX_TLS_CMP_PDULENGTH_G(ntohl(cmp_cpl->pdulength_length));
skb_rec = __skb_dequeue(&tlsk->sk_recv_queue);
if (!skb_rec) {
ULP_SKB_CB(skb)->flags |= ULPCB_FLAG_TLS_ND;
__skb_queue_tail(&sk->sk_receive_queue, skb);
} else {
chtls_set_hdrlen(skb, tlsk->pldlen);
tlsk->pldlen = 0;
__skb_queue_tail(&sk->sk_receive_queue, skb);
__skb_queue_tail(&sk->sk_receive_queue, skb_rec);
}
if (!sock_flag(sk, SOCK_DEAD)) {
check_sk_callbacks(csk);
sk->sk_data_ready(sk);
}
}
static int chtls_rx_cmp(struct chtls_dev *cdev, struct sk_buff *skb)
{
struct cpl_rx_tls_cmp *req = cplhdr(skb);
unsigned int hwtid = GET_TID(req);
struct sock *sk;
sk = lookup_tid(cdev->tids, hwtid);
skb_dst_set(skb, NULL);
process_cpl_msg(chtls_rx_hdr, sk, skb);
return 0;
}
static void chtls_timewait(struct sock *sk)
{
struct tcp_sock *tp = tcp_sk(sk);
tp->rcv_nxt++;
tp->rx_opt.ts_recent_stamp = get_seconds();
tp->srtt_us = 0;
tcp_time_wait(sk, TCP_TIME_WAIT, 0);
}
static void chtls_peer_close(struct sock *sk, struct sk_buff *skb)
{
struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
sk->sk_shutdown |= RCV_SHUTDOWN;
sock_set_flag(sk, SOCK_DONE);
switch (sk->sk_state) {
case TCP_SYN_RECV:
case TCP_ESTABLISHED:
tcp_set_state(sk, TCP_CLOSE_WAIT);
break;
case TCP_FIN_WAIT1:
tcp_set_state(sk, TCP_CLOSING);
break;
case TCP_FIN_WAIT2:
chtls_release_resources(sk);
if (csk_flag_nochk(csk, CSK_ABORT_RPL_PENDING))
chtls_conn_done(sk);
else
chtls_timewait(sk);
break;
default:
pr_info("cpl_peer_close in bad state %d\n", sk->sk_state);
}
if (!sock_flag(sk, SOCK_DEAD)) {
sk->sk_state_change(sk);
/* Do not send POLL_HUP for a half-duplex close. */
if ((sk->sk_shutdown & SEND_SHUTDOWN) ||
sk->sk_state == TCP_CLOSE)
sk_wake_async(sk, SOCK_WAKE_WAITD, POLL_HUP);
else
sk_wake_async(sk, SOCK_WAKE_WAITD, POLL_IN);
}
}
static void chtls_close_con_rpl(struct sock *sk, struct sk_buff *skb)
{
struct cpl_close_con_rpl *rpl = cplhdr(skb) + RSS_HDR;
struct chtls_sock *csk;
struct tcp_sock *tp;
csk = rcu_dereference_sk_user_data(sk);
tp = tcp_sk(sk);
tp->snd_una = ntohl(rpl->snd_nxt) - 1; /* exclude FIN */
switch (sk->sk_state) {
case TCP_CLOSING:
chtls_release_resources(sk);
if (csk_flag_nochk(csk, CSK_ABORT_RPL_PENDING))
chtls_conn_done(sk);
else
chtls_timewait(sk);
break;
case TCP_LAST_ACK:
chtls_release_resources(sk);
chtls_conn_done(sk);
break;
case TCP_FIN_WAIT1:
tcp_set_state(sk, TCP_FIN_WAIT2);
sk->sk_shutdown |= SEND_SHUTDOWN;
if (!sock_flag(sk, SOCK_DEAD))
sk->sk_state_change(sk);
else if (tcp_sk(sk)->linger2 < 0 &&
!csk_flag_nochk(csk, CSK_ABORT_SHUTDOWN))
chtls_abort_conn(sk, skb);
break;
default:
pr_info("close_con_rpl in bad state %d\n", sk->sk_state);
}
kfree_skb(skb);
}
static struct sk_buff *get_cpl_skb(struct sk_buff *skb,
size_t len, gfp_t gfp)
{
if (likely(!skb_is_nonlinear(skb) && !skb_cloned(skb))) {
WARN_ONCE(skb->len < len, "skb alloc error");
__skb_trim(skb, len);
skb_get(skb);
} else {
skb = alloc_skb(len, gfp);
if (skb)
__skb_put(skb, len);
}
return skb;
}
static void set_abort_rpl_wr(struct sk_buff *skb, unsigned int tid,
int cmd)
{
struct cpl_abort_rpl *rpl = cplhdr(skb);
INIT_TP_WR_CPL(rpl, CPL_ABORT_RPL, tid);
rpl->cmd = cmd;
}
static void send_defer_abort_rpl(struct chtls_dev *cdev, struct sk_buff *skb)
{
struct cpl_abort_req_rss *req = cplhdr(skb);
struct sk_buff *reply_skb;
reply_skb = alloc_skb(sizeof(struct cpl_abort_rpl),
GFP_KERNEL | __GFP_NOFAIL);
__skb_put(reply_skb, sizeof(struct cpl_abort_rpl));
set_abort_rpl_wr(reply_skb, GET_TID(req),
(req->status & CPL_ABORT_NO_RST));
set_wr_txq(reply_skb, CPL_PRIORITY_DATA, req->status >> 1);
cxgb4_ofld_send(cdev->lldi->ports[0], reply_skb);
kfree_skb(skb);
}
static void send_abort_rpl(struct sock *sk, struct sk_buff *skb,
struct chtls_dev *cdev, int status, int queue)
{
struct cpl_abort_req_rss *req = cplhdr(skb);
struct sk_buff *reply_skb;
struct chtls_sock *csk;
csk = rcu_dereference_sk_user_data(sk);
reply_skb = alloc_skb(sizeof(struct cpl_abort_rpl),
GFP_KERNEL);
if (!reply_skb) {
req->status = (queue << 1);
send_defer_abort_rpl(cdev, skb);
return;
}
set_abort_rpl_wr(reply_skb, GET_TID(req), status);
kfree_skb(skb);
set_wr_txq(reply_skb, CPL_PRIORITY_DATA, queue);
if (csk_conn_inline(csk)) {
struct l2t_entry *e = csk->l2t_entry;
if (e && sk->sk_state != TCP_SYN_RECV) {
cxgb4_l2t_send(csk->egress_dev, reply_skb, e);
return;
}
}
cxgb4_ofld_send(cdev->lldi->ports[0], reply_skb);
}
/*
* Add an skb to the deferred skb queue for processing from process context.
*/
static void t4_defer_reply(struct sk_buff *skb, struct chtls_dev *cdev,
defer_handler_t handler)
{
DEFERRED_SKB_CB(skb)->handler = handler;
spin_lock_bh(&cdev->deferq.lock);
__skb_queue_tail(&cdev->deferq, skb);
if (skb_queue_len(&cdev->deferq) == 1)
schedule_work(&cdev->deferq_task);
spin_unlock_bh(&cdev->deferq.lock);
}
static void chtls_send_abort_rpl(struct sock *sk, struct sk_buff *skb,
struct chtls_dev *cdev,
int status, int queue)
{
struct cpl_abort_req_rss *req = cplhdr(skb) + RSS_HDR;
struct sk_buff *reply_skb;
struct chtls_sock *csk;
unsigned int tid;
csk = rcu_dereference_sk_user_data(sk);
tid = GET_TID(req);
reply_skb = get_cpl_skb(skb, sizeof(struct cpl_abort_rpl), gfp_any());
if (!reply_skb) {
req->status = (queue << 1) | status;
t4_defer_reply(skb, cdev, send_defer_abort_rpl);
return;
}
set_abort_rpl_wr(reply_skb, tid, status);
set_wr_txq(reply_skb, CPL_PRIORITY_DATA, queue);
if (csk_conn_inline(csk)) {
struct l2t_entry *e = csk->l2t_entry;
if (e && sk->sk_state != TCP_SYN_RECV) {
cxgb4_l2t_send(csk->egress_dev, reply_skb, e);
return;
}
}
cxgb4_ofld_send(cdev->lldi->ports[0], reply_skb);
kfree_skb(skb);
}
/*
* This is run from a listener's backlog to abort a child connection in
* SYN_RECV state (i.e., one on the listener's SYN queue).
*/
static void bl_abort_syn_rcv(struct sock *lsk, struct sk_buff *skb)
{
struct chtls_sock *csk;
struct sock *child;
int queue;
child = skb->sk;
csk = rcu_dereference_sk_user_data(child);
queue = csk->txq_idx;
skb->sk = NULL;
do_abort_syn_rcv(child, lsk);
send_abort_rpl(child, skb, BLOG_SKB_CB(skb)->cdev,
CPL_ABORT_NO_RST, queue);
}
static int abort_syn_rcv(struct sock *sk, struct sk_buff *skb)
{
const struct request_sock *oreq;
struct listen_ctx *listen_ctx;
struct chtls_sock *csk;
struct chtls_dev *cdev;
struct sock *psk;
void *ctx;
csk = sk->sk_user_data;
oreq = csk->passive_reap_next;
cdev = csk->cdev;
if (!oreq)
return -1;
ctx = lookup_stid(cdev->tids, oreq->ts_recent);
if (!ctx)
return -1;
listen_ctx = (struct listen_ctx *)ctx;
psk = listen_ctx->lsk;
bh_lock_sock(psk);
if (!sock_owned_by_user(psk)) {
int queue = csk->txq_idx;
do_abort_syn_rcv(sk, psk);
send_abort_rpl(sk, skb, cdev, CPL_ABORT_NO_RST, queue);
} else {
skb->sk = sk;
BLOG_SKB_CB(skb)->backlog_rcv = bl_abort_syn_rcv;
__sk_add_backlog(psk, skb);
}
bh_unlock_sock(psk);
return 0;
}
static void chtls_abort_req_rss(struct sock *sk, struct sk_buff *skb)
{
const struct cpl_abort_req_rss *req = cplhdr(skb) + RSS_HDR;
struct chtls_sock *csk = sk->sk_user_data;
int rst_status = CPL_ABORT_NO_RST;
int queue = csk->txq_idx;
if (is_neg_adv(req->status)) {
if (sk->sk_state == TCP_SYN_RECV)
chtls_set_tcb_tflag(sk, 0, 0);
kfree_skb(skb);
return;
}
csk_reset_flag(csk, CSK_ABORT_REQ_RCVD);
if (!csk_flag_nochk(csk, CSK_ABORT_SHUTDOWN) &&
!csk_flag_nochk(csk, CSK_TX_DATA_SENT)) {
struct tcp_sock *tp = tcp_sk(sk);
if (send_tx_flowc_wr(sk, 0, tp->snd_nxt, tp->rcv_nxt) < 0)
WARN_ONCE(1, "send_tx_flowc error");
csk_set_flag(csk, CSK_TX_DATA_SENT);
}
csk_set_flag(csk, CSK_ABORT_SHUTDOWN);
if (!csk_flag_nochk(csk, CSK_ABORT_RPL_PENDING)) {
sk->sk_err = ETIMEDOUT;
if (!sock_flag(sk, SOCK_DEAD))
sk->sk_error_report(sk);
if (sk->sk_state == TCP_SYN_RECV && !abort_syn_rcv(sk, skb))
return;
chtls_release_resources(sk);
chtls_conn_done(sk);
}
chtls_send_abort_rpl(sk, skb, csk->cdev, rst_status, queue);
}
static void chtls_abort_rpl_rss(struct sock *sk, struct sk_buff *skb)
{
struct cpl_abort_rpl_rss *rpl = cplhdr(skb) + RSS_HDR;
struct chtls_sock *csk;
struct chtls_dev *cdev;
csk = rcu_dereference_sk_user_data(sk);
cdev = csk->cdev;
if (csk_flag_nochk(csk, CSK_ABORT_RPL_PENDING)) {
csk_reset_flag(csk, CSK_ABORT_RPL_PENDING);
if (!csk_flag_nochk(csk, CSK_ABORT_REQ_RCVD)) {
if (sk->sk_state == TCP_SYN_SENT) {
cxgb4_remove_tid(cdev->tids,
csk->port_id,
GET_TID(rpl),
sk->sk_family);
sock_put(sk);
}
chtls_release_resources(sk);
chtls_conn_done(sk);
}
}
kfree_skb(skb);
}
static int chtls_conn_cpl(struct chtls_dev *cdev, struct sk_buff *skb)
{
struct cpl_peer_close *req = cplhdr(skb) + RSS_HDR;
void (*fn)(struct sock *sk, struct sk_buff *skb);
unsigned int hwtid = GET_TID(req);
struct sock *sk;
u8 opcode;
opcode = ((const struct rss_header *)cplhdr(skb))->opcode;
sk = lookup_tid(cdev->tids, hwtid);
if (!sk)
goto rel_skb;
switch (opcode) {
case CPL_PEER_CLOSE:
fn = chtls_peer_close;
break;
case CPL_CLOSE_CON_RPL:
fn = chtls_close_con_rpl;
break;
case CPL_ABORT_REQ_RSS:
fn = chtls_abort_req_rss;
break;
case CPL_ABORT_RPL_RSS:
fn = chtls_abort_rpl_rss;
break;
default:
goto rel_skb;
}
process_cpl_msg(fn, sk, skb);
return 0;
rel_skb:
kfree_skb(skb);
return 0;
}
static struct sk_buff *dequeue_wr(struct sock *sk)
{
struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
struct sk_buff *skb = csk->wr_skb_head;
if (likely(skb)) {
/* Don't bother clearing the tail */
csk->wr_skb_head = WR_SKB_CB(skb)->next_wr;
WR_SKB_CB(skb)->next_wr = NULL;
}
return skb;
}
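/*
 * Process a CPL_FW4_ACK: return WR credits, free the WRs that are fully
 * acknowledged, and advance snd_una when the message carries a valid
 * sequence number.
 */
static void chtls_rx_ack(struct sock *sk, struct sk_buff *skb)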
{
struct cpl_fw4_ack *hdr = cplhdr(skb) + RSS_HDR;
struct chtls_sock *csk = sk->sk_user_data;
struct tcp_sock *tp = tcp_sk(sk);
u32 credits = hdr->credits;
u32 snd_una;
snd_una = ntohl(hdr->snd_una);
csk->wr_credits += credits;
if (csk->wr_unacked > csk->wr_max_credits - csk->wr_credits)
csk->wr_unacked = csk->wr_max_credits - csk->wr_credits;
while (credits) {
struct sk_buff *pskb = csk->wr_skb_head;
u32 csum;
if (unlikely(!pskb)) {
if (csk->wr_nondata)
csk->wr_nondata -= credits;
break;
}
csum = (__force u32)pskb->csum;
if (unlikely(credits < csum)) {
pskb->csum = (__force __wsum)(csum - credits);
break;
}
dequeue_wr(sk);
credits -= csum;
kfree_skb(pskb);
}
if (hdr->seq_vld & CPL_FW4_ACK_FLAGS_SEQVAL) {
if (unlikely(before(snd_una, tp->snd_una))) {
kfree_skb(skb);
return;
}
if (tp->snd_una != snd_una) {
tp->snd_una = snd_una;
tp->rcv_tstamp = tcp_time_stamp(tp);
if (tp->snd_una == tp->snd_nxt &&
!csk_flag_nochk(csk, CSK_TX_FAILOVER))
csk_reset_flag(csk, CSK_TX_WAIT_IDLE);
}
}
if (hdr->seq_vld & CPL_FW4_ACK_FLAGS_CH) {
unsigned int fclen16 = roundup(failover_flowc_wr_len, 16);
csk->wr_credits -= fclen16;
csk_reset_flag(csk, CSK_TX_WAIT_IDLE);
csk_reset_flag(csk, CSK_TX_FAILOVER);
}
if (skb_queue_len(&csk->txq) && chtls_push_frames(csk, 0))
sk->sk_write_space(sk);
kfree_skb(skb);
}
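The credit bookkeeping in chtls_rx_ack() can be modelled in user space. The following is a minimal sketch, not driver code: struct demo_wr, struct demo_conn and demo_ack_credits() are hypothetical names, and each pending work request simply stores the credits it still owes (the driver keeps that count in skb->csum).

```c
#include <stddef.h>

/* Hypothetical stand-in for a queued work request (an skb in the driver). */
struct demo_wr {
	unsigned int credits;		/* credits still owed for this WR */
	struct demo_wr *next;
};

/* Hypothetical stand-in for the per-connection list of pending WRs. */
struct demo_conn {
	struct demo_wr *head;
	unsigned int completed;		/* number of fully acknowledged WRs */
};

/*
 * Consume 'credits' returned by the hardware, oldest WR first.
 * A WR is retired only once all of its credits have come back; a partial
 * return just lowers the amount it still owes.
 * Returns the credits left over once the list is empty.
 */
static unsigned int demo_ack_credits(struct demo_conn *c, unsigned int credits)
{
	while (credits) {
		struct demo_wr *w = c->head;

		if (!w)
			break;
		if (credits < w->credits) {
			w->credits -= credits;
			return 0;
		}
		credits -= w->credits;
		c->head = w->next;
		w->next = NULL;
		c->completed++;
	}
	return credits;
}
```

With two pending requests owing 2 and 3 credits, an ack for 4 credits retires the first and leaves the second owing 1.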
static int chtls_wr_ack(struct chtls_dev *cdev, struct sk_buff *skb)
{
struct cpl_fw4_ack *rpl = cplhdr(skb) + RSS_HDR;
unsigned int hwtid = GET_TID(rpl);
struct sock *sk;
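sk = lookup_tid(cdev->tids, hwtid);
/* The connection may already be gone; drop the stale ack. */
if (unlikely(!sk)) {
kfree_skb(skb);
return 0;
}
process_cpl_msg(chtls_rx_ack, sk, skb);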
return 0;
}
chtls_handler_func chtls_handlers[NUM_CPL_CMDS] = {
[CPL_PASS_OPEN_RPL] = chtls_pass_open_rpl,
[CPL_CLOSE_LISTSRV_RPL] = chtls_close_listsrv_rpl,
[CPL_PASS_ACCEPT_REQ] = chtls_pass_accept_req,
[CPL_PASS_ESTABLISH] = chtls_pass_establish,
[CPL_RX_DATA] = chtls_rx_data,
[CPL_TLS_DATA] = chtls_rx_pdu,
[CPL_RX_TLS_CMP] = chtls_rx_cmp,
[CPL_PEER_CLOSE] = chtls_conn_cpl,
[CPL_CLOSE_CON_RPL] = chtls_conn_cpl,
[CPL_ABORT_REQ_RSS] = chtls_conn_cpl,
[CPL_ABORT_RPL_RSS] = chtls_conn_cpl,
[CPL_FW4_ACK] = chtls_wr_ack,
};
/*
* Copyright (c) 2018 Chelsio Communications, Inc.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#ifndef __CHTLS_CM_H__
#define __CHTLS_CM_H__
/*
* TCB settings
*/
/* 3:0 */
#define TCB_ULP_TYPE_W 0
#define TCB_ULP_TYPE_S 0
#define TCB_ULP_TYPE_M 0xfULL
#define TCB_ULP_TYPE_V(x) ((x) << TCB_ULP_TYPE_S)
/* 11:4 */
#define TCB_ULP_RAW_W 0
#define TCB_ULP_RAW_S 4
#define TCB_ULP_RAW_M 0xffULL
#define TCB_ULP_RAW_V(x) ((x) << TCB_ULP_RAW_S)
#define TF_TLS_KEY_SIZE_S 7
#define TF_TLS_KEY_SIZE_V(x) ((x) << TF_TLS_KEY_SIZE_S)
#define TF_TLS_CONTROL_S 2
#define TF_TLS_CONTROL_V(x) ((x) << TF_TLS_CONTROL_S)
#define TF_TLS_ACTIVE_S 1
#define TF_TLS_ACTIVE_V(x) ((x) << TF_TLS_ACTIVE_S)
#define TF_TLS_ENABLE_S 0
#define TF_TLS_ENABLE_V(x) ((x) << TF_TLS_ENABLE_S)
#define TF_RX_QUIESCE_S 15
#define TF_RX_QUIESCE_V(x) ((x) << TF_RX_QUIESCE_S)
/*
* Max receive window supported by HW in bytes. Only a small part of it can
* be set through option0, the rest needs to be set through RX_DATA_ACK.
*/
#define MAX_RCV_WND ((1U << 27) - 1)
#define MAX_MSS 65536
/*
* Min receive window. We want it to be large enough to accommodate receive
* coalescing, handle jumbo frames, and not trigger sender SWS avoidance.
*/
#define MIN_RCV_WND (24 * 1024U)
#define LOOPBACK(x) (((x) & htonl(0xff000000)) == htonl(0x7f000000))
/* ulp_mem_io + ulptx_idata + payload + padding */
#define MAX_IMM_ULPTX_WR_LEN (32 + 8 + 256 + 8)
/* for TX: an skb must have a headroom of at least TX_HEADER_LEN bytes */
#define TX_HEADER_LEN \
(sizeof(struct fw_ofld_tx_data_wr) + sizeof(struct sge_opaque_hdr))
#define TX_TLSHDR_LEN \
(sizeof(struct fw_tlstx_data_wr) + sizeof(struct cpl_tx_tls_sfo) + \
sizeof(struct sge_opaque_hdr))
#define TXDATA_SKB_LEN 128
enum {
CPL_TX_TLS_SFO_TYPE_CCS,
CPL_TX_TLS_SFO_TYPE_ALERT,
CPL_TX_TLS_SFO_TYPE_HANDSHAKE,
CPL_TX_TLS_SFO_TYPE_DATA,
CPL_TX_TLS_SFO_TYPE_HEARTBEAT,
};
enum {
TLS_HDR_TYPE_CCS = 20,
TLS_HDR_TYPE_ALERT,
TLS_HDR_TYPE_HANDSHAKE,
TLS_HDR_TYPE_RECORD,
TLS_HDR_TYPE_HEARTBEAT,
};
typedef void (*defer_handler_t)(struct chtls_dev *dev, struct sk_buff *skb);
extern struct request_sock_ops chtls_rsk_ops;
struct deferred_skb_cb {
defer_handler_t handler;
struct chtls_dev *dev;
};
#define DEFERRED_SKB_CB(skb) ((struct deferred_skb_cb *)(skb)->cb)
#define failover_flowc_wr_len offsetof(struct fw_flowc_wr, mnemval[3])
#define WR_SKB_CB(skb) ((struct wr_skb_cb *)(skb)->cb)
#define ACCEPT_QUEUE(sk) (&inet_csk(sk)->icsk_accept_queue.rskq_accept_head)
#define SND_WSCALE(tp) ((tp)->rx_opt.snd_wscale)
#define RCV_WSCALE(tp) ((tp)->rx_opt.rcv_wscale)
#define USER_MSS(tp) ((tp)->rx_opt.user_mss)
#define TS_RECENT_STAMP(tp) ((tp)->rx_opt.ts_recent_stamp)
#define WSCALE_OK(tp) ((tp)->rx_opt.wscale_ok)
#define TSTAMP_OK(tp) ((tp)->rx_opt.tstamp_ok)
#define SACK_OK(tp) ((tp)->rx_opt.sack_ok)
#define INC_ORPHAN_COUNT(sk) percpu_counter_inc((sk)->sk_prot->orphan_count)
/* TLS SKB */
#define skb_ulp_tls_inline(skb) (ULP_SKB_CB(skb)->ulp.tls.ofld)
#define skb_ulp_tls_iv_imm(skb) (ULP_SKB_CB(skb)->ulp.tls.iv)
void chtls_defer_reply(struct sk_buff *skb, struct chtls_dev *dev,
defer_handler_t handler);
/*
* Returns true if the socket is in one of the supplied states.
*/
static inline unsigned int sk_in_state(const struct sock *sk,
unsigned int states)
{
return states & (1 << sk->sk_state);
}
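The test in sk_in_state() treats its second argument as a bitmask with one bit per socket state. A minimal user-space sketch of that idea follows; the DEMO_* names are hypothetical and the state numbers are illustrative only.

```c
/* Illustrative state numbers; the kernel defines its own TCP_* values. */
enum demo_state {
	DEMO_ESTABLISHED = 1,
	DEMO_SYN_SENT    = 2,
	DEMO_SYN_RECV    = 3,
	DEMO_CLOSE       = 7,
};

/* One bit per state, so several states can be OR-ed into one mask. */
#define DEMO_STATE_BIT(s) (1U << (s))

/* Nonzero if 'state' is one of the states encoded in 'mask'. */
static unsigned int demo_in_state(unsigned int state, unsigned int mask)
{
	return mask & DEMO_STATE_BIT(state);
}
```

A caller builds the mask once, for example DEMO_STATE_BIT(DEMO_SYN_SENT) | DEMO_STATE_BIT(DEMO_SYN_RECV), and tests any state against it with a single AND.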
static inline void chtls_rsk_destructor(struct request_sock *req)
{
/* do nothing */
}
static inline void chtls_init_rsk_ops(struct proto *chtls_tcp_prot,
struct request_sock_ops *chtls_tcp_ops,
struct proto *tcp_prot, int family)
{
memset(chtls_tcp_ops, 0, sizeof(*chtls_tcp_ops));
chtls_tcp_ops->family = family;
chtls_tcp_ops->obj_size = sizeof(struct tcp_request_sock);
chtls_tcp_ops->destructor = chtls_rsk_destructor;
chtls_tcp_ops->slab = tcp_prot->rsk_prot->slab;
chtls_tcp_prot->rsk_prot = chtls_tcp_ops;
}
static inline void chtls_reqsk_free(struct request_sock *req)
{
if (req->rsk_listener)
sock_put(req->rsk_listener);
kmem_cache_free(req->rsk_ops->slab, req);
}
#define DECLARE_TASK_FUNC(task, task_param) \
static void task(struct work_struct *task_param)
static inline void sk_wakeup_sleepers(struct sock *sk, bool interruptable)
{
struct socket_wq *wq;
rcu_read_lock();
wq = rcu_dereference(sk->sk_wq);
if (skwq_has_sleeper(wq)) {
if (interruptable)
wake_up_interruptible(sk_sleep(sk));
else
wake_up_all(sk_sleep(sk));
}
rcu_read_unlock();
}
static inline void chtls_set_req_port(struct request_sock *oreq,
__be16 source, __be16 dest)
{
inet_rsk(oreq)->ir_rmt_port = source;
inet_rsk(oreq)->ir_num = ntohs(dest);
}
static inline void chtls_set_req_addr(struct request_sock *oreq,
__be32 local_ip, __be32 peer_ip)
{
inet_rsk(oreq)->ir_loc_addr = local_ip;
inet_rsk(oreq)->ir_rmt_addr = peer_ip;
}
static inline void chtls_free_skb(struct sock *sk, struct sk_buff *skb)
{
skb_dst_set(skb, NULL);
__skb_unlink(skb, &sk->sk_receive_queue);
__kfree_skb(skb);
}
static inline void chtls_kfree_skb(struct sock *sk, struct sk_buff *skb)
{
skb_dst_set(skb, NULL);
__skb_unlink(skb, &sk->sk_receive_queue);
kfree_skb(skb);
}
static inline void enqueue_wr(struct chtls_sock *csk, struct sk_buff *skb)
{
WR_SKB_CB(skb)->next_wr = NULL;
skb_get(skb);
if (!csk->wr_skb_head)
csk->wr_skb_head = skb;
else
WR_SKB_CB(csk->wr_skb_tail)->next_wr = skb;
csk->wr_skb_tail = skb;
}
#endif
/*
* Copyright (c) 2018 Chelsio Communications, Inc.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*
* Written by: Atul Gupta (atul.gupta@chelsio.com)
*/
#include <linux/module.h>
#include <linux/list.h>
#include <linux/workqueue.h>
#include <linux/skbuff.h>
#include <linux/timer.h>
#include <linux/notifier.h>
#include <linux/inetdevice.h>
#include <linux/ip.h>
#include <linux/tcp.h>
#include <linux/tls.h>
#include <net/tls.h>
#include "chtls.h"
#include "chtls_cm.h"
static void __set_tcb_field_direct(struct chtls_sock *csk,
struct cpl_set_tcb_field *req, u16 word,
u64 mask, u64 val, u8 cookie, int no_reply)
{
struct ulptx_idata *sc;
INIT_TP_WR_CPL(req, CPL_SET_TCB_FIELD, csk->tid);
req->wr.wr_mid |= htonl(FW_WR_FLOWID_V(csk->tid));
req->reply_ctrl = htons(NO_REPLY_V(no_reply) |
QUEUENO_V(csk->rss_qid));
req->word_cookie = htons(TCB_WORD_V(word) | TCB_COOKIE_V(cookie));
req->mask = cpu_to_be64(mask);
req->val = cpu_to_be64(val);
sc = (struct ulptx_idata *)(req + 1);
sc->cmd_more = htonl(ULPTX_CMD_V(ULP_TX_SC_NOOP));
sc->len = htonl(0);
}
static void __set_tcb_field(struct sock *sk, struct sk_buff *skb, u16 word,
u64 mask, u64 val, u8 cookie, int no_reply)
{
struct cpl_set_tcb_field *req;
struct chtls_sock *csk;
struct ulptx_idata *sc;
unsigned int wrlen;
wrlen = roundup(sizeof(*req) + sizeof(*sc), 16);
csk = rcu_dereference_sk_user_data(sk);
req = (struct cpl_set_tcb_field *)__skb_put(skb, wrlen);
__set_tcb_field_direct(csk, req, word, mask, val, cookie, no_reply);
set_wr_txq(skb, CPL_PRIORITY_CONTROL, csk->port_id);
}
/*
* Send a control message to the HW. The message goes as immediate data and
* the packet is freed immediately.
*/
static int chtls_set_tcb_field(struct sock *sk, u16 word, u64 mask, u64 val)
{
struct cpl_set_tcb_field *req;
unsigned int credits_needed;
struct chtls_sock *csk;
struct ulptx_idata *sc;
struct sk_buff *skb;
unsigned int wrlen;
int ret;
wrlen = roundup(sizeof(*req) + sizeof(*sc), 16);
skb = alloc_skb(wrlen, GFP_ATOMIC);
if (!skb)
return -ENOMEM;
credits_needed = DIV_ROUND_UP(wrlen, 16);
csk = rcu_dereference_sk_user_data(sk);
__set_tcb_field(sk, skb, word, mask, val, 0, 1);
skb_set_queue_mapping(skb, (csk->txq_idx << 1) | CPL_PRIORITY_DATA);
csk->wr_credits -= credits_needed;
csk->wr_unacked += credits_needed;
enqueue_wr(csk, skb);
ret = cxgb4_ofld_send(csk->egress_dev, skb);
if (ret < 0)
kfree_skb(skb);
return ret < 0 ? ret : 0;
}
/*
* Set one of the t_flags bits in the TCB.
*/
int chtls_set_tcb_tflag(struct sock *sk, unsigned int bit_pos, int val)
{
return chtls_set_tcb_field(sk, 1, 1ULL << bit_pos,
(u64)val << bit_pos);
}
static int chtls_set_tcb_keyid(struct sock *sk, int keyid)
{
return chtls_set_tcb_field(sk, 31, 0xFFFFFFFFULL, keyid);
}
static int chtls_set_tcb_seqno(struct sock *sk)
{
return chtls_set_tcb_field(sk, 28, ~0ULL, 0);
}
static int chtls_set_tcb_quiesce(struct sock *sk, int val)
{
return chtls_set_tcb_field(sk, 1, (1ULL << TF_RX_QUIESCE_S),
TF_RX_QUIESCE_V(val));
}
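The helpers above all pass a (mask, value) pair for one TCB word. The sketch below models how such a pair could be applied, assuming the usual masked-update rule in which only the bits set in the mask are replaced. That rule and the demo_* names are assumptions for illustration, not taken from the hardware documentation.

```c
#include <stdint.h>

/* Model of a masked field update: only bits set in 'mask' are replaced. */
static uint64_t demo_apply_field(uint64_t old, uint64_t mask, uint64_t val)
{
	return (old & ~mask) | (val & mask);
}

/* Build the mask and value for a single flag bit and apply them to 'old'. */
static uint64_t demo_set_flag(uint64_t old, unsigned int bit_pos, int on)
{
	uint64_t mask = 1ULL << bit_pos;
	/* Widen before shifting so bit positions above 31 stay well defined. */
	uint64_t val = (uint64_t)(on ? 1 : 0) << bit_pos;

	return demo_apply_field(old, mask, val);
}
```

Widening the value to 64 bits before the shift matters once bit_pos can exceed the width of int.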
/* TLS Key bitmap processing */
int chtls_init_kmap(struct chtls_dev *cdev, struct cxgb4_lld_info *lldi)
{
unsigned int num_key_ctx, bsize;
int ksize;
num_key_ctx = (lldi->vr->key.size / TLS_KEY_CONTEXT_SZ);
bsize = BITS_TO_LONGS(num_key_ctx);
cdev->kmap.size = num_key_ctx;
cdev->kmap.available = bsize;
ksize = sizeof(*cdev->kmap.addr) * bsize;
cdev->kmap.addr = kvzalloc(ksize, GFP_KERNEL);
if (!cdev->kmap.addr)
return -ENOMEM;
cdev->kmap.start = lldi->vr->key.start;
spin_lock_init(&cdev->kmap.lock);
return 0;
}
static int get_new_keyid(struct chtls_sock *csk, u32 optname)
{
struct net_device *dev = csk->egress_dev;
struct chtls_dev *cdev = csk->cdev;
struct chtls_hws *hws;
struct adapter *adap;
int keyid;
adap = netdev2adap(dev);
hws = &csk->tlshws;
spin_lock_bh(&cdev->kmap.lock);
keyid = find_first_zero_bit(cdev->kmap.addr, cdev->kmap.size);
if (keyid < cdev->kmap.size) {
__set_bit(keyid, cdev->kmap.addr);
if (optname == TLS_RX)
hws->rxkey = keyid;
else
hws->txkey = keyid;
atomic_inc(&adap->chcr_stats.tls_key);
} else {
keyid = -1;
}
spin_unlock_bh(&cdev->kmap.lock);
return keyid;
}
void free_tls_keyid(struct sock *sk)
{
struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
struct net_device *dev = csk->egress_dev;
struct chtls_dev *cdev = csk->cdev;
struct chtls_hws *hws;
struct adapter *adap;
if (!cdev->kmap.addr)
return;
adap = netdev2adap(dev);
hws = &csk->tlshws;
spin_lock_bh(&cdev->kmap.lock);
if (hws->rxkey >= 0) {
__clear_bit(hws->rxkey, cdev->kmap.addr);
atomic_dec(&adap->chcr_stats.tls_key);
hws->rxkey = -1;
}
if (hws->txkey >= 0) {
__clear_bit(hws->txkey, cdev->kmap.addr);
atomic_dec(&adap->chcr_stats.tls_key);
hws->txkey = -1;
}
spin_unlock_bh(&cdev->kmap.lock);
}
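Key ids are handed out from a bitmap: get_new_keyid() claims the first clear bit and free_tls_keyid() clears it again. A minimal user-space sketch of that scheme follows. The demo_* names and the slot count are hypothetical, and the locking that the driver does with kmap.lock is left out.

```c
#include <limits.h>
#include <stddef.h>

#define DEMO_NUM_KEYS 70	/* hypothetical number of key slots */
#define DEMO_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_WORDS \
	((DEMO_NUM_KEYS + DEMO_BITS_PER_WORD - 1) / DEMO_BITS_PER_WORD)

struct demo_kmap {
	unsigned long bits[DEMO_WORDS];	/* bit set => slot in use */
};

/* Claim the lowest free slot; returns its index, or -1 if all are taken. */
static int demo_key_alloc(struct demo_kmap *m)
{
	size_t id;

	for (id = 0; id < DEMO_NUM_KEYS; id++) {
		unsigned long bit = 1UL << (id % DEMO_BITS_PER_WORD);
		unsigned long *word = &m->bits[id / DEMO_BITS_PER_WORD];

		if (!(*word & bit)) {
			*word |= bit;
			return (int)id;
		}
	}
	return -1;
}

/* Release a slot; ids outside the map (such as -1, "no key") are ignored. */
static void demo_key_free(struct demo_kmap *m, int id)
{
	if (id < 0 || id >= DEMO_NUM_KEYS)
		return;
	m->bits[id / DEMO_BITS_PER_WORD] &= ~(1UL << (id % DEMO_BITS_PER_WORD));
}
```

Freed slots are reused first, because allocation always scans from bit 0.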
unsigned int keyid_to_addr(int start_addr, int keyid)
{
return (start_addr + (keyid * TLS_KEY_CONTEXT_SZ)) >> 5;
}
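keyid_to_addr() turns a key id into an address in 32-byte units (the shift by 5). A worked sketch of the arithmetic follows; DEMO_KEY_CONTEXT_SZ is an assumed value chosen for illustration, since TLS_KEY_CONTEXT_SZ is defined outside this section.

```c
/* Assumed context size for illustration; the driver uses TLS_KEY_CONTEXT_SZ. */
#define DEMO_KEY_CONTEXT_SZ 64U

/* Byte offset of slot 'keyid' from 'start', expressed in 32-byte units. */
static unsigned int demo_keyid_to_addr(unsigned int start, unsigned int keyid)
{
	return (start + keyid * DEMO_KEY_CONTEXT_SZ) >> 5;
}
```

With a start of 0x1000 and 64-byte contexts, key id 3 sits at byte 0x10c0, which is unit 0x86.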
static void chtls_rxkey_ivauth(struct _key_ctx *kctx)
{
kctx->iv_to_auth = cpu_to_be64(KEYCTX_TX_WR_IV_V(6ULL) |
KEYCTX_TX_WR_AAD_V(1ULL) |
KEYCTX_TX_WR_AADST_V(5ULL) |
KEYCTX_TX_WR_CIPHER_V(14ULL) |
KEYCTX_TX_WR_CIPHERST_V(0ULL) |
KEYCTX_TX_WR_AUTH_V(14ULL) |
KEYCTX_TX_WR_AUTHST_V(16ULL) |
KEYCTX_TX_WR_AUTHIN_V(16ULL));
}
static int chtls_key_info(struct chtls_sock *csk,
struct _key_ctx *kctx,
u32 keylen, u32 optname)
{
unsigned char key[CHCR_KEYCTX_CIPHER_KEY_SIZE_256];
struct tls12_crypto_info_aes_gcm_128 *gcm_ctx;
unsigned char ghash_h[AEAD_H_SIZE];
struct crypto_cipher *cipher;
int ck_size, key_ctx_size;
int ret;
gcm_ctx = (struct tls12_crypto_info_aes_gcm_128 *)
&csk->tlshws.crypto_info;
key_ctx_size = sizeof(struct _key_ctx) +
roundup(keylen, 16) + AEAD_H_SIZE;
if (keylen == AES_KEYSIZE_128) {
ck_size = CHCR_KEYCTX_CIPHER_KEY_SIZE_128;
} else if (keylen == AES_KEYSIZE_192) {
ck_size = CHCR_KEYCTX_CIPHER_KEY_SIZE_192;
} else if (keylen == AES_KEYSIZE_256) {
ck_size = CHCR_KEYCTX_CIPHER_KEY_SIZE_256;
} else {
pr_err("GCM: Invalid key length %u\n", keylen);
return -EINVAL;
}
memcpy(key, gcm_ctx->key, keylen);
/* Calculate the GHASH subkey H = CIPH(K, 0^128), i.e. encrypt a block of
* AEAD_H_SIZE zero bytes with key K. H is stored in the key context
* right after the key.
*/
cipher = crypto_alloc_cipher("aes", 0, 0);
if (IS_ERR(cipher)) {
ret = -ENOMEM;
goto out;
}
ret = crypto_cipher_setkey(cipher, key, keylen);
if (ret)
goto out1;
memset(ghash_h, 0, AEAD_H_SIZE);
crypto_cipher_encrypt_one(cipher, ghash_h, ghash_h);
csk->tlshws.keylen = key_ctx_size;
/* Copy the Key context */
if (optname == TLS_RX) {
int key_ctx;
key_ctx = ((key_ctx_size >> 4) << 3);
kctx->ctx_hdr = FILL_KEY_CRX_HDR(ck_size,
CHCR_KEYCTX_MAC_KEY_SIZE_128,
0, 0, key_ctx);
chtls_rxkey_ivauth(kctx);
} else {
kctx->ctx_hdr = FILL_KEY_CTX_HDR(ck_size,
CHCR_KEYCTX_MAC_KEY_SIZE_128,
0, 0, key_ctx_size >> 4);
}
memcpy(kctx->salt, gcm_ctx->salt, TLS_CIPHER_AES_GCM_128_SALT_SIZE);
memcpy(kctx->key, gcm_ctx->key, keylen);
memcpy(kctx->key + keylen, ghash_h, AEAD_H_SIZE);
/* erase key info from driver */
memset(gcm_ctx->key, 0, keylen);
out1:
crypto_free_cipher(cipher);
out:
return ret;
}
static void chtls_set_scmd(struct chtls_sock *csk)
{
struct chtls_hws *hws = &csk->tlshws;
hws->scmd.seqno_numivs =
SCMD_SEQ_NO_CTRL_V(3) |
SCMD_PROTO_VERSION_V(0) |
SCMD_ENC_DEC_CTRL_V(0) |
SCMD_CIPH_AUTH_SEQ_CTRL_V(1) |
SCMD_CIPH_MODE_V(2) |
SCMD_AUTH_MODE_V(4) |
SCMD_HMAC_CTRL_V(0) |
SCMD_IV_SIZE_V(4) |
SCMD_NUM_IVS_V(1);
hws->scmd.ivgen_hdrlen =
SCMD_IV_GEN_CTRL_V(1) |
SCMD_KEY_CTX_INLINE_V(0) |
SCMD_TLS_FRAG_ENABLE_V(1);
}
int chtls_setkey(struct chtls_sock *csk, u32 keylen, u32 optname)
{
struct tls_key_req *kwr;
struct chtls_dev *cdev;
struct _key_ctx *kctx;
int wrlen, klen, len;
struct sk_buff *skb;
struct sock *sk;
int keyid;
int kaddr;
int ret;
cdev = csk->cdev;
sk = csk->sk;
klen = roundup((keylen + AEAD_H_SIZE) + sizeof(*kctx), 32);
wrlen = roundup(sizeof(*kwr), 16);
len = klen + wrlen;
/* Flush outstanding data before the new key takes effect */
if (optname == TLS_TX) {
lock_sock(sk);
if (skb_queue_len(&csk->txq))
chtls_push_frames(csk, 0);
release_sock(sk);
}
skb = alloc_skb(len, GFP_KERNEL);
if (!skb)
return -ENOMEM;
keyid = get_new_keyid(csk, optname);
if (keyid < 0) {
ret = -ENOSPC;
goto out_nokey;
}
kaddr = keyid_to_addr(cdev->kmap.start, keyid);
kwr = (struct tls_key_req *)__skb_put_zero(skb, len);
kwr->wr.op_to_compl =
cpu_to_be32(FW_WR_OP_V(FW_ULPTX_WR) | FW_WR_COMPL_F |
FW_WR_ATOMIC_V(1U));
kwr->wr.flowid_len16 =
cpu_to_be32(FW_WR_LEN16_V(DIV_ROUND_UP(len, 16)) |
FW_WR_FLOWID_V(csk->tid));
kwr->wr.protocol = 0;
kwr->wr.mfs = htons(TLS_MFS);
kwr->wr.reneg_to_write_rx = optname;
/* ulptx command */
kwr->req.cmd = cpu_to_be32(ULPTX_CMD_V(ULP_TX_MEM_WRITE) |
T5_ULP_MEMIO_ORDER_V(1) |
T5_ULP_MEMIO_IMM_V(1));
kwr->req.len16 = cpu_to_be32((csk->tid << 8) |
DIV_ROUND_UP(len - sizeof(kwr->wr), 16));
kwr->req.dlen = cpu_to_be32(ULP_MEMIO_DATA_LEN_V(klen >> 5));
kwr->req.lock_addr = cpu_to_be32(ULP_MEMIO_ADDR_V(kaddr));
/* sub command */
kwr->sc_imm.cmd_more = cpu_to_be32(ULPTX_CMD_V(ULP_TX_SC_IMM));
kwr->sc_imm.len = cpu_to_be32(klen);
/* key info */
kctx = (struct _key_ctx *)(kwr + 1);
ret = chtls_key_info(csk, kctx, keylen, optname);
if (ret)
goto out_notcb;
set_wr_txq(skb, CPL_PRIORITY_DATA, csk->tlshws.txqid);
csk->wr_credits -= DIV_ROUND_UP(len, 16);
csk->wr_unacked += DIV_ROUND_UP(len, 16);
enqueue_wr(csk, skb);
cxgb4_ofld_send(csk->egress_dev, skb);
chtls_set_scmd(csk);
/* Clear quiesce for Rx key */
if (optname == TLS_RX) {
ret = chtls_set_tcb_keyid(sk, keyid);
if (ret)
goto out_notcb;
ret = chtls_set_tcb_field(sk, 0,
TCB_ULP_RAW_V(TCB_ULP_RAW_M),
TCB_ULP_RAW_V((TF_TLS_KEY_SIZE_V(1) |
TF_TLS_CONTROL_V(1) |
TF_TLS_ACTIVE_V(1) |
TF_TLS_ENABLE_V(1))));
if (ret)
goto out_notcb;
ret = chtls_set_tcb_seqno(sk);
if (ret)
goto out_notcb;
ret = chtls_set_tcb_quiesce(sk, 0);
if (ret)
goto out_notcb;
csk->tlshws.rxkey = keyid;
} else {
csk->tlshws.tx_seq_no = 0;
csk->tlshws.txkey = keyid;
}
return ret;
out_notcb:
free_tls_keyid(sk);
out_nokey:
kfree_skb(skb);
return ret;
}
/*
* Copyright (c) 2018 Chelsio Communications, Inc.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*
* Written by: Atul Gupta (atul.gupta@chelsio.com)
*/
#include <linux/module.h>
#include <linux/list.h>
#include <linux/workqueue.h>
#include <linux/skbuff.h>
#include <linux/timer.h>
#include <linux/notifier.h>
#include <linux/inetdevice.h>
#include <linux/ip.h>
#include <linux/tcp.h>
#include <linux/sched/signal.h>
#include <net/tcp.h>
#include <net/busy_poll.h>
#include <crypto/aes.h>
#include "chtls.h"
#include "chtls_cm.h"
static bool is_tls_tx(struct chtls_sock *csk)
{
return csk->tlshws.txkey >= 0;
}
static bool is_tls_rx(struct chtls_sock *csk)
{
return csk->tlshws.rxkey >= 0;
}
static int data_sgl_len(const struct sk_buff *skb)
{
unsigned int cnt;
cnt = skb_shinfo(skb)->nr_frags;
return sgl_len(cnt) * 8;
}
static int nos_ivs(struct sock *sk, unsigned int size)
{
struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
return DIV_ROUND_UP(size, csk->tlshws.mfs);
}
static int set_ivs_imm(struct sock *sk, const struct sk_buff *skb)
{
int ivs_size = nos_ivs(sk, skb->len) * CIPHER_BLOCK_SIZE;
int hlen = TLS_WR_CPL_LEN + data_sgl_len(skb);
if ((hlen + KEY_ON_MEM_SZ + ivs_size) <
MAX_IMM_OFLD_TX_DATA_WR_LEN) {
ULP_SKB_CB(skb)->ulp.tls.iv = 1;
return 1;
}
ULP_SKB_CB(skb)->ulp.tls.iv = 0;
return 0;
}
static int max_ivs_size(struct sock *sk, int size)
{
return nos_ivs(sk, size) * CIPHER_BLOCK_SIZE;
}
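nos_ivs() and max_ivs_size() size the IV material as one IV per record, with a payload split into records of at most mfs bytes. A small worked sketch follows; the demo_* names are hypothetical and the 16-byte block size is an assumption standing in for CIPHER_BLOCK_SIZE.

```c
#define DEMO_CIPHER_BLOCK_SIZE 16U	/* assumed bytes of IV per record */

/* Records needed to carry 'size' bytes when each holds at most 'mfs'. */
static unsigned int demo_nos_ivs(unsigned int size, unsigned int mfs)
{
	return (size + mfs - 1) / mfs;
}

/* Total bytes of IV material for a payload of 'size' bytes. */
static unsigned int demo_ivs_bytes(unsigned int size, unsigned int mfs)
{
	return demo_nos_ivs(size, mfs) * DEMO_CIPHER_BLOCK_SIZE;
}
```

A 40000-byte payload with a 16384-byte record size needs three records, so 48 bytes of IVs under these assumptions.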
static int ivs_size(struct sock *sk, const struct sk_buff *skb)
{
return set_ivs_imm(sk, skb) ? (nos_ivs(sk, skb->len) *
CIPHER_BLOCK_SIZE) : 0;
}
static int flowc_wr_credits(int nparams, int *flowclenp)
{
int flowclen16, flowclen;
flowclen = offsetof(struct fw_flowc_wr, mnemval[nparams]);
flowclen16 = DIV_ROUND_UP(flowclen, 16);
flowclen = flowclen16 * 16;
if (flowclenp)
*flowclenp = flowclen;
return flowclen16;
}
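flowc_wr_credits() rounds the work request length up to 16-byte units, which are also the unit of WR credits. The sketch below repeats that calculation with stand-in structures. The layout (an 8-byte header followed by 8-byte parameters) is an assumption made for illustration, and the demo_* names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in layout (an assumption): 8-byte header, then 8-byte parameters. */
struct demo_mnemval {
	uint8_t mnemonic;
	uint8_t rsvd[3];
	uint32_t val;
};

struct demo_flowc_wr {
	uint32_t op_to_nparams;
	uint32_t flowid_len16;
	struct demo_mnemval mnemval[];
};

/*
 * Return the WR length in 16-byte credits for 'nparams' parameters and,
 * if 'lenp' is non-NULL, store the padded byte length there.
 */
static int demo_flowc_credits(int nparams, int *lenp)
{
	size_t len = offsetof(struct demo_flowc_wr, mnemval) +
		     (size_t)nparams * sizeof(struct demo_mnemval);
	int len16 = (int)((len + 15) / 16);

	if (lenp)
		*lenp = len16 * 16;
	return len16;
}
```

Under this layout nine parameters give 80 bytes and 5 credits, while ten give 88 bytes, padded to 96, and 6 credits.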
static struct sk_buff *create_flowc_wr_skb(struct sock *sk,
struct fw_flowc_wr *flowc,
int flowclen)
{
struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
struct sk_buff *skb;
skb = alloc_skb(flowclen, GFP_ATOMIC);
if (!skb)
return NULL;
memcpy(__skb_put(skb, flowclen), flowc, flowclen);
skb_set_queue_mapping(skb, (csk->txq_idx << 1) | CPL_PRIORITY_DATA);
return skb;
}
static int send_flowc_wr(struct sock *sk, struct fw_flowc_wr *flowc,
int flowclen)
{
struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
struct tcp_sock *tp = tcp_sk(sk);
struct sk_buff *skb;
int flowclen16;
int ret;
flowclen16 = flowclen / 16;
if (csk_flag(sk, CSK_TX_DATA_SENT)) {
skb = create_flowc_wr_skb(sk, flowc, flowclen);
if (!skb)
return -ENOMEM;
skb_entail(sk, skb,
ULPCB_FLAG_NO_HDR | ULPCB_FLAG_NO_APPEND);
return 0;
}
ret = cxgb4_immdata_send(csk->egress_dev,
csk->txq_idx,
flowc, flowclen);
if (!ret)
return flowclen16;
skb = create_flowc_wr_skb(sk, flowc, flowclen);
if (!skb)
return -ENOMEM;
send_or_defer(sk, tp, skb, 0);
return flowclen16;
}
static u8 tcp_state_to_flowc_state(u8 state)
{
switch (state) {
case TCP_ESTABLISHED:
return FW_FLOWC_MNEM_TCPSTATE_ESTABLISHED;
case TCP_CLOSE_WAIT:
return FW_FLOWC_MNEM_TCPSTATE_CLOSEWAIT;
case TCP_FIN_WAIT1:
return FW_FLOWC_MNEM_TCPSTATE_FINWAIT1;
case TCP_CLOSING:
return FW_FLOWC_MNEM_TCPSTATE_CLOSING;
case TCP_LAST_ACK:
return FW_FLOWC_MNEM_TCPSTATE_LASTACK;
case TCP_FIN_WAIT2:
return FW_FLOWC_MNEM_TCPSTATE_FINWAIT2;
}
return FW_FLOWC_MNEM_TCPSTATE_ESTABLISHED;
}
int send_tx_flowc_wr(struct sock *sk, int compl,
u32 snd_nxt, u32 rcv_nxt)
{
struct flowc_packed {
struct fw_flowc_wr fc;
struct fw_flowc_mnemval mnemval[FW_FLOWC_MNEM_MAX];
} __packed sflowc;
int nparams, paramidx, flowclen16, flowclen;
struct fw_flowc_wr *flowc;
struct chtls_sock *csk;
struct tcp_sock *tp;
csk = rcu_dereference_sk_user_data(sk);
tp = tcp_sk(sk);
memset(&sflowc, 0, sizeof(sflowc));
flowc = &sflowc.fc;
#define FLOWC_PARAM(__m, __v) \
do { \
flowc->mnemval[paramidx].mnemonic = FW_FLOWC_MNEM_##__m; \
flowc->mnemval[paramidx].val = cpu_to_be32(__v); \
paramidx++; \
} while (0)
paramidx = 0;
FLOWC_PARAM(PFNVFN, FW_PFVF_CMD_PFN_V(csk->cdev->lldi->pf));
FLOWC_PARAM(CH, csk->tx_chan);
FLOWC_PARAM(PORT, csk->tx_chan);
FLOWC_PARAM(IQID, csk->rss_qid);
FLOWC_PARAM(SNDNXT, tp->snd_nxt);
FLOWC_PARAM(RCVNXT, tp->rcv_nxt);
FLOWC_PARAM(SNDBUF, csk->sndbuf);
FLOWC_PARAM(MSS, tp->mss_cache);
FLOWC_PARAM(TCPSTATE, tcp_state_to_flowc_state(sk->sk_state));
if (SND_WSCALE(tp))
FLOWC_PARAM(RCV_SCALE, SND_WSCALE(tp));
if (csk->ulp_mode == ULP_MODE_TLS)
FLOWC_PARAM(ULD_MODE, ULP_MODE_TLS);
if (csk->tlshws.fcplenmax)
FLOWC_PARAM(TXDATAPLEN_MAX, csk->tlshws.fcplenmax);
nparams = paramidx;
#undef FLOWC_PARAM
flowclen16 = flowc_wr_credits(nparams, &flowclen);
flowc->op_to_nparams =
cpu_to_be32(FW_WR_OP_V(FW_FLOWC_WR) |
FW_WR_COMPL_V(compl) |
FW_FLOWC_WR_NPARAMS_V(nparams));
flowc->flowid_len16 = cpu_to_be32(FW_WR_LEN16_V(flowclen16) |
FW_WR_FLOWID_V(csk->tid));
return send_flowc_wr(sk, flowc, flowclen);
}
/* Copy IVs to WR */
static int tls_copy_ivs(struct sock *sk, struct sk_buff *skb)
{
struct chtls_sock *csk;
unsigned char *iv_loc;
struct chtls_hws *hws;
unsigned char *ivs;
u16 number_of_ivs;
struct page *page;
int err = 0;
csk = rcu_dereference_sk_user_data(sk);
hws = &csk->tlshws;
number_of_ivs = nos_ivs(sk, skb->len);
if (number_of_ivs > MAX_IVS_PAGE) {
pr_warn("MAX IVs in PAGE exceeded %d\n", number_of_ivs);
return -ENOMEM;
}
/* generate the IVs */
ivs = kmalloc(number_of_ivs * CIPHER_BLOCK_SIZE, GFP_ATOMIC);
if (!ivs)
return -ENOMEM;
get_random_bytes(ivs, number_of_ivs * CIPHER_BLOCK_SIZE);
if (skb_ulp_tls_iv_imm(skb)) {
/* send the IVs as immediate data in the WR */
iv_loc = (unsigned char *)__skb_push(skb, number_of_ivs *
CIPHER_BLOCK_SIZE);
if (iv_loc)
memcpy(iv_loc, ivs, number_of_ivs * CIPHER_BLOCK_SIZE);
hws->ivsize = number_of_ivs * CIPHER_BLOCK_SIZE;
} else {
/* Send the IVs as sgls */
/* Already accounted IV DSGL for credits */
skb_shinfo(skb)->nr_frags--;
page = alloc_pages(sk->sk_allocation | __GFP_COMP, 0);
if (!page) {
pr_info("%s : Page allocation for IVs failed\n",
__func__);
err = -ENOMEM;
goto out;
}
memcpy(page_address(page), ivs, number_of_ivs *
CIPHER_BLOCK_SIZE);
skb_fill_page_desc(skb, skb_shinfo(skb)->nr_frags, page, 0,
number_of_ivs * CIPHER_BLOCK_SIZE);
hws->ivsize = 0;
}
out:
kfree(ivs);
return err;
}
/* Copy Key to WR */
static void tls_copy_tx_key(struct sock *sk, struct sk_buff *skb)
{
struct ulptx_sc_memrd *sc_memrd;
struct chtls_sock *csk;
struct chtls_dev *cdev;
struct ulptx_idata *sc;
struct chtls_hws *hws;
u32 immdlen;
int kaddr;
csk = rcu_dereference_sk_user_data(sk);
hws = &csk->tlshws;
cdev = csk->cdev;
immdlen = sizeof(*sc) + sizeof(*sc_memrd);
kaddr = keyid_to_addr(cdev->kmap.start, hws->txkey);
sc = (struct ulptx_idata *)__skb_push(skb, immdlen);
if (sc) {
sc->cmd_more = htonl(ULPTX_CMD_V(ULP_TX_SC_NOOP));
sc->len = htonl(0);
sc_memrd = (struct ulptx_sc_memrd *)(sc + 1);
sc_memrd->cmd_to_len =
htonl(ULPTX_CMD_V(ULP_TX_SC_MEMRD) |
ULP_TX_SC_MORE_V(1) |
ULPTX_LEN16_V(hws->keylen >> 4));
sc_memrd->addr = htonl(kaddr);
}
}
static u64 tlstx_incr_seqnum(struct chtls_hws *hws)
{
return hws->tx_seq_no++;
}
static bool is_sg_request(const struct sk_buff *skb)
{
return skb->peeked ||
(skb->len > MAX_IMM_ULPTX_WR_LEN);
}
/*
* Returns true if an sk_buff carries urgent data.
*/
static bool skb_urgent(struct sk_buff *skb)
{
return ULP_SKB_CB(skb)->flags & ULPCB_FLAG_URG;
}
/* TLS content type for CPL SFO */
static unsigned char tls_content_type(unsigned char content_type)
{
switch (content_type) {
case TLS_HDR_TYPE_CCS:
return CPL_TX_TLS_SFO_TYPE_CCS;
case TLS_HDR_TYPE_ALERT:
return CPL_TX_TLS_SFO_TYPE_ALERT;
case TLS_HDR_TYPE_HANDSHAKE:
return CPL_TX_TLS_SFO_TYPE_HANDSHAKE;
case TLS_HDR_TYPE_HEARTBEAT:
return CPL_TX_TLS_SFO_TYPE_HEARTBEAT;
}
return CPL_TX_TLS_SFO_TYPE_DATA;
}
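The switch in tls_content_type() maps a TLS record header type to the CPL SFO type, with anything unrecognised treated as data. The same mapping can be written as a lookup table, sketched below. The DEMO_* enumerators copy the values listed in chtls_cm.h, and the function name is hypothetical.

```c
/* TLS record header types and CPL SFO types, mirroring chtls_cm.h. */
enum { DEMO_HDR_CCS = 20, DEMO_HDR_ALERT, DEMO_HDR_HANDSHAKE,
       DEMO_HDR_RECORD, DEMO_HDR_HEARTBEAT };
enum { DEMO_SFO_CCS, DEMO_SFO_ALERT, DEMO_SFO_HANDSHAKE,
       DEMO_SFO_DATA, DEMO_SFO_HEARTBEAT };

/* Table-driven equivalent of the switch; unknown types map to DATA. */
static unsigned char demo_content_type(unsigned char hdr_type)
{
	static const unsigned char map[] = {
		[DEMO_HDR_CCS - DEMO_HDR_CCS]       = DEMO_SFO_CCS,
		[DEMO_HDR_ALERT - DEMO_HDR_CCS]     = DEMO_SFO_ALERT,
		[DEMO_HDR_HANDSHAKE - DEMO_HDR_CCS] = DEMO_SFO_HANDSHAKE,
		[DEMO_HDR_RECORD - DEMO_HDR_CCS]    = DEMO_SFO_DATA,
		[DEMO_HDR_HEARTBEAT - DEMO_HDR_CCS] = DEMO_SFO_HEARTBEAT,
	};

	if (hdr_type < DEMO_HDR_CCS || hdr_type > DEMO_HDR_HEARTBEAT)
		return DEMO_SFO_DATA;
	return map[hdr_type - DEMO_HDR_CCS];
}
```

The table only works because the header types are contiguous from 20 to 24; the switch in the driver does not depend on that.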
static void tls_tx_data_wr(struct sock *sk, struct sk_buff *skb,
int dlen, int tls_immd, u32 credits,
int expn, int pdus)
{
struct fw_tlstx_data_wr *req_wr;
struct cpl_tx_tls_sfo *req_cpl;
unsigned int wr_ulp_mode_force;
struct tls_scmd *updated_scmd;
unsigned char data_type;
struct chtls_sock *csk;
struct net_device *dev;
struct chtls_hws *hws;
struct tls_scmd *scmd;
struct adapter *adap;
unsigned char *req;
int immd_len;
int iv_imm;
int len;
csk = rcu_dereference_sk_user_data(sk);
iv_imm = skb_ulp_tls_iv_imm(skb);
dev = csk->egress_dev;
adap = netdev2adap(dev);
hws = &csk->tlshws;
scmd = &hws->scmd;
len = dlen + expn;
dlen = (dlen < hws->mfs) ? dlen : hws->mfs;
atomic_inc(&adap->chcr_stats.tls_pdu_tx);
updated_scmd = scmd;
updated_scmd->seqno_numivs &= 0xffffff80;
updated_scmd->seqno_numivs |= SCMD_NUM_IVS_V(pdus);
hws->scmd = *updated_scmd;
req = (unsigned char *)__skb_push(skb, sizeof(struct cpl_tx_tls_sfo));
req_cpl = (struct cpl_tx_tls_sfo *)req;
req = (unsigned char *)__skb_push(skb, (sizeof(struct
fw_tlstx_data_wr)));
req_wr = (struct fw_tlstx_data_wr *)req;
immd_len = (tls_immd ? dlen : 0);
req_wr->op_to_immdlen =
htonl(FW_WR_OP_V(FW_TLSTX_DATA_WR) |
FW_TLSTX_DATA_WR_COMPL_V(1) |
FW_TLSTX_DATA_WR_IMMDLEN_V(immd_len));
req_wr->flowid_len16 = htonl(FW_TLSTX_DATA_WR_FLOWID_V(csk->tid) |
FW_TLSTX_DATA_WR_LEN16_V(credits));
wr_ulp_mode_force = TX_ULP_MODE_V(ULP_MODE_TLS);
if (is_sg_request(skb))
wr_ulp_mode_force |= FW_OFLD_TX_DATA_WR_ALIGNPLD_F |
((tcp_sk(sk)->nonagle & TCP_NAGLE_OFF) ? 0 :
FW_OFLD_TX_DATA_WR_SHOVE_F);
req_wr->lsodisable_to_flags =
htonl(TX_ULP_MODE_V(ULP_MODE_TLS) |
FW_OFLD_TX_DATA_WR_URGENT_V(skb_urgent(skb)) |
T6_TX_FORCE_F | wr_ulp_mode_force |
TX_SHOVE_V((!csk_flag(sk, CSK_TX_MORE_DATA)) &&
skb_queue_empty(&csk->txq)));
req_wr->ctxloc_to_exp =
htonl(FW_TLSTX_DATA_WR_NUMIVS_V(pdus) |
FW_TLSTX_DATA_WR_EXP_V(expn) |
FW_TLSTX_DATA_WR_CTXLOC_V(CHTLS_KEY_CONTEXT_DDR) |
FW_TLSTX_DATA_WR_IVDSGL_V(!iv_imm) |
FW_TLSTX_DATA_WR_KEYSIZE_V(hws->keylen >> 4));
/* Fill in the length */
req_wr->plen = htonl(len);
req_wr->mfs = htons(hws->mfs);
req_wr->adjustedplen_pkd =
htons(FW_TLSTX_DATA_WR_ADJUSTEDPLEN_V(hws->adjustlen));
req_wr->expinplenmax_pkd =
htons(FW_TLSTX_DATA_WR_EXPINPLENMAX_V(hws->expansion));
req_wr->pdusinplenmax_pkd =
FW_TLSTX_DATA_WR_PDUSINPLENMAX_V(hws->pdus);
req_wr->r10 = 0;
data_type = tls_content_type(ULP_SKB_CB(skb)->ulp.tls.type);
req_cpl->op_to_seg_len = htonl(CPL_TX_TLS_SFO_OPCODE_V(CPL_TX_TLS_SFO) |
CPL_TX_TLS_SFO_DATA_TYPE_V(data_type) |
CPL_TX_TLS_SFO_CPL_LEN_V(2) |
CPL_TX_TLS_SFO_SEG_LEN_V(dlen));
req_cpl->pld_len = htonl(len - expn);
req_cpl->type_protover = htonl(CPL_TX_TLS_SFO_TYPE_V
((data_type == CPL_TX_TLS_SFO_TYPE_HEARTBEAT) ?
TLS_HDR_TYPE_HEARTBEAT : 0) |
CPL_TX_TLS_SFO_PROTOVER_V(0));
/* create the s-command */
req_cpl->r1_lo = 0;
req_cpl->seqno_numivs = cpu_to_be32(hws->scmd.seqno_numivs);
req_cpl->ivgen_hdrlen = cpu_to_be32(hws->scmd.ivgen_hdrlen);
req_cpl->scmd1 = cpu_to_be64(tlstx_incr_seqnum(hws));
}
/*
* Calculate the TLS data expansion size
*/
static int chtls_expansion_size(struct sock *sk, int data_len,
int fullpdu,
unsigned short *pducnt)
{
struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
struct chtls_hws *hws = &csk->tlshws;
struct tls_scmd *scmd = &hws->scmd;
int fragsize = hws->mfs;
int expnsize = 0;
int fragleft;
int fragcnt;
int expppdu;
if (SCMD_CIPH_MODE_G(scmd->seqno_numivs) ==
SCMD_CIPH_MODE_AES_GCM) {
expppdu = GCM_TAG_SIZE + AEAD_EXPLICIT_DATA_SIZE +
TLS_HEADER_LENGTH;
if (fullpdu) {
*pducnt = data_len / (expppdu + fragsize);
if (*pducnt > 32)
*pducnt = 32;
else if (!*pducnt)
*pducnt = 1;
expnsize = (*pducnt) * expppdu;
return expnsize;
}
fragcnt = (data_len / fragsize);
expnsize = fragcnt * expppdu;
fragleft = data_len % fragsize;
if (fragleft > 0)
expnsize += expppdu;
}
return expnsize;
}
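/*
 * Illustrative sketch, not part of the driver: the non-fullpdu branch above
 * charges one fixed overhead per record. The constants below assume the
 * usual TLS 1.2 AES-GCM layout (5-byte header, 8-byte explicit nonce,
 * 16-byte tag). The driver's own GCM_TAG_SIZE, AEAD_EXPLICIT_DATA_SIZE and
 * TLS_HEADER_LENGTH are defined elsewhere and may differ. expansion_bytes()
 * is a hypothetical helper name.
 */

```c
#include <assert.h>

/* Assumed per-record overhead for TLS 1.2 AES-GCM (illustrative values). */
enum {
	HDR_LEN = 5,		/* record header */
	EXPLICIT_IV_LEN = 8,	/* explicit nonce carried in the record */
	TAG_LEN = 16,		/* GCM authentication tag */
	PER_RECORD_OVERHEAD = HDR_LEN + EXPLICIT_IV_LEN + TAG_LEN
};

/* Bytes added when data_len bytes are split into records of at most mfs. */
static int expansion_bytes(int data_len, int mfs)
{
	int records;

	assert(mfs > 0 && data_len >= 0);
	records = data_len / mfs;
	if (data_len % mfs)	/* a partial trailing record still pays full overhead */
		records++;
	return records * PER_RECORD_OVERHEAD;
}
```

/*
 * Under these assumptions, 16385 bytes with mfs = 16384 need two records,
 * so the expansion is 2 * 29 = 58 bytes.
 */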
/* WR with IV, KEY and CPL SFO added */
static void make_tlstx_data_wr(struct sock *sk, struct sk_buff *skb,
int tls_tx_imm, int tls_len, u32 credits)
{
unsigned short pdus_per_ulp = 0;
struct chtls_sock *csk;
struct chtls_hws *hws;
int expn_sz;
int pdus;
csk = rcu_dereference_sk_user_data(sk);
hws = &csk->tlshws;
pdus = DIV_ROUND_UP(tls_len, hws->mfs);
expn_sz = chtls_expansion_size(sk, tls_len, 0, NULL);
if (!hws->compute) {
hws->expansion = chtls_expansion_size(sk,
hws->fcplenmax,
1, &pdus_per_ulp);
hws->pdus = pdus_per_ulp;
hws->adjustlen = hws->pdus *
((hws->expansion / hws->pdus) + hws->mfs);
hws->compute = 1;
}
if (tls_copy_ivs(sk, skb))
return;
tls_copy_tx_key(sk, skb);
tls_tx_data_wr(sk, skb, tls_len, tls_tx_imm, credits, expn_sz, pdus);
hws->tx_seq_no += (pdus - 1);
}
static void make_tx_data_wr(struct sock *sk, struct sk_buff *skb,
unsigned int immdlen, int len,
u32 credits, u32 compl)
{
struct fw_ofld_tx_data_wr *req;
unsigned int wr_ulp_mode_force;
struct chtls_sock *csk;
unsigned int opcode;
csk = rcu_dereference_sk_user_data(sk);
opcode = FW_OFLD_TX_DATA_WR;
req = (struct fw_ofld_tx_data_wr *)__skb_push(skb, sizeof(*req));
req->op_to_immdlen = htonl(WR_OP_V(opcode) |
FW_WR_COMPL_V(compl) |
FW_WR_IMMDLEN_V(immdlen));
req->flowid_len16 = htonl(FW_WR_FLOWID_V(csk->tid) |
FW_WR_LEN16_V(credits));
wr_ulp_mode_force = TX_ULP_MODE_V(csk->ulp_mode);
if (is_sg_request(skb))
wr_ulp_mode_force |= FW_OFLD_TX_DATA_WR_ALIGNPLD_F |
((tcp_sk(sk)->nonagle & TCP_NAGLE_OFF) ? 0 :
FW_OFLD_TX_DATA_WR_SHOVE_F);
req->tunnel_to_proxy = htonl(wr_ulp_mode_force |
FW_OFLD_TX_DATA_WR_URGENT_V(skb_urgent(skb)) |
FW_OFLD_TX_DATA_WR_SHOVE_V((!csk_flag
(sk, CSK_TX_MORE_DATA)) &&
skb_queue_empty(&csk->txq)));
req->plen = htonl(len);
}
static int chtls_wr_size(struct chtls_sock *csk, const struct sk_buff *skb,
bool size)
{
int wr_size;
wr_size = TLS_WR_CPL_LEN;
wr_size += KEY_ON_MEM_SZ;
wr_size += ivs_size(csk->sk, skb);
if (size)
return wr_size;
/* frags counted for IV dsgl */
if (!skb_ulp_tls_iv_imm(skb))
skb_shinfo(skb)->nr_frags++;
return wr_size;
}
static bool is_ofld_imm(struct chtls_sock *csk, const struct sk_buff *skb)
{
int length = skb->len;
if (skb->peeked || skb->len > MAX_IMM_ULPTX_WR_LEN)
return false;
if (likely(ULP_SKB_CB(skb)->flags & ULPCB_FLAG_NEED_HDR)) {
/* Check TLS header len for Immediate */
if (csk->ulp_mode == ULP_MODE_TLS &&
skb_ulp_tls_inline(skb))
length += chtls_wr_size(csk, skb, true);
else
length += sizeof(struct fw_ofld_tx_data_wr);
return length <= MAX_IMM_OFLD_TX_DATA_WR_LEN;
}
return true;
}
static unsigned int calc_tx_flits(const struct sk_buff *skb,
unsigned int immdlen)
{
unsigned int flits, cnt;
flits = immdlen / 8; /* headers */
cnt = skb_shinfo(skb)->nr_frags;
if (skb_tail_pointer(skb) != skb_transport_header(skb))
cnt++;
return flits + sgl_len(cnt);
}
static void arp_failure_discard(void *handle, struct sk_buff *skb)
{
kfree_skb(skb);
}
int chtls_push_frames(struct chtls_sock *csk, int comp)
{
struct chtls_hws *hws = &csk->tlshws;
struct tcp_sock *tp;
struct sk_buff *skb;
int total_size = 0;
struct sock *sk;
int wr_size;
wr_size = sizeof(struct fw_ofld_tx_data_wr);
sk = csk->sk;
tp = tcp_sk(sk);
if (unlikely(sk_in_state(sk, TCPF_SYN_SENT | TCPF_CLOSE)))
return 0;
if (unlikely(csk_flag(sk, CSK_ABORT_SHUTDOWN)))
return 0;
while (csk->wr_credits && (skb = skb_peek(&csk->txq)) &&
(!(ULP_SKB_CB(skb)->flags & ULPCB_FLAG_HOLD) ||
skb_queue_len(&csk->txq) > 1)) {
unsigned int credit_len = skb->len;
unsigned int credits_needed;
unsigned int completion = 0;
int tls_len = skb->len; /* TLS data len before IV/key */
unsigned int immdlen;
int len = skb->len; /* length [ulp bytes] inserted by hw */
int flowclen16 = 0;
int tls_tx_imm = 0;
immdlen = skb->len;
if (!is_ofld_imm(csk, skb)) {
immdlen = skb_transport_offset(skb);
if (skb_ulp_tls_inline(skb))
wr_size = chtls_wr_size(csk, skb, false);
credit_len = 8 * calc_tx_flits(skb, immdlen);
} else {
if (skb_ulp_tls_inline(skb)) {
wr_size = chtls_wr_size(csk, skb, false);
tls_tx_imm = 1;
}
}
if (likely(ULP_SKB_CB(skb)->flags & ULPCB_FLAG_NEED_HDR))
credit_len += wr_size;
credits_needed = DIV_ROUND_UP(credit_len, 16);
if (!csk_flag_nochk(csk, CSK_TX_DATA_SENT)) {
flowclen16 = send_tx_flowc_wr(sk, 1, tp->snd_nxt,
tp->rcv_nxt);
if (flowclen16 <= 0)
break;
csk->wr_credits -= flowclen16;
csk->wr_unacked += flowclen16;
csk->wr_nondata += flowclen16;
csk_set_flag(csk, CSK_TX_DATA_SENT);
}
if (csk->wr_credits < credits_needed) {
if (skb_ulp_tls_inline(skb) &&
!skb_ulp_tls_iv_imm(skb))
skb_shinfo(skb)->nr_frags--;
break;
}
__skb_unlink(skb, &csk->txq);
skb_set_queue_mapping(skb, (csk->txq_idx << 1) |
CPL_PRIORITY_DATA);
if (hws->ofld)
hws->txqid = (skb->queue_mapping >> 1);
skb->csum = (__force __wsum)(credits_needed + csk->wr_nondata);
csk->wr_credits -= credits_needed;
csk->wr_unacked += credits_needed;
csk->wr_nondata = 0;
enqueue_wr(csk, skb);
if (likely(ULP_SKB_CB(skb)->flags & ULPCB_FLAG_NEED_HDR)) {
if ((comp && csk->wr_unacked == credits_needed) ||
(ULP_SKB_CB(skb)->flags & ULPCB_FLAG_COMPL) ||
csk->wr_unacked >= csk->wr_max_credits / 2) {
completion = 1;
csk->wr_unacked = 0;
}
if (skb_ulp_tls_inline(skb))
make_tlstx_data_wr(sk, skb, tls_tx_imm,
tls_len, credits_needed);
else
make_tx_data_wr(sk, skb, immdlen, len,
credits_needed, completion);
tp->snd_nxt += len;
tp->lsndtime = tcp_time_stamp(tp);
if (completion)
ULP_SKB_CB(skb)->flags &= ~ULPCB_FLAG_NEED_HDR;
} else {
struct cpl_close_con_req *req = cplhdr(skb);
unsigned int cmd = CPL_OPCODE_G(ntohl
(OPCODE_TID(req)));
if (cmd == CPL_CLOSE_CON_REQ)
csk_set_flag(csk,
CSK_CLOSE_CON_REQUESTED);
if ((ULP_SKB_CB(skb)->flags & ULPCB_FLAG_COMPL) &&
(csk->wr_unacked >= csk->wr_max_credits / 2)) {
req->wr.wr_hi |= htonl(FW_WR_COMPL_F);
csk->wr_unacked = 0;
}
}
total_size += skb->truesize;
if (ULP_SKB_CB(skb)->flags & ULPCB_FLAG_BARRIER)
csk_set_flag(csk, CSK_TX_WAIT_IDLE);
t4_set_arp_err_handler(skb, NULL, arp_failure_discard);
cxgb4_l2t_send(csk->egress_dev, skb, csk->l2t_entry);
}
sk->sk_wmem_queued -= total_size;
return total_size;
}
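/*
 * Illustrative sketch, not part of the driver: the credit arithmetic used by
 * chtls_push_frames(). Work-request credits are counted in 16-byte units,
 * and a completion is requested once half of the credit pool is
 * unacknowledged. Only that threshold term is modelled here; the loop above
 * also honours the comp argument and ULPCB_FLAG_COMPL. Both helper names
 * are hypothetical.
 */

```c
#include <assert.h>
#include <stdbool.h>

/* Credits are 16-byte units, rounded up (DIV_ROUND_UP(len, 16)). */
static unsigned int wr_credits_needed(unsigned int wr_bytes)
{
	return (wr_bytes + 15) / 16;
}

/* Request a completion once half of the credits are unacknowledged. */
static bool want_completion(unsigned int wr_unacked,
			    unsigned int wr_max_credits)
{
	assert(wr_max_credits > 0);
	return wr_unacked >= wr_max_credits / 2;
}
```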
static void mark_urg(struct tcp_sock *tp, int flags,
struct sk_buff *skb)
{
if (unlikely(flags & MSG_OOB)) {
tp->snd_up = tp->write_seq;
ULP_SKB_CB(skb)->flags = ULPCB_FLAG_URG |
ULPCB_FLAG_BARRIER |
ULPCB_FLAG_NO_APPEND |
ULPCB_FLAG_NEED_HDR;
}
}
/*
* Returns true if a connection should send more data to the TCP engine.
*/
static bool should_push(struct sock *sk)
{
struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
struct chtls_dev *cdev = csk->cdev;
struct tcp_sock *tp = tcp_sk(sk);
/*
* If we've released our offload resources there's nothing to do ...
*/
if (!cdev)
return false;
/*
* Push if no work requests are in flight (all credits are available)
* or if Nagle is off; otherwise hold the current TX_DATA and wait to
* accumulate more data.
*/
return csk->wr_credits == csk->wr_max_credits ||
(tp->nonagle & TCP_NAGLE_OFF);
}
/*
* Returns true if a TCP socket is corked.
*/
static bool corked(const struct tcp_sock *tp, int flags)
{
return (flags & MSG_MORE) || (tp->nonagle & TCP_NAGLE_CORK);
}
/*
* Returns true if a send should try to push new data.
*/
static bool send_should_push(struct sock *sk, int flags)
{
return should_push(sk) && !corked(tcp_sk(sk), flags);
}
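/*
 * Illustrative sketch, not part of the driver: should_push(), corked() and
 * send_should_push() folded into one predicate over plain values. struct
 * push_state and should_push_now() are hypothetical names; the cdev check
 * from should_push() is left out.
 */

```c
#include <assert.h>
#include <stdbool.h>

struct push_state {
	unsigned int wr_credits;	/* credits currently available */
	unsigned int wr_max_credits;	/* size of the credit pool */
	bool nagle_off;			/* TCP_NAGLE_OFF set on the socket */
	bool msg_more;			/* caller passed MSG_MORE */
	bool tcp_cork;			/* TCP_NAGLE_CORK set on the socket */
};

static bool should_push_now(const struct push_state *s)
{
	bool idle, is_corked;

	assert(s);
	/* Nothing in flight means every credit is back in the pool. */
	idle = s->wr_credits == s->wr_max_credits;
	is_corked = s->msg_more || s->tcp_cork;
	return (idle || s->nagle_off) && !is_corked;
}
```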
void chtls_tcp_push(struct sock *sk, int flags)
{
struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
int qlen = skb_queue_len(&csk->txq);
if (likely(qlen)) {
struct sk_buff *skb = skb_peek_tail(&csk->txq);
struct tcp_sock *tp = tcp_sk(sk);
mark_urg(tp, flags, skb);
if (!(ULP_SKB_CB(skb)->flags & ULPCB_FLAG_NO_APPEND) &&
corked(tp, flags)) {
ULP_SKB_CB(skb)->flags |= ULPCB_FLAG_HOLD;
return;
}
ULP_SKB_CB(skb)->flags &= ~ULPCB_FLAG_HOLD;
if (qlen == 1 &&
((ULP_SKB_CB(skb)->flags & ULPCB_FLAG_NO_APPEND) ||
should_push(sk)))
chtls_push_frames(csk, 1);
}
}
/*
* Calculate the size for a new send sk_buff. Use the maximum size so we can
* pack lots of data into it, unless we plan to send it immediately, in which
* case we size it more tightly.
*
* Note: we don't bother compensating for MSS < PAGE_SIZE because it doesn't
* arise in normal cases and when it does we are just wasting memory.
*/
static int select_size(struct sock *sk, int io_len, int flags, int len)
{
const int pgbreak = SKB_MAX_HEAD(len);
/*
* If the data wouldn't fit in the main body anyway, put only the
* header in the main body so it can use immediate data and place all
* the payload in page fragments.
*/
if (io_len > pgbreak)
return 0;
/*
* If we will be accumulating payload get a large main body.
*/
if (!send_should_push(sk, flags))
return pgbreak;
return io_len;
}
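/*
 * Illustrative sketch, not part of the driver: the three outcomes of
 * select_size() with the page-break limit and the push decision passed in
 * as plain values. pick_linear_size() is a hypothetical name; in the driver
 * pgbreak comes from SKB_MAX_HEAD() and push_now from send_should_push().
 */

```c
#include <assert.h>
#include <stdbool.h>

static int pick_linear_size(int io_len, int pgbreak, bool push_now)
{
	assert(io_len >= 0 && pgbreak >= 0);
	if (io_len > pgbreak)	/* too big: headers only, payload in frags */
		return 0;
	if (!push_now)		/* accumulating: take the largest main body */
		return pgbreak;
	return io_len;		/* sending right away: size it tightly */
}
```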
void skb_entail(struct sock *sk, struct sk_buff *skb, int flags)
{
struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
struct tcp_sock *tp = tcp_sk(sk);
ULP_SKB_CB(skb)->seq = tp->write_seq;
ULP_SKB_CB(skb)->flags = flags;
__skb_queue_tail(&csk->txq, skb);
sk->sk_wmem_queued += skb->truesize;
if (TCP_PAGE(sk) && TCP_OFF(sk)) {
put_page(TCP_PAGE(sk));
TCP_PAGE(sk) = NULL;
TCP_OFF(sk) = 0;
}
}
static struct sk_buff *get_tx_skb(struct sock *sk, int size)
{
struct sk_buff *skb;
skb = alloc_skb(size + TX_HEADER_LEN, sk->sk_allocation);
if (likely(skb)) {
skb_reserve(skb, TX_HEADER_LEN);
skb_entail(sk, skb, ULPCB_FLAG_NEED_HDR);
skb_reset_transport_header(skb);
}
return skb;
}
static struct sk_buff *get_record_skb(struct sock *sk, int size, bool zcopy)
{
struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
struct sk_buff *skb;
skb = alloc_skb(((zcopy ? 0 : size) + TX_TLSHDR_LEN +
KEY_ON_MEM_SZ + max_ivs_size(sk, size)),
sk->sk_allocation);
if (likely(skb)) {
skb_reserve(skb, (TX_TLSHDR_LEN +
KEY_ON_MEM_SZ + max_ivs_size(sk, size)));
skb_entail(sk, skb, ULPCB_FLAG_NEED_HDR);
skb_reset_transport_header(skb);
ULP_SKB_CB(skb)->ulp.tls.ofld = 1;
ULP_SKB_CB(skb)->ulp.tls.type = csk->tlshws.type;
}
return skb;
}
static void tx_skb_finalize(struct sk_buff *skb)
{
struct ulp_skb_cb *cb = ULP_SKB_CB(skb);
if (!(cb->flags & ULPCB_FLAG_NO_HDR))
cb->flags = ULPCB_FLAG_NEED_HDR;
cb->flags |= ULPCB_FLAG_NO_APPEND;
}
static void push_frames_if_head(struct sock *sk)
{
struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
if (skb_queue_len(&csk->txq) == 1)
chtls_push_frames(csk, 1);
}
static int chtls_skb_copy_to_page_nocache(struct sock *sk,
struct iov_iter *from,
struct sk_buff *skb,
struct page *page,
int off, int copy)
{
int err;
err = skb_do_copy_data_nocache(sk, skb, from, page_address(page) +
off, copy, skb->len);
if (err)
return err;
skb->len += copy;
skb->data_len += copy;
skb->truesize += copy;
sk->sk_wmem_queued += copy;
return 0;
}
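/*
* Read the TLS header to find the content type and data length.
* Returns the record length in host byte order, or -EFAULT if the
* header cannot be copied. The return type is int so that the error
* code is not truncated.
*/
static int tls_header_read(struct tls_hdr *thdr, struct iov_iter *from)
{
if (copy_from_iter(thdr, sizeof(*thdr), from) != sizeof(*thdr))
return -EFAULT;
/* The 16-bit byte swap is symmetric: wire order in, host order out. */
return (__force int)cpu_to_be16(thdr->length);
}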
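/*
 * Illustrative sketch, not part of the driver: decoding the 5-byte TLS
 * record header (type, version, length, all in network byte order) from a
 * plain buffer in user space. struct record_hdr and parse_record_hdr() are
 * hypothetical names and do not mirror the driver's struct tls_hdr layout.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct record_hdr {
	uint8_t type;		/* content type, e.g. 0x17 for application data */
	uint16_t version;	/* protocol version, host byte order */
	uint16_t length;	/* payload length, host byte order */
};

/* Returns 0 on success, -1 if fewer than 5 bytes are available. */
static int parse_record_hdr(const uint8_t *buf, size_t len,
			    struct record_hdr *out)
{
	assert(buf && out);
	if (len < 5)
		return -1;
	out->type = buf[0];
	out->version = (uint16_t)((buf[1] << 8) | buf[2]);
	out->length = (uint16_t)((buf[3] << 8) | buf[4]);
	return 0;
}
```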
int chtls_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
{
struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
struct chtls_dev *cdev = csk->cdev;
struct tcp_sock *tp = tcp_sk(sk);
struct sk_buff *skb;
int mss, flags, err;
int recordsz = 0;
int copied = 0;
int hdrlen = 0;
long timeo;
lock_sock(sk);
flags = msg->msg_flags;
timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT);
if (!sk_in_state(sk, TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)) {
err = sk_stream_wait_connect(sk, &timeo);
if (err)
goto out_err;
}
sk_clear_bit(SOCKWQ_ASYNC_NOSPACE, sk);
err = -EPIPE;
if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))
goto out_err;
mss = csk->mss;
csk_set_flag(csk, CSK_TX_MORE_DATA);
while (msg_data_left(msg)) {
int copy = 0;
skb = skb_peek_tail(&csk->txq);
if (skb) {
copy = mss - skb->len;
skb->ip_summed = CHECKSUM_UNNECESSARY;
}
if (is_tls_tx(csk) && !csk->tlshws.txleft) {
struct tls_hdr hdr;
recordsz = tls_header_read(&hdr, &msg->msg_iter);
size -= TLS_HEADER_LENGTH;
hdrlen += TLS_HEADER_LENGTH;
csk->tlshws.txleft = recordsz;
csk->tlshws.type = hdr.type;
if (skb)
ULP_SKB_CB(skb)->ulp.tls.type = hdr.type;
}
if (!skb || (ULP_SKB_CB(skb)->flags & ULPCB_FLAG_NO_APPEND) ||
copy <= 0) {
new_buf:
if (skb) {
tx_skb_finalize(skb);
push_frames_if_head(sk);
}
if (is_tls_tx(csk)) {
skb = get_record_skb(sk,
select_size(sk,
recordsz,
flags,
TX_TLSHDR_LEN),
false);
} else {
skb = get_tx_skb(sk,
select_size(sk, size, flags,
TX_HEADER_LEN));
}
if (unlikely(!skb))
goto wait_for_memory;
skb->ip_summed = CHECKSUM_UNNECESSARY;
copy = mss;
}
if (copy > size)
copy = size;
if (skb_tailroom(skb) > 0) {
copy = min(copy, skb_tailroom(skb));
if (is_tls_tx(csk))
copy = min_t(int, copy, csk->tlshws.txleft);
err = skb_add_data_nocache(sk, skb,
&msg->msg_iter, copy);
if (err)
goto do_fault;
} else {
int i = skb_shinfo(skb)->nr_frags;
struct page *page = TCP_PAGE(sk);
int pg_size = PAGE_SIZE;
int off = TCP_OFF(sk);
bool merge;
if (page)
pg_size <<= compound_order(page);
if (off < pg_size &&
skb_can_coalesce(skb, i, page, off)) {
merge = 1;
goto copy;
}
merge = 0;
if (i == (is_tls_tx(csk) ? (MAX_SKB_FRAGS - 1) :
MAX_SKB_FRAGS))
goto new_buf;
if (page && off == pg_size) {
put_page(page);
TCP_PAGE(sk) = page = NULL;
pg_size = PAGE_SIZE;
}
if (!page) {
gfp_t gfp = sk->sk_allocation;
int order = cdev->send_page_order;
if (order) {
page = alloc_pages(gfp | __GFP_COMP |
__GFP_NOWARN |
__GFP_NORETRY,
order);
if (page)
pg_size <<=
compound_order(page);
}
if (!page) {
page = alloc_page(gfp);
pg_size = PAGE_SIZE;
}
if (!page)
goto wait_for_memory;
off = 0;
}
copy:
if (copy > pg_size - off)
copy = pg_size - off;
if (is_tls_tx(csk))
copy = min_t(int, copy, csk->tlshws.txleft);
err = chtls_skb_copy_to_page_nocache(sk, &msg->msg_iter,
skb, page,
off, copy);
if (unlikely(err)) {
if (!TCP_PAGE(sk)) {
TCP_PAGE(sk) = page;
TCP_OFF(sk) = 0;
}
goto do_fault;
}
/* Update the skb. */
if (merge) {
skb_shinfo(skb)->frags[i - 1].size += copy;
} else {
skb_fill_page_desc(skb, i, page, off, copy);
if (off + copy < pg_size) {
/* space left keep page */
get_page(page);
TCP_PAGE(sk) = page;
} else {
TCP_PAGE(sk) = NULL;
}
}
TCP_OFF(sk) = off + copy;
}
if (unlikely(skb->len == mss))
tx_skb_finalize(skb);
tp->write_seq += copy;
copied += copy;
size -= copy;
if (is_tls_tx(csk))
csk->tlshws.txleft -= copy;
if (corked(tp, flags) &&
(sk_stream_wspace(sk) < sk_stream_min_wspace(sk)))
ULP_SKB_CB(skb)->flags |= ULPCB_FLAG_NO_APPEND;
if (size == 0)
goto out;
if (ULP_SKB_CB(skb)->flags & ULPCB_FLAG_NO_APPEND)
push_frames_if_head(sk);
continue;
wait_for_memory:
err = sk_stream_wait_memory(sk, &timeo);
if (err)
goto do_error;
}
out:
csk_reset_flag(csk, CSK_TX_MORE_DATA);
if (copied)
chtls_tcp_push(sk, flags);
done:
release_sock(sk);
return copied + hdrlen;
do_fault:
if (!skb->len) {
__skb_unlink(skb, &csk->txq);
sk->sk_wmem_queued -= skb->truesize;
__kfree_skb(skb);
}
do_error:
if (copied)
goto out;
out_err:
if (csk_conn_inline(csk))
csk_reset_flag(csk, CSK_TX_MORE_DATA);
copied = sk_stream_error(sk, flags, err);
goto done;
}
int chtls_sendpage(struct sock *sk, struct page *page,
int offset, size_t size, int flags)
{
struct chtls_sock *csk;
int mss, err, copied;
struct tcp_sock *tp;
long timeo;
tp = tcp_sk(sk);
copied = 0;
csk = rcu_dereference_sk_user_data(sk);
/* Take the socket lock; every exit path ends in release_sock() at done. */
lock_sock(sk);
timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT);
err = sk_stream_wait_connect(sk, &timeo);
if (!sk_in_state(sk, TCPF_ESTABLISHED | TCPF_CLOSE_WAIT) &&
err != 0)
goto out_err;
mss = csk->mss;
csk_set_flag(csk, CSK_TX_MORE_DATA);
while (size > 0) {
struct sk_buff *skb = skb_peek_tail(&csk->txq);
int copy, i;
/* Compute copy only after the NULL check; the tx queue may be empty. */
if (!skb || (ULP_SKB_CB(skb)->flags & ULPCB_FLAG_NO_APPEND) ||
(copy = mss - skb->len) <= 0) {
new_buf:
if (is_tls_tx(csk)) {
skb = get_record_skb(sk,
select_size(sk, size,
flags,
TX_TLSHDR_LEN),
true);
} else {
skb = get_tx_skb(sk, 0);
}
if (!skb)
goto do_error;
copy = mss;
}
if (copy > size)
copy = size;
i = skb_shinfo(skb)->nr_frags;
if (skb_can_coalesce(skb, i, page, offset)) {
skb_shinfo(skb)->frags[i - 1].size += copy;
} else if (i < MAX_SKB_FRAGS) {
get_page(page);
skb_fill_page_desc(skb, i, page, offset, copy);
} else {
tx_skb_finalize(skb);
push_frames_if_head(sk);
goto new_buf;
}
skb->len += copy;
if (skb->len == mss)
tx_skb_finalize(skb);
skb->data_len += copy;
skb->truesize += copy;
sk->sk_wmem_queued += copy;
tp->write_seq += copy;
copied += copy;
offset += copy;
size -= copy;
if (corked(tp, flags) &&
(sk_stream_wspace(sk) < sk_stream_min_wspace(sk)))
ULP_SKB_CB(skb)->flags |= ULPCB_FLAG_NO_APPEND;
if (!size)
break;
if (unlikely(ULP_SKB_CB(skb)->flags & ULPCB_FLAG_NO_APPEND))
push_frames_if_head(sk);
continue;
}
out:
csk_reset_flag(csk, CSK_TX_MORE_DATA);
if (copied)
chtls_tcp_push(sk, flags);
done:
release_sock(sk);
return copied;
do_error:
if (copied)
goto out;
out_err:
if (csk_conn_inline(csk))
csk_reset_flag(csk, CSK_TX_MORE_DATA);
copied = sk_stream_error(sk, flags, err);
goto done;
}
static void chtls_select_window(struct sock *sk)
{
struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
struct tcp_sock *tp = tcp_sk(sk);
unsigned int wnd = tp->rcv_wnd;
wnd = max_t(unsigned int, wnd, tcp_full_space(sk));
wnd = max_t(unsigned int, MIN_RCV_WND, wnd);
if (wnd > MAX_RCV_WND)
wnd = MAX_RCV_WND;
/*
* Check if we need to grow the receive window in response to an increase in
* the socket's receive buffer size. Some applications increase the buffer
* size dynamically and rely on the window to grow accordingly.
*/
if (wnd > tp->rcv_wnd) {
tp->rcv_wup -= wnd - tp->rcv_wnd;
tp->rcv_wnd = wnd;
/* Mark the receive window as updated */
csk_reset_flag(csk, CSK_UPDATE_RCV_WND);
}
}
/*
* Send RX credits through an RX_DATA_ACK CPL message. We are permitted
* to return without sending the message in case we cannot allocate
* an sk_buff. Returns the number of credits sent.
*/
static u32 send_rx_credits(struct chtls_sock *csk, u32 credits)
{
struct cpl_rx_data_ack *req;
struct sk_buff *skb;
skb = alloc_skb(sizeof(*req), GFP_ATOMIC);
if (!skb)
return 0;
__skb_put(skb, sizeof(*req));
req = (struct cpl_rx_data_ack *)skb->head;
set_wr_txq(skb, CPL_PRIORITY_ACK, csk->port_id);
INIT_TP_WR(req, csk->tid);
OPCODE_TID(req) = cpu_to_be32(MK_OPCODE_TID(CPL_RX_DATA_ACK,
csk->tid));
req->credit_dack = cpu_to_be32(RX_CREDITS_V(credits) |
RX_FORCE_ACK_F);
cxgb4_ofld_send(csk->cdev->ports[csk->port_id], skb);
return credits;
}
#define CREDIT_RETURN_STATE (TCPF_ESTABLISHED | \
TCPF_FIN_WAIT1 | \
TCPF_FIN_WAIT2)
/*
* Called after some received data has been read. It returns RX credits
* to the HW for the amount of data processed.
*/
static void chtls_cleanup_rbuf(struct sock *sk, int copied)
{
struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
struct tcp_sock *tp;
int must_send;
u32 credits;
u32 thres;
thres = 15 * 1024;
if (!sk_in_state(sk, CREDIT_RETURN_STATE))
return;
chtls_select_window(sk);
tp = tcp_sk(sk);
credits = tp->copied_seq - tp->rcv_wup;
if (unlikely(!credits))
return;
/*
* For coalescing to work effectively ensure the receive window has
* at least 16KB left.
*/
must_send = credits + 16384 >= tp->rcv_wnd;
if (must_send || credits >= thres)
tp->rcv_wup += send_rx_credits(csk, credits);
}
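/*
 * Illustrative sketch, not part of the driver: the decision made by
 * chtls_cleanup_rbuf() on whether to return RX credits, using the same
 * 15KB threshold and 16KB window margin as the code above.
 * must_return_credits() is a hypothetical name; the socket-state check and
 * the window update are left out.
 */

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static bool must_return_credits(uint32_t credits, uint32_t rcv_wnd)
{
	const uint32_t thres = 15 * 1024;

	assert(rcv_wnd > 0);
	if (!credits)		/* nothing consumed since the last update */
		return false;
	/* Return credits if the window would otherwise drop below 16KB ... */
	if (credits + 16384 >= rcv_wnd)
		return true;
	/* ... or if enough has accumulated to make the message worthwhile. */
	return credits >= thres;
}
```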
static int chtls_pt_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
int nonblock, int flags, int *addr_len)
{
struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
struct net_device *dev = csk->egress_dev;
struct chtls_hws *hws = &csk->tlshws;
struct tcp_sock *tp = tcp_sk(sk);
struct adapter *adap;
unsigned long avail;
int buffers_freed;
int copied = 0;
int request;
int target;
long timeo;
adap = netdev2adap(dev);
buffers_freed = 0;
timeo = sock_rcvtimeo(sk, nonblock);
target = sock_rcvlowat(sk, flags & MSG_WAITALL, len);
request = len;
if (unlikely(csk_flag(sk, CSK_UPDATE_RCV_WND)))
chtls_cleanup_rbuf(sk, copied);
do {
struct sk_buff *skb;
u32 offset = 0;
if (unlikely(tp->urg_data &&
tp->urg_seq == tp->copied_seq)) {
if (copied)
break;
if (signal_pending(current)) {
copied = timeo ? sock_intr_errno(timeo) :
-EAGAIN;
break;
}
}
skb = skb_peek(&sk->sk_receive_queue);
if (skb)
goto found_ok_skb;
if (csk->wr_credits &&
skb_queue_len(&csk->txq) &&
chtls_push_frames(csk, csk->wr_credits ==
csk->wr_max_credits))
sk->sk_write_space(sk);
if (copied >= target && !sk->sk_backlog.tail)
break;
if (copied) {
if (sk->sk_err || sk->sk_state == TCP_CLOSE ||
(sk->sk_shutdown & RCV_SHUTDOWN) ||
signal_pending(current))
break;
if (!timeo)
break;
} else {
if (sock_flag(sk, SOCK_DONE))
break;
if (sk->sk_err) {
copied = sock_error(sk);
break;
}
if (sk->sk_shutdown & RCV_SHUTDOWN)
break;
if (sk->sk_state == TCP_CLOSE) {
copied = -ENOTCONN;
break;
}
if (!timeo) {
copied = -EAGAIN;
break;
}
if (signal_pending(current)) {
copied = sock_intr_errno(timeo);
break;
}
}
if (sk->sk_backlog.tail) {
release_sock(sk);
lock_sock(sk);
chtls_cleanup_rbuf(sk, copied);
continue;
}
if (copied >= target)
break;
chtls_cleanup_rbuf(sk, copied);
sk_wait_data(sk, &timeo, NULL);
continue;
found_ok_skb:
if (!skb->len) {
skb_dst_set(skb, NULL);
__skb_unlink(skb, &sk->sk_receive_queue);
kfree_skb(skb);
if (!copied && !timeo) {
copied = -EAGAIN;
break;
}
if (copied < target) {
release_sock(sk);
lock_sock(sk);
continue;
}
break;
}
offset = hws->copied_seq;
avail = skb->len - offset;
if (len < avail)
avail = len;
if (unlikely(tp->urg_data)) {
u32 urg_offset = tp->urg_seq - tp->copied_seq;
if (urg_offset < avail) {
if (urg_offset) {
avail = urg_offset;
} else if (!sock_flag(sk, SOCK_URGINLINE)) {
/* First byte is urgent, skip */
tp->copied_seq++;
offset++;
avail--;
if (!avail)
goto skip_copy;
}
}
}
if (hws->rstate == TLS_RCV_ST_READ_BODY) {
if (skb_copy_datagram_msg(skb, offset,
msg, avail)) {
if (!copied) {
copied = -EFAULT;
break;
}
}
} else {
struct tlsrx_cmp_hdr *tls_hdr_pkt =
(struct tlsrx_cmp_hdr *)skb->data;
/* On a hardware-reported error, overwrite the type with 0x7F. */
if ((tls_hdr_pkt->res_to_mac_error &
TLSRX_HDR_PKT_ERROR_M))
tls_hdr_pkt->type = 0x7F;
/* CMP pld len is for recv seq */
hws->rcvpld = skb->hdr_len;
if (skb_copy_datagram_msg(skb, offset, msg, avail)) {
if (!copied) {
copied = -EFAULT;
break;
}
}
}
copied += avail;
len -= avail;
hws->copied_seq += avail;
skip_copy:
if (tp->urg_data && after(tp->copied_seq, tp->urg_seq))
tp->urg_data = 0;
if (hws->rstate == TLS_RCV_ST_READ_BODY &&
(avail + offset) >= skb->len) {
if (likely(skb))
chtls_free_skb(sk, skb);
buffers_freed++;
hws->rstate = TLS_RCV_ST_READ_HEADER;
atomic_inc(&adap->chcr_stats.tls_pdu_rx);
tp->copied_seq += hws->rcvpld;
hws->copied_seq = 0;
if (copied >= target &&
!skb_peek(&sk->sk_receive_queue))
break;
} else {
if (likely(skb)) {
if (ULP_SKB_CB(skb)->flags &
ULPCB_FLAG_TLS_ND)
hws->rstate =
TLS_RCV_ST_READ_HEADER;
else
hws->rstate =
TLS_RCV_ST_READ_BODY;
chtls_free_skb(sk, skb);
}
buffers_freed++;
tp->copied_seq += avail;
hws->copied_seq = 0;
}
} while (len > 0);
if (buffers_freed)
chtls_cleanup_rbuf(sk, copied);
release_sock(sk);
return copied;
}
/*
* Peek at data in a socket's receive buffer.
*/
static int peekmsg(struct sock *sk, struct msghdr *msg,
size_t len, int nonblock, int flags)
{
struct tcp_sock *tp = tcp_sk(sk);
u32 peek_seq, offset;
struct sk_buff *skb;
int copied = 0;
size_t avail; /* amount of available data in current skb */
long timeo;
lock_sock(sk);
timeo = sock_rcvtimeo(sk, nonblock);
peek_seq = tp->copied_seq;
do {
if (unlikely(tp->urg_data && tp->urg_seq == peek_seq)) {
if (copied)
break;
if (signal_pending(current)) {
copied = timeo ? sock_intr_errno(timeo) :
-EAGAIN;
break;
}
}
skb_queue_walk(&sk->sk_receive_queue, skb) {
offset = peek_seq - ULP_SKB_CB(skb)->seq;
if (offset < skb->len)
goto found_ok_skb;
}
/* empty receive queue */
if (copied)
break;
if (sock_flag(sk, SOCK_DONE))
break;
if (sk->sk_err) {
copied = sock_error(sk);
break;
}
if (sk->sk_shutdown & RCV_SHUTDOWN)
break;
if (sk->sk_state == TCP_CLOSE) {
copied = -ENOTCONN;
break;
}
if (!timeo) {
copied = -EAGAIN;
break;
}
if (signal_pending(current)) {
copied = sock_intr_errno(timeo);
break;
}
if (sk->sk_backlog.tail) {
/* Do not sleep, just process backlog. */
release_sock(sk);
lock_sock(sk);
} else {
sk_wait_data(sk, &timeo, NULL);
}
if (unlikely(peek_seq != tp->copied_seq)) {
if (net_ratelimit())
pr_info("TCP(%s:%d), race in MSG_PEEK.\n",
current->comm, current->pid);
peek_seq = tp->copied_seq;
}
continue;
found_ok_skb:
avail = skb->len - offset;
if (len < avail)
avail = len;
/*
* Do we have urgent data here? We need to skip over the
* urgent byte.
*/
if (unlikely(tp->urg_data)) {
u32 urg_offset = tp->urg_seq - peek_seq;
if (urg_offset < avail) {
/*
* The amount of data we are preparing to copy
* contains urgent data.
*/
if (!urg_offset) { /* First byte is urgent */
if (!sock_flag(sk, SOCK_URGINLINE)) {
peek_seq++;
offset++;
avail--;
}
if (!avail)
continue;
} else {
/* stop short of the urgent data */
avail = urg_offset;
}
}
}
/*
* If MSG_TRUNC is specified the data is discarded.
*/
if (likely(!(flags & MSG_TRUNC))) {
/* Copy only what this skb can supply, not the full request. */
if (skb_copy_datagram_msg(skb, offset, msg, avail)) {
if (!copied) {
copied = -EFAULT;
break;
}
}
}
peek_seq += avail;
copied += avail;
len -= avail;
} while (len > 0);
release_sock(sk);
return copied;
}
int chtls_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
int nonblock, int flags, int *addr_len)
{
struct tcp_sock *tp = tcp_sk(sk);
struct chtls_sock *csk;
struct chtls_hws *hws;
unsigned long avail; /* amount of available data in current skb */
int buffers_freed;
int copied = 0;
int request;
long timeo;
int target; /* Read at least this many bytes */
buffers_freed = 0;
if (unlikely(flags & MSG_OOB))
return tcp_prot.recvmsg(sk, msg, len, nonblock, flags,
addr_len);
if (unlikely(flags & MSG_PEEK))
return peekmsg(sk, msg, len, nonblock, flags);
if (sk_can_busy_loop(sk) &&
skb_queue_empty(&sk->sk_receive_queue) &&
sk->sk_state == TCP_ESTABLISHED)
sk_busy_loop(sk, nonblock);
lock_sock(sk);
csk = rcu_dereference_sk_user_data(sk);
hws = &csk->tlshws;
if (is_tls_rx(csk))
return chtls_pt_recvmsg(sk, msg, len, nonblock,
flags, addr_len);
timeo = sock_rcvtimeo(sk, nonblock);
target = sock_rcvlowat(sk, flags & MSG_WAITALL, len);
request = len;
if (unlikely(csk_flag(sk, CSK_UPDATE_RCV_WND)))
chtls_cleanup_rbuf(sk, copied);
do {
struct sk_buff *skb;
u32 offset;
if (unlikely(tp->urg_data && tp->urg_seq == tp->copied_seq)) {
if (copied)
break;
if (signal_pending(current)) {
copied = timeo ? sock_intr_errno(timeo) :
-EAGAIN;
break;
}
}
skb = skb_peek(&sk->sk_receive_queue);
if (skb)
goto found_ok_skb;
if (csk->wr_credits &&
skb_queue_len(&csk->txq) &&
chtls_push_frames(csk, csk->wr_credits ==
csk->wr_max_credits))
sk->sk_write_space(sk);
if (copied >= target && !sk->sk_backlog.tail)
break;
if (copied) {
if (sk->sk_err || sk->sk_state == TCP_CLOSE ||
(sk->sk_shutdown & RCV_SHUTDOWN) ||
signal_pending(current))
break;
} else {
if (sock_flag(sk, SOCK_DONE))
break;
if (sk->sk_err) {
copied = sock_error(sk);
break;
}
if (sk->sk_shutdown & RCV_SHUTDOWN)
break;
if (sk->sk_state == TCP_CLOSE) {
copied = -ENOTCONN;
break;
}
if (!timeo) {
copied = -EAGAIN;
break;
}
if (signal_pending(current)) {
copied = sock_intr_errno(timeo);
break;
}
}
if (sk->sk_backlog.tail) {
release_sock(sk);
lock_sock(sk);
chtls_cleanup_rbuf(sk, copied);
continue;
}
if (copied >= target)
break;
chtls_cleanup_rbuf(sk, copied);
sk_wait_data(sk, &timeo, NULL);
continue;
found_ok_skb:
if (!skb->len) {
chtls_kfree_skb(sk, skb);
if (!copied && !timeo) {
copied = -EAGAIN;
break;
}
if (copied < target)
continue;
break;
}
offset = tp->copied_seq - ULP_SKB_CB(skb)->seq;
avail = skb->len - offset;
if (len < avail)
avail = len;
if (unlikely(tp->urg_data)) {
u32 urg_offset = tp->urg_seq - tp->copied_seq;
if (urg_offset < avail) {
if (urg_offset) {
avail = urg_offset;
} else if (!sock_flag(sk, SOCK_URGINLINE)) {
tp->copied_seq++;
offset++;
avail--;
if (!avail)
goto skip_copy;
}
}
}
if (likely(!(flags & MSG_TRUNC))) {
if (skb_copy_datagram_msg(skb, offset,
msg, avail)) {
if (!copied) {
copied = -EFAULT;
break;
}
}
}
tp->copied_seq += avail;
copied += avail;
len -= avail;
skip_copy:
if (tp->urg_data && after(tp->copied_seq, tp->urg_seq))
tp->urg_data = 0;
if (avail + offset >= skb->len) {
if (likely(skb))
chtls_free_skb(sk, skb);
buffers_freed++;
if (copied >= target &&
!skb_peek(&sk->sk_receive_queue))
break;
}
} while (len > 0);
if (buffers_freed)
chtls_cleanup_rbuf(sk, copied);
release_sock(sk);
return copied;
}
/*
* Copyright (c) 2018 Chelsio Communications, Inc.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*
* Written by: Atul Gupta (atul.gupta@chelsio.com)
*/
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/skbuff.h>
#include <linux/socket.h>
#include <linux/hash.h>
#include <linux/in.h>
#include <linux/net.h>
#include <linux/ip.h>
#include <linux/tcp.h>
#include <net/tcp.h>
#include <net/tls.h>
#include "chtls.h"
#include "chtls_cm.h"
#define DRV_NAME "chtls"
/*
* chtls device management
* maintains a list of the chtls devices
*/
static LIST_HEAD(cdev_list);
static DEFINE_MUTEX(cdev_mutex);
static DEFINE_MUTEX(cdev_list_lock);
static DEFINE_MUTEX(notify_mutex);
static RAW_NOTIFIER_HEAD(listen_notify_list);
static struct proto chtls_cpl_prot;
struct request_sock_ops chtls_rsk_ops;
static uint send_page_order = (14 - PAGE_SHIFT < 0) ? 0 : 14 - PAGE_SHIFT;
static void register_listen_notifier(struct notifier_block *nb)
{
mutex_lock(&notify_mutex);
raw_notifier_chain_register(&listen_notify_list, nb);
mutex_unlock(&notify_mutex);
}
static void unregister_listen_notifier(struct notifier_block *nb)
{
mutex_lock(&notify_mutex);
raw_notifier_chain_unregister(&listen_notify_list, nb);
mutex_unlock(&notify_mutex);
}
static int listen_notify_handler(struct notifier_block *this,
unsigned long event, void *data)
{
struct chtls_dev *cdev;
struct sock *sk;
int ret;
sk = data;
ret = NOTIFY_DONE;
switch (event) {
case CHTLS_LISTEN_START:
case CHTLS_LISTEN_STOP:
mutex_lock(&cdev_list_lock);
list_for_each_entry(cdev, &cdev_list, list) {
if (event == CHTLS_LISTEN_START)
ret = chtls_listen_start(cdev, sk);
else
chtls_listen_stop(cdev, sk);
}
mutex_unlock(&cdev_list_lock);
break;
}
return ret;
}
static struct notifier_block listen_notifier = {
.notifier_call = listen_notify_handler
};
static int listen_backlog_rcv(struct sock *sk, struct sk_buff *skb)
{
if (likely(skb_transport_header(skb) != skb_network_header(skb)))
return tcp_v4_do_rcv(sk, skb);
BLOG_SKB_CB(skb)->backlog_rcv(sk, skb);
return 0;
}
static int chtls_start_listen(struct sock *sk)
{
int err;
if (sk->sk_protocol != IPPROTO_TCP)
return -EPROTONOSUPPORT;
if (sk->sk_family == PF_INET &&
LOOPBACK(inet_sk(sk)->inet_rcv_saddr))
return -EADDRNOTAVAIL;
sk->sk_backlog_rcv = listen_backlog_rcv;
mutex_lock(&notify_mutex);
err = raw_notifier_call_chain(&listen_notify_list,
CHTLS_LISTEN_START, sk);
mutex_unlock(&notify_mutex);
return err;
}
static void chtls_stop_listen(struct sock *sk)
{
if (sk->sk_protocol != IPPROTO_TCP)
return;
mutex_lock(&notify_mutex);
raw_notifier_call_chain(&listen_notify_list,
CHTLS_LISTEN_STOP, sk);
mutex_unlock(&notify_mutex);
}
static int chtls_inline_feature(struct tls_device *dev)
{
struct net_device *netdev;
struct chtls_dev *cdev;
int i;
cdev = to_chtls_dev(dev);
for (i = 0; i < cdev->lldi->nports; i++) {
netdev = cdev->ports[i];
if (netdev->features & NETIF_F_HW_TLS_RECORD)
return 1;
}
return 0;
}
static int chtls_create_hash(struct tls_device *dev, struct sock *sk)
{
if (sk->sk_state == TCP_LISTEN)
return chtls_start_listen(sk);
return 0;
}
static void chtls_destroy_hash(struct tls_device *dev, struct sock *sk)
{
if (sk->sk_state == TCP_LISTEN)
chtls_stop_listen(sk);
}
static void chtls_register_dev(struct chtls_dev *cdev)
{
struct tls_device *tlsdev = &cdev->tlsdev;
strlcpy(tlsdev->name, "chtls", TLS_DEVICE_NAME_MAX);
strlcat(tlsdev->name, cdev->lldi->ports[0]->name,
TLS_DEVICE_NAME_MAX);
tlsdev->feature = chtls_inline_feature;
tlsdev->hash = chtls_create_hash;
tlsdev->unhash = chtls_destroy_hash;
tls_register_device(&cdev->tlsdev);
}
static void chtls_unregister_dev(struct chtls_dev *cdev)
{
tls_unregister_device(&cdev->tlsdev);
}
static void process_deferq(struct work_struct *task_param)
{
struct chtls_dev *cdev = container_of(task_param,
struct chtls_dev, deferq_task);
struct sk_buff *skb;
spin_lock_bh(&cdev->deferq.lock);
while ((skb = __skb_dequeue(&cdev->deferq)) != NULL) {
spin_unlock_bh(&cdev->deferq.lock);
DEFERRED_SKB_CB(skb)->handler(cdev, skb);
spin_lock_bh(&cdev->deferq.lock);
}
spin_unlock_bh(&cdev->deferq.lock);
}
static int chtls_get_skb(struct chtls_dev *cdev)
{
cdev->askb = alloc_skb(sizeof(struct tcphdr), GFP_KERNEL);
if (!cdev->askb)
return -ENOMEM;
skb_put(cdev->askb, sizeof(struct tcphdr));
skb_reset_transport_header(cdev->askb);
memset(cdev->askb->data, 0, cdev->askb->len);
return 0;
}
static void *chtls_uld_add(const struct cxgb4_lld_info *info)
{
struct cxgb4_lld_info *lldi;
struct chtls_dev *cdev;
int i, j;
cdev = kzalloc(sizeof(*cdev) + info->nports *
(sizeof(struct net_device *)), GFP_KERNEL);
if (!cdev)
goto out;
lldi = kzalloc(sizeof(*lldi), GFP_KERNEL);
if (!lldi)
goto out_lldi;
if (chtls_get_skb(cdev))
goto out_skb;
*lldi = *info;
cdev->lldi = lldi;
cdev->pdev = lldi->pdev;
cdev->tids = lldi->tids;
cdev->ports = lldi->ports;
cdev->mtus = lldi->mtus;
cdev->pfvf = FW_VIID_PFN_G(cxgb4_port_viid(lldi->ports[0]))
<< FW_VIID_PFN_S;
for (i = 0; i < (1 << RSPQ_HASH_BITS); i++) {
unsigned int size = 64 - sizeof(struct rsp_ctrl) - 8;
cdev->rspq_skb_cache[i] = __alloc_skb(size,
gfp_any(), 0,
lldi->nodeid);
if (unlikely(!cdev->rspq_skb_cache[i]))
goto out_rspq_skb;
}
idr_init(&cdev->hwtid_idr);
INIT_WORK(&cdev->deferq_task, process_deferq);
spin_lock_init(&cdev->listen_lock);
spin_lock_init(&cdev->idr_lock);
cdev->send_page_order = min_t(uint, get_order(32768),
send_page_order);
if (lldi->vr->key.size)
if (chtls_init_kmap(cdev, lldi))
goto out_rspq_skb;
mutex_lock(&cdev_mutex);
list_add_tail(&cdev->list, &cdev_list);
mutex_unlock(&cdev_mutex);
return cdev;
out_rspq_skb:
for (j = 0; j < i; j++)
kfree_skb(cdev->rspq_skb_cache[j]);
kfree_skb(cdev->askb);
out_skb:
kfree(lldi);
out_lldi:
kfree(cdev);
out:
return NULL;
}
static void chtls_free_uld(struct chtls_dev *cdev)
{
int i;
chtls_unregister_dev(cdev);
kvfree(cdev->kmap.addr);
idr_destroy(&cdev->hwtid_idr);
for (i = 0; i < (1 << RSPQ_HASH_BITS); i++)
kfree_skb(cdev->rspq_skb_cache[i]);
kfree(cdev->lldi);
if (cdev->askb)
kfree_skb(cdev->askb);
kfree(cdev);
}
static void chtls_free_all_uld(void)
{
struct chtls_dev *cdev, *tmp;
mutex_lock(&cdev_mutex);
list_for_each_entry_safe(cdev, tmp, &cdev_list, list)
chtls_free_uld(cdev);
mutex_unlock(&cdev_mutex);
}
static int chtls_uld_state_change(void *handle, enum cxgb4_state new_state)
{
struct chtls_dev *cdev = handle;
switch (new_state) {
case CXGB4_STATE_UP:
chtls_register_dev(cdev);
break;
case CXGB4_STATE_DOWN:
break;
case CXGB4_STATE_START_RECOVERY:
break;
case CXGB4_STATE_DETACH:
mutex_lock(&cdev_mutex);
list_del(&cdev->list);
mutex_unlock(&cdev_mutex);
chtls_free_uld(cdev);
break;
default:
break;
}
return 0;
}
static struct sk_buff *copy_gl_to_skb_pkt(const struct pkt_gl *gl,
const __be64 *rsp,
u32 pktshift)
{
struct sk_buff *skb;
/* Allocate space for cpl_pass_accept_req, which will be synthesized by
* the driver. Once the driver synthesizes cpl_pass_accept_req, the skb
* goes through the regular cpl_pass_accept_req processing in TOM.
*/
skb = alloc_skb(gl->tot_len + sizeof(struct cpl_pass_accept_req)
- pktshift, GFP_ATOMIC);
if (unlikely(!skb))
return NULL;
__skb_put(skb, gl->tot_len + sizeof(struct cpl_pass_accept_req)
- pktshift);
/* For now we will copy cpl_rx_pkt in the skb */
skb_copy_to_linear_data(skb, rsp, sizeof(struct cpl_rx_pkt));
skb_copy_to_linear_data_offset(skb, sizeof(struct cpl_pass_accept_req)
, gl->va + pktshift,
gl->tot_len - pktshift);
return skb;
}
static int chtls_recv_packet(struct chtls_dev *cdev,
const struct pkt_gl *gl, const __be64 *rsp)
{
unsigned int opcode = *(u8 *)rsp;
struct sk_buff *skb;
int ret;
skb = copy_gl_to_skb_pkt(gl, rsp, cdev->lldi->sge_pktshift);
if (!skb)
return -ENOMEM;
ret = chtls_handlers[opcode](cdev, skb);
if (ret & CPL_RET_BUF_DONE)
kfree_skb(skb);
return 0;
}
static int chtls_recv_rsp(struct chtls_dev *cdev, const __be64 *rsp)
{
unsigned long rspq_bin;
unsigned int opcode;
struct sk_buff *skb;
unsigned int len;
int ret;
len = 64 - sizeof(struct rsp_ctrl) - 8;
opcode = *(u8 *)rsp;
rspq_bin = hash_ptr((void *)rsp, RSPQ_HASH_BITS);
skb = cdev->rspq_skb_cache[rspq_bin];
if (skb && !skb_is_nonlinear(skb) &&
!skb_shared(skb) && !skb_cloned(skb)) {
refcount_inc(&skb->users);
if (refcount_read(&skb->users) == 2) {
__skb_trim(skb, 0);
if (skb_tailroom(skb) >= len)
goto copy_out;
}
refcount_dec(&skb->users);
}
skb = alloc_skb(len, GFP_ATOMIC);
if (unlikely(!skb))
return -ENOMEM;
copy_out:
__skb_put(skb, len);
skb_copy_to_linear_data(skb, rsp, len);
skb_reset_network_header(skb);
skb_reset_transport_header(skb);
ret = chtls_handlers[opcode](cdev, skb);
if (ret & CPL_RET_BUF_DONE)
kfree_skb(skb);
return 0;
}
static void chtls_recv(struct chtls_dev *cdev,
struct sk_buff **skbs, const __be64 *rsp)
{
struct sk_buff *skb = *skbs;
unsigned int opcode;
int ret;
opcode = *(u8 *)rsp;
__skb_push(skb, sizeof(struct rss_header));
skb_copy_to_linear_data(skb, rsp, sizeof(struct rss_header));
ret = chtls_handlers[opcode](cdev, skb);
if (ret & CPL_RET_BUF_DONE)
kfree_skb(skb);
}
static int chtls_uld_rx_handler(void *handle, const __be64 *rsp,
const struct pkt_gl *gl)
{
struct chtls_dev *cdev = handle;
unsigned int opcode;
struct sk_buff *skb;
opcode = *(u8 *)rsp;
if (unlikely(opcode == CPL_RX_PKT)) {
if (chtls_recv_packet(cdev, gl, rsp) < 0)
goto nomem;
return 0;
}
if (!gl)
return chtls_recv_rsp(cdev, rsp);
#define RX_PULL_LEN 128
skb = cxgb4_pktgl_to_skb(gl, RX_PULL_LEN, RX_PULL_LEN);
if (unlikely(!skb))
goto nomem;
chtls_recv(cdev, &skb, rsp);
return 0;
nomem:
return -ENOMEM;
}
static int do_chtls_getsockopt(struct sock *sk, char __user *optval,
int __user *optlen)
{
struct tls_crypto_info crypto_info = { 0 };
crypto_info.version = TLS_1_2_VERSION;
if (copy_to_user(optval, &crypto_info, sizeof(struct tls_crypto_info)))
return -EFAULT;
return 0;
}
static int chtls_getsockopt(struct sock *sk, int level, int optname,
char __user *optval, int __user *optlen)
{
struct tls_context *ctx = tls_get_ctx(sk);
if (level != SOL_TLS)
return ctx->getsockopt(sk, level, optname, optval, optlen);
return do_chtls_getsockopt(sk, optval, optlen);
}
static int do_chtls_setsockopt(struct sock *sk, int optname,
char __user *optval, unsigned int optlen)
{
struct tls_crypto_info *crypto_info, tmp_crypto_info;
struct chtls_sock *csk;
int keylen;
int rc = 0;
csk = rcu_dereference_sk_user_data(sk);
if (!optval || optlen < sizeof(*crypto_info)) {
rc = -EINVAL;
goto out;
}
rc = copy_from_user(&tmp_crypto_info, optval, sizeof(*crypto_info));
if (rc) {
rc = -EFAULT;
goto out;
}
/* check version */
if (tmp_crypto_info.version != TLS_1_2_VERSION) {
rc = -ENOTSUPP;
goto out;
}
crypto_info = (struct tls_crypto_info *)&csk->tlshws.crypto_info;
switch (tmp_crypto_info.cipher_type) {
case TLS_CIPHER_AES_GCM_128: {
rc = copy_from_user(crypto_info, optval,
sizeof(struct
tls12_crypto_info_aes_gcm_128));
if (rc) {
rc = -EFAULT;
goto out;
}
keylen = TLS_CIPHER_AES_GCM_128_KEY_SIZE;
rc = chtls_setkey(csk, keylen, optname);
break;
}
default:
rc = -EINVAL;
goto out;
}
out:
return rc;
}
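Review note: the checks in do_chtls_setsockopt() run in a fixed order (length, then TLS version, then cipher type). The following is a minimal user-space sketch of that ordering, not the driver code. Every `demo_` and `DEMO_` name is hypothetical, the cipher constant is an assumed value used only for illustration, and the extra length check before the full copy is an addition that the function above does not perform. User space has `ENOTSUP` where the kernel code uses `ENOTSUPP`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_TLS_1_2_VERSION    0x0303 /* TLS 1.2 wire version */
#define DEMO_CIPHER_AES_GCM_128 51     /* assumed value, illustration only */

/* Hypothetical stand-in for the common crypto-info header. */
struct demo_crypto_info {
	uint16_t version;
	uint16_t cipher_type;
};

/* Hypothetical stand-in for the AES-GCM-128 parameter block. */
struct demo_gcm128_info {
	struct demo_crypto_info info;
	uint8_t iv[8];
	uint8_t key[16];
	uint8_t salt[4];
	uint8_t rec_seq[8];
};

/* Validate a caller-supplied option buffer and copy it to *out.
 * Returns 0 on success or a negative errno-style value.
 */
static int demo_check_crypto_info(const void *optval, size_t optlen,
				  struct demo_gcm128_info *out)
{
	struct demo_crypto_info hdr;

	assert(out != NULL);

	/* 1. The buffer must at least hold the common header. */
	if (!optval || optlen < sizeof(hdr))
		return -EINVAL;
	memcpy(&hdr, optval, sizeof(hdr));

	/* 2. Only TLS 1.2 is accepted. */
	if (hdr.version != DEMO_TLS_1_2_VERSION)
		return -ENOTSUP;

	/* 3. Dispatch on cipher type; only AES-GCM-128 is handled. */
	switch (hdr.cipher_type) {
	case DEMO_CIPHER_AES_GCM_128:
		/* Added in this sketch: require the full parameter block. */
		if (optlen < sizeof(*out))
			return -EINVAL;
		memcpy(out, optval, sizeof(*out));
		return 0;
	default:
		return -EINVAL;
	}
}
```

A caller would fill a `struct demo_gcm128_info`, pass its address and size, and treat any negative return as a rejected request.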
static int chtls_setsockopt(struct sock *sk, int level, int optname,
char __user *optval, unsigned int optlen)
{
struct tls_context *ctx = tls_get_ctx(sk);
if (level != SOL_TLS)
return ctx->setsockopt(sk, level, optname, optval, optlen);
return do_chtls_setsockopt(sk, optname, optval, optlen);
}
static struct cxgb4_uld_info chtls_uld_info = {
.name = DRV_NAME,
.nrxq = MAX_ULD_QSETS,
.ntxq = MAX_ULD_QSETS,
.rxq_size = 1024,
.add = chtls_uld_add,
.state_change = chtls_uld_state_change,
.rx_handler = chtls_uld_rx_handler,
};
void chtls_install_cpl_ops(struct sock *sk)
{
sk->sk_prot = &chtls_cpl_prot;
}
static void __init chtls_init_ulp_ops(void)
{
chtls_cpl_prot = tcp_prot;
chtls_init_rsk_ops(&chtls_cpl_prot, &chtls_rsk_ops,
&tcp_prot, PF_INET);
chtls_cpl_prot.close = chtls_close;
chtls_cpl_prot.disconnect = chtls_disconnect;
chtls_cpl_prot.destroy = chtls_destroy_sock;
chtls_cpl_prot.shutdown = chtls_shutdown;
chtls_cpl_prot.sendmsg = chtls_sendmsg;
chtls_cpl_prot.sendpage = chtls_sendpage;
chtls_cpl_prot.recvmsg = chtls_recvmsg;
chtls_cpl_prot.setsockopt = chtls_setsockopt;
chtls_cpl_prot.getsockopt = chtls_getsockopt;
}
static int __init chtls_register(void)
{
chtls_init_ulp_ops();
register_listen_notifier(&listen_notifier);
cxgb4_register_uld(CXGB4_ULD_TLS, &chtls_uld_info);
return 0;
}
static void __exit chtls_unregister(void)
{
unregister_listen_notifier(&listen_notifier);
chtls_free_all_uld();
cxgb4_unregister_uld(CXGB4_ULD_TLS);
}
module_init(chtls_register);
module_exit(chtls_unregister);
MODULE_DESCRIPTION("Chelsio TLS Inline driver");
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Chelsio Communications");
MODULE_VERSION(DRV_VERSION);
......@@ -4549,19 +4549,33 @@ static int adap_init0(struct adapter *adap)
adap->num_ofld_uld += 2;
}
if (caps_cmd.cryptocaps) {
/* Should query params here...TODO */
if (ntohs(caps_cmd.cryptocaps) &
FW_CAPS_CONFIG_CRYPTO_LOOKASIDE) {
params[0] = FW_PARAM_PFVF(NCRYPTO_LOOKASIDE);
-ret = t4_query_params(adap, adap->mbox, adap->pf, 0, 2,
-params, val);
+ret = t4_query_params(adap, adap->mbox, adap->pf, 0,
+2, params, val);
if (ret < 0) {
if (ret != -EINVAL)
goto bye;
} else {
adap->vres.ncrypto_fc = val[0];
}
-adap->params.crypto = ntohs(caps_cmd.cryptocaps);
adap->num_ofld_uld += 1;
}
if (ntohs(caps_cmd.cryptocaps) &
FW_CAPS_CONFIG_TLS_INLINE) {
params[0] = FW_PARAM_PFVF(TLS_START);
params[1] = FW_PARAM_PFVF(TLS_END);
ret = t4_query_params(adap, adap->mbox, adap->pf, 0,
2, params, val);
if (ret < 0)
goto bye;
adap->vres.key.start = val[0];
adap->vres.key.size = val[1] - val[0] + 1;
adap->num_uld += 1;
}
+adap->params.crypto = ntohs(caps_cmd.cryptocaps);
}
#undef FW_PARAM_PFVF
#undef FW_PARAM_DEV
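Review note: the hunk above tests a capability bit and then turns a firmware-reported first/last pair into a start/size range. Because the last index is inclusive, the size is `last - first + 1`. The following is a minimal user-space sketch of that arithmetic only. The `demo_` names are hypothetical, and the bit value 0x2 follows the FW_CAPS_CONFIG_TLS_INLINE definition that appears later in this diff.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_CAP_CRYPTO_LOOKASIDE 0x1u
#define DEMO_CAP_TLS_INLINE       0x2u

/* Hypothetical stand-in for a start/size resource range. */
struct demo_range {
	unsigned int start;
	unsigned int size;
};

/* If the inline-TLS capability bit is set, convert the inclusive
 * [first, last] pair into a start/size range and return 1.
 * Otherwise leave *r untouched and return 0.
 */
static int demo_key_range(uint16_t caps, unsigned int first,
			  unsigned int last, struct demo_range *r)
{
	if (!(caps & DEMO_CAP_TLS_INLINE))
		return 0;

	assert(last >= first);
	r->start = first;
	r->size = last - first + 1; /* both endpoints count */
	return 1;
}
```

For example, a reported pair of 0x100 and 0x1ff describes 0x100 entries, not 0xff.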
......
......@@ -237,6 +237,7 @@ enum cxgb4_uld {
CXGB4_ULD_ISCSI,
CXGB4_ULD_ISCSIT,
CXGB4_ULD_CRYPTO,
CXGB4_ULD_TLS,
CXGB4_ULD_MAX
};
......@@ -289,6 +290,7 @@ struct cxgb4_virt_res { /* virtualized HW resources */
struct cxgb4_range qp;
struct cxgb4_range cq;
struct cxgb4_range ocq;
struct cxgb4_range key;
unsigned int ncrypto_fc;
};
......@@ -300,6 +302,9 @@ struct chcr_stats_debug {
atomic_t error;
atomic_t fallback;
atomic_t ipsec_cnt;
atomic_t tls_pdu_tx;
atomic_t tls_pdu_rx;
atomic_t tls_key;
};
#define OCQ_WIN_OFFSET(pdev, vres) \
......@@ -382,6 +387,8 @@ struct cxgb4_uld_info {
int cxgb4_register_uld(enum cxgb4_uld type, const struct cxgb4_uld_info *p);
int cxgb4_unregister_uld(enum cxgb4_uld type);
int cxgb4_ofld_send(struct net_device *dev, struct sk_buff *skb);
int cxgb4_immdata_send(struct net_device *dev, unsigned int idx,
const void *src, unsigned int len);
int cxgb4_crypto_send(struct net_device *dev, struct sk_buff *skb);
unsigned int cxgb4_dbfifo_count(const struct net_device *dev, int lpfifo);
unsigned int cxgb4_port_chan(const struct net_device *dev);
......
......@@ -1019,8 +1019,8 @@ EXPORT_SYMBOL(cxgb4_ring_tx_db);
void cxgb4_inline_tx_skb(const struct sk_buff *skb,
const struct sge_txq *q, void *pos)
{
-u64 *p;
int left = (void *)q->stat - pos;
+u64 *p;
if (likely(skb->len <= left)) {
if (likely(!skb->data_len))
......@@ -1735,15 +1735,13 @@ static void txq_stop_maperr(struct sge_uld_txq *q)
/**
* ofldtxq_stop - stop an offload Tx queue that has become full
* @q: the queue to stop
-* @skb: the packet causing the queue to become full
+* @wr: the Work Request causing the queue to become full
*
* Stops an offload Tx queue that has become full and modifies the packet
* being written to request a wakeup.
*/
-static void ofldtxq_stop(struct sge_uld_txq *q, struct sk_buff *skb)
+static void ofldtxq_stop(struct sge_uld_txq *q, struct fw_wr_hdr *wr)
{
-struct fw_wr_hdr *wr = (struct fw_wr_hdr *)skb->data;
wr->lo |= htonl(FW_WR_EQUEQ_F | FW_WR_EQUIQ_F);
q->q.stops++;
q->full = 1;
......@@ -1804,7 +1802,7 @@ static void service_ofldq(struct sge_uld_txq *q)
credits = txq_avail(&q->q) - ndesc;
BUG_ON(credits < 0);
if (unlikely(credits < TXQ_STOP_THRES))
-ofldtxq_stop(q, skb);
+ofldtxq_stop(q, (struct fw_wr_hdr *)skb->data);
pos = (u64 *)&q->q.desc[q->q.pidx];
if (is_ofld_imm(skb))
......@@ -2005,6 +2003,103 @@ int cxgb4_ofld_send(struct net_device *dev, struct sk_buff *skb)
}
EXPORT_SYMBOL(cxgb4_ofld_send);
static void *inline_tx_header(const void *src,
const struct sge_txq *q,
void *pos, int length)
{
int left = (void *)q->stat - pos;
u64 *p;
if (likely(length <= left)) {
memcpy(pos, src, length);
pos += length;
} else {
memcpy(pos, src, left);
memcpy(q->desc, src + left, length - left);
pos = (void *)q->desc + (length - left);
}
/* 0-pad to multiple of 16 */
p = PTR_ALIGN(pos, 8);
if ((uintptr_t)p & 8) {
*p = 0;
return p + 1;
}
return p;
}
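Review note: inline_tx_header() copies a header into the descriptor ring, wraps to the start of the ring when the copy would run past the end, and then pads the write position to a 16-byte boundary. The following is a minimal user-space sketch of the same idea, not the driver routine. `struct demo_ring` and `demo_ring_copy` are hypothetical names, the ring size is assumed to be a multiple of 16, and the padding here zero-fills byte by byte rather than writing a single zeroed 64-bit word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a descriptor ring: `base` is the first byte,
 * `end` is one past the last usable byte.
 */
struct demo_ring {
	uint8_t *base;
	uint8_t *end;
};

/* Copy `len` bytes to `pos`, wrapping to the ring base when the copy
 * would run past `end`, then zero-fill up to the next 16-byte boundary
 * (measured from `base`). Returns the padded write position.
 */
static uint8_t *demo_ring_copy(const struct demo_ring *r, uint8_t *pos,
			       const void *src, size_t len)
{
	size_t left = (size_t)(r->end - pos);
	size_t off;

	assert(pos >= r->base && pos < r->end);
	assert(len <= (size_t)(r->end - r->base));

	if (len <= left) {
		/* Fits before the end of the ring. */
		memcpy(pos, src, len);
		pos += len;
	} else {
		/* Split the copy: tail of the ring first, then the start. */
		memcpy(pos, src, left);
		memcpy(r->base, (const uint8_t *)src + left, len - left);
		pos = r->base + (len - left);
	}

	/* Zero-pad to a multiple of 16 bytes. */
	off = (size_t)(pos - r->base);
	while (off % 16) {
		*pos++ = 0;
		off++;
	}
	return pos;
}
```

With a 64-byte ring and a 20-byte header written at offset 56, the first 8 bytes land at the end of the ring, the remaining 12 at the start, and the returned position is offset 16.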
/**
* ofld_xmit_direct - copy a WR into offload queue
* @q: the Tx offload queue
* @src: location of WR
* @len: WR length
*
* Copy an immediate WR into an uncontended SGE offload queue.
*/
static int ofld_xmit_direct(struct sge_uld_txq *q, const void *src,
unsigned int len)
{
unsigned int ndesc;
int credits;
u64 *pos;
/* Use the lower limit as the cut-off */
if (len > MAX_IMM_OFLD_TX_DATA_WR_LEN) {
WARN_ON(1);
return NET_XMIT_DROP;
}
/* Don't return NET_XMIT_CN here, as the current
* implementation doesn't queue the request
* using an skb when the following conditions are not met
*/
if (!spin_trylock(&q->sendq.lock))
return NET_XMIT_DROP;
if (q->full || !skb_queue_empty(&q->sendq) ||
q->service_ofldq_running) {
spin_unlock(&q->sendq.lock);
return NET_XMIT_DROP;
}
ndesc = flits_to_desc(DIV_ROUND_UP(len, 8));
credits = txq_avail(&q->q) - ndesc;
pos = (u64 *)&q->q.desc[q->q.pidx];
/* ofldtxq_stop modifies WR header in-situ */
inline_tx_header(src, &q->q, pos, len);
if (unlikely(credits < TXQ_STOP_THRES))
ofldtxq_stop(q, (struct fw_wr_hdr *)pos);
txq_advance(&q->q, ndesc);
cxgb4_ring_tx_db(q->adap, &q->q, ndesc);
spin_unlock(&q->sendq.lock);
return NET_XMIT_SUCCESS;
}
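Review note: ofld_xmit_direct() sizes the work request with `flits_to_desc(DIV_ROUND_UP(len, 8))` and subtracts the result from the available queue entries. The following is a minimal user-space sketch of that arithmetic, assuming 8-byte flits and 8 flits (64 bytes) per descriptor. Both constants and all `demo_` names are assumptions made for illustration; they are not taken from this diff.

```c
#include <assert.h>

#define DEMO_FLIT_BYTES     8u /* assumed: one flit is 8 bytes */
#define DEMO_FLITS_PER_DESC 8u /* assumed: one descriptor is 64 bytes */

/* Integer division rounding up. */
static unsigned int demo_div_round_up(unsigned int n, unsigned int d)
{
	assert(d != 0);
	return (n + d - 1) / d;
}

/* Number of descriptors needed for a work request of `len` bytes. */
static unsigned int demo_wr_ndesc(unsigned int len)
{
	unsigned int flits = demo_div_round_up(len, DEMO_FLIT_BYTES);

	return demo_div_round_up(flits, DEMO_FLITS_PER_DESC);
}

/* Queue entries left after posting the request. A negative result
 * means the request does not fit in the space currently available.
 */
static int demo_credits_after(unsigned int avail, unsigned int len)
{
	return (int)avail - (int)demo_wr_ndesc(len);
}
```

Under these assumptions a 65-byte request needs 9 flits and therefore 2 descriptors, which is why a request only slightly larger than one descriptor costs two queue entries.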
int cxgb4_immdata_send(struct net_device *dev, unsigned int idx,
const void *src, unsigned int len)
{
struct sge_uld_txq_info *txq_info;
struct sge_uld_txq *txq;
struct adapter *adap;
int ret;
adap = netdev2adap(dev);
local_bh_disable();
txq_info = adap->sge.uld_txq_info[CXGB4_TX_OFLD];
if (unlikely(!txq_info)) {
WARN_ON(true);
local_bh_enable();
return NET_XMIT_DROP;
}
txq = &txq_info->uldtxq[idx];
ret = ofld_xmit_direct(txq, src, len);
local_bh_enable();
return net_xmit_eval(ret);
}
EXPORT_SYMBOL(cxgb4_immdata_send);
/**
* t4_crypto_send - send crypto packet
* @adap: the adapter
......
......@@ -82,6 +82,7 @@ enum {
CPL_RX_ISCSI_CMP = 0x45,
CPL_TRACE_PKT_T5 = 0x48,
CPL_RX_ISCSI_DDP = 0x49,
CPL_RX_TLS_CMP = 0x4E,
CPL_RDMA_READ_REQ = 0x60,
......@@ -89,6 +90,7 @@ enum {
CPL_ACT_OPEN_REQ6 = 0x83,
CPL_TX_TLS_PDU = 0x88,
CPL_TX_TLS_SFO = 0x89,
CPL_TX_SEC_PDU = 0x8A,
CPL_TX_TLS_ACK = 0x8B,
......@@ -98,6 +100,7 @@ enum {
CPL_RX_MPS_PKT = 0xAF,
CPL_TRACE_PKT = 0xB0,
CPL_TLS_DATA = 0xB1,
CPL_ISCSI_DATA = 0xB2,
CPL_FW4_MSG = 0xC0,
......@@ -155,6 +158,7 @@ enum {
ULP_MODE_RDMA = 4,
ULP_MODE_TCPDDP = 5,
ULP_MODE_FCOE = 6,
ULP_MODE_TLS = 8,
};
enum {
......@@ -1445,6 +1449,14 @@ struct cpl_tx_data {
#define T6_TX_FORCE_V(x) ((x) << T6_TX_FORCE_S)
#define T6_TX_FORCE_F T6_TX_FORCE_V(1U)
#define TX_SHOVE_S 14
#define TX_SHOVE_V(x) ((x) << TX_SHOVE_S)
#define TX_ULP_MODE_S 10
#define TX_ULP_MODE_M 0x7
#define TX_ULP_MODE_V(x) ((x) << TX_ULP_MODE_S)
#define TX_ULP_MODE_G(x) (((x) >> TX_ULP_MODE_S) & TX_ULP_MODE_M)
enum {
ULP_TX_MEM_READ = 2,
ULP_TX_MEM_WRITE = 3,
......@@ -1455,12 +1467,21 @@ enum {
ULP_TX_SC_NOOP = 0x80,
ULP_TX_SC_IMM = 0x81,
ULP_TX_SC_DSGL = 0x82,
-ULP_TX_SC_ISGL = 0x83
+ULP_TX_SC_ISGL = 0x83,
ULP_TX_SC_MEMRD = 0x86
};
#define ULPTX_CMD_S 24
#define ULPTX_CMD_V(x) ((x) << ULPTX_CMD_S)
#define ULPTX_LEN16_S 0
#define ULPTX_LEN16_M 0xFF
#define ULPTX_LEN16_V(x) ((x) << ULPTX_LEN16_S)
#define ULP_TX_SC_MORE_S 23
#define ULP_TX_SC_MORE_V(x) ((x) << ULP_TX_SC_MORE_S)
#define ULP_TX_SC_MORE_F ULP_TX_SC_MORE_V(1U)
struct ulptx_sge_pair {
__be32 len[2];
__be64 addr[2];
......@@ -2183,4 +2204,101 @@ struct cpl_srq_table_rpl {
#define SRQT_IDX_V(x) ((x) << SRQT_IDX_S)
#define SRQT_IDX_G(x) (((x) >> SRQT_IDX_S) & SRQT_IDX_M)
struct cpl_tx_tls_sfo {
__be32 op_to_seg_len;
__be32 pld_len;
__be32 type_protover;
__be32 r1_lo;
__be32 seqno_numivs;
__be32 ivgen_hdrlen;
__be64 scmd1;
};
/* cpl_tx_tls_sfo macros */
#define CPL_TX_TLS_SFO_OPCODE_S 24
#define CPL_TX_TLS_SFO_OPCODE_V(x) ((x) << CPL_TX_TLS_SFO_OPCODE_S)
#define CPL_TX_TLS_SFO_DATA_TYPE_S 20
#define CPL_TX_TLS_SFO_DATA_TYPE_V(x) ((x) << CPL_TX_TLS_SFO_DATA_TYPE_S)
#define CPL_TX_TLS_SFO_CPL_LEN_S 16
#define CPL_TX_TLS_SFO_CPL_LEN_V(x) ((x) << CPL_TX_TLS_SFO_CPL_LEN_S)
#define CPL_TX_TLS_SFO_SEG_LEN_S 0
#define CPL_TX_TLS_SFO_SEG_LEN_M 0xffff
#define CPL_TX_TLS_SFO_SEG_LEN_V(x) ((x) << CPL_TX_TLS_SFO_SEG_LEN_S)
#define CPL_TX_TLS_SFO_SEG_LEN_G(x) \
(((x) >> CPL_TX_TLS_SFO_SEG_LEN_S) & CPL_TX_TLS_SFO_SEG_LEN_M)
#define CPL_TX_TLS_SFO_TYPE_S 24
#define CPL_TX_TLS_SFO_TYPE_M 0xff
#define CPL_TX_TLS_SFO_TYPE_V(x) ((x) << CPL_TX_TLS_SFO_TYPE_S)
#define CPL_TX_TLS_SFO_TYPE_G(x) \
(((x) >> CPL_TX_TLS_SFO_TYPE_S) & CPL_TX_TLS_SFO_TYPE_M)
#define CPL_TX_TLS_SFO_PROTOVER_S 8
#define CPL_TX_TLS_SFO_PROTOVER_M 0xffff
#define CPL_TX_TLS_SFO_PROTOVER_V(x) ((x) << CPL_TX_TLS_SFO_PROTOVER_S)
#define CPL_TX_TLS_SFO_PROTOVER_G(x) \
(((x) >> CPL_TX_TLS_SFO_PROTOVER_S) & CPL_TX_TLS_SFO_PROTOVER_M)
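Review note: the macros above follow a fixed naming pattern. `_S` is the field's bit offset, `_M` is its mask after shifting, `_V(x)` places a value into the field, and `_G(x)` extracts it. The following is a minimal user-space sketch of the pattern with the same shift and mask values as the TYPE and PROTOVER fields. The `DEMO_` names and `demo_pack_type_protover` are hypothetical, and the sketch works on a host-order word; the driver structures declare these words as `__be32`, so real code would also convert byte order.

```c
#include <assert.h>
#include <stdint.h>

/* Record type: 8 bits at offset 24. */
#define DEMO_TYPE_S 24
#define DEMO_TYPE_M 0xffu
#define DEMO_TYPE_V(x) ((uint32_t)(x) << DEMO_TYPE_S)
#define DEMO_TYPE_G(x) (((x) >> DEMO_TYPE_S) & DEMO_TYPE_M)

/* Protocol version: 16 bits at offset 8. */
#define DEMO_PROTOVER_S 8
#define DEMO_PROTOVER_M 0xffffu
#define DEMO_PROTOVER_V(x) ((uint32_t)(x) << DEMO_PROTOVER_S)
#define DEMO_PROTOVER_G(x) (((x) >> DEMO_PROTOVER_S) & DEMO_PROTOVER_M)

/* Build one 32-bit word holding both fields (host byte order). */
static uint32_t demo_pack_type_protover(uint8_t type, uint16_t protover)
{
	uint32_t w = DEMO_TYPE_V(type) | DEMO_PROTOVER_V(protover);

	/* Round-trip sanity check: each field reads back unchanged. */
	assert(DEMO_TYPE_G(w) == type);
	assert(DEMO_PROTOVER_G(w) == protover);
	return w;
}
```

Packing record type 0x17 (application data) with version 0x0303 yields 0x17030300, and the `_G` macros recover both values from that word.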
struct cpl_tls_data {
struct rss_header rsshdr;
union opcode_tid ot;
__be32 length_pkd;
__be32 seq;
__be32 r1;
};
#define CPL_TLS_DATA_OPCODE_S 24
#define CPL_TLS_DATA_OPCODE_M 0xff
#define CPL_TLS_DATA_OPCODE_V(x) ((x) << CPL_TLS_DATA_OPCODE_S)
#define CPL_TLS_DATA_OPCODE_G(x) \
(((x) >> CPL_TLS_DATA_OPCODE_S) & CPL_TLS_DATA_OPCODE_M)
#define CPL_TLS_DATA_TID_S 0
#define CPL_TLS_DATA_TID_M 0xffffff
#define CPL_TLS_DATA_TID_V(x) ((x) << CPL_TLS_DATA_TID_S)
#define CPL_TLS_DATA_TID_G(x) \
(((x) >> CPL_TLS_DATA_TID_S) & CPL_TLS_DATA_TID_M)
#define CPL_TLS_DATA_LENGTH_S 0
#define CPL_TLS_DATA_LENGTH_M 0xffff
#define CPL_TLS_DATA_LENGTH_V(x) ((x) << CPL_TLS_DATA_LENGTH_S)
#define CPL_TLS_DATA_LENGTH_G(x) \
(((x) >> CPL_TLS_DATA_LENGTH_S) & CPL_TLS_DATA_LENGTH_M)
struct cpl_rx_tls_cmp {
struct rss_header rsshdr;
union opcode_tid ot;
__be32 pdulength_length;
__be32 seq;
__be32 ddp_report;
__be32 r;
__be32 ddp_valid;
};
#define CPL_RX_TLS_CMP_OPCODE_S 24
#define CPL_RX_TLS_CMP_OPCODE_M 0xff
#define CPL_RX_TLS_CMP_OPCODE_V(x) ((x) << CPL_RX_TLS_CMP_OPCODE_S)
#define CPL_RX_TLS_CMP_OPCODE_G(x) \
(((x) >> CPL_RX_TLS_CMP_OPCODE_S) & CPL_RX_TLS_CMP_OPCODE_M)
#define CPL_RX_TLS_CMP_TID_S 0
#define CPL_RX_TLS_CMP_TID_M 0xffffff
#define CPL_RX_TLS_CMP_TID_V(x) ((x) << CPL_RX_TLS_CMP_TID_S)
#define CPL_RX_TLS_CMP_TID_G(x) \
(((x) >> CPL_RX_TLS_CMP_TID_S) & CPL_RX_TLS_CMP_TID_M)
#define CPL_RX_TLS_CMP_PDULENGTH_S 16
#define CPL_RX_TLS_CMP_PDULENGTH_M 0xffff
#define CPL_RX_TLS_CMP_PDULENGTH_V(x) ((x) << CPL_RX_TLS_CMP_PDULENGTH_S)
#define CPL_RX_TLS_CMP_PDULENGTH_G(x) \
(((x) >> CPL_RX_TLS_CMP_PDULENGTH_S) & CPL_RX_TLS_CMP_PDULENGTH_M)
#define CPL_RX_TLS_CMP_LENGTH_S 0
#define CPL_RX_TLS_CMP_LENGTH_M 0xffff
#define CPL_RX_TLS_CMP_LENGTH_V(x) ((x) << CPL_RX_TLS_CMP_LENGTH_S)
#define CPL_RX_TLS_CMP_LENGTH_G(x) \
(((x) >> CPL_RX_TLS_CMP_LENGTH_S) & CPL_RX_TLS_CMP_LENGTH_M)
#endif /* __T4_MSG_H */
......@@ -2775,6 +2775,8 @@
#define ULP_RX_LA_RDPTR_A 0x19240
#define ULP_RX_LA_RDDATA_A 0x19244
#define ULP_RX_LA_WRPTR_A 0x19248
#define ULP_RX_TLS_KEY_LLIMIT_A 0x192ac
#define ULP_RX_TLS_KEY_ULIMIT_A 0x192b0
#define HPZ3_S 24
#define HPZ3_V(x) ((x) << HPZ3_S)
......
......@@ -105,6 +105,7 @@ enum fw_wr_opcodes {
FW_RI_INV_LSTAG_WR = 0x1a,
FW_ISCSI_TX_DATA_WR = 0x45,
FW_PTP_TX_PKT_WR = 0x46,
FW_TLSTX_DATA_WR = 0x68,
FW_CRYPTO_LOOKASIDE_WR = 0X6d,
FW_LASTC2E_WR = 0x70,
FW_FILTER2_WR = 0x77
......@@ -635,6 +636,30 @@ struct fw_ofld_connection_wr {
#define FW_OFLD_CONNECTION_WR_CPLPASSACCEPTRPL_F \
FW_OFLD_CONNECTION_WR_CPLPASSACCEPTRPL_V(1U)
enum fw_flowc_mnem_tcpstate {
FW_FLOWC_MNEM_TCPSTATE_CLOSED = 0, /* illegal */
FW_FLOWC_MNEM_TCPSTATE_LISTEN = 1, /* illegal */
FW_FLOWC_MNEM_TCPSTATE_SYNSENT = 2, /* illegal */
FW_FLOWC_MNEM_TCPSTATE_SYNRECEIVED = 3, /* illegal */
FW_FLOWC_MNEM_TCPSTATE_ESTABLISHED = 4, /* default */
FW_FLOWC_MNEM_TCPSTATE_CLOSEWAIT = 5, /* got peer close already */
FW_FLOWC_MNEM_TCPSTATE_FINWAIT1 = 6, /* haven't gotten ACK for FIN and
* will resend FIN - equiv ESTAB
*/
FW_FLOWC_MNEM_TCPSTATE_CLOSING = 7, /* haven't gotten ACK for FIN and
* will resend FIN but have
* received FIN
*/
FW_FLOWC_MNEM_TCPSTATE_LASTACK = 8, /* haven't gotten ACK for FIN and
* will resend FIN but have
* received FIN
*/
FW_FLOWC_MNEM_TCPSTATE_FINWAIT2 = 9, /* sent FIN and got FIN + ACK,
* waiting for FIN
*/
FW_FLOWC_MNEM_TCPSTATE_TIMEWAIT = 10, /* not expected */
};
enum fw_flowc_mnem {
FW_FLOWC_MNEM_PFNVFN, /* PFN [15:8] VFN [7:0] */
FW_FLOWC_MNEM_CH,
......@@ -651,6 +676,8 @@ enum fw_flowc_mnem {
FW_FLOWC_MNEM_DCBPRIO,
FW_FLOWC_MNEM_SND_SCALE,
FW_FLOWC_MNEM_RCV_SCALE,
FW_FLOWC_MNEM_ULD_MODE,
FW_FLOWC_MNEM_MAX,
};
struct fw_flowc_mnemval {
......@@ -675,6 +702,14 @@ struct fw_ofld_tx_data_wr {
__be32 tunnel_to_proxy;
};
#define FW_OFLD_TX_DATA_WR_ALIGNPLD_S 30
#define FW_OFLD_TX_DATA_WR_ALIGNPLD_V(x) ((x) << FW_OFLD_TX_DATA_WR_ALIGNPLD_S)
#define FW_OFLD_TX_DATA_WR_ALIGNPLD_F FW_OFLD_TX_DATA_WR_ALIGNPLD_V(1U)
#define FW_OFLD_TX_DATA_WR_SHOVE_S 29
#define FW_OFLD_TX_DATA_WR_SHOVE_V(x) ((x) << FW_OFLD_TX_DATA_WR_SHOVE_S)
#define FW_OFLD_TX_DATA_WR_SHOVE_F FW_OFLD_TX_DATA_WR_SHOVE_V(1U)
#define FW_OFLD_TX_DATA_WR_TUNNEL_S 19
#define FW_OFLD_TX_DATA_WR_TUNNEL_V(x) ((x) << FW_OFLD_TX_DATA_WR_TUNNEL_S)
......@@ -691,10 +726,6 @@ struct fw_ofld_tx_data_wr {
#define FW_OFLD_TX_DATA_WR_MORE_S 15
#define FW_OFLD_TX_DATA_WR_MORE_V(x) ((x) << FW_OFLD_TX_DATA_WR_MORE_S)
-#define FW_OFLD_TX_DATA_WR_SHOVE_S 14
-#define FW_OFLD_TX_DATA_WR_SHOVE_V(x) ((x) << FW_OFLD_TX_DATA_WR_SHOVE_S)
-#define FW_OFLD_TX_DATA_WR_SHOVE_F FW_OFLD_TX_DATA_WR_SHOVE_V(1U)
#define FW_OFLD_TX_DATA_WR_ULPMODE_S 10
#define FW_OFLD_TX_DATA_WR_ULPMODE_V(x) ((x) << FW_OFLD_TX_DATA_WR_ULPMODE_S)
......@@ -1121,6 +1152,12 @@ enum fw_caps_config_iscsi {
FW_CAPS_CONFIG_ISCSI_TARGET_CNXOFLD = 0x00000008,
};
enum fw_caps_config_crypto {
FW_CAPS_CONFIG_CRYPTO_LOOKASIDE = 0x00000001,
FW_CAPS_CONFIG_TLS_INLINE = 0x00000002,
FW_CAPS_CONFIG_IPSEC_INLINE = 0x00000004,
};
enum fw_caps_config_fcoe {
FW_CAPS_CONFIG_FCOE_INITIATOR = 0x00000001,
FW_CAPS_CONFIG_FCOE_TARGET = 0x00000002,
......@@ -1266,6 +1303,8 @@ enum fw_params_param_pfvf {
FW_PARAMS_PARAM_PFVF_CPLFW4MSG_ENCAP = 0x31,
FW_PARAMS_PARAM_PFVF_HPFILTER_START = 0x32,
FW_PARAMS_PARAM_PFVF_HPFILTER_END = 0x33,
FW_PARAMS_PARAM_PFVF_TLS_START = 0x34,
FW_PARAMS_PARAM_PFVF_TLS_END = 0x35,
FW_PARAMS_PARAM_PFVF_NCRYPTO_LOOKASIDE = 0x39,
FW_PARAMS_PARAM_PFVF_PORT_CAPS32 = 0x3A,
};
......@@ -3839,4 +3878,122 @@ struct fw_crypto_lookaside_wr {
(((x) >> FW_CRYPTO_LOOKASIDE_WR_HASH_SIZE_S) & \
FW_CRYPTO_LOOKASIDE_WR_HASH_SIZE_M)
struct fw_tlstx_data_wr {
__be32 op_to_immdlen;
__be32 flowid_len16;
__be32 plen;
__be32 lsodisable_to_flags;
__be32 r5;
__be32 ctxloc_to_exp;
__be16 mfs;
__be16 adjustedplen_pkd;
__be16 expinplenmax_pkd;
u8 pdusinplenmax_pkd;
u8 r10;
};
#define FW_TLSTX_DATA_WR_OPCODE_S 24
#define FW_TLSTX_DATA_WR_OPCODE_M 0xff
#define FW_TLSTX_DATA_WR_OPCODE_V(x) ((x) << FW_TLSTX_DATA_WR_OPCODE_S)
#define FW_TLSTX_DATA_WR_OPCODE_G(x) \
(((x) >> FW_TLSTX_DATA_WR_OPCODE_S) & FW_TLSTX_DATA_WR_OPCODE_M)
#define FW_TLSTX_DATA_WR_COMPL_S 21
#define FW_TLSTX_DATA_WR_COMPL_M 0x1
#define FW_TLSTX_DATA_WR_COMPL_V(x) ((x) << FW_TLSTX_DATA_WR_COMPL_S)
#define FW_TLSTX_DATA_WR_COMPL_G(x) \
(((x) >> FW_TLSTX_DATA_WR_COMPL_S) & FW_TLSTX_DATA_WR_COMPL_M)
#define FW_TLSTX_DATA_WR_COMPL_F FW_TLSTX_DATA_WR_COMPL_V(1U)
#define FW_TLSTX_DATA_WR_IMMDLEN_S 0
#define FW_TLSTX_DATA_WR_IMMDLEN_M 0xff
#define FW_TLSTX_DATA_WR_IMMDLEN_V(x) ((x) << FW_TLSTX_DATA_WR_IMMDLEN_S)
#define FW_TLSTX_DATA_WR_IMMDLEN_G(x) \
(((x) >> FW_TLSTX_DATA_WR_IMMDLEN_S) & FW_TLSTX_DATA_WR_IMMDLEN_M)
#define FW_TLSTX_DATA_WR_FLOWID_S 8
#define FW_TLSTX_DATA_WR_FLOWID_M 0xfffff
#define FW_TLSTX_DATA_WR_FLOWID_V(x) ((x) << FW_TLSTX_DATA_WR_FLOWID_S)
#define FW_TLSTX_DATA_WR_FLOWID_G(x) \
(((x) >> FW_TLSTX_DATA_WR_FLOWID_S) & FW_TLSTX_DATA_WR_FLOWID_M)
#define FW_TLSTX_DATA_WR_LEN16_S 0
#define FW_TLSTX_DATA_WR_LEN16_M 0xff
#define FW_TLSTX_DATA_WR_LEN16_V(x) ((x) << FW_TLSTX_DATA_WR_LEN16_S)
#define FW_TLSTX_DATA_WR_LEN16_G(x) \
(((x) >> FW_TLSTX_DATA_WR_LEN16_S) & FW_TLSTX_DATA_WR_LEN16_M)
#define FW_TLSTX_DATA_WR_LSODISABLE_S 31
#define FW_TLSTX_DATA_WR_LSODISABLE_M 0x1
#define FW_TLSTX_DATA_WR_LSODISABLE_V(x) \
((x) << FW_TLSTX_DATA_WR_LSODISABLE_S)
#define FW_TLSTX_DATA_WR_LSODISABLE_G(x) \
(((x) >> FW_TLSTX_DATA_WR_LSODISABLE_S) & FW_TLSTX_DATA_WR_LSODISABLE_M)
#define FW_TLSTX_DATA_WR_LSODISABLE_F FW_TLSTX_DATA_WR_LSODISABLE_V(1U)
#define FW_TLSTX_DATA_WR_ALIGNPLD_S 30
#define FW_TLSTX_DATA_WR_ALIGNPLD_M 0x1
#define FW_TLSTX_DATA_WR_ALIGNPLD_V(x) ((x) << FW_TLSTX_DATA_WR_ALIGNPLD_S)
#define FW_TLSTX_DATA_WR_ALIGNPLD_G(x) \
(((x) >> FW_TLSTX_DATA_WR_ALIGNPLD_S) & FW_TLSTX_DATA_WR_ALIGNPLD_M)
#define FW_TLSTX_DATA_WR_ALIGNPLD_F FW_TLSTX_DATA_WR_ALIGNPLD_V(1U)
#define FW_TLSTX_DATA_WR_ALIGNPLDSHOVE_S 29
#define FW_TLSTX_DATA_WR_ALIGNPLDSHOVE_M 0x1
#define FW_TLSTX_DATA_WR_ALIGNPLDSHOVE_V(x) \
((x) << FW_TLSTX_DATA_WR_ALIGNPLDSHOVE_S)
#define FW_TLSTX_DATA_WR_ALIGNPLDSHOVE_G(x) \
(((x) >> FW_TLSTX_DATA_WR_ALIGNPLDSHOVE_S) & \
FW_TLSTX_DATA_WR_ALIGNPLDSHOVE_M)
#define FW_TLSTX_DATA_WR_ALIGNPLDSHOVE_F FW_TLSTX_DATA_WR_ALIGNPLDSHOVE_V(1U)
#define FW_TLSTX_DATA_WR_FLAGS_S 0
#define FW_TLSTX_DATA_WR_FLAGS_M 0xfffffff
#define FW_TLSTX_DATA_WR_FLAGS_V(x) ((x) << FW_TLSTX_DATA_WR_FLAGS_S)
#define FW_TLSTX_DATA_WR_FLAGS_G(x) \
(((x) >> FW_TLSTX_DATA_WR_FLAGS_S) & FW_TLSTX_DATA_WR_FLAGS_M)
#define FW_TLSTX_DATA_WR_CTXLOC_S 30
#define FW_TLSTX_DATA_WR_CTXLOC_M 0x3
#define FW_TLSTX_DATA_WR_CTXLOC_V(x) ((x) << FW_TLSTX_DATA_WR_CTXLOC_S)
#define FW_TLSTX_DATA_WR_CTXLOC_G(x) \
(((x) >> FW_TLSTX_DATA_WR_CTXLOC_S) & FW_TLSTX_DATA_WR_CTXLOC_M)
#define FW_TLSTX_DATA_WR_IVDSGL_S 29
#define FW_TLSTX_DATA_WR_IVDSGL_M 0x1
#define FW_TLSTX_DATA_WR_IVDSGL_V(x) ((x) << FW_TLSTX_DATA_WR_IVDSGL_S)
#define FW_TLSTX_DATA_WR_IVDSGL_G(x) \
(((x) >> FW_TLSTX_DATA_WR_IVDSGL_S) & FW_TLSTX_DATA_WR_IVDSGL_M)
#define FW_TLSTX_DATA_WR_IVDSGL_F FW_TLSTX_DATA_WR_IVDSGL_V(1U)
#define FW_TLSTX_DATA_WR_KEYSIZE_S 24
#define FW_TLSTX_DATA_WR_KEYSIZE_M 0x1f
#define FW_TLSTX_DATA_WR_KEYSIZE_V(x) ((x) << FW_TLSTX_DATA_WR_KEYSIZE_S)
#define FW_TLSTX_DATA_WR_KEYSIZE_G(x) \
(((x) >> FW_TLSTX_DATA_WR_KEYSIZE_S) & FW_TLSTX_DATA_WR_KEYSIZE_M)
#define FW_TLSTX_DATA_WR_NUMIVS_S 14
#define FW_TLSTX_DATA_WR_NUMIVS_M 0xff
#define FW_TLSTX_DATA_WR_NUMIVS_V(x) ((x) << FW_TLSTX_DATA_WR_NUMIVS_S)
#define FW_TLSTX_DATA_WR_NUMIVS_G(x) \
(((x) >> FW_TLSTX_DATA_WR_NUMIVS_S) & FW_TLSTX_DATA_WR_NUMIVS_M)
#define FW_TLSTX_DATA_WR_EXP_S 0
#define FW_TLSTX_DATA_WR_EXP_M 0x3fff
#define FW_TLSTX_DATA_WR_EXP_V(x) ((x) << FW_TLSTX_DATA_WR_EXP_S)
#define FW_TLSTX_DATA_WR_EXP_G(x) \
(((x) >> FW_TLSTX_DATA_WR_EXP_S) & FW_TLSTX_DATA_WR_EXP_M)
#define FW_TLSTX_DATA_WR_ADJUSTEDPLEN_S 1
#define FW_TLSTX_DATA_WR_ADJUSTEDPLEN_V(x) \
((x) << FW_TLSTX_DATA_WR_ADJUSTEDPLEN_S)
#define FW_TLSTX_DATA_WR_EXPINPLENMAX_S 4
#define FW_TLSTX_DATA_WR_EXPINPLENMAX_V(x) \
((x) << FW_TLSTX_DATA_WR_EXPINPLENMAX_S)
#define FW_TLSTX_DATA_WR_PDUSINPLENMAX_S 2
#define FW_TLSTX_DATA_WR_PDUSINPLENMAX_V(x) \
((x) << FW_TLSTX_DATA_WR_PDUSINPLENMAX_S)
#endif /* _T4FW_INTERFACE_H_ */
......@@ -79,6 +79,7 @@ enum {
NETIF_F_RX_UDP_TUNNEL_PORT_BIT, /* Offload of RX port for UDP tunnels */
NETIF_F_GRO_HW_BIT, /* Hardware Generic receive offload */
NETIF_F_HW_TLS_RECORD_BIT, /* Offload TLS record */
/*
* Add your fresh new feature above and remember to update
......@@ -145,6 +146,7 @@ enum {
#define NETIF_F_HW_ESP __NETIF_F(HW_ESP)
#define NETIF_F_HW_ESP_TX_CSUM __NETIF_F(HW_ESP_TX_CSUM)
#define NETIF_F_RX_UDP_TUNNEL_PORT __NETIF_F(RX_UDP_TUNNEL_PORT)
#define NETIF_F_HW_TLS_RECORD __NETIF_F(HW_TLS_RECORD)
#define for_each_netdev_feature(mask_addr, bit) \
for_each_set_bit(bit, (unsigned long *)mask_addr, NETDEV_FEATURE_COUNT)
......
......@@ -56,6 +56,32 @@
#define TLS_RECORD_TYPE_DATA 0x17
#define TLS_AAD_SPACE_SIZE 13
#define TLS_DEVICE_NAME_MAX 32
/*
* This structure defines the routines for Inline TLS driver.
* The following routines are optional and filled with a
* null pointer if not defined.
*
* @name: Name of the registered Inline TLS device
* @dev_list: Inline TLS device list
* int (*feature)(struct tls_device *device);
* Called to return Inline TLS driver capability
*
* int (*hash)(struct tls_device *device, struct sock *sk);
* This function sets up the Inline driver for listen and programs
* device-specific functionality as required
*
* void (*unhash)(struct tls_device *device, struct sock *sk);
* This function cleans listen state set by Inline TLS driver
*/
struct tls_device {
char name[TLS_DEVICE_NAME_MAX];
struct list_head dev_list;
int (*feature)(struct tls_device *device);
int (*hash)(struct tls_device *device, struct sock *sk);
void (*unhash)(struct tls_device *device, struct sock *sk);
};
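Because every callback in `struct tls_device` is optional, callers must test each pointer before invoking it, as `tls_hw_prot()` does with `dev->feature && dev->feature(dev)`. The following is a userspace sketch of that contract, not kernel code. `struct sock` is reduced to a stub, `dev_list` is replaced by a plain `next` pointer, and `demo_feature`, `demo_device_init` and `device_offers_offload` are hypothetical names introduced only for this illustration.

```c
#include <stddef.h>
#include <string.h>

#define TLS_DEVICE_NAME_MAX 32

struct sock { int id; };	/* stub for the kernel's struct sock */

struct tls_device {
	char name[TLS_DEVICE_NAME_MAX];
	struct tls_device *next;	/* stand-in for struct list_head dev_list */
	int (*feature)(struct tls_device *device);
	int (*hash)(struct tls_device *device, struct sock *sk);
	void (*unhash)(struct tls_device *device, struct sock *sk);
};

/* Hypothetical driver callback: reports that record offload is available. */
static int demo_feature(struct tls_device *device)
{
	(void)device;
	return 1;
}

/* Fill in a device that implements only .feature;
 * .hash and .unhash stay NULL, which the structure allows. */
static void demo_device_init(struct tls_device *dev)
{
	memset(dev, 0, sizeof(*dev));
	strncpy(dev->name, "demo-tls", TLS_DEVICE_NAME_MAX - 1);
	dev->feature = demo_feature;
}

/* Same shape as the check in tls_hw_prot(): a missing callback
 * simply means "no capability". */
static int device_offers_offload(struct tls_device *dev)
{
	return dev->feature && dev->feature(dev);
}
```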
struct tls_sw_context {
struct crypto_aead *aead_send;
@@ -114,7 +140,7 @@ struct tls_context {
void *priv_ctx;
- u8 conf:2;
+ u8 conf:3;
struct cipher_context tx;
struct cipher_context rx;
@@ -135,6 +161,8 @@ struct tls_context {
int (*getsockopt)(struct sock *sk, int level,
int optname, char __user *optval,
int __user *optlen);
int (*hash)(struct sock *sk);
void (*unhash)(struct sock *sk);
};
int wait_on_pending_writer(struct sock *sk, long *timeo);
@@ -283,5 +311,7 @@ static inline struct tls_offload_context *tls_offload_ctx(
int tls_proccess_cmsg(struct sock *sk, struct msghdr *msg,
unsigned char *record_type);
void tls_register_device(struct tls_device *device);
void tls_unregister_device(struct tls_device *device);
#endif /* _TLS_OFFLOAD_H */
@@ -108,6 +108,7 @@ static const char netdev_features_strings[NETDEV_FEATURE_COUNT][ETH_GSTRING_LEN]
[NETIF_F_HW_ESP_BIT] = "esp-hw-offload",
[NETIF_F_HW_ESP_TX_CSUM_BIT] = "esp-tx-csum-hw-offload",
[NETIF_F_RX_UDP_TUNNEL_PORT_BIT] = "rx-udp_tunnel-port-offload",
[NETIF_F_HW_TLS_RECORD_BIT] = "tls-hw-record",
};
static const char
@@ -332,6 +332,7 @@ void tcp_time_wait(struct sock *sk, int state, int timeo)
tcp_update_metrics(sk);
tcp_done(sk);
}
EXPORT_SYMBOL(tcp_time_wait);
void tcp_twsk_destructor(struct sock *sk)
{
@@ -38,6 +38,7 @@
#include <linux/highmem.h>
#include <linux/netdevice.h>
#include <linux/sched/signal.h>
#include <linux/inetdevice.h>
#include <net/tls.h>
@@ -56,11 +57,14 @@ enum {
TLS_SW_TX,
TLS_SW_RX,
TLS_SW_RXTX,
TLS_HW_RECORD,
TLS_NUM_CONFIG,
};
static struct proto *saved_tcpv6_prot;
static DEFINE_MUTEX(tcpv6_prot_mutex);
static LIST_HEAD(device_list);
static DEFINE_MUTEX(device_mutex);
static struct proto tls_prots[TLS_NUM_PROTS][TLS_NUM_CONFIG];
static struct proto_ops tls_sw_proto_ops;
@@ -241,8 +245,12 @@ static void tls_sk_proto_close(struct sock *sk, long timeout)
lock_sock(sk);
sk_proto_close = ctx->sk_proto_close;
if (ctx->conf == TLS_HW_RECORD)
goto skip_tx_cleanup;
if (ctx->conf == TLS_BASE) {
kfree(ctx);
ctx = NULL;
goto skip_tx_cleanup;
}
@@ -276,6 +284,11 @@ static void tls_sk_proto_close(struct sock *sk, long timeout)
skip_tx_cleanup:
release_sock(sk);
sk_proto_close(sk, timeout);
/* For TLS_HW_RECORD, free ctx only now: tcp_set_state still
 * needs it when it calls sk->sk_prot->unhash [tls_hw_unhash]
 */
if (ctx && ctx->conf == TLS_HW_RECORD)
kfree(ctx);
}
static int do_tls_getsockopt_tx(struct sock *sk, char __user *optval,
@@ -493,6 +506,79 @@ static int tls_setsockopt(struct sock *sk, int level, int optname,
return do_tls_setsockopt(sk, optname, optval, optlen);
}
static struct tls_context *create_ctx(struct sock *sk)
{
struct inet_connection_sock *icsk = inet_csk(sk);
struct tls_context *ctx;
ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
if (!ctx)
return NULL;
icsk->icsk_ulp_data = ctx;
return ctx;
}
static int tls_hw_prot(struct sock *sk)
{
struct tls_context *ctx;
struct tls_device *dev;
int rc = 0;
mutex_lock(&device_mutex);
list_for_each_entry(dev, &device_list, dev_list) {
if (dev->feature && dev->feature(dev)) {
ctx = create_ctx(sk);
if (!ctx)
goto out;
ctx->hash = sk->sk_prot->hash;
ctx->unhash = sk->sk_prot->unhash;
ctx->sk_proto_close = sk->sk_prot->close;
ctx->conf = TLS_HW_RECORD;
update_sk_prot(sk, ctx);
rc = 1;
break;
}
}
out:
mutex_unlock(&device_mutex);
return rc;
}
static void tls_hw_unhash(struct sock *sk)
{
struct tls_context *ctx = tls_get_ctx(sk);
struct tls_device *dev;
mutex_lock(&device_mutex);
list_for_each_entry(dev, &device_list, dev_list) {
if (dev->unhash)
dev->unhash(dev, sk);
}
mutex_unlock(&device_mutex);
ctx->unhash(sk);
}
static int tls_hw_hash(struct sock *sk)
{
struct tls_context *ctx = tls_get_ctx(sk);
struct tls_device *dev;
int err;
err = ctx->hash(sk);
mutex_lock(&device_mutex);
list_for_each_entry(dev, &device_list, dev_list) {
if (dev->hash)
err |= dev->hash(dev, sk);
}
mutex_unlock(&device_mutex);
if (err)
tls_hw_unhash(sk);
return err;
}
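The control flow of `tls_hw_hash()` and `tls_hw_unhash()` can be modelled in userspace: hash the socket with the saved base routine, give every registered device a chance to program its listen state, and undo everything if any device reports an error. The sketch below works under stated assumptions. `struct sock` is a two-field stub, the base `ctx->hash`/`ctx->unhash` calls are replaced by toggling `sk->hashed` (and assumed to succeed), locking is omitted, and all `model_*`, `ok_*` and `failing_*` names are hypothetical.

```c
#include <stddef.h>

struct sock {
	int hashed;		/* stands in for the base proto's hash state */
	int hw_listeners;	/* devices that programmed listen state */
};

struct tls_device {
	struct tls_device *next;	/* stand-in for dev_list */
	int (*hash)(struct tls_device *device, struct sock *sk);
	void (*unhash)(struct tls_device *device, struct sock *sk);
};

static struct tls_device *device_list;

/* Model of tls_hw_unhash(): every device with an unhash callback
 * cleans up, then the base unhash runs. */
static void model_unhash(struct sock *sk)
{
	struct tls_device *dev;

	for (dev = device_list; dev; dev = dev->next)
		if (dev->unhash)
			dev->unhash(dev, sk);
	sk->hashed = 0;		/* stands in for ctx->unhash(sk) */
}

/* Model of tls_hw_hash(): errors from all devices are OR-ed together,
 * and any nonzero result rolls the socket back via model_unhash(). */
static int model_hash(struct sock *sk)
{
	struct tls_device *dev;
	int err = 0;

	sk->hashed = 1;		/* stands in for ctx->hash(sk) */
	for (dev = device_list; dev; dev = dev->next)
		if (dev->hash)
			err |= dev->hash(dev, sk);
	if (err)
		model_unhash(sk);
	return err;
}

/* Hypothetical device callbacks used to exercise the model. */
static int ok_hash(struct tls_device *device, struct sock *sk)
{
	(void)device;
	sk->hw_listeners++;
	return 0;
}

static int failing_hash(struct tls_device *device, struct sock *sk)
{
	(void)device;
	(void)sk;
	return -1;
}

static void ok_unhash(struct tls_device *device, struct sock *sk)
{
	(void)device;
	if (sk->hw_listeners > 0)
		sk->hw_listeners--;
}
```

One consequence of OR-ing the return values is that the caller learns only that some device failed, not which one or with which error code.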
static void build_protos(struct proto *prot, struct proto *base)
{
prot[TLS_BASE] = *base;
@@ -511,15 +597,22 @@ static void build_protos(struct proto *prot, struct proto *base)
prot[TLS_SW_RXTX] = prot[TLS_SW_TX];
prot[TLS_SW_RXTX].recvmsg = tls_sw_recvmsg;
prot[TLS_SW_RXTX].close = tls_sk_proto_close;
prot[TLS_HW_RECORD] = *base;
prot[TLS_HW_RECORD].hash = tls_hw_hash;
prot[TLS_HW_RECORD].unhash = tls_hw_unhash;
prot[TLS_HW_RECORD].close = tls_sk_proto_close;
}
static int tls_init(struct sock *sk)
{
int ip_ver = sk->sk_family == AF_INET6 ? TLSV6 : TLSV4;
- struct inet_connection_sock *icsk = inet_csk(sk);
struct tls_context *ctx;
int rc = 0;
if (tls_hw_prot(sk))
goto out;
/* The TLS ulp is currently supported only for TCP sockets
* in ESTABLISHED state.
* Supporting sockets in LISTEN state will require us
@@ -530,12 +623,11 @@ static int tls_init(struct sock *sk)
return -ENOTSUPP;
/* allocate tls context */
- ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
+ ctx = create_ctx(sk);
if (!ctx) {
rc = -ENOMEM;
goto out;
}
- icsk->icsk_ulp_data = ctx;
ctx->setsockopt = sk->sk_prot->setsockopt;
ctx->getsockopt = sk->sk_prot->getsockopt;
ctx->sk_proto_close = sk->sk_prot->close;
@@ -557,6 +649,22 @@ static int tls_init(struct sock *sk)
return rc;
}
void tls_register_device(struct tls_device *device)
{
mutex_lock(&device_mutex);
list_add_tail(&device->dev_list, &device_list);
mutex_unlock(&device_mutex);
}
EXPORT_SYMBOL(tls_register_device);
void tls_unregister_device(struct tls_device *device)
{
mutex_lock(&device_mutex);
list_del(&device->dev_list);
mutex_unlock(&device_mutex);
}
EXPORT_SYMBOL(tls_unregister_device);
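`tls_register_device()` appends with `list_add_tail()`, so devices are walked in registration order. `tls_hw_prot()` therefore settles on the first registered device whose `feature` callback returns nonzero. The ordering can be illustrated with the userspace sketch below. It uses a singly linked list instead of `struct list_head`, leaves out `device_mutex`, and uses hypothetical `model_*` names. Unlike `list_del()`, the model tolerates unregistering a device that was never registered.

```c
#include <stddef.h>

struct tls_device {
	const char *name;
	struct tls_device *next;	/* stand-in for struct list_head dev_list */
};

static struct tls_device *device_list;	/* head of registered devices */

/* Append at the tail, like list_add_tail(). */
static void model_register_device(struct tls_device *device)
{
	struct tls_device **p = &device_list;

	while (*p)
		p = &(*p)->next;
	device->next = NULL;
	*p = device;
}

/* Unlink the device if present; no-op otherwise (a simplification). */
static void model_unregister_device(struct tls_device *device)
{
	struct tls_device **p = &device_list;

	while (*p && *p != device)
		p = &(*p)->next;
	if (*p) {
		*p = device->next;
		device->next = NULL;
	}
}

/* The device a list walk would visit first. */
static struct tls_device *model_first_device(void)
{
	return device_list;
}
```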
static struct tcp_ulp_ops tcp_tls_ulp_ops __read_mostly = {
.name = "tls",
.uid = TCP_ULP_TLS,