Commit ccd5d1b9 authored by Linus Torvalds

Merge tag 'ntb-4.13' of git://github.com/jonmason/ntb

Pull NTB updates from Jon Mason:
 "The major change in the series is a rework of the NTB infrastructure
  to allow for IDT hardware to be supported (and resulting fallout from
  that). There are also a few clean-ups, etc.

  New IDT NTB driver and changes to the NTB infrastructure to allow for
  this different kind of NTB HW, some style fixes (per Greg KH
  recommendation), and some ntb_test tweaks"

* tag 'ntb-4.13' of git://github.com/jonmason/ntb:
  ntb_netdev: set the net_device's parent
  ntb: Add error path/handling to Debug FS entry creation
  ntb: Add more debugfs support for ntb_perf testing options
  ntb: Remove debug-fs variables from the context structure
  ntb: Add a module option to control affinity of DMA channels
  NTB: Add IDT 89HPESxNTx PCIe-switches support
  ntb_hw_intel: Style fixes: open code macros that just obfuscate code
  ntb_hw_amd: Style fixes: open code macros that just obfuscate code
  NTB: Add ntb.h comments
  NTB: Add PCIe Gen4 link speed
  NTB: Add new Memory Windows API documentation
  NTB: Add Messaging NTB API
  NTB: Alter Scratchpads API to support multi-ports devices
  NTB: Alter MW API to support multi-ports devices
  NTB: Alter link-state API to support multi-port devices
  NTB: Add indexed ports NTB API
  NTB: Make link-state API being declared first
  NTB: ntb_test: add parameter for doorbell bitmask
  NTB: ntb_test: modprobe on remote host
parents 4d25ec19 854b1dd9
# NTB Drivers

NTB (Non-Transparent Bridge) is a type of PCI-Express bridge chip that connects
the separate memory systems of two or more computers to the same PCI-Express
fabric. Existing NTB hardware supports a common feature set: doorbell
registers and memory translation windows, as well as non-common features like
scratchpad and message registers. Scratchpad registers are read-and-writable
registers that are accessible from either side of the device, so that peers can
exchange a small amount of information at a fixed address. Message registers
can be utilized for the same purpose. Additionally, they are provided with
special status bits to make sure the information isn't rewritten by another
peer. Doorbell registers provide a way for peers to send interrupt events.
Memory windows allow translated read and write access to the peer memory.
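As a quick orientation, here is a minimal sketch (not taken from this series) of how a two-port client might use these primitives. It assumes a registered NTB client that already holds the struct ntb_dev passed to its probe callback; the helper names example_announce()/example_fetch() and the MY_SPAD_IDX/MY_DB_BIT constants are made up for illustration.

```c
#include <linux/ntb.h>

#define MY_SPAD_IDX	0	/* arbitrary scratchpad slot agreed with the peer */
#define MY_DB_BIT	0x1ULL	/* arbitrary doorbell bit agreed with the peer */

/* Publish one 32-bit value to the peer and ring its doorbell. */
static int example_announce(struct ntb_dev *ntb, u32 val)
{
	if (ntb_spad_count(ntb) < 1)
		return -EINVAL;

	ntb_peer_spad_write(ntb, NTB_DEF_PEER_IDX, MY_SPAD_IDX, val);
	return ntb_peer_db_set(ntb, MY_DB_BIT);
}

/* Read back what the peer wrote into our local scratchpad. */
static u32 example_fetch(struct ntb_dev *ntb)
{
	return ntb_spad_read(ntb, MY_SPAD_IDX);
}
```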
## NTB Core Driver (ntb)
...@@ -26,6 +28,87 @@ as ntb hardware, or hardware drivers, are inserted and removed. The
registration uses the Linux Device framework, so it should feel familiar to
anyone who has written a pci driver.
### NTB Typical client driver implementation
The primary purpose of NTB is to share a piece of memory between at least two
systems, so NTB device features like scratchpad/message registers are mainly
used to perform proper memory window initialization. Typically there are two
types of memory window interfaces supported by the NTB API: inbound
translation, configured on the local ntb port, and outbound translation,
configured by the peer on the peer ntb port. The first type is depicted in the
following figure.
Inbound translation:

 Memory:              Local NTB Port:      Peer NTB Port:      Peer MMIO:
  ____________
 | dma-mapped |-ntb_mw_set_trans(addr)  |
 | memory     |        _v____________   |   ______________
 | (addr)     |<======| MW xlat addr |<====| MW base addr |<== memory-mapped IO
 |------------|       |--------------|  |  |--------------|
A typical initialization scenario for the first type of memory window looks
like this: 1) allocate a memory region, 2) put the translated address into the
NTB configuration, 3) somehow notify the peer device of the performed
initialization, 4) the peer device maps the corresponding outbound memory
window to get access to the shared memory region.
The second type of interface, which implies the shared windows being
initialized by a peer device, is depicted in the following figure:
Outbound translation:

 Memory:        Local NTB Port:    Peer NTB Port:      Peer MMIO:
  ____________                      ______________
 | dma-mapped |                    | MW base addr |<== memory-mapped IO
 | memory     |                    |--------------|
 | (addr)     |<===================| MW xlat addr |<-ntb_peer_mw_set_trans(addr)
 |------------|                    |--------------|
A typical initialization scenario for the second type of interface would be:
1) allocate a memory region, 2) somehow deliver the translated address to the
peer device, 3) the peer puts the translated address into its NTB
configuration, 4) the peer device maps its outbound memory window to get
access to the shared memory region.

As one can see, the described scenarios can be combined into one portable
algorithm.
Local device:
1) Allocate memory for a shared window
2) Initialize the memory window with the translated address of the allocated
region (it may fail if local memory window initialization is unsupported)
3) Send the translated address and memory window index to the peer device

Peer device:
1) Initialize the memory window with the retrieved address of the memory
region allocated by the other device (it may fail if peer memory window
initialization is unsupported)
2) Map the outbound memory window
In accordance with this scenario, the NTB Memory Window API can be used as
follows:
Local device:
1) ntb_mw_count(pidx) - retrieve the number of memory ranges that can be
allocated for memory windows between the local device and the peer device
with the specified port index.
2) ntb_mw_get_align(pidx, midx) - retrieve the parameters restricting the
alignment and size of the shared memory region, so the memory can be
properly allocated.
3) Allocate a physically contiguous memory region in compliance with the
restrictions retrieved in 2).
4) ntb_mw_set_trans(pidx, midx) - try to set the translation address of the
memory window with the specified index for the defined peer device (it may
fail if setting the local translated address is not supported).
5) Send the translated base address (usually together with the memory window
number) to the peer device using, for instance, scratchpad or message
registers.
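In a client driver, the local-side sequence above might look roughly like the following sketch. This is illustrative only: example_setup_inbound_mw() is a hypothetical helper, the size handling is simplified, and publishing the translated address through scratchpads 0/1 is just one possible convention between the two hosts.

```c
#include <linux/ntb.h>
#include <linux/dma-mapping.h>
#include <linux/kernel.h>

/* Hypothetical helper: allocate a buffer and expose it through inbound
 * memory window "midx" towards the peer with port index "pidx". */
static int example_setup_inbound_mw(struct ntb_dev *ntb, int pidx, int midx,
				    resource_size_t size)
{
	resource_size_t addr_align, size_align, size_max;
	dma_addr_t dma_addr;
	void *vbase;
	int rc;

	/* 1) make sure the requested window exists */
	if (midx >= ntb_mw_count(ntb, pidx))
		return -EINVAL;

	/* 2) learn the restrictions the hardware puts on this window */
	rc = ntb_mw_get_align(ntb, pidx, midx, &addr_align,
			      &size_align, &size_max);
	if (rc)
		return rc;

	if (size_align)
		size = round_up(size, size_align);
	if (size_max && size > size_max)
		return -EINVAL;

	/* 3) allocate a physically contiguous shared buffer */
	vbase = dma_alloc_coherent(&ntb->pdev->dev, size, &dma_addr,
				   GFP_KERNEL);
	if (!vbase)
		return -ENOMEM;

	/* 4) program the inbound translation (may be unsupported) */
	rc = ntb_mw_set_trans(ntb, pidx, midx, dma_addr, size);
	if (rc)
		goto err_free;

	/* 5) hand the translated address to the peer, e.g. via scratchpads */
	ntb_peer_spad_write(ntb, pidx, 0, lower_32_bits(dma_addr));
	ntb_peer_spad_write(ntb, pidx, 1, upper_32_bits(dma_addr));

	return 0;

err_free:
	dma_free_coherent(&ntb->pdev->dev, size, vbase, dma_addr);
	return rc;
}
```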
Peer device:
1) ntb_peer_mw_set_trans(pidx, midx) - try to set the translated address
received from the other device (related to pidx) for the specified memory
window. It may fail if the retrieved address, for instance, exceeds the
maximum possible address or isn't properly aligned.
2) ntb_peer_mw_get_addr(widx) - retrieve the MMIO address at which to map the
memory window, so as to have access to the shared memory.
It is also worth noting that ntb_mw_count(pidx) should return the same value
as ntb_peer_mw_count() on the peer with port index pidx.
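The peer-side counterpart could then look like the sketch below; again, example_map_outbound_mw() is a hypothetical helper, and mapping the aperture with ioremap_wc() is only one reasonable choice.

```c
#include <linux/ntb.h>
#include <linux/io.h>

/* Hypothetical helper: point outbound window "midx" at the translated
 * address received from the peer and map its MMIO aperture. */
static void __iomem *example_map_outbound_mw(struct ntb_dev *ntb, int pidx,
					     int midx, u64 addr, u64 size)
{
	phys_addr_t mw_base;
	resource_size_t mw_size;
	int rc;

	/* 1) set the peer's translated address for this window (may fail
	 *    if it is misaligned or out of range) */
	rc = ntb_peer_mw_set_trans(ntb, pidx, midx, addr, size);
	if (rc)
		return NULL;

	/* 2) find the local MMIO aperture of the window and map it */
	rc = ntb_peer_mw_get_addr(ntb, midx, &mw_base, &mw_size);
	if (rc)
		return NULL;

	return ioremap_wc(mw_base, mw_size);
}
```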
### NTB Transport Client (ntb\_transport) and NTB Netdev (ntb\_netdev)
The primary client for NTB is the Transport client, used in tandem with NTB
...
...@@ -9381,6 +9381,12 @@ F: include/linux/ntb.h
F: include/linux/ntb_transport.h
F: tools/testing/selftests/ntb/
NTB IDT DRIVER
M: Serge Semin <fancer.lancer@gmail.com>
L: linux-ntb@googlegroups.com
S: Supported
F: drivers/ntb/hw/idt/
NTB INTEL DRIVER
M: Jon Mason <jdmason@kudzu.us>
M: Dave Jiang <dave.jiang@intel.com>
...
...@@ -418,6 +418,8 @@ static int ntb_netdev_probe(struct device *client_dev)
if (!ndev)
return -ENOMEM;
SET_NETDEV_DEV(ndev, client_dev);
dev = netdev_priv(ndev);
dev->ndev = ndev;
dev->pdev = pdev;
...
source "drivers/ntb/hw/amd/Kconfig"
source "drivers/ntb/hw/idt/Kconfig"
source "drivers/ntb/hw/intel/Kconfig"
obj-$(CONFIG_NTB_AMD) += amd/
obj-$(CONFIG_NTB_IDT) += idt/
obj-$(CONFIG_NTB_INTEL) += intel/
This diff is collapsed.
...@@ -211,9 +211,6 @@ struct amd_ntb_dev {
struct dentry *debugfs_info;
};
#define ndev_pdev(ndev) ((ndev)->ntb.pdev)
#define ndev_name(ndev) pci_name(ndev_pdev(ndev))
#define ndev_dev(ndev) (&ndev_pdev(ndev)->dev)
#define ntb_ndev(__ntb) container_of(__ntb, struct amd_ntb_dev, ntb)
#define hb_ndev(__work) container_of(__work, struct amd_ntb_dev, hb_timer.work)
...
config NTB_IDT
tristate "IDT PCIe-switch Non-Transparent Bridge support"
depends on PCI
help
This driver supports NTB-capable IDT PCIe-switches.
Some of the pre-initializations must be made before IDT PCIe-switch
exposes its NT-functions correctly. It should be done by either proper
initialisation of EEPROM connected to master smbus of the switch or
by BIOS using slave-SMBus interface changing corresponding registers
value. Evidently it must be done before PCI bus enumeration is
finished in Linux kernel.
First of all partitions must be activated and properly assigned to all
the ports with NT-functions intended to be activated (see SWPARTxCTL
and SWPORTxCTL registers). Then all NT-function BARs must be enabled
with chosen valid aperture. For memory windows related BARs the
aperture settings shall determine the maximum size of memory windows
accepted by a BAR. Note that BAR0 must map PCI configuration space
registers.
It's worth noting that, since a part of this driver relies on the
BAR settings of peer NT-functions, the BAR setups can't be done over
kernel PCI fixups. That's why the alternative pre-initialization
techniques like BIOS using SMBus interface or EEPROM should be
utilized. Additionally if one needs to have temperature sensor
information printed to system log, the corresponding registers must
be initialized within BIOS/EEPROM as well.
If unsure, say N.
obj-$(CONFIG_NTB_IDT) += ntb_hw_idt.o
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
...@@ -382,9 +382,6 @@ struct intel_ntb_dev {
struct dentry *debugfs_info;
};
#define ndev_pdev(ndev) ((ndev)->ntb.pdev)
#define ndev_name(ndev) pci_name(ndev_pdev(ndev))
#define ndev_dev(ndev) (&ndev_pdev(ndev)->dev)
#define ntb_ndev(__ntb) container_of(__ntb, struct intel_ntb_dev, ntb)
#define hb_ndev(__work) container_of(__work, struct intel_ntb_dev, \
hb_timer.work)
...
...@@ -5,6 +5,7 @@
* GPL LICENSE SUMMARY
*
* Copyright (C) 2015 EMC Corporation. All Rights Reserved.
* Copyright (C) 2016 T-Platforms. All Rights Reserved.
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of version 2 of the GNU General Public License as
...@@ -18,6 +19,7 @@
* BSD LICENSE
*
* Copyright (C) 2015 EMC Corporation. All Rights Reserved.
* Copyright (C) 2016 T-Platforms. All Rights Reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
...@@ -191,6 +193,73 @@ void ntb_db_event(struct ntb_dev *ntb, int vector)
}
EXPORT_SYMBOL(ntb_db_event);
void ntb_msg_event(struct ntb_dev *ntb)
{
unsigned long irqflags;
spin_lock_irqsave(&ntb->ctx_lock, irqflags);
{
if (ntb->ctx_ops && ntb->ctx_ops->msg_event)
ntb->ctx_ops->msg_event(ntb->ctx);
}
spin_unlock_irqrestore(&ntb->ctx_lock, irqflags);
}
EXPORT_SYMBOL(ntb_msg_event);
int ntb_default_port_number(struct ntb_dev *ntb)
{
switch (ntb->topo) {
case NTB_TOPO_PRI:
case NTB_TOPO_B2B_USD:
return NTB_PORT_PRI_USD;
case NTB_TOPO_SEC:
case NTB_TOPO_B2B_DSD:
return NTB_PORT_SEC_DSD;
default:
break;
}
return -EINVAL;
}
EXPORT_SYMBOL(ntb_default_port_number);
int ntb_default_peer_port_count(struct ntb_dev *ntb)
{
return NTB_DEF_PEER_CNT;
}
EXPORT_SYMBOL(ntb_default_peer_port_count);
int ntb_default_peer_port_number(struct ntb_dev *ntb, int pidx)
{
if (pidx != NTB_DEF_PEER_IDX)
return -EINVAL;
switch (ntb->topo) {
case NTB_TOPO_PRI:
case NTB_TOPO_B2B_USD:
return NTB_PORT_SEC_DSD;
case NTB_TOPO_SEC:
case NTB_TOPO_B2B_DSD:
return NTB_PORT_PRI_USD;
default:
break;
}
return -EINVAL;
}
EXPORT_SYMBOL(ntb_default_peer_port_number);
int ntb_default_peer_port_idx(struct ntb_dev *ntb, int port)
{
int peer_port = ntb_default_peer_port_number(ntb, NTB_DEF_PEER_IDX);
if (peer_port == -EINVAL || port != peer_port)
return -EINVAL;
return 0;
}
EXPORT_SYMBOL(ntb_default_peer_port_idx);
static int ntb_probe(struct device *dev)
{
struct ntb_dev *ntb;
...
...@@ -95,6 +95,9 @@ MODULE_PARM_DESC(use_dma, "Use DMA engine to perform large data copy");
static struct dentry *nt_debugfs_dir;
/* Only two-ports NTB devices are supported */
#define PIDX NTB_DEF_PEER_IDX
struct ntb_queue_entry {
/* ntb_queue list reference */
struct list_head entry;
...@@ -670,7 +673,7 @@ static void ntb_free_mw(struct ntb_transport_ctx *nt, int num_mw)
if (!mw->virt_addr)
return;
ntb_mw_clear_trans(nt->ndev, PIDX, num_mw);
dma_free_coherent(&pdev->dev, mw->buff_size,
mw->virt_addr, mw->dma_addr);
mw->xlat_size = 0;
...@@ -727,7 +730,8 @@ static int ntb_set_mw(struct ntb_transport_ctx *nt, int num_mw,
}
/* Notify HW the memory location of the receive buffer */
rc = ntb_mw_set_trans(nt->ndev, PIDX, num_mw, mw->dma_addr,
mw->xlat_size);
if (rc) {
dev_err(&pdev->dev, "Unable to set mw%d translation", num_mw);
ntb_free_mw(nt, num_mw);
...@@ -858,17 +862,17 @@ static void ntb_transport_link_work(struct work_struct *work)
size = max_mw_size;
spad = MW0_SZ_HIGH + (i * 2);
ntb_peer_spad_write(ndev, PIDX, spad, upper_32_bits(size));
spad = MW0_SZ_LOW + (i * 2);
ntb_peer_spad_write(ndev, PIDX, spad, lower_32_bits(size));
}
ntb_peer_spad_write(ndev, PIDX, NUM_MWS, nt->mw_count);
ntb_peer_spad_write(ndev, PIDX, NUM_QPS, nt->qp_count);
ntb_peer_spad_write(ndev, PIDX, VERSION, NTB_TRANSPORT_VERSION);
/* Query the remote side for its info */
val = ntb_spad_read(ndev, VERSION);
...@@ -944,7 +948,7 @@ static void ntb_qp_link_work(struct work_struct *work)
val = ntb_spad_read(nt->ndev, QP_LINKS);
ntb_peer_spad_write(nt->ndev, PIDX, QP_LINKS, val | BIT(qp->qp_num));
/* query remote spad for qp ready bits */
dev_dbg_ratelimited(&pdev->dev, "Remote QP link status = %x\n", val);
...@@ -1055,7 +1059,12 @@ static int ntb_transport_probe(struct ntb_client *self, struct ntb_dev *ndev)
int node;
int rc, i;
mw_count = ntb_mw_count(ndev, PIDX);
if (!ndev->ops->mw_set_trans) {
dev_err(&ndev->dev, "Inbound MW based NTB API is required\n");
return -EINVAL;
}
if (ntb_db_is_unsafe(ndev))
dev_dbg(&ndev->dev,
...@@ -1064,6 +1073,9 @@ static int ntb_transport_probe(struct ntb_client *self, struct ntb_dev *ndev)
dev_dbg(&ndev->dev,
"scratchpad is unsafe, proceed anyway...\n");
if (ntb_peer_port_count(ndev) != NTB_DEF_PEER_CNT)
dev_warn(&ndev->dev, "Multi-port NTB devices unsupported\n");
node = dev_to_node(&ndev->dev);
nt = kzalloc_node(sizeof(*nt), GFP_KERNEL, node);
...@@ -1094,8 +1106,13 @@ static int ntb_transport_probe(struct ntb_client *self, struct ntb_dev *ndev)
for (i = 0; i < mw_count; i++) {
mw = &nt->mw_vec[i];
rc = ntb_mw_get_align(ndev, PIDX, i, &mw->xlat_align,
&mw->xlat_align_size, NULL);
if (rc)
goto err1;
rc = ntb_peer_mw_get_addr(ndev, i, &mw->phys_addr,
&mw->phys_size);
if (rc)
goto err1;
...@@ -2091,8 +2108,7 @@ void ntb_transport_link_down(struct ntb_transport_qp *qp)
val = ntb_spad_read(qp->ndev, QP_LINKS);
ntb_peer_spad_write(qp->ndev, PIDX, QP_LINKS, val & ~BIT(qp->qp_num));
if (qp->link_is_up)
ntb_send_link_down(qp);
...
...@@ -76,6 +76,7 @@
#define DMA_RETRIES 20
#define SZ_4G (1ULL << 32)
#define MAX_SEG_ORDER 20 /* no larger than 1M for kmalloc buffer */
#define PIDX NTB_DEF_PEER_IDX
MODULE_LICENSE(DRIVER_LICENSE);
MODULE_VERSION(DRIVER_VERSION);
...@@ -100,6 +101,10 @@ static bool use_dma; /* default to 0 */
module_param(use_dma, bool, 0644);
MODULE_PARM_DESC(use_dma, "Using DMA engine to measure performance");
static bool on_node = true; /* default to 1 */
module_param(on_node, bool, 0644);
MODULE_PARM_DESC(on_node, "Run threads only on NTB device node (default: true)");
struct perf_mw {
phys_addr_t phys_addr;
resource_size_t phys_size;
...@@ -135,9 +140,6 @@ struct perf_ctx {
bool link_is_up;
struct delayed_work link_work;
wait_queue_head_t link_wq;
struct dentry *debugfs_node_dir;
struct dentry *debugfs_run;
struct dentry *debugfs_threads;
u8 perf_threads;
/* mutex ensures only one set of threads run at once */
struct mutex run_mutex;
...@@ -344,6 +346,10 @@ static int perf_move_data(struct pthr_ctx *pctx, char __iomem *dst, char *src,
static bool perf_dma_filter_fn(struct dma_chan *chan, void *node)
{
/* Is the channel required to be on the same node as the device? */
if (!on_node)
return true;
return dev_to_node(&chan->dev->device) == (int)(unsigned long)node;
}
...@@ -361,7 +367,7 @@ static int ntb_perf_thread(void *data)
pr_debug("kthread %s starting...\n", current->comm);
node = on_node ? dev_to_node(&pdev->dev) : NUMA_NO_NODE;
if (use_dma && !pctx->dma_chan) {
dma_cap_mask_t dma_mask;
...@@ -454,7 +460,7 @@ static void perf_free_mw(struct perf_ctx *perf)
if (!mw->virt_addr)
return;
ntb_mw_clear_trans(perf->ntb, PIDX, 0);
dma_free_coherent(&pdev->dev, mw->buf_size,
mw->virt_addr, mw->dma_addr);
mw->xlat_size = 0;
...@@ -490,7 +496,7 @@ static int perf_set_mw(struct perf_ctx *perf, resource_size_t size)
mw->buf_size = 0;
}
rc = ntb_mw_set_trans(perf->ntb, PIDX, 0, mw->dma_addr, mw->xlat_size);
if (rc) {
dev_err(&perf->ntb->dev, "Unable to set mw0 translation\n");
perf_free_mw(perf);
...@@ -517,9 +523,9 @@ static void perf_link_work(struct work_struct *work)
if (max_mw_size && size > max_mw_size)
size = max_mw_size;
ntb_peer_spad_write(ndev, PIDX, MW_SZ_HIGH, upper_32_bits(size));
ntb_peer_spad_write(ndev, PIDX, MW_SZ_LOW, lower_32_bits(size));
ntb_peer_spad_write(ndev, PIDX, VERSION, PERF_VERSION);
/* now read what peer wrote */
val = ntb_spad_read(ndev, VERSION);
...@@ -561,8 +567,12 @@ static int perf_setup_mw(struct ntb_dev *ntb, struct perf_ctx *perf)
mw = &perf->mw;
rc = ntb_mw_get_align(ntb, PIDX, 0, &mw->xlat_align,
&mw->xlat_align_size, NULL);
if (rc)
return rc;
rc = ntb_peer_mw_get_addr(ntb, 0, &mw->phys_addr, &mw->phys_size);
if (rc)
return rc;
...@@ -677,7 +687,8 @@ static ssize_t debugfs_run_write(struct file *filp, const char __user *ubuf,
pr_info("Fix run_order to %u\n", run_order);
}
node = on_node ? dev_to_node(&perf->ntb->pdev->dev)
: NUMA_NO_NODE;
atomic_set(&perf->tdone, 0);
/* launch kernel thread */
...@@ -723,34 +734,71 @@ static const struct file_operations ntb_perf_debugfs_run = {
static int perf_debugfs_setup(struct perf_ctx *perf)
{
struct pci_dev *pdev = perf->ntb->pdev;
struct dentry *debugfs_node_dir;
struct dentry *debugfs_run;
struct dentry *debugfs_threads;
struct dentry *debugfs_seg_order;
struct dentry *debugfs_run_order;
struct dentry *debugfs_use_dma;
struct dentry *debugfs_on_node;
if (!debugfs_initialized())
return -ENODEV;
/* Assumption: only one NTB device in the system */
if (!perf_debugfs_dir) {
perf_debugfs_dir = debugfs_create_dir(KBUILD_MODNAME, NULL);
if (!perf_debugfs_dir)
return -ENODEV;
}
debugfs_node_dir = debugfs_create_dir(pci_name(pdev),
perf_debugfs_dir);
if (!debugfs_node_dir)
goto err;
debugfs_run = debugfs_create_file("run", S_IRUSR | S_IWUSR,
debugfs_node_dir, perf,
&ntb_perf_debugfs_run);
if (!debugfs_run)
goto err;
debugfs_threads = debugfs_create_u8("threads", S_IRUSR | S_IWUSR,
debugfs_node_dir,
&perf->perf_threads);
if (!debugfs_threads)
goto err;
debugfs_seg_order = debugfs_create_u32("seg_order", 0600,
debugfs_node_dir,
&seg_order);
if (!debugfs_seg_order)
goto err;
debugfs_run_order = debugfs_create_u32("run_order", 0600,
debugfs_node_dir,
&run_order);
if (!debugfs_run_order)
goto err;
debugfs_use_dma = debugfs_create_bool("use_dma", 0600,
debugfs_node_dir,
&use_dma);
if (!debugfs_use_dma)
goto err;
debugfs_on_node = debugfs_create_bool("on_node", 0600,
debugfs_node_dir,
&on_node);
if (!debugfs_on_node)
goto err;
return 0;
err:
debugfs_remove_recursive(perf_debugfs_dir);
perf_debugfs_dir = NULL;
return -ENODEV;
}
static int perf_probe(struct ntb_client *client, struct ntb_dev *ntb)
...@@ -766,8 +814,15 @@ static int perf_probe(struct ntb_client *client, struct ntb_dev *ntb)
return -EIO;
}
if (!ntb->ops->mw_set_trans) {
dev_err(&ntb->dev, "Need inbound MW based NTB API\n");
return -EINVAL;
}
if (ntb_peer_port_count(ntb) != NTB_DEF_PEER_CNT)
dev_warn(&ntb->dev, "Multi-port NTB devices unsupported\n");
node = on_node ? dev_to_node(&pdev->dev) : NUMA_NO_NODE;
perf = kzalloc_node(sizeof(*perf), GFP_KERNEL, node);
if (!perf) {
rc = -ENOMEM;
...
...@@ -90,6 +90,9 @@ static unsigned long db_init = 0x7;
module_param(db_init, ulong, 0644);
MODULE_PARM_DESC(db_init, "Initial doorbell bits to ring on the peer");
/* Only two-ports NTB devices are supported */
#define PIDX NTB_DEF_PEER_IDX
struct pp_ctx {
struct ntb_dev *ntb;
u64 db_bits;
...@@ -135,7 +138,7 @@ static void pp_ping(unsigned long ctx)
"Ping bits %#llx read %#x write %#x\n",
db_bits, spad_rd, spad_wr);
ntb_peer_spad_write(pp->ntb, PIDX, 0, spad_wr);
ntb_peer_db_set(pp->ntb, db_bits);
ntb_db_clear_mask(pp->ntb, db_mask);
...@@ -222,6 +225,12 @@ static int pp_probe(struct ntb_client *client,
}
}
if (ntb_spad_count(ntb) < 1) {
dev_dbg(&ntb->dev, "no enough scratchpads\n");
rc = -EINVAL;
goto err_pp;
}
if (ntb_spad_is_unsafe(ntb)) {
dev_dbg(&ntb->dev, "scratchpad is unsafe\n");
if (!unsafe) {
...@@ -230,6 +239,9 @@ static int pp_probe(struct ntb_client *client,
}
}
if (ntb_peer_port_count(ntb) != NTB_DEF_PEER_CNT)
dev_warn(&ntb->dev, "multi-port NTB is unsupported\n");
pp = kmalloc(sizeof(*pp), GFP_KERNEL);
if (!pp) {
rc = -ENOMEM;
...
This diff is collapsed.
This diff is collapsed.
...@@ -18,6 +18,7 @@ LIST_DEVS=FALSE
DEBUGFS=${DEBUGFS-/sys/kernel/debug}
DB_BITMASK=0x7FFF
PERF_RUN_ORDER=32
MAX_MW_SIZE=0
RUN_DMA_TESTS=
...@@ -38,6 +39,7 @@ function show_help()
echo "be highly recommended."
echo
echo "Options:"
echo " -b BITMASK doorbell clear bitmask for ntb_tool"
echo " -C don't cleanup ntb modules on exit"
echo " -d run dma tests"
echo " -h show this help message"
...@@ -52,8 +54,9 @@ function show_help()
function parse_args()
{
OPTIND=0
while getopts "b:Cdhlm:r:p:w:" opt; do
case "$opt" in
b) DB_BITMASK=${OPTARG} ;;
C) DONT_CLEANUP=1 ;;
d) RUN_DMA_TESTS=1 ;;
h) show_help; exit 0 ;;
...@@ -85,6 +88,10 @@ set -e
function _modprobe()
{
modprobe "$@"
if [[ "$REMOTE_HOST" != "" ]]; then
ssh "$REMOTE_HOST" modprobe "$@"
fi
}
function split_remote()
...@@ -154,7 +161,7 @@ function doorbell_test()
echo "Running db tests on: $(basename $LOC) / $(basename $REM)"
write_file "c $DB_BITMASK" "$REM/db"
for ((i=1; i <= 8; i++)); do
let DB=$(read_file "$REM/db") || true
...