- 21 Nov, 2018 2 commits
-
-
Jason Gunthorpe authored
From git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux mlx5 updates taken for dependencies on later ODP patches. Conflict resolved by deleting mlx5_ib_get_vector_affinity() * branch 'mlx5-next': (21 commits) net/mlx5: EQ, Make EQE access methods inline {net,IB}/mlx5: Move Page fault EQ and ODP logic to RDMA net/mlx5: EQ, Generic EQ net/mlx5: EQ, Different EQ types net/mlx5: EQ, Privatize eq_table and friends net/mlx5: EQ, irq_info and rmap belong to eq_table net/mlx5: EQ, Create all EQs in one place net/mlx5: EQ, Move all EQ logic to eq.c net/mlx5: EQ, Remove redundant completion EQ list lock net/mlx5: EQ, No need to store eq index as a field net/mlx5: EQ, Remove unused fields and structures net/mlx5: EQ, Use the right place to store/read IRQ affinity hint IB/mlx5: Improve ODP debugging messages net/mlx5: Use multi threaded workqueue for page fault handling net/mlx5: Return success for PAGE_FAULT_RESUME in internal error state IB/mlx5: Lock QP during page fault handling net/mlx5: Enumerate page fault types net/mlx5: Add interface to hold and release core resources net/mlx5: Release resource on error flow net/mlx5: Fix offsets of ifc reserved fields ... Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Artemy Kovalyov authored
This is required so the user can set the SL on the DC QP. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Reviewed-by: Yossi Itigin <yosefe@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
- 20 Nov, 2018 12 commits
-
-
Saeed Mahameed authored
These are one/two liner generic EQ access methods, better have them declared static inline in eq.h. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
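A minimal userspace sketch of what one/two-line `static inline` EQE accessors in a header can look like. The `demo_` types, the ring layout, and the ownership-bit convention are simplified assumptions for illustration, not the driver's actual definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified event queue entry; not the mlx5 layout. */
struct demo_eqe {
	uint8_t type;
	uint8_t owner; /* bit 0 flips on every pass over the ring */
};

struct demo_eq {
	struct demo_eqe *buf;
	uint32_t nent;       /* number of entries, a power of two */
	uint32_t cons_index; /* free-running consumer index */
};

/* Return the entry stored in ring slot `entry`. */
static inline struct demo_eqe *demo_eq_get_eqe(struct demo_eq *eq,
					       uint32_t entry)
{
	assert(entry < eq->nent);
	return &eq->buf[entry];
}

/* Return the next software-owned entry, or NULL if none is pending. */
static inline struct demo_eqe *demo_eq_next_eqe_sw(struct demo_eq *eq)
{
	struct demo_eqe *eqe =
		demo_eq_get_eqe(eq, eq->cons_index & (eq->nent - 1));
	uint8_t expected = !!(eq->cons_index & eq->nent);

	return ((eqe->owner & 1) ^ expected) ? NULL : eqe;
}
```

Keeping such helpers `static inline` in a header lets every consumer of the queue share one definition without paying for a function call on the event path.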
-
Saeed Mahameed authored
Use the new generic EQ API to move all ODP RDMA data structures and logic from the mlx5 core driver into the mlx5_ib driver. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
-
Saeed Mahameed authored
Add mlx5_eq_{create/destroy}_generic APIs and EQE access methods for generic EQs owned by mlx5 core consumers. This API will be used in a downstream patch to move the page fault (RDMA ODP) EQ logic into the mlx5_ib RDMA driver, which will then use a generic EQ. Current mlx5 EQ allocation scheme: on load, mlx5 allocates 4 (for async) + #cores (for data completions) MSIX vectors. mlx5 core assigns 3 MSIX vectors to internal async EQs and uses all of the #cores MSIX vectors for completion EQs (one vector is reserved for a generic EQ). After this patch, an external user of mlx5_core (e.g. mlx5_ib) can use this new API to create new generic EQs with the reserved MSIX vector index for that EQ. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
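The vector arithmetic described in this commit message can be sketched as a small helper. The counts (3 async, 1 reserved, one completion vector per core) come from the message; the `demo_` names and the struct are hypothetical.

```c
#include <assert.h>

/* Counts taken from the commit message; the names are hypothetical. */
enum {
	DEMO_ASYNC_EQ_VECTORS   = 3, /* internal async EQs */
	DEMO_GENERIC_EQ_VECTORS = 1, /* reserved for one generic EQ */
};

struct demo_vectors {
	unsigned int async;
	unsigned int generic;
	unsigned int completion;
	unsigned int total;
};

/* Split the MSIX vectors for a machine with `ncores` cores. */
static struct demo_vectors demo_split_vectors(unsigned int ncores)
{
	struct demo_vectors v;

	assert(ncores > 0);
	v.async = DEMO_ASYNC_EQ_VECTORS;
	v.generic = DEMO_GENERIC_EQ_VECTORS;
	v.completion = ncores; /* one completion EQ per core */
	v.total = v.async + v.generic + v.completion;
	return v;
}
```

For an 8-core machine this gives 12 vectors in total: 3 async, 1 reserved for a generic EQ, and 8 for completions.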
-
Saeed Mahameed authored
In mlx5 we have three types of usages for EQs: 1. Asynchronous EQs, used internally by mlx5 core for (a) FW command completions, (b) FW page requests, and (c) one EQ for all other asynchronous events. 2. Completion EQs, used for CQ completion (we create one per core). 3. *Special type of EQ (page fault) used for RDMA on-demand paging (ODP). *The 3rd type shouldn't be special, at least in mlx5 core; it is yet another async events EQ with a specific use case. It will be removed in the next two patches, and its logic will move completely to mlx5_ib, as it is RDMA specific. In this patch we move use case (EQ type) specific fields from struct mlx5_eq into new EQ type specific structures: struct mlx5_eq_async; struct mlx5_eq_comp; struct mlx5_eq_pagefault; and separate their type specific flows. In the future we will allow users to create their own generic EQs; for now we will allow only one, for ODP, in the next patches. We will introduce an event listener registration API for those who want to receive mlx5 async events. After that, mlx5 EQ handling will be clean of feature/user specific handling. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
-
Saeed Mahameed authored
Move unnecessary EQ table structures and declaration from the public include/linux/mlx5/driver.h into the private area of mlx5_core and into eq.c/eq.h. Introduce new mlx5 EQ APIs: mlx5_comp_vectors_count(dev); mlx5_comp_irq_get_affinity_mask(dev, vector); And use them from mlx5_ib or mlx5e netdevice instead of direct access to mlx5_core internal structures. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
-
Saeed Mahameed authored
irq_info and rmap are EQ properties of the driver, and only needed for EQ objects, move them to the eq_table EQs database structure. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
-
Saeed Mahameed authored
Instead of creating the EQ table in three steps at driver load (allocate irq vectors, allocate async EQs, allocate completion EQs), gather all of the procedures into one function in eq.c and call it from driver load. This will help us reduce the EQ and EQ table private structures visibility to eq.c in downstream refactoring. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
-
Saeed Mahameed authored
Move completion EQs flows from main.c to eq.c, reasons: 1) It is where this logic belongs. 2) It will help centralize the EQ logic in one file for downstream refactoring, and future extensions/updates. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
-
Saeed Mahameed authored
Completion EQs list is only modified on driver load/unload, locking is not required, remove it. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
-
Saeed Mahameed authored
eq->index is used only for completion EQs and is assigned to be the completion eq index, it is used only when traversing the completion eqs list, and it can be calculated dynamically, thus remove the eq->index field. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
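A short sketch of how an index can be derived while traversing a list instead of being stored in each element. The singly linked list and the `demo_` names are assumptions made for a self-contained example; the driver uses its own list type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical completion EQ node: note there is no stored index. */
struct demo_comp_eq {
	int irqn;
	struct demo_comp_eq *next;
};

/*
 * Look up the IRQ number for completion vector `vector` by counting
 * positions during the walk. Returns 0 on success, -1 if not found.
 */
static int demo_vector2irqn(const struct demo_comp_eq *head,
			    unsigned int vector, int *irqn)
{
	const struct demo_comp_eq *eq;
	unsigned int i = 0;

	assert(irqn != NULL);
	for (eq = head; eq != NULL; eq = eq->next) {
		if (i++ == vector) {
			*irqn = eq->irqn;
			return 0;
		}
	}
	return -1;
}
```

Because the position in the list already encodes the index, dropping the field removes one piece of state that would otherwise have to be kept in sync.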
-
Saeed Mahameed authored
Some fields and structures are not referenced nor used by the driver, remove them. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
-
Saeed Mahameed authored
Currently the CPU affinity hint mask for completion EQs is stored at and read from the wrong place. Since reading and storing use the same index, there is no actual issue, but the internal irq_info for completion EQs starts at the MLX5_EQ_VEC_COMP_BASE offset in the irq_info array. This patch changes the code to use the correct offset to store and read the IRQ affinity hint. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
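The offset rule can be illustrated with a small userspace model in which completion vectors occupy the slots after a fixed base. The base value of 4, the array size, and all `demo_` names are assumptions for this sketch, not values taken from the driver.

```c
#include <assert.h>

/* Hypothetical sizes; the real base is MLX5_EQ_VEC_COMP_BASE. */
enum {
	DEMO_EQ_VEC_COMP_BASE = 4,
	DEMO_MAX_COMP_VECTORS = 8,
};

struct demo_irq_info {
	unsigned long affinity_hint; /* stand-in for a CPU mask */
};

struct demo_eq_table {
	struct demo_irq_info
		irq_info[DEMO_EQ_VEC_COMP_BASE + DEMO_MAX_COMP_VECTORS];
};

/* Single place that maps a completion vector to its irq_info slot. */
static struct demo_irq_info *demo_comp_irq_info(struct demo_eq_table *t,
						unsigned int comp_vector)
{
	assert(comp_vector < DEMO_MAX_COMP_VECTORS);
	return &t->irq_info[DEMO_EQ_VEC_COMP_BASE + comp_vector];
}

static void demo_set_affinity_hint(struct demo_eq_table *t,
				   unsigned int comp_vector,
				   unsigned long mask)
{
	demo_comp_irq_info(t, comp_vector)->affinity_hint = mask;
}

static unsigned long demo_get_affinity_hint(struct demo_eq_table *t,
					    unsigned int comp_vector)
{
	return demo_comp_irq_info(t, comp_vector)->affinity_hint;
}
```

Routing both the store and the read through one helper keeps them consistent with the array layout, rather than merely consistent with each other.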
-
- 12 Nov, 2018 7 commits
-
-
Moni Shoua authored
Add and modify debug messages in ODP related error flows. In that context, return code EAGAIN is considered less severe, and the print level for it is set to debug instead of warn. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
-
Moni Shoua authored
Page fault events are processed in a workqueue context. Since each QP can have up to two concurrent unrelated page faults, one for the requester and one for the responder, page fault handling can be done in parallel. Achieve this by changing the workqueue to be multi-threaded. The number of threads is the same as the number of command interface channels, to avoid command interface bottlenecks. In addition to making it multi-threaded, change the workqueue flags to give it high priority. A stress benchmark shows that before this change 85% of page faults were waiting in the queue for 8 seconds or more, while after the change 98% of page faults were waiting in the queue for 64 milliseconds or less. Fixes: d9aaed83 ("{net,IB}/mlx5: Refactor page fault handling") Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
-
Moni Shoua authored
When the device is in internal error state, command interface isn't accessible and the driver decides which commands to fail and which to pass. Move the PAGE_FAULT_RESUME command to the pass list in order to avoid redundant failure messages. Fixes: 89d44f0a ("net/mlx5_core: Add pci error handlers to mlx5_core driver") Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
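A rough sketch of the "pass list" idea: when the device is in internal error state, a driver-side function decides per opcode whether to report success or failure. The opcodes other than PAGE_FAULT_RESUME, the `demo_` names, and the `-EIO` failure code are illustrative assumptions.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical subset of command opcodes. */
enum demo_cmd_op {
	DEMO_CMD_OP_CREATE_QP,
	DEMO_CMD_OP_DESTROY_QP,
	DEMO_CMD_OP_PAGE_FAULT_RESUME,
	DEMO_CMD_OP_MAX,
};

/*
 * Return value the driver reports for `op` while the command
 * interface is unreachable: 0 for commands on the pass list,
 * a negative errno for everything else.
 */
static int demo_internal_err_ret_value(enum demo_cmd_op op)
{
	assert((int)op < DEMO_CMD_OP_MAX);

	switch (op) {
	case DEMO_CMD_OP_DESTROY_QP:        /* teardown may proceed */
	case DEMO_CMD_OP_PAGE_FAULT_RESUME: /* avoids noisy failures */
		return 0;
	default:
		return -EIO;
	}
}
```

Placing a command on the pass list means its callers see success and do not log a failure for an operation that no longer matters once the device is down.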
-
Moni Shoua authored
When a page fault event for a WQE arrives, the event data contains the resource (e.g. QP) number, which will later be used by the page fault handler to retrieve the resource. Meanwhile, another context can destroy the resource and cause a use-after-free. To avoid that, take a reference on the resource when the handler starts and release it when it ends. Page fault events for RDMA operations don't need to be protected because the driver doesn't need to access the QP in the page fault handler. Fixes: d9aaed83 ("{net,IB}/mlx5: Refactor page fault handling") Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
-
Moni Shoua authored
Give meaningful names to type of WQE page faults. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
-
Moni Shoua authored
Sometimes upper layers may want to prevent the destruction of a core resource for a period of time while work on that resource is in progress. Add API to support this. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
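A hold/release interface of this kind is commonly built on a reference count. The sketch below uses C11 atomics in userspace; the `demo_` names are hypothetical, and the `freed` flag stands in for whatever teardown the real resource needs once the last reference is dropped.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct demo_rsc {
	atomic_int refcount; /* starts at 1 for the owner */
	bool freed;          /* stand-in for the real teardown */
};

static void demo_rsc_init(struct demo_rsc *r)
{
	atomic_init(&r->refcount, 1);
	r->freed = false;
}

/* Pin the resource so it cannot be torn down while work is pending. */
static void demo_rsc_hold(struct demo_rsc *r)
{
	int old = atomic_fetch_add(&r->refcount, 1);

	assert(old > 0); /* holding an already released resource is a bug */
	(void)old;
}

/* Drop one reference; the last one out performs the teardown. */
static void demo_rsc_put(struct demo_rsc *r)
{
	if (atomic_fetch_sub(&r->refcount, 1) == 1)
		r->freed = true;
}
```

With this shape, a destroy request from the owner only drops the owner's reference; the teardown is deferred until every in-flight user has released its hold.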
-
Moni Shoua authored
Fix reference counting leakage when the event handler aborts due to an unsupported event for the resource type. Fixes: a14c2d4b ("net/mlx5_core: Warn on unsupported events of QP/RQ/SQ") Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
-
- 09 Nov, 2018 1 commit
-
-
Gal Pressman authored
Fix wrong offsets of reserved fields in ifc file. Issues found using pahole. Signed-off-by: Gal Pressman <pressmangal@gmail.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
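Offsets of reserved fields can also be checked at compile time. The sketch below assumes the one-array-element-per-bit convention, so that `offsetof()` yields a bit offset and a field named `reserved_at_40` must sit at offset 0x40. The struct is an invented example, not a layout from the ifc file.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: one byte per bit, sizes in hex as in ifc files. */
struct demo_ifc_bits {
	unsigned char opcode[0x10];
	unsigned char reserved_at_10[0x10];
	unsigned char uid[0x20];
	unsigned char reserved_at_40[0x40];
	unsigned char flags[0x20];
	unsigned char reserved_at_a0[0x60];
};

/* The name of each reserved field must match its actual offset. */
static_assert(offsetof(struct demo_ifc_bits, reserved_at_10) == 0x10,
	      "reserved_at_10 is misplaced");
static_assert(offsetof(struct demo_ifc_bits, reserved_at_40) == 0x40,
	      "reserved_at_40 is misplaced");
static_assert(offsetof(struct demo_ifc_bits, reserved_at_a0) == 0xa0,
	      "reserved_at_a0 is misplaced");
static_assert(sizeof(struct demo_ifc_bits) == 0x100,
	      "unexpected total size");
```

If a field before `reserved_at_40` grows or shrinks without the reserved field being renamed, the build fails instead of the mismatch waiting to be found by inspection.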
-
- 08 Nov, 2018 6 commits
-
-
Zhu Yanjun authored
Since the function rxe_unregister_device always returns 0, it is changed to void. Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Zhu Yanjun authored
The variable rxe is only used in the function rxe_xmit_packet, and the caller functions do not use it. So move this variable into the function rxe_xmit_packet. Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Andrew Boyer authored
link_down is self-explanatory. rdma_sends and rdma_recvs count the number of RDMA Send and RDMA Receive operations completed successfully. This is different from the existing sent_pkts and rcvd_pkts counters because the existing counters measure packets, not RDMA operations. ack_deffered is renamed to ack_deferred to fix the spelling. out_of_sequence is renamed to out_of_seq_request to make clear that it is counting only requests and not other packets which can be out of sequence. Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Andrew Boyer authored
In ib_query_port(), use the netdev's IFF_UP flag to determine phys_state (flag set = down = POLLING, flag clear = disabled = DISABLED). Callers can then use the phys_state field to distinguish between links which have a dead partner, cable missing, etc., from links which are turned off on the local node. This is useful for HA and supportability. Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
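The mapping in parentheses can be written out as a small decision function. The `demo_` enum and its numeric values are illustrative assumptions, `DEMO_IFF_UP` mirrors the usual value of the netdev `IFF_UP` flag, and the "port active means link up" branch is an assumption added to make the sketch complete.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_IFF_UP 0x1u /* mirrors IFF_UP; defined locally for the sketch */

/* Hypothetical physical states; values chosen for illustration. */
enum demo_phys_state {
	DEMO_PHYS_STATE_POLLING  = 2,
	DEMO_PHYS_STATE_DISABLED = 3,
	DEMO_PHYS_STATE_LINK_UP  = 5,
};

static enum demo_phys_state demo_phys_state(bool port_active,
					    unsigned int netdev_flags)
{
	if (port_active) /* assumption: an active port reports link up */
		return DEMO_PHYS_STATE_LINK_UP;
	if (netdev_flags & DEMO_IFF_UP) /* enabled locally, no link */
		return DEMO_PHYS_STATE_POLLING;
	return DEMO_PHYS_STATE_DISABLED; /* turned off on this node */
}
```

A caller can then tell a missing cable or dead partner (POLLING) apart from an interface that was administratively turned off (DISABLED).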
-
Sagi Grimberg authored
Devices that do not use managed affinity cannot export a vector affinity, as the consumer relies on having a static mapping that it can map to upper layer affinity (e.g. sw queues). If the driver allows the user to set the device irq affinity, then the affinitization of long-term existing entities is not relevant. For example, nvme-rdma controllers' queue-irq affinitization is determined at init time, so if the irq affinity changes over time, we are no longer aligned. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
Sagi Grimberg authored
Devices that do not use managed affinity cannot export a vector affinity, as the consumer relies on having a static mapping that it can map to upper layer affinity (e.g. sw queues). If the driver allows the user to set the device irq affinity, then the affinitization of long-term existing entities is not relevant. For example, nvme-rdma controllers' queue-irq affinitization is determined at init time, so if the irq affinity changes over time, we are no longer aligned. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
-
- 07 Nov, 2018 1 commit
-
-
Yishai Hadas authored
Adapt XRC SRQ to the latest HW specification with fixed definition around umem valid bits. The previous definition relied on a bit which was taken for other purposes in legacy FW. Fixes: bd371975 ("net/mlx5: Update mlx5_ifc with DEVX UID bits") Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
-
- 06 Nov, 2018 4 commits
-
-
Rami Rosen authored
This patch fixes a typo in include/rdma/ib_verbs.h. See: https://www.merriam-webster.com/dictionary/lieu Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Sagi Grimberg authored
Error completions must still contain a valid wr_id and qp_num that the consumer can rely on. Correctly fill these fields in receive error completions. Reported-by: Walker Benjamin <benjamin.walker@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com> Tested-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Zhu Yanjun authored
When resp is in error state, the queued SKBs will not be handled. The function get_req cleans up the skb queue directly. CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Allen Pais authored
This is a mechanical transformation, no change in logic. Signed-off-by: Allen Pais <allen.lkml@gmail.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
- 04 Nov, 2018 7 commits
-
-
Linus Torvalds authored
-
git://git.infradead.org/linux-ubifs
Linus Torvalds authored
Pull UBIFS updates from Richard Weinberger: - Full filesystem authentication feature, UBIFS is now able to have the whole filesystem structure authenticated plus user data encrypted and authenticated. - Minor cleanups * tag 'tags/upstream-4.20-rc1' of git://git.infradead.org/linux-ubifs: (26 commits) ubifs: Remove unneeded semicolon Documentation: ubifs: Add authentication whitepaper ubifs: Enable authentication support ubifs: Do not update inode size in-place in authenticated mode ubifs: Add hashes and HMACs to default filesystem ubifs: authentication: Authenticate super block node ubifs: Create hash for default LPT ubfis: authentication: Authenticate master node ubifs: authentication: Authenticate LPT ubifs: Authenticate replayed journal ubifs: Add auth nodes to garbage collector journal head ubifs: Add authentication nodes to journal ubifs: authentication: Add hashes to index nodes ubifs: Add hashes to the tree node cache ubifs: Create functions to embed a HMAC in a node ubifs: Add helper functions for authentication support ubifs: Add separate functions to init/crc a node ubifs: Format changes for authentication support ubifs: Store read superblock node ubifs: Drop write_node ...
-
git://git.linux-nfs.org/projects/trondmy/linux-nfs
Linus Torvalds authored
Pull NFS client bugfixes from Trond Myklebust: "Highlights include: Bugfix: - Fix build issues on architectures that don't provide 64-bit cmpxchg Cleanups: - Fix a spelling mistake" * tag 'nfs-for-4.20-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFS: fix spelling mistake, EACCESS -> EACCES SUNRPC: Use atomic(64)_t for seq_send(64)
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds authored
Pull more timer updates from Thomas Gleixner: "A set of commits for the new C-SKY architecture timers" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: dt-bindings: timer: gx6605s SOC timer clocksource/drivers/c-sky: Add gx6605s SOC system timer dt-bindings: timer: C-SKY Multi-processor timer clocksource/drivers/c-sky: Add C-SKY SMP timer
-
git://github.com/jonmason/ntb
Linus Torvalds authored
Pull NTB updates from Jon Mason: "Fairly minor changes and bug fixes: NTB IDT thermal changes and hook into hwmon, ntb_netdev clean-up of private struct, and a few bug fixes" * tag 'ntb-4.20' of git://github.com/jonmason/ntb: ntb: idt: Alter the driver info comments ntb: idt: Discard temperature sensor IRQ handler ntb: idt: Add basic hwmon sysfs interface ntb: idt: Alter temperature read method ntb_netdev: Simplify remove with client device drvdata NTB: transport: Try harder to alloc an aligned MW buffer ntb: ntb_transport: Mark expected switch fall-throughs ntb: idt: Set PCIe bus address to BARLIMITx NTB: ntb_hw_idt: replace IS_ERR_OR_NULL with regular NULL checks ntb: intel: fix return value for ndev_vec_mask() ntb_netdev: fix sleep time mismatch
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds authored
Pull scheduler fixes from Ingo Molnar: "A memory (under-)allocation fix and a comment fix" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/topology: Fix off by one bug sched/rt: Update comment in pick_next_task_rt()
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Linus Torvalds authored
Pull x86 fixes from Ingo Molnar: "A number of fixes and some late updates: - make in_compat_syscall() behavior on x86-32 similar to other platforms, this touches a number of generic files but is not intended to impact non-x86 platforms. - objtool fixes - PAT preemption fix - paravirt fixes/cleanups - cpufeatures updates for new instructions - earlyprintk quirk - make microcode version in sysfs world-readable (it is already world-readable in procfs) - minor cleanups and fixes" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: compat: Cleanup in_compat_syscall() callers x86/compat: Adjust in_compat_syscall() to generic code under !COMPAT objtool: Support GCC 9 cold subfunction naming scheme x86/numa_emulation: Fix uniform-split numa emulation x86/paravirt: Remove unused _paravirt_ident_32 x86/mm/pat: Disable preemption around __flush_tlb_all() x86/paravirt: Remove GPL from pv_ops export x86/traps: Use format string with panic() call x86: Clean up 'sizeof x' => 'sizeof(x)' x86/cpufeatures: Enumerate MOVDIR64B instruction x86/cpufeatures: Enumerate MOVDIRI instruction x86/earlyprintk: Add a force option for pciserial device objtool: Support per-function rodata sections x86/microcode: Make revision and processor flags world-readable
-