- 28 Nov, 2016 23 commits
-
-
Shaker Daibes authored
The user can now override the automatic driver decision using the rx_cqe_compress flag, which is the preference for CQE compression. The flag is initialized with the automatic driver decision. Signed-off-by: Shaker Daibes <shakerd@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Shaker Daibes authored
pflags is a configuration parameter for the netdev, naturally it belongs to priv->params. Also introduce MLX5E_GET_PFLAG Signed-off-by: Shaker Daibes <shakerd@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Saeed Mahameed authored
Extend the self diagnostic tests to support loopback test. The loopback test doesn't require the offline flag, it will use the generic dev_queue_xmit and a dedicated packet_type to capture and verify mlx5e selftest loopback packets. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Kamal Heib <kamalh@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kamal Heib authored
The self diagnostics test implementaion include the following features: 1. Link Test: Check that link is in up state. 2. Speed Test: Check that link was negotiated correctly. 3. Health Test: Check the device health. Signed-off-by: Kamal Heib <kamalh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Huy Nguyen authored
Use setdcbx interface to set the DCBX mode to firmware or os. If setdcbx is called with mode value of zero, the DCBX mode is set to firmware. Signed-off-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Huy Nguyen authored
DBCX by default is controlled by firmware where dcbx capability bit is set. In this mode, firmware is responsible for reading/sending the TLV packets from/to the remote partner. This patch sets up the infrastructure to move between HOST/FW DCBX control mode. Signed-off-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Huy Nguyen authored
Add set/query commands for DCBX_PARAM register Signed-off-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Huy Nguyen authored
Issue description: Current implementation saves the ETS settings from user in a temporal soft copy and returns this settings when user queries the ETS settings. With the new DCBX firmware, the ETS settings can be changed by firmware when the DCBX is in firmware controlled mode. Therefore, user will obtain wrong values from the temporal soft copy. Solution: 1. Read the ETS settings directly from firmware. 2. For tc_tsa: a. Initialize tc_tsa to vendor IEEE_8021QAZ_TSA_VENDOR at netdev creation. b. When reading ETS setting from FW, if the traffic class bandwidth is less than 100, set tc_tsa to IEEE_8021QAZ_TSA_ETS. This implementation solves the scenarios when the DCBX is in FW control and willing bit is on which means the ETS setting is dictated by remote switch. Also check ETS capability where needed. Signed-off-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Huy Nguyen authored
Add DCBX CEE API interface for ConnectX-4. Configurations are stored in a temporary structure and are applied to the card's firmware when the CEE's setall callback function is called. Note: priority group in CEE is equivalent to traffic class in ConnectX-4 hardware spec. bw allocation per priority in CEE is not supported because ConnectX-4 only supports bw allocation per traffic class. user priority in CEE does not have an equivalent term in ConnectX-4. Therefore, user priority to priority mapping in CEE is not supported. Signed-off-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Huy Nguyen authored
Make sure firmware supports qos before exposing the DCB API. Signed-off-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jason Wang authored
We use single queue even if multiqueue is enabled and let admin to enable it through ethtool later. This is used to avoid possible regression (small packet TCP stream transmission). But looks like an overkill since: - single queue user can disable multiqueue when launching qemu - brings extra troubles for the management since it needs extra admin tool in guest to enable multiqueue - multiqueue performs much better than single queue in most of the cases So this patch enables multiqueue by default: if #queues is less than or equal to #vcpu, enable as much as queue pairs; if #queues is greater than #vcpu, enable #vcpu queue pairs. Cc: Hannes Frederic Sowa <hannes@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Neil Horman <nhorman@redhat.com> Cc: Jeremy Eder <jeder@redhat.com> Cc: Marko Myllynen <myllynen@redhat.com> Cc: Maxime Coquelin <maxime.coquelin@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Stefan Eichenberger says: ==================== Fix support for the MV88E6097 This patchset fixes the following two issues for the MV88E6097: - Add missing definition of g1_irqs - Add missing comment ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stefan Eichenberger authored
Add a missing comment for the MV88E6097 because of unification. Signed-off-by: Stefan Eichenberger <stefan.eichenberger@netmodule.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stefan Eichenberger authored
Add the missing definition of g1_irqs for MV88E6097. Signed-off-by: Stefan Eichenberger <stefan.eichenberger@netmodule.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Daniel Borkmann says: ==================== BPF cleanups and misc updates This patch set adds couple of cleanups in first few patches, exposes owner_prog_type for array maps as well as mlocked mem for maps in fdinfo, allows for mount permissions in fs and fixes various outstanding issues in selftests and samples. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Borkmann authored
1) The test_lru_map and test_lru_dist fails building on my machine since the sys/resource.h header is not included. 2) test_verifier fails in one test case where we try to call an invalid function, since the verifier log output changed wrt printing function names. 3) Current selftest suite code relies on sysconf(_SC_NPROCESSORS_CONF) for retrieving the number of possible CPUs. This is broken at least in our scenario and really just doesn't work. glibc tries a number of things for retrieving _SC_NPROCESSORS_CONF. First it tries equivalent of /sys/devices/system/cpu/cpu[0-9]* | wc -l, if that fails, depending on the config, it either tries to count CPUs in /proc/cpuinfo, or returns the _SC_NPROCESSORS_ONLN value instead. If /proc/cpuinfo has some issue, it returns just 1 worst case. This oddity is nothing new [1], but semantics/behaviour seems to be settled. _SC_NPROCESSORS_ONLN will parse /sys/devices/system/cpu/online, if that fails it looks into /proc/stat for cpuX entries, and if also that fails for some reason, /proc/cpuinfo is consulted (and returning 1 if unlikely all breaks down). While that might match num_possible_cpus() from the kernel in some cases, it's really not guaranteed with CPU hotplugging, and can result in a buffer overflow since the array in user space could have too few number of slots, and on perpcu map lookup, the kernel will write beyond that memory of the value buffer. William Tu reported such mismatches: [...] The fact that sysconf(_SC_NPROCESSORS_CONF) != num_possible_cpu() happens when CPU hotadd is enabled. For example, in Fusion when setting vcpu.hotadd = "TRUE" or in KVM, setting ./qemu-system-x86_64 -smp 2, maxcpus=4 ... the num_possible_cpu() will be 4 and sysconf() will be 2 [2]. [...] Documentation/cputopology.txt says /sys/devices/system/cpu/possible outputs cpu_possible_mask. That is the same as in num_possible_cpus(), so first step would be to fix the _SC_NPROCESSORS_CONF calls with our own implementation. Later, we could add support to bpf(2) for passing a mask via CPU_SET(3), for example, to just select a subset of CPUs. BPF samples code needs this fix as well (at least so that people stop copying this). Thus, define bpf_num_possible_cpus() once in selftests and import it from there for the sample code to avoid duplicating it. The remaining sysconf(_SC_NPROCESSORS_CONF) in samples are unrelated. After all three issues are fixed, the test suite runs fine again: # make run_tests | grep self selftests: test_verifier [PASS] selftests: test_maps [PASS] selftests: test_lru_map [PASS] selftests: test_kmod.sh [PASS] [1] https://www.sourceware.org/ml/libc-alpha/2011-06/msg00079.html [2] https://www.mail-archive.com/netdev@vger.kernel.org/msg121183.html Fixes: 3059303f ("samples/bpf: update tracex[23] examples to use per-cpu maps") Fixes: 86af8b41 ("Add sample for adding simple drop program to link") Fixes: df570f57 ("samples/bpf: unit test for BPF_MAP_TYPE_PERCPU_ARRAY") Fixes: e1559671 ("samples/bpf: unit test for BPF_MAP_TYPE_PERCPU_HASH") Fixes: ebb676da ("bpf: Print function name in addition to function id") Fixes: 5db58faf ("bpf: Add tests for the LRU bpf_htab") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: William Tu <u9012063@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Borkmann authored
Since we recently converted the BPF filesystem over to use mount_nodev(), we now have the possibility to also hold mount options in sb's s_fs_info. This work implements mount options support for specifying permissions on the sb's inode, which will be used by tc when it manually needs to mount the fs. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Borkmann authored
Allow for checking the owner_prog_type of a program array map. In some cases bpf(2) can return -EINVAL /after/ the verifier passed and did all the rewrites of the bpf program. The reason that lets us fail at this late stage is that program array maps are incompatible. Allow users to inspect this earlier after they got the map fd through BPF_OBJ_GET command. tc will get support for this. Also, display how much we charged the map with regards to RLIMIT_MEMLOCK. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Borkmann authored
Commit dcf80034 ("net/sched: act_mirred: Refactor detection whether dev needs xmit at mac header") added dev_is_mac_header_xmit(); since it's also useful elsewhere, move it to if_arp.h and reuse it for BPF. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Borkmann authored
After setup we don't need to keep user space fd number around anymore, as it also has no useful meaning for anyone, just remove it. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Borkmann authored
Since long already bpf_func is not only about struct sk_buff * as input anymore. Make it generic as void *, so that callers don't need to cast for it each time they call BPF_PROG_RUN(). Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Dan Carpenter authored
We don't use ->heap_buf after commit 46d1efd8 ("sfc: remove Software TSO") so let's remove the last traces. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Merge tag 'wireless-drivers-next-for-davem-2016-11-25' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next Kalle Valo says: ==================== wireless-drivers-next patches for 4.10 Major changes: iwlwifi * finalize and enable dynamic queue allocation * use dev_coredumpmsg() to prevent locking the driver * small fix to pass the AID to the FW * use FW PS decisions with multi-queue ath9k * add device tree bindings * switch to use mac80211 intermediate software queues to reduce latency and fix bufferbloat wl18xx * allow scanning in AP mode ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 27 Nov, 2016 6 commits
-
-
Michael S. Tsirkin authored
sparse warns about context imbalance in any code that uses HARD_TX_LOCK/UNLOCK - this is because it's unable to determine that flags don't change so lock and unlock are paired. Seems easy enough to fix by adding __acquire/__release calls. With this patch af_packet.c is now sparse-clean, Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ulrik De Bie authored
This patch depends on commit d8d26354 ("ptp: Introduce a high resolution frequency adjustment method.") The gianfar devices offer a frequency resolution of about 0.46 ppb (depends on actual value of tmr_add, for the calculation assumed 0x80000000). This patch lets users of the device benefit from the increased frequency resolution when tuning the clock. Thanks to the rounding the maximum error between the requested frequency and the applied frequency will then be about 0.23 ppb. Tested on a v3.3.8 kernel on a real gianfar device. Verified compilation on net-next (currently at v4.9-rc5). Signed-off-by: Ulrik De Bie <ulrik.debie-os@e2big.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Per RX ring packets/bytes counters are not protected by global priv->stats_lock. Better not confuse the reader, and use READ_ONCE() to show we read these counters without surrounding synchronization. Interrupt moderation is best effort, and we do not really care of ultra precise counters. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller authored
udplite conflict is resolved by taking what 'net-next' did which removed the backlog receive method assignment, since it is no longer necessary. Two entries were added to the non-priv ethtool operations switch statement, one in 'net' and one in 'net-next, so simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds authored
Pull vfs splice fix from Al Viro. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fix default_file_splice_read()
-
Al Viro authored
Botched calculation of number of pages. As the result, we were dropping pieces when doing splice to pipe from e.g. 9p. Reported-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
- 26 Nov, 2016 11 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linuxLinus Torvalds authored
Pull i2c fixes from Wolfram Sang: "Here is a revert and two bugfixes for the I2C designware driver. Please note that we are still hunting down a regression for the i2c-octeon driver. While there is a fix pending, we have unclear feedback from the testers currently. An rc8 would be quite helpful for this case" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: Revert "i2c: designware: do not disable adapter after transfer" i2c: designware: fix rx fifo depth tracking i2c: designware: report short transfers
-
git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds authored
Pull ARM fix from Russell King: "This resolves the ksyms issues by reverting the commit which introduced the breakage" There was what I consider to be a better fix, but it's late in the rc game, so I'll take the revert. * 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm: Revert "arm: move exports to definitions"
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds authored
Pull networking fixes from David Miller: 1) Fix leak in fsl/fman driver, from Dan Carpenter. 2) Call flow dissector initcall earlier than any networking driver can register and start to use it, from Eric Dumazet. 3) Some dup header fixes from Geliang Tang. 4) TIPC link monitoring compat fix from Jon Paul Maloy. 5) Link changes require EEE re-negotiation in bcm_sf2 driver, from Florian Fainelli. 6) Fix bogus handle ID passed into tfilter_notify_chain(), from Roman Mashak. 7) Fix dump size calculation in rtnl_calcit(), from Zhang Shengju. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (26 commits) tipc: resolve connection flow control compatibility problem mvpp2: use correct size for memset net/mlx5: drop duplicate header delay.h net: ieee802154: drop duplicate header delay.h ibmvnic: drop duplicate header seq_file.h fsl/fman: fix a leak in tgec_free() net: ethtool: don't require CAP_NET_ADMIN for ETHTOOL_GLINKSETTINGS tipc: improve sanity check for received domain records tipc: fix compatibility bug in link monitoring net: ethernet: mvneta: Remove IFF_UNICAST_FLT which is not implemented dwc_eth_qos: drop duplicate headers net sched filters: fix filter handle ID in tfilter_notify_chain() net: dsa: bcm_sf2: Ensure we re-negotiate EEE during after link change bnxt: do not busy-poll when link is down udplite: call proper backlog handlers ipv6: bump genid when the IFA_F_TENTATIVE flag is clear net/mlx4_en: Free netdev resources under state lock net: revert "net: l2tp: Treat NET_XMIT_CN as success in l2tp_eth_dev_xmit" rtnetlink: fix the wrong minimal dump size getting from rtnl_calcit() bnxt_en: Fix a VXLAN vs GENEVE issue ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimmLinus Torvalds authored
Pull libnvdimm fixes from Dan Williams: - Fix a crash that occurs at driver initialization if the memory region is already busy (request_mem_region() fails). - Fix a vma validation check that mistakenly allows a private device- dax mapping to be established. Device-dax explicitly forbids private mappings so it can guarantee a given fault granularity and backing memory type. Both of these fixes have soaked in -next and are tagged for -stable. * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: device-dax: fail all private mapping attempts device-dax: check devm_nsio_enable() return value
-
git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds authored
Pull KVM fixes from Radim Krčmář: "Four fixes for bugs found by syzkaller on x86, all for stable" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: check for pic and ioapic presence before use KVM: x86: fix out-of-bounds accesses of rtc_eoi map KVM: x86: drop error recovery in em_jmp_far and em_ret_far KVM: x86: fix out-of-bounds access in lapic
-
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linuxLinus Torvalds authored
Pull powerpc fixes from Michael Ellerman: "Fixes marked for stable: - Set missing wakeup bit in LPCR on POWER9 - Fix the early OPAL console wrappers - Fixup kernel read only mapping Fixes for code merged this cycle: - Fix missing CRCs, add more asm-prototypes.h declarations" * tag 'powerpc-4.9-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/mm: Fixup kernel read only mapping powerpc/boot: Fix the early OPAL console wrappers powerpc: Fix missing CRCs, add more asm-prototypes.h declarations powerpc: Set missing wakeup bit in LPCR on POWER9
-
Jon Paul Maloy authored
In commit 10724cc7 ("tipc: redesign connection-level flow control") we replaced the previous message based flow control with one based on 1k blocks. In order to ensure backwards compatibility the mechanism falls back to using message as base unit when it senses that the peer doesn't support the new algorithm. The default flow control window, i.e., how many units can be sent before the sender blocks and waits for an acknowledge (aka advertisement) is 512. This was tested against the previous version, which uses an acknowledge frequency of on ack per 256 received message, and found to work fine. However, we missed the fact that versions older than Linux 3.15 use an acknowledge frequency of 512, which is exactly the limit where a 4.6+ sender will stop and wait for acknowledge. This would also work fine if it weren't for the fact that if the first sent message on a 4.6+ server side is an empty SYNACK, this one is also is counted as a sent message, while it is not counted as a received message on a legacy 3.15-receiver. This leads to the sender always being one step ahead of the receiver, a scenario causing the sender to block after 512 sent messages, while the receiver only has registered 511 read messages. Hence, the legacy receiver is not trigged to send an acknowledge, with a permanently blocked sender as result. We solve this deadlock by simply allowing the sender to send one more message before it blocks, i.e., by a making minimal change to the condition used for determining connection congestion. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Jiri Pirko says: ==================== mlxsw: traps, trap groups and policers Nogah says: For a packet to be sent from the HW to the cpu, it needs to be trapped. For a trap to be activate it should be assigned to a trap group. Those trap groups can have policers, to limit the packet rate (the max number of packets that can be sent to the cpu in a time slot, the rest will be discarded) or the data rate (the same, but the count is not by the number of packets but by their total length in bytes). This patchset rearrange the trap setting API, re-write the traps and the trap groups list in spectrum and assign them policers. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nogah Frankel authored
Configure policers and connect them to trap groups. While many trap groups share policer's configuration they don't share the actual policer because each trap group represents a different flow / protocol and we don't want one of them to be able to exceed its rate on behalf of another. For example, if STP and LLDP gets to send 128 packets/sec each, if we put them in one 256 packets/sec policer, one can send 200 packets while the other only 50. Note that IP2ME covers lots of flows, so it's limit is set to match the cpu ability to handle data. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nogah Frankel authored
The QPCR register is used to create and control policers. A policer can discard or change the color of packets that are trapped by a specific trap. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nogah Frankel authored
Add a new resource to resources query: max cpu policers which tells us how many policers can be used to limit the data rate to the cpu port. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-