1. 13 Jul, 2018 17 commits
    • Maxime Chevallier's avatar
      net: mvpp2: split ingress traffic into multiple flows · f9358e12
      Maxime Chevallier authored
      The PPv2 classifier allows to perform classification operations on each
      ingress packet, based on the flow the packet is assigned to.
      
      The current code uses only 1 flow per port, and the only classification
      action consists of assigning the rx queue to the packet, depending on the
      port.
      
      In preparation for adding RSS support, we have to split all incoming
      traffic into different flows. Since RSS assigns a rx queue depending on
      the hash of some header fields, we have to make sure that the hash is
      generated in a consistent way for all packets in the same flow.
      
      What we call a "flow" is actually a set of attributes attached to a
      packet that depends on various L2/L3/L4 info.
      
      This patch introduces 52 flows, wich are a combination of various L2, L3
      and L4 attributes :
       - Whether or not the packet has a VLAN tag
       - Whether the packet is IPv4, IPv6 or something else
       - Whether the packet is TCP, UDP or something else
       - Whether or not the packet is fragmented at L3 level.
      
      The flow is associated to a packet by the Header Parser. Each flow
      corresponds to an entry in the decoding table. This entry then points to
      the sequence of classification lookups to be performed by the
      classifier, represented in the flow table.
      
      For now, the only lookup we perform is a C2 lookup to set the default
      rx queue.
      
                     Header parser          Dec table
       Ingress pkt  +-------------+ flow id +----------------------------+
      ------------->| TCAM + SRAM |-------->|TCP IPv4 w/ VLAN, not frag  |
                    +-------------+         |TCP IPv4 w/o VLAN, not frag |
                                            |TCP IPv4 w/ VLAN, frag      |--+
                                            |etc.                        |  |
                                            +----------------------------+  |
                                                                            |
                                                 Flow table                 |
                      +------------+        +---------------------+         |
           To RxQ <---| Classifier |<-------| flow 0: C2 lookup   |<--------+
                      +------------+        | flow 1: C2 lookup   |
                             |              | ...                 |
                      +------------+        | flow 51 : C2 lookup |
      		| C2 engine  |        +---------------------+
                      +------------+
      Signed-off-by: default avatarMaxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f9358e12
    • Maxime Chevallier's avatar
      net: mvpp2: use classifier to assign default rx queue · b1a962c6
      Maxime Chevallier authored
      The PPv2 Controller has a classifier, that can perform multiple lookup
      operations for each packet, using different engines.
      
      One of these engines is the C2 engine, which performs TCAM based lookups
      on data extracted from the packet header. When a packet matches an
      entry, the engine sets various attributes, used to perform
      classification operations.
      
      One of these attributes is the rx queue in which the packet should be sent.
      The current code uses the lookup_id table (also called decoding table)
      to assign the rx queue. However, this only works if we use one entry per
      port in the decoding table, which won't be the case once we add RSS
      lookups.
      
      This patch uses the C2 engine to assign the rx queue to each packet.
      
      The C2 engine is used through the flow table, which dictates what
      classification operations are done for a given flow.
      
      Right now, we have one flow per port, which contains every ingress
      packet for this port.
      Signed-off-by: default avatarMaxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b1a962c6
    • Maxime Chevallier's avatar
      net: mvpp2: rename per-port RSS init function · e6e21c02
      Maxime Chevallier authored
      mvpp22_init_rss function configures the RSS parameters for each port, so
      rename it accordingly. Since this function relies on classifier
      configuration, move its call right after the classifier config.
      Signed-off-by: default avatarMaxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e6e21c02
    • Maxime Chevallier's avatar
      net: mvpp2: make sure we don't spread load on disabled CPUs · 2a2f467d
      Maxime Chevallier authored
      When filling the RSS table, we have to make sure that the rx queue is
      attached to an online CPU.
      
      This patch is not a full support for cpu_hotplug, but rather a way to
      make sure that we don't break network on system booted with the maxcpus
      parameter.
      Signed-off-by: default avatarMaxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2a2f467d
    • Antoine Tenart's avatar
      net: mvpp2: improve the distribution of packets on CPUs when using RSS · 662ae3fe
      Antoine Tenart authored
      This patch adds an extra indirection when setting the indirection table
      into the RSS hardware table to improve the packets distribution across
      CPUs. For example, if 2 queues are used on a multi-core system this new
      indirection will choose two queues on two different CPUs instead of the
      two first queues which are on the same first CPU.
      Signed-off-by: default avatarAntoine Tenart <antoine.tenart@bootlin.com>
      Signed-off-by: default avatarMaxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      662ae3fe
    • Antoine Tenart's avatar
      net: mvpp2: RSS indirection table support · 8179642b
      Antoine Tenart authored
      This patch adds the RSS indirection table support, allowing to use the
      ethtool -x and -X options to dump and set this table.
      Signed-off-by: default avatarAntoine Tenart <antoine.tenart@bootlin.com>
      [Maxime: Small warning fixes, use one table per port]
      Signed-off-by: default avatarMaxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      8179642b
    • Maxime Chevallier's avatar
      net: mvpp2: use one RSS table per port · a27a254c
      Maxime Chevallier authored
      PPv2 Controller has 8 RSS Tables, of 32 entries each. A lookup in the
      RXQ2RSS_TABLE is performed for each incoming packet, and the RSS Table
      to be used is chosen according to the default rx queue that would be
      used for the packet.
      
      This default rx queue is set in the Lookup_id Table (also called
      Decoding Table), and is equal to the port->first_rxq.
      
      Since the Classifier itself isn't active at any time for the moment,
      this doesn't have a direct effect, the default rx queue at the moment is
      the one where all packets end-up into.
      Signed-off-by: default avatarMaxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a27a254c
    • Maxime Chevallier's avatar
      net: mvpp2: fix RSS register definitions · 4b86097b
      Maxime Chevallier authored
      There is no RSS_TABLE register in PPv2 Controller. The register 0x1510
      which was specified is actually named "RSS_HASH_SEL", but isn't used by
      this driver at all.
      
      Based on how this register was used, it should have been the
      RXQ2RSS_TABLE register, which allows to select the RSS table that will
      be used for the incoming packet.
      
      The RSS_TABLE_POINTER is actually a field of this RXQ2RSS_TABLE
      register.
      
      Since RSS tables are actually not used by the driver for now, this
      commit does not fix a runtime bug.
      Signed-off-by: default avatarMaxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4b86097b
    • Antoine Tenart's avatar
      net: mvpp2: fix a typo in the RSS code · 132baa03
      Antoine Tenart authored
      Cosmetic patch fixing a typo in one of the RSS comments.
      Signed-off-by: default avatarAntoine Tenart <antoine.tenart@bootlin.com>
      Signed-off-by: default avatarMaxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      132baa03
    • Maxime Chevallier's avatar
      net: mvpp2: use only one rx queue per port per CPU · f8c6ba84
      Maxime Chevallier authored
      The number of receive queue per port is :
       - MVPP2_DEFAULT_RXQ if in single queue mode
       - MVPP2_DEFAULT_RXQ * num_possible_cpus if in multi queue mode
      
      with MVPP2_DEFAULT_RXQ = 4.
      
      However, we don't use the extra rx queues at the moment, we really only
      need one per port per CPU, until some more advanced classification rules
      are implemented.
      Suggested-by: default avatarStefan Chulski <stefanc@marvell.com>
      Signed-off-by: default avatarMaxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f8c6ba84
    • Maxime Chevallier's avatar
      net: mvpp2: fix hardcoded number of rx queues · 790d32c6
      Maxime Chevallier authored
      There's a dedicated #define that indicates the number of rx queues per
      port per cpu, this commit removes a harcoded use of that value
      
      This doesn't fix any runtime bugs since the harcoded value matches the
      expected value.
      Signed-off-by: default avatarMaxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      790d32c6
    • Yan Markman's avatar
      net: mvpp2: use RSS only when using multi-queue mode · 4c4a5686
      Yan Markman authored
      Since RSS only applies when we have per-cpu rx queues, it should only
      be enabled when the driver is configured to make use of multi-queue
      mode.
      Signed-off-by: default avatarYan Markman <ymarkman@marvell.com>
      [Maxime: Commit message]
      Signed-off-by: default avatarMaxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4c4a5686
    • Maxime Chevallier's avatar
      net: mvpp2: make multi queue mode the default mode · 3f6aaf72
      Maxime Chevallier authored
      The multi queue mode is needed to have RSS available, and offers some
      nice advantages, being able to have one rx queue vector per CPU.
      
      This mode has been usable through the use of a module parameter, this
      commit makes it the default value.
      Signed-off-by: default avatarMaxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3f6aaf72
    • Maxime Chevallier's avatar
      net: mvpp2: make sure we use single queue mode on PPv2.1 · 1e27a628
      Maxime Chevallier authored
      The PPv2 driver defines 2 "queue_modes" :
       - QDIST_SINGLE_MODE, where each port share one rx queue vector
         between all CPUs
       - QDIST_MULTI_MODE, where each port has one rx queue vector per CPU.
      
      Multi queue mode isn't available on PPv2.1, make sure we fallback to
      single mode when running on this revision.
      Signed-off-by: default avatarMaxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1e27a628
    • Maxime Chevallier's avatar
      net: mvpp2: define the number of RSS entries per table in mvpp2.h · 0ad2f539
      Maxime Chevallier authored
      The size of the the RSS indirection tables should be defined in mvpp2.h,
      so that we can use it in all files of the PPv2 driver.
      
      This commit moves the define in mvpp2.h, and adds the missing #include
      in mvpp2_cls.h.
      Signed-off-by: default avatarMaxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0ad2f539
    • Maxime Chevallier's avatar
      net: mvpp2: fix include guards in mvpp2_prs.h · 53a40025
      Maxime Chevallier authored
      Include guards should be put before #includes. This doesn't fix any bug,
      but prevent future compilation issues when adding new files in the mvpp2
      driver
      
      The Header Parser init function needs the platform_device definition,
      and with the fixed include guards we need to add the missing include.
      Signed-off-by: default avatarMaxime Chevallier <maxime.chevallier@bootlin.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      53a40025
    • Prashant Bhole's avatar
      net: gro: properly remove skb from list · 68d2f84a
      Prashant Bhole authored
      Following crash occurs in validate_xmit_skb_list() when same skb is
      iterated multiple times in the loop and consume_skb() is called.
      
      The root cause is calling list_del_init(&skb->list) and not clearing
      skb->next in d4546c25. list_del_init(&skb->list) sets skb->next
      to point to skb itself. skb->next needs to be cleared because other
      parts of network stack uses another kind of SKB lists.
      validate_xmit_skb_list() uses such list.
      
      A similar type of bugfix was reported by Jesper Dangaard Brouer.
      https://patchwork.ozlabs.org/patch/942541/
      
      This patch clears skb->next and changes list_del_init() to list_del()
      so that list->prev will maintain the list poison.
      
      [  148.185511] ==================================================================
      [  148.187865] BUG: KASAN: use-after-free in validate_xmit_skb_list+0x4b/0xa0
      [  148.190158] Read of size 8 at addr ffff8801e52eefc0 by task swapper/1/0
      [  148.192940]
      [  148.193642] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.18.0-rc3+ #25
      [  148.195423] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180531_142017-buildhw-08.phx2.fedoraproject.org-1.fc28 04/01/2014
      [  148.199129] Call Trace:
      [  148.200565]  <IRQ>
      [  148.201911]  dump_stack+0xc6/0x14c
      [  148.203572]  ? dump_stack_print_info.cold.1+0x2f/0x2f
      [  148.205083]  ? kmsg_dump_rewind_nolock+0x59/0x59
      [  148.206307]  ? validate_xmit_skb+0x2c6/0x560
      [  148.207432]  ? debug_show_held_locks+0x30/0x30
      [  148.208571]  ? validate_xmit_skb_list+0x4b/0xa0
      [  148.211144]  print_address_description+0x6c/0x23c
      [  148.212601]  ? validate_xmit_skb_list+0x4b/0xa0
      [  148.213782]  kasan_report.cold.6+0x241/0x2fd
      [  148.214958]  validate_xmit_skb_list+0x4b/0xa0
      [  148.216494]  sch_direct_xmit+0x1b0/0x680
      [  148.217601]  ? dev_watchdog+0x4e0/0x4e0
      [  148.218675]  ? do_raw_spin_trylock+0x10/0x120
      [  148.219818]  ? do_raw_spin_lock+0xe0/0xe0
      [  148.221032]  __dev_queue_xmit+0x1167/0x1810
      [  148.222155]  ? sched_clock+0x5/0x10
      [...]
      
      [  148.474257] Allocated by task 0:
      [  148.475363]  kasan_kmalloc+0xbf/0xe0
      [  148.476503]  kmem_cache_alloc+0xb4/0x1b0
      [  148.477654]  __build_skb+0x91/0x250
      [  148.478677]  build_skb+0x67/0x180
      [  148.479657]  e1000_clean_rx_irq+0x542/0x8a0
      [  148.480757]  e1000_clean+0x652/0xd10
      [  148.481772]  net_rx_action+0x4ea/0xc20
      [  148.482808]  __do_softirq+0x1f9/0x574
      [  148.483831]
      [  148.484575] Freed by task 0:
      [  148.485504]  __kasan_slab_free+0x12e/0x180
      [  148.486589]  kmem_cache_free+0xb4/0x240
      [  148.487634]  kfree_skbmem+0xed/0x150
      [  148.488648]  consume_skb+0x146/0x250
      [  148.489665]  validate_xmit_skb+0x2b7/0x560
      [  148.490754]  validate_xmit_skb_list+0x70/0xa0
      [  148.491897]  sch_direct_xmit+0x1b0/0x680
      [  148.493949]  __dev_queue_xmit+0x1167/0x1810
      [  148.495103]  br_dev_queue_push_xmit+0xce/0x250
      [  148.496196]  br_forward_finish+0x276/0x280
      [  148.497234]  __br_forward+0x44f/0x520
      [  148.498260]  br_forward+0x19f/0x1b0
      [  148.499264]  br_handle_frame_finish+0x65e/0x980
      [  148.500398]  NF_HOOK.constprop.10+0x290/0x2a0
      [  148.501522]  br_handle_frame+0x417/0x640
      [  148.502582]  __netif_receive_skb_core+0xaac/0x18f0
      [  148.503753]  __netif_receive_skb_one_core+0x98/0x120
      [  148.504958]  netif_receive_skb_internal+0xe3/0x330
      [  148.506154]  napi_gro_complete+0x190/0x2a0
      [  148.507243]  dev_gro_receive+0x9f7/0x1100
      [  148.508316]  napi_gro_receive+0xcb/0x260
      [  148.509387]  e1000_clean_rx_irq+0x2fc/0x8a0
      [  148.510501]  e1000_clean+0x652/0xd10
      [  148.511523]  net_rx_action+0x4ea/0xc20
      [  148.512566]  __do_softirq+0x1f9/0x574
      [  148.513598]
      [  148.514346] The buggy address belongs to the object at ffff8801e52eefc0
      [  148.514346]  which belongs to the cache skbuff_head_cache of size 232
      [  148.517047] The buggy address is located 0 bytes inside of
      [  148.517047]  232-byte region [ffff8801e52eefc0, ffff8801e52ef0a8)
      [  148.519549] The buggy address belongs to the page:
      [  148.520726] page:ffffea000794bb00 count:1 mapcount:0 mapping:ffff880106f4dfc0 index:0xffff8801e52ee840 compound_mapcount: 0
      [  148.524325] flags: 0x17ffffc0008100(slab|head)
      [  148.525481] raw: 0017ffffc0008100 ffff880106b938d0 ffff880106b938d0 ffff880106f4dfc0
      [  148.527503] raw: ffff8801e52ee840 0000000000190011 00000001ffffffff 0000000000000000
      [  148.529547] page dumped because: kasan: bad access detected
      
      Fixes: d4546c25 ("net: Convert GRO SKB handling to list_head.")
      Signed-off-by: default avatarPrashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
      Reported-by: default avatarTyler Hicks <tyhicks@canonical.com>
      Tested-by: default avatarTyler Hicks <tyhicks@canonical.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      68d2f84a
  2. 12 Jul, 2018 23 commits