1. 07 Jun, 2017 11 commits
  2. 06 Jun, 2017 29 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · b29794ec
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Made TCP congestion control documentation match current reality,
          from Anmol Sarma.
      
       2) Various build warning and failure fixes from Arnd Bergmann.
      
       3) Fix SKB list leak in ipv6_gso_segment().
      
       4) Use after free in ravb driver, from Eugeniu Rosca.
      
       5) Don't use udp_poll() in ping protocol driver, from Eric Dumazet.
      
       6) Don't crash in PCI error recovery of cxgb4 driver, from Guilherme
          Piccoli.
      
       7) _SRC_NAT_DONE_BIT needs to be cleared using atomics, from Liping
          Zhang.
      
       8) Use after free in vxlan deletion, from Mark Bloch.
      
       9) Fix ordering of NAPI poll enabled in ethoc driver, from Max
          Filippov.
      
      10) Fix stmmac hangs with TSO, from Niklas Cassel.
      
      11) Fix crash in CALIPSO ipv6, from Richard Haines.
      
      12) Clear nh_flags properly on mpls link up. From Roopa Prabhu.
      
      13) Fix regression in sk_err socket error queue handling, noticed by
          ping applications. From Soheil Hassas Yeganeh.
      
      14) Update mlx4/mlx5 MAINTAINERS information.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (78 commits)
        net: stmmac: fix a broken u32 less than zero check
        net: stmmac: fix completely hung TX when using TSO
        net: ethoc: enable NAPI before poll may be scheduled
        net: bridge: fix a null pointer dereference in br_afspec
        ravb: Fix use-after-free on `ifconfig eth0 down`
        net/ipv6: Fix CALIPSO causing GPF with datagram support
        net: stmmac: ensure jumbo_frm error return is correctly checked for -ve value
        Revert "sit: reload iphdr in ipip6_rcv"
        i40e/i40evf: proper update of the page_offset field
        i40e: Fix state flags for bit set and clean operations of PF
        iwlwifi: fix host command memory leaks
        iwlwifi: fix min API version for 7265D, 3168, 8000 and 8265
        iwlwifi: mvm: clear new beacon command template struct
        iwlwifi: mvm: don't fail when removing a key from an inexisting sta
        iwlwifi: pcie: only use d0i3 in suspend/resume if system_pm is set to d0i3
        iwlwifi: mvm: fix firmware debug restart recording
        iwlwifi: tt: move ucode_loaded check under mutex
        iwlwifi: mvm: support ibss in dqa mode
        iwlwifi: mvm: Fix command queue number on d0i3 flow
        iwlwifi: mvm: rs: start using LQ command color
        ...
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · e87f327e
      Linus Torvalds authored
      Pull sparc fixes from David Miller:
      
       1) Fix TLB context wrap races, from Pavel Tatashin.
      
       2) Cure some gcc-7 build issues.
      
       3) Handle invalid setup_hugepagesz command line values properly, from
          Liam R Howlett.
      
       4) Copy TSB using the correct address shift for the huge TSB, from Mike
          Kravetz.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
        sparc64: delete old wrap code
        sparc64: new context wrap
        sparc64: add per-cpu mm of secondary contexts
        sparc64: redefine first version
        sparc64: combine activate_mm and switch_mm
        sparc64: reset mm cpumask after wrap
        sparc/mm/hugepages: Fix setup_hugepagesz for invalid values.
        sparc: Machine description indices can vary
        sparc64: mm: fix copy_tsb to correctly copy huge page TSBs
        arch/sparc: support NR_CPUS = 4096
        sparc64: Add __multi3 for gcc 7.x and later.
        sparc64: Fix build warnings with gcc 7.
        arch/sparc: increase CONFIG_NODES_SHIFT on SPARC64 to 5
    • compiler, clang: suppress warning for unused static inline functions · abb2ea7d
      David Rientjes authored
      GCC explicitly does not warn for unused static inline functions for
      -Wunused-function.  The manual states:
      
      	Warn whenever a static function is declared but not defined or
      	a non-inline static function is unused.
      
      Clang does warn for static inline functions that are unused.
      
      It turns out that suppressing the warnings avoids potentially complex
      #ifdef directives, which also reduces LOC.
      
      Suppress the warning for clang.
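
      For illustration, here is a minimal sketch of how an attribute can
      silence this warning without any #ifdef, assuming a GCC- or
      Clang-compatible compiler. The macro name my_maybe_unused and the helper
      functions are hypothetical; the commit message does not say which
      mechanism the patch itself uses.

```c
/* Hypothetical macro name; expands to the GCC/Clang "unused" attribute. */
#define my_maybe_unused __attribute__((unused))

/*
 * Never called in this file. Per the message above, clang would report
 * this under -Wunused-function while GCC would not; the attribute tells
 * both compilers the function is allowed to be unused.
 */
static inline my_maybe_unused int debug_only_helper(int x)
{
	return x * 2;
}

/* A helper that does have a caller. */
static inline int add_one(int x)
{
	return x + 1;
}

int compute(int x)
{
	return add_one(x);
}
```

      The alternative would be to wrap debug_only_helper() in the same #ifdef
      as its only callers, which is the extra complexity the message refers to.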
      Signed-off-by: David Rientjes <rientjes@google.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge branch 'sparc64-context-wrap-fixes' · b3aefc2f
      David S. Miller authored
      Pavel Tatashin says:
      
      ====================
      sparc64: context wrap fixes
      
      This patch series contains fixes for context wrap: when we are out of
      context ids, and need to get a new version.
      
      It fixes memory corruption issues which happen when more processes than
      the number of context ids (currently set to 8K) are started
      simultaneously, so that processes can get a wrong context.
      
      sparc64: new context wrap:
      - contains explanation of new wrap method, and also explanation of races
        that it solves
      sparc64: reset mm cpumask after wrap
      - explains the issue of not resetting the cpu mask on a wrap
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sparc64: delete old wrap code · 0197e41c
      Pavel Tatashin authored
      The old method, which used xcall and softint to get a new context id, is
      deleted. It is replaced by a method that uses per_cpu_secondary_mm to
      perform the context wrap without xcall.
      Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
      Reviewed-by: Bob Picco <bob.picco@oracle.com>
      Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sparc64: new context wrap · a0582f26
      Pavel Tatashin authored
      The current wrap implementation has a race issue: it is called outside of
      the ctx_alloc_lock, and also does not wait for all CPUs to complete the
      wrap.  This means that a thread can get a new context with a new version
      and another thread might still be running with the same context. The
      problem is especially severe on CPUs with shared TLBs, like sun4v. I used
      the following test to very quickly reproduce the problem:
      - start over 8K processes (must be more than context IDs)
      - write and read values at a memory location in every process.
      
      Very quickly memory corruptions start happening, and what we read back
      does not equal what we wrote.
      
      Several approaches were explored before settling on this one:
      
      Approach 1:
      Move smp_new_mmu_context_version() inside ctx_alloc_lock, and wait for
      every process to complete the wrap. (Note: every CPU must WAIT before
      leaving smp_new_mmu_context_version_client() until every one arrives).
      
      This approach ends up with deadlocks, as some threads own locks which other
      threads are waiting for, and they never receive softint until these threads
      exit smp_new_mmu_context_version_client(). Since we do not allow the exit,
      deadlock happens.
      
      Approach 2:
      Handle wrap right during mondo interrupt. Use etrap/rtrap to enter into
      C code, and issue new versions to every CPU.
      This approach adds some overhead to runtime: in switch_mm() we must add
      some checks to make sure that versions have not changed due to wrap while
      we were loading the new secondary context. (This could be protected by
      PSTATE_IE, but that degrades performance, as on M7 and older CPUs it takes
      50 cycles for each access.) Also, we still need a global per-cpu array of
      MMs to know where we need to load new contexts, otherwise we can change
      context to a thread that is going away (if we received mondo between
      switch_mm() and switch_to() time). Finally, there are some issues with
      window registers in rtrap() when context IDs are changed during CPU mondo
      time.
      
      The approach in this patch is the simplest and has almost no impact on
      runtime.  We use the array with mm's where last secondary contexts were
      loaded onto CPUs and bump their versions to the new generation without
      changing context IDs. If a new process comes in to get a context ID, it
      will go through get_new_mmu_context() because of version mismatch. But the
      running processes do not need to be interrupted. And wrap is quicker as we
      do not need to xcall and wait for everyone to receive and complete wrap.
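
      The scheme in the last paragraph can be sketched as a single-threaded
      userspace model. This is an illustration under stated assumptions, not
      the kernel code: the ID space is shrunk to 8, locking (ctx_alloc_lock)
      is omitted, and every identifier below except get_new_mmu_context's
      role and the per-CPU array idea is hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define CTX_NR_BITS   3                      /* demo: 8 IDs; the text mentions 8K */
#define CTX_NR_IDS    (1u << CTX_NR_BITS)
#define CTX_ID_MASK   ((uint64_t)CTX_NR_IDS - 1)
#define NR_DEMO_CPUS  2

struct mm {
	uint64_t context;                    /* version in high bits, ID in low bits */
};

static uint64_t ctx_version = CTX_NR_IDS;    /* first version; ID bits are zero */
static uint8_t id_used[CTX_NR_IDS] = { 1 };  /* ID 0 is kept reserved in this demo */
static struct mm *cpu_secondary_mm[NR_DEMO_CPUS]; /* mm last loaded on each CPU */

static int context_valid(const struct mm *mm)
{
	return (mm->context & ~CTX_ID_MASK) == ctx_version;
}

static int find_free_id(void)
{
	for (unsigned int id = 1; id < CTX_NR_IDS; id++)
		if (!id_used[id])
			return (int)id;
	return -1;
}

/* Wrap: start a new version, but let mm's loaded on CPUs keep their IDs. */
static void context_wrap(void)
{
	ctx_version += CTX_NR_IDS;
	memset(id_used, 0, sizeof(id_used));
	id_used[0] = 1;
	for (int cpu = 0; cpu < NR_DEMO_CPUS; cpu++) {
		struct mm *mm = cpu_secondary_mm[cpu];
		if (!mm)
			continue;
		uint64_t id = mm->context & CTX_ID_MASK;
		mm->context = ctx_version | id;  /* bump version, same ID */
		id_used[id] = 1;
	}
}

/* Stand-in for the role get_new_mmu_context() plays in the text. */
static void get_new_context(struct mm *mm)
{
	int id = find_free_id();
	if (id < 0) {
		context_wrap();
		/* Succeeds because NR_DEMO_CPUS + 1 < CTX_NR_IDS. */
		id = find_free_id();
	}
	id_used[id] = 1;
	mm->context = ctx_version | (uint64_t)id;
}

/* A version mismatch sends the mm through get_new_context(). */
static void switch_to_mm(int cpu, struct mm *mm)
{
	if (!context_valid(mm))
		get_new_context(mm);
	cpu_secondary_mm[cpu] = mm;
}
```

      In this model, an mm that is loaded on some CPU at wrap time stays valid
      with its old ID, while every other mm fails the version check on its
      next switch and is handed a fresh ID that cannot collide with a running
      one.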
      Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
      Reviewed-by: Bob Picco <bob.picco@oracle.com>
      Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sparc64: add per-cpu mm of secondary contexts · 7a5b4bbf
      Pavel Tatashin authored
      The new wrap is going to use information from this array to figure out
      which mm's currently have valid secondary contexts set up.
      Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
      Reviewed-by: Bob Picco <bob.picco@oracle.com>
      Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sparc64: redefine first version · c4415235
      Pavel Tatashin authored
      CTX_FIRST_VERSION defines the first context version, but it also defines
      the first context. This patch redefines it to only include the first
      context version.
      Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
      Reviewed-by: Bob Picco <bob.picco@oracle.com>
      Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sparc64: combine activate_mm and switch_mm · 14d0334c
      Pavel Tatashin authored
      The only difference between these two functions is that in activate_mm we
      unconditionally flush context. However, there is no need to keep this
      difference after fixing a bug where cpumask was not reset on a wrap. So, in
      this patch we combine these.
      Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
      Reviewed-by: Bob Picco <bob.picco@oracle.com>
      Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sparc64: reset mm cpumask after wrap · 58897485
      Pavel Tatashin authored
      After a wrap (getting a new context version) a process must get a new
      context id, which means that we would need to flush the context id from
      the TLB before running for the first time with this ID on every CPU. But,
      we use mm_cpumask to determine if this process has been running on this CPU
      before, and this mask is not reset after a wrap. So, there are two possible
      fixes for this issue:
      
      1. Clear mm cpumask whenever mm gets a new context id
      2. Unconditionally flush context every time process is running on a CPU
      
      This patch implements the first solution.
      Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
      Reviewed-by: Bob Picco <bob.picco@oracle.com>
      Reviewed-by: Steven Sistare <steven.sistare@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sparc/mm/hugepages: Fix setup_hugepagesz for invalid values. · f322980b
      Liam R. Howlett authored
      hugetlb_bad_size needs to be called on invalid values.  Also change the
      pr_warn to a pr_err to better align with other platforms.
      Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sparc: Machine description indices can vary · c982aa9c
      James Clarke authored
      VIO devices were being looked up by their index in the machine
      description node block, but this often varies over time as devices are
      added and removed. Instead, store the ID and look up using the type,
      config handle and ID.
      Signed-off-by: James Clarke <jrtc27@jrtc27.com>
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=112541
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sparc64: mm: fix copy_tsb to correctly copy huge page TSBs · 654f4807
      Mike Kravetz authored
      When a TSB grows beyond its current capacity, a new TSB is allocated
      and copy_tsb is called to copy entries from the old TSB to the new.
      A hash shift based on page size is used to calculate the index of an
      entry in the TSB.  copy_tsb has hard coded PAGE_SHIFT in these
      calculations.  However, for huge page TSBs the value REAL_HPAGE_SHIFT
      should be used.  As a result, when copy_tsb is called for a huge page
      TSB the entries are placed at the incorrect index in the newly
      allocated TSB.  When doing hardware table walk, the MMU does not
      match these entries and we end up in the TSB miss handling code.
      This code will then create and write an entry to the correct index
      in the TSB.  We take a performance hit for the table walk miss and
      recreation of these entries.
      
      Pass a new parameter to copy_tsb that is the page size shift to be
      used when copying the TSB.
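
      To see why the wrong shift misplaces entries, consider this small C
      sketch. It assumes the index is computed as (vaddr >> shift) masked by
      the table size, with 8 KB base pages and 4 MB real huge pages; the
      function name, the hash form and both shift values are assumptions made
      for illustration, not taken from the patch.

```c
#include <stdint.h>

#define DEMO_PAGE_SHIFT        13   /* assumed: 8 KB base pages */
#define DEMO_REAL_HPAGE_SHIFT  22   /* assumed: 4 MB real huge pages */

/*
 * Index of the TSB slot for a virtual address, for a table of nentries
 * slots (nentries must be a power of two).
 */
static inline uint64_t tsb_index(uint64_t vaddr, unsigned int shift,
				 uint64_t nentries)
{
	return (vaddr >> shift) & (nentries - 1);
}
```

      For the sixth huge page (vaddr = 5 << 22) and a 512-entry TSB, the huge
      page shift yields slot 5, while the base page shift yields slot 0, so a
      lookup that uses the huge page shift never finds the copied entry.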
      Suggested-by: Anthony Yznaga <anthony.yznaga@oracle.com>
      Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • arch/sparc: support NR_CPUS = 4096 · c79a1373
      Jane Chu authored
      Linux SPARC64 limits NR_CPUS to 4064 because init_cpu_send_mondo_info()
      only allocates a single page for NR_CPUS mondo entries. Thus we cannot
      use all 4096 CPUs on some SPARC platforms.
      
      To fix, allocate (2^order) pages where order is set according to the size
      of cpu_list for possible cpus. Since cpu_list_pa and cpu_mondo_block_pa
      are not used in asm code, there are no imm13 offsets from the base PA
      that will break because they can only reach one page.
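
      The arithmetic can be sketched as follows. The 4064 figure is
      consistent with one 8 KB page holding a 64-byte mondo block followed by
      16-bit CPU list entries; those three sizes, and all names below, are
      assumptions of this sketch rather than facts stated in the message.

```c
#include <stddef.h>

#define DEMO_PAGE_BYTES        8192u  /* assumed page size */
#define DEMO_MONDO_BLOCK_BYTES 64u    /* assumed size of the mondo block */
#define DEMO_CPU_ENTRY_BYTES   2u     /* assumed: one 16-bit entry per CPU */

/* How many CPU list entries fit in one page next to the mondo block. */
static unsigned int max_cpus_in_one_page(void)
{
	return (DEMO_PAGE_BYTES - DEMO_MONDO_BLOCK_BYTES) / DEMO_CPU_ENTRY_BYTES;
}

/* Smallest order such that (2^order) pages cover the requested size. */
static unsigned int order_for_bytes(size_t bytes)
{
	unsigned int order = 0;
	size_t span = DEMO_PAGE_BYTES;

	while (span < bytes) {
		span <<= 1;
		order++;
	}
	return order;
}
```

      Under these assumptions max_cpus_in_one_page() gives 4064, and
      order_for_bytes() plays the role of the "order" computed from the size
      of cpu_list.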
      
      Orabug: 25505750
      Signed-off-by: Jane Chu <jane.chu@oracle.com>
      Reviewed-by: Bob Picco <bob.picco@oracle.com>
      Reviewed-by: Atish Patra <atish.patra@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: cgroup skb progs cannot access ld_abs/ind · 92046578
      Daniel Borkmann authored
      Commit fb9a307d ("bpf: Allow CGROUP_SKB eBPF program to
      access sk_buff") enabled programs of BPF_PROG_TYPE_CGROUP_SKB
      type to use ld_abs/ind instructions. However, they cannot be
      used at this point: offsets relative to SKF_LL_OFF are resolved
      against skb_mac_header(skb), which is not yet set in the egress
      path and only becomes valid after __dev_queue_xmit() has done a
      general reset of the mac header. The resulting pointer is out of
      bounds, so bpf_internal_load_pointer_neg_helper() ends up reading
      data from a wrong offset.
      
      BPF_PROG_TYPE_CGROUP_SKB programs can use bpf_skb_load_bytes()
      already to access packet data, which is also more flexible than
      the insns carried over from cBPF.
      
      Fixes: fb9a307d ("bpf: Allow CGROUP_SKB eBPF program to access sk_buff")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Cc: Chenbo Feng <fengc@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: stmmac: fix a broken u32 less than zero check · 1d3028f4
      Colin Ian King authored
      The check that queue is greater than or equal to zero is always true
      because queue is a u32; queue is decremented and will wrap around
      and never go negative. Fix this by making queue an int.
      
      Detected by CoverityScan, CID#1428988 ("Unsigned compared against 0")
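
      The wrap-around is easy to demonstrate in isolation. The following is a
      minimal sketch with hypothetical function names; it only shows the
      arithmetic, not the driver's unwind loop.

```c
#include <stdint.h>

/* Decrementing an unsigned 32-bit zero wraps to the largest value. */
static uint32_t decrement_u32(uint32_t queue)
{
	return queue - 1u;
}

/* Decrementing a signed zero goes negative, so a sign check can fire. */
static int decrement_int(int queue)
{
	return queue - 1;
}

/* Counts iterations of a countdown loop that stops below zero. */
static unsigned int countdown_iterations(int queue)
{
	unsigned int iterations = 0;

	while (queue >= 0) {   /* terminates because queue is signed */
		iterations++;
		queue--;
	}
	return iterations;
}
```

      With a u32 counter, the value after queue 0 is 4294967295 rather than
      -1, so a loop of the countdown_iterations() shape would never exit.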
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: stmmac: fix completely hung TX when using TSO · 426849e6
      Niklas Cassel authored
      stmmac_tso_allocator can fail to set the Last Descriptor bit
      on a descriptor that actually was the last descriptor.
      
      This happens when the buffer of the last descriptor ends
      up having a size of exactly TSO_MAX_BUFF_SIZE.
      
      When the IP eventually reaches the next last descriptor,
      which actually has the bit set, the DMA will hang.
      
      When the DMA hangs, we get a tx timeout, however,
      since stmmac does not do a complete reset of the IP
      in stmmac_tx_timeout, we end up in a state with
      completely hung TX.
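
      One way such a boundary bug can arise is sketched below. This is a
      hypothetical simplification: the loop shape, the two candidate
      conditions and the value chosen for TSO_MAX_BUFF_SIZE are assumptions
      for illustration and are not quoted from the driver.

```c
#define TSO_MAX_BUFF_SIZE 16383   /* assumed value for this sketch */

/*
 * Splits total_len bytes into descriptors of at most TSO_MAX_BUFF_SIZE
 * bytes and returns how many of them get the Last Descriptor mark.
 * use_remaining_len selects which condition decides "last".
 */
static int count_last_marks(int total_len, int use_remaining_len)
{
	int tmp_len = total_len;
	int marks = 0;

	while (tmp_len > 0) {
		int buff_size = tmp_len >= TSO_MAX_BUFF_SIZE ?
				TSO_MAX_BUFF_SIZE : tmp_len;
		int last;

		if (use_remaining_len)
			last = tmp_len <= TSO_MAX_BUFF_SIZE;   /* nothing left after this */
		else
			last = buff_size < TSO_MAX_BUFF_SIZE;  /* misses an exactly full buffer */

		if (last)
			marks++;
		tmp_len -= TSO_MAX_BUFF_SIZE;
	}
	return marks;
}
```

      When total_len is an exact multiple of TSO_MAX_BUFF_SIZE, the
      size-based condition marks no descriptor at all, which matches the
      failure described above; the remaining-length condition marks exactly
      one.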
      Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
      Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
      Acked-by: Alexandre TORGUE <alexandre.torgue@st.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tun: use symmetric hash · feec084a
      Jason Wang authored
      Tun actually expects a symmetric hash for queue selection to work
      correctly, otherwise packets belonging to a single flow may be
      redirected to the wrong queue. So this patch switches to using
      __skb_get_hash_symmetric().
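
      What "symmetric" means here can be shown with a small standalone sketch:
      both directions of a flow must hash to the same value. The approach
      below (sorting the two endpoints before mixing) is one common way to
      get that property; it is not claimed to be how
      __skb_get_hash_symmetric() is implemented, and all names are
      hypothetical.

```c
#include <stdint.h>

struct flow {
	uint32_t saddr, daddr;   /* IPv4 addresses */
	uint16_t sport, dport;   /* L4 ports */
};

/* Generic 32-bit integer mixer (avalanche step). */
static uint32_t mix32(uint32_t h)
{
	h ^= h >> 16;
	h *= 0x85ebca6bu;
	h ^= h >> 13;
	h *= 0xc2b2ae35u;
	h ^= h >> 16;
	return h;
}

/* Hash that is identical for a flow and its reverse direction. */
static uint32_t flow_hash_symmetric(const struct flow *f)
{
	uint32_t a1 = f->saddr, a2 = f->daddr;
	uint16_t p1 = f->sport, p2 = f->dport;

	/* Put the endpoints in a canonical order before hashing. */
	if (a1 > a2 || (a1 == a2 && p1 > p2)) {
		uint32_t ta = a1;
		uint16_t tp = p1;

		a1 = a2; a2 = ta;
		p1 = p2; p2 = tp;
	}

	uint32_t h = mix32(a1);
	h = mix32(h ^ a2);
	h = mix32(h ^ (((uint32_t)p1 << 16) | p2));
	return h;
}

/* Queue selection: both directions land on the same queue. */
static unsigned int select_queue(const struct flow *f, unsigned int nqueues)
{
	return flow_hash_symmetric(f) % nqueues;
}
```

      With an asymmetric hash, the transmit and receive directions of one
      connection could pick different queues, which is the misdirection the
      message describes.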
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethoc: enable NAPI before poll may be scheduled · d220b942
      Max Filippov authored
      ethoc_reset enables device interrupts, so ethoc_interrupt may schedule a
      NAPI poll before NAPI is enabled in ethoc_open. This results in the
      device being unable to send or receive anything until it's closed and
      reopened. If the device is flooded with ingress packets, it may be
      unable to recover at all.
      Move napi_enable above ethoc_reset in ethoc_open to fix that.
      
      Fixes: a1702857 ("net: Add support for the OpenCores 10/100 Mbps Ethernet MAC.")
      Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
      Reviewed-by: Tobias Klauser <tklauser@distanz.ch>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: fix a null pointer dereference in br_afspec · 1020ce31
      Nikolay Aleksandrov authored
      We might call br_afspec() with p == NULL which is a valid use case if
      the action is on the bridge device itself, but the bridge tunnel code
      dereferences the p pointer without checking, so check if p is null
      first.
      Reported-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
      Fixes: efa5356b ("bridge: per vlan dst_metadata netlink support")
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: mv88e6xxx: fix 6085 frame mode masking · 5461bd41
      Vivien Didelot authored
      The register bits used for the frame mode were masked with DSA (0x1)
      instead of the mask value (0x3) in the 6085 implementation of
      port_set_frame_mode. Fix this.
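
      The effect of clearing a two-bit field with a one-bit value can be
      sketched as follows. The values 0x1 (DSA) and 0x3 (mask) come from the
      message; the bit position of the field and every identifier below are
      assumptions made for the example.

```c
#include <stdint.h>

#define DEMO_FRAME_MODE_SHIFT 8                               /* hypothetical position */
#define DEMO_FRAME_MODE_MASK  (0x3u << DEMO_FRAME_MODE_SHIFT) /* two-bit field */
#define DEMO_FRAME_MODE_DSA   (0x1u << DEMO_FRAME_MODE_SHIFT) /* one mode value */

/* Correct read-modify-write: clear the whole field, then set the mode. */
static uint16_t set_frame_mode(uint16_t reg, uint16_t mode)
{
	reg = (uint16_t)(reg & ~DEMO_FRAME_MODE_MASK);
	reg = (uint16_t)(reg | (mode & DEMO_FRAME_MODE_MASK));
	return reg;
}

/* Buggy variant: clears only the bit that happens to equal the DSA value. */
static uint16_t set_frame_mode_buggy(uint16_t reg, uint16_t mode)
{
	reg = (uint16_t)(reg & ~DEMO_FRAME_MODE_DSA);
	reg = (uint16_t)(reg | (mode & DEMO_FRAME_MODE_MASK));
	return reg;
}
```

      Starting from a register whose field holds 0x3, requesting the DSA mode
      gives 0x1 with the correct mask, but the buggy variant leaves the upper
      bit set and the field still reads 0x3.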
      
      Fixes: 56995cbc ("net: dsa: mv88e6xxx: Refactor CPU and DSA port setup")
      Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ravb: Fix use-after-free on `ifconfig eth0 down` · 79514ef6
      Eugeniu Rosca authored
      Commit a47b70ea ("ravb: unmap descriptors when freeing rings") has
      introduced the issue seen in [1] reproduced on H3ULCB board.
      
      Fix this by relocating the RX skb ringbuffer free operation, so that
      swiotlb page unmapping can be done first. Freeing of aligned TX buffers
      is not relevant to the issue seen in [1]. Still, reposition TX free
      calls as well, to have all kfree() operations performed consistently
      _after_ dma_unmap_*()/dma_free_*().
      
      [1] Console screenshot with the problem reproduced:
      
      salvator-x login: root
      root@salvator-x:~# ifconfig eth0 up
      Micrel KSZ9031 Gigabit PHY e6800000.ethernet-ffffffff:00: \
             attached PHY driver [Micrel KSZ9031 Gigabit PHY]   \
             (mii_bus:phy_addr=e6800000.ethernet-ffffffff:00, irq=235)
      IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
      root@salvator-x:~#
      root@salvator-x:~# ifconfig eth0 down
      
      ==================================================================
      BUG: KASAN: use-after-free in swiotlb_tbl_unmap_single+0xc4/0x35c
      Write of size 1538 at addr ffff8006d884f780 by task ifconfig/1649
      
      CPU: 0 PID: 1649 Comm: ifconfig Not tainted 4.12.0-rc4-00004-g112eb072 #32
      Hardware name: Renesas H3ULCB board based on r8a7795 (DT)
      Call trace:
      [<ffff20000808f11c>] dump_backtrace+0x0/0x3a4
      [<ffff20000808f4d4>] show_stack+0x14/0x1c
      [<ffff20000865970c>] dump_stack+0xf8/0x150
      [<ffff20000831f8b0>] print_address_description+0x7c/0x330
      [<ffff200008320010>] kasan_report+0x2e0/0x2f4
      [<ffff20000831eac0>] check_memory_region+0x20/0x14c
      [<ffff20000831f054>] memcpy+0x48/0x68
      [<ffff20000869ed50>] swiotlb_tbl_unmap_single+0xc4/0x35c
      [<ffff20000869fcf4>] unmap_single+0x90/0xa4
      [<ffff20000869fd14>] swiotlb_unmap_page+0xc/0x14
      [<ffff2000080a2974>] __swiotlb_unmap_page+0xcc/0xe4
      [<ffff2000088acdb8>] ravb_ring_free+0x514/0x870
      [<ffff2000088b25dc>] ravb_close+0x288/0x36c
      [<ffff200008aaf8c4>] __dev_close_many+0x14c/0x174
      [<ffff200008aaf9b4>] __dev_close+0xc8/0x144
      [<ffff200008ac2100>] __dev_change_flags+0xd8/0x194
      [<ffff200008ac221c>] dev_change_flags+0x60/0xb0
      [<ffff200008ba2dec>] devinet_ioctl+0x484/0x9d4
      [<ffff200008ba7b78>] inet_ioctl+0x190/0x194
      [<ffff200008a78c44>] sock_do_ioctl+0x78/0xa8
      [<ffff200008a7a128>] sock_ioctl+0x110/0x3c4
      [<ffff200008365a70>] vfs_ioctl+0x90/0xa0
      [<ffff200008365dbc>] do_vfs_ioctl+0x148/0xc38
      [<ffff2000083668f0>] SyS_ioctl+0x44/0x74
      [<ffff200008083770>] el0_svc_naked+0x24/0x28
      
      The buggy address belongs to the page:
      page:ffff7e001b6213c0 count:0 mapcount:0 mapping:          (null) index:0x0
      flags: 0x4000000000000000()
      raw: 4000000000000000 0000000000000000 0000000000000000 00000000ffffffff
      raw: 0000000000000000 ffff7e001b6213e0 0000000000000000 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff8006d884f680: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
       ffff8006d884f700: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
      >ffff8006d884f780: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                         ^
       ffff8006d884f800: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
       ffff8006d884f880: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
      ==================================================================
      Disabling lock debugging due to kernel taint
      root@salvator-x:~#
      
      Fixes: a47b70ea ("ravb: unmap descriptors when freeing rings")
      Signed-off-by: Eugeniu Rosca <erosca@de.adit-jv.com>
      Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'bpf-prog-map-ID' · 286556c0
      David S. Miller authored
      Martin KaFai Lau says:
      
      ====================
      Introduce bpf ID
      
      This patch series:
      1) Introduce ID for both bpf_prog and bpf_map.
      2) Add bpf commands to iterate the prog IDs and map
         IDs of the system.
      3) Add bpf commands to get a prog/map fd from an ID
      4) Add bpf command to get prog/map info from a fd.
         The prog/map info is a jump start in this patchset
         and it is not meant to be a complete list.  They can
         be extended in the future patches.
      
      v3:
      - I suspect v2 may not have applied cleanly.
        In particular, patch 1 has conflict with a recent
        change in struct bpf_prog_aux introduced at a similar time frame:
        8726679a ("bpf: teach verifier to track stack depth")
        v3 should have fixed it.
      
      v2:
      Compiler warning fixes:
      - Remove lockdep_is_held() usage.  Add comment
        to explain the lock situation instead.
      - Add static for idr related variables
      - Add __user to the uattr param in bpf_prog_get_info_by_fd()
        and bpf_map_get_info_by_fd().
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: Test for bpf ID · 95b9afd3
      Martin KaFai Lau authored
      Add test to exercise the bpf_prog/map id generation,
      bpf_(prog|map)_get_next_id(), bpf_(prog|map)_get_fd_by_id() and
      bpf_get_obj_info_by_fd().
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Acked-by: Alexei Starovoitov <ast@fb.com>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: Add BPF_OBJ_GET_INFO_BY_FD · 1e270976
      Martin KaFai Lau authored
      A single BPF_OBJ_GET_INFO_BY_FD cmd is used to obtain the info
      for both bpf_prog and bpf_map.  The kernel can figure out whether the
      fd is associated with a bpf_prog or a bpf_map.
      
      The suggested struct bpf_prog_info and struct bpf_map_info are
      not meant to be a complete list, and completing them is not the goal of
      this patch.  New fields can be added in future patches.
      
      The focus of this patch is to create the interface,
      BPF_OBJ_GET_INFO_BY_FD cmd for exposing the bpf_prog's and
      bpf_map's info.
      
      The obj's info, which will be extended (and get bigger) over time, is
      separated from the bpf_attr to avoid bloating the bpf_attr.
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Acked-by: Alexei Starovoitov <ast@fb.com>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: Add jited_len to struct bpf_prog · 783d28dd
      Martin KaFai Lau authored
      Add jited_len to struct bpf_prog.  It will be
      useful for the struct bpf_prog_info which will
      be added in the later patch.
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Acked-by: Alexei Starovoitov <ast@fb.com>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: Add BPF_MAP_GET_FD_BY_ID · bd5f5f4e
      Martin KaFai Lau authored
      Add BPF_MAP_GET_FD_BY_ID command to allow user to get a fd
      from a bpf_map's ID.
      
      bpf_map_inc_not_zero() is added and is called with map_idr_lock
      held.
      
      __bpf_map_put() is also added which has the 'bool do_idr_lock'
      param to decide if the map_idr_lock should be acquired when
      freeing the map->id.
      
      In the error path of bpf_map_inc_not_zero(), it may have to
      call __bpf_map_put(map, false) which does not need
      to take the map_idr_lock when freeing the map->id.
      
      It is currently limited to CAP_SYS_ADMIN, a restriction we can
      consider lifting in follow-up patches.
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Acked-by: Alexei Starovoitov <ast@fb.com>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: Add BPF_PROG_GET_FD_BY_ID · b16d9aa4
      Martin KaFai Lau authored
      Add BPF_PROG_GET_FD_BY_ID command to allow user to get a fd
      from a bpf_prog's ID.
      
      bpf_prog_inc_not_zero() is added and is called with prog_idr_lock
      held.
      
      __bpf_prog_put() is also added which has the 'bool do_idr_lock'
      param to decide if the prog_idr_lock should be acquired when
      freeing the prog->id.
      
      In the error path of bpf_prog_inc_not_zero(), it may have to
      call __bpf_prog_put(prog, false) which does not need
      to take the prog_idr_lock when freeing the prog->id.
      
      It is currently limited to CAP_SYS_ADMIN, a restriction we can
      consider lifting in follow-up patches.
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Acked-by: Alexei Starovoitov <ast@fb.com>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: Add BPF_(PROG|MAP)_GET_NEXT_ID command · 34ad5580
      Martin KaFai Lau authored
      This patch adds BPF_PROG_GET_NEXT_ID and BPF_MAP_GET_NEXT_ID
      to allow userspace to iterate all bpf_prog IDs and bpf_map IDs.
      
      The API is trying to be consistent with the existing
      BPF_MAP_GET_NEXT_KEY.
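
      From userspace, the iteration could look roughly like the sketch below.
      It assumes a Linux system whose <linux/bpf.h> already carries
      BPF_PROG_GET_NEXT_ID and the start_id/next_id fields of union bpf_attr,
      and that the end of the ID space is reported as ENOENT (by analogy with
      BPF_MAP_GET_NEXT_KEY); the helper names sys_bpf and count_bpf_prog_ids
      are hypothetical.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/bpf.h>

/* Thin wrapper around the raw bpf(2) system call. */
static long sys_bpf(int cmd, union bpf_attr *attr, unsigned int size)
{
	return syscall(__NR_bpf, cmd, attr, size);
}

/*
 * Walks all bpf_prog IDs. Returns the number of IDs seen, or a negative
 * errno value (for example -EPERM when the caller lacks the privilege).
 */
static int count_bpf_prog_ids(void)
{
	union bpf_attr attr;
	unsigned int id = 0;   /* start before the first ID */
	int count = 0;

	for (;;) {
		memset(&attr, 0, sizeof(attr));
		attr.start_id = id;

		if (sys_bpf(BPF_PROG_GET_NEXT_ID, &attr, sizeof(attr)) < 0) {
			if (errno == ENOENT)
				return count;   /* assumed end-of-list signal */
			return -errno;
		}
		id = attr.next_id;     /* continue after the ID just returned */
		count++;
	}
}
```

      Each returned ID could then be handed to BPF_PROG_GET_FD_BY_ID to obtain
      a file descriptor for further inspection. Swapping in
      BPF_MAP_GET_NEXT_ID would walk map IDs the same way.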
      
      It is currently limited to CAP_SYS_ADMIN, a restriction we can
      consider lifting in follow-up patches.
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Acked-by: Alexei Starovoitov <ast@fb.com>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>