1. 06 Mar, 2013 17 commits
    • pkt_sched: sch_qfq: do not allow virtual time to jump if an aggregate is in service · 40dd2d54
      Paolo Valente authored
      By definition of (the algorithm of) QFQ+, the system virtual time must
      be pushed up only if there is no 'eligible' aggregate, i.e. no
      aggregate that would have started to be served also in the ideal
      system emulated by QFQ+.  QFQ+ serves only eligible aggregates, hence
      the aggregate currently in service is eligible.  As a consequence, to
      decide whether there is no eligible aggregate, QFQ+ must also check
      whether there is no aggregate in service.
      Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
      Reviewed-by: Fabio Checconi <fchecconi@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
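The rule stated in commit 40dd2d54 can be sketched as a small predicate. This is only an illustration under stated assumptions: `struct sched_state`, `eligible_mask`, `in_service`, `may_push_up_vtime()` and `update_vtime()` are hypothetical names rather than identifiers from sch_qfq.c, and the real scheduler tracks eligibility with more state than a single mask.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified aggregate; not the kernel's struct qfq_aggregate. */
struct agg {
    uint64_t start; /* virtual start time of the aggregate */
};

/* Hypothetical, simplified scheduler state; not the kernel's struct qfq_sched. */
struct sched_state {
    uint64_t virtual_time;        /* system virtual time */
    unsigned long eligible_mask;  /* one bit per group holding eligible aggregates */
    const struct agg *in_service; /* aggregate currently in service, or NULL */
};

/*
 * The virtual time may be pushed up only if no aggregate is eligible.
 * The aggregate in service is eligible by definition, so it must be
 * checked in addition to the eligible set.
 */
static bool may_push_up_vtime(const struct sched_state *s)
{
    return s->eligible_mask == 0 && s->in_service == NULL;
}

/* Push the virtual time up to min_start, but only when that is allowed. */
static void update_vtime(struct sched_state *s, uint64_t min_start)
{
    if (may_push_up_vtime(s) && s->virtual_time < min_start)
        s->virtual_time = min_start;
}
```

With an aggregate in service, `update_vtime()` leaves the virtual time untouched even if the eligible set is empty, which is the behaviour the commit message asks for.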
    • pkt_sched: sch_qfq: prevent budget from wrapping around after a dequeue · a0143efa
      Paolo Valente authored
      Aggregate budgets are computed so as to guarantee that, after an
      aggregate has been selected for service, that aggregate has enough
      budget to serve at least one maximum-size packet for the classes it
      contains. For this reason, after a new aggregate has been selected
      for service, its next packet is immediately dequeued, without any
      further control.
      
      The maximum packet size for a class, lmax, can be changed through
      qfq_change_class(). In case the user sets lmax to a lower value than
      the size of some of the still-to-arrive packets, QFQ+ will
      automatically push up lmax as it enqueues these packets.  This
      automatic push up is likely to happen with TSO/GSO.
      
      In any case, if lmax is assigned a lower value than the size of some
      of the packets already enqueued for the class, then the following
      problem may occur: the size of the next packet to dequeue for the
      class may happen to be larger than lmax, after the aggregate to which
      the class belongs has been just selected for service. In this case,
      even the budget of the aggregate, which is an unsigned value, may be
      lower than the size of the next packet to dequeue. After dequeueing
      this packet and subtracting its size from the budget, the latter would
      wrap around.
      
      This fix prevents the budget from wrapping around after any packet
      dequeue.
      Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
      Reviewed-by: Fabio Checconi <fchecconi@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
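The wrap-around described in commit a0143efa is ordinary unsigned arithmetic: subtracting a larger value from a smaller one yields a huge number instead of a negative one. A minimal sketch of one way to avoid it, a saturating subtraction, follows. The function names are hypothetical and the 32-bit budget width is an assumption made for the example, not a statement about the kernel's data types.

```c
#include <stdint.h>

/* Charge a packet of len bytes against an unsigned budget,
 * saturating at zero instead of wrapping around. */
static uint32_t charge_budget(uint32_t budget, uint32_t len)
{
    return len > budget ? 0 : budget - len;
}

/* For comparison only: the unguarded subtraction wraps modulo 2^32. */
static uint32_t charge_budget_unguarded(uint32_t budget, uint32_t len)
{
    return budget - len;
}
```

For a budget of 1500 and a 9000-byte packet, the guarded version returns 0, while the unguarded one returns 4294959796.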
    • pkt_sched: sch_qfq: serve activated aggregates immediately if the scheduler is empty · 2f3b89a1
      Paolo Valente authored
      If no aggregate is in service, then the function qfq_dequeue() does
      not dequeue any packet. For this reason, to guarantee that QFQ+ is work
      conserving, a just-activated aggregate must be set as in service
      immediately if it happens to be the only active aggregate.
      This is done by the function qfq_enqueue().
      
      Unfortunately, the function qfq_add_to_agg(), used to add a class to
      an aggregate, does not perform this important additional operation.
      In particular, suppose that 1) qfq_add_to_agg() is invoked to complete the
      move of a class from a source aggregate, which becomes inactive because of
      the move, to a destination aggregate, which instead becomes active, and
      2) the destination aggregate becomes the only active aggregate. Then this
      aggregate is nevertheless not set as in service, and QFQ+ remains in a
      non-work-conserving state until a new invocation of qfq_enqueue()
      recovers the situation.
      
      This fix solves the problem by moving the logic for setting an
      aggregate as in service directly into the function qfq_activate_agg().
      Hence, from whatever point qfq_activate_agg() is invoked, QFQ+
      remains work conserving.  The more complex logic of this new version
      of qfq_activate_agg() is not needed in qfq_dequeue() to reschedule an
      aggregate that exhausts its budget, so that aggregate is now
      rescheduled by directly invoking the functions needed.
      Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
      Reviewed-by: Fabio Checconi <fchecconi@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • pkt_sched: sch_qfq: fix the update of eligible-group sets · 624b85fb
      Paolo Valente authored
      Between two invocations of make_eligible, the system virtual time may
      happen to grow enough that, in its binary representation, a bit with
      higher order than 31 flips. This happens especially with
      TSO/GSO. Before this fix, the mask used in make_eligible was computed
      as (1UL<<index_of_last_flipped_bit)-1, whose value is well defined on
      a 64-bit architecture, because index_of_last_flipped_bit <= 63, but is in
      general undefined on a 32-bit architecture if index_of_last_flipped_bit > 31.
      The fix just replaces 1UL with 1ULL.
      Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
      Reviewed-by: Fabio Checconi <fchecconi@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
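The mask computation discussed in commit 624b85fb can be shown in isolation. In C, shifting a value by at least the width of its type is undefined, so `1UL << index` is only safe up to index 31 where `unsigned long` is 32 bits wide. The helper below is a sketch with a hypothetical name (`low_bits_mask`); it only illustrates why the `ULL` suffix matters and is not the kernel's make_eligible code.

```c
#include <stdint.h>

/*
 * Return a mask with bits [0, index) set.
 * 1ULL is at least 64 bits wide on every conforming compiler, so the
 * shift is well defined for any index in 0..63, including on targets
 * where unsigned long has only 32 bits.
 */
static uint64_t low_bits_mask(unsigned int index)
{
    return (1ULL << index) - 1;
}
```

For example, `low_bits_mask(40)` yields `0xFFFFFFFFFF`, a value that a 32-bit `unsigned long` could not represent at all.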
    • pkt_sched: sch_qfq: properly cap timestamps in charge_actual_service · 9b99b7e9
      Paolo Valente authored
      QFQ+ schedules the active aggregates in a group using a bucket list
      (one list per group). The bucket in which each aggregate is inserted
      depends on the aggregate's timestamps, and the number
      of buckets in a group is enough to accommodate the possible (range of)
      values of the timestamps of all the aggregates in the group. For this
      property to hold, timestamps must however be computed correctly.  One
      necessary condition for computing timestamps correctly is that the
      number of bits dequeued for each aggregate, while the aggregate is in
      service, does not exceed the maximum budget budgetmax assigned to the
      aggregate.
      
      For each aggregate, budgetmax is proportional to the number of classes
      in the aggregate. If the number of classes of the aggregate is
      decreased through qfq_change_class(), then budgetmax is decreased
      automatically as well.  Problems may occur if the aggregate is in
      service when budgetmax is decreased, because the current remaining
      budget of the aggregate and/or the service already received by the
      aggregate may happen to be larger than the new value of budgetmax.  In
      this case, when the aggregate is eventually deselected and its
      timestamps are updated, the aggregate may happen to have received an
      amount of service larger than budgetmax.  This may cause the aggregate
      to be assigned a higher virtual finish time than the maximum
      acceptable value for the last bucket in the bucket list of the group.
      
      This fix introduces a cap that addresses this issue.
      Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
      Reviewed-by: Fabio Checconi <fchecconi@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
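The cap mentioned in commit 9b99b7e9 can be pictured as clamping the service charged to an aggregate to its current budgetmax. The sketch below assumes a simplified, hypothetical `struct agg_budget` with 32-bit fields and assumes the remaining budget never exceeds the initial one. It illustrates the idea and is not the kernel's charge_actual_service().

```c
#include <stdint.h>

/* Hypothetical, simplified view of an aggregate's budget bookkeeping. */
struct agg_budget {
    uint32_t initial_budget; /* budget when the aggregate was selected */
    uint32_t budget;         /* budget still remaining */
    uint32_t budgetmax;      /* current maximum; may have shrunk while in service */
};

/*
 * Service to charge when the aggregate is deselected: the amount
 * actually consumed, capped at budgetmax so that the resulting finish
 * time stays within the range the bucket list can represent.
 */
static uint32_t service_to_charge(const struct agg_budget *a)
{
    uint32_t received = a->initial_budget - a->budget;

    return received < a->budgetmax ? received : a->budgetmax;
}
```

If an aggregate started with a budget of 9000, has 1000 left, and budgetmax was lowered to 4000 in the meantime, the function charges 4000 rather than 8000.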
    • net/irda: Raise dtr in non-blocking open · f74861ca
      Peter Hurley authored
      DTR/RTS need to be raised, regardless of the open() mode, but not
      if the port has already shut down.
      Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/irda: Use barrier to set task state · 0b176ce3
      Peter Hurley authored
      Without a memory and compiler barrier, the task state change
      can migrate relative to the condition testing in a blocking loop.
      However, the task state change must be visible across all cpus
      prior to testing those conditions. Failing to do this can result
      in the familiar 'lost wakeup' and this task will hang until killed.
      Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/irda: Hold port lock while bumping blocked_open · 2f7c069b
      Peter Hurley authored
      Although tty_lock() already protects concurrent update to
      blocked_open, that fails to meet the separation-of-concerns between
      tty_port and tty.
      Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/irda: Fix port open counts · a4ed2e73
      Peter Hurley authored
      Saving the port count bump is unsafe. If the tty is hung up while
      this open was blocking, the port count is zeroed.
      
      Explicitly check if the tty was hung up while blocking, and correct
      the port count if not.
      Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net into intel · 0305d068
      David S. Miller authored
      Jeff Kirsher says:
      
      ===================
      This series contains fixes to e1000e and igb.
      
      The e1000e fix resolves an issue at 1000Mbps link speed, where one of the
      MAC's internal clocks can be stopped for up to 4us when entering K1 (a
      power mode of the MAC-PHY interconnect).  If the MAC is waiting for
      completion indications for 2 DMA write requests into Host memory
      (e.g. descriptor writeback or Rx packet writing) and the
      indications occur while the clock is stopped, both indications will be
      missed by the MAC causing the MAC to wait for the completion indications
      and be unable to generate further DMA write requests.  This results in an
      apparent hardware hang.  The patch works around the issue by disabling
      the de-assertion of the clock request when 1000Mbps link is acquired (K1
      must be disabled while doing this).
      
      The igb fix to drop BUILD_BUG_ON check from igb_build_rx_buffer resolves
      a build error on s390 devices.  The igb driver was throwing a build error
      due to the fact that a frame built using build_skb would be larger than 2K.
      Since this is not likely to change at any point in the future we are better
      off just dropping the check since we already had a check in
      igb_set_rx_buffer_len that will just disable the usage of build_skb anyway.
      
      The igb fix for i210 link setup changes the setup copper link function
      to use a switch statement, so that the appropriate setup link function
      is called for the given PHY types.
      
      Lastly, the igb fix for a lockdep issue in igb_get_i2c_client resolves
      the issue by re-factoring the initialization and usage of the i2c_client.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc · 9f225788
      Linus Torvalds authored
      Pull powerpc fixes from Ben Herrenschmidt:
       "Here are a few powerpc bits & fixes for rc1.  A couple of str*cpy
        fixes, some fixes in handling the FSCR register on Power8 (controls
        the enabling of processor features), a 32-bit build fix and a couple
        more nits."
      
      * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
        powerpc: Set DSCR bit in FSCR setup
        powerpc: Add DSCR FSCR register bit definition
        powerpc: Fix setting FSCR for HV=0 and on secondary CPUs
        powerpc: Wireup the kcmp syscall to sys_ni
        powerpc: Remove unused BITOP_LE_SWIZZLE macro
        powerpc: Avoid link stack corruption in MMU on syscall entry path
        drivers/tty/hvc: Use strlcpy instead of strncpy
        powerpc/pseries/hvcserver: Fix strncpy buffer limit in location code
        powerpc: Fix compile of sha1-powerpc-asm.S on 32-bit
    • Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux · d7b815d4
      Linus Torvalds authored
      Pull virtio hwrng fix from Rusty Russell:
       "Nasty side-effect of vmalloc'ing modules: their static vars cannot be
        put into scatterlists.  Jens has a check queued for this, so it
        shouldn't happen again.
      
        We could fix this in virtio_rng, but it's actually far easier to just
        do it in the core"
      
      * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
        hw_random: make buffer usable in scatterlist.
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 9da060d0
      Linus Torvalds authored
      Pull networking fixes from David Miller:
       "A moderately sized pile of fixes, some specifically for merge window
        introduced regressions although others are for longer standing items
        and have been queued up for -stable.
      
        I'm kind of tired of all the RDS protocol bugs over the years, to be
        honest, it's way out of proportion to the number of people who
        actually use it.
      
         1) Fix missing range initialization in netfilter IPSET, from Jozsef
            Kadlecsik.
      
         2) ieee80211_local->tim_lock needs to use BH disabling, from Johannes
            Berg.
      
         3) Fix DMA syncing in SFC driver, from Ben Hutchings.
      
         4) Fix regression in BOND device MAC address setting, from Jiri
            Pirko.
      
         5) Missing usb_free_urb in ISDN Hisax driver, from Marina Makienko.
      
         6) Fix UDP checksumming in bnx2x driver for 57710 and 57711 chips,
            fix from Dmitry Kravkov.
      
         7) Missing cfgspace_lock initialization in BCMA driver.
      
         8) Validate parameter size for SCTP assoc stats getsockopt(), from
            Guenter Roeck.
      
         9) Fix SCTP association hangs, from Lee A Roberts.
      
        10) Fix jumbo frame handling in r8169, from Francois Romieu.
      
        11) Fix phy_device memory leak, from Petr Malat.
      
        12) Omit trailing FCS from frames received in BGMAC driver, from Hauke
            Mehrtens.
      
        13) Missing socket refcount release in L2TP, from Guillaume Nault.
      
        14) sctp_endpoint_init should respect passed in gfp_t, rather than use
            GFP_KERNEL unconditionally.  From Dan Carpenter.
      
        15) Add ASIX AX88179 USB driver, from Freddy Xin.
      
        16) Remove MAINTAINERS entries for drivers deleted during the merge
            window, from Cesar Eduardo Barros.
      
        17) RDS protocol can try to allocate huge amounts of memory, check
            that the user's request length makes sense, from Cong Wang.
      
        18) SCTP should use the provided KMALLOC_MAX_SIZE instead of its own,
            bogus, definition.  From Cong Wang.
      
        19) Fix deadlocks in FEC driver by moving TX reclaim into NAPI poll,
            from Frank Li.  Also, fix a build error introduced in the merge
            window.
      
        20) Fix bogus purging of default routes in ipv6, from Lorenzo Colitti.
      
        21) Don't double count RTT measurements when we leave the TCP receive
            fast path, from Neal Cardwell."
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (61 commits)
        tcp: fix double-counted receiver RTT when leaving receiver fast path
        CAIF: fix sparse warning for caif_usb
        rds: simplify a warning message
        net: fec: fix build error in no MXC platform
        net: ipv6: Don't purge default router if accept_ra=2
        net: fec: put tx to napi poll function to fix dead lock
        sctp: use KMALLOC_MAX_SIZE instead of its own MAX_KMALLOC_SIZE
        rds: limit the size allocated by rds_message_alloc()
        MAINTAINERS: remove eexpress
        MAINTAINERS: remove drivers/net/wan/cycx*
        MAINTAINERS: remove 3c505
        caif_dev: fix sparse warnings for caif_flow_cb
        ax88179_178a: ASIX AX88179_178A USB 3.0/2.0 to gigabit ethernet adapter driver
        sctp: use the passed in gfp flags instead GFP_KERNEL
        ipv[4|6]: correct dropwatch false positive in local_deliver_finish
        l2tp: Restore socket refcount when sendmsg succeeds
        net/phy: micrel: Disable asymmetric pause for KSZ9021
        bgmac: omit the fcs
        phy: Fix phy_device_free memory leak
        bnx2x: Fix KR2 work-around condition
        ...
    • Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · e3b59518
      Linus Torvalds authored
      Pull irq fixes and cleanups from Thomas Gleixner:
       "Commit e5ab012c ("nohz: Make tick_nohz_irq_exit() irq safe") is
        the first commit in the series and the minimal necessary bugfix, which
        needs to go back into stable.
      
        The remaining commits enforce irq disabling in irq_exit(), sanitize
        the hardirq/softirq preempt count transition and remove a bunch of no
        longer necessary conditionals."
      
      I personally love getting rid of the very subtle and confusing
      IRQ_EXIT_OFFSET thing.  Even apart from the whole "more lines removed
      than added" thing.
      
      * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irq: Don't re-enable interrupts at the end of irq_exit
        irq: Remove IRQ_EXIT_OFFSET workaround
        Revert "nohz: Make tick_nohz_irq_exit() irq safe"
        irq: Sanitize invoke_softirq
        irq: Ensure irq_exit() code runs with interrupts disabled
        nohz: Make tick_nohz_irq_exit() irq safe
    • Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 6516ab6f
      Linus Torvalds authored
      Pull smpboot bugfix from Thomas Gleixner:
       "A single bugfix for a regression introduced with the conversion of the
        stop machine threads to the generic smpboot thread management
        facility"
      
      * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        stop_machine: Mark per cpu stopper enabled early
    • Merge tag 'gpio-for-linus' of git://git.secretlab.ca/git/linux · 06e79d3b
      Linus Torvalds authored
      Pull second round of GPIO changes from Grant Likely:
       "This branch contains a few bug fixes that I missed the first time
        around and updates to the gpio_desc series included in the first pull
        request.  This tag has been retagged to drop the 2 head commits
        because one of them caused a build failure."
      
      * tag 'gpio-for-linus' of git://git.secretlab.ca/git/linux:
        gpio/gpio-ich: fix ichx_gpio_check_available() return what callers expect
        gpiolib: move comment to right function
        gpiolib: use const parameters when possible
        gpiolib: check descriptors validity before use
    • Merge tag 'md-3.9' of git://neil.brown.name/md · a5e0d731
      Linus Torvalds authored
      Pull md updates from NeilBrown:
       "Mostly little bugfixes.
      
        Only "feature" is a new RAID10 layout which slightly improves the
        number of sets of devices that can concurrently fail, without data
        loss."
      
      * tag 'md-3.9' of git://neil.brown.name/md:
        md: expedite metadata update when switching  read-auto -> active
        md: remove CONFIG_MULTICORE_RAID456
        md/raid1,raid10: fix deadlock with freeze_array()
        md/raid0: improve error message when converting RAID4-with-spares to RAID0
        md: raid0: fix error return from create_stripe_zones.
        md: fix two bugs when attempting to resize RAID0 array.
        DM RAID: Add support for MD's RAID10 "far" and "offset" algorithms
        MD RAID10: Improve redundancy for 'far' and 'offset' algorithms (part 2)
        MD RAID10: Improve redundancy for 'far' and 'offset' algorithms (part 1)
        MD RAID10: Minor non-functional code changes
        md: raid1,10: Handle REQ_WRITE_SAME flag in write bios
        md: protect against crash upon fsync on ro array
  2. 05 Mar, 2013 13 commits
  3. 04 Mar, 2013 9 commits
    • hw_random: make buffer usable in scatterlist. · f7f154f1
      Rusty Russell authored
      virtio_rng feeds the randomness buffer handed by the core directly
      into the scatterlist, since commit bb347d98.
      
      However, if CONFIG_HW_RANDOM=m, the static buffer isn't a linear address
      (at least on most archs).  We could fix this in virtio_rng, but it's actually
      far easier to just do it in the core as virtio_rng would have to allocate
      a buffer every time (it doesn't know how much the core will want to read).
      Reported-by: Aurelien Jarno <aurelien@aurel32.net>
      Tested-by: Aurelien Jarno <aurelien@aurel32.net>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: stable@kernel.org
    • tcp: fix double-counted receiver RTT when leaving receiver fast path · aab2b4bf
      Neal Cardwell authored
      We should not update ts_recent and call tcp_rcv_rtt_measure_ts() both
      before and after going to step5. That wastes CPU and double-counts the
      receiver-side RTT sample.
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • CAIF: fix sparse warning for caif_usb · d2123be0
      Silviu-Mihai Popescu authored
      This fixes the following sparse warning:
      net/caif/caif_usb.c:84:16: warning: symbol 'cfusbl_create' was not
      declared. Should it be static?
      Signed-off-by: Silviu-Mihai Popescu <silviupopescu1990@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rds: simplify a warning message · 7dac1b51
      Cong Wang authored
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
      Signed-off-by: Cong Wang <amwang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: fec: fix build error in no MXC platform · acac8406
      Frank Li authored
      Build error caused by
      commit ff43da86
      ("NET: FEC: dynamtic check DMA desc buff type")
      
      drivers/net/ethernet/freescale/fec.c: In function ‘fec_enet_get_nextdesc’:
      drivers/net/ethernet/freescale/fec.c:215:18: error: invalid use of undefined type ‘struct bufdesc_ex’
      drivers/net/ethernet/freescale/fec.c: In function ‘fec_enet_get_prevdesc’:
      drivers/net/ethernet/freescale/fec.c:224:18: error: invalid use of undefined type ‘struct bufdesc_ex’
      drivers/net/ethernet/freescale/fec.c: In function ‘fec_enet_start_xmit’:
      drivers/net/ethernet/freescale/fec.c:286:37: error: arithmetic on pointer to an incomplete type
      drivers/net/ethernet/freescale/fec.c:287:13: error: arithmetic on pointer to an incomplete type
      drivers/net/ethernet/freescale/fec.c:324:7: error: dereferencing pointer to incomplete type etc....
      Signed-off-by: Frank Li <Frank.Li@freescale.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipv6: Don't purge default router if accept_ra=2 · 3e8b0ac3
      Lorenzo Colitti authored
      Setting net.ipv6.conf.<interface>.accept_ra=2 causes the kernel
      to accept RAs even when forwarding is enabled. However, enabling
      forwarding purges all default routes on the system, breaking
      connectivity until the next RA is received. Fix this by not
      purging default routes on interfaces that have accept_ra=2.
      Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
      Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: fec: put tx to napi poll function to fix dead lock · de5fb0a0
      Frank Li authored
      The upper stack already holds a lock when it calls ndo_start_xmit, so
      fec_enet_start_xmit does not need its own spin lock.
      start_xmit only updates fep->cur_tx.
      fec_enet_tx only updates fep->dirty_tx.
      
      Reserve one empty buffer descriptor to tell full from empty:
      cur_tx == dirty_tx    means full
      cur_tx == dirty_tx +1 means empty
      
      So the is_full variable is no longer needed.
      
      This fixes the following spin lock deadlock:
      
      =================================
      [ INFO: inconsistent lock state ]
      3.8.0-rc5+ #107 Not tainted
      ---------------------------------
      inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
      ptp4l/615 [HC1[1]:SC0[0]:HE0:SE1] takes:
       (&(&list->lock)->rlock#3){?.-...}, at: [<8042c3c4>] skb_queue_tail+0x20/0x50
       {HARDIRQ-ON-W} state was registered at:
       [<80067250>] mark_lock+0x154/0x4e8
       [<800676f4>] mark_irqflags+0x110/0x1a4
       [<80069208>] __lock_acquire+0x494/0x9c0
       [<80069ce8>] lock_acquire+0x90/0xa4
       [<80527ad0>] _raw_spin_lock_bh+0x44/0x54
       [<804877e0>] first_packet_length+0x38/0x1f0
       [<804879e4>] udp_poll+0x4c/0x5c
       [<804231f8>] sock_poll+0x24/0x28
       [<800d27f0>] do_poll.isra.10+0x120/0x254
       [<800d36e4>] do_sys_poll+0x15c/0x1e8
       [<800d3828>] sys_poll+0x60/0xc8
       [<8000e780>] ret_fast_syscall+0x0/0x3c
      
       *** DEADLOCK ***
      
       1 lock held by ptp4l/615:
        #0:  (&(&fep->hw_lock)->rlock){-.-...}, at: [<80355f9c>] fec_enet_tx+0x24/0x268
        stack backtrace:
        Backtrace:
        [<800121e0>] (dump_backtrace+0x0/0x10c) from [<80516210>] (dump_stack+0x18/0x1c)
        r6:8063b1fc r5:bf38b2f8 r4:bf38b000 r3:bf38b000
        [<805161f8>] (dump_stack+0x0/0x1c) from [<805189d0>] (print_usage_bug.part.34+0x164/0x1a4)
        [<8051886c>] (print_usage_bug.part.34+0x0/0x1a4) from [<80518a88>] (print_usage_bug+0x78/0x88)
        r8:80065664 r7:bf38b2f8 r6:00000002 r5:00000000 r4:bf38b000
        [<80518a10>] (print_usage_bug+0x0/0x88) from [<80518b58>] (mark_lock_irq+0xc0/0x270)
        r7:bf38b000 r6:00000002 r5:bf38b2f8 r4:00000000
        [<80518a98>] (mark_lock_irq+0x0/0x270) from [<80067270>] (mark_lock+0x174/0x4e8)
        [<800670fc>] (mark_lock+0x0/0x4e8) from [<80067744>] (mark_irqflags+0x160/0x1a4)
        [<800675e4>] (mark_irqflags+0x0/0x1a4) from [<80069208>] (__lock_acquire+0x494/0x9c0)
        r5:00000002 r4:bf38b2f8
        [<80068d74>] (__lock_acquire+0x0/0x9c0) from [<80069ce8>] (lock_acquire+0x90/0xa4)
        [<80069c58>] (lock_acquire+0x0/0xa4) from [<805278d8>] (_raw_spin_lock_irqsave+0x4c/0x60)
        [<8052788c>] (_raw_spin_lock_irqsave+0x0/0x60) from [<8042c3c4>] (skb_queue_tail+0x20/0x50)
        r6:bfbb2180 r5:bf1d0190 r4:bf1d0184
        [<8042c3a4>] (skb_queue_tail+0x0/0x50) from [<8042c4cc>] (sock_queue_err_skb+0xd8/0x188)
        r6:00000056 r5:bfbb2180 r4:bf1d0000 r3:00000000
        [<8042c3f4>] (sock_queue_err_skb+0x0/0x188) from [<8042d15c>] (skb_tstamp_tx+0x70/0xa0)
        r6:bf0dddb0 r5:bf1d0000 r4:bfbb2180 r3:00000004
        [<8042d0ec>] (skb_tstamp_tx+0x0/0xa0) from [<803561d0>] (fec_enet_tx+0x258/0x268)
        r6:c089d260 r5:00001c00 r4:bfbd0000
        [<80355f78>] (fec_enet_tx+0x0/0x268) from [<803562cc>] (fec_enet_interrupt+0xec/0xf8)
        [<803561e0>] (fec_enet_interrupt+0x0/0xf8) from [<8007d5b0>] (handle_irq_event_percpu+0x54/0x1a0)
        [<8007d55c>] (handle_irq_event_percpu+0x0/0x1a0) from [<8007d740>] (handle_irq_event+0x44/0x64)
        [<8007d6fc>] (handle_irq_event+0x0/0x64) from [<80080690>] (handle_fasteoi_irq+0xc4/0x15c)
        r6:bf0dc000 r5:bf811290 r4:bf811240 r3:00000000
        [<800805cc>] (handle_fasteoi_irq+0x0/0x15c) from [<8007ceec>] (generic_handle_irq+0x28/0x38)
        r5:807130c8 r4:00000096
        [<8007cec4>] (generic_handle_irq+0x0/0x38) from [<8000f16c>] (handle_IRQ+0x54/0xb4)
        r4:8071d280 r3:00000180
        [<8000f118>] (handle_IRQ+0x0/0xb4) from [<80008544>] (gic_handle_irq+0x30/0x64)
        r8:8000e924 r7:f4000100 r6:bf0ddef8 r5:8071c974 r4:f400010c
        r3:00000000
        [<80008514>] (gic_handle_irq+0x0/0x64) from [<8000e2e4>] (__irq_svc+0x44/0x5c)
        Exception stack(0xbf0ddef8 to 0xbf0ddf40)
      Signed-off-by: Frank Li <Frank.Li@freescale.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
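The full/empty convention stated in commit de5fb0a0 (cur_tx == dirty_tx means full, cur_tx == dirty_tx + 1 means empty) can be tried out with a plain index ring. This is a single-threaded sketch under stated assumptions: `struct tx_ring`, `RING_SIZE` and the `ring_*` helpers are hypothetical, indices stand in for the driver's descriptor pointers, and no locking or memory ordering is modelled.

```c
#include <stdbool.h>

#define RING_SIZE 4u /* hypothetical ring size; one slot always stays reserved */

struct tx_ring {
    unsigned int cur_tx;   /* next slot the transmit path will fill */
    unsigned int dirty_tx; /* last slot the reclaim path has cleaned */
};

static unsigned int ring_next(unsigned int i)
{
    return (i + 1) % RING_SIZE;
}

/* Start empty: cur_tx is one step ahead of dirty_tx. */
static void ring_init(struct tx_ring *r)
{
    r->dirty_tx = 0;
    r->cur_tx = ring_next(r->dirty_tx);
}

static bool ring_full(const struct tx_ring *r)
{
    return r->cur_tx == r->dirty_tx;
}

static bool ring_empty(const struct tx_ring *r)
{
    return r->cur_tx == ring_next(r->dirty_tx);
}

/* Transmit side: writes only cur_tx. Returns the slot used, or -1 if full. */
static int ring_produce(struct tx_ring *r)
{
    if (ring_full(r))
        return -1;
    int slot = (int)r->cur_tx;
    r->cur_tx = ring_next(r->cur_tx);
    return slot;
}

/* Reclaim side: writes only dirty_tx. Returns the slot cleaned, or -1 if empty. */
static int ring_reclaim(struct tx_ring *r)
{
    if (ring_empty(r))
        return -1;
    r->dirty_tx = ring_next(r->dirty_tx);
    return (int)r->dirty_tx;
}
```

Because each side writes only its own index, a separate is_full flag is unnecessary; the price is that a ring of RING_SIZE slots holds at most RING_SIZE - 1 packets.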
    • sctp: use KMALLOC_MAX_SIZE instead of its own MAX_KMALLOC_SIZE · 3f736868
      Cong Wang authored
      Don't define its own MAX_KMALLOC_SIZE; use the one
      defined in mm.
      
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Cc: Sridhar Samudrala <sri@us.ibm.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: David S. Miller <davem@davemloft.net>
      Signed-off-by: Cong Wang <amwang@redhat.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rds: limit the size allocated by rds_message_alloc() · ece6b0a2
      Cong Wang authored
      Dave Jones reported the following bug:
      
      "When fed mangled socket data, rds will trust what userspace gives it,
      and tries to allocate enormous amounts of memory larger than what
      kmalloc can satisfy."
      
      WARNING: at mm/page_alloc.c:2393 __alloc_pages_nodemask+0xa0d/0xbe0()
      Hardware name: GA-MA78GM-S2H
      Modules linked in: vmw_vsock_vmci_transport vmw_vmci vsock fuse bnep dlci bridge 8021q garp stp mrp binfmt_misc l2tp_ppp l2tp_core rfcomm s
      Pid: 24652, comm: trinity-child2 Not tainted 3.8.0+ #65
      Call Trace:
       [<ffffffff81044155>] warn_slowpath_common+0x75/0xa0
       [<ffffffff8104419a>] warn_slowpath_null+0x1a/0x20
       [<ffffffff811444ad>] __alloc_pages_nodemask+0xa0d/0xbe0
       [<ffffffff8100a196>] ? native_sched_clock+0x26/0x90
       [<ffffffff810b2128>] ? trace_hardirqs_off_caller+0x28/0xc0
       [<ffffffff810b21cd>] ? trace_hardirqs_off+0xd/0x10
       [<ffffffff811861f8>] alloc_pages_current+0xb8/0x180
       [<ffffffff8113eaaa>] __get_free_pages+0x2a/0x80
       [<ffffffff811934fe>] kmalloc_order_trace+0x3e/0x1a0
       [<ffffffff81193955>] __kmalloc+0x2f5/0x3a0
       [<ffffffff8104df0c>] ? local_bh_enable_ip+0x7c/0xf0
       [<ffffffffa0401ab3>] rds_message_alloc+0x23/0xb0 [rds]
       [<ffffffffa04043a1>] rds_sendmsg+0x2b1/0x990 [rds]
       [<ffffffff810b21cd>] ? trace_hardirqs_off+0xd/0x10
       [<ffffffff81564620>] sock_sendmsg+0xb0/0xe0
       [<ffffffff810b2052>] ? get_lock_stats+0x22/0x70
       [<ffffffff810b24be>] ? put_lock_stats.isra.23+0xe/0x40
       [<ffffffff81567f30>] sys_sendto+0x130/0x180
       [<ffffffff810b872d>] ? trace_hardirqs_on+0xd/0x10
       [<ffffffff816c547b>] ? _raw_spin_unlock_irq+0x3b/0x60
       [<ffffffff816cd767>] ? sysret_check+0x1b/0x56
       [<ffffffff810b8695>] ? trace_hardirqs_on_caller+0x115/0x1a0
       [<ffffffff81341d8e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
       [<ffffffff816cd742>] system_call_fastpath+0x16/0x1b
      ---[ end trace eed6ae990d018c8b ]---
      Reported-by: Dave Jones <davej@redhat.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
      Signed-off-by: Cong Wang <amwang@redhat.com>
      Acked-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
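The bug reported in commit ece6b0a2 comes down to trusting a length supplied by userspace. A userspace sketch of the general remedy, rejecting oversized requests before allocating, might look as follows. `MAX_ALLOC_SIZE`, `struct msg_hdr` and `msg_alloc()` are hypothetical stand-ins chosen for this example; they are not the kernel's KMALLOC_MAX_SIZE, struct rds_message or rds_message_alloc().

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical cap standing in for the allocator's maximum request size. */
#define MAX_ALLOC_SIZE ((size_t)1 << 22)

/* Hypothetical message header followed by extra_len bytes of payload. */
struct msg_hdr {
    size_t extra_len;
};

/*
 * Allocate a header plus extra_len bytes, refusing requests above the cap.
 * The comparison subtracts on the constant side so that
 * sizeof(struct msg_hdr) + extra_len can never overflow.
 */
static struct msg_hdr *msg_alloc(size_t extra_len)
{
    if (extra_len > MAX_ALLOC_SIZE - sizeof(struct msg_hdr))
        return NULL;

    struct msg_hdr *m = calloc(1, sizeof(*m) + extra_len);
    if (m)
        m->extra_len = extra_len;
    return m;
}
```

A caller that passes a mangled length such as `(size_t)-1` gets NULL back immediately instead of triggering an allocation attempt that cannot succeed.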
  4. 03 Mar, 2013 1 commit