- 09 May, 2013 2 commits
-
-
Vineet Gupta authored
Nothing semantical * simplify the alignement code by using & operation only * rename variables clearly as paddr Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Vineet Gupta authored
vaddr used to index the cache was clipped from the wrong end, and thus would potentially fail to flush the correct lines. The problem was dorment for so long because up until the recent optimizations it was only used for ptrace break-point only flushes. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
- 07 May, 2013 22 commits
-
-
Vineet Gupta authored
flush_dcache_page( ) is MM hook to ensure that a page has consistent views between kernel and userspace. Thus it is called when * kernel writes to a page which at some later point could get mapped to userspace (so kernel mapping needs to be flushed-n-inv) * kernel is about to read from a page with possible userspace mappings (so userspace mappings needs to be made coherent with kernel ones) However for Non aliasing VIPT dcache, any userspace mapping will always be congruent to kernel mapping. Thus d-cache need need not be flushed at all (or delayed indefinitely). The only reason it does need to be flushed is when mapping code pages. Since icache doesn't snoop dcache, those dirty dcache lines need to be written back to memory and icache line invalidated so that icache lines fetch will get the right data. Decent gains on LMBench fork/exec/sh and File I/O micro-benchmarks. (1) FPGA @ 80 MHZ Processor, Processes - times in microseconds - smaller is better ------------------------------------------------------------------------------ Host OS Mhz null null open slct sig sig fork exec sh call I/O stat clos TCP inst hndl proc proc proc --------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 3.9-rc6-a Linux 3.9.0-r 80 4.79 8.72 66.7 116. 239. 8.39 30.4 4798 14.K 34.K 3.9-rc6-b Linux 3.9.0-r 80 4.79 8.62 65.4 111. 239. 8.35 29.0 3995 12.K 30.K 3.9-rc7-c Linux 3.9.0-r 80 4.79 9.00 66.1 106. 239. 8.61 30.4 2858 10.K 24.K ^^^^ ^^^^ ^^^ File & VM system latencies in microseconds - smaller is better ------------------------------------------------------------------------------- Host OS 0K File 10K File Mmap Prot Page 100fd Create Delete Create Delete Latency Fault Fault selct --------- ------------- ------ ------ ------ ------ ------- ----- ------- ----- 3.9-rc6-a Linux 3.9.0-r 317.8 204.2 1122.3 375.1 3522.0 4.288 20.7 126.8 3.9-rc6-b Linux 3.9.0-r 298.7 223.0 1141.6 367.8 3531.0 4.866 20.9 126.4 3.9-rc7-c Linux 3.9.0-r 278.4 179.2 862.1 339.3 3705.0 3.223 20.3 126.6 ^^^^^ ^^^^^ ^^^^^ ^^^^ (2) Customer Silicon @ 500 MHz (166 MHz mem) ------------------------------------------------------------------------------ Host OS Mhz null null open slct sig sig fork exec sh call I/O stat clos TCP inst hndl proc proc proc --------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- abilis-ba Linux 3.9.0-r 497 0.71 1.38 4.58 12.0 35.5 1.40 3.89 2070 5525 13.K abilis-ca Linux 3.9.0-r 497 0.71 1.40 4.61 11.8 35.6 1.37 3.92 1411 4317 10.K ^^^^ ^^^^ ^^^ Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Vineet Gupta authored
start address is already page aligned and size is const PAGE_SIZE, thus fixups for alignment not needed in generated code. bloat-o-meter vmlinux-mm5 vmlinux add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-32 (-32) function old new delta __inv_icache_page 82 50 -32 Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Vineet Gupta authored
No users of this code anymore - so RIP ! Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Vineet Gupta authored
Now that we have same helper used for all icache invalidates (i.e. vaddr+paddr based exact line invalidate), consolidate the open coded calls into one place. Also rename flush_icache_range_vaddr => __sync_icache_dcache Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Vineet Gupta authored
This change continues the theme from prev commit - this time icache handling for kernel's own code modification (vmalloc: loadable modules, breakpoints for kprobes/kgdb...) flush_icache_range() calls the CDU icache helper with vaddr to enable exact line invalidate. For a true kernel-virtual mapping, the vaddr is actually virtual hence valid as index into cache. For kprobes breakpoint however, the vaddr arg is actually paddr - since that's how normal kernel is mapped in ARC memory map. This implies that CDU will use the same addr for indexing as for tag match - which is fine since kernel code would only have that "implicit" mapping and none other. This should speed up module loading significantly - specially on default ARC700 icache configurations (32k) which alias. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Vineet Gupta authored
ARC icache doesn't snoop dcache thus executable pages need to be made coherent before mapping into userspace in flush_icache_page(). However ARC700 CDU (hardware cache flush module) requires both vaddr (index in cache) as well as paddr (tag match) to correctly identify a line in the VIPT cache. A typical ARC700 SoC has aliasing icache, thus the paddr only based flush_icache_page() API couldn't be implemented efficiently. It had to loop thru all possible alias indexes and perform the invalidate operation (ofcourse the cache op would only succeed at the index(es) where tag matches - typically only 1, but the cost of visiting all the cache-bins needs to paid nevertheless). Turns out however that the vaddr (along with paddr) is available in update_mmu_cache() hence better suits ARC icache flush semantics. With both vaddr+paddr, exactly one flush operation per line is done. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Vineet Gupta authored
munmap ends up calling tlb_flush() which for ARC was flushing the entire TLB unconditionally (by moving the MMU to a new ASID) do_munmap unmap_region unmap_vmas unmap_single_vma unmap_page_range tlb_start_vma zap_pud_range tlb_end_vma() tlb_finish_mmu tlb_flush() ---> unconditional flush_tlb_mm() So even a single page munmap, a frequent operation when uClibc dynamic linker (ldso) is loading the dependent shared libraries, would move the the ASID multiple times - needlessly invalidating the pre-faulted TLB entries (and increasing the rate of ASID wraparound + full TLB flush). This is now optimised to only be called if tlb->full_mm (which means for exit/execve) cases only. And for those cases, flush_tlb_mm() is already optimised to be a no-op for mm->mm_users == 0. So essentially there are no mmore full mm flushes - except for fork which anyhow needs it for properly COW'ing parent address space. munmap now needs to do TLB range flush, which is implemented with tlb_end_vma() Results ------- 1. ASID now consistenly moves by 4 during a simple ls (as opposed to 5 or 7 before). 2. LMBench microbenchmark also shows improvements Basic system parameters ------------------------------------------------------------------------------ Host OS Description Mhz tlb cache mem scal pages line par load bytes --------- ------------- ----------------------- ---- ----- ----- ------ ---- 3.9-rc5-0 Linux 3.9.0-r 3.9-rc5-0404-gcc-4.4-ba 80 8 64 1.1000 1 3.9-rc5-0 Linux 3.9.0-r 3.9-rc5-0405-avoid-full 80 8 64 1.1200 1 Processor, Processes - times in microseconds - smaller is better ------------------------------------------------------------------------------ Host OS Mhz null null open slct sig sig fork exec sh call I/O stat clos TCP inst hndl proc proc proc --------- ------------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 3.9-rc5-0 Linux 3.9.0-r 80 4.81 8.69 68.6 118. 239. 8.53 31.6 4839 13.K 34.K 3.9-rc5-0 Linux 3.9.0-r 80 4.46 8.36 53.8 91.3 223. 8.12 24.2 4725 13.K 33.K File & VM system latencies in microseconds - smaller is better ------------------------------------------------------------------------------- Host OS 0K File 10K File Mmap Prot Page 100fd Create Delete Create Delete Latency Fault Fault selct --------- ------------- ------ ------ ------ ------ ------- ----- ------- ----- 3.9-rc5-0 Linux 3.9.0-r 314.7 223.2 1054.9 390.2 3615.0 1.590 20.1 126.6 3.9-rc5-0 Linux 3.9.0-r 265.8 183.8 1014.2 314.1 3193.0 6.910 18.8 110.4 Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Mischa Jonker authored
This adds support for an ARC Virtual Platform. This platform is based on the System C standard promoted by the OSCI (Open System C Initiative) and uses nSIM to simulate the ARC CPU core itself. Users can build a virtual SoC by combining System C models of peripherals and CPU cores. Signed-off-by: Mischa Jonker <mjonker@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Christian Ruppert authored
The original device tree was written using a slightly different implementation of the fixed-factor-clock device tree binding. The compatible string must be modified in order to be compatible with the new implementation. Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Christian Ruppert authored
Infrastructure required to make the Linux kernel compile and boot on the Abilis Systems TB10x series of SOCs based on ARC700 CPUs: - Kmake related files (Kconfig, Makefile, tb10x_defconfig) - TB10x platform initialisation Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com> Signed-off-by: Pierrick Hascoet <pierrick.hascoet@abilis.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Christian Ruppert authored
These are the device tree files for the Abilis Systems TB100 and TB101 ICs and their respective development kit PCBs. These files are committed in preparation of the following patch set which adds support for these chips to the ARC platform. Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com> Signed-off-by: Pierrick Hascoet <pierrick.hascoet@abilis.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Christian Ruppert authored
This patch adds some room for CPU-external interrupt controllers in the Linux interrupt space. Until now, only the 32 CPU internal interrupt lines were supported which does not allow for external interrupt controllers such as GPIO modules etc. Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com> Signed-off-by: Pierrick Hascoet <pierrick.hascoet@abilis.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Vineet Gupta authored
arc-intc is initialized in arc common code as it is applicable to all platforms. However platforms with their own external intc still need to refer to it for correct DT interrupt tree hierarchy setup, e.g. static struct of_device_id __initdata tb10x_irq_ids[] = { { .compatible = "snps,arc700-intc", .data = dummy_init_irq }, { .compatible = "abilis,tb10x_ictl", .data = tb10x_init_irq }, {}, }; The fix is to use the generic irqchip framework to tie all irqchips in a special linker section and then call irqchip_init() which calls the DT of_irq_init() for all the intc in one go. That way the platform code need not be aware of arc-intc at all. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Vineet Gupta authored
The existing code was wrong on several counts: * uboot provided bootargs were copied into @boot_command_line, only to be over-written by setup_machine_fdt(), effectively lost * @cmdline_p returned by setup_arch() to start_kernel() didn't include the DT /bootargs Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Vineet Gupta authored
Given that DeviceTree /bootargs can provide similar functionality, no point in providing duplicate infrastructure. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Vineet Gupta authored
* Allow initramfs path to be symlink * CONFIG_PREEMPT be default Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Vineet Gupta authored
The fixup code correctly updates the callee-regs on stack, but fails to unwind it into actual register file. Thus userspace won't see the update. Reported-by: Noam Camus <noamc@ezchip.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Vineet Gupta authored
If CONFIG_ARC_MISALIGN_ACCESS is not enabled, or if the fixup fails, call the same error handler: same signal/si_code to user (SIGBUS) Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Vineet Gupta authored
* Remove the line-break between scratch/callee-regs (sneaked in when we converted from printk to pr_* * Use %pS to print the symbol names of faulting PC (ret pseudo register) and BLINK (call return register) * Don't print user-vma for a kernel crash (only do it for print-fatal-signals based regfile dump) * Verbose print the Interrupt/Exception Enable/Active state * for main executable link address is 0x10000 based (vs. 0) thus offset of faulting PC needs to be adjusted Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Noam Camus authored
Signed-off-by: Noam Camus <noamc@ezchip.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Noam Camus authored
Signed-off-by: Noam Camus <noamc@ezchip.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Alexander Shiyan authored
This tracks mainline commit ae903caa "Bury the conditionals from kernel_thread/kernel_execve series" which we missed out as ARC port was not yet mainline. [vgupta: commit log modified] Signed-off-by: Alexander Shiyan <shc_work@mail.ru> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
- 17 Apr, 2013 1 commit
-
-
Vineet Gupta authored
Currently, for every ARC kernel build I see the following: --------------->8----------------- DTB arch/arc/boot/dts/angel4.dtb.S AS arch/arc/boot/dts/angel4.dtb.o LD arch/arc/boot/dts/built-in.o rm arch/arc/boot/dts/angel4.dtb.S <-- forces rebuild next iter CHK kernel/config_data.h --------------->8----------------- This is because *.dts.S is intermediate file in dtb generation and is by default deleted by make which needs a ".SECONDARY" hint to NOT do so. This could have ideally been done in scripts/Makefile.lib - for benefit of all, however .SECONDARY doesn't seem to work with wildcards. Thanks to Stephen for suggesting .SECONDARY (vs .PRECIOUS) and making that work using a non wildcard version in arch makefile. Thanks to James Hogan for pointing out that *.dtb.S now needs to be added to clean-files Signed-off-by: Stephen Warren <swarren@nvidia.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
- 09 Apr, 2013 12 commits
-
-
Vineet Gupta authored
commit "fs: make binfmt support for #! scripts modular and removable" made support for #!scripts optional - thus need to include the Kconfig file to get all relevant BINFMT_* Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Vineet Gupta authored
Currently ARC_HAS_LLSC can be influenced by platform for SMP only using ARC_HAS_COH_LLSC. For !SMP it defaults to "y". It turns out that some customers can't support it all, even in UP. So we change the semantics, and use a negative dependency ARC_CANT_LLSC. Any platform (independent of SMP or !SMP) can select it to disable LLSC. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Vineet Gupta authored
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Christian Ruppert authored
arch/arc/kernel/traps.c no longer compiles without explicitly including asm/kprobes.h Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Vineet Gupta authored
The existing uImage target always generates gzip compressed image which drags bootup for some very slow FPGA customer boards. So introduce seperate make targets:uImage.{bin,gz} with uncompressed being default. Also tie gz generation to CONFIG_KERNEL_GZIP, which a platform can select in it's Kconfig if it wishes gz to be default. Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Vineet Gupta authored
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Vineet Gupta authored
We do cross compiles for ARC Linux. With gcc 4.7, a make defconfig spews out the following: ------------------->8-------------------------- make ARCH=arc defconfig gcc: error: unrecognized command line option '-marc600' gcc: error: unrecognized command line option '-mA7' gcc: error: unrecognized command line option '-mno-sdata' gcc: error: unrecognized command line option '-mno-mpy' *** Default configuration is based on 'fpga_defconfig' ------------------->8-------------------------- This apparently is coming from LIBGCC line - which is strange to be invoked for defconfig generation. Reported-by: Alexey Brodkin <abrodkin@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Paul Bolle authored
There's no (Kconfig) macro CONFIG_BLOCK_DEV_RAM. (CONFIG_BLK_DEV_RAM does exist though.) But linux/blk.h got killed in 2005 anyway (in a patch titled "kill blk.h"), so these three lines can be removed. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Sachin Kamat authored
Some header files were included twice in the same file. Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Sachin Kamat authored
Fixes the following coding style issues as detected by checkpatch: ERROR: space required before the open parenthesis '(' ERROR: "foo * bar" should be "foo *bar" WARNING: space prohibited between function name and open parenthesis '(' WARNING: please, no spaces at the start of a line Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Sachin Kamat authored
Silences the following checkpatch warnings: WARNING: Use #include <linux/ptrace.h> instead of <asm/ptrace.h> WARNING: Use #include <linux/kprobes.h> instead of <asm/kprobes.h> WARNING: Use #include <linux/kgdb.h> instead of <asm/kgdb.h> WARNING: Use #include <linux/uaccess.h> instead of <asm/uaccess.h> WARNING: Use #include <linux/cache.h> instead of <asm/cache.h> Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
Sachin Kamat authored
version.h header file inclusion is not necessary as detected by versioncheck script. Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
-
- 08 Apr, 2013 3 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/sfr/next-fixesLinus Torvalds authored
Pull powerpc bugfix from Stephen Rothwell: "A single BUG_ON fix for a condition that could happen for machines with certain hardware installed." * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sfr/next-fixes: powerpc: pSeries_lpar_hpte_remove fails from Adjunct partition being performed before the ANDCOND test
-
Christian Ruppert authored
ARC irqsave/restore macros were missing the compiler barrier, causing a stale load in irq-enabled region be used in irq-safe region, despite being changed, because the register holding the value was still live. The problem manifested as random crashes in timer code when stress testing ARCLinux (3.9-rc3) on a !SMP && !PREEMPT_COUNT Here's the exact sequence which caused this: (0). tv1[x] <----> t1 <---> t2 (1). mod_timer(t1) interrupted after it calls timer_pending() (2). mod_timer(t2) completes (3). mod_timer(t1) resumes but messes up the list (4). __runt_timers( ) uses bogus timer_list entry / crashes in timer->function Essentially mod_timer() was racing against itself and while the spinlock serialized the tv1[] timer link list, timer_pending() called outside the spinlock, cached timer link list element in a register. With low register pressure (and a deep register file), lack of barrier in raw_local_irqsave() as well as preempt_disable (!PREEMPT_COUNT version), there was nothing to force gcc to reload across the spinlock, causing a stale value in reg be used for link list manipulation - ensuing a corruption. ARcompact disassembly which shows the culprit generated code: mod_timer: push_s blink mov_s r13,r0 # timer, timer .. ###### timer_pending( ) ld_s r3,[r13] # <------ <variable>.entry.next LOADED brne r3, 0, @.L163 .L163: .. ###### spin_lock_irq( ) lr r5, [status32] # flags bic r4, r5, 6 # temp, flags, and.f 0, r5, 6 # flags, flag.nz r4 ###### detach_if_pending( ) begins tst_s r3,r3 <-------------- # timer_pending( ) checks timer->entry.next # r3 is NOT reloaded by gcc, using stale value beq.d @.L169 mov.eq r0,0 ##### detach_timer( ): __list_del( ) ld r4,[r13,4] # <variable>.entry.prev, D.31439 st r4,[r3,4] # <variable>.prev, D.31439 st r3,[r4] # <variable>.next, D.30246 We initially tried to fix this by adding barrier() to preempt_* macros for !PREEMPT_COUNT but Linus clarified that it was anything but wrong. http://www.spinics.net/lists/kernel/msg1512709.html [vgupta: updated commitlog] Reported-by/Signed-off-by: Christian Ruppert <christian.ruppert@abilis.com> Cc: Christian Ruppert <christian.ruppert@abilis.com> Cc: Pierrick Hascoet <pierrick.hascoet@abilis.com> Debugged-by/Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-devLinus Torvalds authored
Pull libata fixes from Jeff Garzik: "The HDIO_DRIVE_* fix is really the biggie. 1) Fix ATAPI regression, noticed mainly on tape drives, due to a commit which mistakenly changed an 'int' return type to a 'bool'. Broken by commit 4dce8ba9 ("libata: Use 'bool' return value for ata_id_XXX") 2) Add Slimtype DVD A DS8A8SH ATAPI quirk 3) ata_piix: Intel Haswell platform quirk 4) Avoid DMA'ing to stack buffer, when obtaining DEVSLP timings. IMO a mild regression, given that libata previously did not DMA to a stack buffer. Broken by commit commit 803739d2 ("[libata] replace sata_settings with devslp_timing") 5) Fix regression impacting SMART and smartd, broken by commit 84a9a8cd ("[libata] Set proper SK when CK_COND is set")" * tag 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: [libata] Fix HDIO_DRIVE_* ioctl() Linux 3.9 regression libata: fix DMA to stack in reading devslp_timing parameters ata_piix: Fix DVD not dectected at some Haswell platforms libata: Set max sector to 65535 for Slimtype DVD A DS8A8SH drive libata: Use integer return value for atapi_command_packet_set
-