- 25 Jun, 2014 2 commits
-
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
-
- 24 Jun, 2014 7 commits
-
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
Required a bit of refactoring in terms of how parameter specifiers are passed around.
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
Well not sure if it's a deadlock; I think we try to re-entrantly acquire the codegen lock, which then confuses it later?
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
GC: Register None
-
- 23 Jun, 2014 5 commits
-
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
Implement chained comparisons
-
https://github.com/xiafan68/pyston
Kevin Modzelewski authored
Merges #88
Conflicts: src/core/threading.cpp
-
Kevin Modzelewski authored
Not sure why greg_t got changed from 'long' to 'long long' in newer versions of glibc; just cast it to be compatible
-
Kevin Modzelewski authored
Build fix
-
- 20 Jun, 2014 1 commit
-
-
Vinzenz Feenstra authored
Signed-off-by: Vinzenz Feenstra <evilissimo@gmail.com>
-
- 19 Jun, 2014 3 commits
-
-
Marius Wachtler authored
-
Marius Wachtler authored
-
Kevin Modzelewski authored
Implementation is pretty straightforward for now:
- find all names that get accessed from a nested function
- if any, create a closure object at function entry
- any time we set a name accessed from a nested function, update its value in the closure
- when evaluating a functiondef that needs a closure, attach the created closure to the created function object.

Closures are currently passed as an extra argument before any python-level args, which I'm not convinced is the right strategy. It works out fine but it feels messy to say that functions can have different C-level calling conventions. It felt worse to include the closure as part of the python-level arg passing. Maybe it should be passed after all the other arguments?

Closures are currently just simple objects, on which we set and get Python-level attributes. The performance (which I haven't tested) relies on attribute access being made fast through the hidden-class inline caches. There are a number of ways that this could be improved:
- be smarter about when we create the closure object, or when we update it.
- not create empty pass-through closures
- give the closures a pre-defined shape, since we know at irgen-time what names can get set. This could probably avoid the inline cache machinery and also produce better code.
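The scheme described above can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not Pyston's actual code: `Closure`, `FunctionObject`, `CompiledFn`, `add_compiled` and `make_adder` are hypothetical names, values are plain `long`s, and a `std::map` stands in for the hidden-class attribute storage.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a closure object: a simple bag of named attributes.
struct Closure {
    std::map<std::string, long> attrs;
};

// Models the C-level calling convention described in the commit message:
// the closure is passed as an extra argument before the Python-level args.
using CompiledFn = long (*)(Closure* closure, long arg);

// Hypothetical function object; the closure is attached when the
// functiondef is evaluated.
struct FunctionObject {
    CompiledFn code;
    std::shared_ptr<Closure> closure;
    long call(long arg) const { return code(closure.get(), arg); }
};

// Compiled form of the nested function:  def add(x): return x + n
static long add_compiled(Closure* closure, long x) {
    assert(closure != nullptr);
    return x + closure->attrs.at("n");  // n is read through the closure
}

// Compiled form of the outer function:
//   def make_adder(n):
//       def add(x): return x + n
//       n = n + 1
//       return add
FunctionObject make_adder(long n) {
    // "n" is accessed from a nested function, so create a closure at entry.
    auto closure = std::make_shared<Closure>();
    closure->attrs["n"] = n;
    // Evaluating the functiondef attaches the closure to the function object.
    FunctionObject add{add_compiled, closure};
    // Every later set of "n" also updates its value in the closure.
    n = n + 1;
    closure->attrs["n"] = n;
    return add;
}
```

Calling `make_adder(10).call(5)` reads `n` through the closure, so the nested function sees the value assigned after the functiondef was evaluated.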
-
- 18 Jun, 2014 5 commits
-
-
xiafan_linux authored
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
Enable the VTune JIT support in llvm, and add it as a jit listener. I think it's mostly confirming my suspicion that the slowdown is cache-related... it's not being very helpful with determining why (it's in some function that it can't analyze). I updated the memory allocator to have strong thread-affinity (i.e. a thread now generally gets back memory that it had previously freed), but that doesn't seem to have any effect. Going to punt on further investigations for now, pretty happy though that there's an overall speedup with the GRWL, even if there are still issues.
-
Kevin Modzelewski authored
Turns out a large amount of thread contention was coming from these shared counters -- disable some of them and add some thread-local caching
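One common way to add thread-local caching to a shared counter is to accumulate increments locally and publish them in batches. The sketch below assumes that approach; `LocalCounter`, `count_in_threads` and the flush threshold are hypothetical and are not taken from the Pyston source.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical per-thread front end for a shared counter. Increments are
// accumulated locally and only published to the shared atomic in batches,
// so threads rarely touch the contended cache line.
class LocalCounter {
public:
    explicit LocalCounter(std::atomic<std::int64_t>& shared,
                          std::int64_t flush_threshold = 1024)
        : shared_(shared), threshold_(flush_threshold) {
        assert(flush_threshold > 0);
    }

    // Publish whatever is still pending when the thread is done.
    ~LocalCounter() { flush(); }

    void add(std::int64_t n = 1) {
        pending_ += n;
        if (pending_ >= threshold_)
            flush();
    }

    void flush() {
        if (pending_ != 0) {
            shared_.fetch_add(pending_, std::memory_order_relaxed);
            pending_ = 0;
        }
    }

private:
    std::atomic<std::int64_t>& shared_;
    std::int64_t threshold_;
    std::int64_t pending_ = 0;
};

// Runs num_threads workers that each bump the counter increments_per_thread
// times through their own LocalCounter, then returns the shared total.
std::int64_t count_in_threads(int num_threads, int increments_per_thread) {
    std::atomic<std::int64_t> total{0};
    std::vector<std::thread> threads;
    for (int i = 0; i < num_threads; ++i) {
        threads.emplace_back([&total, increments_per_thread] {
            LocalCounter local(total);
            for (int j = 0; j < increments_per_thread; ++j)
                local.add();
        });
    }
    for (auto& t : threads)
        t.join();
    return total.load();
}
```

The trade-off is that the shared value lags behind by up to the threshold per thread until each thread flushes, which is usually acceptable for statistics counters.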
-
- 17 Jun, 2014 9 commits
-
-
Kevin Modzelewski authored
Insert full blocks back at the end of the free list to hopefully reduce the number of times we have to check them
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
Now, threads will claim an entire block at a time, and put it in a thread-local cache. In the common case, they can allocate out of this block, and only need to take out a lock if they run out.
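A minimal sketch of the block-claiming idea, with assumptions labeled: `GlobalHeap`, `ThreadCache`, the block size and the fixed object size are all hypothetical, and freeing and garbage collection are left out. In a real allocator each thread would own one `ThreadCache`, for example through a `thread_local` variable.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <vector>

constexpr std::size_t kBlockSize = 4096;  // assumed block size
constexpr std::size_t kObjectSize = 64;   // assumed fixed object size

struct Block {
    alignas(std::max_align_t) unsigned char data[kBlockSize];
    std::size_t next_free = 0;  // bump pointer within the block
};

// Shared heap: handing out a whole block is the only operation that locks.
class GlobalHeap {
public:
    Block* claimBlock() {
        std::lock_guard<std::mutex> guard(lock_);
        blocks_.push_back(std::unique_ptr<Block>(new Block()));
        ++claims_;
        return blocks_.back().get();
    }

    std::size_t claims() const {
        std::lock_guard<std::mutex> guard(lock_);
        return claims_;
    }

private:
    mutable std::mutex lock_;
    std::vector<std::unique_ptr<Block>> blocks_;
    std::size_t claims_ = 0;
};

// Per-thread cache: the common case allocates out of the claimed block
// without taking any lock.
class ThreadCache {
public:
    explicit ThreadCache(GlobalHeap& heap) : heap_(heap) {}

    void* allocate() {
        if (current_ == nullptr || current_->next_free + kObjectSize > kBlockSize)
            current_ = heap_.claimBlock();  // slow path: block exhausted
        assert(current_ != nullptr);
        void* result = current_->data + current_->next_free;
        current_->next_free += kObjectSize;
        return result;
    }

private:
    GlobalHeap& heap_;
    Block* current_ = nullptr;
};
```

With these sizes a thread takes the lock once per 64 allocations instead of once per allocation.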
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
Trying to add a generic PerThread class has involved a journey into the wonderful world of template programming, including C++11 variadic templates.
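For illustration, here is one way a generic per-thread container could look. It is a sketch under assumptions, not the class from this commit: the name `PerThread` comes from the message, but the interface, the `std::map` keyed by thread id, and the example `Counter` type are hypothetical. It also uses `std::index_sequence`, which is C++14 rather than C++11.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <mutex>
#include <thread>
#include <tuple>
#include <utility>

// Holds one lazily constructed T per thread. The constructor arguments are
// captured once in a tuple and replayed for every thread that calls get().
template <typename T, typename... CtorArgs>
class PerThread {
public:
    explicit PerThread(CtorArgs... args) : ctor_args_(std::move(args)...) {}

    // Returns the calling thread's instance, creating it on first use.
    T& get() {
        std::lock_guard<std::mutex> guard(lock_);
        std::unique_ptr<T>& slot = instances_[std::this_thread::get_id()];
        if (!slot)
            slot.reset(construct(std::index_sequence_for<CtorArgs...>{}));
        assert(slot);
        return *slot;
    }

    // Visits every thread's instance, e.g. to aggregate statistics.
    template <typename Fn>
    void forEach(Fn fn) {
        std::lock_guard<std::mutex> guard(lock_);
        for (auto& entry : instances_)
            fn(*entry.second);
    }

private:
    // Expands the stored tuple back into a constructor call.
    template <std::size_t... I>
    T* construct(std::index_sequence<I...>) {
        return new T(std::get<I>(ctor_args_)...);
    }

    std::tuple<CtorArgs...> ctor_args_;
    std::mutex lock_;
    std::map<std::thread::id, std::unique_ptr<T>> instances_;
};

// Hypothetical payload type used only to demonstrate the template.
struct Counter {
    int value;
    explicit Counter(int start) : value(start) {}
};
```

Each thread that calls `get()` on a `PerThread<Counter, int>` receives its own `Counter`, so updates made on one thread are not visible through `get()` on another.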
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
Make rules more general, and generate rules that can't be patterns.
Made it easy to add a pyston_grwl target.
-
Kevin Modzelewski authored
Add some basic locking to:
- code generation (one lock for all of it)
- garbage collection (spin lock for allocations, global serialization for collections)
- lists (mutex per list object)

Can run the GRWL on some simple tests (microbenchmarks/thread_contention.py and thread_uncontended.py).
Performance is not great yet.
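The "mutex per list object" item can be illustrated with a small sketch. `LockedList` and `appendFromThreads` are hypothetical names and the element type is simplified to `long`; this shows the general technique, not the actual list implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical list object that carries its own mutex, so operations on
// different lists never contend with each other.
class LockedList {
public:
    void append(long value) {
        std::lock_guard<std::mutex> guard(lock_);
        elts_.push_back(value);
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> guard(lock_);
        return elts_.size();
    }

private:
    mutable std::mutex lock_;  // one mutex per list object
    std::vector<long> elts_;
};

// Appends to the same list from several threads and returns the final size.
std::size_t appendFromThreads(LockedList& list, int num_threads,
                              int appends_per_thread) {
    assert(num_threads >= 0 && appends_per_thread >= 0);
    std::vector<std::thread> threads;
    for (int i = 0; i < num_threads; ++i) {
        threads.emplace_back([&list, appends_per_thread] {
            for (int j = 0; j < appends_per_thread; ++j)
                list.append(j);
        });
    }
    for (auto& t : threads)
        t.join();
    return list.size();
}
```

Without the per-object lock, concurrent `push_back` calls on the same vector would be a data race.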
-
- 11 Jun, 2014 4 commits
-
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
And then use that register info rather than sending a signal to the thread; this lets the thread that called AllowThreads avoid receiving signals, e.g. during a syscall. I'm not sure if this is valid though; are we really guaranteed that the thread can't invalidate the saved state?
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
Once the GRWL is added, will also be for GC safepoints.
-
- 10 Jun, 2014 4 commits
-
-
Kevin Modzelewski authored
We always want to crawl the entire stack, and it's possible to determine the extents of the stack, so just do a scan over the entire memory range. Also, change the way the interpreter keeps track of its roots; we don't really need to associate the roots with a specific interpreter frame. This should hopefully clear up the weirdness about libunwind trying to unwind through the pthreads assembly code, and potentially also make stack crawling faster.
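Scanning a whole memory range for roots amounts to treating every word as a potential pointer. The following is a minimal sketch of that conservative scan, run over an ordinary array rather than a real thread stack; `HeapRange` and `scanRange` are hypothetical names, and determining the actual stack extents is platform-specific and not shown.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of the address range managed by the collector.
struct HeapRange {
    std::uintptr_t start;
    std::uintptr_t end;  // exclusive
    bool contains(std::uintptr_t p) const { return p >= start && p < end; }
};

// Conservative scan: every word in [begin, end) whose value falls inside the
// heap is reported as a possible root. No frame information is needed.
std::vector<std::uintptr_t> scanRange(const std::uintptr_t* begin,
                                      const std::uintptr_t* end,
                                      const HeapRange& heap) {
    assert(begin <= end);
    std::vector<std::uintptr_t> roots;
    for (const std::uintptr_t* p = begin; p < end; ++p) {
        if (heap.contains(*p))
            roots.push_back(*p);
    }
    return roots;
}
```

Because the scan never interprets frames, it does not depend on an unwinder being able to walk through code that lacks unwind information.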
-
Kevin Modzelewski authored
I think threading now "works", i.e. doesn't crash Pyston, though we don't release the GIL until the thread exits.
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
-