- 27 Jul, 2015 7 commits
-
-
Kevin Modzelewski authored
optimize some misc runtime functions
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
callable(), str(), repr(), PySequence_GetItem(), and PyObject_HasAttrString(), mostly by bringing in the CPython versions.
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
They are tricky since these are types, which means they invoke the relatively complicated constructor logic. For example, str() doesn't just call __str__ on the argument: if the result is a subclass of str, it also calls result.__init__(). The same applies to unicode, except that unicode is even trickier since it takes more arguments, one of which is "encoding", which has non-type-based dynamic behavior. I didn't realize that at first and optimized unicode() by exposing an inner version of it that takes its arguments in registers, which we can take advantage of using our jit-arg-rearrangement capability. This means we have to do parts of PyArg_ParseTuple ourselves, so I added a PyArg_ParseSingle that runs a single object through the arg-conversion code. PyArg_ParseSingle could be optimized further if we want: with functions of the form PyArg_ParseSingle_s (corresponding to the "s" format code) we could skip some more of the overhead. I had to disable most of that once I realized the encoding issue, but I left it in, since hopefully we will be able to use it again once the rewriter supports something like "do some guards after mutations if we know how to resume after a failed guard".
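The "non-type-based dynamic behavior" of the encoding argument can be observed from plain Python. The following is a small illustration, assuming Python 3, where str(bytes, encoding) stands in for the Python 2 unicode(str, encoding) call that this commit is about. Both calls pass arguments of identical types, yet the result depends on the value of the encoding string, so a guard that only checks argument types would not be enough.

```python
# Assumption: Python 3, where str(bytes, encoding) plays the role of
# Python 2's unicode(str, encoding).
data = b"\xc3\xa9"  # the UTF-8 encoding of U+00E9

as_utf8 = str(data, "utf-8")      # decoded as a single character
as_latin1 = str(data, "latin-1")  # decoded as two characters

# Same argument types (bytes, str) in both calls, different results:
# the behavior is selected by the *value* of the encoding argument.
print(len(as_utf8), len(as_latin1))
```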
-
Kevin Modzelewski authored
get cpython/test_tuple.py to pass
-
Kevin Modzelewski authored
Get test_iter.py to work
-
- 24 Jul, 2015 12 commits
-
-
Kevin Modzelewski authored
Min max
-
Kevin Modzelewski authored
assign fixed slots (vregs) to the symbols.
-
Kevin Modzelewski authored
Use CPython's lock implementation
-
Marius Wachtler authored
This switches the thread lock implementation to use a semaphore instead of a mutex. I hope this gets rid of the threading_local.py error on travis-ci.
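As a rough illustration of a lock built on a semaphore rather than a mutex, here is a minimal Python sketch. SemaphoreLock is a hypothetical name used only for this example; it is not the C implementation the commit brings in. One practical difference it shows is that a semaphore has no owning thread, so a different thread may release it, which matches the documented behavior of Python's threading.Lock.

```python
import threading


class SemaphoreLock:
    """Hypothetical lock built on a binary semaphore (illustration only)."""

    def __init__(self):
        # One permit available means "unlocked".
        self._sem = threading.Semaphore(1)

    def acquire(self, blocking=True):
        # Returns True if the permit was taken, False otherwise.
        return self._sem.acquire(blocking)

    def release(self):
        # No ownership check: any thread may give the permit back.
        self._sem.release()


lock = SemaphoreLock()
lock.acquire()

# Release from a different thread than the one that acquired.
releaser = threading.Thread(target=lock.release)
releaser.start()
releaser.join()

# The lock is free again, so a non-blocking acquire succeeds.
reacquired = lock.acquire(blocking=False)
```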
-
Marius Wachtler authored
This removes a bottleneck of the interpreter/bjit: most variable accesses required a DenseMap lookup; with this change we use a fixed offset per variable. The bjit keeps the pointer to the vregs array in r14 for fast access.
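The idea of replacing a per-access map lookup with a fixed slot per variable can be sketched in Python as follows. This is only an analogy under stated assumptions: assign_vregs and the variable names are hypothetical, a Python dict stands in for the DenseMap, and a list stands in for the vregs array.

```python
def assign_vregs(symbols):
    """Give every symbol a fixed slot index, decided once up front."""
    return {name: slot for slot, name in enumerate(symbols)}


symbols = ["a", "b", "tmp"]
slot_of = assign_vregs(symbols)

# Before: every variable access is a hash lookup keyed by name.
frame_map = {"a": 1, "b": 2, "tmp": 0}
frame_map["tmp"] = frame_map["a"] + frame_map["b"]

# After: names are resolved to slots once; execution only indexes an array.
A, B, TMP = slot_of["a"], slot_of["b"], slot_of["tmp"]
vregs = [None] * len(symbols)
vregs[A] = 1
vregs[B] = 2
vregs[TMP] = vregs[A] + vregs[B]
```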
-
Marius Wachtler authored
Not having the ASTInterpreter GC-allocated improves performance. I had to add a small asm function in order to produce a special stack frame from which we can easily retrieve the ASTInterpreter*, replacing s_interpreterMap's job. This also makes sure that this function really does not get inlined. The s_interpreterMap was hard to understand and caused problems several times (duplicate entries, ...). This patch contains a hack which limits the number of variables inside a function to 512, because we have to make sure they are all on the stack and can't dynamically add more space. An upcoming patch will remove this limitation and replace it with a stack alloca sized to the actual number of variables the function uses.
-
Marius Wachtler authored
-
Boxiang Sun authored
-
Boxiang Sun authored
-
Boxiang Sun authored
-
Kevin Modzelewski authored
Allow passing NULL for empty kwargs
-
Rudi Chen authored
-
- 23 Jul, 2015 21 commits
-
-
Kevin Modzelewski authored
This is a convention that CPython uses, and it is a good performance improvement -- unlike tuples, dicts are mutable, so we currently have to create a new dict for every call into a kwargs-taking function even if there are no keywords to pass. The receiving ends have to be modified to allow NULL kwargs. This includes the AST interpreter (which I think will automatically cover the bjit), the LLVM irgenerator, and as many runtime functions as I can find.
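The NULL-kwargs convention can be sketched in Python, with None standing in for NULL. All function names here are hypothetical and only illustrate the calling convention: the receiving end treats None and an empty dict the same way, so the caller can skip allocating a fresh dict when there are no keywords.

```python
def callee(args, kwargs=None):
    """Hypothetical kwargs-taking function that accepts None for 'no keywords'."""
    if not kwargs:  # None and {} both mean "no keyword arguments"
        return sum(args)
    return sum(args) + sum(kwargs.values())


def call_with_fresh_dict(func, args, kwargs=None):
    # Old convention: always hand the callee its own dict, even if empty.
    return func(args, dict(kwargs) if kwargs else {})


def call_with_null(func, args, kwargs=None):
    # New convention: pass None when there is nothing to pass, no allocation.
    return func(args, kwargs if kwargs else None)


no_kw_old = call_with_fresh_dict(callee, (1, 2))
no_kw_new = call_with_null(callee, (1, 2))
with_kw = call_with_null(callee, (1, 2), {"x": 4})
```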
-
Kevin Modzelewski authored
-
Kevin Modzelewski authored
Pow
-
Kevin Modzelewski authored
Gcc 4.9 gmp fix
-
Kevin Modzelewski authored
Improve __getattr__ handling
-
Kevin Modzelewski authored
Reduce number of optimization passes we run
-
Kevin Modzelewski authored
bjit: inline trivial helper functions
-
Kevin Modzelewski authored
Release leftover chunk in the TraceStack.
-
Chris Toshok authored
-
Chris Toshok authored
-
Rudi Chen authored
-
Marius Wachtler authored
-
Kevin Modzelewski authored
The previous set was based on trying to get the absolute best results on microbenchmarks; go back to the small set.
-
Kevin Modzelewski authored
Fix gc visiting code for running generators
-
Kevin Modzelewski authored
This is not quite the same as checking whether tp_getattro has been updated, since setting a __getattr__ will change tp_getattro. Usually this isn't that important since we will still need to do some special getattr handling, but in a couple of specific cases we can optimize based on it. In particular, this commit optimizes calls to isinstance(inst, cls) on instances which have a __getattr__ set: in the common case (no __getattribute__, no custom __class__) we know that the __getattr__ will not get called, so we can stay on the isinstance fastpath.
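The observation behind the isinstance fastpath can be checked from Python itself. The sketch below assumes CPython 3 semantics (the commit targets a Python 2 runtime, where the same reasoning is described); Proxy and Other are hypothetical classes. Because the lookup of __class__ succeeds through the normal attribute path, the __getattr__ hook is never consulted during isinstance.

```python
class Proxy:
    """Hypothetical class that records every __getattr__ invocation."""

    def __init__(self):
        self.getattr_calls = []

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails.
        self.getattr_calls.append(name)
        raise AttributeError(name)


class Other:
    pass


p = Proxy()
result = isinstance(p, Other)
calls_after_isinstance = list(p.getattr_calls)

# For contrast, a genuinely missing attribute does reach the hook.
try:
    p.missing
except AttributeError:
    pass
calls_after_missing = list(p.getattr_calls)
```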
-
Kevin Modzelewski authored
- rewriting a runtimeCall of an instancemethod was broken (this is a separate code path from the much-more-common callattr form).
- we don't need to guard on the this->cls as often in Box::getattr, specifically in cases that are coming from typeLookup. Not because the classes are fixed (I think they can change), but because they are not allowed to change in a way that would change what Box::getattr has the guard for (guarding on attrs_offset and tp_dictoffset).
- slightly change the place we guard on tp_getattro and tp_getattr to a place I think is a bit more correct (or at least easier to understand as being correct).
-
Kevin Modzelewski authored
i.e. the way that __getattr__ ends up getting called. Two main optimizations:
- switch the initial attribute access (i.e. checking whether the attribute exists before calling __getattr__) to use a non-throwing API. For the typical case, even though the throw would be a C API throw, the construction of the AttributeError is relatively expensive, and the object would be immediately discarded anyway.
- add rewriting to the function
They both roughly cut out half the overhead of accessing attributes on classes with __getattr__.
-
Boxiang Sun authored
-
Boxiang Sun authored
-
Boxiang Sun authored
-
Boxiang Sun authored
-