Commit c03b0bca, authored Jul 05, 2018 by gabrieldemarmiesse
docs: Emphasized the speedups of Cython vs NumPy in both the notebook and the docs.
parent 20547723
Showing 2 changed files with 135 additions and 138 deletions
docs/examples/userguide/numpy_tutorial/numpy_and_cython.ipynb (+118, -125)
docs/src/userguide/numpy_tutorial.rst (+17, -13)
docs/examples/userguide/numpy_tutorial/numpy_and_cython.ipynb
...
...
@@ -20,9 +20,20 @@
"metadata": {
"scrolled": true
},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0.29a0\n"
]
}
],
"source": [
"%load_ext cython"
"from __future__ import print_function\n",
"%load_ext cython\n",
"import Cython\n",
"print(Cython.__version__)"
]
},
{
...
...
@@ -72,12 +83,13 @@
"name": "stdout",
"output_type": "stream",
"text": [
"8.69 ms ± 297 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n"
"8.11 ms ± 25.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n"
]
}
],
"source": [
"%timeit compute_np(array_1, array_2, a, b, c)"
"timeit_result = %timeit -o compute_np(array_1, array_2, a, b, c)\n",
"np_time = timeit_result.average"
]
},
{
...
...
@@ -86,7 +98,7 @@
"metadata": {},
"outputs": [],
"source": [
"result = compute_np(array_1, array_2, a, b, c)"
"np_result = compute_np(array_1, array_2, a, b, c)"
]
},
{
...
...
@@ -136,7 +148,7 @@
"metadata": {},
"outputs": [],
"source": [
"assert np.all(compute(array_1, array_2, a, b, c) == result)"
"assert np.all(compute(array_1, array_2, a, b, c) == np_result)"
]
},
{
...
...
@@ -148,24 +160,56 @@
"name": "stdout",
"output_type": "stream",
"text": [
"25.6 s ± 225 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
"27.9 s ± 1.75 s per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
]
}
],
"source": [
"%timeit compute(array_1, array_2, a, b, c)"
"timeit_result = %timeit -o compute(array_1, array_2, a, b, c)\n",
"py_time = timeit_result.average"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Pure Python version compiled with Cython:"
"#### We make a function to be able to easily compare timings with the NumPy version and the pure Python version."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"def compare_time(current, reference, name):\n",
" ratio = reference/current\n",
" if ratio > 1:\n",
" word = \"faster\"\n",
" else:\n",
" ratio = 1 / ratio \n",
" word = \"slower\"\n",
" \n",
" print(\"We are\", \"{0:.1f}\".format(ratio), \"times\", word, \"than the\", name, \"version.\")\n",
"\n",
"def print_report(compute_function):\n",
" assert np.all(compute_function(array_1, array_2, a, b, c) == np_result)\n",
" timeit_result = %timeit -o compute_function(array_1, array_2, a, b, c)\n",
" run_time = timeit_result.average\n",
" compare_time(run_time, py_time, \"pure Python\")\n",
" compare_time(run_time, np_time, \"NumPy\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Pure Python version compiled with Cython:"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"scrolled": false
},
...
...
@@ -1115,7 +1159,7 @@
"<IPython.core.display.HTML object>"
]
},
"execution_count": 9,
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
...
...
@@ -1153,31 +1197,25 @@
" return result"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"assert np.all(compute(array_1, array_2, a, b, c) == result)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The history saving thread hit an unexpected error (OperationalError('disk I/O error',)).History will not be written to the database.\n",
"21.9 s ± 398 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
"22.1 s ± 142 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n",
"We are 1.3 times faster than the pure Python version.\n",
"We are 2724.1 times slower than the NumPy version.\n"
]
}
],
"source": [
"%timeit compute(array_1, array_2, a, b, c)"
"print_report(compute)"
]
},
{
...
...
@@ -2021,27 +2059,22 @@
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"assert np.all(compute(array_1, array_2, a, b, c) == result)"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"10.5 s ± 301 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
"10.1 s ± 50.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n",
"We are 2.8 times faster than the pure Python version.\n",
"We are 1250.0 times slower than the NumPy version.\n"
]
}
],
"source": [
"%timeit compute(array_1, array_2, a, b, c)"
"print_report(compute)"
]
},
{
...
...
@@ -2053,7 +2086,7 @@
},
{
"cell_type": "code",
"execution_count": 15,
"execution_count": 14,
"metadata": {},
"outputs": [
{
...
...
@@ -2763,7 +2796,7 @@
"<IPython.core.display.HTML object>"
]
},
"execution_count": 15,
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
...
...
@@ -2804,28 +2837,21 @@
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [],
"source": [
"assert np.all(compute(array_1, array_2, a, b, c) == result)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"execution_count": 15,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"9.56 ms ± 139 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n"
"8.83 ms ± 42.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n",
"We are 3161.3 times faster than the pure Python version.\n",
"We are 1.1 times slower than the NumPy version.\n"
]
}
],
"source": [
"%timeit compute(array_1, array_2, a, b, c)"
"print_report(compute)"
]
},
{
...
...
@@ -2837,8 +2863,10 @@
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {},
"execution_count": 16,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
...
...
@@ -3510,7 +3538,7 @@
"<IPython.core.display.HTML object>"
]
},
"execution_count": 46,
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
...
...
@@ -3553,28 +3581,21 @@
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {},
"outputs": [],
"source": [
"assert np.all(compute(array_1, array_2, a, b, c) == result)"
]
},
{
"cell_type": "code",
"execution_count": 48,
"execution_count": 17,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"6.06 ms ± 26 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n"
"6.04 ms ± 12.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n",
"We are 4623.4 times faster than the pure Python version.\n",
"We are 1.3 times faster than the NumPy version.\n"
]
}
],
"source": [
"%timeit compute(array_1, array_2, a, b, c)"
"print_report(compute)"
]
},
{
...
...
@@ -3586,7 +3607,7 @@
},
{
"cell_type": "code",
"execution_count": 62,
"execution_count": 18,
"metadata": {},
"outputs": [],
"source": [
...
...
@@ -3628,28 +3649,21 @@
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {},
"outputs": [],
"source": [
"assert np.all(compute(array_1, array_2, a, b, c) == result)"
]
},
{
"cell_type": "code",
"execution_count": 51,
"execution_count": 19,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"4.13 ms ± 87.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n"
"4.18 ms ± 34 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n",
"We are 6673.5 times faster than the pure Python version.\n",
"We are 1.9 times faster than the NumPy version.\n"
]
}
],
"source": [
"%timeit compute(array_1, array_2, a, b, c)"
"print_report(compute)"
]
},
{
...
...
@@ -3661,7 +3675,7 @@
},
{
"cell_type": "code",
"execution_count": 63,
"execution_count": 20,
"metadata": {},
"outputs": [],
"source": [
...
...
@@ -3704,28 +3718,21 @@
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {},
"outputs": [],
"source": [
"assert np.all(compute(array_1, array_2, a, b, c) == result)"
]
},
{
"cell_type": "code",
"execution_count": 54,
"execution_count": 21,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"4.1 ms ± 54.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n"
"4.25 ms ± 52.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n",
"We are 6562.5 times faster than the pure Python version.\n",
"We are 1.9 times faster than the NumPy version.\n"
]
}
],
"source": [
"%timeit compute(array_1, array_2, a, b, c)"
"print_report(compute)"
]
},
{
...
...
@@ -3737,7 +3744,7 @@
},
{
"cell_type": "code",
"execution_count": 64,
"execution_count": 22,
"metadata": {},
"outputs": [],
"source": [
...
...
@@ -3790,16 +3797,22 @@
},
{
"cell_type": "code",
"execution_count": 56,
"execution_count": 23,
"metadata": {},
"outputs": [],
"source": [
"assert np.all(compute(array_1, array_2, a, b, c) == result)"
"arr_1_float = array_1.astype(np.float64)\n",
"arr_2_float = array_2.astype(np.float64)\n",
"\n",
"float_cython_result = compute(arr_1_float, arr_2_float, a, b, c)\n",
"float_numpy_result = compute_np(arr_1_float, arr_2_float, a, b, c)\n",
"\n",
"assert np.all(float_cython_result == float_numpy_result)"
]
},
{
"cell_type": "code",
"execution_count": 57,
"execution_count": 24,
"metadata": {
"scrolled": true
},
...
...
@@ -3808,27 +3821,14 @@
"name": "stdout",
"output_type": "stream",
"text": [
"6 ms ± 70.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n"
"6.17 ms ± 164 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n",
"We are 4525.9 times faster than the pure Python version.\n",
"We are 1.3 times faster than the NumPy version.\n"
]
}
],
"source": [
"%timeit compute(array_1, array_2, a, b, c)"
]
},
{
"cell_type": "code",
"execution_count": 58,
"metadata": {},
"outputs": [],
"source": [
"arr_1 = np.random.uniform(0, 1000, size=(100, 100)).astype(np.float64)\n",
"arr_2 = np.random.uniform(0, 1000, size=(100, 100)).astype(np.float64)\n",
"\n",
"float_cython_result = compute(arr_1, arr_2, a, b, c)\n",
"float_numpy_result = compute_np(arr_1, arr_2, a, b, c)\n",
"\n",
"assert np.all(float_cython_result == float_numpy_result)"
"print_report(compute)"
]
},
{
...
...
@@ -3840,7 +3840,7 @@
},
{
"cell_type": "code",
"execution_count": 65,
"execution_count": 25,
"metadata": {},
"outputs": [],
"source": [
...
...
@@ -3895,30 +3895,23 @@
},
{
"cell_type": "code",
"execution_count": 60,
"metadata": {},
"outputs": [],
"source": [
"assert np.all(compute(array_1, array_2, a, b, c) == result)"
]
},
{
"cell_type": "code",
"execution_count": 61,
"execution_count": 26,
"metadata": {
"scrolled": true
"scrolled": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"3.41 ms ± 93.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n"
"3.55 ms ± 80.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n",
"We are 7858.9 times faster than the pure Python version.\n",
"We are 2.3 times faster than the NumPy version.\n"
]
}
],
"source": [
"%timeit compute(array_1, array_2, a, b, c)"
"print_report(compute)"
]
}
],
...
...
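The notebook's `print_report` helper relies on the IPython `%timeit -o` magic, so it only runs inside IPython or Jupyter. A rough plain-Python equivalent is sketched below, assuming the standard-library `timeit` module is an acceptable stand-in (it is not what the commit uses). `describe_ratio` and `time_call` are hypothetical names introduced here for illustration; the comparison logic mirrors the notebook's `compare_time`.

```python
import timeit


def describe_ratio(current, reference, name):
    """Mirror the notebook's compare_time, but return the sentence instead of printing it."""
    ratio = reference / current
    if ratio > 1:
        word = "faster"
    else:
        ratio = 1 / ratio
        word = "slower"
    return "We are {0:.1f} times {1} than the {2} version.".format(ratio, word, name)


def time_call(func, *args, repeat=3, number=10):
    """Average seconds per call over `repeat` runs of `number` calls each (hypothetical helper)."""
    totals = timeit.repeat(lambda: func(*args), repeat=repeat, number=number)
    return sum(totals) / (repeat * number)


if __name__ == "__main__":
    # Time a cheap built-in call just to show the helper runs.
    print("sum(range(1000)) takes about", time_call(sum, range(1000)), "seconds per call")
    # Compare two made-up timings: 1 s for the current version, 2 s for the reference.
    print(describe_ratio(1.0, 2.0, "reference"))
```

Returning the sentence rather than printing it makes the helper easier to check in isolation; the notebook's version prints directly, which suits interactive use.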
docs/src/userguide/numpy_tutorial.rst
...
...
@@ -175,15 +175,15 @@ run a Python session to test both the Python version (imported from
In [7]: def compute_np(array_1, array_2, a, b, c):
...: return np.clip(array_1, 2, 10) * a + array_2 * b + c
In [8]: %timeit compute_np(array_1, array_2, a, b, c)
8.69 ms ± 297 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
8.11 ms ± 25.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [9]: import compute_py
In [10]: compute_py.compute(array_1, array_2, a, b, c)
25.6 s ± 225 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
27.9 s ± 1.75 s per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [11]: import compute_cy
In [12]: compute_cy.compute(array_1, array_2, a, b, c)
21.9 s ± 398 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
22.1 s ± 142 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
There's not such a huge difference yet; because the C code still does exactly
what the Python interpreter does (meaning, for instance, that a new object is
...
...
@@ -218,7 +218,7 @@ After building this and continuing my (very informal) benchmarks, I get:
.. sourcecode:: ipython
In [13]: %timeit compute_typed.compute(array_1, array_2, a, b, c)
10.5 s ± 301 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
10.1 s ± 50.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
So adding types does make the code faster, but nowhere
near the speed of NumPy?
...
...
@@ -287,10 +287,10 @@ Let's see how much faster accessing is now.
.. sourcecode:: ipython
In [22]: %timeit compute_memview.compute(array_1, array_2, a, b, c)
9.56 ms ± 139 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
8.83 ms ± 42.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Note the importance of this change.
We're now 2700 times faster than an interpreted version of Python and close
We're now 3161 times faster than an interpreted version of Python and close
to NumPy speed.
Memoryviews can be used with slices too, or even
...
...
@@ -326,9 +326,9 @@ information.
.. sourcecode:: ipython
In [23]: %timeit compute_index.compute(array_1, array_2, a, b, c)
6.1 ms ± 103 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
6.04 ms ± 12.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
We're now faster than the NumPy version. NumPy is really well written,
We're now faster than the NumPy version, not by much (1.3x). NumPy is really well written,
but does not perform operations lazily, meaning a lot
of back and forth in memory. Our version is very memory efficient and
cache friendly because we know the operations in advance.
...
...
@@ -375,9 +375,10 @@ get by declaring the memoryviews as contiguous:
.. sourcecode:: ipython
In [23]: %timeit compute_contiguous.compute(array_1, array_2, a, b, c)
4.13 ms ± 87.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
4.18 ms ± 34 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
We're now around two times faster than the NumPy version.
We're now around two times faster than the NumPy version, and 6600 times
faster than the pure Python version!
Making the function cleaner
===========================
...
...
@@ -403,7 +404,7 @@ We now do a speed test:
.. sourcecode:: ipython
In [24]: %timeit compute_infer_types.compute(array_1, array_2, a, b, c)
4.1 ms ± 54.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
4.25 ms ± 52.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Lo and behold, the speed has not changed.
...
...
@@ -444,7 +445,7 @@ We now do a speed test:
.. sourcecode:: ipython
In [25]: %timeit compute_fused_types.compute(array_1, array_2, a, b, c)
6 ms ± 70.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
6.17 ms ± 164 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
We're a bit slower than before, because the right call to the clip function
must be found at runtime, which adds a bit of overhead.
...
...
@@ -471,7 +472,10 @@ We can have substantial speed gains for minimal effort:
.. sourcecode:: ipython
In [25]: %timeit compute_prange.compute(array_1, array_2, a, b, c)
3.41 ms ± 93.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
3.55 ms ± 80.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
We're now 7858 times faster than the pure Python version and 2.3 times faster
than NumPy!
Where to go from here?
======================
...
...
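The speedup figures quoted in the updated text can be sanity-checked against the rounded timings shown on the new side of the diff. The sketch below only recomputes ratios from those rounded numbers, so small differences from the notebook's printed output (which works from unrounded averages) are expected. `speedups` is a hypothetical helper name, and the dictionary keys are shortened from the module names used in the tutorial.

```python
# Rounded timings (in seconds) copied from the new side of the diff.
PY_TIME = 27.9       # pure Python version
NP_TIME = 8.11e-3    # NumPy version

cython_times = {
    "memview": 8.83e-3,
    "index": 6.04e-3,
    "contiguous": 4.18e-3,
    "infer_types": 4.25e-3,
    "fused_types": 6.17e-3,
    "prange": 3.55e-3,
}


def speedups(run_time, py_time=PY_TIME, np_time=NP_TIME):
    """Return (speedup vs pure Python, speedup vs NumPy) for one timing."""
    return py_time / run_time, np_time / run_time


for name, t in cython_times.items():
    vs_py, vs_np = speedups(t)
    print("{0:12s} {1:8.0f}x vs pure Python, {2:4.1f}x vs NumPy".format(name, vs_py, vs_np))
```

A ratio below 1 in the NumPy column means the Cython version is still slower than NumPy, which is the case for the memoryview step before bounds checking is disabled.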