Added tools for generating featured HTML recipes.
rossant committed Jun 15, 2014
1 parent 7085626 commit 8c9d877
Showing 5 changed files with 204 additions and 225 deletions.
33 changes: 20 additions & 13 deletions featured/01_numpy_performance.ipynb
@@ -1,7 +1,7 @@
{
"metadata": {
"name": "",
"signature": "sha256:015ce6a5abd2c1803c32dae6f3cc89bbb103b97ee5b159129f2454ad27d3ef6f"
"signature": "sha256:b9b65add267b3e6d0058a88dbd981ac0f74d0719f95b4ec1a70421fa85278bfe"
},
"nbformat": 3,
"nbformat_minor": 0,
@@ -12,14 +12,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Featured recipe #1: Get the best performance out of NumPy"
"# Featured Recipe #1: Getting the Best Performance out of NumPy"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> This is the first featured recipe of the [IPython Cookbook](http://ipython-books.github.io/), the definitive guide to high-performance scientific computing and data science in Python."
"> This is the first featured recipe of the [**IPython Cookbook**](http://ipython-books.github.io/), the definitive guide to **high-performance scientific computing** and **data science** in Python."
]
},
{
@@ -28,7 +28,7 @@
"source": [
"**NumPy** is the cornerstone of the scientific Python software stack. It provides a special data type optimized for vector computations, the `ndarray`. This object is at the core of most algorithms in scientific numerical computing.\n",
"\n",
"With NumPy arrays, you can achieve significant performance speedups over native Python, particularly when your computations follow the *Single Instruction, Multiple Data* (SIMD) paradigm. However, it is also possible to unintentionally write non-optimized code with NumPy.\n",
"With NumPy arrays, you can achieve significant performance speedups over native Python, particularly when your computations follow the ***Single Instruction, Multiple Data* (SIMD)** paradigm. However, it is also possible to unintentionally write non-optimized code with NumPy.\n",
"\n",
"In this featured recipe, we will see some tricks that can help you write optimized NumPy code. We will start by looking at ways to avoid unnecessary array copies in order to save time and memory. In that respect, we will need to dig into the internals of NumPy."
]
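The copy-avoidance theme above can be made concrete with a minimal sketch (illustrative code, not part of the notebook's diff), using the buffer address exposed by `ndarray.ctypes.data` to tell in-place operations apart from copies:

```python
import numpy as np

a = np.random.rand(1_000_000)
addr = a.ctypes.data        # address of the array's underlying memory buffer

a *= 2                      # in-place: reuses the same buffer, no copy
assert a.ctypes.data == addr

b = a * 2                   # out-of-place: allocates a fresh buffer
assert b.ctypes.data != addr
```

In-place operators (`*=`, `+=`, and friends) and explicit `out=` arguments to ufuncs are the usual ways to keep a computation in an existing buffer.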
@@ -37,7 +37,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Understanding the internals of NumPy to avoid unnecessary array copy"
"## Learning to avoid unnecessary array copies"
]
},
{
@@ -418,7 +418,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"6.\tBroadcasting rules allow you to make computations on arrays with different but compatible shapes. In other words, you don't always need to reshape or tile your arrays to make their shapes match. The following example illustrates two ways of doing an outer product between two vectors: the first method involves array tiling, the second one involves broadcasting. The last method is significantly faster."
"6. **Broadcasting rules** allow you to make computations on arrays with different but compatible shapes. In other words, you don't always need to reshape or tile your arrays to make their shapes match. The following example illustrates two ways of doing an outer product between two vectors: the first method involves array tiling; the second involves broadcasting and is significantly faster."
]
},
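The two outer-product approaches contrasted in step 6 can be sketched as follows (hypothetical variable names, not the notebook's exact code):

```python
import numpy as np

n = 1000
x = np.random.rand(n)
y = np.random.rand(n)

# Method 1: tile both vectors into full (n, n) matrices, then multiply.
outer_tile = np.tile(x, (n, 1)).T * np.tile(y, (n, 1))

# Method 2: broadcasting, a (n, 1) column against a (1, n) row.
# No intermediate tiled arrays are materialized, which is why it is faster.
outer_bcast = x[:, np.newaxis] * y[np.newaxis, :]

assert np.allclose(outer_tile, outer_bcast)
```

Both compute `x_i * y_j` for every pair of indices; `np.outer(x, y)` gives the same result.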
{
@@ -875,7 +875,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Explanations"
"## How it works..."
]
},
{
@@ -905,25 +905,25 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"1.\tComputations on arrays can be written very efficiently in a low-level language like C (and a large part of NumPy is actually written in C). Knowing the address of the memory block and the data type, it is just simple arithmetic to loop over all items, for example. There would be a significant overhead to do that in Python with a list.\n",
"1.\t**Array computations can be written very efficiently in a low-level language like C** (and a large part of NumPy is actually written in C). Knowing the address of the memory block and the data type, it is just simple arithmetic to loop over all items, for example. There would be a significant overhead to do that in Python with a list.\n",
"\n",
"2.\tSpatial locality in memory access patterns results in performance gains notably due to the CPU cache. Indeed, the cache loads bytes in chunks from RAM to the CPU registers. Adjacent items are then loaded very efficiently (sequential locality, or locality of reference).\n",
"2.\t**Spatial locality in memory access patterns** results in significant performance gains, notably thanks to the CPU cache. Indeed, the cache loads data in chunks (cache lines) from RAM into the CPU cache; adjacent items can then be accessed very efficiently (sequential locality, or locality of reference).\n",
"\n",
"3.\tFinally, the fact that items are stored contiguously in memory allows NumPy to take advantage of vectorized instructions of modern CPUs, like Intel's SSE and AVX, AMD's XOP, and so on. For example, multiple consecutive floating point numbers can be loaded in 128, 256 or 512 bits registers for vectorized arithmetical computations implemented as CPU instructions."
"3.\t**Data elements are stored contiguously in memory**, so that NumPy can take advantage of the vectorized instructions of modern CPUs, like Intel's SSE and AVX, AMD's XOP, and so on. For example, multiple consecutive floating-point numbers can be loaded into 128-, 256-, or 512-bit registers for vectorized arithmetic computations implemented as CPU instructions."
]
},
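The locality argument in point 2 can be inspected directly through `ndarray.strides` and `ndarray.flags` (an illustrative sketch, not from the notebook):

```python
import numpy as np

a = np.zeros((1000, 1000))  # float64, C order: rows are contiguous in memory

# Strides in bytes: stepping to the next row skips 8000 bytes,
# stepping to the next column skips only 8 (one float64).
print(a.strides)  # (8000, 8)

row = a[0, :]  # contiguous view: adjacent elements, cache-friendly
col = a[:, 0]  # strided view: one element every 8000 bytes
assert row.flags['C_CONTIGUOUS']
assert not col.flags['C_CONTIGUOUS']
```

Traversing the array along its contiguous axis keeps memory accesses within cache lines, which is why row-wise operations on C-ordered arrays tend to outperform column-wise ones.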
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Additionally, let's mention the fact that NumPy can be linked to highly optimized linear algebra libraries like BLAS and LAPACK, for example through the Intel Math Kernel Library (MKL). A few specific matrix computations may also be multithreaded, taking advantage of the power of modern multicore processors."
"Additionally, NumPy can be linked to highly optimized linear algebra libraries like *BLAS* and *LAPACK*, for example through the *Intel Math Kernel Library (MKL)*. A few specific matrix computations may also be multithreaded, taking advantage of the power of modern multicore processors."
]
},
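For instance, matrix multiplication delegates to whatever BLAS implementation NumPy was built against (a sketch; which library is linked depends on the installation and can be checked with `np.show_config()`):

```python
import numpy as np

A = np.random.rand(200, 300)
B = np.random.rand(300, 100)

# np.dot dispatches to the linked BLAS gemm routine, which is typically
# blocked, vectorized, and possibly multithreaded (e.g. MKL or OpenBLAS).
C = A.dot(B)
assert C.shape == (200, 100)

# Spot-check one entry against the definition of the matrix product.
assert np.isclose(C[0, 0], np.sum(A[0, :] * B[:, 0]))
```

A hand-written triple loop in pure Python computing the same product would be orders of magnitude slower.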
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In conclusion, storing data in a contiguous block of memory ensures that the architecture of modern CPUs is used optimally, in terms of memory access patterns, CPU cache, and vectorized instructions."
"In conclusion, **storing data in a contiguous block of memory ensures that the architecture of modern CPUs is used optimally, in terms of memory access patterns, CPU cache, and vectorized instructions**."
]
},
{
@@ -1054,7 +1054,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"> This was a featured recipe from the [IPython Cookbook](http://ipython-books.github.io/), by [Cyrille Rossant](http://cyrille.rossant.net), Packt Publishing, 2014 (400 pages). If you liked this recipe, [pre-order the book now](http://www.packtpub.com/ipython-interactive-computing-and-visualization-cookbook/book)! There's a time-limited 50% discount with the code `PICVCEB`."
"You will find related recipes on the [book's repository](https://github.com/ipython-books/cookbook-code)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> This was a featured recipe from the [IPython Cookbook](http://ipython-books.github.io/), by [Cyrille Rossant](http://cyrille.rossant.net), Packt Publishing, 2014. If you liked this recipe, [pre-order the book now](http://www.packtpub.com/ipython-interactive-computing-and-visualization-cookbook/book)! There's a time-limited 50% discount with the code `PICVCEB`."
]
}
],