9 Introduction to NumPy

This chapter is devoted to an important library for numerical calculations: NumPy (abbreviation of Numerical Python).

It is common practice to import NumPy by assigning it the alias np:

import numpy as np

9.1 Arrays

NumPy offers a popular data structure, arrays, on which calculations can be performed efficiently. Arrays are a useful structure for performing basic statistical operations as well as pseudo-random number generation.

Arrays are similar in structure to lists, but lists are slower to process and use more memory. The speed gain of NumPy arrays comes from the fact that the data is stored in contiguous memory blocks, which makes read access easier.

To be convinced, we can use the example given by Pierre Navaro in his notebook on NumPy. Let’s create two lists of length 1000 each, with numbers drawn randomly using the random() function of the random module. Let’s divide each element of the first list by the element at the same position in the second list, then calculate the sum of these 1000 divisions. Then let’s look at the execution time using the magic function %timeit:

from random import random
from operator import truediv
l1 = [random() for i in range(1000)]
l2 = [random() for i in range(1000)]
# %timeit s = sum(map(truediv,l1,l2))

(uncomment the last line and test on a Jupyter Notebook)

Now, let’s transform the two lists into NumPy arrays with the array() function, and do the same calculation with a NumPy function:

a1 = np.array(l1)
a2 = np.array(l2)
# %timeit s = np.sum(a1/a2)

As can be seen by executing this code in an IPython environment, the calculation runs much faster with the NumPy functions.
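Outside a notebook, the same comparison can be sketched with the timeit module of the standard library. This is only an illustration: the names list_a, list_b, py_time and np_time, the shift of 0.5 (to avoid dividing by a value close to zero) and the 1000 repetitions are assumptions, and the timings depend on the machine.

```python
import timeit
from operator import truediv
from random import random

import numpy as np

# Two lists of 1000 random numbers, shifted away from zero
list_a = [random() + 0.5 for _ in range(1000)]
list_b = [random() + 0.5 for _ in range(1000)]
arr_a = np.array(list_a)
arr_b = np.array(list_b)

# Total time (in seconds) for 1000 repetitions of each calculation
py_time = timeit.timeit(lambda: sum(map(truediv, list_a, list_b)), number=1000)
np_time = timeit.timeit(lambda: np.sum(arr_a / arr_b), number=1000)

print("pure Python:", py_time)
print("NumPy      :", np_time)
```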

9.1.1 Creation

An array can be created from a list with the array() function, as we just did:

lst = [1, 2, 4]  # avoid the name `list`, which would shadow the built-in type
table = np.array(lst)
print(table)
## [1 2 4]
print(type(table))
## <class 'numpy.ndarray'>

If array() is provided with a list of nested lists of the same length, a multidimensional array will be created:

list_2 = [ [1,2,3], [4,5,6] ]
table_2 = np.array(list_2)
print(table_2)
## [[1 2 3]
##  [4 5 6]]
print(type(table_2))
## <class 'numpy.ndarray'>

Arrays can also be created from tuples:

tup = (1, 2, 3)
table = np.array(tup)
print(table)
## [1 2 3]
print(type(table))
## <class 'numpy.ndarray'>

A one-dimensional array can be reshaped into a two-dimensional array (when the number of elements allows it) by changing its shape attribute:

table = np.array([3, 2, 5, 1, 6, 5])
table.shape = (3,2)
print(table)
## [[3 2]
##  [5 1]
##  [6 5]]
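As an alternative sketch, the reshape() method returns a reshaped array and leaves the shape of the original untouched. The names flat and grid are illustrative; an incompatible shape raises a ValueError.

```python
import numpy as np

flat = np.array([3, 2, 5, 1, 6, 5])
grid = flat.reshape(3, 2)   # same 6 values arranged as 3 rows and 2 columns
print(grid.shape)           # (3, 2)
print(flat.shape)           # (6,): the original keeps its shape

try:
    flat.reshape(4, 2)      # 6 elements cannot fill a 4x2 array
except ValueError as err:
    print("ValueError:", err)
```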

9.1.1.1 Some Functions Generating array Objects

Some NumPy functions produce pre-filled arrays. This is the case of the zeros() function. When given an integer value \(n\), zeros() creates a one-dimensional array containing \(n\) zeros:

print( np.zeros(4) )
## [0. 0. 0. 0.]

The type of the zeros (e.g., int, int32, int64, float, float32, float64, etc.) can be specified using the dtype argument:

print( np.zeros(4, dtype = "int") )
## [0 0 0 0]

More explanations on NumPy data types are available in the online documentation.

The type of the elements of an array can be read from its dtype attribute:

x = np.zeros(4, dtype = "int")
print(x, x.dtype)
## [0 0 0 0] int64

It is also possible to convert the type of elements into another type, using the astype() method:

y = x.astype("float")
print(x, x.dtype)
## [0 0 0 0] int64
print(y, y.dtype)
## [0. 0. 0. 0.] float64

When provided with a tuple of more than one element, zeros() creates a multidimensional array:

print( np.zeros((2, 3)) )
## [[0. 0. 0.]
##  [0. 0. 0.]]
print( np.zeros((2, 3, 4)) )
## [[[0. 0. 0. 0.]
##   [0. 0. 0. 0.]
##   [0. 0. 0. 0.]]
## 
##  [[0. 0. 0. 0.]
##   [0. 0. 0. 0.]
##   [0. 0. 0. 0.]]]

The empty() function of NumPy also returns an array on the same principle as zeros(), but without initializing the values: the array contains whatever was already in memory, so the values displayed (zeros here) are arbitrary.

print( np.empty((2, 3), dtype = "int") )
## [[0 0 0]
##  [0 0 0]]

The ones() function of NumPy returns the same kind of array, with all values initialized to 1:

print( np.ones((2, 3), dtype = "float") )
## [[1. 1. 1.]
##  [1. 1. 1.]]

To choose a specific value for initialization, you can use the full() function of NumPy:

print( np.full((2, 3), 10, dtype = "float") )
## [[10. 10. 10.]
##  [10. 10. 10.]]
print( np.full((2, 3), np.inf) )
## [[inf inf inf]
##  [inf inf inf]]

The eye() function of NumPy creates a two-dimensional array in which all elements are initialized to zero, except those of the diagonal, which are initialized to 1:

print( np.eye(2, dtype="int64") )
## [[1 0]
##  [0 1]]

By modifying the keyword argument k, the diagonal can be shifted:

print( np.eye(3, k=-1) )
## [[0. 0. 0.]
##  [1. 0. 0.]
##  [0. 1. 0.]]

The identity() function of NumPy creates an identity matrix in the form of an array:

print( np.identity(3, dtype = "int") )
## [[1 0 0]
##  [0 1 0]
##  [0 0 1]]

The arange() function of NumPy generates a sequence of numbers separated by a fixed interval, all stored in an array. The syntax is as follows:

np.arange( start, stop, step, dtype )

with start the start value (included), stop the end value (excluded), step the spacing between the numbers in the sequence, and dtype the type of the numbers:

print( np.arange(5) )
## [0 1 2 3 4]
print( np.arange(2, 5) )
## [2 3 4]
print( np.arange(2, 10, 2) )
## [2 4 6 8]
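A few more cases may help fix the rules: the stop value is never part of the sequence, the step may be non-integer, and a negative step counts downwards. The names seq_int, seq_float and seq_down are illustrative.

```python
import numpy as np

seq_int = np.arange(0, 10, 5)      # 0 and 5; the stop value 10 is excluded
seq_float = np.arange(0, 1, 0.25)  # 0, 0.25, 0.5 and 0.75
seq_down = np.arange(5, 0, -1)     # 5, 4, 3, 2 and 1

print(seq_int)
print(seq_float)
print(seq_down)
```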

9.1.2 Dimensions

To know the number of dimensions of an array, the value of the attribute ndim can be displayed:

print("ndim table : ", table.ndim)
## ndim table :  2
print("ndim table_2 : ", table_2.ndim)
## ndim table_2 :  2

The number of elements in the array can be obtained with the size attribute or with the size() function of NumPy:

print("size table : ", table.size)
## size table :  6
print("size table_2: ", table_2.size)
## size table_2:  6
print("np.size(table):", np.size(table))
## np.size(table): 6

The shape attribute returns a tuple indicating the length for each dimension of the array:

print("shape table: ", table.shape)
## shape table:  (3, 2)
print("shape table_2: ", table_2.shape)
## shape table_2:  (2, 3)
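The three attributes are linked: ndim is the length of shape, and size is the product of the values in shape. A small sketch on a three-dimensional array (the name grid_3d is illustrative):

```python
import numpy as np

grid_3d = np.zeros((2, 3, 4))
print(grid_3d.ndim)    # 3 axes
print(grid_3d.shape)   # (2, 3, 4)
print(grid_3d.size)    # 24, i.e. 2 * 3 * 4
print(len(grid_3d))    # 2: len() only gives the length of the first axis
```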

9.1.3 Extracting Elements from an Array

Access to the elements of an array is done in the same way as for lists (see Section 3.1.1), using indexes. The syntax is as follows:

array[lower:upper:step]

with lower the lower boundary of the index range (included), upper the upper boundary (excluded), and step the spacing between the values.

  • When lower is not specified, the first element (indexed 0) is considered as the value assigned to lower.
  • When upper is not specified, the range extends up to and including the last element.
  • When step is not specified, a step of 1 is assigned by default.

Let’s take a quick look at some examples, using two objects: an array of dimension 1, and a second of dimension 2.

table_1 = np.arange(1,13)
table_2 = [ [1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
table_2 = np.array(table_2)

Access to the first element:

message = "table_{}[0] : {} (type : {})"
print(message.format(1, table_1[0], type(table_1[0])))
## table_1[0] : 1 (type : <class 'numpy.int64'>)
print(message.format(2, table_2[0], type(table_2[0])))
## table_2[0] : [1 2 3] (type : <class 'numpy.ndarray'>)

Access to the elements can be done from the end:

print("table_1[-1] : ", table_1[-1]) # last element
## table_1[-1] :  12
print("table_2[-1] : ", table_2[-1]) # last element
## table_2[-1] :  [10 11 12]

Slicing is possible:

# the elements at indices 2 and 3 (index 4 is excluded)
print("Slice Table 1 : \n", table_1[2:4])
## Slice Table 1 : 
##  [3 4]
print("Slice Table 2 : \n", table_2[2:4])
## Slice Table 2 : 
##  [[ 7  8  9]
##  [10 11 12]]

For two-dimensional arrays, the elements can be accessed in the following ways:

# Within the 3rd element, access the 1st element
print(table_2[2][0])
## 7
print(table_2[2,0])
## 7

To extract columns from a two-dimensional array:

print("Second column: \n", table_2[:, [1]])
## Second column: 
##  [[ 2]
##  [ 5]
##  [ 8]
##  [11]]
print("Second and third columns: \n", table_2[:, [1,2]])
## Second and third columns: 
##  [[ 2  3]
##  [ 5  6]
##  [ 8  9]
##  [11 12]]

In this last instruction, the empty first argument (before the colon) indicates that we want all the elements of the first dimension; then, after the comma, we indicate that within each element of the first dimension we want the values at positions 1 and 2 (that is, the elements of columns 2 and 3).
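The form of the column index changes the shape of the result: an integer drops the axis, whereas a list keeps it. A minimal sketch (the names m, col_flat and col_2d are illustrative):

```python
import numpy as np

m = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
col_flat = m[:, 1]     # integer index: the column axis is dropped
col_2d = m[:, [1]]     # list index: the column axis is kept
print(col_flat.shape)  # (4,)
print(col_2d.shape)    # (4, 1)
```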

To extract only some elements along the first dimension of an array (here, rows of the two-dimensional array), we can specify the indices of the elements to be retrieved:

print("2nd and 4th elements: \n", table_2[[1,3]])
## 2nd and 4th elements: 
##  [[ 4  5  6]
##  [10 11 12]]

9.1.3.1 Extraction Using Boolean

To choose which elements of an array are extracted, you can use Boolean arrays as masks. The idea is to provide a Boolean array (a mask) of the same size as the array from which you want to extract elements under certain conditions. When the value of the Boolean in the mask is True, the corresponding element of the array is returned; otherwise, it is not.

table = np.array([0, 3, 2, 5, 1, 4])
res = table[[True, False, True, False, True, True]]
print(res)
## [0 2 1 4]

Only the elements in positions 1, 3, 5 and 6 were returned.

In practice, the mask is only rarely written by hand; it usually comes from a logical expression applied to the array of interest. For example, with our array, we can first create a mask to identify the even elements:

mask = table % 2 == 0
print(mask)
## [ True False  True False False  True]
print(type(mask))
## <class 'numpy.ndarray'>

Once this mask is created, it can be applied to the array to extract only those elements for which the corresponding value in the mask is True:

print(table[mask])
## [0 2 4]
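Masks can also be combined. The sketch below assumes the element-wise operators & (and) and ~ (not); the parentheses around each condition are required because of operator precedence. The names values and even_and_positive are illustrative.

```python
import numpy as np

values = np.array([0, 3, 2, 5, 1, 4])
even_and_positive = (values % 2 == 0) & (values > 0)
print(values[even_and_positive])    # [2 4]
print(values[~even_and_positive])   # [0 3 5 1]
```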

9.1.4 Modification

To replace values in an array, the equal sign (=) can be used:

table = np.array([ [1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
table[0] = [11, 22, 33]
print(table)
## [[11 22 33]
##  [ 4  5  6]
##  [ 7  8  9]
##  [10 11 12]]

If a scalar is provided during replacement, the value is repeated for all elements of the dimension:

table[0] = 100
print(table)
## [[100 100 100]
##  [  4   5   6]
##  [  7   8   9]
##  [ 10  11  12]]

The same applies to a slice:

table[0:2] = 100
print(table)
## [[100 100 100]
##  [100 100 100]
##  [  7   8   9]
##  [ 10  11  12]]

In fact, a slice written with just the colon, without specifying the start and end, followed by an equal sign and a number replaces all the values in the array with this number:

table[:] = 0
print(table)
## [[0 0 0]
##  [0 0 0]
##  [0 0 0]
##  [0 0 0]]
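Assignment also works with a Boolean mask or with a column selection on the left-hand side. A short sketch (the name scores is illustrative):

```python
import numpy as np

scores = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
scores[scores > 5] = 0   # every element greater than 5 is set to 0
scores[:, 0] = -1        # the whole first column is set to -1
print(scores)
```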

9.1.4.1 Inserting Elements

To add elements, we use the append() function of NumPy. Note that calling this function does not change the object to which the values are added. If we want the changes to be made to this object, we must overwrite it:

t_1 = np.array([1,3,5])
print("t_1 : ", t_1)
## t_1 :  [1 3 5]
t_1 = np.append(t_1, 1)
print("t_1 after the insertion: ", t_1)
## t_1 after the insertion:  [1 3 5 1]

To add a column to a two-dimensional array:

t_2 = np.array([[1,2,3], [5,6,7]])
print("t_2 : \n", t_2)
## t_2 : 
##  [[1 2 3]
##  [5 6 7]]
add_col_t_2 = np.array([[4], [8]])
t_2 = np.append(t_2,add_col_t_2, axis = 1)
print("t_2 after the insertion: \n", t_2)
## t_2 after the insertion: 
##  [[1 2 3 4]
##  [5 6 7 8]]

To add a row, we use the vstack() function of NumPy:

ajout_ligne_t_2 = np.array([10, 11, 12, 13])
t_2 = np.vstack([t_2,ajout_ligne_t_2])
print("t_2 after adding the row: \n", t_2)
## t_2 after adding the row: 
##  [[ 1  2  3  4]
##  [ 5  6  7  8]
##  [10 11 12 13]]
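Both insertions can also be written with the concatenate() function of NumPy, choosing the axis along which the arrays are joined. This is a sketch with illustrative names (base, new_col, new_row); the arrays being joined must have matching shapes along the other axis.

```python
import numpy as np

base = np.array([[1, 2, 3], [5, 6, 7]])
new_col = np.array([[4], [8]])           # shape (2, 1)
new_row = np.array([[10, 11, 12, 13]])   # shape (1, 4)

wider = np.concatenate([base, new_col], axis=1)    # add a column
taller = np.concatenate([wider, new_row], axis=0)  # add a row
print(taller)
print(base.shape)   # (2, 3): the original array is unchanged
```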

9.1.4.2 Deleting / Removing Elements

To delete elements, we can use the delete() function of NumPy:

print("t_1 : ", t_1)
## t_1 :  [1 3 5 1]
# Remove the last element
np.delete(t_1, (-1))
## array([1, 3, 5])

Note: np.delete() returns a new array and leaves t_1 unchanged. For the deletion to be effective, the result must be assigned back to the object.

To delete multiple items:

print("t_1 : ", t_1)
## t_1 :  [1 3 5 1]
# Remove the first and third elements:
t_1 = np.delete(t_1, ([0, 2]))
print(t_1)
## [3 1]

To delete a column from a two-dimensional array:

print("t_2 : ", t_2)
## t_2 :  [[ 1  2  3  4]
##  [ 5  6  7  8]
##  [10 11 12 13]]
# Remove the first column:
np.delete(t_2, (0), axis=1)
## array([[ 2,  3,  4],
##        [ 6,  7,  8],
##        [11, 12, 13]])

Delete multiple columns:

print("t_2 : ", t_2)
## t_2 :  [[ 1  2  3  4]
##  [ 5  6  7  8]
##  [10 11 12 13]]
# Remove the first and third columns:
np.delete(t_2, ([0,2]), axis=1)
## array([[ 2,  4],
##        [ 6,  8],
##        [11, 13]])

And to delete a row:

print("t_2 : ", t_2)
## t_2 :  [[ 1  2  3  4]
##  [ 5  6  7  8]
##  [10 11 12 13]]
# Remove the first row:
np.delete(t_2, (0), axis=0)
## array([[ 5,  6,  7,  8],
##        [10, 11, 12, 13]])

Delete multiple rows:

print("t_2 : ", t_2)
## t_2 :  [[ 1  2  3  4]
##  [ 5  6  7  8]
##  [10 11 12 13]]
# Remove the first and third rows:
np.delete(t_2, ([0,2]), axis=0)
## array([[5, 6, 7, 8]])

9.1.5 Copy of an Array

Copying an array, as with lists (cf. Section 3.1.4), should not be done with the equal sign (=). Let’s see why.

table_1 = np.array([1, 2, 3])
table_2 = table_1

Let’s modify the first element of table_2, and observe the content of table_2 and table_1:

table_2[0] = 0
print("Table 1: \n", table_1)
## Table 1: 
##  [0 2 3]
print("Table 2: \n", table_2)
## Table 2: 
##  [0 2 3]

As can be seen, using the equal sign simply created a reference and not a copy.

There are several ways to copy an array. Among them, the use of the np.array() function:

table_1 = np.array([1, 2, 3])
table_2 = np.array(table_1)
table_2[0] = 0
print("table_1 : ", table_1)
## table_1 :  [1 2 3]
print("table_2 : ", table_2)
## table_2 :  [0 2 3]

The copy() method can also be used:

table_1 = np.array([1, 2, 3])
table_2 = table_1.copy()
table_2[0] = 0
print("table_1 : ", table_1)
## table_1 :  [1 2 3]
print("table_2 : ", table_2)
## table_2 :  [0 2 3]

Note that slicing does not create a copy either: the slice is a view that shares its data with the original array, so modifying the slice also modifies the original:

table_1 = np.array([1, 2, 3, 4])
table_2 = table_1[:2]
table_2[0] = 0
print("table_1 : ", table_1)
## table_1 :  [0 2 3 4]
print("table_2 : ", table_2)
## table_2 :  [0 2]
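Whether two arrays share their data can be checked with the shares_memory() function of NumPy. A minimal sketch (the names original, sliced and copied are illustrative):

```python
import numpy as np

original = np.array([1, 2, 3, 4])
sliced = original[:2]          # view on the same data
copied = original[:2].copy()   # independent copy

print(np.shares_memory(original, sliced))   # True
print(np.shares_memory(original, copied))   # False
```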

9.1.6 Sorting

The NumPy library provides a function to sort arrays, sort():

table = np.array([3, 2, 5, 1, 6, 5])
print("Sorted Table: ", np.sort(table))
## Sorted Table:  [1 2 3 5 5 6]
print("Table: ", table)
## Table:  [3 2 5 1 6 5]

As we can see, the sort() function of NumPy returns a sorted copy: the original array is not modified. This is not the case with the sort() method, which sorts the array in place:

table = np.array([3, 2, 5, 1, 6, 5])
table.sort()
print("The array was modified: ", table)
## The array was modified:  [1 2 3 5 5 6]
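For multidimensional arrays, the axis argument of np.sort() selects the direction of the sort. The sketch below uses an illustrative array named unsorted; reversing the result with [::-1] is one simple way to get a descending order.

```python
import numpy as np

unsorted = np.array([[3, 2, 5], [1, 6, 4]])
by_row = np.sort(unsorted, axis=1)        # sort within each row
by_col = np.sort(unsorted, axis=0)        # sort within each column
flat_sorted = np.sort(unsorted, axis=None)  # flatten, then sort
descending = flat_sorted[::-1]            # reversed order

print(by_row)
print(by_col)
print(flat_sorted)
print(descending)
```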

9.1.7 Transposition

To obtain the transpose of an array, the attribute T can be used. Note that this gives a view of the object: the object itself is not changed.

table = np.array([3, 2, 5, 1, 6, 5])
table.shape = (3,2)
print("Array: \n", table)
## Array: 
##  [[3 2]
##  [5 1]
##  [6 5]]
print("Transposed Array: \n", table.T)
## Transposed Array: 
##  [[3 5 6]
##  [2 1 5]]

The transpose() function of NumPy can also be used:

print(np.transpose(table))
## [[3 5 6]
##  [2 1 5]]

Be careful: if a name is assigned to the transpose, either by using the attribute T or the function np.transpose(), the new name refers to a view of the original array, not a copy:

table_transpose = np.transpose(table)
table_transpose[0,0] = 99
print("Array: \n", table)
## Array: 
##  [[99  2]
##  [ 5  1]
##  [ 6  5]]
print("Transpose of the Array: \n", table_transpose)
## Transpose of the Array: 
##  [[99  5  6]
##  [ 2  1  5]]

To know whether an array is a view or not, we can display the base attribute, which returns None if it is not:

print("table: ", table.base)
## table:  None
print("table_transpose : ", table_transpose.base)
## table_transpose :  [[99  2]
##  [ 5  1]
##  [ 6  5]]

9.1.8 Operations on Arrays

It is possible to use operators on arrays. Their effect requires some explanation.

9.1.8.1 + and - Operators

When the operator + (-) is used between two arrays of the same size, a term-by-term addition (subtraction) is performed:

t_1 = np.array([1, 2, 3, 4])
t_2 = np.array([5, 6, 7, 8])
t_3 = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
t_4 = np.array([[13, 14, 15, 16], [17, 18, 19, 20], [21, 22, 23, 24]])
t_1 + t_2
## array([ 6,  8, 10, 12])
t_3 + t_4
## array([[14, 16, 18, 20],
##        [22, 24, 26, 28],
##        [30, 32, 34, 36]])
t_1 - t_2
## array([-4, -4, -4, -4])

When the operator + (-) is used between a scalar and an array, the scalar is added to (subtracted from) all elements of the array:

print("t_1 + 3 : \n", t_1 + 3)
## t_1 + 3 : 
##  [4 5 6 7]
print("t_1 + 3. : \n", t_1 + 3.)
## t_1 + 3. : 
##  [4. 5. 6. 7.]
print("t_3 + 3 : \n", t_3 + 3)
## t_3 + 3 : 
##  [[ 4  5  6  7]
##  [ 8  9 10 11]
##  [12 13 14 15]]
print("t_3 - 3 : \n", t_3 - 3)
## t_3 - 3 : 
##  [[-2 -1  0  1]
##  [ 2  3  4  5]
##  [ 6  7  8  9]]
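The scalar case is a particular instance of what NumPy calls broadcasting: an operand with a compatible but smaller shape is repeated along the missing dimension. The sketch below goes slightly beyond the two cases described above; the names matrix, row and col are illustrative.

```python
import numpy as np

matrix = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
row = np.array([10, 20, 30, 40])        # shape (4,)
col = np.array([[100], [200], [300]])   # shape (3, 1)

plus_row = matrix + row   # row is added to every row of matrix
plus_col = matrix + col   # col is added to every column of matrix
print(plus_row)
print(plus_col)
```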

9.1.8.2 * and / Operators

When the operator * (/) is used between two arrays of the same size, a term-by-term multiplication (division) is performed:

t_1 * t_2
## array([ 5, 12, 21, 32])
t_3 * t_4
## array([[ 13,  28,  45,  64],
##        [ 85, 108, 133, 160],
##        [189, 220, 253, 288]])
t_3 / t_4
## array([[0.07692308, 0.14285714, 0.2       , 0.25      ],
##        [0.29411765, 0.33333333, 0.36842105, 0.4       ],
##        [0.42857143, 0.45454545, 0.47826087, 0.5       ]])

When the operator * (/) is used between a scalar and an array, all the elements of the array are multiplied (divided) by this scalar:

print("t_1 * 3 : \n", t_1 * 3)
## t_1 * 3 : 
##  [ 3  6  9 12]
print("t_1 / 3 : \n", t_1 / 3)
## t_1 / 3 : 
##  [0.33333333 0.66666667 1.         1.33333333]

9.1.8.3 Power

It is also possible to raise each number in an array to a given power:

print("t_1 ** 3 : \n", t_1 ** 3)
## t_1 ** 3 : 
##  [ 1  8 27 64]

9.1.8.4 Operations on Matrices

In addition to the term-by-term addition/subtraction/multiplication/division (or with a scalar), it is possible to perform some calculations on two-dimensional arrays (matrices).

We’ve already seen the transposition of a matrix in Section 9.1.7.

To perform a matrix product, NumPy provides the function dot():

np.dot(t_3, t_4.T)
## array([[150, 190, 230],
##        [382, 486, 590],
##        [614, 782, 950]])

We have to make sure that the matrices are compatible, otherwise, an error will be returned:

np.dot(t_3, t_4)
## ValueError: shapes (3,4) and (3,4) not aligned: 4 (dim 1) != 3 (dim 0)

The matrix product can also be obtained using the operator @:

t_3 @ t_4.T
## array([[150, 190, 230],
##        [382, 486, 590],
##        [614, 782, 950]])

The product of a vector with a matrix is also possible:

np.dot(t_1, t_3.T)
## array([ 30,  70, 110])
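The compatibility rule can be summarized as follows: a product of shapes (n, k) and (k, m) gives a result of shape (n, m). A minimal sketch with illustrative arrays left and right:

```python
import numpy as np

left = np.ones((3, 4))
right = np.ones((4, 2))
product = left @ right   # inner dimensions (4 and 4) match
print(product.shape)     # (3, 2)

try:
    left @ left          # (3, 4) @ (3, 4): inner dimensions differ
except ValueError as err:
    print("ValueError:", err)
```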

9.1.9 Logical Operators

To perform logical tests on the elements of an array, NumPy offers functions, listed in Table 9.1. Applying these functions returns a Boolean array.

Table 9.1: Logical Functions
Code             Description
greater()        Greater than
greater_equal()  Greater than or equal to
less()           Less than
less_equal()     Less than or equal to
equal()          Equal to
not_equal()      Different from
logical_and()    Logical AND
logical_or()     Logical OR
logical_xor()    Logical XOR

For example, to obtain the elements of t between 10 and 20 (included):

t = np.array([[1, 10, 3, 24], [9, 12, 40, 2], [0, 7, 2, 14]])
mask = np.logical_and(t <= 20, t >= 10)
print("mask: \n", mask)
## mask: 
##  [[False  True False False]
##  [False  True False False]
##  [False False False  True]]
print("the elements of t between 10 and 20: \n",
      t[mask])
## the elements of t between 10 and 20: 
##  [10 12 14]
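The same mask can be written with the comparison operators and the element-wise operator &, as long as each condition is wrapped in parentheses. The sketch below checks that both forms agree (the names data, mask_func and mask_op are illustrative):

```python
import numpy as np

data = np.array([[1, 10, 3, 24], [9, 12, 40, 2], [0, 7, 2, 14]])
mask_func = np.logical_and(np.less_equal(data, 20), np.greater_equal(data, 10))
mask_op = (data <= 20) & (data >= 10)

print(np.array_equal(mask_func, mask_op))   # True
print(data[mask_op])                        # [10 12 14]
```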

9.1.10 Some Constants

NumPy provides some constants, some of which are shown in Table 9.2.

Table 9.2: Some Constants
Code            Description
np.inf          Infinity (we get \(-\infty\) by writing -np.inf; the alias np.NINF only exists in NumPy versions before 2.0)
np.nan          Floating point representation of Not a Number
np.e            Euler’s number (\(e\))
np.euler_gamma  Euler-Mascheroni constant (\(\gamma\))
np.pi           Pi (\(\pi\))

We can note the presence of the value NaN, which is a special value among the floating point numbers. The behavior of this constant is special.

When we add, subtract, multiply or divide a number by this NaN value, we obtain NaN:

print("Addition : ", np.nan + 1)
## Addition :  nan
print("Subtraction : ", np.nan - 1)
## Subtraction :  nan
print("Multiplication : ", np.nan * 1)
## Multiplication :  nan
print("Division : ", np.nan / 1)
## Division :  nan
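Another special behavior is that NaN is not equal to anything, including itself, so an equality test cannot detect it; the isnan() function is used instead. A short sketch (the name readings is illustrative):

```python
import numpy as np

print(np.nan == np.nan)   # False: NaN is not equal to itself
print(np.isnan(np.nan))   # True

readings = np.array([1.0, np.nan, 3.0])
nan_mask = np.isnan(readings)
print(nan_mask)               # [False  True False]
print(readings[~nan_mask])    # [1. 3.]
```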

9.1.11 Universal functions

Universal functions (ufuncs) are functions that can be applied term-by-term to the elements of an array. There are two types of universal functions: unary functions, which perform an operation on a single operand, and binary functions, which perform an operation on two operands.

Among the ufuncs are arithmetic operations (addition, multiplication, power, absolute value, etc.) and common mathematical functions (trigonometric, exponential, logarithmic functions, etc.). Table 9.3 lists some unary universal functions, while Table 9.4 lists some binary universal functions.

Table 9.3: Unary Universal Functions
Code                                Description
negative(x)                         Opposites of the elements of x
absolute(x)                         Absolute values of the elements of x
sign(x)                             Signs of the elements of x (0, 1 or -1)
rint(x)                             Elements of x rounded to the nearest integer
floor(x)                            Largest integer less than or equal to each element of x
ceil(x)                             Smallest integer greater than or equal to each element of x
sqrt(x)                             Square roots of the elements of x
square(x)                           Squares of the elements of x
sin(x), cos(x), tan(x)              Sine (cosine, and tangent) of the elements of x
sinh(x), cosh(x), tanh(x)           Hyperbolic sine (cosine, and tangent) of the elements of x
arcsin(x), arccos(x), arctan(x)     Arc-sine (arc-cosine, and arc-tangent) of the elements of x
arcsinh(x), arccosh(x), arctanh(x)  Hyperbolic arc-sine (arc-cosine, and arc-tangent) of the elements of x
degrees(x)                          Conversion of the angle values of x from radians to degrees
radians(x)                          Conversion of the angle values of x from degrees to radians
exp(x)                              Exponential of the values of x
expm1(x)                            \(e^x-1\)
log(x)                              Natural logarithm of the elements of x
log10(x)                            Logarithm of the elements of x in base 10
log2(x)                             Logarithm of the elements of x in base 2
log1p(x)                            \(\ln(1+x)\)
exp2(x)                             \(2^x\)
isnan(x)                            Boolean array indicating True for NaN elements
isfinite(x)                         Boolean array indicating True for elements that are neither infinite nor NaN
isinf(x)                            Boolean array indicating True for infinite elements

Table 9.4: Binary Universal Functions
Code               Description
add(x,y)           Term-by-term addition of the elements of x and y
subtract(x,y)      Term-by-term subtraction of the elements of y from those of x
multiply(x,y)      Term-by-term multiplication of the elements of x and y
divide(x,y)        Term-by-term division of the elements of x by those of y
floor_divide(x,y)  Largest integer less than or equal to the division of the elements of x by those of y
power(x,y)         Elements of x to the power of the elements of y
mod(x,y)           Term-by-term remainder of the division of the elements of x by the elements of y
hypot(x,y)         Hypotenuse \(\sqrt{x^2+y^2}\)
round(x,n)         Rounded value of the elements of x to \(n\) decimal places
arctan2(x,y)       Term-by-term arc-tangent of x/y, choosing the quadrant from the signs of x and y

To use these functions, proceed as in the following example:

t_1 = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
t_2 = np.array([[13, 14, 15, 16], [17, 18, 19, 20], [21, 22, 23, 24]])
np.log(t_1) # Natural Logarithm
## array([[0.        , 0.69314718, 1.09861229, 1.38629436],
##        [1.60943791, 1.79175947, 1.94591015, 2.07944154],
##        [2.19722458, 2.30258509, 2.39789527, 2.48490665]])
np.subtract(t_1, t_2) # Subtraction of the elements of `t_2` from those of `t_1`
## array([[-12, -12, -12, -12],
##        [-12, -12, -12, -12],
##        [-12, -12, -12, -12]])
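A few more calls, as a sketch: unary and binary ufuncs accept arrays as well as scalars, and they are all instances of the np.ufunc type. The names angles, in_degrees, shifted and hyp are illustrative.

```python
import numpy as np

angles = np.array([0.0, np.pi / 2, np.pi])
in_degrees = np.degrees(angles)   # unary ufunc, applied to every element
shifted = np.add(angles, 1.0)     # binary ufunc with a scalar operand
hyp = np.hypot(3.0, 4.0)          # binary ufunc on two scalars

print(in_degrees)
print(shifted)
print(hyp)                           # 5.0
print(isinstance(np.add, np.ufunc))  # True
```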

9.1.12 Mathematical and Statistical Methods and Functions

NumPy provides many methods to calculate statistics on all array values, or on one of the array axes (for example on the equivalent of rows or columns in two-dimensional arrays). Some of them are reported in Table 9.5.

Table 9.5: Mathematical and Statistical Methods
Code       Description
sum()      Returns the sum of the elements
prod()     Returns the product of the elements
cumsum()   Returns the cumulative sum of the elements
cumprod()  Returns the cumulative product of the elements
mean()     Returns the average
var()      Returns the variance
std()      Returns the standard deviation
min()      Returns the minimum value
max()      Returns the maximum value
argmin()   Returns the index of the first element with the lowest value
argmax()   Returns the index of the first element with the largest value

Let’s give an example of the use of these methods:

t_1 = np.array([[1, 2, 3, 4], [-1, 6, 7, 8], [9, -1, 11, 12]])
print("t_1 : \n", t_1)
## t_1 : 
##  [[ 1  2  3  4]
##  [-1  6  7  8]
##  [ 9 -1 11 12]]
print("Sum of the elements: ", t_1.sum())
## Sum of the elements:  61
print("Variance of the elements: ", t_1.var())
## Variance of the elements:  18.07638888888889

To apply these functions to a given axis, we modify the value of the argument axis:

print("Sum per column: ", t_1.sum(axis=0))
## Sum per column:  [ 9  7 21 24]
print("Sum per row: ", t_1.sum(axis=1))
## Sum per row:  [10 20 31]
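The axis argument works the same way with the other methods of Table 9.5. A short sketch on the same values (the name stats_demo is illustrative):

```python
import numpy as np

stats_demo = np.array([[1, 2, 3, 4], [-1, 6, 7, 8], [9, -1, 11, 12]])
col_means = stats_demo.mean(axis=0)     # mean of each column
row_max = stats_demo.max(axis=1)        # maximum of each row
flat_argmax = stats_demo.argmax()       # position in the flattened array
row_cumsum = stats_demo.cumsum(axis=1)  # cumulative sum along each row

print(col_means)
print(row_max)       # [ 4  8 12]
print(flat_argmax)   # 11
print(row_cumsum)
```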

NumPy also offers some functions specific to statistics, some of which are listed in Table 9.6.

Table 9.6: Statistical Functions
Code                                 Description
sum(x), nansum(x)                    Sum of the elements of x (nansum(x) does not take into account NaN values)
mean(x), nanmean(x)                  Average of x
median(x), nanmedian(x)              Median of x
average(x)                           Average of x (possibility to use weights using the weights argument)
min(x), nanmin(x)                    Minimum of x
max(x), nanmax(x)                    Maximum of x
percentile(x,p), nanpercentile(x,p)  P-th percentile of x
var(x), nanvar(x)                    Variance of x
std(x), nanstd(x)                    Standard deviation of x
cov(x)                               Covariance matrix of x
corrcoef(x)                          Correlation coefficients

To use the statistical functions:

t_1 = np.array([[1, 2, 3, 4], [-1, 6, 7, 8], [9, -1, 11, 12]])
print("t_1 : \n", t_1)
## t_1 : 
##  [[ 1  2  3  4]
##  [-1  6  7  8]
##  [ 9 -1 11 12]]
print("Variance: ", np.var(t_1))
## Variance:  18.07638888888889

If the array contains NaN values, functions such as sum() return NaN. To ignore the NaN values, we use the dedicated variant of the function (here, nansum()):

t_1 = np.array([[1, 2, np.nan, 4], [-1, 6, 7, 8], [9, -1, 11, 12]])
print("Sum: ", np.sum(t_1))
## Sum:  nan
print("Sum ignoring NaN values: ", np.nansum(t_1))
## Sum ignoring NaN values:  58.0

To calculate a weighted average (let’s consider a vector):

v_1 = np.array([1, 1, 4, 2])
w = np.array([1, 1, .5, 1])
print("Weighted average: ", np.average(v_1, weights=w))
## Weighted average:  1.7142857142857142
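The weighted average is the sum of the products of values and weights, divided by the sum of the weights, \(\sum_i w_i x_i / \sum_i w_i\). The sketch below recomputes it by hand and compares it with np.average() (the names vals, wts and manual are illustrative):

```python
import numpy as np

vals = np.array([1, 1, 4, 2])
wts = np.array([1, 1, 0.5, 1])
manual = np.sum(vals * wts) / np.sum(wts)   # (1 + 1 + 2 + 2) / 3.5
print(manual)
print(np.isclose(manual, np.average(vals, weights=wts)))   # True
```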

9.2 Generation of Pseudo-random Numbers

Pseudo-random numbers can be generated with the random module of NumPy. The reader interested in the more statistical aspects will find more concepts covered in the stats sub-module of SciPy.

from numpy import random

Table 9.7 lists some functions of the random module of NumPy that draw numbers in a pseudo-random way (the documentation of the module gives an exhaustive list).

Table 9.7: Some Functions for Pseudo-random Number Generation
Code                     Description
rand(size)               Drawing size obs. from a Uniform distribution on \([0,1)\)
uniform(a,b,size)        Drawing size obs. from a Uniform distribution on \([a, b)\)
randint(a,b,size)        Drawing size obs. from a discrete Uniform distribution on the integers of \([a, b)\)
randn(size)              Drawing size obs. from a Normal distribution \(\mathcal{N}(0,1)\)
normal(mu, std, size)    Drawing size obs. from a Normal distribution with mean mu and standard deviation std
binomial(n, p, size)     Drawing size obs. from a Binomial distribution \(\mathcal{B}in(n,p)\)
beta(alpha, beta, size)  Drawing size obs. from a Beta distribution \(Beta(\alpha, \beta)\)
poisson(lam, size)       Drawing size obs. from a Poisson distribution \(\mathcal{P}(\lambda)\), with lam the value of \(\lambda\)
standard_t(df, size)     Drawing size obs. from a Student distribution \(\mathcal{S}t(\text{df})\)

Here is an example of generating pseudo-random numbers from a Gaussian distribution:

x = np.random.normal(size=10)
print(x)
## [ 7.14843007e-02 -1.71327067e+00 -2.30772472e+00  1.49569364e-01
##  -1.73425246e+00  9.90838613e-01  8.63645135e-01  1.42007474e-03
##   1.58888406e+00  3.32331878e-01]

A multidimensional array can be generated. For example, a two-dimensional array, in which the first dimension contains 10 elements, each containing 4 random draws from a \(\mathcal{N}(0,1)\) distribution:

x = np.random.randn(10, 4)
print(x)
## [[ 0.24653503 -0.621938    0.29551242 -1.10261818]
##  [ 1.34433925  0.46086007 -0.1577243  -0.25985367]
##  [-0.08698297 -0.83880423  0.19511641  0.17766021]
##  [ 0.27062363  1.32670576  0.87167645  0.35953307]
##  [ 0.19270476 -0.56278371 -0.1240118   0.4494846 ]
##  [ 0.354656   -0.86251269  0.20154409  1.79497518]
##  [-0.22333828 -0.69515443 -2.13235873 -0.54600714]
##  [ 1.30397523  0.7965243  -0.34589564  0.05624394]
##  [-0.35758014 -0.11945162  3.05167861  0.07784099]
##  [ 1.4045547   1.71768057 -0.40108593  0.08300445]]

The generation of numbers is based on a seed, i.e., a number that initializes the pseudo-random number generator. It is possible to fix this seed so that reproducible results can be obtained. To do this, we can use the seed() function, to which we pass a value as an argument:

np.random.seed(1234)
x = np.random.normal(size=10)
print(x)
## [ 0.47143516 -1.19097569  1.43270697 -0.3126519  -0.72058873  0.88716294
##   0.85958841 -0.6365235   0.01569637 -2.24268495]

By fixing the seed again, one obtains exactly the same draw:

np.random.seed(1234)
x = np.random.normal(size=10)
print(x)
## [ 0.47143516 -1.19097569  1.43270697 -0.3126519  -0.72058873  0.88716294
##   0.85958841 -0.6365235   0.01569637 -2.24268495]

To avoid affecting the global environment with the random seed, the RandomState class of the random sub-module of NumPy can be used:

from numpy.random import RandomState
rs = RandomState(123)
# A single draw: the first positional argument of normal() is the mean (here 10)
x = rs.normal(10)
print(x)
## 8.914369396699438
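Recent versions of NumPy (1.17 and later) also provide the default_rng() function, which returns an independent generator object with its own seed. The sketch below is an alternative to RandomState, not the approach used in this chapter; the names rng_a, rng_b, draws_a and draws_b are illustrative.

```python
import numpy as np

rng_a = np.random.default_rng(1234)   # independent generator with its own seed
rng_b = np.random.default_rng(1234)   # same seed, hence the same stream

draws_a = rng_a.normal(loc=0, scale=1, size=5)
draws_b = rng_b.normal(loc=0, scale=1, size=5)
print(np.array_equal(draws_a, draws_b))   # True

small_ints = rng_a.integers(0, 10, size=3)   # integers in [0, 10)
print(small_ints)
```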

In addition, the permutation() function of the random sub-module returns a randomly permuted copy of an array:

x = np.arange(10)
y = np.random.permutation(x)
print("x : ", x)
## x :  [0 1 2 3 4 5 6 7 8 9]
print("y : ", y)
## y :  [9 7 4 3 8 2 6 1 0 5]

The shuffle() function of the random sub-module performs the random permutation in place, directly modifying the array it receives. By contrast, permutation() returns a new array and leaves its argument unchanged:

x = np.arange(10)
print("x before permutation : ", x)
## x before permutation :  [0 1 2 3 4 5 6 7 8 9]
np.random.permutation(x)
## array([7, 5, 4, 1, 0, 8, 3, 9, 6, 2])
print("x after permutation : ", x)
## x after permutation :  [0 1 2 3 4 5 6 7 8 9]
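The in-place behavior of shuffle() can be sketched as follows. The names deck and result are illustrative, and the order obtained differs from one run to the next unless the seed is fixed.

```python
import numpy as np

deck = np.arange(10)
result = np.random.shuffle(deck)   # shuffles deck in place
print(result)                      # None: nothing is returned
print(deck)                        # the same ten values, in a random order
```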

9.3 Exercise

First exercise

Consider the following vector: \(x = \begin{bmatrix}1 & 2 & 3 & 4 & 5\end{bmatrix}\)

  1. Create this vector using an array called x.
  2. Display the type of x and its length.
  3. Extract the first element, then do the same with the last one.
  4. Extract the first three elements and store them in a vector called a.
  5. Extract the 1st, 2nd and 5th elements of the vector (be careful with the positions); store them in a vector called b.
  6. Add the number 10 to the vector x, then multiply the result by 2.
  7. Add a and b, comment on the result.
  8. Make the following addition: x+a; comment on the result, then look at the result of a+x.
  9. Multiply the vector by the scalar c, which will be set to 2.
  10. Multiply a and b; comment on the result.
  11. Perform the following multiplication: x*a; comment on the results.
  12. Retrieve the positions of the multiples of 2 and store them in a vector called ind, then store only the multiples of 2 of x in a vector called mult_2.
  13. Display the elements of x that are multiples of 3 and multiples of 2.
  14. Display the elements of x that are multiples of 3 or multiples of 2.
  15. Calculate the sum of the elements of x.
  16. Replace the first element of x with a 4.
  17. Replace the first element of x with the value NaN, then calculate the sum of the elements of x.
  18. Delete the vector x.

Second exercise

  1. Create the following matrix: \(A = \begin{bmatrix} -3 & 5 & 6 \\ -1 & 2 & 2 \\ 1 & -1 & -1 \end{bmatrix}\).
  2. Display the size of A, its number of columns, its number of rows and its length.
  3. Extract the second column from A, then the first row.
  4. Extract the element in the third position in the first row.
  5. Extract the submatrix of dimension \(2\times 2\) from the lower-right corner of A, i.e., \(\begin{bmatrix} 2 & 2 \\ -1 & -1 \end{bmatrix}\).
  6. Calculate the sum of the columns and then the rows of A.
  7. Display the diagonal of A.
  8. Add the vector \(\begin{bmatrix} 1 & 2 & 3\end{bmatrix}^\top\) to the right of the matrix A and store the result in an object called B.
  9. Remove the fourth column from B.
  10. Remove the first and third rows from B.
  11. Add scalar 10 to A.
  12. Add the vector \(\begin{bmatrix} 1 & 2 & 3\end{bmatrix}^\top\) to A.
  13. Add the identity matrix \(I_3\) to A.
  14. Divide all the elements of the matrix A by 2.
  15. Multiply the matrix A by the vector \(\begin{bmatrix} 1 & 2 & 3\end{bmatrix}^\top\).
  16. Display the transposition of A.
  17. Perform the product with transposition \(A^\top A\).