The first step in machine learning is choosing an appropriate tool to work with. I began my search for that tool and finally settled on scikit-learn. But before beginning my journey, I started building my skills in the libraries that are used along with scikit-learn.

The first library I chose to study was NumPy. As far as I have learnt, NumPy is used to handle arrays. For a more formal definition, Wikipedia says: “NumPy is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.”

Let’s begin with NumPy

`import numpy as np`

Typing the full name numpy every time would become tedious while coding, so it is imported under the short alias np.

#Creating an Array

`arr = np.array([1, 2, 3])`

print(arr)

#Creating an array with the numbers from 0 to 99 (the stop value 100 is excluded)

`arrOne = np.arange(0, 100)`

#Creating evenly spaced values within a range (linspace takes the number of samples, not a step size)

`arrTwo = np.linspace(1, 3, 5)`
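It is easy to mix up a step size with a sample count, so here is a small sketch of the difference. The variable names `stepped` and `sampled` are mine, chosen only for illustration.

```python
import numpy as np

# arange takes a step size; the stop value (3) is excluded
stepped = np.arange(1, 3, 0.5)

# linspace takes the number of samples; the stop value (3) is included by default
sampled = np.linspace(1, 3, 5)

print(stepped.tolist())  # [1.0, 1.5, 2.0, 2.5]
print(sampled.tolist())  # [1.0, 1.5, 2.0, 2.5, 3.0]
```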

In Python, lists and arrays may look like two forms of the same thing, but they differ a lot in the memory they occupy and the time taken to process them. Try running this code on your system.

#Comparing the list and array

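`pyList = list(range(0, 1000))`

arrThree = np.arange(0, 1000)

import sys

print(sys.getsizeof(1) * len(pyList)) #approximate memory used by the list's integer objects

print(arrThree.size * arrThree.itemsize) #memory used by the array's data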

import time

L1 = range(1000)

L2 = range(1000)

start = time.time()

result = [(x+y) for x,y in zip(L1,L2)]

print(time.time()-start)

AR1 = np.arange(1000)

AR2 = np.arange(1000)

start = time.time()

result = AR1 + AR2

print(time.time()-start)

If both timings come out the same (for example, both print 0.0), try increasing the range.
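A single time.time() measurement can be too coarse for work this small. One alternative, sketched here under my own assumptions (the size `n` and the repeat count of 5 are arbitrary picks, not values from any benchmark), is to use the standard timeit module and time only the addition itself.

```python
import timeit
import numpy as np

n = 1_000_000  # hypothetical size, large enough for the gap to show

py_a, py_b = list(range(n)), list(range(n))
np_a, np_b = np.arange(n), np.arange(n)

# Time only the addition, repeated 5 times; the data is created beforehand
list_time = timeit.timeit(lambda: [x + y for x, y in zip(py_a, py_b)], number=5)
array_time = timeit.timeit(lambda: np_a + np_b, number=5)

print(f"list:  {list_time:.4f} s")
print(f"array: {array_time:.4f} s")
```

The exact numbers depend on the machine, so I have not listed any here.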

NumPy arrays expose some attributes and methods that give better insight into an array.

#Understanding the array

arrFour = np.array([(1, 2, 3), (4, 5, 6),(7, 8, 9)])

print(arrFour.ndim) #Output : 2 (number of dimensions)

print(arrFour.itemsize) #Output : 4 (item size in bytes; 8 where the default integer is int64)

print(arrFour.dtype) #Output : int32 (data type; int64 on many systems)

print(arrFour.size) #Output : 9 (no. of elements)

print(arrFour.shape) #Output : (3, 3) (shape)

print(arrFour.reshape(9,1))

In NumPy, reshaping preserves the size of the array: no element is added or deleted, and the array is only transformed from one form to another.
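To see that reshaping keeps the element count fixed, here is a minimal sketch. The names `grid`, `column` and `flat` are my own, used only for this illustration.

```python
import numpy as np

grid = np.arange(1, 10).reshape(3, 3)  # 9 elements arranged as 3 x 3

column = grid.reshape(9, 1)  # the same 9 elements, now 9 x 1
flat = grid.reshape(-1)      # -1 lets NumPy infer the length (9)

print(grid.size, column.size, flat.size)  # 9 9 9

# A target shape with a different element count is rejected
try:
    grid.reshape(2, 4)
except ValueError as err:
    print("reshape failed:", err)
```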

Two different arrays can be stacked on one another, either horizontally or vertically.

arrFive = np.array([(1, 2, 3), (4, 5, 6)])

arrSix = np.array([(7, 8, 9), (10, 11, 12)])

#Column wise stacking

np.hstack((arrFive, arrSix))

#Row wise stacking

np.vstack((arrFive, arrSix))
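The resulting shapes make the difference between the two kinds of stacking easier to see. This is a small sketch with names (`top`, `bottom`) that I picked for illustration.

```python
import numpy as np

top = np.array([(1, 2, 3), (4, 5, 6)])        # shape (2, 3)
bottom = np.array([(7, 8, 9), (10, 11, 12)])  # shape (2, 3)

side_by_side = np.hstack((top, bottom))  # columns are appended
stacked = np.vstack((top, bottom))       # rows are appended

print(side_by_side.shape)     # (2, 6)
print(stacked.shape)          # (4, 3)
print(side_by_side.tolist())  # [[1, 2, 3, 7, 8, 9], [4, 5, 6, 10, 11, 12]]
```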

All the elements of an n-dimensional array can be laid out in one dimension by using the function ravel() or flatten().

arrSeven = np.array([(1, 2, 3), (4, 5, 6), (7, 8, 9)])

arrSeven.ravel()

arrSeven.flatten()

The output of both functions is the same. The only difference is that flatten() always returns a copy of the array, whereas ravel() returns a view of the original array whenever possible (and a copy only when it has to).
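The copy-versus-view behaviour can be checked directly. The sketch below uses np.shares_memory and then modifies the raveled result; the variable names are my own.

```python
import numpy as np

source = np.array([(1, 2, 3), (4, 5, 6), (7, 8, 9)])

raveled = source.ravel()      # a view here, because source is contiguous
flattened = source.flatten()  # always an independent copy

print(np.shares_memory(source, raveled))    # True
print(np.shares_memory(source, flattened))  # False

raveled[0] = 100
print(source[0, 0])  # 100, the change shows through the view
print(flattened[0])  # 1, the copy is unaffected
```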

To get a particular set of rows or columns, the following code can be used.

arrEight = np.array([(1, 2, 3), (4, 5, 6), (7, 8, 9)])

print(arrEight[0:2])

print(arrEight[0:2, 2])

In Python, array indexing starts from 0. In this code snippet the first statement prints the rows (1, 2, 3) and (4, 5, 6), while the second statement prints the third element of each of those rows, that is [3, 6].
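Here is the same slicing written out as a self-contained sketch, with one extra slice that picks a whole column. The names `matrix`, `first_two_rows`, `third_of_each` and `middle_column` are mine.

```python
import numpy as np

matrix = np.array([(1, 2, 3), (4, 5, 6), (7, 8, 9)])

first_two_rows = matrix[0:2]    # rows 0 and 1, all columns
third_of_each = matrix[0:2, 2]  # column index 2 of rows 0 and 1
middle_column = matrix[:, 1]    # every row, column index 1

print(first_two_rows.tolist())  # [[1, 2, 3], [4, 5, 6]]
print(third_of_each.tolist())   # [3, 6]
print(middle_column.tolist())   # [2, 5, 8]
```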

There are also some standard mathematical operations that can be performed on arrays, such as square root (sqrt), logarithm (log or log10), standard deviation (std), sin, etc.

arrNine = np.array([(1, 2, 3), (4, 5, 6),(7, 8, 9)])

print(np.sqrt(arrNine))

print(np.std(arrNine))

print(np.log10(arrNine))

print(np.log(arrNine))
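Some of these functions work element by element, while others reduce the array to a summary. The sketch below shows both kinds, and also the axis argument of np.std. The input values are made up (perfect squares, so the roots come out as whole numbers).

```python
import numpy as np

values = np.array([(1, 4, 9), (16, 25, 36)])

roots = np.sqrt(values)              # element-wise, same shape as the input
overall_std = np.std(values)         # a single number for the whole array
column_std = np.std(values, axis=0)  # one value per column

print(roots.tolist())    # [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
print(column_std.shape)  # (3,)
```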