This chapter introduces the basics of Python, focusing on the different types of objects available in the language, as well as control flow and function writing.
Python is a general-purpose programming language that was created by computer scientist Guido van Rossum in 1991. The language was designed to be highly readable and to encompass a wide range of programming paradigms. One of Python’s key strengths is its flexibility, which allows it to handle a variety of tasks such as web frameworks, database connectivity, networking, web scraping, text and image processing, and many other features that are useful in machine learning.
Python has its roots in computer science and mathematics and boasts one of the largest ecosystems of any programming language, with well over 100,000 open-source libraries. This makes it an ideal choice for those who value flexibility.
Python also has a rich set of data science libraries, including scikit-learn, one of the most popular machine learning libraries, which is easy to learn, supports pipelines to simplify the machine learning workflow, and has most of the algorithms you need in one place. Another common library is TensorFlow, developed by software engineers at Google for deep learning and commonly used for image recognition and natural language processing tasks. PyTorch, developed by Facebook, is also a popular deep learning framework and a competitor to TensorFlow and Keras (which is designed for efficiently building neural networks).
Python is also well suited to production settings. It integrates well with platforms like Docker, Kubernetes, and cloud environments, which is crucial for building data-driven applications.
Machine learning is commonly done in Python because of the capabilities of the scikit-learn, TensorFlow, Keras, and PyTorch ecosystem.
Python also integrates well with other technologies, such as databases (SQL, MongoDB), web frameworks (Flask, Django), cloud services, and DevOps tools, which is one reason it is frequently used in production environments.
Python works as a classical calculator: using “+”, “-”, “*” and “/”, we can perform arithmetic operations.
# Python
1+2
## 3
1-2
## -1
1/2
## 0.5
1*2
## 2
We can also apply exponentiation, modulo, and floor division easily.
# Python
2**8 # exponentiation
## 256
2^8 == 2**8 # False: in Python, ^ is the bitwise XOR operator, not exponentiation
## False
8%3 # modulo
## 2
8//3 # floor division
## 2
Operator | Description |
---|---|
+ | Addition |
- | Subtraction |
* | Multiplication |
/ | Division |
** | Exponent |
% | Modulo |
// | Floor Division |
To compare values, we use comparison operators to determine if a value is equal to, not equal to, greater than, etc. Each comparison returns a Boolean value, True or False.
# Python
2==8
## False
2!=8
## True
2<8
## True
2>8
## False
2<=8
## True
Operator | Description |
---|---|
< | Less than |
> | Greater than |
<= | Less than or equal to |
>= | Greater than or equal to |
== | Equal to |
!= | Not equal to |
# Python
x = [True,True]
y = [True,False]
not x[0]
## False
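x and y # x is a non-empty list (truthy), so 'and' returns y
## [True, False]
x or y # x is truthy, so 'or' returns x
## [True, True]
Operator | Description |
---|---|
not | Logical NOT |
and | Logical AND (on lists, returns one of its operands; not element-wise) |
or | Logical OR (on lists, returns one of its operands; not element-wise) |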
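Although the outputs above can look element-wise, 'and' and 'or' applied to two lists simply return one of the two lists, depending on whether the first one is empty. The following minimal sketch makes the difference visible and shows one possible way to get an element-wise result with zip; the variable names (p, q, elementwise_and, elementwise_or) are illustrative and not part of the original example.

```python
p = [True, False]
q = [False, True]

# 'and' / 'or' look at the truthiness of the whole list:
# any non-empty list is truthy, whatever it contains.
whole_and = p and q   # p is truthy, so the result is q
whole_or = p or q     # p is truthy, so the result is p

# Element-wise versions, pairing the elements with zip.
elementwise_and = [a and b for a, b in zip(p, q)]
elementwise_or = [a or b for a, b in zip(p, q)]

print(whole_and)        # [False, True]
print(whole_or)         # [True, False]
print(elementwise_and)  # [False, False]
print(elementwise_or)   # [True, True]
```

For large numerical data, array libraries usually provide dedicated element-wise operators, but the comprehension above is enough for plain lists.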
Exercise 1: Write a Python expression that checks if a number is both greater than 10 and less than 20.
number = 15
result = 10 < number < 20
print(result)
## True
Exercise 2: Write a Python expression that checks if a string is either “yes” or “no”.
string = "yes"
result = string == "yes" or string == "no"
print(result)
## True
In Python, the ‘in’ and ‘not in’ operators test membership in a container such as a list, tuple, string, set, or dictionary. On a string, ‘in’ checks for a substring: for instance, we can check whether ‘world’ occurs in the phrase ‘Hello world’. On a list, it checks whether a value is one of the elements. On a dictionary, it checks the keys, not the values.
# Python
x = 'Hello world'
y = {1:'a',2:'b'}
print('world' in x)
## True
print(1 in y)
## True
print('a' in y)
## False
# Python
x = ['Hello','World']
print('Hello' in x)
## True
Exercise 1: Write a Python expression that checks if the letter “a” is present in the string “banana”.
string = "banana"
result = "a" in string
print(result)
## True
Exercise 2: Write a Python expression that checks if the number 5 is not present in the list [1, 2, 3, 4].
list_ = [1, 2, 3, 4]
result = 5 not in list_
print(result)
## True
Python does not require declaring a variable before assigning a value to it. Variables can be thought of as names that refer to an object. Keep in mind that several variables can refer to the same object in memory, a behavior that differs from R.
In Python, textual data is referred to as a ‘string’, abbreviated as ‘str’. You can use either double quotes (") or single quotes (') when defining a textual variable, and it’s possible to explicitly convert a value to a string with str() if needed.
Numerical variables in Python are divided into three types: integer, float, and complex. The ‘float’ data type has a precision of about 15 significant decimal digits. Higher precision can be achieved with dedicated libraries (for example NumPy's extended-precision types, where the platform supports them).
Finally, it’s easy to check the data type of a variable by using the built-in type() function.
# Python
import sys
sys.float_info.dig # number of decimal digits a float can represent reliably
## 15
a = 1
type(a)
## <class 'int'>
b = 1.1
type(b)
## <class 'float'>
c = 1.1+2j
type(c)
## <class 'complex'>
d = 'd'
type(d)
## <class 'str'>
# change the type
e = 2
f = str(e)
f
## '2'
float(f)
## 2.0
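The limited precision of floats has visible consequences in everyday arithmetic. The sketch below illustrates this with the classic 0.1 + 0.2 case and shows two standard-library alternatives, decimal and fractions, which are not mentioned in the text above and are used here only as one possible way to get exact results.

```python
import math
from decimal import Decimal
from fractions import Fraction

# Binary floats cannot represent 0.1 exactly, so rounding errors appear.
float_sum = 0.1 + 0.2
print(float_sum == 0.3)               # False
print(math.isclose(float_sum, 0.3))   # True: compare with a tolerance instead

# Decimal works in base 10 when it is built from strings.
decimal_sum = Decimal("0.1") + Decimal("0.2")
print(decimal_sum == Decimal("0.3"))  # True

# Fraction stores exact rational numbers.
fraction_sum = Fraction(1, 10) + Fraction(2, 10)
print(fraction_sum == Fraction(3, 10))  # True
```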
In Python, the most common built-in containers are lists, tuples, sets, and dictionaries. It’s important to note that assigning with ‘=’ binds a variable name to an object (or rebinds it if the name already exists), whereas calling a method with ‘.’ (for example a.append(1)) can modify the object behind the variable in place.
Exercise 1: Create a variable containing your full name, then extract and print only your first name using string slicing.
full_name = "John Smith"
first_name = full_name[:full_name.index(" ")] # slice up to (not including) the first space
print(first_name)
## John
Exercise 2: Write a Python expression that checks if ‘ab’ is present in the list [‘aa’,‘bb’,‘ab’,‘ba’].
list_ = ['aa','bb','ab','ba']
result = 'ab' in list_
print(result)
## True
Keep in mind that Lists are mutable.
# Python
a = [1, 2, 3]
a.count(2) # count elements of the list which are exactly equal to 2
## 1
a.sort(reverse = True)
a
## [3, 2, 1]
# access the element of a list
a[0]
## 3
a.index(3)
## 0
a[1:]
## [2, 1]
a[:1]
## [3]
a[0:-1]
## [3, 2]
a[:]
## [3, 2, 1]
# modify
b = [0, 0, 0]
list(zip(a,b)) # zip pairs the elements; it also works with more than two sequences, e.g. zip(a,b,c)
## [(3, 0), (2, 0), (1, 0)]
a.append(b)
a
## [3, 2, 1, [0, 0, 0]]
a[4:5] = ['a','b']
a
## [3, 2, 1, [0, 0, 0], 'a', 'b']
a.extend([4, 5, 6])
a
## [3, 2, 1, [0, 0, 0], 'a', 'b', 4, 5, 6]
a += [7,8] # works as extend
a
## [3, 2, 1, [0, 0, 0], 'a', 'b', 4, 5, 6, 7, 8]
a.insert(2,[1,2])
a
## [3, 2, [1, 2], 1, [0, 0, 0], 'a', 'b', 4, 5, 6, 7, 8]
a.remove(b)
a
## [3, 2, [1, 2], 1, 'a', 'b', 4, 5, 6, 7, 8]
a = a*2 # replicate the list n times
len(a) # number of elements in the list
## 22
# mutable
a = [1,2,3]
b = a
b[0] = 12
a
## [12, 2, 3]
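Because b = a only creates a second name for the same list, an explicit copy is needed when the two should evolve independently. The following sketch, with illustrative variable names, contrasts an alias, a shallow copy, and a deep copy using the standard copy module.

```python
import copy

original = [1, 2, [3, 4]]

alias = original                 # same object, nothing is copied
shallow = original.copy()        # new outer list, but the inner list is shared
deep = copy.deepcopy(original)   # fully independent copy

original[0] = 99        # only visible through the alias
original[2].append(5)   # visible through the alias and the shallow copy

print(alias)    # [99, 2, [3, 4, 5]]
print(shallow)  # [1, 2, [3, 4, 5]]
print(deep)     # [1, 2, [3, 4]]
```

A shallow copy is usually enough for flat lists; a deep copy is only needed when the list contains other mutable objects.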
Exercise 1: Given a list [5, 2, 8, 1, 9], sort the list in ascending order.
my_list = [5, 2, 8, 1, 9]
my_list.sort()
print(my_list)
## [1, 2, 5, 8, 9]
Exercise 2: Given a list [‘apple’, ‘banana’, ‘apple’, ‘orange’], write code to count the number of times “apple” appears.
my_list = ['apple', 'banana', 'apple', 'orange']
count = my_list.count('apple')
print(count)
## 2
The main difference between lists and tuples is that tuples are immutable: once a tuple is created, its elements cannot be reassigned. This can make tuples slightly faster and lighter in memory than lists.
# Python
a = (1, 2, 3)
a.count(2) # count elements of the tuple which are exactly equal to 2
## 1
a
## (1, 2, 3)
# access the element of a tuple
a[0]
## 1
a.index(3)
## 2
a[1:]
## (2, 3)
a[:1]
## (1,)
a[0:-1]
## (1, 2)
a[:]
## (1, 2, 3)
# "modify": += builds a new tuple and rebinds the name a to it
a += (4,5)
a
## (1, 2, 3, 4, 5)
a = a*2 # replicate the tuple n times
len(a) # number of elements in the tuple
## 10
# immutable
a = (1,2,3)
b = a
b += (4,5)
a
## (1, 2, 3)
b[0] = 3 #immutable
## 'tuple' object does not support item assignment
Exercise 1: Create a tuple with the values (10, 20, 30). Then, try to change the first element of the tuple and observe the error.
my_tuple = (10, 20, 30)
my_tuple[0] = 15
## 'tuple' object does not support item assignment
print(my_tuple)
## (10, 20, 30)
Exercise 2: Create a function that takes a tuple as input and returns a new tuple with the elements in reverse order, without using the built-in reversed() function.
def reverse_tuple(t):
    return t[::-1] # a slice with step -1 walks the tuple backwards
my_tuple = (1, 2, 2, 3, 4, 2)
reversed_tuple = reverse_tuple(my_tuple)
print(reversed_tuple)
## (2, 4, 3, 2, 2, 1)
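A dictionary stores data as key-value pairs: each value is accessed through its key rather than through a position. Dictionaries are not sorted, although since Python 3.7 they preserve the order in which the keys were inserted.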
# Python
a = {'a':1, 'b':2, 'c':3}
# access the element of a dictionary
a.keys()
## dict_keys(['a', 'b', 'c'])
a['a']
## 1
a.values()
## dict_values([1, 2, 3])
a.items()
## dict_items([('a', 1), ('b', 2), ('c', 3)])
a.get('a')
## 1
a.get('d',4) # return 4 if the key 'd' is missing (the dictionary is not modified)
## 4
a.pop('a') # remove the key 'a' and return the corresponding value
## 1
a
## {'b': 2, 'c': 3}
a.popitem() # remove and return the last inserted (key, value) pair
## ('c', 3)
a
## {'b': 2}
# modify
a['a'] =1
a.setdefault('d',0) # insert 'd' with the default value 0 if the key is missing, and return its value
## 0
a
## {'b': 2, 'a': 1, 'd': 0}
b = {'d':4,'e':5}
a.update(b) # update values from other dict
a
## {'b': 2, 'a': 1, 'd': 4, 'e': 5}
a.clear() # remove all items
a
## {}
# mutable
a = {'a':1, 'b':2, 'c':3}
b = a
b['b'] = [12,14]
a
## {'a': 1, 'b': [12, 14], 'c': 3}
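A very common pattern with dictionaries is to loop over the key-value pairs. The sketch below uses a small hypothetical dictionary (prices) to show that iterating over a dictionary yields its keys, that items() yields pairs that can be unpacked in the loop, and that membership tests look at keys unless values() is used.

```python
prices = {"apple": 1.5, "banana": 0.5, "cherry": 4.0}  # hypothetical data

# Iterating over a dict yields its keys, in insertion order.
keys = [key for key in prices]

# items() yields (key, value) pairs that can be unpacked directly.
expensive = []
for fruit, price in prices.items():
    if price > 1.0:
        expensive.append(fruit)

print(keys)                    # ['apple', 'banana', 'cherry']
print(expensive)               # ['apple', 'cherry']
print("apple" in prices)       # True: 'in' tests the keys
print(0.5 in prices)           # False: 0.5 is a value, not a key
print(0.5 in prices.values())  # True
```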
Exercise 1: Create a dictionary that assigns the keys “name”, “age”, and “city” to some values. Then change the value of ‘city’ by assigning a list of two cities to this key. Finally, add a third city to the list by using the append method.
my_dict = {"name": "Pierre", "age": 29, "city": "Strasbourg"}
my_dict['city'] = ["Strasbourg","Schiltigheim"]
print(my_dict)
## {'name': 'Pierre', 'age': 29, 'city': ['Strasbourg', 'Schiltigheim']}
my_dict['city'].append('Colmar')
Exercise 2: Given a dictionary {‘a’: 1, ‘b’: 2, ‘c’: 3}, write code to add a new key-value pair “d”: 4.
my_dict = {'a': 1, 'b': 2, 'c': 3}
my_dict["d"] = 4
print(my_dict)
## {'a': 1, 'b': 2, 'c': 3, 'd': 4}
Sets are unordered collections of unique elements. If the same element is given to a set several times, the duplicates are removed automatically.
# Python
a = {1, 2, 3}
a
## {1, 2, 3}
# access the element of a set
a[0] # since a set is unordered, we cannot access an element by position
## 'set' object is not subscriptable
# modify
b = {3,4,5}
a.update(b) # update values from other set
a
## {1, 2, 3, 4, 5}
# mutable
a = {1, 2, 3}
b = a
b.update([12,14])
a
## {1, 2, 3, 12, 14}
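Beyond update, sets support the usual operations of set algebra through operators. The following short sketch, with illustrative names set_a and set_b, shows union, difference, symmetric difference, and a subset test.

```python
set_a = {1, 2, 3}
set_b = {3, 4, 5}

union = set_a | set_b        # elements in set_a or set_b
difference = set_a - set_b   # elements in set_a but not in set_b
symmetric = set_a ^ set_b    # elements in exactly one of the two sets
is_subset = {1, 2} <= set_a  # True if every element of {1, 2} is in set_a

print(union == {1, 2, 3, 4, 5})   # True
print(difference == {1, 2})       # True
print(symmetric == {1, 2, 4, 5})  # True
print(is_subset)                  # True
```

Unlike update, these operators return new sets and leave set_a and set_b unchanged.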
Exercise 1: Create two sets, set1 and set2, with some overlapping elements. Then, find the intersection of the two sets.
set1 = {1, 2, 3, 4}
set2 = {3, 4, 5, 6}
intersection = set1.intersection(set2)
print(intersection)
## {3, 4}
Exercise 2: Create a set from the list [1, 2, 2, 3, 4, 4, 5] and observe how duplicate values are handled.
my_list = [1, 2, 2, 3, 4, 4, 5]
my_set = set(my_list)
print(my_set)
## {1, 2, 3, 4, 5}
To manipulate arrays in Python, we use the NumPy package. This package is very useful and will be discussed in more detail in later chapters.
It’s worth noting that NumPy fills arrays in row-major (C) order by default, whereas R creates arrays from vectors in column-major order. Passing order = 'F' to reshape reproduces the column-major behavior.
# Python
import numpy
arr = numpy.array([[1,4],[2,5],[3,6]])
arr
## array([[1, 4],
## [2, 5],
## [3, 6]])
type(arr)
## <class 'numpy.ndarray'>
vec = [1,2,3,4,5,6]
arr = numpy.reshape(vec,(3,2))
arr
## array([[1, 2],
## [3, 4],
## [5, 6]])
arr = numpy.reshape(vec,(3,2), order = 'F')
arr
## array([[1, 4],
## [2, 5],
## [3, 6]])
vec = range(1,7)
numpy.array(vec).reshape(2,3)
## array([[1, 2, 3],
##        [4, 5, 6]])
# diagonal array
numpy.diagflat([1]*3)
## array([[1, 0, 0],
## [0, 1, 0],
## [0, 0, 1]])
# Python
# access the element of an array
arr[0] # access the first row directly
## array([1, 4])
# modify
vec = [7,8]
arr = numpy.insert(arr, len(arr),vec,axis = 0) # insert vec as a new last row
arr
## array([[1, 4],
## [2, 5],
## [3, 6],
## [7, 8]])
# mutable
arr2 = arr
arr2[0] = [12,14]
arr
## array([[12, 14],
## [ 2, 5],
## [ 3, 6],
## [ 7, 8]])
Exercise 1: Create a NumPy array with the values [[1, 2, 3], [4, 5, 6]]. Then, print the element at row index 1, column index 2 (indices start at 0).
import numpy as np
my_array = np.array([[1, 2, 3], [4, 5, 6]])
print(my_array[1, 2])
## 6
Exercise 2: Create a NumPy array with the values [1, 2, 3, 4, 5, 6]. Then, reshape it into a 2x3 matrix.
import numpy as np
my_array = np.array([1, 2, 3, 4, 5, 6])
reshaped_array = my_array.reshape(2, 3)
print(reshaped_array)
## [[1 2 3]
## [4 5 6]]
Pandas DataFrames are also a very common data type in Python. The pandas package is covered in more depth in the following chapters.
# Python
import pandas
df = pandas.DataFrame(arr)
df
## 0 1
## 0 12 14
## 1 2 5
## 2 3 6
## 3 7 8
vec = [1,2,3,4,5,6]
df = pandas.DataFrame({'vec':vec,'vec1':range(2,8)})
df
## vec vec1
## 0 1 2
## 1 2 3
## 2 3 4
## 3 4 5
## 4 5 6
## 5 6 7
# Python
# access element of a Pandas Data Frame
df['vec']
## 0 1
## 1 2
## 2 3
## 3 4
## 4 5
## 5 6
## Name: vec, dtype: int64
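# modify
vec2 = range(3,9)
df['vec2'] = vec2 # add a new column from another sequence
df
##    vec  vec1  vec2
## 0    1     2     3
## 1    2     3     4
## 2    3     4     5
## 3    4     5     6
## 4    5     6     7
## 5    6     7     8
# mutable
df2 = df
df.loc[0, 'vec'] = 30 # .loc avoids chained assignment such as df['vec'][0] = 30
df2
##    vec  vec1  vec2
## 0   30     2     3
## 1    2     3     4
## 2    3     4     5
## 3    4     5     6
## 4    5     6     7
## 5    6     7     8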
Exercise 1: Create a Pandas DataFrame from a dictionary with two columns: “Name” and “Age”.
import pandas as pd
data = {"Name": ["Alice", "Bob"], "Age": [25, 24]}
df = pd.DataFrame(data)
print(df)
## Name Age
## 0 Alice 25
## 1 Bob 24
Exercise 2: Add a new column to a Pandas DataFrame that contains the square of ‘Age’.
df["Squared_Age"] = df["Age"] ** 2
print(df)
## Name Age Squared_Age
## 0 Alice 25 625
## 1 Bob 24 576
In programming, there are two main control flow tools: conditional statements and loops.
Conditional statements, also known as choices, are useful for establishing rules or conditions. They allow for modifying a value according to a certain condition, and generally allow for certain actions to be taken in specific cases.
Loops, on the other hand, allow for sequential execution of actions. They can be used to iteratively modify an object, and generally allow for a procedure to be executed multiple times. For example, we can use loops to create multiple similar objects, or to modify multiple lines in a single object.
# Python
# if, elif, else
n = 12
if n%2 == 0 :
    print('n is an even number')
## n is an even number
if n != int(n):
    print('n is not an integer')
elif n%2 == 0 :
    print('n is an even number')
else:
    print('n is not an even number')
## n is an even number
Exercise 1: Write a condition that returns “there is an ‘e’” if there is an ‘e’ in a given word, and “there is no ‘e’” otherwise.
word = 'hello'
if 'e' in word:
    print("there is an 'e'")
else:
    print("there is no 'e'")
## there is an 'e'
Exercise 2: Write a condition that categorizes a word based on its length: “short” if the word has 3 or fewer characters, “medium” if the word has between 4 and 6 characters, and “long” if the word has 7 or more characters.
if len(word) <= 3:
    print("short")
elif len(word) <= 6:
    print("medium")
else:
    print("long")
## medium
With ‘for’ loops, we iterate over a predefined number of iterations. However, in certain situations we may not know in advance how many iterations are required to complete a task.
For example, when trying to optimize a function, we may not know how many steps are needed to reach an optimum, but we can set a condition for when the algorithm is considered to have converged. In such cases, we can use a ‘while’ loop, which will iterate until a given condition is met.
# Python
seq = [1,2,None,4,None,6]
total = 0
for val in seq:
    if val is not None:
        total += val
total
## 13
# Python
import random
total = 0
while total < 1:
    rnd = random.gauss(mu = 0, sigma = 1)
    if rnd < 0:
        pass
    else:
        total += rnd
total # the value changes from run to run
## 1.474123483437766
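To make the convergence idea above concrete, here is a minimal sketch of a ‘while’ loop that stops once successive estimates are close enough. It approximates a square root with Newton's method; the function name newton_sqrt and its parameters tolerance and max_iter are illustrative choices, not something defined elsewhere in this chapter.

```python
def newton_sqrt(value, tolerance=1e-10, max_iter=100):
    """Approximate the square root of a positive number with Newton's method."""
    estimate = value
    step = float("inf")
    n_iter = 0
    # Keep iterating until the last update is smaller than the tolerance.
    # max_iter guards against an infinite loop if convergence fails.
    while abs(step) >= tolerance and n_iter < max_iter:
        new_estimate = 0.5 * (estimate + value / estimate)
        step = new_estimate - estimate
        estimate = new_estimate
        n_iter += 1
    return estimate, n_iter

root, steps = newton_sqrt(2.0)
print(root, steps)
```

The number of iterations is not known in advance: it depends on the starting point and on the tolerance, which is exactly the situation where ‘while’ is preferable to ‘for’.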
Exercise 1: Write a for loop that prints the numbers from 1 to 10.
for i in range(1, 11):
    print(i)
## 1
## 2
## 3
## 4
## 5
## 6
## 7
## 8
## 9
## 10
Exercise 2: Write a while loop that keeps prompting the user for input until they enter “quit”.
user_input = ""
while user_input != "quit":
    user_input = input("Enter something (or 'quit'): ")
Note for R users: ‘for’ loops should be avoided in R where possible, as they are very slow because they execute a function call with every iteration. Instead of ‘for’ loops, we should use vectorization and the apply family of functions for better performance. Vectorization is crucial for fast code in R.
a = 1
b = a
a += 1
print(a)
## 2
print(b)
## 1
a = [1,2]
b = a
a.append(3)
print(a)
## [1, 2, 3]
print(b)
## [1, 2, 3]
a = (1,2)
b = a
a += (3,)
print(a)
## (1, 2, 3)
print(b)
## (1, 2)
Comprehensions are a very common and appreciated feature of the Python language. Think of a comprehension as a loop whose output is stored directly in a list, set, or dict. It can be used as a filter, for example.
# Python
# List
import time
lst = [1,2,3,4]
t = time.time()
results = []
for val in lst:
    if val > 2:
        results.append(val)
time.time()-t
## 0.008852720260620117
results
## [3, 4]
t = time.time()
# the list comprehension below produces the same output as the loop above.
results = [val for val in lst if val>2]
time.time()-t
## 0.005784273147583008
results
## [3, 4]
# Python
# Set
import time
st = {1,2,3,4}
t = time.time()
results = set()
for val in st:
    if val > 2:
        results.add(val)
time.time()-t
## 0.008643150329589844
results
## {3, 4}
t = time.time()
# the set comprehension below produces the same output as the loop above.
results = {val for val in st if val>2}
time.time()-t
## 0.005830287933349609
results
## {3, 4}
# Python
# Dict
import time
dct = {'a':1,'b':2,'c':3,'d':4}
t = time.time()
results = dict()
for val in dct:
    if dct[val] > 2:
        results.update({str(val): dct[val]})
time.time()-t
## 0.010318279266357422
results
## {'c': 3, 'd': 4}
t = time.time()
# the dict comprehension below produces the same output as the loop above.
results = {str(val): dct[val] for val in dct if dct[val]>2}
time.time()-t
## 0.007021188735961914
results
## {'c': 3, 'd': 4}
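A single time.time() difference on a four-element container mostly measures noise. A more reliable comparison, sketched below under the assumption that a larger input and many repetitions are acceptable, uses the standard timeit module. The helper names with_loop and with_comprehension are illustrative, and the measured times depend on the machine, so no timings are shown.

```python
import timeit

values = list(range(1000))

def with_loop(values):
    """Filter with an explicit loop and append."""
    results = []
    for val in values:
        if val > 2:
            results.append(val)
    return results

def with_comprehension(values):
    """Same filter written as a list comprehension."""
    return [val for val in values if val > 2]

# Both approaches must return the same list before timing them.
same_output = with_loop(values) == with_comprehension(values)

# timeit runs each callable 1000 times and returns the total time in seconds.
loop_time = timeit.timeit(lambda: with_loop(values), number=1000)
comp_time = timeit.timeit(lambda: with_comprehension(values), number=1000)
print(same_output)
print(f"loop: {loop_time:.4f}s, comprehension: {comp_time:.4f}s")
```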
Exercise 1: Use list comprehension to create a new list containing only the even numbers from an existing list.
numbers = [1, 2, 3, 4, 5, 6]
even_numbers = [x for x in numbers if x % 2 == 0]
print(even_numbers)
## [2, 4, 6]
Exercise 2: Use dictionary comprehension to create a dictionary where the keys are numbers from 1 to 5 and the values are their squares.
squares = {x: x**2 for x in range(1, 6)}
print(squares)
## {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}
Functions are an important aspect of Python. Being able to write our own functions can be more efficient than searching for and understanding a pre-existing package.
Writing our own functions gives us more flexibility and a better understanding of what we are doing. However, it is important to not reinvent the wheel and instead use pre-existing packages when appropriate. It is crucial to carefully read the documentation when using packages, as they can be misleading and lead to a lot of time spent trying to understand how they work. It is also beneficial to look at the package’s source code when unsure of what a function does behind the scenes.
The full power of programming comes from the ability to be autonomous by reading, modifying, and writing code, as well as reusing pre-existing code.
# Python
import time
seq = [1,2,None,4,None,6]*120
def clean_sum(seq):
    total = 0
    for val in seq:
        if val is not None:
            total += val
    return total
t = time.time()
clean_sum(seq = seq)
## 1560
time.time() - t
## 0.0060193538665771484
def clean_sum2(seq):
    # filter(None, seq) drops every falsy value: None, but also 0 (harmless for a sum)
    total = sum(filter(None,seq))
    return total
t = time.time()
clean_sum2(seq = seq)
## 1560
time.time() - t
## 0.00640106201171875
Functions in Python are objects and can have attributes and methods like any other object. Functions can also contain data variables and even other functions.
For example, if we want to apply multiple transformations to data, we can create separate functions for each task. These functions can be stored in a list and applied sequentially with ease.
It’s worth noting that in Python, it is possible to unpack the output of a function into multiple variables by listing them on the left-hand side of the assignment.
# Python
def add_two(nb):
    nb = [i+2 for i in nb]
    return nb
def square_nb(nb):
    nb = [i**2 for i in nb]
    return nb
def global_function(nb):
    # func_list is a global variable defined below, before the first call
    for function in func_list:
        nb = function(nb)
    return nb
func_list = [add_two,square_nb]
x1, x2 = global_function([2,3])
x1
## 16
x2
## 25
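The example above relies on a global list of functions. As a possible variation, the sketch below passes the list of functions as an argument and uses functools.reduce to chain them. It also returns two values from a function to illustrate unpacking. The names apply_pipeline and min_max are hypothetical helpers introduced only for this illustration.

```python
from functools import reduce

def add_two(numbers):
    return [n + 2 for n in numbers]

def square(numbers):
    return [n ** 2 for n in numbers]

def apply_pipeline(numbers, functions):
    """Apply each function in order, feeding the output of one into the next."""
    return reduce(lambda acc, func: func(acc), functions, numbers)

def min_max(numbers):
    """Return two values at once; Python packs them into a tuple."""
    return min(numbers), max(numbers)

transformed = apply_pipeline([2, 3], [add_two, square])
lowest, highest = min_max(transformed)  # tuple unpacking
print(transformed, lowest, highest)     # [16, 25] 16 25
```

Passing the functions explicitly makes the pipeline easier to test and to reuse with a different sequence of transformations.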
Exercise 1: Write a function that takes a list of numbers as input and returns the average of the numbers. Check first that the list is not None before computing the average.
def average_list(numbers):
    # 'not numbers' is True for None and for an empty list
    if not numbers:
        return 0
    return sum(numbers) / len(numbers)
my_list = [1, 2, 3, 4, 5]
average = average_list(my_list)
print(average)
## 3.0
It is important to understand why a given function may produce an error. Determining the situations where errors may occur is not always easy, but it’s important to not be too lenient in order to avoid errors. In the examples, we will see how the flexibility of a function can lead to different results, some of which may be more efficient than others.
For example, if we expect that 99% of the time the result will contain something iterable, we would use the try/except approach. This will be faster if exceptions are truly exceptional. However, if the result is None more than 50% of the time, then using an ‘if’ statement is probably a better approach.
While an ‘if’ statement always has a cost, setting up a try/except block is relatively inexpensive. However, when an exception does occur, the cost is much higher.
# Python
import random
# Let's create a function that creates a dirty list
def create_dirty_list(None_prop):
    list_ = list()
    for i in range(10000):
        if i<None_prop*10000:
            list_.append([None])
        else:
            values = random.sample(range(1, 1000), random.sample(range(1, 50),1)[0])
            # turn some of the lists into lists of strings
            if i%10==0 :
                values = [str(v) for v in values]
            list_.append(values)
    return list_
# randomly take two values to compute the ratio
def calc1(values):
    output = values[random.sample(range(1, len(values)+1),1)[0]-1]/values[random.sample(range(1, len(values)+1),1)[0]-1]
    return output
list_ = create_dirty_list(0.5)
# Store it in a list
results = [calc1(values) for values in list_]
## unsupported operand type(s) for /: 'NoneType' and 'NoneType'
# Let's change the function (it implicitly returns None when a None is found)
def calc2(values):
    if not any(value is None for value in values):
        output = values[random.sample(range(1, len(values)+1),1)[0]-1]/values[random.sample(range(1, len(values)+1),1)[0]-1]
        return output
results = [calc2(values) for values in list_]
## unsupported operand type(s) for /: 'str' and 'str'
# Let's change the function
def calc3(values):
    if not any(value is None for value in values):
        if all(isinstance(value,int) for value in values):
            output = values[random.sample(range(1, len(values)+1),1)[0]-1]/values[random.sample(range(1, len(values)+1),1)[0]-1]
            return output
results = [calc3(values) for values in list_]
# using try
def calc_try(values):
    try:
        output = values[random.sample(range(1, len(values)+1),1)[0]-1]/values[random.sample(range(1, len(values)+1),1)[0]-1]
    except TypeError: # catch only the error we expect, not every exception
        output = None
    return output
results = [calc_try(values) for values in list_]
list_ = create_dirty_list(0.01)
t= time.time()
results = [calc3(values) for values in list_]
time.time()-t
## 0.09654116630554199
t= time.time()
results = [calc_try(values) for values in list_]
time.time()-t
## 0.0696406364440918
list_ = create_dirty_list(0.33)
t= time.time()
results = [calc3(values) for values in list_]
time.time()-t
## 0.07087230682373047
t= time.time()
results = [calc_try(values) for values in list_]
time.time()-t
## 0.07316446304321289
list_ = create_dirty_list(0.75)
t= time.time()
results = [calc3(values) for values in list_]
time.time()-t
## 0.030693531036376953
t= time.time()
results = [calc_try(values) for values in list_]
time.time()-t
## 0.07138562202453613
Exercise 1: Write a function that divides two numbers and uses a try-except block to handle the case where the second number is zero.
def divide_numbers(a, b):
    try:
        return a / b
    except ZeroDivisionError:
        return "Cannot divide by zero"
result1 = divide_numbers(10, 2)
result2 = divide_numbers(10, 0)
print(result1)
## 5.0
print(result2)
## Cannot divide by zero
Translate the R solutions below into Python scripts.
Exercise 1:
If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23.
Find the sum of all the multiples of 3 or 5 below 10000.
# R
get_sum_multiples_below1 <- function(below,multiple_1,multiple_2){
  sum <- 0
  for(i in 1:(below-1)){
    if((i %% multiple_1 == 0)|(i %% multiple_2 == 0)){
      sum <- sum + i
    }
  }
  return(sum)
}
t <- Sys.time()
get_sum_multiples_below1(10000,3,5)
## [1] 23331668
Sys.time() - t
## Time difference of 0.03132796 secs
# Python
def get_sum_multiples_below2(below,multiple_1,multiple_2):
    multiples = [i if ((i % multiple_1 == 0) or (i % multiple_2 == 0)) else 0 for i in range(below)]
    return sum(multiples)
t = time.time()
get_sum_multiples_below2(10000,3,5)
## 23331668
time.time() - t
## 0.008404254913330078
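The loop-based solutions above visit every integer below the limit. As an alternative sketch, the same sum can be obtained in constant time from the formula for an arithmetic series combined with inclusion-exclusion (multiples of both numbers are counted twice and must be subtracted once). The function names below are illustrative and not part of the original exercise.

```python
import math

def sum_multiples_below(below, k):
    """Sum of the positive multiples of k strictly below 'below'."""
    n = (below - 1) // k         # how many multiples of k are below the limit
    return k * n * (n + 1) // 2  # k * (1 + 2 + ... + n)

def sum_multiples_of_either(below, a, b):
    """Sum of the numbers below 'below' that are multiples of a or b."""
    lcm = a * b // math.gcd(a, b)  # multiples of both a and b
    return (sum_multiples_below(below, a)
            + sum_multiples_below(below, b)
            - sum_multiples_below(below, lcm))

print(sum_multiples_of_either(10, 3, 5))     # 23
print(sum_multiples_of_either(10000, 3, 5))  # 23331668
```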
Exercise 2:
By listing the first six prime numbers: 2, 3, 5, 7, 11, and 13, we can see that the 6th prime is 13.
What is the 10 001st prime number?
# R
nth_prime <- function(nth){
  primes = c(1,2)
  i = 2
  while(length(primes)<nth){
    i <- i+1
    is_prime = TRUE
    for(j in 2:(round(sqrt(i))+1)){
      if(i%%j == 0){
        is_prime = FALSE
        break
      }
    }
    if(is_prime == TRUE){
      primes = c(primes,i)
    }
  }
  return(primes[nth])
}
t <- Sys.time()
nth_prime(10001)
## [1] 104743
Sys.time() - t
## Time difference of 0.6678371 secs
# Python
import math
def nth_prime(nth):
    primes = [2]
    i = 2
    while len(primes) < nth:
        i+=1
        is_prime = True
        # trial division up to the integer square root of i
        for j in range(2,math.isqrt(i)+1):
            if i%j == 0:
                is_prime = False
                break
        if is_prime:
            primes.append(i)
    return primes[-1]
t = time.time()
nth_prime(10001)
## 104743
time.time() - t
## 0.21979498863220215
Exercise 3:
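You are given the following information, but you may prefer to do some research for yourself: 1 Jan 1900 was a Monday; April, June, September and November have thirty days, February has twenty-eight days (twenty-nine in leap years) and all other months have thirty-one; a leap year occurs on any year evenly divisible by 4, but not on a century unless it is divisible by 400.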
How many Sundays fell on the first of the month during the twentieth century (1 Jan 1901 to 31 Dec 2000)?
# R
get_sundays <- function(year,first_sunday){
  # Get number of days per month
  if((year %% 100 != 0 & year%%4 == 0) | year %% 400 == 0){
    month_length = c(31,29,31,30,31,30,31,31,30,31,30,31)
  } else {
    month_length = c(31,28,31,30,31,30,31,31,30,31,30,31)
  }
  # total number of days
  nb_days = sum(month_length)
  # position of the first days of the month
  cumsum_year = cumsum(month_length)-month_length+1
  # position of all sundays
  sundays = seq(first_sunday,nb_days,7)
  # get first sunday of the following year
  next_sunday_position = sundays[length(sundays)] - nb_days + 7
  nb_sundays_first = length(which(sundays %in% cumsum_year))
  return(c(next_sunday_position,nb_sundays_first))
}
t = Sys.time()
# Initialize with 1900, we know that the first sunday is the 7th.
year_result = get_sundays(1900,7)
# compute the sum for each year
nb_sundays = 0
for(i in seq(1901,2000)){
  next_sunday = year_result[1]
  year_result = get_sundays(i,next_sunday)
  nb_sundays = nb_sundays + year_result[2]
}
Sys.time() - t
## Time difference of 0.01941395 secs
# results
nb_sundays
## [1] 171
# Python
import time
import numpy as np
def get_sundays(year,first_sunday):
    # check whether the year is a leap year
    if (year % 100 != 0 and year%4 == 0) or year % 400 == 0 :
        month_length = [31,29,31,30,31,30,31,31,30,31,30,31]
    else:
        month_length = [31,28,31,30,31,30,31,31,30,31,30,31]
    # total number of days
    nb_days = sum(month_length)
    # position of the first day of each month (1-based day of the year)
    cumsum_year = np.cumsum(month_length)-month_length+1
    # position of all sundays (nb_days+1 so that 31 Dec is included, as in R's seq)
    sundays = list(range(first_sunday,nb_days+1,7))
    # get first sunday of the following year
    next_sunday_position = sundays[-1] - nb_days + 7
    nb_sundays_first = int(np.isin(sundays,cumsum_year).sum())
    return next_sunday_position, nb_sundays_first
t = time.time()
# Initialize with 1900: 1 Jan 1900 was a Monday, so the first sunday is the 7th.
year_result = get_sundays(1900,7)
# compute the sum for each year
nb_sundays = 0
for i in range(1901,2001):
    next_sunday = year_result[0]
    year_result = get_sundays(i,next_sunday)
    nb_sundays = nb_sundays + year_result[1]
time.time() - t
## 0.016038894653320312
nb_sundays
## 171
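The result can be cross-checked independently with the standard datetime module, which already knows the calendar rules. This is only a verification sketch, not a translation of the R solution; the name nb_sundays_check is illustrative.

```python
from datetime import date

# date.weekday() returns 0 for Monday through 6 for Sunday.
SUNDAY = 6

nb_sundays_check = sum(
    1
    for year in range(1901, 2001)
    for month in range(1, 13)
    if date(year, month, 1).weekday() == SUNDAY
)
print(nb_sundays_check)  # 171
```

The same module confirms the starting point used above: date(1900, 1, 7).weekday() returns 6, so the first Sunday of 1900 is indeed 7 January.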