Python Tutorial - Week 2

In Week 1 we got started with Python. Now that we can interact with Python, let's dig deeper into it.

This week we will go over some additional fundamentals common to almost any program: interactive input from users, comments in your code, conditional logic (if/else conditions), loops, and formatted output with strings and print() calls.

User Inputs

There are hardly any programs without any input. Input can come in various ways, for example from a database, another computer, mouse clicks and movements or from the internet. Yet, in most cases the input stems from the keyboard. For this purpose, Python provides the function input(). input() has an optional parameter, which is the prompt string, i.e. the text that will be shown when asking for input.

In [1]:
name = input("What's your name? ")
print("Nice to meet you " + name + "!")
age = input("Your age? ")
print("So, you are already " + age + " years old, " + name + "!")

What's your name? Sadanand
Nice to meet you Sadanand!
Your age? 30
So, you are already 30 years old, Sadanand!

What if you try to do some mathematical operation on the age? You will get a TypeError as follows:

In [2]:
age = 12 + age

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-2-3d9ce720d6f3> in <module>()
----> 1 age = 12 + age

TypeError: unsupported operand type(s) for +: 'int' and 'str'

The error tells us that input() always returns what the user typed as a string. If we want numbers or other types, we need to convert the value ourselves. For example:

In [3]:
cities_canada = input("Largest cities in Canada: ")

Largest cities in Canada: ["Montreal", "Ottawa", "Calgary", "Toronto"]

In [4]:
print(cities_canada, type(cities_canada))

["Montreal", "Ottawa", "Calgary", "Toronto"] <class 'str'>

In [5]:
cities_canada = eval(input("Largest cities in Canada: "))

Largest cities in Canada: ["Montreal", "Ottawa", "Calgary", "Toronto"]

In [6]:
print(cities_canada, type(cities_canada))

['Montreal', 'Ottawa', 'Calgary', 'Toronto'] <class 'list'>

In [7]:
population = input("Population of Portland? ")

Population of Portland? 604596

In [8]:
print(population, type(population))

604596 <class 'str'>

In [9]:
population = int(input("Population of Portland? "))

Population of Portland? 604596

In [10]:
print(population, type(population))

604596 <class 'int'>

In [13]:
pi = input("Value of PI is?")

Value of PI is?3.14

In [14]:
print(pi, type(pi))

3.14 <class 'str'>

In [15]:
pi = float(input("Value of PI is?"))

Value of PI is?3.14

In [16]:
print(pi, type(pi))

3.14 <class 'float'>

Notice the use of functions such as eval(), int() and float() to get user input in the correct format. In summary, eval() evaluates the input text as a Python expression, so it can produce native Python objects such as lists and dictionaries, which we will look at in more detail in the next few tutorials. Because eval() runs whatever the user types, use it only with input you trust. int() converts input to integers (numbers without decimals), while float() converts it to floating point numbers.
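Since eval() executes any expression it is given, a more cautious pattern is often preferred for text typed by a user. The following is only a minimal sketch, assuming the standard-library function ast.literal_eval() fits your case. The helper names parse_literal and parse_int are hypothetical, and the plain strings stand in for what input() would return:

```python
import ast


def parse_literal(text):
    """Turn text such as '["Montreal", "Ottawa"]' into a Python object.

    ast.literal_eval() accepts only literals (numbers, strings, lists,
    tuples, dicts, sets, booleans, None), so it cannot run arbitrary code.
    """
    return ast.literal_eval(text)


def parse_int(text, default=None):
    """Convert text to an int; return default if the text is not an integer."""
    try:
        return int(text)
    except ValueError:
        return default


# These strings play the role of values returned by input().
cities = parse_literal('["Montreal", "Ottawa", "Calgary", "Toronto"]')
print(cities, type(cities))
print(parse_int("604596"))
print(parse_int("six hundred thousand"))  # not a number, so the default is used
```

Catching ValueError lets the program react to bad input (for example by asking again) instead of stopping with a traceback.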

Also of interest above is the type() function used in the print statements. You can get the type of any variable in Python using this function. Its output looks like <class 'float'> if the variable is of float type. For the time being we will ignore the “class” part.

Indentation Blocks

Python programs are structured through indentation, i.e. code blocks are defined by their indentation (the amount of blank space at the start of a line). This principle makes it easier to read and understand other people’s Python code.

All statements with the same distance to the right belong to the same block of code, i.e. the statements within a block line up vertically. The block ends at a line less indented or the end of the file. If a block has to be more deeply nested, it is simply indented further to the right.

In the sections below we will see extensive use of such indentation blocks. Consider the following example, which calculates Pythagorean triples. You do not need to understand the full code right now. We will revisit this code at the end of this tutorial.

In [18]:
from math import sqrt
n = input("Maximum Number? ")
n = int(n)+1
for a in range(1,n):
    for b in range(a,n):
        c_square = a**2 + b**2
        c = int(sqrt(c_square))
        if ((c_square - c**2) == 0):
            print(a, b, c)

Maximum Number? 10
3 4 5
6 8 10

In the above code we see three indentation blocks: the first and second “for” loops and the “if” condition. There is another aspect of structuring in Python, which we haven’t mentioned so far but which you can see in the example: the lines that start loops and conditional statements end with a colon “:”. The same is true for functions and other structures that introduce blocks. So we should really have said that Python structures code by colons and indentation.
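To make the role of colons and indentation concrete, here is a small illustrative sketch. The function name describe is hypothetical and not part of the tutorial's own code; functions are only used here because they also introduce a block:

```python
def describe(number):
    # The colon above opens the function block; every indented line belongs to it.
    if number % 2 == 0:
        # One level deeper: this line runs only when the condition is True.
        kind = "even"
    else:
        # Same depth as the line under "if": this is the alternative block.
        kind = "odd"
    # Back at the function's indentation level: this runs in both cases.
    return str(number) + " is " + kind


# No indentation: these lines are outside the function block.
for n in range(1, 4):
    print(describe(n))
```

Each block starts after a line ending in a colon and ends at the first line that is indented less.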

Comments in Python

Python has two ways to annotate code. One is by using comments to indicate what some part of the code does; the other is documentation strings (docstrings), which we do not cover this week. Single-line comments begin with the hash character (“#”) and are terminated by the end of the line. Here is an example:

In [19]:
# This is a comment in Python before print statement
print("Hello World")  # This is also a comment in Python

Hello World

Conditionals

Conditionals, mostly in the form of if statements, are one of the essential features of a programming language. A decision has to be made when the script or program comes to a point where it has a choice of actions, i.e. different computations, to choose from.

The decision depends in most cases on the value of variables or arithmetic expressions. These expressions are evaluated to the Boolean values True or False. The statements that make the decision are called conditional statements. They are also known as conditional expressions or conditional constructs.

Conditional statements in Python use indentation blocks to conditionally execute certain code. The general form of the if statement in Python looks like this:

In [12]:
if condition_1:
    statement_block_1
elif condition_2:
    statement_block_2

...

elif another_condition:    
    another_statement_block
else:
    else_block


If the condition “condition_1” is True, the statements of the block statement_block_1 will be executed. If not, condition_2 will be evaluated. If condition_2 evaluates to True, statement_block_2 will be executed. If condition_2 is False, the conditions of the following elif branches will be checked in order. Finally, if none of them evaluates to True, the indented block below the else keyword will be executed. At most one of the blocks is executed.
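The general form can be filled in with a concrete case. The sketch below is only an illustration: the function name classify_base and the labels it returns are assumptions made for this example, not something defined elsewhere in the tutorial:

```python
def classify_base(base):
    """Return a label for a single DNA base (hypothetical helper)."""
    if base == "A" or base == "G":
        label = "purine"
    elif base == "C" or base == "T":
        label = "pyrimidine"
    elif base == "N":
        label = "unknown"
    else:
        # Reached only when none of the conditions above was True.
        label = "invalid"
    return label


for base in "AGCTNX":
    print(base, "->", classify_base(base))
```

The conditions are tested from top to bottom, and only the block of the first True condition runs.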

Typical conditions are built from the following operations:

  • mathematical comparisons such as “<”, “>”, “<=”, “>=”, “==” and “!=”,
  • object comparisons such as “is”, i.e. whether something is exactly the same object or not,
  • the Boolean logic operators “not”, “or” and “and”.

The following objects are evaluated by Python as False:

  • numerical zero values (0, 0.0, 0.0+0.0j),
  • the Boolean value False,
  • empty strings,
  • empty lists and empty tuples,
  • empty dictionaries and empty sets,
  • the special value None.

All other values are considered to be True.
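These rules can be checked directly with the built-in bool() function. The list of sample values below is chosen only for illustration:

```python
# The first nine values are the "false" cases listed above; the rest are not.
values = [0, 0.0, 0j, False, "", [], (), {}, None, 1, -2.5, "0", [0], " "]

for value in values:
    # bool() shows how Python would treat the value in an if condition.
    print(repr(value), "->", bool(value))
```

Note that the string "0", a list containing a zero, and a string with a single space are all True, because they are not empty.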

Let us try to solve this simple DNA sequence problem: given an input DNA sequence, print the sequence if its length is less than or equal to 20. Print “ERROR!” if the sequence is empty or its length is larger than 25. If the length is between 21 and 25, print only the last 5 bases.


In [20]:
dna = "ATGCCGATTTATCGGGAACCNNNAATTCCGG"

if len(dna) <= 20:
    if len(dna) > 0:
        print(dna)
    else:
        print("ERROR!")
elif len(dna) <= 25:
    print(dna[-5:])
else:
    print("ERROR!")

ERROR!

In [21]:
dna = "ATGCAATGCN"

if len(dna) <= 20:
    if len(dna) > 0:
        print(dna)
    else:
        print("ERROR!")
elif len(dna) <= 25:
    print(dna[-5:])
else:
    print("ERROR!")

ATGCAATGCN

In [22]:
dna = ""

if len(dna) <= 20:
    if len(dna) > 0:
        print(dna)
    else:
        print("ERROR!")
elif len(dna) <= 25:
    print(dna[-5:])
else:
    print("ERROR!")

ERROR!

In [23]:
dna = "ATGCCGATTTATCGGGAACCNNN"

if len(dna) <= 20:
    if len(dna) > 0:
        print(dna)
    else:
        print("ERROR!")
elif len(dna) <= 25:
    print(dna[-5:])
else:
    print("ERROR!")

CCNNN

if/else conditions can also be used inside a regular assignment to choose the value to assign. For example, in the DNA case, we want to store the length of the DNA. However, we want to store the actual length only if the length of the sequence is between 1 and 20. In all other cases, we want to store the length of the sequence as -1. A typical way to do this would be:

In [24]:
dna = "ATGCCGATTTATCGGGAACCNNN"
length = -1
if 0 < len(dna) <= 20:
    length = len(dna)

print(length)

-1

In [25]:
dna = "CCGGGAACCTCACG"
length = -1
if 0 < len(dna) <= 20:
    length = len(dna)

print(length)

14

This example can be written much more compactly with a conditional expression, commonly called a ternary if.

In [26]:
dna = "ATGCCGATTTATCGGGAACCNNN"
length = len(dna) if 0 < len(dna) <= 20 else -1
print(length)

-1

In [27]:
dna = "CCGGGAACCTCACG"
length = len(dna) if 0 < len(dna) <= 20 else -1
print(length)

14

Loops

Many algorithms make it necessary for a programming language to have a construct which makes it possible to carry out a sequence of statements repeatedly. The code within the loop, i.e. the code carried out repeatedly, is called the body of the loop.

There are two types of loops in Python -

  1. while Loop
  2. for Loop

The while Loop

The while loop is a “condition-controlled loop”. As the name suggests, the loop body is repeated as long as a given condition remains True, and the loop ends as soon as the condition evaluates to False.

Let us consider the following example of DNA sequence: We want to print every base of a given sequence, until we have found 2 A’s.

In [28]:
dna = "ATGCCGATTTATCGGGAACCNNN"
countA = 0
index = 0
while countA < 2 and index < len(dna):  # also stop at the end of the sequence
    print(dna[index])
    if dna[index] == 'A':
        countA = countA + 1
    index = index + 1

A
T
G
C
C
G
A

In the above example, the loop body (the code under the while line) was executed as long as the condition after while remained True, i.e. until the second A had been found.

A loop can be made to exit before its normal completion using the break statement. Consider the following example of a DNA sequence. We want to print every base of a given sequence until we have found 2 A’s. However, we want to stop printing as soon as we find an N base.

In [29]:
dna = "ATGCNCGATTTATCGGGAACCNNN"
countA = 0
index = 0
while countA < 2 and index < len(dna):  # also stop at the end of the sequence
    if dna[index] == 'N':
        break
    if dna[index] == 'A':
        countA = countA + 1
    print(dna[index])
    index = index + 1

A
T
G
C

Now let us consider another case that comes up while looping. Sometimes we want to skip the rest of the loop body for the current iteration when a certain condition holds. In such cases, the continue statement comes in handy.

Consider the following example with a DNA sequence. Given a sequence of DNA, we do NOT want to print the base name if it is ‘N’.

In [30]:
dna = "ATGCNCN"
index = 0
while index < len(dna):
    index = index + 1
    if dna[index-1] == 'N':
        continue
    print(dna[index-1])

A
T
G
C
C

The for Loop

A for loop is similar to a while loop, except that it loops over the elements of a sequence, whereas a while loop continues as long as a certain condition is satisfied. In the case of DNA sequences, one use of a for loop would be to loop over all bases in a sequence.

Consider the following example: given a DNA sequence, we want to count the number of all ‘A’ and ‘T’ bases.

In [31]:
dna = "ATGCNCGATTTATCGGGAACCNNN"
count = 0
for base in dna:
    if base == 'A' or base == 'T':
        count += 1

print("Number of A, T bases is:", count)

Number of A, T bases is: 10

Similar to while loops, we can use break and continue statements with for loops as well.
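As a sketch of how break and continue behave inside a for loop, the earlier while examples can be combined as follows. The variable names printed and count_a are chosen only for this illustration:

```python
dna = "ATGCNCGATTTATCGGGAACCNNN"
printed = []   # bases we decided to print, collected for inspection
count_a = 0

for base in dna:
    if base == "N":
        continue          # skip unknown bases and move on to the next one
    printed.append(base)
    print(base)
    if base == "A":
        count_a += 1
        if count_a == 2:
            break         # leave the loop as soon as the second A is seen

print("Printed bases:", "".join(printed))
```

Unlike the while versions, no index variable is needed here, because the for loop hands us one base at a time.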

Let us look at a somewhat more complicated use of the for loop:

Given a DNA sequence, we want to count the number of doublets of bases, i.e. the number of times a base occurs exactly twice in a row. If a base occurs more than twice in a row, we do not want to count it.

In [32]:
dna = "ATGGCNCGAATTTAAATCGGGAACCNNN"
countPairs = 0
pairFound = 0
prevBase = ''
for base in dna:
    if (base == prevBase):
        pairFound += 1
    else:
        if pairFound == 1:
            countPairs += 1
        pairFound = 0
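    prevBase = base

# A doublet at the very end of the sequence is not followed by a different
# base, so check once more after the loop.
if pairFound == 1:
    countPairs += 1

print("Number of paired bases is:", countPairs)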

Number of paired bases is: 4
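The same count can also be obtained by grouping consecutive equal bases into runs. The sketch below uses itertools.groupby from the standard library, which is a different technique from the loop above; the function name count_doublets is hypothetical:

```python
from itertools import groupby


def count_doublets(dna):
    """Count runs in which a base occurs exactly twice in a row."""
    count = 0
    # groupby() yields one (base, run) pair per stretch of identical bases.
    for base, run in groupby(dna):
        if len(list(run)) == 2:
            count += 1
    return count


print("Number of paired bases is:", count_doublets("ATGGCNCGAATTTAAATCGGGAACCNNN"))
```

Because every run is examined as a whole, a doublet at the very end of the sequence needs no special handling here.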

Formatting of Output

The final topic for this week is the formatting of text in print statements. Consider the following case:

We have following variables: name = "Sadanand", age = 30, and gender = "male"

We would like to print a fairly cumbersome sentence that reuses these values several times. This can be done quite easily using the format method.

In [33]:
name = "Sadanand"
age = 30
gender = "male"
msg = "Hi {0}, You are a {1}, and you have seen {2} winters as you are {2} years old! Thanks {0}!"
print(msg.format(name, gender, age))

Hi Sadanand, You are a male, and you have seen 30 winters as you are 30 years old! Thanks Sadanand!

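Thus the format method gives us an easy way to mix variables of different types into strings. The numbers in curly braces refer to the positions of the arguments passed to format(), which is why a value can be reused several times.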
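Beyond numbered fields, format() also accepts named fields and format specifications, and f-strings offer a shorter spelling of the same idea. The sketch below assumes Python 3.6 or later for the f-string line, and gc_fraction is a made-up value used only to show number formatting:

```python
name = "Sadanand"
age = 30
gc_fraction = 0.478260869  # hypothetical value, for illustration only

# Named fields make long templates easier to read than {0}, {1}, ...
msg_named = "Hi {name}, you are {age} years old.".format(name=name, age=age)

# f-strings (Python 3.6+) place the variables directly inside the string.
msg_fstring = f"Hi {name}, you are {age} years old."

# A format specification after the colon controls how a value is shown:
# .1% -> percentage with one decimal; >10 / <10 -> right / left aligned in
# 10 characters; 05d -> integer padded with zeros to 5 digits.
msg_number = "GC content: {:.1%}".format(gc_fraction)
msg_padded = "[{:>10}] [{:<10}] [{:05d}]".format(name, name, age)

print(msg_named)
print(msg_fstring)
print(msg_number)
print(msg_padded)
```

Both spellings produce the same text; which one to use is largely a matter of taste and of the Python version available.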

That's it for this week. Next we will look at strings and lists in Python in more detail.

Exercise

Given the following sequence of dna - “ATGGCNCGAATTTAAATCGGGAACCNNN”,

  1. Write a program to count the number of all triplets in it.
  2. Write a program that prints all non-‘T’ bases that come after a ‘T’, but stops when two or more consecutive ‘T’s have been found.
  3. Write a program to generate a new sequence with every 3rd base from the above sequence.
  4. Write a program to calculate the sum of all numbers from 1 to 10. HINT: Please take a look at the range() function.