Variables, expressions and statements

Source: this section is heavily based on Chapter 2 of [ThinkCS].

Executing programs in a computer

Before going into the details of Python, it is useful to consider how computers are organised and execute programs. A computer typically consists of at least the following components:

a processor, which executes instructions in a program; examples of well-known processors are Intel Core processors, AMD Athlon processors, Qualcomm Snapdragon processors, and Apple M1 processors.
a main memory, which stores the program and the data while the processor is executing the program; typical capacities for the main memory are nowadays between 8GB and 32GB.
a disk drive, which stores all files available to the computer without Internet connection; in a laptop this is nowadays 256GB or more, while on a desktop 1TB is not uncommon.
a monitor, which displays information to the user.
a network connection, which allows the computer to interact with the Internet.
a keyboard and a mouse or touchpad, which allow the user to provide input to the computer.

If you use the Python interpreter, this program is initially stored on the disk drive; when you start the interpreter, it is loaded into the main memory of the computer, so that the processor can execute it.

The main memory of the computer stores everything that the processor needs to have access to in order to execute a program. This includes not only the Python interpreter and the instructions of the program that the processor is executing, but also the intermediate results of a calculation; after all, in most cases the calculation that we ask a computer to do is so complex that it needs memory to keep track of the intermediate steps.

To organize its calculations well, the Python interpreter organizes the memory in a specific manner, of which we will see more details later in this syllabus. Core ideas are the following:

Parts of the memory are given names; these names can be used to refer to that part of the memory;
The information that is stored in a certain part of the memory has a value, for instance 4 or 3.0, and a type, for instance text or number;
To calculate information that can be stored in a part of the memory, in Python programs we write expressions;
To decide the order in which we perform calculations, Python programs consist of statements that are put in a certain order.

In this chapter, we will discuss each of these ideas in more detail.

Values and data types

A value is one of the fundamental things --- like a letter or a number --- that a program manipulates. The values we have seen so far are 4 (the result when we added 2 + 2), and "Hello, World!".

These values are classified into different classes, or data types: 4 is an integer, and "Hello, World!" is a string, so-called because it contains a string of letters. You (and the interpreter) can identify strings because they are enclosed in quotation marks.

If you are not sure what class a value falls into, Python has a function called type which can tell you.

>>> type("Hello, World!")
<class 'str'>
>>> type(17)
<class 'int'>

Not surprisingly, strings belong to the class str and integers belong to the class int. Less obviously, numbers with a decimal point belong to a class called float, because these numbers are represented in a format called floating-point. At this stage, you can treat the words class and type interchangeably. We'll come back to a deeper understanding of what a class is in later chapters.

>>> type(3.2)
<class 'float'>
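Floats are only approximations of the numbers you type, which can lead to small rounding errors. The following is a minimal sketch of that behaviour; the variable name total is made up for illustration, and math.isclose is a standard-library function that this chapter does not otherwise cover.

```python
import math

# Floats are stored in binary, so some decimal fractions are only approximated.
total = 0.1 + 0.2
print(type(total))               # <class 'float'>
print(total == 0.3)              # False: the stored values differ very slightly

# Comparing with a small tolerance is usually safer than using == on floats.
print(math.isclose(total, 0.3))  # True
```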

What about values like "17" and "3.2"? They look like numbers, but they are in quotation marks like strings.

>>> type("17")
<class 'str'>
>>> type("3.2")
<class 'str'>

They're strings!
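Why does the distinction matter? The type of a value determines what an operator does with it. As a small sketch (the variable names as_text and as_number are invented for this example), compare what + does with strings and with integers:

```python
# The same digits behave differently depending on their type.
as_text = "17" + "3"     # with strings, + joins the two texts together
as_number = 17 + 3       # with integers, + adds the two numbers

print(as_text)           # 173
print(as_number)         # 20
print(type(as_text))     # <class 'str'>
print(type(as_number))   # <class 'int'>
```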

Strings in Python can be enclosed in either single quotes (') or double quotes ("), or three of each (''' or """):

>>> type('This is a string.')
<class 'str'>
>>> type("And so is this.")
<class 'str'>
>>> type("""and this.""")
<class 'str'>
>>> type('''and even this...''')
<class 'str'>

Double quoted strings can contain single quotes inside them, as in "Bruce's beard", and single quoted strings can have double quotes inside them, as in 'The knights who say "Ni!"'.

Strings enclosed with three occurrences of either quote symbol are called triple quoted strings. They can contain either single or double quotes:

>>> print('''"Oh no", she exclaimed, "Ben's bike is broken!"''')
"Oh no", she exclaimed, "Ben's bike is broken!"
>>>

Triple quoted strings can even span multiple lines:

>>> message = """This message will
... span several
... lines."""
>>> print(message)
This message will
span several
lines.
>>>

Python doesn't care whether you use single or double quotes or the three-of-a-kind quotes to surround your strings: once it has parsed the text of your program or command, the way it stores the value is identical in all cases, and the surrounding quotes are not part of the value. But when the interpreter wants to display a string, it has to decide which quotes to use to make it look like a string.

>>> 'This is a string.'
'This is a string.'
>>> """And so is this."""
'And so is this.'

So the Python language designers usually chose to surround their strings by single quotes. What do you think would happen if the string already contained single quotes?
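One way to explore this question yourself is the built-in function repr, which returns the text that the interpreter would display for a value. The sketch below is only an experiment to try, with made-up variable names; repr is not otherwise covered in this chapter.

```python
# repr() gives the text the interpreter would show for a value.
plain = 'This is a string.'
with_apostrophe = "Bruce's beard"
with_both = '''He said "it's fine"'''

print(repr(plain))            # 'This is a string.'
print(repr(with_apostrophe))  # "Bruce's beard"
print(repr(with_both))        # 'He said "it\'s fine"'
```

In the last case neither kind of quote can be used without help, so the interpreter falls back on single quotes and marks the inner one with a backslash.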

When you type a large integer, you might be tempted to use commas between groups of three digits, as in 42,000. This is not a legal integer in Python, but it does mean something else, which is legal:

>>> 42000
42000
>>> 42,000
(42, 0)

Well, that's not what we expected at all! Because of the comma, Python chose to treat this as a pair of values. We'll come back to learn about pairs later. But, for the moment, remember not to put commas or spaces in your integers, no matter how big they are. Also revisit what we said in the previous chapter: formal languages are strict, the notation is concise, and even the smallest change might mean something quite different from what you intended.
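If you do want to group the digits of a large number for readability, Python 3.6 and later accept underscores between digits. The sketch below contrasts this with the comma; the names big and pair are chosen only for this example.

```python
big = 42_000          # underscores between digits are ignored by Python
pair = 42,000         # the comma turns this into a pair of values

print(big)            # 42000
print(pair)           # (42, 0)
print(type(pair))     # <class 'tuple'>
```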

Variables

One of the most powerful features of a programming language is the ability to store values in the memory of the computer. In Python this is done by manipulating variables. A variable is a name that refers to a value stored in the memory of the computer.

The assignment statement gives a value to a variable:

>>> message = "What's up, Doc?"
>>> n = 17
>>> pi = 3.14159

This example makes three assignments. The first assigns the string value "What's up, Doc?" to a variable named message. The second gives the integer 17 to n, and the third assigns the floating-point number 3.14159 to a variable called pi.

After executing these instructions, the memory of the computer therefore holds three variables; each variable has a name (such as message), a type (such as str) and a value (such as "What's up, Doc?"). The assignment statement effectively changes the contents of the memory of the computer.
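You can check the type that goes with each of these variables by combining the assignments with the type function from the previous section. This is a small sketch, not something you need to run every time you create a variable:

```python
message = "What's up, Doc?"
n = 17
pi = 3.14159

# Each name refers to a value, and each value has a type.
print(type(message))   # <class 'str'>
print(type(n))         # <class 'int'>
print(type(pi))        # <class 'float'>
```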

The assignment token, =, should not be confused with equals, which uses the token ==. The assignment statement binds a name, on the left-hand side of the operator, to a value, on the right-hand side. This is why you will get an error if you enter:

>>> 17 = n
File "<interactive input>", line 1
SyntaxError: can't assign to literal

Tip

When reading or writing code, say to yourself "n is assigned 17" or "n gets the value 17". Don't say "n equals 17".

A common way to represent variables on paper is to write the name with an arrow pointing to the variable's value. This kind of figure is called a state snapshot because it shows what state each of the variables is in at a particular instant in time. (Think of it as the variable's state of mind.) This diagram shows the result of executing the assignment statements:

message ---> "What's up, Doc?"
n ---> 17
pi ---> 3.14159

If you ask the interpreter to evaluate a variable, it will produce the value that is currently linked to the variable:

>>> message
"What's up, Doc?"
>>> n
17
>>> pi
3.14159

We use variables in a program to "remember" things, perhaps the current score at the football game. But variables are variable. This means they can change over time, just like the scoreboard at a football game. You can assign a value to a variable, and later assign a different value to the same variable. (This is different from maths. In maths, if you give x the value 3, it cannot change to link to a different value half-way through your calculations!)

>>> day = "Thursday"
>>> day
'Thursday'
>>> day = "Friday"
>>> day
'Friday'
>>> day = 21
>>> day
21

You'll notice we assigned a value to day three times, and on the third assignment we even made it refer to a value that was of a different type.

A great deal of programming is about having the computer remember things, e.g. the number of missed calls on your phone, and then arranging to update or change the variable when you miss another call.
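The missed-calls idea can be sketched in a few lines. The variable name missed_calls is our own choice for this illustration; a real phone would of course do much more.

```python
missed_calls = 0                  # no calls missed yet
missed_calls = missed_calls + 1   # a call is missed: old value 0, new value 1
missed_calls = missed_calls + 1   # another one: old value 1, new value 2
print(missed_calls)               # 2
```

In each update, the right-hand side is evaluated first using the old value, and only then is the name bound to the new value.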

Variable names and keywords

Variable names can be arbitrarily long. They can contain both letters and digits, but they have to begin with a letter or an underscore. Although it is legal to use uppercase letters, by convention we don't. If you do, remember that case matters. Bruce and bruce are different variables.

The underscore character (_) can appear in a name. It is often used in names with multiple words, such as my_name or price_of_tea_in_china.

There are some situations in which names beginning with an underscore have special meaning, so a safe rule for beginners is to start all names with a letter.

If you give a variable an illegal name, you get a syntax error:

>>> 76trombones = "big parade"
SyntaxError: invalid syntax
>>> more$ = 1000000
SyntaxError: invalid syntax
>>> class = "Computer Science 101"
SyntaxError: invalid syntax

76trombones is illegal because it does not begin with a letter or an underscore. more$ is illegal because it contains an illegal character, the dollar sign. But what's wrong with class?

It turns out that class is one of the Python keywords. Keywords define the language's syntax rules and structure, and they cannot be used as variable names.

Python 3 has about thirty-five keywords (and every now and again improvements to Python introduce or eliminate one or two):

and	as	assert	async	await	break
class	continue	def	del	elif	else
except	finally	for	from	global	if
import	in	is	lambda	nonlocal	not
or	pass	raise	return	try	while
with	yield	True	False	None

You might want to keep this list handy. If the interpreter complains about one of your variable names and you don't know why, see if it is on this list.
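Because the list changes a little between versions, you can also ask Python itself. The sketch below uses the standard-library module keyword and the string method isidentifier; it relies on features (import, for, dictionaries) that are only explained later, and the name usable is made up for this example.

```python
import keyword

# keyword.kwlist lists the keywords of the Python version running this code.
print(len(keyword.kwlist))            # the count depends on the Python version
print(keyword.iskeyword("class"))     # True
print(keyword.iskeyword("average"))   # False

# str.isidentifier() checks only the spelling rules for names, so we
# combine it with the keyword check. The candidate names come from the text.
usable = {}
for name in ["my_name", "76trombones", "more$", "class"]:
    usable[name] = name.isidentifier() and not keyword.iskeyword(name)
print(usable)
```

Only my_name should come out as usable: two of the other names break the spelling rules, and class is a keyword.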

Programmers generally choose names for their variables that are meaningful to the human readers of the program --- they help the programmer document, or remember, what the variable is used for.

Caution!

Beginners sometimes confuse "meaningful to the human readers" with "meaningful to the computer". So they'll wrongly think that because they've called some variable average or pi, it will somehow magically calculate an average, or magically know that the variable pi should have a value like 3.14159. No! The computer doesn't understand what you intend the variable to mean.

So you'll find some instructors who deliberately don't choose meaningful names when they teach beginners --- not because we don't think it is a good habit, but because we're trying to reinforce the message that you --- the programmer --- must write the program code to calculate the average, and you must write an assignment statement to give the variable pi the value you want it to have.

Statements

A statement is an instruction that the Python interpreter can execute. In this chapter we have seen the assignment statement. There are however many other forms of statements. Another example is the function call that we saw in the previous chapter:

print("Hello, World!")

The effect of this statement was to print a string on the screen of the computer.

Note that it is important not to confuse these two statements:

print(3)

and

x = 3

This last statement will store the value 3 in the memory of the computer, such that it can be used later in the program. The first statement prints the value 3 on the screen of the user, but this value is not stored for later use.
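The difference can be made visible with a short experiment. In this sketch the name shown is invented for illustration; it captures whatever print hands back, which turns out to be the special value None rather than the number that was displayed.

```python
x = 3                 # stores 3 under the name x; nothing is displayed
print(x + 1)          # displays 4; x itself is unchanged

shown = print(3)      # displays 3, but print hands back no useful value
print(shown)          # None
print(x)              # 3
```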

We will see more details on functions later; some other kinds of statements that we'll see shortly are while statements, for statements, if statements, and import statements. (There are other kinds too!)

Evaluating expressions

An expression is a combination of values, variables, operators, and calls to functions. If you type an expression at the Python prompt, the interpreter evaluates it and displays the result:

>>> 1 + 1
2
>>> len("hello")
5

In this example len is a built-in Python function that returns the number of characters in a string. We've previously seen the print and the type functions, so this is our third example of a function!

The evaluation of an expression produces a value, which is why expressions can appear on the right hand side of assignment statements. A value all by itself is a simple expression, and so is a variable.

>>> 17
17
>>> y = 3.14
>>> x = len("hello")
>>> x
5
>>> y
3.14

Operators and operands

Operators are special tokens that represent computations like addition, multiplication and division. The values the operator uses are called operands.

The following are all legal Python expressions whose meaning is more or less clear:

20+32   hour-1   hour*60+minute   minute/60   5**2   (5+9)*(15-7)

The tokens +, -, and *, and the use of parentheses for grouping, mean in Python what they mean in mathematics. The asterisk (*) is the token for multiplication, and ** is the token for exponentiation.

>>> 2 ** 3
8
>>> 3 ** 2
9

When a variable name appears in the place of an operand, it is replaced with its value before the operation is performed.
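To try the expressions listed above yourself, the variables hour and minute first need values. The values 11 and 59 below are assumptions chosen purely for illustration:

```python
hour = 11      # assumed value, for illustration only
minute = 59    # assumed value, for illustration only

print(20 + 32)               # 52
print(hour - 1)              # 10
print(hour * 60 + minute)    # 719
print(minute / 60)           # a float a little below 1
print(5 ** 2)                # 25
print((5 + 9) * (15 - 7))    # 112
```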

Addition, subtraction, multiplication, and exponentiation all do what you expect.

For example, let us convert 645 minutes into hours:

>>> minutes = 645
>>> hours = minutes / 60
>>> hours
10.75

Oops! In Python 3, the division operator / always yields a floating-point result. What we might have wanted to know was how many whole hours there are, and how many minutes remain. Python gives us two different flavors of the division operator. The second, called floor division, uses the token //. When both operands are integers, its result is always a whole number --- and if it has to adjust the number it always moves it to the left on the number line. So 6 // 4 yields 1, but -6 // 4 might surprise you!

>>> 7 / 4
1.75
>>> 7 // 4
1
>>> minutes = 645
>>> hours = minutes // 60
>>> hours
10

Take care that you choose the correct flavor of the division operator. If you're working with expressions where you need floating point values, use the division operator that does the division accurately.
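The sketch below completes the minutes example and shows the corner cases mentioned above. It uses the remainder operator %, which is not introduced in this chapter, to obtain the minutes that are left over; the variable name remaining is our own.

```python
minutes = 645
hours = minutes // 60        # whole hours: 10
remaining = minutes % 60     # minutes left over: 45
print(hours, remaining)      # 10 45

print(6 // 4)                # 1
print(-6 // 4)               # -2, because -1.5 is moved left to -2
print(7.0 // 2)              # 3.0: with a float operand the result is a float
```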

Order of operations

When more than one operator appears in an expression, the order of evaluation depends on the rules of precedence. Python follows the same precedence rules for its mathematical operators that mathematics does. The acronym PEMDAS is a useful way to remember the order of operations:

Parentheses have the highest precedence and can be used to force an expression to evaluate in the order you want. Since expressions in parentheses are evaluated first, 2 * (3-1) is 4, and (1+1)**(5-2) is 8. You can also use parentheses to make an expression easier to read, as in (minute * 100) / 60, even though it doesn't change the result.
Exponentiation has the next highest precedence, so 2**1+1 is 3 and not 4, and 3*1**3 is 3 and not 27.
Multiplication and both Division operators have the same precedence, which is higher than Addition and Subtraction, which also have the same precedence. So 2*3-1 yields 5 rather than 4, and 5-2*2 is 1, not 6.
Operators with the same precedence are evaluated from left-to-right. In algebra we say they are left-associative. So in the expression 6-3+2, the subtraction happens first, yielding 3. We then add 2 to get the result 5. If the operations had been evaluated from right to left, the result would have been 6-(3+2), which is 1. (The acronym PEMDAS could mislead you into thinking that multiplication has higher precedence than division, and that addition is done ahead of subtraction: don't be misled. Multiplication and division are at the same precedence, as are addition and subtraction, and the left-to-right rule applies.)
An exception to the left-to-right rule is the exponentiation operator **, which is evaluated from right to left, so a useful hint is to always use parentheses to force exactly the order you want when exponentiation is involved:
```
>>> 2 ** 3 ** 2     # The right-most ** operator gets done first!
512
>>> (2 ** 3) ** 2   # Use parentheses to force the order you want!
64
```

The immediate mode command prompt of Python is great for exploring and experimenting with expressions like this.
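As a starting point for such experiments, the examples from the rules above can be collected in one small script. This is only a sketch; the variable names are invented to describe which rule each line illustrates.

```python
grouped = 2 * (3 - 1)            # parentheses first
both_grouped = (1 + 1) ** (5 - 2)
power_then_add = 2 ** 1 + 1      # ** before +
power_then_mul = 3 * 1 ** 3      # ** before *
mul_then_sub = 2 * 3 - 1         # * before -
sub_after_mul = 5 - 2 * 2
left_to_right = 6 - 3 + 2        # same precedence: left to right
right_to_left = 2 ** 3 ** 2      # ** is the exception: right to left

print(grouped, both_grouped)                 # 4 8
print(power_then_add, power_then_mul)        # 3 3
print(mul_then_sub, sub_after_mul)           # 5 1
print(left_to_right, right_to_left)          # 5 512
```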

Glossary

assignment statement

A statement that assigns a value to a name (variable). To the left of the assignment token, =, is a name. To the right of the assignment token is an expression which is evaluated by the Python interpreter and then assigned to the name. The difference between the left and right hand sides of the assignment statement is often confusing to new programmers. In the following assignment:

n = n + 1

n plays a very different role on each side of the =. On the right it is a value and makes up part of the expression which will be evaluated by the Python interpreter before assigning it to the name on the left.

assignment token

= is Python's assignment token. Do not confuse it with ==, which is an operator for comparing values for equality.

data type

A set of values. The type of a value determines how it can be used in expressions. So far, the types you have seen are integers (int), floating-point numbers (float), and strings (str).

evaluate

To simplify an expression by performing the operations in order to yield a single value.

expression

A combination of values, variables, operators, and calls to functions that represents a single result value.

float

A Python data type which stores floating-point numbers. Floating-point numbers are stored internally in two parts: a significand (the significant digits) and an exponent. When printed in the standard format, they look like decimal numbers. Beware of rounding errors when you use floats, and remember that they are only approximate values.

floor division

An operator (denoted by the token //) that divides one number by another and rounds the result down to a whole number: if the quotient is not already a whole number, it yields the next smaller one.

int

A Python data type that holds positive and negative whole numbers.

keyword

A reserved word that is used by the interpreter to parse a program; you cannot use keywords like if, def, and while as variable names.

operand

One of the values on which an operator operates.

operator

A special symbol that represents a simple computation like addition, multiplication, or string concatenation.

rules of precedence

The set of rules governing the order in which expressions involving multiple operators and operands are evaluated.

state snapshot

A graphical representation of a set of variables and the values to which they refer, taken at a particular instant during the program's execution.

statement

An instruction that the Python interpreter can execute. So far we have seen the assignment statement and function calls such as print, but we will soon meet the import statement and the for statement.

str

A Python data type that holds a string of characters.

value

A number or string (or other things to be named later) that can be stored in a variable or computed in an expression.

variable

A name that refers to a value.

variable name

A name given to a variable. Variable names in Python consist of a sequence of letters (a..z, A..Z, and _) and digits (0..9) that begins with a letter. In best programming practice, variable names should be chosen so that they describe their use in the program, making the program self-documenting.

References

[ThinkCS]

How To Think Like a Computer Scientist --- Learning with Python 3