Understanding Mutable and Immutable Objects in Python
I've recently started coding in Python after working in C, and I've found some of the notation confusing as a beginner because it's much harder to see exactly what is going on "under the hood" in Python. One of the most mind-boggling examples I've seen is the following: if a and b are both set to the same string, "string", it makes sense that a is equal to b (a == b) and that a and b point to the same object (a is b). However, when we set a and b to the same list, [1, 2, 3], a and b are still equal (a == b), but they no longer point to the same object (a is not b). Huh? What's going on here? By delving deeper into Python objects and looking at mutable vs. immutable types, we can get a clearer picture of what is happening behind the scenes.
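The comparison described above can be reproduced with a short sketch. The names a and b follow the article; the result names (strings_identical and so on) are only there for illustration. The string identity result relies on CPython reusing one string object, which is an implementation detail rather than a guarantee of the language.

```python
a = "string"
b = "string"
strings_equal = a == b       # same value
strings_identical = a is b   # same object? Usually yes on CPython (implementation detail)

a = [1, 2, 3]
b = [1, 2, 3]
lists_equal = a == b         # same value
lists_identical = a is b     # same object? No: each list literal builds a new list

print("strings:", strings_equal, strings_identical)
print("lists:  ", lists_equal, lists_identical)
```

The rest of the article explains why the last result differs from the others.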
type and id:
Before we get into mutable and immutable objects, let's take a closer look at two helpful functions, type() and id(). In Python, everything is an object. Integers, strings, lists, tuples, floats, and more are all objects, and creator Guido van Rossum even wanted all objects to have equal status ("first class"). This is a big part of why Python is known as an object-oriented programming (OOP) language. While in C you have to declare every variable with its type (e.g. char *string, int number), in Python the value you assign determines the type (e.g. quotes mean a string, a number without a decimal point means an int).
However, sometimes it's useful to check what type Python has given a value. To see this, you can use type(object), which returns the class the object belongs to. The words "class" and "object" are sometimes used loosely, but don't get confused: objects are collections of data and methods (roughly, variables and functions), while classes are the templates for these objects. All of the typical "types" you may remember from C (ints, floats, strings, etc.) are their own classes in Python, and these classes give Python the data and methods to use objects of that type correctly. If you call type(int), for example, you will see that these classes are themselves objects of the class type. If you call type() on values such as "string" and 1, you'll see that they are objects of the classes str and int, respectively.
It can also be helpful to know an object's identity, to see which variables refer to the same object. To do this, you can use id(variable), which returns an integer that is unique to that object for as long as the object exists. In CPython, the standard implementation of Python, this number is the object's memory address.
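A minimal sketch of type() and id() in use. The variable names text, number, and alias are made up for illustration, and the numbers that id() prints will differ from run to run.

```python
text = "string"
number = 1

print(type(text))    # <class 'str'>
print(type(number))  # <class 'int'>
print(type(int))     # <class 'type'>: the int class is itself an object

# id() returns an integer identifying the object; the value varies between runs.
print(id(text))

# A second name bound to the same object reports the same id.
alias = text
print(id(alias) == id(text))  # True
```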
Mutable Objects:
"Mutable" in Python and other programming languages means the object can be mutated, or changed. The most common examples in Python are lists and sets, although dictionaries can also be mutable. Be careful though: it is possible to have a frozenset, which is immutable, unlike the generic term "set." In addition, tuples are considered immutable themselves, but they can contain objects that are mutable. For example, if you have a tuple consisting of ([list1], [list2]), the tuple is considered immutable, but both list1 and list2 (the values stored in the tuple) are mutable.
Immutable Objects:
"Immutable" in Python means the object cannot be changed. Integers, floats, strings, booleans, and, as mentioned above, frozensets and tuples are all considered immutable. Now, you may be wondering, can't I change the value of an integer or slice a string? Yes, but it's important to note here that you are not directly changing the value of the original object. You are creating a new object with the updated information. There is information in the immutable classes that gives Python the methods to be able to do things, such as concatenation of two strings, but it does not directly affect the object itself. For example, you cannot directly index through a string and assign a new value, like you can in C or with a list in Python. If you index through a string, you can get a read-only value (which allows you to do things like print), but cannot change the value as "'str' object does not support item assignment." This is because the object is immutable. However, as shown below, mutable objects, like a list, are able to support this direct re-assignment. This is why slicing and concatenation are often used in Python to bypass this TypeError and change the string, but by doing so, you are creating a new string and assigning it to the variable name. This is shown in the second picture below, in which the variable id changes after the string is altered and the change must start with "string = " to re-assign the variable name "string" to a new string.
How does Python treat mutable and immutable objects differently?
We've already seen how mutable and immutable objects need to be changed differently, but mutability has further-reaching effects, especially when concerning two or more variables. Let's start by taking a look at two variables a and b that point to an int (remember, integers are immutable objects).
- If variables a and b point to two different ints, it makes sense that they are pointing to two different objects. Here, a points to 2 and b points to 3.
- However, what happens when a and b point to equivalent objects, in this case, 2? Since integers are immutable, Python saves space by having a and b point to the same object* (see more on this below).
- It also follows that if we set b to the variable a, both point to the same object, 2 again.
* a and b will generally point to the same object if that object is immutable, but there are limitations, and this behavior is an implementation detail of CPython rather than a guarantee of the language. When CPython starts up, it preloads certain objects for you automatically, including the most commonly used integers, those in the range -5 to 256. The size of this cache comes from the macros NSMALLNEGINTS (5, covering -5 to -1) and NSMALLPOSINTS (257, covering 0 to 256). This is how a and b are able to point to the same object: it's already in memory! However, if we set a and b both to 257 or greater (or -6 or less), for example on separate lines in the interactive interpreter, a and b generally will not point to the same object, even though they are equal and immutable. This is because a new object was created for 257 when a was assigned and, instead of checking the values of all previously created objects, Python creates another new object with its own id for b, because 257 is not a preloaded value.
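The small-integer cache can be observed with a sketch like the one below. It builds the integers with int("...") at runtime, which is a choice made for this example: when equal literals such as 257 appear in the same script, CPython's compiler may merge them into one object and hide the effect. The identity results are CPython behavior and may differ on other implementations, so no fixed output is shown for them.

```python
# Build the integers at runtime so equal literals cannot be merged by the compiler.
small_a = int("256")
small_b = int("256")
big_a = int("257")
big_b = int("257")

# Equality holds in both cases. Identity is expected to hold only for the
# cached value 256 on CPython (an implementation detail).
print("256:", small_a == small_b, small_a is small_b)
print("257:", big_a == big_b, big_a is big_b)

# Assigning one variable to another always shares the object, cached or not.
shared = big_a
print(shared is big_a)  # True
```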
On the other hand, lists are mutable. Even if variables a and b are assigned equal lists, they do not point to the same object, because Python cannot safely save space here. An immutable object can never change, so the only way to give a a different value is to re-assign it to a new object with a new id. That makes it safe for two variables to share one immutable object: if a is re-assigned, it will not affect b. However, if a and b are assigned equal mutable objects, those must be separate objects, because otherwise changing the object through a would also change b. For two variables to point to the same mutable object, you have to assign one variable to the other (b = a). Then a and b point to the same object, and changing that object through one name is visible through the other.
This can be seen by appending the value 5 to a list, l1. If l2 = l1, we also see this addition when we print l2.
However, re-assigning l1, such as l1 = l1 + [4], creates a new list and points the name l1 at it. Since the change only affects this new list, and l2 still points to the original list, the change does not affect l2.
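Both cases can be sketched together, using the names l1 and l2 from the text and an assumed starting list of [1, 2, 3].

```python
l1 = [1, 2, 3]
l2 = l1               # l2 is another name for the same list object

l1.append(5)          # in-place change to the shared list
print(l2)             # [1, 2, 3, 5]
print(l1 is l2)       # True

l1 = l1 + [4]         # builds a new list and re-assigns the name l1
print(l1)             # [1, 2, 3, 5, 4]
print(l2)             # [1, 2, 3, 5] (still the original list)
print(l1 is l2)       # False
```

If a truly independent copy is wanted from the start, l2 = l1.copy() or l2 = list(l1) creates one.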
How are mutable and immutable objects passed to functions?
Another common difference between mutable and immutable objects is how they are changed within functions.
Take two functions, increment_int and increment_list, that have similar structures. The immutable int a is unaffected after the function call, while the mutable list b is changed after the function call. In Python, function arguments are passed "call-by-object" (also described as call by sharing or pass by assignment). This means the parameter always becomes a new name for the very same object the caller passed in, whatever its type, and no copy is made. What differs is what the function can do with that object. An immutable object cannot be changed in place, so an operation such as n += 1 creates a new object and re-assigns only the local name, which has no effect on the variable outside/after the function. This looks like pass-by-value. A mutable object can be modified in place, for example with append, and because the caller's variable refers to the same object, the change is visible outside of the function. This looks like pass-by-reference. If you are familiar with C programming, the closest analogy is that every argument is passed as a pointer to the object: changes made through the pointer are seen by the caller, while pointing the local parameter at something else is not. The same analogy explains why re-assigning a list parameter inside a function leaves the caller's list untouched.
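Here is one possible sketch of the two functions. Only the names increment_int and increment_list come from the text; their bodies, the parameter names, and the extra function rebind_list are assumptions made for illustration.

```python
def increment_int(n):
    n += 1              # ints are immutable: this re-assigns the local name n
    return n

def increment_list(items):
    items.append(4)     # in-place change to the caller's list

def rebind_list(items):
    items = items + [5]  # new list bound to the local name only
    return items

a = 1
result = increment_int(a)
print(a, result)        # 1 2 (the caller's a is unchanged)

b = [1, 2, 3]
increment_list(b)
print(b)                # [1, 2, 3, 4] (the change is visible to the caller)

rebound = rebind_list(b)
print(b)                # [1, 2, 3, 4] (re-assignment inside did not touch b)
print(rebound)          # [1, 2, 3, 4, 5]
```

The last call shows that what matters is whether the function changes the object in place or re-assigns its local name, not only whether the type is mutable.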
I hope taking a further look into mutable (changeable) and immutable (unchanging) objects in Python has been as helpful for you as it has been for me in better understanding how Python works. Once you know the rules of mutable vs. immutable, it is much easier to see why Python can let two variables point to the same immutable object, but keeps two equal mutable values as separate objects. It also becomes clearer how argument passing affects changes inside and outside of function calls, and how changing an object through one variable may change what another variable sees.
If you have any feedback or tips for a new Python coder, please send them my way - I'm always curious to learn more about how coding languages work. Thank you for reading and happy coding!