Python Memory: Ref Counting, Garbage Collection

Reference Counting

Consider a simple example. When you assign a value to a variable in Python, for instance myvar = 10, an integer object is created and stored in memory. The variable myvar is a reference to that object: it holds the memory address where the object lives, so myvar effectively acts as a pointer. Python's reference counting system keeps track of how many references point to each object. Conceptually, the reference count of the integer object in this example is one. (In practice CPython caches and shares small integers such as 10, so the interpreter reports a larger number for them, as the demo further down shows.)

Reference counting also helps avoid unnecessary copies. Suppose we assign myvar to another variable, othervar. Instead of creating a new object with the same value, othervar simply points to the same object as myvar. The reference count therefore rises to two, because two variables now refer to the same object. If one of these variables goes out of scope or is assigned a different value, the reference count drops accordingly.
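The sharing described above can be observed directly. The following is a minimal sketch, assuming CPython. It uses a list rather than the integer 10 because CPython caches small integers, and it compares counts before and after each step instead of relying on absolute values, which vary between interpreter versions. The names myvar and othervar follow the text; baseline, shared and after_rebind are illustrative.

```python
import sys

myvar = [1, 2, 3]                 # a fresh object with a single name bound to it
baseline = sys.getrefcount(myvar)

othervar = myvar                  # no copy is made: both names refer to one object
same_object = othervar is myvar
shared = sys.getrefcount(myvar)   # one higher than baseline

othervar = None                   # rebinding othervar drops its reference
after_rebind = sys.getrefcount(myvar)

print(same_object, shared - baseline, after_rebind - baseline)
```

Working with differences sidesteps the extra reference that getrefcount() itself adds, since that extra reference is present in every measurement.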

As variables give up their references to an object, its reference count falls. When the count reaches zero, the Python memory manager treats the object as no longer needed and frees the space it occupied, so that Python can reuse it later in the program. This happens automatically as part of Python's built-in memory management. Note that reference counting only manages memory within a single Python process, and on its own it cannot reclaim objects that refer to each other in a cycle.
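One way to watch an object being released when its last reference disappears is to register a finalizer on it. This is a sketch under the assumption that the code runs on CPython, where deallocation happens as soon as the count reaches zero; the Payload class and the events list are hypothetical names used only for illustration.

```python
import weakref

class Payload:
    """Hypothetical placeholder class; instances support weak references."""

events = []

obj = Payload()
# weakref.finalize registers a callback without adding a strong reference
weakref.finalize(obj, events.append, "Payload freed")
alias = obj

del obj                       # one reference remains (alias)
freed_after_first_del = bool(events)

del alias                     # last reference gone: CPython frees the object right away
freed_after_second_del = bool(events)

print(freed_after_first_del, freed_after_second_del)
```

Other Python implementations may delay the callback, because immediate release on a zero count is a CPython behavior rather than a language guarantee.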

Python offers utilities for inspecting reference counts. The sys module provides getrefcount(), which returns the reference count of the object passed to it. Calling the function itself raises the count by one, because passing the object as an argument creates a temporary reference. To avoid that extra reference, the ctypes module can read the count directly from the object's memory address, which in CPython is the value returned by id(). This relies on CPython's internal object layout, so treat it as a debugging trick rather than a supported API.

Determining the Reference Count

sys.getrefcount(my_var) #passing my_var creates an extra reference

ctypes.c_long.from_address(address).value # passing the memory address (an integer) doesn't affect the reference count

Demo

>>> variable_1 = 10
>>> id(variable_1)
1283191892560
>>> import sys
>>> sys.getrefcount(variable_1)
64

>>> import ctypes
>>> def ref_count(address):
...     return ctypes.c_long.from_address(address).value
...

>>> ref_count(id(variable_1))
63
>>> b=variable_1
>>> ref_count(id(b))
64
>>> c=b
>>> ref_count(id(c))
65

https://youtu.be/LV6hNtoCK2Q

Garbage Collection

In the previous section, we explored reference counting: Python keeps track of how many references point to each object, and as soon as the last reference is removed, it destroys the object and reclaims the memory. However, there are cases where this mechanism does not work as expected, most notably when objects form circular references.

Suppose we have a variable named 'my_var' that points to an object, Object A. When 'my_var' is set to None, the reference count of Object A drops to 0, and Python destroys the object and reclaims the memory, which works as expected. Now consider a scenario where Object A has an instance variable that points to another object, Object B. If we remove 'my_var', the reference count of Object A becomes 0 and Object A is destroyed. That releases its reference to Object B, whose count then also drops to 0, so both objects are reclaimed. The problem appears when Object B also has an instance variable that points back to Object A, creating a circular reference. After 'my_var' is removed, Object A is still referenced by Object B and Object B is still referenced by Object A, so neither count ever reaches 0. Reference counting alone cannot clean up this memory, which results in a memory leak. Reclaiming such cycles is the job of the garbage collector.

Garbage collection can be controlled programmatically using the gc module.
By default, it is turned on.
We can turn it off if we are sure our code doesn't create any circular references.
It runs periodically on its own.
We can also call it manually to clean up leaked memory.
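The controls listed above map onto a handful of functions in the gc module. The snippet below is a small sketch of those calls, not a recommended configuration; the variable names are illustrative, and the numbers it prints depend on the interpreter and on what has been allocated so far.

```python
import gc

was_enabled = gc.isenabled()      # True unless something already disabled it
thresholds = gc.get_threshold()   # allocation thresholds that trigger automatic runs

gc.disable()                      # switch off automatic cyclic collection
disabled_now = not gc.isenabled()

unreachable = gc.collect()        # a manual run still works while disabled

if was_enabled:
    gc.enable()                   # restore the original setting

print(was_enabled, thresholds, disabled_now, unreachable)
```

gc.collect() returns the number of unreachable objects it found, which is why the manual call in the demo prints an integer.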

>>> import ctypes
>>> import gc
# Calculate the number of references to the object at a given address
>>> def ref_count(address):
...     return ctypes.c_long.from_address(address).value
...
# Search the objects tracked by the GC for a given id
>>> def object_by_id(object_id):
...     for obj in gc.get_objects():
...         if id(obj) == object_id:
...             return "Object exists"
...     return "Not found"
...
# Define two classes that we will use to create a circular reference
>>> class A:
...     def __init__(self):
...         self.b = B(self)
...         print('A: self: {0}, b:{1}'.format(hex(id(self)), hex(id(self.b))))
...
>>> class B:
...     def __init__(self, a):
...         self.a = a
...         print('B: self: {0}, a: {1}'.format(hex(id(self)), hex(id(self.a))))
...
# Turn off the GC
>>> gc.disable()
>>> my_var = A()
B: self: 0x12ac6a59610, a: 0x12ac6a14c40
A: self: 0x12ac6a14c40, b:0x12ac6a59610
#getting id of a and b
>>> a_id = id(my_var)
>>> b_id = id(my_var.b)
#We can see how many references we have for `a` and `b`:
>>> print('refcount(a) = {0}'.format(ref_count(a_id)))
refcount(a) = 2
>>> print('refcount(b) = {0}'.format(ref_count(b_id)))
refcount(b) = 1
>>> print('a: {0}'.format(object_by_id(a_id)))
a: Object exists
>>> print('b: {0}'.format(object_by_id(b_id)))
b: Object exists


# Let's remove the reference to the A instance
>>> my_var= None
>>> print('refcount(a) = {0}'.format(ref_count(a_id)))
refcount(a) = 1
>>> print('refcount(b) = {0}'.format(ref_count(b_id)))
refcount(b) = 1
>>> print('a: {0}'.format(object_by_id(a_id)))
a: Object exists
>>> print('b: {0}'.format(object_by_id(b_id)))
b: Object exists

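# Let's run the GC manually. Both objects are freed, so the ref counts printed
# afterwards are read from released memory and are no longer meaningful.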
>>> gc.collect()
76
>>> print('refcount(a) = {0}'.format(ref_count(a_id)))
refcount(a) = -1005032592
>>> print('refcount(b) = {0}'.format(ref_count(b_id)))
refcount(b) = 0
>>> print('a: {0}'.format(object_by_id(a_id)))
a: Not found
>>> print('b: {0}'.format(object_by_id(b_id)))
b: Not found
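The transcript above inspects objects through raw memory addresses, including after they have been collected. A gentler way to observe the same behavior is a weak reference, which reports whether its target is still alive without keeping it alive. This is a minimal sketch assuming CPython; Node is a hypothetical stand-in for the A and B classes, and watcher, alive_before and alive_after are illustrative names.

```python
import gc
import weakref

class Node:
    """Hypothetical stand-in for the A and B classes in the demo."""
    def __init__(self):
        self.other = None

gc.disable()                        # keep automatic collection out of the picture

a = Node()
b = Node()
a.other = b                         # a -> b
b.other = a                         # b -> a, completing the cycle
watcher = weakref.ref(a)            # a weak reference does not keep a alive

del a, b                            # no names left, but the cycle keeps both alive
alive_before = watcher() is not None

gc.collect()                        # the cyclic collector finds and frees the pair
alive_after = watcher() is not None

gc.enable()
print(alive_before, alive_after)
```

The pair survives the removal of both names and disappears only after the manual collection, matching the "Object exists" and "Not found" lines in the transcript.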

In summary, Python's memory management combines reference counting with a cyclic garbage collector. Reference counting frees most objects as soon as they become unreachable, and the garbage collector reclaims the circular references that reference counting cannot, preventing memory leaks.

Dynamic vs. Static Typing

Static typing and dynamic typing represent two distinct methods for handling variables and data types in programming languages. The following is a brief comparison between static typing and dynamic typing:

Static Typing:

In statically typed languages, you must declare the data type of a variable when it is defined.
The data type of a variable is known and checked at compile-time (before the program runs).
Once a variable's data type is defined, it typically cannot be changed during the program's execution.
Static typing often requires more explicit type annotations, which can make code more verbose but also catch type-related errors at compile-time, reducing runtime errors.
Examples of statically typed languages include C, C++, Java, and Swift.

Dynamic Typing:

In dynamically typed languages, you do not need to specify the data type of a variable explicitly when it's defined.
The data type of a variable is determined at runtime (while the program is running) based on the value assigned to it.
Variables can change their data type during runtime, providing flexibility but potentially leading to runtime errors if type mismatches occur.
Dynamic typing can make code more concise and easier to write quickly but might require thorough testing to catch type-related errors.
Examples of dynamically typed languages include Python, JavaScript, Ruby, and PHP.
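The points above can be illustrated with a few lines of Python. This is only a sketch; the variable names are arbitrary, and the failing addition is included deliberately to show that the mismatch surfaces when the line executes.

```python
value = 10
first_type = type(value).__name__     # "int"

value = "ten"                         # the same name now refers to a str
second_type = type(value).__name__    # "str"

try:
    value + 1                         # str + int is a type mismatch
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__   # detected only when this line runs

print(first_type, second_type, error_name)
```

In a statically typed language, the second assignment or the addition would typically be rejected before the program ran.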

Python's Approach:

Python is known for its dynamic typing. You don't have to declare data types explicitly when defining variables, and variables can change types during runtime.
However, Python introduced type hinting through PEP 484, allowing you to add optional type annotations to your code for better readability and to aid static analysis tools. Type hinting is a form of optional static typing.
Type hints improve readability and let static analysis tools catch errors before the code runs, but the interpreter does not enforce them, so they don't change Python's underlying dynamic typing behavior. Python remains a dynamically typed language.
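A short sketch makes the distinction concrete. The function double below is a hypothetical example; its annotations are available as metadata, yet a call that contradicts them still runs, because checking them is left to external tools such as a static type checker.

```python
from typing import get_type_hints

def double(x: int) -> int:
    """Hypothetical example function with type hints."""
    return x * 2

hints = get_type_hints(double)   # annotations are stored as metadata
number = double(21)              # matches the hints
text = double("ab")              # contradicts the hints but still runs: str * 2

print(hints, number, text)
```

A static checker such as mypy would be expected to report the double("ab") call, while the interpreter executes it without complaint.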

In summary, the choice between static typing and dynamic typing depends on the language you're using and your specific programming needs. Static typing catches type-related errors at compile-time but may require more explicit type annotations. Dynamic typing provides flexibility and often leads to more concise code but may require thorough testing to catch type-related errors at runtime. Python offers a blend of both through optional type hinting, allowing you to choose the level of static typing that suits your project.

Managing Memory and Typing in Python: Reference Counting, Garbage Collection, and Dynamic vs. Static Typing

A Comprehensive Examination of Memory Management and Typing Paradigms in Python
