Python metaprogramming Tutorial

Metaprogramming, by analogy with metadata (data about data), is the creation of programs that manipulate other programs. It is widely assumed that metaprograms are programs that generate other programs. However, the paradigm is much broader. Metaprogramming encompasses all programs that are designed to read, analyze, transform, or modify themselves or other programs. Here are a few examples:

  1. Domain-specific languages (DSLs)
  2. Parsers
  3. Interpreters
  4. Compilers
  5. Theorem provers
  6. Term rewriters

This tutorial delves into Python metaprogramming. It refreshes your Python knowledge by going over the relevant features of the language so that you can better understand the concepts that follow. It also explains how type in Python does more than just return an object’s class. The tutorial then goes over metaprogramming in Python and how it can be used to simplify certain tasks.

A little introspection

If you’ve been programming in Python for a while, you’re probably aware that everything is an object and that classes create objects. But, if everything is an object (including classes), who creates those classes? This is the exact question I answer.

Let us check to see if the preceding statements are correct:

>>> class SomeClass:
...     pass
>>> some_object = SomeClass()
>>> type(some_object)
<class '__main__.SomeClass'>

So, calling the type function on an object returns the class of that object.

>>> import inspect
>>> inspect.isclass(SomeClass)
True
>>> inspect.isclass(some_object)
False
>>> inspect.isclass(type(some_object))
True

inspect.isclass returns True if a class is passed to it and False otherwise. Because some_object is not a class (it’s an instance of class SomeClass), it returns False. And because type(some_object) returns the class of some_object, inspect.isclass(type(some_object)) returns True:

>>> type(SomeClass)
<class 'type'>
>>> inspect.isclass(type(SomeClass))
True

type is the class that every class in Python 3 is an instance of by default (old-style classes in Python 2 reported classobj here instead). Everything makes sense now. But what about type itself? Let’s spice things up:

>>> type(type(SomeClass))
<class 'type'>
>>> inspect.isclass(type(type(SomeClass)))
True
>>> type(type(type(SomeClass)))
<class 'type'>
>>> inspect.isclass(type(type(type(SomeClass))))
True

Inception, you say? It turns out that the very first statement (that everything is an object) needs one refinement. Here’s a more accurate statement:

Everything in Python is an object, and every object is an instance of a class or of a metaclass. The one special case is type, which is an instance of itself.

To verify this:

>>> some_obj = SomeClass()
>>> isinstance(some_obj, SomeClass)
True
>>> isinstance(SomeClass, type)
True

So we now know that an instance is an instantiation of a class and that a class is an instance of a metaclass.
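
The whole chain can be checked in a few lines. This is a minimal sketch for Python 3; the names Widget and widget are placeholders introduced here for illustration only.

```python
class Widget:          # hypothetical class, used only for this check
    pass

widget = Widget()

# An instance is created by its class...
assert type(widget) is Widget
# ...a class is created by its metaclass (type, by default)...
assert type(Widget) is type
# ...and type is an instance of itself, which is where the chain stops.
assert type(type) is type

# type is still an object like everything else.
assert isinstance(type, object)
assert issubclass(type, object)
```

If any of these relationships did not hold, the corresponding assert would raise an AssertionError.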

type is not what we believe it to be

type is itself a class, and it is its own type. It’s a metaclass. A metaclass instantiates and defines behavior for a class in the same way that a class instantiates and defines behavior for an instance.

Python’s built-in metaclass is type. In Python, we can define a custom metaclass by inheriting from the type metaclass to change the behavior of classes (such as the behavior of SomeClass). Metaclasses enable metaprogramming in Python.
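
To make the class/metaclass parallel concrete, here is a small sketch. LoudMeta, Plain, and Fancy are hypothetical names chosen for this illustration: a class controls how its instances are displayed, and a metaclass controls how its classes are displayed.

```python
class LoudMeta(type):
    """Hypothetical metaclass: changes how its classes are displayed."""
    def __repr__(cls):
        return "<LoudMeta class {}>".format(cls.__name__)

class Plain:
    def __repr__(self):
        return "<Plain instance>"

class Fancy(metaclass=LoudMeta):
    pass

print(repr(Plain()))   # behavior defined by the class Plain
print(repr(Fancy))     # behavior defined by the metaclass LoudMeta
```

The second print shows the text produced by LoudMeta.__repr__, because Fancy is an instance of LoudMeta in the same way that Plain() is an instance of Plain.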

Here’s what happens when you define a class

Let us first review what we already know. A Python program’s fundamental building blocks are:

  1. Statements
  2. Functions
  3. Classes

In a program, statements do the actual work. Statements can execute at global scope (module level) or local scope (enclosed within a function). Functions are the fundamental units of code, made up of one or more statements arranged in a specific order to accomplish a specific task. Functions can be defined at the module level or as methods of a class. Classes make object-oriented programming possible. They specify how objects will be instantiated and what their properties (attributes and methods) are.

The namespaces of classes are layered as dictionaries. For example:

 
>>> class SomeClass:
...     class_var = 1
...     def __init__(self):
...         self.some_var = 'Some value'

>>> SomeClass.__dict__
{'__doc__': None,
 '__init__': <function __main__.__init__>,
 '__module__': '__main__',
 'class_var': 1}

>>> s = SomeClass()

>>> s.__dict__
{'some_var': 'Some value'}
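
The layering matters for attribute lookup: Python checks the instance namespace first and then falls back to the class namespace. The sketch below illustrates this with a hypothetical Config class (the class and its attribute names are assumptions made for the example).

```python
class Config:                      # hypothetical example class
    retries = 3                    # stored in Config.__dict__

    def __init__(self):
        self.host = "localhost"    # stored in the instance's __dict__

cfg = Config()

# The instance namespace holds only what was set on the instance.
assert cfg.__dict__ == {"host": "localhost"}
assert "retries" in Config.__dict__

# Lookup falls through from the instance to the class.
assert cfg.retries == 3

# Assigning on the instance shadows the class attribute without changing it.
cfg.retries = 5
assert cfg.__dict__["retries"] == 5
assert Config.retries == 3
```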

When the keyword class is encountered, the following occurs:

  1. The body (statements and functions) of the class is isolated.
  2. The metaclass is identified from the base classes or the metaclass hook (explained later) of the class to be created.
  3. The namespace dictionary of the class is created (but not populated yet).
  4. The body of the class executes, populating the namespace dictionary with all of the attributes, the methods defined, and some additional useful info about the class.
  5. The metaclass is then called with the name, bases, and attributes of the class to instantiate it.
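
The steps above can be observed with a small tracing metaclass. This is a sketch for Python 3 only; TracingMeta, Traced, and the events list are hypothetical names, and __prepare__ is the hook Python 3 calls to create the namespace dictionary.

```python
events = []   # records the order in which the hooks run

class TracingMeta(type):
    """Hypothetical metaclass that logs each stage of class creation."""

    @classmethod
    def __prepare__(mcs, name, bases, **kwargs):
        events.append("namespace created")
        return {}

    def __new__(mcs, name, bases, namespace, **kwargs):
        # Only list the attributes written in the class body itself.
        events.append("metaclass called with " + ", ".join(
            key for key in namespace if not key.startswith("__")))
        return super().__new__(mcs, name, bases, namespace)

class Traced(metaclass=TracingMeta):
    events.append("body executing")   # runs while the body is executed
    answer = 42

print(events)
```

The list shows the namespace being created first, then the body running, and only then the metaclass being called with the populated namespace.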

You can also use type to create classes in Python because it is the default metaclass.

The other side of type

When called with a single argument, type returns the type of an existing object. When called with three arguments, type creates a new class object. The three arguments are the name of the class, a tuple of base classes, and a dictionary containing the namespace for the class (all the fields and methods).

So:

>>> class SomeClass: pass

Is equivalent to:

>>> SomeClass = type('SomeClass', (), {})

And:

class ParentClass:
    pass

class SomeClass(ParentClass):
    some_var = 5
    def some_function(self):
        print("Hello!")

Is effectively equivalent to:

def some_function(self):
    print("Hello!")

ParentClass = type('ParentClass', (), {})
# the base classes must be passed as a tuple
SomeClass = type('SomeClass',
                 (ParentClass,),
                 {'some_function': some_function,
                  'some_var': 5})
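
To check the three-argument form in practice, here is a minimal sketch. The names Base, Dynamic, and greet are placeholders introduced for this example; it also shows that the base classes have to be given as a tuple.

```python
def greet(self):
    return "Hello from " + type(self).__name__

# Build two classes without using the class keyword.
Base = type("Base", (), {})
Dynamic = type("Dynamic", (Base,), {"greet": greet, "some_var": 5})

obj = Dynamic()
assert isinstance(obj, Base)
assert Dynamic.some_var == 5
print(obj.greet())

# Passing the bases as a list instead of a tuple is rejected.
try:
    type("Broken", [Base], {})
except TypeError as exc:
    print("bases must be a tuple:", exc)
```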

So, by using our custom metaclass instead of type, we can inject behavior into the classes that otherwise would not be possible. But, before we get into implementing metaclasses for changing behavior, let’s look at some other ways to metaprogram in Python.

Decorators: A common example of metaprogramming in Python

Decorators are used to alter the behavior of a function or class. Decorators are used in the following ways:

@some_decorator
def some_func(*args, **kwargs):
    pass

@some_decorator is just syntactic sugar indicating that some_func is wrapped by another function, some_decorator. We know that functions, as well as classes, are objects in Python, which means they can be:

  1. Assigned to a variable
  2. Copied
  3. Passed as parameters to other functions

The previous syntax is effectively equivalent to:

some_func = some_decorator(some_func)

You might be wondering how some_decorator is defined:

def some_decorator(f):
    """
    The decorator receives a function as a parameter.
    """
    def wrapper(*args, **kwargs):
        #doing something before calling the function
        result = f(*args, **kwargs)
        #doing something after the function is called
        return result
    return wrapper
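
As a runnable illustration of this pattern, the sketch below uses a hypothetical log_calls decorator that records what happens before and after each call. It also applies the decorator both with the @ syntax and by plain reassignment to show that the two forms behave the same.

```python
calls = []   # hypothetical log, used here only to make the order visible

def log_calls(f):
    def wrapper(*args, **kwargs):
        calls.append("before " + f.__name__)
        result = f(*args, **kwargs)
        calls.append("after " + f.__name__)
        return result
    return wrapper

@log_calls
def add(a, b):
    return a + b

def multiply(a, b):
    return a * b

multiply = log_calls(multiply)   # same effect as the @ syntax

assert add(2, 3) == 5
assert multiply(2, 3) == 6
print(calls)
```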

Assume we have a function that scrapes data from a URL. The server from which we are retrieving the data has a throttling mechanism that kicks in if it detects a high volume of requests from the same IP address at regular intervals. So, in order to fool the server, we’d like our scraper to wait for some random amount of time before submitting the request. Is it possible to create a decorator that does this? Let’s take a look:

from functools import wraps
import random
import time

def wait_random(min_wait=1, max_wait=30):
    def inner_function(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            time.sleep(random.randint(min_wait, max_wait))
            return func(*args, **kwargs)

        return wrapper

    return inner_function

@wait_random(10, 15)
def function_to_scrape():
    #some scraping stuff
    pass

The inner_function and the @wraps decorator might be new to you. If you look carefully, inner_function is analogous to the some_decorator we defined earlier. The outer layer of wrapping in wait_random enables us to pass parameters (min_wait and max_wait) to the decorator as well. @wraps is a handy decorator that copies the metadata of func (like its name, docstring, and function attributes) onto wrapper. If we don’t use it, we will not get useful results from calls like help(func), because in that case they return the docstring and information of wrapper instead of func.
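
The effect of @wraps is easy to see side by side. In this sketch, without_wraps, with_wraps, fetch_a, and fetch_b are hypothetical names used only for the comparison.

```python
from functools import wraps

def without_wraps(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

def with_wraps(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@without_wraps
def fetch_a():
    """Fetch page A."""

@with_wraps
def fetch_b():
    """Fetch page B."""

print(fetch_a.__name__, fetch_a.__doc__)   # metadata of wrapper
print(fetch_b.__name__, fetch_b.__doc__)   # metadata of fetch_b
```

Without @wraps the decorated function reports the name wrapper and has no docstring; with @wraps it keeps its own name and docstring.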

But suppose we have a scraper class with several such functions:

class Scraper:
    def func_to_scrape_1(self):
        #some scraping stuff
        pass
    def func_to_scrape_2(self):
        #some scraping stuff
        pass
    def func_to_scrape_3(self):
        #some scraping stuff
        pass

One option is to wrap all the functions with @wait_random individually. But we can do better: We can create a class decorator. The plan is to traverse the class namespace, identify the functions, and wrap them in our decorator.

def classwrapper(cls):
    # iterate over a copy, because setattr modifies the class namespace
    for name, val in list(vars(cls).items()):
        # callable returns True if the argument is callable,
        # i.e. implements __call__
        if callable(val):
            # instead of val, wrap it with our decorator
            setattr(cls, name, wait_random()(val))
    return cls

You can now use @classwrapper to wrap the entire class. But what if there are multiple scraper classes or scraper subclasses? You can either use @classwrapper on them individually or create a metaclass in this case.
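
The sketch below shows a class decorator of this kind in action, and why subclasses are a problem for it. To keep the example fast, it swaps the sleeping decorator for a hypothetical record_call decorator that only logs the call; wrap_methods and SubScraper are likewise names assumed for the illustration.

```python
from functools import wraps

call_log = []

def record_call(func):
    """Hypothetical stand-in for wait_random()(func) that does not sleep."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        call_log.append(func.__name__)
        return func(*args, **kwargs)
    return wrapper

def wrap_methods(cls):
    # Copy the items first so the namespace is not mutated while iterating.
    for name, val in list(vars(cls).items()):
        if callable(val) and not name.startswith("__"):
            setattr(cls, name, record_call(val))
    return cls

@wrap_methods
class Scraper:
    def func_to_scrape_1(self):
        return "page 1"

    def func_to_scrape_2(self):
        return "page 2"

class SubScraper(Scraper):
    def func_to_scrape_3(self):   # defined in an undecorated subclass
        return "page 3"

s = SubScraper()
s.func_to_scrape_1()
s.func_to_scrape_3()
print(call_log)
```

Only the inherited method shows up in the log: the decorator ran once, on Scraper, so methods added by an undecorated subclass are not wrapped.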

Metaclasses

There are two steps to creating a custom metaclass:

  1. Write a subclass of the metaclass type.
  2. Insert the new metaclass into the class creation process using the metaclass hook.

We subclass the type class and override magic methods like __init__, __new__, __prepare__, and __call__ to modify the behavior of classes while they are being created. These methods receive information such as the base classes, the name of the class, and its attributes with their values. In Python 2, the metaclass hook is a static field in the class called __metaclass__. In Python 3, you specify the metaclass with the metaclass keyword argument in the base-class list of a class.

>>> class CustomMetaClass(type):
...     def __init__(cls, name, bases, attrs):
...         for attr_name, value in attrs.items():
...             # do some stuff
...             print('{} :{}'.format(attr_name, value))
>>> class SomeClass:
...     # the Python 2.x way
...     __metaclass__ = CustomMetaClass
...     class_attribute = "Some string"
__module__ :__main__
__metaclass__ :<class '__main__.CustomMetaClass'>
class_attribute :Some string
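
For comparison, here is a sketch of the same idea using the Python 3 hook. Instead of printing, it collects the attributes in a hypothetical seen list and skips the dunder entries, whose exact set varies between Python versions.

```python
seen = []   # hypothetical list collecting what the metaclass observes

class CustomMetaClass(type):
    def __init__(cls, name, bases, attrs):
        super().__init__(name, bases, attrs)
        for attr_name, value in attrs.items():
            # Skip entries such as __module__ and __qualname__.
            if not attr_name.startswith("__"):
                seen.append((attr_name, value))

class SomeClass(metaclass=CustomMetaClass):   # the Python 3 way
    class_attribute = "Some string"

print(seen)
```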

The attributes get printed automatically due to the print statement in the __init__ method of our CustomMetaClass. Let’s imagine you have an annoying collaborator on your Python project who prefers using camelCase for naming class attributes and methods. You know it’s bad, and the collaborator should be using snake_case (after all, it’s Python!). Can we write a metaclass to change all those camelCase attributes to snake_case?

import re

def camel_to_snake(name):
    """
    A function that converts camelCase to snake_case.
    Referred from: https://stackoverflow.com/questions/1175208/elegant-python-function-to-convert-camelcase-to-snake-case
    """
    s1 = re.sub('(.)([A-Z][a-z]+)', r'\1_\2', name)
    return re.sub('([a-z0-9])([A-Z])', r'\1_\2', s1).lower()

class SnakeCaseMetaclass(type):
    def __new__(snakecase_metaclass, future_class_name,
                future_class_parents, future_class_attr):
        snakecase_attrs = {}
        for name, val in future_class_attr.items():
            snakecase_attrs[camel_to_snake(name)] = val
        # go through super().__new__ so the new class keeps
        # SnakeCaseMetaclass as its metaclass and subclasses inherit it
        return super().__new__(snakecase_metaclass, future_class_name,
                               future_class_parents, snakecase_attrs)

You might have wondered why we use __new__ instead of __init__ here. __new__ is actually the first step in creating an instance. It is responsible for returning a new instance of your class. __init__, on the other hand, doesn’t return anything. It’s only responsible for initializing the instance after it’s been created. A simple rule of thumb to remember: Use __new__ when you need to control the creation of a new instance; use __init__ when you need to control the initialization of a new instance.

You won’t often see __init__ being implemented in a metaclass because it’s not that powerful: the class is already constructed by the time __init__ is called. You can think of it as a class decorator, with the difference that a metaclass __init__ also runs when subclasses are created, while class decorators are not called for subclasses.
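
The difference between a metaclass __init__ and a class decorator can be sketched as follows. NotifyMeta, notify, and the two lists are hypothetical names introduced for this comparison.

```python
registered_by_meta = []
registered_by_decorator = []

class NotifyMeta(type):
    def __init__(cls, name, bases, attrs):
        super().__init__(name, bases, attrs)
        registered_by_meta.append(name)     # the class already exists here

def notify(cls):
    registered_by_decorator.append(cls.__name__)
    return cls

@notify
class Base(metaclass=NotifyMeta):
    pass

class Child(Base):      # no decorator; the metaclass is inherited from Base
    pass

print(registered_by_meta)
print(registered_by_decorator)
```

The metaclass sees both Base and Child, while the decorator only sees Base, the one class it was applied to.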

Because our task involved creating a new instance (preventing those camelCase attributes from creeping into the class), we override the __new__ method in our custom SnakeCaseMetaclass. Let’s confirm that it works:


>>> class SomeClass(metaclass=SnakeCaseMetaclass):
...     camelCaseVar = 5
>>> SomeClass.camelCaseVar
AttributeError: type object 'SomeClass' has no attribute 'camelCaseVar'
>>> SomeClass.camel_case_var
5

It works! You now understand how to create and use a metaclass in Python. Let’s see what we can do with this.

Using metaclasses in Python

Metaclasses can be used to impose different rules on attributes, methods, and their values. Similar examples to the preceding example (using snake_case) include:

  1. Domain restriction of values
  2. Implicit conversion of values to custom classes (you might want to hide all of these complexities from users writing the class)
  3. Enforcing different naming conventions and style guidelines (like “every method should have a docstring”)
  4. Adding new attributes to a class

The main reason for using metaclasses instead of defining all of this logic in the class definitions is to avoid code repetition throughout the codebase.
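
As one example of such a rule, the docstring guideline mentioned above could be enforced along these lines. This is only a sketch under assumptions: DocstringMeta and the two classes are hypothetical, and the rule is applied to public methods only.

```python
import inspect

class DocstringMeta(type):
    """Hypothetical metaclass: every public method must have a docstring."""
    def __new__(mcs, name, bases, attrs):
        for attr_name, value in attrs.items():
            # Ignore private names and anything that is not a plain function.
            if attr_name.startswith("_") or not inspect.isfunction(value):
                continue
            if not value.__doc__:
                raise TypeError(
                    "{}.{} needs a docstring".format(name, attr_name))
        return super().__new__(mcs, name, bases, attrs)

class Documented(metaclass=DocstringMeta):
    def run(self):
        """Run the task."""

try:
    class Undocumented(Documented):
        def run(self):
            pass
except TypeError as exc:
    error = str(exc)

print(error)
```

Because the check lives in the metaclass, it applies to Documented and to every subclass without any extra code in those classes; the offending class is rejected at definition time.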

Real-world uses of metaclasses

Metaclasses solve a practical problem of code redundancy (Don’t Repeat Yourself — DRY) because they are inherited among subclasses. Metaclasses also aid in the abstraction of complex class creation logic, typically by performing extra actions or adding extra code while class objects are produced. Here are a few examples of real-world metaclass applications:

  1. Abstract base classes
  2. Registration of classes
  3. Creating APIs in libraries and frameworks

Let’s look at some examples of each of them.

Abstract base classes

Abstract base classes are classes that can only be inherited from and not instantiated. Python provides them through the abc module:

from abc import ABCMeta, abstractmethod

class Vehicle(metaclass=ABCMeta):

    @abstractmethod
    def refill_tank(self, litres):
        pass

    @abstractmethod
    def move_ahead(self):
        pass

Let’s make a Truck class that inherits from the Vehicle class:

class Truck(Vehicle):
    def __init__(self, company, color, wheels):
        self.company = company
        self.color = color
        self.wheels = wheels

It should be noted that we have not implemented the abstract methods. Let’s see what happens if we try to instantiate a Truck class object:

>>> mini_truck = Truck("Tesla Roadster", "Black", 4)

TypeError: Can't instantiate abstract class Truck with abstract methods move_ahead, refill_tank

This can be fixed by defining both abstract methods in our Truck class:

class Truck(Vehicle):
    def __init__(self, company, color, wheels):
        self.company = company
        self.color = color
        self.wheels = wheels

    def refill_tank(self, litres):
        pass

    def move_ahead(self):
        pass

>>> mini_truck = Truck("Tesla Roadster", "Black", 4)
>>> mini_truck
<__main__.Truck object at 0x7f881ca1d828>
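
The same mechanism can be tried out in a self-contained form. The sketch below uses abc.ABC, a helper base class from the standard library whose metaclass is ABCMeta; Shape and Square are hypothetical classes chosen for the example.

```python
from abc import ABC, ABCMeta, abstractmethod

class Shape(ABC):                 # equivalent to metaclass=ABCMeta
    @abstractmethod
    def area(self):
        ...

class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2

# Instantiating the abstract class is refused.
try:
    Shape()
    instantiated = True
except TypeError:
    instantiated = False

print(instantiated)
print(type(Shape) is ABCMeta)
print(Square(3).area())
```

ABCMeta keeps track of the methods that are still abstract in Shape.__abstractmethods__ and blocks instantiation until a subclass has overridden all of them.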

Registration of classes

Let’s look at a server with multiple file handlers to see how this works. The goal is to be able to quickly find the appropriate handler class based on the file format. We’ll make a handlers dictionary and have our CustomMetaclass register various handlers found in the code:

handlers = {}

class CustomMetaclass(type):
    def __new__(meta, name, bases, attrs):
        cls = type.__new__(meta, name, bases, attrs)
        # register the class under every format it declares
        for ext in attrs.get("formats", ()):
            handlers[ext] = cls
        return cls

class Handler(metaclass=CustomMetaclass):
    formats = ()
    # common stuff for all kinds of file format handlers

class ImageHandler(Handler):
    formats = "jpeg", "png"

class AudioHandler(Handler):
    formats = "mp3", "wav"

>>> handlers
{'mp3': __main__.AudioHandler,
 'jpeg': __main__.ImageHandler,
 'png': __main__.ImageHandler,
 'wav': __main__.AudioHandler}

Based on the file format, we can easily determine which handler class to use. In general, you can use metaclasses whenever you need to maintain some sort of data structure storing class characteristics.
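
A lookup on top of such a registry might look like the following sketch. It repeats a registry so that it runs on its own; RegisteringMeta and the handler_for helper are hypothetical names, and matching on the lowercased file extension is an assumption made for the example.

```python
import os

handlers = {}

class RegisteringMeta(type):
    def __new__(meta, name, bases, attrs):
        cls = super().__new__(meta, name, bases, attrs)
        # Register the class under every format it declares.
        for ext in attrs.get("formats", ()):
            handlers[ext] = cls
        return cls

class Handler(metaclass=RegisteringMeta):
    formats = ()

class ImageHandler(Handler):
    formats = ("jpeg", "png")

class AudioHandler(Handler):
    formats = ("mp3", "wav")

def handler_for(filename):
    """Hypothetical helper: pick a handler class from the file extension."""
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    return handlers[ext]

print(handler_for("cover.PNG").__name__)
print(handler_for("track.mp3").__name__)
```

New handler classes become available to handler_for as soon as they are defined, with no separate registration call.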

Creating APIs

Metaclasses are widely used in frameworks and libraries due to their ability to prevent logic redundancy among subclasses and to hide custom class creation logic that users do not need to know. This opens up interesting possibilities for reducing boilerplate and creating a more appealing API. Consider the following snippet of Django’s ORM usage:


>>> from django.db import models
>>> class Vehicle(models.Model):
...    color = models.CharField(max_length=10)
...    wheels = models.IntegerField()

Here, we create a Vehicle class that inherits from models.Model, the model base class of the Django package. Within the class, we define a couple of fields (color and wheels) to represent vehicle characteristics. Let us now attempt to instantiate an object of the class we just created.


>>> four_wheeler = Vehicle(color="Blue", wheels="Four")
#Raises an error once the value is validated or saved
>>> four_wheeler = Vehicle(color="Blue", wheels=4)
>>> four_wheeler.wheels
4

As users creating a model for a Vehicle, we simply had to inherit from the models.Model class and write some high-level statements. The rest of the work (such as creating a database hook, raising an error for invalid values, and returning an int type instead of models.IntegerField) was done behind the scenes by the models.Model class and the metaclass it employs.
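
To give a feel for how a metaclass can support this kind of declarative API, here is a heavily simplified sketch. It is not Django’s implementation: Field, ModelMeta, and Model are hypothetical classes written for this illustration, there is no database involved, and validation happens at assignment time purely to keep the example short.

```python
class Field:
    """Hypothetical field: validates the type of assigned values."""
    def __init__(self, expected_type):
        self.expected_type = expected_type

    def __set_name__(self, owner, name):
        self.name = name            # called automatically at class creation

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        if not isinstance(value, self.expected_type):
            raise TypeError("{} must be {}".format(
                self.name, self.expected_type.__name__))
        instance.__dict__[self.name] = value

class ModelMeta(type):
    def __new__(mcs, name, bases, attrs):
        cls = super().__new__(mcs, name, bases, attrs)
        # Collect the declared fields so __init__ can accept them as keywords.
        cls._fields = [k for k, v in attrs.items() if isinstance(v, Field)]
        return cls

class Model(metaclass=ModelMeta):
    def __init__(self, **kwargs):
        for field_name in self._fields:
            setattr(self, field_name, kwargs[field_name])

class Vehicle(Model):
    color = Field(str)
    wheels = Field(int)

four_wheeler = Vehicle(color="Blue", wheels=4)
print(four_wheeler.wheels)

try:
    Vehicle(color="Blue", wheels="Four")
except TypeError as exc:
    message = str(exc)
    print(message)
```

The user-facing Vehicle class contains only field declarations; the bookkeeping lives in ModelMeta and Field, which is the kind of separation the metaclass approach makes possible.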

Conclusion

In this tutorial, you learned about the relationship between Python instances, classes, and metaclasses. You learned about metaprogramming, a technique for manipulating code. We talked about function and class decorators as a way to inject custom behavior into classes and methods. Then we subclassed Python’s default type metaclass to implement our custom metaclasses. Finally, we looked at some real-world applications of metaclasses. The use of metaclasses is hotly debated on the internet, but it should now be much easier for you to analyze and answer whether some problem is better solved using metaprogramming.
