Binary Flags in Python

Programming involves the art of passing arguments around while maintaining readability and utility in your code. Core to almost any application is the concept of a configuration, a series of values representing some optional settings the user might change to control how the application executes.

A simple configuration might be an arbitrary namespace like a module or class which contains a series of variables with specific meaning. Take this config.py example:

import datetime
import sys

class AppConfig:
    """ Primary application configuration """
    debug = False
    testing = False

    _run_date = datetime.datetime.utcnow()
    _version = sys.version

We define a simple class, AppConfig, with some attributes, and export it for use by the rest of the project. This is a nice implementation because it is explicit, and it's easy to change values or use inheritance to swap configuration options. The stdlib contains a variety of configuration tools providing flexibility in how these options are exposed to the end user and how they can be changed or overridden.

Inspiration

I recently ran into an issue where I wanted to provide discrete run modes to end users which might include a set of distinct options. The project is a webcrawler, and I wanted to codify certain discrete flags for these run modes:

class BaseConfig:
    """ Configuration object defining our base 'mode' """
    debug = False
    testing = False
    async_ = False  # renamed: 'async' is a reserved keyword in Python 3.7+

    str_pretty_json = False
    str_verbose = False

class PrintMode(BaseConfig):
    """ Config mode meant for verbose/pretty output """
    str_pretty_json = True
    str_verbose = True

This is manageable and represents our modes in an object-oriented fashion, but it's not the only way of achieving the same idea. I wanted a solution which would represent certain core modes and the flags which define them, without needing a fully-fledged class for each mode.

Enter binary flags. When using flags of this nature, we can define a complex mode by combining the values of its constituent flags via bitwise OR operations.

A binary flag may be defined as any integer with exactly one bit set, that is, any value of the form 2^n where n is a non-negative integer.

n value      0  1  2  3   4   5   6    7    8
flag value   1  2  4  8  16  32  64  128  256

The benefit is that each flag occupies its own bit, so combining any set of distinct flags yields a value that cannot collide with a single flag or with any other combination: all combinations of flags are unique.

>>> TEST = 1
>>> DEBUG = 2
>>> VERBOSE = 4
>>> TEST | DEBUG
3
>>> TEST | DEBUG | VERBOSE
7
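The uniqueness claim can be checked by brute force. The sketch below is only an illustration: the `flags` list and the `combined` accumulator are names made up for this example, and it assumes the first four powers of two as flag values.

```python
from itertools import combinations

# Hypothetical flag values for illustration: the first four powers of two.
flags = [2 ** n for n in range(4)]  # [1, 2, 4, 8]

# OR together every non-empty subset of flags and collect the results.
combined = []
for size in range(1, len(flags) + 1):
    for subset in combinations(flags, size):
        value = 0
        for flag in subset:
            value |= flag
        combined.append(value)

# Four flags give 2**4 - 1 = 15 non-empty subsets; if every combination
# is unique, the set of results has the same size as the list.
print(len(combined), len(set(combined)))  # 15 15
```

Every subset maps to a distinct integer between 1 and 15, which is why a single integer can safely carry an entire mode.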

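We use bitwise OR rather than addition when combining binary flags, for a very specific reason: adding a flag to itself produces a different value, while OR-ing a flag with itself leaves it unchanged.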

>>> TEST = 1
>>> TEST + TEST
2
>>> TEST | TEST
1

Bitwise OR is a logical operation which examines each bit position in its two operands and sets the resulting bit to 1 if either operand has a 1 in that position. It may help to visualize these values in binary.

FLAG      VALUE  2^3  2^2  2^1  2^0
DEBUG     2      0    0    1    0
VERBOSE   4      0    1    0    0
TEST      1      0    0    0    1

If we represent each bit as a column in a table, we can see our flags each have a 1 in only one column.

FLAG             VALUE  2^3  2^2  2^1  2^0
DEBUG            2      0    0    1    0
VERBOSE          4      0    1    0    0
DEBUG | VERBOSE  6      0    1    1    0

FLAG             VALUE  2^3  2^2  2^1  2^0
DEBUG            2      0    0    1    0
DEBUG            2      0    0    1    0
DEBUG | DEBUG    2      0    0    1    0

This mechanism means applying the same flag multiple times is idempotent and does not impact the end result.
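The same tables can be reproduced in the interpreter. This is a minimal sketch using the flag values from the tables; `bits` is a hypothetical helper introduced here purely for display.

```python
# Flag values taken from the tables above.
TEST, DEBUG, VERBOSE = 1, 2, 4


def bits(value, width=4):
    """Hypothetical helper: value as a zero-padded binary string."""
    return format(value, f'0{width}b')


print(bits(DEBUG))            # 0010
print(bits(VERBOSE))          # 0100
print(bits(DEBUG | VERBOSE))  # 0110
print(bits(DEBUG | DEBUG))    # 0010, OR-ing a flag with itself changes nothing
```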

To avoid hardcoding flags, a generator was added into a constants.py module. Because these are constants, we adopted the uppercase standard.

import sys

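def _f(n=0):
    """ Yield successive powers of two: 1, 2, 4, 8, ... """
    # sys.maxsize caps the exponent n, not the flag value itself
    while n < sys.maxsize:
        yield 2 ** n
        n += 1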

try:
    _flags = _f()
    DEBUG = next(_flags)
    VERBOSE = next(_flags)
    TESTING = next(_flags)
except StopIteration:
    raise ValueError(f'Reached maximum flag ({sys.maxsize})')

DEFAULT_MODE = 0
TEST_MODE = DEBUG | VERBOSE | TESTING

Here we generate our flags dynamically; the only hardcoded value is our default mode. This helps us define them, but doesn't tell us which flags are involved in a particular mode. A decoding generator was added to allow the individual flags to be extracted from a mode.

def decode_mode(n):
    while n:
        _current_bit = n & (~n+1)
        yield _current_bit
        n ^= _current_bit

Using the decoder, it is now possible to obtain a list of unique flags included in a particular mode.

>>> from constants import DEBUG, VERBOSE, decode_mode
>>> mode = DEBUG | VERBOSE
>>> flags = list(decode_mode(mode))
>>> DEBUG in flags
True
>>> VERBOSE in flags
True
>>> flags
[1, 2]
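When only a single flag needs checking, decoding the whole mode is not strictly necessary; a bitwise AND against the flag answers the same question. The sketch below is an alternative to the decoder, not part of the constants module: `has_flag` is a hypothetical helper, and the flag values are hardcoded to match what the generator produces in order.

```python
# Values the flag generator would assign, in order (hardcoded for the example).
DEBUG, VERBOSE, TESTING = 1, 2, 4

mode = DEBUG | VERBOSE


def has_flag(mode, flag):
    """Hypothetical helper: True if every bit of flag is set in mode."""
    return (mode & flag) == flag


print(has_flag(mode, DEBUG))    # True
print(has_flag(mode, TESTING))  # False
```

Comparing against `flag` rather than relying on truthiness means the helper also works when `flag` is itself a combination of several flags.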

The constants file was expanded to represent some basic flags and modes with the ability to easily expand upon them in the future.

"""
.. module:: pokeycrawl.application.constants
    :synopsis: Constants for configuration and run modes

Integer values appropriate for bitwise operations
Series:
1,2,4,8,16,32,64,128,256,512,1024,2048,4096,8192,16384,32768
"""

import sys

# Internal counter
__N = 0
# Setting a soft max for size
__MAX_FLAG = sys.maxsize


def _f():
    """ Generator which yields a series of values appropriate for flags """
    global __N
    while __N < __MAX_FLAG:
        yield 2 ** __N
        __N += 1


def decode_mode(n):
    """
    Generator yielding individual bit values from a binary flag
    Use to split binary flags which have been bitwise or'ed
    """

    while n:
        _current_bit = n & (~n+1)
        yield _current_bit
        n ^= _current_bit


# Fetch our flag generator
__flags = _f()

# ** Define flags by accessing the next value from our __flags generator
try:
    # App runlevel flags
    DEBUG = next(__flags)
    TESTING = next(__flags)
    ASYNC = next(__flags)
    # App action flags
    FOLLOW_LINKS = next(__flags)     #: Follow links and visit each unique target
    COLLECT_LINKS = next(__flags)    #: Aggregated link list is written to .<domain>.links.txt
    SAVE_INDEX = next(__flags)       #: Index is written to .<domain>.index.txt
    HONOR_ROBOTS = next(__flags)     #: Honor robots.txt (disable for testing, recommended ON)
    # String settings
    STR_PRETTY_JSON = next(__flags)  #: Serialized output should be json pretty-printed
    STR_VERBOSE = next(__flags)      #: Flag indicating verbose output should be used where applicable
except StopIteration:
    raise ValueError('Reached maximum flag value')


# ** Modes are calculated by or'ing their constituent flags
MODE_PRINT = ASYNC | STR_VERBOSE
MODE_DEBUG = ASYNC | STR_VERBOSE | DEBUG
MODE_TEST = ASYNC | STR_VERBOSE | DEBUG | TESTING
MODE_PRETTY = MODE_PRINT | STR_PRETTY_JSON
# Run modes
MODE_LINKS_ONLY = HONOR_ROBOTS | FOLLOW_LINKS
MODE_COLLECT = HONOR_ROBOTS | FOLLOW_LINKS | COLLECT_LINKS
MODE_INDEX = HONOR_ROBOTS | FOLLOW_LINKS | SAVE_INDEX
MODE_FULL = HONOR_ROBOTS | FOLLOW_LINKS | COLLECT_LINKS | SAVE_INDEX
# Modes may also be or'ed
MODE_PRETTY_FULL = MODE_FULL | MODE_PRETTY

# Set a global default mode
DEFAULT_MODE = MODE_PRETTY_FULL

# Collections
FLAGS = ['DEBUG', 'TESTING', 'ASYNC', 'FOLLOW_LINKS', 'COLLECT_LINKS', 'SAVE_INDEX', 'HONOR_ROBOTS',
         'STR_PRETTY_JSON', 'STR_VERBOSE']
MODES = ['MODE_PRINT', 'MODE_DEBUG', 'MODE_TEST', 'MODE_LINKS_ONLY', 'MODE_COLLECT', 'MODE_INDEX',
         'MODE_FULL', 'DEFAULT_MODE', 'MODE_PRETTY', 'MODE_PRETTY_FULL']

__all__ = FLAGS + MODES + ['FLAGS', 'MODES', 'decode_mode']
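Since FLAGS holds names rather than values, one possible use is turning a mode back into readable flag names, for example in log output. This is a rough sketch under stated assumptions: `FLAG_VALUES` and `flag_names` are hypothetical and not part of the module above, and the values are hardcoded stand-ins for the first four flags in the order the generator assigns them.

```python
# Stand-in values for the first four flags, in generator order (assumption).
FLAG_VALUES = {'DEBUG': 1, 'TESTING': 2, 'ASYNC': 4, 'FOLLOW_LINKS': 8}


def flag_names(mode, flag_values=FLAG_VALUES):
    """Hypothetical helper: names of the flags set in mode."""
    return [name for name, value in flag_values.items() if mode & value]


MODE = FLAG_VALUES['DEBUG'] | FLAG_VALUES['ASYNC']
print(flag_names(MODE))  # ['DEBUG', 'ASYNC']
print(flag_names(0))     # []
```

Inside the real module, the mapping could presumably be built from the FLAGS list with something like `{name: globals()[name] for name in FLAGS}`, so the names and values cannot drift apart.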