Feed: Nikita Sobolev
Entries found: 10
Python ParamSpec guide
Published: 2021-12-31T00:00:00+00:00
Updated: 2021-12-31T00:00:00+00:00
UTC: 2021-12-31 00:00:00+00:00
URL: https://sobolevn.me/2021/12/paramspec-guide

Content Preview
Before `ParamSpec` (PEP 612) was released in Python 3.10 and `typing_extensions`, there was a big problem with typing decorators that change a function's signature.

Let's start with a basic example: how can one type a decorator function that does not change anything?
```python
from typing import Callable, TypeVar

C = TypeVar('C', bound=Callable)

def logger(function: C) -> C:
    def decorator(*args, **kwargs):
        print('Function called!')
        return function(*args, **kwargs)
    return decorator
```

Notice the most important part here:
```python
C = TypeVar('C', bound=Callable)
```

What does it mean? It means that we take any callable in and return the exact same callable.
This allows you to decorate any function and preserve its signature:
```python
@logger
def example(arg: int, other: str) -> tuple[int, str]:
    return arg, other

reveal_type(example)
# (arg: int, other: str) -> tuple[int, str]
```

But there's a problem when a function does want to change something. Imagine that some decorator might also add
`None` as a return value in some cases:

```python
def catch_exception(function):
    def decorator(*args, **kwargs):
        try:
            return function(*args, **kwargs)
        except Exception:
            return None
    return decorator
```

This is perfectly valid Python code. But how can we type it? Note that we cannot use
`TypeVar('C', bound=Callable)` anymore, since we are changing the return type now.

Initially, I tried something like:
```python
def catch_exception(function: Callable[..., T]) -> Callable[..., Optional[T]]:
    ...
```

But this means a different thing: it turns all of the function's arguments into
`*args: Any, **kwargs: Any`, but the return type will be correct. Generally, this is not what we need when it comes to type safety.
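To see the erasure concretely, here is a small sketch (the `unsafe_div` example is mine, for illustration): with `Callable[..., T]`, mypy checks the return type but accepts any arguments at the call site:

```python
from typing import Callable, Optional, TypeVar

T = TypeVar('T')

def catch_exception(function: Callable[..., T]) -> Callable[..., Optional[T]]:
    def decorator(*args, **kwargs):
        try:
            return function(*args, **kwargs)
        except Exception:
            return None
    return decorator

@catch_exception
def unsafe_div(arg: int) -> float:
    return arg / arg

unsafe_div('wrong', extra=1)  # mypy is silent: the arguments were erased to `Any`
# reveal_type(unsafe_div(1)) would show `Optional[float]` -- only the return type survived
```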
The second way to do this in a type-safe manner is to add a custom Mypy plugin. Here's our example from `dry-python/returns` to support decorators that change return types. But plugins are quite hard to write (you need to learn a bit of Mypy's API), they are not universal (for example, Pyright does not understand Mypy plugins), and they require the end user to install them explicitly.

That's why
`ParamSpec` was added. Here's how it can be used in this case:

```python
from typing import Callable, TypeVar, Optional
from typing_extensions import ParamSpec  # or `typing` for `python>=3.10`

T = TypeVar('T')
P = ParamSpec('P')

def catch_exception(function: Callable[P, T]) -> Callable[P, Optional[T]]:
    def decorator(*args: P.args, **kwargs: P.kwargs) -> Optional[T]:
        try:
            return function(*args, **kwargs)
        except Exception:
            return None
    return decorator
```

Now all decorated functions will preserve their argument types and change their return type to include
`None`:

```python
@catch_exception
def div(arg: int) -> float:
    return arg / arg

reveal_type(div)
# (arg: int) -> Optional[float]

@catch_exception
def plus(arg: int, other: int) -> int:
    return arg + other

reveal_type(plus)
# (arg: int, other: int) -> Optional[int]
```

The recent release of Mypy 0.930 with
`ParamSpec` support allowed us to remove our custom Mypy plugin and use a well-defined primitive. Here's a commit to show how easy our transition was. It was even released today in `returns@0.18.0`, check it out!

What's next? Concatenate
But, that’s not all! Because some decorators modify argument types, PEP612 also adds the
`Concatenate` type that allows prepending, appending, transforming, or removing function arguments.

Unfortunately, Mypy does not support
`Concatenate` just yet, but I can show you some examples from the PEP itself. Here's how it is going to work.

Let's start with some basic definitions:
```python
from typing_extensions import ParamSpec, Concatenate  # or `typing` for `python>=3.10`

P = ParamSpec('P')

def bar(x: int, *args: bool) -> int:
    ...
```

We are going to change the type of the
`bar` function with the help of the `P` parameter specification. First, let's prepend a `str` argument to this function:

```python
def add(x: Callable[P, int]) -> Callable[Concatenate[str, P], int]:
    ...

add(bar)  # (str, /, x: int, *args: bool) -> int
```

Notice that a positional-only
`str` argument is added to the return type of `add(bar)`. Now, let's try removing an argument:

```python
def remove(x: Callable[Concatenate[int, P], int]) -> Callable[P, int]:
    ...

remove(bar)  # (*args: bool) -> int
```

Because we use
`P` and `Concatenate` in the argument type, the return type will not have an `int` argument anymore.

And finally, let's change an argument type from
`int` to `str` and the return type from `int` to `bool`:

```python
def transform(
    x: Callable[Concatenate[int, P], int],
) -> Callable[Concatenate[str, P], bool]:
    ...

transform(bar)  # (str, /, *args: bool) -> bool
```

Looking forward to a new Mypy release with
`Concatenate` support. I definitely know some places where it will be useful.

Conclusion
PEP 612 adds two very powerful abstractions that allow us to better type our functions and decorators, which play a very important role in the Python world.
Complex projects (like Django) or simple type-safe scripts can greatly benefit from this new typing feature. And I hope you will too!
Happy New Year!
The newly released PEP 612 feature allows you to do a lot of advanced typing things with functions and their signatures.
Typeclasses in Python
Published: 2021-06-30T00:00:00+00:00
Updated: 2021-06-30T00:00:00+00:00
UTC: 2021-06-30 00:00:00+00:00
URL: https://sobolevn.me/2021/06/typeclasses-in-python

Content Preview
Today I am going to introduce a new concept for Python developers: typeclasses. It is a concept behind our new
`dry-python` library called `classes`.

I will tell you in advance that it will look very familiar to what you already know and possibly even use. Moreover, we reuse a lot of existing code from Python's standard library. So, you can call this approach "native" and "pythonic". And it is still going to be interesting: I am showing examples in 4 different languages!
But before discussing typeclasses themselves, let's discuss what problem they solve.
Some functions must behave differently
Ok, this one is a familiar problem to all of the devs out there. How can we write a function that will behave differently for different types?
Let’s create an example. We want to
`greet` different types differently (yes, "hello world" examples, here we go). We want to `greet`:
- `str` instances as `Hello, {string_content}!`
- `MyUser` instances as `Hello again, {username}`

Note that
`greet` as a simple example does not really make much "business" sense, but more complicated things like `to_json`, `from_json`, `to_sql`, `from_sql`, and `to_binary` do make a lot of sense and can be found in almost any project. But, for the sake of implementation simplicity, I'm going to stick to our `greet` example.

The first approach that comes to mind is to use
`isinstance()` checks inside the function itself. And it can work in some cases! The only requirement is that we must know all the types we will work with in advance.

Here's how it would look:
```python
from dataclasses import dataclass

@dataclass
class MyUser(object):
    name: str

def greet(instance: str | MyUser) -> str:
    if isinstance(instance, str):
        return 'Hello, "{0}"!'.format(instance)
    elif isinstance(instance, MyUser):
        return 'Hello again, {0}'.format(instance.name)
    raise NotImplementedError(
        'Cannot greet "{0}" type'.format(type(instance)),
    )
```

The main limitation is that we cannot easily extend this function for other types (we could use a wrapper function, but I consider this a redefinition).
But in some cases `isinstance` won't be enough, because we need extendability. We need to support other types, which are unknown in advance. Our users might need to `greet` their custom types.

And that's the part where things begin to get interesting.
All programming languages address this problem differently. Let’s start with Python’s traditional OOP approach.
OOP extendability and over-abstraction problems
So, how does Python solve this problem?
We all know that Python has magic methods for some builtin functions like
`len()` and `__len__`; this solves exactly the same problem.

Let's say we want to greet a user:
```python
@dataclass
class MyUser(object):
    name: str

    def greet(self) -> str:
        return 'Hello again, {0}'.format(self.name)
```

You can use this method directly, or you can create a helper with
`typing.Protocol`:

```python
from typing_extensions import Protocol

class CanGreet(Protocol):
    def greet(self) -> str:
        """
        It will match any object that has the ``greet`` method.

        Mypy will also check that ``greet`` must return ``str``.
        """

def greet(instance: CanGreet) -> str:
    return instance.greet()
```

And then we can use it:
```python
print(greet(MyUser(name='example')))
# Hello again, example
```

So, it works? Not really.
There are several problems.
First, some classes do not want to know certain details about themselves, in order to maintain abstraction integrity. For example:
```python
from typing import Sequence

class Person(object):
    def become_friends(self, friend: 'Person') -> None:
        ...

    def is_friend_of(self, person: 'Person') -> bool:
        ...

    def get_pets(self) -> Sequence['Pet']:
        ...
```

Does this
`Person` (pun intended) deserve to know that some `to_json` conversion exists that can turn this poor `Person` into textual data? What about binary pickling? Of course not: these details should not be added to a business-level abstraction; doing otherwise produces what is called a leaky abstraction.

Moreover, I think that mixing structure and behavior in a single abstraction is bad. Why? Because you cannot tell in advance what behavior you would need from a given structure.

For abstractions on this level, it is way easier to have behavior near the structure, not inside it. Mixing the two only makes sense when we work on a higher level, like services or processes.
Second, it only works for custom types. Existing types are hard to extend. For example, how would you add the
`greet` method to the `str` type?

You can create a
`str` subtype with a `greet` method in it:

```python
class MyStr(str):
    def greet(self) -> str:
        return 'Hello, {0}!'.format(self)
```

But this would require a change in our usage:
```python
print(greet(MyStr('world')))
# Hello, world!

print(greet('world'))
# fails with TypeError
```

Monkey-patching
Some might suggest that we can just insert the needed methods directly into an object or type. Some dynamically typed languages went down this path:
`JavaScript` (in the 2000s and early 2010s, mostly popularized by `jQuery` plugins) and `Ruby` (still happening right now). Here's how it looks:

```javascript
String.prototype.greet = function (string) {
  return `Hello, ${string}!`
}
```

It is quite obvious that this is not going to work for anything complex. Why?
- Different parts of your program might use monkey-patching of methods with the same name, but with different functionality. And nothing will work
- It is hard to read because the original source does not contain the patched method and the patching location might be hidden deeply in other files
- It is hard to type; for example, `mypy` does not support it at all (see the sketch after this list)
- The Python community is not used to this style; it would be rather hard to persuade them to write their code like this (and that's a good thing!)
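To make the typing problem concrete, here is a small Python sketch (my own illustration; real built-ins like `str` reject attribute assignment, so a subclass stands in for one):

```python
class MyStr(str):
    """A stand-in: patching `str` itself raises TypeError at runtime."""

def greet(self: MyStr) -> str:
    return 'Hello, {0}!'.format(self)

MyStr.greet = greet  # mypy error: Cannot assign to a method

print(MyStr('world').greet())  # works at runtime, but the type checker gave up
```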
I hope that it is clear: we won’t fall into this trap. Let’s consider another alternative.
Extra abstractions
People familiar with things like
`django-rest-framework` might recommend adding special abstractions to `greet` different types:

```python
import abc
from dataclasses import dataclass
from typing import Generic, TypeVar

_Wrapped = TypeVar('_Wrapped')

class BaseGreet(Generic[_Wrapped]):
    """Abstract base class of all greet wrappers."""

    def __init__(self, wrapped: _Wrapped) -> None:
        self._wrapped = wrapped

    @abc.abstractmethod
    def greet(self) -> str:
        raise NotImplementedError

class StrGreet(BaseGreet[str]):
    """Wrapped instance of built-in type ``str``."""

    def greet(self) -> str:
        return 'Hello, {0}!'.format(self._wrapped)

# Our custom type:

@dataclass
class MyUser(object):
    name: str

class MyUserGreet(BaseGreet[MyUser]):
    def greet(self) -> str:
        return 'Hello again, {0}'.format(self._wrapped.name)
```

And we can use it like so:
```python
print(greet(StrGreet('world')))
# Hello, world!

print(greet(MyUserGreet(MyUser(name='example'))))
# Hello again, example
```

But now we have a different problem: there is a gap between real types and their wrappers. There's no easy way to wrap a type into its wrapper. How can we match them? We have to do it either by hand or use some kind of registry like
`Dict[type, Type[BaseGreet]]`.

And it is still not enough, there will be runtime errors! In practice, it ends up like
`<X> is not json-serializable`, as many of us have seen with `drf`'s serializers when trying to serialize a custom unregistered type.
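To make that wrapper gap concrete, here is a minimal sketch of such a registry, reusing the `BaseGreet` wrappers defined above (the `greet_any` helper is my own, hypothetical):

```python
from typing import Dict, Type

# Hand-maintained mapping from real types to their wrappers:
_GREET_REGISTRY: Dict[type, Type[BaseGreet]] = {
    str: StrGreet,
    MyUser: MyUserGreet,
}

def greet_any(instance: object) -> str:
    # Blows up at runtime for any type nobody remembered to register:
    wrapper_type = _GREET_REGISTRY[type(instance)]
    return wrapper_type(instance).greet()

print(greet_any('world'))  # Hello, world!
print(greet_any(1))        # KeyError: <class 'int'>
```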
Let’s look at how functional languages (and
`Rust`, people still argue whether it is functional or not) handle this problem.

Some common knowledge:
- All these languages don’t have
classconcept as we know it in Python and, of course, there’s no subclassing- All the languages below don’t have
objects as we do in Python, they don’t mix behavior and structure (however,Elixirhas Alan Kay’s real objects )- Instead, these languages use ad-hoc polymorphism to make functions behave differently for different types via overloading
- And, of course, you don’t have to know any of the languages below to understand what is going on
Elixir
Let’s start with one of my favorites.
`Elixir` has Protocols to achieve what we want:

```elixir
@doc "Our custom protocol"
defprotocol Greet do
  # This is an abstract function,
  # that will behave differently for each type.
  def greet(data)
end

@doc "Enhancing built-in type"
defimpl Greet, for: BitString do
  def greet(string), do: "Hello, #{string}!"
end

@doc "Custom data type"
defmodule MyUser do
  defstruct [:name]
end

@doc "Enhancing our own type"
defimpl Greet, for: MyUser do
  def greet(user), do: "Hello again, #{user.name}"
end
```
`Elixir` even if they are not familiar with this language. That's what I call beauty!

Usage of the code above:
```elixir
# Using our `Greet.greet` function with both our data types:
IO.puts(Greet.greet("world"))
# Hello, world!

IO.puts(Greet.greet(%MyUser{name: "example"}))
# Hello again, example
```

The thing with
Elixir’sProtocols is that it is not currently possible to express that some type does support ourGreet.greetforElixir’s type checker . But, this is not a big deal forElixir, which is 100% dynamically typed.Protocols are very widely used, they power lots of the language’s features. Here are some real-life examples:
- `Enumerable` allows working with collections: counting elements, finding members, reducing, and slicing
- `String.Chars` is something like `__str__` in Python; it converts structures to human-readable format

Rust
`Rust` has Traits. The concept is pretty similar to Protocols in `Elixir`:

```rust
// Our custom trait
trait Greet {
    fn greet(&self) -> String;
}

// Enhancing built-in type
impl Greet for String {
    fn greet(&self) -> String {
        return format!("Hello, {}!", &self);
    }
}

// Defining our own type
struct MyUser {
    name: String,
}

// Enhancing it
impl Greet for MyUser {
    fn greet(&self) -> String {
        return format!("Hello again, {}", self.name);
    }
}
```

And of course, due to
Rust’s static typing, we can express that some function’s argument supports the trait we have just defined:// We can express that `greet` function only accepts types // that implement `Greet` trait: fn greet(instance: &dyn Greet) -> String { return instance.greet(); } pub fn main() { // Using our `greet` function with both our data types: println!("{}", greet(&"world".to_string())); // Hello, world! println!("{}", greet(&MyUser { name: "example".to_string() })); // Hello again, example }See? The idea is so similar, that it uses almost the same syntax as
`Elixir`.

Notable real-life examples of how
`Rust` uses its Traits:
- `Copy` and `Clone` for duplicating objects
- `Debug` to show a better `repr` of an object, again like `__str__` in Python

Basically,
Traits are at the core of this language; they are widely used whenever you need to define any shared behavior.

Haskell
`Haskell` has typeclasses to do almost the same thing.

So, what's a typeclass? A typeclass is a group of types, all of which satisfy some common contract. It is also a form of ad-hoc polymorphism that is mostly used for overloading.
I am a bit sorry for the
`Haskell` syntax below; it might not be very pleasant and clear to read, especially for people who are not familiar with this brilliant language, but we have what we have:

```haskell
{-# LANGUAGE FlexibleInstances #-}

-- Our custom typeclass
class Greet instance where
  greet :: instance -> String

-- Enhancing built-in type with it
instance Greet String where
  greet str = "Hello, " ++ str ++ "!"

-- Defining our own type
data MyUser = MyUser { name :: String }

-- Enhancing it
instance Greet MyUser where
  greet user = "Hello again, " ++ (name user)
```

Basically, we do the same thing as we have already done for
`Rust` and `Elixir`:

- We define a `Greet` typeclass that has a single function to implement: `greet`
- Then we define an instance implementation for the `String` type, which is a built-in (alias for `[Char]`)
- Then we define a custom `MyUser` type with a `name` field of `String` type
- Implementing the `Greet` typeclass for `MyUser` is the last thing we do

Then we can use our new
`greet` function:

```haskell
-- Here you can see that we can use the `Greet` typeclass to annotate our types.
-- I have made this alias entirely for this annotation demo,
-- in real life we would just use `greet` directly:
greetAlias :: Greet instance => instance -> String
greetAlias = greet

main = do
  print $ greetAlias "world"
  -- Hello, world!
  print $ greetAlias MyUser { name="example" }
  -- Hello again, example
```

Some real-life examples of typeclasses:
- `Show` to convert things into user-readable representations
- `Functor`, `Applicative`, and `Monad` are all typeclasses

I would say that among our three examples,
`Haskell` relies on its typeclasses most heavily.

It is important to note that typeclasses from
`Haskell` and traits from `Rust` are a bit different, but we won't go into these details to keep this article rather short.

But, what about Python?
dry-python/classes
There’s an awesome function in the Python standard library called
`singledispatch`.

It does exactly what we need. Do you still remember that we are looking for a way to change a function's behavior based on the input type?
Let’s have a look!
```python
from dataclasses import dataclass
from functools import singledispatch

@singledispatch
def greet(instance) -> str:
    """Default case."""
    raise NotImplementedError

@greet.register
def _greet_str(instance: str) -> str:
    return 'Hello, {0}!'.format(instance)

# Custom type

@dataclass
class MyUser(object):
    name: str

@greet.register
def _greet_myuser(instance: MyUser) -> str:
    return 'Hello again, {0}'.format(instance.name)
```
And we can use it like a normal function:
```python
print(greet('world'))
# Hello, world!

print(greet(MyUser(name='example')))
# Hello again, example
```

So, what's the point of writing a completely different library like we did with
`dry-python/classes`?

We even reuse some parts of the
`singledispatch` implementation, but there are several key differences.

Better typing
With
`singledispatch` you cannot be sure that everything will work, because it is not supported by `mypy`.

For example, you can pass unsupported types:
```python
greet(1)
# mypy is ok with that :(
# runtime will raise `NotImplementedError`
```

In
`dry-python/classes` we have fixed that. You can only pass types that are supported:

```python
from classes import typeclass

@typeclass
def greet(instance) -> str:
    ...

@greet.instance(str)
def _greet_str(instance: str) -> str:
    return 'Hello, {0}!'.format(instance)

greet(1)
# Argument 1 to "greet" has incompatible type "int"; expected "str"
```

Or you can break the
`@singledispatch` signature contract:

```python
@greet.register
def _greet_dict(instance: dict, key: str) -> int:
    return instance[key]

# still no mypy error
```

But not with
`dry-python/classes`:

```python
@greet.instance(dict)
def _greet_dict(instance: dict, key: str) -> int:
    ...

# Instance callback is incompatible
# "def (instance: builtins.dict[Any, Any], key: builtins.str) -> builtins.int";
# expected
# "def (instance: builtins.dict[Any, Any]) -> builtins.str"
```
`@singledispatch` also does not allow defining generic functions:

```python
from functools import singledispatch
from typing import TypeVar

X = TypeVar('X')

@singledispatch
def copy(instance: X) -> X:
    """Default case."""
    raise NotImplementedError

@copy.register
def _copy_int(instance: int) -> int:
    return instance

# Argument 1 to "register" of "_SingleDispatchCallable"
# has incompatible type "Callable[[int], int]";
# expected "Callable[..., X]"

reveal_type(copy(1))
# Revealed type is "X`-1"
# Should be: `int`
```

Which is, again, possible with
`dry-python/classes`; we fully support generic functions:

```python
from typing import TypeVar
from classes import typeclass

X = TypeVar('X')

@typeclass
def copy(instance: X) -> X:
    ...

@copy.instance(int)
def _copy_int(instance: int) -> int:
    ...  # ok

reveal_type(copy(1))
# int
```

And you cannot restrict
`@singledispatch` to work with only subtypes of specific types, even if you want to.

Protocols are unsupported
Protocols are an important part of Python. Sadly, they are not supported by
`@singledispatch`:

```python
@greet.register
def _greet_iterable(instance: Iterable) -> str:
    return 'Iterable!'

# TypeError: Invalid annotation for 'instance'.
# typing.Iterable is not a class
```

Protocol support is also solved with
`dry-python/classes`:

```python
from typing import Iterable
from classes import typeclass

@typeclass
def greet(instance) -> str:
    ...

@greet.instance(Iterable, is_protocol=True)
def _greet_iterable(instance: Iterable) -> str:
    return 'Iterable!'

print(greet([1, 2, 3]))
# Iterable!
```

No way to annotate types
Let’s say you want to write a function and annotate one of its arguments that it must support the
`greet` function. Something like:

```python
def greet_and_print(instance: '???') -> None:
    print(greet(instance))
```

It is impossible with
`@singledispatch`. But you can do it with `dry-python/classes`:

```python
from classes import AssociatedType, Supports, typeclass

class Greet(AssociatedType):
    """Special type to represent that some instance can `greet`."""

@typeclass(Greet)
def greet(instance) -> str:
    """No implementation needed."""

@greet.instance(str)
def _greet_str(instance: str) -> str:
    return 'Hello, {0}!'.format(instance)

def greet_and_print(instance: Supports[Greet]) -> None:
    print(greet(instance))

greet_and_print('world')  # ok
greet_and_print(1)  # type error with mypy, exception in runtime
# Argument 1 to "greet_and_print" has incompatible type "int";
# expected "Supports[Greet]"
```

Conclusion
We have come a long way, from basic stacked
`isinstance()` conditions, through OOP, to typeclasses.

I have shown that this native and pythonic idea deserves wider recognition and usage. And our extra features in
`dry-python/classes` can save you from lots of mistakes and help you write more expressive and safe business logic.

As a result of using typeclasses, you will untangle your structures from behavior, which will allow you to get rid of useless and complex abstractions and write dead-simple typesafe code. You will have your behavior near the structures, not inside them. This will also solve the extendability problem of OOP.
Combine it with other
`dry-python` libraries for extra effect!

Future work
What do we plan for the future?
There are several key aspects to improve:
- Our `Supports` type should take any number of type arguments: `Supports[A, B, C]`. This type will represent a type that supports all three typeclasses `A`, `B`, and `C` at the same time
- We don't support concrete generics just yet. So, for example, it is impossible to define different cases for `List[int]` and `List[str]`. This might require adding a runtime typechecker to `dry-python/classes`
- I am planning to make tests a part of this app as well! We will ship a hypothesis plugin to test users' typeclasses in a single line of code
Stay tuned!
If you like this article, you can:

- Donate to future `dry-python` development on GitHub
- Star our `classes` repo
- Subscribe to my blog for more content!

Typeclasses are a new (but familiar) idea of how you can organize behavior around your types.
Make tests a part of your app
Published: 2021-02-28T00:00:00+00:00
Updated: 2021-02-28T00:00:00+00:00
UTC: 2021-02-28 00:00:00+00:00
URL: https://sobolevn.me/2021/02/make-tests-a-part-of-your-app
Higher Kinded Types in Python
Published: 2020-10-24T00:00:00+00:00
Updated: 2020-10-24T00:00:00+00:00
UTC: 2020-10-24 00:00:00+00:00
URL: https://sobolevn.me/2020/10/higher-kinded-types-in-python
How async should have been
Published: 2020-06-07T00:00:00+00:00
Updated: 2020-06-07T00:00:00+00:00
UTC: 2020-06-07 00:00:00+00:00
URL: https://sobolevn.me/2020/06/how-async-should-have-been

Content Preview
In the last few years, the
`async` keyword and semantics made their way into many popular programming languages: JavaScript, Rust, C#, and many other languages that I don't know or don't use.

Of course, Python also has
`async` and `await` keywords since `python3.5`.

In this article, I would like to provide my opinion about this feature, think of alternatives, and provide a new solution.
Colours of functions
When introducing
`async` functions into a language, we actually end up with a split world. Now, some functions become red (or `async`) and the old ones remain blue (sync).

The thing about this division is that blue functions cannot call red ones, while red ones can potentially call blue ones. In Python, for example, this is only partially true: async functions can only call sync non-blocking functions. Is it possible to tell whether a function is blocking or not from its definition? Of course not! Python is a scripting language, don't forget about that!
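A tiny sketch of the split (my own illustration, standard library only):

```python
import asyncio

async def red():
    # A "red" function: needs an event loop to run.
    await asyncio.sleep(0)
    return 1

def blue():
    # A "blue" function cannot `await`: `await` is a SyntaxError here.
    return red()  # this does not run `red`, it only creates a coroutine object

print(blue())              # <coroutine object red at 0x...> (never awaited!)
print(asyncio.run(red()))  # 1 -- the only way in: an event loop entry point
```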
This division creates two subsets of a single language: a sync one and an async one. 5 years have passed since the release of
`python3.5`, but `async` support is nowhere near what we have in the sync Python world.

Read this brilliant piece if you want to learn more about the colors of functions.
Code duplication
Different colors of functions lead to a more practical problem: code duplication.
Imagine that you are writing a CLI tool to fetch the sizes of web pages, and you want to support both sync and async ways of doing it. This might be very useful for library authors, when you don't know how your code is going to be used. It is not limited to just PyPI libraries, but also includes your in-company libraries with shared logic for different services written, for example, in Django and aiohttp. Or any other sync and async code. But, I must admit that single applications are mostly written in either a sync or an async way only.
Let’s start with the sync pseudo-code:
```python
def fetch_resource_size(url: str) -> int:
    response = client_get(url)
    return len(response.content)
```

Looking pretty good! Now, let's add its async counterpart:
```python
async def fetch_resource_size(url: str) -> int:
    response = await client_get(url)
    return len(response.content)
```

It is basically the same code, but filled with
`async` and `await` keywords! And I am not making this up, just compare the code samples in the `httpx` tutorial:

They show exactly the same picture.
Abstraction and Composition
Ok, we find ourselves in a situation where we need to rewrite all sync code and add
`async` and `await` keywords here and there, so our program becomes asynchronous.

These two principles can help us solve this problem.
First of all, let’s rewrite our imperative pseudo-code into a functional pseudo-code. This will allow us to see the pattern more clearly:
```python
def fetch_resource_size(url: str) -> Abstraction[int]:
    return client_get(url).map(
        lambda response: len(response.content),
    )
```

What is this
`.map` method? What does it do?

It is a functional way of composing complex abstractions and pure functions. This method allows creating a new abstraction from the existing one with a new state. Let's say that when we call
`client_get(url)`, it initially returns `Abstraction[Response]`, and calling `.map(lambda response: len(response.content))` transforms it into the needed `Abstraction[int]` instance.

Now the steps are pretty clear! Notice how easily we went from several independent steps to a single pipeline of function calls. We have also changed the return type of this function: now it returns some
Abstraction.Now, let’s rewrite our code to work with async version:
```python
def fetch_resource_size(url: str) -> AsyncAbstraction[int]:
    return client_get(url).map(
        lambda response: len(response.content),
    )
```

Wow, that's mostly it! The only thing that is different is the
`AsyncAbstraction` return type. Other than that, our code stays exactly the same. We also don't need the `async` and `await` keywords anymore. We don't use `await` at all (that's the whole point of our journey!), and `async` functions do not make any sense without `await`.

The last thing we need is to decide which client we want: an async or a sync one. Let's fix that!
```python
def fetch_resource_size(
    client_get: Callable[[str], AbstractionType[Response]],
    url: str,
) -> AbstractionType[int]:
    return client_get(url).map(
        lambda response: len(response.content),
    )
```

Our
`client_get` is now an argument of a callable type that receives a single URL string as input and returns some `AbstractionType` over a `Response` object. This `AbstractionType` is either the `Abstraction` or the `AsyncAbstraction` we have already seen in the previous samples.

When we pass
`Abstraction`, our code works synchronously; when `AsyncAbstraction` is passed, the same code automatically starts to work asynchronously.

IOResult and FutureResult
Luckily, we already have the right abstractions in
`dry-python/returns`!

Let me introduce to you a type-safe,
`mypy`-friendly, framework-independent, pure-Python tool that provides awesome abstractions you can use in any project!

Sync version
Before we go any further, to make this example reproducible, I need to provide dependencies that are going to be used later:
pip install returns httpx anyioLet’s move on!
One can rewrite this pseudo-code as real working Python code. Let's start with the sync version:
```python
from typing import Callable

import httpx
from returns.io import IOResultE, impure_safe

def fetch_resource_size(
    client_get: Callable[[str], IOResultE[httpx.Response]],
    url: str,
) -> IOResultE[int]:
    return client_get(url).map(
        lambda response: len(response.content),
    )

print(fetch_resource_size(
    impure_safe(httpx.get),
    'https://sobolevn.me',
))
# => <IOResult: <Success: 27972>>
```

We have changed a couple of things to make our pseudo-code real:
- We now use `IOResultE`, which is a functional way to handle sync `IO` that might fail. Remember, exceptions are not always welcome! `Result`-based types allow modeling exceptions as separate `Failure()` values, while successful values are wrapped in the `Success` type. In a traditional approach, no one cares about exceptions. But we do care ❤️
- We use `httpx`, which can work with both sync and async requests
- We use the `impure_safe` function to convert the return type of `httpx.get` to the abstraction we need: `IOResultE`

Now, let's try the async version!
Async version
```python
from typing import Callable

import anyio
import httpx
from returns.future import FutureResultE, future_safe

def fetch_resource_size(
    client_get: Callable[[str], FutureResultE[httpx.Response]],
    url: str,
) -> FutureResultE[int]:
    return client_get(url).map(
        lambda response: len(response.content),
    )

page_size = fetch_resource_size(
    future_safe(httpx.AsyncClient().get),
    'https://sobolevn.me',
)
print(page_size)
print(anyio.run(page_size.awaitable))
# => <FutureResult: <coroutine object async_map at 0x10b17c320>>
# => <IOResult: <Success: 27972>>
```

Notice that we get exactly the same result, but now our code works asynchronously. And its core part didn't change at all!
However, there are some important notes:
- We changed the sync `IOResultE` into the async `FutureResultE`, and `impure_safe` to `future_safe`, which does the same thing but returns another abstraction: `FutureResultE`
- We now also use `AsyncClient` from `httpx`
- We are also required to run the resulting `FutureResult` value, because red functions cannot run themselves! To demonstrate that this approach works with any async library (`asyncio`, `trio`, `curio`), I am using the `anyio` utility

Combining the two
And now I can show you how you can combine these two versions into a single type-safe API.
Update after HKT support is released:
Now, after
`returns@0.14.0` is released, you can have a look at what this program looks like with Higher Kinded Types, link. It is 100% recommended over the version above.

I am going to keep the old version for historical reasons.
Old version:
Sadly, Higher Kinded Types and proper type-classes are work-in-progress, so we would use regular
`@overload` function cases:

```python
from typing import Callable, Union, overload

import anyio
import httpx
from returns.future import FutureResultE, future_safe
from returns.io import IOResultE, impure_safe

@overload
def fetch_resource_size(
    client_get: Callable[[str], IOResultE[httpx.Response]],
    url: str,
) -> IOResultE[int]:
    """Sync case."""

@overload
def fetch_resource_size(
    client_get: Callable[[str], FutureResultE[httpx.Response]],
    url: str,
) -> FutureResultE[int]:
    """Async case."""

def fetch_resource_size(
    client_get: Union[
        Callable[[str], IOResultE[httpx.Response]],
        Callable[[str], FutureResultE[httpx.Response]],
    ],
    url: str,
) -> Union[IOResultE[int], FutureResultE[int]]:
    return client_get(url).map(
        lambda response: len(response.content),
    )
```

With
`@overload` decorators we describe which combinations of inputs are allowed, and what return type they will produce. You can read more about the `@overload` decorator here.

Finally, calling our function with both sync and async clients:
```python
# Sync:
print(fetch_resource_size(
    impure_safe(httpx.get),
    'https://sobolevn.me',
))
# => <IOResult: <Success: 27972>>

# Async:
page_size = fetch_resource_size(
    future_safe(httpx.AsyncClient().get),
    'https://sobolevn.me',
)
print(page_size)
print(anyio.run(page_size.awaitable))
# => <FutureResult: <coroutine object async_map at 0x10b17c320>>
# => <IOResult: <Success: 27972>>
```

As you can see,
`fetch_resource_size` with the sync client immediately returns `IOResult` and can execute itself. In contrast, the async version requires an event loop to execute it, like a regular coroutine. We use `anyio` for the demo.
`mypy` is pretty happy with our code too:

```bash
» mypy async_and_sync.py
Success: no issues found in 1 source file
```

Let's try to screw something up:
```diff
-lambda response: len(response.content),
+lambda response: response.content,
```

And check that the new error will be caught by
`mypy`:

```bash
» mypy async_and_sync.py
async_and_sync.py:33: error: Argument 1 to "map" of "IOResult" has incompatible type "Callable[[Response], bytes]"; expected "Callable[[Response], int]"
async_and_sync.py:33: error: Argument 1 to "map" of "FutureResult" has incompatible type "Callable[[Response], bytes]"; expected "Callable[[Response], int]"
async_and_sync.py:33: error: Incompatible return value type (got "bytes", expected "int")
```

As you can see, there's nothing magical about the way async code can be written with the right abstractions. Inside our implementation, there's still no magic, just good old composition. The real magic is providing the same API for different types; this allows us to abstract away how, for example, HTTP requests work: synchronously or asynchronously.
I hope that this quick demo shows how awesome
`async` programs can be! Feel free to try the new `dry-python/returns@0.14` release, it has lots of other goodies!

Other awesome features
Speaking of goodies, I want to highlight several features I am most proud of:
- Typed `partial` and `@curry` functions:

```python
from returns.curry import curry, partial

def example(a: int, b: str) -> float:
    ...

reveal_type(partial(example, 1))
# note: Revealed type is 'def (b: builtins.str) -> builtins.float'

reveal_type(curry(example))
# note: Revealed type is 'Overload(def (a: builtins.int) -> def (b: builtins.str) -> builtins.float, def (a: builtins.int, b: builtins.str) -> builtins.float)'
```

Which means that you can use
`@curry` like so:

```python
@curry
def example(a: int, b: str) -> float:
    return float(a + len(b))

assert example(1, 'abc') == 4.0
assert example(1)('abc') == 4.0
```

- You can now use functional pipelines with full type inference, augmented by a custom
`mypy` plugin:

```python
from returns.pipeline import flow

assert flow(
    [1, 2, 3],
    lambda collection: max(collection),
    lambda max_number: -max_number,
) == -3
```

We all know how hard it is to work with
`lambda`s in typed code, because their arguments always have the `Any` type. And this might break regular `mypy` inference.

Now, we always know that
`lambda collection: max(collection)` has the `Callable[[List[int]], int]` type inside this pipeline. And `lambda max_number: -max_number` is just `Callable[[int], int]`. You can pass any number of arguments to `flow`, and they will all work perfectly, thanks to our custom plugin!

There is also an abstraction over
`FutureResult` that we have already covered in this article. It might be used to explicitly pass dependencies in a functional manner in your async programs.

To be done
However, there are more things to be done before we can hit
`1.0`:
- We need to implement Higher Kinded Types or their emulation, source
- Adding proper type-classes, so we can implement needed abstractions, source
- We would love to try the
`mypyc` compiler. It will potentially allow us to compile our type-annotated Python programs to binary. As a result, simply dropping `dry-python/returns` into your program will make it several times faster, source
- Finding new ways to write functional Python code, like our ongoing investigation of "do-notation"
Conclusion
Composition and abstraction can solve any problem. In this article, I have shown you how they can solve the problem of function colors and allow people to write simple, readable, and still flexible code that works. And type-checks.
Check out
`dry-python/returns`, provide your feedback, learn new ideas, and maybe even help us sustain this project!

And as always, follow me on GitHub to keep up with new awesome Python libraries I make or help with!
Gratis
Your sync and async code can be identical, yet still work differently. It is a matter of the right abstractions. In this article, I show how one can write sync code to run async programs in Python.
Do not log
Published: 2020-03-11T00:00:00+00:00
Updated: 2020-03-11T00:00:00+00:00
UTC: 2020-03-11 00:00:00+00:00
URL: https://sobolevn.me/2020/03/do-not-log

Content Preview
Almost every week I accidentally get into this logging argument. Here’s the problem: people tend to log different things and call it a best-practice. And I am not sure why. When I start discussing this with other people I always end up repeating the exact same ideas over and over again.
So. Today I want to criticize the whole logging culture and provide a bunch of alternatives.
Logging does not make sense
Let’s start with the most important one. Logging does not make any sense!
Let’s review a popular code sample that you can find all across the internet:
```python
try:
    do_something_complex(*args, **kwargs)
except MyException as ex:
    logger.log(ex)
```

So, what is going on here? Some complex computation fails and we log it. It seems like a good thing to do, doesn't it? Well. In this situation, I usually ask several important questions.
The first question is: can we make bad states for
`do_something_complex()` unreachable? If so, let's refactor our code to be type-safe and eliminate all possible exceptions that can happen here with `mypy`. Sometimes this helps. And in this case, we would not have any exception handling and logging at all.

The second question I ask is: is it important that
`do_something_complex()` failed? There are a lot of cases when we don't care: because of retries or queues, or because it might be an optional step that can be skipped. If this failure is not important, then just forget about it. But if this failure is important, I want to know absolutely everything: why and when it failed, the whole stack trace, the values of local variables, the execution context, the total number of failures, and the number of affected users. I also want to be immediately notified of this important failure, and to be able to create a bug ticket from it in one click.

And yes, you got it correctly: it sounds like a job for Sentry, not logging.
I either want a quick notification about some critical error with everything inside, or I want nothing: a peaceful morning with tea and youtube videos. There’s nothing in between for logging.
The third question is: can we instead apply business monitoring to make sure our app works? We don't really care about exceptions and how to handle them; we care about the business value that we provide with our app. Sometimes your app does not raise any exceptions to be caught by Sentry. It can be broken in different ways. For example, your form validation can return errors when that should not normally happen. And you have zero exceptions, but a dysfunctional application. That's where business monitoring shines!
We can track different business metrics and make sure there are new orders, new comments, new users, etc. And if not, I want an emergency notification. I don't want to waste extra money on reading logging information after angry clients call or text me. Please, don't treat your users as a monitoring service!
The last question is: do you normally expect
`do_something_complex()` to fail? Like HTTP calls or database access. If so, don't use exceptions; use the `Result` monad instead. This way you can clearly indicate that something is going to fail, and act with confidence. And do not log anything. Just let it fail.
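A minimal sketch of that style with `dry-python/returns` (the `fetch_user` function is my own, hypothetical example):

```python
from returns.result import safe

@safe  # wraps raised exceptions into `Failure`, return values into `Success`
def fetch_user(user_id: int) -> str:
    if user_id <= 0:
        raise ValueError('No such user')
    return 'user{0}'.format(user_id)

# The failure is now a value we can handle, not something to log and forget:
print(fetch_user(1))   # <Success: user1>
print(fetch_user(-1))  # <Failure: No such user>
```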
Logging is a side effect

One more important monad-related topic is the pureness of the function that has a logger call inside. Let's compare two similar functions:
```python
def compute(arg: int) -> float:
    return (arg / PI) * MAGIC_NUMBER
```

And:
```python
def compute(arg: int):
    result = (arg / PI) * MAGIC_NUMBER
    logger.debug('Computation is:', result)
    return result
```

The main difference between these two functions is that the first one is a perfectly pure function and the second one is an
`IO`-bound impure one.

What consequences does this have?
- We have to change our return type to
`IOResult[float]`, because logging is impure and can fail (yes, loggers can fail and break your app)
- We need to test this side effect. And we need to remember to do so! Unless this side effect is explicit in the return type
Wise programmers even use the special
`Writer` monad to make logging pure and explicit. And that requires significantly changing the whole architecture around this function.
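Here is a minimal sketch of the Writer idea in plain Python (my own illustration, not a specific library API): the function returns its result together with accumulated log entries, so logging stays pure:

```python
from typing import List, Tuple

PI = 3.141592653589793
MAGIC_NUMBER = 42

def compute(arg: int) -> Tuple[float, List[str]]:
    # The "log" is part of the return value, not a hidden side effect:
    result = (arg / PI) * MAGIC_NUMBER
    return result, ['Computation is: {0}'.format(result)]

result, logs = compute(3)
# The caller decides what to do with `logs`: write, ship, or drop them.
```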
We also might want to pass the correct `logger` instance, which probably implies that we have to use dependency injection based on the `RequiresContext` monad.

All I want to say with all these abstractions is that proper logging architecture is hard. It is not just about writing
`logger.log()` here and there. It is a complex process of creating proper abstractions, composing them, and maintaining strict layers of pure and impure code.

Is your team ready for this?
Logging is a subsystem
We used to log things into a single file. It was fun!
Even this simple setup required us to do periodic log file rotation. But still, it was easy. We also used
`grep` to find things in this file. But even then we had problems. Do you happen to have several servers? Good luck finding the required information across all of those files and `ssh` connections.

But now it is not enough! With all these microservices, cloud-native, and other 2020-ish tools, we need complex subsystems to work with logging. This includes (but is not limited to) log collection, transport, storage, indexing, and visualization.
And here’s an example of what it takes to set this thing up with ELK stack:
```yaml
version: '3.2'

services:
  elasticsearch:
    build:
      context: elasticsearch/
      args:
        ELK_VERSION: $ELK_VERSION
    volumes:
      - type: bind
        source: ./elasticsearch/config/elasticsearch.yml
        target: /usr/share/elasticsearch/config/elasticsearch.yml
        read_only: true
      - type: volume
        source: elasticsearch
        target: /usr/share/elasticsearch/data
    ports:
      - "9200:9200"
      - "9300:9300"
    environment:
      ES_JAVA_OPTS: "-Xmx256m -Xms256m"
      ELASTIC_PASSWORD: changeme
      discovery.type: single-node
    networks:
      - elk

  logstash:
    build:
      context: logstash/
      args:
        ELK_VERSION: $ELK_VERSION
    volumes:
      - type: bind
        source: ./logstash/config/logstash.yml
        target: /usr/share/logstash/config/logstash.yml
        read_only: true
      - type: bind
        source: ./logstash/pipeline
        target: /usr/share/logstash/pipeline
        read_only: true
    ports:
      - "5000:5000/tcp"
      - "5000:5000/udp"
      - "9600:9600"
    environment:
      LS_JAVA_OPTS: "-Xmx256m -Xms256m"
    networks:
      - elk
    depends_on:
      - elasticsearch

  kibana:
    build:
      context: kibana/
      args:
        ELK_VERSION: $ELK_VERSION
    volumes:
      - type: bind
        source: ./kibana/config/kibana.yml
        target: /usr/share/kibana/config/kibana.yml
        read_only: true
    ports:
      - "5601:5601"
    networks:
      - elk
    depends_on:
      - elasticsearch

networks:
  elk:
    driver: bridge

volumes:
  elasticsearch:
```

Isn't it a bit too hard?! Look, I just want to write strings into
`stdout` and then store it somewhere.

But no. You need a database. And a separate web service. You also have to monitor your logging subsystem and periodically update it. You also have to make sure that it is secure, has enough resources, and that everyone has access to it. And so on and so forth.
Of course, there are cloud providers just for logging. It might be a good idea to consider using them.
Logging is hard to manage
In case you are still using logging after all my previous arguments, you will find out that it requires a lot of discipline and tooling. There are several well-known problems:
Logging should be very strict about its format. Do you remember that we are probably going to store our logs in a NoSQL database? Our logs need to be indexable. You would probably end up using `structlog` or a similar solution. In my opinion, this should be the default.
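A minimal sketch of what that looks like with `structlog` (assuming its default configuration; the output shown is approximate):

```python
import structlog

logger = structlog.get_logger()

# Events are key-value pairs, not interpolated strings,
# so log storage can index `order_id` and `amount` as separate fields:
logger.info('order_created', order_id=1024, amount=99.9)
# 2020-03-11 10:00.00 [info] order_created amount=99.9 order_id=1024
```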
structlogor a similar solution. In my opinion, this should be the defaultThe next thing to fight is levels. All developers will use their own ideas on what is critical and what’s not. Unless you (as an architect) will write a clear policy that will cover most of the cases. You might also need to review it carefully. Otherwise, your logging database might blow up with useless data
Your logging usage should be consistent! All people tend to write in their own style. There’s a linter for that! It will enforce:
```python
logger.info(
    'Hello {world}',
    extra={'world': 'Earth'},
)
```

Instead of:
```python
logger.info(
    'Hello {world}'.format(world='Earth'),
)
```

And many other edge cases.
Logging should be business-oriented. I usually see people using logging with a minimal amount of useful information. For example, logging only an invalid current object state is not enough! You need to do more: you need to show how this object got into this invalid state. There are different approaches to this problem. Some use simple solutions like version history; some people use EventSourcing to compose their objects from changes. And some libraries log the entire execution context, the logical steps that were taken, and the changes made to the object. Like
`dry-python/stories` (docs on logging). And here's how the context looks:

```
ApplyPromoCode.apply
  find_category
  find_promo_code
  check_expiration
  calculate_discount (errored: TypeError)

Context:
  category_id = 1024                # Story argument
  category = <example.Category>     # Set by ApplyPromoCode.find_category
  promo_code = <example.PromoCode>  # Set by ApplyPromoCode.find_promo_code
```

See? It contains a full representation of what happened and how to recreate this error, not just some random state information here and there. And you don't even have to call the logger yourself: it will be handled for you. By the way, it even has a native Sentry integration, which is better in my opinion.
You should pay attention to what you log. There are GDPR rules on logging and specialized security audits for your logs. Common sense dictates that logging passwords, credit cards, emails, etc. is not secure. But, sadly, common sense is not enough. This is a complex process to follow.
There are other problems to manage as well. My point here is to show that you will need senior people to work on this: creating policies, writing down processes, and setting up your logging toolchain.
What to do instead?
Let’s do a quick recap:
- Logging does not make much sense for monitoring and error tracking. Use better tools instead, like error and business monitoring with alerts
- Logging adds significant complexity to your architecture, and it requires more testing. Use architecture patterns that will make logging an explicit part of your contracts
- Logging is a whole infrastructure subsystem on its own, and quite a complex one. You will have to maintain it or outsource this job to existing logging services
- Logging should be done right, and that is hard. You will have to use a lot of tooling, and you will have to mentor developers who are unaware of the problems we have just discussed
Is logging worth it? You should make an informed decision based on this knowledge and your project requirements. In my opinion, it is not required for most regular web apps.
Please, get this right. I understand that logging can be really useful (and sometimes even the only source of useful information): for on-premise software, for example, or for the initial steps when your app is not fully functioning yet, when it would be hard to understand what is going on without logging. I am fighting an "overlogging" culture, where logs are used for no good reason, because developers just do it without analyzing the costs and tradeoffs.
Are you joining my side?
A lot of developers consider logging a silver bullet to fix all things at once. They don't realize how hard it actually is to work with logging properly.
Conditional coverage
Published: 2020-02-25T00:00:00+00:00
Updated: 2020-02-25T00:00:00+00:00
UTC: 2020-02-25 00:00:00+00:00
URL: https://sobolevn.me/2020/02/conditional-coverage
Typed functional Dependency Injection in Python
Published: 2020-02-02T00:00:00+00:00
Updated: 2020-02-02T00:00:00+00:00
UTC: 2020-02-02 00:00:00+00:00
URL: https://sobolevn.me/2020/02/typed-functional-dependency-injection
Complexity Waterfall
Published: 2019-10-13T00:00:00+00:00
Updated: 2019-10-13T00:00:00+00:00
UTC: 2019-10-13 00:00:00+00:00
URL: https://sobolevn.me/2019/10/complexity-waterfall
Testing Django Migrations
Published: 2019-10-13T00:00:00+00:00
Updated: 2019-10-13T00:00:00+00:00
UTC: 2019-10-13 00:00:00+00:00
URL: https://sobolevn.me/2019/10/testing-django-migrations

Content Preview
Dear internet,
Today we have screwed up by applying a broken migration to the running production service and causing a massive outage for several hours… Because the rollback function was terribly broken as well.
As a result, we had to restore a backup that was made several hours ago, losing some new data.
Why did it happen?
The easiest answer is just to say: "Because it is X's fault! He is the author of this migration, he should learn how databases work". But that is counterproductive.
Instead, as a part of our “Blameless environment” culture, we tend to put all the guilt on the CI. It was the CI who put the broken code into the
`master` branch. So, we need to improve it!

We always write post-mortems for all massive incidents that we experience. And we write regression tests for all bugs, so they won't happen again. But this situation was different, since it was a broken migration that worked during the CI process, and it was hard or impossible to test with the current set of instruments.
So, let me explain the steps we took to solve this riddle.
Existing setup
We use a very strict
`django` project setup with several quality checks for our migrations:
- We write all data migrations as typed functions in our main source code, then we check everything with `mypy` and test them as regular functions
- We lint migration files with `wemake-python-styleguide`; it drastically reduces the possibility of bad code inside the migration files
- We use tests that automatically set up the database by applying all migrations before each session
- We use `django-migration-linter` to find migrations that are not suited for zero-downtime deployments
- Then the code is reviewed by two senior people
- Then we test everything manually with the help of the review apps
And somehow it is still not enough: our server was dead.
When writing the post-mortem for this bug, I spotted that the data in our staging and production services was different. And that's why our data migration crashed and left one of the core tables in a broken state.
So, how can we test migrations on some existing data?
django-test-migrations
That’s where
`django-test-migrations` comes in handy.

The idea of this project is simple:
- Set some migration as a starting point
- Create some model’s data that you want to test
- Run the new migration that you are testing
- Assert the results!
Let’s illustrate it with some code samples. Full source code is available here .
Here’s the latest version of our model:
```python
class SomeItem(models.Model):
    """We use this model for testing migrations."""

    string_field = models.CharField(max_length=50)
    is_clean = models.BooleanField()
```

This is a pretty simple model that serves only one purpose: to illustrate the problem. The
`is_clean` field is related to the contents of `string_field` in some manner, while the `string_field` itself contains only regular text data.

Imagine that you have a data migration that looks like so:
```python
def _is_clean_item(instance: 'SomeItem') -> bool:
    """
    Pure function to the actual migration.

    Ideally, it should be moved to ``main_app/logic/migrations``.
    But, as an example it is easier to read them together.
    """
    return ' ' not in instance.string_field

def _set_clean_flag(apps, schema_editor):
    """
    Performs the data-migration.

    We can't import the ``SomeItem`` model directly as it may be
    a newer version than this migration expects.

    We are using ``.all()`` because we don't have a lot of ``SomeItem``
    instances. In real-life you should not do that.
    """
    SomeItem = apps.get_model('main_app', 'SomeItem')
    for instance in SomeItem.objects.all():
        instance.is_clean = _is_clean_item(instance)
        instance.save(update_fields=['is_clean'])

def _remove_clean_flags(apps, schema_editor):
    """
    This is just a noop example of a rollback function.

    It is not used in our simple case, but it should be
    implemented for more complex scenarios.
    """

class Migration(migrations.Migration):
    dependencies = [
        ('main_app', '0002_someitem_is_clean'),
    ]

    operations = [
        migrations.RunPython(_set_clean_flag, _remove_clean_flags),
    ]
```

And here's how we are going to test this migration. First, we have to set some migration as a starting point:
```python
old_state = migrator.before(('main_app', '0002_someitem_is_clean'))
```

Then we have to get the model class. We cannot use a direct
`import` from `models`, because the model might be different, since migrations change models relative to our stored definition:

```python
SomeItem = old_state.apps.get_model('main_app', 'SomeItem')
```

Then we need to create some data that we want to test:
```python
# One instance will be `clean`, the other won't be:
SomeItem.objects.create(string_field='a')  # clean
SomeItem.objects.create(string_field='a b')  # contains whitespace, is not clean
```

Then we will run the migration that we are testing and get the new project state:
```python
new_state = migrator.after(('main_app', '0003_auto_20191119_2125'))
SomeItem = new_state.apps.get_model('main_app', 'SomeItem')
```

And the last step: we need to make some assertions on the resulting data. We created two model instances before: one clean and one with whitespace. So, let's check that:
```python
assert SomeItem.objects.count() == 2
# One instance is clean, the other is not:
assert SomeItem.objects.filter(is_clean=True).count() == 1
assert SomeItem.objects.filter(is_clean=False).count() == 1
```

And that's how it works! Now we have the ability to test our schema and data transformations with ease. The complete test example:
```python
@pytest.mark.django_db
def test_main_migration0002(migrator):
    """Ensures that the second migration works."""
    old_state = migrator.before(('main_app', '0002_someitem_is_clean'))
    SomeItem = old_state.apps.get_model('main_app', 'SomeItem')

    # One instance will be `clean`, the other won't be:
    SomeItem.objects.create(string_field='a')
    SomeItem.objects.create(string_field='a b')
    assert SomeItem.objects.count() == 2
    assert SomeItem.objects.filter(is_clean=True).count() == 2

    new_state = migrator.after(('main_app', '0003_auto_20191119_2125'))
    SomeItem = new_state.apps.get_model('main_app', 'SomeItem')

    assert SomeItem.objects.count() == 2
    # One instance is clean, the other is not:
    assert SomeItem.objects.filter(is_clean=True).count() == 1
```

By the way, we also support raw
`unittest` cases.

Conclusion
Don’t be sure about your migrations. Test them!
You can test forward and rollback migrations and their ordering with the help of
`django-test-migrations`. It is simple, friendly, and already works with the test framework of your choice.

I also want to say "thank you" to these awesome people. Without their work it would have taken me much longer to come up with a working solution.
When migrating schema and data in Django, multiple things can go wrong. It is better to test what you are doing in advance.