The common way to test exceptions is to use
pytest.raises
as a context manager, and have
separate tests for the cases that succeed and those that fail. Instead, this
approach lets you unify them.
One parameterized test that covers both good and bad outcomes. Nice.
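A sketch of that pattern (not necessarily the exact code from the post): parametrize over an “expectation” context manager that is either pytest.raises(...) or a no-op.

import contextlib
import pytest

@pytest.mark.parametrize("text, expectation, expected", [
    ("12", contextlib.nullcontext(), 12),
    ("hello", pytest.raises(ValueError), None),
])
def test_int_conversion(text, expectation, expected):
    # Good and bad inputs share one test: the expectation decides
    # whether an exception is required or forbidden.
    with expectation:
        assert int(text) == expected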
AntiLRU
The
@functools.lru_cache
decorator (and its
convenience cousin
@cache
) is a good way to save the result of a function
so that you don’t have to compute it repeatedly. But it hides an implicit
global in your program: the dictionary of cached results.
This can interfere with testing. Your tests should all be isolated from each
other. You don’t want a side effect of one test to affect the outcome of another
test. The hidden global dictionary will do just that. The first test calls the
cached function, then the second test gets the cached value, not a newly
computed one.
Ideally, lru_cache would only be used on
pure
functions: the result only depends on the arguments.
If it’s only used for pure functions, then you don’t need to worry about interactions between tests
because the answer will be the same for the second test anyway.
But lru_cache is used on functions that pull information from the
environment, perhaps from a network API call. The tests might mock out the API
to check the behavior under different API circumstances. Here’s where the
interference is a real problem.
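For instance, here’s a sketch of the failure mode (my_api and the functions here are made up for illustration):

import functools
from unittest import mock

import my_api  # hypothetical module that the cached function calls

@functools.lru_cache
def get_config():
    # The result depends on the environment, not just the arguments.
    return my_api.fetch_config()

def test_default_behavior():
    with mock.patch("my_api.fetch_config", return_value={"mode": "default"}):
        assert get_config()["mode"] == "default"

def test_special_behavior():
    # Fails: the first test already cached the "default" result, so this
    # differently mocked API is never consulted.
    with mock.patch("my_api.fetch_config", return_value={"mode": "special"}):
        assert get_config()["mode"] == "special"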
The lru_cache decorator makes a
.cache_clear()
method available on each
decorated function. I had some code that explicitly called that method on the
cached functions. But then I added a new cached function, forgot to update the
conftest.py code that cleared the caches, and my tests were failing.
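The manual version looks something like this (a sketch; the cached function names are placeholders):

# conftest.py
import pytest

from myapp import get_config, get_user_names  # hypothetical cached functions

@pytest.fixture(autouse=True)
def clear_caches():
    # Every cached function has to be listed here by hand, which is easy to forget.
    get_config.cache_clear()
    get_user_names.cache_clear()
    yield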
A more convenient approach is provided by
pytest-antilru
: it’s a pytest plugin that monkeypatches
@lru_cache
to track all of the cached functions, and clears them all
between tests. The caches are still in effect during each test, but can’t
interfere between them.
It works great. I was able to get rid of all of the manually maintained cache
clearing in my conftest.py.
This morning I shared a link to this site, and the recipient said, “it looks
like a file.” I thought they meant the page was all black and white with no
color. No, they were talking about the URL, which ended with “.html”.
This site started
almost 24 years ago
as
a static site: a pile of .html files created on my machine and uploaded to the
server. The URLs naturally had .html extensions. That was common for web sites
at the time.
Over the years, the technology has changed. In 2008, it was still a static
site on the host, but
produced with
Django
running locally. In 2021, it became a
real Django site
on the host.
Through all these changes, the URLs remained the same—they still had
the old-fashioned .html extension. I was used to them, so it never struck me as
odd. But when it was pointed out today, it suddenly seemed obviously out of
date.
So now the site prefers URLs with no extension. The fashion in URLs changed
quite some time ago: for 2026, I’m going to party like it’s 2006!
The old URLs still work, but get a permanent redirect to the modern style.
If you notice anything amiss,
please let me know
, as
always.
In my last blog post (A testing conundrum), I described
trying to test my Hasher class which hashes nested data. I couldn’t get
Hypothesis to generate usable data for my test. I wanted to assert that two
equal data items would hash equally, but Hypothesis was finding pairs like
[0] and [False]. These are equal but hash differently because the
hash takes the types into account.
In the blog post I said,
If I had a schema for the data I would be comparing, I could use it to
steer Hypothesis to generate realistic data. But I don’t have that
schema...
I don’t want a fixed schema for the data Hasher would accept, but tests to
compare data generated from the same schema. It shouldn’t compare a list of
ints to a list of bools. Hypothesis is good at generating things randomly:
usually it generates data, but we can also use it to generate schemas!
Hypothesis basics
Before describing my solution, I’ll take a quick detour to describe how
Hypothesis works.
Hypothesis calls its randomness machines “strategies”. Here is a strategy
that will produce random integers between -99 and 1000:
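The code isn’t reproduced above, but strategies matching that description would look something like this (a sketch using the values from the text):

from hypothesis import strategies as st

# Integers between -99 and 1000...
ints = st.integers(min_value=-99, max_value=1000)

# ...and lists of up to 50 of them.
lists_of_ints = st.lists(ints, max_size=50)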
This will produce lists of integers from -99 to 1000. The lists will have up
to 50 elements.
Strategies are used in tests with the
@given
decorator, which takes a
strategy and runs the test a number of times with different example data drawn
from the strategy. In your test you check a desired property that holds true
for any data the strategy can produce.
To demonstrate, here’s a test of sum() that checks that summing a list of
numbers in two halves gives the same answer as summing the whole list:
from hypothesis import given, strategies as st

@given(st.lists(st.integers(min_value=-99, max_value=1000), max_size=50))
def test_sum(nums):
    # We don't have to test sum(), this is just an example!
    mid = len(nums) // 2
    assert sum(nums) == sum(nums[:mid]) + sum(nums[mid:])
By default, Hypothesis will run the test 100 times, each with a different
randomly generated list of numbers.
Schema strategies
The solution to my data comparison problem is to have Hypothesis generate a
random schema in the form of a strategy, then use that strategy to generate two
examples. Doing this repeatedly will get us pairs of data that have the same
“shape” that will work well for our tests.
This is kind of twisty, so let’s look at it in pieces. We start with a list
of strategies that produce primitive values:
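The exact set of primitive strategies isn’t shown here; a plausible sketch:

from hypothesis import strategies as st

# Strategies that produce primitive values.
primitives = [
    st.none(),
    st.booleans(),
    st.integers(),
    st.floats(allow_nan=False),
    st.text(max_size=10),
]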
Then a list of strategies that produce hashable values, which are all the
primitives, plus tuples of any of the primitives:
def tuples_of(elements):
    """Make a strategy for tuples of some other strategy."""
    return st.lists(elements, max_size=3).map(tuple)
# List of strategies that produce hashable data.
hashables = primitives + [tuples_of(s) for s in primitives]
We want to be able to make nested dictionaries with leaves of some other
type. This function takes a leaf-making strategy and produces a strategy to
make those dictionaries:
def nested_dicts_of(leaves):
    """Make a strategy for recursive dicts with leaves from another strategy."""
    return st.recursive(
        leaves,
        lambda children: st.dictionaries(st.text(max_size=10), children, max_size=3),
        max_leaves=10,
    )
Finally, here’s our strategy that makes schema strategies:
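The definition isn’t reproduced here; one way to build it, consistent with the test below, is to pick one random hashable leaf strategy and wrap it in the nested-dict strategy (a sketch):

# A strategy whose examples are themselves strategies: each example is one
# random "schema" of nested dicts over a single kind of leaf.
nested_data_schemas = st.sampled_from(hashables).map(nested_dicts_of)

With that in place, the test draws one schema, then two data examples from it: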
@given(nested_data_schemas.flatmap(lambda s: st.tuples(s, s)))
def test_same_schema(data_pair):
    data1, data2 = data_pair
    h1, h2 = Hasher(), Hasher()
    h1.update(data1)
    h2.update(data2)
    if data1 == data2:
        assert h1.digest() == h2.digest()
    else:
        # Strictly speaking, unequal data could produce equal hashes,
        # but it's very unlikely, so test for it anyway.
        assert h1.digest() != h2.digest()
Here I use the .flatmap() method to draw an example from the
nested_data_schemas
strategy and call the provided lambda with the drawn
example, which is itself a strategy. The lambda uses
st.tuples
to make
tuples with two examples drawn from the strategy. So we get one data schema, and
two examples from it as a tuple passed into the test as
data_pair
. The
test then unpacks the data, hashes them, and makes the appropriate
assertion.
This works great: the tests pass. To check that the test was working well, I
made some breaking tweaks to the Hasher class. If Hypothesis is configured to
generate enough examples, it finds data examples demonstrating the failures.
I’m pleased with the results. Hypothesis is something I’ve been wanting to
use more, so I’m glad I took this chance to learn more about it and get it
working for these tests. To be honest, this is way more than I needed to test
my Hasher class. But once I got started, I wanted to get it right, and learning
is always good.
I’m a bit concerned that the standard setting (100 examples) isn’t enough to
find the planted bugs in Hasher. There are many parameters in my strategies that
could be tweaked to keep Hypothesis from wandering too broadly, but I don’t know
how to decide what to change.
Actually
The code in this post is different than the actual code I ended up with.
Mostly this is because I was working on the code while I was writing this post,
and discovered some problems that I wanted to fix. For example, the
tuples_of
function makes homogeneous tuples: varying lengths with
elements all of the same type. This is not the usual use of tuples (see
Lists vs. Tuples
). Adapting for heterogeneous tuples added
more complexity, which was interesting to learn, but I didn’t want to go back
and add it here.
You can look at the
final strategies.py
to see
that and other details, including type hints for everything, which was a journey
of its own.
Postscript: AI assistance
I would not have been able to come up with all of this by myself. Hypothesis
is very powerful, but requires a new way of thinking about things. It’s twisty
to have functions returning strategies, and especially strategies producing
strategies. The docs don’t have many examples, so it can be hard to get a
foothold on the concepts.
Claude helped me by providing initial code, answering questions, debugging
when things didn’t work out, and so on. If you are interested,
this is
one of the discussions I had with it
.
In coverage.py, I have a class for computing the fingerprint of a data
structure. It’s used to avoid doing duplicate work when re-processing the same
data won’t add to the outcome. It’s designed to work for nested data, and to
canonicalize things like set ordering. The slightly simplified code looks like
this:
class Hasher:
    """Hashes Python data for fingerprinting."""

    def digest(self) -> bytes:
        """Get the full binary digest of the hash."""
        return self.hash.digest()
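The __init__() and update() methods are elided above. Based on the behaviors described below (the type is part of the hash, sets and dicts are sorted, and a dot is appended after each value), a rough sketch of their shape might look like this. It is not the actual coverage.py code, and the hash algorithm is an assumption:

import hashlib

class Hasher:
    """Sketch only: the real class lives in coverage.py."""

    def __init__(self):
        self.hash = hashlib.new("sha3_256")  # assumed algorithm

    def update(self, v):
        """Add `v` to the hash, recursing into nested data."""
        self.hash.update(str(type(v)).encode())  # the type is part of the hash
        if isinstance(v, str):
            self.hash.update(v.encode("utf-8"))
        elif v is None or isinstance(v, (int, float)):
            self.hash.update(repr(v).encode())
        elif isinstance(v, (list, tuple)):
            for e in v:
                self.update(e)
        elif isinstance(v, set):
            for e in sorted(v):      # canonicalize the ordering
                self.update(e)
        elif isinstance(v, dict):
            for k in sorted(v):      # canonicalize the key order
                self.update(k)
                self.update(v[k])
        self.hash.update(b".")       # the dot discussed below

    def digest(self) -> bytes:
        """Get the full binary digest of the hash."""
        return self.hash.digest()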
To test this, I had some basic tests like:
def test_string_hashing():
    # Same strings hash the same.
    # Different strings hash differently.
    h1 = Hasher()
    h1.update("Hello, world!")
    h2 = Hasher()
    h2.update("Goodbye!")
    h3 = Hasher()
    h3.update("Hello, world!")
    assert h1.digest() != h2.digest()
    assert h1.digest() == h3.digest()
def test_dict_hashing():
    # The order of keys doesn't affect the hash.
    h1 = Hasher()
    h1.update({"a": 17, "b": 23})
    h2 = Hasher()
    h2.update({"b": 23, "a": 17})
    assert h1.digest() == h2.digest()
The last line in the update() method adds a dot to the running hash. That was
to solve a problem covered by this test:
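The original test isn’t reproduced here, but it checked the kind of collision a separator prevents: differently grouped data shouldn’t run together into the same byte stream. Roughly (my reconstruction, not the original test):

def test_no_grouping_collisions():
    # The dot separates elements, so ["ab", "c"] and ["a", "bc"] feed
    # different byte streams to the hash.
    h1 = Hasher()
    h1.update(["ab", "c"])
    h2 = Hasher()
    h2.update(["a", "bc"])
    assert h1.digest() != h2.digest()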
The most recent change to Hasher was to add the set() clause. There (and in
dict()), we are sorting the elements to canonicalize them. The idea is that
equal values should hash equally and unequal values should not. Sets and dicts
are equal regardless of their iteration order, so we sort them to get the same
hash.
But I wondered if there was a better way to test this class. My small
one-off tests weren’t addressing the full range of possibilities. I could read
the code and feel confident, but wouldn’t a more comprehensive test be better?
This is a pure function: inputs map to outputs with no side-effects or other
interactions. It should be very testable.
This seemed like a good candidate for property-based testing. The
Hypothesis
library would let me generate data, and I
could check that the desired properties of the hash held true.
It took me a while to get the Hypothesis strategies wired up correctly.
I ended up with this, but there might be a simpler way:
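The strategy code isn’t reproduced here; a sketch with the same characteristics (homogeneous sets, string-only dictionary keys) could look like this:

from hypothesis import strategies as st

# Hashable scalar values.
scalars = [
    st.booleans(),
    st.integers(),
    st.floats(allow_nan=False),
    st.text(max_size=10),
]

python_data = st.recursive(
    st.one_of(scalars),
    lambda children: st.one_of(
        st.lists(children, max_size=5),
        st.lists(children, max_size=5).map(tuple),
        st.dictionaries(st.text(max_size=10), children, max_size=5),
        # Sets are homogeneous so they can be sorted when hashing.
        *[st.sets(s, max_size=5) for s in scalars],
    ),
    max_leaves=20,
)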
This doesn’t make completely arbitrary nested Python data: sets are forced to
have elements all of the same type or I wouldn’t be able to sort them.
Dictionaries only have strings for keys. But this works to generate data similar
to the real data we hash. I wrote this simple test:
from hypothesis import given

@given(python_data)
def test_one(data):
    # Hashing the same thing twice.
    h1 = Hasher()
    h1.update(data)
    h2 = Hasher()
    h2.update(data)
    assert h1.digest() == h2.digest()
This didn’t find any failures, but this is the easy test: hashing the same
thing twice produces equal hashes. The trickier test is to get two different
data structures, and check that their equality matches their hash equality:
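That test looked roughly like this (reconstructed from the failure output below):

from hypothesis import given

@given(python_data, python_data)
def test_two(data1, data2):
    # Two independently generated structures: equal data should hash
    # equally, unequal data should hash differently.
    h1 = Hasher()
    h1.update(data1)
    h2 = Hasher()
    h2.update(data2)
    if data1 == data2:
        assert h1.digest() == h2.digest()
    else:
        assert h1.digest() != h2.digest()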
This immediately found problems, but not in my code:
>       assert h1.digest() == h2.digest()
E       AssertionError: assert b'\x80\x15\xc9\x05...' == b'\x9ap\xebD...'
E
E         At index 0 diff: b'\x80' != b'\x9a'
E
E         Full diff:
E         - (b'\x9ap\xebD...)'
E         + (b'\x80\x15\xc9\x05...)'
E       Falsifying example: test_two(
E           data1=(False, False, False),
E           data2=(False, False, 0),
E       )
Hypothesis found that (False, False, False) is equal to (False, False, 0),
but they hash differently. This is correct. The Hasher class takes the types of
the values into account in the hash. False and 0 are equal, but they are
different types, so they hash differently. The same problem shows up for
0 == 0.0
and
0.0 == -0.0
. The theory of my
test was incorrect: some values that are equal should hash differently.
In my real code, this isn’t an issue. I won’t ever be comparing values like
this to each other. If I had a schema for the data I would be comparing, I
could use it to steer Hypothesis to generate realistic data. But I don’t have
that schema, and I’m not sure I want to maintain that schema. This Hasher is
useful as it is, and I’ve been able to reuse it in new ways without having to
update a schema.
I could write a smarter equality check for use in the tests, but that would
roughly approximate the code in Hasher itself. Duplicating product code in the
tests is a good way to write tests that pass but don’t tell you anything
useful.
I could exclude bools and floats from the test data, but those are actual
values I need to handle correctly.
Hypothesis was useful in that it didn’t find any failures other than the
ones I described. I can’t leave those tests in the automated test suite because
I don’t want to manually examine the failures, but at least this gave me more
confidence that the code is good as it is now.
Testing is a challenge unto itself. This brought it home to me again. It’s
not easy to know precisely what you want code to do, and it’s not easy to
capture that intent in tests. For now, I’m leaving just the simple tests. If
anyone has ideas about how to test Hasher more thoroughly, I’m all ears.
Today is the publication of the third edition of Autism
Adulthood: Insights and Creative Strategies for a Fulfilling Life. It’s my
wife Susan’s book collecting stories and experiences from
people all along the autism spectrum, from the self-diagnosed to the
profound.

The book includes dozens of interviews with autistic adults, their parents,
caregivers, researchers, and professionals. Everyone’s experience of autism is
different. Reading others’ stories and perspectives can give us a glimpse into
other possibilities for ourselves and our loved ones.

If you have someone in your life on the spectrum, or are on it yourself, I
guarantee you will find new ways to understand the breadth of what autism means
and what it can be.

Susan has also written two other non-fiction autism
books, including a memoir of our early days with our son Nat. Of course I
highly recommend all of them.
Mock where the object is used,
not
where it’s
defined.
That blog post explained why that rule was important: often a mock doesn’t
work at all if you do it wrong. But in some cases, the mock will work even if
you don’t follow this rule, and then it can break much later. Why?
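For concreteness, suppose the code under test looks roughly like this, in a module I’ll call settings.py (the names and layout are my assumptions, not the post’s exact code):

import json
import os

def get_settings():
    # Reads a settings file from the user's home directory.
    with open(os.path.expanduser("~/settings.json")) as f:
        return json.load(f)

def add_two_settings():
    settings = get_settings()
    return settings["opt1"] + settings["opt2"]

And a test like this: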
def test_add_two_settings():
    # NOTE: need to create ~/settings.json for this to work:
    #   {"opt1": 10, "opt2": 7}
    assert add_two_settings() == 17
As the comment in the test points out, the test will only pass if you create
the correct settings.json file in your home directory. This is bad: you don’t
want to require finicky environments for your tests to pass.
The thing we want to avoid is opening a real file, so it’s a natural impulse
to mock out
open()
:
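A sketch of the tempting approach, patching the builtin itself:

from unittest import mock

from settings import add_two_settings  # the hypothetical module sketched above

SETTINGS_JSON = '{"opt1": 10, "opt2": 7}'

def test_add_two_settings():
    # Patching the builtin: this affects *every* use of open() during the test.
    with mock.patch("builtins.open", mock.mock_open(read_data=SETTINGS_JSON)):
        assert add_two_settings() == 17

Run under coverage.py with Python 3.14, this can blow up far away from the test itself: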
...
  File ".../site-packages/coverage/python.py", line 55, in get_python_source
    source_bytes = read_python_source(try_filename)
  File ".../site-packages/coverage/python.py", line 39, in read_python_source
    return source.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
           ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^
TypeError: replace() argument 1 must be str, not bytes
What happened!? Coverage.py code runs during your tests, invoked by the
Python interpreter. The mock in the test changed the builtin
open
, so
any use of it anywhere during the test is affected. In some cases, coverage.py
needs to read your source code to record the execution properly. When that
happens, coverage.py unknowingly uses the mocked
open
, and bad things
happen.
When you use a mock, patch it where it’s used, not where it’s defined. In
this case, the patch would be:
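Sticking with the hypothetical settings.py module from above, the patch targets the module that calls open():

def test_add_two_settings():
    # Patch open where it is used: in the settings module only.
    with mock.patch("settings.open", mock.mock_open(read_data=SETTINGS_JSON)):
        assert add_two_settings() == 17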
With a mock like this, the coverage.py code would be unaffected.
Keep in mind: it’s not just coverage.py that could trip over this mock. There
could be other libraries used by your code, or you might use
open
yourself in another part of your product. Mocking the definition means
anything
using the object will be affected. Your intent is to only
mock in one place, so target that place.
Postscript
I decided to add some code to coverage.py to defend against this kind of
over-mocking. There is
a lot of over-mocking out
there
, and this problem only shows up in coverage.py with Python 3.14. It’s
not happening to many people yet, but it will happen more and more as people
start testing with 3.14. I didn’t want to have to answer this question many
times, and I didn’t want to force people to fix their mocks.
From a certain perspective, I shouldn’t have to do this. They are in the
wrong, not me. But this will reduce the overall friction in the universe. And
the fix
was really simple:
open = open
This is a top-level statement in my module, so it runs when the module is
imported, long before any tests are run. The assignment to
open
will
create a global in my module, using the current value of
open
, the one
found in the builtins. This saves the original
open
for use in my module
later, isolated from how builtins might be changed later.
This is an ad-hoc fix: it only defends one builtin. Mocking other builtins
could still break coverage.py. But
open
is a common one, and this will
keep things working smoothly for those cases. And there’s precedent: I’ve
already been using a more involved technique to
defend
against mocking of the
os
module
for ten years.
Even better!
No blog post about mocking is complete without encouraging a number of other
best practices, some of which could get you out of the mocking mess:
Separate your code so that computing functions like our
add_two_settings
don’t also do I/O. This makes the functions easier to
test in the first place (see the sketch after this list). Take a look at
Functional Core,
Imperative Shell
.
Dependency injection lets you explicitly pass test-specific objects where
they are needed instead of relying on implicit access to a mock.
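For example, here’s a sketch of the first suggestion applied to the hypothetical add_two_settings from earlier (the names are still my invention):

import json
import os

def add_two_settings(settings):
    """Functional core: pure computation, no file I/O."""
    return settings["opt1"] + settings["opt2"]

def load_settings(path="~/settings.json"):
    """Imperative shell: the only place that touches the file system."""
    with open(os.path.expanduser(path)) as f:
        return json.load(f)

def test_add_two_settings():
    # No mocking needed at all.
    assert add_two_settings({"opt1": 10, "opt2": 7}) == 17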
It’s been a busy, bumpy week with coverage.py. Some things did not go
smoothly, and I didn’t handle everything as well as I could have.
It started with trying to fix
issue 2064
about
conflicts between the “sysmon” measurement core and a concurrency setting.
To measure your code, coverage.py needs to know what code got executed. To
know that, it collects execution events from the Python interpreter. CPython now
has two mechanisms for this: trace functions and sys.monitoring. Coverage.py
has two implementations of a trace function (in C and in Python), and an
implementation of a sys.monitoring listener. These three components are the
measurement cores, known as “ctrace”, “pytrace”, and “sysmon”.
The fastest is sysmon, but there are coverage.py features it doesn’t yet
support. With Python 3.14, sysmon is the default core. Issue 2064 complained
that when the defaulted core conflicted with an explicit concurrency choice, the
conflict resulted in an error. I agreed with the issue: since the core was
defaulted, it shouldn’t be an error, we should choose a different core.
But I figured if you explicitly asked for the sysmon core and also a
conflicting setting, that should be an error because you’ve got two settings
that can’t be used together.
Implementing all that got a little involved because of “metacov”: coverage.py
coverage-measuring itself. The sys.monitoring facility in Python was added in
3.12, but wasn’t fully fleshed out enough to do branch coverage until 3.14. When
we measure ourselves, we use branch coverage, so 3.12 and 3.13 needed some
special handling to avoid causing the error that sysmon plus branch coverage
would cause.
Soon,
issue 2077
arrived. Another fix in 7.11.1
involved some missing branches when using the sysmon core. That fix required
parsing the source code during execution. But sometimes the “code” can’t be
parsed: Jinja templates compile html files to Python and use the html file as
the file name for the code. When coverage.py tries to parse the html file as
Python, of course it fails. My fix didn’t account for this. I fixed that on
Saturday and
released 7.11.2
.
In the meantime,
issue 2076
and
issue
2078
both pointed out that some settings combinations that used to
produce warnings now produced errors. This is a breaking change, they said, and
should not have been released as a patch version.
To be honest, my first reaction was that it wasn’t that big a deal, the
settings were in conflict. Fix the settings and all will be well. It’s hard to
remember all of the possibilities when making changes like this, it’s easy to
make mistakes, and semantic versioning is bound to have judgement calls anyway.
I had already spent a while getting 7.11.1 done, and .2 followed just a day
later. I was annoyed and didn’t want to have to re-think everything.
But the more I thought about it, the more I decided they were right: it does break
pipelines that used to work. And falling back to a different core is fine: the
cores differ in speed and compatibility but (for the most part) produce the same
results. Changing the requested core with a warning is a fine way to deal with
the settings conflict without stopping test suites from running.
So I just
released 7.11.3
to go back to the older
behavior. Maybe I won’t have to do another release tomorrow!
Coverage.py is basically a one-man show. Maybe the GitHub organization will
make others feel more comfortable chiming in, but I doubt it. I’d like to have
more people to talk through changes with. Maybe I wouldn’t have had to make
three releases in three days if someone else had been around as a sounding
board.
I’m in the
#coverage-py
channel if you want to talk
about any aspect of coverage.py, or I can be reached in
lots of other ways
. I’d love to talk to
you.
Last night was a Boston Python project night where I
had a good conversation with a few people that was mostly guided by questions
from a nice guy named Mark.
How to write nice code in research
Mark works in research and made the classic observation that research code is
often messy, and asked about how to make it nicer.
I pointed out that for software engineers, the code is the product. For
research, the results are the product, so there’s a reason the code can be and
often is messier. It’s important to keep the goal in mind. I mentioned it might
not be worth it to add type annotations, detailed docstrings, or whatever else
would make the code “nice”.
But the more you can make “nice” a habit, the less work it will be to do it
as a matter of course. Even in a result-driven research environment, you’ll be
able to write code the way you want, or at least push back a little bit. Code
usually lives longer than people expect, so the nicer you can make it,
the better it will be.
Side projects
Side projects are a good opportunity to work differently. If work means messy
code, your side project could be pristine. If work is very strict, your side
project can be thrown together just for fun. You get to set the goals.
And different side projects can be different. I develop
coverage.py
very differently
than
fun math art
projects
. Coverage.py has an extensive test suite run on many versions of
Python (including nightly builds of the tip of main). The math art projects
usually have no tests at all.
Side projects are a great place to decide how you want to code and to
practice that style. Later you can bring those skills and learnings back to a
work environment.
Forgive yourself
Mark said one of his difficulties with side projects is perfectionism. He’ll
come back to a project and find he wants to rewrite the whole thing.
My advice is: forgive yourself. It’s OK to rewrite the whole thing. It’s OK
to not rewrite the whole thing. It’s OK to ignore it for months at a time. It’s
OK to stop in the middle of a project and never come back to it. It’s OK to
obsess about “irrelevant” details.
The great thing about a side project is that you are the only person who
decides what and how it should be.
How to stay motivated
But how to stay motivated on side projects? For me, it’s very motivating that
many people use and get value from coverage.py. It’s a service to the community
that I find rewarding. Other side projects will have other motivations: a
chance to learn new things, flex different muscles, stretch myself in new
ways.
Find a reason that motivates you, and structure your side projects to lean
into that reason. Don’t forget to forgive yourself if it doesn’t work out the
way you planned or if you change your mind.
How to write something people will use
Sure, it’s great to have a project that many people use, but how do you find
a project that will end up like that? The best way is to write something that
you find useful. Then talk about it with people. You never know what will catch
on.
I mentioned my
cog
project,
which I first wrote in 2004 for one reason, but which is now being used by other
people (including me) for different purposes. It
took years to catch on
.
Of course there’s no guarantee something like that will happen: it most
likely won’t. But I don’t know of a better way to make something people will
use than to start by making something that
you
will use.
Other topics
The discussion wasn’t as linear as this. We touched on other things along the
way: unit tests vs system tests, obligations to support old versions of
software, how to navigate huge code bases. There were probably other tangents
that I’ve forgotten.
Project nights are almost never just about projects: they are about
connecting with people in lots of different ways. This discussion felt like a
good connection. I hope the ideas of choosing your own paths and forgiving
yourself hit home.
This post continues where Hobby Hilbert Simplex left
off. If you haven’t read it yet, start there. It explains the basics of Hobby
curves, Hilbert sorting and Simplex noise that I’m using.
Animation
To animate one of our drawings, instead of considering 40 lines, we’ll think
about 140 lines. The first frame of the animation will draw lines 1 through 40,
the second draws lines 2 through 41, and so on until the 100th frame is lines
100 through 140:
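In other words, something like this (a sketch, where curves would hold the 140 line curves):

# Each animation frame shows a sliding window of 40 of the curves.
frames = [curves[i : i + 40] for i in range(100)]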
I’ve used a single Hilbert sorter for all of the frames to remove some
jumping, but the Hobby curves still hop around. Also the animation doesn’t loop
smoothly, so there’s a giant jump from frame 100 back to frame 1.
Natural cubics
Hobby curves look nice, but have this unfortunate discontinuity where a small
change in a point can lead to a radical change in the curve. There’s another way
to compute curves through points automatically, called natural cubic curves.
These curves don’t jump around the way Hobby curves can.
Jake Low’s
page about Hobby curves
has interactive
examples of natural cubic curves which you should try. Natural cubics don’t
look as nice to our eyes as Hobby curves. Below is a comparison. Each row has
the same points, with Hobby curves on the left and natural cubic curves on the
right:
The “natural” cubics actually have a quite unnatural appearance. But in an
animation, those quirks could be a good trade-off for smooth transitions. Here’s
an animation with the same points as our first one, but with natural cubic
curves:
Now the motion is smooth except for the jump from frame 100 back to frame 1.
Let’s do something about that.
Circular Simplex
So far, we’ve been choosing points by sampling the simplex noise in small steps along
a horizontal line: use a fixed u value, then take tiny steps along the v axis.
That gave us our x coordinates, and a similar line with a different u value gave
us the y coordinates. The ending point will be completely unrelated to the
starting point. To make a seamlessly looping animation, we need our x,y values
to cycle seamlessly, returning to where they started.
We can make our x,y coordinates loop by choosing u,v values in a circle.
Because the u,v values return to their starting point in the continuous simplex
noise, the x,y coordinates will return as well. We use two circles: one for the
x coordinates and another for the y. The circles are far from each other to
keep x and y independent of each other. The size of the circle is determined by
the distance we want for each step and how many steps we want in the loop.
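Here’s a sketch of circular sampling under those assumptions (noise2 stands in for a 2D simplex noise function returning values in [-1, 1]):

import math

def looping_trail(noise2, nsteps, step):
    """Yield nsteps (x, y) points that return to where they started."""
    # The circumference is nsteps * step, so each frame moves about `step`
    # through the noise field, and frame nsteps lands back at frame 0.
    radius = nsteps * step / (2 * math.pi)
    x_center, y_center = (0.0, 0.0), (1000.0, 1000.0)  # far apart: x and y stay independent
    for i in range(nsteps):
        angle = 2 * math.pi * i / nsteps
        du, dv = radius * math.cos(angle), radius * math.sin(angle)
        yield noise2(x_center[0] + du, x_center[1] + dv), noise2(y_center[0] + du, y_center[1] + dv)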
Here are three point paths created two ways, with linear sampling on the
right and circular sampling on the left. Because simplex provides values between
-1 and 1, the points wander within a square:
It can get a bit confusing at this point: these traces are not the curves we
are drawing. They are the paths of the control points for successive curves. We
draw curves through corresponding sets of points to get our animation. The first
curve connects the first red/green/blue points, the second curve connects the
second set, and so on.
Using circular sampling of the simplex noise, we can make animations that
loop perfectly:
Colophon
If you are interested, the code is available on GitHub at
nedbat/fluidity
.
I saw a generative art piece I liked and wanted to learn how it was made.
Starting with the artist’s Kotlin code, I dug into three new algorithms, hacked
together some Python code, experimented with alternatives, and learned a lot.
Now I can explain it to you.
I love how these lines separate and reunite. And the fact that I can express this idea in 3 or 4 lines of code.
For me they’re lives represented by closed paths that end where they started, spending part of the journey together, separating while we go in different directions and maybe reconnecting again in the future.
The drawing is made by choosing 10 random points, drawing a curve through
those points, then slightly scooching the points and drawing another curve.
There are 40 curves, each slightly different than the last. Occasionally
the next curve makes a jump, which is why they separate and reunite.
Eventually I made something similar:
Along the way I had to learn about three techniques I got from the Kotlin
code: Hobby curves, Hilbert sorting, and simplex noise.
Each of these algorithms tries to do something “natural” automatically, so
that we can generate art that looks nice without any manual steps.
Hobby curves
To draw swoopy curves through our random points, we use an algorithm
developed by John Hobby as part of Donald Knuth’s Metafont type design system.
Jake Low has a
great interactive page for playing with Hobby
curves
, you should try it.
Here are three examples of Hobby curves through ten random points:
The curves are nice, but kind of a scribble, because we’re joining points
together in the order we generated them (shown by the green lines). If you
asked a person to connect random points, they wouldn’t jump back and forth
across the canvas like this. They would find a nearby point to use next,
producing a more natural tour of the set.
We’re generating everything automatically, so we can’t manually intervene
to choose a natural order for the points. Instead we use Hilbert sorting.
Hilbert sorting
The Hilbert space-filling fractal visits every square in a 2D grid.
Hilbert sorting
uses a Hilbert fractal traversing
the canvas, and sorts the points by when their square is visited by the fractal.
This gives a tour of the points that corresponds more closely to what people
expect. Points that are close together in space are likely (but not guaranteed)
to be close in the ordering.
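As a sketch of how this might be wired up with the hilbertcurve package mentioned in the colophon (the constructor and method names are my best recollection of its API, so treat them as assumptions):

from hilbertcurve.hilbertcurve import HilbertCurve

def hilbert_sorted(points, order=6):
    """Sort (x, y) points in the unit square by Hilbert-curve visit order."""
    hc = HilbertCurve(order, 2)   # 2-D curve with 2**order cells per side
    side = 2 ** order

    def visit_time(pt):
        x, y = pt
        # Snap the point to its grid cell, then ask when the curve gets there.
        cell = [min(int(x * side), side - 1), min(int(y * side), side - 1)]
        return hc.distance_from_point(cell)

    return sorted(points, key=visit_time)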
If we sort the points using Hilbert sorting, we get much nicer curves. Here
are the same points as last time:
Here are pairs of the same points, unsorted and sorted side-by-side:
If you compare closely, the points in each pair are the same, but the sorted
points are connected in a better order, producing nicer curves.
Simplex noise
Choosing random points would be easy to do with a random number generator,
but we want the points to move in interesting graceful ways. To do that, we use
simplex noise. This is a 2D function (let’s call the inputs u and v) that
produces a value from -1 to 1. The important thing is the function is
continuous: if you sample it at two (u,v) coordinates that are close together,
the results will be close together. But it’s also random: the continuous curves
you get are wavy in unpredictable ways. Think of the simplex noise function as
a smooth hilly landscape.
To get an (x,y) point for our drawing, we choose a (u,v) coordinate to
produce an x value and a completely different (u,v) coordinate for the y. To
get the next (x,y) point, we keep the u values the same and change the v values by
just a tiny bit. That makes the (x,y) points move smoothly but interestingly.
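A sketch of that sampling scheme (noise2 stands in for a 2D simplex noise function returning values in [-1, 1]):

def point_trail(noise2, nsteps, step=0.01, u_for_x=0.0, u_for_y=100.0):
    """One point wandering smoothly: x and y come from two distant rows of the noise."""
    points = []
    v = 0.0
    for _ in range(nsteps):
        x = noise2(u_for_x, v)   # fixed u for x...
        y = noise2(u_for_y, v)   # ...a different fixed u for y
        points.append((x, y))
        v += step                # a tiny step keeps successive points close
    return points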
Here are the trails of four points taking 50 steps using this scheme:
If we use seven points taking five steps, and draw curves through the seven
points at each step, we get examples like this:
I’ve left the points visible, and given them large steps so the lines are
very widely spaced to show the motion. Taking out the points and drawing more
lines with smaller steps gives us this:
With 40 lines drawn wider with some transparency, we start to see the smoky
fluidity:
Jumps
In his Mastodon post, aBe commented on the separating of the lines as one of
the things he liked about this. But why do they do that? If we are moving the
points in small increments, why do the curves sometimes make large jumps?
The first reason is because of Hobby curves. They do a great job drawing a
curve through a set of points as a person might. But a downside of the
algorithm is sometimes changing a point a small amount makes the entire curve
take a different route. If you play around with the interactive examples on
Jake Low’s page
you will see the curve can unexpectedly
take a different shape.
As we inch our points along, sometimes the Hobby curve jumps.
The second reason is due to Hilbert sorting. Each of our lines is sorted
independently of how the previous line was sorted. If a point’s small motion
moves it into a different grid square, it can change the sorting order, which
changes the Hobby curve even more.
If we sort the first line, and then keep that order of points for all the
lines, the result has fewer jumps, but the Hobby curves still act
unpredictably:
Colophon
This was all done with Python, using other people’s implementations of the
hard parts:
hobby.py
,
hilbertcurve
, and
super-simplex
. My code
is on GitHub
(
nedbat/fluidity
), but it’s a
mess. Think of it as a woodworking studio with half-finished pieces and wood
chips strewn everywhere.
A lot of the learning and experimentation was in
my Jupyter
notebook
. Part of the process for work like this is playing around with
different values of tweakable parameters and seeds for the random numbers to get
the effect you want, either artistic or pedagogical. The notebook shows some of
the thumbnail galleries I used to pick the examples to show.
I went on to play with animations, which led to other learnings, but those
will have to wait for another blog post.
Update:
I animated these in
Natural cubics, circular Simplex
.