Feed: Ned Batchelder

Entries found: 10

A testing conundrum

Updated: 2025-12-18T05:30:06-05:00
UTC: 2025-12-18 10:30:06+00:00
URL: https://nedbatchelder.com/blog/202512/a_testing_conundrum.html

In coverage.py, I have a class for computing the fingerprint of a data structure. It’s used to avoid doing duplicate work when re-processing the same data won’t add to the outcome. It’s designed to work for nested data, and to canonicalize things like set ordering. The slightly simplified code looks like this:

import hashlib
from typing import Any

class Hasher:
    """Hashes Python data for fingerprinting."""

    def __init__(self) -> None:
        self.hash = hashlib.new("sha3_256")

    def update(self, v: Any) -> None:
        """Add `v` to the hash, recursively if needed."""
        self.hash.update(str(type(v)).encode("utf-8"))
        match v:
            case None:
                pass
            case str():
                self.hash.update(v.encode("utf-8"))
            case bytes():
                self.hash.update(v)
            case int() | float():
                self.hash.update(str(v).encode("utf-8"))
            case tuple() | list():
                for e in v:
                    self.update(e)
            case dict():
                for k, kv in sorted(v.items()):
                    self.update(k)
                    self.update(kv)
            case set():
                self.update(sorted(v))
            case _:
                raise ValueError(f"Can't hash {v = }")
        self.hash.update(b".")

    def digest(self) -> bytes:
        """Get the full binary digest of the hash."""
        return self.hash.digest()

To test this, I had some basic tests like:

def test_string_hashing():
    # Same strings hash the same.
    # Different strings hash differently.
    h1 = Hasher()
    h1.update("Hello, world!")
    h2 = Hasher()
    h2.update("Goodbye!")
    h3 = Hasher()
    h3.update("Hello, world!")
    assert h1.digest() != h2.digest()
    assert h1.digest() == h3.digest()

def test_dict_hashing():
    # The order of keys doesn't affect the hash.
    h1 = Hasher()
    h1.update({"a": 17, "b": 23})
    h2 = Hasher()
    h2.update({"b": 23, "a": 17})
    assert h1.digest() == h2.digest()

The last line in the update() method adds a dot to the running hash. It acts as a terminator after each value, so that differently nested structures can’t flatten to the same stream of bytes. That was to solve a problem covered by this test:

def test_dict_collision():
    # Nesting matters.
    h1 = Hasher()
    h1.update({"a": 17, "b": {"c": 1, "d": 2}})
    h2 = Hasher()
    h2.update({"a": 17, "b": {"c": 1}, "d": 2})
    assert h1.digest() != h2.digest()

The most recent change to Hasher was to add the set() clause. There (and in dict()), we are sorting the elements to canonicalize them. The idea is that equal values should hash equally and unequal values should not. Sets and dicts are equal regardless of their iteration order, so we sort them to get the same hash.

I added a test of the set behavior:

def test_set_hashing():
    h1 = Hasher()
    h1.update({(1, 2), (3, 4), (5, 6)})
    h2 = Hasher()
    h2.update({(5, 6), (1, 2), (3, 4)})
    assert h1.digest() == h2.digest()
    h3 = Hasher()
    h3.update({(1, 2)})
    assert h1.digest() != h3.digest()

But I wondered if there was a better way to test this class. My small one-off tests weren’t addressing the full range of possibilities. I could read the code and feel confident, but wouldn’t a more comprehensive test be better? This is a pure function: inputs map to outputs with no side-effects or other interactions. It should be very testable.

This seemed like a good candidate for property-based testing. The Hypothesis library would let me generate data, and I could check that the desired properties of the hash held true.

It took me a while to get the Hypothesis strategies wired up correctly. I ended up with this, but there might be a simpler way:

from hypothesis import strategies as st

scalar_types = [
    st.none(),
    st.booleans(),
    st.integers(),
    st.floats(allow_infinity=False, allow_nan=False),
    st.text(),
    st.binary(),
]

scalars = st.one_of(*scalar_types)

def tuples_of(strat):
    return st.lists(strat, max_size=3).map(tuple)

hashable_types = scalar_types + [tuples_of(s) for s in scalar_types]

# Homogeneous sets: all elements same type.
homogeneous_sets = (
    st.sampled_from(hashable_types)
    .flatmap(lambda s: st.sets(s, max_size=5))
)

# Full nested Python data.
python_data = st.recursive(
    scalars,
    lambda children: (
        st.lists(children, max_size=5)
        | tuples_of(children)
        | homogeneous_sets
        | st.dictionaries(st.text(), children, max_size=5)
    ),
    max_leaves=10,
)

This doesn’t make completely arbitrary nested Python data: sets are forced to have elements all of the same type or I wouldn’t be able to sort them. Dictionaries only have strings for keys. But this works to generate data similar to the real data we hash. I wrote this simple test:

from hypothesis import given

@given(python_data)
def test_one(data):
    # Hashing the same thing twice.
    h1 = Hasher()
    h1.update(data)
    h2 = Hasher()
    h2.update(data)
    assert h1.digest() == h2.digest()

This didn’t find any failures, but this is the easy test: hashing the same thing twice produces equal hashes. The trickier test is to get two different data structures, and check that their equality matches their hash equality:

@given(python_data, python_data)
def test_two(data1, data2):
    h1 = Hasher()
    h1.update(data1)
    h2 = Hasher()
    h2.update(data2)

    if data1 == data2:
        assert h1.digest() == h2.digest()
    else:
        assert h1.digest() != h2.digest()

This immediately found problems, but not in my code:

> assert h1.digest() == h2.digest()
E AssertionError: assert b'\x80\x15\xc9\x05...' == b'\x9ap\xebD...'
E
E   At index 0 diff: b'\x80' != b'\x9a'
E
E   Full diff:
E   - (b'\x9ap\xebD...)'
E   + (b'\x80\x15\xc9\x05...)'
E Falsifying example: test_two(
E     data1=(False, False, False),
E     data2=(False, False, 0),
E )

Hypothesis found that (False, False, False) is equal to (False, False, 0), but they hash differently. This is correct. The Hasher class takes the types of the values into account in the hash. False and 0 are equal, but they are different types, so they hash differently. The same problem shows up for 0 == 0.0 and 0.0 == -0.0. The theory of my test was incorrect: some values that are equal should hash differently.
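
For concreteness, here is the distinction in isolation:

# Hasher mixes str(type(v)) into the digest, so bool and int differ.
h1 = Hasher()
h1.update(False)
h2 = Hasher()
h2.update(0)
assert False == 0                  # equal as values...
assert h1.digest() != h2.digest()  # ...but different fingerprints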

In my real code, this isn’t an issue. I won’t ever be comparing values like this to each other. If I had a schema for the data I would be comparing, I could use it to steer Hypothesis to generate realistic data. But I don’t have that schema, and I’m not sure I want to maintain that schema. This Hasher is useful as it is, and I’ve been able to reuse it in new ways without having to update a schema.

I could write a smarter equality check for use in the tests, but that would roughly approximate the code in Hasher itself. Duplicating product code in the tests is a good way to write tests that pass but don’t tell you anything useful.

I could exclude bools and floats from the test data, but those are actual values I need to handle correctly.

Hypothesis was useful in that it didn’t find any failures other than the ones I described. I can’t leave those tests in the automated test suite because I don’t want to manually examine the failures, but at least this gave me more confidence that the code is good as it is now.

Testing is a challenge unto itself. This brought it home to me again. It’s not easy to know precisely what you want code to do, and it’s not easy to capture that intent in tests. For now, I’m leaving just the simple tests. If anyone has ideas about how to test Hasher more thoroughly, I’m all ears.

Autism Adulthood, 3rd edition

Updated: 2025-11-18T07:47:30-05:00
UTC: 2025-11-18 12:47:30+00:00
URL: https://nedbatchelder.com/blog/202511/autism_adulthood_3rd_edition.html

Today is the publication of the third edition of Autism Adulthood: Insights and Creative Strategies for a Fulfilling Life. It’s my wife Susan’s book collecting stories and experiences from people all along the autism spectrum, from the self-diagnosed to the profound.

The book includes dozens of interviews with autistic adults, their parents, caregivers, researchers, and professionals. Everyone’s experience of autism is different. Reading others’ stories and perspectives can give us a glimpse into other possibilities for ourselves and our loved ones.

If you have someone in your life on the spectrum, or are on it yourself, I guarantee you will find new ways to understand the breadth of what autism means and what it can be.

Susan has also written two other non-fiction autism books, including a memoir of our early days with our son Nat. Of course I highly recommend all of them.

Why your mock breaks later

Updated: 2025-11-16T07:55:48-05:00
UTC: 2025-11-16 12:55:48+00:00
URL: https://nedbatchelder.com/blog/202511/why_your_mock_breaks_later.html

In Why your mock doesn’t work I explained this rule of mocking:

Mock where the object is used, not where it’s defined.

That blog post explained why that rule was important: often a mock doesn’t work at all if you do it wrong. But in some cases, the mock will work even if you don’t follow this rule, and then it can break much later. Why?

Let’s say you have code like this:

# user.py

import json
from pathlib import Path

def get_user_settings():
    with open(Path("~/settings.json").expanduser()) as f:
        return json.load(f)

def add_two_settings():
    settings = get_user_settings()
    return settings["opt1"] + settings["opt2"]

You write a simple test:

def test_add_two_settings():
    # NOTE: need to create ~/settings.json for this to work:
    #   {"opt1": 10, "opt2": 7}
    assert add_two_settings() == 17

As the comment in the test points out, the test will only pass if you create the correct settings.json file in your home directory. This is bad: you don’t want to require finicky environments for your tests to pass.

The thing we want to avoid is opening a real file, so it’s a natural impulse to mock out open():

# test_user.py

from io import StringIO
from unittest.mock import patch

@patch("builtins.open")
def test_add_two_settings(mock_open):
    mock_open.return_value = StringIO('{"opt1": 10, "opt2": 7}')
    assert add_two_settings() == 17

Nice, the test works without needing to create a file in our home directory!

Much later...

One day your test suite fails with an error like:

...
  File ".../site-packages/coverage/python.py", line 55, in get_python_source
    source_bytes = read_python_source(try_filename)
  File ".../site-packages/coverage/python.py", line 39, in read_python_source
    return source.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
           ~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^
TypeError: replace() argument 1 must be str, not bytes

What happened!? Coverage.py code runs during your tests, invoked by the Python interpreter. The mock in the test changed the builtin open, so any use of it anywhere during the test is affected. In some cases, coverage.py needs to read your source code to record the execution properly. When that happens, coverage.py unknowingly uses the mocked open, and bad things happen.

When you use a mock, patch it where it’s used, not where it’s defined. In this case, the patch would be:

@patch("myproduct.user.open")
def test_add_two_settings(mock_open):
    ... etc ...

With a mock like this, the coverage.py code would be unaffected.

Keep in mind: it’s not just coverage.py that could trip over this mock. There could be other libraries used by your code, or you might use open yourself in another part of your product. Mocking the definition means anything using the object will be affected. Your intent is to only mock in one place, so target that place.

Postscript

I decided to add some code to coverage.py to defend against this kind of over-mocking. There is a lot of over-mocking out there, and this problem only shows up in coverage.py with Python 3.14. It’s not happening to many people yet, but it will happen more and more as people start testing with 3.14. I didn’t want to have to answer this question many times, and I didn’t want to force people to fix their mocks.

From a certain perspective, I shouldn’t have to do this. They are in the wrong, not me. But this will reduce the overall friction in the universe. And the fix was really simple:

open = open

This is a top-level statement in my module, so it runs when the module is imported, long before any tests are run. The assignment to open will create a global in my module, using the current value of open, the one found in the builtins. This saves the original open for use in my module later, isolated from how builtins might be changed later.
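
The same defense in a hypothetical module (a sketch, not coverage.py’s actual code) looks like this:

# mymodule.py (hypothetical)

open = open  # runs at import time; captures builtins.open as a module global

def read_file(filename):
    # Uses the module global captured above, so a later
    # patch("builtins.open") in a test can no longer reach us.
    with open(filename, "rb") as f:
        return f.read()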

This is an ad-hoc fix: it only defends one builtin. Mocking other builtins could still break coverage.py. But open is a common one, and this will keep things working smoothly for those cases. And there’s precedent: I’ve already been using a more involved technique to defend against mocking of the os module for ten years.

Even better!

No blog post about mocking is complete without encouraging a number of other best practices, some of which could get you out of the mocking mess:

  • Use autospec=True to make your mocks strictly behave like the original object: see Why your mock still doesn’t work.
  • Make assertions about how your mock was called to be sure everything is connected up properly.
  • Use verified fakes instead of auto-generated mocks: Fast tests for slow services: why you should use verified fakes.
  • Separate your code so that computing functions like our add_two_settings don’t also do I/O. This makes the functions easier to test in the first place. Take a look at Functional Core, Imperative Shell.
  • Dependency injection lets you explicitly pass test-specific objects where they are needed instead of relying on implicit access to a mock; see the sketch after this list.
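
A minimal sketch of that last idea, reworking the earlier add_two_settings (hypothetical code, not from the post):

def add_two_settings(get_settings=get_user_settings):
    # The settings source is a parameter: production code uses the
    # default, tests inject a stand-in. No patching required.
    settings = get_settings()
    return settings["opt1"] + settings["opt2"]

def test_add_two_settings():
    assert add_two_settings(lambda: {"opt1": 10, "opt2": 7}) == 17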

Three releases, one new organization

Updated: 2025-11-09T18:27:02-05:00
UTC: 2025-11-09 23:27:02+00:00
URL: https://nedbatchelder.com/blog/202511/three_releases_one_new_organization.html

It’s been a busy, bumpy week with coverage.py. Some things did not go smoothly, and I didn’t handle everything as well as I could have.

It started with trying to fix issue 2064 about conflicts between the “sysmon” measurement core and a concurrency setting.

To measure your code, coverage.py needs to know what code got executed. To know that, it collects execution events from the Python interpreter. CPython now has two mechanisms for this: trace functions and sys.monitoring. Coverage.py has two implementations of a trace function (in C and in Python), and an implementation of a sys.monitoring listener. These three components are the measurement cores, known as “ctrace”, “pytrace”, and “sysmon”.

The fastest is sysmon, but there are coverage.py features it doesn’t yet support. With Python 3.14, sysmon is the default core. Issue 2064 complained that when the defaulted core conflicted with an explicit concurrency choice, the conflict resulted in an error. I agreed with the issue: since the core was defaulted, it shouldn’t be an error; we should choose a different core instead.

But I figured if you explicitly asked for the sysmon core and also a conflicting setting, that should be an error because you’ve got two settings that can’t be used together.

Implementing all that got a little involved because of “metacov”: coverage.py coverage-measuring itself. The sys.monitoring facility in Python was added in 3.12, but wasn’t fully fleshed out enough to do branch coverage until 3.14. When we measure ourselves, we use branch coverage, so 3.12 and 3.13 needed some special handling to avoid causing the error that sysmon plus branch coverage would cause.

I got it all done, and released 7.11.1 on Friday.

Soon, issue 2077 arrived. Another fix in 7.11.1 involved some missing branches when using the sysmon core. That fix required parsing the source code during execution. But sometimes the “code” can’t be parsed: Jinja templates compile html files to Python and use the html file as the file name for the code. When coverage.py tries to parse the html file as Python, of course it fails. My fix didn’t account for this. I fixed that on Saturday and released 7.11.2.

In the meantime, issue 2076 and issue 2078 both pointed out that some settings combinations that used to produce warnings now produced errors. This is a breaking change, they said, and should not have been released as a patch version.

To be honest, my first reaction was that it wasn’t that big a deal, the settings were in conflict. Fix the settings and all will be well. It’s hard to remember all of the possibilities when making changes like this, it’s easy to make mistakes, and semantic versioning is bound to have judgement calls anyway. I had already spent a while getting 7.11.1 done, and .2 followed just a day later. I was annoyed and didn’t want to have to re-think everything.

But the more I thought about it, I decided they were right: it does break pipelines that used to work. And falling back to a different core is fine: the cores differ in speed and compatibility but (for the most part) produce the same results. Changing the requested core with a warning is a fine way to deal with the settings conflict without stopping test suites from running.

So I just released 7.11.3 to go back to the older behavior. Maybe I won’t have to do another release tomorrow!

While all this was going on, I also moved the code from my personal GitHub account to a new coveragepy GitHub organization!

Coverage.py is basically a one-man show. Maybe the GitHub organization will make others feel more comfortable chiming in, but I doubt it. I’d like to have more people to talk through changes with. Maybe I wouldn’t have had to make three releases in three days if someone else had been around as a sounding board.

I’m in the #coverage-py channel if you want to talk about any aspect of coverage.py, or I can be reached in lots of other ways. I’d love to talk to you.

Side project advice

Updated: 2025-10-30T06:23:13-04:00
UTC: 2025-10-30 10:23:13+00:00
URL: https://nedbatchelder.com/blog/202510/side_project_advice.html

Last night was a Boston Python project night where I had a good conversation with a few people that was mostly guided by questions from a nice guy named Mark.

How to write nice code in research

Mark works in research and made the classic observation that research code is often messy, and asked about how to make it nicer.

I pointed out that for software engineers, the code is the product. For research, the results are the product, so there’s a reason the code can be and often is messier. It’s important to keep the goal in mind. I mentioned it might not be worth it to add type annotations, detailed docstrings, or whatever else would make the code “nice”.

But the more you can make “nice” a habit, the less work it will be to do it as a matter of course. Even in a result-driven research environment, you’ll be able to write code the way you want, or at least push back a little bit. Code usually lives longer than people expect, so the nicer you can make it, the better it will be.

Side projects

Side projects are a good opportunity to work differently. If work means messy code, your side project could be pristine. If work is very strict, your side project can be thrown together just for fun. You get to set the goals.

And different side projects can be different. I develop coverage.py very differently than fun math art projects. Coverage.py has an extensive test suite run on many versions of Python (including nightly builds of the tip of main). The math art projects usually have no tests at all.

Side projects are a great place to decide how you want to code and to practice that style. Later you can bring those skills and learnings back to a work environment.

Forgive yourself

Mark said one of his difficulties with side projects is perfectionism. He’ll come back to a project and find he wants to rewrite the whole thing.

My advice is: forgive yourself. It’s OK to rewrite the whole thing. It’s OK to not rewrite the whole thing. It’s OK to ignore it for months at a time. It’s OK to stop in the middle of a project and never come back to it. It’s OK to obsess about “irrelevant” details.

The great thing about a side project is that you are the only person who decides what and how it should be.

How to stay motivated

But how to stay motivated on side projects? For me, it’s very motivating that many people use and get value from coverage.py. It’s a service to the community that I find rewarding. Other side projects will have other motivations: a chance to learn new things, flex different muscles, stretch myself in new ways.

Find a reason that motivates you, and structure your side projects to lean into that reason. Don’t forget to forgive yourself if it doesn’t work out the way you planned or if you change your mind.

How to write something people will use

Sure, it’s great to have a project that many people use, but how do you find a project that will end up like that? The best way is to write something that you find useful. Then talk about it with people. You never know what will catch on.

I mentioned my cog project, which I first wrote in 2004 for one reason, but which is now being used by other people (including me) for different purposes. It took years to catch on.

Of course there’s no guarantee something like that will happen: it most likely won’t. But I don’t know of a better way to make something people will use than to start by making something that you will use.

Other topics

The discussion wasn’t as linear as this. We touched on other things along the way: unit tests vs system tests, obligations to support old versions of software, how to navigate huge code bases. There were probably other tangents that I’ve forgotten.

Project nights are almost never just about projects: they are about connecting with people in lots of different ways. This discussion felt like a good connection. I hope the ideas of choosing your own paths and forgiving yourself hit home.

Natural cubics, circular Simplex

Updated: 2025-10-21T07:14:23-04:00
UTC: 2025-10-21 11:14:23+00:00
URL: https://nedbatchelder.com/blog/202510/natural_cubics_circular_simplex.html

This post continues where Hobby Hilbert Simplex left off. If you haven’t read it yet, start there. It explains the basics of Hobby curves, Hilbert sorting and Simplex noise that I’m using.

Animation

To animate one of our drawings, instead of considering 40 lines, we’ll think about 140 lines. The first frame of the animation will draw lines 1 through 40, the second draws lines 2 through 41, and so on until the 100th frame is lines 100 through 140:

Swoopy lines flowing across the image, but with occasional jumps

I’ve used a single Hilbert sorter for all of the frames to remove some jumping, but the Hobby curves still hop around. Also the animation doesn’t loop smoothly, so there’s a giant jump from frame 100 back to frame 1.

Natural cubics

Hobby curves look nice, but have this unfortunate discontinuity where a small change in a point can lead to a radical change in the curve. There’s another way to compute curves through points automatically, called natural cubic curves. These curves don’t jump around the way Hobby curves can.

Jake Low’s page about Hobby curves has interactive examples of natural cubic curves which you should try. Natural cubics don’t look as nice to our eyes as Hobby curves. Below is a comparison. Each row has the same points, with Hobby curves on the left and natural cubic curves on the right:

On the left are nice blobby shapes, on the right are the same points but connected with sometimes pointy awkward curves

The “natural” cubics actually have a quite unnatural appearance. But in an animation, those quirks could be a good trade-off for smooth transitions. Here’s an animation with the same points as our first one, but with natural cubic curves:

A flowing animation with pointier curves, only one jump at the end

Now the motion is smooth except for the jump from frame 100 back to frame 1. Let’s do something about that.

Circular Simplex

So far, we’ve been choosing points by sampling the simplex noise in small steps along a horizontal line: use a fixed u value, then take tiny steps along the v axis. That gave us our x coordinates, and a similar line with a different u value gave us the y coordinates. The ending point will be completely unrelated to the starting point. To make a seamlessly looping animation, we need our x,y values to cycle seamlessly, returning to where they started.

We can make our x,y coordinates loop by choosing u,v values in a circle. Because the u,v values return to their starting point in the continuous simplex noise, the x,y coordinates will return as well. We use two circles: one for the x coordinates and another for the y. The circles are far from each other to keep x and y independent of each other. The size of the circle is determined by the distance we want for each step and how many steps we want in the loop.
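
In code, circular sampling might look like this sketch (names are illustrative; noise2(u, v) stands in for any 2D simplex-noise function returning values in [-1, 1]):

import math

def looping_path(noise2, n_steps, step_len, x_center=(0.0, 0.0), y_center=(100.0, 100.0)):
    # Choose the radius so each step along the circle has length step_len.
    radius = step_len * n_steps / (2 * math.pi)
    points = []
    for i in range(n_steps):
        theta = 2 * math.pi * i / n_steps  # step n_steps lands back on step 0
        du = radius * math.cos(theta)
        dv = radius * math.sin(theta)
        x = noise2(x_center[0] + du, x_center[1] + dv)   # circle for x...
        y = noise2(y_center[0] + du, y_center[1] + dv)   # ...distant circle for y
        points.append((x, y))
    return points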

Here are three point paths created two ways, with linear sampling on the right and circular sampling on the left. Because simplex provides values between -1 and 1, the points wander within a square:

On the right, three trails of points that don't form a closed loop. On the left, three closed loops but still with interesting random shapes

It can get a bit confusing at this point: these traces are not the curves we are drawing. They are the paths of the control points for successive curves. We draw curves through corresponding sets of points to get our animation. The first curve connects the first red/green/blue points, the second curve connects the second set, and so on.

Using circular sampling of the simplex noise, we can make animations that loop perfectly:

A smoothly looping animation
A smoothly looping animation
A smoothly looping animation

Colophon

If you are interested, the code is available on GitHub at nedbat/fluidity.

Hobby Hilbert Simplex

Updated: 2025-09-26T08:14:04-04:00
UTC: 2025-09-26 12:14:04+00:00
URL: https://nedbatchelder.com/blog/202509/hobby_hilbert_simplex.html

I saw a generative art piece I liked and wanted to learn how it was made. Starting with the artist’s Kotlin code, I dug into three new algorithms, hacked together some Python code, experimented with alternatives, and learned a lot. Now I can explain it to you.

It all started with this post by aBe on Mastodon:

I love how these lines separate and reunite. And the fact that I can express this idea in 3 or 4 lines of code.

For me they’re lives represented by closed paths that end where they started, spending part of the journey together, separating while we go in different directions and maybe reconnecting again in the future.

# CreativeCoding #algorithmicart #proceduralArt #OPENRNDR #Kotlin

80 wobbly black hobby curves with low opacity. In some places the curves travel together, but sometimes they split in 2 or 3 groups and later reunite. Due to the low opacity, depending on how many curves overlap the result is brighter or darker.

The drawing is made by choosing 10 random points, drawing a curve through those points, then slightly scooching the points and drawing another curve. There are 40 curves, each slightly different than the last. Occasionally the next curve makes a jump, which is why they separate and reunite.

Eventually I made something similar:

An image similar to the one from Mastodon, with smoky sinuous curves

Along the way I had to learn about three techniques I got from the Kotlin code: Hobby curves, Hilbert sorting, and simplex noise.

Each of these algorithms tries to do something “natural” automatically, so that we can generate art that looks nice without any manual steps.

Hobby curves

To draw swoopy curves through our random points, we use an algorithm developed by John Hobby as part of Donald Knuth’s Metafont type design system. Jake Low has a great interactive page for playing with Hobby curves; you should try it.

Here are three examples of Hobby curves through ten random points:

Red random points connected by green lines then with a curve through all ten.

The curves are nice, but kind of a scribble, because we’re joining points together in the order we generated them (shown by the green lines). If you asked a person to connect random points, they wouldn’t jump back and forth across the canvas like this. They would find a nearby point to use next, producing a more natural tour of the set.

We’re generating everything automatically, so we can’t manually intervene to choose a natural order for the points. Instead we use Hilbert sorting.

Hilbert sorting

The Hilbert space-filling fractal visits every square in a 2D grid. Hilbert sorting uses a Hilbert fractal traversing the canvas, and sorts the points by when their square is visited by the fractal. This gives a tour of the points that corresponds more closely to what people expect. Points that are close together in space are likely (but not guaranteed) to be close in the ordering.
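
Sorting by Hilbert visit order takes only a few lines. Here is a sketch, assuming the hilbertcurve package’s HilbertCurve(p, n) API (p iterations, n dimensions) and points with coordinates in [0, 1):

from hilbertcurve.hilbertcurve import HilbertCurve

def hilbert_sort(points, iterations=8):
    hc = HilbertCurve(iterations, 2)   # a 2**8 x 2**8 grid of squares
    side = 2 ** iterations

    def visit_order(pt):
        # Quantize the point to a grid square, then ask when the
        # curve visits that square.
        ix = min(int(pt[0] * side), side - 1)
        iy = min(int(pt[1] * side), side - 1)
        return hc.distance_from_point([ix, iy])

    return sorted(points, key=visit_order)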

If we sort the points using Hilbert sorting, we get much nicer curves. Here are the same points as last time:

The same three examples of ten points, but the curves make more sense now

Here are pairs of the same points, unsorted and sorted side-by-side:

Comparing the scribbles and the nice curves

If you compare closely, the points in each pair are the same, but the sorted points are connected in a better order, producing nicer curves.

Simplex noise

Choosing random points would be easy to do with a random number generator, but we want the points to move in interesting graceful ways. To do that, we use simplex noise. This is a 2D function (let’s call the inputs u and v) that produces a value from -1 to 1. The important thing is the function is continuous: if you sample it at two (u,v) coordinates that are close together, the results will be close together. But it’s also random: the continuous curves you get are wavy in unpredictable ways. Think of the simplex noise function as a smooth hilly landscape.

To get an (x,y) point for our drawing, we choose a (u,v) coordinate to produce an x value and a completely different (u,v) coordinate for the y. To get the next (x,y) point, we keep the u values the same and change the v values by just a tiny bit. That makes the (x,y) points move smoothly but interestingly.
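
As a sketch, this linear scheme is only a few lines (names are illustrative; noise2(u, v) stands in for any 2D simplex-noise function):

def wandering_path(noise2, u_x, u_y, n_steps=50, step=0.01):
    # Fixed u for each coordinate; v inches along, so each point
    # moves smoothly from the one before.
    return [
        (noise2(u_x, i * step), noise2(u_y, i * step))
        for i in range(n_steps)
    ]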

Here are the trails of four points taking 50 steps using this scheme:

Four trails of red dots showing how the randomness creates unpredictable but interesting paths

If we use seven points taking five steps, and draw curves through the seven points at each step, we get examples like this:

Drawing curves through the points, widely spaced to show the construction

I’ve left the points visible, and given them large steps so the lines are very widely spaced to show the motion. Taking out the points and drawing more lines with smaller steps gives us this:

More lines to move toward the look we want

With 40 lines drawn wider with some transparency, we start to see the smoky fluidity:

Now we're getting the original effect

Jumps

In his Mastodon post, aBe commented on the separating of the lines as one of the things he liked about this. But why do they do that? If we are moving the points in small increments, why do the curves sometimes make large jumps?

The first reason is because of Hobby curves. They do a great job drawing a curve through a set of points as a person might. But a downside of the algorithm is sometimes changing a point a small amount makes the entire curve take a different route. If you play around with the interactive examples on Jake Low’s page you will see the curve can unexpectedly take a different shape.

As we inch our points along, sometimes the Hobby curve jumps.

The second reason is due to Hilbert sorting. Each of our lines is sorted independently of how the previous line was sorted. If a point’s small motion moves it into a different grid square, it can change the sorting order, which changes the Hobby curve even more.

If we sort the first line, and then keep that order of points for all the lines, the result has fewer jumps, but the Hobby curves still act unpredictably:

The same two sets of points as the last figure. Fewer jumps, but still with some discontinuities

Colophon

This was all done with Python, using other people’s implementations of the hard parts: hobby.py, hilbertcurve, and super-simplex. My code is on GitHub (nedbat/fluidity), but it’s a mess. Think of it as a woodworking studio with half-finished pieces and wood chips strewn everywhere.

A lot of the learning and experimentation was in my Jupyter notebook. Part of the process for work like this is playing around with different values of tweakable parameters and seeds for the random numbers to get the effect you want, either artistic or pedagogical. The notebook shows some of the thumbnail galleries I used to pick the examples to show.

I went on to play with animations, which led to other learnings, but those will have to wait for another blog post. Update: I animated these in Natural cubics, circular Simplex.

Testing is better than DSA

Updated: 2025-09-22T12:04:08-04:00
UTC: 2025-09-22 16:04:08+00:00
URL: https://nedbatchelder.com/blog/202509/testing_is_better_than_dsa.html

I see new learners asking about “DSA” a lot. Data Structures and Algorithms are of course important: considered broadly, they are the two ingredients that make up all programs. But in my opinion, “DSA” as an abstract field of study is over-emphasized.

I understand why people focus on DSA: it’s a concrete thing to learn about, there are web sites devoted to testing you on it, and most importantly, because job interviews often involve DSA coding questions.

Before I get to other opinions, let me make clear that anything you can do to help you get a job is a good thing to do. If grinding leetcode will land you a position, then do it.

But I hope companies hiring entry-level engineers aren’t asking them to reverse linked lists or balance trees. Asking about techniques that can be memorized ahead of time won’t tell them anything about how well you can work. The stated purpose of those interviews is to see how well you can figure out solutions, in which case memorization will defeat the point.

The thing new learners don’t understand about DSA is that actual software engineering almost never involves implementing the kinds of algorithms that “DSA” teaches you. Sure, it can be helpful to work through some of these puzzles and see how they are solved, but writing real code just doesn’t involve writing that kind of code.

Here is what I think in-the-trenches software engineers should know about data structures and algorithms:

  • Data structures are ways to organize data. Learn some of the basics: linked list, array, hash table, tree. By “learn” I mean understand what it does and why you might want to use one.
  • Different data structures can be used to organize the same data in different ways. Learn some of the trade-offs between structures that are similar.
  • Algorithms are ways of manipulating data. I don’t mean named algorithms like Quicksort, but algorithms as any chunk of code that works on data and does something with it.
  • How you organize data affects what algorithms you can use to work with the data. Some data structures will be slow for some operations where another structure will be fast; see the snippet after this list.
  • Algorithms have a “time complexity” (Big O): how the code slows as the data grows. Get a sense of what this means.
  • Python has a number of built-in data structures. Learn how they work, and the time complexity of their operations.
  • Learn how to think about your code to understand its time complexity.
  • Read a little about more esoteric things like Bloom filters, so you can find them later in the unlikely case you need them.
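
For instance, the same membership question has very different costs depending on the structure holding the data:

names_list = ["ada", "grace", "alan"] * 10_000
names_set = set(names_list)

"hopper" in names_list   # O(n): scans the whole list
"hopper" in names_set    # O(1) on average: one hash lookup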

Here are some things you don’t need to learn:

  • The details of a dozen different sorting algorithms. Look at two to see different ways of approaching the same problem, then move on.
  • The names of “important” algorithms. Those have all been implemented for you.
  • The answers to all N problems on some quiz web site. You won’t be asked these exact questions, and they won’t come up in your real work. Again: try a few to get a feel for how some algorithms work. The exact answers are not what you need.

Of course some engineers need to implement hash tables, or sorting algorithms or whatever. We love those engineers: they write libraries we can use off the shelf so we don’t have to implement them ourselves.

There have been times when I implemented something that felt like An Algorithm (for example, Finding fuzzy floats), but it was more about considering another perspective on my data, looking at the time complexity, and moving operations around to avoid quadratic behavior. It wasn’t opening a textbook to find the famous algorithm that would solve my problem.

Again: if it will help you get a job, deep-study DSA. But don’t be disappointed when you don’t use it on the job.

If you want to prepare yourself for a career, and also stand out in job interviews, learn how to write tests:

  • This will be a skill you use constantly. Real-world software means writing tests much more than school teaches you to.
  • In a job search, testing experience will stand out more than DSA depth. It shows you’ve thought about what it takes to write high-quality software instead of just academic exercises.
  • It’s not obvious how to test code well. It’s a puzzle and a problem to solve. If you like figuring out solutions to tricky questions, focus on how to write code so that it can be tested, and how to test it.
  • Testing not only gives you more confidence in your code, it helps you write better code in the first place.
  • Testing applies everywhere, from tiny bits of code to entire architectures, assisting you in design and implementation at all scales.
  • If pursued diligently, testing is an engineering discipline in its own right, with a fascinating array of tools and techniques.

Less DSA, more testing.

Finding unneeded pragmas

Updated: 2025-08-24T17:28:12-04:00
UTC: 2025-08-24 21:28:12+00:00
URL: https://nedbatchelder.com/blog/202508/finding_unneeded_pragmas.html

To answer a long-standing coverage.py feature request, I threw together an experiment: a tool to identify lines that have been excluded from coverage, but which were actually executed.

The program is a standalone file in the coverage.py repo. It is unsupported. I’d like people to try it to see what they think of the idea. Later we can decide what to do with it.

To try it: copy warn_executed.py from GitHub. Create a .toml file that looks something like this:

# Regexes that identify excluded lines:
warn-executed = [
    "pragma: no cover",
    "raise AssertionError",
    "pragma: cant happen",
    "pragma: never called",
    ]

# Regexes that identify partial branch lines:
warn-not-partial = [
    "pragma: no branch",
    ]

These are exclusion regexes that you’ve used in your coverage runs. The program will print out any line that matches one of the patterns and that ran during your tests. It might be that you don’t need to exclude the line, because it ran.

In this file, none of your coverage settings or the default regexes are assumed: you need to explicitly specify all the patterns you want flagged.

Run the program with Python 3.11 or higher, giving the name of the coverage data file and the name of your new TOML configuration file. It will print the lines that might not need excluding:

$ python3.12 warn_executed.py .coverage warn.toml

The reason for a new list of patterns instead of just reading the existing coverage settings is that some exclusions are “don’t care” rather than “this will never happen.” For example, I exclude “def __repr__” because some __repr__’s are just to make my debugging easier. I don’t care if the test suite runs them or not. It might run them, so I don’t want it to be a warning that they actually ran.

This tool is not perfect. For example, I exclude “if TYPE_CHECKING:” because I want that entire clause excluded. But the if-line itself is actually run. If I include that pattern in the warn-executed list, it will flag all of those lines. Maybe I’m forgetting a way to do this: it would be good to have a way to exclude the body of the if clause while understanding that the if-line itself is executed.
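
For concreteness, this is the shape of the problem (hypothetical module):

from typing import TYPE_CHECKING

if TYPE_CHECKING:   # this line runs at import time, so it would be flagged...
    from decimal import Decimal   # ...but the body never executes at runtime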

Give warn_executed.py a try and comment on the issue about what you think of it.

Starting with pytest’s parametrize

Updated: 2025-08-13T06:14:46-04:00
UTC: 2025-08-13 10:14:46+00:00
URL: https://nedbatchelder.com/blog/202508/starting_with_pytests_parametrize.html

Writing tests can be difficult and repetitive. Pytest has a feature called parametrize that can help reduce duplication, but it can be hard to understand if you are new to the testing world. It’s not as complicated as it seems.

Let’s say you have a function called add_nums() that adds up a list of numbers, and you want to write tests for it. Your tests might look like this:

def test_123():
    assert add_nums([1, 2, 3]) == 6

def test_negatives():
    assert add_nums([1, 2, -3]) == 0

def test_empty():
    assert add_nums([]) == 0

This is great: you’ve tested some behaviors of your add_nums() function. But it’s getting tedious to write out more test cases. The names of the functions have to differ from each other, and they don’t mean anything, so it’s extra work for no benefit. The test functions all have the same structure, so you’re repeating uninteresting details. You want to add more cases but it feels like there’s friction that you want to avoid.

If we look at these functions, they are very similar. In any software, when we have functions that are similar in structure, but differ in some details, we can refactor them to be one function with parameters for the differences. We can do the same for our test functions.

Here the functions all have the same structure: call add_nums() and assert what the return value should be. The differences are the list we pass to add_nums() and the value we expect it to return. So we can turn those into two parameters in our refactored function:

def test_add_nums(nums, expected_total):
    assert add_nums(nums) == expected_total

Unfortunately, tests aren’t run like regular functions. We write the test functions, but we don’t call them ourselves. That’s the reason the names of the test functions don’t matter. The test runner (pytest) finds functions named test_* and calls them for us. When they have no parameters, pytest can call them directly. But now that our test function has two parameters, we have to give pytest instructions about how to call it.

To do that, we use the @pytest.mark.parametrize decorator. Using it looks like this:

import pytest

@pytest.mark.parametrize(
    "nums, expected_total",
    [
        ([1, 2, 3], 6),
        ([1, 2, -3], 0),
        ([], 0),
    ]
)
def test_add_nums(nums, expected_total):
    assert add_nums(nums) == expected_total

There’s a lot going on here, so let’s take it step by step.

If you haven’t seen a decorator before, it starts with @ and is like a prologue to a function definition. It can affect how the function is defined or provide information about the function.

The parametrize decorator is itself a function call that takes two arguments. The first is a string (“nums, expected_total”) that names the two arguments to the test function. Here the decorator is instructing pytest: “when you call test_add_nums, you will need to provide values for its nums and expected_total parameters.”

The second argument to parametrize is a list of the values to supply as the arguments. Each element of the list will become one call to our test function. In this example, the list has three tuples, so pytest will call our test function three times. Since we have two parameters to provide, each element of the list is a tuple of two values.

The first tuple is ([1, 2, 3], 6), so the first time pytest calls test_add_nums, it will call it as test_add_nums([1, 2, 3], 6). All together, pytest will call us three times, like this:

test_add_nums([1, 2, 3], 6)
test_add_nums([1, 2, -3], 0)
test_add_nums([], 0)

This will all happen automatically. With our original test functions, when we ran pytest, it showed the results as three passing tests because we had three separate test functions. Now even though we only have one function, it still shows as three passing tests! Each set of values is considered a separate test that can pass or fail independently. This is the main advantage of using parametrize instead of writing three separate assert lines in the body of a simple test function.
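
Running pytest in verbose mode makes the three tests visible. The output looks something like this (file name hypothetical; the bracketed IDs are auto-generated and may differ):

$ python -m pytest -v test_add.py
test_add.py::test_add_nums[nums0-6] PASSED
test_add.py::test_add_nums[nums1-0] PASSED
test_add.py::test_add_nums[nums2-0] PASSED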

What have we gained?

  • We don’t have to write three separate functions with different names.
  • We don’t have to repeat the same details in each function (assert, add_nums(), ==).
  • The differences between the tests (the actual data) are written succinctly all in one place.
  • Adding another test case is as simple as adding another line of data to the decorator.