Feed: Nikita Prokopov

Entries found: 5

Statistics made simple

Published: 2025-12-15T00:00:00Z
Updated: 2025-12-15T17:17:27Z
UTC: 2025-12-15 17:17:27+00:00
URL: https://tonsky.me/blog/clj-simple-stats/

Announcing a simple statistics library for Clojure web servers
Content Preview

I have a weird relationship with statistics: on one hand, I try not to look at it too often. Maybe once or twice a year. It’s because analytics is not actionable: what difference does it make if a thousand people saw my article or ten thousand?

I mean, sure, you might try to guess people’s tastes and only write about what’s popular, but that will destroy your soul pretty quickly.

On the other hand, I feel nervous when something is not accounted for, recorded, or saved for future reference. I might not need it now, but what if ten years later I change my mind?

Seeing your readers also helps to know you are not writing into the void. So I really don’t need much, something very basic: the number of readers per day/per article, maybe, would be enough.

Final piece of the puzzle: I self-host my web projects, and I use an old-fashioned web server instead of delegating that task to Nginx.

Static sites are popular and for a good reason: they are fast, lightweight, and fulfil their function. I, on the other hand, might have an unfinished gestalt or two: I want to feel the full power of the computer when serving my web pages, to be able to do fun stuff that is beyond static pages. I need that freedom that comes with a full programming language at your disposal. I want to program my own web server (in Clojure, sorry everybody else).

Existing options

All this led me on a quest for a statistics solution that would uniquely fit my needs. Google Analytics was out: bloated, not privacy-friendly, terrible UX, Google is evil, etc.

What is going on?

Some other JS solution might’ve been possible, but still questionable: SaaS? Paid? Will they be around in 10 years? Self-host? Are their cookies GDPR-compliant? How to count RSS feeds?

Nginx has access logs, so I tried server-side statistics that feed off those (namely, Goatcounter). Easy to set up, but then I needed to create domains for them, manage accounts, monitor the process, and it wasn’t even performant enough on my server/request volume!

My solution

So I ended up building my own. You are welcome to join, if your constraints are similar to mine. This is how it looks:

It’s pretty basic, but does a few things that were important to me.

Setup

Extremely easy to set up. And I mean it as a feature.

Just add our middleware to your Ring stack and get everything automatically: collecting and reporting.

(def app
  (-> routes
    ...
    (ring.middleware.params/wrap-params)
    (ring.middleware.cookies/wrap-cookies)
    ...
    (clj-simple-stats.core/wrap-stats))) ;; <-- just add this

It’s zero setup in the best sense: nothing to configure, nothing to monitor, minimal dependency. It starts to work immediately and doesn’t ask anything from you, ever.

See, you already have your web server, why not reuse all the setup you did for it anyway?

Request types

We distinguish between request types. In my case, I am only interested in live people, so I count them separately from RSS feed requests, favicon requests, redirects, wrong URLs, and bots. Bots are particularly active these days. Gotta get that AI training data from somewhere.
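As a rough illustration of what separating request types can look like, here is a minimal Python sketch. The rules, the feed paths, and the bot pattern are all assumptions made up for this example; they are not the rules clj-simple-stats actually applies.

```python
import re

# Hypothetical rules, for illustration only; not clj-simple-stats' actual logic.
BOT_PATTERN = re.compile(r"bot|crawler|spider", re.IGNORECASE)
FEED_PATHS = {"/feed.xml", "/atom.xml"}


def classify_request(path: str, status: int, user_agent: str) -> str:
    """Return one of: favicon, redirect, not-found, feed, bot, browser."""
    if path == "/favicon.ico":
        return "favicon"
    if 300 <= status < 400:
        return "redirect"
    if status == 404:
        return "not-found"
    if path in FEED_PATHS:
        return "feed"
    if BOT_PATTERN.search(user_agent):
        return "bot"
    # Whatever is left is treated as a live person in a browser.
    return "browser"
```

The order of the checks matters: a bot requesting a missing page is counted as a wrong URL here, not as a bot. A real classifier would likely need a much longer bot list.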

RSS feeds are live people in a sense, so extra work was done to count them properly. The same reader requesting feed.xml 100 times in a day will only count as one request.
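One way to get "100 requests, one reader" is to deduplicate on a (day, reader) pair. The sketch below assumes each request has already been reduced to a day and some reader key; what goes into that key (IP, User-Agent, both) is a hypothetical choice here, not something the post specifies.

```python
def count_unique_feed_readers(requests):
    """requests: iterable of (day, reader_key) pairs.

    Returns {day: number of distinct readers seen that day}.
    reader_key is a hypothetical identifier, e.g. (ip, user_agent).
    """
    seen = set()
    per_day = {}
    for day, reader_key in requests:
        if (day, reader_key) in seen:
            continue  # same reader, same day: already counted
        seen.add((day, reader_key))
        per_day[day] = per_day.get(day, 0) + 1
    return per_day


# One reader polling 100 times plus one other reader on the same day.
polls = [("2025-12-15", "reader-a")] * 100 + [("2025-12-15", "reader-b")]
unique = count_unique_feed_readers(polls)
```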

Hosted RSS readers often report user count in User-Agent, like this:

Feedly/1.0 (+http://www.feedly.com/fetcher.html; 457 subscribers; like FeedFetcher-Google)

Mozilla/5.0 (compatible; BazQux/2.4; +https://bazqux.com/fetcher; 6 subscribers)

Feedbin feed-id:1373711 - 142 subscribers
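The three User-Agent strings above share a "N subscribers" fragment, so a small regular expression is enough to pull the number out. This is a sketch against those three samples only; other readers may format the count differently, and falling back to 1 for strings without a count is an assumption of this example.

```python
import re

# Matches "457 subscribers", "6 subscribers", "1 subscriber", etc.
SUBSCRIBERS = re.compile(r"(\d+)\s+subscribers?", re.IGNORECASE)


def subscriber_count(user_agent: str) -> int:
    """Number of subscribers reported in a User-Agent, or 1 if none is reported."""
    match = SUBSCRIBERS.search(user_agent)
    return int(match.group(1)) if match else 1


samples = [
    "Feedly/1.0 (+http://www.feedly.com/fetcher.html; 457 subscribers; like FeedFetcher-Google)",
    "Mozilla/5.0 (compatible; BazQux/2.4; +https://bazqux.com/fetcher; 6 subscribers)",
    "Feedbin feed-id:1373711 - 142 subscribers",
]
counts = [subscriber_count(ua) for ua in samples]
```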

My personal respect and thank you to everybody on this list. I see you.

Graphs

Visualization is important, and so is choosing the correct graph type. This is wrong:

Continuous line suggests interpolation. It reads like between 1 visit at 5am and 11 visits at 6am there were points with 2, 3, 5, 9 visits in between. Maybe 5.5 visits even! That is not the case.

This is how a semantically correct version of that graph should look:

Some attention was also paid to having reasonable labels on axes. You won’t see something like 117, 234, 10875. We always choose round numbers appropriate to the scale: 100, 200, 500, 1K etc.
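A common way to get labels like 100, 200, 500, 1K is to pick the step from the 1-2-5 series scaled by a power of ten. The sketch below is one such approach, not necessarily what the library does; the function names and the default of roughly five ticks are assumptions for illustration.

```python
import math


def nice_step(max_value: float, max_ticks: int = 5) -> float:
    """Smallest 1-2-5 step that covers max_value in at most max_ticks steps."""
    if max_value <= 0:
        return 1
    raw = max_value / max_ticks
    magnitude = 10 ** math.floor(math.log10(raw))
    for multiplier in (1, 2, 5, 10):
        step = multiplier * magnitude
        if step >= raw:
            return step
    return 10 * magnitude


def axis_ticks(max_value: float, max_ticks: int = 5) -> list:
    """Round tick values from 0 up to the first tick at or above max_value."""
    step = nice_step(max_value, max_ticks)
    count = math.ceil(max_value / step)
    return [i * step for i in range(count + 1)]


def format_tick(value: float) -> str:
    """Short labels: 500, 1K, 15K, 2M."""
    if value >= 1_000_000:
        return f"{value / 1_000_000:g}M"
    if value >= 1_000:
        return f"{value / 1_000:g}K"
    return f"{value:g}"


labels = [format_tick(t) for t in axis_ticks(10875)]
```

With a maximum of 10875 visits, the axis ends up with ticks at 0, 5K, 10K, and 15K instead of 10875 itself.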

Goes without saying that all graphs have the same vertical scale and synchronized horizontal scroll.

Insights

We don’t offer much (as I don’t need much), but you can narrow reports down by page, query, referrer, user agent, and any date slice.

Not implemented (yet)

It would be nice to have some insights into “What was this spike caused by?”

Some basic breakdown by country would be nice. I do have IP addresses (for what they are worth), but I need a way to package GeoIP into some reasonable size (under 1 MB, preferably; some loss of resolution is okay).

Finally, one thing I am really interested in is “Who wrote about me?” I do have referrers, only question is how to separate signal from noise.

Performance. DuckDB is a sport: it compresses data and runs column queries, so storing extra columns per row doesn’t affect query performance. Still, each dashboard hit is a query across the entire database, which at this moment (~3 years of data) sits around 600 MiB. I definitely need to look into building some pre-calculated aggregates.

One day.
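For a sense of what pre-calculated aggregates could mean here, this is a minimal in-memory sketch: keep a small running count per (day, path) as requests arrive, and answer dashboard queries from those counts instead of scanning every stored row. The class and method names are hypothetical, and a real version would presumably live in the database rather than in a Python dictionary.

```python
from collections import Counter


class DailyAggregates:
    """Hypothetical rollup: one counter per (day, path) instead of one row per request."""

    def __init__(self):
        self.visits = Counter()  # (day, path) -> number of visits

    def record(self, day: str, path: str) -> None:
        """Update the aggregate as each request comes in."""
        self.visits[(day, path)] += 1

    def visits_per_day(self, path=None) -> dict:
        """Totals per day, optionally narrowed down to a single page."""
        totals = Counter()
        for (day, page), count in self.visits.items():
            if path is None or page == path:
                totals[day] += count
        return dict(sorted(totals.items()))


stats = DailyAggregates()
for day, path in [("2025-12-14", "/"), ("2025-12-15", "/"), ("2025-12-15", "/blog/")]:
    stats.record(day, path)
```

The trade-off is the usual one: every extra dimension you want to filter by (referrer, user agent) multiplies the number of aggregate rows.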

How to get

Head to github.com/tonsky/clj-simple-stats and follow the instructions:

Let me know what you think! Is it usable to you? What could be improved?

P.S. You can try the live example at tonsky.me/stats. The data was imported from Nginx access logs, which I turned on and off on a few occasions, so it’s a bit spotty. Still, it should give you a general idea.

How to get hired in 2025

Published: 2025-11-26T00:00:00Z
Updated: 2025-12-15T17:17:27Z
UTC: 2025-12-15 17:17:27+00:00
URL: https://tonsky.me/blog/hiring-ai/

A collection of red flags in software engineers' test assignments
Content Preview

It’s 2025 and you are applying for a software engineer position. They give you a test assignment. You complete it yourself, send it over, and get rejected. Why?

Because it looked like AI.

Unfortunately, it’s 2025, AI is spreading like glitter in a kindergarten, and it’s really easy to mistake hard human labor for soulless, uninspired machine slop.

Following are the main red flags in test assignments that should be avoided:

  • The assignment was read and understood in full.
  • All parts are implemented.
  • Industry-standard tools and frameworks are used.
  • The code is split into small, readable functions.
  • Variables have descriptive names.
  • Complex parts have comments.
  • Errors are handled, error messages are easy to follow.
  • Source files are organized reasonably.
  • The web interface looks nice.
  • There are tests.

Avoid these AI giveaways and spread the word!

Logo: Clojure+

Published: 2025-11-18T00:00:00Z
Updated: 2025-11-18T00:00:00Z
UTC: 2025-11-18 00:00:00+00:00
URL: https://tonsky.me/design/#2025-11-clojure-plus

Clojure+ is a project to improve Clojure stdlib.

Needy programs

Published: 2025-11-13T00:00:00Z
Updated: 2025-12-15T17:17:27Z
UTC: 2025-12-15 17:17:27+00:00
URL: https://tonsky.me/blog/needy-programs/

We used to use software; now software started to use us
Content Preview

If you’ve been around, you might’ve noticed that our relationships with programs have changed.

Older programs were all about what you need: you can do this, that, whatever you want, just let me know. You were in control, you were giving orders, and programs obeyed.

But recently (a decade, more or less), this relationship has subtly changed. Newer programs (which are called apps now, yes, I know) started to want things from you.

Accounts

The most obvious example is user accounts. In most cases, I, as a user, don’t need an account. Yet programs keep insisting that I, not them, “need” one.

I don’t. I already have more accounts than the population of a small town. This is something you want, not me.

The only correct reaction to an account screen

And even if you give up and create one, they will never leave you alone: they’ll ask for 2FA, then for password rotation, then will log you out for no good reason. You’ll never see the end of it either way.

This got so bad that when a program doesn’t ask you to create an account, it feels refreshing.

“Okay, but accounts are still needed to sync stuff between machines.”

Wrong. Syncthing is a secure, multi-machine distributed app and yet doesn’t need an account.

“Okay, but you still need an account if you pay for a subscription?”

Mullvad VPN accepts payments and yet didn’t ask me for my email.

How come these apps can go without an account, but your code editor and your terminal can’t?

Updates

Every program has an update mechanism now. Everybody is checking for updates all the time. Some notoriously bad ones lock you out until you update. You get notified a few seconds after a new version is available.

And yet: do we, users, really need these updates? Did we ask for them?

I’ve been running barebone Nvidia drivers without their bloated desktop app (partly because it asks for an account, lol).

As a result, there’s nobody to notify me about new drivers. And you know what? It’s been fine. I could forget to update for months, and still everything works. It’s the most relaxing I’ve felt in a while.

Even terminal programs bother you with updates now.

There was a new major release of Syncthing in August. How did I learn about it? By accident; a friend told me. And you know what? I’m happy with that. If I upgrade, nothing in my life will change. It works just fine now. So do I really need an update? Is it my need?

It’s simple, really. If I need an update, I will know it: I’ll encounter a bug or a lack of functionality. Then I’ll go and update.

Until then, politely fuck off.

Notifications

Notifications are the ultimate example of neediness: a program, a mechanical, lifeless thing, an inanimate object, is bothering its master about something the master didn’t ask for. Hey, who is more important here, a human or a machine?

Notifications are like email: to-do items that are forced on you by another party. Hey, it’s not my job to dismiss your notifications!

I just downloaded this and already have three notifications to dismiss.

Sure, there are good notifications. Sometimes users need to be notified about something they care about, like the end of a long-running process.

But the general pattern is so badly abused that it’s hard to justify it now. You can make a case that giving a toddler a gun can help it protect itself. But much worse things will probably happen much sooner.

These fucking dots.

There’s no good reason why, e.g., a code editor needs a notification system. What’s there to notify about? Updates? Sublime Text has no notifications. And you know what? It works just fine. I never felt underinformed while using it.

The ultimate example: account, update, and notification

Onboarding

The company needs to announce a new feature and makes a popup window about it.

Read this again: The company. Needs. It’s not even about the user. Never has been.

What’s new in Calendar? I don’t know, 13th month?

Did I ask about Copilot? No. The company wants me to use it. Not me:

Do I care about Figma Make? Not really, no.

Yet I still know about it, against my will.

To sum it up

I’ve read somewhere (sorry, lost the link):

ls never asks you to create an account or to update.

I agree. ls is a good program. ls is a tool. It does what I need it to do and stays quiet otherwise. I use it; it doesn’t use me. That’s a good, healthy relationship.

At the other end of the spectrum, we have services. Programs that constantly update. Programs that have news, that “keep you informed”. Programs that need something from you all the time. Programs that update Terms of Service just to remind you of themselves.

Programs that have their own agenda and that are trying to make it yours, too. Programs that want you to think about them. Programs that think they are entitled to a part of your attention. “Pick me” programs.

And you know what? Fuck these programs. Give me back my computer.

I am sorry, but everyone is getting syntax highlighting wrong

Published: 2025-10-15T00:00:00Z
Updated: 2025-12-15T17:17:27Z
UTC: 2025-12-15 17:17:27+00:00
URL: https://tonsky.me/blog/syntax-highlighting/

Applying human ergonomics and design principles to syntax highlighting
Content Preview

Translations: Russian

Syntax highlighting is a tool. It can help you read code faster. Find things quicker. Orient yourself in a large file.

Like any tool, it can be used correctly or incorrectly. Let’s see how to use syntax highlighting to help you work.

Christmas Lights Diarrhea

Most color themes have a unique bright color for literally everything: one for variables, another for language keywords, constants, punctuation, functions, classes, calls, comments, etc.

Sometimes it gets so bad one can’t see the base text color: everything is highlighted. What’s the base text color here?

The problem with that is, if everything is highlighted, nothing stands out. Your eye adapts and considers it a new norm: everything is bright and shiny, and instead of getting separated, it all blends together.

Here’s a quick test. Try to find the function definition here:

and here:

See what I mean?

So yeah, unfortunately, you can’t just highlight everything. You have to make decisions: what is more important, what is less. What should stand out, what shouldn’t.

Highlighting everything is like assigning “top priority” to every task in Linear. It only works if most of the tasks have lesser priorities.

If everything is highlighted, nothing is highlighted.

Enough colors to remember

There are two main use-cases you want your color theme to address:

  1. Look at something and tell what it is by its color (you can tell by reading text, yes, but why do you need syntax highlighting then?)
  2. Search for something. You want to know what to look for (which color).

1 is a direct index lookup: color → type of thing.

2 is a reverse lookup: type of thing → color.

Truth is, most people don’t do these lookups at all. They might think they do, but in reality, they don’t.

Let me illustrate. Before:

After:

Can you see it? I misspelled return as retunr and its color switched from red to purple.

I can’t.

Here’s another test. Close your eyes (not yet! Finish this sentence first) and try to remember what color your color theme uses for class names?

Can you?

If the answer to both questions is “no”, then your color theme is not functional. It might give you comfort (as in: I feel safe. If it’s highlighted, it’s probably code), but you can’t use it as a tool. It doesn’t help you.

What’s the solution? Have an absolute minimum of colors. So few that they all fit in your head at once. For example, my color theme, Alabaster, only uses four:

  • Green for strings
  • Purple for constants
  • Yellow for comments
  • Light blue for top-level definitions

That’s it! And I was able to type it all from memory, too. This minimalism allows me to actually do lookups: if I’m looking for a string, I know it will be green. If I’m looking at something yellow, I know it’s a comment.
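The two lookups can be written down literally. The sketch below uses the four Alabaster colors listed above as a Python dictionary (direct lookup) and derives the reverse lookup from it; the variable names are made up for this example.

```python
# Direct lookup: color -> type of thing (the four colors listed above).
KIND_BY_COLOR = {
    "green": "strings",
    "purple": "constants",
    "yellow": "comments",
    "light blue": "top-level definitions",
}

# Reverse lookup: type of thing -> color.
COLOR_BY_KIND = {kind: color for color, kind in KIND_BY_COLOR.items()}

# The reverse table is only well defined because no two colors share a meaning.
assert len(COLOR_BY_KIND) == len(KIND_BY_COLOR)

looking_at = KIND_BY_COLOR["yellow"]     # "I see yellow, what is it?"
looking_for = COLOR_BY_KIND["strings"]   # "I want a string, which color?"
```

With four entries, both tables fit in your head. With twenty, neither does, which is the point of the paragraph above.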

Limit the number of different colors to what you can remember.

If you swap green and purple in my editor, it’ll be a catastrophe. If somebody swapped colors in yours, would you even notice?

What should you highlight?

Something there isn’t a lot of. Remember—we want highlights to stand out. That’s why I don’t highlight variables or function calls—they are everywhere, your code is probably 75% variable names and function calls.
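If you want to check how much of a file is names, a tokenizer can count them. The sketch below uses Python's standard tokenize module on Python source, so it only says something about Python code, and it counts tokens rather than characters; the 75% figure above is an estimate from the post, not something this code confirms.

```python
import io
import keyword
import tokenize

# Layout and comment tokens are ignored; only "visible" code tokens are counted.
SKIPPED = {
    tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
    tokenize.DEDENT, tokenize.ENDMARKER, tokenize.COMMENT,
}


def identifier_share(source: str) -> float:
    """Fraction of code tokens that are names (variables, functions), not keywords."""
    total = 0
    identifiers = 0
    for token in tokenize.generate_tokens(io.StringIO(source).readline):
        if token.type in SKIPPED:
            continue
        total += 1
        if token.type == tokenize.NAME and not keyword.iskeyword(token.string):
            identifiers += 1
    return identifiers / total if total else 0.0


share = identifier_share("x = foo(y)\n")  # x, foo, y out of six tokens
```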

I do highlight constants (numbers, strings). These are usually used more sparingly and often are reference points—a lot of logic paths start from constants.

Top-level definitions are another good idea. They give you an idea of a structure quickly.

Punctuation: it helps to separate names from syntax a little bit, and you care about names first, especially when quickly scanning code.

Please, please don’t highlight language keywords: class, function, if, else, stuff like this. You rarely look for them: “where’s that if” is a valid question, but you will be looking not at the if keyword itself, but at the condition after it. The condition is the important, distinguishing part. The keyword is not.

Highlight names and constants. Grey out punctuation. Don’t highlight language keywords.

Comments are important

The tradition of using grey for comments comes from the times when people were paid by line. If you have something like

of course you would want to grey it out! This is bullshit text that doesn’t add anything and was written to be ignored.

But for good comments, the situation is opposite. Good comments ADD to the code. They explain something that couldn’t be expressed directly. They are important.

So here’s another controversial idea:

Comments should be highlighted, not hidden away.

Use bold colors, draw attention to them. Don’t shy away. If somebody took the time to tell you something, then you want to read it.

Two types of comments

Another secret nobody is talking about is that there are two types of comments:

  1. Explanations
  2. Disabled code

Most languages don’t distinguish between those, so there’s not much you can do syntax-wise. Sometimes there’s a convention (e.g. -- vs /* */ in SQL), then use it!

Here’s a real example from Clojure codebase that makes perfect use of two types of comments:

Disabled code is gray, explanation is bright yellow

Light or dark?

Per statistics, 70% of developers prefer dark themes. Being in the other 30%, that question always puzzled me. Why?

And I think I have an answer. Here’s a typical dark theme:

and here’s a light one:

On the latter one, colors are way less vibrant. Here, I picked them out for you:

Notice how many colors there are. No one can remember that many.

This is because dark colors are in general less distinguishable and more muddy. Look at the hue scale as we move brightness down:

Basically, in the dark part of the spectrum, you just get fewer colors to play with. There’s no “dark yellow” or good-looking “dark teal”.

Nothing can be done here. There are no magic colors hiding somewhere that both have good contrast on a white background and look good at the same time. By choosing a light theme, you are dooming yourself to a very limited, bad-looking, barely distinguishable set of dark colors.

So it makes sense. Dark themes do look better. Or rather: light ones can’t look good. Science ¯\_(ツ)_/¯

But!

But.

There is one trick you can do, that I don’t see a lot of. Use background colors! Compare:

The first one has nice colors, but the contrast is too low: letters become hard to read.

The second one has good contrast, but you can barely see colors.

The last one has both: high contrast and clean, vibrant colors. Lighter colors are readable even on a white background since they fill a lot more area. Text is the same brightness as in the second example, yet it gives the impression of clearer color. It’s all upside, really.
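"Contrast is too low" can be put into numbers. One widely used measure is the WCAG contrast ratio, computed from the relative luminance of the two colors. Using WCAG here is a choice made for this sketch (the post does not name a metric), and the function names are made up for the example.

```python
def _linear(channel: int) -> float:
    """Convert one 0-255 sRGB channel to linear light."""
    c = channel / 255
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """WCAG relative luminance of a color written as '#rrggbb'."""
    hex_color = hex_color.lstrip("#")
    r, g, b = (int(hex_color[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """WCAG contrast ratio, from 1 (identical) to 21 (black on white)."""
    lum_a = relative_luminance(foreground)
    lum_b = relative_luminance(background)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


yellow_text_on_white = contrast_ratio("#ffff00", "#ffffff")
black_text_on_yellow = contrast_ratio("#000000", "#ffff00")
```

Pure yellow text on white scores barely above 1, which is why it is unreadable as a text color. Black text on a yellow background scores far higher, and that is the background trick in numbers.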

UI designers have known about this trick for a while, but I rarely see it applied in code editors:

If your editor supports choosing background color, give it a try. It might open light themes for you.

Bold and italics

Don’t use. This goes into the same category as too many colors. It’s just another way to highlight something, and you don’t need too many, because you can’t highlight everything.

In theory, you might try to replace colors with typography. Would that work? I don’t know. I haven’t seen any examples.

Using italics and bold instead of colors

Myth of number-based perfection

Some themes pay too much attention to being scientifically uniform. Like, all colors have the same exact lightness, and hues are distributed evenly on a circle.

This could be nice (to know if you have OCD), but in practice, it doesn’t work as well as it sounds:

OkLab l=0.7473 c=0.1253 h=0, 45, 90, 135, 180, 225, 270, 315

The idea of highlighting is to make things stand out. If you make all colors the same lightness and chroma, they will look very similar to each other, and it’ll be hard to tell them apart.

Our eyes are way more sensitive to differences in lightness than in color, and we should use it, not try to negate it.
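To reproduce the swatches from the caption above and judge for yourself, the values can be converted to sRGB hex codes. The sketch assumes the caption's l, c, h are OKLCH coordinates (the polar form of OKLab) and uses the published OKLab conversion matrices; out-of-gamut values are simply clamped, which is a simplification.

```python
import math


def oklch_to_hex(lightness: float, chroma: float, hue_degrees: float) -> str:
    """Convert an OKLCH color to an sRGB '#rrggbb' string (naive clamping)."""
    # Polar (chroma, hue) to OKLab a/b axes.
    a = chroma * math.cos(math.radians(hue_degrees))
    b = chroma * math.sin(math.radians(hue_degrees))
    # OKLab to cone responses (the intermediate values are cube roots).
    l = (lightness + 0.3963377774 * a + 0.2158037573 * b) ** 3
    m = (lightness - 0.1055613458 * a - 0.0638541728 * b) ** 3
    s = (lightness - 0.0894841775 * a - 1.2914855480 * b) ** 3
    # Cone responses to linear sRGB.
    linear = (
        4.0767416621 * l - 3.3077115913 * m + 0.2309699292 * s,
        -1.2684380046 * l + 2.6097574011 * m - 0.3413193965 * s,
        -0.0041960863 * l - 0.7034186147 * m + 1.7076147010 * s,
    )

    def encode(x: float) -> int:
        x = min(max(x, 0.0), 1.0)  # clamp anything outside the sRGB gamut
        x = 12.92 * x if x <= 0.0031308 else 1.055 * x ** (1 / 2.4) - 0.055
        return round(x * 255)

    return "#" + "".join(f"{encode(x):02x}" for x in linear)


# Same lightness and chroma as the caption, eight evenly spaced hues.
palette = [oklch_to_hex(0.7473, 0.1253, hue) for hue in range(0, 360, 45)]
```

Paste the resulting hex codes into any color picker: the hues differ, but since lightness and chroma are identical, nothing in the set stands out from the rest.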

Let’s design a color theme together

Let’s apply these principles step by step and see where it leads us. We start with the theme from the start of this post:

First, let’s remove highlighting from language keywords and re-introduce base text color:

Next, we remove color from variable usage:

and from function/method invocation:

The thinking is that your code is mostly references to variables and method invocation. If we highlight those, we’ll have to highlight more than 75% of your code.

Notice that we’ve kept variable declarations. These are not as ubiquitous and help you quickly answer a common question: where does this thing come from?

Next, let’s tone down punctuation:

I prefer to dim it a little bit because it helps names stand out more. Names alone can give you the general idea of what’s going on, and the exact configuration of brackets is rarely equally important.

But you might roll with base color punctuation, too:

Okay, getting close. Let’s highlight comments:

We don’t use red here because you usually need it for squiggly lines and errors.

This is still one color too many, so I unify numbers and strings to both use green:

Finally, let’s rotate colors a bit. We want to respect nesting logic, so function declarations should be brighter (yellow) than variable declarations (blue).

Compare with what we started:

In my opinion, we got a much more workable color theme: it’s easier on the eyes and helps you find stuff faster.

Shameless plug time

I’ve been applying these principles for about 8 years now.

I call this theme Alabaster and I’ve built it a couple of times for the editors I used:

It’s also been ported to many other editors and terminals; the most complete list is probably here. If your editor is not on the list, try searching for it by name: it might be built-in already! I always wondered where these color themes come from, and now I became an author of one (and I still don’t know).

Feel free to use Alabaster as is or build your own theme using the principles outlined in the article—either is fine by me.

As for the principles themselves, they worked out fantastically for me. I’ve never wanted to go back, and just one look at any “traditional” color theme gives me a scare now.

I suspect that the only reason we don’t see more restrained color themes is that people never really thought about it. Well, this is your wake-up call. I hope this will inspire people to use color more deliberately and to change the default way we build and use color themes.