Python: logging different threads to different files

The OpenAddresses code includes a Python task manager that spins off a lot of Python threads, one per input file describing a job. We’d like to capture the debug log records for each thread in a separate file, for presentation in a dashboard. How to do that?

This demo program I wrote shows how. The key concept here is a Python logging.Handler object that uses some per-thread state to decide where to write the message. In this case I use the thread name itself, but you could also use some state stored in a threading.local() object.
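To make the idea concrete, here is a minimal sketch of such a Handler. It is not the demo program itself, just an illustration under some assumptions: the class name ThreadNameFileHandler, the helper run_demo, the logger name, and the one-file-per-thread-name layout in a temp directory are all made up for this example.

```python
import logging
import os
import tempfile
import threading


class ThreadNameFileHandler(logging.Handler):
    """Hypothetical handler: appends each record to <directory>/<threadName>.log."""

    def __init__(self, directory):
        super().__init__()
        self.directory = directory
        self._streams = {}  # thread name -> open file object

    def emit(self, record):
        # Handler.handle() holds the handler's lock while emit() runs,
        # so touching the shared dict here is safe.
        try:
            stream = self._streams.get(record.threadName)
            if stream is None:
                path = os.path.join(self.directory, record.threadName + ".log")
                stream = open(path, "a", encoding="utf-8")
                self._streams[record.threadName] = stream
            stream.write(self.format(record) + "\n")
            stream.flush()
        except Exception:
            self.handleError(record)

    def close(self):
        self.acquire()
        try:
            for stream in self._streams.values():
                stream.close()
            self._streams.clear()
        finally:
            self.release()
        super().close()


def run_demo(job_names):
    """Start one thread per job name; return the directory holding the logs."""
    log_dir = tempfile.mkdtemp()
    logger = logging.getLogger("thread_file_demo")  # assumed logger name
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    handler = ThreadNameFileHandler(log_dir)
    handler.setFormatter(logging.Formatter("%(threadName)s %(levelname)s %(message)s"))
    logger.addHandler(handler)

    def work():
        logger.debug("starting")
        logger.debug("finished")

    threads = [threading.Thread(target=work, name=name) for name in job_names]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    logger.removeHandler(handler)
    handler.close()
    return log_dir
```

Calling run_demo(["job-a", "job-b"]) should leave job-a.log and job-b.log in the returned directory, each holding only its own thread's records. The routing key is record.threadName, which logging fills in when the record is created, so the threads themselves need no special setup beyond a name.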

This also feels related: log4j had a way to store information in ThreadLocal variables that you could then write out to log files. I.e., you could add a special per-thread variable like “job name” or “HTTP session ID” and have the formatter print it out where appropriate. It was quite handy in some use cases. Python doesn’t exactly have that, but the cookbook has notes on doing something similar with Filters.
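Here is a rough sketch of how the Filter idea might look, assuming a threading.local() object holds the per-thread value. The names context, job_name, JobNameFilter and run_demo are invented for this example and are not taken from the cookbook.

```python
import io
import logging
import threading

# Per-thread storage; "job_name" is a made-up attribute for this sketch.
context = threading.local()


class JobNameFilter(logging.Filter):
    """Copies the calling thread's job name onto every record it sees."""

    def filter(self, record):
        # Handlers run in the thread that made the logging call,
        # so this reads that thread's own value.
        record.job_name = getattr(context, "job_name", "-")
        return True  # never drop the record, only annotate it


def run_demo():
    """Log from two threads and return everything written, as one string."""
    buffer = io.StringIO()
    handler = logging.StreamHandler(buffer)
    handler.setFormatter(logging.Formatter("[%(job_name)s] %(message)s"))
    handler.addFilter(JobNameFilter())

    logger = logging.getLogger("context_demo")  # assumed logger name
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    logger.addHandler(handler)

    def work(job_name):
        context.job_name = job_name
        logger.debug("processing")

    threads = [threading.Thread(target=work, args=(name,)) for name in ("alpha", "beta")]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    logger.removeHandler(handler)
    return buffer.getvalue()
```

The filter is attached to the handler rather than the logger so that it also applies to records arriving from child loggers. The formatter can then use %(job_name)s like any built-in field.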

Update: see also Mike’s alternate approach. He creates a new Handler for every thread, then uses a Filter to only show messages from the thread he cares about. I like how that approach makes more use of logging’s machinery; it is a bit weird how I had to make a whole Handler that did something odd with the output. I think Mike’s approach means that for N threads we’ll end up with N Handlers, each seeing every log message. That’d be inefficient, but unless there are 100+ threads I doubt it matters.
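For comparison, here is my own sketch of the handler-per-thread idea as I understand it. This is not Mike’s actual code: ThreadNameFilter and run_demo are hypothetical names, and creating all the handlers up front in the main thread is a simplification I chose to keep the example deterministic.

```python
import logging
import os
import tempfile
import threading


class ThreadNameFilter(logging.Filter):
    """Passes only records created by the thread with the given name."""

    def __init__(self, thread_name):
        super().__init__()
        self.thread_name = thread_name

    def filter(self, record):
        return record.threadName == self.thread_name


def run_demo(job_names):
    """Attach one filtered FileHandler per job, run the jobs, return the log dir."""
    log_dir = tempfile.mkdtemp()
    logger = logging.getLogger("per_thread_handler_demo")  # assumed logger name
    logger.setLevel(logging.DEBUG)
    logger.propagate = False

    # One handler per thread. Every handler is offered every record,
    # and its filter rejects the ones from other threads.
    handlers = []
    for name in job_names:
        path = os.path.join(log_dir, name + ".log")
        handler = logging.FileHandler(path, encoding="utf-8")
        handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
        handler.addFilter(ThreadNameFilter(name))
        logger.addHandler(handler)
        handlers.append(handler)

    def work():
        logger.debug("working in %s", threading.current_thread().name)

    threads = [threading.Thread(target=work, name=name) for name in job_names]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    for handler in handlers:
        logger.removeHandler(handler)
        handler.close()
    return log_dir
```

The cost mentioned above is visible in the loop: with N jobs the logger carries N handlers, and each record is checked against all N filters before exactly one of them writes it.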