Non-ASCII filenames

I’m on a quest to clean up my music library. I decided to give up and make all my music filenames ASCII; no µ-Ziq, no Sigur Rós, no préludes. I feel bad, but after 10 years of struggles I still occasionally run into problems with Unicode filenames. Also, most music players grab their display info from metadata tags anyway, so the filenames don’t matter much.

Anyway, here’s a quick trick for finding non-ASCII filenames:

LC_ALL=C find . -name '*[! -~]*'

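The bracket expression matches any character outside the range 0x20 (space) to 0x7e (tilde), so find prints every name containing at least one such character. The LC_ALL setting is presumably there to disable any proper text processing. I.e., bytes, not characters.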
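For the record, here is roughly the same check as a Python sketch, in case I want to script the renaming later. It is not what find does internally, just my approximation of the same idea: walk the tree as bytes and flag any name with a byte outside space through tilde. The function names `is_printable_ascii` and `find_non_ascii` are made up for this example.

```python
import os
import sys


def is_printable_ascii(name: bytes) -> bool:
    # Same range as the glob [ -~]: bytes 0x20 (space) through 0x7e (tilde).
    return all(0x20 <= b <= 0x7E for b in name)


def find_non_ascii(root="."):
    """Yield paths (as bytes) whose last component has a byte outside 0x20-0x7e."""
    # Passing bytes to os.walk makes it return bytes, so no decoding happens.
    for dirpath, dirnames, filenames in os.walk(os.fsencode(root)):
        for name in dirnames + filenames:
            if not is_printable_ascii(name):
                yield os.path.join(dirpath, name)


if __name__ == "__main__":
    for path in find_non_ascii("."):
        # Write raw bytes so oddly encoded names cannot trip up printing.
        sys.stdout.buffer.write(path + b"\n")
```

Working on bytes is the Python counterpart of LC_ALL=C: a name like Sigur Rós gets flagged because of its multi-byte UTF-8 sequence, whatever the locale says.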

The bigger project here is the music library grooming. I had this all locked down in 2008 or so, when almost all my music was ripped by a service. But since then I’ve acquired a bunch of music of uncertain provenance with all sorts of random tags. It was when I discovered I had a genre named “drownstep” that I realized it was time to clean things up.
