AtomicParsley

Entering UTF-8 tags under cygwin (regular rxvt or bash):

Note: this document is deprecated. Since AP 0.8.8, full Unicode support for text & filenames exists in the win32 builds.


On Mac OS X & Linux, Unicode text tags (encoded in UTF-8) aren't a problem. On Mac OS X, it's just a matter of pasting and hitting return. Under cygwin on Windows it's a little harder (my unfamiliarity with the platform also doesn't help). However, there is an ugly, ugly, ugly, heinously ugly workaround: you enter the octal codes of the UTF-8 bytes directly.

Using the printf command, you can get the shell in cygwin to feed the proper codes into the AtomicParsley command line arguments to set a valid tag. Say you want the ℗ symbol followed by a space, then the © symbol, and then the year in the copyright tag. If you can get the UTF-8 byte sequence for ℗ in hex (it's E2 84 97), then the octal value is 342 204 227; likewise © is C2 A9 in hex, or 302 251 in octal. Then you can set the tag without ever having pasted in a Unicode character:

AtomicParsley.exe sample.m4a --copyright "`printf '\342\204\227 \302\251 2006'`"
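The hex-to-octal step for a command like this one doesn't have to be done by hand. Here is a minimal sketch, assuming a bash-like shell whose printf accepts 0x-prefixed numbers (bash under cygwin does); hex_to_octal is a made-up helper name for illustration, not something that ships with AtomicParsley:

```shell
# hex_to_octal: hypothetical helper, not part of AtomicParsley.
# Takes space-separated hex bytes and prints them as printf-style octal escapes.
hex_to_octal() {
    # $1 is deliberately unquoted so the shell splits it into one word per byte
    for byte in $1; do
        # The 0x prefix makes printf read the value as hex; %03o prints 3-digit octal
        printf '\\%03o' "0x$byte"
    done
    printf '\n'
}

hex_to_octal 'E2 84 97'   # prints \342\204\227
```

The printed escapes can then be pasted between the single quotes of the printf in the AtomicParsley command.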

You will note that in the AtomicParsley command the format is printf 'somecodes', which is enclosed in backticks ` ` (which tell the shell to run this sub-command first and substitute its output), and the whole thing is further enclosed in double quotes: "printf_cmd". The enclosing "" are needed because without them the shell splits the sub-command's output at the spaces, and in this particular case what AtomicParsley.exe would get is the equivalent of:

AtomicParsley.exe sample.m4a --copyright ℗ © 2006

and neither © nor 2006 is an argument that AtomicParsley understands, but:

AtomicParsley.exe sample.m4a --copyright "℗ © 2006"

with the enclosing quotes isn't a problem. And so neither is this command to set tags in multiple languages:

AtomicParsley.exe sample.m4a \
  --copyright "`printf '\342\204\227 \302\251 2006 Bugger Records'`" \
  --artist "`printf '\346\265\243\350\205\270 (Barium type)'`" \
  --title "`printf '\327\236\327\231\327\223\327\242 \327\236\327\247\327\231\327\243 (\327\221\327\242\327\231\327\247\327\250 \327\250\327\244\327\225\327\220'`" \
  --genre "`printf '\305\201\341\271\273\306\214\310\241\310\211\305\247\307\235\303\237 \306\257\311\262\311\250\341\271\253\311\233'`" \
  --comment "`printf '\320\212\321\224\321\200-\320\205\321\257\321\271\321\202'`" \
  --description "`printf '\330\263\331\210\330\261\330\251 \330\247\331\204\331\201\330\247\330\252\330\255\330\251 \330\250\330\263\331\205 \330\247\331\204\331\204\331\207 \330\247\331\204\330\261\330\255\331\205\331\206 \330\247\331\204\330\261\330\255\331\212'`" \
  --lyrics "`printf '\317\225\317\264\317\200\317\236\316\250\341\277\274'`"
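The effect of the enclosing double quotes described earlier can be checked without touching a file, by counting how many arguments a command actually receives. This is only a sketch, assuming bash with its default word splitting; count_args is a made-up helper for illustration:

```shell
# count_args: hypothetical helper that prints how many arguments it was given.
count_args() {
    echo "$#"
}

# Unquoted: the shell splits printf's output at the spaces into three arguments.
count_args `printf '\342\204\227 \302\251 2006'`     # prints 3

# Quoted: the whole string arrives as a single argument.
count_args "`printf '\342\204\227 \302\251 2006'`"   # prints 1
```

AtomicParsley needs the second form, since each tag option expects its text as exactly one argument.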

Note: AtomicParsley can get the text data out of the tags, but the shell window (either bash or rxvt) will still prevent you from seeing it. If you redirect the output to a file:

AtomicParsley sample-temp-123.m4a -t > unicode_out.txt

and open the text file with a UTF-8 capable text editor, you should see the proper tags. iTunes will also be able to pick up the UTF-8 tags you set this way. The cygwin AtomicParsley on the main download page isn't UTF-8 enabled: it will be missing the byte order mark (BOM) needed for output. The version in the "experimental" folder of the cygwin download is UTF-8 enabled and will write the BOM.
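If the text editor shows garbage, one thing worth checking is whether the redirected file really starts with a BOM. The sketch below assumes the BOM in question is the UTF-8 one (the bytes EF BB BF) and that head, od and tr are available, as they are in a normal cygwin install; has_utf8_bom is a made-up helper name:

```shell
# has_utf8_bom: hypothetical helper; succeeds if the file starts with EF BB BF.
has_utf8_bom() {
    # An unreadable or missing file counts as "no BOM"
    [ -r "$1" ] || return 1
    # Dump the first three bytes as hex and strip spaces and newlines
    first=$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$first" = "efbbbf" ]
}

if has_utf8_bom unicode_out.txt; then
    echo "BOM present"
else
    echo "no BOM"
fi
```

A "no BOM" result here would be consistent with using the build from the main download page rather than the experimental one.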

... I did say it was ugly - but it works. The above command sets tags in Hebrew, Arabic, Cyrillic, Japanese, Greek & Latin diacriticals. I set that on Win98SE (which I think is UTF-0 enabled) using regular rxvt.
