AtomicParsley

Atoms, Boxes, Parents, Children & hex (oh my)

An MPEG-4 file is made of a number of discrete units called atoms (well, they were called atoms in the first version of the specification, now they are prosaically called 'boxes'). An atom has a format:

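[Figure atoms2a: hex view of the ftyp, moov and mvhd atoms, each beginning with a 4-byte length and a 4-byte name]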

Anything beyond those basic 8 bytes is either optional & defined by the hierarchy it is found in (moov.udta.meta.XXXX atoms have a format defined by QuickTime), or defined by the atom itself. The ftyp atom is ALWAYS first, and has its own specific format - it tells what type of file it is & the basic versioning of the atom structures.
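As a minimal sketch of reading those basic 8 bytes: the function name `read_atom_header` is made up for illustration, and the code assumes the length is stored big-endian (which is how 0x00001D38 reads as 7480). Names are decoded as Latin-1 so that the © byte used by some iTunes atoms survives.

```python
import struct

def read_atom_header(buf, offset=0):
    """Return (length, name) for the atom that starts at `offset` in `buf`."""
    # 4-byte big-endian length followed by a 4-character name.
    length, name = struct.unpack_from(">I4s", buf, offset)
    return length, name.decode("latin-1")

# Hypothetical header bytes: length 0x00001D38, name 'moov'.
header = bytes.fromhex("00001D38") + b"moov"
print(read_atom_header(header))  # → (7480, 'moov')
```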

In the above example the moov atom has a length of 0x00001D38, or 7480 bytes. Immediately following the moov name, however, is a new atom: the mvhd atom, whose length is 0x0000006C, or 108 bytes. Because mvhd starts inside moov's 7480 bytes and its 108 bytes fit within them, the mvhd atom is a child atom of the moov atom. The MPEG-4 specification says that an atom can either be a parent atom (as moov is a parent to mvhd) or carry some sort of information (as ftyp & mvhd show above), but not both.
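The parent/child relationship can be sketched by walking the atoms that sit inside a parent's byte range. This is an illustration only: `iter_atoms` is a hypothetical helper, and the sample buffer is synthetic (a 7480-byte moov holding a 108-byte mvhd, padded out with a free atom), not the real file from the example.

```python
import struct

def iter_atoms(buf, start, end):
    """Yield (name, offset, length) for atoms laid end to end in buf[start:end]."""
    offset = start
    while offset + 8 <= end:
        length, name = struct.unpack_from(">I4s", buf, offset)
        if length < 8 or offset + length > end:
            break  # malformed, or a special length this sketch does not handle
        yield name.decode("latin-1"), offset, length
        offset += length

# Synthetic moov: 8-byte header + mvhd (108 bytes) + free (7364 bytes) = 7480.
mvhd = struct.pack(">I4s", 108, b"mvhd") + bytes(100)
free = struct.pack(">I4s", 7364, b"free") + bytes(7356)
moov = struct.pack(">I4s", 7480, b"moov") + mvhd + free

# Children start right after moov's own 8-byte header.
children = list(iter_atoms(moov, 8, len(moov)))
print(children)  # → [('mvhd', 8, 108), ('free', 116, 7364)]
```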

The length of an atom is its own length PLUS the lengths of any and all atoms in the level immediately below it. It does not separately count atoms further down the hierarchy, because each child's length already includes its own children. For example, the moov atom sums the length of the mvhd atom and the other atoms on the same level (not shown), but not anything nested below those atoms - each of them sums its own children. The atoms in the level below sum the lengths of the atoms below them, and so on until you get to the end of a hierarchy. At that point the length of that atom is:

4 bytes for the atom length
4 bytes for the atom name
??? bytes that are optional for any data it might hold

The minimum length of an atom then would be 8 bytes.
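The length rule can be written as a short recursion. This is a sketch, not AtomicParsley's code: the `(name, payload, children)` tuple is a made-up representation, and the sample tree reuses the 108 and 7480 byte figures with a hypothetical free atom as filler.

```python
def atom_length(atom):
    """Length of an atom given as (name, payload_bytes, list_of_child_atoms)."""
    name, payload, children = atom
    # 4 bytes of length + 4 bytes of name + any data + every direct child.
    # Each child's length already accounts for its own children.
    return 8 + len(payload) + sum(atom_length(child) for child in children)

empty = ("free", b"", [])
mvhd = ("mvhd", bytes(100), [])
filler = ("free", bytes(7356), [])
moov = ("moov", b"", [mvhd, filler])

print(atom_length(empty), atom_length(mvhd), atom_length(moov))  # → 8 108 7480
```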

The 'Atom Is A Parent Or Holds Data' rule is made to be broken. Often the atom under moov.trak.mdia.minf.stbl.stsd is a parent and also contains data. Apple's DRM implementation breaks this rule further. The other standard atom that breaks this rule is moov.udta.meta, for historical reasons. Still, the MPEG-4 container is relatively easy to understand & highly flexible.

The most important part of an MPEG-4 file is the mdat atom - it's where the actual raw information for the file is stored. This top level atom takes up the bulk of an MPEG-4 file. The moov atom, however, comprises a number of different atoms and hierarchies, and provides for basic functionality - like specifying the dimensions of a video file, or the duration of a song.

uuid atoms are user-defined atoms, and are similar to normal atoms, but their name is 8 bytes (4 bytes holding 'uuid', then 4 bytes for the name of the uuid atom). Sony PSP mp4 files notably use uuid atoms. AtomicParsley supports setting & reading its own uuid atoms to carry supplemental metadata.

 

stco & mdat

When atoms are added, modified or removed, the tree changes, and the lengths of the affected atoms need to be re-determined. If the mdat atom moves relative to the beginning of the file, further adjustments need to be made. The free atom is meant to minimize this exact behavior.

The mdat data is made up of 'chunks' - these chunks are referenced in moov to provide for seeking within the file, and to tell the player where the beginning of the media data is. This information is stored on the moov.trak.mdia.minf.stbl.stco Sample Table Chunk Offset atom. This atom has a particular structure:

[Figure stco: layout of the stco atom]

Each entry in the stco atom (and there can be multiple stco atoms) needs to be readjusted when mdat moves.
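A rough sketch of that readjustment follows. The layout used here is an assumption, since the figure is not reproduced in the text: after the 8-byte header come 1 byte of version, 3 bytes of flags, a 4-byte entry count, then one 4-byte big-endian offset per chunk. `shift_stco` is a hypothetical helper name and the sample offsets are invented.

```python
import struct

def shift_stco(stco, delta):
    """Return a copy of a whole stco atom with every chunk offset moved by delta."""
    length, name = struct.unpack_from(">I4s", stco, 0)
    if name != b"stco":
        raise ValueError("not an stco atom")
    # Assumed layout: version/flags at byte 8, entry count at 12, offsets from 16.
    (count,) = struct.unpack_from(">I", stco, 12)
    offsets = struct.unpack_from(f">{count}I", stco, 16)
    shifted = struct.pack(f">{count}I", *(o + delta for o in offsets))
    return stco[:16] + shifted + stco[16 + 4 * count:]

# Invented example: three chunks, and mdat has moved 32 bytes further into the file.
body = struct.pack(">II3I", 0, 3, 48, 1048, 2048)
stco = struct.pack(">I4s", 8 + len(body), b"stco") + body
moved = shift_stco(stco, 32)
print(struct.unpack_from(">3I", moved, 16))  # → (80, 1080, 2080)
```

The atom's own length does not change, since only the values of the entries are rewritten.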

 

Known iTunes Metadata Atoms

Metadata to be used with iTunes comes in the moov.udta.meta.ilst hierarchy. The atoms directly under the ilst atom have specific names, but they do not carry the data directly. The children of these named atoms (the data atom) carry the actual information. The 4 letter code of the parent is listed below, while the atom flags after the data atom are listed in the Class column. It is the class of the data atom that broadly determines whether text or numbers or binary data is contained.

4char code   Name                      Class/Flag                 Appearance
©alb         Album                     1            text          iTunes 4.0
©art         Artist                    1            text          iTunes 4.0
aART         Album Artist              1            text          ??
©cmt         Comment                   1            text          iTunes 4.0
©day         Year                      1            text          iTunes 4.0
©nam         Title                     1            text          iTunes 4.0
©gen | gnre  Genre                     1 | 0 [1]    text | uint8  iTunes 4.0
trkn         Track number              0            uint8         iTunes 4.0
disk         Disk number               0            uint8         iTunes 4.0
©wrt         Composer                  1            text          iTunes 4.0
©too         Encoder                   1            text          iTunes 4.0
tmpo         BPM                       21           uint8         iTunes 4.0
cprt         Copyright                 1            text          ? iTunes 4.0
cpil         Compilation               21           uint8         iTunes 4.0
covr         Artwork                   13 | 14 [2]  jpeg | png    iTunes 4.0
rtng         Rating/Advisory           21           uint8         iTunes 4.0
©grp         Grouping                  1            text          iTunes 4.2
stik         ?? (stik)                 21           uint8         ??
pcst         Podcast                   21           uint8         iTunes 4.9
catg         Category                  1            text          iTunes 4.9
keyw         Keyword                   1            text          iTunes 4.9
purl         Podcast URL               21 | 0 [4]   uint8         iTunes 4.9
egid         Episode Global Unique ID  21 | 0 [4]   uint8         iTunes 4.9
desc         Description               1            text          iTunes 5.0
©lyr         Lyrics                    1 [3]        text          iTunes 5.0
tvnn         TV Network Name           1            text          iTunes 6.0
tvsh         TV Show Name              1            text          iTunes 6.0
tven         TV Episode Number         1            text          iTunes 6.0
tvsn         TV Season                 21           uint8         iTunes 6.0
tves         TV Episode                21           uint8         iTunes 6.0
purd         Purchase Date             1            text          iTunes 6.0.2
pgap         Gapless Playback          21           uint8         iTunes 7.0

[1] Genre comes on 2 atoms - standard genres are on gnre; custom genres are on ©gen; only 1 is permitted at a time.
[2] Coverart is the only atom that permits more than 1 data child atom. If there is a limit, it's greater than 16.
[3] Lyrics is the only text atom that doesn't fall under a 255-byte limit.
[4] Apple changed from the original 21 to the current 0 around the release of iTunes 6.0.3.

(there are also iTMS atoms of akID, sfID, geID, plID, atID, cnID & apID; some metadata like Soundcheck information is carried on ---- atoms)

Text metadata has a limit of 255 bytes. It comes in UTF-8 (no BOM), and isn't null terminated.
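A minimal sketch of building such a text tag is below. The data atom layout is an assumption, not something spelled out here: 1 byte of version plus 3 bytes holding the class (1 for text), 4 NULL bytes, then the raw UTF-8. `text_tag` is a hypothetical name, and the limit is checked on encoded bytes rather than characters.

```python
import struct

def text_tag(code, text):
    """Build a named atom (e.g. b'\\xa9nam') holding one data atom of class 1."""
    payload = text.encode("utf-8")  # no BOM, no trailing NUL
    if len(payload) > 255:
        raise ValueError("text metadata is limited to 255 bytes")
    # data atom: length, 'data', version/class (1 = text), 4 NULL bytes, text.
    data = struct.pack(">I4sII", 16 + len(payload), b"data", 1, 0) + payload
    return struct.pack(">I4s", 8 + len(data), code) + data

title = text_tag(b"\xa9nam", "Hello")
print(len(title), title[-5:])  # → 29 b'Hello'
```

This sketch applies the limit to every atom; lyrics would need to be exempted.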

Unsigned integer metadata is 8 bits wide (a limit of 255 for tracknum, for example). Most have a format specific to that atom (cpil is 4 NULL bytes, then the value). Only numerical data can be carried for most of these (except purl & egid). Vinyl taggers of "A1": complain to Apple.

Here is a sample of metadata - compilation (true) & tracknumber (2 of 5):

[Figure Atomings: hex view of the sample compilation and tracknumber metadata atoms]

And for those thinking "Heavens to Murgatroyd, how did cpil's 21 become 15 in the pic above... gosh, golly" - hex: 21 in decimal is 0x15.
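To see where that 15 turns up, here is a hedged sketch of a cpil atom built by hand. It assumes the data atom is laid out as length, 'data', 1 byte of version plus 3 bytes of class, the 4 NULL bytes, then the single value byte; `flag_tag` is a made-up helper, and the bytes are not guaranteed to match the picture exactly.

```python
import struct

def flag_tag(code, value, cls=21):
    """Build a named atom holding one data atom with a single uint8 value."""
    # data atom: length (17), 'data', version/class, 4 NULL bytes, the value.
    data = struct.pack(">I4sIIB", 17, b"data", cls, 0, value)
    return struct.pack(">I4s", 8 + len(data), code) + data

cpil = flag_tag(b"cpil", 1)  # compilation: true
print(hex(21))               # → 0x15
print(cpil[16:20].hex())     # → 00000015
print(cpil.hex())
```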

There is also another form of tagging that iTunes uses internally for a few inaccessible tags. Called the reverse DNS style (or something along that line), this form is shown below:

 

Atom ---- @ 39852 of size: 72, ends @ 39924
    Atom mean @ 39860 of size: 28, ends @ 39888
    Atom name @ 39888 of size: 16, ends @ 39904
    Atom data @ 39904 of size: 20, ends @ 39924

where the mean atom carries the reverse DNS domain (com.apple.iTunes) & the name atom carries the descriptor for the contents of the data atom.
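The sizes in that listing can be reproduced with a small sketch. Several things here are assumptions: that mean and name each carry 4 bytes of version/flags before their text, that the data atom carries 4 bytes of version/class plus 4 NULL bytes, that the descriptor is the 4-character name tool, and that the payload is 4 placeholder bytes. The helper names are hypothetical.

```python
import struct

def atom(name, body):
    """Prefix `body` with the 8-byte length + name header."""
    return struct.pack(">I4s", 8 + len(body), name) + body

def reverse_dns_tag(domain, descriptor, payload, cls=0):
    mean = atom(b"mean", bytes(4) + domain.encode("utf-8"))
    name = atom(b"name", bytes(4) + descriptor.encode("utf-8"))
    data = atom(b"data", struct.pack(">II", cls, 0) + payload)
    return atom(b"----", mean + name + data)

# Placeholder payload; the real contents of the data atom are not known here.
tag = reverse_dns_tag("com.apple.iTunes", "tool", bytes(4))
sizes = [struct.unpack_from(">I", tag, off)[0] for off in (0, 8, 36, 52)]
print(sizes)  # → [72, 28, 16, 20]
```

The offsets 8, 36 and 52 mirror the listing: mean starts 8 bytes into the ---- atom, name 28 bytes later, and data 16 bytes after that.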

Known names/descriptors:

tool

iTunNORM

iTunSMPB

iTunes_CDDB_IDs

iTunes_CDDB_TrackNumber

 

Tagging implementations

The only style of metadata defined in the ISO Base Media File Format amounts to a single atom, cprt - and the format described is in the 3gp asset style. In fact, the ISO copyright notice is identical to the 3gp copyright asset. This copyright notice is the only common tag available to all MPEG-4 files and derivatives.

The major brands that iTunes writes are listed at http://www.mp4ra.org/filetype.html, but iTunes-style metadata isn't defined in any publicly available document - its format is determined by the types of files that iTunes & the iTunes Music Store produce & provide. Since the goal of AtomicParsley is to set metadata that is maximally compatible with iTunes, the iTunes-style format of metadata is fully supported.

The 3GPP assets are a family of metadata tags that the 3gp specification allows. These atoms differ in a number of ways from the more common iTunes style. There is no data atom; information is carried directly on the atom. Most 3gp assets have a language setting - so dozens of like-named atoms are permitted that differ only in the language used (around 480 languages).

A new style of metadata emerged with the foobar2000 0.9.x series. For whatever reason, this style typically duplicates the iTunes-style metadata. There is a double artwork tag, the artist is listed twice - it is heavy with redundancy. It is also non-compliant. A generic tool isn't allowed to create its own atoms - a mechanism exists to extend for supplemental functionality: the uuid atom form. foobar2000 doesn't use this mechanism. Nero has also adopted this tagging style with their freeware tagging tools. It seems to also write some tags in the reverse DNS form in the com.apple.iTunes domain.

The newest style of tagging was recently added at the MPEG4 Registration Authority. Currently, there is no known tool that can read or set this style of metadata.
