Parsing Blu-ray MPLS Playlist Files


I have a small Python function, pyguymer3.media.return_dict_of_media_audio_streams(), that parses the JSON output of ffprobe -print_format json -show_streams /path/to/file to build a Python dictionary describing every audio stream in a media file. However, ffprobe v3.4 doesn’t return the language of any audio stream in a Blu-ray playlist. For example, running ffprobe -probesize 3G -analyzeduration 1800M -playlist 820 bluray:/path/to/br on one of my Blu-rays yields:

ffprobe version 3.4 Copyright (c) 2007-2017 the FFmpeg developers
  built with FreeBSD clang version 4.0.0 (tags/RELEASE_400/final 297347) (based on LLVM 4.0.0)
  configuration: --prefix=/usr/local --mandir=/usr/local/man --datadir=/usr/local/share/ffmpeg --pkgconfigdir=/usr/local/libdata/pkgconfig --enable-shared --enable-pic --enable-gpl --enable-postproc --enable-avfilter --enable-avresample --enable-pthreads --cc=cc --disable-alsa --disable-libopencore-amrnb --disable-libopencore-amrwb --enable-libass --disable-libbs2b --disable-libcaca --disable-libcdio --disable-libcelt --disable-chromaprint --disable-libdc1394 --disable-debug --disable-htmlpages --disable-libdrm --enable-libfdk-aac --disable-ffserver --disable-libflite --enable-fontconfig --enable-libfreetype --enable-frei0r --disable-libfribidi --disable-libgme --disable-libgsm --enable-iconv --disable-libilbc --disable-jack --disable-libkvazaar --disable-ladspa --enable-libmp3lame --enable-libbluray --disable-librsvg --disable-libxml2 --enable-mmx --disable-libmodplug --disable-openal --disable-opencl --enable-libopencv --disable-opengl --disable-libopenh264 --disable-libopenjpeg --enable-optimizations --disable-libopus --disable-libpulse --enable-runtime-cpudetect --disable-librubberband --disable-sdl2 --disable-libsmbclient --disable-libsnappy --disable-sndio --disable-libsoxr --disable-libspeex --enable-sse --disable-libssh --disable-libtesseract --enable-libtheora --disable-libtwolame --enable-libv4l2 --enable-vaapi --enable-vdpau --disable-libvidstab --enable-libvorbis --disable-libvo-amrwbenc --enable-libvpx --disable-libwavpack --disable-libwebp --enable-libx264 --enable-libx265 --disable-libxcb --enable-libxvid --disable-outdev=xv --disable-libzimg --disable-libzmq --disable-libzvbi --disable-gcrypt --enable-gmp --disable-librtmp --enable-gnutls --disable-openssl --enable-version3 --enable-nonfree --disable-libmysofa
  libavutil      55. 78.100 / 55. 78.100
  libavcodec     57.107.100 / 57.107.100
  libavformat    57. 83.100 / 57. 83.100
  libavdevice    57. 10.100 / 57. 10.100
  libavfilter     6.107.100 /  6.107.100
  libavresample   3.  7.  0 /  3.  7.  0
  libswscale      4.  8.100 /  4.  8.100
  libswresample   2.  9.100 /  2.  9.100
  libpostproc    54.  7.100 / 54.  7.100
[bluray @ 0x80d07e000] 13 usable playlists:
Input #0, mpegts, from 'bluray:/path/to/br':
  Duration: 01:01:38.57, start: 11.650667, bitrate: 33068 kb/s
  Program 1
    Stream #0:0[0x1011]: Video: h264 (High) (HDMV / 0x564D4448), yuv420p(progressive), 1920x1080 [SAR 1:1 DAR 16:9], 23.98 fps, 23.98 tbr, 90k tbn, 47.95 tbc
    Stream #0:1[0x1100]: Audio: ac3 (AC-3 / 0x332D4341), 48000 Hz, 5.1(side), fltp, 640 kb/s
    Stream #0:2[0x1101]: Audio: truehd (AC-3 / 0x332D4341), 48000 Hz, 7.1, s32 (24 bit)
    Stream #0:3[0x1101]: Audio: ac3 (AC-3 / 0x332D4341), 48000 Hz, 5.1(side), fltp, 640 kb/s
    Stream #0:4[0x1102]: Audio: ac3 (AC-3 / 0x332D4341), 48000 Hz, 5.1(side), fltp, 448 kb/s
    Stream #0:5[0x1103]: Audio: ac3 (AC-3 / 0x332D4341), 48000 Hz, stereo, fltp, 256 kb/s
    Stream #0:6[0x1104]: Audio: ac3 (AC-3 / 0x332D4341), 48000 Hz, 5.1(side), fltp, 448 kb/s
    Stream #0:7[0x1105]: Audio: ac3 (AC-3 / 0x332D4341), 48000 Hz, 5.1(side), fltp, 448 kb/s
    Stream #0:8[0x1106]: Audio: ac3 (AC-3 / 0x332D4341), 48000 Hz, 5.1(side), fltp, 448 kb/s
    Stream #0:9[0x1107]: Audio: ac3 (AC-3 / 0x332D4341), 48000 Hz, 5.1(side), fltp, 448 kb/s
    Stream #0:10[0x1200]: Subtitle: hdmv_pgs_subtitle ([144][0][0][0] / 0x0090), 1920x1080
    Stream #0:11[0x1201]: Subtitle: hdmv_pgs_subtitle ([144][0][0][0] / 0x0090), 1920x1080
    Stream #0:12[0x1202]: Subtitle: hdmv_pgs_subtitle ([144][0][0][0] / 0x0090), 1920x1080
    Stream #0:13[0x1203]: Subtitle: hdmv_pgs_subtitle ([144][0][0][0] / 0x0090), 1920x1080
    Stream #0:14[0x1204]: Subtitle: hdmv_pgs_subtitle ([144][0][0][0] / 0x0090), 1920x1080
    Stream #0:15[0x1205]: Subtitle: hdmv_pgs_subtitle ([144][0][0][0] / 0x0090), 1920x1080
    Stream #0:16[0x1206]: Subtitle: hdmv_pgs_subtitle ([144][0][0][0] / 0x0090), 1920x1080
    Stream #0:17[0x1207]: Subtitle: hdmv_pgs_subtitle ([144][0][0][0] / 0x0090), 1920x1080
    Stream #0:18[0x1208]: Subtitle: hdmv_pgs_subtitle ([144][0][0][0] / 0x0090), 1920x1080
    Stream #0:19[0x1209]: Subtitle: hdmv_pgs_subtitle ([144][0][0][0] / 0x0090), 1920x1080
    Stream #0:20[0x120a]: Subtitle: hdmv_pgs_subtitle ([144][0][0][0] / 0x0090), 1920x1080
    Stream #0:21[0x120b]: Subtitle: hdmv_pgs_subtitle ([144][0][0][0] / 0x0090), 1920x1080
    Stream #0:22[0x1a00]: Audio: eac3 ([161][0][0][0] / 0x00A1), 48000 Hz, stereo, fltp, 192 kb/s
    Stream #0:23[0x1b00]: Video: h264 (High) (HDMV / 0x564D4448), yuv420p(progressive), 720x480 [SAR 40:33 DAR 20:11], 23.98 fps, 23.98 tbr, 90k tbn, 47.95 tbc

              
You may also download “blu-ray_ffprobe.out” directly or view “blu-ray_ffprobe.out” on GitHub Gist (you may need to manually check out the “main” branch).

Note that nothing follows the square-bracketed stream ID (e.g. [0x1100]) in each stream definition: that is where ffprobe would normally print the language code in parentheses, e.g. (eng). I decided to investigate how hard it would be to grab the language codes myself and add them to the dictionary in my Python function. Rather than file feature requests with both ffmpeg and libbluray, I judged that it would be much quicker to write a binary reader for the MPLS file and merge its data with the data provided by ffprobe myself.
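To make the gap concrete, here is a minimal sketch of the kind of JSON processing involved. It is not the actual pyguymer3 code: the function name audio_streams_from_ffprobe_json and the sample JSON are made up for illustration, and only the field names (index, id, codec_type, codec_name, tags) are ones that ffprobe itself emits. When ffprobe knows a stream's language it appears under tags; for this Blu-ray it is simply absent.

```python
import json


def audio_streams_from_ffprobe_json(text):
    """Return {stream index: info} for every audio stream in ffprobe's JSON output."""
    streams = {}
    for stream in json.loads(text).get("streams", []):
        if stream.get("codec_type") != "audio":
            continue
        streams[stream["index"]] = {
            "id": stream.get("id"),            # stream ID as a hex string, e.g. "0x1100"
            "codec": stream.get("codec_name"),
            # The language lives under "tags" when ffprobe knows it; fall back to "?".
            "language": stream.get("tags", {}).get("language", "?"),
        }
    return streams


# Hand-written sample (NOT real ffprobe output): one stream with a language, one without.
sample = """{"streams": [
    {"index": 0, "id": "0x1011", "codec_type": "video", "codec_name": "h264"},
    {"index": 1, "id": "0x1100", "codec_type": "audio", "codec_name": "ac3"},
    {"index": 2, "id": "0x1101", "codec_type": "audio", "codec_name": "truehd",
     "tags": {"language": "eng"}}
]}"""

result = audio_streams_from_ffprobe_json(sample)
print(result)
```

Every audio stream from the Blu-ray playlist above would end up like stream 1 in this sample, with no language to choose by.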

My first step was to open the binary MPLS file “00820.mpls” in a text editor and, behold, there were the language codes surrounded by a bunch of binary gibberish. I clearly just needed an MPLS parser, but searching online did not turn one up. A lot more searching did eventually produce some unofficial documentation of this small binary file format. The two best resources that I found were:

It turns out that the MPLS file is big-endian, so I used the Python function struct.unpack to convert each entry to the appropriate type and size. Following the documentation to make a parser for all of the data structures within the MPLS file wasn’t hard at all; it was more of a bore than anything else. As an aside, the largest MPLS file that I have ever seen is 78 KiB and the smallest standard Blu-ray disc holds 25 GB, so I have no idea why the playlist information isn’t stored as a JSON or XML file (there will always be some space left on the disc for a slightly larger text format).
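As a rough illustration of the struct.unpack approach, the sketch below decodes the first 20 bytes of an MPLS file. It assumes the header layout given in the unofficial documentation as I understand it (a 4-byte "MPLS" marker, a 4-byte version string and three unsigned 32-bit offsets); the function name parse_mpls_header and the dictionary keys are my own labels, and the test bytes are hand-made rather than read from a real disc.

```python
import struct


def parse_mpls_header(data):
    """Decode the (assumed) 20-byte header at the start of an MPLS file."""
    # ">" selects big-endian; "4s" is a 4-byte string, "I" an unsigned 32-bit integer.
    magic, version, playlist_start, mark_start, ext_start = struct.unpack(
        ">4s4sIII", data[:20]
    )
    if magic != b"MPLS":
        raise ValueError("not an MPLS file")
    return {
        "version": version.decode("ascii"),
        "playlist_start": playlist_start,   # offset of the playlist section
        "mark_start": mark_start,           # offset of the playlist marks
        "extension_start": ext_start,       # offset of any extension data
    }


# Hand-made header for illustration only, not taken from a real disc.
fake = b"MPLS" + b"0200" + struct.pack(">III", 58, 1024, 0)
header = parse_mpls_header(fake)
print(header)

# The same four bytes read with the wrong byte order give a very different number.
big = struct.unpack(">I", b"\x00\x00\x00\x3a")[0]     # 58
little = struct.unpack("<I", b"\x00\x00\x00\x3a")[0]  # 973078528
print(big, little)
```

Getting the byte order wrong does not raise an error, it just yields nonsense offsets, which is why the ">" prefix matters in every format string.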

My function pyguymer3.media.return_dict_of_media_audio_streams() now calls my new sub-module, pyguymer3.media.MPLS, and adds the language code to each stream when a Blu-ray is passed. This means that I can now choose which stream to use based on its language.
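The merge step might look something like the sketch below. This is an assumption about the general shape, not the real pyguymer3 implementation: add_languages is a hypothetical helper, the stream dictionaries are cut down to just an id, and the PID-to-language table is invented (the languages are not those of the disc above). It matches on the stream ID because that is the value ffprobe prints in square brackets.

```python
def add_languages(streams, pid_to_language):
    """Attach a language code to each stream dict, matching on the stream's PID."""
    for info in streams.values():
        pid = int(info["id"], 16)   # ffprobe gives the ID as a hex string, e.g. "0x1100"
        info["language"] = pid_to_language.get(pid, "?")
    return streams


# Cut-down stream dictionaries keyed by stream index (illustrative values only).
streams = {
    1: {"id": "0x1100"},
    2: {"id": "0x1101"},
    3: {"id": "0x1101"},   # two streams can share one PID, as in the output above
    4: {"id": "0x1102"},
}

# Invented table of the kind an MPLS parser could produce: PID -> language code.
pid_to_language = {0x1100: "eng", 0x1101: "fra"}

add_languages(streams, pid_to_language)

# Choosing streams by language is then a simple filter.
english = [index for index, info in streams.items() if info["language"] == "eng"]
print(english)
```

Keying on the PID rather than the stream index means that two streams sharing one PID, like #0:2 and #0:3 in the ffprobe output, receive the same language.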