Parsing Blu-ray MPLS Playlist Files
I have a nice little Python function, pyguymer3.media.return_dict_of_media_audio_streams(), that uses the JSON output format from ffprobe -print_format json -show_streams /path/to/file to create a Python dictionary of information about all of the audio streams within a media file. However, ffprobe v3.4 doesn’t return the language information for any of the audio streams in a Blu-ray playlist. For example, running ffprobe -probesize 3G -analyzeduration 1800M -playlist 820 bluray:/path/to/br on one of my Blu-rays yields:
checkout
the “main” branch). Note the square brackets just after each stream definition; these are where the language code should be. I decided to do a little bit of investigating to see how hard it would be for me to grab the language codes myself and add them to the dictionary in my Python function. I decided that, instead of writing feature requests for both ffmpeg and libbluray, it would be much quicker to code up a binary reader for the MPLS file and add its data to the data provided by ffprobe myself.
My first action was to open up the binary MPLS file “00820.mpls” in a text editor - and behold, there were the language codes surrounded by a bunch of binary gibberish. I clearly just needed an MPLS parser; however, searching online did not turn one up. A lot of searching did manage to produce some unofficial documentation on the file format for this small binary MPLS file. The two best resources that I found were:
It turns out that the MPLS file is big-endian, so I used the Python function struct.unpack to convert each entry to the appropriate type. Following the documentation to make a parser for all of the data structures within the MPLS file wasn’t hard at all; it was more of a bore than anything else. As an aside, the largest MPLS file that I have ever seen is 78 KiB and the smallest Blu-ray disc that can possibly exist is 25 GB, so I have no idea why the playlist information isn’t stored as a JSON or XML file (there will always be some space left on the disc for a slightly larger text format).
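To give a flavour of what this looks like, here is a minimal sketch of reading the start of an MPLS file with struct.unpack. It is not the code in pyguymer3: the function names are made up for illustration, and the header layout (a 4-byte “MPLS” tag, a 4-byte version string and three unsigned 32-bit offsets) and the 3-byte ASCII language codes are assumptions taken from my reading of the unofficial documentation.

```python
import struct


def parse_mpls_header(data):
    """Parse the first 20 bytes of an MPLS file (layout assumed, see above)."""
    # ">" means big-endian with no padding; "4s" is a 4-byte string and
    # "I" is an unsigned 32-bit integer.
    type_indicator, version, playlist_start, mark_start, ext_start = struct.unpack(
        ">4s4sIII", data[:20]
    )
    if type_indicator != b"MPLS":
        raise ValueError("this does not look like an MPLS file")
    return {
        "TypeIndicator": type_indicator.decode("ascii"),
        "VersionNumber": version.decode("ascii"),
        "PlayListStartAddress": playlist_start,
        "PlayListMarkStartAddress": mark_start,
        "ExtensionDataStartAddress": ext_start,
    }


def read_language_code(data, offset):
    """Read a 3-character language code (e.g. "eng") at a known byte offset."""
    # The offset has to come from walking the stream entries in the file;
    # here it is simply passed in by the caller.
    (code,) = struct.unpack(">3s", data[offset:offset + 3])
    return code.decode("ascii")
```

In practice the bytes would come from something like open("00820.mpls", "rb").read(), and the rest of the parser is just more of the same: one struct.unpack call per field, following the offsets in the header.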
My function pyguymer3.media.return_dict_of_media_audio_streams() now calls my new sub-module, pyguymer3.media.MPLS, and adds the language code to each stream when a Blu-ray is passed. This means that I can now choose which stream to use based on its language.
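The merging step can be sketched roughly as follows. Again, this is an illustration rather than the actual pyguymer3 code: the helper names are hypothetical, and it assumes that each stream in the ffprobe JSON carries an "id" field holding the stream PID as a hex string (such as "0x1100") and that the MPLS parser has already produced a mapping from PID to language code.

```python
import json


def add_languages(ffprobe_json, languages_by_pid):
    """Return the audio streams from ffprobe's JSON, keyed by stream index,
    with a "language" entry added from the MPLS data."""
    audio = {}
    for stream in json.loads(ffprobe_json).get("streams", []):
        if stream.get("codec_type") != "audio":
            continue
        # Assumption: "id" is the PID as a hex string, e.g. "0x1100".
        pid = int(stream["id"], 16)
        # "und" (undetermined) is used when the playlist has no entry.
        stream["language"] = languages_by_pid.get(pid, "und")
        audio[stream["index"]] = stream
    return audio


def pick_stream(audio, language):
    """Return the index of the first audio stream in the given language."""
    for index in sorted(audio):
        if audio[index]["language"] == language:
            return index
    return None
```

With a dictionary like this, choosing the English track is just pick_stream(audio, "eng"), and the returned index can be handed to whatever does the actual transcoding.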