I have a nice little Python function,
return_dict_of_media_audio_streams, that uses the JSON output format from
ffprobe -print_format json -show_streams /path/to/file to create a Python dictionary of information about all of the audio streams within a media file. However,
ffprobe v3.4 doesn't return the language information for any of the audio streams in a Blu-ray playlist. For example, running
ffprobe -probesize 3G -analyzeduration 1800M -playlist 820 bluray:/path/to/br yields:
Note the square brackets just after each stream definition; these are where the language code should be. I did a little investigating to see how hard it would be to grab the language codes myself and add them to the dictionary in my Python function. I decided that, instead of filing feature requests with both ffmpeg and libbluray, it would be much quicker to code up a binary reader for the playlist file and add its data to the data provided by
My first action was to open the binary file "00820.mpls" in a text editor and, behold, there were the language codes surrounded by a bunch of binary gibberish. I clearly just needed an MPLS parser, but searching online did not turn one up. A lot more searching did produce some unofficial documentation on the format of this small binary file. The two best resources that I found were:
It turns out that the file is big-endian, so I used the Python function
struct.unpack to convert each entry to the appropriate Python type. Following the documentation to make a parser for all of the data structures within the file wasn't hard at all; it was more of a bore than anything else. As an aside, the largest MPLS file that I have ever seen is 78 KiB and the smallest Blu-ray disc that can possibly exist is 25 GB, so why the playlist information isn't stored as a JSON or XML file I have no idea (there will always be some space left on the disc for a slightly larger text format).
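To give a flavour of what reading big-endian fields with struct.unpack looks like, here is a minimal sketch that decodes only the fixed header at the start of an MPLS file. The layout (a 4-byte "MPLS" marker, a 4-byte version string, then three unsigned 32-bit offsets) is my reading of the unofficial documentation, and the function name parse_mpls_header and the dictionary keys are made up for this illustration; it is not the code in my sub-module.

```python
import struct

def parse_mpls_header(data):
    """Decode the first 20 bytes of an MPLS file (layout assumed from unofficial docs)."""
    # ">" selects big-endian byte order, "4s" is a 4-byte string and
    # "I" is an unsigned 32-bit integer, so the format covers 20 bytes.
    fmt = ">4s4sIII"
    size = struct.calcsize(fmt)
    if len(data) < size:
        raise ValueError("need at least %d bytes" % size)
    magic, version, playlist, marks, extension = struct.unpack(fmt, data[:size])
    if magic != b"MPLS":
        raise ValueError("this does not look like an MPLS file")
    return {
        "version": version.decode("ascii"),
        "playlist_start": playlist,    # byte offset of the playlist section
        "mark_start": marks,           # byte offset of the playlist marks
        "extension_start": extension,  # byte offset of any extension data
    }

# Hypothetical usage on a real disc:
# with open("/path/to/br/BDMV/PLAYLIST/00820.mpls", "rb") as fobj:
#     header = parse_mpls_header(fobj.read(20))
```

The rest of a parser is more of the same: seek to one of the offsets, read a length field, and unpack the next structure with a format string that matches the documentation.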
return_dict_of_media_audio_streams now calls my new sub-module,
pyguymer.MPLS, and adds the language code to each stream when a Blu-ray is passed. This means that I can now choose which stream to use based on its language.
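The merging step can be sketched roughly as follows. This is an illustration rather than the real function: add_languages and pick_stream are hypothetical names, the dictionary is keyed on the ffprobe stream index, and I am assuming that the language codes read from the playlist come back in the same order as the audio streams that ffprobe reports. The "streams", "index" and "codec_type" keys are the ones found in the JSON output of ffprobe -print_format json -show_streams.

```python
import json

def add_languages(ffprobe_json, languages):
    """Return a dict of audio streams, keyed by stream index, with a language added.

    "languages" is a list of language codes (e.g. from an MPLS parser), assumed
    to be in the same order as the audio streams in the ffprobe output.
    """
    audio = {}
    position = 0
    for stream in json.loads(ffprobe_json)["streams"]:
        if stream.get("codec_type") != "audio":
            continue
        info = dict(stream)
        # Fall back to "und" (undetermined) if the playlist lists fewer codes.
        info["language"] = languages[position] if position < len(languages) else "und"
        audio[stream["index"]] = info
        position += 1
    return audio

def pick_stream(audio, language):
    """Return the index of the first audio stream in the given language, or None."""
    for index in sorted(audio):
        if audio[index]["language"] == language:
            return index
    return None
```

With something like this in place, picking the English track is just a call to pick_stream(audio, "eng"), and the returned index can be handed to ffmpeg when mapping streams.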