The VMAF Python library offers a full set of functionalities: running basic VMAF command lines, software testing, training and validating new VMAF models on video datasets, data visualization tools, and more. It is the playground for experimenting with VMAF and other video quality metrics.
Make sure you have Python 3 (3.6 or higher). You can check the version with `python3 --version`.
On Debian Buster (10), or Debian-based systems such as Ubuntu 18.04 or higher:
```
sudo apt install nasm doxygen python3-dev
```
On Fedora 22 or higher, or CentOS 8 or higher:
```
sudo dnf install nasm doxygen python3-devel
```
On older CentOS and RHEL:
```
sudo yum install nasm doxygen python3-devel
```
Make sure `nasm` is 2.13.02 or higher (check with `nasm --version`).
First, install Homebrew.
If you don't already have a python3 installation on your Mac, run the following to install Python 3 via Homebrew:
```
brew install python3
```
Install the remaining dependencies:
```
brew install nasm doxygen llvm libomp
```
Note that `brew` requires no `sudo`.
Follow the steps below to set up a clean virtual environment:
```
python3 -m pip install virtualenv
python3 -m virtualenv .venv
source .venv/bin/activate
```
From this point forward, `python3` and `pip3` are relative to the virtualenv and isolated from the system Python. Returning to the project in subsequent shell sessions requires re-activating the virtualenv with `source .venv/bin/activate`.
Now install the tools required to build VMAF into the virtualenv.
```
pip3 install cython numpy meson ninja
```
Make sure `ninja` is 1.7.1 or higher (check with `ninja --version`).
Clean-build the binary:
```
make clean; make
```
Check that the build succeeded:
```
./libvmaf/build/tools/vmaf --version
```
Install the rest of the required Python packages:
```
pip3 install -r python/requirements.txt
```
On macOS it is important to use the LLVM from Homebrew, as the macOS clang does not include support for OpenMP, which is needed for libsvm-official:
```
CC=$HOMEBREW_PREFIX/opt/llvm/bin/clang CXX=$HOMEBREW_PREFIX/opt/llvm/bin/clang++ pip3 install -r python/requirements.txt
```
Run the unit tests and make sure they all pass:
```
./unittest
```
One can run VMAF on the command line via `run_vmaf`, which accepts input videos in .yuv format. To run VMAF on a single reference/distorted video pair, run:
```
PYTHONPATH=python ./python/vmaf/script/run_vmaf.py \
    format width height \
    reference_path \
    distorted_path \
    [--out-fmt output_format]
```
The arguments are the following:
- `format` can be one of:
  - `yuv420p`, `yuv422p`, `yuv444p` (8-bit YUV)
  - `yuv420p10le`, `yuv422p10le`, `yuv444p10le` (10-bit little-endian YUV)
  - `yuv420p12le`, `yuv422p12le`, `yuv444p12le` (12-bit little-endian YUV)
  - `yuv420p16le`, `yuv422p16le`, `yuv444p16le` (16-bit little-endian YUV)
- `width` and `height` are the width and height of the videos, in pixels
- `reference_path` and `distorted_path` are the paths to the reference and distorted video files
- `output_format` can be one of: `text`, `xml`, `json`
For example, the following command runs VMAF on a pair of .yuv inputs (src01_hrc00_576x324.yuv, src01_hrc01_576x324.yuv):
```
PYTHONPATH=python ./python/vmaf/script/run_vmaf.py \
    yuv420p 576 324 \
    src01_hrc00_576x324.yuv \
    src01_hrc01_576x324.yuv \
    --out-fmt json
```
This will generate JSON output like:
```
{
    ...
    "aggregate": {
        "VMAF_feature_adm2_score": 0.93458780776205741,
        "VMAF_feature_motion2_score": 3.8953518541666665,
        "VMAF_feature_vif_scale0_score": 0.36342081156994926,
        "VMAF_feature_vif_scale1_score": 0.76664738784617292,
        "VMAF_feature_vif_scale2_score": 0.86285338927816291,
        "VMAF_feature_vif_scale3_score": 0.91597186913930484,
        "VMAF_score": 76.699271371151269,
        "method": "mean"
    }
}
```
where `VMAF_score` is the final score and the others are the scores for VMAF's elementary metrics:
- `adm2` and `vif_scalex` scores range from 0 (worst) to 1 (best)
- `motion2` score typically ranges from 0 (static) to 20 (high-motion)
VMAF follows a machine-learning-based approach: it first extracts a number of quality-relevant features (or elementary metrics) from a distorted video and its full-quality reference, then fuses them into a final quality score using a non-linear regressor (e.g. an SVM regressor), hence the name "Video Multi-method Assessment Fusion".
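To make the fusion idea concrete, here is a minimal illustrative sketch using scikit-learn's NuSVR. This is not the library's actual implementation; all feature values, labels, and hyper-parameters are made up:

```python
# Illustrative sketch of the "fusion" step: elementary metric scores are
# fed to a non-linear regressor that outputs a single quality score.
# This is NOT the VMAF library's actual implementation.
import numpy as np
from sklearn.svm import NuSVR

# Made-up training data: rows are frames/clips, columns are elementary
# metrics (e.g. adm2, motion2, vif_scale0..3).
rng = np.random.default_rng(0)
X_train = rng.random((100, 6))          # made-up feature values
y_train = 100.0 * rng.random(100)       # made-up subjective scores in [0, 100]

regressor = NuSVR(nu=0.5, C=1.0, gamma=0.85)  # hyper-parameters for illustration
regressor.fit(X_train, y_train)

X_test = rng.random((1, 6))
print("fused quality score:", regressor.predict(X_test)[0])
```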
In addition to the basic commands, the VMAF package also provides a framework that allows any user to train their own perceptual quality assessment model. For example, the `model` directory contains a number of pre-trained models, which can be loaded by the aforementioned commands:
```
PYTHONPATH=python ./python/vmaf/script/run_vmaf.py \
    format width height \
    reference_path \
    distorted_path \
    [--model model_path]
```
For example:
```
PYTHONPATH=python ./python/vmaf/script/run_vmaf.py \
    yuv420p 576 324 \
    python/test/resource/yuv/src01_hrc00_576x324.yuv \
    python/test/resource/yuv/src01_hrc01_576x324.yuv \
    --model model/other_models/nflxtrain_vmafv3.pkl
```
A user can customize the model based on:
- The video dataset it is trained on
- The list of features used
- The regressor used (and its hyper-parameters)
Once a model is trained, the VMAF package also provides tools to cross-validate it on a different dataset and to visualize the results.
To begin with, create a dataset file following the format in `example_dataset.py`. A dataset is a collection of distorted videos. Each has a unique asset ID and a corresponding reference video, identified by a unique content ID. Each distorted video is also associated with a subjective quality score, typically a MOS (mean opinion score) obtained through a subjective study. An example code snippet that defines a dataset is as follows:
```python
dataset_name = 'example'
yuv_fmt = 'yuv420p'
width = 1920
height = 1080

ref_videos = [
    {'content_id': 0, 'path': 'checkerboard.yuv'},
    {'content_id': 1, 'path': 'flat.yuv'},
]

dis_videos = [
    {'content_id': 0, 'asset_id': 0, 'dmos': 100, 'path': 'checkerboard.yuv'},
    {'content_id': 0, 'asset_id': 1, 'dmos': 50,  'path': 'checkerboard_dis.yuv'},
    {'content_id': 1, 'asset_id': 2, 'dmos': 100, 'path': 'flat.yuv'},
    {'content_id': 1, 'asset_id': 3, 'dmos': 80,  'path': 'flat_dis.yuv'},
]
```
See the directory `resource/dataset` for more examples. Also refer to the Datasets document regarding publicly available datasets.
Once a dataset is created, first validate the dataset using existing VMAF or other (PSNR, SSIM or MS-SSIM) metrics. Run:
```
PYTHONPATH=python ./python/vmaf/script/run_testing.py \
    quality_type \
    test_dataset_file \
    [--vmaf-model optional_VMAF_model_path] \
    [--cache-result] \
    [--parallelize]
```
where `quality_type` can be `VMAF`, `PSNR`, `SSIM`, `MS_SSIM`, etc.
Enabling `--cache-result` allows storing/retrieving extracted features (or elementary quality metrics) in a data store (under `workspace/result_store_dir/file_result_store`), since feature extraction is the most expensive operation here.
Enabling `--parallelize` allows processing multiple reference/distorted video pairs in parallel. Sometimes it is desirable to disable parallelization for debugging purposes (e.g. some error messages can only be displayed when parallel execution is disabled).
For example:
```
PYTHONPATH=python ./python/vmaf/script/run_testing.py \
    VMAF \
    resource/example/example_dataset.py \
    --cache-result \
    --parallelize
```
Make sure `matplotlib` is installed to visualize the MOS-prediction scatter plot and inspect the statistics (a sketch of computing them follows the list):
- PCC – Pearson correlation coefficient
- SRCC – Spearman rank order correlation coefficient
- RMSE – root mean squared error
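For reference, these three statistics can be computed from prediction/MOS pairs with standard tools. The following is a minimal sketch using numpy and scipy; the arrays are made-up stand-ins for the script's outputs:

```python
# Minimal sketch of the three statistics reported above, computed with
# numpy/scipy. The arrays are made-up stand-ins for MOS values and model
# predictions.
import numpy as np
from scipy import stats

mos        = np.array([100.0, 50.0, 100.0, 80.0])   # subjective scores
prediction = np.array([ 95.0, 55.0,  97.0, 75.0])   # model predictions

pcc, _  = stats.pearsonr(mos, prediction)            # Pearson correlation
srcc, _ = stats.spearmanr(mos, prediction)           # Spearman rank correlation
rmse    = np.sqrt(np.mean((mos - prediction) ** 2))  # root mean squared error

print(f"PCC={pcc:.3f}, SRCC={srcc:.3f}, RMSE={rmse:.3f}")
```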
When creating a dataset file, one may make errors (for example, a typo in a file path) that go unnoticed but cause the execution of `run_testing` to fail. For debugging purposes, it is recommended to disable `--parallelize`.
If the problem persists, one may need to run the script:
```
PYTHONPATH=python ./python/vmaf/script/run_cleaning_cache.py \
    quality_type \
    test_dataset_file
```
to clean up corrupted results in the store before retrying. For example:
```
PYTHONPATH=python ./python/vmaf/script/run_cleaning_cache.py \
    VMAF \
    resource/example/example_dataset.py
```
Now that we are confident the dataset is created correctly and we have benchmark results on existing metrics, we can proceed to train a new quality assessment model. Run:
```
PYTHONPATH=python ./python/vmaf/script/run_vmaf_training.py \
    train_dataset_filepath \
    feature_param_file \
    model_param_file \
    output_model_file \
    [--cache-result] \
    [--parallelize]
```
For example:
```
PYTHONPATH=python ./python/vmaf/script/run_vmaf_training.py \
    resource/example/example_dataset.py \
    resource/feature_param/vmaf_feature_v2.py \
    resource/model_param/libsvmnusvr_v2.py \
    workspace/model/test_model.pkl \
    --cache-result \
    --parallelize
```
`feature_param_file` defines the set of features used. For example, both dictionaries below:
```python
feature_dict = {'VMAF_feature': 'all', }
```
and
```python
feature_dict = {'VMAF_feature': ['vif', 'adm'], }
```
are valid specifications of selected features. Here `VMAF_feature` is an "aggregate" feature type, and `vif` and `adm` are the "atomic" feature types within the aggregate type. In the first case, `all` specifies that all atomic features of `VMAF_feature` are selected. A `feature_dict` dictionary can also contain more than one aggregate feature type, as sketched below.
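For illustration, a `feature_dict` that selects features from two aggregate types might look like the following; `Moment_feature` is used here only as a hypothetical second aggregate type, so check the available `FeatureExtractor` subclasses for the types that actually exist:

```python
# Hypothetical feature_dict combining two aggregate feature types.
# 'Moment_feature' is used for illustration only; verify which aggregate
# types exist by inspecting the FeatureExtractor subclasses.
feature_dict = {
    'VMAF_feature': ['vif', 'adm'],  # selected atomic features
    'Moment_feature': 'all',         # all atomic features of this type
}
```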
model_param_file defines the type and hyper-parameters of the regressor to be used. For details, refer to the self-explanatory examples in directory resource/model_param. One example is:
model_type = "LIBSVMNUSVR"
model_param_dict = {
# ==== preprocess: normalize each feature ==== #
'norm_type':'clip_0to1', # rescale to within [0, 1]
# ==== postprocess: clip final quality score ==== #
'score_clip':[0.0, 100.0], # clip to within [0, 100]
# ==== libsvmnusvr parameters ==== #
'gamma':0.85, # selected
'C':1.0, # default
'nu':0.5, # default
'cache_size':200, # default
}The trained model is output to output_model_file. Once it is obtained, it can be used by the run_vmaf, or by run_testing to validate another dataset.
Above are two example scatter plots obtained from running the run_vmaf_training and run_testing commands on a training and a testing dataset, respectively.
The commands run_vmaf_training and run_testing also support custom subjective models (e.g. MLE_CO_AP2 (default), MOS, DMOS, SR_MOS (i.e. ITU-R BT.500), BR_SR_MOS (i.e. ITU-T P.913) and more), through the sureal package.
The subjective model can be specified with the `--subj-model` option, for example:
```
PYTHONPATH=python ./python/vmaf/script/run_vmaf_training.py \
    resource/example/example_raw_dataset.py \
    resource/feature_param/vmaf_feature_v2.py \
    resource/model_param/libsvmnusvr_v2.py \
    workspace/model/test_model.pkl \
    --subj-model MLE_CO_AP2 \
    --cache-result \
    --parallelize
```
```
PYTHONPATH=python ./python/vmaf/script/run_testing.py \
    VMAF \
    resource/example/example_raw_dataset.py \
    --subj-model MLE_CO_AP2 \
    --cache-result \
    --parallelize
```
Note that for the `--subj-model` option to take effect, the input dataset file must follow a format similar to `example_raw_dataset.py`. Specifically, for each dictionary element in `dis_videos`, instead of having a key named `dmos` or `groundtruth` as in `example_dataset.py`, it must have a key named `os` (which stands for opinion score), and the value must be a list of numbers. This is the "raw opinion score" collected from subjective experiments, which is used as the input to the custom subjective models.
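For instance, a `dis_videos` entry in the raw-dataset format might look like the following; the path and scores are made up for illustration:

```python
# Hypothetical dis_videos entry in the "raw" dataset format: the 'os' key
# holds a list of raw opinion scores (one per subject) instead of a single
# 'dmos'/'groundtruth' value. Path and scores are made up.
dis_videos = [
    {'content_id': 0, 'asset_id': 0, 'path': 'checkerboard_dis.yuv',
     'os': [100, 95, 90, 85, 100]},
]
```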
`run_vmaf_cross_validation.py` provides tools for cross-validation of hyper-parameters and models. `run_vmaf_cv` runs training on a training dataset using hyper-parameters specified in a parameter file, outputs a trained model file, then tests the trained model on another test dataset and reports the testing correlation scores. `run_vmaf_kfold_cv` takes in a dataset file, a parameter file, and a data structure (a list of lists) that specifies the folds based on the video contents' IDs, and runs k-fold cross-validation on the video dataset. This can be useful for manually tuning the model parameters.
You can also customize VMAF by plugging in third-party features or inventing new features, and specifying them in a `feature_param_file`. Essentially, the "aggregate" feature type (for example: `VMAF_feature`) specified in the `feature_dict` corresponds to the `TYPE` field of a `FeatureExtractor` subclass (for example: `VmafFeatureExtractor`). All you need to do is create a new class extending the `FeatureExtractor` base class, as sketched below.
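A rough skeleton of such a subclass follows. The `TYPE`, `VERSION`, and `ATOM_FEATURES` class fields follow the conventions described in this document; the module path and the exact method to override (shown here as `_generate_result`) are assumptions and should be verified against the `FeatureExtractor` base class and CONTRIBUTING.md:

```python
# Hypothetical skeleton of a third-party feature extractor. The exact
# override point (_generate_result) is an assumption; check the
# FeatureExtractor base class for the real hook(s) to implement.
from vmaf.core.feature_extractor import FeatureExtractor

class MyFeatureExtractor(FeatureExtractor):

    TYPE = 'My_feature'                          # the "aggregate" feature name
    VERSION = '1.0'                              # bump when the computation changes
    ATOM_FEATURES = ['sharpness', 'blockiness']  # "atom" feature names (made up)

    def _generate_result(self, asset):
        # Compute per-frame scores for each atom feature of 'asset' and
        # write them out in the form the base class expects.
        raise NotImplementedError
```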
Similarly, you can plug in a third-party regressor or invent a new regressor and specify it in a `model_param_file`. The `model_type` (for example: `LIBSVMNUSVR`) corresponds to the `TYPE` field of a `TrainTestModel` subclass (for example: `LibsvmnusvrTrainTestModel`). All that is needed is to create a new class extending the `TrainTestModel` base class, as sketched below.
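A rough skeleton, under the same caveats as above (the module path and method signatures are assumptions to be checked against the `TrainTestModel` base class):

```python
# Hypothetical skeleton of a third-party regressor. The train/predict
# signatures mirror the interface described above but are assumptions;
# persistence (to_file/from_file) may be partly provided by the base class.
from vmaf.core.train_test_model import TrainTestModel

class MyRegressorTrainTestModel(TrainTestModel):

    TYPE = 'MYREGRESSOR'
    VERSION = '1.0'

    def train(self, xys):
        # Fit the regressor on features and their ground-truth labels.
        raise NotImplementedError

    def predict(self, xs):
        # Predict labels for a set of feature vectors.
        raise NotImplementedError
```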
For instructions on how to extend the `FeatureExtractor` and `TrainTestModel` base classes, refer to CONTRIBUTING.md.
Over time, a number of helper tools have been incorporated into the package to facilitate training and validating VMAF models. An overview of the available tools can be found in this slide deck.
A Bjøntegaard-Delta (BD) rate implementation has been added. Example usage can be found here. The implementation is validated against MPEG JCTVC-L1100.
An implementation of LIME has also been added to the repository. For more information, refer to our analysis tools presentation. The main idea is to fit a local linear approximation to any regressor or classifier and then use the coefficients of the linearized model as indicators of feature importance. LIME can be used as part of the VMAF regression framework, for example:
```
PYTHONPATH=python ./python/vmaf/script/run_vmaf.py \
    yuv420p 576 324 \
    src01_hrc00_576x324.yuv \
    src01_hrc00_576x324.yuv \
    --local-explain
```
Naturally, LIME can also be applied to any other regression scheme, as long as a pre-trained model exists. For example, applying it to BRISQUE:
```
PYTHONPATH=python ./python/vmaf/script/run_vmaf.py yuv420p 576 324 \
    src01_hrc00_576x324.yuv \
    src01_hrc00_576x324.yuv \
    --local-explain \
    --model model/other_models/nflxall_vmafv1.pkl
```
A tool to convert a model file from pickle to JSON (currently only libsvm models are supported) is available at `python/vmaf/script/convert_model_from_pkl_to_json.py`. Usage:
```
usage: convert_model_from_pkl_to_json.py [-h] --input-pkl-filepath
                                         INPUT_PKL_FILEPATH
                                         --output-json-filepath
                                         OUTPUT_JSON_FILEPATH

optional arguments:
  -h, --help            show this help message and exit
  --input-pkl-filepath INPUT_PKL_FILEPATH
                        path to the input pkl file, example:
                        model/vmaf_float_v0.6.1.pkl or
                        model/vmaf_float_b_v0.6.3/vmaf_float_b_v0.6.3.pkl
  --output-json-filepath OUTPUT_JSON_FILEPATH
                        path to the output json file, example:
                        model/vmaf_float_v0.6.1.json or
                        model/vmaf_float_b_v0.6.3.json
```
Examples:
```
python/vmaf/script/convert_model_from_pkl_to_json.py \
    --input-pkl-filepath model/vmaf_float_b_v0.6.3/vmaf_float_b_v0.6.3.pkl \
    --output-json-filepath ./vmaf_float_b_v0.6.3.json

python/vmaf/script/convert_model_from_pkl_to_json.py \
    --input-pkl-filepath model/vmaf_float_v0.6.1.pkl \
    --output-json-filepath ./vmaf_float_v0.6.1.json
```
The core classes of the VMAF Python library are depicted in the diagram below:
An Asset is the most basic unit with enough information to perform a task on a piece of media. It includes basic information about a distorted video and its undistorted reference counterpart, as well as the auxiliary preprocessing information that can be understood by the Executor and its subclasses. For example:
- The frame range on which to perform a task (i.e. `dis_start_end_frame` and `ref_start_end_frame`)
- At what resolution to perform a task (e.g. a video frame is upscaled with a `resampling_type` method to the resolution specified by `quality_width_height` before feature extraction)
Asset extends the WorkdirEnabled mixin, which comes with a thread-safe working directory to facilitate parallel execution.
An Executor takes a list of Assets as input, runs computations on them, and returns a list of corresponding Results. An Executor extends the TypeVersionEnabled mixin, and must specify a unique type and version combination (via the `TYPE` and `VERSION` attributes), so that the Results generated by it can be uniquely identified. This facilitates a number of shared housekeeping functions, including storing and reusing Results (`result_store`), creating FIFO pipes (`fifo_mode`), etc. An Executor understands the preprocessing steps specified in its input Assets. It relies on FFmpeg to do the processing for it (FFmpeg must be pre-installed and its path specified in the `FFMPEG_PATH` field in the `python/vmaf/externals.py` file).
An Executor and its subclasses can take optional parameters during initialization. There are two fields for optional parameters (a hypothetical sketch follows the list):
- `optional_dict`: a dictionary to specify parameters that impact the numerical result (e.g. which wavelet transform to use).
- `optional_dict2`: a dictionary to specify parameters that do NOT impact the numerical result (e.g. outputting optional results).
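As a rough illustration, constructing an Executor subclass with optional parameters might look like the following. The constructor signature and the parameter names are assumptions; verify them against the Executor base class in the codebase:

```python
# Hypothetical invocation of a FeatureExtractor (an Executor subclass) with
# optional parameters. The constructor signature and parameter names are
# assumptions; check the Executor base class for the actual arguments.
from vmaf.core.feature_extractor import VmafFeatureExtractor

runner = VmafFeatureExtractor(
    assets,                                # list of Asset objects (defined elsewhere)
    None,                                  # logger
    optional_dict={'wavelet': 'db2'},      # hypothetical: impacts the numerical result
    optional_dict2={'output_extra': True}, # hypothetical: does not impact the result
)
runner.run()
results = runner.results
```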
Executor is the base class for FeatureExtractor and QualityRunner (and the sub-subclass VmafQualityRunner).
A Result is a key-value store of read-only execution results generated by an Executor on an Asset. A key corresponds to an "atom" feature type or a type of a quality score, and a value is a list of score values, each corresponding to a computation unit (i.e. in the current implementation, a frame).
The Result class also provides a number of tools for aggregating the per-unit scores into an average score. The default aggregation method is the mean, but `Result.set_score_aggregate_method()` allows customizing it (see `test_to_score_str()` in `test/result_test.py` for examples).
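For example, switching the aggregation from the mean to the median might look like this; a minimal sketch, where `result` stands for a Result object obtained from running an Executor:

```python
# Minimal sketch: override the default mean aggregation on a Result object.
# 'result' is assumed to be a Result obtained from an Executor run; see
# test/result_test.py for concrete usage.
import numpy as np

result.set_score_aggregate_method(np.median)
# Aggregate scores retrieved afterwards use the median across frames.
```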
ResultStore provides the capability to save and load a Result. The current implementation, FileSystemResultStore, persists results via the file system, saving/loading results in a directory. The directory has multiple subdirectories, each corresponding to an Executor. Each subdirectory contains multiple files, each storing the dataframe for an Asset.
FeatureExtractor subclasses Executor, and is specifically for extracting features (aka elementary quality metrics) from Assets. Any concrete feature extraction implementation should extend the FeatureExtractor base class (e.g. VmafFeatureExtractor). The TYPE field corresponds to the "aggregate" feature name, and the ATOM_FEATURES/DERIVED_ATOM_FEATURES field corresponds to the "atom" feature names.
FeatureAssembler assembles features for an input list of Assets on an input list of FeatureExtractor subclasses. The constructor argument `feature_dict` specifies the list of FeatureExtractor subclasses (i.e. the "aggregate" features) and the selected "atom" features. For each Asset and FeatureExtractor pair, it outputs a BasicResult object. FeatureAssembler is used by a QualityRunner to assemble the vector of features to be used by a TrainTestModel.
TrainTestModel is the base class for any concrete regressor implementation, which must provide a `train()` method to perform training on a set of data and their ground-truth labels, a `predict()` method to predict the labels on a set of data, and `to_file()` and `from_file()` methods to save and load trained models.
A TrainTestModel constructor must supply a dictionary of parameters (i.e. param_dict) that contains the regressor's hyper-parameters. The base class also provides shared functionalities such as input data normalization/output data denormalization, evaluating prediction performance, etc.
Like an Executor, a TrainTestModel extends TypeVersionEnabled and must specify a unique type and version combination (by the TYPE and VERSION attribute).
CrossValidation provides a collection of static methods to facilitate validation of a TrainTestModel object. As such, it also provides means to search the optimal hyper-parameter set for a TrainTestModel object.
QualityRunner subclasses Executor, and is specifically for evaluating the quality score for Assets. Any concrete implementation to generate the final quality score should extend the QualityRunner base class (e.g. VmafQualityRunner, PsnrQualityRunner).
There are two ways to extend a QualityRunner base class -- either by directly implementing the quality calculation (e.g. by calling a C executable, as in PsnrQualityRunner), or by calling a FeatureAssembler (which indirectly calls a FeatureExtractor) and a TrainTestModel subclass (as in VmafQualityRunner). A rough sketch of the first approach follows.
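The skeleton below illustrates the directly-implemented style. As with the earlier sketches, the module path and the `_generate_result` hook are assumptions; consult PsnrQualityRunner in the codebase for a real reference:

```python
# Hypothetical skeleton of a QualityRunner that computes quality directly
# (the PsnrQualityRunner style), without a FeatureAssembler/TrainTestModel.
# The _generate_result hook is an assumption; verify against the base class.
from vmaf.core.quality_runner import QualityRunner

class MyQualityRunner(QualityRunner):

    TYPE = 'MY_QUALITY'
    VERSION = '1.0'

    def _generate_result(self, asset):
        # e.g. invoke an external executable on the reference/distorted
        # pair described by 'asset' and write out per-frame scores.
        raise NotImplementedError
```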