Construction
Having obtained the raw videos, we use FFmpeg and PySceneDetect to split them into 104,582 sequences. We manually check the sequences and remove duplicated, chaotic, and blurry scenes. Sequences that are wrongly split by the scene detection tools are also removed. Finally, we retain 32,405 videos with more than six million frames for disparity annotation.
We further filter out videos that do not qualify for our dataset. Based on the optical flow and valid masks, samples that meet the following conditions are removed: 1) more than 30% of pixels in the consistency masks are invalid; 2) more than 10% of pixels have a vertical disparity larger than two pixels; 3) the average range of horizontal disparity is less than 15 pixels. Then, we manually check all the videos along with their corresponding ground truth, and remove the samples with obvious errors. Finally, we retain 14,203 videos with 2,237,320 frames in the VDW dataset.
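The three automatic filtering conditions above can be sketched as a simple predicate. This is a minimal illustration, not the official VDW toolkit: the `SampleStats` fields and the `keep_sample` function are hypothetical names, the statistics are assumed to be precomputed from the optical flow and consistency masks, and a sample is assumed to be removed when any one of the conditions holds.

```python
from dataclasses import dataclass


@dataclass
class SampleStats:
    """Hypothetical per-sample statistics, assumed to be precomputed."""
    invalid_ratio: float           # fraction of invalid pixels in the consistency mask
    vertical_outlier_ratio: float  # fraction of pixels with |vertical disparity| > 2 px
    mean_horizontal_range: float   # average horizontal disparity range, in pixels


def keep_sample(stats: SampleStats) -> bool:
    """Return True if the sample passes all three filters.

    Assumption: a sample is removed when ANY condition is met.
    """
    if stats.invalid_ratio > 0.30:           # condition 1: too many invalid pixels
        return False
    if stats.vertical_outlier_ratio > 0.10:  # condition 2: too much vertical disparity
        return False
    if stats.mean_horizontal_range < 15.0:   # condition 3: disparity range too small
        return False
    return True


if __name__ == "__main__":
    print(keep_sample(SampleStats(0.12, 0.03, 42.0)))
    print(keep_sample(SampleStats(0.45, 0.03, 42.0)))
```

The thresholds are taken directly from the text; values exactly at a threshold are kept because the conditions are stated as "more than" and "less than".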
Statistics
The VDW dataset contains a larger number of video scenes than existing datasets. Compared with closed-domain datasets, VDW is not restricted to a particular type of scene, which is more helpful for training a robust video depth model. Among natural-scene datasets, VDW has more than ten times as many videos as the previous largest dataset, WSVD. It is also worth noting that the VDW dataset has a higher resolution. We only collect videos over 1080p and crop them to 1880 × 800 to remove black bars and subtitles.
VDW contains 14,203 videos with 2,237,320 frames. The total data collection and processing took over six months and about 4,000 man-hours. To verify the diversity of scenes and entities in our dataset, we run semantic segmentation with Mask2Former trained on ADE20K. All 150 categories are covered in our dataset, and each category can be found in at least 50 videos. The five categories that appear most frequently are person (97.2%), wall (89.1%), floor (63.5%), ceiling (46.5%), and tree (42.3%).
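Category statistics of this kind can be computed from per-video segmentation labels roughly as follows. This is only a sketch under assumptions: the percentages are taken to mean the share of videos in which a category appears at least once, and `category_coverage` and `most_frequent` are hypothetical helper names, not part of any released toolkit.

```python
from collections import Counter


def category_coverage(labels_per_video):
    """Return {category: percentage of videos containing it}.

    labels_per_video: one iterable of category names per video
    (for example, the classes predicted by a segmentation model).
    """
    total = len(labels_per_video)
    if total == 0:
        return {}
    counts = Counter()
    for labels in labels_per_video:
        counts.update(set(labels))  # count each category once per video
    return {name: 100.0 * n / total for name, n in counts.items()}


def most_frequent(coverage, k=5):
    """Return the k categories with the highest coverage, highest first."""
    return sorted(coverage.items(), key=lambda item: item[1], reverse=True)[:k]


if __name__ == "__main__":
    toy = [["person", "wall", "person"], ["person", "floor"], ["wall"], ["person"]]
    print(most_frequent(category_coverage(toy), k=2))
```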
We randomly select 90 videos with 12,622 frames as the test set. The test videos come from different data sources than the training data, i.e., different movies, web videos, or animations. VDW not only alleviates the data shortage for learning-based approaches, but also serves as a comprehensive benchmark for video depth.
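When reproducing such a split, it can be useful to check that no data source is shared between the two sets. The snippet below is a minimal sketch; the `"source"` key and the `shared_sources` function are hypothetical and only illustrate the check.

```python
def shared_sources(train_videos, test_videos):
    """Return the set of sources present in both splits (empty if disjoint).

    Each video is assumed to be a dict with a hypothetical "source" key,
    e.g. the movie or web video it was cut from.
    """
    train_sources = {video["source"] for video in train_videos}
    test_sources = {video["source"] for video in test_videos}
    return train_sources & test_sources


if __name__ == "__main__":
    train = [{"name": "a_0001", "source": "movie_a"},
             {"name": "b_0001", "source": "movie_b"}]
    test = [{"name": "c_0001", "source": "movie_c"}]
    print(shared_sources(train, test))
```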
VDW Training Set
VDW Test Set
Download and Usage
We have released the VDW dataset under strict conditions. We must ensure that releasing the VDW dataset does not violate any copyright requirements or regulations. Thus, we will not release any video frames or derived data. Instead, we provide the metadata and detailed toolkits, which can be used to reproduce VDW or generate your own dataset. All our released metadata and toolkits are licensed under CC BY-NC-SA 4.0 and can only be used for academic and research purposes. The VDW dataset cannot be used for any commercial purposes.
The VDW test set contains 90 videos with 12,622 frames, while the VDW training set contains 14,203 videos with over 2 million frames (8 TB on hard drive). We also provide a VDW demo set with two sequences. Users can run the official VDW toolkits on the demo sequences to learn about our data processing pipeline.
The release of the VDW dataset is divided into two parts: (1) metadata for download; (2) toolkits for data usage or construction.
Updates
Download For Metadata
We provide the VDW metadata for download. Downloading our metadata indicates your agreement to our license and conditions. For each sequence, the metadata includes the movie name, IMDb number, resolution, movie duration (in seconds), sequence start time, sequence end time, and cropping area (four coordinate values, from the upper-left corner to the lower-right corner). The metadata can be downloaded as follows. Generating and inspecting the metadata for the large training set takes considerable time and effort. We will release that part once it is ready.
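To illustrate how such a metadata record could drive sequence extraction, here is a hedged sketch that turns one record into an FFmpeg command line. The `SequenceMeta` field names, the assumption that times are given in seconds, and the `ffmpeg_args` helper are all hypothetical; the actual metadata format and the official toolkits may differ. Only the standard FFmpeg options `-ss`, `-to`, `-i`, and the `crop=w:h:x:y` video filter are used.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SequenceMeta:
    """Hypothetical in-memory form of one metadata record."""
    movie_name: str
    imdb_number: str
    resolution: Tuple[int, int]      # (width, height) of the source video
    duration_s: float                # movie duration in seconds
    start_s: float                   # sequence start time, assumed in seconds
    end_s: float                     # sequence end time, assumed in seconds
    crop: Tuple[int, int, int, int]  # (x1, y1, x2, y2): upper-left to lower-right


def ffmpeg_args(meta: SequenceMeta, video_path: str, out_pattern: str) -> List[str]:
    """Build an FFmpeg argument list that cuts and crops one sequence."""
    x1, y1, x2, y2 = meta.crop
    width, height = x2 - x1, y2 - y1
    return [
        "ffmpeg",
        "-ss", f"{meta.start_s:.3f}",  # seek to the sequence start
        "-to", f"{meta.end_s:.3f}",    # stop at the sequence end
        "-i", video_path,
        "-vf", f"crop={width}:{height}:{x1}:{y1}",  # crop=w:h:x:y
        out_pattern,
    ]


if __name__ == "__main__":
    meta = SequenceMeta("example_movie", "tt0000000", (1920, 1080),
                        7200.0, 61.5, 68.25, (20, 140, 1900, 940))
    print(" ".join(ffmpeg_args(meta, "movie.mkv", "frames/%06d.png")))
```

The list can be passed to `subprocess.run` once the source video has been obtained legally; the example values above are placeholders, not real VDW metadata.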
For More Information
Our VDW dataset cannot be used for any commercial purposes. If you need more help or information on generating disparity, you can contact the official VDW email: vdw.dataset@gmail.com. You can use the template to specify your requirements. Please include your name, institution, purpose, and agreement to the license. We will reply within 3-5 working days.
License
The VDW dataset is released under strict conditions. It cannot be used or distributed for any commercial purposes and can only be used for academic and research purposes. Accordingly, the VDW dataset is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0). Under this license, if you modify VDW or generate new data from the VDW dataset (e.g., generating defocus maps for bokeh rendering), your new data must be released under the same CC BY-NC-SA 4.0 license.
Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International
Creative Commons Corporation (“Creative Commons”) is not a law firm and does not provide legal services or legal advice. Distribution of Creative Commons public licenses does not create a lawyer-client or other relationship. Creative Commons makes its licenses and related information available on an “as-is” basis. Creative Commons gives no warranties regarding its licenses, any material licensed under their terms and conditions, or any related information. Creative Commons disclaims all liability for damages resulting from their use to the fullest extent possible.
Using Creative Commons Public Licenses
Creative Commons public licenses provide a standard set of terms and conditions that creators and other rights holders may use to share original works of authorship and other material subject to copyright and certain other rights specified in the public license below. The following considerations are for informational purposes only, are not exhaustive, and do not form part of our licenses.
Considerations for licensors
Our public licenses are intended for use by those authorized to give the public permission to use material in ways otherwise restricted by copyright and certain other rights. Our licenses are irrevocable. Licensors should read and understand the terms and conditions of the license they choose before applying it. Licensors should also secure all rights necessary before applying our licenses so that the public can reuse the material as expected. Licensors should clearly mark any material not subject to the license. This includes other CC-licensed material, or material used under an exception or limitation to copyright. More considerations for licensors.
Considerations for the public
By using one of our public licenses, a licensor grants the public permission to use the licensed material under specified terms and conditions. If the licensor’s permission is not necessary for any reason–for example, because of any applicable exception or limitation to copyright–then that use is not regulated by the license. Our licenses grant only permissions under copyright and certain other rights that a licensor has authority to grant. Use of the licensed material may still be restricted for other reasons, including because others have copyright or other rights in the material. A licensor may make special requests, such as asking that all changes be marked or described. Although not required by our licenses, you are encouraged to respect those requests where reasonable. More considerations for the public.
By exercising the Licensed Rights (defined below), You accept and agree to be bound by the terms and conditions of this Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License ("Public License"). To the extent this Public License may be interpreted as a contract, You are granted the Licensed Rights in consideration of Your acceptance of these terms and conditions, and the Licensor grants You such rights in consideration of benefits the Licensor receives from making the Licensed Material available under these terms and conditions.
Section 1 – Definitions.
Section 2 – Scope.
Section 3 – License Conditions.
Your exercise of the Licensed Rights is expressly made subject to the following conditions.
Attribution.
If You Share the Licensed Material (including in modified form), You must:
In addition to the conditions in Section 3(a), if You Share Adapted Material You produce, the following conditions also apply.
Section 4 – Sui Generis Database Rights.
Where the Licensed Rights include Sui Generis Database Rights that apply to Your use of the Licensed Material:
Section 5 – Disclaimer of Warranties and Limitation of Liability.
Section 6 – Term and Termination.
Where Your right to use the Licensed Material has terminated under Section 6(a), it reinstates:
Section 7 – Other Terms and Conditions.
Section 8 – Interpretation.
Creative Commons is not a party to its public licenses. Notwithstanding, Creative Commons may elect to apply one of its public licenses to material it publishes and in those instances will be considered the “Licensor.” The text of the Creative Commons public licenses is dedicated to the public domain under the CC0 Public Domain Dedication. Except for the limited purpose of indicating that material is shared under a Creative Commons public license or as otherwise permitted by the Creative Commons policies published at creativecommons.org/policies, Creative Commons does not authorize the use of the trademark “Creative Commons” or any other trademark or logo of Creative Commons without its prior written consent including, without limitation, in connection with any unauthorized modifications to any of its public licenses or any other arrangements, understandings, or agreements concerning use of licensed material. For the avoidance of doubt, this paragraph does not form part of the public licenses.
Creative Commons may be contacted at creativecommons.org.
We thank all the authors for their contributions to our work, and we especially appreciate Jiaqi Li and Zihao Huang for their help in constructing the VDW dataset.
Please refer to Download and Usage for using our VDW dataset. If you have any questions about the dataset, feel free to contact us via the official VDW email.