In this tutorial, the author calculates the screen time of both Tom and Jerry
in a given video. The steps involved in solving this problem are as follows:
- Import and read the video, extract frames from it, and save them as images.
- Label a few images for training the model.
- Build our model on training data.
- Make predictions for the remaining images.
- Calculate the screen time of both Tom and Jerry.
The author used the following Python libraries to implement this
project: NumPy, pandas, Matplotlib, Keras, scikit-image (skimage), and OpenCV.
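
The final step in the list above, turning per-frame predictions into screen time, can be sketched as follows. This is a minimal illustration and not the author's code: it assumes one frame is sampled per second of video, and the label encoding in `CLASS_NAMES` and the helper name `screen_time` are hypothetical choices made for this example.

```python
from collections import Counter

# Assumed label encoding (hypothetical; match it to your own labelled data).
CLASS_NAMES = {0: "neither", 1: "Jerry", 2: "Tom"}


def screen_time(predictions, seconds_per_frame=1.0):
    """Return the seconds on screen per class from per-frame class predictions.

    seconds_per_frame is the time gap between two sampled frames
    (assumed to be 1 second here).
    """
    counts = Counter(predictions)
    return {
        name: counts.get(label, 0) * seconds_per_frame
        for label, name in CLASS_NAMES.items()
    }


# Toy predictions for six frames sampled one second apart.
example = [0, 1, 1, 2, 2, 2]
times = screen_time(example)
print(times)  # {'neither': 1.0, 'Jerry': 2.0, 'Tom': 3.0}
```

In practice, the `predictions` list would hold the predicted class of each extracted frame, so the quality of the screen-time estimate depends directly on the classifier's accuracy.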

The author reports an accuracy of around 88% on the validation data and 64% on the test data with the proposed model. One possible reason for the lower test accuracy is a lack of training data. Because the model has little prior knowledge of cartoon images such as Tom and Jerry, it needs to see more of them during training. The key point is to extract more frames from different Tom and Jerry videos, label them accordingly, and use them for training. Once the model has seen many more images of the two characters, there is a good chance that the classification results will improve.
Source: https://www.analyticsvidhya.com/blog/2018/09/deep-learning-video-classification-python/
Author: Pulkit Sharma
Article read and shared by: Rakshith M D