
Poster: Inferring Video Resolution with Coarse QoS in GEO Satellite Networks

Workshop on Mobile Computing Systems and Applications (2024)

Abstract
Video streaming is a dominant force in today's network landscape, representing 53% of total network traffic. This prominence makes it essential for Internet Service Providers (ISPs) to monitor the quality of experience (QoE) of video streaming for their users to ensure satisfactory service. However, ISPs face a significant challenge in this regard because the payload data of video streams is encrypted. To address this, recent studies have suggested using fine-granularity network information, such as TCP headers and per-second throughput of individual video streams, as features to estimate QoE with machine learning [1, 2]. While these methods are promising, they present challenges in terms of storage cost and complexity. Our goal in this work is to evaluate the feasibility of inferring video resolution from coarse-granularity network features on a production Geosynchronous (GEO) satellite network. Packet scheduling is crucial in GEO satellite networks, as it facilitates dynamic adjustment of the priority of different service classes. This adjustment directly influences the bandwidth allocated to each service class, depending on the total bandwidth available. Our approach measures the throughput (in kbps) of the video service class at the MAC (Media Access Control) layer during each packet scheduling epoch, averaged over five seconds. To capture the temporal variability of the maximum achievable throughput, we use a rolling window: we compute statistical measures (mean, variance, median) across rolling windows ranging from 5 to 450 seconds, with each successive window spaced 5 seconds apart. To obtain video resolution labels, we act as a CDN for a major streaming content provider and directly measure the resolution (e.g., 360p, 480p) of each video playback session every five seconds. Finally, we use the Random Forest model, which previous studies have shown to be the most accurate for this task.
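As a rough sketch (not the authors' code), the rolling-window feature extraction described above could look like the following, assuming a series of MAC-layer throughput samples taken once per five-second scheduling epoch; the function name and window choices are illustrative:

```python
import numpy as np

def rolling_features(throughput_kbps, window_sizes=(5, 50, 450), epoch_s=5):
    """Compute mean/variance/median of per-epoch throughput samples over
    rolling windows of different lengths (in seconds). One sample arrives
    every `epoch_s` seconds. Hypothetical sketch of the paper's features."""
    x = np.asarray(throughput_kbps, dtype=float)
    feats = {}
    for w in window_sizes:
        n = max(1, w // epoch_s)          # samples covered by this window
        tail = x[-n:]                     # most recent window of samples
        feats[f"mean_{w}s"] = float(tail.mean())
        feats[f"var_{w}s"] = float(tail.var())
        feats[f"median_{w}s"] = float(np.median(tail))
    return feats
```

In the paper's setup, one such feature vector would be produced every 5 seconds as the window slides forward; the sketch above computes only the most recent position for brevity.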
A key challenge with using coarse-grained features is their occasional lack of correlation with the label. Figure 1 illustrates the relationship between maximum achievable throughput and playback bitrate. Notably, there are instances where the playback bitrate decreases even though the maximum achievable throughput far exceeds it. We also verified that resolution follows the same trend as bitrate, though it is excluded from the figure for clarity. This suggests the influence of other factors, such as poor Wi-Fi signals or misaligned antennas, which our coarse-grained features at the ISP level fail to capture. We observe that including these noisy data points in training significantly reduces model accuracy. To focus on estimating video experience in cases where the network was the primary limiting factor, we exclude training samples where the average maximum achievable throughput over the past 300 seconds exceeds 15 Mbps. Additionally, we find that predicting the exact resolution (e.g., 360p, 480p) is challenging with our features. Therefore, we instead predict whether the mode of playback resolution is below 720p within each 300-second interval. The training set consists of the first 70% of video playback sessions, and the remaining 30% form the test set. The results are shown in Table 1. We confirm that the binary Random Forest classifier achieves some degree of accuracy. While these results are promising, our ongoing work focuses on further improving the model.
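The filtering, binary labeling, and chronological split described above could be sketched as follows; the array names are hypothetical, and each row stands for one 300-second interval. A Random Forest classifier (e.g., scikit-learn's RandomForestClassifier) would then be fit on the training portion:

```python
import numpy as np

def prepare_dataset(avg_tput_300s_kbps, resolution_mode_p, train_frac=0.7):
    """Drop intervals whose 300-s average maximum achievable throughput
    exceeds 15 Mbps (network unlikely to be the bottleneck), label each
    remaining interval 1 if its mode resolution is below 720p, and split
    chronologically. Hypothetical sketch of the paper's preprocessing."""
    tput = np.asarray(avg_tput_300s_kbps, dtype=float)
    res = np.asarray(resolution_mode_p, dtype=int)
    keep = tput <= 15_000                  # 15 Mbps, expressed in kbps
    X = tput[keep].reshape(-1, 1)          # single coarse feature here
    y = (res[keep] < 720).astype(int)      # 1 = mode resolution below 720p
    split = int(train_frac * len(X))       # first 70% of sessions train
    return (X[:split], y[:split]), (X[split:], y[split:])
```

In practice X would hold the full set of rolling-window statistics rather than a single throughput column; the single feature keeps the sketch compact.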