Primary Focus: Proactively predict any factor that impacts our customers’ quality of experience (QoE) on their home network devices.
Key Responsibilities
Root Cause Analysis & Feature Engineering
Investigate gateway reboots to identify any Wi-Fi–related events or patterns that trigger them.
Focus on “isolated reboots” to single out those associated with Wi-Fi conditions visible in telemetry, filtering out reboots driven by other factors.
Analyze telemetry data (RDK dataset) collected every 15 minutes from Wi-Fi gateways (XB3, XB6, XB7, XB8).
Engineer numerical and categorical features (e.g., network efficiency, congestion metrics, retransmission rates) to highlight pain points or stress in the network.
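As an illustration of the kind of engineered feature listed above, a retransmission rate could be derived from per-interval packet counters. This is a minimal sketch in plain Python; the field names `tx_packets` and `retx_packets` are hypothetical and are not taken from the RDK dataset schema.

```python
def retransmission_rate(tx_packets, retx_packets):
    """Fraction of transmitted packets that were retransmissions.

    Returns None when the interval has no usable traffic, so that idle
    intervals are not mistaken for perfectly healthy ones.
    """
    if tx_packets is None or retx_packets is None or tx_packets <= 0:
        return None
    return retx_packets / tx_packets


# Hypothetical 15-minute samples from one gateway.
samples = [
    {"tx_packets": 1000, "retx_packets": 50},
    {"tx_packets": 0, "retx_packets": 0},
    {"tx_packets": 400, "retx_packets": 100},
]
rates = [retransmission_rate(s["tx_packets"], s["retx_packets"]) for s in samples]
```

Returning None rather than 0 for idle intervals is a design choice that keeps missing traffic distinguishable from clean traffic in later analysis.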
Data Analysis & Visualization
Conduct rigorous exploratory data analysis (EDA) on telemetry markers to understand distributions and correlations.
Develop performance/resilience metrics to capture the strain on different gateway models under a range of usage scenarios.
Create visualization scripts (e.g., time-series plots, correlation heatmaps) to detect anomalies, identify malformed or missing log entries, and confirm theoretical expectations about telemetry marker behavior.
Document findings and present comparative analysis across different gateway models and scenarios.
Machine Learning Pipeline & Statistical Testing
Use ML models and statistical tests (Pearson correlation, point-biserial correlation, chi-squared test) to assess the relationship between engineered features and reboot events.
Develop time-series features (moving averages, derivatives, variance) to highlight sudden spikes or drops in telemetry markers leading to quality or service degradation.
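The statistical tests and time-series features above can be sketched in plain Python. The point-biserial correlation is the Pearson correlation computed between a continuous feature and a binary label (here, reboot or no reboot), so one helper covers both. The function names and sample values below are illustrative assumptions, not part of any existing pipeline.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation; with a 0/1 label in y this is the point-biserial r."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def moving_average(values, window):
    """Trailing moving average; the first window-1 points are dropped."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def first_difference(values):
    """Discrete derivative: change between consecutive samples."""
    return [b - a for a, b in zip(values, values[1:])]


# Hypothetical feature values and reboot labels (1 = reboot followed).
feature = [1, 2, 3, 4]
rebooted = [0, 0, 1, 1]
r = pearson(feature, rebooted)
```

A large absolute value in `first_difference` flags a sudden spike or drop between two 15-minute samples, while `moving_average` smooths short-lived noise before comparison.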
Data Infrastructure & Tools
Script in PySpark/Python using AWS EMR and Athena notebooks, storing labeled datasets/results in S3.
Run SQL queries in Athena for data exploration, feature engineering, and model development.
Implement imputation techniques (backward filling) to handle null values in telemetry data based on historical windows.
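Backward filling replaces each null with the next valid observation in time order. The sketch below shows the idea on a plain list, with an optional `limit` (an assumed parameter for illustration) that caps how many consecutive nulls are filled; pandas offers a comparable `bfill` method for DataFrames and Series.

```python
def backward_fill(values, limit=None):
    """Fill None entries with the next non-None value in the sequence.

    limit caps how many consecutive gaps are filled from one valid value;
    trailing None entries have no later value and are left as-is.
    """
    filled = list(values)
    next_valid = None
    gap = 0
    # Walk from the end so the most recent later value is always known.
    for i in range(len(filled) - 1, -1, -1):
        if filled[i] is not None:
            next_valid = filled[i]
            gap = 0
        elif next_valid is not None:
            gap += 1
            if limit is None or gap <= limit:
                filled[i] = next_valid
    return filled


# Hypothetical telemetry marker with missing 15-minute samples.
readings = [None, 3, None, None, 7, None]
imputed = backward_fill(readings)
imputed_limited = backward_fill(readings, limit=1)
```

Capping the fill length avoids carrying one value across a long outage, where the imputed readings would no longer reflect the gateway's actual state.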
Validation & On-Site Testing
Set up local testbeds using iperf, Wireshark, and other tools to generate controlled traffic patterns (TCP/UDP) and interference levels.
Monitor and validate telemetry markers (e.g., RSSI, channel widths, channel hopping frequency, throughput) in real time to confirm and refine assumptions.
Align on-site test results with analysis from the RDK dataset to ensure consistency and accuracy.
Collaboration & Communication
Present weekly progress and data insights to teammates and leadership.
Document all findings and share comprehensive visual and written reports, ensuring that stakeholders understand the key conclusions and next steps.