Improve documentation
- app.py +8 -0
- apps/clustering.py +6 -2
- documentations/anomaly_detection_doc.py +125 -0
- documentations/trafic_analysis_doc.py +118 -0
app.py
CHANGED
|
@@ -196,6 +196,14 @@ if check_password():
|
|
| 196 |
"documentations/lte_capacity_docs.py",
|
| 197 |
title="📘LTE Capacity Documentation",
|
| 198 |
),
|
| 199 |
],
|
| 200 |
}
|
| 201 |
|
| 196 |
"documentations/lte_capacity_docs.py",
|
| 197 |
title="📘LTE Capacity Documentation",
|
| 198 |
),
|
| 199 |
+
st.Page(
|
| 200 |
+
"documentations/trafic_analysis_doc.py",
|
| 201 |
+
title="📘Traffic Analysis Documentation",
|
| 202 |
+
),
|
| 203 |
+
st.Page(
|
| 204 |
+
"documentations/anomaly_detection_doc.py",
|
| 205 |
+
title="📘Anomaly Detection Documentation",
|
| 206 |
+
),
|
| 207 |
],
|
| 208 |
}
|
| 209 |
|
apps/clustering.py
CHANGED
|
@@ -137,8 +137,12 @@ st.title("Automatic Site Clustering App")
|
|
| 137 |
# Add description
|
| 138 |
st.write(
|
| 139 |
"""This app allows you to cluster sites based on their latitude and longitude.
|
| 140 |
-
|
| 141 |
-
|
| 142 |
)
|
| 143 |
|
| 144 |
# Download Sample file
|
| 137 |
# Add description
|
| 138 |
st.write(
|
| 139 |
"""This app allows you to cluster sites based on their latitude and longitude.
|
| 140 |
+
|
| 141 |
+
Please choose a file containing:
|
| 142 |
+
- latitude and longitude columns
|
| 143 |
+
- region column
|
| 144 |
+
- site code column
|
| 145 |
+
"""
|
| 146 |
)
|
| 147 |
|
| 148 |
# Download Sample file
|
documentations/anomaly_detection_doc.py
ADDED
|
@@ -0,0 +1,125 @@
| 1 |
+
import streamlit as st
|
| 2 |
+
|
| 3 |
+
st.markdown(
|
| 4 |
+
"""
|
| 5 |
+
# KPI Anomaly Detection Documentation
|
| 6 |
+
|
| 7 |
+
## Overview
|
| 8 |
+
The KPI Anomaly Detection application automatically identifies and analyzes anomalies in Key Performance Indicators (KPIs) using change point detection algorithms. It flags significant shifts in KPI trends that may indicate network issues or other notable events.
|
| 9 |
+
|
| 10 |
+
## Features
|
| 11 |
+
|
| 12 |
+
### 1. Data Processing
|
| 13 |
+
- Supports both CSV and Excel file formats
|
| 14 |
+
- Automatic date parsing and data cleaning
|
| 15 |
+
- Handles missing values appropriately
|
| 16 |
+
- Processes multiple KPIs in a single run
|
| 17 |
+
|
| 18 |
+
### 2. Anomaly Detection
|
| 19 |
+
- Utilizes the PELT (Pruned Exact Linear Time) algorithm for change point detection
|
| 20 |
+
- Configurable penalty parameter to control sensitivity
|
| 21 |
+
- Identifies both sudden and gradual changes in KPI trends
|
| 22 |
+
- Filters out insignificant changes based on mean value differences
|
| 23 |
+
|
| 24 |
+
### 3. Visualization
|
| 25 |
+
- Interactive time series plots with Plotly
|
| 26 |
+
- Visual indicators for detected change points
|
| 27 |
+
- Displays initial and final mean values for comparison
|
| 28 |
+
- Responsive design for different screen sizes
|
| 29 |
+
|
| 30 |
+
### 4. Reporting
|
| 31 |
+
- Export detected anomalies to Excel
|
| 32 |
+
- Separate sheets for each KPI
|
| 33 |
+
- Includes all relevant data points and change point indicators
|
| 34 |
+
|
| 35 |
+
## Input Requirements
|
| 36 |
+
|
| 37 |
+
### Required File Format
|
| 38 |
+
- **CSV or Excel** file containing time series KPI data
|
| 39 |
+
- First 5 columns should be (in order):
|
| 40 |
+
1. Date/Time
|
| 41 |
+
2. Controller ID
|
| 42 |
+
3. BTS ID
|
| 43 |
+
4. Cell ID
|
| 44 |
+
5. DN (Distinguished Name)
|
| 45 |
+
- Remaining columns should contain KPI values
|
| 46 |
+
|
| 47 |
+
### Data Requirements
|
| 48 |
+
- At least 30 data points per cell for reliable detection
|
| 49 |
+
- Consistent time intervals between measurements
|
| 50 |
+
- Numeric values for KPI columns
|
| 51 |
+
|
| 52 |
+
## Usage
|
| 53 |
+
|
| 54 |
+
### 1. Upload Data
|
| 55 |
+
- Click "Upload KPI file" and select your CSV or Excel file
|
| 56 |
+
- The application will automatically detect the file format
|
| 57 |
+
|
| 58 |
+
### 2. Configure Detection
|
| 59 |
+
- Adjust the "Penalty" parameter to control sensitivity:
|
| 60 |
+
- Lower values = More sensitive (more change points detected)
|
| 61 |
+
- Higher values = Less sensitive (only major changes detected)
|
| 62 |
+
- Default value of 2.5 works well for most cases
|
| 63 |
+
|
| 64 |
+
### 3. Review Results
|
| 65 |
+
- The application will display a list of KPIs with detected anomalies
|
| 66 |
+
- Select a KPI and cell to view detailed analysis
|
| 67 |
+
- The plot shows:
|
| 68 |
+
- KPI values over time (blue line)
|
| 69 |
+
- Detected change points (red markers)
|
| 70 |
+
- Initial mean (gray dotted line)
|
| 71 |
+
- Final mean (black dashed line)
|
| 72 |
+
|
| 73 |
+
### 4. Export Results
|
| 74 |
+
- Click "Generate Excel file with anomalies" to export all detected anomalies
|
| 75 |
+
- Each KPI is saved in a separate sheet
|
| 76 |
+
- The Excel file includes all data points with change point indicators
|
| 77 |
+
|
| 78 |
+
## Technical Details
|
| 79 |
+
|
| 80 |
+
### Algorithm
|
| 81 |
+
- Uses the PELT (Pruned Exact Linear Time) algorithm from the `ruptures` library
|
| 82 |
+
- Model: RBF (Radial Basis Function) kernel cost, which detects changes in the distribution of the signal, including shifts in mean
|
| 83 |
+
- Automatic pruning of similar change points
|
| 84 |
+
|
| 85 |
+
### Performance Considerations
|
| 86 |
+
- Processing time depends on:
|
| 87 |
+
- Number of cells in the dataset
|
| 88 |
+
- Number of KPIs
|
| 89 |
+
- Length of the time series
|
| 90 |
+
- Large datasets may take several minutes to process
|
| 91 |
+
- Results are cached for better performance when adjusting parameters
|
| 92 |
+
|
| 93 |
+
## Troubleshooting
|
| 94 |
+
|
| 95 |
+
### Common Issues
|
| 96 |
+
1. **No anomalies detected**
|
| 97 |
+
- Try reducing the penalty value
|
| 98 |
+
- Check if the data contains enough variation
|
| 99 |
+
- Ensure there are at least 30 data points per cell
|
| 100 |
+
|
| 101 |
+
2. **Too many false positives**
|
| 102 |
+
- Increase the penalty value
|
| 103 |
+
- Check data for noise or outliers
|
| 104 |
+
- Consider pre-processing the data
|
| 105 |
+
|
| 106 |
+
3. **File format errors**
|
| 107 |
+
- Ensure the file is not open in another program
|
| 108 |
+
- Check that the file is not corrupted
|
| 109 |
+
- Verify the column structure matches requirements
|
| 110 |
+
|
| 111 |
+
## Best Practices
|
| 112 |
+
1. Start with the default penalty value (2.5) and adjust as needed
|
| 113 |
+
2. For large datasets, consider processing in smaller chunks
|
| 114 |
+
3. Review detected anomalies in context with network events or changes
|
| 115 |
+
4. Regularly update the application to get the latest improvements
|
| 116 |
+
|
| 117 |
+
## Dependencies
|
| 118 |
+
- Python 3.7+
|
| 119 |
+
- pandas
|
| 120 |
+
- numpy
|
| 121 |
+
- plotly
|
| 122 |
+
- ruptures
|
| 123 |
+
- streamlit
|
| 124 |
+
"""
|
| 125 |
+
)
|
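Note on the algorithm described in `anomaly_detection_doc.py`: the documentation says the app uses PELT from the `ruptures` library with an RBF model, a configurable penalty, and a filter on mean differences. The sketch below is not the app's code. It is a minimal, dependency-free illustration of the same idea, assuming a simpler squared-error (L2) cost instead of the RBF kernel. The names `pelt_l2`, `filter_by_mean_shift`, and `min_shift` are hypothetical and exist only for this example.

```python
from itertools import accumulate


def pelt_l2(signal, penalty):
    """Penalized change point detection with PELT pruning.

    Assumption: uses a squared-error (L2) cost, not the RBF cost
    mentioned in the documentation. Returns the indices at which a
    new segment starts.
    """
    n = len(signal)
    # Prefix sums give the cost of any segment in constant time.
    s1 = [0.0] + list(accumulate(float(x) for x in signal))
    s2 = [0.0] + list(accumulate(float(x) ** 2 for x in signal))

    def cost(a, b):
        # Sum of squared deviations from the mean over signal[a:b].
        total = s1[b] - s1[a]
        return (s2[b] - s2[a]) - total * total / (b - a)

    best = [0.0] * (n + 1)  # best[t]: minimal penalized cost of signal[:t]
    best[0] = -penalty
    prev = [0] * (n + 1)    # prev[t]: start of the last segment ending at t
    candidates = [0]
    for t in range(1, n + 1):
        scored = [(best[s] + cost(s, t) + penalty, s) for s in candidates]
        best[t], prev[t] = min(scored)
        # PELT pruning: drop starts that can no longer be optimal.
        candidates = [s for value, s in scored
                      if value - penalty <= best[t] + 1e-9]
        candidates.append(t)

    # Walk back through the stored segment starts.
    change_points = []
    t = n
    while prev[t] > 0:
        t = prev[t]
        change_points.append(t)
    return change_points[::-1]


def filter_by_mean_shift(signal, change_points, min_shift):
    """Keep change points whose neighbouring segment means differ by
    at least min_shift (a hypothetical threshold). Segments are not
    re-merged after a point is dropped; this is a simplification."""
    bounds = [0] + list(change_points) + [len(signal)]
    kept = []
    for i in range(1, len(bounds) - 1):
        left = signal[bounds[i - 1]:bounds[i]]
        right = signal[bounds[i]:bounds[i + 1]]
        shift = abs(sum(right) / len(right) - sum(left) / len(left))
        if shift >= min_shift:
            kept.append(bounds[i])
    return kept


# Synthetic KPI: a small step at index 10 and a large step at index 20.
kpi = [0] * 10 + [1] * 10 + [10] * 10
detected = pelt_l2(kpi, penalty=1)                        # [10, 20]
significant = filter_by_mean_shift(kpi, detected, min_shift=5)  # [20]
```

This mirrors the "Lower values = More sensitive" guidance in the documentation: raising `penalty` makes each extra segment more expensive, so fewer change points survive. The penalty scale depends on the cost function, so the default of 2.5 quoted for the app should not be carried over to this L2 sketch.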
documentations/trafic_analysis_doc.py
ADDED
|
@@ -0,0 +1,118 @@
| 1 |
+
import streamlit as st
|
| 2 |
+
|
| 3 |
+
st.markdown(
|
| 4 |
+
"""
|
| 5 |
+
|
| 6 |
+
# Traffic Analysis Application Documentation
|
| 7 |
+
|
| 8 |
+
## Overview
|
| 9 |
+
The Traffic Analysis application is a Streamlit-based tool designed to analyze and visualize 2G, 3G, and LTE network traffic data. It provides insights into voice and data traffic patterns, performs comparative analysis between different time periods, and generates visual reports.
|
| 10 |
+
|
| 11 |
+
## Features
|
| 12 |
+
|
| 13 |
+
### 1. Multi-Technology Support
|
| 14 |
+
- **2G Traffic Analysis**
|
| 15 |
+
- Voice traffic analysis
|
| 16 |
+
- Data traffic analysis
|
| 17 |
+
- PS (Packet Switched) traffic metrics
|
| 18 |
+
|
| 19 |
+
- **3G Traffic Analysis**
|
| 20 |
+
- CS (Circuit Switched) voice traffic
|
| 21 |
+
- PS data traffic
|
| 22 |
+
- Combined traffic metrics
|
| 23 |
+
|
| 24 |
+
- **LTE Traffic Analysis**
|
| 25 |
+
- UL/DL data traffic
|
| 26 |
+
- Combined traffic volume
|
| 27 |
+
|
| 28 |
+
### 2. Comparative Analysis
|
| 29 |
+
- Pre/Post period comparison
|
| 30 |
+
- Traffic trend analysis
|
| 31 |
+
- Monthly traffic aggregation
|
| 32 |
+
- Traffic growth/decline metrics
|
| 33 |
+
|
| 34 |
+
### 3. Visualization
|
| 35 |
+
- Interactive maps showing traffic distribution
|
| 36 |
+
- Top N sites analysis
|
| 37 |
+
- Traffic trend charts
|
| 38 |
+
- Comparative bar charts
|
| 39 |
+
|
| 40 |
+
## Input Requirements
|
| 41 |
+
|
| 42 |
+
### Required Files
|
| 43 |
+
1. **2G Traffic Report**
|
| 44 |
+
- Required columns: `BCF name`, `PERIOD_START_TIME`, `TRAFFIC_PS DL`, `PS_UL_Load`
|
| 45 |
+
- Format: CSV or ZIP containing CSV
|
| 46 |
+
|
| 47 |
+
2. **3G Traffic Report**
|
| 48 |
+
- Required columns: `WBTS name`, `PERIOD_START_TIME`, `Total CS traffic - Erl`, `Total_Data_Traffic`
|
| 49 |
+
- Format: CSV or ZIP containing CSV
|
| 50 |
+
|
| 51 |
+
3. **LTE Traffic Report**
|
| 52 |
+
- Required columns: `LNBTS name`, `PERIOD_START_TIME`, `4G/LTE DL Traffic Volume (GBytes)`, `4G/LTE UL Traffic Volume (GBytes)`
|
| 53 |
+
- Format: CSV or ZIP containing CSV
|
| 54 |
+
|
| 55 |
+
## Usage
|
| 56 |
+
|
| 57 |
+
### 1. Data Upload
|
| 58 |
+
- Upload the three required reports using the file uploaders
|
| 59 |
+
- Supported formats: CSV or ZIP (containing CSV)
|
| 60 |
+
|
| 61 |
+
### 2. Date Range Selection
|
| 62 |
+
- **Pre-period**: Select the baseline period for comparison
|
| 63 |
+
- **Post-period**: Select the period to compare against the baseline
|
| 64 |
+
- **Last period**: Select the most recent period for current analysis
|
| 65 |
+
|
| 66 |
+
### 3. Analysis Configuration
|
| 67 |
+
- Set the number of top traffic sites to display
|
| 68 |
+
- The application will automatically process and analyze the data
|
| 69 |
+
|
| 70 |
+
### 4. Results
|
| 71 |
+
- **Summary Analysis**: Overview of traffic metrics
|
| 72 |
+
- **Top Sites**: Lists and charts of highest traffic sites
|
| 73 |
+
- **Maps**: Geographical visualization of traffic distribution
|
| 74 |
+
- **Monthly Trends**: Traffic patterns over time
|
| 75 |
+
|
| 76 |
+
## Technical Implementation
|
| 77 |
+
|
| 78 |
+
### Key Functions
|
| 79 |
+
|
| 80 |
+
1. **Data Processing**
|
| 81 |
+
- `preprocess_2g()`: Processes 2G traffic data
|
| 82 |
+
- `preprocess_3g()`: Processes 3G traffic data
|
| 83 |
+
- `preprocess_lte()`: Processes LTE traffic data
|
| 84 |
+
|
| 85 |
+
2. **Analysis Functions**
|
| 86 |
+
- `merge_and_compare()`: Combines and compares traffic data across technologies
|
| 87 |
+
- `monthly_data_analysis()`: Aggregates data by month for trend analysis
|
| 88 |
+
|
| 89 |
+
3. **Visualization**
|
| 90 |
+
- Interactive maps using Plotly
|
| 91 |
+
- Bar charts for top sites
|
| 92 |
+
- Data tables with sortable columns
|
| 93 |
+
|
| 94 |
+
## Dependencies
|
| 95 |
+
- Python 3.7+
|
| 96 |
+
- pandas
|
| 97 |
+
- plotly
|
| 98 |
+
- streamlit
|
| 99 |
+
- numpy
|
| 100 |
+
|
| 101 |
+
## Output
|
| 102 |
+
The application generates:
|
| 103 |
+
1. Summary tables of traffic metrics
|
| 104 |
+
2. Interactive visualizations
|
| 105 |
+
3. Exportable reports in Excel format
|
| 106 |
+
|
| 107 |
+
## Best Practices
|
| 108 |
+
1. Ensure input files follow the required format
|
| 109 |
+
2. Select appropriate date ranges for meaningful comparison
|
| 110 |
+
3. Review top traffic sites for network optimization opportunities
|
| 111 |
+
4. Use the monthly analysis to identify long-term trends
|
| 112 |
+
|
| 113 |
+
## Troubleshooting
|
| 114 |
+
- If data doesn't load, check file formats and required columns
|
| 115 |
+
- Ensure date ranges are properly selected
|
| 116 |
+
- Verify that uploaded files contain valid data
|
| 117 |
+
"""
|
| 118 |
+
)
|
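Note on the comparison described in `trafic_analysis_doc.py`: the documentation mentions pre/post period comparison, growth/decline metrics, and monthly aggregation, but does not show how they are computed. The sketch below is one plausible reading, not the app's implementation. It assumes rows of `(site, day, traffic)` and uses only the standard library; the helpers `period_total`, `compare_periods`, and `monthly_totals` are hypothetical names and are not the `merge_and_compare()` or `monthly_data_analysis()` functions listed in the documentation.

```python
from collections import defaultdict
from datetime import date


def period_total(rows, start, end):
    """Sum traffic per site for rows whose day falls in [start, end]."""
    totals = defaultdict(float)
    for site, day, traffic in rows:
        if start <= day <= end:
            totals[site] += traffic
    return dict(totals)


def compare_periods(rows, pre, post):
    """Return {site: (pre_total, post_total, pct_change)}.

    pct_change is None when the baseline total is zero, because a
    relative change is undefined in that case.
    """
    pre_totals = period_total(rows, *pre)
    post_totals = period_total(rows, *post)
    result = {}
    for site in sorted(set(pre_totals) | set(post_totals)):
        before = pre_totals.get(site, 0.0)
        after = post_totals.get(site, 0.0)
        change = (after - before) / before * 100 if before else None
        result[site] = (before, after, change)
    return result


def monthly_totals(rows):
    """Aggregate traffic per (site, 'YYYY-MM') for trend analysis."""
    totals = defaultdict(float)
    for site, day, traffic in rows:
        totals[(site, day.strftime("%Y-%m"))] += traffic
    return dict(totals)


# Made-up sample data for illustration only.
rows = [
    ("SITE_A", date(2024, 1, 10), 100.0),
    ("SITE_A", date(2024, 1, 20), 100.0),
    ("SITE_A", date(2024, 2, 10), 150.0),
    ("SITE_A", date(2024, 2, 20), 150.0),
    ("SITE_B", date(2024, 1, 15), 80.0),
    ("SITE_B", date(2024, 2, 15), 60.0),
]
comparison = compare_periods(
    rows,
    pre=(date(2024, 1, 1), date(2024, 1, 31)),
    post=(date(2024, 2, 1), date(2024, 2, 29)),
)
# comparison["SITE_A"] -> (200.0, 300.0, 50.0)
# comparison["SITE_B"] -> (80.0, 60.0, -25.0)
by_month = monthly_totals(rows)
```

Returning `None` for a zero baseline keeps new sites visible in the result without reporting a misleading infinite growth figure.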