DavMelchi committed on
Commit 3d9465e · 1 Parent(s): db6bec9

Improve documentation

app.py CHANGED
@@ -196,6 +196,14 @@ if check_password():
  "documentations/lte_capacity_docs.py",
  title="📘LTE Capacity Documentation",
  ),
+ st.Page(
+ "documentations/trafic_analysis_doc.py",
+ title="📘Trafic Analysis Documentation",
+ ),
+ st.Page(
+ "documentations/anomaly_detection_doc.py",
+ title="📘Anomaly Detection Documentation",
+ ),
  ],
  }
 
apps/clustering.py CHANGED
@@ -137,8 +137,12 @@ st.title("Automatic Site Clustering App")
  # Add description
  st.write(
  """This app allows you to cluster sites based on their latitude and longitude.
- **Please choose a file containing the latitude and longitude region and site code columns.**
- """
+
+ Please choose a file containing :
+ - latitude and longitude columns
+ - region column
+ - site code column
+ """
  )
  
  # Download Sample file
documentations/anomaly_detection_doc.py ADDED
@@ -0,0 +1,125 @@
+ import streamlit as st
+
+ st.markdown(
+ """
+ # KPI Anomaly Detection Documentation
+
+ ## Overview
+ The KPI Anomaly Detection application is designed to automatically identify and analyze anomalies in Key Performance Indicators (KPIs) using change point detection algorithms. It helps in identifying significant changes in KPI trends that may indicate network issues or other important events.
+
+ ## Features
+
+ ### 1. Data Processing
+ - Supports both CSV and Excel file formats
+ - Automatic date parsing and data cleaning
+ - Handles missing values appropriately
+ - Processes multiple KPIs in a single run
+
+ ### 2. Anomaly Detection
+ - Utilizes the PELT (Pruned Exact Linear Time) algorithm for change point detection
+ - Configurable penalty parameter to control sensitivity
+ - Identifies both sudden and gradual changes in KPI trends
+ - Filters out insignificant changes based on mean value differences
+
+ ### 3. Visualization
+ - Interactive time series plots with Plotly
+ - Visual indicators for detected change points
+ - Displays initial and final mean values for comparison
+ - Responsive design for different screen sizes
+
+ ### 4. Reporting
+ - Export detected anomalies to Excel
+ - Separate sheets for each KPI
+ - Includes all relevant data points and change point indicators
+
+ ## Input Requirements
+
+ ### Required File Format
+ - **CSV or Excel** file containing time series KPI data
+ - First 5 columns should be (in order):
+ 1. Date/Time
+ 2. Controller ID
+ 3. BTS ID
+ 4. Cell ID
+ 5. DN (Directory Number)
+ - Remaining columns should contain KPI values
+
+ ### Data Requirements
+ - At least 30 data points per cell for reliable detection
+ - Consistent time intervals between measurements
+ - Numeric values for KPI columns
+
+ ## Usage
+
+ ### 1. Upload Data
+ - Click "Upload KPI file" and select your CSV or Excel file
+ - The application will automatically detect the file format
+
+ ### 2. Configure Detection
+ - Adjust the "Penalty" parameter to control sensitivity:
+ - Lower values = More sensitive (more change points detected)
+ - Higher values = Less sensitive (only major changes detected)
+ - Default value of 2.5 works well for most cases
+
+ ### 3. Review Results
+ - The application will display a list of KPIs with detected anomalies
+ - Select a KPI and cell to view detailed analysis
+ - The plot shows:
+ - KPI values over time (blue line)
+ - Detected change points (red markers)
+ - Initial mean (gray dotted line)
+ - Final mean (black dashed line)
+
+ ### 4. Export Results
+ - Click "Generate Excel file with anomalies" to export all detected anomalies
+ - Each KPI is saved in a separate sheet
+ - The Excel file includes all data points with change point indicators
+
+ ## Technical Details
+
+ ### Algorithm
+ - Uses the PELT (Pruned Exact Linear Time) algorithm from the `ruptures` library
+ - Model: RBF (Radial Basis Function) kernel for detecting changes in mean
+ - Automatic pruning of similar change points
+
+ ### Performance Considerations
+ - Processing time depends on:
+ - Number of cells in the dataset
+ - Number of KPIs
+ - Length of the time series
+ - Large datasets may take several minutes to process
+ - Results are cached for better performance when adjusting parameters
+
+ ## Troubleshooting
+
+ ### Common Issues
+ 1. **No anomalies detected**
+ - Try reducing the penalty value
+ - Check if the data contains enough variation
+ - Ensure there are at least 30 data points per cell
+
+ 2. **Too many false positives**
+ - Increase the penalty value
+ - Check data for noise or outliers
+ - Consider pre-processing the data
+
+ 3. **File format errors**
+ - Ensure the file is not open in another program
+ - Check that the file is not corrupted
+ - Verify the column structure matches requirements
+
+ ## Best Practices
+ 1. Start with the default penalty value (2.5) and adjust as needed
+ 2. For large datasets, consider processing in smaller chunks
+ 3. Review detected anomalies in context with network events or changes
+ 4. Regularly update the application to get the latest improvements
+
+ ## Dependencies
+ - Python 3.7+
+ - pandas
+ - numpy
+ - plotly
+ - ruptures
+ - streamlit
+ """
+ )
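Reviewer note: `anomaly_detection_doc.py` says the app "filters out insignificant changes based on mean value differences", but this commit does not show how. The sketch below is one plausible reading, not the app's actual code: given candidate change points (for example, the indices a PELT run might return), it keeps only those where the segment mean shifts by more than a relative threshold. The function name `filter_change_points` and the `min_rel_diff` parameter are hypothetical and do not appear in the repository.

```python
from statistics import mean


def filter_change_points(values, change_points, min_rel_diff=0.1):
    """Keep change points whose neighbouring segment means differ enough.

    values: numeric KPI samples in time order.
    change_points: sorted indices (0 < idx < len(values)) where a new
        segment is assumed to start.
    min_rel_diff: hypothetical threshold on |after - before| / |before|.
    """
    kept = []
    start = 0  # start index of the segment before the current candidate
    bounds = list(change_points) + [len(values)]
    for cp, next_bound in zip(bounds[:-1], bounds[1:]):
        before = mean(values[start:cp])
        after = mean(values[cp:next_bound])
        scale = max(abs(before), 1e-9)  # avoid division by zero
        if abs(after - before) / scale >= min_rel_diff:
            kept.append(cp)
            start = cp  # accepted: the next comparison starts here
        # rejected: the candidate's segment merges into the previous one
    return kept


if __name__ == "__main__":
    kpi = [10.0] * 10 + [10.2] * 10 + [20.0] * 10
    print(filter_change_points(kpi, [10, 20]))
```

With this threshold the small 10.0 to 10.2 step is dropped and only the jump to 20.0 is kept; lowering `min_rel_diff` keeps both, much as lowering the penalty makes detection more sensitive.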
documentations/trafic_analysis_doc.py ADDED
@@ -0,0 +1,118 @@
+ import streamlit as st
+
+ st.markdown(
+ """
+
+ # Traffic Analysis Application Documentation
+
+ ## Overview
+ The Traffic Analysis application is a Streamlit-based tool designed to analyze and visualize 2G, 3G, and LTE network traffic data. It provides insights into voice and data traffic patterns, performs comparative analysis between different time periods, and generates visual reports.
+
+ ## Features
+
+ ### 1. Multi-Technology Support
+ - **2G Traffic Analysis**
+ - Voice traffic analysis
+ - Data traffic analysis
+ - PS (Packet Switched) traffic metrics
+
+ - **3G Traffic Analysis**
+ - CS (Circuit Switched) voice traffic
+ - PS data traffic
+ - Combined traffic metrics
+
+ - **LTE Traffic Analysis**
+ - UL/DL data traffic
+ - Combined traffic volume
+
+ ### 2. Comparative Analysis
+ - Pre/Post period comparison
+ - Traffic trend analysis
+ - Monthly traffic aggregation
+ - Traffic growth/decline metrics
+
+ ### 3. Visualization
+ - Interactive maps showing traffic distribution
+ - Top N sites analysis
+ - Traffic trend charts
+ - Comparative bar charts
+
+ ## Input Requirements
+
+ ### Required Files
+ 1. **2G Traffic Report**
+ - Required columns: `BCF name`, `PERIOD_START_TIME`, `TRAFFIC_PS DL`, `PS_UL_Load`
+ - Format: CSV or ZIP containing CSV
+
+ 2. **3G Traffic Report**
+ - Required columns: `WBTS name`, `PERIOD_START_TIME`, `Total CS traffic - Erl`, `Total_Data_Traffic`
+ - Format: CSV or ZIP containing CSV
+
+ 3. **LTE Traffic Report**
+ - Required columns: `LNBTS name`, `PERIOD_START_TIME`, `4G/LTE DL Traffic Volume (GBytes)`, `4G/LTE UL Traffic Volume (GBytes)`
+ - Format: CSV or ZIP containing CSV
+
+ ## Usage
+
+ ### 1. Data Upload
+ - Upload the three required reports using the file uploaders
+ - Supported formats: CSV or ZIP (containing CSV)
+
+ ### 2. Date Range Selection
+ - **Pre-period**: Select the baseline period for comparison
+ - **Post-period**: Select the period to compare against the baseline
+ - **Last period**: Select the most recent period for current analysis
+
+ ### 3. Analysis Configuration
+ - Set the number of top traffic sites to display
+ - The application will automatically process and analyze the data
+
+ ### 4. Results
+ - **Summary Analysis**: Overview of traffic metrics
+ - **Top Sites**: Lists and charts of highest traffic sites
+ - **Maps**: Geographical visualization of traffic distribution
+ - **Monthly Trends**: Traffic patterns over time
+
+ ## Technical Implementation
+
+ ### Key Functions
+
+ 1. **Data Processing**
+ - preprocess_2g(): Processes 2G traffic data
+ - preprocess_3g(): Processes 3G traffic data
+ - preprocess_lte(): Processes LTE traffic data
+
+ 2. **Analysis Functions**
+ - merge_and_compare(): Combines and compares traffic data across technologies
+ - monthly_data_analysis(): Aggregates data by month for trend analysis
+
+ 3. **Visualization**
+ - Interactive maps using Plotly
+ - Bar charts for top sites
+ - Data tables with sortable columns
+
+ ## Dependencies
+ - Python 3.7+
+ - pandas
+ - plotly
+ - streamlit
+ - numpy
+
+ ## Output
+ The application generates:
+ 1. Summary tables of traffic metrics
+ 2. Interactive visualizations
+ 3. Exportable reports in Excel format
+
+ ## Best Practices
+ 1. Ensure input files follow the required format
+ 2. Select appropriate date ranges for meaningful comparison
+ 3. Review top traffic sites for network optimization opportunities
+ 4. Use the monthly analysis to identify long-term trends
+
+ ## Troubleshooting
+ - If data doesn't load, check file formats and required columns
+ - Ensure date ranges are properly selected
+ - Verify that uploaded files contain valid data
+ """
+ )
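Reviewer note: `trafic_analysis_doc.py` describes a pre/post period comparison with growth/decline metrics, but the commit does not include the computation. The sketch below shows roughly what such a comparison could look like using only the standard library. It is an illustration under assumptions, not the app's `merge_and_compare()`: the function name `compare_periods`, the `(site, day, traffic)` row shape, and the ISO date strings are all hypothetical.

```python
from collections import defaultdict


def compare_periods(rows, pre, post):
    """Sum traffic per site over two date windows and compute the change.

    rows: iterable of (site, day, traffic), with day as an ISO date string
        ("YYYY-MM-DD"), which compares correctly as plain text.
    pre, post: (start, end) tuples of ISO date strings, both inclusive.
    Returns {site: {"pre": float, "post": float, "change_pct": float | None}}.
    """
    totals = defaultdict(lambda: {"pre": 0.0, "post": 0.0})
    for site, day, traffic in rows:
        if pre[0] <= day <= pre[1]:
            totals[site]["pre"] += traffic
        elif post[0] <= day <= post[1]:
            totals[site]["post"] += traffic
        # rows outside both windows are ignored

    result = {}
    for site, t in totals.items():
        # Percentage change is undefined when the baseline is zero.
        if t["pre"] == 0:
            change = None
        else:
            change = (t["post"] - t["pre"]) / t["pre"] * 100
        result[site] = {"pre": t["pre"], "post": t["post"], "change_pct": change}
    return result


if __name__ == "__main__":
    sample = [
        ("SITE_A", "2024-01-15", 100.0),
        ("SITE_A", "2024-02-15", 150.0),
        ("SITE_B", "2024-02-15", 5.0),
    ]
    report = compare_periods(
        sample, ("2024-01-01", "2024-01-31"), ("2024-02-01", "2024-02-29")
    )
    for site, stats in report.items():
        print(site, stats)
```

Returning `None` for a zero baseline keeps sites that only appear in the post-period visible without reporting a misleading infinite growth figure.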