Matthew Sevey
31cf9fb59e
Disable load check until we have a process of actively addressing it
2021-11-24 14:13:26 -05:00
Karol Wypchlo
9026d56777
fix health check script invalid syntax
...
python lint job
2021-10-06 14:21:58 +02:00
Matthew Sevey
b75c0c2c3b
Merge pull request #1178 from SkynetLabs/round-full-hours
...
round reporting datetime to full hours
2021-09-10 14:25:30 -04:00
Karol Wypchlo
bcea4d5b90
do not send message on server down
2021-09-10 17:00:37 +02:00
Karol Wypchlo
4fddbb2fc8
do not notify on portal disabled
2021-09-10 16:23:05 +02:00
Karol Wypchlo
5ba4f54302
round reporting datetime to full hours
2021-09-10 13:24:19 +02:00
Karol Wypchlo
72f2b56f17
include skapps checks
2021-08-23 16:21:09 +02:00
Karol Wypchło
1c8816530c
cleanup unused discord import (#1073)
2021-08-17 09:58:00 +02:00
Karol Wypchło
379f87ea27
increase disk space size that warrants a warning (#1063)
2021-08-12 13:14:00 +02:00
Karol Wypchło
71f9d5280e
use webhook instead of discord bot to send messages (#979)
...
* initial refactor
* do not use before define
* forgot to remove client
* test notification
* add /cc
* fix /cc
* fix /cc role
* fix /cc
* test file upload
* test file upload
* test file upload
* default to no mentions
* unformat
* replace discord with DiscordWebhook
* add readme
* don't fail on failures in message send
2021-07-16 13:12:58 +02:00
Karol Wypchlo
36aa7c8311
improve health check reliability
2021-07-12 14:53:12 +02:00
Karol Wypchlo
a2aa850632
improve health check reliability
2021-07-12 14:49:53 +02:00
Karol Wypchlo
7fd97b5824
improve health check reliability
2021-07-12 14:48:13 +02:00
Karol Wypchlo
49bb6dd2e2
fix portal size check reporting zero files
2021-06-15 11:41:01 +02:00
Karol Wypchło
b8a6816876
fixed health check blowing up on eu-fin-3 (#838)
...
* request 127.0.0.1 over https - http localhost causes issues
* reformat with black
2021-06-07 15:08:18 +02:00
Karol Wypchlo
cd7dac5b7e
verbose => extended
2021-04-29 13:43:40 +02:00
Karol Wypchlo
1c99da3af8
fix repair string
2021-04-14 12:17:01 +02:00
Karol Wypchlo
f48a8d9302
fix health-check
2021-04-13 16:19:42 +02:00
Matthew Sevey
c752a17058
Update setup-scripts/health-checker.py
2021-02-03 10:30:27 -07:00
Matthew Sevey
50dff35da8
Update setup-scripts/health-checker.py
...
Co-authored-by: Marcin S. <scatman@bu.edu>
2021-02-03 10:22:24 -07:00
Matthew Sevey
ff183beb66
Add repair size information to health checker
2021-02-03 09:42:55 -07:00
Karol Wypchło
c0673b3f76
do not ping when server is in maintenance mode (#552)
2020-12-01 13:31:59 +01:00
Matthew Sevey
5f76d1ca52
remove error alert notification, subtract out siafile alerts
2020-11-24 07:49:51 -07:00
Karol Wypchlo
2dfb6d6a56
restore "or"
2020-11-24 15:26:51 +01:00
Karol Wypchlo
383144b7a6
tweak notifications on number of files in a node
2020-11-24 13:16:25 +01:00
Karol Wypchlo
7946f97d58
tweak notifications on error alerts
2020-11-24 13:08:08 +01:00
Ivaylo Novakov
41460f155f
Moved the container name var to the global space where it belongs.
2020-11-20 22:08:04 +01:00
Ivaylo Novakov
801597ccde
Fixed some typos.
...
Fixed formatting (force of habit...).
2020-11-20 21:45:19 +01:00
Matthew Sevey
05cd1bfb32
fix weird formatting
2020-11-20 11:46:35 -07:00
Matthew Sevey
a337b754a8
run python format
2020-11-20 11:33:07 -07:00
Matthew Sevey
efc6060924
scripts: update file health check to check siac output. Add total files check
2020-11-20 11:26:20 -07:00
Matthew Sevey
243d084b5d
update message for siafile bad health
2020-11-18 11:04:04 -07:00
Matthew Sevey
09a4b646ec
scripts: add alert check to the python scripts
2020-11-18 10:21:06 -07:00
Karol Wypchlo
1922c4cd98
use os.popen manually
2020-10-06 12:12:19 +02:00
Karol Wypchlo
9b6d61aa7e
remove unnecessary time dependency
2020-10-06 11:27:06 +02:00
Karol Wypchlo
60f8371170
stop sia container on critical disk space threshold
2020-10-06 11:24:18 +02:00
Karol Wypchlo
2328e605b7
parse disk size as int before multiplying
2020-10-05 10:03:10 +02:00
Karol Wypchło
e58752571e
add response content to health check failures (#437)
2020-09-30 16:20:55 +02:00
Karol Wypchło
10a251c081
reimplement health checks (#434)
2020-09-29 12:32:45 +02:00
Karol Wypchlo
20362fe7c5
fix health checks
2020-09-10 15:16:31 +02:00
Ivaylo Novakov
8235d75795
Only announce healthy status once a day.
2020-09-08 18:20:56 +02:00
Ivaylo Novakov
ddf72ad850
Make the time comparisons in the health checker timezone-aware.
2020-09-08 18:07:33 +02:00
Ivaylo Novakov
2d032dbf17
Docstrings.
2020-09-07 17:59:39 +02:00
Ivaylo Novakov
0838e4f5e5
Add free disk space check to health-checker.py.
...
Move load-average check to health-checker.py.
2020-09-07 17:56:47 +02:00
Ivaylo Novakov
3f4742a436
Only notify the team if critical checks have failed.
2020-09-04 17:17:26 +02:00
Ivaylo Novakov
5eece67b03
Move parameter parsing to the top of the script.
2020-09-04 17:13:36 +02:00
Ivaylo Novakov
1cc20903c6
Move max discord message len to a constant.
...
Report critical checks failed.
Formatting.
2020-09-04 17:07:47 +02:00
Ivaylo Novakov
a0a9137ae7
Update setup-scripts/health-checker.py
...
Co-authored-by: Karol Wypchło <kwypchlo@gmail.com>
2020-09-04 16:44:19 +02:00
Ivaylo Novakov
62e27120cd
Use localhost.
2020-09-04 16:39:39 +02:00
Ivaylo Novakov
59a77bfaf6
Add a health checker script to Gollum.
2020-09-04 16:12:20 +02:00