Hi @martinrusev, I've found a bug where some alerts for custom plugins cause an error and completely mess up all other alerts.
Test case:
4 similar servers, each with a custom.plugins.conf file. All custom metrics work fine (custom alert name in the Amon alert rule: "temperature.temperature.cpu").
setup alert : server = server1 | metric=temperature.temperature.cpu | more than = 20 | For = 3 minutes
setup alert : server = server2 | metric=temperature.temperature.cpu | more than = 20 | For = 3 minutes
setup alert : server = server3 | metric=temperature.temperature.cpu | more than = 20 | For = 3 minutes
Up to this point everything works fine and there is no error in amon_request.log. If we lower the alert threshold to test the alert, we receive all notifications according to the "For" duration, with no errors at all.
setup alert : server = server4 | metric=temperature.temperature.cpu | more than = 20 | For = 3 minutes
After adding this fourth alert, we get the following error:
ERROR Internal Server Error: /api/system/v2/
Traceback (most recent call last):
  File "/opt/amon/env/lib/python3.5/site-packages/django/core/handlers/exception.py", line 41, in inner
    response = get_response(request)
  File "/opt/amon/env/lib/python3.5/site-packages/django/core/handlers/base.py", line 249, in _legacy_get_response
    response = self._get_response(request)
  File "/opt/amon/env/lib/python3.5/site-packages/django/core/handlers/base.py", line 187, in _get_response
    response = self.process_exception_by_middleware(e, request)
  File "/opt/amon/env/lib/python3.5/site-packages/django/core/handlers/base.py", line 185, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/opt/amon/env/lib/python3.5/site-packages/django/views/decorators/csrf.py", line 58, in wrapped_view
    return view_func(*args, **kwargs)
  File "/opt/amon/env/lib/python3.5/site-packages/django/views/generic/base.py", line 68, in view
    return self.dispatch(request, *args, **kwargs)
  File "/opt/amon/env/lib/python3.5/site-packages/rest_framework/views.py", line 477, in dispatch
    response = self.handle_exception(exc)
  File "/opt/amon/env/lib/python3.5/site-packages/rest_framework/views.py", line 437, in handle_exception
    self.raise_uncaught_exception(exc)
  File "/opt/amon/env/lib/python3.5/site-packages/rest_framework/views.py", line 474, in dispatch
    response = handler(request, *args, **kwargs)
  File "/opt/amon/amon/apps/api/views/core.py", line 66, in post
    api_model.save_data_to_backend(server=server, data=data)
  File "/opt/amon/amon/apps/api/models.py", line 94, in save_data_to_backend
    plugin_alerter.check(data=data, plugin=plugin, server=server)
  File "/opt/amon/amon/apps/alerts/alerter.py", line 106, in check
    alert = plugin_alerts.check(data=plugin_data, rule=rule)
  File "/opt/amon/amon/apps/alerts/checkers/plugin.py", line 32, in check
    incoming_value = data.get(key_name)
AttributeError: 'NoneType' object has no attribute 'get'
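The last frame shows `data` being `None` when `data.get(key_name)` is called. As a rough illustration of the kind of guard that would turn this into a skipped check instead of a 500, here is a minimal sketch. It is not the actual Amon code: `check_plugin_rule`, its parameters, and the "skip when no data" behaviour are all assumptions made for the example.

```python
def check_plugin_rule(data, key_name, threshold):
    """Hypothetical stand-in for a plugin alert check.

    Returns True when the incoming value exceeds the threshold,
    and False when the rule does not fire or no usable data arrived.
    """
    # Guard: the traceback shows `data` can be None, so do not call .get() on it.
    if not isinstance(data, dict):
        return False

    incoming_value = data.get(key_name)
    if incoming_value is None:
        # The key is missing from the payload; treat it as "nothing to check".
        return False

    try:
        return float(incoming_value) > threshold
    except (TypeError, ValueError):
        # Non-numeric value; skip rather than raise.
        return False


if __name__ == "__main__":
    print(check_plugin_rule(None, "cpu", 20))         # data missing entirely
    print(check_plugin_rule({"cpu": 45}, "cpu", 20))  # value above threshold
```

A guard like this would only hide the symptom; it would not explain why no data is found for the rule in the first place.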
Even with this error, if we lower the alert threshold for server4 (the one that actually produces the error), we still get the notifications according to the "For" duration we set up.
However, if we add more and more servers that also produce this error, at some point the alerts get messed up, and we may receive 6 notifications for the same alert with the same timestamp.
The error disappears if we delete the alert for the specific servers that produce it.
All custom plugin graphs work fine and the values are reported correctly; this was also tested with the amonagent test for the custom plugin.
All servers run the same OS, and the custom plugins are exactly the same, running under the same user with the same permissions.
I'm going to investigate the issue further, but if you have any idea what else to check, please let me know.
@martinrusev In case it helps: if we change the plugin name in "/etc/opt/amonagent/plugins-enabled/custom.conf" from temperature to, for example, temp, everything works fine for the servers that were causing errors.