-
-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Empty results of some probes #212
Comments
Isn't this a case of the measuremnt timing out? I believe we don't have any explicit way to indicate per-measurement errors so this is what happens then. |
The question is why it timed out. Our current probes should be reliable enough to work. If you give me some probes I can try to pull the local logs. We probably need a status field to clearly mark what happened in each result. Did the whole probe time out because of bad internet, meaning the target endpoint is not at fault, did it time out because of an exception in which case we need to fix it, did the API fail and we need to fix something, did the target endpoint fail and the user can safely mark it as down |
no. A measurement timeout is handled by the server, and the server doesn't include any other value than |
Such output may occur for several reasons:
Also we are putting the reason of failed measurement (e.g. "Private IP ranges are not allowed") in I think we should add a
|
@jimaek @MartinKolarik what do you think? |
Not sure if the |
Idea is not to mix up reason of status and raw output.
@jimaek, how we are able to get logs from the probe?
I think overall measurement status logic may be left as is. We can just add "in progress" (or "not finished") status for progress messages. So when the overall measurement is finished we understand that all "in progress" measurements mean timeout. |
Probe logs are local and if we don't control it we have no way to access them |
maybe that's a potential idea for another feature - save probe logs to file, and make it accessible through the API? |
It sounds nice, but I can imagine it would be too complicated. e.g. hardware probes are read-only, there is no way for the user to interact. We could build a system to trigger log upload on adopted probes via the dashboard but that sounds like a huge project to implement correctly and securely |
So the new field that we are adding is Pushing "Private IP ranges are not allowed" to According to sending logs, maybe we can use socket.io for that? On probe side it can be simple |
Let's use values |
Seems we can override long running script's |
Assuming the |
or |
I considered that but don't think we should have a special type for each type of error. From the client's perspective, it's simply a failure and the raw output has the message with additional details that can be shown to the user. |
Ok, another thing is an extra load to redis. As I see the only way to that is:
So under high load when timeouts will happen more frequently we will have event more redis writes. |
We can do one set on the whole JSON document, can't we? Not sure how exactly redis implements the individual field operations, but I imagine if there are more than a few fields to update, the whole document set might be faster. |
Yes, seems whole document set is not a perfect, but a better option. |
So, we added statuses and they help with timeouts, but there are still a few cases when we are able to get the same
We need to differentiate them, possible options are: Option 1: Add messages to the rawOutput.
Option 2: Use
Option 3: Use other
Addition. Here is the example of execa error object: {
shortMessage: "Command failed with exit code 68: unbuffer ping -c 3 -i 0.2 asdf.asdf",
command: "unbuffer ping -c 3 -i 0.2 asdf.asdf",
escapedCommand: "unbuffer ping -c 3 -i 0.2 asdf.asdf",
exitCode: 68, // for some reason it is 68 for ping of unknown host
signal: undefined,
signalDescription: undefined,
stdout: "ping: cannot resolve asdf.asdf: Unknown host",
stderr: "",
failed: true,
timedOut: false,
isCanceled: false,
killed: false,
} Here is the example of execa successful object: {
command: "unbuffer ping -c 3 -i 0.2 google.com",
escapedCommand: "unbuffer ping -c 3 -i 0.2 google.com",
exitCode: 0,
stdout: "PING google.com (142.250.75.14): 56 data bytes\n64 bytes from 142.250.75.14: icmp_seq=0 ttl=120 time=7.826 ms\n64 bytes from 142.250.75.14: icmp_seq=1 ttl=120 time=6.043 ms\n64 bytes from 142.250.75.14: icmp_seq=2 ttl=120 time=7.785 ms\n\n--- google.com ping statistics ---\n3 packets transmitted, 3 packets received, 0.0% packet loss\nround-trip min/avg/max/stddev = 6.043/7.218/7.826/0.831 ms",
stderr: "",
all: undefined,
failed: false,
timedOut: false,
isCanceled: false,
killed: false,
} The best option from my perspective is to use |
Solution 1 makes the most sense to me. The message itself may change depending on the exit code (or depending on the execa fields as in option 3) if you feel it explains the problem better to the user / helps us in debugging. |
Maybe we can also locally log with a lot more detail when issues like that happen? To make it easier to debug in case a user complains that a test failed. |
Sounds valid, I'll prepare a draft PR for a ping command. |
Here it is: |
So, there are only 4 faulty probes, they constantly respond with “Test failed. Please try again.” in case of dns, ping, traceroute or mtr measurements. http type is working fine. Seems like these probes are behind a firewall or smth that blocks everything except http. Issue is still valid so I am not closing it. But to proceed we should either:
|
Fixed by #52 |
Sometimes some of the probes of the
finished
measurements have result fields. Maybe data is not sent by the probe or API didn't handle it properly. We need to adress it.Steps to reproduce (may not reproduce every time):
status
isfinished
result
fieldsExpected value:
some valid result data
Actual value:
The text was updated successfully, but these errors were encountered: