Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pi fails to respond to SSH when copying music from CIFS share #263

Open
fatfurrycat opened this issue Sep 17, 2019 · 13 comments
Open

Pi fails to respond to SSH when copying music from CIFS share #263

fatfurrycat opened this issue Sep 17, 2019 · 13 comments

Comments

@fatfurrycat
Copy link

fatfurrycat commented Sep 17, 2019

I've just setup teslausb using the latest image and config script.
I've attached my config script.

Upon initial reboot following install, I am able to SSH to the Pi. I was also able to change password for Pi user. The /boot directory contained TESLAUSB_SETUP_FINISHED and WIFI_ENABLED files.

However, I was unable to see the Pi as a USB drive, so I issued a reboot from the command line.

When the Pi had rebooted, I can see from my network monitor that it is transferring music files from my CIFS share on my Synology NAS, which I'm assuming is why I can't see the device as a USB drive on my PC.

However, I'm also no longer able to SSH into the Pi to check on progress of the copy. I've tried to SSH from Fedora, Cygwin and Putty. All tools simply hang and no login prompt is presented. I also see no timeouts.

Debug output from ssh client:
$ ssh -v [email protected]
OpenSSH_8.0p1, OpenSSL 1.1.1c FIPS 28 May 2019
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: Reading configuration data /etc/crypto-policies/back-ends/openssh.config
debug1: configuration requests final Match pass
debug1: re-parsing configuration
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: Reading configuration data /etc/crypto-policies/back-ends/openssh.config
debug1: Connecting to 192.168.0.216 [192.168.0.216] port 22.
debug1: Connection established.
debug1: identity file /home/teslausb/.ssh/id_rsa type 0
debug1: identity file /home/teslausb/.ssh/id_rsa-cert type -1
debug1: identity file /home/teslausb/.ssh/id_dsa type -1
debug1: identity file /home/teslausb/.ssh/id_dsa-cert type -1
debug1: identity file /home/teslausb/.ssh/id_ecdsa type -1
debug1: identity file /home/teslausb/.ssh/id_ecdsa-cert type -1
debug1: identity file /home/teslausb/.ssh/id_ed25519 type -1
debug1: identity file /home/teslausb/.ssh/id_ed25519-cert type -1
debug1: identity file /home/teslausb/.ssh/id_xmss type -1
debug1: identity file /home/teslausb/.ssh/id_xmss-cert type -1
debug1: Local version string SSH-2.0-OpenSSH_8.0

If I telnet to port 22, then I see:
telnet> open teslausb.local 22
Trying 192.168.0.216...
Connected to teslausb.local.
Escape character is '^]'.

I get no further response.

I can get a response from the Pi using ping, but that's about it:
$ ping 192.168.0.216
PING 192.168.0.216 (192.168.0.216) 56(84) bytes of data.
64 bytes from 192.168.0.216: icmp_seq=7 ttl=64 time=4.92 ms
64 bytes from 192.168.0.216: icmp_seq=8 ttl=64 time=5.13 ms
64 bytes from 192.168.0.216: icmp_seq=9 ttl=64 time=5.06 ms
64 bytes from 192.168.0.216: icmp_seq=10 ttl=64 time=5.14 ms

If I try to connect to the 'TESLAUSB WIFI' AP, then I get the following in the system journal:

Sep 17 15:18:27 lenovo-p50 NetworkManager[1960]: [1568729907.4470] device (wlp4s0): Activation: starting connection 'TESLAUSB WIFI' (2291cc3c-cf7e-47b5-bd8a-575ec391aaa2)
Sep 17 15:18:27 lenovo-p50 NetworkManager[1960]: [1568729907.4507] sup-iface[0x55f30479e0e0,wlp4s0]: connection disconnected (reason -3)
Sep 17 15:18:27 lenovo-p50 NetworkManager[1960]: [1568729907.4508] device (wlp4s0): supplicant interface state: completed -> disconnected
Sep 17 15:18:27 lenovo-p50 NetworkManager[1960]: [1568729907.4508] device (p2p-dev-wlp4s0): supplicant management interface state: completed -> disconnected
Sep 17 15:18:27 lenovo-p50 systemd[1]: Starting Network Manager Script Dispatcher Service...
Sep 17 15:18:27 lenovo-p50 NetworkManager[1960]: [1568729907.4581] device (wlp4s0): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')
Sep 17 15:18:27 lenovo-p50 systemd[1]: Started Network Manager Script Dispatcher Service.
Sep 17 15:18:27 lenovo-p50 audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr>
Sep 17 15:18:27 lenovo-p50 nm-dispatcher[15044]: req:1 'down' [wlp4s0]: new request (3 scripts)
Sep 17 15:18:27 lenovo-p50 nm-dispatcher[15044]: req:1 'down' [wlp4s0]: start running ordered scripts...
Sep 17 15:18:27 lenovo-p50 wpa_supplicant[2148]: wlp4s0: CTRL-EVENT-REGDOM-CHANGE init=DRIVER type=WORLD
Sep 17 15:18:27 lenovo-p50 NetworkManager[1960]: [1568729907.5679] device (wlp4s0): set-hw-addr: reset MAC address to 7C:76:35:E9:2A:91 (preserve)
Sep 17 15:18:27 lenovo-p50 NetworkManager[1960]: [1568729907.5756] device (wlp4s0): supplicant interface state: disconnected -> disabled
Sep 17 15:18:27 lenovo-p50 NetworkManager[1960]: [1568729907.5756] device (p2p-dev-wlp4s0): supplicant management interface state: disconnected -> disabled
Sep 17 15:18:27 lenovo-p50 NetworkManager[1960]: [1568729907.5762] device (wlp4s0): state change: prepare -> config (reason 'none', sys-iface-state: 'managed')
Sep 17 15:18:27 lenovo-p50 NetworkManager[1960]: [1568729907.5775] device (wlp4s0): Activation: (wifi) access point 'TESLAUSB WIFI' has security, but secrets are required.
Sep 17 15:18:27 lenovo-p50 NetworkManager[1960]: [1568729907.5776] device (wlp4s0): state change: config -> need-auth (reason 'none', sys-iface-state: 'managed')
Sep 17 15:18:27 lenovo-p50 NetworkManager[1960]: [1568729907.5973] device (wlp4s0): supplicant interface state: disabled -> disconnected
Sep 17 15:18:27 lenovo-p50 NetworkManager[1960]: [1568729907.5974] device (p2p-dev-wlp4s0): supplicant management interface state: disabled -> disconnected

and in dmesg, I see:

[1751658.370892] wlp4s0: deauthenticating from 2c:56:dc:da:6e:64 by local choice (Reason: 3=DEAUTH_LEAVING)
[1751659.667021] wlp4s0: authenticate with b8:27:eb:a1:62:4e
[1751659.675268] wlp4s0: send auth to b8:27:eb:a1:62:4e (try 1/3)
[1751659.682742] wlp4s0: authenticated
[1751659.683134] iwlwifi 0000:04:00.0 wlp4s0: disabling HT as WMM/QoS is not supported by the AP
[1751659.683144] iwlwifi 0000:04:00.0 wlp4s0: disabling VHT as WMM/QoS is not supported by the AP
[1751659.684508] wlp4s0: associate with b8:27:eb:a1:62:4e (try 1/3)
[1751659.698658] wlp4s0: RX AssocResp from b8:27:eb:a1:62:4e (capab=0x411 status=0 aid=1)
[1751659.703841] wlp4s0: associated
[1751659.722097] wlp4s0: Limiting TX power to 30 (30 - 0) dBm as advertised by b8:27:eb:a1:62:4e
[1751669.704522] wlp4s0: deauthenticating from b8:27:eb:a1:62:4e by local choice (Reason: 3=DEAUTH_LEAVING)
[1751671.041099] wlp4s0: authenticate with b8:27:eb:a1:62:4e
[1751671.049531] wlp4s0: send auth to b8:27:eb:a1:62:4e (try 1/3)
[1751671.056143] wlp4s0: authenticated
[1751671.056498] iwlwifi 0000:04:00.0 wlp4s0: disabling HT as WMM/QoS is not supported by the AP
[1751671.056508] iwlwifi 0000:04:00.0 wlp4s0: disabling VHT as WMM/QoS is not supported by the AP
[1751671.057503] wlp4s0: associate with b8:27:eb:a1:62:4e (try 1/3)
[1751671.071512] wlp4s0: RX AssocResp from b8:27:eb:a1:62:4e (capab=0x411 status=0 aid=1)
[1751671.073930] wlp4s0: associated
[1751671.088389] wlp4s0: Limiting TX power to 30 (30 - 0) dBm as advertised by b8:27:eb:a1:62:4e
[1751681.075014] wlp4s0: deauthenticating from b8:27:eb:a1:62:4e by local choice (Reason: 3=DEAUTH_LEAVING)

The connection to the AP appears to timeout, and presents me with a prompt for the security key.

I'm just wondering if this is normal behaviour when copying music?

teslausb_setup_variables.conf.txt

@marcone
Copy link
Owner

marcone commented Sep 17, 2019

I've seen the nonresponsive ssh before, and always attributed it to slow wifi being saturated by archiving, because it always started working again after archiving finished. Haven't noticed it during music copying though.

@fatfurrycat
Copy link
Author

Thanks for the quick response, marcone.

I'm transferring about 123GB of Music to the Pi, and figured it might be quicker over the CIFS connection that copying via USB. Maybe, that was a mistake. It's still running, so I'll leave it going for now.

@marcone
Copy link
Owner

marcone commented Sep 17, 2019

Yeah that's going to take a looong time. You should seriously ask yourself whether you really need that much music in you car. If you the answer is yes, I would recommend using a separate dedicated usb drive for that, or even a small ssd.

@marcone
Copy link
Owner

marcone commented Sep 18, 2019

Twice now I've also seen the network going up and down repeatedly, resulting in a very sluggish connection, to the point of unusability. It looks like this in dmesg:

[18721.398246] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
[18731.144154] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[18731.144198] brcmfmac: power management disabled
[18733.828410] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
[18739.944577] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[18739.946434] brcmfmac: power management disabled

etcetera, over and over until I reboot the Pi, and then it's fine. I have no idea what causes this.

@fatfurrycat
Copy link
Author

I'm beginning to think you're right on the music front. I think I'll turn syncing off and try to find another way.

I'm guessing I could mount the partition and the .bin file on a Linux system and copy the files using a card reader?

Any thoughts on what might work?

@fatfurrycat
Copy link
Author

fatfurrycat commented Sep 18, 2019

So, I've put the SD card into my Fedora laptop, and done the following:

$ mkdir /mnt/teslausb
$ mount /dev/mmcblk0p3 /mnt/teslausb
$ file music_disk.bin
music_disk.bin: DOS/MBR boot sector; partition 1 : ID=0xc, start-CHS (0x0,32,33), end-CHS (0x145,85,16), startsector 2048, 268433408 sectors, extended partition table (last)

which gives me the partition start sector. To calculate the offset for losetup, you need to multiply that number by 512, so the offset is 2048*512=1048576

$ losetup
This will show what loopback filesystems you've currently got (I've currently got 14). Pick a loopback number that's unused. In my case, I just arbitrarily chose 20.

$ losetup -o 1048576 loop20 /backingfiles/music_disk.bin
$ mkdir /mnt/music
$ mount /dev/loop20 /mnt/music

I'm now copying my music into the mounted /mnt/music filesystem at around 25MB/s. A huge improvement.

@marcone
Copy link
Owner

marcone commented Sep 18, 2019

If you also mount the /root partition of the sd card, you can look up the offset in the Pi's fstab, and possibly even do something like:
mount --fstab /media/fatfurrycat/root/etc/fstab /mnt/music
(untested)

@curiousgrge
Copy link

I've redone my teslausb multiple times now and here are some other tips if you still plan on using music sync.

I moved my music out of the share into a temporary folder outside the share to avoid it syncing the first time.

Connect to PC via USB by using TWO USB cables to supply power as one cable wasn't enough to bring up the partitions.

Try this to speed up your USB copying https://github.com/marcone/teslausb/issues/239#issuecomment-529830171

@sTevefor3
Copy link

Twice now I've also seen the network going up and down repeatedly, resulting in a very sluggish connection, to the point of unusability. It looks like this in dmesg:

[18721.398246] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
[18731.144154] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[18731.144198] brcmfmac: power management disabled
[18733.828410] IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
[18739.944577] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[18739.946434] brcmfmac: power management disabled

etcetera, over and over until I reboot the Pi, and then it's fine. I have no idea what causes this.

I've seen this same behaviour, although rarely. In addition, it appears that the Pi grabs 2 IP addresses from my home network DHCP server. A reboot fixed it, but it has appeared again.

The issue with data transfer stalling out ssh attempts is also present in my setup. Doesn't seem to interfere with normal non-data transferring operations (i.e., recording cam video & music playback), but it does prevent me from logging-in via ssh until the transfer operation is completed. No biggee, I guess...

@AngusThermo-Pyle
Copy link

I noticed this issue as well recently when updating my OS and recopying music. It would leave existing sessions open, respond to new ones, but not respond to the session at all until the copy was done. Not much of an issue for my use case, but odd.

@sboesebeck
Copy link

my guess is, this is not only happening when syncing music - i have the same issue, that SSH fails to respond sometimes. But I do not have any music that is to be transferred.

Could it be, that the transfer "eats up" all bandwith? Maybe we could fix it using some QOS-Rules in network/firewall settings?

if that is really a thing that needs fixing. Usually I do not connect to the PI if all works fine :-)

@githubsean
Copy link

Somewhat related, I'm finding that while transferring clips onto my NAS, the connection checker is killing the archive process. This may be because, as mentioned above, the network is saturated and nothing is getting through.

Another reason I have thought of is that the ping (using hping3) isn't actually properly closing the connection - all it does is a SYN packet and waits for the SYN/ACK packet from the NAS. This leaves the connection state open until it times out - effectively a SYN flood. I haven't checked this yet though. I'm going to try changing the connection checker to use some different methods - maybe nc or nmap - to see if that makes a difference in my situation. I'll also see if there are any "security" settings on the NAS (Synology) which might be causing this.

@stauffenberg2020
Copy link

stauffenberg2020 commented May 5, 2024

UPDATE : I confirm that the below work around doesnt work after a couple of tests

May be a late comment. I have seen this network throttling issues and disconnections especially while testing how music copy works. If you have a (Synology) NAS, You can easily set the upload and download rates for the NAS user. This probably can get rid of network disconnections. (I tested once and it seems to work...Need to test a couple of more times)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants