Replay of page with javascript never completes #109

Open
lwrubel opened this issue Jul 20, 2022 · 4 comments
Labels
bug Something isn't working replay web archiving 2022 web archiving work cycle

Comments

@lwrubel
Contributor

lwrubel commented Jul 20, 2022

pywb never finishes loading pages from https://swap.stanford.edu/was/*/http://bondholder-information.stanford.edu/index.html. The browser console shows the page continually trying to load various JavaScript files.

The page does render in OpenWayback, likely because OpenWayback is not handling the page's JavaScript.

It's unclear so far what is causing the problem with replay. This ticket is to describe what's known about the pywb bug so far. It prevents viewing the site and capturing a thumbnail for its seed.

@lwrubel lwrubel added bug Something isn't working web archiving 2022 web archiving work cycle labels Jul 20, 2022
@lwrubel
Contributor Author

lwrubel commented Jul 21, 2022

@peterchanws will run a capture with webrecorder and accession that crawl.

@lwrubel lwrubel added the replay label Jul 21, 2022
@peterchanws
Collaborator

Autopilot didn't run properly at https://bondholder-information.stanford.edu/.
I checked archived pages in AIT and found the site was not captured properly.
https://wayback.archive-it.org/5591/20220429054108/https://bondholder-information.stanford.edu/
I tried patching several times and some images are still not aligning properly. I have reported the issue to AIT.

@peterchanws
Collaborator

I archived the site using Browsertrix Crawler, accessioned it using one-time registration, and manually created a thumbnail. Here are the results:
https://argo.stanford.edu/view/druid:jn493fq7015
https://swap.stanford.edu/was/20220729152735/https://bondholder-information.stanford.edu/

@edsu
Contributor

edsu commented Jul 29, 2022

I'm noticing that the page displays (yay), but there appears to be some JavaScript code injected into the page during replay?! Scan the replayed page for:

However, _____WB$wombat$check$this$function_____(this) Section
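
For anyone who wants to check other captures for the same symptom, here is a rough sketch of how the scan could be automated. It only assumes that the marker string quoted above appears literally in the replayed HTML; the function names `find_marker_contexts` and `fetch_html` are made up for illustration and are not part of pywb.

```python
import re
from typing import List
from urllib.request import urlopen

# Marker copied verbatim from the replayed page text quoted above.
WOMBAT_MARKER = "_____WB$wombat$check$this$function_____"


def find_marker_contexts(html: str, marker: str = WOMBAT_MARKER, width: int = 30) -> List[str]:
    """Return the text surrounding each literal occurrence of `marker` in `html`."""
    contexts = []
    for match in re.finditer(re.escape(marker), html):
        start = max(0, match.start() - width)
        end = min(len(html), match.end() + width)
        contexts.append(html[start:end])
    return contexts


def fetch_html(url: str) -> str:
    """Hypothetical helper: download a replay URL and decode it as text."""
    with urlopen(url) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling `find_marker_contexts(fetch_html(replay_url))` would list each hit with a bit of context. The marker may also show up inside script content, so hits still need a manual look to tell visible page text from code. If replay is served inside a frame, the top-level response may not contain the archived markup at all, in which case the inner document would need to be fetched instead.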
