Running OCR gives no results and NS_ERROR_FILE_NOT_FOUND #88

Open
TrakJohnson opened this issue Dec 17, 2024 · 4 comments

TrakJohnson commented Dec 17, 2024

Hi, I just installed the plugin. When I try to OCR my first file, I get the following error in the developer console:

NS_ERROR_FILE_NOT_FOUND: Component returned failure code: 0x80520012 (NS_ERROR_FILE_NOT_FOUND) [nsIFile.isDirectory]

I first thought that I had misconfigured tesseract/pdftoppm, but everything looks fine. Is there any way to investigate this further? I read through #87, but it doesn't seem related. Thanks!

Here's my configuration:

  • Zotero 7.0.11
  • Fedora Linux 41 / Linux 6.11.10-300.fc41.x86_64
  • Libraries:
❯ /usr/bin/pdftoppm -v
pdftoppm version 24.08.0
Copyright 2005-2024 The Poppler Developers - http://poppler.freedesktop.org
Copyright 1996-2011, 2022 Glyph & Cog, LLC
❯ /usr/bin/tesseract -v
tesseract 5.4.1
 leptonica-1.84.1
  libgif 5.2.2 : libjpeg 6b (libjpeg-turbo 3.0.2) : libpng 1.6.40 : libtiff 4.6.0 : zlib 1.3.1.zlib-ng : libwebp 1.4.0
 Found AVX512BW
 Found AVX512F
 Found AVX512VNNI
 Found AVX2
 Found AVX
 Found FMA
 Found SSE4.1
 Found libcurl/8.9.1 OpenSSL/3.2.2 zlib/1.3.1.zlib-ng libidn2/2.3.7 nghttp2/1.62.1
  • Zotero-OCR settings:

[image: screenshot of the Zotero-OCR settings]

Collaborator

aborel commented Dec 19, 2024

First, we can check whether the problem happens at the pdftoppm stage or at the tesseract stage. Are the PNG images saved to the Zotero item folder?

aborel self-assigned this on Dec 19, 2024

zzyzx-dc commented Jan 3, 2025

I am having the same issue and came here to see if anyone else was. Zotero 7.0.11 on Fedora Workstation 41.

Could not get children of file(/opt) because it does not exist
Error code: NS_ERROR_FILE_NOT_FOUND: Component returned failure code: 0x80520012 (NS_ERROR_FILE_NOT_FOUND) [nsIFile.isDirectory] zotero-ocr.js:87

Additionally, like the original poster, I had to manually set the file paths to /usr/bin/tesseract and /usr/bin/pdftoppm; otherwise it returned OperationError: Could not parse path (tesseract): NS_ERROR_FILE_UNRECOGNIZED_PATH. The documentation helped me realize I needed to locate the file paths myself. Thanks!
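
For anyone else who has to locate these paths by hand, here is a minimal sketch of one way to find both binaries and confirm that each path points at an executable file. The helper names `locate_tool` and `is_usable_binary` are made up for this example and are not part of the plugin. The final check on /opt is only there because the error message above mentions that directory; whether the plugin actually needs it is an assumption.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of Zotero-OCR.

# Print the absolute path of a tool found on PATH; returns non-zero if absent.
locate_tool() {
    command -v "$1"
}

# Succeed only if the given path is an existing, executable regular file.
is_usable_binary() {
    [ -f "$1" ] && [ -x "$1" ]
}

for tool in tesseract pdftoppm; do
    tool_path=$(locate_tool "$tool") || { echo "$tool: not found on PATH"; continue; }
    if is_usable_binary "$tool_path"; then
        echo "$tool: $tool_path"
    else
        echo "$tool: $tool_path is not an executable file"
    fi
done

# The error message refers to /opt, so report whether it exists on this system.
if [ -d /opt ]; then
    echo "/opt exists"
else
    echo "/opt is missing"
fi
```

The paths printed for tesseract and pdftoppm are the values to enter in the Zotero-OCR settings.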

Collaborator

aborel commented Jan 3, 2025

Since the OP didn't answer, maybe you can check whether pdftoppm did its job?


zzyzx-dc commented Jan 3, 2025

Sure thing - I am not sure how to check so you might have to walk me through it. When I go to the item folder (Zotero item - right click - Show file) there is only the PDF file.
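
One possible way to check this from a terminal is sketched below. `count_pngs` reports how many PNG files sit in a folder, and `run_stages` runs pdftoppm and then tesseract by hand in a scratch folder so that each stage can be judged separately. Both function names are hypothetical, and the 300 dpi resolution and `eng` language are assumed values, not necessarily what the plugin passes.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of Zotero-OCR.

# Count PNG files directly inside a folder (not recursive).
count_pngs() {
    find "$1" -maxdepth 1 -type f -name '*.png' | wc -l | tr -d ' '
}

# Run both stages manually on one PDF ($1) inside a temporary folder.
# The resolution (300) and language (eng) are assumptions for this sketch.
run_stages() {
    pdf=$1
    work=$(mktemp -d) || return 1
    /usr/bin/pdftoppm -png -r 300 "$pdf" "$work/page" || { echo "pdftoppm failed"; return 1; }
    echo "pdftoppm wrote $(count_pngs "$work") PNG file(s) to $work"
    for img in "$work"/page*.png; do
        # tesseract writes its text output to <outputbase>.txt
        /usr/bin/tesseract "$img" "${img%.png}" -l eng || { echo "tesseract failed on $img"; return 1; }
    done
    ls -l "$work"
}
```

After sourcing the functions, `count_pngs /path/to/item/folder` (the folder opened by "Show file") shows whether the plugin left any PNG files behind, and `run_stages /path/to/file.pdf` shows whether the two tools work outside Zotero. Both paths are placeholders. If the manual run succeeds while the item folder still contains only the PDF, the failure likely happens before or during the pdftoppm stage inside the plugin.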
