TIFF with multiple image frames being read as having 1 frame #1606

Closed
dodewall opened this issue Aug 15, 2024 · 16 comments · Fixed by #1607

Comments

@dodewall

dodewall commented Aug 15, 2024

Hello! I'm trying to use large-image to view & manipulate large (~12GB) timelapse acquisitions (~1300 16-bit grayscale frames saved as a .tif file). When I use source = large_image.open(file_path), the source object shows the correct sizeX and sizeY, but source.frames returns "1". The tileIterator also only returns a single tile. Is there any advice for troubleshooting available?

@manthey
Member

manthey commented Aug 15, 2024

large_image has several tile source readers that read tiff files. tiff files have huge variations, and some writers (such as ImageJ) claim to write tiff files that are not actually compliant with the specification. From python, you can see which reader was used by doing something like print(large_image.open("file.tiff")) and it will print the name of the class that actually did the reading.

For any tiff file, we can see its internal structure using the tifftools python package. pip install tifftools in any modern version of python, then tifftools dump file.tiff will print the internal details. If the file is compliant with the tiff specification, I'd expect this output to be huge (a few dozen lines or more of output per frame). If you can share the first hundred-ish lines of that output, then we can know exactly what is going on. If your file has private data in it, such as names, it could be revealed in this output, so please make sure this is publicly shareable.

@dodewall
Author

Thank you @manthey! Below is the output of tifftools dump file.tiff (file.tiff being replaced with the file path I'm interested in). It's not even 100 lines

Header: 0x4d4d
Directory 0: offset 8 (0x8)
NewSubfileType 254 (0xFE) LONG: 0
ImageWidth 256 (0x100) LONG: 2190
ImageLength 257 (0x101) LONG: 946
BitsPerSample 258 (0x102) SHORT: 16
Compression 259 (0x103) SHORT: 1 (None 1 (0x1))
Photometric 262 (0x106) SHORT: 1 (MinIsBlack 1 (0x1))
ImageDescription 270 (0x10E) ASCII: ImageJ=1.54f
images=3337
slices=3337
loop=false
min=104.0
max=4095.0

StripOffsets 273 (0x111) LONG: 291698
SamplesPerPixel 277 (0x115) SHORT: 1
RowsPerStrip 278 (0x116) LONG: 946
StripByteCounts 279 (0x117) LONG: 4143480
ImageJMetadataByteCounts 50838 (0xC696) LONG: <3338> 12 78 78 78 78 78 78 78 78 78 80 80 80 80 80 80 80 80 80 80 ...
ImageJMetadata 50839 (0xC697) BYTE: <278106> 73 74 73 74 108 97 98 108 0 0 13 9 0 116 0 58 0 49 0 47 ...
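The ImageDescription tag in the dump above is where ImageJ records how many frames it wrote, even though only one directory is listed. As a minimal sketch (the helper name `parse_imagej_description` is hypothetical, and the key=value layout is assumed from the dump shown here), the frame count can be pulled out like this:

```python
def parse_imagej_description(text):
    """Split an ImageJ-style ImageDescription into a dict of key=value pairs."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # ignore blank lines and lines without '='
            info[key.strip()] = value.strip()
    return info


# The description text as it appears in the dump above.
description = (
    "ImageJ=1.54f\nimages=3337\nslices=3337\nloop=false\nmin=104.0\nmax=4095.0\n"
)
info = parse_imagej_description(description)
frame_count = int(info.get("images", 1))
print(frame_count)  # 3337
```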

@manthey
Member

manthey commented Aug 15, 2024

This file is a "not-quite-a-tiff" file written by ImageJ. Such a file is a valid 1-frame tiff file with all the extra frames appended afterwards without proper tiff references to them. Normally large_image asks the tifffile source to read these, since it has specific code to handle this. I wonder whether the file would read correctly if large_image picked the tifffile source module. Does the following show the correct number of frames?

import large_image_source_tifffile  # note that this is tifffile not tiff

source = large_image_source_tifffile.open(file_path)
print(source.frames)

If so, then I need to dig into why the tifffile source wasn't chosen by default. If not, then I'll need to look a little deeper.
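To illustrate why a generic reader sees only one frame in such a file, here is a rough sketch that counts the image directories (IFDs) reachable from the header of a classic TIFF. The function name `count_ifds` and the synthetic test file are illustrative assumptions, and BigTIFF is not handled. It is not how large_image or tifffile read the file.

```python
import io
import struct


def count_ifds(stream):
    """Count IFDs reachable by following the next-IFD chain of a classic TIFF."""
    header = stream.read(8)
    order = {b"II": "<", b"MM": ">"}.get(header[:2])
    if order is None or len(header) < 8:
        raise ValueError("not a TIFF header")
    magic, offset = struct.unpack(order + "HI", header[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF (BigTIFF uses magic 43)")
    count, seen = 0, set()
    while offset != 0 and offset not in seen:  # 'seen' guards against loops
        seen.add(offset)
        stream.seek(offset)
        (entries,) = struct.unpack(order + "H", stream.read(2))
        stream.seek(entries * 12, io.SEEK_CUR)  # each tag entry is 12 bytes
        (offset,) = struct.unpack(order + "I", stream.read(4))
        count += 1
    return count


# Synthetic big-endian file: one IFD with a single ImageWidth tag, a next-IFD
# offset of 0, then extra bytes standing in for unreferenced frames.
fake = (
    b"MM" + struct.pack(">HI", 42, 8)
    + struct.pack(">H", 1)
    + struct.pack(">HHII", 256, 4, 1, 2190)
    + struct.pack(">I", 0)
    + b"\x00" * 64
)
print(count_ifds(io.BytesIO(fake)))  # 1
```

Because the next-IFD offset after the first directory is zero, everything appended after it is invisible to a reader that only follows the chain.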

@dodewall
Author

Running the code above on the "not-quite-a-tiff" file throws the following error:
TileSourceError: File cannot be opened via tifffile source: 'no maximum series'

Is there a way to convince ImageJ to write the tiff file "properly"?

@manthey
Member

manthey commented Aug 15, 2024

I don't see anything in ImageJ's user guide to vary how it saves tiff files.

Can you check if you have a very recent version of the tifffile python package in your environment? If not, maybe upgrading that will help. If it is very recent, then the tifffile package fails to read ImageJ output and we could probably hunt down what is going on there. If you can share the first 300,000 bytes or so of your file, I'd be able to replicate the issue (basically all of the header information and none of the imagery, which based on tifftools dump is the first 291,698 bytes).

@dodewall
Author

I have tifffile version 2024.8.10. If that's meant to be the date of its release, then it is very recent.
Happy to share part of the file, but please forgive my ignorance - what's the easiest way to split the first 300k bytes from the file for your replication? Is it possible to use tifftools?
Thank you for your guidance.

@manthey
Member

manthey commented Aug 16, 2024

The linux command head -c 300000 file.tiff > tiff-header.dat will do it.
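On Windows, where head may not be available, roughly the same thing can be done from Python. This is a minimal sketch; the function name `copy_head` and the file names in the usage comment are placeholders, not part of any library.

```python
def copy_head(src_path, dst_path, num_bytes=300_000):
    """Copy the first num_bytes of src_path to dst_path; return bytes written."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        data = src.read(num_bytes)  # reads fewer bytes if the file is shorter
        dst.write(data)
    return len(data)


# Example usage (placeholder file names):
# copy_head("file.tiff", "tiff-header.dat")
```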

@dodewall
Author

dodewall commented Aug 16, 2024

I've attached the first ~300kb as a .txt file here (neither .tif nor .dat files are supported for attachment):

nov4_d.sbdsort1_div15 [aligned].txt

@manthey
Member

manthey commented Aug 16, 2024

With this and the file extended with random data to a total of 13827084458 bytes, my instance of large_image uses tifffile and properly reports 3337 frames. This worked on several versions of python and on linux and osx. This means either your actual file is a different length than I'd expect from the headers or your environment is somehow significantly different than mine. Can you confirm your file's length? And, if that matches, can you give details on your OS/Python versions and which version of large_image you have installed?
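The expected length can be worked out from the values in the tifftools dump earlier in the thread. This sketch assumes the frames are uncompressed, stored back to back starting at the first strip offset, with nothing written after the last frame:

```python
# Values taken from the tifftools dump earlier in the thread.
width, height = 2190, 946        # ImageWidth, ImageLength
bytes_per_sample = 16 // 8       # BitsPerSample is 16, one sample per pixel
first_strip_offset = 291_698     # StripOffsets
frames = 3337                    # images=3337 in the ImageDescription

frame_bytes = width * height * bytes_per_sample
assert frame_bytes == 4_143_480  # agrees with StripByteCounts

expected_size = first_strip_offset + frames * frame_bytes
print(expected_size)  # 13827084458
```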

@dodewall
Author

dodewall commented Aug 16, 2024

File length is confirmed as 13827084458 bytes.
OS version: Windows 11 Home Version 10.0.22631 N/A Build 22631
Python version: 3.12.5
large_image version: 1.29.4.

I tried using large_image_source_tifffile.open(image_file) in both python on the command line and in a virtual environment with the above versions of python and large_image installed. Same result, specifically:

large_image.exceptions.TileSourceError: File cannot be opened via tifffile source: 'no maximum series'

Could the file path be the issue? I had to add an escape character ('\') before each backslash in the file path since I'm in a Windows environment.
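For reference, escaping each backslash is one of several equivalent ways to write a Windows path in Python, so the escaping by itself should not change which file is opened. A small sketch with a made-up path:

```python
from pathlib import PureWindowsPath

# Hypothetical path, written three ways.
escaped = "C:\\Users\\example\\Downloads\\stack.tif"  # escaped backslashes
raw = r"C:\Users\example\Downloads\stack.tif"         # raw string literal
forward = "C:/Users/example/Downloads/stack.tif"      # forward slashes

# The first two are the same string; the third names the same Windows path.
same_string = escaped == raw
same_path = PureWindowsPath(forward) == PureWindowsPath(raw)
print(same_string, same_path)  # True True
```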

@dodewall
Author

For what it's worth, when I try opening the file without specifying the source, this is the result (reading as a JPEG?)

Command: large_image.open(image_file)

Result: PILFileTileSource ("('C:\\\\Users\\\\oadew\\\\Downloads\\\\nov4_d.sbdsort1_div15 [aligned].tif', 'JPEG', 95, 0, 'raw', False, '__STYLESTART__', None, '__STYLEEND__')", None),None

@manthey
Member

manthey commented Aug 16, 2024

I had only tried on linux and OSX. I get exactly your result on Windows. The culprit is a line, used to sanity-check the image and find the largest image series, that reads np.prod(s.shape). On linux and OSX this behaves as expected; on Windows, numpy's default integer is int32 (not int64), and this produces the wrong value. I'll have a fix for this shortly (but I worry that we've made assumptions somewhere else like this).

Thanks for working through this.
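The overflow described above can be reproduced without numpy by wrapping the product to 32 bits. This is only an illustration: the shape (frames, rows, columns) is assumed from the dump earlier in the thread, `wrap_int32` is a hypothetical helper, and none of this is necessarily the fix that was merged.

```python
import math


def wrap_int32(value):
    """Reduce an integer to the value a signed 32-bit accumulator would hold."""
    return (value + 2**31) % 2**32 - 2**31


shape = (3337, 946, 2190)  # assumed (frames, rows, columns)

true_size = math.prod(shape)      # Python ints do not overflow
wrapped = wrap_int32(true_size)   # what a 32-bit product ends up as

print(true_size)  # 6913396380
print(wrapped)    # -1676538212

# With numpy, one way to avoid the wraparound is to request a 64-bit
# accumulator explicitly, e.g. np.prod(shape, dtype=np.int64).
```

A negative "size" can never exceed the running maximum, which is consistent with the 'no maximum series' error reported earlier.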

@manthey
Member

manthey commented Aug 16, 2024

For what it's worth, when I try opening the file without specifying the source, this is the result (reading as a JPEG?)

That 'JPEG' term indicates that if you ask for part of the image as an image tile it will default to returning a JPEG. You can override and ask for any output format PIL supports, but this is the default since, as a tile server for the web, jpeg is often an acceptable choice.

@manthey
Member

manthey commented Aug 16, 2024

@dodewall You can try this out by installing the development release (pip install "large-image-source-tifffile>=1.29.6.dev2").

@dodewall
Author

dodewall commented Aug 19, 2024

Amazing - this works; thank you!

@manthey
Member

manthey commented Aug 19, 2024

Thanks for the confirmation, and I'm glad we could hunt down what was going on.
