Can't Relative Abundance take ‘estimate-unknown’ into account? #37

Closed
HirokiK0 opened this issue Dec 23, 2024 · 1 comment
Labels
question Further information is requested

Comments

@HirokiK0

Thank you for developing such a wonderful tool!

When comparing the relative abundance of species obtained using sylph, I would like to take into account unknown sequences.

However, the current -u option is only valid for sequence abundance and cannot be applied to relative abundance.

If there is a way to take the -u option into account for relative abundance, please let me know.

Or is it not appropriate to apply the -u option to relative abundance?

@bluenote-1577
Owner

@HirokiK0 this is a mathematical issue. Taxonomic abundance depends on the sizes of the genomes. If the "unknown" genomes are unusually small or large, the percentage attributed to unknown sequences would shift, and those genome sizes are not known.

So we don't scale taxonomic abundance by unknown content. You are free to multiply it by the percentage of unknown content yourself, just like with sequence abundance. This is not strictly mathematically valid, but it's usually OK.
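
A minimal sketch of that manual rescaling, for readers who want to try it. It reads the suggestion as: shrink the taxonomic abundances so they sum to the classified ("known") percentage instead of 100. Everything here is an assumption for illustration: the function name `scale_taxonomic_abundance`, the species names, and the numbers are hypothetical, and the known percentage is assumed to be the sum of the sequence abundances from a run with `-u`. As noted above, this applies a sequence-level fraction to a genome-size-dependent quantity, so treat the result as an approximation.

```python
def scale_taxonomic_abundance(taxonomic_abundance, known_percent):
    """Rescale taxonomic abundances so that they sum to known_percent.

    taxonomic_abundance: dict of species name -> taxonomic abundance (percent).
    known_percent: assumed percentage of the sample that was classified.
    """
    if not 0.0 <= known_percent <= 100.0:
        raise ValueError("known_percent must be between 0 and 100")
    total = sum(taxonomic_abundance.values())
    if total <= 0:
        raise ValueError("taxonomic abundances must sum to a positive value")
    factor = known_percent / total
    return {name: value * factor for name, value in taxonomic_abundance.items()}


# Hypothetical values, not taken from any real sylph output.
taxonomic = {"species_A": 60.0, "species_B": 40.0}        # sums to 100
sequence_with_u = {"species_A": 45.0, "species_B": 25.0}  # assumed -u scaled values

# Assumption: the classified percentage is the sum of the -u sequence abundances.
known_percent = sum(sequence_with_u.values())
unknown_percent = 100.0 - known_percent

scaled = scale_taxonomic_abundance(taxonomic, known_percent)
for name, value in scaled.items():
    print(f"{name}\t{value:.2f}")
print(f"unknown\t{unknown_percent:.2f}")
```

The ratios between species are unchanged by this scaling; only the total is reduced to leave room for an "unknown" row.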

@bluenote-1577 bluenote-1577 added the question Further information is requested label Dec 30, 2024