
Error running TED: Error: $ operator is invalid for atomic vectors #16

Open
zroger49 opened this issue Mar 29, 2022 · 6 comments

zroger49 commented Mar 29, 2022

Hello.
Previously, a member of my lab was able to run your software (this was about 5-6 months ago).
I tried to reproduce their results using a different reference dataset.

However, when I try to run my analysis, the following error occurs:

Error: $ operator is invalid for atomic vectors
In addition: Warning message:
In mclapply(1:N, FUN = function(i) { :
  all scheduled cores encountered errors in user code 

And when I run the same analysis using a single core, I get:

current sample ID:1  Error in rmultinom(n = 1, size = X.i[g], prob = prob.mat[, g]) :
  invalid second argument 'size'

My bulk data are TPM residuals (obtained by regressing out the effect of multiple covariates from the original TPM expression table with a multiple linear model), while my scRNA-seq data were normalized using NormalizeData(normalization.method = "RC", scale.factor = 100000, margin = 1) and subset to the top 5000 variable genes.

I know this method works best when counts are used as input, but this setup has worked previously and gave us decent results.
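The single-core error above can be reproduced outside the package. The sketch below uses only base R: it shows that `rmultinom()` rejects a negative `size`, and then counts negative and non-finite entries in a bulk matrix. The toy matrix, its orientation (samples in rows, genes in columns), and all variable names are assumptions made for illustration, not taken from the TED code.

```r
# Reproduce the single-core error in isolation: base R's rmultinom()
# rejects a negative 'size'.
msg <- tryCatch(
  rmultinom(n = 1, size = -1, prob = c(0.5, 0.5)),
  error = function(e) conditionMessage(e)
)
print(msg)

# Hypothetical toy bulk matrix; samples in rows and genes in columns is an
# assumption made for this sketch.
bulk <- matrix(
  c(5.0, -1.2,   # geneA
    0.0,  3.0,   # geneB
    2.5,  7.0),  # geneC
  nrow = 2,
  dimnames = list(c("s1", "s2"), c("geneA", "geneB", "geneC"))
)

# Count problematic entries and list the genes that carry negative values.
n_negative  <- sum(bulk < 0, na.rm = TRUE)
n_nonfinite <- sum(!is.finite(bulk))
genes_with_negatives <- colnames(bulk)[colSums(bulk < 0, na.rm = TRUE) > 0]

print(n_negative)
print(genes_with_negatives)
```

Running a check like this on the real input before the deconvolution would show whether residual-based expression values are the likely trigger.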

zroger49 reopened this Mar 29, 2022
Collaborator

tinyi commented Mar 31, 2022 via email

Author

zroger49 commented Apr 5, 2022

Hello again,
The issue seems to be related to the negative numbers in the expression matrix!
I tried running with the raw counts matrix instead, but ran into the following issue:

...
[1] "pooling information across samples"
Killed
Error in sendMaster(try(lapply(X = S, FUN = FUN, ...), silent = TRUE)) :
  ignoring SIGPIPE signal
Calls: run.Ted ... optimize.psi -> mclapply -> lapply -> FUN -> sendMaster

Could this be related to the memory I have available in my machine?

Collaborator

tinyi commented Apr 5, 2022 via email

Yes, I think it is related to memory. You may try setting the n.cores.2g argument to a smaller value. Removing unused variables from the workspace and then freeing memory with gc() may also help.

There are a few ways to handle the negative values. The easiest is to exclude genes with negative values. I am not sure how your regression is set up. Using the residuals may result in lots of zeros; if that is the case, you may try adding back the intercept term. You can also change the reference level in the regression to see which direction yields fewer negative values. Some people also regress on log-transformed values; in that case, the result needs to be transformed back to the original raw scale by exponentiating the values.

Author

zroger49 commented Apr 7, 2022

Regarding the negative counts in the input matrix, I decided to run the analysis using the raw counts and then take into account the covariates in downstream analysis. The results look good so far.
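For reference, two of the options raised earlier in the thread for handling negative values (dropping the affected genes, or regressing on the log scale and exponentiating back) can be sketched in base R as follows. The toy data, the samples-by-genes orientation, and the single `batch` covariate are all assumptions made for illustration; this is not the regression setup used in this issue.

```r
# Hypothetical toy matrix (samples x genes, an assumed orientation).
X <- matrix(
  c(10,   3,   # geneA
    -0.5, 4,   # geneB
    0,    8),  # geneC
  nrow = 2,
  dimnames = list(c("s1", "s2"), c("geneA", "geneB", "geneC"))
)

# Option 1: drop every gene that has at least one negative value.
keep <- colSums(X < 0) == 0
X_filtered <- X[, keep, drop = FALSE]
print(colnames(X_filtered))

# Option 2: regress on the log scale, add the intercept back to the
# residuals, then exponentiate to return to the raw scale.
expr  <- c(12, 30, 18, 45, 25, 60)                # one gene, six samples (made up)
batch <- factor(c("a", "b", "a", "b", "a", "b"))  # made-up covariate
fit <- lm(log(expr) ~ batch)
adjusted <- exp(residuals(fit) + coef(fit)[["(Intercept)"]])
print(adjusted)
```

Because the last step is an exponential, the adjusted values are strictly positive. Note that `log()` needs strictly positive input, so real data containing zeros would need a pseudocount or a similar adjustment, which is a further assumption not discussed in the thread.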

Also, when I tried to run the same type of analysis on another dataset (this time TCGA), I got the following error:

[1] "pooling information across samples"
Error in log.fold[i, ] : subscript out of bounds
In addition: Warning message:
In mclapply(1:nrow(input.phi), function(idx) { :
  scheduled cores 3, 8 did not deliver results, all values of the jobs will be affected

It seems that the first round of the analysis completed, though. Is it safe to carry on with these results?
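As general background on the warning above, the sketch below shows how failed jobs surface in the return value of `parallel::mclapply()`: a job that throws an error comes back as a `try-error` object, and a worker that was killed delivers `NULL`. The toy jobs are made up, and this does not reflect how TED handles these results internally. It assumes a Unix-alike system, since `mc.cores > 1` is not supported on Windows.

```r
library(parallel)

# Toy jobs: job 3 fails on purpose. mc.preschedule = FALSE forks one
# process per job, so a failure affects only that job's slot.
results <- mclapply(
  1:4,
  function(i) {
    if (i == 3) stop("simulated failure in job ", i)
    i^2
  },
  mc.cores = 2,
  mc.preschedule = FALSE
)

# A job that errored is returned as a "try-error" object; a worker that
# was killed (for example by the out-of-memory killer) delivers NULL.
failed <- vapply(
  results,
  function(r) is.null(r) || inherits(r, "try-error"),
  logical(1)
)
print(which(failed))
```

With the default `mc.preschedule = TRUE`, several jobs share one worker, so a single failure can invalidate every job assigned to that worker, which is what "all values of the jobs will be affected" refers to.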

Collaborator

tinyi commented Apr 7, 2022 via email

Collaborator

tinyi commented May 10, 2022

Hi Rogério,

I have updated the git repository to v1.4. This version addresses the memory issue. Please try it and let me know if it helps.

Best,

Tinyi
