Error using {N_miss} and the like for continuous variables #2095

Open
haziqj opened this issue Dec 6, 2024 · 3 comments

@haziqj

haziqj commented Dec 6, 2024

Hi, it used to be possible to request the number and proportion of missing and non-missing observations for continuous variables. This now returns an error. Is this a bug, or has the feature been removed?

It works fine for categorical variables.

Thanks

library(gtsummary)
trial |>
  select(age) |>
  tbl_summary(
    statistic = all_continuous() ~ "{N_miss}",
    missing = "no"
  )
#> Error in `x[c("variable", "stat_name")]`:
#> ! Can't subset columns that don't exist.
#> ✖ Columns `variable` and `stat_name` don't exist.

Created on 2024-12-06 with reprex v2.1.1

@ddsjoberg
Owner

You can still use them for continuous variables, but the function is expecting at least one continuous summary statistic, like the mean in the example below.

library(gtsummary)

trial |>
  select(age) |>
  tbl_summary(
    statistic = all_continuous() ~ "{mean}; N missing={N_miss}",
    missing = "no"
  ) |> 
  as_kable()
| Characteristic | N = 200 |
|:---------------|:-------:|
| Age            | 47; N missing=11 |

Created on 2024-12-05 with reprex v2.1.1

I never expected someone to summarize only the missing-value counts, but I guess it's something we should support again.

@haziqj
Author

haziqj commented Dec 6, 2024

My use case is for a missing data analysis, where I would like to show the severity of missing data for each variable by a certain categorical variable. tbl_summary() was a neat way of doing this, especially with add_overall().

In the meantime, I've done a hacky thing to get what I want:

library(gtsummary)
blank_fn <- function(x) NA
tab <- trial |>
  select(age) |>
  tbl_summary(
    statistic = all_continuous() ~ "{blank_fn} {N_miss} ({p_miss}%)",
    missing = "no"
  )
tab$table_body <-
  tab$table_body |>
  mutate(across(everything(), \(x) gsub("NA ", "", x)))
as_kable(tab)
| Characteristic | N = 200 |
|:---------------|:-------:|
| Age            | 11 (5.5%) |

Created on 2024-12-06 with reprex v2.1.1

Thanks

@ddsjoberg
Owner

FWIW, this is how I imagined this type of operation would be run:

library(gtsummary)

trial |>
  select(age) |>
  dplyr::mutate(dplyr::across(everything(), is.na)) |> 
  tbl_summary() |> 
  as_kable() # convert to kable to display on GH
| Characteristic | N = 200 |
|:---------------|:-------:|
| age            | 11 (5.5%) |

Created on 2024-12-17 with reprex v2.1.1
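
For the use case described earlier in the thread (missingness per variable, broken down by a categorical variable, with an overall column), the same is.na() pattern can probably be extended as in the sketch below. The choice of `trt` as the grouping column and of `age`, `marker`, and `response` as the variables to inspect is an assumption made for illustration, and `miss_tbl` is just a placeholder name; this is one possible approach, not necessarily the one the maintainers recommend.

```r
library(gtsummary)

# Assumption: `trt` is the stratifying variable; age, marker and response
# are the columns whose missingness we want to report.
miss_tbl <- trial |>
  dplyr::select(trt, age, marker, response) |>
  # Turn every column except the grouping variable into TRUE/FALSE "is missing"
  dplyr::mutate(dplyr::across(-trt, is.na)) |>
  # Logical columns are summarized as n (%) of TRUE, i.e. n (%) missing
  tbl_summary(by = trt) |>
  # Add a column for the full sample alongside the per-group columns
  add_overall()

as_kable(miss_tbl) # convert to kable to display on GH
```

Because is.na() drops the variable labels, the rows are labelled with the raw column names; the `label` argument of tbl_summary() can be used to restore friendlier labels if needed.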
