Tabulate factor levels #748

Open
raffaem opened this issue Dec 26, 2024 · 0 comments

For factor variables, skimr currently tabulates only the most frequent values:

> starwars %>% pull(hair_color) %>% as.factor() %>% skimr::skim()
── Data Summary ────────────────────────
                           Values    
Name                       Piped data
Number of rows             87        
Number of columns          1         
_______________________              
Column type frequency:               
  factor                   1         
________________________             
Group variables            None      

── Variable type: factor ───────────────────────────────────────────────────────────────────────────
  skim_variable n_missing complete_rate ordered n_unique top_counts                       
1 data                  5         0.943 FALSE         11 non: 38, bro: 18, bla: 13, whi: 4

I would like output in which all levels are tabulated, including the missing level, together with their frequencies and percentages.

Something like:

Class: factor
12 unique levels (level for missing included)

   Value         Freq  Prop    CumProp
   --------------------------------------
   none          38    43.68%  43.68%
   brown         18    20.69%  64.37%
   black         13    14.94%  79.31%
   <NA>          5     5.75%   85.06%
   white         4     4.60%   89.66%
   blond         3     3.45%   93.10%
   auburn        1     1.15%   94.25%
   auburn, grey  1     1.15%   95.40%
   auburn, white 1     1.15%   96.55%
   blonde        1     1.15%   97.70%
   brown, grey   1     1.15%   98.85%
   grey          1     1.15%   100.00%
   --------------------------------------
   Total         87    100.00%

Would it be possible to implement that?
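
For reference, here is a rough base-R sketch of the computation behind the table above. The helper name `tabulate_levels` and its column names are made up for illustration and are not part of skimr; it only assumes that `table()` with `useNA = "ifany"` counts the missing level alongside the others.

```r
# Hypothetical helper (not a skimr function): tabulate every level of a
# factor, including NA, with frequency, proportion and cumulative proportion.
tabulate_levels <- function(x) {
  x <- as.factor(x)
  # Count all levels; useNA = "ifany" adds an NA entry when values are missing.
  counts <- sort(table(x, useNA = "ifany"), decreasing = TRUE)
  value <- names(counts)
  value[is.na(value)] <- "<NA>"  # label the missing level explicitly
  prop <- as.numeric(counts) / length(x)
  data.frame(
    value = value,
    freq = as.integer(counts),
    prop = prop,
    cum_prop = cumsum(prop),
    stringsAsFactors = FALSE
  )
}

# Small made-up vector, just to exercise the helper.
hair <- c("none", "brown", NA, "none", "black", "none", NA)
print(tabulate_levels(hair))
```

With dplyr installed, `tabulate_levels(dplyr::starwars$hair_color)` should give the counts for the example above. Formatting the proportions as percentages and adding a total row are left out to keep the sketch short.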
