After using .transform(...,sparse=True), how to make a DataFrame for input to .fpgrowth()? #878

geoffreya · 2022-01-08T16:13:05Z


I was trying to encode my 4+ million transactions for input into .fpgrowth() when I got an error at the pd.DataFrame constructor:

ValueError: Shape of passed values is (4669117, 1), indices imply (4669117, 118669)

Note: In my case .transform(...,sparse=True) is necessary because using sparse=False was trying to allocate a half-terabyte of RAM which I do not have.
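For anyone wondering where the half-terabyte figure comes from, here is a rough back-of-the-envelope check. It takes the shape from the ValueError above and assumes the dense array stores one byte per cell (a NumPy boolean array); the variable names are only illustrative.

```python
# Shape reported in the ValueError above: (transactions, distinct items)
n_transactions = 4_669_117
n_items = 118_669

# Assumption: the dense one-hot array uses one byte per cell (NumPy bool)
bytes_per_cell = 1

dense_bytes = n_transactions * n_items * bytes_per_cell
dense_gib = dense_bytes / 2**30

print(f"dense one-hot array: {dense_bytes:,} bytes (about {dense_gib:.0f} GiB)")
```

A sparse matrix only stores the cells that are set, so its size scales with the total number of items across all transactions rather than with rows times columns.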

fitted = te.fit(itemSetList)

te_ary = fitted.transform(itemSetList, sparse=True)

df = pd.DataFrame(te_ary, columns=te.columns_)

So my question is: given that I am going to pass the DataFrame to mlxtend's .fpgrowth(), what is the correct way to build it after calling .transform(..., sparse=True)?

Thanks in advance.

geoffreya · 2022-01-08T17:24:14Z


I found the solution. You need to make a couple of small changes where you encode the transactions.
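```python
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder

te = TransactionEncoder()  # itemSetList is your own list of transactions

# Before (dense encoding, ran out of memory):
# te_ary = te.fit(itemSetList).transform(itemSetList)
# df = pd.DataFrame(te_ary, columns=te.columns_)

fitted = te.fit(itemSetList)
te_ary = fitted.transform(itemSetList, sparse=True)  # seemed to work well
df = pd.DataFrame.sparse.from_spmatrix(te_ary, columns=te.columns_)  # seemed to work well
```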
Now you can call mlxtend's fpgrowth() followed by association_rules().

1 reply

rasbt Jan 9, 2022
Maintainer

Awesome, great to hear that the issue was solved.
