Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sparse tensors #1998

Draft
wants to merge 40 commits into
base: main
Choose a base branch
from
Draft

Conversation

McArthur-Alford
Copy link
Contributor

@McArthur-Alford McArthur-Alford commented Jul 9, 2024

Pull Request Template

This is a draft PR for sparse tensor support. It is probably still quite a ways away from being finished in my eyes, however I wanted to make this early to get some eyes on the matter throughout the process. Because of the nature of this draft, I haven't really done much documenting/testing at all, and it will probably fail many tests. Ill get around to those problems once the code is stable.

Right now there isn't a lot of functionality (though spmm for the COO tensor is working nicely). The goal at the moment is to get feedback on the overall architectural choices. I'm pretty happy, but there are some things I suspect could be done better/differently, and Id like to make those kinds of changes before bulk implementing functionality.

Checklist

  • Confirmed that run-checks all script has been executed.
  • Made sure the book is up to date with changes in this PR.

Related Issues/PRs

Changes

My motivation/end goal is to do GNNs. As a result, some of the things I work on might be skewed towards that. If there are other sparse tensor things that you might want, tell me.

What are the main changes?

  • SparseBackend backend extension for sparse functionality. It will contain most of the typical tensor ops you would find in Int/Bool/FloatTensorOps, plus some special sparse ones like spmm. Certain operations are not easily achievable with sparse tensors and require conversion back to dense. Due to the potential performance costs of this, I am intentionally not adding ops, instead requiring the user to deliberately convert back to dense.
  • A new Sparse tensor kind and the associated tensor API functionality (but only for tensors where B: SparseBackend). Right now, Sparse and SparseBackend are in the burn-tensor crate, though I am debating moving them into their own crate.
  • The burn-sparse crate. This provides a SparseDecorator struct that implements SparseBackend. It also provides support for multiple representations of sparse tensor. The point of this is to get all backends supporting sparse tensors quickly and easily, at the cost of not having custom kernels, etc. My goals are probably to just get COO, CSR and CSC working before I'm happy with it.
    • One thing I'm unhappy with is the ops.rs file in burn-sparse. I'm not sure if there is a better solution besides macros or introducing a delegation crate (which only makes the problem slightly less verbose). I would really like to hear from anyone if there is a better way to deal with that mess, but I'm pretty sure that with the way rust is its effectively unavoidable. This feels like a future headache for maintenance purposes.

Some things I haven't done but intend to address/am seeking ideas on:

  • The TensorData struct. I haven't even looked at this yet, but it seems like it is built purely for dense tensors. Given sparse tensors can have quite different representations, so I'm not sure wrapping it in an enum would make sense. Unfortunately, I have to do something with this, because It feels like a bad solution to make users pull the coordinates/values or other representation specific details out of their tensor without an API.
  • SparseInt and SparseBool tensors. This would add quite a lot of complexity, I'm trying to gauge if its even necessary. Does anyone know a use case for these that cant reasonably be done with float tensors?
  • Implementing display for sparse tensors. This seems like a potential headache, and with how Display is implemented on tensors currently I have run into some issues. I'll probably solve this myself eventually, but if anyone has good ideas please let me know 😄
    • It would be so nice if I could make the current Display implementation only apply to non-sparse backends, and have an extra. Not sure if rust can handle that yet though.

Testing

I've done no testing so far, beyond a little demo. If you want to get a sense for the api, the below code should run:

use burn::backend::ndarray::NdArrayDevice;
use burn::backend::sparse::*;
use burn::backend::NdArray;
use burn::tensor::sparse::*;
use burn::tensor::Float;
use burn::tensor::Tensor;

fn main() {
    type B = SparseDecorator<NdArray, SparseCOO>;
    let device = NdArrayDevice::Cpu;

    let sparse: Tensor<B, 2, Sparse> = Tensor::from_floats(
        [
            [1.0, 0.0, 0.0, 0.0, 0.0],
            [0.0, 2.0, 1.0, 0.0, 0.0],
            [0.0, 1.0, 2.0, 0.0, 0.0],
            [0.0, 0.0, 0.0, 1.0, 0.0],
        ],
        &device,
    )
    .into_sparse();
    println!(
        "Sparse Coordinates: \n{}",
        sparse.clone().into_primitive().coordinates
    ); // display on sparse directly is broken, this is a jank temporary workaround
    println!(
        "Sparse Values: \n{}",
        sparse.clone().into_primitive().values
    ); // display on sparse directly is broken, this is a jank temporary workaround

    let dense: Tensor<B, 2, Float> = Tensor::from_floats(
        [[1.5, 0.0], [0.0, 2.0], [8.0, 1.0], [0.0, 0.0], [1.0, -5.0]],
        &device,
    );
    println!("Dense: {}", dense);

    let output = sparse.spmm(dense);

    println!("Sparse x Dense: {}", output);
}

@McArthur-Alford McArthur-Alford changed the title Sparse tensor Sparse tensors Jul 9, 2024
{
type SparseTensorPrimitive<const D: usize> = SparseCOOTensor<B, D>;

fn sparse_to_sparse<const D: usize>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be dense_to_sparse?

Copy link
Contributor Author

@McArthur-Alford McArthur-Alford Aug 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with you. Really it should be sparse_dense_to_sparse based on the naming convention (right now its sparse_to_dense to represent the to_dense function on a sparse tensor)... but that's very verbose and/or confusing and I like dense_to_sparse better.

Given everything is broken out into its own crate/extension I could probably just drop the sparse_ at the start of some of these though? It originally came from the other tensor kinds using this naming scheme before I broke everything out into it's own crate.

@McArthur-Alford
Copy link
Contributor Author

I've finally done it... I think I landed on a good sparse API. 😪

I have spent a lot of time agonizing over how to make sparse tensors fit into the API nicely. Every solution I found felt like it was compromising on something, so I eventually settled on something that did the job, but had some inconveniences in the user facing API, and really annoying delegation issues... but no more! Now I have a solution that leaves me satisfied (though it did push my knowledge of rusts type system to its limits, which might not be the best sign 👀 ).

    // Tensor representations:
    let sparse_tensor = Tensor::<B, 1, Float, Sparse<COO, B>>::empty(...);
    let dense_tensor = Tensor::<B, 1, Float, Dense>::empty(...);
    let dense_tensor = Tensor::<B, 1, Float>::empty(...);

    // And other types work fine too:
    let sparse_tensor = Tensor::<B, 1, Bool, Sparse<COO, B>>::empty(...);
    let sparse_tensor = Tensor::<B, 1, Int, Sparse<COO, B>>::empty(...);

Id appreciate some feedback on the implementation as it is now (specifically, the changes in burn-tensor, burn-sparse is old and to be renovated still). I think this is a solution that doesn't compromise on anything and requires fairly minimal amounts of rewriting (it compiles so far). The general idea, is it adds a representation generic which defaults to Dense. That way we can do all the good stuff above, but leave the creation of arbitrary specific sparse representations up to the backends themselves, and potentially add other representations in future (not even sure what that could be). It also completely removes the need for 1000 lines of delegation I previously had on my decorator backend.

@laggui
Copy link
Member

laggui commented Aug 23, 2024

Sorry I haven't had the time to take a good look at this, I'll make time to give you some feedback at the beginning of next week 🙂

@McArthur-Alford
Copy link
Contributor Author

McArthur-Alford commented Aug 23, 2024

Sorry I haven't had the time to take a good look at this, I'll make time to give you some feedback at the beginning of next week 🙂

Very appreciated. Ive spent some time cleaning things up a lot, so honestly its probably for the best you didn't have time until now 😄. I threw together a quick mock version of what I'm doing. Given it helped me figure everything out, it will surely also be helpful to you guys navigating all the changed files easier.


Im not yet set on the names (they don't feel right to me), but the final representation I settled on has the TensorKind and TensorStorage traits. TensorKind is as it was previously.

trait TensorKind<Backend> {
    type Primitive;
}

trait TensorStorage<Backend> {}

struct Float;
struct Bool;
struct Int;

struct Dense;
struct Sparse<B: Backend, SR: SparseStorage<B>> {
    p: PhantomData<(B, SR)>,
}

An important part of the above is that we have SparseStorage. A big problem I wanted to solve was 1) not forcing all existing backends to have to go and implement sparse tensor support, and 2) not force them all the have the same representations. As such, we can use this trait to let backends define their own (and my decorator can easily just support all backends).

/// This lets us offload the representation details to the backend
/// e.g. some might support COO, CSC, etc
trait SparseStorage<B> {
    type Primitive<K: TensorKind<B>>;
}
We can then implement all of these things as usual. Folded because its not really interesting.
// Tensorkind implementations as usual
impl<B: Backend> TensorKind<B> for Float {
    type Primitive = B::FloatPrimitive;
}
impl<B: Backend> TensorKind<B> for Bool {
    type Primitive = B::BoolPrimitive;
}
impl<B: Backend> TensorKind<B> for Int {
    type Primitive = B::IntPrimitive;
}

// Storage implementations
impl<B: Backend> TensorStorage<B> for Dense {}
impl<B: Backend, SR: SparseStorage<B>> TensorStorage<B> for Sparse<B, SR> {}

Now the part that gives me the most hesitancy as to whether this is a good approach is where we combine the two to get the correct primimtive. That is done using the TensorRepr trait (once again, naming is not final, I really dislike kind/storage/repr, they are not satisfying names):

trait TensorRepr {
    type Primitive;
}

impl<B: Backend, K: TensorKind<B>> TensorRepr for (B, K, Dense) {
    type Primitive = K::Primitive;
}
impl<B: Backend, K: TensorKind<B>, SR: SparseStorage<B>> TensorRepr for (B, K, Sparse<B, SR>) {
    type Primitive = SR::Primitive<K>;
}

// Then when we define tensors:
struct Tensor<B: Backend, K: TensorKind<B> = Float, S: TensorStorage<B> = Dense>
where
    (B, K, S): TensorRepr,
{
    primitive: <(B, K, S) as TensorRepr>::Primitive,
}

Because rust has no specialisation (or meaningful negative trait bounds), and because of how I've broken up the kinds and representations, we get somewhat infectious where clauses in the code. This is because we can no longer guarantee that (B, K, S) is actualy TensorRepr, while the old system which only had kinds was simple enough for rust to figure that out. This infectious where clause doesn't extend outside of the burn-tensor crate (as far as I can tell, i had to change no other code to get it compiling) so the user facing API is clean and other backends are safe. Still... there are a lot of places I had to go through in burn-tensor and add these in. Its a bit ugly, but everything works pretty well. That said, it certainly feel like it's pushing rusts type system to its limits a little bit. I've run into so many issues hitting the limits of what we can do in rust before I settled on this approach. (I certainly learned a lot about the type system doing it!!!).


The mockup above is describing my changes to burn-tensor. There are also changes in the burn-sparse crate. A lot less complete, but they contain the implementation of the backend agnostic sparse representation. A lot of operations there that have gotten quite complex trying to use an exclusively scatter/gather/reshape approach (particularly given burn has a rather limited gather/scatter at the time of writing).

Something like below will compile, if you want to test things:

Code
use burn::backend::ndarray::NdArrayDevice;
use burn::backend::sparse::COO;
use burn::backend::NdArray;
use burn::prelude::*;

fn main() {
    type B = NdArray;

    let device = NdArrayDevice::Cpu;

    let dense_tensor = Tensor::<B, 2>::from_floats(
        [
            [1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0],
        ],
        &device,
    );

    let sddmm_l = Tensor::<B, 2>::from_floats([[1, 1, 1, 1]], &device).transpose();
    let sddmm_r = Tensor::<B, 2>::from_floats([[1, 2, 3, 4]], &device);

    let sparse_tensor = Tensor::<B, 2>::from_floats(
        [
            [0.0, 0.0, 0.1, 1.0],
            [4.0, 0.0, 2.0, 0.0],
            [2.5, 0.0, 0.0, 0.0],
            [0.0, 8.0, 0.0, 1.0],
        ],
        &device,
    )
    .into_sparse::<COO>();

    // let multiplied = sparse_tensor.clone().spmm(dense_tensor.clone());
    let spmm = sparse_tensor.clone().spmm(dense_tensor.clone());
    let sddmm = sparse_tensor.clone().sddmm(sddmm_l, sddmm_r);

    println!("{}", spmm);
    println!("{}", sddmm.into_dense());

    let sparse_tensor = sparse_tensor.unsqueeze::<3>().repeat_dim(0, 2);
    println!("{}", sparse_tensor.into_dense());
}

@laggui
Copy link
Member

laggui commented Aug 27, 2024

First of all, great work so far! It seems like you've come a long way 👏

Regarding the implementation, I went through the suggested changes in the PR a couple of times and I'm not entirely convinced yet with the added traits required just to add a sparse tensor primitive. While the user facing API remains essentially unchanged (very nice!), at this stage it feels like the implementation is only halfway to a backend decorator. I do not necessarily mind the additional TensorStorage generic, but in its current form it doesn't feel right (mainly because its only purpose is to wrap the sparse tensor primitive via the Sparse struct). I think there's just too many layers in its current form.

I'll come back to this draft later this week to see if my mind has changed 😅 and otherwise try to come up with alternatives to help out.

@McArthur-Alford
Copy link
Contributor Author

McArthur-Alford commented Aug 27, 2024

First of all, great work so far! It seems like you've come a long way 👏

Regarding the implementation, I went through the suggested changes in the PR a couple of times and I'm not entirely convinced yet with the added traits required just to add a sparse tensor primitive. While the user facing API remains essentially unchanged (very nice!), at this stage it feels like the implementation is only halfway to a backend decorator. I do not necessarily mind the additional TensorStorage generic, but in its current form it doesn't feel right (mainly because its only purpose is to wrap the sparse tensor primitive via the Sparse struct). I think there's just too many layers in its current form.

I'll come back to this draft later this week to see if my mind has changed 😅 and otherwise try to come up with alternatives to help out.

This is pretty much how I feel about it. It is the "best" solution I've found (no breaking changes, quite extensible, lets backends choose how to implement), I just wish the implementation could be simpler... It feels convoluted. Not convoluted enough i couldn't live with it, but enough that I hope there is a better way.

And yes, it feels like a very round-about way of doing what a decorator would. There were some issues I ran into with the decorator + backend extension combo unfortunately.

  1. BasicOps and Numeric have a lot of functions that return tensors of a fixed type. E.g. any/all return a bool tensor, when ideally they would return a sparse bool tensor if called on a sparse tensor. I couldn't find a satisfying way to fix this, without duplicating all of BasicOps/Numeric (not desirable).
  2. Unlike 1 where massive duplication of code could be avoided by simply panicking, the decorator needed to delegate. There have been a dozen things on nightly that I wish I could have while working on this, and proper delegation support is one of them. Unfortunately that is nightly, so without it I ended up with something like 6k lines of meaningless boilerplate that broke every time there were changes to main, just to get delegation working.
  3. No easy way to have transitions between representations in the API (orphan rule ruined my day!!!! 😠), or representations at all. The backend decorator was for a single representation, whilst a lot of sparse use cases would benefit greatly from being able to swap back and forth quite quickly (e.g. between CSR and CSC).

Since I made my last comment I have begun to suspect it might be possible to combine the backend extension + representation approaches together into a happy middle ground? Rip all the traits/functionality out of tensor storage and put them in a backend extension which takes the TensorStorage as a generic. Then implement the sparse API on Tensor<..., Sparse<SR> where B implements the sparse extension specifically for SR.

Instead of a decorator, I could just do a blanket implementation of this extension (for the decorator COO SparseStorage only at the moment) on all backends (not sure if this will work). That would eliminate the delegation issues. Individual backends can implement the extension with a different TensorStorage type later, if they want to support their own/more kinds.

Sorry if that was a bit of a mess of words, its 3am as I write this 😅.

I definitely intend to explore that approach soonish. For now though I want to get all of the operations implemented for the decorator so that it is complete (because changing the API doesn't invalidate that code, luckily). Gives you time to think on it too, while I do that.

@laggui
Copy link
Member

laggui commented Aug 27, 2024

@McArthur-Alford Thank you for sharing the issues you ran into and your thought process! It helps to know why some decisions were made a bit more specifically.

To give a more thorough review with appropriate suggestions I feel like I would have to dive into it a bit more to iterate over some design choices (which I would like to do). We just released today so I should be able to reserve some time for that in the following week 🤞

Since I made my last comment I have begun to suspect it might be possible to combine the backend extension + representation approaches together into a happy middle ground? Rip all the traits/functionality out of tensor storage and put them in a backend extension which takes the TensorStorage as a generic. Then implement the sparse API on Tensor<..., Sparse where B implements the sparse extension specifically for SR.

I think this would alleviate my biggest pain point with the current implementation. I feel like Tensor::<B, 1, Float, Sparse<COO, B>> is a bit clunky with the abstractions, and this change sounds it would address that.

For now though I want to get all of the operations implemented for the decorator so that it is complete (because changing the API doesn't invalidate that code, luckily).

Yes please go ahead 🙏 don't want to slow you down hehe

Copy link
Contributor

This PR has been marked as stale because it has not been updated for over a month

@github-actions github-actions bot added the stale The issue or pr has been open for too long label Sep 29, 2024
@laggui laggui removed the stale The issue or pr has been open for too long label Sep 30, 2024
@McArthur-Alford
Copy link
Contributor Author

This PR has been marked as stale because it has not been updated for over a month

Apologies bot. I am in fact still alive, just got incredibly busy the last few weeks. Should be back to this soonish.

@laggui
Copy link
Member

laggui commented Oct 1, 2024

This PR has been marked as stale because it has not been updated for over a month

Apologies bot. I am in fact still alive, just got incredibly busy the last few weeks. Should be back to this soonish.

No worries, I have been quite busy as well 😅

And don't worry about the conflicts for now, we just refactored the tensor ops to remove the const D generic for primitives.

Copy link
Contributor

github-actions bot commented Nov 1, 2024

This PR has been marked as stale because it has not been updated for over a month

@github-actions github-actions bot added the stale The issue or pr has been open for too long label Nov 1, 2024
@McArthur-Alford
Copy link
Contributor Author

This PR has been marked as stale because it has not been updated for over a month

Im still not dead, I promise. Probably about a week or two away from having the time to actually go through and get this thing done.

@laggui laggui removed the stale The issue or pr has been open for too long label Nov 1, 2024
Copy link
Contributor

github-actions bot commented Dec 2, 2024

This PR has been marked as stale because it has not been updated for over a month

@github-actions github-actions bot added the stale The issue or pr has been open for too long label Dec 2, 2024
@laggui laggui removed the stale The issue or pr has been open for too long label Dec 2, 2024
Copy link
Contributor

github-actions bot commented Jan 2, 2025

This PR has been marked as stale because it has not been updated for over a month

@github-actions github-actions bot added the stale The issue or pr has been open for too long label Jan 2, 2025
@laggui laggui removed the stale The issue or pr has been open for too long label Jan 2, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants