Try migration to dragonfly #269

Open
alexey-yarmosh opened this issue Jan 3, 2023 · 4 comments

Comments

@alexey-yarmosh
Member

It seems that all the Redis JSON operations we use (get, set, strappend) have now been implemented by the Dragonfly team and contributors (https://github.com/dragonflydb/dragonfly/releases/tag/v0.12.0). Since missing JSON support was a main blocker to using Dragonfly, we can try migrating to it and measure the performance difference.

Questions to answer:

  • Does our API work correctly after the migration?
  • Do utility tools work fine (monitoring, integration with New Relic, DB management tools)?
  • What is the perf diff?
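
One way to approach the first question is to send the same commands to both servers and compare the replies. The sketch below is only an illustration: the key `demo:doc`, the helper `diff_replies`, and the executor callables are hypothetical names, not part of our codebase. Each executor is any function that takes a command as a sequence of strings and returns the server's reply.

```python
import json
from typing import Callable, List, Sequence, Tuple

Command = Sequence[str]
Executor = Callable[[Command], object]

# The three JSON commands named above, applied to a hypothetical key.
COMMANDS: List[Command] = [
    ("JSON.SET", "demo:doc", "$", json.dumps({"log": "a"})),
    ("JSON.STRAPPEND", "demo:doc", "$.log", json.dumps("b")),
    ("JSON.GET", "demo:doc", "$.log"),
]


def diff_replies(
    reference: Executor,
    candidate: Executor,
    commands: Sequence[Command] = COMMANDS,
) -> List[Tuple[Tuple[str, ...], object, object]]:
    """Run each command on both backends and collect the replies that differ."""
    mismatches = []
    for command in commands:
        expected = reference(command)  # reply from the current Redis setup
        actual = candidate(command)    # reply from the server under test
        if expected != actual:
            mismatches.append((tuple(command), expected, actual))
    return mismatches
```

With redis-py, for example, an executor could be `lambda cmd: client.execute_command(*cmd)` for a client connected to each server. An empty result only means these three commands agree; it says nothing about the rest of the command set.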
@MartinKolarik
Member

I don't think this is necessary at all, because Redis can scale just as well or better in cluster mode; we just don't use it yet. The client library supports clustering, so with small config changes we should be able to utilize all cores with Redis too, without switching to an entirely different DB.

The benchmarks at https://redis.com/redis-enterprise/technology/linear-scaling-redis-enterprise/ suggest Redis achieves almost 100% linear scaling, while Dragonfly showcases a "25x speedup", but that's on a 32 core / 64 thread server, so at least at first sight it scales worse than Redis.
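
The arithmetic behind that comparison can be sketched as follows. It assumes the quoted 25x is a speedup over a single-core baseline, which is my reading and not something the benchmark pages confirm; `scaling_efficiency` is a hypothetical helper name. Perfectly linear scaling would give 100%.

```python
def scaling_efficiency(speedup: float, parallel_units: int) -> float:
    """Fraction of ideal linear scaling achieved (1.0 means perfectly linear)."""
    return speedup / parallel_units


# 25x on a 32 core / 64 thread server, counted per core and per thread.
print(f"{scaling_efficiency(25, 32):.0%}")  # prints 78%
print(f"{scaling_efficiency(25, 64):.0%}")  # prints 39%
```

The two vendors' benchmarks use different hardware and workloads, so this is a rough sanity check and not a like-for-like comparison.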

@jimaek
Member

jimaek commented Mar 14, 2023

Clustering Redis reliably is a pain, especially on the OSS version.
So I am 100% for migrating to Dragonfly.

@MartinKolarik
Member

MartinKolarik commented Dec 28, 2024

I looked into this more closely. The two main reasons for a possible migration are the zero-setup scaling and the new SSD Data Tiering. On paper, it really seems like switching to Dragonfly might be the easiest way to deal with scaling both the load and the storage requirements.

Unfortunately, the way Dragonfly presents itself is somewhat misleading. It may be "almost Redis", but it is definitely not a "drop-in Redis replacement". It looks like they copied the Redis documentation 1:1 but didn't test against it very well and, in some cases, didn't even implement some of the documented features. Just running our test suite revealed three bugs:

Given the nature of these bugs, I have no confidence that we wouldn't hit more if we started to use it, as their tests clearly don't cover lots of stuff.

Additionally:

The missing commands could be worked around on our side by forking and editing the affected packages, but the other bugs are more serious, and even if they got fixed soon, I wouldn't trust Dragonfly enough to use it as one of the main components of our system.

For scaling, proper cluster configuration of redis is going to be the best option. On a single server, it shouldn't cause any big issues, and it'll also resolve our problem with slow RDB loading on startup. The performance is likely going to be better than Dragonfly's.
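
One thing to keep in mind with cluster mode is that Redis Cluster assigns every key to one of 16384 hash slots (CRC16 of the key, modulo 16384), and multi-key operations only work when all keys land in the same slot. A `{...}` hash tag in the key name forces that. Below is a minimal sketch of the slot calculation as described in the Redis Cluster specification; the function names and the example keys are illustrative, not taken from our code.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16 variant used by Redis Cluster (polynomial 0x1021, initial value 0)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Return the hash slot (0..16383) a key maps to."""
    raw = key.encode()
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        # Only a non-empty {...} section counts as a hash tag.
        if end != -1 and end != start + 1:
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384


# Hypothetical key names: both share the tag "m:123", so they share a slot.
assert key_slot("{m:123}:result") == key_slot("{m:123}:meta")
```

On a running cluster, `CLUSTER KEYSLOT <key>` reports the same number, which is a quick way to check key naming before relying on multi-key commands.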

As for long-term storage of measurement results, we can implement it relatively easily by using a separate DB (Maria) or even an object storage (S3) for finished measurements. Alternatively, we can consider this a bit later as a part of #291 since "long term storage" is required there as well.

@jimaek
Member

jimaek commented Dec 28, 2024

Adding yet another DB to manage (Maria) is not ideal; the current system is much simpler. A timeseries DB for non-timeseries long-term data storage is also not ideal. S3 sounds better as there is nothing to manage, but I worry about the costs related to requests and bandwidth.

I guess we could try Redis clustering within a single server, see how it goes, and then consider moving measurements. But I have little hope for OSS Redis; they keep adding the best stuff to the paid version.
