Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ideas - Feel free to post ideas. #82

Open
ruslandoga opened this issue May 1, 2023 · 13 comments
Open

Ideas - Feel free to post ideas. #82

ruslandoga opened this issue May 1, 2023 · 13 comments

Comments

@ruslandoga
Copy link
Contributor

ruslandoga commented May 1, 2023

Feel free to post ideas.

That idea itself is stolen from youki-dev/youki#10

@ruslandoga ruslandoga changed the title Ideas Ideas - Feel free to post ideas. May 1, 2023
@aerosol
Copy link
Member

aerosol commented May 2, 2023

Livebook smart cell :)

@hkrutzer
Copy link

hkrutzer commented May 2, 2023

Streaming the results of queries would be great. I think it requires setting the connection mode to active though.

@ruslandoga
Copy link
Contributor Author

ruslandoga commented May 2, 2023

Streaming the results of queries would be great. I think it requires setting the connection mode to active though.

There is an undocumented Ch.stream function that is used in the benchmarks. It doesn't decode the raw data into rows very well yet (right now it only does the decoding if :types option is provided).

Here's a minimal example:

{:ok, pid} = Ch.start_link()

Ch.run(pid, fn conn ->
  conn
  |> Ch.stream("select number from system.numbers limit {limit:UInt64}", %{"limit" => 1_000_000}, types: ["UInt64"]) 
  |> Stream.each(fn rows -> IO.puts("at #{Time.utc_now()} got #{length(List.flatten(rows))} rows") end)
  |> Stream.run()
end)
at 13:24:49.737616 got 2560 rows
at 13:24:49.742150 got 36864 rows
at 13:24:49.752535 got 50657 rows
at 13:24:49.760068 got 51169 rows
at 13:24:49.765942 got 50688 rows
at 13:24:49.771635 got 51169 rows
at 13:24:49.776692 got 50657 rows
at 13:24:49.782096 got 51169 rows
at 13:24:49.787199 got 50657 rows
at 13:24:49.792197 got 51200 rows
at 13:24:49.797056 got 50657 rows
at 13:24:49.802442 got 51169 rows
at 13:24:49.807619 got 50657 rows
at 13:24:49.812214 got 51200 rows
at 13:24:49.816771 got 50657 rows
at 13:24:49.821738 got 51169 rows
at 13:24:49.826986 got 51169 rows
at 13:24:49.831561 got 50657 rows
at 13:24:49.836822 got 51200 rows
at 13:24:49.841795 got 50657 rows
at 13:24:49.846984 got 44018 rows

And some tests.

@hkrutzer
Copy link

That's great! Will it support not having to specify types in the future?

@ruslandoga
Copy link
Contributor Author

@hkrutzer yeah, most likely. We'd "just" need to decode types from RowBinaryWithNamesAndTypes and store them in the accumulator.

@jaronoff97
Copy link

is there a way to hook into clickhouse's WATCH functionality?

@hkrutzer
Copy link

I think that functionality is deprecated and will be removed: https://clickhouse.com/docs/en/sql-reference/statements/create/view#live-view-deprecated

@ruslandoga
Copy link
Contributor Author

ruslandoga commented May 30, 2024

👋 @jaronoff97

I wasn't able to create a live view on clickhouse/clickhouse-server:24.3.3.102-alpine but either way it should be possible to consume it with Ch.stream like in #82 (comment)

eb6fbe5ed437 :) CREATE LIVE VIEW lv WITH REFRESH 5 AS SELECT now();

Syntax error: failed at position 21 ('WITH'):

CREATE LIVE VIEW lv WITH REFRESH 5 AS SELECT now();

Expected one of: token, Dot, UUID, ON, TO, OpeningRoundBracket, sql security, DEFINER, SQL SECURITY, AS

I expect something like this to work, timeout might be tricky:

{:ok, pid} = Ch.start_link(pool_size: 1)

DBConnection.run(pid, fn conn ->
  Ch.stream(conn, "watch lv format JSONEachRowWithProgress") 
  |> Stream.each(&IO.inspect/1)
  |> Stream.run()
end, timeout: :infinity)

@hkrutzer
Copy link

Did you run set allow_experimental_live_view = 1 first?

@ruslandoga
Copy link
Contributor Author

ruslandoga commented May 30, 2024

No, but it doesn't seem to have changed much. Somehow it's a syntax error.

eb6fbe5ed437 :) set allow_experimental_live_view = 1;

SET allow_experimental_live_view = 1

Query id: e40f3eb8-548e-4b3d-bc0e-8f92456f076c

Ok.

0 rows in set. Elapsed: 0.001 sec.

eb6fbe5ed437 :) CREATE LIVE VIEW lv WITH REFRESH 5 AS SELECT now();

Syntax error: failed at position 21 ('WITH'):

CREATE LIVE VIEW lv WITH REFRESH 5 AS SELECT now();

Expected one of: token, Dot, UUID, ON, TO, OpeningRoundBracket, sql security, DEFINER, SQL SECURITY, AS

eb6fbe5ed437 :)

@jaronoff97
Copy link

ah okay, thank you! 🙇

@ananthakumaran
Copy link
Contributor

@ruslandoga I tried Ch.stream for select query, but I can't get the Ch.RowBinary.decode_rows to work reliably. It does work for simple queries, but it breaks for complex queries (or for any queries that return a lot of data back (eg SELECT number, randomFixedString(1024) FROM system.numbers_mt LIMIT {limit:UInt64}). I am assuming this is because Ch.Result{data: data} doesn't have any concept of alignment? and decode_rows only works if the data is aligned at row level?. I still can make it work with FORMAT CSV and doing my csv parse_stream, but just want to confirm if that is the only way to get this work reliably.

@ruslandoga
Copy link
Contributor Author

ruslandoga commented Dec 28, 2024

👋 @ananthakumaran

I am assuming this is because Ch.Result{data: data} doesn't have any concept of alignment? and decode_rows only works if the data is aligned at row level?

Yes. Right now Ch.stream is used in Plausible only as a "raw" data pipe into a Zip file.

We would probably need to implement an incremental version of Ch.RowBinary.decode_rows for Ch.stream to work reliably.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants