-
Notifications
You must be signed in to change notification settings - Fork 13.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FLINK-36941][docs] Update DATE_FORMAT Doc and Python Tests #25828
base: master
Are you sure you want to change the base?
Conversation
5a9ea55
to
8e32861
Compare
fyi: it can not be both |
Should it be [FLINK-36941][docs]? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apart from changing the title change - lgtm!
You still need a ✅ from someone else tho ;)
docs/data/sql_functions.yml
Outdated
@@ -642,7 +642,10 @@ temporal: | |||
description: Returns TRUE if two time intervals defined by (timepoint1, temporal1) and (timepoint2, temporal2) overlap. The temporal values could be either a time point or a time interval. E.g., (TIME '2:55:00', INTERVAL '1' HOUR) OVERLAPS (TIME '3:30:00', INTERVAL '2' HOUR) returns TRUE; (TIME '9:00:00', TIME '10:00:00') OVERLAPS (TIME '10:15:00', INTERVAL '3' HOUR) returns FALSE. | |||
- sql: DATE_FORMAT(timestamp, string) | |||
table: dateFormat(TIMESTAMP, STRING) | |||
description: Converts timestamp to a value of string in the format specified by the date format string. The format string is compatible with Java's SimpleDateFormat. | |||
description: Converts timestamp to a value of string in the format specified by the date format string. The format string is compatible with Java's DateTimeFormatter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: to a value of string -> to a string
@@ -350,14 +350,19 @@ def temporal_overlaps(left_time_point, | |||
def date_format(timestamp, format) -> Expression: | |||
""" | |||
Formats a timestamp as a string using a specified format. | |||
The format must be compatible with MySQL's date formatting syntax as used by the | |||
date_parse function. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if this has been removed - do we not need to deprecate first before replacing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@davidradl , the comment I deleted is not accurate.
MySQL's date formatting syntax follows its own date formatting syntax, not Java's SimpleDateFormat
or DateTimeFormatter
. MySQL is written in C++, not Java, and doesn't have much significance as a reference here.
description: Converts timestamp to a value of string in the format specified by the date format string. The format string is compatible with Java's DateTimeFormatter. | ||
- sql: DATE_FORMAT(string, string) | ||
table: dateFormat(STRING, STRING) | ||
description: Re-format a timestamp string to another string, using a custom format. The format string is compatible with Java's SimpleDateFormat. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we do the same for UNIX_TIMESTAMP(string1[, string2])
(https://github.com/apache/flink/pull/25828/files#diff-539fb22ee6aeee4cf07230bb4155500c6680c4cc889260e2c58bfa9d63fb7de5R662) please?
flink/flink-table/flink-table-common/src/main/java/org/apache/flink/table/utils/DateTimeUtils.java
Line 1468 in 7fa4f78
public static long unixTimestamp(String dateStr, String format, TimeZone tz) { |
flink/flink-table/flink-table-common/src/main/java/org/apache/flink/table/utils/DateTimeUtils.java
Line 960 in 7fa4f78
private static long internalParseTimestampMillis(String dateStr, String format, TimeZone tz) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming I've read
Line 2304 in 7fa4f78
public static final BuiltInFunctionDefinition UNIX_TIMESTAMP = |
unixTimestamp(dateStr, format, null)
as there is no tz
third parameter in the SQL function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nictownsend sounds like a reasonable idea, @snuyanzin @yiyutian1 if you agree we could raise this separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies, I hadn't appreciated the scope of the underlying Flink issue, happy to raise a new one if it's not appropriate to fix here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a slightly different thought
Since it is Flink 2.0 and anyway it brings some breaking changes. May be we can completely switch to DateTimeFormatter in 2.0 while for 1.19 and 1.20 we just fix documentation.
However this option requires a discussion in ML
i can start one if no objections
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@snuyanzin I like the idea.
Before that happens, should we fix the current doc?
If to only commit for 1.19 and 1.20, should I create PR against release-1.19 and release-1.20?
For example `date_format(col("time"), "%Y, %d %M")` results in strings formatted as | ||
"2017, 05 May". | ||
Supported functions: | ||
1. date_format(TIMESTAMP, STRING) -> STRING |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should mention the difference between the supported format strings.
@@ -299,6 +299,10 @@ def test_expressions(self): | |||
lit(2).hours))) | |||
self.assertEqual("dateFormat(time, '%Y, %d %M')", | |||
str(date_format(col("time"), "%Y, %d %M"))) | |||
self.assertEqual("dateFormat(toTimestamp('1970-01-01 08:01:40'), 'yyyy-MM-dd HH:mm:ss')", | |||
str(date_format(to_timestamp('1970-01-01 08:01:40'), "yyyy-MM-dd HH:mm:ss"))) | |||
self.assertEqual("dateFormat('1970-01-01 08:01:40', 'yyyy-MM-dd HH:mm:ss')", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you add a test with the format that the DateTimeFormatter could cope with - with precision and timezone
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @davidradl , sure I can add that.
Also want to point out that these test_expressions don't test any implementations, they just test the "signature" of defined functions.
docs/data/sql_functions.yml
Outdated
@@ -642,7 +642,10 @@ temporal: | |||
description: Returns TRUE if two time intervals defined by (timepoint1, temporal1) and (timepoint2, temporal2) overlap. The temporal values could be either a time point or a time interval. E.g., (TIME '2:55:00', INTERVAL '1' HOUR) OVERLAPS (TIME '3:30:00', INTERVAL '2' HOUR) returns TRUE; (TIME '9:00:00', TIME '10:00:00') OVERLAPS (TIME '10:15:00', INTERVAL '3' HOUR) returns FALSE. | |||
- sql: DATE_FORMAT(timestamp, string) | |||
table: dateFormat(TIMESTAMP, STRING) | |||
description: Converts timestamp to a value of string in the format specified by the date format string. The format string is compatible with Java's SimpleDateFormat. | |||
description: Converts timestamp to a value of string in the format specified by the date format string. The format string is compatible with Java's DateTimeFormatter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am curious what happens if you supply a TIMESTAMP_TZ into this function, is this expected to work?
8e32861
to
ff312e1
Compare
ff312e1
to
238c093
Compare
What is the purpose of the change
To update the documentation for
DATE_FORMAT
, which is not accurate now. The current function behaviors are:DATE_FORMAT(timestamp, string)
eventually callsformatTimestamp(TimestampData ts, String format, ZoneId zoneId)
inDateTimeUtils.java
, which uses DateTimeFormatter.DATE_FORMAT(string, string)
eventually callsformatTimestampStringWithOffset(String dateStr, String fromFormat, String toFormat, TimeZone tz, long offsetMills)
inDateTimeUtils.java
, which uses SimpleDateFormat.Brief change log
DATE_FORMAT
.DATE_FORMAT
functions.Verifying this change
Please make sure both new and modified tests in this PR follow the conventions for tests defined in our code quality guide.
Added python tests for for specific
DATE_FORMAT
functions.Does this pull request potentially affect one of the following parts:
@Public(Evolving)
: (no)Documentation