generated from duckdb/extension-template
Open
Description
What happens?
When inserting into a table partitioned by a datepart function (ALTER TABLE example SET PARTITIONED BY (YEAR(ts));), nonsensical values are returned and used in the hive partitioning scheme. This has occurred on multiple datasets I have tried.
Example from my local Minio Object Store:
- year=9999
- year=-1
- year=-1005240734263017374
- year=-4521051002081600685
- year=-4521321939762871114
- year=-4528713560160304931
- year=-4528721211045426796
- year=-4534128942037135759
- year=-4539346949412749312
- year=-4543984170262200808
- year=-4548000494483161079
- year=-4550822401377224700
- year=-4557031394559129541
- year=-4561963618355968950
- year=-4565172248142360563
- year=-4580933227769618360
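As a quick way to triage a listing like the one above, the following minimal sketch flags `year=<n>` partition names whose value falls outside a plausible range. The function name `suspicious_year_partitions`, the 1900 to 2100 window, and the `year=2023` sample entry are hypothetical choices made for illustration; they are not part of DuckDB or DuckLake, and the window should be adjusted to the data set.

```python
import re

# Assumed plausibility window for calendar years (hypothetical; tune per data set).
MIN_YEAR, MAX_YEAR = 1900, 2100

def suspicious_year_partitions(names, lo=MIN_YEAR, hi=MAX_YEAR):
    """Return the year values from 'year=<int>' names that fall outside [lo, hi]."""
    pattern = re.compile(r"^year=(-?\d+)$")
    bad = []
    for name in names:
        match = pattern.match(name.strip().rstrip("/"))
        if match is None:
            continue  # not a year partition directory name
        year = int(match.group(1))
        if not lo <= year <= hi:
            bad.append(year)
    return bad

# 'year=2023' is an invented example of a sane partition; the rest come from the listing.
listing = ["year=2023", "year=9999", "year=-1", "year=-1005240734263017374"]
print(suspicious_year_partitions(listing))
```

Running the same check against the full object-store listing would show whether any partition at all received a sensible year, which helps separate bad source timestamps from a problem in the partitioning step.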
To Reproduce
CREATE TABLE IF NOT EXISTS example AS
SELECT *
FROM read_parquet('s3://lake/landing/example_tbl/*.parquet', union_by_name=true)
WHERE 0=1;
ALTER TABLE example SET PARTITIONED BY (YEAR(timestamp));
INSERT INTO example
SELECT *
FROM read_parquet('s3://lake/landing/example_tbl/*.parquet', union_by_name=true);
OS:
Windows
DuckDB Version:
1.4.3
DuckLake Version:
0.3
DuckDB Client:
Python
Hardware:
No response
Full Name:
Ryan Black
Affiliation:
none
What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.
I have tested with a stable release
Did you include all relevant data sets for reproducing the issue?
No - I cannot share the data sets because they are confidential
Did you include all code required to reproduce the issue?
- Yes, I have
Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?
- Yes, I have