Optimize date_part Minute by avoiding unnecessary computation #14043

Open
jayzhan211 opened this issue Jan 8, 2025 · 2 comments
Labels
enhancement New feature or request

Comments

@jayzhan211
Contributor

Is your feature request related to a problem or challenge?

This issue tracks the status of the date_part Minute optimization.

Related
#13449
apache/arrow-rs#6746

Describe the solution you'd like

I guess there are some changes in arrow-rs left to do.

Describe alternatives you've considered

No response

Additional context

impl ExtractDatePartExt for PrimitiveArray<TimestampSecondType> {
    fn date_part(&self, part: DatePart) -> Result<Int32Array, ArrowError> {
        // TimestampSecond only encodes number of seconds, so these will always be 0
        let array =
            if let DatePart::Millisecond | DatePart::Microsecond | DatePart::Nanosecond = part {
                Int32Array::new(vec![0; self.len()].into(), self.nulls().cloned())
            } else if let Some(tz) = get_tz(self.data_type())? {
                let map_func = get_date_time_part_extract_fn(part);
                self.unary_opt(|d| {
                    timestamp_s_to_datetime(d)
                        .map(|c| Utc.from_utc_datetime(&c).with_timezone(&tz))
                        .map(map_func)
                })
            } else {
                let map_func = get_date_time_part_extract_fn(part);
                self.unary_opt(|d| timestamp_s_to_datetime(d).map(map_func))
            };
        Ok(array)
    }
}

If I remember correctly, we need to switch from timestamp_s_to_datetime to timestamp_s_to_time and extract the minute from the resulting time.
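
To illustrate why the date portion is unnecessary for Minute, here is a minimal std-only sketch, not the arrow-rs implementation. It assumes a timestamp in seconds with no time zone (the last branch in the snippet above); a zone whose offset is not a whole number of hours would still need the offset applied first. The function names minute_from_timestamp_s and minutes_from_timestamps_s are hypothetical.

```rust
/// Minute of the hour (0..=59) for a Unix timestamp in seconds, read as UTC.
/// Hypothetical helper: it never builds a date, only the time-of-day remainder.
fn minute_from_timestamp_s(secs: i64) -> i32 {
    // rem_euclid keeps the remainder in 0..3600 even for pre-1970 (negative) values.
    (secs.rem_euclid(3_600) / 60) as i32
}

/// Apply the helper to a nullable column, preserving nulls as None.
fn minutes_from_timestamps_s(values: &[Option<i64>]) -> Vec<Option<i32>> {
    values
        .iter()
        .map(|v| v.map(minute_from_timestamp_s))
        .collect()
}
```

For example, 5025 seconds is 01:23:45, so the helper returns 23, and -1 (23:59:59 on the previous day) returns 59. Whether this beats timestamp_s_to_time in practice would need to be confirmed by benchmarking.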

jayzhan211 added the enhancement label on Jan 8, 2025
@jayzhan211
Contributor Author

I think this is a good first issue for getting familiar with optimization and benchmarking code.

@samsond
samsond commented Jan 8, 2025

Take
