Improve docs on column encryption for nested fields #9

Draft
wants to merge 127 commits into
base: main
from

Conversation

EnricoMi
Owner

@EnricoMi EnricoMi commented Feb 1, 2025

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

kou and others added 7 commits January 30, 2025 13:27
…roup and multiple row groups (apache#45350)

### Rationale for this change

Loading `arrow::ArrayStatistics` logic depends on `parquet::ColumnChunkMetaData`.

We can't get `parquet::ColumnChunkMetaData` when requested row groups are empty because no associated row group and column chunk exist.

We can't use multiple `parquet::ColumnChunkMetaData`s for now because we don't have statistics merge logic. So we can't load statistics when we use multiple row groups. 

### What changes are included in this PR?

* Don't load statistics when no row groups are used
* Don't load statistics when multiple row groups are used
* Add `parquet::ArrowReaderProperties::{set_,}should_load_statistics()` to force loading statistics by loading row groups one by one
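
The rule described by the bullets above can be modeled in a few lines. This is only an illustrative Python sketch of one reading of the description, not the C++ implementation: `plan_reads` and its parameters are hypothetical names, and the assumption that a single requested row group gets statistics by default is inferred from the rationale rather than stated in it.

```python
from typing import List, Sequence, Tuple


def plan_reads(
    row_groups: Sequence[int],
    should_load_statistics: bool = False,
) -> List[Tuple[List[int], bool]]:
    """Return (row groups read together, attach statistics?) pairs.

    Hypothetical model: statistics come from a single column chunk's
    metadata, so they are only available when exactly one row group
    is read at a time.
    """
    if not row_groups:
        # No row group means no column chunk metadata to read statistics from.
        return [([], False)]
    if len(row_groups) == 1:
        # Assumption: one row group maps to one column chunk per column.
        return [(list(row_groups), True)]
    if should_load_statistics:
        # Forced mode: read row groups one by one so each read has
        # exactly one column chunk's statistics.
        return [([rg], True) for rg in row_groups]
    # Multiple row groups read together: there is no merge logic, so skip.
    return [(list(row_groups), False)]
```

For example, `plan_reads([0, 1], should_load_statistics=True)` splits the request into two single-row-group reads, which is the trade-off the new property makes: statistics become available at the cost of reading row groups separately.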

### Are these changes tested?

Yes.

### Are there any user-facing changes?

Yes.
* GitHub Issue: apache#45339

Authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
…45134)

### Rationale for this change
Since apache#44945, the Java implementation lives in its own repo. Update the docs
to point there.

### What changes are included in this PR?
Updates a few places that reference the old location of the Java implementation.

### Are these changes tested?
Rendered the Sphinx ones locally to check.  

### Are there any user-facing changes?
No

Lead-authored-by: parthchonkar <[email protected]>
Co-authored-by: Parth Chonkar <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
…t for pandas>=2.3 (apache#45383)

### Rationale for this change

The option already exists in pandas 2.2, but our code does not work with that version, so restrict it to pandas >= 2.3.

* GitHub Issue: apache#45296

Authored-by: Joris Van den Bossche <[email protected]>
Signed-off-by: Raúl Cumplido <[email protected]>
…he#45391)

### Rationale for this change

apache#45390 implemented `garrow_array_validate_full()`, but it used `validate_full` instead of `validate-full` for the error tag. We should use hyphen-separated words in error tags for consistency.

### What changes are included in this PR?

`validate_full` -> `validate-full`

### Are these changes tested?

Yes.

### Are there any user-facing changes?

Yes.

* GitHub Issue: apache#45390

Authored-by: Hiroyuki Sato <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
…e#45386)

### Rationale for this change

[RecordBatch::ValidateFull](https://arrow.apache.org/docs/cpp/api/table.html#_CPPv4NK5arrow11RecordBatch12ValidateFullEv) is available in the C++ API,
but GLib doesn't support that method yet.

### What changes are included in this PR?

This PR adds a validation method in the record-batch class.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

Yes.

* GitHub Issue: apache#44760

Lead-authored-by: Hiroyuki Sato <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
…ntu CI (apache#45395)

### Rationale for this change

Ubuntu 20.04 will reach EOL by 2025-04, so we must upgrade the MATLAB workflow's GitHub runner from Ubuntu 20.04 to Ubuntu 22.04 or Ubuntu 24.04.

### What changes are included in this PR?

1. Updated the Ubuntu MATLAB GitHub workflow to use Ubuntu 22.04 as the GitHub runner.
2. Updated the Ubuntu MATLAB crossbow task to use Ubuntu 22.04 as the GitHub runner.

### Are these changes tested?

1. All GitHub checks passed.
2. Manually triggered the MATLAB crossbow task and installed the MATLAB-Arrow Interface Toolbox on Debian-12. 

### Are there any user-facing changes?

N/A

* GitHub Issue: apache#45388

Authored-by: Sarah Gilmore <[email protected]>
Signed-off-by: Sarah Gilmore <[email protected]>
…apache#45387)

### Rationale for this change

Conan 1 is deprecated. We should use Conan 2.

### What changes are included in this PR?

Use "conania/gcc11-ubuntu16.04:2.12.1" because it's the latest version.

Based on
https://github.com/conan-io/conan-docker-tools/blob/master/images/README.md#official-docker-images, "gcc11-ubuntu16.04" is the only supported image.

`ci/conan/` is synchronized with the latest https://github.com/conan-io/conan-center-index .

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.
* GitHub Issue: apache#45381

Authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>

github-actions bot commented Feb 1, 2025

Thanks for opening a pull request!

If this is not a minor PR, could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose

Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project.

Then could you also rename the pull request title in the following format?

GH-${GITHUB_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

or

MINOR: [${COMPONENT}] ${SUMMARY}
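
A title can be checked against the two formats above before pushing. This is a minimal sketch, not the project's actual check: `TITLE_PATTERN` and `is_valid_title` are hypothetical names, and allowing several bracketed components in a row (for example `[C++][Parquet]`) is an assumption that goes beyond the single `[${COMPONENT}]` shown.

```python
import re

# Hypothetical pattern: "GH-<digits>" or "MINOR", then ": ", one or more
# bracketed components, a space, and a non-empty summary.
TITLE_PATTERN = re.compile(r"^(?:GH-\d+|MINOR): (?:\[[^\[\]]+\])+ \S.*$")


def is_valid_title(title: str) -> bool:
    """Return True if the title matches one of the two formats above."""
    return TITLE_PATTERN.match(title) is not None
```

With this sketch, `is_valid_title("MINOR: [Docs] Fix typo")` is accepted, while the current title of this pull request, which has no prefix or component, is rejected.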

See also:

@EnricoMi EnricoMi force-pushed the docs-column-encryption-nested-fields branch 2 times, most recently from bccd9c3 to 543bd7b Compare February 1, 2025 16:52
@EnricoMi EnricoMi force-pushed the docs-column-encryption-nested-fields branch from 543bd7b to 56e803d Compare February 1, 2025 16:58
dependabot bot and others added 15 commits February 2, 2025 06:16
Bumps [memfs](https://github.com/streamich/memfs) from 4.14.0 to 4.17.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a href="https://github.com/streamich/memfs/releases">memfs's releases</a>.</em></p>
<blockquote>
<h2>v4.17.0</h2>
<h1><a href="https://github.com/streamich/memfs/compare/v4.16.0...v4.17.0">4.17.0</a> (2025-01-09)</h1>
<h3>Features</h3>
<ul>
<li>allow setting rdev on node (<a href="https://redirect.github.com/streamich/memfs/issues/1085">#1085</a>) (<a href="https://github.com/streamich/memfs/commit/2717334372ee92b1892ef12cdc341d43312455f2">2717334</a>)</li>
</ul>
<h2>v4.16.0</h2>
<h1><a href="https://github.com/streamich/memfs/compare/v4.15.4...v4.16.0">4.16.0</a> (2025-01-09)</h1>
<h3>Features</h3>
<ul>
<li>support <code>UInt8Array</code> in place of <code>Buffer</code> (<a href="https://redirect.github.com/streamich/memfs/issues/1083">#1083</a>) (<a href="https://github.com/streamich/memfs/commit/0d3662a75f09fee7bfc6b73f26c34e0e2922bf6f">0d3662a</a>)</li>
</ul>
<h2>v4.15.4</h2>
<h2><a href="https://github.com/streamich/memfs/compare/v4.15.3...v4.15.4">4.15.4</a> (2025-01-09)</h2>
<h3>Bug Fixes</h3>
<ul>
<li>remove <code>debugger</code> statement (<a href="https://redirect.github.com/streamich/memfs/issues/1086">#1086</a>) (<a href="https://github.com/streamich/memfs/commit/648917202d0832908ec57b72400f4c772e15a624">6489172</a>)</li>
</ul>
<h2>v4.15.3</h2>
<h2><a href="https://github.com/streamich/memfs/compare/v4.15.2...v4.15.3">4.15.3</a> (2025-01-01)</h2>
<h3>Bug Fixes</h3>
<ul>
<li>allow setting custom file types beyond S_IFREG and S_IFDIR (<a href="https://redirect.github.com/streamich/memfs/issues/1082">#1082</a>) (<a href="https://github.com/streamich/memfs/commit/24da3e73ade9f9fdc1bb7e2dbf898fab547150f4">24da3e7</a>)</li>
</ul>
<h2>v4.15.2</h2>
<h2><a href="https://github.com/streamich/memfs/compare/v4.15.1...v4.15.2">4.15.2</a> (2024-12-30)</h2>
<h3>Bug Fixes</h3>
<ul>
<li>don't fail on closing fd after reset has been called (<a href="https://redirect.github.com/streamich/memfs/issues/550">#550</a>) (<a href="https://redirect.github.com/streamich/memfs/issues/1081">#1081</a>) (<a href="https://github.com/streamich/memfs/commit/ede0f4ff774f8ceb0f5c0ba2650a7ce0bd39c118">ede0f4f</a>)</li>
</ul>
<h2>v4.15.1</h2>
<h2><a href="https://github.com/streamich/memfs/compare/v4.15.0...v4.15.1">4.15.1</a> (2024-12-22)</h2>
<h3>Bug Fixes</h3>
<ul>
<li>resolve relative symlinks to the current directory (<a href="https://redirect.github.com/streamich/memfs/issues/1079">#1079</a>) (<a href="https://github.com/streamich/memfs/commit/63e38735fe08b728da02b9328d16be4d132b9327">63e3873</a>), closes <a href="https://redirect.github.com/streamich/memfs/issues/725">#725</a></li>
</ul>
<h2>v4.15.0</h2>
<h1><a href="https://github.com/streamich/memfs/compare/v4.14.1...v4.15.0">4.15.0</a> (2024-12-09)</h1>

</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a href="https://github.com/streamich/memfs/blob/master/CHANGELOG.md">memfs's changelog</a>.</em></p>
<blockquote>
<h1><a href="https://github.com/streamich/memfs/compare/v4.16.0...v4.17.0">4.17.0</a> (2025-01-09)</h1>
<h3>Features</h3>
<ul>
<li>allow setting rdev on node (<a href="https://redirect.github.com/streamich/memfs/issues/1085">#1085</a>) (<a href="https://github.com/streamich/memfs/commit/2717334372ee92b1892ef12cdc341d43312455f2">2717334</a>)</li>
</ul>
<h1><a href="https://github.com/streamich/memfs/compare/v4.15.4...v4.16.0">4.16.0</a> (2025-01-09)</h1>
<h3>Features</h3>
<ul>
<li>support <code>UInt8Array</code> in place of <code>Buffer</code> (<a href="https://redirect.github.com/streamich/memfs/issues/1083">#1083</a>) (<a href="https://github.com/streamich/memfs/commit/0d3662a75f09fee7bfc6b73f26c34e0e2922bf6f">0d3662a</a>)</li>
</ul>
<h2><a href="https://github.com/streamich/memfs/compare/v4.15.3...v4.15.4">4.15.4</a> (2025-01-09)</h2>
<h3>Bug Fixes</h3>
<ul>
<li>remove <code>debugger</code> statement (<a href="https://redirect.github.com/streamich/memfs/issues/1086">#1086</a>) (<a href="https://github.com/streamich/memfs/commit/648917202d0832908ec57b72400f4c772e15a624">6489172</a>)</li>
</ul>
<h2><a href="https://github.com/streamich/memfs/compare/v4.15.2...v4.15.3">4.15.3</a> (2025-01-01)</h2>
<h3>Bug Fixes</h3>
<ul>
<li>allow setting custom file types beyond S_IFREG and S_IFDIR (<a href="https://redirect.github.com/streamich/memfs/issues/1082">#1082</a>) (<a href="https://github.com/streamich/memfs/commit/24da3e73ade9f9fdc1bb7e2dbf898fab547150f4">24da3e7</a>)</li>
</ul>
<h2><a href="https://github.com/streamich/memfs/compare/v4.15.1...v4.15.2">4.15.2</a> (2024-12-30)</h2>
<h3>Bug Fixes</h3>
<ul>
<li>don't fail on closing fd after reset has been called (<a href="https://redirect.github.com/streamich/memfs/issues/550">#550</a>) (<a href="https://redirect.github.com/streamich/memfs/issues/1081">#1081</a>) (<a href="https://github.com/streamich/memfs/commit/ede0f4ff774f8ceb0f5c0ba2650a7ce0bd39c118">ede0f4f</a>)</li>
</ul>
<h2><a href="https://github.com/streamich/memfs/compare/v4.15.0...v4.15.1">4.15.1</a> (2024-12-22)</h2>
<h3>Bug Fixes</h3>
<ul>
<li>resolve relative symlinks to the current directory (<a href="https://redirect.github.com/streamich/memfs/issues/1079">#1079</a>) (<a href="https://github.com/streamich/memfs/commit/63e38735fe08b728da02b9328d16be4d132b9327">63e3873</a>), closes <a href="https://redirect.github.com/streamich/memfs/issues/725">#725</a></li>
</ul>
<h1><a href="https://github.com/streamich/memfs/compare/v4.14.1...v4.15.0">4.15.0</a> (2024-12-09)</h1>
<h3>Features</h3>
<ul>
<li>implement <code>createReadStream</code> and <code>createWriteStream</code> on <code>FileHandle</code> (<a href="https://redirect.github.com/streamich/memfs/issues/1076">#1076</a>) (<a href="https://github.com/streamich/memfs/commit/c413df5e3e151e446a944a8fba5cc02db46937a0">c413df5</a>), closes <a href="https://redirect.github.com/streamich/memfs/issues/1063">#1063</a></li>
</ul>
<h2><a href="https://github.com/streamich/memfs/compare/v4.14.0...v4.14.1">4.14.1</a> (2024-12-02)</h2>

</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="https://github.com/streamich/memfs/commit/90a4bc6714911c43559a49cf9129153f27dd6c3d"><code>90a4bc6</code></a> chore(release): 4.17.0 [skip ci]</li>
<li><a href="https://github.com/streamich/memfs/commit/2717334372ee92b1892ef12cdc341d43312455f2"><code>2717334</code></a> feat: allow setting rdev on node (<a href="https://redirect.github.com/streamich/memfs/issues/1085">#1085</a>)</li>
<li><a href="https://github.com/streamich/memfs/commit/cdb04030e19468d69e9077ccfde5b9e29f9a4d4c"><code>cdb0403</code></a> chore(release): 4.16.0 [skip ci]</li>
<li><a href="https://github.com/streamich/memfs/commit/0d3662a75f09fee7bfc6b73f26c34e0e2922bf6f"><code>0d3662a</code></a> feat: support <code>UInt8Array</code> in place of <code>Buffer</code> (<a href="https://redirect.github.com/streamich/memfs/issues/1083">#1083</a>)</li>
<li><a href="https://github.com/streamich/memfs/commit/77c4a53bda5d73edf07296bcb5fb406ee6a89ef0"><code>77c4a53</code></a> chore(release): 4.15.4 [skip ci]</li>
<li><a href="https://github.com/streamich/memfs/commit/648917202d0832908ec57b72400f4c772e15a624"><code>6489172</code></a> fix: remove <code>debugger</code> statement (<a href="https://redirect.github.com/streamich/memfs/issues/1086">#1086</a>)</li>
<li><a href="https://github.com/streamich/memfs/commit/47028116205af6ee3209b02e0f53a7e97c1796e0"><code>4702811</code></a> chore(deps): lock file maintenance (<a href="https://redirect.github.com/streamich/memfs/issues/1084">#1084</a>)</li>
<li><a href="https://github.com/streamich/memfs/commit/420c46a0db3dda9ec075717449f77b3b40aaf7f9"><code>420c46a</code></a> chore(release): 4.15.3 [skip ci]</li>
<li><a href="https://github.com/streamich/memfs/commit/24da3e73ade9f9fdc1bb7e2dbf898fab547150f4"><code>24da3e7</code></a> fix: allow setting custom file types beyond S_IFREG and S_IFDIR (<a href="https://redirect.github.com/streamich/memfs/issues/1082">#1082</a>)</li>
<li><a href="https://github.com/streamich/memfs/commit/59768d35132d70e2911290b308c363ec14197c91"><code>59768d3</code></a> chore(release): 4.15.2 [skip ci]</li>
<li>Additional commits viewable in <a href="https://github.com/streamich/memfs/compare/v4.14.0...v4.17.0">compare view</a></li>
</ul>
</details>
<br />

[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=memfs&package-manager=npm_and_yarn&previous-version=4.14.0&new-version=4.17.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@ dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@ dependabot rebase` will rebase this PR
- `@ dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@ dependabot merge` will merge this PR after your CI passes on it
- `@ dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@ dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@ dependabot reopen` will reopen this PR if it is closed
- `@ dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@ dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- `@ dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@ dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@ dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

</details>

Authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sutou Kouhei <[email protected]>
Bumps [del-cli](https://github.com/sindresorhus/del-cli) from 5.1.0 to 6.0.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a href="https://github.com/sindresorhus/del-cli/releases">del-cli's releases</a>.</em></p>
<blockquote>
<h2>v6.0.0</h2>
<h3>Breaking</h3>
<ul>
<li>Require Node.js 18  de54031</li>
</ul>
<h3>Improvements</h3>
<ul>
<li>Update dependencies  de54031</li>
</ul>
<p><a href="https://github.com/sindresorhus/del-cli/compare/v5.1.0...v6.0.0">https://github.com/sindresorhus/del-cli/compare/v5.1.0...v6.0.0</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="https://github.com/sindresorhus/del-cli/commit/f6e4605b03475c6045d08db001e2e001d04af8ba"><code>f6e4605</code></a> 6.0.0</li>
<li><a href="https://github.com/sindresorhus/del-cli/commit/de54031c340dda9230d1a7c855f93813f7ab1259"><code>de54031</code></a> Require Node.js 18 and update dependencies</li>
<li><a href="https://github.com/sindresorhus/del-cli/commit/028ebb11bb0401f2901233ab15fd86f52d9e7dd1"><code>028ebb1</code></a> Meta tweaks</li>
<li>See full diff in <a href="https://github.com/sindresorhus/del-cli/compare/v5.1.0...v6.0.0">compare view</a></li>
</ul>
</details>
<br />

[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=del-cli&package-manager=npm_and_yarn&previous-version=5.1.0&new-version=6.0.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)


Authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sutou Kouhei <[email protected]>
…5406)

Bumps [flatbuffers](https://github.com/google/flatbuffers) from 24.3.25 to 25.1.24.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a href="https://github.com/google/flatbuffers/releases">flatbuffers's releases</a>.</em></p>
<blockquote>
<h2>v25.1.24</h2>
<h2>What's Changed</h2>
<ul>
<li>Also use rules_bazel_bazel_integration_test dependency with Bzlmod by <a href="https://github.com/mering"><code>@​mering</code></a> in <a href="https://redirect.github.com/google/flatbuffers/pull/8498">google/flatbuffers#8498</a></li>
<li>Add bazel ci by <a href="https://github.com/dbaileychess"><code>@​dbaileychess</code></a> in <a href="https://redirect.github.com/google/flatbuffers/pull/8497">google/flatbuffers#8497</a></li>
<li>Fix Bzlmod by <a href="https://github.com/mering"><code>@​mering</code></a> in <a href="https://redirect.github.com/google/flatbuffers/pull/8503">google/flatbuffers#8503</a></li>
<li>Fix Bazel ts support by <a href="https://github.com/mering"><code>@​mering</code></a> in <a href="https://redirect.github.com/google/flatbuffers/pull/8499">google/flatbuffers#8499</a></li>
<li>Improve Bazel CI by <a href="https://github.com/mering"><code>@​mering</code></a> in <a href="https://redirect.github.com/google/flatbuffers/pull/8502">google/flatbuffers#8502</a></li>
<li>Fix npm bzlmod by <a href="https://github.com/mering"><code>@​mering</code></a> in <a href="https://redirect.github.com/google/flatbuffers/pull/8506">google/flatbuffers#8506</a></li>
<li>Add support for Bazel 7 and 8 in Bazel CI by <a href="https://github.com/mering"><code>@​mering</code></a> in <a href="https://redirect.github.com/google/flatbuffers/pull/8505">google/flatbuffers#8505</a></li>
<li>Test external modules explicitly in CI by <a href="https://github.com/mering"><code>@​mering</code></a> in <a href="https://redirect.github.com/google/flatbuffers/pull/8507">google/flatbuffers#8507</a></li>
<li>Bump the versions of all aspect Bazel dependencies by <a href="https://github.com/sbarfurth"><code>@​sbarfurth</code></a> in <a href="https://redirect.github.com/google/flatbuffers/pull/8508">google/flatbuffers#8508</a></li>
<li>Remove Bazel WORKSPACE setup. by <a href="https://github.com/mering"><code>@​mering</code></a> in <a href="https://redirect.github.com/google/flatbuffers/pull/8509">google/flatbuffers#8509</a></li>
<li>Add Bazel instructions to docs by <a href="https://github.com/mering"><code>@​mering</code></a> in <a href="https://redirect.github.com/google/flatbuffers/pull/8510">google/flatbuffers#8510</a></li>
<li>[C++] Avoid adding semicolon after a statement by <a href="https://github.com/tzik"><code>@​tzik</code></a> in <a href="https://redirect.github.com/google/flatbuffers/pull/8488">google/flatbuffers#8488</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/sbarfurth"><code>@​sbarfurth</code></a> made their first contribution in <a href="https://redirect.github.com/google/flatbuffers/pull/8508">google/flatbuffers#8508</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/google/flatbuffers/compare/v25.1.21...v25.1.24">https://github.com/google/flatbuffers/compare/v25.1.21...v25.1.24</a></p>
<h2>v25.1.21</h2>
<h2>What's Changed</h2>
<ul>
<li>Add new Docs source files by <a href="https://github.com/dbaileychess"><code>@​dbaileychess</code></a> in <a href="https://redirect.github.com/google/flatbuffers/pull/8461">google/flatbuffers#8461</a></li>
<li><code>docs.yml</code> Add workflow for updating docs by <a href="https://github.com/dbaileychess"><code>@​dbaileychess</code></a> in <a href="https://redirect.github.com/google/flatbuffers/pull/8462">google/flatbuffers#8462</a></li>
<li>docs.yml enable for pushes to main branch by <a href="https://github.com/dbaileychess"><code>@​dbaileychess</code></a> in <a href="https://redirect.github.com/google/flatbuffers/pull/8463">google/flatbuffers#8463</a></li>
<li><code>contributions.md</code> Add doc about how to contribute to flatbuffers by <a href="https://github.com/dbaileychess"><code>@​dbaileychess</code></a> in <a href="https://redirect.github.com/google/flatbuffers/pull/8464">google/flatbuffers#8464</a></li>
<li>CNAME: add custom domain by <a href="https://github.com/dbaileychess"><code>@​dbaileychess</code></a> in <a href="https://redirect.github.com/google/flatbuffers/pull/8465">google/flatbuffers#8465</a></li>
<li>[Swift] Bug fix for verifier where its being copied by <a href="https://github.com/mustiikhalil"><code>@​mustiikhalil</code></a> in <a href="https://redirect.github.com/google/flatbuffers/pull/8451">google/flatbuffers#8451</a></li>
<li><code>flatc.md</code> Add more documentation by <a href="https://github.com/dbaileychess"><code>@​dbaileychess</code></a> in <a href="https://redirect.github.com/google/flatbuffers/pull/8467">google/flatbuffers#8467</a></li>
<li><code>quick_start.md</code>: Add quick start guide by <a href="https://github.com/dbaileychess"><code>@​dbaileychess</code></a> in <a href="https://redirect.github.com/google/flatbuffers/pull/8469">google/flatbuffers#8469</a></li>
<li>Add Annotating Docs by <a href="https://github.com/dbaileychess"><code>@​dbaileychess</code></a> in <a href="https://redirect.github.com/google/flatbuffers/pull/8470">google/flatbuffers#8470</a></li>
<li><code>mkdocs.yml</code> add footer and other info by <a href="https://github.com/dbaileychess"><code>@​dbaileychess</code></a> in <a href="https://redirect.github.com/google/flatbuffers/pull/8471">google/flatbuffers#8471</a></li>
<li><code>schema.md</code> Fixed some warnings by <a href="https://github.com/dbaileychess"><code>@​dbaileychess</code></a> in <a href="https://redirect.github.com/google/flatbuffers/pull/8472">google/flatbuffers#8472</a></li>
<li>Fix crash for TypeScript enum in substruct by <a href="https://github.com/fergushenderson"><code>@​fergushenderson</code></a> in <a href="https://redirect.github.com/google/flatbuffers/pull/8430">google/flatbuffers#8430</a></li>
<li>fix typo in tutorial by <a href="https://github.com/shynur"><code>@​shynur</code></a> in <a href="https://redirect.github.com/google/flatbuffers/pull/8476">google/flatbuffers#8476</a></li>
<li>A couple of small updates to the docs by <a href="https://github.com/srinarasi"><code>@​srinarasi</code></a> in <a href="https://redirect.github.com/google/flatbuffers/pull/8477">google/flatbuffers#8477</a></li>
<li>Add imports for bazel by <a href="https://github.com/dbaileychess"><code>@​dbaileychess</code></a> in <a href="https://redirect.github.com/google/flatbuffers/pull/8486">google/flatbuffers#8486</a></li>
<li>Rust full reflection by <a href="https://github.com/candysonya"><code>@​candysonya</code></a> in <a href="https://redirect.github.com/google/flatbuffers/pull/8102">google/flatbuffers#8102</a></li>
<li>Fix a minor typo in flatc --help output by <a href="https://github.com/musicinmybrain"><code>@​musicinmybrain</code></a> in <a href="https://redirect.github.com/google/flatbuffers/pull/8468">google/flatbuffers#8468</a></li>
<li>Add missing headers to runtime_cc target by <a href="https://github.com/mering"><code>@​mering</code></a> in <a href="https://redirect.github.com/google/flatbuffers/pull/8492">google/flatbuffers#8492</a></li>
<li>Use Label() to resolve repo name by <a href="https://github.com/mering"><code>@​mering</code></a> in <a href="https://redirect.github.com/google/flatbuffers/pull/8493">google/flatbuffers#8493</a></li>
<li>Use rules_bazel_integration_test to download Bazel binary by <a href="https://github.com/mering"><code>@​mering</code></a> in <a href="https://redirect.github.com/google/flatbuffers/pull/8495">google/flatbuffers#8495</a></li>
<li>Add support for Bzlmod by <a href="https://github.com/mering"><code>@​mering</code></a> in <a href="https://redirect.github.com/google/flatbuffers/pull/8494">google/flatbuffers#8494</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/shynur"><code>@​shynur</code></a> made their first contribution in <a href="https://redirect.github.com/google/flatbuffers/pull/8476">google/flatbuffers#8476</a></li>
<li><a href="https://github.com/srinarasi"><code>@​srinarasi</code></a> made their first contribution in <a href="https://redirect.github.com/google/flatbuffers/pull/8477">google/flatbuffers#8477</a></li>
<li><a href="https://github.com/candysonya"><code>@​candysonya</code></a> made their first contribution in <a href="https://redirect.github.com/google/flatbuffers/pull/8102">google/flatbuffers#8102</a></li>
<li><a href="https://github.com/mering"><code>@​mering</code></a> made their first contribution in <a href="https://redirect.github.com/google/flatbuffers/pull/8492">google/flatbuffers#8492</a></li>
</ul>

</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a href="https://github.com/google/flatbuffers/blob/master/CHANGELOG.md">flatbuffers's changelog</a>.</em></p>
<blockquote>
<h2>[25.1.24] (January 24 2025)(<a href="https://github.com/google/flatbuffers/releases/tag/v25.1.24">https://github.com/google/flatbuffers/releases/tag/v25.1.24</a>)</h2>
<ul>
<li>Mostly related to bazel build support.</li>
<li>Min bazel supported is now 7 or higher, as WORKSPACE files are removed (<a href="https://redirect.github.com/google/flatbuffers/issues/8509">#8509</a>)</li>
<li>Minor C++ codegen fix removing extra semicolon (<a href="https://redirect.github.com/google/flatbuffers/issues/8488">#8488</a>)</li>
</ul>
<h2>[25.1.21] (January 21 2025)(<a href="https://github.com/google/flatbuffers/releases/tag/v25.1.21">https://github.com/google/flatbuffers/releases/tag/v25.1.21</a>)</h2>
<ul>
<li>Rust Full Reflection (<a href="https://redirect.github.com/google/flatbuffers/issues/8102">#8102</a>)</li>
<li>Mostly documentation updates hosted at <a href="https://flatbuffers.dev">https://flatbuffers.dev</a></li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="https://github.com/google/flatbuffers/commit/0312061985dbaaf6b068006383946ac6095f5b63"><code>0312061</code></a> FlatBuffers Version 25.1.24</li>
<li><a href="https://github.com/google/flatbuffers/commit/9f94ceedbc069007848187576383bf9fec221e56"><code>9f94cee</code></a> [C++] Avoid adding semicolon after a statement (<a href="https://redirect.github.com/google/flatbuffers/issues/8488">#8488</a>)</li>
<li><a href="https://github.com/google/flatbuffers/commit/bcd2b9d03952bfeaec299aaa8cd481f2e4ae6dfe"><code>bcd2b9d</code></a> Add Bazel docs (<a href="https://redirect.github.com/google/flatbuffers/issues/8510">#8510</a>)</li>
<li><a href="https://github.com/google/flatbuffers/commit/82fefbf25205b5d9249b1eacb67e673ba50d5c3f"><code>82fefbf</code></a> Remove Bazel WORKSPACE setup. (<a href="https://redirect.github.com/google/flatbuffers/issues/8509">#8509</a>)</li>
<li><a href="https://github.com/google/flatbuffers/commit/65e49faf762639f0c2317f1041ab27c208e5cbb8"><code>65e49fa</code></a> Bump the versions of all aspect Bazel dependencies (<a href="https://redirect.github.com/google/flatbuffers/issues/8508">#8508</a>)</li>
<li><a href="https://github.com/google/flatbuffers/commit/50be3cfe8c3733585f6029e743d47fece60f79b2"><code>50be3cf</code></a> Test external modules explicitly in CI (<a href="https://redirect.github.com/google/flatbuffers/issues/8507">#8507</a>)</li>
<li><a href="https://github.com/google/flatbuffers/commit/026c243dc53c779bb35bc144b6bdd3bdadb129a1"><code>026c243</code></a> Add support for Bazel 7 and 8 in Bazel CI (<a href="https://redirect.github.com/google/flatbuffers/issues/8505">#8505</a>)</li>
<li><a href="https://github.com/google/flatbuffers/commit/a9257b6963b896ce67c603d3e2ed8ed7007ae326"><code>a9257b6</code></a> Fix npm bzlmod (<a href="https://redirect.github.com/google/flatbuffers/issues/8506">#8506</a>)</li>
<li><a href="https://github.com/google/flatbuffers/commit/fceafd438d60630b2ee55b1ea0faa866b6748f05"><code>fceafd4</code></a> Improve Bazel CI (<a href="https://redirect.github.com/google/flatbuffers/issues/8502">#8502</a>)</li>
<li><a href="https://github.com/google/flatbuffers/commit/33a15d63cf6497b07d4d243fd159607d25aa0526"><code>33a15d6</code></a> Fix reflection.fbs import path (<a href="https://redirect.github.com/google/flatbuffers/issues/8499">#8499</a>)</li>
<li>Additional commits viewable in <a href="https://github.com/google/flatbuffers/compare/v24.3.25...v25.1.24">compare view</a></li>
</ul>
</details>
<br />

[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=flatbuffers&package-manager=npm_and_yarn&previous-version=24.3.25&new-version=25.1.24)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

</details>

Authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sutou Kouhei <[email protected]>
### Rationale for this change

`ARROW_FUZZING` requires `arrow_testing`, and `arrow_testing` requires Boost.

### What changes are included in this PR?

Use Boost with `ARROW_FUZZING`.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.
* GitHub Issue: apache#45396

Authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
…che#45403)

### Rationale for this change

apache#45398 will add Ruby linting. Before that change can land, some code needs to be reformatted to avoid format errors.

```bash
Ruby Format..............................................................Failed
- hook id: rubocop
- exit code: 1
- files were modified by this hook

Inspecting 79 files
..............................C................................................

Offenses:

c_glib/test/test-record-batch-datum.rb:52:101: C: [Corrected] Layout/LineLength: Line is too long. [107/100]
    assert_equal("RecordBatch(visible:   [\n" + "    true,\n" + "    false\n" + "  ]\n" + ")", @datum.to_s)
                                                                                                    ^^^^^^^
c_glib/test/test-record-batch-datum.rb:53:1: C: [Corrected] Layout/ArgumentAlignment: Align the arguments of a method call if they span more than one line.
@datum.to_s)
^^^^^^^^^^^

79 files inspected, 2 offenses detected, 2 offenses corrected
Inspecting 79 files
..................C................................C.....................C.....

Offenses:

dev/release/01-prepare-test.rb:267:101: C: Layout/LineLength: Line is too long. [101/100]
              "+<p><a href=\"../#{@previous_compatible_version}/r/\">#{@previous_r_version}</a></p>",
                                                                                                    ^
c_glib/test/test-struct-field-options.rb:45:101: C: Layout/LineLength: Line is too long. [112/100]
    message = "[struct-field-options][set-field-ref]: Invalid: Dot path '[foo]' contained an unterminated index"
                                                                                                    ^^^^^^^^^^^^
ruby/red-arrow/test/test-table.rb:1524:21: C: [Corrected] Layout/ArgumentAlignment: Align the arguments of a method call if they span more than one line.
                    table1.join(table2, ...
                    ^^^^^^^^^^^^^^^^^^^

79 files inspected, 3 offenses detected, 1 offense corrected
Inspecting 79 files
......................C........................................................

Offenses:

c_glib/test/test-chunked-array-datum.rb:52:101: C: [Corrected] Layout/LineLength: Line is too long. [108/100]
    assert_equal("ChunkedArray([\n" + "  [\n" + "    true,\n" + "    false\n" + "  ]\n" + "])", @datum.to_s)
                                                                                                    ^^^^^^^^
c_glib/test/test-chunked-array-datum.rb:53:1: C: [Corrected] Layout/ArgumentAlignment: Align the arguments of a method call if they span more than one line.
@datum.to_s)
^^^^^^^^^^^

79 files inspected, 2 offenses detected, 2 offenses corrected
Inspecting 79 files
..............C............................C....C..............................

Offenses:

c_glib/test/dataset/test-file-system-dataset.rb:98:30: C: [Corrected] Layout/ArgumentAlignment: Align the arguments of a method call if they span more than one line.
                             label: [ ...
                             ^^^^^^^^
dev/release/post-12-bump-versions-test.rb:305:101: C: Layout/LineLength: Line is too long. [103/100]
                "+<p><a href=\"../#{@previous_compatible_version}/r/\">#{@previous_r_version}</a></p>",
                                                                                                    ^^^
c_glib/test/test-array.rb:122:101: C: [Corrected] Layout/LineLength: Line is too long. [103/100]
                   build_int32_array([0, 1069547520, -1071644672, nil]).view(Arrow::FloatDataType.new))
                                                                                                    ^^^

79 files inspected, 3 offenses detected, 2 offenses corrected
Inspecting 79 files
...............................................................................

79 files inspected, no offenses detected
Inspecting 79 files
........................................................................C......

Offenses:

c_glib/test/test-large-list-array.rb:91:30: C: [Corrected] Layout/ArgumentAlignment: Align the arguments of a method call if they span more than one line.
                             [ ...
                             ^

79 files inspected, 1 offense detected, 1 offense corrected
Inspecting 79 files
...............................................................................

79 files inspected, no offenses detected
Inspecting 79 files
...............................C...............................................

Offenses:

c_glib/test/test-uint-array-builder.rb:35:40: C: [Corrected] Layout/ArgumentAlignment: Align the arguments of a method call if they span more than one line.
                                       Arrow::Buffer.new(values.pack("S*")),
                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
c_glib/test/test-uint-array-builder.rb:36:40: C: [Corrected] Layout/ArgumentAlignment: Align the arguments of a method call if they span more than one line.
                                       Arrow::Buffer.new([0b011].pack("C*")),
                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
c_glib/test/test-uint-array-builder.rb:37:40: C: [Corrected] Layout/ArgumentAlignment: Align the arguments of a method call if they span more than one line.
                                       -1))
                                       ^^
c_glib/test/test-uint-array-builder.rb:45:40: C: [Corrected] Layout/ArgumentAlignment: Align the arguments of a method call if they span more than one line.
                                       Arrow::Buffer.new(values.pack("L*")),
                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
c_glib/test/test-uint-array-builder.rb:46:40: C: [Corrected] Layout/ArgumentAlignment: Align the arguments of a method call if they span more than one line.
                                       Arrow::Buffer.new([0b011].pack("C*")),
                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
c_glib/test/test-uint-array-builder.rb:47:40: C: [Corrected] Layout/ArgumentAlignment: Align the arguments of a method call if they span more than one line.
                                       -1))
                                       ^^

79 files inspected, 6 offenses detected, 6 offenses corrected

pre-commit hook(s) made changes.
If you are seeing this message in CI, reproduce locally with: `pre-commit run --all-files`.
To run `pre-commit` as part of git workflow, use `pre-commit install`.
All changes made by hooks:
diff --git a/c_glib/test/dataset/test-file-system-dataset.rb b/c_glib/test/dataset/test-file-system-dataset.rb
index 96deedf6b..0cd2a61ea 100644
--- a/c_glib/test/dataset/test-file-system-dataset.rb
+++ b/c_glib/test/dataset/test-file-system-dataset.rb
@@ -95,11 +95,11 @@ class TestDatasetFileSystemDataset < Test::Unit::TestCase
                                build_int32_array([2]),
                                build_int32_array([3]),
                              ],
-                             label: [
-                               build_string_array(["a", "a"]),
-                               build_string_array(["b"]),
-                               build_string_array(["c"]),
-                             ])
+                                 label: [
+                                   build_string_array(["a", "a"]),
+                                   build_string_array(["b"]),
+                                   build_string_array(["c"]),
+                                 ])

     return dataset, expected_table
   end
diff --git a/c_glib/test/test-array.rb b/c_glib/test/test-array.rb
index aa129a474..681544920 100644
--- a/c_glib/test/test-array.rb
+++ b/c_glib/test/test-array.rb
@@ -119,7 +119,8 @@ class TestArray < Test::Unit::TestCase
   sub_test_case("#view") do
     def test_valid
       assert_equal(build_float_array([0.0, 1.5, -2.5, nil]),
-                   build_int32_array([0, 1069547520, -1071644672, nil]).view(Arrow::FloatDataType.new))
+                   build_int32_array([0, 1069547520, -1071644672,
+nil]).view(Arrow::FloatDataType.new))
     end

     def test_invalid
diff --git a/c_glib/test/test-chunked-array-datum.rb b/c_glib/test/test-chunked-array-datum.rb
index b82f3eed8..17a2cbd1a 100644
--- a/c_glib/test/test-chunked-array-datum.rb
+++ b/c_glib/test/test-chunked-array-datum.rb
@@ -49,7 +49,8 @@ class TestChunkedArrayDatum < Test::Unit::TestCase
   end

   def test_to_string
-    assert_equal("ChunkedArray([\n" + "  [\n" + "    true,\n" + "    false\n" + "  ]\n" + "])", @datum.to_s)
+    assert_equal("ChunkedArray([\n" + "  [\n" + "    true,\n" + "    false\n" + "  ]\n" + "])",
+                 @datum.to_s)
   end

   def test_value
diff --git a/c_glib/test/test-large-list-array.rb b/c_glib/test/test-large-list-array.rb
index 2f7efab5a..fa9c92ec8 100644
--- a/c_glib/test/test-large-list-array.rb
+++ b/c_glib/test/test-large-list-array.rb
@@ -88,10 +88,10 @@ class TestLargeListArray < Test::Unit::TestCase

   def test_value_offsets
     array = build_large_list_array(Arrow::Int8DataType.new,
-                             [
-                               [-29, 29],
-                               [-1, 0, 1],
-                             ])
+                                   [
+                                     [-29, 29],
+                                     [-1, 0, 1],
+                                   ])
     assert_equal([0, 2, 5],
                  array.value_offsets)
   end
diff --git a/c_glib/test/test-record-batch-datum.rb b/c_glib/test/test-record-batch-datum.rb
index ec572e0f1..e2d9c0258 100644
--- a/c_glib/test/test-record-batch-datum.rb
+++ b/c_glib/test/test-record-batch-datum.rb
@@ -49,7 +49,8 @@ class TestRecordBatchDatum < Test::Unit::TestCase
   end

   def test_to_string
-    assert_equal("RecordBatch(visible:   [\n" + "    true,\n" + "    false\n" + "  ]\n" + ")", @datum.to_s)
+    assert_equal("RecordBatch(visible:   [\n" + "    true,\n" + "    false\n" + "  ]\n" + ")",
+                 @datum.to_s)
   end

   def test_value
diff --git a/c_glib/test/test-uint-array-builder.rb b/c_glib/test/test-uint-array-builder.rb
index 89621189b..3aa3a1c48 100644
--- a/c_glib/test/test-uint-array-builder.rb
+++ b/c_glib/test/test-uint-array-builder.rb
@@ -32,9 +32,9 @@ class TestUIntArrayBuilder < Test::Unit::TestCase
     values = [0, border_value]
     assert_equal(build_uint_array([*values, nil]),
                  Arrow::UInt16Array.new(3,
-                                       Arrow::Buffer.new(values.pack("S*")),
-                                       Arrow::Buffer.new([0b011].pack("C*")),
-                                       -1))
+                                        Arrow::Buffer.new(values.pack("S*")),
+                                        Arrow::Buffer.new([0b011].pack("C*")),
+                                        -1))
   end

   def test_uint32
@@ -42,9 +42,9 @@ class TestUIntArrayBuilder < Test::Unit::TestCase
     values = [0, border_value]
     assert_equal(build_uint_array([*values, nil]),
                  Arrow::UInt32Array.new(3,
-                                       Arrow::Buffer.new(values.pack("L*")),
-                                       Arrow::Buffer.new([0b011].pack("C*")),
-                                       -1))
+                                        Arrow::Buffer.new(values.pack("L*")),
+                                        Arrow::Buffer.new([0b011].pack("C*")),
+                                        -1))
   end

   def test_uint64
diff --git a/ruby/red-arrow/test/test-table.rb b/ruby/red-arrow/test/test-table.rb
index a69e92615..2117e60df 100644
--- a/ruby/red-arrow/test/test-table.rb
+++ b/ruby/red-arrow/test/test-table.rb
@@ -1521,10 +1521,10 @@ visible: false
                                       ["key2_right", [100, 20]],
                                       ["string", ["1-100", "2-20"]],
                                     ]),
-                    table1.join(table2,
-                                ["key1", "key2"],
-                                left_suffix: "_left",
-                                right_suffix: "_right"))
+                   table1.join(table2,
+                               ["key1", "key2"],
+                               left_suffix: "_left",
+                               right_suffix: "_right"))
     end
   end
 end
```

### What changes are included in this PR?

Reformat Ruby code.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.

* GitHub Issue: apache#45402

Lead-authored-by: Hiroyuki Sato <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
### Rationale for this change

Multiple contributors work on the Ruby code. Adding a Ruby linter helps keep the style consistent.

### What changes are included in this PR?

Add a Ruby linter (RuboCop).

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.

* GitHub Issue: apache#45398

Lead-authored-by: Hiroyuki Sato <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
Bumps [actions/setup-python](https://github.com/actions/setup-python) from 5.3.0 to 5.4.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a href="https://github.com/actions/setup-python/releases">actions/setup-python's releases</a>.</em></p>
<blockquote>
<h2>v5.4.0</h2>
<h2>What's Changed</h2>
<h3>Enhancements:</h3>
<ul>
<li>Update cache error message by <a href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a> in <a href="https://redirect.github.com/actions/setup-python/pull/968">actions/setup-python#968</a></li>
<li>Enhance Workflows: Add Ubuntu-24, Remove Python 3.8  by <a href="https://github.com/priya-kinthali"><code>@​priya-kinthali</code></a> in <a href="https://redirect.github.com/actions/setup-python/pull/985">actions/setup-python#985</a></li>
<li>Configure Dependabot settings by <a href="https://github.com/HarithaVattikuti"><code>@​HarithaVattikuti</code></a> in <a href="https://redirect.github.com/actions/setup-python/pull/1008">actions/setup-python#1008</a></li>
</ul>
<h3>Documentation changes:</h3>
<ul>
<li>Readme update - recommended permissions by <a href="https://github.com/benwells"><code>@​benwells</code></a> in <a href="https://redirect.github.com/actions/setup-python/pull/1009">actions/setup-python#1009</a></li>
<li>Improve Advanced Usage examples by <a href="https://github.com/lrq3000"><code>@​lrq3000</code></a> in <a href="https://redirect.github.com/actions/setup-python/pull/645">actions/setup-python#645</a></li>
</ul>
<h3>Dependency updates:</h3>
<ul>
<li>Upgrade <code>undici</code> from 5.28.4 to 5.28.5 by <a href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a href="https://redirect.github.com/actions/setup-python/pull/1012">actions/setup-python#1012</a></li>
<li>Upgrade <code>urllib3</code> from 1.25.9 to 1.26.19 in /<strong>tests</strong>/data by <a href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a href="https://redirect.github.com/actions/setup-python/pull/895">actions/setup-python#895</a></li>
<li>Upgrade <code>actions/publish-immutable-action</code> from 0.0.3 to 0.0.4 by <a href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a href="https://redirect.github.com/actions/setup-python/pull/1014">actions/setup-python#1014</a></li>
<li>Upgrade <code>@actions/http-client</code> from 2.2.1 to 2.2.3 by <a href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a href="https://redirect.github.com/actions/setup-python/pull/1020">actions/setup-python#1020</a></li>
<li>Upgrade <code>requests</code> from 2.24.0 to 2.32.2 in /<strong>tests</strong>/data by <a href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a href="https://redirect.github.com/actions/setup-python/pull/1019">actions/setup-python#1019</a></li>
<li>Upgrade <code>@actions/cache</code> to <code>^4.0.0</code> by <a href="https://github.com/priyagupta108"><code>@​priyagupta108</code></a> in <a href="https://redirect.github.com/actions/setup-python/pull/1007">actions/setup-python#1007</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/benwells"><code>@​benwells</code></a> made their first contribution in <a href="https://redirect.github.com/actions/setup-python/pull/1009">actions/setup-python#1009</a></li>
<li><a href="https://github.com/HarithaVattikuti"><code>@​HarithaVattikuti</code></a> made their first contribution in <a href="https://redirect.github.com/actions/setup-python/pull/1008">actions/setup-python#1008</a></li>
<li><a href="https://github.com/lrq3000"><code>@​lrq3000</code></a> made their first contribution in <a href="https://redirect.github.com/actions/setup-python/pull/645">actions/setup-python#645</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/actions/setup-python/compare/v5...v5.4.0">https://github.com/actions/setup-python/compare/v5...v5.4.0</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="https://github.com/actions/setup-python/commit/42375524e23c412d93fb67b49958b491fce71c38"><code>4237552</code></a> Improve Advanced Usage examples (<a href="https://redirect.github.com/actions/setup-python/issues/645">#645</a>)</li>
<li><a href="https://github.com/actions/setup-python/commit/709bfa58ba5a9cefd64220decb43e45cc2a85775"><code>709bfa5</code></a> Bump requests from 2.24.0 to 2.32.2 in /<strong>tests</strong>/data (<a href="https://redirect.github.com/actions/setup-python/issues/1019">#1019</a>)</li>
<li><a href="https://github.com/actions/setup-python/commit/ceb20b242df24c1f8bf064b3c943c31c2555ddd8"><code>ceb20b2</code></a> Bump <code>@​actions/http-client</code> from 2.2.1 to 2.2.3 (<a href="https://redirect.github.com/actions/setup-python/issues/1020">#1020</a>)</li>
<li><a href="https://github.com/actions/setup-python/commit/0dc2d2cf0c96a1befa4c9f1803d3b9eb03458031"><code>0dc2d2c</code></a> Bump actions/publish-immutable-action from 0.0.3 to 0.0.4 (<a href="https://redirect.github.com/actions/setup-python/issues/1014">#1014</a>)</li>
<li><a href="https://github.com/actions/setup-python/commit/feb9c6e7c63362340a8853582968731d6adb0454"><code>feb9c6e</code></a> Bump urllib3 from 1.25.9 to 1.26.19 in /<strong>tests</strong>/data (<a href="https://redirect.github.com/actions/setup-python/issues/895">#895</a>)</li>
<li><a href="https://github.com/actions/setup-python/commit/d0b4fc497a1daddb64da40799d80949aa3a0c559"><code>d0b4fc4</code></a> Bump undici from 5.28.4 to 5.28.5 (<a href="https://redirect.github.com/actions/setup-python/issues/1012">#1012</a>)</li>
<li><a href="https://github.com/actions/setup-python/commit/e3dfaac0fd011839eef87186e3b48165c3ba0162"><code>e3dfaac</code></a> Configure Dependabot settings (<a href="https://redirect.github.com/actions/setup-python/issues/1008">#1008</a>)</li>
<li><a href="https://github.com/actions/setup-python/commit/b8cf3eb1ebc9c7f906e4ca96fcdf2e289e25d230"><code>b8cf3eb</code></a> Use the new cache service: upgrade <code>@actions/cache</code> to <code>^4.0.0</code> (<a href="https://redirect.github.com/actions/setup-python/issues/1007">#1007</a>)</li>
<li><a href="https://github.com/actions/setup-python/commit/1928ae624dc06094d8c65f021a4700ea8fa56b9d"><code>1928ae6</code></a> Update README.md (<a href="https://redirect.github.com/actions/setup-python/issues/1009">#1009</a>)</li>
<li><a href="https://github.com/actions/setup-python/commit/3fddbee7870211eda9047db10474808be43c71ec"><code>3fddbee</code></a> Enhance Workflows: Add Ubuntu-24, Remove Python 3.8  (<a href="https://redirect.github.com/actions/setup-python/issues/985">#985</a>)</li>
<li>Additional commits viewable in <a href="https://github.com/actions/setup-python/compare/v5.3.0...v5.4.0">compare view</a></li>
</ul>
</details>
<br />

[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=actions/setup-python&package-manager=github_actions&previous-version=5.3.0&new-version=5.4.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sutou Kouhei <[email protected]>
Bumps [actions/setup-dotnet](https://github.com/actions/setup-dotnet) from 4.2.0 to 4.3.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a href="https://github.com/actions/setup-dotnet/releases">actions/setup-dotnet's releases</a>.</em></p>
<blockquote>
<h2>v4.3.0</h2>
<h2>What's Changed</h2>
<ul>
<li>README update - add permissions section by <a href="https://github.com/benwells"><code>@​benwells</code></a> in <a href="https://redirect.github.com/actions/setup-dotnet/pull/587">actions/setup-dotnet#587</a></li>
<li>Configure Dependabot settings by <a href="https://github.com/HarithaVattikuti"><code>@​HarithaVattikuti</code></a> in <a href="https://redirect.github.com/actions/setup-dotnet/pull/585">actions/setup-dotnet#585</a></li>
<li>Upgrade <strong>cache</strong> from 3.2.4 to 4.0.0 by <a href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a> in <a href="https://redirect.github.com/actions/setup-dotnet/pull/586">actions/setup-dotnet#586</a></li>
<li>Upgrade <strong>actions/publish-immutable-action</strong> from 0.0.3 to 0.0.4 by <a href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a href="https://redirect.github.com/actions/setup-dotnet/pull/590">actions/setup-dotnet#590</a></li>
<li>Upgrade <strong><code>@​actions/http-client</code></strong> from 2.2.1 to 2.2.3 by <a href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a href="https://redirect.github.com/actions/setup-dotnet/pull/592">actions/setup-dotnet#592</a></li>
<li>Upgrade <strong>undici</strong> from 5.28.4 to 5.28.5 by <a href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a href="https://redirect.github.com/actions/setup-dotnet/pull/596">actions/setup-dotnet#596</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/benwells"><code>@​benwells</code></a> made their first contribution in <a href="https://redirect.github.com/actions/setup-dotnet/pull/587">actions/setup-dotnet#587</a></li>
<li><a href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a> made their first contribution in <a href="https://redirect.github.com/actions/setup-dotnet/pull/586">actions/setup-dotnet#586</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/actions/setup-dotnet/compare/v4...v4.3.0">https://github.com/actions/setup-dotnet/compare/v4...v4.3.0</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="https://github.com/actions/setup-dotnet/commit/3951f0dfe7a07e2313ec93c75700083e2005cbab"><code>3951f0d</code></a> Bump undici from 5.28.4 to 5.28.5 (<a href="https://redirect.github.com/actions/setup-dotnet/issues/596">#596</a>)</li>
<li><a href="https://github.com/actions/setup-dotnet/commit/4849e736f1bd76c37a6dd7d4b8f84581fb6d8baf"><code>4849e73</code></a> Bump <code>@​actions/http-client</code> from 2.2.1 to 2.2.3 (<a href="https://redirect.github.com/actions/setup-dotnet/issues/592">#592</a>)</li>
<li><a href="https://github.com/actions/setup-dotnet/commit/3e76c4dc412e00b3fcb875aaddbf27000f2d428b"><code>3e76c4d</code></a> Bump actions/publish-immutable-action from 0.0.3 to 0.0.4 (<a href="https://redirect.github.com/actions/setup-dotnet/issues/590">#590</a>)</li>
<li><a href="https://github.com/actions/setup-dotnet/commit/91b379339b47ad470db060ea16315b66313ae78e"><code>91b3793</code></a> Configure Dependabot settings (<a href="https://redirect.github.com/actions/setup-dotnet/issues/585">#585</a>)</li>
<li><a href="https://github.com/actions/setup-dotnet/commit/4b37d22250db48bbc2ccdde8ef5dbf479769f7ac"><code>4b37d22</code></a> upgrade cache from 3.2.4 to 4.0.0 (<a href="https://redirect.github.com/actions/setup-dotnet/issues/586">#586</a>)</li>
<li><a href="https://github.com/actions/setup-dotnet/commit/f9d0f6282caace67f83f94183a90502136d73900"><code>f9d0f62</code></a> Update README.md (<a href="https://redirect.github.com/actions/setup-dotnet/issues/587">#587</a>)</li>
<li>See full diff in <a href="https://github.com/actions/setup-dotnet/compare/v4.2.0...v4.3.0">compare view</a></li>
</ul>
</details>
<br />

[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=actions/setup-dotnet&package-manager=github_actions&previous-version=4.2.0&new-version=4.3.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sutou Kouhei <[email protected]>
### Rationale for this change
apache#37848 upgraded the JIT compiler for LLVM/Gandiva code, which introduced linking errors with newer versions of LLVM. Some Gandiva tests were disabled, and here at Dremio I am running into the same linking problem when trying to build with an updated Arrow library. After reading some threads on the LLVM Discord server, it appears that updating to LLVM 18.1 will fix the symbol issue. I tested locally and was able to re-enable the disabled Java tests that were showing the unexported ORC symbol issue.

More discussion in apache/arrow-java#63.

### What changes are included in this PR?
Updating vcpkg and pinning LLVM to 18.1. Notably, I encountered some build problems with the newest vcpkg update, which appeared to be related to the updated gRPC libraries. My Arrow jar CI build was timing out in this case with no clear error in the logs. The vcpkg version included here has the LLVM 18 update but not the gRPC update (which isn't needed for this issue).

### Are these changes tested?
Covered by existing tests. Will also re-enable the disabled Java tests in a future change.

### Are there any user-facing changes?
No.

* GitHub Issue: apache#45132

Lead-authored-by: Logan Riggs <[email protected]>
Co-authored-by: lriggs <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
### Rationale for this change

apache#26648 proposes an optimization for checking sparse array equality by detecting contiguous runs. This PR implements that change.

### What changes are included in this PR?

Previously, sparse array comparison checked elements one by one. With this change, contiguous runs are detected by checking whether the current and previous child_ids are equal, and each run is compared as a whole.
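
The run-detection idea can be sketched roughly as follows. This is a minimal illustration, not Arrow's implementation: `SparseUnion`, `SparseUnionEquals`, and the plain `std::vector<int>` children are hypothetical stand-ins, and the sketch assumes every child is as long as the union and every child id is a valid index.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a sparse union array: every child has the same
// length as the union, and child_ids[i] selects which child holds slot i.
struct SparseUnion {
  std::vector<std::int8_t> child_ids;
  std::vector<std::vector<int>> children;
};

// Compare two sparse unions run by run instead of slot by slot.
bool SparseUnionEquals(const SparseUnion& left, const SparseUnion& right) {
  if (left.child_ids != right.child_ids) return false;
  const std::size_t length = left.child_ids.size();
  std::size_t run_start = 0;
  for (std::size_t i = 1; i <= length; ++i) {
    // The run continues while the current child id equals the previous one.
    if (i < length && left.child_ids[i] == left.child_ids[i - 1]) continue;
    // The run [run_start, i) ended: compare it with one ranged call.
    const auto id = static_cast<std::size_t>(left.child_ids[run_start]);
    const int* l = left.children[id].data();
    const int* r = right.children[id].data();
    if (!std::equal(l + run_start, l + i, r + run_start)) return false;
    run_start = i;
  }
  return true;
}
```

Slots that a child does not own are never inspected, so two unions can be equal even when their unselected child values differ.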

### Are these changes tested?

already covered by existing unit tests

### Are there any user-facing changes?

no

* GitHub Issue: apache#26648

Lead-authored-by: shawn <[email protected]>
Co-authored-by: Antoine Pitrou <[email protected]>
Signed-off-by: Antoine Pitrou <[email protected]>
### Rationale for this change

[Table::Validate](https://arrow.apache.org/docs/cpp/api/table.html#_CPPv4NK5arrow5Table8ValidateEv) is available in the C++ API.
But GLib doesn't support that method yet.

### What changes are included in this PR?

This PR adds a validation method to the table class.
Before this change, the `Validate()` method was called implicitly in the
`garrow_table_new_values()`, `garrow_table_new_arrays()`,
and `garrow_table_new_chunked_arrays()` functions.
This PR removes those implicit calls and adds validation as a separate method.
Users now need to call `garrow_table_validate()` explicitly.
This is a backward-incompatible change.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

Yes.

**This PR includes breaking changes to public APIs.**

* GitHub Issue: apache#44761

Lead-authored-by: Hiroyuki Sato <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
…pache#45372)

### Rationale for this change

See apache#45371.

### What changes are included in this PR?

Use `std::atomic_compare_exchange` to initialize `boxed_columns_[i]` so they are correctly written only once. This means that a reference to `boxed_columns_` is safe to read after each element has been initialized.
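
The write-once pattern described here can be sketched as below. This is only an illustration under stated assumptions: `LazyColumns` and its string "columns" are hypothetical, and the sketch uses `std::atomic<T*>::compare_exchange_strong` on raw pointers as a stand-in rather than the exact call used in the PR.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a batch that lazily "boxes" its columns.
class LazyColumns {
 public:
  explicit LazyColumns(std::size_t num_columns) : slots_(num_columns) {
    for (auto& slot : slots_) slot.store(nullptr);
  }
  ~LazyColumns() {
    for (auto& slot : slots_) delete slot.load();
  }
  LazyColumns(const LazyColumns&) = delete;
  LazyColumns& operator=(const LazyColumns&) = delete;

  const std::string& column(std::size_t i) {
    // Fast path: the slot was already published by some thread.
    if (std::string* boxed = slots_[i].load(std::memory_order_acquire)) {
      return *boxed;
    }
    // Slow path: build a candidate, then try to publish it exactly once.
    std::unique_ptr<std::string> fresh(
        new std::string("column-" + std::to_string(i)));
    std::string* expected = nullptr;
    if (slots_[i].compare_exchange_strong(expected, fresh.get(),
                                          std::memory_order_acq_rel,
                                          std::memory_order_acquire)) {
      return *fresh.release();  // we won: the slot now owns the object
    }
    // Another thread won the race: `expected` holds its pointer and our
    // candidate is freed when `fresh` goes out of scope.
    return *expected;
  }

 private:
  std::vector<std::atomic<std::string*>> slots_;
};
```

Because a slot only ever changes from null to one published value, a reference returned by `column(i)` stays valid for the lifetime of the object no matter how many threads race on the first access.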

### Are these changes tested?

Yes, there is a test case `TestRecordBatch.ColumnsThreadSafety` which passes under TSAN.

### Are there any user-facing changes?

No

**This PR contains a "Critical Fix".**

Without this fix, concurrent calls to `SimpleRecordBatch::columns` could lead to an invalid memory access and crash.
* GitHub Issue: apache#45371

Lead-authored-by: Colin Schultz <[email protected]>
Co-authored-by: Colin <[email protected]>
Co-authored-by: Antoine Pitrou <[email protected]>
Co-authored-by: Antoine Pitrou <[email protected]>
Signed-off-by: Antoine Pitrou <[email protected]>
…che#45370)

### Rationale for this change

The PR apache#40237 introduced code:

```
// time to time
template <typename To, typename From, typename T = typename To::TypeClass>
```

However, the `Time64Type::TypeClass` doesn't exist, so SFINAE always failed.
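
The failure mode is easy to reproduce in isolation. In the sketch below, all names (`HasTypeClass`, `NoTypeClass`, `Select`) are hypothetical and not taken from Arrow. It shows how a default template argument that names a missing nested type removes an overload without any diagnostic, so a fallback is used instead.

```cpp
#include <cassert>

// Hypothetical types: only HasTypeClass declares the nested alias.
struct HasTypeClass {
  using TypeClass = int;
};
struct NoTypeClass {};

// Preferred overload: viable only when To::TypeClass names a type.
template <typename To, typename T = typename To::TypeClass>
constexpr int Select(int) {
  return 1;
}

// Fallback overload: always viable, chosen when the one above is discarded.
template <typename To>
constexpr int Select(long) {
  return 0;
}

static_assert(Select<HasTypeClass>(0) == 1, "preferred overload is chosen");
static_assert(Select<NoTypeClass>(0) == 0, "substitution failure is silent");
```

Since the substitution failure is not an error, nothing signals that the preferred overload is never selected for a type lacking the nested name.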

### Are these changes tested?

Yes

### Are there any user-facing changes?

No.

* GitHub Issue: apache#45362

Authored-by: mwish <[email protected]>
Signed-off-by: Antoine Pitrou <[email protected]>
…pandas 2.3 dev version (apache#45428)

### Rationale for this change

Small follow-up on apache#45383 to ensure this version comparison also does the right thing for the currently not-yet-released dev version of 2.3.0

* GitHub Issue: apache#45427

Authored-by: Joris Van den Bossche <[email protected]>
Signed-off-by: Raúl Cumplido <[email protected]>
…#45392)

### Rationale for this change

`RankQuantileOptions` is currently not exposed in PyArrow, and the CI job breaks when `-W error` is used.

### What changes are included in this PR?

Expose `RankQuantileOptions` and test the options and the kernel from pyarrow.

It also includes a minor refactor that moves the unwrap-sort-keys logic into a common function.

### Are these changes tested?

Yes

### Are there any user-facing changes?

The options for the new kernel are exposed on pyarrow.
* GitHub Issue: apache#45380

Lead-authored-by: Raúl Cumplido <[email protected]>
Co-authored-by: Antoine Pitrou <[email protected]>
Signed-off-by: Antoine Pitrou <[email protected]>
kou and others added 21 commits February 27, 2025 15:21
…che#45561)

### Rationale for this change

The "column" entries in the examples have duplicated values for the same column, but "column" must have only one value for each column.

### What changes are included in this PR?

* Use "rowspan" for "column" in logical examples.
* Remove duplicated values from "column" in physical examples.
* Add missing "offsets" to "statistics" in physical examples.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

Yes.
* GitHub Issue: apache#45560

Authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
….metadata (apache#45583)

### Rationale for this change

Before, these types would be interpreted as the `"object"` type, and therefore their `precision` and `scale` attributes would not be preserved in the `"metadata"`.

### What changes are included in this PR?

Uses `pa.types.is_decimal` instead of an `isinstance` check against just `Decimal128Type` to determine a `"decimal"` pandas type.

### Are these changes tested?

Yes

### Are there any user-facing changes?

Yes, but should not break compatibility.

* GitHub Issue: apache#45582

Authored-by: Matthew Roeschke <[email protected]>
Signed-off-by: Antoine Pitrou <[email protected]>
…apache#44171)

### Rationale for this change

There is a bug: when the column dtype is `np.bytes`, the code goes to the final branch and runs `level = level.astype(dtype)`.

### Are these changes tested?
Yes
* GitHub Issue: apache#44188

Lead-authored-by: Piong1997 <[email protected]>
Co-authored-by: Antoine Pitrou <[email protected]>
Signed-off-by: Antoine Pitrou <[email protected]>
…undled Thrift (apache#45637)

### Rationale for this change

We must use the same Boost for Apache Arrow C++ itself and for bundled libraries, including Apache Thrift. We can enforce this by specifying the target Boost explicitly instead of detecting Boost in each bundled library.

### What changes are included in this PR?

Always specify `-DBoost_INCLUDE_DIR`.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

Yes.
* GitHub Issue: apache#45628

Authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
### Rationale for this change

[arrow::BinaryViewArray](https://arrow.apache.org/docs/cpp/api/array.html#_CPPv4N5arrow15BinaryViewArrayE) is available in the C++ API.
But GLib doesn't support that class yet.

### What changes are included in this PR?

Add `GArrowBinaryViewArray` to wrap the `arrow::BinaryViewArray` class.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

Yes.
* GitHub Issue: apache#45649

Lead-authored-by: Hiroyuki Sato <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
…e#45657)

Bumps [actions/download-artifact](https://github.com/actions/download-artifact) from 4.1.8 to 4.1.9.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a href="https://github.com/actions/download-artifact/releases">actions/download-artifact's releases</a>.</em></p>
<blockquote>
<h2>v4.1.9</h2>
<h2>What's Changed</h2>
<ul>
<li>Add workflow file for publishing releases to immutable action package by <a href="https://github.com/Jcambass"><code>@​Jcambass</code></a> in <a href="https://redirect.github.com/actions/download-artifact/pull/354">actions/download-artifact#354</a></li>
<li>docs: small migration fix by <a href="https://github.com/froblesmartin"><code>@​froblesmartin</code></a> in <a href="https://redirect.github.com/actions/download-artifact/pull/370">actions/download-artifact#370</a></li>
<li>Update MIGRATION.md by <a href="https://github.com/andyfeller"><code>@​andyfeller</code></a> in <a href="https://redirect.github.com/actions/download-artifact/pull/372">actions/download-artifact#372</a></li>
<li>Update artifact package to 2.2.2 by <a href="https://github.com/yacaovsnc"><code>@​yacaovsnc</code></a> in <a href="https://redirect.github.com/actions/download-artifact/pull/380">actions/download-artifact#380</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/Jcambass"><code>@​Jcambass</code></a> made their first contribution in <a href="https://redirect.github.com/actions/download-artifact/pull/354">actions/download-artifact#354</a></li>
<li><a href="https://github.com/froblesmartin"><code>@​froblesmartin</code></a> made their first contribution in <a href="https://redirect.github.com/actions/download-artifact/pull/370">actions/download-artifact#370</a></li>
<li><a href="https://github.com/andyfeller"><code>@​andyfeller</code></a> made their first contribution in <a href="https://redirect.github.com/actions/download-artifact/pull/372">actions/download-artifact#372</a></li>
<li><a href="https://github.com/yacaovsnc"><code>@​yacaovsnc</code></a> made their first contribution in <a href="https://redirect.github.com/actions/download-artifact/pull/380">actions/download-artifact#380</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a href="https://github.com/actions/download-artifact/compare/v4...v4.1.9">https://github.com/actions/download-artifact/compare/v4...v4.1.9</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="https://github.com/actions/download-artifact/commit/cc203385981b70ca67e1cc392babf9cc229d5806"><code>cc20338</code></a> Merge pull request <a href="https://redirect.github.com/actions/download-artifact/issues/380">#380</a> from actions/yacaovsnc/release_4_1_9</li>
<li><a href="https://github.com/actions/download-artifact/commit/1fc0fee191f40422f502da571c0f01ff460afe53"><code>1fc0fee</code></a> Update artifact package to 2.2.2</li>
<li><a href="https://github.com/actions/download-artifact/commit/7fba95161a0924506ed1ae69cdbae8371ee00b3f"><code>7fba951</code></a> Merge pull request <a href="https://redirect.github.com/actions/download-artifact/issues/372">#372</a> from andyfeller/patch-1</li>
<li><a href="https://github.com/actions/download-artifact/commit/f9ceb7763ba1fdfd81b2e2f93aa1f6015ff6b35d"><code>f9ceb77</code></a> Update MIGRATION.md</li>
<li><a href="https://github.com/actions/download-artifact/commit/533298bc57c27f112a2c04a74a04a4d43e2866fd"><code>533298b</code></a> Merge pull request <a href="https://redirect.github.com/actions/download-artifact/issues/370">#370</a> from froblesmartin/patch-1</li>
<li><a href="https://github.com/actions/download-artifact/commit/d06289e120b300840a833b25db66cb8c19f5d274"><code>d06289e</code></a> docs: small migration fix</li>
<li><a href="https://github.com/actions/download-artifact/commit/d0ce8fd1167ed839810201de977912a090ab10a7"><code>d0ce8fd</code></a> Merge pull request <a href="https://redirect.github.com/actions/download-artifact/issues/354">#354</a> from actions/Jcambass-patch-1</li>
<li><a href="https://github.com/actions/download-artifact/commit/1ce0d91ace59dfbf6763107ee5aa8466ebbadf48"><code>1ce0d91</code></a> Add workflow file for publishing releases to immutable action package</li>
<li>See full diff in <a href="https://github.com/actions/download-artifact/compare/v4.1.8...v4.1.9">compare view</a></li>
</ul>
</details>
<br />

[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=actions/download-artifact&package-manager=github_actions&previous-version=4.1.8&new-version=4.1.9)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sutou Kouhei <[email protected]>
…45654)

Bumps [Grpc.Tools](https://github.com/grpc/grpc) from 2.69.0 to 2.70.0.
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a href="https://github.com/grpc/grpc/commits">compare view</a></li>
</ul>
</details>
<br />

[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=Grpc.Tools&package-manager=nuget&previous-version=2.69.0&new-version=2.70.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Curt Hagenlocher <[email protected]>
…e#45655)

Bumps [ZstdSharp.Port](https://github.com/oleg-st/ZstdSharp) from 0.8.4 to 0.8.5.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a href="https://github.com/oleg-st/ZstdSharp/releases">ZstdSharp.Port's releases</a>.</em></p>
<blockquote>
<h2>0.8.5</h2>
<p>Ported zstd v1.5.7</p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="https://github.com/oleg-st/ZstdSharp/commit/033a60cc132a437ff64467fd0a531b45638ce2f2"><code>033a60c</code></a> 0.8.5</li>
<li><a href="https://github.com/oleg-st/ZstdSharp/commit/83cd9776badd75e79cc27086035af4adeadda21b"><code>83cd977</code></a> Ported zstd v1.5.7</li>
<li>See full diff in <a href="https://github.com/oleg-st/ZstdSharp/compare/0.8.4...0.8.5">compare view</a></li>
</ul>
</details>
<br />

[![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=ZstdSharp.Port&package-manager=nuget&previous-version=0.8.4&new-version=0.8.5)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Curt Hagenlocher <[email protected]>
…comma) (apache#45660)

### Rationale for this change

Preparing to add a Ruby lint rule that requires a space after each comma.
Example: `[1,2,3]` -> `[1, 2, 3]`

### What changes are included in this PR?

Add space after comma.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.

* GitHub Issue: apache#45659

Authored-by: Hiroyuki Sato <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
… comma) (apache#45662)

### Rationale for this change

Enable automatic style checking for Ruby.
A space is required after a comma, as in `[1, 2]` instead of `[1,2]`.

### What changes are included in this PR?

Enable automatic lint checking (add space after comma)

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.
* GitHub Issue: apache#45661

Authored-by: Hiroyuki Sato <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
…partitioning batches ahead to reduce contention (apache#45612)

### Rationale for this change

High contention is observed in the Swiss join build phase, as shown in apache#45611.

Some background on the contention: to build the hash table in parallel, we first build `N` partitioned hash tables (the "build" stage), then merge them into the final hash table (the "merge" stage, less interesting in this PR). In the build stage, each exec batch from the build-side table is distributed to one of the `M` threads. Each thread processes each of its assigned batches as follows:
1. Partition the batch based on the hash of the join key into `N` partitions;
2. Insert the rows of each of the `N` partitions into the corresponding one of the `N` partitioned hash tables.

Because each batch contains arbitrary data, all `M` threads will write to all `N` partitioned hash tables simultaneously. So we use (spin) locks on these partitioned hash tables, thus the contention.

### What changes are included in this PR?

Instead of all `M` threads writing to all `N` partitioned hash tables simultaneously, we can further split the build stage into two:
1. Partition stage: `M` threads, each only partitions the batches and preserves the partition info of each batch;
2. (New) Build stage: `N` threads, each builds one of the `N` partitioned hash tables. Each thread iterates over all the batches and inserts only the rows that belong to its assigned hash table.
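
The data flow of the two stages can be sketched as follows. This is a simplified illustration, not the Acero code: all names are hypothetical, `key % num_partitions` stands in for partitioning by the hash of the join key, `std::unordered_set` stands in for a partitioned hash table, and the per-batch and per-partition tasks run in plain loops where the real implementation would schedule them on threads.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

using Batch = std::vector<std::uint64_t>;         // join keys of one batch
using RowIndices = std::vector<std::size_t>;      // rows of a batch in one partition
using BatchPartitions = std::vector<RowIndices>;  // one RowIndices per partition
using HashTable = std::unordered_set<std::uint64_t>;

// Partition stage task (one per batch): only record which rows belong to
// which partition; no hash table is touched here.
BatchPartitions PartitionBatch(const Batch& batch, std::size_t num_partitions) {
  BatchPartitions parts(num_partitions);
  for (std::size_t row = 0; row < batch.size(); ++row) {
    parts[batch[row] % num_partitions].push_back(row);
  }
  return parts;
}

// Build stage task (one per partition): visit every batch but insert only
// the rows recorded for this partition. The task writes to a single table
// that no other task touches, so no lock is needed.
HashTable BuildPartition(std::size_t partition,
                         const std::vector<Batch>& batches,
                         const std::vector<BatchPartitions>& partitioned) {
  HashTable table;
  for (std::size_t b = 0; b < batches.size(); ++b) {
    for (std::size_t row : partitioned[b][partition]) {
      table.insert(batches[b][row]);
    }
  }
  return table;
}

// Drives both stages sequentially; each loop iteration is an independent task.
std::vector<HashTable> BuildAll(const std::vector<Batch>& batches,
                                std::size_t num_partitions) {
  std::vector<BatchPartitions> partitioned;
  for (const Batch& batch : batches) {
    partitioned.push_back(PartitionBatch(batch, num_partitions));
  }
  std::vector<HashTable> tables;
  for (std::size_t p = 0; p < num_partitions; ++p) {
    tables.push_back(BuildPartition(p, batches, partitioned));
  }
  return tables;
}
```

The trade-off visible in the sketch is that every build task scans the partition info of all batches, in exchange for never sharing a table between writers.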

#### Performance

Take [this benchmark](https://github.com/apache/arrow/blob/31994b5c2069a768e70fba16d1f521e4de64139e/cpp/src/arrow/acero/hash_join_benchmark.cc#L301), which is dedicated to measuring parallel build performance: the results show that by eliminating the contention, we can achieve up to a **10x** (on Arm) and **5x** (on Intel) performance boost for the Swiss join build. I picked `krows=64` and `krows=512` and made a chart.

![Arm (1)](https://github.com/user-attachments/assets/21e8f198-9e47-46c9-a04b-7f24105968a1)

![Intel](https://github.com/user-attachments/assets/b61fb614-8422-4adc-b57c-2f83b7a7637b)

Note that single-thread performance is actually down a little (reasons detailed later). But IMO this is trivial compared to the total win in the multi-threaded cases.

Detailed benchmark numbers (on Arm) follow.

<details>
<summary>Benchmark Before (Click to expand)</summary>

```
Run on (10 X 24.1216 MHz CPU s)
CPU Caches:
  L1 Data 64 KiB
  L1 Instruction 128 KiB
  L2 Unified 4096 KiB (x10)
Load Average: 3.47, 2.76, 2.54
-----------------------------------------------------------------------------------------------------------------------------------------
Benchmark                                                                               Time             CPU   Iterations UserCounters...
-----------------------------------------------------------------------------------------------------------------------------------------
BM_HashJoinBasic_BuildParallelism/Threads:1/HashTable krows:1/process_time          53315 ns        53284 ns        12295 rows/sec=19.2179M/s
BM_HashJoinBasic_BuildParallelism/Threads:2/HashTable krows:1/process_time          73001 ns        80862 ns         8606 rows/sec=12.6636M/s
BM_HashJoinBasic_BuildParallelism/Threads:3/HashTable krows:1/process_time          88003 ns        95127 ns         7429 rows/sec=10.7645M/s
BM_HashJoinBasic_BuildParallelism/Threads:4/HashTable krows:1/process_time          93248 ns       120317 ns         5135 rows/sec=8.51085M/s
BM_HashJoinBasic_BuildParallelism/Threads:5/HashTable krows:1/process_time         109931 ns       140384 ns         4527 rows/sec=7.29427M/s
BM_HashJoinBasic_BuildParallelism/Threads:6/HashTable krows:1/process_time         127997 ns       180633 ns         3546 rows/sec=5.66897M/s
BM_HashJoinBasic_BuildParallelism/Threads:7/HashTable krows:1/process_time         125138 ns       185416 ns         3267 rows/sec=5.52271M/s
BM_HashJoinBasic_BuildParallelism/Threads:8/HashTable krows:1/process_time         142611 ns       236355 ns         3613 rows/sec=4.33247M/s
BM_HashJoinBasic_BuildParallelism/Threads:9/HashTable krows:1/process_time         169663 ns       336376 ns         2158 rows/sec=3.04421M/s
BM_HashJoinBasic_BuildParallelism/Threads:10/HashTable krows:1/process_time        174708 ns       362630 ns         1943 rows/sec=2.82381M/s
BM_HashJoinBasic_BuildParallelism/Threads:11/HashTable krows:1/process_time        186939 ns       409803 ns         1693 rows/sec=2.49876M/s
BM_HashJoinBasic_BuildParallelism/Threads:12/HashTable krows:1/process_time        196817 ns       451213 ns         1542 rows/sec=2.26944M/s
BM_HashJoinBasic_BuildParallelism/Threads:13/HashTable krows:1/process_time        209194 ns       501488 ns         1407 rows/sec=2.04192M/s
BM_HashJoinBasic_BuildParallelism/Threads:14/HashTable krows:1/process_time        218517 ns       544590 ns         1299 rows/sec=1.88031M/s
BM_HashJoinBasic_BuildParallelism/Threads:15/HashTable krows:1/process_time        224407 ns       579947 ns         1206 rows/sec=1.76568M/s
BM_HashJoinBasic_BuildParallelism/Threads:16/HashTable krows:1/process_time        236201 ns       630016 ns         1134 rows/sec=1.62536M/s
BM_HashJoinBasic_BuildParallelism/Threads:1/HashTable krows:8/process_time         213061 ns       213082 ns         3276 rows/sec=38.4453M/s
BM_HashJoinBasic_BuildParallelism/Threads:2/HashTable krows:8/process_time         260230 ns       374124 ns         1900 rows/sec=21.8965M/s
BM_HashJoinBasic_BuildParallelism/Threads:3/HashTable krows:8/process_time         275723 ns       483754 ns         1331 rows/sec=16.9342M/s
BM_HashJoinBasic_BuildParallelism/Threads:4/HashTable krows:8/process_time         326784 ns       711857 ns          974 rows/sec=11.5079M/s
BM_HashJoinBasic_BuildParallelism/Threads:5/HashTable krows:8/process_time         351987 ns       861883 ns          798 rows/sec=9.50477M/s
BM_HashJoinBasic_BuildParallelism/Threads:6/HashTable krows:8/process_time         370956 ns      1000389 ns          683 rows/sec=8.18881M/s
BM_HashJoinBasic_BuildParallelism/Threads:7/HashTable krows:8/process_time         384963 ns      1064672 ns          646 rows/sec=7.69439M/s
BM_HashJoinBasic_BuildParallelism/Threads:8/HashTable krows:8/process_time         406914 ns      1172464 ns          606 rows/sec=6.987M/s
BM_HashJoinBasic_BuildParallelism/Threads:9/HashTable krows:8/process_time         425632 ns      1252871 ns          567 rows/sec=6.53858M/s
BM_HashJoinBasic_BuildParallelism/Threads:10/HashTable krows:8/process_time        433262 ns      1287050 ns          524 rows/sec=6.36494M/s
BM_HashJoinBasic_BuildParallelism/Threads:11/HashTable krows:8/process_time        443328 ns      1329822 ns          528 rows/sec=6.16022M/s
BM_HashJoinBasic_BuildParallelism/Threads:12/HashTable krows:8/process_time        450736 ns      1383203 ns          508 rows/sec=5.92249M/s
BM_HashJoinBasic_BuildParallelism/Threads:13/HashTable krows:8/process_time        465523 ns      1425956 ns          495 rows/sec=5.74492M/s
BM_HashJoinBasic_BuildParallelism/Threads:14/HashTable krows:8/process_time        471723 ns      1462440 ns          475 rows/sec=5.6016M/s
BM_HashJoinBasic_BuildParallelism/Threads:15/HashTable krows:8/process_time        484823 ns      1524638 ns          464 rows/sec=5.37308M/s
BM_HashJoinBasic_BuildParallelism/Threads:16/HashTable krows:8/process_time        485260 ns      1541146 ns          453 rows/sec=5.31553M/s
BM_HashJoinBasic_BuildParallelism/Threads:1/HashTable krows:64/process_time       1716517 ns      1716522 ns          404 rows/sec=38.1795M/s
BM_HashJoinBasic_BuildParallelism/Threads:2/HashTable krows:64/process_time       1762125 ns      2982570 ns          235 rows/sec=21.973M/s
BM_HashJoinBasic_BuildParallelism/Threads:3/HashTable krows:64/process_time       1826549 ns      4331683 ns          161 rows/sec=15.1295M/s
BM_HashJoinBasic_BuildParallelism/Threads:4/HashTable krows:64/process_time       2032670 ns      6228081 ns          111 rows/sec=10.5227M/s
BM_HashJoinBasic_BuildParallelism/Threads:5/HashTable krows:64/process_time       2008129 ns      7401860 ns           93 rows/sec=8.85399M/s
BM_HashJoinBasic_BuildParallelism/Threads:6/HashTable krows:64/process_time       2022595 ns      8733805 ns           77 rows/sec=7.50372M/s
BM_HashJoinBasic_BuildParallelism/Threads:7/HashTable krows:64/process_time       2084620 ns     10333721 ns           68 rows/sec=6.34196M/s
BM_HashJoinBasic_BuildParallelism/Threads:8/HashTable krows:64/process_time       2186912 ns     12275696 ns           56 rows/sec=5.33868M/s
BM_HashJoinBasic_BuildParallelism/Threads:9/HashTable krows:64/process_time       3061302 ns     20949833 ns           24 rows/sec=3.12823M/s
BM_HashJoinBasic_BuildParallelism/Threads:10/HashTable krows:64/process_time      4241129 ns     34483810 ns           21 rows/sec=1.90049M/s
BM_HashJoinBasic_BuildParallelism/Threads:11/HashTable krows:64/process_time      4123000 ns     33438545 ns           22 rows/sec=1.95989M/s
BM_HashJoinBasic_BuildParallelism/Threads:12/HashTable krows:64/process_time      5282983 ns     44385773 ns           22 rows/sec=1.47651M/s
BM_HashJoinBasic_BuildParallelism/Threads:13/HashTable krows:64/process_time      4214940 ns     33978250 ns           16 rows/sec=1.92876M/s
BM_HashJoinBasic_BuildParallelism/Threads:14/HashTable krows:64/process_time      9775500 ns     85277400 ns           10 rows/sec=768.504k/s
BM_HashJoinBasic_BuildParallelism/Threads:15/HashTable krows:64/process_time      8448605 ns     40459190 ns           21 rows/sec=1.61981M/s
BM_HashJoinBasic_BuildParallelism/Threads:16/HashTable krows:64/process_time      8311054 ns     74384765 ns           17 rows/sec=881.041k/s
BM_HashJoinBasic_BuildParallelism/Threads:1/HashTable krows:512/process_time     15124972 ns     15124152 ns           46 rows/sec=34.6656M/s
BM_HashJoinBasic_BuildParallelism/Threads:2/HashTable krows:512/process_time      9977718 ns     19336583 ns           36 rows/sec=27.1138M/s
BM_HashJoinBasic_BuildParallelism/Threads:3/HashTable krows:512/process_time      8751039 ns     23240667 ns           30 rows/sec=22.5591M/s
BM_HashJoinBasic_BuildParallelism/Threads:4/HashTable krows:512/process_time      9839327 ns     33597150 ns           20 rows/sec=15.6051M/s
BM_HashJoinBasic_BuildParallelism/Threads:5/HashTable krows:512/process_time     10058853 ns     41758118 ns           17 rows/sec=12.5554M/s
BM_HashJoinBasic_BuildParallelism/Threads:6/HashTable krows:512/process_time     10139465 ns     49509846 ns           13 rows/sec=10.5896M/s
BM_HashJoinBasic_BuildParallelism/Threads:7/HashTable krows:512/process_time     10311708 ns     58393545 ns           11 rows/sec=8.97853M/s
BM_HashJoinBasic_BuildParallelism/Threads:8/HashTable krows:512/process_time     10327653 ns     65427667 ns            9 rows/sec=8.01325M/s
BM_HashJoinBasic_BuildParallelism/Threads:9/HashTable krows:512/process_time     13476536 ns     99947571 ns            7 rows/sec=5.24563M/s
BM_HashJoinBasic_BuildParallelism/Threads:10/HashTable krows:512/process_time    17290050 ns    143569000 ns            5 rows/sec=3.65182M/s
BM_HashJoinBasic_BuildParallelism/Threads:11/HashTable krows:512/process_time    20576010 ns    176557250 ns            4 rows/sec=2.96951M/s
BM_HashJoinBasic_BuildParallelism/Threads:12/HashTable krows:512/process_time    24393117 ns    205985600 ns            5 rows/sec=2.54527M/s
BM_HashJoinBasic_BuildParallelism/Threads:13/HashTable krows:512/process_time    21039639 ns    168724000 ns            3 rows/sec=3.10737M/s
BM_HashJoinBasic_BuildParallelism/Threads:14/HashTable krows:512/process_time    38604708 ns    333330667 ns            3 rows/sec=1.57288M/s
BM_HashJoinBasic_BuildParallelism/Threads:15/HashTable krows:512/process_time    63189833 ns    502763000 ns            1 rows/sec=1042.81k/s
BM_HashJoinBasic_BuildParallelism/Threads:16/HashTable krows:512/process_time    91289749 ns    731794000 ns            1 rows/sec=716.442k/s
BM_HashJoinBasic_BuildParallelism/Threads:1/HashTable krows:4096/process_time   164686385 ns    164197000 ns            4 rows/sec=25.5443M/s
BM_HashJoinBasic_BuildParallelism/Threads:2/HashTable krows:4096/process_time   112767458 ns    217052333 ns            3 rows/sec=19.3239M/s
BM_HashJoinBasic_BuildParallelism/Threads:3/HashTable krows:4096/process_time   100643792 ns    245290000 ns            3 rows/sec=17.0994M/s
BM_HashJoinBasic_BuildParallelism/Threads:4/HashTable krows:4096/process_time    74837889 ns    268070667 ns            3 rows/sec=15.6463M/s
BM_HashJoinBasic_BuildParallelism/Threads:5/HashTable krows:4096/process_time    63174056 ns    269879667 ns            3 rows/sec=15.5414M/s
BM_HashJoinBasic_BuildParallelism/Threads:6/HashTable krows:4096/process_time    59140353 ns    294662000 ns            2 rows/sec=14.2343M/s
BM_HashJoinBasic_BuildParallelism/Threads:7/HashTable krows:4096/process_time    64158124 ns    354435000 ns            2 rows/sec=11.8338M/s
BM_HashJoinBasic_BuildParallelism/Threads:8/HashTable krows:4096/process_time    70799208 ns    465744500 ns            2 rows/sec=9.00559M/s
BM_HashJoinBasic_BuildParallelism/Threads:9/HashTable krows:4096/process_time   118786833 ns    730395500 ns            2 rows/sec=5.74251M/s
BM_HashJoinBasic_BuildParallelism/Threads:10/HashTable krows:4096/process_time  158779374 ns   1254764000 ns            1 rows/sec=3.3427M/s
BM_HashJoinBasic_BuildParallelism/Threads:11/HashTable krows:4096/process_time  124160834 ns    985925000 ns            1 rows/sec=4.25418M/s
BM_HashJoinBasic_BuildParallelism/Threads:12/HashTable krows:4096/process_time  261909918 ns   1956600000 ns            1 rows/sec=2.14367M/s
BM_HashJoinBasic_BuildParallelism/Threads:13/HashTable krows:4096/process_time  437582374 ns   3326539000 ns            1 rows/sec=1.26086M/s
BM_HashJoinBasic_BuildParallelism/Threads:14/HashTable krows:4096/process_time  225402042 ns   1756542000 ns            1 rows/sec=2.38782M/s
BM_HashJoinBasic_BuildParallelism/Threads:15/HashTable krows:4096/process_time  284178668 ns   2485382000 ns            1 rows/sec=1.68759M/s
BM_HashJoinBasic_BuildParallelism/Threads:16/HashTable krows:4096/process_time  198744084 ns   1697137000 ns            1 rows/sec=2.4714M/s
```

</details>

<details>
<summary>Benchmark After (Click to expand)</summary>

```
Running ./arrow-acero-hash-join-benchmark
Run on (10 X 24.1886 MHz CPU s)
CPU Caches:
  L1 Data 64 KiB
  L1 Instruction 128 KiB
  L2 Unified 4096 KiB (x10)
Load Average: 3.72, 3.38, 3.20
-----------------------------------------------------------------------------------------------------------------------------------------
Benchmark                                                                               Time             CPU   Iterations UserCounters...
-----------------------------------------------------------------------------------------------------------------------------------------
BM_HashJoinBasic_BuildParallelism/Threads:1/HashTable krows:1/process_time          64162 ns        60216 ns        11306 rows/sec=17.0054M/s
BM_HashJoinBasic_BuildParallelism/Threads:2/HashTable krows:1/process_time          73712 ns        85168 ns         8287 rows/sec=12.0233M/s
BM_HashJoinBasic_BuildParallelism/Threads:3/HashTable krows:1/process_time          81532 ns       108468 ns         6563 rows/sec=9.44057M/s
BM_HashJoinBasic_BuildParallelism/Threads:4/HashTable krows:1/process_time          90389 ns       125957 ns         5590 rows/sec=8.12979M/s
BM_HashJoinBasic_BuildParallelism/Threads:5/HashTable krows:1/process_time          98131 ns       144575 ns         3912 rows/sec=7.08281M/s
BM_HashJoinBasic_BuildParallelism/Threads:6/HashTable krows:1/process_time         112269 ns       171638 ns         3551 rows/sec=5.96605M/s
BM_HashJoinBasic_BuildParallelism/Threads:7/HashTable krows:1/process_time         127481 ns       207426 ns         3053 rows/sec=4.93669M/s
BM_HashJoinBasic_BuildParallelism/Threads:8/HashTable krows:1/process_time         135240 ns       221817 ns         3337 rows/sec=4.61641M/s
BM_HashJoinBasic_BuildParallelism/Threads:9/HashTable krows:1/process_time         167247 ns       323541 ns         2152 rows/sec=3.16497M/s
BM_HashJoinBasic_BuildParallelism/Threads:10/HashTable krows:1/process_time        173753 ns       363113 ns         1913 rows/sec=2.82006M/s
BM_HashJoinBasic_BuildParallelism/Threads:11/HashTable krows:1/process_time        182739 ns       404210 ns         1717 rows/sec=2.53334M/s
BM_HashJoinBasic_BuildParallelism/Threads:12/HashTable krows:1/process_time        194151 ns       451175 ns         1542 rows/sec=2.26963M/s
BM_HashJoinBasic_BuildParallelism/Threads:13/HashTable krows:1/process_time        205538 ns       496195 ns         1423 rows/sec=2.06371M/s
BM_HashJoinBasic_BuildParallelism/Threads:14/HashTable krows:1/process_time        217099 ns       540857 ns         1259 rows/sec=1.89329M/s
BM_HashJoinBasic_BuildParallelism/Threads:15/HashTable krows:1/process_time        228487 ns       591203 ns         1274 rows/sec=1.73206M/s
BM_HashJoinBasic_BuildParallelism/Threads:16/HashTable krows:1/process_time        240082 ns       642682 ns         1087 rows/sec=1.59332M/s
BM_HashJoinBasic_BuildParallelism/Threads:1/HashTable krows:8/process_time         218917 ns       218912 ns         3219 rows/sec=37.4214M/s
BM_HashJoinBasic_BuildParallelism/Threads:2/HashTable krows:8/process_time         239310 ns       338138 ns         2066 rows/sec=24.2268M/s
BM_HashJoinBasic_BuildParallelism/Threads:3/HashTable krows:8/process_time         284833 ns       411252 ns         1570 rows/sec=19.9197M/s
BM_HashJoinBasic_BuildParallelism/Threads:4/HashTable krows:8/process_time         315525 ns       496170 ns         1437 rows/sec=16.5105M/s
BM_HashJoinBasic_BuildParallelism/Threads:5/HashTable krows:8/process_time         329116 ns       557150 ns         1246 rows/sec=14.7034M/s
BM_HashJoinBasic_BuildParallelism/Threads:6/HashTable krows:8/process_time         339415 ns       612913 ns         1123 rows/sec=13.3657M/s
BM_HashJoinBasic_BuildParallelism/Threads:7/HashTable krows:8/process_time         354355 ns       673437 ns         1040 rows/sec=12.1645M/s
BM_HashJoinBasic_BuildParallelism/Threads:8/HashTable krows:8/process_time         371602 ns       736217 ns          948 rows/sec=11.1271M/s
BM_HashJoinBasic_BuildParallelism/Threads:9/HashTable krows:8/process_time         388963 ns       788646 ns          870 rows/sec=10.3874M/s
BM_HashJoinBasic_BuildParallelism/Threads:10/HashTable krows:8/process_time        398060 ns       838691 ns          850 rows/sec=9.76761M/s
BM_HashJoinBasic_BuildParallelism/Threads:11/HashTable krows:8/process_time        403233 ns       875477 ns          789 rows/sec=9.35719M/s
BM_HashJoinBasic_BuildParallelism/Threads:12/HashTable krows:8/process_time        410908 ns       917480 ns          748 rows/sec=8.92881M/s
BM_HashJoinBasic_BuildParallelism/Threads:13/HashTable krows:8/process_time        425442 ns       971118 ns          702 rows/sec=8.43564M/s
BM_HashJoinBasic_BuildParallelism/Threads:14/HashTable krows:8/process_time        427492 ns      1002726 ns          718 rows/sec=8.16973M/s
BM_HashJoinBasic_BuildParallelism/Threads:15/HashTable krows:8/process_time        442728 ns      1057910 ns          653 rows/sec=7.74357M/s
BM_HashJoinBasic_BuildParallelism/Threads:16/HashTable krows:8/process_time        455481 ns      1115695 ns          642 rows/sec=7.34251M/s
BM_HashJoinBasic_BuildParallelism/Threads:1/HashTable krows:64/process_time       1731379 ns      1731375 ns          403 rows/sec=37.852M/s
BM_HashJoinBasic_BuildParallelism/Threads:2/HashTable krows:64/process_time       1179658 ns      2152165 ns          328 rows/sec=30.4512M/s
BM_HashJoinBasic_BuildParallelism/Threads:3/HashTable krows:64/process_time       1116942 ns      2232095 ns          316 rows/sec=29.3608M/s
BM_HashJoinBasic_BuildParallelism/Threads:4/HashTable krows:64/process_time        814811 ns      2498054 ns          276 rows/sec=26.2348M/s
BM_HashJoinBasic_BuildParallelism/Threads:5/HashTable krows:64/process_time        900296 ns      2959111 ns          235 rows/sec=22.1472M/s
BM_HashJoinBasic_BuildParallelism/Threads:6/HashTable krows:64/process_time        917596 ns      3253949 ns          215 rows/sec=20.1405M/s
BM_HashJoinBasic_BuildParallelism/Threads:7/HashTable krows:64/process_time        920826 ns      3526660 ns          197 rows/sec=18.583M/s
BM_HashJoinBasic_BuildParallelism/Threads:8/HashTable krows:64/process_time        811062 ns      3789065 ns          184 rows/sec=17.2961M/s
BM_HashJoinBasic_BuildParallelism/Threads:9/HashTable krows:64/process_time       1031480 ns      5637721 ns          122 rows/sec=11.6246M/s
BM_HashJoinBasic_BuildParallelism/Threads:10/HashTable krows:64/process_time      1072212 ns      6040280 ns          118 rows/sec=10.8498M/s
BM_HashJoinBasic_BuildParallelism/Threads:11/HashTable krows:64/process_time      1088001 ns      6204862 ns          116 rows/sec=10.562M/s
BM_HashJoinBasic_BuildParallelism/Threads:12/HashTable krows:64/process_time      1119427 ns      6356310 ns          113 rows/sec=10.3104M/s
BM_HashJoinBasic_BuildParallelism/Threads:13/HashTable krows:64/process_time      1128651 ns      6542557 ns          115 rows/sec=10.0169M/s
BM_HashJoinBasic_BuildParallelism/Threads:14/HashTable krows:64/process_time      1152430 ns      6731112 ns          107 rows/sec=9.73628M/s
BM_HashJoinBasic_BuildParallelism/Threads:15/HashTable krows:64/process_time      1161581 ns      6772318 ns          107 rows/sec=9.67704M/s
BM_HashJoinBasic_BuildParallelism/Threads:16/HashTable krows:64/process_time      1171040 ns      6748462 ns          106 rows/sec=9.71125M/s
BM_HashJoinBasic_BuildParallelism/Threads:1/HashTable krows:512/process_time     16584785 ns     16419156 ns           45 rows/sec=31.9315M/s
BM_HashJoinBasic_BuildParallelism/Threads:2/HashTable krows:512/process_time      9782162 ns     18750500 ns           36 rows/sec=27.9613M/s
BM_HashJoinBasic_BuildParallelism/Threads:3/HashTable krows:512/process_time      9204909 ns     18933861 ns           36 rows/sec=27.6905M/s
BM_HashJoinBasic_BuildParallelism/Threads:4/HashTable krows:512/process_time      5665851 ns     20187600 ns           35 rows/sec=25.9708M/s
BM_HashJoinBasic_BuildParallelism/Threads:5/HashTable krows:512/process_time      6824165 ns     24445690 ns           29 rows/sec=21.4471M/s
BM_HashJoinBasic_BuildParallelism/Threads:6/HashTable krows:512/process_time      6476403 ns     25448704 ns           27 rows/sec=20.6018M/s
BM_HashJoinBasic_BuildParallelism/Threads:7/HashTable krows:512/process_time      6380011 ns     26670808 ns           26 rows/sec=19.6577M/s
BM_HashJoinBasic_BuildParallelism/Threads:8/HashTable krows:512/process_time      4994868 ns     29002792 ns           24 rows/sec=18.0772M/s
BM_HashJoinBasic_BuildParallelism/Threads:9/HashTable krows:512/process_time      6097037 ns     37510263 ns           19 rows/sec=13.9772M/s
BM_HashJoinBasic_BuildParallelism/Threads:10/HashTable krows:512/process_time     6024000 ns     40356889 ns           18 rows/sec=12.9913M/s
BM_HashJoinBasic_BuildParallelism/Threads:11/HashTable krows:512/process_time     6167103 ns     41287529 ns           17 rows/sec=12.6985M/s
BM_HashJoinBasic_BuildParallelism/Threads:12/HashTable krows:512/process_time     6087725 ns     40475722 ns           18 rows/sec=12.9531M/s
BM_HashJoinBasic_BuildParallelism/Threads:13/HashTable krows:512/process_time     6163463 ns     41720647 ns           17 rows/sec=12.5666M/s
BM_HashJoinBasic_BuildParallelism/Threads:14/HashTable krows:512/process_time     6056402 ns     40388529 ns           17 rows/sec=12.9811M/s
BM_HashJoinBasic_BuildParallelism/Threads:15/HashTable krows:512/process_time     5972958 ns     40973824 ns           17 rows/sec=12.7957M/s
BM_HashJoinBasic_BuildParallelism/Threads:16/HashTable krows:512/process_time     6593174 ns     40719647 ns           17 rows/sec=12.8756M/s
BM_HashJoinBasic_BuildParallelism/Threads:1/HashTable krows:4096/process_time   174475083 ns    174058000 ns            3 rows/sec=24.0972M/s
BM_HashJoinBasic_BuildParallelism/Threads:2/HashTable krows:4096/process_time   109935347 ns    200222667 ns            3 rows/sec=20.9482M/s
BM_HashJoinBasic_BuildParallelism/Threads:3/HashTable krows:4096/process_time    89852042 ns    187011000 ns            3 rows/sec=22.4281M/s
BM_HashJoinBasic_BuildParallelism/Threads:4/HashTable krows:4096/process_time    57974139 ns    202076667 ns            3 rows/sec=20.756M/s
BM_HashJoinBasic_BuildParallelism/Threads:5/HashTable krows:4096/process_time    57160194 ns    210744667 ns            3 rows/sec=19.9023M/s
BM_HashJoinBasic_BuildParallelism/Threads:6/HashTable krows:4096/process_time    56770167 ns    221233000 ns            3 rows/sec=18.9588M/s
BM_HashJoinBasic_BuildParallelism/Threads:7/HashTable krows:4096/process_time    59031097 ns    241927000 ns            3 rows/sec=17.3371M/s
BM_HashJoinBasic_BuildParallelism/Threads:8/HashTable krows:4096/process_time    46069291 ns    263787667 ns            3 rows/sec=15.9003M/s
BM_HashJoinBasic_BuildParallelism/Threads:9/HashTable krows:4096/process_time    51498374 ns    310020500 ns            2 rows/sec=13.5291M/s
BM_HashJoinBasic_BuildParallelism/Threads:10/HashTable krows:4096/process_time   52055417 ns    319261500 ns            2 rows/sec=13.1375M/s
BM_HashJoinBasic_BuildParallelism/Threads:11/HashTable krows:4096/process_time   49418250 ns    331526500 ns            2 rows/sec=12.6515M/s
BM_HashJoinBasic_BuildParallelism/Threads:12/HashTable krows:4096/process_time   53305833 ns    332126000 ns            2 rows/sec=12.6287M/s
BM_HashJoinBasic_BuildParallelism/Threads:13/HashTable krows:4096/process_time   48910062 ns    325631500 ns            2 rows/sec=12.8805M/s
BM_HashJoinBasic_BuildParallelism/Threads:14/HashTable krows:4096/process_time   52218458 ns    312798500 ns            2 rows/sec=13.409M/s
BM_HashJoinBasic_BuildParallelism/Threads:15/HashTable krows:4096/process_time   51131709 ns    344045500 ns            2 rows/sec=12.1911M/s
BM_HashJoinBasic_BuildParallelism/Threads:16/HashTable krows:4096/process_time   55233376 ns    338843500 ns            2 rows/sec=12.3783M/s
```

</details>

#### Overhead

This change does introduce some overhead. First, the old implementation uses the partition info right away after partitioning the batch, whereas the new implementation preserves the partition info and uses it in the next stage (potentially on another thread), which may be less cache friendly. Second, preserving the partition info requires more memory: the extra allocation may hurt performance a bit and increases the memory footprint by 6 bytes per row (4 bytes for the hash and 2 bytes for the row id in the partition).

But as mentioned above, almost all multi-threaded cases win. Even better, the increased memory footprint lasts only a short period and doesn't really increase peak memory: the peak always comes in the merge stage, and by that time the preserved partition info for all batches has already been released. I verified this by printing the memory pool stats while benchmarking locally.
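
As a rough illustration of the 6 bytes per row figure, the arithmetic can be sketched in Python. The function name is hypothetical, and reading `krows:4096` as 4096 * 1024 build-side rows is an assumption, not something taken from the benchmark source.

```python
HASH_BYTES = 4    # preserved hash per row, as described above
ROW_ID_BYTES = 2  # preserved row id within its partition, as described above


def partition_info_overhead_bytes(num_rows: int) -> int:
    """Extra bytes held between the partition stage and the stage that consumes them."""
    return num_rows * (HASH_BYTES + ROW_ID_BYTES)


# Assumption: "krows:4096" means 4096 * 1024 rows on the build side.
rows = 4096 * 1024
overhead = partition_info_overhead_bytes(rows)
print(f"{overhead / 2**20:.0f} MiB of temporary partition info")
```

Under that assumption the largest benchmark case holds about 24 MiB of partition info, and only until the next stage consumes it.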

### Are these changes tested?

Yes. Existing tests suffice.

### Are there any user-facing changes?

None.

* GitHub Issue: apache#45611 

Lead-authored-by: Rossi Sun <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Rossi Sun <[email protected]>
…conversion (apache#36701)

### Rationale for this change
Add support for LargeBinaryType, BinaryViewType, LargeStringType, and StringViewType in NumPy to Arrow conversion.

### What changes are included in this PR?
Adds new `Visit` methods in `NumPyConverter` for `LargeStringType`, `StringViewType`, `LargeBinaryType`, and `BinaryViewType`. These are built on top of new templated `VisitString` and `VisitBinary` methods.

Also adds the Arrow -> NumPy type map for the large binary types.

### Are these changes tested?
A new test shows that a string array and a binary array over 2 GiB are valid and remain a single chunk. The new Arrow to NumPy mappings were also added to a schema test.

### Are there any user-facing changes?
Adds support for converting NumPy string/binary lists to large binary types.

* Closes: apache#35289
* GitHub Issue: apache#35289

Authored-by: Adam Binford <[email protected]>
Signed-off-by: AlenkaF <[email protected]>
…#45658)

### Rationale for this change

This document includes this sample code: `orc.write_table(table, where, compression='gzip')` which doesn't actually work: `ValueError: Unknown CompressionKind: GZIP`

### What changes are included in this PR?

Replace `gzip` references with `zlib`. 

### Are these changes tested?

Only updated documentation.

### Are there any user-facing changes?

Yes, this doc is posted here: https://arrow.apache.org/docs/python/orc.html

Authored-by: Gibby Free <[email protected]>
Signed-off-by: Gang Wu <[email protected]>
…r" functions (apache#45562)

### Rationale for this change

Add "pivot wider" functionality such as [in Pandas](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.pivot.html), through two dedicated functions:
1. a "pivot_wider" scalar aggregate function returning a Struct scalar
2. a "hash_pivot_wider" grouped aggregate function returning a Struct array

Both functions take two arguments (the column of pivot keys and the column of pivot values) and require passing a `PivotWiderOptions` structure with the expected pivot keys, so as to determine the output Struct type.
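
The intended semantics can be sketched in plain Python. This is only an illustration, not the Arrow implementation: the function names are hypothetical, a dict stands in for the Struct, and the handling of unexpected or duplicate keys (raising an error) is an assumption of this sketch.

```python
from collections import defaultdict


def pivot_wider_sketch(keys, values, key_names):
    """Collapse (key, value) rows into one 'struct' with a field per expected key."""
    row = dict.fromkeys(key_names)  # keys that never appear stay None (null)
    seen = set()
    for key, value in zip(keys, values):
        if key not in row:
            # Assumption: unexpected keys are an error in this sketch.
            raise KeyError(f"unexpected pivot key: {key!r}")
        if key in seen:
            # Assumption: a key may appear at most once per output struct.
            raise ValueError(f"duplicate pivot key: {key!r}")
        seen.add(key)
        row[key] = value
    return row


def hash_pivot_wider_sketch(group_ids, keys, values, key_names):
    """Grouped variant: one 'struct' per distinct group id."""
    buckets = defaultdict(lambda: ([], []))
    for group, key, value in zip(group_ids, keys, values):
        buckets[group][0].append(key)
        buckets[group][1].append(value)
    return {g: pivot_wider_sketch(ks, vs, key_names) for g, (ks, vs) in buckets.items()}


print(pivot_wider_sketch(["height", "width"], [10, 20], ["width", "height", "depth"]))
```

Because the expected keys are supplied up front, the shape of the output is known before any data is seen, which is why the options structure is required.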

### Are these changes tested?

Yes, by dedicated unit tests.

### Are there any user-facing changes?

No, just new APIs.
* GitHub Issue: apache#45269

Authored-by: Antoine Pitrou <[email protected]>
Signed-off-by: Antoine Pitrou <[email protected]>
…ncurrentQueue API (apache#45421)

### Rationale for this change
`BackpressureConcurrentQueue::Pop()` does not check whether the queue is empty, which can lead to undefined behavior (UB).
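
The check-before-pop pattern can be sketched in Python. This is not the Arrow class: `CheckedQueue` and `try_pop` are hypothetical names, and returning `None` for an empty queue is an assumption of the sketch rather than the API chosen in this PR.

```python
import threading
from collections import deque


class CheckedQueue:
    """Minimal thread-safe FIFO whose pop never touches an empty container."""

    def __init__(self):
        self._items = deque()
        self._lock = threading.Lock()

    def push(self, item):
        with self._lock:
            self._items.append(item)

    def try_pop(self):
        """Return the front item, or None when the queue is empty."""
        with self._lock:
            if not self._items:  # the emptiness check happens under the same lock
                return None
            return self._items.popleft()
```

Doing the check and the pop under one lock matters: checking first and popping later would let another consumer drain the queue in between.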

### Are there any user-facing changes?
No
* GitHub Issue: apache#45652

Lead-authored-by: Rafał Hibner <[email protected]>
Co-authored-by: gitmodimo <[email protected]>
Co-authored-by: Rossi Sun <[email protected]>
Signed-off-by: Rossi Sun <[email protected]>
…pache#45671)

### Rationale for this change

I want to find binary verification jobs in a release PR easily for apacheGH-45548.
Example jobs: apache#45502 (comment)

If we use a specific prefix for the jobs, we can find them easily.

### What changes are included in this PR?

Make `prefix` customizable by an option.

### Are these changes tested?

No.

### Are there any user-facing changes?

No.
* GitHub Issue: apache#45670

Authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
…age v2 is not compressed (apache#45367)

### Rationale for this change

Currently, if data page v2 is enabled, `is_compressed` is always set when the page has compression configured. However, it can be left unset if:

1. The page is empty.
2. Compression makes the page even larger.
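
The decision described above can be sketched in Python. This is an illustration only, not the code in `column_writer.cc`: the function name is hypothetical, `zlib` stands in for whatever codec is configured, and treating an equal-sized result as "not worth compressing" is an assumption of the sketch.

```python
import zlib


def encode_page_body(data: bytes):
    """Return (payload, is_compressed) for a data page v2 body."""
    if not data:
        # Case 1: an empty page has nothing to compress.
        return data, False
    compressed = zlib.compress(data)
    if len(compressed) >= len(data):
        # Case 2: compression did not shrink the page, so keep the raw bytes.
        return data, False
    return compressed, True


payload, is_compressed = encode_page_body(b"a" * 1000)
print(len(payload) < 1000, is_compressed)
```

A reader can then skip decompression entirely whenever the flag is unset.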

### What changes are included in this PR?

Enhance the `is_compressed` setting in `column_writer.cc`.

### Are these changes tested?

Yes

### Are there any user-facing changes?

No

* GitHub Issue: apache#45366

Authored-by: mwish <[email protected]>
Signed-off-by: mwish <[email protected]>
### Rationale for this change

Add kapa AI bot to the docs to make it easier to find answers to questions

### What changes are included in this PR?

Config for enabling bot

### Are these changes tested?

Manually, yep

### Are there any user-facing changes?

Yep, adds widget to header in docs pages

* GitHub Issue: apache#45665

Authored-by: Nic Crane <[email protected]>
Signed-off-by: Nic Crane <[email protected]>
### Rationale for this change

Binary view support was implemented in apache/arrow-nanoarrow#596 and run end encoded support was added in apache/arrow-nanoarrow#507

### What changes are included in this PR?

Updated documentation

### Are these changes tested?

N/A

### Are there any user-facing changes?

No

Authored-by: Will Ayd <[email protected]>
Signed-off-by: Will Ayd <[email protected]>