Skip to content

Implement cpuinfo_deinitialize() to free heap-allocated globals#387

Merged
GregoryComer merged 4 commits into
pytorch:mainfrom
crvineeth97:vchelur/add-cpuinfo-deinitialize
Jun 12, 2026
Merged

Implement cpuinfo_deinitialize() to free heap-allocated globals#387
GregoryComer merged 4 commits into
pytorch:mainfrom
crvineeth97:vchelur/add-cpuinfo-deinitialize

Conversation

@crvineeth97

Copy link
Copy Markdown
Contributor

Currently cpuinfo_deinitialize() is a no-op stub. Applications that dynamically load and unload libraries using cpuinfo (e.g., ONNX Runtime as a DLL on Windows, or as a shared library via dlopen/dlclose on Linux) leave orphaned heap allocations when the library is unloaded mid-process. These are flagged as memory leaks by tools such as App Verifier and Valgrind.

This PR implements cleanup for all heap-allocated globals: processors, cores, clusters, packages, caches, and uarch info. Platform-specific details:

  • Windows: Uses HeapFree to match HeapAlloc allocations. Cleanup is guarded by InitOnceExecuteOnce for thread safety.
  • ARM64 Windows: Caches are freed as a single contiguous block, matching the allocation pattern in init-by-logical-sys-info.c.
  • Linux (x86/x86_64/RISC-V), macOS, FreeBSD: Uses free() to match calloc allocations. Cleanup is guarded by pthread_once for thread safety.
  • ARM Linux, Emscripten: cpuinfo_deinitialize() is a no-op on these platforms because they use static storage for some globals (packages, L3 cache) that cannot be safely freed.

Cleanup is idempotent, meaning that calling cpuinfo_deinitialize() multiple times is safe.

Also fixes a minor existing leak of processor_infos in the x86 Windows init cleanup path.

Related issues

@meta-cla

meta-cla Bot commented May 5, 2026

Copy link
Copy Markdown

Hi @crvineeth97!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

crvineeth97 pushed a commit to crvineeth97/onnxruntime that referenced this pull request May 5, 2026
Bump pytorch/cpuinfo to crvineeth97/cpuinfo@df8c6a8 which implements
cpuinfo_deinitialize() to properly free heap-allocated globals. This
prevents memory leak reports from App Verifier, Valgrind, and sanitizers
when ORT is dynamically loaded/unloaded.

ORT integration:
- Add CPUIDInfo::ShutDown() which calls cpuinfo_deinitialize()
- Call ShutdownCpuInfo() from DllMain on DLL_PROCESS_DETACH
- In memleak-check builds, also call shutdown during process termination

The cpuinfo bump also includes upstream fixes that make three ORT patches
redundant (removed):
- patch_vcpkg_arm64ec_support.patch (pytorch/cpuinfo#324)
- win_arm_fp16_detection_fallback.patch (pytorch/cpuinfo#348)
- 0001-Add-implementation-for-cpuinfo_deinitialize.patch

The patch_cpuinfo_h_for_arm64ec.patch is retained as it is not yet
upstream in cpuinfo.

Related: pytorch/cpuinfo#150, pytorch/cpuinfo#387

Co-authored-by: Copilot <[email protected]>
@meta-cla meta-cla Bot added the cla signed label May 5, 2026
@meta-cla

meta-cla Bot commented May 5, 2026

Copy link
Copy Markdown

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

@crvineeth97 crvineeth97 force-pushed the vchelur/add-cpuinfo-deinitialize branch from 831e1bb to 8ff94bd Compare May 6, 2026 18:59
@crvineeth97

Copy link
Copy Markdown
Contributor Author

Hi @fbarchard @malfet @GregoryComer
Requesting your reviews on this PR. Thank you!

Comment thread src/cpuinfo/internal-api.h

@malfet malfet left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In principle it sounds great, but I think it's better to move the portions for de-init next to respective allocation routines?

Comment thread src/init.c Outdated
@crvineeth97 crvineeth97 requested a review from malfet May 8, 2026 23:27
@crvineeth97

crvineeth97 commented May 8, 2026

Copy link
Copy Markdown
Contributor Author

The init and deinit guards are reset at the end of the deinitialize function so that if a DLL loads, unloads and then loads again, it can initialize if required.

The deinit tests I added in the previous commit were incorrect since the ABI doesn't allow checking the various variables when cpuinfo_is_initialized is false. The init.cc tests sufficiently test the guard reset since each test does init and deinit

@fbarchard @malfet @GregoryComer

@crvineeth97

Copy link
Copy Markdown
Contributor Author

@malfet @fbarchard @GregoryComer May I please get a review for this?

@crvineeth97

Copy link
Copy Markdown
Contributor Author

Hi @malfet @fbarchard @GregoryComer, I have a pending pull request in the onnxruntime repo (microsoft/onnxruntime#28245) that depends on this PR. I would greatly appreciate your reviews.

@GregoryComer GregoryComer left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This overall looks reasonable. Given that the changes are mostly gated to the new deinit functions, it should be relatively safe. One request - can you add some tests that cover de-init and then re-init? It would be good to have regression tests to make sure that things function correctly across multiple re-init cycles.

@crvineeth97

Copy link
Copy Markdown
Contributor Author

This overall looks reasonable. Given that the changes are mostly gated to the new deinit functions, it should be relatively safe. One request - can you add some tests that cover de-init and then re-init? It would be good to have regression tests to make sure that things function correctly across multiple re-init cycles.

Thank you! The existing init.cc tests already implicitly exercise deinit → re-init cycles and that's the reason why I had to add the reinit in the first place. Each of the ~30 test cases calls cpuinfo_initialize() at the start and cpuinfo_deinitialize() at the end, and since GTest runs them sequentially in a single process, every test after the first is effectively a re-init cycle. Is that sufficient?

@GregoryComer

Copy link
Copy Markdown
Member

Thank you! The existing init.cc tests already implicitly exercise deinit → re-init cycles and that's the reason why I had to add the reinit in the first place. Each of the ~30 test cases calls cpuinfo_initialize() at the start and cpuinfo_deinitialize() at the end, and since GTest runs them sequentially in a single process, every test after the first is effectively a re-init cycle. Is that sufficient?

Ah, good point. That should be sufficient, I think. Since I haven't seen any objections from the other reviews and CI is green, I'll go ahead and merge.

@GregoryComer GregoryComer merged commit 4628dc0 into pytorch:main Jun 12, 2026
12 checks passed
@crvineeth97

Copy link
Copy Markdown
Contributor Author

Thank you so much! I really appreciate it

crvineeth97 pushed a commit to crvineeth97/onnxruntime that referenced this pull request Jun 12, 2026
Bump pytorch/cpuinfo to crvineeth97/cpuinfo@df8c6a8 which implements
cpuinfo_deinitialize() to properly free heap-allocated globals. This
prevents memory leak reports from App Verifier, Valgrind, and sanitizers
when ORT is dynamically loaded/unloaded.

ORT integration:
- Add CPUIDInfo::ShutDown() which calls cpuinfo_deinitialize()
- Call ShutdownCpuInfo() from DllMain on DLL_PROCESS_DETACH
- In memleak-check builds, also call shutdown during process termination

The cpuinfo bump also includes upstream fixes that make three ORT patches
redundant (removed):
- patch_vcpkg_arm64ec_support.patch (pytorch/cpuinfo#324)
- win_arm_fp16_detection_fallback.patch (pytorch/cpuinfo#348)
- 0001-Add-implementation-for-cpuinfo_deinitialize.patch

The patch_cpuinfo_h_for_arm64ec.patch is retained as it is not yet
upstream in cpuinfo.

Related: pytorch/cpuinfo#150, pytorch/cpuinfo#387

Co-authored-by: Copilot <[email protected]>
crvineeth97 added a commit to crvineeth97/onnxruntime that referenced this pull request Jun 12, 2026
Bump pytorch/cpuinfo to crvineeth97/cpuinfo@df8c6a8 which implements
cpuinfo_deinitialize() to properly free heap-allocated globals. This
prevents memory leak reports from App Verifier, Valgrind, and sanitizers
when ORT is dynamically loaded/unloaded.

ORT integration:
- Add CPUIDInfo::ShutDown() which calls cpuinfo_deinitialize()
- Call ShutdownCpuInfo() from DllMain on DLL_PROCESS_DETACH
- In memleak-check builds, also call shutdown during process termination

The cpuinfo bump also includes upstream fixes that make three ORT patches
redundant (removed):
- patch_vcpkg_arm64ec_support.patch (pytorch/cpuinfo#324)
- win_arm_fp16_detection_fallback.patch (pytorch/cpuinfo#348)
- 0001-Add-implementation-for-cpuinfo_deinitialize.patch

The patch_cpuinfo_h_for_arm64ec.patch is retained as it is not yet
upstream in cpuinfo.

Related: pytorch/cpuinfo#150, pytorch/cpuinfo#387

Co-authored-by: Copilot <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants