
[BUG] Issue with cupy and numpy #326

Closed
zohimchandani opened this issue Dec 2, 2024 · 36 comments
Labels
bug Something isn't working

Comments

@zohimchandani

Describe the bug

TypeError: Argument 'b' has incorrect type (expected cupy._core.core._ndarray_base, got numpy.ndarray)

To Reproduce

  1. Pull a CUDA-Q image: docker pull nvcr.io/nvidia/nightly/cuda-quantum:cu12-latest

  2. Turn the image into a container: docker run -it --net=host --user root --gpus all -d --name cudaq_zohim_test 05346a75eaf7

  3. The machine I am running on has CUDA Version 12.4 installed

  4. Installing cuda-toolkit 12.4 based on CUDA version: sudo -S apt-get install -y cuda-toolkit-12.4

  5. Clone a repo where I have a job to run: git clone https://github.com/davidev886/tutorial_vqe

  6. Install some pip packages including ipie specified in a file in the repo: pip install -r requirements.txt

  7. Run unset CUDA_HOME and unset CUDA_PATH to enable the job to look in the right location for the CUDA libraries

  8. Execute the AFQMC workflow - this file does not run VQE but uses a previously saved statevector from a VQE run
    python3 complete_workflow-cudaq.py

  9. Error message:

...
# iteration   563: delta_max = 1.00020910e-05: time = 1.30560398e-02
# iteration   564: delta_max = 9.99470883e-06: time = 1.26273632e-02
 # Orthogonalising Cholesky vectors.
 # Time to orthogonalise: 0.166557
# Preparing MSD wf
# MSD prepared with 100 determinants
TypeError: Argument 'b' has incorrect type (expected cupy._core.core._ndarray_base, got numpy.ndarray)

Other information:

NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.5 LTS (Jammy Jellyfish)"

@zohimchandani zohimchandani added the bug Something isn't working label Dec 2, 2024
@zohimchandani (Author)

Versions of packages that might be helpful


cupy-cuda12x              13.3.0
numba                     0.60.0
numpy                     1.24.4

@jiangtong1000 (Collaborator)

Can you identify which line causes this issue? It looks like all you need is xp.array(b) to fix it.
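
The mismatch can be reproduced, and worked around, with a small backend-normalizing helper. This is a hypothetical sketch, not ipie code: `to_backend` is an illustrative name, and the snippet falls back to numpy when cupy (or a GPU) is unavailable.

```python
import numpy as np

try:
    import cupy as xp
    xp.ones(1)          # fails here if cupy is installed but no GPU is usable
except Exception:       # no cupy / no GPU: fall back to numpy on CPU
    xp = np

def to_backend(a):
    """Move any array-like onto the active backend (cupy on GPU, else numpy)."""
    return xp.asarray(a)

a = xp.ones((2, 3))                 # backend array, like walker_batch.phia
b = np.ones((3, 2))                 # plain numpy array, like trial.psi0a
ovlp = xp.dot(a, to_backend(b))     # without to_backend, cupy raises the TypeError above
assert ovlp.shape == (2, 2)
```

Under cupy, passing `b` directly to `xp.dot` reproduces the reported `TypeError`; normalizing both operands to one backend is what the suggested `xp.array(b)` fix does.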

@zohimchandani (Author)


# iteration   564: delta_max = 9.99470883e-06: time = 7.14135170e-03
 # Orthogonalising Cholesky vectors.
 # Time to orthogonalise: 0.172330
# Preparing MSD wf
# MSD prepared with 100 determinants
Traceback (most recent call last):
  File "/home/tutorial_vqe/complete_workflow-cudaq.py", line 114, in <module>
    afqmc_msd = AFQMC.build(
  File "/usr/local/lib/python3.10/dist-packages/ipie/qmc/afqmc.py", line 379, in build
    walkers.build(
  File "/usr/local/lib/python3.10/dist-packages/ipie/walkers/uhf_walkers.py", line 243, in build
    self.ovlp = trial.calc_greens_function(self)
  File "/usr/local/lib/python3.10/dist-packages/ipie/trial_wavefunction/particle_hole.py", line 534, in calc_greens_function
    return greens_function_multi_det_wicks_opt(walkers, self)
  File "/usr/local/lib/python3.10/dist-packages/ipie/estimators/greens_function_multi_det.py", line 1182, in greens_function_multi_det_wicks_opt
    ovlp = numpy.dot(walker_batch.phia[iw].T, trial.psi0a.conj())
  File "<__array_function__ internals>", line 200, in dot
  File "cupy/_core/core.pyx", line 1719, in cupy._core.core._ndarray_base.__array_function__
  File "/usr/local/lib/python3.10/dist-packages/cupy/linalg/_product.py", line 63, in dot
    return a.dot(b, out)
TypeError: Argument 'b' has incorrect type (expected cupy._core.core._ndarray_base, got numpy.ndarray)

@jiangtong1000 (Collaborator)

Adding identity = xp.array(identity) after identity = np.eye(self.nbasis, dtype=np.float64) should fix it:

identity = np.eye(self.nbasis, dtype=np.float64)
self.psi0a = identity[:, self.occa[0]].copy()
self.psi0b = identity[:, self.occb[0]].copy()

@zohimchandani (Author)

Added the below in /usr/local/lib/python3.10/dist-packages/ipie/trial_wavefunction/particle_hole.py

        identity = np.eye(self.nbasis, dtype=np.float64)
        identity = xp.array(identity)
        self.psi0a = identity[:, self.occa[0]].copy()
        self.psi0b = identity[:, self.occb[0]].copy()

Error:

# iteration   564: delta_max = 9.99470883e-06: time = 1.24752522e-02
 # Orthogonalising Cholesky vectors.
 # Time to orthogonalise: 0.176035
# Preparing MSD wf
# MSD prepared with 100 determinants
Traceback (most recent call last):
  File "/home/tutorial_vqe/complete_workflow-cudaq.py", line 111, in <module>
    afqmc_hamiltonian, trial_wavefunction = get_afqmc_data(pyscf_data, final_state_vector)
  File "/home/tutorial_vqe/src/utils_ipie.py", line 177, in get_afqmc_data
    trial_wavefunction.half_rotate(afqmc_hamiltonian)
  File "/usr/local/lib/python3.10/dist-packages/ipie/trial_wavefunction/particle_hole.py", line 370, in half_rotate
    rot_1body, rot_chol = half_rotate_generic(
  File "/usr/local/lib/python3.10/dist-packages/ipie/trial_wavefunction/half_rotate.py", line 55, in half_rotate_generic
    rH1a[:] = np.einsum("Jpi,pq->Jiq", orbsa.conj(), hamiltonian.H1[0], optimize=True)
  File "cupy/_core/core.pyx", line 1481, in cupy._core.core._ndarray_base.__array__
TypeError: Implicit conversion to a NumPy array is not allowed. Please use `.get()` to construct a NumPy array explicitly.
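
For context, the failing contraction itself is fine when all inputs share one backend. Run with plain numpy (the shapes below are illustrative, not taken from this run), the same einsum succeeds, which points at the mixed cupy/numpy inputs rather than the contraction:

```python
import numpy as np

# Illustrative dimensions: J determinants, p/q basis functions, i occupied orbitals.
J, p, i, q = 4, 6, 2, 6
rng = np.random.default_rng(0)
orbsa = rng.random((J, p, i)) + 1j * rng.random((J, p, i))
H1 = rng.random((p, q))

# The half-rotation contraction from half_rotate.py, on a single backend.
rH1a = np.einsum("Jpi,pq->Jiq", orbsa.conj(), H1, optimize=True)
assert rH1a.shape == (J, i, q)
```

Under cupy, `orbsa` on the GPU and `H1` on the host trigger the implicit-conversion error above; cupy refuses silent device-to-host copies and asks for an explicit `.get()`.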

@jiangtong1000 (Collaborator)

I see; this will cause another issue.

self.phia = xp.array(
    [initial_walker[:, : self.nup].copy() for iw in range(self.nwalkers)],
    dtype=xp.complex128,
)
self.phib = xp.array(
    [initial_walker[:, self.nup :].copy() for iw in range(self.nwalkers)],
    dtype=xp.complex128,
)

Can you try changing xp here to numpy?

@zohimchandani (Author)

        self.phia = numpy.array(
            [initial_walker[:, : self.nup].copy() for iw in range(self.nwalkers)],
            dtype=xp.complex128,
        )
        self.phib = numpy.array(
            [initial_walker[:, self.nup :].copy() for iw in range(self.nwalkers)],
            dtype=xp.complex128,
        )
# Preparing MSD wf
# MSD prepared with 100 determinants
Traceback (most recent call last):
  File "/home/tutorial_vqe/complete_workflow-cudaq.py", line 114, in <module>
    afqmc_msd = AFQMC.build(
  File "/usr/local/lib/python3.10/dist-packages/ipie/qmc/afqmc.py", line 379, in build
    walkers.build(
  File "/usr/local/lib/python3.10/dist-packages/ipie/walkers/uhf_walkers.py", line 243, in build
    self.ovlp = trial.calc_greens_function(self)
  File "/usr/local/lib/python3.10/dist-packages/ipie/trial_wavefunction/particle_hole.py", line 534, in calc_greens_function
    return greens_function_multi_det_wicks_opt(walkers, self)
  File "/usr/local/lib/python3.10/dist-packages/ipie/estimators/greens_function_multi_det.py", line 1182, in greens_function_multi_det_wicks_opt
    ovlp = numpy.dot(walker_batch.phia[iw].T, trial.psi0a.conj())
  File "<__array_function__ internals>", line 200, in dot
  File "cupy/_core/core.pyx", line 1719, in cupy._core.core._ndarray_base.__array_function__
  File "/usr/local/lib/python3.10/dist-packages/cupy/linalg/_product.py", line 67, in dot
    return a.dot(b, out)
TypeError: Argument 'b' has incorrect type (expected cupy._core.core._ndarray_base, got numpy.ndarray)

@jiangtong1000 (Collaborator)

jiangtong1000 commented Dec 2, 2024 via email

@zohimchandani (Author)

        self.phia = numpy.array(
            [initial_walker[:, : self.nup].copy() for iw in range(self.nwalkers)],
            dtype=numpy.complex128,
        )
        self.phib = numpy.array(
            [initial_walker[:, self.nup :].copy() for iw in range(self.nwalkers)],
            dtype=numpy.complex128,
        )
# MSD prepared with 100 determinants
Traceback (most recent call last):
  File "/home/tutorial_vqe/complete_workflow-cudaq.py", line 114, in <module>
    afqmc_msd = AFQMC.build(
  File "/usr/local/lib/python3.10/dist-packages/ipie/qmc/afqmc.py", line 379, in build
    walkers.build(
  File "/usr/local/lib/python3.10/dist-packages/ipie/walkers/uhf_walkers.py", line 243, in build
    self.ovlp = trial.calc_greens_function(self)
  File "/usr/local/lib/python3.10/dist-packages/ipie/trial_wavefunction/particle_hole.py", line 534, in calc_greens_function
    return greens_function_multi_det_wicks_opt(walkers, self)
  File "/usr/local/lib/python3.10/dist-packages/ipie/estimators/greens_function_multi_det.py", line 1182, in greens_function_multi_det_wicks_opt
    ovlp = numpy.dot(walker_batch.phia[iw].T, trial.psi0a.conj())
  File "<__array_function__ internals>", line 200, in dot
  File "cupy/_core/core.pyx", line 1719, in cupy._core.core._ndarray_base.__array_function__
  File "/usr/local/lib/python3.10/dist-packages/cupy/linalg/_product.py", line 67, in dot
    return a.dot(b, out)
TypeError: Argument 'b' has incorrect type (expected cupy._core.core._ndarray_base, got numpy.ndarray)

@jiangtong1000 (Collaborator)

@zohimchandani (Author)

Added config.update_option("use_gpu", True) after build and just before run:

# MSD prepared with 100 determinants
# random seed is 1
Traceback (most recent call last):
  File "/home/tutorial_vqe/complete_workflow-cudaq.py", line 131, in <module>
    afqmc_msd.run(estimator_filename='afqmc_data_' +system+ '.h5')
  File "/usr/local/lib/python3.10/dist-packages/ipie/qmc/afqmc.py", line 534, in run
    self.walkers.orthogonalise()
  File "/usr/local/lib/python3.10/dist-packages/ipie/walkers/base_walkers.py", line 174, in orthogonalise
    detR = self.reortho()
  File "/usr/local/lib/python3.10/dist-packages/ipie/walkers/uhf_walkers.py", line 111, in reortho
    return self.reortho_batched()
  File "/usr/local/lib/python3.10/dist-packages/ipie/walkers/uhf_walkers.py", line 148, in reortho_batched
    (self.phia, Rup) = qr(self.phia, mode=qr_mode)
  File "/usr/local/lib/python3.10/dist-packages/scipy/linalg/_decomp_qr.py", line 133, in qr
    raise ValueError("expected a 2-D array")
ValueError: expected a 2-D array

@zohimchandani (Author)

zohimchandani commented Dec 2, 2024

FYI, I undid each of your suggestions above before trying the next one you mentioned. Not sure if you wanted them to be implemented together.

@jiangtong1000 (Collaborator)

OK.

Can you move this block
https://github.com/JoonhoLee-Group/ipie/blob/ed752921f911bdfe98f0be776c76d2341bf6b120/ipie/walkers/uhf_walkers.py#L235C1-L241C62

after this line:
self.ovlp = trial.calc_greens_function(self)

@zohimchandani (Author)

zohimchandani commented Dec 2, 2024

        self.ovlp = trial.calc_greens_function(self)

        if config.get_option("use_gpu"):
            self.cast_to_cupy()
            self.Ga = xp.asarray(self.Ga)
            self.Gb = xp.asarray(self.Gb)
            trial._rchola = xp.asarray(trial._rchola)
            trial._rcholb = xp.asarray(trial._rcholb)
            trial._rchola_act = xp.asarray(trial._rchola_act)
# random seed is 1
Traceback (most recent call last):
  File "/home/tutorial_vqe/complete_workflow-cudaq.py", line 131, in <module>
    afqmc_msd.run(estimator_filename='afqmc_data_' +system+ '.h5')
  File "/usr/local/lib/python3.10/dist-packages/ipie/qmc/afqmc.py", line 534, in run
    self.walkers.orthogonalise()
  File "/usr/local/lib/python3.10/dist-packages/ipie/walkers/base_walkers.py", line 174, in orthogonalise
    detR = self.reortho()
  File "/usr/local/lib/python3.10/dist-packages/ipie/walkers/uhf_walkers.py", line 111, in reortho
    return self.reortho_batched()
  File "/usr/local/lib/python3.10/dist-packages/ipie/walkers/uhf_walkers.py", line 148, in reortho_batched
    (self.phia, Rup) = qr(self.phia, mode=qr_mode)
  File "/usr/local/lib/python3.10/dist-packages/scipy/linalg/_decomp_qr.py", line 133, in qr
    raise ValueError("expected a 2-D array")
ValueError: expected a 2-D array

@jiangtong1000 (Collaborator)

I think this new one is related to #321

@jiangtong1000 (Collaborator)

jiangtong1000 commented Dec 2, 2024

Also, it seems this fix #326 (comment) can also work.

@zohimchandani (Author)

        self.ovlp = trial.calc_greens_function(self)

        if config.get_option("use_gpu"):
            self.cast_to_cupy()
            self.Ga = xp.asarray(self.Ga)
            self.Gb = xp.asarray(self.Gb)
            trial._rchola = xp.asarray(trial._rchola)
            trial._rcholb = xp.asarray(trial._rcholb)
            trial._rchola_act = xp.asarray(trial._rchola_act)
afqmc_msd = AFQMC.build(
    pyscf_data["mol"].nelec,
    afqmc_hamiltonian,
    trial_wavefunction,
    num_walkers=num_walkers,
    num_steps_per_block=25,
    num_blocks=10,
    timestep=0.001,
    stabilize_freq=5,
    seed=random_seed,
    pop_control_freq=5,
    verbose=True)

config.update_option("use_gpu", True)

# Run the AFQMC simulation and save data to .h5 file
afqmc_msd.run(estimator_filename='afqmc_data_' +system+ '.h5')

afqmc_msd.finalise(verbose=False)
# iteration   564: delta_max = 9.99470883e-06: time = 1.17278099e-03
 # Orthogonalising Cholesky vectors.
 # Time to orthogonalise: 0.152253
# Preparing MSD wf
# MSD prepared with 100 determinants
# random seed is 1
Traceback (most recent call last):
  File "/home/tutorial_vqe/complete_workflow-cudaq.py", line 130, in <module>
    afqmc_msd.run(estimator_filename='afqmc_data_' +system+ '.h5')
  File "/usr/local/lib/python3.10/dist-packages/ipie/qmc/afqmc.py", line 534, in run
    self.walkers.orthogonalise()
  File "/usr/local/lib/python3.10/dist-packages/ipie/walkers/base_walkers.py", line 174, in orthogonalise
    detR = self.reortho()
  File "/usr/local/lib/python3.10/dist-packages/ipie/walkers/uhf_walkers.py", line 111, in reortho
    return self.reortho_batched()
  File "/usr/local/lib/python3.10/dist-packages/ipie/walkers/uhf_walkers.py", line 148, in reortho_batched
    (self.phia, Rup) = qr(self.phia, mode=qr_mode)
  File "/usr/local/lib/python3.10/dist-packages/scipy/linalg/_decomp_qr.py", line 133, in qr
    raise ValueError("expected a 2-D array")
ValueError: expected a 2-D array

@zohimchandani (Author)

Sorry, I don't think I understand what you want me to change now.

@jiangtong1000 (Collaborator)

Some old versions of cupy do not support batched QR decomposition.
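
Note that the traceback above actually fails inside scipy.linalg.qr, which only accepts 2-D input. When a batched 3-D QR is unavailable, a per-walker loop is a portable fallback; this is a sketch with illustrative shapes, not ipie's actual code:

```python
import numpy as np

rng = np.random.default_rng(1)
phia = rng.random((5, 8, 3))   # (nwalkers, nbasis, nup), illustrative sizes

# Decompose each walker's 2-D slab separately instead of relying on a
# backend that can QR-factor the whole 3-D stack at once.
Q = np.empty_like(phia)
R = np.empty((5, 3, 3))
for iw in range(phia.shape[0]):
    Q[iw], R[iw] = np.linalg.qr(phia[iw], mode="reduced")

# Each factorization reconstructs its walker exactly.
assert np.allclose(Q[0] @ R[0], phia[0])
```

Recent numpy and cupy both accept stacked input in `linalg.qr`, so the loop is only needed where the installed backend (or scipy's 2-D-only `qr`) cannot handle the batch.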

@zohimchandani (Author)

pip list yields:

cupy-cuda12x 13.3.0 which is already the latest version.

@jiangtong1000 (Collaborator)

jiangtong1000 commented Dec 3, 2024 via email

@zohimchandani (Author)

I can try this tomorrow. Would it be possible for you to reproduce the error on your end? Might be easier to find a fix that way. Thanks for all the suggestions today.

@jiangtong1000 (Collaborator)

Sure, I can probably try the day after tomorrow if this is still not fixed by the above change.

@zohimchandani (Author)

Current changes are:

        self.ovlp = trial.calc_greens_function(self)

        if config.get_option("use_gpu"):
            self.cast_to_cupy()
            self.Ga = xp.asarray(self.Ga)
            self.Gb = xp.asarray(self.Gb)
            trial._rchola = xp.asarray(trial._rchola)
            trial._rcholb = xp.asarray(trial._rcholb)
            trial._rchola_act = xp.asarray(trial._rchola_act)

# Initialize AFQMC
afqmc_msd = AFQMC.build(
    pyscf_data["mol"].nelec,
    afqmc_hamiltonian,
    trial_wavefunction,
    num_walkers=num_walkers,
    num_steps_per_block=25,
    num_blocks=10,
    timestep=0.001,
    stabilize_freq=5,
    seed=random_seed,
    pop_control_freq=5,
    verbose=True)

config.update_option("use_gpu", True)

# Run the AFQMC simulation and save data to .h5 file
afqmc_msd.run(estimator_filename='afqmc_data_' +system+ '.h5')
 # Orthogonalising Cholesky vectors.
 # Time to orthogonalise: 0.155547
# Preparing MSD wf
# MSD prepared with 100 determinants
# random seed is 1
Traceback (most recent call last):
  File "/home/tutorial_vqe/complete_workflow-cudaq.py", line 130, in <module>
    afqmc_msd.run(estimator_filename='afqmc_data_' +system+ '.h5')
  File "/usr/local/lib/python3.10/dist-packages/ipie/qmc/afqmc.py", line 534, in run
    self.walkers.orthogonalise()
  File "/usr/local/lib/python3.10/dist-packages/ipie/walkers/base_walkers.py", line 174, in orthogonalise
    detR = self.reortho()
  File "/usr/local/lib/python3.10/dist-packages/ipie/walkers/uhf_walkers.py", line 111, in reortho
    return self.reortho_batched()
  File "/usr/local/lib/python3.10/dist-packages/ipie/walkers/uhf_walkers.py", line 148, in reortho_batched
    (self.phia, Rup) = qr(self.phia, mode=qr_mode)
  File "/usr/local/lib/python3.10/dist-packages/scipy/linalg/_decomp_qr.py", line 133, in qr
    raise ValueError("expected a 2-D array")
ValueError: expected a 2-D array

Please specify exactly what change, if any, is required here.

Thanks

@jiangtong1000 (Collaborator)

config.update_option("use_gpu", True)

should be moved to the beginning of the script.
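
A toy illustration (not ipie's actual classes) of why the ordering matters: objects read the flag when they are constructed, so flipping it after AFQMC.build has no effect on state that was already built with the old value.

```python
class Config:
    """Minimal stand-in for ipie.config.config."""
    def __init__(self):
        self._opts = {"use_gpu": False}
    def update_option(self, key, value):
        self._opts[key] = value
    def get_option(self, key):
        return self._opts[key]

config = Config()

class Walkers:
    """Stand-in for a walker container: the backend is chosen at build time."""
    def __init__(self):
        self.on_gpu = config.get_option("use_gpu")

built_early = Walkers()                  # constructed before the flag is set
config.update_option("use_gpu", True)    # too late for built_early
built_late = Walkers()                   # constructed after: sees the flag

assert built_early.on_gpu is False
assert built_late.on_gpu is True
```

Setting the option at the very top of the script guarantees everything constructed later sees the GPU flag.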

@zohimchandani (Author)

Moving this to the top of the script

from ipie.config import config
config.update_option("use_gpu", True)

and this change:


        self.ovlp = trial.calc_greens_function(self)

        if config.get_option("use_gpu"):
            self.cast_to_cupy()
            self.Ga = xp.asarray(self.Ga)
            self.Gb = xp.asarray(self.Gb)
            trial._rchola = xp.asarray(trial._rchola)
            trial._rcholb = xp.asarray(trial._rcholb)
            trial._rchola_act = xp.asarray(trial._rchola_act)

error:

# iteration   564: delta_max = 9.99470883e-06: time = 1.25360489e-03
 # Orthogonalising Cholesky vectors.
 # Time to orthogonalise: 0.148303
# Preparing MSD wf
# MSD prepared with 100 determinants
Traceback (most recent call last):
  File "/home/tutorial_vqe/complete_workflow-cudaq.py", line 115, in <module>
    afqmc_msd = AFQMC.build(
  File "/usr/local/lib/python3.10/dist-packages/ipie/qmc/afqmc.py", line 379, in build
    walkers.build(
  File "/usr/local/lib/python3.10/dist-packages/ipie/walkers/uhf_walkers.py", line 236, in build
    self.ovlp = trial.calc_greens_function(self)
  File "/usr/local/lib/python3.10/dist-packages/ipie/trial_wavefunction/particle_hole.py", line 534, in calc_greens_function
    return greens_function_multi_det_wicks_opt(walkers, self)
  File "/usr/local/lib/python3.10/dist-packages/ipie/estimators/greens_function_multi_det.py", line 1182, in greens_function_multi_det_wicks_opt
    ovlp = numpy.dot(walker_batch.phia[iw].T, trial.psi0a.conj())
  File "<__array_function__ internals>", line 200, in dot
  File "cupy/_core/core.pyx", line 1713, in cupy._core.core._ndarray_base.__array_function__
  File "/usr/local/lib/python3.10/dist-packages/cupy/linalg/_product.py", line 63, in dot
    return a.dot(b, out)
TypeError: Argument 'b' has incorrect type (expected cupy._core.core._ndarray_base, got numpy.ndarray)

@jiangtong1000 (Collaborator)

I believe adding this on top of your most recent change will finally fix it.
If not, I will try to reproduce ASAP; sorry for all the trial and error.

self.phia = xp.array(
    [initial_walker[:, : self.nup].copy() for iw in range(self.nwalkers)],
    dtype=xp.complex128,
)
self.phib = xp.array(
    [initial_walker[:, self.nup :].copy() for iw in range(self.nwalkers)],
    dtype=xp.complex128,
)

self.phia = numpy.array(
    [initial_walker[:, : self.nup].copy() for iw in range(self.nwalkers)],
    dtype=numpy.complex128,
)
self.phib = numpy.array(
    [initial_walker[:, self.nup :].copy() for iw in range(self.nwalkers)],
    dtype=numpy.complex128,
)

@zohimchandani (Author)

        self.phia = numpy.array(
            [initial_walker[:, : self.nup].copy() for iw in range(self.nwalkers)],
            dtype=numpy.complex128,
        )
        self.phib = numpy.array(
            [initial_walker[:, self.nup :].copy() for iw in range(self.nwalkers)],
            dtype=numpy.complex128,
        )
        self.ovlp = trial.calc_greens_function(self)

        if config.get_option("use_gpu"):
            self.cast_to_cupy()
            self.Ga = xp.asarray(self.Ga)
            self.Gb = xp.asarray(self.Gb)
            trial._rchola = xp.asarray(trial._rchola)
            trial._rcholb = xp.asarray(trial._rcholb)
            trial._rchola_act = xp.asarray(trial._rchola_act)

at the top:

from ipie.config import config
config.update_option("use_gpu", True)
# iteration   564: delta_max = 9.99470883e-06: time = 1.55687332e-03
 # Orthogonalising Cholesky vectors.
 # Time to orthogonalise: 0.161392
# Preparing MSD wf
# MSD prepared with 100 determinants
Traceback (most recent call last):
  File "/home/tutorial_vqe/complete_workflow-cudaq.py", line 115, in <module>
    afqmc_msd = AFQMC.build(
  File "/usr/local/lib/python3.10/dist-packages/ipie/qmc/afqmc.py", line 379, in build
    walkers.build(
  File "/usr/local/lib/python3.10/dist-packages/ipie/walkers/uhf_walkers.py", line 236, in build
    self.ovlp = trial.calc_greens_function(self)
  File "/usr/local/lib/python3.10/dist-packages/ipie/trial_wavefunction/particle_hole.py", line 534, in calc_greens_function
    return greens_function_multi_det_wicks_opt(walkers, self)
  File "/usr/local/lib/python3.10/dist-packages/ipie/estimators/greens_function_multi_det.py", line 1203, in greens_function_multi_det_wicks_opt
    dets_a_full, dets_b_full = compute_determinants_batched(
  File "/usr/local/lib/python3.10/dist-packages/ipie/propagation/overlap.py", line 618, in compute_determinants_batched
    dets_a, dets_b = get_dets_single_excitation_batched_opt(G0a, G0b, trial)
  File "/usr/local/lib/python3.10/dist-packages/ipie/propagation/overlap.py", line 279, in get_dets_single_excitation_batched_opt
    wk.get_dets_singles(
  File "/usr/local/lib/python3.10/dist-packages/ipie/estimators/kernels/gpu/wicks_gpu.py", line 89, in get_dets_singles
    get_dets_singles_kernel(
  File "cupy/_core/raw.pyx", line 93, in cupy._core.raw.RawKernel.__call__
  File "cupy/cuda/function.pyx", line 223, in cupy.cuda.function.Function.__call__
  File "cupy/cuda/function.pyx", line 177, in cupy.cuda.function._launch
  File "cupy/cuda/function.pyx", line 133, in cupy.cuda.function._pointer
TypeError: You are trying to pass a numpy.ndarray of shape (200, 72, 103) as a kernel parameter. Only numpy.ndarrays of size one can be passed by value. If you meant to pass a pointer to __global__ memory, you need to pass a cupy.ndarray instead.

@jiangtong1000 (Collaborator)

jiangtong1000 commented Dec 4, 2024 via email

@zohimchandani (Author)

I changed this to ParticleHole:

error:

# iteration   564: delta_max = 9.99470883e-06: time = 1.22594833e-03
 # Orthogonalising Cholesky vectors.
 # Time to orthogonalise: 0.158061
# Preparing MSD wf
# MSD prepared with 100 determinants
Traceback (most recent call last):
  File "/home/tutorial_vqe/complete_workflow-cudaq.py", line 115, in <module>
    afqmc_msd = AFQMC.build(
  File "/usr/local/lib/python3.10/dist-packages/ipie/qmc/afqmc.py", line 379, in build
    walkers.build(
  File "/usr/local/lib/python3.10/dist-packages/ipie/walkers/uhf_walkers.py", line 237, in build
    self.ovlp = trial.calc_greens_function(self)
  File "/usr/local/lib/python3.10/dist-packages/ipie/trial_wavefunction/particle_hole.py", line 446, in calc_greens_function
    return greens_function_multi_det_wicks_opt_gpu(walkers, self)
  File "/usr/local/lib/python3.10/dist-packages/ipie/estimators/greens_function_multi_det.py", line 1345, in greens_function_multi_det_wicks_opt_gpu
    walker_batch.Ga += xp.einsum("w,wpq->wpq", ovlps, G0a, optimize=True)
  File "cupy/_core/core.pyx", line 1693, in cupy._core.core._ndarray_base.__array_ufunc__
  File "cupy/_core/_kernel.pyx", line 1286, in cupy._core._kernel.ufunc.__call__
  File "cupy/_core/_kernel.pyx", line 159, in cupy._core._kernel._preprocess_args
  File "cupy/_core/_kernel.pyx", line 145, in cupy._core._kernel._preprocess_arg
TypeError: Unsupported type <class 'numpy.ndarray'>

It turns out that type(walker_batch.Ga) = <class 'numpy.ndarray'>

If I change the type of something, it breaks something else in the code. It would be best if you could try to reproduce this so we can resolve it more efficiently.

Let me know if you need anything else.

Thanks

@jiangtong1000 (Collaborator)

@zohimchandani please check #327

@zohimchandani (Author)

This now works - thanks.

Question:

I am using a cluster of 8 H100s.

Receiving an out of memory error - see below:

# - CUDA compute capability: 9.0
# - CUDA version: 12.06.0
# - GPU Type: 'NVIDIA H100 80GB HBM3'
# - GPU Mem: 79.097 GB
# - Number of GPUs: 8
# MPI communicator : <class 'mpi4py.MPI.Intracomm'>
# Available memory on the node is 2015.563 GB
# There are unused GPUs (1 MPI tasks but 8 GPUs).  Check if this is really what you wanted.
# PhaselessGeneric: expected to allocate 0.0 GB
# PhaselessGeneric: using 25.2642822265625 GB out of 79.09661865234375 GB memory on GPU
# GenericRealChol: expected to allocate 0.13464972376823425 GB
# GenericRealChol: using 25.2642822265625 GB out of 79.09661865234375 GB memory on GPU
# UHFWalkersParticleHole: expected to allocate 0.0 GB
# UHFWalkersParticleHole: using 25.2642822265625 GB out of 79.09661865234375 GB memory on GPU
# Setting up estimator object.
# Writing estimator data to afqmc_data_10q.h5
# Finished settting up estimator object.
            Block                   Weight            WeightFactor            HybridEnergy                  ENumer                  EDenom                  ETotal                  E1Body                  E2Body
                0   1.0000000000000000e+04  1.0000000000000000e+04  0.0000000000000000e+00 -2.1237020500117082e+07  1.0000000000000000e+04 -2.1237020500117083e+03 -4.6822529147033019e+03  2.5585508646915937e+03
                1   2.7314090682770265e+05  2.4687498571639704e+06 -1.1586580202448267e+03 -2.1237573243674748e+07  1.0000000000000000e+04 -2.1237573243674747e+03 -4.6822587783462113e+03  2.5585014539787362e+03
                2   1.0000619061718366e+04  5.3334290771429962e+05 -1.1587034566746970e+03 -2.1238082175784133e+07  1.0000000000000000e+04 -2.1238082175784134e+03 -4.6822622111536966e+03  2.5584539935752832e+03
cupy.cuda.memory.OutOfMemoryError: Out of memory allocating 8,933,760,000 bytes (allocated so far: 74,319,149,056 bytes).

Is there a way to use ipie in a multi-gpu setting?

Thanks

@jiangtong1000 (Collaborator)

jiangtong1000 commented Dec 9, 2024 via email

@zohimchandani (Author)

zohimchandani commented Dec 10, 2024

I am testing this workflow.

num_walkers = 9500 works with mpirun -np 1 --allow-run-as-root python3 complete_workflow.py --cudaq-full-stack-trace

num_walkers = 9600 fails with mpirun -np 8 --allow-run-as-root python3 complete_workflow.py --cudaq-full-stack-trace

# iteration   562: delta_max = 1.01314564e-05: time = 4.09116745e-02
# iteration   563: delta_max = 1.00020910e-05: time = 3.03366184e-02
# iteration   564: delta_max = 9.99470883e-06: time = 4.92453575e-02
 # Orthogonalising Cholesky vectors.
 # Time to orthogonalise: 1.175176
 # Time to orthogonalise: 0.314975
# Preparing MSD wf
# MSD prepared with 100 determinants
# Preparing MSD wf
# MSD prepared with 100 determinants
# Preparing MSD wf
# MSD prepared with 100 determinants
# Preparing MSD wf
# MSD prepared with 100 determinants
Traceback (most recent call last):
  File "/home/tutorial_vqe/complete_workflow.py", line 122, in <module>
    afqmc_msd = AFQMC.build(
  File "/usr/local/lib/python3.10/dist-packages/ipie/qmc/afqmc.py", line 379, in build
    walkers.build(
  File "/usr/local/lib/python3.10/dist-packages/ipie/walkers/uhf_walkers.py", line 236, in build
    self.cast_to_cupy()
  File "/usr/local/lib/python3.10/dist-packages/ipie/walkers/uhf_walkers.py", line 106, in cast_to_cupy
    cast_to_device(self, verbose)
  File "/usr/local/lib/python3.10/dist-packages/ipie/utils/backend.py", line 100, in cast_to_device
    self.__dict__[k] = arraylib.array(v)
  File "/usr/local/lib/python3.10/dist-packages/cupy/_creation/from_data.py", line 53, in array
    return _core.array(obj, dtype, copy, order, subok, ndmin, blocking)
  File "cupy/_core/core.pyx", line 2408, in cupy._core.core.array
  File "cupy/_core/core.pyx", line 2435, in cupy._core.core.array
  File "cupy/_core/core.pyx", line 2578, in cupy._core.core._array_default
  File "cupy/_core/core.pyx", line 137, in cupy._core.core.ndarray.__new__
  File "cupy/_core/core.pyx", line 225, in cupy._core.core._ndarray_base._init
  File "cupy/cuda/memory.pyx", line 738, in cupy.cuda.memory.alloc
  File "cupy/cuda/memory.pyx", line 1424, in cupy.cuda.memory.MemoryPool.malloc
  File "cupy/cuda/memory.pyx", line 1445, in cupy.cuda.memory.MemoryPool.malloc
  File "cupy/cuda/memory.pyx", line 1116, in cupy.cuda.memory.SingleDeviceMemoryPool.malloc
  File "cupy/cuda/memory.pyx", line 1137, in cupy.cuda.memory.SingleDeviceMemoryPool._malloc
  File "cupy/cuda/memory.pyx", line 1382, in cupy.cuda.memory.SingleDeviceMemoryPool._try_malloc
  File "cupy/cuda/memory.pyx", line 1385, in cupy.cuda.memory.SingleDeviceMemoryPool._try_malloc
cupy.cuda.memory.OutOfMemoryError: Out of memory allocating 1,629,542,400 bytes (allocated so far: 9,414,626,816 bytes).

The error message is printed out 8 times; it seems the script is run 8 times rather than the memory of the 8 GPUs being pooled.
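
That reading is consistent with how MPI data parallelism usually works: every rank executes the whole script, and it is the walker population that should be partitioned across ranks, not the GPU memory pooled. A minimal sketch of even partitioning, using a hypothetical helper rather than ipie's API:

```python
def walkers_per_rank(total_walkers, size, rank):
    """Number of walkers a given MPI rank should own under even splitting."""
    base, rem = divmod(total_walkers, size)
    # The first `rem` ranks absorb one extra walker each.
    return base + (1 if rank < rem else 0)

total = 9600
counts = [walkers_per_rank(total, 8, r) for r in range(8)]

assert sum(counts) == total          # nothing duplicated, nothing lost
assert max(counts) - min(counts) <= 1
```

With 9600 walkers over 8 ranks, each rank would hold 1200 walkers and allocate roughly one eighth of the single-GPU footprint, which is what the out-of-memory behavior suggests is not happening here.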

@jiangtong1000 (Collaborator)

jiangtong1000 commented Dec 10, 2024 via email

@zohimchandani (Author)

Can I expect this to be merged in weeks, or months? Just so I can plan for the workloads that I intend to run.
