
IndexError: tuple index out of range #2694

Closed
zhaodaye2022 opened this issue Nov 17, 2023 · 8 comments · Fixed by #2737

@zhaodaye2022

Start command:

python3 -m fastchat.serve.model_worker --model-path lmsys/vicuna-7b-v1.5 --device mps --load-8bit

Result:

2023-11-17 17:54:14 | INFO | model_worker | args: Namespace(host='localhost', port=21002, worker_address='http://localhost:21002', controller_address='http://localhost:21001', model_path='lmsys/vicuna-7b-v1.5', revision='main', device='mps', gpus=None, num_gpus=1, max_gpu_memory=None, dtype=None, load_8bit=True, cpu_offloading=False, gptq_ckpt=None, gptq_wbits=16, gptq_groupsize=-1, gptq_act_order=False, awq_ckpt=None, awq_wbits=16, awq_groupsize=-1, enable_exllama=False, exllama_max_seq_len=4096, exllama_gpu_split=None, model_names=None, conv_template=None, embed_in_truncate=False, limit_worker_concurrency=5, stream_interval=2, no_register=False, seed=None)
2023-11-17 17:54:14 | INFO | model_worker | Loading the model ['vicuna-7b-v1.5'] on worker a0c7314a ...
  0%|          | 0/2 [00:00<?, ?it/s]
 50%|█████     | 1/2 [00:27<00:27, 27.78s/it]
100%|██████████| 2/2 [00:36<00:00, 16.68s/it]
100%|██████████| 2/2 [00:36<00:00, 18.35s/it]
2023-11-17 17:54:52 | ERROR | stderr |
2023-11-17 17:54:52 | INFO | model_worker | Register to controller
2023-11-17 17:54:52 | ERROR | stderr | INFO:     Started server process [97889]
2023-11-17 17:54:52 | ERROR | stderr | INFO:     Waiting for application startup.
2023-11-17 17:54:52 | ERROR | stderr | INFO:     Application startup complete.
2023-11-17 17:54:52 | ERROR | stderr | INFO:     Uvicorn running on http://localhost:21002 (Press CTRL+C to quit)
2023-11-17 17:55:07 | INFO | stdout | INFO:     ::1:64855 - "POST /worker_generate_stream HTTP/1.1" 200 OK
2023-11-17 17:55:07 | ERROR | stderr | ERROR:    Exception in ASGI application
2023-11-17 17:55:07 | ERROR | stderr | Traceback (most recent call last):
2023-11-17 17:55:07 | ERROR | stderr |   File "/opt/homebrew/lib/python3.11/site-packages/uvicorn/protocols/http/h11_impl.py", line 408, in run_asgi
2023-11-17 17:55:07 | ERROR | stderr |     result = await app(  # type: ignore[func-returns-value]
2023-11-17 17:55:07 | ERROR | stderr |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2023-11-17 17:55:07 | ERROR | stderr |   File "/opt/homebrew/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
2023-11-17 17:55:07 | ERROR | stderr |     return await self.app(scope, receive, send)
2023-11-17 17:55:07 | ERROR | stderr |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2023-11-17 17:55:07 | ERROR | stderr |   File "/opt/homebrew/lib/python3.11/site-packages/fastapi/applications.py", line 1106, in __call__
2023-11-17 17:55:07 | ERROR | stderr |     await super().__call__(scope, receive, send)
2023-11-17 17:55:07 | ERROR | stderr |   File "/opt/homebrew/lib/python3.11/site-packages/starlette/applications.py", line 122, in __call__
2023-11-17 17:55:07 | ERROR | stderr |     await self.middleware_stack(scope, receive, send)
2023-11-17 17:55:07 | ERROR | stderr |   File "/opt/homebrew/lib/python3.11/site-packages/starlette/middleware/errors.py", line 184, in __call__
2023-11-17 17:55:07 | ERROR | stderr |     raise exc
2023-11-17 17:55:07 | ERROR | stderr |   File "/opt/homebrew/lib/python3.11/site-packages/starlette/middleware/errors.py", line 162, in __call__
2023-11-17 17:55:07 | ERROR | stderr |     await self.app(scope, receive, _send)
2023-11-17 17:55:07 | ERROR | stderr |   File "/opt/homebrew/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
2023-11-17 17:55:07 | ERROR | stderr |     raise exc
2023-11-17 17:55:07 | ERROR | stderr |   File "/opt/homebrew/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
2023-11-17 17:55:07 | ERROR | stderr |     await self.app(scope, receive, sender)
2023-11-17 17:55:07 | ERROR | stderr |   File "/opt/homebrew/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 20, in __call__
2023-11-17 17:55:07 | ERROR | stderr |     raise e
2023-11-17 17:55:07 | ERROR | stderr |   File "/opt/homebrew/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 17, in __call__
2023-11-17 17:55:07 | ERROR | stderr |     await self.app(scope, receive, send)
2023-11-17 17:55:07 | ERROR | stderr |   File "/opt/homebrew/lib/python3.11/site-packages/starlette/routing.py", line 718, in __call__
2023-11-17 17:55:07 | ERROR | stderr |     await route.handle(scope, receive, send)
2023-11-17 17:55:07 | ERROR | stderr |   File "/opt/homebrew/lib/python3.11/site-packages/starlette/routing.py", line 276, in handle
2023-11-17 17:55:07 | ERROR | stderr |     await self.app(scope, receive, send)
2023-11-17 17:55:07 | ERROR | stderr |   File "/opt/homebrew/lib/python3.11/site-packages/starlette/routing.py", line 69, in app
2023-11-17 17:55:07 | ERROR | stderr |     await response(scope, receive, send)
2023-11-17 17:55:07 | ERROR | stderr |   File "/opt/homebrew/lib/python3.11/site-packages/starlette/responses.py", line 270, in __call__
2023-11-17 17:55:07 | ERROR | stderr |     async with anyio.create_task_group() as task_group:
2023-11-17 17:55:07 | ERROR | stderr |   File "/opt/homebrew/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 597, in __aexit__
2023-11-17 17:55:07 | ERROR | stderr |     raise exceptions[0]
2023-11-17 17:55:07 | ERROR | stderr |   File "/opt/homebrew/lib/python3.11/site-packages/starlette/responses.py", line 273, in wrap
2023-11-17 17:55:07 | ERROR | stderr |     await func()
2023-11-17 17:55:07 | ERROR | stderr |   File "/opt/homebrew/lib/python3.11/site-packages/starlette/responses.py", line 262, in stream_response
2023-11-17 17:55:07 | ERROR | stderr |     async for chunk in self.body_iterator:
2023-11-17 17:55:07 | ERROR | stderr |   File "/opt/homebrew/lib/python3.11/site-packages/starlette/concurrency.py", line 63, in iterate_in_threadpool
2023-11-17 17:55:07 | ERROR | stderr |     yield await anyio.to_thread.run_sync(_next, iterator)
2023-11-17 17:55:07 | ERROR | stderr |           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2023-11-17 17:55:07 | ERROR | stderr |   File "/opt/homebrew/lib/python3.11/site-packages/anyio/to_thread.py", line 33, in run_sync
2023-11-17 17:55:07 | ERROR | stderr |     return await get_asynclib().run_sync_in_worker_thread(
2023-11-17 17:55:07 | ERROR | stderr |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2023-11-17 17:55:07 | ERROR | stderr |   File "/opt/homebrew/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
2023-11-17 17:55:07 | ERROR | stderr |     return await future
2023-11-17 17:55:07 | ERROR | stderr |            ^^^^^^^^^^^^
2023-11-17 17:55:07 | ERROR | stderr |   File "/opt/homebrew/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 807, in run
2023-11-17 17:55:07 | ERROR | stderr |     result = context.run(func, *args)
2023-11-17 17:55:07 | ERROR | stderr |              ^^^^^^^^^^^^^^^^^^^^^^^^
2023-11-17 17:55:07 | ERROR | stderr |   File "/opt/homebrew/lib/python3.11/site-packages/starlette/concurrency.py", line 53, in _next
2023-11-17 17:55:07 | ERROR | stderr |     return next(iterator)
2023-11-17 17:55:07 | ERROR | stderr |            ^^^^^^^^^^^^^^
2023-11-17 17:55:07 | ERROR | stderr |   File "/Volumes/data/soft/chat/FastChat-0.2.32/fastchat/serve/model_worker.py", line 104, in generate_stream_gate
2023-11-17 17:55:07 | ERROR | stderr |     for output in self.generate_stream_func(
2023-11-17 17:55:07 | ERROR | stderr |   File "/opt/homebrew/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 35, in generator_context
2023-11-17 17:55:07 | ERROR | stderr |     response = gen.send(None)
2023-11-17 17:55:07 | ERROR | stderr |                ^^^^^^^^^^^^^^
2023-11-17 17:55:07 | ERROR | stderr |   File "/Volumes/data/soft/chat/FastChat-0.2.32/fastchat/serve/inference.py", line 130, in generate_stream
2023-11-17 17:55:07 | ERROR | stderr |     out = model(input_ids=start_ids, use_cache=True)
2023-11-17 17:55:07 | ERROR | stderr |           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2023-11-17 17:55:07 | ERROR | stderr |   File "/opt/homebrew/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
2023-11-17 17:55:07 | ERROR | stderr |     return self._call_impl(*args, **kwargs)
2023-11-17 17:55:07 | ERROR | stderr |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2023-11-17 17:55:07 | ERROR | stderr |   File "/opt/homebrew/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
2023-11-17 17:55:07 | ERROR | stderr |     return forward_call(*args, **kwargs)
2023-11-17 17:55:07 | ERROR | stderr |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2023-11-17 17:55:07 | ERROR | stderr |   File "/opt/homebrew/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py", line 1034, in forward
2023-11-17 17:55:07 | ERROR | stderr |     outputs = self.model(
2023-11-17 17:55:07 | ERROR | stderr |               ^^^^^^^^^^^
2023-11-17 17:55:07 | ERROR | stderr |   File "/opt/homebrew/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
2023-11-17 17:55:07 | ERROR | stderr |     return self._call_impl(*args, **kwargs)
2023-11-17 17:55:07 | ERROR | stderr |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2023-11-17 17:55:07 | ERROR | stderr |   File "/opt/homebrew/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
2023-11-17 17:55:07 | ERROR | stderr |     return forward_call(*args, **kwargs)
2023-11-17 17:55:07 | ERROR | stderr |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2023-11-17 17:55:07 | ERROR | stderr |   File "/opt/homebrew/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py", line 922, in forward
2023-11-17 17:55:07 | ERROR | stderr |     layer_outputs = decoder_layer(
2023-11-17 17:55:07 | ERROR | stderr |                     ^^^^^^^^^^^^^^
2023-11-17 17:55:07 | ERROR | stderr |   File "/opt/homebrew/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
2023-11-17 17:55:07 | ERROR | stderr |     return self._call_impl(*args, **kwargs)
2023-11-17 17:55:07 | ERROR | stderr |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2023-11-17 17:55:07 | ERROR | stderr |   File "/opt/homebrew/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
2023-11-17 17:55:07 | ERROR | stderr |     return forward_call(*args, **kwargs)
2023-11-17 17:55:07 | ERROR | stderr |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2023-11-17 17:55:07 | ERROR | stderr |   File "/opt/homebrew/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py", line 672, in forward
2023-11-17 17:55:07 | ERROR | stderr |     hidden_states, self_attn_weights, present_key_value = self.self_attn(
2023-11-17 17:55:07 | ERROR | stderr |                                                           ^^^^^^^^^^^^^^^
2023-11-17 17:55:07 | ERROR | stderr |   File "/opt/homebrew/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
2023-11-17 17:55:07 | ERROR | stderr |     return self._call_impl(*args, **kwargs)
2023-11-17 17:55:07 | ERROR | stderr |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2023-11-17 17:55:07 | ERROR | stderr |   File "/opt/homebrew/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
2023-11-17 17:55:07 | ERROR | stderr |     return forward_call(*args, **kwargs)
2023-11-17 17:55:07 | ERROR | stderr |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2023-11-17 17:55:07 | ERROR | stderr |   File "/Volumes/data/soft/chat/FastChat-0.2.32/fastchat/model/monkey_patch_non_inplace.py", line 62, in forward
2023-11-17 17:55:07 | ERROR | stderr |     query_states, key_states = apply_rotary_pos_emb(
2023-11-17 17:55:07 | ERROR | stderr |                                ^^^^^^^^^^^^^^^^^^^^^
2023-11-17 17:55:07 | ERROR | stderr |   File "/Volumes/data/soft/chat/FastChat-0.2.32/fastchat/model/monkey_patch_non_inplace.py", line 22, in apply_rotary_pos_emb
2023-11-17 17:55:07 | ERROR | stderr |     gather_indices = gather_indices.repeat(1, cos.shape[1], 1, cos.shape[3])
2023-11-17 17:55:07 | ERROR | stderr |                                                                ~~~~~~~~~^^^
2023-11-17 17:55:07 | ERROR | stderr | IndexError: tuple index out of range

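Judging from the last two frames (my reading of the traceback, not verified against the FastChat sources): the failing line indexes cos.shape[3], which only works if the rotary cos/sin tensors are 4-D. Newer transformers releases hand them to the patched function with fewer dimensions, so the shape lookup itself raises IndexError before any generation happens. A minimal sketch of the broken assumption, with illustrative shapes:

    import torch

    seq_len, head_dim = 8, 128

    # Layout the monkey patch expects: [1, 1, seq_len, head_dim]
    cos_old = torch.zeros(1, 1, seq_len, head_dim)
    print(cos_old.shape[3])  # 128 -- works

    # Layout newer transformers return: [seq_len, head_dim]
    cos_new = torch.zeros(seq_len, head_dim)
    print(cos_new.shape[3])  # IndexError: tuple index out of range
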
@richard-bridgeman

Getting the same error when trying to run on a Mac M2.

@phg0324

phg0324 commented Nov 18, 2023

For me, the errors appear when I send test_message, using a Mac M1 Pro.

@richard-bridgeman

Running the command:

python3 -m fastchat.serve.cli --model-path lmsys/vicuna-7b-v1.5 --device mps --load-8bit

It loads and I get the USER: prompt; when I type 'hello', I get the following error:

USER: hello
ASSISTANT: Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/fastchat/serve/cli.py", line 291, in <module>
    main(args)
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/fastchat/serve/cli.py", line 215, in main
    chat_loop(
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/fastchat/serve/inference.py", line 527, in chat_loop
    outputs = chatio.stream_output(output_stream)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/fastchat/serve/cli.py", line 62, in stream_output
    for outputs in output_stream:
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 35, in generator_context
    response = gen.send(None)
               ^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/fastchat/serve/inference.py", line 130, in generate_stream
    out = model(input_ids=start_ids, use_cache=True)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py", line 1034, in forward
    outputs = self.model(
              ^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py", line 922, in forward
    layer_outputs = decoder_layer(
                    ^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/transformers/models/llama/modeling_llama.py", line 672, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
                                                          ^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/fastchat/model/monkey_patch_non_inplace.py", line 62, in forward
    query_states, key_states = apply_rotary_pos_emb(
                               ^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/fastchat/model/monkey_patch_non_inplace.py", line 22, in apply_rotary_pos_emb
    gather_indices = gather_indices.repeat(1, cos.shape[1], 1, cos.shape[3])
                                                               ~~~~~~~~~^^^
IndexError: tuple index out of range
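
A guard along these lines should tolerate both layouts (a sketch only: the 4-D branch is reconstructed from the traceback plus the stock llama implementation, and this is not necessarily what the maintainers' eventual fix does):

    import torch

    def rotate_half(x):
        # Standard llama helper: rotates half of the hidden dims.
        x1 = x[..., : x.shape[-1] // 2]
        x2 = x[..., x.shape[-1] // 2 :]
        return torch.cat((-x2, x1), dim=-1)

    def apply_rotary_pos_emb(q, k, cos, sin, position_ids):
        if cos.dim() == 2:
            # Newer transformers: cos/sin arrive as [seq_len, head_dim].
            cos = cos[position_ids].unsqueeze(1)  # -> [bs, 1, seq_len, head_dim]
            sin = sin[position_ids].unsqueeze(1)
        else:
            # Older transformers: cos/sin arrive as [1, 1, seq_len, head_dim].
            gather_indices = position_ids[:, None, :, None]  # [bs, 1, seq_len, 1]
            gather_indices = gather_indices.repeat(1, cos.shape[1], 1, cos.shape[3])
            cos = torch.gather(cos.repeat(gather_indices.shape[0], 1, 1, 1), 2, gather_indices)
            sin = torch.gather(sin.repeat(gather_indices.shape[0], 1, 1, 1), 2, gather_indices)
        q_embed = (q * cos) + (rotate_half(q) * sin)
        k_embed = (k * cos) + (rotate_half(k) * sin)
        return q_embed, k_embed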

@peterwilli
Contributor

I have this issue too, on an M2 MacBook Air with 16 GB RAM.
[Screenshot 2023-11-18 at 16:08:38]

@peterwilli
Contributor

It seems to have started at 0.2.21. Downgrading to 0.2.20 fixes the issue, but inference output looks very weird; I'm going to play with it for a bit. Downgrade command is below.
[Screenshot 2023-11-18 at 16:46:11]
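
If anyone else wants to try the same downgrade, this should do it (assuming you installed from PyPI, where FastChat is published as fschat):

python3 -m pip install fschat==0.2.20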

@suquark suquark self-assigned this Nov 23, 2023
@suquark
Collaborator

suquark commented Nov 23, 2023

This issue is due to an update in the transformers library. The temporary fix is to downgrade transformers. I will push a fix soon.
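
A pin along those lines would look like the following (the exact boundary version is a guess from the release timeline, so adjust as needed):

python3 -m pip install "transformers<4.35"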

@infwinston
Member

Hi all, this issue should be fixed by #2737. Could you try the latest master and help us confirm the issue is gone? @zhaodaye2022 @richard-bridgeman @peterwilli
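
For anyone testing, installing the current master is the usual pip-from-git invocation:

python3 -m pip install --upgrade git+https://github.com/lm-sys/FastChat.git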

@alfiedotwtf

@infwinston Looks good. All working for me. Thanks!
