Support RNA004 kit in Tombo #455

Open
charlesxu90 opened this issue May 16, 2024 · 3 comments

charlesxu90 commented May 16, 2024

Dear @marcus1487,

I want to use Tombo to align signals from the latest RNA004 kit to sequences. I converted the pod5 files to fast5, then converted the multi-read fast5 files to single-read ones, and tried to align the reads to the signals.

However, the converted data seems to differ from what Tombo expects: the alignment step fails with the following error.

BaseCalled_template:::PAU73183_pass_15caf1a1_68646c47_1.0_515.fast5-single_read/0/0061fb45-7c81-41b5-9ed6-e6135a0b081b.fast5
:::
Traceback (most recent call last):
  File "/home/xiaopeng/miniconda3/envs/tombo_env/lib/python3.6/site-packages/tombo/resquiggle.py", line 1404, in _io_and_map_read
    map_thr_buf, q_score_thresh, seq_len_rng)
  File "/home/xiaopeng/miniconda3/envs/tombo_env/lib/python3.6/site-packages/tombo/resquiggle.py", line 1336, in map_read
    seq_data.id.decode(), bc_subgrp, num_start_clipped_bases,
AttributeError: 'str' object has no attribute 'decode'

It looks like the problem comes from the converted fast5 files, but it's not clear to me how to fix it. Could you please provide some hints?

Should I modify the Tombo code to support it? Or should I use a different approach to convert pod5 into fast5?

Thanks a lot for the help!

charlesxu90 commented May 16, 2024

I traced the error to this call:

align_info = th.alignInfo(
        seq_data.id.decode(), bc_subgrp, num_start_clipped_bases,
        num_end_clipped_bases, num_ins, num_del, num_match,
        num_aligned - num_match)

So I changed it to:

if hasattr(seq_data.id, 'decode'):
    seq_data_id = seq_data.id.decode()
else:
    seq_data_id = seq_data.id

align_info = th.alignInfo(
        seq_data_id, bc_subgrp, num_start_clipped_bases,
        num_end_clipped_bases, num_ins, num_del, num_match,
        num_aligned - num_match)
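
The same guard can be written as a small helper, in case other call sites hit the same bytes-versus-str mismatch. This is a minimal sketch, not part of Tombo: the name `as_text` is hypothetical, and it assumes the read ID is either `bytes` or `str`.

```python
def as_text(value, encoding="utf-8"):
    """Return `value` as str, whether it arrives as bytes or as str.

    Hypothetical helper (not a Tombo function): older code paths may hand
    over a bytes read ID, while newer ones may already provide a str.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return str(value)


# Both forms of the same read ID normalize to the same str.
read_id = "0061fb45-7c81-41b5-9ed6-e6135a0b081b"
print(as_text(read_id.encode()) == as_text(read_id))
```

With such a helper, the call above would pass `as_text(seq_data.id)` in place of `seq_data.id.decode()`.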

However, I then hit an error while building Tombo.

python setup.py develop --no-deps
running develop
running egg_info
writing ont_tombo.egg-info/PKG-INFO
writing dependency_links to ont_tombo.egg-info/dependency_links.txt
writing entry points to ont_tombo.egg-info/entry_points.txt
writing requirements to ont_tombo.egg-info/requires.txt
writing top-level names to ont_tombo.egg-info/top_level.txt
reading manifest file 'ont_tombo.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no files found matching '*' under directory 'cython'
adding license file 'LICENSE.md'
writing manifest file 'ont_tombo.egg-info/SOURCES.txt'
running build_ext
Compiling tombo/_c_dynamic_programming.pyx because it changed.
[1/1] Cythonizing tombo/_c_dynamic_programming.pyx
/home/xiaopeng/miniconda3/envs/tombo_env/lib/python3.6/site-packages/Cython/Compiler/Main.py:384: FutureWarning: Cython directive 'language_level' not set, using '3str' for now (Py3). This has changed from earlier releases! File: /mnt/data/done_projects/Nano_seq/ref_works/0.ont_process/tombo/tombo/_c_dynamic_programming.pyx
  tree = Parsing.p_module(s, pxd, full_module_name)

Error compiling Cython file:
------------------------------------------------------------
...
        DTYPE_INT_t start_seq_pos, DTYPE_t mask_fill_z_score,
        bool do_winsorize_z, DTYPE_t max_half_z_score,
        bool return_z_scores=False):
    cdef DTYPE_INT_t n_bases = fwd_pass.shape[0] - 1
    cdef DTYPE_INT_t bandwidth = fwd_pass.shape[1]
    cdef DTYPE_INT_t half_bandwidth = bandwidth / 2
                                                ^
------------------------------------------------------------

tombo/_c_dynamic_programming.pyx:327:48: Cannot assign type 'double' to 'DTYPE_INT_t'
Traceback (most recent call last):
  File "setup.py", line 102, in <module>
    'Topic :: Scientific/Engineering :: Bio-Informatics',
  File "/home/xiaopeng/miniconda3/envs/tombo_env/lib/python3.6/site-packages/setuptools/__init__.py", line 153, in setup
    return distutils.core.setup(**attrs)
  File "/home/xiaopeng/miniconda3/envs/tombo_env/lib/python3.6/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/home/xiaopeng/miniconda3/envs/tombo_env/lib/python3.6/distutils/dist.py", line 955, in run_commands
    self.run_command(cmd)
  File "/home/xiaopeng/miniconda3/envs/tombo_env/lib/python3.6/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "/home/xiaopeng/miniconda3/envs/tombo_env/lib/python3.6/site-packages/setuptools/command/develop.py", line 34, in run
    self.install_for_development()
  File "/home/xiaopeng/miniconda3/envs/tombo_env/lib/python3.6/site-packages/setuptools/command/develop.py", line 114, in install_for_development
    self.run_command('build_ext')
  File "/home/xiaopeng/miniconda3/envs/tombo_env/lib/python3.6/distutils/cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "/home/xiaopeng/miniconda3/envs/tombo_env/lib/python3.6/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "/home/xiaopeng/miniconda3/envs/tombo_env/lib/python3.6/site-packages/setuptools/command/build_ext.py", line 79, in run
    _build_ext.run(self)
  File "/home/xiaopeng/miniconda3/envs/tombo_env/lib/python3.6/distutils/command/build_ext.py", line 339, in run
    self.build_extensions()
  File "/home/xiaopeng/miniconda3/envs/tombo_env/lib/python3.6/distutils/command/build_ext.py", line 448, in build_extensions
    self._build_extensions_serial()
  File "/home/xiaopeng/miniconda3/envs/tombo_env/lib/python3.6/distutils/command/build_ext.py", line 473, in _build_extensions_serial
    self.build_extension(ext)
  File "/home/xiaopeng/miniconda3/envs/tombo_env/lib/python3.6/site-packages/setuptools/command/build_ext.py", line 202, in build_extension
    _build_ext.build_extension(self, ext)
  File "/home/xiaopeng/miniconda3/envs/tombo_env/lib/python3.6/site-packages/Cython/Distutils/build_ext.py", line 123, in build_extension
    ext,force=self.force, quiet=self.verbose == 0, **options
  File "/home/xiaopeng/miniconda3/envs/tombo_env/lib/python3.6/site-packages/Cython/Build/Dependencies.py", line 1134, in cythonize
    cythonize_one(*args)
  File "/home/xiaopeng/miniconda3/envs/tombo_env/lib/python3.6/site-packages/Cython/Build/Dependencies.py", line 1301, in cythonize_one
    raise CompileError(None, pyx_file)
Cython.Compiler.Errors.CompileError: tombo/_c_dynamic_programming.pyx

This looks like a dependency problem with Cython, but I have no clear clue how to proceed.

Please give me some hints if you are familiar with this issue. Thanks!

Should I use an older version of Cython? Which version do you suggest?

charlesxu90 commented

I traced the above error to the Cython code at tombo/_c_dynamic_programming.pyx:327:48:

cdef DTYPE_INT_t bandwidth = fwd_pass.shape[1]
cdef DTYPE_INT_t half_bandwidth = bandwidth / 2  # <-  Cannot assign type 'double' to 'DTYPE_INT_t'

However, I don't see any problem with the assignment, as both variables are of type DTYPE_INT_t.
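
A possible explanation, going by the FutureWarning in the build log: when `language_level` is '3str', the `/` operator follows Python 3 true-division semantics, so `bandwidth / 2` is typed as a double even though both operands are integers. If that is the cause, writing `bandwidth // 2` on that line should let it compile on newer Cython as well; this has not been tested against Tombo itself. The plain-Python sketch below only illustrates the difference between the two operators, and the function names are made up for the example.

```python
def half_true_division(bandwidth: int) -> float:
    # Python 3 semantics: "/" always yields a float, even for two ints.
    return bandwidth / 2


def half_floor_division(bandwidth: int) -> int:
    # "//" keeps the result an integer, rounding toward negative infinity.
    return bandwidth // 2


print(type(half_true_division(7)).__name__)
print(type(half_floor_division(7)).__name__)
```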

charlesxu90 commented

It seems an older Cython version works.

pip install Cython==0.29.36

I updated the Cython dependency in my fork.
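
The pin presumably helps because Cython 0.29.x still defaults to `language_level` 2 when the directive is not set, whereas the build log above shows '3str' being used. As a minimal sketch of a pre-build sanity check (the function names are hypothetical and it only inspects a version string, it does not import Cython):

```python
def major_version(version: str) -> int:
    """Return the leading numeric component of a version string like '0.29.36'."""
    return int(version.split(".")[0])


def is_pre_cython3(version: str) -> bool:
    """True for 0.x releases such as 0.29.36, False for 3.x releases."""
    return major_version(version) < 3


print(is_pre_cython3("0.29.36"))
```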
