Slim down big table indexing (#246)
* Deprecate ExpressionsTableIndexer and its CLI wrapper.

* Add range definition table as a replacement for the previous index.

* Make entrypoint more prominent in cells parser.

* Use record identifying column for scope purposes.

* Start adding range definition

* Add serial id getter and adder.

* Include specimen-scope-specific range definition in optimized query.

* Refactor query/parameters dynamically chosen based on optimization flag.

* Deprecate "unoptimized" expressions query, use only range-based optimized form.

* Version bump.

* Deprecate CLI interface to constraint modifier, used only internally now.

* Version bump and changelog
jimmymathews authored Nov 16, 2023
1 parent f88e64e commit 3b39fb0
Showing 14 changed files with 168 additions and 424 deletions.
9 changes: 9 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,12 @@
# v0.16.2
- Deprecates heavy index on large tables:
- Adds a new table for tracking scope ranges.
- Converts the former `source_specimen` column on `expression_quantification` to a `SERIAL` integer.
- Makes tabular import keep track of ranges per-specimen in the new range_definitions table.
- Updates the "optimized" sparse matrix query to use the ranges rather than the former huge index.
- Deprecates the modify-constraints CLI entrypoint (only used internally now).
- Deprecates the expression indexing module, CLI entrypoint, etc.

# v0.16.0
- Separates datasets into own databases:
- `DBCursor` and `DBConnection` usage streamlined, typically requires study-scoping (dataset-scoping).
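The v0.16.2 entries above describe replacing a large index with per-specimen ranges. The read path can be sketched roughly as follows. This is a minimal illustration, not the project's actual query: it uses the standard-library `sqlite3` module as a stand-in for PostgreSQL, the `expression_quantification` table is reduced to a few illustrative columns, and `fetch_expressions_for_specimen` is a hypothetical helper name.

```python
import sqlite3


def fetch_expressions_for_specimen(connection, specimen):
    """Hypothetical helper: select one specimen's rows by serial range."""
    # Look up the range recorded for this scope (here, a specimen).
    row = connection.execute(
        'SELECT lowest_value, highest_value FROM range_definitions '
        'WHERE scope_identifier = ? AND tablename = ?',
        (specimen, 'expression_quantification'),
    ).fetchone()
    if row is None:
        return []
    lowest, highest = row
    # Filter on the serial integer instead of relying on a specimen index.
    return connection.execute(
        'SELECT histological_structure, target, discrete_value '
        'FROM expression_quantification '
        'WHERE range_identifier_integer BETWEEN ? AND ? '
        'ORDER BY range_identifier_integer',
        (lowest, highest),
    ).fetchall()


# In-memory stand-in schema; the columns besides the serial id are assumptions.
connection = sqlite3.connect(':memory:')
connection.executescript('''
    CREATE TABLE expression_quantification (
        range_identifier_integer INTEGER PRIMARY KEY AUTOINCREMENT,
        histological_structure TEXT,
        target TEXT,
        discrete_value INTEGER
    );
    CREATE TABLE range_definitions (
        scope_identifier VARCHAR(512),
        tablename VARCHAR(512),
        lowest_value INT,
        highest_value INT
    );
''')
connection.executemany(
    'INSERT INTO expression_quantification '
    '(histological_structure, target, discrete_value) VALUES (?, ?, ?)',
    [
        ('cell1', 'CD3', 1), ('cell1', 'CD8', 0),
        ('cell2', 'CD3', 0), ('cell2', 'CD8', 1),
        ('cell3', 'CD3', 1),
    ],
)
# Made-up ranges: the first four rows belong to specimen_A, the fifth to specimen_B.
connection.executemany(
    'INSERT INTO range_definitions VALUES (?, ?, ?, ?)',
    [
        ('specimen_A', 'expression_quantification', 1, 4),
        ('specimen_B', 'expression_quantification', 5, 5),
    ],
)

rows_a = fetch_expressions_for_specimen(connection, 'specimen_A')
rows_b = fetch_expressions_for_specimen(connection, 'specimen_B')
```

A range scan over the serial integer only needs the small `range_definitions` lookup plus the key on the serial column, which is presumably why the heavy per-specimen index could be dropped.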
2 changes: 0 additions & 2 deletions pyproject.toml.unversioned
@@ -174,12 +174,10 @@ packages = [
]
"spatialprofilingtoolbox.db.scripts" = [
"create_schema.py",
"modify_constraints.py",
"guess_channels_from_object_files.py",
"status.py",
"retrieve_feature_matrices.py",
"drop.py",
"index_expressions_table.py",
"drop_ondemand_computations.py"
]
"spatialprofilingtoolbox.db.data_model" = [
10 changes: 10 additions & 0 deletions spatialprofilingtoolbox/db/data_model/performance_tweaks.sql
@@ -14,6 +14,16 @@ ADD UNIQUE (feature_specification, specifier, ordinality) ;
ALTER TABLE two_cohort_feature_association_test
ADD UNIQUE (selection_criterion_1, selection_criterion_2, test, p_value, feature_tested) ;

ALTER TABLE expression_quantification
ADD range_identifier_integer SERIAL ;

CREATE TABLE range_definitions (
scope_identifier VARCHAR(512),
tablename VARCHAR(512),
lowest_value INT,
highest_value INT
) ;

CREATE EXTENSION IF NOT EXISTS tablefunc;

CREATE TABLE sample_strata (
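For the write path, the new `range_identifier_integer` column and `range_definitions` table suggest that the importer records the lowest and highest serial value assigned to each specimen's rows. A rough sketch of that bookkeeping follows, assuming rows for one specimen are inserted contiguously. It uses `sqlite3` as a stand-in for PostgreSQL, a simplified two-column version of `expression_quantification`, and a hypothetical `import_specimen` function; none of these are taken from the project's importer.

```python
import sqlite3

TABLE = 'expression_quantification'


def import_specimen(connection, specimen, records):
    """Hypothetical import step: insert rows, then record their serial range."""
    cursor = connection.cursor()
    lowest = highest = None
    for record in records:
        cursor.execute(
            'INSERT INTO expression_quantification (target, discrete_value) '
            'VALUES (?, ?)',
            record,
        )
        serial = cursor.lastrowid  # serial id assigned to the row just inserted
        if lowest is None:
            lowest = serial
        highest = serial
    if lowest is not None:
        cursor.execute(
            'INSERT INTO range_definitions VALUES (?, ?, ?, ?)',
            (specimen, TABLE, lowest, highest),
        )
    connection.commit()
    return lowest, highest


# In-memory stand-in schema; AUTOINCREMENT plays the role of SERIAL here.
connection = sqlite3.connect(':memory:')
connection.executescript('''
    CREATE TABLE expression_quantification (
        range_identifier_integer INTEGER PRIMARY KEY AUTOINCREMENT,
        target TEXT,
        discrete_value INTEGER
    );
    CREATE TABLE range_definitions (
        scope_identifier VARCHAR(512),
        tablename VARCHAR(512),
        lowest_value INT,
        highest_value INT
    );
''')

range_a = import_specimen(connection, 'specimen_A', [('CD3', 1), ('CD8', 0), ('CD3', 0)])
range_b = import_specimen(connection, 'specimen_B', [('CD3', 1), ('CD8', 1)])
```

The scheme depends on each specimen's rows receiving a contiguous block of serial values. That holds in this sketch because the inserts run sequentially on a single connection; interleaved writers would break the assumption.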
145 changes: 0 additions & 145 deletions spatialprofilingtoolbox/db/expressions_table_indexer.py

This file was deleted.

31 changes: 0 additions & 31 deletions spatialprofilingtoolbox/db/scripts/index_expressions_table.py

This file was deleted.

84 changes: 0 additions & 84 deletions spatialprofilingtoolbox/db/scripts/modify_constraints.py

This file was deleted.

@@ -8,7 +8,6 @@
from os import getcwd
import sys

from spatialprofilingtoolbox.db.expressions_table_indexer import ExpressionsTableIndexer
from spatialprofilingtoolbox.workflow.common.structure_centroids import StructureCentroids
from spatialprofilingtoolbox.ondemand.defaults import EXPRESSIONS_INDEX_FILENAME
from spatialprofilingtoolbox.workflow.common.cli_arguments import add_argument
@@ -48,7 +47,6 @@ def main():
message = '%s was not found, will do feature matrix pull after all.'
logger.info(message, EXPRESSIONS_INDEX_FILENAME)

# ExpressionsTableIndexer.ensure_indexed_expressions_tables(database_config_file)
puller = SparseMatrixPuller(database_config_file)
puller.pull_and_write_to_files()
