Releases: buerki/SubString
SubString 1.2
SubString 1.1.2
substring.sh
- enabled the consolidation of n-grams of lengths up to n = 30.
SubString 1.1
release notes v. 1.1
If you use substring-A.py with the mwetoolkit, the latest release (at time of writing) of the mwetoolkit must be installed.
install.sh & substring-A.py
- changed name of libraries folder from 'libs' to 'mwetk' in line with changes in the latest version of the mwetoolkit
cutoff.sh
- added option that allows processing of file in Google n-gram format
SubString 1.0
release notes v. 1.0
A new, modular architecture was introduced, splitting SubString into three modules. The main algorithm of SubString up to version 0.9.9.2 was retained as one of the modules, and a new module (substring-A.py) was added that implements a frequency consolidation algorithm using mwetoolkit's indexing of n-grams. The auxiliary scripts were retained as the third module.
substring.sh
- adjusted to the modular architecture
TP-filter.sh, cutoff.sh, random_lines.sh, length-adjust.sh
- changed handling of filename extensions so that extensions are preserved correctly
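The idea of preserving a filename extension when deriving an output name can be sketched with Bash parameter expansion. This is only an illustration: the function name `tagged_name` and the naming scheme (a tag inserted before the extension) are assumptions, not the scheme the SubString scripts actually use.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of SubString): derive an output filename by
# inserting a tag before the extension, so the original extension survives.

tagged_name() {
    local path=$1 tag=$2
    local dir='' base=$path
    # Split off the directory part so dots in directory names are ignored
    if [[ $path == */* ]]; then
        dir=${path%/*}/
        base=${path##*/}
    fi
    # Treat the name as having an extension only if a dot follows
    # at least one other character (so ".hidden" has no extension)
    if [[ $base == ?*.* ]]; then
        printf '%s%s.%s.%s\n' "$dir" "${base%.*}" "$tag" "${base##*.}"
    else
        printf '%s%s.%s\n' "$dir" "$base" "$tag"
    fi
}
```

Under these assumptions, `tagged_name data/4grams.lst cut` yields `data/4grams.cut.lst`, while a name without an extension simply gets the tag appended.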
substring-processor.sh
- renamed to substring-B.sh
newly added:
- substring-A.py
- libs/filetype/ft_ngp.py & ft_nsp.py
- xml_list_to_NGP.py
- TUTORIAL.md
- plaintext_list.xsl
SubString 0.9.9.2
release notes v. 0.9.9.2
substring-processor.sh
- raised limit of number of uncut lists to be incorporated in the consolidation to 12.
TP-filter.sh
- added option to rate on a scale from 1 to 6 rather than a binary T/F distinction.
SubString 0.9.9.1
release notes v. 0.9.9.1
substring-processor.sh
- fixed an issue with the progress indicator not advancing during the preparatory stage
SubString 0.9.9
release notes v. 0.9.9
substring.sh
- added a menu-based interface
- moved main processing code to new substring-processor.sh
Added the following auxiliary scripts used by substring.sh:
- consolidate.sh
- en-filter.sh
- random_lines.sh
- TP-filter.sh
- substring-processor.sh
listconv.sh
- script was retired as the NGP package now provides suitably formatted input files.
Removed awk commands from all scripts due to compatibility issues with Cygwin in certain configurations
Adjusted the following to reflect the above changes:
- install.sh
- README.pdf/README.md
- test_data
SubString 0.9.7
release notes v. 0.9.7
Added double-clickable installers for Cygwin and a generic installer script:
- Cygwin_installer.lnk
- Cygwin64_installer.lnk
- install.sh
README.pdf/README.md:
- adjusted for changes
SubString 0.9.6
release notes v. 0.9.6
Added double-clickable installers for OS X and Linux:
- OSX_installer.command
- linux_installer.desktop
README.pdf/README.md:
- adjusted for changes
SubString 0.9.5
release notes v. 0.9.5
substring.sh:
- prep stage processing now makes use of Bash 4's associative arrays, which enables the processing of far larger amounts of data (the previous algorithm still works if Bash 4 is not detected)
- fixed bug introduced in 0.9.4 which resulted in hung processing if the -d option was invoked
- removed trailing tabs from output lists inadvertently introduced in 0.9.4
- a few other efficiency improvements
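As a rough illustration of the associative-array approach with a fallback for older shells, the sketch below sums the frequencies of identical n-grams. It is not the code in substring.sh: the function name `sum_frequencies` and the input format (one `n-gram<TAB>frequency` pair per line) are assumptions made for this example.

```shell
#!/usr/bin/env bash
# Hypothetical sketch (not taken from substring.sh): sum the frequencies of
# identical n-grams read from stdin as "n-gram<TAB>frequency" lines.

sum_frequencies() {
    local ngram freq
    if (( BASH_VERSINFO[0] >= 4 )); then
        # Bash 4+: single pass, totals kept in an associative array
        local -A totals=()
        local current
        while IFS=$'\t' read -r ngram freq; do
            [ -n "$ngram" ] || continue
            current=${totals[$ngram]:-0}
            totals[$ngram]=$(( current + freq ))
        done
        for ngram in "${!totals[@]}"; do
            printf '%s\t%s\n' "$ngram" "${totals[$ngram]}"
        done | LC_ALL=C sort
    else
        # Older Bash: sort first, then add up runs of identical n-grams
        local prev='' sum=0
        while IFS=$'\t' read -r ngram freq; do
            [ -n "$ngram" ] || continue
            if [ "$ngram" = "$prev" ]; then
                sum=$(( sum + freq ))
            else
                if [ -n "$prev" ]; then
                    printf '%s\t%s\n' "$prev" "$sum"
                fi
                prev=$ngram
                sum=$freq
            fi
        done < <(LC_ALL=C sort)
        if [ -n "$prev" ]; then
            printf '%s\t%s\n' "$prev" "$sum"
        fi
    fi
}
```

A call such as `sum_frequencies < list.tsv` (with a hypothetical `list.tsv`) prints one line per distinct n-gram. The associative-array branch avoids sorting the full input before counting, which is one plausible reason such an approach scales to larger lists.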
README.txt/README.md:
- adjusted for changes
test_data:
- updated gold lists to reflect changes to substring.sh