Releases: buerki/SubString

SubString 1.2

03 Jan 13:28

substring-B.sh

  • fixed an issue introduced in version 1.1.2 where the order of consolidation was mixed up for n-grams with n > 9.
  • raised the limit on uncut lists from 12 to 29.
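
The notes do not say what caused the ordering mix-up. Purely as an illustration of how an ordering problem can appear once n reaches two digits in a shell script, the sketch below contrasts lexicographic and numeric sorting. The file names (`2.lst`, `10.lst`) and the function `order_by_n` are hypothetical and are not taken from substring-B.sh.

```shell
#!/usr/bin/env bash
# Illustration only: names such as 2.lst and 10.lst are hypothetical,
# not the naming scheme used by substring-B.sh.

# Order list names by their leading n, smallest first.
# Reads names like "10.lst" on stdin, one per line.
order_by_n() {
    # -t . splits on the dot; -k1,1n compares the first field numerically,
    # so 10 sorts after 9 rather than before 2.
    LC_ALL=C sort -t . -k1,1n
}

# A plain lexicographic sort compares character by character,
# which places "10.lst" before "2.lst".
order_lexicographic() {
    LC_ALL=C sort
}
```

Feeding `2.lst`, `10.lst` and `9.lst` to `order_lexicographic` puts `10.lst` first, while `order_by_n` returns them as 2, 9, 10.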

SubString 1.1.2

13 Oct 08:16

substring.sh

  • enabled the consolidation of n-grams of lengths up to n = 30.

SubString 1.1

12 Jan 16:12

release notes v. 1.1


If you use substring-A.py with the mwetoolkit, the latest release (at time of writing) of the mwetoolkit must be installed.

install.sh & substring-A.py

  • changed name of libraries folder from 'libs' to 'mwetk' in line with changes in the latest version of the mwetoolkit

cutoff.sh

  • added an option that allows processing of files in Google n-gram format

SubString 1.0

02 Sep 21:48

release notes v. 1.0


A new, modular architecture was introduced, splitting SubString into three modules. The main algorithm of SubString up to version 0.9.9.2 was retained as one module. A new module (substring-A.py) was added; it implements a frequency consolidation algorithm that makes use of mwetoolkit's indexing of n-grams. The auxiliary scripts were retained as the third module.

substring.sh

  • adjusted to the modular architecture

TP-filter, cutoff.sh, random_lines.sh, length-adjust.sh

  • changed handling of filename extensions so that extensions are preserved correctly

substring-processor.sh

  • renamed to substring-B.sh

newly added:

  • substring-A.py
  • libs/filetype/ft_ngp.py & ft_nsp.py
  • xml_list_to_NGP.py
  • TUTORIAL.md
  • plaintext_list.xsl

SubString 0.9.9.2

13 Aug 11:41

release notes v. 0.9.9.2


substring-processor.sh

  • raised limit of number of uncut lists to be incorporated in the consolidation to 12.

TP-filter.sh

  • added option to rate on a scale from 1 to 6 rather than a binary T/F distinction.

SubString 0.9.9.1

19 Jun 19:35

release notes v. 0.9.9.1


substring-processor.sh

  • fixed an issue with the progress indicator not advancing during the preparatory stage

SubString 0.9.9

12 Feb 23:03

release notes v. 0.9.9


substring.sh

  • added a menu-based interface
  • moved main processing code to new substring-processor.sh

Added the following auxiliary scripts used by substring.sh

  • consolidate.sh
  • en-filter.sh
  • random_lines.sh
  • TP-filter.sh
  • substring-processor.sh

listconv.sh

  • script was retired as the NGP package now provides suitably formatted input files.

Removed awk commands from all scripts due to compatibility issues with Cygwin in certain configurations

Adjusted the following to reflect the above changes:

  • install.sh
  • README.pdf/README.md
  • test_data

SubString 0.9.7

31 May 08:35

release notes v. 0.9.7


Added double-clickable installers for Cygwin and a generic installer script:

  • Cygwin_installer.lnk
  • Cygwin64_installer.lnk
  • install.sh

README.pdf/README.md:

  • adjusted for changes

SubString 0.9.6

28 May 02:50

release notes v. 0.9.6


Added double-clickable installer for OS X and Linux:

  • OSX_installer.command
  • linux_installer.desktop

README.pdf/README.md:

  • adjusted for changes

SubString 0.9.5

13 Oct 14:14

release notes v. 0.9.5


substring.sh:

  • prep stage processing now makes use of Bash 4's associative arrays, which
    enables the processing of far larger amounts of data (the previous
    algorithm still works if Bash 4 is not detected)
  • fixed a bug introduced in 0.9.4 that caused processing to hang if the -d
    option was invoked
  • removed trailing tabs from output lists inadvertently introduced in 0.9.4
  • a few other efficiency improvements
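
As a rough illustration of the pattern described in the first item above (an associative array when Bash 4 is available, an older method otherwise), the following sketch counts n-gram frequencies. The function `count_ngrams` and its input and output formats are assumptions made for this example; they are not the code used in substring.sh.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not taken from substring.sh.
# Reads one n-gram per line on stdin and prints "count<TAB>n-gram",
# sorted by n-gram.
count_ngrams() {
    if (( BASH_VERSINFO[0] >= 4 )); then
        # Bash 4+: keep a running tally per n-gram in an associative array.
        local -A freq=()
        local line key
        while IFS= read -r line; do
            [[ -n $line ]] || continue          # skip empty lines
            freq["$line"]=$(( ${freq["$line"]:-0} + 1 ))
        done
        for key in "${!freq[@]}"; do
            printf '%s\t%s\n' "${freq[$key]}" "$key"
        done | LC_ALL=C sort -t $'\t' -k2
    else
        # Older Bash: fall back to sort | uniq -c and reformat its output.
        local n rest
        grep -v '^$' | LC_ALL=C sort | uniq -c |
        while read -r n rest; do
            printf '%s\t%s\n' "$n" "$rest"
        done
    fi
}
```

For example, `printf 'a b\nc d\na b\n' | count_ngrams` tallies `a b` twice and `c d` once. The version check on `BASH_VERSINFO` is one way to detect Bash 4; the notes do not state how substring.sh performs its detection.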

README.txt/README.md:

  • adjusted for changes

test_data:

  • updated gold lists to reflect changes to substring.sh