Add a general-purpose autoencoder class AE to serve as a building block for other autoencoder models #932

Merged
merged 14 commits into from
Jun 13, 2024


voidvoxel
Contributor


Description

Oh. My. Freaking. Goodness! There's somewhat of a "story" behind this, but here's your TL;DR:

Motivation and Context

I've been using my own home-rolled autoencoders that utilize brain.js for a while now, so I wanted to share them with the community that made it possible to begin with 💖

How Has This Been Tested?

I added unit tests to test the basic functionality of the class. Further tests should be added in the future, which I'll be glad to contribute to 😊

To run the tests, I rebuilt the project and ran the test script with the following command:

npm rebuild && npm run build && npm test

Screenshots (if appropriate):


Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Author's Checklist:

  • My code focuses on the main motivation and avoids scope creep.
  • My code passes current tests and adds new tests where possible.
  • My code is SOLID and DRY.
  • I have updated the documentation as needed.

Reviewer's Checklist:

  • I kept my comments to the author positive, specific, and productive.
  • I tested the code and didn't find any new problems.
  • I think the motivation is good for the project.
  • I think the code works to satisfy the motivation.

[Wikipedia said "Autoencoder" and not "Auto encoder" or "Auto-encoder"](https://en.wikipedia.org/wiki/Autoencoder)
The other classes in this library use their respective acronyms, with the exceptions of `FeedForward`, `NeuralNetwork`, and `NeuralNetworkGPU`. Furthermore, `AE` variants of existing classes will likely be made, so an acronym equivalent would be desirable. I'm considering the naming conventions for classes such as `LSTMTimeStep`. For example, a future `VAE` class could represent variational autoencoders.
@robertleeplummerjr
Contributor

Nice work! One of the most thorough PRs I've seen for brain.js. Can the autoencoder easily be used with things like strings, as with an LLM?

@voidvoxel
Contributor Author

The original implementation did have string support, but I noticed quite poor quality from the word embedding algorithm I wrote, so I'm reworking it right now. I can either add it in the next PR or I can add it to this one once it's finished. The new embedding algorithm is designed to process new words that were not in the training data much more effectively, which is likely to improve the quality of encodings produced by the string autoencoders by a significant amount 😊

@robertleeplummerjr
Contributor

> I can either add it in the next PR or I can add it to this one once it's finished.

Let's do this incrementally.

> The new embedding algorithm is designed to process new words that were not in the training data much more effectively, which is likely to improve the quality of encodings produced by the string autoencoders by a significant amount 😊

That has me super excited. Thank you for sharing. I'll go ahead and merge this now.

If anyone is watching this, this is how you contribute to a project.

@robertleeplummerjr
Contributor

I see I have some updates I should make prior to merging. I'll work on that next to get you in a state to merge.

@voidvoxel
Contributor Author

> I see I have some updates I should make prior to merging. I'll work on that next to get you in a state to merge.

If you'd like any backup for delegation, feel free to ping me 💕

@robertleeplummerjr
Contributor

If you would like to handle upgrading the packages so that they run on Travis CI, by all means take a stab at it.

@voidvoxel
Contributor Author

> If you would like to handle upgrading the packages so that they run on Travis CI, by all means take a stab at it.

On it friend 🥰

@voidvoxel
Contributor Author

A little weather-related hiccup required my system to be unplugged last night, but we’re back on track today 💞


* Fix "@rollup/plugin-typescript TS2807"
* Choose a more accurate name for `includesAnomalies` (`likelyIncludesAnomalies`, as it makes no guarantees that anomalies are truly present and only provides an intuitive guess)
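
The reasoning behind the rename can be sketched with a reconstruction-error check. This is a minimal illustration, not the PR's implementation: the `reconstruct` callback, the `meanAbsoluteError` helper, the `roundingModel` stand-in, and the default threshold of `0.2` are all assumptions made for the example. A high error only suggests that the input differs from what the model learned to reproduce, which is why the name says "likely".

```typescript
// Hypothetical helper for illustration; not the PR's implementation.
// `reconstruct` stands in for an autoencoder's encode-then-decode round trip.
type Reconstruct = (input: number[]) => number[];

// Average absolute difference between two equally sized vectors.
function meanAbsoluteError(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Inputs must have the same length");
  }
  if (a.length === 0) return 0;
  let total = 0;
  for (let i = 0; i < a.length; i++) {
    total += Math.abs(a[i] - b[i]);
  }
  return total / a.length;
}

// Returns true when the reconstruction error exceeds the threshold.
// This is a heuristic guess, not a guarantee that an anomaly is present.
function likelyIncludesAnomalies(
  input: number[],
  reconstruct: Reconstruct,
  threshold = 0.2 // assumed default, chosen for this sketch
): boolean {
  return meanAbsoluteError(input, reconstruct(input)) > threshold;
}

// Stand-in "model" that can only reproduce values of exactly 0 or 1.
const roundingModel: Reconstruct = (input) => input.map((v) => Math.round(v));

const familiar = likelyIncludesAnomalies([0, 1, 1, 0], roundingModel);
const unusual = likelyIncludesAnomalies([0.5, 0.4, 0.6, 0.5], roundingModel);
```

Here `familiar` is reconstructed exactly, while `unusual` is reconstructed poorly and is therefore flagged.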
@voidvoxel
Contributor Author

voidvoxel commented Jun 8, 2024

Sorry for the delays! Our house flooded in a couple of rooms and a hallway during that storm. Also, unrelated: I now have an unexpected surgery coming up. Wish me luck!

Commit a4257fd should fix the workflow errors, as they were related to my usage of JavaScript private properties (`#property` syntax) instead of TypeScript private properties (`private property` syntax). My apologies!
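
For readers unfamiliar with the distinction, here is a minimal sketch. The class names `WithHashPrivate` and `WithTsPrivate` are hypothetical and not taken from the PR. An ECMAScript `#field` is enforced at runtime, and depending on the configured compile target it is either emitted natively or down-leveled by the toolchain. The TypeScript `private` modifier is checked only at compile time and is emitted as an ordinary property.

```typescript
// Hypothetical classes for illustration only; not taken from the PR.

// ECMAScript private field: privacy is enforced at runtime.
class WithHashPrivate {
  #secret = 42;

  reveal(): number {
    return this.#secret;
  }
}

// TypeScript `private` modifier: checked only by the compiler and
// emitted as an ordinary property.
class WithTsPrivate {
  private secret = 42;

  reveal(): number {
    return this.secret;
  }
}

const hashInstance = new WithHashPrivate();
const tsInstance = new WithTsPrivate();

// The TypeScript-private property is still an own, enumerable key at runtime,
const tsKeys = Object.keys(tsInstance);
// while the ECMAScript private field is invisible to reflection.
const hashKeys = Object.keys(hashInstance);
```

Both classes behave the same from the caller's point of view. The difference lies in what the build toolchain has to emit, which is presumably why switching syntax could resolve a build error.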

@robertleeplummerjr
Contributor

I was off last week and had an injury, sorry for the delay. I wish you the best with your surgery. Looking at this now.

@robertleeplummerjr
Contributor

The inability to build in Travis is unrelated to these fine changes. Let's go ahead and merge this, and the build issues can be handled elsewhere so as to prevent scope creep.

@robertleeplummerjr robertleeplummerjr merged commit c8a62f1 into BrainJS:master Jun 13, 2024
0 of 7 checks passed
@voidvoxel voidvoxel deleted the feature/AE branch June 15, 2024 09:41
 * An autoencoder learns to compress input data down to relevant features and reconstruct input data from its compressed representation.
 */
export class AE<DecodedData extends INeuralNetworkData, EncodedData extends INeuralNetworkData> {
  private decoder?: NeuralNetworkGPU<EncodedData, DecodedData>;


Can we make AE support non-GPU neural networks? Or is there a specific reason why we are tying this to GPU NNs only?

Contributor Author


I'm not sure if it ever made it into mainstream or not, but I already wrote both CPU and GPU implementations of the autoencoder class (`Autoencoder` and `AutoencoderGPU`, IIRC). If I forgot to create a PR, I'll make one soon 😊

Sorry for the lack of updates for a while. Health got in the way of work, but I've mostly recovered and recently started working full time again, so I'll try to make more progress with the autoencoder and loss function features 💖


I made a PR that adds serialization (toJSON & fromJSON) to AE - #950

@seunafara

As of today, AE runs only on NeuralNetworkGPU. Do we want to pass an argument to the AE class telling it to use NeuralNetworkGPU or NeuralNetwork (CPU)?
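
One way such an argument could look is sketched below. Everything here is hypothetical: `Network`, `CpuNetwork`, `GpuNetwork`, `AutoencoderOptions`, `AutoencoderSketch`, and the `backend` option are stand-ins invented for this example and are not brain.js APIs. The two network classes simply echo their input so the sketch stays self-contained.

```typescript
// Hypothetical sketch: none of these names are brain.js APIs.
interface Network {
  run(input: number[]): number[];
}

type Backend = "cpu" | "gpu";

// Stand-ins for NeuralNetwork (CPU) and NeuralNetworkGPU.
class CpuNetwork implements Network {
  readonly backend: Backend = "cpu";
  run(input: number[]): number[] {
    return input.slice();
  }
}

class GpuNetwork implements Network {
  readonly backend: Backend = "gpu";
  run(input: number[]): number[] {
    return input.slice();
  }
}

interface AutoencoderOptions {
  backend?: Backend; // assumed option name
}

class AutoencoderSketch {
  readonly network: CpuNetwork | GpuNetwork;

  constructor(options: AutoencoderOptions = {}) {
    // Default to the GPU stand-in so existing callers keep the current behavior.
    this.network =
      options.backend === "cpu" ? new CpuNetwork() : new GpuNetwork();
  }

  run(input: number[]): number[] {
    return this.network.run(input);
  }
}

const defaultAe = new AutoencoderSketch();
const cpuAe = new AutoencoderSketch({ backend: "cpu" });
```

Keeping GPU as the default would leave existing callers unaffected, and having both backends behind one option would also make benchmark comparisons straightforward.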

@robert-lore

That would be very interesting, if only for comparing benchmarks.


4 participants