
Tensorflow implementation of entmax1.5 #2

Open
justheuristic opened this issue Aug 6, 2019 · 10 comments


justheuristic commented Aug 6, 2019

Here's a TensorFlow implementation of the entmax $\alpha=1.5$ mapping and loss, in case someone's interested.

https://gist.github.com/justheuristic/60167e77a95221586be315ae527c3cbd

It should work on TensorFlow >= 1.8, and it matches both the outputs and the gradients of the official PyTorch implementation.

Thanks to @lena-voita for the assistance.
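For readers who don't want to open the gist: the core of the $\alpha=1.5$ mapping can be sketched in a few lines. The snippet below is a NumPy sketch of the exact sort-based threshold algorithm for 1.5-entmax (the same recipe the official PyTorch implementation uses); it is not the linked TF gist itself, just an illustration of what the mapping computes.

```python
import numpy as np

def entmax15(scores):
    """Exact 1.5-entmax for a 1-D score vector via the sort-based
    threshold algorithm: p_i = max(z_i / 2 - tau, 0) ** 2, with tau
    chosen so that the output sums to 1."""
    z = np.asarray(scores, dtype=np.float64)
    z = (z - z.max()) / 2.0                  # shift for stability; /2 comes from alpha = 1.5
    z_sorted = np.sort(z)[::-1]              # descending order
    k = np.arange(1, z.size + 1, dtype=np.float64)
    mean = np.cumsum(z_sorted) / k
    mean_sq = np.cumsum(z_sorted ** 2) / k
    ss = k * (mean_sq - mean ** 2)
    delta = np.clip((1.0 - ss) / k, 0.0, None)
    tau = mean - np.sqrt(delta)              # candidate thresholds for each support size
    support = np.sum(tau <= z_sorted)        # largest valid support size
    tau_star = tau[support - 1]
    return np.clip(z - tau_star, 0.0, None) ** 2

p = entmax15([10.0, 0.0, 0.0])  # peaked scores -> exactly sparse output [1., 0., 0.]
```

Unlike softmax, the output assigns exactly zero probability to sufficiently low-scoring entries, which is the point of the mapping.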


vene commented Aug 13, 2019

This looks great, thanks @justheuristic @lena-voita!

Will it stay a gist, or are you considering making it a repo with some tests (or contributing it to ours)? I'd be happy to add a link either way!


Ghostvv commented Aug 14, 2019

Great, thank you!
I need it for a sequence loss, so my tensors are 3-D. It seems to me that `tf.einsum("ij,ij->i", p_incr, logits)` is equivalent to `tf.reduce_sum(p_incr * logits, -1)`.


justheuristic commented Aug 27, 2019

That is correct. Alternatively, you can reshape the inputs and targets before feeding them into the loss. I'm no TF expert, but I believe that reshape + einsum should be somewhat faster. Sorry for taking so long to reply.

By default it stays a gist, but if you were to use it in something more substantial, we'd both be grateful.
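The "reshape before the loss" route mentioned above can be sketched as follows. This is a NumPy illustration (again standing in for the TF ops): flatten `[B, T, V]` to `[B*T, V]`, apply the 2-D einsum the gist's loss uses, then restore the `[B, T]` shape; the result matches the direct 3-D reduction.

```python
import numpy as np

rng = np.random.default_rng(1)
B, T, V = 3, 4, 6                    # batch, time, vocab (illustrative sizes)
p_incr = rng.random((B, T, V))
logits = rng.random((B, T, V))

# Flatten batch and time into one axis, run the 2-D einsum, reshape back.
flat = np.einsum("ij,ij->i",
                 p_incr.reshape(-1, V),
                 logits.reshape(-1, V)).reshape(B, T)

# Direct 3-D computation for comparison.
direct = (p_incr * logits).sum(axis=-1)

assert np.allclose(flat, direct)
```

Either form works; the reshape variant lets a strictly 2-D loss implementation be reused unchanged for sequences.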


Ghostvv commented Aug 27, 2019

I'd love to use it in Rasa, but I'd prefer to import it as a library rather than copying it into our codebase. @justheuristic, would you be up for creating a PR here?

justheuristic (Author) commented
Definitely, but I'm unsure whether @vene would be willing to accept it without the top-k version and the approximate bisection version for $\alpha \neq 1.5$.

I'll get to it if Vlad confirms he won't mind a TF port, but I'd also be happy if you just built it into Rasa and forgot I ever existed (the Unlicense allows that). Btw, Rasa looks gorgeous :)

vene commented Aug 27, 2019

Any partial implementation is IMO welcome and better than none at all! The question is more whether it's better to put it in this same package. Advantages: easier discoverability, and we could unit-test the implementations against each other. Disadvantages: I'm not sure how to structure such a project, and none of the current authors of the entmax package can maintain TensorFlow code...


justheuristic commented Aug 27, 2019

Thank you, Vlad

As for structuring, this package demonstrates a simple, albeit somewhat clumsy, way to organize such code.

I'll try to concoct a PR within 48 hours, and we'll see if it works for the authors. I promise to break as few things as I can :)

_Update, October 2nd: I am terribly sorry, this was not 48 hours._


Ghostvv commented Aug 28, 2019

@tmbo, maybe you have a recommendation on how to structure such a project?


tmbo commented Aug 28, 2019

I took a quick look at the package @justheuristic referenced: I'd go for the same structure, i.e. separate submodules for the different backends.

As far as I can see, this package isn't published on PyPI yet either.
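The per-backend layout suggested above might look something like the sketch below. All names here are hypothetical, chosen only to illustrate the idea of one submodule per backend plus shared cross-backend tests; the thread does not fix a concrete layout.

```text
entmax/
    __init__.py
    pt/          # existing PyTorch implementation
    tf/          # proposed TensorFlow port
    tests/       # cross-backend tests comparing outputs and gradients
```

This keeps each backend optional at import time (a user with only TensorFlow installed never imports the PyTorch submodule), while the shared tests exercise the property already claimed in this thread: that the two implementations match in outputs and gradients.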

justheuristic (Author) commented
I'm sorry, I messed up. The chances of me actually finishing this PR are slim; please take it over if you're interested in the results.
