Merge ranked and shaped (as in ADReady) into tensor indexed by [Maybe Nat] #105

Open
Mikolaj opened this issue Jun 13, 2023 · 13 comments
Labels: help wanted (Extra attention is needed)

@Mikolaj (Owner) commented Jun 13, 2023

This is @awf's idea that would improve the end user API. The Nothing would signify that the shape may differ at runtime. Cognitive load would be reduced (no more ranked and shaped in signatures) and new expressiveness would be gained (tensors that are ranked on some dimensions but shaped on others).

For this to be possible we need an extension of the GHC.TypeLits.Normalise family of plugins, and somebody to maintain it. [Edit: an extension of orthotope would also be needed.] It would be great to check whether any other Haskell libraries already require such functionality. Or we could move to Agda.

We'd also need good examples where the ability to vary some shapes at runtime is crucial, because in all current examples the shapes are known and fixed once they are determined at the start of runtime (by looking at the input data). It's not even clear that such varying shapes would work, e.g., with our current optimizer implementations.

An example: a signature in our current formalism.

type LayerWeightsRNNShaped shaped in_width out_width r =
  ( shaped r '[out_width, in_width]   -- input weight
  , shaped r '[out_width, out_width]  -- state weight
  , shaped r '[out_width] )           -- bias

rnnMnistTwoS
  :: forall ranked shaped r out_width batch_size sizeMnistHeight.
     ADReady ranked shaped r
  => shaped r '[2 * out_width, batch_size]
  -> PrimalOfS shaped r '[sizeMnistHeight, batch_size]
  -> (LayerWeightsRNNShaped shaped sizeMnistHeight out_width r
     ,LayerWeightsRNNShaped shaped out_width out_width r)
  -> (shaped r '[out_width, batch_size]
     ,shaped r '[2 * out_width, batch_size])

A new signature of a different program. Let's assume the program requires some shapes to vary.

type data Size = Dynamic | N Nat

type LayerWeightsRNNShaped tensor in_width out_width r =
  ( tensor r '[out_width, in_width]   -- input weight
  , tensor r '[Dynamic, out_width]    -- state weight
  , tensor r '[out_width] )           -- bias

rnnMnistTwoS
  :: forall tensor r out_width batch_size sizeMnistHeight.
     (ADReady tensor r, N 0 + batch_size ~ batch_size)
  => tensor r '[Dynamic, batch_size]
  -> PrimalOfS tensor r '[sizeMnistHeight, batch_size]
  -> (LayerWeightsRNNShaped tensor sizeMnistHeight out_width r
     ,LayerWeightsRNNShaped tensor out_width out_width r)
  -> (tensor r '[Dynamic :: Size, batch_size :: Size]
     ,tensor r '[N 2 * out_width, batch_size])
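To make the [Maybe Nat] idea concrete at the value level, here is a minimal sketch (not horde-ad code; ShapeX and its constructors are hypothetical names) of what a runtime shape indexed by [Maybe Nat] could look like: statically known dimensions carry only type information, dynamic ones carry an Int.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE TypeOperators #-}
import Data.Kind (Type)
import Data.Proxy (Proxy (..))
import GHC.TypeLits (KnownNat, Nat, natVal)

-- A runtime shape indexed by [Maybe Nat]: a Just n dimension carries
-- only its KnownNat dictionary; a Nothing dimension stores the size.
data ShapeX :: [Maybe Nat] -> Type where
  ZSX   :: ShapeX '[]
  (:$@) :: KnownNat n => Proxy n -> ShapeX sh -> ShapeX (Just n ': sh)
  (:$?) :: Int -> ShapeX sh -> ShapeX (Nothing ': sh)
infixr 5 :$@
infixr 5 :$?

-- Recover the full shape as a plain list of Ints.
shapeToList :: ShapeX sh -> [Int]
shapeToList ZSX        = []
shapeToList (p :$@ sh) = fromInteger (natVal p) : shapeToList sh
shapeToList (n :$? sh) = n : shapeToList sh

-- Shape [Just 3, Nothing], with the dynamic dimension being 7 at runtime.
exampleShape :: ShapeX '[Just 3, Nothing]
exampleShape = Proxy @3 :$@ (7 :$? ZSX)
```

Here shapeToList exampleShape yields the runtime shape [3, 7], with the 3 coming from the type and the 7 from the value.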
@Mikolaj added the help wanted label on Jun 13, 2023
@awf (Collaborator) commented Jun 13, 2023

Great write up!

@Mikolaj (Owner, Author) commented Feb 24, 2024

Related: I'm now making heavy use of yet another type: a heterogeneous vector of tensors of different scalar types and of different ranked/shaped forms. The vector is completely untyped currently, but it could be indexed with a list of the individual tensor types.

So, using the idea from this ticket, a heterogeneous vector would be indexed by an expression of the form [([Maybe Nat], Type)], where the Type is the type of the scalar. I'm afraid this is completely unwieldy type information, but maybe there is a way to present it better or to simplify it somehow. It's already better (and more expressive) than what the original formulation would give: [(Either [Nat] Nat, Type)].
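A minimal sketch of how such an indexed heterogeneous vector could be typed (all names here are hypothetical, and a shape-erased stand-in replaces a real unboxed tensor):

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeOperators #-}
import Data.Kind (Type)
import GHC.TypeLits (Nat)

-- Stand-in for a concrete tensor indexed by a mixed shape and a scalar
-- type; a real implementation would wrap an unboxed array instead.
data MixedTensor (sh :: [Maybe Nat]) (r :: Type) = MixedTensor [Int] [r]

-- A heterogeneous vector of tensors, indexed by a list of
-- (shape, scalar type) pairs, i.e. by [([Maybe Nat], Type)].
data HVector :: [([Maybe Nat], Type)] -> Type where
  HNil  :: HVector '[]
  HCons :: MixedTensor sh r -> HVector ts -> HVector ('(sh, r) ': ts)

hlength :: HVector ts -> Int
hlength HNil         = 0
hlength (HCons _ ts) = 1 + hlength ts

-- A Double matrix with one dynamic dimension, then a static Float vector.
example :: HVector '[ '( '[Just 2, Nothing], Double)
                    , '( '[Just 3], Float) ]
example = HCons (MixedTensor [2, 4] (replicate 8 0))
                (HCons (MixedTensor [3] [1, 2, 3]) HNil)
```

The type of example shows the unwieldiness directly: every tensor contributes a (shape, scalar) pair to the index.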

@Mikolaj (Owner, Author) commented Mar 11, 2024

An example of how [Maybe Nat] type indexes emerge naturally once we permit notation for tensors nested within tensors. Consider the type signature

u :: (RankedTensor ranked, ShapedTensor shaped)
  => shaped '[23, 34] (shaped '[22] Float, ranked 1 Double)

or an even simpler case, where the product is trivial (but the nesting is not):

t :: (RankedTensor ranked, ShapedTensor shaped)
  => shaped '[23, 34] (ranked 1 Double)

This looks fine, but we'd like to represent this using unboxed tensors exclusively and, in particular, to have the instances of the shaped and ranked tensor type classes be unboxed concrete arrays (e.g., from package orthotope, where the type of ranked arrays is OR.Array and of shaped arrays is OS.Array). Suppose the type of the instantiated t were represented as

t2 :: OR.Array 3 Double

Then the instance is not faithful, because we can't do

ssum (t2 :: OR.Array 3 Double)

while we can do

(ssum t) :: shaped '[34] (ranked 1 Double)

Obviously, the type of the instantiated t can't be

t3 :: OS.Array '[23, 34, ???] Double

because we can't fix statically what ??? is going to be. So we are probably going to need

t4 :: OX.Array '[Just 23, Just 34, Nothing] Double

Or maybe, to the contrary, the concrete array instance should project all tensors (or all ranked and product tensors) to ranked arrays and we should ban all shaped operations on product/nested tensors?

As a side note, while I can apply ssum to an argument

t :: (RankedTensor ranked, ShapedTensor shaped)
  => shaped '[23, 34] (ranked 1 Double)

I can't apply sgather to it:

  sgather
    :: forall r sh2 p sh.
       ( GoodScalar r, Sh.Shape sh2, Sh.Shape sh, Sh.Shape (Sh.Take p sh)
       , Sh.Shape (Sh.Drop p sh), Sh.Shape (sh2 Sh.++ Sh.Drop p sh) )
    => shaped r sh
    -> (IndexSh shaped sh2 -> IndexSh shaped (Sh.Take p sh))
    -> shaped r (sh2 Sh.++ Sh.Drop p sh)

because sgather (unlike rgather) depends on the full shape being statically declared. So adding the [Maybe Nat] type indexes is just the beginning. We then need to define a new set of tensor operations with the new typing.

In the best scenario, for each current pair of a ranked and a shaped operation we'd get just one mixed operation. In the worst case, we'd get three or more. One would need to try. Another planning-phase consideration is whether orthotope can be extended to the mixed type indexes without problems (e.g., without losing some of its ability to restructure tensors in constant time by manipulating only the metadata). Or can we somehow get away without extending orthotope at all?
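To illustrate the best-case scenario, a single mixed operation subsuming rsum and ssum might look along these lines (the class MixedTensor, the operation xsum and the toy instance are all hypothetical, not an existing API):

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeOperators #-}
import Data.Kind (Type)
import GHC.TypeLits (Nat)

-- Hypothetical class of tensors indexed by mixed [Maybe Nat] shapes.
class MixedTensor (tensor :: Type -> [Maybe Nat] -> Type) where
  -- One operation replacing the rsum/ssum pair: sum over the outermost
  -- dimension, whether statically known (Just n) or dynamic (Nothing).
  xsum :: Num r => tensor r (mn ': sh) -> tensor r sh

-- A toy, shape-erased instance, just to show that the same xsum serves
-- both a static and a dynamic outermost dimension.
newtype ListOuter r (sh :: [Maybe Nat]) = ListOuter [r]

instance MixedTensor ListOuter where
  xsum (ListOuter xs) = ListOuter [sum xs]  -- a scalar is a singleton list here

unscalar :: ListOuter r '[] -> r
unscalar (ListOuter [x]) = x
unscalar _ = error "not a scalar"

staticSum :: ListOuter Double '[Just 3] -> Double
staticSum = unscalar . xsum

dynamicSum :: ListOuter Double '[Nothing] -> Double
dynamicSum = unscalar . xsum
```

The point is only the signature: one xsum covers both shape indexes where today two class methods are needed.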

@tomsmeding (Collaborator) commented:

Just a thought:

type MapJust :: [k] -> [Maybe k]
type family MapJust l where
  MapJust '[] = '[]
  MapJust (x ': xs) = Just x ': MapJust xs

type SArr :: [Nat] -> Type -> Type
data SArr l t  -- no constructors, this is just a tag type

type RArr :: Nat -> Type -> Type
data RArr n t  -- idem

-- AstRanked and AstShaped are indexed by the tag types to indicate what kind of array they are describing
data AstRanked span t where
  AstSum :: AstRanked span (RArr (1 + n) t) -> AstRanked span (RArr n t)
  -- etc.

type ShapeVec :: Type -> [Maybe Nat]
type family ShapeVec t where
  ShapeVec (SArr l t) = MapJust l ++ ShapeVec t
  ShapeVec (RArr n t) = Replicate n Nothing ++ ShapeVec t
  ShapeVec _ = '[]

type Core :: Type -> Type
type family Core t where
  Core (SArr l t) = Core t
  Core (RArr n t) = Core t
  Core t = t

type Rep t = OX.Array (ShapeVec t) (Core t)
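For completeness, the snippet above assumes type-level ++ and Replicate families, which could be defined along these lines (a sketch; any real library's definitions may differ):

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE UndecidableInstances #-}
import GHC.TypeLits (Nat, type (-))

-- Type-level list append.
type family (xs :: [k]) ++ (ys :: [k]) :: [k] where
  '[]       ++ ys = ys
  (x ': xs) ++ ys = x ': (xs ++ ys)

-- n copies of x as a type-level list.
type family Replicate (n :: Nat) (x :: k) :: [k] where
  Replicate 0 x = '[]
  Replicate n x = x ': Replicate (n - 1) x
```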

@tomsmeding (Collaborator) commented:

The above is a bad idea that destroys all the compositionality that we have; don't use it. Though I think there is something here; need to think more before typing.

@Mikolaj (Owner, Author) commented Mar 26, 2024

Tom's idea, which may be easy to implement: replace the ??? above with Nat type variables (or expressions in them). Since we don't have filter, the Nat expressions should be statically known, so this should not require packed existentials.

Mikolaj's TODO: try to write a filter odd function in horde-ad using fromList and see where it breaks (and keep the failed attempts as tests). It should break at not being able to inspect tensor elements, given that the code has to be interpretable in the symbolic layer (maybe ifF would not work, somehow).

@Mikolaj (Owner, Author) commented Apr 7, 2024

Mikolaj's TODO: try to write a filter odd function in horde-ad using fromList and see where it breaks

It breaks, as it should, already when type-checking:

filterPositiveFail :: ADReady ranked
                   => ranked Double 1 -> ranked Double 1
filterPositiveFail v =
  let l = runravelToList v
      -- l2 = filter (\x -> x >= 0) l
      --      Could not deduce ‘Ord (ranked Double 0)’
      -- l2 = filter (\x -> x >=. 0) l
      --      Could not deduce ‘BoolOf ranked ~ Bool’
      l2 = take 3 l  -- the most I can do
  in rfromList l2

However, from this experiment I got the feeling that it would be easy to add hardwired operations like filter to our tensor class, and I think we are likely to add them in the future unless there's a very compelling reason not to (on top of them not being typable with shaped tensors, only with ranked ones).
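The underlying typing obstacle can be reproduced in plain Haskell with length-indexed vectors: the length of filter's result depends on runtime values, so it cannot appear in the type and must be hidden behind an existential (Vec, SomeVec and vfilter are illustrative names, not horde-ad API):

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeOperators #-}
import Data.Kind (Type)
import GHC.TypeLits (Nat, type (+))

-- Length-indexed vectors.
data Vec (n :: Nat) (a :: Type) where
  VNil  :: Vec 0 a
  VCons :: a -> Vec n a -> Vec (n + 1) a

-- The result length of filter is data-dependent, so it is hidden.
data SomeVec a where
  SomeVec :: Vec n a -> SomeVec a

vfilter :: (a -> Bool) -> Vec n a -> SomeVec a
vfilter _ VNil = SomeVec VNil
vfilter p (VCons x xs) =
  case vfilter p xs of
    SomeVec ys | p x       -> SomeVec (VCons x ys)
               | otherwise -> SomeVec ys

someLength :: SomeVec a -> Int
someLength (SomeVec v) = go v
  where
    go :: Vec m b -> Int
    go VNil         = 0
    go (VCons _ xs) = 1 + go xs
```

A shaped tensor plays the role of Vec here; the existential wrapper is exactly what a hardwired filter on shaped tensors would force on us, which is why it types only at the ranked (or dynamic-dimension) level.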

@tomsmeding (Collaborator) commented Apr 7, 2024

If you can, that's great! That would mean that we do not actually need the restriction that all tensor shapes are statically known. It would change a lot of typing: for example, rbuild currently takes a ShapeInt which is just a sized list of Ints. If tensor shapes do not need to be statically known, we could loosen this to a sized list of ASTs (i.e. IndexOf)!

EDIT: the Could not deduce ‘BoolOf ranked ~ Bool’ error you got from filter (\x -> x >=. 0) l is precisely staging rearing its head: you're trying (through filter) to branch at "compile time" of the staged program on values known only at runtime of the staged program.

@tomsmeding (Collaborator) commented:

How does build1 k (\i -> filter f a) vectorise, for some function f and some array-typed expression a?

@Mikolaj (Owner, Author) commented Apr 7, 2024

Ouch, you are right. We could add the operations easily, but then our machinery breaks at once. The first thing to break would be

shapeAst :: forall n s r. (KnownNat n, GoodScalar r)
         => AstRanked s r n -> ShapeInt n

and everything depends on it. Vectorization would get stuck as well.

@tonyday567 commented:

The dimensions library implements an idea similar to this:

https://hackage.haskell.org/package/dimensions-2.1.1.0/docs/Numeric-Dimensions-Dim.html#t:XNat

@Mikolaj (Owner, Author) commented Sep 13, 2024

Oh wow, thank you! The whole https://github.com/achirkin/easytensor repo is interesting. I wonder how much it overlaps with @tomsmeding's library, which is not yet on Hackage.

@tomsmeding (Collaborator) commented Sep 13, 2024

It seems to serve a different goal; the partially static shapes definitely overlap, but the major reason why ox-arrays exists is because of array nesting. Easytensor seems to have more focus on particular operations and particular dimensionalities (there are explicit operations on matrices) and much less on general multi-dimensional tensor operations (reshape, replicate, partial reductions / indexing, etc.). I'm not sure what the idea of a data frame is besides a collection of arrays of the same size (I may be missing something there). The underlying representation of an array in easytensor seems to be orthotope-ish, except that the innermost stride must be 1. I'm not sure to what extent it allows "weird" stride vectors.
