Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

all lazy structures are not stack safe #212

Closed
safareli opened this issue Dec 15, 2016 · 16 comments
Closed

all lazy structures are not stack safe #212

safareli opened this issue Dec 15, 2016 · 16 comments

Comments

@safareli
Copy link
Member

safareli commented Dec 15, 2016

All lazy statures (IO, Future, State) have stack issue. general way of implementing operations on such structures is:

function Structure(f) {
  if (!(this instanceof Structure)) {
    return new Structure(f);
  }
  this.run = f
}

Structure.prototype.method = function (a){
  return new Structure(function(){
    // here we call `run` from a or something like that 
  })
}

All of lazy structures need extra stack frame on each map/ap/chain/..., as they are creating new functions which have use old computation or passed in function, requiring larger stack size, leading to stack overflow (when you run/execute/lower them)

  1. apply curried function f which is taking 10000 arguments:
...(T.of(9999).ap(T.of(10000).ap(T.of(f)))
  1. map on some structure 10000 times
T.of(1).map(id).map(id).map(id).map(id).map(id).....

btw there is no such issue with native Promise:

Array(40000)
  .fill(idx => a=>a + idx)
  .reduce(
    (p,f,idx) => p.then(f(idx)), 
    new Promise((res,rej) => { setTimeout(res, 1000, 1)})
  ).then(console.log.bind(console))// 799980001

General way to fix this would be rewrite such structures so that they builds up computation as data on map/ap/chain/...(like Free monad/applicative...) and executing run interpret it in a stack safe way. For example this way we can make function composition safe. Also related paper "Stack safety for free" by @paf31.

what others think on this issue?

@SimonRichardson
Copy link
Member

This is why tail call optimisations are important, but last time I checked they had dropped it from the latest proposal.

@safareli
Copy link
Member Author

"effected" libs

@safareli
Copy link
Member Author

safareli commented Dec 15, 2016

@SimonRichardson TCO is enabled in safari10 by default and in node 6+ using harmony flag.

for example this snippet in TCO enabled environments runs without any issues:

function countTo(n, acc) {
  'use strict';
  if(n === 0) {
    return acc;
  }
  return countTo(n - 1, acc + n);
}
countTo(10000, 0) // 50005000

// but this fails 
Array(100000).fill().map((_, idx) => x => x + idx).reduce((f, g) => x => f(g(x)))(1) // stack 💥

So how could TCO help in making for example IO stack safe?

function IO(f) {
  if (!(this instanceof IO)) {
    return new IO(f);
  }
  this.run = f
}

IO.of = (x) => IO(() => x)
IO.prototype.map = function(f) {
  return IO(() => f(this.run()));
};

Array(40000).fill(idx => a=>a + idx).reduce((io,f) => io.map(f), IO.of(1))// stack 💥

@SimonRichardson
Copy link
Member

So how could TCO help in making for example IO stack safe?

That's the right question 🤔

@SimonRichardson
Copy link
Member

Just off the top of my head; if you went with something like freaky then you could make a trampolined interpreter. @DrBoolean might have some thoughts on this. The issue is knowing when a good time to recurse and therefore to avoid the stack 💣

@DrBoolean
Copy link

I'm currently studying recursion schemes with TCO and generators. I'll report back...
FWIW @safareli's knowledge of stack safety blows mine away :)

@SimonRichardson
Copy link
Member

Also you should check out what scalaz/cats do, they've have the same issues at some point. I remember @puffnfresh doing a commit on a new freeT and something was mentioned then. It was some time ago, but that's what I recollect.

@safareli
Copy link
Member Author

As I understand in cats:

data State a = StateT Trampoline a
data Trampoline a =  Free Function0 a
data Function0 a = () => a

So Free monads could help here, but in case of applicative structures we can't do much, for now (would investigate this as well)

@robotlolita
Copy link
Contributor

@SimonRichardson PTC is still in the spec, and VM writers are supposed to implement it. There's a proposal to require a token for calls that should be a tail call, but that hasn't left draft stage yet, and as I understand part of TC39 objects to that (in particular because you don't really have PTC at that point) — either way, we'd get some sort of tail calls though.

Tail calls solve the same problem trampolining does, they're just cheaper (you can translate them to a simple jump with the appropriate registers / argument stack updated), you just need to write your code such that all relevant calls would be in tail position (not the case with the Io example above — f(this.run()) doesn't have this.run() in tail position —, but Data.Task's current implementation pretty much translates things into CPS, as does fantasy-promises in this organisation, since you need to for asynchronous code anyway).

So:

var Io = (fork) => ({
  run: fork,
  of: (value) => Io(f => f(value)),
  chain: (f) => Io(g => fork(x => f(x).run(g))),
  map: (f) => Io(g => fork(x => g(f(x))))
});

// This:
io(f =>  f(1)).map(x => x + 1).map(x => x * 2).run(x => x);

// Translates into:
(h =>
  (g =>
    (f => f(1))
      (x => g(x + 1)) // first map
  )(y => h(y * 2)) // second map
)(x => x)

Because all of these calls are in tail position, the VM doesn't need to return a value to once it finishes evaluating the function, so it can translate it to something like:

JUMP a

a: -- f(1)
PUSH 1
JUMP b

b: -- g(x + 1)
PUSH 1
ADD
JUMP c

c: -- h(y * 2)
PUSH 2
MULTIPLY
JUMP d

d:  -- x
RET

Doing CPS translations manually to verify these is quite a pain, but you really only need to ensure that your call happens in tail position. If it does, the VM doesn't need to translate that into a CALL instruction, it can just use a JUMP, and then you don't have a stack problem.

The only problem with this so far is that not all VMs have implemented PTC :(

@SimonRichardson
Copy link
Member

So it sounds either you live with the pain until PTC lands, or you implement what you want with a new structure (could be free or other ones).

@robotlolita
Copy link
Contributor

(As a side-note: stack is only a problem here when you have a large chain of synchronous maps/chains. sequence(Task, Array(1000).build(Task.of))) is a problem, but sequence(Task, listOfAsyncTasks) is not. All asynchronous code gets added to the scheduler to run on a clean stack)

@safareli
Copy link
Member Author

Here what I have done is made Func (function wrapper) chainRec and used it in Free to define Trampoline:

// data Func i o = Func (i -> o)
const Func = (run) => {
  run,
  chain(f) { return Func((a) => f(this.run(a)).run(a)) },
  map(f) { return this.chain((a) => Func.of(f(a))) },
  ap(a) { return this.chain((f) => a.map(f)) },
}

Func.of = (x) => Func(() => x)
Func.chainRec = (f, i) => Func((a) => {
  const chainRecNext = (value) => ({ isNext: true, value })
  const chainRecDone = (value) => ({ isNext: false, value })
  let step = chainRecNext(i)
  while (step.isNext) {
    step = f(chainRecNext, chainRecDone, step.value).run(a)
  }
  return step.value
})

// stack safe
Func.chainRec((next, done, v) => 
  v == 30000 ? Func.of(v).map(done) : Func.of(v+1).map(next),
  1
).run() // 3000


// Trampoline a = Free (() ->) a
Trampoline = {
  // as Free and Func are chainRec, this will work without stack issues
  run: (t) => t.foldFree(Func, Func).run(),
  // :: a ->  Trampoline a
  done: a => Free.of(a),
  // :: (() -> a) ->  Trampoline a
  delay: f => Free.liftF(f),
}

function loop(n) {
  function inner(i) {
    return i == n ?
      Trampoline.done(n) :
      // instead of suspend we can just `chain`
      Trampoline.delay(() => i + 1).chain(inner)
   }
   return Trampoline.run(inner(0));
}

console.log(loop(100000)); // 100000 🎉

I would investigate State = StateT Trampoline part.

I think @jdegoes can also give some suggestions on the topic.

@jdegoes
Copy link

jdegoes commented Dec 15, 2016

StateT Trampoline is not stack-safe, though Trampoline (StateT f) is. Also you can modify StateT to be stack safe if you want, it becomes f (s -> f (s, a)).

@safareli
Copy link
Member Author

safareli commented Jan 18, 2017

Recently I have Implemented FreeApplicative (Par in safareli/free#31) which has O(1) complexity on map/ap and O(n) on fold. But the most important part is that fold is stacksafe, which was not possible with other optimised implementations out there, as they use functions to build up a structure which requires extra stack on fold, leading to stack overflow. here the Free Applicative structure itself is not using any function, and the fold is written so that function's stack is not used. I think same technique could be applied to other "lazyish" structures to fix stack issue without need for TCO support in environment.

@Avaq
Copy link
Contributor

Avaq commented Feb 11, 2018

It may be worth noting that Fluture hasn't been affected since its 6.x release. Since that means that providing a stack safe implementation can be up to the library author, I suppose this issue could be closed.

@safareli
Copy link
Member Author

Yes we can close it, also recently implemented stack safe IO structure for purescript which could be easily converted to JS if someone wants to.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants