Another lament on Redbin limitations, or why sparse designs are bound to be reinvented until complete #155

Open
hiiamboris opened this issue Nov 17, 2023 · 11 comments

Comments

@hiiamboris

hiiamboris commented Nov 17, 2023

In my recent work on the ParSEE visualization tool, I encountered the need to save a dump of all Parse events, so that it can be loaded later in the GUI tool for analysis.

The reasons for such a split design are:

  1. Spaces (on which the GUI tool is built) is a big library. Requiring it at the place of parsing would be unwise because:
    • countless #include bugs will make usage of such a tool pure hell, so no one will use it
    • such a big library is likely to affect the original program (consider custom event scheduler, custom mold implementation, hacked console, and many exports)
  2. Requiring the GUI would require GUI functionality, which is not always available (say, on headless servers), and would limit the applicability of the tool
  3. It is sometimes simply convenient to save multiple results and process them later, without the need to reproduce them every time (also useful if they are hard to reproduce)

While point 1 could in theory be addressed in some far far future by a solid module system that could provide enough convenience and isolation, points 2-3 will remain valid, validating the whole need for backend/frontend separation.

The saved dump consists of thousands of events, each including the input Parse is working on and the rule processing that input:

  • input of Parse can be nested and even cyclic, and it can contain arbitrary Red values (natives, faces with handles, etc.)
  • rules can also be nested, cyclic, and linked to each other indirectly via words, and contain arbitrary code, which usually includes natives

The traditional mold/load cycle is inappropriate, as it would destroy both the sameness and the offsets of all series:

  • if event A and event B have the same rule, it should remain the same once loaded, and it should preserve the head index
  • all nested sub-series of input once loaded must retain sameness across all events: if event A has input X containing input Y, and event B has input Y, then Y of both must be the same value, else I won't know where to mark the current parsing location
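
A minimal sketch of the problem described above, assuming a toy "dump" of two events that share one rule block (the names `rule`, `events`, `loaded`, `restored` and `pos` are made up for illustration). The Redbin calls are the same `system/codecs/redbin` encode/decode pair that appears later in this thread; no output is claimed for that line.

```red
Red []

rule:   [some "a"]
events: reduce [rule rule]            ; two events referring to one rule block
probe same? events/1 events/2         ; true: the block is shared

loaded: load mold events              ; round trip through text
probe same? loaded/1 loaded/2         ; false: each occurrence became its own block

pos: next rule                        ; a series positioned past its head
probe index? pos                      ; 2
probe index? load mold pos            ; 1: only the tail part was molded, the offset is gone

rb: system/codecs/redbin
restored: rb/decode rb/encode events none
probe same? restored/1 restored/2     ; try it on your build; no result is asserted here
```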

Redbin was supposed to be of help here, but its shortcomings make it a kludge rather than a solution.

Current Redbin implementation cannot save:

  • values of types routine! handle! event! op!
  • values within system/words context

So when I try to save such dump of values I only receive an error most of the time.

To force Redbin to save the dump I had to recursively preprocess the whole tree of values (including both the input and the events dump) in the following manner:

  1. Copy each any-block series and keep a map [old block -> new block] - needed for (2)
  2. Replace each any-block with its copy from the map (1) - needed for (3-5), as I cannot modify the original rules/input in place or I'll break the parser
  3. Bind every word to the global context so it loses its binding during Redbin encoding (it's a special case in Redbin at the moment)
  4. Replace complex values (objects, maps, functions) with their abbreviations, as I don't need their content because Parse cannot enter them. This helps me avoid deep preprocessing of these values
  5. Replace values unsupported by Redbin by their abbreviations
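
Steps 1 and 2 above can be sketched roughly as follows. This is not the actual ParSEE code: `clone` is a hypothetical helper, and the "map" is assumed to be a flat block of [original-head copy] pairs searched with `find/same`, so that blocks are matched by identity rather than by content. Steps 3-5 (rebinding words and abbreviating values) are left out.

```red
Red []

clone: function [
    "Copy BLK and all nested any-blocks, keeping sameness and offsets (hypothetical helper)"
    blk  [any-block!]
    seen [block!] "Flat list of [original-head copy] pairs, filled in as we go"
][
    old: head blk
    either pos: find/same/only/skip seen old 2 [
        new: second pos                       ; already copied: reuse the same copy
    ][
        new: copy old                         ; shallow copy from the head
        append/only append/only seen old new  ; register before descending, so cycles terminate
        cur: new
        forall cur [
            if any-block? :cur/1 [change/only cur clone :cur/1 seen]
        ]
    ]
    at new index? blk                         ; restore the original offset
]

rule:   [some "a"]
events: reduce [rule next rule]
copies: clone events copy []
probe same? head copies/1 head copies/2       ; true: sharing survives the copy
probe index? copies/2                         ; 2: the offset survives too
probe same? rule head copies/1                ; false: the originals are left untouched
```

The copies can then be modified freely (for example to swap unsupported values for placeholders) without disturbing the rules and input the parser is still using.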

For more general preprocessing I would also need to include objects, maps and functions in the deeply preprocessed data set.

In essence, it's a slow high-level reinvention of the logic of Redbin, which by my observation is also quite hard to get done correctly and reasonably fast. It would thus be nice if Redbin didn't require invention of such kludges in order to just use it.

@dockimbel
Member

Current Redbin implementation cannot save:

  • values of types native! action! routine! handle! event! op!

??

>> rb: system/codecs/redbin
>> probe rb/decode rb/encode :insert none
make action! [[
    {Inserts value(s) at series index; returns series past the insertion} 
    series [series! port! bitset!] 
    value [any-type!] 
    /part "Limit the number of values inserted" 
    length [number! series!] 
    /only {Insert block types as single values (overrides /part)} 
    /dup "Duplicate the inserted values" 
    count [integer!] 
    return: [series! port! bitset!]
]]
== make action! [[
    {Inserts value(s) at series index; returns series past the insertion} 
    series ...
>> probe rb/decode rb/encode :parse none
make native! [[
    "Process a series using dialected grammar rules" 
    input [binary! any-block! any-string!] 
    rules [block!] 
    /case "Uses case-sensitive comparison" 
    /part "Limit to a length or position" 
    length [number! series!] 
    /trace 
    callback [function! [
        event [word!] 
        match? [logic!] 
        rule [block!] 
        input [series!] 
        stack [block!] 
        return: [logic!]
    ]] 
    return: [logic! block!]
]]
== make native! [[
    "Process a series using dialected grammar rules" 
    input [binary! any-block! an...
>> probe rb/decode rb/encode :+ none
make op! [[
    "Returns the sum of the two values" 
    value1 [scalar! vector!] "The augend" 
    value2 [scalar! vector!] "The addend" 
    return: [scalar! vector!] "The sum"
]]

@dockimbel
Member

dockimbel commented Nov 17, 2023

Current Redbin implementation cannot save:

  • values within system/words context

That was not part of Redbin goal, which was to provide a way to serialize local Red data accurately without pulling the entirety of the global context (== whole Red runtime environment). Even in its current form, Redbin is already pulling some parts of the global context, which is not always desirable. A possible evolution of Redbin could include a way to control how "far" it pulls references, so users can scale it to their specific needs.

@hiiamboris
Author

Thanks for correcting me, I've removed natives and actions from that list.

That was not part of Redbin goal, which was to provide a way to serialize local Red data accurately

I understand, yes. But local data may include global words that need to be saved, or it may contain words like system which enforce unwanted global context inclusion, so in real code it becomes a tangled mess in need of deep and meticulous preprocessing.

@hiiamboris
Author

hiiamboris commented Nov 17, 2023

Perhaps the easiest patch would be to let it accept a callback, either to handle all values or only those it can't save, rather than failing. Some save-anything callback could also be available out of the box.

But these are just some thoughts and a use case to inform the big picture.

@hiiamboris
Author

I had other thoughts on Redbin generality here

@greggirwin
Contributor

The big picture thinking, and a real use case like this, is great. Thanks @hiiamboris. 👍

@hiiamboris
Author

hiiamboris commented Dec 23, 2023

Another illustration of how bad it gets - saving two scalar values, carrying over the whole runtime:

f: function [geom [map!]] [
	unless geom/offset [geom/offset: 0x0]
	unless geom/size   [geom/size: system/view/screens/1/size]
	save %test.redbin probe geom
	view/options [button "TEST" [unview]] [size: geom/size offset: geom/offset]
]
f #()

Output:

#(
    offset: 0x0
    size: 1280x720
)
*** Access Error: cannot decode or encode (no codec): routine ["Internal Use Only"][bool: as red-logic! stack/arguments bool/header: T
*** Where: encode
*** Near : codec/encode :value dst
*** Stack: f save

I have to use text for the state files module, because Redbin is a no-go. Of course, that bears another risk.

@dockimbel
Member

dockimbel commented Dec 25, 2023

That last example looks like a bug where the words in the map are wrongly pulling their context instead of being processed just as symbols.

@hiiamboris
Author

Is it then a bug in Redbin for not having a special case for maps, or in maps for not removing word bindings?

@9214
Contributor

9214 commented Apr 2, 2024

That last example looks like a bug where the words in the map are wrongly pulling their context instead of being processed just as symbols.

I'm pretty sure that's by design of map! itself:

>> map: to map! bind [foo: 'bar] context [foo: 'baz]
== #[
    foo: 'bar
]
>> get probe last keys-of map
foo
== baz
>> unset? :foo
== true

provide a way to serialize local Red data accurately

Which it evidently does, judging by the example above 🤷‍♂️

special case for maps

For the record, at the time of implementation I didn't know there was supposed to be one. IIRC all there is to map! is a convenient key/value wrapper over hash!.

@9214
Contributor

9214 commented Apr 2, 2024

WRT inability to encode global values, I think it can be rectified by collecting them in a separate context, which would serve as a localized substitute for system/words. Basically:

>> foo: 'bar
== bar
>> save/as #{} [foo] 'redbin

Would be the same as:

>> save/as #{} bind [foo] context [foo: system/words/foo] 'redbin

As for the other possibility, unmarshaling values from a Redbin payload straight into system/words: imagine loading an innocuous-looking payload where the global + is set to a function that reads your home folder, sends its contents over the network, and then blows up your PC. The next time you evaluate anything in Red, it will likely do just that.
