Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix VC history operation and add VC tests #49

Merged
merged 10 commits into from
Jun 27, 2023

Conversation

siddharth-krishna
Copy link
Collaborator

@siddharth-krishna siddharth-krishna commented Jun 26, 2023

This PR fixes three bugs in the fetchVCObjectHistory operation:

The first is in the order of the history of a VC object. Currently, given a script with 4 versions s1 -> s2 -> s3 -> s4, fetchVCObjectHistory returns [s3, s2, s1, s4]. This is because we store history in oldest-to-newest order in file, but the following line adds head_h (i.e. the latest version s4) to the wrong end of the history list:

history <- withRead lock $ do
head_h <- fetchCurrentHead h
let head_fp = storePath </> "heads" </> show head_h
preds <- readVCObjectHashTxt head_fp
pure $ head_h : preds

(the foldM below then reverses the order, but with s4 still in the wrong place)

Even after fixing the above, fetchVCObjectHistory incorrectly adds a second source script in the case of a script that is cloned twice in succession. Consider the following history:

mcJDUg CloneOf F6LRpH
4I47d_ CloneOf mcJDUg
cUpepL MarkedBreakingWithPred 4I47d_

In this case, fetchVCObjectHistory cUpepL returns [F6LRpH, mcJDUg, 4I47d_, cUpepL] (after the above fix) but it should return [mcJDUg, 4I47d_, cUpepL]. Interestingly, the above bug hides this bug in production right now because because history = [cUpepL, mcJDUg, 4I47d_] (incorrect), so metas = [4I47d_, mcJDUg, cUpepL] and so when it finds that 4I47d_, is a clone it adds original = mcJDUg and returns nubBy [mcJDUg, 4I47d_, mcJDUg, cUpepL] = [4I47d_, mcJDUg, cUpepL]. But once I fix history = [cUpepL, 4I47d_, mcJDUg], then metas = [mcJDUg, 4I47d_, cUpepL] and original = F6LRpH and the code returns [F6LRpH, mcJDUg, 4I47d_, cUpepL]. This is incorrect because we only want to add on the original version of the first cloning operation when going back in history.

While fixing this second bug, I also noticed that nubBy (a linear operation) is unnecessary, as one can simply recurse through history and stop when one sees the first cloning operation. I thus simplified the fold function and the large case-split into a single recursive function getHistory that walks through history newest-to-oldest, fetches the VC objects, and stops when it sees a cloning operation.

NOTE: This PR requires a code change downstream. There is a third issue with the current system: since the frontend requires the history in newest-to-oldest order, but the current code returns the order [s3, s2, s1, s4], OnPing is reversing the order it gets from the VC to put s4 first. This PR fixes the VC to return history in newest-to-oldest order, so OnPing should not reverse the order that it gets.

I also benchmarked a few different ways of doing this operation (see below) and found that the fastest way would be to use a recursive function instead of a fold, and return the history in newest-to-oldest order.

Test cases:

// iPFmvYX2GLvC9LYLxRW0HogFrhw188s7bwMguELJ0_c=
KyZIcL MarkedBreakingWithPred tHV1iC
wwWlxB CloneOf KyZIcL
_9eHVU MarkedBreakingWithPred wwWlxB
Vbke7C MarkedBreakingWithPred _9eHVU
iPFmvY MarkedBreakingWithPred Vbke7C
// Expected: [KyZIcL, wwWlxB, _9eHVU, Vbke7C, iPFmvY]

// 0yfJRn3k_14jJWPVMjFEBTHwgnYyBcgTJYU4ThJqHtM=
Rxq6af CloneOf F0j9ZH
26Brml MarkedBreakingWithPred Rxq6af
t5jCXh MarkedBreakingWithPred 26Brml
0yfJRn MarkedBreakingWithPred t5jCXh
// Expected: [F0j9ZH, Rxq6af, 26Brml, t5jCXh ,0yfJRn]

// cUpepLZIHur8wRKkx5Wc8IrvfLmNl0V_pnz9C2T9l54=
mcJDUg CloneOf F6LRpH
4I47d_ CloneOf mcJDUg
cUpepL MarkedBreakingWithPred 4I47d_
// Expected: [mcJDUg, 4I47d_, cUpepL]

I tested the new code on the above histories (heads/ files) and it returned the correct responses.

Benchmarking different ways of recursing through history

module Main where

import Data.Foldable (foldrM)
import System.Environment (getArgs)

_foldrM :: Int -> IO ()
_foldrM n = do
  let l1 = [1 .. n]
  let l2 = l1 ++ [n + 1]
  l3 <- foldrM process [] l2
  print $ sum $ reverse l3
  where
    process x acc = do
      if x > 5
        then do
          -- print x
          pure $ x : acc
        else pure acc

_rec :: Int -> IO ()
_rec n = do
  let l1 = [1 .. n]
  let l2 = n + 1 : reverse l1
  l3 <- processRec [] l2
  print $ sum $ reverse l3
  where
    processRec acc (x : xs) = do
      if x > 5
        then do
          -- print x
          processRec (x : acc) xs
        else pure acc
    processRec acc [] = pure acc

_rec2 :: Int -> IO ()
_rec2 n = do
  let l1 = [1 .. n]
  let l2 = n + 1 : reverse l1
  l3 <- processRec l2
  print $ sum l3
  where
    processRec (x : xs) = do
      if x > 5
        then do
          -- print x
          res <- processRec xs
          pure $ x : res
        else pure []
    processRec [] = pure []

{-
Run as: <exe> <method> <n>

I benchmarked with:
hyperfine -L n foldrM,rec,rec2 'cabal run inferno {n} 50000000'

And rec2 was 1.2x faster than the other 2 methods.
-}
main :: IO ()
main = do
  getArgs >>= \case
    [m, n'] -> do
      let n = read n'
      case m of
        "foldrM" -> _foldrM n
        "rec" -> _rec n
        "rec2" -> _rec2 n
        _ -> undefined
    _ -> error "expected args: method n"

@@ -340,87 +339,63 @@ fetchVCObjectHistory h = do
head_h <- fetchCurrentHead h
let head_fp = storePath </> "heads" </> show head_h
preds <- readVCObjectHashTxt head_fp
pure $ head_h : preds
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

preds is stored in file as oldest-to-newest so this line adds head_h to the wrong end of the history.

-- 2. Ignore this meta.
-- Approach no. 2 is taken here by just returning the accumulator.
pure acc
(metas, removeds) <- foldM f ([], []) history
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fold reverses order to newest-to-oldest (but the latest version is now at the end).

-- i.e. original -> cloneof orignal = cloned -> cloneof cloned = cloned'
-- when viewing cloned' history, it will only show up to cloned.
case metas of
all'@(x : _) ->
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pattern match is looking at the second newest version, not the oldest version as intended.

@siddharth-krishna siddharth-krishna requested a review from shulhi June 26, 2023 05:28
@siddharth-krishna
Copy link
Collaborator Author

Thanks for the review! I noticed we don't have tests for inferno-vc anywhere so I added some simple tests. Hope it catches small bugs in the future.

timestamp o' `shouldBe` timestamp o
obj o' `shouldBe` obj o

-- -- TODO is fetchFunctionsForGRoups wrong? It is returning deleted scripts:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@siddharth-krishna is this something that needs to be addressed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It needs more investigation, but it is unrelated to this PR (if it is a bug it's an existing bug). Maybe I'll open an issue to track it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#50

@siddharth-krishna siddharth-krishna changed the title Fix VC history operation Fix VC history operation and add VC tests Jun 27, 2023
Copy link
Contributor

@shulhi shulhi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I wonder if the tests can be written in a way that doesn't need to spin up a server?

@siddharth-krishna siddharth-krishna merged commit 77bb919 into main Jun 27, 2023
@siddharth-krishna siddharth-krishna deleted the sidk-fix-vc-history branch June 27, 2023 07:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants