split function for Data.Conduit.Text #205

Closed
MaxGabriel opened this issue Mar 27, 2015 · 5 comments

Comments

@MaxGabriel
Contributor

Hi, I needed a split function for some log parsing I was doing with Conduit. Would you be interested in my adding this to Data.Conduit.Text? If so, I can add tests, remove the Safe dependency that my code is using, and then submit a PR.

This is the code I'm currently using; it's modeled closely after lines and I believe lines could be implemented in terms of split.

split :: Monad m => Text -> Conduit Text m Text
split splitText = awaitText T.empty
    where
      awaitText buf = await >>= maybe (finish buf) (process buf)

      finish buf = unless (T.null buf) (yield buf)

      process buf text = yieldSplits $ buf `T.append` text

      yieldSplits text = do
        let splits = T.splitOn splitText text
            lastSplit = lastDef T.empty splits
        mapM_ yield (initSafe splits)
        awaitText lastSplit

@mgsloan
Contributor

mgsloan commented Mar 27, 2015

Just for info / reference, I did something similar when implementing fusion in conduit-combinators: https://github.com/fpco/conduit-combinators/blob/master/Data/Conduit/Combinators.hs#L1889

The main difference is that yours takes a Text rather than Element seq -> Bool.
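
To illustrate that difference outside of conduit, here is a small sketch using only Data.Text: T.splitOn takes the delimiter as a whole Text, while T.split takes a per-character predicate (Char -> Bool). This is just an analogy with the text package and makes no claim about how the conduit-combinators function is implemented.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Text as T

main :: IO ()
main = do
  -- Delimiter given as a whole Text, as in the proposed split
  print (T.splitOn "xx" "123xx456")    -- ["123","456"]
  -- Delimiter given as a predicate on single characters
  print (T.split (== 'x') "123xx456")  -- ["123","","456"]
```

With a predicate, every matching character is its own split point, so the two adjacent 'x' characters produce an empty piece between them.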

@MaxGabriel
Contributor Author

Cool @mgsloan. I guess the conduit-combinators version of this would take an EqSequence and split its input using that?

@mgsloan
Contributor

mgsloan commented Mar 27, 2015

Yup, that seems reasonable to me. I'm not seeing an implementation of split for EqSequence, though.

@snoyberg
Owner

I think the code above is subtly broken: if the string being split on straddles a chunk boundary, it won't fire. E.g.:

mapM_ yield ["123x", "x456"] $= split "xx"

doesn't look like it would result in the output you'd expect.
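
One way to check this case without building a conduit pipeline is to model the buffering loop of the posted split as a pure function over a list of chunks. The sketch below does that. splitChunks is a hypothetical name, not part of conduit, and the model assumes a non-empty separator, since T.splitOn raises an error on an empty one.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)
import qualified Data.Text as T

-- Pure model of the posted split (hypothetical helper, not a conduit API).
-- `go` plays the role of awaitText: it carries the last piece of each
-- chunk forward as a buffer and prepends it to the next chunk.
splitChunks :: Text -> [Text] -> [Text]
splitChunks sep = go T.empty
  where
    -- End of input: emit the buffer unless it is empty (mirrors finish).
    go buf [] = [buf | not (T.null buf)]
    -- New chunk: split buffer ++ chunk, emit all but the last piece,
    -- and keep the last piece as the new buffer (mirrors yieldSplits).
    go buf (chunk : rest) =
      let pieces = T.splitOn sep (buf `T.append` chunk)
      in init pieces ++ go (last pieces) rest

main :: IO ()
main = print (splitChunks "xx" ["123x", "x456"])  -- ["123","456"]
```

In this model the leftover "123x" from the first chunk is joined with "x456" before splitting again, so the delimiter is found across the boundary. The model does differ from T.splitOn in one respect: an empty final piece is dropped, so a single chunk "123xx" gives ["123"] rather than ["123",""].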

@snoyberg
Owner

Tracked by PR #207
