Auto-detect sensitive fields and automatically hide potential sensitive data #64

Open
evoxmusic opened this issue Apr 26, 2022 · 2 comments
Labels
enhancement New feature or request feature New feature request

Comments

@evoxmusic
Contributor

One useful feature, which would prevent unexpected data leaks, would be to automatically detect sensitive fields and apply a transformer to them. It could be enabled through an option in conf.yaml:

source:
  auto_hide_sensitive_data:
    enable: true
    fallback_transformers:
      - field_type: string
        transformer: random
...
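
To make the proposed behaviour concrete, here is a minimal sketch of how a fallback could be chosen, assuming each column is known by name and type. `pick_transformer` and `FALLBACKS` are hypothetical names for illustration, and the transformer names are placeholder strings; none of this is Replibyte's actual API.

```python
# Hypothetical sketch: the names below are illustrative, not Replibyte's API.

# Mirrors `fallback_transformers` from the config: field type -> transformer name.
FALLBACKS = {"string": "random"}


def pick_transformer(column, field_type, explicit, fallbacks=FALLBACKS):
    """Return the transformer name for a column, or None if nothing applies.

    An explicitly configured transformer always wins; otherwise the
    fallback registered for the column's type is used.
    """
    if column in explicit:
        return explicit[column]
    return fallbacks.get(field_type)


explicit = {"email": "email"}  # columns with a transformer set in the config
print(pick_transformer("email", "string", explicit))     # explicit entry wins
print(pick_transformer("nickname", "string", explicit))  # falls back by type
print(pick_transformer("age", "integer", explicit))      # no fallback defined
```

With this shape, a column type that has no fallback still needs a decision (skip it, fail, or pass it through), which is part of the design question.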

Why

There are several reasons why this feature would be useful:

  1. Suppose you use Replibyte with a conf.yaml written against a certain version of your database schema. If someone then adds a field to the database that you are not aware of and conf.yaml is not updated, the new field will be leaked.
  2. Specifying every database field that needs to be hidden is tedious, and almost impossible with a large database.
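
The schema drift described in the first point can be illustrated with a small check that compares the live schema against the columns covered by the config. This is only a sketch: `uncovered_columns` and both dictionaries are hypothetical, and how the schema is read from the database is left out.

```python
# Hypothetical sketch: detect columns that the config does not cover.

def uncovered_columns(schema, configured):
    """Return {table: sorted columns} for columns with no transformer configured.

    schema:     {table: iterable of column names} as read from the database
    configured: {table: iterable of column names} that have a transformer
    """
    missing = {}
    for table, columns in schema.items():
        extra = set(columns) - set(configured.get(table, ()))
        if extra:
            missing[table] = sorted(extra)
    return missing


schema = {"users": ["id", "email", "phone"]}  # `phone` was added after the config
configured = {"users": ["id", "email"]}
print(uncovered_columns(schema, configured))  # {'users': ['phone']}
```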

Happy to have your feedback:

  1. Does this feature request make sense?
  2. How can we design it?
@evoxmusic evoxmusic added enhancement New feature or request feature New feature request labels Apr 26, 2022
@timkrins

timkrins commented Jul 27, 2022

I also like the idea that fields without transformers defined are not output by default.
I would expect an option along the lines of output_fields_with_transformers_only: true or strict: true, rather than having this behind something related to 'automatic detection'.
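
A rough sketch of what such a strict mode could mean for a single row, assuming transformers are plain callables keyed by column name. `transform_row` and its `strict` flag are hypothetical and only mirror the option name suggested above.

```python
# Hypothetical sketch of a strict mode; not Replibyte's implementation.

def transform_row(row, transformers, strict=True):
    """Apply per-column transformers to one row.

    With strict=True, columns without a transformer are omitted from the
    output; with strict=False they pass through unchanged.
    """
    out = {}
    for column, value in row.items():
        transform = transformers.get(column)
        if transform is not None:
            out[column] = transform(value)
        elif not strict:
            out[column] = value
    return out


transformers = {"id": lambda v: v, "email": lambda v: "user@example.com"}
row = {"id": 1, "email": "jane@corp.test", "ssn": "123-45-6789"}
print(transform_row(row, transformers))                # `ssn` is dropped
print(transform_row(row, transformers, strict=False))  # `ssn` leaks through
```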

@patrick-grow-therapy

patrick-grow-therapy commented Sep 13, 2022

Has this been looked at? I also think it would be useful to specify which columns you're fine with outputting as-is, and have every column not specified always be transformed. It would be easier for people to onboard and safer in the long term.
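
As an illustration of that allowlist idea, here is a minimal sketch in which only listed columns pass through and everything else goes through a default transformer. `transform_with_allowlist` and `redact` are hypothetical helpers, not existing Replibyte functions.

```python
# Hypothetical sketch of an allowlist approach; not Replibyte's implementation.

def redact(value):
    """Replace a value with asterisks of the same length; keep None as None."""
    return None if value is None else "*" * len(str(value))


def transform_with_allowlist(row, allowlist, default_transform=redact):
    """Pass allowlisted columns through; transform every other column."""
    return {
        column: value if column in allowlist else default_transform(value)
        for column, value in row.items()
    }


row = {"id": 7, "plan": "pro", "email": "a@b.co"}
print(transform_with_allowlist(row, allowlist={"id", "plan"}))
# {'id': 7, 'plan': 'pro', 'email': '******'}
```

A column added to the schema later is transformed by default here, since it is not on the allowlist.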
