This Startup Wants to Build a “GitHub for Data”


A startup called Gretel wants to build a “GitHub for data” so developers can safely access sensitive data.

Often, developers don’t need full access to a bank of user data; a portion or a sample is enough to work with. In many cases, they could make do with data that merely looks like real user data.

This so-called “synthetic data” is artificial data that looks and behaves like real sensitive user data. Gretel uses machine learning to categorize the data (names, addresses and other customer identifiers) and attach as many labels to it as possible. Once the data is labeled, access policies can be applied to it. Then the platform applies differential privacy, a technique for anonymizing large amounts of data, so that the data is no longer tied to customer information.
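To make the label, policy and privacy steps more concrete, here is a minimal Python sketch. It is not Gretel’s code or API: the regular expressions stand in for the machine-learning classifier, the names `LABEL_PATTERNS`, `REDACTED_LABELS`, `apply_policy` and `private_count` are hypothetical, and the Laplace mechanism on a simple count is just one textbook form of differential privacy, chosen here as an assumption.

```python
import random
import re

# Hypothetical labeling rules. Assumption: a real system would use a trained
# classifier; regular expressions keep this sketch small.
LABEL_PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "phone": re.compile(r"^\+?[\d\-\s()]{7,}$"),
}

# Hypothetical access policy: values carrying these labels are hidden.
REDACTED_LABELS = {"email", "phone"}


def label_value(value):
    """Return the first label whose pattern matches, else 'unlabeled'."""
    for label, pattern in LABEL_PATTERNS.items():
        if pattern.match(value):
            return label
    return "unlabeled"


def apply_policy(record):
    """Replace every value that carries a restricted label with a placeholder."""
    return {
        field: "<redacted>" if label_value(str(value)) in REDACTED_LABELS else value
        for field, value in record.items()
    }


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Return a noisy count of matching records.

    Adding or removing one record changes a count by at most 1, so the
    Laplace mechanism uses noise with scale 1 / epsilon.
    """
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    users = [
        {"name": "Ada", "email": "ada@example.com", "age": 36},
        {"name": "Alan", "email": "alan@example.com", "age": 41},
        {"name": "Grace", "email": "grace@example.com", "age": 29},
    ]
    print([apply_policy(user) for user in users])
    print(private_count(users, lambda u: u["age"] > 30, epsilon=0.5, rng=rng))
```

A smaller `epsilon` adds more noise and gives stronger privacy, while a larger one keeps the answer closer to the true count. Generating full synthetic records, as the article describes, is a harder problem than this single noisy query.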

 

