00:00 
Hi everybody, welcome to DEMO, the show where companies come in and they show us their products and services. Today I’m joined by Ameesh Divatia, he is the co-founder and CEO at Baffle. Welcome to the show, Ameesh.
 
00:10
Thank you, Keith. Happy to be here.
 
00:12
So tell me a little bit about Baffle, and then tell me about what you’re going to show here today.
 
00:16
Absolutely. So Baffle is what we call the easiest way to protect data. What we do is protect data all the way at the field level, so that the cloud service provider or somebody that’s managing the infrastructure never sees sensitive data. So that’s the crux of what we do. We do it with a no-code model: we make sure that there are no application changes needed for masking, tokenization, or encryption of data.
 
00:39
Who within the company is going to benefit most from using Baffle? Is it the CIO level? Is it someone who’s trying to access data, or are you preventing certain people from seeing data?
 
00:51
Well, the main benefit is for data scientists. They want to analyze data, and they’re prevented from doing so if that data is sensitive. There are lots of rules and regulations around it, compliance requirements that security typically sets. So security is usually the one that finds us, but it’s the data scientist that we benefit the most.
 
01:10
So the problem you’re solving, as far as I can tell, and you can tell me if I’m right or wrong on this, is that it allows certain people to see data that’s encrypted, and you’re not unencrypting it or decrypting it, right? So you can still see it, and that just blows my mind. That’s just like a magic wand to me at this point. So again, why should people care about this? Like, what problems are people having when you’ve got people trying to look at encrypted data?
 
01:34
The two main problems: the first one is data breaches, right? Everybody is inundated. We all get these requests all the time from companies that have shared our data inadvertently, and they’re trying to make up for it. So data breaches continue to happen. They’re proliferating, which means that the existing data protection solutions don’t necessarily work. What is very interesting, though, is that in the five-plus years since GDPR went into effect, there’s a plethora of regulations coming into effect to prevent exactly this problem, which is that individuals, just like you and I, should not be losing our data just because we share it with somebody that we trust.
 
02:12
So if a company didn’t have something like Baffle on their system, in order for someone to look at the data, you would have to decrypt it, and then keep your fingers crossed that that data doesn’t then get breached or stolen, or end up sitting unencrypted on a server somewhere.
 
02:26
Actually, the problem is worse, because when the data is still on their systems as it’s being processed, it can be exfiltrated. It can be breached, especially if the database admin’s credentials are compromised. So, you know, at the highest level, we have our phones, and our messages, iMessages or WhatsApp messages, are end-to-end encrypted. Enterprise applications are not.
 
02:51
So there’s data out there that could be unprotected at this point.
 
02:55
Exactly. So we have protected data at rest. We’ve protected data in transit, but we don’t protect data in use.
 
03:00
So you’ve got some cool things to show me on the demo here. Let’s jump right into it and show me the cool features.
 
03:07
All right, so the animation on the right sort of captures exactly what we do. If you look at the flow of data, it’s usually coming in as clear text from an on-prem database, or even if it’s in the cloud, it’s the one where the data is clear text. Then it goes into this particular protected data, usually in the cloud. So we transform the data all the way down at the field level. So you’re seeing these credit cards. These credit cards are now being transformed into something that’s still readable. It still looks like a credit card, but it’s not the original credit card. And then, depending on the persona and the credentials that anybody who has access to the data has, they can get views of that data. So that’s what you’re seeing in the second half of the animation. Okay, that sort of captures what we do in 30 seconds.
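
The transformation described above, where a card number turns into a different value that still looks like a card number, is generally called format-preserving encryption. The following is a minimal toy sketch of the idea in Python, assuming a simple Feistel construction over decimal digits. The function names, the key, and the round count are all hypothetical; this is not Baffle's implementation and not a standardized scheme such as NIST FF1, so it should not be used to protect real data.

```python
import hashlib
import hmac


def _check(digits: str) -> int:
    # Toy restriction (an assumption of this sketch): even-length ASCII digit strings only.
    if not (digits.isascii() and digits.isdigit()) or len(digits) < 2 or len(digits) % 2:
        raise ValueError("expected an even-length string of ASCII digits")
    return len(digits) // 2


def _round_value(key: bytes, round_no: int, half_digits: str, width: int) -> int:
    # Keyed pseudo-random number derived from one half, reduced to `width` digits.
    message = f"{round_no}:{half_digits}".encode()
    digest = hmac.new(key, message, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % (10 ** width)


def toy_fpe_encrypt(key: bytes, digits: str, rounds: int = 8) -> str:
    half = _check(digits)
    left, right = digits[:half], digits[half:]
    for r in range(rounds):
        mixed = (int(left) + _round_value(key, r, right, half)) % (10 ** half)
        left, right = right, str(mixed).zfill(half)
    return left + right


def toy_fpe_decrypt(key: bytes, digits: str, rounds: int = 8) -> str:
    half = _check(digits)
    left, right = digits[:half], digits[half:]
    for r in reversed(range(rounds)):
        # Undo one round: the current left half is the previous right half.
        prev_right = left
        prev_left = (int(right) - _round_value(key, r, prev_right, half)) % (10 ** half)
        left, right = str(prev_left).zfill(half), prev_right
    return left + right


if __name__ == "__main__":
    demo_key = b"hypothetical-demo-key"
    card = "4111111111111111"  # a well-known test card number, not real data
    token = toy_fpe_encrypt(demo_key, card)
    print(token)                             # still 16 digits
    print(toy_fpe_decrypt(demo_key, token))  # the original digits
```

The point of the sketch is only that the output has the same length and character set as the input, so a column typed for card numbers can hold the protected value without a schema change.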
 
So now let’s dive into this. This is our console. This is Baffle Manager. And there are a few things going on, on the left. First of all, as you can see, there is the application, there is the database, and there is something in the middle, which is what Baffle is. So first things first, what we do is we enroll a database. So this is the database itself that’s been enrolled. It’s a Postgres database that has sensitive data. So step one is actually to figure out exactly what kind of data protection policy you’re going to have, because, again, we’re going all the way down to the field level. So let’s first start with the data source, which means that you have different sources of information that are coming in. I’m going to create one just for kicks, because it makes it fun to see. I’m going to see what we can do with this particular database, and what you’re able to do with this is to go all the way down to this particular table and see what is in there and be able to pick which column to encrypt.
 
So now what we’re going to do is we’re going to go into the Postgres database. We’re going to go into the specific table that we’re going to protect. It’s called transactions, and you see all of these things in there. I’m going to make it very simple, just protect one particular field. So I’m going to protect the INT field. So now we have a data source. The next thing, now that you have a source and you’ve identified what data to protect, is that you’re going to have to set up a policy. So we’re going to set up a policy here. I’m going to call it test policy, and I am going to protect this particular data source that I picked.
 
05:49
And now I’m going to create that particular policy. And I’m going to have a choice. I can either do traditional encryption or I can do format-preserving encryption. Format-preserving encryption is the same as tokenization, which means that the credit card actually looks like a credit card. So let’s just do that for this particular case, and we’re going to pick a specific mode, whether it’s FP int or FP decimal, doesn’t really matter, and it’s going to be global. So we’re going to encrypt this everywhere. We’re going to have to choose a specific key coming from AWS KMS in this case, and we’re going to choose a data encryption key. So there’s a key encryption key and a data encryption key. Okay? Again, cryptographers, anybody who understands envelope encryption, will know exactly what to do there. But what is very important is that we are allowing a mechanism by which, even if you rotate keys, you don’t have to re-encrypt the data. That’s what this two-level hierarchy is. And that’s it. That’s how the policy gets created.
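
The two-level hierarchy mentioned above can be illustrated with a small Python sketch of envelope encryption. It assumes that "rotating keys" refers to rotating the key encryption key (KEK), in which case only the small wrapped data encryption key (DEK) is re-encrypted and the data itself is left alone. The `toy_cipher` function is a hypothetical stand-in built from SHA-256 so the example runs with the standard library only; it is not secure and is not how AWS KMS or Baffle encrypt anything.

```python
import hashlib
import secrets


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256-derived keystream.

    Toy stand-in for a real cipher (NOT secure). Applying it twice with the
    same key returns the original bytes, so it both encrypts and decrypts.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


# Key hierarchy: in a real deployment the KEK would live in a key management service.
kek_v1 = secrets.token_bytes(32)       # key encryption key, version 1
dek = secrets.token_bytes(32)          # data encryption key
wrapped_dek = toy_cipher(kek_v1, dek)  # only the wrapped DEK is stored next to the data

# The data is encrypted once, under the DEK.
plaintext = b"4111111111111111"
ciphertext = toy_cipher(dek, plaintext)

# KEK rotation: unwrap the DEK with the old KEK, re-wrap it with the new one.
kek_v2 = secrets.token_bytes(32)
rewrapped_dek = toy_cipher(kek_v2, toy_cipher(kek_v1, wrapped_dek))

# The ciphertext was never touched, yet it still decrypts via the new KEK.
recovered_dek = toy_cipher(kek_v2, rewrapped_dek)
recovered_plaintext = toy_cipher(recovered_dek, ciphertext)
```

Under this assumption, rotation costs one small re-wrap per DEK rather than a pass over every encrypted field.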
 
06:48
So those are the two important steps: data source, data policy. And then you have to choose user groups. So this is the access control part of it. We already created some use cases here, something called admin, something called Human Resources. These are all the credentials about how data is accessed, right?
 
07:03
So there’s some people in a company that you want to have access to the entire data set.
 
07:15
Precisely. So now let’s go and look at the most important thing, which is I just encrypted a whole bunch of things in that particular use case there, and now I’m seeing this data. This is the direct access to the database. As you can see, the name, the email, is all morphed. It’s not the original. And things like the transaction amount, transaction, completely morphed. So these are not format preserving at all, while the credit card is format preserving. This is direct access to the database. If somebody were to compromise the database admin’s credentials, that’s what they would see.
 
Now let’s look at another person. This is very good, because direct access will not yield any good data, just garbage; they can’t do anything with it. But now let’s look at a persona called Harry, who belongs to human resources and doesn’t need to see the credit card at all, so it’s private. It’s completely masked. Or Sally, who happens to be somebody who’s a little bit more privileged, can see the last four. So this is our masking feature. It works in two ways, static or dynamic. In this particular case, this is dynamic masking.
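
The role-dependent views described above can be sketched as a small Python function. The group names, the policy table, and the `mask_card` helper are hypothetical and chosen only to mirror the Harry and Sally personas; the transcript does not show how Baffle represents these rules. The masking is applied when the value is read, which is what makes it dynamic: the stored value is not changed.

```python
# Hypothetical mapping from user group to masking rule (an assumption of this sketch).
MASKING_POLICY = {
    "human_resources": "full",  # Harry: never needs the card number
    "support": "last_four",     # Sally: may see the last four digits
    "privileged": "none",       # privileged user: sees the value in the clear
}


def mask_card(card_number: str, group: str) -> str:
    """Return the view of card_number that a member of `group` is allowed to see."""
    rule = MASKING_POLICY.get(group, "full")  # unknown groups get the strictest rule
    if rule == "none":
        return card_number
    if rule == "last_four" and len(card_number) > 4:
        return "*" * (len(card_number) - 4) + card_number[-4:]
    return "*" * len(card_number)


if __name__ == "__main__":
    card = "4111111111111111"  # a well-known test card number, not real data
    for group in ("human_resources", "support", "privileged"):
        print(group, mask_card(card, group))
```

Defaulting unknown groups to the fully masked view is a design choice of this sketch, so that a missing policy entry fails closed rather than open.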
 
Now let’s look at a privileged user, somebody who is privileged to go and look at the actual transactions. So what is very interesting here is, first, I can extract all of the data out. So if I’m privileged, I can just extract all of this out. That’s what I just did. Everything is now in the clear, I’m privileged. I can see that. But something that Baffle does that’s unique is you don’t necessarily have to extract all the data out. We have this concept of real queryable encryption, where this particular data that you just saw, the transaction amount, was completely morphed. But I’m going to actually now choose a name from that particular scenario where I could find all the doctors in that particular table, without decrypting the data when extracting it out.
 
09:11
So you’re running a query on encrypted data without actually decrypting it.
 
09:17
Without actually extracting it out of the database. The decryption happens at the database tier, but not in the database itself. It happens in what is known as the extension. And it gets even better: not only can you do a wildcard search like that, you can actually do math. So just look at all this, right? You’re trying to figure out whether transaction quantity is greater than zero. And then you can do complex queries where you’re doing a combination of these searches, where it is a transaction quantity from the transactions table with a name that starts with a D and so on, right? You can do nested queries. You can do any SQL query that’s out there, all of that on encrypted data.
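
The idea of filtering on protected columns without first extracting them can be illustrated with Python's built-in `sqlite3` module. In this sketch a user-defined SQL function registered on the connection stands in for the database-tier extension, and a wildcard match is combined with a numeric comparison as in the demo. Everything here is an assumption made for illustration: the table contents, the `toy_encrypt` and `toy_decrypt` helpers, and the insecure SHA-256 keystream. It does not reflect how Baffle's Postgres extension works internally.

```python
import hashlib
import sqlite3

KEY = b"hypothetical-demo-key"  # assumption: a single fixed key, for the sketch only


def _keystream(length: int) -> bytes:
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(KEY + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(out[:length])


def toy_encrypt(text: str) -> str:
    """Toy, insecure encryption: XOR with a keystream, stored as hex text."""
    data = text.encode()
    return bytes(a ^ b for a, b in zip(data, _keystream(len(data)))).hex()


def toy_decrypt(token: str) -> str:
    data = bytes.fromhex(token)
    return bytes(a ^ b for a, b in zip(data, _keystream(len(data)))).decode()


conn = sqlite3.connect(":memory:")
# Stand-in for the "extension": decryption is callable from inside SQL.
conn.create_function("toy_decrypt", 1, toy_decrypt)

conn.execute("CREATE TABLE transactions (name TEXT, quantity TEXT)")
sample_rows = [("Dr. Adams", 3), ("Dana Lee", 0), ("Sam Hill", 5)]  # made-up data
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [(toy_encrypt(name), toy_encrypt(str(qty))) for name, qty in sample_rows],
)

# What direct access to the table sees: opaque hex strings.
stored = conn.execute("SELECT name, quantity FROM transactions").fetchall()

# Wildcard search plus a numeric comparison, evaluated without exporting the table.
matches = conn.execute(
    "SELECT toy_decrypt(name) FROM transactions "
    "WHERE toy_decrypt(name) LIKE 'D%' "
    "AND CAST(toy_decrypt(quantity) AS INTEGER) > 0"
).fetchall()
print(matches)
```

Only the rows that satisfy the filter come back to the caller; the rest of the table stays in its stored, protected form.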
 
09:57
And if they didn’t have this, again, what would a company be doing? Extracting all the data and then doing all of the queries?
 
10:02
First of all, let’s go back to the implementation, right? So the implementation would require code changes to the application. I’m mimicking an application here. This is DBeaver, an off-the-shelf, open-source database browser. I didn’t have access to the source code. There was no way I could change it, but I can still get access to that encrypted data. So that’s step one, no code changes. The alternative is the do-it-yourself route, where you can get an SDK and do it yourself. The second thing is that I didn’t have to manage any keys manually. It’s all done in the control path, and automatically the key associated with a column encrypts the data or decrypts it. When it comes to our QE, queryable encryption, where processing is required, they can actually do it at the database tier as well without any manual intervention.
 
10:47
So the other thing that is very interesting is, because of the fact that we are in the data path, and we can actually see every transaction going back and forth, we have some very interesting analytics on this particular database. So now what we’re going to do is we’re going to go in there and actually look at what the proxy is seeing, which is the fact that it can see all of this wonderful data associated with open connections, maximum open connections, transaction counts, process CPU usage, heap usage, all of this. We can see performance as well. We can see how quickly these particular queries can run on it.
 
We can see things like proxy errors that have happened within a certain amount of time, the last hour, for example, database errors, unauthorized accesses, and whether there were any privilege changes. And then last but not least, we’ve just introduced this capability where I can type in anything I want. A requirement to be on the show is that you have to have some kind of generative AI feature, apparently. Well, we have been ready for it, because one of the other things that I did not mention early on, since we talked more about databases, is that, as you can see on the left here, we have a data proxy and an API service as part of our product portfolio. It comes together really nicely when it comes to AI, because in the AI context, it’s a lot of unstructured data. Data coming in is usually unstructured, so you need a data proxy, something that sits in front of S3, and then it gets analyzed in a database, in a vector database. That’s where the database proxy comes in. So we have that end-to-end solution. We are the only ones who actually encrypt data going in and then process it and then mask it coming out. Most of the other DLP players will only mask data coming out. This is a true end-to-end solution. So you can type in any question here. Everything is processed locally; it is not sent to the LLM at all. We do use the LLM, but the LLM is just used to query our own database that has all of this information, okay?
 
13:00
Oh, yeah, I was waiting for the answer, but it’s already up there.
 
13:04
Again, it’s natural language processing. We’re just starting to roll this capability out, and it’s gotten our customers very excited.
 
13:16
This is very, very, very cool. And I figured that there’s a lot of little mini wizards inside your system that are producing all of this. Obviously, you’ve got a lot of other features that you could talk about. Where can people go for more information on Baffle?
 
13:32
Well, first of all, they can reach me anytime at ameeshd@baffle.io. Would love to hear from anybody who’s listening to this, and would love to tell them about the journey. But info@baffle.io or our website, baffle.io, has a lot of data, a lot of information out there, a lot of resources that you can see, customer journeys that we’ve been through, and it would be great for them to request a demo and go from there.
 
13:52
Do you offer free trials or free versions or anything like that?
 
13:32
Absolutely. Once the demo is done, we typically set them up with what we call a workshop so they can actually use the product in AWS, get a real feel for it, and, you know, and go from there.
 
14:08
All right, cool. Ameesh Divatia, thanks again for being on the show. All right, that’s all the time we have for today’s episode. Be sure to like the video, subscribe to the channel and add any thoughts you have below. Join us every week for new episodes of DEMO. I’m Keith Shaw, thanks for watching.
