On this episode of the podcast we speak with Antti Rauhala, co-founder at Aito. Aito is a predictive database that you can spin up and put into production quickly. Aito replaces the current machine learning tools that have a steep learning curve and generate single-purpose models. It seems like it could be a great fit for a variety of RPA applications. We talk to Antti about his background, the genesis of Aito and where the company is headed.
machine learning, rpa, data, case, automation, process, invoice, intelligent automation, people, project, observation, create, business, problem, learning, talk, lead, provide, interesting, machine
Mark Percival, Antti Rauhala, Brent Sanders
Brent Sanders 00:03
On this episode of the podcast, we speak with Antti Rauhala, co-founder of Aito. Aito is a predictive database that you can spin up and put into production quickly. It replaces the current machine learning tools that have a steep learning curve and generate single-purpose models. It seems like it could be a great fit for a variety of RPA applications. We talk to Antti about his background, the genesis of Aito and where the company is headed. Thanks for listening. Before we dive into the company, Antti, maybe you could tell us a little bit about your background and how you got into this company?
Antti Rauhala 00:39
Well, first of all, I'm honored to be here, thank you for the invitation. Regarding my background, I spent some 10 years in data science consulting. I started the data science competence at ..., which is a Finnish software consulting company. I was really the first person doing it, and when I left for Aito we had one of the bigger data science teams in Finland. For quite a time there, and also before it, I have always been interested in, just absolutely fascinated by, AI. There were also some experiences at university, when trying to make a machine play games, which left me with a bit of disappointment and frustration with the narrowness of machine learning. It is like a pipe: you build one predictor for one purpose. And during my time there, there were a lot of different inspirations and observations, which led to the concept of this machine learning platform.
Brent Sanders 02:10
Excellent. So, 10 years; I look back at the calendar, and I'm assuming that means around 2010 or so. In that time period of specializing in data science, and I assume you mean machine learning... First of all, I may bastardize some of the terminology here, but from a high level, when you talk about data science, I connect it to machine learning, I connect it to AI. These are semi buzzwords, but I think our listeners get those. Over that time period, and just outside of RPA, have you seen the machine learning world evolve? Because it definitely seems like it's just exploding.
Antti Rauhala 02:55
Well, really, it has just enormously changed. Of course, I can mostly talk from the consulting perspective, and maybe about Europe and this particular tool to some degree. Ten years ago there was actually quite a lot of discussion, tons of interest, and maybe a bit of hype also. But when I was starting, it was kind of dead; there weren't a lot of things happening. There were just a few early adopters, who were often pretty junior in their organizations, who maybe could get small budgets to try new things. It was really in its infancy, and it has been really interesting to see how it developed over the years. I think there was development in the different use cases also. Obviously, when you have cases like recommendation systems, they can drive KPIs, they can drive business, and they can drive engagement. And if you drive engagement in places like media, you can pretty directly quantify it into money and revenue. I think these sorts of success stories enabled machine learning and data science to become this big thing in consulting and in businesses. So one thing was obviously that the tools developed, the technology developed, the machines developed, the use cases developed. But it was also interesting to see those early adopters get higher in their organizations after they had learned to use machine learning and ... apply machine learning. You would actually have customers a bit higher up, having bigger budgets, and being able to apply machine learning and data science in more impactful and bigger ways.
Brent Sanders 05:06
Sure, sure. I'd imagine, as you were ending the consulting part of this, it naturally seems like there's a place to productize, right? So walk me through how you started the company. Was the thinking, hey, we're really repeating the same engagement over and over, we could productize this or provide just an API or a service to do this work? What was the genesis of hopping over to the product side of this?
Antti Rauhala 05:38
Well, I think there were observations on different levels. One observation was just a kind of insider observation. There were a lot of customers, especially ... customers, who had this great desire and great interest, who really wanted some matching things, or sometimes recommendation systems, or some kind of smart automation, in different contexts. But often, when we started to add up the numbers and how much it would cost, they couldn't justify the business case. Obviously there were bigger parties, and they would actually have the real business case there. But the observation was that there seemed to be tons and tons of use cases that couldn't actually be done because of the economics, and I think this is also something which is very present .... So you actually have tons of ... and machine learning use cases, but the economics just don't work. Another observation was just technical. One project we did was adding personalization to a Lucene search engine, which is basically a database, and the observation was that, hey, this database actually contains statistics, pre-calculated statistics, and it can also be used to calculate statistics. So you could actually build a database which is not for searching but for predicting, which was interesting, especially having been a bit frustrated with the narrowness of machine learning before.
And then there was the promise that you could create a machine which could predict anything based on anything, and do it immediately. But at that time I didn't actually figure out what to do with that kind of system. It sounded really cool, and I created a prototype of it, which worked, but it didn't actually scale that well at the time. Maybe the missing point was just the observation that there are a lot of use cases which you can't actually do with normal data science and machine learning projects because of the economics. The other side was that you could create, on top of this kind of machine learning system, an interface which would be SQL-like, so you could query the unknowns with a simple query in the same way as you'd use a normal database query to query the knowns. So you could just ask: okay, I have, let's say, an invoice here. I don't know who is going to review it, I don't know the costs, I don't know the budget, I don't know the tax costs, these kinds of things, but I can actually ask for these unknowns. And you could make a system which could really just provide you with this information. Obviously, it's a completely different workflow, and I'd like to claim it is transformative in a couple of ways. One transformative thing is just the ease: anyone who can use SQL and can use a database can use this kind of system.
But obviously the other side is the workflow and the economics. Normally you have a data science project: you need a data scientist from a data science team or consultants, you get the project approved, get access, get the data, clean the data, post-process the data, do feature engineering, do model fitting, try different models, deploy the model behind an interface, and only finally get to test it in the real world. That can take weeks or months, and it can cost a hundred thousand euros. Instead of that, you just get the results instantly. That seemed transformative, and it seemed a really, really fascinating idea. And that was kind of the birth of Aito.
Brent Sanders 09:51
Sure. Yeah, that's a great answer. Let me know if I have this right, but I think what you're saying is that, on a particular project, building an engine like this is going to eclipse any value you're going to deliver, even if you can bundle a handful of projects together. I feel like there are players in this space, talking about Intelligent Automation, that are focusing on purpose-built tools, right? They're building the logistics, or healthcare, or very specific accounting and payroll tools, and they use machine learning or Intelligent Automation for that because they have the context and can spread that cost out to 20 or 30 clients. The really interesting thing, from what I've learned from the website, and I know very little, is that it doesn't sound like this is machine learning in the sense that there are models. You mentioned a SQL-like interface. I think some of our listeners are technical, but even for the semi non-technical listeners, could you explain the difference between your approach and traditional learning models?
Antti Rauhala 11:08
Well, this is a pretty good observation, and this is one of the main things in what it provides and why it is so different and unique. There are the technical terms. There is eager machine learning, which is what is normally used, the so-called model-based approach. What you do is that beforehand you take all the data and you create this ... model, which can serve, let's say, a very specific kind of area within certain parameters. It's a bit like a pipe: you can put a certain kind of data in, and it's going to provide exactly one kind of computation out. And then there's this concept of lazy learning, or what we have sometimes called ad hoc modeling. The basic idea is that instead of spending 10 minutes or hours or something like that to build one single model, you take the query and you optimize the underlying statistics to be crazy fast. You make it possible to create a query-specific model on the spot, on a millisecond scale, by calculating thousands of statistics in that kind of time. If you can pull that through, like Aito is doing, you can create the model on the spot and use it to provide predictions and .... So there is a kind of technical, let's say, magic under the hood that is needed to make it happen. It is not an easy thing to do, and it actually took quite a lot of technical observations and analytic discoveries to make it work.
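The ad hoc modeling idea described above, where statistics are computed when the query arrives rather than baked into a pre-built model, can be sketched in a few lines of Python. This is an illustration under stated assumptions and not Aito's implementation or API: the function name `predict_lazy`, the sample invoice rows, and the choice of naive Bayes scoring with add-one smoothing are all hypothetical.

```python
from collections import Counter

def predict_lazy(rows, where, target):
    """Rank values of `target` given the known fields in `where`.

    Hypothetical illustration: nothing is trained beforehand. All counts
    are taken from the stored rows at query time (naive Bayes scoring
    with add-one smoothing).
    """
    target_counts = Counter(row[target] for row in rows)
    total = sum(target_counts.values())
    scores = {}
    for value, count in target_counts.items():
        # Prior: how often this target value occurs overall.
        score = count / total
        matching = [r for r in rows if r[target] == value]
        for field, known in where.items():
            # Likelihood of the known field value given this target value.
            hits = sum(1 for r in matching if r.get(field) == known)
            distinct = len({r.get(field) for r in rows})
            score *= (hits + 1) / (count + distinct)
        scores[value] = score
    norm = sum(scores.values())
    ranked = [(value, score / norm) for value, score in scores.items()]
    return sorted(ranked, key=lambda item: item[1], reverse=True)

# Made-up historical invoices (sample data, not from the episode).
invoices = [
    {"vendor": "Parking Co", "amount_band": "low", "reviewer": "alice"},
    {"vendor": "Parking Co", "amount_band": "low", "reviewer": "alice"},
    {"vendor": "Parking Co", "amount_band": "high", "reviewer": "alice"},
    {"vendor": "Office Supplies", "amount_band": "low", "reviewer": "bob"},
    {"vendor": "Office Supplies", "amount_band": "high", "reviewer": "bob"},
]

# "I have an invoice from Parking Co; who is going to review it?"
ranking = predict_lazy(invoices, {"vendor": "Parking Co"}, "reviewer")
```

The same stored rows can answer a different question, for example predicting `vendor` from `reviewer`, without building a second model, which is the contrast with the "pipe" of a single-purpose model.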
Brent Sanders 13:02
It sounds like it's largely geared towards developers, which makes it tough, so people that are going to have a general understanding. And it sounds like, from a high level, your inputs are going to be some bits of data, right? I'm going to send something and I'm going to get something back. Usually when I've worked with similar types of APIs, you get some results and a degree of confidence around those results, or some measurement of how far you are from something. Is that similar to what I'd expect out of Aito?
Antti Rauhala 13:35
Yes, it is kind of similar. So it is ... tool. Obviously, you can put the data in like you would put it into a normal database, through very primitive, SQL-like interfaces. After you have put the data in, you can basically state the query. You state the table you're looking at and you state the knowns, let's say you have a new invoice coming from the parking company, and then you ask for predictions, for instance who would review this invoice. What you're going to get is different options. There may be, let's say, 100 people in the organization who could be reviewing, and these options are going to be ordered by the probability, basically, or the confidence value. If we have enough evidence, there may be a 95% probability that it is going to be the person who has been handling these kinds of invoices before. As such, this is a similar thing. Of course, you can also request explanations there, which is, I think, a must in any kind of machine learning solution today.
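One common way to act on options ordered by probability is to automate only when the top option is confident enough and to hand everything else to a person. A minimal sketch, assuming a hypothetical `route` helper and a 0.95 threshold that simply echoes the 95% figure mentioned above; neither is taken from Aito's product.

```python
def route(ranked_options, threshold=0.95):
    """Decide whether a prediction is confident enough to act on.

    `ranked_options` is a list of (value, probability) pairs sorted by
    probability, highest first. Returns ("auto", value) when the top
    option reaches the threshold, otherwise ("manual", None).
    """
    if not ranked_options:
        return ("manual", None)
    value, probability = ranked_options[0]
    if probability >= threshold:
        return ("auto", value)
    return ("manual", None)

# Sample inputs (made up): one confident ranking and one uncertain ranking.
confident = route([("alice", 0.97), ("bob", 0.03)])
uncertain = route([("alice", 0.60), ("bob", 0.40)])
```

Raising the threshold trades automation rate for fewer wrong automatic decisions; the right value depends on how costly an error is in the process at hand.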
Brent Sanders 14:53
I think one of the big things that we see in some of our practice, and that I can see in some of the examples on the site, is around matching data together, matching a vendor to an account. And what it sounds like is, you basically give us the known quantities, right? There are only so many vendors, there are only so many codes. If you can give those to us and tell us the relationships, we can then help you match that information. Is that about right?
Antti Rauhala 15:27
Yeah, that is right. You can basically do this kind of matching or completion. In one case that we did some time ago, the setting was that you have master data, and the master data contains media records of, let's say, media like The Simpsons, or media like Star Wars. Then you have certain parties providing reports about media usage, but the problem is that these are in different formats, or they may be in different languages, or they may be otherwise differently formatted. What Aito can do is combine statistical inference with search-engine-style matching. So it can of course recognize that, okay, here is The Simpsons and here is Simpsons, but it can also recognize that here is ..., which is ..., and here we have Star Wars, based on historical evidence. And this can be ..., similar kinds of things as in that case, or there may be different things, like what we did for ...: they have ..., furniture frames and housings and decorative doors, so a kind of matching and planning on top of the kitchen in order to create the shopping basket and automate the process. But the things matched could also be quite different, like, let's say, trucks and people, which have a lot in common but also a lot of different kinds of things, and even different languages. So it is a very fascinating domain, and I have heard of a lot of cases in process automation where this kind of matching could actually be used.
Brent Sanders 17:31
Sure, sure. So it sounds like you guys have been off to the races; the company has been running for about two and a half, three years or so. What got you into the automation or RPA industry? We were talking about this before the podcast, and obviously there's a geographic propensity there, but it seems like you could have gone any which way. It doesn't seem like automation is the only application this can be used for, but it does seem like one of the major ones.
Antti Rauhala 18:09
So that's actually an excellent question. The thing is really that we almost didn't pick automation; automation picked us. This happens in a lot of startups. We were actually looking at automation cases also; I have experience in Intelligent Automation, and it seemed like a nice domain to go into because it is actually a very rewarding and fairly easy domain for machine learning applications, and there are good reasons why. But the actual story was that we knew a guy who knew these people in the RPA consulting context ..., and they took great interest and were fascinated by Aito, and they invited us to talk there, and we kind of ran into each other. It was like three or four years ago that there was tons of hype about AI and RPA, but there wasn't a lot happening, at least in the Finnish landscape. There might be a select few projects here and there, but it just wasn't getting traction, and the diagnosis was basically simple. If you think about these as projects and ..., there are companies who are making an RPA automation every three weeks, or, with Posti I guess, the median RPA application takes about three months, but data science projects tend to be much longer and much more expensive. And the problem really is that in RPA it's all about costs and it's all about business benefit.
There's going to be some kind of maximum for the investment you can make. If you're looking for, let's say, a 100,000 euro saving and the project costs a hundred thousand euros, it doesn't make any sense to spend 100,000 euros to get a hundred thousand euros. As such, RPA is actually a very cost-sensitive domain. And when you have this very cost-sensitive domain meeting this actually pretty expensive technology, there's a lot of friction there. The other thing is that one of the causes of the friction is the complexity of the technology. We now have all the ML technologies, but they seem to be aimed at data scientists; they talk to them, and they are not that easy for anyone else. And then you have this portfolio of point solutions. So it's like a woman sitting. But these point solutions also have a complexity. If you would like to solve 10 different kinds of problems, let's say RPA problems, in, let's say, three-month projects, there's actually quite a lot of learning and overhead that you have to go through in order to figure out whether any given product actually fits the problem. There are expenses, there is the risk of the product not fitting the problem, and there's a lot of complexity, so it's a challenging thing. Obviously it can work, and it works for a lot of people, but the main point was that the current sort of tools were mostly either too expensive or too narrow for the general needs of RPA consultants. This is also something we have been hearing from a lot of RPA teams: the traditional way of doing machine learning just doesn't work in the RPA landscape.
For these reasons, these folks were really, really excited when they found out about our product, and they actually managed to make a proof of concept of RPA and machine learning in about five hours using a pipeline anti white bath, which was just mind-blowing for everyone involved.
Mark Percival 22:43
Yeah, I looked at a little bit of the customer stories, but you also recently posted, I think, a post on using it with Robot Framework, which is a really good explanation, if you're coming at it from a developer standpoint, of how easy it is to get started with this. I know Brent and I were looking at that recently. Maybe it'd be interesting to talk a little bit more about these customer stories, or just some example use cases where this has worked out really well. Obviously categorization seems to work out well with this. The other one is the lead qualification; I think that'd be an interesting one to talk a little bit about, if you've got some.
Antti Rauhala 23:25
Yeah, so, experiences on that one. Well, if I start with process automation, the best example is just Posti's invoices. I think in this case it is not just about the categorization; it is also about using machine learning to mimic an existing process. You basically have an expert, and you have this machine which is observing what the expert is doing. Once the expert has processed the invoice coming from the parking company, let's say, three times, after that it is pretty obvious how it's going to be processed. The machine can take over the process and do it with very small error rates, because these business processes tend to be very deterministic. If you can recognize a process which has been done a few times the same way, the likelihood is that it's going to be processed that way for a long, long time. There are really a lot of fundamentals behind why this kind of automation is so exciting from our perspective, and I think from everyone's perspective. The thing is that in these processes there tend to be a lot of strong patterns and a lot of repetition. Basically, if you want to figure out how much impact this kind of invoice automation could have, you're asking how much repetition you have in the process: can you find, for each individual case, a few examples from your existing data of how it has been processed properly, and how many of these cases actually do have historical examples? And it turns out that most of these cases typically have examples.
Especially with invoices, you can have 60% automation rates, you can even have 80% automation rates or higher, simply because if you take any new invoice, you can be pretty certain that this kind of thing has been seen before. Also, one of the promising things in these classification or mimicking problems is that businesses obviously keep very concrete records of the process and the process data, which means that the data is typically of good quality, and there's a very direct connection between the data and the actual process, which means that the investment is going to be low. And then there are still high volumes, especially in bigger companies. There were 5,000 to 10,000 invoices processed at Posti each month, and each of these purchase invoices is going to be taken into account and they're going to determine ...; it's going to go to the CEO or to different managers, and they're going to open different invoice systems and wonder what is happening. It can actually take quite a lot of time to get through the process. And if we are talking about 60%, 70%, 80% automation rates in a process which occurs 10,000 times a month and takes a lot of time, then we are obviously talking about huge impacts and simply very promising returns on investment. It was a pretty nice success in any case. Also, Posti's comment here was that while their average, or median, project takes three months, if they were to do the same or a similar project again, adding machine learning to the RPA might take just one month more.
I think this brings us to a very interesting comparison, because with normal RPA there tend to be several kinds of things where you don't want to use RPA, or where it leads to poor automation rates. Let's say you have something complex, like with invoices, where you may have thousands or tens of thousands of different categories. That is difficult to manage, so rule-based automation may take care of maybe 5% or 10% of the cases. And now suddenly you are getting to 60% or 80% of the mass. That is almost an order of magnitude, while you're just spending one more month to implement it. And I think there are a lot of these kinds of cases. Of course there are caveats, and it's not going to be that easy all the time, but there's something concrete and intriguing in these kinds of cases.
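The estimate described in this answer, checking how many cases already have a few historical examples of how they were processed, can be sketched as a replay over past records. This is an assumption-laden illustration: the function `automation_potential`, the `vendor` and `account` field names, the sample records, and the rule "at least three earlier examples, all handled the same way" are hypothetical, chosen to mirror the "let's say three times" remark.

```python
from collections import defaultdict

def automation_potential(history, key_field, label_field, min_examples=3):
    """Replay records in order and return the share that could have been
    handled from precedent: records whose key already had at least
    `min_examples` earlier records, all carrying one and the same label.
    """
    seen = defaultdict(list)  # key -> labels observed so far
    covered = 0
    for record in history:
        earlier = seen[record[key_field]]
        if len(earlier) >= min_examples and len(set(earlier)) == 1:
            covered += 1
        earlier.append(record[label_field])
    return covered / len(history) if history else 0.0

# Made-up history: 6 invoices from one vendor, 4 from another,
# each vendor always posted to the same account.
history = (
    [{"vendor": "Parking Co", "account": "4100"}] * 6
    + [{"vendor": "Cleaning Oy", "account": "4200"}] * 4
)
share = automation_potential(history, "vendor", "account")
```

On real data the share grows with volume and repetition, since the first few records of every vendor can never be covered; the sketch measures precedent coverage only, not whether the precedent would have been the correct decision.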
Mark Percival 28:44
Yeah, when you talk about numbers like that, the other thing I think is interesting about machine learning is that a lot of the concern can be: well, how will it get it wrong? How accurate will it be? But what people don't estimate is that if you have 80 people categorizing this, there's a lot of human error going into it as well. So in a lot of cases you're increasing not just the throughput but also the accuracy of the categorization and the tagging, versus a set of humans where, depending on the day and which human you get, you get different rates of accuracy. So I think that idea of categorizing this data is a really good use case. The other one I thought was interesting, and I think the actual term here was just lead qualification, is one I don't have a ton of insight into. I know ..., you know, we've definitely touched on the financial side, especially with invoices; that's probably one of the more commonplace areas to target with RPA. But lead qualification sounded interesting. I'd love to hear more about that.
Antti Rauhala 29:55
Well, this is something done by a partner in two weeks. They are providing this chatbot and automation platform, and this is likely a common problem for any kind of organization. The basic thing is that in sales, responsiveness and speed are extremely important things. I don't remember the statistics, but if you are slow to respond, the ability to move the lead forward actually drops pretty quickly, within something like one day. For this reason, the basic idea is that you have a process where people would normally take leads or have them assigned, and you record this information on the side; you have a chatbot interface in there to announce new leads. Once you have enough data, you can basically have the chatbot start to assign these to people based on different kinds of information: what is the history of the person in responding to different kinds of leads, what is the case, and, if you also have information about the lead itself, like metadata about the company, you can enrich the lead and feed that kind of information in as well.
Mark Percival 31:29
Yeah, that's an interesting use case. It's interesting mainly because, from my standpoint, RPA is very much about cost savings, but this is actually a revenue driver, which is different from what you typically see an RPA project start with.
Antti Rauhala 31:44
Yeah, this takes us to an additional thing. Machine learning can obviously be used to mimic an existing process, but another thing is that machine learning can also be used for optimizing things. That has been discussed, also around these kinds of use cases, though we haven't actually implemented this with customers just yet. But I think it is one place where you can really tap into the revenue side, because in a lot of organizations you basically have people arranging calling lists and using different kinds of mechanisms for contacting the potential customers. With machine learning, if you have a product, some list of possible customers, and some kind of history information, you can start to infer how likely the different possible customers would be to buy the offer. I think this is interesting also in the sense that Aito is kind of a statistical machine: it has machine learning and it can provide inferences, but it can also discover statistical connections. These connections can be interesting in themselves, so you could basically figure out what drives sales and which people are likely to buy certain products, and get good customer insights. And then there are areas like price optimization, different areas where you can actually tap into revenue.
Mark Percival 33:33
If people wanted to get started with Aito, I know I read through this document on Robot Framework, but I know you guys are early so I don't want to put too much pressure on. What's the best way to jump in?
Antti Rauhala 33:47
Well, I think one way is to just come to our website. We have free trials, and I think our customer success in ... has been doing just great in creating documentation. We also have documentation for quite a lot of different platforms, also for Robot Framework, prism actually provided audio either in the gray zone, which was prompted from our side, we have popped up a lot. I think we have more than half a dozen different platforms, and of course Aito works also from .... Python is one of the major platforms for doing RPA, and this is obviously something we can do.
Mark Percival 34:37
Great. I know you guys are hiring, I think?
Antti Rauhala 34:42
We are hiring, and this is something which could also be interesting for your business. So we are in for RPA says in China, in Europe and obviously. We are basically creating a new market on the RPA side, and I think this is absolutely exciting. I believe it could also be a really exciting role, to open up a few new markets and new frontiers in Dallas. Absolutely. Yep.
Mark Percival 35:27
For the audience listening, just to be clear, it's Aito.ai, in case you're trying to figure out how to spell it. Brent, do you have any other questions?
Brent Sanders 35:39
No, no, this has been great. It's been really fun to dip into the lineage, the history, and some of the use cases. I'm excited to dive in. I always debate, when we have somebody on who's got a product: should we dive into it and use it first, or should we get the rundown, so we can have you back on after we've had a chance to play with it and ask more specific questions? So definitely tune into future episodes; I would be willing to bet we'll get our hands dirty with it.
Mark Percival 36:15
Yeah, there might be a project coming up soon that we would.
Brent Sanders 36:19
That's right. Yeah. I think for anybody who's got an RPA practice, or even just a technology practice, once you know there's a tool out there, everything starts to look like a great use case for it. So I'm excited to dive in. We'll check out the trial and probably put it through its paces on some things that we have going on and see how it fares. So we may be speaking with your customer success people pretty quickly here.
Antti Rauhala 36:50
That would be splendid, I have to say. I was absolutely honored to be here, and it was a pleasure to talk with you guys. It would also be just amazing to come back and continue the discussion.
Mark Percival 37:08
Yeah. Thanks for taking the time to talk with us. Thanks
Brent Sanders 37:12