Mark and Brent introduce the It's Automic podcast and discuss what they see as the 12-factor bot. For more info on our 12 steps check out this blog post
Automic’s 12 Steps to Better RPA:
- Do you monitor your RPA process with a platform tool?
- Are your RPA logs stored in a centralized, searchable location (e.g. Splunk)?
- Is your escalation process integrated with your existing IT ticketing system?
- Are you using software to triage bot errors/failures?
- Do you have a process to assign IT support resources to outages?
- Do you have a process to communicate outages?
- Do all your bots have an SLA, and is it monitored?
- Do you have an up-to-date business continuity plan (BCP) for your RPA?
- Do you have a documented process for any changes to automation?
- Are you tracking upgrades that may affect automation?
- Do you have development and testing environments that match production?
- Do you have a dashboard to report SLAs, KPIs and other metrics to stakeholders?
12 Factors for Great RPA
bots, rpa, automation, people, software, number, run, best practices, steps, logs, expected, software development, process, working, developers, year, problem, factors, fails, pager duty
Mark Percival, Brent Sanders
Mark Percival 00:06
Hey there, welcome to the It's Automic podcast. I'm Mark Percival, and I'm going to be talking to you about some RPA along with
Brent Sanders 00:12
Mark Percival 00:14
And I think it'd be useful for us to quickly give a background. From my standpoint, my background is as a software developer. I have a degree in accounting, though, and spent some time in accounting, and then spent a lot of time on the software development side of things, working for a variety of companies in the Bay Area. And I've since then become kind of enamored with what's going on in the automation space.
Brent Sanders 00:31
And I have a similar background, I'm a self-taught software engineer, I've been working for a little over 15 years in the business and the past five, working in automation. I've worked on a wide variety of projects from, you know, big insurance to manufacturing. And I've just also become enamored with the life cycle of engineering and businesses and how they kind of interplay.
Mark Percival 00:56
Yeah, I think if you've spent any time in software, you've kind of come across older software, or software that even you wrote some time ago. And it's easy to kind of feel like, oh, I don't actually remember what this did. And I think, you know, as we start to get more into the automation space, it's going to become clear that a lot of this stuff becomes sort of the next set of legacy software.
Brent Sanders 01:15
So one thing that we look at in automation, and specifically, we'll use the term bots, but, you know, automation programs is a long-winded term for bots. These are robots or scripts that go out and do, you know, menial activities that people have been doing, and they're meant to save time. But bots are inherently brittle. And we know they're going to fail, because they use all these environmental services. They use, you know, web browsers that automatically update. They use websites that also change their UI. They also use credentials at times that expire. Inherently, these are things that are going to break. And I'm, you know, really interested in figuring out, how do we improve the world of automation? And specifically, you know, the common term now is RPA, so how do we look at improving the world of RPA, and bringing it forward to where sort of modern software engineering is, and where modern best practices live. And I think that's a lot of what we're going to talk about, at least in this first episode, and ongoing in the podcast.
Mark Percival 02:26
So, your bots are live now. It's been, say, a year, you've expected some cost savings from those bots. And, you know, let's see where we're at. Now, I think it's important to kind of look at this from that post-implementation follow-up year. In that first year, you had those savings, but you also had the implementation costs. But the hope is, in that follow-up year, and in the following years after that, you're going to save a lot of money, and you're not going to have that implementation cost. I think there's an assumption that you're just gonna be paying a license fee, and maybe some support. But as we just talked about, these bots are so brittle that sometimes it's easy to miss or underestimate exactly how much time you're gonna spend on support.
Brent Sanders 03:02
Yeah, I think one of the key things that you see in web development or, you know, mobile development is when context is fresh, when you have a team actively working on a project, it's really easy to make changes. Well, I shouldn't say really easy, it's easier to make changes. You have development resources on hand, you may have a manager that's leading a group, and the team's still intact. You know, people that wrote the code a year ago, they're still there. And they remember why they've changed things. And, you know, a year later, it's pretty easy to remember what the decisions were, why they were made, and, you know, why we're in the current state. And it's also usually pretty well recorded. You know, you have a backlog, maybe that has a decision document on hand. And if you're following some or even most best practices, you can look forward and make adjustments to a changing environment. So, for example, you look at, you know, mobile development, and, you know, iOS versions, and hey, we're gonna go from 10 to 11 to 12. And you can handle those changes. But in automation, a lot of the development is released, and the team sort of mostly evaporates, right? The developers are generally gone, they're not needed for a year. The managers are not really needed, because the bot's operating and working as expected. And there's likely one person that was involved in the development process that keeps an eye on things, but as things work for a year straight, there's not a lot of memory or sort of cognitive energy going back into the process.
Mark Percival 04:42
Yeah, I think that's the critical difference between this and a lot of other software development: in ongoing software development, you have that context the entire time. And so maybe it's a year later, but you have two or three developers that are still familiar with the code base because they're working in it day to day. The dangerous code bases are always the ones that have been left to sort of languish. And bots, by default, that's sort of how they're designed to be programmed, right? You build it one day, you launch it, say, next month, and then there's, you know, possibly 12, 16 months where you're not expected to touch it, maybe for just small changes, like a credential update. Right?
Brent Sanders 05:13
Right. If you did everything right, you don't touch the bot for a year.
Mark Percival 05:16
Yeah. Which is frightening. Because if you did everything right, it also means you lose all that context. So, in some ways, you know, doing everything right actually puts you in a position where you might not have that same person working on the team. Maybe the person that implemented it was a contractor, and they're completely gone. And so that's where some things like, you know, documentation, and having these up-to-date docs, is critical.
Brent Sanders 05:36
One thing that, like, we can all assume is that there's a pretty good process with most automation practices of putting together some documentation around, you know, what the process is: a process design document, and a software design document. So, we should know, a year later, you know, what the bot was intended to do, how it should be implemented, and what steps it's taking, and most of the time those things line up. So, if you kind of fast forward, and look, a year later, you're cracking open, you know, some year-old documentation. Some members of the team may still be there. You know, the key stakeholders, who the bot is servicing, hopefully are still there a year later, but, you know, they may have changed. And it's kind of now up to you to figure out, okay, well, it didn't run. What do we do now? You know, how do we handle this situation? So, it could be as simple as, hey, this was tied to a user's credentials, and those credentials expired, or it's using a service and the button moved, and we just need to update that. And so, depending on who you are in the organization, those things can be easy to identify, or actually very difficult. And I think that's the beginning of this bot problem.
Mark Percival 06:58
Yeah, I think this goes back to the estimating of, you know, what's the full-time equivalent required to manage your workforce. If you have these constantly brittle bots breaking, and you don't have these processes in place, then obviously, you know, the cost savings that you're going to see in the following years are going to be much more constrained, because you'll be spending a lot more on support.
Brent Sanders 07:18
Right. You're talking about, you know, full-time engineers, right? People that are on staff, in some ratio to the number of bots. So, let's say you're running six bots, and you have, you know, one or two people that are full-time monitoring the bots, and it's a weird situation, right? Because, again, if the bots are working as intended, these people are mostly idle, and they're kind of just there as a safety net. And yeah, it doesn't sound like a fun job. First of all, like, I wouldn't want to take that role.
Mark Percival 07:50
But I think it's why a lot of it gets outsourced. Right, right. You know, you have somebody who can be spread out more, so they can be taking jobs from different clients. And the hope is that, yeah, they're not going to be as needed. But at the same time, that puts you in a position where you have to make sure that your bots are capable of being serviced by somebody who's coming in with less context.
Brent Sanders 08:10
Right, right. So, you know, I think going back to why we're doing this podcast, what we're most concerned with in this episode is, like, most companies that aren't technical, who really are the right target for automation and RPA. This podcast, and a lot of our messaging overall, is not intended for companies that already have, like, an engineering culture. But make no mistake, if you're writing these bots and putting them into production, it's kind of expected that you do have some of the aspects of an engineering culture.
Mark Percival 08:48
And so, I'd want to go into something there. I think a lot of people, when they hear engineering culture, might not know what that means. Yeah. And I think when you come from a software development background, a lot of times when we say engineering culture, we're talking about best practices in engineering. And if you've had any time in software engineering, you've kind of seen the rise of the best practices that today are taken for granted. So, things like, you know, version control software, which started with Subversion, and then went to Git. Things like testing, unit testing. Those were not common practices, you know, 20 years ago, like they are today. Any of the tools that developers, I think, today definitely take for granted, but that are expected in an organization that has a good engineering culture, these are all the factors that you would typically see in that culture, that organization.
Brent Sanders 09:37
Yeah, that's right.
Mark Percival 09:38
So, if we were to go through, I guess, you know, what those look like for the 12 factors for RPA. You know, when we look at this, to give some history on this, Joel Spolsky, who is a Microsoft engineer, came up with 12 factors that made for great software. And those 12 factors were just simple things, going back to the testing and using a version control system. And for us, it was, well, let's look at this as, what are the 12 factors for doing RPA well?
Brent Sanders 10:10
Yeah, and also, you know, I think there's also the 12-factor app from Adam Wiggins. That was, like, another big one. Right, I think that, you know, the Joel Test was 12 steps to better software, and then the 12-factor app, which was huge, right, this was, you know, coming.
Mark Percival 10:30
This is like, 2000 versus 2010.
Brent Sanders 10:32
Right. Right. And so, in looking at these frameworks for, like, how do you think about best practices, you know, I think that there's a need for this in the automation space, specifically around RPA, specifically around sort of these tools geared towards, quote unquote, citizen developers, or turning accountants or bookkeepers into engineers, roughly, right, or coders, whatever term you want to use. You're gonna have somewhat non-technical people writing software through tools. And, but
Mark Percival 11:06
And there just isn't a best practices playbook, right? In a lot of cases, I mean, I think that's the hardest part: if you're coming at this from a low-code angle, or you're saying, you know, you're a citizen developer, it's, well, what are the best practices? There isn't really a guide out there that says, this is, like, the 12 steps of making, you know, a great RPA bot. It's very much still early on in RPA.
Brent Sanders 11:27
Yeah, and I think the good news is, like, we've put our heads together to put together, like, our own set of, sort of, 12 points. And a lot of them aren't necessarily like, hey, you need to be a better developer, you need to think about this in a more complex way, or you need to, you know, author your bots in a certain way. It's more around process, expectations, and, like, anticipating the next sort of phase of this, which is what we're most concerned with, right: a year later, like, what would you want to inherit?
Mark Percival 11:59
Exactly. And I think the same can be said for software engineering, which is, it's not necessarily that the code can be perfect the first time. It's that it's in a version control system, and I can see when it was put in place, and I can go back and see who did it, and that there was a test written for it. And it might not be the code that ultimately is going to be, you know, the ideal product, but it at least gives me a starting point. And so, I think it's the same thing with RPA, which is, it's never going to be that the bots can be perfect. But can you put it together in a way that will allow you to go back and triage what went wrong more quickly?
Brent Sanders 12:30
Right. Right. So, we should dive into the 12 steps that we see, you know, It's Automic's 12 steps to better RPA.
Mark Percival 12:39
Because we had to come up with our own.
Brent Sanders 12:41
Yes, exactly. You have to have 12.
Mark Percival 12:43
It's 2000, 2010, and now 2020. Yeah, we've moved beyond just great software as a service and great software.
Brent Sanders 12:51
So, the first step, I'm just gonna dive into the first step, is around monitoring. Like, do you monitor your RPA process with a platform or tool? Like, do you have something beyond, I think, there's usually control centers, command centers, I mean, UiPath Orchestrator, there's something that's going to tell you whether it runs or if it fails, like, there's a success or failure. But I think there is an element of, like, maybe even a step beyond that. Like, do you have a way of understanding, as it relates to the business, are your bots working? Do you have a way to know, right now, you know, has everything run? But also, are the outputs correct? Are you getting,
Mark Percival 13:35
It's more than just the bot running. It's actually that the bot ran and put the right thing out. And I think there's also the aspect of, you know, are they running at the correct times? Have they run? You know, there's one option, which is you've monitored, and you know the bot's failed. And there's another, more dangerous one, I think, from the software side, which is, I had a process that was supposed to run once a week, but I haven't heard from it in six months, and I don't know to check. And that famously manifests itself as, like, a database backup, right?
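To make the "I haven't heard from it in six months" case concrete, here is a minimal sketch of a staleness check. It is an illustration, not something from the episode: the bot names, intervals, and the `find_stale_bots` helper are all hypothetical, and it assumes each bot records the time of its last successful run somewhere you can read.

```python
from datetime import datetime, timedelta

def find_stale_bots(last_success, expected_interval, now, grace=timedelta(hours=1)):
    """Return names of bots that have not succeeded within their expected interval."""
    stale = []
    for name, interval in expected_interval.items():
        last = last_success.get(name)  # None means the bot has never reported a success
        if last is None or now - last > interval + grace:
            stale.append(name)
    return sorted(stale)

# Hypothetical schedule and run history, for illustration only.
now = datetime(2020, 6, 1, 9, 0)
expected_interval = {
    "weekly_backup": timedelta(days=7),
    "daily_invoices": timedelta(days=1),
}
last_success = {
    "weekly_backup": datetime(2019, 12, 1, 2, 0),   # silent for six months
    "daily_invoices": datetime(2020, 6, 1, 1, 0),   # ran eight hours ago
}

print(find_stale_bots(last_success, expected_interval, now))  # ['weekly_backup']
```

The point of the design is that the check alerts on silence, not only on a reported failure, so a bot that quietly stopped being scheduled still shows up.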
Brent Sanders 14:06
The second one, number two, along with that monitoring, is, do you have some sort of logging? So, number two: are your logs stored in a centralized, searchable location? You know, we saw this in the early days. I mean, I experienced this early in my career on the web, where, you know, logs were on the server, and they weren't stored in a centralized place. And, you know, the server goes down, and I can't get to the logs. Or, yeah, right, I hate to admit it, but it's probably happened to everyone, you know, the logs were the reason the server ran out of space. Right? You know, the modern way is sending your logs to something like Papertrail or Splunk, or, you know, a variety of different tools. That seems like a no-brainer for RPA: the logs should be sent somewhere where everyone can get their hands on them and that, you know, doesn't put your current infrastructure at risk.
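One common way to make bot logs searchable once they reach a central tool is to emit each entry as a single JSON object. The sketch below uses only Python's standard `logging` module and writes to an in-memory stream; the field names (`bot`, `step`) and the logger name are assumptions for illustration, and actually shipping the lines to a service such as Splunk is left out.

```python
import io
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON line with a few searchable fields."""
    def format(self, record):
        payload = {
            "level": record.levelname,
            "bot": getattr(record, "bot", "unknown"),   # hypothetical field set via `extra`
            "step": getattr(record, "step", None),      # hypothetical field set via `extra`
            "message": record.getMessage(),
        }
        return json.dumps(payload)

def build_bot_logger(stream):
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger("rpa.bots")
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger

stream = io.StringIO()  # stands in for whatever forwards logs to the central store
log = build_bot_logger(stream)
log.error("Login button not found", extra={"bot": "payroll_report", "step": "login"})
line = stream.getvalue().strip()
print(line)
```

Because every line carries the bot name and the step, someone with no context on the bot can filter the central store by `bot` and see which step failed.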
Mark Percival 14:58
Yeah. The next one kind of goes into all of this, which is the error escalation process. And in software, we've seen services, things like PagerDuty, companies that are focused on that escalation process. But in general, it's, who do you notify when something goes down or something doesn't work?
Brent Sanders 15:14
Yeah. Yeah. And furthermore, and PagerDuty, I think, does a great job of this, is acknowledgement. Like, is there a process in place for, like, hey, I emailed the developer, I haven't heard back, and the amount of anxiety and stress that creates. So, developing an actual, like, communication plan. It's like, okay, I received your note. And knowing to respond back: I'm looking into it, I will get back to you within this time frame.
Mark Percival 15:38
Yep. And if I don't get back to you, within a time frame, it's going to ping everybody again. Yeah. And they do that really well. And that kind of goes into the number four, which is the triage process, and actually getting that data. And that's a combination of, you know, the acknowledgement, who's actually going to be doing what, and then the logging, and monitoring and seeing what you can pull out of those logs that will give you some kind of clue as to where to start with fixing this bot.
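The acknowledge-or-page-again behavior described here can be sketched as a small pure function. This is not PagerDuty's API; the `notifications_due` helper, the 30-minute window, and the escalation chain are all assumptions chosen for illustration, and escalating to the next contact on each missed window is just one possible policy.

```python
from datetime import datetime, timedelta

def notifications_due(raised_at, acked_at, now, contacts, ack_window=timedelta(minutes=30)):
    """Return (time, contact) pairs for every page sent until the alert is acknowledged.

    If the alert was never acknowledged (acked_at is None), pages continue up to `now`.
    """
    pages = []
    end = acked_at if acked_at is not None else now
    t = raised_at
    level = 0
    while t <= end:
        # Each missed acknowledgement window escalates one step; the last contact keeps being paged.
        contact = contacts[min(level, len(contacts) - 1)]
        pages.append((t, contact))
        t += ack_window
        level += 1
    return pages

# Hypothetical escalation chain and timeline.
contacts = ["on-call developer", "RPA lead", "IT manager"]
raised = datetime(2020, 6, 1, 9, 0)
acked = datetime(2020, 6, 1, 10, 10)
pages = notifications_due(raised, acked, datetime(2020, 6, 1, 12, 0), contacts)
for when, who in pages:
    print(when.strftime("%H:%M"), who)
```

In this example the alert is paged at 09:00, 09:30, and 10:00, and stops once it is acknowledged at 10:10.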
Brent Sanders 16:01
Again, you know, looking at this with the lens of, we haven't heard anything from this bot in a year. You know, a year on, it could be you have somebody who had nothing to do with the project who gets some sort of notification. Ideally, they get a notification. Most likely, if none of these steps are in place, the business will be telling you, hey, something's wrong with, you know, payroll, invoicing, something terrible that would keep your VP up at night. It's like, where do you even start? So, having a starting point and having logging or some sort of failure, I hate to call it this, but like a stack trace, some, like, understanding of what were the things that led to this failing. And, you know, a lot of RPA platforms can give you a basic idea of this out of the box. Like, going back to UiPath, you probably will get some form of error that would indicate, okay, I can't find a button, or these credentials aren't working. But a lot of the time, that's not really apparent.
Mark Percival 17:02
Yeah, and then the other issue, which is, if you have an outsourced team, this is even more critical: you have to be able to get them all that data. So, it means you have a logging platform, and they need to be able to get to it, which is not necessarily always that clear. So, the next one is technical resource planning. This one is a really tough one, because I think there's more art to this than math and science. This is really trying to figure out just what you are going to require in terms of resources, full-time equivalent employees, to actually manage these bots. Is it one employee for, you know, 10 bots? Is it one for 20? This really is where I think a lot of the guesswork is obviously coming in. But at the same time, it's a critical piece that you have to get right if you want to estimate exactly which projects to target. Because if you don't know what this looks like, then it's hard to know the return on investment.
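The effect of the bots-per-employee ratio on return on investment can be shown with simple arithmetic. Every number below is invented for illustration (savings per bot, license cost, and salary are assumptions, not figures from the episode), and `net_annual_savings` is a hypothetical helper.

```python
def net_annual_savings(bots, gross_savings_per_bot, license_per_bot, bots_per_fte, fte_cost):
    """Yearly savings after license fees and the support staff implied by the bot-to-FTE ratio."""
    support_ftes = bots / bots_per_fte          # fractional FTEs are allowed (shared or outsourced staff)
    support_cost = support_ftes * fte_cost
    return bots * (gross_savings_per_bot - license_per_bot) - support_cost

# Illustrative inputs only: 20 bots, each saving 30,000 a year and costing 8,000 in licenses,
# supported by staff costing 100,000 per full-time equivalent.
for ratio in (10, 20):
    print(ratio, net_annual_savings(20, 30_000, 8_000, ratio, 100_000))
```

With these made-up inputs, moving from one support person per 10 bots to one per 20 changes the yearly result from 240,000 to 340,000, which is why the ratio matters when choosing the next projects.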
Brent Sanders 17:58
Yeah, I mean, having a process to assign some sort of support resource to an outage needs to be evaluated on a bot-by-bot basis. And that goes along with usually, you know, establishing some sort of SLA, which we'll talk about later. But, you know, all these points work together, and they all get to this point of, like, is there a plan? Is there a plan in place? And so, I think, you know, great, you've identified, I shouldn't say great, unfortunately, a bot has gone down. You know, what is the process to get somebody on it? And if you are lucky enough to be part of an organization that has, you know, resources allocated to each bot, that's wonderful. If not, you know, how do you peel somebody off of another project? What type of criteria are you using to determine how long before that person starts working on a resolution?
Mark Percival 18:54
And the next one is, I think, one that a lot of people overlook, which is communicating failures to the stakeholders: having a plan for notifying not who's responsible for fixing the bot, but who's gonna be impacted by it, which is just as critical. If you're expecting something to run, if you're expecting a payroll report to run, or something to do with, you know, closing the books at the end of the month, it failing is actually something that you need to be aware of quickly. And so, I would say most people with RPA have most of the best practices, but this is one that a lot of people forget, because it's just an easy one to forget. It's easy to think about fixing it, but not to think about, well, who needs to be notified of the fact that it's in failure mode?
Brent Sanders 19:41
Yeah, I mean, especially if you, you know, let's say you've gone ahead and you've defined, you know, a parameter of, like, here's how long this bot can be down for, you know, we have a number of hours before it needs to be fixed. You know, you define some sort of SLA, which, again, we'll talk about further, but you define within that parameter, like, okay, I have 24 hours, let's say, to fix, you know, a general ledger entry bot. And within that 24 hours, what gets communicated may be up to the communication style of the person that's received that ticket or notification. It's best to just define this stuff and take any ambiguity out of it. And so if you can define that ahead of time, say, the minute this goes down, you know, so-and-so, an accountant, needs to know. They just need to be informed, because there may be some sort of continuity plan that is put in place, way down the line, if they can't get it fixed. But there just needs to be a defined communication strategy.
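Defining the outage window and the notification list ahead of time can be as simple as a small lookup table. The sketch below is an assumption-laden illustration: the `BOT_PLANS` structure, the bot name, and the contacts are hypothetical, and the 24-hour figure is just the example mentioned in the conversation.

```python
from datetime import datetime, timedelta

# Hypothetical per-bot plan: how long the bot may stay down, and who is told the minute it fails.
BOT_PLANS = {
    "general_ledger_entry": {
        "sla": timedelta(hours=24),
        "notify_on_failure": ["accounting team", "controller"],
    },
}

def sla_status(bot, failed_at, now, plans=BOT_PLANS):
    """Report who must be informed and how much of the SLA window is left."""
    plan = plans[bot]
    elapsed = now - failed_at
    remaining = plan["sla"] - elapsed
    return {
        "notify": plan["notify_on_failure"],
        "hours_down": elapsed.total_seconds() / 3600,
        "hours_remaining": remaining.total_seconds() / 3600,
        "breached": remaining < timedelta(0),
    }

status = sla_status(
    "general_ledger_entry",
    failed_at=datetime(2020, 6, 1, 8, 0),
    now=datetime(2020, 6, 2, 2, 0),
)
print(status)
```

Keeping the stakeholder list next to the SLA means the person who receives the ticket does not have to decide who to tell or when.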
Mark Percival 20:49
Number seven is the resolution planning for the bot. And this really goes back to, okay, so the bot's back up now, what is the resolution for getting it back into the process? So, has somebody else run the payroll for the month? You know, what does it look like, getting a bot back into production from the failure state? And that goes into number eight, which is business continuity planning. And that really is the larger piece of this, which is, okay, so if this bot fails, and something catastrophic happens, what is your business continuity plan for getting around this?
Brent Sanders 21:17
Yeah, this is a tricky thing in that, most of the time, when you first kick off this bot, you're going to have, you know, people that do this job, and you will have people on staff that know how to do it by hand. Fast forward a year later, you likely don't have those people on hand doing that task anymore, or they've forgotten, you know, what it takes, or potentially doing the task manually is going to be incredibly time-consuming. So ideally, it doesn't come to it. But you do need to have a plan in place, and it needs to stay updated. You know, if it doesn't stay updated, it may be completely out of date and unattainable, and you might not be able to actually execute on the BCP. So it's something that needs to be revisited on a quarterly basis, just kind of take a look at the budget. Do we still have, you know, a sane plan that can be executed on if we need to, you know, run a bunch of invoices manually, or, you know, make general ledger entries? Like, do we have credentials that are not expiring for a person to get access to a system manually? You know, it's worth checking all of the ways that this could potentially go wrong and coming back to them on a sort of timed basis.
Mark Percival 22:38
And number nine is the change request plan. So, I think this is something that's overlooked, which is, even when you have one of these bots and it's running well, there's always going to be some kind of change that happens in the future. Typically, that change is an underlying software upgrade, say, to the software the platform is running on. It could also be an upgrade to the software that it's interacting with. And it could also be a change to the functionality. But really, the key here is that you have a plan in place for how you manage those changes.
Brent Sanders 23:08
It's always going to change. The business changes, the environment changes. I think the main thing is you need a way for stakeholders, or even people that receive the sort of downstream effects of any automation, to be able to have a voice and propose changes. And so, what this could look like, it could be a review board, it could be, you know, a multitude of different things. But I think the number one piece is allowing people to request changes, you know, getting their voice heard, and you can evaluate the change request. And you can say no. Or it could be people that, you know, have a little knowledge of parts of the business that the bot uses, and, you know, maybe they understand that something is coming down the pike. Right. And that's where number 10 comes in, which is, do you have a plan for upcoming maintenance and environment changes? How do you handle that? And that kind of goes back to resourcing. Right, like, can you get someone back on this bot that maybe hasn't been touched in a year? And can they make the changes without breaking it?
Mark Percival 24:14
Yeah, and then some of the time that plan can just be delaying those changes. If the changes can be delayed for, say, a week while you have somebody prepare, they have time, let's say, in the next two weeks to kind of focus on this and make sure they basically shepherd it into production on these new environment changes, and that can be useful as well. Number 11 is, do you have a development and testing environment that matches production?
Brent Sanders 24:36
Yeah, I mean, this is a huge problem in IT. This is not exclusive to automation or RPA. This is a problem for every engineer in every enterprise. You are really hard-pressed to have production data or something like it. You know, complex relationships and the sheer amount of data in production is hard to replicate in a lot of tiers. In, you know, a lot of the testing tools that are available with, like, modern web development, with unit testing, there are a variety of tools where you can sort of create fixtures, or what do they call it when you use Factory Girl?
Mark Percival 25:19
Oh, yeah, it's like fixture data. I mean, to be fair though, this is a very early stage in RPA, and we're still not really seeing the same level. And a lot of times, just in general, this is tough, right, to match that level of, you know, the environment from production to testing. And this is where I think we're gonna see a lot of change in the future as it starts to get better, and as RPA starts to take, you know, more and more of these best practices to heart. Because right now, it is tough. If you're interacting with an external system, sometimes it's tough to make that external system into something that can be, you know, faked out as a staging instance, for example. Yeah.
Brent Sanders 25:55
Yeah, yeah. I mean, it's a problem. And I think if you can address it using, you know, something in your automation solution that gets you as close as possible, that's great. You know, unfortunately, yeah, just being transparent, I don't have a good solution for this. I don't think we even have a solution for this in the product.
Mark Percival 26:19
No. And the danger, even looking at it from a software development standpoint, is, going back, it's very easy to mock out and stub out things and then realize you've stubbed out so much that you're not testing anything. Right?
Brent Sanders 26:30
Yeah, it's like oh, yeah. And then when you change things, yeah.
Mark Percival 26:34
Yeah. And then, when things do change, you have to actually go and change the mocks and the stubbing, because you realize that it's not going to match the thing that's external. So, you kind of have this constant problem, and this is from the software developer standpoint, once you're dealing with APIs. APIs automatically throw kind of a wrench in the works, because it's like, oh, well, now I've got to mock this thing out that can actually change, and I don't know when it's gonna change.
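One partial guard against mocks drifting away from the real thing, on the software side, is to build the mock from the real client's interface. The sketch below uses Python's standard `unittest.mock` with `spec`; the `InvoiceApi` client and `is_paid` function are hypothetical stand-ins, and this only catches renamed or removed methods, not changes in what the external system returns.

```python
from unittest.mock import Mock

class InvoiceApi:
    """Hypothetical client for an external invoicing system."""
    def get_status(self, invoice_id):
        raise NotImplementedError("real network call omitted in this sketch")

def is_paid(api, invoice_id):
    """Logic under test: decide whether the bot can skip this invoice."""
    return api.get_status(invoice_id) == "PAID"

# spec=InvoiceApi limits the mock to attributes that exist on the real client class.
fake_api = Mock(spec=InvoiceApi)
fake_api.get_status.return_value = "PAID"
print(is_paid(fake_api, "INV-1"))  # True

try:
    fake_api.fetch_status("INV-1")  # not a method on InvoiceApi
except AttributeError:
    print("mock rejected a method the real client does not have")
```

If the real client later renames `get_status`, tests built this way fail loudly instead of continuing to pass against an outdated stub.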
Brent Sanders 26:56
Um, let's move on to the final. Final factor.
Mark Percival 27:00
Yeah, I think the final one is big. Number 12: do you have a dashboard to report SLAs, KPIs and other metrics to stakeholders?
Brent Sanders 27:08
I look at bots, or I should say, automations or bots, I mean, do you give them the same sort of level of attention that you would an employee, right? Are you checking in with them? Are you doing a regular, you know, touch point to make sure that they're getting their work done? And, you know, they're part of your workforce. I mean, they are bots, but, you know, thankfully, HR is a lot easier. It's just, you still need to ensure that you're getting what you expect out of them. I mean, there are license costs. Within that, you know, first year, let's say everything goes smoothly, there are still license costs. There's a lot involved in evaluating, are we getting what we expect out of this bot? Or is it breaking often?
Mark Percival 27:55
Yeah. And I think the other key point to that is, as you kind of see those KPIs change, you can reevaluate future targeted RPA projects, right? Because if you say, oh, well, support's actually costing me more than expected, then that's gonna make you reevaluate what exactly you're gonna start tackling next. Those are our 12 steps, or 12 factors, of, you know, best practices in automation.
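A stakeholder dashboard ultimately needs a few numbers per bot. As a minimal sketch, assuming run records with an `ok` flag and optional `support_hours` (both field names are hypothetical), the summary might be computed like this:

```python
def summarize_runs(runs):
    """Reduce a bot's run history to a few KPIs a dashboard could display."""
    total = len(runs)
    succeeded = sum(1 for run in runs if run["ok"])
    support_hours = sum(run.get("support_hours", 0) for run in runs)
    return {
        "runs": total,
        "success_rate": succeeded / total if total else 0.0,
        "support_hours": support_hours,
    }

# Hypothetical run history for one bot.
runs = [
    {"ok": True},
    {"ok": True},
    {"ok": False, "support_hours": 2.5},  # a failure that needed manual support
    {"ok": True},
]
print(summarize_runs(runs))
```

Tracking support hours alongside the success rate is what lets you notice that support is costing more than expected before you pick the next automation project.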
Brent Sanders 28:17
You know, as we progress, you'll hear a lot more from more real-world cases and experiences that people are having.
Mark Percival 28:24
Yeah, we're hoping to involve more people that have, you know, that real-world, hands-on experience in RPA that we meet in the course of our work, and invite them on the show and kind of discuss exactly what they see as the best practices and the areas that they see in RPA that need improvement. And of course, if you have any feedback on this list, feel free to email us firstname.lastname@example.org. Likewise, you can email Brent or me: Mark@itsautomic.com and Brent@itsautomic.com
Brent Sanders 28:48
Feel free to visit itsautomic.com and check out what we're up to in the automation space.
Mark Percival 28:53