The Fake News about Robots and Their Reliability
June 03, 2019      
Steve Sickler, Tend.AI

With industrial robot manufacturers claiming Mean Time Between Failure (MTBF) rates of 40,000 hours, 60,000 hours, 80,000 hours and even 100,000 hours, we are starting to see claims that “a new robot will run up to 43 years before it fails.” Wow! Sign us up for one of those. Sounds like you set up your new robot in a fancy new robot work cell and it starts welding or painting or grinding without failure until one of your grandchildren gives up gaming.
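For a rough sanity check on that 43-year figure, here is a minimal sketch that converts MTBF hours into calendar years. The two duty cycles (round-the-clock operation and a single 40-hour-a-week shift) are assumptions for illustration, not numbers from any robot manufacturer.

```python
# Convert a quoted MTBF (in operating hours) into calendar years.
# Both duty cycles below are illustrative assumptions.
CONTINUOUS_HOURS_PER_YEAR = 24 * 365      # 8,760 h: running around the clock
SINGLE_SHIFT_HOURS_PER_YEAR = 8 * 5 * 52  # 2,080 h: one 8-hour shift, 5 days a week


def mtbf_years(mtbf_hours: float, hours_per_year: float = CONTINUOUS_HOURS_PER_YEAR) -> float:
    """Calendar years needed to accumulate `mtbf_hours` of operation."""
    return mtbf_hours / hours_per_year


for mtbf in (40_000, 60_000, 80_000, 100_000):
    continuous = mtbf_years(mtbf)
    single_shift = mtbf_years(mtbf, SINGLE_SHIFT_HOURS_PER_YEAR)
    print(f"{mtbf:>7,} h: {continuous:4.1f} years continuous, "
          f"{single_shift:4.1f} years on a single shift")
```

Under these assumptions, 100,000 hours is about 11.4 years of nonstop running or about 48 years on a single shift, so a 43-year claim only works for a robot running roughly 2,300 hours a year.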

Hmmm. Wait a minute. Most of the robot manufacturers are introducing their own proprietary predictive maintenance software that warns you before one of their incredibly reliable robots breaks. Fanuc will even order the replacement part for you and send a service technician out to install it before the failing part actually fails, claiming this will prevent downtime in your factory. They even call it Zero Down Time (ZDT). My God, this sounds fabulous! And odd. Why is everyone rushing to adopt predictive maintenance for robots that break about as often as Halley’s Comet shows up or Cleveland wins a major sports title?

Robots and reliability

The answer is in the math around the reliability of the entire factory, not just one robot, and of the ancillary equipment the robots need to do their jobs.

According to the most recent research on the reliability of Robot Automation published in the International Journal of Performability Engineering, the average reliability of a robot cell is 88%. In other words, 12% of the time the cell is not working. Even more striking, the mean time to failure in a robot cell is 87 minutes. Yikes! These super-reliable robots aren’t working 12% of the time and production is stopping every hour and a half or so.
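Those two numbers can be cross-checked against each other. The sketch below assumes the textbook steady-state relationship, availability = MTTF / (MTTF + MTTR), which may not be how the study itself computed its figures, and solves it for the average repair time the two numbers imply.

```python
# Cross-check the quoted 88% reliability against the 87-minute mean time to failure.
# Assumes the standard steady-state formula A = MTTF / (MTTF + MTTR);
# the study's own method is not given in this article.
def implied_mttr(availability: float, mttf_minutes: float) -> float:
    """Solve A = MTTF / (MTTF + MTTR) for MTTR, in minutes."""
    return mttf_minutes * (1.0 - availability) / availability


def stops_per_shift(mttf_minutes: float, mttr_minutes: float, shift_minutes: float = 480.0) -> float:
    """Average number of run/repair cycles that fit in one shift (8 hours by default)."""
    return shift_minutes / (mttf_minutes + mttr_minutes)


mttr = implied_mttr(availability=0.88, mttf_minutes=87.0)
print(f"Implied average repair time: {mttr:.1f} minutes")
print(f"Stops per 8-hour shift: {stops_per_shift(87.0, mttr):.1f}")
```

With the figures above, this works out to about 12 minutes of downtime per stop and close to five stops in an 8-hour shift.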

Not surprisingly, like your significant other will tell you, most of the time it’s not their — uh, I mean — the robot’s fault. The data collected from more than 400 factories shows that 80% of the time, the robot cell stops for problems not related to the robot. It’s things like conveyors, faulty sensors, paint guns, etc. The good news is the data also shows that 75% of the time these non-robot problems are fixed within 12 minutes per incident.

So, robot cells are stopping frequently and fixed quickly most of the time. These types of problems can be addressed with better monitoring of the robot program and the robot cell. Gathering the data and creating reports of issues can illustrate at what programmatic “step” the chronic issues occur. Armed with this data, factory engineers can make adjustments in the program or the equipment to cut down on these chronic issues causing work stoppages.
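As a minimal sketch of that kind of report, the snippet below tallies stoppages by the program step where they occurred. The step names, the log format and the numbers are all hypothetical; a real cell would feed this from its controller or monitoring system.

```python
from collections import Counter

# Hypothetical stop log: (program step where the cell stopped, minutes to recover).
# Step names and values are invented for illustration.
stop_log = [
    ("pick_part", 4), ("weld_seam_2", 9), ("pick_part", 6),
    ("place_part", 3), ("pick_part", 5), ("weld_seam_2", 11),
]


def chronic_steps(log, top_n=3):
    """Return (step, stop count, total minutes lost) for the most frequent stop points."""
    counts = Counter(step for step, _ in log)
    minutes_lost = Counter()
    for step, minutes in log:
        minutes_lost[step] += minutes
    return [(step, n, minutes_lost[step]) for step, n in counts.most_common(top_n)]


for step, count, minutes in chronic_steps(stop_log):
    print(f"{step:<12} stops={count}  minutes lost={minutes}")
```

Ranking by count surfaces the chronic offenders; ranking by minutes lost instead would surface the costly ones, and the two lists do not always agree.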

Breakdown of causes of robot cell failure:

[Chart: robot cell failure causes]

That leaves the 20% of stoppages where it is a robot problem. Your robot cell isn’t working roughly 2.5% of the total time available because of a robot problem, and it turns out that in 25% of those cases the robot flat-out broke: a gear, a reducer, a bearing, etc. Real metal parts that break.

This explains the rush towards predictive maintenance. While robot manufacturers claim long MTBF times, the reality is that factories see almost 1% of their available time lost to a robot actually breaking. Because these failures take much longer to fix than most other types of failures, they are the most costly for a factory. Employees and the factory sit idle, and production is lost.

Another 45% of the time a robot has a failure, it is because of “positional” problems, which have a variety of causes: equipment in the cell gets bumped, non-robot components wear normally, the robot is “crashed” into another piece of equipment left in the way, etc. For any of these reasons, the robot can’t get to the correct position to do its job properly. Welds are done poorly, pieces are not picked up or placed properly, or whatnot. Additionally, maintenance personnel will sometimes change the robot control program to try to compensate for positional problems caused by other issues, creating chronic issues that cause even more downtime.

The remaining robot downtime is caused by program faults, sensor faults and other miscellaneous issues. Yes, the robot sometimes just has software “bugs” or faulty electronic components that cause it to stop.
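Chaining the percentages quoted above gives each cause's share of total available time. This is a back-of-the-envelope sketch using only the article's rounded figures, so the results are approximate; the "remaining" category is simply whatever is left after the 25% and 45% shares.

```python
# Back-of-the-envelope breakdown using the rounded percentages quoted in the article.
CELL_DOWNTIME = 0.12  # share of total time the cell is not working
ROBOT_SHARE = 0.20    # share of that downtime caused by the robot itself

# Shares of robot-caused downtime, by cause.
robot_causes = {
    "robot physically broke": 0.25,
    "positional problems": 0.45,
}
# The rest (program faults, sensor faults, miscellaneous) is inferred as the remainder.
robot_causes["program, sensor and other faults"] = 1.0 - sum(robot_causes.values())

robot_downtime = CELL_DOWNTIME * ROBOT_SHARE
non_robot_downtime = CELL_DOWNTIME - robot_downtime
share_of_total_time = {cause: robot_downtime * share for cause, share in robot_causes.items()}

print(f"Non-robot downtime: {non_robot_downtime:.1%} of total time")
print(f"Robot downtime:     {robot_downtime:.1%} of total time")
for cause, share in share_of_total_time.items():
    print(f"  {cause:<34} {share:.2%} of total time")
```

The product of the rounded inputs is 2.4% robot-caused downtime, slightly below the 2.5% quoted, and about 0.6% of total time goes to physical breakage.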

In any case, either the robots break and it takes a long time to repair them or you have a lot of short failures caused by other equipment that is pretty easy to fix. You end up with two curves:

1) Total Time to Repair (TTR) in minutes and the rate – this shows you have a lot of failures that can be fixed in 10 minutes or less. Also, you can see far fewer failures that take an hour to almost three hours to fix.
2) Factory Costs – the frequent failures still cost the factory in idle time and lost production, while the infrequent failures are very expensive due to the duration.
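To see how the two curves relate, here is a small sketch that splits a list of repair times into short and long failures and prices each group. The repair times and the cost per idle minute are made-up numbers for illustration, not data from the study.

```python
# Made-up repair times in minutes for one week of stoppages; not data from the study.
repair_minutes = [3, 5, 8, 4, 10, 7, 6, 9, 2, 5, 75, 160]
COST_PER_IDLE_MINUTE = 50.0  # hypothetical cost of one minute of stopped production


def summarize(times, threshold_minutes=10):
    """Split failures at `threshold_minutes` and return (count, total cost) per group."""
    short = [t for t in times if t <= threshold_minutes]
    long_running = [t for t in times if t > threshold_minutes]
    return {
        "short": (len(short), sum(short) * COST_PER_IDLE_MINUTE),
        "long": (len(long_running), sum(long_running) * COST_PER_IDLE_MINUTE),
    }


for kind, (count, cost) in summarize(repair_minutes).items():
    print(f"{kind:<5} failures: {count:>2}  cost: {cost:,.0f}")
```

In this invented sample, the two long failures cost several times more than the ten short ones combined, which is the pattern the two curves describe: many cheap stops and a few very expensive ones.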

[Chart: time to repair for robot failures]

The goal of any factory management team is to reduce both of these failure types.

Luckily, companies have introduced plug-and-play solutions that can:

  • Continually monitor the robot cell, robot sensors and other equipment in the cell.
  • Log the point of failure in the programs to identify chronic issues that can be addressed, as well as report on program changes.
  • Automatically create an electronic signature of a normal robot operation, and notify users when a robot is showing signs of failure. Using machine learning, the system can improve its predictive ability over time.
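To make the "electronic signature" idea concrete, here is a deliberately simple stand-in: a baseline mean and standard deviation learned from normal readings, with a z-score threshold for alerts. This is not how any particular vendor's product works, and commercial systems use far richer machine-learning models. The sensor (joint motor current), the sample values and the threshold are all assumptions.

```python
from statistics import mean, stdev


def build_signature(normal_readings):
    """Summarize normal operation as (mean, sample standard deviation)."""
    return mean(normal_readings), stdev(normal_readings)


def is_anomalous(reading, signature, z_threshold=3.0):
    """Flag a reading that sits more than `z_threshold` deviations from the baseline."""
    baseline, spread = signature
    if spread == 0:
        return reading != baseline
    return abs(reading - baseline) / spread > z_threshold


# Hypothetical joint motor current samples (amps) recorded during normal cycles.
normal_current = [4.9, 5.1, 5.0, 5.2, 4.8, 5.0, 5.1, 4.9]
signature = build_signature(normal_current)

for reading in (5.1, 6.0):
    status = "ALERT" if is_anomalous(reading, signature) else "ok"
    print(f"{reading:.1f} A -> {status}")
```

A reading inside the normal band passes quietly, while one well outside it raises an alert. Retraining the signature as more normal cycles accumulate is the crude analogue of a system that improves its predictions over time.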

It’s time we stopped believing the “fake news” around robots and reliability. They are amazing pieces of equipment, but we can make them much more reliable by addressing the entire robot cell system, and giving manufacturers the tools to reduce and prevent common problems.

About the author: Steve Sickler was a founding member of the MSPAlliance, the largest and oldest vendor-neutral organization for cloud and managed service providers. He has held senior management positions with some of the SaaS pioneers and leaders, such as SiteLite (one of the first MSPs), Tivoli/IBM and SAP. Currently, Steve is President of Tend.ai, which is disrupting how robots are managed via its robot-agnostic monitoring, alerting and predictive-maintenance platform for automation cells. Steve learned sarcasm while working as a skipper on the Jungle Cruise ride at Disneyland.