Before I start on this, I need to explain something. I do know something about the subject on which I am about to expound. My job involves very complex technology. I can’t write about those direct experiences, however. One of the firm rules I imposed upon myself when I started this blog was that I wasn’t going to blog about my job. That is a good way to lose said job. But I do have almost 30 years of experience dealing with safety-critical, highly complex systems. You are going to have to trust me on that point. And, for the record, I have a Bachelor’s degree in electrical engineering and a Master’s in physics. While I was in college, I also gained some experience in the power generation field and spent some time around a nuclear power plant, courtesy of my participation in the school’s cooperative education program. Although that was over 30 years ago, I do remember quite a lot of the fundamentals involved here.
For the record, no, I don’t work in the power generation field with my current job.
I have been watching with interest the ongoing and continually worsening crisis in Japan involving four (I believe it is now four; it could be more) unstable, possibly out-of-control nuclear reactors. This emergency has been ongoing since the double-barreled hit of a magnitude 9.0 earthquake and a huge tsunami, and it seems to be getting worse as the days progress. I am not going to try to summarize the current situation, as any summary I put together will be out of date by tomorrow. Rather, I would like to focus on the bigger issue of technology in general and how our society has become totally dependent upon it while either not understanding or underestimating the very real risks involved.
I have seen some writers in recent days compare this accident, if that is even a good word for it, to the sinking of the Titanic in 1912. That ship was labeled, from the time it was on the drawing board, as “unsinkable.” Literally, the belief was that it could never be sunk. Technology had advanced during the Industrial Age to the point that it was considered infallible. That was one reason why, when that huge ship sank on its very first voyage, it just clobbered the accepted reality of the day. Yes, the horrendous loss of life also had a huge impact on the collective psyches of both America and Britain. But the one part that I would like to discuss is the fact that people really had come to believe that the technology involved was so advanced that all risks had been negated. That ship could never sink, end of story.
Our society today seems to find itself in that same erroneous state of mind regarding technology. It has become an integral piece of the fabric of day-to-day life in the 21st Century. We are literally surrounded by very sophisticated technology that we hardly even see for what it is. Smart phones, iPads, instant connectivity no matter where you might be, electrical power that we never even think about except when it isn’t there anymore, relatively inexpensive gasoline for your automobiles no further away than the gas station or convenience store three blocks away, on and on. Technology surrounds us. Our current society is totally dependent upon these things and we hardly ever think about what might be required to sustain all this technology and what the consequences might be if it doesn’t behave according to our preconceived notions.
There is one basic fact about all technology, no matter how advanced or primitive it might seem. That is, at some point, it will fail. Things break. Washers in faucets wear out. Pipes break. Metal fatigue can result in a catastrophic failure of a component in a system. Transistors fail. Insulation in wiring becomes chafed which results in a short circuit. Water gets into somewhere it isn’t supposed to be, which always results in things not working correctly. Oil pumps fail which causes moving parts to become dry and fail due to heat and friction. Failures are a fact of life in technology.
The trick for system designers, therefore, is to anticipate these failures and build mitigations into whatever it is you are designing so that the failure is not catastrophic. That is why we have backup systems and backups to backup systems. You have monitors in place for anything that might go wrong, so you can detect it and do something about it. A very easy-to-understand mitigation is for the operator of something, such as a car engine or a nuclear power plant, to shut the thing down when it stops working correctly. However, the trick to that is for the operator to understand when it isn’t working correctly and that he should intervene. With very complex systems, that becomes a very tricky proposition indeed.
The problem with this approach of understanding and mitigating all possible risks for very complex systems, such as modern commercial airplanes, nuclear reactors, submarines, and petrochemical plants, is that the list of things that can potentially go wrong becomes almost infinite. It becomes a much worse problem when you start factoring in multiple failure conditions and what is always referred to as “operator error” on top of those failures. Once the system designer starts applying his or her fertile imagination to this problem, the whole process becomes almost endless.
That is where “probabilities” start coming into the picture. Because it is impossible to take care of every single failure and every combination of failures that could ever go wrong, it becomes a matter of asking two questions: what is most likely to happen, and what could happen that would have the worst consequences? Those are the risks that system designers address first. And as one who has seen the inside of this process, I can tell you it becomes almost a game you play. Cost factors start to become very large in the overall decision-making process. If a designer wants to address a possible situation that could have dire consequences if it were to happen in a very specific way, and it would take a whole lot of money, time and resources to address, but it really isn’t all that likely to happen, then it is obvious that the people who make those kinds of decisions are not likely to address that particular scenario. It’s all in how you “draw your box” around the possible scenarios you have to design your system to accommodate in a manner that will hopefully minimize death and destruction. Once something is placed outside that box, then, by definition, it is outside of your accepted reality. It will not happen. The Titanic will not sink.
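To make the “draw your box” idea a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the scenario names, the probabilities, the dollar figures, and the simple rule of funding whatever buys the most expected loss avoided per dollar until the budget runs out. Real risk assessments are far more involved than this.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    name: str
    annual_probability: float  # assumed chance of occurring in a given year
    consequence_cost: float    # assumed loss if it does occur
    mitigation_cost: float     # assumed cost to design the problem out

    @property
    def expected_annual_loss(self) -> float:
        # Classic "probability times consequence" risk measure.
        return self.annual_probability * self.consequence_cost


def draw_the_box(scenarios, budget):
    """Fund mitigations in order of expected loss avoided per dollar.

    Returns (inside, outside): the scenario names that get designed for,
    and the ones left outside the box once the money runs out.
    """
    ranked = sorted(
        scenarios,
        key=lambda s: s.expected_annual_loss / s.mitigation_cost,
        reverse=True,
    )
    inside, outside = [], []
    remaining = budget
    for s in ranked:
        if s.mitigation_cost <= remaining:
            inside.append(s.name)
            remaining -= s.mitigation_cost
        else:
            outside.append(s.name)
    return inside, outside


# Hypothetical numbers, chosen only to show the effect.
scenarios = [
    Scenario("pump failure", 1e-1, 1e6, 5e4),
    Scenario("operator misreads gauge", 1e-2, 1e7, 1e5),
    Scenario("rare flood takes out all backups", 1e-4, 1e11, 5e8),
]

inside, outside = draw_the_box(scenarios, budget=1e6)
worst = max(scenarios, key=lambda s: s.expected_annual_loss)

print("inside the box: ", inside)
print("outside the box:", outside)
print("largest expected loss:", worst.name)
```

With these made-up numbers, the rare flood carries the largest expected loss of the three, yet it is the one scenario left outside the box, simply because fixing it costs more than the budget allows.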
When the increasingly dire situation in Japan started hitting the news, I immediately started wondering why the backup cooling systems of those power plants didn’t automatically kick in. Not having cooling water for a nuclear reactor is probably THE one big thing that designers worry about. It is rather likely that it will happen, too, if you don’t design and operate your reactor in very certain ways. So it was a mystery to me why all of these reactors that are in trouble seem to have lost their backup cooling systems. All of them had that problem at the same time. How did that happen? I won’t go into the probability numbers, but that would seem to be a Once In The Lifetime of the Dinosaur Kingdom kind of event.
Well, it turns out that the problem with all these plants is that they did have a backup cooling system, but it depended on emergency electrical power generators. And guess why those all failed? The tsunami wiped them all out, as they were built in a low-lying area right next to the ocean. That risk was either never identified or was considered to be so unlikely that the designers didn’t need to bother mitigating it.
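The arithmetic behind that “how could they all fail at once?” question can be sketched in a few lines of Python. This is a deliberately simple model, and all of the numbers are assumptions picked for illustration, not figures from any real plant: each backup unit fails on its own with some probability, and separately there is some chance of a single shared hazard (a flood, say) that takes out every unit together.

```python
def p_all_backups_fail(p_independent, n_units, p_common_cause=0.0):
    """Probability that every one of n backup units is unavailable.

    p_independent:  assumed chance that any one unit fails by itself
    n_units:        number of redundant backup units
    p_common_cause: assumed chance of one shared event that disables
                    all units at the same time
    """
    # If failures are truly independent, the chances multiply.
    p_independent_all = p_independent ** n_units
    # Either the shared event happens, or it doesn't and every unit
    # still has to fail by itself.
    return p_common_cause + (1.0 - p_common_cause) * p_independent_all


# Hypothetical numbers for illustration only.
independent_only = p_all_backups_fail(0.01, 4)
with_shared_hazard = p_all_backups_fail(0.01, 4, p_common_cause=1e-3)

print(f"four independent units:  {independent_only:.3g}")
print(f"with one shared hazard:  {with_shared_hazard:.3g}")
```

Under these assumptions, four independent units failing together is a roughly one-in-a-hundred-million event, while a one-in-a-thousand shared hazard drags the overall figure back to about one in a thousand. Once a single event can reach every backup, adding more backups in the same spot buys almost nothing.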
In a country like Japan that has a long history of violent earthquakes and tsunamis, you would think that this scenario might have been considered, given the possible catastrophic results. No, this was one of the things that was not treated as being in the realm of possibility. Why? My guess is that it was an economic decision, not a technological one. The potential for this scenario was probably recognized, but it would have cost too much money to address a situation that the people in charge didn’t think would occur.
That is called “acceptable risk.” People are making decisions about what is acceptable, given what it would take to address that problem. That is the basis for a Cost/Benefit Analysis. The problem here, as you might be able to see now, is that the main risk involved is for the citizens of Japan along that stretch of coastline. The risk that the managers of the company that had those nuclear plants built chose to accept is being borne by the citizens of Japan, who had absolutely no vote in that decision.
Believe me, this kind of thing is constantly going on in today’s society. That is how this vast technologically based infrastructure we have in this country and the world works. Everything is a tradeoff.
There has been much discussion about nuclear power and whether it is a “safe” technology. Here’s my thought on that question.
There will never, repeat, never be a totally safe nuclear power plant. There will always be a possibility of something going wrong. If the system is designed in a way that the first and almost only requirement is that it will never experience an accident, then the risk that there will ever be a critical problem will be very small indeed. For instance, the plant could be designed with quadruple redundant cooling systems and with not one but two containment domes, each about 30 feet thick, that would withstand a military-type bomb being dropped directly on them. Accidents will never be a Zero Possibility, but they become increasingly unlikely. The more safeguards you put in, the less likely it is that anything bad, no matter the cause, will happen.
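Here is a small Python sketch of why the risk shrinks with every safeguard but never reaches zero. The 5 percent failure chance per layer is a number I made up, and the sketch assumes each layer fails independently of the others, which is a strong assumption in its own right.

```python
def residual_risk(p_layer_fails, layers):
    """Chance that every safeguard fails when called upon,
    assuming each layer fails independently of the others."""
    return p_layer_fails ** layers


def safeguard_table(p_layer_fails, max_layers):
    """Rows of (layers, residual risk, risk removed by the newest layer)."""
    rows = []
    previous = 1.0  # with no safeguards at all, the accident is certain
    for n in range(1, max_layers + 1):
        risk = residual_risk(p_layer_fails, n)
        rows.append((n, risk, previous - risk))
        previous = risk
    return rows


# Hypothetical: each safeguard has a 5% chance of failing on demand.
table = safeguard_table(0.05, 4)

for layers, risk, removed in table:
    print(f"{layers} layer(s): residual risk {risk:.3g}, "
          f"risk removed by this layer {removed:.3g}")
```

Each added layer costs about as much as the one before it but removes a smaller slice of risk than the last, and the residual risk stays above zero no matter how many layers you stack up.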
However, that is not how things are done in our society. The companies that are going to invest billions of dollars in nuclear power plants are going to want a return on their investment. What’s the point of having these wonderful power plants if you never make any money? That, in common parlance, is referred to as a "Bad Business Case." Therefore, economics is a very major player in design decisions that you might think would be totally technologically driven. If you think that, you would be wrong.
But this is still O.K. That’s how our society works. The problem lies with how one goes about assessing the possible risks. If your schedule is very tight and your company is on the verge of going under, you might be inclined to start cutting corners in order to not spend any more money or effort than you absolutely have to. You might start “drawing your box” around the possible scenarios that you must account for in a way that starts to minimize those possible scenarios you must consider. If you limit those scenarios, then your job as a system designer becomes much easier. And cheaper, by the way.
That is what happened at the Fukushima Daiichi nuclear power plant. The situation that occurred was considered to be something that the designers did not have to worry about happening. When it did occur, then the response is invariably, “Who could have foreseen X happening?” Does that sound at all familiar?
There are a number of other factors that contribute to accidents, of course. This is a very complex field of study and many books have been written on the subject. I am not going to be able to cover everything here. But I will mention a couple of other factors that I believe are very prevalent in how our technological society functions.
The first one is that these systems, whatever they are, are now so complex that the designers don’t truly fully understand what it is they are building and, more importantly, how they can fail. They are just too complex and there are too many individuals and different companies involved for any single person to grasp the intricacies of every single aspect of that system. When the first failure of something we have never seen before occurs, designers can learn from that experience and address that particular aspect in future designs. This is called (by me) the “Oops! Well, we shouldn’t do THAT again!” approach to systems engineering. However, that approach certainly didn’t help those affected by the first accident.
Another factor that is very important to the study of industrial accidents is that of complacency. To summarize, this means that since something bad hasn’t happened before when we did X, then, by definition, it won’t happen when we do X the next time. And complacency becomes so prevalent that the belief grows into “nothing bad will happen when we do X again and put Y on top of it.” That was one of the leading contributors to both the Challenger and Columbia Space Shuttle accidents. That entire process is called, in more academic circles, “normalization of deviance.” Nothing bad has happened before, so it won’t happen in the future. Everyone has accepted the unstated assumption that the bad outcome from the condition I called X will not happen. It just isn’t in the entire paradigm. Therefore, we don’t need to worry about X, or even X plus Y, in our design.
I don’t think I even want to start talking about “operator error.” That seems to be everyone’s reaction when anything happens outside of their preconceived notion of how things should work. If anything happens that the designers didn’t anticipate, we always seem to see that “operator error” tag, even though you might well ask, “Well, how was the operator supposed to know how to react to these circumstances, since they have never been trained to deal with X, or X plus Y?”
All of this goes a long way in explaining why the Chernobyl nuclear power plant in Ukraine was built without a containment dome or building to guard against a radiation leak or explosion. An explosion of the nuclear reactor was never in the realm of possibility, so it would have been a waste of money to build a containment dome to contain something that would never exist. As a result, we now have an uninhabitable zone, a la The Planet of the Apes, covering thousands of square kilometers of Ukraine and Belarus.
If, on the other hand, worrying about the economic impact on your design and operational requirements is NOT part of the picture, then it is possible to design, build and operate a technologically complex system, and to do it safely. This is the case with the U.S. Navy and its nuclear-powered submarines. After the loss of the U.S.S. Thresher in 1963, the Navy decided that it was not going to lose any more submarines for any reason other than being sunk by a hostile military force. To this end, it instituted a program called SUBSAFE. The U.S.S. Scorpion, lost in 1968, had not been certified under that program. As a direct result of this program, the U.S. Navy has not lost a single SUBSAFE-certified submarine since the program was instituted. Please note that economic considerations, such as how much this program costs the U.S. Navy, are not a factor. Keeping the submarine fleet safe from sinking is the only priority. This is not the case for anything done where profit margin is a concern, which is pretty much anything not involved with the U.S. military.
Where am I going with this discussion? I think the main point I wanted to make here is that we, as a society, do not understand the risks involved with these very complex technologies. Things will always go wrong, and the likelihood of spectacular accidents and failures increases the less the people in charge worry about the possibility of those things actually occurring. And the risk is borne by all of us. Yes, those companies will lose money, maybe the entire company, if accidents are bad enough. But it is the unwitting participants who also bear a very great risk, such as those who live next to nuclear power plants, petrochemical plants, underground pipelines, etc.
It is too late to put this particular genie back in the bottle. It is out, and the possibility of it wreaking significant havoc is very large, especially when you lump all the possible risks of all our technology together. You end up with the Bhopal chemical plant disaster in India, in which thousands died. You end up with collapsing coal mines with miners still inside. You end up with deep-water oil rigs that explode and leak millions of gallons of crude oil into the Gulf of Mexico with absolutely no known way of shutting off a deep-water leak like that, even though many workers knew that risks were being taken. And you end up with the possibility of hundreds of thousands of Japanese people being forced to relocate from an area that may be contaminated with radioactivity for thousands of years to come.
“Acceptable risks” are always acceptable. Until they actually happen, that is.