Lies Your Optimization Guru Told You

April 8, 2015


Before you get out your pitchforks, I want to stress that this article does not represent Peep’s views.


The easiest lies to believe are the ones we want to be true, and nothing speaks to us more than validation of the work we are doing or of what we already believe. Because of this, we become naturally defensive when someone challenges that worldview.


The “truth” is that there is no single state of truth, and that all actions, disciplines, and behaviors can and should be evaluated for growth opportunities. It doesn’t matter if we are designers, optimizers, product managers, marketers, executives, or engineers; we all come from our own disciplines and will naturally defend them to the death if we feel threatened, even in the face of overwhelming evidence.


Is how you are doing things the optimal way? What if it’s not?


It is important that we challenge many commonly held beliefs and ask whether what we have been doing is of actual value or just easy to do. Are the results we are getting really good? Or are we just not getting our faults pointed out to us?


These questions apply no matter what discipline we are discussing, be it design or optimization. We can all get better, and we all need to disconnect from the stories we have told ourselves and each other if we really want to improve.


With that in mind I want to tackle some of the most common “half-truths” and flat-out lies that permeate the optimization discipline. Because I have spent my career fighting many of these myths and working to turn around programs that have fallen prey to the most common forms of bad advice, this list could be nearly infinite. Instead, I want to point out the absurdity of the most common lies that get used to justify past results and future actions.


The Many Lies of Optimization Gurus

You may now gather your pitchforks and torches…




It is Ok that only X Percent of Your Tests Have a Clear Winner


No, No, No, No, No.


Yes, you can get value from the things that didn’t work; in fact, you often get far more value from the patterns of things that don’t work than from those that do. That statement, however, is about individual options, not the larger tests themselves.


You should never, ever accept a non-winning test as a good thing, and you should be doing everything to make sure you are shooting for 100% of your tests producing a clear, valid, actionable winner. If you are doing the things that really matter with testing, it is actually hard to have failed tests, as the entire system is designed to maximize outcomes and not just ideas.


I can tell you that in over 5 years I have had exactly 6 tests that did not provide a clear winner, and I am still pissed to this day about those 6. I do things differently in that I focus on avoiding opinions and maximizing the number of things I can compare, which means that I probably have a larger number of failed experiences than anyone else. But in the end the tests provide a positive outcome; what won or didn’t win is of the least importance, only the fact that something did. I have broken those 6 down and tried different ways to tackle those problems. I think about them far more than I do the thousands of successful tests, and that is a good thing.
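To put some back-of-the-envelope arithmetic behind “maximize the number of things you compare”: if each option in a test independently has some fixed chance of being a genuine improvement, the chance that the test as a whole produces a winner climbs quickly with the number of options. The independence assumption and the per-variant figure below are simplifications of mine, purely for illustration.

```python
# Sketch: probability that a test produces at least one clear winner,
# assuming each variant independently has probability p of being a real,
# detectable improvement (a simplifying assumption, not a claim about
# any real program).

def test_win_probability(p_variant_wins: float, n_variants: int) -> float:
    """Probability that at least one of n independent variants wins."""
    return 1 - (1 - p_variant_wins) ** n_variants

for n in (1, 3, 6, 12):
    print(f"{n:>2} variants -> P(test has a winner) = "
          f"{test_win_probability(0.125, n):.1%}")
# 1 variant   -> 12.5% (the industry-average success rate cited below)
# 12 variants -> ~79.9%
```

A test built around one favorite idea is close to a coin flip; a test built from a wide range of executions is designed so that something in it almost has to win.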


Accepting failing tests is part of what allows people to pretend they are having a successful testing program when they are in fact wasting everyone’s time and resources.


Every test you run that does not have a clear and meaningful winner for your organization’s bottom line screams that you have allowed biases to filter what you test and how you test. It is a sign that you are just spinning your wheels and making no effort to tackle the real problems that plague a program.


There is never a time that a failed test is acceptable, and there is never a time when you should be OK with a 25%, a 50%, or a 12.5% (the industry average) success rate on your tests.


Every test, positive or negative, is a chance for optimization. Not just of the things on your site, but for your own practices and for your organization.


You Can Figure Out Why Something Won


Nothing can more quickly vex marketers, designers, executives, and human nature in general than the question of why something won. In many ways people will never believe that something won until they know why it was better.


People will turn to user feedback, surveys, user observations, user labs, or just their own intuition to divine the great reason why something happened. This constant need to figure out why sends people on great snipe hunts, which result in wasted resources and almost always false conclusions. They do all of this despite overwhelming evidence that people have no clue why they do the things they do; in fact, people act first and rationalize afterwards.


Even worse is when people pretend that this augury provides additional value for the future. In fact, all it is doing is creating false constraints going forward.




Your evidence is probably not evidence


You are creating rules without any actual data to back them up, which is the exact same mistake that designers often make, and for the exact same reason: your own ego. All those things you are taking as evidence have so much error associated with them that they represent the opposite of actual knowledge, and yet we hold onto them like they are our life rafts.


The issue is not that you are right or wrong; you may very well be right. The issue is that the evidence used to arrive at that conclusion is faulty, and there is no evidence to support any conjecture you make. You are creating a situation where you believe something to be true without evidence, and you are asking someone else to prove its non-existence.


That mental model you are creating based on what you think you know might be correct, but by pretending you have far more insight than you really do, you are allowing yourself and others to eliminate options that you have no actual data to get rid of. You have the exact same amount of information for those conclusions as you do for pretending that all your users will give your organization 5 million dollars as a donation 3 years in the future. The only real difference is that one feels good and one does not.


One of the most important rules when working with organizations, and definitely the one that takes the most enforcing, is “no storytelling”. It is not for me to say that you are right or wrong, but the very act of storytelling allows the perception of understanding when there is no evidential reason for it.


In the end it just doesn’t matter why, and all effort spent on that fruitless endeavor distracts from the discipline of acting on data and going where the data tells you to. You don’t need to understand why to act; you just need to act.


“I Don’t Agree Because I Often See…”


Let’s ignore the problem with black swans and first-person perspectives. Let’s ignore the issues of self-serving observer-expectancy bias. Let’s ignore all the problems with false validation of hypotheses and stories (more on this to come). Let’s take your stories at face value, despite the overwhelming evidence that astrology has as much basis for providing value and context to whatever you are defending as the stories you believe to be true.


How many times did you purposely test to break whatever rule or heuristic you are trying to defend? How often did you purposely test the opposite of the feedback or your belief to see what the best answer was, and not just to validate your opinion? It is easy to validate what you already believe when you refuse to look for evidence that might contradict it.


Challenge your ideas and cherished notions


So many of the things that optimizers hold dear persist because they never made the effort to validate the things they most wanted to be true. Even when they attempt to challenge an idea, it is only meaningful data when they establish a large number of examples and know the limitations and error rate; otherwise they are providing false knowledge to themselves and others.


That user feedback you are following seems to have provided you with a great idea for a test? Did you test it against the exact opposite and other possibilities across a large range of outcomes? What is the pattern of outcomes: was it a one-time thing, or can you establish that the change works in a large number of cases and is consistently the best, not just better than other tactics? Have you taken the time to disprove alternative hypotheses?


What are the assumptions that went into that action as opposed to this one, and how often did you see that outcome? It is easy to validate an opinion as long as we never exercise intellectual curiosity; too bad this can only limit your results and negatively impact your organization.


Pay attention to the patterns




It is possible to derive patterns, but only from very large data sets and only in the broadest terms. Patterns I usually see: real estate changes usually have a much higher beta and a longer half-life than copy changes; spatial changes tend to be better than contextual changes for long-term monetary value. Even in those cases, you had better be designing your efforts to see if that holds true in this case and going forward.
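As a rough sketch of what “half-life” means in this context: treat it as the time it takes a measured lift to decay to half its initial value, so a longer-half-life change holds its value longer. The exponential-decay form and every number below are assumptions of mine for illustration, not measurements.

```python
# Illustrative model of the "half-life" of a test win: the measured lift
# decays toward zero over time, with different change types decaying at
# different rates. Decay form and numbers are assumed for illustration.

def remaining_lift(initial_lift: float, weeks: float, half_life_weeks: float) -> float:
    """Lift remaining after `weeks`, given exponential decay with the given half-life."""
    return initial_lift * 0.5 ** (weeks / half_life_weeks)

# Hypothetical: a 10% real-estate win with a 26-week half-life vs. a 10%
# copy win with a 6-week half-life.
for label, half_life in (("real estate", 26.0), ("copy", 6.0)):
    print(label, [round(remaining_lift(0.10, w, half_life), 3) for w in (0, 6, 12, 26)])
```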


In my current job I have seen copy have a much larger impact than at most other places, much to the delight of our copywriter. We only know this by constantly building tests that include multiple executions of these concepts, and by measuring the long-term impact and the cost of implementation; not from a single test, and not by validating his or my beliefs going in. It comes only from having looked at the impact of multiple forms of copy changes in comparison to all other types of changes (real estate, function, or presentation), and from comparing that with other sites and other changes.


I Have Run X Number of Tests / Tested for X Years, Therefore I Am an Optimization Expert


I have been driving for 18 years and have driven well over 250,000 miles and you know what all that experience tells me about driving F-1 cars? Absolutely nothing.


That 10,000-hour rule sure sounds good in theory, but in reality it is just a cover for common practice and complacency. You can get really good at executing tests, but if you are not spending each moment actively trying to improve your program, then there is no incremental gain from each test. If you are not looking at how you can change your own behavior and your own beliefs in regards to improving results, then no amount of time spent testing will do any good. The number of tests you run says nothing about how much time you spent actually trying to get better, or about the much more complicated and uncomfortable work of changing what people want to do and getting them to accept that they are wrong, and they always are, because that acceptance is what makes so much more money.


So little of what makes a program successful has to do with how you run a test or even what specific test ideas you have. It is all about changing how people think about their actions, about challenging assumptions, and about building the environment where rational decisions are made.


Keep growing, keep challenging yourself




Trying to impress people with years or numbers of tests is as much of a non-sequitur as claiming the number of Call of Duty games you have played is a measure of your ability to handle a war.


In almost all cases, if that is what you are hiding behind, or, even worse, some false award, then it is a glaring flag that you have never done anything to improve your actual optimization efforts.


Real optimization is the act of optimizing your own actions, and is about the efficiency of improving all actions. There is never a time when you have “made it” or are an “expert”; instead, it is about constantly improving and challenging yourself and others to get more than they would have otherwise.


My Test was Successful Because We Got X Result


To be fair, sometimes this statement is of actual value, but only when it is given with the context of the site and the larger test. A result by itself sounds good but is of absolutely no value in telling you whether a test was successful, or the real impact of the program and the person executing that specific action.


While I am loath to tell stories, here is a perfect example from a recent test we ran. We were optimizing the second most trafficked part of our site, and ran a large number of versions of the page to see what mattered and to inform future actions. As part of this, the perceived favorite did grant an 8.5% increase to RPV, which is not bad. Most groups would take that and run…


Another experience, one chosen because it was similar to another site (not a competitor), had a 43% increase to RPV. Looking at the 8.5% increase and comparing it to the 43% increase makes that 8.5% look sad and pathetic compared to what we should be getting for the same amount of effort. It is about building the context of your own situation and about measuring efficiency in that context.


Never stop looking for a better alternative




Now let’s compare that to another experience in our test, one that was built off the pattern of previous results and designed to challenge many commonly held beliefs about user actions. That experience produced well over a 700% increase to user RPV.


Obviously you should be incredulous whenever you see a lift above 50% on an established site, and very large improvements have their own world of unique issues that most groups will not have to worry about very often, but in this case we retested multiple times to confirm the scale of impact. That outcome makes the other two outcomes absolutely awful for the business as a whole, as we would have been bleeding revenue by just going with them. In fact, future testing of that option showed an even greater result and has helped shape many other business opportunities going forward.


Each answer is an outcome, but the actual value of each is only judged in context, not against some arbitrary benchmark. What would have happened if I had gotten a 7,000% increase? What did I fail to do that might have generated that outcome?


The specific outcome of an experience is irrelevant. You only know the real impact of your efforts when you look at a large variety of possible outcomes and maximize the range of things you compare. In some cases getting an 8% lift is amazing, and sometimes it would be a devastating loss to the organization. You will only ever know what that lift means when you establish the context.
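Here is the same point as quick arithmetic, using the lifts from the story above. The baseline revenue figure is made up purely to show how context turns a “good” lift into an opportunity cost.

```python
# Framing the three results above as opportunity cost rather than raw lift.
# The baseline revenue is a hypothetical figure; the lifts are from the
# test described above.
baseline_monthly_revenue = 1_000_000  # assumed, for illustration only

lifts = {
    "perceived favorite": 0.085,         # +8.5% RPV
    "borrowed from another site": 0.43,  # +43% RPV
    "pattern-driven challenger": 7.00,   # +700% RPV
}

best = max(lifts.values())
for name, lift in lifts.items():
    left_on_table = (best - lift) * baseline_monthly_revenue
    print(f"{name}: +{lift:.1%} RPV, leaving ${left_on_table:,.0f}/month unrealized")
```

Against that backdrop, shipping the 8.5% “winner” is not a win at all; it is the most expensive of the three options.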


An outcome is great. An outcome in light of the range of feasible outcomes and in light of resources is even better.


You need a hypothesis


This one covers just about everything associated with how most people understand the concept of a hypothesis, or even the scientific method. No single item is trumpeted more than the hypothesis, and no single item causes more destruction in terms of getting results than the use of these vile things.


Now, a hypothesis has many definitions, but the most common way people think of it is: “If I do X, then I expect to see Y result.”


In reality the concept of a hypothesis is neither positive nor negative, but the way people hold onto one as a way to validate their actions is destructive.


Once you get past 6th-grade science you will understand that a hypothesis is just a small part of a much larger and richer scientific discipline. It is designed for use in certain cases and to allow for repeatable conclusions which, and this is the very important part, validate that outcome against ALL OTHER POSSIBLE ALTERNATIVE HYPOTHESES. Hypothesis testing has come under consistent and growing attack as time has gone on.


Not only is it not designed for efficiency or for actual business problems, but it is itself used in conjunction with many other techniques and controls, all of which are commonly ignored by the false cognoscenti of the optimization world.


Don’t confuse discovery with validation


Even worse is when the direct pursuit of the hypothesis limits the scope of testing, or allows people to believe they have discovered an answer when they have not. These are the times when optimization moves away from discovery and value and instead enters the realm of validation, otherwise known as the mortal enemy of results.


While it is easy to think in terms of what you believe will happen, I want to challenge you to think differently. Instead of worrying at all about what you think will happen, start by simply listing all the feasible ways you could interact with the page/element/experience that you are trying to optimize.


Think about all the ways you can execute those ideas, and then choose the largest range you can from that list, include your favorite if you like, and make that your test. Even better is when you get others to add their input, as they might think a different tactic is “better”. Testing should be the most inclusive thing you can do, since all ideas go through the same process and you are trying to maximize the number of things you compare.
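As a minimal sketch of what “make the range your test” can look like: enumerate the feasible executions, run them all in one test, and compare measured RPV. The variant names, the randomly generated data, and the simple normal-approximation confidence interval are all assumptions of mine; a real program would rely on its testing tool’s statistics.

```python
import math
import random

def mean_and_ci(samples: list[float], z: float = 1.96) -> tuple[float, float]:
    """Sample mean and half-width of an approximate 95% confidence interval."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / (n - 1)
    return mean, z * math.sqrt(var / n)

# Every feasible execution goes into one test: the favorite, its opposite,
# and everything in between. RPV data here is randomly generated and
# purely hypothetical.
random.seed(1)
variants = {
    name: [random.gauss(mu, 1.0) for _ in range(500)]
    for name, mu in [("control", 2.0), ("favorite", 2.1),
                     ("opposite of favorite", 2.4), ("minimalist", 1.9)]
}

for name, samples in variants.items():
    mean, ci = mean_and_ci(samples)
    print(f"{name:<22} RPV {mean:.2f} +/- {ci:.2f}")
```

The point is not the statistics; it is that the favorite enters the test as one row among many, with no special status.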


It is perfectly OK to have a belief about how things work. Everyone has their own mental model of what the problems are and how to fix them. The issue comes when you allow some false structure or that belief in what will work to limit what you test or what conclusions you derive from outcomes. Don’t allow an opinion, yours or anyone else’s, to be the limiter on what you test or even how you test.


Conclusion

The list of all the common lies could go on forever. Optimizers are in many ways the most at risk of bias, because they have so much access to data and they control the way that people look at data. While we don’t have time to dive into every other common pitfall here, rest assured that those are just as damaging to actual outcomes as the items I did address directly.


Always keep going




In the end there is no good or bad, only the efficiency of different tactics and the consistent, continued attempt to become more and more efficient. No matter what tactics you use, you are always going to get a result; the question is whether that is the best result you could and should have gotten. It is only by challenging every part of how you think and act on optimization that you can start to get new and improved results. Every day offers an opportunity to do just that, as well as one to run back to what is most familiar and comfortable. Each of these items, and so many more, limits the efficiency of not just specific actions but entire programs. The largest problem is that the more comfortable something seems, the less likely it is to be of high value.


The goal here is to open up for debate the most commonly repeated pieces of advice in the industry. I can guarantee you that every single part of my programs can and should be fixed and improved, just as I can guarantee you that every part of your program is the same.


That is the best news possible, as it means that every day and every action is a chance to learn something new and to gain better information about the world around us. The only way to accomplish that, however, is to let go of so many past beliefs and to purposely challenge everything you hold dear. Just like the act of testing: if you aren’t challenging what you hold true, how can you expect others to follow suit?


Occam’s Razor should tell you that the more people are talking about a practice, the less likely it is to be of real value. Skipping game theory, basic human nature tells us we gravitate toward that which already matches our own thoughts and patterns, and we look for confirmation instead of conflict with prior beliefs. Fight that urge and your program and efforts can have magnitudes greater outcomes.


“Whenever you find yourself on the side of the majority, it is time to pause and reflect.” – Mark Twain
