Prediction can be a hard problem to solve and, oftentimes, even unsolvable. When discussing honesty in prediction models, we need to recognise the challenges of modelling complex systems. How many times have we seen articles about some financial market, be it housing, stocks or crypto, predicting some behaviour, only for the opposite to happen? For the sports fans out there, it is clear that no matter how much you watch a sport or how much information you have gathered and taken into account, surprises still happen. Just think of Greece’s victory over Portugal at Euro 2004 or Paul the Octopus’s uncanny football predictions!

That is because life happens. The world is full of chaos and, unless we are modelling some physical law of nature, it’s pretty hard to predict. This chaotic nature of the world makes honesty in prediction models even more critical. It’s important to acknowledge that forecasting isn’t always precise, and we should embrace transparency in the process. This is a bit scary but also very exciting. It is the reason I moved from a purely mechanically focused career, where you could calculate the right answer, to something a bit more open-ended. It means that if you can predict something of value, it is worth doing, and it can set you apart. But it is hard.
I am not going to go into how to predict things, or even why some things are easier to model than others; there are already plenty of very good books on that subject. But I am briefly going to focus on why it is important to be honest with ourselves: this can be a tricky process, there isn’t always enough signal, and sometimes things don’t work, but all of that is okay. What I think is important is cultivating an environment where we are honest about our ability, where it is okay for things not to work, and where we share our failings along with our successes. The more comfortable we are with these aspects, I believe, the better things can be.
Belief in Possibility: The Attitude of “Yes, We Can”
I grew up in South Asia, and one of my favourite cultural observations when compared to the UK was the attitude to repair. If you had a device that had broken, and you took it to a shop and asked about repair, the answer was almost always “yes, no problem”; almost always, things could be repaired. On some occasions perhaps things wouldn’t quite come back the same, but in the UK it feels like the response is almost always “Sorry, it looks like a write-off, better to just get a new one”. Now, I am not going to comment on attitudes to disposability or access to goods etc.; I am aware there are many factors at play. But it felt like the response was always one of belief and acceptance: “Yes, we can do it.”
I love this, but it certainly has its place. I have been fortunate to work under some real believers. This certainly brings many positives; however, it did mean that any time someone asked if our team could deliver something, the answer would almost always be yes (think Jim Carrey’s Carl in Yes Man), a behaviour I am certainly guilty of myself.

I think there was, however, a very noticeable difference between making build promises from an engineering capability perspective and from an analytical one. Saying yes to building a new endpoint or platform functionality is very different to saying yes to building a model that is X% accurate or delivers £X value. From an engineering perspective I am comfortable with the idea of what I can and cannot build. Sure, there will always be bumps along the way, but can it be built? I feel I can answer that. In the analytical space it is a whole other story. It is why you see stories like “Even After $100 Billion, Self-Driving Cars Are Going Nowhere”, and issues like stop signs to the right. Building predictive models is considered experimentation for a reason.

We don’t know what results we can get. We may be able to get a good idea, but promising beforehand or working to specific target metrics without exploring and iterating is not ideal, and there can be so many unknown factors and outliers at play. So we should be open about this and say “yes, there are possibilities and things we can explore”, but we need to iterate and try things out. Only then can we start thinking about deliverables and saying what we will achieve, and even then delivery needs to be structured in a way that allows for experimentation (as briefly discussed below).
This process and mindset should be a good thing. After all, it leads to fewer broken promises, less stress and anxiety, and fewer moments where your data scientists end up working late into the night asking themselves ‘why won’t this model just work?’.
That's why it's important to foster an honest dialogue about what's achievable. We explore, iterate, and refine our approaches, recognising that the path to innovation is marked by valuable learning moments. By embracing experimentation, we minimise the pressure of over-promising and under-delivering. Being transparent about our limitations and effectively managing expectations is crucial. This approach ensures that all stakeholders are aligned with realistic outcomes rather than being misled by overly optimistic projections.
It’s OK to Stop: Avoid the ‘Pot Committed’ Trap
So sometimes it does happen: you’ve built it up to your stakeholders, spent ages learning new frameworks, got more data and burned through resource hours, and you are still not getting results. For those who don’t play poker, being pot committed means having “put in so many chips”, or otherwise risked so much, that you feel you might as well follow through with the plan. But this should never be the case. You shouldn’t be afraid to stop, and stopping should be acceptable.

Earlier in my career, whilst working in insurance, I had been supporting some analysis of price elasticity. We had conducted price tests to see how consumers would react to higher and lower prices, i.e. would we get more demand? Could we generate more margin? We spent a reasonable time on the piece, but our result was that we didn’t have enough data, and would probably never have enough data given our position in the market. We had to present these results at a steering committee meeting to the senior leadership team of the company, including the CEO. Obviously, earlier in your career (and perhaps at any stage) this can be quite intimidating; “Uhh, we can’t really tell anything; all of this has been a bit of a waste of time” is not the ideal message. But I remember how accepting the committee were and how supportive my manager was in delivering the message. It was very formative of my outlook: being confident in the truth, owning it, and creating a space for it with the people I work with and those I manage.
I think perhaps a key takeaway is learning to recognise that point earlier in the work we are doing, so that we stop wasting resources down fruitless rabbit holes. It isn’t entirely avoidable; we humans like our challenges. There are lots of frameworks to help with this, but the key for me is small, quick iterations and evaluations: how much improvement are we seeing for the work we are putting in? We know there are going to be diminishing returns somewhere, and identifying that earlier is gold.
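One rough way to make the "how much improvement for the work we put in" question concrete is a simple stopping check on the validation score after each iteration. This is only a sketch under assumptions: the function name `should_stop`, the `min_gain` threshold and the `patience` window are hypothetical choices for illustration, and it assumes a metric where higher is better.

```python
def should_stop(scores, min_gain=0.005, patience=2):
    """Flag diminishing returns on a list of per-iteration scores.

    Returns True when each of the last `patience` iterations improved the
    score by less than `min_gain`. Assumes higher scores are better.
    Both thresholds are illustrative and should be agreed with stakeholders.
    """
    if len(scores) <= patience:
        # Not enough iterations yet to judge the trend.
        return False
    recent = scores[-(patience + 1):]
    gains = [later - earlier for earlier, later in zip(recent, recent[1:])]
    return all(gain < min_gain for gain in gains)


# Hypothetical validation scores after each quick iteration.
history = [0.60, 0.70, 0.702, 0.703]
if should_stop(history):
    print("Gains have flattened: time to review whether to continue.")
```

The check does not decide anything by itself. It simply prompts the honest conversation earlier, before more resource hours are committed.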
The Importance of Honesty in Prediction Models: Embracing the Value of Experimentation
If there is such a degree of uncertainty in data science, you may ask, how are we meant to plan and deliver effectively? There are many different models that companies have employed; see “Models for integrating data science teams within organizations” for some additional reading. However, I went to a great talk by Nick Jakobi at one of the PyData meetups, where he talked about delivery within agile frameworks. He spoke about the idea that experimentation as a task has the aim of answering a question: can we predict this, does this show that, and so on. The key thing was that, whatever the answer, the result of the experiment was still a delivery of information, and that information was inherently valuable and should be deemed so.
For example, let's say you are working on improving a forecasting model and, after spending a sprint's worth of work on it, you simply cannot make any significant improvement to the model's performance. In this case you have learned that very fact: given the effort provided, model improvements are not feasible at this point in time. And that learning is valuable. I believe they used a knowledge base to record the result of each task and what was learned, so that it was easy to access and allowed them to learn globally about what works and what doesn't.
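To illustrate the idea of recording every experiment as a delivered answer, here is a minimal sketch of what such a knowledge base could look like. It is not the tool described in the talk: the `ExperimentRecord` and `KnowledgeBase` names, their fields and the example entry are all hypothetical, and a real team would more likely use a wiki or an experiment tracker.

```python
from dataclasses import dataclass, field, asdict
from datetime import date
import json


@dataclass
class ExperimentRecord:
    question: str        # the question the experiment set out to answer
    outcome: str         # e.g. "yes", "no" or "inconclusive"
    effort_days: float   # how much work went into getting the answer
    learning: str        # what the team now knows, whatever the outcome
    logged_on: str = field(default_factory=lambda: date.today().isoformat())


class KnowledgeBase:
    """In-memory store of experiment outcomes, searchable by keyword."""

    def __init__(self):
        self._records = []

    def log(self, record):
        self._records.append(record)

    def search(self, keyword):
        # Case-insensitive match on the question or the learning text.
        kw = keyword.lower()
        return [r for r in self._records
                if kw in r.question.lower() or kw in r.learning.lower()]

    def to_json(self):
        return json.dumps([asdict(r) for r in self._records], indent=2)


kb = KnowledgeBase()
kb.log(ExperimentRecord(
    question="Can we improve the forecasting model within one sprint?",
    outcome="no",
    effort_days=10,
    learning="No significant improvement for the effort spent; revisit with new data.",
))
print(kb.to_json())
```

A "no" gets logged in exactly the same way as a "yes", which is the point: the negative result is treated as something delivered rather than something hidden.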
Show me your code
By being okay with failure and cultivating a sense of honesty and trust we can alleviate so many issues before they arise, but again this is difficult. Most will have seen Elon Musk asking all his engineers to show him their ‘most salient lines of code’ as he goes about his firing spree. I am sure this would evoke a bit of fear in those who wish to stay.

As with many industries, there is an epidemic of imposter syndrome within tech, and this can breed a reluctance to show and share code and low-level results, especially within the analytical domain. But again, cultivating the ability to be honest and open to challenge and review (as we would expect in academia), and to just reach out about where we are struggling, helps to alleviate this.
You will still see some scenarios like the COVID modelling by Imperial College London, where external validation was not able to take place initially, causing some concern, or models like the re-offending prediction used in the US, where predictions are used that impact people’s sentencing but there is a strong reluctance to show the process of getting there.
So instead of imitating Col. Jessup from A Few Good Men and maintaining ‘you can’t handle the truth’, we should be leaning towards Fletcher Reede from Liar Liar (again Jim Carrey), building an environment where we can be honest with each other, which should really be beneficial for all involved.

