The missing discussion of LLMs' potential

January 8, 2023

I studied machine learning in general and neural networks in particular in 2017, but grew increasingly frustrated by the disconnect between the hype and the actual results in practice. I decided to move on to other things. Ironically, this was right as the Transformer paper came out. The subsequent emergence of GPT-2 and then GPT-3 (alongside Codex) was a reminder of how fast the field could move, and ChatGPT made me realize that a new chapter had begun. My favorite software tool of 2022 is GitHub Copilot, which has fundamentally transformed how I write software.

It is difficult to say whether the recent technological advancements are simply incremental improvements or whether we have reached a true turning point. ChatGPT dug itself out of the uncanny valley that was holding me back from using natural language models. Although I had felt immediately at ease with a more practical tool like Copilot, thanks in part to the attention given to prompt engineering, I was mesmerized (as so many of us were) when ChatGPT came out; I decided to incorporate it into my daily work and personal research in the same way I might adopt a new IDE or programming language. As I continue on this journey, I feel as though I have only just begun to scratch the surface of what is possible, yet I have already made a number of exciting discoveries.

I think LLMs and image-generation networks are powerful tools for learning and for helping humans process information. I am often disheartened by the debates surrounding these technologies in my social circles. This is not because the issues raised are unimportant: biased training, non-consensual training on copyrighted data, the concentration of training and research in the hands of corporations, cost and energy considerations, and the potential displacement of knowledge workers by machines all matter. Rather, it is that these debates rarely focus on the tangible benefits these technologies offer beyond the exaggerated claims made by venture capitalists, technology firms, and businesspeople.

It seems that most people who vocally oppose LLMs have never personally tried to work with them. Most of the concerns I see come from people worried about low-quality posts on LinkedIn and the proliferation of disinformation or plausible-sounding spam, but these issues have already reached a scale that exceeds the ability of individuals to combat them.

If we want to engage in truly effective technology criticism, I think it is essential that we put in the effort to evaluate these tools thoroughly, if only to provide well-supported counterarguments. There are many fascinating aspects to LLMs and image-generation networks when it comes to personal productivity, and there is great potential for them to be used as aids in self-actualization. The more I experiment with these tools, the more I am able to discern their limitations. Still, I also keep encountering situations where they prove to be remarkably useful, if not downright transformative.

I find many things problematic about the current rollout of these models, but if anything, that is even more of an incentive to study them.

(Disclaimer: I used ChatGPT heavily to edit the final form of this article, and I documented how in a follow-up post.)