mnmlmnl

@mnl@hachyderm.io

This talk is about the GO GO GOLEMS ecosystem, which consists of:

  • glazed, a library that helps applications expose their rich internal data as well as make their functionality composable
  • parka and flour, libraries that turn glazed applications into APIs and event-driven services
  • sqleton, geppetto, escuse-me and many others, concrete applications built on top of glazed, parka and flour
  • many more utilities that were quickly iterated on...

The entire ecosystem is built around a few key concepts:

  • data should be exposed
  • computation should be composed
  • software should be deployed

While the talk will focus on some of the technical aspects of the GO GO GOLEMS software, its main topic is how to design software with a vision and, more specifically, how to leverage large language models to write better software.

I will cover:

  • how I design, brainstorm, iterate and refine my vision and design
  • how learning, note-taking, writing and drawing shape my software
  • why large language models are a powerful tool for building small software
  • how hard thinking, ruthless abstraction and writing lots of mediocre code go hand in hand
  • why we should build tools to make computers compute, so that they rightfully become tools for everybody and not just the few
  • why software is art, science, engineering, magic and craftsmanship combined

Better Together: Unifying Datalog and Equality Saturation

We present egglog, a fixpoint reasoning system that unifies Datalog and equality saturation (EqSat). Like Datalog, it supports efficient incremental execution, cooperating analyses, and lattice-based reasoning. Like EqSat, it supports term rewriting, efficient congruence closure, and extraction of optimized terms. 

We identify two recent applications—a unification-based pointer analysis in Datalog and an EqSat-based floating-point term rewriter—that have been hampered by features missing from Datalog but found in EqSat or vice-versa. We evaluate egglog by reimplementing those projects in egglog. The resulting systems in egglog are faster, simpler, and fix bugs found in the original systems.

While I don't plan on reading this paper soon, I have printed it out because, as a piler (the word [[Temple Grandin]] uses in [[BOOK – Visual Thinking – Temple Grandin]] to talk about people who like having piles of things they are working on), I want to have it in my visual periphery, if not now, then in the future.

[[Cory Doctorow]] and [[BOOK – Chokepoint Capitalism – Rebecca Giblin Cory Doctorow]]

I listened to the [[changelog Podcast]] episode interviewing [[Cory Doctorow]] about his book [[BOOK – Chokepoint Capitalism – Rebecca Giblin Cory Doctorow]] and found his points quite compelling. The book had been sitting on my bookshelf for a few weeks, but I hadn't yet dug into it.

I was especially compelled by his arguments around sampling and generative AI. Enforcing copyright around sampling basically meant that the entire power over sampling was handed to the big labels and not the musicians themselves, with the resulting effect that De La Soul's first three albums couldn't even be played anymore. With generative image AI, we are seeing the first lawsuit being brought by Getty, not individual artists. I still don't know where I fall with generative image AI, as so many of the products being built around it are clearly in the stock photo “fire your artist” category, and besides Adobe (of all companies!), not many are attempting to build artist-centric tools that incorporate AI into the workflow itself (things like BlenderGPT, etc...).

(rewritten with wordtune): I was especially compelled by his arguments around sampling and generative AI. As a result of copyright laws being enforced around sampling, the big labels now hold all power over sampling, not the musicians. The result was that albums like De La Soul's first three albums couldn't even be played anymore (until March of last year). In the meantime, Getty filed the first lawsuit about generative image AI, not an individual artist. I don't know where I fall with generative image AI, as so many of the products being built around it are clearly in the “stock photo fire your artist” category, and there aren't many artist-centric tools (like BlenderGPT, etc...) that incorporate artificial intelligence into the workflow itself (besides Adobe, of all companies!).

Despite my general annoyance at his meme-y style and his tendency to go full-on annoying nerd (I guess that resonates with many people, who am I to judge), he is quite eloquent, and I appreciated the clear statements around different business models. I canceled my Audible account right after the podcast, used my libro.fm account to get a copy of the audiobook, and listened to the first two chapters on my subsequent mountain bike ride.

Building glazed/sqleton

Procrastinating on writing articles about programming with LLMs, and on actually tackling a webserver for sqleton, I set out to add go-pretty support to [[PROJ – glazed]]. I went a little bit overboard there, and added not just support for the many different standard go-pretty styles, but in fact support for adding your own styles as YAML files.
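To give a flavor of what that looks like, here is a minimal sketch of loading a custom style from a YAML file. The YAML schema and field names below are hypothetical (glazed's actual format differs); only the go-pretty calls themselves are the library's real API.

```go
package main

import (
	"fmt"
	"os"

	"github.com/jedib0t/go-pretty/v6/table"
	"gopkg.in/yaml.v3"
)

// styleConfig is a made-up YAML schema for this sketch.
type styleConfig struct {
	Name string `yaml:"name"`
	Box  struct {
		Left             string `yaml:"left"`
		Right            string `yaml:"right"`
		MiddleVertical   string `yaml:"middle-vertical"`
		MiddleHorizontal string `yaml:"middle-horizontal"`
	} `yaml:"box"`
}

// loadStyle starts from a built-in go-pretty style and overrides
// the box-drawing characters specified in the YAML file.
func loadStyle(path string) (table.Style, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return table.StyleDefault, err
	}
	var cfg styleConfig
	if err := yaml.Unmarshal(data, &cfg); err != nil {
		return table.StyleDefault, err
	}
	s := table.StyleDefault
	s.Name = cfg.Name
	s.Box.Left = cfg.Box.Left
	s.Box.Right = cfg.Box.Right
	s.Box.MiddleVertical = cfg.Box.MiddleVertical
	s.Box.MiddleHorizontal = cfg.Box.MiddleHorizontal
	return s, nil
}

func main() {
	s, err := loadStyle("style.yaml")
	if err != nil {
		panic(err)
	}
	t := table.NewWriter()
	t.SetStyle(s)
	t.AppendHeader(table.Row{"name", "value"})
	t.AppendRow(table.Row{"answer", 42})
	fmt.Println(t.Render())
}
```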

After faffing around with using ChatGPT (or in fact, [[PROJ – pinocchio]]) within [[Alfred]], about which I should also write an article, I sat down to continue what I had started to sketch out on Friday: a watcher for a collection of command repositories. glazed offers a very simple abstraction for commands, defined by a type called CommandDescription (this part of the design was emergent, and as such still quite messy).
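Here is a minimal sketch of what such a watcher could look like, assuming fsnotify for file events; the CommandDescription fields shown are a hypothetical subset, and the real glazed types are richer (and messier, as noted).

```go
package main

import (
	"log"

	"github.com/fsnotify/fsnotify"
)

// CommandDescription here is a hypothetical subset of the real glazed type.
type CommandDescription struct {
	Name  string
	Short string
}

// watchRepository calls reload whenever a file in the repository
// directory is written to.
func watchRepository(dir string, reload func(path string)) error {
	w, err := fsnotify.NewWatcher()
	if err != nil {
		return err
	}
	defer w.Close()
	if err := w.Add(dir); err != nil {
		return err
	}
	for {
		select {
		case ev, ok := <-w.Events:
			if !ok {
				return nil
			}
			if ev.Op&fsnotify.Write != 0 {
				reload(ev.Name)
			}
		case err, ok := <-w.Errors:
			if !ok {
				return nil
			}
			log.Println("watch error:", err)
		}
	}
}

func main() {
	err := watchRepository("./commands", func(path string) {
		log.Println("reloading command definitions from", path)
		// parse the YAML file into CommandDescription values here
	})
	if err != nil {
		log.Fatal(err)
	}
}
```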

It's been a minute since I wrote regularly. I only sporadically jot things into my Obsidian vault these days—most of my writing is done either in my sketchbooks or in the margins of papers.

Since discovering Large Language Models in December and slowly realizing how transformative the technology is for programming, I have been binging on research papers. Three months later, with barely any notes having moved into the vault, I realize how much I am missing, and how much work I will have to put into transferring what I learned into notes that will keep their value into the future.

Cory Doctorow ascribes a fair amount of his prolific productivity to his habit of writing daily, in public, about the material he comes across. I have noticed a similar transcendent effect in making music (and sharing it with peeps on Discord) and in writing notes and publishing them in Obsidian. Really publishing them, however, rather than just making them accessible, is something I haven't done.

Writing two or three paragraphs is not that much effort, and the microblogging platform (as well as its cross-posting to Mastodon) is hopefully low enough on friction to allow me to do this 15 times a day if necessary.

I have been programming heavily with Github Copilot for over a year and with ChatGPT since it came out. It's the biggest paradigm shift in programming I've personally ever experienced, and I haven't shied away from learning wild stuff.

Programming with a Large Language Model (LLM for short) feels to me like a very “weird” way to program. It is neither a conversation with a human (because of course it isn't), nor is it programming as I know it, because I am using imprecise natural language to get fuzzy answers. The best way I can describe it: it feels like writing a Google search query vs. writing a regexp. Not only because of the language aspect, but because it is hard to tell “what” the algorithm is latching on to, and why it is giving the answers it gives.

I will be writing up quite a few of these smaller articles, but keep each of them short. I have started 10k-word odysseys multiple times and never gone back to them.

Technique number 1: Establishing context

Because with each API call the LLM starts “from scratch” (this is both its strength and its weakness), you have to provide context with each query. In the case of straight GPT API usage, this is entirely up to you. In the case of Copilot, a much more complex context is sent to the model (here's a write-up of reverse engineering an earlier version of Copilot: copilot-explorer | Hacky repo to see what the Copilot extension sends to the server). In the case of ChatGPT, I have more studying to do to give a decent answer, but it is obviously able to take previous messages into consideration.
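To make the “starts from scratch” point concrete, here is a minimal sketch of a raw completions call in Go, where all context has to be resent with every request. The endpoint and payload shape follow OpenAI's public completions API; the context and question strings are made up for illustration.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
)

func main() {
	// The model has no memory: everything it should "know" about our
	// codebase has to travel with the query, every single time.
	context := "// Go struct used across our codebase:\n" +
		"type CommandDescription struct { Name, Short string }\n"
	question := "Write a constructor NewCommandDescription for this struct."

	body, _ := json.Marshal(map[string]interface{}{
		"model":      "text-davinci-003",
		"prompt":     context + "\n" + question,
		"max_tokens": 256,
	})
	req, _ := http.NewRequest("POST",
		"https://api.openai.com/v1/completions", bytes.NewReader(body))
	req.Header.Set("Authorization", "Bearer "+os.Getenv("OPENAI_API_KEY"))
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	out, _ := io.ReadAll(resp.Body)
	fmt.Println(string(out))
}
```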

A language model is a mathematical construct that provides the probability of a certain word (or token, to be more precise) given a certain context. This allows it to accomplish a variety of tasks, for example by selecting the most probable token (or one of the most probable) and building sentences that way. In the case of a large language model, this probability is computed by an immense deep learning network that has been trained on a gargantuan amount of data. There is no realistic way to say why some probability got computed, but there is definitely structure to it.
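In standard notation (my addition, not anything specific to these products), that is:

```latex
% joint probability of a sentence, factored token by token
P(x_1, \dots, x_T) = \prod_{t=1}^{T} P(x_t \mid x_1, \dots, x_{t-1})

% greedy decoding: always pick the most probable next token
\hat{x}_t = \arg\max_{x} \, P(x \mid \hat{x}_1, \dots, \hat{x}_{t-1})
```

Greedy decoding picks the single most probable token at each step; sampling from the distribution instead is what produces varied completions.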

I think of these models as cars that have been engineered to follow a certain set of roads, at a certain speed, and with certain capabilities to navigate their environment; it is our job now to direct them where to go. I like to use the following example, and have a human answer it. Complete the following sentences:

  • As a suburban dad, when I go to town I use my ...
  • As a race car engineer, when I go to town I use my ...
  • As a 12 year old, when I go to town I use my ...
  • As an alien from outer space, when I go to town I use my ...

Think of prompting as a similar exercise, except that you often want the model to complete with much more useful information: either new facts, or longer sequences of words to save on typing and research time (say, scaffolding out a shell script, a SQL schema, etc...).

Two weeks after ChatGPT came out and I had my “lol this is fun” time, I decided to use it as fully and honestly as I could for an entire week, as a tool like any other. Tools are often unintuitive; a good tool often requires you to practice a lot before its quality comes to shine—many tools can seriously hurt you if you don't know how to use them. It definitely feels the same with ChatGPT. Three months later, I still think I am only scratching the surface of what large language models concretely offer for my field of work: programming.

Tool building and personal ethics

I love tools; I have used and built many in my life—in fact, I think my “goal” in life is to build tools for other people. There is nothing more fulfilling than seeing someone use something I built to build something that absolutely floors me. There is nothing more fulfilling than seeing someone thoroughly enjoy and cherish a tool I built because it makes them feel like they are doing better work.

While I didn't think of formulating my personal ethics before picking up writing (I do now because I discovered that writing is itself a tool to sharpen ideas), I think I had a clear direction for most of my life. Save for one freelance job, I always rejected working for companies that weren't building tools for users (vs, say, tools for companies, or just plain nonsense, like ad-tech). Discovering Ivan Illich and his “Tools for Conviviality” last year allowed me to realize that these are concepts that you can articulate and communicate.

While I haven't even made it halfway through Illich's book, the first couple of chapters nail what my ethics are: tools and machines should be built and used to empower humans to live richer, more creative, more free lives. Tools should be built to augment human craft, not replace it.

My standpoint on large language models

I think that large language models (I will never be able to say “AI” because it is such a ridiculous, misleading, polarizing, disingenuous term) have an incredible potential for augmenting human craft. I genuinely didn't think this way of transforming natural language would appear during my lifetime. These models are a paradigm shift in how I work and write software (both professionally and in open-source). I am building tools that, more than anything I've built so far, would allow me to share the joy and importance of knowledge and intellectual work, and allow others to do the same. I will write more about why I think that and how I use them to that effect (here are two examples in the meantime: https://share.descript.com/view/w2gXSYanYMp and https://share.descript.com/view/YGc8nnsL52G )

It also means that I find the way these tools are being released (especially ChatGPT and Bing Chat), and the overall discourse of “replacing customer support”, “a search engine that answers your questions”, and “replacing artists”, absolutely abhorrent. These things don't make art, they don't answer questions, and they certainly don't replace humans providing meaningful customer support. They can certainly help people do these things, but by themselves, they will just fool people into thinking there was meaning where there is none, and allow grifters to pretend the same. There is a complete 180 between a human using a tool powered by a large language model to provide better support because they now have more agency, and something taking agency away from a human (to the point of replacing them entirely) and packaging it into a sterile chatbot.

As subtle as the difference between these two use cases might seem, to me they are two entirely incompatible sides of the same coin. One side is building tools to empower humans; the other is building tools to disenfranchise humans, both workers and consumers. That subtlety makes talking about it hard, especially in my heavily “anti-capitalist” circles. The assumption is that LLMs are only there to replace workers and enrich techno robber barons, so any mention of a productive use of LLMs immediately leads to angry callouts and mob dogpiling. (I am putting anti-capitalist in quotes because I certainly don't appreciate being called a tech-bro by someone who is a principal engineer at Microsoft, of all companies. Yeah, that stuck...)

What I am doing about it

As an individual, as a tool-builder, as a techno-optimist, I think the biggest impact I can have in making the world a slightly better place is to share how I use these tools to enrich my life and my work, creatively and intellectually, because it is not something written about much, and these are not easy tools to steer.

I am also building open-source tooling and making it not just nice to use, but packaged with a strong ethical stance, so that you can't just take it and build a chatbot with it without coming into contact with some material that will hopefully make you think twice about what you are doing. What that looks like is still a bit unclear to me, but it's proper dada (see https://github.com/go-go-golems ).

It also means that I will consistently call out bullshit in the AI-pilled circles I hang out in (because part of being serious about a field that is overhyped is that you come in contact with a lot of BS and grift and exploitation). People are enthusiastic about these technologies for a lot of reasons, and their worldview is heavily shaped by the framing of the companies behind these models—most of them are already victims of the future the companies building these tools wish to unleash upon us, so it is often easy to start an earnest conversation. It is infinitely more productive than telling people that they are gullible fools, or harbor evil intentions.

If I can change the minds of 20 people in an AI-hype discord by dropping a few spicy links, then I'll definitely hang out on AI-hype discords. Conversely, if I can show people who think LLMs are just random words strung together that these tools can help grassroots organizations build more accessible documentation, websites, and more secure software; that they can free you from sacrificing your cognitive potential on the altar of capitalist bullshit work, then that's where you will find me too.

(This article was written entirely without LLMs because I just want to get it out. Otherwise, I would have spent quite a bit more time revising and editing it with wordtune and chatGPT, because I think they do make my writing better.)

I created this write.as account because I wanted to write more. Yet, since January 2023, I have almost entirely stopped writing, both in blog form and in my Obsidian. It is not because I have nothing to say (on the contrary); I have lapsed in my regular habit, another habit has taken over, and things have gone the way of ADHD.

The invoice for this month's bill reminded me that it only takes a few minutes to yeet out a blog post, that it doesn't have to be quality, that no one cares anyway, so here I am.

TL;DR: I used ChatGPT extensively to edit The missing discussion of LLMs potential. It was cool. I will do it again.

I have been using ChatGPT to help me edit some of my writing, but never for a full blog post. I am not a native English speaker and decided to take up writing (and improving my writing) in 2021. A big part of writing is editing. Editing is notoriously difficult: it is hard for an author to distance themselves enough from their writing (and thinking) to make the choices that would benefit a reader the most. This is true in music just as it is in writing. While an easy way to get some distance is to let pieces rest, this is not realistic in a professional context, nor do I want to leave drafts of blog posts lying around too long.

An often untold benefit of editing comes from editing other people's writing and having other people edit your own. It is a remarkably efficient way of getting a sense of what makes for good writing and what doesn't, and it provides mutual benefit. It is also hard to see one's own words being modified, discarded, and criticized. Learning how to criticize and how to accept criticism is a big part of art education. It should be more prevalent in the education of programmers, since code reviews and refactoring are a crucial aspect of programming.

One thing ChatGPT is particularly impressive at is stylistic transfer (“rewrite X in the style of the Declaration of Independence” was a viral hit). I figured this would allow me to see my own writing from different perspectives, in a very quick feedback loop.

Editing an article

I usually start my blog posts by freewriting on my Freewrite Traveler or by collecting some of my threads on Mastodon. I'll do a first editing pass, playing with the order of arguments, finding additional points to raise in my Zettelkasten, and rewriting sentences to create flow. Lately, I have experimented with pushing these second drafts out quickly, which was a good exercise to overcome my perfectionism, although I think I veered too much in the other direction.

I have used professional editors in the past, and the value they add is very different. They can understand what I am trying to achieve—the point I might be trying to make—and suggest writing a completely different article instead, or discarding everything but a paragraph and starting again. These are judgments I wouldn't trust ChatGPT with (although I might decide to use it for ideation; in that case, I would still be operating within the confines of my own limitations).

In this case, I have been working on an extremely long article about using LLMs and stable diffusion for retrieval practice, and decided to extract the irrelevant rambling into its own blog posts. I asked ChatGPT to suggest different ways of rephrasing individual paragraphs, ultimately settling on the “style of a sober Atlantic article.”

Here's the raw dump of the session, for your perusal. I think it has overall been very beneficial to the wording of the article, although only time will tell. I can definitely see myself becoming bored of its style after using it for a couple of articles. I could see myself deciding that I don't need it anymore except for sentences that give me real trouble. I could see myself building a tool that streamlines the process so that I can use it for everything I write. I could see myself pushing it much further than I did here.

For a full before/after comparison, see this entry in my vault.

Using ChatGPT as an editor, a raw log

Opening paragraph

Original paragraph:

My favorite software tool of 2022 is Github Copilot, which has transformed how I write software. While I studied machine learning in general and neural networks in particular in 2017, I got frustrated by the hype-to-usefulness ratio and decided to move on to other things. Ironically, this was right as the Transformers paper came out. GPT2, GPT3, and then ChatGPT definitely showed me how fast the field could move, and ChatGPT made me realize that a whole new page had been turned.

Prompt for GPT:

Rewrite this paragraph slightly to make it a bit more lyrical, in the style of a new yorker article:

Answer:

As I write this in the waning days of 2021, my thoughts are already with the software tool that has captured my heart and changed my life: Github Copilot. It was back in 2017 when I was deeply immersed in the study of machine learning, and particularly neural networks, that I began to feel a sense of frustration with the disconnect between the hype and the practical applications. I made the decision to move on to other pursuits, little realizing that this was the very moment when the Transformers paper was about to shake the field to its core. The rapid succession of GPT2, GPT3, and ChatGPT that followed only served to underscore the breathtaking pace of progress, and it was ChatGPT that truly opened my eyes to the fact that a new chapter had begun.

Prompt:

Rewrite this paragraphh in the style of a critical the atlantic article:

Answer:

In 2022, it is Github Copilot that stands out as my preferred software tool, fundamentally altering the way I approach writing code. As I delved into the study of machine learning and, specifically, neural networks in 2017, I was increasingly disappointed by the disproportionate emphasis on hype over tangible usefulness. I ultimately decided to turn my attention elsewhere, only to be caught off guard by the release of the Transformers paper. The subsequent emergence of GPT2, GPT3, and ChatGPT served as a harsh reminder of the lightning speed at which the field was advancing, with ChatGPT serving as a particularly poignant signal that a new era had dawned.

Prompt:

Rewrite this paragraph in a more sober, yet personal style:

In 2022, it is Github Copilot that stands out as my preferred software tool, fundamentally altering the way I approach writing code. As I delved into the study of machine learning and, specifically, neural networks in 2017, I was increasingly disappointed by the disproportionate emphasis on hype over tangible usefulness. I ultimately decided to turn my attention elsewhere, only to be caught off guard by the release of the Transformers paper. The subsequent emergence of GPT2, GPT3, and ChatGPT served as a harsh reminder of the lightning speed at which the field was advancing, with ChatGPT serving as a particularly poignant signal that a new era had dawned.

Answer:

As I reflect on my experiences in the world of software development, there is one tool that has had a particularly profound impact on my work: Github Copilot. When I began studying machine learning and neural networks in 2017, I found myself growing disillusioned by the emphasis on buzz over substance. I decided to shift my focus elsewhere, only to be startled by the arrival of the Transformers paper and the subsequent rapid emergence of GPT2, GPT3, and ChatGPT. These developments drove home the point that the field was moving at an extraordinary pace, with ChatGPT serving as a particularly noteworthy marker of a new chapter. Through its influence on my approach to writing code, Github Copilot has been a game-changer for me.

Ultimately, I settled on:

I studied machine learning in general and neural networks in particular in 2017 but got increasingly frustrated by the disconnect between the hype and actual results when put into practice. I decided to move on to other things. Ironically, this was right as the Transformers paper came out. The subsequent emergence of GPT2 and then GPT3 (alongside Codex) was a reminder of how fast the field could move, and ChatGPT made me realize that a new chapter had begun. My favorite software tool of 2022 is Github Copilot, which has fundamentally transformed how I write software.

Second paragraph

Original paragraph:

I don't know if the technology itself is just incrementally improving or if an actual threshold has been crossed. ChatGPT dug itself out of the uncanny valley that was holding me back from using natural language models (while I was immediately comfortable with more pragmatic applications like Copilot, in part due to the care around its prompt engineering ). Shortly after its release, I decided to give ChatGPT a spin as a daily tool for both work and personal research—just as I would a new IDE or programming language; I am still on that journey, and while I feel I haven't really left home base yet, I have already discovered many amazing things.

Prompt:

Rewrite the following paragraph in the style of a sober atlantic article

Answer:

It is difficult to say whether the recent advancements in technology are simply incremental improvements or if we have reached a true turning point. For me, ChatGPT represented a crucial step out of the uncanny valley that had previously hindered my use of natural language models. Despite feeling immediately at ease with more practical tools like Copilot, thanks in part to the attention given to prompt engineering, I was drawn to ChatGPT and decided to incorporate it into my daily work and personal research in the same way I might with a new IDE or programming language. As I continue on this journey, I feel as though I have only just begun to scratch the surface of what is possible, yet I have already made a number of exciting discoveries.

Great sentence:

“It is difficult to say whether the recent advancements in technology are simply incremental improvements or if we have reached a true turning point.”

In general the whole paragraph is better than what I wrote, and I settled on:

It is difficult to say whether the recent technological advancements are simply incremental improvements or if we have reached a true turning point. ChatGPT dug itself out of the uncanny valley that was holding me back from using natural language models. Despite feeling immediately at ease with a more practical tool like Copilot, thanks in part to the attention given to prompt engineering, I was mesmerized (as so many of us were) when ChatGPT came out; I decided to incorporate it into my daily work and personal research in the same way I might with a new IDE or programming language. As I continue on this journey, I feel as though I have only just begun to scratch the surface of what is possible, yet I have already made a number of exciting discoveries.

Creating a subheading

I then decided to ask it for alternative headings, which was less interesting (although scroll further down for a further attempt).

Prompt:

This is a subtitle in the article I am writing, but I would like it to be shorter and a bit more impactful, while retaining its meaning: Amidst the criticism, I am missing a discussion of their potential

Answer:

“Ignoring the potential amidst the criticism”

Prompt:

no, try again

Answer:

“Overlooking potential amid the criticism”

Prompt:

no, try again, emphasizing that I a missing a discussion of the technology's potential

Answer:

“The missing discussion of potential amid the criticism”

Next paragraph

Original paragraph:

I think LLMs and image-generation networks are powerful tools for learning and helping humans process information. I am always a bit dismayed by the debates around these technologies in my social bubbles, not because the points raised (biased training, generating wrong but plausible-sounding bullshit, non-consensual training on copyrighted data, training and research being mostly in the hands of corporations, cost, and energy concerns, impact on knowledge workers being replaced by machines) aren't important, but because the debate never really focuses on actual benefits to its users, besides the wild overblown promises of the VC/tech/business people.

Prompt:

Rephrase the following in the style of a sober atlantic article

Answer:

As someone who believes that LLMs and image-generation networks have the power to facilitate learning and aid in the processing of information for humans, I am often disheartened by the debates surrounding these technologies in my social circles. It is not that the issues raised – such as biased training, the generation of misleading but credible-sounding misinformation, non-consensual training on copyrighted data, the concentration of training and research in the hands of corporations, cost and energy considerations, and the potential for displacement of knowledge workers by machines – are not important. Rather, it is that these discussions rarely focus on the tangible benefits these technologies can offer to their users, beyond the extravagant and exaggerated claims made by venture capitalists, technology firms, and businesspeople.

I ended up with:

I think LLMs and image-generation networks are powerful tools for learning and helping humans process information. I am often disheartened by the debates surrounding these technologies in my social circles. Not because the issues raised—such as biased training, non-consensual training on copyrighted data, the concentration of training and research in the hands of corporations, cost and energy considerations, and the potential for displacement of knowledge workers by machines—aren't important. Rather, it is that these debates rarely focus on the tangible benefits these technologies offer beyond the exaggerated claims made by venture capitalists, technology firms, and businesspeople.

Asking it for its “reasoning”

I decided to ask it a couple of questions about what it was “thinking.” I have been working with ChatGPT enough to know that it has absolutely no real understanding of what it is doing but instead is just vibing. Yet, these vibes have structure, and vibes are what I am interested in, so it is instructive to ask these kinds of questions at times. It is important to remember that its answers might sound like some “human reasoning” was going on, but that is just an illusion. The reasoning these models have is ultimately very alien to us.

Prompt:

Why did you do the changes you did?

Answer:

I made the changes to the original paragraph in order to reflect the more sober, measured tone of an Atlantic article. This involved rephrasing some of the more casual language, such as “my social bubbles,” and avoiding words and phrases that might come across as overly emotional or subjective. Additionally, I aimed to add clarity and specificity to certain points, such as by specifying the types of biases and the sources of overblown promises.

Note how the second part is just nonsense, in case you need a reminder that this stuff is not “intelligent” in any reasonable sense of the word.

Prompt:

which phrases came across as overly emotional or subjective in the original paragraph?

Answer:

Phrases that might be considered overly emotional or subjective in the original paragraph include “always a bit dismayed” and “wild overblown promises.” These phrases convey a sense of strong personal feeling, which might not be appropriate in a more sober, objective style of writing such as that of an Atlantic article.

Prompt:

Are there other changes you made that you think were particularly important?

Answer:

One change that I think was particularly important was the addition of specificity to certain points. For example, the original paragraph mentioned “biased training,” but did not specify the type of bias being referred to. By specifying that the biases in question relate to “training,” I was able to add clarity to this point and make it more concrete for the reader. Similarly, the original paragraph mentioned “wild overblown promises,” but did not specify the source of these promises. By specifying that the promises in question are made by “venture capitalists, technology firms, and businesspeople,” I was able to provide more context and help the reader better understand the issue being addressed.

sic.

Trying to get a subtitle again

Newer LLMs are usually quite good at summarizing text, and that is one of my main uses for them when writing documentation or emails. I know I tend to write too much, and I'll often distill things through ChatGPT to shorten my writing (rarely actually using ChatGPT's output, because it is very often not exactly what I am trying to express).

Prompt:

What do you think the main point of the paragraph was?

Answer:

The main point of the paragraph seems to be that the debates surrounding LLMs and image-generation networks often do not focus on the actual benefits these technologies can offer to users, but rather on other issues such as biased training and the concentration of training and research in the hands of corporations. The writer expresses frustration with this lack of focus on the potential benefits, and suggests that the extravagant promises made by certain groups may be contributing to this lack of emphasis.

Prompt:

What would be a good subtitle for this paragraph?

Answer:

A good subtitle for this paragraph might be: “Failing to Consider the Potential Benefits of LLMs and Image-Generation Networks.” This title captures the main point of the paragraph, which is that the debates surrounding these technologies often overlook their potential benefits, and conveys the sense of frustration expressed by the writer.

Editing a short paragraph

This shows that the conversational interface to ChatGPT is quite interesting. In many ways, the whole process feels like talking to a well-read but utterly stoned friend who is also a pathological liar. I actually have a few real-life friends like that, and I always enjoy spending time with them, because their company is very intellectually stimulating. Having to take everything they say with a huge grain of salt is part of that stimulation.

Prompt:

Can you summarize the following in a single sentence, in the same sober style of an atlantic piece? It seems that most people loudly arguing against LLMs have never tried to work with them. I get it that people are worried about low-quality posts on LinkedIn and flooding of disinformation or plausible-sounding spam, but that phenomenon has already reached a scale that overwhelms the individual.

Answer:

Most individuals who vocally oppose LLMs have not personally worked with them, despite concerns about low-quality posts on LinkedIn and the proliferation of disinformation and plausible-sounding spam that have already reached a level that exceeds the ability of individuals to combat.

Prompt:

Actually, summarize it in two distinct sentences, to separate the two facts

Answer:

Most individuals who vocally oppose LLMs have not personally worked with them. There are concerns about low-quality posts on LinkedIn and the proliferation of disinformation and plausible-sounding spam, but these issues have already reached a scale that exceeds the ability of individuals to combat.

I ended up with:

It seems that most people who vocally oppose LLMs have never personally tried to work with them. Most concerns I see are people worried about low-quality posts on LinkedIn and the proliferation of disinformation or plausible-sounding spam, but these issues have already reached a scale that exceeds the ability of individuals to combat.

More paragraph editing

I lost stamina towards the end, and decided to wrap things up:

Prompt for the final paragraph of the article:

Rewrite the following in a the style of ezra klein on vox.com: If we want to do proper technology criticism, we have to put in the effort to evaluate it fairly and intensively. There are tremendous amounts of interesting aspects to these models when used for personal productivity and great potential for them to be used for self-actualization. The more I play with them, the more I see their limitations, but the more I have scenarios where they turn out to be extremely useful, if not game changers.

Answer:

If we want to engage in truly effective technology criticism, it is essential that we put in the time and effort to assess these tools objectively and thoroughly. There are many fascinating aspects to LLMs and image-generation networks when it comes to personal productivity, and there is great potential for them to be used as aids in self-actualization. The more I experiment with these tools, the more I am able to discern their limitations, but I also keep encountering situations where they prove to be remarkably useful, if not downright transformative.

Prompt:

Do it again, but just for this sentence: If we want to engage in truly effective technology criticism, I think it is essential that we put in the effort to evaluate these tools thoroughly, if only to give solid counterarguments.

Answer:

In order to engage in truly effective technology criticism, it is crucial that we invest the necessary time and effort to evaluate these tools thoroughly, even if only to provide well-supported counterarguments.

Prompt:

Rewrite the following sentence as a punchy closing sentence to the few previous paragraphs: There are many things I find problematic about the current rollout of these models, but that is not going to keep me from studying them. For example, I have my own personal concerns about the misuse of LLMs for disinformation.

Answer:

Despite my reservations about the current deployment of these models, I remain committed to examining them closely – even my own personal concerns about the potential misuse of LLMs for disinformation will not deter me.

Prompt:

How would David Resnick write the following sentence: I find many things problematic about the current rollout of these models, but if anything, that is even more of an incentive to study them.

Answer:

It is precisely because there are so many aspects of the current rollout of these models that I find problematic that I am all the more motivated to study them.

Prompt:

How would Ezra Klein write it?

Answer:

There are many things about the current rollout of these models that give me pause, but rather than dissuade me, those very concerns only strengthen my resolve to delve deeper into their study.

What I ultimately settled on:

If we want to engage in truly effective technology criticism, I think it is essential that we put in the effort to evaluate these tools thoroughly, if only to provide well-supported counterarguments. There are many fascinating aspects to LLMs and image-generation networks when it comes to personal productivity, and there is great potential for them to be used as aids in self-actualization. The more I experiment with these tools, the more I am able to discern their limitations. Still, I also keep encountering situations where they prove to be remarkably useful, if not downright transformative.

I find many things problematic about the current rollout of these models, but if anything, that is even more of an incentive to study them.

Conclusion

I hope this raw overview of what I did for the blog post I just published was instructive. This is all an experiment, and I would love feedback. Do you think the editing was valuable? Do you do something similar? How does this compare to human editing? What are the downsides you see?

I studied machine learning in general and neural networks in particular in 2017 but got increasingly frustrated by the disconnect between the hype and actual results when put into practice. I decided to move on to other things. Ironically, this was right as the Transformers paper came out. The subsequent emergence of GPT2 and then GPT3 (alongside Codex) was a reminder of how fast the field could move, and ChatGPT made me realize that a new chapter had begun. My favorite software tool of 2022 is Github Copilot, which has fundamentally transformed how I write software.

It is difficult to say whether the recent technological advancements are simply incremental improvements or if we have reached a true turning point. ChatGPT dug itself out of the uncanny valley that was holding me back from using natural language models. Despite feeling immediately at ease with a more practical tool like Copilot, thanks in part to the attention given to prompt engineering, I was mesmerized (as so many of us were) when ChatGPT came out; I decided to incorporate it into my daily work and personal research in the same way I might with a new IDE or programming language. As I continue on this journey, I feel as though I have only just begun to scratch the surface of what is possible, yet I have already made a number of exciting discoveries.

I think LLMs and image-generation networks are powerful tools for learning and helping humans process information. I am often disheartened by the debates surrounding these technologies in my social circles. Not because the issues raised—such as biased training, non-consensual training on copyrighted data, the concentration of training and research in the hands of corporations, cost and energy considerations, and the potential for displacement of knowledge workers by machines—aren't important. Rather, it is that these debates rarely focus on the tangible benefits these technologies offer beyond the exaggerated claims made by venture capitalists, technology firms, and businesspeople.

It seems that most people who vocally oppose LLMs have never personally tried to work with them. Most concerns I see are people worried about low-quality posts on LinkedIn and the proliferation of disinformation or plausible-sounding spam, but these issues have already reached a scale that exceeds the ability of individuals to combat.

If we want to engage in truly effective technology criticism, I think it is essential that we put in the effort to evaluate these tools thoroughly, if only to provide well-supported counterarguments. There are many fascinating aspects to LLMs and image-generation networks when it comes to personal productivity, and there is great potential for them to be used as aids in self-actualization. The more I experiment with these tools, the more I am able to discern their limitations. Still, I also keep encountering situations where they prove to be remarkably useful, if not downright transformative.

I find many things problematic about the current rollout of these models, but if anything, that is even more of an incentive to study them.

(Disclaimer: I used ChatGPT heavily to edit the final form of this article, and I documented how in a follow-up post)

TL;DR: it doesn't matter how many tabs you have or how you close them; the value I get from them comes down to intellectually engaging with them fully, which is exhausting and rewarding. The most valuable practice I found is “generating” out of tabs, and for that, sometimes, just the tab title is enough. No tool is going to save you.

I have a few modes of “consuming” these tab piles.

In-depth studying

In-depth studying can be, for example, taking one day to go through one page of a math book. It's sitting down with a tutorial and actually going through it, doing all the side exercises, and then reflecting upon it. It's slow af, but it's gratifying, and of course, I learn things. Over the long term, the value of that learning diminishes, sometimes very rapidly, depending on what I focused on. I learned awk and R repeatedly, at times over months, and it's all gone. What stays are some more profound insights that were uncovered just through sheer focus. This is the “it takes a full day to close a single tab” mode.

Of course, a tab could be a textbook that actually would take three semesters to work through, so there's a wide range in what “in-depth” itself means.

Reading and annotating

Reading and annotating is where I sit down with an article (for example, using Reader) and read it to really engage with it. I don't just highlight interesting passages; I put myself in the mindset of having a conversation with the author, of pitting my ideas against theirs. This is a pretty high-intensity activity, and when I do this over the weekend, it takes me about an hour to process a single tab.

This way of processing information is what I find has the most “return on time reading.” I have a reasonably elaborate Zettelkasten system going, and filing thoughts that come out of articles, along with notes, is very productive; it often leads to blog posts. Over time, I have figured out how to make highlights, quotes, and crosslinking work for me (it wasn't easy!).

The downside is that usually, for every tab closed, 80 more get opened. I can reasonably process about 5-6 tabs this way during a work week, maybe ten if I'm pushing it. On holidays I would average 5-6 per day because I get more efficient once I get into a particular “scholarly” routine.

Reading for entertainment

This would be just reading a tab for fun. I only do this if I just opened a tab. I rarely go back to an old tab and then just read it for fun, as it's usually just more dopamine-rewarding to open a new tab on HN or scroll Mastodon. This is reasonably fast and generally pretty transient in terms of “return on investment.” Sure maybe over the years, I get something out of it, but I consider it entertainment (which is great!).

I think there is a lot of value in just looking at the title of a tab, quickly scrolling through it, and either discarding it or writing a small paragraph about it. Writing a short entry about why I think the link is interesting, what is interesting about it, how I found it, and what other pieces of knowledge relate to it is the most valuable thing to capture. It puts the rich context I have in my head, at this moment, onto paper.

If that little paragraph is stored in a suitable location, it means that the next time I want to study that topic or look something up, I will find it, along with its link, and immediately get context. That is actually extremely valuable. This filing of links is something I am not very good at, and I definitely want to work on making that part of my workflow better.

It's very easy to think that I will remember the nuances of my current thinking in the future, but that's a delusion. I need to write it down. Otherwise, it's gone.

This not only creates a lasting way for me to find that link again (this is where a crosslinked system becomes very useful because it allows me to crosslink the text I am writing at the speed of typing, really)—it seizes the opportunity to do some generative thinking, creating new knowledge that is entirely mine.

Just the fact of writing that little paragraph explaining why I want to read the link in the future probably gives me more “value” (as in, it will help me generate my knowledge in the future) than actually reading for leisure (see 3), because I actually “created” something myself.

This takes about 2-5 minutes per tab (sometimes faster when groups of tabs relate to the same concept). It is also exhausting work; if I do this for two hours, I'll be ready to plop down in front of Netflix and purge my brain with reality TV.

A concrete example: filing the Fennel homepage

A concrete example: I stumble across the Fennel programming language. Incredibly interesting to me, but also something I feel would deserve a few months, if not a year, of attention to “really” get it. I can file it away under Lisp / Lua / Programming Languages and my daily log in my Obsidian vault, maybe skim the website and make a bullet-point list of points I find interesting, and link an HN discussion.

Just for fun, I created the entry for Fennel Lisp by “processing” the index page to show what that is like: https://publish.obsidian.md/manuel/Wiki/Programming/Fennel+Lisp

I mentioned Fennel Lisp in June in my daily notes, so I also linked that. If you step through the vault, you can see how I slowly moved from:

  • pile of links, to
  • pile of links with a single sentence per link, to
  • full paragraph with a lot more thinking about how I relate to the link.

Which is better?

So, is any of these ways of processing open tabs better? I like all of them, and I definitely had to build workflows for 1, 2, and 4. I am content now knowing that there is no solution: feeling like you can process 800 links a day is a delusion. Instead, I focus on time-boxing “quality time” and just closing all the tabs once I'm done; there'll be plenty of high-value quality time the next day.

We know that even slight changes in the user experience or algorithms of social media software can have snowball effects and drastically change how people interact.

But no one is omniscient. Even for the best-intentioned developers, the most carefully planned and discussed feature can, in hindsight, turn out to have undesired effects.

Not only should we design our social media experiences to foster an individual's agency (individual opt-in, instance opt-in, easy instance migration), we should also design features and processes such that we can reconsider our decisions and potentially roll them back easily.

On a software level, this could for example mean (a small sketch follows the list):

  • separate feature releases from bugfix releases
  • use a plugin architecture, or at least enable entire features to be toggle-able at instance level
  • structure open-source projects such that forks can benefit from each other's work without excessive fragmentation
  • always think of frontend/client developers first, because they ultimately shape what the experience is for most users
  • have features degrade gracefully for people who opt-out
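To make the toggle and graceful-degradation points concrete, here is a small sketch; all types and names here are invented for illustration, not any fediverse project's actual API.

```go
package main

import "fmt"

// Feature is the unit a plugin architecture would register (hypothetical).
type Feature interface {
	Name() string
	Render(post string) string // the enriched, opt-in experience
}

// QuotePosts is a stand-in for any contested feature.
type QuotePosts struct{}

func (QuotePosts) Name() string              { return "quote-posts" }
func (QuotePosts) Render(post string) string { return "> " + post }

// Instance keeps per-instance toggles, so a feature can be rolled
// back by flipping a flag rather than shipping a new release.
type Instance struct {
	enabled map[string]bool
}

// Render degrades gracefully: without the instance toggle or the
// user's opt-in, the baseline experience is untouched.
func (i *Instance) Render(f Feature, post string, userOptedIn bool) string {
	if i.enabled[f.Name()] && userOptedIn {
		return f.Render(post)
	}
	return post
}

func main() {
	inst := &Instance{enabled: map[string]bool{"quote-posts": true}}
	fmt.Println(inst.Render(QuotePosts{}, "hello fediverse", true))  // "> hello fediverse"
	fmt.Println(inst.Render(QuotePosts{}, "hello fediverse", false)) // "hello fediverse"
}
```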

On a “governance” level:

  • clear guidelines for the adoption of new features (could be by vote, by decree, by dice-roll, as long as instance members agree to it)
  • processes to check back in and see if people still think the feature is a good idea
  • this could include beta programs or A/B testing, as long as users consent to participating
  • processes to roll back previously agreed upon features

On a community level:

  • understand that people want different things out of social media—pretty much everybody taking part in these discussions is of good faith and deserves respect
  • de-federation is the mechanism that enables us to have productive discussions around these topics. If, after listening to each other and understanding each other's incentives, we decide we are fundamentally incompatible, we can sever all links yet both still contribute to the fediverse itself

Disclaimer: I say we because I understand I am not operating in a vacuum, but these are just concepts I am working out for myself. I welcome feedback!