With all the AI hype in the news, you would think it can do just about anything. Like many people, I gave the free version of ChatGPT a whirl. Yes, it was impressive. I threw a variety of topics at it, from chemistry, physics, literature, and electrical engineering to RF topics like VSWR, and even had it tell me a few jokes. It's early in the ball game, but the results are already pretty impressive.
As you may know, what you type into ChatGPT doesn't necessarily stay private. Your conversations can be used to help train the LLM (Large Language Model) running in the background on the supercomputer cluster. Not a big deal for guys fooling around and asking silly questions. But what about a company working on proprietary products or processes, like drug development or R&D on an important product yet to be patented? Or even an engineer or consultant working on a pet project at home who wants to keep prying eyes away?
Well, that's where self-hosting your own version of a ChatGPT-like environment on your own local network would be great. It's possible now, and it doesn't require a supercomputer. I first read about this possibility over a year ago. There are some open source software projects now, along with some powerful pre-trained LLMs available for free.
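To give a feel for how little glue is needed, here's a minimal sketch in Python. It assumes you've installed the open source Ollama server (ollama.com) on a machine on your LAN and pulled a model with something like "ollama pull llama3"; the model name and port below are just Ollama's defaults, not anything special:

```python
# Minimal sketch: talk to a locally hosted LLM over Ollama's HTTP API.
# Assumes Ollama is running on this machine (default port 11434) and
# you've already pulled the "llama3" model.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3",                        # whichever model you pulled
        "prompt": "Explain VSWR in two sentences.",
        "stream": False,                          # return one complete JSON reply
    },
)
print(resp.json()["response"])
```

Point the URL at whichever box on your network runs the server and every other machine in the house can use it; nothing ever leaves your LAN.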
It's actually pretty amazing. Companies like Google, Meta and others have spent millions of dollars and many months of compute time creating and training these models. Their inputs consist of vast amounts of data scraped from the web: places like Reddit and other online forums to get an understanding of language, including slang, plus factual sites, online sales sites, on and on it goes. All of this text gets tokenized into numerical representations, which are then fed through a transformer, a deep learning architecture that converts the tokens into vectors and relates them to one another using what's called a multi-head attention mechanism. If it sounds complex, that's because it really is.

But get from this that the hard part is creating, training and encoding these models. That's where the heavy lifting is. That's where the supercomputers are needed, along with GPUs like the H100 from NVIDIA. Servers run up to 8 of these monster chips on a giant heatsink, each drawing up to 700W of DC power, and the data centers have racks of them. These GPUs excel at matrix arithmetic and massively parallel tasks. They have thousands of compute cores and huge bus bandwidth. That's what it takes to make these models.
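If you want a taste of what that attention mechanism actually computes, here's a toy sketch in Python. The numbers are random stand-ins for real learned weights, and a real model repeats this block many times in parallel (that's the "multi-head" part), but the core math is just this:

```python
# Toy illustration (not any specific model): scaled dot-product attention,
# the core operation inside a transformer's multi-head attention blocks.
import numpy as np

def attention(Q, K, V):
    """softmax(Q K^T / sqrt(d)) V"""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                    # each query scored against each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the keys
    return weights @ V                               # weighted mix of value vectors

# Pretend a 4-token sentence has been embedded as 8-dimensional vectors.
rng = np.random.default_rng(0)
tokens = rng.standard_normal((4, 8))

# In a real model Q, K, V come from learned projection matrices;
# random ones here just show the shapes and the data flow.
Wq, Wk, Wv = (rng.standard_normal((8, 8)) for _ in range(3))
out = attention(tokens @ Wq, tokens @ Wk, tokens @ Wv)
print(out.shape)  # (4, 8): one context-mixed vector per token
```

Each output vector is a weighted blend of all the others, which is how the model lets every word in a sentence "look at" every other word.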
And the incredible thing is you can download them FOR FREE. Decoding information from these models (running inference) is not trivial either, but it's doable on a home setup. How much you have to spend depends on how fast you want the answers. These models are typically contained in one large file, and that's what's downloaded. Once you have this file, no Internet connection is needed. Some people assume GPT works like a search engine, looking data up on the Net as needed. That's not how it does its thing. Not even close. Everything is contained in the model. Nothing is looked up live from the Internet. Nothing goes out and nothing comes in. It doesn't work like a typical database either. The way information is obtained in the decoding process is complex and fascinating.
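As a concrete sketch, here's what loading one of those single-file models looks like using the open source llama-cpp-python bindings. The file name below is just a placeholder for whichever GGUF model you download. Unplug the network cable first if you like; once the file is on disk, nothing goes out and nothing comes in:

```python
# Minimal sketch: run a downloaded model entirely offline with llama-cpp-python.
# The model path is a placeholder; point it at any GGUF file you've downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-3-8b-instruct.Q4_K_M.gguf",  # the one big file
    n_ctx=2048,                                             # context window size
)

out = llm("Q: What does VSWR stand for?\nA:", max_tokens=64, stop=["\n"])
print(out["choices"][0]["text"])
```

Everything the model "knows" is baked into the weights in that one file; the decoding step just turns those numbers back into text.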
So how do you do it, and what can you do with it? What the hell does this have to do with Amateur Radio, you ask? Well, we will get to that.
Dave, K7DMK