How artificial intelligence is energizing information technology



Technology is like a flywheel. Flywheels do a lot of work, but they slow down if they don't get an injection of new energy. Artificial intelligence looks like the energy coming into the flywheel of information technology. My next guest has looked into this. Bill Vass, chief technology officer at Booz Allen, joined the Federal Drive with Tom Temin to discuss.

Tom Temin And you have been, it's safe to say, following technology for a great many years, in the cloud, in the data center, in government and in industry. So what's your sense of where we're headed with AI? Because there's a lot of heat. Where's the light shining?

Bill Vass Remember, the engine of AI is data. And I think that's an important piece that we can't forget. Data has been called the new oil for a long time, but I think data is getting even more gravity in this space, because the more data you have, the more accurate the parameters in your model get, and then the more accurate the model becomes over time. When I started about 10 years ago, knowing that ML would grow, I didn't think about transformer models or diffusion models at the time. But knowing that it would grow, I focused a lot on storage, making it very cheap to store things forever, because I knew as more things got connected and more data and more logs occurred, you'd want to store it forever. And then the next focus was, of course, making it easy to stream the data in and connect everything with IoT. IoT had a lot of promise for a long time, but it's actually starting to pay off now. That can feed into digital twins that you can do virtual simulation with in the cloud. And then when you have a lot of data you've collected, you can train these new transformer models to generate synthetic data, which used to be done manually. When we did robotic training at Amazon and other places I've worked, and when I did autonomous systems in the ocean with Liquid Robotics, we generated a tremendous amount of synthetic data using procedural capabilities and artists and a whole bunch of other stuff. Now with some of the stuff coming out of Luma AI, the latest from OpenAI, Anthropic and others, and a release Hugging Face did just last week, you can generate 3D synthetic data, and robots and drones work in 3D environments, so creating synthetic 3D environments is a big deal. And then of course we saw the launch of Cosmos by Nvidia, which has a tremendous amount of both synthetic and real 3D environments to do that training on for the three-computer problem. And then the interesting thing is the flywheel AI creates: you take all that training data, improve your AI, improve your software systems, which you then put back out into operation, which generates more data, and it just pushes that flywheel again.
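To make that flywheel concrete, here is a minimal Python sketch of the loop Vass describes. Every object and function in it is a hypothetical stand-in, not a real API: deployed systems emit data, cheap storage keeps all of it, synthetic data augments it, retraining improves the model, and the improved model goes back into operation.

```python
# A minimal sketch of the data flywheel described above. All arguments
# are hypothetical stand-ins for real fleet, storage and training APIs.

def data_flywheel(model, fleet, storage, synthesize, train, cycles=3):
    for _ in range(cycles):
        # 1. Deployed systems (robots, drones, IoT sensors) emit data.
        storage.extend(fleet.collect_telemetry(model))

        # 2. Cheap durable storage means nothing gets thrown away,
        #    so the training corpus only grows.
        corpus = list(storage)

        # 3. Augment scarce real data with synthetic scenes, the way
        #    procedural generation and artists once did manually.
        corpus += synthesize(sample=corpus[-100:])

        # 4. More data -> better-fit parameters -> a more accurate model.
        model = train(model, corpus)

        # 5. Redeploy; the improved model generates more data next cycle.
        fleet.update(model)
    return model
```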

Tom Temin And what's your sense of how this affects the architecture of the systems that federal agencies, or any large customers, are using? People are rethinking storage in the cloud and rethinking what their data centers can do for them in the AI era, because they want the provenance of the data. And as you say, for long-term storage the cloud may not be the most economical option. And if you need to leave the data close to the processing, then how do you manage that?

Bill Vass Yeah. So I'll argue that it is the most economical option for storage. If you want durable storage, it's very expensive to do on your own, because you've got to have three physical data centers that you replicate the data across that are 10 to 60 kilometers away from each other. Setting up the networks for that is hard, and storage on AWS is eleven-nines durable; that's very hard to do on your own and incredibly expensive. You can do it incredibly cheaply on Glacier, for example. On AWS, S3 Glacier is much cheaper than anything you could use to store it yourself on prem. However, more and more, you're seeing a lot of need to do the inference, and even inference learning, at the edge. That's where, as everything becomes software defined, you start to see people pushing software-defined systems to satellites, drones, cars, all of those kinds of things: the ability to collect at the edge and use the models to filter at the edge. That's an important piece, because it's expensive to send all the data back. On a security camera, you don't need to send 30 frames per second of an empty parking lot at night. You can have an ML routine on the camera now, on these little tiny Wyze cameras that are 25 bucks, that you've trained in the cloud to look for changes. Then when a car drives through the parking lot at 3 a.m., that four seconds gets sent up and the rest doesn't. So there's a lot that can happen here. I think it's an exciting time, as the edge gets more powerful and storage on the edge gets more powerful, but the cloud is getting more powerful at the same time.
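As a concrete illustration of that edge-filtering idea, here is a minimal Python sketch that uses simple frame differencing with OpenCV as a stand-in for the cloud-trained model, and only sends a frame upstream when the scene actually changes. The upload function is a hypothetical placeholder.

```python
import cv2

MOTION_THRESHOLD = 5000  # pixels changed per frame; tune per camera


def upload_to_cloud(frame):
    """Hypothetical stand-in for sending an event clip upstream."""
    print("change detected, uploading frame")


def watch_and_filter(camera_index=0):
    """Send frames upstream only when the scene changes, instead of
    streaming 30 fps of an empty parking lot all night."""
    cap = cv2.VideoCapture(camera_index)
    ok, prev = cap.read()
    if not ok:
        return
    prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

        # Cheap frame differencing stands in for the trained ML routine.
        diff = cv2.absdiff(gray, prev_gray)
        _, mask = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
        if cv2.countNonZero(mask) > MOTION_THRESHOLD:
            upload_to_cloud(frame)  # only the interesting seconds go up

        prev_gray = gray
```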

Tom Temin We're speaking with Bill Vass. He's chief technology officer at Booz Allen Hamilton. And what's your assessment of DeepSeek? Not so much DeepSeek itself, but the fact that they purportedly have powerful models running on conventional chip hardware and not on the latest Nvidia GPUs. Is that actually that significant, or is it less significant than we think, or maybe it's really revolutionary?

Bill Vass Well, I think it's significant. I've played a lot personally with it, and we've been running it in sandboxes and such. And just to be aware, it says, even on their website, that your data goes back to China. So for anyone using it: if you pull the model down and run it locally in a sandbox, you can play with it. A great example is on Amazon Web Services, where they have DeepSeek available in Bedrock in a sandbox that's safe to use. So I do encourage people to experiment with it. But what they did is pretty clever. I used to race cars a lot, and there's an old-fashioned saying: there's no replacement for displacement. For the longest time we were just in a race to build a bigger model. We just kept adding GPUs and adding GPUs. And it was only a matter of time till someone said, we're pitting too much to fill up with gas; we're going to need to make these more efficient, with turbochargers or whatever. So what they did is a number of things. They did what's called a mixture of experts, so that you don't have to go through the full model. It recognizes different types of questions, and then it doesn't require going through the full model for everything; it's got specialized sections of the model. The second thing they did was distillation. Now, some people are complaining that that was cheating. Distillation is using somebody who spent a huge amount of money on a big model, sending a bunch of questions to the big model, getting the answers, and then training on those questions and answers to bypass the need for the giant model. So that's another thing they did: other people spent the money on the GPUs, and they leveraged it. And that's very smart. And then last is the chain-of-thought component they've built in there. What the big models do is have a bunch of people lay out the steps of a problem, and the model learns from the steps and the answer. DeepSeek just sent the questions and the answers and avoided a lot of the step training. At Amazon, I also ran our video game systems and things like that, and to do real-time 3D graphics, you, quote unquote, cheat a lot. If I was rendering your image right now, I wouldn't render the stuff behind you, to save the rendering engine work, because I have to react quickly. These are the kinds of things they did to reduce the compute. Now, I don't personally believe they did it for $6 million of compute. I think they spent a lot more than that; even just running the distillation, calling all the other models to train your model, costs money. But at the same time, the model is quite compact. We were able to run it on a little Nvidia Jetson Orin card about this big, and it worked. Now, it was slow, about six tokens per second, so a complicated question took about 11 minutes to answer. But that's still very usable on the edge, as long as the users are expecting it to be slow.
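The distillation step he describes is easy to sketch. Below is a minimal, hedged Python illustration of that loop; the teacher and student objects and their methods are hypothetical stand-ins, not any vendor's SDK.

```python
# A minimal sketch of distillation as described above. The point is
# the shape of the loop, not a particular framework's API.

def distill(teacher, student, questions, epochs=3):
    # 1. Leverage someone else's GPU spend: query the big model.
    pairs = [(q, teacher.ask(q)) for q in questions]

    # 2. Train the small model directly on question/answer pairs,
    #    bypassing the expensive step-by-step supervision the big
    #    model was originally trained with.
    for _ in range(epochs):
        for question, answer in pairs:
            student.fine_tune(prompt=question, target=answer)

    return student
```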

Tom Temin Sure. And the other question is, the term we're hearing more and more is small language models, the idea that maybe you don't need to train it with all the data on earth, but train it only on your domain. And that seems to be something the government needs to pay attention to.

Bill Vass Yeah, very much. Both transformer models and especially the diffusion models, where you're doing image generation or image detection, deepfake detection, or other types of object detection or image processing. Having subsets, small language models, on the edge is really important. At Booz Allen, we were the first company to put a transformer model in space, on the International Space Station. And that took a lot of compression to get it to operate up there. What it does up there is help the astronauts quickly diagnose issues with the systems. We filled it with basically all of the manuals for the space station, along with enough transformer information that it understood the context of questions. It just speeds up their error resolution, and it's running up there on a little HP server with some GPUs and such, but it's much smaller than it would otherwise be. It took us two days to upload it, just to give you an idea.
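Getting a transformer small enough for constrained hardware like that typically involves quantization, among other techniques. As a generic illustration, and not a description of how the ISS deployment was actually built, PyTorch's dynamic quantization shrinks a model's linear-layer weights from 32-bit floats to 8-bit integers in a few lines:

```python
import torch
import torch.nn as nn


def compress_for_edge(model: nn.Module, path: str = "model_int8.pt"):
    """Illustrative only: dynamic quantization converts Linear-layer
    weights from float32 to int8, roughly quartering their size."""
    quantized = torch.quantization.quantize_dynamic(
        model, {nn.Linear}, dtype=torch.qint8
    )
    torch.save(quantized.state_dict(), path)
    return quantized
```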

Tom Temin Sure. And just a practical final question for the program manager. You’ve laid out a lot of interesting technologies which are easy to at least conceptualize if you can’t necessarily be a practitioner of them. What do program managers need to know to understand how AI can actually help their programs, especially at a time when everybody is looking for ways to be more efficient?

Bill Vass Yeah, I think the important part is you've got to manage the hype. One of the things I've liked at Booz Allen is we're very good at practical implementations of models. Using them to help with requirements definition will definitely speed you up. Basic programming tasks: it definitely speeds you up there. It's not going to just write everything for you, but it'll definitely speed you up in those areas. Doing large-scale analytics of unstructured data, it'll definitely speed you up there. So the key is to understand where to properly apply it, and where not to apply it, and to keep up to date as it evolves. For example, one of the things we did with a customer, where we were using the Llama model: we would show the model's core answer, show it extended with retrieval-augmented generation (RAG), and then also show where it got its reference answers from. So the analysts could look at all three and decide which they wanted to believe, and how accurate the model was. And then in places where the model made mistakes, we provided a feedback loop to continually improve it. That's an important piece as well.
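That three-view pattern, the raw answer, the retrieval-augmented answer, and the source references side by side, can be sketched minimally in Python. Everything below (model.generate, index.search, the passage fields) is a hypothetical placeholder for whatever LLM and vector index a program actually uses.

```python
# Minimal sketch of the three-view pattern described above, so an
# analyst can judge which answer to believe. All APIs are hypothetical.

def answer_three_ways(model, index, question, k=3):
    raw = model.generate(question)

    # Retrieval-augmented generation: ground the answer in documents.
    passages = index.search(question, top_k=k)
    context = "\n\n".join(p.text for p in passages)
    grounded = model.generate(
        f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    )

    references = [(p.source, p.text) for p in passages]
    return raw, grounded, references  # analyst compares all three
```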
